Publication | Open Access

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Citations: 65

References: 0

Year: 2024

Abstract

Keivan Alizadeh, Seyed Iman Mirzadeh, Dmitry Belenko, S. Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, Mehrdad Farajtabar. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024.