Publication | Closed Access
Prompt Distillation for Efficient LLM-based Recommendation
Citations: 115 · References: 28 · Year: 2023 · Venue: Unknown
Keywords: Llm Fine-tuning, Engineering, Machine Learning, Large Language Model, Corpus Linguistics, Text Mining, Large Language Models, Natural Language Processing, Information Retrieval, Data Science, Computational Linguistics, Discrete Prompt, Language Models, Machine Translation, Large Ai Model, Conversational Recommender System, Computer Science, Cold-start Problem, Prompt Distillation, Retrieval Augmented Generation, Collaborative Filtering
Large language models (LLMs) have manifested unparalleled modeling capability on various tasks, e.g., multi-step reasoning, but their input is mostly limited to plain text, which can be very long and contain noisy information. Long text can take a long time to process and thus may not be efficient enough for recommender systems that require immediate responses. In LLM-based recommendation models, user and item IDs are usually filled into a template (i.e., a discrete prompt) so that the models can understand a given task, but the models usually need extensive fine-tuning to bridge the user/item IDs and the template words and to unleash the power of LLMs for recommendation. To address these problems, we propose to distill the discrete prompt for a specific task into a set of continuous prompt vectors, so as to bridge IDs and words and to reduce the inference time. We also design a training strategy that aims to improve the efficiency of training these models. Experimental results on three real-world datasets demonstrate the effectiveness of our PrOmpt Distillation (POD) approach on both sequential recommendation and top-N recommendation tasks. Although training efficiency can be significantly improved, the improvement in inference efficiency is limited. This finding may inspire researchers in the community to further improve the inference efficiency of LLM-based recommendation models.
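The core idea above — replacing a long discrete template with a few learnable continuous vectors prepended to the input embeddings — can be sketched as follows. This is a minimal illustration, not the paper's implementation; all dimensions, names, and the NumPy stand-in for a trainable embedding layer are hypothetical.

```python
import numpy as np

# Hypothetical sizes; the actual POD model uses an LLM's own embedding table.
VOCAB_SIZE, EMBED_DIM, NUM_PROMPT_VECS = 100, 16, 3

rng = np.random.default_rng(0)
token_embedding = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))

# Continuous ("soft") prompt: a small set of vectors that, after training,
# stands in for the discrete template words of a given task.
soft_prompt = rng.normal(size=(NUM_PROMPT_VECS, EMBED_DIM))

def build_input(token_ids):
    """Prepend the shared soft-prompt vectors to the embedded token IDs,
    so the long discrete template need not be tokenized at inference."""
    tokens = token_embedding[np.asarray(token_ids)]
    return np.concatenate([soft_prompt, tokens], axis=0)

seq = build_input([4, 8, 15])  # e.g. user/item ID tokens for one task
print(seq.shape)               # 3 prompt vectors + 3 tokens, each EMBED_DIM wide
```

In the actual approach the soft-prompt vectors would be optimized jointly with (or instead of) fine-tuning the LLM, while the discrete template is discarded, shortening the input sequence the model must process.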