Recommender Systems in the Era of Large Language Models (LLMs)
TL;DR Summary
This paper reviews techniques for enhancing recommender systems using Large Language Models (LLMs), focusing on pre-training, fine-tuning, and prompting. It highlights LLMs' potential in feature encoding and their future applications in recommender system research.
Abstract
With the prosperity of e-commerce and web applications, Recommender Systems (RecSys) have become an important component of our daily life, providing personalized suggestions that cater to user preferences. While Deep Neural Networks (DNNs) have made significant advancements in enhancing recommender systems by modeling user-item interactions and incorporating textual side information, DNN-based methods still face limitations, such as difficulties in understanding users' interests and capturing textual side information, inabilities in generalizing to various recommendation scenarios and reasoning on their predictions, etc. Meanwhile, the emergence of Large Language Models (LLMs), such as ChatGPT and GPT4, has revolutionized the fields of Natural Language Processing (NLP) and Artificial Intelligence (AI), due to their remarkable abilities in fundamental responsibilities of language understanding and generation, as well as impressive generalization and reasoning capabilities. As a result, recent studies have attempted to harness the power of LLMs to enhance recommender systems. Given the rapid evolution of this research direction in recommender systems, there is a pressing need for a systematic overview that summarizes existing LLM-empowered recommender systems, to provide researchers in relevant fields with an in-depth understanding. Therefore, in this paper, we conduct a comprehensive review of LLM-empowered recommender systems from various aspects including Pre-training, Fine-tuning, and Prompting. More specifically, we first introduce representative methods to harness the power of LLMs (as a feature encoder) for learning representations of users and items. Then, we review recent techniques of LLMs for enhancing recommender systems from three paradigms, namely pre-training, fine-tuning, and prompting. Finally, we comprehensively discuss future directions in this emerging field.
In-depth Reading
English Analysis
1. Bibliographic Information
1.1. Title
The title of the paper is: "Recommender Systems in the Era of Large Language Models (LLMs)"
1.2. Authors
The authors of this paper are:
- Zihuai Zhao (PhD student, Department of Computing (COMP), The Hong Kong Polytechnic University)
- Wenqi Fan (Assistant Professor, Department of Computing (COMP) and Department of Management and Marketing (MM), The Hong Kong Polytechnic University)
- Jiatong Li (PhD student, Department of Computing (COMP), The Hong Kong Polytechnic University)
- Yunqing Liu (PhD student, Department of Computing (COMP), The Hong Kong Polytechnic University)
- Xiaowei Mei (PhD, University of Florida; currently focuses on economic models of information systems)
- Yiqi Wang (Assistant Professor, College of Computer, National University of Defense Technology (NUDT))
- Zhen Wen (Sr. Applied Science Manager at Amazon Prime Video, formerly Chief Scientist at Tencent News Feeds)
- Fei Wang (Head of Personalization Science at Amazon Prime Video, formerly Senior Director at Visa Research)
- Xiangyu Zhao (Assistant Professor, School of Data Science, City University of Hong Kong)
- Jiliang Tang (University Foundation Professor, Computer Science and Engineering Department, Michigan State University)
- Qing Li (Chair Professor (Data Science) and Head of the Department of Computing, The Hong Kong Polytechnic University)
The authors are affiliated with various academic institutions and industry leaders in the fields of computer science, artificial intelligence, data mining, and recommender systems. Their backgrounds span areas like recommender systems, natural language processing, deep learning, graph neural networks, trustworthy AI, and information retrieval, indicating a strong interdisciplinary expertise relevant to the topic.
1.3. Journal/Conference
The paper was published on arXiv, a free distribution service and an open-access archive for scholarly articles, primarily in physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. As a preprint server, arXiv allows researchers to share their work before peer review and formal publication. While not a peer-reviewed journal or conference itself, posting on arXiv is a common practice in computer science to disseminate research rapidly and widely. The paper was first posted on 2023-07-05, and the provided link is for version 6, indicating several revisions since its initial submission.
1.4. Publication Year
2023
1.5. Abstract
With the rise of e-commerce and web applications, Recommender Systems (RecSys) have become crucial for providing personalized suggestions. While Deep Neural Networks (DNNs) have improved RecSys by modeling user-item interactions and incorporating textual side information, they still face challenges in understanding user interests, capturing textual nuances, generalizing across diverse scenarios, and reasoning about predictions. Concurrently, Large Language Models (LLMs) like ChatGPT and GPT-4 have revolutionized Natural Language Processing (NLP) and Artificial Intelligence (AI) due to their superior language understanding, generation, generalization, and reasoning capabilities. Recent research has begun to leverage LLMs to enhance RecSys. Given this rapid development, the paper provides a systematic overview of LLM-empowered RecSys. It summarizes existing methods across three paradigms: Pre-training, Fine-tuning, and Prompting. Specifically, it first introduces methods where LLMs act as feature encoders for user and item representations. Then, it reviews techniques for LLMs in RecSys through pre-training, fine-tuning, and prompting approaches. Finally, it discusses future directions in this emerging field.
1.6. Original Source Link
- Original Source Link: https://arxiv.org/abs/2307.02046v6 (Preprint)
- PDF Link: https://arxiv.org/pdf/2307.02046v6.pdf (Preprint)
2. Executive Summary
2.1. Background & Motivation
The core problem the paper addresses is the inherent limitations of traditional Deep Neural Network (DNN)-based Recommender Systems (RecSys) in effectively processing and utilizing complex textual information, generalizing across diverse recommendation scenarios, and performing multi-step reasoning.
This problem is highly important in the current field due to the proliferation of e-commerce and web applications, where RecSys are a vital component of daily life, driving user engagement and satisfaction by providing personalized suggestions. DNN-based methods, despite their advancements in modeling user-item interactions and incorporating textual side information, still struggle with:
- Limited Natural Language Understanding (NLU): They cannot sufficiently capture rich textual knowledge about users and items due to model scale and data size limitations, leading to suboptimal prediction.
- Inadequate Generalization Ability: Most RecSys are task-specific, making it challenging to adapt a model trained for one type of recommendation (e.g., rating prediction) to another (e.g., top-K recommendations with explanations).
- Difficulties in Complex Reasoning: They perform well on simple decisions but falter with multi-step reasoning, which is crucial for intricate tasks like trip planning where sequential, conditional decisions are needed.
The paper's entry point or innovative idea is to leverage the recently emerged Large Language Models (LLMs), such as ChatGPT and GPT-4, which have demonstrated remarkable capabilities in Natural Language Processing (NLP). LLMs offer powerful NLU and generation, impressive generalization to unseen tasks (e.g., through in-context learning), and enhanced reasoning (e.g., via Chain-of-Thought prompting). These strengths directly address the identified limitations of DNN-based RecSys, suggesting a paradigm shift for developing next-generation personalized recommendation systems.
2.2. Main Contributions / Findings
The primary contributions of this paper, which is a survey, are:
- Systematic Overview: It provides the first comprehensive and systematic overview of LLM-empowered Recommender Systems (RecSys), categorizing existing methods into three fundamental paradigms: Pre-training, Fine-tuning, and Prompting. This structured approach helps researchers understand the landscape of this rapidly evolving field.
- Representation Learning with LLMs: It details how LLMs can be harnessed as feature encoders for learning robust representations of users and items, distinguishing between ID-based RecSys and Textual Side Information-enhanced RecSys.
- Adaptation Paradigms for LLMs in RecSys: It reviews and categorizes advanced techniques for adapting LLMs to RecSys tasks based on:
  - Pre-training: Discussing specific pre-training tasks (e.g., Masked Language Modeling, Next Token Prediction) designed for RecSys data.
  - Fine-tuning: Covering both full-model fine-tuning and parameter-efficient fine-tuning (PEFT) strategies for specialized RecSys tasks.
  - Prompting: Exploring conventional prompting, in-context learning (ICL), Chain-of-Thought (CoT) prompting, prompt tuning (hard and soft), and instruction tuning for lightweight adaptation.
- Identification of Future Directions: It comprehensively discusses emerging challenges and promising future research directions in LLM-empowered RecSys, including hallucination mitigation, ensuring trustworthiness (safety, fairness, explainability, privacy), developing vertical domain-specific LLMs, improving users & items indexing, enhancing fine-tuning efficiency, and leveraging LLMs for data augmentation.
The key findings reached by the paper are that LLMs demonstrate significant potential to overcome the limitations of traditional RecSys by improving natural language understanding, generalization, and reasoning capabilities. The survey itself solves the problem of providing a structured, up-to-date resource for researchers to navigate the complex and rapidly expanding intersection of LLMs and RecSys, fostering further innovation in the field.
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
To understand this paper, a reader should be familiar with the following foundational concepts:
- Recommender Systems (RecSys): At its core, a recommender system is a software tool or technique that provides suggestions for items (e.g., movies, products, news articles, jobs) that are most relevant to a particular user. The goal is to address information overload by filtering vast amounts of available data and presenting personalized content.
  - Collaborative Filtering (CF): A common RecSys technique that makes predictions about a user's interest in an item based on the opinions of other users. It identifies users with similar tastes or items that are favored by similar users. For instance, if user A and user B like similar movies, and user A likes a movie that user B hasn't seen, CF might recommend that movie to user B. This typically involves learning representations (embedding vectors) for users and items from their interaction history (e.g., purchases, ratings).
  - Content-based Recommendation: This method recommends items based on the similarity between item characteristics and a user's profile. For example, if a user enjoys action movies, a content-based system would recommend other action movies that share similar attributes (genre, actors, director). Textual side information (like item descriptions, user reviews, user profiles) is particularly valuable here.
- Deep Neural Networks (DNNs): DNNs are a class of artificial neural networks with multiple layers between the input and output layers. They are known for their ability to learn complex patterns and representations from data, a process called representation learning. In RecSys, DNNs have been used for:
  - Modeling User-Item Interactions: Capturing non-linear relationships between users and items.
- Encoding Side Information: Processing textual, image, or other auxiliary data associated with users and items.
- Types of DNNs relevant to RecSys:
- Recurrent Neural Networks (RNNs): Particularly effective for sequential data. In
RecSys, they model user interaction sequences (e.g., a user's browsing history) to predict future behaviors. - Graph Neural Networks (GNNs): Treat user-item interactions as graph-structured data, where users and items are nodes and interactions are edges.
GNNslearn representations by propagating messages across the graph. - Convolutional Neural Networks (CNNs): Primarily used for image processing but also applied in
RecSysfor encoding textual side information, such as user reviews.
- Recurrent Neural Networks (RNNs): Particularly effective for sequential data. In
-
Pre-trained Language Models (PLMs): These are
DNNs, often based on theTransformerarchitecture, that are pre-trained on a massive amount of text data from diverse sources (e.g., books, articles, websites). This pre-training allows them to learn general linguistic patterns, grammar, semantics, and even some world knowledge.- Transformer Architecture: Introduced by Vaswani et al. (2017), the
Transformer is a neural network architecture that relies heavily on self-attention mechanisms. Unlike RNNs, Transformers can process all words in a sequence simultaneously, making them highly efficient for long sequences and capable of capturing long-range dependencies.
  - Attention Mechanism: The core of the Transformer. It allows the model to weigh the importance of different words in the input sequence when processing each word. The Attention function can be described as mapping a query and a set of key-value pairs to an output, where the output is a weighted sum of the values, with the weight assigned to each value computed by a compatibility function of the query with the corresponding key. A common form is Scaled Dot-Product Attention: $ \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V $ Where:
    - $Q$ (Query), $K$ (Key), and $V$ (Value) are matrices representing the input embeddings.
    - $Q$ and $K$ have dimension $d_k$, and $V$ has dimension $d_v$.
    - $QK^T$ is the dot product between queries and keys, which measures similarity.
    - $\sqrt{d_k}$ is a scaling factor that prevents large dot-product values from pushing the softmax function into regions with tiny gradients.
    - $\mathrm{softmax}(\cdot)$ converts the scores into probabilities (weights).
    - The output is a weighted sum of the Value vectors.
- Types of PLMs:
- Encoder-only Models (e.g., BERT):
Bidirectional Encoder Representations from Transformers. They process input text bidirectionally, considering context from both left and right words. Pre-trained with tasks like Masked Language Modeling (MLM) (predicting masked words) and Next Sentence Prediction (NSP) (predicting if two sentences follow each other).
Generative Pre-trained Transformer. They generate text sequentially, typically from left to right, based on the preceding context. Pre-trained withNext Token Prediction (NTP)(predicting the next word in a sequence). - Encoder-Decoder Models (e.g., T5):
Text-To-Text Transfer Transformer. They can handle any text-to-text task by converting all NLP problems into a text generation problem (e.g., sentiment analysis becomes "sentiment: I love this movie." -> "positive").
- Large Language Models (LLMs): These are PLMs that have significantly scaled up in terms of parameter count (billions or even trillions) and training data volume. This scaling leads to emergent capabilities that are not present in smaller models.
  - Emergent Capabilities: These include enhanced language understanding and generation, impressive generalization to unseen tasks, and sophisticated reasoning abilities.
  - In-context Learning (ICL): An LLM's ability to learn new tasks or adapt its responses based on examples provided directly in the input prompt, without explicit weight updates. It relies on the model's capacity to recognize patterns and adapt its behavior from the provided context.
  - Chain-of-Thought (CoT) Prompting: A technique that enhances an LLM's reasoning by providing intermediate reasoning steps as examples within the prompt. This guides the model to break down complex problems into simpler steps, improving the accuracy of its final answers.
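To make the scaled dot-product attention formula above concrete, here is a minimal NumPy sketch of a single attention head; the array sizes and values are made up for illustration and this is not taken from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single attention head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # query-key similarity
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                                      # weighted sum of Value vectors

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))      # 3 tokens, d_k = d_v = 4
print(scaled_dot_product_attention(Q, K, V).shape)          # (3, 4)
```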
3.2. Previous Works
The paper frames its review by first discussing RecSys development, then the evolution of PLMs to LLMs, and finally their combination.
- Early RecSys:
Collaborative Filtering (CF) and Content-based recommendation are the two main categories.
  - Matrix Factorization (MF): A classical CF method that learns latent representations (embeddings) of users and items from pure user-item interactions.
- DNNs in RecSys:
  - NeuMF: Replaces the inner product in MF with DNNs to model non-linear interactions.
  - GNNs for RecSys: Leveraging graph-structured data for user and item representations (LightGCN, NGCF).
  - DeepCoNN: Uses CNNs to encode user reviews for rating predictions.
  - NARRE: A neural attention framework for simultaneous rating prediction and explanation generation.
- PLMs in RecSys:
  - BERT4Rec: Adopts BERT to model sequential user behaviors for sequential recommendations.
  - Transformer-based frameworks (Li et al. [48]): For simultaneous item recommendations and explanation generation.
- Evolution to LLMs:
  - BERT (2018): Encoder-only, bidirectional Transformer, with MLM and NSP pre-training.
  - GPT (2018): Decoder-only, unidirectional Transformer, with NTP pre-training.
  - T5 (2019): Encoder-decoder, text-to-text framework.
  - GPT-3 (2020): Significant scaling up, demonstrating ICL capabilities.
  - LaMDA (2021), PaLM (2022): Further large-scale LLMs.
  - ChatGPT (2022), LLaMA (2023), Vicuna (2023): Modern LLMs with highly advanced conversational and reasoning abilities, often fine-tuned with Reinforcement Learning from Human Feedback (RLHF).
Figure 2 from the original paper provides a visual timeline of these developments:
The image is a schematic diagram showing the development path from traditional recommender systems to LLM-based recommender systems. It is divided into three parts: traditional models, pre-trained language models, and the era of large language models, listing the representative models in each stage and how they are applied to recommender systems.
The image shows a timeline of milestones, with traditional recommender models like Matrix Factorization and Recurrent Neural Networks appearing first, followed by Pre-trained Language Models like BERT and GPT-2 being adapted for RecSys (e.g., BERT4Rec). The latest era highlights Large Language Models (e.g., ChatGPT, LLaMA) demonstrating emergent capabilities like in-context learning and chain-of-thought prompting, leading to conversational and explainable RecSys.
- LLMs in Graph Learning: Chen et al. [18] propose two pipelines: LLMs-as-Enhancers (e.g., enhancing textual node attributes) and LLMs-as-Predictors (e.g., directly predicting links).
LLMs-as-Enhancers(e.g., enhancing textual node attributes) andLLMs-as-Predictors(e.g., directly predicting links). - LLMs in RecSys (Early Efforts mentioned in Introduction):
  - Chat-Rec [3]: Leverages ChatGPT for conversational interaction and refining candidate sets from traditional RecSys.
  - Zhang et al. [20] (using T5): Enables natural language input for explicit preferences.
  - TALLRec, M6-Rec, PALR, P5: Examples of sequential RecSys using LLMs.
  - UniCRS: Knowledge-enhanced prompt learning for conversational RecSys.
  - UniMIND: Unified multi-task learning for conversational RecSys using prompt-based strategies.
3.3. Technological Evolution
The evolution of RecSys can be broadly seen in stages:
- Early Methods (1990s-early 2000s): Rule-based systems, simple collaborative filtering (e.g., user-user, item-item similarity), Matrix Factorization. These primarily relied on numerical interaction data.
- Traditional Machine Learning (2000s-early 2010s): Integration of content-based features using SVMs, decision trees, etc. Still limited in handling complex, high-dimensional features, especially text.
- Deep Learning Era (mid-2010s-present): DNNs (MLPs, CNNs, RNNs, GNNs) dramatically improved representation learning. This allowed for more sophisticated modeling of user-item interactions and better utilization of rich side information like text, images, and sequences. BERT4Rec marks a key step in integrating PLMs.
- LLM Era (late 2022-present): The emergence of LLMs (like ChatGPT and GPT-4) with vastly superior NLU, generation, generalization, and reasoning capabilities represents the current frontier. This paper specifically focuses on how LLMs are being integrated to address the remaining challenges of the DNN era in RecSys.
This paper's work fits into the LLM Era, providing a timely review of the initial efforts and future potential of leveraging LLMs to fundamentally enhance RecSys.
3.4. Differentiation Analysis
Compared to previous RecSys methods, the core differences and innovations of the LLM-empowered approaches, as highlighted by this paper, are:
- Enhanced Natural Language Understanding (NLU): Traditional DNNs and even earlier PLMs (like BERT used for feature extraction) struggled to fully grasp the nuances of textual side information. LLMs, with their massive scale and training, offer unprecedented NLU capabilities, allowing for a deeper understanding of item descriptions, user reviews, and profiles. This enables more semantic-rich user and item representations.
- Superior Generalization: DNN-based RecSys are often task-specific and require extensive re-training or fine-tuning for new scenarios. LLMs exhibit impressive generalization, especially with in-context learning, allowing them to adapt to diverse recommendation tasks (e.g., top-K recommendation, rating prediction, explanation generation, conversational RecSys) with minimal or no explicit fine-tuning.
- Complex Reasoning and Explainability: Most DNN-based models are "black boxes" and struggle with multi-step reasoning, making it difficult to generate coherent explanations for recommendations. LLMs, particularly with Chain-of-Thought (CoT) prompting, can break down complex decisions, provide step-by-step reasoning, and generate human-like explanations, fostering trust and user engagement.
- Conversational and Interactive Capabilities: LLMs naturally support human-like conversation, enabling interactive RecSys where users can express evolving preferences and receive refined suggestions through dialogue, a capability largely absent in traditional systems.
- Unified Frameworks: LLMs facilitate unifying various RecSys tasks into language generation problems (e.g., T5's text-to-text paradigm), simplifying model design and deployment across different recommendation objectives.
This survey differentiates itself from other contemporary surveys by focusing specifically on the latest generation of LLMs (like ChatGPT and LLaMA) and systematically categorizing the domain-specific techniques (Pre-training, Fine-tuning, Prompting) for adapting them to RecSys. Earlier surveys on PLMs for RecSys [30] covered an older generation of language models, while others [31, 32] emphasized application aspects or pipelines rather than the underlying LLM adaptation techniques themselves. This paper aims to provide a deeper, more technical understanding of how LLMs are integrated and adapted into RecSys.
4. Methodology
This paper is a comprehensive review, and as such, its "methodology" is the systematic approach it takes to categorize and explain the integration of Large Language Models (LLMs) into Recommender Systems (RecSys). The authors structure their analysis into three primary paradigms for adapting LLMs to RecSys: Deep Representation Learning, Pre-training & Fine-tuning, and Prompting.
4.1. Principles
The core idea is to leverage the advanced capabilities of LLMs—particularly their superior natural language understanding and generation, generalization, and reasoning abilities—to overcome the limitations of traditional DNN-based RecSys. The theoretical basis is that LLMs, having been pre-trained on vast amounts of diverse text data, possess a rich understanding of semantics and context that can be transferred or adapted to the RecSys domain. This transfer can happen through different mechanisms:
- Feature Encoding: Using LLMs to generate semantic representations (embeddings) for users and items, moving beyond simple discrete IDs.
- Model Adaptation: Modifying LLMs through pre-training on RecSys-specific data or fine-tuning on downstream RecSys tasks.
- Behavioral Guidance: Guiding LLMs to perform RecSys tasks directly or indirectly via prompts, without necessarily altering their internal parameters significantly.
4.2. Core Methodology In-depth (Layer by Layer)
The paper structures its review into the following main categories:
4.2.1. Deep Representation Learning for LLM-Based Recommender Systems
This section focuses on how LLMs are used to learn representations (embeddings) for users and items, which are fundamental units in any RecSys.
4.2.1.1. ID-based Recommender Systems
In ID-based RecSys, users and items are identified by unique discrete IDs (e.g., user ID 123, item ID 456). The goal is to learn embedding vectors for these IDs based on user-item interactions.
- Concept: This approach represents users and items using short phrases that incorporate their unique IDs (e.g., '[prefix]_[ID]'). These phrases are then processed by LLMs.
- Challenge: Pure ID indexing lacks semantic information and struggles with data sparsity (new users/items without interaction history, known as the cold-start problem).
- Methods:
  - P5 [71]: A unified paradigm that converts various recommendation data formats (interactions, profiles, descriptions, reviews) into natural language sequences. It maps users and items to indexes and uses a pre-trained T5 backbone, allowing LLMs to treat these indexes as special tokens in their vocabulary, preventing tokenization into separate pieces.
  - Indexing Solutions [74]: Hua et al. proposed different indexing solutions for P5, such as sequential indexing, collaborative indexing, semantic (content-based) indexing, and hybrid indexing, highlighting the importance of how IDs are structured.
  - Semantic IDs [75]: Instead of arbitrary numerical IDs, Semantic IDs are tuples of codewords with semantic meanings for each user or item, generated by a hierarchical method like RQ-VAE (Residual Quantized Variational Autoencoder). This imbues IDs with meaning that LLMs can better interpret.
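As a small illustration of the ID-as-phrase idea behind P5-style indexing, here is a sketch that turns raw user and item IDs into a natural-language prompt. The helper function and wording are hypothetical, not P5's actual templates.

```python
# Hypothetical helper: turn raw IDs and an interaction history into a
# natural-language prompt in which "user_1023" / "item_307" act as index phrases.
def build_prompt(user_id, history, candidate):
    history_str = ", ".join(f"item_{i}" for i in history)
    return (
        f"user_{user_id} has purchased the following items: {history_str}. "
        f"Will user_{user_id} also enjoy item_{candidate}? Answer yes or no."
    )

print(build_prompt(1023, [307, 452, 88], 991))
```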
4.2.1.2. Textual Side Information-enhanced Recommender Systems
This approach addresses the limitations of ID-based methods by incorporating rich textual side information about users and items.
- Concept: Given textual data (e.g., user profiles, reviews, item titles/descriptions), LLMs (like BERT) serve as text encoders to map users or items into a semantic space. This allows for fine-grained grouping of similar entities and better relevance calculations, especially in sparse data scenarios.
- Methods:
  - Modality-based RecSys [76]: Research has shown that RecSys utilizing side information can outperform pure ID-based methods.
  - UniSRec [77]: Leverages item descriptions to learn transferable universal item representations. It employs a lightweight item encoder with parametric whitening and a mixture-of-experts (MoE) enhanced adaptor.
  - Text-based Collaborative Filtering (TCF) [78]: Explores using LLMs (e.g., GPT-3) by prompting them to perform CF tasks, demonstrating positive performance.
  - VQ-Rec [79]: Mitigates the issue of LLMs over-emphasizing text features by learning vector-quantized item representations. This maps item text into a vector of discrete indices (item codes), which are then used to retrieve item representations from a code embedding table.
  - Zero-Shot Item-based Recommendation (ZSIR) [80]: Introduces a Product Knowledge Graph (PKG) to LLMs to refine item features. User and item embeddings are learned via multiple pre-training tasks on the PKG.
  - ShopperBERT [81]: Pre-trains user embeddings based on user purchase history to model user representations in e-commerce.
  - IDA-SR [81]: ID-Agnostic User Behavior Pre-training framework for Sequential Recommendation. It directly extracts representations from text using PLMs like BERT. For an item $i$ with description $D_i = \{t_1, t_2, \ldots, t_m\}$, it prepends a start-of-sequence token [CLS]: $ D_i' = \{[CLS], t_1, t_2, \ldots, t_m\} $. This sequence is fed to the LLM, and the embedding of the [CLS] token is used as the ID-agnostic item representation.
The following figure illustrates the two methods for representing users and items:
The image is a schematic diagram showing the two approaches for representing users and items in LLM-based recommender systems: ID-based representation and textual side information-enhanced representation. The left side shows the traditional approach based on user and item IDs, while the right side incorporates textual information such as user reviews, which is passed through an encoder (e.g., BERT) to produce representations in a semantic space.
The left side shows ID-based Representation, where user and item IDs are directly used as input. The right side shows Textual Side Information-enhanced Representation, where textual information (e.g., user reviews) is fed into a language model encoder (like BERT) to generate semantic representations, which then feed into the RecSys.
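The IDA-SR-style use of a text encoder can be sketched as follows, assuming the Hugging Face transformers library and a BERT checkpoint. This only illustrates the [CLS]-embedding idea, not the authors' exact pipeline; the item description is invented.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Any BERT-like encoder works here; "bert-base-uncased" is just one possible choice.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

item_text = "Wireless noise-cancelling headphones with 30-hour battery life."
inputs = tokenizer(item_text, return_tensors="pt", truncation=True)  # [CLS] is added automatically

with torch.no_grad():
    outputs = encoder(**inputs)

# Embedding of the [CLS] token (first position) as the ID-agnostic item representation.
item_embedding = outputs.last_hidden_state[:, 0]   # shape: (1, 768)
print(item_embedding.shape)
```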
4.2.2. Pre-training & Fine-tuning LLMs for Recommender Systems
This section details how LLMs are adapted through modifying their weights, similar to how PLMs are developed and specialized.
4.2.2.1. Pre-training Paradigm for Recommender Systems
Pre-training involves training LLMs on a vast corpus to acquire broad linguistic understanding, then adapting this understanding to RecSys.
- Classical Pre-training Tasks:
  - Masked Language Modeling (MLM): (For encoder-only or encoder-decoder Transformers.) Randomly masks tokens or spans in a sequence and requires the LLM to predict them based on the surrounding context.
  - Next Token Prediction (NTP) / Auto-regressive Generation: (For decoder-only Transformers.) Requires predicting the next token in a sequence based on the preceding context.
- RecSys-specific Pre-training Tasks:
  - PTUM (Pre-training User Model) [82]: Proposes two tasks for user behaviors:
    - Masked Behavior Prediction (MBP): Masks a single user behavior (unlike MLM, which may mask spans of language tokens) in an interaction sequence and predicts it.
    - Next K Behavior Prediction (NBP): Predicts the next K behaviors in a user's interaction history, modeling the relevance between past and future actions.
  - M6 [69]: Adopts two objectives similar to classical pre-training tasks:
    - Text-infilling objective: Like BART [92], masks a span of tokens and predicts the masked span, useful for assessing text plausibility in recommendation scoring.
    - Auto-regressive language generation objective: Similar to NTP, but predicts the unmasked sentence from a masked sequence.
  - P5 [71]: Uses multi-task modeling and mixes datasets from various recommendation tasks during pre-training. This allows it to generalize to diverse and even unseen RecSys tasks with zero-shot generation capabilities by applying Masked Language Modeling on unified language sequences representing users and items.
The following figure illustrates the pre-training workflow:
The image is a schematic diagram showing how LLMs are pre-trained for recommender systems. It involves recommendation datasets and multi-task pre-training prompts, and illustrates two representative pre-training methods: masked language modeling and next token prediction. The left side shows a multi-task pre-training prompt containing a user ID, purchase history, and candidate items; the right side shows how LLMs perform text generation and prediction.
The figure shows the workflow for pre-training LLMs for recommender systems. It highlights two representative methods: Masked Language Modeling (which randomly masks tokens or spans in a sequence and requires LLMs to generate the masked content) and Next Token Prediction (which predicts the next token in a sequence). The process starts with recommendation datasets, proceeds to multi-task pre-training prompts, and then uses LLMs for the described pre-training tasks.
The following are the results from Table 1 of the original paper:
| Paradigms | Methods | Pre-training Tasks | Code Availability |
| --- | --- | --- | --- |
| Pre-training | PTUM [82] | Masked Behavior Prediction, Next K Behavior Prediction | https://github.com/wuch15/PTUM |
| Pre-training | M6 [69] | Auto-regressive Generation | Not available |
| Pre-training | P5 [71] | Multi-task Modeling | https://github.com/jeykigung/P5 |
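To illustrate the format of PTUM's Masked Behavior Prediction task described above, here is a minimal sketch that builds one training example from an interaction sequence. The function and token names are illustrative, not PTUM's implementation.

```python
import random

def masked_behavior_example(behaviors, mask_token="[MASK]"):
    """Build one Masked Behavior Prediction sample: hide a single behavior
    in the interaction sequence and keep it as the prediction target."""
    pos = random.randrange(len(behaviors))
    target = behaviors[pos]
    corrupted = behaviors[:pos] + [mask_token] + behaviors[pos + 1:]
    return corrupted, target, pos

seq = ["item_12", "item_98", "item_41", "item_7"]
print(masked_behavior_example(seq))
```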
4.2.2.2. Fine-tuning Paradigm for Recommender Systems
Fine-tuning adapts a pre-trained LLM to specific downstream RecSys tasks by further training it on task-specific datasets, adjusting its parameters.
-
Full-model Fine-tuning:
- Concept: Modifies all or most of the
LLM's weights during fine-tuning. This is straightforward but computationally expensive for very large models. - Examples:
- RecLLM [83]: Fine-tunes
LaMDAforConversational Recommender Systems (CRS)in YouTube video recommendation. - GIRL [87]: Uses supervised fine-tuning for instructing
LLMsin job recommendation. - LMRec (LLMs-driven recommendation) [84]: Addresses bias by using
train-side maskingandtest-side neutralizationof non-preferential entities to mitigate unintended biases fromLLMs. - TransRec [85]: An end-to-end framework for pre-trained
RecSysthat learns directly from raw features of mixed-modality items (text and images), allowing transfer across scenarios without requiring overlapping users or items. - Differentially Private (DP) LLMs [86]: Applies
DP LLMsforprivacy-preserving large-scale RecSys. - Contrastive Learning:
- SBERT [88]: Introduces a
triple loss functionfor intent sentences and corresponding positive/negative product examples in e-commerce. - UniTRec [89]: A unified framework combining
discriminative matching scoresandcandidate text perplexityascontrastive objectivesfor text-based recommendations.
- SBERT [88]: Introduces a
- RecLLM [83]: Fine-tunes
- Concept: Modifies all or most of the
-
Parameter-efficient Fine-tuning (PEFT):
- Concept: Addresses the high computational cost of full-model fine-tuning by only updating a small proportion of the
LLM's weights or adding a few trainable parameters. This makes fine-tuning feasible on limited resources. - Adapter Modules: Small neural networks inserted into the
Transformer layers of LLMs (e.g., after multi-head attention and feed-forward layers). During fine-tuning, only the adapters and layer normalization layers are trained, while the original LLM weights are frozen.
PEFT method. It introduces low-rank decomposition to simulate weight changes. For a pre-trained weight matrix $W_0 \in \mathbb{R}^{d \times k}$, LoRA freezes $W_0$ and adds a parallel pathway that represents the update as $\Delta W = BA$, where $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$ with rank $r \ll \min(d, k)$. Only $A$ and $B$ are trained. The modified forward pass of a linear layer is $ h = W_0x + BAx $ Where:
  - $h$: output of the linear layer.
  - $W_0$: the original (frozen) pre-trained weight matrix.
  - $x$: the input vector.
  - $B$, $A$: low-rank matrices, where $A$ projects the input down to a lower dimension $r$ and $B$ projects it back up.
  - $r$: the rank, a hyperparameter that is typically much smaller than the original dimensions of the weight matrix.
- Examples in RecSys:
-
TallRec [68]: Uses
LoRA to align LLaMA-7B with recommendation tasks, enabling execution on a single RTX 3090 GPU.
GLRec [90]: Leverages
LoRAfor fine-tuningLLMsas job recommenders. -
LLaRA [95]: Utilizes
LoRAto adaptLLMsto different tasks. -
M6 [69]: Applies
LoRAfine-tuning for deployment on mobile devices.The following figure illustrates the fine-tuning workflow:
The image is a schematic diagram showing how LLMs are fine-tuned for recommender systems. The left side shows the structure of the recommendation dataset, including user IDs, purchase histories, and candidate items; the right side contrasts the two approaches of full-model fine-tuning and parameter-efficient fine-tuning. Key formulas cover the loss computation and update process.
The figure illustrates the workflow for fine-tuning LLMs for recommender systems. It shows recommendation datasets (user ID, purchase history, candidate items) as input. The two main fine-tuning strategies are full-model fine-tuning (which updates all LLM parameters) and parameter-efficient fine-tuning (which updates only a small portion of LLM parameters or trainable adapters).
The following are the results from Table 2 of the original paper:
| Paradigms | Methods | References |
| --- | --- | --- |
| Fine-tuning | Full-model Fine-tuning | [83], [84], [85], [86], [87], [88], [89]¹ |
| Fine-tuning | Parameter-efficient Fine-tuning | [68]², [90], [69] |

Code availability: ¹ https://github.com/veason-silverbullet/unitrec, ² https://github.com/sai990323/ta
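As a concrete illustration of the low-rank idea summarized above, here is a minimal PyTorch sketch of a linear layer with a frozen base weight and a trainable LoRA update. It is a simplified sketch under these assumptions, not the implementation used by TallRec or the other cited methods.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer W0 plus a trainable low-rank update B @ A."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False                      # freeze W0
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)   # down-projection
        self.B = nn.Parameter(torch.zeros(out_features, r))         # up-projection, init to zero
        self.scaling = alpha / r

    def forward(self, x):
        # h = W0 x + B A x  (only A and B receive gradients)
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

layer = LoRALinear(768, 768, r=8)
trainable = [name for name, p in layer.named_parameters() if p.requires_grad]
print(trainable)  # ['A', 'B']
```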
4.2.3. Prompting LLMs for Recommender Systems
Prompting involves adapting LLMs to downstream tasks by providing task-specific prompts (text templates) without or with minimal parameter updates. It unifies tasks into language generation, aligned with LLMs' pre-training objectives.
- Prompting Categories: The paper categorizes prompting insights into three main roles for LLMs:
  - LLMs act as recommender: Directly generating recommendations (e.g., top-K recommendation, rating prediction, explanation).
  - Bridge LLMs and RecSys: LLMs augment or refine traditional RecSys (e.g., data augmentation, refinement, API calls).
  - LLM-based autonomous agent: LLMs simulate user behaviors or manage complex recommendations by breaking them into sub-tasks.
The following are the results from Table 3 of the original paper:
| Paradigms | Methods | LLM Tasks | LLM Backbones (References) |
| --- | --- | --- | --- |
| Prompting | Conventional Prompting | Text Summarization | ChatGPT [48] |
| Prompting | Conventional Prompting | Relationship Extraction | ChatGPT [4] |
| Prompting | In-context Learning (ICL) | Recommendation tasks (e.g., rating prediction, top-K recommendation, conversational recommendation, explanation generation); data augmentation of RecSys; data refinement of RecSys; API call of RecSys & tools; user behavior simulation; task planning | GPT-4, ChatGPT, GPT-3, GPT-2, T5, PaLM, ChatGLM, LLaMA |
| Prompting | Chain-of-Thought (CoT) | Recommendation tasks; task planning | T5 [20]; GPT-4, ChatGPT [112], [114] |
| Prompting | Prompt Tuning (Hard Prompt Tuning) | Recommendation tasks | GPT-2 [118] |
| Prompting | Prompt Tuning (Soft Prompt Tuning) | Recommendation tasks | T5, GPT-2, PaLM, M6 [69], [102], [118], [119], [120] |
| Prompting | Instruction Tuning (Full-model Tuning with Prompt) | Recommendation tasks | T5, LLaMA [20], [66] |
| Prompting | Instruction Tuning (Parameter-efficient Model Tuning with Prompt) | Recommendation tasks | LLaMA [68], [121] |
The image below illustrates the various prompting techniques:
The image is a schematic diagram showing three methods for prompting LLMs to enhance recommender systems: In-context Learning, Prompt Tuning, and Instruction Tuning. For each method, the figure shows the relationship between the input data, the task description, and the output.
The figure presents three representative methods for prompting LLMs for recommender systems: In-context Learning (ICL, top), Prompt Tuning (middle), and Instruction Tuning (bottom).
- ICL requires minimal parameter updates to LLMs. It uses task-specific prompts and in-context demonstrations (input-output examples) to guide the LLM to act as a recommender.
- Prompt Tuning involves adding and updating a few prompt tokens (a soft prompt) to LLMs while keeping the main LLM parameters frozen.
- Instruction Tuning fine-tunes LLMs over multiple task-specific prompts (instructions), potentially with parameter-efficient methods like LoRA, enhancing zero-shot performance.
4.2.3.1. Conventional Prompting
- Concept: Early methods focused on unifying downstream tasks (like summarization or relation extraction) into language generation tasks, which align with LLMs' pre-training objectives. This includes prompt engineering (manually crafting prompts) and few-shot prompting (providing a few input-output examples).
- Application: Limited to RecSys tasks that closely resemble language generation, such as review summarization [48] or relation labeling between items [4].
4.2.3.2. In-context Learning (ICL)
- Concept: Introduced with GPT-3, ICL allows LLMs to learn new tasks from contextual information within the prompt, without weight updates. It relies on prompts and in-context demonstrations.
- Settings:
  - Few-shot ICL: Provides a few input-output examples (demonstrations) along with the prompt.
  - Zero-shot ICL: Provides only a natural language description of the task, without demonstrations.
- Roles of LLMs with ICL:
  - LLMs as Recommenders: LLMs are directly prompted to perform RecSys tasks (e.g., top-K recommendation, rating prediction, explanation generation) by providing task descriptions and examples [48, 67]. Role injection (e.g., "You are a book rating expert.") can prevent refusal [67].
  - Bridging LLMs and RecSys: ICL can teach LLMs to interact with traditional RecSys. Chat-Rec [3] leverages ChatGPT to refine candidate items from conventional RecSys. LLMs can also be taught to use external tools via textual API call templates [113] (e.g., graph reasoning tools).
  - LLM-based Autonomous Agents: LLMs are equipped with memory and action modules to simulate user behaviors or manage complex RecSys tasks (e.g., InteRecAgent [114], RecAgent [115], Agent4Rec [116]). Few-shot ICL connects LLMs with these external modules.
The following figure provides a brief template of zero-shot ICL and few-shot ICL for recommendation tasks:
The image is a schematic diagram showing templates for zero-shot ICL and few-shot ICL in recommendation tasks. The left side explains and exemplifies few-shot ICL, while the right side describes zero-shot ICL, emphasizing how the recommendation is made in the given context.
The figure on the left illustrates few-shot ICL, where a prompt defines the task (e.g., "Recommend similar items..."), followed by several input-output examples (demonstrations), and then a new input for the LLM to complete. The figure on the right illustrates zero-shot ICL, where only the task description is provided in the prompt, and the LLM is expected to generate the output for a new input without examples.
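A minimal sketch of how such zero-shot and few-shot ICL prompts can be assembled as strings; the task wording and demonstrations are made up for illustration and do not come from the paper.

```python
# Two example demonstrations (input, output) for a rating-prediction task.
demonstrations = [
    ("User liked: The Matrix, Inception. Candidate: Interstellar.", "Rating: 5"),
    ("User liked: Titanic, The Notebook. Candidate: Mad Max: Fury Road.", "Rating: 2"),
]
task = "You are a movie rating expert. Predict a rating from 1 to 5 for the candidate movie."
query = "User liked: Toy Story, Finding Nemo. Candidate: Up."

# Few-shot ICL: task description + demonstrations + new input.
few_shot_prompt = "\n\n".join([task] + [f"{x}\n{y}" for x, y in demonstrations] + [query])

# Zero-shot ICL: task description + new input, no demonstrations.
zero_shot_prompt = f"{task}\n\n{query}"

print(few_shot_prompt)
```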
4.2.3.3. Chain-of-thought (CoT) Prompting
- Concept: Addresses LLMs' limitations in complex reasoning by annotating intermediate reasoning steps in the prompt. This helps LLMs break down multi-step problems, like those in conversational recommendations, and generate step-by-step reasoning.
- Zero-shot CoT: Inserts phrases like "Let's think step by step" to induce
LLMsto generate reasoning steps without explicit examples. - Few-shot CoT: Augments
ICLdemonstrations by providing input-CoT-output examples, where the reasoning steps are manually designed.
- Zero-shot CoT: Inserts phrases like "Let's think step by step" to induce
- Example (E-commerce): A CoT prompt might guide the LLM to first infer user intent, then identify co-purchased items, and finally select relevant recommendations based on the intent:
  [CoT Prompting] Based on the user purchase history, let's think step-by-step. First, please infer the user's high-level shopping intent. Second, what items are usually bought together with the purchased items? Finally, please select the most relevant items based on the shopping intent and recommend them to the user.
- Application: Used in InteRecAgent [114] and RecMind [112] for task planning and managing complex recommendations.
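The CoT prompt above can also be generated programmatically. The following sketch simply wraps a purchase history and candidate list with step-by-step instructions; the function name and wording are illustrative, not taken from any cited system.

```python
def cot_prompt(purchase_history, candidates):
    """Wrap a recommendation query with step-by-step (chain-of-thought) instructions."""
    return (
        f"The user has purchased: {', '.join(purchase_history)}.\n"
        "Based on the purchase history, let's think step by step. "
        "First, infer the user's high-level shopping intent. "
        "Second, consider which items are usually bought together with the purchased items. "
        f"Finally, select the most relevant items from: {', '.join(candidates)}."
    )

print(cot_prompt(["running shoes", "sports socks"], ["water bottle", "laptop", "headband"]))
```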
4.2.4. Prompt Tuning
Prompt tuning is an additive technique where new prompt tokens are added to LLMs and optimized based on task-specific datasets, typically involving minimal parameter updates.
4.2.4.1. Hard Prompt Tuning
- Concept: Generates and updates discrete text templates (natural language phrases) as prompts. ICL can be seen as a subclass, where in-context demonstrations are part of a hard prompt.
- Challenge: Faces discrete optimization, requiring laborious trial-and-error to find suitable prompts in the vast vocabulary space.
4.2.4.2. Soft Prompt Tuning
- Concept: Uses continuous vectors (embeddings) as prompts, which are optimized using gradient methods based on task-specific datasets. These soft prompt tokens are typically concatenated to the original input tokens at the LLM's input layer. Only the soft prompt and minimal input layer parameters are updated.
- Methods:
  - Feature-based Integration: Capturing user representations via contrastive learning and encoding them into prompt tokens [127]. Encoding mutual information in cross-domain recommendations into soft prompts [72, 128].
  - Learned Prompts: Randomly initialized soft prompts are optimized end-to-end with respect to a recommendation loss based on the LLM's output (e.g., T5) [119].
- Trade-off: More feasible for continuous optimization but less interpretable than hard prompts.
4.2.5. Instruction Tuning
Instruction tuning combines prompting and fine-tuning by fine-tuning LLMs over multiple task-specific prompts (instructions). This enhances LLMs' ability to follow diverse instructions and improves zero-shot performance on unseen tasks.
- Instruction (Prompt) Generation Stage:
  - Concept: Creates instruction-based prompts in natural language, consisting of task-oriented input (based on RecSys data) and desired output.
  - Templates: Recommendation-oriented instruction templates (user preferences, intentions, task forms) [20]. Three-part templates like "task description-input-output" [68, 70].
- Model Tuning Stage:
  - Concept: Fine-tunes LLMs on the generated instructions. This can be done via:
    - Full-model Tuning: Updating all LLM parameters.
    - Parameter-efficient Model Tuning: Using PEFT methods like LoRA for lightweight tuning (e.g., TallRec [68] for LLaMA).
- Beyond Text: Explored for enhancing graph understanding in RecSys [90], where an LLM-based prompt constructor encodes paths in behavior graphs into natural language descriptions for instruction tuning.
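A minimal sketch of the three-part "task description-input-output" instruction format described above; the field names, wording, and helper function are hypothetical, not the templates used in the cited works.

```python
# Hypothetical builder for one instruction-tuning example.
def build_instruction(user_history, candidate, label):
    return {
        "instruction": "Given the user's purchase history, decide whether the user "
                       "will like the candidate item. Answer with Yes or No.",
        "input": f"Purchase history: {', '.join(user_history)}. Candidate: {candidate}.",
        "output": "Yes" if label else "No",
    }

example = build_instruction(["mystery novel", "thriller novel"], "crime novel", label=True)
print(example)
```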
5. Experimental Setup
This paper is a comprehensive survey of LLM-empowered Recommender Systems. As such, the authors do not conduct their own experiments or propose a novel experimental setup. Instead, they review and synthesize the methodologies and findings from numerous existing research papers in this rapidly evolving field. Therefore, the traditional "Experimental Setup" sections (Datasets, Evaluation Metrics, Baselines) are not applicable to this survey paper itself.
However, the "Future Directions" section (Section 6) of this paper implicitly highlights areas where future research will require well-defined experimental setups, including specific datasets, evaluation metrics, and baselines to address the identified challenges. For instance, hallucination mitigation would require datasets specifically designed to test factual correctness, and trustworthiness research would involve metrics for fairness, robustness, and privacy, often benchmarked against existing LLM-RecSys or traditional RecSys baselines.
6. Results & Analysis
As this paper is a survey, it does not present its own experimental results or comparative analysis based on experiments conducted by its authors. The "Results & Analysis" section, therefore, does not apply in the conventional sense of reporting novel experimental findings, tables of numerical data, or ablation studies from this paper's work.
Instead, the paper's "results" are its comprehensive synthesis and categorization of existing research, identifying trends, common approaches, and outstanding challenges in the field of LLM-empowered Recommender Systems. Through its systematic review, the paper implicitly demonstrates:
- Effectiveness of LLM Integration: The widespread and diverse research efforts (as summarized in Tables 1, 2, and 3) indicate that LLMs are indeed being effectively integrated into RecSys across various tasks (e.g., top-K recommendation, rating prediction, conversational recommendation, explanation generation).
- Validation of LLM Capabilities: The reviewed methods leverage LLMs' strong Natural Language Understanding (NLU) for richer item/user representations, generalization capabilities for adaptability to new tasks, and reasoning for more explainable and complex recommendations.
- Emergence of Adaptation Paradigms: The clear categorization into Pre-training, Fine-tuning, and Prompting paradigms, along with their respective sub-methods, highlights the main strategies researchers are employing to adapt LLMs for RecSys.
- Identification of Key Challenges: The dedicated section on future directions (Section 6) serves as an analysis of the limitations and open problems faced by current LLM-RecSys research, such as hallucination, trustworthiness issues, and fine-tuning efficiency.
The tables presented in Section 4 (Methodology) of this analysis, specifically Table 1 (Pre-training methods for LLM-empowered RecSys), Table 2 (Fine-tuning methods for LLM-empowered RecSys), and Table 3 (Prompting methods for LLM-empowered RecSys), are part of the original paper's contribution as a survey. They summarize the methods and LLM backbones used in other research, not the results of experiments performed by the authors of this survey. They serve as a structured data presentation of the current landscape of LLM-RecSys research.
7. Conclusion & Reflections
7.1. Conclusion Summary
This survey provides a comprehensive overview of Large Language Models (LLMs) in Recommender Systems (RecSys), highlighting their significant potential to revolutionize the field. It systematically categorizes existing LLM-empowered RecSys into three primary paradigms: Pre-training, Fine-tuning, and Prompting. The paper first explains how LLMs can serve as powerful feature encoders for user and item representations, distinguishing between ID-based and textual side information-enhanced approaches. It then delves into specific techniques within pre-training (e.g., Masked Behavior Prediction, Next K Behavior Prediction), fine-tuning (e.g., full-model fine-tuning, Parameter-Efficient Fine-Tuning (PEFT) like LoRA), and prompting (e.g., in-context learning (ICL), Chain-of-Thought (CoT) prompting, prompt tuning, instruction tuning). The authors conclude by discussing several critical future directions that need to be addressed for the field to mature.
7.2. Limitations & Future Work
The authors identify several key limitations and promising future research directions for LLM-empowered RecSys:
- Hallucination Mitigation: LLMs can generate plausible-sounding but factually incorrect information. This poses a severe threat, especially in high-stakes RecSys (e.g., medical, legal). Future work needs to explore using factual knowledge graphs and scrutinizing model outputs to verify accuracy.
- Trustworthy Large Language Models for Recommender Systems: The widespread adoption of LLMs in RecSys necessitates addressing trustworthiness across four crucial dimensions:
  - Safety & Robustness: LLMs are vulnerable to adversarial perturbations. Research is needed on automatic pre-processing of prompts (e.g., for malicious input) and adversarial training to enhance stability.
  - Non-discrimination & Fairness: LLMs can perpetuate biases present in training data, leading to discriminatory recommendations. More research is required to address user-side and item-side fairness comprehensively, potentially through guided prompting.
  - Explainability: The "black box" nature of many advanced LLMs makes it hard for users to understand why a recommendation was made. Future work should focus on understanding LLM internal mechanisms to enhance explainability.
  - Privacy: LLMs rely on vast amounts of personal data, risking leakage of sensitive user information. Protecting user privacy through techniques like differentially private LLMs and leveraging federated learning with LLMs as controllers is crucial.
- Vertical Domain-Specific LLMs for Recommender Systems: General LLMs are versatile, but domain-specific LLMs (e.g., for health, finance) offer higher expertise and practicality for specialized RecSys. The challenge lies in collecting high-quality domain-specific data and developing suitable tuning strategies.
- Users & Items Indexing: LLMs can struggle with long text inputs, and ID-based approaches lack semantic richness. Future work should explore advanced indexing methods that combine collaborative knowledge from user-item interactions with semantic information from LLMs to address the long text problem and capture user preferences more effectively.
- Fine-tuning Efficiency: Fine-tuning LLMs is computationally expensive. Improving efficiency, especially for multi-modal RecSys, using techniques like adapter modules and exploring end-to-end training optimization is a key direction.
- Data Augmentation: Traditional data collection is resource-intensive. LLMs can serve as powerful tools for data augmentation, generating synthetic user behaviors or personalized content to bolster RecSys recommendations (e.g., RecAgent, LLM-Rec).
7.3. Personal Insights & Critique
This survey is an extremely timely and valuable resource for anyone entering or already working in the field of LLM-empowered RecSys. Its strength lies in its comprehensive categorization and structured presentation of a rapidly evolving domain, making it beginner-friendly while maintaining academic rigor.
Inspirations and Transferability:
- Unified View: The paper's categorization of LLM adaptation into Pre-training, Fine-tuning, and Prompting provides a powerful framework that is not only applicable to RecSys but can also be generalized to understand how LLMs are being integrated into almost any other application domain in AI.
- Leveraging Existing Strengths: It clearly shows that LLMs are not just replacing RecSys but enhancing them. The integration of LLMs as feature encoders, data augmenters, or reasoning engines highlights a symbiotic relationship rather than a complete overhaul.
- The Power of Prompting: The detailed breakdown of prompting techniques (especially ICL and CoT) underscores their disruptive potential for lightweight adaptation and unlocking complex reasoning, reducing the need for massive dataset curation and retraining. This agile approach is highly appealing.
Potential Issues, Unverified Assumptions, or Areas for Improvement:
- Rapid Obsolescence: The field of LLMs is progressing at an unprecedented pace. While this survey is current as of its publication date (July 2023), some specific methods or LLM backbones mentioned might quickly become outdated, or new paradigms might emerge. A living document or more frequent updates would be ideal, though impractical for a static paper.
- Benchmarking and Fair Comparison: As a survey, it doesn't present its own comparative results. The effectiveness of many LLM-based RecSys is often shown in isolation or against older baselines. A critical open question, which the survey implicitly raises, is the consistent benchmarking and fair comparison of these diverse LLM-based methods across standardized RecSys tasks and datasets, especially given the varying LLM backbones and adaptation techniques.
- Computational Cost as a Barrier: While PEFT methods are discussed, the underlying computational cost of operating and fine-tuning large LLMs (even efficiently) remains a significant practical barrier for many researchers and smaller organizations. The economic implications are a crucial factor often underemphasized.
- The "Black Box" Dilemma and Explainability: While CoT offers a path to explainability, the fundamental opacity of very large LLMs (especially closed-source ones like ChatGPT/GPT-4) can create tension with the trustworthiness goal of explainability. The generated "explanation" might merely be a plausible narrative rather than a true reflection of the model's decision process.
- Data Quality and Bias Amplification: LLMs are powerful pattern matchers. If the input data (textual side information, interaction history for prompting) contains biases or noise, LLMs are likely to amplify these, leading to unfair or suboptimal recommendations. The proposed trustworthiness dimension is critical, but its practical implementation is extremely challenging.
- The "Long Context" Problem: The paper mentions the long text problem in Users & Items Indexing. While LLMs are getting better at longer contexts, there are still limitations in their ability to fully utilize very long sequences of user interactions or item descriptions without losing important details.
- Interoperability with Traditional RecSys: The "Bridge LLMs and RecSys" paradigm is crucial. More research on seamless and efficient interoperability, where LLMs complement and enhance specialized traditional RecSys components rather than entirely replacing them, is important for practical deployments.