- Title: CoRA: Collaborative Information Perception by Large Language Model’s Weights for Recommendation
- Authors: Yuting Liu, Jinghao Zhang, Yizhou Dang, Yuliang Liang, Qiang Liu, Guibing Guo, Jianzhe Zhao, Xingwei Wang
- Affiliations: Northeastern University, China and Institute of Automation, Chinese Academy of Sciences, China. The authors have expertise in recommendation systems, large language models, and pattern recognition.
- Journal/Conference: Published in the proceedings of the AAAI Conference on Artificial Intelligence (AAAI). AAAI is a top-tier, highly respected conference in the field of artificial intelligence, indicating the work is of high quality and significance.
- Publication Year: 2025
- Abstract: The paper addresses the challenge of integrating collaborative information into Large Language Models (LLMs) for recommendation. It identifies two key limitations in existing methods that fine-tune LLMs or add collaborative features to input prompts: (1) fine-tuning can degrade the LLM's general knowledge, and (2) adding collaborative features can disrupt the semantic meaning of the text prompts. To solve this, the authors propose Collaborative LoRA (CoRA), a new method that aligns collaborative information with the LLM's parameter space instead of its input space. CoRA uses a "collaborative query generator" to transform user and item embeddings from a collaborative filtering model into low-rank incremental weights. These weights are merged directly into the LLM's own weights, allowing the model to perceive collaborative signals without altering the original prompts or requiring full fine-tuning. The main finding is that this approach effectively enhances recommendation performance by successfully integrating both general knowledge from the LLM and collaborative information.
- Original Source Link: https://ojs.aaai.org/index.php/AAAI/article/view/33334 (Formally Published)
2. Executive Summary
- Foundational Concepts:
- Large Language Models (LLMs): These are massive neural networks (e.g., GPT-3, LLaMA, Vicuna) trained on vast amounts of text data. They excel at understanding and generating human-like text. Most modern LLMs use a decoder-only architecture (illustrated in Figure 3a), which consists of stacked blocks containing a Multi-Head Self-Attention module (for understanding context within the input text, Figure 3b) and a Feed-Forward Network (for further processing).
- Collaborative Filtering (CF): A classic recommendation technique based on the idea that users with similar tastes will like similar items. CF models typically learn a low-dimensional vector, called an embedding or latent factor, for each user and item. The similarity between a user's and an item's embedding is used to predict preference.
- LoRA (Low-Rank Adaptation): A highly efficient technique for fine-tuning LLMs. Instead of updating the entire, massive weight matrix $W$ of a model layer, LoRA freezes $W$ and learns a small, low-rank update $\Delta W = BA$, where $B$ and $A$ are much smaller matrices. This drastically reduces the number of trainable parameters and memory usage. CoRA cleverly adapts this idea by generating the low-rank matrices $B$ and $A$ on the fly based on collaborative information (a minimal sketch follows this list).
- LLM for Recommendation (LLMRec): A research area focused on leveraging LLMs for recommendation. This is often done by converting the recommendation task (e.g., "predict if user U will like item I") into a natural language prompt that the LLM can answer.
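As a reference point for the LoRA concept above, here is a minimal sketch assuming a PyTorch-style implementation; the class name, rank, and scaling constant are illustrative choices, not details from the paper.

```python
# Illustrative LoRA sketch (not the paper's code): the base weight W is frozen
# and only the low-rank factors A and B are trained, so ΔW = B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)                  # freeze W
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)   # low-rank factor A
        self.B = nn.Parameter(torch.zeros(d_out, rank))         # low-rank factor B, init to 0
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x W^T + scale * x (BA)^T; gradients flow only into A and B.
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(d_in=512, d_out=512)
print(layer(torch.randn(2, 10, 512)).shape)  # torch.Size([2, 10, 512])
```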
- Previous Works:
- The paper categorizes prior work into two main groups:
- LLMRec without Collaborative Info: Methods like TALLRec and ICL (In-Context Learning) rely solely on textual descriptions and prompting strategies. They perform well on cold-start items but miss the powerful signals from user interaction history.
- LLMRec with Collaborative Info: This is the most relevant category. Methods like CoLLM, BinLLM, and LlaRA attempt to solve the problem by mapping collaborative embeddings into the LLM's input space. For example, CoLLM creates special soft tokens for users/items, and BinLLM converts embeddings into binary sequences. The paper argues these methods suffer from the semantic disruption problem.
- Differentiation:
- The key innovation of CoRA is its shift from input space alignment to parameter space alignment.
- Previous Methods (Input Space): Prompt + [Collaborative Features] -> LLM -> Prediction. This mixes text and non-text inputs, confusing the LLM.
- CoRA (Parameter Space): [Collaborative Features] -> Collaborative Weights Generator -> ΔW. Then, Prompt -> LLM(W+ΔW) -> Prediction. This keeps the prompt clean and injects the collaborative signal directly into the model's reasoning mechanism.
4. Methodology (Core Technology & Implementation)
The CoRA framework, depicted in the overall architecture diagram, operates in a two-stage process for each prediction. The base LLM weights remain frozen throughout.
This image is the paper's schematic of the overall CoRA architecture, depicting how the collaborative filtering model generates collaborative features that are processed by queries through a self-attention mechanism, and how the resulting collaborative weights are finally merged with the LLM to achieve personalized recommendation.
Figure 4 Interpretation: The diagram shows the complete workflow. Recommendation data is used to train a Collaborative Filtering Model to produce Collab. Features (user/item embeddings). These features are fed into the Collaborative Query Generator (the dashed box containing Self-Attention, Cross-Attention, etc.). This generator outputs a representation that is transformed into incremental weights (the "Projector" part, represented by the flame icon). These weights are added to the frozen LLM (the snowflake icon). The modified LLM then processes a standard text prompt to produce a "Yes/No" recommendation.
- Step 1: Generating Collaborative Queries
- Obtain Embeddings: For a given user $u$ and item $i$, a pre-trained CF model provides their respective embeddings, $e_u$ and $e_i$. These are concatenated to form a single feature vector $[e_u, e_i]$.
- Collaborative Query Generator: This module is designed to "distill" the essential collaborative information from the embeddings.
- It starts with $k$ learnable query embeddings. These act as learnable "questions" about the collaborative context.
- These queries pass through a series of $N$ decoder-like blocks. In each block:
- Self-Attention: The k queries interact with each other, allowing them to share information and specialize.
- Cross-Attention: The queries attend to the input collaborative features $[e_u, e_i]$. This is the crucial step where the queries "absorb" the user-item-specific collaborative signal.
- Feed-Forward Network: A standard MLP for further transformation.
- After passing through all blocks, a pooling operation aggregates the $k$ processed queries into a single, final collaborative query embedding $q_c$.
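To make Step 1 concrete, below is a hedged sketch of a collaborative query generator of this kind, assuming a PyTorch implementation; the module names, sizes, and the choice to expose $[e_u, e_i]$ to cross-attention as a length-2 sequence are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch, not the paper's implementation.
import torch
import torch.nn as nn

class QueryGeneratorBlock(nn.Module):
    def __init__(self, d: int, n_heads: int = 4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
        self.ln1, self.ln2, self.ln3 = nn.LayerNorm(d), nn.LayerNorm(d), nn.LayerNorm(d)

    def forward(self, queries, collab_feats):
        # Self-attention: the k queries exchange information with each other.
        q = self.ln1(queries + self.self_attn(queries, queries, queries)[0])
        # Cross-attention: the queries "absorb" the user-item collaborative features.
        q = self.ln2(q + self.cross_attn(q, collab_feats, collab_feats)[0])
        return self.ln3(q + self.ffn(q))

class CollabQueryGenerator(nn.Module):
    def __init__(self, d: int = 256, k: int = 8, n_blocks: int = 2):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, k, d) * 0.02)   # k learnable queries
        self.blocks = nn.ModuleList(QueryGeneratorBlock(d) for _ in range(n_blocks))

    def forward(self, e_u, e_i):
        collab_feats = torch.stack([e_u, e_i], dim=1)              # (B, 2, d)
        q = self.queries.expand(e_u.size(0), -1, -1)
        for blk in self.blocks:
            q = blk(q, collab_feats)
        return q.mean(dim=1)                                       # pooled q_c: (B, d)

gen = CollabQueryGenerator()
print(gen(torch.randn(4, 256), torch.randn(4, 256)).shape)  # torch.Size([4, 256])
```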
- Step 2: Collaborative Perception in LLM via Weight Generation
- The goal is to convert the information-rich query $q_c$ into a weight update $\Delta W$ with the same dimensions as a weight matrix in the LLM. Doing this directly would require an enormous number of parameters.
- Inspired by LoRA, CoRA instead generates a low-rank weight update. The query $q_c$ is projected by a fully connected layer, reshaped into one low-rank matrix, and multiplied by a learned projector matrix to form the final weight update:
$$W_c = \Delta W_A \Delta W_B = \mathcal{R}(q_c W_{FC}) W_{proj}$$
- $q_c$: the collaborative query embedding from the previous step.
- $W_{FC}$: a fully connected layer that projects $q_c$ to an intermediate representation.
- $\mathcal{R}(\cdot)$: a reshape operator that shapes the output into the first low-rank matrix, $\Delta W_A$, of size $d_{model} \times r$.
- $W_{proj}$: a linear projector matrix that acts as the second low-rank matrix $\Delta W_B$ (written simply as $W_{proj}$ in the paper's notation), of size $r \times d_{model}$.
- $W_c$: the final user-item-specific collaborative weight matrix, with dimensions $d_{model} \times d_{model}$.
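A minimal sketch of this weight-generation step, assuming PyTorch; `W_FC`, `W_proj`, and the chosen sizes mirror the formula's notation but are otherwise illustrative.

```python
# Illustrative sketch of generating the low-rank weight W_c from q_c.
import torch
import torch.nn as nn

class CollabWeightGenerator(nn.Module):
    def __init__(self, d_query: int, d_model: int, rank: int = 8):
        super().__init__()
        self.d_model, self.rank = d_model, rank
        self.W_FC = nn.Linear(d_query, d_model * rank, bias=False)  # q_c -> reshaped to ΔW_A
        self.W_proj = nn.Parameter(torch.zeros(rank, d_model))      # acts as ΔW_B; zero init
                                                                    # keeps W_c = 0 at the start

    def forward(self, q_c: torch.Tensor) -> torch.Tensor:
        # ΔW_A = R(q_c W_FC), shape (B, d_model, r); W_c = ΔW_A @ W_proj.
        delta_W_A = self.W_FC(q_c).view(-1, self.d_model, self.rank)
        return delta_W_A @ self.W_proj                              # (B, d_model, d_model)

gen = CollabWeightGenerator(d_query=256, d_model=512, rank=8)
print(gen(torch.randn(4, 256)).shape)  # torch.Size([4, 512, 512])
```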
- Step 3: Merging Weights and Making Predictions
- The generated collaborative weight $W_c$ is added to the corresponding pre-trained weight matrix $W$ of one or more layers in the LLM:
$$\hat{W} = W + W_c$$
- This creates a temporarily modified LLM, $\mathrm{LLM}_{\hat{W}}$, which is now "aware" of the collaborative context for the specific user-item pair.
- This modified LLM then processes a standard, clean text prompt (as shown in Table 1) to generate the final prediction (e.g., "Yes" or "No").
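The merge can be sketched as follows (PyTorch assumed; function and variable names are illustrative). Because $W_c$ is generated per user-item pair, the sketch applies it as a batched residual path, which is mathematically equivalent to a forward pass with the merged weight $\hat{W} = W + W_c$ while leaving the frozen LLM layer untouched.

```python
# Illustrative sketch of applying W + W_c for one forward pass.
import torch
import torch.nn as nn

def forward_with_collab_weight(frozen_linear: nn.Linear,
                               hidden: torch.Tensor,
                               W_c: torch.Tensor) -> torch.Tensor:
    """hidden: (B, seq, d_model); W_c: (B, d_model, d_model) from the generator."""
    base_out = frozen_linear(hidden)                      # h W^T with the frozen weight
    collab_out = torch.bmm(hidden, W_c.transpose(1, 2))   # h W_c^T, one W_c per example
    return base_out + collab_out

d_model = 512
attn_proj = nn.Linear(d_model, d_model, bias=False)
attn_proj.weight.requires_grad_(False)                    # the LLM layer stays frozen
h = torch.randn(2, 16, d_model)
W_c = torch.randn(2, d_model, d_model) * 1e-3             # small incremental weights
print(forward_with_collab_weight(attn_proj, h, W_c).shape)  # torch.Size([2, 16, 512])
```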
- Training Strategy
- The base LLM weights $W$ are frozen.
- The only trainable parameters $\Theta$ are those of the collaborative query generator and the final projection layers ($W_{FC}$ and $W_{proj}$).
- The model is trained end-to-end to minimize the binary cross-entropy (BCE) loss between the predicted output $\hat{y}$ and the ground-truth label $y$:
$$\hat{\Theta} = \arg\min_{\Theta} \sum_{(u,i,y) \in \mathcal{D}} \ell(\hat{y}, y)$$
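A hedged sketch of this training strategy, assuming PyTorch; `cf_model`, `llm_forward`, and the batch fields are hypothetical placeholders. Only the query generator and weight generator receive gradients; the LLM and the pre-trained CF embeddings stay frozen.

```python
# Illustrative training step, not the paper's code.
import torch
import torch.nn as nn

def train_step(batch, cf_model, query_gen, weight_gen, llm_forward, optimizer):
    """batch: dict with user ids, item ids, text prompts, and 0/1 labels."""
    with torch.no_grad():                                   # frozen, pre-trained CF model
        e_u = cf_model.user_embedding(batch["user_ids"])
        e_i = cf_model.item_embedding(batch["item_ids"])

    q_c = query_gen(e_u, e_i)                               # Step 1: collaborative query
    W_c = weight_gen(q_c)                                   # Step 2: low-rank weight W_c
    logits = llm_forward(batch["prompts"], W_c)             # Step 3: frozen LLM with W + W_c

    loss = nn.functional.binary_cross_entropy_with_logits(  # BCE against ground truth y
        logits, batch["labels"].float())
    optimizer.zero_grad()
    loss.backward()                                         # gradients reach only Θ
    optimizer.step()
    return loss.item()

# The optimizer holds only the collaborative modules' parameters, e.g.:
# torch.optim.AdamW(list(query_gen.parameters()) + list(weight_gen.parameters()), lr=1e-4)
```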
This is a transcribed version of Table 1 from the paper, showing an example prompt.

| Prompt template |
| --- |
| #Question: A user has given high ratings to the following movies: {ItemTitleList}. Leverage the information to predict whether the user would enjoy the movie titled {TargetItemTitle}? Answer with "Yes" or "No". \n#Answer: |
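For illustration, filling this template is a plain string-formatting step; the placeholder titles below are made up, not examples from the paper's datasets.

```python
# Filling the Table 1 template; {ItemTitleList} and {TargetItemTitle} follow the template above.
PROMPT_TEMPLATE = (
    '#Question: A user has given high ratings to the following movies: '
    '{ItemTitleList}. Leverage the information to predict whether the user '
    'would enjoy the movie titled {TargetItemTitle}? Answer with "Yes" or "No".'
    '\n#Answer:'
)

prompt = PROMPT_TEMPLATE.format(
    ItemTitleList='"Movie A", "Movie B", "Movie C"',   # made-up placeholders
    TargetItemTitle='"Movie D"',
)
print(prompt)
```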
5. Experimental Setup
6. Results & Analysis
- Core Results:
This is a transcribed version of Table 3 from the paper, showing the main performance comparison.

| Category | Method | Amazon-Book AUC | Amazon-Book UAUC | Amazon-Book Improve | ML-1M AUC | ML-1M UAUC | ML-1M Improve |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Collab. | MF | 0.7105 | 0.5543 | 14.04% | 0.6486 | 0.6396 | 10.56% |
| Collab. | LightGCN | 0.7026 | 0.5619 | 13.93% | 0.5858 | 0.6512 | 15.68% |
| Collab. | SASRec | 0.6675 | 0.5614 | 17.04% | 0.7005 | 0.6734 | 3.65% |
| LLMRec | ICL | 0.5180 | 0.5043 | 51.61% | 0.5119 | 0.5178 | 38.37% |
| LLMRec | Prompt4NR | 0.6527 | 0.5011 | 25.10% | 0.7027 | 0.6713 | 3.28% |
| LLMRec | TALLRec | 0.6583 | 0.4971 | 25.11% | 0.7044 | 0.6741 | 3.31% |
| LLMRec w/ Collab. | PersonPrompt | 0.7113 | 0.5596 | 13.44% | 0.7014 | 0.6503 | 5.40% |
| LLMRec w/ Collab. | CoLLM-MF | 0.8021 | 0.5782 | 5.14% | 0.7028 | 0.6714 | 3.64% |
| LLMRec w/ Collab. | CoLLM-LGCN | 0.7835 | 0.5663 | 7.48% | 0.7164 | 0.6842 | 4.68% |
| LLMRec w/ Collab. | CoLLM-SAS | 0.7538 | 0.5874 | 7.55% | 0.7059 | 0.6531 | 4.84% |
| LLMRec w/ Collab. | BinLLM | 0.8157 | 0.5724 | 4.83% | 0.7132 | 0.6815 | 2.11% |
| Ours | CoRA-MF | 0.8179 | 0.6262 | - | 0.7361 | 0.6884 | - |
| Ours | CoRA-LGCN | 0.7886 | 0.5689 | - | 0.7128 | 0.6966 | - |
| Ours | CoRA-SAS | 0.7677 | 0.5961 | - | 0.7019 | 0.6517 | - |
- Key Insight: CoRA consistently achieves the best performance. For instance, CoRA-MF on Amazon-Book scores 0.8179 AUC, outperforming BinLLM (0.8157) and CoLLM-MF (0.8021). The improvement is even more pronounced in the UAUC metric (0.6262 for CoRA-MF vs. 0.5782 for CoLLM-MF), suggesting CoRA provides better personalization across users. This validates that parameter-space alignment is a more effective integration strategy.
- Warm and Cold Performance Analysis
This image is a chart comparing the AUC and UAUC performance of CoRA and multiple baseline models under warm and cold scenarios on the Amazon-Book and ML-1M datasets; CoRA performs best in every scenario.
- Warm-start (users with many interactions): In parts (a) and (b), CoRA achieves the highest AUC and UAUC. This demonstrates its superior ability to leverage the rich collaborative signals available in warm scenarios.
- Cold-start (users with few interactions): In parts (c) and (d), all LLM-based methods outperform the traditional CF model (MF), because they can rely on textual descriptions of items. CoRA again leads the pack, indicating that it successfully combines the general knowledge of the LLM (for understanding new items) with the learned collaborative patterns from the CF model, making it robust in both scenarios.
- Ablations / Parameter Sensitivity
- Effectiveness of Integrating Textual Information
This image is Figure 6, showing a performance comparison of different variants on the Amazon and ML-1M datasets. In the figure, "ID-Only" denotes removing item textual information and "w/ Text" denotes adding item text descriptions; multiple models are compared on the AUC and UAUC metrics.
- ID-Only vs. w/ Text: This study compares performance when using only collaborative embeddings (ID-Only) versus using both embeddings and item text (w/ Text).
- Key Finding: CoRA shows a substantial performance gain when textual information is added. In contrast, CoLLM's performance on ML-1M decreases with the addition of text. This is strong evidence supporting the paper's central hypothesis: CoLLM's input-space alignment causes interference between collaborative features and text semantics, while CoRA's parameter-space alignment avoids this conflict and creates a positive synergy.
- Impact of Collaborative Weight Type
This is a transcribed version of Table 4 from the paper.

| Weight Type | Amazon-Book AUC | Amazon-Book UAUC | ML-1M AUC | ML-1M UAUC |
| --- | --- | --- | --- | --- |
| qkvof | 0.8141 | 0.6068 | 0.7312 | 0.6801 |
| qkvo | 0.8179 | 0.6262 | 0.7361 | 0.6884 |
| qkv | 0.7741 | 0.5747 | 0.6947 | 0.5933 |
| qko | 0.8091 | 0.5949 | 0.7111 | 0.5973 |
| qk | 0.7685 | 0.5644 | 0.6784 | 0.5887 |
- Analysis: The experiment tests which weights in the LLM's decoder block should be modified. The labels q, k, v, o, and f refer to the query, key, value, and output projection weights in the self-attention module and the feed-forward network weights, respectively. The best performance (qkvo) is achieved by adapting all weights within the self-attention module while leaving the feed-forward network unchanged. This suggests that the collaborative signal is most effectively injected by modulating how the model weighs and combines information from its input sequence.
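As an illustration of how such a "weight type" setting might be wired up, here is a small sketch; the module-name suffixes (`q_proj`, `ffn`, etc.) are assumptions in the style of common decoder implementations, not names from the paper's code.

```python
# Illustrative mapping from the Table 4 weight-type labels to LLM sub-modules.
TARGET_CONFIGS = {
    "qkvof": ["q_proj", "k_proj", "v_proj", "o_proj", "ffn"],
    "qkvo":  ["q_proj", "k_proj", "v_proj", "o_proj"],   # best setting in Table 4
    "qkv":   ["q_proj", "k_proj", "v_proj"],
    "qko":   ["q_proj", "k_proj", "o_proj"],
    "qk":    ["q_proj", "k_proj"],
}

def should_adapt(module_name: str, weight_type: str = "qkvo") -> bool:
    """Return True if this sub-module should receive the collaborative update W + W_c."""
    return any(module_name.endswith(suffix) for suffix in TARGET_CONFIGS[weight_type])

print(should_adapt("model.layers.0.self_attn.q_proj"))  # True
print(should_adapt("model.layers.0.mlp.ffn"))           # False under "qkvo"
```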
7. Conclusion & Reflections
- Conclusion Summary:
The paper successfully identifies and provides evidence for two critical problems in adapting LLMs for recommendation: knowledge degradation from fine-tuning and semantic disruption from input-space feature injection. It proposes CoRA, an elegant and effective solution that aligns collaborative information with the LLM's parameter space. By generating user-item specific low-rank weights, CoRA enables the LLM to perceive collaborative signals without compromising its inherent knowledge or disrupting text prompts. The experimental results robustly demonstrate the superiority of this paradigm, establishing a new state-of-the-art for LLM-based recommendation.
- Limitations & Future Work:
The authors acknowledge that the work can be extended in several directions:
- Testing the framework with other backbone LLMs to verify its generalizability.
- Applying the CoRA paradigm to other recommendation tasks beyond binary prediction, such as sequential or conversational recommendation.
- Exploring its use in device-cloud collaborative learning scenarios, which could have practical implications for privacy and efficiency.
- Personal Insights & Critique:
- Novelty and Elegance: The central idea of generating dynamic, instance-specific LoRA weights is highly innovative. It's a prime example of creatively repurposing a general machine learning technique (LoRA) for a specific domain problem. The solution is elegant because it cleanly separates the concerns of text understanding and collaborative signal processing.
- Generalizability: The "perceptual weights" concept pioneered in computer vision (VLoRA) and adapted here for collaborative filtering is very powerful. This architectural pattern could likely be applied to inject any form of structured, non-textual side information (e.g., knowledge graphs, user social networks, item modalities) into an LLM without corrupting the language-based input.
- Dependency on CF Model: A potential weakness is that CoRA's performance is inherently tied to the quality of the initial embeddings produced by the pre-trained collaborative filtering model. If the CF model is weak or trained on sparse data, the collaborative signals passed to the LLM will be noisy, limiting the overall performance. Future work could explore joint training of the CF model and the CoRA generator to allow for end-to-end optimization.
- Inference Overhead: While training is efficient, there is a computational cost at inference time to run the collaborative query generator for every user-item pair. Although this is likely small compared to the LLM's forward pass, it's a practical consideration for real-time systems serving millions of requests.