ITMPRec: Intention-based Targeted Multi-round Proactive Recommendation
TL;DR Summary
ITMPRec is a novel intention-based targeted multi-round proactive recommendation method that addresses passive acceptance in personalized systems by selecting target items through pre-matching, utilizing multi-round nudging, and simulating user feedback with an LLM agent, outperf
Abstract
Personalized recommendations are integrated into daily life, but providers may want certain items to become more appealing over time through user interactions, yet this issue is often overlooked. The existing works are often based on the assumption that users will passively accept all intermediate sequences or not explore intention modeling in the targeted nudging process. Both of these factors result in suboptimal performance in the proactive recommendation. In this paper, we propose a novel intention-based targeted multi-round proactive recommendation method, dubbed ITMPRec. We first select target items using a pre-match strategy. Then, we employ a multi-round nudging recommendation method, incorporating a module to quantify users’ intention-level evolution, helping choose suitable intermediate items. Additionally, we model users’ sensitivity to changes caused by these items. Lastly, we propose an LLM agent as a pluggable component to simulate user feedback, offering an alternative to traditional click models by leveraging the agent’s external knowledge and reasoning capabilities. Through extensive experiments on four public datasets, we demonstrate the superiority of ITMPRec compared to eight baseline models.
1. Bibliographic Information
1.1. Title
The central topic of the paper is ITMPRec: Intention-based Targeted Multi-round Proactive Recommendation. This title suggests a novel approach to recommendation systems that focuses on proactively guiding users towards specific items over multiple interactions, taking user intentions into account.
1.2. Authors
- Yahong Lian (College of Computer Science, TJ Key Lab of NDST, DISSec, TMCC, TBI Center, Nankai University, Tianjin, China)
- Chunyao Song (College of Computer Science, TJ Key Lab of NDST, DISSec, TMCC, TBI Center, Nankai University, Tianjin, China)
- Tingjian Ge (Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, USA)
The authors are primarily affiliated with computer science departments and research centers, indicating a background in computational methods, data science, and artificial intelligence, which are highly relevant to recommendation systems.
1.3. Journal/Conference
The paper was published in the Proceedings of the ACM Web Conference 2025 (WWW '25). WWW (The Web Conference, formerly known as World Wide Web Conference) is a premier international academic conference on topics related to the World Wide Web. Its reputation is very high in the fields of information retrieval, web mining, and recommendation systems, making it a highly influential venue for research in these areas. Publication at WWW signifies that the research has undergone rigorous peer review and is considered to be of significant quality and impact.
1.4. Publication Year
2025 (This indicates it's a forthcoming publication or accepted paper for WWW '25).
1.5. Abstract
Personalized recommendation systems are ubiquitous in daily life. However, they often overlook the objective of content providers to make certain items more appealing over time through user interactions. Existing proactive recommendation methods typically assume users passively accept all intermediate recommendations or fail to model user intentions during the nudging process, leading to suboptimal performance.
To address these limitations, the paper proposes a novel method called ITMPRec (Intention-based Targeted Multi-round Proactive Recommendation). ITMPRec first employs a pre-match strategy to select target items. It then utilizes a multi-round nudging recommendation approach that includes a module to quantify the evolution of users' intentions, which aids in selecting appropriate intermediate items. Additionally, the model accounts for users' sensitivity to changes introduced by these intermediate items. Finally, ITMPRec introduces an LLM agent as a pluggable component to simulate user feedback, offering a sophisticated alternative to traditional click models by leveraging the agent's external knowledge and reasoning capabilities. Extensive experiments on four public datasets demonstrate that ITMPRec significantly outperforms eight baseline models.
1.6. Original Source Link
/files/papers/6911a6abc9b7d49a981aac07/paper.pdf (This link points to a local file, implying it's a preprint or a provided PDF for internal review.)
2. Executive Summary
2.1. Background & Motivation
The core problem the paper aims to solve is the limitation of traditional personalized recommendation systems, which primarily focus on predicting users' next preferences based on their historical behavior. While convenient, this user-centric approach can lead to several negative outcomes:
- Filter Bubbles and Information Cocoons: By constantly reinforcing existing preferences, users can become confined to a narrow range of content, limiting exposure to diverse items and potentially harming both user experience and the content ecosystem.
- Misalignment with Provider Goals: Content providers often have strategic objectives to promote specific items or categories, increase diversity, or guide users towards new experiences. Traditional systems do not inherently support these "nudging" or "proactive" goals.
This problem is important because it highlights a fundamental tension in recommendation systems: balancing user satisfaction with platform objectives and promoting content diversity. The existing solutions for proactive recommendation have significant gaps:
- Random Target Item Selection: Previous approaches often assign target items randomly, which can lead to guiding users towards cold-start items (items with little to no historical interaction data) or items that are too scattered, making successful guidance difficult and potentially irrelevant to provider goals.
- Neglect of User Intention: The role of a user's underlying intention (a coarse-grained aspect compared to preference) in the multi-round nudging process is largely ignored, although it is crucial for effective and dynamic guidance.
- Passive User Assumption: Many existing methods assume users will passively accept all intermediate recommendations, or they use simplistic, fixed thresholds for simulating user clicks. This unrealistic assumption fails to reflect real-world user behavior and leads to sub-optimal guidance paths.

The paper's innovative idea is to address these gaps by developing a sophisticated multi-round proactive recommendation framework that not only selects meaningful target items (often a category of items) but also dynamically adapts to users' evolving intentions and individual sensitivities, leveraging advanced LLM agents for more realistic user feedback simulation.
2.2. Main Contributions / Findings
The paper makes several primary contributions to the field of proactive recommendation:
- Targeted Multi-Round Proactive Recommendation Paradigm: It proposes ITMPRec, a novel method that aims to guide users towards a class of target items (e.g., a specific category or genre) over multiple rounds, moving beyond the traditional single-item prediction. This approach is more aligned with content provider needs for focused promotion and can encourage content diversity.
- Pre-match Module for Target Item Selection: ITMPRec introduces a pre-match module that collects all users' opinions to generate a sensible set of candidate target items within a specific category. This addresses the limitation of random target assignment, ensuring selected targets are more relevant and avoid cold-start issues.
- Intention-Induced Scores: The paper devises a mechanism to incorporate intention-induced scores into the recommendation process. By modeling users' intention-level evolution, it helps in choosing more suitable intermediate items that align with the user's changing coarse-grained goals during the nudging process, which was previously overlooked.
- Targeted Individual Arousal Coefficients (TIAC): Recognizing that users respond differently to new content, ITMPRec introduces TIAC. This module quantifies each user's unique sensitivity or receptivity to changes caused by intermediate recommendations, enabling more personalized and effective nudging.
- LLM Agent for User Feedback Simulation: A novel aspect is the integration of an LLM agent as a pluggable component to simulate user click feedback on intermediate recommendations. This offers a more sophisticated and realistic alternative to traditional distribution-based click models, leveraging the LLM's external knowledge and reasoning capabilities to mimic complex user decision-making, thus better aligning the simulation with real-world scenarios.
- Empirical Superiority: Through extensive experiments on four real-world datasets, ITMPRec demonstrates significant superiority over eight state-of-the-art baseline models (including both sequential and proactive approaches). It achieves an average increase of 36.47% in IoI@20 and 68.80% in IoR@20 (metrics for proactive recommendation quality), validating its effectiveness.

These findings collectively address the shortcomings of existing methods by providing a more holistic, intelligent, and realistic framework for proactive recommendation, benefiting both users (through expanded interests) and content providers (through targeted promotion).
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
To understand ITMPRec, a reader should be familiar with several core concepts in recommendation systems and machine learning:
- Personalized Recommendation Systems: These systems aim to predict user preferences for items (products, movies, music, etc.) and recommend relevant ones. They are based on user-item interaction data (e.g., clicks, purchases, ratings).
- Sequential Recommendation (SR): A sub-field of recommendation systems that focuses on modeling the chronological order of user interactions. Instead of just predicting static preferences, SR systems try to predict the next item a user will interact with, given their historical sequence of interactions. This often involves capturing sequential patterns and short-term preferences.
  - User/Item Embeddings: In recommendation systems, embeddings are low-dimensional, dense vector representations of users and items. These vectors are learned from interaction data and capture the latent features and relationships between users and items. For example, similar items would have embedding vectors that are close to each other in the embedding space.
  - Dot Product for Similarity: A common way to measure the similarity or interaction tendency between a user embedding and an item embedding is to compute their dot product, i.e., $e_u^T e_i$ for a user embedding $e_u$ and an item embedding $e_i$. A higher dot product typically indicates higher predicted relevance or preference.
- Filter Bubble and Information Cocoon: These are phenomena where users are exposed only to information that confirms their existing beliefs or preferences, due to algorithms that personalize content.
  - A filter bubble is created by personalized content filters that selectively guess what information a user would like to see.
  - An information cocoon is a state where individuals are isolated from information that contradicts their beliefs, often resulting from their own choices and algorithms.
- Proactive Recommendation: A paradigm that goes beyond passively predicting what a user will like. Instead, it actively tries to guide or nudge user preferences towards certain target items or categories, often over multiple rounds of interaction. This can be for purposes like promoting diversity, introducing new content, or achieving specific business goals.
- Large Language Models (LLMs): These are advanced AI models trained on vast amounts of text data, capable of understanding, generating, and reasoning with human language. They possess external knowledge (information learned during pre-training) and reasoning capabilities (ability to infer, deduce, and plan based on instructions).
- Click Models: In recommendation research, click models are used to simulate user interactions (e.g., whether a user clicks on a recommended item). They estimate the probability of a user clicking on an item given certain features or conditions.
  - Bernoulli Distribution: A discrete probability distribution that describes the probability of an event happening (success) or not happening (failure) in a single trial. In click models, it can be used to model whether a user clicks (1) or doesn't click (0) an item.
  - Sigmoid Function ($\sigma$): A mathematical function that maps any real-valued number to a value between 0 and 1. It is often used to convert a raw score into a probability. Its formula is $\sigma(x) = \frac{1}{1 + e^{-x}}$.
- Contrastive Learning: A machine learning paradigm where the model learns by contrasting positive pairs (similar items/representations) with negative pairs (dissimilar items/representations). The goal is to bring positive pairs closer in the embedding space while pushing negative pairs farther apart.
- InfoNCE Loss: A common loss function used in contrastive learning. It encourages the model to distinguish a positive sample from a set of negative samples.
- BPR Loss (Bayesian Personalized Ranking Loss): A widely used ranking loss function in recommendation systems. It optimizes the model to rank positive items higher than negative (uninteracted) items for a given user.
  - The formula for BPR loss is typically given as: $\mathcal{L}_{BPR} = -\sum_{u} \sum_{i \in I_u^+} \sum_{j \in I_u^-} \ln \sigma(\hat{y}_{ui} - \hat{y}_{uj})$, where $I_u^+$ is the set of items user $u$ interacted with, $I_u^-$ is the set of items user $u$ did not interact with, and $\hat{y}_{ui}$ is the predicted score of item $i$ for user $u$.
- Hyperparameters: Parameters whose values are set before the learning process begins (e.g., learning rate, embedding dimension, number of intentions). They control the learning algorithm itself.
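
To make two of the concepts above concrete, the following is a minimal sketch of the sigmoid function and a pairwise BPR loss in plain Python. The function names, the example scores, and the choice to average (rather than sum) over pairs are illustrative assumptions, not details taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bpr_loss(pos_scores, neg_scores):
    """Mean BPR loss over paired (interacted, non-interacted) item scores.

    Averaging instead of summing is an illustrative choice.
    """
    losses = [-math.log(sigmoid(p - n)) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)


# Hypothetical predicted scores for two (positive, negative) item pairs.
well_ranked = bpr_loss([2.0, 1.5], [0.0, -0.5])   # positives scored higher
badly_ranked = bpr_loss([0.0, -0.5], [2.0, 1.5])  # positives scored lower
print(well_ranked < badly_ranked)
```

The loss is small when interacted items are scored above non-interacted ones and grows when the ranking is reversed, which is the behavior the BPR objective rewards.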
3.2. Previous Works
The paper categorizes related work into Sequential Recommendation (SR) and Proactive Recommendation (ProactRec).
3.2.1. Sequential Recommendation (SR) Methods
SR methods aim to predict the next item a user will interact with based on their historical sequence. They typically focus on modeling chronological behaviors and capturing short-term user interests.
- SASRec [18]: A classical sequential recommendation method that uses a self-attention framework. It models item transitions by allowing items in a sequence to "attend" to each other, capturing long-range dependencies effectively.
  - Background: Self-attention is a mechanism that allows a model to weigh the importance of different parts of an input sequence when processing a specific element. It's a core component of Transformers. The key idea is to compute Query, Key, and Value matrices from the input embeddings. The attention score is computed as a scaled dot product of Query and Key, followed by a softmax to get weights, which are then applied to Value to get the weighted sum.
  - Formula for Self-Attention: $ \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V $ Where:
    - $Q$ (Query), $K$ (Key), $V$ (Value) are matrices derived from the input sequence embeddings.
    - $d_k$ is the dimension of the Key vectors, used for scaling to prevent the dot products from becoming too large.
    - $QK^T$ is the dot product similarity between Query and Key vectors.
    - $\mathrm{softmax}$ normalizes the attention scores into probabilities.
- ICLRec [7]: An intention contrastive learning paradigm that models latent user intentions. It fuses these intentions into an SR method using a new contrastive self-supervised learning objective. It uses K-Means to cluster item embeddings and calculate intention centers.
- MStein [12]: A sequential recommendation method based on mutual Wasserstein discrepancy minimization. This technique helps in obtaining more fine-grained sequential patterns by measuring the "distance" between probability distributions.
- ICSRec [29]: A sequential recommendation method that enhances its performance by incorporating subsequences and considering intention prototypes of users, constructing auxiliary objectives for intention learning.
- BSARec [34]: A sequential recommendation method that incorporates an attentive inductive bias, suggesting that it biases the attention mechanism in a specific way to capture particular sequential patterns.

Limitations of SR methods: They are primarily user-centric, focusing on next-item prediction and catering to historical preferences. This inherently leads to filter bubbles and information cocoons, as they reinforce existing interests rather than broadening them.
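
The scaled dot-product self-attention formula given for SASRec above can be sketched in plain Python as follows. This is only an illustration of the formula: it omits the learned projections that produce the Query, Key, and Value matrices and any masking a sequential model would apply, and the helper names and the toy sequence are assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention(q, k, v):
    """Return softmax(Q K^T / sqrt(d_k)) V together with the attention weights."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy sequence of two item embeddings, used directly as Q, K, and V.
seq = [[1.0, 0.0], [0.0, 1.0]]
output, weights = attention(seq, seq, seq)
print(weights)
```

Each row of `weights` sums to 1, and each position attends most strongly to itself here because the two toy embeddings are orthogonal.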
3.2.2. Proactive Recommendation (ProactRec) Methods
This field focuses on actively guiding user preferences.
- IRN (Influential Recommender System) [46]: A Transformer-based proactive recommendation work. It generates a sequence of intermediate items using a personalized impression mask with the goal of guiding users toward a target.
  - Limitation: Assumes users passively accept all intermediate recommendations, which is often unrealistic.
- LLM-IPP (LLMs with Influential Recommender System) [37]: A pure LLM-based proactive recommendation method that leverages LLMs to produce targeted intermediate guidance sequences. It uses LLMs for path planning and instruction following to ensure coherence and acceptability of recommendations.
  - Limitations: Resource-intensive and limited scalability. Similar to IRN, it assumes passive acceptance of intermediate items. The paper notes it doesn't show significant improvement over non-LLM methods in multi-round simulated clicks.
- IPG (Iterative Preference Guidance) [3]: A model-agnostic post-processing method that conducts proactive recommendation. It employs a distribution-based click module to simulate user feedback.
  - Limitations: Uses a one-size-fits-all fixed threshold to measure user impact, which is an oversimplification. Its overall performance needs further improvement.
- Conversational Recommendation Systems [10, 33, 36, 45] and Multi-modal Recommendation Approaches [40]: These are related paradigms that also involve guiding users, often through interactive dialogues or multi-modal feedback, towards stated goals. However, the proactive manner in purely sequential recommendation scenarios was rarely explored before IRN and IPG.
3.3. Technological Evolution
Recommendation systems have evolved from simple collaborative filtering and content-based filtering to sophisticated sequential recommendation models that leverage deep learning architectures like RNNs, LSTMs, and Transformers. The initial focus was purely on user-centric predictions, aiming for high accuracy in predicting the next likely item.
The limitations of this user-centric approach—namely, filter bubbles and information cocoons—led to the emergence of proactive recommendation. This shift reflects a move from purely reactive systems to more goal-oriented or provider-aligned systems. Early proactive methods (like IRN and LLM-IPP) introduced the idea of multi-round guidance paths but often made simplifying assumptions about user behavior. IPG introduced a more realistic click simulation, but still lacked sophistication in modeling user dynamics.
ITMPRec fits into this evolution by addressing the key shortcomings of previous proactive recommendation methods. It improves target item selection (pre-match), deepens the modeling of user dynamics by considering intention evolution and individual sensitivity (TIAC), and introduces a more advanced user feedback simulation using LLMs, pushing the boundary of realistic and effective proactive nudging.
3.4. Differentiation Analysis
Compared to the main methods in related work, ITMPRec introduces several core differences and innovations:
- Target Item Selection (Pre-match Module):
  - Previous: IRN and IPG often rely on randomly assigned target items (or a single, pre-defined target). LLM-IPP also uses targeted guidance but doesn't explicitly detail a sophisticated target selection strategy. This can lead to cold-start issues or irrelevant targets.
  - ITMPRec Innovation: ITMPRec proactively selects a category of target items using a pre-match module that considers the aggregate preferences of all users. This ensures that the chosen targets are meaningful and broadly appealing within a specific domain, making the nudging process more purposeful and successful.
- User Intention Modeling (Intention-induced scores):
  - Previous: Most proactive recommendation methods (e.g., IRN, IPG) do not explicitly model user intention during the round-by-round nudging process. While some SR methods like ICLRec and ICSRec model intention for next-item prediction, this concept was not integrated into proactive nudging.
  - ITMPRec Innovation: ITMPRec explicitly quantifies users' intention-level evolution using intention-induced scores. This allows the system to choose intermediate items that not only align with immediate preferences but also strategically shift the user's underlying coarse-grained intentions towards the target category.
- User Sensitivity to Nudging (Targeted Individual Arousal Coefficients - TIAC):
  - Previous: IPG uses a one-size-fits-all fixed threshold for simulating user clicks, implicitly assuming uniform user responses to intermediate items. IRN and LLM-IPP assume passive acceptance, entirely ignoring user response variability.
  - ITMPRec Innovation: ITMPRec introduces TIAC to model users' sensitivity or receptivity to new content. This acknowledges that each user reacts differently to external stimuli, enabling a more personalized and realistic adaptation of the preference evolution process during nudging.
- User Feedback Simulation (LLM Agent):
  - Previous: IRN and LLM-IPP assume passive acceptance of all intermediate items. IPG uses a distribution-based click model which is a step forward but still relatively simplistic, relying on a predefined probability function.
  - ITMPRec Innovation: ITMPRec offers a sophisticated LLM agent as a pluggable component for simulating user feedback. Leveraging the LLM's external knowledge and reasoning capabilities, this agent can model more complex and realistic user decision-making processes, moving beyond simple probability distributions and providing more accurate feedback for training and evaluating proactive recommendation strategies.

In essence, ITMPRec moves beyond simplistic assumptions about users and target selection by incorporating a deeper understanding of user dynamics (intentions, individual sensitivities) and more realistic feedback mechanisms (LLM agent), specifically tailored for the multi-round, targeted proactive recommendation setting.
4. Methodology
4.1. Principles
The core principle of ITMPRec is to move beyond passive sequential recommendation by proactively nudging user preferences towards a predetermined category of target items over multiple interaction rounds. This is achieved by:
- Strategic Target Selection: Instead of random targets, ITMPRec identifies a set of target items that are relevant to a specific category and somewhat aligned with overall user preferences.
- Dynamic Preference Evolution: It aims to gradually modify user preferences by recommending intermediate items that act as stepping stones. This evolution is not uniform for all users; it considers individual user intentions and their unique sensitivity to new recommendations.
- Realistic User Feedback Simulation: Since real-time user feedback in multi-round nudging is hard to collect offline, ITMPRec relies on a sophisticated environment simulator that can realistically model user clicks, especially through the integration of Large Language Models (LLMs).

The theoretical basis and intuition behind this approach stem from the understanding that:
- Users' preferences are dynamic and can be influenced.
- User behavior is driven by both explicit preferences (fine-grained) and implicit intentions (coarse-grained).
- Individuals react differently to external stimuli (arousal theory), necessitating personalized nudging strategies.
- LLMs possess vast external knowledge and reasoning capabilities that can simulate complex human decision-making more accurately than simple statistical models.
4.2. Environment Simulator
To evaluate proactive recommendation methods in an offline setting, ITMPRec utilizes an environment simulator to generate realistic user feedback over multiple rounds. The simulator captures three main aspects: user and item embeddings, preference evolution, and click modeling.
4.2.1. User and Item Embeddings
Initially, ITMPRec uses a pre-trained graph-based recommendation method (specifically GraphAU [42]) to generate initial user embeddings and item embeddings. These embeddings capture the latent features of users and items in a $d$-dimensional space.
- $e_u^0$: The initial pre-trained embedding vector for user $u$.
- $e_i$: The initial pre-trained embedding vector for item $i$.
4.2.2. Preference Evolution
The user's preference is not static; it evolves after interacting with new items. If a user $u$ positively interacts with an item $z$ in round $r$, their embedding is updated to reflect this new preference. The paper models this preference evolution as a weighted sum of the user's current embedding and the interacted item's embedding.
The user $u$'s embedding after interaction with item $z$ in round $r$ is updated as follows:
$
\hat{e}_u^{r+1} \gets \beta_u^r \cdot \hat{e}_u^r + (1 - \beta_u^r) \cdot \hat{e}_z^r
$
Where:
- $\hat{e}_u^{r+1}$: The updated embedding for user $u$ for the next round ($r+1$).
- $\hat{e}_u^r$: The current embedding for user $u$ in round $r$.
- $\hat{e}_z^r$: The embedding of the item $z$ that user $u$ interacted with in round $r$.
- $\beta_u^r$: A targeted individual arousal coefficient for user $u$ in round $r$. This coefficient, which ranges between 0 and 1, controls the degree of preference evolution. A higher $\beta_u^r$ means the user's preference changes less after interacting with item $z$ (more weight on $\hat{e}_u^r$), while a lower $\beta_u^r$ means the preference changes more (more weight on $\hat{e}_z^r$). This coefficient is specific to each user and round, originating from the TIAC module (explained in Section 4.5).
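
The update rule above is a convex combination of two vectors, which a few lines of Python can illustrate. This is a minimal sketch: the function name, the two-dimensional toy embeddings, and the two values of the coefficient are assumptions chosen only to show how the coefficient controls the size of the shift.

```python
def evolve_preference(user_emb, item_emb, beta):
    """Return beta * user_emb + (1 - beta) * item_emb, element-wise."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return [beta * u + (1.0 - beta) * z for u, z in zip(user_emb, item_emb)]


# Toy embeddings (hypothetical values).
user = [1.0, 0.0]
clicked_item = [0.0, 1.0]

stubborn = evolve_preference(user, clicked_item, beta=0.9)   # small shift
receptive = evolve_preference(user, clicked_item, beta=0.5)  # larger shift
print(stubborn, receptive)
```

With the higher coefficient the user stays close to the original embedding; with the lower one the embedding moves halfway towards the clicked item.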
4.2.3. Click Model (Traditional)
The click model simulates the user's decision to interact with a recommended item. It estimates the interaction probability between a user $u$ and an item $z$. This traditional model is often distribution-based, assuming a probabilistic outcome based on the similarity between user and item embeddings.
The interaction probability (or activation score) between user $u$ and item $z$ in round $r$ is calculated, and then a binary click decision is made:
$
a_u^r = \sigma\left( w \left( (\hat{e}_u^r)^T \cdot \hat{e}_z^r - b \right) \right)
$
Where:
- $a_u^r$: A binary value (1 for a user click, 0 otherwise), representing the outcome of the interaction.
- $\sigma$: The sigmoid function, which squashes the input to a value between 0 and 1, representing a probability.
- $(\hat{e}_u^r)^T \cdot \hat{e}_z^r$: The dot product of the user's current embedding and the item's embedding, indicating their similarity or predicted relevance.
- $w$: Parameters of the click model (e.g., slope).
- $b$: A bias term.

The paper also proposes an LLM-agent click model as an alternative, discussed later in Section 4.6.
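
A distribution-based click model of this kind might be sketched as below. The sketch assumes that the binary decision is drawn from a Bernoulli distribution whose success probability is the sigmoid output, in line with the Bernoulli description in Section 3.1; the function names, the values of `w` and `b`, and the fixed random seed are hypothetical.

```python
import math
import random


def click_probability(user_emb, item_emb, w, b):
    """sigma(w * (e_u . e_z - b)): probability that the user clicks the item."""
    score = sum(u * z for u, z in zip(user_emb, item_emb))
    return 1.0 / (1.0 + math.exp(-w * (score - b)))


def simulate_click(user_emb, item_emb, w, b, rng):
    """Draw a binary click (assumed Bernoulli) from the click probability."""
    return 1 if rng.random() < click_probability(user_emb, item_emb, w, b) else 0


rng = random.Random(0)  # seeded only to make the sketch reproducible
user = [1.0, 0.0]
p_similar = click_probability(user, [1.0, 0.0], w=5.0, b=0.5)
p_dissimilar = click_probability(user, [0.0, 1.0], w=5.0, b=0.5)
click = simulate_click(user, [1.0, 0.0], w=5.0, b=0.5, rng=rng)
print(p_similar, p_dissimilar, click)
```

Here `w` acts as a slope that sharpens the transition and `b` as the similarity level at which the click probability is exactly 0.5.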
4.3. Pre-match Module
The pre-match module addresses the limitation of random target item assignment in previous proactive recommendation methods. Instead, it aims to select a set of target items that are meaningful within a specific category and have a collective appeal to users. This avoids cold-start issues with target items and aligns better with content provider goals.
The process involves two main steps:
- Calculate Overall User Preference for Candidate Items: For a given pool of candidate target items (e.g., all items within a specific category), the module calculates a collective preference score from all users for each candidate item.
  $
  L_{N_{can}} = \sum_{u=1}^U (e_l^T \cdot e_u^0)
  $
  Where:
  - $L_{N_{can}}$: A list or set of scores, one for each candidate item in the candidate target pool.
  - $U$: The total number of users.
  - $e_l^T \cdot e_u^0$: The dot product (similarity) between a candidate item $l$'s initial embedding ($e_l$) and a user $u$'s initial embedding ($e_u^0$). This represents user $u$'s initial preference for item $l$.
  - $l$: Indicates that $l$ is an item from the candidate target pool, which has a size of $N_{can}$.
- Select Top Target Items: From the scored candidate items, the module selects the top $N_{tar}$ items (where $N_{tar} \le N_{can}$) to be the ultimate target items for guidance.
  $
  L_{N_{tar}} = \mathrm{cut}\{\mathrm{sort}(L_{N_{can}}, \searrow), N_{tar}\}
  $
  Where:
  - $L_{N_{tar}}$: The final list of selected target items.
  - $\mathrm{sort}(L, \searrow)$: A function that sorts the list $L$ in descending order.
  - $\mathrm{cut}\{L, num\}$: A function that returns the first $num$ elements from the sorted list $L$.
  - $N_{tar}$: A pre-defined number, representing the desired count of target items (e.g., 20 or 50).

This pre-match setting ensures that the subsequent proactive recommendation process starts with target items that are collectively preferred by a wider user base, making successful nudging more feasible.
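
The two pre-match steps (score every candidate by its summed dot product with all users, then keep the top entries) can be sketched as follows. This is an illustrative reading of the formulas, not the authors' code: the function names, the dictionary-based data layout, and the toy embeddings are assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pre_match(candidate_embs, user_embs, n_tar):
    """Rank candidates by summed user preference and keep the top n_tar.

    candidate_embs: dict mapping a candidate item id to its embedding.
    user_embs: list of initial user embeddings.
    """
    scores = {
        item_id: sum(dot(e_l, e_u) for e_u in user_embs)
        for item_id, e_l in candidate_embs.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)  # descending order
    return ranked[:n_tar], scores


# Toy data (hypothetical): two users, three candidate items in one category.
users = [[1.0, 0.0], [0.5, 0.5]]
candidates = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
targets, scores = pre_match(candidates, users, n_tar=2)
print(targets, scores)
```

In this toy case item "c" has the highest collective score, followed by "a", so those two become the target items while the weakly preferred "b" is dropped.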
4.4. Intention-induced Scores
To generate effective intermediate recommendations, ITMPRec considers not only direct user-item similarity but also the underlying user intentions. This section details how intention-induced scores are calculated and integrated into the recommendation process.
4.4.1. Basic Recommendation Score
The fundamental tendency of interaction between user $u$ and item $i$ in round $r$ is quantified using the inner product of their embeddings:
$
score_{(u,i)}^r = (e_u^r)^T \cdot e_i
$
Where:
- $score_{(u,i)}^r$: The basic relevance score between user $u$ and item $i$ in round $r$.
- $e_u^r$: The current embedding of user $u$ in round $r$.
- $e_i$: The embedding of candidate item $i$.
4.4.2. Post-processing Strategy with Nudging Aggressiveness
Following previous work [3], the system also considers the degree of nudging aggressiveness towards a target item $j$. This is formulated as:
$
l_{uij}^r = score_{(u,i)}^r \cdot nudge_{(u,i,j)}^r
$
Where:
- $l_{uij}^r$: The overall score for recommending intermediate item $i$ to user $u$ to guide towards target item $j$ in round $r$.
- $nudge_{(u,i,j)}^r$: A term representing the nudging aggressiveness, which is associated with how much a user's preference ($e_u$) would shift towards the target item $j$ if they interacted with intermediate item $i$. It's defined as the difference in similarity between the target item and the user's future embedding ($e_u^{(r+1)}$) versus their current embedding ($e_u^{(r)}$).
$
nudge_{(u,i,j)}^r = e_j^T e_u^{(r+1)} - e_j^T e_u^{(r)}
$
In the previous work [3], it was assumed that user representation transition from round $r$ to $r+1$ follows a linear pattern:
$
e_u^{r+1} = \omega e_u^r + (1 - \omega) e_i
$
Where $\omega$ is a coefficient representing the weight of the old user representation.
Substituting this into the $nudge_{(u,i,j)}^r$ term:
$
nudge_{(u,i,j)}^r = e_j^T (\omega e_u^r + (1 - \omega) e_i) - e_j^T e_u^r
$
$
nudge_{(u,i,j)}^r = \omega e_j^T e_u^r + (1 - \omega) e_j^T e_i - e_j^T e_u^r
$
$
nudge_{(u,i,j)}^r = (1 - \omega) e_j^T e_i - (1 - \omega) e_j^T e_u^r
$
$
nudge_{(u,i,j)}^r = (1 - \omega) (e_j^T e_i - e_j^T e_u^r)
$
$
nudge_{(u,i,j)}^r = (1 - \omega) (e_i - e_u^r)^T e_j
$
Then, substituting this back into the overall score $l_{uij}^r$:
$
l_{uij}^r = ( (e_u^r)^T \cdot e_i ) \cdot (1 - \omega) (e_i - e_u^r)^T e_j
$
The term $(1 - \omega)$ is a constant (assuming uniform $\omega$ across users in previous work), and it is positive whenever $\omega < 1$, so it does not influence the ultimate choice of the intermediate item that yields the maximum score and can be omitted. This leads to the simplified formulation:
$
l_{uij}^r = ( (e_u^r)^T \cdot e_i ) \cdot (e_i - e_u^r)^T e_j
$
This formula consists of two parts: the interaction-tendency term $(e_u^r)^T \cdot e_i$ and the targeted-guidance term $(e_i - e_u^r)^T e_j$. The interaction-tendency part favors items similar to the user's current preference, while the targeted-guidance part favors items that move the user's preference closer to the target item $j$.
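
The derivation above can be checked numerically, and the simplified score can be used to pick an intermediate item. The sketch below compares the nudge term computed from its definition with the closed form, then scores three candidates; all function names, embeddings, and the value of the weight are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def nudge_direct(e_u, e_i, e_j, omega):
    """e_j . e_u^(r+1) - e_j . e_u^(r), with the linear transition assumption."""
    e_next = [omega * u + (1.0 - omega) * i for u, i in zip(e_u, e_i)]
    return dot(e_j, e_next) - dot(e_j, e_u)


def nudge_closed_form(e_u, e_i, e_j, omega):
    """(1 - omega) * (e_i - e_u) . e_j, the result of the derivation."""
    return (1.0 - omega) * dot([i - u for i, u in zip(e_i, e_u)], e_j)


def guidance_score(e_u, e_i, e_j):
    """Simplified score: interaction tendency times targeted guidance."""
    return dot(e_u, e_i) * dot([i - u for i, u in zip(e_i, e_u)], e_j)


e_u, e_j = [1.0, 0.0], [0.0, 1.0]  # user and target point in different directions
candidates = {"near": [0.8, 0.6], "same": [1.0, 0.0], "target": [0.0, 1.0]}

gap = abs(nudge_direct(e_u, candidates["near"], e_j, 0.7)
          - nudge_closed_form(e_u, candidates["near"], e_j, 0.7))
best = max(candidates, key=lambda name: guidance_score(e_u, candidates[name], e_j))
print(gap < 1e-12, best)
```

The candidate identical to the user's current embedding offers no guidance, and the target-like candidate has no interaction tendency, so the item in between scores highest. This is the stepping-stone behavior the product of the two terms is meant to encourage.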
4.4.3. Intention-level Score
Recognizing that previous studies overlooked intention-level dynamics, ITMPRec introduces this component. It first identifies the user's current intention and then assesses the intention-level similarity with candidate items.
- User Intention-level Vector: Given a global intention matrix from a pre-trained ICLRec model (whose $N_C$ rows are the intention vectors $c_1, \ldots, c_{N_C}$, with $N_C$ the number of intentions), the user's current intention-level vector $c_u^r$ is identified as the intention prototype closest to their current user embedding $e_u^r$. $ c_u^r = \underset{c_m \in \{c_1, \ldots, c_{N_C}\}}{argmin} (||c_m - e_u^r||_2^2) $ Where:
  - $c_u^r$: The intention-level vector representing user $u$'s dominant intention in round $r$.
  - $c_m$: One of the $N_C$ global intention vectors.
  - $||\cdot||_2^2$: The squared Euclidean distance, used to find the closest intention vector.
- Item Intention-level Vector: Similarly, each item $i$ is projected into the intention space to find its closest intention prototype.
$
c_i = \underset{c_m \in \{c_1, \ldots, c_{N_C}\}}{argmin} (||c_m - e_i||_2^2)
$
Where:
  - $c_i$: The intention-level vector representing item $i$'s dominant intention.
- Intention-level Similarity Score: The intention-level score between user $u$ and candidate item $i$ is then calculated as the dot product of their respective intention vectors. $ c_{score} = (c_u^r)^T \cdot c_i $
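The nearest-prototype assignment can be sketched as follows. This is an illustrative example only: the three prototypes and the embeddings are made-up 2-d values, and the helper name `nearest_intention` is hypothetical rather than taken from ICLRec or ITMPRec.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_intention(embedding, prototypes):
    """Return the intention prototype closest to the given embedding."""
    return min(prototypes, key=lambda c: sq_dist(c, embedding))


# Hypothetical intention prototypes (N_C = 3) and toy embeddings.
prototypes = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
e_u = [0.9, 0.2]   # current user embedding
e_i = [0.6, 0.8]   # candidate item embedding

c_u = nearest_intention(e_u, prototypes)   # user's dominant intention
c_i = nearest_intention(e_i, prototypes)   # item's dominant intention
c_score = dot(c_u, c_i)                    # intention-level similarity
print(c_u, c_i, c_score)
```

Because both vectors are snapped to one of a small number of prototypes, the score takes only a few distinct values, which is what makes it a coarse-grained signal compared with the raw embedding similarity.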
4.4.4. Final Combined Score
To incorporate intention-induced scores, the interaction-tendency term in the overall score (Equation 9) is modified. It now considers both the direct representational similarity between user $u$ and item $i$ and their similarity in the intention space. The targeted-guidance term remains unchanged.
The final formulation for the overall score is: $ l_{uij}^r = \langle (e_u^r)^T \cdot e_i + \lambda \, (c_u^r)^T \cdot c_i \rangle \cdot (e_i - e_u^r)^T e_j $ Where:
- $\lambda$: A hyperparameter that controls the weight given to the intention-induced score (the second term in the angle brackets) relative to the direct interest score (the first term). A higher $\lambda$ means more emphasis on intention alignment.
- The first part represents the enhanced interaction-tendency term incorporating intention.
- The second part remains the targeted-guidance term.

The item $i$ that yields the largest score is selected as the intermediate item to be recommended to the user in round $r$, and this item then serves as input to the click model.
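A minimal sketch of the selection step is given here, assuming toy embeddings, two made-up intention prototypes, and hypothetical helper names. It only illustrates how $\lambda$ can change which candidate maximizes the combined score; it is not the authors' implementation.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def nearest_intention(embedding, prototypes):
    """Prototype with the smallest squared Euclidean distance."""
    return min(
        prototypes,
        key=lambda c: sum((x - y) ** 2 for x, y in zip(c, embedding)),
    )


def combined_score(e_u, e_i, e_j, prototypes, lam):
    """(interest + lam * intention similarity) * targeted guidance."""
    c_u = nearest_intention(e_u, prototypes)
    c_i = nearest_intention(e_i, prototypes)
    tendency = dot(e_u, e_i) + lam * dot(c_u, c_i)
    guidance = dot([i - u for i, u in zip(e_i, e_u)], e_j)
    return tendency * guidance


def select_intermediate(e_u, e_j, candidates, prototypes, lam):
    """Name of the candidate item with the largest combined score."""
    return max(
        candidates,
        key=lambda k: combined_score(e_u, candidates[k], e_j, prototypes, lam),
    )


# Illustrative values only.
e_u = [1.0, 0.0]
e_j = [0.0, 1.0]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
candidates = {"a": [0.8, 0.2], "b": [0.4, 0.6], "c": [0.1, 0.9]}

choice_without_intention = select_intermediate(e_u, e_j, candidates, prototypes, lam=0.0)
choice_with_intention = select_intermediate(e_u, e_j, candidates, prototypes, lam=1.0)
print(choice_without_intention, choice_with_intention)
```

With `lam=0.0` the score reduces to the base formulation. With `lam=1.0` the candidate that shares the user's intention prototype gains enough weight to be selected instead.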
4.5. Targeted Individual Arousal Coefficients (TIAC)
Previous work often assumed a uniform preference evolution coefficient (the $\omega$ in the linear update rule, or $\beta$ in this paper) for all users, implying everyone responds to new content similarly. ITMPRec challenges this by introducing Targeted Individual Arousal Coefficients (TIACs), which personalize the degree of preference evolution based on each user's unique receptivity to new stimuli and their relationship with the target item.
The TIAC ($\beta_u^r$) for user $u$ in round $r$ is calculated as follows:
- Historical Preference Variance (Short-term Curiosity): First, ITMPRec identifies a set of items that represent the user's short-term preferences or "curiosity", based on their current embedding and the items they have not yet interacted with. $ \mathbf{hp}^{r-1}(u) = Top\mathcal{Q}\{\phi(e_u^{r-1}, e_{idx})\}, \quad e_{idx} \in E \backslash \{S_u^{r-1}\} $ Where:
  - $\mathbf{hp}^{r-1}(u)$: A set of item embeddings representing user $u$'s short-term preferences, or items they are currently "curious" about, derived from their state in round $r-1$.
  - $\phi(\cdot, \cdot)$: The cosine similarity between two vectors. Here it measures how similar $e_u^{r-1}$ is to each candidate item embedding $e_{idx}$.
  - $E \backslash \{S_u^{r-1}\}$: The set of all items excluding those already in user $u$'s historical sequence (i.e., items the user has already interacted with).
  - $Top\mathcal{Q}\{\cdot\}$: A function that returns the top $\mathcal{Q}$ items from the specified set that have the highest cosine similarity with the user's embedding $e_u^{r-1}$.
  - $\mathcal{Q}$: A hyperparameter representing the capacity of short-term preferences or curiosity.
- Arousal Value Calculation: Based on this set of short-term preferences, an arousal value is computed, reflecting how much the user is "aroused" by the target item given their current curiosities. This arousal value then becomes the TIAC. $ \beta_u^r = POOL(\phi(\mathbf{hp}^{r-1}(u), e_j)) $ Where:
  - $\beta_u^r$: The Targeted Individual Arousal Coefficient for user $u$ in round $r$.
  - $POOL(\cdot)$: A pooling operation (e.g., average pooling) applied to the cosine similarities. Average pooling takes the mean similarity between the user's top $\mathcal{Q}$ curiosities and the target item. Other pooling methods (max, sum) could also be used.
  - $\phi(\mathbf{hp}^{r-1}(u), e_j)$: The cosine similarity between each item in the user's short-term preference set and the target item $j$.

The calculated $\beta_u^r$ is then fed into the preference evolution module (Section 4.2.2), allowing for personalized preference updates. Users who are more "aroused" (i.e., whose current curiosities align more with the target item, leading to a higher $\beta_u^r$) might have their preferences shifted more effectively towards the target, or their preferences might be more stable. The actual impact depends on how $\beta_u^r$ is used in the preference update equation: a higher $\beta_u^r$ means the user's current preference retains more weight, implying less change from the current state. Conversely, a lower $\beta_u^r$ implies a stronger shift towards the interacted item $i$.
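The two TIAC steps can be sketched as follows. This is a minimal illustration assuming average pooling, toy 2-d embeddings, and hypothetical names (`tiac`, `item_embs`, `history`); it also assumes the user has at least one unseen item.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def tiac(e_u, e_j, item_embs, history, q):
    """Average cosine similarity between the target item and the user's
    top-q unseen items (ranked by similarity to the user embedding)."""
    unseen = {k: v for k, v in item_embs.items() if k not in history}
    top_q = sorted(unseen, key=lambda k: cosine(e_u, unseen[k]), reverse=True)[:q]
    sims = [cosine(unseen[k], e_j) for k in top_q]
    return sum(sims) / len(sims)   # average pooling


# Illustrative values only.
item_embs = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [-1.0, 0.0]}
history = {"x"}        # items the user has already interacted with
e_u = [1.0, 0.0]       # user embedding from the previous round
e_j = [0.0, 1.0]       # target item embedding

beta_q1 = tiac(e_u, e_j, item_embs, history, q=1)
beta_q2 = tiac(e_u, e_j, item_embs, history, q=2)
print(beta_q1, beta_q2)
```

Increasing `q` pulls in less relevant unseen items, which here dilutes the arousal value. This is one way to see why the capacity hyperparameter interacts with dataset density.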
4.6. LLM-based Click Simulation Agent
ITMPRec offers an alternative to the traditional distribution-based click model: an LLM-based click simulation agent. This agent leverages the external knowledge and reasoning capabilities of Large Language Models (LLMs) to simulate user feedback more realistically.
The LLM agent (e.g., ChatGLM3) takes the user's current historical sequence and the recommended intermediate item as input and outputs a binary click decision.
The action of user $u$ (click or no click) is obtained as: $ a_u^r = LLM(\mathcal{P}_F, \mathcal{H}_u^r, NAMES(i_u^r)) $ Where:
- $a_u^r$: The binary action (0 for no click, 1 for click) generated by the LLM for user $u$ in round $r$.
- $LLM(\cdot)$: Represents the Large Language Model function.
- $\mathcal{P}_F$: The task instruction or prompt provided to the LLM. This includes few-shot examples (demonstrations of desired behavior) and a prompt template that guides the LLM to produce a binary output (0 or 1).
- $\mathcal{H}_u^r$: The user's historical interaction sequence up to round $r$.
- $NAMES(i_u^r)$: The name or description of the intermediate item $i_u^r$ recommended to user $u$ in round $r$. The LLM can use this textual information, along with its external knowledge, to make a more informed decision.

Based on the LLM's output, the user's historical sequence for the next round is updated: $ \mathcal{H}_u^{r+1} = \left\{ \begin{array}{ll} CONCAT(\mathcal{H}_u^r, NAMES(i_u^r)), & \mathrm{if} \; a_u^r = 1 \\ \mathcal{H}_u^r, & \mathrm{if} \; a_u^r = 0 \end{array} \right. $ Where:
- $\mathcal{H}_u^{r+1}$: The updated historical sequence for user $u$ for the next round.
- $CONCAT(\cdot, \cdot)$: A function that concatenates the new item's name to the existing historical sequence if the user clicked it.
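The click decision and history update can be sketched with a stand-in for the model. In the example here, `toy_llm` is a hypothetical rule-based stub, not a real LLM client, and the prompt string is a placeholder; a real agent would call an actual model at that point.

```python
from typing import Callable, List


def simulate_click(llm: Callable[[str, List[str], str], str],
                   prompt: str, history: List[str], item_name: str) -> int:
    """Ask the (stubbed) LLM for a binary decision and parse it to 0 or 1."""
    reply = llm(prompt, history, item_name)
    return 1 if reply.strip() == "1" else 0


def update_history(history: List[str], item_name: str, action: int) -> List[str]:
    """Append the item name only if the simulated user clicked it."""
    return history + [item_name] if action == 1 else history


def toy_llm(prompt: str, history: List[str], item_name: str) -> str:
    """Hypothetical stand-in: 'clicks' only on items tagged Sci-Fi."""
    return "1" if "Sci-Fi" in item_name else "0"


prompt = "Answer 1 if the user would click the item, otherwise 0."  # placeholder
history = ["Toy Story (Animation)", "Casablanca (Drama)"]

a1 = simulate_click(toy_llm, prompt, history, "Aliens (Sci-Fi)")
history_after_click = update_history(history, "Aliens (Sci-Fi)", a1)

a2 = simulate_click(toy_llm, prompt, history_after_click, "Forget Paris (Comedy)")
history_after_skip = update_history(history_after_click, "Forget Paris (Comedy)", a2)
print(a1, a2, history_after_skip)
```

Passing the model as a callable keeps the click simulator pluggable, so a distribution-based model could be swapped in behind the same interface.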
4.6.1. Discussion: Traditional vs. LLM Agent Click Model
- Traditional Click Model: Assumes that a higher similarity/score directly translates to a higher probability of user acceptance. This is a simplification and may not capture the nuances of human behavior.
- LLM Agent Click Model: Offers richer interpretability and external knowledge utilization. LLMs can consider factors beyond simple similarity, such as item attributes, the user's inferred mood, and contextual information, and can reason over them to decide whether a user would click an item. This makes the simulation more realistic for complex decision-making.
4.7. Overall Algorithm
The complete process of ITMPRec is summarized in Algorithm 1, outlining the multi-round nudging for each user and target item.
Algorithm 1: ITMPRec
Input:
- $\mathcal{U}$: set of users.
- The set of items.
- The historical sequence of each user $u$.
- $R$: total number of nudging rounds.
- The batch size for processing users.
Output:
- The nudging path (sequence of clicked intermediate items) for each user $u$ and target item $j$.
Procedure:
1. Initialize Target Items: Determine the set of target items to be nudged by applying the pre-match module (Equation 7). This step is performed once before the multi-round process begins.
2. Iterate Through Target Items: For each target item $j$ in the selected set:
   a. Initialize Nudging Path: Initialize an empty nudging path for each user $u$ and the current target item $j$. This path will store the intermediate items clicked by user $u$ while being nudged towards $j$.
   b. Multi-Round Nudging: For each round $r$ from 0 to $R-1$:
      i. Update User Representation: Get the current user representation $e_u^r$ from the user's historical sequence using the sequence encoder (Equation 5). If $r = 0$, the initial historical sequence is used.
      ii. Initialize Intermediate Lists: Create two empty lists, one to store all recommended intermediate items in this round and one to store the specific recommended item for each user.
      iii. Batch Processing for Users: Iterate through users in batches:
         1. Calculate Intention-level Score: For each user $u$ in the current batch, calculate their intention-level score (Equation 12) with candidate intermediate items using their current intention vector $c_u^r$.
         2. Select Intermediate Item: Determine the optimal intermediate item $i$ for user $u$ for this round by maximizing the final combined score (Equation 13). This item is intended to nudge user $u$ towards target $j$.
         3. Calculate TIAC: Compute the targeted individual arousal coefficient $\beta_u^r$ for user $u$ (Equation 15).
         4. Store Recommended Item: Add the selected item to the list.
      iv. Simulate Clicks: Simulate user clicks on the recommended intermediate items.
         1. Run the click model: This step uses either the distribution-based click model (Section 4.2.3) or the LLM-based click simulation agent (Section 4.6) to determine which of the recommended items are "clicked" by users.
         2. Consolidate the click results.
      v. Update User State and Path: For each user $u$ and their corresponding click result $a_u^r$:
         1. If Clicked: If user $u$ clicked the recommended item:
            - Update the user's historical sequence by concatenating the clicked item and incorporating the arousal coefficient $\beta_u^r$ (this implicitly refers to the preference evolution formula from Section 4.2.2, where $\beta_u^r$ is used).
            - Extend the nudging path with the clicked item.
3. Return Nudging Paths: After all rounds and all target items are processed, return the complete nudging paths for each user and target.
Key points from the algorithm:
- The pre-match module is a one-time setup (Line 1).
- The core of the algorithm (Lines 2-21) iterates through each target item and then over multiple rounds.
- For each round, it calculates personalized scores (Lines 8-9) and arousal coefficients (Line 10).
- The click simulation (Line 12) is crucial for determining actual user interaction and the subsequent preference evolution (Line 17).
- The nudging path records the intermediate items that the user actually clicked while being nudged towards a target.
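The control flow of the multi-round loop for a single user and a single target can be sketched as follows. This is a heavily simplified illustration under stated assumptions, not Algorithm 1 itself: the sequence encoder is replaced by a linear embedding update, the arousal coefficient is a fixed `beta`, the intention term is omitted, and `threshold_click` is a hypothetical deterministic stand-in for the click model.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def base_score(e_u, e_i, e_j):
    """Interaction tendency times targeted guidance (no intention term)."""
    return dot(e_u, e_i) * dot([i - u for i, u in zip(e_i, e_u)], e_j)


def threshold_click(e_u, e_i, threshold=0.3):
    """Hypothetical click model: click when user-item similarity is high enough."""
    return dot(e_u, e_i) > threshold


def run_nudging(e_u, e_j, candidates, rounds, beta, click_fn):
    """Return the clicked path and the final user embedding."""
    path = []
    for _ in range(rounds):
        # Select the intermediate item with the largest score this round.
        name = max(candidates, key=lambda k: base_score(e_u, candidates[k], e_j))
        e_i = candidates[name]
        # Only clicked items change the user state and extend the path.
        if click_fn(e_u, e_i):
            e_u = [beta * u + (1 - beta) * i for u, i in zip(e_u, e_i)]
            path.append(name)
    return path, e_u


# Illustrative values only.
e_u0 = [1.0, 0.0]
e_j = [0.0, 1.0]
candidates = {"a": [0.8, 0.2], "b": [0.4, 0.6], "c": [0.1, 0.9]}

path, e_u_final = run_nudging(e_u0, e_j, candidates, rounds=3, beta=0.5,
                              click_fn=threshold_click)
interest_gain = dot(e_j, e_u_final) - dot(e_j, e_u0)
print(path, e_u_final, interest_gain)
```

In this toy run the loop first recommends a moderate item and only later the item closest to the target, because the interaction-tendency factor penalizes items far from the user's current state. The sketch does not exclude already clicked items from later rounds, which a fuller implementation would likely do.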
5. Experimental Setup
5.1. Datasets
The experiments were conducted on four publicly available datasets, chosen to represent different domains and scales of recommendation scenarios:
- ML-100k (MovieLens 100k): A movie rating dataset.
- Domain: Movies
- Characteristics: Relatively dense, smaller scale.
- Scale: 943 users, 1,348 items, 98,704 interactions.
- Density: 7.7649%
- Average Items per User: 104.67
- Lastfm: A music listening dataset.
- Domain: Music artists/tracks
- Characteristics: Moderately dense.
- Scale: 945 users, 2,782 items, 246,368 interactions.
- Density: 9.3712%
- Average Items per User: 36.78
- Steam: A video game platform dataset.
- Domain: Video games
- Characteristics: Sparser, larger number of users.
- Scale: 12,611 users, 2,017 items, 220,100 interactions.
- Density: 0.9686%
- Average Items per User: 19.54
- Douban_movie: A movie dataset from Douban (a Chinese social networking service).
- Domain: Movies
- Characteristics: Sparser, very large number of items and interactions.
- Scale: 2,623 users, 20,527 items, 1,161,110 interactions.
- Density: 2.1565%
- Average Items per User: 442.66
The following are the data statistics from Table 2 of the original paper:
| Dataset | ML-100k | Lastfm | Steam | Douban_movie |
| --- | --- | --- | --- | --- |
| #Users | 943 | 945 | 12,611 | 2,623 |
| #Items | 1,348 | 2,782 | 2,017 | 20,527 |
| #Interactions | 98,704 | 246,368 | 220,100 | 1,161,110 |
| Density | 7.7649% | 9.3712% | 0.9686% | 2.1565% |
| #Avg. Items per User | 104.67 | 36.78 | 19.54 | 442.66 |
These datasets were chosen because they are widely used benchmarks in recommendation systems research, allowing for fair comparison with existing methods. They cover a range of density and scale, which helps validate the method's robustness across different data characteristics. For instance, ML-100k and Lastfm are denser, while Steam and Douban_movie are sparser, posing different challenges.
5.2. Evaluation Metrics
The performance of ITMPRec and baseline models is evaluated using three metrics, specifically chosen to assess the effectiveness of proactive recommendation tasks over multiple rounds. The evaluation is conducted at different stages ($P$ rounds) of the recommendation process.
- HitRatio (HR@P):
  - Conceptual Definition: HitRatio measures the fraction of recommended intermediate items that users positively interact with, averaged over all users and $P$ proactive recommendation cycles. It quantifies the system's ability to engage users during the nudging process.
  - Mathematical Formula: $ HR@P = \frac{1}{P |\mathcal{U}|} \sum_{p=1}^P \sum_{u \in \mathcal{U}} a_{up} $
  - Symbol Explanation:
    - HR@P: Hit Ratio at $P$ rounds.
    - $P$: The number of proactive recommendation cycles (rounds) considered for evaluation.
    - $|\mathcal{U}|$: The total number of users in the dataset.
    - $a_{up}$: A binary value (0 or 1) representing the feedback from the click simulator for user $u$ in round $p$. $a_{up} = 1$ if the user clicked the recommended item in round $p$, and $a_{up} = 0$ otherwise.
    - $\sum_{p=1}^P \sum_{u \in \mathcal{U}} a_{up}$: The total number of positive interactions (clicks) across all users and all rounds.
- Increase of Interest (IoI@P):
  - Conceptual Definition: Increase of Interest quantifies how much a user's interest in the target item has increased after $P$ rounds of proactive recommendations. It directly measures the effectiveness of the nudging process in shifting user preferences towards the desired target. A higher value indicates better guidance.
  - Mathematical Formula: $ IoI@P = \frac{1}{|\mathcal{U}|} \sum_{u \in \mathcal{U}} (\hat{e}_j^T \cdot \hat{e}_u^P - \hat{e}_j^T \cdot \hat{e}_u^0) $
  - Symbol Explanation:
    - IoI@P: Increase of Interest at $P$ rounds.
    - $|\mathcal{U}|$: The total number of users.
    - $\hat{e}_j$: The embedding of the target item $j$. This embedding remains constant throughout the process.
    - $\hat{e}_u^P$: The embedding of user $u$ after $P$ rounds of proactive recommendations (i.e., after their preference has potentially evolved).
    - $\hat{e}_u^0$: The initial embedding of user $u$ at the start of the guidance phase (before any nudging).
    - $\hat{e}_j^T \cdot \hat{e}_u^P$: The dot product (similarity) between the target item embedding and the user's embedding after $P$ rounds.
    - $\hat{e}_j^T \cdot \hat{e}_u^0$: The dot product (similarity) between the target item embedding and the user's initial embedding.
    - The difference measures the change in user $u$'s interest in the target item. A positive value means increased interest.
- Increase of Ranking (IoR@P):
  - Conceptual Definition: Increase of Ranking measures the improvement in the ranking position of the target item among all items, with respect to a user's preference, after $P$ rounds. It indicates how much closer the target item has become to being a top recommendation for the user. A higher value means the target item is ranked much higher after nudging.
  - Mathematical Formula: $ IoR@P = \frac{1}{|\mathcal{U}|} \sum_{u \in \mathcal{U}} \left( \mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^0\} - \mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^P\} \right) $
  - Symbol Explanation:
    - IoR@P: Increase of Ranking at $P$ rounds.
    - $|\mathcal{U}|$: The total number of users.
    - $\mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u\}$: A function that returns the discrete ranking of the target item $j$ among all items, based on their similarity scores (e.g., dot products) with the user's embedding $\hat{e}_u$. A lower rank value means a higher position (e.g., rank 1 is the most preferred).
    - $\mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^0\}$: The initial ranking of the target item for user $u$.
    - $\mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^P\}$: The ranking of the target item for user $u$ after $P$ rounds of nudging.
    - A positive value for IoR@P indicates that the target item's ranking has improved (i.e., its rank number has decreased, meaning it moved higher up the list).
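The three metrics can be sketched in plain Python as follows. This is an illustrative implementation under assumptions: click feedback is given as a per-user list of 0/1 values, similarity is the dot product, and the rank of the target is one plus the number of items scoring strictly higher. The function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hit_ratio(clicks):
    """clicks[u][p] is 1 if user u clicked in round p, else 0."""
    num_rounds = len(clicks[0])
    return sum(sum(row) for row in clicks) / (num_rounds * len(clicks))


def increase_of_interest(e_j, users_start, users_end):
    """Mean change in target similarity between round 0 and round P."""
    gains = [dot(e_j, end) - dot(e_j, start)
             for start, end in zip(users_start, users_end)]
    return sum(gains) / len(gains)


def rank_of_target(e_u, target_idx, item_embs):
    """1-based rank of the target among all items (1 = most preferred)."""
    target_score = dot(e_u, item_embs[target_idx])
    return 1 + sum(1 for e in item_embs if dot(e_u, e) > target_score)


def increase_of_ranking(target_idx, item_embs, users_start, users_end):
    """Mean of (initial rank - final rank); positive means the target moved up."""
    gains = [rank_of_target(s, target_idx, item_embs)
             - rank_of_target(e, target_idx, item_embs)
             for s, e in zip(users_start, users_end)]
    return sum(gains) / len(gains)


# Illustrative values only: two users, two rounds, three items.
clicks = [[1, 0], [1, 1]]
item_embs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
target_idx = 1
users_start = [[1.0, 0.0], [1.0, 0.0]]
users_end = [[0.2, 0.8], [0.2, 0.8]]

hr = hit_ratio(clicks)
ioi = increase_of_interest(item_embs[target_idx], users_start, users_end)
ior = increase_of_ranking(target_idx, item_embs, users_start, users_end)
print(hr, ioi, ior)
```

Note that HR only counts clicks, while IoI and IoR depend solely on the start and end embeddings, so a method can score well on one and poorly on the others.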
5.3. Baselines
The proposed ITMPRec method is compared against eight state-of-the-art baseline models, categorized into Sequential Recommendation (SR) methods and Proactive Recommendation (ProactRec) methods. For fairness, all methods use a distribution-based click simulator unless explicitly stated (e.g., for LLM-IPP under its own assumptions or for ITMPRec when combined with its LLM-agent click model).
5.3.1. Sequential Recommendation (SR) Methods
These methods are designed for next-item prediction and typically optimize for user-centric preferences.
- SASRec [18]: A foundational self-attentive sequential recommendation model. It represents sequences as item embeddings and uses a Transformer encoder to capture sequential patterns.
- ICLRec [7]: An intention contrastive learning model for sequential recommendation. It explicitly models user intentions using a contrastive learning objective, improving user representation.
- MStein [12]: A sequential recommendation method that minimizes mutual Wasserstein discrepancy to capture fine-grained sequential patterns.
- ICSRec [29]: A sequential recommendation method that uses intent contrastive learning with cross subsequences to learn better representations.
- BSARec [34]: An attentive inductive bias based sequential recommendation method, aiming to improve attention mechanisms for sequential modeling.
5.3.2. Proactive Recommendation (ProactRec) Methods
These methods explicitly aim to guide user preferences towards a target.
- IRN (Influential Recommender System) [46]: A Transformer-based proactive recommendation method. It generates an entire sequence of middle items in one go, with the assumption that users will passively accept all of them.
- IPG (Iterative Preference Guidance) [3]: A model-agnostic post-processing framework for proactive recommendation. It uses an iterative approach to guide preferences and includes a distribution-based click module to simulate user responses.
- LLM-IPP (LLMs with Influential Recommender System) [37]: A proactive recommendation method that uses Large Language Models (specifically GLM-4-Flash in the experiments) for path planning and instruction following to generate guidance sequences. The paper notes its high resource consumption and its original assumption of passive user acceptance. In the context of ITMPRec's evaluation, LLM-IPP was tested under the user click simulation settings for fair comparison.

These baselines were chosen to represent the current state-of-the-art in both sequential and proactive recommendation, covering various architectural approaches (attention, contrastive learning) and different levels of sophistication in handling proactive guidance.
6. Results & Analysis
6.1. Core Results Analysis
The experimental results demonstrate the superiority of ITMPRec in proactive recommendation tasks across four datasets. The evaluation primarily focuses on HitRatio (HR@P), Increase of Interest (IoI@P), and Increase of Ranking (IoR@P) at different rounds ($P \in \{5, 10, 15, 20\}$).
The following are the results from Table 4 of the original paper:
| Datasets | Methods | HR@5 | HR@10 | HR@15 | HR@20 | IoI@5 | IoI@10 | IoI@15 | IoI@20 | IoR@5 | IoR@10 | IoR@15 | IoR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ML-100k | SASRec | 0.3994 | 0.3991 | 0.3980 | 0.3979 | 0.0455 | 0.0866 | 0.1121 | 0.1259 | -0.4036 | -0.9826 | -1.2254 | -1.1867 |
| | ICLRec | 0.4124 | 0.4117 | 0.4102 | 0.4083 | 0.0394 | 0.0744 | 0.0952 | 0.1052 | 0.2398 | 0.2578 | 0.2476 | 0.0111 |
| | MStein | 0.3134 | 0.3125 | 0.3118 | 0.3114 | 0.0074 | 0.0141 | 0.0204 | 0.0264 | -0.1127 | -0.1355 | -0.022 | 0.12 |
| | BSARec | 0.3705 | 0.3702 | 0.3692 | 0.3689 | 0.0416 | 0.0814 | 0.1131 | 0.1365 | -0.3646 | -0.7027 | -1.1309 | -1.5034 |
| | ICSRec | 0.3642 | 0.3636 | 0.3628 | 0.3621 | 0.0412 | 0.0866 | 0.1231 | 0.1503 | -0.0593 | 0.0346 | 0.2145 | 0.2695 |
| | IRN | 0.4274 | 0.4270 | 0.4250 | 0.4237 | 0.0299 | 0.0578 | 0.0867 | 0.0912 | 0.0518 | 0.2712 | 1.3507 | 1.7407 |
| | IPG | 0.3866 | 0.3891 | 0.3895 | 0.3861 | 0.1520 | 0.2620 | 0.3409 | 0.3898 | 33.2767 | 68.7030 | 96.4608 | 111.8751 |
| | LLM-IPP | 0.3695 | 0.3680 | 0.3658 | 0.3659 | 0.0450 | 0.0865 | 0.1184 | 0.1412 | 0.6572 | 1.1868 | 1.3978 | 1.3998 |
| | ITMPRec w/o P | 0.4029 | 0.4027 | 0.4066 | 0.4067 | 0.2353 | 0.3951 | 0.4496 | 0.4622 | 63.9955 | 113.2598 | 128.8455 | 131.4221 |
| | ITMPRec | 0.4064 | 0.4024 | 0.4040 | 0.4016 | 0.2433 | 0.3998 | 0.4556 | 0.4690 | 70.0011 | 120.6690 | 136.8670 | 139.6954 |
| Lastfm | SASRec | 0.3263 | 0.3254 | 0.3248 | 0.3243 | 0.0094 | 0.0174 | 0.0250 | 0.0311 | -0.1749 | -0.6057 | -0.7632 | -1.1204 |
| | ICLRec | 0.4137 | 0.4129 | 0.4111 | 0.4106 | 0.0083 | 0.0102 | 0.0066 | 0.0001 | 0.1126 | 0.4359 | 0.8594 | 0.9521 |
| | MStein | 0.3289 | 0.3281 | 0.3275 | 0.3270 | -0.0024 | 0.0023 | 0.0139 | 0.0240 | -0.6893 | -0.8823 | -0.9250 | -0.9139 |
| | BSARec | 0.3334 | 0.3327 | 0.3320 | 0.3315 | 0.0193 | 0.0297 | 0.0400 | 0.0493 | 0.5216 | 0.7023 | 0.6054 | 0.5891 |
| | ICSRec | 0.3359 | 0.3351 | 0.3345 | 0.3369 | 0.0115 | 0.0251 | 0.0362 | 0.0458 | -0.1695 | -0.0528 | 0.0099 | 0.0688 |
| | IRN | 0.4028 | 0.4018 | 0.4008 | 0.4002 | 0.0101 | 0.0203 | 0.0400 | 0.0525 | 0.0185 | 0.0916 | 1.8248 | 3.2734 |
| | IPG | 0.3516 | 0.3528 | 0.3490 | 0.3520 | 0.1791 | 0.2976 | 0.3879 | 0.4544 | 25.2901 | 52.9695 | 80.5057 | 100.1863 |
| | ITMPRec w/o P | 0.4163 | 0.4110 | 0.4122 | 0.4113 | 0.2925 | 0.5218 | 0.5958 | 0.6161 | 60.5283 | 120.0176 | 137.4319 | 141.8555 |
| | ITMPRec | 0.4129 | 0.4153 | 0.4115 | 0.4135 | 0.3943 | 0.5938 | 0.6486 | 0.6614 | 96.9189 | 146.0627 | 159.1564 | 161.7352 |
| Steam | SASRec | 0.4271 | 0.4263 | 0.4257 | 0.4251 | 0.0486 | 0.0991 | 0.1320 | 0.1521 | -0.2202 | 0.3557 | 1.1601 | 1.6881 |
| | ICLRec | 0.3886 | 0.3878 | 0.3872 | 0.3867 | 0.0583 | 0.1140 | 0.1571 | 0.1898 | 0.8334 | 2.0505 | 3.3948 | 4.6866 |
| | MStein | 0.3929 | 0.3921 | 0.3914 | 0.3909 | 0.0584 | 0.1166 | 0.1620 | 0.1942 | 1.1366 | 2.4076 | 2.7133 | 2.5779 |
| | BSARec | 0.4096 | 0.4089 | 0.4083 | 0.4078 | 0.0608 | 0.1290 | 0.1760 | 0.2050 | 0.1626 | 2.0218 | 4.0323 | 5.6237 |
| | ICSRec | 0.4005 | 0.3998 | 0.3991 | 0.3986 | 0.0597 | 0.1223 | 0.1656 | 0.1927 | 0.0492 | 0.7546 | 1.6664 | 2.0173 |
| | IRN | 0.4205 | 0.4195 | 0.4188 | 0.4183 | 0.0418 | 0.0839 | 0.1628 | 0.2016 | 0.3826 | 0.5860 | 2.6768 | 6.6263 |
| | IPG | 0.3921 | 0.3907 | 0.3898 | 0.3895 | 0.1036 | 0.1777 | 0.2245 | 0.2554 | 17.6234 | 27.6880 | 33.4087 | 37.4944 |
| | ITMPRec w/o P | 0.3911 | 0.3899 | 0.3915 | 0.3907 | 0.1876 | 0.2654 | 0.2984 | 0.3108 | 46.7208 | 55.9344 | 58.9080 | 59.8572 |
| | ITMPRec | 0.3918 | 0.3937 | 0.3930 | 0.3923 | 0.2192 | 0.2955 | 0.3239 | 0.3336 | 55.3553 | 66.6745 | 70.6409 | 71.6806 |
| Douban_movie | SASRec | 0.3673 | 0.3669 | 0.3662 | 0.3655 | -0.0021 | -0.0042 | -0.0046 | -0.0040 | 0.0888 | 0.2017 | 0.3321 | 0.5044 |
| | ICLRec | 0.3277 | 0.3268 | 0.3261 | 0.3256 | 0.0002 | -0.0017 | -0.0009 | 0.0019 | 0.0062 | 0.0043 | 0.0475 | 0.1750 |
| | MStein | 0.3174 | 0.3166 | 0.3159 | 0.3154 | 0.0030 | 0.0076 | 0.0128 | 0.0176 | 0.0180 | 0.0636 | 0.1195 | 0.2197 |
| | BSARec | 0.4217 | 0.4215 | 0.4208 | 0.4200 | -0.0046 | -0.0095 | -0.0130 | -0.0150 | 0.0028 | -0.0768 | -0.1460 | -0.2929 |
| | ICSRec | 0.3304 | 0.3296 | 0.3289 | 0.3284 | 0.0019 | 0.0016 | 0.0037 | 0.0066 | 0.0858 | 0.1511 | 0.2715 | 0.4051 |
| | IRN | 0.3758 | 0.3753 | 0.3744 | 0.3739 | 0.0037 | 0.0069 | 0.0052 | 0.0010 | 0.1676 | 0.2913 | 0.4284 | 0.6543 |
| | IPG | 0.3310 | 0.3323 | 0.3310 | 0.3303 | 0.0849 | 0.1418 | 0.1885 | 0.2259 | 13.3451 | 21.2825 | 30.0722 | 39.0427 |
| | ITMPRec w/o P | 0.3439 | 0.3483 | 0.3422 | 0.3389 | 0.1465 | 0.2222 | 0.2798 | 0.3201 | 33.6715 | 48.5319 | 62.1714 | 73.9921 |
| | ITMPRec | 0.3366 | 0.3363 | 0.3361 | 0.3362 | 0.1619 | 0.2408 | 0.2960 | 0.3374 | 36.0797 | 50.5707 | 65.3341 | 77.2108 |
6.1.1. Comparison with Traditional SR Methods
- Proactive vs. SR: Proactive recommendation methods (IRN, IPG, ITMPRec) generally outperform traditional SR methods (SASRec, ICLRec, MStein, BSARec, ICSRec) in IoI and IoR metrics. This is a critical finding, as IoI and IoR directly measure the success of proactive guidance towards a target. Traditional SR methods often show low or even negative IoI and IoR values (e.g., SASRec on ML-100k and Lastfm, BSARec on Douban_movie), indicating that without explicit guidance, user preferences either do not shift towards the target or even diverge.
- HR@P: While proactive methods might show a slightly lower HR@P compared to some SR methods (e.g., ICLRec on ML-100k, SASRec on Steam), this is an acceptable trade-off. HR@P measures engagement with any intermediate item, whereas IoI and IoR measure movement specifically towards the target. The goal of proactive recommendation is not just clicks, but guided clicks. The decrease in HR@P is often small, suggesting that the system can still generate engaging intermediate items while steering preferences.
6.1.2. Comparison with Other Proactive Recommendation Methods
- IRN's Limitation: IRN shows limited IoI and IoR improvements. This confirms the paper's hypothesis that IRN's assumption of passive user acceptance (generating a full path upfront without accounting for real-time feedback) leads to suboptimal performance when a simulated click feedback mechanism is introduced, as users might not accept the entire pre-planned sequence.
- LLM-IPP's Underperformance: LLM-IPP, despite being an LLM-based method, underperforms significantly in IoI@P and IoR@P under the user click simulation settings. The paper notes its high time consumption (over 50 hours for ML-100k compared to others within 1 hour), which limits its practical applicability and suggests that simply using an LLM without the specific nudging mechanisms of ITMPRec is not sufficient.
- ITMPRec vs. IPG: ITMPRec demonstrates a substantial improvement over IPG, which is identified as the second-best proactive recommendation method. ITMPRec achieves average enhancements of 36.47% in IoI@20 and 68.80% in IoR@20 across the four datasets. This highlights the effectiveness of ITMPRec's novel components (pre-match, intention-induced scores, TIAC) in guiding users towards target categories and single items.
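The reported average gains over IPG can be recomputed from the @20 columns of the results table. The sketch here assumes the averages are unweighted means of per-dataset relative improvements, which is an interpretation rather than something the text states; under that assumption the results land within rounding distance of the reported 36.47% and 68.80%.

```python
# (IoI@20, IoR@20) pairs copied from the results table.
ipg = {
    "ML-100k": (0.3898, 111.8751),
    "Lastfm": (0.4544, 100.1863),
    "Steam": (0.2554, 37.4944),
    "Douban_movie": (0.2259, 39.0427),
}
itmprec = {
    "ML-100k": (0.4690, 139.6954),
    "Lastfm": (0.6614, 161.7352),
    "Steam": (0.3336, 71.6806),
    "Douban_movie": (0.3374, 77.2108),
}


def relative_gain(new, old):
    """Relative improvement of new over old, in percent."""
    return (new - old) / old * 100.0


ioi_gains = [relative_gain(itmprec[d][0], ipg[d][0]) for d in ipg]
ior_gains = [relative_gain(itmprec[d][1], ipg[d][1]) for d in ipg]

avg_ioi_gain = sum(ioi_gains) / len(ioi_gains)
avg_ior_gain = sum(ior_gains) / len(ior_gains)
print(f"avg IoI@20 gain: {avg_ioi_gain:.2f}%")
print(f"avg IoR@20 gain: {avg_ior_gain:.2f}%")
```

The per-dataset gains vary widely, so the averages are driven largely by the sparser Steam and Douban_movie datasets for IoR@20.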
6.1.3. ITMPRec's Overall Performance
- ITMPRec consistently achieves the best performance in IoI@P and IoR@P across all datasets, confirming its ability to effectively shift user preferences towards target items.
- The HR@P scores for ITMPRec are competitive, demonstrating that the system can still generate engaging intermediate recommendations while fulfilling its proactive goal.
6.2. Ablation Studies / Parameter Analysis
The following are the results of ablation studies on four datasets from Table 3 of the original paper:
| Dataset | Ablation | HR@20 | IoI@20 | IoR@20 |
| --- | --- | --- | --- | --- |
| ML-100k | w/o P | 0.4067 | 0.4622 | 131.4221 |
| | w/o IIS | 0.3878 | 0.4596 | 136.6786 |
| | w/o TIAC | 0.3823 | 0.4006 | 118.3061 |
| | ITMPRec | 0.4016 | 0.4690 | 139.6954 |
| Lastfm | w/o P | 0.4113 | 0.6161 | 141.8555 |
| | w/o IIS | 0.3324 | 0.4030 | 97.2408 |
| | w/o TIAC | 0.3758 | 0.5149 | 116.5403 |
| | ITMPRec | 0.4135 | 0.6614 | 161.7352 |
| Steam | w/o P | 0.3907 | 0.3108 | 59.8572 |
| | w/o IIS | 0.3920 | 0.3321 | 71.5609 |
| | w/o TIAC | 0.3858 | 0.2472 | 38.4798 |
| | ITMPRec | 0.3923 | 0.3336 | 71.6806 |
| Douban_movie | w/o P | 0.3389 | 0.3201 | 73.9921 |
| | w/o IIS | 0.3329 | 0.3035 | 64.1521 |
| | w/o TIAC | 0.3303 | 0.2644 | 50.9361 |
| | ITMPRec | 0.3362 | 0.3374 | 77.2108 |
6.2.1. Ablation Study (RQ1)
The ablation study analyzes the contribution of three key components of ITMPRec: the pre-match module (P), intention-induced scores (IIS), and targeted individual arousal coefficients (TIAC).
- w/o Pre-match (P):
  - Removing the pre-match module (by using random target selection) leads to a decrease in IoI@20 and IoR@20 on all four datasets. For example, on Lastfm, IoI@20 drops from 0.6614 to 0.6161, and IoR@20 from 161.7352 to 141.8555. This validates that selecting target items that are collectively appealing and avoid cold-start issues is crucial for effective nudging.
  - The paper notes that the pre-match module (selecting targets from a specific category based on collective preference) helps avoid problems like scattered or cold-start target items, leading to more successful nudging.
- w/o Intention-induced scores (IIS):
  - Removing the intention-induced scores component causes a significant degradation in performance, especially on Lastfm (IoI@20 drops from 0.6614 to 0.4030, IoR@20 from 161.7352 to 97.2408) and Douban_movie (IoI@20 drops from 0.3374 to 0.3035, IoR@20 from 77.2108 to 64.1521). This highlights the importance of modeling user intention at a coarse-grained level to guide preferences effectively.
  - On the Steam dataset, the impact of IIS is less pronounced (IoI@20 changes only from 0.3336 to 0.3321). The authors suggest this might be due to the limited number of items users can search for in the Steam domain, which restricts candidate item pools and reduces the impact of different selection strategies related to intention.
- w/o Targeted Individual Arousal Coefficients (TIAC):
  - Removing TIAC consistently leads to a notable drop in IoI@20 and IoR@20 across all datasets (e.g., on Lastfm, IoI@20 drops from 0.6614 to 0.5149, IoR@20 from 161.7352 to 116.5403). This confirms the importance of personalizing the degree of preference evolution based on individual user sensitivity to new content.
  - The TIAC module matters most on diverse datasets like Steam and Douban_movie (on Steam, IoR@20 falls from 71.6806 to 38.4798 without it), emphasizing that accounting for individual user responses is crucial in multi-round tasks where users might have varying levels of curiosity or openness to new items.
- Overall: All three modules (P, IIS, TIAC) contribute positively to the interest nudging metrics (IoI@P and IoR@P), even though HR@P might see slight, acceptable variations. This indicates that these components are well-designed to steer users towards target content, improving the quality of proactive recommendation.
6.2.2. Parameter Sensitivity Analysis (RQ3)
The paper investigates the impact of two key hyperparameters: $\mathcal{Q}$ (number of items considered for personal curiosity in TIAC) and $N_C$ (number of intentions).
The following figure (Figure 4 from the original paper) shows the effect of hyperparameters $\mathcal{Q}$ and $N_C$ for four datasets:
The figure is a chart showing the effect of the hyperparameters $\mathcal{Q}$ and $N_C$ on the four datasets. One set of panels shows the recommendation performance (IoR) under different sampling sizes $\mathcal{Q}$, and the other shows the performance under different numbers of intentions $N_C$. The datasets are Lastfm, ML-100k, Steam, and Douban_movie.
- Effect of $\mathcal{Q}$ (Figure 4a):
  - For dense datasets like Lastfm and ML-100k, sampling a small number of top user preferences is sufficient to characterize user responses and achieve optimal performance for TIAC.
  - For sparser datasets like Douban_movie and Steam, a larger $\mathcal{Q}$ is required to adequately model user arousal levels. This makes sense, as more items might be needed to capture a user's diverse or less-defined short-term preferences in sparse environments.
- Effect of $N_C$ (Figure 4b):
  - $N_C$ represents the number of intentions modeled by the system. A larger $N_C$ allows more diverse user intentions to be captured.
  - For smaller datasets such as Lastfm and ML-100k, the model performs best with a relatively small number of intentions. This suggests that these datasets might exhibit fewer distinct user intention patterns.
  - For larger datasets like Steam and Douban_movie, a higher number of intentions yields better performance. This indicates that a more granular understanding of user intentions is beneficial in larger, potentially more diverse user bases.
6.2.3. LLM-based vs. Distribution-based Click Simulation (RQ3)
The paper provides a detailed comparison of the LLM-based and distribution-based click simulation schemes, both quantitatively and qualitatively.
- Quantitative Comparison (Figure 6): The following figure (Figure 6 from the original paper) shows the comparative results of the distribution-based and LLM-based click simulations on the Lastfm and Douban_movie datasets:
  This comparison chart shows the distribution-based and LLM-based click simulation results on the Lastfm and Douban_movie datasets. The top row shows Lastfm (IoI on the left, IoR on the right); the bottom row shows Douban_movie (IoI on the left, IoR on the right).
  The LLM-based click model consistently outperforms the distribution-based approach on both the Lastfm and Douban_movie datasets, for both the IoI@P and IoR@P metrics across the evaluation windows tested. This indicates that the LLM agent provides a more effective and realistic simulation of user behavior, leading to better nudging outcomes.
- Qualitative Comparison (Table 5): The following are the results from Table 5 of the original paper:

  | Target movies in target category Sci-Fi | Description |
  |---|---|
  | [1] Robert A. Heinlein's The Puppet Masters | Sci-Fi, Horror |
  | [2] Aliens | Sci-Fi, Action, Thriller |
  | [3] Mars Attacks! | Sci-Fi, Action, Comedy, War |

  The latest five movies' categories in the viewing history: Drama, Animation, Children's, Comedy, War

  | Intermediate items by LLM agent | Intermediate items by distribution-based scheme |
  |---|---|
  | Frighteners (Com, Hor) → Hunt for Red October (Act, Thr) → Forbidden Planet (Sci) ✓ | Breakfast at Tiffany's (Dra, Rom) → While You Were Sleeping (Com, Rom) → Great Escape (War) → Best of the Best 3: No Turning Back (Act) → Strange Days (Sci, Act, Cri) ✓ |
  | Star Trek IV (Act, Adv, Sci) ✓ | Forget Paris (Com, Rom) → G.I. Jane (Act, Dra, War) → Great Dictator (Com) → Star Trek IV (Sci) ✓ |
  | Drunks (Dra) → Balto (Ani, Chi) → Red Rock West (Thr) → Canadian Bacon (Com, War) → Dangerous Minds (Dra) → Strange Days (Act, Cri, Sci) ✓ | Dangerous Ground (Dra) → Hour of the Pig (Dra, Mys) → Red Rock West (Thr) → Canadian Bacon (Com, War) → Moonlight and Valentino (Dra, Rom) → Dangerous Minds (Dra) → Hunt for Red October (Act, Thr) |

  A case study on the ML-100k dataset, with "Sci-Fi" as the target category, illustrates the qualitative difference:
  - LLM Agent: When nudging towards "Robert A. Heinlein's The Puppet Masters" (Sci-Fi, Horror), the LLM agent recommends a path like "Frighteners" (Com, Hor) → "Hunt for Red October" (Act, Thr) → "Forbidden Planet" (Sci). The LLM seems to understand the nuances of genres. For example, "Forbidden Planet" is categorized as "Sci-Fi" but actually contains elements of "Action, Thriller, and Adventure." The LLM leverages its external knowledge to connect these broader genre aspects, forming a coherent nudging path even across seemingly disparate initial preferences (Drama, Animation, Children's, Comedy, War).
  - Distribution-based Scheme: In contrast, the distribution-based scheme might follow a more direct similarity path. For the same target, it recommends "Breakfast at Tiffany's" (Dra, Rom) → "While You Were Sleeping" (Com, Rom) → "Great Escape" (War) → "Best of the Best 3" (Act) → "Strange Days" (Sci, Act, Cri). While it eventually reaches a Sci-Fi movie, the path appears less "reasoned" and more purely based on feature similarity. For "Mars Attacks!" (Sci-Fi, Action, Comedy, War), the distribution-based method at times struggles to reach a Sci-Fi movie at all.
  - The LLM-based method demonstrates a more sophisticated reasoning capability, allowing it to bridge seemingly larger gaps by identifying subtle connections (e.g., hidden genre elements, broader thematic links) through its external knowledge, thus generating more effective and believable nudging paths.
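To make the idea of a "pluggable" click simulator concrete, here is a minimal sketch in which both schemes sit behind one interface that returns 1 (click) or 0 (skip). Everything in it is an illustrative assumption rather than the paper's implementation: the names `Item`, `User`, `ClickModel`, `make_distribution_click`, `make_agent_click`, and `stub_agent` are hypothetical, the distribution-based scheme is approximated by a Bernoulli draw on a sigmoid of the user-item dot product, and the agent is a hard-coded stub standing in for a real LLM call.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Item:
    title: str
    genres: List[str]
    embedding: List[float]


@dataclass
class User:
    history: List[Item]
    embedding: List[float]


# Shared interface (hypothetical): any click simulator maps (user, item) to 1 or 0.
ClickModel = Callable[[User, Item], int]


def make_distribution_click(rng: random.Random) -> ClickModel:
    """Assumed distribution-based scheme: Bernoulli(sigmoid(user . item))."""
    def click(user: User, item: Item) -> int:
        score = sum(u * v for u, v in zip(user.embedding, item.embedding))
        prob = 1.0 / (1.0 + math.exp(-score))
        return 1 if rng.random() < prob else 0
    return click


def make_agent_click(ask_agent: Callable[[str], str]) -> ClickModel:
    """Wrap any text-in/text-out agent as a click simulator."""
    def click(user: User, item: Item) -> int:
        watched = "; ".join(
            f"{i.title} ({', '.join(i.genres)})" for i in user.history[-5:]
        )
        prompt = (
            f"Recently watched: {watched}\n"
            f"Candidate: {item.title} ({', '.join(item.genres)})\n"
            "Would this user click the candidate? Answer yes or no."
        )
        reply = ask_agent(prompt)
        return 1 if reply.strip().lower().startswith("yes") else 0
    return click


def stub_agent(prompt: str) -> str:
    """Stand-in for an LLM call: says yes only if the candidate is a comedy."""
    candidate_line = prompt.split("Candidate:")[1].splitlines()[0]
    return "yes" if "Comedy" in candidate_line else "no"
```

Because both factories return the same `ClickModel` type, the nudging loop can swap one simulator for the other without further changes, which is the sense in which the agent is described as a pluggable component.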
6.2.4. Case Study: User Embedding Evolution (Appendix A.6)
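The following figure (Figure 7 from the original paper) shows user embedding's evolution in ITMPRec:
This heatmap shows how the interaction strength between the user and the target changes across rounds and dimensions in the ITMPRec model. The horizontal axis is the round number, running from the user to the target; the vertical axis shows the different dimensions, and the color intensity reflects the interaction strength. The figure helps in understanding how the user's intention develops.
The following figure (Figure 8 from the original paper) shows intermediate items recommended by ITMPRec:
This diagram shows a heatmap of how user intention changes with each recommendation round in ITMPRec. The horizontal axis is the recommendation round, the vertical axis is the user dimension, and the color intensity reflects the change in intention strength.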
A visualization of a user's embedding evolution from the Lastfm dataset shows that ITMPRec effectively draws user preferences towards the target item over multiple rounds. The user embedding (represented as a point in a 2D space) gradually moves closer to the target item's embedding as intermediate items are recommended and "clicked." This visual evidence reinforces the idea that ITMPRec successfully nudges user preferences.
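The drift described here can be illustrated with a toy simulation. The sketch below is an assumption-laden illustration, not the paper's update rule: the helpers `distance`, `nudge`, and `simulate` are hypothetical, each clicked intermediate item pulls the user embedding a fixed fraction `rate` of the way toward that item, and `rate` is only a crude stand-in for a per-user sensitivity term.

```python
import math
from typing import List


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nudge(user: List[float], item: List[float], rate: float) -> List[float]:
    """Move the user embedding a fraction `rate` of the way toward a clicked item."""
    return [u + rate * (v - u) for u, v in zip(user, item)]


def simulate(
    user: List[float],
    target: List[float],
    intermediates: List[List[float]],
    rate: float = 0.5,
) -> List[float]:
    """Return the user-to-target distance before nudging and after each round."""
    trace = [distance(user, target)]
    for item in intermediates:
        user = nudge(user, item, rate)  # assumes every intermediate item is clicked
        trace.append(distance(user, target))
    return trace


# Toy 2D example: intermediate items sit progressively closer to the target.
trace = simulate(
    user=[0.0, 0.0],
    target=[1.0, 1.0],
    intermediates=[[0.4, 0.4], [0.7, 0.7], [1.0, 1.0]],
)
```

In this toy setting the distance in `trace` shrinks after every round, which mirrors the qualitative pattern the case study reports. A skipped item would leave the embedding unchanged for that round, which is why the click simulator matters for how quickly the user approaches the target.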
6.3. Summary of Findings
ITMPRec significantly advances the field of proactive recommendation by introducing sophisticated mechanisms for target selection, user intention modeling, individual sensitivity assessment, and realistic user feedback simulation. The ablation studies confirm the individual importance of these components, while parameter sensitivity analysis provides practical guidance for deployment. The superior performance over SR and existing ProactRec baselines, particularly in IoI and IoR, validates ITMPRec as an effective and robust solution for intention-based targeted multi-round proactive recommendation. The LLM-based click model is a notable innovation, offering more realistic and intelligent user feedback simulation.
7. Conclusion & Reflections
7.1. Conclusion Summary
This paper introduces ITMPRec, a novel Intention-based Targeted Multi-round Proactive Recommendation method designed to overcome the limitations of traditional sequential recommendation (SR) systems that primarily cater to users' historical preferences. ITMPRec focuses on proactively guiding users towards a specific category of items over multiple interaction rounds. Its key contributions include:
- Pre-match Module: A strategy to select a set of target items by considering all users' opinions within a specified category, thereby making the nudging process more purposeful and avoiding cold-start issues.
- Intention-induced Scores: Integration of a mechanism to quantify users' intention-level evolution, which helps in selecting suitable intermediate items that align with changing coarse-grained user intentions during the guidance process.
- Targeted Individual Arousal Coefficients (TIAC): A component that models each user's unique sensitivity and receptivity to new content, allowing for personalized preference evolution during multi-round nudging.
- LLM Agent for Click Simulation: A pluggable Large Language Model (LLM) agent that provides a more realistic simulation of user click feedback on intermediate recommendations, leveraging the LLM's external knowledge and reasoning capabilities rather than relying on traditional distribution-based models.

Extensive experiments conducted on four real-world datasets demonstrate the superiority of ITMPRec over eight state-of-the-art baselines, with average increases of 36.47% in IoI@20 and 68.80% in IoR@20. The ablation studies confirm the effectiveness of each proposed module.
7.2. Limitations & Future Work
The authors identify several directions for future work:
- Causal Theory in Nudging: Further study of causal theory [13] in the nudging process to better understand the cause-and-effect relationships between recommendations and user preference shifts.
- Model Explainability: Enhancing the model's explainability [23] to provide users with clearer reasons for the proactive recommendations.
- Robustness in Complex Probabilistic Modeling: Improving the model's robustness in more complex probabilistic modeling scenarios, likely referring to the dynamics of user preferences and click behaviors.

While not explicitly stated as limitations, the paper implicitly highlights some challenges:
- Resource Intensiveness of LLMs: The LLM-IPP baseline was noted for its high resource consumption, and while ITMPRec uses LLMs pluggably, their integration still implies a computational cost trade-off compared to non-LLM components.
- Offline Simulation Dependence: The entire framework relies on an environment simulator. The quality of proactive recommendation in real-world deployment depends heavily on how accurately this simulator (especially the LLM agent) mimics actual user behavior.
7.3. Personal Insights & Critique
ITMPRec presents a significant step forward in proactive recommendation by comprehensively addressing critical limitations of prior work. The shift from random target selection to a pre-match module is highly practical, aligning recommendation goals with content provider strategies. The integration of user intention and individual arousal coefficients is particularly insightful, moving beyond simplistic user models to capture more nuanced and dynamic aspects of human behavior. This makes the nudging process more adaptive and potentially more ethical, as it's tailored to individual user receptivity rather than a rigid push.
The use of an LLM agent for click simulation is perhaps the most innovative aspect. It acknowledges the complexity of user decision-making, which traditional distribution-based models cannot fully capture. This approach has broad implications beyond this paper, suggesting that LLMs could become a standard component in offline evaluation environments for complex interactive systems, not just recommendation. This allows for more robust offline testing before costly live experiments.
Potential issues or unverified assumptions:
- Interpretability of LLM Agent: While LLMs offer interpretability in generating explanations, their internal decision-making process for simulating a click (0 or 1) might still be a black box. Understanding why an LLM agent decided a user would click, rather than just that it clicked, could be crucial for refining the nudging strategy.
- Generalizability of LLM Agent: The effectiveness of the LLM agent relies on the prompt engineering and the LLM's underlying external knowledge. While ChatGLM3 is powerful, its simulated feedback might still be limited by the data it was trained on and the specific instructions it receives. Its ability to capture novel or highly niche user behaviors might be a challenge.
- Ethical Implications of Nudging: The term "proactive recommendation" or "nudging" inherently carries ethical considerations. While ITMPRec aims to broaden user interests, it could also be misused to manipulate users towards less beneficial content. Future work could explicitly integrate ethical safeguards or transparency mechanisms to ensure responsible nudging.
- Real-world Deployment Challenges: Although ITMPRec shows strong offline performance, deploying a multi-round proactive recommendation system in a real-world setting would involve significant engineering challenges, including real-time user preference updates, cold-start issues for intermediate items, and the computational cost of generating LLM-based feedback or even using LLMs in live recommendation loops.

The methods and conclusions of ITMPRec could be applied to other domains where preference guidance is desirable, such as educational content recommendation (guiding students towards foundational topics), health and wellness apps (nudging users towards healthier habits or information), or news feeds (encouraging exposure to diverse perspectives to combat polarization). The concept of intention-aware and individually sensitive multi-round guidance is broadly applicable to any sequential decision-making process involving human interaction.