ITMPRec: Intention-based Targeted Multi-round Proactive Recommendation


TL;DR Summary

ITMPRec is a novel intention-based targeted multi-round proactive recommendation method that addresses passive acceptance in personalized systems by selecting target items through pre-matching, utilizing multi-round nudging, and simulating user feedback with an LLM agent, outperforming eight baseline models on four public datasets.

Abstract

Personalized recommendations are integrated into daily life, but providers may want certain items to become more appealing over time through user interactions, yet this issue is often overlooked. The existing works are often based on the assumption that users will passively accept all intermediate sequences or not explore intention modeling in the targeted nudging process. Both of these factors result in suboptimal performance in the proactive recommendation. In this paper, we propose a novel intention-based targeted multi-round proactive recommendation method, dubbed ITMPRec. We first select target items using a pre-match strategy. Then, we employ a multi-round nudging recommendation method, incorporating a module to quantify users’ intention-level evolution, helping choose suitable intermediate items. Additionally, we model users’ sensitivity to changes caused by these items. Lastly, we propose an LLM agent as a pluggable component to simulate user feedback, offering an alternative to traditional click models by leveraging the agent’s external knowledge and reasoning capabilities. Through extensive experiments on four public datasets, we demonstrate the superiority of ITMPRec compared to eight baseline models.

In-depth Reading


1. Bibliographic Information

1.1. Title

The central topic of the paper is ITMPRec: Intention-based Targeted Multi-round Proactive Recommendation. This title suggests a novel approach to recommendation systems that focuses on proactively guiding users towards specific items over multiple interactions, taking user intentions into account.

1.2. Authors

  • Yahong Lian (College of Computer Science, TJ Key Lab of NDST, DISSec, TMCC, TBI Center, Nankai University, Tianjin, China)

  • Chunyao Song (College of Computer Science, TJ Key Lab of NDST, DISSec, TMCC, TBI Center, Nankai University, Tianjin, China)

  • Tingjian Ge (Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, USA)

    The authors are primarily affiliated with computer science departments and research centers, indicating a background in computational methods, data science, and artificial intelligence, which are highly relevant to recommendation systems.

1.3. Journal/Conference

The paper was published in the Proceedings of the ACM Web Conference 2025 (WWW '25). WWW (The Web Conference, formerly known as World Wide Web Conference) is a premier international academic conference on topics related to the World Wide Web. Its reputation is very high in the fields of information retrieval, web mining, and recommendation systems, making it a highly influential venue for research in these areas. Publication at WWW signifies that the research has undergone rigorous peer review and is considered to be of significant quality and impact.

1.4. Publication Year

2025 (This indicates it's a forthcoming publication or accepted paper for WWW '25).

1.5. Abstract

Personalized recommendation systems are ubiquitous in daily life. However, they often overlook the objective of content providers to make certain items more appealing over time through user interactions. Existing proactive recommendation methods typically assume users passively accept all intermediate recommendations or fail to model user intentions during the nudging process, leading to suboptimal performance.

To address these limitations, the paper proposes a novel method called ITMPRec (Intention-based Targeted Multi-round Proactive Recommendation). ITMPRec first employs a pre-match strategy to select target items. It then utilizes a multi-round nudging recommendation approach that includes a module to quantify the evolution of users' intentions, which aids in selecting appropriate intermediate items. Additionally, the model accounts for users' sensitivity to changes introduced by these intermediate items. Finally, ITMPRec introduces an LLM agent as a pluggable component to simulate user feedback, offering a sophisticated alternative to traditional click models by leveraging the agent's external knowledge and reasoning capabilities. Extensive experiments on four public datasets demonstrate that ITMPRec significantly outperforms eight baseline models.


2. Executive Summary

2.1. Background & Motivation

The core problem the paper aims to solve is the limitation of traditional personalized recommendation systems, which primarily focus on predicting users' next preferences based on their historical behavior. While convenient, this user-centric approach can lead to several negative outcomes:

  1. Filter Bubbles and Information Cocoons: By constantly reinforcing existing preferences, users can become confined to a narrow range of content, limiting exposure to diverse items and potentially harming both user experience and the content ecosystem.

  2. Misalignment with Provider Goals: Content providers often have strategic objectives to promote specific items or categories, increase diversity, or guide users towards new experiences. Traditional systems do not inherently support these "nudging" or "proactive" goals.

    This problem is important because it highlights a fundamental tension in recommendation systems: balancing user satisfaction with platform objectives and promoting content diversity. The existing solutions for proactive recommendation have significant gaps:

  1. Random Target Item Selection: Previous approaches often assign target items randomly, which can lead to guiding users towards cold-start items (items with little to no historical interaction data) or items that are too scattered, making successful guidance difficult and potentially irrelevant to provider goals.

  2. Neglect of User Intention: The role of a user's underlying intention (a coarse-grained aspect compared to preference) in the multi-round nudging process is largely ignored, which is crucial for effective and dynamic guidance.

  3. Passive User Assumption: Many existing methods assume users will passively accept all intermediate recommendations, or they use simplistic, fixed thresholds for simulating user clicks. This unrealistic assumption fails to reflect real-world user behavior and leads to sub-optimal guidance paths.

    The paper's innovative idea is to address these gaps by developing a sophisticated multi-round proactive recommendation framework that not only selects meaningful target items (often a category of items) but also dynamically adapts to users' evolving intentions and individual sensitivities, leveraging advanced LLM agents for more realistic user feedback simulation.

2.2. Main Contributions / Findings

The paper makes several primary contributions to the field of proactive recommendation:

  1. Targeted Multi-Round Proactive Recommendation Paradigm: It proposes ITMPRec, a novel method that aims to guide users towards a class of target items (e.g., a specific category or genre) over multiple rounds, moving beyond the traditional single-item prediction. This approach is more aligned with content provider needs for focused promotion and can encourage content diversity.

  2. Pre-match Module for Target Item Selection: ITMPRec introduces a pre-match module that collects all users' opinions to generate a sensible set of candidate target items within a specific category. This addresses the limitation of random target assignment, ensuring selected targets are more relevant and avoid cold-start issues.

  3. Intention-Induced Scores: The paper devises a mechanism to incorporate intention-induced scores into the recommendation process. By modeling users' intention-level evolution, it helps in choosing more suitable intermediate items that align with the user's changing coarse-grained goals during the nudging process, which was previously overlooked.

  4. Targeted Individual Arousal Coefficients (TIAC): Recognizing that users respond differently to new content, ITMPRec introduces TIAC. This module quantifies each user's unique sensitivity or receptivity to changes caused by intermediate recommendations, enabling more personalized and effective nudging.

  5. LLM Agent for User Feedback Simulation: A novel aspect is the integration of an LLM agent as a pluggable component to simulate user click feedback on intermediate recommendations. This offers a more sophisticated and realistic alternative to traditional distribution-based click models, leveraging the LLM's external knowledge and reasoning capabilities to mimic complex user decision-making, thus better aligning the simulation with real-world scenarios.

  6. Empirical Superiority: Through extensive experiments on four real-world datasets, ITMPRec demonstrates significant superiority over eight state-of-the-art baseline models (including both sequential and proactive approaches). It achieves an average increase of 36.47% in IoI@20 and 68.80% in IoR@20 (metrics for proactive recommendation quality), validating its effectiveness.

    These findings collectively address the shortcomings of existing methods by providing a more holistic, intelligent, and realistic framework for proactive recommendation, benefiting both users (through expanded interests) and content providers (through targeted promotion).

3. Prerequisite Knowledge & Related Work

3.1. Foundational Concepts

To understand ITMPRec, a reader should be familiar with several core concepts in recommendation systems and machine learning:

  1. Personalized Recommendation Systems: These systems aim to predict user preferences for items (products, movies, music, etc.) and recommend relevant ones. They are based on user-item interaction data (e.g., clicks, purchases, ratings).
  2. Sequential Recommendation (SR): A sub-field of recommendation systems that focuses on modeling the chronological order of user interactions. Instead of just predicting static preferences, SR systems try to predict the next item a user will interact with, given their historical sequence of interactions. This often involves capturing sequential patterns and short-term preferences.
    • User/Item Embeddings: In recommendation systems, embeddings are low-dimensional, dense vector representations of users and items. These vectors are learned from interaction data and capture the latent features and relationships between users and items. For example, similar items would have embedding vectors that are close to each other in the embedding space.
    • Dot Product for Similarity: A common way to measure the similarity or interaction tendency between a user embedding $e_u$ and an item embedding $e_i$ is to compute their dot product, i.e., $e_u^T \cdot e_i$. A higher dot product typically indicates higher predicted relevance or preference.
  3. Filter Bubble and Information Cocoon: These are phenomena where users are exposed only to information that confirms their existing beliefs or preferences, due to algorithms that personalize content.
    • A filter bubble is created by personalized content filters that selectively guess what information a user would like to see.
    • An information cocoon is a state where individuals are isolated from information that contradicts their beliefs, often resulting from their own choices and algorithms.
  4. Proactive Recommendation: A paradigm that goes beyond passively predicting what a user will like. Instead, it actively tries to guide or nudge user preferences towards certain target items or categories, often over multiple rounds of interaction. This can be for purposes like promoting diversity, introducing new content, or achieving specific business goals.
  5. Large Language Models (LLMs): These are advanced AI models trained on vast amounts of text data, capable of understanding, generating, and reasoning with human language. They possess external knowledge (information learned during pre-training) and reasoning capabilities (ability to infer, deduce, and plan based on instructions).
  6. Click Models: In recommendation research, click models are used to simulate user interactions (e.g., whether a user clicks on a recommended item). They estimate the probability of a user clicking on an item given certain features or conditions.
    • Bernoulli Distribution: A discrete probability distribution that describes the probability of an event happening (success) or not happening (failure) in a single trial. In click models, it can be used to model whether a user clicks (1) or doesn't click (0) an item.
    • Sigmoid Function ($\sigma(x)$): A mathematical function that maps any real-valued number to a value between 0 and 1. It is often used to convert a raw score into a probability. Its formula is $\sigma(x) = \frac{1}{1 + e^{-x}}$.
  7. Contrastive Learning: A machine learning paradigm where the model learns by contrasting positive pairs (similar items/representations) with negative pairs (dissimilar items/representations). The goal is to bring positive pairs closer in the embedding space while pushing negative pairs farther apart.
    • InfoNCE Loss: A common loss function used in contrastive learning. It encourages the model to distinguish a positive sample from a set of negative samples.
  8. BPR Loss (Bayesian Personalized Ranking Loss): A widely used ranking loss function in recommendation systems. It optimizes the model to rank positive items higher than negative (uninteracted) items for a given user.
    • The formula for BPR loss is typically given as: $ \mathcal{L}_{BPR} = - \sum_{u=1}^U \sum_{i \in R_u} \sum_{j \in I \setminus R_u} \ln \sigma(\hat{x}_{ui} - \hat{x}_{uj}) $, where $R_u$ is the set of items user $u$ interacted with, $I \setminus R_u$ is the set of items user $u$ did not interact with, and $\hat{x}_{ui}$ is the predicted score of item $i$ for user $u$. A minimal code sketch of this loss appears after this list.
  9. Hyperparameters: Parameters whose values are set before the learning process begins (e.g., learning rate, embedding dimension, number of intentions). They control the learning algorithm itself.

3.2. Previous Works

The paper categorizes related work into Sequential Recommendation (SR) and Proactive Recommendation (ProactRec).

3.2.1. Sequential Recommendation (SR) Methods

SR methods aim to predict the next item a user will interact with based on their historical sequence. They typically focus on modeling chronological behaviors and capturing short-term user interests.

  • SASRec [18]: A classical sequential recommendation method that uses a self-attention framework. It models item transitions by allowing items in a sequence to "attend" to each other, capturing long-range dependencies effectively.

    • Background: Self-attention is a mechanism that allows a model to weigh the importance of different parts of an input sequence when processing a specific element. It's a core component of Transformers. The key idea is to compute Query, Key, and Value matrices from the input embeddings. The attention score is computed as a scaled dot product of Query and Key, followed by a softmax to get weights, which are then applied to Value to get the weighted sum.
    • Formula for Self-Attention: $ \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V $ Where:
      • $Q$ (Query), $K$ (Key), $V$ (Value) are matrices derived from the input sequence embeddings.
      • $d_k$ is the dimension of the Key vectors, used for scaling to prevent the dot products from becoming too large.
      • $QK^T$ is the dot product similarity between Query and Key vectors.
      • $\mathrm{softmax}(\cdot)$ normalizes the attention scores into probabilities. A minimal code sketch of this computation appears after this list.
  • ICLRec [7]: An intention contrastive learning paradigm that models latent user intentions. It fuses these intentions into an SR method using a new contrastive self-supervised learning objective. It uses K-Means to cluster item embeddings and calculate intention centers.

  • MStein [12]: A sequential recommendation method based on mutual Wasserstein discrepancy minimization. This technique helps in obtaining more fine-grained sequential patterns by measuring the "distance" between probability distributions.

  • ICSRec [29]: A sequential recommendation method that enhances its performance by incorporating subsequences and considering intention prototypes of users, constructing auxiliary objectives for intention learning.

  • BSARec [34]: A sequential recommendation method that incorporates an attentive inductive bias, suggesting that it biases the attention mechanism in a specific way to capture particular sequential patterns.

    Limitations of SR methods: They are primarily user-centric, focusing on next-item prediction and catering to historical preferences. This inherently leads to filter bubbles and information cocoons, as they reinforce existing interests rather than broadening them.

3.2.2. Proactive Recommendation (ProactRec) Methods

This field focuses on actively guiding user preferences.

  • IRN (Influential Recommender System) [46]: A Transformer-based proactive recommendation work. It generates a sequence of intermediate items using a personalized impression mask with the goal of guiding users toward a target.
    • Limitation: Assumes users passively accept all intermediate recommendations, which is often unrealistic.
  • LLM-IPP (LLMs with Influential Recommender System) [37]: A pure LLM-based proactive recommendation method that leverages LLMs to produce targeted intermediate guidance sequences. It uses LLMs for path planning and instruction following to ensure coherence and acceptability of recommendations.
    • Limitations: Resource-intensive and limited scalability. Similar to IRN, it assumes passive acceptance of intermediate items. The paper notes it doesn't show significant improvement over non-LLM methods in multi-round simulated clicks.
  • IPG (Iterative Preference Guidance) [3]: A model-agnostic post-processing method that conducts proactive recommendation. It employs a distribution-based click module to simulate user feedback.
    • Limitations: Uses a one-size-fits-all fixed threshold to measure user impact, which is an oversimplification. Its overall performance needs further improvement.
  • Conversational Recommendation Systems [10, 33, 36, 45] and Multi-modal Recommendation Approaches [40]: These are related paradigms that also involve guiding users, often through interactive dialogues or multi-modal feedback, towards stated goals. However, the proactive manner in purely sequential recommendation scenarios was rarely explored before IRN and IPG.

3.3. Technological Evolution

Recommendation systems have evolved from simple collaborative filtering and content-based filtering to sophisticated sequential recommendation models that leverage deep learning architectures like RNNs, LSTMs, and Transformers. The initial focus was purely on user-centric predictions, aiming for high accuracy in predicting the next likely item.

The limitations of this user-centric approach—namely, filter bubbles and information cocoons—led to the emergence of proactive recommendation. This shift reflects a move from purely reactive systems to more goal-oriented or provider-aligned systems. Early proactive methods (like IRN and LLM-IPP) introduced the idea of multi-round guidance paths but often made simplifying assumptions about user behavior. IPG introduced a more realistic click simulation, but still lacked sophistication in modeling user dynamics.

ITMPRec fits into this evolution by addressing the key shortcomings of previous proactive recommendation methods. It improves target item selection (pre-match), deepens the modeling of user dynamics by considering intention evolution and individual sensitivity (TIAC), and introduces a more advanced user feedback simulation using LLMs, pushing the boundary of realistic and effective proactive nudging.

3.4. Differentiation Analysis

Compared to the main methods in related work, ITMPRec introduces several core differences and innovations:

  1. Target Item Selection (Pre-match Module):

    • Previous: IRN and IPG often rely on randomly assigned target items (or a single, pre-defined target). LLM-IPP also uses targeted guidance but doesn't explicitly detail a sophisticated target selection strategy. This can lead to cold-start issues or irrelevant targets.
    • ITMPRec Innovation: ITMPRec proactively selects a category of target items using a pre-match module that considers the aggregate preferences of all users. This ensures that the chosen targets are meaningful and broadly appealing within a specific domain, making the nudging process more purposeful and successful.
  2. User Intention Modeling (Intention-induced scores):

    • Previous: Most proactive recommendation methods (e.g., IRN, IPG) do not explicitly model user intention during the round-by-round nudging process. While some SR methods like ICLRec and ICSRec model intention for next-item prediction, this concept was not integrated into proactive nudging.
    • ITMPRec Innovation: ITMPRec explicitly quantifies users' intention-level evolution using intention-induced scores. This allows the system to choose intermediate items that not only align with immediate preferences but also strategically shift the user's underlying coarse-grained intentions towards the target category.
  3. User Sensitivity to Nudging (Targeted Individual Arousal Coefficients - TIAC):

    • Previous: IPG uses a one-size-fits-all fixed threshold for simulating user clicks, implicitly assuming uniform user responses to intermediate items. IRN and LLM-IPP assume passive acceptance, entirely ignoring user response variability.
    • ITMPRec Innovation: ITMPRec introduces TIAC to model users' sensitivity or receptivity to new content. This acknowledges that each user reacts differently to external stimuli, enabling a more personalized and realistic adaptation of the preference evolution process during nudging.
  4. User Feedback Simulation (LLM Agent):

    • Previous: IRN and LLM-IPP assume passive acceptance of all intermediate items. IPG uses a distribution-based click model which is a step forward but still relatively simplistic, relying on a predefined probability function.

    • ITMPRec Innovation: ITMPRec offers a sophisticated LLM agent as a pluggable component for simulating user feedback. Leveraging the LLM's external knowledge and reasoning capabilities, this agent can model more complex and realistic user decision-making processes, moving beyond simple probability distributions and providing more accurate feedback for training and evaluating proactive recommendation strategies.

      In essence, ITMPRec moves beyond simplistic assumptions about users and target selection by incorporating a deeper understanding of user dynamics (intentions, individual sensitivities) and more realistic feedback mechanisms (LLM agent), specifically tailored for the multi-round, targeted proactive recommendation setting.

4. Methodology

4.1. Principles

The core principle of ITMPRec is to move beyond passive sequential recommendation by proactively nudging user preferences towards a predetermined category of target items over multiple interaction rounds. This is achieved by:

  1. Strategic Target Selection: Instead of random targets, ITMPRec identifies a set of target items that are relevant to a specific category and somewhat aligned with overall user preferences.

  2. Dynamic Preference Evolution: It aims to gradually modify user preferences by recommending intermediate items that act as stepping stones. This evolution is not uniform for all users; it considers individual user intentions and their unique sensitivity to new recommendations.

  3. Realistic User Feedback Simulation: Since real-time user feedback in multi-round nudging is hard to collect offline, ITMPRec relies on a sophisticated environment simulator that can realistically model user clicks, especially through the integration of Large Language Models (LLMs).

    The theoretical basis and intuition behind this approach stem from the understanding that:

  • Users' preferences are dynamic and can be influenced.
  • User behavior is driven by both explicit preferences (fine-grained) and implicit intentions (coarse-grained).
  • Individuals react differently to external stimuli (arousal theory), necessitating personalized nudging strategies.
  • LLMs possess vast external knowledge and reasoning capabilities that can simulate complex human decision-making more accurately than simple statistical models.

4.2. Environment Simulator

To evaluate proactive recommendation methods in an offline setting, ITMPRec utilizes an environment simulator to generate realistic user feedback over multiple rounds. The simulator captures three main aspects: user and item embeddings, preference evolution, and click modeling.

4.2.1. User and Item Embeddings

Initially, ITMPRec uses a pre-trained graph-based recommendation method (specifically GraphAU [42]) to generate initial user embeddings and item embeddings. These embeddings capture the latent features of users and items in a $d$-dimensional space.

  • $\hat{e}_u^0 \in \mathbb{R}^d$: The initial pre-trained embedding vector for user $u$.
  • $\hat{e}_i^0 \in \mathbb{R}^d$: The initial pre-trained embedding vector for item $i$.

4.2.2. Preference Evolution

The user's preference is not static; it evolves after interacting with new items. If a user $u$ positively interacts with an item $z$ in round $r$, their embedding is updated to reflect this new preference. The paper models this preference evolution as a weighted sum of the user's current embedding and the interacted item's embedding.

The user $u$'s embedding after interaction with item $z$ in round $r$ is updated as follows: $ \hat{e}_u^{r+1} \gets \beta_u^r \cdot \hat{e}_u^r + (1 - \beta_u^r) \cdot \hat{e}_z^r $ Where:

  • $\hat{e}_u^{r+1}$: The updated embedding for user $u$ for the next round ($r+1$).
  • $\hat{e}_u^r$: The current embedding for user $u$ in round $r$.
  • $\hat{e}_z^r$: The embedding of the item $z$ that user $u$ interacted with in round $r$.
  • $\beta_u^r$: A targeted individual arousal coefficient for user $u$ in round $r$. This coefficient, which ranges between 0 and 1, controls the degree of preference evolution. A higher $\beta_u^r$ means the user's preference changes less after interacting with item $z$ (more weight on $\hat{e}_u^r$), while a lower $\beta_u^r$ means the preference changes more (more weight on $\hat{e}_z^r$). This coefficient is specific to each user and round, originating from the TIAC module (explained in Section 4.5).

4.2.3. Click Model (Traditional)

The click model simulates the user's decision to interact with a recommended item. It estimates the interaction probability between a user $u$ and an item $z$. This traditional model is often distribution-based, assuming a probabilistic outcome based on the similarity between user and item embeddings.

The interaction probability (or activation score) between user $u$ and item $z$ in round $r$ is calculated, and a binary click decision $a_u^r$ is then sampled from it: $ a_u^r \sim \mathrm{Bernoulli}\left( \sigma\left( w \left( (\hat{e}_u^r)^T \cdot \hat{e}_z^r - b \right) \right) \right) $ Where:

  • $a_u^r$: A binary value (1 for a user click, 0 otherwise), representing the outcome of the interaction.

  • $\sigma(\cdot)$: The sigmoid function, which squashes the input to a value between 0 and 1, representing a probability.

  • $(\hat{e}_u^r)^T \cdot \hat{e}_z^r$: The dot product of the user's current embedding and the item's embedding, indicating their similarity or predicted relevance.

  • $w$: A scaling (slope) parameter of the click model.

  • $b$: A bias term.

    The paper also proposes an LLM-agent click model as an alternative, discussed later in Section 4.6.

4.3. Pre-match Module

The pre-match module addresses the limitation of random target item assignment in previous proactive recommendation methods. Instead, it aims to select a set of target items that are meaningful within a specific category and have a collective appeal to users. This avoids cold-start issues with target items and aligns better with content provider goals.

The process involves two main steps:

  1. Calculate Overall User Preference for Candidate Items: For a given pool of candidate target items (e.g., all items within a specific category), the module calculates a collective preference score from all users for each candidate item: $ L_{N_{can}} = \sum_{u=1}^U (e_l^T \cdot e_u^0), \quad l \in N_{can} $ Where:
    • $L_{N_{can}}$: A list of scores, one for each candidate item in the candidate target pool.
    • $U$: The total number of users.
    • $e_l^T \cdot e_u^0$: The dot product (similarity) between a candidate item $l$'s initial embedding ($e_l$) and a user $u$'s initial embedding ($e_u^0$). This represents user $u$'s initial preference for item $l$.
    • $l \in N_{can}$: Indicates that $l$ is an item from the candidate target pool, which has a size of $N_{can}$.
  2. Select Top Target Items: From the scored candidate items, the module selects the top $N_{tar}$ items (where $N_{tar} \leq N_{can}$) to be the ultimate target items for guidance: $ L_{N_{tar}} = cut\{sort(L_{N_{can}}, \searrow), N_{tar}\} $ Where:
    • $L_{N_{tar}}$: The final list of $N_{tar}$ selected target items.

    • $sort(X, \searrow)$: A function that sorts the list $X$ in descending order.

    • $cut\{X, num\}$: A function that returns the first $num$ elements from the sorted list $X$.

    • $N_{tar}$: A pre-defined number, representing the desired count of target items (e.g., 20 or 50).

      This pre-match setting ensures that the subsequent proactive recommendation process starts with target items that are collectively preferred by a wider user base, making successful nudging more feasible.

4.4. Intention-induced Scores

To generate effective intermediate recommendations, ITMPRec considers not only direct user-item similarity but also the underlying user intentions. This section details how intention-induced scores are calculated and integrated into the recommendation process.

4.4.1. Basic Recommendation Score

The fundamental tendency of interaction between user $u$ and item $i$ in round $r$ is quantified using the inner product of their embeddings: $ score_{(u,i)}^r = (e_u^r)^T \cdot e_i $ Where:

  • $score_{(u,i)}^r$: The basic relevance score between user $u$ and item $i$ in round $r$.
  • $e_u^r$: The current embedding of user $u$ in round $r$.
  • $e_i$: The embedding of candidate item $i$.

4.4.2. Post-processing Strategy with Nudging Aggressiveness

Following previous work [3], the system also considers the degree of nudging aggressiveness towards a target item $e_j$. This is formulated as: $ l_{uij}^r = score_{(u,i)}^r \cdot nudge_{(u,i,j)}^r $ Where:

  • $l_{uij}^r$: The overall score for recommending intermediate item $i$ to user $u$ to guide towards target item $j$ in round $r$.
  • $nudge_{(u,i,j)}^r$: A term representing the nudging aggressiveness, i.e., how much a user's preference ($e_u$) would shift towards the target item $e_j$ if they interacted with intermediate item $i$. It is defined as the difference in the target item's similarity to the user's future embedding ($e_u^{(r+1)}$) versus their current embedding ($e_u^{(r)}$): $ nudge_{(u,i,j)}^r = e_j^T e_u^{(r+1)} - e_j^T e_u^{(r)} $

In the previous work [3], it was assumed that the user representation transition from round $r$ to $r+1$ follows a linear pattern: $ e_u^{r+1} = \omega e_u^r + (1 - \omega) e_i $, where $\omega$ is a coefficient representing the weight of the old user representation. Substituting this into the $nudge_{(u,i,j)}^r$ term:

$ nudge_{(u,i,j)}^r = e_j^T (\omega e_u^r + (1 - \omega) e_i) - e_j^T e_u^r = (1 - \omega) (e_j^T e_i - e_j^T e_u^r) = (1 - \omega) (e_i - e_u^r)^T e_j $

Then, substituting this back into the overall score: $ l_{uij}^r = ( (e_u^r)^T \cdot e_i ) \cdot (1 - \omega) (e_i - e_u^r)^T e_j $. The factor $(1 - \omega)$ is a constant (assuming a uniform $\omega$ across users, as in previous work) and does not influence which intermediate item yields the maximum score, so it can be omitted. This leads to the simplified formulation: $ l_{uij}^r = ( (e_u^r)^T \cdot e_i ) \cdot (e_i - e_u^r)^T e_j $. This formula consists of two parts: the interaction-tendency term $((e_u^r)^T \cdot e_i)$ and the targeted-guidance term $(e_i - e_u^r)^T e_j$. The interaction-tendency part favors items similar to the user's current preference, while the targeted-guidance part favors items that move the user's preference closer to the target item $e_j$.

4.4.3. Intention-level Score

Recognizing that previous studies overlooked intention-level dynamics, ITMPRec introduces this component. It first identifies the user's current intention and then assesses the intention-level similarity with candidate items.

  1. User Intention-level Vector: Given a global intention matrix $C \in \mathbb{R}^{N_C \times d}$ (where $N_C$ is the number of intentions, and each row $c_m$ is an intention vector) from a pre-trained ICLRec model, the user's current intention-level vector $c_u^r$ is identified as the closest intention prototype to their current user embedding $e_u^r$: $ c_u^r = \underset{c_m \in \{c_1, \ldots, c_{N_C}\}}{argmin} (||c_m - e_u^r||_2^2) $ Where:
    • $c_u^r$: The intention-level vector representing user $u$'s dominant intention in round $r$.
    • $c_m$: One of the $N_C$ global intention vectors.
    • $||\cdot||_2^2$: The squared Euclidean distance, used to find the closest intention vector.
  2. Item Intention-level Vector: Similarly, each item $e_i$ is also projected into the intention space to find its closest intention prototype: $ c_i = \underset{c_m \in \{c_1, \ldots, c_{N_C}\}}{argmin} (||c_m - e_i||_2^2) $ Where:
    • $c_i$: The intention-level vector representing item $i$'s dominant intention.
  3. Intention-level Similarity Score: The intention-level score between user $u$ and candidate item $i$ is then calculated using the dot product of their respective intention vectors: $ c_{score} = (c_u^r)^T \cdot c_i $

4.4.4. Final Combined Score

To incorporate intention-induced scores, the interaction-tendency term in the overall score (Equation 9) is modified. It now considers both the direct representational similarity between user uu and item ii and their similarity in the intention space. The targeted-guidance term remains unchanged.

The final formulation for the overall score $l_{uij}^r$ is: $ l_{uij}^r = \left( (e_u^r)^T \cdot e_i + \lambda \, (c_u^r)^T \cdot c_i \right) \cdot (e_i - e_u^r)^T e_j $ Where:

  • $\lambda$: A hyperparameter that controls the weight given to the intention-induced score (the second term in the parentheses) relative to the direct interest score (the first term). A higher $\lambda$ means more emphasis on intention alignment.

  • The first part, $(e_u^r)^T \cdot e_i + \lambda \, (c_u^r)^T \cdot c_i$, represents the enhanced interaction tendency incorporating intention.

  • The second part, $(e_i - e_u^r)^T e_j$, remains the targeted-guidance term.

    The item $i$ that yields the largest score $l_{uij}^r$ will be selected as the intermediate item to be recommended to the user in round $r$, and this item then serves as input to the click model.

4.5. Targeted Individual Arousal Coefficients (TIAC)

Previous work often assumed a uniform preference evolution coefficient (the $\omega$ in the linear update rule, or $\beta$ in this paper) for all users, implying everyone responds to new content similarly. ITMPRec challenges this by introducing Targeted Individual Arousal Coefficients (TIACs), which personalize the degree of preference evolution based on each user's unique receptivity to new stimuli and their relationship with the target item.

The TIAC ($\beta_u^r$) for user $u$ in round $r$ is calculated as follows:

  1. Historical Preference Variance (Short-term Curiosity): First, ITMPRec identifies a set of items that represent the user's short-term preferences or "curiosity" based on their current embedding and items they have not yet interacted with: $ \mathbf{hp}^{r-1}(u) = Top\mathcal{Q}\{\phi(e_u^{r-1}, e_{idx})\}, \quad e_{idx} \in E \backslash \{S_u^{r-1}\} $ Where:

    • $\mathbf{hp}^{r-1}(u)$: A set of $Q$ item embeddings representing user $u$'s short-term preferences or items they are currently "curious" about, derived from their state in round $r-1$.
    • $\phi(x, y)$: The cosine similarity between two vectors $x$ and $y$. This measures how similar $e_u^{r-1}$ is to other items.
    • $E \backslash \{S_u^{r-1}\}$: The set of all items $E$ excluding those already in user $u$'s historical sequence $S_u^{r-1}$ (i.e., items the user has already interacted with).
    • $Top\mathcal{Q}\{\cdot\}$: A function that returns the top $Q$ items from the specified set that have the highest cosine similarity with the user's embedding $e_u^{r-1}$.
    • $Q$: A hyperparameter representing the capacity of short-term preferences or curiosity.
  2. Arousal Value Calculation: Based on this set of short-term preferences $\mathbf{hp}^{r-1}(u)$, an arousal value is computed, reflecting how much the user is "aroused" by the target item $e_j$ given their current curiosities. This arousal value then becomes the TIAC $\beta_u^r$: $ \beta_u^r = \mathrm{POOL}(\phi(\mathbf{hp}^{r-1}(u), e_j)) $ Where:

    • $\beta_u^r$: The Targeted Individual Arousal Coefficient for user $u$ in round $r$.

    • $\mathrm{POOL}(\cdot)$: A pooling operation (e.g., average pooling) applied to the cosine similarities. This averages the similarity between the user's top $Q$ curiosities and the target item $e_j$. Other pooling methods (max, sum) could also be used.

    • $\phi(\mathbf{hp}^{r-1}(u), e_j)$: The cosine similarity between each item in the user's short-term preference set $\mathbf{hp}^{r-1}(u)$ and the target item $e_j$.

      This calculated $\beta_u^r$ is then fed into the preference evolution module (Section 4.2.2), allowing for personalized preference updates. The actual impact depends on how $\beta_u^r$ enters the preference update equation: a higher $\beta_u^r$ means the user's current preference $\hat{e}_u^r$ retains more weight, implying less change from the current state, while a lower $\beta_u^r$ implies a stronger shift towards the interacted item $\hat{e}_z^r$.

4.6. LLM-based Click Simulation Agent

ITMPRec offers an alternative to the traditional distribution-based click model: an LLM-based click simulation agent. This agent leverages the external knowledge and reasoning capabilities of Large Language Models (LLMs) to simulate user feedback more realistically.

The LLM agent (e.g., ChatGLM3) takes the user's current historical sequence and the recommended intermediate item as input and outputs a binary click decision.

The action of user $u$ (click or no click) is obtained as: $ a_u^r = LLM(\mathcal{P}_F, \mathcal{H}_u^r, NAMES(i_u^r)) $ Where:

  • $a_u^r$: The binary action (0 for no click, 1 for click) generated by the LLM for user $u$ in round $r$.

  • $LLM(\cdot)$: Represents the Large Language Model function.

  • $\mathcal{P}_F$: The task instruction or prompt provided to the LLM. This includes few-shot examples (demonstrations of desired behavior) and a prompt template that guides the LLM to produce a binary output (0 or 1).

  • $\mathcal{H}_u^r$: The user's historical interaction sequence up to round $r$.

  • $NAMES(i_u^r)$: The name or description of the intermediate item $i_u^r$ recommended to user $u$ in round $r$. The LLM can use this textual information, along with its external knowledge, to make a more informed decision.

    Based on the LLM's output, the user's historical sequence for the next round is updated: $ \mathcal{H}_u^{r+1} = \left\{ \begin{array}{ll} CONCAT(\mathcal{H}_u^r, NAMES(i_u^r)), & \mathrm{if}\; a_u^r = 1 \\ \mathcal{H}_u^r, & \mathrm{if}\; a_u^r = 0 \end{array} \right. $ Where:

  • $\mathcal{H}_u^{r+1}$: The updated historical sequence for user $u$ for the next round.

  • $CONCAT(\cdot)$: A function to concatenate the new item's name to the existing historical sequence if the user clicked it.

4.6.1. Discussion: Traditional vs. LLM Agent Click Model

  • Traditional Click Model: Assumes that higher similarity/score directly translates to a higher probability of user acceptance. This is a simplification and may not capture the nuances of human behavior.
  • LLM Agent Click Model: Offers richer interpretability and external knowledge utilization. LLMs can consider various factors beyond simple similarity, such as item attributes, user's inferred mood, contextual information, and their own reasoning to decide whether a user would click an item. This makes the simulation more realistic for complex decision-making. The LLM can simulate intricate decision-making factors, which is valuable in today's era of complex user behaviors.

4.7. Overall Algorithm

The complete process of ITMPRec is summarized in Algorithm 1, outlining the multi-round nudging for each user and target item.

Algorithm 1: ITMPRec

Input:

  • $\mathcal{U}$: set of users.
  • $\mathcal{I}$: set of items.
  • $s_u$: historical sequence for each user $u$.
  • $R$: total number of nudging rounds.
  • $B$: batch size for processing users.

Output:

  • $P_{uj}^r$: The nudging path (sequence of clicked intermediate items) for each user $u$ and target item $j$.

Procedure:

  1. Initialize Target Items: Determine the set of target items to be nudged by applying the pre-match module (Equation 7). This step is performed once before the multi-round process begins.
  2. Iterate Through Target Items: For each target item $j$ in the selected set ($N_{tar}$):
    a. Initialize Nudging Path: Initialize an empty nudging path $P_{uj}^0 = []$ for each user $u$ and the current target item $j$. This path will store the intermediate items clicked by user $u$ while being nudged towards $j$.
    b. Multi-Round Nudging: For each round $r$ from 0 to $R-1$:
      i. Update User Representation: Get the current user representation $e_u^r$ based on the historical sequence $S_u^r$ using the sequence encoder (Equation 5). If $r = 0$, $S_u^0$ is the initial historical sequence.
      ii. Initialize Intermediate Lists: Create empty lists $intermids_r$ (to store all recommended intermediate items in this round) and $recs_u^r$ (to store the specific recommended item for each user).
      iii. Batch Processing for Users: Iterate through users in batches of size $B$:
        1. Calculate Intention-level Score: For each user $u$ in the current batch, calculate the intention-level score (Equation 12) with candidate intermediate items using the current intention vector $c_u^r$.
        2. Select Intermediate Item: Determine the optimal intermediate item $rec_u^r$ for user $u$ for this round by maximizing the final combined score $l_{uij}^r$ (Equation 13). This item is intended to nudge user $u$ towards target $j$.
        3. Calculate TIAC: Compute the targeted individual arousal coefficient $\beta_u^r$ for user $u$ (Equation 15).
        4. Store Recommended Item: Add $rec_u^r$ to the $recs_u^r$ list.
      iv. Simulate Clicks: Simulate user clicks on the recommended items in $recs_u^r$, using either the distribution-based click model (Section 4.2.3) or the LLM-based click simulation agent (Section 4.6), and consolidate the click results into $intermids_r$.
      v. Update User State and Path: For each user $u$ and the corresponding click result in $intermids_r$: if user $u$ clicked the recommended item, update the historical sequence $S_u^{r+1}$ by concatenating the clicked item $rec_u^r$, update the user embedding using the arousal coefficient $\beta_u^r$ (the preference evolution formula from Section 4.2.2), and extend the nudging path $P_{uj}^r$ with the clicked item.
  3. Return Nudging Paths: After all rounds and all target items are processed, return the complete nudging paths $P_{uj}^r$ for each user and target.

Key points from the algorithm:

  • The pre-match module is a one-time setup (Line 1).
  • The core of the algorithm (Lines 2-21) iterates through each target item and then through multiple rounds.
  • For each round, it calculates personalized scores (Lines 8-9) and arousal coefficients (Line 10).
  • The click simulation (Line 12) is crucial for determining actual user interaction and subsequent preference evolution (Line 17).
  • The nudging path $P_{uj}^r$ records the actual sequence of intermediate items that successfully led a user closer to a target.

5. Experimental Setup

5.1. Datasets

The experiments were conducted on four publicly available datasets, chosen to represent different domains and scales of recommendation scenarios:

  1. ML-100k (MovieLens 100k): A movie rating dataset.
    • Domain: Movies
    • Characteristics: Relatively dense, smaller scale.
    • Scale: 943 users, 1,348 items, 98,704 interactions.
    • Density: 7.7649%
    • Average Items per User: 104.67
  2. Lastfm: A music listening dataset.
    • Domain: Music artists/tracks
    • Characteristics: Moderately dense.
    • Scale: 945 users, 2,782 items, 246,368 interactions.
    • Density: 9.3712%
    • Average Items per User: 36.78
  3. Steam: A video game platform dataset.
    • Domain: Video games
    • Characteristics: Sparser, larger number of users.
    • Scale: 12,611 users, 2,017 items, 220,100 interactions.
    • Density: 0.9686%
    • Average Items per User: 19.54
  4. Douban_movie: A movie dataset from Douban (a Chinese social networking service).
    • Domain: Movies

    • Characteristics: Sparser, very large number of items and interactions.

    • Scale: 2,623 users, 20,527 items, 1,161,110 interactions.

    • Density: 2.1565%

    • Average Items per User: 442.66

      The following are the data statistics from Table 2 of the original paper:

      Dataset ML-100k Lastfm Steam Douban_movie
      #Users 943 945 12,611 2,623
      #Items 1,348 2,782 2,017 20,527
      #Interactions 98,704 246,368 220,100 1,161,110
      Density 7.7649% 9.3712% 0.9686% 2.1565%
      #Avg. Items per User 104.67 36.78 19.54 442.66

These datasets were chosen because they are widely used benchmarks in recommendation systems research, allowing for fair comparison with existing methods. They cover a range of density and scale, which helps validate the method's robustness across different data characteristics. For instance, ML-100k and Lastfm are denser, while Steam and Douban_movie are sparser, posing different challenges.

5.2. Evaluation Metrics

The performance of ITMPRec and baseline models is evaluated using three metrics, specifically chosen to assess the effectiveness of proactive recommendation tasks over multiple rounds. The evaluation is conducted at different stages ($P \in \{5, 10, 15, 20\}$ rounds) of the recommendation process.

  1. HitRatio (HR@P):

    • Conceptual Definition: HitRatio measures the proportion of users who positively interact with at least one recommended intermediate item within $P$ proactive recommendation cycles. It quantifies the system's ability to engage users during the nudging process.
    • Mathematical Formula: $ HR@P = \frac{1}{P |\mathcal{U}|} \sum_{p=1}^P \sum_{u \in \mathcal{U}} a_{up} $
    • Symbol Explanation:
      • $HR@P$: Hit Ratio at $P$ rounds.
      • $P$: The number of proactive recommendation cycles (rounds) considered for evaluation.
      • $|\mathcal{U}|$: The total number of users in the dataset.
      • $a_{up}$: A binary value (0 or 1) representing the feedback from the click simulator for user $u$ in round $p$: $a_{up} = 1$ if the user clicked the recommended item in round $p$, and $a_{up} = 0$ otherwise.
      • $\sum_{p=1}^P \sum_{u \in \mathcal{U}} a_{up}$: The total number of positive interactions (clicks) across all users and all $P$ rounds.
  2. Increase of Interest (IoI@P):

    • Conceptual Definition: Increase of Interest quantifies how much a user's interest in the target item has increased after $P$ rounds of proactive recommendations. It directly measures the effectiveness of the nudging process in shifting user preferences towards the desired target. A higher value indicates better guidance.
    • Mathematical Formula: $ IoI@P = \frac{1}{|\mathcal{U}|} \sum_{u \in \mathcal{U}} (\hat{e}_j^T \cdot \hat{e}_u^P - \hat{e}_j^T \cdot \hat{e}_u^0) $
    • Symbol Explanation:
      • $IoI@P$: Increase of Interest at $P$ rounds.
      • $|\mathcal{U}|$: The total number of users.
      • $\hat{e}_j$: The embedding of the target item. This embedding remains constant throughout the process.
      • $\hat{e}_u^P$: The embedding of user $u$ after $P$ rounds of proactive recommendations (i.e., after their preference has potentially evolved).
      • $\hat{e}_u^0$: The initial embedding of user $u$ at the start of the guidance phase (before any nudging).
      • $\hat{e}_j^T \cdot \hat{e}_u^P$: The dot product (similarity) between the target item embedding and the user's embedding after $P$ rounds.
      • $\hat{e}_j^T \cdot \hat{e}_u^0$: The dot product (similarity) between the target item embedding and the user's initial embedding.
      • The difference $(\hat{e}_j^T \cdot \hat{e}_u^P - \hat{e}_j^T \cdot \hat{e}_u^0)$ measures the change in user $u$'s interest in the target item. A positive value means increased interest.
  3. Increase of Ranking (IoR@P):

    • Conceptual Definition: Increase of Ranking measures the improvement in the ranking position of the target item among all other items, with respect to a user's preference, after $P$ rounds. It indicates how much closer the target item has become to being a top recommendation for the user. A higher value means the target item is ranked much higher after nudging.
    • Mathematical Formula: $ IoR@P = \frac{1}{|\mathcal{U}|} \sum_{u \in \mathcal{U}} \left( \mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^0\} - \mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^P\} \right) $
    • Symbol Explanation:
      • $IoR@P$: Increase of Ranking at $P$ rounds.
      • $|\mathcal{U}|$: The total number of users.
      • $\mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^*\}$: A function that returns the discrete ranking of the target item $\hat{e}_j$ among all items, based on their similarity (e.g., dot product) to the user's embedding $\hat{e}_u^*$. A lower rank value means a higher position (e.g., rank 1 is the most preferred).
      • $\mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^0\}$: The initial ranking of the target item for user $u$.
      • $\mathsf{Ran}\{\hat{e}_j \mid \hat{e}_u^P\}$: The ranking of the target item for user $u$ after $P$ rounds of nudging.
      • A positive value for IoR@P indicates that the target item's ranking has improved (i.e., its rank number has decreased, meaning it moved higher up the list).

5.3. Baselines

The proposed ITMPRec method is compared against eight state-of-the-art baseline models, categorized into Sequential Recommendation (SR) methods and Proactive Recommendation (ProactRec) methods. For fairness, all methods use a distribution-based click simulator unless explicitly stated (e.g., for LLM-IPP under its own assumptions or for ITMPRec when combined with its LLM-agent click model).

5.3.1. Sequential Recommendation (SR) Methods

These methods are designed for next-item prediction and typically optimize for user-centric preferences.

  • SASRec [18]: A foundational self-attentive sequential recommendation model. It represents sequences as item embeddings and uses a Transformer encoder to capture sequential patterns.
  • ICLRec [7]: An intention contrastive learning model for sequential recommendation. It explicitly models user intentions using a contrastive learning objective, improving user representation.
  • MStein [12]: A sequential recommendation method that minimizes mutual Wasserstein discrepancy to capture fine-grained sequential patterns.
  • ICSRec [29]: A sequential recommendation method that uses intent contrastive learning with cross subsequences to learn better representations.
  • BSARec [34]: An attentive inductive bias based sequential recommendation method, aiming to improve attention mechanisms for sequential modeling.

5.3.2. Proactive Recommendation (ProactRec) Methods

These methods explicitly aim to guide user preferences towards a target.

  • IRN (Influential Recommender System) [46]: A Transformer-based proactive recommendation method. It generates an entire sequence of middle items in one go, with the assumption that users will passively accept all of them.

  • IPG (Iterative Preference Guidance) [3]: A model-agnostic post-processing framework for proactive recommendation. It uses an iterative approach to guide preferences and includes a distribution-based click module to simulate user responses.

  • LLM-IPP (LLMs with Influential Recommender System) [37]: A proactive recommendation method that uses Large Language Models (specifically GLM-4-Flash in the experiments) for path planning and instruction following to generate guidance sequences. The paper notes its high resource consumption and its original assumption of passive user acceptance. In the context of ITMPRec's evaluation, LLM-IPP was tested under the user click simulation settings for fair comparison.

    These baselines were chosen to represent the current state-of-the-art in both sequential and proactive recommendation, covering various architectural approaches (attention, contrastive learning) and different levels of sophistication in handling proactive guidance.

6. Results & Analysis

6.1. Core Results Analysis

The experimental results demonstrate the superiority of ITMPRec in proactive recommendation tasks across four datasets. The evaluation primarily focuses on HitRatio (HR@P), Increase of Interest (IoI@P), and Increase of Ranking (IoR@P) at different rounds ($P \in \{5, 10, 15, 20\}$).

The following are the results from Table 4 of the original paper:

Dataset Method HR@5 HR@10 HR@15 HR@20 IoI@5 IoI@10 IoI@15 IoI@20 IoR@5 IoR@10 IoR@15 IoR@20
ML-100k SASRec 0.3994 0.3991 0.3980 0.3979 0.0455 0.0866 0.1121 0.1259 -0.4036 -0.9826 -1.2254 -1.1867
ICLRec 0.4124 0.4117 0.4102 0.4083 0.0394 0.0744 0.0952 0.1052 0.2398 0.2578 0.2476 0.0111
MStein 0.3134 0.3125 0.3118 0.3114 0.0074 0.0141 0.0204 0.0264 -0.1127 -0.1355 -0.022 0.12
BSARec 0.3705 0.3702 0.3692 0.3689 0.0416 0.0814 0.1131 0.1365 -0.3646 -0.7027 -1.1309 -1.5034
ICSRec 0.3642 0.3636 0.3628 0.3621 0.0412 0.0866 0.1231 0.1503 -0.0593 0.0346 0.2145 0.2695
IRN 0.4274 0.4270 0.4250 0.4237 0.0299 0.0578 0.0867 0.0912 0.0518 0.2712 1.3507 1.7407
IPG 0.3866 0.3891 0.3895 0.3861 0.1520 0.2620 0.3409 0.3898 33.2767 68.7030 96.4608 111.8751
LLM-IPP 0.3695 0.3680 0.3658 0.3659 0.0450 0.0865 0.1184 0.1412 0.6572 1.1868 1.3978 1.3998
ITMPRec w/o P 0.4029 0.4027 0.4066 0.4067 0.2353 0.3951 0.4496 0.4622 63.9955 113.2598 128.8455 131.4221
ITMPRec 0.4064 0.4024 0.4040 0.4016 0.2433 0.3998 0.4556 0.4690 70.0011 120.6690 136.8670 139.6954
Lastfm SASRec 0.3263 0.3254 0.3248 0.3243 0.0094 0.0174 0.0250 0.0311 -0.1749 -0.6057 -0.7632 -1.1204
ICLRec 0.4137 0.4129 0.4111 0.4106 0.0083 0.0102 0.0066 0.0001 0.1126 0.4359 0.8594 0.9521
MStein 0.3289 0.3281 0.3275 0.3270 -0.0024 0.0023 0.0139 0.0240 -0.6893 -0.8823 -0.9250 -0.9139
BSARec 0.3334 0.3327 0.3320 0.3315 0.0193 0.0297 0.0400 0.0493 0.5216 0.7023 0.6054 0.5891
ICSRec 0.3359 0.3351 0.3345 0.3369 0.0115 0.0251 0.0362 0.0458 -0.1695 -0.0528 0.0099 0.0688
IRN 0.4028 0.4018 0.4008 0.4002 0.0101 0.0203 0.0400 0.0525 0.0185 0.0916 1.8248 3.2734
IPG 0.3516 0.3528 0.3490 0.3520 0.1791 0.2976 0.3879 0.4544 25.2901 52.9695 80.5057 100.1863
ITMPRec w/o P 0.4163 0.4110 0.4122 0.4113 0.2925 0.5218 0.5958 0.6161 60.5283 120.0176 137.4319 141.8555
ITMPRec 0.4129 0.4153 0.4115 0.4135 0.3943 0.5938 0.6486 0.6614 96.9189 146.0627 159.1564 161.7352
Steam SASRec 0.4271 0.4263 0.4257 0.4251 0.0486 0.0991 0.1320 0.1521 -0.2202 0.3557 1.1601 1.6881
ICLRec 0.3886 0.3878 0.3872 0.3867 0.0583 0.1140 0.1571 0.1898 0.8334 2.0505 3.3948 4.6866
MStein 0.3929 0.3921 0.3914 0.3909 0.0584 0.1166 0.1620 0.1942 1.1366 2.4076 2.7133 2.5779
BSARec 0.4096 0.4089 0.4083 0.4078 0.0608 0.1290 0.1760 0.2050 0.1626 2.0218 4.0323 5.6237
ICSRec 0.4005 0.3998 0.3991 0.3986 0.0597 0.1223 0.1656 0.1927 0.0492 0.7546 1.6664 2.0173
IRN 0.4205 0.4195 0.4188 0.4183 0.0418 0.0839 0.1628 0.2016 0.3826 0.5860 2.6768 6.6263
IPG 0.3921 0.3907 0.3898 0.3895 0.1036 0.1777 0.2245 0.2554 17.6234 27.6880 33.4087 37.4944
ITMPRec w/o P 0.3911 0.3899 0.3915 0.3907 0.1876 0.2654 0.2984 0.3108 46.7208 55.9344 58.9080 59.8572
ITMPRec 0.3918 0.3937 0.3930 0.3923 0.2192 0.2955 0.3239 0.3336 55.3553 66.6745 70.6409 71.6806
Douban_movie SASRec 0.3673 0.3669 0.3662 0.3655 -0.0021 -0.0042 -0.0046 -0.0040 0.0888 0.2017 0.3321 0.5044
ICLRec 0.3277 0.3268 0.3261 0.3256 0.0002 -0.0017 -0.0009 0.0019 0.0062 0.0043 0.0475 0.1750
MStein 0.3174 0.3166 0.3159 0.3154 0.0030 0.0076 0.0128 0.0176 0.0180 0.0636 0.1195 0.2197
BSARec 0.4217 0.4215 0.4208 0.4200 -0.0046 -0.0095 -0.0130 -0.0150 0.0028 -0.0768 -0.1460 -0.2929
ICSRec 0.3304 0.3296 0.3289 0.3284 0.0019 0.0016 0.0037 0.0066 0.0858 0.1511 0.2715 0.4051
IRN 0.3758 0.3753 0.3744 0.3739 0.0037 0.0069 0.0052 0.0010 0.1676 0.2913 0.4284 0.6543
IPG 0.3310 0.3323 0.3310 0.3303 0.0849 0.1418 0.1885 0.2259 13.3451 21.2825 30.0722 39.0427
ITMPRec w/o P 0.3439 0.3483 0.3422 0.3389 0.1465 0.2222 0.2798 0.3201 33.6715 48.5319 62.1714 73.9921
ITMPRec 0.3366 0.3363 0.3361 0.3362 0.1619 0.2408 0.2960 0.3374 36.0797 50.5707 65.3341 77.2108

6.1.1. Comparison with Traditional SR Methods

  • Proactive vs. SR: Proactive recommendation methods (IRN, IPG, ITMPRec) generally outperform traditional SR methods (SASRec, ICLRec, MStein, BSARec, ICSRec) in IoI and IoR metrics. This is a critical finding, as IoI and IoR directly measure the success of proactive guidance towards a target. Traditional SR methods often show low or even negative IoI and IoR values (e.g., SASRec on ML-100k and Lastfm, BSARec on Douban_movie), indicating that without explicit guidance, user preferences either do not shift towards the target or even diverge.
  • HR@P: While proactive methods might show a slightly lower HR@P compared to some SR methods (e.g., SASRec on ML-100k, IRN on Steam), this is an acceptable trade-off. HR@P measures engagement with any intermediate item, whereas IoI and IoR measure engagement specifically towards the target. The goal of proactive recommendation is not just clicks, but guided clicks. The decrease in HR@P is often insignificant, suggesting that the system can still generate engaging intermediate items while steering preferences.

6.1.2. Comparison with Other Proactive Recommendation Methods

  • IRN's Limitation: IRN shows limited IoI and IoR improvements. This confirms the paper's hypothesis that IRN's assumption of passive user acceptance (generating a full path upfront without accounting for real-time feedback) leads to suboptimal performance when a simulated click feedback mechanism is introduced, as users might not accept the entire pre-planned sequence.
  • LLM-IPP's Underperformance: LLM-IPP, despite being an LLM-based method, underperforms significantly in IoI@P and IoR@P under the user click simulation settings. The paper also notes its high time consumption (over 50 hours on ML-100k, versus under one hour for the other methods), which limits its practical applicability and suggests that simply using an LLM, without ITMPRec's specific nudging mechanisms, is not sufficient.
  • ITMPRec vs. IPG: ITMPRec demonstrates a substantial improvement over IPG, which is identified as the second-best proactive recommendation method. ITMPRec achieves average enhancements of 36.47% in IoI@20 and 68.80% in IoR@20 across the four datasets. This highlights the effectiveness of ITMPRec's novel components (pre-match, intention-induced scores, TIAC) in more effectively guiding users towards target categories and single items.
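These averages can be re-derived from the IoI@20 and IoR@20 columns of Table 4; the short check below reproduces them.

```python
# Relative improvement of ITMPRec over IPG at round 20, per dataset,
# using the (ITMPRec, IPG) value pairs from Table 4.
ioi = {"ML-100k": (0.4690, 0.3898), "Lastfm": (0.6614, 0.4544),
       "Steam": (0.3336, 0.2554), "Douban_movie": (0.3374, 0.2259)}
ior = {"ML-100k": (139.6954, 111.8751), "Lastfm": (161.7352, 100.1863),
       "Steam": (71.6806, 37.4944), "Douban_movie": (77.2108, 39.0427)}

def avg_gain(table):
    gains = [(ours - ipg) / ipg for ours, ipg in table.values()]
    return 100 * sum(gains) / len(gains)

print(f"avg IoI@20 gain: {avg_gain(ioi):.2f}%")  # -> ~36.5%, matching the reported 36.47%
print(f"avg IoR@20 gain: {avg_gain(ior):.2f}%")  # -> ~68.8%, matching the reported 68.80%
```

The per-dataset gains range from roughly 20% (ML-100k) to 49% (Douban_movie) for IoI@20, and from roughly 25% (ML-100k) to 98% (Douban_movie) for IoR@20.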

6.1.3. ITMPRec's Overall Performance

  • ITMPRec consistently achieves the best performance in IoI@P and IoR@P across all datasets, confirming its ability to effectively shift user preferences towards target items.
  • The HR@P scores for ITMPRec are competitive, demonstrating that the system can still generate engaging intermediate recommendations while fulfilling its proactive goal.

6.2. Ablation Studies / Parameter Analysis

The following are the results of ablation studies on four datasets from Table 3 of the original paper:

| Dataset | Ablation | HR@20 | IoI@20 | IoR@20 |
|---|---|---|---|---|
| ML-100k | w/o P | 0.4067 | 0.4622 | 131.4221 |
| ML-100k | w/o IIS | 0.3878 | 0.4596 | 136.6786 |
| ML-100k | w/o TIAC | 0.3823 | 0.4006 | 118.3061 |
| ML-100k | ITMPRec | 0.4016 | 0.4690 | 139.6954 |
| Lastfm | w/o P | 0.4113 | 0.6161 | 141.8555 |
| Lastfm | w/o IIS | 0.3324 | 0.4030 | 97.2408 |
| Lastfm | w/o TIAC | 0.3758 | 0.5149 | 116.5403 |
| Lastfm | ITMPRec | 0.4135 | 0.6614 | 161.7352 |
| Steam | w/o P | 0.3907 | 0.3108 | 59.8572 |
| Steam | w/o IIS | 0.3920 | 0.3321 | 71.5609 |
| Steam | w/o TIAC | 0.3858 | 0.2472 | 38.4798 |
| Steam | ITMPRec | 0.3923 | 0.3336 | 71.6806 |
| Douban_movie | w/o P | 0.3389 | 0.3201 | 73.9921 |
| Douban_movie | w/o IIS | 0.3329 | 0.3035 | 64.1521 |
| Douban_movie | w/o TIAC | 0.3303 | 0.2644 | 50.9361 |
| Douban_movie | ITMPRec | 0.3362 | 0.3374 | 77.2108 |

6.2.1. Ablation Study (RQ1)

The ablation study analyzes the contribution of three key components of ITMPRec: the pre-match module (P), intention-induced scores (IIS), and targeted individual arousal coefficients (TIAC).

  • w/o Pre-match (P):

    • Removing the pre-match module (by using random target selection) generally leads to a decrease in IoI@20 and IoR@20. For example, on Lastfm, IoI@20 drops from 0.6614 to 0.6161, and IoR@20 from 161.7352 to 141.8555. This validates that selecting collectively appealing target items, while avoiding cold-start issues, is crucial for effective nudging.
    • The paper notes that the pre-match module (selecting targets from a specific category based on collective preference) helps avoid problems like scattered or cold-start target items, leading to more successful nudging.
  • w/o Intention-induced scores (IIS):

    • Removing the intention-induced scores causes a significant performance degradation, especially on Lastfm (IoI@20 drops from 0.6614 to 0.4030, IoR@20 from 161.7352 to 97.2408) and Douban_movie. This highlights the importance of modeling user intention at a coarse-grained level to guide preferences effectively.
    • On the Steam dataset, the impact of IIS is less pronounced. The authors suggest this might be due to the limited number of items users can search for in the Steam domain, which restricts candidate item pools and reduces the impact of different selection strategies related to intention.
  • w/o Targeted Individual Arousal Coefficients (TIAC):

    • Removing TIAC consistently leads to a notable drop in IoI@20 and IoR@20 across all datasets (e.g., on Lastfm, IoI@20 drops from 0.6614 to 0.5149, IoR@20 from 161.7352 to 116.5403). This confirms the importance of personalizing the preference evolution degree based on individual user sensitivity to new content.
    • The TIAC module performs particularly well on diverse datasets like Steam and Douban_movie, emphasizing that accounting for individual user responses is crucial in multi-round tasks where users might have varying levels of curiosity or openness to new items.
  • Overall: All three modules (P, IIS, TIAC) contribute positively to the interest nudging metrics (IoI@P and IoR@P), even though HR@P may see slight, acceptable variations. This indicates that these components are well-designed to steer users towards target content, improving the quality of proactive recommendation.
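For readers reproducing the ablation, the three variants map naturally onto three switches. The sketch below is purely illustrative; the flag names are hypothetical and do not come from the authors' code.

```python
from dataclasses import dataclass

@dataclass
class AblationConfig:
    """Hypothetical switches for the three ITMPRec components."""
    use_prematch: bool = True   # P: collective-preference target selection
    use_iis: bool = True        # IIS: intention-induced scores
    use_tiac: bool = True       # TIAC: targeted individual arousal coefficients

# The four rows of Table 3, expressed as configurations.
VARIANTS = {
    "ITMPRec":  AblationConfig(),
    "w/o P":    AblationConfig(use_prematch=False),
    "w/o IIS":  AblationConfig(use_iis=False),
    "w/o TIAC": AblationConfig(use_tiac=False),
}
```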

6.2.2. Parameter Sensitivity Analysis (RQ3)

The paper investigates the impact of two key hyperparameters: $Q$ (the number of items considered for personal curiosity in TIAC) and $N_C$ (the number of intentions).

The following figure (Figure 4 from the original paper) shows the effect of hyperparameters $Q$ and $N_C$ for four datasets:

Figure 4: The effect of hyperparameters $Q$ and $N_C$ for four datasets (Lastfm, ML-100k, Steam, Douban_movie). The left panels show performance (IoR) under different sampling sizes $Q$; the right panels show performance under different numbers of intentions $N_C$.

  • Effect of $Q$ (Figure 4a):

    • For dense datasets like Lastfm and ML-100k, sampling a smaller number of top user preferences (e.g., $Q=5$) is sufficient to characterize user responses and achieve optimal performance for TIAC.
    • For sparser datasets like Douban_movie and Steam, a larger $Q$ (e.g., $Q=20$) is required to adequately model user arousal levels. This makes sense, as more items might be needed to capture a user's diverse or less-defined short-term preferences in sparse environments.
  • Effect of $N_C$ (Figure 4b):

    • $N_C$ represents the number of intentions modeled by the system. A larger $N_C$ allows for more diverse user intentions to be captured.
    • For smaller datasets such as Lastfm and ML-100k, the model performs best with a relatively small number of intentions (around $N_C=32$). This suggests that these datasets might exhibit fewer distinct user intention patterns.
    • For larger datasets like Steam and Douban_movie, a higher number of intentions (around $N_C=256$) yields better performance. This indicates that a more granular understanding of user intentions is beneficial in larger, potentially more diverse user bases, as illustrated by the toy tuning loop below.
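A practical way to apply these findings is a small grid search over the two hyperparameters. The sketch below is a toy tuning loop under the assumption that a full train-and-nudge run is wrapped in an `evaluate(Q, N_C)` callback returning validation IoR@20; both the callback and the dummy objective are hypothetical.

```python
from itertools import product

Q_GRID = [5, 10, 15, 20]        # sampling sizes tried for TIAC
NC_GRID = [32, 64, 128, 256]    # numbers of intentions tried

def tune(evaluate):
    """Return the (Q, N_C) pair maximizing validation IoR@20.

    `evaluate` stands in for a full train-and-nudge run; here it is a
    caller-supplied callback, since the real run is dataset-specific.
    """
    return max(product(Q_GRID, NC_GRID), key=lambda p: evaluate(*p))

# Toy usage with a dummy objective that peaks at Q=5, N_C=32
# (the pattern reported for the dense Lastfm / ML-100k datasets).
best_q, best_nc = tune(lambda q, nc: -abs(q - 5) - abs(nc - 32) / 32)
print(best_q, best_nc)  # -> 5 32
```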

6.2.3. LLM-based vs. Distribution-based Click Simulation (RQ3)

The paper provides a detailed comparison of the LLM-based and distribution-based click simulation schemes, both quantitatively and qualitatively.

  • Quantitative Comparison (Figure 6): The following figure (Figure 6 from the original paper) shows the comparative results of the distribution-based and LLM-based click simulations on the Lastfm and Douban_movie datasets:

    Figure 6: The comparative results of the distribution-based and LLM-based click simulations on the Lastfm (top) and Douban_movie (bottom) datasets; left panels show IoI, right panels show IoR.

    The LLM-based click model consistently outperforms the distribution-based approach on both Lastfm and Douban_movie datasets, for both IoI@P and IoR@P metrics across various evaluation windows ($P$). This indicates that the LLM agent provides a more effective and realistic simulation of user behavior, leading to better nudging outcomes.

  • Qualitative Comparison (Table 5): The following are the results from Table 5 of the original paper:

Target movies in the target category "Sci-Fi":

| Target | Movie | Description |
|---|---|---|
| [1] | Robert A. Heinlein's The Puppet Masters | Sci-Fi, Horror |
| [2] | Aliens | Sci-Fi, Action, Thriller |
| [3] | Mars Attacks! | Sci-Fi, Action, Comedy, War |

The latest five movies' categories in the viewing history: Drama, Animation, Children's, Comedy, War.

| Target | Intermediate items by LLM agent | Intermediate items by distribution-based scheme |
|---|---|---|
| [1] | Frighteners (Com, Hor) → Hunt for Red October (Act, Thr) → Forbidden Planet (Sci) ✓ | Breakfast at Tiffany's (Dra, Rom) → While You Were Sleeping (Com, Rom) → Great Escape (War) → Best of the Best 3: No Turning Back (Act) → Strange Days (Sci, Act, Cri) ✓ |
| [2] | Star Trek IV (Act, Adv, Sci) ✓ | Forget Paris (Com, Rom) → G.I. Jane (Act, Dra, War) → Great Dictator (Com) → Star Trek IV (Sci) ✓ |
| [3] | Drunks (Dra) → Balto (Ani, Chi) → Red Rock West (Thr) → Canadian Bacon (Com, War) → Dangerous Minds (Dra) → Strange Days (Act, Cri, Sci) ✓ | Dangerous Ground (Dra) → Hour of the Pig (Dra, Mys) → Red Rock West (Thr) → Canadian Bacon (Com, War) → Moonlight and Valentino (Dra, Rom) → Dangerous Minds (Dra) → Hunt for Red October (Act, Thr) |

(✓ marks a nudging path that successfully reaches a movie in the target Sci-Fi category.)

    A case study on the ML-100k dataset, with "Sci-Fi" as the target category, illustrates the qualitative difference:

    • LLM Agent: When nudging towards "Robert A. Heinlein's The Puppet Masters" (Sci-Fi, Horror), the LLM agent recommends a path like "Frighteners" (Com, Hor) \to "Hunt for Red October" (Act, Thr) \to "Forbidden Planet" (Sci). The LLM seems to understand the nuances of genres. For example, "Forbidden Planet" is categorized as "Sci-Fi" but actually contains elements of "Action, Thriller, and Adventure." The LLM leverages its external knowledge to connect these broader genre aspects, forming a coherent nudging path even across seemingly disparate initial preferences (Drama, Animation, Children's, Comedy, War).
    • Distribution-based Scheme: In contrast, the distribution-based scheme might follow a more direct similarity path. For the same target, it recommends "Breakfast at Tiffany's" (Dra, Rom) \to "While You Were Sleeping" (Com, Rom) \to "Great Escape" (War) \to "Best of the Best 3" (Act) \to "Strange Days" (Sci, Act, Cri). While it eventually reaches a Sci-Fi movie, the path appears less "reasoned" and more purely based on feature similarity. For "Mars Attacks!" (Sci-Fi, Action, Comedy, War), the distribution-based method struggles to recommend Sci-Fi movies effectively at times.
    • The LLM-based method demonstrates a more sophisticated reasoning capability, allowing it to bridge seemingly larger gaps by identifying subtle connections (e.g., hidden genre elements, broader thematic links) through its external knowledge, thus generating more effective and believable nudging paths.
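To make the pluggable LLM click model concrete, the sketch below shows one way to wrap an LLM behind a binary click interface. The prompt wording is a hypothetical stand-in for the paper's template $\mathcal{P}_F$, and `llm_call` abstracts over whichever backend (e.g., ChatGLM3) is plugged in.

```python
def build_click_prompt(history: list[str], candidate: str) -> str:
    """Assemble a yes/no click-decision prompt for the LLM agent.

    The wording is illustrative; the paper's actual prompt template
    (denoted P_F) is more elaborate.
    """
    return (
        "You are simulating a movie viewer.\n"
        f"Recently watched (most recent last): {', '.join(history)}.\n"
        f"The system now recommends: {candidate}.\n"
        "Would this viewer click it? Answer strictly 'yes' or 'no'."
    )

def llm_click(llm_call, history: list[str], candidate: str) -> int:
    """Map the agent's free-text answer to a binary click a_u^r in {0, 1}."""
    answer = llm_call(build_click_prompt(history, candidate))
    return int(answer.strip().lower().startswith("yes"))

# Toy usage with a stubbed LLM; in practice `llm_call` would wrap an
# actual model backend.
stub = lambda prompt: "yes" if "Forbidden Planet" in prompt else "no"
print(llm_click(stub, ["Frighteners", "Hunt for Red October"],
                "Forbidden Planet (1956)"))  # -> 1
```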

6.2.4. Case Study: User Embedding Evolution (Appendix A.6)

The following figure (Figure 7 from the original paper) shows user embedding's evolution in ITMPRec:

Figure 7: Heatmap of the user embedding's evolution in ITMPRec. The horizontal axis shows nudging rounds (from the user towards the target), the vertical axis shows embedding dimensions, and color intensity reflects the user-target interaction strength.

The following figure (Figure 8 from the original paper) shows intermediate items recommended by ITMPRec:

Figure 8: Heatmap of the intermediate items recommended by ITMPRec, showing how the user's intention changes with each recommendation round; the horizontal axis is the round, the vertical axis is the embedding dimension, and color intensity reflects intention strength.

A visualization of a user's embedding evolution from the Lastfm dataset shows that ITMPRec effectively draws user preferences towards the target item over multiple rounds. As intermediate items are recommended and "clicked," the user embedding grows steadily more similar to the target item's embedding, dimension by dimension. This visual evidence reinforces the idea that ITMPRec successfully nudges user preferences.
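The same convergence can be summarized numerically by logging the cosine similarity between the user and target embeddings each round. The drift-based update below is an illustrative assumption, not ITMPRec's actual preference-update rule.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy trajectory: nudge a random user embedding towards a target over
# 20 rounds with a fixed drift rate and log the similarity per round.
rng = np.random.default_rng(1)
dim, drift = 32, 0.15
user, target = rng.normal(size=dim), rng.normal(size=dim)
for r in range(1, 21):
    user = (1 - drift) * user + drift * target   # assumed preference drift
    if r % 5 == 0:
        print(f"round {r:2d}: cos(user, target) = {cosine(user, target):.3f}")
```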

6.3. Summary of Findings

ITMPRec significantly advances the field of proactive recommendation by introducing sophisticated mechanisms for target selection, user intention modeling, individual sensitivity assessment, and realistic user feedback simulation. The ablation studies confirm the individual importance of these components, while parameter sensitivity analysis provides practical guidance for deployment. The superior performance over SR and existing ProactRec baselines, particularly in IoI and IoR, validates ITMPRec as an effective and robust solution for intention-based targeted multi-round proactive recommendation. The LLM-based click model is a notable innovation, offering more realistic and intelligent user feedback simulation.

7. Conclusion & Reflections

7.1. Conclusion Summary

This paper introduces ITMPRec, a novel Intention-based Targeted Multi-round Proactive Recommendation method designed to overcome the limitations of traditional sequential recommendation (SR) systems that primarily cater to users' historical preferences. ITMPRec focuses on proactively guiding users towards a specific category of items over multiple interaction rounds. Its key contributions include:

  1. Pre-match Module: A strategy to intelligently select a set of target items by considering all users' opinions within a specified category, thereby making the nudging process more purposeful and avoiding cold-start issues.

  2. Intention-induced Scores: Integration of a mechanism to quantify users' intention-level evolution, which helps in selecting suitable intermediate items that align with changing coarse-grained user intentions during the guidance process.

  3. Targeted Individual Arousal Coefficients (TIAC): A component that models each user's unique sensitivity and receptivity to new content, allowing for personalized preference evolution during multi-round nudging.

  4. LLM Agent for Click Simulation: A pluggable Large Language Model (LLM) agent that provides a more realistic and intelligent simulation of user click feedback on intermediate recommendations, leveraging the LLM's external knowledge and reasoning capabilities compared to traditional distribution-based models.

    Extensive experiments conducted on four real-world datasets demonstrate the significant superiority of ITMPRec over eight state-of-the-art baselines, showcasing average increases of 36.47% in IoI@20 and 68.80% in IoR@20. The ablation studies confirm the effectiveness of each proposed module.

7.2. Limitations & Future Work

The authors identify several directions for future work:

  1. Causal Theory in Nudging: Further study of causal theory [13] in the nudging process to better understand the cause-and-effect relationships between recommendations and user preference shifts.

  2. Model Explainability: Enhancing the model's explainability [23] to provide users with clearer reasons for the proactive recommendations.

  3. Robustness in Complex Probabilistic Modeling: Improving the model's robustness in more complex probabilistic modeling scenarios, likely referring to the dynamics of user preferences and click behaviors.

    While not explicitly stated as limitations, the paper implicitly highlights some challenges:

  • Resource Intensiveness of LLMs: The LLM-IPP baseline was noted for its high resource consumption, and while ITMPRec uses LLMs pluggably, their integration still implies a computational cost trade-off compared to non-LLM components.
  • Offline Simulation Dependence: The entire framework relies on an environment simulator. The quality of proactive recommendation in real-world deployment depends heavily on how accurately this simulator (especially the LLM agent) mimics actual user behavior.

7.3. Personal Insights & Critique

ITMPRec presents a significant step forward in proactive recommendation by comprehensively addressing critical limitations of prior work. The shift from random target selection to a pre-match module is highly practical, aligning recommendation goals with content provider strategies. The integration of user intention and individual arousal coefficients is particularly insightful, moving beyond simplistic user models to capture more nuanced and dynamic aspects of human behavior. This makes the nudging process more adaptive and potentially more ethical, as it's tailored to individual user receptivity rather than a rigid push.

The use of an LLM agent for click simulation is perhaps the most innovative aspect. It acknowledges the complexity of user decision-making, which traditional distribution-based models cannot fully capture. This approach has broad implications beyond this paper, suggesting that LLMs could become a standard component in offline evaluation environments for complex interactive systems, not just recommendation. This allows for more robust offline testing before costly live experiments.

Potential issues or unverified assumptions:

  • Interpretability of LLM Agent: While LLMs offer interpretability in generating explanations, their internal decision-making process for simulating a click ($a_u^r \in \{0, 1\}$) might still be a black box. Understanding why an LLM agent decided a user would click, rather than just that it clicked, could be crucial for refining the nudging strategy.

  • Generalizability of LLM Agent: The effectiveness of the LLM agent relies on the prompt engineering ($\mathcal{P}_F$) and the LLM's underlying external knowledge. While ChatGLM3 is powerful, its simulated feedback might still be limited by the data it was trained on and the specific instructions it receives. Its ability to capture novel or highly niche user behaviors might be a challenge.

  • Ethical Implications of Nudging: The term "proactive recommendation" or "nudging" inherently carries ethical considerations. While ITMPRec aims to broaden user interests, it could also be misused to manipulate users towards less beneficial content. Future work could explicitly integrate ethical safeguards or transparency mechanisms to ensure responsible nudging.

  • Real-world Deployment Challenges: Although ITMPRec shows strong offline performance, deploying a multi-round proactive recommendation system in a real-world setting would involve significant engineering challenges, including real-time user preference updates, cold-start issues for intermediate items, and the computational cost of generating LLM-based feedback or even using LLMs in live recommendation loops.

    The methods and conclusions of ITMPRec could be applied to other domains where preference guidance is desirable, such as educational content recommendation (guiding students towards foundational topics), health and wellness apps (nudging users towards healthier habits or information), or news feeds (encouraging exposure to diverse perspectives to combat polarization). The concept of intention-aware and individually sensitive multi-round guidance is broadly applicable to any sequential decision-making process involving human interaction.
