Neural Graph Collaborative Filtering
TL;DR Summary
The Neural Graph Collaborative Filtering (NGCF) framework integrates user-item interactions by propagating embeddings on a graph, effectively capturing collaborative signals and demonstrating significant improvements across several benchmark datasets.
Abstract
Learning vector representations (aka. embeddings) of users and items lies at the core of modern recommender systems. Ranging from early matrix factorization to recently emerged deep learning based methods, existing efforts typically obtain a user's (or an item's) embedding by mapping from pre-existing features that describe the user (or the item), such as ID and attributes. We argue that an inherent drawback of such methods is that, the collaborative signal, which is latent in user-item interactions, is not encoded in the embedding process. As such, the resultant embeddings may not be sufficient to capture the collaborative filtering effect. In this work, we propose to integrate the user-item interactions -- more specifically the bipartite graph structure -- into the embedding process. We develop a new recommendation framework Neural Graph Collaborative Filtering (NGCF), which exploits the user-item graph structure by propagating embeddings on it. This leads to the expressive modeling of high-order connectivity in user-item graph, effectively injecting the collaborative signal into the embedding process in an explicit manner. We conduct extensive experiments on three public benchmarks, demonstrating significant improvements over several state-of-the-art models like HOP-Rec and Collaborative Memory Network. Further analysis verifies the importance of embedding propagation for learning better user and item representations, justifying the rationality and effectiveness of NGCF. Codes are available at https://github.com/xiangwang1223/neural_graph_collaborative_filtering.
In-depth Reading
1. Bibliographic Information
1.1. Title
Neural Graph Collaborative Filtering
1.2. Authors
Xiang Wang, Meng Wang, Xiangnan He, Fuli Feng, and Tat-Seng Chua. Their affiliations include National University of Singapore and University of Science and Technology of China, indicating a strong academic background in computer science, particularly in areas like recommender systems, data mining, and multimedia.
1.3. Journal/Conference
Published in the Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '19), July 21-25, 2019, Paris, France. SIGIR is a highly reputable and influential conference in the field of information retrieval and recommender systems, known for publishing cutting-edge research.
1.4. Publication Year
2019
1.5. Abstract
The paper addresses a crucial limitation in modern recommender systems: existing methods for learning user and item vector representations (embeddings), such as matrix factorization (MF) and deep learning-based approaches, typically derive embeddings from pre-existing features like IDs and attributes. This approach, the authors argue, fails to explicitly encode the collaborative signal—the latent information within user-item interactions—into the embedding process itself. Consequently, the resulting embeddings may not adequately capture the collaborative filtering (CF) effect.
To overcome this, the paper proposes Neural Graph Collaborative Filtering (NGCF). This novel framework integrates the user-item bipartite graph structure directly into the embedding learning process by propagating embeddings along the graph. This propagation mechanism allows for the expressive modeling of high-order connectivity within the user-item graph, thereby explicitly injecting the collaborative signal into the embeddings. Extensive experiments on three public benchmark datasets demonstrate that NGCF significantly outperforms several state-of-the-art models, including HOP-Rec and Collaborative Memory Network. Further analysis confirms the critical role of embedding propagation in learning superior user and item representations, thus validating NGCF's effectiveness and underlying rationale.
1.6. Original Source Link
Official Source: https://doi.org/10.1145/3331184.3331267
PDF Link: https://arxiv.org/pdf/1905.08108v2.pdf
Publication Status: Officially published at SIGIR '19.
2. Executive Summary
2.1. Background & Motivation
The core problem the paper addresses is the suboptimality of user and item embeddings in existing recommender systems. Modern recommender systems, ranging from traditional Matrix Factorization (MF) to recent deep learning approaches, primarily learn embeddings from static features like user/item IDs or attributes. The crucial collaborative signal, which inherently exists within user-item interactions and reveals behavioral similarities, is largely not explicitly encoded during the embedding generation process. Instead, these interactions are often only used to define the objective function for training, leaving the embedding function itself unaware of the rich relational information. This omission means that the embeddings might lack the necessary information to fully capture the collaborative filtering (CF) effect, requiring the interaction function to compensate for this deficiency.
This problem is important because the quality of embeddings directly impacts the accuracy and effectiveness of recommendations. If embeddings are insufficient, even sophisticated interaction functions might struggle to provide optimal predictions. The challenge lies in efficiently integrating the massive scale of user-item interactions (which can reach millions or more in real-world applications) into the embedding process in a meaningful way.
The paper's innovative idea or entry point is to explicitly leverage the high-order connectivity present in the user-item interaction graph. Instead of just using interactions to train a model that operates on static embeddings, NGCF propagates embeddings directly on this graph structure. This allows the embeddings to dynamically evolve and absorb collaborative signals from neighbors up to $L$ hops away, thereby enriching them with contextual relational information that was previously implicit or ignored.
2.2. Main Contributions / Findings
The paper makes the following primary contributions and key findings:
- Highlighting the Importance of Explicit Collaborative Signal: The authors explicitly identify and articulate the critical importance of incorporating the collaborative signal directly into the embedding function of model-based collaborative filtering (CF) methods. They argue that prior methods largely neglect this, leading to suboptimal embeddings.
- Introducing the NGCF Framework: They propose Neural Graph Collaborative Filtering (NGCF), a novel recommendation framework built upon graph neural networks (GNNs). NGCF explicitly encodes the collaborative signal by modeling high-order connectivities through an embedding propagation mechanism on the user-item interaction graph.
- Empirical Validation and State-of-the-Art Performance: Through extensive experiments on three large-scale, real-world benchmark datasets (Gowalla, Yelp2018*, Amazon-Book), NGCF demonstrates significant performance improvements over several state-of-the-art CF models, including HOP-Rec and Collaborative Memory Network (CMN).
- Verification of Embedding Quality: Further analysis confirms that the embedding propagation process is crucial for learning superior user and item representations. This verifies the rationality and effectiveness of NGCF's design, showing that explicitly integrating graph structure improves embedding quality and subsequently recommendation accuracy, particularly for inactive users (alleviating the sparsity issue).
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
To fully grasp Neural Graph Collaborative Filtering (NGCF), a reader should understand the following foundational concepts:
- Recommender Systems: Systems designed to predict user preferences or ratings for items, helping users discover new items they might like. They are widely used in e-commerce, streaming services, and social media.
- Collaborative Filtering (CF): A common technique in recommender systems that predicts user preferences by leveraging the preferences of similar users or the characteristics of similar items. The core idea is "users who liked X and Y also liked Z" (user-based CF) or "people who liked X also liked Y" (item-based CF).
- Embeddings (Vector Representations): In machine learning, an embedding is a mapping from discrete objects (like users, items, words, etc.) to a continuous vector space. Each user or item is represented by a dense vector of real numbers. The idea is that objects with similar properties or relationships will have similar vector representations in this space. For example, if two users have similar tastes, their user embeddings should be close to each other.
- Matrix Factorization (MF): A popular CF technique. It decomposes the sparse user-item interaction matrix (where rows are users, columns are items, and entries are ratings or implicit feedback) into two lower-rank matrices: a user matrix and an item matrix. Each row in the user matrix is a user embedding, and each column in the item matrix is an item embedding. The product of a user embedding and an item embedding (typically an inner product) reconstructs the original interaction.
- Deep Learning: A subfield of machine learning that uses neural networks with multiple layers (hence "deep") to learn complex patterns from data. These networks can learn hierarchical representations and capture non-linear relationships.
- Neural Networks: Computational models inspired by the structure and function of biological neural networks. They consist of interconnected nodes (neurons) organized in layers. Each connection has a weight, and neurons apply an activation function to the weighted sum of their inputs.
- Activation Functions: Non-linear functions applied by neurons in a neural network. They introduce non-linearity, allowing the network to learn complex patterns. LeakyReLU (Leaky Rectified Linear Unit) is an activation function defined as $\text{LeakyReLU}(x) = x$ if $x > 0$ and $\alpha x$ otherwise, where $\alpha$ is a small positive constant. It addresses the "dying ReLU" problem by allowing a small, non-zero gradient when the input is negative.
- Optimization (Adam): An adaptive learning rate optimization algorithm used to train neural networks. It computes individual adaptive learning rates for different parameters from estimates of the first and second moments of the gradients. It is widely used due to its efficiency and good performance.
- Regularization (L2 Regularization): A technique used to prevent overfitting in machine learning models. L2 regularization (also known as weight decay) adds a penalty term proportional to the square of the magnitude of the model's weights to the loss function. This encourages the model to use smaller weights, making it simpler and less prone to capturing noise in the training data.
- Dropout: A regularization technique for neural networks where randomly selected neurons are "dropped out" (i.e., temporarily ignored along with their incoming and outgoing connections) during training. This prevents complex co-adaptations on training data and improves the model's generalization ability.
- Graph Theory:
- Bipartite Graph: A graph whose vertices can be divided into two disjoint and independent sets, $U$ (users) and $I$ (items), such that every edge connects a vertex in $U$ to one in $I$. There are no edges within $U$ or within $I$. The user-item interaction network is naturally a bipartite graph.
- High-order Connectivity: Refers to paths in a graph that involve multiple steps or "hops." In a user-item graph, a 1-hop connection is a direct interaction (user-item). A 2-hop connection (user-item-user) implies behavioral similarity between two users (they interacted with the same item). A 3-hop connection (user-item-user-item) suggests that a user might be interested in an item that a similar user has interacted with. These longer paths carry rich collaborative signals.
- Adjacency Matrix ($A$): A square matrix used to represent a finite graph. If there are $n$ nodes, $A$ is $n \times n$. An entry $A_{ij} = 1$ if there is an edge from node $i$ to node $j$, and $0$ otherwise. For a user-item bipartite graph, a combined adjacency matrix can be formed that includes connections between users and items.
- Degree Matrix ($D$): A diagonal matrix where $D_{ii}$ is the degree of vertex $i$ (the number of edges connected to it).
- Graph Laplacian ($L$): For an unweighted graph, the graph Laplacian is defined as $L = D - A$. A normalized version, often used in Graph Neural Networks, is $D^{-1/2} A D^{-1/2}$. This matrix is crucial for understanding the connectivity and structure of a graph and for spectral graph theory.
- Graph Neural Networks (GNNs): A class of deep learning methods designed to perform inference on graph-structured data. They extend the idea of neural networks to graphs by iteratively aggregating information from a node's neighbors to update its representation.
- Message Passing: A fundamental paradigm in GNNs. It involves nodes exchanging "messages" (information/embeddings) with their neighbors and then aggregating these messages to update their own representations. This process can be repeated over multiple layers, allowing information to propagate across longer distances in the graph.
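To make these graph constructs concrete, here is a minimal sketch (not from the paper; the toy graph and all variable names are illustrative) that builds the combined adjacency, degree, and normalized-Laplacian matrices for a tiny user-item bipartite graph with NumPy and performs one round of message passing:

```python
import numpy as np

# Toy bipartite graph: 2 users, 3 items; R[u, i] = 1 if user u interacted with item i.
R = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=float)
n_users, n_items = R.shape

# Combined adjacency matrix A = [[0, R], [R^T, 0]] over all user and item nodes.
A = np.block([[np.zeros((n_users, n_users)), R],
              [R.T, np.zeros((n_items, n_items))]])

# Degree matrix D (diagonal) and normalized Laplacian L = D^{-1/2} A D^{-1/2}.
deg = A.sum(axis=1)
d_inv_sqrt = np.zeros_like(deg)
d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# One message-passing round: each node absorbs a degree-normalized sum of its
# neighbors' embeddings -- the 1-hop "collaborative signal".
E = np.random.default_rng(0).normal(size=(n_users + n_items, 4))  # toy 4-d embeddings
E_next = L @ E
print(E_next.shape)  # (5, 4): one refined embedding per user/item node
```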
3.2. Previous Works
The paper discusses existing work in three main categories, positioning NGCF in relation to them:
3.2.1. Model-based CF Methods
These methods parameterize users and items with vector representations (embeddings) and reconstruct interactions based on these parameters.
- Matrix Factorization (MF) [20, 26]: The foundational model. It embeds user/item IDs as vectors and uses their inner product to predict interactions.
  - Formula (Simplified): For a user $u$ and item $i$, the predicted rating is given by $\hat{y}_{ui} = e_u^\top e_i$, where $e_u$ is the user embedding and $e_i$ is the item embedding.
- Enhanced MF: Efforts to incorporate side information (e.g., item content [2, 30], social relations [34], item relations [37], user reviews [3], external knowledge graphs [32, 35]) to enrich embeddings.
- Neural Collaborative Filtering (NCF) methods (e.g., NeuMF [14]): Replace the inner product in MF with non-linear neural networks to model interactions, capturing more complex relationships.
  - Formula (Conceptual for NeuMF): $\hat{y}_{ui} = f(e_u, e_i)$, where $f$ is a multi-layer perceptron (MLP) that concatenates or element-wise multiplies user and item embeddings and processes them through non-linear layers, e.g., $f = \phi_{\text{out}}(\phi_L(\cdots\phi_1(e_u \odot e_i)))$ or $f = \phi_{\text{out}}(\phi_L(\cdots\phi_1([e_u \,\|\, e_i])))$, where $\phi_l$ is an MLP layer, $\phi_{\text{out}}$ is the final output layer, $\odot$ is the element-wise product, and $\|$ is concatenation.
- Translation-based CF (e.g., LRML [28]): Uses Euclidean distance metrics as the interaction function.

NGCF's Critique: These methods primarily build embedding functions using descriptive features (ID, attributes) and use interactions only for the objective function. They lack explicit encoding of collaborative signals in the embedding process, meaning transitivity (e.g., if A likes B and B likes C, A might like C) is not guaranteed in the embedding space.
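To illustrate the difference between the two interaction functions, here is a hedged PyTorch sketch (illustrative only, not the papers' implementations): MF scores with an inner product, while a NeuMF-style scorer feeds the concatenated embeddings through an MLP:

```python
import torch
import torch.nn as nn

d = 64  # embedding size

def mf_score(e_u: torch.Tensor, e_i: torch.Tensor) -> torch.Tensor:
    """MF: the score is simply the inner product of the ID embeddings."""
    return (e_u * e_i).sum(-1)

# NeuMF-style scorer (conceptual): an MLP over concatenated embeddings replaces
# the inner product, so non-linear user-item interactions can be modeled.
mlp_scorer = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, 1))

e_u, e_i = torch.randn(8, d), torch.randn(8, d)  # a batch of 8 user-item pairs
print(mf_score(e_u, e_i).shape)                                     # torch.Size([8])
print(mlp_scorer(torch.cat([e_u, e_i], dim=-1)).squeeze(-1).shape)  # torch.Size([8])
```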
3.2.2. Graph-Based CF Methods
These methods explicitly use the user-item interaction graph to infer preferences.
- Label Propagation methods (e.g., ItemRank [7], BiRank [12]): Propagate "labels" (e.g., interacted items) on the graph to score items for a user. These are essentially neighbor-based methods and lack model parameters to optimize a recommendation objective.
- HOP-Rec [40]: A recent graph-based model that performs random walks on the graph to identify multi-hop connected items. These high-order neighbors are then used to enrich the training data for an MF model.
  - Key Idea: Augments the implicit feedback data by adding synthetic user-item interactions based on paths discovered by random walks. The core prediction model remains MF, but trained on an enriched dataset.
  - Formula (Conceptual): It would still use $\hat{y}_{ui} = e_u^\top e_i$, but the training pairs (u, i) would be expanded to include (u, i') where $i'$ is a high-order neighbor of $u$.

NGCF's Critique: HOP-Rec enriches training data but does not integrate high-order connectivity into the prediction model's embedding function itself. Its performance heavily relies on random walk parameters (e.g., the decay factor).
3.2.3. Graph Convolutional Networks (GCNs) for Recommendation
These methods apply graph convolution operations to the user-item graph.
- GC-MC [29]: Applies GCN [18] on a user-item graph, but only uses one convolutional layer, thus exploiting only first-order connections (direct user-item interactions).
  - GCN Layer (Simplified from Kipf & Welling [18]): $H^{(l+1)} = \sigma(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)})$, where $\tilde{A}$ is the adjacency matrix with self-loops, $\tilde{D}$ is its degree matrix, $H^{(l)}$ holds the node features at layer $l$, $W^{(l)}$ is a weight matrix, and $\sigma$ is an activation. GC-MC adapts this to a bipartite graph.
- PinSage [42]: An industrial solution from Pinterest. It employs multiple graph convolution layers but primarily on an item-item graph (derived from co-occurrence patterns) for item recommendation. The CF effect is captured at the item relation level, not directly on collective user behaviors.
  - Key Idea (from GraphSAGE [8]): Aggregates information from a node's local neighborhood. A generic GraphSAGE layer for node $v$ would involve $h_{N(v)} = \text{AGGREGATE}(\{h_u : u \in N(v)\})$ and $h_v = \sigma(W \cdot [h_v \,\|\, h_{N(v)}])$. PinSage applies this on a large-scale item graph.
- SpectralCF [43]: Proposes a spectral convolution operation to find connections between users and items in the spectral domain (using eigen-decomposition of the graph adjacency matrix).
  - Key Idea: Utilizes the eigenvectors and eigenvalues of the graph Laplacian to define convolution.

NGCF's Critique: GC-MC is limited to first-order connectivity. PinSage focuses on item-item graphs. SpectralCF has high computational complexity due to eigen-decomposition, making it unsuitable for large-scale graphs.
3.3. Technological Evolution
The evolution of collaborative filtering (CF) has generally moved from simpler statistical methods to more complex model-based approaches:
- Early Methods (e.g., Neighborhood-based CF): Relied on calculating similarities between users or items directly from interaction patterns.
- Matrix Factorization (MF): Introduced the concept of embeddings for users and items, decomposing the interaction matrix to learn these latent factors. This marked a shift towards explicit model learning.
- MF with Side Information: Enhanced MF by incorporating additional data (e.g., item attributes, social networks) to enrich embeddings and alleviate sparsity.
- Neural Collaborative Filtering (NCF): Replaced the linear inner-product interaction function of MF with non-linear neural networks, allowing for more expressive modeling of user-item relationships. This leveraged the power of deep learning for the interaction function.
- Graph-based Methods (e.g., HOP-Rec, early GCNs): Began to explicitly use the user-item interaction graph structure. HOP-Rec used random walks to augment training data, while GC-MC applied GCNs but was often limited to first-order connections.

NGCF's Position: NGCF represents a significant step in this evolution by integrating graph neural networks to enhance the embedding function itself, rather than just the interaction function or the training data. It leverages the message-passing mechanism on the user-item bipartite graph to explicitly capture high-order collaborative signals directly in the user and item embeddings. This means the embeddings are no longer static representations derived from features but are dynamically refined by the graph structure, making them inherently "collaborative-aware." This positions NGCF at the forefront of GNN-based CF models, focusing on richer, structural information within the embeddings.
3.4. Differentiation Analysis
NGCF distinguishes itself from previous methods primarily through its explicit and integrated approach to embedding the collaborative signal using high-order connectivity on the user-item graph.
- Compared to MF and NeuMF:
  - Difference: MF and NeuMF learn embeddings primarily from ID mapping or static features, and use interactions only to train an interaction function (inner product for MF, MLP for NeuMF). The collaborative signal (e.g., transitivity: user A and user B both like item I, so A and B are similar) is only implicitly captured if the interaction function is powerful enough to reconstruct interactions perfectly.
  - Innovation: NGCF explicitly injects this collaborative signal into the embedding process itself by propagating embeddings across the user-item graph. This means the embeddings of users and items are constantly refined by their neighbors' embeddings, making them inherently aware of high-order relationships.
- Compared to HOP-Rec:
  - Difference: HOP-Rec also considers high-order connectivity by performing random walks to enrich the training data, but its underlying prediction model remains MF. The high-order information is used to augment the dataset, not to modify the embedding function of the model.
  - Innovation: NGCF integrates high-order connectivity directly into the model's embedding function through embedding propagation layers. This leads to better embeddings and superior performance, as the collaborative signal becomes an intrinsic part of how embeddings are learned.
- Compared to GC-MC and PinSage:
  - Difference: GC-MC applies GCNs but typically uses only one convolutional layer, thus capturing only first-order neighbors. PinSage uses multiple layers but operates on an item-item graph (capturing item relations) rather than the user-item bipartite graph for direct CF signals.
  - Innovation: NGCF uses multiple embedding propagation layers on the user-item bipartite graph, explicitly modeling high-order connectivities between users and items. Furthermore, NGCF uses a layer-aggregation mechanism (concatenation) to combine embeddings from different propagation layers, which PinSage typically does not, allowing NGCF to capture multi-grained representations. NGCF also incorporates an element-wise product ($e_i \odot e_u$) between user and item embeddings in its message construction, making messages dependent on the affinity between the source and target, which GC-MC lacks.
- Compared to SpectralCF:
  - Difference: SpectralCF also aims to discover connectivity but does so in the spectral domain using eigen-decomposition of the graph adjacency matrix.
  - Innovation: NGCF operates efficiently in the spatial domain with an efficient matrix-form propagation rule, avoiding the high computational complexity of eigen-decomposition that makes SpectralCF impractical for large-scale recommendation.
- Compared to SVD++ (as discussed in Section 4.2.5.1 below):
  - Difference: SVD++ augments user embeddings with implicit feedback from the items they interacted with. It is essentially a first-order model with a specific aggregation mechanism.
  - Innovation: NGCF generalizes SVD++. By using a single propagation layer and simplifying its components (disabling the transformation matrices and non-linear activation), NGCF can recover SVD++. This shows that NGCF's framework is more general and can extend beyond first-order connections to high-order ones.

In summary, NGCF's core innovation lies in explicitly and effectively injecting collaborative signals derived from high-order graph connectivity directly into the embedding learning process through a scalable message-passing Graph Neural Network architecture.
4. Methodology
4.1. Principles
The core principle of Neural Graph Collaborative Filtering (NGCF) is to overcome the limitation of traditional recommender systems where the collaborative signal (information about behavioral similarities derived from user-item interactions) is not explicitly encoded within the embedding function. Instead, NGCF proposes to integrate the user-item interaction graph structure directly into the embedding process. The theoretical basis is that high-order connectivity within this graph (e.g., paths like User-Item-User or User-Item-User-Item) contains rich semantic information that reveals deeper collaborative filtering (CF) effects.
The intuition is that a user's preference is not only influenced by the items they directly interacted with (first-order connections) but also by the items that users similar to them interacted with (second-order connections), and so on. Similarly, an item's characteristics can be inferred not just from its direct users, but also from the users who interact with similar items. NGCF achieves this by using a message-passing mechanism inspired by Graph Neural Networks (GNNs). It iteratively refines user and item embeddings by propagating information (messages) between connected nodes (users and items) on the bipartite interaction graph. Each propagation layer allows embeddings to absorb information from further 'hops' in the graph, thus explicitly embedding the high-order collaborative signals.
4.2. Core Methodology In-depth (Layer by Layer)
The NGCF framework consists of three main components: an embedding layer, multiple embedding propagation layers, and a prediction layer.
4.2.1. Embedding Layer
The embedding layer serves as the initialization point for user and item representations.
- Each user $u$ is associated with an initial embedding vector $e_u \in \mathbb{R}^d$.
- Each item $i$ is associated with an initial embedding vector $e_i \in \mathbb{R}^d$.
- Here, $d$ denotes the embedding size (dimension).
- These embeddings are stored in an embedding look-up table
$$E = [\,e_{u_1}, \ldots, e_{u_N},\ e_{i_1}, \ldots, e_{i_M}\,],$$
where $N$ is the total number of users and $M$ is the total number of items. This is the initial state for the embeddings and will be optimized end-to-end. Unlike traditional models where these ID embeddings might directly feed into an interaction layer, in NGCF they are further refined by embedding propagation layers.
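In a typical implementation (a sketch, not the authors' released code), this look-up table is simply one trainable embedding matrix spanning all user and item IDs:

```python
import torch.nn as nn

n_users, n_items, d = 29858, 40981, 64  # Gowalla-sized tables, 64-d embeddings

# A single trainable look-up table E holding all user and item ID embeddings;
# these layer-0 parameters are optimized end-to-end with the rest of the model.
embedding = nn.Embedding(n_users + n_items, d)
nn.init.xavier_uniform_(embedding.weight)  # the paper uses the Xavier initializer
```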
4.2.2. Embedding Propagation Layers
This is the core of NGCF, where collaborative signals are captured and embeddings are refined through message passing on the user-item graph.
4.2.2.1. First-order Propagation
This stage describes how embeddings are refined by considering first-order neighbors (direct connections). It involves two main operations: message construction and message aggregation.

- Message Construction: For a directly connected user-item pair (u, i), a message embedding $m_{u \leftarrow i}$ is constructed, representing the information propagated from item $i$ to user $u$. The message function takes $e_i$ (item embedding) and $e_u$ (user embedding) as input, and includes a coefficient $p_{ui}$ to control the decay factor during propagation on the edge (u, i). The paper implements the message as:
$$m_{u \leftarrow i} = \frac{1}{\sqrt{|\mathcal{N}_u||\mathcal{N}_i|}} \big( W_1 e_i + W_2 (e_i \odot e_u) \big)$$
  - $m_{u \leftarrow i}$: The message embedding from item $i$ to user $u$.
  - $\mathcal{N}_u$: The set of first-hop neighbors (items) of user $u$.
  - $\mathcal{N}_i$: The set of first-hop neighbors (users) of item $i$.
  - $1/\sqrt{|\mathcal{N}_u||\mathcal{N}_i|}$: The coefficient $p_{ui}$, derived from the graph Laplacian norm. It acts as a discount factor, meaning messages should decay with path length and connectivity strength.
  - $W_1, W_2 \in \mathbb{R}^{d' \times d}$: Trainable weight matrices that transform the embeddings. $d'$ is the transformation size.
  - $e_i, e_u$: Initial embedding vectors of item $i$ and user $u$, respectively.
  - $e_i \odot e_u$: The element-wise product (Hadamard product) between $e_i$ and $e_u$. This term is crucial as it makes the message dependent on the affinity between the item and user, allowing more messages to pass from similar items.
- Message Aggregation: After messages are constructed, they are aggregated from a node's neighborhood to refine its representation. For user $u$, the aggregation function for its first-layer representation is:
$$e_u^{(1)} = \text{LeakyReLU}\Big( m_{u \leftarrow u} + \sum_{i \in \mathcal{N}_u} m_{u \leftarrow i} \Big)$$
  - $e_u^{(1)}$: The representation of user $u$ after the first embedding propagation layer.
  - $\text{LeakyReLU}$: The Leaky Rectified Linear Unit activation function, which allows both positive and small negative signals to be encoded.
  - $\sum_{i \in \mathcal{N}_u} m_{u \leftarrow i}$: The sum of messages propagated from all items in user $u$'s neighborhood $\mathcal{N}_u$.
  - $m_{u \leftarrow u}$: A self-connection term, defined as $m_{u \leftarrow u} = W_1 e_u$. This term ensures that the user's original features (initial embedding) are retained and contribute to the refined representation. The weight matrix $W_1$ is shared with the first term in message construction.
  - A similar process is applied to obtain $e_i^{(1)}$ for an item $i$. This explicitly uses first-order connectivity to relate user and item representations.
4.2.2.2. High-order Propagation
To capture high-order connectivity and more complex collaborative signals, NGCF stacks multiple embedding propagation layers.

- By stacking $L$ layers, a user (or item) can receive messages from its L-hop neighbors.
- For the $l$-th step (layer), the representation of user $u$ is recursively formulated as:
$$e_u^{(l)} = \text{LeakyReLU}\Big( m_{u \leftarrow u}^{(l)} + \sum_{i \in \mathcal{N}_u} m_{u \leftarrow i}^{(l)} \Big),$$
wherein the messages being propagated are defined using the representations from the previous layer (l-1):
$$m_{u \leftarrow i}^{(l)} = p_{ui} \big( W_1^{(l)} e_i^{(l-1)} + W_2^{(l)} (e_i^{(l-1)} \odot e_u^{(l-1)}) \big), \qquad m_{u \leftarrow u}^{(l)} = W_1^{(l)} e_u^{(l-1)}$$
  - $e_u^{(l)}$: The representation of user $u$ after the $l$-th layer.
  - $m_{u \leftarrow i}^{(l)}$: Message from item $i$ to user $u$ at layer $l$.
  - $m_{u \leftarrow u}^{(l)}$: Self-connection message for user $u$ at layer $l$.
  - $W_1^{(l)}, W_2^{(l)} \in \mathbb{R}^{d_l \times d_{l-1}}$: Trainable transformation matrices for layer $l$. $d_l$ is the transformation size for layer $l$.
  - $e_i^{(l-1)}, e_u^{(l-1)}$: Item and user representations obtained from the previous layer (l-1). These representations already incorporate messages from (l-1)-hop neighbors.
  - The term $p_{ui} = 1/\sqrt{|\mathcal{N}_u||\mathcal{N}_i|}$ remains the same as in first-order propagation, as it is a structural coefficient. This recursive process allows information to flow across high-order connectivities, explicitly injecting collaborative signals into the representation learning. An analogous process occurs for items.
- Propagation Rule in Matrix Form: For efficient batch implementation, the layer-wise propagation rule (Equations (5) and (6)) can be expressed in matrix form:
$$E^{(l)} = \text{LeakyReLU}\big( (\mathcal{L} + I)\, E^{(l-1)} W_1^{(l)} + \mathcal{L}\, E^{(l-1)} \odot E^{(l-1)}\, W_2^{(l)} \big)$$
  - $E^{(l)} \in \mathbb{R}^{(N+M) \times d_l}$: The matrix containing the representations for all users and items after $l$ steps of embedding propagation.
  - $E^{(0)}$: The initial embedding matrix from the embedding layer, where $e_u^{(0)} = e_u$ and $e_i^{(0)} = e_i$.
  - $I$: An identity matrix of appropriate size, representing the self-connections.
  - $\mathcal{L}$: The Laplacian matrix for the user-item graph. It is formulated using the adjacency matrix $A$ and degree matrix $D$ as $\mathcal{L} = D^{-1/2} A D^{-1/2}$, with
$$A = \begin{bmatrix} 0 & R \\ R^\top & 0 \end{bmatrix}$$
    - $R \in \mathbb{R}^{N \times M}$: The user-item interaction matrix.
    - $0$: An all-zero matrix.
    - $A$: The adjacency matrix of the combined user-item graph.
    - $D$: The diagonal degree matrix, where $D_{tt} = |\mathcal{N}_t|$ (the degree of node $t$).
    - The non-zero off-diagonal entries of $\mathcal{L}$ are $\mathcal{L}_{ui} = 1/\sqrt{|\mathcal{N}_u||\mathcal{N}_i|}$, which corresponds to the coefficient $p_{ui}$ used in message construction.
  - $W_1^{(l)}, W_2^{(l)}$: Trainable weight matrices for layer $l$.
  - $\odot$: Element-wise product between the two matrices, which applies the interaction term across all user and item embeddings simultaneously. This matrix form allows for efficient computation without the explicit node sampling typically required in large-scale GCNs.
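Assuming the normalized Laplacian $\mathcal{L}$ has been built as above, one propagation layer in matrix form might look like this PyTorch sketch (illustrative; the authors' released implementation is in TensorFlow):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NGCFLayer(nn.Module):
    """One propagation layer in matrix form (sketch of the rule above):
    E^(l) = LeakyReLU((L + I) E^(l-1) W1 + (L E^(l-1) ⊙ E^(l-1)) W2)."""

    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.W1 = nn.Linear(d_in, d_out, bias=False)
        self.W2 = nn.Linear(d_in, d_out, bias=False)

    def forward(self, lap: torch.Tensor, E: torch.Tensor) -> torch.Tensor:
        side = torch.sparse.mm(lap, E) if lap.is_sparse else lap @ E
        return F.leaky_relu(
            self.W1(side + E)      # (L + I) E W1: neighbor messages + self-connection
            + self.W2(side * E)    # (L E ⊙ E) W2: affinity-aware interaction term
        )

# Toy usage: 10 user+item nodes, 64-d embeddings, a stand-in symmetric "Laplacian".
n, d = 10, 64
lap = torch.rand(n, n); lap = (lap + lap.T) / 2
print(NGCFLayer(d, d)(lap, torch.randn(n, d)).shape)  # torch.Size([10, 64])
```

Stacking $L$ such layers and keeping each layer's output yields the per-layer representations that the prediction step (next section) concatenates.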
4.2.3. Model Prediction
After $L$ embedding propagation layers, multiple representations are obtained for each user (i.e., $\{e_u^{(1)}, \ldots, e_u^{(L)}\}$) and for each item (i.e., $\{e_i^{(1)}, \ldots, e_i^{(L)}\}$).

- Since representations from different layers capture different orders of connectivity and contribute uniquely to user preference, they are combined.
- The paper uses concatenation to form the final embedding for a user and an item:
$$e_u^* = e_u^{(0)} \,\|\, \cdots \,\|\, e_u^{(L)}, \qquad e_i^* = e_i^{(0)} \,\|\, \cdots \,\|\, e_i^{(L)}$$
  - $\|$: The concatenation operation. This layer-aggregation mechanism enriches the initial embeddings and allows the model to control the range of propagation by adjusting $L$. It is a simple yet effective choice, as it involves no additional parameters.
- Finally, the user's preference towards a target item is estimated using an inner product between their final embeddings:
$$\hat{y}_{\text{NGCF}}(u, i) = {e_u^*}^\top e_i^*$$
  - $\hat{y}_{\text{NGCF}}(u, i)$: The predicted affinity score for user $u$ and item $i$.
  - $\top$: Transpose operation.

The paper emphasizes embedding function learning and uses a simple inner product for the interaction function, leaving more complex options for future work.
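A small sketch of this prediction step under the definitions above (illustrative names):

```python
import torch

def predict(e_u_layers, e_i_layers):
    """y_hat = (e_u^(0) || ... || e_u^(L))^T (e_i^(0) || ... || e_i^(L))."""
    e_u_star = torch.cat(e_u_layers, dim=-1)   # layer aggregation by concatenation
    e_i_star = torch.cat(e_i_layers, dim=-1)
    return (e_u_star * e_i_star).sum(-1)       # inner product

# Toy check: three per-layer 64-d representations -> 192-d final embeddings.
print(predict([torch.randn(64) for _ in range(3)],
              [torch.randn(64) for _ in range(3)]))  # a scalar affinity score
```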
4.2.4. Optimization
To train the NGCF model, the pairwise BPR (Bayesian Personalized Ranking) loss [26] is optimized. BPR is widely used in recommender systems for implicit feedback, assuming that observed interactions should have higher prediction values than unobserved ones.
- The objective function is:
$$\text{Loss} = \sum_{(u,i,j) \in O} -\ln \sigma(\hat{y}_{ui} - \hat{y}_{uj}) + \lambda \|\Theta\|_2^2$$
  - $O = \{(u,i,j) \mid (u,i) \in R^+, (u,j) \in R^-\}$: The set of pairwise training data.
    - $R^+$: The set of observed user-item interactions (positive instances).
    - $R^-$: The set of unobserved user-item interactions (negative instances), typically generated by negative sampling.
    - (u, i, j): A triplet indicating that user $u$ prefers item $i$ over item $j$.
  - $\sigma(\cdot)$: The sigmoid function, which squashes values between 0 and 1.
  - $\hat{y}_{ui}$: The predicted score for user $u$ and item $i$.
  - $\hat{y}_{uj}$: The predicted score for user $u$ and item $j$.
  - $-\ln \sigma(\hat{y}_{ui} - \hat{y}_{uj})$: The log-loss for the BPR objective, which maximizes the difference between positive and negative item scores.
  - $\lambda$: The coefficient controlling the strength of L2 regularization.
  - $\|\Theta\|_2^2$: The squared L2 norm of all trainable model parameters $\Theta$.
  - $\Theta$: The set of all trainable parameters, including the initial embedding matrix $E^{(0)}$ and all weight matrices $\{W_1^{(l)}, W_2^{(l)}\}_{l=1}^{L}$ from the propagation layers.
- The model is optimized using mini-batch Adam [17]. For each mini-batch of sampled triplets, the user and item representations are established through $L$ propagation steps, and then model parameters are updated based on the gradients of the loss function.
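A hedged PyTorch sketch of the BPR objective (using the numerically stable `logsigmoid`; folding L2 directly into the loss is one convention, since frameworks often handle it via the optimizer's weight decay instead):

```python
import torch
import torch.nn.functional as F

def bpr_loss(pos_scores, neg_scores, params, reg_lambda=1e-5):
    """Pairwise BPR: mean of -ln sigma(y_ui - y_uj) over triplets, plus L2."""
    rank_loss = -F.logsigmoid(pos_scores - neg_scores).mean()
    l2 = sum(p.pow(2).sum() for p in params)
    return rank_loss + reg_lambda * l2

pos = torch.randn(1024, requires_grad=True)  # y_hat(u, i) for observed pairs
neg = torch.randn(1024)                      # y_hat(u, j) for sampled negatives
loss = bpr_loss(pos, neg, params=[pos])
loss.backward()                              # gradients ready for mini-batch Adam
```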
4.2.4.1. Model Size
Despite generating an embedding matrix at each layer, NGCF introduces very few additional parameters compared to MF.
- The additional parameters are primarily the two weight matrices $W_1^{(l)}$ and $W_2^{(l)}$ for each of the $L$ propagation layers.
- Total additional parameters: $2 \sum_{l=1}^{L} d_l d_{l-1}$.
- Since $L$ is typically small (e.g., < 5) and $d_l$ (the transformation size) is usually set to the embedding size (much smaller than the number of users/items), this additional cost is negligible. For instance, on a dataset like Gowalla with 64-dim embeddings and 3 layers, NGCF adds only 0.024 million parameters compared to MF's 4.5 million.
4.2.4.2. Message and Node Dropout
To combat overfitting, two dropout techniques are employed:
- Message Dropout: Randomly drops out outgoing messages during propagation.
  - For the $l$-th propagation layer, messages (Equation 6) are dropped with a probability $p_1$. This means only a partial set of messages contributes to the refined representations, making the model more robust to individual connections.
- Node Dropout: Randomly blocks a particular node and discards all its outgoing messages.
  - For the $l$-th layer, $(M + N)\,p_2$ nodes (out of the $M + N$ total users and items) are randomly dropped from the Laplacian matrix. This reduces the influence of specific users or items, enhancing robustness.

Dropout is only used during training and disabled during testing. Message dropout strengthens representation robustness against single connections, while node dropout addresses the influence of specific nodes.
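The two techniques might be sketched as follows (illustrative; the paper operates on sparse matrices, this toy version uses dense tensors, and the 1/(1-p) rescaling follows standard dropout conventions rather than the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def node_dropout(lap: torch.Tensor, p: float) -> torch.Tensor:
    """Zero out every edge touching a randomly dropped fraction p of nodes,
    rescaling the survivors by 1/(1-p) in the spirit of standard dropout."""
    keep = (torch.rand(lap.size(0)) >= p).float()
    mask = keep[:, None] * keep[None, :]   # an edge survives iff both ends survive
    return lap * mask / (1.0 - p)

def message_dropout(messages: torch.Tensor, p: float) -> torch.Tensor:
    """Drop individual propagated messages with probability p (training only)."""
    return F.dropout(messages, p=p, training=True)
```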
4.2.5. Discussions
4.2.5.1. NGCF Generalizes SVD++
The paper claims that NGCF can generalize SVD++ [19], a well-known collaborative filtering model that augments user representations with implicit feedback from the items they interacted with.

- To recover a simplified version of SVD++ (termed NGCF-SVD), NGCF is configured as follows:
  - Set the number of propagation layers to $L = 1$.
  - Disable the transformation matrices ($W_1$, $W_2$) and the non-linear activation function (LeakyReLU).
  - Under these conditions, the representations $e_u^{(1)}$ and $e_i^{(1)}$ after the first layer simplify.
- The simplified model NGCF-SVD can be formulated as:
$$\hat{y}_{\text{NGCF-SVD}} = \Big( e_u + \sum_{i' \in \mathcal{N}_u} p_{ui'}\, e_{i'} \Big)^\top \Big( e_i + \sum_{u' \in \mathcal{N}_i} p_{iu'}\, e_{u'} \Big)$$
  - $\hat{y}_{\text{NGCF-SVD}}$: Predicted score by the NGCF-SVD model.
  - $e_u, e_i$: Initial user and item embeddings.
  - $\mathcal{N}_u$: Set of items user $u$ interacted with.
  - $\mathcal{N}_i$: Set of users who interacted with item $i$.
  - $p_{ui'}, p_{iu'}$: Coefficients controlling the contribution of neighboring item $i'$ to user $u$'s representation, and of neighboring user $u'$ to item $i$'s representation, respectively.
- By setting $p_{ui'} = 1/\sqrt{|\mathcal{N}_u|}$ (a specific instance of the decay factor from the message construction) and $p_{iu'} = 0$, NGCF-SVD can exactly recover the SVD++ model.
- Furthermore, FISM [16] (a widely used item-based CF model) can also be seen as a special case where $p_{iu'}$ is always 0, effectively considering only the item-side aggregation. This demonstrates NGCF's generality and its ability to encompass existing first-order models within its framework.
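A compact sketch of the NGCF-SVD scoring function under these simplifications (illustrative names; with `p_iu = 0.0` and `p_ui = |N_u|**-0.5` it reduces to SVD++-style scoring):

```python
import torch

def ngcf_svd_score(e_u, e_i, item_nbrs, user_nbrs, p_ui, p_iu):
    """y = (e_u + p_ui * sum e_i')^T (e_i + p_iu * sum e_u')."""
    u_side = e_u + p_ui * item_nbrs.sum(dim=0)
    i_side = e_i + p_iu * user_nbrs.sum(dim=0)
    return u_side @ i_side

e_u, e_i = torch.randn(64), torch.randn(64)
item_nbrs = torch.randn(5, 64)   # embeddings of the 5 items u interacted with
user_nbrs = torch.randn(3, 64)   # embeddings of the 3 users who interacted with i
print(ngcf_svd_score(e_u, e_i, item_nbrs, user_nbrs, p_ui=5 ** -0.5, p_iu=0.0))
```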
4.2.5.2. Time Complexity Analysis
The computational complexity of NGCF is primarily determined by its layer-wise propagation rule (Equation 7).
- For the $l$-th propagation layer, the main operation is the matrix multiplication involving the Laplacian matrix and the embedding matrices.
- The computational complexity of this operation is $O(|R^+| d_l d_{l-1})$, where $|R^+|$ is the number of non-zero entries in the Laplacian matrix (which corresponds to the number of observed interactions), $d_l$ is the current transformation size, and $d_{l-1}$ is the previous transformation size.
- For the prediction layer, only an inner product is involved.
- Therefore, the overall complexity for evaluating NGCF during training for one epoch is $O\big(\sum_{l=1}^{L} |R^+| d_l d_{l-1} + \sum_{l=1}^{L} |R^+| d_l\big)$.
- Empirically, the paper notes that for the Gowalla dataset:
  - Training: MF costs about 20s per epoch, while NGCF costs about 80s per epoch.
  - Inference: MF costs about 80s for all testing instances, while NGCF costs about 260s. This shows that while NGCF is more computationally intensive than MF due to the propagation layers, it remains tractable.
5. Experimental Setup
5.1. Datasets
The experiments were conducted on three publicly accessible, real-world benchmark datasets that vary in domain, size, and sparsity. For all datasets, a 10-core setting was applied, meaning only users and items with at least ten interactions were retained to ensure data quality.
For each dataset, 80% of historical interactions per user were used for the training set, and the remaining 20% for the test set. From the training set, 10% of interactions were randomly selected as a validation set for hyperparameter tuning. For each observed user-item interaction (positive instance), negative sampling was performed to pair it with one unobserved item (negative instance).
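A minimal sketch of this negative-sampling procedure (illustrative; real implementations batch and vectorize this):

```python
import random

def sample_bpr_triplets(train_items, n_items, batch_size=1024, seed=0):
    """For each observed (u, i), draw one item j the user never interacted with.
    `train_items` maps each user id to the set of item ids in their training data."""
    rng = random.Random(seed)
    users, batch = list(train_items), []
    while len(batch) < batch_size:
        u = rng.choice(users)
        i = rng.choice(tuple(train_items[u]))   # positive item
        j = rng.randrange(n_items)              # candidate negative item
        if j not in train_items[u]:
            batch.append((u, i, j))
    return batch

print(sample_bpr_triplets({0: {1, 2}, 1: {0}}, n_items=5, batch_size=4))
```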
- Gowalla:
- Source: Check-in dataset from Gowalla, a location-based social networking service.
- Characteristics: Captures user check-ins at various locations.
- Domain: Location-based social interactions.
- Yelp2018*:
- Source: Adopted from the 2018 edition of the Yelp challenge.
- Characteristics: Local businesses (restaurants, bars) are treated as items.
- Domain: Business reviews and local services.
- Amazon-Book:
- Source: Part of the Amazon-review dataset, specifically focusing on book reviews.
- Characteristics: User ratings/reviews for books.
- Domain: E-commerce product reviews (books).
The following are the results from Table 1 of the original paper:
| Dataset | #Users | #Items | #Interactions | Density |
| --- | --- | --- | --- | --- |
| Gowalla | 29,858 | 40,981 | 1,027,370 | 0.00084 |
| Yelp2018* | 31,668 | 38,048 | 1,561,406 | 0.00130 |
| Amazon-Book | 52,643 | 91,599 | 2,984,108 | 0.00062 |
These datasets were chosen because they are widely accessible, cover different domains, and represent various scales and sparsity levels, making them effective for validating the generality and performance of recommender systems. No concrete example of a data sample (e.g., a specific check-in record or review text) was provided in the paper.
5.2. Evaluation Metrics
To evaluate the effectiveness of top-K recommendation and preference ranking, two widely used evaluation protocols are adopted: recall@K and ndcg@K. For the experiments, K is set to 20 by default. The average metrics over all users in the test set are reported.
5.2.1. Recall@K
- Conceptual Definition: Recall@K measures the proportion of relevant items (items a user actually interacted with in the test set) that are successfully retrieved among the top-K recommended items. It focuses on how many of the "good" items were found within the recommendation list, regardless of their rank within that list.
- Mathematical Formula:
$$\text{Recall@K} = \frac{1}{|\mathcal{U}_{\text{test}}|} \sum_{u \in \mathcal{U}_{\text{test}}} \frac{|\text{RelevantItems}_u \cap \text{TopKRecommendations}_u|}{|\text{RelevantItems}_u|}$$
- Symbol Explanation:
  - $\mathcal{U}_{\text{test}}$: The set of users in the test set.
  - $|\cdot|$: Denotes the cardinality (number of elements) of a set.
  - $\text{RelevantItems}_u$: The set of items that user $u$ interacted with in the test set (ground truth).
  - $\text{TopKRecommendations}_u$: The set of the top-K items recommended to user $u$ by the model.
  - $\cap$: Set intersection.
  - $\sum_{u}$: Summation over all users in the test set.
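A per-user implementation sketch (assumed names; the reported metric averages this over all test users):

```python
def recall_at_k(recommended: list, relevant: set, k: int = 20) -> float:
    """Fraction of a user's held-out items that appear in the top-k list."""
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0

print(recall_at_k([3, 7, 1, 9], {7, 2}, k=3))  # 0.5: one of two relevant items found
```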
5.2.2. NDCG@K (Normalized Discounted Cumulative Gain at K)
- Conceptual Definition: NDCG@K is a measure of ranking quality, considering not only the presence of relevant items but also their position in the ranked list. It assigns higher scores to relevant items that appear earlier (at higher ranks) in the recommendation list. It is "normalized" by comparing the calculated DCG to the ideal DCG (where all relevant items are perfectly ranked at the top) to produce a score between 0 and 1.
- Mathematical Formula:
$$\text{DCG}_u@K = \sum_{k=1}^{K} \frac{rel_k}{\log_2(k+1)}, \qquad \text{NDCG@K} = \frac{1}{|\mathcal{U}_{\text{test}}|} \sum_{u \in \mathcal{U}_{\text{test}}} \frac{\text{DCG}_u@K}{\text{IDCG}_u@K}$$
- Symbol Explanation:
  - $rel_k$: The relevance score of the item at position $k$ in the ranked list. In implicit feedback scenarios, this is typically 1 if the item is relevant (observed in the test set) and 0 otherwise.
  - $k$: The rank position of an item in the recommendation list (from 1 to $K$).
  - $\log_2(k+1)$: A logarithmic discount factor, penalizing relevant items at lower ranks.
  - $\text{DCG}_u@K$: Discounted Cumulative Gain for user $u$ at rank $K$.
  - $\text{IDCG}_u@K$: Ideal Discounted Cumulative Gain for user $u$ at rank $K$, calculated by placing all relevant items in the test set at the top of the list in decreasing order of relevance (which is 1 for all in implicit feedback); it sums over $K$ positions, or over the total number of relevant items if that is less than $K$.
  - $\mathcal{U}_{\text{test}}$: The set of users in the test set.
  - $|\cdot|$: Denotes the cardinality of a set.
  - $\sum_{u}$: Summation over all users in the test set.
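And a matching sketch for binary-relevance NDCG@K (same assumptions as the recall sketch above):

```python
import math

def ndcg_at_k(recommended: list, relevant: set, k: int = 20) -> float:
    """Binary-relevance NDCG@k: DCG of the top-k list divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(pos + 2)   # pos is 0-based, so the discount is log2(pos+2)
              for pos, item in enumerate(recommended[:k]) if item in relevant)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(min(len(relevant), k)))
    return dcg / idcg if idcg > 0 else 0.0

print(ndcg_at_k([3, 7, 1, 9], {7, 2}, k=3))  # one hit at rank 2 -> ~0.39
```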
5.3. Baselines
NGCF was compared against several representative state-of-the-art collaborative filtering methods:
- MF (Matrix Factorization) [26]:
  - Description: A basic MF model optimized with the Bayesian Personalized Ranking (BPR) loss. It models user-item direct interactions as the target value for its interaction function (inner product).
  - Representativeness: A fundamental and widely used baseline in CF.
- NeuMF (Neural Matrix Factorization) [14]:
  - Description: A state-of-the-art neural CF model that uses multiple hidden layers above the element-wise product and concatenation of user/item embeddings to capture non-linear feature interactions. A two-layered architecture is used with consistent hidden dimension size.
  - Representativeness: A strong baseline representing deep learning advancements in CF's interaction function.
- CMN (Collaborative Memory Network) [5]:
  - Description: A state-of-the-art memory-based model. It constructs user representations by attentively combining memory slots of neighboring users via memory layers. It uses first-order connections to find similar users.
  - Representativeness: A strong memory-network-based approach that explicitly considers neighbors.
- HOP-Rec (High-Order Proximity for Implicit Recommendation) [40]:
  - Description: A state-of-the-art graph-based model. It utilizes random walks to derive high-order neighbors, which are then used to enrich the user-item interaction data for training an MF model with BPR.
  - Representativeness: A direct competitor that also attempts to leverage high-order connectivity, but in a different manner.
- PinSage [42]:
  - Description: An industrial GraphSAGE [8] implementation for item-item graphs (Pinterest image recommendation). The paper adapts it to the user-item interaction graph, using two graph convolution layers with hidden dimension equal to the embedding size.
  - Representativeness: A strong GNN-based model, particularly for large-scale recommendation.
- GC-MC (Graph Convolutional Matrix Completion) [29]:
  - Description: Applies GCN [18] as an encoder for user and item representations, but it primarily considers first-order neighbors with one graph convolution layer.
  - Representativeness: An early and influential GCN-based CF model.
- SpectralCF [43]:
  - Description: This model was considered but not selected for comparison due to the high computational and resource cost associated with eigen-decomposition of the graph adjacency matrix, making it impractical for large-scale datasets.
  - Representativeness: A spectral GCN approach, excluded due to practical limitations.
Parameter Settings:
- Embedding size: Fixed to 64 for all models.
- HOP-Rec: Random walk steps searched in {1, 2, 3}; learning rate tuned in {0.025, 0.020, 0.015, 0.010}.
- Optimizer: Adam for all models except HOP-Rec.
- Batch size: Fixed at 1024.
- Hyperparameters (grid search):
  - Learning rate: {0.0001, 0.0005, 0.001, 0.005}.
  - L2 regularization coefficient ($\lambda$): {1e-5, 1e-4, ..., 1e1, 1e2}.
  - Dropout ratio (for message dropout $p_1$ and node dropout $p_2$): {0.0, 0.1, ..., 0.8}.
- Initializer: Xavier initializer [6] for model parameters.
- Early stopping: Applied if recall@20 on the validation data does not improve for 50 successive epochs.
- NGCF depth (L): Set to 3 for modeling third-order connectivity (unless specified otherwise for ablation studies).
- Default dropout for NGCF: node dropout ratio 0.0, message dropout ratio 0.1.
6. Results & Analysis
6.1. Core Results Analysis
6.1.1. Overall Performance Comparison
The paper conducted a comprehensive comparison of NGCF against several baselines on three datasets. The results, summarized in Table 2, demonstrate NGCF's superior performance across all benchmarks.
The following are the results from Table 2 of the original paper:
| Method | Gowalla recall | Gowalla ndcg | Yelp2018* recall | Yelp2018* ndcg | Amazon-Book recall | Amazon-Book ndcg |
| --- | --- | --- | --- | --- | --- | --- |
| MF | 0.1291 | 0.1109 | 0.0433 | 0.0354 | 0.0250 | 0.0196 |
| NeuMF | 0.1399 | 0.1212 | 0.0451 | 0.0363 | 0.0258 | 0.0200 |
| CMN | 0.1405 | 0.1221 | 0.0457 | 0.0369 | 0.0267 | 0.0218 |
| HOP-Rec | 0.1399 | 0.1214 | 0.0517 | 0.0428 | 0.0309 | 0.0232 |
| GC-MC | 0.1395 | 0.1204 | 0.0462 | 0.0379 | 0.0288 | 0.0224 |
| PinSage | 0.1380 | 0.1196 | 0.0471 | 0.0393 | 0.0282 | 0.0219 |
| NGCF-3 | 0.1569* | 0.1327* | 0.0579* | 0.0477* | 0.0337* | 0.0261* |
| %Improv. | 11.68% | 8.64% | 11.97% | 11.29% | 9.61% | 12.50% |
| p-value | 2.01e-7 | 3.03e-3 | 5.34e-3 | 4.62e-4 | 3.48e-5 | 1.26e-4 |
Key Observations:
- MF's Weakness: MF performs poorly across all datasets, indicating that its linear interaction function and lack of explicit collaborative signal encoding result in insufficient embeddings.
- NeuMF's Improvement: NeuMF consistently outperforms MF, highlighting the benefit of non-linear feature interactions in the interaction function. However, it still does not explicitly encode collaborative signals in the embeddings.
- Impact of First-order Neighbors: GC-MC shows improvements over MF and NeuMF in some cases (e.g., Amazon-Book), confirming that incorporating first-order neighbors helps. However, its performance can be mixed (e.g., ndcg@20 on Yelp2018* is lower than NeuMF's), possibly due to its failure to fully capture non-linear interactions.
- Memory Networks and High-order Connectivity: CMN generally performs better than GC-MC, possibly due to its neural attention mechanism for weighting neighbors. HOP-Rec achieves significant improvements, demonstrating the value of high-order connectivity for enriching training data. PinSage also shows good performance, especially on Yelp2018*, as it leverages high-order connectivity in the embedding function.
- NGCF's Dominance: NGCF-3 (NGCF with 3 propagation layers) consistently achieves the best performance on all datasets, significantly outperforming all baselines.
  - It improves recall@20 over the strongest baselines by 11.68% (Gowalla), 11.97% (Yelp2018*), and 9.61% (Amazon-Book).
  - The p-values (all below 0.01) indicate these improvements are statistically significant.
- Justification for NGCF: The superior performance is attributed to NGCF's ability to explicitly model high-order connectivity through embedding propagation layers, effectively injecting collaborative signals into the embedding learning process. Unlike CMN and GC-MC, which are limited to first-order neighbors, NGCF explores deeper connections. Compared to PinSage, NGCF's layer-aggregation mechanism combines multi-grained representations from different layers, providing a richer final embedding. Its advantage over HOP-Rec further validates that explicitly encoding CF in the embedding function is more effective than merely enriching training data.
6.1.2. Performance Comparison w.r.t. Interaction Sparsity Levels
The paper investigates how NGCF performs across different sparsity levels by dividing users into groups based on their number of interactions. The results for ndcg@20 are shown in Figure 4.
The following figure (Figure 4 from the original paper) shows ndcg@20 on different user groups in Gowalla, Yelp2018*, and Amazon-Book datasets.

(VLM-generated description: the figure compares ndcg@20 for MF, NeuMF, CMN, HOP-Rec, PinSage, GC-MC, and NGCF across user groups on the Gowalla, Yelp2018*, and Amazon-Book datasets, with NGCF performing best.)
Key Observations:
- Overall Superiority in Sparsity: Both NGCF and HOP-Rec consistently outperform the other baselines across all user groups and sparsity levels. This indicates that leveraging connectivity information (especially high-order) is highly beneficial for representation learning, particularly for users with few interactions. This suggests NGCF can effectively alleviate the sparsity issue in recommender systems.
- Greater Impact on Inactive Users: NGCF shows more significant improvements in the groups with fewer interactions (e.g., the two sparsest user groups in Gowalla, where improvements over the best baselines are 8.49% and 7.79%, respectively). For highly active users (e.g., the densest group in Gowalla), the improvements are still present but less dramatic (e.g., 1.29%). This verifies that embedding propagation is particularly advantageous for relatively inactive users, as it can synthesize collaborative signals from their sparse neighborhoods.
6.2. Ablation Studies / Parameter Analysis
6.2.1. Effect of Layer Numbers
The paper investigates the impact of the number of embedding propagation layers () on NGCF's performance, varying from 1 to 4.
The following are the results from Table 3 of the original paper:
| Method | Gowalla recall | Gowalla ndcg | Yelp2018* recall | Yelp2018* ndcg | Amazon-Book recall | Amazon-Book ndcg |
| --- | --- | --- | --- | --- | --- | --- |
| NGCF-1 | 0.1556 | 0.1315 | 0.0543 | 0.0442 | 0.0313 | 0.0241 |
| NGCF-2 | 0.1547 | 0.1307 | 0.0566 | 0.0465 | 0.0330 | 0.0254 |
| NGCF-3 | 0.1569 | 0.1327 | 0.0579 | 0.0477 | 0.0337 | 0.0261 |
| NGCF-4 | 0.1570 | 0.1327 | 0.0566 | 0.0461 | 0.0344 | 0.0263 |
Key Observations:
- Benefits of Depth: Increasing the depth from NGCF-1 to NGCF-2 and NGCF-3 generally enhances recommendation performance across all datasets. NGCF-1 only captures first-order neighbors, while NGCF-2 and NGCF-3 capture second- and third-order connectivity, which carry more complex collaborative signals (user similarity, potential recommendations).
- Overfitting with Excessive Depth: NGCF-4 shows slight improvements on Amazon-Book but leads to overfitting on Yelp2018* (performance drops compared to NGCF-3). This suggests that excessively deep architectures might introduce noise into the representation learning process. Three layers (NGCF-3) appear to be a good balance for capturing sufficient CF signals without introducing too much noise or overfitting.
- Consistent Superiority: Even with varying layer numbers, NGCF consistently outperforms the other methods (as seen by comparing Table 3 to Table 2), reaffirming the effectiveness of explicitly modeling high-order connectivity.
6.2.2. Effect of Embedding Propagation Layer and Layer-Aggregation Mechanism
To understand the specific contributions of the proposed embedding propagation layer design, variants of NGCF-1 (single layer) were tested.
The following are the results from Table 4 of the original paper:
| Method | Gowalla recall | Gowalla ndcg | Yelp2018* recall | Yelp2018* ndcg | Amazon-Book recall | Amazon-Book ndcg |
| --- | --- | --- | --- | --- | --- | --- |
| NGCF-1 | 0.1556 | 0.1315 | 0.0543 | 0.0442 | 0.0313 | 0.0241 |
| NGCF-1_SVD++ | 0.1517 | 0.1265 | 0.0504 | 0.0414 | 0.0297 | 0.0232 |
| NGCF-1_GC-MC | 0.1523 | 0.1307 | 0.0518 | 0.0420 | 0.0305 | 0.0234 |
| NGCF-1_PinSage | 0.1534 | 0.1308 | 0.0516 | 0.0420 | 0.0293 | 0.0231 |
Key Observations:
- Importance of Representation Interactions: NGCF-1 consistently outperforms all its variants (NGCF-1_SVD++, NGCF-1_GC-MC, NGCF-1_PinSage). This highlights the crucial role of the representation interaction term ($e_i \odot e_u$) within the message construction function. This term makes messages dependent on the affinity between the source and target embeddings, acting like an attention mechanism and distilling more relevant information. The variants, by contrast, rely only on linear transformations.
- Significance of Self-Connections and Non-linearity: NGCF-1_SVD++ generally performs worse than NGCF-1_GC-MC and NGCF-1_PinSage. This suggests the importance of retaining original features (via self-connections) and non-linear transformations in the propagation layer, which NGCF-1_SVD++ omits in order to simplify to an SVD++-like form.
- Effectiveness of Layer Aggregation: When comparing the NGCF-1 variants with their multi-layer counterparts (e.g., NGCF-1_GC-MC vs. GC-MC, NGCF-1_PinSage vs. PinSage from Table 2), the NGCF-1 variants, combined through layer aggregation (concatenating the outputs of different layers, even when only one layer is used for comparison), often show better performance. This underscores the significance of the layer-aggregation mechanism (concatenation) for combining information from different layers, a finding consistent with other GNN research.
6.2.3. Effect of Dropout
The paper investigates the effect of message dropout () and node dropout () on NGCF's performance. The results are plotted in Figure 5.
The following figure (Figure 5 from the original paper) shows the effect of node dropout and message dropout ratios on recall@20 across different datasets.

(VLM-generated description: the image shows recall@20 per training epoch for MF and NGCF on the Gowalla (left), Yelp2018* (middle), and Amazon-Book (right) datasets, with NGCF outperforming MF on all datasets.)
Figure 5: Effect of node dropout and message dropout ratios.
Key Observations:
- Node Dropout's Superiority: Node dropout generally offers better performance than message dropout. For example, on Gowalla, a node dropout ratio of 0.2 yields a recall@20 of 0.1514, which is higher than the best message dropout result of 0.1506.
- Robustness: This suggests that dropping out entire nodes (and all their outgoing messages) makes the representations more robust not only against specific connections but also against the influence of particular users or items. Node dropout appears to be a more effective strategy for preventing overfitting in graph neural networks, as it promotes more generalized node representations.
6.2.4. Test Performance w.r.t. Epoch
The convergence behavior of NGCF compared to MF over training epochs is shown in Figure 6 (mis-captioned as Figure 5 in the original VLM descriptions, but correctly referenced as Figure 6 in the text).
The following figure (Figure 6 from the original paper) shows the test performance (recall@20) of each epoch for MF and NGCF.

(VLM-generated description: the image shows the t-SNE-transformed representations derived from MF (NGCF-0) and NGCF-3; stars represent users from the Gowalla dataset, and points of the same color denote the relevant items, best viewed in color.)
Figure 6: Test performance of each epoch of MF and NGCF.
Key Observations:
- Faster Convergence for NGCF: NGCF exhibits faster convergence than MF across all three datasets. This is likely because NGCF explicitly incorporates indirectly connected users and items through embedding propagation when optimizing the interaction pairs in each mini-batch.
- Better Model Capacity: This observation highlights NGCF's superior model capacity and the effectiveness of performing embedding propagation directly in the embedding space. The richer information flow allows NGCF to learn effective representations more quickly.
6.3. Effect of High-order Connectivity
To qualitatively understand how embedding propagation facilitates representation learning, the paper visualizes t-SNE transformed representations of randomly selected users and their relevant items from the Gowalla dataset. Figure 7 (mis-captioned as Figure 6 in the original VLM descriptions, but correctly referenced as Figure 7 in the text) compares MF (as NGCF-0, i.e., no propagation layers) with NGCF-3.
The following figure (Figure 7 from the original paper) shows the visualization of the learned t-SNE transformed representations derived from MF and NGCF-3. Each star represents a user from Gowalla dataset, while the points with the same color denote the relevant items. Best view in color.

Figure 7: Visualization of the learned t-SNE transformed representations derived from MF and NGCF-3. Each star represents a user from Gowalla dataset, while the points with the same color denote the relevant items. Best view in color.
Key Observations:
- Clustering in Embedding Space: With NGCF-3, the representations of users and their relevant items (even those not seen during training for that specific user) are well-reflected in the embedding space. Points with the same colors (representing items consumed by the same users) tend to form distinct clusters. This means that users and their preferred items (and items preferred by similar users) are embedded into close regions of the vector space.
- Enhanced Closeness with Propagation: Comparing MF (Figure 7a) and NGCF-3 (Figure 7b), it is evident that NGCF-3 causes the embeddings of a user's historical items to be much closer to each other and to the user's own embedding. For example, for users like 12201 and 6880, their respective item clusters are more compact and distinct in NGCF-3.
- Qualitative Verification: This visualization qualitatively verifies that the proposed embedding propagation layer effectively injects explicit collaborative signals into the representations. The high-order connectivity learned through multiple layers brings related entities closer in the embedding space, which is crucial for accurate recommendations.
7. Conclusion & Reflections
7.1. Conclusion Summary
This work makes a significant contribution to model-based collaborative filtering (CF) by explicitly incorporating the collaborative signal into the embedding function itself. The authors introduce Neural Graph Collaborative Filtering (NGCF), a novel framework built upon graph neural networks (GNNs). NGCF leverages the user-item interaction graph to model high-order connectivities through an embedding propagation mechanism. This process enriches user and item embeddings by allowing them to iteratively absorb collaborative signals from their multi-hop neighbors. Extensive experiments on three real-world datasets consistently demonstrate that NGCF achieves state-of-the-art performance, significantly outperforming existing CF methods. The analysis further validates that embedding propagation is crucial for learning higher-quality user and item representations, especially benefiting inactive users and improving cold-start scenarios.
7.2. Limitations & Future Work
The authors acknowledge several areas for future improvement and research:
- Attention Mechanism: Incorporating an attention mechanism [2] to learn variable weights for neighbors during embedding propagation. This would allow the model to dynamically prioritize more important neighbors rather than using fixed (Laplacian-based) weights, potentially improving model generalization and interpretability.
- Adversarial Learning: Exploring adversarial learning [13] on user/item embeddings and the graph structure to enhance NGCF's robustness. This could make the model less susceptible to noisy or malicious interactions.
- Exploiting Other Structural Knowledge: Expanding NGCF to leverage other forms of structural information beyond simple user-item interactions. This includes:
  - Cross features [41] for context-aware and semantics-rich recommendation [22, 27].
  - Item knowledge graphs [32] to establish knowledge-aware connectivities between users and items, which could unveil user decision-making processes.
  - Social networks [34] to incorporate social influence into recommendations.

The authors hope that NGCF's development will contribute to more effective and interpretable recommendation systems by facilitating the reasoning of user online behavior.
7.3. Personal Insights & Critique
7.3.1. Personal Insights
- Paradigm Shift in Embedding Learning: NGCF provides a compelling argument for a paradigm shift in embedding learning for CF. Instead of treating embeddings as static attributes and interactions as mere training signals, NGCF demonstrates the power of making embeddings inherently dynamic and context-aware by integrating graph structure. This is a crucial insight: the "signal" should be in the representation itself, not just an artifact of the loss function.
- Power of High-Order Connectivity: The paper effectively illustrates the semantic richness of high-order connectivity. The example of u1-i2-u2-i4 clearly shows how multi-hop paths can capture nuanced collaborative signals that simpler models miss. This highlights the importance of going beyond direct neighbors.
- GNNs as a Natural Fit for CF: NGCF reinforces the idea that Graph Neural Networks are a natural and powerful tool for recommender systems. User-item interaction data is inherently graph-structured, and GNNs are designed to process such data by message passing, making them highly suitable for capturing collaborative filtering effects.
- Addressing Sparsity: The empirical results demonstrating NGCF's significant benefits for inactive users (those with fewer interactions) are particularly insightful. This suggests embedding propagation can effectively alleviate the sparsity issue by allowing users to indirectly gain information from a wider network of similar users and items.
7.3.2. Critique
- Complexity of Message Construction: While the element-wise product term ($e_i \odot e_u$) in message construction is shown to be effective, its exact theoretical justification beyond "affinity" could be explored more deeply. Is it the optimal interaction? More complex neural interaction functions within the message could be considered, though the paper mentions this as future work for the prediction layer.
- Fixed Decay Factor: The use of the graph Laplacian norm ($1/\sqrt{|\mathcal{N}_u||\mathcal{N}_i|}$) as a fixed decay factor is a standard choice but might not be optimal. Learning this decay factor or making it context-dependent (e.g., using attention) could lead to further improvements, as mentioned in the future work.
- Scalability for Extremely Large Graphs: While the matrix-form propagation rule avoids node sampling and is more efficient than some GCNs, the adjacency matrix and Laplacian matrix can still be very large for industrial-scale recommendation systems with billions of users and items. The current formulation still processes the full graph (or at least its edges) in each layer. Further optimizations for handling truly massive, dynamic graphs might be necessary, such as mini-batching over edges or subgraphs.
- Interpretability of High-Order Paths: While NGCF explicitly captures high-order connectivity, the final concatenated embedding might lose some interpretability regarding which specific paths contribute most to a recommendation. Future work on attention could help in this regard.
- Hyperparameter Sensitivity: As with many deep learning models, NGCF has several hyperparameters (number of layers, dropout ratios, learning rate, L2 regularization). Optimal performance depends on careful tuning, and the overfitting observed with NGCF-4 highlights this sensitivity.

Overall, NGCF is a rigorously designed and empirically validated model that significantly advances the field of collaborative filtering by elegantly integrating graph neural networks to enhance embedding learning. It opens many promising avenues for future research in graph-based recommendation.