Papers
The Best of the Two Worlds: Harmonizing Semantic and Hash IDs for Sequential Recommendation
Published:12/11/2025
Sequential Recommender Systems, Harmonization of Semantic and Hash IDs, Long-Tail Problem Mitigation Methods, Multi-Granular Semantic Modeling, Knowledge Transfer Strategies in Recommendation Systems
The HRec framework harmonizes Semantic IDs and Hash IDs to tackle long-tail issues in sequential recommendation systems, utilizing a dual-branch architecture for capturing multi-granular semantics and a dual-level alignment strategy for knowledge transfer.
Rethinking Popularity Bias in Collaborative Filtering via Analytical Vector Decomposition
Published:12/11/2025
Popularity Bias Analysis in Collaborative Filtering, Bayesian Pairwise Ranking Optimization, Directional Decomposition and Correction (DDC), Personalized Recommendation System, Geometric Embedding Correction
This study reveals that popularity bias in collaborative filtering is an intrinsic geometric artifact of Bayesian Pairwise Ranking optimization. The proposed Directional Decomposition and Correction (DDC) framework significantly enhances personalization and fairness in recommenda
STARS: Semantic Tokens with Augmented Representations for Recommendation at Scale
Published:12/11/2025
Transformer-Based Sequential Recommendation System, Semantic Item Tokens, Low-Latency Recommendation System, Dual-Memory User Embeddings, Cold-Start Product Recommendation
STARS is a transformer-based recommendation framework for large-scale e-commerce that tackles cold-start and dynamic user intent challenges. It enhances matching with dual-memory user embeddings and semantic item tokens, achieving significant improvements in recommendation qualit
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Published:1/30/2023
Vision-Language Models, BLIP-2 Pre-training Strategy, Lightweight Querying Transformer, Frozen Image Encoders, Zero-shot Image-to-Text Generation
BLIP-2 introduces an efficient vision-language pre-training strategy using frozen image encoders and language models, achieving state-of-the-art performance on various tasks while significantly reducing the number of trainable parameters.
FuXi-$α$: Scaling Recommendation Model with Feature Interaction Enhanced Transformer
Published:2/5/2025
Feature Interaction Enhanced Transformer, Large-Scale Recommendation Models, Adaptive Multi-channel Self-Attention Mechanism, Multi-stage Feed-Forward Network, Online A/B Testing
The FuXi-$α$ model introduces an Adaptive Multi-channel Self-attention mechanism to improve the modeling of temporal, positional, and semantic features, along with a Multi-stage Feed-Forward Network to enhance implicit feature interactions, outperforming existing models in offlin
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Published:4/9/2024
Long-Term Video Understanding, Multimodal Long-Sequencing Model, Video Question Answering, Video Information Storage Mechanism, Memory-Augmented Multimodal Learning
The MA-LMM model is introduced for long-term video understanding, utilizing an online approach and a memory bank to store historical video information, overcoming frame limitations. Extensive experiments demonstrate state-of-the-art performance in tasks like video question answer
Compact and Wide-FOV True-3D VR Enabled by a Light Field Display Engine with a Telecentric Path
True 3D VR Display, Light Field Display Technology, Micro-LCD High Resolution Display, Telecentric Optical Path Design, Wide Field-of-View Imaging
This study introduces a true-3D VR display system utilizing a light field display engine, achieving high resolution and over 60 degrees FOV through a telecentric optical path that mitigates field reduction caused by aberrations.
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Published:6/25/2023
KV Cache Optimization for Large Language Models, Heavy Hitter Algorithm, Dynamic Submodular Problem, Efficient Generation Inference, Model Inference Performance Enhancement
The paper introduces H$_2$O, a novel KV cache eviction strategy for Large Language Models that reduces memory usage by 5-10 times while improving inference throughput by up to 29x. It works by identifying and retaining high-contribution tokens known as Heavy Hitters.
One-step Diffusion with Distribution Matching Distillation
Published:12/1/2023
Distribution Matching Distillation, One-Step Diffusion Generation, Image Generation Neural Network, Diffusion Model Optimization, High-Quality Image Generation
This paper introduces Distribution Matching Distillation (DMD) to transform multi-step diffusion models into one-step image generators with high quality. By minimizing KL divergence, DMD ensures output distribution alignment, outperforming existing methods with speeds up to 20 FP
Improving surface quality of LDED thin-wall Ti-6Al-4V alloy with ultralow influence on superficial layer via femtosecond laser polishing
Published:10/24/2025
Femtosecond Laser Polishing, Surface Quality of Additively Manufactured Titanium Alloys, Laser Direct Energy Deposition (LDED), Thin-Wall Ti-6Al-4V Alloy, Comparison Study of Nanosecond Laser Polishing
This study introduces femtosecond laser polishing to enhance the surface quality of LDED Ti-6Al-4V alloy. Results show a significant reduction in surface roughness from 37.24 μm to 4.97 μm while minimizing oxidation layer and heat-affected zone depth, preventing surface deformation
FineRec: Exploring Fine-grained Sequential Recommendation
Published:4/20/2024
Fine-Grained Sequential Recommendation, Attribute-Opinion Pair Extraction, User-Item Graph, Diversity-Aware Convolution Operation, Large Language Model Application
The FineRec framework is introduced for fine-grained sequential recommendation by leveraging attribute-opinion pairs from user reviews. Utilizing large language models and a diversity-aware convolution operation, it significantly outperforms existing methods in representation lea
Socio-spatial segregation and human mobility: A review of empirical evidence
Published:1/22/2025
Socio-spatial Segregation and Human Mobility, Activity Space and Segregation, Analysis of Human Mobility Data Sources, Relationship between Residential and Experienced Segregation, Methodological Challenges in Segregation Research
This review examines how emerging mobility data since the 2010s enhances understanding of socio-spatial segregation, focusing on activity space. It poses three questions regarding mobility data strengths, the relationship between mobility patterns and experienced segregation, and
Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning
Published:4/10/2025
Facial Attribute Recognition, Facial Expression Recognition, Multimodal Large Language Model, FaceInstruct-1M Dataset, Face-Region Guided Cross-Attention
Face-LLaVA is a multimodal large language model designed for facial expression and attribute recognition, generating natural language descriptions. Utilizing the FaceInstruct-1M dataset and a novel visual encoder, it outperforms existing models in various tasks and achieves highe
One-Minute Video Generation with Test-Time Training
Published:4/8/2025
Video Generation Models, Autoregressive Generation Models, Transformer-based Video Generation, Test-Time Training, Complex Multi-Scene Story Generation
This paper introduces Test-Time Training (TTT) layers to enhance one-minute video generation. By integrating TTT into a pretrained Transformer, the authors achieved more coherent videos from text storyboards, outperforming existing methods, despite some artifacts and efficiency
Scene Graph-Enhanced Visual-Language Commonsense Reasoning Generation
Scene Graph-Augmented Visual Language Reasoning Generation, Visual Language Commonsense Reasoning, Multimodal Commonsense Reasoning Assessment, Visual Understanding with Large Language Models, Experiments on VCR and VQA-X Datasets
The SGEVL framework enhances visual-language commonsense reasoning by integrating CLIP patch sequences and a gated cross-modal attention mechanism, improving the LLM's visual understanding. It proposes location-free scene graph generation for better reasoning accuracy. Experiments sh
Restora-Flow: Mask-Guided Image Restoration with Flow Matching
Published:11/25/2025
Flow Matching for Image Restoration, Training-Free Image Restoration Method, Mask-Guided Image Restoration, Evaluation on Medical Imaging Datasets, Image Denoising and Super-Resolution
Restora-Flow is a novel training-free image restoration method that utilizes flow matching guided by a degradation mask, incorporating trajectory correction. It shows superior perceptual quality and processing time compared to existing diffusion and flow matching methods across v
ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning
Published:10/7/2025
Humanoid Whole-Body Control, Residual Learning Framework, General Motion Tracking, Object Interaction Task Optimization, Training on Human Motion Data
ResMimic is a two-stage residual learning framework that enhances humanoid control for loco-manipulation. By refining outputs from a general motion tracking policy trained on human data, it significantly improves task success and training efficiency, as shown in simulations and r
Ternary Spike: Learning Ternary Spikes for Spiking Neural Networks
Published:12/11/2023
Spiking Neural Networks, Ternary Spike Neuron, Information Capacity Enhancement, Event-Driven Computing, Multiplication-Free Operation
This study introduces a ternary spike neuron to address the limited information capacity of binary spikes in spiking neural networks. Utilizing values of {-1, 0, 1} enhances information capacity while maintaining event-driven, multiplication-free advantages, with experiments show
Knowledge Circuits in Pretrained Transformers
Published:5/28/2024
Knowledge Circuit Analysis, Pretrained Transformer Models, Experiments with GPT-2 and TinyLLAMA, Impact of Knowledge Editing Techniques, Self-Attention Mechanism and Information Heads
This paper explores knowledge encoding in large language models, introducing the concept of 'Knowledge Circuits' to reveal critical subgraphs in their computation graphs. Experiments with GPT-2 and TinyLLAMA showcase how information heads, relation heads, and MLPs collaboratively
The Heat Shock Transcription Factor HsfA Is Essential for Thermotolerance and Regulates Cell Wall Integrity in Aspergillus fumigatus
Published:4/9/2021
Heat Shock Transcription Factor HsfA, Thermotolerance, Cell Wall Integrity, Aspergillus Infection, Heat Shock Response Mechanism
The Heat Shock Transcription Factor HsfA is crucial for thermotolerance and cell wall integrity in Aspergillus fumigatus. High-temperature exposure alters cell wall ultrastructure, while the interplay of HsfA and Hsp90 expression is regulated by cell wall signaling components, hi
……