Page 13 - Paper Library - AiPaper

The Best of the Two Worlds: Harmonizing Semantic and Hash IDs for Sequential Recommendation

Published: 12/11/2025

Sequential Recommender Systems, Harmonization of Semantic and Hash IDs, Long-Tail Problem Mitigation Methods, Multi-Granular Semantic Modeling, Knowledge Transfer Strategies in Recommendation Systems

The H$^2$Rec framework harmonizes Semantic IDs and Hash IDs to tackle long-tail issues in sequential recommendation systems, utilizing a dual-branch architecture for capturing multi-granular semantics and a dual-level alignment strategy for knowledge transfer.

Rethinking Popularity Bias in Collaborative Filtering via Analytical Vector Decomposition

Published: 12/11/2025

Popularity Bias Analysis in Collaborative Filtering, Bayesian Pairwise Ranking Optimization, Directional Decomposition and Correction (DDC), Personalized Recommendation System, Geometric Embedding Correction

This study reveals that popularity bias in collaborative filtering is an intrinsic geometric artifact of Bayesian Pairwise Ranking optimization. The proposed Directional Decomposition and Correction (DDC) framework significantly enhances personalization and fairness in recommenda

STARS: Semantic Tokens with Augmented Representations for Recommendation at Scale

Published: 12/11/2025

Transformer-Based Sequential Recommendation System, Semantic Item Tokens, Low-Latency Recommendation System, Dual-Memory User Embeddings, Cold-Start Product Recommendation

STARS is a transformer-based recommendation framework for large-scale e-commerce that tackles cold-start and dynamic user intent challenges. It enhances matching with dual-memory user embeddings and semantic item tokens, achieving significant improvements in recommendation qualit

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

Published: 1/30/2023

Vision-Language Models, BLIP-2 Pre-training Strategy, Lightweight Querying Transformer, Frozen Image Encoders, Zero-shot Image-to-Text Generation

BLIP-2 introduces an efficient vision-language pre-training strategy using frozen image encoders and language models, achieving state-of-the-art performance on various tasks while significantly reducing the number of trainable parameters.

FuXi-$α$: Scaling Recommendation Model with Feature Interaction Enhanced Transformer

Published: 2/5/2025

Feature Interaction Enhanced Transformer, Large-Scale Recommendation Models, Adaptive Multi-channel Self-Attention Mechanism, Multi-stage Feed-Forward Network, Online A/B Testing

The FuXi-$α$ model introduces an Adaptive Multi-channel Self-Attention mechanism to improve the modeling of temporal, positional, and semantic features, along with a Multi-stage Feed-Forward Network to enhance implicit feature interactions, outperforming existing models in offlin

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Published: 4/9/2024

Long-Term Video Understanding, Multimodal Long-Sequencing Model, Video Question Answering, Video Information Storage Mechanism, Memory-Augmented Multimodal Learning

The MA-LMM model is introduced for long-term video understanding, utilizing an online approach and a memory bank to store historical video information, overcoming frame limitations. Extensive experiments demonstrate state-of-the-art performance in tasks like video question answer

Compact and Wide-FOV True-3D VR Enabled by a Light Field Display Engine with a Telecentric Path

True 3D VR Display, Light Field Display Technology, Micro-LCD High Resolution Display, Telecentric Optical Path Design, Wide Field-of-View Imaging

This study introduces a true-3D VR display system utilizing a light field display engine, achieving high resolution and over 60 degrees FOV through a telecentric optical path that mitigates field reduction caused by aberrations.

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

Published: 6/25/2023

KV Cache Optimization for Large Language Models, Heavy Hitter Algorithm, Dynamic Submodular Problem, Efficient Generation Inference, Model Inference Performance Enhancement

The paper introduces H$_2$O, a novel KV cache eviction strategy for Large Language Models that reduces memory usage by 5-10 times while improving inference throughput by up to 29x, based on identifying and retaining high-contribution tokens known as Heavy Hitters.

One-step Diffusion with Distribution Matching Distillation

Published: 12/1/2023

Distribution Matching Distillation, One-Step Diffusion Generation, Image Generation Neural Network, Diffusion Model Optimization, High-Quality Image Generation

This paper introduces Distribution Matching Distillation (DMD) to transform multi-step diffusion models into one-step image generators with high quality. By minimizing KL divergence, DMD ensures output distribution alignment, outperforming existing methods with speeds up to 20 FP

Improving surface quality of LDED thin-wall Ti-6Al-4V alloy with ultralow influence on superficial layer via femtosecond laser polishing

Published: 10/24/2025

Femtosecond Laser Polishing, Surface Quality of Additively Manufactured Titanium Alloys, Laser Direct Energy Deposition (LDED), Thin-Wall Ti-6Al-4V Alloy, Comparison Study of Nanosecond Laser Polishing

This study introduces femtosecond laser polishing to enhance the surface quality of LDED Ti-6Al-4V alloy. Results show a significant reduction in surface roughness from 37.24 μm to 4.97 μm while minimizing oxidation layer and heat-affected zone depth, preventing surface deformation

FineRec: Exploring Fine-grained Sequential Recommendation

Published: 4/20/2024

Fine-Grained Sequential Recommendation, Attribute-Opinion Pair Extraction, User-Item Graph, Diversity-Aware Convolution Operation, Large Language Model Application

The FineRec framework is introduced for fine-grained sequential recommendation by leveraging attribute-opinion pairs from user reviews. Utilizing large language models and a diversity-aware convolution operation, it significantly outperforms existing methods in representation lea

Socio-spatial segregation and human mobility: A review of empirical evidence

Published: 1/22/2025

Socio-spatial Segregation and Human Mobility, Activity Space and Segregation, Analysis of Human Mobility Data Sources, Relationship between Residential and Experienced Segregation, Methodological Challenges in Segregation Research

This review examines how emerging mobility data since the 2010s enhances understanding of socio-spatial segregation, focusing on activity space. It poses three questions regarding mobility data strengths, the relationship between mobility patterns and experienced segregation, and

Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning

Published: 4/10/2025

Facial Attribute Recognition, Facial Expression Recognition, Multimodal Large Language Model, FaceInstruct-1M Dataset, Face-Region Guided Cross-Attention

Face-LLaVA is a multimodal large language model designed for facial expression and attribute recognition, generating natural language descriptions. Utilizing the FaceInstruct-1M dataset and a novel visual encoder, it outperforms existing models in various tasks and achieves highe

One-Minute Video Generation with Test-Time Training

Published: 4/8/2025

Video Generation Models, Autoregressive Generation Models, Transformer-based Video Generation, Test-Time Training, Complex Multi-Scene Story Generation

This paper introduces Test-Time Training (TTT) layers to enhance one-minute video generation. By integrating TTT into a pretrained Transformer, the authors achieved more coherent videos from text storyboards, outperforming existing methods, despite some artifacts and efficiency

Scene Graph-Enhanced Visual-Language Commonsense Reasoning Generation

Scene Graph-Augmented Visual Language Reasoning Generation, Visual Language Commonsense Reasoning, Multimodal Commonsense Reasoning Assessment, Visual Understanding with Large Language Models, Experiments on VCR and VQA-X Datasets

The SGEVL framework enhances visual-language commonsense reasoning by integrating CLIP patch sequences and a gated cross-modal attention mechanism, improving the LLM's visual understanding. It proposes location-free scene graph generation for better reasoning accuracy. Experiments sh

Restora-Flow: Mask-Guided Image Restoration with Flow Matching

Published: 11/25/2025

Flow Matching for Image Restoration, Training-Free Image Restoration Method, Mask-Guided Image Restoration, Evaluation on Medical Imaging Datasets, Image Denoising and Super-Resolution

Restora-Flow is a novel training-free image restoration method that utilizes flow matching guided by a degradation mask, incorporating trajectory correction. It shows superior perceptual quality and processing time compared to existing diffusion and flow matching methods across v

ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning

Published: 10/7/2025

Humanoid Whole-Body Control, Residual Learning Framework, General Motion Tracking, Object Interaction Task Optimization, Training on Human Motion Data

ResMimic is a two-stage residual learning framework that enhances humanoid control for loco-manipulation. By refining outputs from a general motion tracking policy trained on human data, it significantly improves task success and training efficiency, as shown in simulations and r

Ternary Spike: Learning Ternary Spikes for Spiking Neural Networks

Published: 12/11/2023

Spiking Neural Networks, Ternary Spike Neuron, Information Capacity Enhancement, Event-Driven Computing, Multiplication-Free Operation

This study introduces a ternary spike neuron to address the limited information capacity of binary spikes in spiking neural networks. Utilizing values of {-1, 0, 1} enhances information capacity while maintaining event-driven, multiplication-free advantages, with experiments show

Knowledge Circuits in Pretrained Transformers

Published: 5/28/2024

Knowledge Circuit Analysis, Pretrained Transformer Models, Experiments with GPT-2 and TinyLLAMA, Impact of Knowledge Editing Techniques, Self-Attention Mechanism and Information Heads

This paper explores knowledge encoding in large language models, introducing the concept of 'Knowledge Circuits' to reveal critical subgraphs in their computation graphs. Experiments with GPT-2 and TinyLLAMA showcase how information heads, relation heads, and MLPs collaboratively

The Heat Shock Transcription Factor HsfA Is Essential for Thermotolerance and Regulates Cell Wall Integrity in Aspergillus fumigatus

Published: 4/9/2021

Heat Shock Transcription Factor HsfA, Thermotolerance, Cell Wall Integrity, Aspergillus Infection, Heat Shock Response Mechanism

The Heat Shock Transcription Factor HsfA is crucial for thermotolerance and cell wall integrity in Aspergillus fumigatus. High-temperature exposure alters cell wall ultrastructure, while the interplay of HsfA and Hsp90 expression is regulated by cell wall signaling components, hi
