Papers

Tag: Efficient Attention Mechanism
MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers
Published: 11/6/2024
Tags: Transformer Architecture, Efficient Attention Mechanism, In-Memory Lookup Tables, Reduction of Computational Complexity, Multi-Head Attention Operation
MemoryFormer is a novel transformer architecture that reduces computational complexity by eliminating most fully-connected layers while retaining the necessary multi-head attention operations, using in-memory lookup tables and hash algorithms for dynamic vector retrieval.
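
To make the lookup idea concrete, here is a minimal sketch, not the paper's exact design: the input is split into chunks, each chunk is hashed to a bucket via the sign pattern of a fixed random projection (a simple locality-sensitive hash), and the bucket selects a learned vector from that chunk's table, replacing the dense matrix multiply of a fully-connected layer. All names, sizes, and the hashing scheme below are illustrative assumptions; gradients flow only into the tables, since the hash itself is fixed.

```python
import torch
import torch.nn as nn

class HashedLookupLayer(nn.Module):
    """Sketch: approximate y = Wx with table reads instead of a matmul.

    Illustrative stand-in for the hash-and-lookup idea; not
    MemoryFormer's exact construction.
    """

    def __init__(self, d_in: int, d_out: int, num_chunks: int = 8, bits: int = 8):
        super().__init__()
        assert d_in % num_chunks == 0
        self.num_chunks = num_chunks
        # Fixed random hyperplanes acting as the hash function (not trained).
        self.register_buffer(
            "planes", torch.randn(num_chunks, d_in // num_chunks, bits)
        )
        # One table of 2**bits learned output vectors per chunk.
        self.tables = nn.Parameter(0.02 * torch.randn(num_chunks, 2 ** bits, d_out))
        self.register_buffer("bit_weights", 2 ** torch.arange(bits))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B = x.shape[0]
        chunks = x.view(B, self.num_chunks, -1)                     # (B, K, d_in/K)
        proj = torch.einsum("bkc,kch->bkh", chunks, self.planes)    # (B, K, bits)
        buckets = ((proj > 0).long() * self.bit_weights).sum(-1)    # (B, K)
        # Gather one learned vector per chunk and sum: no matmul with x.
        vecs = self.tables[torch.arange(self.num_chunks), buckets]  # (B, K, d_out)
        return vecs.sum(dim=1)

layer = HashedLookupLayer(d_in=64, d_out=32)
y = layer(torch.randn(4, 64))   # -> shape (4, 32)
```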
STORE: Semantic Tokenization, Orthogonal Rotation and Efficient Attention for Scaling Up Ranking Models
Published: 11/24/2025
Tags: Scalable Ranking Models, Semantic Tokenization, Orthogonal Rotation Transformation, High-Dimensional Feature Sparsity, Efficient Attention Mechanism
The paper introduces STORE, a scalable ranking framework addressing representation and computational bottlenecks in personalized recommendation systems through semantic tokenization, efficient attention, and orthogonal rotation.
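
The orthogonal-rotation ingredient can be illustrated with PyTorch's built-in orthogonal parametrization. This is only a sketch of that one component, with illustrative names and dimensions; it says nothing about how STORE combines the rotation with semantic tokenization or its attention mechanism.

```python
import torch
import torch.nn as nn

d = 64  # embedding width (illustrative)

# A learnable linear map constrained to stay orthogonal, so it re-mixes
# embedding dimensions without distorting norms or pairwise angles.
rotation = nn.utils.parametrizations.orthogonal(nn.Linear(d, d, bias=False))

item_emb = torch.randn(32, d)   # a batch of hypothetical item embeddings
rotated = rotation(item_emb)    # same geometry, new coordinate basis

R = rotation.weight             # orthogonality holds by construction
print(torch.allclose(R @ R.T, torch.eye(d), atol=1e-5))  # True
```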
Fast Video Generation with Sliding Tile Attention
Published: 2/7/2025
Tags: Sliding Tile Attention Mechanism, Video Diffusion Generation Models, Efficient Attention Mechanism, HunyuanVideo, Computational Efficiency Optimization
The study introduces Sliding Tile Attention (STA) to reduce computational bottlenecks in video generation, achieving 58.79% Model FLOPs Utilization while decreasing latency to 501 seconds without quality loss, demonstrating significant efficiency improvements over existing methods.
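
A rough 1-D sketch of tile-local attention follows (STA itself operates on 3-D video latents): each query tile attends only to key tiles within a fixed window of itself. For clarity this sketch applies the tile mask to a dense attention call, which does not realize the FLOP savings; the paper's speedup comes from skipping masked tiles entirely with a tile-contiguous layout. The `tile` and `window` values are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def sliding_tile_attention_1d(q, k, v, tile: int = 16, window: int = 1):
    """Tile-local attention, 1-D illustrative version.

    Each tile of `tile` queries attends only to key tiles within
    `window` tiles on either side, so the attended region grows
    linearly with sequence length.
    """
    B, H, L, D = q.shape
    assert L % tile == 0
    num_tiles = L // tile
    idx = torch.arange(num_tiles, device=q.device)
    tile_mask = (idx[None, :] - idx[:, None]).abs() <= window      # (T, T)
    # Expand the tile-level mask to token level: True = may attend.
    mask = tile_mask.repeat_interleave(tile, 0).repeat_interleave(tile, 1)
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

q = k = v = torch.randn(1, 4, 128, 32)    # (batch, heads, length, head_dim)
out = sliding_tile_attention_1d(q, k, v)  # -> shape (1, 4, 128, 32)
```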