Papers
Tag Filter: Training-Free Acceleration Methods
SPECTRA: Faster Large Language Model Inference with Optimized Internal and External Speculation
Published: 1/1/2025
LLM Reasoning Capacity Enhancement · Training-Free Acceleration Methods · Training-Independent Inference Optimization · Utilization of Internal and External Speculation
SPECTRA is a novel framework that accelerates large language model inference through optimized internal and external speculation, requiring no additional training. It achieves up to a 4.08× speedup over state-of-the-art methods across various benchmarks, and its implementation is publicly available.
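The draft-then-verify speculation that SPECTRA optimizes can be sketched as follows. This is a minimal illustration of generic speculative decoding, not SPECTRA's actual algorithm; `draft_model` and `target_model` are toy stand-ins for a cheap drafter and an expensive target LLM.

```python
# Minimal sketch of draft-then-verify speculative decoding (the generic
# scheme SPECTRA builds on); the "models" are toy deterministic functions.

def draft_model(context):
    # Cheap drafter: guesses the next token as (last token + 1) mod 10.
    return (context[-1] + 1) % 10

def target_model(context):
    # Expensive target: agrees with the drafter except after token 4.
    return (context[-1] + 1) % 10 if context[-1] != 4 else 0

def speculative_step(context, k=4):
    """Draft k tokens cheaply, then keep the longest prefix the target agrees with."""
    drafts = []
    ctx = list(context)
    for _ in range(k):
        t = draft_model(ctx)
        drafts.append(t)
        ctx.append(t)
    accepted = []
    ctx = list(context)
    for t in drafts:
        expected = target_model(ctx)  # in practice: one batched target-model pass
        if t == expected:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(expected)  # target's correction ends the step
            break
    return accepted

print(speculative_step([1]))  # → [2, 3, 4, 0]
```

One target-model pass thus yields several accepted tokens whenever the drafter is right, which is where the speedup comes from.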
Information to Users
Published: 9/1/1989
Training-Free Acceleration Methods · LLM Security Mechanism · Robotic Action Learning · Math Reasoning Benchmarks · Text-to-Image Generation
The paper examines concurrency control algorithms for real-time database systems, highlighting flaws in existing approaches and potential ways to improve algorithm efficiency, contributing significantly to the reliability of real-time data processing.
Inductive Generative Recommendation via Retrieval-based Speculation
Published: 10/4/2024
Generative Recommendation Systems · Training-Free Acceleration Methods · Online Recommendation System Optimization · Sequential Recommender Systems · Image Generation
The paper introduces a retrieval-based inductive generative recommendation framework that addresses the limitations of generative models in recommending unseen items by using a drafter model for candidate generation and a generative model for verification.
Denoising Diffusion Probabilistic Models
Published: 6/20/2020
Diffusion Models · Image Synthesis · Training-Free Acceleration Methods · CIFAR10 Dataset · Progressive Lossy Decompression
The paper presents a novel denoising diffusion probabilistic model inspired by non-equilibrium thermodynamics, achieving high-quality image synthesis. By training on a weighted variational bound, it establishes a new connection with denoising score matching, attaining competitive results.
Training LLM Agents to Empower Humans
Published: 10/8/2025
Large Language Model Fine-Tuning · LLM-guided motion planning · Training-Free Acceleration Methods · Mechanism of RL Preserving Prior Knowledge
This work introduces an unsupervised LLM fine-tuning method that maximizes human empowerment, improving assistive-agent effectiveness without extra human feedback, validated by user studies and coding benchmarks showing higher acceptance and success rates.
RLPIR: Reinforcement Learning with Prefix and Intrinsic Reward
Published: 10/8/2025
RL Training for Large Language Models · Sequence Policy Optimization · Training-Free Acceleration Methods · LLM Reasoning Capacity Enhancement
RLPIR introduces a verifier-free RL framework using prefix rollout and intrinsic rewards, achieving performance comparable to RLVR with 7× faster training and 45% shorter reasoning sequences, enhancing LLM efficiency without relying on ground-truth labels.
JURY-RL: Votes Propose, Proofs Dispose for Label-Free RLVR
Published: 10/8/2025
RL Training for Large Language Models · Training-Free Acceleration Methods · Reinforcement Learning for Math Reasoning · Sequence Policy Optimization
JURY-RL separates answer proposal via voting from reward disposal via theorem proving, using ResZero for unverifiable cases; this stabilizes RL training and outperforms label-free baselines on reasoning and code tasks, rivaling supervised training.
PEARL: Towards Permutation-Resilient LLMs
Published: 2/20/2025
RL Training for Large Language Models · LLM Reasoning Capacity Enhancement · Sequence Policy Optimization · Training-Free Acceleration Methods
PEARL uses distributionally robust optimization and a permutation-proposal network to strengthen LLMs' resilience against worst-case input orderings, effectively mitigating permutation attacks and boosting performance across varied contexts.
EdgeShard: Efficient LLM Inference via Collaborative Edge Computing
Published: 5/23/2024
LLM Reasoning Capacity Enhancement · Training-Free Acceleration Methods · Collaborative Edge Computing Inference · Model Sharding Deployment · Dynamic Programming Optimization
EdgeShard uses collaborative edge computing to shard LLMs across distributed devices, optimizing latency and throughput via dynamic programming, reducing inference delay by 50% and doubling throughput while addressing cloud dependency challenges.
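The dynamic-programming partitioning idea can be sketched as below. This is not EdgeShard's actual formulation: the cost model (per-layer compute cost, per-device speed, a flat inter-device link cost) and the function signature are illustrative assumptions for splitting contiguous layer shards across a fixed chain of devices.

```python
# Sketch of DP-based model sharding: place contiguous layer ranges on a
# chain of devices so total latency (compute + transfer) is minimized.
# All costs here are made-up illustrative numbers.

def best_partition(layer_cost, device_speed, link_cost):
    """dp[i][d] = min latency with the first i layers placed and layer i-1 on device d."""
    n, m = len(layer_cost), len(device_speed)
    INF = float("inf")
    dp = [[INF] * m for _ in range(n + 1)]
    # Device 0 hosts the first shard.
    for i in range(1, n + 1):
        dp[i][0] = sum(layer_cost[:i]) / device_speed[0]
    for d in range(1, m):
        for i in range(1, n + 1):
            for k in range(1, i):  # layers [k, i) form device d's shard
                if dp[k][d - 1] < INF:
                    compute = sum(layer_cost[k:i]) / device_speed[d]
                    dp[i][d] = min(dp[i][d], dp[k][d - 1] + link_cost + compute)
    return min(dp[n])  # best over how many devices the chain actually uses

latency = best_partition([4, 4, 4, 4], [1, 2], 1.0)
print(latency)  # 11.0: layer 0 on the slow device, layers 1-3 on the fast one
```

Even this toy version shows why offloading pays: keeping everything on the slow device costs 16, while paying one unit of transfer to the faster device cuts latency to 11.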
Learning to Reason without External Rewards
Published: 5/26/2025
RL Training for Large Language Models · Sequence Policy Optimization · Training-Free Acceleration Methods · Reinforcement Learning for Math Reasoning
Intuitor leverages a model's self-certainty as an intrinsic reward in reinforcement learning, enabling unsupervised training of LLMs for complex reasoning with strong cross-domain generalization and no reliance on external labels or rewards.
dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching
Published: 5/17/2025
Diffusion Model Fine-Tuning · Efficient Inference of Diffusion Models · LLM Reasoning Capacity Enhancement · Training-Free Acceleration Methods · Auto-Regressive Diffusion Model
dLLM-Cache is a training-free adaptive caching method that accelerates diffusion large language models by reusing intermediate computations, achieving up to a 9.1× speedup on LLaDA 8B and Dream 7B without degrading output quality.
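The general adaptive-caching idea can be sketched as follows: across diffusion denoising steps, a layer's inputs often change little, so its cached output can be reused instead of recomputed. This is a minimal illustration of that pattern, not dLLM-Cache's actual method; `expensive_layer` and the drift metric are assumed stand-ins.

```python
import math

# Sketch of adaptive caching across diffusion denoising steps: reuse a
# layer's cached output unless its input drifted beyond a threshold.

def expensive_layer(x):
    return [math.tanh(v) for v in x]  # stand-in for a transformer block

class AdaptiveCache:
    def __init__(self, threshold=0.05):
        self.threshold = threshold
        self.last_input = None
        self.last_output = None
        self.recomputes = 0

    def __call__(self, x):
        if self.last_input is not None:
            drift = max(abs(a - b) for a, b in zip(x, self.last_input))
            if drift < self.threshold:
                return self.last_output  # input barely moved: reuse the cache
        self.recomputes += 1
        self.last_input = list(x)
        self.last_output = expensive_layer(x)
        return self.last_output

cache = AdaptiveCache()
x = [0.5, -0.2]
for step in range(10):
    y = cache(x)
    x = [v * 0.99 for v in x]  # inputs drift slowly across denoising steps

print(cache.recomputes)  # 1 recompute; the other 9 steps reuse the cache
```

The speedup depends on how slowly activations drift between steps; the threshold trades compute savings against output fidelity.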
OneFlowSeq: Achieving One-Step Generation for Diffusion Language Models via Lightweight Distillation
Published: 10/8/2025
Diffusion Model Fine-Tuning · Auto-Regressive Diffusion Model · Large Language Model Fine-Tuning · Sequence Policy Optimization · Training-Free Acceleration Methods
OneFlowSeq distills a multi-step diffusion teacher into a single-step generator using MeanFlow supervision and Jacobian-vector-product signals, greatly accelerating inference and improving performance with 1600× fewer trainable parameters.