Papers

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Published:3/19/2025
Generalist Humanoid Robot Foundation Model, Vision-Language-Action Model, Diffusion Transformer Module, Humanoid Robot Manipulation Tasks, Multimodal Data Training
GR00T N1 is an open foundation model for humanoid robots, integrating a reasoning module and a motion generation module. Trained end-to-end with a pyramid of heterogeneous data, it outperforms existing imitation learning methods in simulation benchmarks, demonstrating high perfor
SAFE: Multitask Failure Detection for Vision-Language-Action Models
Published:6/12/2025
Failure Detection for Vision-Language-Action Models, Multitask Failure Detection, Feature-Based Failure Prediction, Generalist Robot Policies, Adaptive Failure Alert System
SAFE introduces a multitask failure detection system for vision-language-action models, leveraging internal features to predict task failure. It demonstrates superior performance across various environments, balancing accuracy and timeliness, and supports zero-shot generalization
On the role of theories in consciousness science
Published:11/26/2025
Role of Theories in Consciousness Science, Advances in Neuroscience Technology, Theoretical and Empirical Developments in Consciousness Science
This paper critiques the role of theories in consciousness science, advocating for focused 'humbler theories.' It argues that the 'uniformity assumption' limits theoretical effectiveness, promoting testable theories to enhance the interactive loop between theory and experimentati
Curriculum Conditioned Diffusion for Multimodal Recommendation
Published:4/11/2025
Multimodal Recommendation Systems, Diffusion Models, Knowledge-Aware Negative Sampling, Curriculum Learning Framework, Multimodal Aligning Module
The Curriculum Conditioned Diffusion framework (CCDRec) addresses data sparsity in multimodal recommendation by integrating diffusion models with negative sampling, enhancing personalization through the exploration of modality correlation. Its effectiveness and robustness are val
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
Published:9/9/2025
Diffusion Model Alignment with Human Preferences, Direct-Align Method, Semantic Relative Preference Optimization (SRPO), Predefined Noise Prior, Image Recovery and Interpolation Techniques
The paper introduces Direct-Align, a method that optimizes diffusion models by predefining noise priors to align with fine-grained human preferences, addressing multi-step denoising costs and enabling real-time reward adjustments through Semantic Relative Preference Optimization (
S-SYNC: Shuttle and Swap Co-Optimization in Quantum Charge-Coupled Devices
Published:5/2/2025
Quantum Charge-Coupled Device (QCCD), Scheduling Heuristics, Quantum Application Success Rate Optimization, Quantum Computing Compiler, Qubit Shuttle and Swap
The S-SYNC compiler optimizes shuttling and swapping operations in Quantum Charge-Coupled Devices (QCCD), reducing shuttling by 3.69 times and improving quantum application success rates by 1.73 times, thereby advancing the practical use of quantum computing.
Diffusion Transformers with Representation Autoencoders
Published:10/14/2025
Diffusion Transformers with Representation Autoencoders, Representation Autoencoders, Improvement of Image Generation Quality, High-Dimensional Latent Space Modeling, DINO and MAE Encoders
This paper introduces Representation Autoencoders (RAEs) that replace traditional Variational Autoencoders (VAEs) with pretrained representation encoders, enhancing image generation quality in Diffusion Transformers (DiTs). RAEs achieve high-quality reconstructions and rich seman
Evolution Strategies at the Hyperscale
Published:11/21/2025
Evolution Strategies Algorithm, Low-Rank Learning Optimization, Large-Scale Neural Network Optimization, Gradient-Free Optimization Methods, Optimization in Reinforcement Learning Settings
This paper introduces EGGROLL, a low-rank learning optimization method that scales backpropagation-free optimization for large neural networks, significantly reducing computational and memory costs while maintaining performance even with billions of parameters.
CipherGPT: Secure Two-Party GPT Inference
Secure Two-Party GPT Inference, Encrypted Matrix Multiplication, Secure GELU Computation Protocol, Secure Top-k Sampling Protocol, GPT Inference Optimization
CipherGPT is a framework addressing user privacy in GPT inference. It innovatively optimizes secure matrix multiplication and GELU computation, achieving significant performance gains. Additionally, it introduces a secure top-k sampling protocol, providing comprehensive benchmark
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
Published:8/29/2023
Diffusion Model Image Restoration, Blind Image Restoration, High-Fidelity Image Reconstruction, Image Quality Enhancement, Region-Adaptive Restoration Guidance
DiffBIR is a unified image restoration pipeline for blind image tasks, involving degradation removal and information regeneration. It utilizes IRControlNet for realistic detail generation and introduces region-adaptive guidance for user-tunable balance between realness and fideli
Adenine base editor engineering reduces editing of bystander cytosines
Published:7/1/2021
Adenine Base Editor Engineering, Deaminase Engineering, Base Editing Tool Optimization, Specific Base Conversion, Targeted Editing Improvement
Engineered adenine base editor TadA7.10 significantly reduces unintended cytosine deamination with the D108Q mutation. This mutation lowers activity tenfold and is compatible with V106W. A P48R mutation enhances cytosine activity while decreasing adenine editing, creating a new T
ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation
Published:6/27/2025
Multi-Agent Personalized Recommendation System, Augmented Retrieval-Generation Framework, LLM-based Recommendation Systems, Dynamic User Preference Modeling, Recommendation System Evaluation
The ARAG framework enhances personalized recommendations by integrating multi-agent collaboration into Retrieval-Augmented Generation. It employs agents for user understanding, natural language inference, context summarization, and item ranking, outperforming traditional RAG me
MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning
Published:10/8/2025
Multi-Agent Retrieval-Augmented Generation, Collaborative Chain-of-Thought Reasoning, Task-Aware Reasoning, Dynamic Workflow Optimization, Modular Reasoning-Driven Architecture
MA-RAG is a multi-agent framework for Retrieval-Augmented Generation, addressing ambiguities in complex information tasks. It employs specialized agents to decompose tasks and enables dynamic workflows, outperforming leading baseline models and validating collaborative reasoning'
HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation
Published:10/8/2025
RL Training for Large Language Models, Agentic Retrieval-Augmented Generation, Hierarchical Process Rewards, Knowledge-Grounded Process Rewards, Search Decision Optimization
HiPRAG introduces a novel hierarchical process rewards method to tackle common over-search and under-search issues in agentic retrieval-augmented generation, improving search efficiency and accuracy significantly across multiple QA benchmarks, demonstrating the importance of opti
RecoveryChaining: Learning Local Recovery Policies for Robust Manipulation
Published:10/18/2024
Robot Recovery Policy Learning, Hierarchical Reinforcement Learning, Complex Manipulation Tasks, Model-Based Controllers, Sparse Reward Optimization
This study introduces RecoveryChaining, a hierarchical reinforcement learning method for robust recovery in complex manipulation tasks. By integrating nominal controllers as options, it enables effective recovery strategies upon failure detection, enhancing robustness and success
Towards Domain-Specific Network Transport for Distributed DNN Training
Distributed Deep Learning Network Transport, DNN Communication Optimization, Network Traffic Pattern Analysis, Domain-Specific Network Protocols, Congestion Control Techniques
This paper presents MLT, a domain-specific network transport protocol for distributed DNN training, optimizing congestion control and data transfer techniques. Extensive evaluations show significant improvements in training throughput and network utilization, addressing performan
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Published:8/12/2024
Text-to-Video Generation, Diffusion Models, Diffusion Transformer, 3D Variational Autoencoder, Video Generation Quality Improvement
CogVideoX is a large-scale text-to-video model using a diffusion transformer that generates 10-second videos at 16 fps and 768×1360 resolution. It addresses coherence and semantic alignment issues with methods like 3D VAE and expert transformers, achieving significant quality imp
Scalable Diffusion Models with Transformers
Published:12/20/2022
Diffusion Models, Transformer architecture, Image Generation, Scalable Diffusion Models, Class-Conditional Image Generation
This study introduces Diffusion Transformers (DiTs), which replace U-Net with a transformer architecture for image generation. Higher Gflops correlate with better performance (lower FID), with the largest model achieving state-of-the-art results on ImageNet benchmarks.
Hiding in the AI Traffic: Abusing MCP for LLM-Powered Agentic Red Teaming
Published:11/20/2025
MCP-Based Command and Control Architecture, Autonomous Red Teaming Systems, Generative Cyber Offense and Defense, Autonomous Penetration Testing, Adversarial Behavior Analysis
This research presents a novel Command and Control architecture using the Model Context Protocol to facilitate stealthy autonomous reconnaissance agents, enhancing offensive cybersecurity by improving goal-directed behaviors and eliminating network artifacts that could reveal com
Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives
Published:11/8/2024
Foundations of Bradley-Terry Models, Preference-Based Reward Modeling, Multi-Task Evaluation for Large Language Models, Order Consistency in Reward Modeling, Reward Models Based on Deep Neural Networks
This paper reevaluates the Bradley-Terry model in preference-based reward modeling, establishes a theoretical foundation for convergence using deep neural networks, argues it's not strictly necessary for optimization, proposes an alternative upper-bound algorithm, and validates m