Papers

SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing
Published: 7/23/2025
Spatial Audio Language Model, Multimodal Contrastive Learning, Structured Audio Embeddings, Spatial Audio Understanding and Editing, Zero-Shot Direction Classification
The paper presents SALM, a model that aligns spatial audio with natural language through multimodal contrastive learning, utilizing structured embeddings for separate and joint representation of semantic and spatial information, supporting zero-shot direction classification and t
Study of AI‑Driven Fashion Recommender Systems
Published: 7/5/2023
Fashion Recommender Systems, Image-Based Recommendation Systems, AI Applications in Recommender Systems, User-Item Relationship Modeling, Fashion Industry Data Analysis
This paper reviews the application of AI in fashion recommender systems over the last decade, emphasizing deep learning and computer vision. AI offers superior recommendations by addressing product diversity and compatibility, helping to mitigate consumer choice overload.
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning
Published: 12/8/2025
Self-Distilled Reinforcement Learning for Large Language Models, Parallel Cognitive Reasoning, Adaptive Decomposition Policy Optimization, Large-Scale Parallel Reinforcement Learning Training, Model Memory Management and Flow Control
The Native Parallel Reasoner (NPR) framework enables Large Language Models to achieve genuine parallel reasoning through self-distilled training and novel policy optimization. NPR shows up to 24.5% performance improvements and 4.6x speedup in benchmarks, establishing a new standa
KV-Edit: Training-Free Image Editing for Precise Background Preservation
Published: 2/25/2025
Training-Free Image Editing, Background Consistency Preservation, DiT Models for Image Generation, Memory Optimization Methods, Content Generation in User-Provided Regions
KV-Edit is a training-free image editing method addressing background consistency by using the KV cache mechanism in Diffusion Transformers. It preserves background tokens, enabling seamless foreground-background integration, significantly outperforming existing techniques in qua
Recommender Systems with Generative Retrieval
Published: 5/9/2023
Generative Recommendation Systems, Semantic ID-based Recommendation Model, Transformer Sequence-to-Sequence Model, Approximate Nearest Neighbor Search, User Behavior Prediction
This paper introduces a novel generative retrieval method using autoregressive decoding of Semantic IDs to enhance recommender system performance. A Transformer-based model effectively predicts the next item a user will interact with. Experiments show substantial improvements ove
Motion Inversion for Video Customization
Published: 3/29/2024
Motion Customization in Video Generation, Temporal Transformer Modules, Motion Embedding Representation, Motion Query-Key Embedding, Differential Operation Strategy
This paper presents a novel motion customization method using Motion Embeddings, addressing gaps in motion representation for video generation. It integrates with temporal transformer modules to optimize self-attention across frames, ensuring focus on motion and reducing appearan
Unsupervised Learning of Video Representations using LSTMs
Published: 2/17/2015
Unsupervised Learning of Video Representations, LSTM Applications, Human Action Recognition, UCF-101 Dataset, HMDB-51 Dataset
This study presents an unsupervised method for learning video representations using multilayer LSTMs, showing improved classification accuracy in human action recognition on the UCF-101 and HMDB-51 datasets, especially with limited training samples.
MotionClone: Training-Free Motion Cloning for Controllable Video Generation
Published: 6/8/2024
Motion Cloning, Training-Free Controllable Video Generation, Sparse Temporal Attention Mechanism, Video Generation System, Motion Representation Extraction
MotionClone is a training-free framework for motion cloning from reference videos, facilitating controllable video generation tasks like text-to-video and image-to-video. It efficiently extracts motion representations using sparse temporal attention weights, excelling in motion f
Plan, Posture and Go: Towards Open-World Text-to-Motion Generation
Published: 12/22/2023
Open-World Text-to-Motion Generation, LLM-guided motion planning, Posture Diffusion Model, Motion Generation Framework, CLIP Model Application in Motion Generation
The PRO-Motion framework introduces a three-module approach to enhance text-to-motion generation, addressing challenges in open-world scenarios. By using a large language model to create structured scripts, it generates diverse and realistic 3D motions from complex natural langua
Successful Qualitative Research: A Practical Guide for Beginners
Published: 1/1/2016
Qualitative Research Methods, Beginner's Guide, Research Methodology
This chapter offers practical guidance for beginners in qualitative research, emphasizing demystification, practice over theory, comprehensive support, the development of qualitative sensibility, and simplified pattern analysis to aid understanding and application.
Approximate Relational Reasoning for Quantum Programs
Published: 1/1/2024
Approximate Relational Reasoning for Quantum Programs, Formal Verification of Quantum Fourier Transform, Approximate Quantum Coupling Method, Robustness Assessment of Quantum Programs, Approximate Correctness Verification of Repeat-Until-Success Alg
This paper introduces a proof system for verifying approximate relational properties of quantum programs, addressing imperfections in quantum computation. It achieves formal verification of low-depth approximations of the quantum Fourier transform and the repeat-until-success alg
“Stroppy Bitches Who Just Need to Learn How to Settle”? Young Single Women and Norms of Femininity and Heterosexuality
Published: 1/6/2018
Experiences of Young Women Being Single, Gender Roles and Social Norms, Beauty Standards and Gender Oppression, Expectations in Heterosexual Relationships, Impact of Traditional Gender Ideologies
This study explores the experiences of young heterosexual single women (ages 25-35) in New Zealand, highlighting pressures related to traditional femininity and heterosexual norms, such as beauty standards and mandatory coupling, showing both resistance and compliance.
Adding Conditional Control to Text-to-Image Diffusion Models
Published: 2/11/2023
Conditional Control for Text-to-Image Diffusion Models, ControlNet Architecture, Deep Learning-Based Conditional Controls, Testing Multiple Conditioning Controls, Stable Diffusion Model
The paper introduces ControlNet, a neural network architecture that adds spatial conditioning controls to text-to-image diffusion models, maintaining the robustness of pretrained models. Experiments demonstrate its strong performance on both small and large datasets, expanding t
OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation
Published: 1/19/2023
OmniObject3D Dataset, Realistic 3D Reconstruction, 3D Object Generation, Large-Scale 3D Object Classification, Novel-View Synthesis
OmniObject3D is a large-vocabulary 3D object dataset with 6,000 high-quality real scans across 190 categories, featuring rich annotations. It aims to advance 3D perception, reconstruction, and generation research with four evaluation tasks.
GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis
Published: 8/7/2023
Logic Vulnerability Detection in Smart Contracts, Combination of GPT with Program Analysis, Static Analysis Tools, Large Language Models for Vulnerability Detection, Solidity Code Analysis
The paper introduces GPTScan, the first tool that combines GPT with static analysis for detecting logic vulnerabilities in smart contracts. It breaks down vulnerabilities into scenarios and properties, achieving over 90% accuracy in token contracts and identifying previously miss
GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation
Published: 6/22/2024
Geometry-Aware Large Reconstruction Model, 3D Gaussian Generation, 3D-aware Transformer Structure, Sparse 3D Structure Optimization, Deformable Cross-Attention Mechanism
This study introduces GeoLRM, a geometry-aware large reconstruction model that efficiently generates high-quality 3D assets with 512k Gaussians from 21 input images using only 11 GB of GPU memory. It addresses limitations of existing methods by utilizing a novel 3D-aware transfor
Wide-FOV 3D Pancake VR Enabled by a Light Field Display Engine
Light Field Display Engine, Wide-FOV 3D Virtual Reality, Computational Focus Cues, Micro-LCD, Optical Pancake Design
This paper presents a novel true-3D Pancake VR system using a light field display engine and computational focus cues, achieving high-resolution images. It addresses FOV reduction due to aberrations with a telecentric path, experimentally confirming clear 3D images with a 68.6de
The differential influence of Achievement Motivation on Subjective Well-being and the moderating role of Self-control
Published: 9/27/2024
Achievement Motivation and Subjective Well-Being, Moderating Role of Self-Control, Psychological Health Studies in College Students, Self-Management and Well-Being
This study surveyed 1,017 Chinese college students to explore the relationship between achievement motivation and subjective well-being, revealing that self-control significantly moderates this relationship. High self-control enhances positive impacts of success motivation and mi
Deciphering the impact of machine learning on education: Insights from a bibliometric analysis using bibliometrix R-package
Published: 5/6/2024
Impact Analysis of Machine Learning in Education, Bibliometrix Statistical Analysis Method, Interdisciplinary Research on Machine Learning and Education, Trends and Patterns in Educational Research, Challenges and Ethical Considerations in ML Education Applicatio
This study uses bibliometric analysis to explore machine learning's impact on education, revealing its transformative potential for teaching methods. Analyzing 970 articles from 2000 to 2023 identifies growth patterns and key contributors, providing a comprehensive roadmap for in
A multifactorial model of intrinsic / environmental motivators, personal traits and their combined influences on math performance in elementary school
Achievement Goals Model for Math Performance, Influence of Self-Efficacy and Interest, Role of Environmental Factors in Learning Motivation, Holistic Multifactorial Path Analysis, Elementary School Math Learning Research
This study develops a comprehensive multifactorial path analysis model to explore the influences of intrinsic and environmental motivators and personality traits on math performance among elementary students. Results from 762 Cypriot students highlight self-efficacy and interest