Page 15 - Paper Library - AiPaper

SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing

Published: 7/23/2025

Spatial Audio Language Model, Multimodal Contrastive Learning, Structured Audio Embeddings, Spatial Audio Understanding and Editing, Zero-Shot Direction Classification

The paper presents SALM, a model that aligns spatial audio with natural language through multimodal contrastive learning, utilizing structured embeddings for separate and joint representation of semantic and spatial information, supporting zero-shot direction classification and t

Study of AI‑Driven Fashion Recommender Systems

Published: 7/5/2023

Fashion Recommender Systems, Image-Based Recommendation Systems, AI Applications in Recommender Systems, User-Item Relationship Modeling, Fashion Industry Data Analysis

This paper reviews the application of AI in fashion recommender systems over the last decade, emphasizing deep learning and computer vision. AI offers superior recommendations by addressing product diversity and compatibility, helping to mitigate consumer choice overload.

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Published: 12/8/2025

Self-Distilled Reinforcement Learning for Large Language Models, Parallel Cognitive Reasoning, Adaptive Decomposition Policy Optimization, Large-Scale Parallel Reinforcement Learning Training, Model Memory Management and Flow Control

The Native Parallel Reasoner (NPR) framework enables Large Language Models to achieve genuine parallel reasoning through self-distilled training and novel policy optimization. NPR shows up to 24.5% performance improvements and 4.6x speedup in benchmarks, establishing a new standa

KV-Edit: Training-Free Image Editing for Precise Background Preservation

Published: 2/25/2025

Training-Free Image Editing, Background Consistency Preservation, DiT Models for Image Generation, Memory Optimization Methods, Content Generation in User-Provided Regions

KV-Edit is a training-free image editing method addressing background consistency by using the KV cache mechanism in Diffusion Transformers. It preserves background tokens, enabling seamless foreground-background integration, significantly outperforming existing techniques in qua

Recommender Systems with Generative Retrieval

Published: 5/9/2023

Generative Recommendation Systems, Semantic ID-based Recommendation Model, Transformer Sequence-to-Sequence Model, Approximate Nearest Neighbor Search, User Behavior Prediction

This paper introduces a novel generative retrieval method using autoregressive decoding of Semantic IDs to enhance recommender system performance. A Transformer-based model effectively predicts the next item a user will interact with. Experiments show substantial improvements ove

Motion Inversion for Video Customization

Published: 3/29/2024

Motion Customization in Video Generation, Temporal Transformer Modules, Motion Embedding Representation, Motion Query-Key Embedding, Differential Operation Strategy

This paper presents a novel motion customization method using Motion Embeddings, addressing gaps in motion representation for video generation. It integrates with temporal transformer modules to optimize self-attention across frames, ensuring focus on motion and reducing appearan

Unsupervised Learning of Video Representations using LSTMs

Published: 2/17/2015

Unsupervised Learning of Video Representations, LSTM Applications, Human Action Recognition, UCF-101 Dataset, HMDB-51 Dataset

This study presents an unsupervised method for learning video representations using multilayer LSTMs, showing improved classification accuracy in human action recognition on the UCF-101 and HMDB-51 datasets, especially with limited training samples.

MotionClone: Training-Free Motion Cloning for Controllable Video Generation

Published: 6/8/2024

Motion Cloning, Training-Free Controllable Video Generation, Sparse Temporal Attention Mechanism, Video Generation System, Motion Representation Extraction

MotionClone is a training-free framework for motion cloning from reference videos, facilitating controllable video generation tasks like text-to-video and image-to-video. It efficiently extracts motion representations using sparse temporal attention weights, excelling in motion f

Plan, Posture and Go: Towards Open-World Text-to-Motion Generation

Published: 12/22/2023

Open-World Text-to-Motion Generation, LLM-guided motion planning, Posture Diffusion Model, Motion Generation Framework, CLIP Model Application in Motion Generation

The PRO-Motion framework introduces a three-module approach to enhance text-to-motion generation, addressing challenges in open-world scenarios. By using a large language model to create structured scripts, it generates diverse and realistic 3D motions from complex natural langua

Successful Qualitative Research: A Practical Guide for Beginners

Published: 1/1/2016

Qualitative Research Methods, Beginner's Guide, Research Methodology

This chapter offers practical guidance for beginners in qualitative research, emphasizing demystification, practice over theory, comprehensive support, the development of qualitative sensibility, and simplified pattern analysis to aid understanding and application.

Approximate Relational Reasoning for Quantum Programs

Published: 1/1/2024

Approximate Relational Reasoning for Quantum Programs, Formal Verification of Quantum Fourier Transform, Approximate Quantum Coupling Method, Robustness Assessment of Quantum Programs, Approximate Correctness Verification of Repeat-Until-Success Alg

This paper introduces a proof system for verifying approximate relational properties of quantum programs, addressing imperfections in quantum computation. It achieves formal verification of low-depth approximations of the quantum Fourier transform and the repeat-until-success alg

“Stroppy Bitches Who Just Need to Learn How to Settle”? Young Single Women and Norms of Femininity and Heterosexuality

Published: 1/6/2018

Experiences of Young Women Being Single, Gender Roles and Social Norms, Beauty Standards and Gender Oppression, Expectations in Heterosexual Relationships, Impact of Traditional Gender Ideologies

This study explores the experiences of young heterosexual single women (ages 25-35) in New Zealand, highlighting pressures related to traditional femininity and heterosexual norms, such as beauty standards and mandatory coupling, showing both resistance and compliance.

Adding Conditional Control to Text-to-Image Diffusion Models

Published: 2/11/2023

Conditional Control for Text-to-Image Diffusion Models, ControlNet Architecture, Deep Learning-Based Conditional Controls, Testing Multiple Conditioning Controls, Stable Diffusion Model

The paper introduces ControlNet, a neural network architecture that adds spatial conditioning controls to text-to-image diffusion models, maintaining the robustness of pretrained models. Experiments demonstrate its strong performance on both small and large datasets, expanding t

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation

Published: 1/19/2023

OmniObject3D Dataset, Realistic 3D Reconstruction, 3D Object Generation, Large-Scale 3D Object Classification, Novel-View Synthesis

OmniObject3D is a large-vocabulary 3D object dataset with 6,000 high-quality real scans across 190 categories, featuring rich annotations. It aims to advance 3D perception, reconstruction, and generation research with four evaluation tasks.

GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis

Published: 8/7/2023

Logic Vulnerability Detection in Smart Contracts, Combination of GPT with Program Analysis, Static Analysis Tools, Large Language Models for Vulnerability Detection, Solidity Code Analysis

The paper introduces GPTScan, the first tool that combines GPT with static analysis for detecting logic vulnerabilities in smart contracts. It breaks down vulnerabilities into scenarios and properties, achieving over 90% accuracy in token contracts and identifying previously miss

GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation

Published: 6/22/2024

Geometry-Aware Large Reconstruction Model, 3D Gaussian Generation, 3D-aware Transformer Structure, Sparse 3D Structure Optimization, Deformable Cross-Attention Mechanism

This study introduces GeoLRM, a geometry-aware large reconstruction model that efficiently generates high-quality 3D assets with 512k Gaussians from 21 input images using only 11 GB of GPU memory. It addresses limitations of existing methods by utilizing a novel 3D-aware transfor

Wide-FOV 3D Pancake VR Enabled by a Light Field Display Engine

Light Field Display Engine, Wide-FOV 3D Virtual Reality, Computational Focus Cues, Micro-LCD, Optical Pancake Design

This paper presents a novel true-3D Pancake VR system using a light field display engine and computational focus cues, achieving high-resolution images. It addresses FOV reduction due to aberrations with a telecentric path, experimentally confirming clear 3D images with a 68.6de

The differential influence of Achievement Motivation on Subjective Well-being and the moderating role of Self-control

Published: 9/27/2024

Achievement Motivation and Subjective Well-Being, Moderating Role of Self-Control, Psychological Health Studies in College Students, Self-Management and Well-Being

This study surveyed 1,017 Chinese college students to explore the relationship between achievement motivation and subjective well-being, revealing that self-control significantly moderates this relationship. High self-control enhances positive impacts of success motivation and mi

Deciphering the impact of machine learning on education: Insights from a bibliometric analysis using bibliometrix R-package

Published: 5/6/2024

Impact Analysis of Machine Learning in Education, Bibliometrix Statistical Analysis Method, Interdisciplinary Research on Machine Learning and Education, Trends and Patterns in Educational Research, Challenges and Ethical Considerations in ML Education Applicatio

This study uses bibliometric analysis to explore machine learning's impact on education, revealing its transformative potential for teaching methods. Analyzing 970 articles from 2000 to 2023 identifies growth patterns and key contributors, providing a comprehensive roadmap for in

A multifactorial model of intrinsic / environmental motivators, personal traits and their combined influences on math performance in elementary school

Achievement Goals Model for Math Performance, Influence of Self-Efficacy and Interest, Role of Environmental Factors in Learning Motivation, Holistic Multifactorial Path Analysis, Elementary School Math Learning Research

This study develops a comprehensive multifactorial path analysis model to explore the influences of intrinsic and environmental motivators and personality traits on math performance among elementary students. Results from 762 Cypriot students highlight self-efficacy and interest
