Papers
BatmanNet: Bi-branch Masked Graph Transformer Autoencoder for Molecular Representation
Published: 11/25/2022
Molecular Representation Learning for Biomedicine · Self-Supervised Learning in Graph Neural Networks · BatmanNet Graph Transformer Autoencoder · Improving Performance in Drug Discovery Tasks · Large-Scale Molecular Datasets
BatmanNet, a novel bi-branch masked graph transformer autoencoder, is proposed for effective molecular representation learning. It uses a simple self-supervised strategy to capture both local and global information, achieving state-of-the-art results in drug discovery tasks.
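To make the masking objective concrete, here is a minimal sketch of the idea behind masked graph autoencoding, with hypothetical `encoder`/`decoder` callables; BatmanNet's bi-branch design masks both nodes and edges, while this toy version masks node features only:

```python
# Hypothetical masked-node reconstruction objective.
import torch

def masked_node_loss(encoder, decoder, x, mask_ratio=0.6):
    """x: (num_nodes, feat_dim) node feature matrix."""
    masked = torch.rand(x.shape[0]) < mask_ratio   # choose nodes to hide
    x_in = x.clone()
    x_in[masked] = 0.0                             # zero out as a mask token
    recon = decoder(encoder(x_in))                 # reconstruct all features
    return ((recon[masked] - x[masked]) ** 2).mean()

# Toy usage with identity stand-ins for the two branches.
x = torch.randn(10, 8)
loss = masked_node_loss(torch.nn.Identity(), torch.nn.Identity(), x)
```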
Recent Developments in GNNs for Drug Discovery
Published: 6/2/2025
Graph Neural Networks in Drug Discovery · Molecule Generation and Property Prediction · Drug-Drug Interaction Prediction · Molecular Representation and Input Types · Benchmark Datasets for Drug Discovery
This paper reviews the latest developments in Graph Neural Networks (GNNs) for computational drug discovery, focusing on molecule generation, property prediction, and drug-drug interaction prediction, and highlighting GNNs' capability to understand complex molecular patterns.
Highly Accurate Disease Diagnosis and Highly Reproducible Biomarker Identification with PathFormer
Published: 2/12/2024
PathFormer Model · Biomarker Identification · Application of GNNs in Omics Analysis · Precise Disease Diagnosis · Alzheimer's Disease Dataset
The PathFormer model was developed to enhance biomarker identification accuracy and reproducibility across datasets, achieving a 30% improvement in disease diagnosis accuracy compared to existing GNNs, as demonstrated in Alzheimer's and cancer transcriptomic datasets.
Transformer-XL: Attentive Language Models beyond a Fixed-Length Context
Published: 7/1/2019
Transformer-XL Architecture · Long-Range Dependency Modeling · Language Modeling · Context Fragmentation Resolution · Positional Encoding Scheme
The paper presents Transformer-XL, a novel neural architecture that overcomes fixed-length context limitations in language modeling through a segment-level recurrence mechanism and a new positional encoding scheme, significantly outperforming traditional models, with up to 1,800+ times faster evaluation than vanilla Transformers.
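The segment-level recurrence is easy to picture: queries come only from the current segment, while keys and values also cover hidden states cached from the previous segment, so context flows across segment boundaries. A minimal single-head numpy sketch, omitting the relative positional encoding and the per-layer memory of the real model:

```python
# Minimal sketch of segment-level recurrence (single head, one layer).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def segment_attention(segment, memory, Wq, Wk, Wv):
    """Attend from the current segment over [memory; segment].

    segment: (L, d) hidden states of the current segment
    memory:  (M, d) cached hidden states from the previous segment
    """
    context = np.concatenate([memory, segment], axis=0)   # (M + L, d)
    q = segment @ Wq                     # queries: current tokens only
    k, v = context @ Wk, context @ Wv    # keys/values: memory + current
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v           # (L, d)

d, L, M = 16, 8, 8
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
memory = np.zeros((M, d))                # empty memory for the first segment
for _ in range(3):                       # stream of consecutive segments
    segment = rng.normal(size=(L, d))
    out = segment_attention(segment, memory, Wq, Wk, Wv)
    memory = segment                     # cache; gradients are stopped here
```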
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
Published: 8/30/2025
LLM Reasoning Capacity Enhancement · Reinforcement Learning for Math Reasoning
The paper introduces ParaThinker, a novel paradigm for scaling LLMs that uses native thought parallelism to overcome the 'Tunnel Vision' bottleneck in test-time computation, significantly enhancing reasoning capabilities by synthesizing multiple diverse reasoning paths.
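As a stand-in for the idea of widening rather than deepening test-time compute, the toy sketch below samples several reasoning paths independently and majority-votes the extracted answers (essentially self-consistency); ParaThinker itself trains the model to produce and synthesize parallel paths natively, which this does not capture:

```python
# Hypothetical parallel-paths baseline: k independent samples, majority vote.
import random
from collections import Counter

def parallel_reason(sample_path, extract_answer, k=8):
    answers = [extract_answer(sample_path()) for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]   # most frequent answer

# Toy usage with a stubbed sampler.
result = parallel_reason(
    sample_path=lambda: f"... so the answer is {random.choice([41, 42, 42])}",
    extract_answer=lambda path: path.rsplit(" ", 1)[-1],
    k=8,
)
```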
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Published: 6/12/2025
Parallel Generation for Autoregressive Large Language Models · Multiverse Generative Model · MapReduce Paradigm · Adaptive Task Decomposition · Multiverse Attention Mechanism
The Multiverse framework enables autoregressive language models to generate outputs with implicit parallelism via a MapReduce paradigm, consisting of adaptive task decomposition, parallel execution, and lossless synthesis. It shows competitive performance and improved efficiency.
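A hedged sketch of the MapReduce-shaped control flow, with hypothetical `decompose`, `generate_branch`, and `merge` callables standing in for the model-driven stages:

```python
# Hypothetical map (decompose), parallel generate, reduce (merge) pipeline.
from concurrent.futures import ThreadPoolExecutor

def multiverse_generate(prompt, decompose, generate_branch, merge):
    subtasks = decompose(prompt)                    # adaptive task decomposition
    with ThreadPoolExecutor() as pool:              # parallel execution
        branches = list(pool.map(generate_branch, subtasks))
    return merge(prompt, branches)                  # lossless synthesis

# Toy usage with stubbed stages.
out = multiverse_generate(
    "compare A and B",
    decompose=lambda p: ["describe A", "describe B"],
    generate_branch=lambda task: f"[{task}]",
    merge=lambda p, branches: " ".join(branches),
)
```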
Accelerating Retrieval-Augmented Language Model Serving with Speculation
Published: 1/25/2024
Retrieval-Augmented Language Model Serving · RaLMSpec Acceleration Framework · Speculative Retrieval Mechanism · Batched Verification Strategy · Downstream QA Datasets
The RaLMSpec framework accelerates retrieval-augmented language model serving through speculative retrieval and batched verification while maintaining consistent outputs. By combining prefetching and asynchronous verification, it significantly improves iterative RaLM efficiency, achieving substantial speedups on downstream QA datasets.
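A hedged sketch of the speculate-then-verify loop, with hypothetical `cheap_retrieve`, `exact_retrieve`, and `generate` callables standing in for the local speculation cache, the authoritative retriever, and the language model; the real framework also rolls back generation steps that depended on a mis-speculated document, which this toy version glosses over:

```python
# Hypothetical sketch: speculate retrievals from a cheap local cache,
# then verify all of them against the exact retriever in one batch.
def speculative_ralm(query_steps, cheap_retrieve, exact_retrieve, generate):
    outputs, speculated = [], []
    for q in query_steps:                    # each iteration needs a document
        doc = cheap_retrieve(q)              # fast, possibly stale guess
        speculated.append((q, doc))
        outputs.append(generate(q, doc))     # keep generating optimistically
    # Batched verification: one exact retrieval call covers every step.
    truth = exact_retrieve([q for q, _ in speculated])
    for i, ((q, guessed), correct) in enumerate(zip(speculated, truth)):
        if guessed != correct:               # mis-speculation: redo this step
            outputs[i] = generate(q, correct)
    return outputs

# Toy usage: a stale cache versus the authoritative store.
store = {"q1": "docA", "q2": "docB"}
cache = {"q1": "docA", "q2": "docX"}         # q2 is stale in the cache
res = speculative_ralm(
    ["q1", "q2"],
    cheap_retrieve=lambda q: cache[q],
    exact_retrieve=lambda qs: [store[q] for q in qs],
    generate=lambda q, d: f"answer({q}, {d})",
)
```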
Agent-based Video Trimming
Published: 12/13/2024
Agent-based Video Trimming · Video Structuring · Video Filtering Module · Video Story Composition · Video Evaluation Agent
This paper presents Agent-based Video Trimming (AVT) to address lengthy user-generated videos. AVT detects redundant footage and composes coherent stories through three phases: Video Structuring, Clip Filtering, and Story Composition, showing superior performance over existing approaches.
Techniques and Challenges of Image Segmentation: A Review
Published: 3/2/2023
Foundation Model for Image Segmentation · Semantic Segmentation · Deep Learning for Image Segmentation · Image Processing and Computer Vision · Challenges in Image Segmentation Techniques
This paper reviews advancements in image segmentation, categorizing techniques into classic, collaborative, and deep learning-based semantic segmentation. It highlights challenges in feature extraction and model design while analyzing key algorithms, their applicability, and future research directions.
Accelerating Retrieval-Augmented Generation
Published: 2/6/2025
Retrieval-Augmented Generation Systems · Exact Retrieval for Large Language Models · Intelligent Knowledge Store Architecture · Accelerated Exact Nearest Neighbor Search · Large-Scale Vector Database Retrieval
The paper explores Retrieval-Augmented Generation (RAG) to address hallucinations in large language models (LLMs). It introduces the Intelligent Knowledge Store (IKS), a near-memory acceleration architecture that speeds up exact retrieval by 13.4–27.9 times, improving end-to-end inference performance.
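For context, the computation IKS accelerates is exact nearest-neighbor search, i.e. a full similarity scan over the vector store rather than an approximate index. A plain numpy sketch of exact top-k by inner product (the near-memory placement is, of course, not modeled here):

```python
# Brute-force exact top-k retrieval by inner-product similarity.
import numpy as np

def exact_knn(queries, corpus, k=4):
    sims = queries @ corpus.T                        # (Q, N) full scan
    top = np.argpartition(-sims, k, axis=1)[:, :k]   # unordered top-k ids
    order = np.take_along_axis(sims, top, axis=1).argsort(axis=1)[:, ::-1]
    return np.take_along_axis(top, order, axis=1)    # top-k ids, best first

rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 128)).astype(np.float32)  # doc embeddings
queries = rng.normal(size=(2, 128)).astype(np.float32)
ids = exact_knn(queries, corpus, k=4)                # (2, 4) document indices
```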
Maximizing RAG efficiency: A comparative analysis of RAG methods
Published: 10/30/2024
Retrieval-Augmented Generation Optimization · Comparative Analysis of RAG Methods · Contextual Compression Filters · Cross-Domain Dataset Evaluation · Vector Stores and Embedding Models
This paper analyzes various RAG methods through a grid search of 23,625 iterations to improve efficiency. It highlights the need to balance context quality against similarity rankings, and the crucial role of contextual compression filters in optimizing hardware use and token consumption.
An effective CNN and Transformer complementary network for medical image segmentation
Published: 11/30/2022
Medical Image Segmentation · CNN and Transformer Complementary Network · Cross-domain Fusion Block · Feature Complementary Module · Swin Transformer Decoder
CTCNet, a complementary network for medical image segmentation, combines CNNs' local features with Transformers' long-range dependencies. It uses dual encoders and cross-domain fusion to enhance feature representation, outperforming existing models on organ and cardiac segmentation tasks.
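A minimal sketch of the complementary-fusion idea, assuming same-resolution feature maps from the CNN and Transformer encoders; the hypothetical `FusionBlock` below just concatenates and mixes channels with a 1x1 convolution, where the paper's cross-domain fusion block is considerably richer:

```python
# Hypothetical fusion of local CNN features with global Transformer features.
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    def __init__(self, c_cnn, c_tr, c_out):
        super().__init__()
        self.mix = nn.Conv2d(c_cnn + c_tr, c_out, kernel_size=1)

    def forward(self, f_cnn, f_tr):
        # f_cnn: (B, c_cnn, H, W) local; f_tr: (B, c_tr, H, W) global
        return self.mix(torch.cat([f_cnn, f_tr], dim=1))

fused = FusionBlock(64, 64, 64)(torch.randn(1, 64, 32, 32),
                                torch.randn(1, 64, 32, 32))
```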
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds
Published: 11/12/2025
Generalist Agent Design · 3D Open World Task Execution · Vision-Language Model Application · Human Interaction Paradigm · Zero-Shot Cross-Game Generalization
Lumine is introduced as the first open methodology for training generalist agents capable of real-time task execution in 3D open worlds, integrating perception, reasoning, and action to achieve high efficiency and strong zero-shot generalization across games.
Video-As-Prompt: Unified Semantic Control for Video Generation
Published: 10/24/2025
Unified Semantic Control for Video Generation · Video Diffusion Transformer · Video-As-Prompt Generation Framework · 100K Paired Video Semantic Dataset · Mixture-of-Transformers Architecture
The Video-As-Prompt (VAP) paradigm redefines semantic control in video generation by using reference videos as prompts, combined with a Mixture-of-Transformers architecture. VAP builds the largest such dataset, VAPData, and achieves a 38.7% user preference rate, showcasing strong zero-shot generalization.
GNNExplainer: Generating Explanations for Graph Neural Networks
Published: 3/10/2019
Explainability for Graph Neural Networks · Graph Structure Optimization · GNNExplainer · Node Feature Importance Identification · Graph-Based Machine Learning Tasks
GNNExplainer is the first general, model-agnostic method for interpreting the predictions of GNNs, identifying crucial subgraphs and node features while outperforming baselines by 17.1% on average, enhancing user trust and model transparency.
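GNNExplainer's core trick is differentiable: learn a soft mask over edges such that the masked subgraph preserves the model's prediction, with a sparsity penalty keeping the explanation small. A hedged torch sketch, assuming a `gnn` callable that accepts per-edge weights (as PyTorch Geometric models typically do):

```python
# Hypothetical sketch: optimize a sigmoid edge mask to explain a prediction.
import torch
import torch.nn.functional as F

def explain_edges(gnn, x, edge_index, target, steps=200, lam=0.01):
    """edge_index: (2, E). Returns learned edge importances in [0, 1]."""
    mask = torch.randn(edge_index.shape[1], requires_grad=True)
    opt = torch.optim.Adam([mask], lr=0.05)
    for _ in range(steps):
        w = torch.sigmoid(mask)                      # soft edge weights
        logits = gnn(x, edge_index, edge_weight=w)   # model must accept weights
        loss = F.cross_entropy(logits, target)       # keep the prediction...
        loss = loss + lam * w.sum()                  # ...using few edges
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask).detach()
```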
End-to-End Multi-Task Learning with Attention
Published: 3/29/2018
Attention Mechanism in Multi-Task Learning · Multi-Task Attention Network (MTAN) · End-to-End Multi-Task Learning Architecture · Task-Specific Feature Learning · Image Classification Tasks
The paper introduces the Multi-Task Attention Network (MTAN) for task-specific, feature-level attention learning. The architecture pairs a shared network with dedicated soft-attention modules, enabling efficient feature sharing and strong performance in multi-task learning.
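The soft-attention module is compact in code: each task learns a gate over the shared backbone's feature maps. A minimal torch sketch with hypothetical shapes (the real MTAN attaches such modules at every stage of the shared network):

```python
# Hypothetical single-stage version of task-specific soft attention.
import torch
import torch.nn as nn

class TaskAttention(nn.Module):
    def __init__(self, channels, num_tasks):
        super().__init__()
        self.masks = nn.ModuleList(
            nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
            for _ in range(num_tasks)
        )

    def forward(self, shared):
        # Each task gates the shared features with its own learned soft mask.
        return [m(shared) * shared for m in self.masks]

shared = torch.randn(2, 32, 8, 8)                    # shared backbone features
task_feats = TaskAttention(32, num_tasks=3)(shared)  # one gated map per task
```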
Robotic computing system and embodied AI evolution: an algorithm-hardware co-design perspective
Published: 10/1/2025
Robotic Computing Systems · Embodied AI Evolution · Algorithm-Hardware Co-Design
This study examines the evolution of robotic computing systems and embodied AI, proposing an algorithm-hardware co-design perspective to address challenges in real-time performance and energy efficiency, and highlighting the limitations of existing hardware in meeting advanced motion planning demands.
Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers
Published: 6/25/2024
Diffusion Model Quantization · Post-Training Quantization · Diffusion Transformer · Dynamic Activation Quantization · ImageNet Dataset
The paper introduces Q-DiT, a method for accurate post-training quantization of Diffusion Transformers (DiTs) that addresses spatial and temporal variance in weights and activations. By combining automatic quantization and sample-wise dynamic activation quantization, Q-DiT reduces computational cost while preserving generation quality.
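Sample-wise dynamic activation quantization means each sample's scale is computed at runtime from its own value range rather than calibrated offline, which is what copes with the activation variance mentioned above. A hedged numpy sketch of the symmetric int8 case:

```python
# Hypothetical per-sample symmetric int8 quantization of activations.
import numpy as np

def quantize_dynamic(acts, bits=8):
    """acts: (batch, features). Returns int8 codes and per-sample scales."""
    qmax = 2 ** (bits - 1) - 1                        # 127 for int8
    scale = np.abs(acts).max(axis=1, keepdims=True) / qmax
    scale = np.maximum(scale, 1e-8)                   # guard against all-zeros
    q = np.clip(np.round(acts / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.random.randn(4, 16).astype(np.float32)
q, s = quantize_dynamic(x)
max_err = np.abs(dequantize(q, s) - x).max()          # small round-trip error
```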
Killing Two Birds with One Stone: Unifying Retrieval and Ranking with a Single Generative Recommendation Model
Published: 4/23/2025
Generative Recommendation Systems · Unified Generative Recommendation Framework · Retrieval and Ranking in Recommendation Systems · Information Sharing and Optimization · Dynamic Balancing Optimization Mechanism
The study introduces the Unified Generative Recommendation Framework (UniGRF) to unify retrieval and ranking stages in recommendation systems, enhancing performance through information sharing and dynamic optimization, outperforming existing models in extensive experiments.
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
Self-Verifiable Mathematical Reasoning · LLM-based Theorem Proving · Reinforcement Learning for Math Reasoning · Proof Generator and Verifier · Quantitative Reasoning Capability Enhancement
The DeepSeekMath-V2 model targets reliable mathematical reasoning with large language models. By training a verifier alongside the proof generator, it enables self-verification, producing more accurate proofs, achieving excellent results in competitions, and demonstrating the potential of self-verifiable reasoning.
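A hedged sketch of the generate-then-verify loop this implies, with hypothetical `generate_proof` and `verify` callables standing in for the trained generator and verifier:

```python
# Hypothetical self-verification loop: keep sampling proofs until the
# verifier is satisfied, returning the best-scored attempt.
def prove(generate_proof, verify, problem, max_tries=8, accept=0.9):
    best, best_score = None, float("-inf")
    for _ in range(max_tries):
        proof = generate_proof(problem)
        score = verify(problem, proof)       # verifier confidence in [0, 1]
        if score > best_score:
            best, best_score = proof, score
        if best_score >= accept:             # good enough: stop early
            break
    return best, best_score
```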
……