Papers
CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models
Published: 12/16/2024
Streaming Speech Synthesis, Application of Large Language Models, Multilingual Dataset, Optimization of Speech Generation Models, Progressive Semantic Decoding
CosyVoice 2 is an enhanced streaming speech synthesis model that optimizes token utilization with finite-scalar quantization, simplifies the LM architecture using pretrained large language models, and employs chunk-aware causal flow matching for human-level naturalness and virtu
5G Vehicle-to-Everything Services: Gearing Up for Security and Privacy
Published: 11/13/2019
5G V2X Security and Privacy, Vehicle-to-Everything Communication, 5G Network Services and Applications, Vehicle Sensor Networks, Low Latency Communication
The paper provides a comprehensive review of security and privacy issues in 5G V2X services, analyzing architecture, use cases, and potential trust threats, while exploring recent protection strategies and highlighting future research directions to advance this field.
A Component Architecture for the Internet of Things
Published: 9/1/2018
Component Architecture for Internet of Things, Concurrent Time-Stamped Discrete-Event Semantics, Asynchronous Atomic Callbacks (AAC), Secure Swarm Toolkit (SST), Proxies and Services Interaction
This paper presents a component-based architecture for IoT, featuring proxies called "accessors" that interact under concurrent, time-stamped discrete-event semantics, leveraging asynchronous atomic callbacks (AAC) and the actor model to enhance efficiency and security.
UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution
Published: 10/9/2025
Cascaded Video Super-Resolution, Multimodal Video Generation, Latent Video Diffusion Model, Condition Injection Strategies, Multimodal Condition Utilization
The paper introduces UniMMVSR, a unified framework for video super-resolution that handles multiple input modalities. It explores condition injection strategies and demonstrates superior performance in detail and conformity to multimodal conditions, enabling 4K video generation.
iBitter-SCM: Identification and characterization of bitter peptides using a scoring card method with propensity scores of dipeptides
Published: 3/28/2020
Bitter Peptide Prediction Model, Scoring Card Method, Amino Acid Propensity Scoring, Dietary Drug Development, Comparison of Machine Learning Classifiers
iBitter-SCM is a novel computational model that predicts bitter peptides based on amino acid sequences using the scoring card method with propensity scores. It achieved 84.38% accuracy on independent datasets, outperforming other classifiers and serving as a significant tool for
Hotspot-Driven Peptide Design via Multi-Fragment Autoregressive Extension
Published: 11/26/2024
Hotspot-Driven Peptide Design, Autoregressive Generation Models, Peptide Drug Development, Fragment-Based Peptide Generation, Energy Density Model
PepHAR is introduced as a hotspot-driven autoregressive model for peptide design, focusing on residues with higher interaction potential. It effectively combines fragment extension and optimization to generate peptides with correct geometries, advancing peptide drug development.
From prediction to design: Revealing the mechanisms of umami peptides using interpretable deep learning, quantum chemical simulations, and module substitution
Interpretable Deep Learning, Module Substitution Strategy, Umami Peptide Design, Quantum Chemical Simulations, Virtual Hydrolysis
This study uses interpretable deep learning and module substitution to efficiently screen and design umami peptides, achieving 0.94 accuracy. It identifies various umami peptides, explores module substitution mechanisms, and highlights essential amino acids for taste enhancement.
Mip-Splatting: Alias-free 3D Gaussian Splatting
3D Gaussian Splatting, High-Frequency Artifact Elimination, Mip Filtering, 3D Smoothing Filter
The paper introduces a 3D smoothing filter that eliminates artifacts in 3D Gaussian Splatting by constraining the size of 3D Gaussian primitives based on sampling frequency, and effectively mitigates aliasing issues using a 2D Mip filter.
Mip-Splatting: Alias-free 3D Gaussian S
Recently, 3D Gaussian Splatting has demonstrated impressive novel view synthesis results, reaching high fidelity and efficiency. However, strong artifacts can be observed when changing the sampling rate, e.g., by changing focal length or camera distance. We find that the source for this phenomenon can be attributed to the lack of 3D frequency constraints and the usage of a 2D dilation filter. To address this problem, we introduce a 3D smoothing filter which
3D Gaussian Splatting Rendering, Alias-Free 3D Smoothing Filter, Novel View Synthesis, High-Frequency Artifact Elimination, 3D Frequency Constraints
The paper introduces a 3D smoothing filter to eliminate high-frequency artifacts in novel view synthesis by constraining the size of 3D Gaussian primitives. Replacing the 2D dilation filter with a 2D Mip filter effectively mitigates aliasing issues.
Inference Performance of Large Language Models on a 64-core RISC-V CPU with Silicon-Enabled Vectors
LLM Reasoning Capacity Enhancement, RISC-V Based Hardware Optimization, Silicon-Enabled Vector Computing, Energy-Efficient Computing Architectures, Matrix Multiplication Performance Benchmark
This study evaluates LLM inference performance on a 64-core RISC-V CPU with Silicon-Enabled Vectors, revealing significant throughput and energy efficiency improvements, particularly for smaller models. It offers practical insights for deploying LLMs on future heterogeneous compu
Privacy-Preserving Action Recognition via Motion Difference Quantization
Published: 8/4/2022
Privacy-Preserving Human Action Recognition, Motion Difference Quantization, Adversarial Training Optimization, Privacy and Security Issues in Computer Vision, Image Blurring and Difference Processing
This paper presents BDQ, a privacy-preserving encoder for human action recognition, which utilizes blur, difference, and quantization to suppress privacy information while retaining recognition performance, achieving state-of-the-art results in experiments across three benchmark
X-VLA: Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model
Published: 10/12/2025
Vision-Language-Action Model, Cross-Embodiment Learning, Soft Prompt Learning, Generalist Robotic Platforms, Large-Scale Heterogeneous Datasets
The paper introduces X-VLA, a scalable Vision-Language-Action model utilizing a soft-prompted Transformer architecture. By integrating learnable embeddings for diverse robot data sources, X-VLA achieves state-of-the-art performance across simulations and real robots, demonstratin
Deciphering the biosynthetic potential of microbial genomes using a BGC language processing neural network model
Published: 4/10/2025
Microbial Genomic Biosynthetic Potential Analysis, Biosynthetic Gene Cluster Prediction Model, Transformer-based Gene Location Relationship Capture, Ultrahigh-Throughput BGC Screening Tool, Study of Microbial Secondary Metabolites
This study presents BGC-Prophet, a transformer-based model for predicting and classifying biosynthetic gene clusters in microbial genomes. It enhances efficiency and accuracy, analyzing over 85,000 genomes to reveal BGC distribution patterns and environmental influences, aiding r
StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation
Published: 11/11/2025
Video Diffusion Models, Real-Time Interactive Video Generation, Streaming Content Creation, Low-Latency Video Generation, Multi-GPU Real-Time Streaming Service
StreamDiffusionV2 is introduced as a streaming system for dynamic and interactive video generation, addressing temporal consistency and low latency issues in live streaming. It integrates SLO-aware schedulers and other optimizations for training-free real-time service, enhancing
Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control
Published: 1/30/2024
Dynamic Bipedal Robot Control, Application of Deep Reinforcement Learning, Robot Adaptivity and Robustness, Diverse Locomotion Skills, Dual-History Control Architecture
This paper develops dynamic locomotion controllers for bipedal robots using deep reinforcement learning, surpassing single skill limitations with a novel dual-history architecture that enhances adaptivity and robustness. The controllers show superior performance in diverse skills
ExBody2: Advanced Expressive Humanoid Whole-Body Control
Published: 12/18/2024
Humanoid Whole-Body Control, Expressive Dynamic Motion Generation, Motion Capture-Based Control Strategy, Kinematic Adaptation Optimization for Robots, Whole-Body Motion Tracking Algorithm
The paper presents ExBody2, an advanced control method enabling humanoid robots to perform expressive whole-body movements while maintaining stability. It employs a training approach based on human motion capture and simulations, addressing tradeoffs between versatility and spec
SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model
Social Media Image Deepfake Detection, Large Multimodal Model, Deepfake Localization and Explanation, Deepfake Detection Dataset, Image Authenticity Verification
The SIDA framework utilizes large multimodal models for detecting, localizing, and explaining deepfakes in social media images. It also introduces the SID-Set dataset, comprising 300K diverse synthetic and authentic images with high realism and thorough annotations, enhancing det
FoldamerDB: a database of peptidic foldamers
Published: 10/17/2019
Foldamer Database, Antimicrobial and Anticancer Foldamers, Biologically Active Foldamers, Publicly Accessible Compound Database, Structural and Sequence Information of Foldamers
FoldamerDB is an open-source, fully annotated database of peptidic foldamers, containing information on 1319 species and their biological activities, collected from over 160 papers. The user-friendly interface allows for comprehensive searching and filtering, addressing a gap in
Explainable Machine Learning and Deep Learning Models for Predicting TAS2R-Bitter Molecule Interactions
Published: 10/9/2025
Explainable Machine Learning Models, TAS2R-Bitter Molecule Interaction Prediction, Deep Learning for Ligand Recognition, G Protein-Coupled Receptor Function Research, Molecular Characteristics and Drug Design
This study developed explainable machine learning and deep learning models to predict interactions between bitter molecules and TAS2R receptors, enhancing ligand selection and understanding of receptor functions, with significant implications for drug design and disease research.
Identifying Sequential Residue Patterns in Bitter and Umami Peptides
Published: 11/9/2022
Sequential Pattern Identification in Bitter and Umami Peptides, Coarse-Graining of Peptide Sequence Space, Quantitative Structure-Activity Relationship for Taste Features, Amino Acid Pattern Extraction, Systematic Improvements for Bitter and Umami Peptide Features
This study explores how amino acid sequences in peptides affect taste, introducing a coarse-graining method to systematically identify optimal residue patterns for bitter and umami peptides, showing significant improvements over random and baseline models.
……