Papers
Multimodal Video Generation
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
Published: 12/9/2025
Multimodal Video Generation, World-Aware Video Generation, Dynamic Noising Integration, Unified Dataset Construction, Cross-Modal Learning Framework
UnityVideo is a unified framework that enhances world-aware video generation by jointly learning from multiple modalities. It employs dynamic noising and a modality switcher, leveraging a large-scale dataset of 1.3M samples to improve video quality and physical consistency.
UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution
Published: 10/9/2025
Cascaded Video Super-Resolution, Multimodal Video Generation, Latent Video Diffusion Model, Condition Injection Strategies, Multimodal Condition Utilization
The paper introduces UniMMVSR, a unified framework for video super-resolution that handles multiple input modalities. It explores condition injection strategies and demonstrates superior detail and conformity to multimodal conditions, enabling 4K video generation.