Papers

Video Generation Models
One-Minute Video Generation with Test-Time Training
Published: 4/8/2025
Video Generation Models, Autoregressive Generation Models, Transformer-based Video Generation, Test-Time Training, Complex Multi-Scene Story Generation
This paper introduces Test-Time Training (TTT) layers to enhance one-minute video generation. By integrating TTT layers into a pre-trained Transformer, the authors generate more coherent videos from text storyboards, outperforming existing methods despite some remaining artifacts and efficiency limitations.
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
Published: 11/26/2023
Diffusion Models, Video Generation Models, Text-to-Video Generation, High-Quality Video Fine-Tuning, Video Dataset Curation
The paper presents Stable Video Diffusion (SVD), a model for high-resolution text-to-video and image-to-video generation. It evaluates a three-stage training process and highlights the importance of well-curated datasets for high-quality video generation, demonstrating strong performance.
VideoGPT: Video Generation using VQ-VAE and Transformers
Published: 4/21/2021
Video Generation Models, VQ-VAE and Transformer Combined Application, BAIR Robot Dataset, UCF-101 Dataset, Autoregressive Generation Models
This paper introduces VideoGPT, a video generation model utilizing VQ-VAE and a simple transformer architecture, achieving sample quality competitive with state-of-the-art GANs on various datasets and providing a reproducible reference for future research.
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
Published: 11/6/2025
Vision-Language Models, Video Generation Models, Multimodal Reasoning, Video Thinking Benchmark
The "Thinking with Video" paradigm enhances multimodal reasoning by integrating video generation models, validated through the Video Thinking Benchmark, showing performance improvements in both vision and text tasks while addressing static constraints and modality separation.