Papers

Video Generation Models
One-Minute Video Generation with Test-Time Training
Published: 4/8/2025
Video Generation Models, Autoregressive Generation Models, Transformer-based Video Generation, Test-Time Training, Complex Multi-Scene Story Generation
This paper introduces Test-Time Training (TTT) layers to enhance one-minute video generation. By integrating TTT layers into a pre-trained Transformer, the authors generate more coherent videos from text storyboards, outperforming existing methods despite some remaining artifacts and efficiency limitations.
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
Published: 11/26/2023
Diffusion Models, Video Generation Models, Text-to-Video Generation, High-Quality Video Fine-Tuning, Video Dataset Curation
The paper presents Stable Video Diffusion (SVD), a model for high-resolution text-to-video and image-to-video generation. It evaluates a three-stage training process and highlights the importance of well-curated datasets for high-quality video generation, demonstrating strong performance.
VideoGPT: Video Generation using VQ-VAE and Transformers
Published: 4/21/2021
Video Generation Models, VQ-VAE and Transformer Combined Application, BAIR Robot Dataset, UCF-101 Dataset, Autoregressive Generation Models
This paper introduces VideoGPT, a video generation model utilizing VQ-VAE and a simple transformer architecture, achieving sample quality competitive with state-of-the-art GANs on various datasets and providing a reproducible reference for future research.
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
Published: 11/6/2025
Vision-Language Models, Video Generation Models, Multimodal Reasoning, Video Thinking Benchmark
The "Thinking with Video" paradigm enhances multimodal reasoning by integrating video generation models, validated through the Video Thinking Benchmark, showing performance improvements in both vision and text tasks while addressing static constraints and modality separation.