Context Awesome

awesome-video-diffusion-models

github.com/chenhsing/awesome-video-diffusion-models

[CSUR] A Survey on Video Diffusion Models

GitHub Stars: 2.3k
Curated Resources: 360
Categories: 8
Last Refreshed: 49 min ago

Open-source Toolboxes and Foundation Models
Data
Text-to-Video Generation
Video Generation with other conditions
Depth-guided Video Generation
Unconditional Video Generation
Video Completion
Video Editing

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me training-based resources from awesome-video-diffusion-models"


What's inside

Text-to-Video Generation

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model [Training-based]
A²RD: Agentic Autoregressive Diffusion for Long Video Consistency [Training-free]
AdaDiff: Adaptive Step Selection for Fast Diffusion [Training-free]
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models [Training-based]
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning [Training-based]
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos [Training-based]

Video Generation with other conditions

AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion [Sound-guided Video Generation]
Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions [Pose-guided Video Generation]
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation [Pose-guided Video Generation]
AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance [Motion-guided Video Generation]
Animated Stickers: Bringing Stickers to Life with Video Diffusion [Image-guided Video Generation]
AtomoVideo: High Fidelity Image-to-Video Generation [Image-guided Video Generation]

Depth-guided Video Generation

ActAnywhere: Subject-Aware Video Background Generation [Multi-modal guided Video Generation]
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning [Multi-modal guided Video Generation]
Any-to-Any Generation via Composable Diffusion [Multi-modal guided Video Generation]
Boximator: Generating Rich and Controllable Motions for Video Synthesis [Multi-modal guided Video Generation]
CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling [Multi-modal guided Video Generation]

Data

Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions [Caption-level]
AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI [Metric and BenchMark]
CelebV-Text: A Large-Scale Facial Text-Video Dataset [Caption-level]
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation [Metric and BenchMark]
CVPR 2023 Text Guided Video Editing Competition [Metric and BenchMark]
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models [Metric and BenchMark]

Video Editing

A Generalist Framework for Panoptic Segmentation of Images and Videos [Video Understanding]
AnimateZero: Video Diffusion Models are Zero-Shot Image Animators [Training-free Editing Model]
Anything in Any Scene: Photorealistic Video Object Insertion [Multi-modal Control Editing Model]
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing [Training-free Editing Model]
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation [Domain-specific Editing Model]
BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models [Training-free Editing Model]

Video Completion

AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction [Video Prediction]
AVID: Any-Length Video Inpainting with Diffusion Model [Video Enhancement and Restoration]
CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming [Video Enhancement and Restoration]
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models [Video Prediction]
Diffusion Models for Video Prediction and Infilling [Video Prediction]
Diffusion Probabilistic Modeling for Video Generation [Video Prediction]

Open-source Toolboxes and Foundation Models

AnimateDiff [Personalized T2V Generation]
CogVideoX [T2V Generation]
Diffusers (T2V synthesis) [T2V Generation]
EMU-Video [T2V Generation]
Fliki [T2V Generation]
GEN-2 [T2V Generation & Editing]

Resources

arXiv

Showing a sample of 360 resources. The full list is on GitHub.