awesome-video-diffusion-models
github.com/chenhsing/awesome-video-diffusion-models
[CSUR] A Survey on Video Diffusion Models
GitHub Stars: 2.3k
Curated Resources: 357
Categories: 8
Last Refreshed: 5 hours ago
Open-source Toolboxes and Foundation Models, Data, Text-to-Video Generation, Video Generation with other conditions, Depth-guided Video Generation, Unconditional Video Generation, Video Completion, Video Editing
Use this list with your AI agent
Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:
"Show me training-based resources from awesome-video-diffusion-models"
Installation instructions →
What's inside
Text-to-Video Generation
- 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model [Training-based]
- AdaDiff: Adaptive Step Selection for Fast Diffusion [Training-free]
- Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models [Training-based]
- AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning [Training-based]
- A Recipe for Scaling up Text-to-Video Generation with Text-free Videos [Training-based]
- ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models [Training-based]
Video Generation with other conditions
- AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion [Sound-guided Video Generation]
- Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions [Pose-guided Video Generation]
- Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation [Pose-guided Video Generation]
- AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance [Motion-guided Video Generation]
- Animated Stickers: Bringing Stickers to Life with Video Diffusion [Image-guided Video Generation]
- AtomoVideo: High Fidelity Image-to-Video Generation [Image-guided Video Generation]
Depth-guided Video Generation
- ActAnywhere: Subject-Aware Video Background Generation [Multi-modal guided Video Generation]
- Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
- AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning [Multi-modal guided Video Generation]
- Any-to-Any Generation via Composable Diffusion [Multi-modal guided Video Generation]
- Boximator: Generating Rich and Controllable Motions for Video Synthesis [Multi-modal guided Video Generation]
- CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling [Multi-modal guided Video Generation]
Data
- Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions [Caption-level]
- AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI [Metric and BenchMark]
- CelebV-Text: A Large-Scale Facial Text-Video Dataset [Caption-level]
- ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation [Metric and BenchMark]
- CVPR 2023 Text Guided Video Editing Competition [Metric and BenchMark]
- EvalCrafter: Benchmarking and Evaluating Large Video Generation Models [Metric and BenchMark]
Video Editing
- A Generalist Framework for Panoptic Segmentation of Images and Videos [Video Understanding]
- AnimateZero: Video Diffusion Models are Zero-Shot Image Animators [Training-free Editing Model]
- Anything in Any Scene: Photorealistic Video Object Insertion [Multi-modal Control Editing Model]
- A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing [Training-free Editing Model]
- Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation [Domain-specific Editing Model]
- BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models [Training-free Editing Model]
Video Completion
- AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction [Video Prediction]
- AVID: Any-Length Video Inpainting with Diffusion Model [Video Enhancement and Restoration]
- CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming [Video Enhancement and Restoration]
- Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models [Video Prediction]
- Diffusion Models for Video Prediction and Infilling [Video Prediction]
- Diffusion Probabilistic Modeling for Video Generation [Video Prediction]
Open-source Toolboxes and Foundation Models
- AnimateDiff [Personalized T2V Generation]
- CogVideoX [T2V Generation]
- Diffusers (T2V synthesis) [T2V Generation]
- EMU-Video [T2V Generation]
- Fliki [T2V Generation]
- GEN-2 [T2V Generation & Editing]
Resources
Showing a sample of 357 resources. View the full list on GitHub →