awesome-llms-for-video-understanding
github.com/yunlong10/awesome-llms-for-video-understanding
🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.
GitHub Stars: 3.2k
Curated Resources: 223
Categories: 4
Last Refreshed: 4 hours ago
- 🔥🔥🔥 Video Understanding with Large Language Models: A Survey
- 📢 News
- 😎 Vid-LLMs: Models
- Tasks, Datasets, and Benchmarks
Use this list with your AI agent
Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:
"Show me 🗒️ taxonomy 1 resources from awesome-llms-for-video-understanding"
Installation instructions →
What's inside
Tasks, Datasets, and Benchmarks
- ActivityNet
ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding
- ActivityNet Captions
Dense-Captioning Events in Videos
- ActivityNet-QA
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
- Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events
06/2025
- BlackSwanSuite
Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events
- Can Video Large Multimodal Models Think Like Doubters-or Double-Down: A Study on Defeasible Video Entailment
08/2025
😎 Vid-LLMs: Models
- An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM 🗒️ Taxonomy 1
IG-VLM
- AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos? 🗒️ Taxonomy 1
AntGPT
- Artemis: Towards Referential Understanding in Complex Videos 🗒️ Taxonomy 1
Artemis
- A Simple LLM Framework for Long-Range Video Question-Answering 🗒️ Taxonomy 1
LLoVi
- AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn 🗒️ Taxonomy 1
AssistGPT
- Audio-Visual LLM for Video Understanding 🗒️ Taxonomy 2
-
📢 News
- IEEE Xplore
GitHub
Showing a sample of 223 resources. View the full list on GitHub →