awesome-rag-vision
github.com/zhengxujosh/awesome-rag-vision
Awesome-RAG-Vision: a curated list of advanced retrieval-augmented generation (RAG) for computer vision
GitHub Stars: 334
Curated Resources: 140
Categories: 2 (Resources, RAG for Vision)
Last Refreshed: 2 hours ago
Use this list with your AI agent
Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:
"Show me workshops and tutorials resources from awesome-rag-vision"
Installation instructions →
What's inside
Resources
- A Comprehensive Guide to Building Multimodal RAG Systems (Workshops and Tutorials)
- An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio (Workshops and Tutorials)
- Build an AI-powered multimodal RAG system with Docling and Granite (Workshops and Tutorials)
- Building an Image Search RAG App with Llama 3.2 Vision (Workshops and Tutorials)
- Building Multimodal RAG Application for Video Preprocessing (Workshops and Tutorials)
- Guide to Multimodal RAG for Images and Text (in 2025) (Workshops and Tutorials)
RAG for Vision
- A General Retrieval-Augmented Generation Framework for Multimodal Case-Based Reasoning Applications (1 Visual Understanding)
  Marom
- AlzheimerRAG: Multimodal RAG for Clinical Use Cases using PubMed (1 Visual Understanding)
  Lahiri et al.
- A Multi-Granularity Retrieval Framework for Visually-Rich Documents (1 Visual Understanding)
  Xu et al.
- Animate-A-Story (2 Visual Generation)
  He et al.
- AUGUSTUS: An LLM-Driven Multimodal Agent System (1 Visual Understanding)
  Jain et al.
- Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models (1 Visual Understanding)
  Jia et al.
Showing a sample of 140 resources. View the full list on GitHub →