Context Awesome

awesome-rag-vision

github.com/zhengxujosh/awesome-rag-vision

Awesome-RAG-Vision: a curated list of advanced retrieval-augmented generation (RAG) for Computer Vision

GitHub Stars: 339
Curated Resources: 140
Categories: 2
Last Refreshed: 21 hours ago


Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me workshops and tutorials resources from awesome-rag-vision"


What's inside

Resources

A Comprehensive Guide to Building Multimodal RAG Systems [Workshops and Tutorials]
An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio [Workshops and Tutorials]
Build an AI-powered multimodal RAG system with Docling and Granite [Workshops and Tutorials]
Building an Image Search RAG App with Llama 3.2 Vision [Workshops and Tutorials]
Building Multimodal RAG Application for Video Preprocessing [Workshops and Tutorials]
Guide to Multimodal RAG for Images and Text (in 2025) [Workshops and Tutorials]

RAG for Vision

A General Retrieval-Augmented Generation Framework for Multimodal Case-Based Reasoning Applications [1 Visual Understanding]
Marom
AlzheimerRAG: Multimodal RAG for Clinical Use Cases using PubMed [1 Visual Understanding]
Lahiri et al.
A Multi-Granularity Retrieval Framework for Visually-Rich Documents [1 Visual Understanding]
Xu et al.
Animate-A-Story [2 Visual Generation]
He et al.
AUGUSTUS: An LLM-Driven Multimodal Agent System [1 Visual Understanding]
Jain et al.
Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models [1 Visual Understanding]
Jia et al.

Showing a sample of 140 resources. View the full list on GitHub.