awesome-large-multimodal-reasoning-models

The development and future prospects of large multimodal reasoning models.

614 GitHub Stars

336 Curated Resources

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me 4.1 multimodal understanding resources from awesome-large-multimodal-reasoning-models"

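As a rough illustration of what the example prompt asks for, the sketch below filters a handful of entries from the sample on this page by their section number. The `Resource` class, the `SAMPLE` list, and the `by_section` helper are hypothetical names introduced here for illustration; this is not the Context Awesome MCP server's API, only a minimal stand-in for the kind of lookup the prompt describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One list entry (hypothetical structure, not the server's schema)."""
    name: str
    section: str  # section label, e.g. "4.1 Multimodal Understanding"


# A few entries copied from the sample shown on this page.
SAMPLE = [
    Resource("A4Bench", "4.1 Multimodal Understanding"),
    Resource("AI2-THOR", "4.3 Multimodal Reasoning"),
    Resource("AIGCBench", "4.2 Multimodal Generation"),
    Resource("AIR-Bench", "4.1 Multimodal Understanding"),
    Resource("AudioBench", "4.1 Multimodal Understanding"),
]


def by_section(resources, number):
    """Return the names of resources whose section label starts with `number`."""
    return [r.name for r in resources if r.section.split(" ", 1)[0] == number]


print(by_section(SAMPLE, "4.1"))
```

Matching on the leading section number rather than the full label keeps the lookup tolerant of small differences in how a section title is worded.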
A4Bench, VLM@school, UnLOK-VQA, DocMark (4.1 Multimodal Understanding)
AI2-THOR, Gibson, iGibson, Isaac Lab (4.3 Multimodal Reasoning)
AIGCBench, EvalCrafter, LMM4LMM, RISEBench (4.2 Multimodal Generation)
LWM, Genesis, HQ-Edit, InstructPix2Pix
AIR-Bench, MMAU, SD-eval, CoVoST2 (4.1 Multimodal Understanding)
MELD, CoVoST2, SIFT-50M, Clotho
Argus Inspection, VisText-Mosquito, Chemtable, KokushiMD-10 (4.3 Multimodal Reasoning)
AudioBench, VoiceBench, Fleurs, MusicBench (4.1 Multimodal Understanding)
Librispeech, Common Voice, Aishell, Fleurs

Active-O3 (2.3 Stage 3 Language-Centric Long Reasoning - System-2 Thinking and Planning)
Qwen2.5-VL-7B
Ada-R1 (2.3 Stage 3 Language-Centric Long Reasoning - System-2 Thinking and Planning)
DeepSeek-R1-Distill-Qwen (7B, 1.5B)
AGoT (2.2 Stage 2 Language-Centric Short Reasoning - System-1 Reasoning)
T, I
ALBEF (2.1 Stage 1 Perception Driven Reasoning - Developing Task-Specific Reasoning Modules)
2021
AnyMAL (2.2 Stage 2 Language-Centric Short Reasoning - System-1 Reasoning)
T, I, A, V
AR-MCTS (2.2 Stage 2 Language-Centric Short Reasoning - System-1 Reasoning)
T, I

Showing a sample of the 336 resources. The full list is on GitHub.