
The development and future prospects of large multimodal reasoning models.

612 GitHub Stars
336 Curated Resources
3 Categories
Last Refreshed: 3 hours ago
2 Roadmap of Multimodal Reasoning Models
3 Towards Native Multimodal Reasoning Model
4 Dataset and Benchmark

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me 4.1 multimodal understanding resources from awesome-large-multimodal-reasoning-models"

Installation instructions →

What's inside

4 Dataset and Benchmark

2 Roadmap of Multimodal Reasoning Models

  • Active-O3: 2.3 Stage 3 Language-Centric Long Reasoning - System-2 Thinking and Planning

    Qwen2.5-VL-7B

  • Ada-R1: 2.3 Stage 3 Language-Centric Long Reasoning - System-2 Thinking and Planning

    DeepSeek-R1-Distill-Qwen (7B, 1.5B)

  • AGoT: 2.2 Stage 2 Language-Centric Short Reasoning - System-1 Reasoning

    T, I

  • ALBEF: 2.1 Stage 1 Perception Driven Reasoning - Developing Task-Specific Reasoning Modules

    2021

  • AnyMAL: 2.2 Stage 2 Language-Centric Short Reasoning - System-1 Reasoning

    T, I, A, V

  • AR-MCTS: 2.2 Stage 2 Language-Centric Short Reasoning - System-1 Reasoning

    T, I

3 Towards Native Multimodal Reasoning Model

Resources

Showing a sample of 336 resources. View the full list on GitHub.