A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.

3.2k GitHub Stars
705 Curated Resources
10 Categories
Last Refreshed: 8 hours ago
📚 Survey
💥 Vision Language Action (VLA) Models & World Action Models (WAM)
🚶 Vision Language Navigation (VLN) Models
🎬 Vision Action (VA) Models
🧠 Other Multimodal Large Language Model (MLLM)-based/related Embodied Learning
Physics-aware Policy
Sim-to-Real Transfer
Benchmark
Simulator
Related Works

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me 2026 resources from awesome-embodied-vla-va-vln"

Installation instructions →

What's inside

💥 Vision Language Action (VLA) Models & World Action Models (WAM)

Benchmark

🎬 Vision Action (VA) Models

🧠 Other Multimodal Large Language Model (MLLM)-based/related Embodied Learning

🚶 Vision Language Navigation (VLN) Models

Sim-to-Real Transfer

Showing a sample of 705 resources. View the full list on GitHub →