awesome-multimodal-llm-for-code
github.com/xjywhu/awesome-multimodal-llm-for-code
Multimodal Large Language Models for Code Generation under Multimodal Scenarios
GitHub Stars: 250
Curated Resources: 150
Categories: 14
Last Refreshed: 4 hours ago
1. Web/Mobile UI Code Generation
2. Scientific Plots Code Generation
3. Visually Rich Programming and Math
4. SVG Code Generation and Understanding
5. Slide && Presentation Generation
6. Program Repair
7. UML and workflow code generation
8. CAD code generation
9. Poster code generation
10. Multimodal document generation
11. 3D code generation
12. Game
13. Code for MLLM's Image Understanding
14. General
Use this list with your AI agent
Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:
"Show me Web/Mobile UI Code Generation resources from awesome-multimodal-llm-for-code"
Installation instructions →
What's inside
1. Web/Mobile UI Code Generation
- 1D-Bench: A Benchmark for Iterative UI Code Generation with Visual Feedback in Real-World
- Advancing vision-language models in front-end development via data synthesis
- Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach
- Automatically Generating Web Applications from Requirements Via Multi-Agent Test-Driven Development
- Benchmarking Multimodal LLMs on Code Generation for Complex Interactive Webpages
- Beyond Prototyping: Autonomous, Enterprise-Grade Frontend Development from Pixel to Production via a Specialized Multi-Agent Framework
12. Game
- 90% Faster, 100% Code-Free: MLLM-Driven Zero-Code 3D Game Development
- GameUIAgent: An LLM-Powered Framework for Automated Game UI Design with Structured Intermediate Representation
- OpenGame: Open Agentic Coding for Games
- PlayCoder: Making LLM-Generated GUI Code Playable
- V-GameGym: Visual Game Generation for Code Large Language Models
14. General
- ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation
- Automated LaTeX Code Generation from Handwritten Mathematical Expressions
- EmbodiedCoder: Parameterized Embodied Mobile Manipulation via Modern Coding Model
- Empowering Agile-Based Generative Software Development through Human-AI Teamwork
- FullStack Bench: Evaluating LLMs as Full Stack Coders
- Image2Struct: Benchmarking Structure Extraction for Vision-Language Models
5. Slide && Presentation Generation
- AutoPresent: Designing Structured Visuals from Scratch
- Code2Video: A Code-centric Paradigm for Educational Video Generation
- Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1
- Paper2Video: Automatic Video Generation from Scientific Papers
- Paper2Web: Let's Make Your Paper Alive!
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
10. Multimodal document generation
2. Scientific Plots Code Generation
- Breaking the SFT Plateau: Multimodal Structured Reinforcement Learning for Chart-to-Code Generation
- Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data
- ChartCards: A Chart-Metadata Generation Framework for Multi-Task Chart Understanding
- ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation
- ChartEditBench: Evaluating Grounded Multi-Turn Chart Editing in Multimodal Language Models
- ChartMaster: Advancing Chart-to-Code Generation with Real-World Charts and Chart Similarity Reinforcement Learning
8. CAD code generation
- CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward
- CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction
- CME-CAD: Heterogeneous Collaborative Multi-Expert Reinforcement Learning for CAD Code Generation
- mrCAD: Multimodal Refinement of Computer-aided Designs
4. SVG Code Generation and Understanding
- Can Large Language Models Understand Symbolic Graphics Programs?
- Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models
- DuetSVG: Unified Multimodal SVG Generation with Internal Visual Guidance
- LLM4SVG: Empowering LLMs to Understand and Generate Complex Vector Graphics
- LogoMotion: Visually Grounded Code Generation for Content-Aware Animation
- Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation
Showing a sample of 150 resources. View the full list on GitHub →