awesome-gui-agent
github.com/showlab/awesome-gui-agent
A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
GitHub Stars: 1.2k
Curated Resources: 176
Categories: 6 (Datasets / Benchmarks, Models / Agents, Surveys, Projects, Safety, Related Repositories)
Last Refreshed: 4 hours ago
What's inside
Datasets / Benchmarks
- A3: Android Agent Arena for Mobile GUI Agents
- A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
- AgentStudio: A Toolkit for Building General Virtual Agents
- AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents
- AndroidEnv: A Reinforcement Learning Platform for Android
- Android in the Wild: A Large-Scale Dataset for Android Device Control
Projects
Models / Agents
- AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations
- A Data-Driven Approach for Learning to Control Computers
- AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning
- Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems
- Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
- Agent S: An Open Agentic Framework that Uses Computers Like a Human
Safety
- Adversarial Attacks on Multimodal Agents
- AdvWeb: Controllable Black-box Attacks on VLM-powered Web Agents
- Caution for the Environment: Multimodal Agents are Susceptible to Environmental Distractions
- EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage
- Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
- MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control
Related Repositories
Showing a sample of 176 resources. The full list is on GitHub at github.com/showlab/awesome-gui-agent.