Context Awesome

awesome-code-benchmark

github.com/tongye98/awesome-code-benchmark

A comprehensive review of code-domain benchmarks for LLM research.

GitHub Stars: 236
Curated Resources: 229
Categories: 2
Last Refreshed: 17 hours ago


Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me program repair, testing & debugging resources from awesome-code-benchmark"


What's inside

🚀 Benchmark Categories

A11YRepair: Bridging Web Accessibility Barriers via Knowledge-Enhanced Divide-and-Conquer Repair (Program Repair, Testing & Debugging)
Advancing vision-language models in front-end development via data synthesis (Frontend, UI & Visual-Interactive Development)
AInsteinBench: Benchmarking Coding Agents on Scientific Repositories (Repository & Agentic Software Engineering)
aiXamine: Simplified LLM Safety and Security (Security, Reliability & Robustness)
A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs (Code Generation & Completion)
ArkEval: Benchmarking and Evaluating Automated Code Repair for ArkTS (Program Repair, Testing & Debugging)

Surveys

Showing a sample of 229 resources. View the full list on GitHub: github.com/tongye98/awesome-code-benchmark