A curated list of awesome leaderboard-oriented resources for the AI domain

GitHub Stars: 362
Curated Resources: 547
Categories: 9
Last Refreshed: 2 hours ago
Model Ranking | Image | Database Ranking | Dataset Ranking | Metric Ranking | Infrastructure Ranking | Paper Ranking | Usage Ranking | Company Ranking

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me 3d resources from awesome-foundation-model-leaderboards"

What's inside

Image

  • 3D Arena (3D)

    3D Arena hosts a 3D generation arena, where various 3D generative models compete based on their performance in generating 3D models.

  • 3DGen Arena (3D)

    3DGen Arena hosts the 3D generation arena, where various 3D generative models compete based on their performance in generating 3D models.

  • 3D-POPE (3D)

    3D-POPE is a benchmark to evaluate object hallucination in 3D generative models.

  • Abel (Math)

    Abel is a platform to evaluate the mathematical capabilities of LLMs.

  • Abstract Image

    Abstract Image is a benchmark to evaluate multimodal LLMs (MLLM) in understanding and visually reasoning about abstract images, such as maps, charts, and layouts.

  • AesBench

    AesBench is a benchmark to evaluate MLLMs on image aesthetics perception.

Model Ranking

  • ACLUE (Text)

    ACLUE is an evaluation benchmark for ancient Chinese language comprehension.

  • African Languages LLM Eval Leaderboard (Text)

    African Languages LLM Eval Leaderboard tracks progress and ranks performance of LLMs on African languages.

  • AGIEval (Text)

    AGIEval is a human-centric benchmark to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving.

  • AI Benchmarking Hub (Comprehensive)

    AI Benchmarking Hub tracks and compares AI model performance in reasoning, coding, and knowledge tasks.

  • ai-benchmarks (Text)

    ai-benchmarks contains a handful of evaluation results for the response latency of popular AI services.

  • Aider LLM Leaderboards (Code)

    Aider LLM Leaderboards evaluate LLMs' ability to follow system prompts to edit code.

Resources

  • AIcrowd

    AIcrowd hosts machine learning challenges and competitions across domains such as computer vision, NLP, and reinforcement learning, aimed at both researchers and practitioners.

  • AI Hub

    AI Hub offers a variety of competitions to encourage AI solutions to real-world problems, with a focus on innovation and collaboration.

  • AI Studio

    AI Studio offers AI competitions mainly for computer vision, NLP, and other data-driven tasks, allowing users to develop and showcase their AI skills.

  • Allen Institute for AI

    The Allen Institute for AI provides leaderboards and benchmarks on tasks in natural language understanding, commonsense reasoning, and other areas in AI research.

  • Codabench

    Codabench is an open-source platform for benchmarking AI models, enabling customizable, user-driven challenges across various AI domains.

  • DataFountain

    DataFountain is a Chinese AI competition platform featuring challenges in finance, healthcare, and smart cities, encouraging solutions for industry-related problems.

Metric Ranking

  • AlignScore

    AlignScore evaluates the performance of different metrics in assessing factual consistency.

Usage Ranking

Dataset Ranking

  • DataComp

    DataComp is a benchmark to evaluate the performance of various datasets with a fixed model architecture.

Showing a sample of 547 resources. The full list is available on GitHub.