
This repository contains hand-curated resources for Prompt Engineering, with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

GitHub Stars: 6k
Curated Resources: 438
Categories: 11
Last Refreshed: 22 hours ago
🚀 Start Here · Papers · Tools and Code · APIs · Datasets and Benchmarks · AI Content Detectors · Courses · Tutorials and Guides · Videos · Communities · 🔬 Autonomous Research & Self-Improving Agents

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me other notable repositories resources from awesome-prompt-engineering"

Installation instructions →

What's inside

Tools and Code

  • 12-Factor Agents [Other Notable Repositories]

    Principles for building production-grade LLM-powered software. ~17K+ ⭐

  • Agenta [Prompt Management and Testing]

    Open-source LLM developer platform for prompt management, evaluation, human feedback, and deployment.

  • Agentless [Vibe Coding and AI Coding Assistants]

    Simple three-phase approach (localize → repair → validate) to solving software development problems. ~2K+ ⭐

  • AgentSeal [Red Teaming and Prompt Security]

    Open-source scanner that runs 150 attack probes to test AI agents for prompt injection and extraction vulnerabilities.

  • Agno (formerly Phidata) [Agent Frameworks]

    Python agent framework with microsecond instantiation. ~20K+ ⭐

  • AI Agent System Prompts Library [Other Notable Repositories]

    Collection of system prompts from production AI coding agents (Claude Code, Gemini CLI, Cline, Aider, Roo Code).

Tutorials and Guides

Courses

🔬 Autonomous Research & Self-Improving Agents

Datasets and Benchmarks

  • AgentHarm [Red Teaming and Adversarial Datasets]

    110 malicious agent tasks across 11 harm categories.

  • Artificial Analysis Intelligence Index v3 [Leaderboards and Meta-Benchmarks]

    Aggregates 10 evaluations.

  • BigCodeBench [Major Benchmarks (2024–2026)]

    1,140 coding tasks across 7 domains; AI achieves ~35.5% vs. 97% human success.

  • Chatbot Arena / LM Arena [Major Benchmarks (2024–2026)]

    6M+ user votes for Elo-rated pairwise LLM comparisons. De facto standard for human preference.

  • CodeAlpaca-20k [Prompt and Instruction Datasets]

    20,000 programming instruction-output pairs.

  • GPQA [Major Benchmarks (2024–2026)]

    448 "Google-proof" STEM questions; non-expert validators achieve only 34%.

Communities

Showing a sample of 438 resources. View the full list on GitHub →