awesome-agent-failures

A community-curated collection of AI agent failure modes and battle-tested solutions.

190 GitHub Stars

Curated Resources

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me autonomous agent failures resources from awesome-agent-failures"

$47,000 LangChain A2A Multi-Agent Loop (Autonomous Agent Failures)
Analyzer/Verifier agent pair entered an undetected feedback loop for 264 hours (11 days), accruing $47K in API costs with no useful output; a case of observability without enforcement.
Air Canada Chatbot Legal Ruling (Legal and Financial Incidents)
Airline held liable after chatbot gave incorrect bereavement fare information, ordered to pay $812 in damages.
Amazon Q Causes Retail Website Outages (Autonomous Agent Failures)
Amazon Q gave engineers guidance from an outdated wiki, causing four high-severity incidents in one week, 6.3M lost orders, and a six-hour customer-facing outage.
Amazon Q VS Code Prompt Injection Supply Chain Attack (AI Agent Security Incidents)
Attacker injected a prompt into the official AWS extension telling Amazon Q to delete filesystems and wipe S3 buckets; only a syntax error prevented mass destruction across 1M+ installs.
Ars Technica AI-Fabricated Quotes (Institutional Failures)
Senior AI reporter Benj Edwards fired after using a Claude Code–based extraction tool that fabricated quotes attributed to engineer Scott Shambaugh; article retracted, called "a serious failure of our standards."
Character.AI Lawsuits (Safety & Misinformation)
Multiple lawsuits alleging chatbots promoted self-harm and delivered inappropriate content to minors.

AI Risk Summit 2025 (Industry Resources)
Conference on AI agent risks.
AI Safety in RAG (Industry Resources)
Vectara's analysis of RAG hallucination challenges.
Amazon (Books)
Investigates how AI systems inherit human biases and examines efforts to align machine learning with ethical and social values.
Amazon (Books)
Explores the risks of advanced AI and argues for aligning AI systems with human values to ensure safety.
A Survey on Large Language Model based Autonomous Agents (Research Papers)
Comprehensive survey of LLM-based agents.
A Survey on Large Language Model Reasoning Failures (Research Papers)
A comprehensive review that introduces a novel taxonomy of reasoning in LLMs (embodied vs. non-embodied), and spotlights three categories of reasoning.

Showing a sample of 61 resources. View the full list on GitHub →