Papers about red teaming LLMs and Multimodal models.

  • 167 GitHub stars
  • 286 curated resources
  • 21 categories
  • Last refreshed 4 hours ago
Surveys, Taxonomies, Positions, Phenomenons, Completion Compliance, Instruction Indirection, Generalization Glide, Model Manipulation, Suffix Searchers, Prompt Searchers, Training Time Defenses, Inference Time Defenses, Evaluation Metrics, Evaluation Benchmarks, Application Domains, Application Risks, Attack Strategies, Attack Searchers, Defense, Application, Benchmarks

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me surveys on risks resources from openredteaming"

Installation instructions →

What's inside

Evaluation Benchmarks

  • Blog

    A full-fledged benchmark for evaluating protection capabilities of AI models

  • Paper

Completion Compliance

Instruction Indirection

Suffix Searchers

Surveys

Training Time Defenses

Showing a sample of 286 resources. View the full list on GitHub →