awesome-interpretability
github.com/wassname/awesome-interpretability ↗
Awesome tools for interpreting and manipulating the internals of deep neural networks.
GitHub Stars: 10
Curated Resources: 57
Categories: 4
Last Refreshed: 23 hours ago
Use this list with your AI agent
Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:
"Show me explainability, counterfactuals and probing resources from awesome-interpretability"
Installation instructions →
What's inside
Explainability, counterfactuals and probing
Mechanistic interpretability libraries
- an extremely opinionated toolkit for doing whatever you want to specific models,
- A tutorial on doing it manually
- BauKit
light, simple, and well loved
- cupbearer
- Docent
interactive model explanation and steering interface
- Graphpatch
promising but abandoned
Structured output
- clownfish
2023 Modifying Transformers to Follow a JSON Schema - not updated
- Constrained-Text-Generation-Studio
- guardrails
- instructor
for remote APIs without logits
- jsonformer
- kor
Showing a sample of 57 resources. View the full list on GitHub →