awesome-observability
github.com/adriannovegil/awesome-observability
What's inside
5. Transport
- ActiveMQ
Powerful open source messaging and integration patterns server.
- Aeron
Efficient reliable UDP unicast, UDP multicast, and IPC message transport.
- Apache Kafka
Publish-subscribe messaging rethought as a distributed commit log.
- Apollo
Faster, more reliable, easier to maintain messaging broker built from the foundations of the original ActiveMQ.
- Ascoltatori
Pub/sub library for Node.
- Beanstalk
Simple, fast work queue.
10. LLM & AI Observability
- Agenta [Platforms]
Open-source LLMOps platform for prompt playground, prompt management, LLM evaluation, and observability.
- agenttrace [Cost & Usage Tracking]
TUI observability for AI coding agents. Tracks cost, tokens, tool failures, anomalies, health, and CI gates across Claude Code, Codex, Gemini CLI, Aider, and Cursor exports.
- Arize Phoenix [Platforms]
Open-source AI observability platform for tracing, evaluation, datasets, experiments, prompt management and playground. Built on OpenTelemetry with Python and TypeScript support.
- burn0 [Cost & Usage Tracking]
Open-source Node.js cost observability with one import. Auto-detects and tracks per-request costs for 50+ services (LLMs, SaaS, databases) via HTTP interception. Sub-millisecond overhead, local-first with optional cloud dashboard.
- Burnd [Cost & Usage Tracking]
Local-first CLI that reads Claude Code JSONL session files and surfaces cost leaks via pattern detectors (retry storms, tool overuse, repeated reads, thrash, etc.). npx-installable, MIT, zero telemetry; findings export to a shareable report URL.
- ClevAgent [Platforms]
Runtime monitoring for AI agents — heartbeat watchdog, loop detection, cost tracking, auto-restart.
9. Processing and Analyze and Act
- Alerta [Alerts]
Tool used to consolidate and de-duplicate alerts from multiple sources for quick "at-a-glance" visualisation.
- Anomaly Detection in Prometheus Metrics [Anomalies Detection]
Prototype of a Prometheus Anomaly Detector (PAD) that can be deployed on OpenShift. PAD is a framework for deploying a metric prediction model to detect anomalies in Prometheus metrics.
- Anomaly Detection Toolkit (ADTK) [Anomalies Detection]
Python package for unsupervised / rule-based time series anomaly detection.
- Banshee [Anomalies Detection]
Real-time anomaly (outlier) detection system for periodic metrics.
- Bosun [Alerts]
Time Series Alerting Framework.
- Cabot [Alerts]
Get alerted when services go down or metrics go crazy.
14. Observability as a Service
- Alibaba Cloud Logs Service
Complete real-time data logging service that has been developed by Alibaba Group.
- Azure Monitor
Full observability into your applications, infrastructure, and network.
- CloudWatch
Observability of your AWS resources and applications on AWS and on-premises.
- Dash0
Modern OpenTelemetry-native observability, built on CNCF open standards such as PromQL, Perses, and OTLP, with full cost control. Supports metrics, logs, and traces, with dashboarding and alerting capabilities.
- Epsagon
Application Monitoring Built for Containers and Serverless.
- Geneos
Real-time monitoring for all your environments in one platform.
7. Storage
- Apache Cassandra [NoSQL Database (The Others :-P)]
High availability with linear scalability and proven fault tolerance on commodity hardware or cloud infrastructure.
- Apache HBase ["Meta Projects" (data storage, multi-tenant, aggregation, high availability, etc)]
The Hadoop database, a distributed, scalable, big data store.
- Apache Lucene [Search Engine]
Java library providing powerful indexing and search features.
- Apache Solr [Search Engine]
Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene.
- ArangoDB [Graph Database]
Natively store data for graph, document and search needs.
- ClickHouse [SQL Database]
Fast open-source OLAP database management system.
8. Visualization
- API Status Check [Uptime]
Free real-time status monitoring for 120+ developer APIs including AWS, Stripe, GitHub, OpenAI, and more. Track third-party API availability with alerts and status pages.
- BlueWave Uptime [Uptime]
Open-source, self-hosted monitoring tool built with React.js, Node.js, and MongoDB, designed to track server uptime, response times, and incidents in real-time with beautiful visualizations.
- Chronograf [Dashboarding]
User interface and administrative component of the InfluxDB platform.
- ExplorViz [General & Tools]
Live trace visualization for large software landscapes.
- Flame Graph [General & Tools]
Visualization of profiled software, allowing the most frequent code-paths to be identified quickly and accurately.
- FlameScope [General & Tools]
FlameScope is a visualization tool for exploring different time ranges as Flame Graphs.
12. Application Performance Monitoring Solutions (APM)
- Apitally
API monitoring, analytics, and request logging for REST APIs, with lightweight open-source SDKs for Python, Node.js, Go, .NET, and Java.
- AppDynamics
Business and application performance monitoring.
- AppOptics
Continuous monitoring built to scale with your applications for less downtime and lower resource usage.
- Aspecto
Troubleshoot performance bottlenecks and errors within your microservices.
- Aternity
Simplified high-definition APM visibility leveraging Real User Monitoring, Synthetic Monitoring, and OpenTelemetry, that is scalable, easy to use and deploy, and unifies insights across end users, applications, networks, and the cloud-native ecosystem.
- Blue Matador
Easiest and fastest way to monitor your cloud environments on the market. Just provide your read-only credentials and start getting insights in minutes.
6. Collector
- Augur [Configuration & Linters]
Static analysis linter for OpenTelemetry Collector configurations. Detects misconfigurations, hardcoded credentials, and missing critical components (memory limiters, batch processors) before deployment. Built on OPA/Rego with customizable policies and CI/CD integration.
- Brubeck [Logging]
Statsd-compatible stats aggregator written in C.
- GoAccess [Logging]
Open source real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through your browser. It provides fast and valuable HTTP statistics for system administrators who need a visual server report on the fly.
- Grafana Mimir [Metrics]
Mimir is an open source, horizontally scalable, highly available, multi-tenant TSDB for long-term storage for Prometheus.
- Last9 [Logging]
Unified Logs Explorer with search, filters, SQL query support, and OpenTelemetry-native ingestion.
- Logbook [Logging]
Extensible Java library to enable complete request and response logging for different client- and server-side technologies.
Showing a sample of 288 resources. View the full list on GitHub.