Eval and observability
Tracing, evaluation, and monitoring platforms so you can ship agents you actually trust.
| Tool | Description | License | Pricing | Models | Stars | Updated |
|---|---|---|---|---|---|---|
| AgentOps | Observability built specifically for AI agents. | MIT | Freemium (free tier) | multi | 5.6k | 2026-06-25 |
| Arize Phoenix | Open-source LLM tracing and eval. | Elastic-2.0 | Free (self-hosted) | multi | 10.3k | 2026-06-25 |
| Braintrust | Evals and prompt playground for serious teams. | Proprietary | Freemium (free tier) | multi | — | — |
| Comet Opik | Open-source eval and tracing from Comet. | Apache-2.0 | Freemium (self-host free) | multi | 19.8k | 2026-06-25 |
| DeepEval (Confident AI) | Open-source unit tests for LLMs. | Apache-2.0 | Free (self-hosted) | multi | 16.5k | 2026-06-25 |
| Fiddler | AI observability and governance for enterprise. | Proprietary | Paid (custom, enterprise) | multi | — | — |
| Galileo | Guardrails and evaluation for production LLMs. | Proprietary | Paid (custom) | multi | — | — |
| Helicone | Proxy-based LLM observability with instant cost tracking. | Apache-2.0 | Freemium (free tier) | multi | 5.9k | 2026-06-11 |
| Laminar | Open-source observability and eval for AI agents. | Apache-2.0 | Freemium (self-host free) | multi | 3.0k | 2026-06-25 |
| Langfuse | The most-used open-source LLM observability tool. | MIT | Freemium (self-host free; cloud tier) | multi | 29.8k | 2026-06-25 |
| LangSmith | LangChain's observability and eval platform. | Proprietary | Freemium (free developer tier) | multi | — | — |
| Latitude | Open-source prompt management and eval. | LGPL-3.0 | Freemium (self-host free) | multi | 4.3k | 2026-06-25 |
| Lunary | Open-source LLM observability and prompt management. | Apache-2.0 | Freemium (self-host free) | multi | — | — |
| Maxim AI | Evaluation and simulation for AI agents. | Proprietary | Freemium (free tier) | multi | — | — |
| OpenLLMetry | OpenTelemetry instrumentation for LLMs. | Apache-2.0 | Free | multi | 7.2k | 2026-06-25 |
| Patronus AI | Automated evaluation and guardrails for LLMs. | Proprietary | Paid (custom) | multi | — | — |
| Pydantic Logfire | Observability from the Pydantic team. | Proprietary | Freemium (free tier) | multi | — | — |
| SigNoz | Open-source OpenTelemetry observability. | MIT | Freemium (self-host free) | multi | 27.5k | 2026-06-25 |
| Weights and Biases Weave | LLM tracing and eval inside W&B. | Proprietary | Freemium (free tier) | multi | — | — |