TraceDog

Documentation

TraceDog is an observability and reliability layer for LLM applications. These docs explain the product model, the API contract, the reliability engine, and the operating patterns you need to build on TraceDog and run it in production.

30-second architecture pitch

Your app sends one trace per model turn. TraceDog ingests it, computes grounding and claim-level reliability signals, persists the enriched result, and exposes fleet-level metrics alongside deep per-trace debugging. Production requests and benchmark runners use the same contract.
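
To make the "one trace per model turn, one contract for both callers" idea concrete, here is a minimal sketch of what such a payload could look like. Every name in it (the Trace class, its fields, the source values, and to_ingest_payload) is an assumption made for illustration, not the actual TraceDog schema; see API ingest for the real contract.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Trace:
    """One model turn. All field names are hypothetical stand-ins."""
    trace_id: str
    prompt: str
    response: str
    retrieved_context: List[str] = field(default_factory=list)
    source: str = "production"  # assumed label; a benchmark runner would set "benchmark"


def to_ingest_payload(trace: Trace) -> str:
    """Serialize a trace to a JSON body that an ingest endpoint might accept."""
    return json.dumps(asdict(trace), sort_keys=True)


# A production request and a benchmark run build the same shape of payload.
prod = Trace(
    trace_id="t-001",
    prompt="What is the refund window?",
    response="Refunds are accepted within 30 days.",
    retrieved_context=["Refund policy: 30 days from delivery."],
)
bench = Trace(
    trace_id="b-001",
    prompt="What is the refund window?",
    response="Refunds are accepted within 30 days.",
    retrieved_context=["Refund policy: 30 days from delivery."],
    source="benchmark",
)

prod_payload = to_ingest_payload(prod)
bench_payload = to_ingest_payload(bench)
```

Because both callers go through the same serializer, the two payloads differ only in their values, never in their keys, which is the property the shared contract is meant to give you.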

What makes TraceDog different

Claim-level explainability, not only aggregate quality numbers
Unified plane: production traces and offline evaluations share the same API
Actionable outputs for engineering teams: failure hints, severity, and repair loops
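
The difference between an aggregate quality number and claim-level output can be illustrated with a small sketch. The ClaimSignal class, its fields, the severity labels, and both helper functions are hypothetical and are not taken from TraceDog's API; the scoring rule (fraction of supported claims) is a deliberately simple stand-in for whatever the reliability engine actually computes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClaimSignal:
    """One claim extracted from a response. Field names are hypothetical."""
    text: str
    supported: bool
    severity: Optional[str] = None  # assumed label, set only for unsupported claims


def aggregate_score(claims: List[ClaimSignal]) -> float:
    """Fraction of supported claims; 1.0 when there are no claims."""
    if not claims:
        return 1.0
    return sum(c.supported for c in claims) / len(claims)


def failing_claims(claims: List[ClaimSignal]) -> List[ClaimSignal]:
    """Return the unsupported claims, which is what an engineer needs to act on."""
    return [c for c in claims if not c.supported]


claims = [
    ClaimSignal("Refunds are accepted within 30 days.", supported=True),
    ClaimSignal("Return shipping is free.", supported=True),
    ClaimSignal("Refunds are paid in cash.", supported=False, severity="high"),
    ClaimSignal("Support is open on weekdays.", supported=True),
]

score = aggregate_score(claims)
failures = failing_claims(claims)
```

In this toy data the aggregate alone says three of four claims held up, while the claim-level view names the one unsupported statement and its severity, which is the information a repair loop would start from.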

If you are new to the codebase, start with Quickstart, then read System flow, API ingest, Claim graph, and Processing plane.