Per-agent hallucination risk and reliability — last 7 days. Each agent is tracked independently so regressions are immediately attributable.
evaluation · 7d window · 0 traces
Daily average hallucination risk (lower is better). Spikes indicate degraded retrieval or synthesis.
No trend data in this window.
Daily average reliability score — higher is better.
No reliability data yet.