VERITAS Preview

Benchmark

No runs yet. Press Run benchmark to begin.

Generates a deterministic synthetic corpus of 240 Tasks across six archetypes, computes bootstrap 95% confidence intervals for every metric, breaks accuracy down by conflict type, and renders a separate operational panel for real data.
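
As a rough illustration of how a bootstrap 95% interval on a single metric could be computed, here is a minimal sketch using the percentile method. It is not taken from VERITAS itself: the function names, the seed, the resample count, and the 0.8 success rate used to produce placeholder outcomes are all assumptions chosen for the example, and the generated outcomes are illustrative, not benchmark results.

```python
import random


def synthetic_outcomes(n_tasks=240, p_correct=0.8, seed=42):
    """Hypothetical stand-in for per-task results: 1 = correct, 0 = incorrect.

    A fixed seed makes the corpus deterministic, so every run sees the
    same 240 outcomes. The 0.8 rate is an arbitrary placeholder.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < p_correct else 0 for _ in range(n_tasks)]


def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `outcomes`.

    Returns (point_estimate, lower, upper). With alpha=0.05 the bounds
    are the 2.5th and 97.5th percentiles of the resampled means.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    means = []
    for _ in range(n_resamples):
        # Resample n outcomes with replacement and record the mean.
        sample = [outcomes[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return sum(outcomes) / n, means[lo_idx], means[hi_idx]


if __name__ == "__main__":
    point, lower, upper = bootstrap_ci(synthetic_outcomes())
    print(f"accuracy = {point:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Because both the corpus and the resampling are seeded, repeated runs report the same interval. The same function could be applied per conflict type by passing only the outcomes for that subset.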

Five numbers, with intervals and provenance.