Verdict Playground
Paste an AI output. Pick a Guardian. Watch the verdict change in real time.
This is the same three-outcome contract every Trinitite-governed call follows: passed returns the output unchanged, corrected returns an RFC 6902 JSON Patch with the fix applied, and blocked returns a 403-style verdict with a forensic record. The playground runs entirely in your browser against a mock rubric — no API key required.
Real Guardians are trained on your policies and ship as LoRA adapters; see Guardian Training.
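To make the three-outcome contract concrete, here is a minimal sketch of how client code might branch on a verdict. The response shape is an assumption made for illustration: the field names verdict, output, patched_output, and reason, and the BlockedError class, are hypothetical and not taken from the Trinitite API.

```python
from typing import Any


class BlockedError(Exception):
    """Raised when the Guardian blocks an output (the 403-style case)."""

    status_code = 403


def handle_verdict(response: dict[str, Any]) -> Any:
    """Return the output the caller may use, or raise if blocked.

    The response shape is hypothetical: 'verdict', 'output',
    'patched_output', and 'reason' are illustrative field names.
    """
    verdict = response["verdict"]
    if verdict == "passed":
        # Output is compliant: use it unchanged.
        return response["output"]
    if verdict == "corrected":
        # The Guardian fixed the offending span: use the patched result.
        return response["patched_output"]
    if verdict == "blocked":
        # No safe correction was issued: surface a 403-style error.
        raise BlockedError(response.get("reason", "blocked by Guardian"))
    # Fail loudly on anything outside the three-outcome contract.
    raise ValueError(f"unknown verdict: {verdict!r}")
```

Raising on blocked keeps the unsafe output from flowing further by accident; a caller that prefers a return value could map the exception to its own 403 response instead.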
What you're looking at
Verdict tag — the same three values your client code branches on:
PASSED (green): the original output is compliant.
CORRECTED (teal): the Guardian patches the offending span and returns the patched result.
BLOCKED (red): the Guardian refuses to issue a safe correction. Your client surfaces a 403.
Reason — short geometric description of why the Guardian decided what it did. Real Guardians return a structured correction_diff payload — see Observability.
JSON Patches — RFC 6902 operations, color-coded by op. Apply them with any RFC-6902 library on your end.
Receipt fields — every decision in production is hashed, signed, and Merkle-chained. The mock receipt shows the four fields you'd see on the wire: ledger_id, latency_ms, policy_hash, and the active determinism mode.
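As a rough illustration of what applying the JSON Patches involves, the sketch below implements only the add, remove, and replace operations from RFC 6902, using RFC 6901 pointer escaping. It is a teaching aid under those stated limits, not a full implementation (no move, copy, or test, no root-level patching, and only light validation), and the sample document with its text and tags fields is hypothetical. In real code, a complete RFC 6902 library is the safer choice.

```python
import copy
from typing import Any


def _tokens(pointer: str) -> list[str]:
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if pointer == "":
        return []
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    # Per RFC 6901, decode "~1" to "/" before decoding "~0" to "~".
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer[1:].split("/")]


def apply_patch(doc: Any, patch: list[dict[str, Any]]) -> Any:
    """Apply 'add', 'remove', and 'replace' operations to a copy of doc."""
    result = copy.deepcopy(doc)
    for op in patch:
        tokens = _tokens(op["path"])
        if not tokens:
            raise ValueError("this sketch does not patch the document root")
        # Walk to the container that holds the target location.
        parent = result
        for token in tokens[:-1]:
            parent = parent[int(token)] if isinstance(parent, list) else parent[token]
        last, kind = tokens[-1], op["op"]
        if isinstance(parent, list):
            # "-" means "append", and is only meaningful for 'add'.
            index = len(parent) if (last == "-" and kind == "add") else int(last)
            if kind == "add":
                parent.insert(index, op["value"])
            elif kind == "remove":
                del parent[index]
            elif kind == "replace":
                parent[index] = op["value"]
            else:
                raise NotImplementedError(kind)
        else:
            if kind == "add":
                parent[last] = op["value"]
            elif kind == "remove":
                del parent[last]
            elif kind == "replace":
                if last not in parent:
                    raise KeyError(last)  # 'replace' requires an existing target
                parent[last] = op["value"]
            else:
                raise NotImplementedError(kind)
    return result


# Hypothetical example: redact a span and tag the output.
output = {"text": "SSN is 123-45-6789", "tags": ["draft"]}
patch = [
    {"op": "replace", "path": "/text", "value": "SSN is [REDACTED]"},
    {"op": "add", "path": "/tags/-", "value": "redacted"},
]
patched = apply_patch(output, patch)
```

The function patches a deep copy, so the original output stays available for comparison or logging.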
What the playground is not
The playground uses a toy rubric: hard-coded regular expressions for SSNs, card numbers, and a few destructive SQL patterns. It is intentionally not the real Guardian, which is a fine-tuned LoRA model trained on your specific policies and evaluated against the geometric Policy Manifold (see Architecture). The point of the playground is to make the contract tangible (outcome, patches, receipt), not the implementation.
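For a sense of what a regex-based toy rubric can look like, here is a small sketch. Everything in it is assumed for illustration: the patterns are not the playground's actual regexes, the evaluate function and its return fields are hypothetical, and the choice to block destructive SQL while correcting SSNs and card numbers is a guess at a plausible mapping. It also assumes the output is wrapped as an object with a text field, so the patch targets /text.

```python
import re

# Illustrative patterns only; not the playground's actual rules.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")
DESTRUCTIVE_SQL = re.compile(
    r"\b(?:DROP\s+(?:TABLE|DATABASE)|TRUNCATE\s+TABLE)\b", re.IGNORECASE
)


def evaluate(text: str) -> dict:
    """Return a mock verdict for a string output (hypothetical shape)."""
    if DESTRUCTIVE_SQL.search(text):
        # Assumption: destructive SQL has no safe rewrite, so block it.
        return {"verdict": "blocked", "reason": "destructive SQL pattern", "patches": []}
    # Redact SSNs first, then card numbers, in the remaining text.
    redacted = SSN.sub("[REDACTED-SSN]", text)
    redacted = CARD.sub("[REDACTED-CARD]", redacted)
    if redacted != text:
        # Express the fix as an RFC 6902 'replace' on the assumed /text field.
        patch = [{"op": "replace", "path": "/text", "value": redacted}]
        return {"verdict": "corrected", "reason": "sensitive number redacted", "patches": patch}
    return {"verdict": "passed", "reason": "no rule matched", "patches": []}
```

A rubric like this only catches literal formats it was written for, which is one reason the playground should not be read as a stand-in for a trained Guardian.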
For a live test against a real Guardian, request sandbox access and follow the 5-min Quickstart.
What's next
→ Quickstart — wire a real Guardian into your code.
→ Cookbook — copy-paste recipes for common stacks.
→ Threat Library — named adversarial patterns + downloadable test fixtures.