Migration: from OpenAI Moderation API

Per-category booleans → unified Passed/Corrected/Blocked.

Why move

The Moderation API is a content classifier. It labels content as hate, sexual, self-harm, violence, or harassment, and stops there: it takes no action on the result, keeps no audit trail, and covers only the categories OpenAI curates. Trinitite is a complete decision system for arbitrary policies: you train a Guardian against your rubric and get back the verdict, the patch, and the ledger receipt.

Concept mapping

OpenAI Moderation concept	Trinitite equivalent
Predefined categories (hate, sexual, …)	Trained Guardians on your policies
`flagged: true / false`	`outcome: passed / corrected / blocked`
`category_scores.hate`	Encapsulated in Guardian decision
No remediation	RFC 6902 JSON Patch
No audit	Glass Box Ledger receipt
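
In calling code, the mapping above turns a single `flagged` check into a three-way branch on `outcome`. The following Python sketch shows one way that branch could look. It assumes a corrected verdict carries its RFC 6902 operations in a field named `patch`; that field name and the `resolve_output` helper are hypothetical and not confirmed by this page. The patch applier is deliberately partial and handles only the `replace` operation.

```python
from __future__ import annotations

import copy
from typing import Any, Optional


def apply_replace_ops(document: Any, patch: list) -> Any:
    """Apply RFC 6902 `replace` operations to a deep copy of `document`.

    Partial on purpose: add, remove, move, copy, and test are not implemented.
    """
    result = copy.deepcopy(document)
    for op in patch:
        if op["op"] != "replace":
            raise NotImplementedError(f"unsupported op: {op['op']}")
        # JSON Pointer (RFC 6901): split on "/", then unescape ~1 -> "/" and ~0 -> "~".
        tokens = [t.replace("~1", "/").replace("~0", "~")
                  for t in op["path"].split("/")[1:]]
        if not tokens:
            # An empty pointer addresses the whole document.
            result = copy.deepcopy(op["value"])
            continue
        target = result
        for token in tokens[:-1]:
            target = target[int(token)] if isinstance(target, list) else target[token]
        last = tokens[-1]
        if isinstance(target, list):
            target[int(last)] = op["value"]  # IndexError if the index is missing
        else:
            if last not in target:
                raise KeyError(f"replace target does not exist: {op['path']}")
            target[last] = op["value"]
    return result


def resolve_output(original: dict, verdict: dict) -> Optional[dict]:
    """Return the message to deliver, or None when the Guardian blocked it."""
    outcome = verdict["outcome"]
    if outcome == "passed":
        return original
    if outcome == "corrected":
        # "patch" is an assumed field name for the RFC 6902 operations.
        return apply_replace_ops(original, verdict["patch"])
    if outcome == "blocked":
        return None
    raise ValueError(f"unknown outcome: {outcome!r}")
```

A production integration would more likely use a complete JSON Patch library than this replace-only helper; the point here is only the shape of the branch that replaces `if flagged:`.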

API translation

POST /v1/chat
{
  "guardian": "tone-and-policy",
  "input": [{
    "role": "assistant",
    "content": "<ai output>"
  }]
}

← 200 OK
{
  "outcome": "blocked",
  "reason": "Output projects into Hate-speech sub-region",
  "policy_hash": "0xa83f...",
  "ledger_id": "lg_01HZ2T..."
}

One decision, made against a custom rubric built from your handbook, your MSA, and your compliance docs. Auditable and replayable. The same surface covers every policy domain, not just OpenAI's fixed set.
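
The request and response above could be wrapped in a small client along these lines. This is a sketch using only the Python standard library. The base URL is a placeholder, the helper names (`build_payload`, `build_request`, `parse_verdict`, `check_output`) are illustrative, and authentication is omitted because this page does not describe it. Only the path, the request body, and the four response fields come from the example.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Optional

# Placeholder host: substitute the real API base URL for your deployment.
BASE_URL = "https://trinitite.example.invalid"

VALID_OUTCOMES = {"passed", "corrected", "blocked"}


def build_payload(guardian: str, ai_output: str) -> dict:
    """Build the request body in the shape shown in the example above."""
    return {
        "guardian": guardian,
        "input": [{"role": "assistant", "content": ai_output}],
    }


def build_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare the POST /v1/chat request without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


@dataclass(frozen=True)
class Verdict:
    outcome: str
    reason: Optional[str]
    policy_hash: Optional[str]
    ledger_id: Optional[str]


def parse_verdict(body: bytes) -> Verdict:
    """Parse a response body, rejecting outcomes outside the documented three."""
    data = json.loads(body)
    if data["outcome"] not in VALID_OUTCOMES:
        raise ValueError(f"unknown outcome: {data['outcome']!r}")
    return Verdict(
        outcome=data["outcome"],
        reason=data.get("reason"),
        policy_hash=data.get("policy_hash"),
        ledger_id=data.get("ledger_id"),
    )


def check_output(guardian: str, ai_output: str) -> Verdict:
    """Send one AI output to a Guardian and return its verdict."""
    request = build_request(build_payload(guardian, ai_output))
    with urllib.request.urlopen(request) as response:
        return parse_verdict(response.read())
```

Storing `ledger_id` and `policy_hash` alongside your own request logs keeps each decision traceable to its ledger receipt.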
