POST /v1/chat

POST /v1/chat (sandbox)
The core Guardian-mode call. Submit a conversation with a Guardian name and receive Passed / Corrected / Blocked.
Equivalent curl
curl "https://sandbox.trinitite.ai/v1/chat" \
  -H "Authorization: Bearer $TRINITITE_API_KEY" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
  "guardian": "pii-redactor",
  "instructions": "Redact all Social Security Numbers (XXX-XX-XXXX). Replace with [REDACTED].",
  "input": [
    {
      "role": "assistant",
      "content": "Customer SSN: 123-45-6789"
    }
  ]
}'

The sandbox uses your tenant's tk_test_ key. Sandbox calls do not bill, do not retain bodies, and write to a sandbox-only ledger partition.

For a richer in-browser demo with mock policy decisions, see the Verdict Playground.


Overview

The chat endpoint is Trinitite's core governance API. Send a conversation with a Guardian name and receive an instant verdict: Passed, Corrected, or Blocked.

The Guardian validates the AI output against its trained rubric and the runtime instructions, and either returns the output as-is, applies a surgical RFC 6902 JSON Patch correction, or blocks it with a full forensic record.

The endpoint has two modes:

Mode | When to use | Selector
Guardian Mode (default) | Govern an AI output against a Guardian's policies | governed: true (default)
Direct Mode | OpenAI-compatible chat completion that still writes to the audit ledger; useful for tool-calling agents that want a single endpoint | governed: false

Endpoint

POST https://api.trinitite.ai/v1/chat

Headers

Header | Required | Description
Authorization | Yes | Bearer <session_token | api_key>; see Authentication
Content-Type | Yes | application/json
Idempotency-Key | Optional | Stable client-supplied key; identical requests within 24 hours return the original response
X-Request-Id | Optional | Client correlation ID; echoed back in the response
X-Trinitite-Nhi-Token / X-Trinitite-Nhi-Id | Conditional | Required when acting on behalf of an autonomous workload (Non-Human Identity)
X-Trinitite-Workload-Origin | Conditional | Required alongside any NHI header
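
The header rules above can be sketched as a small helper. This is a minimal illustration, not part of any official SDK: `build_headers` and `new_idempotency_key` are hypothetical names, and using a UUID as the idempotency key is an assumption (the API only asks for a stable client-supplied key). NHI headers are left out because their values depend on your workload setup.

```python
import uuid
from typing import Dict, Optional


def build_headers(
    api_key: str,
    idempotency_key: Optional[str] = None,
    request_id: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble headers for POST /v1/chat (hypothetical helper)."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Optional headers are only sent when the caller supplies them.
    if idempotency_key is not None:
        headers["Idempotency-Key"] = idempotency_key
    if request_id is not None:
        headers["X-Request-Id"] = request_id
    return headers


def new_idempotency_key() -> str:
    """Create a key once per logical request and reuse it on every retry.

    The UUID format is an assumption; any stable unique string should work.
    """
    return str(uuid.uuid4())
```

Generate the key before the first attempt and pass the same value on retries, so that a repeated request within the 24-hour window returns the original response.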

Guardian Mode

Request body

{
  "guardian": "string",
  "instructions": "string",
  "input": [
    { "role": "developer | user | assistant | tool", "content": "string" }
  ],
  "temperature": 0.0,
  "top_p": 1.0,
  "max_tokens": 1024,
  "governed": true
}

Parameters

guardian (required)

string — the name of the Guardian to apply. Guardians are trained with a rubric and examples via the Guardians API.

"guardian": "PII-Redactor"

The Guardian must be in the ready state. Calls against a Guardian still in training return 422 unprocessable_entity.


instructions (required)

string — runtime instructions for this specific call. The active policy brief — specific, declarative, unambiguous.

"instructions": "Redact all Social Security Numbers (XXX-XX-XXXX) and credit card numbers. Replace with [REDACTED]. Block if more than three PII instances are detected."

Writing effective instructions:

  • Be explicit about what to detect.
  • Define the correction behavior (replace, remove, reformat).
  • Specify when to block vs. correct.
  • Cover edge cases (international formats, null fields).

Example patterns:

// PII redaction
"instructions": "Detect and redact: SSNs (XXX-XX-XXXX), credit cards (16 digits), US phone numbers. Replace each with [REDACTED]. Block if 5+ PII instances detected."

// Format enforcement
"instructions": "All ticket IDs must follow the format 'TKT-[number]' as a string. Correct numeric IDs or IDs missing the 'TKT-' prefix. Block if ticket_id is null or missing."

// Region compliance
"instructions": "Ensure all data queries are scoped to the US region. Detect and correct any queries attempting to access EU or APAC data. Block if the query explicitly requests cross-region data."
Required in Guardian Mode

Requests without instructions (when governed is true or unset) return 400 bad_request.


input (required)

array — the conversation history, OpenAI message format. The final message is the AI output being governed.

{
  "role": "developer | user | assistant | tool",
  "content": "string",
  "tool_call_id": "string (optional, for role=tool)",
  "tool_calls": "array (optional)"
}

Roles:

  • developer — system-level context (equivalent to OpenAI's system role)
  • user — end-user input
  • assistant — the AI output to be governed (typically the last message)
  • tool — output from a tool/function call

"input": [
  { "role": "developer", "content": "You are a customer support assistant. Never share PII." },
  { "role": "user", "content": "What is my account balance?" },
  { "role": "assistant", "content": "Your account is registered to John Doe, SSN: 123-45-6789, balance: $50,000." }
]

The Guardian evaluates the last message against the rubric and the provided instructions.
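
Because the Guardian evaluates the final message, it can help to sanity-check the array before sending it. The sketch below is a client-side check only, under the assumption that the four roles listed above are the complete set; `check_input` is a hypothetical helper and does not reproduce the server's own validation. The last-message check is a heuristic, since the assistant output is described as typically (not always) the final message.

```python
from typing import Any, Dict, List

ALLOWED_ROLES = {"developer", "user", "assistant", "tool"}


def check_input(messages: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems; an empty list means the input looks well-formed."""
    if not messages:
        return ["input must contain at least one message"]

    problems = []
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"input[{i}]: unknown role {role!r}")
        # Messages that only carry tool_calls may have no text content.
        if not isinstance(msg.get("content"), str) and not msg.get("tool_calls"):
            problems.append(f"input[{i}]: content must be a string")

    # Heuristic: the governed output is normally the final assistant message.
    if messages[-1].get("role") != "assistant":
        problems.append("last message is not role=assistant")
    return problems
```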


temperature (optional)

number, range 0.0 to 1.0, default 0.0. Sampling temperature. Trinitite is batch-invariant at every temperature setting: the same input always produces the same governance verdict.

top_p (optional)

number, range 0.0 to 1.0, default 1.0. Nucleus sampling cut-off.

max_tokens (optional)

integer, minimum 1. Maximum completion length.

governed (optional)

boolean, default true. When false, the request runs in Direct Mode (see below).

tools (optional)

array — only valid when governed: false. Tool-calling definitions in OpenAI shape. Sending tools with governed: true returns 400 bad_request.
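
The parameter constraints in this section can be checked before a request leaves your service. The following is a sketch under the rules stated above (instructions required in Guardian Mode, tools only in Direct Mode, documented ranges); `build_chat_body` is a hypothetical helper, and the server remains the source of truth for validation.

```python
from typing import Any, Dict, List, Optional


def build_chat_body(
    guardian: str,
    messages: List[Dict[str, Any]],
    instructions: Optional[str] = None,
    governed: bool = True,
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_tokens: Optional[int] = None,
    tools: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build a /v1/chat request body, rejecting combinations documented as 400s."""
    if governed and not instructions:
        raise ValueError("instructions is required when governed is true")
    if governed and tools:
        raise ValueError("tools is only valid when governed is false")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0.0 and 1.0")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")

    body: Dict[str, Any] = {
        "guardian": guardian,
        "input": messages,
        "governed": governed,
        "temperature": temperature,
        "top_p": top_p,
    }
    # Optional fields are omitted rather than sent as null.
    if instructions:
        body["instructions"] = instructions
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if tools:
        body["tools"] = tools
    return body
```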


Responses

Passed — 200 OK

The output is fully compliant. No changes needed.

{
  "id": "log_01JF8R3M3X4N5Q6T7V8W9Y0Z1A",
  "guardian": "PII-Redactor",
  "status": "passed",
  "created": "2026-05-01T15:42:00Z",
  "governance": {
    "action": "No violations detected. Output is compliant.",
    "corrections": []
  },
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 12,
    "total_tokens": 57
  }
}

What your app should do: use the original output from your input array unchanged.


Corrected — 200 OK

The Guardian applied a surgical correction.

{
  "id": "log_01JF8R3M4Y5N6Q7T8V9W0Y1Z2B",
  "guardian": "PII-Redactor",
  "status": "corrected",
  "created": "2026-05-01T15:42:01Z",
  "governance": {
    "action": "Redacted SSN to comply with PII policy.",
    "reason": "PII_EXPOSURE",
    "corrections": [
      {
        "op": "replace",
        "path": "/content",
        "value": "Your account is registered to John Doe, SSN: [REDACTED], balance: $50,000."
      }
    ]
  },
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 12,
    "total_tokens": 57
  }
}

What your app should do: apply the JSON Patch corrections to the last message in your input array and use the result.


Blocked — 403 Forbidden

The output contained a critical violation that could not be safely corrected.

{
  "id": "log_01JF8R3M5Z6N7Q8T9V0W1Y2Z3C",
  "guardian": "PII-Redactor",
  "status": "blocked",
  "created": "2026-05-01T15:42:02Z",
  "governance": {
    "action": "BLOCKED: Multiple critical PII violations detected.",
    "reason": "PII_EXFILTRATION",
    "violations": [
      { "type": "pii_exposure", "severity": "critical", "details": "Multiple Social Security Numbers detected", "count": 3 },
      { "type": "pii_exposure", "severity": "critical", "details": "Credit card number detected", "count": 1 }
    ]
  },
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 0,
    "total_tokens": 45
  }
}

What your app should do: do not use the original output. Return a safe fallback to the user and log the id for investigation.


JSON Patch format

Corrections use RFC 6902 JSON Patch — a standard for precise, minimal document mutations.

{ "op": "replace", "path": "/content", "value": "Corrected text here." }
{ "op": "replace", "path": "/function_call/arguments/ssn", "value": "[REDACTED]" }
{ "op": "remove", "path": "/metadata/sensitive_field" }
{ "op": "add", "path": "/governance_note", "value": "Corrected for compliance" }

The path is rooted at the last message in your input array. Apply the patch to that message, then resume your workflow with the patched content.
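
To make the path semantics concrete, here is a dependency-free sketch that applies the three operations shown above (add, replace, remove) to a copy of the last message. It is an illustration of how RFC 6902 paths resolve, not a full implementation: `apply_corrections` is a hypothetical helper, it ignores move, copy, and test, and it does not support the empty path that targets the whole document. For production use, a maintained library such as jsonpatch is the safer choice.

```python
import copy
from typing import Any, Dict, List


def _split_pointer(path: str) -> List[str]:
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if not path.startswith("/"):
        raise ValueError(f"unsupported path: {path!r}")
    # Per RFC 6901, "~1" decodes to "/" first, then "~0" decodes to "~".
    return [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]


def apply_corrections(
    message: Dict[str, Any], corrections: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Apply add/replace/remove operations to a deep copy of `message`."""
    doc = copy.deepcopy(message)
    for op in corrections:
        *parents, key = _split_pointer(op["path"])
        target: Any = doc
        # Walk down to the container that holds the final token.
        for token in parents:
            target = target[int(token)] if isinstance(target, list) else target[token]

        kind = op["op"]
        if isinstance(target, list):
            index = len(target) if key == "-" else int(key)
            if kind == "add":
                target.insert(index, op["value"])
            elif kind == "replace":
                target[index] = op["value"]
            elif kind == "remove":
                del target[index]
            else:
                raise ValueError(f"unsupported op: {kind!r}")
        else:
            if kind == "replace" and key not in target:
                raise KeyError(f"cannot replace missing member {key!r}")
            if kind in ("add", "replace"):
                target[key] = op["value"]
            elif kind == "remove":
                del target[key]
            else:
                raise ValueError(f"unsupported op: {kind!r}")
    return doc
```

Working on a deep copy keeps the original message available for logging or comparison after the correction is applied.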


Complete example

A customer support AI emits a response containing a Social Security Number.

Request

curl -X POST https://api.trinitite.ai/v1/chat \
  -H "Authorization: Bearer $TRINITITE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "guardian": "PII-Redactor",
  "instructions": "Detect and redact all Social Security Numbers (XXX-XX-XXXX). Replace with [REDACTED]. Block if multiple SSNs are present.",
  "input": [
    { "role": "developer", "content": "You are a customer support assistant. Never share PII." },
    { "role": "user", "content": "What is my account information?" },
    { "role": "assistant", "content": "Your account is registered to John Doe, SSN: 123-45-6789, balance: $50,000." }
  ],
  "temperature": 0.0
}'

Applying the correction

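import jsonpatch

# `data` is the parsed JSON response; `input_messages` is the list sent as "input".
last_message = input_messages[-1]
if data["status"] == "corrected":
    # apply() returns a patched copy and leaves last_message untouched.
    patch = jsonpatch.JsonPatch(data["governance"]["corrections"])
    final_text = patch.apply(last_message)["content"]
elif data["status"] == "passed":
    # Passed: use the original output unchanged.
    final_text = last_message["content"]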

Production integration pattern

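import os
import jsonpatch
import requests


class BlockedError(Exception):
    """Raised when the Guardian blocks the output."""

    def __init__(self, action, log_id):
        super().__init__(action)
        self.log_id = log_id


def govern(messages, guardian_name, instructions):
    response = requests.post(
        "https://api.trinitite.ai/v1/chat",
        headers={
            "Authorization": f"Bearer {os.environ['TRINITITE_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "guardian": guardian_name,
            "instructions": instructions,
            "input": messages,
            "temperature": 0.0,
        },
        timeout=10,  # arbitrary client-side limit; tune for your workload
    )

    # A 403 is either a Blocked verdict or a permission error (see Errors).
    # Only the verdict carries "status" and "governance".
    if response.status_code == 403:
        data = response.json()
        if data.get("status") == "blocked":
            raise BlockedError(data["governance"]["action"], log_id=data["id"])

    # Any other non-2xx response is an API error.
    response.raise_for_status()

    data = response.json()
    last_message = messages[-1]

    if data["status"] == "passed":
        return last_message["content"]

    if data["status"] == "corrected":
        patched = jsonpatch.JsonPatch(data["governance"]["corrections"]).apply(last_message)
        return patched["content"]

    raise RuntimeError(f"Unexpected status: {data['status']}")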

Direct Mode

Setting governed: false switches the endpoint to Direct Mode: an OpenAI-compatible chat completion that still writes a record to the Glass Box Ledger but performs no Guardian intervention. Use it when you want a single endpoint for both governed and ungoverned calls (the same Guardian name appears in the audit log either way). It also suits tool-calling flows, which are inherently incompatible with output-side rectification.

{
  "guardian": "Customer-Support",
  "governed": false,
  "input": [
    { "role": "user", "content": "What is the weather in Paris?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Retrieve current weather",
        "parameters": {
          "type": "object",
          "properties": { "city": { "type": "string" } },
          "required": ["city"]
        }
      }
    }
  ]
}

The response shape mirrors OpenAI's chat completion plus an id and guardian field for ledger correlation:

{
  "id": "log_01JF8R3M9N0Q1T2V3W4Y5Z6A7B",
  "object": "chat.completion",
  "created": "2026-05-01T15:43:11Z",
  "guardian": "Customer-Support",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The weather in Paris is currently 18°C and partly cloudy.",
        "tool_calls": []
      },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 24, "completion_tokens": 14, "total_tokens": 38 }
}

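
A minimal sketch for reading a Direct Mode response, assuming the shape shown above. `read_direct_response` is a hypothetical helper; it returns the text and any tool calls from the first choice so the caller can decide whether to run a tool or show the reply.

```python
from typing import Any, Dict, List, Tuple


def read_direct_response(data: Dict[str, Any]) -> Tuple[str, List[Dict[str, Any]]]:
    """Return (text, tool_calls) from the first choice of a Direct Mode response."""
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response has no choices")
    message = choices[0].get("message", {})
    # Either field may be absent or null when the other one is populated.
    return message.get("content") or "", message.get("tool_calls") or []
```

Keep the top-level `id` alongside the result if you need to correlate the call with its ledger record later.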
Want full proxy semantics?

For the OpenAI-compatible /v1/chat/completions shape with provider passthrough, see the Proxy endpoint.


Errors

HTTP | error.code | Cause
400 | validation_error | Body failed schema validation. details lists field paths
400 | bad_request | Missing instructions in Guardian Mode; tools set in Guardian Mode
401 | unauthenticated | Missing or invalid credential
403 | forbidden | Credential lacks guardians:read permission, or scoped to a different Guardian
404 | not_found | Guardian name not found in your organization
422 | unprocessable_entity | Guardian is not in ready state (e.g. still training)
429 | rate_limited | Per-organization rate limit exceeded; respect Retry-After
503 | emergency_shutdown | Org-wide emergency kill switch is engaged

{
  "error": {
    "code": "bad_request",
    "message": "Missing required field: instructions. Instructions are required when governed is true.",
    "details": { "field": "instructions" },
    "request_id": "req_01J9X..."
  }
}
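
Since a 403 can mean either a Blocked verdict or a permission error, callers usually need to look at the body as well as the status code. The sketch below shows one way to do that. `classify_response` and its return labels are hypothetical, and treating emergency_shutdown as non-retryable is an assumption about how a kill switch behaves, not documented policy.

```python
from typing import Any, Dict

VERDICTS = {"passed", "corrected", "blocked"}


def classify_response(status_code: int, body: Dict[str, Any]) -> str:
    """Map a /v1/chat response to 'passed', 'corrected', 'blocked', 'retry', or 'error'."""
    error = body.get("error")
    code = error.get("code") if isinstance(error, dict) else None

    if status_code == 429:
        return "retry"  # rate limited: wait for Retry-After, then try again
    if status_code >= 500:
        # Assumption: an engaged kill switch will not clear within a retry window.
        return "error" if code == "emergency_shutdown" else "retry"
    if error is not None:
        return "error"  # 4xx error envelope, including 403 forbidden

    status = body.get("status")
    return status if status in VERDICTS else "error"
```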

Performance

  • Typical latency: 50–400 ms per Guardian call.
  • Guardians are LoRA-cached after first use — no cold start beyond initial deploy.
  • Use the Idempotency-Key header to safely retry on 5xx without double-billing or double-logging.
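
The retry guidance above can be sketched as a loop that generates one Idempotency-Key and reuses it on every attempt. Everything here is illustrative: `call_with_retries` and the `send` callable are hypothetical, the attempt count and backoff values are arbitrary, and the transport is injected so the logic can be exercised without a network call. `send` receives extra headers and returns a (status code, response headers, parsed body) tuple.

```python
import time
import uuid
from typing import Any, Callable, Dict, Tuple

Sender = Callable[[Dict[str, str]], Tuple[int, Dict[str, str], Dict[str, Any]]]


def call_with_retries(
    send: Sender,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Dict[str, Any]]:
    """Retry on 429 and 5xx, reusing one Idempotency-Key across attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # Generated once so the server can recognize retries of the same request.
    idempotency_key = str(uuid.uuid4())

    for attempt in range(1, max_attempts + 1):
        status, resp_headers, body = send({"Idempotency-Key": idempotency_key})
        retryable = status == 429 or status >= 500
        if not retryable or attempt == max_attempts:
            return status, body

        # Exponential backoff by default; a numeric Retry-After overrides it.
        delay = base_delay * 2 ** (attempt - 1)
        try:
            delay = float(resp_headers["Retry-After"])
        except (KeyError, ValueError):
            pass
        sleep(delay)

    raise AssertionError("unreachable")
```

A plain dict lookup for Retry-After is case-sensitive; if `send` wraps the requests library, passing `response.headers` through avoids that because its header mapping is case-insensitive.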
