POST /v1/chat

Try it

POST /v1/chat (sandbox)

The core Guardian-mode call. Submit a conversation with a Guardian name and receive Passed / Corrected / Blocked.

Parameters

Guardian name

Conversation (JSON array of messages)

Equivalent curl

curl "https://sandbox.trinitite.ai/v1/chat" \
  -H "Authorization: Bearer $TRINITITE_API_KEY" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
  "guardian": "pii-redactor",
  "input": [
    {
      "role": "assistant",
      "content": "Customer SSN: 123-45-6789"
    }
  ]
}'


The sandbox uses your tenant's tk_test_ key. Sandbox calls do not bill, do not retain bodies, and write to a sandbox-only ledger partition.

For a richer in-browser demo with mock policy decisions, see the Verdict Playground.

Overview

The chat endpoint is Trinitite's core governance API. Send a conversation with a Guardian name and receive an instant verdict: Passed, Corrected, or Blocked.

The Guardian validates the AI output against its trained rubric and the runtime instructions, and either returns the output as-is, applies a surgical RFC 6902 JSON Patch correction, or blocks it with a full forensic record.

The endpoint has two modes:

Mode	When to use	Selector
Guardian Mode (default)	Govern an AI output against a Guardian's policies	`governed: true` (default)
Direct Mode	OpenAI-compatible chat completion that still writes to the audit ledger; useful for tool-calling agents that want a single endpoint	`governed: false`

Endpoint

POST https://api.trinitite.ai/v1/chat

Headers

Header	Required	Description
`Authorization`	Yes	`Bearer <session_token | api_key>` — see Authentication
`Content-Type`	Yes	`application/json`
`Idempotency-Key`	Optional	Stable client-supplied key; identical requests within 24 hours return the original response
`X-Request-Id`	Optional	Client correlation ID; echoed back in the response
`X-Trinitite-Nhi-Token` / `X-Trinitite-Nhi-Id`	Conditional	Required when acting on behalf of an autonomous workload (Non-Human Identity)
`X-Trinitite-Workload-Origin`	Conditional	Required alongside any NHI header
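
As a rough illustration of the table above, the helper below assembles request headers and enforces the one cross-header rule stated there: an NHI header must be accompanied by `X-Trinitite-Workload-Origin`. The name `build_headers` and its arguments are hypothetical and not part of any Trinitite SDK, and generating the idempotency key from a UUID is only one reasonable choice.

```python
import uuid


def build_headers(api_key, idempotency_key=None, request_id=None,
                  nhi_id=None, workload_origin=None):
    """Assemble headers for POST /v1/chat (hypothetical helper, not an SDK call)."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Optional: a stable key so a retried request returns the original response.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }
    if request_id:
        # Optional correlation ID; the API echoes it back in the response.
        headers["X-Request-Id"] = request_id
    if nhi_id:
        # The headers table requires a workload origin alongside any NHI header.
        if not workload_origin:
            raise ValueError("X-Trinitite-Workload-Origin is required with NHI headers")
        headers["X-Trinitite-Nhi-Id"] = nhi_id
        headers["X-Trinitite-Workload-Origin"] = workload_origin
    return headers
```

Failing fast on the client avoids a round trip for a request the table says would be incomplete.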

Guardian Mode

Request body

{
  "guardian": "string",
  "instructions": "string",
  "input": [
    { "role": "developer | user | assistant | tool", "content": "string" }
  ],
  "temperature": 0.0,
  "top_p": 1.0,
  "max_tokens": 1024,
  "governed": true
}

Parameters

`guardian` (required)

string — the name of the Guardian to apply. Guardians are trained with a rubric and examples via the Guardians API.

"guardian": "PII-Redactor"

The Guardian must be in the ready state. Calls against a Guardian still in training return 422 unprocessable_entity.

`instructions` (required)

string. Runtime instructions for this specific call. Treat this as the active policy brief: keep it specific, declarative, and unambiguous.

"instructions": "Redact all Social Security Numbers (XXX-XX-XXXX) and credit card numbers. Replace with [REDACTED]. Block if more than three PII instances are detected."

Writing effective instructions:

Be explicit about what to detect.
Define the correction behavior (replace, remove, reformat).
Specify when to block vs. correct.
Cover edge cases (international formats, null fields).

Example patterns:

// PII redaction
"instructions": "Detect and redact: SSNs (XXX-XX-XXXX), credit cards (16 digits), US phone numbers. Replace each with [REDACTED]. Block if 5+ PII instances detected."

// Format enforcement
"instructions": "All ticket IDs must follow the format 'TKT-[number]' as a string. Correct numeric IDs or IDs missing the 'TKT-' prefix. Block if ticket_id is null or missing."

// Region compliance
"instructions": "Ensure all data queries are scoped to the US region. Detect and correct any queries attempting to access EU or APAC data. Block if the query explicitly requests cross-region data."

Required in Guardian Mode

Requests without instructions (when governed is true or unset) return 400 bad_request.

`input` (required)

array — the conversation history, OpenAI message format. The final message is the AI output being governed.

{
  "role": "developer | user | assistant | tool",
  "content": "string",
  "tool_call_id": "string (optional, for role=tool)",
  "tool_calls": "array (optional)"
}

Roles:

developer — system-level context (equivalent to OpenAI's system role)
user — end-user input
assistant — the AI output to be governed (typically the last message)
tool — output from a tool/function call

"input": [
  { "role": "developer", "content": "You are a customer support assistant. Never share PII." },
  { "role": "user", "content": "What is my account balance?" },
  { "role": "assistant", "content": "Your account is registered to John Doe, SSN: 123-45-6789, balance: $50,000." }
]

The Guardian evaluates the last message against the rubric and the provided instructions.
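
To make the "last message is governed" rule concrete, here is a small sketch that checks roles and picks out the message the Guardian will evaluate. The helper name `governed_message` is hypothetical. Rejecting an empty list and rejecting roles outside the four listed above are assumptions about sensible client-side checks, not documented server behavior.

```python
ALLOWED_ROLES = {"developer", "user", "assistant", "tool"}


def governed_message(messages):
    """Validate roles and return the message the Guardian will evaluate (the last one)."""
    if not messages:
        # Assumption: an empty conversation has nothing to govern.
        raise ValueError("input must contain at least one message")
    for i, msg in enumerate(messages):
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"input[{i}] has unsupported role: {msg.get('role')!r}")
    return messages[-1]


conversation = [
    {"role": "developer", "content": "You are a customer support assistant. Never share PII."},
    {"role": "user", "content": "What is my account balance?"},
    {"role": "assistant", "content": "Your balance is $50,000."},
]
target = governed_message(conversation)  # the assistant message
```

Keeping a reference to `target` is useful later, because any corrections in the response are applied to that same message.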

`temperature` (optional)

number, range 0.0–1.0, default 0.0. Sampling temperature. Trinitite is batch-invariant at every temperature setting — the same input always produces the same governance verdict.

`top_p` (optional)

number, range 0.0–1.0, default 1.0. Nucleus sampling cut-off.

`max_tokens` (optional)

integer, minimum 1. Maximum completion length.

`governed` (optional)

boolean, default true. When false, the request runs in Direct Mode (see below).

`tools` (optional)

array — only valid when governed: false. Tool-calling definitions in OpenAI shape. Sending tools with governed: true returns 400 bad_request.
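
The parameter rules above can be checked before a request leaves your process. The sketch below mirrors only the constraints stated in this section (required fields, the `instructions` and `tools` mode rules, and the numeric ranges). `check_chat_body` is a hypothetical name, and treating `guardian` and `input` as required in both modes is an assumption based on the examples on this page.

```python
def check_chat_body(body):
    """Return a list of problems found in a /v1/chat request body (client-side sketch)."""
    problems = []
    governed = body.get("governed", True)  # documented default is true

    if not body.get("guardian"):
        problems.append("guardian is required")
    if not body.get("input"):
        problems.append("input is required")
    if governed and not body.get("instructions"):
        problems.append("instructions is required in Guardian Mode")
    if governed and "tools" in body:
        problems.append("tools is only valid when governed is false")

    temperature = body.get("temperature", 0.0)
    if not 0.0 <= temperature <= 1.0:
        problems.append("temperature must be between 0.0 and 1.0")
    top_p = body.get("top_p", 1.0)
    if not 0.0 <= top_p <= 1.0:
        problems.append("top_p must be between 0.0 and 1.0")
    if "max_tokens" in body and body["max_tokens"] < 1:
        problems.append("max_tokens must be at least 1")
    return problems
```

An empty list means the body satisfies the documented constraints; the server remains the source of truth and may reject a request for other reasons.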

Responses

Passed — `200 OK`

The output is fully compliant. No changes needed.

{
  "id": "log_01JF8R3M3X4N5Q6T7V8W9Y0Z1A",
  "guardian": "PII-Redactor",
  "status": "passed",
  "created": "2026-05-01T15:42:00Z",
  "governance": {
    "action": "No violations detected. Output is compliant.",
    "corrections": []
  },
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 12,
    "total_tokens": 57
  }
}

What your app should do: use the original output from your input array unchanged.

Corrected — `200 OK`

The Guardian applied a surgical correction.

{
  "id": "log_01JF8R3M4Y5N6Q7T8V9W0Y1Z2B",
  "guardian": "PII-Redactor",
  "status": "corrected",
  "created": "2026-05-01T15:42:01Z",
  "governance": {
    "action": "Redacted SSN to comply with PII policy.",
    "reason": "PII_EXPOSURE",
    "corrections": [
      {
        "op": "replace",
        "path": "/content",
        "value": "Your account is registered to John Doe, SSN: [REDACTED], balance: $50,000."
      }
    ]
  },
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 12,
    "total_tokens": 57
  }
}

What your app should do: apply the JSON Patch corrections to the last message in your input array and use the result.

Blocked — `403 Forbidden`

The output contained a critical violation that could not be safely corrected.

{
  "id": "log_01JF8R3M5Z6N7Q8T9V0W1Y2Z3C",
  "guardian": "PII-Redactor",
  "status": "blocked",
  "created": "2026-05-01T15:42:02Z",
  "governance": {
    "action": "BLOCKED: Multiple critical PII violations detected.",
    "reason": "PII_EXFILTRATION",
    "violations": [
      { "type": "pii_exposure", "severity": "critical", "details": "Multiple Social Security Numbers detected", "count": 3 },
      { "type": "pii_exposure", "severity": "critical", "details": "Credit card number detected", "count": 1 }
    ]
  },
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 0,
    "total_tokens": 45
  }
}

What your app should do: do not use the original output. Return a safe fallback to the user and log the id for investigation.
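
For the "log the id for investigation" step, a sketch like the following condenses a blocked response into a few fields for an incident log. The function name `summarize_block` and the severity scale are assumptions for illustration; only the `critical` severity appears in the example above.

```python
# Assumed ordering; the documentation only shows "critical".
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def summarize_block(data):
    """Condense a blocked response into fields suitable for an incident log."""
    violations = data["governance"].get("violations", [])
    total = sum(v.get("count", 1) for v in violations)
    worst = max(
        (v.get("severity", "low") for v in violations),
        key=lambda s: SEVERITY_ORDER.index(s) if s in SEVERITY_ORDER else -1,
        default=None,
    )
    return {
        "log_id": data["id"],
        "reason": data["governance"].get("reason"),
        "total_violations": total,
        "worst_severity": worst,
    }


blocked = {
    "id": "log_01JF8R3M5Z6N7Q8T9V0W1Y2Z3C",
    "status": "blocked",
    "governance": {
        "action": "BLOCKED: Multiple critical PII violations detected.",
        "reason": "PII_EXFILTRATION",
        "violations": [
            {"type": "pii_exposure", "severity": "critical",
             "details": "Multiple Social Security Numbers detected", "count": 3},
            {"type": "pii_exposure", "severity": "critical",
             "details": "Credit card number detected", "count": 1},
        ],
    },
}
summary = summarize_block(blocked)
```

The summary deliberately omits the `details` strings, so the incident log itself does not need to carry descriptions of the sensitive content.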

JSON Patch format

Corrections use RFC 6902 JSON Patch — a standard for precise, minimal document mutations.

{ "op": "replace", "path": "/content", "value": "Corrected text here." }
{ "op": "replace", "path": "/function_call/arguments/ssn", "value": "[REDACTED]" }
{ "op": "remove",  "path": "/metadata/sensitive_field" }
{ "op": "add",     "path": "/governance_note", "value": "Corrected for compliance" }

The path is rooted at the last message in your input array. Apply the patch to that message, then resume your workflow with the patched content.

Complete example

A customer support AI emits a response containing a Social Security Number.

Request

The same request in cURL, Python, JavaScript, and Java, in that order:

curl -X POST https://api.trinitite.ai/v1/chat \
  -H "Authorization: Bearer $TRINITITE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "guardian": "PII-Redactor",
    "instructions": "Detect and redact all Social Security Numbers (XXX-XX-XXXX). Replace with [REDACTED]. Block if multiple SSNs are present.",
    "input": [
      { "role": "developer", "content": "You are a customer support assistant. Never share PII." },
      { "role": "user", "content": "What is my account information?" },
      { "role": "assistant", "content": "Your account is registered to John Doe, SSN: 123-45-6789, balance: $50,000." }
    ],
    "temperature": 0.0
  }'

import os
import requests

response = requests.post(
    "https://api.trinitite.ai/v1/chat",
    headers={
        "Authorization": f"Bearer {os.environ['TRINITITE_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "guardian": "PII-Redactor",
        "instructions": (
            "Detect and redact all Social Security Numbers (XXX-XX-XXXX). "
            "Replace with [REDACTED]. Block if multiple SSNs are present."
        ),
        "input": [
            {"role": "developer", "content": "You are a customer support assistant. Never share PII."},
            {"role": "user", "content": "What is my account information?"},
            {"role": "assistant", "content": "Your account is registered to John Doe, SSN: 123-45-6789, balance: $50,000."},
        ],
        "temperature": 0.0,
    },
)

data = response.json()

const response = await fetch('https://api.trinitite.ai/v1/chat', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.TRINITITE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    guardian: 'PII-Redactor',
    instructions:
      'Detect and redact all Social Security Numbers (XXX-XX-XXXX). Replace with [REDACTED]. Block if multiple SSNs are present.',
    input: [
      { role: 'developer', content: 'You are a customer support assistant. Never share PII.' },
      { role: 'user', content: 'What is my account information?' },
      { role: 'assistant', content: 'Your account is registered to John Doe, SSN: 123-45-6789, balance: $50,000.' },
    ],
    temperature: 0.0,
  }),
});

const data = await response.json();

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

HttpClient client = HttpClient.newHttpClient();

String body = """
    {
      "guardian": "PII-Redactor",
      "instructions": "Detect and redact all Social Security Numbers (XXX-XX-XXXX). Replace with [REDACTED]. Block if multiple SSNs are present.",
      "input": [
        {"role": "developer", "content": "You are a customer support assistant. Never share PII."},
        {"role": "user", "content": "What is my account information?"},
        {"role": "assistant", "content": "Your account is registered to John Doe, SSN: 123-45-6789, balance: $50,000."}
      ],
      "temperature": 0.0
    }
    """;

HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create("https://api.trinitite.ai/v1/chat"))
    .header("Authorization", "Bearer " + System.getenv("TRINITITE_API_KEY"))
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString(body))
    .build();

HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

Applying the correction

In Python, JavaScript, and Java, in that order:

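import jsonpatch

# `input_messages` is the list sent as "input"; `data` is the parsed response body.
last_message = input_messages[-1]

if data["status"] == "corrected":
    patch = jsonpatch.JsonPatch(data["governance"]["corrections"])
    # apply() returns a patched copy and leaves last_message unchanged.
    corrected = patch.apply(last_message)
    final_text = corrected["content"]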

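import jsonpatch from 'fast-json-patch';

// `input` is the array sent as "input"; `data` is the parsed response body.
const lastMessage = input[input.length - 1];
let finalText = lastMessage.content;

if (data.status === 'corrected') {
  // validateOperation = true, mutateDocument = false: the original message is left untouched.
  const corrected = jsonpatch.applyPatch(
    lastMessage,
    data.governance.corrections,
    true,
    false,
  ).newDocument;
  finalText = corrected.content;
}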

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.flipkart.zjsonpatch.JsonPatch;

ObjectMapper mapper = new ObjectMapper();
JsonNode payload = mapper.readTree(response.body());
JsonNode lastMessage = mapper.valueToTree(input.get(input.size() - 1));

if ("corrected".equals(payload.get("status").asText())) {
  JsonNode corrections = payload.get("governance").get("corrections");
  JsonNode patched = JsonPatch.apply(corrections, lastMessage);
  String finalText = patched.get("content").asText();
}

Production integration pattern

In Python and JavaScript, in that order:

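import os
import jsonpatch
import requests


class BlockedError(Exception):
    """Raised when the Guardian blocks the output."""

    def __init__(self, action, log_id):
        super().__init__(action)
        self.log_id = log_id


def govern(messages, guardian_name, instructions):
    response = requests.post(
        "https://api.trinitite.ai/v1/chat",
        headers={
            "Authorization": f"Bearer {os.environ['TRINITITE_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "guardian": guardian_name,
            "instructions": instructions,
            "input": messages,
            "temperature": 0.0,
        },
        timeout=10,  # tune to your latency budget
    )
    data = response.json()

    # A 403 is either a Blocked verdict or a `forbidden` auth error;
    # only the verdict carries a top-level "status".
    if response.status_code == 403 and data.get("status") == "blocked":
        raise BlockedError(data["governance"]["action"], log_id=data["id"])

    # Any other non-2xx response is an API error (400, 401, 404, 422, 429, 503).
    response.raise_for_status()

    last_message = messages[-1]

    if data["status"] == "passed":
        return last_message["content"]

    if data["status"] == "corrected":
        # apply() returns a patched copy; the caller's message is not modified.
        patched = jsonpatch.JsonPatch(data["governance"]["corrections"]).apply(last_message)
        return patched["content"]

    raise RuntimeError(f"Unexpected status: {data['status']}")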

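import jsonpatch from 'fast-json-patch';

async function govern(messages, guardianName, instructions) {
  const response = await fetch('https://api.trinitite.ai/v1/chat', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TRINITITE_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      guardian: guardianName,
      instructions,
      input: messages,
      temperature: 0.0,
    }),
  });

  const data = await response.json();
  const lastMessage = messages[messages.length - 1];

  // A 403 is either a Blocked verdict or a `forbidden` auth error;
  // only the verdict carries a top-level `status`.
  if (response.status === 403 && data.status === 'blocked') {
    throw new Error(`Blocked: ${data.governance.action} (id: ${data.id})`);
  }

  // Any other non-2xx response is an API error (400, 401, 404, 422, 429, 503).
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${data.error?.code ?? 'unknown'}`);
  }

  switch (data.status) {
    case 'passed':
      return lastMessage.content;
    case 'corrected':
      // validateOperation = true, mutateDocument = false: patch a copy, not the caller's message.
      return jsonpatch
        .applyPatch(lastMessage, data.governance.corrections, true, false)
        .newDocument.content;
    default:
      throw new Error(`Unexpected status: ${data.status}`);
  }
}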

Direct Mode

Setting governed: false switches the endpoint to Direct Mode: an OpenAI-compatible chat completion that still writes a record to the Glass Box Ledger but performs no Guardian intervention. Use it when you want a single endpoint for both governed and ungoverned calls (the same Guardian name appears in the audit log either way), or for tool-calling flows that cannot be rectified on the output side.

{
  "guardian": "Customer-Support",
  "governed": false,
  "input": [
    { "role": "user", "content": "What is the weather in Paris?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Retrieve current weather",
        "parameters": {
          "type": "object",
          "properties": { "city": { "type": "string" } },
          "required": ["city"]
        }
      }
    }
  ]
}

The response shape mirrors OpenAI's chat completion plus an id and guardian field for ledger correlation:

{
  "id": "log_01JF8R3M9N0Q1T2V3W4Y5Z6A7B",
  "object": "chat.completion",
  "created": "2026-05-01T15:43:11Z",
  "guardian": "Customer-Support",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The weather in Paris is currently 18°C and partly cloudy.",
        "tool_calls": []
      },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 24, "completion_tokens": 14, "total_tokens": 38 }
}
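
A minimal sketch of reading that response, assuming the shape shown above. The helper name `read_direct_reply` is hypothetical. It keeps the ledger `id` next to the reply so the call can be looked up later, and treats a missing or empty `tool_calls` as a plain-text answer.

```python
def read_direct_reply(data):
    """Extract the ledger id, reply text, and tool calls from a Direct Mode response."""
    message = data["choices"][0]["message"]
    return {
        "log_id": data["id"],
        "text": message.get("content"),
        # `tool_calls` may be empty or absent when the model answered in plain text.
        "tool_calls": message.get("tool_calls") or [],
    }


sample = {
    "id": "log_01JF8R3M9N0Q1T2V3W4Y5Z6A7B",
    "object": "chat.completion",
    "guardian": "Customer-Support",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The weather in Paris is currently 18°C and partly cloudy.",
                "tool_calls": [],
            },
            "finish_reason": "stop",
        }
    ],
}
reply = read_direct_reply(sample)
```

If `tool_calls` is non-empty, your agent would run the requested tools and send their results back as `tool` messages in a follow-up request.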

Want full proxy semantics?

For the OpenAI-compatible /v1/chat/completions shape with provider passthrough, see the Proxy endpoint.

Errors

HTTP	`error.code`	Cause
`400`	`validation_error`	Body failed schema validation. `details` lists field paths
`400`	`bad_request`	Missing `instructions` in Guardian Mode; `tools` set in Guardian Mode
`401`	`unauthenticated`	Missing or invalid credential
`403`	`forbidden`	Credential lacks `guardians:read` permission, or is scoped to a different Guardian
`404`	`not_found`	Guardian name not found in your organization
`422`	`unprocessable_entity`	Guardian is not in `ready` state (e.g. still `training`)
`429`	`rate_limited`	Per-organization rate limit exceeded; respect `Retry-After`
`503`	`emergency_shutdown`	Org-wide emergency kill switch is engaged

{
  "error": {
    "code": "bad_request",
    "message": "Missing required field: instructions. Instructions are required when governed is true.",
    "details": { "field": "instructions" },
    "request_id": "req_01J9X..."
  }
}
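
Because HTTP 403 is used both for a Blocked verdict and for the `forbidden` error, the status code alone is not enough to tell them apart. The sketch below distinguishes them by body shape: error responses carry a top-level `error` object, verdicts carry a top-level `status`. `ChatApiError` and `classify` are hypothetical names, not part of an SDK.

```python
class ChatApiError(Exception):
    """Raised for error envelopes of the form {"error": {...}} (hypothetical class)."""

    def __init__(self, http_status, code, message, request_id=None):
        super().__init__(f"{http_status} {code}: {message}")
        self.http_status = http_status
        self.code = code
        self.request_id = request_id


def classify(http_status, body):
    """Return the Guardian Mode verdict string, or raise ChatApiError for an error envelope."""
    if "error" in body:
        err = body["error"]
        raise ChatApiError(
            http_status, err.get("code"), err.get("message"), err.get("request_id")
        )
    # Verdict payloads carry a top-level "status": passed, corrected, or blocked.
    return body.get("status")
```

Keeping `request_id` on the exception makes it easy to quote in a support ticket or correlate with your own logs.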

Performance

Typical latency: 50–400 ms per Guardian call.
Guardians are LoRA-cached after first use — no cold start beyond initial deploy.
Use the Idempotency-Key header to safely retry on 5xx without double-billing or double-logging.
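
The retry advice above can be sketched as follows. The transport is passed in as a `send` callable so the retry logic stays independent of any HTTP library; that callable, the function name `send_with_retry`, and the backoff constants are all assumptions for illustration. `Retry-After` is read as a number of seconds, with exponential backoff as the fallback when it is absent or not numeric.

```python
import time
import uuid


def send_with_retry(send, body, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry on 429/5xx, reusing one Idempotency-Key across all attempts.

    `send(body, headers)` is a caller-supplied function returning
    (status_code, response_headers, parsed_json).
    """
    headers = {"Idempotency-Key": str(uuid.uuid4())}  # same key for every attempt
    for attempt in range(1, max_attempts + 1):
        status, resp_headers, data = send(body, headers)
        retryable = status == 429 or status >= 500
        if not retryable or attempt == max_attempts:
            return status, data
        try:
            # Prefer the server's Retry-After when it is a number of seconds.
            delay = float(resp_headers["Retry-After"])
        except (KeyError, ValueError):
            delay = base_delay * 2 ** (attempt - 1)
        sleep(delay)
```

Reusing the same key on every attempt is what lets the server recognize a retry and return the original response instead of processing the request a second time.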

Next steps

Create Guardians programmatically → Guardians API
Inspect the Glass Box Ledger → Logs API
OpenAI/Anthropic-compatible passthrough → Proxy endpoint
Authentication and rate limits → Authentication
