SLA & Limits

What we promise, what we measure, and what you can rely on for capacity planning.

For framework-by-framework compliance, see Compliance Matrix. For company-side trust controls, see Trust Center.

Availability

Tier	Target	Measured against	Credit
Sandbox	best-effort	n/a	n/a
Standard	99.9 % / month	Control-plane availability	10 % service credit per 0.1 % below
Enterprise	99.95 % / month	Control-plane + Guardian inference	25 % service credit per 0.05 % below; 100 % at < 99.5 %
Sovereign	per contract	Customer-defined SLOs	per contract

Scheduled maintenance announced at least 7 days in advance is excluded from the availability calculation.
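As a rough illustration of how the credit column reads, the sketch below computes a credit percentage from a measured monthly availability. It makes two assumptions that the table does not state: a partial step below target counts as a full step, and the credit is capped at 100 %. The `TIERS` table and the function name are hypothetical; your contract is the authority on the actual calculation.

```python
from decimal import Decimal, ROUND_CEILING

# Values copied from the availability table. "full_credit_below" is the
# Enterprise-only rule "100 % at < 99.5 %".
TIERS = {
    "standard": {
        "target": Decimal("99.9"),
        "step": Decimal("0.1"),
        "credit_per_step": Decimal("10"),
        "full_credit_below": None,
    },
    "enterprise": {
        "target": Decimal("99.95"),
        "step": Decimal("0.05"),
        "credit_per_step": Decimal("25"),
        "full_credit_below": Decimal("99.5"),
    },
}


def service_credit_percent(tier: str, measured_availability: str) -> Decimal:
    """Return the service credit (in percent) for one month.

    Assumptions (not stated in the table): partial steps round up,
    and the credit never exceeds 100 %.
    """
    rule = TIERS[tier]
    measured = Decimal(str(measured_availability))
    if measured >= rule["target"]:
        return Decimal("0")
    threshold = rule["full_credit_below"]
    if threshold is not None and measured < threshold:
        return Decimal("100")
    shortfall = rule["target"] - measured
    steps = (shortfall / rule["step"]).to_integral_value(rounding=ROUND_CEILING)
    return min(steps * rule["credit_per_step"], Decimal("100"))
```

Decimal arithmetic is used so that a shortfall such as 99.9 minus 99.7 divides into exactly two steps rather than 1.999…, which binary floats would produce.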

Latency SLOs

Surface	Target (p99)	Notes
Guardian decision (cache hit)	< 50 ms	Hot path for repeated outputs.
Guardian decision (cache miss)	< 400 ms	Default Guardian inference path.
Proxy (Guardian + upstream LLM)	upstream + < 400 ms	The Guardian adds at most 400 ms to your existing call.
MCP `tools/call` end-to-end	upstream + < 250 ms	Includes pre and post phases.
`/v1/logs` query (last 7 days, ≤ 100 results)	< 800 ms	Beyond 7 days, see retention windows.
`POST /v1/training/retrain`	30 – 90 min	Async; poll `GET /v1/training/jobs/:id`.

All latency SLOs are measured at the edge; network latency between your client and the edge is additional.
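For capacity planning, one way to turn these targets into a client-side timeout is to add the Guardian overhead and your own network round trip to the upstream p99. The sketch below is an assumption about how you might budget, not a platform recommendation; the function name, the surface keys, and the 1.25 safety margin are all hypothetical.

```python
import math

# p99 overhead the platform adds on top of the upstream call,
# taken from the latency table (milliseconds).
GUARDIAN_OVERHEAD_MS = {
    "proxy": 400,            # Proxy (Guardian + upstream LLM)
    "mcp_tools_call": 250,   # MCP tools/call end-to-end
}


def client_timeout_ms(surface: str, upstream_p99_ms: float,
                      network_rtt_ms: float,
                      safety_margin: float = 1.25) -> int:
    """Estimate a client timeout in milliseconds.

    upstream_p99_ms and network_rtt_ms are your own measurements;
    safety_margin is an arbitrary multiplier chosen for this sketch.
    """
    budget = upstream_p99_ms + GUARDIAN_OVERHEAD_MS[surface] + network_rtt_ms
    return math.ceil(budget * safety_margin)
```

For example, an upstream p99 of 2000 ms and a 40 ms round trip give a 2440 ms budget on the proxy path before the margin is applied.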

Rate limits

Surface	Sandbox	Standard	Enterprise
`POST /v1/chat`	10 RPS	100 RPS	per contract
`POST /v1/proxy/*`	10 RPS	100 RPS	per contract
`POST /v1/mcp/sessions/*`	5 RPS	50 RPS	per contract
`POST /v1/cli/evaluate`	5 RPS	50 RPS	per contract
`GET /v1/logs`	1 RPS	10 RPS	per contract
`POST /v1/training/retrain`	1 / hour	10 / hour	per contract
Other reads	10 RPS	50 RPS	per contract

When a limit is hit, the response is HTTP 429 with a `Retry-After` header (in seconds). Always honor `Retry-After`; clients that ignore it and keep sending requests are throttled at the edge for 10 minutes.
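A minimal sketch of a retry loop that honors the header is shown below. It is transport-agnostic: `send` is a hypothetical callable you supply that performs one request and returns `(status, headers, body)`. The fallback delay of 1 second when the header is missing is an assumption of this sketch, not documented platform behavior.

```python
import time


def call_with_retry_after(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns something other than HTTP 429.

    send:  hypothetical callable returning (status, headers_dict, body).
    sleep: injectable so the loop can be tested without real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts:
            break
        # Header names are case-insensitive in HTTP, so normalize first.
        normalized = {k.lower(): v for k, v in headers.items()}
        # Assumption: wait 1 s if the header is unexpectedly absent.
        delay = float(normalized.get("retry-after", "1"))
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Waiting for the full `Retry-After` interval before each retry keeps the client out of the 10-minute edge throttle described above.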

Pagination

All list endpoints use cursor-based pagination:

curl "$TRINITITE_BASE/v1/logs?limit=100" \
  -H "Authorization: Bearer $TRINITITE_API_KEY"
# → { "data": [...], "next_cursor": "eyJ0..." }

curl "$TRINITITE_BASE/v1/logs?limit=100&cursor=eyJ0..." \
  -H "Authorization: Bearer $TRINITITE_API_KEY"

`limit` defaults to 25 and is capped at 100. `next_cursor` is null on the last page.
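The same loop can be sketched in Python: keep requesting pages until `next_cursor` is null. The environment variable names and the response shape come from the curl example above; the helper names `fetch_logs_page` and `iter_items` are hypothetical, and the sketch uses only the standard library.

```python
import json
import os
import urllib.parse
import urllib.request


def fetch_logs_page(cursor=None, limit=100):
    """Fetch one page of /v1/logs and return the decoded JSON object."""
    params = {"limit": limit}
    if cursor is not None:
        params["cursor"] = cursor
    url = f"{os.environ['TRINITITE_BASE']}/v1/logs?{urllib.parse.urlencode(params)}"
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {os.environ['TRINITITE_API_KEY']}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_items(fetch_page):
    """Yield every item from a cursor-paginated endpoint.

    fetch_page(cursor) must return a dict with "data" and "next_cursor".
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("next_cursor")
        if cursor is None:  # null next_cursor marks the last page
            return
```

Calling `list(iter_items(fetch_logs_page))` would walk every page; because `GET /v1/logs` has its own rate limit, a long walk may need to pause between pages.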

Idempotency

Mutating endpoints accept an `X-Trinitite-Idempotency-Key` header. The platform deduplicates requests within a 24-hour window. Reusing a key with the same body bytes returns the original response; reusing it with different body bytes returns `409 idempotency_key_reused`.

Best practice: generate a UUIDv4 per logical operation and include it on every retry of that operation.
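To reason about how a client should behave, the rules above can be written as a toy in-memory model. This is not the platform's implementation; the class and method names are hypothetical, and the behavior after the 24-hour window (the key is treated as new) is an assumption of the sketch.

```python
import hashlib
import uuid

WINDOW_SECONDS = 24 * 60 * 60  # the documented 24-hour dedup window


class IdempotencyModel:
    """Toy model of key-based deduplication, for illustration only."""

    def __init__(self):
        # key -> (first_seen_timestamp, body_digest, original_response)
        self._seen = {}

    def handle(self, key, body: bytes, now: float, execute):
        digest = hashlib.sha256(body).hexdigest()
        entry = self._seen.get(key)
        if entry is not None and now - entry[0] < WINDOW_SECONDS:
            _, seen_digest, response = entry
            if seen_digest == digest:
                return response  # same key, same bytes: replay original
            return (409, "idempotency_key_reused")
        # First use of the key (or, by assumption, use after the window).
        response = execute(body)
        self._seen[key] = (now, digest, response)
        return response


def new_idempotency_key() -> str:
    """One UUIDv4 per logical operation, reused on every retry."""
    return str(uuid.uuid4())
```

The practical consequence for clients is that the key must be generated once, outside the retry loop, and the request body must be serialized once so that retries send identical bytes.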

Versioning

The API is versioned by path prefix: `/v1/...`. We commit to the following:

- No breaking changes within a major version.
- New optional fields can appear at any time.
- New error codes can appear at any time. They are always 4xx or 5xx; an existing 2xx response is never silently changed to a 4xx.
- Deprecations are announced with a minimum 6-month sunset window.
- Endpoints in Beta are not covered by the commitments above and may change with notice.
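Since new optional fields and new error codes can appear at any time, clients are more robust when they ignore fields they do not know and fall back on the status class for codes they do not recognize. The sketch below shows one way to do that; `LogPage`, `parse_tolerant`, and `error_family` are hypothetical names, and `LogPage` models only the two fields visible in the pagination example.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class LogPage:
    # Only the fields shown in the pagination example are modelled here.
    data: list
    next_cursor: Optional[str] = None


def parse_tolerant(cls, payload: dict):
    """Build cls from payload, silently dropping unknown keys."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in payload.items() if k in known})


def error_family(status: int) -> str:
    """Classify a status code without needing to know the exact code."""
    if 400 <= status < 500:
        return "client_error"
    if 500 <= status < 600:
        return "server_error"
    return "not_an_error"
```

A strict parser that rejects unknown keys would break the first time an optional field is added, even though that addition is explicitly allowed within a major version.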

Determinism guarantees

For any two requests with the same:

- Guardian (and adapter version),
- Policy hash,
- Input bytes,
- Determinism mode flag,

the platform returns bit-identical output. Replays outside that envelope are classified per the replay verdict taxonomy.
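If you want to check this guarantee against your own records, one approach is to fingerprint the four envelope components and compare output digests within each fingerprint. The sketch below is a client-side audit idea only; the function names, the string types chosen for each component, and the use of SHA-256 are assumptions, and it says nothing about how the platform identifies envelopes internally.

```python
import hashlib


def envelope_key(guardian: str, adapter_version: str, policy_hash: str,
                 input_bytes: bytes, determinism_mode: str) -> str:
    """Fingerprint the components that define the determinism envelope."""
    h = hashlib.sha256()
    parts = (
        guardian.encode(),
        adapter_version.encode(),
        policy_hash.encode(),
        input_bytes,
        determinism_mode.encode(),
    )
    for part in parts:
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


def mismatched_envelopes(records):
    """Return envelope keys whose recorded outputs are not bit-identical.

    records: iterable of (envelope_key, output_bytes) pairs.
    """
    first_digest = {}
    mismatched = set()
    for key, output in records:
        digest = hashlib.sha256(output).digest()
        if first_digest.setdefault(key, digest) != digest:
            mismatched.add(key)
    return mismatched
```

An empty result means every pair of recorded requests inside the same envelope produced identical bytes; any key that is returned is a candidate to raise with support.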
