SLA & Limits
What we promise, what we measure, what you can rely on for capacity planning.
For framework-by-framework compliance, see Compliance Matrix. For company-side trust controls, see Trust Center.
Availability
| Tier | Target | Measured against | Credit |
|---|---|---|---|
| Sandbox | best-effort | n/a | n/a |
| Standard | 99.9 % / month | Control-plane availability | 10 % service credit per 0.1 % below |
| Enterprise | 99.95 % / month | Control-plane + Guardian inference | 25 % service credit per 0.05 % below; 100 % at < 99.5 % |
| Sovereign | per contract | Customer-defined SLOs | per contract |
Availability figures exclude scheduled maintenance announced at least 7 days in advance.
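The credit column can be read as a step function of measured availability. The sketch below is one interpretation, not the contractual formula. It assumes that a partial step below target counts as a full step, that credits cap at 100 %, and that availability is passed in basis points (1 bp = 0.01 %). The function name and the `TIERS` table are illustrative.

```python
# Availability is expressed in basis points (1 bp = 0.01 %) to avoid float rounding.
# tier: (target, step size, credit % per step, full-credit threshold or None)
TIERS = {
    "standard": (9990, 10, 10, None),    # 99.9 %, 10 % credit per 0.1 % below
    "enterprise": (9995, 5, 25, 9950),   # 99.95 %, 25 % per 0.05 % below, 100 % under 99.5 %
}


def service_credit_percent(tier: str, measured_bp: int) -> int:
    """Return the service credit (in %) for a month's measured availability."""
    target_bp, step_bp, credit_per_step, full_below_bp = TIERS[tier]
    if measured_bp >= target_bp:
        return 0
    if full_below_bp is not None and measured_bp < full_below_bp:
        return 100
    shortfall_bp = target_bp - measured_bp
    # Assumption: a partial step is rounded up to a whole step (ceiling division).
    steps = (shortfall_bp + step_bp - 1) // step_bp
    # Assumption: the credit never exceeds 100 % of the monthly fee.
    return min(100, steps * credit_per_step)
```

Under these assumptions, `service_credit_percent("standard", 9975)` (99.75 %) gives 20 and `service_credit_percent("enterprise", 9990)` (99.90 %) gives 25. Sandbox and Sovereign are left out because the table defines no formula for them.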
Latency SLOs
| Surface | Target (p99) | Notes |
|---|---|---|
| Guardian decision (cache hit) | < 50 ms | Hot path for repeated outputs. |
| Guardian decision (cache miss) | < 400 ms | Default Guardian inference path. |
| Proxy (Guardian + upstream LLM) | upstream + < 400 ms | The Guardian adds at most 400 ms to your existing call. |
| MCP tools/call end-to-end | upstream + < 250 ms | Includes pre and post phases. |
| /v1/logs query (last 7 days, ≤ 100 results) | < 800 ms | Beyond 7 days, see retention windows. |
| POST /v1/training/retrain | 30 – 90 min | Async; poll GET /v1/training/jobs/:id. |
All latency SLOs are measured at the edge; network latency between your client and the edge is additional.
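Because POST /v1/training/retrain is asynchronous and takes 30 to 90 minutes, clients usually poll GET /v1/training/jobs/:id on a slow interval rather than block. This page does not specify the job response shape, so the sketch below takes the fetch call and the "finished" check as caller-supplied functions. `wait_for_job`, the 60-second interval, and the 2-hour timeout are illustrative assumptions.

```python
import time
from typing import Any, Callable


def wait_for_job(
    fetch_job: Callable[[], Any],
    is_finished: Callable[[Any], bool],
    interval_s: float = 60.0,           # assumption: one poll per minute
    timeout_s: float = 2 * 60 * 60,     # assumption: give up after 2 hours
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll fetch_job() until is_finished(job) is true or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_job()
        if is_finished(job):
            return job
        # Stop before sleeping if the next poll would land past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError("training job did not finish before the timeout")
        sleep(interval_s)
```

A one-minute interval stays well inside the "Other reads" rate limit on every tier, and `sleep` and `clock` are injectable so the loop can be tested without waiting.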
Rate limits
| Surface | Sandbox | Standard | Enterprise |
|---|---|---|---|
| POST /v1/chat | 10 RPS | 100 RPS | per contract |
| POST /v1/proxy/* | 10 RPS | 100 RPS | per contract |
| POST /v1/mcp/sessions/* | 5 RPS | 50 RPS | per contract |
| POST /v1/cli/evaluate | 5 RPS | 50 RPS | per contract |
| GET /v1/logs | 1 RPS | 10 RPS | per contract |
| POST /v1/training/retrain | 1 / hour | 10 / hour | per contract |
| Other reads | 10 RPS | 50 RPS | per contract |
When a limit is hit, the response is HTTP 429 with a Retry-After header (in seconds). Always honor Retry-After; clients that ignore it and keep retrying are throttled at the edge for 10 minutes.
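A minimal sketch of honoring Retry-After on the client side follows. It is transport-agnostic: `send` is a caller-supplied function that performs one HTTP request and returns `(status, headers, body)`. The function name, the attempt limit, and the 1-second fallback when the header is missing or unparseable are assumptions, not documented platform behavior.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def call_with_retry_after(
    send: Callable[[], Response],
    max_attempts: int = 5,                      # assumption: illustrative limit
    sleep: Callable[[float], None] = time.sleep,
    default_wait: float = 1.0,                  # assumption: fallback wait in seconds
) -> Response:
    """Call send() until it stops returning 429 or attempts run out."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        # HTTP header names are case-insensitive, so normalize before lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            wait = float(lowered.get("retry-after", default_wait))
        except ValueError:
            wait = default_wait
        sleep(max(0.0, wait))
        response = send()
    return response
```

If every attempt returns 429, the last 429 response is returned so the caller can decide whether to surface the error or queue the work for later.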
Pagination
All list endpoints use cursor-based pagination:
curl "$TRINITITE_BASE/v1/logs?limit=100" \
  -H "Authorization: Bearer $TRINITITE_API_KEY"
# → { "data": [...], "next_cursor": "eyJ0..." }
curl "$TRINITITE_BASE/v1/logs?limit=100&cursor=eyJ0..." \
  -H "Authorization: Bearer $TRINITITE_API_KEY"
limit defaults to 25 and is capped at 100. next_cursor is null on the last page.
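The cursor loop can be wrapped in a generator so callers iterate over items rather than pages. This sketch relies only on the `data` and `next_cursor` fields shown above. `iter_all` and `fetch_logs_page` are illustrative names, and the environment variables mirror the curl example.

```python
import json
import os
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item from a cursor-paginated endpoint, page by page."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("next_cursor")
        if cursor is None:  # null next_cursor marks the last page
            break


def fetch_logs_page(cursor: Optional[str], limit: int = 100) -> dict:
    """Fetch one page of /v1/logs; limit is capped at 100 by the API."""
    params = {"limit": limit}
    if cursor is not None:
        params["cursor"] = cursor
    url = f"{os.environ['TRINITITE_BASE']}/v1/logs?{urllib.parse.urlencode(params)}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {os.environ['TRINITITE_API_KEY']}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Usage would look like `for entry in iter_all(fetch_logs_page): ...`. Since GET /v1/logs is limited to 1 RPS on Sandbox and 10 RPS on Standard, long scans may need a pause between pages or a retry wrapper that honors Retry-After.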
Idempotency
Mutating endpoints accept an X-Trinitite-Idempotency-Key header. The platform deduplicates within a 24-hour window. Reusing the same key with the same body bytes returns the original response. Reusing the same key with different body bytes returns 409 idempotency_key_reused.
Best practice: generate one UUIDv4 per logical operation and include it on every retry of that operation.
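As a rough sketch of that practice, the helper below creates the key once, then reuses the same key and the same body bytes on each retry. `post` is a caller-supplied function that sends one request. Retrying only on `ConnectionError` and stopping after three attempts are assumptions made for illustration.

```python
import uuid
from typing import Any, Callable, Mapping


def send_idempotent(
    post: Callable[[Mapping[str, str], bytes], Any],
    body: bytes,
    max_attempts: int = 3,  # assumption: illustrative limit
) -> Any:
    """Send one logical operation, reusing a single idempotency key on retries."""
    # One UUIDv4 per logical operation, created before the first attempt.
    headers = {"X-Trinitite-Idempotency-Key": str(uuid.uuid4())}
    last_error: Exception = RuntimeError("max_attempts must be at least 1")
    for _ in range(max_attempts):
        try:
            # Same key and same body bytes every time, so a retry that reaches
            # the platform after a lost response gets the original response back.
            return post(headers, body)
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

Serializing the body once, before the loop, matters: re-encoding JSON on each attempt can change the bytes (for example key order or whitespace) and would trigger 409 idempotency_key_reused.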
Versioning
The API is versioned at the path prefix: /v1/.... We commit to:
- No breaking changes within a major version.
- New optional fields can appear at any time.
- New error codes can appear at any time (always 4xx or 5xx; never silently change a 2xx to a 4xx).
- Deprecations are announced with a minimum 6-month sunset window.
- Endpoints in Beta are not subject to the above and may change with notice.
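These commitments imply that clients should tolerate fields and error codes they have not seen before. One way to do that, sketched below, is to branch on the HTTP status class first and treat specific error codes as refinements, and to read only the fields the client needs. The function names and the returned labels are illustrative; `idempotency_key_reused` is the only error code this page names.

```python
from typing import Optional


def classify_response(status: int, error_code: Optional[str] = None) -> str:
    """Map a response to a coarse outcome that survives new error codes."""
    if 200 <= status < 300:
        return "ok"
    if status == 429:
        return "rate_limited"
    if status == 409 and error_code == "idempotency_key_reused":
        return "idempotency_conflict"
    # Unknown codes fall through to their status class instead of crashing.
    if 400 <= status < 500:
        return "client_error"
    if 500 <= status < 600:
        return "server_error"
    return "unexpected"


def pick_fields(payload: dict, wanted: tuple) -> dict:
    """Keep only the fields the client uses; new optional fields are ignored."""
    return {key: payload[key] for key in wanted if key in payload}
```

With this shape, a newly introduced 4xx code is handled as a generic client error until the client is updated to recognize it.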
Determinism guarantees
For any two requests with the same:
- Guardian (and adapter version),
- Policy hash,
- Input bytes,
- Determinism mode flag,
…the platform returns a bit-identical output. Replays beyond that envelope are classified per the replay verdict taxonomy.
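A client can check this guarantee for itself by fingerprinting the four inputs and comparing output bytes only when the fingerprints match. The sketch below is a client-side illustration, not the platform's replay mechanism. All function names, the string type of each parameter, and the three returned labels are assumptions, and the labels are not the replay verdict taxonomy.

```python
import hashlib


def envelope_fingerprint(
    guardian: str,
    adapter_version: str,
    policy_hash: str,
    input_bytes: bytes,
    determinism_mode: str,
) -> str:
    """Hash the inputs that define the determinism envelope."""
    digest = hashlib.sha256()
    parts = (
        guardian.encode("utf-8"),
        adapter_version.encode("utf-8"),
        policy_hash.encode("utf-8"),
        input_bytes,
        determinism_mode.encode("utf-8"),
    )
    for part in parts:
        # A length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


def compare_replay(original: dict, replay: dict) -> str:
    """Compare two records shaped like {"fingerprint": str, "output": bytes}."""
    if original["fingerprint"] != replay["fingerprint"]:
        return "outside_envelope"  # the guarantee does not apply
    return "identical" if original["output"] == replay["output"] else "mismatch"
```

A `mismatch` result under identical fingerprints would be the case worth reporting, since it contradicts the stated guarantee.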