
SLA & Limits

What we promise, what we measure, what you can rely on for capacity planning.

For framework-by-framework compliance, see Compliance Matrix. For company-side trust controls, see Trust Center.


Availability

| Tier | Target | Measured against | Credit |
|---|---|---|---|
| Sandbox | best-effort | n/a | n/a |
| Standard | 99.9 % / month | Control-plane availability | 10 % service credit per 0.1 % below |
| Enterprise | 99.95 % / month | Control-plane + Guardian inference | 25 % service credit per 0.05 % below; 100 % at < 99.5 % |
| Sovereign | per contract | Customer-defined SLOs | per contract |

Scheduled maintenance announced at least 7 days in advance is excluded from the availability calculation.
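
For capacity planning it can help to translate an availability target into an allowed-downtime figure. The sketch below is only arithmetic: it assumes a 30-day month (the text above does not define the month length), and the function name `downtime_budget_minutes` is illustrative rather than part of any Trinitite tooling.

```shell
# Allowed downtime, in minutes, for an availability target given in percent.
# Assumption: a 30-day month (43,200 minutes); the contractual month may differ.
downtime_budget_minutes() {
  awk -v target="$1" 'BEGIN { printf "%.1f\n", (100 - target) / 100 * 30 * 24 * 60 }'
}

downtime_budget_minutes 99.9    # Standard target   -> prints 43.2
downtime_budget_minutes 99.95   # Enterprise target -> prints 21.6
```

Under that assumption, the Standard target leaves roughly 43 minutes of unscheduled downtime per month and the Enterprise target roughly 22 minutes.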


Latency SLOs

| Surface | Target (p99) | Notes |
|---|---|---|
| Guardian decision (cache hit) | < 50 ms | Hot path for repeated outputs. |
| Guardian decision (cache miss) | < 400 ms | Default Guardian inference path. |
| Proxy (Guardian + upstream LLM) | upstream + < 400 ms | The Guardian adds at most 400 ms to your existing call. |
| MCP tools/call end-to-end | upstream + < 250 ms | Includes pre and post phases. |
| /v1/logs query (last 7 days, ≤ 100 results) | < 800 ms | Beyond 7 days, see retention windows. |
| POST /v1/training/retrain | 30 – 90 min | Async; poll GET /v1/training/jobs/:id. |

All latency SLOs are measured at the edge; your own network latency to the edge adds to these figures.
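
Because the targets are measured at the edge, a client-side measurement will read higher than the SLO by roughly your network round trip. One way to see the client-observed figure is curl's `--write-out` variable `time_total`; the helper name `observed_ms` below is hypothetical, and this is a rough sketch rather than a benchmarking method.

```shell
# Print the client-observed total time of one request, in whole milliseconds.
# curl reports time_total in fractional seconds; extra arguments are passed through.
observed_ms() {
  curl -sS -o /dev/null -w '%{time_total}\n' "$@" |
    awk '{ printf "%d\n", $1 * 1000 + 0.5 }'
}

# Usage (not executed here):
#   observed_ms "$TRINITITE_BASE/v1/logs?limit=100" \
#     -H "Authorization: Bearer $TRINITITE_API_KEY"
```

A single sample says little about p99; collecting many samples and subtracting a measured round-trip time gives a fairer comparison against the table.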


Rate limits

| Surface | Sandbox | Standard | Enterprise |
|---|---|---|---|
| POST /v1/chat | 10 RPS | 100 RPS | per contract |
| POST /v1/proxy/* | 10 RPS | 100 RPS | per contract |
| POST /v1/mcp/sessions/* | 5 RPS | 50 RPS | per contract |
| POST /v1/cli/evaluate | 5 RPS | 50 RPS | per contract |
| GET /v1/logs | 1 RPS | 10 RPS | per contract |
| POST /v1/training/retrain | 1 / hour | 10 / hour | per contract |
| Other reads | 10 RPS | 50 RPS | per contract |

When a limit is hit, the response is HTTP 429 with a Retry-After header (in seconds). Always honor Retry-After; clients that ignore it and keep sending requests are throttled at the edge for 10 minutes.
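
A minimal sketch of a client that honors Retry-After, in the same curl style as the other examples on this page. The function names, the 5-attempt cap, and the 1-second fallback when the header is missing are all assumptions of this sketch, not documented platform behavior.

```shell
# Read the Retry-After value (seconds) from a curl header dump file.
# Assumption: fall back to 1 second if the header is absent.
retry_after_seconds() {
  awk 'tolower($1) == "retry-after:" { sub(/\r$/, "", $2); print $2; found = 1; exit }
       END { if (!found) print 1 }' "$1"
}

# GET a URL; on HTTP 429, sleep for Retry-After and try again (at most 5 attempts).
# Prints the response body; returns 0 only for a 2xx/3xx status.
get_with_retry() {
  url=$1
  headers=$(mktemp)
  body=$(mktemp)
  attempt=1
  while [ "$attempt" -le 5 ]; do
    status=$(curl -sS -D "$headers" -o "$body" -w '%{http_code}' \
      -H "Authorization: Bearer $TRINITITE_API_KEY" "$url") || status=000
    if [ "$status" != "429" ]; then
      cat "$body"
      rm -f "$headers" "$body"
      if [ "$status" -ge 200 ] && [ "$status" -lt 400 ]; then return 0; fi
      return 1
    fi
    sleep "$(retry_after_seconds "$headers")"
    attempt=$((attempt + 1))
  done
  rm -f "$headers" "$body"
  return 1
}

# Usage (not executed here):
#   get_with_retry "$TRINITITE_BASE/v1/logs?limit=100"
```

The key point is that the wait time comes from the server's header rather than from a fixed client-side backoff.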


Pagination

All list endpoints use cursor-based pagination:

curl "$TRINITITE_BASE/v1/logs?limit=100" \
-H "Authorization: Bearer $TRINITITE_API_KEY"
# → { "data": [...], "next_cursor": "eyJ0..." }

curl "$TRINITITE_BASE/v1/logs?limit=100&cursor=eyJ0..." \
-H "Authorization: Bearer $TRINITITE_API_KEY"

limit defaults to 25 and is capped at 100. next_cursor is null on the last page.
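
The two requests above generalize to a loop that follows next_cursor until it is null. This sketch assumes `jq` is installed for JSON parsing and that the response has the `data` / `next_cursor` shape shown above; the function name `fetch_all_logs` is hypothetical.

```shell
# Print every log entry as one JSON object per line, following next_cursor
# until it is null. Assumption: jq is available on PATH.
fetch_all_logs() {
  cursor=""
  while :; do
    # -G turns the --data-urlencode pairs into a URL-encoded query string.
    set -- --data-urlencode "limit=100"
    if [ -n "$cursor" ]; then
      set -- "$@" --data-urlencode "cursor=$cursor"
    fi
    page=$(curl -sS -G "$TRINITITE_BASE/v1/logs" \
      -H "Authorization: Bearer $TRINITITE_API_KEY" "$@") || return 1
    printf '%s\n' "$page" | jq -c '.data[]'
    # "// empty" maps a null next_cursor to an empty string, which ends the loop.
    cursor=$(printf '%s\n' "$page" | jq -r '.next_cursor // empty')
    [ -n "$cursor" ] || break
  done
}

# Usage (not executed here):
#   fetch_all_logs > logs.ndjson
```

On Sandbox and Standard tiers, GET /v1/logs is rate limited, so a long export may need a pause between pages or Retry-After handling on 429.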


Idempotency

Mutating endpoints accept an X-Trinitite-Idempotency-Key header. The platform deduplicates within a 24-hour window. Reusing the same key with the same body bytes returns the original response. Reusing the same key with different body bytes returns 409 idempotency_key_reused.

Best practice: generate one UUIDv4 per logical operation and include it on every retry of that operation.
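
As an illustration of that practice, the sketch below generates the key once, outside the retry loop, so every attempt carries the same header and the same body bytes. The function name, the 3-attempt cap, and the sleep schedule are assumptions; `uuidgen` typically emits a random (version 4) UUID, but that depends on your system.

```shell
# POST a JSON payload with a single idempotency key shared by all retries.
# $1 = URL, $2 = request body (sent byte for byte via --data-binary).
post_idempotent() {
  url=$1
  payload=$2
  key=$(uuidgen | tr 'A-Z' 'a-z')   # generated once per logical operation
  attempt=1
  while [ "$attempt" -le 3 ]; do
    # --fail makes curl exit non-zero on HTTP errors so the loop can retry.
    if curl -sS --fail -X POST "$url" \
        -H "Authorization: Bearer $TRINITITE_API_KEY" \
        -H "Content-Type: application/json" \
        -H "X-Trinitite-Idempotency-Key: $key" \
        --data-binary "$payload"; then
      return 0
    fi
    sleep "$attempt"
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage (not executed here; '{}' is a placeholder body, not a documented schema):
#   post_idempotent "$TRINITITE_BASE/v1/training/retrain" '{}'
```

A fuller client would retry only on network errors, 429, and 5xx responses; this sketch retries on any failure to keep the key-reuse pattern easy to see.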


Versioning

The API is versioned at the path prefix: /v1/.... We commit to:

  • No breaking changes within a major version.
  • New optional fields can appear at any time.
  • New error codes can appear at any time. They are always 4xx or 5xx; we never silently change a 2xx response to a 4xx.
  • Deprecations are announced with a minimum 6-month sunset window.
  • Endpoints in Beta are not subject to the above and may change with notice.
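
Since new error codes can appear within a major version, one defensive pattern on the client side is to branch on the status class rather than on an exhaustive list of codes. The helper below is a hypothetical sketch of that idea, not part of any Trinitite SDK.

```shell
# Classify an HTTP status code by its class so unknown codes still get handled.
status_class() {
  case "$1" in
    2??) echo success ;;
    4??) echo client_error ;;
    5??) echo server_error ;;
    *)   echo other ;;
  esac
}

status_class 200   # prints success
status_class 429   # prints client_error
status_class 503   # prints server_error
```

The same reasoning applies to response bodies: reading only the fields you need keeps a client working when new optional fields appear.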

Determinism guarantees

For any two requests with the same:

  • Guardian (and adapter version),
  • Policy hash,
  • Input bytes,
  • Determinism mode flag,

…the platform returns a bit-identical output. Replays beyond that envelope are classified per the replay verdict taxonomy.
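
One way to spot-check this from the client side is to send the same request bytes twice and compare the responses byte for byte. The sketch below assumes the whole response body is the "output" being compared (if responses carry per-request metadata such as IDs or timestamps, you would compare only the relevant field), and it leaves the endpoint, request file, and determinism mode flag to the caller. The function name is hypothetical.

```shell
# Send the same request twice and compare the response bodies byte for byte.
# $1 = URL, $2 = file holding the exact request bytes.
# Returns 0 if identical, 1 if different, 2 if a request failed.
same_output_twice() {
  url=$1
  request_file=$2
  first=$(mktemp)
  second=$(mktemp)
  for out in "$first" "$second"; do
    # No idempotency key is sent: a replayed key would return the stored
    # original response and make the comparison meaningless.
    curl -sS -X POST "$url" \
      -H "Authorization: Bearer $TRINITITE_API_KEY" \
      -H "Content-Type: application/json" \
      --data-binary "@$request_file" -o "$out" || { rm -f "$first" "$second"; return 2; }
  done
  if cmp -s "$first" "$second"; then result=0; else result=1; fi
  rm -f "$first" "$second"
  return "$result"
}
```

This only checks the input-bytes part of the envelope; the Guardian and adapter version and the policy hash also have to be unchanged between the two calls for the guarantee to apply.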