Skip to main content

Guardian Cards

One card per Guardian. EU AI Act-style transparency.

Every Guardian shipped with the platform has its own model card — what it was trained from, its determinism envelope, its latency budget, what it accepts and returns, and the known limits. Cards exist for both pre-built Guardians (below) and tenant-trained Guardians (in the dashboard).

PDF cards are also downloadable and signed — they're the artifact your AI committee files.


pii-redactor

v1.4.0 · Output
Trained from
Trinitite PII rubric (2026-Q1) + 12,400 adversarial spans
Base policy
trinitite/pii-baseline-v3
Determinism
fixed_kv_tile=256
Latency p99
142ms
What it accepts
  • Any assistant-role string output from the upstream model
  • Optional structured spans (chat completion, JSON tool result)
What it returns
  • passed — when no PII spans are detected
  • corrected — RFC 6902 patches replacing each PII span with a typed token
  • blocked — only if an unredactable secret pattern (live API key, JWT) is present
Limits & known gaps
  • Languages: English-first; Spanish, French, German verified to 90% recall
  • PII types: SSN, credit/debit card, US/EU/UK address, DOB, email, phone, IP, ICD-10
  • Not in scope: Government-classified markings; medical-device identifiers (separate Guardian)
Original tokens are recoverable only via the Glass Box Ledger receipt with the pii:reveal scope (typically held by your DPO, not the application).
Closed-loop training is opt-in — see the Trust Center for the data flow.

sql-safe

v2.1.0 · SQL
Trained from
OWASP SQLi corpus + 30k tenant-anonymized destructive queries
Base policy
trinitite/sql-safe-baseline-v2
Determinism
fixed_kv_tile=256
Latency p99
118ms
What it accepts
  • Raw SQL strings emitted by an AI assistant or tool call
What it returns
  • passed — for read-only queries (SELECT, EXPLAIN, SHOW)
  • corrected — collapses safe-equivalent destructive queries (DELETE / TRUNCATE) into a SELECT preview
  • blocked — for unrecoverable destructive intent (DROP TABLE, REVOKE, CREATE USER)
Limits & known gaps
  • Dialects: PostgreSQL, MySQL, MSSQL, BigQuery, Snowflake
  • Not in scope: Stored procedures, dynamic SQL constructed at runtime
Writeback queries (INSERT / UPDATE) are passed by default — use the sql-safe-strict variant if you want every mutation manually approved.

secret-scrubber

v1.2.0 · Output
Trained from
GitHub Secret Scanning patterns + 4,200 tenant-vault leak fixtures
Base policy
trinitite/secret-scrubber-baseline-v1
Determinism
fixed_kv_tile=256
Latency p99
96ms
What it accepts
  • Any string output (chat, code completion, tool result)
What it returns
  • passed — when no high-confidence secret patterns are detected
  • blocked — for high-confidence live secret matches (sk-..., AKIA..., GitHub PAT, JWT, Slack token)
Limits & known gaps
  • Confidence threshold: Tunable per-tenant; default = 0.92 (zero-FP target)
  • Not in scope: Custom internal credential formats — train a tenant-specific Guardian via /v1/policies
There is no safe correction for a live secret — the Guardian only blocks. Do not configure secret-scrubber to "correct".

stripe-refund-guardian

v1.0.0 · MCP / Tool
Trained from
Stripe API schema (refund namespace) + 3,100 adversarial argument variations
Base policy
trinitite/mcp-tool-baseline-v2
Determinism
fixed_kv_tile=256
Latency p99
188ms
What it accepts
  • stripe.create_refund tool call payloads
What it returns
  • passed — for amounts within the per-tenant safe range
  • corrected — patches malformed amounts (e.g. "N/A") to median historical value
  • blocked — for amounts > tenant-set ceiling, or cross-tenant charge IDs
Limits & known gaps
  • Tenant config required: Per-tenant safe range — default $0.01 to $10,000.00
  • Not in scope: stripe.create_payout (separate Guardian)
This is one of the bundled MCP per-tool Guardians. To add a new tool, register the schema via POST /v1/mcp/tools and the Teleological Data Generator handles the rest.

Tenant-trained Guardians

Cards for Guardians you've trained in your tenant (e.g. acme-pii-and-tone in the multi-tenant cookbook) live in the dashboard, generated automatically from the policy and training pipeline.