MCP Gateway
One gateway. Every tool server. Every call governed.
Model Context Protocol (MCP) moved the AI risk surface from what a model says to what an agent does. A model that writes bad prose has a hallucination problem. An agent that calls stripe.create_refund, postgres.query, or aws.iam.create_role without a governed intercept has an execution problem. Trinitite's MCP Gateway is that intercept — with per-tool Guardians, composable governance pipelines, certification locks that prevent silent policy drift, and forensic receipts for every call.
This page is the architectural deep-dive. For the endpoint schemas, see API Reference → MCP Gateway.
Two Deployment Topologies
Centralized Gateway
All MCP traffic is routed through an external Trinitite gateway at the network layer. The MCP Client sends JSON-RPC to the gateway, which validates and forwards compliant calls to the MCP Server.
TRADEOFF: One additional network hop. Maximum centralization.
Client-Side Middleware
The Guardian is embedded directly within the MCP Client's execution flow. Tool calls are intercepted before they leave the application's memory space, so there is no network hop and no added network latency.
TRADEOFF: In-process. Deepest integration point.
The gateway can sit at the network edge as a centralized proxy that every MCP client in your organization connects to, or be embedded in-process inside the MCP Client as middleware. Both patterns provide identical governance guarantees; the difference is where the intercept lives. For most deployments, client-side middleware is recommended: no network hop, deepest integration, lowest latency.
Governance Pipelines — Four Modes
Every tool call runs through one of four pipelines:
- Passthrough — no evaluation. Telemetry only. Useful for untrusted tools you're observing before certifying.
- Rules — programmatic rule engine. Allow/deny lists, argument constraints, output-schema validation, sequence rules. Sub-5ms. Cacheable.
- SLM — a small-language-model Guardian evaluates the call in context deterministically. Catches semantic intent (the "DELETE disguised as a read query" case) that rules can't express cleanly.
- Chained — rules first on the hot path; ambiguous verdicts fall through to the SLM. You get rule latency when the rules are conclusive and SLM judgment when they're not.
A tool, a tool class, or a whole server can be pinned to any of the four modes. Changing a tool's mode is an auditable event with a required reason.
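As an illustration of the chained mode, here is a minimal sketch in Python. Everything in it is hypothetical: the `Verdict` type, the two example rules, the refund threshold, and `stub_slm` (a trivial stand-in for a real SLM Guardian) are assumptions made for the example, not Trinitite's API. It only shows the control flow: rules answer when they are conclusive, and anything they leave undecided falls through to the SLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Verdict:
    decision: str  # "allow" or "block"
    source: str    # which stage decided: "rules" or "slm"


# A rule returns "allow" or "block" when conclusive, or None when it has no opinion.
Rule = Callable[[str, dict], Optional[str]]


def deny_listed_tools(tool: str, args: dict) -> Optional[str]:
    # Hypothetical deny list: one tool is blocked outright.
    return "block" if tool == "aws.iam.create_role" else None


def refund_amount_cap(tool: str, args: dict) -> Optional[str]:
    # Hypothetical argument constraint: small refunds are conclusively fine,
    # larger ones are ambiguous and left for the SLM.
    if tool != "stripe.create_refund":
        return None
    return "allow" if args.get("amount", 0) <= 5_000 else None


def stub_slm(tool: str, args: dict) -> str:
    # Stand-in for the SLM Guardian; a real one would judge semantic intent.
    query = str(args.get("query", "")).upper()
    return "block" if "DELETE" in query else "allow"


def run_chained(tool: str, args: dict, rules: List[Rule],
                slm: Callable[[str, dict], str]) -> Verdict:
    # Hot path: the first conclusive rule wins.
    for rule in rules:
        decision = rule(tool, args)
        if decision is not None:
            return Verdict(decision, "rules")
    # No rule was conclusive, so the call falls through to the SLM.
    return Verdict(slm(tool, args), "slm")
```

With these stand-ins, a small refund is settled by the rules alone, while a `postgres.query` call that no rule covers is decided by the SLM stage.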
Autocorrection in Action
When the LLM generates a call with a syntactically or semantically invalid argument, the Guardian intercepts the payload before it reaches transport. A deterministic JSON Patch autocorrects the error, the validated request proceeds to the MCP Server, and no expensive LLM re-generation is needed. This is Semantic Rectification applied to tool calls.
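To make the JSON Patch step concrete, the sketch below applies an RFC 6902 style patch to a tool call. It is a deliberately small applier (object paths only, with `replace`, `add`, and `remove`), not Trinitite's rectification engine. The example call, its argument names, and the corrections are invented for illustration.

```python
import copy


def apply_patch(doc: dict, patch: list) -> dict:
    """Apply a minimal subset of JSON Patch to nested dicts; the input is not mutated."""
    result = copy.deepcopy(doc)
    for op in patch:
        # JSON Pointer: "/a/b" -> ["a", "b"]; unescape ~1 before ~0 (RFC 6901).
        keys = [k.replace("~1", "/").replace("~0", "~")
                for k in op["path"].split("/")[1:]]
        parent = result
        for key in keys[:-1]:
            parent = parent[key]
        leaf = keys[-1]
        if op["op"] == "replace":
            if leaf not in parent:
                raise KeyError(f"cannot replace missing path {op['path']}")
            parent[leaf] = op["value"]
        elif op["op"] == "add":
            parent[leaf] = op["value"]
        elif op["op"] == "remove":
            del parent[leaf]
        else:
            raise ValueError(f"unsupported op {op['op']!r}")
    return result


# Hypothetical invalid call: amount sent as a decimal string, currency upper-cased.
call = {
    "tool": "stripe.create_refund",
    "arguments": {"amount": "12.50", "currency": "USD"},
}

# Hypothetical deterministic correction emitted for that call.
patch = [
    {"op": "replace", "path": "/arguments/amount", "value": 1250},
    {"op": "replace", "path": "/arguments/currency", "value": "usd"},
]

corrected = apply_patch(call, patch)
```

Because the patch is data rather than a regenerated completion, the same invalid payload always yields the same corrected payload, and the patch itself can be logged alongside the call.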
Per-Tool Certification
A certified tool is pinned to four attributes at the moment of certification:
| Attribute | What it pins |
|---|---|
| Policy hash | The sha256 of the finalized policy document the Guardian was trained against |
| Guardian digest | adapter_hash plus the base-model digest the adapter sits on |
| Determinism mode | Whether the Guardian runs in batch-invariant seeded mode for this tool |
| OAuth scopes | The minimum scope set the call is allowed to exercise |
Any drift — a policy update, a new Guardian version, a silent determinism-mode flip — surfaces as a policy-drift error on the call. The gateway blocks until the certification is refreshed. A silent upgrade cannot change how a regulated tool call is evaluated, because the gateway refuses to act on an unsigned configuration.
This is what distinguishes "the AI decided" from "the AI decided, under a specific policy version, evaluated by a specific Guardian version, in a specific determinism mode, with a specific permission scope." The second is auditable. The first isn't.
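The certification lock can be pictured as an equality check over the four pinned attributes. The sketch below is an assumption-laden illustration: the `Certification` class, the `PolicyDriftError` name, and the error message format are hypothetical, and only the four attribute names come from the table above.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Certification:
    policy_hash: str
    guardian_digest: str
    determinism_mode: str
    oauth_scopes: frozenset


class PolicyDriftError(RuntimeError):
    """Raised when the live configuration no longer matches the certification."""


def drifted(pinned: Certification, live: Certification) -> List[str]:
    # Names of every pinned attribute whose live value differs.
    return [f.name for f in fields(Certification)
            if getattr(pinned, f.name) != getattr(live, f.name)]


def enforce_certification(pinned: Certification, live: Certification) -> None:
    changed = drifted(pinned, live)
    if changed:
        # Block the call rather than evaluate it under an uncertified setup.
        raise PolicyDriftError("policy-drift: " + ", ".join(changed))
```

In this sketch a changed determinism mode is reported by name, so the operator can see which attribute needs recertification rather than only that something moved.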
Per-Tool-Call Guardian Architecture
Each tool call gets its own Guardian: a specialist trained on the exact schema, semantics, and threat surface of that specific API operation. The architecture stacks in two layers. The Base Guardian provides universal safety infrastructure shared across all tool calls: batch-invariant determinism, the semantic rectification engine, the Glass Box Ledger, and LoRA hot-swap. The Tool Guardian is a specialist LoRA adapter that extends it with hyper-specific knowledge of one tool, such as what a valid Stripe refund looks like versus a malicious one.
For any tool call not covered by a pre-built Guardian, the Teleological Data Generator produces thousands of adversarial variations from the tool's OpenAPI spec or MCP definition. The Guardian is trained, versioned, and shipped — no manual training data authoring required.
Checks covered by per-tool Guardians include:
- Amount limits, authorization checks, fraud intent detection.
- SQL injection prevention, unbounded query limits, write-access enforcement.
- Destructive action gating, secret exposure detection, repo scope enforcement.
- Policy-compliant messaging, channel access control, PII in transit.
- IAM boundary enforcement, resource tagging, blast-radius containment.
- Schema-trained Guardians auto-generated from your OpenAPI spec or tool definition.
Confused-Deputy Defense
When an AI client delegates OAuth to an upstream through the gateway, Trinitite only mints authorization codes for upstreams the user has explicitly consented to. An untrusted client cannot trick the gateway into acting as its deputy for an upstream it was never authorized to reach.
Additional defenses layered on:
- SSRF guard — upstream URLs that resolve to loopback, RFC 1918, or link-local addresses are refused.
- Redirect containment — redirect chains that lead into private networks are blocked, not followed.
- RFC 8707 audience restriction — every upstream token is bound to the upstream it was issued for. A token scoped for one tool server cannot be replayed against another.
- Session pinning — every MCP session is pinned to the API key and NHI that created it. A leaked session ID cannot be picked up by a different principal.
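The SSRF guard in the list above amounts to an address-class check on whatever the upstream hostname resolves to. The following sketch uses Python's standard `ipaddress` module; the function names are hypothetical, and DNS resolution is assumed to have happened already so the example stays self-contained. Applying the same check to every hop of a redirect chain is one plausible way to get redirect containment.

```python
import ipaddress
from typing import Iterable

# The three RFC 1918 private IPv4 ranges.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_refused(ip: str) -> bool:
    """True if the address is loopback, link-local, or RFC 1918 private."""
    addr = ipaddress.ip_address(ip)
    if addr.is_loopback or addr.is_link_local:
        return True
    return addr.version == 4 and any(addr in net for net in RFC1918_NETWORKS)


def upstream_allowed(resolved_ips: Iterable[str]) -> bool:
    # Refuse the upstream if ANY resolved address is internal, so a hostname
    # with mixed public and private records cannot slip through.
    return not any(is_refused(ip) for ip in resolved_ips)
```

A production guard would also need to handle cases this sketch ignores, such as IPv4-mapped IPv6 addresses and DNS answers that change between the check and the fetch.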
Governance Advertisement
On the very first handshake with a Trinitite gateway, the client receives a signed advertisement describing which governance phases are active, which policy hash is in force, and whether deterministic receipts will be emitted for this connection.
{
  "governance_phases": ["pre_call", "post_result", "resource_read", "prompt_fetch", "server_message"],
  "policy_hash": "sha256:3f2a…",
  "pipeline_mode": "chained",
  "determinism_mode": "batch_invariant",
  "receipts_emitted": true,
  "minimum_trinitite_version": "26.4.0"
}
Your client code can refuse to operate against a Trinitite deployment whose governance posture doesn't match what you expect. Catastrophic misconfiguration becomes a connection-time failure, not a Tuesday-afternoon incident.
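A client-side posture check might look like the sketch below. It assumes the advertisement's signature has already been verified (the signing scheme is not described on this page) and that the payload has been parsed into a dict. The `posture_mismatches` function and the shape of `expected` are hypothetical; the field names come from the example advertisement.

```python
from typing import List


def posture_mismatches(advert: dict, expected: dict) -> List[str]:
    """Return human-readable mismatches; an empty list means the posture is acceptable."""
    problems = []

    # Every phase the client requires must be advertised as active.
    missing = set(expected.get("governance_phases", [])) - set(
        advert.get("governance_phases", []))
    if missing:
        problems.append(f"missing governance phases: {sorted(missing)}")

    # Scalar fields must match exactly when the client states an expectation.
    for key in ("policy_hash", "pipeline_mode", "determinism_mode", "receipts_emitted"):
        if key in expected and advert.get(key) != expected[key]:
            problems.append(
                f"{key}: expected {expected[key]!r}, got {advert.get(key)!r}")
    return problems


def require_posture(advert: dict, expected: dict) -> None:
    problems = posture_mismatches(advert, expected)
    if problems:
        # Fail at connection time instead of operating under the wrong posture.
        raise ConnectionError("governance posture mismatch: " + "; ".join(problems))
```

Calling `require_posture` right after the handshake turns a mismatched deployment into an immediate, explainable connection failure.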
Deterministic Receipts
When the governance kernel runs in deterministic mode, every verdict carries a receipt:
{
  "receipt_id": "rcp_…",
  "tool": "stripe.create_refund",
  "model_digest": "sha256:…",
  "adapter_hash": "sha256:…",
  "seed": 2026050201,
  "policy_hash": "sha256:…",
  "input_digest": "sha256:…",
  "output_digest": "sha256:…",
  "verdict": "blocked",
  "signed_envelope": "base64:…"
}
A regulator, a disputing party, or an auditor can re-run the exact same inputs through the exact same Guardian build and reproduce the verdict bit-for-bit. "The AI decided" stops being a black box — it becomes a reproducible, signable claim.
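One way to picture the reproducibility check is as a digest comparison. The sketch below assumes digests are SHA-256 over a canonical JSON encoding (sorted keys, compact separators); that canonicalization, and the `digest` and `replay_reproduces` helpers, are assumptions for illustration, since the actual envelope format is documented under Glass Box Ledger. Verification of `signed_envelope` is out of scope here.

```python
import hashlib
import json


def digest(value) -> str:
    # Assumed canonical form: sorted keys, compact separators, UTF-8 bytes.
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay_reproduces(receipt: dict, inputs, rerun_output, rerun_verdict: str) -> bool:
    """True if a re-run matches the receipt on inputs, output, and verdict."""
    return (
        receipt["input_digest"] == digest(inputs)
        and receipt["output_digest"] == digest(rerun_output)
        and receipt["verdict"] == rerun_verdict
    )
```

Canonicalizing before hashing matters: two JSON documents with the same content but different key order should produce the same digest, otherwise an honest replay could look like a mismatch.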
See Glass Box Ledger for the full receipt envelope and replay story.
What Gets Governed
Every interaction with an MCP server runs through the same evaluation surface:
| Interaction | What Trinitite governs |
|---|---|
| tools/call | Schema, semantic intent, argument injection, scope enforcement, attack-signature match |
| tools/list | Tool visibility per NHI tier — a T0 agent sees a reduced catalog |
| resources/read | Content returned from tool resources — PII / secret / prompt-injection detection |
| prompts/get | Prompt templates pulled from the server — same evaluation surface as tool outputs |
| sampling/createMessage | Server-initiated sampling — governed before reaching the client's model |
| elicitation/create | Server asking the client for structured input — evaluated for coercion / prompt-injection |
| roots/list | Filesystem / resource root requests — evaluated against tool's certified scope |
Server-initiated back-channel messages are a common blind spot. Trinitite evaluates them with the same Guardian surface as forward tool calls.
What You Get
| Capability | DIY MCP wiring | Trinitite MCP Gateway |
|---|---|---|
| Endpoint topology | Per-client configuration per server | Single gateway, unified catalog |
| Policy enforcement | Custom middleware or nothing | Composable pipelines: passthrough / rules / SLM / chained |
| Per-tool judgment | One global filter | Specialist Guardian per tool operation |
| Silent drift | Not detectable | Certification lock → policy-drift errors |
| OAuth safety | Depends on client | Confused-deputy-safe per-upstream consent grants |
| Upstream auth | Bearer passthrough | SSRF-guarded fetches, RFC 8707 audience-bound tokens |
| Audit trail | Tool-server logs | Deterministic receipts, hash-chained to ledger |
| Replay | Not possible | Bit-exact replay per certified determinism mode |
MCP Operations Surface
The gateway is not just an intercept — it's a fleet operations surface. Once tool calls flow through Trinitite, the same telemetry powers analytics rollups, alerting, immutable config audit, and on-demand replay.
Analytics rollups
GET /v1/mcp/analytics/tools returns time-bucketed metrics per tool, per server, and per NHI: pass_rate, correct_rate, block_rate, avg_latency_ms, p99_latency_ms. Charts in your operator UI roll up from the same data — see the Analytics endpoint.
Alert rules + events
Encode operational policies as alert rules, for example block_rate > 5% over 10 minutes, p99_latency > 600ms for the last hour, or unique_tools > 50 per session. Each time a rule fires, it emits an mcp.alert.fired event into the security stream and writes a Glass Box receipt. See Cookbook → SIEM export for downstream wiring.
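To show what a rule such as "block_rate > 5% over 10 minutes" computes, here is a small sliding-window sketch. The `BlockRateAlert` class is hypothetical and says nothing about how Trinitite stores or evaluates rules; it only demonstrates the arithmetic of a windowed rate with a strict threshold.

```python
from collections import deque


class BlockRateAlert:
    """Fires when the share of blocked calls in the window exceeds the threshold."""

    def __init__(self, threshold: float = 0.05, window_s: float = 600.0):
        self.threshold = threshold
        self.window_s = window_s
        self.events = deque()  # (timestamp_seconds, was_blocked), oldest first

    def record(self, ts: float, blocked: bool) -> bool:
        """Record one verdict and return True if the rule fires at time ts."""
        self.events.append((ts, blocked))
        # Evict events that have aged out of the window.
        while self.events and self.events[0][0] <= ts - self.window_s:
            self.events.popleft()
        blocked_count = sum(1 for _, was_blocked in self.events if was_blocked)
        # Strictly greater than: exactly 5% does not fire.
        return blocked_count / len(self.events) > self.threshold
```

Note the strict comparison: with 1 blocked call out of 20 the rate is exactly 5% and the rule stays quiet, while a second block pushes it over the threshold.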
Immutable config audit
Every change to an MCP server registration, OAuth client, or alert rule writes a config-audit event. Diffs are first-class — you can answer "who certified this tool against this Guardian, when, and against which policy hash?" without trawling chat logs.
Session replay
Every MCP call writes a receipt envelope at each phase, and the session replay viewer reconstructs the timeline from the Glass Box Ledger. GET /v1/mcp/sessions/:id/replay returns that timeline for any session within retention. Replays are classified per the replay verdict taxonomy: bit_exact when the same Guardian and policy hash are still loaded, semantic_only after a major upgrade, divergent when something changed in a way that breaks the prior verdict, and original_missing when the original adapter is no longer available.
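The taxonomy can be read as a small decision procedure. The sketch below is one possible interpretation, not the gateway's actual classifier: the `classify_replay` function, the use of `None` to signal an unavailable adapter, and the exact precedence of the checks are all assumptions. The field names are borrowed from the receipt example.

```python
from typing import Optional

BUILD_KEYS = ("adapter_hash", "model_digest", "policy_hash")


def classify_replay(original: dict, replayed: Optional[dict]) -> str:
    """Classify a replay against the original receipt (illustrative reading only)."""
    if replayed is None:
        # Assumed convention: None means the original adapter could not be loaded.
        return "original_missing"
    if replayed["verdict"] != original["verdict"]:
        # The replay no longer reaches the verdict that was recorded.
        return "divergent"
    same_build = all(original[k] == replayed[k] for k in BUILD_KEYS)
    if same_build and original["output_digest"] == replayed["output_digest"]:
        return "bit_exact"
    # Same verdict, but produced by a different build or with different bytes.
    return "semantic_only"
```

Under this reading, an upgraded adapter that still reaches the same verdict is reported as `semantic_only`, which keeps the distinction between "same answer" and "same bytes" visible to an auditor.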
Next Steps
→ Glass Box Ledger — the receipt envelope and replay machinery.
→ LLM Proxy — the sister surface governing chat completions and hosted MCP.
→ Guardian Training — how per-tool Guardians get trained from OpenAPI specs.
→ Cookbook: Govern an MCP tool call — the implementation steps.