
Every server is code in your context

By Sachin Mehta • 29 May 2026 • 10 min read

Tags: MCP • AI Agents • Prompt Injection • Operational Resilience • DORA

Most teams adopt the Model Context Protocol as plumbing: the open standard, shipped by Anthropic in late 2024, that lets a model reach external tools without bespoke glue for every pairing. The wiring is the easy part. What breaks in week three is where the trust boundary was drawn, and that is a resilience question, not an integration one.

The integration framing hides a resilience problem

Wire MCP up and the agent can read a database or file a ticket. Move on. The connection is rarely what fails. What fails is that the trust boundary was placed where the integration diagram suggested, rather than where execution actually happens. Get that wrong and the failure is not a clean error. It is a model acting on instructions nobody in the firm wrote, through a tool nobody scoped as an attack surface.

The correction starts with one fact the integration framing obscures: the model never calls the tool. It emits text. Everything between that text and a real side effect is something the host does on the model's behalf. Once that lands, the resilience questions ask themselves.

The model never calls the tool. The host does. That single fact relocates the entire security boundary.

The lifecycle, so the failure points have somewhere to attach

Six steps, every message carried as JSON-RPC 2.0 under the hood. The detail matters only because each step is a place where something holds or gives way.

1. Handshake. Host and server negotiate capabilities through an initialize exchange before any work happens.
2. Discovery. The client calls tools/list; the server returns each tool with a name, a description, and a JSON schema.
3. Inject. The host places those tool definitions into the model's context as the available tool set.
4. Decide. The model emits a tool_use block, a tool name plus arguments. It has not called anything; it has produced structured text.
5. Route and validate. The host reads that intent, validates it, and the client sends tools/call to the server.
6. Return. The server runs the code and the result returns to context as tool_result. The loop repeats until the model stops calling tools. That loop is the agent.
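To make the message shapes concrete, here is a minimal sketch of the three client-to-server requests in that lifecycle, built as plain JSON-RPC 2.0 envelopes. The method names (initialize, tools/list, tools/call) and the inputSchema field follow the MCP specification as I understand it; the file_ticket tool, its arguments, the client name, and the protocol version string are illustrative assumptions. No transport is involved, so nothing is actually sent.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request ids; any unique value works


def request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope as a plain dict."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Handshake. Version string and client name are assumptions.
initialize = request("initialize", {
    "protocolVersion": "2025-03-26",
    "capabilities": {},
    "clientInfo": {"name": "example-host", "version": "0.1.0"},
})

# Discovery. No params are needed for a first page of tools.
list_tools = request("tools/list")

# Shape of one entry in the server's reply (hypothetical tool).
tool_definition = {
    "name": "file_ticket",
    "description": "Create a ticket in the tracker.",
    "inputSchema": {
        "type": "object",
        "properties": {"title": {"type": "string"}},
        "required": ["title"],
    },
}

# Route and validate. The host, not the model, sends this call.
call_tool = request("tools/call", {
    "name": "file_ticket",
    "arguments": {"title": "Rotate API keys"},
})

if __name__ == "__main__":
    for msg in (initialize, list_tools, call_tool):
        print(json.dumps(msg))
```

Every one of these envelopes is constructed by host-side code. The model's only contribution is the tool name and arguments that end up inside the tools/call params.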

Step four is the one engineers misread. The model does not run anything; it produces a request to run something. This step is probabilistic, not deterministic. The model can pick the wrong tool, invent one that does not exist, or emit malformed arguments, which is precisely why the host has to validate every call before it executes. Treat the model's tool intent as a request to be checked, not a command to be obeyed.
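A minimal sketch of that check, assuming the host keeps a registry of discovered tools keyed by name. The function and exception names are hypothetical, and the validation covers only a small subset of JSON Schema (required keys, top-level types, unknown keys); a production host would more likely use a full schema validator. Rejecting unknown arguments is a deliberately stricter choice than JSON Schema's default.

```python
TYPE_MAP = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list,
}


class ToolCallRejected(Exception):
    """Raised when model-emitted intent fails a host-side check."""


def validate_tool_call(intent, registry):
    """Check a model-emitted tool call before anything executes.

    `intent` is {"name": str, "input": dict}; `registry` maps tool
    names to the JSON schema the server advertised for that tool.
    """
    name = intent.get("name")
    if name not in registry:
        # Covers both the wrong tool and a tool the model invented.
        raise ToolCallRejected(f"unknown tool: {name!r}")
    args = intent.get("input")
    if not isinstance(args, dict):
        raise ToolCallRejected("arguments must be an object")
    schema = registry[name]
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            raise ToolCallRejected(f"missing required argument: {key}")
    for key, value in args.items():
        if key not in props:
            raise ToolCallRejected(f"unexpected argument: {key}")
        expected = TYPE_MAP.get(props[key].get("type"))
        if expected is None:
            continue  # type not covered by this sketch
        # bool is a subclass of int in Python, so exclude it for other types.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if not isinstance(value, expected) or is_stray_bool:
            raise ToolCallRejected(f"wrong type for argument: {key}")
    return name, args
```

Only a call that returns from validate_tool_call would go on to tools/call. Everything else is returned to the model as an error, which keeps the probabilistic step from ever touching a side effect directly.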

The host is the integrity boundary

The model only ever sees text and emits text. It reads tool descriptions, it writes tool-call intents. It never opens a socket, never holds a credential, never executes anything. The host does all of that on the model's behalf.

Internalise this and three consequences follow, and all three are resilience properties, not features. The model's knowledge of a tool is only as good as the description and schema the server sent. Anything a server puts in a description or a result lands in the model's context as instruction-shaped text. And the host, not the model, is where integrity is enforced or lost. Most production incidents trace back to teams that put the trust boundary inside the model, because they believed the model was doing the calling. It was not. The boundary is the host, and a boundary you have misplaced is a boundary you are not defending.

Three failure modes that surface under adversarial pressure

A resilient design names its failure modes before they arrive. MCP has three specific to the protocol, and each stays silent until production load or an adversary finds it.

Failure mode 1

The lethal trifecta

A tool description and a tool result are model-facing text. A compromised or hostile server can place instructions in either, and the model reads them as part of its context. This is prompt injection with a supply-chain shape: every server you connect is code running inside your model's trust boundary.

The acute form. Once a single agent holds access to untrusted content, access to your private data, and a path to send data out, a malicious description or result can chain those three into exfiltration without a single line of your own code being wrong. The defence is architecture, not vigilance: never put untrusted-content access in the same agent that also holds private data and an outbound channel. Separate the three and the trifecta cannot complete.

Alongside separation: pin and allowlist the servers you connect, and verify provenance before you trust a server's output.
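One way to make both rules checkable at configuration time is sketched below. It assumes each connected server has been labelled by hand with the capabilities it grants; the three labels, the ServerGrant type, and check_agent are hypothetical names, not part of MCP, and the labelling itself remains a human judgment.

```python
from dataclasses import dataclass

UNTRUSTED_CONTENT = "untrusted_content"
PRIVATE_DATA = "private_data"
OUTBOUND_CHANNEL = "outbound_channel"
TRIFECTA = frozenset({UNTRUSTED_CONTENT, PRIVATE_DATA, OUTBOUND_CHANNEL})


@dataclass(frozen=True)
class ServerGrant:
    name: str
    capabilities: frozenset  # subset of the three labels above


def check_agent(grants, allowlist):
    """Return policy violations for the set of servers one agent can reach."""
    problems = []
    held = set()
    for grant in grants:
        if grant.name not in allowlist:
            problems.append(f"server not allowlisted: {grant.name}")
        held |= grant.capabilities
    # The trifecta completes only when all three are held by the same agent.
    if TRIFECTA <= held:
        problems.append(
            "agent holds untrusted content, private data, and an outbound channel"
        )
    return problems
```

Run as a startup or CI check, a non-empty result blocks the agent from being deployed, so the separation is enforced by configuration rather than by someone remembering it.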

Failure mode 2

The transport boundary

MCP defines two transports. stdio launches the server as a local subprocess over standard input and output: fast, simple, local only, and it dies with the process that spawned it. Streamable HTTP makes the server a remote endpoint over the network, and it superseded the original HTTP-plus-SSE transport in the 2025-03-26 spec revision.

The trap is concrete, and I hit it the unglamorous way: wiring a local stdio server to a model running off-machine, then watching it fail with no useful error, because there is no subprocess on the remote side to attach to. The moment the model runtime and the tool are not on the same machine, you need an HTTP endpoint, and remote endpoints pull in authorisation (the spec uses OAuth 2.1). Teams prototype on stdio, then hit the wall the day they deploy behind a remote model. Design for the transport you will deploy on, not the one that demos easiest.
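A sketch of a pre-deployment check that would have caught that failure before it became a silent one. The ServerConfig shape and deployment_problems are hypothetical, client_is_remote means the process that would spawn the server does not run on the machine where the server command exists, and the https requirement is my own assumption about sensible defaults rather than something the protocol mandates in this form.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class ServerConfig:
    name: str
    transport: str                 # "stdio" or "http"
    command: Optional[str] = None  # launch command, stdio only
    url: Optional[str] = None      # endpoint, http only
    has_auth: bool = False         # whether authorisation is configured


def deployment_problems(servers, client_is_remote):
    """List reasons a server set would not work in the target deployment."""
    problems = []
    for s in servers:
        if s.transport == "stdio":
            if client_is_remote:
                problems.append(
                    f"{s.name}: stdio server cannot be reached by a remote client"
                )
            if not s.command:
                problems.append(f"{s.name}: stdio server has no launch command")
        elif s.transport == "http":
            if urlparse(s.url or "").scheme != "https":
                problems.append(f"{s.name}: endpoint must be an https URL")
            if not s.has_auth:
                problems.append(
                    f"{s.name}: remote endpoint has no authorisation configured"
                )
        else:
            problems.append(f"{s.name}: unknown transport {s.transport!r}")
    return problems
```

Running the same server list with client_is_remote set to the production value, not the laptop value, is the point: the prototype configuration should fail this check on day one rather than on deployment day.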

Failure mode 3

Schema drift

A server changes a tool's arguments and ships it. The model's understanding is now stale, the calls it constructs are subtly wrong, and nothing throws. You get intermittent failures that look like model flakiness but are a contract mismatch between server and context. This is the failure that survives every test that passed yesterday. Version your servers and pin them like any other dependency, because that is what they are.
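Pinning can be made mechanical by fingerprinting what the server advertises and comparing it with what you reviewed. The sketch below assumes tool definitions shaped like a tools/list result (name, description, inputSchema); fingerprint and detect_drift are hypothetical helpers, and where the pinned digests are stored is left open.

```python
import hashlib
import json


def fingerprint(tool):
    """Stable SHA-256 digest of a tool's model-facing contract."""
    canonical = json.dumps(
        {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        },
        sort_keys=True,           # key order must not affect the digest
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_drift(pinned, discovered):
    """Compare discovered tools against pinned digests.

    `pinned` maps tool name to a digest recorded at review time.
    Returns {tool name: "added" | "changed" | "removed"}.
    """
    current = {tool["name"]: fingerprint(tool) for tool in discovered}
    report = {}
    for name, digest in current.items():
        if name not in pinned:
            report[name] = "added"
        elif pinned[name] != digest:
            report[name] = "changed"
    for name in pinned:
        if name not in current:
            report[name] = "removed"
    return report
```

Because the description is part of the digest, the same check also flags a server that quietly rewrites its model-facing text, which is the injection path described under the first failure mode. A non-empty report at connection time is a reason to refuse the server until someone has reviewed the change.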

The context tax

One more property that degrades quietly. Every connected tool's schema sits in the model's context on every turn, so the cost scales with each server you add. Wire up five servers with forty tools between them and you pay that token bill continuously, and the model gets worse at choosing the right tool as the menu grows. Lean tool sets are not housekeeping. They are a reliability control: fewer tools means cheaper turns and sharper selection. Filter the connected set to what the task needs.
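A rough sketch of measuring and trimming that cost. The four-characters-per-token figure is a crude assumption standing in for a real tokenizer, and the function names and the per-task set of needed tools are hypothetical; the point is only that the filter runs in the host before tool definitions reach the context.

```python
import json


def estimate_tokens(tool):
    """Very rough token estimate: about four characters per token.

    This is an assumption, not a tokenizer. Use the model provider's
    token counting for real budgeting.
    """
    return len(json.dumps(tool, separators=(",", ":"))) // 4


def context_cost(tools):
    """Estimated tokens the tool set adds to every turn."""
    return sum(estimate_tokens(tool) for tool in tools)


def filter_tools(tools, needed):
    """Keep only the tools a given task needs, preserving order."""
    return [tool for tool in tools if tool["name"] in needed]


if __name__ == "__main__":
    # Forty hypothetical tools with identical small schemas.
    all_tools = [
        {
            "name": f"tool_{i}",
            "description": "Example tool with a short description.",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }
        for i in range(40)
    ]
    task_tools = filter_tools(all_tools, {"tool_3", "tool_17"})
    print("all tools:", context_cost(all_tools), "estimated tokens per turn")
    print("filtered: ", context_cost(task_tools), "estimated tokens per turn")
```

The saving repeats on every turn of the loop, and the shorter menu is also the one the model is less likely to choose wrongly from.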

Recover by architecture

The thread through every failure above is that none is fixed by trusting the model more or watching harder. They are fixed by where you draw the boundaries. That is the resilience-engineering posture: the system holds because of how it is built, not because nothing hostile ever arrives.

The host validates every tool_use against the schema before it executes. Model intent is a request, not a command.
Servers are pinned and allowlisted. Provenance is verified before output is trusted.
No single agent holds untrusted content, private data, and an outbound channel at once. Split the trifecta by design.
Sensitive tools run on servers you self-host. The half of every loop that touches firm data stays on infrastructure you control.
Server contracts are versioned and pinned. A schema change is a dependency change and gets the same governance.
Connected tool sets are kept lean, filtered to the task, for cost and for selection accuracy.

Because the protocol is open, the fourth point is available to anyone: wrap your data behind a server you control, self-host it, and keep the sensitive half of every loop off third-party infrastructure entirely. Treat MCP as a way to standardise access to your own stack rather than to rent someone else's, and vendor lock-in stops being a risk you carry.

Regulatory frame note: For regulated firms, connected MCP servers are ICT components. Under DORA Article 9 they fall inside the lifecycle integrity obligation, and any third-party-operated server sits under Article 28 third-party ICT risk. PRA SS2/21 deployment-environment attestation applies to the agent's tool surface. Equivalents: RBI IT Risk Management Guidelines, MAS TRM, HKMA TM-G-1. The architecture generalises across jurisdictions; framework mapping and materiality thresholds differ.

The protocol is young and the surface is moving. What does not move is the principle: the host is the boundary, and resilience is a property you build into where the boundaries sit, not a posture you adopt after the first incident.

About the Author

I am a CISA and CISSP-certified governance practitioner. My day-to-day work spans technology risk, audit defensibility, and cross-border regulatory intelligence across the UK (FCA, PRA), India (RBI, SEBI, IFSCA), Southeast Asia (MAS), and the Gulf (CBUAE), with working knowledge of the EU AI Act's financial services implications.

My current research sits at the intersection of audit-defensible AI deployment patterns and supervisory expectations in regulated firms, including sovereign open-weights deployment and the governance of agent and inference pipelines in firms with cross-border regulatory perimeters.

A footnote on Sentinel Engine

Sentinel Engine is the sovereign model deployment I run from my own hardware, currently in beta. The host-as-boundary discipline above is the one Sentinel operates under: tools that touch firm data run on servers I control, and the boundary sits in the host, not the model.



Disclosures

Practitioner opinion. Not legal or regulatory advice. No vendor relationships.
