Redaction fails open: whitelist your MCP tool's output instead

I maintain HeadlessTracker, an MCP server that reads crypto balances across exchanges and wallets and hands them to an AI host. It touches API keys. So "where can a secret leak?" is the question I think about most — and a conversation with a couple of security-focused folks on Bluesky this week sharpened how I talk about it. This is the pattern I landed on, and why I think it generalizes to any agent tool that touches a credential.

The leak path everyone forgets

The obvious advice is "don't put the secret in the model's context." Fine. But there's a subtler path, and it's the one builders cut corners on: your tool's output is an egress channel. Whatever a tool returns gets piped into the host model's context, and that context is frequently logged — reasoning traces, debug dumps, eval transcripts. As someone in the thread put it, reasoning traces get treated as debug slop. If your tool's output ever echoes the secret — or anything sensitive the upstream API handed back — it has already leaked, even if you were careful never to pass the credential into a prompt yourself.
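
To make that leak path concrete, here is a minimal sketch. It is not HeadlessTracker's actual code: the upstream response shape, the field names, the fake key, and the `host_log` list standing in for a logged context are all assumptions made up for illustration. It contrasts a tool that returns the upstream payload verbatim with one that copies out only named fields, which is the whitelist idea from the title.

```python
import json

# Hypothetical upstream exchange response. The field names and the echoed
# header are invented for illustration, not taken from any real API.
UPSTREAM_RESPONSE = {
    "asset": "BTC",
    "free": "0.5",
    "locked": "0.1",
    "debug": {"request_headers": {"X-API-KEY": "sk-live-EXAMPLE"}},
}

# The only fields this hypothetical tool is allowed to emit.
ALLOWED_FIELDS = ("asset", "free", "locked")


def naive_tool(response: dict) -> str:
    """Return the upstream payload verbatim, including anything it echoes."""
    return json.dumps(response)


def allowlisted_tool(response: dict) -> str:
    """Copy only the named fields, and only when they are plain strings."""
    safe = {
        key: response[key]
        for key in ALLOWED_FIELDS
        if isinstance(response.get(key), str)
    }
    return json.dumps(safe)


# Stand-in for a reasoning trace, debug dump, or eval transcript.
host_log = [tool(UPSTREAM_RESPONSE) for tool in (naive_tool, allowlisted_tool)]

print("sk-live-EXAMPLE" in host_log[0])  # True: the verbatim output leaked it
print("sk-live-EXAMPLE" in host_log[1])  # False: the field was never copied
```

Note that the tool author never passed the key into a prompt in either case. The naive version leaks only because the upstream response happened to echo it and the output was forwarded whole.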

So tool output is a trust boundary. The question is how you guard it.

The reflex: redact at egress
