Nous Research’s open-source Hermes Agent now ships a Tool Search feature. It directly addresses a growing bottleneck in AI agent systems: too many MCP tools filling up the context window. In this explainer article, we will breaks down what Tool Search does, how it works, and when to use it.

The Problem: MCP Tools Are Eating Your Context Window

When you connect multiple MCP (Model Context Protocol) servers to an AI agent, every tool’s JSON schema gets sent to the model on every turn. This happens even if the model only needs one or two tools for a given task.

Real-world deployments feel this immediately. A Hermes deployment with five MCP servers and 34 tools shows average prompt sizes of 45,000 tokens per turn. Roughly 22,000 of those tokens — around 50% — are tool schema overhead alone.

Anthropic’s own engineering data shows tool definitions can consume 134,000 tokens before optimization. Tool Attention measures the “MCP Tools Tax” at 15,000–60,000 tokens per turn for typical multi-server deployments.