RecursiveMAS (arXiv 2604.25917) showed that agents sharing internal reasoning state outperform agents that share only final outputs. The average accuracy gain across benchmarks was 8.3 points. The mechanism: each agent passes not just its answer but the latent embeddings from its own reasoning process, and the next agent conditions on both. The paper is a good result.
The catch is access. RecursiveMAS requires open-weight models with hidden states exposed at inference time. That rules out Claude, GPT-4o, and Gemini. I built a Claude-native version using the Anthropic extended thinking API. The core idea transfers: instead of passing latent vectors, pass the full thinking text. The paper calls it internal state sharing; the Claude version calls it thinking-block relay.
The architecture problem
Claude's extended thinking blocks carry an encrypted signature tied to the originating conversation. You cannot pass a signed thinking block into a different agent's messages array. The API rejects it. The workaround: extract the text from the thinking block and inject it as a regular user message.
# Extract thinking text from Agent 1










