Inside Claude Code

On March 31, 2026, the full TypeScript source for Claude Code shipped, accidentally, as part of the @anthropic-ai/claude-code npm package. This provided a rare look inside how Anthropic built a production agent harness used by hundreds of thousands of developers. I spent time reading through the code, running analysis scripts, and digging into the design choices made by the Claude Code team. This post collates the findings.

TL;DR

Most of the code is NOT the agent loop. The loop is ~1,700 lines. The other 511,000 lines are the harness.

Prompts are the largest engineering investment you can’t see. ~60,000 tokens of prompt text across 37 tool prompts, the system prompt, and a compaction prompt. The BashTool prompt alone is longer than most agent tutorials.

Context management is a pipeline, not a single operation. Three tiers that compose - budgeting, microcompaction, full compaction - each handling a different shape of growth.

Every automated recovery path needs a circuit breaker. A missing retry cap let 1,279 sessions run 50+ consecutive compaction failures each, wasting ~250K API calls per day before the team noticed.

Safety code compounds. 38,000 lines - 7.5% of the codebase - devoted to permissions and security. Every new tool, deployment mode, and model version adds more.

The line between single-agent and multi-agent is blurred. 90 feature flags reveal multi-agent coordination, background memory consolidation, and verification agents behind feature gates.

A lot of the insights came from reading both the code and the engineers’ comments in it. For context, this builds on previous posts about the agent execution loop and building your own Claude Code from scratch. This post is part of a series adapted from Designing Multi-Agent Systems, which covers agent architecture, tool design, and context engineering with full implementation code.

Before diving into lessons, here’s the shape of the codebase.
I wrote a Python script that walks the full source tree and generates these charts. The numbers are from the March 31, 2026 npm package.

513,216 lines of TypeScript across 1,884 files. 42 tools. 90 feature flags.

The agent loop is less than 1% of the codebase. The other 99% keeps it alive ...

The biggest surprise: utils/ is the largest directory at ~180,000 lines (35% of the codebase). The infrastructure that supports the agent - model routing, permission management, config loading, analytics, error handling - outweighs the agent itself by a wide margin.

BashTool: 1,143 lines of execution, 11,271 lines of making sure the execution is safe.

BashTool is 12,414 lines - larger than many complete agent frameworks. The tool itself (BashTool.tsx) is 1,143 lines. The remaining ~11,000 lines are safety infrastructure:

bashPermissions.ts (2,621 lines) - permission rule matching, classifier integration, allowlist logic

bashSecurity.ts (2,592 lines) - 20+ AST-based security checks (injection, substitution, obfuscation detection)

readOnlyValidation.ts (1,990 lines) - determining which commands are safe to auto-approve (allowlists for git, docker, ripgrep, pyright, gh CLI subcommands and their flags)

pathValidation.ts (1,303 lines) - preventing writes outside allowed directories

sedValidation.ts (684 lines) - an entire validation system just for sed commands (because the model loves using sed to edit files, and sed can do almost anything)

The tool is roughly 10% execution logic and 90% “making sure the execution is safe.”

PowerShellTool is 8,961 lines - a ground-up reimplementation (not a port of BashTool) that targets PowerShell-specific threats: download cradles (Invoke-WebRequest | Invoke-Expression), COM object abuse, Constrained Language Mode bypass detection, dynamic command names via variables, and module loading attacks. Windows shell security is a different threat model from Unix, and the code appears to be engineered to reflect that.

AgentTool - which manages sub-agent spawning - is 6,784 lines.
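The allowlist approach attributed to readOnlyValidation.ts above can be illustrated with a small sketch. This is not Claude Code's implementation: the allowlist entries, the metacharacter filter, and the function name isAutoApprovable are all hypothetical, chosen only to show the shape of "auto-approve known subcommands with known flags, refuse everything else."

```typescript
// Hypothetical allowlist: program -> subcommand -> flags treated as read-only.
// The entries are illustrative and are not taken from readOnlyValidation.ts.
const READ_ONLY_ALLOWLIST = new Map<string, Map<string, Set<string>>>([
  [
    "git",
    new Map([
      ["status", new Set(["--short", "--branch"])],
      ["log", new Set(["--oneline", "--stat"])],
      ["diff", new Set(["--stat", "--name-only"])],
    ]),
  ],
]);

// Characters that enable chaining, substitution, quoting, or redirection.
// A real checker would parse the command; this sketch simply refuses them.
const SHELL_METACHARACTERS = /[;&|<>`$(){}\\\n'"]/;

function isAutoApprovable(command: string): boolean {
  if (SHELL_METACHARACTERS.test(command)) return false;

  const [program, subcommand, ...args] = command.trim().split(/\s+/);
  if (!program || !subcommand) return false;

  const allowedFlags = READ_ONLY_ALLOWLIST.get(program)?.get(subcommand);
  if (!allowedFlags) return false; // unknown program or subcommand

  // Every flag must be allowlisted; positional arguments pass through.
  return args.every((arg) => !arg.startsWith("-") || allowedFlags.has(arg));
}
```

Under these assumptions, git status --short would be auto-approved, while git push origin main (subcommand not listed) and git log --oneline; rm -rf / (contains a metacharacter) would fall through to a permission prompt. Defaulting to "not approved" for anything unrecognized is the design choice that matters here.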
The long tail of smaller tools (Glob, Grep, Sleep) runs 50-300 lines each. The complexity concentrates where the attack surface is largest.

The core loop in Claude Code’s query.ts is recognizable. It’s a while (true) that calls the Claude API, collects tool_use blocks from the response, executes them, pushes tool_result messages back, and loops. This mirrors the agentic loop described in the earlier posts on the agent execution loop and building your own Claude Code.

But the loop carries a State object with 10 fields:

type State = {
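Setting the State object aside, the basic cycle described above (call the model, collect tool_use blocks, run them, push tool_result messages, repeat) can be sketched as follows. This is a minimal illustration, not the code in query.ts: callModel, the tools map, and maxTurns are hypothetical names introduced here, and the block shapes are simplified.

```typescript
// Simplified content blocks, loosely modeled on tool_use / tool_result.
type TextBlock = { type: "text"; text: string };
type ToolUse = { type: "tool_use"; id: string; name: string; input: unknown };
type ToolResult = { type: "tool_result"; tool_use_id: string; content: string };
type Block = TextBlock | ToolUse | ToolResult;
type Message = { role: "user" | "assistant"; content: Block[] };

// Hypothetical injection points: a model client and a table of tools.
type CallModel = (messages: Message[]) => Promise<Block[]>;
type Tools = Map<string, (input: unknown) => Promise<string>>;

async function runLoop(
  callModel: CallModel,
  tools: Tools,
  messages: Message[],
  maxTurns = 20, // assumed cap so the loop cannot run forever
): Promise<Message[]> {
  let turns = 0;
  while (true) {
    if (++turns > maxTurns) throw new Error(`exceeded ${maxTurns} turns`);

    // 1. Call the model and record its reply.
    const blocks = await callModel(messages);
    messages.push({ role: "assistant", content: blocks });

    // 2. Collect tool_use blocks; if there are none, the model is done.
    const toolUses = blocks.filter((b): b is ToolUse => b.type === "tool_use");
    if (toolUses.length === 0) return messages;

    // 3. Execute each tool, turning failures into results the model can read.
    const results: ToolResult[] = [];
    for (const use of toolUses) {
      const tool = tools.get(use.name);
      let content: string;
      try {
        content = tool ? await tool(use.input) : `unknown tool: ${use.name}`;
      } catch (err) {
        content = `tool error: ${String(err)}`;
      }
      results.push({ type: "tool_result", tool_use_id: use.id, content });
    }

    // 4. Push the results back as a user message and loop.
    messages.push({ role: "user", content: results });
  }
}
```

Passing callModel and tools in as parameters keeps the sketch independent of any particular SDK and makes it easy to drive with a fake model in tests.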
