A misconfigured package.json field shipped Claude Code’s entire source to npm in March. The dump is roughly half a million lines of TypeScript and includes the system prompt, the full tool set, the query engine, the renderer, the bash sandbox, and several unreleased feature flags. Marc Bara’s Medium write-up is the most useful summary I’ve seen. This post pulls out a few of the findings that matter for anyone building or comparing competing harnesses.
Where the orchestration lives
Bara reports that multi-agent coordination, the “treat memory as a hint, verify against the codebase” instruction, and the sub-agent contracts all live in prompt strings. Anthropic can change any of them by editing a string and shipping a new release.
That matches what I and others have noticed: Claude Code releases change model behaviour overnight. It also explains why people who try to build stable workflows on top of Claude Code keep getting surprised. The dependency isn’t the API; it’s the prompt that wraps the API, and that prompt can change between any two releases.
The renderer was rebuilt from scratch
The terminal renderer is built around an Int32Array-backed character pool with bitmask-encoded style metadata, designed to skip most redraws. Bara reports a roughly fifty-fold reduction in render calls.
Earlier in the year, Anthropic’s Thariq described the Claude Code TUI as “a small game engine”, with a per-frame pipeline that constructs a React scene graph, lays out elements, rasterises to a 2D screen, diffs against the previous frame, and emits ANSI sequences. The framing drew pushback. The most-circulated reply was Claude Code’s renderer is more complex than a game engine, arguing the renderer uses an order of magnitude more instructions per frame than Super Mario 64 did to draw an actual 3D world, and that terminals themselves are not the bottleneck. The leak suggests both sides had a point. The team did rebuild the rendering pipeline from scratch around game-engine-style techniques, but only because the original choice to run React in the terminal was almost certainly the reason for the long-standing flicker problem in the first place. A React component graph re-laying out on every keystroke is exactly the kind of work a terminal is not built for, and the rebuild looks like compensation for that decision rather than evidence that terminals are slow.
The bash sandbox is harder than it looks
Each command runs through a long sequence of numbered security checks, with blocked Zsh builtins and defences against unicode injection. Bara walks through some of the specific checks.
This is one of the parts of Claude Code that is hard to clone from a README. Most agent harnesses run shell commands through one or two layers of validation; the Claude Code sandbox is doing substantially more than that.
The feature flags
Two feature flags surfaced in the dump that hadn’t been public at the time. Bara names them KAIROS (a background daemon mode with autonomous operation, GitHub webhooks, and nightly memory consolidation) and ULTRAPLAN (cloud-offloaded architectural analysis for large repos). ULTRAPLAN has since shipped publicly. KAIROS has not, at the time of writing.
Both point at the direction Claude Code is moving: from interactive coding tool toward background agent service.
The shipping mistake
The leak shipped via one bad package.json field. As Bara puts it, the sophistication of what you build is independent of the reliability of how you ship it. The Claude Code codebase was substantial; the npm publish step was the weak link.
For anyone building or operating their own agent: the npm publish path is one of the riskier places to make a mistake. Worth running npm publish --dry-run before any release of an internal tool that contains your prompts. It runs the whole publish flow, packs the tarball, and prints the file list that would be uploaded, but stops before sending anything to the registry.
What it gives competitors
Codex, opencode, AMP, kilocode, and any other team building an agent CLI now have a working reference for the parts of Claude Code that were hardest to replicate from the outside: the renderer, the sandbox, the prompt orchestration, the memory model. How much of that shows up in the next release of each tool is the question worth watching.