Claude Code sandbox bypass patched after 130 releases

Anthropic patched a sandbox-bypass flaw in Claude Code without a public advisory, shutting a hole that lived on developer laptops rather than in distant cloud infrastructure. SecurityWeek reported the fix after researcher Aonan Guan demonstrated that network controls in Anthropic’s coding agent could be sidestepped, turning a trusted sandbox into an exit path for secrets on the workstation running it.

Guan’s write-up traced the flaw across Claude Code releases from 2.0.24 through 2.1.89 — a stretch of about 130 versions that began with the tool’s 20 October 2025 general availability launch. Anthropic closed it in release 2.1.90. The issue turned on wildcard egress allowlists, the rules that are supposed to restrict the agent to approved destinations. Guan showed they could be defeated.

Guan saw it plainly. Anthropic shipped a safety feature and asked developers to trust it. The feature had a hole.

“Shipping a sandbox with a hole is worse than not shipping one.”

— Aonan Guan

Guan’s post said the bypass could be chained into data exfiltration if Claude Code was running with access to local credentials, source files or other sensitive material on a developer machine.

The Register confirmed Anthropic had fixed the issue without a public advisory or CVE. Guan published the company’s response through its vulnerability disclosure programme: Anthropic said it had not decided whether a CVE would be issued and could not share a timeline. None of the public reports pointed to active exploitation. What they did show was how much trust modern coding agents now demand from the laptops they run on.

Why the patch matters for developer teams

Claude Code is not an autocomplete tool anymore, if it ever was. It reads repositories, opens terminals, runs commands and makes network calls on a user’s behalf. If the outbound traffic controls can be bypassed, a poisoned prompt, hostile dependency or malicious repository can turn a trusted assistant into a data path out of the workstation.

For security teams, the Anthropic patch is a data point: AI developer tooling belongs in the same risk bucket as browser extensions, package managers and CI helpers. Network sandboxes, approval prompts and container boundaries only reduce risk when the implementation holds. Once they fail, the exposure stops being an abstract model-misuse scenario and becomes the familiar problem of credentials, tokens and proprietary code on an engineer’s laptop.

Enterprise buyers have a disclosure problem here too. When a vendor sells a sandbox as a security boundary, the customer needs to know the version range, which boundary was exposed and when the fix landed. Both SecurityWeek and The Register described the Claude Code issue as a rift between the safety language vendors use for AI agents and the reality of patching them.

For Australian dev teams trialling coding agents on corporate networks, the lesson is practical rather than theoretical. Treat the agent as privileged software with a foothold on the workstation, not a harmless chat window, and track its versions and guardrails the way you would any other privileged tool. Anthropic closed this particular flaw in 2.1.90. The open question for the wider market is how many similar controls in AI tooling have not been tested at all.

Claude Code sandbox bypass patched after 130 releases

Why the patch matters for developer teams

Related

Anthropic Mythos flaws put patch speed at the centre

ASIC demands urgent cyber uplift as frontier AI Mythos accelerates threats

Anthropic Mythos used by NSA for cyber work after AU rollout