Anthropic releases managed agents that learn from session history
Original: Anthropic's managed agents hold a similar view: no replayable/rewindable sandbox is needed, the execution environment can be ephemeral, and agents are able to use its historical session records for re
Source: anthropic.com ↗
Who: Posted by Tzu Gwo on X (formerly Twitter), sharing an engineering blog post from Anthropic (the AI safety company behind the Claude family of models), authored by Lance Martin, Gabe Cemaj, and Michael Cohen on the Anthropic engineering team.
What's new: Anthropic has shipped Managed Agents, a hosted service that runs long-horizon tasks on behalf of developers. The core architectural decision is to separate what the team calls the "brain" — and its control loop — from the "hands," meaning the sandboxed computers where code actually runs, and from the session log, the durable record of everything the agent has done. Each piece becomes an independent, replaceable component rather than one fragile bundle.
How it works: Previously, all three components — the model's control loop, the execution environment, and the session record — lived inside a single . That made the whole setup a "pet": a fragile individual you cannot afford to lose. The new design turns containers into interchangeable "cattle": if one dies, the control loop catches it as a simple error, provisions a fresh container using a standard recipe, and picks up from the last saved event. The session log lives outside everything else, so a crashed control loop can be rebooted by fetching that log and resuming from the last recorded step. Credentials — passwords and access tokens Claude might need — are kept in a secure vault or baked into resources at setup time, so code Claude generates inside the sandbox can never read them directly. This closes a class of vulnerabilities.
The numbers: Decoupling the brain from the hands eliminated the need to boot a container before the model could start thinking. Sessions that never touch the sandbox no longer pay that startup cost. The result: median dropped roughly 60 percent, and the 95th-percentile wait time dropped over 90 percent.
Why it matters: The deeper motivation is longevity. Anthropic found that scaffolding written for one Claude version quickly becomes wrong for the next — a reset added because suffered "context anxiety" became dead weight on because that behavior had disappeared. By designing stable interfaces between components rather than hard-coding assumptions about what the model can or cannot do, Managed Agents aims to accommodate future Claude versions — and future agent designs — without requiring a full rebuild each time.