# Managed Agents Guide

*Insight · April 10, 2026 · 14 min read*

# How I Built My First Claude Managed Agent

Hey folks,

Anthropic released Claude Managed Agents this month (April 2026). I'd been seeing the docs float around, and honestly, I just wanted to play with it. No production use case. No grand plan. Just a "let me spin this up and see what happens" kind of afternoon.

The pitch is that you get a fully managed container where Claude runs autonomously. Bash, file I/O, web search, web fetch. All built in. No agent loop to build yourself. No sandbox to babysit. You define an agent, spin up a session, send it a message, and watch it work.

I wanted to test that. So I gave it a task: pull today's AI news from the web, read the articles, decide what actually matters, and write me a newsletter draft. Simple enough to finish in one sitting. Complex enough to see if the system actually holds up.

45 minutes later, I had a working agent. Here's the whole experience.

---

## The mental model (four things, that's it)

Before writing any code, the docs lay out four building blocks. Once these click, everything else makes sense.

![Managed Agents architecture overview - the brain, the hands, and the session as separate components](https://www-cdn.anthropic.com/images/4zrzovbb/website/903b624ada206b10753a24c6a1367e74a869165d-1080x1080.png)
*Source: [Anthropic Engineering Blog](https://www.anthropic.com/engineering/managed-agents)*

**Agent** is the configuration. Your model, system prompt, and tools. You create it once and reference it by ID. Like a function definition you never redeclare.

**Environment** is the container template. Packages, networking rules, runtimes. Also created once. Multiple sessions can share the same environment, but each one gets its own isolated container.

**Session** is where the work happens. One agent plus one environment equals one running instance. Fresh container every time.

**Events** are how you communicate. You send messages in, the agent streams tool calls and text back via server-sent events (SSE). This part ended up being more interesting than I expected. More on that later.

That's the whole mental model. Four concepts. Everything else is just API calls.
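
If it helps to see the relationships as data, here is a rough sketch. It is purely illustrative: the class and field names are mine, not the SDK's, and nothing in it talks to the API. It only shows that agents and environments are reusable definitions, while each session gets its own container and its own event log.

```python
from dataclasses import dataclass, field
from itertools import count

_container_ids = count(1)  # stand-in for "fresh container every time"


@dataclass(frozen=True)
class Agent:
    """Configuration only: model, system prompt, tools. Created once."""
    model: str
    system: str
    tools: tuple = ()


@dataclass(frozen=True)
class Environment:
    """Container template: packages and networking. Also created once."""
    pip_packages: tuple = ()
    networking: str = "unrestricted"


@dataclass
class Session:
    """One agent + one environment = one running instance."""
    agent: Agent
    environment: Environment
    container_id: int = field(default_factory=lambda: next(_container_ids))
    events: list = field(default_factory=list)  # messages in, agent output back

    def send(self, text: str) -> None:
        self.events.append({"type": "user.message", "text": text})


agent = Agent(model="claude-sonnet-4-6", system="You are a news agent.")
env = Environment(pip_packages=("feedparser",))

# Two sessions share the same definitions but never the same container or log.
first, second = Session(agent, env), Session(agent, env)
first.send("Write today's newsletter.")
```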

---

## Setting up

You need an Anthropic API key from the Console and the Python SDK. That's the entire prerequisite list.

![The decoupled architecture: brain (Claude + harness), hands (sandbox), and session (event log) running independently](https://www-cdn.anthropic.com/images/4zrzovbb/website/73e900af5b9d6ed8c64db0a8e74d4465963556b7-1640x1596.png)
*Under the hood: the brain, hands, and session are fully decoupled. If a container dies, the harness catches it. If the harness crashes, a new one picks up from the last event. Source: [Anthropic Engineering Blog](https://www.anthropic.com/engineering/managed-agents)*

```bash
pip install anthropic
export ANTHROPIC_API_KEY="your-key-here"
```

Every Managed Agents request needs a beta header: `managed-agents-2026-04-01`. The SDK adds it for you automatically. If you're using curl directly, you pass it yourself as the `anthropic-beta` HTTP header.
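
For anyone skipping the SDK, this is roughly what setting the headers by hand looks like. Treat it as a sketch: the beta value is the one quoted above, `x-api-key` and `anthropic-version` are the standard Anthropic API headers, and the `/v1/agents` path is my assumption, so check it against the docs. The request is built but never sent.

```python
import json
import os
import urllib.request

BETA_HEADER = "managed-agents-2026-04-01"  # value quoted in this post
AGENTS_URL = "https://api.anthropic.com/v1/agents"  # assumed path, verify in the docs


def build_headers(api_key: str) -> dict:
    """Headers a raw HTTP client has to set itself (the SDK does this for you)."""
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": BETA_HEADER,
        "content-type": "application/json",
    }


def build_create_agent_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for creating an agent."""
    return urllib.request.Request(
        AGENTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=build_headers(api_key),
        method="POST",
    )


request = build_create_agent_request(
    os.environ.get("ANTHROPIC_API_KEY", "your-key-here"),
    {"model": "claude-sonnet-4-6", "name": "AI News Scraper"},
)
print(request.get_method(), request.full_url)
```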

---

## Step 1: Create the agent

This is where I spent the most time. Not because the API is complicated. It's literally one POST request. But the system prompt turned out to be the single biggest variable in output quality. And I probably should have seen that coming, but I didn't.

My first attempt was lazy. Something like "research AI news and write a newsletter." The result read like a Wikipedia summary of yesterday's press releases. No structure. No editorial voice. No reason anyone would actually want to read it.

Then I got specific. I told it what categories to use. What tone to write in. What to skip. What counts as a source worth citing. The output changed completely. Same model, same tools, same everything. Better prompt, better newsletter. Context engineering at work (funny enough, I literally made a video about this).

```python
import anthropic

client = anthropic.Anthropic()

agent = client.beta.agents.create(
    model="claude-sonnet-4-6",
    name="AI News Scraper",
    system="""You are an AI news research agent. Your job:

1. Search the web for the most important AI/ML news from the last 24 hours
2. Fetch and read the top 5-7 articles
3. For each article, extract: title, source, date, and a 2-3 sentence summary
4. Write a newsletter draft in Markdown with sections:
   - Top Story (single most important development)
   - Research (papers, benchmarks, new architectures)
   - Industry (product launches, funding, partnerships)
   - Tools and Infrastructure (dev tooling, open source)
5. Save the newsletter as newsletter.md

Write in a direct, technical tone. No hype language. Include specific
numbers and citations. Skip anything that is just a product announcement
with no substance.""",
    tools=[
        {"type": "agent_toolset_20260401"}
    ]
)

print(f"Agent ID: {agent.id}")
```

The `agent_toolset_20260401` tool type turns on the full built-in tool suite: bash, file read/write/edit, glob, grep, web search, web fetch. All of it.

You can also flip the default: turn everything off, then enable only the tools you want for tighter control:

```python
# Only enable what you actually need
tools=[
    {
        "type": "agent_toolset_20260401",
        "default_config": {"enabled": False},
        "configs": [
            {"name": "web_search", "enabled": True},
            {"name": "web_fetch", "enabled": True},
            {"name": "write", "enabled": True},
            {"name": "read", "enabled": True}
        ]
    }
]
```
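
If you end up building that allowlist more than once, a tiny helper keeps it tidy. This is a sketch of my own, not something the SDK ships: `allowlisted_toolset` is a made-up name, and it simply produces the same dictionary shape as the snippet above.

```python
def allowlisted_toolset(enabled: list[str],
                        toolset_type: str = "agent_toolset_20260401") -> dict:
    """Build a toolset entry that disables everything except `enabled`."""
    names = sorted(set(enabled))  # de-duplicate and keep a stable order
    if not names:
        raise ValueError("enable at least one tool, or the agent can do nothing")
    return {
        "type": toolset_type,
        "default_config": {"enabled": False},
        "configs": [{"name": name, "enabled": True} for name in names],
    }


research_only = allowlisted_toolset(["web_search", "web_fetch", "read", "write"])
print([config["name"] for config in research_only["configs"]])
```

You would then pass `tools=[research_only]` when creating the agent.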

I left everything on. Good thing I did, too. The agent ended up writing and executing helper scripts using bash inside the container. Didn't ask it to do that. It just decided it was the fastest way to deduplicate article metadata. More on that in a sec.

---

## Step 2: Create the environment

The environment is your container config: which packages are pre-installed and what network access the agent has. The base image already ships with Python, Node.js, Go, and common system tools. You add extras through the `packages` field.

```python
environment = client.beta.environments.create(
    name="news-scraper-env",
    config={
        "type": "cloud",
        "packages": {
            "pip": ["beautifulsoup4", "feedparser"]
        },
        "networking": {"type": "unrestricted"}
    }
)

print(f"Environment ID: {environment.id}")
```

I used unrestricted networking because the agent needs to hit arbitrary news sites. For a production setup, you'd lock it down with an explicit allowlist:

```python
# Production: restricted networking
config={
    "type": "cloud",
    "networking": {
        "type": "limited",
        "allowed_hosts": [
            "https://arxiv.org",
            "https://news.ycombinator.com",
            "https://techcrunch.com"
        ],
        "allow_package_managers": True
    }
}
```

One nice thing: both the agent and environment are persistent. Create them once, save the IDs, reference them forever. Subsequent sessions reuse the cached environment so startup is faster after the first run.
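
"Save the IDs" can be as simple as a JSON file next to your script. Here is a minimal sketch, assuming a local file is good enough for you. `load_or_create_ids` is a hypothetical helper, and the demo uses a fake creator with made-up IDs so it runs without touching the API.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_create_ids(path: Path, create: Callable[[], dict]) -> dict:
    """Return saved IDs if the file exists; otherwise call `create` once and save."""
    if path.exists():
        return json.loads(path.read_text())
    ids = create()
    path.write_text(json.dumps(ids, indent=2))
    return ids


# With the objects from this post, `create` could be something like:
#   lambda: {"agent_id": agent.id, "environment_id": environment.id}
create_calls = []


def fake_create() -> dict:
    """Stand-in for the real create calls; the IDs are placeholders."""
    create_calls.append(1)
    return {"agent_id": "agent_demo", "environment_id": "env_demo"}


with tempfile.TemporaryDirectory() as tmp:
    ids_file = Path(tmp) / "managed_agent_ids.json"
    first_load = load_or_create_ids(ids_file, fake_create)   # creates and saves
    second_load = load_or_create_ids(ids_file, fake_create)  # reads from disk
```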

---

## Step 3: Start a session and send the task

This is where things start feeling different from a regular API call. You create the session, fire off a message, and then just... watch.

```python
session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
    title="Newsletter experiment - April 10, 2026"
)

client.beta.sessions.events.create(
    session_id=session.id,
    events=[{
        "type": "user.message",
        "content": [{
            "type": "text",
            "text": "Research today's top AI/ML news from the last 24 hours. Write the newsletter and save it as newsletter.md."
        }]
    }]
)
```

---

## Step 4: The streaming. This was the fun part.

OK so this is where the experiment went from "yeah this is cool" to "oh, this is *actually* cool."

The event stream gives you real-time visibility into every decision the agent makes. Tool calls as they fire. Text output as it generates. Status changes when it's done. All through a single SSE connection.

![Server-Sent Events: a persistent HTTP connection where the server streams events to the client](https://media2.dev.to/dynamic/image/width=1000,height=420,fit=cover,gravity=auto,format=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ytl7ktnefwb4shiks5w.png)
*SSE keeps a single HTTP connection open. Events arrive as they happen. No polling. Source: [dev.to/tjindapitak](https://dev.to/tjindapitak/server-sent-events-explained-2fd7)*

Three event types you care about:

- `agent.message` is the text output, streamed incrementally as the agent writes
- `agent.tool_use` fires every time the agent calls a tool (search, fetch, bash, write, etc.)
- `session.status_idle` means the agent is done. Break out of your loop.

```python
with client.beta.sessions.stream(session_id=session.id) as stream:
    for event in stream:
        if event.type == "agent.message":
            for block in event.content:
                if block.type == "text":
                    print(block.text, end="", flush=True)

        elif event.type == "agent.tool_use":
            print(f"\n  [Tool: {event.name}]")

        elif event.type == "session.status_idle":
            print("\n\nAgent finished.")
            break
```

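Watching it run in the terminal was honestly kind of mesmerizing. Here's what a typical run looked like (the `Tokens:` line at the end comes from the usage lookup I cover further down, not from the loop above):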

```
I'll research today's top AI/ML news and compile a newsletter.

  [Tool: web_search]
  [Tool: web_fetch]
  [Tool: web_fetch]
  [Tool: web_search]
  [Tool: web_fetch]
  [Tool: web_fetch]
  [Tool: web_fetch]
  [Tool: bash]          ← didn't ask for this
  [Tool: write]

I've compiled today's AI newsletter and saved it as newsletter.md.

Agent finished.
Tokens: 5,240 in / 3,180 out
```

A few things I noticed watching these runs:

**It made two separate searches.** First a broad sweep for AI news. Then a targeted follow-up specifically for research papers and open source releases. It split the work based on the newsletter categories in my system prompt. Nobody told it to do two passes. It just figured that was the best approach.

**It didn't fetch every search result.** Out of maybe 15 results, it fetched 5. It read the snippets, decided which ones were worth the full read, and skipped duplicates and press releases with no substance. That's editorial judgment. In a normal scraping pipeline, you'd write custom filtering logic for that. Here, the model just did it.

**The bash call caught me off guard.** On one run, the agent wrote a small Python script to parse and deduplicate article metadata, ran it inside the container, and used the cleaned output to write the newsletter. It had a full Linux environment and it used it. I found that genuinely impressive. This is different from a stateless API call where the model can only respond with text. Here it can actually *compute*.

**No lag between tool calls.** The SSE connection stayed open and events arrived the moment they happened. Total time per run was about 60-90 seconds. Most of that was network latency from the web fetches.
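
I got those observations by eyeballing the terminal. If you want them as numbers, a small tally over the event stream does the job. The sketch below assumes only what the loop above already relies on (`event.type` and `event.name`), and it runs on stand-in events that mirror the sample trace rather than a live session.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Iterable


def tally_tool_calls(events: Iterable) -> Counter:
    """Count `agent.tool_use` events by tool name, stopping at the idle event."""
    counts = Counter()
    for event in events:
        if event.type == "agent.tool_use":
            counts[event.name] += 1
        elif event.type == "session.status_idle":
            break
    return counts


# Stand-in events mirroring the trace above; a real run would pass the stream.
trace = ["web_search", "web_fetch", "web_fetch", "web_search",
         "web_fetch", "web_fetch", "web_fetch", "bash", "write"]
fake_events = [SimpleNamespace(type="agent.tool_use", name=name) for name in trace]
fake_events.append(SimpleNamespace(type="session.status_idle"))

counts = tally_tool_calls(fake_events)
print(counts["web_search"], counts["web_fetch"], counts["bash"])  # prints: 2 5 1
```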

---

## Reading the output

The newsletter lives inside the session's container. Since sessions are stateful, you can just ask for it:

```python
client.beta.sessions.events.create(
    session_id=session.id,
    events=[{
        "type": "user.message",
        "content": [{
            "type": "text",
            "text": "Read newsletter.md and output the full contents."
        }]
    }]
)

with client.beta.sessions.stream(session_id=session.id) as stream:
    for event in stream:
        if event.type == "agent.message":
            for block in event.content:
                if block.type == "text":
                    print(block.text, end="", flush=True)
        elif event.type == "session.status_idle":
            break
```

The container persists across messages within a session. So you can iterate. "Move the Anthropic story to the top." "Add a Tools section." "Make the summaries shorter." It works because the agent still has all the files and context from the previous turn. You're not rebuilding state on every call. That felt really smooth.
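
Since every follow-up is the same "send a message, read until idle" dance, I'd wrap it in a helper. This is a sketch that uses only the two client calls already shown in this post, in the same order. The names `user_message` and `ask` are my own.

```python
def user_message(text: str) -> dict:
    """Build the `user.message` event payload used throughout this post."""
    return {"type": "user.message", "content": [{"type": "text", "text": text}]}


def ask(client, session_id: str, text: str) -> str:
    """Send one message, then collect the agent's text until the session goes idle."""
    client.beta.sessions.events.create(
        session_id=session_id, events=[user_message(text)]
    )
    chunks = []
    with client.beta.sessions.stream(session_id=session_id) as stream:
        for event in stream:
            if event.type == "agent.message":
                chunks.extend(b.text for b in event.content if b.type == "text")
            elif event.type == "session.status_idle":
                break
    return "".join(chunks)
```

With that in place, iterating is one line per turn, for example `ask(client, session.id, "Make the summaries shorter.")`.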

---

## Token usage and what this actually costs

After a session finishes, you can pull cumulative token counts:

```python
final = client.beta.sessions.retrieve(session_id=session.id)

print(f"Input tokens:   {final.usage.input_tokens}")
print(f"Output tokens:  {final.usage.output_tokens}")
print(f"Cache creation: {final.usage.cache_creation_input_tokens}")
print(f"Cache reads:    {final.usage.cache_read_input_tokens}")
```

I ran this about a dozen times over a few days. Numbers were pretty consistent:

| Metric | Typical range | What's in there |
|--------|---------------|-----------------|
| Input tokens | 4,500 - 6,000 | System prompt, user message, fetched web content, tool metadata |
| Output tokens | 2,800 - 3,500 | Newsletter draft, reasoning, tool call arguments |
| Cache reads | High on back-to-back runs | System prompt + repeated context (5-min TTL) |
| Total per run | ~10,000 tokens | A few cents at Sonnet pricing |

Prompt caching is built in. The system prompt and repeated context get cached automatically with a 5-minute TTL. So if you're working interactively and sending follow-up messages in the same session, subsequent turns are cheaper because of cache reads. For a daily batch job that runs cold once per morning, you pay full input price each time. Either way, we're talking single-digit cents per execution.

Honestly, the cost is a rounding error. The more interesting number is wall-clock time, and 60-90 seconds per run is fast enough that it feels interactive even though the agent is doing real work behind the scenes.
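
To sanity-check the "few cents" claim, here is the arithmetic as a sketch. The per-million-token prices are placeholder defaults I picked for illustration, so swap in the current rates for your model. It ignores cache discounts and any charges that aren't per-token.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_mtok: float = 3.00,
                      output_price_per_mtok: float = 15.00) -> float:
    """Rough cost of one run from token counts and per-million-token prices."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


# Low and high ends of the ranges in the table above.
low = estimate_cost_usd(4_500, 2_800)
high = estimate_cost_usd(6_000, 3_500)
print(f"${low:.4f} to ${high:.4f} per run")
```

With those placeholder prices, the table's ranges work out to about 5.5 to 7 cents per run, which lines up with the single-digit-cents figure.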

---

## Custom tools: when you need to reach outside the container

Built-in tools handle everything that happens inside the container. For anything outside (sending emails, writing to a database, calling a private API), you define custom tools.

The pattern is clean. You declare the tool schema on the agent. When the agent decides to call it, the session pauses. Your code does the actual work. You send the result back. The agent continues.

Here's how I added email delivery:

```python
agent = client.beta.agents.create(
    model="claude-sonnet-4-6",
    name="AI News Scraper + Email",
    system="...",  # same prompt, plus "send the newsletter when done"
    tools=[
        {"type": "agent_toolset_20260401"},
        {
            "type": "custom",
            "name": "send_email",
            "description": "Send the compiled newsletter via email to the subscriber list. Call this after writing newsletter.md. The subject should include today's date. Provide the full newsletter content as markdown in the body parameter.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "subject": {"type": "string", "description": "Email subject line with date"},
                    "body": {"type": "string", "description": "Full newsletter in markdown"}
                },
                "required": ["subject", "body"]
            }
        }
    ]
)
```

Then in the streaming loop, you handle the custom tool event:

```python
with client.beta.sessions.stream(session_id=session.id) as stream:
    for event in stream:
        if event.type == "agent.message":
            for block in event.content:
                if block.type == "text":
                    print(block.text, end="", flush=True)

        elif event.type == "agent.custom_tool_use":
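            if event.name == "send_email":
                # send_newsletter_email is your own delivery function (not shown);
                # here it returns a dict with a "count" key.
                result = send_newsletter_email(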
                    subject=event.input["subject"],
                    body=event.input["body"]
                )
                client.beta.sessions.events.create(
                    session_id=session.id,
                    events=[{
                        "type": "user.custom_tool_result",
                        "custom_tool_use_id": event.id,
                        "content": [{
                            "type": "text",
                            "text": f"Sent to {result['count']} subscribers."
                        }]
                    }]
                )

        elif event.type == "session.status_idle":
            print("\nDone.")
            break
```
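
Once you have more than one custom tool, the `if event.name == ...` chain gets old fast. One way to handle it is a name-to-handler registry that builds the result event for you. This is a sketch: `handle_custom_tool` and `fake_send_email` are hypothetical names, the event fields (`id`, `name`, `input`) are the ones used in the loop above, and reporting failures back as plain text is my own choice rather than anything the API requires.

```python
from types import SimpleNamespace
from typing import Callable


def handle_custom_tool(event, handlers: dict[str, Callable[..., str]]) -> dict:
    """Run the handler registered for `event.name` and build the result event."""
    handler = handlers.get(event.name)
    if handler is None:
        text = f"No handler registered for tool '{event.name}'."
    else:
        try:
            text = handler(**event.input)
        except Exception as exc:  # report failures back instead of crashing the loop
            text = f"Tool '{event.name}' failed: {exc}"
    return {
        "type": "user.custom_tool_result",
        "custom_tool_use_id": event.id,
        "content": [{"type": "text", "text": text}],
    }


def fake_send_email(subject: str, body: str) -> str:
    """Hypothetical stand-in for real delivery code."""
    return f"Sent '{subject}' ({len(body)} chars)."


demo_event = SimpleNamespace(
    id="evt_demo", name="send_email",
    input={"subject": "AI News, April 10", "body": "# Top Story"},
)
result_event = handle_custom_tool(demo_event, {"send_email": fake_send_email})
print(result_event["content"][0]["text"])
```

In the streaming loop, the custom tool branch then shrinks to a single `client.beta.sessions.events.create(session_id=session.id, events=[handle_custom_tool(event, handlers)])`.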

One thing I learned: be *really* detailed in your custom tool descriptions. "Send email" as a description is not enough. Aim for three to four sentences explaining when to call it, what format the input should be in, and what a successful result looks like. The description is the only context the model has for deciding when and how to use your tool. Treat it like you're writing onboarding docs for a new team member.

---

## The daily runner

Once the agent and environment exist, the daily script is tiny:

```python
#!/usr/bin/env python3
"""Daily AI newsletter agent."""
import anthropic

AGENT_ID = "agent_01..."
ENVIRONMENT_ID = "env_01..."

def run():
    client = anthropic.Anthropic()
    session = client.beta.sessions.create(
        agent=AGENT_ID, environment_id=ENVIRONMENT_ID
    )
    client.beta.sessions.events.create(
        session_id=session.id,
        events=[{
            "type": "user.message",
            "content": [{"type": "text", "text":
                "Research today's top AI/ML news. Write the newsletter. Save as newsletter.md."
            }]
        }]
    )

    with client.beta.sessions.stream(session_id=session.id) as stream:
        for event in stream:
            if event.type == "agent.message":
                for block in event.content:
                    if block.type == "text":
                        print(block.text, end="", flush=True)
            elif event.type == "agent.tool_use":
                print(f"\n  [{event.name}]", end="")
            elif event.type == "session.status_idle":
                break

    usage = client.beta.sessions.retrieve(session_id=session.id).usage
    print(f"\nTokens: {usage.input_tokens} in / {usage.output_tokens} out")

if __name__ == "__main__":
    run()
```

Cron it, GitHub Actions it, whatever you prefer. Each run creates a fresh session. Agent and environment are already sitting there waiting.
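
With nobody watching the terminal on a scheduled run, I'd want a wall-clock cap so a stuck session can't hang the job forever. Here is a minimal sketch, with `until_deadline` being a helper name I made up. It only checks the clock between events, so it won't interrupt a stream that is blocked waiting for the next one.

```python
import time
from typing import Callable, Iterable, Iterator


def until_deadline(events: Iterable, max_seconds: float,
                   clock: Callable[[], float] = time.monotonic) -> Iterator:
    """Yield events until `max_seconds` have elapsed, then raise TimeoutError.

    The check happens only when a new event arrives.
    """
    deadline = clock() + max_seconds
    for event in events:
        if clock() > deadline:
            raise TimeoutError(f"run exceeded {max_seconds} seconds")
        yield event


# Demo with a fake clock so it finishes instantly: the third reading is past the cap.
ticks = iter([0, 10, 20, 400])  # fake clock readings, in seconds
seen = []
try:
    for item in until_deadline(["a", "b", "c"], max_seconds=300,
                               clock=lambda: next(ticks)):
        seen.append(item)
    timed_out = False
except TimeoutError:
    timed_out = True
```

In the runner above, that would mean looping over `until_deadline(stream, max_seconds=300)` instead of `stream` directly.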

---

## What I took away from this

**The system prompt is the product.** I keep coming back to this. The model, the tools, the container. Those were all straightforward. The system prompt took iteration. And the gap between a vague prompt and a specific one was the gap between output you'd delete and output you'd actually send to people.

**Streaming changes how you think about agents.** Being able to watch tool calls in real time gives you immediate feedback. I caught a bad system prompt iteration within 10 seconds because I could see the agent searching for the wrong terms. Without the stream, I would have waited 90 seconds for bad output and then spent another 5 minutes guessing what went wrong.

**Having a real container matters.** The agent wrote helper scripts. It used the file system as scratch space between steps. It ran grep to search through fetched content. That's meaningfully different from a stateless completion call. The compute environment expands what the model can do in ways I didn't fully appreciate until I watched it happen.

**Token costs are boring (in a good way).** ~10k tokens per run. A few cents. The cost that actually matters is time, and 60-90 seconds is fast enough that it doesn't feel like waiting.

**Custom tools are the bridge to everything else.** Built-in tools handle research and compute. Custom tools handle the outside world. The handoff is explicit and clean. No middleware framework required.

If you've been curious about managed agents but haven't tried them yet, I'd genuinely recommend just spinning one up and seeing what happens. Pick a task, write a specific system prompt, and watch the event stream. That's what I did, and I learned more in 45 minutes of experimenting than I would have from reading the docs for a week.

The full docs are here if you want to dig in:

- [Managed Agents overview](https://platform.claude.com/docs/en/managed-agents/overview)
- [Quickstart](https://platform.claude.com/docs/en/managed-agents/quickstart)
- [Tools reference](https://platform.claude.com/docs/en/managed-agents/tools)
- [Environment setup](https://platform.claude.com/docs/en/managed-agents/environments)
- [Events and streaming](https://platform.claude.com/docs/en/managed-agents/events-and-streaming)

Until next time,
Amit.
