You ask your AI assistant to research a competitor's product, summarize the findings, draft a comparison document, and email it to your team. A single agent can do each of these steps — but doing them well, in sequence, while maintaining context across all four tasks, pushes against the limits of what one agent session handles gracefully. The context window fills up. The agent loses track of earlier steps. The quality of the final output degrades because the model is juggling too many concerns simultaneously.
This is the problem that agent swarms solve. Instead of one agent doing everything, a coordinator agent breaks the task into subtasks and delegates each one to a specialist agent. The research agent searches the web and returns structured findings. The writing agent takes those findings and drafts the document. The email agent sends the result. Each agent operates in its own container, with its own context window, focused on one thing.
The idea isn't new — CrewAI and AutoGen have been exploring multi-agent patterns for over a year. What's different about NanoClaw's approach is that each agent in the swarm runs in its own isolated container, which means the security and isolation guarantees that apply to single-agent conversations extend to multi-agent workflows automatically.
How Swarms Work in NanoClaw
NanoClaw's swarm architecture is built on a simple primitive: an agent can spawn other agents. When Claude Code runs inside a container, one of its available tools is agent delegation — the ability to describe a subtask and have NanoClaw spawn a new container to handle it. The parent agent gets back the result when the child agent completes.
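In code, the primitive might look something like the sketch below. All names here (`SwarmRuntime`, `spawn_agent`, `AgentResult`) are illustrative stand-ins, not NanoClaw's actual API; the stub just simulates the round trip of describing a subtask and getting a result back from a child container.

```python
from dataclasses import dataclass

@dataclass
class AgentResult:
    agent_id: str
    output: str

class SwarmRuntime:
    """Stand-in for the host process that launches agent containers."""

    def __init__(self) -> None:
        self._next_id = 0

    def spawn_agent(self, subtask: str, tools: list[str]) -> AgentResult:
        # In NanoClaw this would start a fresh container with only the
        # listed tools available; here we just simulate the hand-off.
        self._next_id += 1
        agent_id = f"agent-{self._next_id}"
        return AgentResult(agent_id, f"[{agent_id}] completed: {subtask}")

runtime = SwarmRuntime()
result = runtime.spawn_agent("summarize competitor pricing", tools=["web_browse"])
print(result.output)
```

The parent blocks on the call (or awaits it) and sees only the child's final output, never its intermediate context.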
The orchestration happens naturally through Claude's own reasoning. You don't define a workflow graph or configure agent roles in a YAML file. You describe what you want, and Claude decides whether to handle it directly or delegate parts to sub-agents. The decision is based on the same reasoning that makes Claude good at breaking down complex problems — it recognizes when a task has independent components that would benefit from focused attention.
In practice, this looks like a tree of containers. The root container receives your message, decides it needs web research and document writing, spawns two child containers, waits for their results, and synthesizes a final response. Each child container has its own CLAUDE.md context, its own mounted workspace, and its own set of tools. The research agent has web browsing enabled; the writing agent has file write access to the shared workspace. Neither can access the other's context or tools.
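The per-child configuration described above can be sketched as plain data. The field names (`context`, `tools`, `mounts`) are assumptions for illustration, not NanoClaw's real schema; the point is that each child's capabilities are declared separately and never overlap by accident.

```python
# Illustrative specs for the two children in the tree described above.
research_spec = {
    "context": "research/CLAUDE.md",   # its own context file
    "tools": ["web_browse"],           # can browse, cannot write files
    "mounts": [],                      # no writable paths at all
}
writer_spec = {
    "context": "writer/CLAUDE.md",
    "tools": ["file_write"],           # can write, cannot browse
    "mounts": [("/workspace", "rw")],  # the shared workspace
}

# The isolation claim, stated as invariants rather than prose:
assert "file_write" not in research_spec["tools"]
assert "web_browse" not in writer_spec["tools"]
```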
Why Container Isolation Matters for Swarms
Most multi-agent frameworks run all agents in the same process. CrewAI's agents share a Python runtime. AutoGen's agents share a conversation thread. This works for demos, but it creates problems at scale that are hard to fix after the fact.
The first problem is blast radius. If one agent in a swarm encounters a prompt injection — a malicious website that tries to hijack the agent's behavior — the injection is contained to that single container. It can't affect the parent agent, can't access other child agents' contexts, and can't read the original user's conversation history. The compromised container gets torn down when it completes, and the parent agent receives whatever output it produced (which the parent can evaluate for quality before using).
The second problem is resource contention. Agents in a shared process compete for the same context window, the same memory, and the same CPU. In NanoClaw, each container has its own resources. A research agent that's browsing heavyweight web pages doesn't slow down a writing agent that's drafting a document. The containers run concurrently as separate isolated processes, and the host's scheduler handles resource allocation.
The third problem is credential scoping. A research agent needs web access but shouldn't have file write permissions. A file management agent needs disk access but shouldn't have web access. In a shared-process framework, enforcing these boundaries requires application-level permission checks that can be bypassed. In NanoClaw, the boundaries are container mounts — the research agent literally cannot write to disk because no writable path is mounted into its container.
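To make the mount-based scoping concrete, here is a sketch of a role-to-flags mapping using Docker-style flags for familiarity; NanoClaw's actual container runtime and flag names are an assumption here. The enforcement lives in the container runtime, not in application-level permission checks.

```python
def container_args(role: str) -> list[str]:
    """Build Docker-style run flags for a role-scoped agent container."""
    args = ["run", "--rm", "--read-only"]  # immutable root filesystem
    if role == "research":
        pass  # network stays on; no writable mounts are added
    elif role == "files":
        args += [
            "--network=none",                  # literally no web access
            "-v", "/workspace:/workspace:rw",  # the only writable path
        ]
    else:
        raise ValueError(f"unknown role: {role}")
    return args

print(container_args("files"))
```

A bypassed application check can still write to disk; a container with no writable mount cannot.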
Practical Swarm Patterns
The patterns that emerge from real usage are more interesting than the theoretical architecture. The most common is the research-and-synthesize pattern: a parent agent spawns 3-5 research agents to investigate different aspects of a question in parallel, collects their findings, and produces a synthesized answer that's more thorough than any single agent could produce in one pass.
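The fan-out/fan-in shape of that pattern can be sketched as follows. `run_research_agent` is a hypothetical stand-in for spawning one research container per aspect; real children would return richer, structured findings.

```python
import concurrent.futures

def run_research_agent(aspect: str) -> str:
    # Stand-in for a containerized research agent investigating one aspect.
    return f"findings on {aspect}"

def research_and_synthesize(question: str, aspects: list[str]) -> str:
    # Fan out: one child per aspect, running in parallel.
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(aspects)) as pool:
        findings = list(pool.map(run_research_agent, aspects))
    # Fan in: the parent would hand these findings to Claude for synthesis.
    return question + "\n" + "\n".join(f"- {f}" for f in findings)

answer = research_and_synthesize(
    "How does competitor X price its product?",
    ["pricing tiers", "discounts", "enterprise contracts"],
)
```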
The second common pattern is draft-and-review. An agent writes a first draft, then spawns a reviewer agent with instructions to critique it. The reviewer's feedback goes back to the original agent (or a new drafting agent) for revision. This produces noticeably better output than single-pass generation, because the reviewer agent has a fresh context window and can evaluate the draft without the cognitive load of having written it.
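The loop structure of draft-and-review is simple enough to sketch directly. Here `draft` and `review` stand in for delegated agents, each of which would run in its own fresh container with a fresh context window; the function names and signatures are illustrative.

```python
def draft(prompt: str) -> str:
    return f"draft addressing: {prompt}"  # stand-in for a drafting agent

def review(text: str) -> str:
    return f"critique of ({text})"  # stand-in for a fresh-context reviewer

def draft_and_review(prompt: str, rounds: int = 1) -> str:
    text = draft(prompt)
    for _ in range(rounds):
        feedback = review(text)  # the reviewer sees only the draft itself
        text = draft(f"{prompt}, revised per {feedback}")
    return text

final = draft_and_review("comparison doc")
```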
The third pattern is tool specialization. Some tasks require tools that are expensive or risky — web browsing, shell command execution, file system modifications. A parent agent can delegate these operations to child agents with specific tool access, keeping its own context clean and its own permissions minimal. The parent never directly touches the filesystem or the network; it only processes the results that child agents return.
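A minimal sketch of that separation, with `delegate` as a hypothetical stand-in for spawning a tool-scoped child: the parent composes an answer purely from the strings its children return, without ever holding shell or network permissions itself.

```python
def delegate(subtask: str, tools: list[str]) -> str:
    # Stand-in for spawning a child container with exactly these tools.
    return f"<result of {subtask!r} via {tools}>"

def parent_agent(request: str) -> str:
    # The parent never touches the shell or the network directly;
    # it only processes what the scoped children hand back.
    listing = delegate("list workspace files", tools=["shell"])
    notes = delegate("fetch latest release notes", tools=["web_browse"])
    return f"answer to {request!r} built from {listing} and {notes}"

out = parent_agent("what changed since v2?")
```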
The Limits of Swarms
Swarms aren't free. Every child agent is a separate Claude API call, which means separate token costs. A swarm that spawns five research agents costs roughly five times as much as a single agent doing all the research. For simple questions — "what's the weather?" or "translate this sentence" — swarms are pure overhead.
The latency also compounds. Even with parallel execution, the parent agent has to wait for the slowest child to complete before it can synthesize results. A swarm of five agents where one takes 30 seconds to browse a slow website means the user waits 30 seconds plus synthesis time, regardless of how fast the other four were.
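The arithmetic is worth making explicit: with parallel children, the parent's wait is bounded by the slowest child, not the average. The latencies below are illustrative numbers matching the 30-second example above.

```python
# Per-child latencies in seconds; one slow child gates the whole swarm.
child_latencies = [2.1, 1.8, 30.0, 2.5, 1.9]
synthesis_time = 3.0

parallel_wait = max(child_latencies) + synthesis_time    # user-visible wait
sequential_wait = sum(child_latencies) + synthesis_time  # the serial alternative
print(parallel_wait, sequential_wait)
```

Parallelism still wins over running the children serially, but it cannot hide a single slow outlier.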
NanoClaw handles this pragmatically. Claude decides when to use swarms based on task complexity — simple questions get direct answers, complex multi-part requests get delegated. The user doesn't configure swarm behavior; they just ask questions, and the system scales its approach to match the complexity. The goal isn't to use swarms everywhere — it's to use them where they genuinely produce better results than a single agent could.
The multi-agent future isn't about replacing single agents. It's about giving single agents the ability to call for help when they need it, in a way that's secure, isolated, and transparent. NanoClaw's container-per-agent model makes that possible without the security compromises that come from running multiple agents in a shared environment.