Claude Code Agent Chains: One Prompt, Three Agents, Zero Context Bleed

This Changes How You Build Content Systems

Claude Code can run a chain of three sub-agents passing work between each other from a single prompt. Each agent has one job, one file, and one model — and the output quality is noticeably better than dumping everything into one context window.

Here's the template.

---

The Problem With One-Agent Workflows

Most business owners who start building AI content systems do the same thing: they write one massive prompt that asks Claude to research the topic, write the draft, check the voice, and ship it — all in one go. The output is mediocre and they don't know why. The prompt isn't the problem. The architecture is.

When you hand one agent a list of five jobs, it does all five at 60% quality. Context bloats. The model starts trading off: it skims the research to get to the writing, or it writes something that vaguely sounds like your voice but misses the brief because it never fully committed to it. You end up editing the AI's output more than you would've edited a junior writer's draft, which defeats the entire point.

The fix isn't a better prompt. It's fewer jobs per agent.

---

How the Three-Agent Chain Works

The chain has three nodes. Each node is a separate Claude sub-agent with its own model selection, its own input file, and its own output file. Agent 1 hands a file to Agent 2. Agent 2 hands a file to Agent 3. Agent 3 either ships the output or kicks it back to Agent 2 with revision notes. The whole thing fires from one prompt.

Here's what each agent actually does.

---

1. The Researcher — Sonnet, Light Context, One Job

The researcher's job is to pull source material and hand off a structured brief. That's it. It doesn't write a word of the final draft.

Sonnet is the right model here because the task is retrieval and summarization — not reasoning-heavy generation. Sonnet reads the URLs, scrapes the docs, pulls the stats, and writes a structured brief: topic, key points, sources, constraints. The output file is clean. No fluff. No half-formed arguments. Just the raw material the writer needs.

**Example:** You're building a weekly industry newsletter. The researcher gets two URLs — a new model release page and a benchmark comparison post. It reads both, pulls the relevant numbers (think "74.8 versus 72.1 on MMLU at $0.96 versus $8.90"), and drops them into `brief.md`. That file has six bullet points and four cited sources. The researcher's work is done in under 90 seconds.

**What you gain:** The writer starts with a complete brief, not a blank page or a vague instruction like "write about the new Anthropic release." The gap between brief and draft shrinks from minutes of re-reading source material to zero.

---

2. The Writer — Opus, Full Context, One Job

The writer takes `brief.md` and drafts the work. It doesn't go back to the source URLs. It doesn't edit for voice. It writes from what the researcher gave it — nothing more.

Opus is the right model here because this is the generation-heavy step. You're asking for structured, coherent prose from a compact brief, and Opus handles that better at this task depth than Sonnet or Haiku. The writer's output file is a complete draft: all sections, all transitions, all the content. Rough in places, maybe — that's fine. The editor handles it.

**Example:** The writer receives the `brief.md` from the newsletter example above. Six bullet points, four sources, one constraint: "written in Jon's voice — peer engineer register, specific numbers, no hype words." Opus produces a 400-word newsletter section in one pass. It uses the 74.8 versus 72.1 numbers. It doesn't editorialize. It hands `draft.md` to the editor.

**What you gain:** A complete draft that's actually grounded in the source material because that's all the writer had access to. No hallucinated stats. No wandering off-brief. The brief was the only context window.

---

3. The Editor — Haiku, Narrow Context, One Job

The editor has one rule: does this draft match the voice and answer the brief? Yes or no.

Haiku is the right model here because the task is evaluation, not generation. Haiku reads `draft.md` against a compact voice checklist — a 10-line document listing what the voice sounds like and what it never sounds like. It doesn't rewrite the whole draft. It flags specific sentences and either approves the file for output or writes a short revision note and sends `draft.md` back to the writer.

The narrow context is intentional. The editor doesn't need the source URLs. It doesn't need the full research brief. It needs two things: the draft and the voice rules. Keeping the context tight keeps Haiku fast and cheap — and avoidance of context bleed is the whole point of the chain.

**Example:** The editor reads the newsletter draft and flags two sentences: one uses "" (banned word) and one doesn't include a specific number where the brief clearly had one. The editor writes a two-line revision note to `draft.md` — "Line 4: remove ',' rewrite as direct statement. Line 7: include the $0.96 figure from the brief." That note goes back to the writer. Opus fixes both lines. The editor approves on pass two. Output ships.

**What you gain:** Voice consistency across every draft, enforced by a model that's only looking at one thing. The editor doesn't drift because it has nothing to drift with. It either approves or it doesn't.

---

How the Files Connect

The chain moves in one direction: `brief.md` → `draft.md` → `output.md`. Each agent reads one file and writes one file. No agent reads another agent's output file except the one downstream from it.

The structure looks like this:

``` /project brief.md ← Researcher writes this draft.md ← Writer reads brief.md, writes this revision.md ← Editor writes this if draft fails check output.md ← Editor writes this if draft passes check ```

When the editor kicks the draft back, it writes to `revision.md`. The writer reads `revision.md` and updates `draft.md`. The editor re-reads `draft.md`. The loop runs until the draft passes — usually two passes, sometimes one.

---

Why Model Selection Matters Here

This isn't arbitrary. Sonnet, Opus, and Haiku aren't interchangeable — they're matched to the task weight at each node.

Sonnet at the research step keeps cost low on a task that doesn't need heavy generation. Running Opus to scrape URLs and summarize docs is like hiring a senior engineer to take meeting notes. The work doesn't need that horsepower.

Opus at the writing step earns its cost. The brief-to-draft conversion is where reasoning and generation depth actually matter. That's where you get output that sounds like something, not output that sounds like "here is a draft of the requested content."

Haiku at the editing step keeps the feedback loop fast. A quick evaluation pass doesn't need a powerful model — it needs a focused one. Haiku with a narrow context and a clear checklist catches voice drift and off-brief sentences without the cost or latency of running Opus twice.

For a 400-word newsletter piece, the full three-agent chain costs under $2 in API calls. Running everything through Opus three times would cost $15–$20 and produce worse output because the context would be bloated with research noise before the writer even started.

---

The Template Prompt

Here's the prompt structure that fires the full chain from a single input:

``` Run a three-agent chain using the following configuration:

AGENT 1 — Researcher (Sonnet) - Input: [URL list or topic brief] - Task: Read sources, extract key facts and quotes, write a structured brief - Output: brief.md

AGENT 2 — Writer (Opus) - Input: brief.md - Task: Draft [content type] in [voice description] - Output: draft.md

AGENT 3 — Editor (Haiku) - Input: draft.md + voice-rules.md - Task: Check draft against voice rules and brief. If pass → write output.md. If fail → write revision.md with specific notes, return to Agent 2. - Max revision loops: 2

Final output: output.md ```

Swap in your content type, your voice description, and your source URLs. The chain handles the rest.

---

What This System Replaces

Before this architecture, the typical workflow for a business owner using AI to create content looks like this: open ChatGPT or Claude.ai, paste in a 600-word prompt asking for everything at once, get a generic draft, spend 30–45 minutes editing it into something that sounds like the brand, wonder why the AI keeps missing the specific stats from the research they included in the prompt.

The three-agent chain replaces that whole loop with a structured pipeline that runs in under 3 minutes and produces a draft that needs one revision pass at most. The 30–45 minutes of editing collapses down to reading `revision.md` and deciding whether to ship.

That's not a claim about saving time. That's what the workflow change produces structurally — because when each agent has one job, each agent does one job well.

---

What to Build Next

Once the three-agent chain is running for content, the same architecture applies to other business workflows: research-plan-review for client proposals, audit-recommend-format for SEO, scrape-categorize-report for competitive intel. The roles change. The structure doesn't.

One prompt. Three agents. One job each. No context bleed.

If you want the full Claude Code template file — including the voice-rules checklist, the prompt structure, and the file schema — book a call and we'll set it up for your specific workflow.

**Book a call: yoursite.com/book**