Forty minutes into a session, after three rounds of file exploration and a test investigation, Claude stops following the plan.
Not dramatically. It doesn't refuse or throw errors. It just quietly starts taking shortcuts. The CLAUDE.md says "dispatch to subagents." Claude reads it at the top of the session and follows it perfectly for the first two tasks. Then context fills up, the instructions compete with 50,000 tokens of file contents and tool results, and Claude decides it's faster to edit the React component directly instead of dispatching a third subagent.
The types don't match the API. The build breaks in CI. And you're left debugging a cross-layer inconsistency that the orchestrator pattern was designed to prevent.
If you've used Claude Code on a large codebase, you've seen this. The agent is sharp and disciplined at the start. By the end of a long session, it's cutting corners. Not because the model is bad, but because that's how attention works. Instructions written in natural language get diluted as context grows. The rules that were front-of-mind at 2,000 tokens are background noise at 50,000.
If you run a monorepo with multiple services, an SDK, and frontend apps, you've probably adopted the orchestrator pattern already: main thread plans and reviews, subagents implement each layer. It's the right architecture. But it breaks down in long sessions. Writing better instructions isn't the fix. You need to make the pattern unbreakable.
Three enforcement layers. Each catches what the previous one misses.
Layer 1: CLAUDE.md tells Claude what to do
The foundation is natural language guidance in CLAUDE.md. Here's the orchestrator section you'd add to yours:
## Work Routing — Orchestrator + Subagent Architecture
**The main thread is the orchestrator — it plans, dispatches,
reviews, and verifies. Subagents implement.**
### Default Session Flow (non-trivial tasks)
**Phase 0 — Clarify** (main thread, before any code):
- Ask clarifying questions about requirements, scope, edge cases
- Identify affected project(s)
- Determine the layer stack: which services, packages, and apps
**Phase 1 — Task Plan** (main thread, uses TaskCreate):
- Create tasks with clear acceptance criteria
- Order tasks inside-out: backend → SDK/library → UI
- Set task dependencies with addBlockedBy
**Phase 2 — Dispatch** (main thread launches subagents):
- One subagent per task, with project-specific CLAUDE.md and rules
- Parallel dispatch for independent tasks
- Sequential for dependent tasks (wait for backend before SDK)
**Phase 3 — Verify** (main thread, with user):
- Review subagent results for cross-project consistency
- Do types match API response shapes?
- Walk through results with user

On a fresh session, this is all you need. Claude reads the phases, creates tasks, dispatches correctly. The problem is sessions aren't always fresh. After two rounds of exploration, a failed test investigation, and a schema refactor, the context window is crowded. That's when Phase 1 gets compressed and Claude reaches for the Edit tool directly.
CLAUDE.md is the cheapest layer to write and handles the common case. The next two layers handle the uncommon one.
Layer 2: Skills structure the workflow
Skills are markdown files in .claude/skills/ that Claude loads when you type a slash command. Each skill is a directory with a SKILL.md entry point. They're more structured than CLAUDE.md because they define specific phases, tool restrictions, and output formats.
Two skills enforce the orchestrator workflow:
/plan creates the task structure. It reads the codebase, identifies affected layers, generates tasks with dependencies, and scaffolds test stubs. The skill restricts its own tools to read-only operations. It can explore and write plan files, but it's scoped to planning.
/dispatch routes tasks to subagents. This is the skill that makes the orchestrator pattern concrete. It lives in .claude/skills/dispatch/SKILL.md and runs in four phases:
Phase 0 — Clarify: Identify which projects are affected. "Add a lastSeen field" might touch three projects: the backend service (schema), the SDK (types), and the frontend app (table column).
Phase 1 — Create tasks: Build a TaskCreate list ordered inside-out. Backend first, SDK second, UI last. Wire dependencies with addBlockedBy so the SDK task can't start until the backend finishes.
Phase 2 — Dispatch: Launch one subagent per task. Each gets a focused prompt:
You are working on: {project name}
Read first: {claude_md path} and .claude/rules/{rules_file}
Path: {project path}
Commands: {build, test commands for this project}
TASK: {task subject}
{task description with acceptance criteria}
RULES:
- Write the test FIRST, then implement to make it pass
- Stay within the project directory
- Return: what you did, test results, and any issues found

Phase 3 — Verify: After all agents complete, the main thread checks cross-layer consistency. Do the types match the API response? Does the UI hook use the right query keys?
The skill includes a routing table so Claude knows which project owns which keyword. Adapt this to your codebase:
| Keywords | Project | Rules file |
|---|---|---|
| user, auth, billing, workspace | api-server | backend-rules.md |
| sdk, hook, useQuery, module | sdk-package | sdk-rules.md |
| dashboard, ui, page, component | frontend-app | frontend-rules.md |
| schema, migration, model | database | backend-rules.md |
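The table can be read as a first-match-wins matcher. Here's a toy sketch in shell — not part of the skill itself, just an illustration of the routing logic, using the project names from the example table (adapt the patterns to your repo):

```shell
#!/bin/bash
# Toy keyword router mirroring the table above: prints the project
# that owns a task description. Naive substring matching, first
# match wins in table order.
route_task() {
  local desc
  desc=$(echo "$1" | tr '[:upper:]' '[:lower:]')
  case "$desc" in
    *user*|*auth*|*billing*|*workspace*)  echo "api-server" ;;
    *sdk*|*hook*|*usequery*|*module*)     echo "sdk-package" ;;
    *dashboard*|*ui*|*page*|*component*)  echo "frontend-app" ;;
    *schema*|*migration*|*model*)         echo "database" ;;
    *)                                    echo "unknown" ;;
  esac
}

route_task "Add lastSeen to user schema"   # api-server
route_task "Display lastSeen in the table component"
```

First-match-wins means a cross-cutting subject like "user schema" routes to the backend, which is consistent with tagging the schema task [api-server] in the dispatch example below.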
Here's what a dispatch looks like for "Add lastSeen field":
Phase 1: Create Tasks
#1 [api-server] Add lastSeen to user schema, DTO, service
Acceptance: API returns lastSeen in user response
Dependencies: none
#2 [sdk-package] Add lastSeen to User type and hook
Acceptance: SDK type includes lastSeen, hook returns it
Dependencies: blocked by #1
#3 [frontend-app] Display lastSeen in user list table
Acceptance: column renders, shows relative time
Dependencies: blocked by #2
Phase 2: Dispatch
→ Launch agent #1 (backend) immediately
→ When #1 completes → launch agent #2 (SDK)
→ When #2 completes → launch agent #3 (UI)
Phase 3: Verify
→ Types match across layers ✓
→ Ready to commit

Each subagent gets a clean context window with only its layer's rules. The backend agent reads backend rules. The SDK agent reads SDK conventions. The frontend agent reads component patterns. No cross-contamination.
This is layer 2 because skills are structured but still advisory. Claude chooses to invoke /plan and /dispatch. Nothing forces it to.
Layer 3: Hooks make it mandatory
Hooks are shell scripts that fire at deterministic lifecycle points. They can approve, deny, or modify any tool call. This is where we actually enforce the pattern.
The key hook: deny file edits on the main thread for source files.
#!/bin/bash
# .claude/hooks/enforce-orchestrator.sh
# PreToolUse hook: only subagents can edit source files.
# Main thread can edit markdown, config, and .claude/ files.
INPUT=$(cat)
FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')
[ -z "$FILE_PATH" ] && exit 0
# Always allow: markdown, JSON config, .claude directory
if [[ "$FILE_PATH" == *.md ]] || \
[[ "$FILE_PATH" == *.mdx ]] || \
[[ "$FILE_PATH" == */.claude/* ]] || \
[[ "$FILE_PATH" == */docs/* ]] || \
[[ "$FILE_PATH" == *.json ]]; then
exit 0
fi
# Always allow: subagents (they have a task from dispatch)
AGENT_TYPE=$(echo "$INPUT" | jq -r '.agent_type // empty')
if [ -n "$AGENT_TYPE" ]; then
exit 0
fi
# Main thread trying to edit source code → deny
cat << 'EOF'
{
"hookSpecificOutput": {
"hookEventName": "PreToolUse",
"permissionDecision": "deny",
"permissionDecisionReason": "The main thread is the orchestrator — it plans and reviews, it does not edit source files. Use /dispatch to route this work to a subagent, or create a task with TaskCreate first."
}
}
EOF
exit 0

Wire it in .claude/settings.json:
{
"hooks": {
"PreToolUse": [
{
"matcher": "Edit|Write",
"hooks": [
{
"type": "command",
"command": ".claude/hooks/enforce-orchestrator.sh"
}
]
}
]
}
}

When Claude on the main thread tries to edit a .ts file, the hook fires, detects no agent_type in the input (meaning it's the main session, not a subagent), and returns a deny with a clear message. Claude sees the denial and knows to dispatch instead.
Subagents pass through because they have agent_type set in the hook input. Markdown, docs, and config files pass through because planning artifacts should be editable from the main thread.
This is the enforcement layer. Claude can't skip it by losing attention or deprioritizing a rule. The hook fires on every Edit and Write call, checks the same conditions, and returns the same decision. Deterministic.
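You can convince yourself of that determinism by replaying the decision against a hand-built payload. This is a sketch: the JSON below is a minimal stand-in for the real hook input, not the full schema, and it exercises the same conditions the hook checks.

```shell
#!/bin/bash
# Minimal stand-in for a main-thread Edit payload: a source file
# path, and no agent_type field.
INPUT='{"tool_name":"Edit","tool_input":{"file_path":"src/UserTable.tsx"}}'

FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')
AGENT_TYPE=$(echo "$INPUT" | jq -r '.agent_type // empty')

# Same logic as the hook: exempt extensions pass, subagents pass,
# everything else on the main thread is denied.
if [[ "$FILE_PATH" == *.md || "$FILE_PATH" == *.json || -n "$AGENT_TYPE" ]]; then
  DECISION="allow"
else
  DECISION="deny"
fi
echo "$DECISION"   # deny
```

Swap in a `.md` path or add an `agent_type` field and the same script prints `allow` — the decision depends only on the input, never on how crowded the context window is.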
The three layers working together
Here's how a session runs with all three layers active: Layer 1 (CLAUDE.md) tells Claude the workflow. Layer 2 (skills) structures it into phases. Layer 3 (hooks) blocks violations. The session ends with cross-layer verification on the main thread: do the SDK types match the API response? Does the UI hook use the right query keys?
Second hook: verify before closing
You can also run a TaskCompleted hook that typechecks before any task can close:
#!/bin/bash
# .claude/hooks/verify-task.sh
# TaskCompleted hook: typecheck the affected project before marking done.
INPUT=$(cat)
TASK_DESC=$(echo "$INPUT" | jq -r '.task_description // empty')
# Extract project from task subject: "[api-server] Add field" → api-server
PROJECT=$(echo "$TASK_DESC" | grep -oP '(?<=\[)[^\]]+' | head -n1 || true)  # first bracketed tag only
if [ -z "$PROJECT" ]; then
exit 0 # No project tag, skip check
fi
# Map project to tsconfig location (adapt to your repo structure)
case "$PROJECT" in
*-service|*-server) TSCONFIG="services/$PROJECT/tsconfig.json" ;;
*-sdk|*-package) TSCONFIG="packages/$PROJECT/tsconfig.json" ;;
*-app|*-admin) TSCONFIG="apps/$PROJECT/tsconfig.json" ;;
*) exit 0 ;;
esac
if [ -f "$TSCONFIG" ]; then
if ! npx tsc --noEmit --project "$TSCONFIG" 2>/tmp/tsc-errors.txt; then
echo "TypeScript errors in $PROJECT — fix before completing:" >&2
head -20 /tmp/tsc-errors.txt >&2
exit 2 # Block task completion
fi
fi
exit 0

Exit code 2 blocks the completion and feeds the error back to Claude. The subagent sees the TypeScript errors and fixes them before the task can close. No false completions.
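The project-tag extraction is worth testing in isolation, since the whole hook short-circuits when it comes back empty. A quick check (assumes GNU grep, which provides the -P flag):

```shell
#!/bin/bash
# Pull the bracketed project tag out of a task subject line,
# the same way verify-task.sh does.
TASK_DESC="[api-server] Add lastSeen to user schema, DTO, service"
PROJECT=$(echo "$TASK_DESC" | grep -oP '(?<=\[)[^\]]+' | head -n1)
echo "$PROJECT"   # api-server
```

If your grep lacks -P (macOS ships BSD grep), `sed -n 's/^\[\([^]]*\)\].*/\1/p'` extracts the same tag portably.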
The starter kit
Here's everything you need. Copy into your repo and adapt the project routing table.
Four files:
| File | Purpose |
|---|---|
| CLAUDE.md (orchestrator section) | Layer 1: Natural language guidance for the plan-dispatch-verify flow |
| .claude/skills/dispatch/SKILL.md | Layer 2: Skill that routes tasks to subagents by project |
| .claude/hooks/enforce-orchestrator.sh | Layer 3: Denies source file edits on the main thread |
| .claude/hooks/verify-task.sh | Layer 3: Typechecks before task completion |
Wire them in .claude/settings.json:
{
"hooks": {
"PreToolUse": [
{
"matcher": "Edit|Write",
"hooks": [
{
"type": "command",
"command": ".claude/hooks/enforce-orchestrator.sh"
}
]
}
],
"TaskCompleted": [
{
"hooks": [
{
"type": "command",
"command": ".claude/hooks/verify-task.sh"
}
]
}
]
}
}

Make the hooks executable: chmod +x .claude/hooks/*.sh. Commit everything to the repo so every team member gets the same enforcement.
Adapt to your project: Change the file path exemptions in enforce-orchestrator.sh (we exempt *.md, *.json, docs/, .claude/). Change the project-to-tsconfig mapping in verify-task.sh. Change the routing table in SKILL.md to match your project structure.
Prerequisites
- Claude Code v2.1+ (hooks API stable since January 2026)
- jq installed (hooks parse JSON input from stdin)
- Familiarity with subagent patterns and the Claude extension stack
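The jq requirement exists because every hook above starts with INPUT=$(cat). A one-line sanity check that your jq handles the pattern (the payload here is illustrative, not the full hook input):

```shell
# Hooks read a JSON payload from stdin and pick fields out with jq;
# the `// empty` fallback turns a missing field into an empty string.
PAYLOAD='{"tool_name":"Edit","tool_input":{"file_path":"src/user.ts"}}'
FILE=$(echo "$PAYLOAD" | jq -r '.tool_input.file_path // empty')
echo "$FILE"   # src/user.ts
```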
When to skip the full loop
You don't need /dispatch to fix a typo. Single-file changes in a single project don't need the orchestrator overhead. That's why the hook exemptions are generous: markdown, config, docs, and .claude/ files are always editable from the main thread.
The enforcement targets the dangerous case: multi-project features where a change in one layer needs to propagate through two others. That's where skipping the plan step causes the cross-layer bugs the orchestrator pattern was designed to prevent.
What changes after adding enforcement
The main thread gets quieter. It reads code, creates plans, dispatches tasks, reviews results. It doesn't produce diffs. That can feel strange at first, like something is missing. But the output quality improves noticeably. Each subagent gets sharp, focused context instead of sharing a bloated window with two other layers.
The deeper lesson applies beyond developer tooling. When you deploy AI agents that talk to customers, you don't rely on prompt instructions alone. You add tool-level guardrails, input validation, quality scoring. The same principle applies to your dev environment. Guidance sets the intent. Enforcement makes it real.
The answer to "how do you enforce this?" is: you don't write better instructions. You move the enforcement out of the prompt.
- Copy CLAUDE.md orchestrator section into your repo
- Create .claude/skills/dispatch/SKILL.md with your project routing table
- Create .claude/hooks/enforce-orchestrator.sh with your file path exemptions
- Create .claude/hooks/verify-task.sh with your project-to-tsconfig mapping
- Wire hooks in .claude/settings.json
- chmod +x .claude/hooks/*.sh
- Test: ask Claude to edit a .ts file — should get denied on main thread
- Test: run /dispatch — subagent should edit freely
- Commit all files to repo