Deployed on Cloudflare · serval-orchestrator.burademirung.workers.dev

A multi-agent control plane for IT service management.

A supervisor agent decomposes each request and dispatches scoped specialist agents — Triage, Access-Review, Onboarding — that operate Serval's system of record over the Model Context Protocol. Every agent is a Cloudflare Durable Object. Watch it orchestrate, live.

4 coordinated agents · 5 Durable Objects · 12 Serval MCP tools · 30 best practices applied · tiered Claude models (opus · sonnet · haiku)

The orchestration, streaming live

Pick a scenario. The supervisor Durable Object plans, delegates over RPC, and the chosen specialists run their own Claude tool-loops against the Serval MCP backend — every decision and tool call streamed to your browser over Server-Sent Events.
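As a rough illustration of the Server-Sent Events stream described above, here is a minimal sketch of how one trace event could be framed. The `TraceEvent` shape and both function names are assumptions made for this example; the actual schema is not published on this page.

```typescript
// Hypothetical trace event shape; the real schema is not shown in the source.
interface TraceEvent {
  kind: "agent" | "tool" | "result";
  agent: string;
  detail: string;
}

// Encode one event as an SSE frame: an "event:" line, a "data:" line
// carrying single-line JSON, and a blank line that terminates the frame.
function toSseFrame(event: TraceEvent): string {
  return `event: ${event.kind}\ndata: ${JSON.stringify(event)}\n\n`;
}

// Concatenate frames into the text a streaming response would emit over time.
function encodeRun(events: TraceEvent[]): string {
  return events.map(toSseFrame).join("");
}

const sample: TraceEvent[] = [
  { kind: "agent", agent: "supervisor", detail: "plan created" },
  { kind: "tool", agent: "triage", detail: "list_tickets" },
];
const body = encodeRun(sample);
```

In a Worker, each frame would typically be enqueued into a ReadableStream and returned with a `text/event-stream` content type rather than joined into one string.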

Agents

Supervisor · opus-4-8
Triage · haiku-4-5
Access-Review · sonnet-4-6
Onboarding · sonnet-4-6

Live runs call Claude through Cloudflare AI Gateway. The backend is a faithful in-Worker mock of Serval (12 tools); flip one env var to point at the real Serval MCP server.
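One way the mock-to-live switch could look is sketched below. Only `SERVAL_MODE` is named on this page; the `"live"` value, the `SERVAL_MCP_URL` variable, and the function name are assumptions for illustration.

```typescript
// Assumed environment shape. Only SERVAL_MODE appears in the source;
// SERVAL_MCP_URL is a hypothetical name for the real server's address.
interface BackendEnv {
  SERVAL_MODE?: string;
  SERVAL_MCP_URL?: string;
}

// Pick the MCP endpoint: the in-Worker mock at /mcp by default,
// or the configured real server when the mode is set to "live".
function resolveMcpEndpoint(env: BackendEnv, workerOrigin: string): string {
  if (env.SERVAL_MODE === "live") {
    if (!env.SERVAL_MCP_URL) {
      throw new Error("SERVAL_MODE=live requires SERVAL_MCP_URL");
    }
    return env.SERVAL_MCP_URL;
  }
  return `${workerOrigin}/mcp`;
}
```

Failing loudly when the live URL is missing avoids silently falling back to mock data in a deployment that was meant to be live.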

One supervisor, three specialists, a shared system of record

The hard part of a multi-agent system is coordination and context. Here, each agent is an independently-addressable Durable Object, so context isolation is enforced by the runtime — not by prompt discipline. The supervisor holds only references and distilled findings; specialists never leak their transcripts upward.

Supervisor Agent · Durable Object

plans · simplicity-gates · cheap routing via haiku · delegates via getAgentByName() RPC · synthesizes with opus-4-8

resilient parallel fan-out · Promise.allSettled · per-specialist 45 s timeout

Triage

haiku-4-5 · classify, prioritize, reply

Access-Review

sonnet-4-6 · evaluate JIT access

Onboarding

sonnet-4-6 · tickets, access, workflow

each runs its own Claude tool-loop · scoped to a least-privilege slice of tools · via this.mcp RPC transport

ServalMCP · McpAgent

12 tools over Streamable HTTP /mcp · stateful seeds · mock now ⇄ live Serval by env

Deployed on Cloudflare

One Worker · five Durable Objects · Claude via AI Gateway — no servers to manage, scales at the edge.

solid = request path · dashed = async / optional · backend: mock ⇄ live via SERVAL_MODE
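The resilient fan-out in the diagram above can be sketched roughly as follows. The `Specialist` and `Finding` types and the helper names are assumptions for this example; in the deployed system each specialist is reached over Durable Object RPC rather than a local function call.

```typescript
// Hypothetical shapes standing in for the RPC stubs and their results.
type Finding = { agent: string; summary: string };
type Specialist = { name: string; run: (task: string) => Promise<Finding> };

// Reject if the wrapped promise has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Run every specialist in parallel. allSettled never rejects, so one
// failure or timeout is recorded and the remaining findings still return.
async function fanOut(specialists: Specialist[], task: string, timeoutMs = 45_000) {
  const settled = await Promise.allSettled(
    specialists.map((s) => withTimeout(s.run(task), timeoutMs, s.name)),
  );
  const findings: Finding[] = [];
  const failures: { agent: string; reason: string }[] = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      findings.push(outcome.value);
    } else {
      const reason = outcome.reason instanceof Error
        ? outcome.reason.message
        : String(outcome.reason);
      failures.push({ agent: specialists[i].name, reason });
    }
  });
  return { findings, failures };
}
```

Note that the timeout only stops waiting; it does not cancel the specialist's work, which would need a separate abort signal.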

Engagement

System of engagement

A Worker routes /api/run to a Supervisor Durable Object that streams the run back as SSE. Static assets serve this page from the edge.

Reasoning

Tiered models

Opus 4.8 supervises and synthesizes; Sonnet 4.6 and Haiku 4.5 do the specialist work. Strong where it matters, cheap on the leaves.

Record

System of record

The mock Serval backend is a real remote MCP server — point MCP Inspector or Claude Desktop at /mcp and call its 12 tools directly.

The agent registry

Least privilege is real, not cosmetic: each specialist is handed only its slice of the Serval toolset before any tool reaches Claude. A specialist literally cannot see — and therefore cannot call — a tool outside its scope.

Agent | Model | Role | Scoped tools
Supervisor | claude-opus-4-8 | Plan · delegate (RPC) · synthesize the final answer | Agent / delegate only
Triage | claude-haiku-4-5 | Classify & prioritize tickets, draft replies | list_tickets, get_ticket, update_ticket, post_message
Access-Review | claude-sonnet-4-6 | Evaluate JIT access against deterministic policy | list_access_requests, get_access_request, get_user, review_access_request
Onboarding | claude-sonnet-4-6 | New-hire flow: tickets, baseline access, workflow | get_user, create_ticket, create_access_request, list_workflows, run_workflow

Tools that mutate state are flagged destructiveHint in MCP and traced on every call.
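The scoping in the registry above could be implemented along these lines. The tool names come from the registry; the `ToolDef` type, the `ALLOWLIST` constant, and `scopeTools` are hypothetical names for this sketch.

```typescript
type SpecialistName = "triage" | "access-review" | "onboarding";

// Minimal stand-in for a tool definition; real MCP tools also carry schemas.
interface ToolDef {
  name: string;
}

// Per-specialist allowlists, copied from the registry.
const ALLOWLIST: Record<SpecialistName, readonly string[]> = {
  triage: ["list_tickets", "get_ticket", "update_ticket", "post_message"],
  "access-review": [
    "list_access_requests", "get_access_request", "get_user", "review_access_request",
  ],
  onboarding: [
    "get_user", "create_ticket", "create_access_request", "list_workflows", "run_workflow",
  ],
};

// Drop every tool outside the specialist's allowlist before the list is
// handed to the model, so out-of-scope tools are never visible to it.
function scopeTools(all: ToolDef[], who: SpecialistName): ToolDef[] {
  const allowed = new Set(ALLOWLIST[who]);
  return all.filter((tool) => allowed.has(tool.name));
}

// The union of the three allowlists gives the full catalog of distinct tools.
const catalog: ToolDef[] = [...new Set(Object.values(ALLOWLIST).flat())]
  .map((name) => ({ name }));
```

Filtering before the model call means scope is a property of the request itself, not something the prompt has to ask the model to respect.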

Everything used to build it

A deliberately current stack — the June-2026 frontier of agents, protocol, and edge platform, with every choice justified.

runtime

Cloudflare

Agents SDK on Durable Objects

Supervisor + specialists + the MCP backend are all Durable Objects. RPC, SQLite state, hibernation, and edge streaming come for free.

protocol

MCP · 2025-11-25

McpAgent backend

The Serval mock speaks the latest MCP — structuredContent results, readOnlyHint/destructiveHint annotations, and isError results the model can recover from.

model

Anthropic

Claude via AI Gateway

@anthropic-ai/sdk routed through Cloudflare AI Gateway: caching, retries, cost tracking, and reconnect buffering — with a hand-rolled, fully-traced tool loop.

orchestration

Pattern

Supervisor / workers

Dynamic decomposition, a 4-field delegation contract, a simplicity gate, cheap routing via Haiku for single-domain requests, and resilient fan-out via getAgentByName() + Promise.allSettled — one specialist failing never kills the run.

interface

Edge SSE

Streaming over the edge

The supervisor returns a ReadableStream of trace events, rendered live in this console. Workers impose no wall-clock limit on a streaming response while the client stays connected, so long runs are not cut off.

quality

TDD

Typed & tested

Zod contracts at every boundary, pure tool operations unit-tested, scenario routing guarded, and a deterministic access policy with its own suite.

The best practices, and how each is implemented

Thirty researched practices from Anthropic's agent & context-engineering guidance, the MCP spec, OpenAI/LangChain operations, and the Cloudflare platform — every one mapped to a concrete mechanism in the code, not just an aspiration.

A Orchestration

✓ Orchestrator–workers pattern: A supervisor dynamically decomposes each request and delegates to worker agents over RPC.

✓ 4-field delegation contract: Every task spec carries objective, output format, tool guidance, and explicit boundaries, the #1 fix for agents duplicating or missing work.

✓ Simplicity gate + cheap routing: A Haiku model (MODEL_ROUTER) classifies intent cheaply; single-domain requests skip fan-out entirely. Full fan-out only when work genuinely spans domains.

✓ Effort scaled to complexity: Per-agent step caps; a strong supervisor with cheaper specialist leaves.

✓ Verbatim forwarding: Specialist findings pass through with fidelity where it matters, with no telephone-game distortion.
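To make the 4-field delegation contract concrete, here is a minimal sketch. The page says boundaries are validated with Zod; this example swaps in a hand-written check so it stays dependency-free, and the field and function names are assumptions.

```typescript
// The four fields named in the contract; exact property names are assumed.
interface DelegationTask {
  objective: string;
  outputFormat: string;
  toolGuidance: string;
  boundaries: string;
}

const REQUIRED_FIELDS = ["objective", "outputFormat", "toolGuidance", "boundaries"] as const;

// Validate an untrusted value at the supervisor/specialist boundary.
// Throws when any of the four fields is missing, non-string, or blank.
function parseDelegationTask(input: unknown): DelegationTask {
  if (typeof input !== "object" || input === null) {
    throw new Error("task must be an object");
  }
  const record = input as Record<string, unknown>;
  const task: Partial<DelegationTask> = {};
  for (const field of REQUIRED_FIELDS) {
    const value = record[field];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`missing or empty field: ${field}`);
    }
    task[field] = value;
  }
  return task as DelegationTask;
}
```

A task such as `{ objective: "Triage open tickets", outputFormat: "one paragraph plus ticket ids", toolGuidance: "read before update", boundaries: "do not touch access requests" }` passes; dropping any field is rejected before a specialist is invoked.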

B Context engineering

✓ Runtime context isolation: Each specialist is a separate Durable Object; isolation is enforced by the platform, not by prompting.

✓ Lean supervisor: It holds only the plan and distilled findings, never a specialist's raw transcript or tool output.

✓ Distilled returns: Specialists return a one-paragraph finding plus references, not payloads.

✓ Context-rot defenses: Critical instructions at prompt edges; least-privilege tools keep each window small and high-signal.

✓ Just-in-time data: Agents load Serval data through tools at runtime rather than pre-stuffing context.
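The lean-supervisor and distilled-returns practices above amount to a narrow return type. A minimal sketch, with all type and function names assumed for illustration:

```typescript
// Everything a specialist accumulates while working (hypothetical shape).
interface SpecialistRun {
  transcript: string[];     // raw model turns, kept inside the specialist
  toolOutputs: unknown[];   // raw tool payloads, also kept local
  summary: string;          // the one-paragraph finding
  references: string[];     // ids of the Serval records the finding relies on
}

// The only thing that crosses back to the supervisor.
interface DistilledFinding {
  agent: string;
  summary: string;
  references: string[];
}

// Copy the summary and de-duplicated references; the transcript and
// tool outputs are deliberately never read, so they cannot leak upward.
function distill(agent: string, run: SpecialistRun): DistilledFinding {
  return {
    agent,
    summary: run.summary.trim(),
    references: [...new Set(run.references)],
  };
}
```

Because the return type has no field for transcripts or payloads, leaking them would require a type change rather than a prompt slip.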

C Tools & protocol

✓ Faithful mock: Identical tool names, shapes, and error modes to real Serval; a true stand-in, swappable by env.

✓ Structured tool output: structuredContent results per MCP 2025-11-25, validated client-side with Zod.

✓ Tool annotations: readOnlyHint / destructiveHint on every tool so consumers show accurate intent.

✓ Errors as results, not throws: isError results feed back so the agent self-corrects instead of crashing.

✓ Curated, non-overlapping tools: Minimal per-specialist surface, with no ambiguity for the model to trip on.
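The errors-as-results practice above can be sketched as a small wrapper. The `ToolResult` type mirrors the MCP tool-result fields named on this page (`content`, `structuredContent`, `isError`); the wrapper name and the in-memory ticket data are hypothetical.

```typescript
// Subset of an MCP tool result, limited to the fields discussed above.
interface ToolResult {
  content: { type: "text"; text: string }[];
  structuredContent?: Record<string, unknown>;
  isError?: boolean;
}

// Wrap a pure operation so failures come back as isError results
// the model can read and react to, instead of escaping as exceptions.
function asToolResult<A>(
  operation: (args: A) => Record<string, unknown>,
): (args: A) => ToolResult {
  return (args) => {
    try {
      const data = operation(args);
      return {
        content: [{ type: "text", text: JSON.stringify(data) }],
        structuredContent: data,
      };
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return { content: [{ type: "text", text: message }], isError: true };
    }
  };
}

// Hypothetical seed data for the example.
const tickets = new Map<string, { id: string; status: string }>([
  ["T-1", { id: "T-1", status: "open" }],
]);

const getTicket = asToolResult((args: { id: string }) => {
  const ticket = tickets.get(args.id);
  if (!ticket) throw new Error(`ticket ${args.id} not found`);
  return { ...ticket };
});
```

Calling `getTicket({ id: "T-404" })` returns a result with `isError: true` and the message as text, which a tool loop can feed back to the model for a retry with a corrected id.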

D Safety & security

✓ Deterministic policy boundary: decideAccess() is enforced server-side in the tool; an agent can never be more permissive than policy.

✓ Least-privilege scoping: Tools filtered to each specialist's allowlist before reaching the model.

✓ Idempotent writes: Idempotency keys make every create safe under retry.

✓ Secrets handling: Keys live in Wrangler secrets / .dev.vars, never in code, logs, or the bundle.

✓ Annotations treated as untrusted: Tool metadata never drives privileged behavior.
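As an illustration of the idempotent-writes practice above, here is a minimal in-memory sketch. The class, the id format, and the key naming are assumptions; the deployed backend keeps its state in a Durable Object rather than a plain Map.

```typescript
interface TicketRecord {
  id: string;
  title: string;
}

// In-memory stand-in for the ticket store.
class TicketStore {
  private byKey = new Map<string, TicketRecord>();
  private all: TicketRecord[] = [];

  // The first call with a given key creates the ticket. Any retry with the
  // same key returns that same ticket instead of creating a duplicate.
  create(idempotencyKey: string, title: string): TicketRecord {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing;
    const ticket: TicketRecord = { id: `TCK-${this.all.length + 1}`, title };
    this.all.push(ticket);
    this.byKey.set(idempotencyKey, ticket);
    return ticket;
  }

  count(): number {
    return this.all.length;
  }
}
```

This matters in an agent loop because a timed-out tool call may be retried by the model, the gateway, or both.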

E Observability & eval

✓ OTel-shaped tracing: One trace per run (agent, tool, and result events), the very stream this console renders.

✓ Structured output validation: Zod-validated findings and results at every boundary, with graceful fallback.

✓ End-state evaluation: Scenarios graded on the final Serval state, not on a brittle exact trajectory.

✓ Routing contract tests: Cheap structural evals guard which specialists each scenario engages.

✓ AI Gateway analytics: Every model call is observable, cached, and retried at the gateway.
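End-state evaluation, as listed above, compares what the run left behind rather than the steps it took. A minimal sketch, assuming the state can be flattened to a map of record id to status; that representation and the function name are hypothetical.

```typescript
// Hypothetical flattened view of backend state: record id -> status.
type ServalState = Record<string, string>;

// Check only the records the scenario cares about. Extra records and the
// order of tool calls are ignored, so different valid trajectories pass.
function gradeEndState(actual: ServalState, expected: ServalState) {
  const mismatches: string[] = [];
  for (const [id, want] of Object.entries(expected)) {
    const got = actual[id];
    if (got !== want) {
      mismatches.push(`${id}: expected ${want}, got ${got ?? "missing"}`);
    }
  }
  return { pass: mismatches.length === 0, mismatches };
}
```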

F Model & platform controls

✓ Tiered model selection: Opus supervisor, Sonnet/Haiku specialists; model IDs in env, bumpable without code.

✓ Graceful degradation: Fan-out uses Promise.allSettled with per-specialist 45 s timeouts; a failing specialist's error is surfaced and skipped, and the run continues. Synthesis falls back to a deterministic merge if the model call fails.

✓ Mock-now / real-ready: A single SERVAL_MODE flip points the MCP client at real Serval.

✓ Per-run isolation: Fresh specialist instances per run keep each demo clean and reproducible.

✓ Zero-build edge deploy: Wrangler bundles and ships the whole stack: Worker, Durable Objects, and this page.
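Tiered model selection with IDs in env might look like the sketch below. Only `MODEL_ROUTER` is named on this page; the other variable names are hypothetical, the defaults are the model IDs from the agent registry, and the router default is assumed to be the Haiku tier.

```typescript
type Role = "supervisor" | "router" | "triage" | "access-review" | "onboarding";

// Only MODEL_ROUTER appears in the source; the other names are assumed.
interface ModelEnv {
  MODEL_SUPERVISOR?: string;
  MODEL_ROUTER?: string;
  MODEL_TRIAGE?: string;
  MODEL_ACCESS_REVIEW?: string;
  MODEL_ONBOARDING?: string;
}

// Which env variable overrides which role.
const ENV_KEY: Record<Role, keyof ModelEnv> = {
  supervisor: "MODEL_SUPERVISOR",
  router: "MODEL_ROUTER",
  triage: "MODEL_TRIAGE",
  "access-review": "MODEL_ACCESS_REVIEW",
  onboarding: "MODEL_ONBOARDING",
};

// Defaults taken from the registry table on this page.
const DEFAULT_MODEL: Record<Role, string> = {
  supervisor: "claude-opus-4-8",
  router: "claude-haiku-4-5",
  triage: "claude-haiku-4-5",
  "access-review": "claude-sonnet-4-6",
  onboarding: "claude-sonnet-4-6",
};

// An env value, when set, wins over the default, so a model bump is a
// configuration change rather than a code change.
function modelFor(role: Role, env: ModelEnv): string {
  return env[ENV_KEY[role]] ?? DEFAULT_MODEL[role];
}
```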

The safety boundary is code, not a prompt

The headline claim of an agentic ITSM tool is that it grants access safely. We don't trust the model to honor that. The Access-Review agent's decisions pass through a deterministic policy inside the tool itself — so even a jailbroken model cannot approve production or admin access.

decideAccess() · enforced in review_access_request

● approve: low-risk read, active user
● escalate: admin or production grant
● deny: inactive requester

// An agent decision can never be more permissive than policy allows.
// RANK is read here as ordering decisions from least to most permissive.
const verdict = decideAccess({ resource, scope, requesterActive, isProduction, isAdmin });
const enforced = RANK[agentDecision] > RANK[verdict.decision];  // true when the agent exceeds policy
record(enforced ? verdict.decision : agentDecision);  // policy caps the agent; stricter agent decisions stand
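The lines above leave `decideAccess()` and `RANK` undefined. A self-contained sketch of how they might fit together follows, based only on the three outcomes listed in the legend. The rank values, the `"read"` scope check, and the rule that any other scope escalates are assumptions, not the project's actual policy.

```typescript
type Decision = "deny" | "escalate" | "approve";

// Assumed ordering: a higher rank means a more permissive decision.
const RANK: Record<Decision, number> = { deny: 0, escalate: 1, approve: 2 };

interface AccessInput {
  resource: string;
  scope: string;
  requesterActive: boolean;
  isProduction: boolean;
  isAdmin: boolean;
}

// Deterministic policy following the legend: inactive requesters are denied,
// admin or production grants are escalated, low-risk reads are approved.
function decideAccess(input: AccessInput): { decision: Decision; reason: string } {
  if (!input.requesterActive) {
    return { decision: "deny", reason: "inactive requester" };
  }
  if (input.isAdmin || input.isProduction) {
    return { decision: "escalate", reason: "admin or production grant" };
  }
  if (input.scope === "read") {
    return { decision: "approve", reason: "low-risk read, active user" };
  }
  // Assumption: anything that is not a plain read goes to a human.
  return { decision: "escalate", reason: "not a low-risk read" };
}

// Cap the agent's decision at the policy verdict. A stricter agent
// decision is kept; a more permissive one is replaced by the verdict.
function enforce(agentDecision: Decision, input: AccessInput): Decision {
  const verdict = decideAccess(input);
  return RANK[agentDecision] > RANK[verdict.decision] ? verdict.decision : agentDecision;
}
```

Under these assumptions, an agent that answers "approve" for a production grant is recorded as "escalate", while an agent that answers "deny" for a low-risk read is still recorded as "deny".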

How it was built

From a single URL to a deployed multi-agent system — research-first, spec-driven, and verified in the real app.

Research

Deep, fact-checked research

The product, the MCP API, the June-2026 agent & context-engineering frontier, and the Cloudflare platform — adversarially verified, then distilled into a design.

Design

Spec, then re-spec for Cloudflare

A full design spec — re-platformed when the Claude Agent SDK proved incompatible with Workers, onto the Cloudflare Agents SDK, McpAgent, and AI Gateway.

Build

Subagent-driven TDD

A phased plan executed task-by-task with two-stage review — every SDK shape verified against the installed packages, every boundary typed and tested.

Verify

Caught in the real app

A live run revealed the MCP client tags tools with a generated connection id, not the server name — silently hiding all 12 tools. Only running it for real surfaced it. Fixed, redeployed, verified.
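The bug described above can be reproduced with plain data. The `ListedTool` and `Connection` shapes below are hypothetical simplifications of what an MCP client reports, and the connection id is made up for the example.

```typescript
// Hypothetical simplified shapes for this illustration.
interface ListedTool {
  name: string;
  serverId: string; // a generated connection id, not the server's name
}
interface Connection {
  id: string;
  serverName: string;
}

// Buggy version: compares the generated id with the server name,
// so nothing ever matches and every tool is silently filtered out.
function toolsForServerBuggy(tools: ListedTool[], serverName: string): ListedTool[] {
  return tools.filter((tool) => tool.serverId === serverName);
}

// Fixed version: resolve the server name to its connection id(s) first.
function toolsForServer(
  tools: ListedTool[],
  connections: Connection[],
  serverName: string,
): ListedTool[] {
  const ids = new Set(
    connections.filter((c) => c.serverName === serverName).map((c) => c.id),
  );
  return tools.filter((tool) => ids.has(tool.serverId));
}
```

The buggy filter returns an empty list without raising anything, which is why this kind of mistake tends to surface only in a live run.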