
Approval Gate

Governance × Route

A policy-driven gate that routes high-stakes agent actions to human approval. Every decision is logged, reviewable, and reversible.

Why this pattern exists

An agent that can read is not the same product as an agent that can write. An agent that can write internal memos is not the same product as one that can delete production data, send customer-facing email, or transfer funds. Authority is the dimension along which agent risk compounds fastest, and Approval Gate is the pattern that lets authority scale incrementally without scaling risk linearly.

For leadership: this is the pattern that separates agents you can defend in a board meeting from agents that turn into incidents at 3am. The November 2025 Anthropic disclosure of an agent-orchestrated cyberattack, in which an attacker used Claude with 80–90% autonomy to execute the first publicly acknowledged large-scale AI-driven intrusion campaign, made the case for Approval Gate concrete and urgent for every executive who had previously filed it under “maybe later”. Approval Gate is not optional governance theatre. It is the load-bearing element of any agent that touches the world.

The agent-design problem it solves

Approval Gate sits between an agent’s proposed action and its execution. It does four things:

  1. Route by policy — a deterministic rule (not the model) decides which actions need approval. Reads pass; writes confirm; deletes escalate. The rule is auditable, versioned, owned by engineering plus compliance, never by the model.
  2. Pause cleanly — the agent halts at the gate, persists state, hands off the decision to a human. The decision can arrive milliseconds later (auto-approve in dev mode) or days later (genuine human review) without breaking the agent.
  3. Capture context for the reviewer — the gate produces a decision packet: the proposed action, the reasoning trace, the data the action will touch, the predicted side effects. The reviewer is not asked “trust me?” — they are asked a specific yes/no with evidence.
  4. Record the verdict — approved, denied, modified, escalated — with the reviewer’s identity and the time. This record is the audit trail regulators come for first.
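
The four steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names `route`, `DecisionPacket`, `Verdict`, `ApprovalGate`, and `POLICY_VERSION` are hypothetical, the action kinds are reduced to the read/write/delete example, and the durable state store is stood in for by an in-memory dict.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
import uuid


class Route(Enum):
    PASS = "pass"          # execute without review
    CONFIRM = "confirm"    # pause for a human approval
    ESCALATE = "escalate"  # pause for a more senior reviewer


# Step 1: a deterministic, versioned rule. The model never evaluates it.
POLICY_VERSION = "v1-example"  # hypothetical version label
POLICY = {"read": Route.PASS, "write": Route.CONFIRM, "delete": Route.ESCALATE}
OUTCOMES = {"approved", "denied", "modified", "escalated"}


def route(action_kind: str) -> Route:
    # Assumption: unknown action kinds fail closed and escalate.
    return POLICY.get(action_kind, Route.ESCALATE)


@dataclass
class DecisionPacket:
    # Step 3: the evidence handed to the reviewer.
    action_kind: str
    target: str
    reasoning_trace: str
    predicted_side_effects: list
    packet_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Verdict:
    # Step 4: the audit record.
    packet_id: str
    outcome: str
    reviewer: str
    decided_at: datetime
    policy_version: str


class ApprovalGate:
    def __init__(self):
        self.pending = {}    # stand-in for durable storage (step 2)
        self.audit_log = []

    def submit(self, packet: DecisionPacket) -> Route:
        """Route the action; anything other than PASS is parked until decided."""
        decision = route(packet.action_kind)
        if decision is not Route.PASS:
            self.pending[packet.packet_id] = packet
        return decision

    def decide(self, packet_id: str, outcome: str, reviewer: str) -> Verdict:
        """Record a human verdict; it may arrive milliseconds or days later."""
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome}")
        self.pending.pop(packet_id)  # raises KeyError if nothing is pending
        verdict = Verdict(packet_id, outcome, reviewer,
                          datetime.now(timezone.utc), POLICY_VERSION)
        self.audit_log.append(verdict)
        return verdict
```

In this sketch the agent calls `submit` and proceeds only on `Route.PASS`; otherwise it stops and resumes when `decide` has recorded a verdict for its `packet_id`.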

The pattern is fundamentally a routing decision — hence Governance × Route — not a workflow extension. The route rule is the design surface. Get the routing right and the rest follows; get the routing wrong and the gate becomes either a useless rubber-stamp or a productivity-killing bottleneck.

Deep thinking direction

The hardest design question in Approval Gate is where to draw the threshold. Set it too low and every action triggers approval, latency explodes, reviewers stop reading carefully, and rubber-stamp culture takes over. Set it too high and high-stakes actions slip through unreviewed until the first incident. The discipline is to set the threshold by category, not by confidence. Categories are stable: “modifies billing” is always a category-3 action regardless of how confident the agent is. Confidence is the model’s self-report and should never be the gating signal: a confidently wrong action is exactly the case the gate exists to catch.
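
As a rough illustration of gating by category rather than confidence, the sketch below maps action categories to tiers and compares the tier to a fixed threshold. The tier names, the category keys other than billing, and the threshold value are all assumptions made for the example; only "modifies billing is category 3" comes from the text above.

```python
from enum import IntEnum


class Tier(IntEnum):
    # Hypothetical tier scheme: higher number means higher stakes.
    READ_ONLY = 1
    INTERNAL_WRITE = 2
    HIGH_STAKES = 3


# Hypothetical category table; in practice this would live in version control.
CATEGORY_TIER = {
    "read_record": Tier.READ_ONLY,
    "write_internal_memo": Tier.INTERNAL_WRITE,
    "modify_billing": Tier.HIGH_STAKES,
    "send_customer_email": Tier.HIGH_STAKES,
}

# Assumed threshold: every tier at or above this one needs human approval.
APPROVAL_THRESHOLD = Tier.INTERNAL_WRITE


def needs_approval(category: str, model_confidence: float) -> bool:
    # model_confidence is accepted so callers can log it alongside the
    # decision, but it is deliberately never consulted here.
    tier = CATEGORY_TIER.get(category, max(Tier))  # unknown categories fail closed
    return tier >= APPROVAL_THRESHOLD
```

Under these assumptions, `needs_approval("modify_billing", 0.99)` and `needs_approval("modify_billing", 0.10)` give the same answer, which is the point: the category decides, not the self-report.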

Three failure modes recur:

  • Rubber-stamping: reviewers approve everything within seconds because volume is too high. The discipline is rate-limiting at the source: if the agent is generating more decisions per hour than a human can review meaningfully, the gate has been miscalibrated upstream.
  • Approval drift: the rule expands quietly to “everything important gets approved” without anyone noticing. The discipline is explicit version control on the route rule, treated like production code.
  • Silent bypass: an emergency override path becomes the normal path. The discipline is explicit logging of every override with mandatory post-hoc review.
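
Each of these disciplines can be backed by a small automated check. The sketch below is illustrative only: the `Review` record, the function names, and the 30-second median cutoff are assumptions chosen for the example, not values taken from the pattern.

```python
from dataclasses import dataclass
from statistics import median
import hashlib
import json


@dataclass
class Review:
    # Hypothetical shape of one entry in the verdict log.
    seconds_spent: float
    approved: bool
    via_override: bool = False
    post_hoc_reviewed: bool = False


def rubber_stamp_signals(reviews, decisions_per_hour, reviewer_capacity_per_hour,
                         min_median_seconds=30.0):
    """Return warnings that suggest the gate is miscalibrated upstream."""
    signals = []
    if decisions_per_hour > reviewer_capacity_per_hour:
        signals.append("volume exceeds reviewer capacity")
    if reviews and median([r.seconds_spent for r in reviews]) < min_median_seconds:
        signals.append("median review time below minimum")
    return signals


def policy_fingerprint(policy: dict) -> str:
    """Stable hash of the route rule; a change that was not reviewed is drift."""
    canonical = json.dumps(policy, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def overrides_awaiting_review(reviews):
    """Every override must later receive a post-hoc review."""
    return [r for r in reviews if r.via_override and not r.post_hoc_reviewed]
```

One way to use `policy_fingerprint` is to pin the expected hash next to the rule in version control and fail deployment when the live rule no longer matches it.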

The architectural insight is that Approval Gate is the RBAC + change-control pattern from enterprise security reborn for the agent world. The route rule is the role definition; the verdict log is the access-review report; the threshold is the role’s permission scope. Engineers who have built SOX-compliant change-control systems recognize this pattern in minutes. The medium changed; the controls stayed.

Where this pattern is developed

  • Manning book: Designing AI Agents, Chapter 9 §9.2 (Governance / Approval Gate).
  • Paper: Huang & Zhou (2026), §4.7 Pattern 7.