Six stages. One sequence. Nothing ships unproven.

The same closed loop, at mechanism resolution. Strict invocation order: the ordered PreToolUse chain fires before each write; the Stop hook evaluates HANDOFF before the unproven-items gate.

Intent: Human brief → EARS-format requirement, validated by the SMT spec compiler + spec_validator. → validated requirement
Decompose: The initializer subagent compiles the EARS spec into a default-unproven feature_list.json. → coverage model
Implement: The implementer subagent builds one slice in a git worktree, under the ordered PreToolUse plan/scope gates. → one atomic commit
Verify: The verifier subagent runs five layers, accepted only past the SubagentStop evidence gate. → evidence record
Prove: A four-field Evidence_Record + a hash-chained audit log + the Z3 logic model. → tamper-evident proof
Gate: The Stop hook holds the line locally; OPA/Conftest runs a zero-evidence policy at merge; a GitHub ruleset makes both required. → fail-closed verdict

Claude Code hooks are command-type and fail closed; PostToolUse cannot undo an executed action; the hook roster is emerging/version-gated.
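
The invocation order stated above (HANDOFF first, then the unproven-items gate) can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not the engine's implementation: the feature_list.json schema (a list of objects with `id` and `status`), the `handoff_requested` flag, the rule that HANDOFF lets the run stop, and the `Verdict` type are all hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    allow_stop: bool
    reason: str


def stop_gate(feature_list_json: str, handoff_requested: bool) -> Verdict:
    """Decide whether a run may stop. HANDOFF is evaluated before the gate."""
    # 1. HANDOFF: the run ends by routing to a human, not by claiming "done".
    if handoff_requested:
        return Verdict(True, "handoff")
    # 2. Unproven-items gate. Items are unproven by default, and anything
    #    unreadable holds the run (fail closed).
    try:
        features = json.loads(feature_list_json)
        unproven = [
            f["id"] for f in features
            if f.get("status", "unproven") != "proven"
        ]
    except (ValueError, KeyError, TypeError, AttributeError):
        return Verdict(False, "unreadable feature list")
    if not features:
        return Verdict(False, "empty feature list")
    if unproven:
        return Verdict(False, "unproven: " + ", ".join(unproven))
    return Verdict(True, "all items proven")
```

An item with no `status` field counts as unproven, which mirrors the default-unproven manifest the Decompose stage describes.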

The same loop, at higher resolution

Open up the six stages and the engine’s capabilities come into view — each one belongs to a stage you already watched run.

Every capability block maps onto exactly one loop stage. Expanded, the parts are visible; read top to bottom, they re-converge into the same loop — specify, cover, execute, verify, prove, then observe and gate.

Specify
Turn intent into checkable requirements before any code is written.
- Requirement capture
  B01
  Intent compiled into EARS-notation requirements, each one addressable.
- Acceptance criteria
  B02
  Every requirement carries its own pass condition, written before the work.
- Scope contract
  B03
  The agreed boundary of the change, so drift is detectable rather than silent.
Cover
Map each requirement to the tests and paths that would prove it.
- Coverage mapping
  B04
  Each requirement linked to the tests that exercise it — gaps stay visible.
- Test synthesis
  B05
  Tests generated against acceptance criteria, not against the implementation.
- Path tracing
  B06
  Checks whether code is wired into a live execution path, not merely compiling.
Execute
Do the work under constant, in-flight control — not a single end check.
- Tool selection
  B07
  The right tool chosen per step, so the model is not its own dispatcher.
- Context management
  B08
  Requirements re-introduced after a context cut-off, so nothing silently drops.
- Mid-flight steering
  B12
  Bad edits are prevented before they land, and corrected with feedback after.
  PreToolUse prevention + PostToolUse feedback
- Anti-loopmaxxing
  B13
  Caps detect spinning and route the work to HANDOFF instead of churning.
  caps → routing to HANDOFF
Verify
Establish from facts — not self-report — whether the work actually holds.
- Independent verification
  B10
  A separate pass checks the output; the author does not grade its own work.
  System-correct stage
- Regression guard
  B11
  Prior guarantees re-checked, so a new change cannot quietly undo an old one.
  System-correct stage
- Evidence capture
  B14
  Each check records the verifiable fact it produced, not a claim that it ran.
  System-correct stage
Prove
Bind every result to a durable, replayable record an auditor could read.
- Evidence record
  B15
  Results written to a structured record that ties code back to its requirement.
  System-correct stage
- Trace assembly
  B17
  Steps stitched into a replayable trace, so the path to done is reconstructable.
  System-correct stage
- Portable proof
  B18
  Evidence travels with the work via open carriers — no vendor lock to read it.
  System-correct stage
Observe / Gate
Decide completeness deterministically — and watch the system that decides.
- Fail-closed completion gate
  B09
  Completeness decided from verifiable facts only; an unproven item is held.
  Stop hook + OPA/Conftest + GitHub ruleset
  Verification mechanism
- Security & control
  B16
  Layered controls surface classes of issue and keep a human stop within reach.
  SAST + secrets + SLSA + DAST + kill-switch
  Verification mechanism
- Observability
  B19
  The gate and its decisions are themselves monitored, so the watcher is watched.
  System-correct stage
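
The anti-loopmaxxing block (B13) in the list above describes caps that detect spinning and route the work to HANDOFF. One way to picture that, as a sketch only: the `LoopGuard` class, both cap values, and the idea of a failure "signature" are hypothetical and not taken from the engine.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LoopGuard:
    max_attempts: int = 5   # hypothetical cap on total failed attempts per slice
    max_repeats: int = 2    # hypothetical cap on seeing the same failure again
    attempts: int = 0
    failures: Counter = field(default_factory=Counter)

    def record_failure(self, signature: str) -> str:
        """Register one failed attempt and return 'RETRY' or 'HANDOFF'."""
        self.attempts += 1
        self.failures[signature] += 1
        if self.attempts >= self.max_attempts:
            return "HANDOFF"    # overall budget spent
        if self.failures[signature] > self.max_repeats:
            return "HANDOFF"    # same failure keeps recurring: spinning
        return "RETRY"
```

The two caps catch different patterns: the repeat cap trips early on an identical recurring failure, while the attempt cap bounds a run that fails differently each time.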

What is in the box

Powered by proven, in-production solutions

The collapse you just watched is not one new invention. It is the assembly of tools your own teams already trust — each one taking over a proving role you have been performing by hand. Members carrying the verifies tag produce or enforce verifiable facts; the rest are the infrastructure that carries those facts. B-block IDs name where each sits in the loop.

The enforcement layer

Claude Code official hooks — deterministic, per-edit, command-type

PreToolUse · verifies
B07 · gate the edit before it lands
Replaces the human standing watch over every change: the scope-enforcer who can never blink.
PostToolUse · verifies
B07 · record what the edit actually did
Captures the after-state so 'looks done' has to face the record, but cannot reach back and undo.
Stop · verifies
B07 · decide completeness at the close
Refuses to let the run call itself finished on its own say-so.
SubagentStop · verifies
B07 · gate a subagent's handoff
Holds an unproven subagent result at the seam instead of waving it through.
PreCompact · verifies
B07 · preserve facts across a context cut-off
Removes the role you played as the agent's memory by re-introducing the requirement it forgot.
SessionStart · verifies
B07 · rehydrate the binding context
Re-anchors the run to its requirements so nothing starts unmoored.

The proving roles

Subagents — one bounded responsibility each

initializer
B01 · bind the work to its requirements
Pins each task to a stated requirement so later you can prove which one it served.
implementer
B05 · produce the change
Does the build inside the gate; its output is held to the same proving as everything else.
verifier · verifies
B11 · check it was wired into a live path
Takes over hand-checking whether code truly runs vs. merely compiles green.
research
B03 · select tools and gather context
Absorbs the tool-selector role so the choice is recorded, not improvised in your head.

Evidence and observation

State, telemetry, and the evidence record

Neon
B13 · durable evidence and run state
Keeps the evidence record where an auditor can read it, not trapped in a vanished session.
Langfuse / OTel
B15 · traces over the whole run
Makes every step observable so the path from requirement to result is reconstructable.

Behavioural verification

Tests, properties, and end-to-end checks

Playwright · verifies
B11 · end-to-end execution checks
Proves the change behaves through a real path, not just in isolation.
Hypothesis · verifies
B11 · property-based testing
Searches for the counterexample you would not have thought to write by hand.
DeepEval · verifies
B11 · evaluation of model-facing behaviour
Holds generative behaviour to stated criteria instead of a vibe check.

Static and security analysis

Code scanning, policy, and secret detection

Semgrep / CodeQL · verifies
B09 · static analysis of the code
Finds the defect class before it reaches a path you own in production.
OPA / Conftest · verifies
B09 · policy-as-code gates
Encodes the rule once so the gate, not a reviewer's memory, enforces it.
OWASP ZAP · verifies
B09 · dynamic security probing
Exercises the running surface for exposure the static view cannot see.
gitleaks · verifies
B09 · secret detection
Stops a leaked credential at the gate rather than in an incident review later.

Formal and supply-chain assurance

Proof obligations and provenance

Z3 · verifies
B17 · constraint and proof obligations
Discharges the conditions you would otherwise reason about by hand and hope you got right.
SLSA · verifies
B17 · build provenance framework
Ties the artifact back to how it was built so the chain is inspectable.

Rollout and orchestration

Feature control and long-running coordination

OpenFeature / flagd
B19 · controlled exposure of changes
Lets a change land behind a flag so exposure is a decision, not an accident.
Temporal · optional · roadmap
B19 · durable long-running workflows
Coordinates multi-step work that outlives a single run; sits outside the gate engine by design.

The honest limits of these mechanisms

The enforcement layer is built on command-type hooks: they run a real command and decide on its exit code. A gate fails closed when that command maps its own errors to the blocking exit, so an unproven step is held, not waved through, even when something errors.
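
A minimal sketch of such a command-type gate follows, written as a PreToolUse scope check. It assumes the hook receives the tool call as JSON on stdin with `tool_name`, `cwd`, and `tool_input.file_path` fields, and that exit code 2 is the blocking exit; verify both against the hooks reference for your Claude Code version. `ALLOWED_PREFIXES` is a hypothetical scope contract. The catch-all matters: the script converts its own failures into a block rather than relying on the host to treat a crash as one.

```python
import json
import posixpath
import sys

ALLOW, BLOCK = 0, 2  # assumed convention: exit code 2 blocks the tool call
ALLOWED_PREFIXES = ("src/payments/", "tests/payments/")  # hypothetical scope contract


def decide(raw_event: str) -> int:
    """Return the exit code for one PreToolUse event; any error blocks."""
    try:
        event = json.loads(raw_event)
        if event["tool_name"] not in ("Write", "Edit"):
            return ALLOW  # this gate only covers file writes
        root = posixpath.normpath(event["cwd"])
        # Normalise so "../" segments cannot step outside the contract.
        target = posixpath.normpath(
            posixpath.join(root, event["tool_input"]["file_path"])
        )
        rel = posixpath.relpath(target, root)
        return ALLOW if rel.startswith(ALLOWED_PREFIXES) else BLOCK
    except Exception:
        return BLOCK  # malformed or missing input: fail closed


def main() -> None:
    """Hook entry point: read the event from stdin and exit with the decision."""
    code = decide(sys.stdin.read())
    if code == BLOCK:
        print("blocked: edit outside scope contract or unreadable event",
              file=sys.stderr)
    sys.exit(code)
```

Wired in as the hook command, the script would call `main()`; keeping `decide` pure makes the gate itself testable without a live session.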

PostToolUse observes after an edit has already happened — it can record and refuse to advance, but it cannot undo the edit. Prevention lives in PreToolUse; the after-the-fact hook is a recorder and a brake, not a time machine.

This roster is emerging and version-gated: the set of tools and the hook surface they bind to change as the platform and its upstream dependencies evolve. Members marked optional · roadmap are not yet load-bearing in the gate engine.

Vendor-neutral by construction

Opinionated defaults, not a cage.

The platform takes strong positions on how work gets proven. None of those positions trap your record of it. Everything the engine produces leaves in formats you already own and tools you already run — so the proof of what was done outlives any decision to keep using us.

feature_list.json
The capability surface as a plain manifest.
It is a file you own — readable, diffable, and committable to your own repository.
EvidenceRecord
What was checked, by which independent verifier, with what verdict.
Structured JSON you can store, query, and re-verify outside this platform.
requirement-ID Baggage
The requirement that a unit of work satisfies, carried alongside it.
Standard distributed-tracing headers — the same propagation your existing tracing already reads. No proprietary format to adopt.

One evidence record, as it leaves the engine

{
  "requirement_id": "REQ-PAYMENTS-REFUND",
  "verified_by": "independent-verifier",
  "gate_verdict": "satisfied",  // gate satisfied, independently verified
  "baggage": "requirement-id=REQ-PAYMENTS-REFUND"
}

In plain terms: a named requirement was checked by an independent verifier, the gate recorded a satisfied verdict, and the requirement’s id rides along as standard distributed-tracing headers — the same baggage propagation your existing tracing already reads, with no proprietary format to adopt. Export it, re-check it, or walk away with it.
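
Re-checking a record like this outside the platform needs little more than a JSON parser. The sketch below, under stated assumptions, validates the four fields shown above, confirms the baggage entry names the same requirement, and appends records to a SHA-256 hash chain of the kind the Prove stage mentions. The chain layout (`prev`, `hash`), the genesis value, and all function names are hypothetical; the engine's actual audit-log format is not specified on this page.

```python
import hashlib
import json

REQUIRED = ("requirement_id", "verified_by", "gate_verdict", "baggage")
GENESIS = "0" * 64  # hypothetical predecessor hash for the first entry


def check_record(record: dict) -> bool:
    """True only if all four fields are strings and baggage matches the requirement."""
    if any(not isinstance(record.get(k), str) for k in REQUIRED):
        return False
    # Baggage is a comma-separated list of key=value members.
    members = dict(
        m.strip().split("=", 1)
        for m in record["baggage"].split(",")
        if "=" in m
    )
    return members.get("requirement-id") == record["requirement_id"]


def _digest(prev: str, record: dict) -> str:
    # Canonical serialisation so the same record always hashes the same way.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()


def append_record(chain: list, record: dict) -> None:
    """Append a record, linking it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": _digest(prev, record)})


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited record or broken link fails."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an earlier record invalidates every later link, which is what makes the log tamper-evident rather than tamper-proof.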
