UltraCode Goal
The problem
Section titled “The problem”You hand an agent an epic and tell it to build until done. It runs, it commits, it declares victory. At review time you learn that “done” meant the model felt done — a story it wrote about the work, not a verdict on the work.
Autonomous runs that look done are not done. The thing deciding completion only ever sees the transcript; it cannot open the gate file written to disk. A model grading its own output is the weakest possible signal for a release gate, and by default it is the only signal you get.
The fix
Section titled “The fix”UltraCode Goal does not trust the transcript. It hard-gates the epic before launch and reads completion from a file after the work — three enforcement layers between “the agent stopped” and “the epic shipped”:
- A preflight gate that fails closed. The run launches only when
preflight_check.pyreturns green after its remediation pass, with the intervention budget at zero. A red blocker stops the run; it does not become a question for later. - TEA red-phase tests as the Definition-of-Done. The Test Architect turns each story’s acceptance criteria into executable, failing tests first, so “done” is a measurable transition from red to green — not prose.
- A deterministic gate verdict. A story advances only when
gate_eval.pyreadsPASSfrom TEA’sgate-decision.json. It never re-derives the thresholds and never asks the model. The verdict JSON is the truth, and you can read it yourself.
gate-decision.json → PASS✓If the gate file is missing or unparseable, the contract counts it as a failing signal — prose drift degrades to a conservative re-loop, never a silent false-advance.
Install and run your first epic →
What you get
Section titled “What you get”Completion stops being a feeling in the transcript and becomes a fact on disk. Every green story is one git commit on an isolated epic branch — rollback you can actually trust, not a checkpoint that misses Bash changes. The run ends with a delivered, gate-passed epic, a run report, and a deferred-work ledger of anything safely parked for later.
Read the rest
Section titled “Read the rest”The docs split into three buckets — Why (start here), Try (do stuff), and Reference (look things up).
Why
- Why UltraCode Goal — the problem in depth, the three enforcement layers, and when not to use it.
Try
- Getting Started — prerequisites, install, the flags, and your first autonomous run.
- How It Works — the six stages, their routing conditions, and the headless emit shape.
- Parallel Mode — the experimental worktree fan-out and its known limits.
Reference
- Architecture — the conductor model, the enforcement layers in depth, and customization resolution.
- Gate Model — how
gate_eval.pymapsgate_statusto a verdict, the thresholds, and the fail-closed contract. - Health Check — the terminal self-improvement reflection: what it sends, the privacy model, and how to disable it.
- Cross-Session Recall — the optional claude-mem integration and its trust model.
- Troubleshooting — real failure modes and their remediations.