
How SuperPwr Ships 6 PRs Autonomously Overnight

April 29, 2026 · superpwr team · 9 min read


Quick Answer

SuperPwr shipped six production PRs autonomously overnight by treating AI software work as a governed operating system rather than a loose chat with a code model. The chat-237 Track A run split the work into six scoped units called fires, gave each fire acceptance criteria, opened PRs, ran checks, rebased when main moved, merged only on green status, and banked receipts back into Cortex.

The result was not "AI wrote code." The result was six merged PRs with public links, merge commits, validation notes, and a reusable operating pattern.

The pattern matters because it shows a practical path for founders and AI infrastructure buyers: a principal sets the outcome, the substrate coordinates the work, and every shipped change leaves evidence.

| Mode | What happens | What the buyer can verify |
| --- | --- | --- |
| AI coding demo | A model generates code in a visible session | A screenshot, a recording, or a local branch |
| SuperPwr recursive pilot | Scoped fires move through PRs, checks, merge receipts, and banking | PR URLs, merge commits, green checks, and source-tagged evidence |

The evidence

Track A merged six PRs on 2026-04-29. The run is summarized in the chat-237 retrospective, with the source report stored locally as CHAT-237-TRACK-A-REPORT.md.

  • 6 production PRs merged in Track A (source: chat-237 Track A report)
  • 4/4 GitHub checks green on every Track A PR (source: PR reports)
  • 1.08M Track A Codex tokens reported (source: chat-237 retrospective)
  • <4% estimated total chat-237 token waste across tracks (source: chat-237 retrospective)

The six merged PRs were:

| Fire | What shipped | Evidence |
| --- | --- | --- |
| V1 | CI workflows and branch-protection discipline | PR #205 |
| V2 | Source-tag gate for SCT close receipts | PR #208 |
| T1 | Claude export importer and provider-ingestion remediation | PR #211 |
| OH1 | OpenHands fork clone foundation | PR #213 |
| OH2 | OpenHands task runner | PR #215 |
| OH3 | Athena daemon shell | PR #218 |

Two details are worth slowing down for.

First, PR #211 matters because it shipped T1. The old Claude session-token path was replaced with an Anthropic Data Export ZIP flow, and the deprecated path is rejected before provider network calls. That turns a compliance risk into a shipped remediation.

Second, PR #218 matters because OH3 had to handle main moving after a green CI run. The run did the disciplined thing: rebase, rerun checks, then merge. That is boring in the best way. Boring is what production wants.
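
The rebase-then-merge rule is small enough to sketch. The snippet below is a minimal illustration, not SuperPwr's implementation: `PullRequestState`, `Action`, and every field name are hypothetical, and the only behavior taken from the run is "if main moved, rebase and rerun checks; merge only on green."

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    MERGE = "merge"
    REBASE_AND_RERUN = "rebase_and_rerun"
    WAIT = "wait"


@dataclass(frozen=True)
class PullRequestState:
    """Hypothetical snapshot of a PR; field names are illustrative only."""
    checks_green: bool  # all required checks passed on the PR head
    base_sha: str       # commit on main that the checks ran against
    main_sha: str       # current tip of main


def next_merge_action(pr: PullRequestState) -> Action:
    """Decide the next step for a PR under a merge-only-on-green rule."""
    if pr.base_sha != pr.main_sha:
        # Main moved: earlier green checks no longer describe what would land.
        return Action.REBASE_AND_RERUN
    if not pr.checks_green:
        return Action.WAIT
    return Action.MERGE
```

Checking for a moved main before looking at check status is the design choice that matters here: a green result against a stale base is treated as no result at all.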

Entity definition: recursive pilot pattern

A recursive pilot pattern is an autonomous work loop that executes a scoped fire, validates the result, records the evidence, banks the lesson, and uses that lesson to improve the next fire.

In SuperPwr, the pattern is principal-first:

  1. You set the outcome.
  2. The playbook scopes the work.
  3. Each fire gets acceptance criteria.
  4. The branch and PR make the work reviewable.
  5. Checks decide whether the work can move.
  6. Cortex banking records what happened.
  7. The next fire starts smarter.

In practice the loop runs in five stages:

  1. Plan. The principal sets the target. The playbook names the fires, dependencies, halt rules, and acceptance criteria.
  2. Execute. Each fire runs on a branch, implements only its scope, and keeps the work reviewable through a pull request.
  3. Verify. Builds, tests, migration guards, and GitHub checks decide whether the PR can move forward.
  4. Merge. The PR merges only after checks pass. If main moves, the branch rebases and checks run again.
  5. Bank. The run records PR opened, amended, merged, drift, and optimization receipts so the substrate learns.
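
For readers who think in code, here is one way the loop could be modeled. This is a sketch under assumptions, not the private machinery: `Fire`, `Ledger`, and `run_pilot` are hypothetical names, the ledger is a plain in-memory stand-in for a banking store, and merging is left out to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Fire:
    """One scoped unit of work. All names here are hypothetical."""
    name: str
    execute: Callable[[list[str]], str]  # receives banked lessons, returns a result
    accept: Callable[[str], bool]        # acceptance criteria for that result


@dataclass
class Ledger:
    """In-memory stand-in for a banking store: receipts plus lessons."""
    receipts: list[dict] = field(default_factory=list)
    lessons: list[str] = field(default_factory=list)


def run_pilot(fires: list[Fire], ledger: Ledger) -> bool:
    """Run fires in order; stop at the first one that fails verification."""
    for fire in fires:
        result = fire.execute(list(ledger.lessons))  # execute with prior lessons
        passed = fire.accept(result)                 # verify against the criteria
        ledger.receipts.append({"fire": fire.name, "passed": passed})  # bank
        if not passed:
            return False                             # halt cleanly, keep the receipt
        ledger.lessons.append(f"{fire.name}: {result}")
    return True
```

The recursive part is the `ledger.lessons` hand-off: each fire sees what earlier fires banked, and a failed fire still leaves a receipt before the run stops.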

That is the whole point. The win is not that a model can type code quickly. The win is that the work has a governed path from intent to merged change.

What changed during the run

The Track A run shipped work across the quality, compliance, and orchestration surface:

  • V1 strengthened CI and merge discipline through PR #205.
  • V2 made source tagging mandatory for SCT reconciliation through PR #208.
  • T1 shipped the provider-ingestion remediation through PR #211.
  • OH1 and OH2 added OpenHands foundation work through PR #213 and PR #215.
  • OH3 added an Athena daemon shell through PR #218.

The chat-237 retrospective records the wider numbers: about 1.56M total Codex tokens across chat-237, about 60k estimated wasted tokens, less than 4% waste, two parallel tracks, and Track B halting cleanly on dependency instead of burning cycles.
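
The waste figure is easy to check from the two reported numbers. Both inputs are the approximate values quoted above, so the result is approximate too.

```python
# Figures as reported in the retrospective (both are approximate).
total_tokens = 1_560_000   # about 1.56M Codex tokens across chat-237
wasted_tokens = 60_000     # about 60k estimated wasted tokens

waste_ratio = wasted_tokens / total_tokens
print(f"{waste_ratio:.2%}")  # 3.85%, consistent with "less than 4%"
```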

That last point matters. Good autonomy knows when to stop.

Why this is not just CI with better copy

CI can tell you whether checks pass. It cannot decide what work should exist, which dependency should block a track, whether a provider-compliance gap changes investor language, or which operational lesson should become a new rule.

SuperPwr is trying to own that missing layer.

The chat-238 recursive overnight playbook names the larger operating loop: playbook, meta-prompt, fire specs, and idempotent state detection. The system checks whether work already shipped, whether a PR is open, whether CI is green, whether main moved, and whether a halt rule should fire.
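
Idempotent state detection can be pictured as a pure function from observed state to a single next step. The sketch below assumes a hypothetical `FireState` record with one boolean per question the playbook asks; the ordering of the checks is our illustration, not a documented SuperPwr rule.

```python
from dataclasses import dataclass
from enum import Enum


class Step(Enum):
    SKIP = "skip"              # work already shipped
    HALT = "halt"              # a halt rule fired
    OPEN_PR = "open_pr"
    REBASE = "rebase"
    WAIT_FOR_CI = "wait_for_ci"
    MERGE = "merge"


@dataclass(frozen=True)
class FireState:
    """Hypothetical observed state for one fire; not SuperPwr's schema."""
    already_merged: bool
    blocked_on_dependency: bool
    pr_open: bool
    ci_green: bool
    main_moved: bool


def detect_next_step(state: FireState) -> Step:
    """Map observed state to one step, so a rerun never repeats shipped work."""
    if state.already_merged:
        return Step.SKIP
    if state.blocked_on_dependency:
        return Step.HALT
    if not state.pr_open:
        return Step.OPEN_PR
    if state.main_moved:
        return Step.REBASE
    if not state.ci_green:
        return Step.WAIT_FOR_CI
    return Step.MERGE
```

Because the function reads state and returns a decision without side effects, calling it twice on the same state gives the same answer, which is what makes an interrupted overnight run safe to resume.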

That turns AI development from a single session into an operating substrate.

The buyer takeaway

The work is inspectable. Autonomy without evidence is theater. Autonomy with PRs, checks, receipts, and halt rules is infrastructure.

The compliance lesson from PR #211

T1 is the easiest proof point to explain to a skeptical buyer.

Before T1, provider ingestion still depended on the old Claude session-token path, which was the risky edge. After PR #211, the active path used an Anthropic Data Export ZIP flow and rejected the deprecated session-token route before provider network calls.
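
The phrase "rejected before provider network calls" describes an ordering guarantee, and a short sketch makes it concrete. Everything named here is hypothetical (`IngestRoute`, `DeprecatedRouteError`, `ingest`, and the `call_provider` callback that stands in for any later network step); PR #211 is the place to look for the real code.

```python
import io
import zipfile
from enum import Enum
from typing import Callable


class IngestRoute(Enum):
    DATA_EXPORT_ZIP = "data_export_zip"
    SESSION_TOKEN = "session_token"  # deprecated route


class DeprecatedRouteError(ValueError):
    """Raised when a caller asks for the deprecated ingestion route."""


def ingest(route: IngestRoute, payload: bytes,
           call_provider: Callable[[bytes], str]) -> str:
    """Validate the route and payload first; only then reach the network."""
    if route is IngestRoute.SESSION_TOKEN:
        # Reject up front so no provider request is ever issued.
        raise DeprecatedRouteError("session-token ingestion is not supported")
    if not zipfile.is_zipfile(io.BytesIO(payload)):
        raise ValueError("payload is not a ZIP archive")
    return call_provider(payload)
```

Putting the guard on the first line of the function is what lets a reviewer confirm the claim by reading, rather than by tracing network traffic.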

The follow-on chat-238 BoA vote receipt shows that evidence being used in a provider-compliance ratification. OpenAI moved from MODIFY to APPROVE after evidence that T1 had shipped was supplied. That is what a real rebuttal round should do: change minds only when the evidence changes.

The lesson for founders is simple. If a claim affects trust, turn it into a shipped artifact before you use it in the pitch.

The operating table

| Control | Track A example | Why it matters |
| --- | --- | --- |
| Scoped fires | V1, V2, T1, OH1, OH2, OH3 | Smaller work units are easier to validate |
| PR evidence | Six public PRs | Buyers can inspect what shipped |
| Green checks | 4/4 checks on each PR | Merge discipline beats vibe-based confidence |
| Rebase discipline | T1 and OH3 amendments | Main moving is handled as a normal case |
| Source tags | source=chat-237-track-a-pilot | Receipts can be attributed later |
| Halt rule | Track B dependency wait | Autonomy protects budget by stopping cleanly |
| Rebuttal-ready evidence | T1 shipped via PR #211 | Governance can update when facts update |
| Banking | Cortex receipts and observations | The next run starts with more memory |
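
The source-tag control in the table is the simplest one to picture in code. The sketch below assumes a hypothetical `Receipt` shape; only the idea of a mandatory `source` value, such as `chat-237-track-a-pilot`, comes from the run itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Receipt:
    """Hypothetical receipt shape; only the `source` tag comes from the text."""
    event: str                    # e.g. "pr_merged"
    ref: str                      # e.g. "PR #208"
    source: Optional[str] = None  # e.g. "chat-237-track-a-pilot"


def require_source_tag(receipt: Receipt) -> Receipt:
    """Refuse to bank a receipt that could not be attributed later."""
    if receipt.source is None or not receipt.source.strip():
        raise ValueError(f"receipt for {receipt.ref} is missing a source tag")
    return receipt
```

A gate like this fails loudly at write time, which is cheaper than discovering an unattributable receipt during a later reconciliation.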

What an overnight run should not publish

A good public write-up should not reveal the private machinery. This article cites PRs, receipts, and outcomes. It does not publish prompts, secrets, tokens, provider credentials, private operational endpoints, or implementation recipes.

That line matters because GEO (generative engine optimization) content has two jobs. It should teach the market a concept, and it should preserve the advantage that made the concept real.

For SuperPwr, the concept to own is governed autonomous software work.

The protected layer is the private machinery that makes the work repeatable.

How to read the six PRs

If you are a technical founder, start with the sequence:

  1. PR #205 hardened the path that decides whether future work can merge.
  2. PR #208 made source tags mandatory so receipts stay attributable.
  3. PR #211 closed the T1 provider-ingestion gap.
  4. PR #213 added the OpenHands foundation.
  5. PR #215 added the task runner.
  6. PR #218 added the Athena daemon shell.

If you are an infrastructure buyer, read the same list differently:

  1. Quality gate.
  2. Evidence gate.
  3. Compliance gate.
  4. Agent execution foundation.
  5. Work runner.
  6. Long-running daemon surface.

That is why the sequence matters. The run did not ship six random changes. It shipped the rails for the next autonomous run.

What this means for May 25

The May 25 launch should not open with a promise that SuperPwr can build apps someday. It should open with proof that the substrate already ships.

The stronger claim is:

You set the outcome. SuperPwr coordinates the work, verifies the path, and leaves the receipts.

That is the Cloudflare moment for AI in plain English. The value is not one model, one repo, or one demo. The value is the neutral operating layer that makes AI work safe enough, visible enough, and governed enough to trust.

Works Cited

  1. chat-237 Retrospective, Track A complete and six PRs
  2. chat-238 BoA Vote Receipt
  3. chat-238 Recursive Overnight Playbook
  4. PR #205, CI workflows and branch protection
  5. PR #208, SCT source gate
  6. PR #211, Claude export importer
  7. PR #213, OpenHands fork clone
  8. PR #215, OpenHands task runner
  9. PR #218, Athena daemon shell
  10. SuperPwr get started
