Toby Incidents — How this works
This folder is the warroom for Toby bug reports and technical incidents. It's maintained by a team of five agents working as a workflow.
The team
| Agent | Role |
|---|---|
| `toby-incident-coordinator` | Warroom commander. Picks the work unit (inbox / labeled queue / discernment sweep of the open backlog), dispatches specialists, synthesises the incident doc, decides which transitions to record. |
| `toby-frontend-doctor` | UI specialist. Reproduces via Playwright, reads `apps/extension`, `apps/landing`, and `apps/mobile`. |
| `toby-backend-doctor` | Go API specialist. Pulls GCP logs, queries Toby prod DB read-only, reads `apps/api`. |
| `toby-incident-validator` | Quality gate. Re-checks evidence, applies triple-check (correctness / quality / security), returns a binding verdict + confidence before the incident closes. |
| `toby-incident-fix-shipper` | Last-mile patcher. Only runs when validator returned validated + high confidence. Creates a fresh git worktree of `axiomzen/toby-mono-repo` under `/tmp/`, applies the proposed fix from the incident doc, runs the verify plan, pushes a `warroom/...` branch, and opens a PR. The user's primary checkout is never touched. Skipped automatically on conditional / rejected / non-high-confidence verdicts; humans review those first. |
After the sub-agents finish, the coordinator posts a single consolidated report to Slack #C0B3FN70MEE (Toby AZ Slack) summarising verdict, root cause, doctors used, ticket outcome, and PR URL (or decline reason). The Slack report is skipped on no-op runs, where Wave 0 finds nothing to investigate.
The workflow is recorded in AIOS as Toby Incident Response (id 9b78790f-2aea-4f65-876f-53d1a114c3ae).
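The gate in front of the fix-shipper can be sketched as a small predicate. This is a minimal illustration, not the workflow's actual code: the `ValidatorResult` shape, the function name `shouldShipFix`, and the non-high confidence values (`"medium"`, `"low"`) are assumptions; only "validated + high confidence" and the conditional / rejected verdicts come from the table above.

```typescript
// Hypothetical types: the README names the verdict values but not their shape.
type Verdict = "validated" | "conditional" | "rejected";
// Only "high" is named in the README; the other levels are assumed for illustration.
type Confidence = "high" | "medium" | "low";

interface ValidatorResult {
  verdict: Verdict;
  confidence: Confidence;
}

// The fix-shipper runs only on validated + high confidence.
// Every other combination is left for a human to review first.
function shouldShipFix(result: ValidatorResult): boolean {
  return result.verdict === "validated" && result.confidence === "high";
}
```

Keeping the gate as a single pure function makes the "humans review everything else" rule easy to audit in one place.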
How to report an incident
The only way into the warroom is a ticket labeled needs-warroom. Manual file drops into _inbox/ are deprecated — that path got us into trouble (no provenance, no priority, no assignee, no audit trail when a complaint moved between mediums).
The flow:
- File a ticket in the Toby project (via UI, MCP, or an agent like `toby-state-of-business`). For a user-facing bug, set `kind: "bug"` and add `labels: ["needs-warroom"]`. The body should describe symptom / reproduce / when / anything-noticed, the same shape as the old inbox files.
- The ticket→warroom bridge (`lib/tickets.ts` → `bridgeWarroomIfNeeded`) sees the new label, writes an inbox file at `toby/incidents/_inbox/YYYY-MM-DD-<ticket-id>-<slug>.md` containing the ticket body + provenance, and stamps `warroom-bridged` on the ticket so it isn't double-written.
- The warroom workflow (`Toby Incident Response`, daily cron at 09:00 UTC) picks up the inbox file at its next tick, runs the 4-wave investigation, writes the canonical incident doc, and uses `aios_tickets_record_attempt` to transition the source ticket to `done` / `in_review` / `blocked` based on the validator's verdict.
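The bridge step in the flow above can be sketched roughly as follows. This is not the real `bridgeWarroomIfNeeded`; the `Ticket` shape, the helper names, deriving the slug from the ticket title, and treating `warroom-bridged` as a label are all assumptions made for illustration. Only the label names and the inbox path pattern come from this README.

```typescript
// Hypothetical ticket shape; the real type in lib/tickets.ts may differ.
interface Ticket {
  id: string; // e.g. "toby-14"
  title: string;
  labels: string[];
}

// Bridge only tickets that carry needs-warroom and were not bridged already.
// Assumes the warroom-bridged stamp is stored as a label.
function needsBridge(ticket: Ticket): boolean {
  return (
    ticket.labels.includes("needs-warroom") &&
    !ticket.labels.includes("warroom-bridged")
  );
}

// Lowercase, collapse runs of non-alphanumerics to "-", trim stray dashes.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Build toby/incidents/_inbox/YYYY-MM-DD-<ticket-id>-<slug>.md
function inboxPath(ticket: Ticket, now: Date): string {
  const day = now.toISOString().slice(0, 10); // UTC calendar date
  return `toby/incidents/_inbox/${day}-${ticket.id}-${slugify(ticket.title)}.md`;
}
```

Checking for the `warroom-bridged` stamp before writing is what keeps the bridge idempotent: re-running it over the same ticket produces no second inbox file.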
End-to-end latency is therefore "next 09:00 UTC tick", not "4–8 min". If a complaint genuinely can't wait until tomorrow, an operator can still tick the workflow manually from the Workflows app — but the standard path is the bridge.
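To make the latency concrete, the next pickup time can be computed from the filing time. The sketch below assumes a ticket bridged exactly at 09:00:00 UTC waits for the following day's tick; the function names are illustrative and not part of the workflow.

```typescript
// Return the next daily 09:00 UTC tick strictly after `now`.
function nextTick(now: Date): Date {
  const tick = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), 9, 0, 0)
  );
  if (tick.getTime() <= now.getTime()) {
    // Today's tick has already passed; roll over to tomorrow (handles month ends).
    tick.setUTCDate(tick.getUTCDate() + 1);
  }
  return tick;
}

// Hours a freshly bridged ticket waits before the workflow picks it up.
function hoursUntilNextTick(now: Date): number {
  return (nextTick(now).getTime() - now.getTime()) / 3600000;
}
```

Under these assumptions the wait ranges from a few minutes (filed just before 09:00 UTC) to 24 hours (filed at 09:00 UTC or just after).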
What the warroom produces
Each closed incident becomes a single canonical doc:
toby/incidents/2026-05-11-blank-extension-page.md
With sections: symptom, reproduction, root cause, production impact, proposed fix (as a diff), verify plan, validator verdict + confidence, open questions, timeline of who-did-what-when, and links to the doctors' run artifacts for archeology.
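As a rough aid for reviewing a draft incident doc, the section list above can be turned into a checklist. The section names are taken from this README, but how they appear as headings in real docs is an assumption, and `missingSections` is a hypothetical helper rather than part of the workflow.

```typescript
// Section names from the README's list; exact heading text in real docs is assumed.
const REQUIRED_SECTIONS = [
  "symptom",
  "reproduction",
  "root cause",
  "production impact",
  "proposed fix",
  "verify plan",
  "validator verdict",
  "open questions",
  "timeline",
  "run artifacts",
] as const;

// Return the required sections that do not appear in `present` (case-insensitive).
function missingSections(present: string[]): string[] {
  const seen = new Set(present.map((s) => s.trim().toLowerCase()));
  return REQUIRED_SECTIONS.filter((s) => !seen.has(s));
}
```

For example, passing only `["Symptom", "Root cause"]` reports the other eight sections as missing.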
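The agents NEVER apply patches to your primary checkout of the Toby codebase. The incident doc carries a fix proposal; when the validator returns validated with high confidence, toby-incident-fix-shipper also opens a PR from a fresh worktree. Either way, you decide whether to ship.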
Triggering manually (operator-only escape hatch)
If a complaint cannot wait until the next cron tick, an operator can run the workflow directly from the Workflows app: tick Toby Incident Response. The coordinator's Wave 0 pulls from the inbox first, then from the ticket queue. Prefer the labeled-ticket path: manual ticks bypass the audit trail and should be reserved for genuine emergencies.
Verdicts
The validator returns one of three states; the incident doc reflects it:
- `closed` → validator confirmed the fix would work; ready to ship after operator review.
- `open` → twice-rejected by the validator and the doctors couldn't satisfy its objections. Operator decides what to do next.
- `conditional` → fix is good IF a specific question is resolved first (e.g. "is this migration safe under live write traffic?"). Surfaced for operator decision.
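The three states above map naturally onto a closed union type. The sketch below is illustrative only: the type and function names are assumptions, and the action strings paraphrase this README rather than quote any real output.

```typescript
// The three states an incident doc can reflect, as listed in this README.
type IncidentState = "closed" | "open" | "conditional";

// Operator-facing next step for each state, paraphrased from the list above.
const operatorAction: Record<IncidentState, string> = {
  closed: "Review the proposed fix, then ship.",
  open: "Validator rejected twice; decide what to do next.",
  conditional: "Resolve the open question before shipping.",
};

// Only a closed incident is ready to ship after operator review.
function readyToShip(state: IncidentState): boolean {
  return state === "closed";
}
```

Typing the lookup as `Record<IncidentState, string>` means the compiler flags any state added later without a matching operator action.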
Folder structure
toby/incidents/
├── README.md ← this file
├── _inbox/ ← bridge-written; one file per warroom-bound ticket
│ └── 2026-05-12-toby-14-<slug>.md
├── 2026-05-11-blank-extension-page.md ← canonical closed incident doc
└── ...
The ticket→warroom bridge writes _inbox/ files; the workflow consumes them and then leaves them alone (the operator decides when to archive). Humans should not hand-drop files into _inbox/; use the labeled-ticket path so provenance lives in one system.