artifacts/toby-pm/7a4c4afb-12dc-419f-808f-0e9a014417cd/incidents-2026-05-11-blank-extension-page-ingestion.md

Ingestion: `toby/incidents/2026-05-11-blank-extension-page.md`
_Run: 7a4c4afb-12dc-419f-808f-0e9a014417cd · Ingested by: toby-pm · 2026-05-11_
What landed
The incident warroom (workflow Toby Incident Response, id 9b78790f-2aea-4f65-876f-53d1a114c3ae) shipped its first canonical incident doc at toby/incidents/2026-05-11-blank-extension-page.md. Status: closed. Validator verdict: validated, high confidence. Wave timing per the doc's timeline: dispatch 17:08 → frontend finding 17:14 → backend finding 17:30 → validator 17:36 → published 17:38. That is 30 minutes end-to-end: within the README's advertised 4–8 min envelope per agent, but over it end-to-end because of two doctor passes plus validation.
This is the same bug the dashboard's "Live blank-page reliability incident" open question has been tracking against compass anchor #1 (reliability) and playbook O1 KR1 (hard deadline 2026-05-24).
Root cause (one paragraph, faithful to the doc)
The Toby new-tab page renders the static preload skeleton forever because AuthWrapper at apps/extension/app/containers/Toby.tsx:304 returns null while isUserHydrated is false. The isUserHydrated boolean is bound 1:1 to getUser() at apps/extension/app/state/accessors/user.tsx:45-50, which wraps a chrome.storage.local.get callback with no timeout, no chrome.runtime.lastError check, no .catch. Chrome's "extension context invalidated" state (entered on every auto-update of a chrome_url_override new-tab extension, and on manual disable/reload, and on rarer SW crashes) drops that callback — turning a previously-tolerable platform quirk into a reliable user-visible hang. The trigger was commit d68726b29 (2026-04-09), which widened the gate by adding !isUserHydrated without bounding the new dependency. The same defect class exists at apps/extension/app/utils/chromeapi.ts:248-259, so isDraftReady shares the same silent-hang surface.
What was refuted
The earlier hypothesis from toby-product-strategist (artifact 388c1db4-59b7-49e9-8ec3-ecfba972c95f) that this was an MV3 service-worker boot regression is now refuted. Backend evidence: prod-api SHA 4b0107858… hasn't changed since 2026-02-02 (well before the post-2026-04-09 complaint window — the 2026-04-01 deploys are config-only); 0 5xx in last 24h; worst day this week 19/1.18M = 0.0016%; SW boot path is structurally clean (every chrome.*.addListener registers synchronously at module top level); getUser() makes no network call so the hang is pre-HTTP and an API regression cannot structurally cause it.
Memory update: do not carry the SW-boot-regression hypothesis forward. The prior strategist artifact should be treated as historical context only.
Fix proposal (frontend-only, defence-in-depth)
Three layers, all in apps/extension:
- Layer 1: bound hydration with a 5s timeout, fail open. Patch `apps/extension/app/state/accessors/user.tsx` (around line 71) so `getUser()` is wrapped in a `setTimeout(5000)` that flips `setIsUserHydrated(true)` on expiry; chain `.then`/`.catch`/`.finally` correctly. Apply the same shape to `apps/extension/app/hooks/useOnboarding2Draft.ts:12-30` for `isDraftReady`. Validator confirmed race-safety, non-destructiveness, and that `d68726b29`'s original intent (no Onboarding2 flash for returning users) is preserved.
- Layer 2: visible recovery screen at 8s. Replace `return null` in `Toby.tsx:304` with a `StuckRecoveryScreen` mounted after 8s with the already pre-approved copy "Your tabs are safe. Tap to recover." (the same line carried in the dashboard's next-step under O1 KR1).
- Layer 3: `NewTabHangShown` telemetry beacon at the recovery-screen site. First-ever signal between CWS-review complaint and Sentry/Amplitude funnels.
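The Layer 1 shape (bound the load, then fail open) can be sketched as below. This is a minimal illustration under stated assumptions, not the patch from the incident doc: `settleWithin` and `hydrateUser` are hypothetical names, and `getUser`, `setUser`, and `setIsUserHydrated` are passed in as plain functions so the sketch runs without React or the `chrome.*` APIs.

```typescript
// Outcome of a bounded load. The three cases are kept distinct so the caller
// can tell a real value apart from a timeout or a failure.
type Settled<T> =
  | { status: "loaded"; value: T }
  | { status: "timed-out" }
  | { status: "failed"; error: unknown };

// Hypothetical helper (not from the codebase): settles with the loader's
// outcome, or with "timed-out" if the loader has not settled in `timeoutMs`.
// It never rejects, so a caller gating UI on it cannot hang.
function settleWithin<T>(
  load: () => Promise<T>,
  timeoutMs: number,
): Promise<Settled<T>> {
  return new Promise<Settled<T>>((resolve) => {
    let settled = false;
    let timer: ReturnType<typeof setTimeout> | undefined;

    const finish = (outcome: Settled<T>): void => {
      if (settled) return; // first outcome wins; late callbacks are ignored
      settled = true;
      if (timer !== undefined) clearTimeout(timer);
      resolve(outcome);
    };

    timer = setTimeout(() => finish({ status: "timed-out" }), timeoutMs);
    // Promise.resolve().then(load) also captures a loader that throws synchronously.
    Promise.resolve()
      .then(load)
      .then(
        (value) => finish({ status: "loaded", value }),
        (error) => finish({ status: "failed", error }),
      );
  });
}

// Assumed wiring: the hydrated flag flips on every path (fail open), while
// the stored user is only written when a value actually arrived.
async function hydrateUser<U>(
  getUser: () => Promise<U>,
  setUser: (user: U) => void,
  setIsUserHydrated: (hydrated: boolean) => void,
  timeoutMs = 5000,
): Promise<Settled<U>["status"]> {
  const outcome = await settleWithin(getUser, timeoutMs);
  if (outcome.status === "loaded") setUser(outcome.value);
  setIsUserHydrated(true);
  return outcome.status;
}
```

The returned status is there so a caller could decide whether to emit a timeout beacon; whether the real patch exposes it that way is not stated in the doc.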
Operator decisions surfaced in the doc:
- Should `NewTabHangShown` (and the optional Layer-1 `NewTabHydrationTimeout` beacon) be feature-flag gated? Default: on. Validator concurs.
- Should `prod-api` be redeployed as part of this incident? Both backend doctor and validator: no.
Non-blocker follow-ups (not gating the incident close):
- Apply Layer-1 shape to `isInitializing` (the `useIsRestoring()` IDB-backed path in `Toby.tsx:168-275`).
- SW hardening: `.catch()` on `persistQueryClientRestore` at `background.ts:14`; `AbortController` + 10s timeout on the `contextMenus.ts:145-191` fetch; build a unified `chromeStorageGet<T>(keys, { timeoutMs })` helper.
- Layer-1 telemetry beacon (`NewTabHydrationTimeout`) so the common 5s recovery path is visible in Amplitude, not just the 8s tail.
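One possible shape for the unified `chromeStorageGet<T>(keys, { timeoutMs })` helper is sketched below. It addresses the three gaps named in the root cause: no timeout, no `chrome.runtime.lastError` check, and no rejection path. The `area` and `runtime` options are assumptions added here so the sketch compiles and runs without `@types/chrome`; in the extension they would presumably be `chrome.storage.local` and `chrome.runtime`. The structural interfaces are likewise illustrative stand-ins, not the real Chrome typings.

```typescript
// Minimal structural stand-ins for the parts of the Chrome API this sketch touches.
interface StorageAreaLike {
  get(
    keys: string | string[],
    callback: (items: Record<string, unknown>) => void,
  ): void;
}
interface RuntimeLike {
  lastError?: { message?: string };
}

interface StorageGetOptions {
  timeoutMs: number;
  area: StorageAreaLike; // assumption: injected, e.g. chrome.storage.local
  runtime: RuntimeLike;  // assumption: injected, e.g. chrome.runtime
}

function chromeStorageGet<T extends Record<string, unknown>>(
  keys: string | string[],
  opts: StorageGetOptions,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // If the callback is dropped, the promise rejects instead of pending forever.
    const timer = setTimeout(
      () => reject(new Error(`storage.get timed out after ${opts.timeoutMs}ms`)),
      opts.timeoutMs,
    );
    try {
      opts.area.get(keys, (items) => {
        clearTimeout(timer);
        // lastError is only meaningful inside the callback, so read it here.
        const err = opts.runtime.lastError;
        if (err) reject(new Error(err.message ?? "chrome.runtime.lastError was set"));
        else resolve(items as T);
      });
    } catch (e) {
      // The call itself may throw synchronously; surface that as a rejection.
      clearTimeout(timer);
      reject(e instanceof Error ? e : new Error(String(e)));
    }
  });
}
```

A call site might then read `await chromeStorageGet<{ user: unknown }>("user", { timeoutMs: 5000, area: chrome.storage.local, runtime: chrome.runtime })`, with the caller choosing how to fail open when the promise rejects.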
Dashboard impact
- `reliability-blank-page-fix` bet (ICE 576, O1 KR1, deadline 2026-05-24): diagnosis is now complete and validated. The dashboard's next-step under this bet is no longer "reproduce and triage"; it is "operator decides whether to ship the 3-layer diff documented in the incident doc". Bets queue is owned by the strategist agent; the dashboard updates only the next-step shape and crosslinks the canonical doc.
- Operations § (warroom): first canonical incident published; advertised flow worked end-to-end. Update the Operations § to reference the first closed incident as proof-of-life, not just a feature claim.
- Recent Shipments: add 2026-05-11 entry for the incident close.
- Open Questions: the existing "Live blank-page reliability incident" question gains a pointer to the canonical doc and the operator-decision framing (no longer "unknown root cause"). A new question is not needed; the existing one mutates.
- Doc Index: add the canonical incident doc (`toby/incidents/2026-05-11-blank-extension-page.md`).
- Key decisions: capture the "no prod-api redeploy" call and the default feature-flag-on call as decisions made inside the warroom.
Scope reconciliation (why this derivative lives here, not next to source)
Operator orders said "save next to source so it's discoverable". My hard scope rules forbid writing into toby/incidents/ — that's owned by the incident-coordinator team and hand-edits would collide with the warroom's writes. Resolution: the derivative goes in the toby-pm workspace artifact dir; discoverability comes from the dashboard linking both the canonical incident doc and this ingestion artifact. Same pattern used for prior X / strategy ingestions.
Citations
- Canonical incident doc: `toby/incidents/2026-05-11-blank-extension-page.md` (status: closed, verdict: validated)
- Frontend finding: `artifacts/toby-frontend-doctor/6e2b3eb9-36bf-42d3-8de3-5afa48f4b167/finding.md`
- Backend finding: `artifacts/toby-backend-doctor/083ec6d2-63e9-4c3e-b55e-a95301a4aa72/finding.md`
- Validation: `artifacts/toby-incident-validator/a28a3690-38d7-4ce9-a9c2-c6d436da1793/validation.md`
- Synthesis draft (preserved): `artifacts/toby-incident-coordinator/df069a93-28df-4439-8838-cfd953c4c974/synthesis-draft.md`
- Proximate code sites: `apps/extension/app/containers/Toby.tsx:304`, `apps/extension/app/state/accessors/user.tsx:45-50,66-99`
- Class-of-bug code site: `apps/extension/app/utils/chromeapi.ts:248-259`
- Proximate commit: `d68726b29` (Jad Haidar, 2026-04-09)
- Refuted prior hypothesis: artifact `388c1db4-59b7-49e9-8ec3-ecfba972c95f`