Toby — State of the Project
_Last updated: 2026-05-13_
TL;DR
Toby's monorepo flatten (Mar 2026) and Phase 1 onboarding cleanup shipped in extension v1.13.0 on 2026-04-14, including the new 4-hour Session Start heartbeat for retention analysis.
2026-05-12 — the blank-page reliability hotfix has SHIPPED TO PR. The first canonical warroom incident (toby/incidents/2026-05-11-blank-extension-page.md, status now closed → shipped) closed its loop: TOBY-14 bridged into _inbox/ at 04:59 UTC, toby-incident-fix-shipper opened PR #12 at 05:08 UTC (~9 minutes inbox→PR), TOBY-14 transitioned to done at 05:10 UTC. The 3-layer frontend-only fix — 5s timeouts fail-open on getUser() + getOnboarding2Draft, StuckRecoveryScreen at 8s with pre-approved copy, NewTabHangShown telemetry beacon — is now in-tree on branch warroom/2026-05-11-blank-extension-page-toby-14 (commit 06baf0f8a), including a brand-new apps/extension/app/components/StuckRecoveryScreen.tsx. PR #12 is the first time the Wave 4 auto-ship path has executed end-to-end — completing the proof of both warroom rails (high-confidence ships, medium-confidence asks for a human review pass). O1 KR1 deadline (2026-05-24) is now comfortably ahead — operator decision becomes review / merge / deploy / monitor, not implementation. Diagnosis bears repeating: root cause was an unbounded chrome.storage.local.get callback in getUser() made user-visible by commit d68726b29 (2026-04-09); backend is innocent (prod-api SHA stable since 2026-02-02; 0 5xx in last 24h); the earlier toby-product-strategist MV3-SW-boot-regression hypothesis is refuted.
2026-05-13 — second new agent-owned surface online: toby-code-reviewer (daily 07:00 UTC cron) shipped its first review at toby/code-reviews/2026-05-13-10commits.md, back-filling the 10 most recent commits to main with 3 medium + 4 low findings (test coverage gaps on the shipped Session Start heartbeat hook, security concerns on the new .sandcastle/ agentic-workflow secret scanner + gate-skip audit trail). Three tickets are being filed by that run. The new code-reviewer independently confirms d68726b29's fix is correct — the issue was scope, not correctness.
Warroom dispatch path is now proven twice — the Toby Incident Response workflow (id 9b78790f-2aea-4f65-876f-53d1a114c3ae) closed its second canonical incident on 2026-05-12 (toby/incidents/2026-05-12-retention-offers-silent.md, TOBY-6) with a validated + medium verdict; the validator caught compile defects in the draft diff, returned the corrected version, and Wave 4 correctly skipped auto-ship. That same incident resolved the long-standing CancelSubscription.tsx open question (the frontend integration shipped; the funnel is structurally leaky upstream) and surfaced a new strategic finding: ~82% of cancels bypass the in-app retention modal entirely (120 cancels → 22 reasons → 1 accept in 30 days).
As of 2026-05-10 the toby/strategy/ sub-folder holds the full strategic spine: the Compass (identity / axioms / anchors), the Bets rolling queue (in-flight / proposed / killed with ICE + falsifying signals), and the Q2 2026 Growth Playbook (toby/strategy/playbook.md) that bridges them. In parallel, the toby-x-strategist agent has finished its X execution surface, and the toby-blog-seo agent has shipped two blog drafts and a third pricing-inconsistency signal (TheTab claims Toby is "$9/mo" in its public comparison table — strengthens the urgency of the pricing-reality-reconcile deadline 2026-05-13). The X channel goes operational this week — first scheduled post Tue 2026-05-12, contingent on @TobyForTabs account creds being in the operator's hands.
Repo is on main with uncommitted edits to CLAUDE.md and untracked docs/ai-onboarding-ideas-analysis.md, product/ideas/, and research-docs/. A brand-new top-level codebase subsystem also landed via commit 75a09e34d: .sandcastle/ — host-side orchestration for unattended agentic slice execution (separate rail from the warroom fix-shipper).
Strategic anchor (Compass + Playbook + Bets)
The toby/strategy/ folder is now the authoritative strategic surface. Three docs work together: read them in order Compass → Playbook → Bets. This dashboard defers to them; conflicts are bugs in this dashboard, not the strategy docs.
- Identity layer — Compass (2026-05-10). Vision, axioms (visual tangibility / ambient-not-invocational / persistence-is-product / instant clarity), anchors (reliability [at risk] / cloud-backed account / Chrome-extension ambient new-tab / free-tier viable / invisible performance cost), brand promise ("It's okay, Toby has it now.") and anti-promises (no count-lowering claim, no AI auto-organize pre-Q4 2026, no enterprise team parity, not the only way to manage tabs).
- Quarter layer — Q2 2026 Growth Playbook (2026-05-10). Growth thesis: the under-pulled lever this quarter is reliability + activation, not acquisition — Toby's loss curve isn't a top-of-funnel problem (CWS conversion on residual traffic is ~30%, strong) but a value-realization problem (New Adopter stickiness 58.4% vs. 83-94% for every other segment). Diagnosis of current growth loop: primary CWS-organic loop is declining (page views collapsed Oct 8 2025, -83%, never recovered); secondary word-of-mouth is real but not engineerable; latent curator loop (1,848 Free-Tier Archivists, 18.6% public-share rate, 14,306 active card-share links) is the only under-pulled compounding motion this quarter. Falsification clause: if Phase 2 welcome A/B fails AND blank-page hotfix lands AND CWS install-conversion still falls 2 consecutive weeks post-rewrite, the bottleneck is structurally upstream (Chrome 133 absorbed our wedge).
- Action layer — Bets (2026-05-10). Rolling ICE-scored queue with in-flight / proposed / validated / killed sections. Every bet declares a falsifying signal. Top 5 bets surfaced by the playbook (in ICE order): pricing-reality-reconcile (600), reliability-blank-page-fix (576; diagnosis closed 2026-05-11; PR #12 open 2026-05-12 via warroom Wave 4; awaits review/merge/CWS deploy), cws-narrative-repair (392), public-collection-pride-loop (336, the under-pulled lever), phase-2-welcome-ab (192). 2026-05-12 — new strategic surface from warroom run #2: a retention-funnel-bypass-fix proposed bet is implied by the 2026-05-12 retention_offers incident (~82% of cancels bypass the in-app retention modal; product-shaped, FE-orchestration problem, upstream of monetization bets). Surfaced here for operator triage into the bets queue — not yet formally scored.
- Channel-execution layer — X — X Strategy + Content Pipeline + Engagement Targets (2026-05-10). Operator-driven X channel; 26 ICP targets ranked across 6 Tier-A / 11 Tier-B / 9 Tier-C; new bot-filter rule applied (followers ≥ 100 AND account < 2025-06); hard engagement rules (no link in first reply, max 1 reply/target/day, no DM cold-pitch). Team-buyer bucket = 0 targets, surfacing back into the playbook open question on team-buyer signal; drop-decision deadline 2026-05-17.
- Channel-execution layer — Blog/SEO (Run #2 · 2026-05-12) — Blog & SEO Pipeline + two drafts (not yet published): Why You Have 80 Tabs Open (And Why That's Actually Fine) (P1 top-funnel, 2026-05-09) and OneTab Alternative: Save Tabs as Workspaces, Not URL Lists (P5 mid-funnel, 2026-05-12, "URL list → workspace" reframe anchored by OneTab's own December-2025 data-loss warning). Structured 2-week-cadence motion; 5 content pillars mirror the X strategy (P1 tab anxiety / P2 save-session ritual / P3 power-user shortcuts 🔒 rel-gated / P4 better-than-bookmarks / P5 competitor alternatives); pillar mix over 8 posts is ~3 P1 / ~2 P5 / ~1 P2 / ~1 P4 / ~1 P3-once-unblocked. Queue (post Run #2): rank-1 Bookmarks vs Tab Manager, rank-2 How to save Chrome tabs, rank-3 "I'll deal with it tomorrow", rank-4 Session Buddy alternative, rank-5 Arc browser switchers, rank-6 Public collection of the week (curator loop · O3 KR3), rank-7 Chrome 133 vs. Toby. Pipeline disowns the old breathless landing-page voice in favor of calm-specific-generous voice matching the X strategy. Drafts only — operator hand-copies to apps/landing/src/content/post/ to publish; first-publish flow is the immediate open question. Pipeline recommends holding the OneTab draft until reliability hotfix ships (it's feature-promotional and would route readers into the live bug); the foundational tab-hoarder draft can publish earlier.
- Audience anchors — Solo Pro Power Saver (9.5% / 6,656 users / 94% weekly stickiness, revenue anchor) and Tenured Free Organizer (62.4% / 43,722 users / 86.6% weekly stickiness, durability + social proof). (see: toby/01-personas.md, toby/strategy/compass.md)
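The bot-filter rule on the X engagement targets (followers ≥ 100 AND account < 2025-06) can be read as a simple predicate. The sketch below is illustrative only: the XAccount shape and field names are hypothetical, and it assumes "account < 2025-06" means the account was created before 1 June 2025.

```typescript
// Hypothetical shape for an X engagement target; field names are assumptions.
interface XAccount {
  handle: string;
  followers: number;
  createdAt: string; // ISO date, e.g. "2024-11-02"
}

const MIN_FOLLOWERS = 100;
// Assumed reading of "account < 2025-06": created before 1 June 2025.
const ACCOUNT_CUTOFF = "2025-06-01";

// Returns true when the account passes both halves of the bot filter.
function passesBotFilter(account: XAccount): boolean {
  // ISO dates in YYYY-MM-DD form sort correctly as plain strings.
  return account.followers >= MIN_FOLLOWERS && account.createdAt < ACCOUNT_CUTOFF;
}
```

Both conditions must hold, so a large but recently created account is still filtered out.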
Operations — Incident warroom (proven 2026-05-11 · proven again 2026-05-12)
A five-agent AIOS workflow sits behind toby/incidents/ to triage and propose fixes for any reported Toby bug. The warroom proposes — the operator ships. Source of truth: toby/incidents/README.md. Two canonical incidents now closed — see toby/incidents/2026-05-11-blank-extension-page.md (high-confidence) and toby/incidents/2026-05-12-retention-offers-silent.md (medium-confidence; Wave 4 correctly skipped auto-ship).
- Workflow — Toby Incident Response (id 9b78790f-2aea-4f65-876f-53d1a114c3ae). Active, project=toby. Trigger: daily cron 0 9 * * * (09:00 UTC). Operator manual tick remains as an escape hatch. The old file_event on _inbox/** description was inaccurate — the workflow never had a file_event trigger; it ran on cron + read inbox files inside Wave 0.
- Team — toby-incident-coordinator (warroom commander, only agent that writes canonical docs), toby-frontend-doctor (Playwright + apps/extension/landing/mobile), toby-backend-doctor (GCP logs + read-only prod DB + apps/api), toby-incident-validator (re-runs spot-checks + triple-check; verdict gate), toby-incident-fix-shipper (added 2026-05-12) — last-mile patcher that creates a PR in axiomzen/toby-mono-repo when validator returns validated + high confidence; operates inside a /tmp/ worktree so the primary checkout is never touched.
- Protocol — 7 waves: Wave 0 (pick: inbox → labeled queue → discernment sweep) → Wave 1 Investigate (doctors in parallel) → Wave 2 Synthesise (coordinator drafts root-cause + fix diff + verify plan) → Wave 3 Validate (verdict: validated / rejected / conditional; up to 2 retry passes on rejection) → Wave 4 Ship (fix-shipper opens a PR — ONLY on validated + high confidence; medium confidence routes the patch to the canonical doc for human review instead) → Wave 5 Transition (records solved / in_review / blocked on the source ticket) → Wave 6 Report (coordinator posts a consolidated Slack message to #C0B3FN70MEE with verdict, root cause, ticket outcome, and PR URL or decline reason; skipped on no-op runs).
- Proof-of-life #1 — 2026-05-11 blank-extension-page (SHIPPED to PR 2026-05-12). Dispatched 17:08, frontend finding 17:14, backend finding 17:30, validator 17:36, published 17:38 — diagnosis end-to-end in ~30 minutes with validated / high confidence verdict. Root cause pinned to apps/extension/app/state/accessors/user.tsx:45-50 (unbounded chrome.storage.local.get callback), proximate trigger commit d68726b29 (2026-04-09); 3-layer frontend-only fix proposed (5s getUser timeout, 8s recovery screen with pre-approved "Your tabs are safe. Tap to recover." copy, NewTabHangShown telemetry beacon); prior toby-product-strategist MV3-SW-boot-regression hypothesis refuted by independent backend evidence. Wave 4 closed the loop 2026-05-12: TOBY-14 bridged at 04:59 UTC, toby-incident-fix-shipper opened PR #12 at 05:08 UTC (~9 min inbox→PR) on branch warroom/2026-05-11-blank-extension-page-toby-14 (commit 06baf0f8a, base 75a09e34d). Files: apps/extension/app/state/accessors/user.tsx, apps/extension/app/hooks/useOnboarding2Draft.ts, apps/extension/app/containers/Toby.tsx, and new component apps/extension/app/components/StuckRecoveryScreen.tsx. Ship-result artifact: artifacts/toby-incident-fix-shipper/b3400d87-0830-4f89-bb70-4c3907c085f1/ship-result.md. TOBY-14 transitioned to done at 05:10 UTC. Ingestion summaries: artifacts/toby-pm/7a4c4afb-12dc-419f-808f-0e9a014417cd/incidents-2026-05-11-blank-extension-page-ingestion.md (diagnosis close) and artifacts/toby-pm/bdbce617-3091-43c4-9c01-20c16b19946c/incidents-2026-05-11-blank-extension-page-ship-update-ingestion.md (ship update).
- Proof-of-life #2 — 2026-05-12 retention_offers-silent (TOBY-6). Closed ~one day after warroom v1.0; Wave 0 discernment sweep opted in TOBY-6 (urgent, no agent owner, cross-cutting). Both doctors converged: backend disconfirmed the regression hypothesis (prod-api SHA stable since 2026-02-02; only insert site exercised correctly; no kill-switch / silent flag); frontend pinned the funnel-bypass paths. Coordinator drafted a Tier 1 instrumentation patch (10-LOC zap structured log in GetRetentionOffer at apps/api/context/v3/subscription_context.go ~L598). Validator returned validated + medium — caught compile defects in the draft diff (log.Info → ctx.Logger.Info, missing nil-guard, team.ID not in scope, struct-path corrections) and produced the corrected compile-ready replacement. Wave 4 correctly skipped auto-ship: per spec, medium-confidence verdicts route the corrected patch into the canonical doc for human Go-reviewer sign-off rather than opening a PR. TOBY-6 transitioned to in_review. The gate works in both directions — the 2026-05-11 high-confidence run shipped via the fix-shipper; this medium-confidence run correctly asked for a human pass. Ingestion summary: artifacts/toby-pm/00036a80-5931-405a-85ab-1e39ee3a545f/incidents-2026-05-12-retention-offers-silent-ingestion.md.
- How to file — file a Toby ticket with labels: ["needs-warroom"] (loose body shape: symptom / reproduce / when / anything-noticed). The ticket→warroom bridge (lib/tickets.ts → bridgeWarroomIfNeeded) writes an inbox file at toby/incidents/_inbox/YYYY-MM-DD-<ticket-id>-<slug>.md and stamps the ticket warroom-bridged. The workflow's next 09:00 UTC tick picks it up. Manual ticks via Workflows app remain an operator-only escape hatch for emergencies that can't wait one day; hand-dropping files into _inbox/ is deprecated.
- Hard guarantees — agents NEVER apply patches except via the fix-shipper on validated + high-confidence; agents NEVER touch _inbox/; agents NEVER delete docs. On medium / conditional / rejected verdicts, the doc carries a fix proposal as a diff — operator review decides whether to ship.
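The Wave 4 gate described above reduces to a two-input decision. The following is a minimal sketch of that rule, not the workflow's actual code: the type and function names are hypothetical, and a "low" confidence level is assumed for completeness (the source only names high and medium).

```typescript
type Verdict = "validated" | "rejected" | "conditional";
// "low" is an assumption; the incident docs only mention high and medium.
type Confidence = "high" | "medium" | "low";
type Wave4Action = "open_pr" | "route_to_doc";

// Only a validated verdict with high confidence lets the fix-shipper open a PR.
// Every other combination leaves the proposed diff in the canonical doc
// for operator review.
function wave4Action(verdict: Verdict, confidence: Confidence): Wave4Action {
  return verdict === "validated" && confidence === "high" ? "open_pr" : "route_to_doc";
}
```

Under this reading, the 2026-05-11 incident (validated, high) maps to open_pr and the 2026-05-12 incident (validated, medium) maps to route_to_doc, matching the two proof-of-life runs.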
Operations — Code review (online 2026-05-13)
New agent-owned operational surface — daily walk of commits to main with a triple-check skill (correctness / quality / security). The reviewer proposes — the operator (or follow-up tickets) ships. Source of truth: outputs in toby/code-reviews/.
- Agent — toby-code-reviewer (slug da1e2bb3-fbd9-454a-9848-4fe4e05089e3). Active, project=toby. Trigger: daily cron 0 7 * * * (07:00 UTC). Walks new commits to main since the last reviewed SHA; back-fills the most recent N commits on first run.
- Outputs — (1) a roll-up review doc at toby/code-reviews/YYYY-MM-DD-Ncommits.md containing "What shipped" (commit list) + "Findings" (severity × dimension, with code excerpts, suggested actions); (2) tickets filed against any finding rising to "real bug / security hole / missing test" threshold (improvement / bug / issue kind).
- Hard guarantees — reviewer NEVER patches the codebase. Even high-confidence findings exit as tickets, not PRs. Distinct from the warroom fix-shipper (which DOES open PRs but only on validated + high-confidence warroom incidents).
- Proof-of-life #1 — 2026-05-13 first review (10 commits back-fill). Run e5abf485-212b-4a2a-946e-882c6b5b22ec started 2026-05-13T07:00:18Z. First doc: toby/code-reviews/2026-05-13-10commits.md. Covered window: 10 commits on main ending at 75a09e34d (the sandcastle feat-commit). 3 medium + 4 low findings, with the three mediums each earning a filed ticket:
  - medium · quality — useSessionStart.ts:1-75 (commit 0f3aa38) has no test coverage on a 75-line hook driving a production analytics event. Phase 1 ship debt — should not gate Phase 2 but earns its own ticket.
  - medium · security — .sandcastle/scan-secrets.mts:1-182 (commit 75a09e3) is the only thing standing between an unattended LLM agent and a pushed credential — no test coverage on the patterns or the skip-list. Single-pattern regex, single-line, no multi-line obfuscation detection. Author's own comment acknowledges the limitation.
  - medium · security — .sandcastle/main.mts:602,648 (commit 75a09e3) — two env-var escape hatches (SANDCASTLE_SKIP_GATES_VERIFY=1, SANDCASTLE_SKIP_SECRET_SCAN=1) fully disable verification + secret-scan gates with only a console.warn. No audit trail ties a pushed branch back to which gate was waived.
  - Lower-severity findings: Slack backtick-fence-break formatting hazard on the CWS review monitor (b9bea18), correctness flag on EXPECTED_GATES = ['lint','build'] (typecheck + test intentionally omitted; lock-step risk when TS-clean lands), trust-boundary note on agent-controlled gh pr create --title, and a doc-only note on d68726b29 (the AuthWrapper hydration fix — review independently confirms the fix is correct, agreeing with the warroom finding that the issue was scope, not correctness).
  - Ingestion summary: artifacts/toby-pm/20dc862a-fd5c-4eba-a6e1-94f0300bd1e5/code-reviews-2026-05-13-10commits-ingestion.md.
- Cross-surface intelligence — independent triangulation against the warroom: the code-reviewer reviewed d68726b29 and reached the same conclusion (fix is correct) from a different angle (test-coverage lens, not downstream-blast-radius lens). The two reviews don't conflict; they cover orthogonal dimensions on the same commit. This is a useful pattern — different agents converging on the same artifact strengthen the dashboard's confidence in shared findings.
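The single-line limitation flagged on the secret scanner is easy to demonstrate. The sketch below is not the code in .sandcastle/scan-secrets.mts: the tok_ token format, the pattern, and the scanLines helper are all hypothetical. It only shows why a per-line regex misses a credential that has been split across two lines.

```typescript
// Hypothetical token format, used only for this illustration.
const TOKEN_PATTERN = /tok_[0-9a-f]{16}/;

// Tests each line independently and returns the 1-based numbers of matching lines.
function scanLines(source: string, pattern: RegExp): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((line, index) => {
    if (pattern.test(line)) hits.push(index + 1);
  });
  return hits;
}

// The same token, once on a single line and once split by string concatenation.
const plain = 'const key = "tok_0123456789abcdef";';
const split = ['const key = "tok_01234567" +', '  "89abcdef";'].join("\n");

const plainHits = scanLines(plain, TOKEN_PATTERN); // line 1 is flagged
const splitHits = scanLines(split, TOKEN_PATTERN); // nothing is flagged
```

Cases like the split variant are natural candidates for the unit tests the review asks for, since they pin down what the scanner is and is not expected to catch.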
Q2 2026 OKRs (7 weeks remaining)
Sourced directly from toby/strategy/playbook.md. Quarter ends 2026-06-30.
- O1 — Stop the bleed: restore CWS-rank trajectory and review average.
  - KR1: Ship blank-page reliability hotfix by 2026-05-24 — zero new 1-star "blank screen" reviews within 14 days. Diagnosis closed 2026-05-11; PR opened 2026-05-12 via warroom Wave 4 (toby/incidents/2026-05-11-blank-extension-page.md, TOBY-14 → done, PR #12, commit 06baf0f8a). Operator decision now: review, merge, build + CWS-deploy new extension version, then watch the 14-day telemetry window for NewTabHangShown and new 1-star reviews. Cross-confirmation 2026-05-13: code-reviewer independently confirms commit d68726b29 (the proximate trigger of the bug) is itself correct — the issue was the gate widening without bounding its new dependency, exactly what PR #12 fixes.
  - KR2: Publish rewritten CWS listing (title + description + social proof + CWV benchmark) by 2026-06-01 — target +20% install-conversion in the 4-week window.
  - KR3: CWS rolling 30-day review average stops declining (week-over-week non-negative) by 2026-06-30.
- O2 — Earn the activation moment: prove (or kill) the welcome A/B and lift D7.
  - KR1: Phase 2 welcome A/B fully instrumented and live at canary 5% by 2026-05-19. Baseline today: 0 commits — silent-slip risk.
  - KR2: At decision review 2026-05-26, at least one variant ≥34% D7 retention at n≥2,000/arm. Baseline: 32.92% V2-only.
  - KR3: New Adopter persona weekly stickiness rises from 58.4% to ≥65% by 2026-06-30.
- O3 — Find a price anchor that holds, and activate the dormant curator loop.
  - KR1: pricing-reality-reconcile complete by 2026-05-13 — one short doc, single authoritative number. Strengthened 2026-05-12: the blog pipeline surfaced a third inconsistent price (TheTab claims Toby is "$9/mo" in its public comparison table) — now three different prices across internal modeling, the Efficient.app listing, and a competitor's blog. Audit must cover all three.
  - KR2: role-based-paywall-gating design doc shipped (not built) by 2026-06-15 — defines which team / admin / sharing features move behind paid and which stay free.
  - KR3: 4 public collections featured (X + blog) by 2026-06-30. Target: 10 by end of Q3. toby-blog-seo has the recurring "Public collection of the week" curator series queued at rank-6 in toby/blog/pipeline.md — direct lever for this KR.
- Implicit cross-OKR finding (2026-05-12) — the warroom's second incident exposed a structural retention-funnel leak (~82% of cancels bypass the in-app retention modal; ~0.83% accept-rate on 30d). The Tier 2 follow-ups (hide Stripe-portal View link / configure flow_data redirect / give legacy + basic users an in-app cancel CTA) are upstream of monetization bets but not currently scored against any O3 KR. Surface to the strategist agent for the next playbook iteration; do not retrofit the existing OKR set this quarter.
- Implicit cross-OKR finding (2026-05-13) — the code-reviewer's first review surfaced two security-flavored findings against the new .sandcastle/ subsystem (commit 75a09e3) that aren't tied to any current OKR but matter because sandcastle is now a live path that opens PRs against main. If any agent-authored slice slips a credential past the regex scanner, OR if a CI runner exports SANDCASTLE_SKIP_*=1 without an audit trail, the autonomy story is materially weaker. Not blocking Q2 — surface as Open Questions; audit-log work could fit into engineering-hygiene scope alongside the housekeeping Tier 4 retention-secrets ticket.
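The two retention-funnel percentages follow directly from the 30-day counts quoted in this dashboard (120 cancels → 22 reasons → 1 accept). The sketch below reproduces the arithmetic; it assumes the 22 "reasons" are the cancels that reached the in-app retention modal, and the Funnel type and funnelRates helper are illustrative names only.

```typescript
interface Funnel {
  cancels: number; // all cancellations in the window
  reasons: number; // cancels that reached the in-app modal (assumed reading)
  accepts: number; // retention offers accepted
}

// Bypass rate: share of cancels that never reached the modal.
// Accept rate: accepted offers as a share of all cancels.
function funnelRates(f: Funnel): { bypassRate: number; acceptRate: number } {
  return {
    bypassRate: (f.cancels - f.reasons) / f.cancels,
    acceptRate: f.accepts / f.cancels,
  };
}

const rates = funnelRates({ cancels: 120, reasons: 22, accepts: 1 });
// rates.bypassRate = 98 / 120 ≈ 0.817 (~82%)
// rates.acceptRate = 1 / 120 ≈ 0.0083 (~0.83%)
```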
Immediate next steps
Ordered by playbook ICE + dependency. Reliability and pricing are upstream of everything else.
- Pricing reality audit (1h) — ICE 600, blocks role-based-paywall-gating. Cross-check CWS listing, gettoby.com, Stripe price IDs in production vs. the internal $4.50/mo modeling input; reconcile against public $6/$10 listing on Efficient.app and the $9/mo claim in TheTab's public comparison post (surfaced this run via the blog pipeline). OKR: O3 KR1, due 2026-05-13 — owner: TBD. Side benefit: also unblocks blog-pipeline price-claim guardrail (currently no blog post may mention price; pipeline complies) (from: toby/strategy/playbook.md, toby/strategy/bets.md#pricing-reality-reconcile, toby/blog/pipeline.md open questions + competitor-blog watch)
- Anchor #1 protection — Reliability hotfix review + merge + deploy — ICE 576. PR OPEN 2026-05-12 — PR #12 (branch warroom/2026-05-11-blank-extension-page-toby-14, commit 06baf0f8a). The 3-layer frontend-only fix specced in toby/incidents/2026-05-11-blank-extension-page.md is now in-tree: 5s timeouts fail-open on getUser() (apps/extension/app/state/accessors/user.tsx) and getOnboarding2Draft (apps/extension/app/hooks/useOnboarding2Draft.ts); StuckRecoveryScreen at 8s (apps/extension/app/containers/Toby.tsx + new component apps/extension/app/components/StuckRecoveryScreen.tsx); NewTabHangShown telemetry beacon. Operator steps: (1) review + merge PR #12 — validator already vetted race-safety, regression-safety against d68726b29, and copy; code-reviewer also independently confirms d68726b29's fix is correct; CI is the canonical typecheck/lint gate since no local check ran in the ephemeral worktree; (2) build a new extension version + push to Chrome Web Store; (3) monitor: 7-day baseline establishment on NewTabHangShown in Amplitude, watch CWS reviews for new 1-star "blank screen" complaints (target: zero in the 14-day window). No prod-api redeploy (both backend doctor and validator agreed — API is innocent). Side benefit: unblocks the blog pipeline's P3 power-user posts AND the recommended publish of the OneTab Alternative blog draft (currently rel-gated by toby-blog-seo). OKR: O1 KR1, due 2026-05-24 — owner: TBD (likely Jad given proximate commit ownership) (from: toby/incidents/2026-05-11-blank-extension-page.md PR shipped, toby/strategy/compass.md anchor 1, toby/strategy/bets.md#reliability-blank-page-fix, toby/blog/pipeline.md P3 gate + OneTab gate recommendation, toby/code-reviews/2026-05-13-10commits.md d68726b finding)
- Tier 1 retention instrumentation — Go-reviewer sign-off (2026-05-12) — validator returned validated + medium; Wave 4 correctly skipped auto-ship. The corrected compile-ready 10-LOC zap log line at apps/api/context/v3/subscription_context.go ~L598 (between eligibility evaluation and response build) is in toby/incidents/2026-05-12-retention-offers-silent.md. Any owner of apps/api/context/v3/ can approve. Patch is logging-only, additive, no behavior change, no schema change, 1-commit rollback. Once shipped, wait 24-48h then run the Cloud Logging query in the verify plan; expect 1-3 retention_offer_eligible events in the window. Source ticket TOBY-6 sits in in_review against this (from: toby/incidents/2026-05-12-retention-offers-silent.md fix tier 1)
- File Tier 2/3/4 retention follow-up tickets (2026-05-12) — incident doc recommends filing now with stub bodies pointing back to the canonical doc; backfill numbers in 14 days when Tier 1 telemetry flows. Tier 2 = product/FE work (hide Stripe-portal View link or configure flow_data redirect; legacy+basic in-app cancel CTA; investigate retention_yearly 0%-accept branch). Tier 3 = schema/analytics (add retention_offer_views table or status / offered_at / declined_at columns; wire Amplitude RETENTION_OFFER_SHOWN / DECLINED into BI). Tier 4 = housekeeping (5 missing TOBY_RETENTION* secrets in GCP Secret Manager; either create with current defaults or remove lookup). Tier 2 is a bigger monetization lever than Tier 1 (from: toby/incidents/2026-05-12-retention-offers-silent.md fix tiers 2-4)
- Code-reviewer's three auto-filed tickets — operator triage (NEW 2026-05-13) — the toby-code-reviewer run filed three tickets on its first pass: (1) useSessionStart hook needs unit-test coverage (improvement, medium — Phase 1 ship debt on the Session Start heartbeat hook; quiet-failure risk goes invisible in BQ until backfill); (2) .sandcastle/scan-secrets.mts needs unit-test coverage (improvement, medium — the only thing between an unattended LLM and a pushed credential has no tests on its patterns or skip-list); (3) sandcastle skip-flags need audit logging (issue, low — SANDCASTLE_SKIP_GATES_VERIFY=1 and SANDCASTLE_SKIP_SECRET_SCAN=1 waive gates with only a console.warn). Operator should triage priority + owner; (1) is straightforward unit tests, (2) + (3) are the autonomy story for .sandcastle/ (from: toby/code-reviews/2026-05-13-10commits.md filed tickets)
- CWS narrative-repair sprint — ICE 392. Retitle to "Toby — Tab Manager: Save Sessions, Cloud Sync & Notes", rewrite description with explicit cloud-sync mention, surface enterprise social proof, publish Core Web Vitals benchmark. Framing must NOT lean on cloud-sync as differentiator. OKR: O1 KR2, due 2026-06-01 — owner: TBD (from: research-docs/toby-delta-2026-05-05-v3.md, toby/strategy/bets.md#cws-narrative-repair)
- Public-collection pride loop — ICE 336, the under-pulled growth lever this quarter. Surface and reward Free-Tier Archivist creators via X "public collection of the week" + curator-spotlight slot on gettoby.com. Zero engineering. toby-blog-seo carries the recurring "Public collection of the week" blog series at queue rank-6 — pairs cleanly with the X surface. OKR: O3 KR3 — owner: TBD (likely toby-x-strategist + toby-blog-seo) (from: toby/strategy/playbook.md, toby/strategy/bets.md#public-collection-pride-loop, toby/blog/pipeline.md queue rank 6)
- Blog publish flow — operator decision. The agent writes drafts into the wiki at toby/blog/; the production blog lives at apps/landing/src/content/post/. Three sub-decisions stacked: (1) hand-copy approved drafts into the codebase repo or change the publish flow? (2) image hand-off — neither draft has a cover image; existing posts use ~/assets/images/blog/<post-folder>/<image>.png; the OneTab draft specifically needs a side-by-side "URL list vs visual collection" hero (the image carries the post's core claim); (3) confirm internal-link URL shape on gettoby.com before any inter-post link ships. Blocks: first publish of either of the two drafts in toby/blog/ (from: toby/blog/pipeline.md open questions)
- OneTab Alternative draft — reliability-gate decision. The toby-blog-seo agent explicitly recommends holding publish until the 3-layer reliability hotfix ships (O1 KR1, 2026-05-24) — the OneTab draft is feature-promotional (Save Session + new tab page) so publishing it pre-fix would drive switchers directly into the live blank-page bug. Foundational "Why 80 Tabs Open" draft can publish earlier (less feature-promotional). Operator owns the call; treating as a recommendation, not a hard rule (from: toby/blog/pipeline.md operator-decisions section, toby/blog/onetab-alternative.md editor notes, toby/incidents/2026-05-11-blank-extension-page.md)
- Distribution-loop decisions — blog drafts + X anchors. Two parallel calls: (A) Foundational draft + X Post 1 — both state "47 tabs is not a personality flaw"; option (a) X Post 1 links to the blog post (deep distribution, requires blog publish first), (b) treat as parallel statements of the same idea, no inter-link (X ships Tue 2026-05-12 regardless). Recommend (b) if publish-flow can't resolve by Tue 2026-05-12 morning. (B) OneTab draft + new X anchor — the pipeline doc explicitly calls out that the OneTab post needs its own dedicated X anchor (the Post 1 pair was already claimed). Operator should queue a new X post draft on the "URL list vs your work" angle once publish timing is known (from: toby/blog/pipeline.md open questions, toby/blog/onetab-alternative.md editor notes, toby/x/content-pipeline.md Post 1)
- X channel goes live this week — operator-driven. Step 1: confirm @TobyForTabs account creds are in operator's hands (acct-gate blocker; must clear by Mon 2026-05-11). Step 2: pin Post 13 from toby/x/content-pipeline.md (🔒 acct-only, no price claim, no feature screenshots — safe ahead of pricing reconcile + rel hotfix). Step 3: post Post 1 on Tue 2026-05-12, Post 2 Wed 2026-05-13, Post 3 Thu 2026-05-14. Step 4: @nibzard Tier-A reply within 48h (thread window closes ~2026-05-12). Fallback: if @nibzard window closes before reply ships, pivot Tier-A lead to @airplanestar_ (heavy-organiser, 5.5k followers) or @wayne_effect (lost-tab-grief canonical voice). Zero engineering — pure operator execution (from: toby/x/engagement-targets.md, toby/x/content-pipeline.md)
- Phase 2 welcome A/B at canary 5% by 2026-05-19 — ICE 192. Confirm Core Services /v1/experiments endpoint shape, register onboarding-welcome-v1, ship Slices 1–2 (prefetchWelcomeExperiment + new hook + withExperimentProps). OKR: O2 KR1, due 2026-05-19 — owner: Jad (from: tasks/phase2-todo.md, toby/strategy/bets.md#phase-2-welcome-ab)
- Publish "Chrome 133 vs. Toby" comparison page targeting Perplexity / ChatGPT / Claude recommendation flows — Should bet. Lean on axioms 1+2+3 (visual tangibility, ambient surface, persistence), NOT cloud sync. Now also queued by toby-blog-seo at rank-7 in the blog pipeline — owner: TBD (from: research-docs/toby-delta-2026-05-05-v3.md, toby/strategy/bets.md#chrome-133-vs-toby-comparison-page, toby/blog/pipeline.md queue rank 7)
- Founder-level decision: Atlassian/Dia partnership/context-API discovery meeting — explicitly deferred this quarter by playbook anti-bet (one-way door; decision belongs at founder level + after the v3 research dossier's Sept-2026 check-in). Re-raise post Q2 — owner: founder (from: research-docs/toby-delta-2026-05-05-v3.md, toby/strategy/playbook.md anti-bets)
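The fail-open timeout at the heart of the reliability hotfix step above can be sketched generically. This is not the code in PR #12: getWithTimeout and CallbackGetter are hypothetical names, and the fallback value the real getUser() resolves to on timeout is not stated in this dashboard. The sketch only shows the technique of bounding a callback that may never fire.

```typescript
// Callback-style getter, standing in for a call such as chrome.storage.local.get.
type CallbackGetter<T> = (callback: (value: T) => void) => void;

// Resolves with the getter's value, or with `fallback` when the callback has
// not fired within `timeoutMs`. The promise never rejects: a hang or a
// synchronous throw both degrade to the fallback (fail-open).
function getWithTimeout<T>(
  getter: CallbackGetter<T>,
  fallback: T,
  timeoutMs: number,
): Promise<T> {
  return new Promise<T>((resolve) => {
    let settled = false;
    let timer: ReturnType<typeof setTimeout> | undefined;
    const settle = (value: T): void => {
      if (settled) return; // ignore a late callback or a late timer
      settled = true;
      if (timer !== undefined) clearTimeout(timer);
      resolve(value);
    };
    timer = setTimeout(() => settle(fallback), timeoutMs);
    try {
      getter(settle);
    } catch {
      settle(fallback);
    }
  });
}
```

In the extension this might be called with a getter that wraps chrome.storage.local.get and a 5000 ms timeout. Whichever of the callback or the timer fires first wins, and the loser is ignored, so a storage callback that arrives after the timeout cannot overwrite the fallback.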
Phase / Milestone progress
- Phase 1 — Session Start + 1.13.0 cleanup: SHIPPED. Extension 1.13.0 published 2026-04-14; signup-position A/B experiment removed (ship "end" variant); AuthWrapper gated on user hydration; experimentEntityId race fixed; 4h Session Start heartbeat live with 60k/day projection and 180k/day halt threshold (commits 0f3aa38d2, bc5e45305, d68726b29, cde22c935, 9d6e8e4f3). 2026-05-13 review surfaced ship-debt: useSessionStart.ts hook has no test coverage on its 4h-rate-limit / days_since_signup parse / storage-write-ordering logic — quiet-failure risk goes invisible until BQ backfill. Ticket filed (see: toby/code-reviews/2026-05-13-10commits.md useSessionStart finding).
- Phase 2 — Welcome A/B + shared onboarding spine: PLANNED, not yet in flight. 12 slices specced with halt triggers; canary stages 5%→20%→50%→100%; decision review 2026-05-26. Playbook O2 KR1 sets a hard date: canary 5% live by 2026-05-19 or O2 is at risk (see: tasks/phase2-todo.md, tasks/onboarding-experiment-plan.md, toby/strategy/playbook.md OKRs)
- CWS Review Monitor (Cloud Run job): SHIPPED + reliability-hardened. AI-drafted responses live since 2026-03-30 (ba247d9ac); fallback Slack message added 2026-04-29 (commit b9bea18cd). 2026-05-13 review surfaced a low-severity formatting hazard on the fallback message: `escapeSlackMrkdwn` doesn't strip backticks, so a `draftErr.Error()` echoing a triple-backtick body breaks the Slack fence. Doc-only, no ticket (see: toby/code-reviews/2026-05-13-10commits.md b9bea18 finding).
- Retention discount — frontend integration CONFIRMED SHIPPED (via 2026-05-12 warroom diagnosis). Backend live since `cbc92a78d`; FE dispatch site is live at `apps/extension/app/components/Modal/Downgrade/CancelSubscription.tsx:643-709`; `RETENTION_OFFER_DECLINED` Amplitude wiring live at `:622-627`. The previous worklog.md "pending" entry from Jan 2026 is stale — the integration shipped at some point between then and 2026-05-12. The funnel works for users who reach the in-app retention modal (17 all-time accepts, 16 `retention_legacy` + 1 `retention_yearly`) — what's broken is structural: ~82% of cancels bypass the modal entirely. See open question on retention-funnel structural leak (see: toby/incidents/2026-05-12-retention-offers-silent.md)
- Monorepo flatten + Turborepo / pnpm workspaces migration: SHIPPED late March 2026 — `apps/api`, `apps/extension`, `apps/landing`, `apps/mobile` (commits 87bec6267, a90230ce1, 134f9bb90, ec843c5a2, c5545cbd5, 2574b5379, 5bd961266)
- Auto-generated architecture docs (`docs/architecture/_index.yaml`, flows, Mermaid diagrams): SHIPPED Jan 2026 — 21 controllers, 86 endpoints, 70 mutations, 111 tracked events parsed (see: worklog.md, CLAUDE.md)
- Strategic research v3 (Phase 0.5–9 pipeline) + delta vs v2: SHIPPED 2026-05-05 (see: research-docs/toby-research-2026-05-05-v3.md, research-docs/toby-delta-2026-05-05-v3.md)
- Strategy spine — SHIPPED to wiki 2026-05-10. Three co-equal docs: Compass (identity / axioms / anchors), Bets (rolling ICE-scored queue), Q2 2026 Playbook (growth thesis, OKRs, anti-bets, red-team).
- X execution surface — SHIPPED to wiki 2026-05-10. Three docs at `toby/x/`: strategy.md, content-pipeline.md, engagement-targets.md. Channel goes operational Tue 2026-05-12 (first scheduled post) contingent on operator confirming @TobyForTabs creds by Mon 2026-05-11.
- Incident warroom — STOOD UP 2026-05-11 + PROVEN ON BOTH RAILS. New folder `toby/incidents/` plus `README.md`. AIOS workflow `Toby Incident Response` (id `9b78790f-2aea-4f65-876f-53d1a114c3ae`) active; five agents on roster (coordinator + frontend doctor + backend doctor + validator + fix-shipper). Two canonical incidents now both have ship-outcomes: `toby/incidents/2026-05-11-blank-extension-page.md` (high-confidence, ~30 min diagnosis + Wave 4 fix-shipper opened PR #12 on 2026-05-12 — first end-to-end auto-ship) and `toby/incidents/2026-05-12-retention-offers-silent.md` (medium-confidence — Wave 4 correctly skipped auto-ship; Tier 1 patch in canonical doc awaiting human Go-reviewer sign-off). Both rails now exercised: high-confidence ships, medium-confidence routes for human review.
- Reliability blank-page hotfix — PR OPEN 2026-05-12. PR #12 against
`axiomzen/toby-mono-repo:main` (branch `warroom/2026-05-11-blank-extension-page-toby-14`, commit `06baf0f8a` on base `75a09e34d`). 3-layer frontend-only fix from the 2026-05-11 incident doc lands as: edits to `apps/extension/app/state/accessors/user.tsx` (Layer 1 5s `getUser()` timeout), `apps/extension/app/hooks/useOnboarding2Draft.ts` (Layer 1 on `isReady`/`isDraftReady`), `apps/extension/app/containers/Toby.tsx` (Layer 2 `StuckRecoveryScreen` at 8s + Layer 3 `NewTabHangShown` beacon), and a brand-new component `apps/extension/app/components/StuckRecoveryScreen.tsx` with pre-approved copy "Your tabs are safe. Tap to recover.". Deliberately NOT in PR #12 (queued as follow-ups, not blockers): Layer-1 for `isInitializing`, the SW-hardening trio, the Layer-1 `NewTabHydrationTimeout` beacon. Fix-shipper run `b3400d87-0830-4f89-bb70-4c3907c085f1`; source ticket TOBY-14 (status `done`). (see: toby/incidents/2026-05-11-blank-extension-page.md PR-shipped section, artifacts/toby-pm/bdbce617-3091-43c4-9c01-20c16b19946c/incidents-2026-05-11-blank-extension-page-ship-update-ingestion.md)
- Blog & SEO pipeline — STOOD UP 2026-05-12 (Run #1) + RUN #2 SHIPPED 2026-05-12. Folder `toby/blog/` now contains the pipeline doc + two drafts. Run #2 also migrated legacy flat `toby/blog-*.md` files into the sub-folder (structural move, no content edits). Agent `toby-blog-seo` produces `toby/blog/pipeline.md` (2-week cadence, 5 pillars mirroring X, 7 ranked queued topics, explicit NOT-pursued list, voice fingerprint anchored to `toby/strategy/compass.md` and the X strategy) and drafts as the cycle rolls. Shipped drafts: `toby/blog/why-you-have-so-many-tabs-open.md` (P1 top-funnel, 2026-05-09, paired with X Post 1) and `toby/blog/onetab-alternative.md` (P5 mid-funnel, 2026-05-12, "URL list → workspace" reframe). Agent writes drafts only — operator hand-copies to `apps/landing/src/content/post/` to publish. All existing guardrails honored: no price claims, P3 rel-gated, no AI pre-announce, no holiday filler, cordial silence on competitors. Ingestion summaries: `artifacts/toby-pm/ea1223d6-…/blog-pipeline-ingestion.md` (Run #1) + `artifacts/toby-pm/bf6972e6-…/blog-pipeline-run2-ingestion.md` (Run #2).
- Code-review channel — ONLINE 2026-05-13 (Run #1). New folder
`toby/code-reviews/` with agent `toby-code-reviewer` on a daily `0 7 * * *` cron. First doc: `toby/code-reviews/2026-05-13-10commits.md` — back-fill of the 10 most recent commits on `main` (window ends at `75a09e34d`). 3 medium + 4 low findings; the three mediums each filed a ticket: `useSessionStart` no-tests (Phase 1 ship debt), `.sandcastle/scan-secrets.mts` no-tests (LLM-credential-safety net has no validation), sandcastle skip-flags audit-trail missing. Reviewer NEVER patches — it files tickets only (orthogonal to the warroom fix-shipper). Ingestion summary: `artifacts/toby-pm/20dc862a-fd5c-4eba-a6e1-94f0300bd1e5/code-reviews-2026-05-13-10commits-ingestion.md`.
- Sandcastle agentic-workflow subsystem — SHIPPED 2026-05-13 via commit `75a09e34d`. New top-level codebase artifact at `.sandcastle/` (orchestrator `main.mts`, secret scanner `scan-secrets.mts`). Host-side, runs lint + build gates and an added-lines secret regex, then opens a PR against `main` via `gh pr create`. Two security-flavored open questions (see below) from the code-reviewer's first review: no test coverage on the secret patterns / skip-list; no audit trail when env-var escape hatches are set. This is separate infrastructure from the warroom fix-shipper — sandcastle ships agent-authored slices, fix-shipper ships warroom-validated incident patches.
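
For orientation, an added-lines secret scan with a lockfile/binary skip-list could look roughly like the sketch below. This is not the contents of `.sandcastle/scan-secrets.mts`: the function name `scanAddedLines`, the two patterns, and the skip-list are all illustrative assumptions, chosen only to show the shape that the missing tests would need to cover.

```typescript
// Hypothetical sketch only; names, patterns, and skip-list are assumptions,
// not the real .sandcastle/scan-secrets.mts.

interface SecretFinding {
  file: string;
  pattern: string;
  line: string;
}

// Illustrative patterns; a real scanner needs a vetted, tested list.
const SECRET_PATTERNS: Array<{ name: string; regex: RegExp }> = [
  { name: "aws-access-key-id", regex: /AKIA[0-9A-Z]{16}/ },
  { name: "private-key-header", regex: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
];

// Files skipped entirely (lockfiles and a few binary extensions).
const SKIP_FILE = /(pnpm-lock\.yaml|package-lock\.json|yarn\.lock|\.(png|jpg|gif|woff2?))$/;

function scanAddedLines(diff: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  let file = "";
  for (const raw of diff.split("\n")) {
    if (raw.startsWith("+++ ")) {
      // "+++ b/path" names the file the following hunks modify.
      // Known limitation: an added line whose content starts with "++ " is misread as a header.
      file = raw.slice(4).replace(/^b\//, "");
      continue;
    }
    // Only added lines are inspected; context and removed lines are ignored.
    if (!raw.startsWith("+") || file === "" || SKIP_FILE.test(file)) continue;
    const line = raw.slice(1);
    for (const { name, regex } of SECRET_PATTERNS) {
      if (regex.test(line)) findings.push({ file, pattern: name, line });
    }
  }
  return findings;
}
```

Keeping the scan a pure function of the diff text is what makes the patterns and the skip-list cheap to unit-test, which is the gap the reviewer flagged.
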
Roadmap
Next 2 weeks
- Pricing audit complete by 2026-05-13 (O3 KR1) — single authoritative number doc. Audit must now cover three inconsistent prices: internal $4.50/mo, public $6/$10 Efficient.app listing, and TheTab's $9/mo claim (surfaced via `toby-blog-seo` competitor-blog watch). Also unlocks the blog pipeline's price-claim guardrail (see: toby/strategy/playbook.md, toby/blog/pipeline.md open questions + competitor blogs)
- X channel goes operational Tue 2026-05-12 — first scheduled post (Post 1 from `toby/x/content-pipeline.md`); @nibzard Tier-A reply within 48h; @TobyForTabs creds-gate must clear by Mon 2026-05-11 first (see: toby/x/engagement-targets.md)
- Blog channel goes operational this cycle — two drafts now written; path to first publish is the operator publish-flow decision (hand-copy to `apps/landing/src/content/post/` vs. flow change), image hand-off (OneTab draft specifically needs a "URL list vs visual collection" hero), and internal-link URL convention confirmation. Recommend targeting Tue 2026-05-12 for the foundational tab-hoarder post (distribution-loop alignment with X Post 1); hold the OneTab Alternative draft until the reliability hotfix ships per `toby-blog-seo` recommendation (see: toby/blog/pipeline.md, toby/blog/onetab-alternative.md editor notes, toby/x/content-pipeline.md Post 1)
- Reliability hotfix merge + CWS deploy by 2026-05-24 (O1 KR1) — PR #12 is open. Operator path: review → merge → bump extension version → CWS-deploy → 7-day Amplitude baseline on
`NewTabHangShown` + 14-day CWS-review watch for "blank screen" 1-stars (target: zero). Unblocks blog pipeline P3 posts AND unlocks publish of the OneTab Alternative draft as side effects (see: toby/incidents/2026-05-11-blank-extension-page.md PR shipped, toby/strategy/compass.md anchor 1, toby/blog/pipeline.md P3 gate + OneTab gate)
- Tier 1 retention-instrumentation Go-reviewer sign-off (2026-05-12) — corrected 10-LOC zap log patch is in the canonical incident doc; awaiting one human review pass from a Toby Go reviewer (any owner of `apps/api/context/v3/`). After merge + deploy, wait 24-48h then run the Cloud Logging query in the verify plan and file Tier 2/3/4 follow-ups with real numbers attached (see: toby/incidents/2026-05-12-retention-offers-silent.md)
- File Tier 2/3/4 retention follow-up tickets (2026-05-12) — file now with stub bodies pointing back to the canonical doc to prevent work-getting-lost. Tier 2 is the bigger monetization lever (hide Stripe-portal link, configure `flow_data` redirect, give legacy users an in-app cancel CTA, investigate why `retention_yearly` accepts ~0% of non-legacy users) (see: toby/incidents/2026-05-12-retention-offers-silent.md)
- Triage the three code-reviewer auto-filed tickets (NEW 2026-05-13) — assign owners, set priorities. (1) `useSessionStart` unit tests is the cheapest win and lives on Phase 1 already-shipped surface; (2) `.sandcastle/scan-secrets.mts` tests are the autonomy-story gate — should be filed before any sandcastle-authored slice opens its first real PR; (3) sandcastle audit-log is the partner to (2) — both are cheap, both close a non-zero-risk gap that opens whenever an agent has push access (see: toby/code-reviews/2026-05-13-10commits.md filed tickets)
- Team-buyer X-pillar drop decision by 2026-05-17 — engagement-targets bucket recap is still zero candidates. Operator must confirm whether to drop the pillar from X entirely or supply target shape (see: toby/x/engagement-targets.md bucket recap)
- Phase 2 welcome A/B at canary 5% by 2026-05-19 (O2 KR1) — Slices 1–2 (experiment plumbing + `withExperimentProps` across 17 `ONBOARDING_V2_*` sites). Decision review 2026-05-26. Per playbook red-team residual, if 2026-05-19 arrives with no Phase 2 commits, O2 is dead — operator escalation (see: tasks/phase2-todo.md, toby/strategy/playbook.md)
- CWS narrative-repair sprint kickoff (O1 KR2 work begins) — retitle, description rewrite, enterprise social proof, CWV benchmark (see: research-docs/toby-delta-2026-05-05-v3.md)
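
On the `useSessionStart` test ticket in the triage item above: one low-cost way to make the 4h rate limit and the `days_since_signup` parse testable is to pull them out of the hook as pure functions. The sketch below is a guess at that shape, not the real `useSessionStart.ts`; both function names, the null-handling, and the "emit when the clock moves backwards" policy are assumptions.

```typescript
// Hypothetical sketch; the real useSessionStart.ts hook is not reproduced here.
const SESSION_START_INTERVAL_MS = 4 * 60 * 60 * 1000; // the 4h heartbeat window

// True when a new Session Start event should be emitted.
// lastEmittedAtMs is null when nothing has been stored yet.
function shouldEmitSessionStart(lastEmittedAtMs: number | null, nowMs: number): boolean {
  if (lastEmittedAtMs === null || !Number.isFinite(lastEmittedAtMs)) return true; // no usable record
  if (nowMs < lastEmittedAtMs) return true; // assumption: clock moved backwards, emit rather than stall
  return nowMs - lastEmittedAtMs >= SESSION_START_INTERVAL_MS;
}

// Whole days since signup, or null when the stored value does not parse
// or lies in the future.
function daysSinceSignup(signupIso: string, nowMs: number): number | null {
  const signupMs = Date.parse(signupIso);
  if (Number.isNaN(signupMs) || signupMs > nowMs) return null;
  return Math.floor((nowMs - signupMs) / (24 * 60 * 60 * 1000));
}
```

With the decisions isolated like this, boundary cases (exactly 4h, unparseable signup date) can be asserted without mocking storage or the React lifecycle; storage-write ordering would still need a separate hook-level test.
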
Next month
- CWS listing rewrite published by 2026-06-01 (O1 KR2) — measure +20% install-conversion over 4 weeks (see: toby/strategy/playbook.md, toby/strategy/bets.md#cws-narrative-repair)
- Blog cadence rolls — at 2-week cadence with two drafts shipped 2026-05-09 and 2026-05-12, the operator should publish the next post (Bookmarks vs Tab Manager · queue rank-1) on ~2026-05-26 and the post after that (How to save Chrome tabs · queue rank-2) on ~2026-06-09; once reliability ships, the first P3 post enters the queue (see: toby/blog/pipeline.md queue)
- Code-review cadence rolls — daily 07:00 UTC. Each new commit landing on `main` (whether human-authored, sandcastle-authored, or warroom-fix-shipper-authored) enters the review window. Expect a steady cadence of ticket-shaped findings; the dashboard will surface medium+ findings as they appear and treat the doc index as the rolling archive (see: toby/code-reviews/)
- Tier 1 retention-instrumentation verify window (~14 days post-merge) — once Tier 1 ships, the Cloud Logging query in the canonical doc lets us compute the first-ever FE funnel ratio (`retention_offer_eligible` events vs. `cancellation_reasons` rows). Becomes the disambiguation between "modal renders but is declined" and "modal-never-calls-the-endpoint" — i.e. settles whether the bigger lever is in copy/offer-strength (Tier 2c) or in routing more cancels through the modal (Tier 2a + 2b) (see: toby/incidents/2026-05-12-retention-offers-silent.md verify plan)
- Phase 2 Slices 3–10: workspace-name slide removal + inline rename nudge, real-tab fallback in SaveTabsSlide, seed-on-skip starter content, ExtensionMenuSlide demoted to showcase, auth-as-modal-overlay on blurred dashboard, new ONBOARDING_V2_OPEN_TAB event/slide (see: tasks/phase2-todo.md)
- Phase 2 canary 5% → 20% → 50% → 100%; kill if neither variant hits 34% D7 at n≥2,000/arm at the 2026-05-26 decision review (see: tasks/phase2-todo.md, tasks/onboarding-experiment-plan.md)
- `role-based-paywall-gating` design doc shipped by 2026-06-15 (O3 KR2) — defines which team/admin/sharing features move behind paid and which stay free under compass anchor #4 (see: toby/strategy/playbook.md, toby/strategy/bets.md#role-based-paywall-gating)
- Public "Chrome 133 vs. Toby" comparison page — axioms 1/2/3, NOT cloud sync. Free version only; paid push is anti-bet. Now also queued at rank-7 in `toby/blog/pipeline.md` (see: toby/strategy/playbook.md anti-bets)
- Curator loop: ≥4 public collections featured by 2026-06-30 (O3 KR3) — first featured collection ideally amplified via X + the new blog "Public collection of the week" recurring series (see: toby/strategy/playbook.md O3 KR3, toby/x/content-pipeline.md, toby/blog/pipeline.md queue rank 6)
- Reliability follow-ups (non-blocking on the incident close): apply Layer-1 timeout shape to `isInitializing` (`useIsRestoring()` IDB path in `Toby.tsx:168-275`); SW hardening (`.catch()` on `persistQueryClientRestore` at `background.ts:14`, `AbortController` + 10s timeout in `contextMenus.ts:145-191`, unified `chromeStorageGet<T>(keys, { timeoutMs })` helper to replace every raw `chrome.storage.local.get` callsite); Layer-1 `NewTabHydrationTimeout` telemetry beacon so the common 5s recovery path is visible in Amplitude (see: toby/incidents/2026-05-11-blank-extension-page.md follow-ups)
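
The unified storage helper named in the reliability follow-ups is not specced beyond its signature. As a rough illustration of the fail-open timeout shape, a bounded read might be sketched as below. `getWithTimeout`, `StorageGet`, and the `fallback` option are hypothetical names; the injected `get` stands in for `chrome.storage.local.get` so the sketch runs outside an extension.

```typescript
// Hypothetical sketch of a bounded, fail-open storage read.
// The injected `get` stands in for chrome.storage.local.get.
type StorageGet<T> = (keys: string[]) => Promise<T>;

interface TimeoutOptions<T> {
  timeoutMs: number;
  fallback: T; // returned when the read does not settle in time (fail-open)
}

function getWithTimeout<T>(get: StorageGet<T>, keys: string[], opts: TimeoutOptions<T>): Promise<T> {
  return new Promise<T>((resolve) => {
    let settled = false;
    const timer = setTimeout(() => finish(opts.fallback), opts.timeoutMs);

    function finish(value: T): void {
      if (settled) return; // first result wins; late results are ignored
      settled = true;
      clearTimeout(timer);
      resolve(value);
    }

    // Promise.resolve().then(...) also routes synchronous throws from `get`
    // into the rejection handler, so errors fail open as well.
    Promise.resolve()
      .then(() => get(keys))
      .then(finish, () => finish(opts.fallback));
  });
}
```

The returned promise never rejects, so a caller such as a hydration gate always gets either the stored value or the fallback within `timeoutMs`. Whether errors should fail open or surface separately is a design choice this sketch does not settle.
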
Next quarter / beyond
- Curator loop scales to 10 featured public collections by EoQ3 (see: toby/strategy/playbook.md O3 KR3)
- Role-based feature gating built (after Q2 design doc lands) — 2× conversion benchmark on Multi-User Collaborator + Free-Tier Archivist segments (see: research-docs/toby-delta-2026-05-05-v3.md, toby/strategy/bets.md#role-based-paywall-gating)
- MCP integration v1 as a B2B / Team-plan unlock (see: product/ideas/mcp-integration.md)
- AI feature relaunch deferred to Q4 2026. No pre-announce — compass anti-promise + playbook anti-bet (blog pipeline explicitly honors this in its NOT-pursued list) (see: research-docs/toby-delta-2026-05-05-v3.md, toby/strategy/compass.md, toby/strategy/playbook.md, toby/blog/pipeline.md NOT-pursued)
- Atlassian / Dia ecosystem partnership decision — explicitly deferred past Q2 by playbook anti-bet; revisit at founder level after Sept-2026 v3 falsifiable check-in (see: research-docs/toby-delta-2026-05-05-v3.md, toby/strategy/playbook.md anti-bets)
- Falsifiable Sept 2026 check-in: blank-page closed + onboarding D1 lift + paywall restructured + Atlassian/Linear public move; otherwise pivot-or-wind-down by EOY 2026 (see: research-docs/toby-research-2026-05-05-v3.md)
- North Star metric: Weekly Card Opens (~37k users); supporting metric: Weekly Card Saves (~9k users) (see: worklog.md)
Recent shipments
The 14d window has been broken open: PR #12 (the reliability hotfix) lands code in-tree for the first time since 2026-04-29, and two new agent-owned operational surfaces (code-review channel, sandcastle subsystem) came online this week.
- Operational surface: Code-review channel online — new agent `toby-code-reviewer` (daily 07:00 UTC cron) shipped its first review at `toby/code-reviews/2026-05-13-10commits.md`, back-filling the 10 most recent commits to `main` (window ends at `75a09e34d`). 3 medium + 4 low findings; three tickets filed (Session Start hook no-tests; `.sandcastle/scan-secrets.mts` no-tests; sandcastle skip-flags audit-trail missing). Reviewer files tickets only — never patches. Run `e5abf485-212b-4a2a-946e-882c6b5b22ec`; ingestion summary `artifacts/toby-pm/20dc862a-fd5c-4eba-a6e1-94f0300bd1e5/code-reviews-2026-05-13-10commits-ingestion.md` (2026-05-13)
- Code: Sandcastle agentic-workflow subsystem — commit `75a09e34d` lands `.sandcastle/main.mts` (orchestrator: lint+build gates, retry, `gh pr create`) and `.sandcastle/scan-secrets.mts` (added-lines secret regex with skip-list for lockfiles + binaries). Host-side rail for unattended agent slice execution that opens PRs against `main`. Separate from the warroom fix-shipper — sandcastle ships agent-authored slices, fix-shipper ships warroom-validated incident patches; both rails end in a PR but originate differently. Two security-flavored findings already filed against the subsystem by the code-reviewer (no tests on the scanner; no audit trail on env-var skip flags) (2026-05-13)
- Code: Reliability blank-page hotfix shipped to PR — PR #12 on
`axiomzen/toby-mono-repo`, branch `warroom/2026-05-11-blank-extension-page-toby-14`, commit `06baf0f8a` (base `75a09e34d` on `origin/main`). 3-layer frontend-only fix from the 2026-05-11 incident: 5s `getUser()` + `getOnboarding2Draft` fail-open timeouts; `StuckRecoveryScreen` at 8s; `NewTabHangShown` telemetry beacon. Files: `apps/extension/app/state/accessors/user.tsx`, `apps/extension/app/hooks/useOnboarding2Draft.ts`, `apps/extension/app/containers/Toby.tsx`, and net-new `apps/extension/app/components/StuckRecoveryScreen.tsx`. First end-to-end exercise of Wave 4 of the warroom workflow — `toby-incident-fix-shipper` ran in `b3400d87-0830-4f89-bb70-4c3907c085f1`, source ticket TOBY-14 transitioned to `done` at 05:10 UTC. (2026-05-12)
- Wiki: Second canonical warroom incident — retention_offers silent (TOBY-6) — `toby/incidents/2026-05-12-retention-offers-silent.md` (status `closed` diagnosis; verdict `validated + medium`; Wave 4 correctly skipped auto-ship; TOBY-6 → `in_review`). Doctors converged: backend cleared the API (prod-api SHA stable since 2026-02-02; only insert site exercised correctly; no kill-switch / silent flag); frontend pinned three structural FE-orchestration bypass paths (Stripe-portal `View` link preloaded inside in-app Subscription panel, Stripe renewal-email "Manage subscription" links to the same portal, `team_legacy`/`team_basic` users with no in-app cancel CTA at all). Coordinator drafted Tier 1 instrumentation patch (10-LOC zap structured log at `apps/api/context/v3/subscription_context.go` ~L598); validator caught compile defects, returned corrected compile-ready replacement, returned `medium` confidence. Three strategic findings: (1) ~82% of cancels bypass the in-app retention modal entirely — structural funnel leak, product-shaped not regression; (2) `CancelSubscription.tsx` frontend integration did ship at some point, resolving the long-running open question; (3) of users who DO reach the modal, only ~5-10% accept — backend can't disambiguate "show-without-accept" from "decline" because schema is accept-only. Ingestion summary: `artifacts/toby-pm/00036a80-5931-405a-85ab-1e39ee3a545f/incidents-2026-05-12-retention-offers-silent-ingestion.md` (2026-05-12)
- Wiki: Blog & SEO pipeline Run #2 + second draft —
`toby/blog/pipeline.md` regenerated 2026-05-12 09:00 UTC with a structural migration applied (legacy flat `toby/blog-*.md` files moved into the new `toby/blog/` sub-folder, no content edits during relocation) plus a second published draft: `toby/blog/onetab-alternative.md` — P5 mid-funnel, ~1,500 words, "URL list → workspace" reframe anchored by OneTab's own December-2025 troubleshooting-page data-loss warning. The draft is generous to OneTab, explicitly self-routes Workona-shaped readers, and honors the "no punch-down" anti-bet. Pipeline queue reordered (Bookmarks vs Tab Manager now rank-1; Chrome 133 vs. Toby added at rank-7). New pricing signal surfaced via competitor-blog watch: TheTab's blog claims Toby is "$9/mo" — third inconsistent price (alongside internal $4.50/mo and public $6/$10) that strengthens the urgency of `pricing-reality-reconcile`. Pipeline recommends holding OneTab draft publish until the reliability hotfix ships (it's feature-promotional). Ingestion summary: `artifacts/toby-pm/bf6972e6-dbf0-4794-bea0-9da5e89afdd2/blog-pipeline-run2-ingestion.md` (2026-05-12)
- Wiki: Blog & SEO pipeline run #1 — new agent-owned surface at `toby/blog/`. Pipeline doc establishes 2-week cadence, 5 pillars mirroring X, voice fingerprint anchored to the compass, 7 ranked queued topics, explicit NOT-pursued list. First draft published as a wiki doc: Why You Have 80 Tabs Open (And Why That's Actually Fine) — P1 top-funnel, explicitly paired with X Post 1. Operator decision now owns the publish-flow handoff to `apps/landing/src/content/post/` (2026-05-12)
- Wiki: First incident closed by warroom —
`toby/incidents/2026-05-11-blank-extension-page.md` (status `closed`, verdict `validated` high-confidence). End-to-end in ~30 min: frontend doctor pinned `apps/extension/app/state/accessors/user.tsx:45-50` (unbounded `chrome.storage.local.get` callback); backend doctor cleared the API (prod-api SHA stable since 2026-02-02, 0 5xx, refuting the prior MV3-SW-boot-regression hypothesis); coordinator drafted a 3-layer frontend-only fix; validator confirmed race-safety and `d68726b29` regression-check. Agents never patched the codebase — operator review owns the ship decision against the O1 KR1 deadline (2026-05-24) (2026-05-11)
- Wiki: Incident warroom stood up at `toby/incidents/` — folder seeded with onboarding `README.md`; AIOS workflow `Toby Incident Response` (id `9b78790f-2aea-4f65-876f-53d1a114c3ae`) active; four-agent team (`toby-incident-coordinator`, `toby-frontend-doctor`, `toby-backend-doctor`, `toby-incident-validator`) on roster. Daily cron at 09:00 UTC reads `_inbox/` then the ticket queue; manual ticks remain as an operator escape hatch. The earlier dashboard claim of a file_event trigger was inaccurate — corrected 2026-05-12 when the ticket→warroom bridge landed (2026-05-11)
- Wiki: X engagement-targets list re-generated at
`toby/x/engagement-targets.md` — 26 ranked ICP targets (6 Tier-A / 11 Tier-B / 9 Tier-C) with reply drafts, "what NOT to do" guardrails, and updated competitor watch list (Uncluttr, TabVault Pro, ThoughtFold, tab-out, leap-tabs.com flagged do-not-engage). New bot-filter rule applied: followers ≥ 100 AND account < 2025-06. Four new Tier-A targets added (@airplanestar_, @tropicanacailin, @wayne_effect, @benrayfield); four new Tier-B switchers added; ThoughtFold flagged as sharper threat for AuDHD-ICP overlap (2026-05-10)
- Wiki: X content pipeline + X strategy at `toby/x/content-pipeline.md` and `toby/x/strategy.md` — operator-driven channel surface; first scheduled post Tue 2026-05-12, pin candidate Post 13 (🔒 acct-gated only) (2026-05-10)
- Wiki: Q2 2026 Growth Playbook published at `toby/strategy/playbook.md` — growth thesis, current-loop diagnosis, top-5 bets, Q2 OKRs, anti-bets, red-team pass (2026-05-10)
- Wiki: Rolling Bets Queue at `toby/strategy/bets.md` — ICE-scored in-flight / proposed / validated / killed with falsifying signals (resolves prior open question about bets.md location) (2026-05-10)
- Wiki: Product Compass v0.1 at `toby/strategy/compass.md` — identity, axioms, anchors, brand promise, anti-promise (2026-05-10)
- Post fallback Slack message when CWS review AI draft fails (commit b9bea18c, 2026-04-29) — 2026-05-13 review notes a low-severity backtick-fence-break formatting hazard in the error path; doc-only, no ticket.
- Docs: Installation `device_id` data caveat + analytics troubleshooting (commit da1ba81e, 2026-04-14)
- Chore: `build:store` script to build all prod extension variants (commit e7d00a8e, 2026-04-14)
- Fix: correct fabricated "Open toby" volume baseline in Phase 1 halt threshold (commit 389556ae, 2026-04-14)
- Feat: 4h `Session Start` heartbeat event for intra-day retention analysis (commit 0f3aa38d, 2026-04-14) — 2026-05-13 review notes no unit tests on the `useSessionStart.ts` hook driving this production analytics event; ticket filed (improvement, medium).
- Docs: Phase 1 / Phase 2 specs, plans, and todos for onboarding experiment work (commit 40684905, 2026-04-14)
- Docs: BigQuery schema, connection IDs, and data caveats added to analytics skill (commit b1fe667b, 2026-04-09)
- Chore: extension `1.13.0` (commit 9d6e8e4f, 2026-04-09)
- Fix: gate `AuthWrapper` on user hydration to prevent duplicate onboarding events (commit d68726b2, 2026-04-09) — identified as the proximate trigger of the blank-page hang per `toby/incidents/2026-05-11-blank-extension-page.md`. The fix is correct — confirmed independently by the 2026-05-13 code-reviewer pass; the issue was that it widened the gate without bounding the new dependency. Now defended-in-depth by PR #12 (commit `06baf0f8a`, 2026-05-12) — the gate is preserved; the timeouts + recovery screen prevent the unbounded-callback failure mode from killing the UI.
- Fix: derive `experimentEntityId` from the draft to eliminate race condition (commit cde22c93, 2026-04-09)
- Refactor: remove `onboarding-signup-position` A/B experiment, ship "end" variant (commit bc5e4530, 2026-04-09)
- Add Chrome Web Store review monitor with AI-drafted responses (commit ba247d9a, 2026-03-30)
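
The backtick hazard noted on the Slack fallback message (commit b9bea18c) is easy to reproduce in miniature. The sketch below is written in TypeScript for illustration only; the monitor's actual `escapeSlackMrkdwn` and its implementation language are not shown in this dashboard. `escapeForSlackCodeFence` and `fallbackMessage` are hypothetical names, and swapping backticks for apostrophes is one assumed mitigation among several.

```typescript
// Hypothetical sketch; not the monitor's actual escapeSlackMrkdwn.
function escapeForSlackCodeFence(text: string): string {
  return text
    .replace(/&/g, "&amp;") // Slack's three control characters: & < >
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/`/g, "'"); // assumption: neutralize backticks so the fence cannot close early
}

// Wraps an error string in a code fence for the fallback message.
function fallbackMessage(errorText: string): string {
  const fence = "`".repeat(3);
  return "AI draft failed:\n" + fence + "\n" + escapeForSlackCodeFence(errorText) + "\n" + fence;
}
```

Whatever error text is echoed, the message then contains exactly two fences, so the code block cannot be broken by the payload.
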
Key decisions to date
- Code-review channel design — review + file tickets, never patch (2026-05-13). The new `toby-code-reviewer` agent walks `main` commits daily, applies a triple-check skill (correctness / quality / security), writes a roll-up review doc to `toby/code-reviews/`, and files tickets for findings rising to bug/security/missing-test threshold. It never patches — even high-confidence findings exit as tickets, not PRs. This is orthogonal to the warroom fix-shipper, which DOES open PRs but only on validated-high-confidence warroom incidents. The dashboard surfaces both rails so operators see the two paths to `main` clearly. (see: toby/code-reviews/2026-05-13-10commits.md, this dashboard's Operations sections)
- Wave 4 auto-ship path executed end-to-end for the first time (2026-05-12 via `toby/incidents/2026-05-11-blank-extension-page.md` PR-shipped section). `toby-incident-fix-shipper` ran in `b3400d87-0830-4f89-bb70-4c3907c085f1`, picked up TOBY-14 from the bridge, branched off `75a09e34d` into `warroom/2026-05-11-blank-extension-page-toby-14`, applied the 3-layer diff (4 files, including new `apps/extension/app/components/StuckRecoveryScreen.tsx`), and opened PR #12 ~9 minutes after inbox bridge. This completes the proof of both warroom rails: 2026-05-12 medium-confidence skip + 2026-05-12 high-confidence auto-ship. (see: toby/incidents/2026-05-11-blank-extension-page.md PR shipped, toby/incidents/README.md Wave 4 spec, artifacts/toby-pm/bdbce617-3091-43c4-9c01-20c16b19946c/incidents-2026-05-11-blank-extension-page-ship-update-ingestion.md)
- Reliability hotfix scope locked at the 3-layer fix specced in the 2026-05-11 doc (decision rendered by PR #12 contents, 2026-05-12). The fix-shipper deliberately scoped the PR to (1) Layer-1 timeouts on
`getUser()` + `getOnboarding2Draft`, (2) Layer-2 `StuckRecoveryScreen` at 8s, (3) Layer-3 `NewTabHangShown` beacon. The three follow-up surfaces flagged in the incident doc (Layer-1 for `isInitializing`, SW-hardening trio, Layer-1 telemetry beacon) are deliberately deferred as separate work — not in PR #12. This is the design intent: minimum-blast-radius hotfix first, hardening as follow-on PRs. (see: toby/incidents/2026-05-11-blank-extension-page.md PR-shipped + Follow-ups sections)
- Medium-confidence validator verdicts skip auto-ship by design (captured 2026-05-12 via `toby/incidents/2026-05-12-retention-offers-silent.md`). When the validator returns `validated + medium`, Wave 4 (fix-shipper) is deliberately bypassed and the corrected compile-ready `diff` is routed into the canonical incident doc for human review. This is the intended design — the 2026-05-11 high-confidence run proved the auto-ship path; the 2026-05-12 medium-confidence run proved the human-review-required path. The gate works in both directions. (see: toby/incidents/2026-05-12-retention-offers-silent.md, toby/incidents/README.md Wave 4 spec)
- `retention_offers` table is accept-only by design — "0 issued, 0 accepted" is unmeasurable from this schema (captured 2026-05-12). The table records confirmed "CLAIM DISCOUNT" clicks; shows-without-accept and declines write nothing. Any "issued" or "shown" metric needs Tier 3 schema work (new `retention_offer_views` table OR `status`/`offered_at`/`declined_at` columns) and/or Amplitude `RETENTION_OFFER_SHOWN`/`DECLINED` events wired into BI. The Tier 1 logging patch is an interim measure that lets us measure "eligible offers shown" via Cloud Logging without a schema change. (see: toby/incidents/2026-05-12-retention-offers-silent.md root cause #1)
- Blog/SEO motion as a structured, agent-owned pipeline (captured 2026-05-12 in `toby/blog/pipeline.md`). Replaces ad-hoc landing-page posts. Agent writes wiki drafts; operator publishes by hand-copy. Voice fingerprint = `toby/strategy/compass.md` + `toby/x/strategy.md` (calm, specific, generous; never breathless). Pillar mix mirrors X. P3 power-user posts are rel-gated on the blank-page hotfix. No price claims until pricing-reality-reconcile completes (O3 KR1, due 2026-05-13). No AI pre-announce until Q4 2026 relaunch. No competitor punch-down — comparison content only, never trash-talk. (see: toby/blog/pipeline.md voice + NOT-pursued sections)
- Reliability gate also applies to feature-promotional comparison posts (captured 2026-05-12 in
`toby/blog/pipeline.md` Run #2). The OneTab Alternative draft is rel-gated by the pipeline agent itself — recommended hold-until-hotfix-ships, since the post leans on Save Session + new tab page as the wedge and publishing pre-fix would drive switchers into the live bug. Top-funnel posts (e.g. Why 80 Tabs Open) are not similarly gated. (see: toby/blog/pipeline.md operator-decisions, toby/blog/onetab-alternative.md editor notes)
- Blank-page reliability — no prod-api redeploy as part of this incident (2026-05-11; both backend doctor and validator concurred). Backend is innocent (prod-api SHA stable since 2026-02-02; 0 5xx; SW boot path clean; `getUser()` makes no network call so the hang is pre-HTTP). A redeploy is needless blast-radius. The fix lives entirely in `apps/extension`. (see: toby/incidents/2026-05-11-blank-extension-page.md)
- `NewTabHangShown` telemetry beacon — feature-flag-gated, default on (2026-05-11; validator concurs). First-ever signal between CWS-review complaint and Sentry/Amplitude funnels. The optional Layer-1 `NewTabHydrationTimeout` beacon follows the same default. (see: toby/incidents/2026-05-11-blank-extension-page.md operator decisions)
- Prior MV3-SW-boot-regression hypothesis is REFUTED (2026-05-11). Earlier `toby-product-strategist` artifact `388c1db4-59b7-49e9-8ec3-ecfba972c95f` is now treated as historical context only; do not carry forward as a live theory. Independent backend evidence (SHA stability, 5xx volume, SW boot listener registration, network-free hang path) rules it out. (see: toby/incidents/2026-05-11-blank-extension-page.md "What this is NOT")
- Incident warroom design — agents never patch (except via fix-shipper on validated + high-confidence) (captured 2026-05-11 in `toby/incidents/README.md`; reinforced 2026-05-12). The five-agent team (`toby-incident-coordinator` + frontend/backend doctors + validator + fix-shipper) produces canonical incident docs with fix proposals as `diff`. High-confidence + validated → fix-shipper opens a PR. Medium-confidence + validated → corrected `diff` lives in the canonical doc; operator decides. Conditional / rejected → up to 2 retry passes. Doctors write run-artifact findings; only the coordinator writes the canonical doc; validator returns a binding verdict. Agents never touch `_inbox/`; agents never delete docs. (see: toby/incidents/README.md, toby/incidents/2026-05-12-retention-offers-silent.md)
- Q2 2026 anti-bets (codified in playbook 2026-05-10): (1) no pre-announce of AI organize features on any public channel; (2) no paid acquisition while CWS rank is unrecovered ($54/yr ARPU × <5% full-price conversion = math doesn't work); (3) no Firefox/Safari port this quarter — engineering throttle forces a "Phase 2 + reliability OR platform port" choice and we pick the first; (4) no public punch-down at OneTab / Workona / Session Buddy / Arc / new entrants Uncluttr / TabVault Pro / ThoughtFold / tab-out / leap-tabs (comparison content fine, trash-talk is brand-poison); (5) Atlassian/Dia 60-min partnership discovery deferred past Q2 (one-way door); (6) no paid push around "Toby vs Chrome 133" until conversion economics fix (the free comparison page is fine) (see: toby/strategy/playbook.md anti-bets)
- X engagement protocol locked (from
toby/x/engagement-targets.md): first touch is always in-character + generous; no link in the first reply (drop gettoby.com only when prompted); max 1 reply per target per day; no DM cold-pitches; no mass engagement (>8/day reads as a bot); never reply as the brand to vulnerable-distress posts; cordial silence on competitor accounts (Uncluttr, TabVault Pro, ThoughtFold, tab-out, leap-tabs.com). New 2026-05-10 — bot filter applied: every target satisfies followers_count ≥ 100 AND account_created < 2025-06 — keeps targets credibly human and avoids burning brand reputation on bot-shaped replies (see: toby/x/engagement-targets.md hard rules)
- Top growth lever this quarter is the dormant curator loop, not acquisition channels — 1,848 Free-Tier Archivists with 18.6% public-share rate are Toby's only native viral surface and have never been explicitly activated; the loop costs zero engineering. Now jointly served by X + blog: blog pipeline queues a recurring "Public collection of the week" curator series (rank 6) that pairs with the X curator-spotlight slot (see: toby/strategy/playbook.md, toby/strategy/bets.md#public-collection-pride-loop, toby/blog/pipeline.md queue rank 6)
- Phase 1 / Phase 2 split: ship 1.13.0 with cleanup + Session Start before Jad's OOO; rebuild experiment machinery from scratch in Phase 2 (clean slate is intentional) (see: tasks/onboarding-experiment-plan.md)
- Welcome A/B isolated variable = "presence of a dedicated welcome / Get Started screen". Success metric = D7 retention with +5pp lift over the V2-only baseline (32.92%); kill criterion if neither variant reaches 34% D7 at n≥2,000/arm by 2026-05-26 (see: tasks/onboarding-experiment-plan.md, toby/strategy/playbook.md O2 KR2)
- Auth becomes a modal overlay on a blurred dashboard (shared-spine improvement, not an experiment arm) (see: tasks/onboarding-experiment-plan.md)
- Draft is the source of truth for
variant/isFallback; once the 2s timeout fires the draft is bucketed and frozen — late API responses write only to a debug-only experimentApiLate field (see: tasks/onboarding-experiment-plan.md)
- All cross-auth analytics joins use Amplitude
device_id, not _entityId (the SDK re-identifies on login) (see: tasks/onboarding-experiment-plan.md)
- Toby has no guest mode — "Skip" means skip the tutorial, not skip auth; skip path pre-seeds starter demo content into the draft (see: tasks/onboarding-experiment-plan.md)
- North Star = Weekly Card Opens; supporting = Weekly Card Saves; supporting events
Open card, Open all cards, Close all, Open these, Add tab (see: worklog.md)
- New endpoints use the v3 controller pattern (BaseController + explicit DI); v2 (
gocraft/web + context structs) stays only for legacy (see: CLAUDE.md)
- 90-day investment ratio (post-v3): 70% reliability, 20% role-based gating, 10% AI feature relaunch (see: research-docs/toby-delta-2026-05-05-v3.md)
- AI feature relaunch deferred Q3 → Q4 2026 (see: research-docs/toby-delta-2026-05-05-v3.md)
- "270K untracked WAU" mirage resolved: real active user base is ~62–75K (CWS WAU 380k inflated 5–6×); ~54K free, ~5,800 paid (10.7% active-user conversion) (see: product/strategy/next-actions.md)
- AI requires JWT + team_id + "ai" feature flag, so AI cannot run on the unauthenticated Save Tabs slide in the "end" variant — only "beginning" supported AI there (see: docs/ai-onboarding-ideas-analysis.md)
- Persistence-as-activation is a 10× axiom (compass axiom 3): "I opened a new tab and my organized world was still there", not "AI organized my tabs" (see: docs/ai-onboarding-ideas-analysis.md, toby/strategy/compass.md axiom 3)
- Calm axis sharpened to "calm-by-ambient-surfacing" (compass axiom 2 — value in the gap between intent and next action). Plain "calm" is now a category claim (Sunsama, Neurosity, Cold Turkey); ambient new-tab surfacing is what stays defensible (see: research-docs/toby-delta-2026-05-05-v3.md, toby/strategy/compass.md axiom 2)
- Cloud-account differentiation relegated to table stakes (compass anchor #2) — Chrome 133 ships saved-tab-group cross-device sync natively (see: toby/strategy/compass.md anchor 2, 2026-05-10)
- Reliability promoted to anchor #1 (compass) — the blank-page bug puts the entire product at risk, not just CWS rank (see: toby/strategy/compass.md anchor 1, 2026-05-10)
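The warroom routing rule recorded in the incident-design decision above (validated + high confidence ships a PR, validated + medium confidence goes to the operator, conditional or rejected verdicts get up to 2 retry passes) can be sketched as a small pure function. This is an illustrative sketch only: the names Verdict, Confidence, Route, and routeIncident are hypothetical and do not come from the Toby codebase, and the "retries_exhausted" outcome is an assumption, since the source does not say what happens after the second retry pass.

```typescript
// Hypothetical types for illustration; not taken from the Toby repo.
type Verdict = "validated" | "conditional" | "rejected";
type Confidence = "high" | "medium";

type Route =
  | { kind: "open_pr" } // fix-shipper opens a PR
  | { kind: "operator_decides" } // corrected diff stays in the canonical doc
  | { kind: "retry"; passesLeft: number } // another doctor/validator pass
  | { kind: "retries_exhausted" }; // ASSUMPTION: behavior after 2 passes is not documented

// "Conditional / rejected -> up to 2 retry passes"
const MAX_RETRY_PASSES = 2;

function routeIncident(
  verdict: Verdict,
  confidence: Confidence,
  retriesUsed: number,
): Route {
  if (verdict === "validated") {
    // Only validated + high confidence is allowed to auto-ship.
    return confidence === "high"
      ? { kind: "open_pr" }
      : { kind: "operator_decides" };
  }
  // Conditional or rejected: retry while passes remain.
  if (retriesUsed < MAX_RETRY_PASSES) {
    return { kind: "retry", passesLeft: MAX_RETRY_PASSES - retriesUsed - 1 };
  }
  return { kind: "retries_exhausted" };
}
```

Under this sketch the 2026-05-11 blank-page incident (validated, high) maps to "open_pr", and the 2026-05-12 retention incident (validated, medium) maps to "operator_decides", matching how the two incidents were actually handled.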
Open questions / blockers
- Sandcastle agentic-workflow — secret scanner has no test coverage (NEW 2026-05-13). Per
toby/code-reviews/2026-05-13-10commits.md, .sandcastle/scan-secrets.mts:1-182 (commit 75a09e3) is the only thing standing between an unattended LLM agent and a pushed credential. It's a regex-on-added-lines walker with a skip-list — single-pattern, single-line, no inline-base64, no multi-line obfuscation detection. The author's own comment acknowledges it isn't designed against an adversarial agent, but the regression risk is silent. Operator action: prioritise the auto-filed ticket (.sandcastle/scan-secrets.mts needs unit-test coverage); ideally close before the first sandcastle-authored slice opens a real PR. (see: toby/code-reviews/2026-05-13-10commits.md scan-secrets finding)
- Sandcastle agentic-workflow — gate-skip env vars have no audit trail (NEW 2026-05-13).
SANDCASTLE_SKIP_GATES_VERIFY=1 and SANDCASTLE_SKIP_SECRET_SCAN=1 fully disable verification and secret-scan with only a console.warn. As the workflow scales (CI runner, shared image, teammate rc-file), no audit ties a pushed branch back to which gate was waived. Operator action: triage the auto-filed ticket (Sandcastle skip-flags audit logging); cheap to add (PR body field + .sandcastle/audit.log). (see: toby/code-reviews/2026-05-13-10commits.md sandcastle skip-flag finding)
- Sandcastle
EXPECTED_GATES lock-step risk (NEW 2026-05-13). EXPECTED_GATES = ['lint', 'build'] intentionally omits typecheck and test because the extension repo has pre-existing TS debt and no vitest scaffold on main. The moment the TS-clean branch lands, EXPECTED_GATES must be updated in lockstep — easy to forget; would silently let bad slices through. Doc-only finding from the code-reviewer; would warrant a tracking issue once the TS-clean branch is in flight. (see: toby/code-reviews/2026-05-13-10commits.md EXPECTED_GATES finding)
- useSessionStart.ts shipped without unit tests (NEW 2026-05-13). Phase 1 production analytics hook (the 4h Session Start heartbeat at commit 0f3aa38d) has 75 lines of quiet logic — rate limit, days_since_signup parse, storage-write-ordering — and no tests. Code-reviewer auto-filed a ticket. Quiet-failure risk: a stale userId shape or broken rate-limit math wouldn't surface until someone runs a BigQuery backfill. Doesn't block Phase 2 but is ship debt on already-shipped code. (see: toby/code-reviews/2026-05-13-10commits.md useSessionStart finding)
- Retention-funnel structural leak (2026-05-12). Per
toby/incidents/2026-05-12-retention-offers-silent.md, the in-app retention modal is bypassed by ~82% of cancels (120 cancels → 22 reasons → 1 accept over 30 days). Three FE-orchestration paths route around the modal: (a) the Stripe Customer Portal View link is preloaded inside the in-app Subscription panel (apps/extension/app/components/Modal/OrgSettings/Subscription.tsx:51-62, 192-208); (b) Stripe renewal/receipt emails contain "Manage subscription" links to the same portal; (c) team_legacy + team_basic users have no in-app cancel CTA at all because hasSubscription excludes them (Subscription.tsx:41-43) — and that's the cohort with the worst churn pressure (Feb-26 ThankYouLegacy renewals). Operator owns the Tier 2 product calls — hide the Stripe-portal link OR configure flow_data redirect; widen hasSubscription; investigate the retention_yearly ~0%-accept branch. Bigger monetization lever than the Tier 1 instrumentation patch. (see: toby/incidents/2026-05-12-retention-offers-silent.md fix tier 2, toby/strategy/playbook.md O3)
- Tier 1 retention instrumentation — awaiting Go-reviewer sign-off (2026-05-12). Corrected compile-ready 10-LOC zap log patch is in the canonical incident doc. Validator already vetted shape + symbols; one Toby Go reviewer needs to approve. After merge + deploy, the verify plan computes the first-ever FE funnel ratio in 14 days — settles whether the lever is in copy/offer (decline-not-show) or in routing (show-not-call). (see: toby/incidents/2026-05-12-retention-offers-silent.md fix tier 1 + verify plan)
- Five missing
TOBY_RETENTION* GCP secrets — housekeeping (2026-05-12). TOBY_RETENTIONMINSUBSCRIPTIONDAYS, TOBY_RETENTIONCOOLDOWNMONTHS, TOBY_RETENTIONLEGACYYEARLYPRICE, TOBY_RETENTIONCOUPONLEGACY, TOBY_RETENTIONCOUPONYEARLY. System operates correctly on struct-tag defaults — the only side effect is 5 "failed to access secret version" log lines per cold start. File as separate ticket; either create with current defaults or remove the lookup entirely. (see: toby/incidents/2026-05-12-retention-offers-silent.md fix tier 4)
- Blog publish flow. The agent writes drafts into the wiki at
toby/blog/; the production blog lives at apps/landing/src/content/post/. Three sub-questions stacked: (1) hand-copy approved drafts into the codebase repo or change the publish flow? (2) image hand-off — neither current draft has a cover; existing posts use ~/assets/images/blog/<post-folder>/<image>.png; the OneTab draft specifically needs a "URL list vs visual collection" hero (the image carries the post's core claim); (3) internal-link URL shape on gettoby.com may not match filenames — confirm before any inter-post link ships. Blocks: first publish of either of the two drafts at toby/blog/. Related operator calls: pair-or-not with X Post 1 ("47 tabs is not a personality flaw") on Tue 2026-05-12, and queue a new X anchor for the OneTab draft (the Post 1 pair is already claimed) (see: toby/blog/pipeline.md open questions, toby/blog/onetab-alternative.md editor notes).
- OneTab Alternative draft — reliability-gate decision.
toby-blog-seo recommends holding publish of the OneTab draft until the 3-layer reliability hotfix ships (O1 KR1, due 2026-05-24) so we don't drive switchers into the live blank-page bug. The foundational tab-hoarder draft is not similarly gated. Operator decides; treating as recommendation, not a hard rule. Resolves naturally when the hotfix ships (see: toby/blog/pipeline.md operator-decisions, toby/blog/onetab-alternative.md editor notes).
- Reliability hotfix — review, merge, deploy, monitor (diagnosis closed 2026-05-11; PR opened 2026-05-12). PR #12 carries the 3-layer fix; commit
06baf0f8a; branch warroom/2026-05-11-blank-extension-page-toby-14. The fix-shipper validator vetted race-safety + d68726b29-regression-safety + copy; the 2026-05-13 code-reviewer pass independently confirms d68726b29 is itself correct (the gate widening was right; what was missing was bounding the new dependency, which PR #12 fixes); CI is the canonical typecheck/lint gate (no local check ran in the ephemeral worktree). Operator steps now: (1) PR review + merge; (2) bump extension version (Phase-1 was 1.13.0; this is likely 1.14.0 or .x.x patch — operator decides); (3) push new build through CWS; (4) watch NewTabHangShown in Amplitude for 7-day baseline + CWS reviews for new "blank screen" 1-stars over the 14-day O1 KR1 window (target: zero). The three non-blocker follow-ups from the incident doc (Layer-1 for isInitializing, SW-hardening trio, Layer-1 NewTabHydrationTimeout beacon) remain queued as separate work. Side-effect: unblocks blog pipeline's P3 power-user posts AND unlocks publish of the OneTab Alternative draft once merged + deployed (see: toby/incidents/2026-05-11-blank-extension-page.md PR shipped + Follow-ups, toby/strategy/compass.md anchor 1, toby/strategy/playbook.md O1, toby/blog/pipeline.md P3 gate + OneTab gate, toby/code-reviews/2026-05-13-10commits.md d68726b finding).
- Reliability follow-up PRs — not yet filed (2026-05-12; surfaced by PR #12's deliberate scoping). The 2026-05-11 incident doc's "Follow-ups (NOT blockers for closing this incident)" section lists three distinct work items that were deliberately not included in PR #12: (1) apply Layer-1 5s timeout shape to
isInitializing (useIsRestoring() IDB-backed path in Toby.tsx:168-275) so the page can self-heal pre-8s without a tap; (2) SW hardening trio — .catch() on persistQueryClientRestore at background.ts:14, AbortController + 10s on contextMenus.ts:145-191 fetch, unified chromeStorageGet<T>(keys, { timeoutMs }) helper to replace every raw chrome.storage.local.get callsite; (3) Layer-1 NewTabHydrationTimeout telemetry beacon so the common 5s recovery path is visible in Amplitude (without it, only the 8s tail surfaces). Operator should decide whether to file these as separate bug/improvement tickets now (with stub bodies pointing to the incident doc) so the work doesn't get lost after PR #12 merges (see: toby/incidents/2026-05-11-blank-extension-page.md Follow-ups section).
- Phase 2 silent slip risk. Plan targets "week of Apr 20"; today is 2026-05-13 and there are zero Phase 2 commits. Playbook red-team residual: if 2026-05-19 arrives with no commits, the middle of the playbook disintegrates and O2 is dead. Confirm whether work is on an unpushed branch or Jad's ETA is firm (see: tasks/phase2-todo.md, toby/strategy/playbook.md red-team).
- Pricing contradiction — now three different numbers (strengthened 2026-05-12). Internal modeling uses $4.50/mo; Efficient.app lists Toby at $6/$10; TheTab's blog comparison post lists Toby at $9/mo (surfaced via
toby-blog-seo competitor-blog watch this run). Either pricing has been changing without internal docs being updated, public listings are stale, or there's a tier we're not tracking. Blocks role-based-paywall-gating; playbook O3 KR1 sets hard deadline 2026-05-13. Also keeps the blog pipeline's price-claim guardrail in force (no blog post may mention price until reconciled — both drafts comply) (see: research-docs/toby-delta-2026-05-05-v3.md, toby/strategy/playbook.md O3, toby/blog/pipeline.md operator-decisions + competitor-blog watch).
- @TobyForTabs account-creds gate. X channel cannot ship until operator confirms account credentials are in-hand. Step 1 of the engagement-targets operator one-pager. If this isn't resolved by Mon 2026-05-11, the Tue 2026-05-12 scheduled first post slips. (Post 13 pin is also acct-gated.) (see: toby/x/engagement-targets.md operator one-pager).
- Competitor watch — engagement-targets list refreshed 2026-05-10. Five accounts now surfaced in the live engagement-targets doc as do-not-engage competitors or category-adjacent: Uncluttr (@ciprian__b — AI-organize tab manager, CWS #18 "vertical tab manager", just earned Blue Checkmark, building in public on the AI lane Toby has deferred to Q4 2026), TabVault Pro (@godesign_art — free OneTab/Session-Buddy alternative, launched 2026-04-19), ThoughtFold (via @PersonAI_talks — "zero-cloud Chrome tab manager for neurodivergent brains", PH launch 2026-04-25 — sharper threat for overlapping AuDHD ICP shape; engagement-targets + content-pipeline + blog pipeline all advise keeping our framing implicit), tab-out (@Yieioo — open-source visual tab manager, added "Page Snapshot" feature 2026-05-03), leap-tabs (@RomanGweb3 — sidebar tab manager + spaces). Policy is cordial silence (X) + comparison content fine but no trash-talk (blog — OneTab draft is the first proof-of-life). TabRack and LocalArchive (carried in last cycle) are no longer in the engagement-targets list — confirm with strategist whether they fell off the signal-source query or were deliberately de-prioritized (see: toby/x/engagement-targets.md Tier C, toby/x/content-pipeline.md drafts-not-proposed, toby/blog/pipeline.md competitor blogs + NOT-pursued).
- Team-buyer X-pillar drop decision deadline 2026-05-17. The latest engagement-targets bucket recap is still zero team-buyer candidates after 26 targets. Doc forces an operator-only call: either supply the missing target shape (e.g., LinkedIn-DM canvass of 25 multi-team admins to surface ≥3 "I want a paid Team plan, but [reason]" responses) or drop the team-buyer pillar from X entirely. 4,908 active multi-team users / 79 paid yearly Team subs is the imbalance being tested (see: toby/x/engagement-targets.md bucket recap, toby/strategy/playbook.md open questions, toby/x/strategy.md).
- Does the "ambient new-tab surface" axiom survive Google deprecating new-tab override permissions? (from playbook red-team) Strategic 2027-2028 risk. We've taken no defensive action. Resolves when: Google publishes a deprecation roadmap OR a credible signal lands from a Chrome team contact (see: toby/strategy/playbook.md open questions, toby/strategy/compass.md axiom 2).
- What's the true full-price new-user conversion rate? (from playbook) Action 1 in
product/strategy/next-actions.md is still unresolved. Determines whether the flywheel is structurally viable at $54/yr (see: product/strategy/next-actions.md, toby/strategy/playbook.md open questions).
- Will the reliability hotfix close the
not_using cancellation reason (39% of churn)? (from playbook) Hypothesis: blank-page failures drive part of not_using (users churn because product was broken, not because they didn't value it). Resolves when: 60 days post-hotfix the churn-survey not_using share is re-pulled and compared to the 39% baseline. Now genuinely testable — PR #12 is open as of 2026-05-12; the 60-day measurement window starts the day the merged build hits CWS, NOT the day diagnosis closed (see: toby/strategy/playbook.md open questions, toby/01-personas.md, toby/incidents/2026-05-11-blank-extension-page.md).
- worklog.md last entry is 2026-02-02 — stale by ~3 months. Either it needs an update or another rolling log has taken its place. Worklog stale-flag on CancelSubscription.tsx retention-discount wiring is now resolved as of the 2026-05-12 incident (integration confirmed shipped via live apps/extension/app/components/Modal/Downgrade/CancelSubscription.tsx:643-709); the worklog itself should be updated to retire that pending item.
- Uncommitted state on
main: modified CLAUDE.md; untracked docs/ai-onboarding-ideas-analysis.md, product/ideas/ (mcp-integration.md), research-docs/ (six toby-research/delta files). Should these be committed before Phase 2 begins?
- Action 0 follow-up: WAU number was corrected from 380K → ~62–75K; verify all dashboards, marketing copy, and investor-facing materials reflect the corrected number (see: product/strategy/next-actions.md).
- Free-tier collaboration limits vs. anchor #4 (compass): the role-based-gating bet may restrict collaboration on the free tier, but anchor #4 says "the free experience must be genuinely usable, not crippled". Where is the line? Needs explicit decision before the gating experiment ships — playbook O3 KR2 forces the call by 2026-06-15 (see: toby/strategy/compass.md anchor 4, toby/strategy/playbook.md O3 KR2).
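The unified chromeStorageGet<T>(keys, { timeoutMs }) helper listed among the reliability follow-ups above could take roughly the following shape. This is a minimal sketch under stated assumptions, not the code in PR #12 or in the repo: the injected get parameter (added so the example runs outside a browser), the fallback option, and the timedOut flag are all hypothetical additions to the signature named in the incident doc. It follows the fail-open behavior described for the hotfix: if the storage callback never fires, the promise resolves with the fallback instead of hanging.

```typescript
// Callback-style getter with the same shape as chrome.storage.local.get(keys, callback).
// Injected as a parameter (ASSUMPTION) so the sketch also runs under Node.
type StorageGetFn = (
  keys: string[],
  callback: (items: Record<string, unknown>) => void,
) => void;

// Hypothetical option and result shapes; only timeoutMs appears in the source.
interface StorageGetOptions<T> {
  timeoutMs: number;
  fallback: T; // value returned when the read does not complete in time
}

interface StorageGetResult<T> {
  value: T;
  timedOut: boolean; // lets the caller fire a telemetry beacon on the slow path
}

function chromeStorageGet<T extends Record<string, unknown>>(
  get: StorageGetFn,
  keys: string[],
  { timeoutMs, fallback }: StorageGetOptions<T>,
): Promise<StorageGetResult<T>> {
  return new Promise((resolve) => {
    let settled = false;

    // Fail open: never reject, resolve with the fallback once the deadline passes.
    const timer = setTimeout(() => {
      if (settled) return;
      settled = true;
      resolve({ value: fallback, timedOut: true });
    }, timeoutMs);

    get(keys, (items) => {
      if (settled) return; // late callback after the timeout is ignored
      settled = true;
      clearTimeout(timer);
      resolve({ value: items as T, timedOut: false });
    });
  });
}

// In the extension, a call site might look like this (illustrative only):
//   chromeStorageGet(
//     (keys, cb) => chrome.storage.local.get(keys, cb),
//     ["user"],
//     { timeoutMs: 5000, fallback: {} },
//   );
```

Returning a timedOut flag rather than throwing keeps every call site on one code path, and gives a natural hook for a beacon such as the proposed NewTabHydrationTimeout event.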
Doc index
- Toby — State of the Project — this dashboard (operational view: status, OKRs, next steps, decisions, blockers).
- Toby — Product Compass — identity / axioms / anchors / brand promise / anti-promises. Authoritative on who Toby is (2026-05-10).
- Toby — Q2 2026 Growth Playbook — growth thesis, current-loop diagnosis, top-5 bets, OKRs, anti-bets, red-team pass (2026-05-10).
- Toby — Rolling Bets Queue — ICE-scored in-flight / proposed / validated / killed with falsifying signals (2026-05-10).
- Toby on X — Strategy — operator-driven channel surface, ICP buckets, pillar framing (2026-05-10).
- Toby on X — Content Pipeline — 13 drafted posts, scheduling, pin candidate (Post 13) (2026-05-10).
- Toby on X — Engagement Targets — 26 ranked candidates (6 Tier-A / 11 Tier-B / 9 Tier-C), reply drafts, bot-filter rule, hard engagement rules, competitor watch (2026-05-10).
- Toby — Blog & SEO Pipeline — Run #2 (2026-05-12): 2-week cadence, 5 pillars mirroring X, 7 ranked queued topics, voice fingerprint, competitor-blog watch list, guardrail open questions (2026-05-12).
- Why You Have 80 Tabs Open (And Why That's Actually Fine) — first draft from the blog pipeline; P1 top-funnel; paired with X Post 1 (2026-05-09).
- OneTab Alternative: Save Tabs as Workspaces, Not URL Lists — second draft from the blog pipeline; P5 mid-funnel; "URL list → workspace" reframe; rel-gate recommended on publish (2026-05-12).
- Toby Incidents — How this works — warroom onboarding doc; five-agent workflow
Toby Incident Response (id 9b78790f-2aea-4f65-876f-53d1a114c3ae); daily 09:00 UTC cron; entry path is a ticket with needs-warroom label, bridged into _inbox/ automatically (2026-05-11; bridge + fix-shipper added 2026-05-12).
- Incident: 2026-05-11 — Blank extension page (infinite-load hang on new tab) — first canonical incident; status
closed → shipped (diagnosis 2026-05-11, PR #12 opened 2026-05-12 via Wave 4 fix-shipper); verdict validated (high confidence); root cause + 3-layer frontend-only fix landed on branch warroom/2026-05-11-blank-extension-page-toby-14 (commit 06baf0f8a); 3 follow-up surfaces deliberately deferred (2026-05-11 · 2026-05-12 ship update).
- Incident: 2026-05-12 — retention_offers silent (TOBY-6) — second canonical incident; status
closed (diagnosis); verdict validated + medium; Wave 4 correctly skipped auto-ship; corrected Tier 1 instrumentation patch in the doc awaiting Go-reviewer sign-off; Tier 2-4 follow-ups to be filed as separate tickets. Surfaces structural retention-funnel leak (~82% bypass) and resolves the long-standing CancelSubscription.tsx integration question (2026-05-12).
- Toby code review · 2026-05-13 — NEW 2026-05-13. First run from the new
toby-code-reviewer agent; back-fill of the 10 most recent commits on main (window ends at 75a09e34d); 3 medium + 4 low findings; 3 tickets auto-filed (useSessionStart no-tests, .sandcastle/scan-secrets.mts no-tests, sandcastle skip-flags audit-trail missing). Reviewer never patches — files tickets only (2026-05-13).
- Toby — Personas — Solo Pro Power Saver + Tenured Free Organizer + the other 5 segments (100% of active users).
Other strategic context for this dashboard is sourced from the codebase: tasks/onboarding-experiment-plan.md, tasks/phase1-todo.md, tasks/phase2-todo.md, research-docs/toby-research-2026-05-05-v3.md, research-docs/toby-delta-2026-05-05-v3.md, product/strategy/next-actions.md, product/strategy/soul.md, product/learnings.md, product/ideas/mcp-integration.md, docs/ai-onboarding-ideas-analysis.md, worklog.md, CLAUDE.md, and the brand-new top-level .sandcastle/ subsystem (orchestrator + secret scanner; commit 75a09e34d).