artifacts/toby-incident-validator/db1a3c0a-b500-432d-a579-658f01657186/validation.mdValidation — TOBY-6 retention_offers silent
Verdict
validated — medium confidence. The diagnosis is high confidence (every load-bearing claim re-checked and matched on the nose). The Tier 1 patch concept is sound (logging-only, low-risk, right surface). But the literal diff in the synthesis-draft is non-compiling in three concrete ways, and that's enough to hold back automated shipping. A human reviewer can correct the patch in <5 min from the precise notes below, then ship. Closing as in_review rather than validated_high reflects that human-in-the-loop step.
If the coordinator wants automated shipping anyway, re-dispatch the doctors to produce a compile-ready diff and I'll re-validate.
What I spot-checked
| Claim | Source | How I re-verified | Result |
|---|---|---|---|
retention_offers total = 17 | Backend finding L52-54 | consoledb SELECT count(*) FROM retention_offers | 17 ✅ exact match |
retention_offers last 30d = 1 | Backend finding L55 | consoledb WHERE created_at > now() - interval '30 days' | 1 ✅ exact match |
cancellation_reasons last 30d = 22 | Backend finding L57 | consoledb same predicate | 22 ✅ exact match |
| 120 canceled subs in 30d | Backend finding L58 | consoledb same predicate | 120 ✅ exact match |
Schema is accept-only (no status / offered_at / declined_at) | Backend finding L75-88, migration V71 | consoledb information_schema.columns AND read V71__retention_offers.up.sql:1-11 | Confirmed. Only accepted_at + created_at. ✅ |
16/17 accepts are retention_legacy | Backend finding L90-95 | consoledb GROUP BY coupon_id | 16 retention_legacy + 1 retention_yearly ✅ exact match |
GetRetentionOffer structure at L599-607 | Synthesis-draft Tier 1 target | Read subscription_context.go:560-720 | Confirmed. Eligibility evaluated at L590-597, response built at L599-607 — patch insertion point at L598 is correct. ✅ |
Cancel handler at L75-151 never invokes retention | Backend finding L143-145 | Read subscription_context.go:75-151 | Confirmed. L138 calls PaymentSvc.CancelSubscription directly, returns Stripe portal URL. No retention check, no offer write, no event. ✅ |
hasSubscription gate at Subscription.tsx:41-43 excludes team_legacy / team_basic | FE finding L119-135, synthesis L23 | Read Subscription.tsx:38-62 | Confirmed. L41-43 literally: !['team_legacy', 'team_basic'].includes(team.accessRole). ✅ |
retentionEligibilityResult field layout | Implied by proposed diff | Read subscription_context.go:222-228 | Mismatch found. Result has Eligible bool, Reason string, Offer *dtos.RetentionOfferDetail, Subscription *models.Subscription, IsLegacy bool. Fields OfferType / CouponID are nested under result.Offer.*, not on result itself. ⚠️ |
What I did not re-verify:
- The Feb-2026 monthly trend (
134 reasons / 10 accepts) — the 30d numbers match, so the 4-10% ratio is plausible; not load-bearing for the ship decision. - The
gcloud secrets list --filter=name~TOBY_RETENTIONreturning empty (Tier 4 housekeeping). It's outside the ship candidate. - Cloud Logging traces for the team
ce2cc1achappy-path — my IAM scope returns permission-denied onlist_log_entriesfortoby-production-286416(known limitation from prior runs). The DB evidence (1retention_yearlyrow dated 2026-05-12) corroborates the live trace claim independently. - A live FE Playwright reproduction —
apps/extensionis not pre-compiled in this workspace (same blocker the FE doctor cited), and the funnel claim is about cohorts I can't synthesize anyway.
Correctness (triple-check lens 1)
The funnel framing follows from the evidence. The four numbers (120 → 22 → 17 → 1) all match the DB on re-query. The schema fact — that the table records ACCEPTS only — is independently load-bearing AND independently verifiable: I read the migration and confirmed accepted_at is the only acceptance-time field and there is no status / offered_at / declined_at column. So every "0 offers issued" datapoint in the ticket IS necessarily "0 offers accepted"; the metric the ticket asks about is unmeasurable from this table. That's the synthesis-draft's central claim and it holds.
One small numeric inconsistency to flag (does not change the verdict): the FE finding L82 cites product/metrics/surveys/churn-survey-analysis.md as saying "14 rows... real accept rate is ~14%." Current table has 17 rows. The churn-analysis doc is older — this isn't a contradiction, just a date drift. Not load-bearing.
The "no recent regression" claim holds: prod-api revision prod-api-00427-9p2 at SHA 4b0107858 has been deployed since 2026-02-02 (matches my own memory of this stable SHA). The FE cancel-modal files haven't changed since 2026-03-31 (per FE finding L188-190). So the 30-day silence isn't caused by a recent push — it's structural.
Quality (triple-check lens 2)
Tier 1 is appropriately scoped — logging-only, no behavior change, no schema change, no user impact. Reverting is a 1-commit rollback. The Tier 2-4 items are explicitly marked NOT shipped and surfaced as separate-ticket candidates. The doc does not smuggle product changes into the ship.
Verify plan is testable but soft. The plan: deploy, wait 24h, query jsonPayload.message="retention_offer_eligible" in Cloud Logging. Expected: 1-3 events over 24-48h. If zero, that's also signal. This is honest about the sparse traffic. The Cloud Logging filter syntax (jsonPayload.message=...) IS the correct shape for zap-emitted structured logs in GCP — that part of the verify plan is correct.
The literal Tier 1 diff is non-compiling. This is my main quality objection:
| The synthesis-draft writes | The code actually wants |
|---|---|
log.Info(...) | ctx.Logger.Info(...) — the file uses zap (go.uber.org/zap) via the context logger. No log package import. See existing L652-656 and L709 as canonical examples in the same file. |
"team_id", team.ID, (key/value variadic) | zap.String("teamID", teamID), — the file's convention is zap.String/zap.Error, key name is teamID (camelCase) per existing L653. Also team is not in scope inside GetRetentionOffer; only teamID (string) is. |
"offer_type", result.OfferType, | zap.String("offerType", string(result.Offer.OfferType)) — OfferType is a field on result.Offer (*dtos.RetentionOfferDetail), not on result itself. And RetentionOfferType is a typed string (per L438/466/488), so an explicit cast is needed. |
"coupon_id", result.CouponID, | zap.String("couponID", result.Offer.CouponID) — same nesting issue. |
| (missing) | result.Offer != nil guard — defensive: when Eligible == true the Offer is always set by validateRetentionEligibility, but a nil-deref-on-shipping-day is the kind of regression that ruins a logging-only PR. |
Compile-ready replacement (drop-in at L598, between eligibility check and RenderRes):
if result.Eligible && result.Offer != nil {
ctx.Logger.Info("retention_offer_eligible",
zap.String("teamID", teamID),
zap.String("userID", userID),
zap.String("offerType", string(result.Offer.OfferType)),
zap.String("couponID", result.Offer.CouponID),
)
}
The synthesis-draft DOES hedge ("Exact log API and result field names should be confirmed against the surrounding code at apply time"), but the fix-shipper agent's job description is "applies the proposed fix from the doc" — not "rewrites the diff to compile". Punting compile-correctness to the shipper is the wrong abstraction; either the doc compiles or it's pre-shipping work.
Safety (triple-check lens 3) — including security-review pass
No auth / session / token / JWT / CSP / secret surface touched. The patch is additive logging in an existing handler. security-review skill not applicable beyond the PII check below.
PII risk: low. The proposed log fields are teamID and userID (UUIDs), offerType (enum), couponID (short opaque coupon name like retention_legacy). No email, name, IP, payment details. Existing log lines in this file already emit teamID via zap.String (L653) — the patch is consistent with house convention and doesn't widen the PII surface beyond what's already in prod logs.
Log volume risk: negligible. Historical evidence is ~22 cancellation_reasons rows per 30 days. GetRetentionOffer is only called after a reason is submitted (per the FE flow at CancelSubscription.tsx:643-709). Worst plausible case is a handful of log lines per day; in the observed 12-day window only 2 GetRetentionOffer HTTP requests showed up at all. No blow-up risk.
Structured-log convention: must use zap, not stdlib log. Established in the same file at L652-656 and L709, and across stripe_context.go. The compile-ready replacement above follows the house style.
No security-relevant follow-ups. Tier 2 items (Stripe Portal flow_data, legacy CTA gate) have product/UX implications but no security implications. Tier 4 (missing TOBY_RETENTION* secrets) is housekeeping; the code's fail-soft fallback to struct-tag defaults is more secure than a hard-fail on missing secrets, and the current defaults are the values actually expected.
What would change my verdict to validated + high
Single change: the synthesis-draft's Tier 1 "concrete diff" replaced with the compile-ready block in the Quality section above (correct zap API, correct nested field paths, scoped variables, Offer != nil guard). No diagnosis changes needed; no further evidence-gathering needed; no re-dispatch of doctors needed.
What would change it to rejected
Spot-check disconfirming any of the numbers (120/22/17/1) — would invalidate the funnel framing. None disconfirmed; verdict not rejected.
Recommendation to coordinator
Close the canonical doc with status: in_review. In one paragraph in the doc's open-questions section, paste the compile-ready replacement block from the Quality section above and tag a Toby Go reviewer (any owner of apps/api/context/v3/) for sign-off. After approval, ship via fix-shipper — the verify plan as written will work correctly with zap-emitted structured logs.