A
AIOS Wiki
read-only · public mirror
Open AIOS
Wikiai-supportresearchai-support/research/04-phase-2-unit-economics.md

AI Customer Support Agents — 2026 Unit Economics Dossier

Hand-authored·5 min read·10 sections·Last edited May 12 by initial import·View history

1. Human Baseline Cost

  • All-in cost-per-ticket: $2.93–$49.69, avg $15.56 (LiveChatAI)
  • US in-house: $35–55/hr → $3.50–5.50/ticket at 6-min AHT
  • Offshore (PH/IN): $7–16/hr → $0.70–1.60/ticket
  • 20-agent Manila team: $15K–22K/month all-in
  • BPO hidden fees: +15-30%
  • By tier (MetricNet): Tier-0 self-serve $2-5; Tier-1 ~$22; Tier-2 ~$70; Tier-3 $104+
  • Crypto Tier-2 reality (Dapper-shape, identity-sensitive): $25-40/ticket US; Tier-3 escalations $100+

2. AI Token Math (May 2026)

Per-token pricing:

ModelInput $/MOutput $/MCached input
Claude Sonnet 4.6$3$15$0.30
Claude Haiku 4.5$1$5$0.10
Claude Opus 4.7$5$25$0.50
GPT-5$1.25$10$0.125
GPT-5.5 (Apr 2026)$5$30n/a — prices doubled
Gemini 2.5 Pro$1$10discounted
Gemini 2.5 Flash$0.30$2.50discounted
Llama 4 70B (Together)$0.88$0.88n/a

Per-resolution costs (defensible 2026 inference cost):

  • Haiku 4.5 + caching: $0.04–0.08
  • Sonnet 4.6 + caching, 3.5 calls avg: $0.10–0.18
  • Cascaded (Haiku→Sonnet): $0.07–0.12 weighted
  • Long-context retrieval blow-up: $0.30–0.60
  • HITL (AZ shape): inference $0.10–0.18 + reduced human $1.80 = $1.80–2.20/ticket (vs $4-5 fully human, $0.15 fully AI)

Cost blow-ups:

  1. Long-context retrieval (>2.5K retrieved tokens degrades quality + raises cost linearly)
  2. Agentic ReAct loops — O(n²) token growth; unconstrained agents measured $5-8/task
  3. Fine-tuning — $5K-50K per training run + 2-5x inference; mostly abandoned in 2024

3. Pricing Models in Market

  • Per-resolution (Fin $0.99, Decagon, Sierra ~$1-3): growing fastest
  • Per-conversation (Decagon default, Ada): $0.50-2
  • Per-seat (Zendesk legacy, Salesforce): $50-200/seat — declining 21%→15% in 12mo
  • Hybrid (platform fee + outcome): 27%→41% in 12mo — actual winner
  • Implementation services: $50-200K one-time, universal
  • Per-resolution backfires above ~5K resolutions/mo — line items look bigger than human cost; Supp.support et al publishing "per-resolution is a trap"

4. Vendor Gross Margins

  • Realistic 2026 blended GM for pure-play AI support: 60-70% (NOT 80-85% SaaS norm)
  • Inference COGS: 8-25% depending on vertical
  • FDE/forward-deployed engineering: 5-15% (counted as COGS, not S&M)
  • Per-customer prompt/skill tuning: 3-10%
  • Sierra reportedly mid-60s GM despite $150K+ platform fees (heavy FDE)
  • Decagon similar
  • Cresta cautionary: $52M revenue / 508 employees = $102K rev/employee, well below SaaS norms

5. Dapper-Sized ROI Case (50 tickets/day = 18,250/yr)

  • Mix: 60% T1, 30% T2, 10% T3
  • Human baseline (US-leveraged crypto identity work): T1@$20×11K + T2@$50×5.5K + T3@$120×1.8K = ~$714K/yr
  • With AI agent (50% T1 deflect, 30% T2 deflect, HITL on rest):
    • AI inference: trivial line
    • Reduced human: ~$245K
    • AI vendor ACV $100K
    • Total ~$345K, savings ~$369K, payback <4 months on $100K ACV
  • Break-even thresholds:
    • $100K ACV → 28% deflection on 50/day (achievable; industry avg 23%, best-in-class 45-50%)
    • $200K ACV → 55% deflection (only narrow Tier-1)
  • Skeptical caveat: vendor "deflection rate" is gamed (Intercom counts no-reply as resolved). Quality-adjusted is 60-75% of headline.

6. Volume Threshold for Viability

  • Floor: $50K vendor ACV + $30K amortized impl + $50K internal AI-ops = $130K/yr fixed
  • At $15 blended/ticket × 40% deflection → need ~21K tickets/yr ≈ 60/day floor
  • Below 30/day, AI tools cost MORE than they save unless: (a) SLA premium, (b) self-serve + zero impl, (c) regulated industry $100+/ticket
  • Strategic: Mid-market 100-500/day is the sweet spot. Enterprise >2,000/day = FDE-heavy services compressing margins. SMB <30/day = self-serve land grab.

7. Switching Cost / TCO

Year-1 TCO multiplier: 2.0–2.5× quoted ACV

  • Implementation services: $40-200K
  • Internal eng integration: 160-400 hrs ($30-80K)
  • Training data prep: $20-100K
  • KB curation: $20-60K
  • Change management: $30-100K
  • Dual-running 2-6 months
  • Ongoing prompt/skill maintenance: 0.5-1 FTE
  • Cash payback for mid-market: 12-18 months realistic

8. Founder Economics

Bootstrap to $1M ARR:

  • 4 engineers × $250K loaded × 18 months = $1.5M
  • Inference + infra: $270K
  • Total: $1.8-2.5M to reach 10 customers × $100K — within seed range

Competitive (vs Sierra/Decagon head-to-head):

  • 6-10 core engineers + 3-5 FDE by 10 customers
  • Full-time evals/red team
  • $5-8M/year burn at product-ready stage

Scaling curve: $1M→$10M now 2× cheaper than classic SaaS. $10M→$100M is roughly the same as classic SaaS, sometimes worse, because outcome pricing compresses revenue as product improves.

9. What Kills These Companies

  1. Forethought — acquired by Zendesk March 2026, ~$200M+. Mid-tier squeezed between Zendesk-AI bottom and Sierra/Decagon top. Lesson: middle is hard to defend.
  2. Subtl.ai shut down July 2025 — horizontal "AI for knowledge" without vertical fit
  3. DigitalGenius — pre-LLM era, pivoted to Shopify/ecom niche, no longer in conversation
  4. 27% of AI startups (3,800) shut down 2025, +13% (1,800) early 2026 — ~40% failure in 24mo
  5. Cresta scaling cautionary: $52M rev / 508 employees = broken cost structure
  6. Qualtrics: AI customer service fails at 4× rate of any other AI use case for end-customer satisfaction

10. Five Sharpest Economic Insights

  1. HITL has fundamentally better unit economics than full-autonomy. $0.15 inference + $1.80 reduced human = ~$2/ticket. Customer keeps control. Sell certainty, not deflection.

  2. Vendor margin floor is 60-65%, not 80%. FDE/solutions is COGS, not S&M. Build pricing to land at $2-3/ticket-touched (HITL) so $100K customer = ~50K tickets/yr — covers FDE.

  3. Per-resolution is the puck, but the puck is moving to hybrid. Pure per-resolution produces customer rage at scale and renegotiation when AI improves. Price like Twilio: floor + meter + ceiling.

  4. Viable customer floor ~30/day, sweet spot 100-500/day. Below 30 → fixed cost crushes ROI. Above 2,000 → bespoke compresses margins. Mid-market is AZ-shaped target — also where Dapper sits and most crypto/web3 lives.

  5. Inference costs NOT going to zero. Sonnet 4.6 flat from 4.5; GPT-5.5 doubled prices April 2026. The era of monotonic price drops is over. Win comes from caching architecture, cascaded routing, bounded agentic loops — not from waiting for prices to fall. Build cost discipline into v1.

Bottom-line: A focused 4-engineer AZ team can plausibly hit $1M ARR in 18 months on $2-2.5M burn by targeting 100-500-tickets/day mid-market with HITL-first product, hybrid pricing (~$60K floor + $0.50/ticket overage), and vertical wedge (web3/crypto Tier-2 work). Avoid Sierra/Cresta playbook of $200M+ raises for F500 logos.

Risk: LLM platform players (Anthropic, OpenAI) ship "support agent" first-party in 2026-27 and compress vertical into a feature. Defensible moat = integration depth + vertical workflows + HITL UX, not model orchestration.