AI Customer Support Agents — 2026 Unit Economics Dossier
1. Human Baseline Cost
- All-in cost-per-ticket: $2.93–$49.69, avg $15.56 (LiveChatAI)
- US in-house: $35–55/hr → $3.50–5.50/ticket at 6-min AHT
- Offshore (PH/IN): $7–16/hr → $0.70–1.60/ticket
- 20-agent Manila team: $15K–22K/month all-in
- BPO hidden fees: +15-30%
- By tier (MetricNet): Tier-0 self-serve $2-5; Tier-1 ~$22; Tier-2 ~$70; Tier-3 $104+
- Crypto Tier-2 reality (Dapper-shape, identity-sensitive): $25-40/ticket US; Tier-3 escalations $100+
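The hourly-to-ticket conversions above follow from prorating the hourly rate by handle time. A minimal sketch, assuming agents are fully occupied (no idle time, no overhead beyond the quoted rate); the function name is illustrative:

```python
def cost_per_ticket(hourly_rate: float, aht_minutes: float = 6.0) -> float:
    """Labor cost of one ticket: hourly rate prorated by average handle time."""
    return hourly_rate * aht_minutes / 60.0


# Hourly ranges quoted in the list above, at the 6-minute AHT it states.
for label, low, high in [("US in-house", 35, 55), ("Offshore (PH/IN)", 7, 16)]:
    print(f"{label}: ${cost_per_ticket(low):.2f}-${cost_per_ticket(high):.2f} per ticket")
```

Lower occupancy raises the effective cost per ticket, which is one reason the all-in averages sit well above these labor-only figures.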
2. AI Token Math (May 2026)
Per-token pricing:
| Model | Input $/M | Output $/M | Cached input |
|---|---|---|---|
| Claude Sonnet 4.6 | $3 | $15 | $0.30 |
| Claude Haiku 4.5 | $1 | $5 | $0.10 |
| Claude Opus 4.7 | $5 | $25 | $0.50 |
| GPT-5 | $1.25 | $10 | $0.125 |
| GPT-5.5 (Apr 2026) | $5 | $30 | n/a — prices doubled |
| Gemini 2.5 Pro | $1 | $10 | discounted |
| Gemini 2.5 Flash | $0.30 | $2.50 | discounted |
| Llama 4 70B (Together) | $0.88 | $0.88 | n/a |
Per-resolution costs (defensible 2026 inference cost):
- Haiku 4.5 + caching: $0.04–0.08
- Sonnet 4.6 + caching, 3.5 calls avg: $0.10–0.18
- Cascaded (Haiku→Sonnet): $0.07–0.12 weighted
- Long-context retrieval blow-up: $0.30–0.60
- HITL (AZ shape): inference $0.10–0.18 + reduced human ~$1.80 ≈ $1.90–2.00/ticket (vs $4-5 fully human, $0.15 fully AI)
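A per-resolution figure can be rebuilt from the pricing table by billing cached input, fresh input, and output tokens separately. The sketch below uses the table's Sonnet 4.6 and Haiku 4.5 prices and the 3.5-call average; the per-call token counts and the 30% escalation rate are hypothetical placeholders, not measurements from this dossier, and the cascade model (every ticket tries Haiku first, a fraction escalates) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per million tokens, copied from the pricing table above."""
    fresh_input: float
    output: float
    cached_input: float


SONNET_46 = Price(fresh_input=3.00, output=15.00, cached_input=0.30)
HAIKU_45 = Price(fresh_input=1.00, output=5.00, cached_input=0.10)


def call_cost(price: Price, cached_in: int, fresh_in: int, out: int) -> float:
    """USD cost of one model call, billing cached and fresh input separately."""
    micro_usd = (cached_in * price.cached_input
                 + fresh_in * price.fresh_input
                 + out * price.output)
    return micro_usd / 1_000_000


def resolution_cost(price: Price, calls: float, **tokens: int) -> float:
    """USD cost of one resolution that averages `calls` model calls."""
    return calls * call_cost(price, **tokens)


# HYPOTHETICAL per-call token counts; replace with measured values.
TOKENS = dict(cached_in=25_000, fresh_in=3_000, out=1_000)

sonnet = resolution_cost(SONNET_46, calls=3.5, **TOKENS)
haiku = resolution_cost(HAIKU_45, calls=3.5, **TOKENS)

# HYPOTHETICAL cascade: every ticket tries Haiku, 30% escalate to Sonnet.
ESCALATION_RATE = 0.30
cascade = haiku + ESCALATION_RATE * sonnet

print(f"Sonnet 4.6: ${sonnet:.3f}  Haiku 4.5: ${haiku:.3f}  cascade: ${cascade:.3f}")
```

Because cached input is billed at a tenth of fresh input in the table, the share of the prompt that hits the cache is the largest lever in this model.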
Cost blow-ups:
- Long-context retrieval (>2.5K retrieved tokens degrades quality + raises cost linearly)
- Agentic ReAct loops — O(n²) token growth; unconstrained agents measured $5-8/task
- Fine-tuning — $5K-50K per training run + 2-5x inference; mostly abandoned in 2024
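The O(n²) growth in agentic loops comes from re-sending the whole transcript on every step. A minimal sketch, assuming each step appends a fixed number of tokens (thought, action, observation) and no caching or truncation is applied; the token counts are hypothetical:

```python
def react_input_tokens(steps: int, base_prompt: int, tokens_per_step: int) -> int:
    """Total input tokens billed when every step re-sends the full history."""
    total = 0
    context = base_prompt
    for _ in range(steps):
        total += context            # the whole context is billed as input again
        context += tokens_per_step  # thought + action + observation appended
    return total


# HYPOTHETICAL sizes: 2,000-token system prompt, 1,500 tokens added per step.
for n in (5, 10, 20, 40):
    print(f"{n:>2} steps -> {react_input_tokens(n, 2_000, 1_500):,} input tokens")
```

Doubling the step count roughly quadruples billed input once the history dominates the prompt, which is why a hard step cap is the simplest cost control for this pattern.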
3. Pricing Models in Market
- Per-resolution (Fin $0.99, Decagon, Sierra ~$1-3): growing fastest
- Per-conversation (Decagon default, Ada): $0.50-2
- Per-seat (Zendesk legacy, Salesforce): $50-200/seat — declining 21%→15% in 12mo
- Hybrid (platform fee + outcome): 27%→41% in 12mo — actual winner
- Implementation services: $50-200K one-time, universal
- Per-resolution backfires above ~5K resolutions/mo: the line items look bigger than the human cost they replace; Supp.support et al. are publishing "per-resolution is a trap"
4. Vendor Gross Margins
- Realistic 2026 blended GM for pure-play AI support: 60-70% (NOT 80-85% SaaS norm)
- Inference COGS: 8-25% depending on vertical
- FDE/forward-deployed engineering: 5-15% (counted as COGS, not S&M)
- Per-customer prompt/skill tuning: 3-10%
- Sierra reportedly mid-60s GM despite $150K+ platform fees (heavy FDE)
- Decagon similar
- Cresta cautionary: $52M revenue / 508 employees = $102K rev/employee, well below SaaS norms
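The blended margin range can be sanity-checked by subtracting the three COGS components listed above from revenue. This is a rough sketch that assumes those three lines are the only COGS (hosting, support staff, and payment fees are ignored); the function name is illustrative.

```python
def gross_margin(inference: float, fde: float, tuning: float) -> float:
    """Gross margin in percent of revenue, given COGS components in percent."""
    return 100.0 - (inference + fde + tuning)


best = gross_margin(8, 5, 3)          # low end of every COGS range
worst = gross_margin(25, 15, 10)      # high end of every COGS range
mid = gross_margin(16.5, 10.0, 6.5)   # midpoint of every COGS range

print(f"best {best:.0f}%, midpoint {mid:.0f}%, worst {worst:.0f}%")

# Revenue per employee for the Cresta figures quoted above.
print(f"${52_000_000 / 508:,.0f} revenue per employee")
```

The midpoint case lands inside the 60-70% band, while reaching the SaaS norm requires every component to sit at the bottom of its range at once.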
5. Dapper-Sized ROI Case (50 tickets/day = 18,250/yr)
- Mix: 60% T1, 30% T2, 10% T3
- Human baseline (US-leveraged crypto identity work): T1@$20×11K ($220K) + T2@$50×5.5K ($275K) + T3@$120×1.8K ($216K) = ~$711K/yr
- With AI agent (50% T1 deflect, 30% T2 deflect, HITL on rest):
  - AI inference: trivial line
  - Reduced human: ~$245K
  - AI vendor ACV: $100K
  - Total ~$345K, savings ~$366K, payback <4 months on $100K ACV
- Break-even thresholds:
  - $100K ACV → 28% deflection on 50/day (achievable; industry avg 23%, best-in-class 45-50%)
  - $200K ACV → 55% deflection (only narrow Tier-1)
- Skeptical caveat: vendor "deflection rate" is gamed (Intercom counts no-reply as resolved). Quality-adjusted is 60-75% of headline.
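The ROI arithmetic for this case can be laid out step by step. The sketch below uses the rounded tier volumes and per-ticket costs from the list, for which the tier products sum to about $711K, and takes the ~$245K reduced-human figure as given. Defining payback as the vendor ACV divided by monthly labor savings is an assumption; the dossier does not state its formula.

```python
def roi(baseline: float, reduced_human: float, vendor_acv: float):
    """Return (total cost with AI, net annual savings, payback in months)."""
    total_with_ai = reduced_human + vendor_acv
    net_savings = baseline - total_with_ai
    monthly_labor_savings = (baseline - reduced_human) / 12
    payback_months = vendor_acv / monthly_labor_savings  # assumed definition
    return total_with_ai, net_savings, payback_months


# Tier cost per ticket x rounded annual volume, as listed above.
baseline = 20 * 11_000 + 50 * 5_500 + 120 * 1_800

total, savings, payback = roi(baseline, reduced_human=245_000, vendor_acv=100_000)
print(f"baseline ${baseline:,}, with AI ${total:,}, savings ${savings:,}")
print(f"payback {payback:.1f} months")

# Quality-adjusting a headline deflection rate by the 60-75% factor above.
headline = 0.45
print(f"quality-adjusted: {headline * 0.60:.0%} to {headline * 0.75:.0%}")
```

Applying the quality adjustment to a best-in-class 45% headline rate puts the realistic figure close to the stated break-even threshold for a $100K ACV.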
6. Volume Threshold for Viability
- Floor: $50K vendor ACV + $30K amortized impl + $50K internal AI-ops = $130K/yr fixed
- At $15 blended/ticket × 40% deflection ($6 saved per ticket) → break-even at ~21.7K tickets/yr ≈ 60/day
- Below 30/day, AI tools cost MORE than they save unless: (a) SLA premium, (b) self-serve + zero impl, (c) regulated industry $100+/ticket
- Strategic: Mid-market 100-500/day is the sweet spot. Enterprise >2,000/day = FDE-heavy services compressing margins. SMB <30/day = self-serve land grab.
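The volume floor is a break-even calculation: fixed annual cost divided by savings per ticket. A minimal sketch, assuming each deflected ticket saves its full blended cost and ignoring per-ticket AI inference cost; the function name is illustrative.

```python
def breakeven_tickets_per_year(fixed_cost: float,
                               blended_cost_per_ticket: float,
                               deflection: float) -> float:
    """Annual ticket volume at which deflection savings equal fixed cost."""
    savings_per_ticket = blended_cost_per_ticket * deflection
    return fixed_cost / savings_per_ticket


# Vendor ACV + amortized implementation + internal AI-ops, from the list above.
fixed = 50_000 + 30_000 + 50_000

tickets = breakeven_tickets_per_year(fixed, blended_cost_per_ticket=15, deflection=0.40)
print(f"{tickets:,.0f} tickets/yr, about {tickets / 365:.0f} per day")
```

Raising the blended cost to the $100+/ticket of a regulated industry cuts the required volume by more than six times, which is the exception (c) in the list.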
7. Switching Cost / TCO
Year-1 TCO multiplier: 2.0–2.5× quoted ACV
- Implementation services: $40-200K
- Internal eng integration: 160-400 hrs ($30-80K)
- Training data prep: $20-100K
- KB curation: $20-60K
- Change management: $30-100K
- Dual-running 2-6 months
- Ongoing prompt/skill maintenance: 0.5-1 FTE
- Cash payback for mid-market: 12-18 months realistic
8. Founder Economics
Bootstrap to $1M ARR:
- 4 engineers × $250K loaded × 18 months = $1.5M
- Inference + infra: $270K
- Total: $1.8-2.5M to reach 10 customers × $100K — within seed range
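The bootstrap total is payroll over the runway plus inference and infrastructure. A small sketch using only the figures in the list; the function name is illustrative, and the gap between the computed sum and the top of the quoted range is treated here as unitemized contingency.

```python
def burn_to_target(engineers: int, loaded_cost_per_year: float,
                   months: int, inference_infra: float) -> float:
    """Total spend over the runway: prorated payroll plus inference/infra."""
    payroll = engineers * loaded_cost_per_year * months / 12
    return payroll + inference_infra


burn = burn_to_target(engineers=4, loaded_cost_per_year=250_000,
                      months=18, inference_infra=270_000)
target_arr = 10 * 100_000  # 10 customers at $100K each

print(f"burn ${burn:,.0f} for ${target_arr:,} ARR "
      f"({burn / target_arr:.2f} dollars burned per dollar of ARR)")
```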
Competitive (vs Sierra/Decagon head-to-head):
- 6-10 core engineers + 3-5 FDE by 10 customers
- Full-time evals/red team
- $5-8M/year burn at product-ready stage
Scaling curve: $1M→$10M now 2× cheaper than classic SaaS. $10M→$100M is roughly the same as classic SaaS, sometimes worse, because outcome pricing compresses revenue as product improves.
9. What Kills These Companies
- Forethought — acquired by Zendesk March 2026, ~$200M+. Mid-tier squeezed between Zendesk-AI bottom and Sierra/Decagon top. Lesson: middle is hard to defend.
- Subtl.ai shut down July 2025 — horizontal "AI for knowledge" without vertical fit
- DigitalGenius — pre-LLM era, pivoted to Shopify/ecom niche, no longer in conversation
- 27% of AI startups (3,800) shut down 2025, +13% (1,800) early 2026 — ~40% failure in 24mo
- Cresta scaling cautionary: $52M rev / 508 employees = broken cost structure
- Qualtrics: AI customer service fails at 4× rate of any other AI use case for end-customer satisfaction
10. Five Sharpest Economic Insights
- HITL has fundamentally better unit economics than full-autonomy. $0.15 inference + $1.80 reduced human = ~$2/ticket. Customer keeps control. Sell certainty, not deflection.
- Vendor margin floor is 60-65%, not 80%. FDE/solutions is COGS, not S&M. Build pricing to land at $2-3/ticket-touched (HITL) so $100K customer = ~50K tickets/yr, which covers FDE.
- Per-resolution is the puck, but the puck is moving to hybrid. Pure per-resolution produces customer rage at scale and renegotiation when AI improves. Price like Twilio: floor + meter + ceiling.
- Viable customer floor ~30/day, sweet spot 100-500/day. Below 30 → fixed cost crushes ROI. Above 2,000 → bespoke compresses margins. Mid-market is AZ-shaped target, and also where Dapper sits and most crypto/web3 lives.
- Inference costs NOT going to zero. Sonnet 4.6 flat from 4.5; GPT-5.5 doubled prices April 2026. The era of monotonic price drops is over. Win comes from caching architecture, cascaded routing, bounded agentic loops, not from waiting for prices to fall. Build cost discipline into v1.
Bottom-line: A focused 4-engineer AZ team can plausibly hit $1M ARR in 18 months on $2-2.5M burn by targeting 100-500-tickets/day mid-market with HITL-first product, hybrid pricing (~$60K floor + $0.50/ticket overage), and vertical wedge (web3/crypto Tier-2 work). Avoid Sierra/Cresta playbook of $200M+ raises for F500 logos.
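The "floor + meter + ceiling" shape can be expressed as a single invoice function. The $60K floor and $0.50/ticket overage come from the text; the included ticket volume and the ceiling are hypothetical values chosen only to make the sketch concrete.

```python
from typing import Optional


def annual_invoice(tickets: int,
                   floor: float = 60_000.0,
                   overage_rate: float = 0.50,
                   included_tickets: int = 30_000,       # HYPOTHETICAL
                   ceiling: Optional[float] = 120_000.0  # HYPOTHETICAL
                   ) -> float:
    """Hybrid price: platform floor, metered overage, optional hard cap."""
    billable = max(0, tickets - included_tickets)  # only overage is metered
    amount = floor + billable * overage_rate
    if ceiling is not None:
        amount = min(amount, ceiling)              # cap protects the buyer
    return amount


for volume in (20_000, 50_000, 200_000):
    print(f"{volume:>7,} tickets -> ${annual_invoice(volume):,.0f}")
```

The floor gives the vendor predictable revenue to cover FDE, while the ceiling bounds the line item that makes pure per-resolution pricing hard to defend at scale.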
Risk: LLM platform players (Anthropic, OpenAI) ship "support agent" first-party in 2026-27 and compress vertical into a feature. Defensible moat = integration depth + vertical workflows + HITL UX, not model orchestration.