Delivery Plan
Three phases from foundation to mature governance over 18 months
Ship fast, prove value, then scale. Every phase has measurable gates — if the data doesn't support continuing, we stop and recalibrate before burning more budget.
Phase 1: Build & Validate
Day 0–180 — foundation spikes, controlled pilot, then hardening.
Foundation
Day 0–60- Establish baseline metrics across SG GrabFood support
- Spike 1 (2 weeks): Policy retrieval engine — target >90% precision on top-20 policies
- Spike 2 (3 weeks): Deviation taxonomy classifier — map override categories to risk tiers
- Spike 3 (3 weeks): Copilot panel v1 — inline suggestions in agent desktop
- Stand up inference gateway with model routing and token tracking
Pilot
Day 60–120- Market: Singapore, GrabFood vertical
- 50 agents in treatment group + 50 control agents
- 8-week controlled experiment with weekly data reviews
- Go/no-go decision gate at Day 90
Harden
Day 120–180- Activate risk scoring with approval gates for Tier H deviations
- Launch post-resolution audit pipeline (sampled)
- Supervisor queue with SLA-based routing
- Weekly model calibration cycles begin
Go / No-Go Gates — Day 90
Five criteria that must pass before the pilot graduates to hardening. Miss any gate and we pause.
Policy adherence
CSAT
P50 resolution time
Agent adoption
Risk precision
All five gates must pass at Day 90. Failure on any metric triggers a pause-and-recalibrate cycle.
Phase 2: Expand & Learn
Day 181–365 — multi-market rollout, ML upgrades, and gamification.
Phase 3: Scale & Optimize
Day 366–540 — self-serve migration, upstream prevention, and platform maturity.
Supervisor Capacity Model
How many supervisor FTEs are needed at each scale for Tier H review queues.
Assumption: ~1.5% of all tickets produce Tier H deviations requiring supervisor review. At ~10 min/review, one FTE handles ~1,000 reviews/month. As ML precision improves, false positives drop and FTE needs decrease.
Feasibility Matrix
Technical feasibility and confidence level for each core capability.
| Capability | Feasibility | Confidence |
|---|---|---|
| Real-time policy suggestions | High | Medium |
| Deviation risk scoring | Medium | Medium |
| Selective approval gates | High | High |
| Post-resolution audit | High | Medium |
| Multi-market policy engine | Medium | Low–Med |
| Cost-optimized model routing | High | Medium |
Prove it small, scale it fast
The 180-day foundation phase is designed to produce clear, measurable proof that selective friction works — before committing to multi-market expansion. Every gate is quantifiable and every phase has a kill switch.