Discovery & Research

Baseline inputs, stakeholder mapping, interview plan, and key assumptions

Before designing the copilot, we need to understand what exists, who cares, and what we're betting on. This section covers Grab's existing infrastructure, the 11 stakeholder groups whose buy-in determines success, a structured interview plan, and the 6 assumptions that must hold for the selective friction model to work.

Baseline Research

What Grab already has — public sources and engineering disclosures that anchor Workstream 3 design.

Workforce Routing

Grab Engineering Blog, 2020

8 countries, skill-based + priority routing with safety bumping. Dynamic queue management.

Copilot must integrate with routing data (skill tags, priority) to stay aligned.

In-House Chat Platform

Grab Engineering Blog, 2020

Persistent sessions, unified customer context, integrated CRM — agents see full history.

LLM insight layer can pull structured context instead of re-asking customers.

Automated FAQ Tier

Grab Engineering Blog, 2020

AI-powered first line handles repetitive inquiries (balance, status checks).

Workstream 3 focuses on cases that slip past automation — higher complexity, higher risk.

AI Merchant Assistant & Driver Companion

Anthropic + Grab, 2025

25% negative sentiment reduction, +5.7ppt resolution rate for merchants. 250k+ drivers supported.

Leadership already invests in AI; copilot must match similar depth and ROI story.

AI-Led Profit Target

The Star / Reuters, Feb 2026

Grab targets tripling profit by 2028 via AI and new services.

Copilot must demonstrate measurable operating leverage to align with top-level goals.

Policy & Control Landscape

Grab public policy pages

Cancellation policies, refund timelines (GrabFood/Mart/Pay), safety zero-tolerance.

Workstream 3 must embed these policy constraints into deviation handling.

Key Design Implications

1. Existing automation covers FAQ flows — WS3 is about complex cases and policy exceptions
2. Routing and chat context are already rich — leverage, don't recreate
3. Leadership expects AI to hit operating metrics — need balanced scorecard
4. Policy constraints vary by product/market — market packs must map thresholds

Stakeholder Analysis

11 stakeholder groups whose alignment determines whether the copilot works.

Support Agents

Primary user
Needs: Speed, context, confidence in suggestions. Low-friction workflow. Recognition for good judgment.
Concern: Fear of rigid tools and surveillance; worry that deviation tracking becomes punishment.
>70% adoption within 60 days; trust score ≥3.5/5

Supervisors

Queue reviewer
Needs: Prioritized queue with rationale, risk breakdown, and case context. Manageable volume.
Concern: Queue overload during peaks. Accountability for approved deviations that later cause problems.
<15 min SLA Tier H; false-positive rate <15%

Customers

Indirect beneficiary
Needs: Fast, fair, respectful outcomes. Don't want to repeat information.
Concern: Will notice if quality degrades (longer waits from unnecessary escalation).
CSAT ≥4.2/5; resolution time P50 stable or improved

Policy / Compliance

Rule authority
Needs: Consistency, auditability, risk control. Explainable decisions with full provenance.
Concern: Copilot misclassifying policy-risk as corrective (false negatives).
Audit-ready evidence on 100% of Tier H/C cases

Legal / Privacy

Data governance
Needs: Compliance with PDPA, PDP, DPA across 6 SEA markets. Clear data classification.
Concern: Deviation logs containing customer PII without pseudonymization.
All privacy open items resolved before pilot launch

Ops Managers

Market-level owner
Needs: Balanced scorecard: CSAT, AHT, cost, policy adherence — visible together.
Concern: Copilot creating new ops burden (calibration, queue management, maintenance).
Cost per resolved case within target band

QA / Training

Quality feedback
Needs: Observable patterns by case type, market, agent cohort. Actionable coaching insights.
Concern: Audit pipeline producing too many low-signal flags.
QA agreement rate >80% on audit flags

Finance

Budget authority
Needs: Predictable LLM spend. Visibility into compensation leakage. Clear attribution.
Concern: LLM costs scaling faster than value delivered.
LLM cost per resolved case <$0.01

Engineering / ML

Build team
Needs: Clean labeled data (500+ Tier H cases). Latency budgets <500ms. Stable integrations.
Concern: Label quality, integration schema drift, classifier accuracy degradation.
Classifier precision/recall targets met; e2e <500ms

Workstreams 1 & 2

Cross-workstream
Needs: Structured handoff: graduation candidates with confidence, prevention signals with root causes.
Concern: Noisy or premature graduation signals. Misaligned definitions of 'ready'.
≥3 flows graduated within 12 months

Merchants / Drivers

Indirect
Needs: Fair, policy-consistent treatment. Reduced arbitrary compensation.
Concern: Downstream effects of inconsistent agent decisions.
Reduced compensation leakage; consistent treatment

Interview Plan

Structured discovery across 7 stakeholder groups, 26–35 interviews total.

Markets: SG (pilot) + one expansion market (MY or ID)

Format: 30–45 min semi-structured; recorded with consent; notes shared within 48h

| Group | Interviews | Selection Criteria | Questions |
|---|---|---|---|
| Support agents | 12–15 | Mix of tenured (>6m) + newer; high-override + low-override; SG + expansion market | 12 |
| Supervisors | 4–6 | Currently handling escalation queues; mix of SG and expansion market | 10 |
| Ops managers | 3–4 | SG + expansion; at least one with cross-market visibility | 9 |
| Policy / compliance | 2–3 | Regional compliance lead + market-level policy owner | 9 |
| QA / training | 2–3 | QA leads covering pilot product lanes | 8 |
| Legal / privacy | 2 | Grab Legal (data privacy counsel) + market-level contact | 7 |
| Finance | 1–2 | Support cost owner or FP&A covering support operations | 5 |

Sample Discovery Questions

  • Agents: When you override a suggestion, how do you decide what to do instead? What information do you wish you had?
  • Agents: If we asked you to tag overrides with a reason (<5 seconds), would that feel reasonable or like surveillance?
  • Supervisors: What signals tell you an override is genuinely risky vs. a reasonable judgment call?
  • Ops: If we flag 5% of overrides as high-risk for supervisor review, does that feel like too many, too few, or about right?
  • Legal: Does Grab's existing support data processing consent cover logging agent deviation data?
  • Finance: If the copilot reduces cost per case by X%, how would Finance want that attributed?

Interview Output Artifacts

| Artifact | Feeds Into |
|---|---|
| Deviation heatmap by market/product/case type | Pilot lane selection; classifier calibration |
| High-risk action inventory (ranked) | Policy enforcement hard boundaries; risk scoring weights |
| Threshold calibration inputs | Risk score formula; market policy packs |
| Agent trust and adoption risk assessment | Change management phasing; communication plan |
| Supervisor workflow requirements | Supervisor queue UX design; SLA settings |
| Gamification design preferences | Reinforcement mechanics in autonomy model |
| Policy freshness / maintenance estimate | Assumptions tracker validation |
| Privacy/consent gap list | Policy enforcement framework — open items |
| Cost-per-case baseline by market | Cost model calibration; success metric baselines |
| Pilot lane recommendation (confirmed) | Delivery plan Day 60–120 scope |
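Two of the artifacts above (high-risk action inventory, threshold calibration inputs) feed a risk score and its gating threshold. A minimal sketch of how those pieces could fit together; the linear form, feature names, weights, and the ~5% flag rate (from the Ops discovery question) are all illustrative assumptions, not a specified formula:

```python
# Hypothetical weighted risk score for an agent override. Features and
# weights are placeholders pending the threshold-calibration interviews.
WEIGHTS = {
    "policy_rule_hit": 0.5,      # override touches a hard policy rule (0 or 1)
    "compensation_amount": 0.3,  # normalized payout size (0..1)
    "agent_override_rate": 0.2,  # agent's trailing override frequency (0..1)
}

def risk_score(features: dict[str, float]) -> float:
    """Linear score in [0, 1]; scores above a threshold gate to supervisors."""
    return sum(WEIGHTS[k] * features.get(k, 0.0) for k in WEIGHTS)

def calibrate_threshold(scores: list[float], flag_rate: float = 0.05) -> float:
    """Pick the cutoff that flags roughly `flag_rate` of historical overrides."""
    ranked = sorted(scores, reverse=True)
    k = max(1, int(len(ranked) * flag_rate))
    return ranked[k - 1]

# Toy history of 100 override scores, then pick a cutoff at the ~5% mark.
scores = [risk_score({"policy_rule_hit": i % 2, "compensation_amount": i / 100})
          for i in range(100)]
threshold = calibrate_threshold(scores)
flagged = sum(s >= threshold for s in scores)
print(flagged)  # → 5 of 100 overrides gated for review
```

Calibrating the threshold from the historical score distribution, rather than fixing it a priori, keeps the supervisor queue volume near the agreed flag rate as case mix shifts.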

Assumptions Tracker

6 assumptions that must hold for the selective friction model to work. P0 is existential.

#1 (P1): Better contextual suggestions raise agent adoption above 70%
Exit: >70% of agents use suggestions ≥1x/session within 60 days · Status: untested

#2 (P0): Most policy-risk can be captured by <20 high-signal rules (>80% coverage)
Exit: >80% of supervisor-confirmed violations covered by ≤20 rules · Status: untested
Existential: if this fails, the selective-friction model shifts to heavier ML classification.

#3 (P1): Selective gates won't hurt resolution time (P50 regression <5%)
Exit: P50 regression <5% in gated lanes vs control · Status: untested

#4 (P2): Post-resolution audits detect quality-risk with QA agreement >80%
Exit: QA agreement rate >80% on audit flags · Status: untested

#5 (P1): Cost-aware routing keeps LLM cost <$0.01/case at 1M tickets
Exit: Cost per resolved case <$0.01 at steady state · Status: untested

#6 (P2): Policy packs maintain <48h freshness SLA
Exit: Freshness SLA >95%; stale incidents <2% · Status: untested
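Assumption #2's exit criterion (>80% of supervisor-confirmed violations covered by ≤20 rules) is directly measurable once labeled cases exist. A hedged sketch, assuming each case is tagged with the rule IDs that fired on it; the greedy set-cover approach and the rule names in the toy data are illustrative:

```python
from collections import Counter

def coverage_at_k(violations: list[set[str]], k: int = 20) -> float:
    """Fraction of confirmed violations matched by the k highest-coverage rules.

    Greedy set cover: repeatedly pick the rule covering the most still-uncovered
    cases. `violations` maps each case to the set of rules that fired on it.
    """
    remaining = list(violations)
    chosen: set[str] = set()
    for _ in range(k):
        counts = Counter(rule for case in remaining for rule in case)
        if not counts:
            break  # everything already covered
        best, _ = counts.most_common(1)[0]
        chosen.add(best)
        remaining = [case for case in remaining if best not in case]
    covered = sum(1 for case in violations if case & chosen)
    return covered / len(violations)

# Toy data: 10 confirmed violations, 3 hypothetical rules fire across them.
cases = [{"refund_over_limit"}] * 6 + [{"safety_flag"}] * 3 + [{"manual_comp"}]
print(coverage_at_k(cases, k=2))  # → 0.9 (two rules cover 9 of 10 cases)
```

Running this on the 500+ labeled Tier H cases from the Engineering/ML track would turn the P0 bet into a single measurable number per rule budget.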

Dependency notes

  • Assumption #2 is existential — validate first in Spike 2 (weeks 5–8)
  • Assumptions #1 and #3 validated together in pilot — shared cohort and timeline
  • Assumption #5 depends on Spike 1 (policy retrieval) and model routing implementation
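The cost math behind Assumption #5 can be sanity-checked with a toy blended-cost model. All model names, prices, tier labels, and call counts below are placeholder assumptions, not Grab's actual routing or vendor pricing:

```python
# Hypothetical per-call prices for a cheap and an expensive model (USD).
COST_PER_CALL = {"small": 0.002, "large": 0.020}

# Which model each risk tier uses, and expected LLM calls per case (assumed).
ROUTE = {
    "auto": ("small", 1),  # FAQ-adjacent cases: one cheap classification call
    "low":  ("small", 2),  # suggestion + post-resolution audit
    "high": ("large", 2),  # Tier H: deeper model for rationale + audit
}

def blended_cost_per_case(tier_mix: dict[str, float]) -> float:
    """Expected LLM cost per resolved case given a tier distribution (sums to 1)."""
    total = 0.0
    for tier, share in tier_mix.items():
        model, calls = ROUTE[tier]
        total += share * calls * COST_PER_CALL[model]
    return total

# If 70% of cases resolve on one cheap call, 25% take two, and 5% hit the
# large model, the blend stays under the $0.01 target ($4.4k at 1M tickets).
cost = blended_cost_per_case({"auto": 0.70, "low": 0.25, "high": 0.05})
print(f"${cost:.4f}")  # → $0.0044 per case
```

The sensitivity is almost entirely in the "high" share: the Spike 1 retrieval work matters because every case that avoids the large model cuts its marginal cost by roughly an order of magnitude under these assumed prices.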