Agent Autonomy Model

Autonomy is the default; intervention is selective and risk-driven

Agent trust is not a soft metric — it directly determines whether reason tags are honest, whether overrides are reported, and whether the learning loop gets clean signal. A deviation classification system will trigger surveillance fear if it is rolled out without deliberate change management.

Autonomy Bands

Three tiers of agent freedom — friction concentrated only where risk justifies it.

Band A
Trusted / Low-risk

Corrective overrides flow with minimal friction. Tier L actions require zero extra clicks.

~70% of all overrides
Band B
Standard

Medium-risk deviations require reason tags — structured dropdown adding ~5 seconds of handling time.

~20% of all overrides
Band C
Watchlist / High-risk

Targeted supervisor friction on risk-sensitive actions. Async approval — agent not blocked from other case work.

~10% of all overrides
Friction gradient: zero friction (Band A) → selective friction (Band C)
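
To make the routing concrete, here is a minimal TypeScript sketch of how an override might be mapped from risk tier to band and friction. The tier labels (L/M/H) come from this document; every identifier in the code is a hypothetical illustration, not an existing API.

```typescript
// Hypothetical routing sketch: map an override's risk tier to its
// autonomy band and the friction applied. Identifiers are illustrative.
type RiskTier = "L" | "M" | "H";
type Band = "A" | "B" | "C";
type Friction =
  | { kind: "none" }                            // Band A: zero extra clicks
  | { kind: "reason-tag" }                      // Band B: dropdown, ~5 seconds
  | { kind: "supervisor-review"; async: true }; // Band C: non-blocking approval

function routeOverride(tier: RiskTier): { band: Band; friction: Friction } {
  switch (tier) {
    case "L":
      return { band: "A", friction: { kind: "none" } };
    case "M":
      return { band: "B", friction: { kind: "reason-tag" } };
    case "H":
      return { band: "C", friction: { kind: "supervisor-review", async: true } };
  }
}
```

The discriminated union keeps the friction rules exhaustive: adding a new tier forces an explicit routing decision at compile time.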

Anti-Surveillance Safeguards

Five guardrails that prevent the system from becoming a surveillance tool.

Visible risk rationale

Show agents why a deviation was flagged — risk rationale visible in the copilot panel.

Structured appeal path

Inline appeal button for flagged decisions, logged for calibration review.

Coaching before consequences

Use signals for coaching and quality improvement before any punitive actions.

Agent data access

Each agent can access their own deviation records — right of access per the privacy framework.

No individual ranking

Statistical monitoring is visible to ops managers only and is not used for individual ranking without a separate HR process.
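
One way to see how the five safeguards constrain the data model is a sketch of the deviation record itself; all field names below are assumptions for illustration, not a defined schema.

```typescript
// Hypothetical deviation record shaped by the five safeguards above.
interface DeviationRecord {
  caseId: string;
  agentId: string;
  riskRationale: string;   // shown in the copilot panel, never hidden
  appeal?: {               // inline appeal, logged for calibration review
    submittedAt: Date;
    note: string;
  };
  agentAccessible: true;   // right of access per the privacy framework
  aggregateOnly: true;     // ops statistics only; no individual ranking
}
```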

Phased Adoption Model

Four phases that build trust before adding friction — shadow first, enforce last.

Phase 0: Shadow Mode

4 weeks before pilot
  • Copilot panel active with suggestions, no enforcement
  • Deviation classifier runs in background — logged but not shown
  • Collect baseline data and validate classifier accuracy

"New suggestion tool" — no mention of deviation tracking

Phase 1: Transparency Mode

First 2 weeks of pilot
  • Agents see the deviation status indicator (green/yellow/orange/red)
  • Reason tags available but optional (encouraged, not required)
  • Agents shown what would have been flagged, no friction applied

"We're testing a system that helps identify when our suggestions are wrong so we can improve them."

Phase 2: Soft Launch

Weeks 3–6 of pilot
  • Reason tags required for Tier M overrides (<5 seconds)
  • Tier H cases flagged for review — agent not blocked
  • Weekly feedback sessions with pilot agents (15 min)

"We're now using your input to catch risky situations earlier and to make sure good judgment gets recognized."

Phase 3: Active Mode

Weeks 7+ of pilot
  • Full selective friction: Tier H requires supervisor approval
  • Appeal path available for misclassified flags
  • Gamified reinforcement — badges for confirmed good judgment
  • Monthly agent advisory panel (5 rotating agents)

"The system is live. Most overrides flow through with no friction. When flagged, here's why and how to appeal."

Communication Principles

How we frame the system to agents — trust is earned through language and action.

Lead with agent benefit

"This system exists to recognize good judgment, not to police you."

Be honest about tracking

Agents will discover it anyway. Transparency builds trust; secrecy destroys it.

Show early wins

Within 2 weeks of active mode, publish corrective overrides that improved suggestions — with agent credit.

Never punitive without process

Coaching first, always. HR escalation only after repeated, confirmed violations post-coaching.

Coaching Model

Escalation ladder — coaching through 1:1 sessions, not automated warnings.

  • Corrective override accepted by reviewers → positive reinforcement (badge/acknowledgment)
  • Repeated cost/policy-risk deviations → targeted coaching conversation + Band C controls
  • Persistent quality-risk closures → manager intervention + QA escalation

All coaching is delivered through 1:1 sessions informed by deviation data — not automated warnings.
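
The ladder can be kept as data rather than logic, which makes it reviewable by the advisory panel; this sketch uses hypothetical signal and action names.

```typescript
// Escalation ladder as auditable data: nothing here auto-generates
// warnings. Signal and action names are illustrative assumptions.
const coachingLadder = [
  { signal: "corrective-accepted",       action: "positive-reinforcement" },
  { signal: "repeated-cost-policy-risk", action: "coaching-1on1-plus-band-c" },
  { signal: "persistent-quality-risk",   action: "manager-plus-qa-escalation" },
] as const;
```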

Feedback Mechanisms

Four channels ensuring agent voice shapes how the system evolves.

Weekly Pulse Survey
3 questions, <1 min

"Did the copilot help today? Was any flagging unfair? What should we fix?"

Monthly Focus Group
6–8 agents, 30 min

Deeper discussion on UX, trust, and suggestion quality.

Agent Advisory Panel
5 agents, rotating monthly

Reviews threshold changes, new rules, and gamification design before deployment.

Inline Feedback Button
On every suggestion

Thumbs up/down with optional comment — feeds suggestion quality metrics.
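
As a sketch, the inline feedback channel needs only a small event shape to feed suggestion-quality metrics; the field names here are assumptions.

```typescript
// Hypothetical inline feedback event, one per thumbs up/down click.
interface SuggestionFeedback {
  suggestionId: string;
  agentId: string;
  vote: "up" | "down"; // thumbs on every suggestion
  comment?: string;    // optional free text
  submittedAt: Date;
}
```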

Adoption Success Criteria

Five measurable gates that define pilot success.

  • >70% of agents using suggestions within 60 days
  • ≥3.5/5 trust score by end of pilot
  • >90% reason-tag completion for Tier M overrides
  • <5% appeal rate on Tier H flags
  • Zero attrition attributed to the copilot
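
The five gates translate directly into a single check against pilot metrics; this sketch assumes a hypothetical PilotMetrics shape, with fractions standing in for the percentage gates.

```typescript
// Sketch: the five adoption gates as one executable check.
// The metrics object and field names are illustrative assumptions.
interface PilotMetrics {
  suggestionAdoption60d: number; // fraction of agents using suggestions
  trustScore: number;            // 1-5 survey scale
  tierMTagCompletion: number;    // fraction of Tier M overrides tagged
  tierHAppealRate: number;       // fraction of Tier H flags appealed
  copilotAttrition: number;      // departures attributed to the copilot
}

function pilotSucceeded(m: PilotMetrics): boolean {
  return (
    m.suggestionAdoption60d > 0.70 &&
    m.trustScore >= 3.5 &&
    m.tierMTagCompletion > 0.90 &&
    m.tierHAppealRate < 0.05 &&
    m.copilotAttrition === 0
  );
}
```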