Beyond Usability: Measuring Speech Application Success

Silke Witt-Ehsani, PhD
VP, VUI Design Center
TuVox

S P E E C H W I T H I N R E A C H

Outline

• What is Success?
  • Success Criteria
  • Success Metrics
• Putting it all together:
  • A health check methodology
• Success vs Design
  • How they affect each other
• Case studies

Success Criteria: i.e. What is “success”?

Common criteria:
• Are callers transferred to the correct destination?
• How many callers are being helped?
• How do callers like my speech applications?
• What is the system recognition accuracy?

Different questions (Success Criteria) require different answers (Success Metrics)

How do we do that?

Success Metrics: Subjective vs Objective

Subjective
• Usability study
• Whole call recordings
• Individual caller feedback

Objective = Application Statistics
• Automation rates
• Containment rates
• Non-cooperative caller rate

Success Metrics: Business vs Technical

Business metrics for business users:
• Routing Accuracy
• Agent Transfers
• Customer Satisfaction

Technical users:
• need detailed application performance on dialog state level
  • grammar coverage
  • NoMatch, NoInput
• need ability to drill down

Business stakeholders care about the bottom-line impact of several application and speech events:
• More transfers out of the application = higher call center cost
• Higher routing accuracy = fewer agent-to-agent transfers

Common Business Metrics

Containment rate = “keep caller hostage in the system”

Automation rate = “offer complete functionality…”

Successful routing = “get the caller to the right expert”

Average call duration

And many, many more ….

Application Health Check - Business

The 3 main elements of a Business Health Check are:

1. Custom-defined success rate
2. Non-cooperative caller rate
3. Agent transfer rate
   • Transfer due to explicit caller request
   • Transfer due to errors (both speech and system)
   • Transfer by design (i.e. correctly routed calls)

Example Success Metric: Routing Accuracy

Definition: Confirmed routed calls (calls reaching an end destination) over all calls

Useful metric when using:
• Skills-based routing
• Routing application with N routing points

[Chart: % Routing Accuracy for applications with ~150, ~50, and 4 routing points; values shown: 68.3%, 77%, 85%]
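The definition above can be turned into a small calculation. This is a minimal sketch, not the author's tooling: the function name and the call counts are hypothetical, and it assumes you can already count "confirmed routed" calls from your own logs.

```python
def routing_accuracy(confirmed_routed_calls: int, total_calls: int) -> float:
    """Routing accuracy in percent: confirmed routed calls over all calls."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    if not 0 <= confirmed_routed_calls <= total_calls:
        raise ValueError("confirmed_routed_calls must be between 0 and total_calls")
    return 100.0 * confirmed_routed_calls / total_calls


# Hypothetical counts for illustration only (not taken from the slides).
accuracy = routing_accuracy(confirmed_routed_calls=770, total_calls=1000)
print(f"Routing accuracy: {accuracy:.1f}%")
```

Because the denominator is all calls, callers who hang up or never reach an end destination lower the figure, which is why the metric is best read next to the non-cooperative caller rate.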

Example: Non Co-operative Callers

Definition: The non-cooperative caller rate is the percentage of all callers who immediately hang up or request an agent but never interact with the application

Possible reasons:
• Degree of caller acceptance of system
• Non-application-related, such as wrong number, child crying etc.

Expected range: 5-10% of call volume

[Chart: % Non-cooperative Callers for an open-ended router and a directed-dialog technical support application; values shown: 6.3%, 8.6%]
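As an illustration of the definition and the expected range, here is a minimal sketch. The call-record fields (`turns`, `exit`) and the sample data are assumptions made up for this example; only the 5-10% range comes from the slide.

```python
EXPECTED_RANGE = (5.0, 10.0)  # expected % of call volume, per the slide


def is_non_cooperative(call: dict) -> bool:
    """A caller who never interacted and either hung up or asked for an agent."""
    return call["turns"] == 0 and call["exit"] in ("hangup", "agent_request")


def non_cooperative_rate(calls: list) -> float:
    """Percentage of all calls that are non-cooperative."""
    if not calls:
        raise ValueError("calls must not be empty")
    flagged = sum(1 for call in calls if is_non_cooperative(call))
    return 100.0 * flagged / len(calls)


def within_expected_range(rate: float, expected=EXPECTED_RANGE) -> bool:
    low, high = expected
    return low <= rate <= high


# Hypothetical call log: 'turns' counts caller inputs the application received.
sample_calls = (
    [{"turns": 0, "exit": "hangup"}] * 4
    + [{"turns": 0, "exit": "agent_request"}] * 3
    + [{"turns": 2, "exit": "agent_request"}] * 13
    + [{"turns": 5, "exit": "completed"}] * 80
)

rate = non_cooperative_rate(sample_calls)
print(f"Non-cooperative callers: {rate:.1f}% (in expected range: {within_expected_range(rate)})")
```

Callers who ask for an agent after at least one interaction are deliberately not counted here; they belong to the agent transfer metric instead.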

Example: Agent Transfers

Applications tend to have many different types of agent transfers. Main categories:
• Customer zeroing out
• Routing to an agent based on caller information is a “Designed Transfer”
• Routing due to some logic in the application is a “Necessary Transfer”

Agent transfers have an immediate impact on call center cost.

Definition: % agent transfers of all calls

[Chart: % Agent Requests, example from a telecommunications company; values shown: 45%, 4.7%]
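Since the transfer categories carry very different meanings, it can help to report them separately as well as in total. The sketch below assumes each call has already been labelled with one hypothetical category string (or `None` when no transfer occurred); the labels and counts are illustrative, not data from the slides.

```python
from collections import Counter


def agent_transfer_report(call_outcomes: list) -> dict:
    """Percent of all calls transferred, in total and per transfer category."""
    total_calls = len(call_outcomes)
    if total_calls == 0:
        raise ValueError("call_outcomes must not be empty")
    transfers = Counter(outcome for outcome in call_outcomes if outcome is not None)
    report = {"all_transfers": 100.0 * sum(transfers.values()) / total_calls}
    for category, count in transfers.items():
        report[category] = 100.0 * count / total_calls
    return report


# Hypothetical labelled outcomes: None means the call was not transferred.
outcomes = (
    ["zero_out"] * 8        # caller explicitly asked for an agent
    + ["designed"] * 20     # routed to an agent based on caller information
    + ["necessary"] * 12    # routed because of application logic
    + [None] * 60
)

for name, pct in agent_transfer_report(outcomes).items():
    print(f"{name:>14}: {pct:.1f}% of all calls")
```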

Baseline and Trending

Numbers are relative; they only have meaning in context.

When defining success metrics:

1. create a baseline
2. then compare to that.

Potential baselines:
• Previous IVR touch-tone application
• Go-live performance

[Chart: Customers finding speech easier or much easier than IVR, at Usability, Go-live, and Tuning 1; values shown: 52%, 66%, 76%]
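A minimal sketch of comparing later measurements against a baseline. It assumes the three chart values belong, in order, to the Usability, Go-live, and Tuning 1 stages; that pairing is inferred from the order of the extracted labels and may not match the original chart. The function name is hypothetical.

```python
def trend_against_baseline(measurements: list) -> list:
    """Return (stage, value, change vs. baseline in percentage points).

    The first measurement in the list is treated as the baseline.
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    _, baseline_value = measurements[0]
    return [(stage, value, value - baseline_value) for stage, value in measurements]


# Assumed pairing of chart values with stages (see the note above).
satisfaction = [("Usability", 52.0), ("Go-live", 66.0), ("Tuning 1", 76.0)]

for stage, value, delta in trend_against_baseline(satisfaction):
    print(f"{stage:<10} {value:5.1f}%  ({delta:+.1f} points vs. baseline)")
```

Reporting the change in percentage points keeps the comparison unambiguous; a relative change would need to be labelled as such.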

Application Health Check - Technical

Purpose of hotspot analysis:
• Identify areas where the application is performing sub-optimally

Hotspot analysis should be done for each dialog state.

Important: Hotspot analysis gives the “where” of issues, not the “why”!

Framework for Technical Health Check

TuVox Hotspot analysis = integrated view of:
• % Hang-up (%H)
• % Final NoInput (%NI)
• % Final NoMatch (%NM)
• % Transfer Requests (%TR)

State Exit Count = # of calls * (%H + %NI + %NM + %TR)

Rule of thumb: these numbers are a first-order approximation:
• Sort by highest state exit count
• Review one by one in context, e.g. a high hang-up rate may simply mean the state is a logical end point
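The state exit count formula and the "sort by highest" rule can be sketched as follows. The state names and percentages are hypothetical and chosen only to show the ranking; percentages are given as values between 0 and 100.

```python
def state_exit_count(calls: int, pct_hangup: float, pct_noinput: float,
                     pct_nomatch: float, pct_transfer: float) -> float:
    """State Exit Count = # of calls * (%H + %NI + %NM + %TR)."""
    total_exit_pct = pct_hangup + pct_noinput + pct_nomatch + pct_transfer
    return calls * total_exit_pct / 100.0


# Hypothetical dialog states: (calls, %H, %NI, %NM, %TR)
hypothetical_states = {
    "MainMenu": (12000, 4.0, 1.5, 3.0, 6.0),
    "GetAccountNumber": (7000, 2.0, 3.0, 5.0, 2.0),
    "ConfirmPayment": (3000, 1.0, 0.5, 1.0, 0.5),
}

# Sort by highest state exit count, then review each state in context.
ranked = sorted(
    hypothetical_states.items(),
    key=lambda item: state_exit_count(*item[1]),
    reverse=True,
)

for name, stats in ranked:
    print(f"{name:<18} exit count = {state_exit_count(*stats):.0f}")
```

Weighting the exit percentages by call volume is what makes a busy state with a modest exit rate rank above a rarely visited state with a poor one.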

Hotspot Analysis Example

Prompt ID | Prompt Text | # Hits | # 2nd No Match | # 2nd No Input | # User Hangup | Total Exit Number
STTransferTS#124 | Would you like to hear that website again | 8993 | 181 | 205 | 6091 | 6477
STTransferSS#11 | Please hold while I get someone who can help you. | 2894 | 0 | 0 | 2894 | 2894
NTGetQueue#302 | Please say yes or no. | 21573 | 180 | 0 | 143 | 323
NTDisBilling#9 | Which do you need help with a bill a service charge, a purchase or something else. | 2211 | 217 | 25 | 81 | 323
NTFinder#301 | Please say yes or no. | 3711 | 121 | 0 | 102 | 223
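The rows of the hotspot table can be checked and ranked with a few lines of code. This sketch uses the counts from the table as transcribed; the tuple layout and helper names are assumptions for illustration. In this table the total exit number equals second NoMatch plus second NoInput plus user hang-ups.

```python
# (prompt_id, hits, second_nomatch, second_noinput, user_hangup, total_exit)
rows = [
    ("STTransferTS#124", 8993, 181, 205, 6091, 6477),
    ("STTransferSS#11", 2894, 0, 0, 2894, 2894),
    ("NTGetQueue#302", 21573, 180, 0, 143, 323),
    ("NTDisBilling#9", 2211, 217, 25, 81, 323),
    ("NTFinder#301", 3711, 121, 0, 102, 223),
]


def exit_count(row: tuple) -> int:
    """Sum the three exit columns of one table row."""
    _, _, nomatch, noinput, hangup, _ = row
    return nomatch + noinput + hangup


# Consistency check: the reported totals match the sum of the exit columns.
for row in rows:
    assert exit_count(row) == row[5], f"total mismatch for {row[0]}"

# Rank by exit count and show exits as a share of hits for context.
ranked_rows = sorted(rows, key=exit_count, reverse=True)
for prompt_id, hits, *_ in ranked_rows:
    exits = exit_count(next(r for r in rows if r[0] == prompt_id))
    print(f"{prompt_id:<18} exits = {exits:>5}  ({100.0 * exits / hits:.1f}% of hits)")
```

The exit share is not part of the original table; it is added here because two states with the same exit count (323) see very different traffic, which is the kind of context the review step asks for.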

Success Criteria and Design

Success and Design are tightly linked

• Success determines the design
• Design influences success

[Diagram: "Success Metric" and "Design" linked in both directions, illustrated with a loan call flow. Flow elements: Authentication; Look up all loans for this caller; Does caller have a line of credit? (yes/no); Does caller have more than 1 loan? (yes/no); Caller selects from list of loans; Loan Menu: Balance, More loan details, Make loan payment]

Case Study 1: Airline application

Customer requirement: 64% Success

Success definition: “For 64% of the callers entering the application, their ticket reservation record has to be retrieved from the back-end.”

Design consequences:
• Ensure via prompting that callers have their record identifier number before entering the application
• Make it hard to get to an agent, i.e. multiple retries
• Explain what the record identifier is

[Chart: Look-up Success (axis 0% to 80%) at Go-live, Tuning 1, and Tuning 2]

Design tailored to success criteria but at the expense of ease of use and caller experience

Case Study 2: Travel Application

Hotspot analysis identified too many exits at a main menu.

• Observation: One menu option is much more common than the other 5 choices
• Old design: Menu with 6 options
• New design: Yes/no question followed by a menu

Impact on application performance:
• Turn failure rate decreased by 39%
• Opt-out rate to the call center decreased by 44%

[Chart: Turn Failure Rate and Opt-out Rate, menu-style vs question-style design (axis 0 to 25)]

Case Study 3: HighTech Routing Application

3 success criteria:
• Average call handling less than 30 secs
• High customer satisfaction
• 4 queues to route to, but many different call reasons

Influence of these criteria on the design:
• Only 1 reprompt instead of the standard 2 attempts
• No traditional error prompting à la ‘sorry I didn’t get that’
• Natural-language, open-ended prompting with a high-coverage grammar

Summary

Define Application Success Criteria

Based on that, define success metrics

Use trending and baseline to put data in context

Success Criteria and Design are highly interlinked, i.e. success criteria determine the design

The design influences how targeted success metrics can be met