building compassionate conversational systems

Building Compassionate and Personalized Conversational Systems - A point of view

Presenter: Rama Akkiraju, IBM Distinguished Engineer

Acknowledgements: To our entire team in Watson

5/31/17, Devoxx 2017

To build Compassionate and Personalized Conversational Systems, three core models are needed

• Need the ability to interact (2) naturally (3) with people (1)

1. People: Understand people at a deeper level
2. Interact: Understand styles of human interaction and optimize human-computer interaction
3. Naturally (Mediums): Understand and respond in various mediums in which interactions can occur

Input Types: Text, Speech, Gestures
Mediums: Computers, Mobile devices, Robots, Avatars

#1: People Modeling / User Modeling

User Modeling: Our framework

@Copyright IBM 2015

[Figure: user modeling framework. Components: Be, Feel, Think, Explore & Decide, Act, Context, Options. Groupings: Inner State, Environment, Outer State]

An individual takes action based on the combination of his/her unique being & environment

User Modeling: Our framework

• Act: Search, Preferences, Communications, Decisions, Commitments, Purchases
• Context: Life Style, Events, Sociological, Economic, Political, Technological
• Options: Price, Promotions, Products/Services, Place
• Feel: Perceptions, Emotions, Sensations, Attitudes, Influences, Sentiments
• Be: Personality, Needs, Values, Beliefs, Motives, Identity, Goals, Ambitions, Interests
• Think: Knowledge, Skills, Opinions, Cognitive Style
• Explore & Decide: Choices, Consequences

Session

Intent

Time

Use Personality Insights to engage with individuals at a personalized level

Source: https://www.army.mil/article/78562/Leaving_the_battlefield__Soldier_shares_story_of_PTSD

https://watson-pi-demo.mybluemix.net/

How to act on Personality traits? Traits -> Actions/Behaviors

Emotional Analysis helps build empathetic systems

https://sentiment-and-emotion.mybluemix.net/

Use Tone Analyzer to understand and fine-tune your message

http://tone-analyzer-demo.mybluemix.net

Personalizing shopping Experience with Personality Insights


Assessing Customer Satisfaction with Tones


Chapter 2: Human Interaction Patterns

Styles of Interactions

Natural Interactions among People

Verbal (expressive, aggressive, passive), Non-verbal (gestures, facial expressions, postures)

Dialog Act

• Dialog Act is a specialized Speech Act. It typically looks at patterns in dialogs.

• Statement • backchannel/acknowledge • Opinion • abandoned/uninterpretable
• agreement/accept • appreciation • yes-no-question • non-verbal • yes answers
• conventional-closing • wh-question • no answers • response • quotation
• Summarize/reformulate • affirmative • action-directive
• collaborative completion • repeat-phrase • open-question • rhetorical-questions • reject • other answers
• conventional-opening • or-clause • commits • self-talk • downplayer • apology
• thanking

Source: Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech, http://www.aclweb.org/anthology/J00-3003

Dialog Strategies

[Figure: dialog strategy diagram running from Start to End. Labels recovered from the figure:]

• Giving an extra • Acknowledging Need • Description of Need • Anger
• Acknowledging w/out encouraging • Refocus statements • Active Listening
• Possibility of mistake • Admitting mistake • Allowing venting • Apology • Smiles
• Arranging Follow-up • Need cannot be fulfilled on the spot
• Assurance of effort • Assurance of result • Mistake has been made
• Bonus buyoff • Broken record • Uncooperative customer
• Closing positively • Common Courtesy • Completing Follow-up • Contact Security
• Aggressiveness • Disengaging • Distraction • Frustration • Empathy statement
• Expediting • Expert Recommendation • Explain Reasoning or action
• Embarrassment • Face-Saving Out • Conflict • Finding Agreement Points
• Following up • Helpless • Offering Choice • Empowering • Preventive strike
• Privacy insurance • Privacy concern • Probing question • Pros and Cons
• Providing Alternatives • Providing Takeaway • Confusion • Providing Explanation
• Questioning instead of stating • Referral to supervisor • Referral to 3rd party
• Lost focus • Refocus • Inappropriate behavior • Setting Limits
• Critical • Neutral mode • Summarize the conversation • Silence • Thank-you
• Timeout • Use customer name • Verbal Softeners • When Question • You're right

Other labels in the figure: Action, Negative Emotion, Monologue, External, Giving, Emotions, General, States, Gratitude, Statement, Happiness

Work by IBM Haifa Research team Michal Shmueli-Scheuer, Jonathan Herzig, Guy Feigenblat, David Konopnicki


Understand various mediums in which Human-Computer interaction can occur

Mediums

Mediums of interaction: Ongoing work in Research

• Text
• Speech
  • Non-verbal clues: pauses, volume, intonation, pitch
• Video
  • Gestures, facial expressions, eye contact, posture, tone of voice, distance
• Other
  • ?

Different channels for Conversations

• Kiosks
• Bots
• Robots
• Virtual agents on mobile devices
• Virtual agents accessible on a computer
• Questions from a user modeling point of view:
  • Would the user's style of interaction with the system change based on devices/channels?
  • Would users' willingness to reveal information about themselves change depending on the channel/device?


ibmwatson.com

facebook.com/ibmwatson

@ibmwatson


Tone Analyzer in Customer Support Q&A Forum

Study #1: Clients' Q&A forum data was analyzed
• Confident responses are more likely to receive Kudos (r = 0.23)
• Tentative responses are less likely to receive Kudos (r = 0.27)
• We found that we can predict kudos received with 66% accuracy, which is better than random (50%)
• We applied multiple state-of-the-art classifiers, such as Naïve Bayes, SVM, and Random Forest, and did 10-fold cross-validation
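The 10-fold cross-validation mentioned above can be sketched as follows. This is a minimal illustration, not the study's code: the helper names (`k_fold_indices`, `majority_baseline`, `cross_validated_accuracy`) are hypothetical, the labels are made up, and a trivial majority-class predictor stands in for the Naïve Bayes, SVM, and Random Forest classifiers.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k keeps fold sizes within one sample of each other.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def majority_baseline(train_labels):
    """Hypothetical stand-in classifier: predict the most common training label."""
    return max(set(train_labels), key=train_labels.count)


def cross_validated_accuracy(labels, k=10):
    """Average accuracy over all held-out folds."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        guess = majority_baseline([labels[j] for j in train_idx])
        correct += sum(labels[j] == guess for j in test_idx)
    return correct / len(labels)


# Made-up labels: 1 = response received kudos, 0 = it did not.
toy_labels = [1, 0] * 50
print(cross_validated_accuracy(toy_labels))
```

Each sample is held out exactly once, so the reported accuracy is always measured on data the classifier did not see during training.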

Study #2: Twitter customer support forums (333 conversations: 240 satisfied, 93 not satisfied)
• Angrier customers are less likely to be satisfied after the conversation (r = -0.198)
• More disgusted customers are less likely to be satisfied after the conversation (r = -0.184)
• Agents who show higher emotional range are less likely to satisfy the customer (r = -0.186)
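The r values in both studies read as correlation coefficients. The slides do not say which coefficient was used; the sketch below assumes Pearson's r, which with a 0/1 outcome such as satisfied / not satisfied is the point-biserial correlation. The function name and the data are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up data: an anger tone score per conversation and a 0/1 satisfaction flag.
anger = [0.9, 0.7, 0.2, 0.1, 0.4, 0.8]
satisfied = [0, 0, 1, 1, 1, 0]
r = pearson_r(anger, satisfied)
print(round(r, 3))
```

A negative r, as in the toy data here, means that higher tone scores go with lower satisfaction.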

Personality Insights: Problem Setup

• Given at least 1,500 words of text authored by an individual, infer the personality, needs and values of that individual.
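A minimal sketch of checking the 1,500-word requirement before attempting inference. The whitespace tokenization and the function name are assumptions made for illustration; the service may count words differently.

```python
MIN_WORDS = 1500  # threshold stated in the problem setup


def has_enough_text(documents, min_words=MIN_WORDS):
    """Return (enough, total_words) for a list of texts by one author.

    Words are counted by splitting on whitespace, which is an assumption.
    """
    total = sum(len(doc.split()) for doc in documents)
    return total >= min_words, total


# Made-up posts by a single author.
posts = ["Loving the new phone so far", "Battery life could be better though"]
enough, total = has_enough_text(posts)
print(enough, total)
```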


Personality Insights Accuracy - Latest results

Trait Name          Mean Absolute Error (MAE)   Correlation
Agreeableness       0.0999                      0.2920
Conscientiousness   0.1174                      0.3259
Extraversion        0.1477                      0.2521
Neuroticism         0.1404                      0.4182
Openness            0.0862                      0.3650

• A Machine Learned model for predicting Personality Traits
• Uses Word2Vec features (Stanford GloVe pre-trained model)
• Ground truth collected includes 2,000 psychometric surveys
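For reference, Mean Absolute Error is the average absolute gap between predicted and survey-based trait scores. The sketch below assumes scores on a 0 to 1 scale (an assumption based on the size of the reported errors) and uses made-up numbers.

```python
def mean_absolute_error(predicted, actual):
    """Average of |prediction - ground truth| over all individuals."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Made-up trait scores for four individuals.
predicted_scores = [0.62, 0.48, 0.71, 0.35]  # model output
surveyed_scores = [0.70, 0.40, 0.65, 0.50]   # psychometric survey ground truth
mae = mean_absolute_error(predicted_scores, surveyed_scores)
print(round(mae, 4))
```

Lower MAE is better, while higher correlation is better, so the two columns in the table are read in opposite directions.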

How many words to infer Personality?

We reach 95% of the max accuracy with as few as 30 tweets.

[Figure: Trait Agreeableness, MAE vs. number of tweets. x-axis: Number of tweets used for testing (0 to 350); y-axis: MAE (0.09 to 0.13); series: Old Model, New Model]

Old Model: Linguistic Inquiry and Word Count (LIWC) based
New Model: Word2Vec based

Methodology

• Designing more fine-grained actionable dialogue acts:
  • Greeting: Opening, Closing
  • Statement: Give Info, Expressive (Pos/Neg), Complaint, Offer Help, Suggest Action, Promise, Sarcasm, Other
  • Request: Request Help, Request Info, Other
  • Question: Yes-No Question, Wh- Question, Open Question
  • Answer: Yes-Answer, No-Answer, Response-Ack, Other
  • Social Act: Thanks, Apology, Downplayer

Data Collection

• We gather annotations for 800 conversations (5,327 turns, ~6 turns/conversation on average, 4 different agent companies) using crowd workers.
• They are asked to select as many categories as required to fully characterize the intent of the tweet.
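Because annotators may pick several categories per tweet, each turn is naturally stored as a set of labels. A minimal sketch of turning such a set into a multi-hot vector follows. The label names match the style used in the deck's charts, but the label subset, its order, and the function name are assumptions.

```python
# Illustrative subset of dialogue act labels; the order is arbitrary.
LABELS = [
    "statement_info",
    "statement_complaint",
    "request_info",
    "question_yesno",
    "socialact_thanks",
]


def to_multi_hot(selected, label_order=LABELS):
    """Encode a set of selected labels as a 0/1 vector in a fixed label order."""
    unknown = set(selected) - set(label_order)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in selected else 0 for label in label_order]


# One annotated turn that both gives information and complains.
turn_annotation = {"statement_info", "statement_complaint"}
print(to_multi_hot(turn_annotation))  # [1, 1, 0, 0, 0]
```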

[Figure: Full Data Distribution (800 conversations, 5,327 turns); y-axis from 0 to 2,500]

Utterances are complex: A single label is not sufficient

[Figure: counts of co-occurring label pairs; x-axis from 0 to 500. Pairs shown:]

(statement_info, answer_other)
(statement_expressive_negative, statement_complaint)
(statement_info, statement_complaint)
(request_info, question_yesno)
(request_info, question_wh)
(request_info, question_open)
(statement_offer, request_info)
(statement_info, statement_expressive_negative)
(request_info, socialact_apology)
(statement_info, statement_suggestion)
(statement_suggestion, request_info)
(statement_info, socialact_thanks)
(statement_info, answer_yes)
(statement_info, request_info)
(question_yesno, socialact_apology)
(statement_info, question_yesno)

• We test the hypothesis that each turn may require more than one dialogue act label by finding the distribution of label overlap in our annotations
• We verify that labels frequently co-occur, so classification should assign an utterance multiple labels
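The label-overlap distribution described above can be computed by counting how often each pair of labels is assigned to the same turn. This is a minimal sketch over made-up annotations; the function name is hypothetical and the data is not the deck's.

```python
from collections import Counter
from itertools import combinations


def label_pair_counts(annotated_turns):
    """Count how often each unordered pair of labels appears on the same turn."""
    counts = Counter()
    for labels in annotated_turns:
        # Sorting makes (a, b) and (b, a) count as the same pair.
        for pair in combinations(sorted(set(labels)), 2):
            counts[pair] += 1
    return counts


# Made-up annotations: one set of labels per turn.
toy_turns = [
    {"statement_info", "statement_complaint"},
    {"request_info", "question_yesno"},
    {"statement_info", "statement_complaint", "statement_expressive_negative"},
    {"statement_info"},
]

pair_counts = label_pair_counts(toy_turns)
multi_label_share = sum(1 for t in toy_turns if len(t) > 1) / len(toy_turns)

for pair, n in pair_counts.most_common(3):
    print(pair, n)
print("share of turns with more than one label:", multi_label_share)
```

A high share of multi-label turns is what motivates treating the task as multi-label rather than single-label classification.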

Experimental Setup

• We develop a sequential SVM-HMM model on the data
• Labeling Modes:
  – Single label to a turn
  – Multiple labels to a turn
• SVM-HMM learning methods:
  – Standard (future-looking HMM)
  – Online (model predicts a single label at a time, and cannot use future turns)
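The difference between the standard and online modes can be illustrated with two decoders over the same scores. This sketch is not the SVM-HMM implementation used in the work. It assumes per-turn label scores (`emission`) and label-to-label scores (`transition`) are already learned, and all numbers are made up. Viterbi decoding stands in for the future-looking mode, and greedy left-to-right decoding for the online mode.

```python
def viterbi_decode(emission, transition, labels):
    """Standard mode: choose the best label sequence using the whole conversation."""
    # best[t][y] = best score of any label path that ends in label y at turn t
    best = [{y: emission[0][y] for y in labels}]
    back = []
    for t in range(1, len(emission)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transition[p][y])
            scores[y] = best[-1][prev] + transition[prev][y] + emission[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def online_decode(emission, transition, labels):
    """Online mode: commit to one label per turn and never revisit earlier turns."""
    path = [max(labels, key=lambda y: emission[0][y])]
    for t in range(1, len(emission)):
        prev = path[-1]
        path.append(max(labels, key=lambda y: transition[prev][y] + emission[t][y]))
    return path


labels = ["question", "answer"]
# Made-up scores: turn 0 is ambiguous, turn 1 is clearly an answer.
emission = [
    {"question": 1.0, "answer": 1.2},
    {"question": 0.0, "answer": 2.0},
]
transition = {
    "question": {"question": 0.0, "answer": 2.0},
    "answer": {"question": 0.5, "answer": 0.0},
}

print(viterbi_decode(emission, transition, labels))  # ['question', 'answer']
print(online_decode(emission, transition, labels))   # ['answer', 'answer']
```

In this toy case the future-looking decoder revises its reading of the first turn once it sees that an answer follows, while the online decoder has already committed.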

Features Used

• Textual: N-grams, Punctuation
• Temporal: Turn Number, Response Time
• Emotional: NRC Emotion (Anger, Sad, frustration, positive, etc.)
• Speaker: Second Person Indicators (you, your, etc.)
• Dialogue (Lexical): Greeting Opening/Closing Indicators, Yes-No Question Indicators, Wh-Question Indicators, Yes/No Answer Indicators, Thanking Indicators, Apology Indicators
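A minimal sketch of extracting a few of the listed feature types from a single turn. The word lists, feature names, and function name are illustrative assumptions rather than the lexicons actually used. The NRC emotion features are left out because they need the NRC lexicon, which has to be obtained separately.

```python
import re

# Small illustrative word lists (assumptions, not the original lexicons).
WH_WORDS = {"what", "when", "where", "which", "who", "why", "how"}
THANKS_WORDS = {"thanks", "thank", "thx"}
APOLOGY_WORDS = {"sorry", "apologize", "apologies"}
SECOND_PERSON = {"you", "your", "yours"}


def turn_features(text, turn_number, response_time_s):
    """Build a sparse feature dict for one conversation turn."""
    tokens = re.findall(r"[a-z']+", text.lower())
    token_set = set(tokens)
    features = {
        # Temporal features
        "turn_number": turn_number,
        "response_time_s": response_time_s,
        # Punctuation features
        "has_question_mark": "?" in text,
        "has_exclamation": "!" in text,
        # Speaker and dialogue (lexical) indicators
        "second_person": bool(token_set & SECOND_PERSON),
        "wh_question": bool(tokens) and tokens[0] in WH_WORDS,
        "thanking": bool(token_set & THANKS_WORDS),
        "apology": bool(token_set & APOLOGY_WORDS),
    }
    # Unigram and bigram counts as sparse n-gram features.
    for n in (1, 2):
        for i in range(len(tokens) - n + 1):
            key = "ngram=" + " ".join(tokens[i:i + n])
            features[key] = features.get(key, 0) + 1
    return features


feats = turn_features(
    "Why is your app still down? Sorry, but this is frustrating!", 3, 42.0
)
print(feats["wh_question"], feats["second_person"], feats["apology"])
```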

Class Division: 6, 8, and 10 (Easy & Hard) classes

6 Labels
1. Statement Informative
2. Request Information
3. Statement Complaint
4. Yes-No Question
5. Expressive Negative Statement
6. Other

8 Labels
1. Statement Informative
2. Request Information
3. Statement Complaint
4. Yes-No Question
5. Expressive Negative Statement
6. Statement Suggestion
7. General Answer
8. Other

10 Labels (Easy)
1. Statement Informative
2. Request Information
3. Statement Complaint
4. Yes-No Question
5. Expressive Negative Statement
6. Statement Suggestion
7. General Answer
8. Apology Social Act
9. Thanking Social Act
10. Other

10 Labels (Hard)
1. Statement Informative
2. Request Information
3. Statement Complaint
4. Yes-No Question
5. Expressive Negative Statement
6. Statement Suggestion
7. General Answer
8. Statement Offer
9. Open Question
10. Other

SVM-HMM Sequential Model outperforms non-sequential baselines

We expect a larger improvement by SVM-HMM with longer conversations (currently ~6 turns/conversation)

Agents are more predictable than customers

• Prediction results are better when using *only* agent turns… Agent acts are less varied
• Customers are more difficult, but prediction is still good

Conversation outcomes are strongly distinguishable using predicted dialogue acts

• Putting it all together: We ran outcome experiments using the full conversation as input, and our predicted dialogue act labels as features
• We balance the distribution of outcomes for each class:
  • Satisfied / not-satisfied (216 conversations/class)
  • Resolved / not-resolved (271 conversations/class)
  • Frustrated / not-frustrated (229 conversations/class)
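The slides do not say how the outcome classes were balanced. One common option, assumed here purely for illustration, is to randomly downsample the larger class to the size of the smaller one. The function name and the data are made up.

```python
import random
from collections import Counter, defaultdict


def balance_by_downsampling(items, label_of, seed=0):
    """Randomly keep the same number of items per class (the smallest class size)."""
    groups = defaultdict(list)
    for item in items:
        groups[label_of(item)].append(item)
    target = min(len(group) for group in groups.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(groups):
        balanced.extend(rng.sample(groups[label], target))
    rng.shuffle(balanced)
    return balanced


# Made-up conversations: (conversation id, outcome label).
toy_conversations = [(f"c{i}", "satisfied") for i in range(7)]
toy_conversations += [(f"c{i}", "not_satisfied") for i in range(7, 10)]

balanced = balance_by_downsampling(toy_conversations, label_of=lambda c: c[1])
print(Counter(label for _, label in balanced))
```

With balanced classes, a classifier that always predicts one outcome scores 50%, which gives a simple reference point when reading the outcome results.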

[Figure: LinSVC results for Satisfaction, Resolution, and Frustration, comparing the feature sets Dialogue, Ngrams+HC, and Ngrams+HC+Dialogue; y-axis from 0 to 0.8]

Observations:
• For satisfaction and resolution, dialogue act features are capturing all of the information in the n-grams, and they are also useful and explanatory
• Frustration greatly benefits from handcrafted features; it is less accurately tied to just dialogue features.
