building compassionate conversational systems
TRANSCRIPT
Building Compassionate and Personalized Conversational Systems: A point of view
Presenter: Rama Akkiraju, IBM Distinguished Engineer
Acknowledgements: To our entire team in Watson
5/31/17 Devoxx 2017
To build Compassionate and Personalized Conversational Systems, three core models are needed
1. People: Understand people at a deeper level
2. Interact: Understand styles of human interaction and optimize human-computer interaction
3. Naturally (Mediums): Understand and respond in various mediums in which interactions can occur
• Need the ability to interact (2) naturally (3) with people (1)
Input Types: Text, Speech, Gestures
Mediums: Computers, Mobile devices, Robots, Avatars
#1: People Modeling / User Modeling
User Modeling: Our framework
@Copyright IBM 2015
[Framework diagram. Elements: Be, Feel, Think, Context, Options, Explore & Decide, Act. Groupings: Inner State, Environment, Outer State.]
An individual takes action based on the combination of his/her unique being & environment
User Modeling: Our framework
Act: Search, Preferences, Communications, Decisions, Commitments, Purchases
Context: Life Style, Events, Sociological, Economic, Political, Technological
Options: Price, Promotions, Products/Services, Place
Feel: Perceptions, Emotions, Sensations, Attitudes, Influences, Sentiments
Be: Personality, Needs, Values, Beliefs, Motives, Identity, Goals, Ambitions, Interests
Think: Knowledge, Skills, Opinions, Cognitive Style
Explore & Decide: Choices, Consequences
Session, Intent, Time
Use Personality Insights to engage with individuals at a personalized level
Source: https://www.army.mil/article/78562/Leaving_the_battlefield__Soldier_shares_story_of_PTSD
https://watson-pi-demo.mybluemix.net/
How to act on Personality traits? Traits -> Actions/Behaviors
Emotional Analysis helps build empathetic systems
https://sentiment-and-emotion.mybluemix.net/
Use Tone Analyzer to understand and fine-tune your message
http://tone-analyzer-demo.mybluemix.net
Personalizing shopping Experience with Personality Insights
Assessing Customer Satisfaction with Tones
Chapter 2: Human Interaction Patterns
Styles of Interactions
Natural Interactions among People
Verbal (expressive, aggressive, passive), Non-verbal (gestures, facial expressions, postures)
Dialog Act
• A Dialog Act is a specialized Speech Act. It typically looks at patterns in dialogs.
• Statement • backchannel/acknowledge • Opinion • abandoned/uninterpretable
• agreement/accept • appreciation • yes-no-question • non-verbal • yes answers
• conventional-closing • wh-question • no answers • response • quotation
• Summarize/reformulate • affirmative • action-directive
• collaborative completion • repeat-phrase • open-question • rhetorical-questions • reject • other answers
• conventional-opening • or-clause • commits • self-talk • downplayer • apology • thanking
Source: Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech, http://www.aclweb.org/anthology/J00-3003
Dialog Strategies
[Diagram: graph of dialog strategies and customer states, running from a Start node to an End node. Node labels, in extraction order:]
Giving an extra, Acknowledging Need, Description of Need, Anger, Acknowledging w/out encouraging, Refocus statements, Active Listening, Possibility of mistake, Admitting mistake, Allowing venting, Apology, Smiles, Arranging Follow-up, Need cannot be fulfilled on the spot, Assurance of effort, Assurance of result, Mistake has been made, Bonus buyoff, Broken record, Uncooperative customer, Closing positively, Common Courtesy, Completing Follow-up, Contact Security, Aggressiveness, Disengaging, Distraction, Frustration, Empathy statement, Expediting, Expert Recommendation, Explain Reasoning or action, Embarrassment, Face-Saving Out, Conflict, Finding Agreement Points, Following up, Helpless, Offering Choice, Empowering, Preventive strike, Privacy insurance, Privacy concern, Probing question, Pros and Cons, Providing Alternatives, Providing Takeaway, Confusion, Providing Explanation, Questioning instead of stating, Referral to supervisor, Referral to 3rd party, Lost focus, Refocus, Inappropriate behavior, Setting Limits, Critical, Neutral mode, Summarize the conversation, Silence, Thank-you, Timeout, Use customer name, Verbal Softeners, When Question, You're right
Other labels: Action, Negative Emotion, Monologue, External, Giving, Emotions, General, States, Gratitude Statement, Happiness
Work by IBM Haifa Research team: Michal Shmueli-Scheuer, Jonathan Herzig, Guy Feigenblat, David Konopnicki
Understand various mediums in which Human-Computer interaction can occur
Mediums of interaction: Ongoing work in Research
• Text
• Speech
  • Non-verbal clues: pauses, volume, intonation, pitch
• Video
  • Gestures, facial expressions, eye contact, posture, tone of voice, distance
• Other
  • ?
Different channels for Conversations
• Kiosks
• Bots
• Robots
• Virtual agents on mobile devices
• Virtual agents accessible on a computer
• Questions from a user modeling point of view:
  • Would the user's style of interaction with the system change based on devices/channels?
  • Would users' willingness to reveal information about themselves change depending on the channel/device?
To build Compassionate and Personalized Conversational Systems, three core models are needed
1. People: Understand people at a deeper level
2. Interact: Understand styles of human interaction and optimize human-computer interaction
3. Naturally (Mediums): Understand and respond in various mediums in which interactions can occur
• Need the ability to interact (2) naturally (3) with people (1)
Input Types: Text, Speech, Gestures
Mediums: Computers, Mobile devices, Robots, Avatars
ibmwatson.com
facebook.com/ibmwatson
@ibmwatson
Tone Analyzer in Customer Support Q&A Forum
Study #1: Clients’ Q&A forum data was analyzed
• Confident responses are more likely to receive Kudos (r = 0.23)
• Tentative responses are less likely to receive Kudos (r = 0.27)
• We found that we can predict kudos received with 66% accuracy, which is better than random (50%)
• We applied multiple state-of-the-art classifiers such as Naïve Bayes, SVM, and Random Forest, and did 10-fold cross-validation
Study #2: Twitter customer support forums (333 conversations: 240 Sat, 93 not-Sat)
• More angry customers are less likely to be satisfied after the conversation (r = -0.198)
• More disgusted customers are less likely to be satisfied after the conversation (r = -0.184)
• Agents who show higher emotional range are less likely to satisfy the customer (r = -0.186)
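The r values above read as correlation coefficients between a tone score and an outcome. The slides do not say how they were computed. As a minimal sketch, assuming a plain Pearson correlation between a per-conversation tone score and a 0/1 satisfaction flag, the calculation could look like this (the data and variable names are hypothetical):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: one anger score per conversation, and whether the
# customer ended up satisfied (1) or not (0). Not taken from the study.
anger_scores = [0.9, 0.1, 0.7, 0.2, 0.8, 0.3]
satisfied = [0, 1, 0, 1, 1, 0]

r = pearson_r(anger_scores, satisfied)
print(f"r = {r:.3f}")
```

A negative r here means that higher anger scores tend to go with unsatisfied outcomes, which is the direction reported for Study #2.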
Personality Insights: Problem Setup
• Given at least 1,500 words of text authored by an individual, infer the personality, needs and values of that individual.
Personality Insights Accuracy: Latest results

Trait Name          Mean Absolute Error (MAE)   Correlation
Agreeableness       0.0999                      0.2920
Conscientiousness   0.1174                      0.3259
Extraversion        0.1477                      0.2521
Neuroticism         0.1404                      0.4182
Openness            0.0862                      0.3650

• A machine-learned model for predicting Personality Traits
• Uses Word2Vec features (Stanford GloVe pre-trained model)
• Ground truth collected includes 2,000 psychometric surveys
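For readers unfamiliar with the MAE column, here is a minimal sketch of the metric. It assumes trait scores are on a 0 to 1 scale and that each predicted score is compared with the score from that person's psychometric survey. The numbers are hypothetical and are not the study's data.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and reference scores."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical Agreeableness scores for four people (0 to 1 scale, assumed).
predicted_scores = [0.62, 0.48, 0.71, 0.55]   # inferred from text
survey_scores = [0.70, 0.40, 0.65, 0.60]      # from psychometric surveys

mae = mean_absolute_error(predicted_scores, survey_scores)
print(f"MAE = {mae:.4f}")
```

Lower MAE is better. The Correlation column is a separate measure of how well predicted and surveyed scores move together.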
How many words to infer Personality?
We reach 95% of the max accuracy with as few as 30 tweets.
[Chart: Trait Agreeableness, MAE vs. number of tweets. x-axis: number of tweets used for testing (0 to 350); y-axis: Mean Absolute Error (0.09 to 0.13); series: Old Model, New Model]
Old Model: Linguistic Inquiry and Word Count (LIWC) based
New Model: Word2Vec based
Methodology
• Designing more fine-grained, actionable dialogue acts:
Greeting: Opening, Closing
Statement: Give Info, Expressive (Pos/Neg), Complaint, Offer Help, Suggest Action, Promise, Sarcasm, Other
Request: Request Help, Request Info, Other
Question: Yes-No Question, Wh- Question, Open Question
Answer: Yes-Answer, No-Answer, Response-Ack, Other
Social Act: Thanks, Apology, Downplayer
Data Collection
• We gather annotations for 800 conversations (5,327 turns, ~6 turns/conversation on average, 4 different agent companies) using crowd workers.
• They are asked to select as many categories as required to fully characterize the intent of the tweet.
[Chart: Full Data Distribution (800 conversations, 5,327 turns); count axis from 0 to 2,500]
Utterances are complex: A single label is not sufficient
[Chart: frequency of co-occurring label pairs; count axis from 0 to 500. Label pairs:]
(statement_info, answer_other)
(statement_expressive_negative, statement_complaint)
(statement_info, statement_complaint)
(request_info, question_yesno)
(request_info, question_wh)
(request_info, question_open)
(statement_offer, request_info)
(statement_info, statement_expressive_negative)
(request_info, socialact_apology)
(statement_info, statement_suggestion)
(statement_suggestion, request_info)
(statement_info, socialact_thanks)
(statement_info, answer_yes)
(statement_info, request_info)
(question_yesno, socialact_apology)
(statement_info, question_yesno)
• We test the hypothesis that each turn may require more than one dialogue act label by finding the distribution of label overlap in our annotations
• We verify that labels frequently co-occur, so classification should assign an utterance multiple labels
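The label-overlap analysis described above can be sketched as a simple pair count. This is an illustration only: the annotations below are hypothetical, although the label names follow the ones shown in the chart.

```python
from collections import Counter
from itertools import combinations

def label_pair_counts(annotations):
    """Count how often each unordered pair of labels is assigned to the same turn."""
    counts = Counter()
    for labels in annotations:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for pair in combinations(sorted(set(labels)), 2):
            counts[pair] += 1
    return counts

# Hypothetical annotations: one list of dialogue act labels per turn.
turn_labels = [
    ["statement_info", "statement_complaint"],
    ["request_info", "question_yesno"],
    ["statement_info"],
    ["statement_complaint", "statement_info"],
    ["request_info", "question_yesno", "socialact_apology"],
]

pair_counts = label_pair_counts(turn_labels)
multi_label_share = sum(len(set(l)) > 1 for l in turn_labels) / len(turn_labels)

for pair, count in pair_counts.most_common(3):
    print(pair, count)
print("share of turns with more than one label:", multi_label_share)
```

A high share of multi-label turns is what motivates treating the task as multi-label classification.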
Experimental Setup
• We develop a sequential SVM-HMM model on the data
• Labeling Modes:
  – Single label to a turn
  – Multiple labels to a turn
• SVM-HMM learning methods:
  – Standard (future-looking HMM)
  – Online (model predicts a single label at a time, and cannot use future turns)
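The difference between the two modes can be illustrated with a generic sequence-labeling sketch. This is not the SVM-HMM implementation used in the study. It assumes we already have a per-turn score for each label (for example, from a classifier) and a score for each label-to-label transition, and it contrasts Viterbi decoding, which scores the whole conversation, with greedy left-to-right decoding, which never sees later turns. All scores and label names are hypothetical.

```python
def viterbi_decode(emissions, transitions, labels):
    """Best-scoring label sequence over the whole conversation (uses future turns)."""
    if not emissions:
        return []
    # best[t][y]: best score of any label path that ends in label y at turn t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t - 1][y]: best previous label when turn t is labeled y
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

def online_decode(emissions, transitions, labels):
    """Greedy decoding: each turn is labeled as it arrives, without later turns."""
    path = []
    for scores in emissions:
        if not path:
            choice = max(labels, key=lambda y: scores[y])
        else:
            prev = path[-1]
            choice = max(labels, key=lambda y: transitions[prev][y] + scores[y])
        path.append(choice)
    return path

labels = ["question", "answer"]
# Hypothetical scores: turn 0 is ambiguous, turn 1 is clearly an answer.
emissions = [
    {"question": 1.0, "answer": 1.2},
    {"question": 0.0, "answer": 2.0},
]
transitions = {
    "question": {"question": 0.0, "answer": 2.0},
    "answer": {"question": 0.5, "answer": 0.0},
}

standard_path = viterbi_decode(emissions, transitions, labels)
online_path = online_decode(emissions, transitions, labels)
print("standard:", standard_path)
print("online:  ", online_path)
```

In this toy case the two methods disagree on the first turn: the full-sequence decoder can revise its reading of turn 0 once it sees that turn 1 is an answer, while the online decoder has already committed.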
Features Used
Textual: N-grams, Punctuation
Temporal: Turn Number, Response Time
Emotional: NRC Emotion (Anger, Sad, Frustration, Positive, etc.)
Speaker: Second Person Indicators (you, your, etc.)
Dialogue (Lexical): Greeting Opening/Closing Indicators, Yes-No Question Indicators, Wh-Question Indicators, Yes/No Answer Indicators, Thanking Indicators, Apology Indicators
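A rough sketch of how several of these feature groups could be extracted for one turn is shown below. The word lists, feature names, and function signature are all assumptions made for illustration. The emotional (NRC lexicon) features are left out because they need the external lexicon.

```python
import re

# Hypothetical indicator word lists; the study's actual lists are not given.
SECOND_PERSON = {"you", "your", "yours", "you're"}
WH_WORDS = {"what", "when", "where", "which", "who", "why", "how"}
THANKS_WORDS = {"thanks", "thank", "thx"}
APOLOGY_WORDS = {"sorry", "apologize", "apologies"}

def extract_features(text, turn_number, response_time_seconds):
    """Build a feature dictionary for a single conversation turn."""
    tokens = re.findall(r"[a-z']+", text.lower())
    features = {
        # Temporal features
        "turn_number": turn_number,
        "response_time": response_time_seconds,
        # Punctuation features
        "num_question_marks": text.count("?"),
        "num_exclamations": text.count("!"),
        # Speaker and dialogue (lexical) indicators
        "has_second_person": any(t in SECOND_PERSON for t in tokens),
        "has_wh_word": any(t in WH_WORDS for t in tokens),
        "has_thanks": any(t in THANKS_WORDS for t in tokens),
        "has_apology": any(t in APOLOGY_WORDS for t in tokens),
    }
    # Textual features: unigram and bigram presence indicators.
    for n in (1, 2):
        for i in range(len(tokens) - n + 1):
            features["ngram=" + " ".join(tokens[i:i + n])] = 1
    return features

example = extract_features("Sorry about that! What is your order number?", 2, 45.0)
print(example["has_apology"], example["has_wh_word"], example["num_question_marks"])
```

A dictionary like this can be fed to any vectorizer that accepts sparse named features.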
Class Division: 6, 8, and 10 (Easy & Hard) classes

6 Labels
1. Statement Informative
2. Request Information
3. Statement Complaint
4. Yes-No Question
5. Expressive Negative Statement
6. Other

8 Labels
1. Statement Informative
2. Request Information
3. Statement Complaint
4. Yes-No Question
5. Expressive Negative Statement
6. Statement Suggestion
7. General Answer
8. Other

10 Labels (Easy)
1. Statement Informative
2. Request Information
3. Statement Complaint
4. Yes-No Question
5. Expressive Negative Statement
6. Statement Suggestion
7. General Answer
8. Apology Social Act
9. Thanking Social Act
10. Other

10 Labels (Hard)
1. Statement Informative
2. Request Information
3. Statement Complaint
4. Yes-No Question
5. Expressive Negative Statement
6. Statement Suggestion
7. General Answer
8. Statement Offer
9. Open Question
10. Other
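One way to derive a coarser scheme from the fine-grained annotations is to keep the chosen labels and fold everything else into "Other". The sketch below does this for the 6-label scheme. The mapping from slide names to identifiers such as `statement_info` is an assumption, as is the idea that the schemes were built this way.

```python
# Assumed identifiers for the five named classes in the 6-label scheme.
SIX_CLASS_LABELS = {
    "statement_info",
    "request_info",
    "statement_complaint",
    "question_yesno",
    "statement_expressive_negative",
}

def collapse_label(label, kept_labels):
    """Keep a label if it is part of the scheme, otherwise map it to 'other'."""
    return label if label in kept_labels else "other"

# Hypothetical fine-grained labels for three turns.
fine_grained = ["statement_info", "socialact_thanks", "question_wh"]
collapsed = [collapse_label(label, SIX_CLASS_LABELS) for label in fine_grained]
print(collapsed)
```

The 8-label and 10-label schemes would use the same function with a larger kept set.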
SVM-HMM Sequential Model outperforms non-sequential baselines
We expect a larger improvement by SVM-HMM with longer conversations (currently ~6 turns/conversation)
Agents are more predictable than customers
Prediction results are better when using *only* agent turns… Agent acts are less varied
Customers are more difficult, but prediction is still good
Conversation outcomes are strongly distinguishable using predicted dialogue acts
• Putting it all together: We ran outcome experiments using the full conversation as input, and our predicted dialogue act labels as features
• We balance the distribution of outcomes for each class:
  • Satisfied / not-satisfied (216 conversations/class)
  • Resolved / not-resolved (271 conversations/class)
  • Frustrated / not-frustrated (229 conversations/class)
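The slides do not say how the classes were balanced. One common approach is to downsample the larger class to the size of the smaller one, sketched here with hypothetical data and a hypothetical helper name:

```python
import random

def balance_by_downsampling(examples, label_of, seed=0):
    """Randomly downsample every class to the size of the smallest class."""
    if not examples:
        return []
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_label = {}
    for example in examples:
        by_label.setdefault(label_of(example), []).append(example)
    target = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced

# Hypothetical conversations: (conversation id, outcome label).
conversations = [(i, "satisfied") for i in range(10)]
conversations += [(i, "not_satisfied") for i in range(10, 14)]

balanced = balance_by_downsampling(conversations, label_of=lambda c: c[1])
counts = {}
for _, outcome in balanced:
    counts[outcome] = counts.get(outcome, 0) + 1
print(counts)
```

With balanced classes, a 50% accuracy baseline corresponds to random guessing, which makes the outcome results easier to interpret.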
[Chart: LinSVC results for Satisfaction, Resolution, and Frustration; value axis from 0 to 0.8; feature sets compared: Dialogue, Ngrams+HC, Ngrams+HC+Dialogue]
Observations:
• For satisfaction and resolution, dialogue act features are capturing all of the information in the n-grams, and they are also useful and explanatory
• Frustration greatly benefits from handcrafted features; it is less accurately tied to just dialogue features.