Dialogue Acts Bin Zhang February 1, 2011

Page 1: Dialogue Acts - University of Washingtonssli.ee.washington.edu/courses/ee517/discussTalks/BZtalk...Outline A. Stolcke et al. 2000, Dialogue act modeling for automatic tagging and recognition

Dialogue Acts

Bin Zhang

February 1, 2011

Outline

A. Stolcke et al. 2000, Dialogue act modeling for automatic tagging and recognition of conversational speech

D. Davidov et al. 2010, Semi-supervised recognition of sarcastic sentences in Twitter and Amazon

Dialogue Act

- It is a specialized speech act
- Typical dialogue acts
  - Statement
  - Question
  - Backchannel
  - Agreement
  - Disagreement
  - Apology

Examples of Dialogue Acts

Dialogue Act Labeling

- Unlabeled dialogue data from the Switchboard corpus
  - Contains conversational telephone speech
  - Speech is segmented into utterances
  - Each utterance is assigned a dialogue act label
- Tag set
  - Dialogue act markup in several layers (DAMSL)
  - SWBD-DAMSL tag set: 50 tags originally, reduced to 42 tags for better annotator consistency
- 1115 conversations (205,000 utterances) annotated by 8 linguistics graduate students in 3 months, κ = 0.80 (excellent agreement)
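
The agreement statistic κ corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of Cohen's kappa for two annotators. The slide does not say which kappa variant was used for the eight labelers, so the two-annotator form and the toy labels below are assumptions made only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same utterances."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of utterances with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two annotators' label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[t] * freq_b[t] for t in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations, not taken from Switchboard.
annotator_1 = ["statement", "question", "backchannel", "statement"]
annotator_2 = ["statement", "question", "statement", "statement"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here the raw agreement is 3/4, but the chance-corrected value is lower because both annotators use "statement" often.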

Common Dialogue Act Types

Statements: descriptive, narrative, or personal statements

Opinions: often include such hedges as "I think", "I believe", "it seems", and "I mean"

Questions: yes-no questions, declarative questions, WH questions

Backchannels: short utterances that play discourse-structuring roles, e.g., indicating that the speaker should go on talking

Abandoned utterances: those that the speaker breaks off without finishing, and are followed by a restart

Turn exits: similar to abandoned utterances, but with speaker change

Answers: yes answers and no answers

Agreements: mark the degree to which a speaker accepts some previous proposal, plan, opinion, or statement

Sequence Modeling of Dialogue Acts

- A conversation can be considered as a sequence of utterances
- Dialogue acts of the utterances in a conversation are interdependent
- Denote U as the dialogue act sequence of the conversation, and E as the evidence about the conversation (audio and/or text)

$$U^* = \arg\max_U P(U \mid E)$$

- Using Bayes' rule

$$U^* = \arg\max_U P(E \mid U) P(U)$$

P(E | U): dialogue act likelihood
P(U): discourse grammar
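
The Bayes' rule step can be spelled out in one line. The denominator P(E) does not depend on U, so it can be dropped inside the argmax:

```latex
U^* = \arg\max_U P(U \mid E)
    = \arg\max_U \frac{P(E \mid U)\, P(U)}{P(E)}
    = \arg\max_U P(E \mid U)\, P(U)
```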

Dialogue Act HMM

- Markov assumption

$$P(U_i \mid U_1, U_2, \ldots, U_{i-1}) = P(U_i \mid U_{i-k}, \ldots, U_{i-1})$$

- Independence assumption

$$P(E \mid U) = \prod_i P(E_i \mid U_i)$$
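
Under these two assumptions the best dialogue act sequence can be found with standard Viterbi decoding. The sketch below assumes a first-order model (k = 1) and takes all probabilities as precomputed log values. The function name, argument layout, and toy numbers are hypothetical and are not taken from the paper.

```python
import math


def viterbi(tags, log_start, log_trans, log_likes):
    """Return the most likely dialogue act sequence under a first-order HMM.

    tags:      list of dialogue act labels
    log_start: dict tag -> log P(U_1)
    log_trans: dict (prev_tag, tag) -> log P(U_i | U_{i-1})
    log_likes: one dict per utterance, tag -> log P(E_i | U_i)
    """
    if not log_likes:
        return []
    # best[i][t]: best log score of any path that ends in tag t at utterance i
    best = [{t: log_start[t] + log_likes[0][t] for t in tags}]
    back = []
    for like in log_likes[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + log_trans[(p, t)])
            scores[t] = best[-1][prev] + log_trans[(prev, t)] + like[t]
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical two-tag example.
tags = ["question", "answer"]
log_start = {"question": math.log(0.6), "answer": math.log(0.4)}
log_trans = {
    ("question", "question"): math.log(0.2),
    ("question", "answer"): math.log(0.8),
    ("answer", "question"): math.log(0.5),
    ("answer", "answer"): math.log(0.5),
}
log_likes = [
    {"question": math.log(0.7), "answer": math.log(0.3)},
    {"question": math.log(0.5), "answer": math.log(0.5)},
]
path = viterbi(tags, log_start, log_trans, log_likes)
```

The second utterance is acoustically ambiguous in this toy example, so the discourse grammar (a question is usually followed by an answer) decides its label.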

Discourse Grammar

- By the nature of dialogue, the dialogue acts of the utterances in a dialogue are highly related
- Back-off dialogue act n-gram models serve as a simple and efficient discourse model
- If the speaker is known, speaker-dependent dialogue act n-gram models are better
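
A dialogue act bigram model with a very simplified back-off can be sketched as follows. This is an illustration only: it uses no discounting, so the scores are not a properly normalized distribution, and the back-off weight `alpha=0.4` is an arbitrary assumption rather than a value from the paper.

```python
from collections import Counter


def train_bigram(sequences):
    """Count dialogue act unigrams and bigrams over labeled conversations."""
    unigrams, bigrams = Counter(), Counter()
    for seq in sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))
    return unigrams, bigrams


def backoff_score(prev, tag, unigrams, bigrams, alpha=0.4):
    """Score tag given prev: bigram relative frequency if the bigram was seen,
    otherwise a down-weighted unigram relative frequency."""
    if bigrams[(prev, tag)] > 0:
        context_count = sum(c for (p, _), c in bigrams.items() if p == prev)
        return bigrams[(prev, tag)] / context_count
    return alpha * unigrams[tag] / sum(unigrams.values())


# Hypothetical labeled conversations.
conversations = [
    ["question", "answer", "statement", "backchannel"],
    ["question", "answer", "statement"],
]
unigrams, bigrams = train_bigram(conversations)
seen = backoff_score("question", "answer", unigrams, bigrams)
unseen = backoff_score("question", "statement", unigrams, bigrams)
```

One way to make such a model speaker-dependent, again as an assumption, is to pair each tag with a speaker label, for example `("A", "question")`, before counting.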

Combining Evidence

- Words (W)
- ASR acoustics (A)
- Prosodic features (F): pitch, duration, energy, etc., of the speech signal

$$P(W, A, F \mid U)$$

Experimental Results

- Sarcasm: the activity of saying or writing the opposite of what you mean, or of speaking in a way intended to make someone else feel stupid or show them that you are angry

Example of Twitter

Example of Amazon

Sarcastic Sentences in Two Genres

- In Twitter messages, sarcastic sentences appear in a wide range of contexts. Some sarcastic sentences are marked #sarcasm by the user
- In Amazon product reviews, sarcastic sentences usually occur in negative reviews

Examples of Sarcasm

- thank you Janet Jackson for yet another year of Super Bowl classic rock! (Twitter)
- He's with his other woman: XBox 360. It's 4:30 fool. Sure I can sleep through the gunfire (Twitter)
- Wow GPRS data speeds are blazing fast (Twitter)
- [I] Love The Cover (book, Amazon)
- Defective by design (music player, Amazon)

The Semi-supervised Approach

- Data processing
- Pattern extraction
- Pattern selection
- Data enrichment
- Classification

Data Processing

- For Twitter: specific user names, URLs, and hashtags are replaced by the tokens [USER], [URL], and [HASHTAG], respectively
- For Amazon: product, author, company, and book names are replaced by the tokens [PRODUCT], [AUTHOR], [COMPANY], and [TITLE], respectively
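
The Twitter replacement step can be sketched with regular expressions. The patterns below are simple assumptions about what counts as a user mention, URL, or hashtag, and are not the authors' actual rules. The Amazon replacements would additionally need product metadata and are not shown.

```python
import re

# Illustrative patterns; real tweets may need more careful handling.
_URL = re.compile(r"https?://\S+")
_USER = re.compile(r"@\w+")
_HASHTAG = re.compile(r"#\w+")


def normalize_tweet(text):
    """Replace URLs, user mentions, and hashtags with generic tokens."""
    # URLs go first so that '#' or '@' inside a URL is not rewritten separately.
    text = _URL.sub("[URL]", text)
    text = _USER.sub("[USER]", text)
    text = _HASHTAG.sub("[HASHTAG]", text)
    return text


normalized = normalize_tweet("@bob GPRS is blazing fast #sarcasm http://t.co/abc")
# normalized == "[USER] GPRS is blazing fast [HASHTAG] [URL]"
```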

Pattern Extraction

- Construct pattern templates
  - Words are classified into high-frequency words and content words (less frequent)
  - Punctuation marks are treated as high-frequency words
  - Templates combining both word classes are created from prior knowledge
- Instantiation
  - Templates are instantiated from the data, with high-frequency words replaced by actual words
  - Hundreds of patterns are collected
  - During matching, partial matching is allowed
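
The extraction idea can be sketched as follows, under several assumptions that are not from the slides: words count as high-frequency when their relative corpus frequency reaches a threshold, patterns are contiguous token windows of a fixed length range, and content words are replaced by a generic `CW` slot. The function names and the threshold value are hypothetical.

```python
import string
from collections import Counter


def classify_words(sentences, hfw_threshold):
    """Return the set of high-frequency words (HFWs).

    Punctuation tokens are always treated as HFWs; every other token that is
    not returned here is treated as a content word.
    """
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    return {
        w for w, c in counts.items()
        if c / total >= hfw_threshold
        or all(ch in string.punctuation for ch in w)
    }


def extract_patterns(sentence, hfws, min_len=2, max_len=4):
    """Collect windows that keep HFWs and replace content words with 'CW'."""
    generalized = [tok if tok in hfws else "CW" for tok in sentence]
    patterns = set()
    for n in range(min_len, max_len + 1):
        for i in range(len(generalized) - n + 1):
            window = tuple(generalized[i:i + n])
            has_hfw = any(tok != "CW" for tok in window)
            # Keep only windows that mix at least one HFW with one CW slot.
            if has_hfw and "CW" in window:
                patterns.add(window)
    return patterns


# Hypothetical tokenized sentences.
corpus = [
    ["i", "love", "the", "cover", "."],
    ["i", "love", "the", "battery", "."],
    ["the", "screen", "is", "great", "."],
]
hfws = classify_words(corpus, hfw_threshold=0.1)
patterns = extract_patterns(corpus[0], hfws)
```

Partial matching at classification time is not covered by this sketch.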

Pattern Selection

- Remove patterns that originate from a single book/product (Amazon)
- Remove patterns that occur in both classes (sarcastic and non-sarcastic)
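
Both filters can be expressed as one set comprehension. The data layout below (two dictionaries keyed by pattern) and the label names are assumptions made for this sketch.

```python
def select_patterns(pattern_products, pattern_labels):
    """Keep patterns seen for more than one product and in exactly one class.

    pattern_products: dict pattern -> set of product ids whose reviews contain it
    pattern_labels:   dict pattern -> set of class labels of sentences containing it
    """
    return {
        p for p in pattern_products
        if len(pattern_products[p]) > 1
        and len(pattern_labels.get(p, set())) == 1
    }


# Hypothetical bookkeeping for three patterns.
pattern_products = {
    ("the", "CW", "."): {"book1", "player7"},
    ("CW", "by", "CW"): {"player7"},
    ("i", "love", "CW"): {"book1", "book2"},
}
pattern_labels = {
    ("the", "CW", "."): {"sarcastic", "not_sarcastic"},
    ("CW", "by", "CW"): {"sarcastic"},
    ("i", "love", "CW"): {"sarcastic"},
}
selected = select_patterns(pattern_products, pattern_labels)
```

In this toy data the first pattern is dropped because it occurs in both classes, and the second because it comes from a single product.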

Data Enrichment

- Used to obtain more labeled data
- Motivation: sarcastic sentences usually co-occur
- Use labeled sentences as queries to search for more sentences on the web
- Found sentences are assigned a label similar to that of the query

Classification

- Feature vectors are composed of:
  - Match score of each pattern in the sentence
  - Additional punctuation-based features
- A k-nearest neighbor classifier is used
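
A minimal k-nearest neighbor sketch over such feature vectors is shown below. It assumes Euclidean distance and a plain majority vote with k = 3. These choices, the feature values, and the function name are illustrative assumptions and may differ from the paper's exact scheme.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training vectors."""
    neighbors = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical vectors: two pattern match scores plus one punctuation feature.
train_vectors = [
    [1.0, 0.9, 1.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
train_labels = [1, 1, 0, 0, 0]  # 1 = sarcastic, 0 = not sarcastic

sarcastic_guess = knn_predict(train_vectors, train_labels, [0.8, 0.9, 1.0])
plain_guess = knn_predict(train_vectors, train_labels, [0.0, 0.1, 0.2])
```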

Data Annotation

- Sentences are annotated using five labels (1, 2, 3, 4, 5), 1 being not sarcastic and 5 being clearly sarcastic. In classification, 1 and 2 are negative labels; 3, 4, and 5 are positive labels
- The Twitter #sarcasm hashtag is found to be biased and noisy
- Inter-annotator agreement (fair agreement)
  - Twitter: κ = 0.41
  - Amazon: κ = 0.34
- Amazon training data

            Positive  Negative
  Seed            80       505
  Enriched       471      5020

- Evaluation data: for each genre, 90 positive + 90 negative sentences

Experimental Results

- Baseline (star-sentiment, Amazon only): low-rating reviews with strong positive sentiment

Improving Speech Recognition using Dialogue Acts

- Word recognition (with evidence)

$$W_i^* = \arg\max_{W_i} P(W_i \mid A_i, E)$$

- Dialogue acts can be considered as a factor that the acoustic and language models depend on
- When the dialogue acts are unknown, the models are mixtures of dialogue act-dependent models
- Mixture-of-posteriors

$$P(W_i \mid A_i, E) = \sum_{U_i} \frac{P(W_i \mid U_i) P(A_i \mid W_i)}{P(A_i \mid U_i)} P(U_i \mid E)$$

- Mixture-of-LMs

$$P(W_i \mid A_i, E) \approx \sum_{U_i} P(W_i \mid U_i) P(U_i \mid E) \frac{P(A_i \mid W_i)}{P(A_i)}$$
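
The mixture-of-LMs formula can be illustrated with a small rescoring sketch. The dialogue act-specific language model probabilities are mixed using the dialogue act posteriors, then combined with the acoustic likelihood. P(A_i) is the same for every hypothesis and is dropped. All names and numbers are hypothetical.

```python
def mixture_lm_prob(word_probs_by_act, act_posteriors):
    """Sum over dialogue acts U of P(W | U) * P(U | E)."""
    return sum(word_probs_by_act[act] * p for act, p in act_posteriors.items())


def rescore(hypotheses, act_posteriors):
    """Pick the hypothesis maximizing mixture LM probability times P(A | W).

    hypotheses: list of (words, acoustic_likelihood, {act: P(words | act)})
    """
    best = max(
        hypotheses,
        key=lambda h: mixture_lm_prob(h[2], act_posteriors) * h[1],
    )
    return best[0]


# Hypothetical posteriors P(U | E) and two competing recognition hypotheses.
act_posteriors = {"backchannel": 0.7, "statement": 0.3}
hypotheses = [
    ("yeah", 0.4, {"backchannel": 0.5, "statement": 0.01}),
    ("year", 0.5, {"backchannel": 0.001, "statement": 0.02}),
]
lm_yeah = mixture_lm_prob(hypotheses[0][2], act_posteriors)
winner = rescore(hypotheses, act_posteriors)
```

In this toy case "year" has the higher acoustic likelihood, but the strong backchannel posterior shifts the decision to "yeah".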

Experimental Results