Universität Bielefeld
June 27, 2014
An Impact Analysis of Features in a Classification Approach to Irony Detection in Product Reviews
Konstantin Buschmeier, Philipp Cimiano, Roman Klinger
Semantic Computing Group, CIT-EC, Bielefeld University
Slides are available at http://www.roman-klinger.de/talks/irony.pdf
Outline
1 Introduction
2 Method
3 Experiments
4 Summary
Buschmeier, Cimiano, Klinger 2 / 38
Introduction: What is Irony?
Merriam Webster Dictionary, 2014 (excerpt)
“the use of words that mean the opposite of what you really think especially in order to be funny” (verbal irony)
“the use of words to express something other than and especially the opposite of the literal meaning”
“a situation that is strange or funny because things happen in a way that seems to be the opposite of what you expected” (situational irony)
“incongruity between the actual result of a sequence of events and the normal or expected result”
Introduction: What is Irony? – Examples (1)
“Thanks that you took care of the dirty dishes.”
Introduction: What is Irony? – Examples (2)
[Scene from Breaking Bad.]
“He might be upset.”
Introduction: What is sarcasm?
Merriam Webster Dictionary, 2014 (excerpt)
“a sharp and often satirical or ironic utterance designed to cut or give pain”
Introduction: Irony markers and factors
S. Attardo (2000). “Irony Markers and Functions: Towards a Goal-oriented Theory of Irony and its Processing”. In: Rask: Internationalt Tidsskrift for Sprog og Kommunikation
Irony factors
⇒ . . . are essential for irony to happen
Irony markers
⇒ . . . mark the occurrence of irony
Irony can happen without markers!
Introduction: Irony in product reviews (1)
From a review for a movie
“Read the book!”
From a review for a book
“i would recomend this book to friends who have insomnia or those who i absolutely despise.”
Ironic Environment
A. Utsumi (2000). “Verbal irony as implicit display of ironic environment: Distinguishing ironic utterances from nonirony”. In: Journal of Pragmatics
Introduction: Examples (2)
“. . . Pros: Fits my girthy frame, has wolves on it, attracts women
Cons: Only 3 wolves [. . . ], cannot see wolves when sitting with arms crossed, wolves would have been better if they glowed in the dark.”
Introduction: Why detect Irony?
Error reduction by sarcasm detection in polarity detection of tweets
D. Maynard et al. (2014). “Who cares about Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis.” In: LREC
Supports understanding of irony in language
It is fun.
Introduction: Previous Work – Definitions of Irony
A. Utsumi (2000). “Verbal irony as implicit display of ironic environment: Distinguishing ironic utterances from nonirony”. In: Journal of Pragmatics
D. Wilson et al. (2012). “Explaining Irony”. In: Meaning and Relevance
H. H. Clark et al. (1984). “On the pretense theory of irony.” In: Journal of Experimental Psychology: General
S. Kumon-Nakamura et al. (1995). “How About Another Piece of Pie: The Allusional Pretense Theory of Discourse Irony”. In: Journal of Experimental Psychology: General
Introduction: Previous Work – Automatically Detecting Irony (excerpt)
Feature impact analysis in Twitter
F. Barbieri et al. (2014). “Modelling Irony in Twitter: Feature Analysis and Evaluation”. In: LREC
A. Reyes et al. (2011). “Mining subjective knowledge from customer reviews: a specific case of irony detection”. In: WASSA@ACL
R. Gonzalez-Ibanez et al. (2011). “Identifying sarcasm in Twitter: a closer look”. In: ACL-HLT
Google book search for specific phrases, automated classification
M. L. Dress et al. (2008). “Regional Variation in the Use of Sarcasm”. In: Journal of Language and Social Psychology
Portuguese newspaper comments, specific features
P. Carvalho et al. (2009). “Clues for detecting irony in user-generated contents: oh. . . !! it’s “so easy” ;-)”. In: TSA@CIKM
Amazon review sentences, KNN, rich feature set
O. Tsur et al. (2010). “ICWSM – A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews.” In: ICWSM
Introduction: Data Resource
Amazon corpus published
E. Filatova (2012). “Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing”. In: LREC
Amazon Mechanical Turk annotation of corpus
1st step: Selection of an ironic and a regular review for a product each, submission of review ID
2nd step: Validation of annotation by 5 additional turkers, kept in corpus when majority agreed
Additional information was extracted but not taken into account in this work
437 ironic, 817 regular reviews, 1254 altogether
sarcasm ..= verbal irony
Method: Workflow
Supervised classification problem
Each review categorized as ironic or non-ironic
Corpus by Filatova (2012) used
Classifiers taken into account: Naïve Bayes, support vector machine (with linear kernel), logistic regression, decision tree, random forest
As implemented in the Python library scikit-learn
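A minimal sketch of how the five classifier families named on this slide could be set up in scikit-learn and compared by cross-validated F1. The Naive Bayes variant (MultinomialNB), the binary bag-of-words input, all hyperparameters, and the toy reviews are assumptions for illustration; the slides do not specify them.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier


def build_classifiers(seed=0):
    """One estimator per classifier family (hyperparameters are assumptions)."""
    return {
        "naive_bayes": MultinomialNB(),
        "svm_linear": LinearSVC(random_state=seed),
        "logistic_regression": LogisticRegression(max_iter=1000, random_state=seed),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=seed),
    }


def evaluate(texts, labels, folds=10, seed=0):
    """Mean cross-validated F1 of the ironic class (label 1) per classifier."""
    scores = {}
    for name, classifier in build_classifiers(seed).items():
        # Binary word-occurrence features stand in for the real feature set.
        pipeline = make_pipeline(CountVectorizer(binary=True), classifier)
        fold_f1 = cross_val_score(pipeline, texts, labels, cv=folds, scoring="f1")
        scores[name] = float(fold_f1.mean())
    return scores


# Toy data only; the real corpus has 437 ironic and 817 regular reviews.
toy_texts = ["Avoid that TV show. Highly addictive.", "Works well, fast delivery."] * 10
toy_labels = [1, 0] * 10
toy_scores = evaluate(toy_texts, toy_labels, folds=5)
```

With the real corpus, `evaluate(texts, labels)` would run the 10-fold setting used in the experiments.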
Method: Problem Specific Features
Imbalance
Star-rating is positive, more negative words (142/35) ∗
Star-rating is negative, more positive words (0/0)
Example
☀☀☀☀☀
Avoid that TV show. Highly addictive.
∗ (ironic reviews with that feature/non-ironic reviews with that feature)
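The imbalance feature can be sketched as a comparison between the star rating and the word polarity counts. The toy lexicons and the thresholds for a "positive" (4 or more stars) and "negative" (2 or fewer stars) rating are assumptions; the slides name neither.

```python
import re

# Hypothetical toy lexicons; the slides do not say which sentiment lexicon was used.
POSITIVE = {"great", "best", "awesome", "greatest", "good", "excellent"}
NEGATIVE = {"avoid", "worst", "stupid", "wrong", "addictive"}


def imbalance_features(text, stars):
    """Flag reviews whose star rating disagrees with their word polarity."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n_positive = sum(token in POSITIVE for token in tokens)
    n_negative = sum(token in NEGATIVE for token in tokens)
    return {
        # Assumed thresholds: 4-5 stars count as positive, 1-2 as negative.
        "positive_stars_negative_words": stars >= 4 and n_negative > n_positive,
        "negative_stars_positive_words": stars <= 2 and n_positive > n_negative,
    }


example = imbalance_features("Avoid that TV show. Highly addictive.", stars=5)
```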
Method: Problem Specific Features
Hyperbole
Three successive positive words (2/4)
Three successive negative words (4/4)
Example
That is the best, awesome, greatest, washing machine ever!
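One way to detect such a run of three polarity words is a simple counter over the token sequence. The lexicons are hypothetical, and ignoring the commas between the words is an assumption about how "successive" is meant.

```python
import re

# Hypothetical toy lexicons; the slides do not say which sentiment lexicon was used.
POSITIVE = {"best", "awesome", "greatest", "great", "excellent"}
NEGATIVE = {"worst", "stupid", "awful", "terrible", "wrong"}


def has_successive_run(text, lexicon, length=3):
    """Return True if `length` lexicon words directly follow each other."""
    run = 0
    for token in re.findall(r"[a-z']+", text.lower()):
        # Extend the run on a lexicon word, reset it on any other word.
        run = run + 1 if token in lexicon else 0
        if run >= length:
            return True
    return False


example = "That is the best, awesome, greatest, washing machine ever!"
positive_hyperbole = has_successive_run(example, POSITIVE)
negative_hyperbole = has_successive_run(example, NEGATIVE)
```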
Method: Problem Specific Features
Quotes
Two succeeding positive adjectives/nouns in quotes (25/25)
Two succeeding negative adjectives/nouns in quotes (16/15)
Example
They advertise it as “very good”.
Method: Problem Specific Features
Pos/Neg and Punctuation
Positive word, exclamation mark in a distance of four (7/19)
Negative word, exclamation mark in a distance of four (4/2)
Example
Such a great thing!
Method: Problem Specific Features
Pos/Neg and Ellipsis
Positive word, ellipsis in a distance of four (27/33)
Negative word, ellipsis in a distance of four (28/18)
Example
Such a great thing. . .
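Both distance-based features (polarity word near an exclamation mark, polarity word near an ellipsis) can be sketched with one helper. Measuring the "distance of four" in tokens rather than characters is an assumption, as are the toy lexicons and the tokenizer.

```python
import re

# Hypothetical toy lexicons; the slides do not say which sentiment lexicon was used.
POSITIVE = {"great", "best", "awesome", "excellent"}
NEGATIVE = {"worst", "stupid", "awful", "wrong"}

# Tokens are a three-dot ellipsis, a word, or a single "!" / "?".
TOKEN = re.compile(r"\.\.\.|[a-z']+|[!?]")


def near_marker(text, lexicon, marker, max_distance=4):
    """True if a lexicon word occurs within `max_distance` tokens of `marker`."""
    tokens = TOKEN.findall(text.lower())
    words = [i for i, token in enumerate(tokens) if token in lexicon]
    markers = [i for i, token in enumerate(tokens) if token == marker]
    return any(abs(w - m) <= max_distance for w in words for m in markers)


positive_exclamation = near_marker("Such a great thing!", POSITIVE, "!")
positive_ellipsis = near_marker("Such a great thing...", POSITIVE, "...")
negative_ellipsis = near_marker("Such a great thing...", NEGATIVE, "...")
```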
Method: Problem Specific Features
Ellipsis and Punctuation
An ellipsis is followed by multiple punctuation marks (4/1)
Example
You really say. . . ?!?
Method: Problem Specific Features
Punctuation
Existence of multiple exclamation marks (31/51)
Existence of multiple question marks (10/6)
Combination of question with exclamation mark (12/4)
Example
“!!!!!”, “??”, “?!”
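These three punctuation features lend themselves to regular expressions. The sketch below assumes that "multiple" means two or more directly adjacent marks; the slides do not define it.

```python
import re


def punctuation_features(text):
    """Binary punctuation features; 'multiple' is assumed to mean 2+ in a row."""
    return {
        "multiple_exclamation": bool(re.search(r"!{2,}", text)),
        "multiple_question": bool(re.search(r"\?{2,}", text)),
        "question_with_exclamation": bool(re.search(r"\?!|!\?", text)),
    }


exclamations = punctuation_features("Great!!!!!")
combination = punctuation_features("You really say...?!?")
```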
Method: Problem Specific Features
Interjection
Terms like “wow”, “huh”, and “lol” (16/18)
Laughter
Onomatopoeia like “haha” (1/2)
Smilies (6/25)
Example
That machine is really like . . . *WOW*. . . hahahaha :-)
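A possible pattern-based sketch of the interjection and laughter features. The term list is limited to the three words named on the slide, and the laughter and smiley patterns are illustrative assumptions rather than the authors' definitions.

```python
import re

# Only the terms named on the slide; the full list is not given.
INTERJECTION = re.compile(r"\b(?:wow|huh|lol)\b", re.IGNORECASE)
# Assumed pattern: "ha" repeated at least twice, e.g. "haha", "hahahaha".
LAUGHTER = re.compile(r"\b(?:ha){2,}\b", re.IGNORECASE)
# Assumed pattern: a few common Western smilies such as ":-)" or ";D".
SMILEY = re.compile(r"[:;]-?[)(DP]")


def interjection_features(text):
    """Binary features for interjections, laughter onomatopoeia and smilies."""
    return {
        "interjection": bool(INTERJECTION.search(text)),
        "laughter": bool(LAUGHTER.search(text)),
        "smiley": bool(SMILEY.search(text)),
    }


example = interjection_features("That machine is really like ... *WOW*... hahahaha :-)")
```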
Method: Bag-of-Words
Every occurring term is used to generate a feature
Features
Example text: “This is great.”
The word “This” occurs
The word “is” occurs
The word “great” occurs
. . .
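The bag-of-words representation from the example can be sketched as one binary occurrence feature per term. The tokenizer and the feature-name format are assumptions; case is kept because the slide lists “This” with its capital letter.

```python
import re


def bag_of_words(text):
    """Map every occurring term to a binary 'occurs' feature (case preserved)."""
    return {f"contains({token})": 1 for token in re.findall(r"\w+", text)}


features = bag_of_words("This is great.")
```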
Experiments: Baselines
Use the star-rating as five features (“star-rating”)
Bag-of-Words
Majority of positive/negative words (“sentiment”)
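Encoding the star rating "as five features" is most naturally read as a one-hot vector with one indicator per possible rating. That reading is an assumption; the slides do not spell out the encoding.

```python
def star_rating_features(stars):
    """One-hot encode a 1-5 star rating as five binary features."""
    if stars not in range(1, 6):
        raise ValueError("star rating must be an integer from 1 to 5")
    return [1 if stars == rating else 0 for rating in range(1, 6)]


five_stars = star_rating_features(5)
one_star = star_rating_features(1)
```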
Experiments: Results, Logistic Regression, 10-fold CV
[Figure: bar chart of F1 (in %, axis 0–100) per feature set: Star-Rating 71.7, BOW 68.8, Sentiment 58.1, All+Star-Rating 74.4, All 67.8, Specific 50.8]
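For reference, a small sketch of the evaluation measure and the fold construction behind such numbers. It assumes F1 is computed for the ironic class and uses plain round-robin folds without stratification or shuffling; the slides state neither detail.

```python
def f1_score(gold, predicted, positive=1):
    """F1 of the `positive` class: harmonic mean of precision and recall."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, predicted))
    fp = sum(g != positive and p == positive for g, p in zip(gold, predicted))
    fn = sum(g == positive and p != positive for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; every item is tested exactly once."""
    folds = [list(range(start, n_items, k)) for start in range(k)]
    for held_out, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


example_f1 = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
splits = list(k_fold_indices(1254, k=10))  # 1254 = corpus size from the slides
```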
Experiments: Distributions
[Figure: two plots, “Corpus” and “Prediction”, each showing the number of reviews (axis 0–700) per star rating (1–5), split into Irony and Non Irony]
Experiments: Results for different classifiers
[Figure: bar chart of F1 (in %, axis 0–100) per classifier: Logistic Regr. 74.4, SVM 71.3, Decision Tree 72.2, Random Forest 48.2, Naive Bayes 65.0]
Experiments: Information Gain of Bag-of-Words
Which phrases are important to decide for irony?
great, I mean, easy, mean, is very, very, stupid, is a, worst, highly, a great, easy to, the worst, excellent, price, fast, a bit, shirt, works, money, man, simple, worse, use, Oh, idea, nothing, and it, How, the best, wrong
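A ranking like this can be produced by scoring each phrase-occurrence feature with information gain. The sketch assumes the standard definition IG(Y; X) = H(Y) - sum over x of P(x) H(Y | X = x); how the authors computed the ranking is not stated on the slide.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on the feature's values."""
    total = len(labels)
    gain = entropy(labels)
    for value in set(feature_values):
        subset = [label for f, label in zip(feature_values, labels) if f == value]
        gain -= len(subset) / total * entropy(subset)
    return gain


# Toy data: 1 = ironic, 0 = regular; feature = "phrase occurs in the review".
labels = [1, 1, 0, 0]
perfect_phrase = information_gain([1, 1, 0, 0], labels)
useless_phrase = information_gain([1, 0, 1, 0], labels)
```

Phrases would then be sorted by this score in descending order.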
Summary: Summary & Future work
Summary
The first feature evaluation for irony detection on a publicly available corpus
Meta-information is a strong indicator
Setting with actual text based features is more useful
Outlook
Measure text similarity of reviews of same product
Transfer known theories about the use of irony to text
Include method in our fine-grained aspect/evaluation phrase extraction model for sentiment analysis (Klinger et al., 2013b; Klinger et al., 2013a)
Bibliography
Attardo, S. (2000). “Irony Markers and Functions: Towards a Goal-oriented Theory of Irony and its Processing”. In: Rask: Internationalt Tidsskrift for Sprog og Kommunikation.
Barbieri, F. et al. (2014). “Modelling Irony in Twitter: Feature Analysis and Evaluation”. In: LREC.
Carvalho, P. et al. (2009). “Clues for detecting irony in user-generated contents: oh. . . !! it’s “so easy” ;-)”. In: TSA@CIKM.
Clark, H. H. et al. (1984). “On the pretense theory of irony.” In: Journal of Experimental Psychology: General.
Dress, M. L. et al. (2008). “Regional Variation in the Use of Sarcasm”. In: Journal of Language and Social Psychology.
Filatova, E. (2012). “Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing”. In: LREC.
Gonzalez-Ibanez, R. et al. (2011). “Identifying sarcasm in Twitter: a closer look”. In: ACL-HLT.
Klinger, R. et al. (2013a). “Bi-directional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model”. In: ACL.
Klinger, R. et al. (2013b). “Joint and Pipeline Probabilistic Models for Fine-Grained Sentiment Analysis: Extracting Aspects, Subjective Phrases and their Relations”. In: ICDMW.
Kumon-Nakamura, S. et al. (1995). “How About Another Piece of Pie: The Allusional Pretense Theory of Discourse Irony”. In: Journal of Experimental Psychology: General.
Maynard, D. et al. (2014). “Who cares about Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis.” In: LREC.
Reyes, A. et al. (2011). “Mining subjective knowledge from customer reviews: a specific case of irony detection”. In: WASSA@ACL.
Tsur, O. et al. (2010). “ICWSM – A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews.” In: ICWSM.
Utsumi, A. (2000). “Verbal irony as implicit display of ironic environment: Distinguishing ironic utterances from nonirony”. In: Journal of Pragmatics.
Wilson, D. et al. (2012). “Explaining Irony”. In: Meaning and Relevance.