Word Sense and Subjectivity. Jan Wiebe (University of Pittsburgh), Rada Mihalcea (University of North Texas)

  • Word Sense and Subjectivity

    Jan Wiebe, University of Pittsburgh
    Rada Mihalcea, University of North Texas

    COLING-ACL-2006

  • Introduction

    Growing interest in the automatic extraction of opinions, emotions, and sentiments in text (subjectivity)

  • Subjectivity Analysis: Applications

    Opinion-oriented question answering: How do the Chinese regard the human rights record of the United States?
    Product review mining: What features of the ThinkPad T43 do customers like and which do they dislike?
    Review classification: Is a review positive or negative toward the movie?
    Tracking emotions toward topics over time: Is anger ratcheting up or cooling down toward an issue or event?
    Etc.

  • Introduction

    Continuing interest in word sense
        Sense-annotated resources being developed for many languages (www.globalwordnet.org)
        Active participation in evaluations such as SENSEVAL

  • Word Sense and Subjectivity

    Though both are concerned with text meaning, they have mainly been investigated independently

  • Subjectivity Labels on Senses

    Alarm, dismay, consternation (fear resulting from the awareness of danger)

    Alarm, warning device, alarm system (a device that signals the occurrence of some undesirable event)

  • Subjectivity Labels on Senses

    Interest, involvement -- (a sense of concern with and curiosity about someone or something; "an interest in music")

    Interest -- (a fixed charge for borrowing money; usually a percentage of the amount borrowed; "how much interest do you pay on your mortgage?")

  • WSD using Subjectivity Tagging

    Senses of "interest":
        Sense 4: a sense of concern with and curiosity about someone or something (S)
        Sense 1: a fixed charge for borrowing money (O)

    Example sentences:
        The notes do not pay interest.
        He spins a riveting plot which grabs and holds the reader's interest.

    [Diagram: the WSD system is undecided between Sense 1 and Sense 4 for each sentence; the S and O labels produced by the subjectivity classifier feed into the WSD system.]

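  • Subjectivity Tagging using WSD

    Example sentences:
        The notes do not pay interest.
        He spins a riveting plot which grabs and holds the reader's interest.

    [Diagram: the subjectivity classifier is undecided between O and S for each sentence; the senses chosen by the WSD system (Sense 1 and Sense 4) feed into the classifier.]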

  • Goals

    Explore interactions between word sense and subjectivity
    Can subjectivity labels be assigned to word senses?
        Manually
        Automatically
    Can subjectivity analysis improve word sense disambiguation?
    Can word sense disambiguation improve subjectivity analysis? (Future work)

  • Outline

    Motivation and Goals
    Assigning Subjectivity Labels to Word Senses
        Manually
        Automatically
    Word Sense Disambiguation using Automatic Subjectivity Analysis
    Conclusions

  • Prior Work on Subjectivity Tagging

    Identifying words and phrases associated with subjectivity
        Think ~ private state; Beautiful ~ positive sentiment
        Hatzivassiloglou & McKeown 1997; Wiebe 2000; Kamps & Marx 2002; Turney 2002; Esuli & Sebastiani 2005; etc.
    Subjectivity classification of sentences, clauses, phrases, or word instances in context
        subjective/objective; positive/negative/neutral
        Riloff & Wiebe 2003; Yu & Hatzivassiloglou 2003; Dave et al. 2003; Hu & Liu 2004; Kim & Hovy 2004; etc.
    Here: subjectivity labels are applied to word senses

  • Annotation Scheme

    Assigning subjectivity labels to WordNet senses
        S: subjective
        O: objective
        B: both

  • Annotators are given the synset and its hypernym

    Alarm, dismay, consternation (fear resulting from the awareness of danger)
    Fear, fearfulness, fright (an emotion experienced in anticipation of some specific pain or danger (usually accompanied by a desire to flee or fight))

  • Subjective Sense Definition

    When the sense is used in a text or conversation, we expect it to express subjectivity, and we expect the phrase/sentence containing it to be subjective.

  • Objective Senses: Observation

    We don't necessarily expect phrases/sentences containing objective senses to be objective
        Would you actually be stupid enough to pay that rate of interest?
        Will someone shut that darn alarm off?

    Subjective, but not due to interest or alarm

  • Objective Sense Definition

    When the sense is used in a text or conversation, we don't expect it to express subjectivity and, if the phrase/sentence containing it is subjective, the subjectivity is due to something else.

  • Senses that are Both

    Covers both subjective and objective usages
    Example: absorb, suck, imbibe, soak up, sop up, suck up, draw, take in, take up (take in, also metaphorically; "The sponge absorbs water well"; "She drew strength from the minister's words")

  • Annotated Data

    64 words; 354 senses
        Balanced subset [32 words; 138 senses]; 2 judges
        The ambiguous nouns of the SENSEVAL-3 English Lexical Task [20 words; 117 senses]; 2 judges [Mihalcea, Chklovski & Kilgarriff, 2004]
        Others [12 words; 99 senses]; 1 judge

  • Annotated Data: Agreement Study

    64 words; 354 senses
        Balanced subset [32 words; 138 senses]; 2 judges
            16 words have both S and O senses
            16 words do not (8 only S and 8 only O)
            All subsets balanced between nouns and verbs
            Uncertain tags also permitted

  • Inter-Annotator Agreement Results

    Overall: Kappa=0.74; Percent Agreement=85.5%

    Without the 12.3% of cases in which a judge assigned U: Kappa=0.90; Percent Agreement=95.0%

    16 words with S and O senses: Kappa=0.75
    16 words with only S or O: Kappa=0.73
    Comparable difficulty
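
The agreement figures above are reported as Kappa and percent agreement. The slides do not say which kappa variant was used, so the following is only a minimal sketch assuming Cohen's kappa for two judges; the function name and the toy labels are illustrative and are not the actual annotations.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Toy sense labels (S, O, B, U); hypothetical, for illustration only.
judge_1 = ["S", "S", "O", "O", "B", "O", "S", "O"]
judge_2 = ["S", "O", "O", "O", "B", "O", "S", "U"]
kappa = cohen_kappa(judge_1, judge_2)
percent_agreement = sum(a == b for a, b in zip(judge_1, judge_2)) / len(judge_1)
```

Kappa discounts the agreement that two judges would reach by chance, which is why it is lower than raw percent agreement on the same labels.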

  • Inter-Annotator Agreement Results

    64 words; 354 senses
        The ambiguous nouns of the SENSEVAL-3 English Lexical Task [20 words; 117 senses]
            2 judges
            U tags not permitted
            Even so, Kappa=0.71

  • Related Work

    Unsupervised word-sense ranking algorithm of [McCarthy et al. 2004]
        That task: approximate corpus frequencies of word senses
        Our task: predict a word-sense property (subjectivity)
    Method for learning subjective adjectives of [Wiebe 2000]
        That task: label words
        Our task: label word senses

  • Overview

    Main idea: assess the subjectivity of a word sense based on information about the subjectivity of a set of distributionally similar words in a corpus annotated with subjective expressions

  • MPQA Opinion Corpus

    10,000 sentences from the world press annotated for subjective expressions [Wiebe et al., 2005]
    www.cs.pitt.edu/mpqa

  • Subjective Expressions

    Subjective expressions: opinions, sentiments, speculations, etc. (private states) expressed in language

  • Examples

    His alarm grew.
    The leaders roundly condemned the Iranian President's verbal assault on Israel.
    He would be quite a catch.
    That doctor is a quack.

  • Preliminaries: subjectivity of word w

    Unannotated corpus (BNC)

    $$\mathrm{subj}(w) = \frac{\#\mathrm{insts}(DSW)\ \text{in SE} - \#\mathrm{insts}(DSW)\ \text{not in SE}}{\#\mathrm{insts}(DSW)}$$

    Range: [-1, 1], from highly objective to highly subjective
    (DSW: distributionally similar words of w; SE: subjective expressions)
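
As a rough illustration of the word-level score, the sketch below computes subj(w) from one boolean per corpus instance of any distributionally similar word. The function name and the input representation are assumptions made for this example and are not part of the original slides.

```python
def word_subjectivity(in_se_flags):
    """subj(w) from one flag per corpus instance of any DSW of w.

    A flag is True when the instance lies inside a subjective expression (SE).
    Returns a value in [-1, 1], or None when there are no instances.
    """
    total = len(in_se_flags)
    if total == 0:
        return None  # the score is undefined without any instances
    in_se = sum(in_se_flags)
    not_in_se = total - in_se
    return (in_se - not_in_se) / total


# Hypothetical example: three instances inside an SE, one outside.
score = word_subjectivity([True, True, True, False])
```

A score near +1 means the similar words occur almost only inside subjective expressions; a score near -1 means they almost never do.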

  • Subjectivity of word sense wi

    Rather than 1, add or subtract sim(wi,dswj) for each instance
    Example contributions: +sim(wi,dsw1), -sim(wi,dsw1), +sim(wi,dsw2)
    Range: [-1, 1]

  • Method Step 1

    Given word w
    Find distributionally similar words [Lin 1998]
    DSW = {dswj | j = 1 .. n}
    Experiment with top 100 and 160

  • Method Step 2

    Find the similarity between each word sense and each distributionally similar word

    wnss can be any concept-based similarity measure between word senses; we use Jiang & Conrath 1997

  • Method Step 3

    Input:
        word sense wi of word w
        DSW = {dswj | j = 1..n}
        sim(wi,dswj)
        MPQA Opinion Corpus

    Output: subjectivity score subj(wi)

  • Method Step 3

    totalsim = sum over dswj in DSW of #insts(dswj) * sim(wi,dswj)

    subj = 0
    for each dswj in DSW:
        for each instance k in insts(dswj):
            if k is in a subjective expression:
                subj += sim(wi,dswj)
            else:
                subj -= sim(wi,dswj)
    subj(wi) = subj / totalsim
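
The Step 3 pseudocode can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary-based inputs, the function name, and the toy similarity values are hypothetical, and totalsim is taken to be the sum over all distributionally similar words so that the score stays within [-1, 1].

```python
def sense_subjectivity(sims, instances):
    """Subjectivity score subj(wi) for a single word sense wi.

    sims: dict mapping each dsw_j to sim(wi, dsw_j)
    instances: dict mapping each dsw_j to a list of booleans, one per
        corpus instance, True if the instance is in a subjective expression
    """
    subj = 0.0
    totalsim = 0.0
    for dsw, sim in sims.items():
        for in_subjective_expression in instances.get(dsw, []):
            # Each instance votes with weight sim(wi, dsw_j) instead of 1.
            subj += sim if in_subjective_expression else -sim
            totalsim += sim
    if totalsim == 0:
        return None  # no usable instances, so the score is undefined
    return subj / totalsim


# Toy values, not taken from the MPQA corpus.
sims = {"fear": 0.8, "bell": 0.2}
instances = {"fear": [True, True, False], "bell": [False]}
score = sense_subjectivity(sims, instances)
```

Accumulating totalsim inside the inner loop is equivalent to multiplying each similarity by its instance count, so the normalizer matches the totalsim line of the pseudocode.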

  • Method: Optional Variation

    w1: dsw1 dsw2 dsw3
    w2: dsw1 dsw2 dsw3
    w3: dsw1 dsw2 dsw3

    if k is in a subjective expression: subj += sim(wi,dswj)
    else: subj -= sim(wi,dswj)

    Selected

  • Evaluation

    Calculate subj scores for all word senses, and sort them
    While 0 is a natural candidate for division between S and O, we perform the evaluation for different thresholds in [-1,+1]
    Calculate the precision of the algorithm at different points of recall
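
The threshold sweep described above can be sketched as below. The slides do not spell out the exact procedure, so this rests on assumptions: a sense is labelled S when its score is at or above the threshold, and precision is set to 1.0 when nothing is labelled S. The function name and the toy data are hypothetical.

```python
def precision_recall_at_thresholds(scores, gold, thresholds):
    """Precision and recall of the S label at each threshold.

    scores: subjectivity score per word sense
    gold: True where the gold-standard label of the sense is S
    """
    total_s = sum(gold)
    rows = []
    for t in thresholds:
        predicted = [s >= t for s in scores]  # assumption: >= t means S
        true_pos = sum(p and g for p, g in zip(predicted, gold))
        n_pred = sum(predicted)
        precision = true_pos / n_pred if n_pred else 1.0  # convention
        recall = true_pos / total_s if total_s else 0.0
        rows.append((t, precision, recall))
    return rows


# Toy scores and gold labels, for illustration only.
scores = [0.9, 0.4, 0.1, -0.3, -0.8]
gold = [True, False, True, False, False]
rows = precision_recall_at_thresholds(scores, gold, [0.5, 0.0, -1.0])
```

Lowering the threshold labels more senses as S, which raises recall and usually lowers precision; plotting the pairs gives a precision/recall curve.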

  • Evaluation

    Automatic assignment of subjectivity for 272 word senses (no DSW instances for 82 senses)

    Baseline: random selection of S labels
        Number of assigned S labels matches number of S labels in the gold standard (recall = 1.0)

  • Evaluation: precision/recall curves

    Number of distributionally similar words = 160

    Chart data (precision at each recall level):

    Recall   baseline   selected   all
    0.0      0.27       1.00       1.00
    0.1      0.27       0.81       0.47
    0.2      0.27       0.77       0.45
    0.3      0.27       0.55       0.45
    0.4      0.27       0.51       0.45
    0.5      0.27       0.50       0.38
    0.6      0.27       0.42       0.37
    0.7      0.27       0.41       0.37
    0.8      0.27       0.38       0.36
    0.9      0.27       0.37       0.34
    1.0      0.27       0.27       0.27


  • Evaluation

    Break-even point
        Point where precision and recall are equal
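
As a small worked example of the break-even point, the sketch below picks the sampled recall level at which precision is closest to recall, using the precision values listed in the chart data for the "selected" series. Taking the nearest sampled point (rather than interpolating) is an assumption of this sketch; the slides do not say how the break-even point was located.

```python
def break_even(recalls, precisions):
    """Return the sampled (recall, precision) pair with the smallest gap."""
    return min(zip(recalls, precisions), key=lambda rp: abs(rp[0] - rp[1]))


recalls = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
# Precision of the "selected" series at each recall level (chart data).
selected = [1.0, 0.81, 0.77, 0.55, 0.51, 0.50, 0.42, 0.41, 0.38, 0.37, 0.27]
point = break_even(recalls, selected)
```

With these values the gap is zero at recall 0.5, where precision is also 0.5.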

  • Overview

    Augment an existing WSD system with a feature reflecting the subjectivity of the context of the ambiguous word
    Compare the performance of original and subjectivity-aware WSD systems
        The ambiguous nouns of the SENSEVAL-3 English Lexical Task
        SENSEVAL-3 data

  • Original WSD System

    Integrates local and topical features:
        Local: context of three words to the left and right, their part-of-speech
        Topical: top five words occurring at least three times in the context of a word sense
        [Ng & Lee, 1996], [Mihalcea, 2002]
    Naive Bayes classifier [Lee & Ng, 2003]

  • Automatic Subjectivity Classifier

    Rule-based automatic sentence classifier from [Wiebe & Riloff 2005]
    Included in OpinionFinder; available at: www.cs.pitt.edu/mpqa/

  • Subjectivity Tagging for WSD

    Used to tag sentences of the SENSEVAL-3 data that contain target nouns

    [Diagram: Sentence k, containing the target noun "atmosphere", is passed to the subjectivity classifier, which outputs O or S.]
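
One way to picture the subjectivity-aware feature set is the sketch below: local context features (three words to the left and right, with their part-of-speech tags) plus a single sentence-level subjectivity feature. The feature names, the padding token, and the function itself are hypothetical; the sentence label is assumed to come from an external classifier, and the topical features are left out for brevity.

```python
def build_features(tokens, pos_tags, target_index, sentence_label, window=3):
    """Local context features plus one sentence-level subjectivity feature."""
    features = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the target word itself
        i = target_index + offset
        inside = 0 <= i < len(tokens)
        features[f"word_{offset:+d}"] = tokens[i].lower() if inside else "<pad>"
        features[f"pos_{offset:+d}"] = pos_tags[i] if inside else "<pad>"
    # Label assumed to be produced elsewhere by a sentence classifier.
    features["sentence_subjectivity"] = sentence_label  # "S" or "O"
    return features


tokens = ["The", "notes", "do", "not", "pay", "interest", "."]
pos_tags = ["DT", "NNS", "VBP", "RB", "VB", "NN", "."]
features = build_features(tokens, pos_tags, target_index=5, sentence_label="O")
```

A dictionary like this could be fed to any feature-based classifier; comparing runs with and without the last feature mirrors the original versus subjectivity-aware comparison.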

  • WSD using Subjectivity Tagging

  • Words with S and O Senses

    4.3% error reduction; significant (p < 0.05, paired t-test)

  • Words with Only O Senses

  • Conclusions

    Can subjectivity labels be assigned to word senses?
        Manually
            Good agreement; Kappa=0.74
            Very good when uncertain cases removed; Kappa=0.90
        Automatically
            Method substantially outperforms baseline
    Showed feasibility of assigning subjectivity labels to the fine-grained level of word senses

  • Conclusions

    Can subjectivity analysis improve word sense disambiguation?
        Improves performance, but mainly for words with both S and O senses (4.3% error reduction; significant (p < 0.05))
        Performance largely remains the same or degrades for words that don't
    Assign subjectivity labels to WordNet; WSD system should consult WordNet tags to decide when to pay attention to the contextual subjectivity feature.

  • Thank You

  • Refining WordNet

    Semantic richness
    Find inconsistencies and gaps
        Verb assault: attack, round, assail, lash out, snipe, assault (attack in speech or writing); "The editors of the left-leaning paper attacked the new House Speaker"
        But no sense for the noun as in "His verbal assault was vicious"

  • Observation: MPQA corpus

    Corpus somewhat noisy for our task
        MPQA annotates subjective expressions
        Objective senses can appear in subjective expressions
    Hypothesis: subjective senses tend to appear more often in subjective expressions than objective senses do, and so the appearance of words in subjective expressions is evidence of sense subjectivity

  • WSD using Subjectivity Tagging

    Hypothesis: instances of subjective senses are more likely to be in subjective sentences, so sentence subjectivity is an informative feature for WSD of words with both subjective and objective senses

  • Subjective Sense Examples

    He was boiling with anger
        Seethe, boil (be in an agitated emotional state; "The customer was seething with anger")
        Be (have the quality of being; (copula, used with an adjective or a predicate noun); "John is rich"; "This is not a good answer")

  • Subjective Sense Examples

    What's the catch?
        Catch (a hidden drawback; "it sounds good but what's the catch?")
        Drawback (the quality of being a hindrance; "he pointed out all the drawbacks to my plan")

    That doctor is a quack.
        Quack (an untrained person who pretends to be a physician and who dispenses medical advice)
        Doctor, doc, physician, MD, Dr., medico

  • Objective Sense Examples

    The alarm went off
        Alarm, warning device, alarm system (a device that signals the occurrence of some undesirable event)
        Device (an instrumentality invented for a particular purpose; "the device is small enough to wear on your wrist"; "a device intended to conserve water")

    The water boiled
        Boil (come to the boiling point and change from a liquid to vapor; "Water boils at 100 degrees Celsius")
        Change state, turn (undergo a transformation or a change of position or action)

  • Objective Sense Examples

    He sold his catch at the market
        Catch, haul (the quantity that was caught; "the catch was only 10 fish")
        Indefinite quantity (an estimated quantity)

    The duck's quack was loud and brief
        Quack (the harsh sound of a duck)
        Sound (the sudden occurrence of an audible event)

    Continued research on enhancing dictionaries such as WordNet; this link points to close to 50 wordnet projects for various languages.
    A growing number of research groups participate in large-scale evaluations such as SENSEVAL. Some comment that blah groups participated in
    Boxes around the subjectivity classifier. Color on the sentences. Remove the italics.
    The main goal in this work is to
    So, what did the annotators judge?
    Probably put less detail on this slide
    Different task
    Notes: similarity-select clearly performs better than similarity-all. The number of distributionally similar words (160 vs 100) does not make a big difference (the same observation was made by McCarthy in the context of WSD).
    animate
    The idea is
    And, if we care about improving our lexical resources, attention to the subjectivity of senses could, for example, reveal missing senses.
    So, let's look at the results