a database of nate chambers and dan jurafsky stanford university narrative schemas

35
A A Database Database of of Nate Chambers and Dan Jurafsky Stanford University Narrative Narrative Schemas Schemas

Upload: lorraine-cook

Post on 17-Dec-2015

222 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

A A Database Database ofof

Nate Chambers and Dan Jurafsky

Stanford University

Narrative Narrative SchemasSchemas

Page 2: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Two Joint TasksTwo Joint Tasks

Events in a Narrative Semantic Roles

suspect, criminal, client, immigrant, journalist, government, …

police, agent, officer, authorities, troops, official, investigator, …

Page 3: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

ScriptsScripts

• Background knowledge for language understanding

Restaurant Script

Schank and Abelson. 1977. Scripts Plans Goals and Understanding. Lawrence Erlbaum.

Mooney and DeJong. 1985. Learning Schemata for NLP. IJCAI-85.

• Hand-coded• Domain dependent

Page 4: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

ApplicationsApplications

• Coreference• Argument prediction

• Summarization• Inform sentence selection with event confidence scores

• Textual Inference• Does a document infer other events

• Selectional Preferences• Use chains to inform argument types

• Aberration Detection• Detect surprise/unexpected events in text

• Story Generation

Page 5: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

The ProtagonistThe Protagonist

protagonist:

(noun)

1. the principal character in a drama or other literary work

2. a leading actor, character, or participant in a literary work or real event

Page 6: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Narrative Event Narrative Event ChainsChains

ACL-2008Narrative Event Chains

(1)Narrative relations(2)Single arguments(3)Temporal ordering

Page 7: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Inducing Narrative RelationsInducing Narrative Relations

1. Dependency parse a document.

2. Run coreference to cluster entity mentions.

3. Count pairs of verbs with coreferring arguments.

4. Use pointwise mutual information to measure relatedness.

Chambers and Jurafsky. Unsupervised Learning of Narrative Event Chains. ACL-08

Narrative Coherence AssumptionVerbs sharing coreferring arguments are semantically connected

by virtue of narrative discourse structure.

Page 8: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas
Page 9: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Chain Example (ACL-08)Chain Example (ACL-08)

Page 10: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Schema ExampleSchema Example

Police, Agent, Authorities

Judge, OfficialProsecutor, Attorney

Plea, Guilty, InnocentSuspect, Criminal,Terrorist, …

Page 11: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Narrative SchemasNarrative Schemas

N = (E,C)E = {arrest, charge, plead, convict, sentence}

C ={C1,C2,C3}

Page 12: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Learning SchemasLearning Schemas

narsim(N,v j ) = maxC i

chainsim(Ci,< v j ,d >)d∈Dv j

Page 13: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Training DataTraining Data

• NYT portion of the Gigaword Corpus• David Graff. 2002. English Gigaword. Linguistic Data Consortium.

• 1.2 million documents

• Stanford Parser• http://nlp.stanford.edu/software/lex-parser.shtml

• OpenNLP coreference• http://opennlp.sourceforge.net

• Lemmatize verbs and noun arguments.

Page 14: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Viral ExampleViral Example

mosquito, aids, virus, tick, catastrophe, disease

virus, disease, bacteria, cancer, toxoplasma, strain

Page 15: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Authorship ExampleAuthorship Example

company, author, group, year, microsoft, magazine

book, report, novel, article, story, letter, magazine

Page 16: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Temporal OrderingTemporal Ordering

• Supervised classifier for before/after relations• Chambers and Jurafsky, EMNLP 2008.• Chambers et al., ACL 2007.

• Classify all pairs of verbs in Gigaword• Record counts of before and after relations

16

Page 17: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

The DatabaseThe Database

1. Narrative Schemas (unordered)1. Various sizes of schemas (6, 8, 10, 12)

2. 1813 base verbs

2. Temporal Orderings1. Pairs of verbs

2. Counts of before and after relations

17

• http://cs.stanford.edu/people/nc/schemas/

Page 18: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

EvaluationsEvaluations

• The Cloze Test• Chambers, Jurafsky. ACL-2008.• Chambers, Jurafsky. ACL-2009.

• Comparison to FrameNet• Chambers, Jurafsky. ACL-2009.

• Corpus Coverage• LREC 2010

18

Page 19: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Comparison to FrameNetComparison to FrameNet

• Narrative Schemas• Focuses on events that occur together in a narrative.

• FrameNet (Baker et al., 1998)

• Focuses on events that share core roles.

Page 20: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Comparison to FrameNetComparison to FrameNet

• Narrative Schemas• Focuses on events that occur together in a narrative.• Schemas represent larger situations.

• FrameNet (Baker et al., 1998)

• Focuses on events that share core roles.• Frames typically represent single events.

Page 21: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Comparison to FrameNetComparison to FrameNet

1. How similar are schemas to frames?• Find “best” FrameNet frame by event overlap

2. How similar are schema roles to frame elements?• Evaluate argument types as FrameNet frame elements.

Page 22: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

FrameNet Schema SimilarityFrameNet Schema Similarity

1. How many schemas map to frames?• 13 of 20 schemas mapped to a frame• 26 of 78 (33%) verbs are not in FrameNet

2. Verbs present in FrameNet• 35 of 52 (67%) matched frame• 17 of 52 (33%) did not match

Page 23: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

FrameNet Schema SimilarityFrameNet Schema Similarity

traderisefallslip

quote

Exchange

Change Position on a Scale

Multiple FrameNet FramesOne Schema

• Why are 33% unaligned?• FrameNet represents subevents as separate frames• Schemas model sequences of events.

Undressing

n/a

Narrative Relation

Page 24: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

EvaluationsEvaluations

• The Cloze Test• Chambers, Jurafsky. ACL-2008.• Chambers, Jurafsky. ACL-2009.

• Comparison to FrameNet• Chambers, Jurafsky. ACL-2009.

• Corpus Coverage• LREC 2010

24

Page 25: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Corpus Coverage EvaluationCorpus Coverage Evaluation

• Narrative Schemas are generalized knowledge structures.

• Newspaper articles discuss specific scenarios.

• How many events in an article’s description are stereotypical events in narrative schemas?

25

Page 26: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Coverage ExampleCoverage Example

26

Investing Schema• invest • sell • take_out • buy • pay• withdraw • pull • pull_out • put • put_in

Article Text• aware• sell• think• pay• walk• put

He is painfully aware that if he sold his four-bedroom brick suburban home for the $220,000 that he thinks he can get for it and then paid off his mortgage, he would walk away with, as he puts it, ….

He is painfully aware that if he sold his four-bedroom brick suburban home for the $220,000 that he thinks he can get for it and then paid off his mortgage, he would walk away with, as he puts it, ….

Page 27: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Coverage ScoreCoverage Score

• Largest Connected Component• Largest subset of vertices such that a path exists between all vertices

• Events are connected if there exists some schema such that both events are members.

27

Article Text• aware• sell• think• pay• walk• put

3 of the 6 are connected50% coverage

Page 28: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Coverage ResultsCoverage Results

• 69 documents• 740 events

• Macro-average document coverage• Final coverage: 34%

“One third of a document’s events are part of a self-contained narrative schema.”

28

Page 29: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Evaluation ResultsEvaluation Results

• The Cloze Test• Schemas improve 36% on event prediction over verb-based

similarity

• Comparison to FrameNet• 65% of schemas match FrameNet frames• 33% of schema events are novel to FrameNet

• Corpus Coverage• 96% of events are connected in the space of events• 34% of events are connected by self-contained schemas

29

Page 30: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Schemas OnlineSchemas Online

• Narrative Schemas• http://cs.stanford.edu/people/nc/schemas/

• Coverage Evaluation (Cloze Test)• http://cs.stanford.edu/people/nc/data/chains/

30

Nate Chambers and Dan Jurafsky

Page 31: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

31

Page 32: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

FrameNet Argument SimilarityFrameNet Argument Similarity

2. Argument role mapping to frame elements.• 72% of arguments appropriate as frame elements

law, ban, rule, constitutionality,conviction, ruling, lawmaker, tax

INCORRECT

FrameNet frame: EnforcingFrame element: Rule

Page 33: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Coverage EvaluationCoverage Evaluation

1. Choose a news article at random.

2. Identify the protagonist.

3. Extract the narrative event chain.

• Match the chain to the best narrative schema with the largest event overlap.

Page 34: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

Coverage DatasetCoverage Dataset

• NYT portion of the Gigaword Corpus• Randomly selected the year 2001• 69 random newspaper articles within 2001

• 100 initially chosen, 31 removed that are not news

• Identified the protagonist and events by hand

34

Page 35: A Database of Nate Chambers and Dan Jurafsky Stanford University Narrative Schemas

The ResourceThe Resource

• 3.5% of events in new documents are disconnected from the space of narrative relations• Chambers and Jurafsky. ACL 2008.

• 66% of events in new documents are not clustered into generalized narrative schemas.

• The extent to which a document discusses new variants of known schemas remains for future work.

35