a database of nate chambers and dan jurafsky stanford university narrative schemas
TRANSCRIPT
A A Database Database ofof
Nate Chambers and Dan Jurafsky
Stanford University
Narrative Narrative SchemasSchemas
Two Joint TasksTwo Joint Tasks
Events in a Narrative Semantic Roles
suspect, criminal, client, immigrant, journalist, government, …
police, agent, officer, authorities, troops, official, investigator, …
ScriptsScripts
• Background knowledge for language understanding
Restaurant Script
Schank and Abelson. 1977. Scripts Plans Goals and Understanding. Lawrence Erlbaum.
Mooney and DeJong. 1985. Learning Schemata for NLP. IJCAI-85.
• Hand-coded• Domain dependent
ApplicationsApplications
• Coreference• Argument prediction
• Summarization• Inform sentence selection with event confidence scores
• Textual Inference• Does a document infer other events
• Selectional Preferences• Use chains to inform argument types
• Aberration Detection• Detect surprise/unexpected events in text
• Story Generation
The ProtagonistThe Protagonist
protagonist:
(noun)
1. the principal character in a drama or other literary work
2. a leading actor, character, or participant in a literary work or real event
Narrative Event Narrative Event ChainsChains
ACL-2008Narrative Event Chains
(1)Narrative relations(2)Single arguments(3)Temporal ordering
Inducing Narrative RelationsInducing Narrative Relations
1. Dependency parse a document.
2. Run coreference to cluster entity mentions.
3. Count pairs of verbs with coreferring arguments.
4. Use pointwise mutual information to measure relatedness.
Chambers and Jurafsky. Unsupervised Learning of Narrative Event Chains. ACL-08
Narrative Coherence AssumptionVerbs sharing coreferring arguments are semantically connected
by virtue of narrative discourse structure.
Chain Example (ACL-08)Chain Example (ACL-08)
Schema ExampleSchema Example
Police, Agent, Authorities
Judge, OfficialProsecutor, Attorney
Plea, Guilty, InnocentSuspect, Criminal,Terrorist, …
Narrative SchemasNarrative Schemas
€
N = (E,C)E = {arrest, charge, plead, convict, sentence}
€
C ={C1,C2,C3}
Learning SchemasLearning Schemas
€
narsim(N,v j ) = maxC i
chainsim(Ci,< v j ,d >)d∈Dv j
∑
Training DataTraining Data
• NYT portion of the Gigaword Corpus• David Graff. 2002. English Gigaword. Linguistic Data Consortium.
• 1.2 million documents
• Stanford Parser• http://nlp.stanford.edu/software/lex-parser.shtml
• OpenNLP coreference• http://opennlp.sourceforge.net
• Lemmatize verbs and noun arguments.
Viral ExampleViral Example
mosquito, aids, virus, tick, catastrophe, disease
virus, disease, bacteria, cancer, toxoplasma, strain
Authorship ExampleAuthorship Example
company, author, group, year, microsoft, magazine
book, report, novel, article, story, letter, magazine
Temporal OrderingTemporal Ordering
• Supervised classifier for before/after relations• Chambers and Jurafsky, EMNLP 2008.• Chambers et al., ACL 2007.
• Classify all pairs of verbs in Gigaword• Record counts of before and after relations
16
The DatabaseThe Database
1. Narrative Schemas (unordered)1. Various sizes of schemas (6, 8, 10, 12)
2. 1813 base verbs
2. Temporal Orderings1. Pairs of verbs
2. Counts of before and after relations
17
• http://cs.stanford.edu/people/nc/schemas/
EvaluationsEvaluations
• The Cloze Test• Chambers, Jurafsky. ACL-2008.• Chambers, Jurafsky. ACL-2009.
• Comparison to FrameNet• Chambers, Jurafsky. ACL-2009.
• Corpus Coverage• LREC 2010
18
Comparison to FrameNetComparison to FrameNet
• Narrative Schemas• Focuses on events that occur together in a narrative.
• FrameNet (Baker et al., 1998)
• Focuses on events that share core roles.
Comparison to FrameNetComparison to FrameNet
• Narrative Schemas• Focuses on events that occur together in a narrative.• Schemas represent larger situations.
• FrameNet (Baker et al., 1998)
• Focuses on events that share core roles.• Frames typically represent single events.
Comparison to FrameNetComparison to FrameNet
1. How similar are schemas to frames?• Find “best” FrameNet frame by event overlap
2. How similar are schema roles to frame elements?• Evaluate argument types as FrameNet frame elements.
FrameNet Schema SimilarityFrameNet Schema Similarity
1. How many schemas map to frames?• 13 of 20 schemas mapped to a frame• 26 of 78 (33%) verbs are not in FrameNet
2. Verbs present in FrameNet• 35 of 52 (67%) matched frame• 17 of 52 (33%) did not match
FrameNet Schema SimilarityFrameNet Schema Similarity
traderisefallslip
quote
Exchange
Change Position on a Scale
Multiple FrameNet FramesOne Schema
• Why are 33% unaligned?• FrameNet represents subevents as separate frames• Schemas model sequences of events.
Undressing
n/a
Narrative Relation
EvaluationsEvaluations
• The Cloze Test• Chambers, Jurafsky. ACL-2008.• Chambers, Jurafsky. ACL-2009.
• Comparison to FrameNet• Chambers, Jurafsky. ACL-2009.
• Corpus Coverage• LREC 2010
24
Corpus Coverage EvaluationCorpus Coverage Evaluation
• Narrative Schemas are generalized knowledge structures.
• Newspaper articles discuss specific scenarios.
• How many events in an article’s description are stereotypical events in narrative schemas?
25
Coverage ExampleCoverage Example
26
Investing Schema• invest • sell • take_out • buy • pay• withdraw • pull • pull_out • put • put_in
Article Text• aware• sell• think• pay• walk• put
He is painfully aware that if he sold his four-bedroom brick suburban home for the $220,000 that he thinks he can get for it and then paid off his mortgage, he would walk away with, as he puts it, ….
He is painfully aware that if he sold his four-bedroom brick suburban home for the $220,000 that he thinks he can get for it and then paid off his mortgage, he would walk away with, as he puts it, ….
Coverage ScoreCoverage Score
• Largest Connected Component• Largest subset of vertices such that a path exists between all vertices
• Events are connected if there exists some schema such that both events are members.
27
Article Text• aware• sell• think• pay• walk• put
3 of the 6 are connected50% coverage
Coverage ResultsCoverage Results
• 69 documents• 740 events
• Macro-average document coverage• Final coverage: 34%
“One third of a document’s events are part of a self-contained narrative schema.”
28
Evaluation ResultsEvaluation Results
• The Cloze Test• Schemas improve 36% on event prediction over verb-based
similarity
• Comparison to FrameNet• 65% of schemas match FrameNet frames• 33% of schema events are novel to FrameNet
• Corpus Coverage• 96% of events are connected in the space of events• 34% of events are connected by self-contained schemas
29
Schemas OnlineSchemas Online
• Narrative Schemas• http://cs.stanford.edu/people/nc/schemas/
• Coverage Evaluation (Cloze Test)• http://cs.stanford.edu/people/nc/data/chains/
30
Nate Chambers and Dan Jurafsky
31
FrameNet Argument SimilarityFrameNet Argument Similarity
2. Argument role mapping to frame elements.• 72% of arguments appropriate as frame elements
law, ban, rule, constitutionality,conviction, ruling, lawmaker, tax
INCORRECT
FrameNet frame: EnforcingFrame element: Rule
Coverage EvaluationCoverage Evaluation
1. Choose a news article at random.
2. Identify the protagonist.
3. Extract the narrative event chain.
• Match the chain to the best narrative schema with the largest event overlap.
Coverage DatasetCoverage Dataset
• NYT portion of the Gigaword Corpus• Randomly selected the year 2001• 69 random newspaper articles within 2001
• 100 initially chosen, 31 removed that are not news
• Identified the protagonist and events by hand
34
The ResourceThe Resource
• 3.5% of events in new documents are disconnected from the space of narrative relations• Chambers and Jurafsky. ACL 2008.
• 66% of events in new documents are not clustered into generalized narrative schemas.
• The extent to which a document discusses new variants of known schemas remains for future work.
35