
Statistical Script Learning with Recurrent Neural Nets

Karl Pichotta Dissertation Proposal December 17, 2015


Motivation

• Following the Battle of Actium, Octavian invaded Egypt. As he approached Alexandria, Antony's armies deserted to Octavian on August 1, 30 BC.

• Did Octavian defeat Antony?


Motivation

• Antony’s armies deserted to Octavian ⇒ Octavian defeated Antony

• Not simply a paraphrase rule!

• Need world knowledge.

Scripts

• Scripts: models of events in sequence.

• Events don’t appear in text randomly, but according to world dynamics.

• Scripts try to capture these dynamics.

• Enable automatic inference of implicit events, given events in text (e.g. Octavian defeated Antony).


Research Questions

• How can Neural Nets improve automatic inference of events from documents?

• Which models work best empirically?

• Which types of explicit linguistic knowledge are useful?


Outline

• Background

• Completed Work

• Proposed Work

• Conclusion


Outline

• Background

• Statistical Scripts

• Recurrent Neural Nets


Background: Statistical Scripts

• Statistical Scripts: Statistical Models of Event Sequences.

• Non-statistical scripts date back to the 1970s [Schank & Abelson 1977].

• Statistical script learning is a small-but-growing subcommunity [e.g. Chambers & Jurafsky 2008].

• Model the probability of an event given prior events.


Background: Statistical Script Learning

Millions of Documents → NLP Pipeline (Syntax, Coreference) → Millions of Event Sequences → Train a Statistical Model

Background: Statistical Script Inference

New Test Document → NLP Pipeline (Syntax, Coreference) → Single Event Sequence → Query Trained Statistical Model → Inferred Probable Events

Background: Statistical Scripts

• Central Questions:

• What is an “Event?” (Part 1 of completed work)

• Which models work well? (Part 2 of completed work)

• How to evaluate?

• How to incorporate into end tasks?


Outline

• Background

• Statistical Scripts

• Recurrent Neural Nets

Background: RNNs

• Recurrent Neural Nets (RNNs): Neural Nets with cycles in computation graph.

• RNN Sequence Models: Map inputs x1, …, xt to outputs o1, …, ot via learned latent vector states z1, …, zt.
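The mapping from inputs to outputs through latent states can be sketched in a few lines. This is a minimal Elman-style forward pass, not the model used in this work; the weight names `W_xz`, `W_zz`, `W_zo`, the tanh nonlinearity, and the toy numbers are all assumptions made for illustration.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_forward(xs, W_xz, W_zz, W_zo, z0):
    """Run a simple recurrent net over inputs xs; return (states, outputs)."""
    z, states, outputs = z0, [], []
    for x in xs:
        # The new latent state depends on the current input and the previous state.
        pre = [a + b for a, b in zip(matvec(W_xz, x), matvec(W_zz, z))]
        z = [math.tanh(p) for p in pre]
        states.append(z)
        # The output is a linear readout of the latent state.
        outputs.append(matvec(W_zo, z))
    return states, outputs

# Toy (hypothetical) inputs and weights.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_xz = [[0.5, -0.5], [0.3, 0.8]]
W_zz = [[0.1, 0.0], [0.0, 0.1]]
W_zo = [[1.0, -1.0]]
states, outputs = rnn_forward(xs, W_xz, W_zz, W_zo, z0=[0.0, 0.0])
```

The cycle in the computation graph is the reuse of `z` from one loop iteration to the next.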


Background: RNNs

[Figure: an RNN unrolled over time. Inputs x1, x2, …, xt feed latent states z1, z2, …, zt, which produce outputs o1, o2, …, ot.] [Elman 1990]

Background: RNNs

• Hidden Unit can be arbitrarily complicated, as long as we can calculate gradients!

[Figure: a chain of RNN Units, each taking an input x and producing an output o.]

Background: LSTMs

• Long Short-Term Memory (LSTM): More complex hidden RNN unit. [Hochreiter & Schmidhuber, 1997]

• Explicitly addresses two issues:

• Vanishing Gradient Problem.

• Long-Range Dependencies.


Background: LSTM

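$$
\begin{aligned}
g_t &= \tanh(W_{x,m} x_t + W_{z,m} z_{t-1} + b_g) \\
i_t &= \sigma(W_{x,i} x_t + W_{z,i} z_{t-1} + b_i) \\
f_t &= \sigma(W_{x,f} x_t + W_{z,f} z_{t-1} + b_f) \\
o_t &= \sigma(W_{x,o} x_t + W_{z,o} z_{t-1} + b_o) \\
m_t &= f_t \odot m_{t-1} + i_t \odot g_t \\
z_t &= o_t \odot \tanh m_t
\end{aligned}
$$

[Figure: LSTM unit. Inputs z_{t-1} and x_t feed the Input Gates (i_t, g_t), the Forget Gate f_t, and the Output Gate o_t; m_t is the Memory Vector and z_t is the output.]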
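One step of the LSTM update can be written out directly from the gate equations on this slide. The sketch below is only illustrative: the `params` layout (one `(W_x, W_z, b)` triple per gate) and the toy all-zero weights are assumptions, and a real implementation would use a tensor library.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def affine(W_x, W_z, b, x, z_prev):
    """Compute W_x x + W_z z_prev + b."""
    return [p + q + r for p, q, r in zip(matvec(W_x, x), matvec(W_z, z_prev), b)]

def lstm_step(params, x, z_prev, m_prev):
    """One LSTM step; params maps 'g', 'i', 'f', 'o' to (W_x, W_z, b)."""
    g = [math.tanh(a) for a in affine(*params["g"], x, z_prev)]
    i = [sigmoid(a) for a in affine(*params["i"], x, z_prev)]
    f = [sigmoid(a) for a in affine(*params["f"], x, z_prev)]
    o = [sigmoid(a) for a in affine(*params["o"], x, z_prev)]
    # Memory: forget part of the old memory, add the gated candidate.
    m = [ft * mp + it * gt for ft, mp, it, gt in zip(f, m_prev, i, g)]
    # Output state: gated, squashed memory.
    z = [ot * math.tanh(mt) for ot, mt in zip(o, m)]
    return z, m

# Toy check with hypothetical all-zero weights: every sigmoid gate is 0.5
# and the candidate g is 0, so the memory is simply halved.
params = {k: ([[0.0]], [[0.0]], [0.0]) for k in "gifo"}
z, m = lstm_step(params, x=[1.0], z_prev=[0.0], m_prev=[2.0])
```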

Background: LSTMs

• LSTMs successful for many hard NLP tasks recently:

• Machine Translation [Kalchbrenner and Blunsom 2013, Bahdanau et al. 2015].

• Captioning Images/Videos [Donahue et al. 2015, Venugopalan et al. 2015].

• Language Modeling [Sundermeyer et al. 2012, Kim et al. 2016].

• Question Answering [Hermann et al. 2015, Gao et al. 2015].


Outline

• Background

• Completed Work

• Proposed Work

• Conclusion


Outline

• Background

• Completed Work

• Multi-Argument Events

• RNN Scripts


Events

• To model “events,” we need a formal definition.

• For us, it will be variations of “verbs with participants.”

Pair Events

• Other Methods use (verb, dependency) pair events [Chambers & Jurafsky 2008; 2009; Jans et al. 2012; Rudinger et al. 2015].

• Captures how an entity relates to a verb.

(vb, dep): vb is the Verb, dep is the Syntactic Dependency.

Pair Events

• Napoleon remained married to Marie Louise, though she did not join him in exile on Elba and thereafter never saw her husband again.

N.: (remain_married, subj) → (not_join, obj) → (not_see, obj)

M.L.: (remain_married, prep) → (not_join, subj) → (not_see, subj)

• …Doesn’t capture interactions between entities.

Multi-Argument Events [P. & Mooney, EACL 2014]

• Use more complex events with multiple entities.

• Learning is more complicated…

• …But inferred events are quantitatively better.

Multi-Argument Events

• We represent events as tuples: v(es, eo, ep), where v is the Verb, es the Subject Entity, eo the Object Entity, and ep the Prepositional Entity.

• Entities may be null (“·”).

• Entities have only coreference information.

Multi-Argument Events

• Napoleon remained married to Marie Louise, though she did not join him in exile on Elba and thereafter never saw her husband again.

remain_married(N, ·, to ML) → not_join(ML, N, ·) → not_see(ML, N, ·)

• Incorporate entities into events as variables.

• Captures pairwise interaction between entities.
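To make the contrast with pair events concrete, the Napoleon example can be held as multi-argument tuples and then projected down to per-entity (verb, dependency) chains. This is a sketch under stated assumptions: the `Event` type and `pair_chain` helper are hypothetical names, `None` stands in for the null entity “·”, and the preposition itself is dropped from the prepositional slot for brevity.

```python
from collections import namedtuple

# Hypothetical container for a v(es, eo, ep) event.
Event = namedtuple("Event", ["verb", "subj", "obj", "prep"])

events = [
    Event("remain_married", "N", None, "ML"),
    Event("not_join", "ML", "N", None),
    Event("not_see", "ML", "N", None),
]

def pair_chain(events, entity):
    """Project multi-argument events onto one entity's (verb, dependency) chain."""
    chain = []
    for ev in events:
        for dep in ("subj", "obj", "prep"):
            if getattr(ev, dep) == entity:
                chain.append((ev.verb, dep))
    return chain

napoleon_chain = pair_chain(events, "N")
marie_louise_chain = pair_chain(events, "ML")
```

Each chain keeps only one entity's role, so the fact that the same two entities interact in every event is visible in `events` but lost in the projected chains.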

Entity Rewriting

remain_married(N, ·, to ML) → not_join(ML, N, ·) → not_see(ML, N, ·)

• not_join(x, y, ·) should predict not_see(x, y, ·) for all x, y.

• During learning, canonicalize co-occurring events:

• Rename variables to a small fixed set.

• Add co-occurrences of all consistent rewritings of the events.
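One possible reading of the rewriting step is sketched below: collect the entities in a co-occurring pair of events, then emit the pair once for every one-to-one renaming of those entities into a small fixed variable set. The exact scheme used in the original work is not spelled out on the slide, so the variable set `VARS`, the `(verb, args)` event layout, and the choice to enumerate permutations are all assumptions.

```python
from itertools import permutations

VARS = ("x", "y", "z")  # small fixed variable set (an assumption)

def rewritings(event_a, event_b):
    """Yield the event pair under every consistent renaming of its entities."""
    entities = []
    for _, args in (event_a, event_b):
        for e in args:
            if e is not None and e not in entities:
                entities.append(e)
    # Each permutation assigns distinct variables to distinct entities.
    for names in permutations(VARS, len(entities)):
        mapping = dict(zip(entities, names))
        mapping[None] = None  # null arguments stay null
        yield tuple((verb, tuple(mapping[e] for e in args))
                    for verb, args in (event_a, event_b))

a = ("not_join", ("ML", "N", None))
b = ("not_see", ("ML", "N", None))
pairs = list(rewritings(a, b))
```

Because the same mapping is applied to both events, every rewriting preserves which argument slots share an entity, which is what lets not_join(x, y, ·) predict not_see(x, y, ·) regardless of who x and y are.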

Learning & Inference

• Learning: From large corpus, count N(a,b), the number of times event b occurs after event a with at most two intervening events (“2-skip bigram” counts).

• Inference: Infer event b at timestep t according to:

$$
S(b) = \underbrace{\sum_{i=1}^{t} \log P(b \mid a_i)}_{\text{Prob. of } b \text{ following events before } t} + \underbrace{\sum_{i=t+1}^{\ell} \log P(a_i \mid b)}_{\text{Prob. of } b \text{ preceding events after } t}
$$

[Jans et al. 2012]
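The counting and scoring steps can be illustrated on a toy corpus. This sketch makes assumptions the slide does not state: events are plain strings, conditional probabilities are estimated from the skip-bigram counts with add-one smoothing, and the function names are hypothetical.

```python
import math
from collections import Counter

def skip_bigram_counts(sequences, max_skip=2):
    """Count N(a, b): b occurs after a with at most max_skip intervening events."""
    counts = Counter()
    for seq in sequences:
        for i, a in enumerate(seq):
            for b in seq[i + 1 : i + 2 + max_skip]:
                counts[(a, b)] += 1
    return counts

def cond_prob(counts, vocab, first, second):
    """Add-one-smoothed estimate that `second` follows `first` (smoothing is an assumption)."""
    total = sum(counts[(first, v)] for v in vocab)
    return (counts[(first, second)] + 1) / (total + len(vocab))

def score(counts, vocab, candidate, before, after):
    """S(b): log-prob of b following earlier events plus log-prob of b preceding later ones."""
    s = sum(math.log(cond_prob(counts, vocab, a, candidate)) for a in before)
    s += sum(math.log(cond_prob(counts, vocab, candidate, a)) for a in after)
    return s

# Toy (made-up) event sequences.
sequences = [
    ["arrest", "charge", "convict", "sentence"],
    ["arrest", "charge", "acquit"],
]
vocab = sorted({e for seq in sequences for e in seq})
counts = skip_bigram_counts(sequences)
s_charge = score(counts, vocab, "charge", before=["arrest"], after=["convict"])
s_sentence = score(counts, vocab, "sentence", before=["arrest"], after=["convict"])
```

On this toy data, "charge" scores higher than "sentence" as the missing event between "arrest" and "convict".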

Evaluation

• “Narrative Cloze” (Chambers & Jurafsky, 2008): from an unseen document, hold one event out, try to infer it given remaining document.

• “Recall at k” (Jans et al., 2012): make k top inferences, calculate recall of held-out events.

• We evaluate on a number of metrics, but present only one here for clarity (results on the other metrics are comparable).
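Recall at k over a set of narrative cloze tests reduces to a short function. This is a sketch with a hypothetical interface: each test contributes one ranked list of guesses and one held-out event.

```python
def recall_at_k(ranked_guesses, held_out, k):
    """Fraction of held-out events that appear among the top k guesses."""
    hits = sum(1 for guesses, gold in zip(ranked_guesses, held_out)
               if gold in guesses[:k])
    return hits / len(held_out)

# Two toy cloze tests: the first held-out event is ranked 2nd, the second 3rd.
ranked = [["a", "b", "c"], ["d", "e", "f"]]
gold = ["b", "f"]
r_at_2 = recall_at_k(ranked, gold, k=2)
r_at_3 = recall_at_k(ranked, gold, k=3)
```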


Experiments

• Train on 1.1M NYT articles (Gigaword).

• Use Stanford Parser/Coref.

Results: Pair Events

Recall at 10 for inferring (verb, dependency) events:

• Unigram: 0.297

• Single-Protagonist: 0.282

• Joint: 0.336

Results: Multi-Argument Events

Recall at 10 for inferring Multi-argument events:

• Unigram: 0.216

• Multi-Protagonist: 0.209

• Joint: 0.245

Outline

• Background

• Completed Work

• Multi-Argument Events

• RNN Scripts


Co-occurrence Model Shortcomings

• The co-occurrence-based method has shortcomings:

• “x married y” and “x is married to y” are unrelated events.

• Nouns are ignored (“she sits on the chair” vs. “she sits on the board of directors”).

• Relative position of events in sequence is ignored (only one notion of co-occurrence).


LSTM Script models [P. & Mooney, AAAI 2016]

• Feed event sequences into LSTM sequence model.

• To infer events, have the model generate likely events from sequence.

• Can input noun info, coref info, or both.

LSTM Script models

• In April 1866 Congress again passed the bill. Johnson again vetoed it.

[Figure: a chain of RNN Units, each taking an input x and producing an output o.]

[pass, congress, bill, in, april]; [veto, johnson, it, ·, ·]

LSTM Script models

[Figure: an LSTM reads the two events one component per timestep (pass, congress, bill, april, in, veto, Johnson, it, ·, ·) and produces an output o at each step.]

LSTM Script models

[Figure: at each timestep the LSTM takes one event component as input (pass, congress, bill, april, in, veto, Johnson, it) and is trained to output the next component. Each input position carries two rows of labels, the first giving its component type:]

v es eo ep p | v es eo ep p
- S 1 S - | - S 1 - -

LSTM Script models

• Train on English Wikipedia.

• Run Stanford parser, coref; extract sequences of events.

• Train LSTM using Batch Stochastic Gradient Descent with Momentum.

• Minimize cross-entropy loss of predictions.

• Backpropagate error through layers and through time.

• To infer new events, just have the LSTM generate the next five outputs with highest probability (using beam search).
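The input side of this setup can be sketched by flattening events into a single sequence of components and pairing each component with the one that follows it. The names `flatten` and `training_pairs` are hypothetical, the component order (verb, subject, object, prepositional object, preposition) follows the v, es, eo, ep, p labels, and `None` stands in for a null slot.

```python
NULL = "·"  # placeholder token for an empty argument slot

def flatten(events):
    """Turn 5-tuple events into one flat sequence of component tokens."""
    tokens = []
    for event in events:
        tokens.extend(c if c is not None else NULL for c in event)
    return tokens

def training_pairs(tokens):
    """Pair every token with its successor: (input at step t, target at step t)."""
    return list(zip(tokens[:-1], tokens[1:]))

events = [
    ("pass", "congress", "bill", "april", "in"),
    ("veto", "johnson", "it", None, None),
]
tokens = flatten(events)
pairs = training_pairs(tokens)
```

A sequence model trained with cross-entropy on such pairs learns to predict both the remaining components of the current event and the first component of the next one, as in the ("in", "veto") pair here.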

Results: Predicting Verbs & Coreference Info

Recall at 25 for inferring Verbs & Coref info:

• Unigram: 0.101

• Joint: 0.124

• LSTM coref: 0.145

• LSTM coref+noun: 0.152

Results: Predicting Verbs & Nouns

Recall at 25 for inferring Verbs & Nouns:

• Unigram: 0.025

• Joint: 0.037

• LSTM noun: 0.054

• LSTM coref+noun: 0.061

Human Evaluations

• Solicit judgments on individual inferences on Amazon Mechanical Turk.

• Have annotators rate inferences from 1 to 5 (or mark “Nonsense,” scored 0).

• More interpretable.

Results: Crowdsourced Eval

Filtered Human judgments of top inferences (5 Max):

• Random: 0.87

• Joint Entity: 2.87

• Joint Noun: 2.21

• LSTM Entity: 3.08

• LSTM Noun: 3.67

Annotator Examples

As a result, during the October municipal election, serious violence broke out on polling day, with shots exchanged by competing mobs.

• Random: “appeal to X” (2.7)

• Joint Ent: “X has a X” (0.3)

• Joint Noun: “known as X” (3.3)

• LSTM Ent: “X has a X” (0.3)

• LSTM Noun: “X was arrested” (4.3)

Annotator Examples

Today the remaining community has shrunk to about 50 mostly elderly people. The Kehila Kedosha Yashan Synagogue remains locked, only opened for visitors on request. Emigrant Romaniotes return every summer and open the old synagogue.

• Random: “all of the X's men were lost” (2.0)

• Joint Ent: “X found” (2.0)

• Joint Noun: “X wrote” (2.0)

• LSTM Ent: “build a X” (1.7)

• LSTM Noun: “synagogue was closed” (3.0)

Generating “Stories”

• Can generate “stories” by starting with <S> beginning-of-sequence pseudo-event.

• Sample from distribution of initial event components (first verb).

• Take sample as first-step input, sample distribution of next components.

• Repeat until sampling </S> end-of-sequence token.
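The sampling loop can be sketched as follows. A trained LSTM is replaced here by a tiny hand-written table of next-token distributions that conditions only on the previous token; `TOY_MODEL`, its probabilities, and the `max_len` cutoff are all made up for illustration.

```python
import random

START, END = "<S>", "</S>"

# Hypothetical stand-in for the trained model: next-token distributions
# keyed by the previous token only.
TOY_MODEL = {
    "<S>": {"bear": 0.5, "attend": 0.5},
    "bear": {"she": 1.0},
    "attend": {"she": 1.0},
    "she": {"</S>": 0.6, "attend": 0.4},
}

def sample_next(dist, rng):
    """Draw one token from a {token: probability} distribution."""
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

def generate(model, rng, max_len=50):
    """Sample tokens from START until END is drawn (or max_len is reached)."""
    out, prev = [], START
    while len(out) < max_len:
        nxt = sample_next(model[prev], rng)
        if nxt == END:
            break
        out.append(nxt)  # the sample becomes the next step's input
        prev = nxt
    return out

story = generate(TOY_MODEL, random.Random(0))
```

With a real sequence model, `model[prev]` would be replaced by a forward step that conditions on the whole history through the hidden state.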

Generated “Stories”

Generated event tuples → English Descriptions

(bear, ., ., kingdom, into) → Born into a kingdom,…

(attend, she, brown, graduation, after) → …she attended Brown after graduation

(earn, she, master, university, from) → She earned her Masters from the University

(admit, ., she, university, to) → She was admitted to a University

(receive, she, bachelor, university, from) → She had received a bachelors from a University

(involve, ., she, production, in) → She was involved in the production

(represent, she, company, ., .) → She represented the company.

Outline

• Background

• Completed Work

• Proposed Work

• Conclusion


Research Questions

• How can Neural Nets improve automatic inference of events from documents?

• Which models work best empirically?

• Which types of explicit linguistic knowledge are useful?


Outline

• Background

• Completed Work

• Proposed Work

• Better Models

• Better Events

• Discourse Relations

• Bonus

• Coreference

• Question-Answering


Better Models

• Other Neural Approaches may work better than LSTM.

Better Models (1/3)

• Different kinds of RNN:

• Gated Recurrent Units (GRUs) [Cho et al. 2014]

• Grid LSTM [Kalchbrenner et al. 2015]

• Gated Feedback Recurrent Units [Chung et al. 2015]

• Replacing one black box with another.

Better Models (2/3)

• Convolutional Neural Networks (CNNs):

• Learn 1D convolution operators to apply to event sequences.

• Arrive ultimately at vector predicting next event(s).

• Recent success with NLP classification tasks. [Kalchbrenner, Grefenstette, & Blunsom 2014; Kim 2014; Zhang, Zhao, & LeCun 2015]
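What a 1D convolution over an event sequence computes can be shown on toy vectors. This is only a sketch of the operator itself, not a proposed architecture: the event embeddings, the single width-2 kernel, and the max-pooling step are assumptions chosen for illustration.

```python
def conv1d(seq, kernel):
    """Slide a kernel (a list of weight vectors) over a sequence of vectors."""
    width = len(kernel)
    out = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Dot product between the kernel and the window, summed over
        # positions and vector dimensions.
        out.append(sum(w * x
                       for k_vec, x_vec in zip(kernel, window)
                       for w, x in zip(k_vec, x_vec)))
    return out

# Four toy 2-dimensional "event embeddings" and one width-2 kernel.
event_vectors = [[1, 0], [0, 1], [1, 1], [0, 0]]
kernel = [[1, 0], [0, 1]]
feature_map = conv1d(event_vectors, kernel)
pooled = max(feature_map)  # max-pooling over time gives one feature
```

Stacking many such kernels and pooling their feature maps yields a fixed-size vector from a variable-length event sequence, which could then feed a predictor of the next event.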


Better Models (3/3)

• Attention-based Models: contain an explicit notion of which parts of the input are most predictively useful (where to “pay attention”).

• Recently shown to be useful in NLP tasks (Bahdanau et al. 2015, Hermann et al. 2015).

Better Models (3/3)

• Churchill had suffered a mild stroke while on holiday in the south of France in the summer of
1949. The strain of carrying the Premiership and Foreign Office contributed to his stroke at 10 Downing Street after dinner on the evening of 23 June 1953. Despite being partially paralysed down one side, he presided over a Cabinet meeting the next morning without anybody noticing his incapacity. Thereafter his condition deteriorated, and it was thought that he might not survive the weekend. Had Eden been fit, Churchill's premiership would most likely have been over. News of this was kept from the public and from Parliament, who were told that Churchill was suffering from exhaustion. He went to his country home, Chartwell, to recuperate, and by the end of June he astonished his doctors by being able, dripping with perspiration, to lift himself upright from his chair. He joked that news of his illness had chased the trial of the serial killer John Christie off the front pages. Churchill was still keen to pursue a meeting with the Soviets and was open to the idea of a reunified Germany. He refused to condemn the Soviet crushing of East Germany, commenting on 10 July 1953 that "The Russians were surprisingly patient about the disturbances in East Germany". He thought this might have been the reason for the removal of Beria. Churchill returned to public life in October 1953 to make a speech at the Conservative Party conference at Margate.


Better Models (3/3)

• An attention-based script system would add an explicit distribution of predictive utility over observed events.
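A distribution of predictive utility over observed events can be sketched with dot-product attention. This is a simplification chosen for brevity, not the scoring function of any cited paper; the `query` vector, the event vectors, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Normalize scores into a probability distribution (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, event_vectors):
    """Weight each observed event by its relevance to the query."""
    scores = [sum(q * e for q, e in zip(query, vec)) for vec in event_vectors]
    weights = softmax(scores)
    dim = len(event_vectors[0])
    # The context vector is the attention-weighted average of the events.
    context = [sum(w * vec[d] for w, vec in zip(weights, event_vectors))
               for d in range(dim)]
    return weights, context

query = [1.0, 0.0]
event_vectors = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
weights, context = attend(query, event_vectors)
```

The `weights` list is the explicit distribution: it sums to one and can be inspected to see which observed events the model relied on for a given prediction.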


Outline

• Background

• Completed Work

• Proposed Work

• Better Models

• Better Events

• Discourse Relations

• Bonus

• Coreference

• Question-Answering

Raw Text v. Linguistics

[Figure: a spectrum of input representations, from “Only raw text” through “Some Linguistic Structure” to “Arbitrarily complex NLP”.]

Better Events

• Events in P & Mooney (2016) are v(es, eo, ep, p) 5-tuples.

• This throws away a lot of potentially important information. How much it matters is an empirical question, but it should be investigated.

• We will investigate a number of ways to add information to events (enumerated next…).

Better Events (1/4)

• Fixed-arity events throw away multiple Prepositional Phrases.

• In 1697, Peter the Great traveled incognito to Europe on an 18-month journey with a large Russian delegation to seek the aid of the European monarchs.

• Is presently: (travel, peter, ·, (in 1697)).

• Could be something like (travel, Peter, ·, (in 1697), (to Europe), (on journey), (with delegation)).
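The difference between the two representations can be made concrete with a small data structure. The `Event` class and its `fixed_arity` method are hypothetical names, and keeping only the first prepositional phrase is an assumption about how the fixed-arity form is derived.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Event:
    """A variable-arity event: a verb, two core arguments, any number of PPs."""
    verb: str
    subj: Optional[str] = None
    obj: Optional[str] = None
    preps: List[Tuple[str, str]] = field(default_factory=list)

    def fixed_arity(self):
        """Collapse to the fixed-arity form, keeping only the first PP."""
        first = self.preps[0] if self.preps else None
        return (self.verb, self.subj, self.obj, first)

travel = Event(
    "travel", "peter", None,
    [("in", "1697"), ("to", "europe"), ("on", "journey"), ("with", "delegation")],
)
collapsed = travel.fixed_arity()
```

Here `collapsed` keeps (in 1697) and silently drops the destination, the journey, and the delegation.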


Better Events (2/4)

• Many important modifiers aren’t grammatically prepositional phrases.

• King Frederick William I nearly executed his son for desertion.

• Without “nearly” we make drastically wrong inferences!

Better Events (3/4)

• Head Nouns of Arguments are Insufficient:

• Martin Luther wrote to his bishop protesting the sale of indulgences.

• If “Sale of Indulgences” is just “sale”…

• …we can’t conclude “Luther disapproved of indulgences.”

68

Better Events (4/4)

• Nominal (noun) events are very common:

• In the years following his death, a series of civil wars tore Alexander’s empire apart.

• Noun events are crucial for inferring events.

69

Discourse Relations

• Relations between events are important.

• The Roman cavalry won an early victory by swiftly routing the Carthaginian horses.

• Because the local authorities had forbidden students from forming organizations, Princip and other members of Young Bosnia met in secret.

• Connectives express relations between events and are likely useful for event inference.

70

Discourse Parsers

• Off-the-shelf Discourse Parsers have been trained to annotate discursively important relations between spans of text.

• Trained on one of two Discourse treebanks [RST Treebank and Penn Discourse Treebank].

• Label spans of text as being, e.g., causally or temporally related.

71

Using Discourse Parsers

• Could incorporate shallow discourse structure into events (i.e. “these two events are related by this discourse relation”).

• Could also hypothetically incorporate discourse structure directly into structure of Neural Net.

• RST parses are trees; could use recently-introduced Tree-LSTMs [Tai et al. 2015] on the topology.
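One possible reading of the "shallow" option is to mark related events with a relation token in the sequence fed to the RNN. The sketch below is a toy illustration under that assumption; the `Event` and `Relation` types, the `linearize` function, and the `cause` label are hypothetical and do not correspond to any particular discourse parser's output format.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    verb: str
    subj: str
    obj: str  # "·" marks a missing argument, as on the slides


class Relation(NamedTuple):
    label: str  # hypothetical relation label, e.g. "cause"
    arg1: int   # index of the first related event
    arg2: int   # index of the second related event


def linearize(events: List[Event], relations: List[Relation]) -> List[str]:
    """Flatten events into tokens, emitting a relation token before
    the second event of each related pair."""
    rel_before = {r.arg2: r.label for r in relations}
    tokens: List[str] = []
    for i, ev in enumerate(events):
        if i in rel_before:
            tokens.append(f"<{rel_before[i]}>")
        tokens.extend(ev)
    return tokens


events = [Event("forbid", "authorities", "students"),
          Event("meet", "princip", "·")]
tokens = linearize(events, [Relation("cause", 0, 1)])
```

The resulting token list could be consumed by a sequence model like any other event sequence; the tree-structured alternative would need a different model topology and is not sketched here.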

72

Induced Connectives

• Could also induce a closed class of connectives (e.g. “before,” “because of,” …) in an unsupervised manner.

• There are a number of ways to integrate these into the RNN sequence model.

73

Outline

• Background

• Completed Work

• Proposed Work

    • Better Models

    • Better Events

    • Discourse Relations

    • Bonus

        • Coreference

        • Question-Answering

74

Scripts for Coreference

• Voltaire, pretending to work in Paris as an assistant to a notary, spent much of his time writing poetry. When his father found out, he sent Voltaire to study law. Nevertheless, he continued to write…

• Unifying “he” and “Voltaire” could be done with script knowledge.

• Script information can improve coreference.

75

Scripts for Coreference

• There are a number of conceivable ways to incorporate scripts into ML-based coreference engines:

• Add the script probability, assuming a given coreference decision is made, as a feature to the coref system.

• Add the probability of the sequence of all events involving an entity as a feature.
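As a rough illustration of the first idea, the sketch below scores each candidate antecedent by how probable the pronoun's event is as a continuation of that candidate's event chain. Everything here is an assumption made for the example: `toy_seq_logprob` is a hand-written bigram table standing in for a trained script model, the verb chains are a simplified hand-made reading of the Voltaire passage, and `continuation_logprob` is a hypothetical helper.

```python
import math
from typing import Callable, Dict, List, Sequence

# Made-up bigram probabilities; a trained script model would replace this.
_BIGRAM = {("write", "continue"): 0.6}
_DEFAULT = 0.05


def toy_seq_logprob(verbs: Sequence[str]) -> float:
    """Stand-in script model: log-probability of a verb sequence."""
    return sum(math.log(_BIGRAM.get(pair, _DEFAULT))
               for pair in zip(verbs, verbs[1:]))


def continuation_logprob(chain: Sequence[str], new_verb: str,
                         seq_logprob: Callable[[Sequence[str]], float]) -> float:
    """log P(new event | chain), so longer chains are not penalized."""
    chain = list(chain)
    return seq_logprob(chain + [new_verb]) - seq_logprob(chain)


def script_features(chains: Dict[str, List[str]], pronoun_verb: str) -> Dict[str, float]:
    """One feature value per candidate antecedent."""
    return {name: continuation_logprob(chain, pronoun_verb, toy_seq_logprob)
            for name, chain in chains.items()}


chains = {"voltaire": ["pretend", "spend", "write"],
          "father": ["find_out", "send"]}
features = script_features(chains, "continue")
best = max(features, key=features.get)
```

In a real system these values would be one feature among many for the coreference classifier, not a decision rule on their own.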

76

Scripts for Question-Answering

• Scripts are intuitively useful for Question-Answering Systems.

• Add confident script inferences to Knowledge Base about document.

• Would allow inferences about implicit events.
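A minimal sketch of adding confident inferences to a document's knowledge base might look like the following. The triple format (verb, subject, other argument), the `augment_kb` function, the 0.9 threshold, and the probabilities are all invented for illustration; the example events loosely follow the Octavian passage from the motivation slides.

```python
from typing import Iterable, Set, Tuple

Event = Tuple[str, str, str]  # simplified (verb, subject, other argument) triple


def augment_kb(kb: Set[Event],
               inferences: Iterable[Tuple[Event, float]],
               threshold: float = 0.9) -> Set[Event]:
    """Return the KB plus every inferred event whose probability
    meets the (assumed) confidence threshold."""
    return kb | {event for event, prob in inferences if prob >= threshold}


# Events stated explicitly in the document.
kb = {("invade", "octavian", "egypt"),
      ("desert", "armies", "octavian")}

# Hypothetical script-model inferences with made-up probabilities.
inferred = [(("defeat", "octavian", "antony"), 0.93),
            (("marry", "octavian", "antony"), 0.02)]

augmented = augment_kb(kb, inferred)
```

With the implicit "defeat" event in the augmented KB, a question such as "Did Octavian defeat Antony?" could be answered by lookup, even though the text never states it.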

77

Conclusion

• LSTMs do much better than Markov-like statistical script systems.

• We propose:

• Using better neural models.

• Better events.

• More discourse-awareness.

• Trying to improve coref.

• Improving question-answering.

78

Thanks!

79