Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin I. Spitkovsky Grammar Induction and Parsing with Dependency-and-Boundary Models V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 1 / 60


Ph.D. Thesis Oral Defense

Stanford Artificial Intelligence Laboratory (SAIL)

Valentin I. Spitkovsky

Grammar Induction and Parsing with Dependency-and-Boundary Models

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 1 / 60


The Problem

Problem: Unsupervised Parsing (and Grammar Induction)

NN      NNS      VBD  IN  NN         ♦
Factory payrolls fell in  September  .

Input: Raw Text (Sentences, Tokens and POS-tags)

... By most measures, the nation’s industrial sector is now

growing very slowly — if at all. Factory payrolls fell in

September. So did the Federal Reserve ...

Output: Syntactic Structures (and a Probabilistic Grammar)


The Problem

Motivation: Unsupervised (Dependency) Parsing

Insert your favorite reason(s) why you’d like to parse anything in the first place...

... adjust for any data without reference treebanks: i.e., understudied languages or genres (e.g., legal).

Potential applications:
◮ machine translation: word alignment, phrase extraction, reordering;
◮ web search: retrieval, query refinement;
◮ question answering, speech recognition, etc.


The Problem

Evaluation: Directed Dependency Accuracy

Scoring example:

NN      NNS      VBD  IN  NN         ♦
Factory payrolls fell in  September  .

Directed Score: 3/5 = 60% (left/right-branching baselines: 2/5 = 40%);

Undirected Score: 4/5 = 80% (left/right-branching baselines: 4/5 = 80%).
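The two scores can be sketched in a few lines of Python. The dependency arcs of the slide's figure did not survive extraction, so the gold tree and the guessed tree below are assumptions chosen for illustration; each parse is a list giving the head position of every word (0 stands for the root).

```python
def directed_accuracy(pred, gold):
    """Fraction of words whose predicted head equals the gold head."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def undirected_accuracy(pred, gold):
    """Fraction of predicted arcs that match a gold arc, ignoring direction."""
    gold_edges = {frozenset((h, d)) for d, h in enumerate(gold, start=1)}
    hits = sum(frozenset((h, d)) in gold_edges
               for d, h in enumerate(pred, start=1))
    return hits / len(gold)


# Words: 1=Factory 2=payrolls 3=fell 4=in 5=September (0 = root).
GOLD = [2, 3, 0, 3, 4]        # assumed reference tree
GUESS = [2, 3, 0, 5, 3]       # hypothetical induced tree
HEAD_PREV = [0, 1, 2, 3, 4]   # chain: each word headed by the previous one
HEAD_NEXT = [2, 3, 4, 5, 0]   # chain: each word headed by the next one

for name, parse in [("guess", GUESS), ("prev", HEAD_PREV), ("next", HEAD_NEXT)]:
    print(name, directed_accuracy(parse, GOLD), undirected_accuracy(parse, GOLD))
```

Under these assumed trees the guess scores 3/5 directed and 4/5 undirected, and both chain-shaped baselines score 2/5 directed and 4/5 undirected, which is why the undirected measure barely separates a chain from a real parse on short sentences.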


Prior Art

Prior Art: A Brief History

1992 — word classes (Carroll and Charniak)

1998 — greedy linkage via mutual information (Yuret)

2001 — iterative re-estimation with EM (Paskin)

2004 — right-branching baseline; valence (DMV) (Klein and Manning)

2004 — annealing techniques (Smith and Eisner)

2005 — contrastive estimation (Smith and Eisner)

2006 — structural biasing (Smith and Eisner)

2007 — common cover link representation (Seginer)

2008 — logistic normal priors (Cohen et al.)

2009 — lexicalization and smoothing (Headden et al.)

2009 — soft parameter tying (Cohen and Smith)


Prior Art

Prior Art: Dependency Model with Valence

a head-outward model, with word classes and valence/adjacency (Klein and Manning, 2004)

[Figure: a head h (e.g.: verb) with arguments a1 (noun phrase, etc...) and a2, followed by STOP]

$$
P(t_h) = \prod_{dir \in \{L,R\}} P_{STOP}\bigl(c_h, dir, \overbrace{\mathbb{1}_{n=0}}^{adj}\bigr) \prod_{i=1}^{n} P(t_{a_i}) \, P_{ATTACH}(c_h, dir, c_{a_i}) \, \bigl(1 - P_{STOP}\bigl(c_h, dir, \overbrace{\mathbb{1}_{i=1}}^{adj}\bigr)\bigr), \qquad n = |args(h, dir)|
$$
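The product above can be read as a recursion over subtrees. The following is a minimal sketch of that reading, not the thesis implementation: the tree encoding (class, left arguments, right arguments, each list ordered head-outward) and all parameter values are assumptions made for illustration.

```python
def subtree_prob(tree, p_stop, p_attach):
    """P(t_h) for tree = (head_class, left_args, right_args)."""
    c_h, left, right = tree
    total = 1.0
    for direction, args in (("L", left), ("R", right)):
        n = len(args)
        # Final STOP decision; it is "adjacent" only if no argument was generated.
        total *= p_stop[(c_h, direction, n == 0)]
        for i, arg in enumerate(args, start=1):
            # Decide not to stop (adjacent only before the first argument),
            # pick the argument's class, then generate the argument's subtree.
            total *= 1.0 - p_stop[(c_h, direction, i == 1)]
            total *= p_attach[(c_h, direction, arg[0])]
            total *= subtree_prob(arg, p_stop, p_attach)
    return total


# Hypothetical toy parameters over two word classes, V and N.
P_STOP = {
    ("V", "L", True): 0.5, ("V", "L", False): 0.9,
    ("V", "R", True): 0.5, ("V", "R", False): 0.9,
    ("N", "L", True): 0.8, ("N", "L", False): 0.9,
    ("N", "R", True): 0.8, ("N", "R", False): 0.9,
}
P_ATTACH = {
    ("V", "L", "N"): 0.7, ("V", "L", "V"): 0.3,
    ("V", "R", "N"): 0.7, ("V", "R", "V"): 0.3,
    ("N", "L", "N"): 0.5, ("N", "L", "V"): 0.5,
    ("N", "R", "N"): 0.5, ("N", "R", "V"): 0.5,
}

noun = ("N", [], [])
verb_with_subject = ("V", [noun], [])
print(subtree_prob(verb_with_subject, P_STOP, P_ATTACH))
```

The adjacency flag is the only memory the model keeps: the stop probability depends on whether the head has already taken an argument in that direction, not on how many.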


Prior Art

Prior Art: Unsupervised Learning Engine

parameter fitting via expectation-maximization (EM)

[Overlay: MAGIC]

◮ by means of inside-outside re-estimation (Baker, 1979)

[Figure: inside-outside schematic (Manning and Schütze, 1999): root N^1 over the sentence w_1 ... w_m, with nonterminal N^j spanning w_p ... w_q; outside probability α covers w_1 ... w_{p-1} and w_{q+1} ... w_m, inside probability β covers w_p ... w_q]

[Overlay: BLACK BOX]


Prior Art

Prior Art: The Standard Corpus — WSJ10

[Figure: corpus size against WSJk for k = 5 to 45, with axes Sentences (1,000s), 5 to 45, and Tokens (1,000s), 100 to 900]


Outline

Outline

Introduction to the Grammar Induction Problem
◮ The Inputs and Outputs (text → dependency parses)
◮ A Probabilistic Model (DMV)
◮ Parameter Fitting (EM)

Part I: Non-Convex Optimization

Part II: Parsing Constraints

Part III: Generative Models

State-of-the-Art Integrated Solution (in submission)


Outline

Optimization

Viterbi Training

Baby Steps

Lateen EM


Issues...

Issue I: Why so little data?

[Figure: Directed Dependency Accuracy (%) on WSJ40 (10 to 70) against WSJk (k = 5 to 40), with curves labeled Uninformed and Oracle]


Issues...

Issue II: Non-convex objectives...

maximizing the probability of data (sentence strings):

$$
\theta_{UNS} = \arg\max_{\theta} \sum_{s} \log \underbrace{\sum_{t \in T(s)} P_{\theta}(t)}_{P_{\theta}(s)}
$$

supervised objective would be convex:

$$
\theta_{SUP} = \arg\max_{\theta} \sum_{s} \log P_{\theta}(t^{*}(s))
$$

could optimize likeliest parse trees (also non-convex):

$$
\theta_{VIT} = \arg\max_{\theta} \sum_{s} \max_{t \in T(s)} \log P_{\theta}(t)
$$
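To make the difference between the three objectives concrete, here is a small sketch that evaluates them at one fixed parameter setting. It assumes each sentence's candidate trees can be enumerated and are given simply as a list of probabilities under the current parameters; the numbers and the gold indices are made up.

```python
import math


def soft_objective(corpus):
    """Sum over sentences of log P(s), where P(s) sums over all trees."""
    return sum(math.log(sum(tree_probs)) for tree_probs in corpus)


def viterbi_objective(corpus):
    """Sum over sentences of the log-probability of the likeliest tree."""
    return sum(max(math.log(p) for p in tree_probs) for tree_probs in corpus)


def supervised_objective(corpus, gold):
    """Sum over sentences of the log-probability of the reference tree."""
    return sum(math.log(tree_probs[g]) for tree_probs, g in zip(corpus, gold))


# Hypothetical tree probabilities for two sentences, and gold tree indices.
corpus = [[0.1, 0.3], [0.2, 0.05, 0.05]]
gold = [0, 0]

print(soft_objective(corpus), viterbi_objective(corpus),
      supervised_objective(corpus, gold))
```

At any fixed parameters the supervised value cannot exceed the Viterbi value, which in turn cannot exceed the soft value, since one tree's probability is at most the maximum, and the maximum is at most the sum.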


Issues... Classic EM

Classic EM:

[Figure: Directed Dependency Accuracy (%) on WSJ40 (10 to 70) against WSJk (k = 5 to 40) for Classic EM, with curves labeled Uninformed and Oracle]


Issues... Viterbi EM

Viterbi EM:

[Figure: Directed Dependency Accuracy (%) on WSJ40 (10 to 70) against WSJk (k = 5 to 40) for Viterbi EM, with curves labeled Oracle and Uninformed]


Issues... Viterbi EM

Hard vs. Soft EM:

Viterbi EM: zoom in on likeliest tree
◮ does not degrade with input length
◮ better retains supervised solutions
◮ actually learns from scratch

(see also Cohen and Smith (2010))

Classic EM: “focus across the board”
◮ hard to see the trees for the forest...

... but known to work with shorter inputs!
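The contrast between the two E-steps can be sketched for a single sentence. This illustration assumes the candidate trees are enumerable and scored by the current model; the function names are hypothetical and not taken from the thesis.

```python
def soft_weights(tree_probs):
    """Classic EM: every tree counts in proportion to its posterior."""
    z = sum(tree_probs)
    return [p / z for p in tree_probs]


def hard_weights(tree_probs):
    """Viterbi EM: only the single likeliest tree counts."""
    best = max(range(len(tree_probs)), key=tree_probs.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(tree_probs))]


# Hypothetical probabilities of three candidate trees for one sentence.
probs = [0.10, 0.30, 0.05]
print(soft_weights(probs))
print(hard_weights(probs))
```

Because the number of candidate trees grows quickly with sentence length, the soft weights spread over more and more alternatives on long inputs, while the hard weights stay concentrated on one tree.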


Baby Steps

Idea I: Baby Steps ... as Non-convex Optimization

start with an easy (convex) case

slowly extend it to the fully complex target task

take tiny (cautious) steps in the problem space

... try not to stray far from relevant neighborhoods in the solution space

base case: sentences of length one (trivial — no init)

incremental step: smooth WSJk; re-init WSJ(k + 1)
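The base case and incremental step above amount to a simple curriculum loop. The sketch below assumes that WSJk means the sentences of at most k tokens, and it takes the trainer and the smoother as caller-supplied functions; `train` and `smooth` are hypothetical placeholders, not routines from the thesis.

```python
def baby_steps(corpus, max_len, train, smooth):
    """Train on ever longer sentences, seeding each stage with the previous model."""
    model = None  # base case needs no initializer
    for k in range(1, max_len + 1):
        subset = [s for s in corpus if len(s) <= k]  # analogue of WSJk
        if model is not None:
            model = smooth(model)  # smooth the WSJ(k-1) solution before reuse
        model = train(subset, model)  # re-initialize training on WSJk
    return model


# Stub trainer that only records how many sentences each stage saw.
def record_stage(subset, model):
    return (model or []) + [len(subset)]


toy_corpus = [["Atone"], ["It", "is"], ["But", "many", "have"], ["Stop"]]
print(baby_steps(toy_corpus, 3, record_stage, lambda m: m))
```

Passing the trainer in as a function keeps the schedule separate from the learner, so the same loop could drive classic EM or Viterbi EM.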


Baby Steps

Idea I: Baby Steps ... as Graduated Learning

WSJ1: Atone (verbs!)

WSJ2: It is. / Darkness fell. (pronouns!)

WSJ3: Become a Lobbyist / But many have. / They didn’t. (determiners!)

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 17 / 60

Page 72: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Baby Steps

Idea I: Baby Steps ... and Related Notions

Psychology / Cognitive Science:

shaping (Skinner, 1938)

less is more (Kail, 1984; Newport, 1988; 1990)

starting small (Elman, 1993)

◮ scaffold on model complexity [restrict memory]
◮ scaffold on data complexity [restrict input]

NLP / AI:

stepping stones (Brown et al., 1993)

coarse-to-fine (Charniak and Johnson, 2005)

curriculum learning (Bengio et al., 2009)

Math / OR:

continuation methods (Allgower and Georg, 1990)

deterministic annealing (Rose, 1998)

successive approximations!

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 18 / 60

Page 73: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Baby Steps

Aside I: Speaking of successive approximations...

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 19 / 60

Page 77: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Baby Steps

Idea I: Baby Steps ... Observations & Results!

[Figure: Directed Dependency Accuracy (%) on WSJ40 (y-axis, 10 to 70) against WSJk (x-axis, k from 5 to 40). Curves: Uninformed, Oracle, Baby Steps. Annotations: Less is More, K&M∗.]

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 20 / 60

Page 83: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Leapfrog

Idea II: Less is More & Leapfrog ... a Hack!

[Figure: Directed Dependency Accuracy (%) on WSJk (y-axis, 20 to 90) against WSJk (x-axis, k from 5 to 40). Curves: Oracle, Uninformed, Baby Steps, K&M∗, Leapfrog.]

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 21 / 60

Page 84: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Leapfrog

Aside II: Also, to be fair...

[Figure: Undirected Dependency Accuracy (%) on WSJk (y-axis, 20 to 90) against WSJk (x-axis, k from 5 to 40). Curve: Uninformed.]
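For readers comparing the directed and undirected plots, the sketch below shows one common way to compute both scores for a single sentence. The head-index encoding (0 for the root) and the convention that an undirected match ignores which endpoint is the head are assumptions made here for illustration; the slides do not spell out the exact scoring rules.

```python
def dependency_accuracy(gold, pred):
    """Return (directed, undirected) accuracy for one sentence.

    gold, pred: lists of head indices, one per token; tokens are
    numbered from 1 and 0 denotes the root (an assumed encoding).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    n = len(gold)
    # Gold edges with direction erased.
    gold_edges = {frozenset((i, h)) for i, h in enumerate(gold, 1)}
    directed = sum(g == p for g, p in zip(gold, pred))
    undirected = sum(frozenset((i, p)) in gold_edges
                     for i, p in enumerate(pred, 1))
    return directed / n, undirected / n

# "Darkness fell": gold attaches Darkness to fell, with fell as root;
# the prediction reverses the arc and makes Darkness the root.
scores = dependency_accuracy(gold=[2, 0], pred=[0, 1])
```

In this example the reversed arc earns no directed credit but still counts as an undirected match, which is why undirected scores are never lower than directed ones.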

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 22 / 60

Page 87: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Lateen EM

Lateen EM: Intuition

Use both objectives (a primary and a secondary).

As a captain can’t count on favorable winds, so an unsupervised learner can’t rely on co-operative gradients.

Lateen strategies de-emphasize fixed points, e.g., by tacking around local attractors, in a zig-zag fashion.

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 23 / 60

Page 90: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Lateen EM

Lateen EM: Simple Algorithm

Alternate ordinary soft and hard EM algorithms: switching when stuck helps escape local optima.

E.g., Italian grammar induction improves from 41.8% to 56.2% after three lateen alternations:

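[Figure: bits per token (bpt; y-axis, 3.0 to 4.5) against iteration (x-axis, 50 to 300), with two curves. Values marked on the plot, in extraction order: 3.39, 3.26, 3.42, 3.19, 3.33, 3.23, 3.39, 3.18, 3.29, 3.21, 3.39, 3.18, 3.29, 3.22.]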

Pumping action:
– hard EM pushes down the top curve (primary objective);
– soft EM pushes down the bottom curve (the secondary objective), often at the expense of the primary.
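The alternation can be illustrated on a toy problem. The sketch below is not the grammar-induction setup from the talk: it swaps in a two-component, unit-variance, equal-weight 1-D Gaussian mixture, and it treats "stuck" as "the means moved by less than `tol`". All function names and the data are hypothetical.

```python
import math

def soft_step(xs, mus):
    """One soft-EM step: re-estimate each mean from fractional responsibilities."""
    resp = []
    for x in xs:
        w = [math.exp(-0.5 * (x - m) ** 2) for m in mus]
        z = sum(w)
        resp.append([wi / z for wi in w])
    return [sum(r[j] * x for r, x in zip(resp, xs)) / sum(r[j] for r in resp)
            for j in range(len(mus))]

def hard_step(xs, mus):
    """One hard (Viterbi) EM step: assign each point to its best mean, then re-estimate."""
    groups = [[] for _ in mus]
    for x in xs:
        best = min(range(len(mus)), key=lambda j: (x - mus[j]) ** 2)
        groups[best].append(x)
    return [sum(g) / len(g) if g else m for g, m in zip(groups, mus)]

def run_until_stuck(step, xs, mus, tol=1e-6, max_iter=1000):
    """Iterate one kind of EM until the means stop moving (assumed notion of 'stuck')."""
    for _ in range(max_iter):
        new = step(xs, mus)
        if max(abs(a - b) for a, b in zip(new, mus)) < tol:
            return new
        mus = new
    return mus

def lateen_em(xs, mus, alternations=3):
    """Alternate soft and hard EM, switching whenever the current one is stuck."""
    steps = [soft_step, hard_step]
    for i in range(2 * alternations):
        mus = run_until_stuck(steps[i % 2], xs, mus)
    return mus

xs = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
mus = lateen_em(xs, [-0.5, 0.5])
```

On data this well separated both variants agree, so the example only shows the control flow; the benefit reported on the slide comes from cases where one objective's fixed point is not a fixed point of the other.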

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 24 / 60

Page 93: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Lateen EM

Lateen EM: Early-Stopping

Use one objective to validate moves proposed by the other: stop if the secondary objective gets worse.

30% faster, on average, than either standard EM.
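A minimal sketch of this stopping rule, assuming the secondary objective is a loss (lower is better, like cross-entropy in bpt) and assuming the rejected move is discarded rather than kept. The `step` and `secondary` callables in the example are toy numeric stand-ins, not EM updates.

```python
def early_stopping_em(step, secondary, theta, max_iter=100):
    """Apply `step` (the primary optimizer) repeatedly, but stop as soon as
    a proposed move makes the `secondary` loss worse.

    Returns the last accepted parameters and the number of accepted moves.
    """
    best = secondary(theta)
    for it in range(max_iter):
        proposal = step(theta)
        score = secondary(proposal)
        if score > best:          # secondary objective got worse: stop here
            return theta, it
        theta, best = proposal, score
    return theta, max_iter

# Toy stand-ins (hypothetical): the primary step contracts toward 2.0,
# while the secondary loss is minimized at 2.5.
theta, accepted = early_stopping_em(
    step=lambda x: x / 2 + 1,
    secondary=lambda x: (x - 2.5) ** 2,
    theta=6.0,
)
```

Here the iterates are 4.0, 3.0 and 2.5; the next proposal, 2.25, raises the secondary loss, so the loop stops after three accepted moves instead of creeping toward the primary fixed point at 2.0.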

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 25 / 60

Page 106: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Lateen EM

Optimization: Summary

Take guesswork out of tuning EM:
◮ Lateen EM ties termination to a sign change;
◮ Viterbi EM and Baby Steps don’t require initializers.

Exploit multiple views and iterate (Blum and Mitchell, 1998):
◮ Lateen EM optimizes for Viterbi parse trees and forests;
◮ Baby Steps focuses on simpler sentences in the data.

Capitalize on EM’s advantages:
◮ guarantee to not harm a primary likelihood;
◮ begin with large steps in a parameter space.

Ameliorate EM’s disadvantages:
◮ avoid taking disproportionately many (and ever-smaller) steps to approach a likelihood’s fixed point;
◮ escape by changing objectives and mixing solutions!

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 26 / 60

Page 107: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Lateen EM

Constraints

Web Markup

Punctuation

Capitalization

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 27 / 60

Page 116: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Overview

Constraints: Supervised and Unsupervised

compact summaries of high-level insights into a domain
– can significantly reduce the search space
– often easier to enforce than to model

relevant to unsupervised learning (less rope to hang self)
– in general, steer at the “right” regularities in data
– linguistic structure underdetermined by raw text

partial bracketings (Pereira and Schabes, 1992)

synchronous grammars (Alshawi and Douglas, 2000)

linear-time parsing, skewness of trees, etc. (Seginer, 2007)

sparse posterior regularization (Ganchev et al., 2009)

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 28 / 60

Page 119: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Linguistic Analysis

Syntax of Markup: Common Constituents

..., but [S [NP the <a>Toronto Star] [VP reports [NP this] [PP in the softest possible way]</a>, [S stating ...]]]

S → NP VP
  → DT NNP NNP VBZ NP PP S

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 29 / 60

Page 122: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Linguistic Analysis

Syntax of Markup: Constituent Productions

Production                   %
NP → NNP NNP               9.6
NP → NNP                   4.6
NP → NP PP                 3.4
NP → NNP NNP NNP           2.4
NP → DT NNP NNP            2.1
NP → NN                    1.8
NP → DT NNP NNP NNP        1.7
NP → DT NN                 1.7
NP → DT NNP NNP            1.6
S → NP VP                  1.4
NP → DT NNP NNP NNP        1.2
NP → DT JJ NN              1.1
NP → NNS                   1.0
NP → JJ NN                 0.8
NP → NP NP                 0.8
Total                     35.3

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 30 / 60

Page 125: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Linguistic Analysis

Syntax of Markup: Common Dependencies

..., but [S [NP the <a>Toronto Star] [VP reports [NP this] [PP in the softest possible way]</a>, [S stating ...]]]

DT NNP NNP VBZ DT IN DT JJS JJ NN

DT NNP VBZ

“the <a>Star reports</a>”

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 31 / 60

Page 128: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Linguistic Analysis

Syntax of Markup: Head-Outward Spawns

%NNP 24.4NN 8.1

DT NNP 6.1DT NN 5.9NNS 4.5NNPS 1.4VBG 1.3

NNP NNP NN 1.2VBD 1.0IN 1.0VBN 1.0

DT JJ NN 0.9VBZ 0.9

POS NNP 0.9JJ 0.8

59.4

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 32 / 60

Page 129: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 33 / 60

Page 130: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

yields a suite of accurate parsing constraints

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 33 / 60

Page 131: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

yields a suite of accurate parsing constraints

improves grammar induction over web data

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 33 / 60

Page 132: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

yields a suite of accurate parsing constraints

improves grammar induction over web data, but...

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 33 / 60

Page 133: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

yields a suite of accurate parsing constraints

improves grammar induction over web data, but...◮ works better with more natural language-resources!

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 33 / 60

Page 134: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

yields a suite of accurate parsing constraints

improves grammar induction over web data, but...◮ works better with more natural language-resources!

missing structural cues:

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 33 / 60

Page 135: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

yields a suite of accurate parsing constraints

improves grammar induction over web data, but...◮ works better with more natural language-resources!

missing structural cues:— e.g., punctuation (will make heavy use)

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 33 / 60

Page 136: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

yields a suite of accurate parsing constraints

improves grammar induction over web data, but...◮ works better with more natural language-resources!

missing structural cues:— e.g., punctuation (will make heavy use)

and capitalization (will not discuss today)

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 33 / 60

Page 137: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Transition Formatting

Formatting:

strong connection between markup and syntax

yields a suite of accurate parsing constraints

improves grammar induction over web data, but...

◮ works better with more natural language-resources!

missing structural cues, e.g., punctuation (will make heavy use) and capitalization (will not discuss today)

raw word streams often difficult even for humans, e.g., transcribed utterances (Kim and Woodland, 2002)

Example Raw Text

Example: Raw Word Stream

ALTHOUGH IT PROBABLY HAS REDUCED THE LEVEL OF EXPENDITURES FOR SOME PURCHASERS UTILIZATION MANAGEMENT LIKE MOST OTHER COST CONTAINMENT STRATEGIES DOESN’T APPEAR TO HAVE ALTERED THE LONG-TERM RATE OF INCREASE IN HEALTH-CARE COSTS THE INSTITUTE OF MEDICINE AN AFFILIATE OF THE NATIONAL ACADEMY OF SCIENCES CONCLUDED AFTER A TWO-YEAR STUDY

Example Formatted Text

Example:

[SBAR Although it probably has reduced the level of expenditures for some purchasers], [NP utilization management] -- [PP like most other cost containment strategies] -- [VP doesn’t appear to have altered the long-term rate of increase in health-care costs], [NP the Institute of Medicine], [NP an affiliate of the National Academy of Sciences], [VP concluded after a two-year study].
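The bracketed pieces above are the stretches of words that fall between punctuation marks. A minimal sketch of how such inter-punctuation fragments could be cut out of a tokenized sentence follows. The function name, the punctuation set, and the tokenization are assumptions made for illustration and are not taken from the thesis.

```python
# Hypothetical punctuation inventory; a real system would use its corpus's own token set.
PUNCT = {",", ".", "--", ";", ":", "!", "?", "``", "''", "(", ")"}


def inter_punctuation_fragments(tokens, punct=PUNCT):
    """Split a token list into maximal runs of non-punctuation tokens."""
    fragments, current = [], []
    for tok in tokens:
        if tok in punct:
            # A punctuation mark closes the fragment collected so far (if any).
            if current:
                fragments.append(current)
                current = []
        else:
            current.append(tok)
    if current:  # sentence did not end in punctuation
        fragments.append(current)
    return fragments


# Part of the example sentence, tokenized by hand for this sketch.
tokens = ["utilization", "management", "--", "like", "most", "other", "cost",
          "containment", "strategies", "--", "does", "n't", "appear", "."]
fragments = inter_punctuation_fragments(tokens)
```

Each returned fragment is a candidate constituent; the punctuation tokens themselves are dropped.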

Example Strong Assumption

Intuition:

strong constraint: (head ← head) in training

word head , head word word ,

head word word word word word word word .

Other countries , including West Germany ,

may have a hard time justifying continued membership .

Example Weak Assumption

Intuition:

weak constraint: (head ← external word) in inference

word word head word word word ,

head word word word word word word word .

IFI also has nonvoting preferred shares ,

which are quoted on the Milan stock exchange .
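One way to read the two constraints is as checks on a dependency tree. The sketch below assumes that every fragment must have exactly one word attached outside the fragment (its head), that the weak constraint asks for nothing more, and that the strong constraint additionally requires that outside parent to be the head of another fragment (or the root). This is an interpretation of the slide notation, not the thesis's definition. The function names and the example tree are hypothetical.

```python
def fragment_heads(parents, fragment):
    """Words of `fragment` whose parent lies outside it (None marks the root)."""
    inside = set(fragment)
    return [i for i in fragment if parents[i] not in inside]


def satisfies_weak(parents, fragments):
    """Assumed weak reading: each fragment has exactly one externally attached word."""
    return all(len(fragment_heads(parents, f)) == 1 for f in fragments)


def satisfies_strong(parents, fragments):
    """Assumed strong reading: fragment heads attach only to other fragment heads."""
    if not satisfies_weak(parents, fragments):
        return False
    heads = {fragment_heads(parents, f)[0] for f in fragments}
    return all(parents[h] is None or parents[h] in heads for h in heads)


# "Other countries , including West Germany , may have a hard time
#  justifying continued membership ." with punctuation removed.
# Indices: 0 Other, 1 countries, 2 including, 3 West, 4 Germany, 5 may, 6 have,
#          7 a, 8 hard, 9 time, 10 justifying, 11 continued, 12 membership
fragments = [[0, 1], [2, 3, 4], [5, 6, 7, 8, 9, 10, 11, 12]]
# Hypothetical parse, written by hand for illustration (not a gold tree).
parents = [1, 5, 1, 4, 2, None, 5, 9, 9, 6, 9, 12, 10]

strong_ok = satisfies_strong(parents, fragments)
```

In this hand-built tree the fragment heads are "countries", "including", and "may", matching the head positions in the template above. Re-attaching "including" to "Other" would keep the weak check satisfied and break the strong one.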

Linguistic Analysis Constituents

Linguistic Analysis:

punctuation and syntax are related (Nunberg, 1990; Briscoe, 1994; Jones, 1994; Doran, 1998, inter alia)

49.4% of inter-punctuation fragments are constituents

many fragments derive precisely themselves:

POS tags     %
IN         9.6
NN         7.2
NNP        6.3
CD         3.8
VBD        3.2
VBZ        3.0
RB         2.8
VBG        2.2
VBP        1.9
NNS        1.8
WDT        1.6
MD         1.1
VBN        1.1
IN VBD     1.0
JJ         0.7

52.8

Linguistic Analysis Strong Dependencies

Training:

strong constraint, e.g.,

... arrests followed a “ Snake Day ” at Utrecht ...

— already 74.0% agreement with head-percolated trees.

Linguistic Analysis Weak Dependencies

Inference:

weak constraint, e.g.,

Maryland Club also distributes tea , which ...

— now 92.9% agreement with head-percolated trees!

Linguistic Analysis Weak Dependencies

Modeling

Context-Sensitive Unsupervised Tags

Dependency-and-Boundary Models

Reduced Models of Grammar

Gold Tags Unlexicalized Tokens

Example: Actually, state-of-the-art uses gold tags!

IN PRP RB VBZ VBN DT NN IN NNS IN DT NNS NN NN IN RBS JJ NN NN NNS VBZ RB VB TO VB VBN DT JJ NN IN NN IN JJ NNS DT NNP IN NNP DT NN IN DT NNP NNP IN NNPS VBD IN DT JJ NN

Gold Tags Unlexicalized Tokens

Word Categories: Two benefits of tag-sets...

Grouping: pooling statistics of words that play similar syntactic roles improves generalization (reduces sparsity):

◮ indeed, working directly with words does not do well.

Disambiguation: for words that take on multiple parts of speech, knowing gold tags limits the parsing search space:

◮ reducing manual tags to monosemous clusterings lowers performance below that of the unsupervised categories constructed by Finkel and Manning (2009) / Clark (2003).

Can improve with context-sensitive unsupervised clusters!

1. start with a hard assignment; (standard word clustering)

2. inject context-colored noise; (get out of the local optimum)

3. Viterbi-train a bitag HMM. (the unTagger)
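The "monosemous" reduction of manual tags mentioned above can be illustrated with a small sketch. It assumes one simple reading, in which every word type is assigned its single most frequent gold tag; the thesis may have used a different reduction. The function names and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def monosemous_tags(tagged_corpus):
    """Map each word type to its most frequent tag (assumed reduction)."""
    counts = defaultdict(Counter)
    for sentence in tagged_corpus:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


def relabel(tagged_corpus, mapping):
    """Replace every token's tag with the one tag assigned to its word type."""
    return [[(word, mapping[word]) for word, _ in sentence]
            for sentence in tagged_corpus]


# Toy corpus: "run" appears twice as a noun and once as a verb.
corpus = [
    [("the", "DT"), ("run", "NN")],
    [("they", "PRP"), ("run", "VBP")],
    [("a", "DT"), ("run", "NN")],
]
mapping = monosemous_tags(corpus)
flattened = relabel(corpus, mapping)
```

After relabeling, the verbal use of "run" carries the noun tag as well, which is the loss of disambiguation that context-sensitive clusters are meant to recover.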

Gold Tags Unlexicalized Tokens

Word Categories: Breaking the barrier...

Swapped out gold tags for unsupervised word categories:

System Description                                   Accuracy
“punctuation” with gold tags                         58.4
“punctuation” with monosemous induced tags           58.2 (−0.2)
“punctuation” with context-sensitive induced tags    59.1 (+0.7)

◮ only a small drop from switching to (monosemous) unsupervised clusters (previous systems lost ∼5 pts);

◮ first state-of-the-art system to improve with (context-sensitive) unsupervised clusters!

Dependency-and-Boundary Models

Dependency-and-Boundary Models (DBMs):

Use boundary cues in head-driven dependency grammars.

E.g., induce structure by working inwards from edges:

 DT  NN     VBZ IN  DT  NN
 |   |      |   |   |   |
[The check] is  in [the mail].
Subject NP         Object NP

◮ learn from left fringe (determiner DT) to parse object NP

◮ based on right fringe (noun NN), correctly parse subject NP

◮ between them, glean make-up of larger phrases (e.g., VP)
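The notion of a fringe can be made concrete with a short sketch that finds, for every word, the leftmost and rightmost word of the phrase it heads, and reads off the tags at those edges. The dependency tree below is a hand-written assumption for "The check is in the mail" (with "is" as root), and the function names are hypothetical; this illustrates fringe words only and is not an implementation of the DBMs themselves.

```python
def subtree_spans(parents):
    """For each word, the (leftmost, rightmost) index covered by its subtree."""
    n = len(parents)
    left, right = list(range(n)), list(range(n))
    for i in range(n):
        # Walk from word i up to the root, widening every ancestor's span.
        j = parents[i]
        while j is not None:
            left[j] = min(left[j], i)
            right[j] = max(right[j], i)
            j = parents[j]
    return list(zip(left, right))


def fringe_tags(tags, parents):
    """Pair each word with the tags of its subtree's left and right edge words."""
    return [(tags[lo], tags[hi]) for lo, hi in subtree_spans(parents)]


words = ["The", "check", "is", "in", "the", "mail"]
tags = ["DT", "NN", "VBZ", "IN", "DT", "NN"]
# Assumed heads: The->check, check->is, is=root, in->is, the->mail, mail->in
parents = [1, 2, None, 2, 5, 3]

spans = subtree_spans(parents)
fringes = fringe_tags(tags, parents)
```

Under this tree, the phrases headed by "check" and by "mail" share the same fringes (DT on the left, NN on the right), which is the kind of regularity the bullets above describe.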

Dependency-and-Boundary Models

Dependency-and-Boundary Models: Highlights

DBM-1: Use words at the fringes.

◮ truly head-outward model (Alshawi, 1996)

◮ conditions on what can more often be seen!

DBM-2: Fragments differ from complete sentences.

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 46 / 60

Page 190: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Dependency-and-Boundary Models

Dependency-and-Boundary Models: Highlights

DBM-1: Use words at the fringes.

◮ truly head-outward model (Alshawi, 1996)

◮ conditions on what can more often be seen!

DBM-2: Fragments differ from complete sentences.

◮ incomplete fragments are uncharacteristically short

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 46 / 60

Page 191: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Dependency-and-Boundary Models

Dependency-and-Boundary Models: Highlights

DBM-1: Use words at the fringes.

◮ truly head-outward model (Alshawi, 1996)

◮ conditions on what can more often be seen!

DBM-2: Fragments differ from complete sentences.

◮ incomplete fragments are uncharacteristically short◮ roots of fragments are generally not verbs or modals

V.I. Spitkovsky (Stanford & Google) Ph.D. Thesis Oral Defense Gates #415 (2013-08-14) 46 / 60

Page 192: Ph.D. Thesis Oral Defense - Stanford NLP Groupnlp.stanford.edu/pubs/SpitkovskyThesis-slides.pdf · Ph.D. Thesis Oral Defense Stanford Artificial Intelligence Laboratory (SAIL) Valentin

Dependency-and-Boundary Models

Dependency-and-Boundary Models: Highlights

DBM-1: Use words at the fringes.

◮ truly head-outward model (Alshawi, 1996)

◮ conditions on what can more often be seen!

DBM-2: Fragments differ from complete sentences.

◮ incomplete fragments are uncharacteristically short

◮ roots of fragments are generally not verbs or modals

DBM-3: Learn to stitch together fragments.

Dependency-and-Boundary Models

Class-based, head-outward generation (Alshawi, 1996)

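[Figure: head-outward generation from a head of class c_h toward the right (dir = R): dependents c_d1 and c_d2 attach in turn (adj = T before the first attachment, adj = F after it), c_e is the class at the current edge, and generation ends with STOP.]

$P_{\mathrm{ROOT}}(c_h \mid \mathit{comp})$

$P_{\mathrm{ATTACH}}(c_d \mid c_h, \mathit{dir}, \mathit{cross})$

$P_{\mathrm{STOP}}(\cdot \mid \mathit{dir}, \mathit{adj}, c_e, \mathit{comp})$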
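As a rough illustration of how the three factors above could combine, the sketch below scores a list of generation decisions. It assumes the usual head-outward convention that continuing has probability one minus the stop probability, treats comp and cross as opaque boolean flags, and uses made-up class names and numbers; none of this is taken from the thesis implementation.

```python
import math
from typing import Dict, NamedTuple, Optional, Sequence, Tuple


class Decision(NamedTuple):
    head: str                  # c_h: class of the head
    direction: str             # dir: "L" or "R"
    adj: bool                  # adj: True if nothing is attached yet on this side
    edge: str                  # c_e: class at the current edge
    dep: Optional[str] = None  # c_d: attached class, or None for STOP
    cross: bool = False        # cross: opaque flag in this sketch


def log_prob(root: str, comp: bool, decisions: Sequence[Decision],
             p_root: Dict[Tuple[str, bool], float],
             p_attach: Dict[Tuple[str, str, str, bool], float],
             p_stop: Dict[Tuple[str, bool, str, bool], float]) -> float:
    """Sum the log-probabilities of the root choice and the listed decisions."""
    total = math.log(p_root[(root, comp)])
    for d in decisions:
        stop = p_stop[(d.direction, d.adj, d.edge, comp)]
        if d.dep is None:
            total += math.log(stop)            # generation stops here
        else:
            total += math.log(1.0 - stop)      # assumed: continue = 1 - stop
            total += math.log(p_attach[(d.dep, d.head, d.direction, d.cross)])
    return total


# Hypothetical toy parameters mirroring the figure: H attaches D1, then D2, then stops.
TOY_ROOT = {("H", True): 0.5}
TOY_ATTACH = {("D1", "H", "R", False): 0.5, ("D2", "H", "R", False): 0.25}
TOY_STOP = {("R", True, "H", True): 0.2,
            ("R", False, "D1", True): 0.5,
            ("R", False, "D2", True): 0.8}
TOY_DECISIONS = [
    Decision("H", "R", True, "H", dep="D1"),
    Decision("H", "R", False, "D1", dep="D2"),
    Decision("H", "R", False, "D2"),           # STOP
]
TOY_LOG_PROB = log_prob("H", True, TOY_DECISIONS, TOY_ROOT, TOY_ATTACH, TOY_STOP)
```

The toy scores only the head's right-side decisions and assumes the dependents are leaves, so each newly attached class is also the new edge class; a full derivation would also include left-side and dependent-level stop decisions.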

Reduced Models

How to better exploit more data?

more text in long sentences, but those can be hard...

◮ require Viterbi training, punctuation constraints, etc.

could we “start small” and still use more data?

◮ ... and wouldn’t it be nice if we could just split things up!

Reduced Models What If

What if we chopped up input at punctuation?

impact on quantity of data (with a 15-token threshold):

◮ more and simpler word sequences incorporated earlier

◮ much more dense coverage of available data:

⋆ number of training inputs goes up 2.2x

⋆ number of tokens increases 4.3x

but, also impact on quality of data:

◮ many fewer complete sentences exhibiting full structure

◮ even less representative than short inputs:

⋆ but mostly phrases and clauses...

... however, we have an appropriate model family!
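A minimal sketch of chopping input at punctuation, under stated assumptions: the punctuation inventory below is hypothetical (the slides do not list one), lengths are counted without punctuation tokens, and the function names are illustrative only.

```python
from typing import List, Sequence

# Assumed punctuation inventory; not specified on the slides.
PUNCTUATION = {",", ".", ";", ":", "!", "?", "--", "(", ")", "``", "''"}

# Tokenized opening sentence of the talk's raw-text example.
SAMPLE = ["By", "most", "measures", ",", "the", "nation", "'s", "industrial",
          "sector", "is", "now", "growing", "very", "slowly", "--",
          "if", "at", "all", "."]


def split_at_punctuation(tokens: Sequence[str]) -> List[List[str]]:
    """Cut a token sequence into punctuation-free fragments."""
    fragments: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        if tok in PUNCTUATION:
            if current:
                fragments.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        fragments.append(current)
    return fragments


def training_inputs(sentences: Sequence[Sequence[str]],
                    max_len: int = 15, split: bool = True) -> List[List[str]]:
    """Collect inputs of at most max_len tokens, with or without splitting."""
    inputs: List[List[str]] = []
    for sent in sentences:
        if split:
            pieces = split_at_punctuation(sent)
        else:
            pieces = [[t for t in sent if t not in PUNCTUATION]]
        inputs.extend(p for p in pieces if 0 < len(p) <= max_len)
    return inputs


if __name__ == "__main__":
    print(len(training_inputs([SAMPLE], split=False)))  # whole sentence is too long
    print([len(f) for f in training_inputs([SAMPLE])])  # fragment lengths
```

With this tokenization the sample sentence has 16 non-punctuation tokens, so it misses the 15-token threshold as a whole but contributes three shorter fragments once split, which is the quantity effect described above.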

Reduced Models Example Continued

Example (cont’d)

            length  type  left  right

complete      51    S     IN    NN

incomplete    12    SBAR  IN    NNS
               2    NP    NN    NN
               6    PP    IN    NNS
              14    VP    VBZ   NNS
               4    NP    DT    NNP
               8    NP    DT    NNPS
               5    VP    VBD   NN

Labels highlighted in successive builds of this slide: DBM-2, DBM-1, DBM-3, and “reduced model” DBM-0.

partial parse forests “easy-first” (Goldberg and Elhadad, 2010)

Reduced Models Example Continued

Integrated System

Reduced Models Example Continued

System: Combiners... (from Leapfrog)

[Figure: combiner diagram with optimizer nodes L_D, a “+” node, and an arg max node, applied to inputs C_1 and C_2.]

$C_1^* = L(C_1)$

$C_2^* = L(C_2)$

$C_1^* + C_2^* = C_+$

Reduced Models Example Continued

System: Inductors... (from Baby Steps and Lateen EM)

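[Figure: inductor diagram; labels recoverable from the slide include HL·DBM, SDBM and SDBM0 (each shown with data D), the nodes C, F, S, D0, C1, C2, C′1 and C′2, and the data sets D^l_split and D^{l+1}_split.]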

Reduced Models Example Continued

System: Iterating... (from Baby Steps, Less is More, etc.)

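[Figure: iteration diagram over lengths 1, 2, ..., 14, 15; labels include HL·DBM applied to D^{l+1}_split (taking C_l to C_{l+1}, starting from ∅), followed by HL·DBM on D^45_split and HL·DBM on D^45, ending in C.]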

Reduced Models Example Continued

System: Results ... on Section 23 of WSJ

Right-Branching (Klein and Manning, 2004)            31.7%
DMV @10                                              34.2%
Baby Steps @15                                       39.2%
Baby Steps @45                                       39.4%
Soft Parameter Tying (Cohen and Smith, 2009)         42.2%
Less is More @15                                     44.1%
Leapfrog @45                                         45.0%

(Gimpel and Smith, 2012)                             53.1%
(Gillenwater et al., 2010)                           53.3%
(Bisk and Hockenmaier, 2012)                         53.3%
(Blunsom and Cohn, 2010)                             55.7%
(Tu and Honavar, 2012)                               57.0%

Integrated System (no initializers or gold tags)     64.4%
Supervised DBM                                       76.3%

During the same time, supervised (constituency) parsing advanced from 91.8 F1 (Petrov, 2010) to 92.4 F1 (Shindo et al., 2012).

Reduced Models Example Continued

System: Results ... on 23 CoNLL sets

Also state-of-the-art for multi-lingual evaluation, across 19 languages from disparate families:

English +

Arabic      Greek
Basque      Hungarian
Bulgarian   Italian
Catalan     Japanese
Chinese     Portuguese
Czech       Slovenian
Danish      Spanish
Dutch       Swedish
German      Turkish

Conclusion Thanks!

Thanks!

Omri Abend, Eneko Agirre, Hiyan Alshawi, Serafim Batzoglou, John Bauer, Thorsten Brants, Dan Cer, Nate Chambers, Angel Chang, Pi-Chuan Chang, Wanxiang Che, Johnny Chen, Jenny Finkel, Andy Golding, Spence Green, Sonal Gupta, David Hall, Dan Jurafsky, Eisar Lipkovitz, Ting Liu, Chris Manning, Marie de Marneffe, David McClosky, Ryan McDonald, Liz Morin, Andrew Ng, Peter Norvig, Art Owen, Marius Pasca, Fernando Pereira, Slav Petrov, Daniel Pipes, Agnieszka Purves, Daniel Ramage, Marta Recasens, Roi Reichart, Roy Schwartz, Richard Socher, Mihai Surdeanu, Julie Tibshirani, Mengqiu Wang, Eric Yeh,

anonymous reviewers, and many others of Stanford NLP, Fannie & John Hertz Foundation, and Google Inc.

Conclusion Baby Steps...

Dramatization: http://www.youtube.com/embed/ncFCdCjBqcE?start=4&end=66.5

“... the real challenge is to make simple things look beautiful.” - Glenn Corteza, Tanguero Argentino

Conclusion Questions?

Thanks again!

Questions?
