![Page 1: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/1.jpg)
Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing
Natural Language Processing
CS 6120, Spring 2014
Northeastern University
David Smith, with some slides from Jason Eisner & Andrew McCallum
![Page 2: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/2.jpg)
Language structure and meaning
We want to know how meaning is mapped onto what language structures. Commonly in English in ways like this:
[Thing The dog] is [Place in the garden]
[Thing The dog] is [Property fierce]
[Action [Thing The dog] is chasing [Thing the cat]]
[State [Thing The dog] was sitting [Place in the garden] [Time yesterday]]
[Action [Thing We] ran [Path out into the water]]
[Action [Thing The dog] barked [Property/Manner loudly]]
[Action [Thing The dog] barked [Property/Amount nonstop for five hours]]
![Page 4: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/4.jpg)
Part of speech “Substitution Test”
The {sad, intelligent, green, fat, ...} one is in the corner.
![Page 5: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/5.jpg)
Constituency
The idea: Groups of words may behave as a single unit or phrase, called a constituent.
E.g. Noun Phrase
• Kermit the frog
• they
• December twenty-sixth
• the reason he is running for president
![Page 6: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/6.jpg)
Constituency
Sentences have parts, some of which appear to have subparts. These groupings of words that go together we will call constituents.
(How do we know they go together? Coming in a few slides...)
I hit the man with a cleaver
I hit [the man with a cleaver]
I hit [the man] with a cleaver
You could not go to her party
You [could not] go to her party
You could [not go] to her party
![Page 7: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/7.jpg)
Constituent Phrases
For constituents, we usually name them as phrases based on the word that heads the constituent:
the man from Amherst is a Noun Phrase (NP) because the head man is a noun
extremely clever is an Adjective Phrase (AP) because the head clever is an adjective
down the river is a Prepositional Phrase (PP) because the head down is a preposition
killed the rabbit is a Verb Phrase (VP) because the head killed is a verb
Note that a word is a constituent (a little one). Sometimes words also act as phrases. In:
Joe grew potatoes.
Joe and potatoes are both nouns and noun phrases.
Compare with:
The man from Amherst grew beautiful russet potatoes.
We say Joe counts as a noun phrase because it appears in a place that a larger noun phrase could have been.
![Page 8: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/8.jpg)
Evidence constituency exists
1. They appear in similar environments (before a verb):
Kermit the frog comes on stage
They come to Massachusetts every summer
December twenty-sixth comes after Christmas
The reason he is running for president comes out only now.
But not each individual word in the constituent:
*The comes out... *is comes out... *for comes out...
2. The constituent can be placed in a number of different locations.
Constituent = Prepositional phrase: On December twenty-sixth
On December twenty-sixth I’d like to fly to Florida.
I’d like to fly on December twenty-sixth to Florida.
I’d like to fly to Florida on December twenty-sixth.
But not split apart:
*On December I’d like to fly twenty-sixth to Florida.
*On I’d like to fly December twenty-sixth to Florida.
![Page 9: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/9.jpg)
Context-free grammar
The most common way of modeling constituency.
CFG = Context-Free Grammar = Phrase Structure Grammar = BNF = Backus-Naur Form
The idea of basing a grammar on constituent structure dates back to Wilhelm Wundt (1890), but it was not formalized until Chomsky (1956) and, independently, Backus (1959).
![Page 10: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/10.jpg)
Context-free grammar
G = ⟨T, N, S, R⟩
• T is the set of terminals (lexicon)
• N is the set of non-terminals. For NLP, we usually distinguish out a set P ⊂ N of preterminals which always rewrite as terminals.
• S is the start symbol (one of the nonterminals)
• R is the set of rules/productions of the form X → γ, where X is a nonterminal and γ is a sequence of terminals and nonterminals (may be empty).
• A grammar G generates a language L.
![Page 11: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/11.jpg)
An example context-free grammar
G = ⟨T, N, S, R⟩
T = {that, this, a, the, man, book, flight, meal, include, read, does}
N = {S, NP, NOM, VP, Det, Noun, Verb, Aux}
S = S
R = {
S → NP VP
S → Aux NP VP
S → VP
NP → Det NOM
NOM → Noun
NOM → Noun NOM
VP → Verb
VP → Verb NP
Det → that | this | a | the
Noun → book | flight | meal | man
Verb → book | include | read
Aux → does
}
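As a rough illustration, the example grammar can be written down as plain data. The sketch below assumes a dictionary from each left-hand side to its list of right-hand sides; the names `GRAMMAR`, `START`, `terminals`, and `preterminals` are illustrative choices, not part of the slides.

```python
# Toy grammar from the slide, stored as {LHS: [RHS tuples]} (an assumed encoding).
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "NOM")],
    "NOM": [("Noun",), ("Noun", "NOM")],
    "VP": [("Verb",), ("Verb", "NP")],
    "Det": [("that",), ("this",), ("a",), ("the",)],
    "Noun": [("book",), ("flight",), ("meal",), ("man",)],
    "Verb": [("book",), ("include",), ("read",)],
    "Aux": [("does",)],
}
START = "S"

# N is the set of left-hand sides; T is every other symbol used in a rule.
nonterminals = set(GRAMMAR)
terminals = {sym for rhss in GRAMMAR.values() for rhs in rhss for sym in rhs} - nonterminals

# Preterminals always rewrite as a single terminal.
preterminals = {
    lhs for lhs, rhss in GRAMMAR.items()
    if all(len(rhs) == 1 and rhs[0] in terminals for rhs in rhss)
}
```

With this encoding, T and the preterminal set P fall out of the rules themselves instead of being listed separately.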
![Page 12: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/12.jpg)
Application of grammar rewrite rules
S → NP VP | Aux NP VP | VP
NP → Det NOM
NOM → Noun | Noun NOM
VP → Verb | Verb NP
Det → that | this | a | the
Noun → book | flight | meal | man
Verb → book | include | read
Aux → does
S ⇒ NP VP
⇒ Det NOM VP
⇒ The NOM VP
⇒ The Noun VP
⇒ The man VP
⇒ The man Verb NP
⇒ The man read NP
⇒ The man read Det NOM
⇒ The man read this NOM
⇒ The man read this Noun
⇒ The man read this book
![Page 13: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/13.jpg)
Parse tree
[S [NP [Det The] [NOM [Noun man]]] [VP [Verb read] [NP [Det this] [NOM [Noun book]]]]]
(Tree diagram, shown here in bracket notation.)
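One possible way to hold this parse tree in code is as nested tuples. The sketch below assumes the form `(label, child, ...)` with words as plain strings; `TREE`, `tree_yield`, and `local_rules` are hypothetical names introduced here. Reading the rules off in pre-order gives the leftmost derivation of the sentence.

```python
# A parse tree as nested tuples: (label, child, child, ...); leaves are words.
TREE = ("S",
        ("NP", ("Det", "The"), ("NOM", ("Noun", "man"))),
        ("VP", ("Verb", "read"),
               ("NP", ("Det", "this"), ("NOM", ("Noun", "book")))))

def tree_yield(node):
    """Return the words at the leaves, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(tree_yield(child))
    return words

def local_rules(node):
    """List the (LHS, RHS) rule used at each internal node, in pre-order."""
    if isinstance(node, str):
        return []
    label, children = node[0], node[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules = [(label, rhs)]
    for child in children:
        rules.extend(local_rules(child))
    return rules
```

Here `tree_yield(TREE)` recovers the sentence, and `local_rules(TREE)` lists one rule application per internal node of the tree.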
![Page 14: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/14.jpg)
CFGs can capture recursion
Example of seemingly endless recursion of embedded prepositional phrases:
PP → Prep NP
NP → Noun PP
[S The mailman ate his [NP lunch [PP with his friend [PP from the cleaning staff [PP of the building [PP at the intersection [PP on the north end [PP of town]]]]]]].
(Bracket notation)
![Page 15: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/15.jpg)
Grammaticality
A CFG defines a formal language = the set of all sentences (strings of words) that can be derived by the grammar.
Sentences in this set are said to be grammatical.
Sentences outside this set are said to be ungrammatical.
![Page 16: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/16.jpg)
The Chomsky hierarchy
• Type 0 Languages / Grammars
Rewrite rules α → β
where α and β are any string of terminals and nonterminals
• Context-sensitive Languages / Grammars
Rewrite rules αXβ → αγβ
where X is a non-terminal, and α, β, γ are any string of terminals and nonterminals (γ must be non-empty).
• Context-free Languages / Grammars
Rewrite rules X → γ
where X is a nonterminal and γ is any string of terminals and nonterminals
• Regular Languages / Grammars
Rewrite rules X → αY
where X, Y are single nonterminals, and α is a string of terminals; Y might be missing.
![Page 17: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/17.jpg)
Parsing regular grammars
(Languages that can be generated by finite-state automata.)
Finite state automaton ⇔ regular expression ⇔ regular grammar
Space needed to parse: constant
Time needed to parse: linear (in the length of the input string)
Cannot do embedded recursion, e.g. aⁿbⁿ. (Context-free grammars can.)
In the language: ab, aaabbb; not in the language: aabbb
The cat likes tuna fish.
The cat the dog chased likes tuna fish.
The cat the dog the boy loves chased likes tuna fish.
John, always early to rise, even after a sleepless night filled with the cries of the neighbor’s baby, goes running every morning.
John and Mary, always early to rise, even after a sleepless night filled with the cries of the neighbor’s baby, go running every morning.
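To make the aⁿbⁿ example concrete, here is a minimal sketch of a recognizer that follows the context-free grammar S → a b | a S b directly. The function name `derives_s` is an assumption made for illustration. The recursion depth grows with n, which is exactly the unbounded memory a finite-state automaton lacks.

```python
def derives_s(s):
    """True if s is in a^n b^n (n >= 1), following S -> a b | a S b."""
    if s == "ab":                      # base rule: S -> a b
        return True
    # recursive rule: S -> a S b (peel one 'a' and one matching 'b')
    return len(s) >= 4 and s[0] == "a" and s[-1] == "b" and derives_s(s[1:-1])
```

The slide's examples behave as stated: `ab` and `aaabbb` are accepted, while `aabbb` is rejected.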
![Page 18: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/18.jpg)
Parsing context-free grammars
(Languages that can be generated by pushdown automata.)
Widely used for surface syntax description (correct word order specification) in natural languages.
Space needed to parse: stack (sometimes a stack of stacks)
In general, proportional to the number of levels of recursion in the data.
Time needed to parse: in general O(n³).
Can do aⁿbⁿ, but cannot do aⁿbⁿcⁿ.
Chomsky Normal Form
All rules of the form X → Y Z or X → a or S → ε.
(S is the only non-terminal that can go to ε.)
Any CFG can be converted into this form.
How would you convert the rule W → X Y a Z to Chomsky Normal Form?
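One way to answer the question for W → X Y a Z is sketched below, under two assumptions: terminals are written in lowercase, and fresh nonterminals may be invented freely. The names `T_a`, `W_1`, and `W_2` are hypothetical. The sketch covers only the two steps this rule needs (hiding terminals behind new preterminals, then splitting the long right-hand side into binary rules); it does not handle epsilon or unit rules.

```python
def to_cnf_rules(lhs, rhs, is_terminal=lambda s: s.islower()):
    """Rewrite one rule with a long RHS as CNF-shaped rules.

    Returns a list of (LHS, RHS-tuple) pairs. Epsilon and unit rules are
    not handled here; this is only the binarization part of the conversion.
    """
    rules, symbols = [], []
    for sym in rhs:
        if is_terminal(sym) and len(rhs) > 1:
            proxy = "T_" + sym                 # hypothetical new preterminal
            if (proxy, (sym,)) not in rules:
                rules.append((proxy, (sym,)))
            symbols.append(proxy)
        else:
            symbols.append(sym)
    current, counter = lhs, 1
    while len(symbols) > 2:                    # peel off one symbol per new rule
        fresh = f"{lhs}_{counter}"             # hypothetical fresh nonterminal
        rules.append((current, (symbols[0], fresh)))
        current, symbols, counter = fresh, symbols[1:], counter + 1
    rules.append((current, tuple(symbols)))
    return rules

cnf = to_cnf_rules("W", ("X", "Y", "a", "Z"))
# cnf == [("T_a", ("a",)), ("W", ("X", "W_1")),
#         ("W_1", ("Y", "W_2")), ("W_2", ("T_a", "Z"))]
```

The resulting grammar derives the same strings from W, but the parse trees gain the extra nodes `W_1`, `W_2`, and `T_a`.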
![Page 19: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/19.jpg)
Chomsky Normal Form Conversion
These steps are used in the conversion:
1. Make S non-recursive
2. Eliminate all epsilon except the one in S (if there is one)
3. Eliminate all chain rules
4. Remove useless symbols (the ones not used in any production).
How would you convert the following grammar?
S → ABS
S → ε
A → ε
A → xyz
B → wB
B → v
![Page 20: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/20.jpg)
What is parsing?
We want to run the grammar backwards to find the structure.
Parsing can be viewed as a search problem.
We search through the legal rewritings of the grammar.
We want to find all structures matching an input string of words (for the moment).
We can do this bottom-up or top-down.
This distinction is independent of depth-first versus breadth-first; we can do either both ways.
Doing this we build a search tree which is different from the parse tree.
![Page 21: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/21.jpg)
Recognizers and parsers
• A recognizer is a program that, for a given grammar and a given sentence, returns YES if the sentence is accepted by the grammar (i.e., the sentence is in the language), and NO otherwise.
• A parser, in addition to doing the work of a recognizer, also returns the set of parse trees for the string.
![Page 22: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/22.jpg)
Soundness and completeness
• A parser is sound if every parse it returns is valid/correct.
• A parser terminates if it is guaranteed not to go off into an infinite loop.
• A parser is complete if for any given grammar and sentence it is sound, produces every valid parse for that sentence, and terminates.
• (For many cases, we settle for sound but incomplete parsers: e.g. probabilistic parsers that return a k-best list.)
![Page 23: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/23.jpg)
Top-down parsing
Top-down parsing is goal-directed.
• A top-down parser starts with a list of constituents to be built.
• It rewrites the goals in the goal list by matching one against the LHS of the grammar rules,
• and expanding it with the RHS,
• ...attempting to match the sentence to be derived.
If a goal can be rewritten in several ways, then there is a choice of which rule to apply (search problem).
Can use depth-first or breadth-first search, and goal ordering.
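The goal-list procedure above can be sketched as a depth-first recognizer. This is a minimal illustration over the toy grammar from the earlier slides, not a definitive implementation; `GRAMMAR`, `top_down`, and `recognize` are names assumed here, and input is lowercased to match the lexicon. It terminates only because this grammar has no left-recursive rules.

```python
# Toy grammar from the earlier slides, as {LHS: [RHS tuples]} (assumed encoding).
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "NOM")],
    "NOM": [("Noun",), ("Noun", "NOM")],
    "VP": [("Verb",), ("Verb", "NP")],
    "Det": [("that",), ("this",), ("a",), ("the",)],
    "Noun": [("book",), ("flight",), ("meal",), ("man",)],
    "Verb": [("book",), ("include",), ("read",)],
    "Aux": [("does",)],
}

def top_down(goals, words):
    """Depth-first search: True if the goal list can derive exactly `words`."""
    if not goals:
        return not words                       # success only if input is used up
    first, rest = goals[0], goals[1:]
    if first not in GRAMMAR:                   # terminal goal: match next word
        return bool(words) and words[0] == first and top_down(rest, words[1:])
    # Nonterminal goal: try each rule whose LHS matches, expanding with its RHS.
    return any(top_down(rhs + rest, words) for rhs in GRAMMAR[first])

def recognize(sentence):
    return top_down(("S",), tuple(sentence.lower().split()))
```

For "Book that flight", the search first tries and abandons S → NP VP and S → Aux NP VP before succeeding through S → VP, which shows the wasted work that a top-down strategy can incur.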
![Page 24: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/24.jpg)
Top-down parsing example (Breadth-first)
S → NP VP | Aux NP VP | VP
NP → Det NOM
NOM → Noun | Noun NOM
VP → Verb | Verb NP
Det → that | this | a | the
Noun → book | flight | meal | man
Verb → book | include | read
Aux → does
Book that flight.
(Work out top-down, breadth-first search on the board...)
![Page 25: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/25.jpg)
Top-down parsing example (Breadth-first)
(Search tree, shown here in bracket notation, one ply per line.)
Ply 1: S
Ply 2: [S NP VP]   [S Aux NP VP]   [S VP]
Ply 3: [S [NP Det NOM] [VP Verb]]   [S [NP Det NOM] [VP Verb NP]]   ...   [S [VP Verb]]   [S [VP Verb NP]]
...
Final: [S [VP [Verb book] [NP [Det that] [NOM [Noun flight]]]]]
![Page 26: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/26.jpg)
Problems with top-down parsing
• Left recursive rules... e.g. NP → NP PP... lead to infinite recursion
• Will do badly if there are many different rules for the same LHS. Consider if there are 600 rules for S, 599 of which start with NP, but one of which starts with a V, and the sentence starts with a V.
• Useless work: expands things that are possible top-down but not there (no bottom-up evidence for them).
• Top-down parsers do well if there is useful grammar-driven control: search is directed by the grammar.
• Top-down is hopeless for rewriting parts of speech (preterminals) with words (terminals). In practice that is always done bottom-up as lexical lookup.
• Repeated work: anywhere there is common substructure.
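The left-recursion problem in the first bullet can be checked for mechanically before running a top-down parser. The sketch below assumes the grammar is a dictionary from LHS to RHS tuples and has no epsilon rules; `left_recursive` is a hypothetical helper name. It computes, for each nonterminal, which nonterminals can appear leftmost in something it derives, and reports any nonterminal that can reach itself.

```python
def left_recursive(grammar):
    """Return the nonterminals X that can derive a string starting with X.

    Assumes no epsilon rules, so only the first RHS symbol matters.
    """
    # Direct left corners: first RHS symbol, when it is a nonterminal.
    corners = {x: {rhs[0] for rhs in rhss if rhs and rhs[0] in grammar}
               for x, rhss in grammar.items()}
    changed = True
    while changed:                               # transitive closure
        changed = False
        for x in corners:
            reachable = set()
            for y in corners[x]:
                reachable |= corners[y]
            if not reachable <= corners[x]:
                corners[x] |= reachable
                changed = True
    return {x for x in corners if x in corners[x]}

EXAMPLE = {
    "NP": [("NP", "PP"), ("Det", "Noun")],
    "PP": [("Prep", "NP")],
    "Det": [("the",)], "Noun": [("dog",)], "Prep": [("in",)],
}
```

On `EXAMPLE`, which contains the slide's rule NP → NP PP, the function flags only `NP`.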
![Page 27: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/27.jpg)
Bottom-up parsing
Bottom-up parsing is data-directed.
• The initial goal list of a bottom-up parser is the string to be parsed.
• If a sequence in the goal list matches the RHS of a rule, then this sequence may be replaced by the LHS of the rule.
• Parsing is finished when the goal list contains just the start symbol.
If the RHS of several rules match the goal list, then there is a choice of which rule to apply (search problem).
Can use depth-first or breadth-first search, and goal ordering.
The standard presentation is as shift-reduce parsing.
![Page 28: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/28.jpg)
Bottom-up parsing example
S → NP VP | Aux NP VP | VP
NP → Det NOM
NOM → Noun | Noun NOM
VP → Verb | Verb NP
Det → that | this | a | the
Noun → book | flight | meal | man
Verb → book | include | read
Aux → does
Book that flight.
(Work out bottom-up search on the board...)
![Page 29: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/29.jpg)
Shift-reduce parsing
| Stack | Input remaining | Action |
|---|---|---|
| () | Book that flight | shift |
| (Book) | that flight | reduce, Verb → book (Choice #1 of 2) |
| (Verb) | that flight | shift |
| (Verb that) | flight | reduce, Det → that |
| (Verb Det) | flight | shift |
| (Verb Det flight) | | reduce, Noun → flight |
| (Verb Det Noun) | | reduce, NOM → Noun |
| (Verb Det NOM) | | reduce, NP → Det NOM |
| (Verb NP) | | reduce, VP → Verb NP |
| (VP) | | reduce, S → VP |
| (S) | | SUCCESS! |
Ambiguity may lead to the need for backtracking.
![Page 30: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/30.jpg)
Shift Reduce Parser
Start with the sentence to be parsed in an input buffer.
• a "shift" action corresponds to pushing the next input symbol from the buffer onto the stack
• a "reduce" action occurs when we have a rule’s RHS on top of the stack. To perform the reduction, we pop the rule’s RHS off the stack and replace it with the nonterminal on the LHS of the corresponding rule.
(When either "shift" or "reduce" is possible, choose one arbitrarily.)
If you end up with only the Start symbol on the stack and the input buffer is empty, then success!
If you don’t, and no "shift" or "reduce" actions are possible, backtrack.
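A backtracking shift-reduce recognizer along these lines might look like the sketch below. It uses the toy grammar from the earlier slides; `RULES` and `shift_reduce` are illustrative names, and the policy of trying every reduction before shifting is one arbitrary choice among several. Recursion provides the backtracking: a failed branch returns `None` and the caller tries the next action.

```python
RULES = [  # (LHS, RHS) pairs for the toy grammar from the earlier slides
    ("S", ("NP", "VP")), ("S", ("Aux", "NP", "VP")), ("S", ("VP",)),
    ("NP", ("Det", "NOM")),
    ("NOM", ("Noun",)), ("NOM", ("Noun", "NOM")),
    ("VP", ("Verb",)), ("VP", ("Verb", "NP")),
    ("Det", ("that",)), ("Det", ("this",)), ("Det", ("a",)), ("Det", ("the",)),
    ("Noun", ("book",)), ("Noun", ("flight",)), ("Noun", ("meal",)), ("Noun", ("man",)),
    ("Verb", ("book",)), ("Verb", ("include",)), ("Verb", ("read",)),
    ("Aux", ("does",)),
]

def shift_reduce(stack, buffer):
    """Return a list of actions that parses the input, or None if none exists."""
    if stack == ("S",) and not buffer:
        return []
    # Reduce: try every rule whose RHS sits on top of the stack.
    for lhs, rhs in RULES:
        if len(stack) >= len(rhs) and stack[-len(rhs):] == rhs:
            rest = shift_reduce(stack[:-len(rhs)] + (lhs,), buffer)
            if rest is not None:
                return [f"reduce {lhs} -> {' '.join(rhs)}"] + rest
    # Shift: push the next input word onto the stack.
    if buffer:
        rest = shift_reduce(stack + (buffer[0],), buffer[1:])
        if rest is not None:
            return ["shift"] + rest
    return None                                   # dead end: caller backtracks

actions = shift_reduce((), ("book", "that", "flight"))
```

For "book that flight", the search first explores dead ends such as reducing "book" to Noun, or reducing Verb to VP too early, before finding a successful action sequence that ends with `reduce S -> VP`.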
![Page 31: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/31.jpg)
Shift Reduce Parser
In a top-down parser, the main decision was which production rule to pick.
In a bottom-up shift-reduce parser there are two decisions:
1. Should we shift another symbol, or reduce by some rule?
2. If reduce, then reduce by which rule?
Both decisions can lead to the need to backtrack.
![Page 32: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/32.jpg)
Problems with bottom-up parsing
• Unable to deal with empty categories: termination problem, unless rewriting empties as constituents is somehow restricted (but then it’s generally incomplete)
• Useless work: locally possible, but globally impossible.
• Inefficient when there is great lexical ambiguity (grammar-driven control might help here). Conversely, it is data-directed: it attempts to parse the words that are there.
• Repeated work: anywhere there is common substructure.
• Both Top-down (LL) and Bottom-up (LR) parsers can (and frequently do) do work exponential in the sentence length on NLP problems.
![Page 33: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/33.jpg)
Principles for success
• Left recursive structures must be found, not predicted.
• Empty categories must be predicted, not found.
• Don’t waste effort re-working what was previously parsed before backtracking.
An alternative way to fix things:
• Grammar transformations can fix both left-recursion and epsilon productions.
• Then you parse the same language but with different trees.
• BUT linguists tend to hate you, because the structure of the re-written grammar isn’t what they wanted.
![Page 34: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/34.jpg)
From Shift-Reduce to CKY
• Shift-reduce parsing can make wrong turns, needs backtracking
• Shift-reduce must pop the top of the stack, but how many items to pop?
• Time-space tradeoff
• Chomsky normal form
![Page 35: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/35.jpg)
Chomsky Normal Form
• Any CFL can be generated by an equivalent grammar in CNF
• Rules of three types
• X → Y Z (X, Y, Z nonterminals)
• X → a (X nonterminal, a terminal)
• S → ε (S the start symbol)
• NB: the derivation of a given string may change
![Page 36: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/36.jpg)
CNF Conversion
• Create new start symbol
• Remove NTs that can generate epsilon
• Remove NTs that can generate each other (unary rule cycles)
• Chain rules with RHS > 2
• Related topic: rule Markovization (later)
![Page 37: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/37.jpg)
CKY Algorithm
• Input: string of n words
• Output (of recognizer): grammatical or not
• Dynamic programming in a chart:
• rows labeled 0 to n-1
• columns labeled 1 to n
• cell [i,j] lists possible constituents spanning words between i and j
![Page 38: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/38.jpg)
CKY Algorithm
Andrew McCallum, UMass Amherst
CKY algorithm, recognizer version
    for i := 1 to n
        Add to [i-1,i] all (part-of-speech) categories for the ith word
    for width := 2 to n
        for start := 0 to n-width
            Define end := start + width
            for mid := start+1 to end-1
                for every constituent X in [start,mid]
                    for every constituent Y in [mid,end]
                        for all ways of combining X and Y (if any)
                            Add the resulting constituent to [start,end] if it’s not already there.
Time complexity? O(Gn³)
Space complexity? O(Tn²)
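The recognizer pseudocode can be sketched in Python as follows. The grammar here is the unweighted form of the "time flies like an arrow" example used on the following slides; the names `LEXICON`, `BINARY`, and `cky_recognize` are assumptions made for illustration. One small variation from the pseudocode: the innermost loops iterate over the binary rules and test membership in the two cells, instead of iterating over pairs of constituents.

```python
from collections import defaultdict

# Part-of-speech categories per word (lexical rules of the example grammar).
LEXICON = {
    "time": {"NP", "Vst"}, "flies": {"NP", "VP"},
    "like": {"P", "V"}, "an": {"Det"}, "arrow": {"N"},
}
# Binary rules as (LHS, left child, right child).
BINARY = [
    ("S", "NP", "VP"), ("S", "Vst", "NP"), ("S", "S", "PP"),
    ("VP", "V", "NP"), ("VP", "VP", "PP"),
    ("NP", "Det", "N"), ("NP", "NP", "PP"), ("NP", "NP", "NP"),
    ("PP", "P", "NP"),
]

def cky_recognize(words, start_symbol="S"):
    """Return (accepted, chart); chart[i, j] is the set of labels spanning i..j."""
    n = len(words)
    chart = defaultdict(set)
    for i in range(1, n + 1):                    # fill the diagonal
        chart[i - 1, i] |= LEXICON.get(words[i - 1], set())
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for lhs, left, right in BINARY:
                    if left in chart[start, mid] and right in chart[mid, end]:
                        chart[start, end].add(lhs)   # a set ignores duplicates
    return start_symbol in chart[0, n], chart

accepted, chart = cky_recognize("time flies like an arrow".split())
```

The three nested loops over `width`, `start`, and `mid` give the n³ factor, and the loop over rules gives the grammar factor in the time bound.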
![Page 43: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/43.jpg)
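(Chart figure for "time flies like an arrow"; rows labeled 0 to 4, word boundaries numbered as below.)
0 time 1 flies 2 like 3 an 4 arrow 5
Diagonal cells (category and weight):
• [0,1] time: NP 3, Vst 3
• [1,2] flies: NP 4, VP 4
• [2,3] like: P 2, V 5
• [3,4] an: Det 1
• [4,5] arrow: N 8
Lexical rules:
NP → time
Vst → time
NP → flies
VP → flies
P → like
V → like
Det → an
N → arrow
Weighted grammar rules:
1 S → NP VP
6 S → Vst NP
2 S → S PP
1 VP → V NP
2 VP → VP PP
1 NP → Det N
2 NP → NP PP
3 NP → NP NP
0 PP → P NP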
![Page 44: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/44.jpg)
N 84
Det 13
P 2
V 52
NP 4
VP 41
NP 3
Vst 3
0
time 1 flies 2 like 3 an 4 arrow 5
1 S ! NP VP
6 S ! Vst NP
2 S ! S PP
1 VP ! V NP
2 VP ! VP PP
1 NP ! Det N
2 NP ! NP PP
3 NP ! NP NP
0 PP ! P NP
![Page 45: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/45.jpg)
N 84
Det 13
P 2
V 52
NP 4
VP 41
NP 10NP 3
Vst 3
0
time 1 flies 2 like 3 an 4 arrow 5
1 S ! NP VP
6 S ! Vst NP
2 S ! S PP
1 VP ! V NP
2 VP ! VP PP
1 NP ! Det N
2 NP ! NP PP
3 NP ! NP NP
0 PP ! P NP
![Page 46: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/46.jpg)
Cell [0,2]: add S 8 = 1 (S → NP VP) + NP 3 from [0,1] + VP 4 from [1,2].
![Page 47: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/47.jpg)
Cell [0,2]: add S 13 = 6 (S → Vst NP) + Vst 3 from [0,1] + NP 4 from [1,2].
![Page 48: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/48.jpg)
Cells [1,3] (flies like) and [2,4] (like an): no rule combines their parts, so both stay empty.
![Page 49: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/49.jpg)
Cell [3,5] (an arrow): add NP 10 = 1 (NP → Det N) + Det 1 from [3,4] + N 8 from [4,5].
![Page 50: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/50.jpg)
![Page 51: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/51.jpg)
Cells [0,3] and [1,4] stay empty. Cell [2,5] (like an arrow): add PP 12 = 0 (PP → P NP) + P 2 from [2,3] + NP 10 from [3,5].
![Page 52: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/52.jpg)
Cell [2,5]: add VP 16 = 1 (VP → V NP) + V 5 from [2,3] + NP 10 from [3,5].
![Page 53: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/53.jpg)
![Page 54: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/54.jpg)
Cell [0,4] stays empty. Cell [1,5] (flies like an arrow): add NP 18 = 2 (NP → NP PP) + NP 4 from [1,2] + PP 12 from [2,5].
![Page 55: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/55.jpg)
Cell [1,5]: add S 21 = 1 (S → NP VP) + NP 4 from [1,2] + VP 16 from [2,5].
![Page 56: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/56.jpg)
Cell [1,5]: add VP 18 = 2 (VP → VP PP) + VP 4 from [1,2] + PP 12 from [2,5].
![Page 57: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/57.jpg)
![Page 58: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/58.jpg)
Cell [0,5] (whole sentence): add NP 24 = 3 (NP → NP NP) + NP 3 from [0,1] + NP 18 from [1,5].
![Page 59: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/59.jpg)
Cell [0,5]: add S 22 = 1 (S → NP VP) + NP 3 from [0,1] + VP 18 from [1,5].
![Page 60: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/60.jpg)
Cell [0,5]: add S 27 = 6 (S → Vst NP) + Vst 3 from [0,1] + NP 18 from [1,5].
![Page 61: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/61.jpg)
![Page 62: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/62.jpg)
Cell [0,5]: add a second NP 24 = 2 (NP → NP PP) + NP 10 from [0,2] + PP 12 from [2,5].
![Page 63: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/63.jpg)
Cell [0,5]: add another S 27 = 1 (S → NP VP) + NP 10 from [0,2] + VP 16 from [2,5].
![Page 64: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/64.jpg)
Cell [0,5]: add another S 22 = 2 (S → S PP) + S 8 from [0,2] + PP 12 from [2,5].
![Page 65: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/65.jpg)
Cell [0,5]: add another S 27 = 2 (S → S PP) + S 13 from [0,2] + PP 12 from [2,5].
Completed chart for 0 time 1 flies 2 like 3 an 4 arrow 5:
[0,1] NP 3, Vst 3
[0,2] NP 10, S 8, S 13
[0,5] NP 24, S 22, S 27, NP 24, S 27, S 22, S 27
[1,2] NP 4, VP 4
[1,5] NP 18, S 21, VP 18
[2,3] P 2, V 5
[2,5] PP 12, VP 16
[3,4] Det 1
[3,5] NP 10
[4,5] N 8
Cells [0,3], [0,4], [1,3], [1,4] and [2,4] are empty.
![Page 66: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/66.jpg)
Follow backpointers … starting from S in cell [0,5].
![Page 67: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/67.jpg)
Follow backpointers: S → NP VP (NP over [0,1], VP over [1,5]).
![Page 68: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/68.jpg)
Follow backpointers: S → NP VP, then VP → VP PP (VP over [1,2], PP over [2,5]).
![Page 69: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/69.jpg)
Follow backpointers: S → NP VP, VP → VP PP, then PP → P NP (P over [2,3], NP over [3,5]).
![Page 70: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/70.jpg)
Follow backpointers: S → NP VP, VP → VP PP, PP → P NP, then NP → Det N (Det over [3,4], N over [4,5]).
Resulting tree: (S (NP time) (VP (VP flies) (PP (P like) (NP (Det an) (N arrow)))))
![Page 71: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/71.jpg)
Parsing as Deduction
• CKY as inference rules
• CKY as Prolog program
• But Prolog is top-down with backtracking
• i.e., “backward chaining”; CKY is “forward chaining”
• Inference rules as Boolean semiring
![Page 72: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/72.jpg)
Probabilistic CFGs
• Generative process (already familiar)
• It’s context free: Rules are applied independently, therefore we multiply their probabilities
• How to estimate probabilities?
• Supervised and unsupervised
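The independence statement above can be written out as a formula. This is a sketch in notation that is not in the slides: $t$ is a parse tree built from $m$ rule applications $A_i \to \gamma_i$, and each nonterminal's rule probabilities are assumed to sum to one.

```latex
p(t) = \prod_{i=1}^{m} p(A_i \to \gamma_i \mid A_i),
\qquad
\sum_{\gamma} p(A \to \gamma \mid A) = 1 \quad \text{for every nonterminal } A .
```

A rule used several times in the tree contributes one factor per use.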
![Page 73: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/73.jpg)
Questions for PCFGs
• What is the most likely parse for a sentence? (parsing)
• What is the probability of a sentence? (language modeling)
• What rule probabilities maximize the probability of a sentence? (parameter estimation)
![Page 74: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/74.jpg)
Algorithms for PCFGs
• Exact analogues to HMM algorithms
• Parsing: Viterbi CKY
• Language modeling: inside probabilities
• Parameter estimation: inside-outside probabilities with EM
![Page 75: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/75.jpg)
Parsing as Deduction
$\mathrm{constit}(B, i, j) \land \mathrm{constit}(C, j, k) \land (A \to B\,C) \Rightarrow \mathrm{constit}(A, i, k)$
$\mathrm{word}(W, i) \land (A \to W) \Rightarrow \mathrm{constit}(A, i, i+1)$
$\forall A, B, C \in N,\ W \in V,\ 0 \le i, j, k \le m$
In Prolog:
constit(A, I1, I) :- word(I, W), (A ---> [W]), I1 is I - 1.
constit(A, I, K) :- constit(B, I, J), constit(C, J, K), (A ---> [B, C]).
But Prolog uses top-down search with backtracking...
![Page 76: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/76.jpg)
Parsing as Deduction
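$\mathrm{constit}(B, i, j) \land \mathrm{constit}(C, j, k) \land (A \to B\,C) \Rightarrow \mathrm{constit}(A, i, k)$
$\mathrm{word}(W, i) \land (A \to W) \Rightarrow \mathrm{constit}(A, i, i+1)$
$\forall A, B, C \in N,\ W \in V,\ 0 \le i, j, k \le m$
$\mathrm{constit}(A, i, k) = \bigvee_{B, C, j} \mathrm{constit}(B, i, j) \land \mathrm{constit}(C, j, k) \land (A \to B\,C)$
$\mathrm{constit}(A, i, j) = \bigvee_{W} \mathrm{word}(W, i, j) \land (A \to W)$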
![Page 77: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/77.jpg)
Parsing as Deduction
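And how about the inside algorithm?
$\mathrm{constit}(A, i, k) = \bigvee_{B, C, j} \mathrm{constit}(B, i, j) \land \mathrm{constit}(C, j, k) \land (A \to B\,C)$
$\mathrm{constit}(A, i, j) = \bigvee_{W} \mathrm{word}(W, i, j) \land (A \to W)$
$\mathrm{score}(\mathrm{constit}(A, i, k)) = \max_{B, C, j} \mathrm{score}(\mathrm{constit}(B, i, j)) \cdot \mathrm{score}(\mathrm{constit}(C, j, k)) \cdot \mathrm{score}(A \to B\,C)$
$\mathrm{score}(\mathrm{constit}(A, i, j)) = \max_{W} \mathrm{score}(\mathrm{word}(W, i, j)) \cdot \mathrm{score}(A \to W)$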
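One way to see the connection between the Boolean, Viterbi and inside versions is a single CKY loop parameterized by how alternatives are combined ("plus") and how parts are multiplied ("times"). The sketch below is an illustration under stated assumptions, not code from the course: the toy grammar, its probabilities, the sentence and the function name `semiring_cky` are all hypothetical. Or/and gives the recognizer, max/product gives the Viterbi score, and sum/product gives the inside probability.

```python
import operator
from collections import defaultdict

# Hypothetical toy PCFG (probabilities are powers of two so float results are exact).
LEXICAL = {("NP", "she"): 0.25, ("NP", "fish"): 0.25, ("NP", "forks"): 0.25,
           ("V", "eats"): 1.0, ("P", "with"): 1.0}
BINARY = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 0.5, ("VP", "VP", "PP"): 0.5,
          ("NP", "NP", "PP"): 0.25, ("PP", "P", "NP"): 1.0}

def semiring_cky(words, lexical, binary, plus, times, zero):
    """chart[(A, i, k)] = 'plus' over all derivations of the 'times' of their parts."""
    n = len(words)
    chart = defaultdict(lambda: zero)
    for i, w in enumerate(words):
        for (a, word), s in lexical.items():
            if word == w:
                chart[(a, i, i + 1)] = plus(chart[(a, i, i + 1)], s)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (a, b, c), s in binary.items():
                    term = times(times(chart[(b, i, j)], chart[(c, j, k)]), s)
                    chart[(a, i, k)] = plus(chart[(a, i, k)], term)
    return chart

words = "she eats fish with forks".split()

# Boolean semiring: is there any parse at all?
as_bool = lambda rules: {r: True for r in rules}
recognized = semiring_cky(words, as_bool(LEXICAL), as_bool(BINARY),
                          operator.or_, operator.and_, False)[("S", 0, 5)]

# Viterbi: probability of the single best parse (max instead of or).
viterbi = semiring_cky(words, LEXICAL, BINARY, max, operator.mul, 0.0)[("S", 0, 5)]

# Inside: total probability of all parses (sum instead of max).
inside = semiring_cky(words, LEXICAL, BINARY, operator.add, operator.mul, 0.0)[("S", 0, 5)]

print(recognized, viterbi, inside)
```

The sentence has two parses in this toy grammar (the PP attaches to the VP or to the object NP), so the inside score is the sum of two tree probabilities while the Viterbi score is the larger of them.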
![Page 78: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/78.jpg)
Problems with Inside-Outside EM
• Each sentence at each iteration takes $O(m^3 n^3)$
• Local maxima even more problematic than for HMMs: Charniak (1993) found a different maximum for each of 300 trials
• More NTs needed to learn a good model
• NTs don’t correspond to intuitions: HMMs are easier to constrain with tag dictionaries
![Page 79: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/79.jpg)
Treebank Grammars
• What rules would you extract from this tree?
• What probabilities would you assign them?
Parse tree
(S
  (NP (Det The) (NOM (Noun man)))
  (VP (Verb read)
      (NP (Det this) (NOM (Noun book)))))
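One common answer to the two questions above is to read off one rule per local tree and estimate each rule's probability by relative frequency, that is, count(A → γ) divided by count(A). The sketch below applies that to the single tree on this slide. It is an assumption that this is the estimator intended here, and the names `TREE`, `rules` and `relative_frequency` are hypothetical.

```python
from collections import Counter

# The parse tree from the slide as nested tuples: (label, child, child, ...).
TREE = ("S",
        ("NP", ("Det", "The"), ("NOM", ("Noun", "man"))),
        ("VP", ("Verb", "read"),
               ("NP", ("Det", "this"), ("NOM", ("Noun", "book")))))

def rules(tree):
    """Yield (lhs, rhs) for every local tree; leaves are plain strings."""
    lhs, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield lhs, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)

def relative_frequency(trees):
    """Estimate p(lhs -> rhs) as count(lhs -> rhs) / count(lhs)."""
    counts = Counter(rule for t in trees for rule in rules(t))
    totals = Counter()
    for (lhs, _), c in counts.items():
        totals[lhs] += c
    return {rule: c / totals[rule[0]] for rule, c in counts.items()}

PROBS = relative_frequency([TREE])
for (lhs, rhs), p in sorted(PROBS.items()):
    print(f"{p:.2f}  {lhs} -> {' '.join(rhs)}")
```

With only one tree, every rule whose left-hand side has a single expansion gets probability 1, while Det and Noun each split their mass evenly between the two words seen.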
![Page 80: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/80.jpg)
Treebank Grammars
• Penn Treebank
• Lots of rules have high fanout (flat phrases)
• Lots of unary cycles
• How should we evaluate?
• What are the consequences of CNF conversion?
![Page 81: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/81.jpg)
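Inside & Viterbi Algorithms
Inside: let $\beta_A(i, j) = p(\mathrm{constit}(A, i, j)) = p(w_{ij} \mid \text{nonterminal } A \text{ from } i \text{ to } j)$
$\beta_A(i, k) = \sum_{B, C, j} \beta_B(i, j) \cdot \beta_C(j, k) \cdot p(A \to B\,C)$
$\beta_S(0, n) = ?$
Viterbi: let $\delta_A(i, j) = p_{\text{best}}(\mathrm{constit}(A, i, j))$
$\delta_A(i, k) = \max_{B, C, j} \delta_B(i, j) \cdot \delta_C(j, k) \cdot p(A \to B\,C)$
$\delta_S(0, n) = ?$
NB: index between words; M&S index words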
![Page 82: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/82.jpg)
Inside & Outside
w(0, 1) … w(i-1, i) [constit(A, i, j)] w(j, j+1) … w(n-1, n)
Outside: p(words 0-i, words j-n, constit)
![Page 83: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/83.jpg)
Inside & Outside
w(0, 1) … w(i-1, i) [constit(A, i, j)] w(j, j+1) … w(n-1, n)
Inside: p(words i-j | constit)
Outside: p(words 0-i, words j-n, constit)
![Page 84: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/84.jpg)
Inside & Outside
![Page 85: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/85.jpg)
Inside & Outside
w(0, 1) … w(i-1, i) [constit(A, i, j)] w(j, j+1) … w(n-1, n)
Inside: p(words i-j | constit)
Outside: p(words 0-i, words j-n, constit)
Inside × Outside = Inside probability of the whole tree
![Page 86: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/86.jpg)
Outside Algorithm
$\alpha_A(i, j) = p(w_{0,i}, A_{i,j}, w_{j,n})$
$\alpha_A(i, j) = \sum_{B, C, k=j}^{n} \alpha_B(i, k) \cdot \beta_C(j, k) \cdot p(B \to A\,C) + \sum_{B, C, k=0}^{i} \alpha_B(k, j) \cdot \beta_C(k, i) \cdot p(B \to C\,A)$
Uses inside probs.
Some resemblance to derivative product rule
$\alpha_S(0, n) = ?$
$\alpha_{PP}(0, n) = ?$
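The outside recurrence can be sketched in Python on top of an inside pass. This is a minimal illustration, not the course's code: `beta` holds inside scores and `alpha` outside scores (names chosen here), the toy PCFG and sentence are hypothetical, and the loop pushes each parent's outside score down to its two children, which is the same sum as the recurrence above grouped by parent rather than by child. The base case assumed here is an outside score of 1 for the root symbol over the whole sentence and 0 for every other symbol.

```python
from collections import defaultdict

# Hypothetical toy PCFG (probabilities are powers of two so float results are exact).
LEX = {("NP", "she"): 0.25, ("NP", "fish"): 0.25, ("NP", "forks"): 0.25,
       ("V", "eats"): 1.0, ("P", "with"): 1.0}
BIN = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 0.5, ("VP", "VP", "PP"): 0.5,
       ("NP", "NP", "PP"): 0.25, ("PP", "P", "NP"): 1.0}

def inside(words):
    """beta[(A, i, k)]: probability that A yields the words between i and k."""
    n = len(words)
    beta = defaultdict(float)
    for i, w in enumerate(words):
        for (a, word), p in LEX.items():
            if word == w:
                beta[(a, i, i + 1)] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (a, b, c), p in BIN.items():
                    beta[(a, i, k)] += beta[(b, i, j)] * beta[(c, j, k)] * p
    return beta

def outside(words, beta, root="S"):
    """alpha[(A, i, j)]: probability of everything outside an A spanning i..j."""
    n = len(words)
    alpha = defaultdict(float)
    alpha[(root, 0, n)] = 1.0
    # Widest spans first, so a parent's outside score is final before it is pushed down.
    for width in range(n, 1, -1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (parent, left, right), p in BIN.items():
                    out = alpha[(parent, i, k)] * p
                    alpha[(left, i, j)] += out * beta[(right, j, k)]
                    alpha[(right, j, k)] += out * beta[(left, i, j)]
    return alpha

words = "she eats fish with forks".split()
beta = inside(words)
alpha = outside(words, beta)
sentence_prob = beta[("S", 0, len(words))]
# Posterior probability that a VP spans "eats fish" (the VP-attachment reading).
vp_posterior = alpha[("VP", 1, 3)] * beta[("VP", 1, 3)] / sentence_prob
print(sentence_prob, vp_posterior)
```

Multiplying an inside score by the matching outside score gives the probability of the sentence together with that constituent, which is the quantity EM needs for expected rule counts.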
![Page 87: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/87.jpg)
Top-Down/Bottom-Up
• Top-down parsers
• Can get caught in infinite loops
• Take exponential time backtracking
• CKY
• Needs Chomsky normal form
• Builds all possible constituents
![Page 88: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/88.jpg)
Andrew McCallum, UMass Amherst
Earley Parser (1970)
• Nice combination of
– dynamic programming
– incremental interpretation
– avoids infinite loops
– no restrictions on the form of the context-free grammar. A → B C the D of causes no problems
– $O(n^3)$ worst case, but faster for many grammars
– Uses left context and optionally right context to constrain search.
![Page 89: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/89.jpg)
Overview of the Algorithm
• Finds constituents and partial constituents in input
– A → B C . D E is partial: only the first half of the A
A → B C . D E  +  D  =  A → B C D . E
Earley’s Overview
![Page 90: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/90.jpg)
Overview of the Algorithm
• Proceeds incrementally left-to-right
– Before it reads word 5, it has already built all hypotheses that are
consistent with first 4 words
– Reads word 5 & attaches it to immediately preceding hypotheses.
Might yield new constituents that are then attached to hypotheses
immediately preceding them …
– E.g., attaching D to A → B C . D E gives A → B C D . E
– Attaching E to that gives A → B C D E .
– Now we have a complete A that we can attach to hypotheses
immediately preceding the A, etc.
Earley’s Overview
![Page 91: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/91.jpg)
The Parse Table
• Columns 0 through n corresponding to the gaps between words
• Entries in column 5 look like (3, NP → NP . PP)
(but we’ll omit the → etc. to save space)
– Built while processing word 5
– Means that the input substring from 3 to 5 matches the initial NP portion of an NP → NP PP rule
– Dot shows how much we’ve matched as of column 5
– Perfectly fine to have entries like (3, VP → is it . true that S)
![Page 92: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/92.jpg)
The Parse Table
• Entries in column 5 look like (3, S → NP . VP)
• What will it mean that we have this entry?
– Unknown right context: Doesn’t mean we’ll necessarily be able to find a VP starting at column 5 to complete the S.
– Known left context: Does mean that some dotted rule back in column 3 is looking for an S that starts at 3.
• So if we actually do find a VP starting at column 5, allowing us to complete the S, then we’ll be able to attach the S to something.
• And when that something is complete, it too will have a customer to its left …
• In short, a top-down (i.e., goal-directed) parser: it chooses to start building a constituent not because of the input but because that’s what the left context needs. In the spoon, won’t build spoon as a verb because there’s no way to use a verb there.
• So any hypothesis in column 5 could get used in the correct parse, if words 1-5 are continued in just the right way by words 6-n.
![Page 93: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/93.jpg)
Earley’s Algorithm, recognizer version
• Add ROOT → . S to column 0.
• For each j from 0 to n:
– For each dotted rule in column j (including those we add as we go!), look at what’s after the dot:
• If it’s a word w, SCAN:
– If w matches the input word between j and j+1, advance the dot and add the resulting rule to column j+1
• If it’s a non-terminal X, PREDICT:
– Add all rules for X to the bottom of column j, with the dot at the start: e.g. X → . Y Z
• If there’s nothing after the dot, ATTACH:
– We’ve finished some constituent, A, that started in column i < j. So for each rule in column i that has A after the dot: advance the dot and add the result to the bottom of column j.
• Output “yes” just if last column has ROOT → S .
• NOTE: Don’t add an entry to a column if it’s already there!
Earley’s as a Recognizer
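The recognizer steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: it uses the small Papa grammar that appears later in these slides, treats any symbol without rules as a word, assumes the grammar has no empty right-hand sides, and the names `GRAMMAR` and `earley_recognize` are hypothetical. Each entry is a tuple (start column, left-hand side, right-hand side, dot position).

```python
GRAMMAR = {
    "ROOT": [["S"]],
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["NP", "PP"], ["Papa"]],
    "VP": [["V", "NP"], ["VP", "PP"]],
    "PP": [["P", "NP"]],
    "N": [["caviar"], ["spoon"]],
    "V": [["ate"]],
    "P": [["with"]],
    "Det": [["the"], ["a"]],
}

def earley_recognize(words, grammar=GRAMMAR, root="ROOT"):
    n = len(words)
    columns = [[] for _ in range(n + 1)]   # ordered to-do list per column
    seen = [set() for _ in range(n + 1)]   # for the "don't add duplicates" rule

    def add(j, entry):
        if entry not in seen[j]:
            seen[j].add(entry)
            columns[j].append(entry)

    for rhs in grammar[root]:
        add(0, (0, root, tuple(rhs), 0))

    for j in range(n + 1):
        idx = 0
        while idx < len(columns[j]):       # the column may grow while we process it
            start, lhs, rhs, dot = columns[j][idx]
            idx += 1
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:      # PREDICT: a nonterminal follows the dot
                    for expansion in grammar[symbol]:
                        add(j, (j, symbol, tuple(expansion), 0))
                elif j < n and words[j] == symbol:   # SCAN: a word follows the dot
                    add(j + 1, (start, lhs, rhs, dot + 1))
            else:                          # ATTACH: nothing follows the dot
                for s2, lhs2, rhs2, dot2 in list(columns[start]):
                    if dot2 < len(rhs2) and rhs2[dot2] == lhs:
                        add(j, (s2, lhs2, rhs2, dot2 + 1))

    return any((0, root, tuple(rhs), len(rhs)) in seen[n] for rhs in grammar[root])

print(earley_recognize("Papa ate the caviar with a spoon".split()))
```

The left-recursive rules NP → NP PP and VP → VP PP cause no looping here because a prediction that is already in the column is not added again.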
![Page 94: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/94.jpg)
Summary of the Algorithm
• Process all hypotheses one at a time in order. (Current hypothesis is shown in blue.)
• This may add new hypotheses to the end of the to-do list, or try to add old hypotheses again.
• Process a hypothesis according to what follows the dot:
• If a word, scan input and see if it matches
• If a nonterminal, predict ways to match it
• (we’ll predict blindly, but could reduce # of predictions by looking ahead k symbols in the input and only making predictions that are compatible with this limited right context)
• If nothing, then we have a complete constituent, so attach it to all its customers
Earley’s Summary
![Page 95: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/95.jpg)
A Grammar
S → NP VP      NP → Papa
NP → Det N     N → caviar
NP → NP PP     N → spoon
VP → V NP      V → ate
VP → VP PP     P → with
PP → P NP      Det → the
               Det → a
An Input Sentence
Papa ate the caviar with a spoon.
A (Whimsical) Grammar
![Page 96: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/96.jpg)
Column 0:
0 ROOT . S
initialize
Remember this stands for (0, ROOT → . S)
![Page 97: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/97.jpg)
Column 0:
0 ROOT . S
0 S . NP VP
predict the kind of S we are looking for
Remember this stands for (0, S → . NP VP)
![Page 98: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/98.jpg)
Column 0:
0 ROOT . S
0 S . NP VP
0 NP . Det N
0 NP . NP PP
0 NP . Papa
predict the kind of NP we are looking for (actually we’ll look for 3 kinds: any of the 3 will do)
![Page 99: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/99.jpg)
Column 0:
0 ROOT . S
0 S . NP VP
0 NP . Det N
0 NP . NP PP
0 NP . Papa
0 Det . the
0 Det . a
predict the kind of Det we are looking for (2 kinds)
![Page 100: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/100.jpg)
Column 0:
0 ROOT . S
0 S . NP VP
0 NP . Det N
0 NP . NP PP
0 NP . Papa
0 Det . the
0 Det . a
predict the kind of NP we’re looking for, but we were already looking for these, so don’t add duplicate goals! Note that this happened when we were processing a left-recursive rule.
![Page 101: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/101.jpg)
Column 0:
0 ROOT . S
0 S . NP VP
0 NP . Det N
0 NP . NP PP
0 NP . Papa
0 Det . the
0 Det . a
Column 1:
0 NP Papa .
0 Papa 1
scan: the desired word is in the input!
![Page 102: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/102.jpg)
0 Det . a
0 Det . the
0 NP . Papa
0 NP . NP PP
0 NP . Det N
0 S . NP VP
0 NP Papa . | 0 ROOT . S
0 Papa 1
scan: failure
![Page 103: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/103.jpg)
0 Det . a
0 Det . the
0 NP . Papa
0 NP . NP PP
0 NP . Det N
0 S . NP VP
0 NP Papa . | 0 ROOT . S
0 Papa 1
scan: failure
![Page 104: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/104.jpg)
0 Det . a
0 Det . the
0 NP . Papa
0 NP . NP PP
0 NP NP . PP | 0 NP . Det N
0 S NP . VP | 0 S . NP VP
0 NP Papa . | 0 ROOT . S
0 Papa 1
attach the newly created NP (which starts at 0) to its customers (incomplete constituents that end at 0 and have NP after the dot)
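A small sketch of the attach step, under the assumption that each chart column is a list of (start, lhs, rhs, dot) tuples. The function name `attach` is hypothetical; `column0` reproduces the seven items shown in column 0.

```python
def attach(completed, columns):
    """Advance every customer of a completed constituent.

    Customers live in the column where the completed item started and
    have the completed item's left-hand side right after their dot.
    The returned items belong in the column where `completed` ends.
    """
    start, lhs, rhs, dot = completed
    if dot != len(rhs):
        raise ValueError("attach expects a complete item")
    advanced = []
    for c_start, c_lhs, c_rhs, c_dot in columns[start]:
        if c_dot < len(c_rhs) and c_rhs[c_dot] == lhs:
            advanced.append((c_start, c_lhs, c_rhs, c_dot + 1))
    return advanced


column0 = [
    (0, "ROOT", ("S",), 0),
    (0, "S", ("NP", "VP"), 0),
    (0, "NP", ("Det", "N"), 0),
    (0, "NP", ("NP", "PP"), 0),
    (0, "NP", ("Papa",), 0),
    (0, "Det", ("the",), 0),
    (0, "Det", ("a",), 0),
]

# "0 NP Papa ." was just completed in column 1.
new_items = attach((0, "NP", ("Papa",), 1), [column0])
```

Two customers are advanced here, S . NP VP and NP . NP PP, matching the two entries that appear in column 1 on this slide.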
![Page 105: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/105.jpg)
0 Det . a
0 Det . the
1 VP . VP PP | 0 NP . Papa
1 VP . V NP | 0 NP . NP PP
0 NP NP . PP | 0 NP . Det N
0 S NP . VP | 0 S . NP VP
0 NP Papa . | 0 ROOT . S
0 Papa 1
predict
![Page 106: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/106.jpg)
0 Det . a
1 PP . P NP | 0 Det . the
1 VP . VP PP | 0 NP . Papa
1 VP . V NP | 0 NP . NP PP
0 NP NP . PP | 0 NP . Det N
0 S NP . VP | 0 S . NP VP
0 NP Papa . | 0 ROOT . S
0 Papa 1
predict
![Page 107: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/107.jpg)
1 V . ate | 0 Det . a
1 PP . P NP | 0 Det . the
1 VP . VP PP | 0 NP . Papa
1 VP . V NP | 0 NP . NP PP
0 NP NP . PP | 0 NP . Det N
0 S NP . VP | 0 S . NP VP
0 NP Papa . | 0 ROOT . S
0 Papa 1
predict
![Page 108: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/108.jpg)
1 V . ate | 0 Det . a
1 PP . P NP | 0 Det . the
1 VP . VP PP | 0 NP . Papa
1 VP . V NP | 0 NP . NP PP
0 NP NP . PP | 0 NP . Det N
0 S NP . VP | 0 S . NP VP
0 NP Papa . | 0 ROOT . S
0 Papa 1
predict
![Page 109: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/109.jpg)
1 P . with
1 V . ate0 Det . a
1 PP . P NP0 Det . the
1 VP . VP PP0 NP . Papa
1 VP . V NP0 NP . NP PP
0 NP NP . PP0 NP . Det N
0 S NP . VP0 S . NP VP
0 NP Papa .0 ROOT . S
0 Papa 1
predict
![Page 110: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/110.jpg)
1 P . with
1 V . ate0 Det . a
1 PP . P NP0 Det . the
1 VP . VP PP0 NP . Papa
1 VP . V NP0 NP . NP PP
0 NP NP . PP0 NP . Det N
0 S NP . VP0 S . NP VP
1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2
scan: success!
![Page 111: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/111.jpg)
1 P . with
1 V . ate0 Det . a
1 PP . P NP0 Det . the
1 VP . VP PP0 NP . Papa
1 VP . V NP0 NP . NP PP
0 NP NP . PP0 NP . Det N
0 S NP . VP0 S . NP VP
1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2
scan: failure
![Page 112: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/112.jpg)
1 P . with
1 V . ate0 Det . a
1 PP . P NP0 Det . the
1 VP . VP PP0 NP . Papa
1 VP . V NP0 NP . NP PP
0 NP NP . PP0 NP . Det N
1 VP V . NP0 S NP . VP0 S . NP VP
1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2
attach
![Page 113: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/113.jpg)
1 P . with
1 V . ate0 Det . a
1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP . NP PP1 VP . V NP0 NP . NP PP
2 NP . Det N0 NP NP . PP0 NP . Det N
1 VP V . NP0 S NP . VP0 S . NP VP
1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2
predict
![Page 114: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/114.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP . NP PP1 VP . V NP0 NP . NP PP
2 NP . Det N0 NP NP . PP0 NP . Det N
1 VP V . NP0 S NP . VP0 S . NP VP
1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2
predict (these next few steps should look familiar)
![Page 115: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/115.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP . NP PP1 VP . V NP0 NP . NP PP
2 NP . Det N0 NP NP . PP0 NP . Det N
1 VP V . NP0 S NP . VP0 S . NP VP
1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2
predict
![Page 116: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/116.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP . NP PP1 VP . V NP0 NP . NP PP
2 NP . Det N0 NP NP . PP0 NP . Det N
1 VP V . NP0 S NP . VP0 S . NP VP
1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2
scan (this time we fail since Papa is not the next word)
![Page 117: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/117.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP . NP PP1 VP . V NP0 NP . NP PP
2 NP . Det N0 NP NP . PP0 NP . Det N
1 VP V . NP0 S NP . VP0 S . NP VP
2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3
scan: success!
![Page 118: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/118.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP . NP PP1 VP . V NP0 NP . NP PP
2 NP . Det N0 NP NP . PP0 NP . Det N
1 VP V . NP0 S NP . VP0 S . NP VP
2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3
![Page 119: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/119.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP . NP PP1 VP . V NP0 NP . NP PP
2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3
![Page 120: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/120.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3
![Page 121: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/121.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
![Page 122: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/122.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
![Page 123: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/123.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
attach
![Page 124: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/124.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
2 Det . the1 PP . P NP0 Det . the
2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
attach
(again!)
![Page 125: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/125.jpg)
1 P . with
2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
attach
(again!)
![Page 126: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/126.jpg)
1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
![Page 127: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/127.jpg)
0 ROOT S .1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
attach
(again!)
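Putting predict, scan, and attach together gives a recognizer. The sketch below is one possible arrangement, not the slides' own code: the grammar is reconstructed from the rules visible in the chart entries (the full lecture grammar may contain more), items are assumed to be (start, lhs, rhs, dot) tuples, and the name `earley_recognize` is hypothetical. It also assumes the grammar has no empty rules.

```python
# Grammar reconstructed from the chart entries on these slides (an assumption).
GRAMMAR = {
    "ROOT": [("S",)],
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP"), ("Papa",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
    "Det": [("the",), ("a",)],
    "N": [("caviar",), ("spoon",)],
    "V": [("ate",)],
    "P": [("with",)],
}


def earley_recognize(words, grammar=GRAMMAR, root="ROOT"):
    """Return True if `words` can be derived from `root`."""
    n = len(words)
    columns = [[] for _ in range(n + 1)]
    seen = [set() for _ in range(n + 1)]

    def add(col, item):
        if item not in seen[col]:          # don't add duplicate goals
            seen[col].add(item)
            columns[col].append(item)

    for rhs in grammar[root]:
        add(0, (0, root, rhs, 0))

    for col in range(n + 1):
        i = 0
        while i < len(columns[col]):       # column may grow as we process it
            start, lhs, rhs, dot = columns[col][i]
            i += 1
            if dot == len(rhs):
                # attach: advance customers waiting in the start column
                for c_start, c_lhs, c_rhs, c_dot in list(columns[start]):
                    if c_dot < len(c_rhs) and c_rhs[c_dot] == lhs:
                        add(col, (c_start, c_lhs, c_rhs, c_dot + 1))
            elif rhs[dot] in grammar:
                # predict: the symbol after the dot is a nonterminal
                for new_rhs in grammar[rhs[dot]]:
                    add(col, (col, rhs[dot], new_rhs, 0))
            elif col < n and rhs[dot] == words[col]:
                # scan: the symbol after the dot is the next input word
                add(col + 1, (start, lhs, rhs, dot + 1))

    return any(
        start == 0 and lhs == root and dot == len(rhs)
        for start, lhs, rhs, dot in columns[n]
    )


prefix_ok = earley_recognize("Papa ate the caviar".split())
full_ok = earley_recognize("Papa ate the caviar with a spoon".split())
```

The completed ROOT S . item spanning 0 to 4 on this slide corresponds to `prefix_ok`: the first four words already form a sentence, even though the parser keeps going because more input remains.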
![Page 128: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/128.jpg)
0 ROOT S .1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
![Page 129: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/129.jpg)
4 P . with
0 ROOT S .1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
![Page 130: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/130.jpg)
4 P . with
0 ROOT S .1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4
![Page 131: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/131.jpg)
4 P . with
0 ROOT S .1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5
![Page 132: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/132.jpg)
4 P . with
0 ROOT S .1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5
![Page 133: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/133.jpg)
4 P . with
0 ROOT S .1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5
![Page 134: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/134.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5
![Page 135: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/135.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5
![Page 136: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/136.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5
![Page 137: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/137.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5
![Page 138: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/138.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
5 Det a .4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6
![Page 139: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/139.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det . N4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
5 Det a .4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6
![Page 140: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/140.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
6 N . spoon5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
6 N . caviar5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det . N4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
5 Det a .4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6
![Page 141: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/141.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
6 N . spoon5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
6 N . caviar5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det . N4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
5 Det a .4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6
![Page 142: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/142.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
6 N . spoon5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
6 N . caviar5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det . N4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .5 Det a .4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6 spoon 7
![Page 143: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/143.jpg)
0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6 spoon 7
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
6 N . spoon5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
6 N . caviar5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .5 NP Det . N4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .5 Det a .4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
![Page 144: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/144.jpg)
4 P . with
0 ROOT S .1 P . with
5 Det . a4 PP . P NP2 Det . a1 V . ate0 Det . a
5 Det . the1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
5 NP . Papa0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP6 N . spoon5 NP . NP PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .6 N . caviar5 NP . Det N1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .5 NP Det . N4 PP P . NP2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .5 Det a .4 P with .3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6 spoon 7
![Page 145: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/145.jpg)
4 P . with
0 ROOT S .1 P . with
4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 146: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/146.jpg)
4 P . with
0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 147: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/147.jpg)
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 148: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/148.jpg)
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 149: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/149.jpg)
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 150: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/150.jpg)
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 151: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/151.jpg)
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 152: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/152.jpg)
0 ROOT S .
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 153: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/153.jpg)
0 ROOT S .
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 154: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/154.jpg)
0 ROOT S .
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 155: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/155.jpg)
0 ROOT S .
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 156: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/156.jpg)
0 ROOT S .
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
![Page 157: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/157.jpg)
Andrew McCallum, UMass Amherst
Left Recursion Kills Pure
Top-Down Parsing …
VP
![Page 158: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/158.jpg)
Left Recursion Kills Pure
Top-Down Parsing …
VP
VP PP
![Page 159: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/159.jpg)
Left Recursion Kills Pure
Top-Down Parsing …
VP PP
VP
VP PP
![Page 160: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/160.jpg)
Left Recursion Kills Pure
Top-Down Parsing …
VP PP
VP
VP PP
VP PP
makes new hypotheses ad infinitum before we’ve seen the PPs at all
hypotheses try to predict in advance how many PPs will arrive in input
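To make the runaway concrete, here is a minimal sketch (not from the slides) of a pure top-down parser that always expands the leftmost symbol with its first rule. The function name `expand_leftmost` and the two-rule grammar are assumptions for illustration only. The point is that no input is consumed, yet the hypothesis keeps growing.

```python
# Hypothetical toy grammar: the left-recursive rule is listed first.
GRAMMAR = {"VP": [["VP", "PP"], ["V", "NP"]]}

def expand_leftmost(symbols, grammar, depth):
    """Expand the leftmost symbol `depth` times without reading any input."""
    hypotheses = [symbols]
    for _ in range(depth):
        current = hypotheses[-1]
        rhs = grammar[current[0]][0]          # first rule for the leftmost symbol
        hypotheses.append(rhs + current[1:])  # VP PP, then VP PP PP, and so on
    return hypotheses

for hypothesis in expand_leftmost(["VP"], GRAMMAR, 3):
    print(" ".join(hypothesis))
```

Each step guesses one more PP in advance. Only the artificial `depth` bound stops the loop.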
![Page 161: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/161.jpg)
… but Earley’s Alg is Okay!
VP
VP PP
1 VP → . VP PP (in column 1)
![Page 162: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/162.jpg)
… but Earley’s Alg is Okay!
1 VP → V NP . (in column 4), covering "ate the caviar"
1 VP → . VP PP (in column 1)
![Page 163: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/163.jpg)
… but Earley’s Alg is Okay!
attach
1 VP → VP . PP (in column 4), where the inner VP is [V NP] "ate the caviar"
1 VP → . VP PP (in column 1)
![Page 164: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/164.jpg)
… but Earley’s Alg is Okay!
1 VP → VP PP . (in column 7), covering "ate the caviar with a spoon"
1 VP → . VP PP (in column 1)
![Page 165: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/165.jpg)
… but Earley’s Alg is Okay!
1 VP → VP PP . (in column 7), covering "ate the caviar with a spoon"
1 VP → . VP PP (in column 1) can be reused
![Page 166: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/166.jpg)
… but Earley’s Alg is Okay!
attach
1 VP → VP . PP (in column 7), where the inner VP is [VP PP] "ate the caviar with a spoon"
1 VP → . VP PP (in column 1) can be reused
![Page 167: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/167.jpg)
… but Earley’s Alg is Okay!
1 VP → VP PP . (in column 10), covering "ate the caviar with a spoon in his bed"
1 VP → . VP PP (in column 1) can be reused
![Page 168: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/168.jpg)
… but Earley’s Alg is Okay!
1 VP → VP PP . (in column 10), covering "ate the caviar with a spoon in his bed"
1 VP → . VP PP (in column 1) can be reused again
![Page 169: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/169.jpg)
… but Earley’s Alg is Okay!
attach
1 VP → VP . PP (in column 10), where the inner VP is [VP PP] "ate the caviar with a spoon in his bed"
1 VP → . VP PP (in column 1) can be reused again
![Page 170: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/170.jpg)
0 ROOT S .
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
completed a VP in col 4
col 1 lets us use it in a VP PP structure
![Page 171: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/171.jpg)
0 ROOT S .
7 P . with
1 VP VP . PP
0 S NP VP .
2 NP NP . PP4 P . with
1 VP V NP .0 ROOT S .1 P . with
7 PP . P NP4 PP . P NP2 Det . a1 V . ate0 Det . a
1 VP VP PP .1 VP VP . PP2 Det . the1 PP . P NP0 Det . the
2 NP NP PP .0 S NP VP .2 NP . Papa1 VP . VP PP0 NP . Papa
5 NP NP . PP2 NP NP . PP3 N . spoon2 NP . NP PP1 VP . V NP0 NP . NP PP
4 PP P NP .1 VP V NP .3 N . caviar2 NP . Det N0 NP NP . PP0 NP . Det N
5 NP Det N .2 NP Det N .2 NP Det . N1 VP V . NP0 S NP . VP0 S . NP VP
6 N spoon .…3 N caviar .2 Det the .1 V ate .0 NP Papa .0 ROOT . S
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
completed that VP = VP PP in col 7
col 1 would let us use it in a VP PP structure
can reuse col 1 as often as we need
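The chart walk-through above can be sketched as a small Earley recognizer. This is a minimal illustration under stated assumptions, not the course's reference implementation: the grammar is transcribed from the items visible in the charts, the function name `earley_chart` and the tuple layout `(start, lhs, rhs, dot)` are hypothetical, and the grammar is assumed to have no ε rules.

```python
# Grammar as it appears in the chart items; words are treated as terminals.
GRAMMAR = {
    "ROOT": [("S",)],
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP"), ("Papa",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
    "Det": [("the",), ("a",)],
    "N": [("caviar",), ("spoon",)],
    "V": [("ate",)],
    "P": [("with",)],
}

def earley_chart(words, grammar=GRAMMAR, root="ROOT"):
    """Return the chart: one list of items (start, lhs, rhs, dot) per column."""
    n = len(words)
    columns = [[] for _ in range(n + 1)]
    seen = [set() for _ in range(n + 1)]

    def add(col, item):
        # Duplicate check: an item enters a column at most once.
        if item not in seen[col]:
            seen[col].add(item)
            columns[col].append(item)

    for rhs in grammar[root]:
        add(0, (0, root, rhs, 0))

    for col in range(n + 1):
        i = 0
        while i < len(columns[col]):          # the column grows while we scan it
            start, lhs, rhs, dot = columns[col][i]
            i += 1
            if dot == len(rhs):
                # attach: advance every item in column `start` that was waiting for lhs
                for s2, l2, r2, d2 in list(columns[start]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(col, (s2, l2, r2, d2 + 1))
            elif rhs[dot] in grammar:
                # predict: start every rule for the nonterminal after the dot
                for new_rhs in grammar[rhs[dot]]:
                    add(col, (col, rhs[dot], new_rhs, 0))
            elif col < n and words[col] == rhs[dot]:
                # scan: the next input word matches the terminal after the dot
                add(col + 1, (start, lhs, rhs, dot + 1))
    return columns

def recognize(words, grammar=GRAMMAR, root="ROOT"):
    """True if a completed root item spanning the whole input is in the last column."""
    last = earley_chart(words, grammar, root)[-1]
    return any(s == 0 and l == root and d == len(r) for s, l, r, d in last)

print(recognize("Papa ate the caviar with a spoon".split()))
```

The left-recursive item `1 VP → . VP PP` is added to column 1 once, because `add` rejects duplicates, and the attach step returns to it every time a VP starting at position 1 is completed.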
![Page 172: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/172.jpg)
Beyond Recognition
• So far, we’ve described an Earley recognizer
• Note what we did when we tried to create entries that already existed
• What should we do when combining items?
• How to derive outside algorithm?
![Page 173: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/173.jpg)
Parsing Tricks
![Page 174: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/174.jpg)
Left-Corner Parsing
• Technique for 1 word of lookahead in algorithms like Earley’s
• (can also do multi-word lookahead but it’s harder)
![Page 175: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/175.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP0 NP . Papa0 Det . the0 Det . a
attach
Basic Earley’s Algorithm
![Page 176: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/176.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the0 Det . a
predict
![Page 177: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/177.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a
predict
![Page 178: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/178.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted
predict
![Page 179: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/179.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted
predict
![Page 180: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/180.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted
predict
• .V makes us add all the verbs in the vocabulary!
• Slow – we’d like a shortcut.
![Page 181: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/181.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted
predict
![Page 182: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/182.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted
predict
![Page 183: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/183.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted
predict
• Every .VP adds all VP → … rules again.
• Before adding a rule, check it’s not a duplicate.
• Slow if there are > 700 VP → … rules, so what will you do in Homework 3?
![Page 184: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/184.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted1 P . with
predict
![Page 185: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/185.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted1 P . with
predict
• .P makes us add all the prepositions …
![Page 186: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/186.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted1 P . with
1-word lookahead would help
ate
![Page 187: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/187.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted1 P . with
1-word lookahead would help
ate
No point in adding words other than ate
![Page 188: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/188.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 PP . P NP0 Det . a 1 V . ate
1 V . drank1 V . snorted1 P . with
1-word lookahead would help
ate
No point in adding words other than ate
In fact, no point in adding any constituent that can’t start with ate
Don’t bother adding PP, P, etc.
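One way to sketch the lookahead idea is to precompute, for each category, the set of words that can begin it, and to predict a rule only if its first symbol can start with the next input word. The names `left_corner_words` and `predictable_rules` are hypothetical, the grammar is transcribed from the chart items (with the extra verbs shown in the prediction step), and ε rules are assumed absent.

```python
GRAMMAR = {
    "ROOT": [("S",)],
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP"), ("Papa",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
    "Det": [("the",), ("a",)],
    "N": [("caviar",), ("spoon",)],
    "V": [("ate",), ("drank",), ("snorted",)],
    "P": [("with",)],
}

def left_corner_words(grammar):
    """Map each nonterminal to the set of words that can begin it (no ε rules assumed)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for lhs, rules in grammar.items():
            for rhs in rules:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def can_start_with(symbol, word, first):
    """True if `symbol` (a nonterminal or a word) can begin with `word`."""
    return word in first[symbol] if symbol in first else symbol == word

def predictable_rules(category, next_word, grammar, first):
    """Rules for `category` that survive the left-corner filter for `next_word`."""
    return [rhs for rhs in grammar[category]
            if can_start_with(rhs[0], next_word, first)]

FIRST = left_corner_words(GRAMMAR)
print(predictable_rules("V", "ate", GRAMMAR, FIRST))
print(predictable_rules("PP", "ate", GRAMMAR, FIRST))
```

With the next word `ate`, the filter keeps only `V → ate` among the verb rules and rejects every PP rule, so neither `1 PP → . P NP` nor the preposition entries would be predicted.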
![Page 189: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/189.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP0 NP . Papa0 Det . the0 Det . a
attach
With Left-Corner Filter
ate
![Page 190: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/190.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP0 NP . Papa0 Det . the0 Det . a
attach
With Left-Corner Filter
ate
PP can’t start with ate
![Page 191: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/191.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP0 NP . Papa0 Det . the0 Det . a
attach
With Left-Corner Filter
ate
PP can’t start with ate
Pruning – now we won’t predict 1 PP . P NP 1 PP . ate
![Page 192: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/192.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP0 NP . Papa0 Det . the0 Det . a
attach
With Left-Corner Filter
ate
PP can’t start with ate
Pruning – now we won’t predict 1 PP . P NP 1 PP . ate either!
![Page 193: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/193.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP0 NP . Papa0 Det . the0 Det . a
attach
With Left-Corner Filter
ate
PP can’t start with ate
Pruning – now we won’t predict 1 PP . P NP 1 PP . ate either!
![Page 194: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/194.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the0 Det . a
ate
predict
![Page 195: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/195.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 V . ate0 Det . a 1 V . drank
1 V . snorted
predict
ate
![Page 196: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/196.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 V . ate0 Det . a 1 V . drank
1 V . snorted
predict
ate
![Page 197: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/197.jpg)
0 Papa 1 0 ROOT . S 0 NP Papa .0 S . NP VP 0 S NP . VP0 NP . Det N 0 NP NP . PP0 NP . NP PP 1 VP . V NP0 NP . Papa 1 VP . VP PP0 Det . the 1 V . ate0 Det . a 1 V . drank
1 V . snorted
predict
ate
![Page 198: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/198.jpg)
Merging Right-Hand Sides
• Grammar might have rules
  X → A G H P
  X → B G H P
• Could end up with both of these in chart:
  (2, X → A . G H P) in column 5
  (2, X → B . G H P) in column 5
• But these are now interchangeable: if one produces X then so will the other
• To avoid this redundancy, can always use dotted rules of this form: X → ... G H P
![Page 199: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/199.jpg)
Merging Right-Hand Sides
• Similarly, grammar might have rules
  X → A G H P
  X → A G H Q
• Could end up with both of these in chart:
  (2, X → A . G H P) in column 5
  (2, X → A . G H Q) in column 5
• Not interchangeable, but we’ll be processing them in parallel for a while …
• Solution: write grammar as X → A G H (P | Q)
![Page 200: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/200.jpg)
Merging Right-Hand Sides
• Combining the two previous cases:
  X → A G H P
  X → A G H Q
  X → B G H P
  X → B G H Q
  becomes X → (A | B) G H (P | Q)
• And often nice to write stuff like NP → (Det | ε) Adj* N
![Page 201: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/201.jpg)
Merging Right-Hand Sides
![Page 202: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/202.jpg)
Merging Right-Hand Sides
X → (A | B) G H (P | Q)
NP → (Det | ε) Adj* N
![Page 203: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/203.jpg)
Merging Right-Hand Sides
X → (A | B) G H (P | Q)
NP → (Det | ε) Adj* N
• These are regular expressions!
![Page 204: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/204.jpg)
Merging Right-Hand Sides
X → (A | B) G H (P | Q)
NP → (Det | ε) Adj* N
• These are regular expressions!
• Build their minimal DFAs:
![Page 205: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/205.jpg)
Merging Right-Hand Sides
X → (A | B) G H (P | Q)
NP → (Det | ε) Adj* N
• These are regular expressions!
• Build their minimal DFAs:
[Figure: DFA for X →, with arcs A, B, G, H, P, Q]
![Page 206: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/206.jpg)
X"→"(A"|"B)"G"H"(P"|"Q)NP"→"(Det"|"ε)"Adj*"N
• These(are(regular(expressions!• Build(their(minimal(DFAs:
Merging(Right/Hand(Sides
A
B
P
QG HX →
Det
Adj
Adj
NNP →
N
![Page 207: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/207.jpg)
Merging Right-Hand Sides
X → (A | B) G H (P | Q)
NP → (Det | ε) Adj* N
• These are regular expressions!
• Build their minimal DFAs:
[Figure: minimal DFA for X → with arcs labeled A, B, G, H, P, Q; minimal DFA for NP → with arcs labeled Det, Adj, N]
• Automaton states replace dotted rules (X → A G . H P)
![Page 208: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/208.jpg)
Merging Right-Hand Sides
Indeed, all NP → rules can be unioned into a single DFA!
NP → ADJP ADJP JJ JJ NN NNS
NP → ADJP DT NN
NP → ADJP JJ NN
NP → ADJP JJ NN NNS
NP → ADJP JJ NNS
NP → ADJP NN
NP → ADJP NN NN
NP → ADJP NN NNS
NP → ADJP NNS
NP → ADJP NPR
NP → ADJP NPRS
NP → DT
NP → DT ADJP
NP → DT ADJP , JJ NN
NP → DT ADJP ADJP NN
NP → DT ADJP JJ JJ NN
NP → DT ADJP JJ NN
NP → DT ADJP JJ NN NN
etc.
![Page 213: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/213.jpg)
Merging Right-Hand Sides
Indeed, all NP → rules can be unioned into a single DFA!
NP → ADJP ADJP JJ JJ NN NNS | ADJP DT NN | ADJP JJ NN | ADJP JJ NN NNS | ADJP JJ NNS | ADJP NN | ADJP NN NN | ADJP NN NNS | ADJP NNS | ADJP NPR | ADJP NPRS | DT | DT ADJP | DT ADJP , JJ NN | DT ADJP ADJP NN | DT ADJP JJ JJ NN | DT ADJP JJ NN | DT ADJP JJ NN NN
etc.
[Figure: the regular expression above compiled into a DFA for NP →, shown alongside a DFA for ADJP →; visible arc labels include ADJP, DT, NP, ADJ, P]
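A minimal sketch of the union idea, assuming the right-hand sides are given as plain symbol lists: a prefix trie over them is already a deterministic automaton, so shared prefixes such as ADJP JJ are stored once. It is not the minimal DFA the slide refers to (that would also need suffix merging), and the names `build_trie_dfa` and `accepts` are illustrative, not from the slides.

```python
def build_trie_dfa(right_hand_sides):
    """Union rule right-hand sides into a trie-shaped DFA.

    Returns (transitions, accepting); state 0 is the start state.
    Each state plays the role of a set of dotted rules sharing a prefix.
    """
    transitions = [{}]
    accepting = set()
    for rhs in right_hand_sides:
        state = 0
        for symbol in rhs:
            if symbol not in transitions[state]:
                transitions.append({})
                transitions[state][symbol] = len(transitions) - 1
            state = transitions[state][symbol]
        accepting.add(state)  # a complete right-hand side ends here
    return transitions, accepting


def accepts(dfa, symbols):
    """Return True if the symbol sequence is a complete right-hand side."""
    transitions, accepting = dfa
    state = 0
    for symbol in symbols:
        state = transitions[state].get(symbol)
        if state is None:
            return False
    return state in accepting


NP_RULES = [
    ["ADJP", "DT", "NN"],
    ["ADJP", "JJ", "NN"],
    ["ADJP", "JJ", "NN", "NNS"],
    ["DT"],
    ["DT", "ADJP"],
]
NP_DFA = build_trie_dfa(NP_RULES)
```

Five rules with 13 symbol positions collapse into 9 states here; running a standard DFA minimization afterwards would shrink it further.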
![Page 215: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/215.jpg)
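Earley’s Algorithm on DFAs
• What does Earley’s algorithm now look like?
[Figure: Column 4 of the chart holds an item (2, state) in the VP → … DFA, whose outgoing arcs are PP and NP. The predict step adds two items (4, state), one for the start state of the NP → DFA (arcs Det, Adj, N) and one for the PP → … DFA.]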
![Page 217: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/217.jpg)
Earley’s Algorithm on DFAs
• What does Earley’s algorithm now look like?
[Figure: Columns 4, 5, …, 7 of the chart. Column 4 holds items (2, state), (4, state), (4, state); Column 7 holds items (4, state), (4, state). DFAs shown: NP → (arcs Det, Adj, N), VP → … (arcs PP, NP), PP → …]
predict or attach? Both!
![Page 225: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/225.jpg)
Pruning for Speed
• Heuristically throw away constituents that probably won’t make it into the best complete parse.
• Use probabilities to decide which ones.
  – So probs are useful for speed as well as accuracy!
• Both safe and unsafe methods exist
  – Throw x away if p(x) < 10^-200 (and lower this threshold if we don’t get a parse)
  – Throw x away if p(x) < p(y) / 100 for some y that spans the same set of words
  – Throw x away if p(x)*q(x) is small, where q(x) is an estimate of the probability of all rules needed to combine x with the other words in the sentence
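The first two pruning tests can be sketched for a single chart cell. This is a minimal illustration, assuming a cell is a dict from label to probability for constituents over the same span; the function name `prune_cell` and its parameters are hypothetical. A real parser would more likely work in log space to avoid underflow.

```python
def prune_cell(cell, abs_threshold=1e-200, beam_ratio=100.0):
    """Drop unlikely constituents from one chart cell.

    cell: dict mapping label -> probability, all over the same span.
    A constituent x survives only if p(x) >= abs_threshold and
    p(x) is within a factor of beam_ratio of the best competitor.
    """
    if not cell:
        return {}
    best = max(cell.values())
    return {
        label: p
        for label, p in cell.items()
        if p >= abs_threshold and p * beam_ratio >= best
    }
```

Both tests are unsafe in the sense of the slide: the best complete parse may need a constituent that looked bad locally, which is why the threshold is relaxed when no parse is found.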
![Page 230: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/230.jpg)
Agenda (“Best-First”) Parsing
• Explore best options first
  – Should get some good parses early on – grab one & go!
• Prioritize constits (and dotted constits)
  – Whenever we build something, give it a priority
    • How likely do we think it is to make it into the highest-prob parse?
  – usually related to log prob. of that constit
  – might also hack in the constit’s context, length, etc.
  – if priorities are defined carefully, obtain an A* algorithm
• Put each constit on a priority queue (heap)
• Repeatedly pop and process best constituent.
  – CKY style: combine w/ previously popped neighbors.
  – Earley style: scan/predict/attach as usual. What else?
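The CKY-style variant can be sketched with a heap. This is a minimal illustration, assuming a grammar in Chomsky normal form supplied as two dicts (`lexicon` and `binary_rules`, both hypothetical formats) and using the constituent's own negative log probability as its priority. Because probabilities only shrink as constituents combine, the first goal item popped under this priority is the most probable parse; other priorities (context, length) would not carry that guarantee.

```python
import heapq
import math
from collections import defaultdict


def agenda_parse(words, lexicon, binary_rules, goal="S"):
    """Best-first parsing; returns the log prob of the best parse or None.

    lexicon: word -> list of (tag, prob)
    binary_rules: (left_label, right_label) -> list of (parent, prob)
    """
    n = len(words)
    agenda = []  # min-heap of (negative log prob, label, start, end)
    for i, word in enumerate(words):
        for tag, prob in lexicon.get(word, []):
            heapq.heappush(agenda, (-math.log(prob), tag, i, i + 1))
    chart = {}                      # popped items: (label, i, j) -> cost
    starts_at = defaultdict(list)   # i -> [(label, j)]
    ends_at = defaultdict(list)     # j -> [(label, i)]
    while agenda:
        cost, label, i, j = heapq.heappop(agenda)
        if (label, i, j) in chart:
            continue  # already popped with an equal or better score
        chart[(label, i, j)] = cost
        if (label, i, j) == (goal, 0, n):
            return -cost
        starts_at[i].append((label, j))
        ends_at[j].append((label, i))
        # Combine with previously popped neighbors on the right ...
        for right, k in starts_at[j]:
            for parent, prob in binary_rules.get((label, right), []):
                new_cost = cost + chart[(right, j, k)] - math.log(prob)
                heapq.heappush(agenda, (new_cost, parent, i, k))
        # ... and on the left.
        for left, h in ends_at[i]:
            for parent, prob in binary_rules.get((left, label), []):
                new_cost = chart[(left, h, i)] + cost - math.log(prob)
                heapq.heappush(agenda, (new_cost, parent, h, j))
    return None


# Toy grammar, invented for illustration only.
LEXICON = {"the": [("Det", 1.0)], "dog": [("N", 0.5)], "barks": [("VP", 1.0)]}
RULES = {("Det", "N"): [("NP", 1.0)], ("NP", "VP"): [("S", 1.0)]}
best = agenda_parse(["the", "dog", "barks"], LEXICON, RULES)
```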
![Page 233: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/233.jpg)
Preprocessing
• First “tag” the input with parts of speech:
  – Guess the correct preterminal for each word, using faster methods we’ll learn later
  – Now only allow one part of speech per word
  – This eliminates a lot of crazy constituents!
  – But if you tagged wrong you could be hosed
• Raise the stakes:
  – What if tag says not just “verb” but “transitive verb”? Or “verb with a direct object and 2 PPs attached”? (“supertagging”)
• Safer to allow a few possible tags per word, not just one …
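Allowing a few tags per word can be sketched as a top-k filter. This assumes some tagger has already produced one tag distribution per token, represented here as a dict from tag to probability; the function name `restrict_tags` and that input format are assumptions made for illustration.

```python
def restrict_tags(tag_distributions, k=2):
    """Keep the k most probable tags for each token.

    tag_distributions: list with one dict (tag -> probability) per token.
    Returns a list of tag lists, most probable tag first.
    """
    return [
        sorted(dist, key=dist.get, reverse=True)[:k]
        for dist in tag_distributions
    ]
```

With k=1 this is the hard one-tag-per-word preprocessing; a larger k trades some speed for protection against tagging errors.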
![Page 237: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/237.jpg)
Center-Embedding
if x
then
    if y
    then
        if a
        then b
        endif
    else b
    endif
else b
endif
STATEMENT → if EXPR then STATEMENT endif
STATEMENT → if EXPR then STATEMENT else STATEMENT endif
![Page 246: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/246.jpg)
Center-Embedding
• This is the rat that ate the malt.
• This is the malt that the rat ate.
• This is the cat that bit the rat that ate the malt.
• This is the malt that the rat that the cat bit ate.
• This is the dog that chased the cat that bit the rat that ate the malt.
• This is the malt that [the rat that [the cat that [the dog chased] bit] ate].
![Page 254: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/254.jpg)
More Center-Embedding
[What did you disguise
  [those handshakes that you greeted
    [the people we bought
      [the bench
        [Billy was read to]
      on]
    with]
  with]
for]?
[Which mantelpiece did you put
  [the idol I sacrificed
    [the fellow we sold
      [the bridge you threw
        [the bench
          [Billy was read to]
        on]
      off]
    to]
  to]
on]?
Take that, English teachers!
![Page 255: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/255.jpg)
Center Recursion vs. Tail Recursion
[What did you disguise
  [those handshakes that you greeted
    [the people we bought
      [the bench
        [Billy was read to]
      on]
    with]
  with]
for]?
[For what did you disguise
  [those handshakes with which you greeted
    [the people with which we bought
      [the bench on which
        [Billy was read to]?
“pied piping”: NP moves leftward, preposition follows along
![Page 259: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/259.jpg)
Disallow Center-Embedding?
• Center-embedding seems to be in the grammar, but people have trouble processing more than 1 level of it.
• You can limit # levels of center-embedding via features: e.g., S[S_DEPTH=n+1] → A S[S_DEPTH=n] B
• If a CFG limits # levels of embedding, then it can be compiled into a finite-state machine – we don’t need a stack at all!
  – Finite-state recognizers run in linear time.
  – However, it’s tricky to turn them into parsers for the original CFG from which the recognizer was compiled.
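A small sketch of why a depth limit makes the language finite-state, using the feature rule S[S_DEPTH=n+1] → A S[S_DEPTH=n] B. The base case S[S_DEPTH=0] → C is an assumption added here so the recursion bottoms out (the slide gives none), and the function names are illustrative. The compiled machine tracks depth in its state, so recognition is a single left-to-right pass with no stack.

```python
def compile_depth_limited(max_depth):
    """Compile S[n+1] -> A S[n] B, S[0] -> C (assumed base case), n < max_depth,
    into a finite-state machine: (transitions, start_state, accepting_states)."""
    delta = {}
    for k in range(max_depth + 1):
        if k < max_depth:
            delta[(("open", k), "A")] = ("open", k + 1)   # go one level deeper
        delta[(("open", k), "C")] = ("close", k)          # innermost S
        if k > 0:
            delta[(("close", k), "B")] = ("close", k - 1)  # close one level
    return delta, ("open", 0), {("close", 0)}


def recognize(fsm, symbols):
    """Linear-time recognition: one table lookup per input symbol."""
    delta, state, accepting = fsm
    for symbol in symbols:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state in accepting
```

The machine has 2 × (max_depth + 1) states. It answers yes or no, but recovering the original CFG's parse tree from its state sequence is the tricky part the slide mentions.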
![Page 260: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/260.jpg)
Overview
• Treebanks and evaluation
• Lexicalized parsing (with heads)
• Examples: Collins
![Page 261: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/261.jpg)
Treebanks
• Pure grammar induction approaches tend not to produce the parse trees that people want
• Solution
  – Give some examples of parse trees that we want
  – Make a learning tool learn a grammar
• Treebank
  – A collection of such example parses
  – Penn Treebank is most widely used
![Page 262: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/262.jpg)
Treebanks
• Penn Treebank
• Trees are represented via bracketing
• Fairly flat structures for noun phrases
  (NP Arizona real estate loans)
• Tagged with grammatical and semantic functions
  (-SBJ, -LOC, …)
• Use empty nodes (*) to indicate understood subjects and extraction gaps
![Page 263: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/263.jpg)
( (S (NP-SBJ The move)
     (VP followed
         (NP (NP a round)
             (PP of
                 (NP (NP similar increases)
                     (PP by (NP other lenders))
                     (PP against (NP Arizona real estate loans)))))
         ,
         (S-ADV (NP-SBJ *)
                (VP reflecting
                    (NP a continuing decline)
                    (PP-LOC in (NP that market))))))
  . )
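Bracketed trees like the one above are straightforward to read back into a data structure. The following is a minimal sketch, assuming whitespace-separated tokens and the simplified bracketing shown on the slide (words directly under phrase labels); it is not a full Penn Treebank reader, and `read_tree` and `leaves` are illustrative names.

```python
import re


def read_tree(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Children are either sub-trees or word strings. A bracket with no
    label, such as the outermost wrapper, gets the empty label "".
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return parse()


def leaves(tree):
    """Return the words of a tree, left to right."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words
```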
![Page 264: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/264.jpg)
Treebanks
• Many people have argued that it is better to have linguists construct treebanks than grammars
• Because it is easier
- to work out the correct parse of individual sentences
• than
- to try to determine all possible manifestations of a certain rule or grammatical construct
![Page 265: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/265.jpg)
Andrew McCallum, UMass
Parser Evaluation
![Page 266: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/266.jpg)
Evaluation
Ultimate goal is to build systems for IE, QA, MT
People are rarely interested in syntactic analysis for its own sake
One could evaluate the whole system in order to evaluate the parser
For simplicity, modularity, and convenience, evaluate the parser directly
Compare parses from a parser with the result of hand parsing of a sentence (gold standard)
What is the objective criterion that we are trying to maximize?
![Page 267: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/267.jpg)
Evaluation
Tree Accuracy (Exact match)
It is a very tough standard!!!
But in many ways it is a sensible one to use
PARSEVAL Measures
For some purposes, partially correct parses can be useful
Originally for non-statistical parsers
Evaluate the component pieces of a parse
Measures : Precision, Recall, Crossing brackets
![Page 268: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/268.jpg)
Evaluation
(Labeled) Precision
How many brackets in the parse match those in the correct
tree (Gold standard)?
(Labeled) Recall
How many of the brackets in the correct tree are in the
parse?
Crossing brackets
Average of how many constituents in one tree cross over constituent boundaries in the other tree
[Figure: brackets B1 to B4 drawn over the words w1 … w8]
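The three measures above can be sketched in a few lines. This is a minimal illustration, not a reference PARSEVAL implementation: it assumes each constituent is given as a (label, start, end) triple over word positions, leaves out part-of-speech brackets, and counts crossing brackets for a single sentence rather than averaging over a corpus. The function names and the choice of which parse plays the gold standard are hypothetical.

```python
from collections import Counter

def parseval(gold, predicted):
    """Labeled precision, recall, F1 over (label, start, end) brackets."""
    matched = sum((Counter(gold) & Counter(predicted)).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1

def crossing_brackets(gold, predicted):
    """Count predicted spans that partially overlap some gold span."""
    def crosses(a, b):
        return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]
    gold_spans = {(s, e) for _, s, e in gold}
    return sum(any(crosses((s, e), g) for g in gold_spans)
               for _, s, e in predicted)

# "workers dumped sacks into a bin": word i covers span (i, i + 1).
# Hypothetical gold standard: the PP attaches to the VP.
gold = [("S", 0, 6), ("NP", 0, 1), ("VP", 1, 6), ("VP", 1, 3),
        ("NP", 2, 3), ("PP", 3, 6), ("NP", 4, 6)]
# Hypothetical parser output: the PP attaches to the object NP.
predicted = [("S", 0, 6), ("NP", 0, 1), ("VP", 1, 6), ("NP", 2, 6),
             ("NP", 2, 3), ("PP", 3, 6), ("NP", 4, 6)]

precision, recall, f1 = parseval(gold, predicted)
crossing = crossing_brackets(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f} crossing={crossing}")
```

In this toy case six of the seven brackets match in each direction, and the single wrong bracket (the NP over "sacks into a bin") crosses the gold VP over "dumped sacks".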
![Page 269: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/269.jpg)
Problems with PARSEVAL
Even vanilla PCFG performs quite well
It measures success at the level of individual decisions
You must make many consecutive decisions correctly to be
correct on the entire tree.
![Page 270: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/270.jpg)
Problems with PARSEVAL (2)
The story behind the numbers
The structure of the Penn Treebank
Flat -> few brackets -> low crossing brackets
Troublesome brackets are avoided
-> high precision/recall
The errors in precision and recall are minimal
In some cases a wrong PP attachment penalizes precision, recall, and crossing-bracket accuracy only minimally.
On the other hand, if a phrase is attached low instead of high, every node in the right-branching tree will be wrong: serious harm
![Page 271: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/271.jpg)
![Page 272: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/272.jpg)
Evaluation
Do PARSEVAL measures succeed in real tasks?
Many small parsing mistakes might not affect tasks of
semantic interpretation
(Bonnema 1996,1997)
Tree Accuracy of the Parser : 62%
Correct Semantic Interpretations : 88%
(Hermjakob and Mooney 1997)
English to German translation
At the moment, people feel PARSEVAL measures are adequate for comparing parsers
![Page 273: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/273.jpg)
Lexicalized Parsing
![Page 274: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/274.jpg)
Limitations of PCFGs
• PCFGs assume:
- Place invariance
- Context free: P(rule) independent of
• words outside span
• also, words with overlapping derivation
- Ancestor free: P(rule) independent of
• Non-terminals above.
• Lack of sensitivity to lexical information
• Lack of sensitivity to structural frequencies
![Page 275: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/275.jpg)
Lack of Lexical Dependency
Means that
P(VP -> V NP NP)
is independent of the particular verb involved!
... but much more likely with ditransitive verbs (like gave).
He gave the boy a ball.
He ran to the store.
![Page 276: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/276.jpg)
The Need for Lexical Dependency
Probabilities dependent on Lexical words
Problem 1 : Verb subcategorization
VP expansion is independent of the choice of verb
However …
Including actual words information when making decisions
about tree structure is necessary
               Verb
Expansion      come     take     think    want
VP -> V        9.5%     2.6%     4.6%     5.7%
VP -> V NP     1.1%     32.1%    0.2%     13.9%
VP -> V PP     34.5%    3.1%     7.1%     0.3%
VP -> V SBAR   6.6%     0.3%     73.0%    0.2%
VP -> V S      2.2%     1.3%     4.8%     70.8%
![Page 277: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/277.jpg)
Weakening the independence assumption of PCFG
Probabilities dependent on Lexical words
Problem 2 : Phrasal Attachment
The lexical content of phrases provides information for the decision
The syntactic category of the phrases provides very little information
A standard PCFG is a worse language model than n-gram models
![Page 278: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/278.jpg)
Another Case of PP Attachment Ambiguity
(a) (S (NP (NNS workers))
       (VP (VP (VBD dumped) (NP (NNS sacks)))
           (PP (IN into) (NP (DT a) (NN bin)))))
![Page 279: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/279.jpg)
Another case of PP attachment ambiguity
(b) (S (NP (NNS workers))
       (VP (VBD dumped)
           (NP (NP (NNS sacks))
               (PP (IN into) (NP (DT a) (NN bin))))))
![Page 280: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/280.jpg)
Another case of PP attachment ambiguity
Rules used in (a):
S -> NP VP
NP -> NNS
VP -> VP PP
VP -> VBD NP
NP -> NNS
PP -> IN NP
NP -> DT NN
NNS -> workers
VBD -> dumped
NNS -> sacks
IN -> into
DT -> a
NN -> bin
Rules used in (b):
S -> NP VP
NP -> NNS
NP -> NP PP
VP -> VBD NP
NP -> NNS
PP -> IN NP
NP -> DT NN
NNS -> workers
VBD -> dumped
NNS -> sacks
IN -> into
DT -> a
NN -> bin
If P(NP -> NP PP | NP) > P(VP -> VP PP | VP) then (b) is more probable, else (a) is more probable.
The attachment decision is completely independent of the words
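The comparison above can be checked numerically. The sketch below is only an illustration: trees are written as nested tuples, and every probability in `q` is a made-up value, not an estimate from any treebank. It shows that the two parses differ in exactly one rule each, so the ratio of their PCFG probabilities is just the ratio of those two rule probabilities.

```python
from collections import Counter
from math import prod

def rules(tree):
    """Yield (lhs, rhs) for each rule in a nested-tuple tree (label, child, ...)."""
    label, *children = tree
    yield label, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)

def tree_prob(tree, q):
    """PCFG probability of a tree: the product of its rule probabilities."""
    return prod(q[rule] for rule in rules(tree))

pp = ("PP", ("IN", "into"), ("NP", ("DT", "a"), ("NN", "bin")))
subj = ("NP", ("NNS", "workers"))
obj = ("NP", ("NNS", "sacks"))
tree_a = ("S", subj, ("VP", ("VP", ("VBD", "dumped"), obj), pp))  # VP attachment
tree_b = ("S", subj, ("VP", ("VBD", "dumped"), ("NP", obj, pp)))  # NP attachment

# Hypothetical rule probabilities, chosen only for illustration.
q = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("VP", "PP")): 0.3, ("VP", ("VBD", "NP")): 0.7,
    ("NP", ("NNS",)): 0.4, ("NP", ("NP", "PP")): 0.2, ("NP", ("DT", "NN")): 0.4,
    ("PP", ("IN", "NP")): 1.0,
    ("NNS", ("workers",)): 0.5, ("NNS", ("sacks",)): 0.5,
    ("VBD", ("dumped",)): 1.0, ("IN", ("into",)): 1.0,
    ("DT", ("a",)): 1.0, ("NN", ("bin",)): 1.0,
}

only_a = Counter(rules(tree_a)) - Counter(rules(tree_b))
only_b = Counter(rules(tree_b)) - Counter(rules(tree_a))
ratio = tree_prob(tree_a, q) / tree_prob(tree_b, q)
print(only_a, only_b, ratio)
```

Changing "dumped" or "into" to any other verb or preposition would change both tree probabilities by the same factor, so the preferred attachment would stay the same.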
![Page 281: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/281.jpg)
A Case of Coordination Ambiguity
(a) (NP (NP (NP (NNS dogs))
            (PP (IN in) (NP (NNS houses))))
        (CC and)
        (NP (NNS cats)))
(b) (NP (NP (NNS dogs))
        (PP (IN in)
            (NP (NP (NNS houses))
                (CC and)
                (NP (NNS cats)))))
Rules (the same list for (a) and for (b)):
NP -> NP CC NP
NP -> NP PP
NP -> NNS
PP -> IN NP
NP -> NNS
NP -> NNS
NNS -> dogs
IN -> in
NNS -> houses
CC -> and
NNS -> cats
Here the two parses have identical rules, and therefore have identical probability under any assignment of PCFG rule probabilities
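A quick way to confirm this is to compare the rule multisets of the two parses. The sketch below assumes a nested-tuple encoding of the trees; the helper name `rules` is hypothetical.

```python
from collections import Counter

def rules(tree):
    """Yield (lhs, rhs) for each rule in a nested-tuple tree (label, child, ...)."""
    label, *children = tree
    yield label, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)

dogs = ("NP", ("NNS", "dogs"))
houses = ("NP", ("NNS", "houses"))
cats = ("NP", ("NNS", "cats"))

# (a): [dogs in houses] and [cats]
tree_a = ("NP", ("NP", dogs, ("PP", ("IN", "in"), houses)), ("CC", "and"), cats)
# (b): dogs in [houses and cats]
tree_b = ("NP", dogs, ("PP", ("IN", "in"), ("NP", houses, ("CC", "and"), cats)))

same_rules = Counter(rules(tree_a)) == Counter(rules(tree_b))
print(tree_a != tree_b, same_rules)
```

Since a PCFG scores a tree by multiplying one probability per rule occurrence, equal multisets mean equal scores, whatever the probabilities are.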
![Page 282: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/282.jpg)
Weakening the independence assumption of PCFG
Probabilities dependent on Lexical words
Solution
Lexicalize the CFG: annotate each phrasal node with its head word
Background idea
Strong lexical dependencies between heads and their dependents
![Page 283: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/283.jpg)
Heads in Context-Free Rules
Add annotations specifying the “head” of each rule:
S -> NP VP
VP -> Vi
VP -> Vt NP
VP -> VP PP
NP -> DT NN
NP -> NP PP
PP -> IN NP
Vi -> sleeps
Vt -> saw
NN -> man
NN -> woman
NN -> telescope
DT -> the
IN -> with
IN -> in
Note: S=sentence, VP=verb phrase, NP=noun phrase, PP=prepositional
phrase, DT=determiner, Vi=intransitive verb, Vt=transitive verb, NN=noun,
IN=preposition
![Page 284: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/284.jpg)
More about Heads
Each context-free rule has one “special” child that is the head of the rule, e.g.,
S -> NP VP (VP is the head)
VP -> Vt NP (Vt is the head)
NP -> DT NN NN (NN is the head)
A core idea in linguistics
(X-bar Theory, Head-Driven Phrase Structure Grammar)
Some intuitions:
– The central sub-constituent of each rule.
– The semantic predicate in each rule.
![Page 285: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/285.jpg)
Rules which Recover Heads: An Example of Rules for NPs
If the rule contains NN, NNS, or NNP: choose the rightmost NN, NNS, or NNP
Else if the rule contains an NP: choose the leftmost NP
Else if the rule contains a JJ: choose the rightmost JJ
Else if the rule contains a CD: choose the rightmost CD
Else choose the rightmost child
e.g.,
NP -> DT NNP NN
NP -> DT NN NNP
NP -> NP PP
NP -> DT JJ
NP -> DT
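The cascade of NP head rules can be written directly as a function over the right-hand side of a rule. This is a sketch of the rules as stated on this slide only; the function name `np_head_index` is hypothetical, and real head tables cover many more categories.

```python
def np_head_index(children):
    """Return the index of the head child of an NP rule, given its child labels."""
    def rightmost(tags):
        for i in range(len(children) - 1, -1, -1):
            if children[i] in tags:
                return i
        return None

    i = rightmost({"NN", "NNS", "NNP"})      # rightmost nominal tag
    if i is not None:
        return i
    if "NP" in children:                     # leftmost NP
        return children.index("NP")
    for tags in ({"JJ"}, {"CD"}):            # rightmost JJ, then rightmost CD
        i = rightmost(tags)
        if i is not None:
            return i
    return len(children) - 1                 # default: rightmost child

examples = [["DT", "NNP", "NN"], ["DT", "NN", "NNP"], ["NP", "PP"],
            ["DT", "JJ"], ["DT"]]
heads = [rhs[np_head_index(rhs)] for rhs in examples]
print(heads)
```

For the five example rules the selected heads are NN, NNP, NP, JJ, and DT respectively.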
![Page 286: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/286.jpg)
Adding Headwords to Trees
(S(questioned) (NP(lawyer) (DT the) (NN lawyer))
               (VP(questioned) (Vt questioned)
                               (NP(witness) (DT the) (NN witness))))
A constituent receives its headword from its head child.
S -> NP VP (S receives headword from VP)
VP -> Vt NP (VP receives headword from Vt)
NP -> DT NN (NP receives headword from NN)
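Headword propagation is a bottom-up pass over the tree. The sketch below assumes a nested-tuple tree encoding and a small table, `HEAD_CHILD`, holding only the three head choices listed above; both the encoding and the names are assumptions made for illustration.

```python
# Index of the head child for each rule, taken from the three rules above.
HEAD_CHILD = {
    ("S", ("NP", "VP")): 1,
    ("VP", ("Vt", "NP")): 0,
    ("NP", ("DT", "NN")): 1,
}

def lexicalize(tree):
    """Return (annotated_tree, headword) for a nested-tuple tree."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return (label, children[0]), children[0]   # a tag's headword is its word
    results = [lexicalize(child) for child in children]
    rhs = tuple(child[0] for child in children)
    headword = results[HEAD_CHILD[(label, rhs)]][1]
    return (f"{label}({headword})", *(t for t, _ in results)), headword

tree = ("S",
        ("NP", ("DT", "the"), ("NN", "lawyer")),
        ("VP", ("Vt", "questioned"),
               ("NP", ("DT", "the"), ("NN", "witness"))))

annotated, root_head = lexicalize(tree)
print(annotated)
```

Propagating the head tag as well would only require returning a (word, tag) pair from the preterminal case.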
![Page 287: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/287.jpg)
Adding Headtags to Trees
(S(questioned, Vt) (NP(lawyer, NN) (DT the) (NN lawyer))
                   (VP(questioned, Vt) (Vt questioned)
                                       (NP(witness, NN) (DT the) (NN witness))))
Also propagate part-of-speech tags up the trees
(We’ll see soon why this is useful!)
![Page 288: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/288.jpg)
Explosion of number of rules
New rules might look like:
VP[gave] -> V[gave] NP[man] NP[book]
But this would be a massive explosion in number of
rules (and parameters)
![Page 289: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/289.jpg)
Sparseness & the Penn Treebank
• The Penn Treebank – 1 million words of parsed English WSJ – has been a key resource (because of the widespread reliance on supervised learning)
• But 1 million words is like nothing:
- 965,000 constituents, but only 66 WHADJP, of which only 6 aren’t how much or how many, but there is an infinite space of these (how clever/original/incompetent (at risk assessment and evaluation))
• Most of the probabilities that you would like to compute, you can’t compute
Sparseness & the Penn Treebank (2)
• Most intelligent processing depends on bilexical statistics: likelihoods of relationships between pairs of words.
• Extremely sparse, even on topics central to the WSJ:
- stocks plummeted: 2 occurrences
- stocks stabilized: 1 occurrence
- stocks skyrocketed: 0 occurrences
- #stocks discussed: 0 occurrences
• So far there has been very modest success augmenting the Penn Treebank with extra unannotated materials or using semantic classes or clusters (cf. Charniak 1997, Charniak 2000) – as soon as there are more than tiny amounts of annotated training data.
Probabilistic parsing
• Charniak (1997) expands each phrase structure tree in a single step.
• This is good for capturing dependencies between child nodes
• But it is bad because of data sparseness
• A pure dependency model, generating one child at a time, is worse
• But one can do better with in-between models, such as generating the children as a Markov process on both sides of the head (Collins 1997; Charniak 2000)
Correcting wrong context-freedom assumptions
                       Horizontal Markov Order
Vertical Order         h = 0    h = 1    h ≤ 2    h = 2    h = ∞
v = 0  No annotation   71.27    72.5     73.46    72.96    72.62
                       (854)    (3119)   (3863)   (6207)   (9657)
v ≤ 1  Sel. Parents    74.75    77.42    77.77    77.50    76.91
                       (2285)   (6564)   (7619)   (11398)  (14247)
v = 1  All Parents     74.68    77.42    77.81    77.50    76.81
                       (2984)   (7312)   (8367)   (12132)  (14666)
v ≤ 2  Sel. GParents   76.50    78.59    79.07    78.97    78.54
                       (4943)   (12374)  (13627)  (19545)  (20123)
v = 2  All GParents    76.74    79.18    79.74    79.07    78.72
                       (7797)   (15740)  (16994)  (22886)  (22002)
Correcting wrong context-freedom assumptions
(a) (VP^S (TO to)
          (VP^VP (VB see)
                 (PP^VP (IN if)
                        (NP^PP (NN advertising) (NNS works)))))
(b) (VP^S (TO^VP to)
          (VP^VP (VB^VP see)
                 (SBAR^VP (IN^SBAR if)
                          (S^SBAR (NP^S (NN^NP advertising))
                                  (VP^S (VBZ^VP works))))))
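The labels in trees (a) and (b) come from parent annotation: each node label is extended with the label of its parent. A minimal sketch follows, assuming nested-tuple trees; the function name and the `annotate_tags` flag are hypothetical. With the flag off only phrasal nodes are annotated, as in (a); with it on, part-of-speech tags are annotated too, as in (b).

```python
def parent_annotate(tree, parent=None, annotate_tags=False):
    """Append ^PARENT to node labels of a nested-tuple tree (label, child, ...)."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        # Preterminal: a part-of-speech tag over a single word.
        if annotate_tags and parent is not None:
            label = f"{label}^{parent}"
        return (label, children[0])
    new_label = f"{label}^{parent}" if parent is not None else label
    # Children see the original (unannotated) label as their parent.
    return (new_label,
            *(parent_annotate(child, label, annotate_tags) for child in children))

tree_a = ("VP", ("TO", "to"),
          ("VP", ("VB", "see"),
           ("PP", ("IN", "if"),
            ("NP", ("NN", "advertising"), ("NNS", "works")))))
tree_b = ("VP", ("TO", "to"),
          ("VP", ("VB", "see"),
           ("SBAR", ("IN", "if"),
            ("S", ("NP", ("NN", "advertising")),
                  ("VP", ("VBZ", "works"))))))

# Both fragments sit under an S node, hence parent="S" for the root VP.
annotated_a = parent_annotate(tree_a, parent="S")
annotated_b = parent_annotate(tree_b, parent="S", annotate_tags=True)
print(annotated_a)
print(annotated_b)
```

After annotation, a grammar read off the trees has separate symbols such as IN^SBAR and IN^PP, so the two uses of "if" no longer share one distribution.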
Correcting wrong context-freedom assumptions
                  Cumulative                Indiv.
Annotation        Size     F1      ∆ F1     ∆ F1
Baseline          7619     77.72   0.00     0.00
UNARY-INTERNAL    8065     78.15   0.43     0.43
UNARY-DT          8078     80.09   2.37     0.22
UNARY-RB          8081     80.25   2.53     0.48
TAG-PA            8520     80.62   2.90     2.57
SPLIT-IN          8541     81.19   3.47     2.17
SPLIT-AUX         9034     81.66   3.94     0.62
SPLIT-CC          9190     81.69   3.97     0.17
SPLIT-%           9255     81.81   4.09     0.20
TMP-NP            9594     82.25   4.53     1.12
GAPPED-S          9741     82.28   4.56     0.22
POSS-NP           9820     83.06   5.34     0.33
SPLIT-VP          10499    85.72   8.00     1.41
BASE-NP           11660    86.04   8.32     0.78
DOMINATES-V       14097    86.91   9.19     1.47
RIGHT-REC-NP      15276    87.04   9.32     1.99
![Page 290: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/290.jpg)
![Page 291: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/291.jpg)
Lexicalized, Markov out from head
![Page 292: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/292.jpg)
Collins 1997:
Markov model out from head
![Page 293: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/293.jpg)
Modeling Rule Productions as Markov Processes
Step 1: generate category of head child
S(told,V[6])  =>  S(told,V[6]) -> VP(told,V[6])
P_h(VP | S, told, V[6])
![Page 294: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/294.jpg)
Modeling Rule Productions as Markov Processes
Step 2: generate left modifiers in a Markov chain
S(told,V[6]) -> ?? VP(told,V[6])  =>  S(told,V[6]) -> NP(Hillary,NNP) VP(told,V[6])
P_h(VP | S, told, V[6]) × P_d(NP(Hillary,NNP) | S, VP, told, V[6], LEFT)
![Page 295: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/295.jpg)
Modeling Rule Productions as Markov Processes
Step 2: generate left modifiers in a Markov chain
S(told,V[6]) -> ?? NP(Hillary,NNP) VP(told,V[6])  =>  S(told,V[6]) -> NP(yesterday,NN) NP(Hillary,NNP) VP(told,V[6])
P_h(VP | S, told, V[6]) × P_d(NP(Hillary,NNP) | S, VP, told, V[6], LEFT)
× P_d(NP(yesterday,NN) | S, VP, told, V[6], LEFT)
![Page 296: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/296.jpg)
Modeling Rule Productions as Markov Processes
Step 2: generate left modifiers in a Markov chain
S(told,V[6]) -> ?? NP(yesterday,NN) NP(Hillary,NNP) VP(told,V[6])  =>  S(told,V[6]) -> STOP NP(yesterday,NN) NP(Hillary,NNP) VP(told,V[6])
P_h(VP | S, told, V[6]) × P_d(NP(Hillary,NNP) | S, VP, told, V[6], LEFT)
× P_d(NP(yesterday,NN) | S, VP, told, V[6], LEFT) × P_d(STOP | S, VP, told, V[6], LEFT)
![Page 297: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/297.jpg)
Modeling Rule Productions as Markov Processes
Step 3: generate right modifiers in a Markov chain
S(told,V[6]) -> STOP NP(yesterday,NN) NP(Hillary,NNP) VP(told,V[6]) ??  =>  S(told,V[6]) -> STOP NP(yesterday,NN) NP(Hillary,NNP) VP(told,V[6]) STOP
P_h(VP | S, told, V[6]) × P_d(NP(Hillary,NNP) | S, VP, told, V[6], LEFT)
× P_d(NP(yesterday,NN) | S, VP, told, V[6], LEFT) × P_d(STOP | S, VP, told, V[6], LEFT)
× P_d(STOP | S, VP, told, V[6], RIGHT)
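The three steps multiply together into one probability for the whole rule: one factor for the head child, one factor per left modifier plus a left STOP, and one factor per right modifier plus a right STOP. The sketch below illustrates that product only; the table names `p_head` and `p_mod`, the key layout, and every probability value are hypothetical, and the distance refinement is left out.

```python
def rule_probability(context, head, left_mods, right_mods, p_head, p_mod):
    """Probability of a lexicalized rule generated head-outward.

    context    -- (parent label, headword, headtag), e.g. ("S", "told", "V[6]")
    left_mods  -- left modifiers, listed from the head outward
    right_mods -- right modifiers, listed from the head outward
    """
    prob = p_head[(head, context)]                 # Step 1: head child
    for side, mods in (("LEFT", left_mods), ("RIGHT", right_mods)):
        for mod in [*mods, "STOP"]:                # Steps 2 and 3: modifiers, then STOP
            prob *= p_mod[(mod, context, head, side)]
    return prob

context = ("S", "told", "V[6]")
# Made-up probabilities, for illustration only.
p_head = {("VP", context): 0.7}
p_mod = {
    ("NP(Hillary,NNP)", context, "VP", "LEFT"): 0.1,
    ("NP(yesterday,NN)", context, "VP", "LEFT"): 0.05,
    ("STOP", context, "VP", "LEFT"): 0.6,
    ("STOP", context, "VP", "RIGHT"): 0.9,
}

prob = rule_probability(context, "VP",
                        ["NP(Hillary,NNP)", "NP(yesterday,NN)"], [],
                        p_head, p_mod)
print(prob)
```

Because each modifier gets its own factor, a rule never seen as a whole in training can still receive a nonzero probability, which is the point of breaking the rule down.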
![Page 298: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/298.jpg)
A Refinement: Adding a Distance Variable
Δ = 1 if position is adjacent to the head.
S(told,V[6]) -> ?? VP(told,V[6])  =>  S(told,V[6]) -> NP(Hillary,NNP) VP(told,V[6])
P_h(VP | S, told, V[6])
× P_d(NP(Hillary,NNP) | S, VP, told, V[6], LEFT, Δ = 1)
![Page 299: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/299.jpg)
Adding dependency on structure
![Page 300: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/300.jpg)
Weakening the independence assumption of PCFG
Probabilities dependent on structural context
PCFGs are also deficient on purely structural grounds
Really context independent?
Expansion       % as Subj   % as Obj
NP -> PRP       13.7%       2.1%
NP -> NNP       3.5%        0.9%
NP -> DT NN     5.6%        4.6%
NP -> NN        1.4%        2.8%
NP -> NP SBAR   0.5%        2.6%
NP -> NP PP     5.6%        14.1%
![Page 301: Context-Free Parsing - Northeastern University · Context-free grammar The most common way of modeling constituency. CFG = Context-Free Grammar = Phrase Structure Grammar = BNF =](https://reader034.vdocuments.net/reader034/viewer/2022052614/6060fbdffb60cb688035e3ae/html5/thumbnails/301.jpg)
Weakening the independence assumption of PCFG