cs460/626 : natural language processing/speech, nlp and the web (lecture 20– parsing) pushpak...
TRANSCRIPT
![Page 1: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/1.jpg)
CS460/626 : Natural Language Processing/Speech, NLP and the Web
(Lecture 20– Parsing)
Pushpak BhattacharyyaCSE Dept., IIT Bombay
28th Feb, 2011
![Page 2: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/2.jpg)
Need for Parsing Sentences are linear structures, on
the face of it Is that the right view?
Is there a hierarchy- a tree- hidden behind the linear structure?
Is there a principle in branching What are the constituents and
when should the constituent give rise to children?
What is the hierarchy building principle?
![Page 3: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/3.jpg)
Deeper trees needed for capturing sentence structure
NP
PPAP
big
The
of poems
with the blue cover
[The big book of poems with theBlue cover] is on the table.
book
This wont do!
PP
![Page 4: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/4.jpg)
PPs are at the same level: flat with respect to the head word “book”
NP
PPAP
big
The
of poems
with the blue cover
[The big book of poems with theBlue cover] is on the table.
book
No distinction in terms of dominance or c-
command
PP
![Page 5: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/5.jpg)
“Constituency test of Replacement” runs into problems
One-replacement: I bought the big [book of poems with
the blue cover] not the small [one] One-replacement targets book of
poems with the blue cover Another one-replacement:
I bought the big [book of poems] with the blue cover not the small [one] with the red cover
One-replacement targets book of poems
![Page 6: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/6.jpg)
More deeply embedded structureNP
PP
AP
big
The
of poems
with the blue cover
N’1
Nbook
PP
N’2
N’3
![Page 7: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/7.jpg)
To target N1’
I want [NPthis [N’big book of poems with the red cover] and not [Nthat [None]]
![Page 8: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/8.jpg)
Other languages
NP
PPAP
big
The
of poems
with the blue cover
[niil jilda vaalii kavita kii kitaab]
book
English
NP
PPAP
niil jilda vaalii kavita kii
kitaab
PP
badii
Hindi
PP
![Page 9: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/9.jpg)
Other languages: contd
NP
PPAP
big
The
of poems
with the blue cover
[niil malaat deovaa kavitar bai ti]
book
English
NP
PPAP
niil malaat deovaa kavitar
bai
PP
motaa
Bengali
PPti
![Page 10: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/10.jpg)
Grammar and Parsing Algorithms
![Page 11: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/11.jpg)
A simplified grammar
S NP VP NP DT N | N VP V ADV | V
![Page 12: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/12.jpg)
A segment of English Grammar
S’(C) S S{NP/S’} VP VP(AP+) (VAUX) V (AP+)
({NP/S’}) (AP+) (PP+) (AP+) NP(D) (AP+) N (PP+) PPP NP AP(AP) A
![Page 13: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/13.jpg)
Example Sentence
People laugh1 2 3
Lexicon:People - N, V Laugh - N, V
These are positions
This indicate that both Noun and Verb is
possible for the word “People”
![Page 14: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/14.jpg)
Top-Down Parsing
State Backup State Action
-----------------------------------------------------------------------------------------------------
1. ((S) 1) - -
2. ((NP VP)1) - -
3a. ((DT N VP)1) ((N VP) 1) -
3b. ((N VP)1) - -
4. ((VP)2) - Consume “People”
5a. ((V ADV)2) ((V)2) -
6. ((ADV)3) ((V)2) Consume “laugh”
5b. ((V)2) - -
6. ((.)3) - Consume “laugh”
Termination Condition : All inputs over. No symbols remaining.
Note: Input symbols can be pushed back.
Position of input pointer
![Page 15: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/15.jpg)
Discussion for Top-Down Parsing This kind of searching is goal driven. Gives importance to textual precedence
(rule precedence). No regard for data, a priori (useless
expansions made).
![Page 16: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/16.jpg)
Bottom-Up Parsing
Some conventions:N12
S1? -> NP12 ° VP2?
Represents positions
End position unknownWork on the LHS done, while the work on RHS remaining
![Page 17: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/17.jpg)
Bottom-Up Parsing (pictorial representation)
S -> NP12 VP23 °
People Laugh 1 2 3
N12 N23
V12 V23
NP12 -> N12 ° NP23 -> N23 °
VP12 -> V12 ° VP23 -> V23 °
S1? -> NP12 ° VP2?
![Page 18: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/18.jpg)
Problem with Top-Down Parsing
• Left Recursion• Suppose you have A-> AB rule. Then we will have the expansion as
follows:• ((A)K) -> ((AB)K) -> ((ABB)K) ……..
![Page 19: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/19.jpg)
Combining top-down and bottom-up strategies
![Page 20: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/20.jpg)
Top-Down Bottom-Up Chart Parsing
Combines advantages of top-down & bottom-up parsing.
Does not work in case of left recursion. e.g. – “People laugh”
People – noun, verb Laugh – noun, verb
Grammar – S NP VPNP DT N | N
VP V ADV | V
![Page 21: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/21.jpg)
Transitive Closure
People laugh
1 2 3
S NP VP NP N VP V NP DT N S NPVP S NP VP NP N VP V ADV success
VP V
![Page 22: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/22.jpg)
Arcs in Parsing
Each arc represents a chart which records Completed work (left of ) Expected work (right of )
![Page 23: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 28 th Feb, 2011](https://reader035.vdocuments.net/reader035/viewer/2022062620/5519ca515503468b0c8b458a/html5/thumbnails/23.jpg)
Example
People laugh loudly
1 2 3 4
S NP VP NP N VP V VP V ADVNP DT N S NPVP VP VADV S NP VPNP N VP V ADV S NP VP
VP V