cs460/626 : natural language processing/speech, nlp and the …cs626-460-2012/lecture... · 2012....

87
CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 13, 14– Parsing Algorithms; Parsing in case of Ambiguity; Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 2 nd Feb, 2011

Upload: others

Post on 09-Oct-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CS460/626 : Natural Language Processing/Speech, NLP and the Web

(Lecture 13, 14– Parsing Algorithms; Parsing in case of Ambiguity; Probabilistic Parsing)

Pushpak BhattacharyyaCSE Dept., IIT Bombay

2nd Feb, 2011

Page 2: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Dealing With Structural Ambiguity

� Multiple parses for a sentence

� The man saw the boy with a telescope.

� The man saw the mountain with a � The man saw the mountain with a telescope.

� The man saw the boy with the ponytail.

At the level of syntax, all these sentences are ambiguous. But semantics can disambiguate 2nd & 3rd sentence.

Page 3: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Prepositional Phrase (PP) Attachment Problem

V – NP1 – P – NP2(Here P means preposition)

NP attaches to NP ?NP2 attaches to NP1 ?

or NP2 attaches to V ?

Page 4: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

• Xerox Parsers•XLE Parser (Freely available)•XIP Parser (expensive)

Linguistic Data Consortium(LDC at UPenn)

European Language Resource Association (ELRA)

Linguistic Data Consortium for Indian Languages (LDC-IL)

Europe India

Page 5: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Parse Trees for a Structurally Ambiguous Sentence

Let the grammar be –

S → NP VP

NP → DT N | DT N PPNP → DT N | DT N PP

PP → P NP

VP → V NP PP | V NP

For the sentence,

“I saw a boy with a telescope”

Page 6: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Parse Tree - 1S

NP VP

N V NP

Det N PP

P NP

Det N

I saw

a boy

with

a telescope

Page 7: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Parse Tree -2

S

NP VP

N V NP PP

Det N P NP

Det NI saw

a boy with

a telescope

Page 8: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Parsing Structural Ambiguity

Page 9: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Parsing for Structurally Ambiguous Sentences

� Sentence “I saw a boy with a telescope”

� Grammar:

S � NP VP

NP ART N | ART N PP | PRONNP � ART N | ART N PP | PRON

VP � V NP PP | V NP

ART� a | an | the

N � boy | telescope

PRON � I

V � saw

Page 10: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Ambiguous Parses

� Two possible parses:

� PP attached with Verb (i.e. I used a telescope to see)

� ( S ( NP ( PRON “I” ) ) ( VP ( V “saw” )

( NP ( (ART “a”) ( N “boy”))

( PP (P “with”) (NP ( ART “a” ) ( N ( PP (P “with”) (NP ( ART “a” ) ( N “telescope”)))))

� PP attached with Noun (i.e. boy had a telescope)

� ( S ( NP ( PRON “I” ) ) ( VP ( V “saw” )

( NP ( (ART “a”) ( N “boy”)

(PP (P “with”) (NP ( ART “a” ) ( N “telescope”))))))

Page 11: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

Page 12: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

Page 13: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

Page 14: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

3

B( ( PRON VP ) 1 ) − −

Page 15: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

3

B( ( PRON VP ) 1 ) − −

4 ( ( VP ) 2 ) − Consumed “I”

Page 16: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

3

B( ( PRON VP ) 1 ) − −

4 ( ( VP ) 2 ) − Consumed “I”

5 ( ( V NP PP ) 2 ) ( ( V NP ) 2 ) − Verb Attachment Rule used

Page 17: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

3

B( ( PRON VP ) 1 ) − −

4 ( ( VP ) 2 ) − Consumed “I”

5 ( ( V NP PP ) 2 ) ( ( V NP ) 2 ) − Verb Attachment Rule used

6 ( ( NP PP ) 3 ) − Consumed “saw”

Page 18: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

3

B( ( PRON VP ) 1 ) − −

4 ( ( VP ) 2 ) − Consumed “I”

5 ( ( V NP PP ) 2 ) ( ( V NP ) 2 ) − Verb Attachment Rule used

6 ( ( NP PP ) 3 ) − Consumed “saw”

7 ( ( ART N PP ) 3 ) (a) ( ( ART N PP PP ) 3 )

(b) ( ( PRON PP ) 3 )

Page 19: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

3

B( ( PRON VP ) 1 ) − −

4 ( ( VP ) 2 ) − Consumed “I”

5 ( ( V NP PP ) 2 ) ( ( V NP ) 2 ) − Verb Attachment Rule used

6 ( ( NP PP ) 3 ) − Consumed “saw”

7 ( ( ART N PP ) 3 ) (a) ( ( ART N PP PP ) 3 )

(b) ( ( PRON PP ) 3 )

8 ( ( N PP) 4 ) − Consumed “a”

Page 20: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

3

B( ( PRON VP ) 1 ) − −

4 ( ( VP ) 2 ) − Consumed “I”

5 ( ( V NP PP ) 2 ) ( ( V NP ) 2 ) − Verb Attachment Rule used

6 ( ( NP PP ) 3 ) − Consumed “saw”

7 ( ( ART N PP ) 3 ) (a) ( ( ART N PP PP ) 3 )

(b) ( ( PRON PP ) 3 )

8 ( ( N PP) 4 ) − Consumed “a”

9 ( ( PP ) 5 ) − Consumed

Page 21: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

2 ( ( NP VP ) 1 ) − − Use NP � ART N |

ART N PP | PRON

3 ( ( ART N VP ) 1 ) (a) ( ( ART N PP VP ) 1 )

(b) ( ( PRON VP ) 1)

− ART does not match “I”, backup

state (b) used

(b) ( ( PRON VP ) 1)

3

B( ( PRON VP ) 1 ) − −

4 ( ( VP ) 2 ) − Consumed “I”

5 ( ( V NP PP ) 2 ) ( ( V NP ) 2 ) − Verb Attachment Rule used

6 ( ( NP PP ) 3 ) − Consumed “saw”

7 ( ( ART N PP ) 3 ) (a) ( ( ART N PP PP ) 3 )

(b) ( ( PRON PP ) 3 )

8 ( ( N PP) 4 ) − Consumed “a”

9 ( ( PP ) 5 ) − Consumed

Page 22: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

… … … … …

7 ( ( ART N PP ) 3 ) (a) ( ( ART N PP PP ) 3 )

(b) ( ( PRON PP ) 3 )( ( PRON PP ) 3 )

8 ( ( N PP) 4 ) − Consumed “a”

9 ( ( PP ) 5 ) − Consumed “boy”

10 ( ( P NP ) 5 ) − −

11 ( ( NP ) 6 ) − Consumed “with”

Page 23: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

… … … … …

7 ( ( ART N PP ) 3 ) (a) ( ( ART N PP PP ) 3 )

(b) ( ( PRON PP ) 3 )( ( PRON PP ) 3 )

8 ( ( N PP) 4 ) − Consumed “a”

9 ( ( PP ) 5 ) − Consumed “boy”

10 ( ( P NP ) 5 ) − −

11 ( ( NP ) 6 ) − Consumed “with”

12 ( ( ART N ) 6 ) (a) ( ( ART N PP ) 6 )(b) ( ( PRON ) 6)

Page 24: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

… … … … …

7 ( ( ART N PP ) 3 ) (a) ( ( ART N PP PP ) 3 )

(b) ( ( PRON PP ) 3 )( ( PRON PP ) 3 )

8 ( ( N PP) 4 ) − Consumed “a”

9 ( ( PP ) 5 ) − Consumed “boy”

10 ( ( P NP ) 5 ) − −

11 ( ( NP ) 6 ) − Consumed “with”

12 ( ( ART N ) 6 ) (a) ( ( ART N PP ) 6 )(b) ( ( PRON ) 6)

13 ( ( N ) 7 ) − Consumed “a”

Page 25: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down ParseState Backup State Action Comments

1 ( ( S ) 1 ) − − Use S � NP VP

… … … … …

7 ( ( ART N PP ) 3 ) (a) ( ( ART N PP PP ) 3 )

(b) ( ( PRON PP ) 3 )( ( PRON PP ) 3 )

8 ( ( N PP) 4 ) − Consumed “a”

9 ( ( PP ) 5 ) − Consumed “boy”

10 ( ( P NP ) 5 ) − −

11 ( ( NP ) 6 ) − Consumed “with”

12 ( ( ART N ) 6 ) (a) ( ( ART N PP ) 6 )(b) ( ( PRON ) 6)

13 ( ( N ) 7 ) − Consumed “a”

14 ( ( − ) 8 ) − Consume “telescope”

Finish Parsing

Page 26: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Top Down Parsing -Observations

� Top down parsing gave us the Verb Attachment Parse Tree (i.e., I used a telescope)telescope)

� To obtain the alternate parse tree, the backup state in step 5 will have to be invoked

� Is there an efficient way to obtain all parses ?

Page 27: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

Bottom Up Parse

Colour Scheme :

� Blue for Normal Parse

� Green for Verb Attachment Parse

� Purple for Noun Attachment Parse

� Red for Invalid Parse

Page 28: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

Bottom Up Parse

NP12 � PRON12�

S �NP �VPS1?�NP12�VP2?

Page 29: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3?

S1?�NP12�VP2? VP2?�V23�NP3?PP??

Page 30: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PPS1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5?

Page 31: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

NP35�ART34N45�

NP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5?

Page 32: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

NP35�ART34N45�

NP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5?

VP25�V23NP35�

S15�NP12VP25�

VP2?�V23NP35�PP5?

Page 33: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

PP5?�P56�NP6?NP35�ART34N45�

NP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5?

VP25�V23NP35�

S15�NP12VP25�

VP2?�V23NP35�PP5?

Page 34: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

PP5?�P56�NP6?NP35�ART34N45� NP68�ART67�N7?

NP6 �ART �N PPNP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5? NP6?�ART67�N78PP8?

VP25�V23NP35�

S15�NP12VP25�

VP2?�V23NP35�PP5?

Page 35: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

PP5?�P56�NP6?NP35�ART34N45� NP68�ART67�N7?

NP6 �ART �N PP

NP68�ART67N78�

NP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5? NP6?�ART67�N78PP8?

VP25�V23NP35�

S15�NP12VP25�

VP2?�V23NP35�PP5?

Page 36: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

PP5?�P56�NP6?NP35�ART34N45� NP68�ART67�N7?

NP6 �ART �N PP

NP68�ART67N78�

NP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5? NP6?�ART67�N78PP8?

VP25�V23NP35�

S15�NP12VP25�

VP2?�V23NP35�PP5? PP58�P56NP68�

Page 37: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

PP5?�P56�NP6?NP35�ART34N45� NP68�ART67�N7?

NP6 �ART �N PP

NP68�ART67N78�

NP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5? NP6?�ART67�N78PP8?

VP25�V23NP35�

S15�NP12VP25�

VP2?�V23NP35�PP5? PP58�P56NP68�

NP38�ART34N45PP58�

Page 38: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

PP5?�P56�NP6?NP35�ART34N45� NP68�ART67�N7?

NP6 �ART �N PP

NP68�ART67N78�

NP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5? NP6?�ART67�N78PP8?

VP25�V23NP35�

S15�NP12VP25�

VP2?�V23NP35�PP5? PP58�P56NP68�

NP38�ART34N45PP58�

VP28�V23NP38�VP28�V23NP35PP58�

Page 39: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

I saw a boy with a telescope

1 2 3 4 5 6 7 8

NP �ART N �PP

Bottom Up Parse

NP12 � PRON12�

S �NP �VP VP �V �NP PP

VP2?�V23�NP3? NP35 �ART34�N45

NP �ART �N PP

PP5?�P56�NP6?NP35�ART34N45� NP68�ART67�N7?

NP6 �ART �N PP

NP68�ART67N78�

NP3?�ART34N45�PP5?S1?�NP12�VP2? VP2?�V23�NP3?PP?? NP3?�ART34�N45PP5? NP6?�ART67�N78PP8?

VP25�V23NP35�

S15�NP12VP25�

VP2?�V23NP35�PP5? PP58�P56NP68�

NP38�ART34N45PP58�

VP28�V23NP38�VP28�V23NP35PP58�

S18�NP12VP28�

Page 40: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Bottom Up Parsing -Observations

� Both Noun Attachment and Verb Attachment Parses obtained by simply systematically applying the rulessystematically applying the rules

� Numbers in subscript help in verifying the parse and getting chunks from the parse

Page 41: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Exercise

For the sentence,

“The man saw the boy with a telescope” & the grammar given previously, & the grammar given previously, compare the performance of top-down, bottom-up & top-down chart parsing.

Page 42: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Start of Probabilistic Parsing

Page 43: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Example of Sentence labeling: Parsing

[S1[S[S[VP[VBCome][NP[NNPJuly]]]]

[,,]

[CC and]

[S [NP [DT the] [JJ IIT] [NN campus]]

[ [ is][VP [AUX is]

[ADJP [JJ abuzz]

[PP[IN with]

[NP[ADJP [JJ new] [CC and] [ VBG returning]]

[NNS students]]]]]]

[..]]]

Page 44: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Noisy Channel Modeling

Noisy ChannelSource sentence

Target parse

T*= argmax [P(T|S)]T

= argmax [P(T).P(S|T)]T

= argmax [P(T)], since given the parse theT sentence is completely

determined and P(S|T)=1

Page 45: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Corpus

� A collection of text called corpus, is used for collecting various language data

� With annotation: more information, but manual labor intensive

� Practice: label automatically; correct manually

� The famous Brown Corpus contains 1 million tagged words.

� Switchboard: very famous corpora 2400 conversations, 543 speakers, many US dialects, annotated with orthography and phonetics

Page 46: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Discriminative vs. Generative Model

W* = argmax (P(W|SS))W

Compute directly fromP(W|SS)

Compute fromP(W).P(SS|W)

DiscriminativeModel

GenerativeModel

Page 47: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Language Models

� N-grams: sequence of n consecutive words/characters

� Probabilistic / Stochastic Context Free Grammars:Grammars:� Simple probabilistic models capable of handling

recursion

� A CFG with probabilities attached to rules

� Rule probabilities � how likely is it that a particular rewrite rule is used?

Page 48: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

PCFGs

� Why PCFGs? � Intuitive probabilistic models for tree-structured

languages

� Algorithms are extensions of HMM algorithms

� Better than the n-gram model for language � Better than the n-gram model for language modeling.

Page 49: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Formal Definition of PCFG

� A PCFG consists of

� A set of terminals {wk}, k = 1,….,V

{wk} = { child, teddy, bear, played…}

� A set of non-terminals {Ni}, i = 1,…,n

{Ni} = { NP, VP, DT…}

� A designated start symbol N1

� A set of rules {Ni → ζj}, where ζj is a sequence of terminals & non-terminals

NP → DT NN

� A corresponding set of rule probabilities

Page 50: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Rule Probabilities

� Rule probabilities are such that

E.g., P( NP → DT NN) = 0.2

P( NP → NN) = 0.5

i

P(N ) 1i ji ζ∀ → =∑

P( NP → NN) = 0.5

P( NP → NP PP) = 0.3

� P( NP → DT NN) = 0.2 � Means 20 % of the training data parses

use the rule NP → DT NN

Page 51: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Probabilistic Context Free Grammars

� S → NP VP 1.0

� NP → DT NN 0.5

� NP → NNS 0.3

� NP → NP PP 0.2

� DT → the 1.0

� NN → gunman 0.5

� NN → building 0.5

� VBD → sprayed 1.0� NP → NP PP 0.2

� PP → P NP 1.0

� VP → VP PP 0.6

� VP → VBD NP 0.4

� VBD → sprayed 1.0

� NNS → bullets 1.0

Page 52: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Example Parse t1`

� The gunman sprayed the building with bullets.S1.0

NP0.5 VP0.6

DT NN PP

P (t1) = 1.0 * 0.5 * 1.0 * 0.5 * 0.6 * 0.4 * 1.0 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.00225

VPDT1.0NN0.5

VBD1.0NP0.5

PP1.0

DT1.0 NN0.5

P1.0 NP0.3

NNS1.0

bullets

with

buildingthe

The gunman

sprayed

0.00225VP0.4

Page 53: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Another Parse t2

S1.0

NP0.5 VP0.4

DT NN0.5 VBD NP

P (t2) = 1.0 * 0.5 * 1.0 * 0.5 * 0.4 * 1.0 * 0.2 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0

=

� The gunman sprayed the building with bullets.

DT1.0NN0.5 VBD1.0

NP0.5 PP1.0

DT1.0 NN0.5 P1.0 NP0.3

NNS1.0

bullets

withbuildingthe

Thegunman sprayed

NP0.2 = 0.0015

Page 54: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Probability of a sentence

� Notation :

� wab – subsequence wa….wb

� Nj dominates wa….wb

or yield(Nj) = wa….wbwa……………..wb

Nj

the..sweet..teddy ..bear

NP

1

1 1

1

: ( )

( ) ( , )

= ( ) ( | )

( )m

m mt

mt

t yield t w

P w P w t

P t P w t

P t=

=

=

Where t is a parse tree of the sentence

• Probability of a sentence = P(w1m)

1( | ) = 1mP w tQ

If t is a parse tree for the sentence w1m, this will be 1 !!

Page 55: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Assumptions of the PCFG model

� Place invariance :

P(NP → DT NN) is same in locations 1 and 2

� Context-free :

P(NP → DT NN | anything outside “The child”) = P(NP → DT NN)= P(NP → DT NN)

� Ancestor free : At 2,

P(NP → DT NN|its ancestor is VP) = P(NP →DT NN)

S

NP

The child

VP

NP

The toy

1

2

Page 56: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Probability of a parse tree

� Domination :We say Nj dominates from k to l, symbolized as , if Wk,l is derived from Nj

� P (tree |sentence) = P (tree | S1,l ) where S1,l means that the start symbol S dominates the word sequence W1,l

P (t |s) approximately equals joint probability � P (t |s) approximately equals joint probability of constituent non-terminals dominating the sentence fragments (next slide)

Page 57: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Probability of a parse tree (cont.)S1,l

NP1,2 VP3,l

N2V3,3 PP4,l

P4,4 NP5,lw

2

DT1

w1 w

3

P ( t|s ) = P (t | S1,l )

= P ( NP1,2, DT1,1 , w1,

N2,2, w2,

VP , V , ww

4 w5

wl

VP3,l, V3,3 , w3,

PP4,l, P4,4 , w4, NP5,l, w5…l | S1,l )

= P ( NP1,2 , VP3,l | S1,l) * P ( DT1,1 , N2,2 | NP1,2) * D(w1 | DT1,1) * P (w2 | N2,2) * P (V3,3, PP4,l | VP3,l) * P(w3 | V3,3) * P( P4,4, NP5,l | PP4,l ) * P(w4|P4,4) * P (w5…l | NP5,l)

(Using Chain Rule, Context Freeness and Ancestor Freeness )

Page 58: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

HMM ↔ PCFG

� O observed sequence ↔ w1m

sentence

X state sequence ↔ t parse tree� X state sequence ↔ t parse tree

� µ model ↔ G grammar

� Three fundamental questions

Page 59: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

HMM ↔ PCFG

� How likely is a certain observation given the model? ↔ How likely is a sentence given the

grammar?

How to choose a state sequence which best

1( | )mP w G( | )P O µ ↔� How to choose a state sequence which best

explains the observations? ↔ How to choose a

parse which best supports the sentence?

arg max ( | , )X

P X O µ 1arg max ( | , )mt

P t w G↔

Page 60: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

HMM ↔ PCFG

� How to choose the model parameters that best explain the observed data? ↔ How to

choose rule probabilities which maximize the probabilities of the observed sentences?probabilities of the observed sentences?

arg max ( | )P Oµ

µ 1arg max ( | )mG

P w G

Page 61: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Interesting ProbabilitiesN1

NP

(4,5)NPβ

What is the probability of having a NP at this position such that it will derive “the building” ? -

Inside Probabilities

The gunman sprayed the building with bullets1 2 3 4 5 6 7

(4,5) NPαWhat is the probability of starting from N1

and deriving “The gunman sprayed”, a NP and “with bullets” ? -

Outside Probabilities

Page 62: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Interesting Probabilities

� Random variables to be considered

� The non-terminal being expanded. E.g., NP

� The word-span covered by the non-terminal.

E.g., (4,5) refers to words “the building”

� While calculating probabilities, consider:

� The rule to be used for expansion : E.g., NP → DT NN

� The probabilities associated with the RHS non-terminals : E.g., DT subtree’s inside/outside probabilities & NN subtree’s inside/outside probabilities

Page 63: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Outside Probability

� αj(p,q) : The probability of beginning with N1

& generating the non-terminal Njpq and all

words outside wp..wq

1( 1) ( 1)( , ) ( , , | )jj p pq q mp q P w N w Gα − +=

w1 ………wp-1 wp…wqwq+1 ……… wm

N1

Nj

Page 64: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Inside Probabilities

� βj(p,q) : The probability of generating the words wp..wq starting with the non-terminal Nj

pq. ( , ) ( | , )jj pq pqp q P w N Gβ =

N1

w1 ………wp-1 wp…wqwq+1 ……… wm

β

αN1

Nj

Page 65: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Outside & Inside Probabilities: example

4,5

(4,5) for "the building"

(The gunman sprayed, , with bullets | )

NP

P NP G

α=

N1

4,5(4,5) for "the building" (the building | , )NP P NP Gβ =

The gunman sprayed the building with bullets

1 2 3 4 5 6 7

NP

Page 66: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

An important parsing algo

Page 67: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

Illustrating CYK [Cocke, Younger, Kashmi] Algo

� S → NP VP 1.0

� NP → DT NN 0.5

� NP → NNS 0.3

� NP → NP PP 0.2

• DT → the 1.0

• NN → gunman 0.5

• NN → building 0.5

• VBD → sprayed 1.0� NP → NP PP 0.2

� PP → P NP 1.0

� VP → VP PP 0.6

� VP → VBD NP 0.4

• VBD → sprayed 1.0

• NNS → bullets 1.0

Page 68: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: Start with (0,1)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT

1 -------

2 ------- -------2 ------- ---------

3 ------- ---------

--------

4 --------

---------

--------

---------

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 69: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: Keep filling diagonals0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT

1 ------- NN

2 ------- -------2 ------- ---------

3 ------- ---------

--------

4 --------

---------

--------

---------

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 70: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: Try getting higher level structures

0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP

1 ------- NN

2 ------- -------2 ------- ---------

3 ------- ---------

--------

4 --------

---------

--------

---------

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 71: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: Diagonal continues0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP

1 ------- NN

2 ------- ------- VBD2 ------- ---------

VBD

3 ------- ---------

--------

4 --------

---------

--------

---------

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 72: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

1 ------- NN ---------

2 ------- ---------

VBD

3 ------- ---------

--------

4 --------

---------

--------

---------

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 73: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

1 ------- NN ---------

2 ------- ---------

VBD

3 ------- ---------

--------

DT

4 --------

---------

--------

---------

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 74: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

1 ------- NN --------

---------- -

2 ------- ---------

VBD ---------

3 ------- ---------

--------

DT

4 --------

---------

--------

---------

NN

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 75: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: starts filling the 5th column0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

1 ------- NN --------

---------- -

2 ------- ---------

VBD ---------

3 ------- ---------

--------

DT NP

4 --------

---------

--------

---------

NN

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 76: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

1 ------- NN --------

---------- -

2 ------- ---------

VBD ---------

VP

3 ------- ---------

--------

DT NP

4 --------

---------

--------

---------

NN

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 77: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

1 ------- NN --------

---------

---------- - -

2 ------- ---------

VBD ---------

VP

3 ------- ---------

--------

DT NP

4 --------

---------

--------

---------

NN

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 78: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: S found, but NO termination!0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S

1 ------- NN --------

---------

---------- - -

2 ------- ---------

VBD ---------

VP

3 ------- ---------

--------

DT NP

4 --------

---------

--------

---------

NN

5 --------

---------

--------

---------

---------

6 --------

---------

--------

---------

---------

---------

Page 79: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S

1 ------- NN --------

---------

---------- - -

2 ------- ---------

VBD ---------

VP

3 ------- ---------

--------

DT NP

4 --------

---------

--------

---------

NN

5 --------

---------

--------

---------

---------

P

6 --------

---------

--------

---------

---------

---------

Page 80: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S ---------

1 ------- NN --------

---------

---------

---------- - - -

2 ------- ---------

VBD ---------

VP ---------

3 ------- ---------

--------

DT NP ---------

4 --------

---------

--------

---------

NN ---------

5 --------

---------

--------

---------

---------

P

6 --------

---------

--------

---------

---------

---------

Page 81: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: Control moves to last column0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S ---------

1 ------- NN --------

---------

---------

---------

2 ------- ---------

VBD ---------

VP ---------

3 ------- ---------

--------

DT NP ---------

4 --------

---------

--------

---------

NN ---------

5 --------

---------

--------

---------

---------

P

6 --------

---------

--------

---------

---------

---------

NPNNS

Page 82: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S ---------

1 ------- NN --------

---------

---------

---------- - - -

2 ------- ---------

VBD ---------

VP ---------

3 ------- ---------

--------

DT NP ---------

4 --------

---------

--------

---------

NN ---------

5 --------

---------

--------

---------

---------

P PP

6 --------

---------

--------

---------

---------

---------

NPNNS

Page 83: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S ---------

1 ------- NN --------

---------

---------

---------

2 ------- ---------

VBD ---------

VP ---------

3 ------- ---------

--------

DT NP ---------

NP

4 --------

---------

--------

---------

NN ---------

---------

5 --------

---------

--------

---------

---------

P PP

6 --------

---------

--------

---------

---------

---------

NPNNS

Page 84: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK (cont…)0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S ---------

1 ------- NN --------

---------

---------

---------- - - -

2 ------- ---------

VBD ---------

VP ---------

VP

3 ------- ---------

--------

DT NP ---------

NP

4 --------

---------

--------

---------

NN ---------

---------

5 --------

---------

--------

---------

---------

P PP

6 --------

---------

--------

---------

---------

---------

NPNNS

Page 85: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: filling the last column0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S ---------

1 ------- NN --------

---------

---------

---------

---------- - - - -

2 ------- ---------

VBD ---------

VP ---------

VP

3 ------- ---------

--------

DT NP ---------

NP

4 --------

---------

--------

---------

NN ---------

---------

5 --------

---------

--------

---------

---------

P PP

6 --------

---------

--------

---------

---------

---------

NPNNS

Page 86: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: terminates with S in (0,7)

0The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7.

To

From

1 2 3 4 5 6 7

0 DT NP --------

---------

S ---------

S

1 ------- NN --------

---------

---------

---------

---------- - - - -

2 ------- ---------

VBD ---------

VP ---------

VP

3 ------- ---------

--------

DT NP ---------

NP

4 --------

---------

--------

---------

NN ---------

---------

5 --------

---------

--------

---------

---------

P PP

6 --------

---------

--------

---------

---------

---------

NPNNS

Page 87: CS460/626 : Natural Language Processing/Speech, NLP and the …cs626-460-2012/lecture... · 2012. 2. 14. · CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture

CYK: Extracting the Parse Tree

� The parse tree is obtained by keeping back pointers.

S (0-7)

NP (0-2)

VP (2-7)2) 7)

VBD (2-3)

NP (3-7)

DT (0-1)

NN (1-2)

The gunman

sprayed

NP (3-5)

PP (5-7)

DT (3-4)

NN (4-5)

P (5-6)

NP (6-7)

NNS (6-7)the building

with

bullets