CS626-449: NLP, Speech and Web- Topics-in-AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 37: Semantic Role Extraction (obtaining Dependency Parse)

Upload: robyn-higgins

Post on 16-Dec-2015


CS626-449: NLP, Speech and Web-Topics-in-AI

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

Lecture 37: Semantic Role Extraction (obtaining Dependency Parse)

Vauquois Triangle

Analysis

Generation

Transfer based (do deep semantic processing before entering the target language)

Direct (enter the target language immediately through a dictionary)

Interlingua based (do deep semantic processing before entering the target language)

Vauquois: an eminent French machine translation researcher, originally a physicist

Universal Networking Language

Universal Words (UWs)
Relations
Attributes
Knowledge Base

UNL Graph

He forwarded the mail to the minister.

[Figure: UNL graph. forward(icl>send), marked @entry @past, is linked by agt to He(icl>person), by obj to mail(icl>collection) @def and by gol to minister(icl>person) @def]
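A UNL graph can be written as a list of binary relations of the form rel(UW1, UW2), the notation the later slides use. The following is a minimal sketch, not part of any UNL toolkit: the function names `parse_relation` and `split_uw` are made up for illustration, and the parser assumes attributes are attached to a UW with `.@`.

```python
def split_uw(arg):
    """Separate a universal word from its attributes.

    'forward(icl>send).@entry.@past' -> ('forward(icl>send)', ['@entry', '@past'])
    """
    head, *attrs = arg.strip().split(".@")
    return head, ["@" + a for a in attrs]


def parse_relation(text):
    """Parse one relation string 'rel(uw1, uw2)' into (rel, uw1, uw2).

    The two arguments are split at the first comma that is not nested
    inside parentheses, so restrictions such as (icl>send) stay intact.
    """
    name, _, rest = text.strip().partition("(")
    if not rest.endswith(")"):
        raise ValueError("not a relation: " + text)
    body = rest[:-1]
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return name.strip(), split_uw(body[:i]), split_uw(body[i + 1:])
    raise ValueError("expected two arguments: " + text)


# The three edges of the graph for "He forwarded the mail to the minister."
graph = [
    parse_relation("agt(forward(icl>send).@entry.@past, he(icl>person))"),
    parse_relation("obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)"),
    parse_relation("gol(forward(icl>send).@entry.@past, minister(icl>person).@def)"),
]
```

Each parsed edge keeps the relation label, the two universal words and their attribute lists, which is enough to rebuild the graph with the @entry node as the root.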

AGT / AOJ / OBJ

AGT (Agent)
Definition: Agt defines a thing which initiates an action

AOJ (Thing with attribute)
Definition: Aoj defines a thing which is in a state or has an attribute

OBJ (Affected thing)
Definition: Obj defines a thing in focus which is directly affected by an event or state

Examples

John broke the window.
agt ( break.@entry.@past, John )

This flower is beautiful.
aoj ( beautiful.@entry, flower )

He blamed John for the accident.
obj ( blame.@entry.@past, John )

BEN

BEN (Beneficiary)
Definition: Ben defines a not directly related beneficiary or victim of an event or state

Can I do anything for you?
ben ( do.@entry.@interrogation.@politeness, you )
obj ( do.@entry.@interrogation.@politeness, anything )
agt ( do.@entry.@interrogation.@politeness, I )

PUR

PUR (Purpose or objective)
Definition: Pur defines the purpose or objective of the agent of an event, or the purpose of a thing that exists

This budget is for food.
pur ( food.@entry, budget )
mod ( budget, this )

RSN

RSN (Reason)
Definition: Rsn defines a reason why an event or a state happens

They selected him for his honesty.
agt ( select(icl>choose).@entry, they )
obj ( select(icl>choose).@entry, he )
rsn ( select(icl>choose).@entry, honesty )

TIM

TIM (Time)
Definition: Tim defines the time an event occurs or a state is true

I wake up at noon.
agt ( wake up.@entry, I )
tim ( wake up.@entry, noon(icl>time) )

TMF

TMF (Initial time)
Definition: Tmf defines a time an event starts

The meeting started from morning.
obj ( start.@entry.@past, meeting.@def )
tmf ( start.@entry.@past, morning(icl>time) )

TMT

TMT (Final time)
Definition: Tmt defines a time an event ends

The meeting continued till evening.
obj ( continue.@entry.@past, meeting.@def )
tmt ( continue.@entry.@past, evening(icl>time) )

PLC

PLC (Place)
Definition: Plc defines the place an event occurs or a state is true or a thing exists

He is very famous in India.
aoj ( famous.@entry, he )
man ( famous.@entry, very )
plc ( famous.@entry, India )

PLF

PLF (Initial place)
Definition: Plf defines the place an event begins or a state becomes true

Participants come from the whole world.
agt ( come.@entry, participant.@pl )
plf ( come.@entry, world )
mod ( world, whole )

PLT

PLT (Final place)
Definition: Plt defines the place an event ends or a state becomes false

We will go to Delhi.
agt ( go.@entry.@future, we )
plt ( go.@entry.@future, Delhi )

INS

INS (Instrument)
Definition: Ins defines the instrument to carry out an event

I solved it with a computer.
agt ( solve.@entry.@past, I )
ins ( solve.@entry.@past, computer )
obj ( solve.@entry.@past, it )

Attributes

Constitute syntax of UNL
Play the role of bridging the conceptual world and the real world in the UNL expressions
Show how and when the speaker views what is said and with what intention, feeling, and so on
Seven types:
  Time with respect to the speaker
  Aspects
  Speaker’s view of reference
  Speaker’s emphasis, focus, topic, etc.
  Convention
  Speaker’s attitudes
  Speaker’s feelings and viewpoints

Tense: @past

The past tense is normally expressed by @past

{unl}agt(go.@entry.@past, he)…{/unl}

He went there yesterday

Aspects: @progress

{unl}
man ( rain.@entry.@present.@progress, hard )
{/unl}

It’s raining hard.

Speaker’s view of reference

@def (Specific concept (already referred))
The house on the corner is for sale.

@indef (Non-specific class)
There is a book on the desk.

@not is always attached to the UW which is negated.
He didn’t come.
agt ( come.@entry.@past.@not, he )

Speaker’s emphasis

@emphasis
John his name is.
mod ( name, he )
aoj ( John.@emphasis.@entry, name )

@entry denotes the entry point or main UW of a UNL expression

Subcategorization Frames

Specify the categorial class of the lexical item.
Specify the environment.
Examples:
kick: [V; _ NP]
cry: [V; _ ]
rely: [V; _ PP]
put: [V; _ NP PP]
think: [V; _ S` ]

Subcategorization Rules

Subcategorization Rule:
V → y / { _ NP ] | _ ] | _ PP ] | _ NP PP ] | _ S` ] }

Subcategorization Rules

1. S → NP VP
2. VP → V (NP) (PP) (S`) …
3. NP → Det N
4. V → rely / _ PP ]
5. P → on / _ NP ]
6. Det → the
7. N → boy, friend

The boy relied on the friend.
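A subcategorization frame can be treated as a lookup from a lexical item to the complement categories it requires. The sketch below is illustrative only: the dictionary `SUBCAT` holds just the frames listed on the slides, and the function name `licenses` is hypothetical.

```python
# Frames copied from the slides: kick [V; _ NP], cry [V; _ ],
# rely [V; _ PP], put [V; _ NP PP], think [V; _ S`].
SUBCAT = {
    "kick": ["NP"],
    "cry": [],
    "rely": ["PP"],
    "put": ["NP", "PP"],
    "think": ["S`"],
}


def licenses(verb, complements):
    """Return True if the complement categories match the verb's frame exactly."""
    frame = SUBCAT.get(verb)
    return frame is not None and frame == list(complements)


# "The boy relied on the friend": rely is followed by a PP, as rule 4 requires.
relied_ok = licenses("rely", ["PP"])
# "*The boy relied the friend": an NP complement does not fit the frame.
relied_bad = licenses("rely", ["NP"])
```

An exact match is the simplest possible check; a fuller treatment would also handle optional complements and alternative frames for the same verb.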

Semantically Odd Constructions

Can we exclude these two ill-formed structures?
*The boy frightened sincerity.
*Sincerity kicked the boy.

Selectional Restrictions

Selectional Restrictions

Inherent Properties of Nouns: [+/-ABSTRACT], [+/-ANIMATE]

E.g., Sincerity [+ABSTRACT]
Boy [+ANIMATE]

Selectional Rules

A selectional rule specifies certain selectional restrictions associated with a verb.

V → y / [+/-ABSTRACT] __ , __ [+/-ANIMATE]

V → frighten / [+/-ABSTRACT] __ , __ [+ANIMATE]
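The selectional rule for frighten can be checked mechanically once nouns carry their inherent features. This is a minimal sketch under stated assumptions: the feature values for the two nouns come from the slides, the restriction for frighten (any subject, [+ANIMATE] object) follows the rule above, and the restriction for kick ([+ANIMATE] subject) is an assumption inferred from the starred example, not something the slides state.

```python
# Inherent noun features from the slides.
FEATURES = {
    "sincerity": {"ABSTRACT": True, "ANIMATE": False},
    "boy": {"ABSTRACT": False, "ANIMATE": True},
}

# verb -> (restriction on the subject, restriction on the object)
RULES = {
    "frighten": ({}, {"ANIMATE": True}),  # from the rule on the slide
    "kick": ({"ANIMATE": True}, {}),      # assumed for illustration
}


def satisfies(noun, restriction):
    """True if every feature demanded by the restriction holds for the noun."""
    return all(FEATURES[noun].get(f) == v for f, v in restriction.items())


def well_formed(subject, verb, obj):
    """Check both argument positions of the verb against its selectional rule."""
    subj_restriction, obj_restriction = RULES[verb]
    return satisfies(subject, subj_restriction) and satisfies(obj, obj_restriction)
```

With these tables, `well_formed("boy", "frighten", "sincerity")` and `well_formed("sincerity", "kick", "boy")` are both rejected, while `well_formed("sincerity", "frighten", "boy")` is accepted.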

Subcategorization Frame

forward     V   __ NP PP   e.g., We will be forwarding our new catalogue to you
invitation  N   __ PP      e.g., An invitation to the party
accessible  A   __ PP      e.g., A program making science more accessible to young people

Thematic Roles

The man forwarded the mail to the minister.

forward   V   __ NP PP

[Event FORWARD ([Thing THE MAN], [Thing THE MAIL], [Path TO THE MINISTER])]

How to define the UWs in UNL Knowledge-Base?

Nominal concept
  Abstract
  Concrete
Verbal concept
  Do
  Occur
  Be
Adjective concept
Adverbial concept

Nominal Concept: Abstract thing

abstract thing{(icl>thing)}
  culture(icl>abstract thing)
    civilization(icl>culture{>abstract thing})
  direction(icl>abstract thing)
    east(icl>direction{>abstract thing})
  duty(icl>abstract thing)
    mission(icl>duty{>abstract thing})
    responsibility(icl>duty{>abstract thing})
      accountability{(icl>responsibility>duty)}
  event(icl>abstract thing{,icl>time>abstract thing})
    meeting(icl>event{>abstract thing,icl>group>abstract thing})
      conference(icl>meeting{>event})
        TV conference{(icl>conference>meeting)}

Nominal Concept: Concrete thing

concrete thing{(icl>thing,icl>place>thing)}
  building(icl>concrete thing)
    factory(icl>building{>concrete thing})
    house(icl>building{>concrete thing})
  substance(icl>concrete thing)
    cloth(icl>substance{>concrete thing})
      cotton(icl>cloth{>substance})
    fiber(icl>substance{>concrete thing})
      synthetic fiber{(icl>fiber>substance)}
      textile fiber{(icl>fiber>substance)}
    liquid(icl>substance{>concrete thing})
      beverage(icl>food{,icl>liquid>substance})
        coffee(icl>beverage{>food})
        liquor(icl>beverage{>food})
          beer(icl>liquor{>beverage})

Verbal concept: do

do({icl>do,}agt>thing,gol>thing,obj>thing)
  express({icl>do(}agt>thing,gol>thing,obj>thing{)})
    state(icl>express(agt>thing,gol>thing,obj>thing))
      explain(icl>state(agt>thing,gol>thing,obj>thing))
  add({icl>do(}agt>thing,gol>thing,obj>thing{)})
  change({icl>do(}agt>thing,gol>thing,obj>thing{)})
    convert(icl>change(agt>thing,gol>thing,obj>thing))
  classify({icl>do(}agt>thing,gol>thing,obj>thing{)})
    divide(icl>classify(agt>thing,gol>thing,obj>thing))

Verbal concept: occur and be

occur({icl>occur,}gol>thing,obj>thing)
  melt({icl>occur(}gol>thing,obj>thing{)})
  divide({icl>occur(}gol>thing,obj>thing{)})
  arrive({icl>occur(}obj>thing{)})
be({icl>be,}aoj>thing{,^obj>thing})
  exist({icl>be(}aoj>thing{)})
  born({icl>be(}aoj>thing{)})

How to define the UWs in UNL Knowledge Base?

In order to distinguish among the verb classes headed by 'do', 'occur' and 'be', the following features are used:

UW        [need an agent]   [need an object]   English
'do'      +                 +                  "to kill"
'occur'   -                 +                  "to fall"
'be'      -                 -                  "to know"
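The feature table reads directly as a decision function over two booleans. The sketch below is only an illustration of that table; the function name `verb_class` is hypothetical, and the combination (agent needed, object not needed) is left unclassified because the table does not list it.

```python
def verb_class(needs_agent, needs_object):
    """Map the two features of the table to the heading UW ('do', 'occur' or 'be')."""
    if needs_agent and needs_object:
        return "do"      # e.g. "to kill"
    if not needs_agent and needs_object:
        return "occur"   # e.g. "to fall"
    if not needs_agent and not needs_object:
        return "be"      # e.g. "to know"
    return None          # (+agent, -object) does not appear in the table


# The English examples from the table with their feature values.
examples = {
    "to kill": verb_class(True, True),
    "to fall": verb_class(False, True),
    "to know": verb_class(False, False),
}
```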

 

How to define the UWs in UNL Knowledge-Base?

The verbal UWs (do, occur, be) also take some pre-defined semantic cases, as follows:

UW        PRE-DEFINED CASES             English
'do'      takes necessarily agt>thing   "to kill"
'occur'   takes necessarily obj>thing   "to fall"
'be'      takes necessarily aoj>thing   "to know"

 

Complex sentence

I want to watch this movie.

[Figure: UNL graph. want(icl>), marked @entry.@past, has agt I(iof>person) and obj :01; inside scope :01, watch(icl>do), marked @entry.@inf, has agt I(iof>person) and obj movie(icl>) @def]

Approach to UNL Generation

Problem Definition

Generate UNL expressions for English sentences in a robust and scalable manner, using syntactic analysis and lexical resources extensively.

This needs detecting semantically relatable entities and solving attachment problems

Semantically Relatable Sequences (SRS)

Definition: A semantically relatable Sequence (SRS) of a sentence is a group of words in the sentence (not necessarily consecutive) that appear in the semantic graph of the sentence as linked nodes or nodes with speech act labels

(This is motivated by UNL representation)

SRS as an intermediary to an intermediary

Source Language Sentence → SRS → UNL → Target Language Sentence

Example to illustrate SRS

“The man bought a new car in June”

[Figure: semantic graph. bought (past tense) has agent man (the: definite), object car (a: indefinite; modifier: new) and time June (in: modifier)]

Sequences from “the man bought a new car in June”

a. {man, bought}
b. {bought, car}
c. {bought, in, June}
d. {new, car}
e. {the, man}
f. {a, car}

Basic questions

Which words can form semantic constituents, which we call Semantically Relatable Sequences (SRS)?

What after all are the SRSs of the given sentence?

What semantic relations can link the words in an SRS and the SRSs themselves?

Postulate

A sentence needs to be broken into sequences of at most three forms:
{CW, CW}
{CW, FW, CW}
{FW, CW}

where CW refers to a content word or a clause and FW to a function word
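The postulate can be illustrated by tagging each word of a candidate sequence as CW or FW and checking the result against the three permitted forms. This is a rough sketch: the word list `FUNCTION_WORDS` is a small hand-made set assumed for the example (a real system would take the CW/FW decision from parse-tree tags), and `srs_form` is a hypothetical name.

```python
# Assumed toy list of function words; anything else counts as a content word.
FUNCTION_WORDS = {"the", "a", "an", "in", "on", "at", "to", "with",
                  "that", "which", "has", "had", "did", "is", "was"}

ALLOWED_FORMS = {("CW", "CW"), ("CW", "FW", "CW"), ("FW", "CW")}


def srs_form(sequence):
    """Return the CW/FW pattern of a word sequence, or None if the
    pattern is not one of the three forms allowed by the postulate."""
    pattern = tuple("FW" if w.lower() in FUNCTION_WORDS else "CW"
                    for w in sequence)
    return pattern if pattern in ALLOWED_FORMS else None


# Sequences taken from "the man bought a new car in June".
forms = [srs_form(s) for s in (("man", "bought"),
                               ("bought", "in", "June"),
                               ("the", "man"))]
```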

SRS and Language Phenomena

Movement: Preposition Stranding

John, we laughed at.
(we, laughed.@entry) --- (CW, CW)
(laughed.@entry, at, John) --- (CW, FW, CW)

Movement: Topicalization

The problem, we solved.
(we, solved.@entry) --- (CW, CW)
(solved.@entry, problem) --- (CW, CW)
(the, problem) --- (FW, CW)

Movement: Relative Clauses

John told a joke which we had already heard.
(John, told.@entry) --- (CW, CW)
(told.@entry, :01) --- (CW, CW)
SCOPE01(we, had, heard.@entry) --- (CW, FW, CW)
SCOPE01(already, heard.@entry) --- (CW, CW)
SCOPE01(heard.@entry, which, joke) --- (CW, FW, CW)
SCOPE01(a, joke) --- (FW, CW)

Movement: Interrogatives

Who did you refer her to?
(did, refer.@entry.@interrogative) --- (FW, CW)
(you, refer.@entry.@interrogative) --- (CW, CW)
(refer.@entry.@interrogative, her) --- (CW, CW)
(refer.@entry.@interrogative, to, who) --- (CW, FW, CW)

Empty Pronominals: to-infinitivals

Bill was wise to sell the piano.
(wise.@entry, SCOPE01) --- (CW, CW)
SCOPE01(sell.@entry, piano) --- (CW, CW)
(Bill, was, wise.@entry) --- (CW, FW, CW)
SCOPE01(Bill, to, sell.@entry) --- (CW, FW, CW)
SCOPE01(the, piano) --- (FW, CW)

Empty pronominal: Gerundial

The cat leapt down spotting a thrush on the lawn.
(The, cat) --- (FW, CW)
(cat, leapt.@entry) --- (CW, CW)
(leapt.@entry, down) --- (CW, CW)
(leapt.@entry, SCOPE01) --- (CW, CW)
SCOPE01(spotting.@entry, thrush) --- (CW, CW)
SCOPE01(spotting.@entry, on, lawn) --- (CW, FW, CW)

PP Attachment

John cracked the glass with a stone.
(John, cracked.@entry) --- (CW, CW)
(cracked.@entry, glass) --- (CW, CW)
(cracked.@entry, with, stone) --- (CW, FW, CW)
(a, stone) --- (FW, CW)
(the, glass) --- (FW, CW)

SRS and PP attachment (Mohanty, Almeida, Bhattacharyya, 04)

Conditions | Sub-conditions | Attachment Point
[PP] is subcategorized by the verb [V] | [NP2] is licensed by a preposition [P] | [NP2] is attached to the verb [V] (e.g., He forwarded the mail to the minister)
[PP] is subcategorized by the noun in [NP1] | [NP2] is licensed by a preposition [P] | [NP2] is attached to the noun in [NP1] (e.g., John published six articles on machine translation)
[PP] is neither subcategorized by the verb [V] nor by the noun in [NP1] | [NP2] refers to [PLACE] / [TIME] feature | [NP2] is attached to the verb [V] (e.g., I saw Mary in her office; The girls met the teacher on different days)
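The three rows of the table can be read as an ordered rule cascade. The sketch below is an assumption-laden illustration, not the authors' implementation: the lookup tables `VERB_SUBCAT`, `NOUN_SUBCAT` and `PLACE_TIME` are tiny hand-filled stand-ins for a subcategorization database and lexical features, and the `parser_choice` fallback for cases the table does not cover is an assumption of the sketch.

```python
# Hypothetical stand-ins for lexical resources, filled from the table's examples.
VERB_SUBCAT = {("forward", "to")}        # verb subcategorizes for this preposition
NOUN_SUBCAT = {("article", "on")}        # noun subcategorizes for this preposition
PLACE_TIME = {"office", "day", "June"}   # NP2 heads carrying a PLACE/TIME feature


def attach(verb, np1_head, prep, np2_head, parser_choice="verb"):
    """Decide whether the PP [prep NP2] attaches to the verb or to the noun in NP1."""
    if (verb, prep) in VERB_SUBCAT:
        return "verb"    # row 1: PP subcategorized by the verb
    if (np1_head, prep) in NOUN_SUBCAT:
        return "noun"    # row 2: PP subcategorized by the noun in NP1
    if np2_head in PLACE_TIME:
        return "verb"    # row 3: NP2 denotes a place or a time
    return parser_choice  # assumed fallback: keep what the parser proposed


decisions = [
    attach("forward", "mail", "to", "minister"),
    attach("publish", "article", "on", "translation"),
    attach("see", "Mary", "in", "office"),
]
```

The rules are tried in table order, so a verb frame takes priority over a noun frame when both would apply; whether that priority is right in general is not something the table settles.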

Linguistic Study to Computation

Syntactic constituents to Semantic constituents

A probabilistic parser (Charniak, 04) is used.

Other resources: WordNet and Oxford Advanced Learner’s Dictionary

In a parse tree, tags give indications of CW and FW:
NP, VP, ADJP and ADVP → CW
PP (prepositional phrase), IN (preposition) and DT (determiner) → FW

Observation: Headwords of sibling nodes form SRSs

“John has bought a car.”

SRS: {has, bought}, {a, car}, {bought, car}

[Figure: head-annotated parse tree. (C) VP bought → (F) AUX has, (C) VP bought; (C) VP bought → (C) VBD bought, (C) NP car; (C) NP car → (F) DT a, (C) NN car]
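The observation can be tried out on the tree for “John has bought a car.” The following is a minimal sketch, assuming each node already carries its C/F marking and its headword; the `Node` tuple and the function `sibling_srs` are hypothetical names, and only the two-element forms {CW, CW} and {FW, CW} are produced.

```python
from collections import namedtuple

# kind is "C" (content) or "F" (function); head is the node's headword.
Node = namedtuple("Node", "kind tag head children")


def leaf(kind, tag, word):
    return Node(kind, tag, word, ())


def sibling_srs(node):
    """Collect (head, head) pairs from adjacent sibling nodes, top-down."""
    pairs = []
    kids = node.children
    for left, right in zip(kids, kids[1:]):
        if (left.kind, right.kind) in {("C", "C"), ("F", "C")}:
            pairs.append((left.head, right.head))
    for kid in kids:
        pairs.extend(sibling_srs(kid))
    return pairs


# The VP of "John has bought a car." with heads as annotated on the slide.
tree = Node("C", "VP", "bought", (
    leaf("F", "AUX", "has"),
    Node("C", "VP", "bought", (
        leaf("C", "VBD", "bought"),
        Node("C", "NP", "car", (leaf("F", "DT", "a"), leaf("C", "NN", "car"))),
    )),
))

srs = sibling_srs(tree)
```

On this tree the function yields the three sequences listed on the slide: {has, bought}, {bought, car} and {a, car}.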

Need: Resilience to wrong PP attachment

“John has published an article on linguistics”
Use PP attachment heuristics
Get {article, on, linguistics}

[Figure: parse tree with the PP as a child of the VP. (C) VP published → (C) VBD published, (C) NP article, (F) PP on; (C) NP article → (F) DT an, (C) NN article; (F) PP on → (F) IN on, (C) NP linguistics; (C) NP linguistics → (C) NNS linguistics]

to-infinitival

“I forced him to watch this movie”

Clause boundary is the VP node, labeled with SCOPE.
Tag is modified to TO, a FW tag, indicating that it heads a to-infinitival clause.
The duplication and insertion of the NP node with head him (depicted by shaded nodes) as a sibling of the VBD node with head forced is done to bring out the existence of a semantic relation between force and him.

[Figure: modified parse tree with nodes (C) VP forced, (C) VBD forced, (C) NP him, (C) PRP him, (F) TO to, (C) S SCOPE, (C) VP watch, and a duplicated (C) NP him with (C) PRP him]

Linking of clauses

“John said that he was reading a novel”

Head of S node marked as SCOPE
SRS: {said, that, SCOPE}

Adverbial clauses have similar parse tree structures except that the subordinating conjunctions are different from that.

[Figure: parse tree. (C) VP said → (C) VBD said, (F) SBAR that; (F) SBAR that → (F) IN that, (C) S SCOPE]

Implementation

Block Diagram of the system

[Figure: block diagram. Components: Input Sentence, Charniak Parser, Parse Tree, Parse Tree modification and augmentation with head and scope information, Scope Handler, Augmented Parse Tree, Semantically Relatable Sequences Generator, Attachment Resolver, Semantically Related Sequences. Resources: WordNet 2.0 (noun classification; time and place features) and Sub-categorization Database (THAT clause as subcat property; preposition as subcat property)]

Head determination

Uses a bottom-up strategy to determine the headword for every node in the parse tree.
Crucial in obtaining the SRSs, since wrong head information may end up getting propagated all the way up the tree.
Processes the children of every node starting from the rightmost child and checks the head information already specified against the node’s tag to determine the head of the node.
Some special cases are:
  SBAR node
  A VP node with PRO insertion, copula, phrasal verbs, etc.
  NP nodes with of-PP cases and conjunctions under them, which lead to scope creation.
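The bottom-up, right-to-left head search can be sketched as a small recursive function. Everything specific here is an assumption: the table `HEAD_TAGS` (which child tags may head which parent tag) is invented for the example, the special cases listed above are ignored, and the fallback to the rightmost child is a choice of the sketch.

```python
# Hypothetical head table: parent tag -> child tags that may supply the head.
HEAD_TAGS = {
    "VP": {"VP", "VBD", "VB", "VBN"},
    "NP": {"NP", "NN", "NNS", "NNP", "PRP"},
    "PP": {"IN"},
}


def find_head(node):
    """node is (tag, word) for a leaf or (tag, [children]) for an inner node."""
    tag, rest = node
    if isinstance(rest, str):
        return rest                       # a leaf is its own head
    # Bottom-up: the heads of all children are determined first.
    child_heads = [(child[0], find_head(child)) for child in rest]
    preferred = HEAD_TAGS.get(tag, set())
    # Scan the children starting from the rightmost one.
    for child_tag, head in reversed(child_heads):
        if child_tag in preferred:
            return head
    return child_heads[-1][1]             # assumed fallback: rightmost child


np = ("NP", [("DT", "a"), ("NN", "car")])
vp = ("VP", [("AUX", "has"), ("VP", [("VBD", "bought"), np])])
heads = (find_head(np), find_head(vp))
```

For the example tree this gives car as the head of the NP and bought as the head of both VP nodes.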

Scope handler

Performs modification on the parse trees by insertion of nodes in to-infinitival cases

Adjusts the tag and head information in case of SBAR nodes

Attachment resolver

Takes a (CW1, FW, CW2) as input and checks the time and place features of CW2, the noun class of CW1 and the subcategorization information for the CW1 and FW pair to decide the attachment.

If none of these yields a deterministic result, take the attachment indicated by the parser.

SRS generator

Performs a breadth-first search on the parse tree and performs detailed processing at every node N1 of the tree.

S nodes which dominate entire clauses (main or embedded) are treated as CWs.

SBAR and TO nodes are treated as FWs.

Algorithm

If the node N1 is a CW (new/JJ, published/VBD, fact/NN, boy/NN, John/NNP), perform the following checks:
  If the sibling N2 of N1 is a CW (car/NN, article/NN, SCOPE/S)
    Then create {CW,CW} ({new, car}, {published, article}, {boy, SCOPE})
  If the sibling N2 is a FW (in/PP, that/SBAR, and/CC)
    Then, check if N2 has a child FW, N3 (in/IN, that/IN) and a child CW, N4 (June/NN, SCOPE/S)
      If yes,
        Then use attachment resolver to decide the CW to which N3 and N4 attach.
        Create {CW,FW,CW} ({published, in, June}, {fact, that, SCOPE})
      If no,
        Then check if next sibling N5 of N1 is a CW (Mary/NN)
          If yes,
            Create {CW,FW,CW} ({John, and, Mary})
If the node N1 is a FW (the/DT, is/AUX, to/TO), perform the following checks:
  If the parent node is a CW (boy/NP, famous/VP)
    Check if sibling is an adjective.
      i. If yes, (famous/JJ)
        Then, create {CW,FW,CW} ({She, is, famous})
      ii. If no, (boy/NN)
        Then, create {FW,CW} ({the, boy}, {has, bought})
  If the parent node N6 is a FW (to/TO) and the sibling node N7 is a CW (learn/VB)
    Use attachment resolver to decide on the preceding CW to which N6 and N7 can attach.
    Create {CW,FW,CW} ({exciting, to, learn})
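The checks for a CW node can be sketched in a few lines. This is a simplified illustration, not the system's code: the `Node` tuple and `cw_checks` are hypothetical names, the attachment resolver is skipped (the sketch always attaches to N1), and the checks for FW nodes are left out.

```python
from collections import namedtuple

# kind is "C" (content) or "F" (function); head is the node's headword.
Node = namedtuple("Node", "kind tag head children")


def leaf(kind, tag, word):
    return Node(kind, tag, word, ())


def cw_checks(n1, right_siblings):
    """Apply the CW branch of the algorithm to n1 and the siblings to its right."""
    if n1.kind != "C" or not right_siblings:
        return []
    n2 = right_siblings[0]
    if n2.kind == "C":
        return [(n1.head, n2.head)]                   # {CW, CW}
    n3 = next((c for c in n2.children if c.kind == "F"), None)
    n4 = next((c for c in n2.children if c.kind == "C"), None)
    if n3 is not None and n4 is not None:
        # The attachment resolver would be consulted here; omitted in this sketch.
        return [(n1.head, n3.head, n4.head)]          # {CW, FW, CW}
    if len(right_siblings) > 1 and right_siblings[1].kind == "C":
        n5 = right_siblings[1]
        return [(n1.head, n2.head, n5.head)]          # coordination
    return []


pp = Node("F", "PP", "in", (leaf("F", "IN", "in"), leaf("C", "NN", "June")))
results = [
    cw_checks(leaf("C", "JJ", "new"), [leaf("C", "NN", "car")]),
    cw_checks(leaf("C", "VBD", "published"), [pp]),
    cw_checks(leaf("C", "NNP", "John"),
              [leaf("F", "CC", "and"), leaf("C", "NN", "Mary")]),
]
```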

Evaluation

FrameNet corpus [Baker et al., 1998], a semantically annotated corpus, as the test data.
92,310 sentences (call this the gold standard)
Created automatically from the FrameNet corpus taking verbs, nouns and adjectives as the targets
  Verbs as the target: 37,984 (i.e., semantic frames of verbs)
  Nouns as the target: 37,240
  Adjectives as the target: 17,086

Score for high frequency verbs

Verb        Frequency   Score
Swim        280         0.709
Depend      215         0.804
Look        187         0.835
Roll        173         0.7
Rush        172         0.775
Phone       162         0.695
Reproduce   159         0.797
Step        159         0.795
Urge        157         0.765
Avoid       152         0.789

Scores of 10 verb groups of high frequency in the Gold Standard

Scores of 10 noun groups of high frequency in the Gold Standard

An actual sentence

A. Sentence : A form of asbestos once used to make Kent cigarette filters has caused a high percentage of cancer deaths among a group of workers exposed to it more than 30 years ago, researchers reported.

Relative performance on SRS constructs

[Figure: bar chart of recall and precision (axis 0 to 100) of parameters matched for Total SRSs, (FW,CW), (CW,FW,CW) and (CW,CW)]

Results on sentence constructs

[Figure: bar chart of recall and precision (axis 0 to 100) for to-infinitival clause resolution, complement-clause resolution, clause linkings and PP resolution]

Rajat Mohanty, Anupama Dutta and Pushpak Bhattacharyya, Semantically Relatable Sets: Building Blocks for Representing Semantics, 10th Machine Translation Summit (MT Summit 05), Phuket, September, 2005.

Statistical Approach

Use SRL marked corpora

Daniel Gildea and Daniel Jurafsky. 2002. Automatic labeling of semantic roles. Computational Linguistics, 28(3):245–288.

PropBank corpus
  Role annotated WSJ part of Penn Treebank [10]
PropBank role-set [2,4]
  Core roles: ARG0 (Proto-agent), ARG1 (Proto-patient) to ARG5
  Adjunctive roles: ARGM-LOC (for locatives), ARGM-TMP (for temporals), etc.

SRL marked corpora contd…

PropBank roles: an example
[ARG0 It] operates [ARG1 stores] [ARGM−LOC mostly in Iowa and Nebraska]

Preprocessing systems [2]
  Part of speech tagger
  Base chunker
  Full syntactic parser
  Named entities recognizer

Fig.4: Parse tree output, Source: [5]

Probabilistic estimation [1]

Empirical probability estimation over candidate roles for each constituent based upon extracted features:

$$P(r \mid h, pt, gov, position, voice, t) = \frac{\#(r, h, pt, gov, position, voice, t)}{\#(h, pt, gov, position, voice, t)}$$

here,
t is the target word,
r is a candidate role,
h, pt, gov, position, voice are features

Linear interpolation, with condition $\sum_i \lambda_i = 1$:

$$P(r \mid constituent) = \lambda_1 P(r \mid t) + \lambda_2 P(r \mid pt, t) + \lambda_3 P(r \mid pt, gov, t) + \lambda_4 P(r \mid h) + \lambda_5 P(r \mid h, pt, t)$$

Geometric mean, with condition $\sum_r P(r \mid constituent) = 1$:

$$P(r \mid constituent) = \frac{1}{Z} \exp\{\lambda_1 \log P(r \mid t) + \lambda_2 \log P(r \mid pt, t) + \lambda_3 \log P(r \mid pt, gov, t) + \lambda_4 \log P(r \mid h) + \lambda_5 \log P(r \mid h, pt, t)\}$$
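The two ways of combining the backed-off estimates, linear interpolation and a normalized geometric mean, can be compared on toy numbers. This is a minimal sketch with made-up probabilities and weights; the function names are hypothetical, and the geometric mean is taken in the log domain, which requires every input probability to be nonzero.

```python
import math


def linear_interpolation(probs, lambdas):
    """Weighted sum of estimates P_i(r | ...) for one role; weights must sum to 1."""
    assert abs(sum(lambdas) - 1.0) < 1e-9
    return sum(lam * p for lam, p in zip(lambdas, probs))


def geometric_mean(dists, lambdas):
    """Weighted geometric mean of several distributions over roles,
    renormalized by Z so that the result sums to 1 over the roles.

    dists is a list of dicts mapping role -> probability (all nonzero).
    """
    roles = list(dists[0])
    scores = {
        r: math.exp(sum(lam * math.log(d[r]) for lam, d in zip(lambdas, dists)))
        for r in roles
    }
    z = sum(scores.values())
    return {r: s / z for r, s in scores.items()}


# Toy example: two estimators and two candidate roles (numbers are invented).
estimates = [{"ARG0": 0.5, "ARG1": 0.5}, {"ARG0": 0.8, "ARG1": 0.2}]
weights = [0.5, 0.5]

interp_arg0 = linear_interpolation([d["ARG0"] for d in estimates], weights)
geo = geometric_mean(estimates, weights)
```

Interpolation already yields a distribution when each input is one, whereas the geometric mean needs the explicit division by Z, which is the normalization condition stated for it.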

A state-of-the-art SRL system: ASSERT [4]

Main points [3,4]
  Use of Support Vector Machine [13] as classifier
  Similar to FrameNet “domains”, “Predicate Clusters” are introduced
  Named Entities [14] is used as a new feature

Experiment I (Parser dependency testing)
  Use of PropBank bracketed corpus
  Use of Charniak parser trained on Penn Treebank corpus

Parse      Task           Precision (%)   Recall (%)   F-score (%)   Accuracy (%)
Treebank   Id.            97.5            96.1         96.8          -
           Class.         -               -            -             93.0
           Id. + Class.   91.8            90.5         91.2          -
Charniak   Id.            87.8            84.1         85.9          -
           Class.         -               -            -             92.0
           Id. + Class.   81.7            78.4         80.0          -

Table 1: Performance of ASSERT for Treebank and Charniak parser outputs. Id. stands for identification task and Class. stands for classification task. Data source: [4]
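The F-score column can be cross-checked against the precision and recall columns, assuming it is the usual balanced F1 (the harmonic mean of the two); the slide does not state the formula, so that is an assumption. The rows below are copied from Table 1.

```python
def f_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (parse, task, precision, recall, reported F-score) from Table 1.
ROWS = [
    ("Treebank", "Id.", 97.5, 96.1, 96.8),
    ("Treebank", "Id. + Class.", 91.8, 90.5, 91.2),
    ("Charniak", "Id.", 87.8, 84.1, 85.9),
    ("Charniak", "Id. + Class.", 81.7, 78.4, 80.0),
]

recomputed = [(parse, task, f_score(p, r), reported)
              for parse, task, p, r, reported in ROWS]

for parse, task, f, reported in recomputed:
    print(f"{parse:9s} {task:13s} recomputed {f:.2f}  reported {reported}")
```

The recomputed values agree with the reported ones to within 0.1; the small differences are consistent with the precision and recall figures having been rounded before publication.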

Experiments and Results

Experiment II (Cross genre testing)
1. Training on PropBanked WSJ data and testing on Brown Corpus
2. Charniak parser trained on first PropBank then Brown

Table 2: Performance of ASSERT for various experimental combinations. Data source: [4]