Prepositional Phrase Attachment & Generation of Semantic Relation
Ashish Almeida (03M05601)
Guide: Pushpak Bhattacharyya
April 19, 2023
Problem Definition
• Semantics extraction
  – English to UNL
    • UNL: language-independent knowledge representation
  – Some important problems
    • Prepositional phrase (PP) attachment
    • Semantic head detection
    • PRO resolution
    • Generation of semantic relations
UNL: Semantics Representation
– He read the book on physics
[Figure: UNL graph for the sentence. 'read' is linked to 'He' by an agent relation and to 'book' by an object relation; 'book' is linked to 'physics' by a modifier relation.]
Universal Networking Language (UNL)
• Knowledge representation through a graph
• Concepts and the relationships among them
• Universal word (UW)
  – a unique concept
• Relation
  – connects two UWs
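The graph view can be made concrete with a small sketch. It stores the example sentence as labelled edges between UWs; the `Relation` tuple, the edge direction (head first) and the helper `relations_from` are illustrative assumptions, not part of the UNL specification.

```python
from collections import namedtuple

# Hypothetical container: one labelled edge between two universal words (UWs).
Relation = namedtuple("Relation", ["label", "source", "target"])

# "He read the book on physics", using the relation names shown on the slide.
graph = [
    Relation("agent", "read", "he"),
    Relation("object", "read", "book"),
    Relation("modifier", "book", "physics"),
]

def relations_from(graph, uw):
    """Return every relation whose source node is the given UW."""
    return [rel for rel in graph if rel.source == uw]

for rel in graph:
    print(f"{rel.label}({rel.source}, {rel.target})")
```

Keeping the graph as a flat list of edges makes it easy to compare two analyses of the same sentence, for example the two PP attachments discussed next.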
Example: PP Attachment
He read the book on physics
[Figure: two attachment graphs for the sentence. Correct: 'on physics' attaches to the noun 'book'. Incorrect: 'on physics' attaches to the verb 'read'.]
Overview
• Problem definition
• Previous work
• PP attachment
• Semantic head detection
• PRO resolution in to-infinitival clauses
• Automatic dictionary enrichment
• Rules and implementation
• Results & conclusion
• References
Previous Work
• English to UNL analysis
  – P. Bhattacharyya: UNL analysis process
• PP attachment
  – Ratnaparkhi: probabilistic approach
  – Brill: rule-based approach
• Semantic relations
  – P. Pantel: detection of the different roles of prepositions
PP Attachment
The Sentence Frame [V-N-P-N]
– [V-NP1-P-NP2]
• Attachment problem: the PP attaches to V or to NP1
• NP: simple noun phrase without any embedded clause or prepositional phrase
• Sufficient context information
• Comparison with others’ work
• Example: He [is reading]V [this book]NP1 [for]P [his exam]NP2.
Solution to PP attachment: based on argument structure theory.
Argument Structure (AS) of Verb
• Example: He forwarded the mail to John.
  – Forward (X, Y)
  – Forward (the mail, John)
• The verb takes a to-PP as a complement
  – The verb also determines the choice of preposition, i.e., to
• Important clue: the noun after ‘to’ attaches to the verb ‘forward’
Argument Structure: Nouns
• Example: We received [[an invitation] to the wedding].
  – noun attachment
  – invitation (wedding)
• The noun ‘invitation’ demands a to-PP as an argument
• Receive (invitation (wedding))
Augmenting the Dictionary Entries
[forward] “forward(icl>do)” (V, VOA, #_TO_AR2)
• [forward]: English word
• “forward(icl>do)”: UW
• (V, VOA, #_TO_AR2): attribute list
  – V: verb
  – VOA: action verb
  – #_TO_AR2: 2nd argument is a to-prepositional phrase
• Through the attribute #_TO_AR2, the dictionary encodes the knowledge that the verb ‘forward’ takes a to-PP as its second argument.
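As a rough illustration of how such an entry could be read programmatically, the sketch below splits an entry line into head word, UW and attribute list. The regular expression, the function names `parse_entry` and `takes_pp_argument`, and the use of straight quotes in place of the slide's typographic quotes are all assumptions made for this example.

```python
import re

# Assumed layout, taken from the single entry shown above:
#   [head word] "universal word" (attr1, attr2, ...)
ENTRY_RE = re.compile(
    r'\[(?P<word>[^\]]+)\]\s*"(?P<uw>[^"]+)"\s*\((?P<attrs>[^)]*)\)'
)

def parse_entry(line):
    """Split one dictionary line into head word, UW and attribute list."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised entry: {line!r}")
    attrs = [a.strip() for a in match.group("attrs").split(",") if a.strip()]
    return {"word": match.group("word"), "uw": match.group("uw"), "attrs": attrs}

def takes_pp_argument(entry, preposition, position):
    """True if the entry carries an attribute such as #_TO_AR2."""
    return f"#_{preposition.upper()}_AR{position}" in entry["attrs"]

entry = parse_entry('[forward] "forward(icl>do)" (V, VOA, #_TO_AR2)')
print(entry["attrs"], takes_pp_argument(entry, "to", 2))
```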
PP Attachment
• In the [V-N1-P-N2] frame,
  – N2 can attach to V or N1
  – It depends on the argument-taking properties of both V and N1
    • 2 cases: V may or may not demand P-N2
    • 2 cases: N1 may or may not demand P-N2
• While attaching N2 to V or N1, priority is given
  – first to argument-hood
  – second to neighbor-hood
  ... of V and N1
PP Attachment Table
• Four cases, for example for the frame [V-N1-to-N2]:

Case | V demands | N1 demands | N2 attaches to | Example
1    | to-PP     | to-PP      | N1             | I can’t easily give an answer to the question.
2    | to-PP     | No to-PP   | V              | John gave a flower to Mary.
3    | No to-PP  | to-PP      | N1             | She made several minor amendments to her essay.
4    | No to-PP  | No to-PP   | N1             | I caught a bus to the coast.
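The four cases reduce to a short decision function. The sketch below is one way to write it, assuming the two "demands" properties have already been looked up in the dictionary as booleans; the name `attach_pp` is made up for illustration.

```python
def attach_pp(v_demands, n1_demands):
    """Pick the attachment site for N2 in a [V-N1-P-N2] frame.

    Argument-hood is checked first; neighbor-hood (N1 is closer to the PP
    than V) settles the cases where argument-hood does not decide.
    """
    if n1_demands:
        return "N1"  # cases 1 and 3: N1 takes the PP as an argument
    if v_demands:
        return "V"   # case 2: only the verb takes the PP as an argument
    return "N1"      # case 4: nobody demands it, so the nearest head wins

for case in [(True, True), (True, False), (False, True), (False, False)]:
    print(case, "->", attach_pp(*case))
```

In case 1 both heads demand the PP, so argument-hood is a tie and the nearer head N1 is chosen, which matches the first row of the table.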
Automatic Dictionary Enrichment
• Oxford Dictionary (OALD): argument structure
• WordNet: argument structure
• Penn Treebank corpus: PRO controlled-ness property of verbs
Using Oxford Dictionary
• A typical entry in OALD
  – E.g. the second sense of the noun ‘addition’:
    add•ition noun …… 2 [C] ~ (to sth) a thing that is added to sth else: the latest addition to our range of cars; an addition to the family (= another child); (NAmE) to build a new addition onto a house; last minute additions to the government’s package of proposals
• “Addition to <something>” indicates that the word ‘addition’ takes a to-PP as an argument.
• The feature #_TO_AR1 is therefore added to the attribute list of the noun ‘addition’.
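A minimal sketch of this extraction step follows. It assumes the OALD complement pattern always looks like `~ (to sth)` or `~ (to sb)`, which is inferred from the single entry above; the regular expression and the function `argument_attributes` are hypothetical.

```python
import re

# Assumed OALD notation: "~ (PREP sth)" or "~ (PREP sb)" after the sense number.
COMPLEMENT_RE = re.compile(r"~\s*\((\w+)\s+(?:sth|sb)\)")

def argument_attributes(sense_text, position=1):
    """Turn each complement pattern in a sense into an attribute name."""
    return [
        f"#_{prep.upper()}_AR{position}"
        for prep in COMPLEMENT_RE.findall(sense_text)
    ]

sense = "2 [C] ~ (to sth) a thing that is added to sth else"
print(argument_attributes(sense))
```

The argument position is passed in rather than inferred, since the slide uses AR1 for the noun's argument and AR2 for a verb's second argument.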
Semantic Relations
• The semantic relation between a verb and its argument is an idiosyncratic property of the verb
• Semantic relations of arguments are stored in the lexicon as features
• Using Beth Levin’s verb categories
  – Verbs in the same class behave similarly
    • syntactically and semantically
• Example:
  – Give-type verbs: give, lend, pay, sell, refund
    • Give: #_TO_AR2, #_TO_AR2_GOL
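Because verbs in one class share behaviour, attributes can be assigned once per class and copied to every member. The sketch below shows that idea under simple assumptions: the lexicon is a plain dict from verb to attribute list, and the class name `give-type` and the function `enrich` are invented for the example.

```python
# Class membership and class-level attributes, as listed on the slide.
VERB_CLASSES = {"give-type": ["give", "lend", "pay", "sell", "refund"]}
CLASS_ATTRIBUTES = {"give-type": ["#_TO_AR2", "#_TO_AR2_GOL"]}

def enrich(lexicon, verb_classes, class_attributes):
    """Add each class's attributes to all member verbs, without duplicates."""
    for class_name, verbs in verb_classes.items():
        for verb in verbs:
            attrs = lexicon.setdefault(verb, [])
            for attr in class_attributes.get(class_name, []):
                if attr not in attrs:
                    attrs.append(attr)
    return lexicon

lexicon = {"give": ["V", "VOA"], "lend": ["V", "VOA", "#_TO_AR2"]}
enrich(lexicon, VERB_CLASSES, CLASS_ATTRIBUTES)
print(lexicon["lend"])
```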
Semantic Head Detection: case study of ‘of’
Semantic Head Detection
• In the case of an NP of the form [N1-of-N2]:
• Syntactically, N1 is the head
  – University of Mumbai
  – Bunch of sticks
• Semantically, N1 or N2 can be the head
  – Bunch of sticks
  – ‘Sticks’ is the semantic head
    • qua (sticks, bunch)
Example: Semantic Head
[Figure: two trees over V, N1 and N2. In 'Saw the book of physics' the verb links to N1 ('book'); in 'Drank a cup of milk' the verb links to N2 ('milk').]
Partitives
• Dictionary enrichment
• Identified and classified such quantity words
  – Numbers: one-third, dozen
  – Container: can, cup, bag
  – Collection: bundle, group
  – Measure: inch, gram
  – Indefinite amount: drop, dose
• The #PARTITIVE attribute is given to such words
Solution: Semantic Head Detection
• Given the frame [N1 of N2], if N1 has the attribute #PARTITIVE, then N2 becomes the semantic head
• A quantity (qua) relation is generated.
• For example
  – Cup of tea
    • qua (tea, cup)
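The rule can be sketched in a few lines. Here the #PARTITIVE attribute is approximated by membership in a small word set built from the examples on the slides; the set, the function `semantic_head`, and returning no relation for the non-partitive case are assumptions of this sketch.

```python
# Stand-in for the #PARTITIVE dictionary attribute (words from the slides).
PARTITIVES = {
    "one-third", "dozen", "can", "cup", "bag", "bundle",
    "group", "bunch", "inch", "gram", "drop", "dose",
}

def semantic_head(n1, n2, partitives=PARTITIVES):
    """Return (head, relation) for the frame [N1 of N2].

    The relation is a ("qua", head, quantity) triple when N1 is partitive;
    otherwise N1 stays the head and no relation is proposed here.
    """
    if n1.lower() in partitives:
        return n2, ("qua", n2, n1)
    return n1, None

print(semantic_head("cup", "tea"))
print(semantic_head("University", "Mumbai"))
```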
PRO Resolution in to-infinitival Clauses
What is PRO?
• PRO:
  – pronominal, anaphoric
• He wants [to go]IP.
• He_i wants [PRO_i to go].
• The subject of ‘go’ is the same as the subject of ‘want’, i.e. ‘he’
• PRO is co-indexed with the subject ‘he’
PRO: Idiosyncratic
• PRO:
  – Subject controlled
    • He_i promised me [PRO_i to come for the party].
  – Object controlled
    • He ordered us_k [PRO_k to finish the work].
• Promise: subject controlled
• Order: object controlled
• Added as an attribute of the verb
PRO Resolution
• If
  – the verb has the “sub/obj-controlled-PRO” property
  – and has a to-infinitival clause
• Then
  – copy the subject/object of the main clause to the position of PRO and give it the same UW-id (unique identifier).
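The if/then rule above can be sketched as a lookup on the verb's control property. The table `CONTROL`, the `(word, uw_id)` pair representation and the function `resolve_pro` are illustrative assumptions; the verbs and their control types come from the slides.

```python
# Control property per verb, as given in the examples on the slides.
CONTROL = {"want": "subject", "promise": "subject", "order": "object"}

def resolve_pro(verb, subject, obj, control=CONTROL):
    """Return the (word, uw_id) pair to copy into the PRO position.

    subject and obj are (word, uw_id) pairs from the main clause. The same
    pair is returned unchanged so PRO shares the controller's UW-id.
    Returns None when the verb has no control property.
    """
    kind = control.get(verb)
    if kind == "subject":
        return subject
    if kind == "object":
        return obj
    return None

# "He promised me [PRO to come]" versus "He ordered us [PRO to finish]".
print(resolve_pro("promise", ("he", "01"), ("me", "02")))
print(resolve_pro("order", ("he", "01"), ("us", "02")))
```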
PRO Realization in UNL
• They promised Mary [to give a party]
Dictionary Enrichment: PRO
( (S
    (NP-SBJ-1 investors)
    (VP continue
      (S (NP-SBJ *-1)
        (VP to
          (VP pour
            (NP cash)
            (PP-DIR into
              (NP money funds))))))
    .))
• Penn Treebank corpus
• Annotated with co-indexed PRO information
• NP-SBJ-1 is also the subject of the to-clause (*-1)
• Thus the verb ‘continue’ gets the attribute ‘subject-controlled-pro’
• English WordNet provides frames for verbs, e.g. “They ____ him to write the letter.”, which indicate that the verb takes a to-infinitive as an argument
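One possible way to pull the subject-control property out of such a parse is sketched below. It uses only the standard library, covers just the configuration shown in the example (an empty embedded subject `*-k` co-indexed with the main `NP-SBJ-k`), and the function names are made up; it is not the extraction procedure used in the original work.

```python
import re

def parse(text):
    """Parse a bracketed Treebank string into nested lists."""
    stack = [[]]
    for token in re.findall(r"\(|\)|[^\s()]+", text):
        if token == "(":
            stack.append([])
        elif token == ")":
            node = stack.pop()
            stack[-1].append(node)
        else:
            stack[-1].append(token)
    return stack[0][0]

def subject_controlled_verbs(node, found=None):
    """Collect verbs whose embedded S has an empty subject *-k bound to NP-SBJ-k."""
    found = set() if found is None else found
    if not isinstance(node, list):
        return found
    if node and node[0] == "S":
        children = [c for c in node[1:] if isinstance(c, list) and c]
        index = None
        for child in children:
            match = re.fullmatch(r"NP-SBJ-(\d+)", str(child[0]))
            if match:
                index = match.group(1)
        for child in children:
            if child[0] != "VP" or index is None:
                continue
            verb = next((x for x in child[1:] if isinstance(x, str)), None)
            for sub in child[1:]:
                if isinstance(sub, list) and sub and sub[0] == "S":
                    if ["NP-SBJ", f"*-{index}"] in sub[1:]:
                        found.add(verb)
    for child in node:
        subject_controlled_verbs(child, found)
    return found

tree = parse("( (S (NP-SBJ-1 investors) (VP continue (S (NP-SBJ *-1) "
             "(VP to (VP pour (NP cash) (PP-DIR into (NP money funds)))))) .))")
print(subject_controlled_verbs(tree))
```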
Implementation
UNL System
[Figure: the Enconvertor takes an English sentence as input and, using the rule base for English and the dictionary, produces a UNL expression.]
Enconvertor: Analysis
• Enconvertor
  – Rule based
  – Similar to a Turing machine
  – Two analysis heads (windows)
  – Many condition heads (windows)
  – Moves over a sentence
    • Usually, word by word
Rules: Shift
• Shift (can move left or right)
  – Right shift over a sentence by a word
  – For instance,
    R{V,^#_FOR_AR2:::}{N:::}(PRE,#FOR)P60;
  Move to the right (R) over the sentence if
  – the left analysis window {V,^#_FOR_AR2:::} is on a verb that does not expect a for-PP as its second argument (^ indicates negation),
  – and the right analysis window {N:::} is on a noun,
  – and the next condition window (PRE,#FOR) matches the preposition ‘for’.
  The rule has an absolute priority of 60 (255 is the highest).
Rules: Reduce
• Reduce (delete a node and/or relate it to another node)
  – Delete a node and create a relation
    <{V,#_FOR_AR2,#_FOR_AR2_rsn:::}{N,FORRES,PRERES::rsn:}P25;
  Delete the word under the right analysis window while creating a reason (rsn) relation with the verb on its left if
  – the left analysis window {V,#_FOR_AR2,#_FOR_AR2_rsn:::} is on a verb that expects a for-PP as its second argument (#_FOR_AR2),
  – and the right analysis window {N,FORRES,PRERES::rsn:} is on a noun; this window also specifies the rsn relation to be created.
  The rule has an absolute priority of 25 (255 is the highest).
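To make the rule layout easier to follow, the sketch below splits the two example rules into their parts. The field layout (attributes in the first colon-separated field of a window, relation in the third) is inferred only from these two rules, and `parse_rule`, the `Rule` class and the idea of trying rules in descending priority are assumptions rather than the Enconvertor's documented behaviour.

```python
import re
from dataclasses import dataclass

RULE_RE = re.compile(
    r"^(?P<op>[^{]+)"               # rule type, e.g. R (right shift) or < (reduce)
    r"\{(?P<left>[^}]*)\}"          # left analysis window
    r"\{(?P<right>[^}]*)\}"         # right analysis window
    r"(?P<cond>(?:\([^)]*\))*)"     # zero or more condition windows
    r"P(?P<priority>\d+);$"         # priority, 0 to 255
)

@dataclass
class Rule:
    op: str
    left_attrs: list
    right_attrs: list
    relation: str
    conditions: list
    priority: int

def _attrs(window):
    """Attributes are the comma-separated items before the first colon."""
    return [a for a in window.split(":")[0].split(",") if a]

def parse_rule(text):
    match = RULE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised rule: {text!r}")
    right_fields = match.group("right").split(":")
    relation = right_fields[2] if len(right_fields) > 2 else ""
    conditions = re.findall(r"\(([^)]*)\)", match.group("cond"))
    return Rule(match.group("op"), _attrs(match.group("left")),
                _attrs(match.group("right")), relation, conditions,
                int(match.group("priority")))

rules = [
    parse_rule("<{V,#_FOR_AR2,#_FOR_AR2_rsn:::}{N,FORRES,PRERES::rsn:}P25;"),
    parse_rule("R{V,^#_FOR_AR2:::}{N:::}(PRE,#FOR)P60;"),
]
rules.sort(key=lambda rule: rule.priority, reverse=True)
print([(rule.op, rule.priority) for rule in rules])
```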
Limitations
• Prerequisites:
  – Word sense disambiguation
  – The dictionary contains all words of the sentence
• Multiword or named entity detection is based on dictionary lookup
• Arbitrary PRO is not handled
Results: PP attachment (of and to)

Frame (data)             | Sentences | Correct attachment/UNL | Incorrect | Accuracy %
V-N1-of-N2 (BNC/Oxford)  | 1000      | 886                    | 114       | 88
V-N1-of-N2 (WSJ data)    | 661       | 597                    | 64        | 90

Construction   | Sentences (Oxford/BNC) | Correct role detection | Correct UNL/attachment/PRO resolution
To preposition | 100                    | 97                     | 84
To infinitival | 100                    | 93                     | 77
Results
• Semantic head detection
  – Total (N1-of-N2): 1140
  – Total partitives: 197 (17.3%)
  – Recall (partitive detection): 182 (92%)
• Temporal analysis
  – #Temporal prepositional phrases: 1326
  – #Cases of correct UNL: 1112
  – Average accuracy: 83.9%
Error Analysis
• Inadequate rules
  – Missing rules for common phenomena lead to wrong UNL
• Errors in attributes assigned to dictionary entries
  – Spelling errors, missing attributes, etc.
• Idiomatic constructs
Conclusion
• Future work
  – The approach can be applied to other prepositions
    • Special cases like ‘of’ and ‘to’ could be investigated
  – Clause attachment can be handled similarly
• Key idea
  – Automatic/semi-automatic enrichment of the dictionary
    • It involves adding syntactic and semantic level attributes
Resources
• A. S. Hornby. 2006. Oxford Advanced Learner’s Dictionary of Current English. Oxford University Press, Oxford.
• Chris Greaves. 2006. Web Concordancer. http://www.edict.com.hk
• George Miller. 2003. WordNet 2.0. http://wordnet.princeton.edu/
• M. Marcus, G. Kim and M. Marcinkiewicz. 1994. The Penn Treebank: annotating predicate-argument structure. ARPA.
References
• UNDL Foundation. 2003. The Universal Networking Language (UNL) specifications version 3.2. http://www.unlc.undl.org
• Jignashu Parikh, Jagadish Khot, Shachi Dave and Pushpak Bhattacharyya. 2004. Predicate Preserving Parsing. European Union Working Conference on Sharing Capability in Localization and Human Language Technologies (SCALLA04), Kathmandu, Nepal.
• Jane Grimshaw. 1990. Argument Structure. The MIT Press, Cambridge, Mass.
• E. Brill and P. Resnik. 1994. A Rule-Based Approach to Prepositional Phrase Attachment Disambiguation. Proc. of the Fifteenth International Conference on Computational Linguistics. Kyoto.
• Adwait Ratnaparkhi. 1998. Statistical Models for Unsupervised Prepositional Phrase Attachment. Proceedings of COLING-ACL. http://www.cis.upenn.edu/~adwait/statnlp.html
Contribution
• R. K. Mohanty, A. Almeida, Srinivas S. and P. Bhattacharyya. 2004. The complexity of OF. ICON, Hyderabad, India.
• A. Almeida and P. Bhattacharyya. 2007. Semantics of ‘to’. ICCTA 2007, Kolkata, India.
• R. K. Mohanty, A. Almeida and P. Bhattacharyya. 2005. Prepositional Phrase Attachment and Interlingua. CICLing-2005 Workshop, Mexico.
Thanks
Questions asked by reviewers and answers
Questions - Prof. S. Kaushik
• The lexicon carries a lot of information, which will make development of lexicons a very difficult task. Subsequently, this will make processing slow and inefficient. Comment on this.
  – The entries in the lexicon have the following structure:
    • [Head-word] “Universal Word” (attribute list)
  – In our work, we have been adding more attributes to this attribute list. This does not complicate the dictionary. In MT systems it is common practice to have many attributes for each word in the lexicon. Adding more attributes to words has no effect on the number of entries in the dictionary. If the dictionary size does increase, dictionary access can be made faster with database storage and a proper indexing scheme.
  – We have also tried to address the issue of creating/enriching the lexicon automatically from an annotated corpus and the Oxford dictionary, to simplify dictionary creation.
• Are the existing lexicons and rules scalable?
  – The existing lexicon and rules are scalable.
  – We can add more entries to the lexicon. It uses indexing, so there will be little difference in speed, since the access time is O(log n).
  – Rules can also be extended. For a given language (say English), however, the rules will be finite in number, so there will not be any sizable increase in the number of rules.
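The O(log n) claim corresponds to lookup in a sorted index. The sketch below illustrates that with binary search from Python's standard `bisect` module; the in-memory index and the names `build_index` and `lookup` are assumptions for illustration and say nothing about how the actual dictionary is stored.

```python
import bisect

def build_index(lexicon):
    """Return parallel lists: sorted head words and their attribute lists."""
    keys = sorted(lexicon)
    return keys, [lexicon[key] for key in keys]

def lookup(index, word):
    """Binary search over the sorted head words: O(log n) comparisons."""
    keys, values = index
    position = bisect.bisect_left(keys, word)
    if position < len(keys) and keys[position] == word:
        return values[position]
    return None

index = build_index({
    "forward": ["V", "VOA", "#_TO_AR2"],
    "addition": ["N", "#_TO_AR1"],
    "cup": ["N", "#PARTITIVE"],
})
print(lookup(index, "forward"), lookup(index, "bus"))
```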
• Can your approach be extended to other languages?
  – This work was done specifically for English. It relies heavily on argument structure information and word properties.
  – The linguistic theory, however, can also be applied to similar problems in other languages. The algorithm developed for attachment can be tried out on languages whose structure is similar to that of English.
Questions – Prof. SasiKumar
• How significant is the UNL base for the work reported here? If the translation framework was something else, how much would that affect the work done?
  – UNL is a well-known interlingua. Other interlinguas include LCS (Lexical Conceptual Structure) by Dorr and Conceptual Structures. These interlinguas do not have comparable computational support, since their representation is complex compared to UNL. There is also a universal language called Esperanto, but it lacks preciseness and hence is difficult to represent in the computer.
  – Any framework will have two parts: enconversion and deconversion. The difficulty of analysis depends on how deeply the framework encodes knowledge. Besides, this work is based on argument structure theory and the semantic properties of words. Hence any framework can be used for it.
• What was the methodology adopted for the analysis reported in chapters 4-7?
  – Our approach is based on linguistic theory and principles. The process involves corpus lookup, extraction of different syntactic patterns from the corpus, and their analysis. We relied mainly on concordance search over the Brown corpus and the BNC. Initially, we focused on the analysis of sentences with only of-PPs. For testing we used sentences from the BNC and the WSJ data set used by Ratnaparkhi.
  – For the study of partitives, we manually looked for partitives in the corpus, in addition to using a thesaurus and the WordNet ontologies.
  – For dictionary enrichment, we referred to various available resources and explored them to extract the desired features for the dictionary.
• How do you know if the categories identified for this analysis are exhaustive? Are there alternative ways to categorise? Is there a basis for categorisation?
  – For verbs, we used Beth Levin’s work on verb classification and WordNet. The WordNet ontologies are used for noun categories.
  – In the case of prepositions, we tried to categorize prepositions according to their roles, i.e., temporal, spatial, manner, etc. Except for the temporal role, we were not able to do much work in this direction. We found that unless each preposition is analysed individually, it is difficult to categorize them. So we chose to do a complete analysis of individual prepositions, which led us to select very common prepositions such as ‘of’ and ‘to’.