CS460/626 : Natural Language Processing/Speech, NLP and the Web
(Lecture 37: Semantics; Universal Networking Language)
Pushpak Bhattacharyya
CSE Dept., IIT Bombay
12th April, 2011
Semantics: wikipedia
•Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning.
•It typically focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata.
Computational Semantics: wikipedia
•Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions.
•Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution.
•Methods employed usually draw from formal semantics or statistical semantics.
•Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving).
•Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.
A hurdle: signifier-denotata dichotomy
• Divide between a word and what it stands for
• “red” is NOT red in colour
• “red wine”, “red rose”, “he is in the red” denote very different senses of the word
• Translation into another language reveals this difference
A Perspective
[Figure: Semantics shown in relation to the other levels of language processing: Morphology, Lexicon, Syntax, Pragmatics, Discourse]
Our tryst with semantics:
Universal Networking Language (UNL)
Motivation
• Extraction of semantics, i.e., deep meaning, is important for many applications
  • Machine Translation, Meaning-based IR, CLIR
• Robust, scalable and efficient methods of knowledge extraction are required
• Machine Translation and Cross-Lingual IR: a need of the hour for crossing the language barrier
Interlingua: a vehicle for machine translation
[Figure: English, Hindi, French and Chinese each connect to the Interlingua (UNL); analysis goes from a language to UNL, generation from UNL to a language]
UNL: a United Nations project
• Started in 1996
• 10 year program
• 15 research groups across continents
• First goal: generators
• Next goal: analysers (needs solving various ambiguity problems)
• Current active language groups
  • UNL_French (GETA-CLIPS, IMAG)
  • UNL_English+Hindi
  • UNL_Italian (Univ. of Pisa)
  • UNL_Portuguese (Univ. of Sao Paulo, Brazil)
  • UNL_Russian (Institute of Linguistics, Moscow)
  • UNL_Spanish (UPM, Madrid)
World-wide Universal Networking Language (UNL) Project
• Language independent meaning representation.
[Figure: UNL at the centre, linked to English, Russian, Marathi, Japanese, Hindi, Spanish and others]
The UNL MT System: an Overview
NLP@IITB
Foundations and Applications
• UNL Foundations
  • Semantic Relations
  • Universal Words
  • Attributes
  • How to write UNL expressions
• UNL Applications
  • Machine Translation: Rule based and Statistical
  • Search
  • Text Entailment
  • Sentiment Analysis
Language Processing & Understanding
• Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization
• IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback
• Machine Learning: Semantic Role labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks
• Machine Translation: Statistical, Interlingua Based, English→Indian languages, Indian languages→Indian languages, Indowordnet
Resources: http://www.cfilt.iitb.ac.in
Publications: http://www.cse.iitb.ac.in/~pb
Linguistics is the eye and computation the body
UNL represents knowledge: John eats rice with a spoon
[Figure: UNL graph of the sentence, with its semantic relations, attributes and universal words labelled]
Repository of 42 semantic relations and 84 attribute labels
Sentence embeddings
Deepa claimed that she had composed a poem.
[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[\UNL]
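The relation lines above have a regular shape, a label, an optional scope such as :01, and two arguments, so they can be read mechanically. The following is a minimal Python sketch under that assumption; the names RELATION, Relation and parse_relation are illustrative and not part of any UNL tool, and the pattern does not handle universal words whose constraint lists contain commas.

```python
import re
from typing import NamedTuple, Optional

# Assumed line shape, taken from the slide: label[:scope](source, target)
RELATION = re.compile(r"^(\w+)(?::(\w+))?\((.+),\s*(.+)\)$")

class Relation(NamedTuple):
    label: str             # relation name, e.g. "agt" or "obj"
    scope: Optional[str]   # "01" for relations inside scope :01, else None
    source: str
    target: str

def parse_relation(line: str) -> Relation:
    """Parse one relation line; raise ValueError if it does not fit the shape."""
    match = RELATION.match(line.strip())
    if match is None:
        raise ValueError(f"not a UNL relation: {line!r}")
    return Relation(*match.groups())

lines = [
    "agt(claim.@entry.@past, Deepa)",
    "obj(claim.@entry.@past, :01)",
    "agt:01(compose.@past.@entry.@complete, she)",
    "obj:01(compose.@past.@entry.@complete, poem.@indef)",
]
relations = [parse_relation(line) for line in lines]

# Relations of the main clause and of the embedded clause (scope :01).
main_clause = [r for r in relations if r.scope is None]
embedded = [r for r in relations if r.scope == "01"]
```

Here the object of claim is the whole scope :01, which is how the embedded clause "she had composed a poem" is attached to the main clause.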
Constituents of Universal Networking Language
• Universal Words (UWs)
• Relations
• Attributes
• Knowledge Base
UNL Graph
He forwarded the mail to the minister.
[Figure: UNL graph with entry node forward(icl>send) carrying @entry and @past; agt edge to he(icl>person), obj edge to mail(icl>collection) with @def, gol edge to minister(icl>person) with @def]
UNL Expression
agt(forward(icl>send).@entry.@past, he(icl>person))
obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)
gol(forward(icl>send).@entry.@past, minister(icl>person).@def)
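A UNL expression is a list of labelled edges, so the graph can be held in an ordinary dictionary. The sketch below is one possible Python representation, not a standard one. It reads the mail as the obj (the thing forwarded) and the minister as the gol (the recipient), and it assumes at most one edge per label leaving a node. The names split_node, graph and attributes are illustrative.

```python
from collections import defaultdict

def split_node(node: str):
    """Split 'uw.@a.@b' into the universal word and its attribute names."""
    uw, *attrs = node.split(".@")
    return uw, attrs

# (relation label, source node, target node) for the example sentence.
triples = [
    ("agt", "forward(icl>send).@entry.@past", "he(icl>person)"),
    ("obj", "forward(icl>send).@entry.@past", "mail(icl>collection).@def"),
    ("gol", "forward(icl>send).@entry.@past", "minister(icl>person).@def"),
]

graph = defaultdict(dict)   # source UW -> {relation label: target UW}
attributes = {}             # UW -> list of attribute names

for label, source, target in triples:
    for node in (source, target):
        uw, attrs = split_node(node)
        attributes[uw] = attrs
    graph[split_node(source)[0]][label] = split_node(target)[0]

# The entry node is the one carrying the @entry attribute.
entry_nodes = [uw for uw, attrs in attributes.items() if "entry" in attrs]
```

With this layout, graph["forward(icl>send)"]["agt"] gives the agent node, and the attributes stay separate from the universal words they decorate.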
What is a Universal Word (UW)?
• Words of UNL
• Constitute the UNL vocabulary, the syntactic-semantic units to form UNL expressions
• A UW represents a concept
  • Basic UW (an English word/compound word/phrase with no restrictions or Constraint List)
  • Restricted UW (with a Constraint List)
• Examples:
  “crane(icl>device)”
  “crane(icl>bird)”
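The difference between a basic and a restricted UW can be made concrete by splitting a UW into its headword and constraint list. This is a small Python sketch; parse_uw and senses are names introduced here for illustration, and the comma-separated constraint list is an assumption about the notation.

```python
from collections import defaultdict

def parse_uw(uw: str):
    """Split 'head(c1,c2)' into ('head', ['c1', 'c2']).

    A basic UW has no parenthesised part and yields an empty constraint list.
    """
    head, paren, rest = uw.partition("(")
    if not paren:
        return uw, []
    if not rest.endswith(")"):
        raise ValueError(f"unbalanced constraint list in {uw!r}")
    return head, [c.strip() for c in rest[:-1].split(",")]

# Group UWs by headword: one headword, several concepts.
senses = defaultdict(list)
for uw in ["crane(icl>device)", "crane(icl>bird)", "he"]:
    head, constraints = parse_uw(uw)
    senses[head].append(constraints)
```

The headword "crane" ends up with two entries, one per concept, while the basic UW "he" has a single entry with no constraints.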
The Lexicon
Format of the dictionary entry
[headword] {} “Universal word” (Attribute list);
e.g., [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN);
• Head word
• Universal word
• Attributes
  • Morphological - Pl (plural), V_ed (past tense form)
  • Syntactic - V (verb), VOA (verb of action)
  • Semantic - ANIMT (animate), PLACE, TIME
The Lexicon (contd.)
Content words:
He forwarded the mail to the minister.
[forward] {} “forward(icl>send)” (V,VOA) <E,0,0>;
[mail] {} “mail(icl>message)” (N,PHSCL,INANI) <E,0,0>;
[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;
Fields, in order: Headword, Universal Word, Attributes
The Lexicon (contd.)
Function words:
He forwarded the mail to the minister.
[he] {} “he” (PRON,SUB,SING,3RD) <E,0,0>;
[the] {} “the” (ART,THE) <E,0,0>;
[to] {} “to” (PRE,#TO) <E,0,0>;
Fields, in order: Headword, Universal Word, Attributes
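Because every dictionary line follows the same pattern, the entries can be loaded with one regular expression. The Python sketch below is written against the slide examples only and is not the official dictionary format. The names ENTRY and parse_entry are illustrative, and the trailing <E,0,0> group is kept as raw fields because the slides do not explain it.

```python
import re

# Assumed entry shape, taken from the slide examples:
#   [headword] {} "universal word" (ATTR,ATTR,...) <E,0,0>;
ENTRY = re.compile(
    r'^\[(?P<head>[^\]]+)\]\s*\{\}\s*"(?P<uw>[^"]+)"\s*'
    r'\((?P<attrs>[^)]*)\)\s*<(?P<trailer>[^>]*)>;$'
)

def parse_entry(line: str) -> dict:
    """Parse one dictionary line into head, uw, attrs and trailer fields."""
    # Accept the curly quotes used on the slides as well as straight quotes.
    normalized = line.strip().replace("“", '"').replace("”", '"')
    match = ENTRY.match(normalized)
    if match is None:
        raise ValueError(f"unrecognised entry: {line!r}")
    fields = match.groupdict()
    fields["attrs"] = fields["attrs"].split(",")
    fields["trailer"] = fields["trailer"].split(",")
    return fields

lexicon = [parse_entry(line) for line in [
    '[forward] {} "forward(icl>send)" (V,VOA) <E,0,0>;',
    '[he] {} "he" (PRON,SUB,SING,3RD) <E,0,0>;',
    '[to] {} "to" (PRE,#TO) <E,0,0>;',
]]

# Example query: headwords whose attribute list marks them as verbs.
verbs = [entry["head"] for entry in lexicon if "V" in entry["attrs"]]
```

Content words and function words share the format; only the attribute lists tell them apart, for example V,VOA for a verb of action and PRE for a preposition.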
Hindi example: noun example 1/2
Universal word: farmer(icl>creator)
Head word and attributes by language:
E: farmer (N,ANIMT,FAUNA,MML,PRSN)
M: शेतकरी (N,M,ANIMT,FAUNA,MML,PRSN)
H: किसान (N,M,ANIMT,FAUNA,MML,PRSN,Na)
The Features of a UW
• Every concept existing in any language must correspond to a UW
• The constraint list should be as small as necessary to disambiguate the headword
• Every UW should be defined in the UNL Knowledge-Base
Restricted UWs
• Examples
  • He will hold office until the spring of next year.
  • The spring was broken.
• Restricted UWs, which are Headwords with a constraint list, for example:
  “spring(icl>season)”
  “spring(icl>device)”
  “spring(icl>jump)”
  “spring(icl>fountain)”
How to create UWs?
• Pick up a concept
  • the concept of “crane” as “a device for lifting heavy loads” or as “a long-legged bird that wades in water in search of food”
• Choose an English word for the concept.
  • In the case of “crane”, since it is a word of English, the corresponding word should be ‘crane’
• Choose a constraint list for the word.
  • [ ] ‘crane(icl>device)’
  • [ ] ‘crane(icl>bird)’
How to create UNL expressions
English sentences: basic structure
• A <verb> B
  • John eats bread
  • agt(eat.@entry, John)
  • obj(eat.@entry, bread)
  [Figure: verb node with relation R1 to A and relation R2 to B]
• A <verb>
  • John sleeps
  • aoj(sleep.@entry, John)
  [Figure: verb node with relation R1 to A]
• A <be> B
  • John is good
  • aoj(good.@entry, John)
  [Figure: B node with relation aoj to A]
Hindi sentences: basic structure
• A B <verb>
  • John roti khaataa hai
  • agt(eat.@entry, John)
  • obj(eat.@entry, bread)
  [Figure: verb node with relation R1 to A and relation R2 to B]
• A <verb>
  • John sotaa hai
  • aoj(sleep.@entry, John)
  [Figure: verb node with relation R1 to A]
• A <be> B
  • John acchaa hai
  • aoj(good.@entry, John)
  [Figure: B node with relation aoj to A]
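The English and Hindi slides give the same UNL relations for sentences with different word order, which is what an interlingua is meant to capture. The toy Python sketch below shows this for the transitive pattern only. The two lexicons and the function transitive_unl are assumptions made for illustration, the auxiliary "hai" is dropped by hand, and real analysis would need morphology and disambiguation.

```python
# Toy lexicons (assumptions for illustration): surface word -> UW headword.
ENGLISH = {"eats": "eat"}
HINDI = {"roti": "bread", "khaataa": "eat"}

def transitive_unl(tokens, order, lexicon):
    """Map a three-word transitive clause to agt/obj relations.

    order gives the role of each position: "SVO" for English, "SOV" for Hindi.
    Words missing from the lexicon (here the name John) are kept unchanged.
    """
    role_of = dict(zip(order, tokens))   # e.g. {"S": "John", "V": "eats", "O": "bread"}
    subj, verb, obj = (lexicon.get(role_of[r], role_of[r]) for r in "SVO")
    return [f"agt({verb}.@entry, {subj})", f"obj({verb}.@entry, {obj})"]

english = transitive_unl(["John", "eats", "bread"], "SVO", ENGLISH)
hindi = transitive_unl(["John", "roti", "khaataa"], "SOV", HINDI)
```

Both calls produce the same two relations, agt(eat.@entry, John) and obj(eat.@entry, bread); only the analysis step differs between the languages.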
Complex English sentences: Use recursion on the basic structure
A <verb> B
• John who is a good boy eats bread which is toasted
• agt(eat.@entry, :01)
• obj(eat.@entry, :02)
• aoj:01(boy, John.@entry)
• mod:01(boy, good)
• obj:02(toast, bread.@entry.@focus)
[Figure: eat has an agt edge to scope :01 and an obj edge to scope :02; scope :01 holds boy with an aoj edge to John and a mod edge to good; scope :02 holds toast with an obj edge to bread. Red arrows indicate entry nodes]
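The recursion can be sketched as a function that flattens a nested clause into relation lines, giving each embedded clause its own scope number. This Python version is illustrative only: the dictionary layout and the name flatten are assumptions, and the toast clause is placed in scope :02 so that it matches obj(eat.@entry, :02).

```python
from itertools import count

def flatten(clause, ids, scope=""):
    """Turn a nested clause into UNL relation lines.

    clause["relations"] holds (label, source, target) triples; a target that is
    itself a clause dictionary becomes a new scope such as :01.
    clause["attrs"] maps a word to its attribute names.
    """
    attrs = clause.get("attrs", {})

    def render(word):
        return word + "".join(f".@{a}" for a in attrs.get(word, []))

    lines, nested = [], []
    for label, source, target in clause["relations"]:
        if isinstance(target, dict):
            scope_id = f":{next(ids):02d}"      # allocate the next scope number
            nested.append((target, scope_id))
            target_text = scope_id
        else:
            target_text = render(target)
        lines.append(f"{label}{scope}({render(source)}, {target_text})")
    for sub_clause, scope_id in nested:         # recurse into embedded clauses
        lines += flatten(sub_clause, ids, scope_id)
    return lines

good_boy = {"relations": [("aoj", "boy", "John"), ("mod", "boy", "good")],
            "attrs": {"John": ["entry"]}}
toasted = {"relations": [("obj", "toast", "bread")],
           "attrs": {"bread": ["entry", "focus"]}}
sentence = {"relations": [("agt", "eat", good_boy), ("obj", "eat", toasted)],
            "attrs": {"eat": ["entry"]}}

unl_lines = flatten(sentence, count(1))
```

Scope numbers are handed out in the order the embedded clauses are met, so the subject clause becomes :01 and the object clause :02, as on the slide.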