Universal Networking Language
Shalini Gupta - 07305R02
The Problem
Large exploration of data
Linguistic barriers (multilingualism)
Web content is mostly in English and cannot be accessed without some proficiency in that language
Although India accounts for a large part of the world's population, its share of Internet access is very low
Need for high-speed translation between different languages
Solution: Machine Translation
2 approaches:
Transfer based
  Works on specific pairs of languages
  Some text analysis on the source language, some on the target language
Interlingua based
  Build a universal language
  Convert data to the universal language
  Deconvert it back
  Needs only 2N conversions, as opposed to N*(N-1) translations for transfer based
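The two counts above can be compared with a few lines of Python. This is only an illustrative sketch; the function names are made up for this example.

```python
def transfer_translators(n: int) -> int:
    """Transfer based: one translator per ordered pair of distinct languages."""
    return n * (n - 1)


def interlingua_converters(n: int) -> int:
    """Interlingua based: one converter and one deconverter per language."""
    return 2 * n


for n in (2, 3, 4, 10):
    print(n, transfer_translators(n), interlingua_converters(n))
```

The two counts are equal at N = 3, and the interlingua approach needs fewer components for every N above 3.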
UNL: An Interlingua
Language-independent knowledge representation
Vehicle for machine translation
UNL solves the “Information Monopolies” problem
[Figure: Interlingua (UNL) with Hindi, Chinese, French and English]
Outline
Introduction
UNL Components
Some Controversial Issues in UNL
Language Divergences between Hindi and English
Conclusion
Introduction to UNL
Proposed by the United Nations University
Enables computers to process information and knowledge across language barriers
Replicates the functions of natural languages in human communication
Enables distributing, receiving and understanding multilingual information
Represents information sentence by sentence
UNL Graph
Each sentence is converted into a hypergraph
  Concepts as nodes
  Relations as directed arcs
Concepts are called Universal Words
Word knowledge is represented by Universal Words (UWs), which are language independent
Conceptual knowledge is captured by relating UWs through relations
Example: John eats rice with a spoon
[Figure: UNL graph of the sentence, with callouts for a Universal Word, an Attribute and the Semantic Relations]
UNL Expression
John eats rice with a spoon
{unl}
agt(eat(icl>do).@entry.@present, John(iof>person))
obj(eat(icl>do).@entry.@present, rice(icl>food))
ins(eat(icl>do).@entry.@present, spoon(icl>artifact).@indef)
{/unl}
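The expression for this sentence can be read into (relation, uw1, uw2) triples, which are the arcs of the UNL graph. The Python sketch below is a minimal reader written for this example, not an official UNL tool. It assumes one binary relation per line and three-letter relation labels.

```python
import re

UNL_TEXT = """
agt(eat(icl>do).@entry.@present, John(iof>person))
obj(eat(icl>do).@entry.@present, rice(icl>food))
ins(eat(icl>do).@entry.@present, spoon(icl>artifact).@indef)
"""


def split_args(body):
    """Split 'uw1, uw2' at the top-level comma, ignoring commas inside parentheses."""
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return body[:i].strip(), body[i + 1:].strip()
    raise ValueError("expected two arguments: " + body)


def parse_relation(line):
    """Parse one line of the form <relation>(<uw1>, <uw2>)."""
    match = re.fullmatch(r"([a-z]{3})\((.*)\)", line.strip())
    if match is None:
        raise ValueError("not a binary relation: " + line)
    relation, body = match.groups()
    uw1, uw2 = split_args(body)
    return relation, uw1, uw2


def parse_unl(text):
    """Return the list of (relation, uw1, uw2) arcs in a UNL expression."""
    return [parse_relation(line) for line in text.strip().splitlines() if line.strip()]


triples = parse_unl(UNL_TEXT)
for relation, uw1, uw2 in triples:
    print(relation, "|", uw1, "->", uw2)
```

Every arc starts at the same node, eat(icl>do).@entry.@present, which is why the graph has a single entry node with three outgoing arcs.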
Universal Word
Syntactic and semantic unit of UNL
Represents a concept
Represents a node in the graph of a UNL expression
2 classes:
  Unit concepts
    Basic UWs
    Restricted UWs
    Extra UWs
  Compound concepts: Scopes
Types of Universal Words (UWs)
Basic UWs
  Bare headwords with no constraint list
  E.g.: house, drink
Restricted UWs
  Headwords with a constraint list
  Represent a more specific concept, or a subset of concepts
Types of UWs (contd.)
The constraint list restricts the range of the concept that a Basic UW represents
  E.g.: state(icl>country), state(icl>abstract thing)
Extra UWs
  Special type of Restricted UW
  Denote concepts that are not present in English
  Foreign-language words are used as headwords
  E.g.: Bharatnatyam(icl>dance)
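The headword and constraint list of a UW can be separated mechanically. The Python sketch below is illustrative only. The names UW, parse_uw and kind are made up, and it assumes constraints are written as relation>target and separated by commas.

```python
from typing import NamedTuple, Tuple


class UW(NamedTuple):
    headword: str
    constraints: Tuple[Tuple[str, str], ...]  # (relation, target) pairs


def parse_uw(text: str) -> UW:
    """Split a UW such as 'state(icl>country)' into headword and constraint list."""
    text = text.strip()
    if "(" not in text:
        return UW(text, ())  # Basic UW: bare headword
    head, _, rest = text.partition("(")
    if not rest.endswith(")"):
        raise ValueError("unbalanced constraint list: " + text)
    constraints = []
    for item in rest[:-1].split(","):
        relation, sep, target = item.strip().partition(">")
        if not sep:
            raise ValueError("constraint must look like rel>target: " + item)
        constraints.append((relation, target))
    return UW(head.strip(), tuple(constraints))


def kind(uw: UW) -> str:
    """A UW without a constraint list is basic; otherwise it is restricted."""
    return "restricted" if uw.constraints else "basic"


for text in ("house", "state(icl>country)", "state(icl>abstract thing)", "Bharatnatyam(icl>dance)"):
    uw = parse_uw(text)
    print(uw.headword, uw.constraints, kind(uw))
```

An Extra UW has the same shape as any other Restricted UW. Only its headword, which is not an English word, sets it apart, so this sketch reports it as restricted.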
Compound Concepts
Raju said that [he had opened the window]
[Figure: UNL graph. say(icl>do).@entry.@past has agt Raju(iof>person) and obj :01; inside scope :01, open(icl>do).@entry.@past.@complete has agt he and obj window(icl>obj)]
Compound Concepts (contd.)
Set of binary relations that are grouped together to express a compound concept
Interpreted as a whole
Expressed by a scope in UNL expressions
Raju said that [he had opened the window].
  The part of the sentence within square brackets should be grouped
  Only when these words are grouped together and considered as a whole unit can the correct interpretation be obtained
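One way to picture a scope is as a named group of relations that another relation can point to as a single node. The Python sketch below models the Raju sentence that way. The tuple layout and function names are assumptions made for this example; the label 01 follows the scope label in the slide's graph.

```python
from collections import defaultdict

# Each entry is (scope_id, relation, uw1, uw2); scope_id "" means the main sentence.
RELATIONS = [
    ("", "agt", "say(icl>do).@entry.@past", "Raju(iof>person)"),
    ("", "obj", "say(icl>do).@entry.@past", ":01"),
    ("01", "agt", "open(icl>do).@entry.@past.@complete", "he"),
    ("01", "obj", "open(icl>do).@entry.@past.@complete", "window(icl>obj)"),
]


def group_by_scope(relations):
    """Collect the relations that belong to each scope."""
    scopes = defaultdict(list)
    for scope_id, relation, uw1, uw2 in relations:
        scopes[scope_id].append((relation, uw1, uw2))
    return dict(scopes)


def expand(node, scopes):
    """Return the grouped relations behind a scope reference such as ':01'."""
    if node.startswith(":"):
        return scopes[node[1:]]
    return []  # an ordinary UW stands only for itself


scopes = group_by_scope(RELATIONS)
clause = expand(":01", scopes)
print(clause)
```

The object of say is the whole group :01 rather than open or window alone, which is the "considered as a whole unit" reading.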
Relations
A relation of UNL is expressed as: <relation>(<uw1>, <uw2>)
  <relation> is one of the relations defined in UNL
  <uw1>, <uw2> are universal words
E.g. John broke the window
  agt(break(icl>do).@entry.@past, John(iof>person))
  obj(break(icl>do).@entry.@past, window(icl>thing))
41 such relations have been defined
Attributes
Describe the subjectivity of the sentence
Enrich the description given by UWs and relations
E.g. time with respect to the speaker
  happened in the past: @past
  happening at present: @present
  will happen in future: @future
John broke the window
  agt(break(icl>do).@entry.@past, John(iof>person))
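Attributes are attached to a UW with ".@", so they can be peeled off the node string. The Python sketch below is illustrative. The function names are made up, and it assumes ".@" never occurs inside a headword or constraint list.

```python
TENSE_ATTRIBUTES = {"@past": "past", "@present": "present", "@future": "future"}


def split_attributes(node):
    """Split 'break(icl>do).@entry.@past' into the UW and its attribute list."""
    uw, *names = node.split(".@")
    return uw, ["@" + name for name in names]


def tense_of(node):
    """Return the tense named by the node's attributes, or None if there is none."""
    _, attributes = split_attributes(node)
    for attribute in attributes:
        if attribute in TENSE_ATTRIBUTES:
            return TENSE_ATTRIBUTES[attribute]
    return None


print(split_attributes("break(icl>do).@entry.@past"))
print(tense_of("break(icl>do).@entry.@past"))
print(tense_of("John(iof>person)"))
```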
UNL Knowledge Base
Defines every possible relation between concepts
Two important roles
  Defines the semantics of Universal Words
  Gives linguistic knowledge of concepts
E.g. The anchor wrote the script
  Linguistic knowledge tells that the anchor is a person
  Semantics tells that only a person can write a script (an anchor of a ship can't do so)
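The anchor example can be mimicked with a toy lookup. The Python sketch below is purely illustrative: the UWs, class labels and function name are hypothetical, and none of it is taken from the actual UNL Knowledge Base.

```python
# Toy knowledge base. All UWs and class labels below are hypothetical examples,
# not entries from the actual UNL Knowledge Base.
SEMANTIC_CLASS = {
    "anchor(icl>person)": "person",
    "anchor(icl>device)": "thing",
    "script(icl>document)": "thing",
}

# Semantic class that the agent (agt) of each verb must belong to.
REQUIRED_AGENT_CLASS = {
    "write(icl>do)": "person",
}


def select_agent_sense(candidates, verb):
    """Keep only the candidate UWs whose class fits the verb's agent slot."""
    required = REQUIRED_AGENT_CLASS[verb]
    return [uw for uw in candidates if SEMANTIC_CLASS[uw] == required]


senses = select_agent_sense(["anchor(icl>person)", "anchor(icl>device)"], "write(icl>do)")
print(senses)
```

Only the person sense of anchor survives as the agent of write, which matches the reasoning on the slide.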
Controversial Issues
Meaning representation language:
  Should provide sufficient means to express knowledge
  Should be simple
The main expressive device of UNL is restrictions
New expressive means for describing UWs have been proposed
Semantic Restriction
UW: operator(icl>thing)
  Doesn't effectively separate the meanings
  2 meanings:
    long-distance operator: operator(icl>human)
    addition operator: operator(icl>abstract thing)
Hypernymy and meronymy are mostly used for expressing restrictions
Synonymy and antonymy can also be used
  E.g. wealth(equ>richness), poor(ant>rich)
Argument Frame Restriction
X borrows Y from Z for W
  All four arguments are needed to define the action of borrowing completely
Example
  John borrowed $10000 for 3 years
  John has been borrowing money for 3 years
UNL as a meaning representation language should be able to distinguish between the argument and non-argument links of predicates
Weakly Differentiated Relations
Some relations seem to be weakly differentiated and are therefore difficult to use consistently
  E.g. gol (final state) – plt (final place)
  E.g. src (initial state) – plf (initial place)
John went to Brussels can be described with both gol and plt
  The difference is that gol characterizes Brussels as the final state of John, while plt characterizes it as the final place of the whole event
Redundant Relations
Some relations seem to be based more on the semantic class of UWs
  E.g. mod (modification) – man (manner)
  The difference between them boils down to the semantic class of the starting point of the relation
    answered politely (man) [to answer]
    a polite answer (mod) [an answer]
Relations 'man' and 'mod' can be merged
Divergences between English and Hindi
Constituent Order Divergence
  Jim is playing tennis. (S) (V) (O)
  जिम टेनिस खेल रहा है (S) (O) (V)
Adjunction Divergence
  The [living in Delhi] boy
  दिल्ली में रहनेवाला लड़का
Preposition-Stranding Divergence
  Which shop did John go to?
  किस दुकान जॉन गया में
Divergences (contd.)
Null Subject Divergence
  जा रहा हूँ
  going-am
Pleonastic Divergence
  It is raining.
  यह बारिश हो रही है
Conflational Divergence
  Jim stabbed him.
  जिम उसको छुरे से मारा
Promotional Divergence
  The play is on.
  खेल चल रहा है
Conclusion
UNL is an Interlingua for Machine Translation
Studied Components of UNL
Controversial Issues in UNL
Divergences between English and Hindi
References
Igor Boguslavsky. Some controversial issues of UNL: linguistic aspects. 2004.
Shachi Dave and Pushpak Bhattacharyya. Knowledge extraction from Hindi text. 2001.
Shachi Dave, Jignashu Parikh, and Pushpak Bhattacharyya. Interlingua-based English-Hindi machine translation and language divergence. Machine Translation, 16(4):251–304, 2001.
The Universal Networking Language manual. www.undl.org, 2006.
Zhu M. and Uchida H. The Universal Networking Language (UNL) specifications. Technical Report, 2005.
Thank You
UNL System
Knowledge Extraction from Hindi Text
EnConverter is a language-independent parser
  Provides a framework for analysis
  Need to provide a lexicon and analysis rules
Analysis Rule: (<PRE>)... <LNODE> <RNODE> (<SUF1>) (<SUF2>) (<SUF3>)... <PRI>
Lexicon Entry: [HW] {ID} "UW" (ATTRIB1, ATTRIB2, ...) <FLG,FRE,PRI>;
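The lexicon entry layout can be matched with a regular expression. The Python sketch below is an assumption-laden illustration: the sample entry is invented for this example, the field names simply mirror the placeholders on the slide, and real EnConverter dictionaries may allow details this pattern does not cover.

```python
import re
from typing import List, NamedTuple


class LexiconEntry(NamedTuple):
    hw: str            # headword, from [HW]
    id: str            # from {ID}
    uw: str            # universal word, from "UW"
    attribs: List[str]
    flg: str
    fre: str
    pri: str


ENTRY_RE = re.compile(
    r'\[(?P<hw>[^\]]*)\]\s*'
    r'\{(?P<id>[^}]*)\}\s*'
    r'"(?P<uw>[^"]*)"\s*'
    r'\((?P<attribs>[^)]*)\)\s*'
    r'<(?P<flg>[^,>]*),(?P<fre>[^,>]*),(?P<pri>[^,>]*)>\s*;'
)


def parse_entry(line: str) -> LexiconEntry:
    """Parse one entry of the form [HW] {ID} "UW" (ATTRIB1, ...) <FLG,FRE,PRI>;"""
    match = ENTRY_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError("not a lexicon entry: " + line)
    attribs = [a.strip() for a in match["attribs"].split(",") if a.strip()]
    return LexiconEntry(match["hw"], match["id"], match["uw"], attribs,
                        match["flg"].strip(), match["fre"].strip(), match["pri"].strip())


# Hypothetical sample entry, invented for illustration.
SAMPLE = '[ghar] {} "house(icl>place)" (N, PLACE) <H,0,0>;'
entry = parse_entry(SAMPLE)
print(entry)
```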
Knowledge Extraction from Hindi Text
Each step:
[Figure: processing steps; recoverable labels: Morphological Analysis, Decision, Relation, Lexical Attribute, UNL Attribute]
Verbal Concepts
Classes of predicates
  actions (have an active initiator, e.g. kill)
  activities (sets of heterogeneous actions with a common goal, e.g. trade)
  events (have no agent, e.g. the bridge broke)
  processes (denote a situation that occupies a certain time span, e.g. the tree grows)
  states (homogeneous, do not denote a change, e.g. hear, ache)
Classes of predicates
  properties (differ from states in that they are atemporal, e.g. blind, red)
  relations (specify a relation between two or more things, e.g. love, hate)
In UNL, all verbal concepts are grouped into three classes
  (icl>do) contains actions and activities
  (icl>occur) consists of events and processes
  (icl>be) is composed of states, properties and relations
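The grouping of the seven predicate classes into three UNL classes amounts to a lookup table. The Python sketch below is illustrative, and the helper name restrict is made up for this example.

```python
# Mapping from the seven predicate classes to the three UNL verbal classes.
UNL_VERB_CLASS = {
    "action": "icl>do",
    "activity": "icl>do",
    "event": "icl>occur",
    "process": "icl>occur",
    "state": "icl>be",
    "property": "icl>be",
    "relation": "icl>be",
}


def restrict(headword: str, predicate_class: str) -> str:
    """Build a restricted UW from a headword and its predicate class."""
    return f"{headword}({UNL_VERB_CLASS[predicate_class]})"


print(restrict("kill", "action"))
print(restrict("grow", "process"))
print(restrict("hear", "state"))
```

The mapping is many-to-one, so the UW alone no longer says whether a predicate such as hear(icl>be) was classified as a state, a property or a relation.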
Adjectival Concepts
All adjectival concepts are divided into two classes:
  predicative (aoj>thing)
  restrictive (mod>thing)
This does not work well in some situations
  E.g. Wise Greeks diluted wine with water
  Restrictive interpretation: ‘Those Greeks who were wise diluted wine with water. Silly ones didn’t.’
  Non-restrictive (qualificative) interpretation: ‘Greeks were wise. They diluted wine with water.’
The relevant distinction is restrictive vs. qualificative
It should be applied to other modifiers also
The students sitting in the corner are waiting for the professor
The students(,) who are sitting in the corner(,) are waiting for the professor
The students in the corner are waiting for the professor
The phrase 'who are sitting' can be
  restrictive (‘those of the students who are sitting in the corner are waiting for the professor; others are not’)
  non-restrictive (‘the students are waiting for the professor; they are sitting in the corner’)