
Let us try to understand and write UNL

(Universal Networking Language)

ATR-SLT seminar, 12/8/05, rev. 14/12/06, 7/2/07
Christian Boitet, GETALP, LIG, IMAG, UJF (Grenoble 1)

ATR-RICM3 ©Ch. Boitet Learn how to read and write UNL 1

Plan

• Introduce UNL
• Learn UNL
• Read UNL
• Write UNL


Why use UNL as a pivot ?

• Brief reminder: UNL is
  • a project
  • an artificial language
  • a format of multilingual documents (actually, 2 formats)

• The UNL language has unique features, even if it is perfectible!

• It is in effect an "anglo-semantic pivot"


Language : a simple UNL graph (false: "score a goal#1", "goal#2"!)

Ronaldo has scored a goal into the left corner of the goal -- Ronaldo a marqué un but dans le coin gauche des buts


A UNL graph with recursion and its auxiliary UNL tree

Isaac sees that an apple falls and he explains it.

agt(explain(icl>do).@entry,Isaac(icl>proper noun))
obj(explain(icl>do).@entry,:01)
obj:01(fall(icl>occur).@entry,apple)
and(explain(icl>do).@entry,see(icl>do))
agt(see(icl>do),Isaac(icl>proper noun))
obj(see(icl>do),:01)

[Figure: two drawings of this sentence, the UNL (hyper)graph and the auxiliary UNL tree. Nodes: explain, see, Isaac and the scope node :01 (containing fall and apple); relations: agt, obj, and.]


What is the UNL language ?

• Small ongoing controversy…
• A way to look at a UNL (hyper)graph:
  it corresponds to an utterance U-L in language L by representing the abstract structure
  • of an equivalent English utterance U-E
  • as « viewed from L »

==> the semantic attributes not necessarily expressed in L may be absent: frequent under-specification
  • aspect coming from French,
  • determination or number coming from Japanese,
  • etc.


The reasons for using UNL in MT (aims being quality multilingual MT)

(and it can be used in many other ways!)

PROs
• Technical success of pivot MT does exist
  (ATLAS, PIVOT, ULTRA, KANT for text MT; CSTAR-II, MedSLT, MASTOR for speech MT)
• UNL derives from the pivot of ATLAS-II (Fujitsu) and was designed by the same author (H. Uchida)
• Possible quality & coverage:
  ATLAS-II has been the best E ↔ J system for more than 15 years
  Its version 13 has more than 5,440,000 entries in each dictionary (E, J)

CONs
• Translation via UNL (double!) certainly leads to a lesser asymptotic quality than transfer via « multi-level structures »,
BUT
• UNL can be « co-edited » from any source language
• UNL does not imply any computational approach
• With a large enough corpus of pairs (sentence, UNL graph), a deconverter and an enconverter could be learned by corpus-based methods
  (cf. SLT MASTOR system of IBM)


The original UNL-html format

<HTML><HEAD><TITLE>Example 1 El/UNL</TITLE></HEAD><BODY>
[D:dn=Mar Example 1, on= UNL French, [email protected]]
[P]
[S:1]
{org:el}I ran in the park yesterday.{/org}
{unl}
agt(run(icl>do).@entry.@past,i(icl>person))
plc(run(icl>do).@entry.@past,park(icl>place).@def)
tim(run(icl>do).@entry.@past,yesterday)
{/unl}
{cn dtime=20020130-2030, deco=man}我昨天在公園裡跑步{/cn}
{de dtime=20020130-2035, deco=man}Ich lief gestern im Park.{/de}
{es dtime=20020130-2031, deco=UNL-SP}Yo corri ayer en el parque.{/es}
{fr dtime=20020131-0805, deco=UNL-FR}J’ai couru dans le parc hier.{/fr}
[/S]
[S:2]
{org:el}My dog barked at me.{/org}
{unl}
agt(bark(icl>do).@entry.@past,dog(icl>animal))
gol(bark(icl>do).@entry.@past,i(icl>person))
pos(dog(icl>animal),i(icl>person))
{/unl}
{de dtime=20020130-2036, deco=man}Mein Hund bellte zu mir.{/de}
{fr dtime=20020131-0806, deco=UNL-FR}Mon chien aboya pour moi.{/fr}
[/S]
[/P]
[/D]
</BODY></HTML>


The equivalent UNL-xml format

• As simple as UNL-html
• Open to all XML-related tools

<unl:D on="WJT" dt="04032002">
<unl:P number="1">
<unl:S number="1">
<unl:org lang="cn">我昨天在公園裡跑步</unl:org>
<unl:unl sn="Ariane" pn="WJT" dt="04032002">
agt(run.@entry.@past,i)
plc(run.@entry.@past,park.@def)
tim(run.@entry.@past,yesterday)
</unl:unl>
<unl:GS lang="cn">我昨天在公園裡跑步</unl:GS>
<unl:GS lang="de">Ich lief in den Park gestern.</unl:GS>
<unl:GS lang="el">I ran in the park yesterday.</unl:GS>
<unl:GS lang="es">Yo corri ayer en el parque.</unl:GS>
<unl:GS lang="fr">J’ai couru dans le parc hier.</unl:GS>
</unl:S>
</unl:P>
</unl:D>
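As an illustration of "open to all XML-related tools", here is a minimal sketch that reads such a document with Python's standard XML parser. It rests on assumptions that are not in the slides: the sample is abbreviated, it uses straight quotes, and it declares the `unl:` prefix with a hypothetical namespace URI (`urn:example:unl`), since a standard parser rejects an undeclared prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen for this sketch only (not given in the slides).
NS = {"unl": "urn:example:unl"}

# Abbreviated sample in the UNL-xml layout shown above.
DOC = """<unl:D xmlns:unl="urn:example:unl" on="WJT" dt="04032002">
<unl:P number="1">
<unl:S number="1">
<unl:unl sn="Ariane" pn="WJT" dt="04032002">
agt(run.@entry.@past,i)
plc(run.@entry.@past,park.@def)
tim(run.@entry.@past,yesterday)
</unl:unl>
<unl:GS lang="es">Yo corri ayer en el parque.</unl:GS>
<unl:GS lang="fr">J'ai couru dans le parc hier.</unl:GS>
</unl:S>
</unl:P>
</unl:D>"""


def sentences(xml_text):
    """Yield (sentence number, UNL relation lines, {language: generated sentence})."""
    root = ET.fromstring(xml_text)
    for s in root.iterfind(".//unl:S", NS):
        # The relation lines contain no blanks here, so splitting on whitespace
        # gives one string per relation.
        graph = s.findtext("unl:unl", default="", namespaces=NS).split()
        generated = {gs.get("lang"): gs.text.strip() for gs in s.iterfind("unl:GS", NS)}
        yield s.get("number"), graph, generated


for number, graph, generated in sentences(DOC):
    print(number, len(graph), generated["fr"])
```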


Output of the UNL-viewer and display in a browser

Display

Example 1 El/UNL
J’ai couru dans le parc hier. Mon chien aboya pour moi.

Output from the viewer (for French)

<HTML><HEAD><TITLE>Example 1 El/UNL</TITLE></HEAD><BODY>
J’ai couru dans le parc hier.
Mon chien aboya pour moi.
</BODY></HTML>
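What the viewer does can be approximated in a few lines. The sketch below is not the actual UNL-viewer, only a guess at its visible behaviour: it keeps the HTML shell and, inside the body, only the sentences tagged for the requested language. The function name `view` and the abbreviated sample (with every language tag properly closed) are assumptions made for this illustration.

```python
import re

# Abbreviated UNL-html sample; the document header and some languages are left out.
SAMPLE = """<HTML><HEAD><TITLE>Example 1 El/UNL</TITLE></HEAD><BODY>
[D:dn=Example 1]
[P]
[S:1]
{org:el}I ran in the park yesterday.{/org}
{unl}
agt(run(icl>do).@entry.@past,i(icl>person))
{/unl}
{de dtime=20020130-2035, deco=man}Ich lief gestern im Park.{/de}
{fr dtime=20020131-0805, deco=UNL-FR}J'ai couru dans le parc hier.{/fr}
[/S]
[S:2]
{org:el}My dog barked at me.{/org}
{fr dtime=20020131-0806, deco=UNL-FR}Mon chien aboya pour moi.{/fr}
[/S]
[/P]
[/D]
</BODY></HTML>"""


def view(document, lang):
    """Return the HTML shell with only the sentences tagged {lang ...}...{/lang}."""
    head, body = re.fullmatch(r"(?s)(.*?<BODY>)(.*)</BODY></HTML>\s*", document).groups()
    tag = re.escape(lang)
    pattern = re.compile(r"\{%s\b[^}]*\}(.*?)\{/%s\}" % (tag, tag), re.S)
    kept = [sentence.strip() for sentence in pattern.findall(body)]
    return head + "\n" + "\n".join(kept) + "\n</BODY></HTML>"


print(view(SAMPLE, "fr"))
```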


Scenario

• User reads a multilingual document in language Li
• User wishes to correct some errors in Li
• User switches to the coedition environment
• User’s corrections will be executed
  • later on the text
  • immediately on the graph
• User asks for deconversion into Li
• User iterates corrections if not satisfied, asks for deconversion into L1…Ln when OK
• User returns to reading mode

Learn UNL

Adapted from a tutorial by Étienne Blanc, GETA.
For more details, see the extract of UNL specifications 3.0 (distributed)


Basic notions : a simple UNL-graph

• Graph = {relations between nodes bearing UWs & attributes}

The dog watches its master.

[Figure: graph with nodes watch, dog and master; watch —agt→ dog, watch —obj→ master, dog —pos→ master.]

agt(watch(agt>thing,obj>thing).@entry,dog(icl>animal).@def)
obj(watch(agt>thing,obj>thing).@entry,master(icl>human))
pos(dog(icl>animal).@def,master(icl>human))

• A graph line :
agt (watch(agt>thing,obj>thing).@entry , dog(icl>animal).@def)

agt : binary relation 'defining a thing which initiates an action'

watch(agt>thing,obj>thing) : 'universal word' or 'unit of virtual vocabulary' (UW) made of
- a 'headword' : watch
- a 'restriction' : agt>thing,obj>thing —> lexical disambiguation + argument frame

@entry, @def : « attributes » specifying how the concept is used in the graph :
- @entry means that the node is the graph entry ;
- @def specifies definiteness
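The anatomy of a graph line (relation, UWs, attributes) can be made concrete with a small parser. This is a minimal sketch, not a reference implementation of the UNL specifications: the names `Node`, `Relation`, `parse_node` and `parse_relation` are invented here, and it assumes that attributes always start with `.@` and that a line holds exactly one binary relation.

```python
import re
from typing import NamedTuple


class Node(NamedTuple):
    uw: str            # universal word, e.g. "dog(icl>animal)", or a scope id such as ":01"
    attributes: tuple  # e.g. ("@entry", "@def")


class Relation(NamedTuple):
    label: str         # e.g. "agt"
    scope: str         # "" for the main graph, "01" for a line written "agt:01(...)"
    source: Node
    target: Node


def split_top_level(text):
    """Split the two arguments at the one comma that is not inside parentheses."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return text[:i], text[i + 1:]
    raise ValueError(f"no top-level comma in {text!r}")


def parse_node(text):
    """Separate the UW from its attributes (assumed to start with '.@')."""
    uw, *attrs = text.strip().split(".@")
    return Node(uw, tuple("@" + a for a in attrs))


def parse_relation(line):
    m = re.fullmatch(r"\s*([a-z]{2,3})(?::(\w+))?\s*\((.*)\)\s*", line)
    if m is None:
        raise ValueError(f"not a UNL relation: {line!r}")
    label, scope, body = m.groups()
    left, right = split_top_level(body)
    return Relation(label, scope or "", parse_node(left), parse_node(right))


r = parse_relation("agt (watch(agt>thing,obj>thing).@entry , dog(icl>animal).@def)")
print(r.label, r.source.uw, r.source.attributes, r.target.uw, r.target.attributes)
```

Splitting on `.@` rather than on `.` keeps headwords that contain a dot (such as attavik.net) in one piece.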


Basic notions : a UNL hypergraph

John knows that Peter will not come and regrets it.

[Figure: hypergraph with nodes regret, know, John and the scope node :01, which contains come —agt→ Peter; relation labels shown: agt, obj, and.]

agt:01(come.@entry.@future.@not,Peter)

This "scope" node of the graph (:01) is the subgraph described here.
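To see how scopes turn a graph into a hypergraph, the sketch below groups relation lines by scope and lists which nodes point at a scope node. It reuses the relation list of the earlier Isaac example (with its closing parentheses balanced); the function names are invented for this illustration and the parsing is deliberately naive.

```python
import re
from collections import defaultdict

ISAAC = """\
agt(explain(icl>do).@entry,Isaac(icl>proper noun))
obj(explain(icl>do).@entry,:01)
obj:01(fall(icl>occur).@entry,apple)
and(explain(icl>do).@entry,see(icl>do))
agt(see(icl>do),Isaac(icl>proper noun))
obj(see(icl>do),:01)
"""

# relation label, optional ":scope", then the two arguments in parentheses
HEAD = re.compile(r"([a-z]{2,3})(?::(\w+))?\((.*)\)")


def split_args(body):
    """Split at the comma that is outside all parentheses."""
    depth = 0
    for i, ch in enumerate(body):
        depth += (ch == "(") - (ch == ")")
        if ch == "," and depth == 0:
            return body[:i].strip(), body[i + 1:].strip()
    raise ValueError(body)


def relations_by_scope(text):
    """Map each scope id ('' = main graph) to its (label, source, target) triples."""
    scopes = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        label, scope, body = HEAD.fullmatch(line.strip()).groups()
        scopes[scope or ""].append((label, *split_args(body)))
    return dict(scopes)


def scope_references(scopes):
    """List (scope id, label, source node) for relations whose target is a scope node."""
    refs = []
    for relations in scopes.values():
        for label, source, target in relations:
            if target.startswith(":"):
                refs.append((target[1:].split(".@")[0], label, source))
    return refs


scopes = relations_by_scope(ISAAC)
print(scopes["01"])
print(scope_references(scopes))
```

Here the scope :01 holds a single relation (the apple falling), and both explain and see take that whole subgraph as their obj.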


Basic notions : semantic relations 1/2

agt (agent) action —agt→ thing in focus which initiates it
and (conjunction) X —and→ Y : conjunctive relation between 2 concepts (word or phrase senses)
aoj (thing with attribute) state or attribute —aoj→ thing concerned
bas (basis) degree —bas→ thing used as the basis (standard) for a comparison
ben (beneficiary) event or state —ben→ indirect beneficiary or victim of it
cag (co-agent) action —cag→ thing not in focus which initiates it in parallel with the agent
cao (co-thing with attribute) state or attribute —cao→ thing not in focus concerned in parallel
cnt (content) X —cnt→ Y : equivalent concept (Y≈X)
cob (affected co-thing) implicit parallel event or state —cob→ thing directly affected
con (condition) focused event or state —con→ non-focused event or state which conditions it
coo (co-occurrence) focused event or state —coo→ co-occurring event or state
dur (duration) event or state —dur→ period of time during which it occurs or exists
fmt (range) X —fmt→ Y : range between two things (from X to Y)
frm (origin) X —frm→ Y : origin of thing X
gol (goal/final state) event —gol→ final state of an object or thing finally associated with its object
ins (instrument) event —ins→ thing used to carry it out
man (manner) event or state —man→ way to carry out the event or to characterize the state
met (method) event —met→ method to carry it out
mod (modification) focused thing —mod→ thing which restricts it
nam (name) thing —nam→ a name of that thing
obj (affected thing) event or state —obj→ thing in focus directly affected by it


Basic notions : semantic relations 2/2

opl (affected place) event —opl→ place in focus where it has effects
or (disjunction) X —or→ Y : disjunctive relation between 2 concepts (word or phrase senses)
per (proportion, rate or distribution) X —per→ thing used as basis (standard) or unit of proportion, rate or distribution of X
plc (place) event or state or thing —plc→ place where it occurs or is true or exists
plf (initial place) event or state —plf→ place where it begins or becomes true
plt (final place) event or state —plt→ place where it ends or becomes false
pof (part-of) focused thing —pof→ thing of which it is a part
pos (possessor) thing —pos→ possessor of it
ptn (partner) action —ptn→ indispensable non-focused initiator of it
pur (purpose or objective) event or existing thing —pur→ purpose or objective of an event or purpose of a thing
qua (quantity) thing or unit —qua→ quantity of it
rsn (reason) event or state —rsn→ reason that it happens
scn (scene) event or state or thing —scn→ virtual world where it occurs or is true or exists
seq (sequence) focused event or state —seq→ prior event or state
src (source/initial state) event —src→ initial state of an object or thing initially associated with its object
tim (time) event or state —tim→ time at which it occurs or is true
tmf (initial time) event or state —tmf→ time at which it starts or becomes true
tmt (final time) event or state —tmt→ time at which it ends or becomes false
to (destination) X —to→ Y : destination of thing X
via (intermediate place or state) event or state —via→ intermediate place


Basic notions : attributes

Attributes specify how concepts are used in a given graph (tense, aspect, determination, number, etc.)

@entry : graph entry node
@def : determination (definiteness)
@pl : plural

agt(watch(agt>thing,obj>thing).@entry,dog(icl>animal).@def.@pl)


Basic notions : attributes (examples)

Time :

@past : happened in the past
@present : happening at present
@future : will happen in future

Aspect :

@begin : beginning of an event or a state
@complet : finishing/completion of a (whole) event
@continue : continuation of an event
@custom : customary or repetitious action
@end : end/termination of an event or a state
@experience : experience
@progress : an event is in progress
@repeat : repetition of an event
@state : final state or existence of the object on which an action has been effected

The preceding attributes may be modified by the following ones : @just @soon @yet


Brief presentation of UNL : spreading info over the Net

[Figure: a document in a given natural language (source document, Chinese) goes through enconversion into an enconverted document (UNL), then through deconversions into deconverted documents (Russian, French, Japanese, Spanish).]

Once 'enconverted' into UNL, a document may be more easily 'deconverted' into other languages.

Read UNL

from graphs to English or to your language


READING UNL

• Begin reading at entry node
• Beware that AND, OR, SEQ relations go contrary to English but parallel to Japanese
• When in doubt, consult the abbreviated specifications
• In general, many equivalent readings are possible
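The first reading rule can be tried mechanically. The sketch below starts at the entry node and follows the relations outwards, printing an indented outline; it is only one possible way to walk a graph, with invented helper names, and it takes the relations of "The dog watches its master." as ready-made (label, source, target) triples.

```python
# Relations of "The dog watches its master." as (label, source, target) triples.
GRAPH = [
    ("agt", "watch(agt>thing,obj>thing).@entry", "dog(icl>animal).@def"),
    ("obj", "watch(agt>thing,obj>thing).@entry", "master(icl>human)"),
    ("pos", "dog(icl>animal).@def", "master(icl>human)"),
]


def uw(node):
    """The universal word without its attributes."""
    return node.split(".@")[0]


def attributes(node):
    return node.split(".@")[1:]


def headword(node):
    return uw(node).split("(")[0]


def find_entry(graph):
    """Return the single UW carrying @entry, or fail if there is none or several."""
    entries = {uw(n) for _, s, t in graph for n in (s, t) if "entry" in attributes(n)}
    if len(entries) != 1:
        raise ValueError(f"expected exactly one entry node, found {sorted(entries)}")
    return entries.pop()


def outline(graph):
    """Depth-first walk from the entry node; a shared node is expanded only once."""
    lines, seen = [], set()

    def visit(node, prefix, depth):
        lines.append("  " * depth + prefix + headword(node))
        if node in seen:
            return
        seen.add(node)
        for label, source, target in graph:
            if uw(source) == node:
                visit(uw(target), f"--{label}--> ", depth + 1)

    visit(find_entry(graph), "", 0)
    return lines


print("\n".join(outline(GRAPH)))
```

The outline follows the arrows as written; it does not reorder and, or and seq relations, so the warning above about their direction still applies when turning the outline into an English sentence.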


Free Software Portal

30


Unchecked, this will contribute to a loss of cultural diversity of information networks and a widening of existing socio-economic inequalities.

1483

[Figure: UNL graph of this sentence; attributes legible in the figure: .@topic, .@generic]


Inuktitut speakers will soon be able to have their say online as the Canadian aboriginal language goes on the web.

46

[Figure: UNL graph of this sentence; legible in the figure: scope node :01 and attributes .@soon.@idiom]


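Browser settings on normal computers have not supported the language to date, but attavik.net has changed that.

47

[Figure: UNL graph of this sentence; legible in the figure: the nodes aboriginal(aoj>thing).@eld and Canadian(aoj>thing).@eld, each attached by a mod relation]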

Write UNL

from English or from your language to UNL


WRITING UNL

• If not starting from English, think of an English translation, possibly with question marks
  basu ga kimashita --> ¿ the | a ? ¿ bus | buses ? came
• Determine UWs: English headword(restrictions)
  Restrictions
  • icl>hyperonym
  • argument frame: agt>human (.@A) aoj>thing (.@A) obj>thing (.@B) ben>person (.@C)
    give(icl>do, agt>person, obj>thing, ben>person)
  • other semantic restrictions
    land(icl>do, agt>person, plt>shore) vs land(icl>do, agt>person, plt>land)
• Determine relations & scopes (hypernodes)
  if not precise enough, refine lexically and/or introduce scopes (hypernodes)
  • fly above the street --> fly(icl>occur) —plc→ above —obj→ street.@def
• Determine attributes
• Don't forget entry nodes
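The writing steps can be mirrored by a few helper functions: one to assemble a UW from a headword and its restrictions, one to write a graph line, and one to check the last reminder (entry nodes). This is a minimal sketch with invented names; it assumes that the main graph and every scope should each carry one @entry node.

```python
def uw(headword, *restrictions):
    """Assemble a UW such as give(icl>do,agt>person,obj>thing,ben>person)."""
    return headword + (f"({','.join(restrictions)})" if restrictions else "")


def relation(label, source, target, scope=""):
    """Write one graph line; 'scope' is the hypernode id for lines such as agt:01(...)."""
    prefix = f"{label}:{scope}" if scope else label
    return f"{prefix}({source},{target})"


def split_args(body):
    """Split the two arguments at the comma that is outside all parentheses."""
    depth = 0
    for i, ch in enumerate(body):
        depth += (ch == "(") - (ch == ")")
        if ch == "," and depth == 0:
            return body[:i].strip(), body[i + 1:].strip()
    raise ValueError(body)


def entries_by_scope(lines):
    """Map each scope id ('' = main graph) to the set of UWs marked @entry in it."""
    entries = {}
    for line in lines:
        head, rest = line.strip().split("(", 1)
        scope = head.partition(":")[2]
        found = entries.setdefault(scope, set())
        for node in split_args(rest[:-1]):  # rest[:-1] drops the closing parenthesis
            name, *attrs = node.split(".@")
            if "entry" in attrs:
                found.add(name)
    return entries


watch = uw("watch", "agt>thing", "obj>thing")
dog = uw("dog", "icl>animal")
master = uw("master", "icl>human")
graph = [
    relation("agt", watch + ".@entry", dog + ".@def"),
    relation("obj", watch + ".@entry", master),
    relation("pos", dog + ".@def", master),
]
print("\n".join(graph))
print(entries_by_scope(graph))

# A scope written without its entry node shows up with an empty set.
print(entries_by_scope([relation("agt", "come.@future.@not", "Peter", scope="01")]))
```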


It provides a content management system that allows native speakers to write, manage documents and offer online payments in the Inuit language.

49

agt(provide(agt>thing,obj>thing).@entry,attavik.net(icl>entity))
obj(provide(agt>thing,obj>thing).@entry,system(icl>method).@indef)
gol(system(icl>method).@indef,management(icl>activity).@def)
mod(management(icl>activity).@def,content(icl>information))
gol(provide(agt>thing,obj>thing).@entry,:01)
and:01(write(agt>human,obj>thing).@entry,manage(icl>treat(agt>volitional thing,obj>thing)))
obj(:01,document(icl>information).@indef.@pl)
agt(:01,speaker(icl>role).@indef.@pl)
mod(speaker(icl>role).@indef.@pl,native(mod<human))
and(:01,offer(icl>give(agt>thing,gol>thing,obj>thing)))
obj(offer(icl>give(agt>thing,gol>thing,obj>thing)),payment(icl>action).@indef.@pl)
mod(payment(icl>action).@indef.@pl,online(icl>place))
ins(offer(icl>give(agt>thing,gol>thing,obj>thing)),language(icl>system).@def)
mod(language(icl>system).@def,Inuit(icl>language))
agt(offer(icl>give(agt>thing,gol>thing,obj>thing)),speaker(icl>role).@indef.@pl)


[Figure: the graph drawn from the relations of sentence 49 given above, with provide(agt>thing,obj>thing).@entry as entry node and the scope node :01 containing write —and→ manage.]


Initiative B@bel and Script Encoding Initiative Supporting Linguistic Diversity in Cyberspace

50


12-11-2004 (UNESCO)

51


Efforts continue to add N'ko, a script used by the Manden people of West Africa, to the international character encoding standards Unicode and ISO/IEC 10646 through a project of the University of California Berkeley's Script Encoding Initiative that is supported by UNESCO's Initiative B@bel.

52


Try now to have a short dialogue with somebody via UNL
(the following was suggested at the UNL "special event" at CICLING-05)

• How did you like the CICLING-05 congress?
• Very much! [I like].@eld necessary for some languages (jp)
• I [gender for Fr, Sp,…] am particularly interested by the 3 special events.
• What kind of Martian language do you prefer?
• Well, UNL is certainly easier to understand and write.
• Moreover, the first [Martian language].@eld is impossible to learn, while enough of UNL can be learned in 15 minutes using a 7-page document.
• I could not attend a session because I was ill and I regret it. [use scope]

and for UNL fans

• And the excursions!
• These pyramids, I did not see them all yet, though.
• Well, you will be able to visit some caves tomorrow.
• Good bye


How did you like the CICLING-05 congress?

[Figure: UNL graph with nodes how.@question.@entry, you(icl>human).@polite, congress(icl>event).@def, CICLING-05(icl>thing) and like(icl>occur, aoj>person,obj>thing), linked by the relations aoj, obj, nam and man.]


Very much! [I like].@eld necessary for some languages (jp)


I [gender for Fr, Sp,…] was particularly interested by the 3 special events.


What kind of Martian language do you prefer?


Well, UNL is certainly easier to understand and write.


Moreover, the first [Martian language].@eld is impossible to learn, while enough of UNL can be learned in 15 minutes using a 7-page document.


I could not attend a session because I was ill and I regret it. [use scope]


And the excursions!


These pyramids, I did not see them all yet, though.


Well, you will be able to visit some caves tomorrow.


Good bye

• Thank you for your attention!
• I hope EBMT or SMT or RBMT via UNL will attract (some of) you in the future.