The Global Wordnet Grid: anchoring languages to universal meaning Piek Vossen Irion Technologies/Free University of Amsterdam

Post on 27-Mar-2015


Page 1

The Global Wordnet Grid: anchoring languages to universal meaning

Piek Vossen

Irion Technologies/Free University of Amsterdam

Page 2

Overview

• Wordnet, EuroWordNet background

• Architecture of the Global Wordnet Grid

• Mapping wordnets to the Grid

• Advantages of shared knowledge structure

• 7th Framework project KYOTO

Page 3

WordNet1.5

• Semantic network in which concepts are defined in terms of relations to other concepts.
• Structure:
  – organized around the notion of synsets (sets of synonymous words)
  – basic semantic relations between these synsets
• Developed at Princeton by George Miller and his team as a model of the mental lexicon.
• http://www.cogsci.princeton.edu/~wn/w3wn.html

Page 4

Relational model of meaning

[Figure: two diagrams relating the words man, woman, boy, girl, cat, kitten, dog, puppy and animal. In the second diagram "girl" appears as Dutch "meisje".]

Page 5

Structure of WordNet

[Figure: WordNet fragment around {car}. Synsets are connected by links labelled "hyperonym" and "meronym".]

Synsets shown:
{conveyance; transport}
{vehicle}
{motor vehicle; automotive vehicle}
{car; auto; automobile; machine; motorcar}
{cruiser; squad car; patrol car; police car; prowl car}
{cab; taxi; hack; taxicab}
{bumper}
{car door}
{car window}
{car mirror}
{hinge; flexible joint}
{doorlock}
{armrest}
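A minimal Python sketch of this structure, assuming synsets are objects that hold their words plus lists of hyperonym and meronym links. The class and function names are hypothetical, and the arrangement of the links is an assumption reconstructed from the synsets on the slide, since the figure layout is not preserved here.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of synonymous words plus links to related synsets."""
    words: tuple
    hyperonyms: list = field(default_factory=list)
    meronyms: list = field(default_factory=list)


def hyperonym_chain(synset):
    """Follow the first hyperonym link upward and collect the synsets passed."""
    chain = []
    current = synset
    while current.hyperonyms:
        current = current.hyperonyms[0]
        chain.append(current)
    return chain


# Assumed arrangement of the synsets shown on the slide.
conveyance = Synset(("conveyance", "transport"))
vehicle = Synset(("vehicle",), hyperonyms=[conveyance])
motor_vehicle = Synset(("motor vehicle", "automotive vehicle"), hyperonyms=[vehicle])
car = Synset(("car", "auto", "automobile", "machine", "motorcar"),
             hyperonyms=[motor_vehicle])
cab = Synset(("cab", "taxi", "hack", "taxicab"), hyperonyms=[car])

# Meronym (part) links hang off the whole.
car_door = Synset(("car door",), meronyms=[Synset(("doorlock",)), Synset(("armrest",))])
car.meronyms.append(car_door)

print([s.words[0] for s in hyperonym_chain(cab)])
```

Walking upward from {cab; taxi; ...} passes every more general synset in turn, which is the kind of lookup the usage slides rely on.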

Page 6

Wordnet Data Model

[Figure: three columns labelled "Vocabulary of a language", "Concepts" and "Relations". Words are linked to concept records by numbered senses (1, 2).]

Vocabulary of a language: bank; fiddle, violin; violist, fiddler; string

Concepts:
rec: 12345 - financial institute
rec: 54321 - side of a river
rec: 9876 - small string instrument
rec: 65438 - musician playing violin
rec: 42654 - musician
rec: 25876 - string instrument
rec: 35576 - string of instrument
rec: 29551 - underwear

Relations: type-of, type-of, part-of

Page 7

Usage of Wordnet

• Improve recall of text-based analysis:
  – Query -> Index
    • Synonyms: commence – begin
    • Hypernyms: taxi -> car
    • Hyponyms: car -> taxi
    • Meronyms: trunk -> elephant
    • Lexical entailments: gun -> shoot
• Inferencing:
  – what things can burn?
• Expression in language generation and translation:
  – alternative words and paraphrases
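The recall improvement from Query -> Index can be sketched as query expansion. This is a toy illustration only: the two small dictionaries stand in for a real wordnet, and `expand_query` is a hypothetical helper, not part of any WordNet API.

```python
# Toy relation tables built from the examples on the slide.
SYNONYMS = {"commence": {"begin"}, "begin": {"commence"}}
HYPONYMS = {"car": {"taxi"}}


def expand_query(terms):
    """Add synonyms and hyponyms of each query term to the query."""
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
        expanded |= HYPONYMS.get(term, set())
    return expanded


print(sorted(expand_query(["car", "commence"])))
```

A query for "car" now also matches documents indexed under "taxi", which is where the gain in recall comes from.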

Page 8

Improve recall

• Information retrieval:
  – small databases without redundancy, e.g. image captions, video text
• Text classification:
  – small training sets
• Question & Answer systems
  – query analysis: who, whom, where, what, when

Page 9

Improve recall

• Anaphora resolution:
  – The girl fell off the table. She....
  – The glass fell off the table. It...
• Coreference resolution:
  – When he moved the furniture, the antique table got damaged.
• Information extraction (unstructured text to structured databases):
  – generic forms or patterns "vehicle" -> text with specific cases "car"

Page 10

Improve recall

• Summarizers:
  – Sentence selection based on word counts -> concept counts
  – Avoid repetition in summary -> language generation
• Limited inferencing: detect locations, organisations, etc.
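The move from word counts to concept counts can be illustrated with a small sketch. The word-to-concept table is a hypothetical stand-in for a wordnet lookup; mapping car, taxi and train to a single "vehicle" concept is an assumption made for the example.

```python
from collections import Counter

# Hypothetical lookup from a word to a more general concept.
WORD_TO_CONCEPT = {"car": "vehicle", "taxi": "vehicle", "train": "vehicle"}


def concept_counts(tokens):
    """Count concepts instead of surface words; unknown words count as themselves."""
    return Counter(WORD_TO_CONCEPT.get(token, token) for token in tokens)


counts = concept_counts(["car", "taxi", "train", "road", "car"])
print(counts["vehicle"], counts["road"])
```

Three different words that each occur only once or twice add up to one frequent concept, which a summarizer can then use for sentence selection.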

Page 11

Many others

• Data sparseness for machine learning: hapaxes can be replaced by semantic classes
• Use redundancy for more robustness: spelling correction and speech recognition can build semantic expectations using Wordnet and make better choices
• Sentiment and opinion mining
• Natural language learning

Page 13

EuroWordNet

• The development of a multilingual database with wordnets for several European languages

• Funded by the European Commission, DG XIII, Luxembourg as projects LE2-4003 and LE4-8328

• March 1996 - September 1999

• 2.5 Million EURO.

• http://www.hum.uva.nl/~ewn

• http://www.illc.uva.nl/EuroWordNet/finalresults-ewn.html

Page 14

EuroWordNet

• Languages covered:
  – EuroWordNet-1 (LE2-4003): English, Dutch, Spanish, Italian
  – EuroWordNet-2 (LE4-8328): German, French, Czech, Estonian
• Size of vocabulary:
  – EuroWordNet-1: 30,000 concepts - 50,000 word meanings
  – EuroWordNet-2: 15,000 concepts - 25,000 word meanings
• Type of vocabulary:
  – the most frequent words of the languages
  – all concepts needed to relate more specific concepts

Page 15

[Figure: the wordnets of the individual languages are linked to a central Inter-Lingual-Index (records such as ENGLISH Car, Train, Vehicle), which is linked to a domain hierarchy and to an ontology.]

Domains: Transport (Road, Air, Water)

Ontology (DOLCE, SUMO): Object, Device, TransportDevice

Words in each wordnet (general term; more specific terms):
• English Words: vehicle; car, train
• Czech Words: dopravní prostředník; auto, vlak
• French Words: véhicule; voiture, train
• Estonian Words: liiklusvahend; auto, killavoor
• German Words: Fahrzeug; Auto, Zug
• Spanish Words: vehículo; auto, tren
• Italian Words: veicolo; auto, treno
• Dutch Words: voertuig; auto, trein

Wordnet family
• Princeton WordNet (Fellbaum 1998): 115,000 concepts
• EuroWordNet (Vossen 1998): 8 languages
• BalkaNet (Tufis 2004): 6 languages
• Global Wordnet Association: all languages

Page 16

EuroWordNet

• Wordnets are unique language-specific structures:
  – different lexicalizations
  – differences in synonymy and homonymy
  – different relations between synsets
  – same organizational principles: synset structure and same set of semantic relations.
• Language-independent knowledge is assigned to the ILI and can thus be shared by all languages linked to the ILI: both an ontology and a domain hierarchy

Page 17

Autonomous & Language-Specific

[Figure: two hierarchies side by side, labelled "Wordnet1.5" and "Dutch Wordnet".]

Wordnet1.5: object; natural object (an object occurring naturally); artifact, artefact (a man-made object); instrumentality; block; body; container; device; implement; tool; instrument; bag; spoon; box

Dutch Wordnet: voorwerp {object}; lepel {spoon}; werktuig {tool}; tas {bag}; bak {box}; blok {block}; lichaam {body}

Page 18

Linguistic versus Artificial Ontologies

Artificial ontology:
• better control or performance, or a more compact and coherent structure.
• introduce artificial levels for concepts which are not lexicalized in a language (e.g. instrumentality, hand tool),
• neglect levels which are lexicalized but not relevant for the purpose of the ontology (e.g. tableware, silverware, merchandise).

What properties can we infer for spoons?
spoon -> container; artifact; hand tool; object; made of metal or plastic; for eating, pouring or cooking

Page 19

Linguistic versus Artificial Ontologies

Linguistic ontology:
• Exactly reflects the relations between all the lexicalized words and expressions in a language.
• Captures valuable information about the lexical capacity of languages: what is the available fund of words and expressions in a language.

What words can be used to name spoons?
spoon -> object, tableware, silverware, merchandise, cutlery

Page 20

Wordnets versus ontologies

• Wordnets:
  – autonomous language-specific lexicalization patterns in a relational network.
  – Usage: to predict substitution in text for information retrieval, text generation, machine translation, word-sense-disambiguation.
• Ontologies:
  – data structure with formally defined concepts.
  – Usage: making semantic inferences.

Page 21

The Multilingual Design

• Inter-Lingual-Index: unstructured fund of concepts to provide an efficient mapping across the languages;
• Index-records are mainly based on WordNet synsets and consist of synonyms, glosses and source references;
• Various types of complex equivalence relations are distinguished;
• Equivalence relations from synsets to index records: not on a word-to-word basis;
• Indirect matching of synsets linked to the same index items;
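Indirect matching through the index can be sketched as follows. The link table and the record identifiers (`ili-vehicle`, `ili-train`) are hypothetical; the words are taken from the architecture figure earlier in the talk.

```python
from collections import defaultdict

# (language, synset word, ILI record); record identifiers are made up for the example.
LINKS = [
    ("nl", "voertuig", "ili-vehicle"),
    ("it", "veicolo", "ili-vehicle"),
    ("de", "Fahrzeug", "ili-vehicle"),
    ("nl", "trein", "ili-train"),
    ("it", "treno", "ili-train"),
]

# Group the synsets of all languages by the index record they point to.
by_record = defaultdict(set)
for lang, word, record in LINKS:
    by_record[record].add((lang, word))


def equivalents(lang, word, target_lang):
    """Return target-language words that share an ILI record with the given word."""
    found = set()
    for members in by_record.values():
        if (lang, word) in members:
            found |= {w for (lng, w) in members if lng == target_lang}
    return found


print(equivalents("nl", "voertuig", "it"))
```

No Dutch-Italian link is stored anywhere: the match falls out of both synsets pointing at the same index record, so adding a language only requires links to the index.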

Page 22

Equivalent Near Synonym

1. Multiple Targets (1:many)
Dutch wordnet: schoonmaken (to clean) matches with 4 senses of clean in WordNet1.5:
• make clean by removing dirt, filth, or unwanted substances from
• remove unwanted substances from, such as feathers or pits, as of chickens or fruit
• remove in making clean; "Clean the spots off the rug"
• remove unwanted substances from - (as in chemistry)

2. Multiple Sources (many:1)
Dutch wordnet: versiersel near_synonym versiering
ILI-Record: decoration.

3. Multiple Targets and Sources (many:many)
Dutch wordnet: toestel near_synonym apparaat
ILI-records: machine; device; apparatus; tool

Page 23

Equivalent Hyperonymy

Typically used for gaps in English WordNet:
• genuine, cultural gaps for things not known in English culture:
  – Dutch: klunen, to walk on skates over land from one frozen water to the other
• pragmatic, in the sense that the concept is known but is not expressed by a single lexicalized form in English:
  – Dutch: kunstproduct = artifact substance <=> artifact object

Page 24

From EuroWordNet to Global WordNet

• Currently, wordnets exist for more than 40 languages, including:

• Arabic, Bantu, Basque, Chinese, Bulgarian, Estonian, Hebrew, Icelandic, Japanese, Kannada, Korean, Latvian, Nepali, Persian, Romanian, Sanskrit, Tamil, Thai, Turkish, Zulu...

• Many languages are genetically and typologically unrelated

• http://www.globalwordnet.org

Page 25

Some downsides

• Construction is not done uniformly
• Coverage differs
• Not all wordnets can communicate with one another
• Proprietary rights restrict free access and usage
• A lot of semantics is duplicated
• Complex and obscure equivalence relations due to linguistic differences between English and other languages

Page 26

Next step: Global WordNet Grid

[Figure: the wordnets of the individual languages are linked to a central Inter-Lingual Ontology.]

Inter-Lingual Ontology: Object, Device, TransportDevice

Words in each wordnet (general term; more specific terms):
• English Words: vehicle; car, train
• Czech Words: dopravní prostředník; auto, vlak
• French Words: véhicule; voiture, train
• Estonian Words: liiklusvahend; auto, killavoor
• German Words: Fahrzeug; Auto, Zug
• Spanish Words: vehículo; auto, tren
• Italian Words: veicolo; auto, treno
• Dutch Words: voertuig; auto, trein

Page 27

GWNG: Main Features

• Construct separate wordnets for each Grid language

• Contributors from each language encode the same core set of concepts plus culture/language-specific ones

• Synsets (concepts) can be mapped crosslinguistically via an ontology

• No license constraints, freely available

Page 28

The Ontology: Main Features

• Formal, artificial ontology serves as universal index of concepts
• List of concepts is not just based on the lexicon of a particular language (unlike in EuroWordNet) but uses ontological observations
• Concepts are related in a type hierarchy
• Concepts are defined with axioms

Page 29

The Ontology: Main Features

• In addition to high-level (“primitive”) concepts, the ontology needs to express low-level concepts lexicalized in the Grid languages
• Additional concepts can be defined with expressions in Knowledge Interchange Format (KIF), based on first order predicate calculus and atomic elements

Page 30

The Ontology: Main Features

• Minimal set of concepts (Reductionist view):
  – to express equivalence across languages
  – to support inferencing
• Ontology must be powerful enough to encode all concepts that are lexically expressed in any of the Grid languages

Page 31

The Ontology: Main Features

• Ontology need not and cannot provide a linguistic encoding for all concepts found in the Grid languages
  – Lexicalization in a language is not sufficient to warrant inclusion in the ontology
  – Lexicalization in all or many languages may be sufficient
• Ontological observations will be used to define the concepts in the ontology

Page 32

Ontological observations

• Identity criteria as used in OntoClean (Guarino & Welty 2002):
  – rigidity: to what extent are properties true for entities in all worlds? You are always a human, but you can be a student for a short while.
  – essence: what properties are essential for an entity? Shape is essential for a statue but not for the clay it is made of.
  – unicity: what represents a whole and what entities are parts of these wholes? An ocean is a whole but the water it contains is not.

Page 33

Type-role distinction

• Current WordNet treatment:
  (1) a husky is a kind of dog (type)
  (2) a husky is a kind of working dog (role)
• What’s wrong? (2) is defeasible, (1) is not:
  *This husky is not a dog
  This husky is not a working dog
• Other roles: watchdog, sheepdog, herding dog, lapdog, etc.

Page 34

Ontology and lexicon

• Hierarchy of disjunct types:
  Canine: PoodleDog; NewfoundlandDog; GermanShepherdDog; Husky
• Lexicon:
  – NAMES for TYPES:
    {poodle}EN, {poedel}NL, {pudoru}JP
    (instance x Poodle)
  – LABELS for ROLES:
    {watchdog}EN, {waakhond}NL, {banken}JP
    ((instance x Canine) and (role x GuardingProcess))
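The split between names for types and labels for roles can be sketched in Python. This is an illustration under assumptions: expressions are stored as lists of (predicate, variable, type) triples rather than real KIF, and `kind` is a hypothetical helper.

```python
# (synset, language) -> conjuncts of the expression attached to it on the slide.
LEXICON = {
    ("poodle", "EN"): [("instance", "x", "Poodle")],
    ("poedel", "NL"): [("instance", "x", "Poodle")],
    ("watchdog", "EN"): [("instance", "x", "Canine"), ("role", "x", "GuardingProcess")],
    ("waakhond", "NL"): [("instance", "x", "Canine"), ("role", "x", "GuardingProcess")],
}


def kind(entry):
    """Classify a lexicon entry by whether its expression mentions a role."""
    conjuncts = LEXICON[entry]
    if any(predicate == "role" for predicate, _, _ in conjuncts):
        return "label for role"
    return "name for type"


for entry in LEXICON:
    print(entry, "->", kind(entry))
```

Only the entries classified as names for types correspond to nodes in the type hierarchy; the role labels are defined on top of an existing type.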

Page 35

Ontology and lexicon

• Hierarchy of disjunct types:
  River; Clay; etc.
• Lexicon:
  – NAMES for TYPES:
    {river}EN, {rivier, stroom}NL
    (instance x River)
  – LABELS for dependent concepts:
    {rivierwater}NL (water from a river => water is not Unit)
    ((instance x water) and (instance y River) and (portion x y))
    {kleibrok}NL (irregularly shaped piece of clay => Non-essential)
    ((instance x Object) and (instance y Clay) and (portion x y) and (shape x Irregular))

Page 36

Rigidity

• The “primitive” concepts represented in the ontology are rigid types

• Entities with non-rigid properties will be represented with KIF statements

• But: ontology may include some universal, core concepts referring to roles like father, mother

Page 37

Properties of the Ontology

• Minimal: terms are distinguished by essential properties only

• Comprehensive: includes all distinct concept types of all Grid languages

• Allows definitions via KIF of all lexemes that express non-rigid, non-essential properties of types

• Logically valid, allows inferencing

Page 38

Mapping Grid Languages onto the Ontology

• Explicit and precise equivalence relations among synsets in different languages, which becomes somewhat easier because:
  – type hierarchy is minimal
  – subtle differences can be encoded in KIF expressions
• Grid database contains wordnets with synsets that label
  – either “primitive” types in the hierarchies,
  – or words relating to these types in ways made explicit in KIF expressions
• If two languages create the same KIF expression, this is a statement of equivalence!
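The idea that identical expressions amount to a statement of equivalence can be sketched as a grouping step. The sketch assumes that expressions are plain conjunctions stored as (predicate, variable, type) triples and that all languages use the same variable names; real KIF would need proper parsing and variable renaming. All names are hypothetical.

```python
from collections import defaultdict


def normalise(conjuncts):
    """Make the order of the conjuncts irrelevant."""
    return frozenset(conjuncts)


def equivalence_classes(lexicon):
    """Group synsets whose expressions are identical after normalisation."""
    groups = defaultdict(list)
    for synset, conjuncts in lexicon.items():
        groups[normalise(conjuncts)].append(synset)
    return [sorted(members) for members in groups.values() if len(members) > 1]


LEXICON = {
    "{watchdog}EN": [("instance", "x", "Canine"), ("role", "x", "GuardingProcess")],
    "{waakhond}NL": [("role", "x", "GuardingProcess"), ("instance", "x", "Canine")],
    "{poodle}EN": [("instance", "x", "Poodle")],
}

print(equivalence_classes(LEXICON))
```

The English and Dutch synsets end up in one class even though their conjuncts were written in a different order, and nobody had to state a direct English-Dutch link.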

Page 39

How to construct the GWNG

• Take an existing ontology as starting point;

• Use English WordNet to maximize the number of disjunct types in the ontology;

• Link English WordNet synsets as names to the disjunct types;

• Provide KIF expressions for all other English words and synsets

Page 40

How to construct the GWNG

• Copy the relations from the English WordNet to the ontology over to other languages, including KIF statements built for English
• Revise KIF statements to make the mapping more precise
• Map all words and synsets that are not and cannot be mapped to English WordNet to the ontology:
  – propose extensions to the type hierarchy
  – create KIF expressions for all non-rigid concepts

Page 41

Initial Ontology: SUMO (Niles and Pease)

SUMO = Suggested Upper Merged Ontology

--consistent with good ontological practice

--fully mapped to WordNet(s): 1000 equivalence mappings, the rest through subsumption

--freely and publicly available

--allows data interoperability

--allows NLP

--allows reasoning/inferencing

Page 42

Mapping Grid languages onto the Ontology

• Check existing SUMO mappings to Princeton WordNet -> extend the ontology with rigid types for specific concepts
• Extend it to many other WordNet synsets
• Observe OntoClean principles! (Synsets referring to non-rigid, non-essential, non-unicitous concepts must be expressed in KIF)

Page 43

Lexicalizations not mapped to WordNet

• Not added to the type hierarchy:
  {straathond}NL (a dog that lives in the streets)
  ((instance x Canine) and (habitat x Street))
• Added to the type hierarchy:
  {klunen}NL (to walk on skates from one frozen body of water to the next over land)
  KluunProcess => WalkProcess
  Axioms:
  (and (instance x Human) (instance y Walk) (instance z Skates) (wear x z) (instance s1 Skate) (instance s2 Skate) (before s1 y) (before y s2) etc…
• National dishes, customs, games, ....

Page 44

Most mismatching concepts are not new types

• Refer to sets of types in specific circumstances or to concepts that are dependent on these types; next to {rivierwater}NL there are many others:
  {theewater}NL (water used for making tea)
  {koffiewater}NL (water used for making coffee)
  {bluswater}NL (water used for extinguishing fire)
• Relate to linguistic phenomena:
  – gender, perspective, aspect, diminutives, politeness, pejoratives, part-of-speech constraints

Page 45

KIF expression for gender marking

• {teacher}EN
  ((instance x Human) and (agent x TeachingProcess))
• {Lehrer}DE
  ((instance x Man) and (agent x TeachingProcess))
• {Lehrerin}DE
  ((instance x Woman) and (agent x TeachingProcess))

Page 46

KIF expression for perspective

sell: subj(x), direct obj(z), indirect obj(y)
versus
buy: subj(y), direct obj(z), indirect obj(x)

(and (instance x Human) (instance y Human) (instance z Entity) (instance e FinancialTransaction) (source x e) (destination y e) (patient z e))

The same process but a different perspective by subject and object realization: marry in Russian is expressed by two verbs; apprendre in French can mean both teach and learn.
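The perspective difference can be sketched as two mappings from grammatical slots onto the same event roles. The table and `to_event` are hypothetical illustrations; the role names source, destination and patient follow the expression on the slide.

```python
# For each verb: which event role each grammatical slot fills.
PERSPECTIVES = {
    "sell": {"subj": "source", "direct obj": "patient", "indirect obj": "destination"},
    "buy": {"subj": "destination", "direct obj": "patient", "indirect obj": "source"},
}


def to_event(verb, subj, direct_obj, indirect_obj):
    """Map the arguments of a clause onto perspective-neutral event roles."""
    slots = {"subj": subj, "direct obj": direct_obj, "indirect obj": indirect_obj}
    return {PERSPECTIVES[verb][slot]: filler for slot, filler in slots.items()}


selling = to_event("sell", "Ann", "book", "Bob")   # Ann sells a book to Bob
buying = to_event("buy", "Bob", "book", "Ann")     # Bob buys a book from Ann
print(selling == buying)
```

Both clauses produce the same role assignment, so only one process needs to exist in the ontology; the verbs differ in the lexicon alone.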

Page 47

Parallel Noun and Verb hierarchy

Nouns:
• event
  – act
    • deed
      – sale
      – promise
  – change
    • movement
      – change of location

Verbs:
• to happen
  – to act
    • to do
      – to sell
      – to promise
  – to change
    • to move
      – to move position

Encoded once as a Process in the ontology!

Page 48

Part-of-speech mismatches

• {bankdrukken-V}NL vs.{bench press-N}EN

• {gehuil-N}NL vs. {cry-V}EN

• {afsluiting-N}NL vs. {close-V}EN

• Process in the ontology is neutral with respect to POS!

Page 49

Aspectual variants

• Slavic languages: two members of a verb pair for an ongoing event and a completed event.
• English: can mark perfectivity with particles, as in the phrasal verbs eat up and read through.
• Romance languages: mark aspect by verb conjugations on the same verb.
• Dutch: verbs with marked aspect can be created by prefixing a verb with door: doorademen, dooreten, doorfietsen, doorlezen, doorpraten (continue to breathe/eat/bike/read/talk).
• These verbs are restrictions on phases of the same process
• Which does NOT warrant the extension of the ontology with separate processes for each aspectual variant

Page 50

Aspectual lexicalization

• Regular compositional verb structures:
  doorademen: (lit. through+breathe, continue to breathe)
  doorbetalen: (lit. through+pay, continue to pay)
  doorlopen: (lit. through+walk, continue to walk)
  doorfietsen: (lit. through+cycle, continue to cycle)
  doorrijden: (lit. through+drive, continue to drive)

(and (instance x BreathProcess) (instance y Time) (instance z Time) (end x z) (expected (end x y)) (after z y))

Page 51

Lexicalization of Resultatives

• MORE GENERAL VERBS:
  openmaken: (lit. open+make, to cause to be open)
  dichtmaken: (lit. close+make, to cause to be closed)
• MORE SPECIFIC VERBS:
  openknijpen: (lit. open+squeeze, to open by squeezing)
    has_hyperonym knijpen (squeeze) & openmaken (to open)
  opendraaien: (lit. open+turn, to open by turning)
    has_hyperonym draaien (to turn) & openmaken (to open)
  dichtknijpen: (lit. closed+squeeze, to close by squeezing)
    has_hyperonym knijpen (squeeze) & dichtmaken (to close)
  dichtdraaien: (lit. closed+turn, to close by turning)
    has_hyperonym draaien (to turn) & dichtmaken (to close)
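The regular pattern behind these verbs, one hyperonym for the manner and one for the result, can be sketched as a small composition function. `RESULT_VERBS` and `resultative` are hypothetical names, and plain string concatenation only covers regular cases like the ones listed.

```python
# Result prefix -> the more general verb that names the result.
RESULT_VERBS = {"open": "openmaken", "dicht": "dichtmaken"}


def resultative(result_prefix, manner_verb):
    """Build a resultative compound and its two hyperonyms."""
    return {
        "verb": result_prefix + manner_verb,
        "has_hyperonym": [manner_verb, RESULT_VERBS[result_prefix]],
    }


print(resultative("open", "knijpen"))
print(resultative("dicht", "draaien"))
```

Because each compound is fully described by its two hyperonyms, none of these verbs calls for a new type in the ontology.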

Page 52

Kinship relations in Arabic

• عم (Eam~): father's brother, paternal uncle.
• خال (xaAl): mother's brother, maternal uncle.
• عمة (Eam~ap): father's sister, paternal aunt.
• خالة (xaAlap): mother's sister, maternal aunt.

Page 53

Kinship relations in Arabic

• .........
• شقيقة ($aqiyqap): full sister, sister on the paternal and maternal side (as distinct from أخت (>uxot): 'sister', which may refer to a sister from the paternal or maternal side, or both sides).
• ثكلان (vakolAna): father bereaved of a child (as opposed to يتيم (yatiym), or يتيمة (yatiymap) for feminine: 'orphan', a person whose father or mother died, or both father and mother died).
• ثكلى (vakolaYa): mother bereaved of a child (as opposed to يتيم, or يتيمة for feminine: 'orphan', a person whose father or mother died, or both father and mother died).

Page 54

Complex Kinship concepts

father's brother, paternal uncle

WORDNET
paternal uncle => uncle => brother of ....????

ONTOLOGY
(=> (paternalUncle ?P ?UNC) (exists (?F) (and (father ?P ?F) (brother ?F ?UNC))))
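The body of the ontology axiom can be turned into a small lookup over kinship facts. Two assumptions are made loudly here: the argument order is read as (father child father-of-child) and (brother person brother-of-person), and the right-hand side of the axiom is used as if it were a definition, that is, in the converse direction of the implication as written. The facts and names are invented for the example.

```python
# Hypothetical facts; pairs follow the assumed argument order described above.
FATHER = {("Layla", "Hassan")}                      # Hassan is Layla's father
BROTHER = {("Hassan", "Omar"), ("Hassan", "Yusuf")}  # Omar and Yusuf are Hassan's brothers


def paternal_uncles(person):
    """Find every UNC such that some F is the person's father and UNC is F's brother."""
    fathers = {f for (p, f) in FATHER if p == person}
    return {unc for (f, unc) in BROTHER if f in fathers}


print(sorted(paternal_uncles("Layla")))
```

A wordnet can only say that a paternal uncle is a kind of uncle; the expression makes the missing "brother of the father" part explicit enough to compute with.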

Page 55

Advantages of the Global Wordnet Grid

• Shared and uniform world knowledge:
  – universal inferencing
  – uniform text analysis and interpretation
• More compact and less redundant databases
• Clearer notion of how languages map to the knowledge
  – better criteria for expressing knowledge
  – better criteria for understanding variation

Page 56

Expansion from a type to roles

[Figure: network around "dog" with the nodes watchdog, poodle, street dog, dachshund, lapdog, short hair dachshund, long hair dachshund, hunting dog, puppy and bitch, contrasted with "Expansion with pure hyponymy relations".]

Page 57

Expansion from a role to types and other roles

[Figure: network around "dog" with the nodes watchdog, poodle, street dog, dachshund, lapdog, short hair dachshund, long hair dachshund, hunting dog, puppy and bitch, contrasted with "Expansion with pure hyponymy relations".]

Page 58

Automotive ontology: (http://www.ontoprise.de)

Page 59

Who uses ontologies?

Page 60

Human dialogues with Alice-bot

Page 61

Full understanding is fundamentally impossible BUT?

• How can people communicate?
• How can people communicate with computers?
• As long as language is effective:
  – meaning = to have the desired effect!
  – Link language to useful content!

Page 62

[Figure: diagram with the labels Ontology, Texts, Objects in reality, Thought, Expression, Knowledge & information, and the example expression 携帯電話 (keitaidenwa).]

Useful and effective behavior:
- reason over knowledge
- collect information and data
- deliver services and be helpful

Page 63

Concrete goals for GWG

• Global Wordnet Association website: http://www.globalwordnet.org/gwa/gwa_grid.htm
• 5000 Base Concepts or more:
  – English
  – Spanish
  – Catalan
  – Czech, Polish, Dutch, other wordnets
• 7th Framework project KYOTO

Page 64

KYOTO Project

• 7th Framework project (under negotiation)
• Knowledge Yielding Ontologies for Transition-based Organisations
• Goal:
  – Global Wordnet Grid = ontology + wordnets
  – AutoCons = Automatic concept extractors
  – Kybots = Knowledge yielding robots
  – Wiki environment for encoding domain knowledge in expert groups
  – Index and retrieval software for deep semantic search
• Languages: Dutch, English, Spanish, Basque, Italian, Chinese and Japanese
• Domain of application: environmental organisations
• Period: March/April 2008 - 2011

Page 65

KYOTO Consortium

Universities
• Vrije Universiteit Amsterdam, Amsterdam, Netherlands
• Consiglio Nazionale delle Ricerche, Pisa, Italy
• Berlin-Brandenburg Academy of Sciences and Humanities, Berlin, Germany
• Euskal Herriko Unibertsitatea, San Sebastian, Spain
• Academia Sinica, Taipei, Taiwan
• National Institute of Information and Communications Technology, Kyoto, Japan
• Masaryk University, Brno, Czech Republic

Companies
• Irion Technologies, Delft, Netherlands
• Synthema, Pisa, Italy

Users
• European Centre for Nature Conservation, Tilburg, Netherlands
• World Wide Fund for Nature, Zeist, Netherlands

Page 66

[Figure: KYOTO overview diagram. Labels: Environmental organizations; Citizens, Governors, Companies; Docs, URLs, Experts, Images; Capture; Index; Search; Dialogue; Concept Mining; Fact Mining; Domain Wiki; Wordnets; Universal Ontology with Top, Middle and Domain levels (concepts shown: Abstract, Physical, Process, Substance, water, CO2, CO2 emission, water pollution).]

Page 67

[Figure: KYOTO process diagram with components numbered 1 to 8. Labels: source data; Capture; Text & Meta data in XML Format; Concept Miners; term hierarchy; term relations; wordnet; ontology; domain wordnet; domain ontology; Wiki; DEB Client; DEB Server; Manual Revision; Kybots; Data & Facts in XML Format; Indexing; Index; Access end-users; User scenarios; Manual Test; Benchmark data; Benchmarking.]

Page 68

[Figure: Ontology and Wordnets at Generic and Domain levels. Ontology concepts shown: Abstract, Physical, Process, Substance, Chemical Reaction, water, CO2, CO2 emission, water pollution. Other labels: Logical Expressions; Linguistic Miners or Kybots; words.]

Page 69

END