Automatic Lexical Annotation Applied to the SCARLET Ontology Matcher Laura Po and Sonia Bergamaschi DII, University of Modena and Reggio Emilia, Italy [email protected] [email protected] 24 - 26 March 2010 Hue City Vietnam


TRANSCRIPT

Page 1: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Automatic Lexical Annotation Applied to the SCARLET Ontology Matcher

Laura Po and Sonia Bergamaschi
DII, University of Modena and Reggio Emilia, Italy
[email protected] [email protected]

24 - 26 March 2010, Hue City, Vietnam

Page 2: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

The idea

When we deal with data sources, we deal with structural information that is labeled by humans.

Humans use lexical expressions to assign names.

Natural language labels provide a rich connection between formal objects (e.g. classes and properties) and their intended meanings.

Automatic Lexical Annotation Applied to the SCARLET Ontology Matcher - Laura Po

ACIIDS - 26/03/2010

Page 3: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical knowledge inside sources

It is necessary to address the problem of how the concepts are "labelled", i.e. understanding the meaning behind the names denoting ontology elements.

In NLP (Natural Language Processing), Word Sense Disambiguation (WSD) is the process of identifying which sense (i.e. meaning) of a word is used in a given sentence, when the word has a number of distinct senses (polysemy).
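As a concrete illustration of WSD, the sketch below uses a simplified Lesk-style gloss overlap, a classic baseline that is not described in these slides. The two-sense inventory for "bank" and the function name `simplified_lesk` are made up for the example; a real system would take senses and glosses from a resource such as WordNet.

```python
# Hypothetical sense inventory: sense id -> gloss (hand-written for this example).
BANK_GLOSSES = {
    "bank#river": "sloping land beside a body of water",
    "bank#finance": "a financial institution that accepts deposits and lends money",
}


def simplified_lesk(word, sentence, glosses):
    """Return the sense whose gloss shares the most words with the sentence."""
    context = set(sentence.lower().split()) - {word.lower()}

    def overlap(sense):
        return len(context & set(glosses[sense].lower().split()))

    # Ties are resolved in favour of the first sense listed.
    return max(glosses, key=overlap)


print(simplified_lesk("bank", "she deposits money at the bank", BANK_GLOSSES))
print(simplified_lesk("bank", "they sat on the bank beside the water", BANK_GLOSSES))
```

The same word form receives a different sense in each call because only the surrounding context words differ, which is exactly the situation polysemy creates for ontology labels.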


Page 5: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Applications of WSD

Information Retrieval
Information Extraction
Machine Translation
Content Analysis
Word Processing
Lexicography
The Semantic Web
◦ ontology learning: to build domain taxonomies and enrich large-scale semantic networks


Roberto Navigli. Word Sense Disambiguation: A Survey, ACM Computing Surveys, 41(2), 2009

Page 6: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical Annotation

Lexical Annotation is a particular metadata annotation that refers to a semantic resource.

Each lexical annotation has one or more lexical descriptions.

Lexical Annotation
◦ assigns meanings to class and property names w.r.t. a semantic resource (WordNet)
◦ derives relationships among source elements

Lexical Annotation can be an effective method to solve ambiguity problems!


Page 7: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical Annotation – an example

Word forms: Book, Volume, Catalog

Meanings (synsets):
◦ a written work or composition that has been published (printed on pages bound together)
◦ physical objects consisting of a number of pages bound together; "he used a large book as a doorstop"
◦ a book or pamphlet containing an enumeration of things

Lexical relationships extracted:
◦ Book SYN Volume
◦ Book BT Catalog (Book is a hypernym of Catalog)

[Figure: check marks linking each word form to its selected synsets]
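The extraction of SYN and BT relationships in this example can be sketched as follows. This is a minimal illustration, not the authors' implementation: the synset identifiers, the assignment of synsets to each word form, the `HYPERNYM` table, and the function `extract_relationships` are all hypothetical stand-ins for annotations that would normally come from WordNet.

```python
from itertools import combinations

# Hypothetical annotations: each label mapped to the synsets chosen for it.
ANNOTATIONS = {
    "Book": {"book.written_work", "book.physical_object"},
    "Volume": {"book.physical_object"},
    "Catalog": {"catalog.list"},
}

# Hypothetical direct hypernym links (child synset -> parent synset).
HYPERNYM = {"catalog.list": "book.written_work"}


def extract_relationships(annotations, hypernym):
    """Return (term, relation, term) triples: SYN for shared synsets, BT for hypernymy."""
    found = set()
    for a, b in combinations(sorted(annotations), 2):
        # SYN: the two labels were annotated with at least one common synset.
        if annotations[a] & annotations[b]:
            found.add((a, "SYN", b))
        for x, y in ((a, b), (b, a)):
            # x BT y: a synset of x is the direct hypernym of a synset of y.
            if any(hypernym.get(s) in annotations[x] for s in annotations[y]):
                found.add((x, "BT", y))
    return found


for triple in sorted(extract_relationships(ANNOTATIONS, HYPERNYM)):
    print(triple)
```

Only direct hypernym links are followed here; following longer hypernym chains would be a natural extension.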

Page 8: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

The ontology matching problem

An ontology is an explicit specification of a conceptualization (Gruber, 1993).

The ontology matching process, for two separate and autonomous ontologies, O1 and O2, consists of finding corresponding entities in ontologies O1 and O2.


Page 9: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Ontology matchers

Several ontology matchers have been proposed in the literature, although most do not take advantage of the linguistic aspect of the involved ontologies.

In particular, ontology matchers do not discern elements with different meanings.


Page 10: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

The SCARLET matcher

Scarlet* is a technique for discovering relationships between two concepts by making use of ontologies available online.

Scarlet discovers semantic relationships between concepts by exploiting the entire Semantic Web as a source of background knowledge.


* SCARLET was developed at the Knowledge Media Institute (KMi) of the Open University in Milton Keynes, UK.

Page 11: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

The SCARLET matcher

By using semantic search engines (Swoogle and WATSON), it finds online ontologies containing concepts with the same names as the candidate concepts, and then it derives mappings from the relationships in the online ontologies.


[Figure: Ontology 1, Ontology 2 and an online ontology, with concepts A and B anchored to A0 and B0. Legend: anchoring, relationship.]
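The anchoring step described on this slide can be sketched as below. This is only an illustration under stated assumptions: the online ontologies are modelled as in-memory lists of triples rather than results from Swoogle or WATSON, anchoring is reduced to case-insensitive name equality, and the names `ONLINE_ONTOLOGIES`, `anchor`, and `find_mapping` are hypothetical.

```python
# Hypothetical online ontologies: a name plus (subject, relation, object) facts.
ONLINE_ONTOLOGIES = [
    {"name": "onto_food", "facts": [("Beef", "subClassOf", "Food")]},
    {"name": "onto_misc", "facts": [("Car", "subClassOf", "Vehicle")]},
]


def anchor(label, ontology):
    """Anchor a label to a concept with the same name (case-insensitive), or None."""
    concepts = {c for s, _, o in ontology["facts"] for c in (s, o)}
    for concept in concepts:
        if concept.lower() == label.lower():
            return concept
    return None


def find_mapping(a, b, ontologies):
    """Return (relation, ontology name) from the first ontology relating both anchors."""
    for onto in ontologies:
        a0, b0 = anchor(a, onto), anchor(b, onto)
        if a0 is None or b0 is None:
            continue  # this ontology does not cover both candidate concepts
        for s, rel, o in onto["facts"]:
            if (s, o) == (a0, b0):
                return rel, onto["name"]
    return None


print(find_mapping("beef", "food", ONLINE_ONTOLOGIES))
print(find_mapping("beef", "vehicle", ONLINE_ONTOLOGIES))
```

Because anchoring relies on names alone, a label such as "bank" would be anchored to any concept called "Bank", whatever its meaning; this is the weakness that lexical annotation targets.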

Page 12: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

The SCARLET matcher

Scarlet is able to identify disjoint relations, subsumption relations, and correspondences. All relations are obtained by using derivation rules which explore direct relations and also relations deduced by applying subsumption reasoning.


[Figure: Ontology 1, Ontology 2 and several online ontologies, with concepts A, B, A0, B0, C and C0. Legend: anchoring, relationship.]
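The idea of deducing relations through subsumption reasoning can be sketched as follows. This is a simplified illustration and not SCARLET's actual rule set: it assumes a single online ontology given as a `subClassOf` dictionary plus a set of disjoint pairs, and the names `ancestors`, `derive_relation`, `SUB`, and `DISJOINT` are hypothetical.

```python
# Hypothetical online ontology: concept -> list of direct superclasses.
SUB = {"Puppy": ["Dog"], "Dog": ["Animal"], "Cat": ["Animal"]}
# Hypothetical disjointness axioms, stored as unordered pairs.
DISJOINT = {frozenset(("Dog", "Cat"))}


def ancestors(concept, sub_class_of):
    """All superclasses reachable from concept via subClassOf edges."""
    seen, stack = set(), [concept]
    while stack:
        for parent in sub_class_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def derive_relation(a0, b0, sub_class_of, disjoint=frozenset()):
    """Derive a relation between two anchored concepts a0 and b0."""
    if a0 == b0:
        return "equivalent"
    up_a = ancestors(a0, sub_class_of) | {a0}
    up_b = ancestors(b0, sub_class_of) | {b0}
    if b0 in up_a:
        return "subclass"    # a0 is subsumed by b0, possibly through intermediates
    if a0 in up_b:
        return "superclass"  # a0 subsumes b0
    # Disjointness is inherited: disjoint ancestors make the concepts disjoint.
    if any(frozenset((x, y)) in disjoint for x in up_a for y in up_b):
        return "disjoint"
    return None


print(derive_relation("Puppy", "Animal", SUB))
print(derive_relation("Puppy", "Cat", SUB, DISJOINT))
```

The first call finds a subsumption that is not stated directly but follows from two `subClassOf` edges; the second finds a disjointness inherited from the ancestor `Dog`.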

Page 13: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

The evaluation of SCARLET

On large-scale, real-life data sets, SCARLET achieved a precision of 70%.

More than half of the incorrect anchorings were due to ambiguities.

SCARLET is not able to take advantage of the ontological context in which a concept appears.


Lexical Annotation can be used to solve the ambiguity problems!

Page 14: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

SCARLET + lexical annotation

By identifying a meaning (or a set of meanings) for each concept, it is possible to compare the concept more accurately with the concepts that appear in online ontologies.


Page 15: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

How Lexical Annotation enhances the ontology matching performance

Performing lexical annotation on the ontologies involved in the matching process makes it possible to:
◦ detect false positive mappings
◦ discover new mappings
◦ identify synonyms and more general classes for a given concept


Improving precision

Improving recall

Page 16: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical annotation improvements

1 - detection of false positive mappings


[Figure: Ontology 1 and Ontology 2 with concepts A and B, anchor concepts A0 and B0, and meanings SYNSET1 to SYNSET4; a cross marks the rejected anchoring.]

If a concept and its anchoring concept have different meanings (i.e. if they do not have the same list of meanings), the anchoring is detected as a false positive.
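The rule above amounts to comparing two sets of meanings. A minimal sketch follows, assuming each concept has already been annotated with a set of synset identifiers; the identifiers, the `O1:` and `online:` prefixes, and the function `split_anchorings` are hypothetical and only serve the example.

```python
# Hypothetical lexical annotations: concept id -> set of synset ids.
ANNOTATIONS = {
    "O1:bank": {"bank#finance"},
    "online:bank": {"bank#river"},
    "O1:loan": {"loan#1"},
    "online:loan": {"loan#1"},
}


def split_anchorings(anchorings, annotations):
    """Separate anchorings whose two concepts share the same meanings from the rest."""
    kept, false_positives = [], []
    for concept, anchor_concept in anchorings:
        same = annotations[concept] == annotations[anchor_concept]
        (kept if same else false_positives).append((concept, anchor_concept))
    return kept, false_positives


kept, rejected = split_anchorings(
    [("O1:bank", "online:bank"), ("O1:loan", "online:loan")], ANNOTATIONS
)
print(kept)      # anchorings with identical meaning sets
print(rejected)  # anchorings flagged as false positives
```

Requiring identical sets follows the wording of the slide; a looser variant could accept any non-empty intersection, at the cost of letting more ambiguous anchorings through.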

Page 17: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical annotation improvements

1 - detection of false positive mappings


[Figure: Ontology 1 and Ontology 2 with concepts A, B and C, anchor concepts A0, B0 and C0, and meanings SYNSET1 to SYNSET4; crosses mark the rejected anchorings.]

Page 18: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical annotation improvements

2 - new mapping discovery


[Figure: Ontology 1 and Ontology 2 with concepts A and B, meanings SYNSET1 to SYNSET4, and a hyponym relationship between synsets.]

Page 19: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical annotation improvements

3 - identification of synonyms and more general concepts


[Figure: Ontology 1 and Ontology 2 with concept B, the terms "home" and "house", anchor concept B0, and meanings SYNSET1 to SYNSET4; label: new anchoring.]

Page 20: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

ALA tool

We employ the Automatic Lexical Annotation (ALA) tool to perform lexical annotation of the ontologies involved in the matching process (source ontologies, online ontologies).

ALA combines the output of 4 WSD algorithms and 2 heuristic rules.

The combination is a sequential composition:
◦ only the first algorithm is executed on the entire data source; the following algorithms are executed only on the set of concepts that were not disambiguated by the previous ones.
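The sequential composition described above can be sketched as a simple pipeline. The two annotators below are toy stand-ins, not the four WSD algorithms or the two heuristic rules used by ALA (which the slides do not name); `sequential_annotate`, `by_context`, and `by_default_sense` are hypothetical names.

```python
def sequential_annotate(terms, algorithms):
    """Run algorithms in order; each sees only the terms still lacking a sense."""
    annotations = {}
    remaining = list(terms)
    for algorithm in algorithms:
        if not remaining:
            break
        result = algorithm(remaining)  # maps some terms to a non-empty set of senses
        annotations.update({t: s for t, s in result.items() if s})
        remaining = [t for t in remaining if t not in annotations]
    return annotations, remaining


def by_context(terms):
    # Toy stand-in: this annotator only manages to disambiguate "book".
    return {t: {"book#1"} for t in terms if t == "book"}


def by_default_sense(terms):
    # Toy stand-in: falls back to a fixed sense for the terms it knows.
    known = {"book": "book#9", "volume": "volume#3"}
    return {t: {known[t]} for t in terms if t in known}


annotations, unresolved = sequential_annotate(
    ["book", "volume", "xyz"], [by_context, by_default_sense]
)
print(annotations)
print(unresolved)
```

Note that `by_default_sense` never gets to assign its sense for "book": the term was already disambiguated by the first annotator, so the order of the algorithms determines which one wins.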


Page 21: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical Annotation Evaluation

The application of lexical annotation techniques to the SCARLET results has been tested on two test cases:

◦ NALT + AGROVOC
Two real-life thesauri: the AGROVOC thesaurus of the United Nations Food and Agriculture Organization (FAO) and the NALT agricultural thesaurus of the United States National Agricultural Library (NAL).

◦ OAEI 2006 benchmark
The benchmark covers the bibliographic domain; the bibliographic ontologies we took into account are the reference ontology and the Karlsruhe ontology.


Page 22: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical Annotation Evaluation: detection of incorrect anchoring

The results of the automatic lexical annotation have been compared with the manual evaluation done by an expert on the entire set of matches.


Most are due to the presence of compound nouns.

Page 23: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical Annotation Evaluation: new mapping discovery

After the execution of lexical annotation, we computed a mapping between two concepts if we found a relationship between their corresponding meanings in WordNet.


Page 24: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Lexical Annotation Evaluation: comparison

We compared our results with a multiontology disambiguation method that has been previously applied to SCARLET.

Unlike the multiontology disambiguation method, which returns similarity measures, our method offers a definite answer regarding the detection of synonym relationships.

Comparing the results, we evaluated some possible thresholds on the similarity measures returned by the multiontology disambiguation method (0.19 – 0.22).
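One way to compare a similarity-based method with a method that gives yes/no answers is to threshold the similarities and measure agreement. The sketch below illustrates that idea only; it is not the authors' evaluation procedure, and the pair names, similarity values, and answers are made-up toy data, not results from the paper.

```python
# Made-up toy data: similarity scores from a similarity-based method ...
SIMILARITIES = {"p1": 0.30, "p2": 0.20, "p3": 0.10, "p4": 0.21}
# ... and the definite synonym / not-synonym answers of a lexical method.
DEFINITE = {"p1": True, "p2": False, "p3": False, "p4": True}


def agreement(similarities, definite, threshold):
    """Fraction of pairs where the thresholded similarity matches the definite answer."""
    hits = sum((similarities[p] >= threshold) == definite[p] for p in definite)
    return hits / len(definite)


for threshold in (0.19, 0.21, 0.22):
    print(threshold, agreement(SIMILARITIES, DEFINITE, threshold))
```

Sweeping the threshold over a small range, as in the 0.19 to 0.22 interval mentioned above, shows how sensitive the agreement is to the chosen cut-off.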


Page 25: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Conclusion

We proposed and experimentally investigated a method to solve ambiguity problems in the context of ontology matching by using automatic lexical annotation techniques (ALA tool). The method has been applied to SCARLET, a Semantic Web based matcher.

Experimental results have shown that by performing lexical annotation of ontologies we are able to:
◦ detect false positive mappings
◦ discover new mappings


Page 26: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Future work on lex ann + matchers

Lexical Annotation is able to identify synonyms and generalizations of concepts. Implementing this will give the matcher the possibility to widen the search among online ontologies, thus improving matching results.

In order to cope with more complex ontologies, our method needs to be extended by including the treatment of compound terms and abbreviations (published at ER2009).

The method could be coupled with any matcher.


Page 27: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Future perspectives

There are several scenarios where we applied lexical annotation
◦ Data integration
◦ Ontology matching
◦ Disambiguation/classification of Google hits

New scenarios
◦ blogs
◦ social networks
◦ Mash up ?!


Page 28: Automatic Lexical Annotation Applied to the  SCARLET Ontology Matcher

Thanks for your attention!

www.dbgroup.unimo.it
