TRANSCRIPT
Semantic Enrichment of Ontology Mappings: A Linguistic-based Approach
Patrick Arnold, Erhard Rahm, University of Leipzig, Germany
17th East-European Conference on Advances in Databases and Information Systems
1. Introduction
Ontology Matching: detecting corresponding concepts between two ontologies O and O'
Most matching tools do not consider the relation type that holds between corresponding concepts
Importance of Relation Types: Ontology Merging and Ontology Evolution
More precise results, effectively preventing false conclusions
Related fields: Text Mining, Entity Resolution, Linked Data
Key Question: Given two items (words, concepts, etc.), what is the logical relation between them?
Some existing tools consider relation types
S-Match: ineffective in our evaluations; returned about 20,000 correspondences where around 400 were expected
Further tools: LogMap, TaxoMap, etc.
Our Contributions
1. Introduction
2. Semantic Enrichment Architecture
3. Implemented Strategies
4. Evaluation
5. Outlook
2. Semantic Enrichment Architecture
We provide a 2-step architecture. Step 1: Classic Ontology Matching. Step 2: Enrichment.
The approach consists of 4 strategies; each strategy returns one of the following relation types:
equal, is-a / inverse is-a, part-of / has-a, related, undecided
Take the relation type that was returned most often (see the sketch below)
In case of a draw: user feedback is required
If all strategies return undecided: decide on equal by default
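A minimal sketch of this aggregation step in Python; the relation-type strings and the combine_votes function are illustrative names, not the authors' actual implementation.

```python
from collections import Counter

UNDECIDED = "undecided"
EQUAL = "equal"

def combine_votes(votes):
    """Combine the relation types returned by the individual strategies.

    votes: one relation-type string per strategy,
    e.g. ["is-a", "undecided", "is-a", "equal"].
    """
    # Strategies that could not decide are ignored.
    decided = [v for v in votes if v != UNDECIDED]

    # If all strategies returned undecided, fall back to 'equal' by default.
    if not decided:
        return EQUAL

    counts = Counter(decided).most_common()
    # A draw between the most frequent relation types requires user feedback.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "ask-user"

    # Otherwise take the relation type that was returned most often.
    return counts[0][0]

print(combine_votes(["is-a", "undecided", "is-a", "equal"]))  # -> is-a
```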
3. Implemented Strategies - 3.1 Compound Strategy
Compound: Two words A, B form a new word AB. Examples: high-school, blackbird, database conference
A is called the modifier, B is called the head
Compounds often express is-a relations (endocentric compounds)
A high-school is a school, a blackbird is a bird, ...
If there is a correspondence (AB, B) or (B, AB), we derive the is-a or inv. is-a relation
Example: (main memory, memory)
Problem: exocentric compounds (butterfly, redhead, computer mouse)
Exocentric matches are extremely rare in mappings: if AB is an exocentric compound, there usually is no head B in the opposite ontology
Example: sawtooth – tooth
Possibilities to reduce false conclusions (see the sketch below):
Check the modifier length: it must be at least 3 characters (inroad – road)
Use a dictionary to check the modifier (marriage – age, nausea – sea, holiday – day?)
No solution for "pseudo-compounds": question – ion, justice – ice
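A small sketch of the compound strategy with the two safeguards above; the function name and the dictionary argument are illustrative, and a real dictionary (e.g. WordNet) would replace the toy word set.

```python
def compound_relation(a, b, dictionary=None):
    """Compound strategy sketch: if one concept name ends with the other
    (AB vs. B), derive is-a / inverse is-a, applying the safeguards above."""

    def is_compound_of(whole, head):
        whole, head = whole.lower(), head.lower()
        if whole == head or not whole.endswith(head):
            return False
        modifier = whole[: -len(head)].strip(" -")
        # Safeguard 1: the modifier must have at least 3 characters (inroad / road).
        if len(modifier) < 3:
            return False
        # Safeguard 2: if a dictionary is given, the modifier must be a known
        # word (rules out marriage / age, nausea / sea).
        if dictionary is not None and modifier not in dictionary:
            return False
        return True

    if is_compound_of(a, b):
        return "is-a"          # AB is-a B, e.g. (main memory, memory)
    if is_compound_of(b, a):
        return "inverse is-a"
    return "undecided"

words = {"main", "high", "black"}
print(compound_relation("main memory", "memory", words))  # is-a
print(compound_relation("marriage", "age", words))        # undecided
```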
3. Implemented Strategies - 3.2 Background Knowledge
WordNet for English-language scenarios: reliable, extensive thesaurus; excellent precision, good recall; limited in domain-specific areas
Problem: compounds. Example: Vintage Car Repair Shop is a very simple term, but it is not contained in WordNet
Gradual Modifier Removal: remove modifiers gradually from the left; after each removal, check whether the remaining word is contained in WordNet (see the sketch after the table)
Example: Vintage Car Repair Shop ↔ Company
WordNet: a Repair Shop is a Company, hence a Vintage Car Repair Shop is a Company
Step   Word                      In WordNet?
1      Vintage Car Repair Shop   no
2      Car Repair Shop           no
3      Repair Shop               yes
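A sketch of gradual modifier removal using NLTK's WordNet interface (this assumes nltk is installed and the WordNet corpus has been downloaded); the function name is illustrative.

```python
from nltk.corpus import wordnet as wn  # requires nltk and the 'wordnet' corpus

def reduce_to_wordnet_term(phrase):
    """Drop modifiers from the left until the remaining term is in WordNet."""
    words = phrase.lower().split()
    for i in range(len(words)):
        candidate = "_".join(words[i:])  # WordNet stores multi-word terms with underscores
        if wn.synsets(candidate):
            return " ".join(words[i:])
    return None  # no suffix of the phrase is known to WordNet

# Vintage Car Repair Shop -> Car Repair Shop -> Repair Shop (found in WordNet)
print(reduce_to_wordnet_term("Vintage Car Repair Shop"))  # 'repair shop'
```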
3. Implemented Strategies - 3.3 Itemization
Itemization: a list of items (words or phrases), found most frequently in product taxonomies
Examples: Laptops and Computers; Bikes, Scooters and Motorbikes
Itemizations are more complex and need special treatment
Itemization Strategy: triggers if at least one concept is an itemization; exploits the previous strategies
Approach: remove items from the item sets, the goal being an empty set (a sketch follows the example below)
Example correspondence: "books, e-books, movies, films, cds" ↔ "novels and compact discs"

Step 1: Build item sets
  { books, e-books, movies, films, cds }   { novels, compact discs }
Step 2: Intra-Synonym Removal: in each item set, remove synonyms (A, B) by crossing off either A or B (movies/films)
  { books, e-books, movies, cds }   { novels, compact discs }
Step 3: Intra-Hyponym Removal: in each item set, remove existing hyponyms (e-books is a hyponym of books)
  { books, movies, cds }   { novels, compact discs }
Step 4: Inter-Synonym Removal: remove each synonym pair between the two item sets (cds / compact discs)
  { books, movies }   { novels }
Step 5: Inter-Hyponym Removal: remove each word H to which a hypernym H' in the opposite item set exists (novels is a hyponym of books)
  { books, movies }   { }
Step 6: Determine the relation type: the second item set is empty while the first is not, so the second concept is more specific than the first one: inverse is-a
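A sketch of this item-set reduction in Python; synonym and hyponym are stand-in oracles (in practice backed by WordNet or the compound strategy), and the function name is illustrative.

```python
def itemization_relation(items1, items2, synonym, hyponym):
    """Itemization strategy sketch following the six steps above.
    hyponym(a, b) is True if a is more specific than b."""

    def drop_intra(items):
        """Steps 2-3: remove synonyms and hyponyms inside one item set."""
        items, kept = list(items), []
        for w in items:
            has_hypernym = any(hyponym(w, other) for other in items if other != w)
            has_kept_synonym = any(synonym(w, k) for k in kept)
            if not has_hypernym and not has_kept_synonym:
                kept.append(w)
        return set(kept)

    s1, s2 = drop_intra(items1), drop_intra(items2)

    # Step 4: remove each synonym pair between the two item sets.
    for a, b in [(a, b) for a in s1 for b in s2 if synonym(a, b)]:
        s1.discard(a)
        s2.discard(b)

    # Step 5: remove each word whose hypernym appears in the opposite item set.
    s1, s2 = ({a for a in s1 if not any(hyponym(a, b) for b in s2)},
              {b for b in s2 if not any(hyponym(b, a) for a in s1)})

    # Step 6: determine the relation type from the remaining items.
    if not s1 and not s2:
        return "equal"
    if not s2:
        return "inverse is-a"   # second concept is more specific than the first
    if not s1:
        return "is-a"
    return "undecided"

# Toy oracles reproducing the example above.
syn = {frozenset(p) for p in [("movies", "films"), ("cds", "compact discs")]}
hypo = {("e-books", "books"), ("novels", "books")}
print(itemization_relation(
    ["books", "e-books", "movies", "films", "cds"],
    ["novels", "compact discs"],
    lambda a, b: frozenset((a, b)) in syn,
    lambda a, b: (a, b) in hypo))  # -> inverse is-a
```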
3. Implemented Strategies - 3.4 Structure Strategy
Focus: structured schemas (hierarchies)
Issue: a relation between two matching concepts X, Y cannot be derived directly
Check the relation between X' and Y, resp. X and Y', where the prime (') denotes the parent element (see the sketch below)
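A sketch of one plausible reading of this rule, assuming that if the parent X' of X is equal to Y, then X is-a Y (and symmetrically for Y'); the function names are illustrative, not the authors' API.

```python
def structure_relation(x, y, parent, base_relation):
    """Structure strategy sketch.

    parent(c): returns the parent concept of c (or None at the root).
    base_relation(a, b): relation derived by the other strategies.
    """
    x_parent, y_parent = parent(x), parent(y)
    # If X sits directly below something equal to Y, X is more specific than Y.
    if x_parent is not None and base_relation(x_parent, y) == "equal":
        return "is-a"
    # Symmetrically: if Y sits directly below something equal to X.
    if y_parent is not None and base_relation(x, y_parent) == "equal":
        return "inverse is-a"
    return "undecided"
```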
3. Implemented Strategies - 3.5 Subset Verification
In some cases, is-a relations only appear to be correct
4. Evaluation
3 Benchmark Scenarios
Input: perfect mapping without relation types
Evaluation (a sketch of these measures follows the table):
How many non-trivial relations were detected? (recall)
How many of them were correct? (precision)
Scenario   Domain / Traits                                            Corresp.   Non-trivial corresp.
1          Web Directories (German language, product catalog)           340          62
2          Diseases (health, medical domain)                            395          41
3          Text Mining Taxonomies (TM taxonomy, everyday language)      762         692
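A tiny sketch of one straightforward reading of these recall/precision measures over non-trivial relation types; the dictionary-based interface and the function name are purely illustrative.

```python
def enrichment_quality(predicted, reference):
    """predicted / reference: dicts mapping a correspondence (pair of
    concept names) to its relation type; 'equal' is the trivial type."""
    expected = {c for c, r in reference.items() if r != "equal"}   # gold non-trivial relations
    returned = {c for c, r in predicted.items() if r != "equal"}   # returned non-trivial relations
    correct = {c for c in returned if reference.get(c) == predicted[c]}

    recall = len(correct) / len(expected) if expected else 1.0
    precision = len(correct) / len(returned) if returned else 1.0
    return recall, precision
```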
Evaluation (as of April 2013)
Evaluation against S-Match: no reasonable comparison feasible
Scenario 1: S-Match returned only 4 correspondences, all wrong
Scenario 2: S-Match returned 19,600 correspondences; 3 % recall, precision close to 0 %
Scenario          Recall    Precision   F-Measure
Web Directories   46.7 %     69.0 %      57.8 %
Health            58.5 %     80.0 %      69.2 %
TM Taxonomies     65.4 %     97.7 %      81.1 %
Evaluating the Strategies
Strategy               Recall    Precision
Compound               18.9 %     82.2 %
Background Knowledge   19.6 %     94.0 %
Itemization            17.1 %     88.8 %
Structure               1.0 %     50.0 %
5. Outlook
Relation types are needed for different mapping tasks
Two general approaches: linguistic strategies or background knowledge
Linguistic strategies: more generic, but more error-prone
Background knowledge: less generic, but more precise
Improvements:
Exploit more background knowledge, e.g. the YAGO taxonomy, DBpedia, UMLS
Combine it with linguistic / NLP technologies
Exploit further linguistic techniques
Thank You