Evaluating the Stability and Credibility of Ontology Matching Methods

DESCRIPTION

Ontology Track @ ESWC2011

TRANSCRIPT

Page 1: Evaluating the Stability and Credibility of Ontology Matching Methods

APEX Data & Knowledge Management Lab

Evaluating the Stability and Credibility of Ontology Matching Methods

Xing Niu, Haofen Wang, Gang Wu, Guilin Qi, and Yong Yu

2011.5.31

Page 2

Agenda

- Introduction
- Basic Concepts: Confidence Threshold, relaxedCT, Test Unit
- Evaluation Measures and Their Usages: Comprehensive F-measure, STD score, ROC-AUC score
- Conclusions and Future Work

Page 3

Introduction

Stability
- Reference matches are scarce, which makes training proper parameters hard
- High stability: a matching method performs consistently on the data of different domains or scales

Credibility
- Candidate matches are sorted by their matching confidence values
- High credibility: a matching method generates true positive matches with high confidence values while returning false positive ones with low values

Page 4

Introduction (cont'd)

Judging basic compliance alone (Precision, Recall, F-measure) is not sufficient.

Measurements
- Stability: Comprehensive F-measure and STD (STandard Deviation) score
- Credibility: ROC-AUC (Area Under Curve) score
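The compliance measures named above can be sketched as follows. The pair-of-labels match representation and the sample data are illustrative assumptions, not taken from the slides.

```python
# Sketch of the basic compliance measures: precision, recall, and
# F-measure of a set of found matches against a reference alignment.
# Matches are modelled here as simple (entity, entity) label pairs,
# which is a simplifying assumption for illustration.

def precision_recall_f(found, reference):
    """Return (precision, recall, F1) for two sets of matches."""
    found, reference = set(found), set(reference)
    tp = len(found & reference)                      # true positives
    precision = tp / len(found) if found else 0.0
    recall = tp / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical matcher output versus a hypothetical reference alignment.
found = {("Paper", "Article"), ("Author", "Writer"), ("Topic", "Subject")}
reference = {("Paper", "Article"), ("Author", "Writer"), ("Event", "Meeting")}
p, r, f = precision_recall_f(found, reference)       # each equals 2/3 here
```

With two of three found matches correct and two of three reference matches recovered, precision, recall, and F1 all come out to 2/3, which is why the single numbers alone say nothing about stability across datasets or about how confidence values are distributed.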

Page 5

Confidence Threshold

A match can be represented as a 5-tuple that includes a confidence value. The Confidence Threshold (CT) is the minimum confidence a match must reach to be kept in the result.
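A minimal sketch of CT filtering, assuming the 5-tuple's last component is the confidence value; the field layout and sample matches are illustrative assumptions, not the slides' notation.

```python
# Confidence-threshold filtering over candidate matches.
# Each match is assumed to be a 5-tuple whose last component is its
# confidence value in [0, 1]; the other fields (id, the two matched
# entities, the relation) are illustrative.

def apply_ct(matches, ct):
    """Keep only matches whose confidence is at least the threshold CT."""
    return [m for m in matches if m[4] >= ct]

matches = [
    ("m1", "Paper", "Article", "=", 0.95),
    ("m2", "Chair", "Seat", "=", 0.40),
    ("m3", "Author", "Writer", "=", 0.80),
]
kept = apply_ct(matches, ct=0.7)   # m1 and m3 survive, m2 is filtered out
```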

Page 6

Relaxed CT


Page 7

Test Unit

A set of similar datasets describing the same domain and sharing many resemblances, but differing in details.

Examples
- Benchmark 20X: structural information remains; another language, synonyms, naming conventions
- Conference Track: conference organization domain; built by different groups
- Cyclopedia: same data source; different categories; …
- Others: same data source; random N-folds

Page 8

Comprehensive F-measure

- Maximum F-measure (maxF-measure): reflects the theoretical optimal matching quality of matching methods
- Uniform F-measure (uniF-measure): simulates the practical application and evaluates the corresponding stability of matching methods
- Comprehensive F-measure (comF-measure): combines the two, reflecting both theoretical and practical results
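The two component measures can be sketched as follows. The toy matches, their confidences, the reference alignment, and the fixed uniform threshold of 0.5 are all illustrative assumptions; this sketch stops at the two components and does not reproduce the slides' exact comF-measure formula.

```python
# maxF-measure: the best F-measure over all confidence thresholds
# (the theoretical optimum for a matcher's ranked output).
# uniF-measure: the F-measure at one fixed threshold shared across
# test cases (the practical, untuned setting).

def f_measure(found, reference):
    tp = len(found & reference)
    p = tp / len(found) if found else 0.0
    r = tp / len(reference) if reference else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def max_and_uni_f(scored_matches, reference, uniform_ct=0.5):
    """scored_matches: {match: confidence}. Returns (maxF, uniF)."""
    thresholds = sorted(set(scored_matches.values()))
    max_f = max(
        f_measure({m for m, c in scored_matches.items() if c >= t}, reference)
        for t in thresholds
    )
    uni_f = f_measure(
        {m for m, c in scored_matches.items() if c >= uniform_ct}, reference
    )
    return max_f, uni_f

# Hypothetical matcher output: "b" is a true match scored too low.
scored = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.6}
reference = {"a", "b", "d"}
maxF, uniF = max_and_uni_f(scored, reference)   # maxF = 6/7, uniF = 2/3
```

Here the optimal threshold (0.4) yields F = 6/7, but the fixed threshold 0.5 only reaches F = 2/3: the gap between maxF and uniF is exactly what tuning cannot be relied on to close in practice.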

Page 9

Usage of comF-measure

Draw histograms to find out ‘who’ holds the comF-measure back


Falcon-AO and RiMOM in Benchmark-20X Test

*another language, synonyms

Page 10

Usage of comF-measure (cont'd)

Use the comF-measure value as an indicator of matching quality, as it reflects both theoretical and practical results.

Use the comF-measure function as the objective function of an optimization problem, as it conceals a multi-objective optimization problem.


Page 11

STD Score


Page 12

Usage of STD Score

The STD score reflects the stability of a matching method.


Falcon-AO & Lily in Benchmark Test

*lexical information, structural information
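The intuition behind the STD score can be pictured as follows. The method names and per-dataset F-measures are hypothetical, and plain population standard deviation stands in for the slide's formula, whose exact normalization is not shown here.

```python
# Stability as spread: a method whose F-measures vary little across the
# datasets of one test unit is more stable. Summarized here with a plain
# population standard deviation over hypothetical per-dataset F-measures;
# treat this as a simplified illustration of the STD score's idea.
import statistics

methods = {
    "method_a": [0.90, 0.88, 0.91, 0.89],   # consistent across datasets
    "method_b": [0.95, 0.60, 0.85, 0.70],   # strong peaks, weak troughs
}
stds = {name: statistics.pstdev(fs) for name, fs in methods.items()}
most_stable = min(stds, key=stds.get)       # smallest spread wins
```

Note that method_b's best single F-measure (0.95) beats method_a's, yet its larger spread marks it as the less stable of the two, which is exactly the aspect plain F-measure comparisons miss.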

Page 13

ROC-AUC Score

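The credibility idea behind the ROC-AUC score can be sketched with the standard rank (Mann-Whitney) formulation of AUC; whether the slides use exactly this computation is not shown, and the candidate list below is illustrative.

```python
# ROC-AUC via pairwise ranking: for every (true match, false match) pair,
# count a win when the true match has the higher confidence (ties count
# half). A credible matcher assigns true positives high confidences and
# false positives low ones, pushing the score toward 1.0; random
# confidences give about 0.5.

def roc_auc(scored_labels):
    """scored_labels: list of (confidence, is_true_match) pairs."""
    pos = [c for c, y in scored_labels if y]
    neg = [c for c, y in scored_labels if not y]
    if not pos or not neg:
        raise ValueError("need both true and false matches")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical candidate matches: (confidence, correct?).
candidates = [(0.95, True), (0.90, True), (0.70, False),
              (0.65, True), (0.40, False), (0.20, False)]
auc = roc_auc(candidates)   # 8 correctly ranked pairs out of 9 -> 8/9
```

One false positive (0.70) outranks one true positive (0.65), costing a single pair out of nine, so the score stays high but below 1.0.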

Page 14

Usage of ROC-AUC Score


Spider Chart for Conference Test Unit

Page 15

Conclusions and Future Work

Conclusions
- New evaluation measures: Comprehensive F-measure, STD score, ROC-AUC score
- Deep analysis of matching methods

Future Work
- Extend our evaluation measures to a comprehensive strategy
- Test more matching methods on other datasets
- Make both stability and credibility standard evaluation measures for ontology matching


Page 16

APEX Data & Knowledge Management Lab