Pay-As-You-Go Multi-User Feedback Model for Ontology Matching — EKAW 2014
TRANSCRIPT

PAY-AS-YOU-GO MULTI-USER FEEDBACK MODEL FOR ONTOLOGY MATCHING

Isabel F. Cruz, Francesco Loprete, Matteo Palmonari, Cosmin Stroe, and Aynaz Taheri
EKAW 2014, Linköping, Sweden

ADVIS Lab, University of Illinois at Chicago
ITIS Lab, University of Milano-Bicocca
Motivation and Background

o User involvement is one of the promising challenges in Ontology Matching [Shvaiko et al. 2013]
  • Involving users improves the matching process
  • Interaction schemes should place a low burden on the users
o A community of users enables
  • Reduction of the effort required from each user
  • Correction of user errors
  • Obtaining consensus
o Main challenge: saving the users' effort
  1. While ensuring the quality of the alignment
  2. While allowing for user errors

Shvaiko, P., Euzenat, J.: Ontology Matching: State of the Art and Future Challenges. IEEE Transactions on Knowledge and Data Engineering 25(1) (2013) 158–176
motivation - pay-as-you-go multi-user feedback model - evaluation - conclusions and future work
Assumptions and Principles

o Assumptions
  • Consensus is obtained by majority vote
  • Users are domain experts (overall reliable)
  • A constant error rate is associated with a sequence of validated mappings
  • Focus on equivalence mappings
o Principles
  • Our pay-as-you-go approach
    • Each user provides validations
    • User feedback is propagated without waiting for a majority vote
  • Against a pay-as-you-go approach
    • Optimally Robust Feedback Loop (ORFL)
    • User feedback is propagated only once consensus is reached
11/27/2014
Approach Overview

o Inputs: a Source Ontology and a Target Ontology
o Pipeline components: Matcher 1, Matcher 2, …, Matcher k; Initial Matching; Candidate Selection; Validation Request; User Validation (each request is labeled T or F); Feedback Aggregation; Feedback Propagation; Alignment Selection; and a Mapping Quality Model, producing the final Alignment
o Mappings are partitioned into Validated Mappings and Non-Validated Mappings
o Feedback Aggregation keeps per-mapping label counts, e.g.:

         T(mi)  F(mi)
  m1     1      1
  m2     0      0
  m3     2      1
  …      …      …
Mapping Quality Measures

1. Automatic Matcher Agreement (AMA)
   Agreement of the similarity scores assigned to a mapping by different matchers
   Example: m1 = (1, 1, 0, 0) ⇒ AMA(m1) = 0;  m2 = (1, 1, 1, 1) ⇒ AMA(m2) = 1
2. Cross Sum Quality (CSQ)
   How safe a mapping is from potential conflicts with other mappings
   Example: CSQ(m22) = 0.76 ≥ CSQ(m34) = 0.13
3. Similarity Score Definiteness (SSD)
   How close the similarity score associated with a mapping is to the similarity scores' upper and lower bounds
   Example: SSD(m34) = 0.8 ≥ SSD(m31) = 0.0

[Figure: an example 6×6 similarity matrix (rows and columns indexed 0–5) with entries such as 0.45, 0.70, 0.30, 0.60, 0.50, 0.90, 0.80, 0.40, 0.10, 0.90, used to illustrate the measures above.]
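The matcher-based measures can be sketched in code. This is an illustrative approximation only — the exact definitions are in the paper — assuming AMA is one minus the spread of the matcher scores and SSD is the normalized distance of a score from the midpoint of its bounds; both assumptions are consistent with the slide's examples.

```python
def ama(scores):
    """Automatic Matcher Agreement: 1 when all matchers agree, 0 at
    maximal disagreement. Assumed here as one minus the score spread."""
    return 1.0 - (max(scores) - min(scores))

def ssd(score, lower=0.0, upper=1.0):
    """Similarity Score Definiteness: distance of the score from the
    midpoint of [lower, upper], normalized to [0, 1]."""
    mid = (lower + upper) / 2.0
    return abs(score - mid) / (upper - mid)

print(ama([1, 1, 0, 0]))  # 0.0 — matchers split, as in the slide
print(ama([1, 1, 1, 1]))  # 1.0 — full agreement
print(ssd(0.5))           # 0.0 — matches SSD(m31) in the slide
print(ssd(0.9))           # matches SSD(m34) = 0.8 in the slide
```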
Mapping Quality Measures (cont.)

4. Consensus (CON): captures the user consensus gathered on a mapping.
   MinCon is the minimum number of identical labels needed to make a correct decision on a mapping.

     CON(m) = 1                          if T(m) ≥ MinCon or F(m) ≥ MinCon
     CON(m) = |T(m) − F(m)| / MinCon     otherwise

5. Propagation Impact (PI): estimates the instability of the user feedback collected on a mapping.
   With DT(m) = MinCon − T(m) and DF(m) = MinCon − F(m):

     PI(m) = 0                                          if T(m) = MinCon or F(m) = MinCon
     PI(m) = min(DT(m), DF(m)) / max(DT(m), DF(m))      otherwise

Examples for CON and PI (MinCon = 3):

  Mapping  T(mi)  F(mi)  CON(mi)  PI(mi)
  m1       1      1      0.00     1.00
  m2       1      0      0.33     0.66
  m3       2      1      0.33     0.5
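CON and PI translate directly into code. The sketch below assumes MinCon = 3, the value implied by the example table (the slide does not state it explicitly):

```python
MIN_CON = 3  # minimum identical labels for a decision (assumed from the examples)

def con(t, f, min_con=MIN_CON):
    """User consensus on a mapping with t 'true' and f 'false' labels."""
    if t >= min_con or f >= min_con:
        return 1.0
    return abs(t - f) / min_con

def pi(t, f, min_con=MIN_CON):
    """Instability of the user feedback collected on a mapping."""
    if t == min_con or f == min_con:
        return 0.0
    dt, df = min_con - t, min_con - f
    return min(dt, df) / max(dt, df)

# Reproduces the example table: m1 = (1,1), m2 = (1,0), m3 = (2,1)
for t, f in [(1, 1), (1, 0), (2, 1)]:
    print(f"T={t} F={f}  CON={con(t, f):.2f}  PI={pi(t, f):.2f}")
```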
Quality-Based Candidate Selection

o Candidate selection strategies
  • Combine different mapping quality measures
  • Rank mappings in decreasing order of quality

1. Disagreement and Indefiniteness Average (DIA)
   Selects the mappings with the most disagreement among the automatic matchers and the most indefinite similarity values; DIA ranks the non-validated mappings (N1, N2, N3, …)

     DIA(m) = AVG(AMA⁻(m), SSD⁻(m))

2. Revalidation (REV)
   Selects the mappings with the lowest consensus, the highest feedback instability, and the highest conflict with other mappings; REV ranks the validated mappings (M1, M2, M3, …)

     REV(m) = AVG(CON⁻(m), PI(m), CSQ⁻(m))

(X⁻ denotes the complement of measure X.)
Quality-Based Candidate Selection (cont.)

o Meta-strategy
  • Combines the two strategies, DIA and REV
  • The Revalidation Rate (RR), P_REV ∈ [0, 1], determines the proportion of mappings selected from the two ranked lists over a sequence of validation requests
  • E.g., for RR = 0.3, in every ten iterations three mappings are picked from the REV ranked list and seven from the DIA ranked list
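The meta-strategy can be sketched as a simple interleaving of the two ranked lists. The function name and the deterministic "REV picks first within each window of ten" ordering are illustrative assumptions; only the proportion RR comes from the slide.

```python
def meta_strategy(dia_ranked, rev_ranked, rr, n_requests):
    """Yield n_requests candidates, taking round(rr * 10) of every ten
    from the REV-ranked list and the rest from the DIA-ranked list."""
    dia_it, rev_it = iter(dia_ranked), iter(rev_ranked)
    picks = []
    for i in range(n_requests):
        from_rev = (i % 10) < round(rr * 10)
        picks.append(next(rev_it if from_rev else dia_it))
    return picks

dia = [f"N{i}" for i in range(1, 11)]  # non-validated mappings, DIA-ranked
rev = [f"M{i}" for i in range(1, 11)]  # validated mappings, REV-ranked
print(meta_strategy(dia, rev, rr=0.3, n_requests=10))
# 3 mappings from REV, 7 from DIA, matching the RR = 0.3 example
```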
Quality Agreement Propagation

o Feedback is propagated by updating the similarity of
  • The mapping labeled by the user
  • A class of similar mappings
o A conservative propagation makes the system more robust to erroneous feedback
o Propagation is proportional to
  • The quality of the labeled mapping (consensus)
  • The quality of the mappings in the similarity class (matcher agreement, definiteness)
  • A propagation gain defined by a constant g, 0 ≤ g ≤ 1

For a mapping m_c in the similarity class of the validated mapping m_v, the similarity at iteration t is updated as:

  s_t(m_c) = s_{t−1}(m_c) + min(Q(m_v) · Q′(m_c) · g, 1 − s_{t−1}(m_c))   if label(m_v) = 1
  s_t(m_c) = s_{t−1}(m_c) − min(Q(m_v) · Q′(m_c) · g, s_{t−1}(m_c))       if label(m_v) = 0

where Q(m_v) = CON(m_v) and Q′(m_c) = AVG(AMA(m_c), SSD(m_c)).
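The propagation update can be written as a small function. This is a direct transcription of the update rule above, where q_v stands for CON(m_v) and q_c for AVG(AMA(m_c), SSD(m_c)); the min terms keep the similarity clamped to [0, 1].

```python
def propagate(s_prev, label, q_v, q_c, g=0.5):
    """Update the similarity of a mapping in the similarity class of a
    validated mapping: push up on a positive label, down on a negative
    one, clamped to [0, 1]."""
    delta = min(q_v * q_c * g, (1.0 - s_prev) if label == 1 else s_prev)
    return s_prev + delta if label == 1 else s_prev - delta

# Positive feedback pushes similarity up, negative feedback down; low
# consensus (q_v) or low class quality (q_c) damps the change.
print(propagate(0.6, label=1, q_v=1.0, q_c=0.8))   # clamped at 1.0
print(propagate(0.6, label=0, q_v=0.33, q_c=0.5))  # small downward step
```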
Experiments

o Evaluation
  • Benchmark track of OAEI 2010 (tasks 101-301, 101-302, 101-303, 101-304)
  • Comparison with the baseline (ORFL): user feedback is propagated only when consensus is reached
  • Comparison of our candidate selection strategy with a strategy proposed in an active learning approach [Shi et al. 2009]
o We use two measures based on F-Measure:

  Gain at iteration t:         ΔF-Measure(t) = F-Measure(t) − F-Measure(0)
  Robustness at iteration t:   Robustness(t) = F-Measure_{ER=er}(t) / F-Measure_{ER=0}(t)

Shi, F., Li, J., Tang, J., Xie, G., Li, H.: Actively Learning Ontology Matching via User Interaction. In: International Semantic Web Conference (ISWC). Volume 5823, Springer (2009) 585–600
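The two evaluation measures are straightforward to compute from per-iteration F-measure curves. The curves below are hypothetical, purely to show the arithmetic:

```python
def delta_f_measure(fm, t):
    """Gain at iteration t relative to the initial alignment."""
    return fm[t] - fm[0]

def robustness(fm_er, fm_er0, t):
    """Ratio of the F-measure under error rate er to the error-free run."""
    return fm_er[t] / fm_er0[t]

fm_clean = [0.7273, 0.75, 0.78, 0.80]  # hypothetical ER = 0.0 curve
fm_noisy = [0.7273, 0.74, 0.76, 0.76]  # hypothetical ER = 0.1 curve
print(delta_f_measure(fm_clean, 3))     # gain after 3 iterations
print(robustness(fm_noisy, fm_clean, 3))
```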
Experimental Setup

o Simulation of users
  • Error rate (ER): 0.0, 0.05, 0.1, 0.15, 0.2
  • Number of users: 10
o AgreementMaker
  • Matchers, alignment selection
o Propagation gain (g): 0.0 (no gain), 0.5
o Revalidation rate (RR): 0.0, 0.1, 0.2, 0.3, 0.4, 0.5
Benchmark Track 101-303: ΔF-Measure

[Figure: six plots of ΔF-Measure against iterations, one per revalidation rate (RR = 0, 0.1, 0.2, 0.3, 0.4, 0.5). The dashed lines represent a propagation gain equal to zero; the dotted pink line represents ORFL. Initial F-Measure = 72.73.]
Benchmark Track 101-303: Robustness

[Figure: six plots of Robustness against iterations, one per revalidation rate (RR = 0, 0.1, 0.2, 0.3, 0.4, 0.5). The dashed lines represent a propagation gain equal to zero.]
Other Benchmark Tasks

ΔF-Measure(t) for the matching tasks:

  ER   RR   CONF    | 101-301 (0.92)          | 101-302 (0.86)          | 101-304 (0.92)
                    | @10   @25   @50   @100  | @10   @25   @50   @100  | @10   @25   @50   @100
  0.0  0.2  NoGain  | 0.03  0.05  0.05  0.05  | 0.03  0.05  0.06  0.08  | 0.0   0.05  0.05  0.05
  0.0  0.2  Gain    | 0.03  0.04  0.04  0.05  | 0.03  0.06  0.06  0.08  | 0.0   0.05  0.05  0.05
  0.0  0.3  NoGain  | 0.02  0.05  0.05  0.05  | 0.03  0.05  0.06  0.08  | 0.0   0.04  0.05  0.05
  0.0  0.3  Gain    | 0.02  0.04  0.04  0.05  | 0.03  0.05  0.06  0.08  | 0.0   0.03  0.05  0.05
  0.1  0.2  NoGain  | 0.03  0.04  0.01  -0.01 | 0.02  0.01  0.0   -0.02 | 0.0   0.03  0.03  0.0
  0.1  0.2  Gain    | 0.03  0.03  0.01  0.0   | 0.02  0.03  0.01  0.01  | 0.0   0.03  0.03  0.00
  0.1  0.3  NoGain  | 0.02  0.04  0.02  0.0   | 0.03  0.02  0.00  0.01  | 0.0   0.03  0.04  0.02
  0.1  0.3  Gain    | 0.02  0.03  0.01  0.0   | 0.03  0.03  0.01  0.01  | 0.0   0.03  0.04  0.01
  -    0.0  ORFL    | 0.0   0.02  0.04  0.05  | 0.01  0.03  0.05  0.05  | 0.0   0.0   0.0   0.05
Comparison of Ranking Functions for Non-Validated Mappings: our DIA vs. [Shi et al. 2009]

• An error-free setting
• No propagation

  Quality Measures  | F-Measure(0) | @10   @20   @30   @40   @50   @100 | F-Measure(100)
  Active Learning   | 0.73         | 0.01  0.02  0.05  0.08  0.12  0.15 | 0.88
  AVG(DIS, SSD)     | 0.73         | 0.05  0.12  0.14  0.16  0.19  0.26 | 0.99

Shi, F., Li, J., Tang, J., Xie, G., Li, H.: Actively Learning Ontology Matching via User Interaction. In: International Semantic Web Conference (ISWC). Volume 5823, Springer (2009) 585–600
Conclusion

o Two main steps
  • Candidate mapping selection: dynamic ranking of candidate mappings
  • Feedback propagation: similarity propagation from validated mappings
o Error and revalidation rates
  • An increasing error rate can be counteracted by an increasing revalidation rate
o A revalidation rate equal to 0.3 achieves a good trade-off between F-Measure and Robustness
o Propagation leads to better results than no propagation
Future Work

• User profiling and weighting of user validations
• Propagation depending on the feedback quality
• Using different probability distributions to model a variety of user behaviors
• Determining the impact of user behavior over time on the error distribution
QUESTIONS?
We sincerely appreciate your feedback.
EKAW 2014, Linköping, Sweden
mailto: [email protected]