

www.elsevier.com/locate/advengsoft

Advances in Engineering Software 38 (2007) 59–67

Ontology-based semantic matchmaking approach

Gao Shu a,*, Omer F. Rana b, Nick J. Avis b, Chen Dingfang a

a School of Computer Science, Wuhan University of Technology, Hubei 430063, China
b School of Computer Science, Cardiff University, UK

Received 26 October 2005; received in revised form 14 March 2006; accepted 18 May 2006
Available online 22 August 2006

Abstract

As a greater number of Web Services are made available, support for service discovery mechanisms becomes essential. Services can have quite different Quality of Service characteristics (such as their response time when given a particular set of data). A service requestor therefore requires more sophisticated approaches to find a service that exhibits a particular behavior, because matching a service request against service properties is not straightforward. Matchmaking plays a vital role in this discovery process. We propose a novel matchmaking algorithm to effectively compute the semantic distance of concepts in an ontology. It is based on description logic formalization and reasoning, extends the simple subsumption matching found in other approaches, and allows match ranking. We have implemented the proposed approach and used the developed prototype in the context of service discovery in the visualization domain.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Semantic Web; Ontology; Description logics; Matchmaking

1. Introduction

The spread of Web Services as a means to support service provision reveals the need for sophisticated discovery mechanisms, such as matching service requestors with service providers, especially when services are undiscovered, new, and/or being updated often. A matchmaker plays a crucial role in this activity, and acts like a broker to assist in locating and connecting a service provider with a service requester. Generally speaking, matchmaking can be roughly divided into two categories: (1) syntactic matchmaking, which uses the structure or format of a task specification to match a requester with a provider and decide which service providers to recommend; (2) semantic matchmaking, which uses the meaning and information content of the request to match it with the meaning of the offered services. In case (2), service requesters and providers utilize an ontology to discover similarities between two services, and determine a "semantic distance" between services.

0965-9978/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.advengsoft.2006.05.004

* Corresponding author. E-mail address: [email protected] (G. Shu).

The motivation of this work is to investigate how Semantic Web and Web Services technologies can be used to make the matching of service advertisements with service requests more effective. We focus on the quality of matchmaking without ignoring efficiency. In particular our approach, based on ontologies and description logic (DL) formalization and reasoning, goes beyond simple subsumption matching.

The rest of the paper is structured as follows. Section 2 gives a brief survey of related work on existing matchmaking approaches. Section 3 discusses the basic concepts of Description Logics and the Web Ontology Language. The ontology-based semantic matchmaking algorithm is presented in Section 4. The last section draws conclusions and outlines future work.

2. Related work

The earliest matchmaker we are aware of is the ABSI (Agent-Based Software Interoperability) facilitator [1], which is based on the KQML (Knowledge Query and Manipulation Language) specification and uses KIF (Knowledge Interchange Format) as the content language. The matching between the advertisement and request expressed in KIF involves a simple unification with the equality predicate.

Table 1
Basic terms in description logics

| DL expressiveness | DL syntax | Service description language |
|---|---|---|
| ALC, also called S when transitively closed primitive roles are included | A | Concept |
| | ⊤ | Thing |
| | ⊥ | Nothing |
| | (C ⊑ D) | Subsumption |
| | (C ≡ D) | Equivalence |
| | R | Properties |
| | (C ∩ D) | Conjunction |
| | (C ∪ D) | Disjunction |
| | ¬C | Negation |
| | ∀R.C | Universal role restriction |
| | ∃R.C | Existential role restriction |
| N | ≤nR.⊤, ≥nR.⊤, =nR.⊤ | Non-qualified cardinality |
| Q | ≤nR.C, ≥nR.C, =nR.C | Qualified cardinality |
| I | R⁻ | Inverse roles |
| H | (R ⊑ S) | Subsumption of roles |
| | (R ≡ S) | Equivalence of roles |
| O | {o} | Nominals |
| | ∃T.{o} | Value restrictions |
| (D) | D | Datatype system |
| | T | Datatype property |
| | ∃T.d | Existential datatype restriction |
| | ∀T.d | Universal datatype restriction |

In [2], Sycara et al. defined a language called "Larks" for agent advertisements and requests, and presented a flexible and efficient matchmaking process. The Larks matchmaking process performs both syntactic and semantic matching. There are three types of matching in Larks: exact match (the most accurate type of match), plug-in match (a less accurate but most useful type of match), and relaxed match (the least accurate type of match). The matching process uses five different filters: context matching, profile comparison, similarity matching, signature matching and constraint matching. Different degrees of partial matching can result from utilizing different combinations of these filters.

InfoSleuth [3] is an agent-based system for information discovery and retrieval. Its brokering function combines reasoning over both the syntax and semantics of agents in the domain. Agent capabilities and services are described using a common shared ontology of attributes and constraints, which all agents can use to specify advertisements and requests to the broker. Matchmaking is then performed by a deductive database system, allowing rules to evaluate whether an expression of requirements matches a set of advertised capabilities. In [4], Tangmunarunkit et al. also proposed an ontology-based resource matching approach. They extend the InfoSleuth approach in several directions: first, they use RDF (Resource Description Framework), and secondly they focus on how ontology-based reasoning interacts with matchmaking rules. They also introduce background knowledge into the matchmaking process, allowing for a more flexible matchmaking procedure.

In [5], Li and Horrocks introduce DAML+OIL based matchmaking, which uses a DL reasoner to compare ontology-based service descriptions. Service advertisements are expressed as class expressions. The elements of the class expressions are taken from a domain ontology and a specific service ontology.

In [6–8], Colucci et al. describe a formalization of skill matching, and propose properties that should hold in a semantic-based skill matching approach. They also define an algorithm to rank matches between skill profile descriptions and present an ontology-based system which embeds a modified NeoClassic reasoner implementing the ranking algorithms.

Most of the matchmaking work above (except [6–8]) relies mainly on subsumption reasoning over taxonomies of concepts to recognize semantic matches. Compared with the work of Colucci et al., our work is more general, rather than domain-specific, and takes full advantage of the effort placed in structuring an ontology to more precisely compute the semantic distance between concepts.

3. Description logics and OWL

3.1. Description logics

Description logics (DLs) are a family of knowledge representation formalisms. They are based on the notions of concepts and roles, and are mainly characterized by constructors that allow complex concepts and roles to be built from atomic ones [9]. The main benefit of these knowledge representation languages is that sound and complete algorithms for the subsumption and satisfiability problems can be defined. A DL reasoner solves the problems of equivalence, satisfiability and subsumption. The basic DL terms used by the matchmaking service are shown in Table 1 [10].

3.2. OWL Web Ontology Language and Protege

Ontologies play a key role in the Semantic Web. In this context, specification refers to an explicit representation by some syntactic means. In contrast to schema languages (like XML-Schema or DTDs), ontologies try to capture the semantics of a domain by deploying knowledge representation primitives, enabling a machine to (partially) understand the relationships between concepts in a domain. Additional knowledge can be captured by axioms or rules. The OWL Web Ontology Language is a language for defining and instantiating Web ontologies. OWL can be used to explicitly represent the meaning of terms in vocabularies and the relationships between those terms.

Table 2
Protege uses traditional description logic symbols to display OWL expressions

| OWL element | Symbol | Example expression in Protege |
|---|---|---|
| owl:allValuesFrom | ∀ | ∀ has-DatatypeSet Scalar |
| owl:someValuesFrom | ∃ | ∃ has-Habitat University |
| owl:hasValue | ∋ | has-Gender ∋ male |
| owl:minCardinality | ≥ | has-Vector ≥ 1 |
| owl:maxCardinality | ≤ | has-Tensor ≤ 5 |
| owl:cardinality | = | has-Scalar = 1 |
| owl:intersectionOf | ∩ | Student ∩ Parent |
| owl:unionOf | ∪ | Male ∪ Female |
| owl:oneOf | {...} | {Scalar vector tensor} |

Compared with XML, XML-Schema, RDF, and RDF-Schema, OWL provides additional vocabulary for describing properties and classes: these include, among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes [11].

OWL has three increasingly expressive sub-languages: OWL Lite, OWL DL, and OWL Full. Among them, OWL DL supports those users who want the maximum expressiveness without losing computational completeness (all entailments are guaranteed to be computable) and decidability (all computations will finish in finite time) of reasoning systems. OWL DL includes all OWL language constructs with restrictions such as type separation. OWL DL was designed to support the existing description logic, and has desirable computational properties for reasoning systems, which makes it easy to implement the matchmaking functionalities by using a DL reasoner to compute the semantic distance between concepts [11].

An OWL DL-based ontology can be developed using Protege-3.1 with an OWL plugin. Protege is an open platform for ontology modeling and knowledge acquisition. The OWL plugin can be used to edit OWL ontologies, to access DL reasoners, and to acquire instances for semantic markup; in particular, it provides a way to represent and edit an OWL expression syntax based on standard DL symbols [12]. These symbols are shown in Table 2.

We will use the above notation to express our design in subsequent sections of this paper.

4. Ontology-based semantic matchmaking approach

Our semantic matching approach is based on OWL DL ontologies: advertisements and requests refer to OWL concepts and the associated semantics, or are expressed as DL-based expressions. By using OWL, the matching process can perform inferences and calculations on the subsumption hierarchy, leading to the recognition of semantic matches despite syntactic differences and differences in modeling abstractions between advertisements and requests. The degree of match is determined by the semantic distance, rather than only a subsumption relation, between concepts in the taxonomy tree.

4.1. Basic assumption

Reasoning and computing in OWL are based on what is known as the open world assumption. It means that we cannot assume something does not exist until it is explicitly stated that it does not exist. In other words, the fact that something has not been stated to be true does not mean it can be assumed to be false; it is assumed that "the knowledge just has not been added to the knowledge base". So any two classes (i.e. concepts) may overlap unless they have been stated to be disjoint. In addition, the classes in the ontology are divided into two categories: (i) primitive classes, which do not have any set of necessary and sufficient conditions, and (ii) defined classes, which have at least one set of necessary and sufficient conditions [14].
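As a minimal sketch of the two ideas above, the following Python fragment treats two classes as possibly overlapping unless a disjointness axiom has been stated, and distinguishes primitive from defined classes. The tables `DISJOINT_AXIOMS` and `NECESSARY_AND_SUFFICIENT` and both function names are hypothetical illustrations, not part of OWL or of the prototype; which classes are listed as primitive is also an assumption.

```python
# Hypothetical toy knowledge base: only explicitly stated axioms are known.
DISJOINT_AXIOMS = {frozenset({"Vector", "Scalar"})}

# Hypothetical class table: a class is "defined" when it has at least one set
# of necessary and sufficient conditions, otherwise it is "primitive".
NECESSARY_AND_SUFFICIENT = {
    "E2V2": [{"superclass": "EV", "has-Dimension": 2, "has-Vector": 2}],
    "Vector": [],
    "Scalar": [],
}


def may_overlap(c1, c2):
    """Open world: overlap stays possible unless disjointness is stated."""
    return frozenset({c1, c2}) not in DISJOINT_AXIOMS


def is_defined(cls):
    """True for defined classes, False for primitive ones."""
    return len(NECESSARY_AND_SUFFICIENT.get(cls, [])) > 0


print(may_overlap("Vector", "Scalar"))   # False: disjointness was stated
print(may_overlap("Student", "Parent"))  # True: nothing was stated
print(is_defined("E2V2"), is_defined("Vector"))  # True False
```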

4.2. Semantic distance of two concepts

Let Ci (i = 1, 2, ..., n) be any concept (the term "concept" is equivalent to the term "class" in this paper) in the ontology. Further, let D(Ci) be the domain of Ci. Let d : D(Ci) × D(Ci) → R+ be a function mapping D(Ci) × D(Ci) to R+ (R+ denotes the set of non-negative real numbers). If the properties

(1) C1 = C2 ⇒ d(C1,C2) = 0;
(2) d(C1,C2) = ∞ ⇔ (C1 ∩ C2) ⊑ ⊥ (⊥ is described in Table 1); noting, however, that
(3) d(C1,C2) ≠ d(C2,C1) if C1 ≠ C2 (generally, the function d(C1,C2) is non-commutative);

are fulfilled, d(C1,C2) is called the semantic distance from C1 to C2.

The semantic distance of two concepts is the sum of the subsumption distance ("ds") and the "definition" distance ("dd"), hence d(C1,C2) = ds(C1,C2) + dd(C1,C2). The ds is the distance between two concepts within a hierarchy, while the dd is the difference between the semantic descriptions of two concepts, which is calculated by the algorithm described in Section 4.3. Ideally, the semantic distance should simply equal the definition distance. However, it is often difficult to give the semantic description of a concept, and even if it is given, it is often not complete. Therefore, the subsumption distance is introduced to make up for the insufficiency of the definition distance. In most existing matchmaking approaches, no distinction is made between the semantic distance and the definition distance. To simplify the calculation process and to differentiate between two primitive concepts, which have no semantic description, the "definition" distance of two primitive concepts is regarded as zero. At the same time, we also ignore the "definition" distance dd(C1,C2) if C2 is a subclass or offspring of C1, because C2 inherits all the properties of C1.

[Figure: ontology tree with root A, second-level concepts B and C, and third-level concepts D, E, F, G and H]
Fig. 1. An example of ontology tree.

Taking the ontology tree shown in Fig. 1 as an example:

(1) d(A,A) = 0;
(2) d(A,D) = ds(A,D) = 2, where dd(A,D) = 0 because D is the offspring of A;
(3) d(D,A) = ds(D,A) + dd(D,A) = 2 + dd(D,A), where the hierarchy distance between concepts D and A is 2;
(4) d(D,C) = ds(D,C) + dd(D,C) = 1 + dd(D,C), where the hierarchy distance between concepts D and C is 1;
(5) d(B,C) = ds(B,C) + dd(B,C) = dd(B,C), where ds(B,C) = 0 because B and C are at the same level.
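One reading that is consistent with all five examples is that ds is the difference between the hierarchy levels of the two concepts. The sketch below illustrates that reading only; the parent links in `PARENT` are an assumption (the examples fix the level of each concept, not which second-level concept each leaf hangs under), and the function names are hypothetical.

```python
# Assumed parent links for the tree of Fig. 1. Only the levels are implied by
# the worked examples: A is the root, B and C sit below it, D-H below those.
PARENT = {
    "A": None,
    "B": "A", "C": "A",
    "D": "B", "E": "B",
    "F": "C", "G": "C", "H": "C",
}


def level(concept):
    """Number of edges between the concept and the root."""
    depth = 0
    while PARENT[concept] is not None:
        concept = PARENT[concept]
        depth += 1
    return depth


def ds(c1, c2):
    """Subsumption distance, read here as the difference in hierarchy level."""
    return abs(level(c1) - level(c2))


for pair in [("A", "A"), ("A", "D"), ("D", "A"), ("D", "C"), ("B", "C")]:
    print(pair, ds(*pair))
```

Under this reading ds is symmetric, so the asymmetry of d comes entirely from the definition distance dd.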

4.3. Matching algorithm

Our matchmaking algorithm aims to precisely compute the semantic distance of concepts in the ontology and express it as a score, in order to improve the quality of matchmaking. The following algorithm focuses on how the definition distance is calculated, because the subsumption distance of two concepts is easy to calculate.

As previously mentioned, classes in the OWL ontology can be divided into two kinds: defined classes and primitive classes. A defined class is defined by at least one set of necessary and sufficient conditions. In fact, these conditions are the semantic description of the class. Hence, the definition distance of two classes is the difference between their semantic descriptions. Generally speaking, the semantic description consists of a set of direct superclasses and a set of restrictions, which include allValuesFrom (∀), someValuesFrom (∃), hasValue (∋), minCardinality (≥), maxCardinality (≤) and cardinality (=) (as described in Table 2). A class C can be put in a normal form consisting of the set of direct superclasses of C and a set of restrictions, i.e. C = {superclasses} ∪ {restrictions}, or C = SS ∪ SR for short. (Here ∪ means union, while ∩ means intersection; SS is the set of direct superclasses and SR is the set of restrictions.) For example, E2V2 = {EV} ∪ {has-Dimension = 2, has-Vector = 2}, which means that EV is the direct superclass of E2V2 and that E2V2 has two restrictions, (has-Dimension = 2) and (has-Vector = 2). The algorithm for calculating the definition distance is below [18]:


4.4. Case study

In this section, we show an example of how to calculate the semantic distance of two concepts in an ontology by means of our algorithm. Fig. 2 shows a fragment of a visualization ontology developed using Protege. The ontology is built using Brodlie's E notation [16,17] and is used in the project "Grid-Enabled Discovery of Visualization Service". It has four abstract classes representing the main concepts in the visualization domain: Data_Model, Visualization_Technique, Data_Representation and Primitive_Set. The Data_Model is the key part, which describes the user's data model, as shown in Fig. 2.


Fig. 2. Visualization ontology fragment.


In the process of matchmaking, there is often a conflict between quality and efficiency. Our algorithm aims at a good quality of match, i.e. at precisely calculating the semantic distance between concepts, because it uses both the subsumption and the semantic description of concepts in their OWL ontology. However, efficiency is also achieved by pre-computing the semantic distance between the advertisements and the possible requests [15] to speed up matchmaking. For the concepts shown in Fig. 2, the semantic descriptions can be put in the normal form SS ∪ SR, as referred to in Section 4.3, as follows:

(1) E2V2 = {EV} ∪ {has-Dimension = 2, has-Vector = 2}, where EV is the direct superclass of E2V2, and ("has-Dimension = 2"), ("has-Vector = 2") are the two restrictions on E2V2;
(2) ES2 = {ES} ∪ {has-Dimension = 2, has-Scalar = 1}, which means that ES is the direct superclass of ES2, and ("has-Dimension = 2"), ("has-Scalar = 1") are the two restrictions on ES2;

Similarly,

(3) ES1 = {ES} ∪ {has-Dimension = 1, has-Scalar = 1};
(4) ES = {Continuous_Model} ∪ {∀ has-DatatypeSet Scalar};
(5) EV = {Continuous_Model} ∪ {∀ has-DatatypeSet Vector};
(6) EnS2 = {ES} ∪ {has-Dimension = 2, has-Scalar ≥ 1};
(7) Continuous_Model = {Data_Model} ∪ {∀ has-StatespaceSet Continuous}.

To obtain d(E2V2,ES2) using the algorithm described in Section 4.3, where C1 = E2V2, C2 = ES2, SS1 = {EV}, SS2 = {ES}, SR1 = {has-Dimension = 2, has-Vector = 2} and SR2 = {has-Dimension = 2, has-Scalar = 1}, we proceed as follows:

• We first deal with the superclasses, i.e. we compute TS = SS1 ∩ SS2. In this case, TS = ∅;
• As SS1 ≠ ∅ and SS1 contains the class EV, which is not SUP (where SUP is the class Data_Model), and EV = {Continuous_Model} ∪ {∀ has-DatatypeSet Vector}, we modify SS1 and SR1, i.e. SS1 = {Continuous_Model}, SR1 = {∀ has-DatatypeSet Vector} ∪ {has-Dimension = 2, has-Vector = 2}. Similarly, SS2 = {Continuous_Model}, SR2 = {∀ has-DatatypeSet Scalar} ∪ {has-Dimension = 2, has-Scalar = 1};
• We then compute TS again. Now, TS = {Continuous_Model}. Therefore, SS1 = SS1 - TS = ∅ and SS2 = SS2 - TS = ∅;
• As SS1 and SS2 are ∅, we then deal with their sets of restrictions SR1 and SR2. For the property "∀ has-DatatypeSet Vector" in SR1, there exists the property "∀ has-DatatypeSet Scalar" ∈ SR2, which corresponds to step 5 in our algorithm, so the algorithm is called recursively, this time with C1 = Vector and C2 = Scalar;
• In the recursive call, we see that C1 and C2, i.e. the concepts Vector and Scalar, are disjoint, which can be obtained from the ontology. So dd(Vector, Scalar) = ∞. Therefore, the algorithm stops and exits. Finally, we get dd(E2V2,ES2) = ∞.

Similarly, we can calculate the semantic distance of ES1 and Continuous_Model, i.e. d(ES1,Continuous_Model). Obviously, ds(ES1,Continuous_Model) = 2 since their distance in the hierarchy tree is 2. Using the algorithm calculating_dd in Section 4.3, we get dd(ES1,Continuous_Model) = 3, since ES1 has two restrictions more than ES, which are "has-Dimension = 1" and "has-Scalar = 1", and ES, which is the direct superclass of ES1, also has one restriction more than Continuous_Model, which is "∀ has-DatatypeSet Scalar". So d(ES1,Continuous_Model) = 5. In the same way, we can compute d(ES2,EnS2) = 1 and d(EnS2,ES2) = 0. EnS2 is more general than ES2 because their restrictions are the same except for (has-Scalar = 1) in ES2 and (has-Scalar ≥ 1) in EnS2.
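The distances quoted in this case study can be reproduced with a small sketch. It is a simplified reading reconstructed from the walk-through above, not the calculating_dd listing of Section 4.3: each class is given a single direct superclass, ds is taken as the difference in hierarchy level, inherited restrictions are merged instead of lifting superclasses step by step, and the scoring of differing cardinality restrictions in `cardinality_gap` is an assumption beyond the two cases (= against ≥) that the text covers. All function and variable names are hypothetical.

```python
import math

# Normal forms from the case study: class -> (direct superclass, restrictions).
# A restriction is (property, operator, value); "all" is the universal one.
ONTOLOGY = {
    "Data_Model": (None, []),
    "Continuous_Model": ("Data_Model", [("has-StatespaceSet", "all", "Continuous")]),
    "ES": ("Continuous_Model", [("has-DatatypeSet", "all", "Scalar")]),
    "EV": ("Continuous_Model", [("has-DatatypeSet", "all", "Vector")]),
    "ES1": ("ES", [("has-Dimension", "=", 1), ("has-Scalar", "=", 1)]),
    "ES2": ("ES", [("has-Dimension", "=", 2), ("has-Scalar", "=", 1)]),
    "EnS2": ("ES", [("has-Dimension", "=", 2), ("has-Scalar", ">=", 1)]),
    "E2V2": ("EV", [("has-Dimension", "=", 2), ("has-Vector", "=", 2)]),
}
DISJOINT = {frozenset({"Vector", "Scalar"})}  # stated in the ontology


def ancestors(cls):
    """Superclasses of cls, nearest first."""
    chain, parent = [], ONTOLOGY[cls][0]
    while parent is not None:
        chain.append(parent)
        parent = ONTOLOGY[parent][0]
    return chain


def ds(c1, c2):
    """Subsumption distance, read here as the difference in hierarchy level."""
    return abs(len(ancestors(c1)) - len(ancestors(c2)))


def description(cls):
    """Own and inherited restrictions of cls, keyed by property."""
    merged = {}
    for name in reversed([cls] + ancestors(cls)):
        for prop, op, value in ONTOLOGY[name][1]:
            merged[prop] = (op, value)
    return merged


def cardinality_gap(op1, v1, op2, v2):
    """Assumed score for two different cardinality restrictions on one property."""
    if op1 == "=" and op2 == ">=":
        return 1 if v1 >= v2 else math.inf   # C1 is stricter than C2
    if op1 == ">=" and op2 == "=":
        return 0 if v2 >= v1 else math.inf   # C1 is more general than C2
    if op1 == ">=" and op2 == ">=":
        return 1 if v1 > v2 else 0
    return math.inf                          # e.g. "= 1" against "= 2"


def dd(c1, c2):
    """Definition distance from c1 to c2 under the assumptions stated above."""
    if frozenset({c1, c2}) in DISJOINT:
        return math.inf
    if c1 == c2 or c1 not in ONTOLOGY or c2 not in ONTOLOGY:
        return 0                             # identical or primitive concepts
    if c1 in ancestors(c2):
        return 0                             # c2 is an offspring of c1
    r1, r2 = description(c1), description(c2)
    total = 0
    for prop, (op1, v1) in r1.items():
        if prop not in r2:
            total += 1                       # restriction that c2 lacks
        elif (op1, v1) == r2[prop]:
            continue
        elif op1 == "all" and r2[prop][0] == "all":
            total += dd(v1, r2[prop][1])     # recurse on the filler classes
        else:
            total += cardinality_gap(op1, v1, *r2[prop])
    return total


def d(c1, c2):
    """Semantic distance from c1 to c2."""
    return ds(c1, c2) + dd(c1, c2)


print(d("E2V2", "ES2"))              # inf
print(d("ES1", "Continuous_Model"))  # 5
print(d("ES2", "EnS2"), d("EnS2", "ES2"))  # 1 0
```

The sketch also returns d(Continuous_Model,ES2) = 2, d(ES2,ES) = 3 and d(ES,ES2) = 1, the values listed in Fig. 3.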

4.5. Analysis and discussion

In [13], Paolucci et al. proposed that the degree of match is determined by the minimal distance between concepts in the taxonomy tree, and differentiate between four degrees of matching: exact, plug-in, subsumes and fail. In [5], Li and Horrocks extend these to five degrees: exact, plug-in, subsumes, intersection and disjoint. Both of these approaches primarily consider the simple subsumption between the concepts in the ontology, and ignore the hierarchic difference between the concepts and their detailed semantic difference. Taking the ontology in Fig. 2 as an example, according to [5], the matching degree of Continuous_Model and ES2 is the same as that of ES and ES2, because both pairs have a "subsumes" relation. However, we can see that ES is closer to ES2 than Continuous_Model is. Moreover, the match degree of sibling concepts is either intersection or disjoint; that is to say, a more precise match degree between sibling concepts cannot be calculated.

| Approach | ES2 and Continuous_Model | ES2 and ES | ES2 and EnS2 |
|---|---|---|---|
| Our approach | d(ES2,Continuous_Model) = 5; d(Continuous_Model,ES2) = 2 | d(ES2,ES) = 3; d(ES,ES2) = 1 | d(ES2,EnS2) = 1; d(EnS2,ES2) = 0 |
| Approach in [5] | Subsume | Subsume | Intersection |

Fig. 3. Comparison of our approach and the one in [5].

The approach in [6–8] is devised by adapting the original CLASSIC structural algorithm for subsumption [14]. It is based on DL formalization and reasoning, goes beyond simple subsumption matching, and allows match ranking and categorization. However, it does not take the inheritance and hierarchy relations between concepts into consideration. Also, it cannot calculate the match degree of the primitive classes in the ontology, referred to in Section 4.1, because it is based entirely on the CLASSIC system, in which each concept C has an equivalent normal form Cnames ∩ C# ∩ Call, where Cnames is a conjunction of names, C# of number restrictions, and Call of universal role quantifications. The difference between our algorithm and such existing approaches is that our approach is more general, rather than domain-specific, and makes full use of the effort placed in structuring an ontology to more precisely compute the semantic distance of concepts. This allows us to deal with both the primitive and the defined classes mentioned in Section 4.1. A comparison between our approach and [5] is shown in Fig. 3.

From Fig. 3, we can see that a more precise match degree can be achieved using our approach. For example, suppose we have three Web Services, where service 1 can deal with data model ES, service 2 can deal with EnS2, and service 3 can deal with Continuous_Model, and a user's data model is ES2. We can calculate d(ES,ES2) = 1, d(EnS2,ES2) = 0 and d(Continuous_Model,ES2) = 2. As the semantic distance from EnS2 to ES2 is 0, our approach can tell the user that service 2 is the best choice. This result is correct because here n is greater than or equal to 1: if a service can deal with the data model EnS2 and the requestor's data model is ES2, the service can provide what the requestor requests.

Another advantage of our algorithm is that it differentiates between d(C1,C2) and d(C2,C1), which is significant and often ignored in other approaches. For example, as mentioned above, the semantic distance d(EnS2,ES2) should be zero because a service whose data model is EnS2 can also visualize the data model ES2. But if the data model of a service is ES2 and that of the requestor is EnS2, the service cannot visualize all the data models which the requestor requests. Therefore, the semantic distance d(ES2,EnS2) should not be zero. According to the approach in [5], d(EnS2,ES2) = d(ES2,EnS2), whereas in our algorithm d(EnS2,ES2) = 0 and d(ES2,EnS2) = 1, which gives a logical match result.

In our ontology-based matchmaking algorithm, the advertisement and request can be expressed using terms in the ontology or as a DL-based expression. In response, the potential candidates are ranked according to their match "score". A better match is characterized by a match "score" nearer to 0.
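The ranking step can be sketched in a few lines of Python. The scores for services 1 to 3 are the distances to ES2 quoted above; "service 4" is a hypothetical extra candidate with data model E2V2, whose infinite distance to ES2 follows from the case study, and the function name `rank` is likewise an illustration rather than part of the prototype.

```python
import math


def rank(scores):
    """Return (service, score) pairs sorted by ascending score, skipping
    candidates whose score is infinite (disjoint with the request)."""
    usable = [(name, score) for name, score in scores.items() if score != math.inf]
    return sorted(usable, key=lambda pair: pair[1])


# d(advertised data model, ES2) for each candidate service.
scores = {
    "service 1 (ES)": 1,
    "service 2 (EnS2)": 0,
    "service 3 (Continuous_Model)": 2,
    "service 4 (E2V2)": math.inf,  # hypothetical disjoint candidate
}

ranking = rank(scores)
print(ranking[0])  # ('service 2 (EnS2)', 0)
```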

4.6. Evaluation

We have used a prototype implementation to carry out some experiments designed to test our algorithm's performance. The prototype operates in a visualization service discovery scenario. The advertisements are generated artificially by randomly creating a specification. For example, the data model and visualization technique which the advertisements provide are randomly chosen from a set of concepts in the visualization ontology. Our results show that the time required to calculate the distance between an advertisement and a request is always less than 10 ms, excluding the time to load the ontology. This would be fast enough for the matchmaking system to handle a high frequency of matching requests. However, classifying the advertisements in the TBox (Terminological Box) is quite time-consuming, comparable to the time needed to load the ontology; in particular, as the number of advertisements increases, the efficiency of matchmaking decreases. In our proof-of-concept prototype, we use the idea of pre-computing from [15]. Because the user is concerned with the matching time rather than the time taken to publish a Web Service, we exploit the publishing time to pre-compute the semantic distance between an advertisement and the possible requests. To pre-compute, the matchmaker maintains a taxonomy that represents the subsumption relationships between all the concepts in the ontology. Each concept in this taxonomy is annotated with an ordered list, score_advertisement, that specifies to what degree any request pointing to that concept would match each advertisement. The list is a vector whose elements have the form ⟨AdvX, Score⟩, where AdvX points to the advertisement and Score denotes the distance value. The list can be established off-line, and it is stored in a MySQL database. When a new Web Service is registered, we pre-compute the distance between it and each concept in the taxonomy, and insert the result into a suitable position in the list. This can be undertaken in the publish phase, because it can be done off-line. By doing so, all we need to do in the matchmaking phase is to look up, in the taxonomy, the concept the request points to and its list. The time required in the matchmaking phase will be of the order of (log N), because the data structure of the taxonomy is a tree and the list is ordered.

Fig. 4. An example for the implementation of our matchmaking algorithm.
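The pre-computation scheme described above can be sketched as follows. This is an in-memory illustration only: the prototype stores its lists in MySQL, whereas the class `PrecomputedMatchmaker`, its method names and the lookup table `TABLE` are hypothetical. Entries are stored as (Score, AdvX) rather than ⟨AdvX, Score⟩ so that plain tuple ordering keeps each list sorted by score.

```python
import bisect


class PrecomputedMatchmaker:
    """Keeps, for every concept, a score-ordered list of advertisements."""

    def __init__(self, concepts, distance):
        self.distance = distance  # callable: d(advertised, requested)
        self.score_advertisement = {concept: [] for concept in concepts}

    def publish(self, adv_name, adv_concept):
        """Publish phase: score the new advertisement against every concept
        and insert it at the position that keeps each list ordered."""
        for concept, entries in self.score_advertisement.items():
            score = self.distance(adv_concept, concept)
            bisect.insort(entries, (score, adv_name))

    def match(self, requested_concept):
        """Matchmaking phase: a lookup, with no distance computation."""
        return list(self.score_advertisement[requested_concept])


# Distances to ES2 quoted in Section 4.5, used here as a stand-in for d.
TABLE = {("ES", "ES2"): 1, ("EnS2", "ES2"): 0, ("Continuous_Model", "ES2"): 2}

matchmaker = PrecomputedMatchmaker(["ES2"], lambda adv, req: TABLE[(adv, req)])
matchmaker.publish("Adv1", "ES")
matchmaker.publish("Adv2", "EnS2")
matchmaker.publish("Adv3", "Continuous_Model")
print(matchmaker.match("ES2"))  # [(0, 'Adv2'), (1, 'Adv1'), (2, 'Adv3')]
```

Doing the scoring at publish time moves the cost of the distance calculation out of the request path, which is the trade-off the paragraph above argues for.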

Fig. 4 shows the implementation of our matchmaking algorithm. In the prototype, we provide three options, "Choice of the Data Model", "Choice of the algorithm" and "Enter the Description of Data", to allow a search for appropriate visualization algorithms. Fig. 4 shows the match results when the user chooses the first option and selects the data model ES2 from the data model tree. The "return matches" table lists the returned results. "Service Name" in the table lists all services which can provide appropriate visualization algorithms, and "DataModel" describes the data the services can deal with. "Match Score" tells the user the match degree between the request and the services; we rate the services and show a match indicator in order to give the user some guidance. As shown in Fig. 4, if the requestor's data model is ES2, we can precisely calculate the semantic distances between the request and all the advertisements, find 4 services which are not disjoint with the request, and rate them.

5. Conclusion and further work

We have proposed an improvement to the quality of matchmaking: existing algorithms are extended with a subsumption and semantic definition distance, which allows them to calculate precisely the distance between concepts. The approach extends the simple subsumption idea with a semantic description of these concepts in their OWL ontology. It has been implemented in a prototype for visualization service discovery.

The algorithm is intended to be a starting point for effectively improving the quality of matchmaking. Future work will include: (1) adding support for the complement relation in the description of a concept (a complement class contains all of the individuals that are not contained in the class of which it is the complement); (2) introducing background knowledge as rules into the matchmaking process to enrich the description of concepts.

References

[1] Singh N. A Common Lisp API and Facilitator for ABSI: Version 2.0.3, Technical Report Logic-93-4, Logic Group, Computer Science Department, Stanford University, 1993.

[2] Sycara Katia, Widoff Seth, Klusch Matthias, et al. Larks: dynamic matchmaking among heterogeneous software agents in cyberspace. Auton Agents Multi-Agent Syst 2002;5:173–203.

[3] Nodine Bohrer M, Ngu AH. Semantic brokering over dynamic heterogeneous data sources in infosleuth. In: Proceedings of the 15th international conference on data engineering, 1999. p. 358–65.

[4] Tangmunarunkit Hongsuda, Decker Stefan, Kesselman Carl. Ontology-based resource matching in the grid – the grid meets the semantic web. In: Proceedings of SemPGRID’03, 2003.

[5] Li Lei, Horrocks Ian. A software framework for matchmaking based on semantic web technology. In: Proceedings of the twelfth international world wide web conference (WWW 2003), 2003.

[6] Colucci S, Di Noia T, Di Sciascio E, Donini FM, Mongiello M. Concept abduction and contraction in description logics. In: Proceedings of DL 2003. CEUR Electronic Workshop Proceedings. Available from: <http://ceur-ws.org/Vol-81/>, 2003.

[7] Colucci S, Di Noia T, Di Sciascio E, Donini FM, Mongiello M, Mottola M. A formal approach to ontology-based semantic match of skills descriptions. J Universal Comput Sci 2003, Special issue on Skills Management.

[8] Colucci S, Di Noia T, Di Sciascio E, Donini FM, Mongiello M, Mottola M. Finding skills through ranked semantic match of descriptions. J Universal Comput Sci (JUCS) 2003, Issue on Proceedings of third international conference on knowledge management I-KNOW ’03.

[9] Horrocks I, Sattler U, Tobies S. Practical reasoning for expressive description logics. In: Ganzinger H, McAllester D, Voronkov A, editors. Proceedings of LPAR’99. LNAI, vol. 1705. Springer; 1999. p. 161–80.

[10] Gonzalez-Castillo Javier, Trastour David, Bartolini Claudio. Description logics for matchmaking of services, HP Labs Technical Report.

[11] OWL Web Ontology Language Overview. Available from: <http://www.w3.org/TR/2004/REC-owl-features-20040210/>.

[12] Knublauch Holger, Fergerson Ray W, Noy Natalya F, Musen Mark A. The Protege OWL Plugin: An Open Development Environment for Semantic Web Applications. In: Third international semantic web conference – ISWC, 2004.

[13] Paolucci Massimo, Kawamura Takahiro, Sycara Katia, et al. Semantic matching of web services capabilities. In: Proceedings of international semantic web conference (ISWC 2002).

[14] Borgida A, Patel-Schneider PF. A semantics and complete algorithm for subsumption in the CLASSIC description logic. JAIR 1994;1:277–308.

[15] Srinivasan Naveen, Paolucci Massimo, Sycara Katia P. An efficient algorithm for OWL-S based semantic search in UDDI. SWSWPC, 2004. p. 96–110.

[16] Brodlie KW. Visualization techniques. In: Brodlie KW, Carpenter LA, Earnshaw RA, Gallop JR, Hubbold RJ, Mumford AM, Osland CD, Quarendon P, editors. Scientific visualization – techniques and applications. Springer-Verlag; 1992. p. 37–86 [Chapter 3].

[17] Brodlie KW. A classification scheme for scientific visualization. In: Earnshaw RA, Watson D, editors. Animation and scientific visualization. Academic Press; 1993. p. 125–40.

[18] Horridge Matthew, Knublauch Holger, Rector Alan, Stevens Robert, Wroe Chris. A Practical Guide to Building OWL Ontologies Using The Protege-OWL Plugin and CO-ODE Tools Edition 1.0. Available from: <http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial.pdf>.