semantic business process integration based on ontology alignment




Expert Systems with Applications 36 (2009) 11013–11020

Contents lists available at ScienceDirect

Expert Systems with Applications

journal homepage: www.elsevier.com/locate/eswa

Semantic business process integration based on ontology alignment

Jason J. Jung
Department of Computer Engineering, Yeungnam University, Dae-Dong, Gyeungsan 712-749, Republic of Korea

Article info

Keywords: Business process management; Alignment; Knowledge sharing; Business collaboration

0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2009.02.086

E-mail address: [email protected]

Abstract

Efficient collaboration (i.e., communication and sharing) between businesses should provide them with innovation and agility. However, semantic heterogeneity between business processes is a serious obstacle to automatically supporting cooperation processes (e.g., knowledge sharing and query-based interactions) between businesses. To overcome this problem, we propose a novel framework based on aligning business ontologies for integrating heterogeneous business processes. We consider two types of alignment processes: (i) manual alignment for building a whole business process ontology in a business process management (BPM) system and (ii) automated alignment between business processes of different BPM systems. The optimal integration between two business processes is the one that maximizes the summation of a set of partial similarities between the semantic components of the business processes. In particular, the semantic components are extracted from semantic annotations of business processes. To evaluate the proposed system, we conducted experiments using 22 business process management systems, organized as six business alliances. We assumed that business processes in the same BPM system are built with common ontologies. The proposed alignment method showed a precision of about 71.3% (and a recall of 65.4%). In addition, we found that alignment results depend on some characteristics of the ontologies (e.g., depth and number of classes).

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

One of the recent key functions of business process management (BPM) systems is to integrate several business processes in various and heterogeneous domains to create innovation and agility (Jennings, Norman, Faratin, O'Brien, & Odgers, 2000; van der Aalst, ter Hofstede, & Weske, 2003). More particularly, these BPM systems differ from conventional ones in that they provide an integrated view of business processes and collaborate with other systems. This empowers business experts (or decision makers) to define more efficient business processes and business rules. Accordingly, BPM systems have been exploiting ontologies to support interoperability among systems, i.e., to bridge the gap between BPM systems.

In particular, a virtual enterprise (VE) is an ad-hoc and automated coalition between businesses that come together to share skills (and knowledge) or core competencies and resources in order to better respond to business opportunities, and whose cooperation is supported by computer networks (Perrin & Godart, 2004). The concept of business process for VEs has been applied to many forms of cooperative business relations, such as outsourcing, supply chains, or temporary consortia (Cardoso & Oliveira, 2004).


Hence, as shown in Fig. 1, from a technical point of view an ontology-based BPM system is mainly composed of

• a resource repository, which stores massive resources (e.g., electronic documents, multimedia data and so on),

• business processes (or service APIs), which manipulate and process the resources, and

• a business ontology (e.g., a core BPM ontology and domain-specific ontologies).

For the execution of business processes, conventional BPM systems covering only two layers (i.e., resources and business processes) are restricted to accessing their own resources. Even when the resources somehow become available, users (e.g., decision makers) are required to be familiar with various information retrieval-related tasks (e.g., query processing and data warehousing) to search for resources relevant to their own contexts.

In contrast, ontology-based BPM systems apply one more layer, business ontologies, to semantically describe their resources and business processes. The main reason for this three-layered architecture is to compose business processes with different BPM systems. On multiple information systems, they can provide more relevant services and functionalities to process and manage the resources (Abrol et al., 2005; Jonkers et al., 2004). As practical features of business ontologies, the BPM systems can employ not only their own classification systems like directories, catalogues, and yellow pages (Cilia & Buchmann, 2002; Jung, 2007-b), but also other standard ontologies. Such standard ontologies are Business Management Ontology (BMO)¹ and Business Process Execution Language (BPEL and BPEL4WS²).

Fig. 1. Three-layered architecture of ontology-based BPM system. [Figure: three layers. Resources (Textual Document, Multimedia, Database); Business Processes (CRM, SCM, OLAP); Business Ontologies (Core BPM ontology, Business-specific ontologies).]

Fig. 2. Business process alignment in three-layered architecture. [Figure: Business1 and Business2, each with Resources, Business Processes, and Business Ontologies layers; labels between the two businesses: Alignment, Matching, Integration?]

A major use of business ontologies is the semantic annotation of business processes. The aim of semantic annotation approaches is to share meaning with other participants (e.g., users and partner businesses) by deriving concepts and properties from their own ontologies (Uren et al., 2006). In the case of BPM, both business processes and resources can be targets of semantic annotation. (An example of semantic annotation for business processes will be shown later.)

However, the problem is that the semantic structures of business ontologies are not completely homogeneous, because the ontologies are designed from the experience and heuristics of local experts (or administrators). This means that semantic information extracted from one ontology may be heterogeneous with that of the others. Such heterogeneities are caused by differences not only in terminology (e.g., synonyms and antonyms) but also, more importantly, in knowledge structures (e.g., database schemas (Hull, 1997) and ontologies (Jung, 2006)). We note two main semantic heterogeneities between ontology-based BPM systems, as follows.

1. Lexical heterogeneity. Even though classes of taxonomies are semantically equivalent, the keywords used to express the classes might differ between VEs. For example, a class for "Human Resource Department" can be represented as "HR Dept" as well as "Département des Ressources Humaines". This sort of heterogeneity is also caused by (i) the multi-lingual problem and (ii) synonyms (or antonyms) (Menczer, 2004).

2. Structural heterogeneity. Semantic relationships (e.g., subclass, superclass, and so on) between two concepts in one taxonomy differ from those in others. Some concepts may also be missing. Jung (2005) has noted that, for several practical reasons, class duplications between identical categories and subordination between dependent categories occur.

¹ http://www.bpiresearch.com/Resources/RE_OSSOnt/re_ossont.htm.
² ftp://www6.software.ibm.com/software/developer/library/ws-bpel.pdf.

Consequently, these ontology-based BPM systems are difficult to integrate directly. This means that BPM systems cannot automatically achieve strategic cooperation with heterogeneous BPM systems.

To overcome this drawback, we focus on a business process alignment method for semantic interoperability between three-layered BPM systems, as shown in Fig. 2. This alignment method is based on an ontology matching algorithm that discovers semantic correspondences between entities (e.g., concepts and properties) of two ontologies, so as to deal with the heterogeneity problem between business ontologies. Moreover, to test the scalability of the proposed alignment method, we want to show that a large number of ontology-based BPM systems can be inter-connected to perform multiple collaborations among heterogeneous businesses.

Several studies have been proposed to provide interoperability by discovering and integrating local knowledge structures between general information systems (Castano, Ferrara, & Montanelli, 2006). They can be briefly grouped into three issues:

• incremental discovery of local knowledge (Jung, 2007-b),

• knowledge matching (including schema and ontology matching) (Shvaiko & Euzenat, 2005), and

• interoperability via third-party platforms, e.g., service-oriented architecture (SOA) (Vetere & Lenzerini, 2005).

In this paper we propose a novel method to integrate business processes by mapping heterogeneous business ontologies, i.e., by maximizing the summation of partial similarities between a set of possible pairs of classes. The partial similarity can be calculated by comparing the sets of instances in both classes. After both taxonomies are aligned at the conceptual level, the source ontology instances are transformed into the target taxonomy entities according to those semantic relations.

The remainder of this paper is organized as follows. In Section 2, we describe the problem of semantic heterogeneity between BPM systems. Section 3 proposes a novel similarity measurement between heterogeneous business ontologies, and Section 3.2 describes alignment-based interoperability applications that use this similarity measurement. In Section 4, experimental results are presented to evaluate our approach. Section 5 discusses some significant issues and compares our contributions with previous studies. Finally, Section 6 draws the conclusions of this work.

2. Heterogeneous business process ontologies

For the purpose of exchanging information (i.e., semantic annotation) about business processes as well as resources, a BPM system has to build its local ontology through well-known ontology engineering processes (e.g., learning, merging and entailment (Gómez-Pérez, Fernández-López, & Corcho, 2003)).


In this study, we assume that the ontology is simply composed of (i) a set of classes and (ii) a set of relations between the classes. These can be applied to describe the domain-specific knowledge of the corresponding business processes. An ontology fragment is defined as follows.

Definition 1 (Ontology fragment). Let $C_k$ and $R_k$ be the concept set and the relation set of a business process $P^{\alpha}_k$ executed by a BPM system $B_{\alpha}$, respectively. An ontology fragment $OF^{\alpha}_k$ for $P^{\alpha}_k$ is defined as a set of assertions between classes in the concept set $C_k$. Hence, $OF^{\alpha}_k$ is given by

$$OF^{\alpha}_k = \{ c_{root},\ \langle c_i, c_j, r_{ij} \rangle \mid c_i, c_j \in C_k,\ r_{ij} \in R_k \} \qquad (1)$$

where the triple $\langle c_i, c_j, r_{ij} \rangle$ means that $c_i$ and $c_j$ are related by $r_{ij}$. We take $c_{root}$ as the root class of $OF^{\alpha}_k$ for convenience. For example, $\langle Vehicle, Automobile, superClass \rangle$ means that class "Vehicle" is a superclass of class "Automobile".
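As a minimal sketch of Definition 1, an ontology fragment can be held as a root class plus a set of (class, class, relation) triples. The names `OntologyFragment`, `classes`, and `superclasses` are hypothetical and chosen only for illustration, as is the assumption that the root is linked to "Vehicle"; the paper does not prescribe any data structure.

```python
from dataclasses import dataclass, field

# Hypothetical representation: a triple (c_i, c_j, r_ij) as in Eq. (1).
Triple = tuple[str, str, str]


@dataclass
class OntologyFragment:
    root: str
    triples: set[Triple] = field(default_factory=set)

    def classes(self) -> set[str]:
        """Concept set: the root plus every class mentioned in a triple."""
        found = {self.root}
        for c_i, c_j, _ in self.triples:
            found.update((c_i, c_j))
        return found

    def superclasses(self, cls: str) -> set[str]:
        """Classes c_i asserted as <c_i, cls, superClass>."""
        return {c_i for c_i, c_j, rel in self.triples
                if c_j == cls and rel == "superClass"}


# <Vehicle, Automobile, superClass>: "Vehicle" is a superclass of "Automobile".
# The link from the root to "Vehicle" is an assumption made for this example.
fragment = OntologyFragment(
    root="c_root",
    triples={("c_root", "Vehicle", "superClass"),
             ("Vehicle", "Automobile", "superClass")},
)
```

Storing triples in a set keeps assertions unique, which makes the later union of fragments straightforward.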

A business process ontology O of a BPM system should be organized as a set of ontology fragments together with manual alignments between the ontology fragments provided by domain experts. At the top of this structure, a root node plays the role of a simple connector among the ontology fragments. Drilling down from this root, the classes become more branched and more specific.

Definition 2 (Business process ontology). A business process ontology $O_{\alpha}$ in a BPM system $B_{\alpha}$ is built by merging a set of ontology fragments. Thus, supposing that a set of business processes $\{P_1, \ldots, P_{|\alpha|}\}$ is executed by a BPM system $B_{\alpha}$, the business process ontology $O_{\alpha}$ is formulated by

$$O_{\alpha} = \left( \bigcup_{P^{\alpha}_k \in B_{\alpha}} OF^{\alpha}_k \right) + A_{\alpha} \qquad (2)$$

where $c_{root}$ in all $OF$ are equivalently aligned. More importantly, domain experts can manually assert alignments $A_{\alpha} = \{ \langle c_p, c_q, r_I \rangle_I \mid c_p \in OF^{\alpha}_p,\ c_q \in OF^{\alpha}_q \}$. These mappings are expressed with various relations between classes in different ontology fragments.

Two ontology fragments are merged by means of two manual alignments between them. Not only the triples from the ontology fragments but also the manual alignments asserted by human experts are compiled into the business process ontology. For such relations between aligned classes, this paper considers only three semantic relations: subclass, superclass, and equivalence. For example, in Fig. 3, the root classes are systematically aligned, and a manual alignment $\langle Automobile, Car, Equivalent \rangle$ is provided by the BPM administrator.

Fig. 3. Merging ontology fragments for building business process ontology. Red dotted lines and blue arrows indicate manual alignments between fragments and annotations for a given business process, respectively. [Figure: two fragments, each under its own Croot, with classes Vehicle, Automobile, Car, and Train; two links labeled Equivalent; annotated processes CRM Process, SCM Process, and Scheduling Process.]
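The merge described in Definition 2 can be sketched as a set union plus the expert-provided alignments. This is only an illustration under stated assumptions: the function name `merge_fragments`, the root names, and the ("Car", "Truck", "superClass") triple are hypothetical, and aligning every root to the first one is one simple way to express that all roots are equivalent.

```python
def merge_fragments(fragments, manual_alignments):
    """Merge (root, triples) fragments and manual alignments into one triple set.

    Assumption: all fragment roots are declared equivalent to the first root.
    """
    merged = set()
    roots = []
    for root, triples in fragments:
        roots.append(root)
        merged |= set(triples)
    # Systematic alignment of the root classes.
    for other_root in roots[1:]:
        merged.add((roots[0], other_root, "Equivalent"))
    # Manual alignments asserted by domain experts.
    merged |= set(manual_alignments)
    return merged


crm = ("CRoot_CRM", {("Vehicle", "Automobile", "superClass")})
scm = ("CRoot_SCM", {("Car", "Truck", "superClass")})  # hypothetical fragment
ontology = merge_fragments([crm, scm], {("Automobile", "Car", "Equivalent")})
```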

Through semantic annotation of a business process, a set of classes can be jointly linked to a certain business process (the blue arrows in Fig. 3). We refer to these links as annotation instances.

Definition 3 (Annotation instance). Let a business process $P^{\alpha}_k$ be annotated with a set of classes $\{c_i \mid c_i \in O_{\alpha}\}$. Conversely, the set of annotation instances in class $c_i$ is denoted as $I(c_i) = \{P_1, P_2, \ldots, P_{|I_i|}\}$. They are simply represented as $I_{\alpha} \subseteq |O_{\alpha}| \times |P_{\alpha}|$.

In this paper, we assume that the annotation instances (e.g., textual documents and multimedia data) are in an is-a relation with classes in ontologies.

However, the problem is that these BP ontologies are heterogeneous with each other (as previously mentioned), so that it is difficult to support machine-processable (e.g., agent-based) interoperability effectively. The only way to take advantage of the annotation instances in other BP ontologies is to go through the manual alignments provided by human experts. This kind of alignment requires time-consuming tasks: the experts have to recognize and understand the semantic structures of the given ontologies. Such tasks are (i) to scan most of the instances in each class (what kinds of instances are included in the classes), and (ii) to reflect their own experience and heuristics (which relations hold between classes).

Therefore, in this paper, we discuss an automated ontology alignment method for business process integration. In addition, regarding performance evaluation of the interoperability between ontology-based BPM systems, we consider two issues. The first issue is accuracy: some alignments may be missing, and some of these are more significant than others. Secondly, we want to find out the influence of the minimum number of manual alignments. We consider that the ratio of alignment

$$\rho_{Align}(\alpha) = \frac{|A_{\alpha}|}{\sum_{k=1,\ OF_k \in O_{\alpha}}^{K} |OF_k|}$$

should be an important factor for the quality of ontology alignment. This ratio should be compared with a user-specified threshold $\tau_A$. For example, let $\tau_A = 0.3$ in $BPM_{\beta}$, and let two faceted taxonomies $OF_{\alpha}$ and $OF_{\beta}$ be given to be aligned into $O_{\beta}$. If $\rho_{Align}(\beta) \le \tau_A$, $BPM_{\beta}$ can hardly carry out either internal operations or efficient collaborations with other BPMs. We will discuss how to obtain the optimal value of this factor in Section 4.
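The alignment ratio and its comparison with the threshold can be sketched as follows. This assumes that the size of a fragment is its number of triples, which the paper does not state explicitly; `alignment_ratio` and the toy fragments are hypothetical.

```python
def alignment_ratio(fragments, manual_alignments):
    """Ratio of manual alignments to the summed size of all fragments.

    Assumption: the size of a fragment is its number of triples.
    """
    total_size = sum(len(triples) for triples in fragments)
    if total_size == 0:
        raise ValueError("at least one non-empty fragment is required")
    return len(manual_alignments) / total_size


# Toy data: two fragments with 6 and 4 triples, and 2 manual alignments.
fragment_a = {(f"a{i}", f"a{i + 1}", "subClass") for i in range(6)}
fragment_b = {(f"b{i}", f"b{i + 1}", "subClass") for i in range(4)}
alignments = {("a1", "b1", "Equivalent"), ("a3", "b2", "Equivalent")}

ratio = alignment_ratio([fragment_a, fragment_b], alignments)
threshold = 0.3  # the example threshold value used in the text
needs_more_alignments = ratio <= threshold
```

With these toy numbers the ratio is 2 / 10, so the ontology would fall at or below the example threshold.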

3. Ontology alignment for business process integration

To resolve these heterogeneity drawbacks, the discovery of significant alignments between ontologies needs to be automated. A set of given business process ontologies has to be matched by finding the best configuration of alignments between ontology entities (e.g., classes and properties). We assume that the best configuration is the one maximizing the summation of class similarities. The similarity between two classes is computed from not only the class labels but also the neighbor classes (i.e., subclasses and superclasses) and the annotation instances related to the class. To preprocess a set of annotation instances, some of the terms in the annotation instances should be extracted and regarded as principal components representing the class (Jung, 2008).

3.1. Alignment based on class similarity

In order to find the optimal alignment between two taxonomies, we have to measure the similarity between the classes that make up the taxonomies.


Definition 4 (Class similarity). Given a pair of classes from two different taxonomies, the class similarity ($Sim_C$) between $c$ and $c'$ is defined as

$$Sim_C(c, c') = \sum_{E \in N(C)} \pi^{C}_{E}\, MSim_Y(E(c), E(c')) \qquad (3)$$

where $N(C) \subseteq \{E_1 \ldots E_n\}$ is the set of all relationships in which the classes participate (for instance, subclass, superclass, or instances). We have to consider three components $Y = \{L, C, I\}$: (i) class labels ($L$), (ii) neighboring classes ($C$), and (iii) annotation instances ($I$). The weights $\pi^{C}_{E}$ are normalized (i.e., $\sum_{E \in N(C)} \pi^{C}_{E} = 1$). The class similarity measure $Sim_C$ takes values in $[0, 1]$.

In fact, a similarity function between two sets of classes can be established by finding a maximal matching that maximizes the summed similarity between the classes:

$$MSim_C(S, S') = \frac{\max\left( \sum_{\langle c, c' \rangle \in Pairing(S, S')} Sim_C(c, c') \right)}{\max(|S|, |S'|)}, \qquad (4)$$

in which $Pairing$ provides a matching of the two sets of classes. Methods like the Hungarian method make it possible to find directly the pairing which maximizes similarity. The algorithm that computes this similarity is iterative (Euzenat & Valtchev, 2004). This measure is normalized because, if $Sim_C$ is normalized, the divisor is always greater than or equal to the dividend.
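To make Eq. (4) concrete, the sketch below searches all one-to-one pairings by brute force instead of using the Hungarian method mentioned above; this swap is only reasonable for small class sets and is made here to keep the example dependency-free. The function name `msim` and the `exact_match` similarity are hypothetical stand-ins.

```python
from itertools import permutations


def msim(left, right, sim):
    """Best summed similarity over one-to-one pairings, divided by max set size.

    Brute force over pairings (not the Hungarian method); small inputs only.
    """
    left, right = list(left), list(right)
    if not left or not right:
        return 0.0
    flipped = len(left) > len(right)
    small, large = (right, left) if flipped else (left, right)
    best = 0.0
    # Each permutation assigns a distinct partner to every item of the smaller set.
    for chosen in permutations(large, len(small)):
        if flipped:
            total = sum(sim(l, r) for r, l in zip(small, chosen))
        else:
            total = sum(sim(l, r) for l, r in zip(small, chosen))
        best = max(best, total)
    return best / max(len(left), len(right))


def exact_match(a, b):
    """Placeholder class similarity: 1.0 for identical labels, else 0.0."""
    return 1.0 if a == b else 0.0


score = msim(["Car", "Truck"], ["Truck", "Car", "Train"], exact_match)
```

Dividing by the size of the larger set means unmatched classes lower the score, which is what keeps the measure within [0, 1] when the pairwise similarity is itself normalized.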

In the case of business process ontologies, according to Definition 2, we have to take into account all possible relationships ($r$ and $r_I$) between classes for $N(C) = \{E_{sup}, E_{sub}, E_{equ}\}$, given (i) the superclass ($E_{sup}$) and the subclass ($E_{sub}$) defined in each ontology fragment, and (ii) the equivalent class ($E_{equ}$) obtained from the manual alignments of human experts, respectively. Then, Eq. (3) can be rewritten as:

$$\begin{aligned}
Sim_C(c, c') ={} & \pi^{C}_{L}\, sim_L(L(c), L(c')) + \pi^{C}_{sub}\, MSim_C(E_{sub}(c), E_{sub}(c')) \\
& + \pi^{C}_{sup}\, MSim_C(E_{sup}(c), E_{sup}(c')) \\
& + \pi^{C}_{equ}\, MSim_C(E_{equ}(c), E_{equ}(c')) \\
& + \pi^{C}_{I}\, Sim_I(I(c), I(c'))
\end{aligned} \qquad (5)$$

where the set functions $MSim_C$ compute the similarity of two entity collections. The label similarity $sim_L$ is simply computed by string matching algorithms such as Levenshtein edit distance (Levenshtein, 1996), substring distance (Euzenat, 2004), and so on. A similarity measure between two classes can be turned into a distance measure by taking its complement to 1, i.e., $Distance = 1 - Similarity$.
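The label similarity can be illustrated with a plain Levenshtein edit distance. Normalizing by the longer label and lowercasing before comparison are assumptions made for this sketch; the paper only names the family of string matching algorithms.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def label_similarity(label_a: str, label_b: str) -> float:
    """Similarity in [0, 1]; assumed normalization by the longer label."""
    a, b = label_a.lower(), label_b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def label_distance(label_a: str, label_b: str) -> float:
    """Distance as the complement of similarity: Distance = 1 - Similarity."""
    return 1.0 - label_similarity(label_a, label_b)
```

A purely lexical measure such as this one cannot relate "Automobile" and "Car", which is one reason the class similarity also weighs neighboring classes and annotation instances.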

In particular, to enhance the accuracy of the class similarity, the last term in Eq. (5) represents an instance-level similarity measurement between business process annotations. We exploit three different heuristic functions, formulated as

$$\begin{aligned}
Sim_I(I(c), I(c')) &= \frac{N}{\max(|I(c)|, |I(c')|)} && (6) \\
&= \max_{n=1}^{N} Sim_{\langle P_{\alpha}, P_{\beta} \rangle \in Pairing(I(c), I(c'))}(L(P_{\alpha}), L(P_{\beta}))_n && (7) \\
&= \frac{\sum_{n=1}^{N} Sim_{\langle P_{\alpha}, P_{\beta} \rangle \in Pairing(I(c), I(c'))}(L(P_{\alpha}), L(P_{\beta}))_n}{N} && (8)
\end{aligned}$$

where $N$ is the number of pairs of term features whose distances, computed by string matching methods, do not exceed the threshold $\tau_{Dist}$, i.e., $EditDistance(L(P_{\alpha}), L(P_{\beta}))_{P_{\alpha} \in I(c),\ P_{\beta} \in I(c')} \le \tau_{Dist}$. The three equations are denoted as $H_1$, $H_2$, and $H_3$, and they return the normalized number of matched pairs of terms, the maximum similarity among matched terms, and the average similarity of matched terms, respectively. (We will evaluate and compare these heuristic functions for matching term features in Section 5.) Because instance-level class similarity can uncover the latent semantic information of the classes, the normalization process with the weighting factor is expected to prune incorrect alignments between them.
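The three heuristics can be sketched as below. Several points are assumptions rather than the paper's procedure: pairs are formed greedily and one-to-one from the most similar candidates, distance is taken as 1 minus similarity, and `token_overlap` (a Jaccard overlap on words) stands in for the string matching method. All function names are hypothetical.

```python
def matched_pairs(instances_a, instances_b, sim, max_distance):
    """Greedy one-to-one pairing; keeps pairs whose distance is within the threshold."""
    candidates = sorted(
        ((sim(a, b), i, j)
         for i, a in enumerate(instances_a)
         for j, b in enumerate(instances_b)),
        reverse=True,
    )
    used_a, used_b, scores = set(), set(), []
    for score, i, j in candidates:
        if 1.0 - score > max_distance:
            break  # candidates are sorted, so all remaining pairs are too distant
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        scores.append(score)
    return scores


def instance_similarity(instances_a, instances_b, sim, max_distance, heuristic="H1"):
    scores = matched_pairs(instances_a, instances_b, sim, max_distance)
    if not scores:
        return 0.0
    if heuristic == "H1":  # normalized number of matched pairs, Eq. (6)
        return len(scores) / max(len(instances_a), len(instances_b))
    if heuristic == "H2":  # maximum similarity among matched pairs, Eq. (7)
        return max(scores)
    if heuristic == "H3":  # average similarity of matched pairs, Eq. (8)
        return sum(scores) / len(scores)
    raise ValueError(f"unknown heuristic: {heuristic}")


def token_overlap(a, b):
    """Stand-in similarity: Jaccard overlap of lowercased word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 1.0


left = ["express delivery order", "truck schedule"]
right = ["express delivery", "train schedule", "booking"]
h1 = instance_similarity(left, right, token_overlap, 0.7, "H1")
h2 = instance_similarity(left, right, token_overlap, 0.7, "H2")
h3 = instance_similarity(left, right, token_overlap, 0.7, "H3")
```

Tightening the threshold (for example to 0.5 in this toy data) drops the weaker pair, so the count-based heuristic falls while the maximum stays the same.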

As a result, the output of the proposed alignment process between heterogeneous ontologies can be represented as a set of pairs of classes from two different ontologies. We refer to such a class pair as a correspondence (e.g., equivalence or subsumption).

Definition 5 (Alignment). Given two business process ontologies $O_i$ and $O_j$, the alignment between both ontologies is represented as a set of correspondences $CRSP_{ij} = \{ \langle c, r, c' \rangle \mid c \in O_i,\ c' \in O_j \}$, where $r$ is the relationship between $c$ and $c'$, obtained by maximizing the summation of class similarities $\sum Sim_C(c, c')$.

Finally, the proposed alignment process makes heterogeneous BPM systems interoperable (even if only partially). For example, local users in one BPM system can easily and transparently access the other BPM systems. To do so, BPM systems have to conduct this ontology alignment process in advance. Suppose that a set of BPM systems $\{L_1, \ldots, L_N\}$ should be interoperable with each other. The alignment process can find the correspondences between all pairs of taxonomies, i.e., $L_i$ obtains $N - 1$ sets of correspondences.

3.2. Interoperability based on query transformation

To make better business plans and decisions, a BPM system can interact with others by using the correspondences obtained from the ontology alignment process. Assuming that their interactions are simply based on (i) query answering and (ii) recommending (in other words, pushing) tasks for exchanging relevant information, we focus on the conceptual transformation of the queries that express specific information needs of the BPM systems. During communication between BPM systems, the queries can be embedded into the messages sent from a source BPM system to a destination BPM system.

Definition 6 (Query). A query $Q$ is composed of a set of classes (or terms) and logical operators (e.g., $\neg$, $\wedge$, and $\vee$), and its grammar is simply given by

$$q ::= c \mid \neg q \mid q \wedge q \mid q \vee q \qquad (9)$$

where $c \in O_{src}$, but $c \notin O_{dest}$. Here, $O_{src}$ and $O_{dest}$ are the ontologies of the source and destination BPM systems, respectively.

For conceptual query transformation from $BPM_i$ to $BPM_j$, we exploit a simple class replacement strategy using the set of aligned correspondences $CRSP_{ij}$ between $O_i$ and $O_j$, in order to enhance the accessibility for proactive software modules (e.g., agents) and, more particularly, for local users. In other words, we want to help a local user in $BPM_i$ to search for relevant resources in heterogeneous BPM systems by replacing the concepts in queries.

Definition 7 (Query transformation). Let a query $q_i$ in $BPM_i$ be sent to $BPM_j$, and divided into

$$q_i = q^{+}_{ij} + q^{-}_{ij} = \{ c^{+} \mid \langle c^{+}, rel, c' \rangle \in CRSP_{ij} \} + \{ c^{-} \mid c^{-} \in O_i \} \qquad (10)$$

where class $c^{+}$ is matched with a certain class in $O_j$. This query is transformed by replacing class $c^{+}$ in $q_i$ with the class $c'$ in $O_j$, if and only if

• $c'$ is equivalent to $c^{+}$ ($r = Equivalence$), or

• $c'$ is a subclass of $c^{+}$ ($r = SubClass$).

As an example, in $BPM_i$, a query "Delivery $\wedge$ Airline" expresses the intersection between two sets of resources annotated with classes "Delivery" and "Airline," respectively. If we could discover a semantic correspondence $\langle Delivery, SubClass, Logistics \rangle$ between $O_i$ and $O_j$, the query can be modified to "Logistics $\wedge$ Airline" in $BPM_j$.
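The class replacement strategy can be sketched on a small query tree. The tuple-based query encoding (a class name, or an operator followed by sub-queries) and the function name `transform_query` are assumptions made for illustration; if several correspondences exist for the same class, this sketch keeps an arbitrary one.

```python
# Relations that permit replacement under Definition 7.
REPLACEABLE = {"Equivalence", "SubClass"}


def transform_query(query, correspondences):
    """Replace source classes by aligned destination classes.

    query: a class name (str) or a tuple ("not", q), ("and", q1, q2), ("or", q1, q2).
    correspondences: iterable of (source_class, relation, destination_class).
    """
    mapping = {source: target
               for source, relation, target in correspondences
               if relation in REPLACEABLE}

    def rewrite(node):
        if isinstance(node, str):
            # Unmatched classes are left unchanged.
            return mapping.get(node, node)
        operator, *operands = node
        return (operator, *(rewrite(operand) for operand in operands))

    return rewrite(query)


# "Delivery AND Airline" with the correspondence <Delivery, SubClass, Logistics>.
source_query = ("and", "Delivery", "Airline")
aligned = {("Delivery", "SubClass", "Logistics")}
destination_query = transform_query(source_query, aligned)
```

Correspondences with other relations, such as a superclass relation, are deliberately ignored, so a class aligned only in that way stays as it is in the transformed query.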

Fig. 4. Semantic similarity-based ontology alignment. [Figure: two business process ontologies, each under its own CRoot, annotating the business processes Delivery Service and Airline Service; node labels include Road, Order, Transportation, TravelAgency, Schedule, Express, Delivery, Airline, Truck, Train_Schedule, Duration, Company, Reservation, and Booking.]

When classes are replaced with subclasses, the transformed queries express more specific concepts. In terms of recall and precision (well-known measures from the information retrieval field), this increases the precision of the retrieved information, while the coverage rate (or recall) is reduced. On the other hand, query transformation based on superclass replacement may cause information loss, because the transformed query cannot express the specific semantics of the original one.

3.3. Example

Now, we show a simple example of business process integration between the "logistics" and "airline" domains. As shown in Fig. 4, the alignment between two business process ontologies yields the best mappings between the corresponding ontology entities (depicted as red dotted arrows, while the manual alignments provided by human experts are depicted as blue dotted arrows). The business processes "Delivery Service (DS)" and "Airline Service (AS)" are annotated with three and two different ontology fragments, respectively.

Table 1
Specifications of testing bed.

| BPM system | Number of annotation instances | Number of classes (|O_i|) | Density (number of annotation instances / number of classes) |
|---|---|---|---|
| BPM1 | 172 | 37 | Middle (4.65) |
| BPM2 | 81 | 25 | Low (3.24) |
| BPM3 | 59 | 18 | Low (3.28) |
| BPM4 | 73 | 27 | Low (2.70) |
| BPM5 | 614 | 57 | High (10.77) |
| BPM6 | 264 | 21 | High (12.57) |
| BPM7 | 510 | 48 | High (10.63) |
| BPM8 | 236 | 29 | Middle (8.14) |
| BPM9 | 69 | 16 | Middle (4.31) |
| BPM10 | 276 | 60 | Middle (4.60) |
| BPM11 | 185 | 23 | Middle (8.04) |
| BPM12 | 243 | 28 | Middle (8.68) |
| BPM13 | 265 | 32 | Middle (8.28) |
| BPM14 | 422 | 23 | High (18.35) |
| BPM15 | 314 | 35 | Middle (8.97) |
| BPM16 | 251 | 37 | Middle (6.78) |
| BPM17 | 370 | 31 | High (11.94) |
| BPM18 | 135 | 42 | Low (3.21) |
| BPM19 | 165 | 35 | Middle (4.71) |
| BPM20 | 325 | 43 | Middle (7.56) |
| BPM21 | 222 | 31 | Middle (7.16) |
| BPM22 | 210 | 24 | Middle (8.75) |

Once we have aligned the whole ontologies, two correspondences (CRoot is ignored) are automatically discovered. When the DS receives an "express delivery order", it can obtain semantic knowledge about "Duration". Additionally, DS can also refer to the knowledge about "Booking" and "Reservation" to find an efficient solution for fulfilling the given task.

4. Experimental results

We evaluated the contributions of this paper on two main issues: (i) human evaluation of the alignment between heterogeneous ontologies, and (ii) performance evaluation (i.e., recall and precision) of knowledge retrieval based on query transformation.

To prepare a testbed, we invited 22 graduate students from the "Advanced E-commerce systems" course at Inha University from November 2006 to February 2007 (about four months), and asked them to build their own BPM systems with respect to their interests, as shown in Table 1.

Given a set of business processes (i.e., business description files³), they had to choose a number of processes to annotate with their own ontologies. They were able to merge ontology fragments by asserting manual alignments between the fragments. The ontology fragments were simply obtained by extracting parts of existing ontologies (these are easily retrieved from Swoogle⁴). Such taxonomies are

• ACM Computing Classification (http://www.acm.org/class/1998/).

• Government Category List (http://www.esd.org.uk/standards/gcl/).

• On-line Medical Dictionary (OMD) (http://cancerweb.ncl.ac.uk/omd/).

• Open Directory Project (ODP) (http://dmoz.org/).

• Commerce-Database Business Directory (http://www.commerce-database.com/).

³ The IRCS dataset is organized as a set of documents retrieved from major e-commerce websites in Korea (available at http://eslab.inha.ac.kr/~ircs/). It has been applied to support user browsing tasks for searching relevant information in Jung (2007-a).

⁴ http://swoogle.umbc.edu/.

11018 J.J. Jung / Expert Systems with Applications 36 (2009) 11013–11020

While BPM5 and BPM7 have annotated the largest numbers of resources, BPM5 and BPM10 have the largest numbers of classes in the corresponding ontologies. With respect to the density ($D = \frac{\text{Number of annotation instances}}{\text{Number of classes}}$), the BPMs are classified into three categories: High, Middle, and Low. In particular, BPM6 was designed to be the densest one ($D_6 = 12.57$).
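As a minimal sketch of the density computation, the following Python snippet recomputes $D$ for a few rows of Table 1. The cut-off values `LOW_MAX` and `HIGH_MIN` are assumptions chosen only so that the labels agree with Table 1; the text does not state the thresholds that were actually used.

```python
# (annotation instances, classes) for a few BPMs, copied from Table 1.
TABLE1 = {
    "BPM1": (172, 37),
    "BPM4": (73, 27),
    "BPM6": (264, 21),
}

# Hypothetical cut-offs: assumed for illustration, not given in the text.
LOW_MAX = 4.0    # D below this value is labelled "Low"
HIGH_MIN = 10.0  # D at or above this value is labelled "High"


def density(instances: int, classes: int) -> float:
    """D = number of annotation instances / number of classes."""
    return instances / classes


def category(d: float) -> str:
    """Map a density value to one of the three labels used in Table 1."""
    if d < LOW_MAX:
        return "Low"
    if d >= HIGH_MIN:
        return "High"
    return "Middle"


for name, (instances, classes) in TABLE1.items():
    d = density(instances, classes)
    print(f"{name}: D = {d:.2f} ({category(d)})")
```

With these assumed cut-offs, the three rows reproduce the Middle, Low, and High labels listed in Table 1.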

4.1. Evaluation on alignment process

For the first issue, we performed the proposed alignment process between all possible pairs of ontologies ($\frac{22 \times 21}{2} = 231$) in three different cases:

• simple matching of the semantic structures of ontologies,
• matching with manual alignments A, and
• matching with semantics extracted from annotation instance sets.

Compared with the matching result in the first case, the second and third cases were expected to show improved results. Five human experts then analyzed the collected correspondences between the established ontologies and counted the number of mismatched correspondences in each alignment.

Table 2 shows a part of the results of correspondence matching in the first case (O1 to O12). While each lower-diagonal component $v_{ij} = |CRSP_{ij}|$ is the number of correspondences between two ontologies $O_i$ and $O_j$, each upper-diagonal component $w_{ji}$ is the number of mismatched correspondences from $CRSP_{ij}$. Additionally, the mismatching ratio shown in brackets is computed as $\frac{w_{ji}}{v_{ij}}$ (e.g., 8/18 ≈ 0.44 for O1 and O5). On average, our similarity-based ontology alignment has shown a mismatching ratio of approximately 27.2%. The alignment between O1 and O5 has shown 44%, which is the highest mismatching ratio (i.e., the worst case). On the other hand, the alignments between O3 and O5 and between O3 and O10 have shown the lowest mismatching ratio, 14% (i.e., the best case).
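The bracketed ratios can be checked with a short calculation. The sketch below takes the numbers of correspondences and mismatches for three ontology pairs from Table 2 and divides the mismatched count by the total count; the function name `mismatching_ratio` is introduced here only for illustration.

```python
# (total correspondences v, mismatched correspondences w), read from Table 2.
PAIRS = {
    ("O1", "O5"): (18, 8),
    ("O3", "O5"): (7, 1),
    ("O3", "O10"): (7, 1),
}


def mismatching_ratio(total: int, mismatched: int) -> float:
    """Share of discovered correspondences that the experts judged wrong."""
    if total == 0:
        raise ValueError("no correspondences were discovered for this pair")
    return mismatched / total


ratios = {pair: mismatching_ratio(v, w) for pair, (v, w) in PAIRS.items()}
for (left, right), ratio in ratios.items():
    print(f"{left}-{right}: {ratio:.2f}")
```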

In order to enhance the previous alignments, we exploited two approaches: (i) manual alignments A, provided by the students while building their BPMs, and (ii) annotation instances of the corresponding classes. We then wanted to evaluate whether (and how much) these methods were able to improve the alignment performance. To this end, we compared the experimental results with those of the previous tables. With respect to improvement of the number of discovered correspondences, the instance-based alignment (about 139%) outperformed the manual alignment-based one (about 105%).

We found that for most ontology pairs the manual alignment has shown only slight improvement compared to the instance-based matching. In particular, although the ratio of manual alignment $q_{Align}(k)$ in the 22 ontologies varied between $q_{Align}(10) = 0.32$ and $q_{Align}(9) = 0.65$, the performance remained quite consistent. This means that manual alignment has made only a minor contribution to the automatic ontology alignment process.

With respect to the mismatching ratio, the two methods decreased, on average, 9.8% and 48.4% of the mismatched correspondences, which are regarded as error rates. Again, instance-based alignment has shown better performance than the others. In particular, the instance-based alignments between ontologies O3 and O4, between O3 and O10, and between O6 and O9 have been perfectly matched.

Table 2
Results of taxonomy alignment ($s_{Align} = 0.1$).

       O1    O2    O3    O4    O5          O6
O1     –     2     2     4     8 (0.44)    3
O2     11    –     3     5     5           3
O3     9     8     –     1     1 (0.14)    2
O4     15    13    6     –     5           4
O5     18    13    7     15    –           3
O6     9     8     8     11    10          –
O7     19    11    8     13    16          11
O8     15    11    6     15    15          10
O9     6     9     6     7     7           5
O10    13    11    7     9     30          10
O11    9     12    10    13    12          11
O12    14    13    6     11    15          7
...

4.2. Evaluation on query transformation

The second experimental issue is to evaluate semantic interoperability between BPMs. In this paper, interactions between BPMs were represented as concept-based queries, and these queries were transformed by class replacement based on the correspondences acquired by the instance-level alignment method in the first issue. The invited students built ten queries with the classes in their own taxonomies, in order to broadcast these queries to the rest of the BPMs. After a set of resources $\widetilde{rsc}(q_i)$ was retrieved by a query $q_i$ of $BPM_i$, the recall $R$ and precision $P$ were measured by

$$R(q_i) = \frac{|\widetilde{rsc}(q_i) \cap rsc(q_i)|}{|rsc(q_i)|} \qquad (11)$$

$$P(q_i) = \frac{|\widetilde{rsc}(q_i) \cap rsc(q_i)|}{|\widetilde{rsc}(q_i)|} \qquad (12)$$

where $rsc(q_i)$ is the set of resources retrieved by the human experts.

For the given queries, the average recall was 68.6%, and the queries from BPM3 were most successfully transformed (73.7%). With respect to precision, we obtained 85.4% on average. BPM11 has shown the maximum precision (86.9%).

These results indicate that the correspondences were properly discovered by the proposed approach (i.e., the rate of mismatched alignments is reasonably low), but some missed correspondences decreased the recall. Another important point is that the precision measure has shown better results than the recall measure. We attribute this to the fact that our concept replacement strategy is based only on "equivalence" and "subclass" relationships.
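The evaluation procedure can be sketched in a few lines of Python, under stated assumptions. The correspondence table, the class name "Period", and the resource identifiers are hypothetical and serve only to show class replacement followed by the recall and precision measures of Eqs. (11) and (12); this is not the implementation used in the experiments.

```python
from typing import Dict, Set, Tuple

# Hypothetical correspondences from a source taxonomy to a target taxonomy.
CORRESPONDENCES: Dict[str, str] = {"Booking": "Reservation", "Duration": "Period"}


def transform_query(query: Set[str], mapping: Dict[str, str]) -> Set[str]:
    """Replace each class that has a correspondence; drop classes without one."""
    return {mapping[c] for c in query if c in mapping}


def recall_precision(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Eqs. (11) and (12): overlap divided by |relevant| and by |retrieved|."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# A query with one class that has no correspondence in the target taxonomy.
transformed = transform_query({"Booking", "Express"}, CORRESPONDENCES)

# Hypothetical resource identifiers: retrieved by the transformed query
# versus selected by the human experts.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d3", "d4", "d5", "d6"}
recall, precision = recall_precision(retrieved, relevant)
print(transformed, recall, precision)
```

In this toy case the class without a correspondence is dropped from the query, which illustrates how a missed correspondence can lower recall while leaving precision comparatively high.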

5. Discussion and related work

Through the experiments, the proposed alignment has been shown to support semantic interoperability between heterogeneous BPMs. We now discuss several meaningful findings related to the ontology alignment algorithm.

Table 2 (continued)

       O7    O8    O9    O10         O11   O12
O1     6     4     2     5           3     5
O2     2     4     2     4           4     5
O3     2     2     1     1 (0.14)    3     2
O4     3     4     2     3           4     3
O5     6     4     2     9           5     3
O6     3     4     1     4           4     3
O7     –     3     3     8           3     5
O8     12    –     2     3           2     3
O9     8     7     –     2           3     2
O10    25    12    7     –           3     3
O11    10    8     9     11          –     4
O12    15    13    7     10          10    –

The first issue is to find out whether the characteristics of BPMs (e.g., the numbers of resources and classes in Table 1) are related to the performance of the alignment process. Given two ontologies $O_i$ and $O_j$ to be aligned, four parameters were chosen for comparison, as follows.

• Total number of classes in the two sets ($p^{ij}_1$).
• Total number of instances in the two sets ($p^{ij}_2$).
• Difference between the numbers of classes in the two sets ($p^{ij}_3$).
• Difference between the numbers of instances in the two sets ($p^{ij}_4$).

Then, meaningful associations between two known quantitative variables, i.e., a parameter $p_k$ and the improvement ratio of the alignment process $r$, were analyzed by the regression method

$$r_{ij} = a + b \times p^{ij}_k + c \times (p^{ij}_k)^2 + \varepsilon \qquad (13)$$

where $a$, $b$, and $c$ are coefficients and $\varepsilon$ is the residual term. We found that a larger difference between the numbers of classes in the ontologies (i.e., $p_3$) has the strongest influence on the performance of ontology alignment.
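To make Eq. (13) concrete, the following sketch computes the four parameters for one pair of BPMs from Table 1 and fits the quadratic model by ordinary least squares. The helper names and the sample observations are hypothetical, and solving the normal equations directly is only one possible way to estimate $a$, $b$, and $c$; the text does not say which regression tool was used.

```python
from typing import List, Sequence, Tuple


def pair_parameters(x: Tuple[int, int], y: Tuple[int, int]) -> Tuple[int, int, int, int]:
    """Return (p1, p2, p3, p4) for two BPMs given as (instances, classes)."""
    inst_x, cls_x = x
    inst_y, cls_y = y
    return (cls_x + cls_y, inst_x + inst_y, abs(cls_x - cls_y), abs(inst_x - inst_y))


def solve3(m: List[List[float]], v: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(m, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_quadratic(p: Sequence[float], r: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares estimates (a, b, c) for r = a + b*p + c*p^2."""
    s = [sum(x ** k for x in p) for k in range(5)]            # power sums of p
    t = [sum(y * x ** k for x, y in zip(p, r)) for k in range(3)]
    normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    a, b, c = solve3(normal, t)
    return a, b, c


# Parameters for BPM1 (172 instances, 37 classes) and BPM5 (614, 57) in Table 1.
params = pair_parameters((172, 37), (614, 57))

# Hypothetical observations (p_3, improvement ratio), for illustration only.
p3_values = [0, 1, 2, 3, 4]
improvement = [1, 6, 17, 34, 57]
a, b, c = fit_quadratic(p3_values, improvement)
print(params, round(a, 3), round(b, 3), round(c, 3))
```

Repeating the fit for each of the four parameters and comparing the fitted curves is one way to judge which parameter is most strongly associated with the improvement ratio.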

As the second issue, we found that our alignment process has shown an error rate (i.e., mismatched correspondences) of approximately 35.5%. In the worst case (the alignment between T1 and T2), we realized that differences between domain-specific terminologies mainly influenced the string matching-based alignments (in our case, we measured the edit distance between labels).
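The sensitivity of label matching to terminology can be illustrated with a small edit-distance sketch. The dynamic-programming routine below is the standard Levenshtein distance; the normalization into a similarity in [0, 1] and the lower-casing of labels are assumptions made for this example and are not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def label_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; normalizing by the longer label is an assumption."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(label_similarity("Duration", "duration"))
print(label_similarity("Booking", "Reservation"))
```

Labels such as "Booking" and "Reservation" share almost no characters, so a purely string-based measure scores them as dissimilar even when the classes play the same role, which is consistent with the observation above about domain-specific terminologies.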

Our approach can be compared with centralized information systems, e.g., portal systems. The two main approaches to accessing multiple information sources differ in terms of end users' access strategies. While portal systems (e.g., meta search engines) provide a centralized integration service over these information sources, distributed approaches like our system can consider more domain-specific features. Moreover, they can apply personalization techniques for their local users.

We consider the ontologies used here to be rather simple. While the main relationship between classes in these ontologies is SubClass (i.e., similar to a taxonomy), ontologies in general contain a variety of relationships between classes such as SubClass, SuperClass, Property, SubProperty, Domain, Range, and so on. However, according to Welty and Guarino (2001), taxonomic patterns are capable of expressing ontological relationships. Also, much work has been proposed to match, align, and merge ontologies, such as similarity flooding (Melnik, Garcia-Molina, & Rahm, 2002), Alignment API (Ehrig & Sure, 2005), and a directory-based approach (Liang, Vaishnavi, & Vandenberg, 2006). Of particular interest are the ontology sharing systems for communities of practice (CoP) introduced in Davies, Duke, and Sure (2004) and Mika, Iosif, Sure, and Akkermans (2004). From a more practical aspect, several business markup languages, e.g., Unified Enterprise Modelling Language (UEML) (Ducq, Chen, & Vallespir, 2004), have been designed. These can be regarded as more systematic activities.

In the context of query transformation, since a concept-based query transformation scheme was introduced in Qiu and Frei (1993), several approaches have been investigated. Examples of such approaches are probabilistic query expansion based on concept similarity (Cui, Wen, Nie, & Ma, 2002), logical inference (Nie, 2003), and background knowledge-based systems (Liu & Chu, 2005; Zazo, Figuerola, Alonso Berrocal, & Rodríguez, 2005).

6. Concluding remarks and future work

In conclusion, we proposed an alignment mechanism that orchestrates and glues together several pieces of business processes in order to create innovative knowledge. This work can be explained as an integration system among domain-specific business processes located in the third layer mentioned in van der Aalst et al. (2003). More importantly, a scheme for heterogeneous virtual organizations (in particular, virtual enterprises) should be applied to integrate the corresponding business processes.

Each pair of ontologies was aligned by measuring the similarities between classes. We assume that the maximal summation of these class similarities gives the best alignment between the corresponding ontologies. Based on this alignment, we supported local users in accessing the other heterogeneous BPMs.
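The assumption that the best alignment maximizes the summation of class similarities can be illustrated as follows. This sketch uses exhaustive search over one-to-one pairings as a stand-in, which is feasible only for very small ontologies; it is not the alignment algorithm of the paper, and the similarity values, the `threshold` parameter, and the function name are hypothetical.

```python
from itertools import permutations
from typing import List, Tuple


def best_alignment(
    sim: List[List[float]], threshold: float = 0.0
) -> Tuple[float, List[Tuple[int, int]]]:
    """Exhaustively search one-to-one class pairings for the maximal similarity sum.

    sim[i][j] is the similarity between class i of one ontology and class j of
    the other; the matrix must have at most as many rows as columns.
    Pairs below `threshold` are left unaligned and contribute nothing.
    """
    n_rows, n_cols = len(sim), len(sim[0])
    if n_rows > n_cols:
        raise ValueError("transpose the matrix so that rows <= columns")
    best_score, best_pairs = float("-inf"), []
    for cols in permutations(range(n_cols), n_rows):
        pairs = [(i, j) for i, j in enumerate(cols) if sim[i][j] >= threshold]
        score = sum(sim[i][j] for i, j in pairs)
        if score > best_score:
            best_score, best_pairs = score, pairs
    return best_score, best_pairs


# Hypothetical similarities between two classes of O_i (rows) and O_j (columns).
SIM = [
    [0.90, 0.80],
    [0.85, 0.10],
]
score, pairs = best_alignment(SIM, threshold=0.1)
print(pairs, round(score, 2))
```

In this example, greedily taking the single most similar pair first (0.90) would leave only the 0.10 pair and a total of 1.00, whereas the pairing with the maximal summation reaches 1.65. This is why the objective is defined over the whole set of correspondences rather than pair by pair.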

In the future, we have to evaluate the scalability of our alignment-based distributed BPMs as the number of test beds increases. In particular, according to the semantic power, they might be socialized, as shown in Jung and Euzenat (2006) and Jung (2007-b). Then, we can provide more efficient query propagation strategies. More importantly, we are planning to evaluate our alignment method using the evaluation methods for taxonomy mapping algorithms proposed in Avesani, Giunchiglia, and Yatskevich (2005).

Acknowledgement

This research was supported by the Yeungnam University research grants in 2008.

References

Abrol, M., Doshi, B., Kanihan, J., Kumar, A., Liu, J., & Mao, J. (2005). Intelligent taxonomy management tools for enterprise content. In A. Skowron, R. Agrawal, M. Luck, T. Yamaguchi, P. Morizet-Mahoudeaux, J. Liu, et al. (Eds.), Proceedings of the 2005 IEEE/WIC/ACM international conference on web intelligence (WI 2005) (pp. 809–811). IEEE Computer Society.

Avesani, P., Giunchiglia, F., & Yatskevich, M. (2005). A large scale taxonomy mapping evaluation. In Y. Gil, E. Motta, V. Richard Benjamins, & M. A. Musen (Eds.), International semantic web conference. Lecture notes in computer science (3729, pp. 67–81). Springer.

Cardoso, H. L., & Oliveira, E. C. (2004). Virtual enterprise normative framework within electronic institutions. In M. P. Gleizes, A. Omicini, & F. Zambonelli (Eds.), Proceedings of the fifth international workshop on engineering societies in the agents world (ESAW 2004). Lecture notes in computer science (3451, pp. 14–32). Springer.

Castano, S., Ferrara, A., & Montanelli, S. (2006). Matching ontologies in open networked systems: Techniques and applications. Journal of Data Semantics, 5, 25–63.

Cilia, M., & Buchmann, A. P. (2002). An active functionality service for e-business applications. SIGMOD Record, 31(1), 24–30.

Cui, H., Wen, J.-R., Nie, J.-Y., & Ma, W.-Y. (2002). Probabilistic query expansion using query logs. In Proceedings of the 11th international conference on World Wide Web (pp. 325–332). New York, NY, USA: ACM Press.

Davies, J., Duke, A., & Sure, Y. (2004). OntoShare – an ontology-based knowledge sharing system for virtual communities of practice. Journal of Universal Computer Science, 10(3), 262–283.

Ducq, Y., Chen, D., & Vallespir, B. (2004). Interoperability in enterprise modelling: Requirements and roadmap. Advanced Engineering Informatics, 18(4), 193–203.

Ehrig, M., & Sure, Y. (2005). FOAM – Framework for ontology alignment and mapping – Results of the ontology alignment evaluation initiative. In B. Ashpole, M. Ehrig, J. Euzenat & H. Stuckenschmidt (Eds.), Proceedings of the K-CAP 2005 Workshop on Integrating Ontologies, Banff, Canada. In CEUR workshop proceedings (Vol. 156). CEUR-WS.org.

Euzenat, J. (2004). An API for ontology alignment. In S. A. McIlraith, D. Plexousakis, & F. van Harmelen (Eds.), Proceedings of the third international semantic web conference. Lecture notes in computer science (3298, pp. 698–712). Springer.

Euzenat, J., & Valtchev, P. (2004). Similarity-based ontology alignment in OWL-Lite. In R. López de Mántaras, & L. Saitta (Eds.), Proceedings of the 16th European conference on artificial intelligence (ECAI'2004) (pp. 333–337), August 22–27, Valencia, Spain. IOS Press.

Gómez-Pérez, A., Fernández-López, M., & Corcho, O. (2003). Ontological engineering. Advanced information and knowledge processing. Springer.

Hull, R. (1997). Managing semantic heterogeneity in databases: A theoretical prospective. In Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems (PODS 97) (pp. 51–61), New York, NY, USA. ACM Press.

Jennings, N. R., Norman, T. J., Faratin, P., O'Brien, P., & Odgers, B. (2000). Autonomous agents for business process management. Applied Artificial Intelligence, 14(2), 145–189.

Jonkers, H., Lankhorst, M. M., van Buuren, R., Hoppenbrouwers, S., Bonsangue, M. M., & van der Torre, L. W. N. (2004). Concepts for modeling enterprise architectures. International Journal of Cooperative Information Systems, 13(3), 257–287.

Jung, Jason J. (2005). Collaborative web browsing based on semantic extraction of user interests with bookmarks. Journal of Universal Computer Science, 11(2), 213–228.

Jung, Jason J. (2006). Taxonomy alignment for interoperability between heterogeneous digital libraries. In Proceedings of the international conference on asian digital library (ICADL). Lecture notes in computer science (4312, pp. 274–282). Springer.

Jung, J. J. (2007-a). Exploiting semantic annotation to supporting user browsing on the web. Knowledge-Based Systems, 20(4), 373–381.

Jung, J. J. (2007-b). Ontological framework based on contextual mediation for collaborative information retrieval. Information Retrieval, 10(1), 85–109.


Jung, J. J. (2008). Taxonomy alignment for interoperability between heterogeneous virtual organizations. Expert Systems with Applications, 36(4), 2721–2731.

Jung, J. J., & Euzenat, J. (2006). From personal ontologies to semantic social space. In Poster of the fourth European semantic web conference (ESWC 2006).

Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Cybernetics and Control Theory, 10(8), 707–710.

Liang, J., Vaishnavi, V. K., & Vandenberg, A. (2006). Clustering of LDAP directory schemas to facilitate information resources interoperability across organizations. IEEE Transactions on Systems, Man, and Cybernetics – Part A, 36(4), 631–642.

Liu, Z., & Chu, W. W. (2005). Knowledge-based query expansion to support scenario-specific retrieval of medical free text. In Proceedings of the 2005 ACM symposium on applied computing (SAC'05) (pp. 1076–1083). New York, NY, USA: ACM Press.

Melnik, S., Garcia-Molina, H., & Rahm, E. (2002). Similarity flooding: A versatile graph matching algorithm and its application to schema matching. In Proceedings of the 18th international conference on data engineering (ICDE) (pp. 117–128). IEEE Computer Society.

Menczer, F. (2004). Lexical and semantic clustering by web links. Journal of the American Society for Information Science and Technology, 55(14), 1261–1269.

Mika, P., Iosif, V., Sure, Y., & Akkermans, H. (2004). Ontology-based content management in a virtual organization. In S. Staab & R. Studer (Eds.), Handbook on ontologies. International handbooks on information systems (pp. 455–476). Springer.

Nie, J.-Y. (2003). Query expansion and query translation as logical inference. Journal of the American Society for Information Science and Technology, 54(4), 335–346.

Perrin, O., & Godart, C. (2004). A model to support collaborative work in virtual enterprises. Data and Knowledge Engineering, 50(1), 63–86.

Qiu, Y., & Frei, H.-P. (1993). Concept based query expansion. In Proceedings of the 16th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR'93) (pp. 160–169). New York, NY, USA: ACM Press.

Shvaiko, P., & Euzenat, J. (2005). A survey of schema-based matching approaches. Journal of Data Semantics, 4, 146–171.

Uren, V. S., Cimiano, P., Iria, J., Handschuh, S., Vargas-Vera, M., Motta, E., et al. (2006). Semantic annotation for knowledge management: Requirements and a survey of the state of the art. Journal of Web Semantics, 4(1), 14–28.

van der Aalst, W. M. P., ter Hofstede, A. H. M., & Weske, M. (2003). Business process management: A survey. In W. M. P. van der Aalst, A. H. M. ter Hofstede, & M. Weske (Eds.), Proceedings of the international conference on business process management (BPM 2003) (pp. 1–12). June 26–27, Eindhoven, The Netherlands. In Lecture notes in computer science (Vol. 2678). Springer.

Vetere, G., & Lenzerini, M. (2005). Models for semantic interoperability in service-oriented architectures. IBM Systems Journal, 44(4), 887–903.

Welty, C. A., & Guarino, N. (2001). Supporting ontological analysis of taxonomic relationships. Data and Knowledge Engineering, 39(1), 51–74.

Zazo, Á., Figuerola, C. G., Alonso Berrocal, J. L., & Rodríguez, E. (2005). Reformulation of queries using similarity thesauri. Information Processing and Management: An International Journal, 41(5), 1163–1173.