ISSN 0972 - 9038
International Journal of Computer Science &
Applications
Volume 4 Issue 2 July 2007
Special Issue on Communications, Interactions and
Interoperability in Information Systems
Editor-in-Chief Rajendra Akerkar
Editors of Special Issue
Colette Rolland, Oscar Pastor and Jean-Louis Cavarero
International Journal of Computer Science & Applications Vol. 4, No.2, July 2007
ADVISORY EDITOR Douglas Comer Department of Computer Science, Purdue University, USA
EDITOR-IN-CHIEF Rajendra Akerkar Technomathematics Research Foundation 204/17 KH, New Shahupuri , Kolhapur 416001, INDIA
MANAGING EDITOR David Camacho Universidad Carlos III de Madrid, Spain
ASSOCIATE EDITORS Ngoc Thanh Nguyen Wroclaw University of Technology, Poland
Pawan Lingras Saint Mary's University, Halifax, Nova Scotia, Canada.
COUNCIL OF EDITORS Stuart Aitken University of Edinburgh, UK Tetsuo Asano JAIST, Japan. Costin Badica University of Craiova,Craiova, Romania JF Baldwin University of Bristol, UK Pavel Brazdil LIACC/FEP,University of Porto, Portugal Ivan Bruha Mcmaster University, Canada Jacques Calmet Universität Karlsruhe Germany Narendra S. Chaudhari Nanyang Technological University, Singapore Walter Daelemans University of Antwerp, Belgium K. V. Dinesha IIIT, Bangalore, India David Hung-Chang Du University of Minnesota, USA Hai-Bin Duan Beihang University, P. R. China. Yakov I. Fet Russian Academy of Sciences, Russia Maria Ganzha, Gizycko Private Higher Educational Institute, Gizycko, Poland S. K. Gupta IIT, New Delhi, India Henry Hexmoor University of Arkansas, Fayetteville, U.S.A. Ray Jarvis Monash University, Victoria, Australia Peter Kacsuk MTA SZTAKI Research Institute, Budapest, Hungary
Huan Liu Arizona State University, USA Pericles Loucopoulos UMIST, Manchester, UK Wolfram - Manfred Lippe University of Muenster, Germany Lorraine McGinty University College Dublin, Belfield, Ireland C. R. Muthukrishnan Indian Institute of Technology, Chennai, India Marcin Paprzycki SWPS and IBS PAN, Warsaw Lalit M. Patnaik Indian Institute of Science, Bangalore, India Dana Petcu Western University of Timisoara, Romania Shahram Rahimi Southern Illinois University, Illinois, USA Sugata Sanyal Tata Institute of Fundamental Research, Mumbai, India. Dharmendra Sharma University of Canberra, Australia Ion O. Stamatescu FEST, Heidelberg, Germany José M. Valls Ferrán Universidad Carlos III, Spain Rajeev Wankar University of Hyderabad, Hyderabad, India Krzysztof Wecel The Poznan University of Economics, Poland
Editorial Office: Technomathematics Research Foundation, 204/17 Kh, New Shahupuri, Kolhapur 416001, India. E-mail: [email protected] Copyright 2007 by Technomathematics Research Foundation. All rights reserved. This journal issue or parts thereof may not be reproduced in any form or by any means, electrical or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the copyright owner. Permission to quote from this journal is granted provided that the customary acknowledgement is given to the source. International Journal of Computer Science & Applications (ISSN 0972 – 9038) is a high-quality electronic journal published six-monthly by Technomathematics Research Foundation, Kolhapur, India. The website of IJCSA is http://www.tmrfindia.org/ijcsa.html
Contents
Editorial (v)
1. A New Quantitative Trust Model for Negotiating Agents using Argumentation
Jamal Bentahar, Concordia Institute for Information Systems Engineering, John-Jules
Ch. Meyer, Department of Information and Computing Sciences, Utrecht University,
The Netherlands (1 –21)
2. Protocol Management Systems as a Middleware for Inter-Organizational Workflow
Coordination (23-41)
Andonoff Eric, IRIRT/UT1, Bouaziz Wassim, IRIRT/UT1, Hanachi Chihab, IRIRT/UT1
3. Adaptability of Methods for Processing XML Data using Relational Databases – the
State of the Art and Open Problems (43-62)
Irena Mlynkova, Department of Software Engineering, Charles University, Jaroslav
Pokorny, Department of Software Engineering, Charles University
4. XML View Based Access to Relational Data in Workflow Management Systems
(63-74)
Marek Lehmann, University of Vienna, Department of Knowledge and Business
Engineering, Johann Eder, University of Vienna, Department of Knowledge and Business
Engineering, Christian Dreier, University of Klagenfurt, Jurgen Mangler
5. Incremental Trade-Off Management for Preference-Based Queries (75-91)
Wolf-Tilo Balke, L3S Research Center, University of Hannover, Germany, Ulrich
Güntzer, University of Tübingen, Germany, Christoph Lofi, L3S Research Center,
University of Hannover, Germany
6. What Enterprise Architecture and Enterprise Systems Usage Can and Can not Tell
about Each Other . (93-109)
Maya Daneva, University of Twente, Pascal Van Eck, University of Twente
7. UNISC-Phone - A case study. (111-123)
Jacques Schreiber, Gunter Feldens, Eduardo Lawisch, Luciano Alves, Informatics
Department, UNISC- Santa Cruz do Sul University
8. Fuzzy Ontologies and Scale-free Networks Analysis. (125-144)
Silvia Calegari, DISCo, University of Milano-Bicocca, Fabio Farina, DISCo,
University of Milano-Bicocca
9. Extracted Knowledge Interpretation in mining biological data: a survey. (145-163)
Martine Collard, University of Nice, Ricardo Martinez, University of Nice
Editorial
The First International Conference on Research Challenges in Information Science
(RCIS) aimed at providing an international forum for scientists, researchers, engineers
and developers from a wide range of information science areas to exchange ideas and
approaches in this evolving field. While presenting research findings and state-of-the-art
solutions, authors were especially invited to share experiences on new research
challenges. High-quality papers in all information science areas were solicited, and
original papers exploring research challenges received especially careful attention from
reviewers. Papers that had already been accepted or were currently under review for
other conferences or journals were not considered for publication at RCIS’07.
103 papers were submitted and 31 were accepted. They will be published in the
RCIS’07 proceedings.
This special issue of the International Journal of Computer Science & Applications,
dedicated to Communications, Interactions and Interoperability in Information Systems,
presents 9 papers that obtained the highest marks in the reviewing process. They are
presented in an extended version in this issue. In the context of RCIS’07, they are linked
to Information System Modelling and Intelligent Agents, Description Logics,
Ontologies and XML based techniques.
Colette Rolland
Oscar Pastor
Jean-Louis Cavarero
A New Quantitative Trust Model for
Negotiating Agents using Argumentation
Jamal Bentahar (1), John-Jules Ch. Meyer (2)
(1) Concordia University, Concordia Institute for Information Systems
Engineering, Canada
(2) Utrecht University, Department of Information and Computing
Sciences, The Netherlands
Abstract
In this paper, we propose a new quantitative trust model for argumentation-based
negotiating agents. The purpose of such a model is to provide a secure environment for
agent negotiation within multi-agent systems. The problem of securing agent negotiation
in a distributed setting is core to a number of applications, particularly the emerging
semantic grid computing-based applications such as e-business. Current approaches to
trust fail to adequately address the challenges for trust in these emerging applications.
These approaches are either centralized on mechanisms such as digital certificates, and
thus are particularly vulnerable to attacks, or are not suitable for argumentation-based
negotiation in which agents use arguments to reason about trust.
Key words: Intelligent Agents, Negotiating Agents, Security, Trust.
1 Introduction
Research in agent communication protocols has received much attention in recent
years. In multi-agent systems (MAS), protocols are means of achieving meaningful
interactions between autonomous software agents. Agents use these protocols to guide
their interactions with each other. Such protocols describe the allowed communicative
acts that agents can perform when conversing and specify the rules governing a dialogue
between these agents.
Protocols for multi-agent interaction need to be flexible because of the open and
dynamic nature of MAS. Traditionally, these protocols are specified as finite state
machines or Petri nets without taking into account the agents’ autonomy. Therefore,
International Journal of Computer Science & ApplicationsVol. IV, No. II, pp. 1 - 21
© 2006 Technomathematics Research Foundation
Jamal Bentahar, John-Jules Ch. Meyer 1
they are not flexible enough to be used by agents expected to be autonomous in open
MAS [16]. This is because agents must respect the whole protocol
specification from beginning to end without reasoning about it. To solve this
problem, several researchers recently proposed protocols using dialogue games [6, 11,
15, 17]. Dialogue games are interactions between players, in which each player moves
by performing utterances according to a pre-defined set of rules. The flexibility is
achieved by combining different small games to construct complete and more complex
protocols. This combination can be specified using logical rules about which agents can
reason [11].
The idea of these logic-based dialogue game protocols is to enable agents to
effectively and flexibly participate in various interactions with each other. One such type
of interaction that is gaining increasing prominence in the agent community is
negotiation. Negotiation is a form of interaction in which a group of agents, with
conflicting interests, but a desire to cooperate, try to come to a mutually acceptable
agreement on the division of scarce resources. A particularly challenging problem in this
context is security. The problem of securing agent negotiation in a distributed setting is
core to a number of applications, particularly the emerging semantic grid computing-
based applications such as e-science (science that is enabled by the use of distributed
computing resources by end-user scientists) and e-business [9, 10].
The objective of this paper is to address this challenging issue by proposing a new
quantitative, probability-based trust model for negotiating agents that is efficient in
terms of computational complexity. The idea is that in order to share resources and
allow mutual access, involved agents in e-infrastructures need to establish a framework
of trust that establishes what they each expect of the other. Such a framework must
allow one entity to assume that a second entity will behave exactly as the first entity
expects. Current approaches to trust fail to adequately address the challenges for trust in
the emerging e-computing. These approaches are mostly centralized on mechanisms
such as digital certificates, and thus are particularly vulnerable to attacks. This is
because if some authorities who are trusted implicitly are compromised, then there is no
other check in the system. By contrast, in the decentralized approach we propose in this
paper and where the principals maintain trust in each other for more reasons than a
single certificate, any “invaders” can cause limited harm before being detected.
Recently, some decentralized trust models have been proposed [2, 3, 4, 7, 13, 19] (see
[18] for a survey). However, these models are not suitable for argumentation-based
negotiation, in which agents use their argumentation abilities as a reasoning mechanism.
In addition, some of these models do not consider the case where false information is
collected from other partners. This paper aims at overcoming these limits.
The rest of this paper is organized as follows. In Section 2, we present the negotiation
framework. In Section 3, we present our trustworthiness model. We highlight its
formulation, algorithmic description, and computational complexity. In Section 4, we
describe and discuss implementation issues. In Section 5, we compare our framework
to related work, and in Section 6, we conclude.
2 Negotiation Framework
In this section, we briefly present the dialogue game-based framework for negotiating
agents [11, 12]. These agents have a BDI architecture (Beliefs, Desires, and Intentions)
augmented with argumentation and logical and social reasoning. The architecture is
composed of three models: the mental model, the social model, and the reasoning model.
The mental model includes beliefs, desires, goals, etc. The social model captures social
concepts such as conventions, roles, etc. Social commitments made by agents when
negotiating are a significant component of this model because they reflect mental states.
Thus, agents must use their reasoning capabilities to reason about their mental states
before creating social commitments. The agent's reasoning capabilities are represented
by the reasoning model using an argumentation system. Agents also have general
knowledge, such as knowledge about the conversation subject. This architecture has the
advantage of taking into account the three important aspects of agent communication:
mental, social, and reasoning. It is motivated by the fact that conversation is a cognitive
and social activity, which requires a mechanism making it possible to reason about
mental states, about what other agents say (public aspects), and about the social aspects
(conventions, standards, obligations, etc).
The main idea of our negotiation framework is that agents use their argumentation
abilities in order to justify their negotiation stances, or influence other agents’
negotiation stances considering interacting preferences and utilities. Argumentation can
be abstractly defined as a dialectical process for the interaction of different arguments
for and against some conclusion. Our negotiation dialogue games are based on formal
dialectics in which arguments are used as a way of expressing decision-making [8, 14].
Generally, argumentation can help multiple agents to interact rationally, by giving and
receiving reasons for conclusions and decisions, within an enriching dialectical process
that aims at reaching mutually agreeable joint decisions. During negotiation, agents can
establish a common knowledge of each other’s commitments, find compromises, and
persuade each other to make commitments. In contrast to traditional approaches to
negotiation that are based on numerical values, argument-based negotiation is based on
logic.
An argumentation system is simply a set of arguments and a binary relation
representing the attack relation between the arguments. The following definitions
describe these notions formally. Here $\Gamma$ indicates a possibly inconsistent knowledge
base, $\vdash$ stands for classical inference, and $\equiv$ for logical equivalence.
Definition 1 (Argument). An argument is a pair $(H, h)$ where $h$ is a formula of a
logical language and $H$ a subset of $\Gamma$ such that: i) $H$ is consistent, ii) $H \vdash h$, and iii) $H$
is minimal, so no proper subset of $H$ satisfying both i and ii exists. $H$ is called the support of the
argument and $h$ its conclusion.
Definition 2 (Attack Relation). Let $(H_1, h_1)$, $(H_2, h_2)$ be two arguments. $(H_1, h_1)$ attacks
$(H_2, h_2)$ iff $h_1 \equiv \neg h_2$.
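Definitions 1 and 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: formulas are represented as plain strings, negation is written with a hypothetical `~` prefix, and logical equivalence is approximated by syntactic equality after removing a double negation. The consistency and minimality conditions of Definition 1 are not checked.

```python
from dataclasses import dataclass


def negate(formula: str) -> str:
    """Negate a literal; '~p' becomes 'p' so that double negations cancel."""
    return formula[1:] if formula.startswith("~") else "~" + formula


@dataclass(frozen=True)
class Argument:
    support: frozenset  # H: the formulas the argument relies on
    conclusion: str     # h: the formula the argument concludes


def attacks(a: Argument, b: Argument) -> bool:
    """Definition 2 (simplified): a attacks b iff a's conclusion negates b's."""
    return a.conclusion == negate(b.conclusion)


# Illustrative arguments; the formula names are made up for this example.
offer = Argument(frozenset({"cheap", "cheap -> buy"}), "buy")
counter = Argument(frozenset({"faulty", "faulty -> ~buy"}), "~buy")
```

Under this simplification the attack relation is symmetric, which matches Definition 2 since $h_1 \equiv \neg h_2$ implies $h_2 \equiv \neg h_1$.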
Negotiation dialogue games are specified using a set of logical rules. The allowed
communicative acts are: Make-Offer, Make-Counter-Offer, Accept, Refuse, Challenge,
Inform, Justify, and Attack. For example, according to a logical rule, before making an
offer h, the speaker agent must use its argumentation system to build an argument (H,
h). The idea is to be able to persuade the addressee agent about h, if he decides to refuse
the offer. On the other side, the addressee agent must use his own argumentation system
to select the answer he will give (Make-Counter-Offer, Accept, etc.).
3 Trustworthiness Model for Negotiating Agents
In recent years, several models of trust have been developed in the context of MAS [2,
3, 4, 13, 18, 19]. However, these models are not designed to trust argumentation-based
negotiating agents. Their formulations do not take into account the elements we use in
our negotiation approach (accepted and refused arguments, satisfied and violated
commitments). In addition, these models have some limitations regarding the inaccuracy
of the information collected from other agents. In this section, we present our
argumentation- and probability-based trust model for negotiating agents, which overcomes
some limitations of these models.
3.1 Formulation
Let A be the set of agents. We define an agent’s trustworthiness in a distributed setting
as a probability function as follows:
$$TRUST : A \times A \times D \to [0, 1]$$
This function associates to each agent a probability measure representing its
trustworthiness in the domain D according to another agent. To simplify the notation, we
omit the domain D from the TRUST function because we suppose that it is always known.
Let X be a random variable representing an agent’s trustworthiness. To evaluate the
trustworthiness of an agent Agb, an agent Aga uses the history of its interactions with
Agb. Equation 1 indicates how to calculate this trustworthiness as a probability measure
(number of successful outcomes / total number of possible outcomes).
$$TRUST(Ag_b)_{Ag_a} = \frac{Nb\_Arg(Ag_b)_{Ag_a} + Nb\_C(Ag_b)_{Ag_a}}{T\_Nb\_Arg(Ag_b)_{Ag_a} + T\_Nb\_C(Ag_b)_{Ag_a}} \tag{1}$$
$TRUST(Ag_b)_{Ag_a}$ indicates the trustworthiness of Agb according to Aga’s point of view.
$Nb\_Arg(Ag_b)_{Ag_a}$ is the number of Agb’s arguments that are accepted by Aga.
$Nb\_C(Ag_b)_{Ag_a}$ is the number of satisfied commitments made by Agb towards Aga.
$T\_Nb\_Arg(Ag_b)_{Ag_a}$ is the total number of Agb’s arguments towards Aga.
$T\_Nb\_C(Ag_b)_{Ag_a}$ is the total number of commitments made by Agb towards Aga.
All these commitments and arguments are related to the domain D. The basic idea is
that the trust degree of an agent can be induced according to how much information
acquired from him has been accepted as belief in the past. Using the number of accepted
arguments when computing the trust value reflects the agent’s knowledge level in the
domain D. Particularly, in the argumentation-based negotiation, the accepted arguments
capture the agent’s reputation level. If some argument conflicts in the domain D exist
between the two agents, this will affect the confidence they have about each other.
However, this is related only to the domain D, and not generalized to other domains in
which the two agents can trust each other. In a negotiation setting, the existence of
argument conflicts reflects a disagreement in the perception of the negotiation domain.
Because all the factors of Equation 1 are related to the past, the amount of information
involved is finite.
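As a minimal sketch of Equation 1, the local trust value can be computed from four counters kept in the interaction history. The function and parameter names are illustrative, and raising an error on an empty history is an assumption made here; the paper handles that case by consulting other agents.

```python
def local_trust(accepted_args: int, satisfied_commitments: int,
                total_args: int, total_commitments: int) -> float:
    """Equation 1: successful outcomes divided by all observed outcomes."""
    total = total_args + total_commitments
    if total == 0:
        # Assumption: with no direct history, local trust is undefined.
        raise ValueError("no interaction history with this agent")
    return (accepted_args + satisfied_commitments) / total
```

For example, 6 accepted arguments out of 8 and 3 satisfied commitments out of 4 give (6 + 3) / (8 + 4) = 0.75.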
Trustworthiness is a dynamic characteristic that changes according to the interactions
taking place between Aga and Agb. This supposes that Aga knows Agb. If not, or if the
number of interactions is not sufficient to determine this trustworthiness, the
consultation of other agents becomes necessary.
As proposed in [1, 2, 3], each agent has two kinds of beliefs when evaluating the
trustworthiness of another agent: local beliefs and total beliefs. Local beliefs are based
on the direct interactions between agents. Total beliefs are based on the combination of
the different testimonies of other agents that we call witnesses. In our model, local
beliefs are given by Equation 1. Total beliefs require studying how different probability
measures offered by witnesses can be combined. We deal with this aspect in the
following section.
3.2 Estimating Agent’s Trustworthiness
Let us suppose that an agent Aga wants to evaluate the trustworthiness of an agent Agb
with whom he has never (or not sufficiently) interacted before. This agent must ask agents he
knows to be trustworthy (we call these agents confidence agents). To determine whether
an agent is a confidence agent or not, a trustworthiness threshold w must be fixed. Thus, Agb will
be considered trustworthy by Aga iff $TRUST(Ag_b)_{Ag_a}$ is greater than or equal to w. Aga
attributes a trustworthiness measure to each confidence agent Agi. When he is consulted
by Aga, each confidence agent Agi provides a trustworthiness value for Agb if Agi knows
Agb. Confidence agents use their local beliefs to assess this value (Equation 1). Thus, the
problem consists in evaluating Agb’s trustworthiness using the trustworthiness values
transmitted by confidence agents. Fig. 1 illustrates this issue.
Fig. 1. Problem of measuring Agb’s trustworthiness by Aga. (The diagram shows Aga assigning
the values Trust(Ag1), Trust(Ag2), and Trust(Ag3) to the confidence agents Ag1, Ag2, and Ag3,
each of which reports a value Trust(Agb), while the value Trust(Agb) held by Aga is unknown.)
We notice that this problem cannot be formulated as a problem of conditional
probability. Consequently, it is not possible to use Bayes’ theorem or the total probability
theorem. The reason is that events in our problem are not mutually exclusive, whereas
this condition is necessary for these two theorems. Here an event is the fact that a
confidence agent is trustworthy. Events are not mutually exclusive
because the probability that two confidence agents are trustworthy at the same time is
not equal to 0.
To solve this problem, we must investigate the distribution of the random variable X
representing the trustworthiness of Agb. Since X takes only two values: 0 (the agent is
not trustworthy) or 1 (the agent is trustworthy), variable X follows a Bernoulli
distribution $B(1, p)$. According to this distribution, we have Equation 2:
$$E(X) = p \tag{2}$$
where E(X) is the expectation of the random variable X and p is the probability that the
agent is trustworthy. Thus, p is the probability that we seek. Therefore, it is enough to
evaluate the expectation E(X) to find $TRUST(Ag_b)_{Ag_a}$. However, this expectation is a
theoretical mean that we must estimate. To this end, we can use the Central Limit
Theorem (CLT) and the law of large numbers. The CLT states that whenever a random
sample of size n ($X_1, \ldots, X_n$) is taken from any distribution with mean $\mu$ and finite
standard deviation $\sigma$, then the sample mean $(X_1 + \ldots + X_n)/n$ will be approximately
normally distributed with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$. Generally,
and according to the law of large numbers, the expectation can be estimated by the
weighted arithmetic mean.
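The estimation idea can be checked numerically with a small simulation. The sketch below is only an illustration of the law of large numbers for a Bernoulli variable; the function name, the sample size, and the fixed seed are assumptions and are not part of the model.

```python
import random


def estimate_p(p: float, n: int, seed: int = 0) -> float:
    """Estimate E(X) = p for a Bernoulli(p) variable by the sample mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    samples = [1 if rng.random() < p else 0 for _ in range(n)]
    return sum(samples) / n
```

With a large n, the returned mean lies close to p, and its spread shrinks in proportion to $1/\sqrt{n}$, as the CLT predicts.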
Our random variable X is the weighted average of n independent random variables Xi
that correspond to Agb’s trustworthiness according to the point of view of confidence
agents Agi. These random variables follow the same distribution: the Bernoulli
distribution. They are also independent because the probability that Agb is trustworthy
according to an agent Agt is independent of the probability that this agent (Agb) is
trustworthy according to another agent Agr. Consequently, the random variable X
follows a normal distribution whose average is the weighted average of the expectations
of the independent random variables Xi. The mathematical estimation of expectation
E(X) is given by Equation 3.
$$M_0 = \frac{\sum_{i=1}^{n} TRUST(Ag_i)_{Ag_a} \cdot TRUST(Ag_b)_{Ag_i}}{\sum_{i=1}^{n} TRUST(Ag_i)_{Ag_a}} \tag{3}$$
The value $M_0$ represents an estimation of $TRUST(Ag_b)_{Ag_a}$. Equation 3 does not
take into account the number of interactions between confidence agents and Agb. This
number is an important factor because it makes it possible to promote information
coming from agents that know Agb better. In addition, another factor might be used to
reflect the timely relevance of transmitted information. This is because the agent’s
environment is dynamic and may change quickly. The idea is to promote recent
information and to give less emphasis to out-of-date information. Equation 4
gives us an estimation of $TRUST(Ag_b)_{Ag_a}$ if we take into account these factors and
suppose that all confidence agents have the same trustworthiness.
$$M_1 = \frac{\sum_{i=1}^{n} N(Ag_i)^{Ag_b} \cdot TR(Ag_i)^{Ag_b} \cdot TRUST(Ag_b)_{Ag_i}}{\sum_{i=1}^{n} N(Ag_i)^{Ag_b} \cdot TR(Ag_i)^{Ag_b}} \tag{4}$$
The factor $N(Ag_i)^{Ag_b}$ indicates the number of interactions between a confidence
agent Agi and Agb. This number can be identified by the total number of Agb’s
commitments and arguments. The factor $TR(Ag_i)^{Ag_b}$ represents the timely relevance
coefficient of the information transmitted by Agi about Agb’s trust (TR denotes Timely
Relevance). Note that removing $TR(Ag_i)^{Ag_b}$ from Equation 4 results in
the classical probability equation used to calculate the expectation E(X).
In our model, we assess the factor $TR(Ag_i)^{Ag_b}$ by using the function defined in
Equation 5. We call this function the Timely Relevance function.
$$TR(\Delta t_{Ag_i}^{Ag_b}) = e^{-\lambda \ln(\Delta t_{Ag_i}^{Ag_b})} \tag{5}$$
$\Delta t_{Ag_i}^{Ag_b}$ is the time difference between the current time and the time at which Agi updates
its information about Agb’s trust. $\lambda$ is an application-dependent coefficient. The
intuition behind this formula is to use a function decreasing with the time difference
(Fig. 2). Consequently, the more recent the information is, the higher is the timely
relevance coefficient. The function ln is used for computational reasons when dealing
with large numbers. Intuitively, the function used in Equation 5 reflects the reliability of
the transmitted information. Indeed, this function is similar to the well-known reliability
function for systems engineering ($R(t) = e^{-\lambda t}$).
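A minimal sketch of the timely relevance function of Equation 5 follows. The names `delta_t` and `lam` are illustrative, and the restriction to `delta_t >= 1` is an assumption made here so that the coefficient stays in (0, 1], consistent with the curve starting at 1.0; the paper does not state the time unit or the domain explicitly.

```python
import math


def timely_relevance(delta_t: float, lam: float) -> float:
    """Equation 5: exp(-lam * ln(delta_t)), which equals delta_t ** (-lam)."""
    if delta_t < 1:
        # Assumption: time differences are measured in units of at least 1,
        # otherwise ln(delta_t) is negative and the coefficient exceeds 1.
        raise ValueError("delta_t must be at least 1")
    return math.exp(-lam * math.log(delta_t))
```

Because $e^{-\lambda \ln x} = x^{-\lambda}$, the coefficient decays polynomially rather than exponentially in the time difference, so old testimonies are discounted more gently than under $R(t) = e^{-\lambda t}$.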
The combination of Equation 3 and Equation 4 gives us a good estimation of
$TRUST(Ag_b)_{Ag_a}$ (Equation 6) that takes into account the four most important factors:
(1) the trustworthiness of confidence agents according to the point of view of Aga; (2)
Agb’s trustworthiness according to the point of view of confidence agents; (3) the
number of interactions between confidence agents and Agb; and (4) the timely relevance
of information transmitted by confidence agents. The number of interactions is an important
factor because it makes it possible to highlight information coming from agents that know
Agb better.
Fig. 2. The timely relevance function $TR(\Delta t_{Ag_i}^{Ag_b}) = e^{-\lambda \ln(\Delta t_{Ag_i}^{Ag_b})}$, plotted against time $\Delta t_{Ag_i}^{Ag_b}$ (vertical axis from 0 to 1.0).
$$M_2 = \frac{\sum_{i=1}^{n} TRUST(Ag_i)_{Ag_a} \cdot N(Ag_i)^{Ag_b} \cdot TR(Ag_i)^{Ag_b} \cdot TRUST(Ag_b)_{Ag_i}}{\sum_{i=1}^{n} TRUST(Ag_i)_{Ag_a} \cdot N(Ag_i)^{Ag_b} \cdot TR(Ag_i)^{Ag_b}} \tag{6}$$
The way of combining Equation 3 ($M_0$) and Equation 4 ($M_1$) in the calculation of
Equation 6 ($M_2$) is justified by the fact that it reflects the mathematical expectation of
the random variable X representing Agb’s trustworthiness. This equation represents
the sum of the probability of each possible outcome multiplied by its payoff.
This equation shows how trust can be obtained by merging the trustworthiness values
transmitted by some mediators. This merging method takes into account the proportional
relevance of each trustworthiness value, rather than treating them equally.
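The weighted merge of Equation 6 can be sketched in a few lines of Python. The `Testimony` record and its field names are hypothetical and only serve to group the four factors per confidence agent; the error on a zero total weight is likewise an assumption for the case where no usable testimony exists.

```python
from dataclasses import dataclass


@dataclass
class Testimony:
    witness_trust: float   # trust of Aga in the confidence agent Agi
    interactions: int      # number of interactions between Agi and Agb
    relevance: float       # timely relevance of Agi's information
    reported_trust: float  # trust in Agb as reported by Agi


def combined_trust(testimonies: list) -> float:
    """Equation 6: average of reported trust values, weighted per witness."""
    weights = [t.witness_trust * t.interactions * t.relevance
               for t in testimonies]
    total = sum(weights)
    if total == 0:
        # Assumption: without any positively weighted testimony, no estimate.
        raise ValueError("no usable testimony")
    weighted = sum(w * t.reported_trust for w, t in zip(weights, testimonies))
    return weighted / total
```

Setting `interactions` to 1 and `relevance` to 1.0 for every testimony reduces this to Equation 3, and giving every witness the same `witness_trust` reduces it to Equation 4.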
According to Equation 6, we have:
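$$\forall Ag_i,\ TRUST(Ag_b)_{Ag_i} < w \;\Rightarrow\; M < w \cdot \frac{\sum_{i=1}^{n} TRUST(Ag_i)_{Ag_a} \cdot N(Ag_i)^{Ag_b} \cdot TR(Ag_i)^{Ag_b}}{\sum_{i=1}^{n} TRUST(Ag_i)_{Ag_a} \cdot N(Ag_i)^{Ag_b} \cdot TR(Ag_i)^{Ag_b}} \;\Rightarrow\; M < w$$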
Consequently, if all the trust values sent by the consulted agents about Agb are less than
the threshold w, then Agb cannot be considered trustworthy. Thus, Kyburg’s well-known
lottery paradox can never happen. The lottery paradox was designed to
demonstrate that three attractive principles governing rational acceptance are jointly
inconsistent, namely that:
1. it is rational to accept a proposition that is very likely true;
2. it is not rational to accept a proposition that you are aware is inconsistent; and
3. if it is rational to accept a proposition A and it is rational to accept another proposition
B, then it is rational to accept $A \wedge B$.
In our situation, we do not have such a contradiction.
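The bound derived from Equation 6 rests on M being a weighted average with non-negative weights. The step can be spelled out as follows, where $\omega_i$ is a shorthand introduced here (not used in the paper) for the weight that Equation 6 attaches to the testimony of Agi, assumed non-negative with a strictly positive sum:

```latex
\omega_i = TRUST(Ag_i)_{Ag_a} \cdot N(Ag_i)^{Ag_b} \cdot TR(Ag_i)^{Ag_b} \ge 0,
\qquad \sum_{i=1}^{n} \omega_i > 0

M = \frac{\sum_{i=1}^{n} \omega_i \, TRUST(Ag_b)_{Ag_i}}{\sum_{i=1}^{n} \omega_i}
  < \frac{\sum_{i=1}^{n} \omega_i \, w}{\sum_{i=1}^{n} \omega_i} = w
  \quad \text{whenever } TRUST(Ag_b)_{Ag_i} < w \text{ for all } i
```

The inequality is strict because at least one weight is strictly positive.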
To assess M, we need the trustworthiness of other agents. To deal with this issue, we
propose the notion of trust graph.
3.3 Trust Graph
In the previous section, we provided a solution to the trustworthiness combination
problem to evaluate the trustworthiness of a new agent (Agb). To simplify the problem,
we supposed that each consulted agent (a confidence agent) offers a trustworthiness
value of Agb if he knows him. If a confidence agent does not offer any trustworthiness
value, it will not be taken into account at the moment of the evaluation of Agb’s
trustworthiness by Aga. However, a confidence agent can, if he does not know Agb, offer
to Aga a set of agents who may know Agb. In this case, Aga will ask the proposed
agents. These agents also have a trustworthiness value according to the point of view of
the agent who proposed them. For this reason, Aga applies Equation 6 to assess the
trustworthiness values of these agents. These new values will be used to evaluate the
Agb’s trustworthiness. We can build a trust graph in order to deal with this issue. We
define such a graph as follows:
Definition 3 (Trust Graph). A trust graph is a directed and weighted graph. The nodes
are agents and an edge (Agi, Agj) means that agent Agi knows agent Agj. The weight of
the edge (Agi, Agj) is a pair (x, y) where x is the Agj’s trustworthiness according to the
point of view of Agi and y is the interaction number between Agi and Agj. The weight of
a node is the agent’s trustworthiness according to the point of view of the source agent.
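One way to represent the trust graph of Definition 3 is sketched below. The class and method names are hypothetical; node weights are left as `None` until the evaluation process assigns them, which is a representation choice made here rather than something the paper prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class TrustGraph:
    source: str  # the agent from whose point of view nodes are weighted
    node_weight: dict = field(default_factory=dict)  # agent -> trust or None
    edges: dict = field(default_factory=dict)        # (Agi, Agj) -> (x, y)

    def add_edge(self, ag_i: str, ag_j: str,
                 trust: float, interactions: int) -> None:
        """Record that ag_i knows ag_j, with the edge weight pair (x, y)."""
        self.edges[(ag_i, ag_j)] = (trust, interactions)
        self.node_weight.setdefault(ag_i, None)
        self.node_weight.setdefault(ag_j, None)

    def witnesses_of(self, agent: str) -> list:
        """Agents with an outgoing edge to `agent`, in insertion order."""
        return [i for (i, j) in self.edges if j == agent]
```

For instance, after `g = TrustGraph("Aga")`, `g.add_edge("Aga", "Ag1", 0.9, 12)` and `g.add_edge("Ag1", "Agb", 0.7, 5)`, the call `g.witnesses_of("Agb")` lists `Ag1` as the only witness, and the weight of the node for Agb is still unset.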
According to this definition, in order to determine the trustworthiness of the target
agent Agb, it is necessary to find the weight of the node representing this agent in the
graph. The graph is constructed while Aga receives answers from the consulted agents.
The evaluation process of the nodes starts once the whole graph is built. This means that
this process only starts when Aga has received all the answers from the consulted agents.
The process terminates when the node representing Agb is evaluated. The graph
construction and the node evaluation algorithms are given respectively by Algorithms 1
and 2.
Correctness of Algorithm 1: The construction of the trust graph is described as follows:
1- Agent Aga sends a request about the Agb’s trustworthiness to all the confidence
agents Agi. The nodes representing these agents (denoted Node(Agi)) are added to the
graph. Since the trustworthiness values of these agents are known, the weights of these
nodes (denoted Weight(Node(Agi))) can be evaluated. These weights are represented by
$TRUST(Ag_i)_{Ag_a}$ (i.e. by Agi’s trustworthiness according to the point of view of Aga).
2- Aga uses the primitive Send(Agi, Investigation(Agb)) in order to ask Agi to offer a
trustworthiness value for Agb. The answers of the agents Agi are stored, as they arrive, in
a variable denoted Str by Str = Receive(Agi). Str.Agents represents the set of agents
referred to by Agi. $Str.TRUST(Ag_j)_{Ag_i}$ is the trustworthiness value of an agent Agj
(belonging to the set Str.Agents) from the point of view of the agent who referred him
(i.e. Agi).
3- When a consulted agent answers by indicating a set of agents, these agents will
also be consulted. They can be regarded as potential witnesses. These witnesses are
added to a set called Potential_Witnesses. When a potential witness is consulted, he is
removed from the set.
4- To ensure that the evaluation process terminates, two limits are used: the maximum
number of agents to be consulted (Limit_Nbr_Visited_Agents) and the maximum number
of witnesses who must offer an answer (Limit_Nbr_Witnesses). The variable
Nbr_Additional_Agents is used to be sure that the first limit is respected when Aga starts
to receive the answers of the consulted agents.
Construct-Graph(Aga, Agb, Limit_Nbr_Visited_Agents, Limit_Nbr_Witnesses)
{
  Graph := ∅
  Nbr_Witnesses := 0
  Nbr_Visited_Agents := 0
  Nbr_Additional_Agents :=
      Max(0, Limit_Nbr_Visited_Agents - Size(Confidence(Aga)))
  Potential_Witnesses := Confidence(Aga)
  Add Node(Agb) to Graph
  While (Potential_Witnesses ≠ ∅) and (Nbr_Witnesses < Limit_Nbr_Witnesses) and
        (Nbr_Visited_Agents < Limit_Nbr_Visited_Agents) {
    n := Limit_Nbr_Visited_Agents - Nbr_Visited_Agents
    m := Limit_Nbr_Witnesses - Nbr_Witnesses
    For (i = 1, i ≤ min(n, m), i++) {
      Ag1 := Potential_Witnesses(i)
      If Node(Ag1) ∉ Graph Then Add Node(Ag1) to Graph
      If Ag1 ∈ Confidence(Aga) Then Weight(Node(Ag1)) := Trust(Ag1)Aga
      Send(Ag1, Investigation(Agb))
      Nbr_Visited_Agents := Nbr_Visited_Agents + 1 }
    For (i = 1, i ≤ min(n, m), i++) {
      Ag1 := Potential_Witnesses(1)
      Str := Receive(Ag1)
      Potential_Witnesses := Potential_Witnesses \ {Ag1}
      While (Str.Agents ≠ ∅) and (Nbr_Additional_Agents > 0) {
        If Str.Agents = {Agb} Then {
          Nbr_Witnesses := Nbr_Witnesses + 1
          Add Arc(Ag1, Agb)
          Weight1(Arc(Ag1, Agb)) := Str.TRUST(Agb)Ag1
          Weight2(Arc(Ag1, Agb)) := Str.n(Agb)Ag1
          Str.Agents := ∅ }
        Else {
          Nbr_Additional_Agents := Nbr_Additional_Agents - 1
          Ag2 := Str.Agents(1)
          Str.Agents := Str.Agents \ {Ag2}
          If Node(Ag2) ∉ Graph Then Add Node(Ag2) to Graph
          Weight1(Arc(Ag1, Ag2)) := Str.TRUST(Ag2)Ag1
          Weight2(Arc(Ag1, Ag2)) := Str.n(Ag2)Ag1
          Potential_Witnesses := Potential_Witnesses ∪ {Ag2} } } } }
}
Algorithm 1
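The construction loop can be sketched in runnable form. This is a simplified, synchronous reading of Algorithm 1, not the authors' implementation: agents are consulted one at a time instead of in send/receive batches, the Nbr_Additional_Agents counter is omitted, and the function name construct_graph, the callback ask and the answer format (a dictionary from agent name to a pair of trust value and interaction count) are assumptions made for this example.

```python
from collections import deque


def construct_graph(confidence, ask, target, limit_visited, limit_witnesses):
    """confidence: dict agent -> trust from the source agent's point of view.
    ask(agent): dict {target: (trust, n)} if the agent knows the target,
    otherwise {referred_agent: (trust, n), ...} (hypothetical answer format)."""
    nodes = {target: None}            # node -> weight (None = not evaluated)
    arcs = {}                         # (origin, destination) -> (trust, n)
    queue = deque(confidence)         # potential witnesses
    visited = set()
    witnesses = 0
    while queue and witnesses < limit_witnesses and len(visited) < limit_visited:
        ag1 = queue.popleft()
        if ag1 in visited:
            continue
        visited.add(ag1)
        if ag1 in confidence:
            nodes[ag1] = confidence[ag1]   # weight known to the source agent
        else:
            nodes.setdefault(ag1, None)
        answer = ask(ag1)
        if target in answer:
            # ag1 is a witness: it offers a value for the target directly.
            witnesses += 1
            arcs[(ag1, target)] = answer[target]
        else:
            # ag1 refers other agents, which become potential witnesses.
            for ag2, weight in answer.items():
                nodes.setdefault(ag2, None)
                arcs[(ag1, ag2)] = weight
                if ag2 not in visited:
                    queue.append(ag2)
    return nodes, arcs


confidence = {"Ag1": 0.9, "Ag2": 0.6}
answers = {
    "Ag1": {"Agb": (0.8, 10)},   # Ag1 knows Agb
    "Ag2": {"Ag3": (0.7, 4)},    # Ag2 refers Ag3
    "Ag3": {"Agb": (0.5, 2)},    # Ag3 knows Agb
}
nodes, arcs = construct_graph(
    confidence, lambda ag: answers.get(ag, {}), "Agb",
    limit_visited=10, limit_witnesses=5)
```

With these inputs the graph contains the arcs Ag1→Agb, Ag2→Ag3 and Ag3→Agb; lowering either limit cuts the exploration short, which is how the two limits guarantee termination.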
Evaluate-Node(Agy) {
  ∀ Arc(Agx, Agy)
    If Node(Agx) is not evaluated Then
      Evaluate-Node(Agx)
  m1 := 0, m2 := 0
  ∀ Arc(Agx, Agy) {
    m1 := m1 + Weight(Node(Agx)) * Weight(Arc(Agx, Agy))
    m2 := m2 + Weight(Node(Agx))
  }
  Weight(Node(Agy)) := m1 / m2
}
Algorithm 2
Correctness of Algorithm 2: The trustworthiness combination formula (Equation 5) is
used to evaluate the graph nodes. The weight of each node indicates the trustworthiness
value of the agent represented by the node. Such a weight is assessed using the weights
of the adjacent nodes. For example, if Arc(Agx, Agy) is an arc in the graph, then Agx
must be evaluated before Agy. Consequently, the evaluation algorithm is
recursive. The recursion terminates because the nodes of the set Confidence(Aga) are
already evaluated by Algorithm 1. Since the evaluation is done recursively, the main
program calls this algorithm with the agent Agb as parameter.
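The recursion can be sketched as follows. The sketch follows Algorithm 2 as printed, so the arc weight is a single trust value and the interaction count is ignored; it further assumes that the graph is acyclic and that every unevaluated node has at least one incoming arc. The function name evaluate_node and the dictionary layout are choices made for this example.

```python
def evaluate_node(agent, node_weight, arcs):
    """node_weight: dict agent -> weight, or None if not yet evaluated.
    arcs: dict (origin, destination) -> trust that origin assigns to destination."""
    if node_weight.get(agent) is not None:
        return node_weight[agent]              # already evaluated
    m1 = m2 = 0.0
    for (src, dst), trust in arcs.items():
        if dst != agent:
            continue
        w = evaluate_node(src, node_weight, arcs)   # predecessors first
        m1 += w * trust
        m2 += w
    if m2 == 0:
        raise ValueError(f"no weighted predecessor for {agent}")
    node_weight[agent] = m1 / m2               # weighted mean of offered values
    return node_weight[agent]


node_weight = {"Ag1": 0.9, "Ag2": 0.6, "Ag3": None, "Agb": None}
arcs = {("Ag1", "Agb"): 0.8, ("Ag2", "Ag3"): 0.7, ("Ag3", "Agb"): 0.5}
trust_agb = evaluate_node("Agb", node_weight, arcs)
```

Here Ag3 is evaluated first from Ag2's opinion (0.6 * 0.7 / 0.6 = 0.7), and Agb then receives (0.9 * 0.8 + 0.7 * 0.5) / (0.9 + 0.7) = 0.66875.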
Complexity Analysis. Our trustworthiness model is based on the construction of a trust
graph and on a recursive call to the function Evaluate-Node(Agy) to assess the weight of
all the nodes. Since each node is visited exactly once, there are n recursive calls, where n
is the number of nodes in the graph. To assess the weight of a node we need the weights
of its neighboring nodes and the weights of the input edges. Thus, the algorithm takes
time in $\Theta(n)$ for the recursive calls and time in $\Theta(a)$ to assess the agents’
trustworthiness, where a is the number of edges. The run time of the trustworthiness
algorithm is therefore in $\Theta(\max(a, n))$, i.e. linear in the size of the graph. Consequently,
our algorithm is an efficient one.
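The step from the two separate costs to the combined bound can be spelled out, taking the costs as tight bounds in n (nodes) and a (edges). The label $T(n, a)$ for the total running time is introduced here only for this derivation.

```latex
T(n, a) \in \Theta(n) + \Theta(a) = \Theta(n + a),
\qquad
\max(a, n) \;\le\; n + a \;\le\; 2\max(a, n)
\;\Longrightarrow\;
\Theta(n + a) = \Theta(\max(a, n)).
```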
4 Implementation
In this section we describe the implementation of our negotiation dialogue game
framework and the trustworthiness model using the Jack™ platform (The Agent
Oriented Software Group, 2004). We selected this language for three main reasons:
1- It is an agent-oriented language offering a framework for multi-agent system
development. This framework can support different agent models.
2- It is built on top of and fully integrated with the Java programming language. It
includes all components of Java and it offers specific extensions to implement agents’
behaviors.
3- It supports logical variables and cursors. A cursor is a representation of the results
of a query. It is an enumerator which provides query result enumeration by means of
re-binding the logical variables used in the query. These features are particularly helpful
when querying the state of an agent’s beliefs. Their semantics lies midway between that of
logic programming languages (with the addition of Java-style type checking) and that of
embedded SQL.
4.1 General Architecture
Our system consists of two types of agents: negotiating agents and trust model agents.
These agents are implemented as Jack™ agents, i.e. they inherit from the basic class
Jack™ Agent. Negotiating agents are agents that take part in the negotiation protocol.
Trust model agents are agents that can inform an agent about the trustworthiness of
another agent (Fig. 3). Agents must have knowledge and argumentation systems.
Agents’ knowledge is implemented using Jack™ data structures called beliefsets. The
argumentation systems are implemented as Java modules using a logic programming
paradigm. These modules use agents’ beliefsets to build arguments for or against certain
propositional formulae. The actions that agents perform on commitments or on their
contents are programmed as events. When an agent receives such an event, it seeks a
plan to handle it.
The trustworthiness model is implemented using the same principle (events + plans).
The requests sent by an agent about the trustworthiness of another agent are events and
the evaluations of agents’ trustworthiness are programmed in plans. The trust graph is
implemented as a Java data structure (a directed graph).
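The "events + plans" principle can be illustrated with a language-neutral sketch. It does not use the Jack™ API: the classes Agent, AskTrust and TrustValue and the function plan_ask_trust are hypothetical names chosen here, and the dispatch shown is only one plausible reading of "when an agent receives an event, it seeks a plan to handle it".

```python
from dataclasses import dataclass
from typing import Callable, Dict, Type


@dataclass
class AskTrust:            # event: "how trustworthy is `target`?"
    sender: str
    target: str


@dataclass
class TrustValue:          # event: answer carrying a trust value
    sender: str
    target: str
    value: float


class Agent:
    def __init__(self, name: str, table_trust: Dict[str, float]):
        self.name = name
        self.table_trust = table_trust           # stand-in for a beliefset
        self.plans: Dict[Type, Callable] = {}    # event type -> plan

    def register(self, event_type: Type, plan: Callable) -> None:
        self.plans[event_type] = plan

    def handle(self, event):
        """Seek the plan registered for this event type and run it."""
        plan = self.plans.get(type(event))
        if plan is None:
            raise LookupError(f"{self.name} has no plan for {type(event).__name__}")
        return plan(self, event)


def plan_ask_trust(agent: Agent, event: AskTrust) -> TrustValue:
    # Hypothetical plan: answer from the agent's local table of trust values.
    return TrustValue(agent.name, event.target, agent.table_trust[event.target])


witness = Agent("Trust_Ag1", {"Ag2": 0.8})
witness.register(AskTrust, plan_ask_trust)
reply = witness.handle(AskTrust(sender="Ag1", target="Ag2"))
```

Keeping plans outside the agent class mirrors the idea that the same event and plan definitions can be instantiated by any agent.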
[Figure: two Jack agent types. Negotiating Agent (instances Ag1, Ag2), linked by the negotiation protocol; Trust_Model_Agent (instances Trust_Ag1, …, Trust_Agn), involved in the interactions for determining Ag1’s trustworthiness.]
Fig. 3. The general architecture of the system
As Java classes, negotiating agents and trust model agents have private data called
Belief Data. For example, the different commitments and arguments that are made and
manipulated are given by a data structure called CAN implemented using tables and the
different actions expected by an agent in the context of a particular negotiation game are
given by a data structure (table) called data_expected_actions. The different agents’
trustworthiness values that an agent has are recorded in a data structure (table) called
data_trust. These data and their types are given in Fig. 4 and Fig. 5.
Fig. 4. Belief Data used in our prototype
4.2 Implementation of the Trustworthiness Model
The trustworthiness model is implemented by agents of type trust model agent. Each
agent of this type has a knowledge base implemented using Jack™ beliefsets. This
knowledge base, called table_trust, has the following structure: Agent_name,
Agent_trust, and Interaction_number. Thus, each agent holds, for every agent he knows,
a trustworthiness value and the number of times that he interacted with that agent. The
agents visited during the evaluation process and the agents added to the trust graph are
recorded in two Jack™ beliefsets called table_visited_agents and table_graph_trust.
The two limits used in Algorithm 1 (Limit_Nbr_Visited_Agents and
Limit_Nbr_Witnesses) and the trustworthiness threshold w are passed as parameters to
the Jack™ constructor of the original agent Aga that seeks to know if his interlocutor
Agb is trustworthy or not. This original agent is a negotiating agent.
Fig. 5. Beliefsets used in our prototype
The main steps of the evaluation process of Agb’s trustworthiness are implemented as
follows:
1- Respecting the two limits and the threshold w, Aga consults his knowledge base
data_trust of type table_trust and sends a request to his confidence agents Agi (i = 1, ...,
n) about Agb’s trustworthiness. The Jack™ primitive Send makes it possible to send the
request as a Jack™ message that we call Ask_Trust of MessageEvent type. Aga sends
this request starting with the confidence agents whose trustworthiness value is highest.
2- In order to answer Aga’s request, each agent Agi executes a Jack™ plan instance
that we call Plan_ev_Ask_Trust. Thus, using his knowledge base, each agent Agi offers
Aga a trustworthiness value for Agb if Agb is known by Agi. If not, Agi proposes a set of
agents that are confidence agents from his point of view, with their trustworthiness values
and the number of times that he interacted with them. In the first case, Agi sends to Aga a
Jack™ message that we call Trust_Value. In the second case, Agi sends a message that we call
Confidence_Agent. These two messages are of type MessageEvent.
3- When Aga receives the Trust_Value message, he executes a plan:
Plan_ev_Trust_Value. According to this plan, Aga adds to a graph structure called
graph_data_trust two pieces of information: 1) the agent Agi and his trustworthiness value as
a graph node; 2) the trustworthiness value that Agi offers for Agb and the number of times
that Agi interacted with Agb as an arc relating the node Agi and the node Agb. This first part
of the trust graph is kept until the end of the evaluation process of Agb’s
trustworthiness. When Aga receives the Confidence_Agent message, he executes another
plan: Plan_ev_Confidence_Agent. According to this plan, Aga adds to another graph
structure, graph_data_trust_sub_level, three pieces of information for each agent Agi: 1) the
agent Agi and his trustworthiness value as a sub-graph node; 2) the nodes Agj representing the
agents proposed by Agi; 3) for each agent Agj, the trustworthiness value that Agi assigns
to Agj and the number of times that Agi interacted with Agj as an arc between Agi and Agj.
This information, which constitutes a sub-graph of the trust graph, is used to evaluate
Agj’s trustworthiness values using Equation 5. These values are recorded in a new
structure: new_data_trust. The memory held by graph_data_trust_sub_level can thus be
released once Agj’s trustworthiness values are evaluated. This technique allows us to
decrease the space complexity of our algorithm.
4- Steps 1, 2, and 3 are applied again, substituting new_data_trust for data_trust,
until all the consulted agents offer a trustworthiness value for Agb or until one of the two
limits (Limit_Nbr_Visited_Agents or Limit_Nbr_Witnesses) is reached.
5- Agb’s trustworthiness value is evaluated using the information recorded in the
structure graph_data_trust by applying Equation 5.
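Steps 1 and 2 above can be sketched on plain tuples. The row layout (Agent_name, Agent_trust, Interaction_number) comes from the description of table_trust; the function names order_confidence_agents and answer_ask_trust and the tuple-based message format are assumptions made for this example and do not reproduce the Jack™ plans.

```python
def order_confidence_agents(table_trust, limit):
    """Step 1: pick the agents to consult, highest trust value first.
    table_trust: rows (agent_name, agent_trust, interaction_number)."""
    rows = sorted(table_trust, key=lambda row: row[1], reverse=True)
    return [name for name, _, _ in rows[:limit]]


def answer_ask_trust(table_trust, target):
    """Step 2: what a consulted agent replies to an Ask_Trust request."""
    for name, trust, interactions in table_trust:
        if name == target:
            # The target is known: offer a value and the interaction count.
            return ("Trust_Value", target, trust, interactions)
    # The target is unknown: propose the agent's own confidence agents.
    return ("Confidence_Agent", list(table_trust))


data_trust_aga = [("Ag1", 0.6, 3), ("Ag2", 0.9, 5)]
to_consult = order_confidence_agents(data_trust_aga, limit=2)

knows_target = answer_ask_trust([("Ag3", 0.7, 4), ("Agb", 0.8, 10)], "Agb")
refers_others = answer_ask_trust([("Ag3", 0.7, 4)], "Agb")
```

In this run Ag2 is consulted before Ag1, the first reply is a Trust_Value message for Agb, and the second is a Confidence_Agent message listing Ag3.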
The different events and plans implementing our trustworthiness model and the
negotiating agent constructor are illustrated by Fig. 6. Fig. 7 illustrates an example
generated by our prototype of the process allowing an agent Ag1 to assess the
trustworthiness of another agent Ag2. In this example, Ag2 is considered trustworthy by
Ag1 because its trustworthiness value (0.79) is higher than the threshold (0.7).
4.3 Implementation of the Negotiation Dialogue Games
In our system, agents’ knowledge bases contain propositional formulae and arguments.
These knowledge bases are implemented as Jack™ beliefsets. Beliefsets are used to
maintain an agent’s beliefs about the world. These beliefs are represented in a first-order
logic and tuple-based relational model. The logical consistency of the beliefs contained
in a beliefset is automatically maintained. The advantage of using beliefsets over normal
Java data structures is that beliefsets have been specifically designed to work within the
agent-oriented paradigm.
Our knowledge bases (KBs) contain two types of information: arguments and beliefs.
Arguments have the form ([Support], Conclusion), where Support is a set of
propositional formulae and Conclusion is a propositional formula. Beliefs have the form
([Belief], Belief) i.e. Support and Conclusion are identical. The meaning of the
propositional formulae (i.e. the ontology) is recorded in a beliefset called table_ontology
whose access is shared between the two agents. This beliefset has two fields:
Proposition and Meaning.
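The two kinds of knowledge-base entries can be sketched with a single record type. The class Argument, the helper belief and the sample formulae and meanings are hypothetical and serve only to illustrate the forms ([Support], Conclusion) and ([Belief], Belief).

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Argument:
    support: FrozenSet[str]   # set of propositional formulae
    conclusion: str           # a propositional formula

    @property
    def is_belief(self) -> bool:
        # A belief has the form ([Belief], Belief): the support is the
        # singleton containing the conclusion itself.
        return self.support == frozenset({self.conclusion})


def belief(formula: str) -> Argument:
    return Argument(frozenset({formula}), formula)


# Hypothetical rows of table_ontology: Proposition -> Meaning
table_ontology = {
    "p": "the price is acceptable",
    "q": "the offer is accepted",
}

kb = [
    belief("p"),                                  # ([p], p)
    Argument(frozenset({"p", "p -> q"}), "q"),    # ([p, p -> q], q)
]
```

Representing beliefs as degenerate arguments lets one table hold both kinds of entries.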
Agent communication is done by sending and receiving messages. These messages
are events that extend the basic Jack™ event class MessageEvent. MessageEvents
represent events that are used to communicate with other agents. Whenever an agent
needs to send a message to another agent, this information is packaged and sent as a
MessageEvent. A MessageEvent can be sent using the primitive: Send(Destination,
Message).
Fig. 6. Events, plans and the conversational agent constructor implementing the trustworthiness model
Our negotiation dialogue games are implemented as a set of events (MessageEvents)
and plans. A plan describes a sequence of actions that an agent can perform when an
event occurs. Whenever an event is posted and an agent chooses a task to handle it, the
first thing the agent does is to try to find a plan to handle the event. Plans are reasoning
methods describing what an agent should do when a given event occurs.
Each dialogue game corresponds to an event and a plan. These games are not
implemented within the agents’ program, but as event classes and plan classes that are
external to agents. Thus, each negotiating agent can instantiate these classes. An agent
Ag1 starts a dialogue game by generating an event and by sending it to his interlocutor
Ag2. Ag2 executes the plan corresponding to the received event and answers by
generating another event and by sending it to Ag1. Consequently, the two agents can
communicate by using the same protocol since they can instantiate the same classes
representing the events and the plans. For example, the event
Event_Attack_Commitment and the plan Plan_ev_Attack_commitment implement the
Attack game. The architecture of our negotiating agents is illustrated in Fig. 8.
Fig. 7. The screen shot of a trustworthiness evaluation process
5 Related Work
Recently, some online trust models have been developed (see [20] for a detailed survey).
The most widely used are those on eBay and Amazon Auctions. Both of these are
implemented as a centralized trust system so that their users can rate and learn about
each other’s reputation. For example, on eBay, trust values (or ratings) are +1, 0, or -1
and a user can rate its partner after an interaction. The ratings are stored centrally and
summed up to give an overall rating. Thus, reputation in these models is a global single
value. However, the model can be unreliable, particularly when some buyers do not
return ratings. In addition, these models are not suitable for applications in open MAS
such as agent negotiation because they are too simple in terms of their trust rating values
and the way they are aggregated.
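The summed-rating scheme described above is simple enough to state in a few lines, which also shows why a single global value is coarse. This is a sketch of the aggregation as described in the text, not of any actual marketplace implementation; the function name summed_reputation is chosen here.

```python
def summed_reputation(ratings):
    """ratings: iterable of +1, 0 or -1, one per rated interaction."""
    ratings = list(ratings)
    for r in ratings:
        if r not in (-1, 0, 1):
            raise ValueError(f"invalid rating: {r}")
    return sum(ratings)


# Two very different histories collapse to the same global value.
mixed = summed_reputation([1, 1, -1, 0, 1])
short = summed_reputation([1, 1])
```

Both histories yield 2, although the first contains a negative rating and the second does not, and neither says who issued the ratings.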
[Figure: Ag1 and Ag2 are Jack agents, each with a knowledge base (Jack beliefset) and an argumentation system (Java + logic programming). They share an ontology (Jack beliefset) and interact through dialogue games, each implemented as a Jack event and a Jack plan.]
Fig. 8. The architecture of the negotiating agents
Another centralized approach called SPORAS has been proposed by Zacharia and
Maes [7]. SPORAS does not store all the trust values, but rather updates the global
reputation value of an agent according to its most recent rating. The model uses a
learning function for the updating process so that the reputation value can reflect an
agent’s trust. In addition, it introduces a reliability measure based on the standard
deviations of the trust values. However, unlike our model, SPORAS deals with all
ratings equally without considering the different trust degrees. Consequently, it suffers
from rating noise. In addition, like eBay, SPORAS is a centralized approach, so it is not
suitable for open negotiation systems.
Broadly speaking, there are three main approaches to trust in open multi-agent
systems. The first approach is built on an agent’s direct experience of an interaction
partner. The second approach uses information provided by other agents [2, 3, 4]. The
third approach uses certified information provided by referees [9, 19]. In the first
approach, methods by which agents can learn and make decisions to deal with
trustworthy or untrustworthy agents should be considered. In the models based on the
second and the third approaches, agents should be able to reliably acquire and reason
about the transmitted information. In the third approach, agents should provide third-party
referees to testify to their previous performance. Because models of the first approach
are based only on a history of interactions, they perform poorly: agents
with no prior interaction history could trust dishonest agents until a sufficient number of
interactions has been accumulated.
Sabater [13] proposes a decentralized trust model called Regret. Unlike the first
approach models, Regret uses an evaluation technique based not only on an agent’s
direct experience of its partner’s reliability, but also on a witness reputation
component. In addition, trust values (called ratings) are dealt with according to their
recency relevance. Thus, old ratings are given less importance compared to new ones.
However, unlike our model, Regret does not show how witnesses can be located, and
thus, this component is of limited use. In addition, this model does not deal with the
possibility that an agent may lie about its rating of another agent, and because the ratings
are simply summed with equal weight, the technique can be sensitive to noise. In our model,
this issue is managed by considering the witnesses’ trust and by using a merging method that
takes into account the proportional relevance of each trustworthiness value, rather than
treating them equally (see Equation 6, Section III.B).
Yu and Singh [2, 3, 4] propose an approach based on social networks in which
agents, acting as witnesses, can transmit information about each other. The purpose is to
tackle the problem of retrieving ratings from a social network through the use of
referrals. Referrals are pointers to other sources of information similar to links that a
search engine would plough through to obtain a Web page. Through referrals, an agent
can provide another agent with alternative sources of information about a potential
interaction partner. The social network is presented using a referral network called
TrustNet. The trust graph we propose in this paper is similar to TrustNet; however, there
are several differences between our approach and Yu and Singh’s approach. Unlike Yu
and Singh’s approach, in which agents do not use any particular reasoning, our approach
is conceived to secure argumentation-based negotiation in which agents use
argumentation-based reasoning. In addition, Yu and Singh do not consider the
possibility that an agent may lie about its rating of another agent. They assume all
witnesses are totally honest. However, this problem of inaccurate reports is considered
in our approach by taking into account the trust of all the agents in the trust graph,
particularly the witnesses. Also, unlike our model, Yu and Singh’s model does not treat
the timely relevance of information, and all ratings are dealt with equally. Consequently,
this approach cannot manage the situation where the agents’ behavior changes.
Huynh, Jennings, and Shadbolt [19] tackle the problem of collecting the required
information by the evaluator itself to assess the trust of its partner, called the target. The
problem is due to the fact that witness-based models implicitly assume that
witnesses are willing to share their experiences. For this reason, they propose an
approach, called certified reputation, based not only on direct and indirect experiences,
but also on third-party references provided by the target agent itself. The idea is that the
target agent can present arguments about its reputation. These arguments are references
produced by the agents that have interacted with the target agents certifying its
credibility (the model proposed by Maximilien and Singh [5] uses the same idea). This
approach has the advantage of quickly producing an assessment of the target’s trust
because it only needs a small number of interactions and it does not require the
construction of a trust graph. However, this approach has some serious limitations.
Because the referees are proposed by the target agent, this agent can provide only
referees that will give positive ratings about it and avoid other referees, probably more
credible than the provided ones. Even if the provided agents are credible, their testimony
might not reflect the real picture of the target’s honesty. This approach can privilege
opportunistic agents, which are agents only credible with potential referees. For all these
reasons, this approach is not suitable for trusting negotiating agents. In addition, in this
approach, the evaluator agent should be able to evaluate the honesty of the referees
using a witness-based model. Consequently, a trust graph like the one proposed in this
paper could be used. This means that, in some situations, the target’s trust might not be
assessed without asking for witness agents.
6 Conclusion
The contribution of this paper is the proposition and the implementation of a new
probabilistic model to trust argumentation-based negotiating agents. The purpose of
such a model is to provide a secure environment for agent negotiation within multi-agent
systems. To our knowledge, this paper is the first work addressing the security issue of
argumentation-based negotiation in multi-agent settings. Our model has the advantage of
being computationally efficient and of combining the four most important factors: (1) the
trustworthiness of confidence agents; (2) the target’s trustworthiness according to the
point of view of confidence agents; (3) the number of interactions between confidence
agents and the target agent; and (4) the timely relevance of information transmitted by
confidence agents. The resulting model allows us to produce a comprehensive
assessment of the agents’ credibility in an argumentation-based negotiation setting.
Acknowledgements
We would like to thank the Natural Sciences and Engineering Research Council of
Canada (NSERC), le fonds québécois de la recherche sur la nature et les technologies
(NATEQ), and le fonds québécois de la recherche sur la société et la culture (FQRSC)
for their financial support. The first author is also supported in part by Concordia
University, Faculty of Engineering and Computer Science (Start-up Grant). Also, we
would like to thank the three anonymous reviewers for their interesting comments and
suggestions.
References
[1] A. Abdul-Rahman, and S. Hailes. Supporting trust in virtual communities. In Proceedings of
the 33rd Hawaii International Conference on System Sciences, 6, IEEE Computer Society
Press. 2000.
[2] B. Yu, and M. P. Singh. An evidential model of distributed reputation management. In
Proceedings of the First International Joint Conference on Autonomous Agents and Multi-
Agent Systems. ACM Press, pages 294–301, 2002.
[3] B. Yu, and M. P. Singh. Detecting deception in reputation management. In Proceedings of
the 2nd International Joint Conference on Autonomous Agents and Multi-Agent Systems.
ACM Press, pages 73-80, 2003.
[4] B. Yu, and M. P. Singh. Searching social networks. In Proceedings of the second
International Joint Conference on Autonomous Agents and Multi-Agent Systems. ACM Press,
pp. 65–72, 2003.
[5] E. M. Maximilien, and M. P. Singh. Reputation and endorsement for web services. ACM
SIGEcom Exchanges, 3(1):24-31, 2002.
[6] F. Sadri, F. Toni, and P. Torroni. Dialogues for negotiation: agent varieties and dialogue
sequences. In Proceedings of the International workshop on Agents, Theories, Architectures
and Languages. Lecture Notes in Artificial Intelligence (2333):405–421, 2001.
[7] G. Zacharia, and P. Maes. Trust management through reputation mechanisms. Applied
Artificial Intelligence, 14(9):881-908, 2000.
[8] H. Prakken. Relating protocols for dynamic dispute with logics for defeasible argumentation.
In Synthese (127):187-219, 2001.
[9] H. Skogsrud, B. Benatallah, and F. Casati. Model-driven trust negotiation for web services.
IEEE Internet Computing, 7(6):45-52, 2003.
[10] I. Foster, C. Kesselman, and S. Tuecke. The anatomy of the grid: enabling the scalable
virtual organization. The International Journal of High Performance Computing
Applications, 15(3), 200-222, 2001.
[11] J. Bentahar, B. Moulin, J-.J. Ch. Meyer, and B. Chaib-draa. A computational model for
conversation policies for agent communication. In J. Leite and P. Torroni editors,
Computational Logic in Multi-Agent Systems. Lecture Notes in Artificial Intelligence (3487):
178-195, 2005.
[12] J. Bentahar. A pragmatic and semantic unified framework for agent communication. Ph.D.
Thesis, Laval University, Canada, May 2005.
[13] J. Sabater. Trust and Reputation for Agent Societies. Ph.D. Thesis, Universitat Autònoma de
Barcelona, 2003.
[14] L. Amgoud, N. Maudet, S. Parsons. Modelling dialogues using argumentation. In Proceeding
of the 4th International Conference on Multi-Agent Systems, pages 31-38, 2000.
[15] M. Dastani, J. Hulstijn, and L. V. der Torre. Negotiation protocols and dialogue games. In
Proceedings of Belgium/Dutch Artificial Intelligence Conference, pages 13-20, 2000.
[16] N. Maudet, and B. Chaib-draa, Commitment-based and dialogue-game based protocols, new
trends in agent communication languages. In Knowledge Engineering Review. Cambridge
University Press, 17(2):157-179, 2002.
[17] P. McBurney, and S. Parsons. Games that agents play: A formal framework for dialogues
between autonomous agents. In Journal of Logic, Language, and Information, 11(3):1-22,
2002.
[18] S. D. Ramchurn, T. D. Huynh, and N. R. Jennings. Trust in multi-agent systems. The
Knowledge Engineering Review, 19(1):1-25, March 2004.
[19] T. D. Huynh, N. R. Jennings, and N. R. Shadbolt. An integrated trust and reputation model
for open multi-agent systems. Journal of Autonomous Agents and Multi-Agent Systems
AAMAS, 2006, 119-154.
[20] T. Grandison, and M. Sloman. A survey of trust in internet applications. IEEE
Communication Surveys & Tutorials, 3(4), 2000.
Protocol Management Systems as a
Middleware for Inter-Organizational
Workflow Coordination
Eric ANDONOFF, Wassim BOUAZIZ, Chihab HANACHI IRIT/UT1, 1 Place Anatole France
31042 Toulouse Cédex, France
{Eric.Andonoff, Wassim.Bouaziz, Chihab.Hanachi}@univ-tlse1.fr
Abstract
Interaction protocols are well identifiable and recurrent in Inter-Organizational
Workflow (IOW): they notably support finding partners, negotiation and contract
establishment between partners. So, it is useful to isolate these interaction protocols in
order to better study, design and implement them as specific entities so as to allow the
different Workflow Management Systems (WfMS) involved in an IOW to share
protocols and reuse them at run-time. Consequently, our aim in this paper is to propose
a Protocol Management System (PMS) architecture as a middleware to support the
design, the selection and the enactment of protocols on behalf of an IOW system. The
paper then gives a protocol meta-model on top of which this PMS should be built.
Finally, it presents a partial implementation of such a PMS combining agent and
semantic web technologies. While agent technology eases the cooperation between the
IOW components, semantic Web technology supports the definition, the sharing and
the selection of protocols.
Keywords: Inter-Organizational Workflow, Protocol, Protocol Management System,
Agent Technology.
1 Introduction
Inter-Organizational Workflow (IOW) is essential given the growing need for
organizations to cooperate and coordinate their activities in order to meet the new
demands of highly dynamic and open markets. The different organizations involved in
such cooperation must correlate their respective resources and skills, and coordinate
their respective business processes towards a common goal, corresponding to a value-
added service [1], [2].
A fundamental issue for IOW is the coordination of these different distributed,
heterogeneous and autonomous business processes in order to both support semantic
interoperability between the participating processes, and efficiently synchronize the
distributed execution of these processes.
Coordination in IOW raises several problems such as: (i) the definition of the
universe of discourse, without which it would not be possible to solve the various
semantic conflicts that are bound to occur between several autonomous and
International Journal of Computer Science & Applications, Vol. IV, No. II, pp. 23 - 41
© 2006 Technomathematics Research Foundation
Eric Andonoff, Wassim Bouaziz, Chihab Hanachi 23
heterogeneous workflows, (ii) finding partners able to realize a business/workflow
process, (iii) the negotiation of a workflow process between partners according to
criteria such as due time, price, quality of service, visibility of the process evolution or
way of doing it, (iv) the signature of contracts between partners and (v) the
synchronization of the distributed and concurrent execution of these different
processes.
Today, organizations are shifting from a tight/static case of cooperation (e.g. virtual
enterprises) to a loose/dynamic case (e.g. e-commerce) where dynamic relations and
alliances are established between organizations.
IOW coordination has been widely studied for the static case investigating issues
concerning formal specification of workflow interactions [1], [2], interoperability [3],
finding partners [4] and contract specification [4], [5]. Conversely, the dynamic case
has been less examined, and tools developed in the static case cannot be
straightforwardly adapted. Indeed, the context in which loose IOW is deployed has three
main specific and additional features:
- Flexibility which means that cooperation should be free from structural constraints
in order to maintain organizations' autonomy i.e. the ability to decide by themselves
the conditions of the cooperation: when, how and with whom.
- Openness which means that the set of partners involved in an IOW can evolve
through time, and that it is not necessarily fixed a priori but may be dynamically
decided at run-time in an opportunistic way.
- Scalability, mainly in the context of the Internet, that increases the complexity of
IOW coordination: its design, its enactment and its efficiency.
Therefore, IOW coordination must be revisited and adapted in this highly dynamic
context, notably finding partners, negotiation between partners, and contracts
enactment and monitoring. Besides, new issues must be considered such as the
definition of mechanisms for business process specification, discovery and matching...
This paper builds on the observation that, whatever the coordination problem
considered in loose IOW, it follows a recurrent schema: after an informal
interaction, the participating partners commit to following a strict interaction
protocol. This protocol governs the conversation through a set of rules that constrain
the behavior of the participating partners, assigns a role to each of them, and therefore
organizes their cooperation.
Since interaction protocols constitute well identifiable and recurrent coordination
patterns in loose IOW, it is useful to isolate them in order to better study, design and
implement them as first-class citizen entities so as to allow the different Workflow
Management Systems (WfMS) involved in a loose IOW to share them and reuse them
at run-time [6]. Following this abstraction implies the application of the principle of
separation of concerns, which allows the separation of individual and intrinsic
capabilities of each workflow system from what relates to loose IOW coordination.
This principle of separation of concerns is widely recognized as a good design
practice from a software engineering point of view [7] and has led to the advent of
new technologies in Information Systems as discussed in [8]. Indeed, [8] explains how
data, user interfaces and more recently business processes have been pushed out of
applications and led to specific software to handle them (respectively Database
Management Systems, User Interface Management Systems, and WfMS). Following
this perspective, we argue that interaction protocols have to be pushed out of IOW
applications, and that a Protocol Management System has to be defined.
Hence our objective is to specify a Protocol Management System (PMS) to support
interaction protocol-based coordination in loose IOW. Such a PMS provides a loose
IOW system with the three following services: the description of useful interaction
protocols for IOW coordination, their selection and their execution.
This paper also relies on the exploitation of agent and semantic Web approaches
viewed as enabling technologies to deal with the computing context in which IOW is
deployed.
The agent approach brings technical solutions and abstractions to deal with
distribution, autonomy and openness [9], which are inherent to loose IOW. This
approach also provides organizational concepts, such as groups, roles, commitments,
which are useful to structure and rule at a macro level the coordination of the different
partners involved in a loose IOW [10]. Using this technology, we also inherit
numerous concrete solutions to deal with coordination in multi-agent systems [11]
(middleware components, sophisticated interaction protocols).
The semantic Web approach facilitates communication and semantic interoperability
between organizations involved in a loose IOW. It provides means to
describe and access common business vocabulary and shared interaction protocols.
The contribution of this paper is fourfold. First, this paper defines a multi-agent
PMS architecture and shows how this architecture can be connected to any WfMS
whose architecture is compliant with the WfMC reference architecture. Second, it
proposes an organizational model, instance of the Agent Group Role meta-model
[12], to structure and rule the interactions between the components of the PMS' and
WfMS' architectures. Third, it provides a protocol meta-model specified with OWL to
constitute a shared ontology of coordination protocols. This meta-model contains the
necessary information to select an appropriate interaction protocol at run-time
according to the current coordination problem to be dealt with. This meta-model is
then refined to integrate a classification of interaction protocols devoted to loose IOW
coordination. Fourth, this paper presents a partial implementation of this work limited
to a matchmaker protocol that addresses the problem of finding partners.
The remainder of this paper is organized as follows. Section 2 presents the PMS
architecture stating the role of its components and explaining how they interact with
each other. Section 3 shows how to implement any WfMS engine connectable to a
PMS, while remaining compliant with the WfMC reference architecture. For reasons
related to homogeneity and flexibility, this engine is also provided with an agent-
based architecture. Section 4 gives an organizational model that structures and rules
the communication between the different agents involved in a loose IOW. Section 5
addresses engineering issues for protocols. It presents the protocol meta model and
also identifies, among multi-agent system interaction protocols, the ones which are
appropriate for the loose IOW context. Section 6 gives a brief overview of the
implementation of the matchmaker protocol, which addresses the problem of finding partners.
Finally, section 7 compares our contribution to related works and concludes the paper.
2 The Protocol Management System Architecture
The Protocol Management System (PMS) follows a multi-agent architecture
represented in figure 1. The PMS is composed of persistent agents (represented by
rectangles) and dynamic ones (represented by ellipses) created and killed at run-time.
The PMS architecture consists of two blocks of components, each one providing
specific services. The left hand side block supports the design and selection of
interaction/coordination protocols appropriate for loose IOW coordination, while the
right hand side block supports the execution of the selected protocols.
The Protocol Design and Selection block is organized around three agents (Protocol
Design Agent, Protocol Selection Agent and Protocol Launcher Agent) and two
knowledge sources (Domain Ontology and Coordination Protocol Ontology)
described below.
The Protocol Design Agent (PDA) is an agent that is responsible for the
specification of protocols. It proposes tools to allow users to graphically or textually
specify protocols i.e. their control structures, the actors involved in them and the
necessary information for their execution.
The Protocol Selection Agent (PSA) is an agent whose aim is to help a WfMS
requester to select the most appropriate coordination protocol according to the
objective of the conversation to be initiated (finding partners, negotiation between
partners…), and some specific criteria (quality of service, due time…) depending on
the type of the chosen coordination protocol. These criteria will be presented in
section 4.
The Protocol Launcher Agent (PLA) is an agent that creates and launches agents
called Moderators, each of which enacts a protocol.
The Domain Ontology source structures and records different business taxonomies
to solve semantic interoperability problems that are bound to occur in a loose IOW
context. Indeed, organizations involved in such an IOW must adopt a shared business
view through a common terminology before starting their cooperation.
The Coordination Protocol Ontology source describes and records protocol
descriptions, which may be queried by the PSA or used as models of behavior by
the PLA when creating moderators.
Figure 1: PMS Architecture (diagram not reproduced; it shows the Protocol Management
System with its Protocol Design and Selection block (Protocol Design Agent, Protocol
Selection Agent, Protocol Launcher Agent, Domain Ontology, Coordination Protocol
Ontology), its Protocol Execution block (Conversation Server, Moderators, Conversation
Database, Communication Act Databases), the Message Dispatcher and the Agent
Communication Channel)
The Protocol Execution block is composed of two types of agents: the conversation
server and as many moderators as the number of conversations in progress. It exploits
the Domain Ontology source, maintains the Conversation databases and handles a
Communication Act database for each moderator. We now describe the two types of
agents and how they interact with these databases and knowledge sources.
Each moderator manages a single conversation, which conforms to a given
coordination protocol, and a moderator has the same lifetime as the conversation it
manages. A moderator grants roles to agents and ensures that any communication act
that takes place in its conversation is compliant with the protocol's rules. It also
records all the communication acts within its conversation in the Communication Act
database. A moderator also exploits the Domain Ontology source to interact with the
agents involved in its conversation using an adequate vocabulary.
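The responsibilities of a moderator described above can be sketched as follows. This is
a minimal illustration only, not the PMS implementation: the class names, the
role-to-performative rule table and the in-memory list standing in for the Communication
Act database are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CommunicationAct:
    sender: str
    performative: str
    content: str


@dataclass
class Moderator:
    """Supervises one conversation (hypothetical sketch, not the authors' code)."""

    protocol_name: str
    # Hypothetical protocol rules: role name -> performatives that role may use.
    allowed: dict[str, set[str]]
    # agent name -> role granted by this moderator
    roles: dict[str, str] = field(default_factory=dict)
    # Stands in for the Communication Act database of this moderator.
    act_log: list[CommunicationAct] = field(default_factory=list)

    def grant_role(self, agent: str, role: str) -> None:
        """Grant a role of the protocol to an agent."""
        if role not in self.allowed:
            raise ValueError(f"unknown role {role!r} in {self.protocol_name}")
        self.roles[agent] = role

    def submit(self, act: CommunicationAct) -> bool:
        """Accept and record the act only if the sender's role permits it."""
        role = self.roles.get(act.sender)
        if role is None or act.performative not in self.allowed[role]:
            return False
        self.act_log.append(act)
        return True
```

In this sketch a rejected act is simply not recorded; how a real moderator reports a
protocol violation to the sender is left open.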
The Conversation Server publishes global information about each current
conversation and makes it accessible (such as its protocol, the identity of the moderator
supervising the conversation, its creation date, the requester that initiated the
conversation and the participants involved in the conversation). This information is
stored in the Conversation database. By allowing participants to get information about
the current (and past) conversations and be notified of new conversations [6], the
Conversation Server makes the interaction space explicit and available. Then, this
interaction space may be public or private according to the policies followed by
moderators. A database oriented view mechanism may be used to specify what is
public or private and for whom.
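The view mechanism mentioned above can be illustrated with a small sketch. It assumes,
purely for the example, that each conversation carries a single public/private flag and
that a private conversation is visible only to its initiator and participants; the
record fields mirror the information listed above, but all names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ConversationRecord:
    """Global information published about one conversation."""

    conversation_id: str
    protocol: str
    moderator: str
    created: date
    initiator: str
    participants: set[str] = field(default_factory=set)
    public: bool = True  # hypothetical flag reflecting the moderator's policy


class ConversationServer:
    """In-memory stand-in for the Conversation database and its views."""

    def __init__(self) -> None:
        self._records: dict[str, ConversationRecord] = {}

    def publish(self, record: ConversationRecord) -> None:
        """Register (or update) the information about a conversation."""
        self._records[record.conversation_id] = record

    def visible_to(self, agent: str) -> list[str]:
        """Return the sorted ids of the conversations the agent may see."""
        return sorted(
            cid
            for cid, rec in self._records.items()
            if rec.public or agent == rec.initiator or agent in rec.participants
        )
```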
In addition to the two main blocks of the PMS, two other components are needed:
the Message Dispatcher and the Agent Communication Channel.
The Message Dispatcher is an agent that interfaces the PMS with any WfMS. Thus,
any WfMS intending to invoke a PMS’s service only needs to know the Message
Dispatcher address.
Finally, the Agent Communication Channel is an agent that supports the transport
of interaction messages between all the agents of the PMS architecture.
3 Connection of a Workflow Management System to the
PMS Architecture
This section first presents the Workflow Management Coalition (WfMC) reference
architecture and then explains why this architecture is insufficient to support the
connection to the PMS architecture. This section finally explains how we revisit the
reference architecture with agents in order to support the connection to the PMS
architecture.
3.1 Insufficiency of the Reference Architecture
The reference architecture proposed by the WfMC [13] is defined by giving the role
of its software components and by specifying how they interact. The main component
of this architecture is the Workflow Enactment Service (WES) that manages the
execution of workflow processes and that interacts, on the one hand, with workflow
definition, execution and monitoring components and, on the other hand, with
external WESs. The five interfaces supporting the communication between the
different components are called Workflow API (WAPI). These interfaces are:
- Interface 1 with Process Definition Tools,
- Interface 2 with Workflow Client Applications,
- Interface 3 with Invoked Applications,
- Interface 4 with other WESs,
- Interface 5 with Administration and Monitoring Tools.
Compliance with the reference architecture matters because it ensures the
adaptability of our solution. Unfortunately, despite its advantages, this architecture is
insufficient in our context for two main reasons. First, in loose IOW, the WES must
not only manage the execution of workflow process instances but also drive several
other concurrent activities: finding partners, negotiation between partners, signature
of contracts and cooperation with other WESs, as workflow client or server. Second,
interfaces 3 and 4 are not appropriate to interact with the PMS.
3.2 Revisiting the Reference Architecture
Consequently, we revisit this WfMC reference architecture and more precisely the
WES of this architecture using the agent technology and introducing a new interface
and a specific component to support the connection with the PMS. However, our
proposition remains compliant with the WfMC architecture since the existing WAPI
(1 to 5) are not modified.
We have chosen the agent technology for several reasons. First, and as mentioned
in the introduction, this technology is well suited to the loose IOW context since it
provides natural abstractions to deal with distribution, heterogeneity and autonomy
which are inherent to this context. Therefore, each organization involved in a loose
IOW may be seen as an autonomous agent having the mission to coordinate with other
workflow agents, and acting on behalf of its organization. Second, as the agent
technology is also at the basis of the PMS architecture, it will be easier and more
homogeneous, using agent coordination techniques, to structure and rule the
interaction between the agents of both the PMS and WfMS architectures. Third, as
argued in [14], [15], [16], the use of this technology gives greater flexibility to
the modeled workflow processes: the agents implementing them can easily adapt to
their specific requirements.
Figure 2 presents the agent-based architecture we propose for the WES. This
architecture includes: (i) as many agents, called workflow agents, as the number of
workflow process instances being currently in progress, (ii) an agent manager in
charge of these agents, (iii) a connection server and a new interface, interface 6, that
help workflow agents to solicit the PMS for coordination purposes and finally (iv) an
agent communication channel to support the interaction between these agents.
Regarding the Workflow Agents, the idea is to implement each instance of a workflow
process (stored in the Workflow Process Database) as a software process, and to
encapsulate this process within an agent. Such a Workflow Agent includes a
workflow engine that, as the workflow process instance progresses, reads
the workflow definition and triggers the action(s) to be done according to its current
state. This Workflow Agent supports interface 3 with the applications that are
used to perform the pieces of work associated with the tasks of the process.
Figure 2: Workflow Enactment Service Revisited (diagram not reproduced; it shows the
Workflow Enactment Service with its Agent Manager, Connection Server, Workflow Agents,
Agent Communication Channel, Knowledge Database and Workflow Process Database, and the
interface labels "1, 2, 5", "6" and "3, 4")
The Agent Manager controls and monitors the running of Workflow Agents:
- Upon a request for a new instance of a workflow process, the Agent Manager
creates a new instance of the corresponding Workflow Agent type, initializes its
parameters according to the context, and launches the running of its workflow
engine.
- It ensures the persistency of Workflow Agents that execute long-running business
processes in which the performance of tasks is interleaved with periods of
inactivity.
- It coordinates Workflow Agents in their use of the local shared resources.
- It handles interfaces 1, 2 and 5 of the WfMS.
In the loose IOW context, workflow agents need to find external workflow agents
running in other organizations and able to contribute to the achievement of their goal.
Connecting them requires partner-finding, negotiation and contracting capacities, but also
maintaining knowledge about resources of the environment. The role of the
Connection Server is to manage this knowledge (stored in the Knowledge database)
and to help agents to connect to the partners they need. To do this, the connection
server interacts with the PMS and other WESs using a new interface, Interface 6. For
instance, this interface supports the communication between a connection server of a
WES and a moderator agent of the PMS (via the Message Dispatcher agent), but also
between two connection servers of two different WESs.
The agent manager and the connection server relieve workflow agents of technical
tasks concerning relations with their internal and external environments. Each agent
being in charge of a single workflow process instance can be more easily adapted to
specific requirements of this process. Indeed, each instance of a business process is a
specific case featuring distinctive characteristics with regard to its objectives,
constraints and environment. Beyond the common generic facilities for supporting
flexibility, a workflow agent is provided with two additional capabilities. First, it
includes its own definition of the process –what is to be performed is specific to it–
and second, it includes its own engine for interpreting this definition –how to perform
is also specific. Moreover, tailoring an agent to the distinctive features of its workflow
process takes place at its instantiation, but also occurs dynamically.
4 Organizational View on the PMS's Interactions
To specify and describe the functioning of the PMS, i.e. how its agents interact with
one another and with the agents of the WfMSs, we adopt an organizational view providing
macro-level coordination rules. The organizational model structures the
communication between the agents of the IOW and thus highlights the coordination of the
different organizations involved in a loose IOW when finding partners, negotiating
between partners… For that purpose, we use the Agent Group Role (AGR) meta-model
[17], which is a possible framework for defining the organizational dimension of a
multi-agent system (MAS) and which is particularly appropriate for loose IOW [10],
[15].
The remainder of this section first presents AGR. It then describes how, using this
meta-model, we structure and rule the interactions between the agents of the PMS' and
WfMS' architectures. Finally, it gives an AUML Sequence Diagram that illustrates
the exchange of messages between these agents.
4.1 The AGR Meta Model
According to AGR, the organization of a system is defined as a set of related groups,
agents and roles.
A group is a set of agents that also determines an interaction space: an agent may
communicate with another agent only if they belong to the same group. The cohesion
of the whole system is maintained by the fact that agents may belong to any number of
groups, so that the communication between two groups may be done by agents that
belong to both. Each group also defines a set of roles, and the group manager is the
specific role fulfilled by the agent that initiates the group. The membership of an agent
to a group requires that the group manager authorizes this agent to play some role, and
each role determines how the agents playing that role may interact with other agents.
So the whole behavior of the system is framed by the structure of the groups that may
be created and by the behaviors allowed to agents by the roles.
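The AGR rules stated above (communication only within a shared group, admission
authorized by the group manager, cohesion through agents that belong to several groups)
can be sketched as follows. The class, the role names and the choice of which agent
initiates each group are hypothetical and serve only to illustrate the rules.

```python
class Group:
    """An AGR group: a set of agents, a set of roles, and an interaction space."""

    MANAGER = "group-manager"

    def __init__(self, name: str, initiator: str, roles: set[str]) -> None:
        self.name = name
        self.roles = set(roles) | {self.MANAGER}
        # The agent that initiates the group fulfils the group-manager role.
        self.members: dict[str, set[str]] = {initiator: {self.MANAGER}}

    def admit(self, agent: str, role: str, authorized_by: str) -> None:
        """Let `agent` play `role`, provided a group manager authorizes it."""
        if self.MANAGER not in self.members.get(authorized_by, set()):
            raise PermissionError(f"{authorized_by} does not manage {self.name}")
        if role not in self.roles:
            raise ValueError(f"{role} is not a role of {self.name}")
        self.members.setdefault(agent, set()).add(role)


def can_communicate(a: str, b: str, groups: list[Group]) -> bool:
    """Two agents may communicate only if they belong to a common group."""
    return any(a in g.members and b in g.members for g in groups)
```

With two groups sharing one member, that member is the only path between the groups,
which is how AGR keeps each interaction space private while preserving the cohesion of
the whole system.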
AGR has three advantages that are useful in our context [17]. It eases the
security of the application being developed, since the interactions are organized within
a group, which is a private space open only to agents having the capacities and the
authorization to enter it. AGR also provides modularity by organizing the work and
the interaction space into small and manageable units thanks to the notions of role and
group. Openness is also facilitated, since AGR imposes no constraint on the internal
structure of agents.
4.2 The Organizational Model
This model, as shown in figure 3, is organized around the following components:
- Seven types of groups represented by ellipses that are: Participation, Request-
Creation-Coordination-Protocol, Coordination-Protocol-Selection, Coordination-
Protocol-Execution, Creation-Coordination-Protocol, Request-Participation-
Coordination-Protocol and Coordination-Protocol. In this figure, we have two
Participation groups (Requester-Participation and Provider-Participation).
- Eight types of agents represented by candles that are: Requester-Workflow-Agent,
Connection-Server, Provider-Agent-Manager, Message-Dispatcher, Protocol-
Selection-Agent, Protocol-Launcher-Agent, Moderator and Conversation-Server.
In this figure, we have two Connection-Server agents (Requester-Connection-
Server and Provider-Connection-Server).
- Nineteen roles since each agent plays a specific role within each group.
The communication between agents belonging to the different groups corresponds to
either internal communication supported by the agent communication channel (thin
arrows) or external communication supported by interface 6 (large arrows).
Figure 3: The Organizational Model (diagram not reproduced; it shows the groups
Requester-Participation, Provider-Participation, Request-Creation-Coordination-Protocol,
Coordination-Protocol-Selection, Coordination-Protocol-Execution,
Creation-Coordination-Protocol, Request-Participation-Coordination-Protocol and
Coordination-Protocol, and the agents Requester-Workflow-Agent,
Requester-Connection-Server, Provider-Connection-Server, Provider-Agent-Manager,
Message-Dispatcher, Protocol-Selection-Agent, Protocol-Launcher-Agent, Moderator and
Conversation-Server, with multiplicities on the membership links)
Let us detail now how each group operates. First, the Requester-Participation
group enables a requester workflow agent to solicit its connection server in order to
contact the PMS to deal with a coordination problem (finding partners, negotiation
between partners…). The Request-Creation-Coordination-Protocol group enables the
connection server to forward this request to the message dispatcher agent. The latter
then contacts, via the Coordination-Protocol-Selection group, the protocol selection
agent that helps the requester workflow agent to select a suitable coordination
protocol. The Coordination-Protocol-Execution group then enables the message
dispatcher to connect with the Protocol Launcher Agent (PLA) and ask it
to create a new conversation. More precisely,
the Creation-Coordination-Protocol group enables the PLA (i) to create a moderator
implementing the underlying new conversation coordination protocol, and (ii) to
inform the conversation server of a new conversation creation.
It is now possible for either a requester or a provider workflow agent to
participate in a coordination protocol or to get information about it. Indeed, the
requester and provider workflow agents' connection servers belonging to the Request-
Participation-Coordination-Protocol group solicit the message dispatcher to forward,
via the Coordination-Protocol group, their request to either the moderator (for
instance a submission of a new communication act) or the conversation server (for
instance an information request about a conversation).
4.3 Message Exchange Between Agents
The standard FIPA-ACL communication language [18] is used to support the
interaction, through message exchange, between the different agents involved in the
organizational model. FIPA-ACL offers a convenient set of performatives to deal with
the different coordination problems introduced before (for instance, agree, cancel,
refuse, request, inform, confirm… for finding partners, or propose, accept-proposal…
for negotiation between partners). Moreover, FIPA-ACL supports message exchange
between heterogeneous agents since (i) the language used to specify the message is
free and (ii) a message can refer to an ontology. The latter point is particularly useful
since FIPA-ACL messages can thereby refer to a domain ontology, which
can be used to solve semantic interoperability problems [10].
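To make the structure of such messages concrete, the sketch below models a FIPA-ACL
message as a plain record. The parameter names (performative, sender, receiver, content,
language, ontology, conversation-id, reply-with, in-reply-to) are standard FIPA-ACL
message parameters; the Python class itself, its reply helper and the restriction to the
performatives quoted above are assumptions made for illustration and do not correspond
to any particular agent platform.

```python
from dataclasses import dataclass
from typing import Optional

# Only the performatives named in the text above; FIPA-ACL defines more.
PERFORMATIVES = {
    "agree", "cancel", "refuse", "request", "inform", "confirm",
    "propose", "accept-proposal",
}


@dataclass(frozen=True)
class AclMessage:
    """Simplified FIPA-ACL message (illustrative subset of the parameters)."""

    performative: str
    sender: str
    receiver: str
    content: str
    language: Optional[str] = None         # content language, freely chosen
    ontology: Optional[str] = None         # e.g. a reference to a domain ontology
    conversation_id: Optional[str] = None
    reply_with: Optional[str] = None
    in_reply_to: Optional[str] = None

    def __post_init__(self) -> None:
        if self.performative not in PERFORMATIVES:
            raise ValueError(f"unsupported performative: {self.performative}")

    def reply(self, performative: str, content: str) -> "AclMessage":
        """Build an answer that stays in the same conversation and ontology."""
        return AclMessage(
            performative=performative,
            sender=self.receiver,
            receiver=self.sender,
            content=content,
            language=self.language,
            ontology=self.ontology,
            conversation_id=self.conversation_id,
            in_reply_to=self.reply_with,
        )
```

For instance, a request(conversation-creation) sent by a connection server to the
message dispatcher can be answered with an inform whose in-reply-to parameter matches
the reply-with of the request.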
Figure 4 below gives an AUML Sequence Diagram that illustrates this message
exchange during the creation of a conversation. This sequence diagram only shows
the FIPA-ACL interactions between agents belonging to the Requester-Participation,
Request-Creation-Coordination-Protocol, Coordination-Protocol-Selection,
Coordination-Protocol-Execution and Creation-Coordination-Protocol groups.
Figure 4: AUML Sequence Diagram illustrating Message Exchange (diagram not reproduced;
its lifelines are the Requester Workflow Agent, Requester Connection Server, Message
Dispatcher, Protocol Selection Agent, Protocol Launcher Agent, Moderator and
Conversation Server, and the messages shown are request(conversation-creation),
inform/in-reply-to(protocols), request(protocol-creation), confirm(protocol-creation),
inform(conversation-creation) and inform/in-reply-to(conversation-creation))
5 Models For Engineering Protocols
The design, selection and enactment of protocols by the PMS require a precise and
non-ambiguous definition of what we call a protocol. In this section, we try to give
such a definition distinguishing three different abstraction levels for protocol
description. We also provide a meta-model for protocols and a protocol classification
model taking into account only interaction protocols devoted to the loose IOW
context. These models have been specified with OWL [19] using Protege-2000
software. In addition to these OWL models, we also give in this paper their equivalent
UML models for readability and popularization reasons. Finally, this section shows
how the protocol classification can be exploited by the PMS.
5.1 The Three Levels of a Protocol Definition
We distinguish three abstraction levels for protocol description:
- The first level is a concrete or execution level. At this level, we find conversations
(occurrence or instance of a protocol) between participating workflow agents, each
one playing a role in the conversation. For instance, a requester workflow agent (A)
plays the role of manager and evaluates workflow processes offered by several
provider workflow agents (B,C,D…), chooses one of them (D) and delegates the
workflow process to be done to the winner (D).
- The second level describes protocol specifications (protocols for short) defining the
rules that a class of conversations must obey. Each conversation is an instance
of a single protocol specification, but several conversations referring to the same
protocol may be running simultaneously. As an example of protocol specification,
we can consider the specification of the Contract Net Protocol (CNP) [20] stating
that: i) CNP involves a single manager and any number of contractors, and the
manager cannot be a contractor, ii) at the beginning the manager, who has a
task/workflow to subcontract, submits the specification of the task to the contractor
agents, waits for their bids and then selects the contractor with the best bid, and
iii) at the end the task is subcontracted to the winner.
- The third and most abstract level corresponds to the meta-model of a protocol, i.e.
the invariant structure shared by all the protocols.
This paper will focus on this last level, which is independent of any protocol and
any target technology. This level is described in the next section.
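As an illustration of the second level, the three rules of the Contract Net Protocol
quoted above can be sketched in a few lines. This is a simplified, synchronous sketch
under stated assumptions: bids are numeric costs, the "best" bid is taken to be the
lowest one, and a contractor declines by returning None. None of these choices comes
from the CNP specification itself, and the function name is hypothetical.

```python
from typing import Callable, Optional


def contract_net(
    manager: str,
    task: str,
    contractors: list[str],
    bid_of: Callable[[str, str], Optional[float]],
) -> Optional[str]:
    """Run one simplified Contract Net round and return the winner, if any."""
    # Rule i): a single manager, who cannot also be a contractor.
    if manager in contractors:
        raise ValueError("the manager cannot be a contractor")

    # Rule ii): submit the task to every contractor and collect their bids.
    bids: dict[str, float] = {}
    for contractor in contractors:
        bid = bid_of(contractor, task)
        if bid is not None:  # None means the contractor declines to bid
            bids[contractor] = bid
    if not bids:
        return None

    # Rule iii): the task is subcontracted to the winner.
    # Assumption: the best bid is the lowest cost.
    return min(bids, key=bids.get)
```

With contractors B, C and D bidding 120, 100 and 80 respectively, the function returns
D, which matches the first-level example in which agent A delegates the workflow
process to D.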
5.2 Protocol Meta Model
The protocol meta-model is given by figure 5. Figure 5a just gives an extract of this
meta-model expressed in OWL, which is the language we have used for its
implementation. For readability reasons, Figure 5b gives a complete UML
representation of this meta-model. In the following, we only comment on this UML
representation.
This meta-model is built around three core and inter-related concepts: Intervention
Types, Roles and Protocols. We describe them in detail in the following.
Intervention Types abstract elementary conversation acts. An Intervention Type is
defined by its name, an action with its input and output parameters, and it also
includes intervention behavioral constraints such as its level of priority (with regard
to other intervention types) or how many times it may be performed during one
conversation. The PreCondition and PostCondition associations represent the
conditions that must hold before, and be fulfilled after, an occurrence of the
Intervention Type. They contribute to the statement of the behavioral constraints of
the protocol since they define an order relation between interventions, and thus the
control
<owl:Class rdf:ID="Protocol">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:maxCardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:maxCardinality>
      <owl:onProperty>
        <owl:ObjectProperty rdf:ID="HasTerminalState"/>
      </owl:onProperty>
    </owl:Restriction>
  </rdfs:subClassOf>
  …
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
  …
  <owl:DatatypeProperty rdf:ID="Name">
    <rdfs:domain rdf:resource="#Protocol"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  </owl:DatatypeProperty>
  …
  <owl:DatatypeProperty rdf:ID="Description">
    <rdfs:domain rdf:resource="#Protocol"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  </owl:DatatypeProperty>
  ...
</owl:Class>
Figure 5a: The Protocol Meta Model: OWL representation
Figure 5b: The Protocol Meta Model: UML representation (diagram not reproduced; it shows
the classes Protocol (Name, Description, Casting const., Behavioural const., Parameters),
Role (Name, Skill, Casting const.), InterventionType (Name, Action, Input, Output,
Behavioural const.), State, Condition and Ontology, linked by the associations Creator,
Member, PermissionToPerform, PreCondition, PostCondition, InitialState, TerminalState
and Business Domain, with their multiplicities)
structure of the protocol, that is, the sequences of interventions that may occur in the
course of a conversation. An Intervention Type belongs to the Role linked to it, and
only agents playing that Role may perform it.
A Protocol includes a set of member Roles, one of which is allowed to initiate a new
conversation following the Protocol, and a set of Intervention Types linked to these
Roles. The Business Domain link gives access to all the Ontologies to which
a Protocol may refer. The InitialState link describes the configuration required to start
a conversation, while the TerminalState link establishes the configuration that must be
reached for a conversation to be considered complete. A Protocol may include a
Description in natural language, either for documentation purposes or as information
about the Protocol placed at the agents' disposal. The protocol casting constraints
attribute records constraints that involve several Roles and cannot be defined on
individual Roles, such as the total number of agents that are allowed to take part in a
conversation. Similarly, the protocol behavioral constraints attribute records
constraints that cannot be defined on individual Intervention Types, such as the
total number of interventions that may occur in the course of a conversation. Some of
these casting or behavioral constraints can involve Parameters of the Protocol, that is,
properties whose value is not fixed by the Protocol but is specific to each conversation
and set by the conversation's initiator.
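As an illustration only, the three core concepts of the meta-model can be sketched as plain Python data classes. This is a minimal sketch, not the authors' implementation: the field names, the free-text encoding of conditions, and the max_agents and max_occurrences fields are assumptions chosen to mirror the attributes and associations of figure 5b.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Condition:
    """A requirement on the conversation state (free-text encoding is an assumption)."""
    expression: str


@dataclass
class InterventionType:
    """An elementary conversation act."""
    name: str
    action: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    precondition: Optional[Condition] = None   # must hold before an occurrence
    postcondition: Optional[Condition] = None  # must be fulfilled after it
    priority: int = 0                          # behavioural constraint
    max_occurrences: Optional[int] = None      # behavioural constraint (assumed encoding)


@dataclass
class Role:
    name: str
    skill: str
    interventions: List[InterventionType] = field(default_factory=list)


@dataclass
class Protocol:
    name: str
    creator: Role                      # the role allowed to initiate a conversation
    members: List[Role]
    initial_state: Condition
    terminal_state: Optional[Condition] = None
    description: str = ""
    parameters: Dict[str, object] = field(default_factory=dict)
    max_agents: Optional[int] = None   # example of a protocol-level casting constraint

    def __post_init__(self) -> None:
        if self.creator not in self.members:
            raise ValueError("the creator must be one of the member roles")

    def may_perform(self, role_name: str, intervention_name: str) -> bool:
        """Only agents playing the owning role may perform an intervention."""
        return any(
            role.name == role_name
            and any(i.name == intervention_name for i in role.interventions)
            for role in self.members
        )


# Hypothetical instance, loosely inspired by the Matchmaker protocol.
requester = Role("Requester", "specifies a workflow service",
                 [InterventionType("request", "submit_request")])
provider = Role("Provider", "offers a workflow service",
                [InterventionType("advertise", "publish_offer")])
matchmaker = Protocol("Matchmaker", creator=requester,
                      members=[requester, provider],
                      initial_state=Condition("no request submitted"))
```

The check in __post_init__ reflects the statement that the initiating role is one of the member roles; the other constraints are left as plain data because the text does not fix how they are evaluated.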
5.3 Protocol Classification
In addition to the previous meta-model, we also need additional information to better
handle protocols. We propose a classification of protocols to distinguish them and to
easily select them according to the objective to be reached: finding partners,
negotiation between partners or contract establishment support. This classification
takes into account only the protocols useful in the IOW context. The appropriateness
of these protocols to loose IOW was discussed in a previous paper [15].
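[UML class diagram. Protocol, linked to Ontology through the Business Domain association (1..* to 1..*), is specialized into FindingPartner, Negotiation and Contract. Leaf classes shown: Matchmaker (P2PExecution, ComparisonMode, QualityRate, NumberOfProviders), Broker, Argumentation, Auction, ContractNet, Heuristic and ContractTemplate.]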
Figure 6: The Protocol Classification
In the UML schema of figure 6, the Protocol class of figure 5b is refined. Protocols
are specialized, according to their objective, into three abstract classes:
FindingPartner, Negotiation and Contract. These abstract classes are in turn
recursively specialized until the leaves are reached, which correspond to protocol
models such as Matchmaker, Broker, Argumentation, Heuristic, ContractTemplate…
Each of these protocols may be used in one of the coordination steps of IOW.
At this last level, however, each class features new attributes that are specific to it
and possibly related to the quality of service.
If we consider for example the Matchmaker protocol (the only one developed in
figure 6), we can make the following observations. First, it differs from the broker by
the fact that it implements a Peer-to-Peer execution mode with the provider: the
identity of the provider is known and a direct link is established between the requester
and the provider at run time. Second, one may be interested in its comparison mode
(plug in, exact, and/or subsume) [21], a quality rating to compare it to other matchmakers,
the minimum number of providers it is able to manage…
5.4 Engineering Protocols
In this paper, we focus on three activities: the design, the selection and the execution
of protocols. Let us give some hints on how to drive these three activities thanks to the
models previously given.
Protocol Design. The design process consists of producing protocol specifications
compliant with the meta-model presented in section 5.2 and refined in section 5.3.
Protocol Selection. Given a query specifying the objective of a Protocol and the
value of some additional attributes (P2PExecution, NumberOfProviders…), the
Protocol Selection Agent follows a three-step process to give an answer. First, it
navigates the hierarchy of protocols and selects possible protocol models. For
instance, if the objective of a requester is to find a partner, the requester will obtain a
set of FindingPartner protocols. Second, the values of the other attributes can be used
to select a specific protocol model. For instance, if a query specifies a P2P execution,
the requester will obtain the Matchmaker protocol model; otherwise, the Broker will be
suggested. Third, the Protocol Selection Agent must check that the behavioral and casting
constraints of the selected protocol model are compatible with the requester's
requirements and capabilities. This process may be iterative and interactive, to guide
the requester in its choice when several solutions remain possible.
In our implementation, queries are expressed in nRQL [22], a language for
querying OWL or RDF ontologies. We use the following nRQL query
syntax: “(retrieve (?<x>) (?<x> <class>) [(<condition>)])”, where <x> is the
variable corresponding to the query result, <class> is the name of the queried class,
and <condition>, which may be omitted, is the condition that <x> must satisfy. As an
example, considering our protocol classification (see figure 6), the query
“(retrieve (?x) (?x |FindingPartner|) (= P2PExecution True))” returns the
Matchmaker protocol.
Protocol Execution. Once a protocol model has been selected, the Moderator
Launcher Agent creates and launches a moderator agent to play that protocol.
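The three-step selection process described under Protocol Selection can be sketched as follows. This is a hedged illustration in Python rather than the nRQL-based implementation: the ProtocolModel class, the LIBRARY contents, the max_agents constraint and the attribute values (including the parent class assigned to each leaf) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ProtocolModel:
    name: str
    objective: str                              # FindingPartner, Negotiation or Contract
    attributes: Dict[str, Any] = field(default_factory=dict)
    max_agents: Optional[int] = None            # hypothetical casting constraint


# Hypothetical protocol library; all values are illustrative assumptions.
LIBRARY: List[ProtocolModel] = [
    ProtocolModel("Matchmaker", "FindingPartner", {"P2PExecution": True}, max_agents=50),
    ProtocolModel("Broker", "FindingPartner", {"P2PExecution": False}, max_agents=10),
    ProtocolModel("Argumentation", "Negotiation"),
    ProtocolModel("ContractTemplate", "Contract"),
]


def select(objective: str,
           wanted: Optional[Dict[str, Any]] = None,
           agents: Optional[int] = None,
           library: List[ProtocolModel] = LIBRARY) -> List[ProtocolModel]:
    """Return the protocol models compatible with a requester's query."""
    wanted = wanted or {}
    # Step 1: navigate the hierarchy and keep the models with the right objective.
    candidates = [m for m in library if m.objective == objective]
    # Step 2: use the values of the other attributes to narrow the choice.
    candidates = [m for m in candidates
                  if all(m.attributes.get(k) == v for k, v in wanted.items())]
    # Step 3: check the casting constraints against the requester's needs.
    if agents is not None:
        candidates = [m for m in candidates
                      if m.max_agents is None or agents <= m.max_agents]
    return candidates


p2p_models = [m.name for m in select("FindingPartner", {"P2PExecution": True})]
```

When several candidates remain after the third step, the returned list is what an interactive refinement with the requester would start from.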
6 Implementation
This work gives rise to an implementation project, called ProMediaFlow (Protocols
for Mediating Workflow), aiming at developing the whole PMS architecture. The first
step of this project is to evaluate its feasibility. For this purpose, we have developed a
simulator, limited for the moment to a subset of components and considering a single
protocol. In fact, regarding the PMS architecture, we have implemented only the
Protocol Execution block and considered the Matchmaker protocol. Regarding the
WfMS architecture, we have implemented both requester and provider Workflow
Agents and their corresponding Agent Managers and Connection Servers.
This work has been implemented with the MadKit platform [23], which permits the
development of distributed applications using multi-agent principles. MadKit also
supports the organizational AGR meta-model and thus permits a straightforward
implementation of our organizational model (presented in section 4).
In the current version of our implementation, the system is only able to produce
moderators playing the Matchmaker protocol. More precisely, the Matchmaker is able
to compare offers and requests of workflow services (i.e. a service implementing a
workflow process) by establishing flexible comparisons (exact, plug in, subsume) [21]
based on a domain ontology. For that purpose, we have also included in the simulator
facilities to describe workflow services. As presented in [24], these offers
and requests are described using both the Petri Net with Object (PNO) formalism [25]
and the OWL-S language [26]: the PNO formalism is used to design, analyze,
simulate, check and validate workflow services, which are then automatically translated
into OWL-S specifications to be published through the Matchmaker.
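To make the three comparison modes concrete, here is a minimal sketch of ontology-based matching. It assumes the usual reading of these modes in the matchmaking literature (exact: the offered and requested concepts are the same; plug in: the offered concept is more general than the requested one; subsume: the offered concept is more specific). The toy ontology, the function names and the offers are hypothetical, and the sketch does not reproduce the matching performed on OWL-S descriptions in the simulator.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical topic ontology, given as child -> parent links.
PARENT: Dict[str, str] = {
    "Databases": "ComputerScience",
    "XMLStorage": "Databases",
    "MultiAgentSystems": "ComputerScience",
}


def ancestors(concept: str) -> Set[str]:
    """All concepts that are strictly more general than `concept`."""
    found = set()
    while concept in PARENT:
        concept = PARENT[concept]
        found.add(concept)
    return found


def match(requested: str, offered: str) -> str:
    """Degree of match between a requested and an offered concept."""
    if requested == offered:
        return "exact"
    if offered in ancestors(requested):
        return "plug in"      # the offer is more general than the request
    if requested in ancestors(offered):
        return "subsume"      # the offer is more specific than the request
    return "fail"


def matchmake(requested: str,
              offers: Dict[str, str],
              modes: Tuple[str, ...] = ("exact", "plug in", "subsume")
              ) -> List[Tuple[str, str]]:
    """Return (provider, degree) pairs whose degree is among the accepted modes."""
    results = []
    for provider, offered in offers.items():
        degree = match(requested, offered)
        if degree in modes:
            results.append((provider, degree))
    return results


offers = {"reviewerA": "Databases", "reviewerB": "XMLStorage",
          "reviewerC": "MultiAgentSystems", "reviewerD": "ComputerScience"}
found = matchmake("Databases", offers)
```

Restricting modes to ("exact",) avoids walking the ontology for non-identical concepts, which is one intuitive reason an exact comparison can be cheaper than the other two.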
Figure 7 below shows some screenshots of our implementation. More precisely, this
figure shows the following four agents: a Requester Workflow Agent, a Provider
Workflow Agent, the Conversation Server, and a Moderator, which is a Matchmaker.
Windows 1 and 2 represent agents belonging to the WfMS (respectively a requester
workflow agent and a provider workflow agent), while windows 3 and 4 represent agents
belonging to the PMS (respectively the conversation server and a moderator agent
playing the Matchmaker protocol).
Figure 7: Overview of the Implementation
The top left window (number 1) represents a requester workflow agent belonging to
a WfMS involved in a loose IOW and implementing a workflow process instance. As
shown by window 1, this agent can: (i) specify a requested workflow service
(Specification menu), (ii) advertise this specification through the Matchmaker
(Submission menu), (iii) visualize the providers offering services corresponding to the
specification (Visualization menu), (iv) establish peer-to-peer connections with one of
these providers (Contact menu), and, (v) launch the execution of this requested
service (WorkSpace menu). In a symmetric way, the bottom left window (number 2)
represents an agent playing the role of a workflow service provider, and a set of
menus enables it to manage its offered services. As shown by window 2, the
Specification menu includes three commands to support PNO and OWL-S
specifications. The first command permits the specification of a workflow service
using the PNO formalism, the second one permits the analysis and validation of the
specified PNO and the third one derives the corresponding OWL-S specifications.
The top right window (number 3) represents the Conversation Server agent
belonging to the PMS architecture. As shown by window 3, this agent can: (i) display
all the conversations (Conversations menu and List of conversations option), (ii)
select a given conversation (Conversations menu and Select a conversation option),
and, (iii) obtain all the information related to this selected conversation -its moderator,
its initiator, its participants…- (Conversations menu and Detail of a conversation
option). Finally, the bottom right window (number 4) represents a Moderator agent
playing the Matchmaker protocol. This agent can: (i) display all the conversation acts
of the supervised conversation (Communication act menu and List of acts option), (ii)
select a given conversation act (Communication act menu and Select an act option),
(iii) obtain all the information related to the selected conversation act, such as its
sender and its content (Communication act menu and Detail of an act option), and
(iv) display the underlying domain ontology (Ontology domain menu).
Let us now give some indications about the efficiency of our implementation, and
more precisely of our Matchmaker protocol. Let us first remark that, in the IOW
context, workflows are most often long-term processes which may last for several
days. Consequently, we do not need an instantaneous matchmaking process. However,
in order to demonstrate the feasibility of our proposal, we have measured the matchmaker
processing time according to the parameters (notably the number of offers and the
comparison modes) that appear in the complexity formulae of the matchmaking
process [27]. The measurements were carried out in the context of the conference review
system case study, where a PC chair subcontracts to the matchmaker the search for
reviewers able to evaluate papers. The reviewers are assumed to have submitted
their capabilities, in terms of topics, to the matchmaker. Papers are classified according to
topics belonging to an OWL ontology. Figure 8 shows the matchmaker's average
processing time for a number of offers (services) varying from 100 to 1000,
for the plug in, exact and subsume comparison modes.
As illustrated in figure 8, and in accordance with the complexity study of [27], the
exact mode is the most efficient in terms of processing time. To better analyze the
Matchmaker behavior, we also plan to measure its recall and precision rates, which are
well known in Information Retrieval [28].
[Line chart: average processing time of the Matchmaker in milliseconds (y-axis, 400 to 540) against the number of offers (services) (x-axis: 100, 500, 1000), with one series per comparison mode: Exact, Plug In, Subsume.]
Figure 8: Quantitative Evaluation of the Matchmaker
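The recall and precision rates mentioned above as planned measurements follow the standard Information Retrieval definitions: precision is the share of returned providers that are relevant, and recall is the share of relevant providers that are returned. The sketch below is only an illustration of these two formulas; the reviewer identifiers are invented, and returning 0.0 for an empty set is a convention chosen here.

```python
from typing import Set, Tuple


def precision_recall(returned: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision and recall of a set of returned providers."""
    hits = len(returned & relevant)
    # Convention (an assumption): an empty denominator yields 0.0.
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical outcome of one matchmaking request in the review case study.
returned = {"r1", "r2", "r3", "r4"}   # reviewers proposed by the matchmaker
relevant = {"r1", "r2", "r5"}         # reviewers actually able to evaluate the paper
precision, recall = precision_recall(returned, relevant)
```

Comparing these two rates across the exact, plug in and subsume modes would complement the timing results, since a faster mode may return fewer of the relevant providers.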
7 Discussion and Conclusion
This paper has proposed a multi-agent protocol management system architecture by
considering protocols as first-class entities. This architecture has three advantages:
- Easy design and development of loose IOW systems. The principle of separation of
concerns improves the understandability of the system, and therefore eases and speeds
up its design, development and maintenance. Following this practice,
protocols may be thought of as autonomous and reusable components that may be
specified and verified independently of any specific workflow system behavior.
Doing so, we can focus on specific coordination issues and build a library of
easy-to-use and reliable protocols. The same holds for workflow systems, since it
becomes possible to focus on their specific capabilities, regardless of how they
interact with others.
- Workflow agent reusability. As a consequence of introducing moderators, workflow
agents and agent managers no longer interact with each other directly.
Instead, they just have to agree on the type of conversation they want to adopt.
Consequently, agents impose fewer requirements on their partners, and they are
loosely coupled. Hence, heterogeneous workflow agents can be easily composed.
- Visibility of conversations. Some conversations are private and concern only the
protagonists. Most of the time, however, meeting the goal of the conversation depends to
a certain extent on its transparency, i.e. on the possibility given to the agents to be
informed of the conversation progress. With a good knowledge of the state of
the conversation and the rules of the protocol, each agent can act
relevantly and in accordance with its objectives. In the absence of a Moderator, the
information concerning a conversation is distributed among the participating
agents. Thus, it is difficult to know which agent holds the information sought,
even supposing that this agent has been designed to supply this
information. By contrast, moderators support the transparency of conversations.
Related works may be considered according to two complementary points of view:
the loose IOW coordination and the protocol engineering points of view.
Regarding loose IOW coordination, it can be noted that most of the works ([10],
[14], [15], [16], [29], [30], [31]) adopt one of the following multi-agent coordination
protocols: organizational structuring, matchmaking, negotiation, contracting...
However, these works neither address all the protocols at the same time nor follow the
principle of separation of concerns. As a consequence, they do not address protocol
engineering issues, notably protocol design, selection and execution.
Regarding protocol engineering, the most significant works are [6] and [32]. [6] has
inspired our software engineering approach, but it differs from our work since it
addresses neither workflow applications nor protocol classification and
selection issues. [32] is a complementary work to ours. It deals with
protocol engineering issues focusing particularly on the notion of protocol
compatibility, equivalence and replaceability. In fact, this work aims at defining a
protocol algebra which can be very useful to our PMS. At design time, it can be used
to establish links between protocols, while, at run-time, these links can be used by the
Protocol Selection Agent.
Finally, we must also mention works which address both loose IOW coordination
and protocol engineering issues ([33], [34]). [33] isolates protocols and represents
them as ontologies, but this work only considers negotiation protocols in e-commerce.
[34] considers interaction protocols as design abstractions for business processes,
provides an ontology of protocols and investigates protocol composition issues.
[34] differs from our work since it only focuses on the description and the
composition aspects. Neither of these works ([33], [34]) proposes means for
classifying, retrieving and executing protocols, or addresses architectural issues as
we did through the PMS definition.
Regarding future work, we plan to complete the implementation of the PMS. The
current implementation is limited to the Matchmaker protocol, so we intend to
design and implement other coordination protocols (broker, argumentation, heuristic).
We also believe that an adequate combination of this work and the comparison
mechanisms of protocols presented in [32] could improve the classification of
protocols in our PMS. Finally, to provide a uniform access to our PMS, moderator
agents playing protocols could be encapsulated inside Web services. Doing so, we
follow the promising Service Oriented Multiagent Architecture recently introduced in
[35], that provides on the one hand flexibility to our workflow, inherited from agent
technology, and on the other hand interoperability and openness, thanks to the use of a
Service Oriented Architecture.
References
[1] W. van der Aalst: Inter-Organizational Workflows: An Approach Based on
Message Sequence Charts and Petri Nets. Systems Analysis, Modeling and
Simulation, 34(3), 1999, pp. 335-367.
[2] M. Divitini, C. Hanachi, C. Sibertin-Blanc: Inter Organizational Workflows for
Enterprise Coordination. In: A. Omicini, F. Zambonelli, M. Klusch, and R.
Tolksdorf (eds): Coordination of Internet Agents, Springer-Verlag, 2001, pp.
46-77.
[3] F. Casati, A. Discenza: Supporting Workflow Cooperation within and across
Organizations. 15th Int. Symposium on Applied Computing, Como, Italy, March
2000, pp. 196-202.
[4] P. Grefen, K. Aberer, Y. Hoffner, H. Ludwig: CrossFlow: Cross-Organizational
Workflow Management in Dynamic Virtual Enterprises. Computer Systems
Science and Engineering, 15(5), 2000, pp. 277-290.
[5] O. Perrin, F. Wynen, J. Bitcheva, C. Godart: A Model to Support Collaborative
Work in Virtual Enterprises. 1st Int. Conference on Business Process
Management, Eindhoven, The Netherlands, June 2003, pp. 104-119.
[6] C. Hanachi, C. Sibertin-Blanc: Protocol Moderators as active Middle-Agents in
Multi-Agent Systems. Autonomous Agents and Multi-Agent Systems, 8(3),
March 2004, pp. 131-164.
[7] C. Ghezzi, M. Jazayeri, D. Mandrioli, Fundamentals of Software Engineering.
Prentice-Hall International, 1991.
[8] W. van der Aalst: The Application of Petri-Nets to Workflow Management.
Circuit, Systems and Computers, 8(1), February 1998, pp. 21-66.
[9] M. Genesereth, S. Ketchpel: Software Agents. Communications of the ACM,
37(7), July 1994, pp. 48-53.
[10] E. Andonoff, L. Bouzguenda, C. Hanachi, C. Sibertin-Blanc: Finding Partners in
the Coordination of Loose Inter-Organizational Workflow. 6th Int. Conference
on the Design of Cooperative Systems, Hyeres (France), May 2004, pp. 147-162.
[11] N. Jennings, P. Faratin, A. Lomuscio, S. Parsons, C. Sierra, M. Wooldridge:
Automated Negotiation: Prospects, Methods and Challenges. Group Decision
and Negotiation, 10(2), 2001, pp. 199-215.
[12] J. Ferber, O. Gutknecht: A Meta-Model for the Analysis and Design of
Organizations in Multi-Agent Systems, 3rd Int. Conference on Multi-Agents
Systems, Paris, France, July 1998, pp. 128-135.
[13] The Workflow Management Coalition, The Workflow Reference Model.
Technical Report WfMC-TC-1003, November 1994.
[14] L. Zeng, A. Ngu, B. Benatallah, M. O’Dell: An Agent-Based Approach for
Supporting Cross-Enterprise Workflows. 12th Australian Database Conference,
Bond, Australia, February 2001, pp. 123-130.
[15] E. Andonoff, L. Bouzguenda: Agent-Based Negotiation between Partners in
Loose Inter-Organizational Workflow. 5th Int. Conference on Intelligent Agent
Technology, Compiègne, France, September 2005, pp. 619-625.
[16] P. Buhler, J. Vidal: Towards Adaptive Workflow Enactment Using Multi Agent
Systems. Information Technology and Management, 6(1), 2005, pp. 61-87.
[17] J. Ferber, O. Gutknecht, M. Fabien: From Agents to Organizations: an
Organizational View of Multi-Agent Systems. 4th Int. Workshop on Agent-
Oriented Software Engineering, Melbourne, Australia, July 2003, pp. 214-230.
[18] Foundation for Intelligent Physical Agents, FIPA ACL Message Structure
Specification. December 2002, Available at http://www.fipa.org/specs/fipa00061/
[19] World Wide Web Consortium, OWL Web Ontology Language. Available at
http://www.w3.org/TR/owl-features/
[20] Foundation for Intelligent Physical Agents, FIPA Contract Net Interaction
Protocol Specification. December 2002, Available at
http://www.fipa.org/specs/fipa00029/
[21] A. Ankolekar, M. Burstein, J. Hobbs, O. Lassila, D. Martin, D. McDermott, S.
McIlraith, S. Narayanan, M. Paolucci, T. Payne, K. Sycara: DAML-S: Web
Service Description for the Semantic Web. 1st Int. Semantic Web Conference,
Sardinia, Italy, June 2002, pp. 348-363.
[22] V. Haarslev, R. Moeller, M. Wessel: Querying the Semantic Web with Racer
and nRQL. 3rd Int. Workshop on Applications of Description Logic, Ulm,
Germany, September 2004.
[23] J. Ferber, O. Gutknecht: The MadKit Project: a Multi-Agent Development Kit.
Available at http://www.madkit.org
[24] E. Andonoff, L. Bouzguenda, C. Hanachi: Specifying Web Workflow Services for
Finding Partners in the context of Loose Inter-Organizational Workflow. 3rd Int.
Conference on Business Process Management, Nancy, France, September 2005,
pp. 120-136.
[25] C. Sibertin-Blanc: High Level Petri Nets with Data Structure. 6th Int. Workshop
on Petri Nets and Applications, Espoo, Finland, June 1985.
[26] World Wide Web Consortium, OWL-S: Semantic Markup for Web Services.
Available at http://www.w3.org/Submission/OWL-S/
[27] L. Bouzguenda: Agent-based Coordination for Loose Inter-Organizational
Workflow. PhD dissertation (in French), University of Toulouse 1, May 2006.
[28] M. Klusch, B. Fries, K. Sycara: Automated Semantic Web Service Discovery with
OWLS-MX. 5th Int. Joint Conference on Autonomous Agents and Multiagent
Systems, Hakodate, Japan, May 2006, pp. 915-922.
[29] C. Aberg, C. Lambrix, N. Shahmehri: An Agent-Based Framework for
Integrating Workflows and Web Services, 14th Int. Workshop on Enabling
Technologies: Infrastructure for Collaborative Enterprises, Linköping, Sweden,
June 2005, pp. 27-32.
[30] A. Negri, A. Poggi, M. Tamaiuolo, P. Turci: Agents for e-Business Applications.
5th Int. Conference on Autonomous Agents and Multi-Agent Systems, Hakodate,
Japan, May 2004, pp. 907-914.
[31] M. Sensoy, P. Yolum: A Context-Aware Approach for Service Selection using
Ontologies. 5th Int. Conference on Autonomous Agents and Multi-Agent
Systems, Hakodate, Japan, May 2004, pp. 931-938.
[32] B. Benatallah, F. Casati, F. Toumani: Representing, Analyzing and Managing
Web Service Protocols. Data and Knowledge Engineering, 58(3), September
2006, pp. 327-357.
[33] V. Tamma, S. Phelps, I. Dickinson, M. Wooldridge: Ontologies for Supporting
Negotiation in E-Commerce. Engineering Applications of Artificial Intelligence,
18(2), March 2005, pp. 223-236.
[34] N. Desai, A. Mallya, A. Chopra, M. Singh: Interaction Protocol as Design
Abstractions for Business Processes. IEEE Transactions on Software
Engineering, December 2005, pp. 1015-1027.
[35] M. Huhns, M. Singh, M. Burstein, K. Decker, E. Durfee, T. Finin, L. Gasser,
H. Goradia, N. Jennings, K. Lakkaraju, H. Nakashima, H. van Dyke Parunak, J.
Rosenschein, A. Ruvinski, G. Sukthankar, S. Swarup, K. Sycara, M. Tambe, T.
Wagner, L. Zavala: Research Directions for Service-Oriented Multiagent
Systems. IEEE Internet Computing, 9(6), December 2005, pp. 65-70.
Adaptability of Methods for Processing XML Data using
Relational Databases – the State of the Art and Open Problems¹
Irena Mlynkova and Jaroslav Pokorny
Charles University, Faculty of Mathematics and Physics,
Department of Software Engineering, Malostranske nam. 25, 118 00 Prague 1, Czech Republic
{irena.mlynkova,jaroslav.pokorny}@mff.cuni.cz
Abstract
As XML technologies have become a standard for data representation, it is inevitable to propose and implement efficient techniques for managing XML data. A natural alternative is to exploit tools and functions offered by (object-)relational database systems. Unfortunately, this approach has many objectors, especially due to the inefficiency caused by structural differences between XML data and relations. On the other hand, (object-)relational databases have a long theoretical and practical history and represent a mature technology, i.e. they can offer properties that no native XML database can offer yet. In this paper we study techniques that make it possible to improve XML processing based on relational databases, so-called adaptive or flexible mapping methods. We provide an overview of existing approaches, we classify their main features, and sum up the most important findings and characteristics. Finally, we discuss possible improvements and corresponding key problems.
Keywords: XML-to-relational mapping, state of the art, adaptability, relational databases
1 Introduction
Without any doubt, XML [9] is currently one of the most popular formats for data representation. It is well-defined, easy-to-use, and involves various recommendations such as languages for structural specification, transformation, querying, updating, etc. This popularity has prompted an enormous effort to propose more efficient methods and tools for managing and processing XML data. The four most popular ones are methods which store XML data in a file system, methods which store and process XML data using an (object-)relational database system, methods which exploit a pure object-oriented approach, and native methods that use special indices, numbering schemes [17], and/or data structures [12] proposed particularly for the tree structure of XML data.
Naturally, each of the approaches has both keen advocates and objectors. The situation is especially unfavorable for file system-based and object-oriented methods. The former suffer from the inability to query without any additional
¹ This work was supported in part by Czech Science Foundation (GACR), grant number 201/06/0756.
International Journal of Computer Science & ApplicationsVol. IV, No. II, pp. 43 - 62
© 2006 Technomathematics Research Foundation
Irena Mlynkova, Jaroslav Pokorny 43
preprocessing of the data, whereas the latter approach fails especially in finding a corresponding efficient and comprehensive tool. The highest-performance techniques are the native ones, since they are proposed particularly for XML processing and do not need to artificially adapt existing structures to a new purpose. But the most widely used in practice are methods which exploit features of (object-)relational databases. The reason is that these databases are still regarded as universal data processing tools, and their long theoretical and practical history can guarantee a reasonable level of reliability. Contrary to native methods, it is not necessary to start “from scratch”: we can rely on a mature and verified technology, i.e. on properties that no native XML database can offer yet.
Under closer investigation, the database-based² methods can be further classified and analyzed [19]. We usually distinguish generic methods, which store XML data regardless of the existence of a corresponding XML schema (e.g. [10] [16]), schema-driven methods, based on structural information from an existing schema of XML documents (e.g. [26] [18]), and user-defined methods, which leave all the storage decisions in the hands of future users (e.g. [2] [1]).
Techniques of the first type usually view an XML document as a directed labelled tree with several types of nodes. We can further distinguish generic techniques which purely store components of the tree and their mutual relationships [10], and techniques which store additional structural information, usually using a kind of numbering schema [16]. Such a schema makes it possible to speed up certain types of queries, but usually at the cost of inefficient data updates. The fact that these techniques do not exploit possibly existing XML schemes can be regarded as both an advantage and a disadvantage. On the one hand, they do not depend on the existence of a schema but, on the other hand, they cannot exploit the additional structural information. But together with the finding that a significant portion of real XML documents (52% [5] of randomly crawled or 7.4% [20] of semi-automatically collected³) have no schema at all, they seem to be the most practical choice.
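A generic mapping that purely stores the components of the tree and their mutual relationships can be illustrated with a single relational table holding one row per node. The sketch below is only in the spirit of such methods and does not reproduce the scheme of [10] or [16]: the edge table layout, the column names and the sample document are assumptions, and SQLite stands in for an (object-)relational database.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<papers>
  <paper id="1"><title>Adaptive mapping</title></paper>
  <paper id="2"><title>Edge tables</title></paper>
</papers>"""


def shred(xml_text: str, conn: sqlite3.Connection) -> int:
    """Store every element and attribute as one row; return the row count."""
    conn.execute(
        "CREATE TABLE edge (id INTEGER PRIMARY KEY, parent INTEGER, "
        "ordinal INTEGER, kind TEXT, name TEXT, value TEXT)"
    )
    counter = 0

    def store(parent, ordinal, kind, name, value):
        nonlocal counter
        counter += 1
        conn.execute("INSERT INTO edge VALUES (?, ?, ?, ?, ?, ?)",
                     (counter, parent, ordinal, kind, name, value))
        return counter

    def walk(elem, parent, ordinal):
        # Only leading text is kept; mixed content (tail text) is ignored here.
        text = (elem.text or "").strip() or None
        node_id = store(parent, ordinal, "element", elem.tag, text)
        for i, (name, value) in enumerate(elem.attrib.items()):
            store(node_id, i, "attribute", name, value)
        for i, child in enumerate(elem):
            walk(child, node_id, i)

    walk(ET.fromstring(xml_text), None, 0)
    return counter


conn = sqlite3.connect(":memory:")
row_count = shred(DOC, conn)

# The path paper/title becomes one self-join per step of the path.
TITLE_QUERY = """
    SELECT c.value FROM edge AS p JOIN edge AS c ON c.parent = p.id
    WHERE p.name = 'paper' AND c.name = 'title' ORDER BY c.id
"""
titles = [row[0] for row in conn.execute(TITLE_QUERY)]
```

No schema is needed to load the document, which is the advantage noted above; the price is that each additional path step adds a join, which is the kind of cost that extra structural information such as a numbering scheme aims to reduce.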
By contrast, schema-driven methods have the opposite (dis)advantages. The situation is even worse for methods which are based particularly on XML Schema [28] [7] definitions (XSDs) and focus on their special features [18]. As can be expected, XSDs are used even less (only for 0.09% [5] of randomly crawled or 38% [20] of semi-automatically collected XML documents), and even if they are used, they often (in 85% of cases [6]) define so-called local tree grammars [22], i.e. languages that can be defined using DTD [9] as well. The most exploited “non-DTD” features are usually simple types [6], whose absence from DTD is crucial but which have only a side optimization effect for XML data processing.
Another problem of purely schema-driven methods is that the information XML schemes provide is not sufficient. Analysis of both XML documents and XML schemes together [20] shows that XML schemes are too general. Extreme examples are recursion or the “*” operator, which theoretically allow infinitely deep or wide XML documents. Naturally, XML schemes also cannot provide any information about, e.g., the retrieval frequency of an element / attribute or the way they are retrieved. Thus not only XML schemes but also corresponding
² In the rest of the paper the term “database” represents an (object-)relational database.
³ Data collected with interference of a human operator who removes damaged, artificial, too simple, or otherwise useless XML data.
XML documents and XML queries need to be taken into account to get overallnotion of the demanded XML-processing application.
The last mentioned type of approach, i.e. the user-defined one, is a bit differ-ent. It does not involve methods for automatic database storage but rather toolsfor specification of the target database schema and required XML-to-relationalmapping. It is commonly offered by most known (object-)relational databasesystems [3] as a feature that enables users to define what suits them most insteadof being restricted by disadvantages of a particular technique. Nevertheless, thekey problem is evident – it assumes that the user is skilled in both database andXML technologies.
Apparently, advantages of all three approaches are closely related to the particular situation. Thus it is advisable to propose a method which is able to exploit the current situation or at least to conform to it. If we analyze database-based methods more deeply, we can distinguish so-called flexible or adaptive methods (e.g. [13] [25] [29] [31]). They take into account a given sample set of XML data and/or XML queries which specify the future usage and adapt the resulting database schema to them. Such techniques naturally have better performance results than the fixed ones (e.g. [10] [16] [26] [18]), i.e. methods which use a pre-defined set of mapping rules and heuristics regardless of the intended future usage. Nevertheless, they also have a great disadvantage – the fact that the target database schema is adapted only once. Thus if the expected usage changes, the efficiency of such techniques can be even worse than in the corresponding fixed case. Consequently the adaptability needs to be dynamic.
The idea to adapt a technique to a sample set of data is closely related to analyses of typical features of real XML documents [20]. If we combine the two ideas, we can assume that a method which focuses especially on common features will be more efficient than the general one. A similar observation is already exploited, e.g., in techniques which represent XML documents as a set of points in multidimensional space [14]. Efficiency of such techniques depends strongly on the depth of XML documents or the number of distinct paths. Fortunately, XML analyses confirm that real XML documents are surprisingly shallow – the average depth does not exceed 10 levels [5] [20].
Considering all the mentioned points, an adaptive enhancement of XML-processing methods focusing on given or typical situations seems to be a promising type of improvement. In this paper we study adaptive techniques from various points of view. We provide an overview of existing approaches, we classify them and their main features, and we sum up the most important findings and characteristics. Finally, we discuss possible improvements and corresponding key problems. The analysis should serve as a starting point for a proposal of an enhancement of existing adaptive methods as well as of an entirely new approach. Thus we also discuss possible improvements of weak points of existing methods and solutions to the stated open problems.
The paper is structured as follows: Section 2 contains a brief introduction to the formalism used throughout the paper. Section 3 describes and classifies the existing related works, both practical and theoretical, and Section 4 sums up their main characteristics. Section 5 discusses possible ways of improving the recent approaches and, finally, Section 6 provides conclusions.
2 Definitions and Formalism
Before we begin to describe and classify adaptive methods, we state several basic terms used in the rest of the text.
An XML document is usually viewed as a directed labelled tree with several types of nodes whose edges represent relationships among them. Side structures, such as entities, comments, CDATA sections, processing instructions, etc., are omitted without loss of generality.
Definition 1 An XML document is a directed labelled tree T = (V, E, Σ_E, Σ_A, Γ, lab, r), where V is a finite set of nodes, E ⊆ V × V is a set of edges, Σ_E is a finite set of element names, Σ_A is a finite set of attribute names, Γ is a finite set of text values, lab : V → Σ_E ∪ Σ_A ∪ Γ is a surjective function which assigns a label to each v ∈ V, where v is an element if lab(v) ∈ Σ_E, an attribute if lab(v) ∈ Σ_A, or a text value if lab(v) ∈ Γ, and r is the root node of the tree.
A schema of an XML document is usually described using DTD or XML Schema, which describe the allowed structure of an element using its content model. An XML document is valid against a schema if each element matches its content model. (To save space, we state the definitions for DTDs only.)
Definition 2 A content model α over a set of element names Σ′_E is a regular expression defined as α = ε | pcdata | f | (α_1, α_2, ..., α_n) | (α_1|α_2|...|α_n) | β* | β+ | β?, where ε denotes the empty content model, pcdata denotes the text content, f ∈ Σ′_E, “,” and “|” stand for concatenation and union (of content models α_1, α_2, ..., α_n), and “*”, “+”, and “?” stand for zero or more, one or more, and optional occurrence(s) (of content model β) respectively.
Definition 3 An XML schema S is a four-tuple (Σ′_E, Σ′_A, ∆, s), where Σ′_E is a finite set of element names, Σ′_A is a finite set of attribute names, ∆ is a finite set of declarations of the form e → α or e → β, where e ∈ Σ′_E, α is a content model over Σ′_E, and β ⊆ Σ′_A, and s ∈ Σ′_E is a start symbol.
To simplify the XML-to-relational mapping process an XML schema is often transformed into a graph representation. Probably the first occurrence of this representation, the so-called DTD graph, can be found in [26]. There are also various other types of graph representation of an XML schema; if necessary, we mention the slight differences later in the text.
Definition 4 A schema graph of a schema S = (Σ′_E, Σ′_A, ∆, s) is a directed, labelled graph G = (V, E, lab′), where V is a finite set of nodes, E ⊆ V × V is a set of edges, lab′ : V → Σ′_E ∪ Σ′_A ∪ {“|”, “*”, “+”, “?”, “,”} ∪ {pcdata} is a surjective function which assigns a label to each v ∈ V, and s is the root node of the graph.
The core idea of XML-to-relational mapping methods is to decompose a given schema graph into fragments, which are mapped to corresponding relations.
Definition 5 A fragment f of a schema graph G is any connected subgraph of G.
Definition 6 A decomposition of a schema graph G is a set of its fragments {f_1, ..., f_n}, where each v ∈ V is a member of at least one fragment.
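Definitions 5 and 6 can be checked mechanically on a small example. The following Python sketch is only an illustration under stated assumptions: a fragment is given simply as a set of node names, connectivity ignores edge direction, and the function names and the toy "book" graph are hypothetical.

```python
from collections import deque

def is_connected(nodes, edges):
    """True if `nodes` induce a connected subgraph (edge direction ignored)."""
    nodes = set(nodes)
    if not nodes:
        return False
    # Adjacency restricted to the nodes of the candidate fragment.
    adj = {v: set() for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] - seen:
            seen.add(w)
            queue.append(w)
    return seen == nodes

def is_decomposition(graph_nodes, edges, fragments):
    """Definition 6: every fragment is connected and every node is covered."""
    covered = set().union(*fragments) if fragments else set()
    return covered == set(graph_nodes) and all(
        is_connected(f, edges) for f in fragments
    )

# Hypothetical schema graph: book -> title, book -> author, author -> name.
nodes = {"book", "title", "author", "name"}
edges = [("book", "title"), ("book", "author"), ("author", "name")]
good = [{"book", "title"}, {"author", "name"}]
bad = [{"book", "name"}, {"title", "author"}]  # covers V, but not connected
```

Here `good` is a decomposition, whereas `bad` covers all nodes but its parts are not connected subgraphs.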
3 Existing Approaches
Up to now only a few papers have focused on a proposal of an adaptive database-based XML-processing method. We distinguish two main directions – cost-driven and user-driven. Techniques of the former group can choose the most efficient XML-to-relational storage strategy automatically. They usually evaluate a subset of possible mappings and choose the best one according to the given sample of XML data, query workload, etc. The main advantage is expressed by the adverb “automatically”, i.e. without any required (or undesirable) user interference. By contrast, techniques of the latter group also support several storage strategies but the final decision is left in the hands of users. We distinguish these techniques from the user-defined ones, since their approach is slightly different: By default they offer a fixed mapping, but users can influence the mapping process by annotating fragments of the input XML schema with demanded storage strategies. Similarly to the user-defined techniques this approach also assumes a skilled user, but most of the work is done by the system itself. The user is expected to help the mapping process, not to perform it.
3.1 Cost-Driven Techniques
As mentioned above, cost-driven techniques can choose the best storage strategy for a particular application automatically, without any interference of a user. Thus the user can influence the mapping process only through the provided XML schema, set of sample XML documents or data statistics, set of XML queries and possibly their weights, etc.
Each of the techniques can be characterized by the following five features:
1. an initial XML schema S_init,
2. a set of XML schema transformations T = {t_1, t_2, ..., t_n}, where each t_i transforms a given schema S into a schema S_i,
3. a fixed XML-to-relational mapping function f_map which transforms a given XML schema S into a relational schema R,
4. a set of sample data D_sample characterizing the future application, which usually consists of a set of XML documents {d_1, d_2, ..., d_k} valid against S_init, and a set of XML queries {q_1, q_2, ..., q_l} over S_init, possibly with corresponding weights {w_1, w_2, ..., w_l}, ∀ i : w_i ∈ ⟨0, 1⟩, and
5. a cost function f_cost which evaluates the cost of the given relational schema R with regard to the set D_sample.
The required result is an optimal relational schema R_opt, i.e. a schema for which f_cost(R_opt, D_sample) is minimal.
A naive but illustrative cost-driven storage strategy that is based on the idea of using “brute force” is depicted by Algorithm 1. It first generates a set of possible XML schemes S using transformations from set T and starting from the initial schema S_init (lines 1 – 4). Then it searches for a schema s ∈ S with minimal cost f_cost(f_map(s), D_sample) (lines 5 – 12) and returns the corresponding optimal relational schema R_opt = f_map(s).
Algorithm 1 Naive Search Algorithm
Input: S_init, T, f_map, D_sample, f_cost
Output: R_opt
1: S ← {S_init}
2: while ∃ t ∈ T, s ∈ S : t(s) ∉ S do
3:   S ← S ∪ {t(s)}
4: end while
5: cost_opt ← ∞
6: for all s ∈ S do
7:   R_tmp ← f_map(s)
8:   cost_tmp ← f_cost(R_tmp, D_sample)
9:   if cost_tmp < cost_opt then
10:     R_opt ← R_tmp ; cost_opt ← cost_tmp
11:   end if
12: end for
13: return R_opt
Obviously the complexity of such an algorithm depends strongly on the set T. It can be proven that even a simple set of transformations causes the problem of finding the optimal schema to be NP-hard [29] [31] [15]. Thus the existing techniques in fact search for a suboptimal solution using various heuristics, greedy strategies, approximation algorithms, terminal conditions, etc. We can also observe that fixed methods can be considered as a special type of cost-driven methods, where T = ∅, D_sample = ∅, and f_cost(R, ∅) = const for every R.
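Algorithm 1 can be sketched in Python on a toy problem. This is an illustration only, not an implementation from any of the cited papers: the schema representation (a frozenset of "outlined" edges), the `toggle` transformation, the identity `f_map`, and the toy cost function are all assumptions made here. The sketch terminates only because the toy transformations generate a finite set of schemas.

```python
import math

def naive_search(s_init, transformations, f_map, d_sample, f_cost):
    # Lines 1-4: close {s_init} under all transformations.
    schemas = {s_init}
    frontier = [s_init]
    while frontier:
        s = frontier.pop()
        for t in transformations:
            s_new = t(s)
            if s_new not in schemas:
                schemas.add(s_new)
                frontier.append(s_new)
    # Lines 5-12: pick the schema whose relational mapping is cheapest.
    r_opt, cost_opt = None, math.inf
    for s in schemas:
        r_tmp = f_map(s)
        cost_tmp = f_cost(r_tmp, d_sample)
        if cost_tmp < cost_opt:
            r_opt, cost_opt = r_tmp, cost_tmp
    return r_opt

# --- hypothetical toy setting ---
def toggle(edge):
    """Transformation that inlines/outlines one edge of the schema graph."""
    return lambda schema: schema ^ frozenset({edge})

def toy_map(schema):
    return schema  # assumption: the "relational schema" is the set itself

def toy_cost(r, d_sample):
    # Each outlined edge costs one join; each inlined edge costs null padding.
    joins = d_sample["join_cost"] * len(r)
    nulls = sum(c for e, c in d_sample["null_cost"].items() if e not in r)
    return joins + nulls

sample = {"join_cost": 2, "null_cost": {"a": 5, "b": 1}}
best = naive_search(frozenset(), [toggle("a"), toggle("b")],
                    toy_map, sample, toy_cost)
```

In this toy setting the four candidate schemas cost 6, 3, 7, and 4, so `best` is the schema that outlines only edge "a".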
3.1.1 Hybrid Object-Relational Mapping
One of the first attempts at a cost-driven adaptive approach is a method called Hybrid object-relational mapping [13]. It is based on the fact that if XML documents are mostly semi-structured, a “classical” decomposition of less structured XML parts into relations leads to inefficient query processing caused by plenty of join operations. The algorithm exploits the idea of storing well structured parts into relations and semi-structured parts using the so-called XML data type, which supports path queries and XML-aware full-text operations. The fixed mapping for structured parts is similar to the classical Hybrid algorithm [26], whereas, in addition, it exploits NF²-relations using constructs such as set-of, tuple-of, and list-of. The main concern of the method is to identify the structured and semi-structured parts. It consists of the following steps:
1. A schema graph G_1 = (V_1, E_1, lab′_1) is built for a given DTD.
2. For each v ∈ V_1 a measure of significance ω_v (see below) is determined.
3. Each v ∈ V_1 which satisfies the following conditions is identified:
(a) v is not a leaf node.
(b) For v and each of its descendants v_i, 1 ≤ i ≤ k: ω_v < ω_LOD and ω_vi < ω_LOD, where ω_LOD is a required level of detail of the resulting schema.
(c) v does not have a parent node which would satisfy the conditions.
4. Each fragment f ⊆ G_1 which consists of a previously identified node v and its descendants is replaced with an attribute node having the XML data type, resulting in a schema graph G_2.
5. G2 is mapped to a relational schema using a fixed mapping strategy.
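The measure of significance ω_v of a node v is defined as:

\[
\omega_v = \frac{1}{2}\,\omega_{S_v} + \frac{1}{4}\,\omega_{D_v} + \frac{1}{4}\,\omega_{Q_v}
= \frac{1}{2}\,\omega_{S_v} + \frac{1}{4}\cdot\frac{card(D_v)}{card(D)} + \frac{1}{4}\cdot\frac{card(Q_v)}{card(Q)} \tag{1}
\]
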
where ω_Sv is derived from the DTD structure as a combination of weights expressing the position of v in the graph and the complexity of its content model (see [13]), D ⊆ D_sample is the set of all given documents, D_v ⊆ D is the set of documents containing v, Q ⊆ D_sample is the set of all given queries, and Q_v ⊆ Q is the set of queries containing v.
As we can see, the algorithm optimizes the naive approach mainly by the facts that the schema graph is preprocessed, i.e. ω_v is determined for each v ∈ V_1, that the set of transformations T is a singleton, and that the transformation is performed only if the current node satisfies the above mentioned conditions (a) – (c). Obviously, the preprocessing ensures that the complexity of the search algorithm is given by K_1 ∗ card(V_1) + K_2 ∗ card(E_1), where K_1, K_2 ∈ N, i.e. it is linear in the size of the schema graph. On the other hand, the optimization is too restrictive in terms of the number of possible XML-to-relational mappings.
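The computation of ω_v by equation (1) and the selection of nodes by conditions (a) – (c) can be sketched as follows. This is a minimal illustration, assuming that the schema graph is a tree given as a `children` dictionary and that the structural weight ω_Sv is supplied as a number (its derivation is described in [13] and is not reproduced here). The function names and the sample values are hypothetical.

```python
def significance(w_struct, docs_with_v, n_docs, queries_with_v, n_queries):
    """Equation (1): weighted sum of structural, data, and query weights."""
    return (0.5 * w_struct
            + 0.25 * docs_with_v / n_docs
            + 0.25 * queries_with_v / n_queries)

def xml_type_roots(children, root, omega, omega_lod):
    """Nodes whose whole subtree would be stored using the XML data type."""
    def all_below(v):
        # Condition (b): v and every descendant are below the level of detail.
        return omega[v] < omega_lod and all(
            all_below(c) for c in children.get(v, [])
        )

    result = []

    def visit(v):
        kids = children.get(v, [])
        if kids and all_below(v):   # conditions (a) and (b)
            result.append(v)        # condition (c): stop at the topmost node
            return
        for c in kids:
            visit(c)

    visit(root)
    return result

# Hypothetical schema tree and precomputed significance values.
children = {"article": ["title", "body"], "body": ["para", "note"]}
omega = {"article": 0.9, "title": 0.8, "body": 0.2, "para": 0.1, "note": 0.15}
roots = xml_type_roots(children, "article", omega, omega_lod=0.3)
```

With these values only the subtree rooted at "body" falls below the required level of detail, so it is the only candidate for the XML data type.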
3.1.2 FlexMap Mapping
Another example of adaptive cost-driven methods was implemented as the so-called FlexMap framework [25]. The algorithm optimizes the naive approach using a simple greedy strategy as depicted in Algorithm 2. The main differences in comparison with the naive approach are the choice of the least expensive transformation at each iteration (lines 3 – 9) and the termination of searching if there exists no transformation t ∈ T that can reduce the current (sub)optimum (lines 10 – 14).
The set T of XML-to-XML transformations involves the following operations:
• Inlining and outlining – inverse operations which make it possible to store columns of a subelement / attribute either in a parent table or in a separate table
• Splitting and merging elements – inverse operations which make it possible to store a shared element⁴ either in a common table or in separate tables
• Associativity and commutativity
• Union distribution and factorization – inverse operations which make it possible to separate out components of a union using equation (a, (b|c)) = ((a, b)|(a, c))
• Splitting and merging repetitions – exploitation of equation (a+) = (a, a∗)
• Simplifying unions – exploitation of equation (a|b) ⊆ (a?, b?)
4 An element with multiple parent elements in the schema – see [26].
Algorithm 2 Greedy Search Algorithm
Input: S_init, T, f_map, D_sample, f_cost
Output: R_opt
1: S_opt ← S_init ; R_opt ← f_map(S_opt) ; cost_opt ← f_cost(R_opt, D_sample)
2: loop
3:   cost_min ← ∞
4:   for all t ∈ T do
5:     cost_t ← f_cost(f_map(t(S_opt)), D_sample)
6:     if cost_t < cost_min then
7:       t_min ← t ; cost_min ← cost_t
8:     end if
9:   end for
10:   if cost_min < cost_opt then
11:     S_opt ← t_min(S_opt) ; R_opt ← f_map(S_opt) ; cost_opt ← f_cost(R_opt, D_sample)
12:   else
13:     break;
14:   end if
15: end loop
16: return R_opt
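Algorithm 2 can be sketched in Python as follows. The sketch is an illustration under stated assumptions and is not the FlexMap implementation: schemas are modelled as frozensets of outlined edges, and `toggle`, `toy_map`, and `toy_cost` are hypothetical helpers introduced only to make the example runnable.

```python
import math

def greedy_search(s_init, transformations, f_map, d_sample, f_cost):
    s_opt = s_init
    r_opt = f_map(s_opt)
    cost_opt = f_cost(r_opt, d_sample)
    while True:
        # Lines 3-9: find the least expensive single transformation.
        cost_min, s_min = math.inf, None
        for t in transformations:
            s_t = t(s_opt)
            cost_t = f_cost(f_map(s_t), d_sample)
            if cost_t < cost_min:
                cost_min, s_min = cost_t, s_t
        # Lines 10-14: accept it only if it improves the current optimum.
        if cost_min < cost_opt:
            s_opt, r_opt, cost_opt = s_min, f_map(s_min), cost_min
        else:
            break
    return r_opt

# --- hypothetical toy setting ---
def toggle(edge):
    """Transformation that inlines/outlines one edge of the schema graph."""
    return lambda schema: schema ^ frozenset({edge})

def toy_map(schema):
    return schema  # assumption: the "relational schema" is the set itself

def toy_cost(r, d_sample):
    joins = d_sample["join_cost"] * len(r)
    nulls = sum(c for e, c in d_sample["null_cost"].items() if e not in r)
    return joins + nulls

sample = {"join_cost": 2, "null_cost": {"a": 5, "b": 1}}
best = greedy_search(frozenset(), [toggle("a"), toggle("b")],
                     toy_map, sample, toy_cost)
```

Starting from the fully inlined schema (cost 6), the first iteration outlines edge "a" (cost 3); in the second iteration no transformation lowers the cost, so the search stops.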
Note that except for commutativity and simplifying unions the transformations generate an equivalent schema in terms of equivalence of sets of document instances. Commutativity does not retain the order of the schema, whereas simplifying unions generates a more general schema, i.e. a schema with a larger set of document instances. (However, only inlining and outlining were implemented and experimentally tested by the FlexMap system.)
The fixed mapping again uses a strategy similar to the Hybrid algorithm but it is applied locally on each fragment of the schema specified by the transformation rules stated by the search algorithm. For example, elements determined to be outlined are not inlined even though a “traditional” Hybrid algorithm would do so.
The process of evaluating f_cost is significantly optimized. A naive approach would require construction of a particular relational schema, loading sample XML data into the relations, and cost analysis of the resulting relational structures. The FlexMap evaluation exploits an XML Schema-aware statistics framework StatiX [11] which analyzes the structure of a given XSD and XML documents and computes their statistical summary, which is then “mapped” to relational statistics regarding the fixed XML-to-relational mapping. Together with the sample query workload they are used as an input for a classical relational optimizer which estimates the resulting cost. Thus no relational schema has to be constructed and, as the statistics are updated correspondingly at each XML-to-XML transformation, the XML documents need to be processed only once.
3.1.3 An Adjustable and Adaptable Method (AAM)
The following method, which is also based on the idea of searching a space of possible mappings, is presented in [29] as an Adjustable and adaptable method
(AAM). In this case the authors adapt the given problem to features of genetic algorithms. It is also the first paper that mentions that the problem of finding a relational schema R for a given set of XML documents and queries D_sample, s.t. f_cost(R, D_sample) is minimal, is NP-hard in the size of the data.
The set T of XML-to-XML transformations consists of inlining and outlining of subelements. For the purpose of the genetic algorithm each transformed schema is represented using a bit string, where each bit corresponds to an edge of the schema graph and is set to 1 if the element the edge points to is stored in a separate table or to 0 if the element the edge points to is stored in the parent table. The bits set to 1 represent “borders” among fragments, whereas each fragment is stored in one table corresponding to the so-called Universal table [10]. The extreme instances correspond to “one table for the whole schema” (in case of the 00...0 bit string) resulting in many null values and “one table per each element” (in case of the 11...1 bit string) resulting in many join operations.
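The bit-string representation described above can be decoded into fragments with a few lines of Python. This is a minimal sketch, assuming that the schema graph is a tree given as an ordered list of (parent, child) edges and that the i-th bit belongs to the i-th edge; the function name and the toy schema are hypothetical and not taken from [29].

```python
def fragments_from_bits(root, edges, bits):
    """Split a schema tree into fragments; a '1' bit outlines the child."""
    if len(edges) != len(bits):
        raise ValueError("one bit per edge is required")
    children = {}
    for (parent, child), bit in zip(edges, bits):
        children.setdefault(parent, []).append((child, bit))

    fragments = []

    def collect(node, current):
        current.append(node)
        for child, bit in children.get(node, []):
            if bit == "1":
                # Border between fragments: the child starts a new table.
                new_fragment = []
                fragments.append(new_fragment)
                collect(child, new_fragment)
            else:
                # The child is inlined into the table of its parent.
                collect(child, current)

    first = []
    fragments.append(first)
    collect(root, first)
    return fragments

# Hypothetical schema tree: book -> title, book -> author, author -> name.
edges = [("book", "title"), ("book", "author"), ("author", "name")]
mixed = fragments_from_bits("book", edges, "010")
```

For the bit string 010 the result is two fragments, one holding book and title and one holding author and name. The strings 000 and 111 give the two extreme instances mentioned in the text.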
Similarly to the previous strategy the algorithm chooses only the best possible continuation at each iteration. The algorithm consists of the following steps:
1. The initial population P_0 (i.e. the set of bit strings) is generated randomly.
2. The following steps are repeated until terminating conditions are met:
(a) Each member of the current population P_i is evaluated and only the best representatives are selected for further production.
(b) The next generation P_(i+1) is produced by genetic operators crossover, mutation, and propagate.
The algorithm terminates either after a certain number of transformations or if a good-enough schema is achieved.
The cost function f_cost is expressed as:

\[
f_{cost}(R, D_{sample}) = f_M(R, D_{sample}) + f_Q(R, D_{sample})
= \sum_{l=1}^{q} C_l * R_l + \Big( \sum_{i=1}^{m} S_i * P_{S_i} + \sum_{k=1}^{n} J_k * P_{J_k} \Big) \tag{2}
\]

where f_M is a space-cost function, in which C_l is the number of columns and R_l is the number of rows in table T_l created for the l-th element in the schema and q is the number of all elements in the schema, and f_Q is a query-cost function, in which S_i is the cost and P_Si is the probability of the i-th select query, J_k is the cost and P_Jk is the probability of the k-th join query, m is the number of select queries in D_sample, and n is the number of join queries in D_sample. In other words, f_M represents the total memory cost of the mapping instance, whereas f_Q represents the total query cost. The probabilities P_Si and P_Jk make it possible to specify which elements will (not) be often retrieved and which sets of elements will (not) be often combined in a search. Also note that this algorithm represents another way of finding a reasonable suboptimal solution in the theoretically infinite set of possibilities – using (in this case two) terminal conditions.
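Equation (2) is a plain sum of products and can be evaluated directly. The following sketch only illustrates the arithmetic; the function name, the argument layout (lists of pairs), and the sample numbers are assumptions made here, not values from [29].

```python
def aam_cost(tables, selects, joins):
    """Equation (2): space cost f_M plus query cost f_Q.

    tables  -- list of (columns C_l, rows R_l) per table
    selects -- list of (cost S_i, probability P_Si) per select query
    joins   -- list of (cost J_k, probability P_Jk) per join query
    """
    f_m = sum(cols * rows for cols, rows in tables)
    f_q = (sum(cost * prob for cost, prob in selects)
           + sum(cost * prob for cost, prob in joins))
    return f_m + f_q

# Hypothetical mapping instance with two tables and three queries.
total = aam_cost(
    tables=[(3, 100), (2, 50)],        # f_M = 300 + 100
    selects=[(10, 0.5), (20, 0.25)],   # 5 + 5
    joins=[(40, 0.25)],                # 10
)
```

For the sample values the space cost is 400 and the query cost is 20, so `total` is 420.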
3.1.4 A Hill Climbing Algorithm
The last but not least cost-driven adaptable representative can be found in paper [31]. The approach is again based on a greedy type of algorithm, in this case a Hill climbing strategy that is depicted by Algorithm 3.
Algorithm 3 Hill Climbing Algorithm
Input: S_init, T, f_map, D_sample, f_cost
Output: R_opt
1: S_opt ← S_init ; R_opt ← f_map(S_opt) ; cost_opt ← f_cost(R_opt, D_sample)
2: T_tmp ← T
3: while T_tmp ≠ ∅ do
4:   t ← any member of T_tmp
5:   T_tmp ← T_tmp \ {t}
6:   S_tmp ← t(S_opt)
7:   cost_tmp ← f_cost(f_map(S_tmp), D_sample)
8:   if cost_tmp < cost_opt then
9:     S_opt ← S_tmp ; R_opt ← f_map(S_tmp) ; cost_opt ← cost_tmp
10:     T_tmp ← T
11:   end if
12: end while
13: return R_opt
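A Python sketch of Algorithm 3 follows. It is an illustration only and not the implementation from [31]: "any member of T_tmp" is realized by taking the last element of a list, and the schema representation with the helpers `toggle`, `toy_map`, and `toy_cost` is a hypothetical toy setting.

```python
def hill_climbing(s_init, transformations, f_map, d_sample, f_cost):
    s_opt = s_init
    r_opt = f_map(s_opt)
    cost_opt = f_cost(r_opt, d_sample)
    pending = list(transformations)          # T_tmp
    while pending:
        t = pending.pop()                    # any member of T_tmp
        s_tmp = t(s_opt)
        cost_tmp = f_cost(f_map(s_tmp), d_sample)
        if cost_tmp < cost_opt:
            # Accept the first improving transformation and restart.
            s_opt, r_opt, cost_opt = s_tmp, f_map(s_tmp), cost_tmp
            pending = list(transformations)
    return r_opt

# --- hypothetical toy setting ---
def toggle(edge):
    """Transformation that inlines/outlines one edge of the schema graph."""
    return lambda schema: schema ^ frozenset({edge})

def toy_map(schema):
    return schema  # assumption: the "relational schema" is the set itself

def toy_cost(r, d_sample):
    joins = d_sample["join_cost"] * len(r)
    nulls = sum(c for e, c in d_sample["null_cost"].items() if e not in r)
    return joins + nulls

sample = {"join_cost": 2, "null_cost": {"a": 5, "b": 1}}
best = hill_climbing(frozenset(), [toggle("a"), toggle("b")],
                     toy_map, sample, toy_cost)
```

Unlike the greedy sketch of Algorithm 2, no full scan over T is needed before a step is taken: the loop applies the first transformation that lowers the cost and then resets the pending set.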
As we can see, the hill climbing strategy differs from the simple greedy strategy depicted in Algorithm 2 in the way it chooses the appropriate transformation t ∈ T. In the previous case the least expensive transformation that can reduce the current (sub)optimum is chosen; in this case it is the first such transformation found. The schema transformations are based on the idea of vertical (V) or horizontal (H) cutting and merging of the given XML schema fragment(s). The set T consists of the following four types of (pairwise inverse) operations:
• V-Cut(f, (u,v)) – cuts fragment f into fragments f_1 and f_2, s.t. f_1 ∪ f_2 = f, where (u, v) is an edge from f_1 to f_2, i.e. u ∈ f_1 and v ∈ f_2
• V-Merge(f_1, f_2) – merges fragments f_1 and f_2 into fragment f = f_1 ∪ f_2
• H-Cut(f, (u,v)) – splits fragment f into twin fragments f_1 and f_2 horizontally from edge (u, v), where u ∉ f and v ∈ f, s.t. ext(f_1) ∪ ext(f_2) = ext(f) and ext(f_1) ∩ ext(f_2) = ∅ ⁵ ⁶
• H-Merge(f_1, f_2) – merges two twin fragments f_1 and f_2 into one fragment f s.t. ext(f_1) ∪ ext(f_2) = ext(f)
As we can observe, V-Cut and V-Merge operations are similar to outlining and inlining of the fragment f_2 out of or into the fragment f_1. Conversely, the H-Cut operation corresponds to splitting of elements used in FlexMap mapping, i.e. duplication of the shared part, and the H-Merge operation corresponds to inverse merging of elements.
5 ext(f_i) is the set of all instance fragments conforming to the schema fragment f_i.
6 Fragments f_1 and f_2 are called twins if ext(f_1) ∩ ext(f_2) = ∅ and for each node u ∈ f_1, there is a node v ∈ f_2 with the same label and vice versa.
The fixed XML-to-relational mapping maps each fragment f_i which consists of nodes {v_1, v_2, ..., v_n} to relation

R_i = (id(r_i) : int, id(r_i.parent) : int, lab(v_1) : type(v_1), ..., lab(v_n) : type(v_n))

where r_i is the root element of f_i. Note that such a mapping is again similar to a locally applied Universal table.
The cost function f_cost is expressed as:

\[
f_{cost}(R, D_{sample}) = \sum_{i=1}^{n} w_i * cost(Q_i, R) \tag{3}
\]

where D_sample consists of a sample set of XML documents and a given query workload {(Q_i, w_i)}, i = 1, 2, ..., n, where Q_i is an XML query and w_i is its weight. The cost function cost(Q_i, R) for a query Q_i which accesses fragment set {f_i1, ..., f_im} is expressed as:

\[
cost(Q_i, R) =
\begin{cases}
|f_{i1}| & m = 1 \\
\sum_{j,k} \big( |f_{ij}| * Sel_{ij} + \delta * (|E_{ij}| + |E_{ik}|)/2 \big) & m > 1
\end{cases} \tag{4}
\]

where f_ij and f_ik, j ≠ k, are two join fragments, |E_ij| is the number of elements in ext(f_ij), and Sel_ij is the selectivity of the path from the root to f_ij estimated using a Markov table. In other words, the formula simulates the cost of joining relations corresponding to fragments f_ij and f_ik.
The authors further analyze the influence of the choice of the initial schema S_init on the efficiency of the search algorithm. They use three types of initial schema decompositions leading to Binary [10], Shared, or Hybrid [26] mapping. The paper concludes with the finding that a good choice of an initial schema is crucial and can lead to faster searches for the suboptimal mapping.
3.2 User-Driven Techniques
As mentioned above, the most flexible approach is the user-defined mapping, i.e. the idea “to leave the whole process in hands of a user” who defines both the target database schema and the required mapping. Due to its simple implementation it is supported in most commercial database systems [3]. At first sight the idea is correct – users can decide what suits them most and are not restricted by disadvantages of a particular technique. The problem is that such an approach assumes users skilled in two complex technologies, and for more complex applications the design of an optimal relational schema is generally not an easy task.
On this account new techniques – in this paper called user-driven mapping strategies – were proposed. The main difference is that the user can influence a default fixed mapping strategy using annotations which specify the required mapping for particular schema fragments. The set of allowed mappings is naturally limited but still powerful enough to define various mapping strategies.
Each of the techniques is characterized by the following four features:
1. an initial XML schema S_init,
2. a set of allowed fixed XML-to-relational mappings {f^i_map}, i = 1, ..., n,
3. a set of annotations A, each of which is specified by name, target, allowed values, and function, and
4. a default mapping strategy f_def for fragments that are not annotated.
3.2.1 MDF
Probably the first approach which faces the mentioned issues is proposed in paper [8] as a Mapping definition framework (MDF). It allows users to specify the required mapping, checks its correctness and completeness, and fills in the parts that are left unspecified. The mapping specifications are made by annotating the input XSD with a predefined set of attributes A listed in Table 1.
| Attribute | Target | Value | Function |
|---|---|---|---|
| outline | attribute or element | true, false | If the value is true, a separate table is created for the attribute / element. Otherwise, it is inlined. |
| tablename | attribute, element, or group | string | The string is used as the table name. |
| columnname | attribute, element, or simple type | string | The string is used as the column name. |
| sqltype | attribute, element, or simple type | string | The string defines the SQL type of a column. |
| structurescheme | root element | KFO, Interval, Dewey | Defines the way of capturing the structure of the whole schema. |
| edgemapping | element | true, false | If the value is true, the element and all its subelements are mapped using Edge mapping. |
| maptoclob | attribute or element | true, false | If the value is true, the element / attribute is mapped to a CLOB column. |

Table 1: Annotation attributes for MDF
As we can see, the set of allowed XML-to-relational mappings {f^i_map}, i = 1, ..., n, involves inlining and outlining of an element / attribute, the Edge mapping [10] strategy, and mapping an element or an attribute to a CLOB column. Furthermore, it makes it possible to specify the required capturing of the structure of the whole schema using one of the following three approaches:
• Key, Foreign Key, and Ordinal Strategy (KFO) – each node is assigned a unique integer ID and a foreign key pointing to the parent ID; the sibling order is captured using an ordinal value
• Interval Encoding – a unique {start,end} interval is assigned to each node corresponding to preorder and postorder traversal entering time
• Dewey Decimal Classification – each node is assigned a path to the root node described using concatenation of node IDs along the path
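The three ways of capturing structure can be compared on a small tree. The sketch below is an illustration under stated assumptions rather than the MDF implementation: node names are unique, the interval bounds come from a single counter incremented on entering and leaving a node, and the Dewey label concatenates sibling ordinals (used here in place of node IDs). The function names are hypothetical.

```python
def encode(children, root):
    """Return KFO, interval, and Dewey labels for every node of a tree."""
    kfo, interval, dewey = {}, {}, {}
    counter = 0   # traversal clock for the interval encoding
    next_id = 0   # generator of unique integer IDs for KFO

    def visit(node, parent_id, ordinal, path):
        nonlocal counter, next_id
        next_id += 1
        node_id = next_id
        kfo[node] = (node_id, parent_id, ordinal)   # key, foreign key, ordinal
        dewey[node] = ".".join(map(str, path))
        counter += 1
        start = counter
        for i, child in enumerate(children.get(node, []), start=1):
            visit(child, node_id, i, path + [i])
        counter += 1
        interval[node] = (start, counter)

    visit(root, None, 1, [1])
    return kfo, interval, dewey

def is_ancestor(interval, a, b):
    """With interval encoding, containment of intervals means ancestry."""
    return interval[a][0] < interval[b][0] and interval[b][1] < interval[a][1]

# Hypothetical document tree: book -> title, book -> author, author -> name.
children = {"book": ["title", "author"], "author": ["name"]}
kfo, interval, dewey = encode(children, "book")
```

For the sample tree, "name" receives the KFO triple (4, 3, 1), the interval (5, 6), and the Dewey label 1.2.1. The interval and Dewey labels allow ancestor tests without joins on the parent key, which is the usual motivation for choosing them over KFO.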
Attributes for specifying names of tables or columns and data types of columns can be considered side effects. Parts that are not annotated are stored using user-predefined rules, whereas such a mapping is always a fixed one.
3.2.2 XCacheDB System
Paper [4] also proposes a user-driven mapping strategy which is implemented and experimentally tested as the XCacheDB system, which considers only unordered and acyclic XML schemes and omits mixed-content elements. The set of annotating attributes A that can be assigned to any node v ∈ S_init is listed in Table 2.
| Attribute | Value | Function |
|---|---|---|
| INLINE | ∅ | If placed on a node v, the fragment rooted at v is inlined into the parent table. |
| TABLE | ∅ | If placed on a node v, a new table is created for the fragment rooted at v. |
| STORE BLOB | ∅ | If placed on a node v, the fragment rooted at v is stored also into a BLOB column. |
| BLOB ONLY | ∅ | If placed on a node v, the fragment rooted at v is stored into a BLOB column. |
| RENAME | string | The value specifies the name of the corresponding table or column created for node v. |
| DATATYPE | string | The value specifies the data type of the corresponding column created for node v. |

Table 2: Annotation attributes for XCacheDB
It enables inlining and outlining of a node, storing a fragment into a BLOB column, specifying table names or column names, and specifying column data types. The main difference is in the data redundancy allowed by attribute STORE BLOB, which makes it possible to shred the data into table(s) and at the same time to store pre-parsed XML fragments into a BLOB column.
The fixed mapping uses a slightly different strategy: Each element or attribute node is assigned a unique ID. Each fragment f is mapped to a table T_f which has an attribute a_vID of ID data type for each element or attribute node v ∈ f. If v is an atomic node⁷, T_f also has an attribute a_v of the same data type as v. For each distinct path that leads to f from a repeatable ancestor v, T_f has a parent reference column of ID type which points to the ID of v.
3.3 Theoretic Issues
Besides proposals of cost-driven and user-driven techniques there are also papers which discuss the corresponding open issues on a theoretic level.
3.3.1 Data Redundancy
As mentioned above, the XCacheDB system allows a certain degree of redundancy, in particular duplication in BLOB columns and the violation of the BCNF or 3NF condition. The paper [4] also discusses the strategy on a theoretic level and defines four classes of XML schema decompositions. Before we state the definitions we have to note that the approach is based on a slightly different
7 An attribute node or an element node having no subelements.
graph representation than Definition 4. The nodes of the graph correspond to elements, attributes, or pcdata, whereas edges are labelled with corresponding operators.
Definition 7 A schema decomposition is minimal if all edges connecting nodes of different fragments are labelled with “*” or “+”.
Definition 8 A schema decomposition is 4NF if all fragments are 4NF fragments. A fragment is 4NF if no two nodes of the fragment are connected by a “*” or “+” labelled edge.
Definition 9 A schema decomposition is non-MVD if all fragments are non-MVD fragments. A fragment is non-MVD if all “*” or “+” labelled edges appear in a single path.
Definition 10 A schema decomposition is inlined if it is non-MVD but it is not a 4NF decomposition. A fragment is inlined if it is non-MVD but it is not a 4NF fragment.
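The fragment classes of Definitions 8 – 10 can be told apart with a short check. The following Python sketch is an illustration under stated assumptions: a fragment is a tree given as a list of (parent, child, label) edges with unique node names, and "a single path" is read as a path from the fragment root downwards. The function name and the label "neither" for the remaining case are hypothetical; the paper does not name that case.

```python
REPEATING = {"*", "+"}

def classify_fragment(edges):
    """Return '4NF', 'inlined' (non-MVD but not 4NF), or 'neither'."""
    parent = {child: (par, label) for par, child, label in edges}
    starred = {child for _, child, label in edges if label in REPEATING}
    if not starred:
        return "4NF"                      # Definition 8

    def starred_on_path_to(node):
        # Collect the repeating edges on the path from the root to `node`;
        # each edge is identified by the node it points to.
        found = set()
        while node in parent:
            par, label = parent[node]
            if label in REPEATING:
                found.add(node)
            node = par
        return found

    # Definition 9: all repeating edges lie on one root-to-node path.
    if any(starred_on_path_to(c) == starred for c in starred):
        return "inlined"                  # Definition 10
    return "neither"

flat = [("a", "b", ""), ("a", "c", "?")]
chain = [("a", "b", "*"), ("b", "c", "+")]
fork = [("a", "b", "*"), ("a", "c", "*")]
```

Here `flat` has no repeating edge, `chain` has two repeating edges on one path, and `fork` has two repeating edges on different branches, which is the case that gives rise to a multivalued dependency.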
According to these definitions fixed mapping strategies (e.g. [26] [18]) naturally consider only 4NF decompositions, which are the least space-consuming and seem to be the best choice if we do not consider any other information. Paper [4] shows that, given further information (in this particular case provided by a user), the choice of another type of decomposition can lead to more efficient query processing, though it requires a certain level of redundancy.
3.3.2 Grouping problem
Paper [15] deals with the idea that searching for a (sub)optimal relational decomposition is not only related to the given XML schema, query workload, and XML data, but is also highly influenced by the chosen query translation algorithm⁸ and the cost model. For the theoretic purpose a subset of the problem – the so-called grouping problem – is considered. It deals with possible storage strategies for shared subelements, i.e. either into one common table (the so-called fully grouped strategy) or into separate tables (the so-called fully partitioned strategy). For analysis of its complexity the authors define two simple cost metrics:
• RelCount – the cost of a relational query is the number of relation instances in the query expression
• RelSize – the cost of a relational query is the sum of the number of tuples in relation instances in the query expression
and three query translation algorithms:
• Naive Translation – performs a join between the relations corresponding to all the elements appearing in the query; a wild-card query⁹ is converted into a union of several queries, one for each satisfying wild-card substitution
⁸ An algorithm for translating XML queries into SQL queries.
⁹ A query containing “//” or “/*” operators.
• Single Scan – a separate relational query is issued for each leaf element; it joins all relations on the path until the least common ancestor of all the leaf elements is reached
• Multiple Scan – the Single Scan algorithm is applied to each relation containing a part of the result, and the resulting query consists of the union of the partial queries
On a simple example, the authors show that for a wild-card query Q which retrieves a shared fragment f, the fully partitioned strategy performs better with the Naive Translation algorithm, whereas the fully grouped strategy performs better with the Multiple Scan algorithm. Furthermore, they illustrate that the reliability of the chosen cost model is also closely related to the query translation strategy. If a query contains a predicate that is not very selective, then the optimizer may choose a plan that scans the corresponding relations, and thus RelSize is a good corresponding metric. On the other hand, in the case of a highly selective predicate, the optimizer may choose an index lookup plan, and thus RelCount is a good metric.
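The two cost metrics can be sketched as follows. The relation names and tuple counts are purely hypothetical and serve only to show how RelCount and RelSize would be computed for a query expression under the fully grouped and the fully partitioned strategy; they are not taken from [15].

```python
from typing import Dict, List


def rel_count(query_relations: List[str]) -> int:
    """RelCount: number of relation instances in the query expression."""
    return len(query_relations)


def rel_size(query_relations: List[str], tuple_counts: Dict[str, int]) -> int:
    """RelSize: sum of the tuple counts of the relation instances in the query."""
    return sum(tuple_counts[name] for name in query_relations)


# Hypothetical statistics: a shared subelement "author" of "book" and "article".
tuple_counts = {
    "Book": 1000,
    "Author": 300,         # fully grouped: one common table
    "BookAuthor": 200,     # fully partitioned: one table per parent
    "ArticleAuthor": 100,
}

# Relation instances touched by a query for the authors of books.
grouped_query = ["Book", "Author"]
partitioned_query = ["Book", "BookAuthor"]

grouped_cost = (rel_count(grouped_query), rel_size(grouped_query, tuple_counts))
partitioned_cost = (rel_count(partitioned_query), rel_size(partitioned_query, tuple_counts))
```

In this made-up setting both strategies have the same RelCount, while RelSize favours the fully partitioned strategy; which metric is the more reliable one depends, as discussed above, on the plan the optimizer chooses.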
4 Summary
We can sum up the state of the art of adaptability of database-based XML-processing methods into the following natural but important findings:
1. As the storage strategy has a crucial impact on query-processing performance, a fixed mapping based on predefined rules and heuristics is not universally efficient.
2. It is not an easy task to choose an optimal mapping strategy for a particular application, and thus it is not advisable to rely only on the user’s experience and intuition.
3. As the space of possible XML-to-relational mappings is very large (usually theoretically infinite) and most of the subproblems are even NP-hard, an exhaustive search is often impossible. It is necessary to define search heuristics, approximation algorithms, and/or reliable terminal conditions.
4. The choice of an initial schema can strongly influence the efficiency of the search algorithm. It is reasonable to start with an at least “locally good” schema.
5. A strategy for finding a (sub)optimal XML schema should take into account not only the given schema, query workload, and XML data statistics, but also possible query translations, cost metrics, and their consequences.
6. Cost evaluation of a particular XML-to-relational mapping should not involve time-consuming construction of the relational schema, loading of XML data, and analysis of the resulting relational structures. It can be optimized using cost estimation of XML queries, XML data statistics, etc.
7. Despite the previous claim, the user should be allowed to influence the mapping strategy. On the other hand, the approach should not demand a full schema specification but should complete the user-given hints.
8. Even though a storage strategy is able to adapt to a given sample of schemas, data, queries, etc., its efficiency is still endangered by later changes of the expected usage.
5 Open Issues
Although each of the existing approaches brings certain interesting ideas and optimizations, there is still room for future improvements of the adaptable methods. We describe and discuss them in this section, starting from (in our opinion) the least complex ones.
Missing Input Data As we already know, for cost-driven techniques there are three types of input data: an XML schema S_init, a set of XML documents {d_1, d_2, ..., d_k}, and a set of XML queries {q_1, q_2, ..., q_l}. The problem of a missing schema S_init was already outlined in the introduction in connection with the (dis)advantages of generic and schema-driven methods. As we suppose that adaptability is the ability to adapt to the given situation, a method which does not depend on the existence of an XML schema but can exploit the information if it is given is probably a natural first improvement. This idea is also strongly related to the mentioned problem of choosing a locally good initial schema S_init. The corresponding questions are: Can the user-given schema be considered a good candidate for S_init? How can we find a possibly better candidate? Can we find such a candidate for schema-less XML documents? A possible solution can be found in exploiting methods for automatic construction of an XML schema for a given set of XML documents (e.g. [21] [23]). Assuming that documents are more precise sources of structural information, we can expect that a schema generated on their basis will have better characteristics too.
On the other hand, the problem of missing input XML documents can be at least partly solved using reasonable default settings based on a general analysis of real XML data (e.g. [5] [20]). Furthermore, the surveys show that real XML data are surprisingly simple, and thus the default mapping strategy does not have to be complex either. It should rather focus on efficient processing of frequently used XML patterns.
Finally, the presence of a sample query workload is crucial since (to our knowledge) there are no analyses of real XML queries, i.e. no source of information for default settings. The reason is that collecting such real representatives is not as straightforward as in the case of XML documents. Currently the best sources of XML queries are XML benchmarking projects (e.g. [24] [30]), but as the data and especially the queries are supposed to be used for rating the performance of a system in various situations, they cannot be considered an example of a real workload. Naturally, the query statistics can be gathered by the system itself and the schema can be adapted continuously, as discussed later in the text.
Efficient Solution of Subproblems A surprising fact we have encountered is the number of simplifications in the chosen solutions. As mentioned, some of the techniques omit, e.g., ordering of elements, mixed contents, or recursion. This is a somewhat confusing finding, given that there are proposals for efficient processing of these XML constructs (e.g. [27]) and that adaptive methods should cope with various situations.
A similar observation can be made for user-driven methods. Though the proposed systems are able to store schema fragments in various ways, the default strategy for non-annotated parts of the schema is again a fixed one. It can be an interesting optimization to combine the ideas and search for the (sub)optimal mapping for the non-annotated parts using a cost-driven method.
Deeper Exploitation of Information Another open issue is a possible deeper exploitation of the information given by the user. We can identify two main questions: How can the user-given information be better exploited? Is there any other information a user can provide to increase the efficiency?
A possible answer can be found in the idea of pattern matching, i.e. using the user-given schema annotations as “hints” on how to store particular XML patterns. We can naturally predict that structurally similar fragments should be stored similarly and thus focus on finding these fragments in the rest of the schema. The main problem is how to identify the structurally similar fragments. If we consider the variety of XML-to-XML transformations, two structurally identical fragments can be expressed using regular expressions that look different at first glance. Thus it is necessary to propose particular levels of equivalence of XML schema fragments and algorithms for determining them. Last but not least, such a system should focus on the scalability of the similarity metric and particularly on its reasonable default setting (based on existing analyses of real-world data).
Theoretical Analysis of the Problem As the overview shows, there are various types of XML-to-XML transformations, and the mentioned ones certainly do not cover the whole set of possibilities. Unfortunately, there seems to be no theoretical study of these transformations, their key characteristics, and possible classifications. Such a study can, among other things, focus on equivalent and generalizing transformations and as such serve as a good basis for the pattern matching strategy. Especially interesting will be the question of NP-hardness in connection with the set of allowed transformations and its complexity (similarly to paper [15], which analyzes the theoretical complexity of combinations of cost metrics and query translation algorithms). Such a study will provide useful information especially for optimizations of the search algorithm.
Dynamic Adaptability Last but not least, there is the issue connected with the most striking disadvantage of adaptive methods: the problem of possible changes of XML queries or XML data that can lead to a crucial worsening of efficiency. As mentioned above, it is also related to the problem of missing input XML queries and the ways to gather them. The question of changes of XML data also opens another wide research area, the updatability of the stored data, a feature that is often omitted in current approaches although its importance is crucial.
The solution to these issues, i.e. a system that is able to adapt dynamically, is obvious and challenging, but it is not an easy task. It should especially avoid total reconstruction of the whole relational schema and the corresponding reinsertion of all the stored data, or such an operation should be done only in very special cases. On the other hand, this “brute-force” approach can serve as an inspiration. Supposing that changes, especially in the case of XML queries, will not be radical, the modifications of the relational schema will be mostly local and we can apply the expensive reconstruction just locally. Furthermore, we can again exploit the idea of pattern matching and find the XML pattern defined by the modified schema fragment in the rest of the schema.
Another question is how often the relational schema should be reconstructed. The natural answer is of course “not too often”. On the other hand, research can be done on the idea of performing gradual minor changes. It is probable that such an approach will lead to a system that is less expensive (in terms of reconstruction) and at the same time more efficient (in terms of query processing). The former hypothesis should be verified; the latter can almost certainly be expected. The key issue is how to find a reasonable compromise.
6 Conclusion
The main goal of this paper was to describe and discuss the current state of the art and open issues of adaptability in database-based XML-processing methods. Firstly, we have stated the reasons why this topic should be studied at all. Then we have provided an overview and classification of the existing approaches and summed up the key findings. Finally, we have discussed the corresponding open issues and their possible solutions. Our aim was to show that the idea of processing XML data using relational databases is still up to date and should be further developed. From the overview we can see that even though there are interesting and inspiring approaches, there is still a variety of open problems whose solutions can further improve database-based XML processing.
Our future work will naturally follow the open issues stated at the end of this paper and especially investigate the solutions we have mentioned. Firstly, we will focus on the idea of improving the user-driven techniques using an adaptive algorithm for non-annotated parts of the schema together with a deeper exploitation of the user-given hints using pattern-matching methods, i.e. a hybrid user-driven cost-based system. Secondly, we will deal with the problem of the missing theoretical study of schema transformations, their classification, and particularly their influence on the complexity of the search algorithm. And finally, on the basis of the theoretical study and the hybrid system, we will study and experimentally analyze the dynamic enhancing of the system.
References
[1] DB2 XML Extender. IBM. http://www.ibm.com/.
[2] Oracle XML DB. Oracle Corporation. http://www.oracle.com/.
[3] S. Amer-Yahia. Storage Techniques and Mapping Schemas for XML. Technical Report TD-5P4L7B, AT&T Labs-Research, 2003.
[4] A. Balmin and Y. Papakonstantinou. Storing and Querying XML Data Using Denormalized Relational Databases. The VLDB Journal, 14(1):30–49, 2005.
[5] D. Barbosa, L. Mignet, and P. Veltri. Studying the XML Web: Gathering Statistics from an XML Sample. World Wide Web, 8(4):413–438, 2005.
[6] G. J. Bex, F. Neven, and J. Van den Bussche. DTDs versus XML Schema: a Practical Study. In WebDB’04: Proc. of the 7th Int. Workshop on the Web and Databases, pages 79–84, New York, NY, USA, 2004. ACM Press.
[7] P. V. Biron and A. Malhotra. XML Schema Part 2: Datatypes (Second Edition). W3C, October 2004.
[8] F. Du, S. Amer-Yahia, and J. Freire. ShreX: Managing XML Documents in Relational Databases. In VLDB’04: Proc. of 30th Int. Conf. on Very Large Data Bases, pages 1297–1300, Toronto, ON, Canada, 2004. Morgan Kaufmann Publishers Inc.
[9] T. Bray et al. Extensible Markup Language (XML) 1.0 (Fourth Edition). W3C, September 2006.
[10] D. Florescu and D. Kossmann. Storing and Querying XML Data using an RDMBS. IEEE Data Eng. Bull., 22(3):27–34, 1999.
[11] J. Freire, J. R. Haritsa, M. Ramanath, P. Roy, and J. Simeon. StatiX: Making XML Count. In ACM SIGMOD’02: Proc. of the 21st Int. Conf. on Management of Data, pages 181–192, Madison, Wisconsin, USA, 2002. ACM Press.
[12] T. Grust. Accelerating XPath Location Steps. In SIGMOD’02: Proc. of the ACM SIGMOD Int. Conf. on Management of Data, pages 109–120, New York, NY, USA, 2002. ACM Press.
[13] M. Klettke and H. Meyer. XML and Object-Relational Database Systems – Enhancing Structural Mappings Based on Statistics. In Lecture Notes in Computer Science, volume 1997, pages 151–170, 2000.
[14] M. Kratky, J. Pokorny, and V. Snasel. Implementation of XPath Axes in the Multi-Dimensional Approach to Indexing XML Data. In Proc. of Current Trends in Database Technology – EDBT’04 Workshops, pages 46–60, Heraklion, Crete, Greece, 2004. Springer.
[15] R. Krishnamurthy, V. Chakaravarthy, and J. Naughton. On the Difficulty of Finding Optimal Relational Decompositions for XML Workloads: A Complexity Theoretic Perspective. In ICDT’03: Proc. of the 9th Int. Conf. on Database Theory, pages 270–284, Siena, Italy, 2003. Springer.
[16] A. Kuckelberg and R. Krieger. Efficient Structure Oriented Storage of XML Documents Using ORDBMS. In Proc. of the VLDB’02 Workshop EEXTT and CAiSE’02 Workshop DTWeb, pages 131–143, London, UK, 2003. Springer-Verlag.
[17] Q. Li and B. Moon. Indexing and Querying XML Data for Regular Path Expressions. In VLDB’01: Proc. of the 27th Int. Conf. on Very Large Data Bases, pages 361–370, San Francisco, CA, USA, 2001. Morgan Kaufmann Publishers Inc.
[18] I. Mlynkova and J. Pokorny. From XML Schema to Object-Relational Database – an XML Schema-Driven Mapping Algorithm. In ICWI’04: Proc. of IADIS Int. Conf. WWW/Internet, pages 115–122, Madrid, Spain, 2004. IADIS.
[19] I. Mlynkova and J. Pokorny. XML in the World of (Object-)Relational Database Systems. In ISD’04: Proc. of the 13th Int. Conf. on Information Systems Development, pages 63–76, Vilnius, Lithuania, 2004. Springer Science+Business Media, Inc.
[20] I. Mlynkova, K. Toman, and J. Pokorny. Statistical Analysis of Real XML Data Collections. In COMAD’06: Proc. of the 13th Int. Conf. on Management of Data, pages 20–31, New Delhi, India, 2006. Tata McGraw-Hill Publishing Company Limited.
[21] C.-H. Moh, E.-P. Lim, and W. K. Ng. DTD-Miner: A Tool for Mining DTD from XML Documents. In WECWIS’00: Proc. of the 2nd Int. Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, pages 144–151, Milpitas, CA, USA, 2000. IEEE.
[22] M. Murata, D. Lee, and M. Mani. Taxonomy of XML Schema Languages Using Formal Language Theory. In Proc. of the Extreme Markup Languages Conf., Montreal, Quebec, Canada, 2001.
[23] S. Nestorov, S. Abiteboul, and R. Motwani. Extracting Schema from Semistructured Data. In SIGMOD’98: Proc. of the ACM Int. Conf. on Management of Data, pages 295–306, Seattle, Washington, USA, 1998. ACM Press.
[24] E. Rahm and T. Bohme. XMach-1: A Benchmark for XML Data Management. Database Group Leipzig, 2006.
[25] M. Ramanath, J. Freire, J. Haritsa, and P. Roy. Searching for Efficient XML-to-Relational Mappings. In XSym’03: Proc. of the 1st Int. XML Database Symposium, volume 2824, pages 19–36, Berlin, Germany, 2003. Springer.
[26] J. Shanmugasundaram, K. Tufte, C. Zhang, G. He, D. J. DeWitt, and J. F. Naughton. Relational Databases for Querying XML Documents: Limitations and Opportunities. In VLDB’99: Proc. of 25th Int. Conf. on Very Large Data Bases, pages 302–314, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc.
[27] I. Tatarinov, S. D. Viglas, K. Beyer, J. Shanmugasundaram, E. Shekita, and C. Zhang. Storing and Querying Ordered XML Using a Relational Database System. In SIGMOD’02: Proc. of 21st Int. Conf. on Management of Data, pages 204–215, Madison, Wisconsin, USA, 2002. ACM Press.
[28] H. S. Thompson, D. Beech, M. Maloney, and N. Mendelsohn. XML Schema Part 1: Structures (Second Edition). W3C, October 2004.
[29] W. Xiao-ling, L. Jin-feng, and D. Yi-sheng. An Adaptable and Adjustable Mapping from XML Data to Tables in RDB. In Proc. of the VLDB’02 Workshop EEXTT and CAiSE’02 Workshop DTWeb, pages 117–130, London, UK, 2003. Springer-Verlag.
[30] B. B. Yao and M. T. Ozsu. XBench – A Family of Benchmarks for XML DBMSs. University of Waterloo, School of Computer Science, Database Research Group, 2002.
[31] S. Zheng, J. Wen, and H. Lu. Cost-Driven Storage Schema Selection for XML. In DASFAA’03: Proc. of the 8th Int. Conf. on Database Systems for Advanced Applications, pages 337–344, Kyoto, Japan, 2003. IEEE Computer Society.
XML View Based Access to Relational Data in Workflow Management Systems
Christian Dreier
Department of Informatics-Systems
University of Klagenfurt, Austria
Johann Eder, Marek Lehmann, Juergen Mangler
Department of Knowledge and Business Engineering
University of Vienna, Austria
Abstract
XML became the most significant standard for data exchange and publication over the internet, but most business data remain stored in relational databases. Access to business data in workflows or Web services is frequently realized by accessing XML views published over relational databases. In these loosely coupled environments, where activities can execute on XML views generated from relational data without a possibility to lock the base data, it is necessary to provide view freshness control and invalidation mechanisms. In this paper we present an invalidation method for XML views published over relational data, developed for our prototype workflow management system.
Keywords: Workflow management, workflow data, XML, XML views, view invalidation
1 Introduction
XML became the most significant standard for data exchange and publication over the internet. Nevertheless, most business data remain stored in relational databases. XML views over relational data are seen as a general way to publish relational data as XML documents. There are many proposals to overcome the mismatch between flat relational and hierarchical XML models (e.g. [1,2]). Commercial relational database management systems also offer a possibility to publish relational data as XML (e.g. [3,4]).
The importance of XML technology is increasing tremendously in process management. Web services [5], workflow management systems [6] and B2B standards [7,8] use XML as a data format. Complex XML documents published and exchanged by business processes are usually defined with XML Schema types. Process activities expect and produce XML documents as parameters. XML documents encapsulated in messages (e.g.
International Journal of Computer Science & ApplicationsVol. IV, No. II, pp. 63 - 74
© 2006 Technomathematics Research Foundation
Christian Dreier, Johann EderMarek Lehman, Juergen Mangler 63
WSDL) can trigger new process instances. Frequently, hierarchical XML data used by processes and activities have to be translated into the flat relational data model used by external databases. These systems are very often loosely coupled and it is impossible or very difficult to provide view maintenance. On the other hand, the original data can be accessed and modified by other systems or application programs. Therefore, a method of controlling the freshness of a view and invalidating views becomes vital.
We developed a view invalidation method for our prototype workflow management system. A special data access module, the so-called generic data access plug-in (GDAP), enables the definition of XML views over relational data. GDAP offers a possibility to check the view freshness and can invalidate a stale view. In the case of view update operations, the GDAP automatically checks that the view is not stale before propagating the update to the original database.
The remainder of this paper is organized as follows: Section 2 presents an overall architecture of our prototype workflow management system and introduces the idea of data access plug-ins used to provide uniform access to external data in workflows. Section 3 discusses invalidation mechanisms for XML views defined over relational data and Section 4 describes their actual implementation in our system. We give an overview of related work in Section 5 and finally draw conclusions in Section 6.
2 Uniform and Flexible Data Access in Workflow Management Systems
Workflow management systems are not intended to provide general data management system capabilities, although they have to be able to work with large amounts of data coming from different sources. Business data, describing persistent business information necessary to run an enterprise, may be controlled either by a workflow management system or be managed in external systems (e.g. a corporate database). The workflow management system needs direct data access to make control flow decisions based upon data values. An important drawback is that data external to the workflow management system can only be used indirectly for this purpose, e.g. be queried for control decisions. Therefore most of the activity programming is related to accessing external databases [9].
We propose to provide the workflow management system with a uniform and transparent access method to all business data stored in any data source. The workflow management system should be able to use data coming from external and independent systems to determine a state transition or to pass it between activities as parameters. This is achieved by an abstraction layer called data access plug-ins.
A general architecture of our workflow management prototype is presented in Fig. 1. The workflow engine provides operational functions to support the execution of processes. The workflow repository stores both workflow definition and instance data. The program interaction manager calls programs implementing automated activities. The worklist manager is responsible for worklists of the human actors and for the interaction with the worklist handlers. The data access plug-in manager is responsible for registering and managing data access plug-ins. Apart from the generic data access plug-in there may be specialized plug-ins for specific data sources (e.g. legacy systems). Our implementation included a generic data access plug-in for relational databases and another one for XML files stored in a file system.
Figure 1: Workflow management system architecture with data access plug-ins (components shown: workflow engine, workflow repository, program interaction manager, worklist manager, worklist handler, data access plug-in manager, data access plug-ins, external systems, external data sources)
2.1 Data Access Plug-ins
Data access plug-ins are reusable and interchangeable wrappers around external data sources which present to the workflow management system the content of underlying data sources and manage the access to it. The functionality of external data sources is abstracted in these plug-ins.
Each data access plug-in provides documents in one or several predefined XML Schema types. Both a data access plug-in and the XML Schema types served by this plug-in are registered with the workflow management system. Once registered, a data access plug-in can be reused in many workflow definitions to access external data as XML documents of a given type.
Consider the following frequent scenario: an enterprise has a large database with the customer data stored in several relations and used in many processes. In our approach the company defines a complex XML Schema type describing customer data and implements a data access plug-in which wraps this database and retrieves and stores customer data in XML format. This has several advantages:
• Business data from external systems are accessible by the workflow management system. Thus, these data can be passed to activities and used to make control flow decisions.
• Activities can be parameterized with XML documents of predefined types. The logic for accessing external data sources is hidden in a data access plug-in fetching documents passed to activities at runtime. This allows activities to be truly reusable and independent of physical data location.
• Making external data access explicit with the data access plug-ins rather than hiding it in the activities improves the understandability, maintainability and auditability of process definitions.
• Both data access plug-ins and XML Schema types are reusable.
• This solution is easily evolvable. If the customer data have to be moved to a different database, it is sufficient to use another data access plug-in. The process definition and activities remain basically unchanged.
The task of a data access plug-in is to translate the operations on XML documents to the underlying data sources. A data access plug-in exposes to the workflow management system a simple interface which allows XML documents to be read, written or created in a collection of many documents of the same XML Schema type. Each document in the collection is identified by a unique identifier. The plug-in must be able to identify the document in the collection given only this identifier.
Each data access plug-in allows an XPath expression to be evaluated on a selected XML document. The XML documents used within a workflow can be used by the workflow engine to control the flow of workflow processing. This is done in conditional split nodes by evaluating the XPath conditions on documents. If a given document is stored in an external data source and accessed by a data access plug-in, then the XPath condition has to be evaluated by this plug-in. XPath is also used to access data values in XML documents.
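The interface described above can be sketched as an abstract class. This is only an illustration under our own assumptions: the class and method names are hypothetical, the in-memory implementation stands in for a real data source, and the path evaluation relies on the limited XPath subset of Python's xml.etree.ElementTree rather than full XPath.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from typing import Dict, List


class DataAccessPlugin(ABC):
    """Hypothetical plug-in interface: read, write, create, evaluate."""

    @abstractmethod
    def read(self, doc_id: str) -> ET.Element: ...

    @abstractmethod
    def write(self, doc_id: str, document: ET.Element) -> None: ...

    @abstractmethod
    def create(self, document: ET.Element) -> str: ...

    @abstractmethod
    def evaluate(self, doc_id: str, path: str) -> List[str]: ...


class InMemoryPlugin(DataAccessPlugin):
    """Toy plug-in keeping one collection of documents in a dictionary."""

    def __init__(self) -> None:
        self._docs: Dict[str, ET.Element] = {}
        self._next_id = 1

    def read(self, doc_id: str) -> ET.Element:
        return self._docs[doc_id]

    def write(self, doc_id: str, document: ET.Element) -> None:
        if doc_id not in self._docs:
            raise KeyError(doc_id)
        self._docs[doc_id] = document

    def create(self, document: ET.Element) -> str:
        # The unique identifier is all the caller needs to find the document again.
        doc_id = str(self._next_id)
        self._next_id += 1
        self._docs[doc_id] = document
        return doc_id

    def evaluate(self, doc_id: str, path: str) -> List[str]:
        # ElementTree supports only a subset of XPath; enough for this sketch.
        return [el.text or "" for el in self._docs[doc_id].findall(path)]


plugin = InMemoryPlugin()
customer_id = plugin.create(
    ET.fromstring("<customer><name>Ann</name><status>gold</status></customer>")
)
# A conditional split node could branch on the value selected by the path.
is_gold = plugin.evaluate(customer_id, "./status") == ["gold"]
```

A plug-in for a relational source would implement the same four operations by translating them into queries against the wrapped database.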
2.2 Generic Data Access Plug-in for Relational Data Sources
Most business data remain stored in relational databases. Therefore, a generic and expandable solution for relational data sources was needed. A generic data access plug-in (GDAP) offers basic operations and can be extended by users for their specific data sources. GDAP is responsible for mapping the hierarchical XML documents used by workflows and activities into the flat relational data model used by external databases. Thus, documents produced by GDAP can be seen as XML views of relational data.
The workflows and activities managed by the workflow management system can run for a long time. In a loosely coupled workflow scenario it is neither reasonable nor possible to lock data in the original database for the processing time of a workflow. At the same time, these data can be modified by other systems or workflows. In order to provide optimistic concurrency control, some form of view invalidation is required [4]. Therefore, GDAP provides a view freshness control and view invalidation method. In the case of view update operations, GDAP automatically checks that the view is not stale before propagating the update to the original database.
3 Change Detection and View Invalidation
In our GDAP (generic data access plug-in) we analyzed and implemented a mechanism for invalidation of XML views of relational data by detecting relevant base data changes. Change detection in base relational data can be done in two ways: passive or active.
In our approach, passive change detection means that a GDAP is informed about changes in relational data used in views managed by this plug-in. Therefore, it is necessary that a (passive) plug-in is able to subscribe certain views to this notification process. Additionally, a view that is no longer used needs to be unsubscribed. We identified three passive mechanisms for change detection:
1. Passive change detection by use of concepts of active databases. That means that triggers are defined on base relations containing data published in the views, informing the GDAP about changes in the database.
2. Passive change detection by change methods: This mechanism is based on object-oriented and object-relational databases, which provide the possibility to implement change methods. Change methods that implement change operations on the underlying base data can be extended by a functionality to inform the plug-ins in case of potential view invalidations.
3. Passive change detection by event mechanisms: This is the most general approach because here an additional publish-subscribe mechanism is assumed. GDAPs subscribe views for different events on the database (e.g. change operations on base data). If an event occurs, a notification is sent to the GDAP, initiating the view invalidation process.
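The first passive mechanism can be illustrated with a minimal sketch based on SQLite triggers. Everything in it is an assumption made for illustration: the `customer` relation, the `notify_gdap` callback and the `notifications` list are hypothetical, and registering an application-defined function that the triggers call is only one possible way of letting the database inform a plug-in; it is not the mechanism of our prototype.

```python
import sqlite3

# Stand-in for the plug-in's list of pending invalidation checks (hypothetical).
notifications = []


def notify_gdap(relation, operation, key):
    """Callback invoked from the triggers below whenever base data changes."""
    notifications.append((relation, operation, key))


conn = sqlite3.connect(":memory:")
# Allow application-defined functions inside triggers (usually the default).
conn.execute("PRAGMA trusted_schema = ON")
conn.create_function("notify_gdap", 3, notify_gdap)

conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);

CREATE TRIGGER customer_ins AFTER INSERT ON customer
BEGIN
    SELECT notify_gdap('customer', 'insert', NEW.id);
END;

CREATE TRIGGER customer_upd AFTER UPDATE ON customer
BEGIN
    SELECT notify_gdap('customer', 'update', NEW.id);
END;

CREATE TRIGGER customer_del AFTER DELETE ON customer
BEGIN
    SELECT notify_gdap('customer', 'delete', OLD.id);
END;
""")

# Changes made by any client of this connection now reach the callback.
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ann')")
conn.execute("UPDATE customer SET name = 'Anna' WHERE id = 1")
conn.execute("DELETE FROM customer WHERE id = 1")
```

In a client-server database the triggers would typically write to a notification channel or table instead of calling into the plug-in process directly.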
On the other hand, using active change detection techniques, it is the GDAP’s own responsibility to check, periodically or at defined points in time, whether the underlying base data has changed. Because neither concepts of active or object-oriented databases nor publish-subscribe mechanisms are required, these techniques are universal. We distinguish three active change detection methods:
1. A naive approach is to back up the relevant relational base data at view creation time and compare it to the relational base data at the time when the view validity has to be checked. Differences in these two data sets may lead to view invalidation.
2. To avoid storing huge amounts of backup data, a hash function can be used to compute a hash value of the data and back up only this value. At the point in time when a view validity check becomes necessary, a hash value is computed again on the new database state and compared to the backed-up value to determine changes. Notice that a collision is possible in this case, i.e. hash values could be the same for different data, so that a relevant change goes undetected and a stale view is not invalidated.
3. Usage of change logs: Change logs record all changes within the database caused by internal or external actors. Because no database state needs to be backed up at view-creation time, less space is used.
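The hash-based method (item 2 of the active methods) can be sketched as follows. This is a minimal illustration, not the GDAP implementation: the function names, the use of SHA-256 and the row representation as Python tuples are assumptions made for the example.

```python
import hashlib


def snapshot_hash(rows):
    """Return a SHA-256 digest of the given base-data rows (order-independent)."""
    digest = hashlib.sha256()
    # Sort by repr so the digest does not depend on the order in which rows are read.
    for row in sorted(rows, key=repr):
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")  # separator between rows
    return digest.hexdigest()


def base_data_changed(stored_hash, current_rows):
    """Compare the hash stored at view creation time with the current state."""
    return snapshot_hash(current_rows) != stored_hash


# Hypothetical base data at view creation time; only the digest is kept.
rows_at_creation = [(1, "Joe", 1000), (2, "Bill", 2000)]
stored = snapshot_hash(rows_at_creation)
```

Only the fixed-size digest has to be stored per view, which is the space saving the method aims at; the price is that two different database states could, in principle, produce the same digest.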
After changes in base data have been detected, the second task of the GDAP is to determine whether the corresponding views became stale, i.e. to check whether they need to be invalidated. Not all changes in the base relational data lead to invalid views. We developed an algorithm that considers both the type of change operation and the view definition to check the view invalidation.
Figure 2 gives an overview of this algorithm. It shows the different effects caused by different change operations: The change of a tuple that is displayed in the view (i.e. that satisfies the selection-condition) always leads to view invalidation. In certain cases, change operations on tuples that are not displayed in the view lead to invalidation too: (1) Changes of tuples selected by a where-clause make views stale. (2) If the view-definition does not contain a where-clause, changes of tuples in any relation occurring in the from-clause of the view-definition invalidate the view.
These cases also apply to deletes and inserts. A tuple inserted by an insert operation invalidates the view if it is displayed in the view, if it is selected by the where-clause, or if the view-definition does not contain a where-clause and the tuple is inserted into a relation occurring in the from-clause of the view-definition.
The same applies to deletes: A view is invalidated if the deleted tuple is displayed in the view, if it is selected by the where-clause, or if the view-definition does not contain a where-clause and the tuple is deleted from a relation occurring in the from-clause of the view-definition.
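The decision procedure described above can be written down as a small function. This is only a sketch of the decision tree; the function and parameter names are hypothetical, and how each boolean is obtained from the view definition is left open.

```python
def change_invalidates_view(shown_in_view, has_where_clause,
                            selected_by_where, in_from_relation):
    """Decide whether one changed (updated, inserted or deleted) tuple
    invalidates a view, following the decision tree of the text."""
    if shown_in_view:
        # The tuple satisfies the selection-condition: always invalid.
        return True
    if has_where_clause:
        # Invalid only if the where-clause selects the tuple.
        return selected_by_where
    # No where-clause: invalid if the tuple belongs to a relation
    # occurring in the from-clause of the view-definition.
    return in_from_relation
```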
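Figure 2: View invalidation algorithm. [Decision tree: if the selection-clause is satisfied (the tuple is shown in the view), the view is invalid. Otherwise, if a where-clause exists, the view is invalid if the where-clause selects the tuple and valid if not; if no where-clause exists, the view is invalid if the tuple occurs in a relation occurring in the from-clause and valid if not.]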
If we assume that, for update-propagation reasons, the identifying primary keys of the base tuples are contained in the view, every tuple of the view can be associated with its base tuple and vice versa. Thus, every single change within the base data can be associated with the view.
4 Implementation
To validate our approach we implemented a generic data access plug-in which was integrated into our prototype workflow management system described in Section 2. The current implementation of the GDAP for relational databases (Fig. 3) takes advantage of XML-DBMS middleware for transferring data between XML documents and relational databases [10]. XML-DBMS maps the XML document to the database according to an object-relational mapping in which element types are generally viewed as classes and attributes and XML text data as properties of those classes. An XML-based mapping language allows the user to define an XML view of relational data by specifying these mappings. XML-DBMS also supports insert, update and delete operations. In our implementation we follow the assumption made by XML-DBMS that the view updatability problem has already been resolved. XML-DBMS checks only basic properties for view updatability, e.g. the presence of primary keys and obligatory attributes in an updatable view. Other issues, like content and structural duplication, are not addressed.
The GDAP controls the freshness of generated XML views using predefined triggers and so-called view-tuple lists (VTLs). A VTL contains the primary keys of the tuples which were selected into the view. VTLs are managed and stored internally by the GDAP. A sample view-tuple list, which contains the primary keys of the displayed tuples in each involved relation, is shown in Table 1.
Our GDAP also uses triggers defined on the tables which were used to create a view. These triggers are created together with the view definition and store the primary keys of all tuples inserted, deleted or modified within these relations in a special log. This log is inspected by the GDAP later, and the information gathered during this inspection is used for the invalidation process of the view. Thus, our implementation uses a variant of active change detection with change logs as described in Section 3.
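To illustrate trigger-based logging of primary keys, the following sketch uses Python's built-in sqlite3 module with an in-memory database. It is not the authors' implementation: the table layout, the trigger names and the change_log table are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    tupleID INTEGER PRIMARY KEY, name TEXT, salary INTEGER, department TEXT);
-- Hypothetical log table: one row per changed base tuple.
CREATE TABLE change_log (relation TEXT, pk INTEGER, operation TEXT);

CREATE TRIGGER employees_ins AFTER INSERT ON employees
BEGIN
    INSERT INTO change_log VALUES ('employees', NEW.tupleID, 'insert');
END;
CREATE TRIGGER employees_upd AFTER UPDATE ON employees
BEGIN
    INSERT INTO change_log VALUES ('employees', NEW.tupleID, 'update');
END;
CREATE TRIGGER employees_del AFTER DELETE ON employees
BEGIN
    INSERT INTO change_log VALUES ('employees', OLD.tupleID, 'delete');
END;
""")

# Changes made by any actor are recorded without the actor's cooperation.
conn.execute("INSERT INTO employees VALUES (1, 'Joe', 1000, 'IT')")
conn.execute("UPDATE employees SET salary = 1100 WHERE tupleID = 1")
conn.execute("DELETE FROM employees WHERE tupleID = 1")

# The plug-in would later inspect the log, e.g. like this.
log = conn.execute(
    "SELECT relation, pk, operation FROM change_log ORDER BY rowid").fetchall()
```

The log stores only primary keys and the kind of operation, so it stays small even when the changed tuples themselves are large.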
Two different VTLs are used in the invalidation process: VTLold is computed when the view is generated, VTLnew is computed at the time of the invalidation process. The primary keys of modified records in the original tables logged by a trigger are used to detect possible changes in a view.

Figure 3: Generic data access plug-in architecture. [Components shown: the Workflow Engine and XForms; the Generic Data Access Plugin (implements Workflow.Plugins.DataAccessPluginInterface) with view invalidation checking, view updatability checking, passive and active change detection, VTLold/VTLnew, X-Diff and the files LogDef.xml, ViewDef.xml, viewList.bin and vtl.bin; the XML-DBMS middleware producing the XML view via SQL; the relational DB with immutable schema, trigger and log; external users/applications causing DB changes.]

Table 1: VTL example
view-id    relation   primary key            key values
view 001   order      orderNr                4, 6, 7
           customer   customerNr             5, 8, 9
view 002   position   orderNr, positionNr    (5,1), (5,2), (5,4)
view 003   article    articleNr              1, 3, 4, 5
...        ...        ...                    ...

The algorithm is as follows: For each tuple T in a change log check one of the following (IDT denotes the identifying primary key of the tuple T):
• Case 1: IDT is contained both in VTLold and in VTLnew. This denotes an update operation.
• Case 2: IDT is contained only in VTLnew. This denotes an insert operation.
• Case 3: IDT is contained only in VTLold. This denotes a delete operation.
The check procedure stops as soon as one of Cases 1-3 is true. This means that one of the modified tuples logged in the change log is selected into the view and the view must be invalidated. However, the view may also have to be invalidated when none of the logged tuples satisfies the selection-condition of the view definition. This is described by the next two cases, which also must be checked.

Table 2: View example
tupleID   name   salary   maxSalary
1         Joe    1000     3000
2         Bill   2000     3000

Figure 4: View invalidation method implemented in GDAP. [Decision flow: if Case 1, Case 2 or Case 3 is satisfied, the view is invalid; otherwise, if Case 4 or Case 5 is satisfied, the view is invalid, else it is valid.]

Viewold and Viewnew denote the original view and the view generated during the validation checking process, respectively:
• Case 4: If VTLold is not equal to VTLnew, the view is invalid because the set of tuples has changed.
• Case 5: If VTLold is equal to VTLnew and Viewold is not equal to Viewnew, the view has to be invalidated. That means that the tuples remained the same, but values within these tuples have changed.
To see why it is necessary to check Case 5, consider the following view-definition and the corresponding view listed in Table 2:
    SELECT tupleID, name, salary,
           (SELECT max(salary)
            FROM employees
            WHERE department='IT') AS maxSalary
    FROM employees
    WHERE department='IT' AND tupleID<3
If the salary of the employee with the maximum salary is changed (notice that this employee is not selected by the selection-condition), still the same tuples are selected, but within the view maxSalary changes.
The invalidation checking in Cases 1-4 does not require view recomputation. Case 5, which does, only needs to be checked if Cases 1-4 are not satisfied. Notice that while Cases 4 and 5 have to be checked only once, Cases 1-3 have to be checked for every tuple occurring in the change log. The invalidation algorithm used in our GDAPs is summarized in Fig. 4.
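The order of the checks in Cases 1-5 can be sketched as follows. The sketch assumes a view over a single relation, so that a VTL is simply a set of primary keys; the function name and the recompute_view callback are hypothetical and stand in for whatever the plug-in uses to regenerate the view.

```python
def view_is_stale(changed_keys, vtl_old, vtl_new, recompute_view, view_old):
    """Return True if the view has to be invalidated.

    changed_keys   -- primary keys found in the change log
    vtl_old        -- set of keys selected when the view was generated
    vtl_new        -- set of keys selected at checking time
    recompute_view -- callable returning the freshly computed view
    view_old       -- the originally generated view
    """
    # Cases 1-3: checked for every logged tuple, stop at the first hit.
    for key in changed_keys:
        if key in vtl_old or key in vtl_new:
            return True
    # Case 4: checked once, the set of selected tuples has changed.
    if vtl_old != vtl_new:
        return True
    # Case 5: only now is the expensive recomputation needed.
    return recompute_view() != view_old
```

Passing the recomputation as a callable makes the cost structure explicit: the view is regenerated only when none of the cheaper key-based checks has already decided the question.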
Case 5 is checked by comparing two views, as shown above. The disadvantages of this comparison are that the recomputation of the view Viewnew is time- and resource-consuming and that the comparison itself may be very inefficient. A more efficient way is to replace Case 5 by Case 5a:
• Case 5a: If VTLold is equal to VTLnew and there is an additional sub-select-clause within the select-clause and any relation in the from-clause of this sub-select has been changed, then the view is invalidated.

Table 3: View example
tupleID   name   salary   comparisonSalary
1         Joe    1000     2500
2         Bill   2000     2500
This way the view shown in Table 2 can be invalidated after the attribute maxSalary is changed.
It is also possible that the sub-select-clause does not contain group-functions, as in the following view-definition:
    SELECT tupleID, name, salary,
           (SELECT salary
            FROM employees
            WHERE tupleId='10') AS comparisonSalary
    FROM employees
    WHERE department='IT' AND tupleID='3'
The resulting view is listed in Table 3. If there is a change operation on the salary of the employee with tupleId=10, the view has to be invalidated. This is checked by Case 5a. Additionally, all changes on relations occurring in the from-clause of the sub-select lead to an invalidation of the view. In Case 5a, even changes that do not affect the view itself can lead to its invalidation. Thus, over-invalidation may occur to a high degree. Still, this mechanism of checking view freshness and view invalidation seems to be a more efficient alternative to Case 5.
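Case 5a can be expressed as a purely set-based test, which is why it avoids recomputing the view. The following is a sketch under the assumption that the relations used in the sub-select and the relations touched by logged changes are available as sets of names; the function and parameter names are hypothetical.

```python
def case_5a(vtl_old, vtl_new, subselect_relations, changed_relations):
    """Return True if Case 5a demands invalidation of the view.

    subselect_relations -- relations in the from-clause of a sub-select
                           inside the select-clause (empty if none exists)
    changed_relations   -- relations for which the change log has entries
    """
    return (vtl_old == vtl_new
            and bool(subselect_relations)
            and not changed_relations.isdisjoint(subselect_relations))
```

Because the test looks only at relation names, any change to a relation used in the sub-select triggers invalidation, including changes that leave the sub-select result untouched; this is the over-invalidation mentioned in the text.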
5 Related Work
In most existing workflow management systems, data used to control the flow of the workflow instances (i.e. workflow relevant data) are controlled by the workflow management system itself and stored in the workflow repository. If these data originate in external data sources, then external data are usually copied into the workflow repository. There is no universal standard for accessing external data in workflow management systems. Basically, each product uses a different solution [11].
There has been recent interest in publishing relational data as XML documents, often called XML views over relational data. Most of these proposals focus on converting queries over XML documents into SQL queries over the relational tables and on efficient methods of tagging and structuring relational data into XML (e.g. [1,2,12]).
The view updatability problem is well known in relational and object-relational databases [13]. The mismatch between flat relational and hierarchical XML models is an additional challenge. This problem is addressed in [14]. However, most proposals of updatable XML views [15] and commercial RDBMSs (e.g. [3]) assume that the XML view updatability problem is already solved. The term view freshness is not used in a uniform way; it deals with the currency and timeliness [16] of a (materialized) view. Additional dimensions of view freshness regarding the frequency of changes are discussed in [17]. We do not distinguish all these dimensions in this paper. Here view freshness means the validity of a view. Different types of invalidation, including over- and under-invalidation, are discussed in [18].
A view is not fresh (stale) if data used to generate the view were modified. It is important to detect relevant modifications. In [19] the authors proposed to store both original XML documents and their materialized XML views in special relational tables and to use an update log to detect relevant updates. A new view generated from the modified base data may differ from a previously generated view. Several methods for change detection of XML documents were proposed (e.g. [20, 21]). The authors of [22] proposed first to store XML in special relational tables and then to use SQL queries to detect content changes of such documents. In [4], before and after images of an updatable XML view are compared in order to find the differences, which are later used to generate corresponding SQL statements responsible for updating relational data. The before and after images are also used to provide optimistic concurrency control.
6 Conclusions
The data aspect of workflows requires more attention. Since workflows typically access databases for performing activities or making flow decisions, the correct synchronization between the base data and the copies of these data in workflow systems is of great importance for the correctness of the workflow execution. We described a way of recognizing the invalidation of materialized views of relational data used in workflow execution. To check the freshness of generated views, our algorithm does not require any special data structures in the RDBMS except a log table and triggers. Additionally, view-tuple lists are managed to store the primary keys of tuples selected into a view. Thus, only a very small amount of overhead data is stored and can be used to invalidate stale views.
The implemented generic data access plug-in enables flexible publication of relational data as XML documents used in loosely coupled workflows. This brings obvious advantages for intra- and interorganizational exchange of data. In particular, it makes the definition of workflows easier and the coupling between workflow system and databases more transparent, since it is no longer necessary to perform all the required checks in the individual activities of a workflow.
Acknowledgements
This work is partly supported by the Commission of the European Union within the project WS-Diamond in FP6.STREP.
References
[1] M. Fernandez, Y. Kadiyska, D. Suciu, A. Morishima, and W.-C. Tan, “SilkRoute: A framework for publishing relational data in XML,” ACM Trans. Database Syst., vol. 27, no. 4, pp. 438–493, 2002.
[2] J. Shanmugasundaram, J. Kiernan, E. J. Shekita, C. Fan, and J. Funderburk, “Querying XML views of relational data,” in VLDB ’01: Proceedings of the 27th International Conference on Very Large Data Bases. Morgan Kaufmann Publishers Inc., 2001, pp. 261–270.
[3] Oracle, XML Database Developer’s Guide - Oracle XML DB. Release 2 (9.2), Oracle Corporation, October 2002.
[4] M. Rys, “Bringing the internet to your database: Using SQL Server 2000 and XML to build loosely-coupled systems,” in Proceedings of the 17th International Conference on Data Engineering ICDE, April 2-6, 2001, Heidelberg, Germany. IEEE Computer Society, 2001, pp. 465–472.
[5] T. Andrews, F. Curbera, H. Dholakia, Y. Goland, J. Klein, F. Leymann, K. Liu, D. Roller, D. Smith, S. Thatte, I. Trickovic, and S. Weerawarana, “Business process execution language for web services (BPEL4WS),” BEA, IBM, Microsoft, SAP, Siebel Systems, Tech. Rep. 1.1, 5 May 2003.
[6] WfMC, “Process definition interface - XML process definition language (XPDL 2.0),” Workflow Management Coalition, Tech. Rep. WFMC-TC-1025, 2005.
[7] ebXML, ebXML Technical Architecture Specification v1.0.4, ebXML Technical Architecture Project Team, 2001.
[8] M. Sayal, F. Casati, U. Dayal, and M.-C. Shan, “Integrating workflow management systems with business-to-business interaction standards,” in Proceedings of the 18th International Conference on Data Engineering (ICDE’02). IEEE Computer Society, 2002, p. 287.
[9] M. Ader, “Workflow and business process management comparative study. Volume 2,” Workflow & Groupware Strategies, Tech. Rep., June 2003.
[10] R. Bourret, “XML-DBMS middleware,” Viewed: May 2005, http://www.rpbourret.com/xmldbms/index.htm.
[11] N. Russell, A. H. M. t. Hofstede, D. Edmond, and W. v. d. Aalst, “Workflow data patterns,” Queensland University of Technology, Brisbane, Australia, Tech. Rep. FIT-TR-2004-01, April 2004.
[12] J. Shanmugasundaram, E. J. Shekita, R. Barr, M. J. Carey, B. G. Lindsay, H. Pirahesh, and B. Reinwald, “Efficiently publishing relational data as XML documents,” VLDB J., vol. 10, no. 2-3, pp. 133–154, 2001.
[13] C. Date, An Introduction to Database Systems, Eighth Edition. Addison Wesley, 2003.
[14] L. Wang and E. A. Rundensteiner, “On the updatability of XML views published over relational data,” in Conceptual Modeling - ER 2004, 23rd International Conference on Conceptual Modeling, Shanghai, China, November 2004, Proceedings, ser. Lecture Notes in Computer Science, P. Atzeni, W. W. Chu, H. Lu, S. Zhou, and T. W. Ling, Eds., vol. 3288. Springer, 2004, pp. 795–809.
[15] I. Tatarinov, Z. G. Ives, A. Y. Halevy, and D. S. Weld, “Updating XML,” in SIGMOD Conference, 2001.
[16] M. Bouzeghoub and V. Peralta, “A framework for analysis of data freshness,” in IQIS 2004, International Workshop on Information Quality in Information Systems, 18 June 2004, Paris, France (SIGMOD 2004 Workshop), F. Naumann and M. Scannapieco, Eds. ACM, 2004, pp. 59–67.
[17] M. Bouzeghoub and V. Peralta, “On the evaluation of data freshness in data integration systems,” in 20 emes Journees de Bases de Donnees Avancees (BDA 2004), 2004.
[18] K. S. Candan, D. Agrawal, W.-S. Li, O. Po, and W.-P. Hsiung, “View invalidation for dynamic content caching in multitiered architectures,” in VLDB, 2002, pp. 562–573.
[19] H. Kang, H. Sung, and C. Moon, “Deferred incremental refresh of XML materialized views: Algorithms and performance evaluation,” in Database Technologies 2003, Proceedings of the 14th Australasian Database Conference, ADC 2003, Adelaide, South Australia, February 2003, ser. CRPIT, vol. 17. Australian Computer Society, 2003, pp. 217–226.
[20] G. Cobena, S. Abiteboul, and A. Marian, “Detecting changes in XML documents,” in Proceedings of the 18th International Conference on Data Engineering ICDE, 26 February - 1 March 2002, San Jose, CA. IEEE Computer Society, 2002, pp. 41–52.
[21] Y. Wang, D. J. DeWitt, and J. yi Cai, “X-Diff: An effective change detection algorithm for XML documents,” in Proceedings of the 19th International Conference on Data Engineering ICDE, March 5-8, 2003, Bangalore, India, U. Dayal, K. Ramamritham, and T. M. Vijayaraman, Eds. IEEE Computer Society, 2003, pp. 519–530.
[22] E. Leonardi, S. S. Bhowmick, T. S. Dharma, and S. K. Madria, “Detecting content changes on ordered XML documents using relational databases,” in Database and Expert Systems Applications, 15th International Conference, DEXA 2004 Zaragoza, Spain, August 30-September 3, 2004, Proceedings, ser. Lecture Notes in Computer Science, F. Galindo, M. Takizawa, and R. Traunmuller, Eds., vol. 3180. Springer, 2004, pp. 580–590.
Incremental Trade-Off Management for
Preference Based Queries1
BALKE Wolf-Tilo, LOFI Christoph
L3S Research Center
Leibniz University Hannover
Appelstr. 9a, 30167 Hannover, Germany
{balke, lofi}@l3s.de

GÜNTZER Ulrich
Institute of Computer Science
University of Tübingen
Sand 13, 72076 Tübingen, Germany
[email protected] tuebingen.de
Abstract
Preference-based queries, often referred to as skyline queries, play an important role in
cooperative query processing. However, their prohibitive result sizes pose a severe chal-
lenge to the para . In this paper we discuss the incremental
re-computation of skylines based on additional information elicited from the user. Ex-
tending the traditional case of totally ordered domains, we consider preferences in their
most general form as strict partial orders of attribute values. After getting an initial sky-
line set our ap
. This additional knowledge then is incorporated into the preference
information and constantly reduces skyline sizes. In particular, our approach also allows
users to specify trade-offs between different query attributes, thus effectively decreasing
the query dimensionality. We provide the required theoretical foundations for modeling
preferences and equivalences, show how to compute incremented skylines, and prove the
correctness of the algorithm. Moreover, we show that incremented skyline computation
can take advantage of locality and database indices and thus the performance of the algo-
rithm can be additionally increased.
Keywords: Personalized Queries, Skylines, Trade-Off Management, Preference Elicita-
tion
1 Introduction
Preference-based queries, usually called skyline queries in database research [9], [4],
[19], have become a prime paradigm for cooperative information systems. Their major
appeal is the intuitiveness of use in contrast to other query paradigms such as rigid set-based SQL queries, which only too often return an empty result set, or efficient, but hard
to use top-k queries, where the success of a query depends on choosing the right scoring
or utility functions.
1 Part of this work was supported by a grant of the German Research Foundation (DFG) within the Emmy
Noether Program of Excellence.
International Journal of Computer Science & ApplicationsVol. IV, No. II, pp. 75 - 91
© 2006 Technomathematics Research Foundation
Wolf-Tilo Balke, Ulrich GüntzerChristof Lofi 75
Skyline queries offer user-centered querying as the user just has to specify the basic at-
tributes to be queried and in return retrieves the Pareto-optimal result set. In this set all
possible s to being optimal with respect to any monotonic
optimization function) are returned. Hence, a user cannot miss any important answer.
However, the intuitiveness of querying comes at a price. Skyline sets are known to grow
exponentially in size [8], [14] with the number of query attributes and may reach unrea-
sonable result sets (of about half of the original database size, cf. [3], [7]) already for as
little as six independent query predicates. The problem even becomes worse, if instead
of totally ordered domains user preferences on arbitrary predicates over attribute-based
domains are considered. In database retrieval, preferences are usually understood as par-
tial orders [12], [15], [20] of domain values that allow for incomparability between at-
tributes. This incomparability is reflected in the respective skyline sizes that are gener-
ally significantly bigger than in the totally ordered case. On the other hand such attrib-
ute-based domains like colors, book titles, or document formats play an important role in
practical applications, e.g., digital libraries or e-commerce applications. As a general
rule of thumb it can be stated that the more preference information (including its transi-
tive implications) is given by the user with respect to each attribute, the smaller the aver-
age skyline set can be expected to be. In addition to prohibitive result set sizes, skyline
queries are expensive to compute. Evaluation times in the range of several minutes or
even hours over large databases are not unheard of.
One possible solution is based on the idea of refining skyline queries incrementally by
taking advantage of user interaction. This approach is promising since it benefits skyline
sizes as well as evaluation times. Recently, several approaches have been proposed for
user-centered refinement:
• using an interactive, exploratory process steering the progressive computation of skyline objects [17]
• exploiting feedback on a representative sample of the original skyline result [8], [16]
• projecting the complete skyline on subsets of predicates using pre-computed skycubes [20], [23].
The benefit of offering intuitive querying and a cooperative system behavior to the user
in all three approaches can be obtained with a minimum of user interaction to guide the
further refinement of the skyline. However, when dealing with a massive amount of re-
sult tuples, the first approach needs a certain user expertise for steering the progressive
computation effectively. The second approach faces the problem of deriving representa-
tive samples efficiently, i.e. avoiding a complete skyline computation for each sample.
In the third approach the necessary pre-computations are expensive in the face of up-
dates of the database instance.
Moreover, basic theoretical properties of incremented preferences with respect to possible preference collisions, induced query modification and query evaluation have been outlined in [13].
In this paper we will provide the theoretical foundations of modeling partial-ordered preferences and equivalences on attribute domains, provide algorithms for incrementally and interactively computing skyline sets, and prove the soundness and consistency of the algorithms (thus giving a comprehensive view of [1], [2], [6]). Seeing preferences in
their most general form as partial orders between domain values, this implicitly includes
the case of totally ordered domains. After getting a (usually too big) initial skyline set
our approach aims at interac
wishes. The additional knowledge then is incorporated into the preference information
and helps to reduce skyline sets. Our contribution thus is:
• Users are enabled to specify additional preference information (in the sense of domination), as well as equivalences (in the sense of indifference) between attributes, leading to an incremental reduction of the skyline. Here our system will efficiently support the user by automatically taking care that newly specified preferences and equivalences never violate the consistency of the previously stated preferences.
• Our skyline evaluation algorithm will allow specifying such additional information within a certain attribute domain. That means that more preference information about an attribute is elicited from the user. Thus the respective preference will be more complete and skylines will usually become smaller. This can reduce skylines to the (on average considerably smaller) sizes of total-order skylines by canceling out incomparability between attribute values.
• In addition, our evaluation algorithm will also allow specifying additional relationships between preferences on different attributes. This feature allows defining the qualitative importance or equivalence of attributes in different domains and thus forms a good tool to compare the respective utility or desirability of certain attribute values. The user can thus express trade-offs or compromises he/she is willing to take and also can adjust imbalances between fine-grained and coarse preference specifications.
• We show that the efficiency of incremented skyline computation can be considerably increased by employing preference diagrams. We derive an algorithm which takes advantage of the locality of incremented skyline set changes depending on the changes made by the user to the preference diagram. By that, the algorithm can operate on a considerably smaller dataset with an increased efficiency.
Spanning preferences across attributes (by specifying trade-offs) is the only way, short of dropping entire query predicates, to reduce the dimensionality of the skyline computation and thus severely reduce skyline sizes. Nevertheless, the user stays in full control of the information specified, and all information is only added in a qualitative way, not by unintuitive weightings.
2 A Skyline Query Use-Case and Modeling
Before discussing the basic concepts of skyline processing, let us first take a closer look
at a motivating scenario which illustrates our modeling approach with a practical exam-
ple:
Example: Anna is currently looking for a new apartment. Naturally, she has some pref-
erences how and where she wants to live. Figure 1 shows preference diagrams of
base preferences modeled as a strict partial order on domain values of three
attributes (cf. [15], [12]): location, apartment type and price. These preferences might
either be stated explicitly by Anna together with the query or might be derived from
or activity history [11]. Some of these preferences may even be
common domain knowledge (cf. [5]) like, for instance, that in case of two equally desirable objects, the less costly alternative is generally preferred. Based on such preferences,
Anna may now retrieve the skyline over a real-estate database. The result is the Pareto-
optimal set of available apartments consisting of all apartments which are not dominated
by others, e.g. a cheap beach area loft immediately dominates all more expensive 2-
bedrooms or studios, but can, for instance, not dominate any maisonette. After the first
retrieval Anna has to manually review a probably large skyline.
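A skyline over strict partial-order preferences can be sketched in a few lines. This is a naive quadratic illustration, not the algorithm of the paper. Each preference is given as a transitively closed set of (better, worse) value pairs; the concrete pairs below are assumptions made for the example, chosen only to agree with the statement that a cheap beach area loft dominates more expensive 2-bedrooms and studios but no maisonette.

```python
def dominates(x, y, prefs):
    """x dominates y if x is equal or better in every attribute and
    strictly better in at least one; incomparable values block domination."""
    strictly_better = False
    for attr, better in prefs.items():
        if (x[attr], y[attr]) in better:
            strictly_better = True
        elif x[attr] != y[attr]:
            return False  # worse or incomparable in this attribute
    return strictly_better


def skyline(objects, prefs):
    """Return all objects not dominated by any other object."""
    return [o for o in objects
            if not any(dominates(p, o, prefs) for p in objects)]


# Assumed preference pairs (better, worse) per attribute.
prefs = {
    "location": {("beach area", "arts district")},
    "type": {("loft", "2-bedroom"), ("loft", "studio")},
    "price": {("low", "high")},
}
loft = {"location": "beach area", "type": "loft", "price": "low"}
two_bed = {"location": "beach area", "type": "2-bedroom", "price": "high"}
maisonette = {"location": "beach area", "type": "maisonette", "price": "high"}
result = skyline([loft, two_bed, maisonette], prefs)
```

In this toy instance the expensive maisonette stays in the skyline although it is more expensive than the loft, because loft and maisonette are incomparable in the type preference; this is the effect that inflates skylines over partially ordered domains.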
In the first few retrieval steps skylines usually contain a large portion of all database objects due to the incomparability between many objects. But the size of the Pareto-optimal set may be reduced incrementally by providing suitable additional information on top of the stated preferences, which will then result in new domination relationships on the level of the database objects and thus remove less preferred objects from the skyline. Naturally, existing preferences might be extended by adding some new preference relationships. But also explicit equivalences may be stated between certain attributes expressing actual indifference and thus resulting in new domination relationships, too.

Figure 1. Three typical user preferences (left) and an enhanced preference (right). [Preference diagrams: P1 (location) over beach area, arts district, university district, commercial district and outer suburbs; P2 (type) over loft, maisonette, 2-bedroom and studio; P3 (price range) over price intervals between [<25 000] and [>250 000]; the enhanced version of P1 (location) is shown on the right.]

Figure 2. Original and induced preference relationships for trade-offs. [Diagram over the database objects (beach area, loft, A), (beach area, maisonette, B), (beach area, 2-bedroom, C), (beach area, studio, D), (arts district, loft, D) and (arts district, 2-bedroom, E), with price labels A to E; it distinguishes original from new relationships and is annotated with the equivalence across preferences P1 and P2: if the price is equal, a beach area studio is equivalent to an arts district loft.]
Example (cont): that the skyline still contains too many apartments. Thus,
Anna interactively refines her originally stated preferences. For example, she might state
that she actually prefers the arts district over the university district and the latter over the
commercial district which would turn the preference P1 into a totally ordered relation.
This would for instance allow apartments located in the arts and university district to
dominate those located in the commercial district with respect to P1, resulting in a de-
crease of the size of the Pareto-optimal set of skyline objects. Alternatively, Anna might
state that she actually does not care whether her flat is located in the university district or the commercial district, i.e. that these two attributes are equally desirable for her. This is illustrated in the right hand side part of Figure 1 as the enhanced preference P1. In this case, it is reasonable to deduce that all arts district apartments will dominate commercial district
apartments with respect to the location preference.
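The deduction in this example, and the consistency check mentioned among the contributions, can be sketched as follows. The functions are hypothetical helpers, not the paper's algorithm: strict preferences are (better, worse) pairs, equivalences are unordered pairs, and a new statement is rejected if the closure would make some value strictly better than itself or than a value that is better than it. The base pair (arts district over university district) is an assumption implied by the example.

```python
def implied_strict_pairs(strict, equiv):
    """Close the strict pairs under transitivity and under equivalences."""
    eq = set(equiv) | {(b, a) for a, b in equiv}  # equivalence is symmetric
    closed = set(strict)
    while True:
        new = set()
        for a, b in closed:
            for c, d in closed:
                if b == c:
                    new.add((a, d))      # a > b and b > d give a > d
            for c, d in eq:
                if b == c:
                    new.add((a, d))      # a > b and b ~ d give a > d
                if a == c:
                    new.add((d, b))      # a ~ d and a > b give d > b
        if new <= closed:
            return closed
        closed |= new


def add_statement(strict, equiv, kind, a, b):
    """Add 'a preferred to b' or 'a equivalent to b' if it stays consistent."""
    strict, equiv = set(strict), set(equiv)
    if kind == "prefer":
        strict.add((a, b))
    elif kind == "equivalent":
        equiv.add((a, b))
    else:
        raise ValueError("kind must be 'prefer' or 'equivalent'")
    closed = implied_strict_pairs(strict, equiv)
    if any((y, x) in closed for x, y in closed):
        raise ValueError("statement contradicts earlier preferences")
    return strict, equiv, closed


# Assumed base preference on location, then Anna's equivalence statement.
base = {("arts district", "university district")}
strict, equiv, closed = add_statement(
    base, set(), "equivalent", "university district", "commercial district")
```

After the equivalence is added, the closure contains the pair (arts district, commercial district), which is the deduction made in the example; a later statement preferring the commercial district over the arts district would be rejected.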
Preference relationships over attribute domains lead to domination relationships on
database objects when the skyline operator is applied to a given database. These resulting
domination relationships are illustrated by the solid arrows in Figure 2. However, users
might also weigh some predicates as more important than others and hence might want
to model trade-offs they are willing to consider. Our preference modeling approach
introduced in [1] allows expressing such trade-offs by providing new preference relations
or equivalence relations between different attributes, in effect amalgamating some
attributes in the skyline query and subsequently reducing the dimensionality of the
query.
Example (cont.): While refining her preferences, Anna realizes that she actually
considers the area in which her new apartment is located as more important than the
actual apartment type; in other words, for her a relaxation in the apartment type is less
severe than a relaxation in the area attribute. Thus she states that she would consider a
beach area studio (the least desired apartment type in the best area) still as equally
desirable as a loft in the arts district (the best apartment type in a less preferred area).
By doing that, she states a preference on an amalgamation of the attributes apartment type and
location. This statement induces new domination relations on database objects
(illustrated as the dotted arrows in Figure 2), allowing for example any cheaper beach area
2-bedroom to dominate all equally priced or more expensive arts district lofts (by using the
ceteris paribus [18] assumption). In this way, the result set size of the skyline query can
be decreased.
3 Theoretical Foundation and Formalization
In this section we formalize the semantics of adding incremental preference or
equivalence information on top of already existing base preferences or base equivalences. First,
we provide basic definitions required to model base and amalgamated preferences and
equivalence relationships. Then, we provide basic theorems which allow for consistent
incremented skyline computation (cf. [1]). Moreover, we show that it suffices to
calculate incremental changes on transitively reduced preference diagrams. We show that
local changes in the preference graph only result in locally restricted recomputations for
the incremented skyline and thus lead to superior performance (cf. [2]).
3.1 Base Preferences and the Induced Pareto Aggregation
In this section we provide the basic definitions which are prerequisites for the two
sections that follow. We introduce the notions of base preferences, base equivalences, their
amalgamated counterparts, a generalized Pareto composition and a generalized skyline.
The basic constructs are so-called base preferences defining strict partial orders on
attribute domains of database objects (based on [12], [15]):
Definition 1: (Base Preference)
Let $D_1, D_2, \ldots, D_m$ be a non-empty set of m domains (i.e. sets of attribute values) on the
attributes $Attr_1, Attr_2, \ldots, Attr_m$ so that $D_i$ is the domain of $Attr_i$. Furthermore let
$O \subseteq D_1 \times D_2 \times \ldots \times D_m$ be a set of database objects and let $attr_i : O \to D_i$ be a function mapping
each object in O to a value of the domain $D_i$.
Then a Base Preference $P_i \subseteq D_i^2$ is a strict partial order on the domain $D_i$.
The intended interpretation of $(x, y) \in P_i$ with $x, y \in D_i$ (or alternatively written $x <_{P_i} y$)
is the statement "I like attribute value y (for the domain $D_i$) better than attribute
value x (of the same domain)". This implies that for $o_1, o_2 \in O$, $(attr_i(o_1), attr_i(o_2)) \in P_i$
means "I like object $o_2$ better than object $o_1$ with respect to its i-th attribute value".
In addition to specifying preferences on a domain $D_i$ we also allow equivalences to be
defined, as given in Definition 2.
Definition 2: (Base Equivalence and Compatibility)
Let O be a set of database objects and $P_i$ a base preference on $D_i$ as given in Definition 1.
Then we define a Base Equivalence $Q_i \subseteq D_i^2$ as an equivalence relation (i.e. $Q_i$ is
reflexive, symmetric and transitive) which is compatible with $P_i$, where compatibility is defined as:
a) $Q_i \cap P_i = \emptyset$ (meaning no equivalence in $Q_i$ contradicts any strict preference in $P_i$)
b) $P_i \circ Q_i = Q_i \circ P_i = P_i$ (the domination relationships expressed transitively using $P_i$ and
$Q_i$ must always be contained in $P_i$)
In particular, as $Q_i$ is an equivalence relation, $Q_i$ trivially contains the pairs (x, x) for all
$x \in D_i$.
The interpretation of base equivalences is similarly intuitive as for base preferences:
$(x, y) \in Q_i$ with $x, y \in D_i$ (or alternatively written $x \sim_{Q_i} y$) means "I am indifferent between
attribute values x and y of the domain $D_i$".
As mentioned in Definition 2, a given base preference $P_i$ and base equivalence $Q_i$ have to
be compatible with each other: on the one hand, an attribute value x can never be
considered (transitively) equivalent to and at the same time (transitively) preferred to some
attribute value y. On the other hand, preference relationships between attribute
values should always extend to all equivalent attribute values, too. Please note that
generally there still may remain values $x, y \in D_i$ where neither $x <_{P_i} y$, nor $y <_{P_i} x$,
nor $x \sim_{Q_i} y$ holds. We call these values incomparable.
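The two compatibility conditions of Definition 2 can be checked mechanically. The following minimal Python sketch assumes that relations are stored as sets of pairs and that (x, y) in p reads "y is better than x"; the helper names `compose` and `is_compatible` and the sample location values are hypothetical and only serve as an illustration.

```python
def compose(r, s):
    """Relational composition: all (a, c) with (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def is_compatible(p, q):
    """Check conditions a) and b) of Definition 2 for relations given as sets of pairs."""
    disjoint = not (p & q)                                # a) Q and P share no pair
    absorbed = compose(p, q) == p and compose(q, p) == p  # b) P o Q = Q o P = P
    return disjoint and absorbed


# Hypothetical location domain; (x, y) in p reads "y is better than x".
domain = {"beach", "arts", "university", "commercial"}
identity = {(v, v) for v in domain}
p = {("arts", "beach"), ("university", "beach"), ("commercial", "beach"),
     ("university", "arts"), ("commercial", "arts")}

# Indifference between the university and the commercial district.
q_ok = identity | {("university", "commercial"), ("commercial", "university")}
# Declaring arts ~ beach contradicts the strict preference (arts, beach).
q_bad = identity | {("arts", "beach"), ("beach", "arts")}

print(is_compatible(p, q_ok))   # True
print(is_compatible(p, q_bad))  # False
```

Condition b) is the part that is easy to overlook: a preference that mentions only one member of an equivalence class is rejected until it is extended to the whole class.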
The base preferences $P_1, \ldots, P_m$ together with the base equivalences $Q_1, \ldots, Q_m$ induce a
partial order on the set of database objects according to the notion of Pareto optimality.
This partial order is created by the Generalized Pareto Aggregation (cf. [6]) which is
given in Definition 3.
Definition 3: (Generalized Pareto Aggregation for Base Preferences and Equivalences)
Let O be a set of database objects, $P_1, \ldots, P_m$ be a set of m base preferences as given in
Definition 1 and $Q_1, \ldots, Q_m$ be a set of m compatible base equivalence relations as
defined in Definition 2. Then we define the Generalized Pareto Aggregation for base
preferences and equivalences as:
$Pareto(O, P_1, \ldots, P_m, Q_1, \ldots, Q_m) := \{(o_1, o_2) \in O^2 \mid \forall\, 1 \le i \le m : (attr_i(o_1), attr_i(o_2)) \in (P_i \cup Q_i) \;\wedge\; \exists\, 1 \le j \le m : (attr_j(o_1), attr_j(o_2)) \notin Q_j\}$
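To make Definition 3 concrete, the following Python sketch computes the generalized Pareto aggregation by brute force over all object pairs. It is an illustration under stated assumptions, not the evaluation strategy of the paper: objects are plain attribute tuples, identical values are treated as equivalent (the identity pairs of each Q_i), and the function name `pareto` as well as the sample apartments are hypothetical.

```python
def pareto(objects, prefs, equivs):
    """Generalized Pareto aggregation in the sense of Definition 3.

    objects:   list of attribute tuples.
    prefs[i]:  set of (worse, better) value pairs for attribute i (P_i).
    equivs[i]: set of equivalent value pairs for attribute i (Q_i);
               identical values are always treated as equivalent.
    Returns pairs (o1, o2) meaning "o2 dominates o1".
    """
    def in_q(i, a, b):
        return a == b or (a, b) in equivs[i]

    result = set()
    for o1 in objects:
        for o2 in objects:
            pairs = list(enumerate(zip(o1, o2)))
            # every attribute of o2 is better than or equivalent to that of o1
            never_worse = all((a, b) in prefs[i] or in_q(i, a, b) for i, (a, b) in pairs)
            # at least one attribute is not merely equivalent
            strictly_better = any(not in_q(i, a, b) for i, (a, b) in pairs)
            if never_worse and strictly_better:
                result.add((o1, o2))
    return result


# Hypothetical data: objects are (location, type) tuples.
A, B = ("beach", "loft"), ("arts", "loft")
C, D = ("beach", "studio"), ("arts", "studio")
prefs = [{("arts", "beach")}, {("studio", "loft")}]
equivs = [set(), set()]

dominance = pareto([A, B, C, D], prefs, equivs)
print(sorted(dominance))
```

In this example B and C remain incomparable: each is better in one attribute, which is exactly the situation a trade-off is meant to resolve.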
As stated before, the generalized Pareto aggregation induces an order on the actual da-
tabase objects. This order can be extended to an Object Preference as given by Defini-
tion 4.
Definition 4: (Object Preference P and Object Equivalence Q)
An Object Preference $P \subseteq O^2$ is defined as a strict partial order on the set of database
objects O containing the generalized Pareto aggregation of the underlying set of base
preferences and base equivalences: $P \supseteq Pareto(O, P_1, \ldots, P_m, Q_1, \ldots, Q_m)$.
Furthermore, we define the Object Equivalence $Q \subseteq O^2$ as an equivalence relation on O
(i.e. Q is reflexive, symmetric and transitive) which is compatible with P (cf. Definition
2) and respects $Q_1, \ldots, Q_m$ in the following way:
$\forall\, 1 \le i \le m : (x_i, y_i) \in Q_i \Rightarrow ((x_1, \ldots, x_m), (y_1, \ldots, y_m)) \in Q$
In particular, Q contains at least the identity tuples (o, o) for each $o \in O$.
An object level preference P, as given by Definition 4, contains at least the order
induced by the Pareto aggregation function on the base preferences and equivalences.
Additionally, it can be enhanced and extended by other user-provided relationships
(thus leaving the strict Pareto domain). Often, users are willing to perform a trade-off, i.e.
relax their preferences in one attribute in favor of a better performance in
another attribute. For modeling trade-offs, we therefore introduce the notion of
amalgamated preferences and amalgamated equivalences.
Building on the example given in Section 2, an amalgamated equivalence could be a
statement like "I am indifferent between an arts district studio and a university
district loft" (as illustrated in Figure 3). This equivalence statement is modeled on an
amalgamation of domains (location and type) and results in new equivalence and
preference relationships on the database instance.
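Definition 5: (Amalgamated Preference Functions)
Let $\mu \subseteq \{1, \ldots, m\}$ be a set with cardinality k. Using $\pi$ as the projection in the sense of
relational algebra we define the function
$AmalPref(x_\mu, y_\mu) : (\times_{i \in \mu} D_i)^2 \to 2^{O^2}$
$(x_\mu, y_\mu) \mapsto \{(o_1, o_2) \in O^2 \mid \forall i \in \mu : (\pi_{Attr_i}(o_1) = \pi_{Attr_i}(x_\mu) \wedge \pi_{Attr_i}(o_2) = \pi_{Attr_i}(y_\mu)) \;\wedge\; \forall i \in \{1, \ldots, m\} \setminus \mu : (\pi_{Attr_i}(o_1) = \pi_{Attr_i}(o_2))\}$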
This means: Given two tuples $x_\mu$, $y_\mu$ from the same amalgamated domains described
by $\mu$, the function $AmalPref(x_\mu, y_\mu)$ returns a set of relationships between database
objects of the form $(o_1, o_2)$ where the attributes of $o_1$ projected on the amalgamated
domains equal those of $x_\mu$, the attributes of $o_2$ projected on the amalgamated domains equal
those of $y_\mu$, and furthermore all other attributes which are not within the amalgamated
attributes are identical for $o_1$ and $o_2$. The last requirement denotes the well-known
ceteris paribus [18] condition. The relationships created by that
function may be incorporated into P as long as they do not violate the consistency of P. The
conditions and detailed mechanics allowing this incorporation are the topic of the
next section.
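The ceteris paribus semantics of the amalgamated preference function can be illustrated with a short Python sketch. It assumes a non-empty list of objects represented as attribute tuples and identifies the amalgamated attributes by their tuple positions; the name `amal_pref`, the parameter names and the sample apartments (including the price column) are hypothetical.

```python
def amal_pref(objects, mu, x_mu, y_mu):
    """Object pairs (o1, o2) induced by the statement "y_mu is better than x_mu".

    mu:   positions of the amalgamated attributes.
    x_mu: values o1 must have on those positions.
    y_mu: values o2 must have on those positions.
    All remaining attributes must be identical (ceteris paribus).
    """
    rest = [i for i in range(len(objects[0])) if i not in mu]
    return {
        (o1, o2)
        for o1 in objects
        for o2 in objects
        if all(o1[i] == x for i, x in zip(mu, x_mu))
        and all(o2[i] == y for i, y in zip(mu, y_mu))
        and all(o1[i] == o2[i] for i in rest)
    }


# Hypothetical objects: (location, type, price range).
cheap_loft = ("arts", "loft", "low")
cheap_studio = ("beach", "studio", "low")
pricey_loft = ("arts", "loft", "high")
pricey_studio = ("beach", "studio", "high")
objects = [cheap_loft, cheap_studio, pricey_loft, pricey_studio]

pairs = amal_pref(objects, [0, 1], ("arts", "loft"), ("beach", "studio"))
print(sorted(pairs))
```

Only apartments in the same price range are related directly; relationships across price ranges arise later, when these pairs are combined transitively with the existing object preference.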
Please note the typical cross-shape introduced by trade-offs: A relaxation in one attrib-
ute is compared to a relaxation in the second attribute. Though in the Pareto sense the
two respective objects are not comparable (in Figure 3 the arts district studio has a better
value with respect to location, whereas the university district loft has a better value with
respect to type), amalgamation adds respective preference and equivalence relationships
between those objects.
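Definition 6: (Amalgamated Equivalence Functions)
Let $\mu \subseteq \{1, \ldots, m\}$ be a set with cardinality k. Using $\pi$ as the projection in the sense of
relational algebra we define the function
$AmalEq(x_\mu, y_\mu) : (\times_{i \in \mu} D_i)^2 \to 2^{O^2}$
$(x_\mu, y_\mu) \mapsto \{(o_1, o_2) \in O^2 \mid \forall i \in \mu : [\, (\pi_{Attr_i}(o_1) = \pi_{Attr_i}(x_\mu) \wedge \pi_{Attr_i}(o_2) = \pi_{Attr_i}(y_\mu)) \vee (\pi_{Attr_i}(o_2) = \pi_{Attr_i}(x_\mu) \wedge \pi_{Attr_i}(o_1) = \pi_{Attr_i}(y_\mu)) \,] \;\wedge\; \forall i \in \{1, \ldots, m\} \setminus \mu : (\pi_{Attr_i}(o_1) = \pi_{Attr_i}(o_2))\}$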
[Figure 3. Modeling a trade-off using an Amalgamated Preference. The diagram shows preference P1 (location:
beach area, arts district, university district, commercial district, outer suburbs) and preference P2
(type: loft, maisonette, 2-bedroom, studio) together with the amalgamated relationship between them.]
This function differs from amalgamated preferences in that it returns symmetric
relationships, i.e. if $(o_1, o_2) \in Q$, also $(o_2, o_1)$ has to be in Q. Furthermore, these relationships
have to be incorporated into Q instead of P, again as long as they do not violate consistency. But
due to the compatibility characteristic, P can also be affected by new relationships in Q.
Based on an object preference P we can finally derive the Skyline set (given by Defini-
tion 7) which is returned to the user. This set contains all possible best objects with re-
spect to the underlying user preferences. This means that the set contains only those ob-
jects that are not dominated by any other object in the object level preference P. Note
that we call this set generalized Skyline as it is derived from P which is initially the
Pareto order but may also be extended with additional relationships (e.g. trade-offs).
Note that the generalized skyline set is solely derived using P, still it respects Q by in-
troducing new relationships into P based on recent additions to Q (cf. Definition 8).
Definition 7: (Generalized Skyline)
The Generalized Skyline $S \subseteq O$ is the set containing all optimal database objects with
respect to a given object preference P and is defined as
$S := \{\, o \in O \mid \neg \exists\, o' \in O : (o, o') \in P \,\}$
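Definition 7 translates almost literally into code. The sketch below assumes that P is given as a set of pairs (o1, o2) reading "o2 is better than o1"; the function name `skyline` and the object labels are hypothetical.

```python
def skyline(objects, p):
    """Objects not dominated in p, where (o1, o2) in p reads "o2 is better than o1"."""
    dominated = {worse for (worse, _better) in p}
    return [o for o in objects if o not in dominated]


objects = ["B", "C", "D"]
p = {("D", "B"), ("D", "C")}          # B and C are incomparable
print(skyline(objects, p))            # ['B', 'C']

# A trade-off stating that C is better than B shrinks the skyline.
print(skyline(objects, p | {("B", "C")}))   # ['C']
```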
3.2 Incremental Preference and Equivalence Sets
The last section provided the basic definitions required for dealing with preference and
equivalence sets. In the following sections we provide a method for incremental specifi-
cation and enhancements of object preference sets P and equivalence sets Q. Also, we
show under which conditions the addition of new object preferences / equivalences is
safe and how compatibility and soundness can be ensured.
The basic approach for dealing with incremental preference and equivalence sets is
illustrated in Figure 4. First, the base preferences P1 to Pm (Definition 1) and their corresponding
base equivalences Q1 to Qm (Definition 2) are elicited. Based on these, the initial object
preference P (Definition 4) is created by using the generalized Pareto aggregation
(Definition 3). The initial object equivalence Q starts as a minimal relation as defined in
Definition 4. The generalized Pareto skyline (Definition 7) of P is then displayed to the
user and the iterative phase of the process starts. Users now have the opportunity to
specify additional base preferences or equivalences or amalgamated relationships
(Definition 5, Definition 6) as described in the previous section. The set of new object
relationships resulting from the ceteris paribus functions of the newly stated user
information is then checked for compatibility using the constraints given in this section. If
the new object relationships are compatible with P and Q, they are inserted and thus
incremented sets P* and Q* are formed. If the relationships are not consistent with the
previously stated information, the last addition is discarded and the user is notified.
The user thus can state more and more information until the generalized skyline is lean
enough for manual inspection.
[Figure 4. General Approach for Iterated Preference / Equivalence Sets. Flow: P1, ..., Pm, Q1, ..., Qm are
aggregated (Pareto) into P, Q; in each iteration, new base relationships and amalgamated preferences /
equivalences (trade-offs) yield P*, Q*.]
Definition 8: (Incremented Preference and Equivalence Set)
Let O be a set of database objects, $P \subseteq O^2$ be a strict preference relation, $P^{conv} \subseteq O^2$ be
the set of converse preferences with respect to P, and $Q \subseteq O^2$ be an equivalence relation
that is compatible with P. Let further $S \subseteq O^2$ be a set of object pairs (called incremental
preferences) such that
$\forall x, y \in O : (x, y) \in S \Rightarrow (y, x) \notin S$ and $S \cap (P \cup P^{conv} \cup Q) = \emptyset$
and let $E \subseteq O^2$ be a set of object pairs (called incremental equivalences) such that
$\forall x, y \in O : (x, y) \in E \Rightarrow (y, x) \in E$ and $E \cap (P \cup P^{conv} \cup Q \cup S) = \emptyset$.
Then we define T as the transitive closure $T := (P \cup Q \cup S \cup E)^+$ and the
incremented preference relation $P^*$ and the incremented equivalence relation $Q^*$ as
$P^* := \{ (x, y) \in T \mid (y, x) \notin T \}$ and
$Q^* := \{ (x, y) \in T \mid (y, x) \in T \}$
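Definition 8 can be read as a small algorithm: check the side conditions on S and E, form the transitive closure T, and split T into its asymmetric and symmetric parts. The following Python sketch does this naively on sets of pairs. The function names (`converse`, `preconditions_hold`, `increment`) and the four sample objects are hypothetical, and the closure computation is deliberately simple rather than efficient.

```python
def converse(r):
    return {(y, x) for (x, y) in r}


def transitive_closure(r):
    closure = set(r)
    while True:
        step = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if step <= closure:
            return closure
        closure |= step


def preconditions_hold(p, q, s, e):
    """Side conditions on S and E from Definition 8."""
    known = p | converse(p) | q
    s_ok = not (s & converse(s)) and not (s & known)
    e_ok = e == converse(e) and not (e & (known | s))
    return s_ok and e_ok


def increment(p, q, s, e):
    """Return (P*, Q*): the asymmetric and the symmetric part of T."""
    t = transitive_closure(p | q | s | e)
    p_star = {(x, y) for (x, y) in t if (y, x) not in t}
    q_star = {(x, y) for (x, y) in t if (y, x) in t}
    return p_star, q_star


objects = {"a", "b", "c", "d"}
p = {("a", "b")}                      # b is better than a
q = {(o, o) for o in objects}         # minimal object equivalence
s = {("c", "a")}                      # new preference: a is better than c
e = {("b", "d"), ("d", "b")}          # new equivalence: b ~ d

print(preconditions_hold(p, q, s, e))
p_star, q_star = increment(p, q, s, e)
print(sorted(p_star))
```

In the example, the single new preference and the single new equivalence induce three further preference pairs by transitivity, namely (c, b), (a, d) and (c, d).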
The basic intuition is that S and E contain the new preference and equivalence relation-
ships that have been elicited from the user additionally to those given in P and Q. For
example, S and E can result from the user specifying a trade-off and, in this case, are
induced using the ceteris paribus semantics (cf. Definition 5 and Definition 6). The only
conditions on S and E are that they can neither directly contradict each other, nor are
they allowed to contradict already known information. The sets P* and Q* then are the
new preference/equivalence sets that incorporate all the information from S and E and
that will be used to calculate the new generalized and probably smaller skyline set. Defi-
nition 8 indeed results in the desired incremental skyline set as we will prove in Theo-
rem 1:
Theorem 1: (Correct Incremental Skyline Evaluation with P* and Q*)
Let $P^*$ and $Q^*$ be defined as in Definition 8. Then the following statements hold:
1) $P^*$ defines a strict partial order (specifically: $P^*$ does not contain cycles)
2) $Q^*$ is a compatible equivalence relation with preference relation $P^*$
3) $Q \cup E \subseteq Q^*$
4) The following statements are equivalent
a) $P \cup S \subseteq P^*$
b) $P^* \cap (P \cup S)^{conv} = \emptyset$ and $Q^* \cap (P \cup S)^{conv} = \emptyset$
c) No cycle in $(P \cup Q \cup S \cup E)$ contains an element from $(P \cup S)$
and from either one of these statements follows: $Q^* = (Q \cup E)^+$
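Proof:
Let us first show two short lemmas:
Lemma 1: $T \circ P^* \subseteq P^*$
Proof: Since $P^* \subseteq T$ and T is transitive, $T \circ P^* \subseteq T \circ T \subseteq T$ holds. If there existed objects
$x, y, z \in O$ with $(x, y) \in T$, $(y, z) \in P^*$, but $(x, z) \notin P^*$, then $(x, z) \in Q^*$ would follow, because T
is transitive and the disjoint union of $P^*$ and $Q^*$. Due to the symmetry of $Q^*$ we also get
$(z, x) \in Q^*$ and thus $(z, y) = (z, x) \circ (x, y) \in T \circ T \subseteq T$. Hence we have $(y, z), (z, y) \in T$
$\Rightarrow (y, z) \in Q^*$, in contradiction to $(y, z) \in P^*$.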
Lemma 2: $P^* \circ T \subseteq P^*$
Proof: analogous to Lemma 1
ad 1) From Lemma 1 directly follows $P^* \circ P^* \subseteq P^*$ and thus $P^*$ is transitive. Since by
Definition 8 $P^*$ is also anti-symmetric and irreflexive, $P^*$ defines a strict partial order.
ad 2) We have to show the three conditions for compatibility:
a) $Q^*$ is an equivalence relation. This can be shown as follows: $Q^*$ is symmetric by
definition, is transitive because T is transitive, and is reflexive because $Q \subseteq T$ and
trivially all pairs $(q, q) \in Q$.
b) $Q^* \cap P^* = \emptyset$ is true by Definition 8
c) From Lemma 1 we get $Q^* \circ P^* \subseteq P^*$ and due to $Q^*$ being reflexive also $P^* \subseteq Q^* \circ P^*$. Thus $P^* = Q^* \circ P^*$. Analogously we get $P^* \circ Q^* = P^*$ from Lemma 2.
Since a), b) and c) hold, equivalence relation $Q^*$ is compatible with $P^*$.
ad 3) Since $Q \subseteq T$ and Q is symmetric, $Q \subseteq Q^*$. Analogously, since $E \subseteq T$ and E is
symmetric, $E \subseteq Q^*$. Thus, $Q \cup E \subseteq Q^*$.
ad 4) We have to show three implications for the equivalence of a), b) and c):
a) $\Rightarrow$ c): Assume there existed a cycle $(x_0, x_1) \circ \ldots \circ (x_{n-1}, x_n)$ with $x_0 = x_n$ and
edges from $(P \cup Q \cup S \cup E)$ where at least one edge is from $P \cup S$; further assume
without loss of generality $(x_0, x_1) \in P \cup S$. We know $(x_1, x_n) \in T$ and thus $(x_1, x_0) \in T$,
therefore $(x_0, x_1) \in Q^*$ and $(x_0, x_1) \notin P^*$. Thus, the statement $P \cup S \subseteq P^*$ cannot hold, in
contradiction to a).
c) $\Rightarrow$ b): We have to show $T \cap (P \cup S)^{conv} = \emptyset$. Assume there existed
$(x_0, x_1) \circ \ldots \circ (x_{n-1}, x_n) \in (P \cup S)^{conv}$ with $(x_{i-1}, x_i) \in (P \cup Q \cup S \cup E)$ for $1 \le i \le n$.
Because of $(x_0, x_n) \in (P \cup S)^{conv}$ follows $(x_n, x_0) \in P \cup S$ and thus
$(x_0, x_1) \circ \ldots \circ (x_{n-1}, x_n) \circ (x_n, x_0)$ would have
been a cycle in $(P \cup Q \cup S \cup E)$ with at least one edge from P or S, which is a
contradiction to c).
b) $\Rightarrow$ a): If the statement $P \cup S \subseteq P^*$ did not hold, there would be x and y with
$(x, y) \in P \cup S$, but $(x, y) \notin P^*$. Since $(x, y) \in T$, it would follow $(x, y) \in Q^*$. But then also
$(y, x) \in Q^* \cap (P \cup S)^{conv}$ would hold, which is a contradiction to b).
This completes the equivalence of the three conditions. Now we have to show that from
any of them we can deduce $Q^* = (Q \cup E)^+$. Let us assume condition c) holds.
First we show $Q^* \subseteq (Q \cup E)^+$. Let $(x, y) \in Q^*$, then also $(y, x) \in Q^*$. Thus we have two
representations $(x, y) = (x_0, x_1) \circ \ldots \circ (x_{n-1}, x_n)$ and $(y, x) = (y_0, y_1) \circ \ldots \circ (y_{m-1}, y_m)$,
where all edges are in $(P \cup Q \cup S \cup E)$ and $x_n = y = y_0$ and $x_0 = x = y_m$. If both
representations are concatenated, a cycle is formed with edges from $(P \cup Q \cup S \cup E)$. Using
condition c) we know that none of these edges can be in $P \cup S$. Thus, $(x, y) \in (Q \cup E)^+$.
The inclusion $Q^* \supseteq (Q \cup E)^+$ holds trivially because $(Q \cup E)^+ \subseteq T$ and $(Q \cup E)^+$ is
symmetric, since both Q and E are symmetric.
The evaluation of skylines thus comes down to calculating P* and Q* as given by
Definition 8 after we have checked their consistency as described in Theorem 1, i.e. verified
that no inconsistent information has been added. It is an advantage of our system that
at any point we can incrementally check the applicability and then accept or reject a
statement elicited from the user or from a different source such as profile information.
Therefore, skyline computation and preference elicitation are interleaved in a transparent
process.
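The accept-or-reject step can be sketched in Python using statement 4a) of Theorem 1 as the consistency test: an increment is accepted only if every old and every new preference pair survives in P*. This is an illustrative sketch with hypothetical names (`try_increment`, the objects a, b, c) and a naive closure computation; it assumes that the side conditions of Definition 8 on S and E have already been checked.

```python
def transitive_closure(r):
    closure = set(r)
    while True:
        step = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if step <= closure:
            return closure
        closure |= step


def try_increment(p, q, s, e):
    """Return (P*, Q*) if the additions are consistent, otherwise None.

    Consistency is tested via statement 4a) of Theorem 1: P ∪ S ⊆ P*.
    """
    t = transitive_closure(p | q | s | e)
    p_star = {(x, y) for (x, y) in t if (y, x) not in t}
    q_star = t - p_star
    if not (p | s) <= p_star:
        return None          # some old or new preference collapsed into a cycle
    return p_star, q_star


objects = {"a", "b", "c"}
p = {("a", "b")}                       # b is better than a
q = {(o, o) for o in objects}          # minimal object equivalence

# "c is better than b" fits the known information and is accepted.
accepted = try_increment(p, q, {("b", "c")}, set())
# Adding "c ~ a" on top closes the cycle a -> b -> c -> a and is rejected.
rejected = try_increment(p, q, {("b", "c")}, {("c", "a"), ("a", "c")})
print(accepted is not None, rejected is None)   # True True
```

When `None` is returned, the caller keeps the previous P and Q and notifies the user, mirroring the process described for Figure 4.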
3.3 Efficient Incremental Skyline Computation
In the last sections, we provided the basic theoretical foundations for incremented
preference and equivalence sets. In addition, we showed how to use the generalized Pareto
aggregation for incremented skyline computation based on the previous skyline objects.
In this section we improve on this approach by exploiting the local nature
of incrementally added preference / equivalence information.
While in the last section we facilitated skyline computation by modeling the full object
preference P and object equivalence Q, we will now enhance the algorithm to be based
on transitively reduced (and thus considerably smaller) Preference Diagrams.
Preference diagrams are based on the concept of Hasse diagrams but, in contrast, do not
require a full transitive reduction (i.e. some transitive information may remain in the
diagram). Basically, a preference diagram is a simple graph representation of attribute
values and preference edges as follows:
Definition 9: (Preference Diagrams)
Let P be a preference in the form of a finite strict partial order. A preference diagram
PD(P) for preference P denotes a (not necessarily minimal) graph such that the transitive
closure PD(P)+ = P.
Please note that there may be several preference diagrams provided (or incrementally
completed) by the user to express the same preference information (which is given by
the transitive closure of the graph). Thus the preference diagram may contain redundant
transitive information if it was explicitly stated by the user during the elicitation process.
This is particularly useful when the diagram is used for user interface purposes [2].
In the remainder of this section, we want to avoid the handling of the bulky and complex
to manage incremented preference P* and rather only incorporate increments of new
preference information as well as new equivalence information into the preference dia-
gram instead. The following two theorems show how to do this.
Theorem 2: (Calculation of P*) Let O be a set of database objects and P, $P^{conv}$, and Q as in Definition 8 and
$E := \{(x, y), (y, x)\}$ new equivalence information such that $(x, y), (y, x) \notin (P \cup P^{conv} \cup Q)$. Then $P^*$
can be calculated as
$P^* = (P \cup (P \circ E \circ P) \cup (Q \circ E \circ P) \cup (P \circ E \circ Q))$.
Proof: Assume $(a, b) \in T$ as defined in Definition 8. The edge can be represented by a
chain $(a_0, a_1) \circ \ldots \circ (a_{n-1}, a_n)$, where each edge $(a_{i-1}, a_i) \in (P \cup Q \cup E)$ and $a_0 := a$, $a_n := b$.
This chain can even be transcribed into a representation with edges from $(P \cup Q \cup E)$,
where at most one single edge is from E. This is because, if there were two (or
more) edges from E, namely $(a_{i-1}, a_i)$ and $(a_{j-1}, a_j)$ (with i < j), then there are four
possibilities:
a) both edges are (x, y) or both edges are (y, x), in both of which cases the sequence
$(a_i, a_{i+1}) \circ \ldots \circ (a_{j-1}, a_j)$ forms a cycle and can be omitted
b) the first edge is (x, y) and the second edge is (y, x), or vice versa, in both of which
cases $(a_{i-1}, a_i) \circ \ldots \circ (a_{j-1}, a_j)$ forms a cycle and can be omitted, leaving no edge from E
at all.
Since we have defined Q as compatible with P in Definition 8, we know that $(P \cup Q)^+ = (P \cup Q)$
and since elements of T can be represented with at most one edge from E, we
get $T = P \cup Q \cup ((P \cup Q) \circ E \circ (P \cup Q))$.
In this case both edges in E are consistent with the already known information, because
there are no cyclic paths in T containing edges of P (cf. condition 1.4.c) in [1]): This is
because if there were a cycle with edges in $(P \cup Q \cup E)$ and at least one edge from
P (i.e. the new equivalence information would create an inconsistency in $P^*$), the cycle
could be represented as $(a_0, a_1) \in P$ and $(a_1, a_2) \circ \ldots \circ (a_{n-1}, a_0)$ would at most contain
one edge from E and thus the cycle is either of the form $P \circ (P \cup Q)$, or of the form
$P \circ (P \cup Q) \circ E \circ (P \cup Q)$. In the first case there can be no cycle, because otherwise P and
Q would already have been inconsistent, and if there were a cycle in the second case,
there would exist objects $a, b \in O$ such that $(a, x) \in P$, $(x, y) \in E$ and $(y, b) \in (P \cup Q)$
and $(y, x) = (y, a) \circ (a, x) \in (P \cup Q) \circ P \subseteq P$, contradicting $(x, y) \notin P^{conv}$.
Because of $T = (P \cup Q \cup (P \circ E \circ P) \cup (Q \circ E \circ P) \cup (P \circ E \circ Q) \cup (Q \circ E \circ Q))$ and
$P^* = T \setminus Q^*$ and since $(P \cup (P \circ E \circ P) \cup (Q \circ E \circ P) \cup (P \circ E \circ Q)) \cap Q^* = \emptyset$ (if the
intersection were not empty then due to $Q^*$ being symmetric there would be a cycle
in $P^*$ with edges from $(P \cup Q \cup E)$ and at least one edge from P, contradicting the
condition 1.4 above), we finally get $P^* = (P \cup (P \circ E \circ P) \cup (Q \circ E \circ P) \cup (P \circ E \circ Q))$.
We have now found a way to derive P* in the case of a new incremental equivalence
relationship, but still P* is a large relation containing all transitive information. We will
now show that we can also get P* by just manipulating a respective preference diagram
in a very local fashion. Locality here means that we only have to deal with edges that are
directly adjacent in the preference diagram to the additional edges in E. Let us define an
abbreviated form of writing such edges:
Definition 10: (Set Shorthand Notations)
Let R be a binary relation over a set of database objects O and let $x \in O$. We write:
$(\_\, R\, x) := \{ y \in O \mid (y, x) \in R \}$ and
$(x\, R\, \_) := \{ y \in O \mid (x, y) \in R \}$
If R is an equivalence relation we write the objects in the equivalence class of x in R as:
$R[x] := \{ y \in O \mid (x, y), (y, x) \in R \}$
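These shorthands correspond to simple lookups on a relation stored as a set of pairs. A minimal Python sketch, with hypothetical function names `preds`, `succs` and `eq_class` and a toy relation:

```python
def preds(r, x):
    """(_ R x): all y with (y, x) in R."""
    return {a for (a, b) in r if b == x}


def succs(r, x):
    """(x R _): all y with (x, y) in R."""
    return {b for (a, b) in r if a == x}


def eq_class(r, x):
    """R[x]: all y with (x, y) and (y, x) in R."""
    return {b for (a, b) in r if a == x and (b, a) in r}


pd = {("w", "x"), ("y", "z")}
q = {(v, v) for v in "wxyz"} | {("x", "y"), ("y", "x")}

print(preds(pd, "x"))            # {'w'}
print(succs(pd, "y"))            # {'z'}
print(sorted(eq_class(q, "x")))  # ['x', 'y']
```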
With these abbreviations we will show which object sets have to be considered for
actually calculating P* via a given preference diagram:
Theorem 3: (Calculation of PD(P)*) Let O be a set of database objects and P, $P^{conv}$, and Q as in Definition 8 and
$E := \{(x, y), (y, x)\}$ new equivalence information such that
$(x, y), (y, x) \notin (P \cup P^{conv} \cup Q)$.
If $PD(P) \subseteq P$ is some preference diagram of P, and with
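$PD(P)^* := (PD(P) \cup (PD(P) \circ E \circ Q) \cup (Q \circ E \circ PD(P)))$, holds: $(PD(P)^*)^+ = P^*$
i.e. $PD(P)^*$ is a preference diagram for $P^*$, which can be calculated as:
$PD(P)^* = PD(P) \cup ((\_\, PD(P)\, x) \times Q[y]) \cup ((\_\, PD(P)\, y) \times Q[x]) \cup (Q[x] \times (y\, PD(P)\, \_)) \cup (Q[y] \times (x\, PD(P)\, \_))$.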
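Proof: We know from Theorem 2 that $P^* = (P \cup (P \circ E \circ P) \cup (Q \circ E \circ P) \cup (P \circ E \circ Q))$
and for preference diagrams PD(P) of P holds:
a) $P = PD(P)^+ \subseteq (PD(P)^*)^+$
b) $(P \circ E \circ P) = (P \circ E) \circ P \subseteq (P \circ E \circ Q) \circ P = (PD(P)^+ \circ E \circ Q) \circ PD(P)^+ \subseteq (PD(P)^*)^+$,
because $(PD(P)^+ \circ E \circ Q) \subseteq (PD(P)^*)^+$ and $PD(P)^+ \subseteq (PD(P)^*)^+$.
c) Furthermore $(P \circ E \circ Q) = PD(P)^+ \circ E \circ Q \subseteq (PD(P) \circ E \circ Q)^+ \subseteq (PD(P)^*)^+$
d) And similarly $(Q \circ E \circ P) = Q \circ E \circ PD(P)^+ \subseteq (Q \circ E \circ PD(P))^+ \subseteq (PD(P)^*)^+$
Using a) to d) we get $P^* \subseteq (PD(P)^*)^+$ and since $PD(P)^* \subseteq P^*$, we get $(PD(P)^*)^+ \subseteq (P^*)^+ = P^*$
and thus $(PD(P)^*)^+ = P^*$.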
To calculate $PD(P)^*$ we have to consider the terms in $PD(P) \cup (PD(P) \circ E \circ Q) \cup (Q \circ E \circ PD(P))$: The first term is just the old preference diagram. Since the second and third
terms both contain a single edge from E (i.e. either (x, y) or (y, x)), the terms can be
written as
$(PD(P) \circ E \circ Q) = ((\_\, PD(P)\, x) \times Q[y]) \cup ((\_\, PD(P)\, y) \times Q[x])$ and
$(Q \circ E \circ PD(P)) = (Q[x] \times (y\, PD(P)\, \_)) \cup (Q[y] \times (x\, PD(P)\, \_))$
In general these sets will be rather small because, first, they are only derived from the
preference diagram, which is usually considerably smaller than preference P, and second,
in these small sets there usually will be only few edges originating or ending in x or y.
Furthermore, these sets can be computed easily using an index on the first and second
entry of the binary relations PD(P) and Q. Retrieving a set such as (_ PD(P) x) then is just an
index lookup selecting all first entries of PD(P) where the second
entry is x.
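The local update of Theorem 3 can be sketched in Python as follows. The sketch keeps the preference diagram and Q as sets of pairs and builds simple dictionary indexes on the first and second entry, as suggested above; the names `build_index` and `add_equivalence` and the four sample objects are hypothetical. The transitive closure is included only to compare the result with P*.

```python
from collections import defaultdict
from itertools import product


def build_index(r):
    """Index a relation on its first and on its second entry."""
    by_first, by_second = defaultdict(set), defaultdict(set)
    for a, b in r:
        by_first[a].add(b)
        by_second[b].add(a)
    return by_first, by_second


def add_equivalence(pd, q, x, y):
    """PD(P)* for the new equivalence x ~ y, following the formula of Theorem 3."""
    succ, pred = build_index(pd)      # (a PD(P) _) and (_ PD(P) a)
    q_class, _ = build_index(q)       # Q[a]; Q is symmetric, so one index suffices
    new = set(pd)
    new.update(product(pred[x], q_class[y]))
    new.update(product(pred[y], q_class[x]))
    new.update(product(q_class[x], succ[y]))
    new.update(product(q_class[y], succ[x]))
    return new


def transitive_closure(r):
    closure = set(r)
    while True:
        step = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if step <= closure:
            return closure
        closure |= step


pd = {("w", "x"), ("y", "z")}            # x is better than w, z is better than y
q = {(v, v) for v in "wxyz"}             # minimal object equivalence
pd_star = add_equivalence(pd, q, "x", "y")
print(sorted(pd_star))
print(sorted(transitive_closure(pd_star)))
```

Only edges adjacent to x or y are touched: the update adds (w, y) and (x, z), and the closure then contributes (w, z).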
Therefore we can calculate the incremented preference P* by simple manipulations on
PD(P) and the computation of a transitive closure, as shown in the commuting
diagram in Figure 5.
Having completed incremental changes introduced by new equivalence information, we
will now consider incremental changes by new preference information.
Theorem 4: (Incremental calculation of P*) Let O be a set of database objects and P, $P^{conv}$, and Q as in Definition 8 and $S := \{(x, y)\}$
new preference information such that $(x, y) \notin (P \cup P^{conv} \cup Q)$. Then $P^*$ can be
calculated as
$P^* = (P \cup (P \circ S \circ P) \cup (P \circ S \circ Q) \cup (Q \circ S \circ P) \cup (Q \circ S \circ Q))$.
[Figure 5. Diagram for deriving incremented skylines using preference diagrams. Commuting diagram:
extending PD(P) by E yields PD(P)*, extending P by E yields P* = (PD(P)*)+; PD(P) is mapped to P and
PD(P)* to P* by transitive closure.]
Proof: The proof is similar to the proof of Theorem 2. Assume $(a, b) \in T$ as defined in
Definition 8. The edge can be represented by a chain $(a_0, a_1) \circ \ldots \circ (a_{n-1}, a_n)$, where
each edge $(a_{i-1}, a_i) \in (P \cup Q \cup S)$ and $a_0 := a$, $a_n := b$. This chain can even be transcribed
into a representation with edges from $(P \cup Q \cup S)$, where edge (x, y) occurs at most
once. This is because, if (x, y) occurred twice, the two edges would enclose a cycle that can
be removed.
Since we have assumed Q to be compatible with P in Definition 8, we know
$T = P \cup Q \cup ((P \cup Q) \circ S \circ (P \cup Q))$ and (like in Theorem 2) edge (x, y) is consistent with the
already known information, because there are no cyclic paths in T containing edges of P
(cf. 4.c in Theorem 1): This is because if there were a cycle with edges in $(P \cup Q \cup S)$
and at least one edge from P (i.e. the new preference information would create an
inconsistency in $P^*$), the cycle could be represented as $(a_0, a_1) \in P$ and $(a_1, a_2) \circ \ldots \circ (a_{n-1}, a_0)$
would at most contain one edge from S and thus the cycle is either of the form
P, or of the form $P \circ (P \cup Q) \circ S \circ (P \cup Q)$. In the first case there can be no cycle,
because otherwise P would already have been inconsistent, and if there were a cycle in
the second case, there would exist objects $a, b \in O$ such that $(a, x) \in P$ and $(y, b) \in (P \cup Q)$
and $(y, x) = (y, a) \circ (a, x) \in (P \cup Q) \circ P \subseteq P$, contradicting $(x, y) \notin P^{conv}$.
Similarly, there is no cycle with edges in $(Q \cup S)$ and at least one edge from S, either: if
there were such a cycle, it could be transformed into the form $S \circ Q$, i.e. $(x, y) \circ (a, b)$
would be a cycle with $(a, b) \in Q$, forcing $(a, b) = (y, x) \in Q$ and thus, due to the symmetry of Q,
a contradiction to $(x, y) \notin Q$.
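Because of $T = (P \cup Q \cup (P \circ S \circ P) \cup (Q \circ S \circ P) \cup (P \circ S \circ Q) \cup (Q \circ S \circ Q))$ and
$P^* = T \setminus Q^*$ and since $(P \cup (P \circ S \circ P) \cup (Q \circ S \circ P) \cup (P \circ S \circ Q) \cup (Q \circ S \circ Q)) \cap Q^* = \emptyset$
(if the intersection were not empty then due to $Q^*$ being symmetric there
would be a cycle in $P^*$ with edges from $(P \cup Q \cup S)$ and at least one edge from P,
contradicting the condition 1.4 above), we finally get
$P^* = (P \cup (P \circ S \circ P) \cup (Q \circ S \circ P) \cup (P \circ S \circ Q) \cup (Q \circ S \circ Q))$.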
Analogously to Theorem 2 and Theorem 3, in the case of a new incremental preference
relationship we can also derive P* very efficiently by just working on the small
preference diagram instead of on the large preference relation P.
Theorem 5: (Incremental calculation of PD(P)*) Let O be a set of database objects and P, $P^{conv}$, and Q as in Definition 8 and $S := \{(x, y)\}$
new preference information such that $(x, y) \notin (P \cup P^{conv} \cup Q)$. If $PD(P) \subseteq P$ is some
preference diagram of P, and with
$PD(P)^* := (PD(P) \cup (Q \circ S \circ Q))$, holds: $(PD(P)^*)^+ = P^*$
i.e. $PD(P)^*$ is a preference diagram for $P^*$, which can be calculated as:
$PD(P)^* = PD(P) \cup (Q[x] \times Q[y])$ with $PD(P) \cap (Q[x] \times Q[y]) = \emptyset$.
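Proof: We know from Theorem 4 that $P^* = (P \cup (P \circ S \circ P) \cup (P \circ S \circ Q) \cup (Q \circ S \circ P) \cup (Q \circ S \circ Q))$
and for preference diagrams PD(P) of P holds:
a) $P = PD(P)^+ \subseteq (PD(P)^*)^+$
b) since $S \subseteq Q \circ S \circ Q \subseteq PD(P)^*$,
it follows $(P \circ S \circ P) \subseteq (PD(P)^*)^+ \circ PD(P)^* \circ (PD(P)^*)^+ \subseteq (PD(P)^*)^+$
c) since $Q \circ S \subseteq (Q \circ S) \circ Q \subseteq PD(P)^*$,
it follows $((Q \circ S) \circ P) \subseteq PD(P)^* \circ (PD(P)^*)^+ \subseteq (PD(P)^*)^+$
d) analogously $(P \circ (S \circ Q)) \subseteq (PD(P)^*)^+ \circ PD(P)^* \subseteq (PD(P)^*)^+$
e) finally by definition $(Q \circ S \circ Q) \subseteq PD(P)^* \subseteq (PD(P)^*)^+$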
Using a) to e) we get $P^* \subseteq (PD(P)^*)^+$ and since $PD(P)^* \subseteq P^*$, we get
$(PD(P)^*)^+ \subseteq (P^*)^+ = P^*$ and thus $(PD(P)^*)^+ = P^*$.
To calculate $PD(P)^*$ analogously to Theorem 3 we have to consider the terms in
$PD(P) \cup (Q \circ S \circ Q)$: The first term again is just the old preference diagram. Since the
second term contains (x, y), it can be written as $(Q \circ S \circ Q) = (Q[x] \times Q[y])$. Moreover,
if there existed $(a, b) \in PD(P) \cap (Q[x] \times Q[y])$ then $(a, b) \in PD(P)$ and there
would also exist $(a, x), (y, b) \in Q$. But then $(x, y) = (x, a) \circ (a, b) \circ (b, y) \in P$, because
Q is compatible with P, which is a contradiction.
Thus, we can also calculate the incremented preference P* by simple manipulations on
PD(P) in the case of incremental preference information. Again, the necessary set can
be indexed efficiently for fast retrieval. In summary, we have shown that skylines can be
refined incrementally and efficiently by manipulating only the preference diagrams.
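The construction in Theorem 5 can be illustrated with a minimal Python sketch. It assumes relations are stored as sets of pairs and that Q contains all reflexive pairs; the helper names (compose, transitive_closure, q_class, increment_diagram) and the toy objects a to e are hypothetical and are not taken from the paper. The sketch builds PD(P)* from PD(P) and the classes Q[x] and Q[y], then compares its transitive closure with P* spelled out term by term.

```python
from itertools import product


def compose(r, s):
    """Relational composition: all (a, d) with (a, b) in r and (b, d) in s."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}


def transitive_closure(rel):
    """Smallest transitive relation containing rel (naive fixpoint iteration)."""
    closure = set(rel)
    while True:
        extended = closure | compose(closure, closure)
        if extended == closure:
            return closure
        closure = extended


def q_class(q, z):
    """Q[z]: all objects equivalent to z under the equivalence relation q."""
    return {a for (a, b) in q if b == z}


def increment_diagram(pd, q, x, y):
    """PD(P)* as the union of PD(P) and Q[x] x Q[y] for new information S = {(x, y)}."""
    return set(pd) | set(product(q_class(q, x), q_class(q, y)))


# Toy data (hypothetical): a is preferred to b, b to c; d and e are equivalent.
objects = ["a", "b", "c", "d", "e"]
PD = {("a", "b"), ("b", "c")}
P = transitive_closure(PD)
Q = {(o, o) for o in objects} | {("d", "e"), ("e", "d")}
S = {("c", "d")}  # new preference information: c is preferred to d

PD_star = increment_diagram(PD, Q, "c", "d")

# P* spelled out term by term, following the formula used in the proof.
P_star = (P
          | compose(compose(P, S), P)
          | compose(compose(P, S), Q)
          | compose(compose(Q, S), P)
          | compose(compose(Q, S), Q))

print(sorted(PD_star))
print(transitive_closure(PD_star) == P_star)
```

Only the small diagram is touched when the new pair arrives; the closure is computed here purely to check the result against the full formula.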
4 Conclusion
In this paper we laid the foundation for efficiently computing incremented skylines driven
by user interaction. Building on and extending the often used notion of Pareto optimality,
our approach allows users to interactively model their preferences and explore the
resulting generalized skyline sets. New domination relationships can be specified by
incrementally providing additional information like new preferences, equivalence relations,
or acceptable trade-offs. Moreover, we investigated the efficient evaluation of incremented
generalized skylines by considering only those relations that are directly affected
by changes in preference information. The actual computation takes advantage
of the local nature of incremental changes in preference information, leading to
far superior performance over the baseline algorithms.
Although this work is an advance for the application of the skyline paradigm in real-world
applications, several challenges remain largely unresolved. For instance, the
time necessary for computing initial skylines is still too high, hampering
applicability in large-scale scenarios. Here, introducing suitable index structures,
heuristics, and statistics might prove beneficial.
References
[1] W.-T. Balke, U. Güntzer, C. Lofi. Eliciting Matters - Controlling Skyline Sizes by Incremental Integration of User Preferences. Int. Conf. on Database Systems for Advanced Applications (DASFAA), Bangkok, Thailand, 2007.
[2] W.-T. Balke, U. Güntzer, C. Lofi. User Interaction Support for Incremental Refinement of Preference-Based Queries. 1st IEEE International Conference on Research Challenges in Information Science (RCIS), Ouarzazate, Morocco, 2007.
[3] W.-T. Balke, U. Güntzer, W. Siberski. Getting Prime Cuts from Skylines over Partially Ordered Domains. Datenbanksysteme in Business, Technologie und Web (BTW 2007), Aachen, Germany, 2007.
[4] W.-T. Balke, U. Güntzer. Multi-objective Query Processing for Database Systems. Int. Conf. on Very Large Data Bases (VLDB), Toronto, Canada, 2004.
[5] W.-T. Balke, M. Wagner. Through Different Eyes - Assessing Multiple Conceptual Views for Querying Web Services. Int. World Wide Web Conference (WWW), New York, USA, 2004.
[6] W.-T. Balke, U. Güntzer, W. Siberski. Exploiting Indifference for Customization of Partial Order Skylines. Int. Database Engineering and Applications Symp. (IDEAS), Delhi, India, 2006.
[7] W.-T. Balke, J. Zheng, U. Güntzer. Efficient Distributed Skylining for Web Information Systems. Int. Conf. on Extending Database Technology (EDBT), Heraklion, Greece, 2004.
[8] W.-T. Balke, J. Zheng, U. Güntzer. Approaching the Efficient Frontier: Cooperative Database Retrieval Using High-Dimensional Skylines. Int. Conf. on Database Systems for Advanced Applications (DASFAA), Beijing, China, 2005.
[8] J. Bentley, H. Kung, M. Schkolnick, C. Thompson. On the Average Number of Maxima in a Set of Vectors and Applications. Journal of the ACM (JACM), vol. 25(4), ACM, 1978.
[9] S. Börzsönyi, D. Kossmann, K. Stocker. The Skyline Operator. Int. Conf. on Data Engineering (ICDE), Heidelberg, Germany, 2001.
[10] C. Boutilier, R. Brafman, C. Geib, D. Poole. A Constraint-Based Approach to Preference Elicitation and Decision Making. AAAI Spring Symposium on Qualitative Decision Theory, Stanford, USA, 1997.
[11] L. Chen, P. Pu. Survey of Preference Elicitation Methods. EPFL Technical Report IC/2004/67, Lausanne, Switzerland, 2004.
[12] J. Chomicki. Preference Formulas in Relational Queries. ACM Transactions on Database Systems (TODS), Vol. 28(4), 2003.
[13] J. Chomicki. Iterative Modification and Incremental Evaluation of Preference Queries. Int. Symp. on Found. of Inf. and Knowledge Systems (FoIKS), Budapest, Hungary, 2006.
[14] P. Godfrey. Skyline Cardinality for Relational Processing. Int. Symp. on Foundations of Information and Knowledge Systems (FoIKS), Wilhelminenburg Castle, Austria, 2004.
[15] W. Kießling. Foundations of Preferences in Database Systems. Int. Conf. on Very Large Databases (VLDB), Hong Kong, China, 2002.
[16] V. Koltun, C. Papadimitriou. Approximately Dominating Representatives. Int. Conf. on Database Theory (ICDT), Edinburgh, UK, 2005.
[17] D. Kossmann, F. Ramsak, S. Rost. Shooting Stars in the Sky: An Online Algorithm for Skyline Queries. Int. Conf. on Very Large Data Bases (VLDB), Hong Kong, China, 2002.
[18] M. McGeachie, J. Doyle. Efficient Utility Functions for Ceteris Paribus Preferences. In Proc. of Conf. on Artificial Intelligence and Conf. on Innovative Applications of Artificial Intell., Edmonton, Canada, 2002.
[19] D. Papadias, Y. Tao, G. Fu, B. Seeger. An Optimal and Progressive Algorithm for Skyline Queries. Int. Conf. on Management of Data (SIGMOD), San Diego, USA, 2003.
[20] J. Pei, W. Jin, M. Ester, Y. Tao. Catching the Best Views of Skyline: A Semantic Approach Based on Decisive Subspaces. Int. Conf. on Very Large Databases (VLDB), Trondheim, Norway, 2005.
[21] T. Saaty. A Scaling Method for Priorities in Hierarchical Structures. Journal of Mathematical Psychology, 1977.
[22] T. Xia, D. Zhang. Refreshing the sky: the compressed skycube with efficient support for frequent updates. Int. Conf. on Management of Data (SIGMOD), Chicago, USA, 2006.
[23] Y. Yuan, X. Lin, Q. Liu, W. Wang, J. Yu, Q. Zhang. Efficient Computation of the Skyline Cube. Int. Conf. on Very Large Databases (VLDB), Trondheim, Norway, 2005.
What Enterprise Architecture and
Enterprise Systems Usage Can and
Can not Tell about Each Other
Maya Daneva, Pascal van Eck
Dept. of Computer Science, University of Twente, The Netherlands
Abstract
There is an increased awareness of the roles that enterprise architecture (EA) and
enterprise systems (ES) play in today’s organizations. EA and ES usage maturity
models are used to assess how capable companies are of deploying these
two concepts while striving to achieve strategic corporate goals. The existence of
various architecture and ES usage models raises questions about how they
relate to each other, e.g. whether a higher level of architecture maturity implies a higher
ES usage level. This paper compares these two types of models by using
literature survey results and case-study experiences. We conclude that (i) EA
and ES usage maturity models agree on a number of critical success factors and
(ii) in a company with a mature architecture function, one is likely to observe, at
the early stages of ES initiatives, certain practices associated with a higher level
of ES usage maturity.
Keywords: maturity models, enterprise resource planning, enterprise architecture.
1 Introduction
In the past decade, companies and public sector organizations developed an
increased understanding that true connectedness and participation in “the
networked economy” or in “virtual value webs” would not happen merely
through applications of technology, like Enterprise Resource Planning (ERP),
Enterprise Application Integration middleware, or web services. The key lesson
they learnt was that it would happen only if organizations changed the way they
run their operations and integrated them well into cross-organizational business
processes [1]. This takes at least 2-3 years and implies the need to (i) align
changes in the business processes to technology changes and (ii) be able to
anticipate and support complex decisions impacting each of the partner
organizations in a network and their enterprise systems (ES).
In this context, Enterprise Architecture (EA) increasingly becomes critical, for
International Journal of Computer Science & ApplicationsVol. IV, No. II, pp. 93 - 109
© 2006 Technomathematics Research Foundation
Maya Daneva, Pascal van Eck 93
it provides to both business and IT managers a clear and synthetic vision of an
organization’s business processes and of the IT resources they rely on. For the
purpose of this research, we use the term ‘enterprise architecture’ to refer to the
constituents of an enterprise at both the social level (roles, organizational units,
processes) as well as the technical level (information technology and related
technology), and the synergetic relations between these constituents. Enterprise
architecture explains how the constituents of an enterprise are related and how
these relations jointly create added value. EA also implies a model that drives
the process of aligning programs and initiatives with solution architectures
integrating both ES and legacy applications. Observations from EA and ES
literature [2,3,9,13,14,15,20] indicate that, in practice, the many facets of EA
and ES are commonly used to complement each other. For example, EA and
ES represent two of the five major decision areas encompassed in IT governance
at high performing organizations [21]. The experiences of these companies
suggest that EA is the common enforcer of standards from which a high-level
strategic and management-oriented view of potential solutions can be driven to
the implementation level. Moreover, EA processes are critical in implementing
coordinated sets of governance mechanisms for ERP programs that
simultaneously change technology support, ways of doing business, and people’s
job content.
However, due to a lack of adequate principles, theories, and tools to support
consistent application of the concepts of EA and ES usage, the interplay between
them is still rarely studied. ES usage and evolution processes and EA processes
are analyzed in isolation, by using different research methods. Clearly, there is a
need for approaches including definitions, assessment aspects and models that
allow architects and IT decision makers to reason about these two aspects of IT
governance. Examples include reasoning about the choices that guide an
organization’s approach to ES investments or about situations when changing
business requirements can be addressed within the architecture and when
changes justify an exception to enterprise standards.
The present paper responds to this need. Its objective is to add to our
understanding of how the concepts of EA and ES usage are linked, how the
processes of EA and ES usage are linked, and how those processes can be
organized differently to create improved organizational results. The paper seeks
to make the linkages between EA and ES usage explicit so that requirements
engineers working on corporate-wide or networked ES implementation projects
can use this knowledge and leverage EA assets to achieve feasible requirements
engineering (RE) processes.
To get insights into these two concepts, we apply a maturity-based view of the
ES adopting organizations. This perspective provides grounds for the practical
application of commonly available maturity models that could be used with
minimal disruption to the areas being examined in a case study.
The paper is structured as follows: In Section 2 we motivate our research
approach. In Section 3, we give a background of how we use existing
architecture maturity models to build a framework and provide a rationale for
using the DoC’s ACMM [4] to mould our case study assessment process. Section
4 discusses the concept of ES usage maturity along with three specific models.
Section 5 reports on how both classes of models agree and disagree. Section 6
reports on and discusses findings from our case study. In Section 7, we check the
consistency between the findings from the literature survey and the ones from
the case study. We summarize conclusions and research plans in Section 8.
2 Background and Research Approach
The goal of our study is to collect information that would help us assess the
interplay of architecture and ES usage in an ES-adopting organization. Since
research studies in architecture maturity and in ERP usage maturity have been
focused either on organization-specific architecture aspects or on ES factors,
there is a distinct challenge to develop a research model that adopts the most
appropriate constructs from prior research and integrates them with constructs
that are most suitable to our context. Given the lack of research on the
phenomenon we are interested in and the fact that the boundaries between
phenomenon and context are not clearly evident, it seems appropriate to use a
qualitative approach to our research goal. Specifically, we chose to apply an
approach based on the positivist case study research method [5,22] because of
the following: (i) evidence suggests its particular suitability to IS research
situations in which an in-depth investigation is needed and the phenomenon
in question cannot be studied outside the context where it occurs, (ii) it offers a
great deal of flexibility in terms of research perspectives to be adopted and
qualitative data collection methods, and (iii) case studies open up opportunities
to get the subtle data we need to increase our understanding of complex IS
phenomena like ERP adoption and enterprise architecture.
In this research, we take the view that the linkages between EA and ES usage
can be interrogated via artefacts subjected to maturity assessments such as (a)
visible practices deployed by an organization, (b) underlying assumptions
behind these practices, (c) architecture and ES project deliverables, (d)
architecture and ES project roles, and (e) shared codes of meaning that undergird
what an organization thinks a good practice is and what it is not [19]. According
to this view, we see maturity assessment frameworks as vehicles that help
organizations and external observers integrate their experiences into coherent
systems of meaning. Our view is consistent with the understanding of
assessment models as (i) normative behaviour models, based on an organization’s
values and beliefs, as well as (ii) process theories that help explain why
organizations do not always succeed in EA and ES initiatives [10,16,23].
We selected architecture maturity models [4,8,11,12,18,23] and ES usage
models [7,13,16] as the lens through which we examine the linkages between
EA and ES usage. The reasons for choosing these models are threefold: (i) the
models support decision making in the context of organizational change, which is
certainly relevant to understanding IT governance, (ii) the models suggest how
organizations can proceed from a less controlled to a more controlled fashion of
organizing architecture and ES processes, through which we can analyze how
to leverage architecture and ES assets to achieve better business results, and
(iii) both classes of models provide a perspective allowing us to see the
evolution of EA and ES usage as moving through stages characterized by key
role players, typical activities and challenges, appropriate performance metrics,
and a range of possible outcomes.
Our view of maturity models as normative systems of meaning brought us to
the idea of using the methods of semiotic analysis [6,19] for uncovering the
facets of the relationship between EA maturity and ES usage maturity. From the
semiotics standpoint, organizational settings are treated as a system of signs,
where a sign is defined as the relationship between a symbol and the content that
this symbol conveys. This relationship is determined by the conventions of the
stakeholders involved (e.g., business users, architects and ES implementation
project team members). In semiotic analysis, these conventions are termed
codes. A code is defined by a set of symbols, a set of contents and rules that map
symbols to contents [19]. Codes specify meanings of a set of symbols within
organizational settings. On the manifest level, certain practices, roles, and
symbols are carriers of architecture and ES usage maturity. On the core level,
stakeholders share beliefs, values, and understandings that guide their actions.
Thus, in order to fully understand the maturity of EA or ES usage in an
organization’s settings, we should uncover the relevant symbols, the contents
conveyed by these symbols, and the relationships that bind them. If we could do
this, we should be able to get a clear picture about the extent to which the EA
and ES usage maturity models agree and disagree in terms of pertinent symbols,
contents, and codes.
Our analytical approach has three specific objectives, namely: (i) to identify
how existing architecture frameworks and ES usage models relate to each other,
(ii) to assess the possible mappings between their assessment criteria, and (iii) to
examine whether the mappings between architecture maturity assessment criteria and
the ES usage maturity criteria can be used to judge the ES usage maturity in an
ES-adopting organization, provided the architecture maturity of this organization is
known.
Our research approach is multi-analytical in nature. It draws on the idea of
merging a literature survey and a case study. It involved five stages:
1. Literature survey and mapping assessment criteria of existing
architecture maturity models.
2. Literature survey of existing ES usage maturity models.
3. Identification of assessment criteria for architecture and ES usage
maturity that seem (i) to overlap, (ii) to correlate, and (iii) to explain
each other.
4. Selection and application of one architecture maturity model and one ES
usage model to organizational settings in a case study.
5. Post-application analysis to understand the relationships between the two
maturity models.
We discuss each of these stages in more detail in the sections that follow.
3 Mapping Architecture Maturity Criteria
At least six methods for assessing the ability of EA to deliver on its promise were
introduced in the past five years: (i) the IT ACMM of the Department of
Commerce (DoC) of the USA [4], (ii) the Federal Enterprise Architecture
Maturity Framework [8], (iii) the Information Technology Balanced Score Card
model [12], (iv) the models for extended-enterprise-architects [23], (v) the
Gartner Enterprise Architecture Maturity Model [11] and (vi) the META
Enterprise Architecture Program Maturity Model [18]. We analyzed these
models by studying the following aspects:
• what assessment criteria they deem important to judge maturity,
• what practices, roles and artifacts are surveyed for compliance to these
  criteria,
• how the artifacts surveyed are mapped to these criteria.
Our findings indicate that these six models all define the concept of maturity
differently, but all implicitly aim at adopting or adapting some good practices
within an improvement initiative targeting repeatable outcomes. The models
assume that organizations reach a plateau in which at least one architecture
process is transformed from a lower level to a new level of capability. We found
that they all share the following common properties:
• a number of dimensions or process areas at several discrete levels of
  maturity (typically five or six),
• a qualifier for each level (such as initial, repeatable, defined, managed,
  optimized),
• a hierarchical model of assessment criteria for each process area,
• a description of each assessment criterion which codifies what the authors
  regard as good and not so good practice and which could be observed at
  each maturity level,
• an assessment procedure that provides qualitative or quantitative ratings
  for each of the process areas.
To get more insight into how the assessment criteria of each model relate to
the ones from the other models (e.g. whether assessment criteria overlap or
complement each other), we did a comparison on a definition-by-definition
basis. We tried to understand whether there exists a semantic equivalence between the
assessment criteria of the six models. We termed two assessment criteria
“semantically equivalent” if their definitions suggest an identical set of symbols,
an identical set of contents, and an identical set of mappings from symbols to
contents. This definition ensures that two assessment criteria are equivalent
when they have the same meaning and they use the same artifact to judge
identical maturity factors. In our definition, the term ‘artifacts’ means one of the
following [10]: a process (e.g. EA process, activity or practice), a product (e.g.
an architecture deliverable, a business requirements document), or a resource
(e.g. architects, architecture modeling tools). For example, the Operation-Unit-
Participation criterion from the DoC ACMM is semantically equivalent to the
Business-Unit-Involvement criterion from the models for extended-enterprise-
architects (E2ACMM). These two criteria both mean to assess the extent to
which business stakeholders are actively kept involved in the architecture
processes. When compared on a symbol-by-symbol, contents-by-contents and
code-by-code basis, the definitions of these two criteria indicate that they both
mean to measure a common aspect, namely how frequently and how actively
business representatives participate in the architecture process and what the level
of business representatives’ awareness of architecture is.
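The notion of semantic equivalence used above can be sketched in Python as a comparison of three sets. This is only an illustration under stated assumptions: the Criterion class, its field names, and the concrete symbols and contents are hypothetical encodings paraphrased from the prose; neither model publishes its criteria in this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """An assessment criterion read as a semiotic code (hypothetical encoding)."""
    name: str
    symbols: frozenset   # observable artifacts: processes, products, resources
    contents: frozenset  # maturity factors the symbols stand for
    mappings: frozenset  # rules, stored as (symbol, content) pairs


def semantically_equivalent(c1, c2):
    """True if symbols, contents, and symbol-to-content mappings are all identical."""
    return (c1.symbols == c2.symbols
            and c1.contents == c2.contents
            and c1.mappings == c2.mappings)


# Hypothetical encoding of the two criteria discussed in the text.
content = "business involvement in the architecture process"
symbols = frozenset({
    "frequency of participation",
    "activeness of participation",
    "awareness of architecture",
})
contents = frozenset({content})
mappings = frozenset((s, content) for s in symbols)

operating_unit_participation = Criterion(
    "Operating Unit Participation (DoC ACMM)", symbols, contents, mappings)
business_units_involvement = Criterion(
    "Business Units Involvement (E2ACMM)", symbols, contents, mappings)

# A deliberately different criterion, again with made-up symbols and contents.
it_security = Criterion(
    "IT Security (DoC ACMM)",
    frozenset({"security policy documents"}),
    frozenset({"security integrated with architecture"}),
    frozenset({("security policy documents", "security integrated with architecture")}),
)

print(semantically_equivalent(operating_unit_participation, business_units_involvement))
print(semantically_equivalent(operating_unit_participation, it_security))
```

The criterion name is deliberately left out of the comparison, since two models may label the same code differently.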
An extract of our analysis findings is presented in Table 1. It reports on a
set of assessment criteria that we found to be semantically equivalent in two
models, namely the E2ACMM [23] and the DoC ACMM [4].
E2ACMM | DoC ACMM
Extended Enterprise Involvement |
Business units involvement | Operating Unit Participation
Enterprise Program Management |
Business & Technology Strategy Alignment | Business Linkage
Executive Management Involvement | Senior Management Involvement
Strategic Governance | Governance
Enterprise budget & Procurement strategy | IT investment & Acquisition Strategy
Holistic Extended Enterprise Architecture |
Extended Enterprise Architecture Programme Office | Architecture Process
Extended Enterprise Architecture Development | Architecture Development
Enterprise Program Management | Architecture Communication
| IT security
Enterprise budget & Procurement strategy | IT investment & Acquisition Strategy
Extended Enterprise Architecture Results |
Extended Enterprise Architecture Development | Architecture Development
Table 1: Two ACMMs compared and contrasted
Next, we analyzed the distribution of the assessment criteria according to
maturity levels in order to understand what the relative contribution of each
criterion is to a certain maturity level. Our general observation was that the
ACMMs may use correlating criteria but these may be linked to different
maturity levels. For example, the DoC ACMM defines the formal alignment of
business strategy and IT strategies to be a Level 4 criterion, while the E2ACMM
checks it at Level 3.
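The observation that equivalent criteria may sit at different maturity levels can be sketched as a small lookup. The pairs below are taken from Table 1; the only level values filled in are the two stated in the text, and attaching them to the "Business Linkage" and "Business & Technology Strategy Alignment" rows is an assumption of this sketch. The function name level_mismatches is hypothetical.

```python
# Equivalent criteria pairs from Table 1 (E2ACMM name -> DoC ACMM name).
EQUIVALENT = {
    "Business units involvement": "Operating Unit Participation",
    "Business & Technology Strategy Alignment": "Business Linkage",
    "Executive Management Involvement": "Senior Management Involvement",
    "Strategic Governance": "Governance",
    "Enterprise budget & Procurement strategy": "IT investment & Acquisition Strategy",
}

# Maturity level at which each model checks a criterion. Only the one pair of
# values mentioned in the text is recorded; all other levels are left unknown.
E2ACMM_LEVEL = {"Business & Technology Strategy Alignment": 3}
DOC_ACMM_LEVEL = {"Business Linkage": 4}


def level_mismatches(equivalent, e2_levels, doc_levels):
    """List (e2_name, doc_name, e2_level, doc_level) where both levels are known and differ."""
    result = []
    for e2_name, doc_name in equivalent.items():
        e2_level = e2_levels.get(e2_name)
        doc_level = doc_levels.get(doc_name)
        if e2_level is None or doc_level is None:
            continue  # level not documented here, so nothing to compare
        if e2_level != doc_level:
            result.append((e2_name, doc_name, e2_level, doc_level))
    return result


mismatches = level_mismatches(EQUIVALENT, E2ACMM_LEVEL, DOC_ACMM_LEVEL)
for e2_name, doc_name, e2_level, doc_level in mismatches:
    print(f"{e2_name} (E2ACMM, Level {e2_level}) vs {doc_name} (DoC ACMM, Level {doc_level})")
```

Criteria without a recorded level are skipped rather than guessed, so the output reflects only what the text states.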
4 Mapping UMS Maturity Criteria
The ES literature, to the best of our knowledge, indicates that there are three
relatively popular ES Usage maturity models: (i) the ES experience model by
Markus et al [16], (ii) the ERP Maturity Model by Ernst & Young, India [7], and
(iii) the staged ES Usage Maturity Model by Holland et al [13]. All three
models take different views of the way companies make decisions on their
organization structure, process and data definitions, configuration, security and
training. What these models have in common is that they all are meant as
theoretical frameworks for analysing, both retrospectively and prospectively, the
business value of ES. It is important to note that organizations repeatedly go
through various maturity stages when they undertake major upgrades or
replacement of ES.
As system evolution adds the concept of time to these frameworks, they tend
to structure ‘ES experiences’ in terms of stages, starting conditions, goals, plans
and quality of execution. First, the model by Markus et al [16] allocates
elements of ES success to three different points in time during the system life
cycle in an organization: (i) the ‘project phase’ in which the system is configured
and rolled out, (ii) the ‘shakedown phase’ in which the organization goes live
and integrates the system in their daily routine, and (iii) the ‘onward and upward
phase’, in which the organization gets used to the system and is going to
implement additions. Success in the shakedown phase and in the onward and
upward phase is influenced by ES usage maturity. For example, observations
like (i) a high level of successful improvement initiatives, (ii) a high level of
employees’ willingness to work with the system, and (iii) frequent adaptations in
new releases, are directly related to a high level of ES usage maturity. Second,
the ERP Maturity Model by Ernst & Young, India [7] places the experiences in
the context of creating an adaptable ERP solution that meets changing processes,
organization structures and demand patterns. This model structures ERP
adopter’s experiences into three stages: (i) chaos, in which the adopter may lose
the alignment of processes and ERP definition, revert to old habits and routines,
and complement the ERP system usage with workarounds, (ii) stagnancy, in
which organizations are reasonably satisfied with the implemented solution but
had hoped for higher return-on-investment rates and, therefore, refine
and improve the ES usage to get better business performance, and (iii) growth,
in which the adopter seeks strategic support from the ES and moves its focus
over to profit, working capital management and people growth. Third, the staged
maturity model by Holland et al [13] suggests three stages as shown in Table
2. It is based on five assessment criteria that reflect how ERP adopters progress
to a more mature level based on increased ES usage.
Our comparative analysis of the definitions of the assessment criteria showed
that the common factors account for less than 30% of the factors that make up
the criteria of these three models. The common factors are: (1) shared vision of how the
ES contributes to the organization’s bottom-line, (2) use of ES for strategic
purposes, (3) tight integration of processes and ES, and (4) executive
sponsorship. In the next section, we refer to these common criteria when we
compare the models for assessing ES usage maturity to the ones for assessing
architecture maturity.
Constructs | Stage 1 | Stage 2 | Stage 3
Strategic Use of IT | Retention of responsible people; no CIO (anymore); IS does not support strategic decision-making | ES is on a low level used for strategic decision-making; IT strategy is regularly reviewed; High ES importance | Strong vision; Organization-wide IT strategy; CIO on the senior management team
Organizational Sophistication | no process orientation; very little thought about information flows; no culture change | significant organizational change; improved transactional efficiency | process oriented organization; top level support and strong understanding of ERP-implications
Penetration of the ERP System | the system is used by less than 50% of the organization; cost-based issues prohibit the number of users; little training; staff retention issues | most business groups / departments are supported; high usage by employees | truly integrated organization; users find the system easy to use
Drivers & Lessons | Key drivers: priority with management information; costs. Lessons: mistakes are hard to correct; high learning curve | Key drivers: reduction in costs; replacement of legacy systems; integrating all business processes; improved access of management information | Key drivers: single supply chain; replacement of legacy systems
Vision | no clear vision; simple transaction processing | performance oriented culture; internal and external benchmarking | higher level uses are identified; other IT systems can be connected
Table 2: ES Usage MM (based on [13])
5 Mapping Architecture Maturity Criteria to ES
Usage Maturity Criteria: Insight from the Survey
Study
The underlying hypothesis in this paper is that the criteria of the ACMMs and the
ones of the ES UMMs differ and correlate, but do not explain one another. Table 3
summarizes the similarities and the differences of the two types of models. The
rightmost column indicates that the models agree on seven factors in terms of
what they contain. Table 3 also identifies significant differences between the two
model types. For example, the ES usage models do not explicitly address a
number of areas critical to ACMM compliance, e.g. use of a framework,
existence of communication loops and acquisition processes, security, and
governance. Having analyzed the linkages between these two model types, our
findings from the literature survey study suggest the following two implications
for ES-adopting organizations: (1) If an organization scores high in terms of ES
usage maturity, it still has to do additional work to comply with a higher level of
ACMM, and (2) If the architecture team of an ES-adopting organization complies
with an ACMM level higher than 3, the ES usage model still has value to
offer. This is due to the focus of the ES usage model on the management of
ES-supported change and evolvability of the ES.
ACMMs contain | ES UMMs contain | Both types of models contain
Definition of standards (incl. frameworks) | Managed ES-supported change | Vision
Implementation of architecture methods | Speed of adaptation to changing demand patterns | Strategic decision-making, transformation & support
Scoping in depth & breadth of architecture definitions | Responsibility for maintaining stability of information & process environments | Coherence between big picture view & local project views
Planning | Periodic reviews | Process definition
Feedback loops based revision | | Alignment of people, processes & applications with goals
Implementation of metrics program | | Business involvement & buy-in
Responsibility for acquisition | | Drivers
Responsibility for corporate security | |
Governance | |
Table 3: Similarities and differences between ACMMs and ES Usage Models
This alone offers significant complementary guidance in the areas of refocusing
resources to better balance the choices between ES rigidity and
business flexibility as well as the choice between short-term and long-term
benefits. ES usage models also explicitly account for the role of factors beyond
the direct control of the organization. For example, they address the need to stay
aware of changes in market demands and take responsibility to maintain a stable
environment in the face of rapid change.
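The comparison summarized in Table 3 amounts to simple set operations, which the following Python sketch makes explicit. The factor labels are copied from the table; the variable names and the helper still_missing are hypothetical, and the sketch treats each model family as one flat set of factors, which is a simplification.

```python
# Factor lists as read from the three columns of Table 3.
ACMM_ONLY = {
    "Definition of standards (incl. frameworks)",
    "Implementation of architecture methods",
    "Scoping in depth & breadth of architecture definitions",
    "Planning",
    "Feedback loops based revision",
    "Implementation of metrics program",
    "Responsibility for acquisition",
    "Responsibility for corporate security",
    "Governance",
}
ES_UMM_ONLY = {
    "Managed ES-supported change",
    "Speed of adaptation to changing demand patterns",
    "Responsibility for maintaining stability of information & process environments",
    "Periodic reviews",
}
SHARED = {
    "Vision",
    "Strategic decision-making, transformation & support",
    "Coherence between big picture view & local project views",
    "Process definition",
    "Alignment of people, processes & applications with goals",
    "Business involvement & buy-in",
    "Drivers",
}

acmm_factors = ACMM_ONLY | SHARED
es_umm_factors = ES_UMM_ONLY | SHARED


def still_missing(target, covered):
    """Factors required by the target model family that the covered one does not address."""
    return target - covered


# Implication (1): high ES usage maturity leaves these ACMM factors open.
print(sorted(still_missing(acmm_factors, es_umm_factors)))
# Implication (2): a mature architecture function leaves these ES usage factors open.
print(sorted(still_missing(es_umm_factors, acmm_factors)))
```

Both differences are non-empty, which is the set-level reading of the statement that neither model type subsumes the other.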
6 Linkages between Architecture Maturity and ES
Usage Maturity: Insights from a Case Study
The case company in this study is a Canadian wireless communications services
provider that serves both corporate and consumer markets with different
subscriber segments in different geographic areas. To maintain the big-picture
view of the key business processes and supporting applications while adapting to
changing markets, the organization relied on an established architecture team. To
support their fast growth, the company also started an ES initiative that included
13 ERP projects within five years. For the purpose of our research, the unit of
analysis [5] is the ES-adopting organization. We investigate two aspects of the
adopter: (i) the maturity of its architecture function and (ii) the maturity of its
ES usage.
6.1 Architecture Maturity
In 2000, after a series of corporate mergers, the company initiated a strategic
planning exercise as part of a major business process and systems alignment
program. A key component of the strategic planning effort was the assessment of
architecture maturity and the capability of the organization’s architecture
process. The DoC ACMM was used among other standards as a foundation and
an assessment process was devised based on a series of reviews of (i) the
architecture deliverables created for small, mid-sized and large projects, (ii)
architecture usage scenarios, (iii) architecture roles, (iv) architecture standards,
and (v) architecture process documentation. There are nine unique maturity
assessment criteria in the DoC ACMM (as can be seen in the second column in
Table 1). These were mapped into the types of architecture deliverables
produced and used at the company. The highlights of the assessment are listed
below:
Operating unit participation: Since 1996, a business process analyst and a
data analyst had been consistently involved in every business (re)engineering
initiative. Process and data modeling were established as functions and were
visible to the business; the business knew the value that the architecture
services provided and sought architecture support for its projects.
Each core process and each data subject area had a process owner and a data
owner. Their sign-off was important for keeping the repositories of process and
data models current.
Business linkage: The architecture deliverables were completed on
behalf of the business, but it was the business that took ownership of these
deliverables. The architecture team was the custodian of the resulting
architecture deliverables; however, these were maintained and changed based on
requests by the business.
Senior management involvement / Governance: All mid-sized and large
projects were strategically important, as the telecommunication industry is
characterized by constant change and a dynamic business environment. The projects were seen
as business initiatives rather than IT projects and had strong commitment from
top management.
IT investment and acquisition strategy: IT was critical to the company’s
success and market share. Investments in applications were made as a result of a
strategic planning process.
Architecture process: The architecture process was institutionalized as a part
of the corporate Project Office. It was documented in terms of key activities and
key deliverables. It was supported by means of standards and tools.
Architecture development: All major areas of business, e.g. all core business
processes, a major portion of the support processes, and all data subject areas, were
architected according to Martin’s methodology [17]. The architecture team had a
good understanding of which architecture elements were rigid and which
were flexible.
Architecture communication: Architecture was communicated by the
Project Office Department and by the process owners. The IT team had not been
consistently successful in marketing the architecture services. There were ups
and downs, as poor stakeholder involvement impacted the effectiveness of the
architecture team’s interventions.
IT security: IT security was considered one of the highest corporate
priorities. The manager of this function was part of the business, and not of the
IT function. He reported directly to the Vice-President of Business Development.
6.2 ES Usage Maturity
To assess the ES usage maturity in this case, the ES UMM from Table 2 is used.
Assessments were done at two points in time: (i) after the completion of the
multi-phase roll-out of the ERP package and (ii) after a major business process
and systems alignment initiative run by three merging telecommunication
businesses. The first assessment rated the ERP-adopter at Maturity Stage 1,
while the second assessment indicated Stage 2. Details on the five assessment
criteria are discussed as follows:
Strategic use of IT: The organization started with a strong IT vision, and the
senior managers were highly committed to the projects. The CFO was
responsible for the choice of an enterprise system and, therefore, moving to a
new ERP platform was a business decision. The company also had its CIO on
the management team. Assessments of strategically important implementation
options were done consistently by the executives themselves. For example, ERP-
supported processes were not adopted in all areas because this would have
reduced the organization’s competitive advantage. Instead, the executive team
approved the option to complement the ERP modules with a telecom-business-
specific package that supports the competitively important domain of wireless
service delivery (including client activations, client care, and rate plan
management). This decision was in line with the key priorities of the company,
namely, quality of service provisioning and client intimacy.
Organizational Sophistication: Business users wanted to keep processes
diverse; however, the system pushed them towards process standardization and
this led to cultural conflicts. Another problem was the unwillingness to change
the organization. People were afraid that the new ways of working would not be as
easy as before and, therefore, they undermined the process.
Penetration of the ERP system: The degree of involvement of process
owners in the implementation translated directly into a corresponding degree of
results. The process owners were committed to reusing their old processes, which
led to significant customization efforts. The penetration of the ERP can be assessed
according to two indicators: the number of people who use the system or the
number of processes covered. The latter gives a clearer picture of usage than
the former, because many employees work in functions that have nothing
to do with the ES. Examples of such functions were field technicians in cell site
building and call center representatives. In our case study organization, 30-40%
of the business processes are covered by SAP, and coverage is still being extended.
Vision: The company wanted to achieve a competitive advantage by
implementing ES. Because this was a costly initiative, it made consistent
efforts to maximize the value of ES investments and extend it to non-core
activities and the back office.
Drivers & Lessons: The company’s drivers were: (i) integration of sites and
locations, (ii) reducing transaction costs, and (iii) replacement of legacy
applications. The learning curve throughout the process was very steep. Some
requirements engineering activities, like requirements prioritization and
negotiation, went wrong at first, but solutions were found during the
process. More about the lessons learned in the requirements process can be
found in [2].
6.3 Mapping of the Case Study Findings
This section provides a list of the most important findings from our architecture
and ES usage assessment results. Later, in Section 7, this list is compared to the
results of our literature survey study (Section 5). The list reports on the
following:
1. There appears to be a relationship between the DoC ACMM criterion of
Business Linkage and the ES UMM criterion of Strategic Use of IT. Strong
business/architecture linkage strengthened the stakeholders’ involvement in the
ERP initiative: for example, we observed that those business process owners
who had collected positive experiences of using architecture deliverables in
earlier process automation projects consistently maintained a positive
attitude towards the architecture-driven ERP implementation projects.
2. There appears to be a relationship between the DoC ACMM criterion of
Senior Management Involvement and the ES UMM criterion of Vision. The
executive sponsorship for architecture made it easy for the ES adopter to
develop the capability to consistently maintain a shared vision throughout all ES
projects. Although the ES adopter was rated as a Stage 1 organization on the
majority of the ES UMM criteria, it managed to maintain at all times a sense
of shared vision and identity, and this rated it, regarding the
Vision criterion, at Stage 3. This may also be an example of how a mature
architecture team can positively influence a Stage 1 organization and help it
practice early what other ES adopters experience only on arriving at Stage 3.
3. Our observations found no correlation between the DoC ACMM criterion
of Architecture Communication and the ES UMM criterion of Organizational
Sophistication. At the very first glance, it appeared that the organization was
rated low on the Organizational-Sophistication criterion of ES UMM due to the
low level scored on the Architecture-Communication criterion of ACMM.
However, a deeper look indicated that the Organizational-Sophistication criterion
was influenced by a number of events over which the architecture team’s
willingness and efforts to communicate architecture had neither direct nor
indirect control.
4. There appears to be no relationship between the DoC ACMM criterion of
Operating Units Participation and the ES UMM criterion of Penetration of the
ES. The ES adopter had designated process and data owners on board in both the
architecture process and the ES implementation process. Despite the intuitive
belief that high Operating Units Participation positively influences the
Penetration-of-the-ES rate, we found the contrary to be part of the case study
reality. Owners of ERP-supported processes could not tie the depth and the
breadth of ERP usage to architecture. One of the most difficult questions in ERP
implementation was how many jobs and job-specific roles would change and
how many people would be supposed to work in these roles. This key question is
captured in the Penetration-of-the-ES criterion of the ES UMM, but its resolution
was not found based on architecture. Also, both architects and ERP teams saw
little correlation between these two aspects.
5. We observed no clear connection between a highly mature Architecture
Process and the ES UMM criterion of Drivers and Lessons. A mature
architecture process implies clarity on what the business drivers for ES
initiatives are. In our experience, however, the organization defined business
drivers for each project but found later that some of them were in conflict. This
led to unnecessarily complex ERP customization and needless installation of
multiple system versions [2]. However, the ES team did better in the next
series of roll-outs and their improvement was attributed to the role of
architecture. Architecture frameworks, architecture-level metrics, and reusable
model repositories were made parts of the requirements definition process and
were consistently used in the prioritization and negotiation of ERP-
customization-requirements in most of the projects that followed. This suggests
that an architecture process alone does not determine the project’s success but
can assist ES adopters in correcting and doing things better the next time.
6. We found no correlation between a highly-mature Architecture
Development and the ES UMM criterion of Drivers and Lessons. In the early
projects, the organization failed to see the ES initiative as a learning process.
Process owners shared readiness to change their ways of working, but found
themselves unprepared to spend time for learning the newly-designed integrated
end-to-end processes, the new system, the way it is configured, and the future
options being kept open. Inconsistent definitions of business drivers and
inconsistent learning from trials and failures favoured a low rating on the
Drivers-and-Lessons criterion.
7. We found no correlation between a highly-mature Architecture
Development and the ES UMM criterion of Organizational Sophistication.
Stakeholders saw process architecture deliverables as tools to communicate their
workflow models to other process owners. All agreed that process models made
process knowledge explicit. But business users also raised a shared concern
about the short life-span of the architecture-compliant ERP process models. Due
to the market dynamics in the telecommunication sector, process models tended
to become outdated on average every 6 weeks. Modelling turned out to be an
expensive exercise and took on average at least 3 days of full-time architect’s
work and one day of process owner’s time. Keeping the models up to date was found
to be resource-consuming, and business users saw little value in doing this.
To sum up, high architecture maturity does not necessarily imply coordination
in determining ES priorities and drivers; neither can it turn an ES initiative into a
systematic learning process.
While the architecture maturity in the beginning of the project was very high,
the organization could not set up a smooth implementation process for the first
six ERP projects. So, at the time of the first assessment, the ES usage maturity
was low (stage 1) although the company had clarity on the strategic use of IT
and treated the ES implementation projects as business initiatives and not as IT
projects.
7 Comparison with the Survey Study
This section addresses the question whether the factors identified from our
survey study are consistent with the ones identified in our case study. We did
this to see if our multi-analyses approach can help uncover subtle information
about both the interplay of EA and ES and the research method itself. The
factors resulting from the survey and the ones from the case study are compared
in Table 4. It indicates a number of overlapping factors in the two studies:
both studies identified four factors that are linked to mature ES usage and EA.
Factor                                                      Survey Study   Case Study
Vision                                                      yes            yes
Strategic decision-making, transformation & support         yes            yes
Coherence between big picture view & local project views    yes            no
Process definition                                          yes            yes
Alignment of people, processes & applications with goals    yes            no
Business involvement & buy-in                               yes            yes
Drivers                                                     yes            no
Making knowledge explicit                                   no             yes
Table 4: Consistency check in the findings of the survey and the case study
Next, our findings suggest that three factors were identified in the survey but not
in the case study. One factor was found in the case study but not in the survey.
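The counts stated above can be re-derived from Table 4 with a short set comparison. The following is a minimal sketch in Python; the yes/no values are transcribed from Table 4, while the names TABLE_4 and compare_studies are hypothetical and introduced only for illustration.

```python
# Factor ratings transcribed from Table 4:
# factor -> (identified in survey study, identified in case study)
TABLE_4 = {
    "Vision": (True, True),
    "Strategic decision-making, transformation & support": (True, True),
    "Coherence between big picture view & local project views": (True, False),
    "Process definition": (True, True),
    "Alignment of people, processes & applications with goals": (True, False),
    "Business involvement & buy-in": (True, True),
    "Drivers": (True, False),
    "Making knowledge explicit": (False, True),
}


def compare_studies(table):
    """Split the factors into shared, survey-only and case-only groups."""
    survey = {factor for factor, (in_survey, _) in table.items() if in_survey}
    case = {factor for factor, (_, in_case) in table.items() if in_case}
    return {
        "both": survey & case,
        "survey_only": survey - case,
        "case_only": case - survey,
    }


result = compare_studies(TABLE_4)
for group, factors in result.items():
    print(f"{group}: {len(factors)}")
    for name in sorted(factors):
        print("  -", name)
```

Run as written, the grouping gives four shared factors, three survey-only factors and one case-only factor, which agrees with the text.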
8 Conclusions
In the past decade, awareness of IT governance in organizations has increased, and
many organizations have also increased their spending on EA and ES with the
expectation that these investments will bring improved business results. However, some
organizations appear to be more mature than others in how they use EA and ES
to their advantage, and they get better value out of their spending. This has
created the need to understand what it takes for an organization to be more
mature in EA and ES usage and how an organization measures up when using one
of the numerous maturity models available in the market. Our study is one
attempt to answer this question. We outlined a comparative strategy for
researching the multiple facets of the relationship between
these two types of maturity models, namely those for EA and ES. We used a survey
study and a case study of one company’s ERP experiences in order to get a
deeper understanding of how the assessment criteria of these models relate to each
other. We found that the two types of maturity models rest on a number of overlapping
assessment criteria; however, the interpretation of these criteria in each maturity
model can be different. Furthermore, our findings suggest that a well-established
architecture function in a company does not imply that there is support for an
ES-implementation. This leads to the conclusion that high architecture maturity
does not automatically guarantee high ES usage maturity.
In terms of research methods, our experiences in merging a case study and a
literature survey study suggest that a multi-analyses approach is necessary for a
deeper understanding of the correlations between architecture and ES usage. The
present study shows that a multi-analyses method helps revise our view of
maturity to better accommodate the cases of ES and EA from an IT governance
perspective and provides rationale for doing so. By applying a multi-analyses
approach to this research problem, our study departs from past framework
comparison studies. Moreover, this study extends previous research by providing
a conceptual basis to explicitly link the assessment criteria of two types of
models in terms of symbols, contents and codified good practices. In our case
study, we have chosen to use qualitative assessments of EA and ES maturity,
instead of determining quantitative maturity measurement according to the
models. The nature of the semiotic analysis, however, makes specific
descriptions of linkages between EA and ES usage difficult.
Many open and far-reaching questions result from this first exploration. Our
initial but not exhaustive list includes the following lines for future research:
1. Apply content analysis methods [24] to selected architecture and ES usage
models to check the repeatability of the findings of this research.
2. Analyze how EA is used in managing strategic change. This will be done
by carrying out case studies at companies’ sites.
3. Refine ES UMM concepts. The ES UMM was developed at the time of
the year 2000 ERP boom and certainly needs revisions to reflect the most recent
ERP market developments [13].
4. Investigate how capability assessments and maturity advancement are used
to achieve IT-business alignment.
Our present results suggest this research is certainly warranted.
References
[1] J. Champy, X-Engineering the Corporation, Warner Books, New York, 2002.
[2] M. Daneva, “ERP requirements engineering practice: lessons learned”, IEEE Software,
March/April 2004.
[3] T. Davenport, “The future of enterprise system-enabled organizations”, Information Systems
Frontiers 2(2), 2000, pp. 163-180.
[4] Department of Commerce (DoC), USA Government: Introduction – IT Architecture Capability
Maturity Model, 2003, https://secure.cio.noaa.gov/hpcc/docita/files/
acmm_rev1_1_05202003.pdf
[5] L. Dube, G. Pare, “Rigor in information systems positivist research: current practices, trends, and
recommendations”, MIS Quarterly, 27(4), 2003, pp. 597-635.
[6] U. Eco, A Theory of Semiotics, Bloomington, Indiana, University of Indiana Press, 1996.
[7] Ernst & Young LLT: “Are you getting the most from your ERP: an ERP maturity model”,
Bombay, India, April 2002.
[8] Federal Enterprise Architecture Program Management Office, NASCIO, US Government,
http://www.feapmo.gov/resources/040427%20EA%20Assessment%20Framework.pdf, April,
2004.
[9] Fonstad, D. Robertson, Transforming a Company, Project by Project: The IT Engagement
Model, Sloan School of Management, CISR report, WP No363, Sept 2006.
[10] N. Fenton, S.L. Pfleeger, Software Metrics: a Rigorous & Practical Approach, International
Thompson Publ., London, 1996.
[11] Gartner Architecture Maturity Assessment, Gartner Group Stamford, Connecticut, Nov. 2002.
[12] W. van Grembergen, R. Saull, “Aligning business and information technology through the
balanced scorecard at a major Canadian financial group: its status measured with an IT BSC
maturity model”. Proc. of the 34th Hawaii Int’l Conf. on System Sciences, 2001.
[13] C. Holland, B. Light, “A stage maturity model for enterprise resource planning systems use”, The
DATABASE for Advances in Information Systems, 32 (2), 2001, pp. 34–45.
[14] J. Lee, K. Siau, S. Hong, “Enterprise integration with ERP and EAI”, Communications of the
ACM, 46(2), 2002.
[15] M. Markus, “Paradigm shifts – E-business and business / systems integration”, Communications
of the AIS, 4(10), 2000.
[16] M. Markus, S. Axline, D. Petrie, C. Tanis, “Learning from adopters’ experiences with ERP:
problems encountered and success achieved”, Journal of Information Technology, 15, 2000, pp.
245-265.
[17] J. Martin, Strategic Data-planning Methodologies. Prentice Hall, 1982.
[18] META Architecture Program Maturity Assessment: Findings and Trends, META Group,
Stamford, Connecticut, Sept. 2004.
[19] A. Rafaeli, M. Worline, “Organizational symbols and organizational culture”, in Ashkenasy &
C.P.M. Wilderom (Eds.) International Handbook of Organizational Climate and Culture, 2001,
71-84.
[20] J. Ross, P. Weill, D. Robertson, Enterprise Architecture as Strategy: Building a Foundation for
Business Execution, Harvard Business School Press, July 2006.
[21] P. Weill, J.W. Ross, “How effective is your IT governance?” MIT Sloan Research Briefing,
March 2005.
[22] R.K. Yin, Case Study Research, Design and Methods, 3rd ed. Newbury Park, Sage Publications,
2002.
[23] J. Schekkerman, Enterprise Architecture Score Card, Institute for Enterprise Architecture
Developments, Amersfoort, The Netherlands, 2004.
[24] S. Stemler, “An overview of content analysis”, Journal of Practical Assessment, Research &
Evaluation, 7(17), 2001.
Acknowledgement: This research has been carried out with the financial
support of the Netherlands Organization for Scientific Research (NWO) under
the Collaborative Alignment of Cross-organizational Enterprise Resource
Planning Systems (CARES) project. The authors thank Claudia Steghuis for
providing Table 1.
UNISC-Phone – A case study
SCHREIBER Jacques, FELDENS Gunter, LAWISCH Eduardo, ALVES Luciano
Informatics Department
UNISC – Universidade de Santa Cruz do Sul - www.unisc.br
Av. Independência, 2293 – CEP 96815-900, Santa Cruz do Sul – Brasil
[email protected] [email protected] [email protected] [email protected]
Abstract
The Internet has grown exponentially over the past decade and its impact on society has been rising steadily. Nonetheless, two situations persist: first, the available interfaces are not appropriate for a large number of citizens (visually impaired people); second, it is still difficult for people to access the Internet in our country. On the other hand, the proliferation of cell phones as tools for accessing the Internet has paved the way for new applications and business models. Interaction with Web pages through voice is an ever-increasing technological challenge and an additional solution in terms of ubiquitous interfaces. This paper presents a solution for accessing Web contents through a bidirectional voice interface. It also features a practical application of this technology, giving users access to an academic enrolment system by using the telephone to navigate the Web pages of that system.
Keywords: disabled people, voice recognition
1 Introduction
Humanity has long felt the need to devise mechanisms that make slow, routine and complex tasks easier. Calculations and data storage are examples, and many tools have been created to solve these problems. The personal computer is a working instrument that carries out many of them. The interconnection of these devices allows information to be distributed and gives users the chance to share contents, thus improving the efficiency of task execution. The Internet is inevitably a consequence of the use and proliferation of computers. Its constant evolution can be framed in terms of several features, such as infrastructure, development tools (increasingly easy to use) and user interfaces. The objective of this paper is to explore a new human-computer interface and its implications for society.
The Web interfaces are based on languages of their own, which are used to write documents that, by means of appropriate software (browsers, interpreters), can be read, visualized and executed. The first programs to present the contents of these documents used simple interfaces, which contained only text. Nowadays, they support text in various formats, charts, images, videos and sounds, among others. As information visualization software progresses, along with the need to make this
International Journal of Computer Science & ApplicationsVol. IV, No. II, pp. 111-123
© 2006 Technomathematics Research Foundation
Jacques Schreiber, Gunter FeldensEduardo Lawisch, Luciano Alves 111
information available to more users, more complex interfaces arise, using innovative hardware and software systems. Then come technologies that use voice and dialogue recognition systems to provide and collect data. The motivation for developing this work comes from the ever-expanding fixed and mobile telephone services in Brazil, particularly over the past decade, a development that will allow a broader range of users to access the contents of the Web. There are also motivations of a social character, for example, access to the Internet for citizens with special needs, such as visually impaired people.
1.1 Voice interfaces – What solutions?
Nowadays, mobile phones are fairly common devices everywhere and increasingly perform activities that go beyond simple conversation. On the other hand, the Internet is currently a giant “database” that is not easily accessible by voice through a cell phone. An interface that recognizes the voice and is capable of producing a coherent dialogue from a text could be the solution for providing these contents. Such an interface would turn out to be very useful for visually impaired people.
As an initial approach toward the solution of the problem we could consider WAP (Wireless Application Protocol), a protocol specially designed for mobile phones to access the contents of the Web. This type of access includes the microbrowser, a browser specifically designed to function on devices with a small “screen” and reduced resources. The protocol was designed to make it possible to handle multimedia data on mobile phones, as well as on other wireless devices. Its development was essentially based on the existing principles and standards of the Internet (TCP/IP-like) [1]. However, due to the technological limitations of these devices, new solutions were adopted (a protocol stack) to optimize its utilization within this context. The architecture is similar to that of the Internet, as shown in Figure 1.
Figure 1: WAP architecture – Extracted from the paper [6]
In a concise manner, we could say that the WAP architecture functions as an interface which allows for communication, more precisely for data exchange, between handhelds and the World Wide Web.
The sequence of actions that occur while accessing the WWW through a WAP protocol is as follows:
• Microbrowser on the mobile device starts;
• The device seeks signal;
• A connection is established with the server of the telephone company;
• A Web Site is selected to be displayed;
• A request is sent to the proxy using WAP;
• The proxy accesses the desired information on the site by means of the HTTP protocol;
• The proxy encodes the HTTP data as WML;
• The WML-encoded data are sent to the device;
• The microbrowser displays the wireless version of the Web Site.
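The proxy steps in the sequence above can be illustrated with a minimal sketch in Python. It is not the implementation of any real WAP gateway: fetch_over_http returns canned content instead of contacting a site, the page structure and all function names are hypothetical, and only a textual WML deck is produced (any further encoding a real gateway may apply is omitted).

```python
from xml.sax.saxutils import escape, quoteattr


def fetch_over_http(url):
    """Hypothetical stand-in for the proxy's HTTP request; returns canned content."""
    return {"title": "Enrolment", "paragraphs": ["Welcome", "Fees & dates"]}


def encode_as_wml(page):
    """Wrap the page content in a single-card WML deck, escaping XML characters."""
    body = "".join(f"<p>{escape(text)}</p>" for text in page["paragraphs"])
    title = quoteattr(page["title"])  # quoteattr adds the surrounding quotes
    return f'<wml><card id="main" title={title}>{body}</card></wml>'


def gateway_request(url):
    """Proxy side of the sequence: fetch via HTTP, then encode the result as WML."""
    page = fetch_over_http(url)
    return encode_as_wml(page)


# The URL is a placeholder; the stub above never contacts it.
deck = gateway_request("http://example.org/enrolment")
print(deck)
```

The microbrowser would then render the returned deck; that step is outside the scope of this sketch.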
Despite these capabilities, the protocol has some limitations, among them the inability to establish a channel for transporting audio data in real time. This is a considerable limitation and makes it impossible to use WAP as a solution for supplying Web contents through a voice interface.
2 Speech Recognizing Systems
Automatic speech recognition (ASR) systems have developed greatly over the past years with the creation of improved algorithms and acoustic models and with the higher processing capacity of computers. With a relatively affordable ASR system installed on a personal computer and a good-quality microphone, good recognition can be achieved if the system is trained for the user’s voice. Over a telephone, and with a system that has not been trained, the recognition system needs a set of speech grammars to be able to recognize the answers. This is one possible way to increase the chances of recognizing, for example, a name among a rather large number of hypotheses, without the need for many interactions. Speech recognition through mobile phones, which are sometimes used in noisy environments, requires more complex algorithms and simple, well-built grammars. Nowadays, there are many commercial ASR applications in an array of languages and application areas, for example, voice, finance, banking, telecommunications and commerce portals (www.tellme.com). There have also been advances in speech synthesis and in the transformation of text into speech (TTS, Text-To-Speech). Many of the present TTS systems still have problems producing easily understood speech. However, a new form of speech synthesis under development is based on the concatenation of waveforms. With this technique, speech is not entirely generated from the text; the synthesis also relies on a series of pre-recorded sounds [2].
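The role of a speech grammar in narrowing the set of hypotheses can be illustrated with a minimal sketch in Python. It works on text rather than audio: the recognizer's raw guess is assumed to arrive as a string, and the standard-library difflib module stands in for acoustic scoring. The function name, the example grammar and the cutoff value are all hypothetical.

```python
import difflib


def recognize_with_grammar(hypothesis, grammar, cutoff=0.6):
    """Return the grammar entry closest to the raw hypothesis, or None.

    Restricting the answer to entries of the grammar is what keeps the number
    of candidate answers small; anything too far from every entry is rejected.
    """
    candidates = [entry.lower() for entry in grammar]
    matches = difflib.get_close_matches(
        hypothesis.lower(), candidates, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# Hypothetical grammar for one dialogue step of an enrolment system.
grammar = ["enrol", "cancel", "help"]
accepted = recognize_with_grammar("enroll", grammar)
rejected = recognize_with_grammar("xyzzy", grammar)
print(accepted, rejected)
```

A small, well-separated grammar makes a near miss such as "enroll" resolve to a valid answer, while input unrelated to every entry is rejected so that the dialogue can ask again.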
There are other forms of recognition; for instance, mobile phones place calls and perform other operations in response to voice commands from their owner, the habitual user. The recognition system works by using previous recordings of the user’s voice, which are then compared with the input data so as to make command recognition possible. This is a rather simple form of recognition which works well and is widely used.
The recognition techniques are also used by systems that run on a traditional operating system in order to allow the user to manipulate programs and access contents by means of a voice interface. An example of this type of application is the IN CUBE Voice Command [3]. This application uses a technology specifically developed to facilitate repetitive tasks. It is an application for the Windows (9X, NT, 2000) system but does not modify its configuration, and there are also versions available for other operating systems.
On their own, these systems are not a solution to the problem of creating a voice interface for the Web because, as mentioned above, full recognition requires systems with high processing capacity. Nevertheless, as shown later in this paper, this is an important technology and might even become part of the global architecture of a system that presents a solution to the problem. Applications that execute operating system commands through voice recognition also pose some considerable disadvantages: although they are easy to use and reliable, they have no mechanism for returning in spoken form, for example, information obtained from the Internet (from an XML document).
International Journal of Computer Science & Applications Vol. IV, No. II, pp. 111-123
© 2006 Technomathematics Research Foundation
Jacques Schreiber, Gunter Feldens, Eduardo Lawisch, Luciano Alves 113
2.1 The advent of VoiceXML
The Speech Interface Framework working group of the W3C (World Wide Web Consortium) created standards that regulate access to Web content through a voice interface. Access to information is normally done through browsers, which interpret a markup language (HTML, XML). The Speech Synthesis Markup Language specification is a relevant component of this new set of standards for voice browsers and was designed to provide a richer markup language, based on XML (Extensible Markup Language), for creating applications for this new type of Web interface. The essential role of the markup language is to give the authors of "synthesizable" content mechanisms to control features of speech such as pronunciation, volume and emphasis on certain words or sentences across different platforms. This is how VoiceXML was created: a language based on XML that allows the development of documents containing dialogues to facilitate access to Web content [4].
This language, like HTML, is used for human-computer dialogues. However, whilst HTML assumes the existence of a graphical browser, a monitor, a keyboard and a mouse, VoiceXML assumes the existence of a voice browser with audio output, either synthesized by the computer or pre-recorded, and audio input via voice and/or keypad tones. This technology "frees" the Internet for the development of voice applications, drastically simplifying previously difficult tasks and creating new business opportunities. VoiceXML is a specification of the VoiceXML Forum, an industrial consortium comprising more than 300 companies. The forum is now engaged in certifying, testing and spreading the language, while control of its development is in the hands of the World Wide Web Consortium (W3C). As VoiceXML is a specification, applications that conform to it can be used on different platforms. Telephones were of paramount importance in its development, but it is not restricted to the telephone market and reaches others, for example personal computers. The technology's architecture (Figure 2) is analogous to that of the Internet, differing only in that it includes a gateway, whose functions are explained later on [5].
Figure 2: The architecture of a client-server system based on VoiceXML, extracted from [6]
Let us look briefly at how information is transferred in a system that implements this technology. First, the user calls a certain number from a mobile phone or a conventional telephone. The call is answered by a computerized system, the VoiceXML gateway, which then sends a request to the document server (which may be anywhere on the Internet); the server returns the document referenced by the call. A component of the gateway, called the VoiceXML interpreter, executes the commands in the document so that its contents are spoken to the user. It listens to the answers and passes them on to the speech recognition engine, which is also part of the gateway.
The standardization process followed the Speech Synthesis Markup Requirements for Voice Markup Languages, favouring some aspects, of which the following are of note [4]:
• Interoperability: compatibility with other W3C specifications, including the Dialog Markup Language and Audio Cascading Style Sheets.
• Generality: supports speech output for an array of applications and several kinds of content.
• Internationalization: provides speech output in a large number of languages, with the possibility of using these languages simultaneously in a single document.
• Generation and readability: documents can be generated automatically and remain easy to read.
• Consistency: allows predictable control of the output regardless of the implementation platform and the characteristics of the speech synthesis device.
• Implementable: the specification should be implementable with existing technology, and the number of optional features should be minimal.
A detailed analysis of the architecture (Figure 3) shows that the server (for example, a Web server) processes requests from the client application, the VoiceXML Interpreter, which arrive through the VoiceXML Interpreter Context. The server produces VoiceXML documents in reply, which are then processed and interpreted by the VoiceXML Interpreter. The VoiceXML Interpreter Context can monitor the user's input in parallel with the VoiceXML Interpreter. For example, one VoiceXML Interpreter Context may be listening for a user request to access the help system, while another listens for requests to change the user's profile.
The implementation platform is controlled by the VoiceXML Interpreter Context and by the VoiceXML Interpreter. For example, in an interactive voice application, the VoiceXML Interpreter Context may be responsible for detecting an incoming call, fetching the initial VoiceXML document and answering the call, while the VoiceXML Interpreter conducts the dialogue after the call is answered. The implementation platform generates events in response to user actions (for example speech, digits entered, a request to end the call) and system events (for example, a timer expiring). Some of these events are handled by the VoiceXML Interpreter itself, as specified by the VoiceXML document, while others are handled by the VoiceXML Interpreter Context.
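The division of event handling described above can be pictured with a minimal Java sketch. It is only an illustration under stated assumptions: the class, enum and method names are hypothetical, and the rule "the interpreter handles what the current document declares, the context handles the rest" is a simplification of the behaviour described in the text. The event names noinput and nomatch are standard VoiceXML events; call.incoming is an invented placeholder.

```java
import java.util.Set;

/** Hypothetical sketch: routes platform events to the interpreter or to its context. */
final class EventRoutingSketch {
    enum Handler { INTERPRETER, INTERPRETER_CONTEXT }

    // Events for which the current VoiceXML document declares a handler (assumed input).
    private final Set<String> documentHandled;

    EventRoutingSketch(Set<String> documentHandled) {
        this.documentHandled = documentHandled;
    }

    /** Events declared in the document go to the interpreter; all others to the context. */
    Handler route(String event) {
        return documentHandled.contains(event)
                ? Handler.INTERPRETER
                : Handler.INTERPRETER_CONTEXT;
    }

    public static void main(String[] args) {
        EventRoutingSketch router = new EventRoutingSketch(Set.of("noinput", "nomatch"));
        System.out.println(router.route("noinput"));       // handled by the interpreter
        System.out.println(router.route("call.incoming")); // invented event, left to the context
    }
}
```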
Figure 3: Architecture details (diagram elements: Server; Request and Document flows; VoiceXML Interpreter Context; VoiceXML Interpreter; Implementation Platform)
A closer look at the tasks executed by the VoiceXML gateway (Figure 4) shows that it controls both the interpretation of the scripts and the interaction with the user. To carry them out, the gateway consists of a set of hardware and software elements that form the heart of VoiceXML (the VoiceXML Interpreter and the VoiceXML Interpreter Context described above are also components of the gateway). Essentially, these elements provide user interaction mechanisms analogous to those of browsers in a conventional HTTP service. Calls are answered by the telephone services and the signal processing component.
Figure 4: The VoiceXML gateway components [5] (diagram elements: VoiceXML gateway services; VoiceXML Interpreter; telephone and signal processing services; speech recognition device; audio reproduction; TTS services; HTTP clients)
The gateways fit into the network in much the same way as IVR (Interactive Voice Response) systems and may be placed before or after the small-scale telephone exchanges used by many institutions. The architecture allows users to request that their call be transferred to an operator, and allows the technology to carry out that request easily.
When a call is received, the VoiceXML Interpreter starts to check and execute the instructions contained in the VoiceXML scripts. As mentioned before, when the executing script requests an answer from the user, the interpreter hands control to the recognition system, which "listens to" and interprets the user's reply. The recognition system is totally independent of the other gateway components. The interpreter may use any compatible client/server recognition system and may switch systems during execution to improve performance. Another way of collecting data is keypad input: key presses produce DTMF tones, which are interpreted so that the user can supply information such as access passwords.
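Keypad input reaches the gateway as DTMF tone pairs. As a rough illustration (not the gateway's actual implementation), the Java sketch below maps the standard DTMF frequency pairs to keypad keys. The class and method names are hypothetical, and a real detector works on audio samples rather than on exact frequency values.

```java
/** Hypothetical sketch: maps standard DTMF (low, high) frequency pairs to keypad keys. */
final class DtmfKeypad {
    // Standard DTMF row (low-group) and column (high-group) frequencies in Hz.
    private static final int[] ROWS = {697, 770, 852, 941};
    private static final int[] COLS = {1209, 1336, 1477};
    private static final char[][] KEYS = {
        {'1', '2', '3'},
        {'4', '5', '6'},
        {'7', '8', '9'},
        {'*', '0', '#'}
    };

    /** Returns the key for a frequency pair, or '?' if the pair is not a keypad tone. */
    static char keyFor(int lowHz, int highHz) {
        int row = indexOf(ROWS, lowHz);
        int col = indexOf(COLS, highHz);
        return (row < 0 || col < 0) ? '?' : KEYS[row][col];
    }

    /** Decodes a sequence of tone pairs into the string of keys typed by the caller. */
    static String decode(int[][] tonePairs) {
        StringBuilder typed = new StringBuilder();
        for (int[] pair : tonePairs) {
            typed.append(keyFor(pair[0], pair[1]));
        }
        return typed.toString();
    }

    private static int indexOf(int[] values, int target) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[][] tones = {{770, 1336}, {770, 1209}, {697, 1477}};
        System.out.println(decode(tones)); // keys 5, 4, 3
    }
}
```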
3 The Unisc-Phone prototype
The prototype originated from the difficulties faced by the students of Unisc (University of Santa Cruz do Sul) during the enrolment period. Although efficient in itself, the academic enrolment system implemented on the Web requires Internet access and basic computer skills, a distant reality for many students. This gave rise to the idea of a system that provides academic information and lets students enrol in the disciplines of their course from any place, at any moment, and automatically. The idea resulted in the Unisc-Phone project, which uses the telephone, broader in reach and available everywhere, to provide appropriate information to any student who wishes to enrol at the university.
3.1 The functionality of the Unisc-Phone
While designing the project, the team's main concern was to create a grammar flexible enough to allow a dialogue as natural as possible. It is up to the student to decide how the dialogue is conducted: the student may start the dialogue or leave it to the system to ask the questions. If the system assumes control, it asks a set of questions in a pre-defined order; if not, the system should be able to interact with the student, "perceiving" their needs and clearing up doubts as they arise. When leading the dialogue, the student can interact with the system in several ways, as follows:
<Student> "Hi! I'm a student of the Engineering course, my enrolment number is 54667, I want to know which disciplines I can enroll in."
<System> "Hello, Mr <XXXX>, please type in your password on the telephone keypad."
Figure 5: A first example of a possible dialogue
After the user is validated, the dialogue proceeds:
<System> "Mr <XXXX>, the available disciplines are as follows: (Discipline A), (Discipline B), (Discipline C), (Discipline D)"
<Student> "I want to enroll in disciplines A and C"
<System> "You've requested to enroll in disciplines A and C, please confirm!"
<Student> "That's right, ok"
Figure 6: Continuation of the first example dialogue
Another dialogue could occur, for example:
<Student> "Hi, I want to enroll in disciplines A and C, my name is <XXXXX>"
<System> "Good morning, Mr <XXXXX>, please state your enrolment number and type in your password on the phone keypad"
<Student> "Oh, yes, my enrolment number is 54667 (typing in password)"
<System> "You've requested to enroll in disciplines A and C, please confirm!"
<Student> "That's right, ok"
Figure 7: A second example of a possible dialogue
These two dialogues are just examples of the many that could take place. Obviously, VoiceXML technology does not yet provide an intelligent system able to conduct any dialogue. However, using mixed-initiative forms (the <initial> tag of VoiceXML), we implemented a system capable of handling a considerable array of dialogues, giving the user-student the feeling of interacting with a sufficiently intelligent system and conveying confidence and security.
3.2 The system's architecture
A database, now one of the main components of the system, was built during the implementation phase. It contains detailed information on the disciplines (whether or not already attended by the student), their prerequisites and the available schedules. These data are later used to provide a solution to the student's problem. The global architecture was implemented as shown in Figure 2, using technologies such as VoiceXML, JSP and MySQL, and follows a three-layer architecture (see Figure 8).
Figure 8: UNISC-Phone's three-layer architecture (layers: Presentation Logic (VoiceXML); Business Logic (JSP); Data Access Logic (MySQL))
The system was developed using a free JSP hosting server (http://www.eatj.com) and a free VoiceXML gateway (http://cafe.bevocal.com).
The next phase of the project was the development of the VoiceXML documents that make up the interface. The objective was to create a voice interface in Portuguese, but unfortunately free VoiceXML gateways capable of supporting this language do not exist yet. Therefore, we developed a version that does not use the diacritical marks typical of Portuguese, such as accents and cedillas; even so, the dialogues are understood by Brazilian users. A careful analysis of the implementation showed that changing the system to support Portuguese is relatively simple and intuitive, since the only things to do are to change the text to be pronounced by a Portuguese text-to-speech platform and to force the system to use this language (there is a VoiceXML element for this purpose, «speak xml:lang="pt-BR"»).
A. System with dialogues in Portuguese
The interface and the dialogues it is supposed to provide, as well as the information to be produced by the system, were studied and designed by the team under the guidance of the professor of Special Topics at Unisc's (University of Santa Cruz do Sul) Computer Science College. After agreeing that the voice service should offer the same functionality as the service already existing on the Web, it was necessary to implement the dialogues in VoiceXML and to create dynamic pages so that the information contained in the database could be offered to the user-students. The dialogue organization and the corresponding database were set up so that the resulting interface complies with the specifications (Figure 9).
Figure 9: Organization of the system's files (diagram elements: Start.jsp; Function.Java; DisciplineRN.java; EnrollmentRN.java; SaveEnrollment.jsp)
In short, the functionality of each program is as follows:
• DisciplineRN.java: implements a class that supports the retrieval of the disciplines suitable for each student; in other words, the methods of this class return only the disciplines the student can attend, in compliance with prerequisites, vacancy availability and viable schedules.
• Functions.java: implements the grammar of the system; in other words, all the plausible dialogues result from hundreds of possible combinations of the grammar elements defined in this class.
• Enrollment.java: implements persistence; in other words, after the enrolment process, the disciplines chosen by the student are entered in the institution's database.
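The selection rule described for DisciplineRN.java (prerequisites, vacancy availability, viable schedule) can be sketched in plain Java as follows. This is a minimal illustration, not the authors' code: the Offering type, its fields and the eligible method are hypothetical, and the real class reads its data from the MySQL database rather than from in-memory collections.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the eligibility filter attributed to DisciplineRN.java. */
final class EligibilitySketch {
    /** An offered discipline; all field names are assumptions made for this example. */
    static final class Offering {
        final String name;
        final Set<String> prerequisites;
        final int vacancies;
        final String day;

        Offering(String name, Set<String> prerequisites, int vacancies, String day) {
            this.name = name;
            this.prerequisites = prerequisites;
            this.vacancies = vacancies;
            this.day = day;
        }
    }

    /** Keeps only disciplines not yet completed, with prerequisites met, a free seat and a free day. */
    static List<String> eligible(List<Offering> offerings, Set<String> completed, Set<String> busyDays) {
        List<String> result = new ArrayList<>();
        for (Offering o : offerings) {
            boolean notYetTaken = !completed.contains(o.name);
            boolean prerequisitesMet = completed.containsAll(o.prerequisites);
            boolean hasVacancy = o.vacancies > 0;
            boolean fitsSchedule = !busyDays.contains(o.day);
            if (notYetTaken && prerequisitesMet && hasVacancy && fitsSchedule) {
                result.add(o.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Offering> offerings = List.of(
                new Offering("Algorithms", Set.of(), 5, "Monday"),
                new Offering("Logics", Set.of("Algorithms"), 3, "Monday"),
                new Offering("Databases", Set.of(), 0, "Tuesday"));
        // A student who has completed Algorithms and has no schedule conflicts.
        System.out.println(eligible(offerings, Set.of("Algorithms"), Set.of()));
    }
}
```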
An analysis of Figure 9 shows that the user starts interacting with the system through the program start.jsp, which contains an initial greeting and asks for the student's name and enrolment number:
…
<%
try {
    ConnectionBean conBean = new ConnectionBean();
    Connection con = conBean.getConnection();
    DisciplineRN disc = new DisciplineRN(con);
    GeneralED edResearch = new GeneralED();
    // Enrolment number received in the HTTP request
    String numberEnrolment = request.getParameter("nroMatricula");
    //int noDay = 2; //Integer.valueOf(request.getParameter("nroDia")).intValue();
    edResearch.put("codeStudent", numberEnrolment);
    // The returned name is used in the greeting below
    String name = disc.listStudent(edResearch);
    DevCollection lstDisciplines;
    // Indices 2 to 6 hold Monday to Friday
    String day[] = {"", "", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "end"};
%>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <field name="enrolment">
    <prompt>
      Welcome to Unisc-Phone, now you can do your enrolment at home, by phone. Please inform your enrolment number.
    </prompt>
  </field>
  ...
  <block>
    <%=name%>, now we will start with your re-enrolment.
  </block>
  <form id="enrolment">
    <% for (int noDay = 2; noDay <= 6; noDay++) { %>
    <field name="re-enrol_<%=day[noDay]%>">
Figure 10: The first part of the source code
      <prompt>
        do you want to re-enroll on <%=day[noDay]%>?
      </prompt>
      <grammar> [yes no] </grammar>
      <filled>
        <if cond="re-enrol_<%=day[noDay]%> == 'yes'">
          Let's get started...
          <goto nextitem="discipline_<%=day[noDay]%>"/>
        <else/>
          you won't have lessons on <%=day[noDay]%>,
          <% if (noDay == 6) { %>
          The re-enrolment process has been concluded, sending data to server...
          <submit method="post"
                  namelist="discipline_<%=day[2]%> discipline_<%=day[3]%> discipline_<%=day[4]%> discipline_<%=day[5]%> discipline_<%=day[6]%>"
                  next="http://matvoices.s41.eatj.com/EnrolmentVoice/saveEnrolment.jsp"/>
          ...
</vxml>
Figure 11: The second part of the source code
Although built in JSP, the program mixes in fragments of VoiceXML code, which provide the interaction with the student. It should be noted that the Portuguese spelling omits all accents and other diacritical marks (apostrophe, cedilla). For each day of the week, the program lists the available disciplines, from which the student then chooses. A code analysis shows that the Function class implements a method called getGrammar:
public static String getGrammar(DevCollection listDiscipline, String nameDay) throws Exception {
    String result = "";
    // GSL-style grammar text: ?[...] is an optional choice, (...) groups a word sequence
    result += "<![CDATA[\n" +
              " (\n" +
              "?[ eu ]\n" +
              "?[ I want (I wish to) I would like to]\n" +
              "?[ (re-enrol) (do my re-enrolment) ]\n" +
              "?[ (in discipline) (subject) for ]\n" +
              "?[ de ]\n" +
              " [\n";
    ...
Figure 12: The getGrammar method
This grammar gives the student several ways to ask for enrolment. What follows is an example of the dialogue:
System: Welcome to Unisc-Phone, now you can do your enrolment at home, by phone. Please inform your enrolment number.
User-Student: 56447
System: John, now we will start your re-enrolment process.
System: Do you want to re-enroll on Monday?
User-Student: Yes
System: The disciplines available for you on that day are: Algorithms, Logics and Discrete Mathematics
User-Student: I would like to re-enroll in Logics
System: Do you want to enroll on Tuesday?
...
Figure 13: An example dialogue
Other ways for a user-student to ask for enrolment include requests such as "I want to re-enroll for the discipline of Algorithms on Monday" or "It is my wish to enroll in Logics". This flexibility is possible thanks to the grammar definitions and the use of mixed-initiative forms. Once the user-student has made his/her choices, the interpreter creates an array containing all the intended disciplines and the document passes control to the SaveEnrolment.jsp program. This program asks for confirmation and enters the re-enrolment data into the system's database. After confirmation, the system cordially says farewell and ends the enrolment procedure.
4 Conclusions
The growing business opportunities offered by the Internet, a result of never-ending technological improvement at both infrastructure and interface level, together with the growing number of users, translate into a diversification of needs at interface level and trigger the appearance of new technologies. It was this context that led to the creation of the voice interface. VoiceXML comes as a response to these needs, since it allows computer-human dialogues as a means of providing users with information. Although fully specified, this technology is still at the study and development stage. The companies in the VoiceXML Forum are doing their best to spread the technology rapidly. Nevertheless, there are still some shortfalls, for example the lack of speech recognition engines and TTS engines for languages such as Portuguese that are compatible with the existing gateways.
The system described above represents a step forward for the Brazilian scientific community, which lacks practical applications that materialize its research and signal the right course toward new human-computer interfaces within the Brazilian context.
References
[1] WAP Forum (http://www.wapforum.org).
[2] Peter A. Heeman, "Modeling Speech Repairs and Intonational Phrasing to Improve Speech Recognition", Computer Science and Engineering, Oregon Graduate Institute of Science and Technology.
[3] "Speech Recognition Technologies Are NOT All Alike" (http://www.comman~corp.com/).
[4] Speech Synthesis Markup Language Specification for the Speech Interface Framework (http://www.w3.org/TR/2001/WD-speech-synthesis-20010103).
[5] Steve Ihnen, VP Applications Development, SpeechHost, Inc., "Developing With VoiceXML: Overview and System Architecture".
[6] WAP White Paper, http://www.wapforum.org/what/WAPWhite_Paper1.pdf
Fuzzy Ontologies and Scale-free Networks Analysis
CALEGARI Silvia
Università degli Studi di Milano Bicocca - DISCo
Via Bicocca degli Arcimboldi 8, 20126 Milano (Italy)
FARINA Fabio
INFN, Sezione di Milano-Bicocca
Piazza della Scienza 3, 20126 Milano (Italy)
Abstract
In recent years ontologies have played a major role in knowledge representation, both in theoretical aspects and in many application domains (e.g., Semantic Web, Semantic Web Services, Information Retrieval Systems). The structure provided by an ontology lets us reason semantically with the concepts. In this paper, we present a novel kind of concept network based on the evolution of a dynamical fuzzy ontology. A dynamical fuzzy ontology lets us manage vague and imprecise information. Fuzzy ontologies have been defined by integrating Fuzzy Set Theory into the ontology domain, so that a truth value is assigned to each concept and relation. In particular, we have examined the case where the truth values change in time according to the queries executed on the represented knowledge domain. Empirically we show how the concepts and relations evolve towards a power-law statistical distribution. This distribution is the same that characterizes complex network systems. The fuzzy concept network evolution is analyzed as a new case of a scale-free system. Two efficiency measures are evaluated on such a network at different evolution stages. A novel information retrieval algorithm using fuzzy concept networks is also proposed.
Keywords: Fuzzy Ontology, Scale-free Networks, Information Retrieval.
1 Introduction
In recent years, ontologies have played a key role in the Semantic Web [1] area of research. The term has been used in various areas of Artificial Intelligence [2] (e.g., knowledge representation, database design, information retrieval, knowledge management, and so on), so that finding a unique meaning for it has become a subtle matter. In a philosophical sense, the term ontology refers to a system of categories intended to achieve a common sense of the world [3]. From the point of view of the FRISCO Report [4], this agreement has to be reached not only through the relationships between humans and objects, but also through the interactions established between humans.
In the Semantic Web, an ontology is a formal conceptualization of a domain of interest, shared among heterogeneous applications. It consists of entities, attributes, relationships and axioms that provide a common understanding of the real world [5, 3, 6, 4]. With the support of ontologies, users and systems can communicate with each other through easy information integration [7]. Ontologies help people and machines to communicate concisely by supporting information exchange based on semantics rather than just syntax.
International Journal of Computer Science & Applications Vol. IV, No. II, pp. 125-144
© 2006 Technomathematics Research Foundation
Silvia Calegari, Fabio Farina 125
Nowadays, there are ontology applications where information is often vague and imprecise, for instance the semantic-based applications of the Semantic Web, such as e-commerce, knowledge management, web portals, etc. Thus, one of the key issues in the development of the Semantic Web is to enable machines to exchange meaningful knowledge across heterogeneous applications to reach the users' goals. An ontology provides a semantic structure for sharing concepts across different applications in an unambiguous way. The conceptual formalism supported by a typical ontology may not be sufficient to represent the uncertain information that is commonly found in many application domains. For example, keywords extracted from many queries in the same domain may not all have the same relevance, since some keywords may be more significant than others. Therefore, the need emerges to give a different interpretation according to the context. Furthermore, humans use linguistic adverbs and adjectives to specify their interests and needs (e.g., users can be interested in finding "a very fast car", "a wine with a very strong taste", "a fairly cold drink", and so on). Hence the need to handle the richness of the natural languages used by humans.
A possible solution for treating uncertain data and, hence, tackling these problems is to incorporate fuzzy logic into ontologies. The aim of fuzzy set theory [8], introduced by L. A. Zadeh [9], is to describe vague concepts through a generalized notion of set, according to which an object may belong to a set to a certain degree (typically a real number in the interval [0,1]). For instance, the semantic content of a statement like "Cabernet is a deep red acidic wine" might have the degree, or truth-value, of 0.6. Up to now, fuzzy sets and ontologies have been used jointly to resolve uncertain information problems in various areas, for example in text retrieval [10, 11, 12] or to generate a scholarly ontology from a database in the ESKIMO [13] and FOGA [14] frameworks. The FOGA framework has recently been applied in the Semantic Web context [15]. However, there is not a complete fusion of Fuzzy Set Theory with ontologies in any of these examples.
In the literature we can find some attempts to integrate fuzzy logic directly into ontologies, for instance in the context of medical document retrieval [16] and in ontology-based queries [17]. In particular, in [16] the integration is obtained by adding a degree of membership to all terms in the ontology to overcome the overloading problem, while in [17] a query enrichment is performed. This is done by inserting a weight that introduces a similarity measure among the taxonomic relations of the ontology. Another proposal is an extension of the ontology domain with fuzzy concepts and relations [18]; however, it is applied only to Chinese news summarization. Two well-formed definitions of fuzzy ontology can be found in [19, 20].
In this paper we refer to the formal definition stated in [19]: a fuzzy ontology is an ontology extended with fuzzy values assigned to the entities and relations of the ontology. Furthermore, [19] showed how to insert fuzzy logic into the ontology domain by extending the KAON¹ editor [21] to handle uncertain information directly during the ontology definition, in order to enrich the knowledge domain.
In a recent work, fuzzy ontologies have been used to model knowledge in creative environments [22]. The goal was to build a digitally enhanced environment supporting the creative learning process in architecture and interaction design education. The numerical values are assigned during the fuzzy ontology definition by the domain expert and by the user queries. There is a continuous evolution of new relations among concepts and of new concepts inserted in the fuzzy ontology. This evolutionary process lets the concepts arrange themselves in a characteristic topological structure describing a weighted complex network. Such a network is neither a periodic lattice nor a random graph [23]. This network has been introduced in an information retrieval algorithm; in particular, it has been adopted in a computer-aided creative environment. Many dynamical systems can be modelled as a network, where vertices are the elements of the system and edges identify the interactions between them [24]. Some examples are biological and chemical systems, neural networks, social interacting species, computer networks, the WWW, and so on [25]. Thus, it is very important to understand the emerging behaviour of a complex network and to study its fundamental properties. In this paper, we present how fuzzy ontology relations evolve in time, producing a typical structure of complex network systems. Two efficiency measures are used to study how suitably information is exchanged over the network and how closely the concepts are tied [26].
The rest of the paper is organized as follows. Section 2 presents the importance of the fuzzy ontology and the definition of a new concept network based on its evolution in time. Section 3 introduces scale-free network notation, small-world phenomena and efficiency measures on weighted network topologies, while Section 4 discusses a new information retrieval algorithm exploiting the concept network. In Section 5 we present some experimental results that confirm the scale-free nature of the fuzzy concept network. In Section 6 other relevant works found in the literature are presented to frame our scope. Finally, Section 7 reports some conclusions.
2 Fuzzy Ontology
This section discusses in depth how Fuzzy Set Theory [9] has been integrated into the ontology definition. Some preliminary applications of the fuzzy ontology are presented. We also show how to construct a novel concept network model relying on this definition.
¹ The KAON project is a meta-project carried out at the Institute AIFB, University of Karlsruhe, and at the Research Center for Information Technologies (FZI).
International Journal of Computer Science & ApplicationsVol. IV, No. II, pp. 125-144
© 2006 Technomathematics Research Foundation
Silvia Calegari, Fabio Farina 127
2.1 Toward a Fuzzy Ontology Definition
Nowadays, knowledge is mainly represented with ontologies. For example, applications of the Semantic Web (e.g., e-commerce, knowledge management, web portals, etc.) are based on ontologies. With the support of ontologies, users and systems can communicate with each other through easy information exchange and integration [7].
Unfortunately, the ontology structure alone is not sufficient to handle all the nuances of natural languages. Humans use linguistic adverbs and adjectives to describe what they want. For example, a user can be interested in finding information using web portals about the topic “A fun holiday”. But what is “a fun holiday”? How should this type of request be handled? Fuzzy Set Theory, introduced by Zadeh [9], lets us tackle this problem by denoting and reasoning with non-crisp concepts. A degree of truth (typically a real number from the interval [0,1]) is assigned to a sentence. The previous statement “a fun holiday” might have a truth-value of 0.8.
First, let us recall the definition of a fuzzy set.
Definition 1 Let $U = \{u_1, u_2, \ldots, u_n\}$ be the universe of discourse, where $u_i \in U$ is an object of $U$, and let $A$ be a fuzzy set in $U$. Then the fuzzy set $A$ can be represented as:

$$A = \{(u_1, f_A(u_1)), (u_2, f_A(u_2)), \ldots, (u_n, f_A(u_n))\}, \tag{1}$$

where $f_A : U \to [0, 1]$ is the membership function of the fuzzy set $A$; $f_A(u_i)$ indicates the degree of membership of $u_i$ in $A$.
Finally, we can give the definition of fuzzy ontology presented in [19].
Definition 2 A fuzzy ontology is an ontology extended with fuzzy values assigned through the two functions

$$g : (\text{Concepts} \cup \text{Instances}) \times (\text{Properties} \cup \text{Property value}) \to [0, 1] \tag{2}$$
$$h : \text{Concepts} \cup \text{Instances} \to [0, 1], \tag{3}$$

where $g$ is defined on the relations and $h$ is defined on the concepts of the ontology.
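As a minimal sketch of Definitions 1 and 2, a finite fuzzy set and the two functions g and h can be stored as plain Python dictionaries. All entity names, property names and numerical values below are hypothetical illustrations, not data taken from [19].

```python
from typing import Dict, Iterable, Tuple

# Definition 1: a fuzzy set A over a finite universe U, stored as u -> f_A(u).
# The universe and the degrees are made-up examples.
fun_holiday: Dict[str, float] = {"beach": 0.8, "museum": 0.4, "dentist": 0.0}

# Definition 2: h assigns a degree to concepts and instances,
# g assigns a degree to (entity, property or property value) pairs.
h: Dict[str, float] = {"Holiday": 0.9, "Cabernet": 0.7}
g: Dict[Tuple[str, str], float] = {
    ("Holiday", "fun"): 0.8,
    ("Cabernet", "dry taste"): 0.8,
}


def in_unit_interval(values: Iterable[float]) -> bool:
    """Check that every fuzzy value lies in [0, 1], as both definitions require."""
    return all(0.0 <= v <= 1.0 for v in values)
```

Dictionaries are used only because the universe is finite; any mapping with values in [0, 1] would serve equally well.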
2.2 Some Applications of the Fuzzy Ontology
From the practical point of view, using the given fuzzy ontology definition we can not only denote non-crisp concepts, but also directly include the property value according to the definition given in [27]. In particular, the knowledge domain has been extended with the quality concept, which becomes an application of the property values. This solution can be used in the tourist context to better define the meaning of a sentence like “this is a hot day”. Furthermore, it is a usual practice to extend the set of concepts already present in the query with other ones which can be derived from an ontology. Generally, given a concept, the query is extended with its parents and children to enrich the set of displayed documents. With a fuzzy ontology it is possible to establish a threshold value (defined by the domain expert) in order to extend queries with instances of concepts which satisfy the chosen value [19]. This approach can be compared with [17] where the
query evaluation is determined through the similarity among the concepts and the hyponymy relations of the ontology.
In the literature, the problem of efficient query refinement has been addressed with a large number of different approaches in recent years. PASS is a method developed to automatically construct a fuzzy ontology (the associations among the concepts are found by analysing the document keywords [28]) that can be used to refine a user’s query [29].
Another possible use of the fuzzy value associated with concepts has been adopted in the context of medical document retrieval to limit the problems due to the overloading of a concept in an ontology [16]. This also permits a reduction of the number of documents found, hiding those that do not fulfil the request of the user.
The relevant goal achieved using the fuzzy ontology has been the direct handling of concept modifiers in the knowledge domain. A concept modifier [30] has the effect of altering the fuzzy value of a property. Given a set of linguistic hedges such as “very”, “more or less”, “slightly”, a concept modifier is a chain of one or more hedges, such as “very slightly” or “very very slightly”, and so on. So, a user can write a statement like “Cabernet has a very dry taste”. It is necessary to associate a membership modifier with any (linguistic) concept modifier.
A membership modifier has a value β > 0 which is used as an exponent to modify the value of the associated concepts [19, 31]. According to its effect on a fuzzy value, a hedge can be classified into one of two groups: concentration type and dilation type. The effect of a concentration modifier is to reduce the grade of a membership value. Thus, in this case, it must be β > 1. For instance, the hedge “very” is usually assigned β = 2. So, if “Cabernet has a dry taste” with value 0.8, then “Cabernet has a very dry taste” with value $0.8^2 = 0.64$. On the contrary, a dilation hedge has the effect of raising a membership value, that is β ∈ (0, 1). The example is analogous to the previous one. This not only enriches the semantics that ontologies usually offer, but also gives the user the possibility to make a request without mandatory constraints.
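The effect of concentration and dilation hedges can be illustrated with a short sketch. Only the assignment β = 2 for “very” comes from the text; the other exponents in the table, and the choice of combining a chain of hedges by multiplying their exponents (that is, applying them one after the other), are assumptions made for this example.

```python
# Hypothetical hedge table: "very" -> 2 is taken from the text,
# the remaining exponents are illustrative assumptions.
HEDGE_BETA = {
    "very": 2.0,          # concentration hedge, beta > 1
    "more or less": 0.5,  # dilation hedge, 0 < beta < 1 (assumed value)
    "slightly": 0.5,      # dilation hedge (assumed value)
}


def apply_modifier(value, hedges):
    """Apply a chain of hedges to a fuzzy value in [0, 1].

    Applying the hedges successively is equivalent to raising the value
    to the product of their exponents (an assumption of this sketch).
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("a fuzzy value must lie in [0, 1]")
    beta = 1.0
    for hedge in hedges:
        beta *= HEDGE_BETA[hedge]
    return value ** beta


very_dry = apply_modifier(0.8, ["very"])                  # concentration lowers the value
more_or_less_dry = apply_modifier(0.8, ["more or less"])  # dilation raises the value
```

Because every fuzzy value lies in [0, 1], an exponent greater than 1 can only lower it and an exponent in (0, 1) can only raise it, which is exactly the concentration and dilation behaviour described above.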
2.3 Fuzzy Concept Network
Every time a query is performed, the fuzzy values given to the concepts or to the relations set by the expert during the ontology definition are updated. In [22] two formulae to initialize and to update the fuzzy values are given. Such expressions take into account the use of concept modifiers.
The dynamical behaviour of a fuzzy ontology is also given by the introduction of new concepts when a query is performed. In [22] a system has been presented that allows the fuzzy ontology to adapt to the context in which it is used, in order to propose an exhaustive approach to directly handle the knowledge-based fuzzy information. This consists in the determination of a semantic correlation [22] among the entities (i.e. concepts and instances) that are searched together in a query.
Definition 3 A correlation is a binary and symmetric relation between entities. It is characterized by a fuzzy value: $corr : O \times O \to [0, 1]$, where $O = \{o_1, o_2, \ldots, o_n\}$ is the set of the entities contained in the ontology.
This defines the degree of relevance for the entities. The closer the corr value is
to 1, the more correlated the two considered entities (for instance, concepts) are. Obviously, an updating formula for each existing correlation is also given. A similar technique is known in the literature as a co-occurrence metric [32, 33].
Integrating the correlation values into the fuzzy ontology is a crucial topic. Indeed, the knowledge of a domain is also given by the use of the objects inside the context. An important issue is how to handle the trade-off between the correct definition of an object (given by the ontology-based definition of the domain) and the actual meaning assigned to the artifact by humans (i.e. the experience-based context assumed by every person according to their specific knowledge).
Figure 1: Fuzzy concept network.
In this way, the fuzzy ontology reflects all the aspects of the knowledge base and can dynamically adapt to the context in which it is introduced. When a query is executed (e.g., to insert a new document or to search for other documents) new correlations can be created, or existing ones updated by altering their weights. A fuzzy weight is also assigned to the concepts and to the correlations by the expert during the definition of the ontological domain. In this paper, we propose a new concept network for dynamical fuzzy ontologies. A concept network consists of n nodes and a set of directed links [34, 35]. Each node represents a concept or a document and each link is labelled with a real number in [0, 1].
Finally, we can give the definition of fuzzy concept network.
Definition 4 A Fuzzy Concept Network (FCN) is a complete weighted graph $N_f = \{O, F, m\}$, where $O$ denotes the set of the ontology entities. The edges among the nodes are described by the function $F : O \times O \to [0, 1]$; if $F(o_i, o_j) = 0$ then the entities are considered uncorrelated. In particular $F := corr$. Each node $o_i$ is characterised by a membership value defined by the function $m : O \to [0, 1]$, which determines the importance of the entity on its own in the ontology. By definition $F(o_i, o_i) = m(o_i)$.

In Fig. 1 a graphical representation of a small fuzzy concept network is shown: in this chart the edges with $F(o_i, o_j) = 0$ are omitted in order to increase readability. The membership values $m(o_i)$ are reported beside the respective instances.
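Definition 4 maps naturally onto a small data structure. The sketch below is one possible representation, not the implementation used in [22]: the class name, its method names and the choice of storing only the non-zero correlations are assumptions of this example.

```python
class FuzzyConceptNetwork:
    """Toy container for an FCN N_f = {O, F, m} (hypothetical API)."""

    def __init__(self):
        self.m = {}     # entity -> membership value m(o)
        self.corr = {}  # frozenset({a, b}) -> correlation F(a, b), symmetric by construction

    @staticmethod
    def _check(value):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fuzzy values must lie in [0, 1]")

    def add_entity(self, name, membership):
        self._check(membership)
        self.m[name] = membership

    def set_correlation(self, a, b, value):
        if a == b:
            raise ValueError("F(o, o) is fixed to m(o) by Definition 4")
        if a not in self.m or b not in self.m:
            raise KeyError("both entities must be added first")
        self._check(value)
        self.corr[frozenset((a, b))] = value

    def F(self, a, b):
        """Edge function: m(a) on the diagonal, 0 for uncorrelated pairs."""
        if a == b:
            return self.m[a]
        return self.corr.get(frozenset((a, b)), 0.0)

    def neighbours(self, a):
        """Entities with a non-zero correlation to a (the edges that would be drawn)."""
        return {b for b in self.m if b != a and self.F(a, b) > 0.0}
```

Keying the correlations by an unordered pair makes the symmetry required by Definition 3 hold automatically, and returning 0 for missing pairs keeps the graph formally complete without storing every edge.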
3 Small-world behaviour and efficiency measures of a scale-free network
From the dynamical nature of the FCN some important topological properties can be determined. In particular, the time evolution of the correlations plays a dominant role in this kind of insight. In this section some formal tools and some efficiency measures are presented. These have been adopted to numerically analyse the evolution of the FCN and of the underlying fuzzy ontology.
The study of the structural properties of the networks underlying complex systems can be very important. For instance, the efficiency of communication and navigation over the Net is strongly related to the topological properties of the Internet and of the World Wide Web. The connectivity structure of a population (the set of social contacts) affects the way ideas are diffused. Only very recently have the increasing accessibility of databases of real networks on one side, and the availability of powerful computers on the other, made possible a series of empirical studies on the properties of social networks.
In their seminal work [23], Watts and Strogatz have shown that the connection topology of some real networks is neither completely regular nor completely random. These networks, named small-world networks [36], exhibit a high clustering coefficient (a measure of the connectedness of a network), like regular lattices, and a small average distance between two generic points (small characteristic path length), like random graphs. Small average distance and high clustering are not the only common features of complex networks. Albert, Barabasi et al. [37] have studied $P(k)$, the degree distribution of a network, and found that many large networks are scale-free, i.e., have a power-law degree distribution $P(k) \propto k^{-\gamma}$.
Watts and Strogatz named these networks, which lie somewhere between regular and random networks, small-worlds, in analogy with the small-world phenomenon empirically observed in social systems more than 30 years ago [36]. The mathematical characterization of the small-world behaviour is based on the evaluation of two quantities: the characteristic path length $L$, measuring the typical separation between two generic nodes in the network, and the clustering coefficient $C$, measuring the average cliquishness of a node. Small-world networks are highly clustered, like regular lattices, while having small characteristic path lengths, like random graphs.
A generic network is usually represented by a weighted graph $G = (N, K)$, where $N$ is a finite set of vertices and $K \subseteq (N \times N)$ is the set of edges connecting the nodes. The information related to $G$ is described both by an adjacency matrix $A \in M(|N|, \{0, 1\})$ and by a weight matrix $W \in M(|N|, \mathbb{R}^+)$. Both matrices $A$ and $W$ are symmetric. The entry $a_{ij}$ of $A$ is 1 if there is an edge joining vertex $i$ to vertex $j$, and 0 otherwise. The matrix $W$ contains a weight $w_{ij}$ for each edge $a_{ij}$. If $a_{ij} = 0$ then $w_{ij} = \infty$. If the condition $w_{ij} = 1$ is assumed for any $a_{ij} = 1$, the graph $G$ corresponds to an unweighted relational network.
In network analysis a very important quantity is the degree of a vertex, i.e., the number of edges incident with $i \in N$. The degree $k(i) \in \mathbb{N}$ of a generic vertex $i$ is defined as:

$$k(i) = |\{(i, j) : (i, j) \in K\}| \tag{4}$$
The average value of $k(i)$ is

$$\langle k(i) \rangle = \frac{\sum_i k(i)}{2\,|N|} \tag{5}$$
The coefficient 2 in Equation (5) appears in the denominator because each link in $A$ is counted twice. The shortest path length $d_{ij} : (N \times N) \to \{\mathbb{R}^+ \cup \infty\}$ between two vertices has to be calculated to define $L$. In unweighted social networks $d_{ij}$ corresponds to the geodesic distance between nodes and it is measured as the minimum number of edges traversed to get from a vertex $i$ to another vertex $j$. The distances $d_{ij}$ can be calculated using any all-pairs shortest path algorithm (e.g., the Floyd-Warshall algorithm) for either a weighted or a relational network.
Let us recall that, according to Definition 4, the weights in the network correspond to the fuzzy correlations joining two concepts in the concept network induced by the fuzzy ontology.
The characteristic path length $L$ of graph $G$ is defined as the average of the shortest path lengths between two generic vertices:

$$L(G) = \frac{1}{|N|\,(|N| - 1)} \sum_{i \neq j \in N} d_{ij} \tag{6}$$
The definition given in Equation (6) is valid for a totally connected $G$, where at least one finite path connecting any pair of vertices exists. Otherwise, when a node $j$ cannot be reached from a node $i$, the distance $d_{ij} = \infty$ and thus the sum in $L(G)$ diverges.
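The characteristic path length of Equation (6) can be sketched with a plain Floyd-Warshall pass. The function names and the dictionary-based graph encoding are assumptions of this example, and the edge lengths are taken as given positive numbers: the text does not state how a distance is derived from a fuzzy correlation, so no such mapping is attempted here.

```python
import math


def all_pairs_shortest_paths(nodes, length):
    """Floyd-Warshall on an undirected graph.

    `length` maps frozenset({i, j}) to a positive edge length;
    missing pairs are treated as unconnected (infinite length).
    """
    nodes = list(nodes)
    d = {(i, j): 0.0 if i == j else length.get(frozenset((i, j)), math.inf)
         for i in nodes for j in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i, k] + d[k, j] < d[i, j]:
                    d[i, j] = d[i, k] + d[k, j]
    return d


def characteristic_path_length(nodes, length):
    """Average shortest path length over all ordered pairs i != j, as in Equation (6)."""
    nodes = list(nodes)
    n = len(nodes)
    d = all_pairs_shortest_paths(nodes, length)
    return sum(d[i, j] for i in nodes for j in nodes if i != j) / (n * (n - 1))
```

On a disconnected graph the function returns infinity, which is the divergence noted above and the reason the efficiency-based measures are preferred for such networks.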
The clustering coefficient $C(G)$ is a measure depending on the connectivity of the subgraph $G_i$ induced by a generic node $i$ and its neighbours. Formally, the subgraph $G_i = (N_i, K_i)$ of a node $i \in N$ can be defined as the pair:

$$N_i = \{j \in N : (i, j) \in K\} \tag{7}$$
$$K_i = \{(j, k) \in K : j \in N_i \wedge k \in N_i\} \tag{8}$$
An upper bound on the cardinality of $K_i$ can be stated according to the following observation: if the degree of a given node is $k(i)$, following Equation (4), then the edge set of $G_i$ satisfies

$$|K_i| \leq \frac{k(i)\,(k(i) - 1)}{2} \tag{9}$$
Let us stress that the subgraph $G_i$ does not contain the node $i$. $G_i$ turns out to be useful in studying the connectivity of the neighbours of a node $i$ after the elimination of the node itself.
The upper bound on the number of edges in a subgraph $G_i$ suggests considering the ratio of the actual number of edges in $G_i$ to the right-hand side of Equation (9). Formally this ratio is defined as:

$$C_{sub}(i) = \frac{2\,|K_i|}{k(i)\,(k(i) - 1)} \tag{10}$$
The quantities $C_{sub}(i)$ are used to calculate the clustering coefficient $C(G)$ as their mean value:

$$C(G) = \frac{1}{|N|} \sum_{i \in N} C_{sub}(i) \tag{11}$$
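Equations (7) to (11) translate into a few lines of code. The sketch below assumes an unweighted, undirected graph given as a dictionary from each node to the set of its neighbours, with sortable node labels; it also assumes that a node with fewer than two neighbours contributes 0 to the mean, a case the equations leave undefined.

```python
def clustering_coefficient(adj):
    """Mean of C_sub(i) over all nodes, following Equations (10) and (11).

    `adj` maps node -> set of neighbours (undirected, no self loops).
    """
    total = 0.0
    for i, neigh in adj.items():
        k = len(neigh)
        if k < 2:
            continue  # C_sub(i) is undefined here; counted as 0 (assumption)
        # |K_i|: edges joining two neighbours of i, each unordered pair counted once
        links = sum(1 for a in neigh for b in neigh if a < b and b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


# A triangle a-b-c with a pendant node d attached to a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
c_example = clustering_coefficient(example)
```

In this example node a has three neighbours with a single edge among them, so its ratio is 1/3, while b and c each have two mutually connected neighbours and reach the upper bound.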
A network exhibits the small-world phenomenon if it is characterized by small values of $L$ and high values of the clustering coefficient $C$. Scale-free networks are usually identified, in addition to the power-law probability density function $P(k)$ of the edges, by small values of both $L$ and $C$.
When studying real system networks, such as the collaboration graph of actors or the links among WWW documents, the probability of encountering non-connected graphs is very high. The $L(G)$ and $C(G)$ formalism is not suited to treat these situations. In such cases the alternative formalism proposed in [38] is much more effective, even in the case of disconnected networks. This approach defines two measures of efficiency which give well-posed characterizations of the mean path length and of the mean node cliquishness, respectively.
To introduce the efficiency coefficients, it is necessary to consider the efficiency $\epsilon_{ij}$ of a generic pair of nodes $(i, j)$. This quantity measures the speed of information propagation between a node $i$ and a node $j$; in particular $\epsilon_{ij} = \frac{1}{d_{ij}}$.
With this definition, when there is no path in the graph between $i$ and $j$, $d_{ij} = \infty$ and consistently $\epsilon_{ij} = 0$.
The global efficiency of the graph $G$ is:

$$E_{glob}(G) = \frac{\sum_{i \neq j \in G} \epsilon_{ij}}{|N|\,(|N| - 1)} = \frac{1}{|N|\,(|N| - 1)} \sum_{i \neq j \in G} \frac{1}{d_{ij}} \tag{12}$$
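and the local efficiency, in analogy with $C$, can be defined as the average efficiency of local subgraphs:

$$E(G_i) = \frac{1}{k(i)\,(k(i) - 1)} \sum_{l \neq m \in G_i} \frac{1}{d'_{lm}} \tag{13}$$
$$E_{loc}(G) = \frac{1}{|N|} \sum_{i \in G} E(G_i) \tag{14}$$

where $G_i$, as previously defined, is the subgraph of the neighbours of $i$, which is composed of $k(i)$ nodes, and $d'_{lm}$ is the shortest path length between $l$ and $m$ computed within $G_i$.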
The two definitions originally given in [24] have the important property that both the global and local efficiency are normalized quantities, that is: $E_{glob}(G) \leq 1$ and $E_{loc}(G) \leq 1$. The conditions $E_{glob}(G) = 1$ and $E_{loc}(G) = 1$ hold in the case of a completely connected graph where the weight of the edges is a node-independent positive constant.
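The two efficiency measures of Equations (12) to (14) can be sketched as follows. The function names and the graph encoding (a dictionary from unordered node pairs to positive edge lengths) are assumptions of this example, and a node with fewer than two neighbours is assumed to contribute a local efficiency of 0.

```python
import math


def shortest_paths(nodes, length):
    """Floyd-Warshall; `length` maps frozenset({i, j}) -> positive edge length."""
    nodes = list(nodes)
    d = {(i, j): 0.0 if i == j else length.get(frozenset((i, j)), math.inf)
         for i in nodes for j in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i, k] + d[k, j] < d[i, j]:
                    d[i, j] = d[i, k] + d[k, j]
    return d


def global_efficiency(nodes, length):
    """Equation (12): mean of 1/d_ij over ordered pairs; unreachable pairs add 0."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0  # convention assumed for degenerate subgraphs
    d = shortest_paths(nodes, length)
    return sum(1.0 / d[i, j] for i in nodes for j in nodes if i != j) / (n * (n - 1))


def local_efficiency(nodes, length):
    """Equations (13)-(14): mean efficiency of each neighbour subgraph G_i."""
    nodes = list(nodes)
    total = 0.0
    for i in nodes:
        neigh = [j for j in nodes if j != i and frozenset((i, j)) in length]
        # keep only the edges joining two neighbours of i, so that i itself is excluded
        inner = {e: w for e, w in length.items() if e <= set(neigh)}
        total += global_efficiency(neigh, inner)
    return total / len(nodes)
```

Since 1/∞ evaluates to 0 in floating point, disconnected pairs need no special handling, which is what makes these measures well-posed where L(G) diverges. On a complete graph with unit edge lengths both functions return 1, matching the normalization property stated above.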
In the efficiency-based formalism, the small-world phenomenon emerges for systems with high $E_{glob}$ (corresponding to low $L$) and high $E_{loc}$ (corresponding to high clustering $C$). Scale-free networks without small-world behaviour show high $E_{glob}$ and low $E_{loc}$.
4 Information Retrieval Algorithm using Fuzzy Concept Network
In the last decade, there has been a rapid and wide development of the Internet, which has brought online an increasingly large amount of documents and online textual
information. The need for a better definition of the Information Retrieval System (IRS) emerged in order to retrieve the information considered pertinent to a user query. Information Retrieval is a discipline that involves the organization, storage, retrieval and display of information. IRSs are designed with the objective of providing references to documents which contain the information requested by the user [32]. In an IRS, problems arise when there is the need to handle the uncertainty and vagueness that appear in many different parts of the retrieval process. On the one hand, an IRS is required to understand queries expressed in natural languages. On the other hand, it needs to handle the uncertain representation of a document.
In the literature, there are many models of IRS, which are classified into the following categories: boolean logic, vector space, probabilistic and fuzzy logic [39, 40]. However, neither the efficiency nor the effectiveness of these methods is satisfactory [34]. Thus, other approaches have been proposed to directly handle the knowledge-based fuzzy information. In preliminary attempts the knowledge was represented by a concept matrix, whose elements identify relevance values among concepts [34]. Other more relevant approaches add fuzzy types to object-oriented database systems [41].
As stated in Section 2, a crucial topic for semantic information handling is to face the trade-off between the proper definition of an object and its “common sense” counterpart. The FCN characteristic weights are initially set by an expert of the domain. In particular, the expert sets the initial correlation values on the fuzzy ontology and the fuzzy concept network construction procedure takes these as initial values for the links among the objects in $O$ (see Definition 4). From then on, the correlation values $F(o_i, o_j)$ are updated according to the queries (both selections and insertions) performed on the documents.
Above all, the use of the FCN gives the possibility to directly incorporate the semantics expressed by natural languages in graph spanning. This feature lets us intrinsically obtain fuzzy information retrieval algorithms without introducing fuzzification and defuzzification operators. Let us stress that this process is possible because the fuzzy logic is directly inserted into the knowledge expressed by the fuzzy ontology.
An example of this kind of approach is presented in the following. The original crisp information retrieval algorithm taken into account has been successfully applied to support the creative processes of architects and interaction designers. In more detail, a new formalization of the algorithm adopted in the ATELIER project (see [22] and Section 5.1) is presented, including a brief step-by-step description.
The FCN has been involved in steps (1) and (4) in order to semantically enrich the results obtained. The algorithm input is the vector of the keywords in the query. The first step of the algorithm uses these keywords to locate the documents (e.g., stored in a relational database) containing them. The keyword vector is extended with all the new keywords related to each selected document.
FCN-IR Search ( kv : keyword vector )
1: FCN-based kv extension
2: kv pruning
3: kv-based documents extraction
4: FCN-based relevance calculation
return ranking of the documents

Figure 2: Information Retrieval Algorithm

In step (1) the queries are extended by navigating the FCN recursively. For each keyword specified in the query, a depth-first visit is performed, stopping the spanning at a fixed level. In [22] this threshold was set to 3. The edges whose $F(o_i, o_j)$ is 0 are excluded and the neighbour keywords are collected without repetitions. Usually the techniques of Information Retrieval simply extend queries with parents and children of the concepts. In step (2) the final list of neighbouring entities is pruned by navigating the fuzzy ontology, namely the set of candidates is reduced by excluding the keywords that are not connected by a direct path in the taxonomy, i.e. the parents and children of the terms contained in the query. In the third step the documents containing the resulting keywords are extracted from the knowledge base. In the last step, the FCN is used to calculate the relevance of the documents, thus arranging them in the desired order. In particular, thanks to the FCN characterising functions $F$ and $m$, the weights for the keywords $o_i$ in each selected document are determined according to the following equation:
$$w(o_i) = m(o_i)^{\beta_{o_i}} \cdot \sum_{o_j \in K,\, o_j \neq o_i} [F(o_i, o_j)]^{\beta_{o_i, o_j}} \tag{15}$$
where $K$ is the set of the keywords obtained from step (3), and $\beta_{o_i, o_j} \in \mathbb{R}$ is a modifier value used to express the effects of concept modifiers (see [19] and Section 2.2 for details).
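Equation (15) can be evaluated directly once the FCN functions are available. In the sketch below the function name, the dictionary-based encoding of m, F and the modifier exponents, and the default exponent of 1 when no concept modifier was used are all assumptions of this example; the entity names and values are made up.

```python
def keyword_weight(oi, keywords, m, F, beta_node=None, beta_edge=None):
    """Weight of keyword oi following Equation (15).

    m maps entity -> membership value; F maps frozenset({a, b}) -> correlation.
    beta_node / beta_edge hold the modifier exponents; a missing entry
    defaults to 1, i.e. no concept modifier (assumption of this sketch).
    """
    beta_node = beta_node or {}
    beta_edge = beta_edge or {}
    total = 0.0
    for oj in keywords:
        if oj == oi:
            continue
        pair = frozenset((oi, oj))
        f = F.get(pair, 0.0)
        if f > 0.0:  # uncorrelated pairs contribute nothing
            total += f ** beta_edge.get(pair, 1.0)
    return m[oi] ** beta_node.get(oi, 1.0) * total


# Hypothetical toy network.
m = {"wine": 0.9, "dry": 0.5, "red": 1.0}
F = {frozenset(("wine", "dry")): 0.8, frozenset(("wine", "red")): 0.5}
K = ["wine", "dry", "red"]

plain = keyword_weight("wine", K, m, F)
with_very = keyword_weight("wine", K, m, F,
                           beta_edge={frozenset(("wine", "dry")): 2.0})
```

With the hedge “very” applied to the wine/dry correlation, the term 0.8 is concentrated to 0.64, so the resulting weight is lower than the unmodified one.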
The final score of a document is evaluated through a cosine distance among the weights of each keyword. This is done for normalisation purposes. These scores are finally sorted in order to obtain a ranking of the documents.
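Step (1) of the algorithm, the depth-limited extension of the keyword vector over the FCN, can be sketched as follows. The function name and the adjacency encoding are hypothetical, and reading the threshold of 3 as "at most three hops from a query keyword" is an interpretation, not something the text states explicitly.

```python
def extend_keywords(query, neighbours, max_depth=3):
    """Depth-first extension of a keyword vector over the FCN.

    `neighbours` maps keyword -> {keyword: F value}. Edges with F == 0 are
    skipped and each keyword is collected once. A keyword is expanded again
    only if it is reached at a smaller depth than before, so the result does
    not depend on the visiting order.
    """
    start = list(dict.fromkeys(query))  # drop duplicates, keep order
    best = {kw: 0 for kw in start}      # keyword -> smallest depth reached so far
    order = list(start)

    def visit(node, depth):
        if depth >= max_depth:
            return
        for nxt, f in neighbours.get(node, {}).items():
            if f <= 0.0:
                continue
            if nxt in best and best[nxt] <= depth + 1:
                continue
            if nxt not in best:
                order.append(nxt)
            best[nxt] = depth + 1
            visit(nxt, depth + 1)

    for kw in start:
        visit(kw, 0)
    return order


# Hypothetical chain a - b - c - d - e, plus an uncorrelated pair (a, x).
toy = {
    "a": {"b": 0.5, "x": 0.0},
    "b": {"a": 0.5, "c": 0.4},
    "c": {"b": 0.4, "d": 0.9},
    "d": {"c": 0.9, "e": 0.2},
    "e": {"d": 0.2},
}
extended = extend_keywords(["a"], toy)
```

In this toy network the keyword e lies four hops from a and is therefore left out, while x is excluded because its correlation with a is 0. The pruning of step (2) against the taxonomy is not covered by this sketch.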
5 Test validation
This section is divided as follows: in the first part we introduce the environment used to experiment with the FCN, whereas in the second part the analytic study of the scale-free properties of these networks is given.
5.1 Description of the experiment
A creative learning environment is the context chosen to study the behaviour of fuzzy concept networks. In particular, the ATELIER (Architecture and Technologies for Inspirational Learning Environments) project has been examined. ATELIER is an EU-funded project that is part of the Disappearing Computer initiative2. The aim of this project is to build a digitally enhanced environment, supporting a creative learning process in architecture and interaction design education. The work of the students is supported by many kinds of devices (e.g., large displays,
2 http://www.disappearing-computer.net
RFID technology, barcodes, . . . ) and a hyper-media database (HMDB) is used to store all the digital materials produced. An approach has been studied to help and support the creative practices of the students. To achieve this goal, an ontology-driven selection tool has been developed. In a recent contribution [22] it has been shown how dynamical fuzzy ontologies are suitable for this context. Every day the students create a very large amount of documents and artifacts and they collect a lot of material (e.g., digital pictures, notes, videos, and so on). Thus, new concepts are produced in the life cycle of a project. Indeed, the ontology evolves in time and the need emerges for a dynamical fuzzy ontology suited to the context taken into account.
As presented in Section 2, the fuzzy ontology is an exhaustive approach to handle the knowledge-based fuzzy information. Furthermore, it emerges that the evolution of the fuzzy concept network is mainly driven by the keywords of the documents inserted in an HMDB and by the concepts written by the users during the definition of a query. The algorithm presented in Section 4 relies deeply on these properties.
The executed experiments consider all these aspects. We have examined the trend of the fuzzy concept network in three different scenarios: the contribution of the keywords in the HMDB, the contribution of the concepts from the queries, and their combined effects.
In the first case, 485 documents have been examined. Each document has four keywords on average, and the resulting final correlations are 431. In the second case, 500 queries were performed by users aged from 20 to 60. For each query a user had the opportunity to include up to 5 different concepts and to semantically enrich the request using the following list of concept modifiers (little, enough, moderately, quite, very, totally). In this experiment we obtained 232 correlations and 32 new concepts, which were introduced in the fuzzy ontology domain. In the last test we examined the two types of queries (485+500) jointly: the keywords of the documents and the requests of the users. The number of induced correlations is 615, while the new concepts are 37.
5.2 Analytic Results of the experiments
During the construction of the fuzzy concept network, snapshots have been periodically dumped to file (one snapshot every 50 queries) to be analyzed. To obtain a graphical topological representation, a network analysis tool called AGNA3 has been used. AGNA (Applied Graph and Network Analysis) is a platform-independent application designed for scientists and researchers who employ specific mathematical methods, such as social network analysis, sociometry and sequential analysis. Specifically, AGNA can assist in the study of communication relations in groups, organizational analysis and team building, kinship relations or animal behavior laws of organization. The most recent version is AGNA 2.1.1 and it has been used to produce the following pictures of the fuzzy concept networks starting from the weighted adjacency matrix defined in Section 3. The link color intensity is proportional to the function $F$ introduced in Definition 4, so the more marked
3 http://www.geocities.com/imbenta/agna/
lines mean $F$ values close to 1. Because of the large number of concepts and correlations, the pictures in Figure 4 have the purpose of showing qualitatively the link distributions and the hub locations. Little semantic information about the ontology can be effectively extracted from these pictures.
Both the global and the local efficiency, as defined in Equation (12) and in Equation (14), are calculated on each of these snapshots. The evolution of the efficiency measures is reported in Figures 3(a), 3(b) and 3(c). The solid line corresponds to the global efficiency while the dashed one is the local efficiency.
In Figure 3(a) the efficiency evolutions of the HMDB are reported. The global efficiency becomes the dominant effect after an initial transient in which the local efficiency is dominant. So, we can deduce the emergence of a hub-connected fuzzy concept network. This consideration is graphically confirmed by the network reported in Figure 4(a). To increase readability we located the hubs on the borders of the figure.
It can be seen clearly that the hub concepts are “people”, “man”, “woman”, “hat”, “face” and “portrait”. These central concepts have been isolated using the betweenness [42] sociometric measure. The high betweenness values of the hubs with respect to those of the other concepts confirm the measure obtained by $E_{glob}$. Indeed, the mean distance among the concepts is kept low thanks to these points appearing very frequently in the paths from and to all the other nodes. We want to stress that the global efficiency quantifies the presence of hubs in a given network, while the betweenness of the nodes gives a way to identify which of the concepts are actually hub points.
On the other hand, Figure 3(b) shows how the local efficiency in a fuzzy concept network built using user queries is higher than its global counterpart. This suggests that the network topology lacks hubs. Furthermore, many nodes have nearly the same number of neighbours. A confirmation of this analysis is given by the fuzzy concept network reported in Figure 4(b). In this case the betweenness index for the concepts shows that no particular point has a significantly higher frequency in the other node paths.
Finally, in Figure 3(c) the effect of the network obtained by both the documents and the queries is reported. In this composed kind of test a total of about 1000 queries is taken into account. It is interesting how both the data from the HMDB and the user queries act in a non-linear way on the quantification of the efficiency measures. The resulting fuzzy concept network shows a dominant $E_{glob}$ with respect to its $E_{loc}$ and some hubs emerge. In particular Figure 4(c) highlights the fact that the hubs collect a large number of links coming from the other concepts.
The betweenness index for this fuzzy concept network identifies one main hub, the concept “people”, with an extremely high value (5 times the mean value for the other hubs). The other principal hubs are “woman”, “man”, “landscape”, “sea”, “portrait”, “red” and “fruit”. In this case the Freeman General Coefficient evaluated on the indexes is slightly lower than in the HMDB case; this is due to the higher clustering among concepts (higher $E_{loc}$), see Table 1. The Freeman Coefficient is a function that allows the consolidation of node-level measures in a single value related to the properties of the whole network [42].
[Plots omitted: each panel shows Efficiency (vertical axis, ticks from 0.025 to 0.2) against Snap (horizontal axis, ticks from 2 to 10).]

Figure 3: (a) Efficiency measures for the concept network induced by the knowledge base. (b) Efficiency for the user queries. (c) Efficiency for the joined queries and knowledge base documents.

The strength of these connections is much more marked than in the similar situation treated in the case of the HMDB-induced fuzzy concept network. This means that the contribution of the queries reinforces the semantic correlations among the hub concepts. To confirm the scale-free nature of the hubs in the fuzzy concept networks we analyzed the statistical distributions of $k(i)$ (see Equation (4)) reported in Figure 5.
For Figures 5(a) and 5(c) the frequencies decrease according to a power law. This agrees very well with the theoretical expectations. The user query distributions, in Figure 5(b), behave differently. Their high values of $E_{loc}$ imply, as already stated, a highly clustered structure.
Let us consider how $E_{loc}$ is related to other classical social network parameters such as the density and the weighted density [42]. The density reflects the connectedness of a given network with respect to its complete graph. It is easy to note that this criterion is quite similar to what is stated in Equation (14): we can consider $E_{loc}$ as a mean value of the densities evaluated locally at each node of the fuzzy concept network. Unexpectedly, the numerical results in Table 1 show that there is an inversely proportional relation between $E_{loc}$ and the density. More investigation is required.
The weighted density can be interpreted as a measure of the mean link weight value, namely the mean semantic correlation (see Section 2) among the concepts in the fuzzy concept network. In Table 1 it can be seen that the weighted density values are higher for the systems exhibiting hubs. This is graphically confirmed by Figures 4(a) and 4(c), where the links among the concepts are more marked (i.e. more colored lines correspond to stronger correlations).
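To make the two parameters concrete, the sketch below computes them for a small undirected weighted graph. The normalization is an assumption: the density is taken as the fraction of possible links that are present, and the weighted density as the sum of the link weights divided by the number of possible links. The function names and the toy data are hypothetical.

```python
def density(n_nodes, weighted_edges):
    """Fraction of the possible undirected links that are present."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(weighted_edges) / possible

def weighted_density(n_nodes, weighted_edges):
    """Sum of link weights divided by the number of possible links (assumed definition)."""
    possible = n_nodes * (n_nodes - 1) / 2
    return sum(weight for _, _, weight in weighted_edges) / possible

# Hypothetical network with 5 concepts and 3 weighted links.
toy = [("people", "woman", 0.8), ("people", "man", 0.6), ("sea", "landscape", 0.4)]
d = density(5, toy)            # 3 links out of 10 possible
wd = weighted_density(5, toy)  # total weight 1.8 over 10 possible links
```

Under this assumed definition, with weights in [0, 1], the weighted density can never exceed the density, which is consistent with the values reported in Table 1.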
[Figure 4, panels (a), (b) and (c): network diagrams.]
Figure 4: Topological views of the different fuzzy concept networks obtained with AGNA 2.1.1: (a) HMDB network. (b) User queries network. (c) Complete knowledge-base fuzzy concept network.
[Figure 5, panels (a), (b) and (c): histograms with axes labeled "count" and "# link".]
Figure 5: (a) HMDB fuzzy concept network link distribution. (b) User queries fuzzy concept network link distribution. (c) Complete knowledge-base fuzzy concept network link distribution.
Table 1: Comparison of the efficiency measures of other complex systems w.r.t. our fuzzy concept networks.
Fuzzy Concept Network      Eglo    Freeman Coeff.   Eloc    Density   Weighted Density
HMDB                       0.094   0.07             0.074   0.035     0.01
Queries                    0.053   0.02             0.144   0.015     0.006
Complete (HMDB+Queries)    0.141   0.06             0.079   0.036     0.013
6 Related Work
A semantic network (or net) is a graphical notation to represent knowledge through patterns of interconnected nodes and arcs, defining concepts and the semantic relations between them, respectively [43]. This kind of declarative graphic can also be used to support automated reasoning systems that treat certain domain knowledge. Sowa [44] describes six of the most common kinds of semantic networks: among them he cites ’definitional networks, assertional networks, implicational networks, . . . ’. WordNet (a lexical database for the English language) is one of the most famous examples of a semantic network and has been widely used in recent years.
An ontology can be envisioned as a kind of semantic network where the concepts are related to one another using meaningful relationships. Up to now the most important semantic relations are meronymy (’A is part of B’), holonymy (’B has A as a part of itself’), hyponymy (’A is a kind of B’), and synonymy (’A denotes the same as B’). In this field our contribution, with this paper, is the introduction of a new semantic relation between the entities of the fuzzy ontology in order to define a semantic network based on the correlations (see Definition 3). The topological structure of the FCN allows an intuitive navigation through a wide collection of information, providing correlations among concepts not predictable a priori. In [45] a similar approach is reported: in this work a system is described which is able to extend ontologies in a semi-automatic way. In particular, such a system
has been applied to ontologies related to the contexts covered in the Web Mining field of research. Another ontology model somehow related to our approach is the seed ontology. A seed ontology creates a semantic network through co-occurrence analysis: it considers exclusively how many times the keywords are used together. The disambiguation-related problems are resolved by consulting WordNet. The major advantage of these nets is that they allow the efficient identification of the most probable candidates for inclusion in an extended ontology.
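The co-occurrence analysis mentioned above can be illustrated with a short sketch that counts how many times two keywords are used together. The representation of each document as a list of keywords, the function name and the sample data are assumptions; this is not the implementation of the cited system.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents):
    """Count how many documents contain each unordered keyword pair."""
    counts = Counter()
    for keywords in documents:
        # Sorting makes every pair appear in one canonical order.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts

# Hypothetical keyword lists, one per document.
docs = [["sea", "landscape", "red"], ["sea", "landscape"], ["red", "fruit"]]
counts = cooccurrence_counts(docs)
```

The resulting counts could serve as link weights of a network whose nodes are the keywords.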
Ontology-based information retrieval approaches are among the most promising methodologies to improve the quality of the responses for the users. The definition of the FCN implies a better calculation of the relevance of searched documents. A different approach to ontology-based information retrieval has been proposed in [46]. In this work a semantic network is built to represent the semantic contents of a document. The topological structure of this network is used in the following way: every time a query is performed a keyword vector is created, in order to select the appropriate concepts that characterize the contents of the documents according to the search criteria. We are investigating how to integrate the FCN with this different kind of semantic network; in fact both of these methodologies could be effectively used to achieve the goal of the IRs.
7 Conclusions
In this paper, we considered an extension of a classical ontology including Fuzzy Set Theory. We have reported several case studies and some research areas where it is currently applied. Furthermore, we have analyzed the evolution of the fuzzy ontology in time: the correlations among concepts change according to the different queries submitted. Document insertions and user queries have been taken into account. In relation to the fuzzy ontology a new concept network, called fuzzy concept network, has been defined. In this way, the dynamical fuzzy ontology reflects the behaviour of a fuzzy knowledge-base. Furthermore, a novel information retrieval algorithm exploiting this data structure has been proposed. We have examined the topological structure of this network as a novel complex network system, starting from the network efficiency evaluations. A highly clustered structure emerges, highlighting the role of some concepts as hubs and the characteristic distribution of weighted links among them. Thus, we have stated the scale-free nature of the fuzzy concept network, evaluating two efficiency measures to investigate its local and global properties. Finally, we compared these measures with the parameters commonly used in social network analysis.
Acknowledgements
The work presented in this paper has been partially supported by the ATELIER project (IST-2001-33064).
References
[1] T. Berners-Lee, J. Hendler, and O. Lassila, “The semantic web,” Scientific American, vol. 284, pp. 34–43, 2001.
[2] N. Guarino, “Formal ontology and information systems,” 1998. [Online]. Available: citeseer.ist.psu.edu/guarino98formal.html
[3] T. Gruber, “A Translation Approach to Portable Ontology Specifications,” Knowledge Acquisition, vol. 5, pp. 199–220, 1993.
[4] E. Falkenberg, W. Hesse, P. Lindgreen, B. Nilsson, J. Oei, C. Rolland, R. Stamper, F. V. Assche, A. Verrijn-Stuart, and K. Voss, “Frisco: A framework of information system concepts,” IFIP, The FRISCO Report (Web Edition) 3-901882-01-4, 1998.
[5] N. Lammari and E. Métais, “Building and maintaining ontologies: a set of algorithms,” Data and Knowledge Engineering, vol. 48, pp. 155–176, 2004.
[6] N. Guarino and P. Giaretta, “Ontologies and Knowledge Bases: Towards a Terminological Clarification,” in Towards Very Large Knowledge Bases: Knowledge Building and Knowledge Sharing, N. Mars, Ed. Amsterdam: IOS Press, 1995, pp. 25–32.
[7] V. W. Soo and C. Y. Lin, “Ontology-based information retrieval in a multi-agent system for digital library,” in 6th Conference on Artificial Intelligence and Applications, 2001, pp. 241–246.
[8] G. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall, 1995.
[9] L. A. Zadeh, “Fuzzy sets,” Inform. and Control, vol. 8, pp. 338–353, 1965.
[10] P. Bouquet, J. Euzenat, E. Franconi, L. Serafini, G. Stamou, and S. Tessaris, “Specification of a common framework for characterizing alignment,” IST Knowledge web NoE, vol. 2.2.1, 2004.
[11] S. Singh, L. Dey, and M. Abulaish, “A Framework for Extending Fuzzy Description Logic to Ontology based Document Processing,” in Proceedings of AWIC 2004, ser. LNAI, vol. 3034. Springer-Verlag, 2004, pp. 95–104.
[12] M. Abulaish and L. Dey, “Ontology Based Fuzzy Deductive System to Handle Imprecise Knowledge,” in Proceedings of the 4th International Conference on Intelligent Technologies (InTech 2003), 2003, pp. 271–278.
[13] C. Matheus, “Using Ontology-based Rules for Situation Awareness and Information Fusion,” in Position Paper presented at the W3C Workshop on Rule Languages for Interoperability, April 2005.
[14] T. Quan, S. Hui, and T. Cao, “FOGA: A Fuzzy Ontology Generation Framework for Scholarly Semantic Web,” in Knowledge Discovery and Ontologies (KDO-2004). Workshop at ECML/PKDD, 2004.
[15] Q. T. Tho, S. C. Hui, A. Fong, and T. H. Cao, “Automatic fuzzy ontology generation for semantic web,” IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 6, pp. 842–856, 2006.
[16] D. Parry, “A fuzzy ontology for medical document retrieval,” in Proceedings of The Australian Workshop on Data Mining and Web Intelligence (DMWI2004), Dunedin, 2004, pp. 121–126.
[17] T. Andreasen, J. F. Nilsson, and H. E. Thomsen, “Ontology-based querying,” in Flexible Query-Answering Systems, 2000, pp. 15–26. [Online]. Available: citeseer.ist.psu.edu/682410.html
[18] L. Chang-Shing, J. Zhi-Wei, and H. Lin-Kai, “A fuzzy ontology and its application to news summarization,” IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol. 35, pp. 859–880, 2005.
[19] S. Calegari and D. Ciucci, “Integrating Fuzzy Logic in Ontologies,” in ICEIS, Y. Manolopoulos, J. Filipe, P. Constantopoulos, and J. Cordeiro, Eds., 2006, pp. 66–73.
[20] E. Sanchez and T. Yamanoi, “Fuzzy ontologies for the semantic web,” in Proceedings of FQAS, 2006, pp. 691–699.
[21] AA.VV., “Karlsruhe Ontology and Semantic Web Tool Suite (KAON),” 2005, http://kaon.semanticweb.org.
[22] S. Calegari and M. Loregian, “Using dynamic fuzzy ontologies to understand creative environments,” in LNCS - FQAS, H. L. Larsen, G. Pasi, D. O. Arroyo, T. Andreasen, and H. Christiansen, Eds. Springer, 2006, pp. 404–415.
[23] D. J. Watts and S. H. Strogatz, “Collective dynamics of ’small-world’ networks,” Nature, vol. 393, no. 6684, pp. 440–442, June 4 1998.
[24] V. Latora and M. Marchiori, “Efficient behavior of small-world networks,” Phys. Rev. Lett., vol. 87, no. 19, p. 198701(4), 2001.
[25] Y. Bar-Yam, Dynamics of Complex Systems. Reading, MA: Addison-Wesley, 1997.
[26] M. E. J. Newman, “Models of the small world: A review,” J. Stat. Phys., vol. 101, p. 819, 2000. [Online]. Available: http://www.citebase.org/cgi-bin/citations?id=oai:arXiv.org:cond-mat/0001118
[27] P. Radaelli, S. Calegari, and S. Bandini, “Towards fuzzy ontology handling vagueness of natural languages,” in LNCS - Rough Sets and Knowledge Technology, G. Wang, J. Peters, A. Skowron, and Y. Yao, Eds. Springer Berlin / Heidelberg, 2006, pp. 693–700.
[28] D. Widyantoro and J. Yen, “A fuzzy ontology-based abstract search engine and its user studies,” in Proc. 10th IEEE Int’l Conf. Fuzzy Systems. IEEE Press, 2001, pp. 1291–1294.
[29] D. Widyantoro and J. Yen, “Using fuzzy ontology for query refinement in a personalized abstract search engine,” in IFSA World Congress and 20th NAFIPS International Conference. IEEE Press, 2001, pp. 610–615.
[30] L. A. Zadeh, “A fuzzy-set-theoretic interpretation of linguistic hedges,” Journal of Cybernetics, vol. 2, pp. 4–34, 1972.
[31] T. D. Khang, H. Storr, and S. Hölldobler, “A fuzzy description logic with hedges as concept modifiers,” in Third International Conference on Intelligent Technologies and Third Vietnam-Japan Symposium on Fuzzy Systems and Applications, 2002, pp. 25–34.
[32] V. V. Raghavan and S. K. M. Wong, “A critical analysis of vector space model for information retrieval,” Journal of the American Society for Information Science, vol. 37, no. 5, pp. 279–287, 1986.
[33] J. Xu and W. B. Croft, “Improving the effectiveness of information retrieval with local context analysis,” ACM Trans. Inf. Syst., vol. 18, no. 1, pp. 79–112, 2000.
[34] S.-M. Chen and J.-Y. Wang, “A critical analysis of vector space model for information retrieval,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 25, no. 5, pp. 793–803, 1995.
[35] D. Lucarella and R. Morara, “First: fuzzy information retrieval system,” J. Inf. Sci., vol. 17, no. 2, pp. 81–91, 1991.
[36] S. Milgram, “The small world problem,” Psychology Today, vol. 2, pp. 60–67, 1967.
[37] R. Albert, H. Jeong, and A.-L. Barabási, “Error and attack tolerance of complex networks,” Nature, vol. 406, pp. 378–382, 2000.
[38] P. Crucitti, V. Latora, M. Marchiori, and A. Rapisarda, “Efficiency of scale-free networks: Error and attack tolerance,” Physica A, vol. 320, pp. 622–642, 2003.
[39] G. Salton and M. McGill, Introduction to Modern Information Retrieval. McGraw-Hill Book Company, 1984.
[40] F. Crestani and G. Pasi, “Soft information retrieval: applications of fuzzy sets theory and neural networks,” in Neuro-Fuzzy Techniques for Intelligent Information Systems, N. Kasabov and R. Kozma, Eds. Physica Verlag, 1999, pp. 287–315.
[41] N. Marín, O. Pons, and M. A. V. Miranda, “A strategy for adding fuzzy types to an object-oriented database system,” Int. J. Intell. Syst., vol. 16, no. 7, pp. 863–880, 2001.
[42] S. Wasserman and K. Faust, Social Network Analysis. Cambridge: Cambridge University Press, 1994.
[43] J. Lee, M. Kim, and Y. Lee, “Information retrieval based on conceptual distance in is-a hierarchies,” J. Documentation, vol. 49, no. 2, pp. 188–207, 1993.
[44] J. Sowa, Encyclopedia of Artificial Intelligence. Stuart C. Shapiro, Wiley, 1987.
[45] W. Liu, A. Weichselbraun, A. Scharl, and E. Chang, “Semi-automatic ontology extension using spreading activation,” J. Universal Knowledge Management, vol. 0, no. 1, pp. 50–58, 2005.
[46] M. Baziz, M. Boughanem, N. Aussenac-Gilles, and C. Chrisment, “Semantic cores for representing document in IR,” in ACM, 2005, pp. 1011–1017.
Extracted Knowledge Interpretation in Mining Biological Data: a Survey
Ricardo Martinez, Martine Collard
EXECO Project
Laboratoire I3S - CNRS UMR 4070
Les Algorithmes, 2000 route des Lucioles,
BP.121 - 06903 Sophia-Antipolis - France
Email: [email protected], [email protected]
Abstract
This paper discusses different approaches for integrating biological knowledge in gene
expression analysis. We are interested in the fifth step of the microarray analysis
procedure, which focuses on knowledge discovery via interpretation of the microarray results. We
present a state of the art of methods for processing this step and we propose a classification
in three facets: prior or knowledge-based, standard or expression-based and co-clustering.
First we briefly discuss the purpose and usefulness of our classification. Then, the following
sections give an insight into each facet. We summarize each section with a comparison between
remarkable approaches.
Keywords: data mining, knowledge discovery, bioinformatics, microarray, biological sources
of information, gene expression, integration.
1 Introduction
Nowadays, one of the main challenges in gene expression technologies is to highlight the
main co-expressed¹ and co-annotated² gene groups using at least one of the different sources
of biological information [1]. In other words, the issue is the interpretation of microarray
results via integration of gene expression profiles with the corresponding biological gene
annotations extracted from biological databases.
Analyzing microarray data consists of five steps: protocol and image analysis, statistical
data treatment, gene selection, gene classification and knowledge discovery via data
interpretation [2]. Figure 1 shows the goal of the fifth analysis step, devoted to
interpretation, which is the integration between two domains: the numeric one, represented by the
gene expression profiles, and the knowledge one, represented by gene annotations issued from
different sources of biological information.
At the beginning of gene expression technologies, research focused on the numeric³
side. A variety of data analysis approaches have been reported ([3, 4, 5, 6, 7, 8])
which identify groups of co-expressed genes based only on expression profiles, without taking
into account biological knowledge. A common characteristic of purely numerical approaches
is that they determine gene groups (or clusters) of potential interest. However, they leave
to the expert the task of discovering and interpreting biological similarities hidden within
these groups. These methods are useful because they guide the analysis of the co-expressed
gene groups. Nevertheless, their results are often incomplete because they do not include
biological considerations based on the prior knowledge of biologists.
¹ Co-expressed gene group: group of genes with a common expression profile.
² Co-annotated gene group: group of genes with the same annotation. A gene annotation is a piece of biological information related to the gene that can be relational, syntactical, functional, etc.
³ We understand by numeric part the analysis of the gene expression measures only, disregarding the biological annotations.
International Journal of Computer Science & Applications, Vol. IV, No. II, pp. 145 - 163
© 2006 Technomathematics Research Foundation
Ricardo Martinez, Martine Collard 145
Figure 1: Interpretation of microarray results via integration of gene expression profiles with
corresponding sources of biological information
In order to process the interpretation step in an automatic or semi-automatic way, the
bioinformatics community is faced with an ever-increasing volume of sources of biological
information on gene annotations. We have classified them into the following six sources of
biological information: molecular databases (GenBank, EMBL, UniGene, etc.); semantic sources
such as thesauri, ontologies, taxonomies or semantic networks (UMLS, GO, Taxonomy, etc.);
experiment databases (GEO, ArrayExpress, etc.); bibliographic databases (Medline, Biosis,
etc.); gene/protein-related specific sources (OMIM, KEGG, etc.); and minimal microarray
information, as seen in Figure 1. Exploiting these different sources of biological information
is quite a complex task, so scientists have developed several tools for manipulating them or
integrating them into more complex databases [9], [10].
This paper presents a complete survey of the different approaches for automatic integration
of biological knowledge with gene expression data. A first discussion of these methods is
presented by Chuaqui in [11]. Here we present an original classification of the different
microarray analysis interpretation approaches.
The interpretation step may be defined as the result of the integration of gene expression
profile analysis with the corresponding gene annotations. This integration process consists
in grouping together co-expressed and co-annotated genes. Based on this definition, three
research axes may be distinguished: the prior or knowledge-based axis, the standard or
expression-based axis and the co-clustering axis. Our classification emphasizes the weight
of the integration process scheduling on the final results [12, 13, 14, 15].
Indeed, the main criterion underlying the classification we propose is the scheduling of the
phases which alternately consider gene measures or gene annotations. In prior or
knowledge-based approaches, the co-annotated gene groups are built first and then the gene
expression profiles are integrated. In standard or expression-based approaches, the co-expressed
gene groups are built first and then the gene annotations are integrated. Finally, co-clustering
approaches integrate co-expressed and co-annotated gene groups at the same time.
This paper is organized in the following way: each section explains the corresponding
interpretation axis, giving an insight into and a comparison of its remarkable approaches. Then,
we develop a discussion of the three interpretation axes.
2 Prior or Knowledge-Based Axis
Prior or knowledge-based approaches are based on biological knowledge from the sources of
biological information (see Figure 1). They first build co-annotated gene groups
sharing the same biological annotations. Then, they integrate the expression profile
information for each of the genes classified into co-annotated groups, highlighting those which
are co-expressed. Finally, the statistical significance of the co-annotated and co-expressed gene
groups is tested. We give a detailed description of this three-step methodology: co-annotated
gene groups composition, gene expression profiles integration and selection of the significant
co-annotated and co-expressed gene groups.
2.1 Prior or Knowledge-based Methodology
1. Co-Annotated Gene Groups Composition There exist several ways to build co-annotated
gene groups. We present here one structured way of building them. First, we need to choose
among different sources of biological information. Each kind of information is stored in a
specific format (XML, SQL, etc.) and has intrinsic characteristics. In each case, the
analysis process needs to deal with the format of each biological source. Another issue is to
choose a nomenclature for gene identity that is coherent with the sources of information
and thereafter with the expression data. Next, all the annotations of each gene are
collected from one or more sources of information. Finally, we gather into subsets the genes
that share the same annotation. Thus, we obtain all the co-annotated gene groups as shown in
Figure 2.
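The last two operations of this step can be sketched as the inversion of a gene-to-annotations mapping. The gene identifiers, the annotation terms and the function name below are hypothetical; loading annotations from a real source in its own format is deliberately left out.

```python
from collections import defaultdict

def co_annotated_groups(gene_annotations):
    """Invert a gene -> annotations mapping into annotation -> set of genes."""
    groups = defaultdict(set)
    for gene, annotations in gene_annotations.items():
        for annotation in annotations:
            groups[annotation].add(gene)
    return dict(groups)

# Hypothetical annotations already collected for three genes.
annotations = {
    "geneA": ["glycolysis", "cytoplasm"],
    "geneB": ["glycolysis"],
    "geneC": ["cytoplasm", "transport"],
}
groups = co_annotated_groups(annotations)
```

Each key of the result is one annotation and each value is the corresponding co-annotated gene group.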
2. Gene Expression Profiles Integration There are different ways to integrate gene
expression profiles with previously built co-annotated gene groups. Here we present one common
way to do it. First, expression profile measures are taken for each gene. Then, a variability
measure such as fold change, t-statistic or f-score [16] is used to build a sorted list of
gene ranks based on expression profiles. Finally, this measure is incorporated gene by gene
into the co-annotated groups. Thus, we obtain co-annotated gene groups with the expression
profile information within, as shown in Figure 2.
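A minimal sketch of this integration, assuming a log2 fold change between two conditions as the variability measure (only one of the options named above), could look as follows. All names and expression values are hypothetical.

```python
import math

def log2_fold_change(treated, control):
    """log2 ratio of the mean expression in two conditions (one common fold-change form)."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2(mean_treated / mean_control)

def ranked_groups(groups, scores):
    """Attach to each co-annotated group its genes sorted by decreasing |score|."""
    return {
        name: sorted(genes, key=lambda gene: abs(scores[gene]), reverse=True)
        for name, genes in groups.items()
    }

# Hypothetical expression values (treated, control) per gene.
expression = {
    "geneA": ([8.0, 8.0], [2.0, 2.0]),
    "geneB": ([3.0, 5.0], [4.0, 4.0]),
    "geneC": ([1.0, 1.0], [2.0, 2.0]),
}
scores = {gene: log2_fold_change(t, c) for gene, (t, c) in expression.items()}
groups = {"glycolysis": {"geneA", "geneB"}, "cytoplasm": {"geneA", "geneC"}}
ranked = ranked_groups(groups, scores)
```

Sorting by absolute value treats over-expression and under-expression alike; a signed sort would be the natural choice when only one direction is of interest.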
3. Selection of the Significant Co-Annotated and Co-Expressed Gene Groups At this
stage all co-annotated and co-expressed gene groups are built. The next step is to reveal
which of these groups or subgroups are statistically significant. The most frequent
technique for tackling this issue is statistical hypothesis testing. Here, we present the four
steps of statistical hypothesis testing:
a) Formulate the null hypothesis, H0, versus the alternative hypothesis, H1:
H0 : Commonly, that the genes that are co-annotated and co-expressed were expressed
together as the result of pure chance.
H1 : Commonly, that the co-expressed and co-annotated gene groups are found
together because of a biological effect combined with a component of chance variation.
b) Identify a test statistic: The test is based on a probability distribution that will be used
to assess the truth of the null hypothesis.
Figure 2: Gene expression profiles integration into previously co-annotated groups
c) Compute the p-value: The p-value is the probability that a test statistic at least
as extreme as the one observed would be obtained, assuming that the null hypothesis
is true.
d) Compare the p-value: This consists in comparing the p-value to an acceptable
significance value α. If p-value ≤ α, we can consider that the co-annotated and
co-expressed gene group is gathered by a biological effect and thus is statistically
significant. Consequently, the null hypothesis is rejected in favor of the alternative
hypothesis.
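Steps a) to d) can be illustrated with a small exact test. This is only a sketch under stated assumptions: the test statistic is taken to be the mean score of the group, the null distribution is obtained by enumerating every gene subset of the same size, and the scores, names and α value are hypothetical. Exhaustive enumeration is feasible only for very small examples.

```python
from itertools import combinations

def exact_p_value(scores, group):
    """One-sided p-value: share of equally sized gene subsets whose mean score
    is at least as large as the mean score of the observed group."""
    genes = sorted(scores)
    k = len(group)
    observed = sum(scores[g] for g in group) / k
    subsets = list(combinations(genes, k))
    as_extreme = sum(
        1 for subset in subsets
        if sum(scores[g] for g in subset) / k >= observed
    )
    return as_extreme / len(subsets)

# Hypothetical per-gene variability scores.
scores = {"g1": 3.0, "g2": 2.5, "g3": 0.5, "g4": 0.25, "g5": 0.0, "g6": -0.5}
alpha = 0.1
p = exact_p_value(scores, {"g1", "g2"})
significant = p <= alpha
```

In this toy case only the observed pair reaches the observed mean among the 15 possible pairs, so the null hypothesis is rejected at the chosen α.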
At the end of the methodology explained above, the prior approaches present the
interpretation results as significant co-expressed and co-annotated groups of genes (see the third
step of Figure 2). The next section presents some of the most remarkable approaches and
methods of the prior or knowledge-based axis.
2.2 Remarkable Prior or Knowledge-based Approaches
We present here four representative approaches: GSEA [17], iGA [18], PAGE [19] and
CGGA [20]. In the following we describe each of them and emphasize some parameters
in particular: the source of biological information, the expression profile measure, the
expression variability measure, and the hypothesis testing parameters and details (type of test,
test statistic, distribution, corrections, etc.).
1. Gene Set Enrichment Analysis, GSEA
This approach [17] proposes a statistical method designed to detect coordinated changes
in expression profiles of pre-defined groups of co-annotated genes. This method was born from
the need to interpret metabolic pathway results, where a group of genes is supposed to
move together along the pathway.
In the first step, it builds a priori defined gene sets using specific sources of information
which are the NetAFFX and GenMapp metabolic pathways databases.
In the second step, it takes the Signal to Noise Ratio (SNR) to measure the expression
profiles of each gene within the co-annotated group. Then it builds a sorted list of genes for
each of the co-annotated groups.
Third, it uses a non-parametric statistic, the enrichment score ES (based on a normalized
Kolmogorov-Smirnov statistic), for hypothesis testing. It takes as null hypothesis:
H0 : The rank ordering of genes is random with regard to the sample.
Then, it assesses the statistical significance of the maximal ES by running a set of
permutations among the samples. Finally, it compares the max ES with a threshold α, obtaining
the significant co-expressed and co-annotated gene groups.
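The idea of a Kolmogorov-Smirnov-style enrichment score can be sketched as a running sum over the ranked gene list. The step sizes used below are one normalization in that spirit and are an assumption, not necessarily the exact formula of [17]; the permutation of the samples used to assess significance is not shown. Gene names are hypothetical.

```python
import math

def enrichment_score(ranked_genes, gene_set):
    """Maximum of a running sum over the ranked list: the sum goes up at each
    member of gene_set and down at each non-member, ending near zero."""
    n = len(ranked_genes)
    g = sum(1 for gene in ranked_genes if gene in gene_set)
    if g == 0 or g == n:
        raise ValueError("gene_set must contain some, but not all, ranked genes")
    up = math.sqrt((n - g) / g)      # assumed step for a member of the set
    down = math.sqrt(g / (n - g))    # assumed step for a non-member
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += up if gene in gene_set else -down
        best = max(best, running)
    return best

ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})
es_bottom = enrichment_score(ranked, {"g5", "g6"})
```

A gene set concentrated at the top of the list yields a large score, while a set concentrated at the bottom yields a score close to zero.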
2. Parametric Analysis of Gene Set Enrichment, PAGE
This approach [19] detects co-expressed genes within a priori co-annotated groups of genes
like GSEA, but it implements a parametric method.
In the first step, it builds a priori defined gene sets from the Gene Ontology (GO)⁴, NetAFFX⁵
and GenMapp⁶ metabolic databases.
In the second step, it takes the fold change to measure the expression profile of each gene
within the co-annotated group. Then, it builds a z-score from the corresponding fold changes
of the two compared groups (normal versus non-normal) as the expression variability measure.
Third, it uses the z-score as a parametric test statistic. It then uses the central limit
theorem [21] to argue that when the sample size of a co-annotated group is large enough, the
statistic has a normal distribution. The null hypothesis is:
H0 : The z-score within the groups has a standard normal distribution.
Thus, a co-annotated gene group whose z-score deviates significantly from the standard normal
distribution is considered significantly co-expressed.
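A parametric test of this kind can be sketched as follows. The z-score formula used here (group mean minus overall mean, scaled by the square root of the group size over the overall standard deviation) is an assumption about how such a statistic is usually built and is not taken from the text; the fold changes are hypothetical, and the group is far too small for the central limit theorem to apply in practice.

```python
import math
from statistics import NormalDist, mean, pstdev

def page_z_score(all_fold_changes, group_fold_changes):
    """Z = (group mean - overall mean) * sqrt(group size) / overall std dev."""
    mu = mean(all_fold_changes)
    sigma = pstdev(all_fold_changes)
    m = len(group_fold_changes)
    return (mean(group_fold_changes) - mu) * math.sqrt(m) / sigma

def upper_tail_p(z):
    """One-tailed p-value under the standard normal distribution."""
    return 1.0 - NormalDist().cdf(z)

# Hypothetical log fold changes for all genes and for one co-annotated group.
all_fc = [2.0, 1.0, 0.0, 0.0, -1.0, -2.0, 1.0, -1.0]
group_fc = [2.0, 1.0, 1.0, 0.0]
z = page_z_score(all_fc, group_fc)
p = upper_tail_p(z)
```

The resulting p-value is then compared with the chosen significance level, as in step d) of the methodology.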
3. Iterative Group Analysis, iGA
This approach [18] finds co-expressed gene groups within a priori functionally enriched
groups, sharing the same functional annotation.
In a first step, it builds a priori functionally enriched groups of genes from Gene Ontology
(GO) or other sources of biological information.
In a second step, it uses the fold change gene expression measure to build a complete
sorted list of genes. Then, it generates a reduced sorted list specific to the functionally en-
riched group.
In a third step, it iteratively calculates the probability of change for each functionally
enriched group (based on the cumulative hypergeometric distribution). It states the null
hypothesis:
H0 : The top x genes are associated by chance within the functionally
enriched group.
Then, it assesses the statistical significance of each group by comparing the probability of
change (p-value) against a user-determined α value.
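The iterative use of the cumulative hypergeometric distribution can be sketched as below: the ranked list is scanned from the top, and at each member of the group the probability of seeing at least that many members so early by chance is computed; the smallest value is kept. This is an illustrative reading of the description above, not the published implementation, and all names are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` genes without replacement from
    `population` genes, of which `successes` belong to the group."""
    total = comb(population, draws)
    upper = min(successes, draws)
    return sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    ) / total

def iterative_group_p(ranked_genes, group):
    """Smallest tail probability over all list prefixes ending at a group member."""
    n = len(ranked_genes)
    size = sum(1 for gene in ranked_genes if gene in group)
    best, seen = 1.0, 0
    for position, gene in enumerate(ranked_genes, start=1):
        if gene in group:
            seen += 1
            best = min(best, hypergeom_upper_tail(n, size, position, seen))
    return best

ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
p_top = iterative_group_p(ranked, {"g1", "g2"})
```

Because the minimum is taken over several prefixes, the value is optimistic if read as an ordinary p-value; this is one reason why a user-determined threshold or a correction is applied afterwards.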
4. Co-expressed Gene Group Analysis, CGGA
This approach [20] automatically finds co-expressed and co-annotated gene groups.
⁴ http://www.geneontology.org
⁵ http://www.affymetrix.com/analysis
⁶ http://www.genmapp.org
In a first step, it builds a priori defined gene groups from a source of biological information,
for instance Gene Ontology (GO) and KEGG⁷.
In a second step, it uses the fold change as a gene expression measure. Then, it composes
the f-score from the corresponding gene's fold change. Using the f-score of each gene,
it builds a sorted list of gene ranks. Then, it generates a reduced list of gene ranks specific to
the co-annotated enriched group.
In a third step, it states the null hypothesis:
H0 : x genes from a co-annotated gene group (or subgroup) are
co-expressed by chance.
A hypergeometric distribution is assumed, and the p-value is calculated from its cumulative
distribution. This p-value is compared against α to reveal all the significant co-expressed
and co-annotated gene groups, including all the possible subgroups.
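Since many groups and subgroups are tested against α, a multiple-testing correction is commonly applied; Table I lists a Bonferroni correction for CGGA. The fragment below is a generic sketch of that correction, with hypothetical group names and p-values, and is not taken from the CGGA implementation.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Keep the groups whose p-value passes the Bonferroni threshold alpha / m,
    where m is the number of groups tested."""
    threshold = alpha / len(p_values)
    return {name for name, p in p_values.items() if p <= threshold}

# Hypothetical p-values for three co-annotated groups.
p_values = {"glycolysis": 0.001, "transport": 0.03, "cytoplasm": 0.2}
kept = bonferroni_significant(p_values, alpha=0.05)
```

With three tests the threshold becomes 0.05 / 3, so a group with p = 0.03 that would pass an uncorrected test is no longer retained.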
2.3 Comparison between Prior or Knowledge-based Approaches
Table I presents a brief summary of the four prior approaches described in the last section. For
each approach the following four parameters are presented: sources of biological
information used, expression profile measure, expression variability measure and hypothesis
testing details (test statistic, distribution and particular characteristics).
First of all, the four approaches are concerned with metabolic pathways within biological
processes, but they use different sources of information: iGA, PAGE and CGGA use Gene
Ontology, and GSEA uses manual metabolic annotations, GENMAPP and NetAffx. CGGA
is the only one which uses the KEGG database combined with Gene Ontology.
For expression profile parameters, GSEA is the only one which chose the SNR measure,
while the others opted for the fold change measure. PAGE and CGGA use respectively the
z-score and f-score variability measures to detect the changes in gene expression profiles.
For hypothesis testing, GSEA is the only one which uses a non-parametric method, based on
a maximal ES statistic and sampling to calculate the p-value. In contrast, PAGE (normal
distribution), CGGA (hypergeometric distribution) and iGA (hypergeometric distribution)
chose a parametric approach. iGA chose a hypothesis test based on the most over-expressed
or under-expressed genes (in the rank list) of a co-annotated group, while CGGA searches
all the possible co-expressed subgroups within a co-annotated group (the internal position
of the subgroup in the group does not matter).
3 Standard or Expression-based Approach
This axis is called standard because it follows the most frequent procedure for microarray
data analysis, which consists of five steps: image analysis, statistical data treatment, gene
selection, gene classification and results interpretation via biological knowledge integration.
This axis has been used since the beginning of microarray technology with encouraging
interpretation results [4], [5] and [3]. Thereafter, it has been used as the reference methodology
in microarray data analysis. Expression-based approaches start by building gene groups or
clusters of genes sharing similar expression profiles. Then, they integrate the biological
annotations of each gene contained inside the expression cluster, building co-expressed and
co-annotated subsets of genes. Finally, the statistical significance of the co-expressed and
co-annotated gene groups is tested. In the following section, we explain in detail this three-step
methodology: gene expression profiles classification, biological annotations integration and
selection of the significant co-expressed and co-annotated gene groups.
7 http://www.genome.jp/kegg
International Journal of Computer Science & ApplicationsVol. IV, No. II, pp. 145 - 163
© 2006 Technomathematics Research Foundation
Ricardi Martinez, Martine Collard 150
| Approach | Biological Source of Information | Expression Profile Measure | Variability Expression Measure | Hypothesis Testing Details |
|---|---|---|---|---|
| GSEA (Mootha et al. 2003) | Manual Annotations, NetAffx and GENMAPP | SNR (Signal to Noise Ratio) | Mean Expression Difference | One-tailed test. Test statistic: Maximal ES. Non-parametric distribution. |
| iGA (Breitling et al. 2004) | GO | Fold Change | Fold Change | One-tailed test. Modified Fisher’s exact statistic: the most over- or under-expressed genes in a group. Hypergeometric distribution. |
| PAGE (Kim et al. 2005) | GO | Fold Change | z-score | One-tailed test. z-score statistic. Normal distribution. |
| CGGA (Martinez et al. 2006) | GO: (MP, BP, CC) and KEGG | Fold Change | F-score | One-tailed test. Modified Fisher’s exact statistic: all over- or under-expressed genes in a group. Hypergeometric distribution. Binomial distribution for N large. Bonferroni correction. |

TABLE I
COMPARISON BETWEEN FOUR KNOWLEDGE-BASED INTEGRATION APPROACHES
3.1 Standard or Expression-based Methodology
1. Gene Expression Profiles Classification. There exist several methods for classifying
gene expression profiles from cleaned microarray data, i.e. a data matrix of thousands of genes
measured in tens of biological conditions. Various supervised and unsupervised
methods have tackled the gene classification issue. Among the most common methods, we can
mention hierarchical clustering, k-means, Diana, Agnes, Fanny [22], model-based clustering
[23], support vector machines (SVM), self-organizing maps (SOM), and even association rules
(see more details in [24]).
The goal of these methods is to classify genes into clusters sharing similar gene expression
profiles, as shown in the first step of Figure 3.
2. Biological Annotations Integration. Once clusters of genes with similar expression
levels are built, the annotations of each gene are extracted from the sources of biological
information. As in the prior axis, this step deals with different formats of information. A list
of annotations is composed for each gene, and then all the annotations are integrated into the
clusters of genes (previously built from co-expression profiles). Thus, subsets of co-annotated
and co-expressed gene groups are built within each cluster. Figure 3 illustrates this process:
three clusters of similar expression profiles are first built, and then all the individual gene
annotations are collected and incorporated into each cluster. For example, in the first
under-expressed (green) group we find three subsets of co-annotated genes: respiratory complex
(Gene E and Gene D), gluconeogenesis (Gene G and Gene Y) and tricarboxylic acid cycle (Gene E
and Gene T). The subsets within the under-expressed cluster can share genes, because
each gene may have several annotations. Thus, we obtain all the co-annotated gene
groups.
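As a minimal sketch of this integration step, the following Python fragment groups the genes of one expression cluster by shared annotation. The function name `co_annotated_subsets`, the `min_size` parameter and the dictionary layout are illustrative assumptions; only the gene names and annotation terms come from the example above.

```python
from collections import defaultdict


def co_annotated_subsets(cluster_genes, gene_annotations, min_size=2):
    """Group the genes of one expression cluster by shared annotation term.

    `gene_annotations` maps a gene name to the set of its annotation terms
    (hypothetical layout; real sources deliver many different formats).
    """
    subsets = defaultdict(set)
    for gene in cluster_genes:
        for term in gene_annotations.get(gene, ()):
            subsets[term].add(gene)
    # Keep only the terms shared by at least `min_size` genes of the cluster.
    return {term: genes for term, genes in subsets.items() if len(genes) >= min_size}


# Toy data mirroring the under-expressed cluster described in the text.
annotations = {
    "Gene E": {"respiratory complex", "tricarboxylic acid cycle"},
    "Gene D": {"respiratory complex"},
    "Gene G": {"gluconeogenesis"},
    "Gene Y": {"gluconeogenesis"},
    "Gene T": {"tricarboxylic acid cycle"},
}
under_expressed = ["Gene E", "Gene D", "Gene G", "Gene Y", "Gene T"]
groups = co_annotated_subsets(under_expressed, annotations)
```

Because Gene E carries two annotations, it appears in two subsets, which is the kind of intersection mentioned above.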
3. Selection of the Significant Co-Annotated and Co-Expressed Gene Groups At this
stage all the co-expressed and co-annotated gene groups are built and the issue is to reveal
Figure 3: Interpretation of microarray results via integration of gene expression profiles with
corresponding sources of biological information
which of these groups, or of their possible subgroups, are statistically significant. The most
common technique in use is statistical hypothesis testing (see Figure 3).
After this full three-step methodology, the expression-based approaches present the
interpretation results as significant co-expressed and co-annotated groups of genes.
The next section presents some of the most representative approaches and methods of the
expression-based axis. Since these approaches are quite numerous, we have classified them
according to their main source of biological information. Thus, we have the following
classification: minimal information approaches, ontology approaches and bibliographic source
approaches.
3.2 Expression-based Semantic Approaches
Expression-based semantic approaches fundamentally integrate semantic annotations
(contained in ontologies, thesauri, semantic networks, etc.) into co-expressed gene groups.
Nowadays, semantic sources of biological information, i.e. structured and controlled
vocabularies, are among the best available sources of information for analyzing microarray
data in order to discover meaningful rules and patterns [1].
Expression-based semantic approaches are currently widely exploited. In this section we
present seven of them: FunSpec [25], OntoExpress [26], Quality Tool [27], EASE [28],
THEA [29], Graph Theoretic Modeling [12] and GENERATOR [30]. Each approach uses
Gene Ontology (GO) as its source of biological annotation, sometimes combined with other
specific gene/protein-related sources such as MIPS, KEGG, Pfam, Smart, etc., or molecular
databases such as Embl, SwissProt (footnote 8), etc.
In recent years, GO has been preferred over other sources of information
because of its unambiguous and comprehensible structure. This explains the recent
explosion in the number of expression-based GO approaches. Among these approaches, we can
cite the tools that integrate gene expression data with GO, such as GoMiner [31],
FatiGO [32], Gostat [33], GoToolbox [34], GFINDer [35], CLENCH [36], BINGO [37], etc.
An up-to-date GO compendium (footnote 9) lists more integration methods, GO searching tools,
GO browsing tools and related GO tools.
In the next section, we describe seven remarkable expression-based semantic solutions.
3.3 Remarkable Expression-based Semantic Approaches
1. FunSpec: web-based cluster interpreter
This approach [25] proposes a statistical evaluation of groups of co-expressed genes and
proteins with respect to existing annotations.
It takes as input clusters of genes previously built by similarity in expression. Then it
searches for all gene and protein annotations in four biological sources of information: Gene
Ontology (GO), Munich Information Center for Protein Sequences (MIPS, footnote 10), Nucleotide
sequence database (EMBL, footnote 11) and Protein families of alignments and HMMs (Pfam,
footnote 12). It builds all the subsets of co-annotated and co-expressed gene and protein groups
within each cluster. It selects the significant subsets (those that are really functionally
enriched) via hypothesis testing. It states the null hypothesis:
H0 : A functionally enriched group of genes is associated by chance
within the cluster of co-expressed genes.
8 http://www.ebi.uniprot.org
9 http://www.geneontology.org/GO.tools.shtml
10 http://mips.gsf.de
11 http://www.ebi.ac.uk/embl
12 http://www.sanger.ac.uk/Software/Pfam
This one-tailed hypothesis is tested on the basis of a hypergeometric distribution,
using a p-value calculated from the cumulative distribution as in Fisher’s exact test [38].
A Bonferroni correction is applied to compensate for multiple testing. Finally, it assesses the
statistical significance of each group by comparing the p-value against a user-determined α
value (more details in [39]).
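The test described above can be sketched in a few lines of Python. This is an illustrative reading of a one-tailed hypergeometric test with a Bonferroni correction, not FunSpec's own code; the function names and the parameter names (`k` annotated genes in a cluster of `n` genes, `K` annotated genes among `N` genes overall) are assumptions introduced here.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N genes, K of which carry the annotation."""
    # Sum exact integer counts first, then divide once, to limit rounding error.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
    )
    return favourable / comb(N, n)


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test as significant after multiplying its p-value by the number of tests."""
    m = len(p_values)
    return [p * m <= alpha for p in p_values]


# Hypothetical cluster: 50 genes, 5 of them annotated with a term that
# annotates 40 of the 6000 genes on the array.
p_value = hypergeom_upper_tail(5, 50, 40, 6000)
decisions = bonferroni_significant([p_value, 0.04], alpha=0.05)
```

The exact integer arithmetic of `math.comb` avoids the overflow issues of factorial-based formulas, at the cost of speed on very large arrays.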
2. Onto-Express: Global functional profiling of gene expression
This approach [26] proposes several statistical evaluations of co-expressed gene groups
with respect to existing GO annotations. In a first step, it takes as input clusters of genes
previously built by similarity in expression. In a second step, it takes all the existing GO
annotations included in the three ontologies: molecular function, cellular component and
biological process. Then, it builds all the subsets of co-annotated and co-expressed gene
groups within each cluster.
In a third step, it selects the significant subsets by rejecting the null hypothesis:
H0 : A GO annotated group of genes is associated by chance within the
cluster of co-expressed genes.
This one-tailed hypothesis is tested using a probability distribution and a p-value
calculated from its cumulative distribution. Finally, it assesses the statistical significance
of each group by comparing the p-value against a user-determined α value. Onto-Express
gives the following test options: binomial distribution [21] (when the number of genes is
very large), Fisher’s exact test [40] (when the number of genes is not too large), and χ2
test for equality of proportions [41].
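One way to read the binomial option is as an approximation of the hypergeometric tail when the reference gene set is large. The sketch below compares the two tails on hypothetical counts; the success probability `p = K / N` and all function names are assumptions made for illustration, not the Onto-Express implementation.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """Exact P(X >= k): n genes drawn without replacement from N, K of them annotated."""
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
    )
    return favourable / comb(N, n)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for n independent draws, each annotated with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts: a cluster of 50 genes on an array of 6000 genes,
# 40 of which carry the annotation under test.
N, K, n, k = 6000, 40, 50, 3
exact = hypergeom_upper_tail(k, n, K, N)
approx = binomial_upper_tail(k, n, K / N)
gap = abs(exact - approx)
```

With a cluster that is small relative to the array, sampling with and without replacement behave almost identically, so `gap` stays small.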
3. Quality Tool: judging the quality of gene expression-based clustering methods
This approach [27] proposes a measure for testing the quality of clusters of gene expression
profiles, based on the mutual information between cluster membership and known gene
annotations. In a first step, it takes clusters of co-expressed genes. In a second step, it takes
all the existing GO annotations included in the three ontologies: molecular function, cellular
component and biological process. Then, it builds a wide matrix of GO attributes for all genes,
containing 1 if the gene matches the attribute and 0 if not. It builds a contingency table for
each cluster-attribute pair, from which it computes the cluster-attribute entropy and mutual
information [42]. In a third step, it compares this measure with that of clusters grouped by
chance from the same microarray experiments, to check whether the real clusters are better
than random ones.
This approach uses the same one-tailed hypothesis as seen before (Onto-Express and
FunSpec), but it assumes a normal distribution and uses a z-score statistic for the calculations.
Finally, it obtains significant co-expressed and co-annotated groups of genes.
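The measure can be illustrated for a single GO attribute as follows. This is a simplified sketch, not the published tool: the random clusters are generated here by shuffling the cluster labels, and the function names, the number of shuffles and the fixed seed are all assumptions.

```python
import random
from collections import Counter
from math import log2
from statistics import mean, pstdev


def mutual_information(cluster_labels, attribute_flags):
    """Mutual information (in bits) between cluster membership and a 0/1 attribute."""
    n = len(cluster_labels)
    joint = Counter(zip(cluster_labels, attribute_flags))  # contingency table
    by_cluster = Counter(cluster_labels)
    by_attribute = Counter(attribute_flags)
    mi = 0.0
    for (cluster, flag), count in joint.items():
        p_joint = count / n
        p_indep = (by_cluster[cluster] / n) * (by_attribute[flag] / n)
        mi += p_joint * log2(p_joint / p_indep)
    return mi


def z_score_vs_random(cluster_labels, attribute_flags, n_random=200, seed=0):
    """z-score of the observed mutual information against label-shuffled clusterings."""
    rng = random.Random(seed)
    observed = mutual_information(cluster_labels, attribute_flags)
    shuffled = list(cluster_labels)
    null = []
    for _ in range(n_random):
        rng.shuffle(shuffled)  # random clustering with the same cluster sizes
        null.append(mutual_information(shuffled, attribute_flags))
    spread = pstdev(null)
    return (observed - mean(null)) / spread if spread > 0 else float("inf")


# Toy data: 20 genes in two clusters; the attribute matches cluster 0 exactly.
labels = [0] * 10 + [1] * 10
flags = [1] * 10 + [0] * 10
z = z_score_vs_random(labels, flags)
```

A large positive z-score indicates that the clustering shares more information with the annotation than random clusterings of the same sizes do.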
4. EASE: Identifying biological themes within lists of genes
This approach [28] provides a friendly interface for quick annotation of genes within
a cluster, together with a selection method for co-expressed and co-annotated gene groups.
In a first step, it takes clusters of co-expressed genes (previously built by classification
algorithms). In a second step, it takes the available gene annotations from GO, KEGG,
SwissProt, PFAM and SMART. Then, it builds all the subsets of co-annotated and co-expressed
gene groups within each cluster. In a third step, it shows the statistically significant
co-expressed and co-annotated gene groups.
This approach uses the same one-tailed hypothesis testing assumptions (null hypothesis,
hypergeometric distribution, Fisher’s exact test, p-value and α) as Onto-Express and
FunSpec. The only difference is the use of an alternative statistic named EASE score, which
is a conservative adjustment that weights statistical significance in favor of co-annotated
groups supported by more genes.
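The text does not give the formula of the EASE score. The sketch below assumes one common description of it, namely Fisher's exact probability recomputed after removing one annotated gene from the cluster; this assumption, the function names and the example counts are not taken from the source.

```python
from math import comb


def fisher_upper_tail(k, n, K, N):
    """One-tailed Fisher/hypergeometric p-value: P(X >= k) for n genes drawn from N."""
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
    )
    return favourable / comb(N, n)


def ease_like_score(k, n, K, N):
    """Assumed EASE-style adjustment: drop one annotated gene from the cluster, then retest."""
    if k == 0:
        return 1.0
    return fisher_upper_tail(k - 1, n - 1, K, N)


# Hypothetical counts: cluster of 50 genes, array of 6000 genes, 40 annotated overall.
for hits in (1, 2, 5):
    plain = fisher_upper_tail(hits, 50, 40, 6000)
    adjusted = ease_like_score(hits, 50, 40, 6000)
    print(hits, plain, adjusted)
```

Under this reading, a group supported by a single gene can never be significant, and the penalty shrinks as more genes support the group, which matches the conservative behaviour described above.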
5. THEA: Tools for high-throughput experiments analysis
This approach [29] proposes a set of tools designed for manipulating microarray results
obtained by hierarchical clustering. It integrates gene annotations from biological sources
of information and evaluates co-expressed and co-annotated groups of genes.
It takes as input clusters of genes obtained by a hierarchical clustering algorithm. Then, it
queries a database in order to obtain all the possible gene annotations from the three GO
ontologies: biological process, molecular function and cellular component. Next, it shows all
the possible subsets of co-annotated and co-expressed gene groups within each cluster and
displays the statistical evaluation of these groups graphically. This approach uses the same
one-tailed hypothesis testing assumptions (H0, Fisher’s exact test, p-value and α value) as
Onto-Express and FunSpec.
6. Graph-theoretic modeling
This approach [12] extracts the common GO annotations of the genes within a cluster of
co-expressed genes through a modified structure of Gene Ontology called GO tree.
In a first step, it takes as input clusters of co-expressed genes obtained with any clustering
technique. In a second step, it annotates all genes in a cluster with GO terms, taking into
account the hierarchical nature of GO. It proposes a quantitative measure for estimating how
well the gene clusters built from expression profiles group together along known GO
categories. This measure is based on a graph distance between nodes in the directed acyclic
graph (DAG) of GO. In a third step, it compares this quantitative measure with the same
measure taken from random clusters to see whether it is better or not. Thus, it obtains
significant co-expressed and co-annotated groups of genes.
7. GENERATOR: Theme discovery from gene lists for identification and viewing of multiple
functional groups
This approach [30] takes co-expressed gene groups and splits each of them into homogeneous,
significant co-annotated groups.
In a first step, it takes co-expressed gene groups. In a second step, it takes all GO
annotations (studying each GO ontology separately) for each gene group. Then, it runs a
clustering algorithm based on Non-negative Matrix Factorization (NMF) to create a k-means
partition (beginning with k=2) of co-annotated groups within each gene group. This process is
repeated, applying the k-means algorithm (increasing the number k of clusters each time) and
building a non-nested hierarchical clustering tree. At each step, it tests for significant
co-expressed and co-annotated groups. For this purpose, it uses a one-sided hypothesis test
with the same assumptions (null hypothesis H0, hypergeometric distribution, Fisher’s exact
test, p-value and α) as Onto-Express.
3.4 Expression-based Bibliographic Approach
Nowadays, bibliographic databases represent one of the richest and most up-to-date sources of
biological information. This type of information, however, is under-exploited by researchers
because of the highly unstructured, free-format character of the published information and
because of its overwhelming volume. The main challenges in integrating bibliographic databases
are to manage interactions with textual sources (abstracts, articles, etc.) and to
resolve syntactical problems that appear in biological language, such as synonyms or ambiguities.
Some text mining methods and tools have been developed for manipulating this
kind of biological textual information. Among these methods we can mention Suiseki [43],
which focuses on the extraction and visualization of protein interactions; MedMinder [44],
which takes advantage of GeneCards (footnote 13) as a knowledge source and offers gene
information related to specific keywords; XplorMed [45], which presents specified gene
information through user interaction; EDGAR [46], which extracts information about drugs and
genes relevant to cancer from the biomedical literature; and GIS [47], which retrieves and
analyzes gene-related information from PubMed (footnote 14) abstracts. These methods are useful
as stand-alone applications, but they do not integrate gene expression profiles.
13 http://www.genecards.org
14 http://www.pubmed.gov
| Approach | Biological Source of Information | Hypothesis-testing Type and Statistics | Hypothesis-testing Distribution and Details | Distinctive Characteristic |
|---|---|---|---|---|
| FunSpec (Robinson et al. 2002) | GO, MIPS, EMBL and Pfam | One-tailed test. Fisher’s exact statistic | Hypergeometric. Bonferroni correction | Online integration of 4 different sources of biological information |
| OntoExpress (Draghici et al. 2002) | GO (MP, BP and CC) | One-tailed test. Fisher’s exact statistic. χ2 statistic | Binomial. Hypergeometric. χ2 | Choice of 3 different statistical methods |
| Quality Tool (Gibbons et al. 2002) | GO (MP, BP and CC) | One-tailed test. z-score | Normal | Measure based on cluster-attribute entropy and mutual information |
| EASE (Hosack et al. 2003) | GO, KEGG, Pfam, Smart and SwissProt | One-tailed test. Fisher’s exact statistic | Hypergeometric. EASE correction | Friendly interface for quick gene annotation |
| THEA (Pasquier et al. 2004) | GO (MP, BP and CC) | One-tailed test. Fisher’s exact statistic | Hypergeometric. Binomial. Bonferroni correction | Friendly interface for quick annotation and cluster analysis |
| Graph Theoretic Modeling (Sung 2004) | GO (MP, BP and CC) | One-tailed test. Average PD statistic | Non-parametric | Graphical method that proposes an average statistic for cluster significance |
| GENERATOR (Pehkonen et al. 2005) | GO (MP, BP and CC) | One-tailed test. Fisher’s exact statistic | Hypergeometric | Non-negative matrix factorization to create k-means partition. Results presented as a non-nested hierarchical tree |
| AnnotationTool (Masys et al. 2001) | Medline (abstracts), MeSH (keywords), UMLS | One-tailed test. Estimated likelihood vs. observed likelihood | Semi-parametric: empirical likelihood | Hierarchical groups of co-annotated groups within co-expressed clusters |

TABLE II
EXPRESSION-BASED APPROACHES
We define expression-based bibliographic approaches as methods that integrate annotations
from at least one of the bibliographic databases (Medline, Biosis, MeSH, etc.) into co-expressed
gene groups. Only a small number of approaches have integrated this kind of biological
information into co-expressed gene groups. Masys et al. [48] proposed using keyword hierarchies
to interpret gene expression patterns, thereby integrating bibliographic databases.
In a first step, their method takes as input clusters of genes grouped by similarity
in expression (previously built by any supervised or unsupervised method).
Second, it searches for the gene indexing terms contained in PubMed articles. Then, it
translates these indexing terms into MeSH (footnote 15) keyword terms. Later, it combines the
UMLS (footnote 16) knowledge, the enzyme code nomenclature and the MeSH terms to build
hierarchical groups of genes classified by annotation. Third, it selects the significant groups
of co-annotated genes in each co-expressed cluster. For this purpose, it states:
H0 : A keyword would appear at or above the observed frequency by chance
in a group of keywords of the same size within the cluster of co-expressed
genes.
This hypothesis is tested by comparing the observed versus the expected frequency
of each keyword retrieved in association with a set of genes, and by computing a p-value
estimate of the likelihood under the null hypothesis. Finally, it obtains significant
co-expressed and co-annotated groups of genes.
3.5 Comparison between several Expression-based Approaches
Table II presents a brief summary of eight expression-based approaches. The comparison
is based on four characteristics: the source of biological information, the hypothesis-testing
type and statistics, the hypothesis-testing distribution and a distinctive characteristic.
The approaches are ordered chronologically. The earliest one integrates bibliographic
sources of information, i.e. Medline abstracts, and the seven others integrate semantic sources
of information, principally GO, sometimes combined with other specific gene/protein-related
sources such as MIPS, KEGG, Pfam, Smart, etc., or molecular databases such as Embl, SwissProt,
etc.
Concerning the selection of co-expressed and co-annotated gene groups, all the approaches have
chosen a one-tailed test. FunSpec, OntoExpress, EASE, THEA and GENERATOR have opted
for Fisher’s exact statistic, and their statistical evaluation methods differ only slightly.
FunSpec, THEA, EASE and GENERATOR use the typical Fisher’s test with a hypergeometric
distribution. The first two of these apply a Bonferroni correction against the multiple-testing
problem, and EASE uses an EASE score correction against the over-representation weight that
Fisher’s test gives in bigger gene groups. Only two approaches, Graph Theoretic
Modeling and AnnotationTool, have chosen non-parametric and semi-parametric statistical
evaluation models, respectively.
The last column in Table II contains an important distinctive feature of each approach. For
example, GENERATOR uses a particular method based on k-means that builds a non-nested
hierarchical tree as its final result.
4 Co-Clustering Axis
From the beginning of gene expression technologies, clustering algorithms have focused on
grouping gene expression profiles with biological conditions [16]. Sources of biological
information, and particularly well-structured ontologies such as GO and KEGG, are constantly
growing in quantity and quality, and have opened the interpretation challenge of grouping
heterogeneous data such as numeric gene expression profiles and textual gene annotations.
Co-clustering approaches focus their effort on answering this challenge. Each co-clustering
approach has its specific parameters: biological source of information, clustering method and
integration algorithm. They generally follow the three-step methodology described below.
15 http://www.nlm.nih.gov/mesh
16 http://umlsks.nlm.nih.gov
New co-clustering integration approaches are currently one of the interpretation challenges
in gene expression technologies. So far, few co-clustering approaches have been
reported, since the principal barrier is the difficulty of building clustering methods that fit
heterogeneous sources of information. Among the co-clustering approaches we can cite Co-Cluster
[15] and Bicluster [14], described in Section 4.2 (Remarkable Co-Clustering Methods).
4.1 Co-Clustering Methodology
In a first step, co-clustering approaches define two different measures, one for gene
expression profiles and another for gene annotations, each handled independently.
In a second step, they apply an integration criterion (merging function, graphical function,
etc.) within the co-clustering algorithm to build the co-expressed and co-annotated gene
groups simultaneously.
In a third step, they select the significant co-expressed and co-annotated gene groups. In this
last step, the most recent solutions [49], [50], [51], [52], [53], [54] and [55] test the quality
of the final clusters.
4.2 Remarkable Co-Clustering Methods
1. Co-cluster: Co-clustering of biological networks and gene expression data
This approach [15] constructs a merging distance function which combines information
from gene expression data and metabolic networks, computing a joint clustering of co-expressed
genes and vertices of the network (annotations from the KEGG database).
In a first step, it computes two distances: a network distance obtained from the proximity
of enzymes in the metabolic pathway network, represented as an undirected graph, and a gene
expression distance obtained from the Pearson correlation coefficients of the expression
matrix [56].
In a second step, it builds a merging function that consists of a mapping relating genes
to enzyme nodes in the undirected graph. Then, it applies a hierarchical average linkage
clustering algorithm using the merged (enzyme-gene) distance.
Finally, it evaluates the significant co-expressed and co-annotated clusters using the
silhouette coefficient [57]. This cluster quality measure determines the optimal number of
clusters in a hierarchical dendrogram.
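Two of the ingredients named above, the Pearson-based expression distance and the silhouette coefficient, can be sketched as follows. The network distance and the published merging function are not reproduced; the function names, the `1 - r` form of the distance and the toy profiles are assumptions made for illustration.

```python
from math import sqrt


def pearson_distance(x, y):
    """Expression distance taken here as 1 - Pearson correlation of two profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)


def silhouette(dist, labels):
    """Mean silhouette coefficient; expects a distance matrix and at least two clusters."""
    n = len(labels)
    scores = []
    for i in range(n):
        own = [dist[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        if not own:  # singleton cluster: its silhouette is set to 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to the gene's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels)
            if c != labels[i]
        )
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / n


# Toy profiles: two rising genes and two falling genes over four conditions.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [8, 6, 4, 2]]
dist = [[pearson_distance(p, q) for q in profiles] for p in profiles]
good = silhouette(dist, [0, 0, 1, 1])  # rising vs. falling genes
bad = silhouette(dist, [0, 1, 0, 1])   # mixed clusters
```

Comparing the mean silhouette across candidate cuts of a dendrogram is one way to pick the number of clusters, in the spirit of the evaluation step described above.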
2. Bi-cluster: Gene Ontology friendly bi-clustering of expression profiles
This approach [58] directly incorporates Gene Ontology information into the gene
expression clustering process, using the Smart Hierarchical Tendency Preserving clustering
algorithm (SHTP). HTP is a bi-clustering algorithm capable of discovering gene expression
patterns embedded in only a subset of conditions. It becomes “Smart” when it integrates the GO
functional annotations.
In a first step, it calculates two trees: the Tendency Preserving (TP) cluster tree, obtained
from the gene expression matrix (rank measures), and the Gene Ontology tree decomposition,
obtained from the GO gene annotations.
In a second step, it builds a hierarchical structure by mapping the TP cluster tree onto the GO
hierarchy.
While the HTP clustering algorithm is applied, the GO annotations tree serves two
purposes: assessing the functional enrichment of a cluster (using a one-tailed Fisher’s test as
shown for OntoExpress) and selecting the subset of conditions critical to a function category
(building the α threshold). Finally, the subset of co-expressed genes contained in the subset of the
| Approach | Biological Source of Information | Gene Expression Profiles Measure and Gene Matrix Distance | Co-clustering Details | Co-expressed and Co-annotated Gene Group Selection Details |
|---|---|---|---|---|
| Co-Cluster (Hanisch et al. 2003) | KEGG | Fold Change. Pearson correlation distance | Hierarchical Average Linkage | Silhouette coefficient |
| GO Bi-clustering (Liu J. et al. 2004) | GO: (MP, BP, CC) and KEGG | Fold Change. Rank between conditions | SHTP: Smart Hierarchical Tendency Preserving | One-tailed Fisher’s test. α threshold construction |

TABLE III
CO-CLUSTERING INTEGRATION APPROACHES
GO annotations tree becomes the selected significant group of co-annotated and co-expressed
genes by tendency.
4.3 Comparison between Co-clustering Approaches
Table III presents a brief summary of the two co-clustering approaches explained in the
previous subsection. It is based on four parameters: source of biological information,
expression profile measure, co-clustering details, and selection details for the co-expressed
and co-annotated gene groups.
Both approaches select well-structured ontologies: the KEGG database for Co-Cluster and GO
for Bi-Cluster. These ontologies have a graph-based representation that allows the clustering
algorithm to integrate gene expression profiles with gene database annotations.
For manipulating gene expression, both methods use fold change expression
measures. Nevertheless, Co-Cluster chooses Pearson’s correlation coefficient as gene-to-gene
distance, and Bi-Cluster chooses a gene tendency measure based on the gene rank between
biological conditions.
Concerning co-clustering details, both Co-Cluster and Bi-Cluster have chosen a hierarchical
clustering method. However, Co-Cluster has opted for the typical hierarchical average linkage
algorithm, and Bi-Cluster has developed the Smart Hierarchical Tendency Preserving (SHTP)
algorithm.
Regarding gene group selection, Co-Cluster uses the silhouette coefficient to determine
the quality of the clusters built (selecting the significant ones). On the other hand,
Bi-Cluster performs the selection in two stages.
First, it uses the standard one-tailed Fisher’s test to calculate the p-value for the
co-annotated and co-expressed gene groups, and then it builds a particular α threshold for each
of them. Finally, as seen in the previous approaches, it compares the p-value against α to
decide whether to select the co-expressed and co-annotated gene group.
5 Discussion
The bioinformatics community has developed many approaches to tackle the microarray
interpretation challenge. We classify them into three different interpretation axes: prior,
standard and co-clustering. The important intrinsic characteristics of each axis have been
presented in the previous sections.
Standard or expression-based approaches give importance to gene expression profiles.
However, microarray history has revealed intrinsic errors in microarray measures and
protocols that increase during the whole microarray analysis process. Thus, the expression-based
interpretation results can be severely biased [13], [14].
On the other hand, prior or knowledge-based approaches give importance to biological
knowledge. Nevertheless, all sources of biological information impose many integration
constraints: for instance, the database format or structure, the small number of annotated
genes, or the availability of up-to-date and well-revised annotations. Consequently, the
knowledge-based interpretation results can be poor, or quite limited in relation to the
whole biological process studied.
Co-clustering approaches represent the best compromise in terms of integration, in principle
giving the same weight to expression profiles and biological knowledge. However, they have to
deal with the algorithmic issue of integrating these two elements at the same time, and they
are often forced to give more weight to one of them. In the previous section, we have
seen two examples: the co-cluster algorithm gives more weight to knowledge, and expression
profiles are used to guide the clustering analysis, while the bi-cluster algorithm gives more
weight to tendency in expression profiles, and GO annotations are used to guide the clustering
analysis.
Indeed, improvements in microarray data quality and in the microarray analysis process,
together with the completion of biological information sources, should make the interpretation
results less dependent on the interpretation axis.
As long as these main elements are not reliable enough, the choice of the
interpretation approach remains of crucial importance for the final interpretation results.
References
[1] T. Attwood and C. J. Miller, “Which craft is best in bioinformatics?” Computer Chemistry,
vol. 25, pp. 329–339, 2001.
[2] A. Zhang, Advanced analysis of gene expression microarray data, 1st ed., ser. Science,
Engineering, and Biology Informatics. World Scientific, 2006, vol. 1.
[3] R. Cho, M. Campbell, and E. Winzeler, “A genome-wide transcriptional analysis of the
mitotic cell cycle,” Molecular Cell, vol. 2, pp. 65–73, 1998.
[4] J. DeRisi, L. Iyer, and V. Brown, “Exploring the metabolic and genetic control of gene
expression on a genomic scale,” Science, vol. 278, pp. 680–686, 1997.
[5] M. Eisen, P. Spellman, P. Brown, and D. Botstein, “Cluster analysis and display of
genome wide expression patterns,” in Proceedings of the National Academy of Sciences
of the USA, vol. 95, no. 25, 1998, pp. 14863–8.
[6] P. Tamayo and D. Slonim, “Interpreting patterns of gene expression with self-organizing
maps: Methods and application to hematopoietic differentiation,” in Proceedings of the
National Academy of Sciences of the USA, vol. 96, 1999, pp. 2907–2912.
[7] S. Tavazoie, J. Hughes, M. Campbell, R. Cho, and G. Church, “Systematic determination
of genetic network architecture,” Nature Genetics, vol. 22, pp. 281–285, 1999.
[8] A. Ben-Dor, R. Shamir, and Z. Yakhini, “Clustering gene expression patterns,”
Computational Biology, vol. 6, pp. 281–297, 1999.
[9] C. Blaschke, L. Hirschman, and A. Valencia, “Co-clustering of biological networks
and gene expression data,” Bioinformatics, vol. 18, pp. S145–S154, 2002.
[10] H. Muller, E. Kenny, and P. Sternberg, “Textpresso: An ontology-based information
retrieval and extraction system for biological literature,” PLoS Biology, vol. 2, no. 11,
p. 309, 2004.
[11] G. Churchill, “Fundamentals of experimental design for cDNA microarrays,” Nature
Genetics, vol. 32, pp. 490–495, 2002.
[12] S. G. Lee, J. U. Hur, and Y. S. Kim, “A graph theoretic modeling on GO space for
biological interpretation of gene clusters,” Bioinformatics, vol. 3, pp. 381–386, 2004.
[13] Z. Fang, J. Yang, Y. Li, Q. Luo, and L. Liu, “Knowledge guided analysis of microarray
data,” Biomedical Informatics, vol. 10, pp. 1–11, 2005.
[14] J. Liu, J. Yang, and W. Wang, “Biclustering in gene expression data by tendency,” in
Computational Systems Bioinformatics Conference, CSB 2004 Proceedings, 2004, pp.
182–193.
[15] D. Hanisch, A. Zien, R. Zimmer, and T. Lengauer, “Co-clustering of biological net-
works and gene expression data,” Bioinformatics, vol. 18, pp. S145–S154, 2002.
[16] A. Riva, A. Carpentier, B. Torresani, and A. Henaut, “Comments on selected funda-
mental aspects of microarray analysis,” Computational Biology and Chemistry, vol. 29,
pp. 319–336, 2005.
[17] V. Mootha, C. Lindgren, K. Eriksson, and A. Subramanian, “PGC-1α-responsive
genes involved in oxidative phosphorylation are coordinately downregulated in human
diabetes,” Nature Genetics, vol. 34, no. 3, pp. 267–273, 2003.
[18] R. Breitling, A. Amtmann, and P. Herzyk, “IGA: A simple tool to enhance sensitivity
and facilitate interpretation of microarray experiments,” BMC Bioinformatics, vol. 5,
no. 34, 2004.
[19] S. Kim and D. Volsky, “PAGE: Parametric analysis of gene set enrichment,” BMC
Bioinformatics, vol. 6, p. 144, 2005.
[20] R. Martinez, N. Pasquier, C. Pasquier, M. Collard, and L. Lopez-Perez, “Co-expressed
gene groups analysis (CGGA): An automatic tool for the interpretation of microarray
experiments,” Journal of Integrative Bioinformatics, vol. 3, no. 11, pp. 1–12, 2006.
[21] W. Feller, An Introduction to Probability Theory and Its Applications, 3rd ed. Wiley
and sons, 1971.
[22] L. Kaufman and P. Rousseeuw, Finding Groups in Data: An Introduction to Cluster
Analysis. New York, USA: Wiley and Sons, 1990.
[23] J. Banfield and A. Raftery, “Model-based Gaussian and non-Gaussian clustering,”
Biometrics, vol. 49, pp. 803–822, 1993.
[24] K. Cios, W. Pedrycz, and R. Swiniarski, Data Mining Methods for Knowledge Discov-
ery. Boston/London: Kluwer Academic Publishers, 1998.
[25] M. Robinson, “FunSpec: a web-based cluster interpreter for yeast,” BMC
Bioinformatics, vol. 3, p. 35, 2002.
[26] S. Draghici and P. Khatri, “Global functional profiling of gene expression,” Genomics,
no. 81, pp. 1–7, 2003.
[27] D. Gibbons and F. Roth, “Judging the quality of gene expression-based clustering meth-
ods using gene annotation,” Genome Research, vol. 12, pp. 1574–1581, 2002.
[28] D. Hosack and G. Dennis, “Identifying biological themes within lists of genes with
EASE,” Genome Biology, vol. 4, no. 70, 2003.
[29] C. Pasquier, F. Girardot, K. Jevardat, and R. Christen, “THEA: Ontology-driven analysis
of microarray data,” Bioinformatics, vol. 20, no. 16, 2004.
[30] P. Pehkonen, G. Wong, and P. Toronen, “Theme discovery from gene lists for identifi-
cation and viewing of multiple functional groups,” BMC Bioinformatics, vol. 6, p. 162,
2005.
[31] W. Feng, G. Wang, B. Zeeberg, K. Guo, A. Fojo, D. Kane, W. Reinhold, S. Lababidi,
J. Weinstein, and M. Wang, “Development of gene ontology tool for biological inter-
pretation of genomic and proteomic data,” in AMIA Annual Symposium Proceedings,
2003, p. 839.
[32] F. Al-Shahrour, R. Diaz-Uriarte, and J. Dopazo, “FatiGO: a web tool for finding
significant associations of Gene Ontology terms with groups of genes,” Bioinformatics, vol. 20,
no. 4, pp. 578–580, 2004.
[33] T. Beissbarth and T. Speed, “GOstat: find statistically overrepresented Gene Ontologies
within a group of genes,” Bioinformatics, vol. 20, no. 9, pp. 1464–1465, 2004.
[34] D. Martin, C. Brun, E. Remy, P. Mouren, D. Thieffry, and B. Jacq, “GOToolBox:
functional analysis of gene datasets based on Gene Ontology,” Genome Biology, vol. 5, no. 12,
2004.
[35] M. Masseroli, D. Martucci, and F. Pinciroli, “GFINDER: Genome function integrated
discoverer through dynamic annotation, statistical analysis, and mining,” Nucleic Acids
Research, vol. 32, pp. 293–300, 2004.
[36] N. Shah and N. Fedoroff, “CLENCH: a program for calculating cluster enrichment using
the Gene Ontology,” Bioinformatics, vol. 20, pp. 1196–1197, 2004.
[37] S. Maere, K. Heymans, and M. Kuiper, “BiNGO: a Cytoscape plugin to assess
overrepresentation of Gene Ontology categories in biological networks,” Bioinformatics, vol. 21,
pp. 3448–3449, 2005.
[38] R. Fisher, “On the interpretation of χ² from contingency tables, and the calculation of
P,” Journal of the Royal Statistical Society, vol. 85, no. 1, pp. 87–94, 1922.
[39] K. Kerr and G. Churchill, “Statistical design and the analysis of gene expression
microarray data,” Genetics Research, vol. 77, pp. 123–128, 2001.
[40] M. Man, Z. Wang, and Y. Wang, “Power SAGE: comparing statistical tests for SAGE
experiments,” Bioinformatics, vol. 16, no. 11, pp. 953–959, 2000.
[41] L. Fisher and G. Van Belle, Biostatistics: a methodology for health sciences. New
York, USA: Wiley and Sons, 1993.
[42] T. Cover and J. Thomas, Elements of information theory. New York, USA: Wiley-
Interscience, 1991.
[43] C. Blaschke, R. Hoffmann, A. Valencia, and J. Oliveros, “Extracting information au-
tomatically from biological literature,” Comparative and Functional Genomics, vol. 2,
no. 5, pp. 310–313, 2001.
[44] L. Tanabe, U. Scherf, L. Smith, J. Lee, L. Hunter, and J. Weinstein, “MedMiner: an
Internet text-mining tool for biomedical information, with application to gene expression
profiling,” Biotechniques, vol. 27, no. 6, pp. 1210–1217, 1999.
[45] C. Perez-Iratxeta, P. Bork, and M. Andrade, “Exploring MEDLINE abstracts with
XplorMed,” Drugs Today, vol. 38, no. 6, pp. 381–389, 2002.
[46] T. Rindflesch, L. Tanabe, J. Weinstein, and L. Hunter, “EDGAR: extraction of drugs, genes
and relations from the biomedical literature,” in Proceedings of the Pacific Symposium
on Biocomputing, 2000, pp. 517–528.
[47] J. Chiang, H. Yu, and H. Hsu, “GIS: a biomedical text-mining system for gene
information discovery,” Bioinformatics, vol. 20, no. 1, pp. 120–121, 2004.
[48] D. Masys, “Use of keyword hierarchies to interpret gene expression patterns,”
Bioinformatics, vol. 17, pp. 319–326, 2001.
[49] K. Zhang and H. Zhao, “Assessing reliability of gene clusters from gene expression
data,” Functional Integrative Genomics, pp. 156–173, 2000.
[50] M. Halkidi, Y. Batistakis, and M. Vazirgiannis, “On clustering validation techniques,”
Intelligent Information Systems, vol. 17, pp. 107–145, 2001.
[51] F. Azuaje, “A cluster validity framework for genome expression data,”
Bioinformatics, vol. 18, pp. 319–320, 2002.
[52] A. Ben-Hur, A. Elisseeff, and I. Guyon, “A stability based method for discovering struc-
ture in clustered data,” in Pacific Symposium on Biocomputing, vol. 7, 2002, pp. 6–17.
[53] S. Datta and S. Datta, “Comparisons and validation of clustering techniques for mi-
croarray gene expression data,” Bioinformatics, vol. 4, pp. 459–466, 2003.
[54] C. Giurcaneanu, I. Tabus, I. Shmulevich, and W. Zhang, “Stability-based cluster analy-
sis applied to microarray data,” in Proceedings of the Seventh International Symposium
on Signal Processing and its Applications, 2003, pp. 57–60.
[55] M. Smolkin and D. Ghosh, “Cluster stability scores for microarray data in cancer stud-
ies,” BMC Bioinformatics, vol. 4, no. 36, 2003.
[56] D. Elihu, P. Nechama, and L. Menachem, “Mercury exposure and effects at a ther-
mometer factory,” Scandinavian Journal of Work Environmental Health, vol. 8, no. 1,
pp. 161–166, 2004.
[57] P. Rousseeuw, “Silhouettes: a graphical aid to the interpretation and validation of cluster
analysis,” Computational and Applied Mathematics, vol. 20, pp. 53–65, 1987.
[58] J. Liu, J. Yang, and W. Wang, “Gene ontology friendly biclustering of expression pro-
files,” in Computational Systems Bioinformatics Conference, CSB 2004 Proceedings,
2004, pp. 436–447.