
Future Generation Computer Systems

Contents lists available at ScienceDirect

Future Generation Computer Systems

journal homepage: www.elsevier.com/locate/fgcs

SLA enactment for large-scale healthcare workflows on multi-Cloud

Foued Jrad a,∗, Jie Tao a, Ivona Brandic b, Achim Streit a

a Steinbuch Centre for Computing SCC, Karlsruhe Institute of Technology KIT, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
b Information Systems Institute, Vienna University of Technology, Argentinierstrasse 8/184-1, 1040 Vienna, Austria

Highlights

• An ontological model to semantically describe composite multi-Cloud services.
• A mathematical formulation of the SLA-based match-making problem on multi-Cloud.
• A utility-based genetic algorithm to optimize the selection of Cloud resources.
• A simulation-based evaluation with a real DNA sequencing healthcare workflow.
• The proposed matching algorithm reduces execution costs while fulfilling the SLAs.

Article info

Article history:
Received 1 March 2014
Received in revised form 15 July 2014
Accepted 23 July 2014
Available online xxxx

Keywords:
DNA sequencing
Scientific workflow
Cloud computing
Match-making
Genetic algorithm

Abstract

Computing Clouds offer a new way of using IT facilities including the hardware, storage, applications and networks. The huge resource pool on the Cloud forms an appropriate platform for running applications that are both compute and data intensive, like the DNA sequencing workflows. This paper studies the topic of running scientific workflows on multiple Clouds, with the DNA sequencing workflow as the driving application. We focus on the problem of matching the workflow functional and non-functional Service Level Agreement (SLA) requirements to the compute and storage services provisioned by underlying Clouds with different service price and quality. We designed an ontological model for a semantic description of the problem and developed a novel utility-based genetic matching algorithm for selecting the Cloud services with respect to the user requirements and the properties of the Clouds. We validated the approach by comparing the performance of the proposed algorithm with other matching algorithms in executing the DNA sequencing application on a realistic simulation platform. The results show the effectiveness of our approach in reducing the total costs and fulfilling the requested service quality even with large-scale service compositions.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Cloud computing was introduced as a novel computing paradigm that offers computing facilities as a service. A specific feature of Cloud computing, in comparison with other computing paradigms and e-science infrastructures [1], is that it allows the provision of on-demand, reliable resources and customized computing environments in a pay-as-you-go manner [2–4]. The basis for such elasticity, reliability and customization is a resource pool, often virtualized, that contains both a large number of servers and a high storage capacity. Therefore, an increasing number of applications [5,6] are moved to the Clouds to benefit from the rich resource set and the specific features of the Cloud workflows [7,8]. This includes applications in healthcare, like DNA sequencing. An example is the company DNAnexus,¹ which uses the Amazon AWS² Cloud services to offer genomic analysis in the Cloud to hospitals and researchers.

∗ Corresponding author. Tel.: +49 1791103568.
E-mail addresses: [email protected] (F. Jrad), [email protected] (J. Tao), [email protected] (I. Brandic), [email protected] (A. Streit).

A major reason for running the DNA sequencing workflows on the Cloud is to exploit the Cloud's ability to scale in and scale out [9]. Depending on the number of sequences, the analysis work can take several days. For a fast diagnosis it is necessary to involve more computing capacity for larger data sets and to reduce the number of resources when the data set is small. In this case, the Cloud is an ideal choice for this kind of application. In addition to the computing facilities offered by Clouds, the scalable storage systems and analysis tools offered by Clouds also permit us to realize the vision of “data-driven medicine” [10]. It may be possible to run the DNA sequencing workflows on a single Cloud [11]; however, it is better or even necessary to use multiple Clouds (multi-Cloud) because the resources on one Cloud can be insufficient, or there could be a resource limit imposed on customers. In contrast to a single Cloud, in multi-Cloud a user is aware of various Clouds, and usually a third-party entity called a broker is needed to deal with the complexity of the service provisioning phase.

¹ https://www.dnanexus.com.
² http://aws.amazon.com.

http://dx.doi.org/10.1016/j.future.2014.07.005
0167-739X/© 2014 Elsevier B.V. All rights reserved.

Using multiple Clouds for workflows raises the topic of service composition [12], where the services provided by different Clouds are composed for a single goal, here, performing the tasks of a workflow. Correspondingly, we call such a multi-Cloud service a composite service. Research on composite Cloud services covers not only the selection of individual services but also, and this is more challenging, the composition effect of the services. Cloud selection for a workflow is a complex matching problem, where both the application requirements on the infrastructure and service quality, and the properties of the Cloud providers, such as service price, availability, and response time, have to be taken into account. With these parameters and additionally the composition effect, it is not easy to decide which Clouds to use so that the workflow runs within the specified cost while its quality is still guaranteed.

Today, existing workflow deployment approaches on multi-Cloud are typically restricted to optimizing the task allocation to provisioned resources owned by predefined Clouds, with the goal to minimize data transfer, execution time and cost [13]. These approaches lack support for SLA-aware Cloud selection, which can also heavily impact the performance and cost of the deployment.

Motivated by the above considerations, we investigate in this paper the matching of composite services on multi-Cloud, taking DNA sequencing workflows as an example. To tackle this problem, we propose a broker-based generic approach, which takes as input an ontological model describing the composite service requirements and provider offerings. The core of our approach is a match-making algorithm called Hybrid Utility-based Genetic Algorithm (HU-GA), which allows the selection of compute and storage Cloud resources with respect to the user requirements in terms of Quality of Service (QoS) and costs. The functionality of the algorithm is validated with a real DNA sequencing workflow-based application using a simulation environment implemented on top of the CloudSim [14] Cloud simulator. The experimental results show that HU-GA offers benefits to users in terms of performance and cost compared to other matching policies. Overall, this work makes the following contributions:

1. An ontological model to semantically describe the composite service requirements and provider offerings.
2. A mathematical formulation of the SLA-based match-making problem on multi-Cloud.
3. An efficient utility-based genetic algorithm to match Cloud resources with respect to user SLA requirements.
4. An extensive simulation-based evaluation with a real DNA sequencing healthcare application.

The remainder of the paper is organized as follows: Section 2 presents the related work on composite service selection on multi-Cloud. Section 3 describes our broker-based matching approach and the ontological model used. Section 4 describes the functionality of the HU-GA algorithm implemented in this work. Section 5 presents the simulation environment and the evaluation results gathered from the simulation experiments. Finally, Section 6 concludes the paper and outlines future work.

2. Related work

Several approaches have been proposed to solve QoS-based web service composition problems in the context of Service Oriented Architectures (SOA). Most of these approaches, such as [15], are based on linear programming methods, which are not applicable to large-scale service composition problems as found in Cloud computing. Besides, these approaches do not provide declarative representations of service offers and requests as required in a Cloud scenario.

There is preliminary work in the field of QoS-based service selection in Cloud environments. A broad overview of the commonly used selection methodologies with some related research issues is given in [16]. In this section we discuss the main research works dealing with composite service selection in the Cloud.

In the context of the SMICloud project, Garg et al. [17] used Analytical Hierarchy Process (AHP) decision-making methods to rank candidate Cloud service offerings. Although they showed the cost and QoS matching effectiveness of the AHP-based ranking, the project is still in its initial phase and the evaluation is done only with simple Cloud services. Juan-Verdejo and Baars [18] proposed a framework, also based on AHP, to support the partial migration of business intelligence applications to Cloud infrastructures by taking into account business and economic considerations. However, their ongoing work still lacks an evaluation. Menzel and Ranjan [19] presented a framework called CloudGenius to select the best combination of a virtual machine (VM) image and a Cloud infrastructure service to support Web server migrations to the Cloud. Their evaluation results with Amazon-based services showed the time complexity of the algorithm but not yet the impact of the matching on the QoS. Although AHP-based selection methods are effective in the ranking of Cloud services characterized by conflicting QoS parameters, they require the user to provide an accurate subjective weighting scheme for each requested QoS parameter, which heavily influences the matching results. Hence, these methods can only perform well when the number of alternatives is small and the number of objectives is limited [20].

Zhang et al. [21] proposed a declarative decision support system called CloudRecommender for the automatic selection of infrastructure Cloud service configurations using transactional SQL semantics. For this purpose, they introduced an extensible ontology to describe the functionalities and QoS parameters of IaaS offers. Their current prototype implementation allows a selection based on previously stored service information and does not support a dynamic selection based on QoS information, like latency and resource utilization.

In [22] the authors showed the effectiveness of genetic algorithms compared to linear programming methods in optimizing the composition of Cloud services. In contrast to this work, they used the simple additive weighting (SAW) of four QoS parameters, namely response time, price, availability, and reputation, as the fitness function for the genetic algorithm. Besides, they focused on applications where the order of execution of each component service is important. Such applications, like business workflows, are outside the focus of this work.

Dastjerdi et al. [20] proposed an approach that selects the Cloud VM images and Cloud infrastructure services for a network of virtual appliances across multiple Clouds. In their work, the selection is performed after an ontology-based discovery using a genetic-based matching algorithm with the goal to optimize latency, reliability, and deployment cost. Their evaluation showed the cost-effectiveness of the genetic algorithm compared to their introduced Forward-Checking Based Backtracking (FCBB) algorithm, but not yet the QoS benefits from the matching.

Table 1 compares the previously discussed research works with our work in terms of their matching methods, their use of ontologies, and their application scope. As can be seen from the table, a unique feature of this work is the use of a hybrid utility-based genetic algorithm (HU-GA), where a quasi-linear utility function is used as the objective function to optimize the composite services selection. The utility function is adopted from economic theory and has already been used in [23] for the benchmarking of simple Cloud services. In previous work [24] we evaluated its effectiveness in the matching of simple Cloud services. Utility-based genetic algorithms are widely used in finance to solve portfolio optimization problems [25]. However, this work is the first attempt to use them in the selection of composite Cloud services. In contrast to the other approaches, our matching scheme supports dynamic information in the matching, such as the Cloud-to-Cloud latency and the resource load on the Clouds. We evaluated our proposed HU-GA matching algorithm using a broker-based simulation framework to automate the deployment of a real scientific workflow and showed its cost and QoS effectiveness in addition to its time complexity.

Table 1
A comparison of existing composite Cloud service selection frameworks with this work.

| Criteria | CloudGenius [19] | CloudRecommender [21] | Dastjerdi et al. [20] | Proposed approach |
|---|---|---|---|---|
| Matching methods | AHP | SQL semantics | FCBB/GA + cost | Sieving/GA + utility |
| Ontology-based | No | Yes | Yes | Yes |
| Evaluation criteria | Time complexity | Cost + time complexity | Cost + time complexity | Cost + QoS + time complexity |
| SLA metrics/pricing | Synthetic | Real | Real | Real |
| Application scope | Multi-tier | Multi-tier | Multi-tier | Scientific workflows |
| Framework | CumulusGenius | CloudRecommender | CloudPick | Cloud Service Broker |
| Dynamic matching | No | No | No | Yes |

Fig. 1. Multi-Cloud brokering life cycle.

3. Brokering of multi-Cloud composite services

In order to assist users in the task of selecting and deploying their composite services on multi-Cloud environments, we use an IaaS (Infrastructure as a Service) broker. The latter acts as a mediator between users and IaaS Cloud providers. Its main task is to match the SLA requirements of users to the resources needed to provision the composite service. In the following subsections, we describe the brokering steps and the ontologies used to describe the IaaS provider offerings and composite service requests.

3.1. Brokering lifecycle

As depicted in Fig. 1, the service brokering lifecycle consists of the following steps:

Step 1. Request formulation: The user defines at design time the functional and non-functional SLA requirements for the requested composite Cloud service.
Step 2. Discovery and monitoring: The broker discovers the candidate service offers and stores their monitored SLA metrics and pricing information in different data repositories.
Step 3. Match-making: The broker selects the suitable Cloud providers for the user using different match-making policies.
Step 4. Deployment: The broker deploys the service components on the selected providers.
Step 5. Execution: The composite service is executed and its status is continually monitored at runtime.
Step 6. Termination: The service can be terminated on user request or by the broker (e.g. in case of repeated SLA violations).

3.2. Ontological model

In order to facilitate the semantic storage of data collected from the above described request formulation and discovery and monitoring steps, we make use of two ontologies. These provide the domain-specific model and vocabulary to semantically describe IaaS service offerings and the composite service requirements. Some of the terminologies used in these ontologies are taken from the Open Cloud Computing Interface (OCCI³) specification, which defines a standard API to access and manage heterogeneous IaaS Clouds. In the following we present in detail the two designed ontologies.

3.2.1. IaaS composite service requester ontology

The IaaS composite service requester ontology, as depicted in Fig. 2, captures the user requirements, which are defined as functional properties (e.g., VM type, number of VM instances, and storage size) and non-functional properties (e.g. budget, latency and availability), which represent the QoS requirements. As can be seen from the figure, the functional requirements are specific to each component IaaS service, whereas the QoS requirements are global for the composition. This ontology also holds the current status of the composite service throughout all the SLA management steps. In addition, each requested service has a unique ID and is associated with a specific customer type.

3.2.2. IaaS Cloud provider ontology

The IaaS provider ontology, as depicted in Fig. 3, provides an abstract model for describing the provider service offerings, their QoS metrics and pricing policies. In the current model, we concentrate on the modeling of IaaS offers including storage and computation. However, the model can be easily extended to support more Cloud delivery models or QoS metrics. Each provider is represented in the ontology with a unique name, ID, and a geographical location. In addition to the pricing policies for the provided computing and storage resources, the provider ontology also captures the network traffic cost charged by each Cloud provider.
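The two ontologies can be pictured as plain data structures. The following is a minimal sketch, assuming Python dataclasses as a stand-in for the ontology classes; all class and field names (`ComponentRequest`, `CompositeRequest`, `ProviderOffer`, and so on) are hypothetical and chosen only to mirror the properties named in the text, not taken from the actual ontology files.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class ServiceKind(Enum):
    COMPUTE = "compute"
    STORAGE = "storage"


@dataclass
class ComponentRequest:
    """Functional requirements, specific to one component IaaS service."""
    service_id: str
    kind: ServiceKind
    vm_type: Optional[str] = None   # compute services only
    instances: int = 1              # compute services only
    storage_gb: float = 0.0         # storage services only


@dataclass
class CompositeRequest:
    """Requester ontology: QoS requirements are global for the composition."""
    request_id: str
    customer_type: str
    status: str                     # current step of the SLA management
    components: List[ComponentRequest]
    qos: Dict[str, float]           # e.g. budget, latency, availability


@dataclass
class ProviderOffer:
    """Provider ontology: identity, measured QoS metrics and pricing."""
    provider_id: str
    name: str
    location: str
    qos_metrics: Dict[str, float]
    price_vm: Dict[str, float]      # price per VM type (assumed granularity)
    price_storage: float
    price_traffic: float            # network traffic cost


request = CompositeRequest(
    request_id="r1",
    customer_type="gold",
    status="requested",
    components=[
        ComponentRequest("vm1", ServiceKind.COMPUTE, vm_type="small", instances=2),
        ComponentRequest("st1", ServiceKind.STORAGE, storage_gb=100.0),
    ],
    qos={"availability": 0.99, "latency": 50.0},
)
```

Keeping the QoS dictionary on the composite request rather than on each component reflects the statement that non-functional requirements apply to the composition as a whole.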

4. Matching of multi-Cloud composite services

In this section the match-making problem is formally defined and then the implemented matching policies are described in detail.

³ http://www.occi-wg.org.


Fig. 2. Composite IaaS service requester ontology.

Fig. 3. IaaS Cloud provider ontology.

4.1. Problem formulation

Each requested multi-Cloud composite service can be modeled as a fully connected undirected graph $G(S, E)$ called Intercloud graph, where each node represents a single compute or storage Cloud service and the edges show the network connectivity between the nodes. An example of an Intercloud graph with three requested services (two VMs and one storage) with a possible concrete deployment is presented in Fig. 4.
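As a small illustration of the Intercloud graph, the sketch below builds the edge set of a fully connected undirected graph over the requested services. The function name `intercloud_edges` and the service labels are hypothetical; the only fact used is that a fully connected undirected graph with n nodes has n(n-1)/2 edges.

```python
from itertools import combinations


def intercloud_edges(services):
    """Return all undirected edges of the fully connected graph over `services`."""
    # Each unordered pair of distinct services is one edge.
    return [frozenset(pair) for pair in combinations(services, 2)]


# Three requested services as in the example: two VMs and one storage.
services = ["vm1", "vm2", "st1"]
edges = intercloud_edges(services)
n = len(services)
edge_count = len(edges)  # equals n * (n - 1) // 2 for a fully connected graph
```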

The match-making problem on multi-Cloud consists of finding the service composition that minimizes the deployment cost and best satisfies all user QoS requirements. In order to formulate this NP-hard multi-objective optimization problem, we introduced a mathematical model. The latter is used to model composite service requests, Cloud providers, and candidate service compositions, which constitute the input parameters for the matching algorithms. The description of the model parameters is provided in Table 2.

Fig. 4. Example of an Intercloud graph for a composite multi-Cloud service.

Table 2
Multi-Cloud match-making model.

| Group | Parameter | Description |
|---|---|---|
| Request | $VM = \{vm_1, \ldots, vm_a\}$ | Set of $a$ requested compute services |
| Request | $ST = \{st_1, \ldots, st_b\}$ | Set of $b$ requested storage services |
| Request | $S = \{s_1, \ldots, s_n\}$, $S = VM \cup ST$ | Set of $n$ component services forming the composite request |
| Request | $E = \{e_{s_i s_j} \mid s_i, s_j \in S, i \neq j\}$ | Set of all $f$ edges connecting the $n$ services |
| Request | $L = \{lat(e_{s_i s_j}) \mid s_i, s_j \in S\}$ | Set of edge latencies |
| Request | $Q_r = \{q_1(r), \ldots, q_m(r)\}$ | Set of $m$ required QoS values |
| Request | $C_r = \{c_{vm}(r), c_{st}(r), c_{tr}(r)\}$ | Max payment for compute, storage and traffic |
| Request | $D = \{d_{st}, d_{tr}\}$ | Required storage and data traffic size |
| Request | $T$ | Provisioning (lease) time |
| Provider | $P = \{p_1, \ldots, p_l\}$ | Set of $l$ candidate IaaS providers |
| Provider | $Q_{p_j} = \{q_1(p_j), \ldots, q_m(p_j)\}$ | Measured QoS metrics for provider $p_j$ |
| Provider | $C_{p_j} = \{c_{vm}(p_j), c_{st}(p_j), c_{tr}(p_j)\}$ | Pricing policies for provider $p_j$ |
| Candidate | $X = \{x = s_i \to p_j \mid s_i \in S, p_j \in P\}$ | Set of possible service compositions $x$ |
| Candidate | $Q_x = \{q_1(x), \ldots, q_m(x)\}$ | Set of QoS values for service composition $x$ |
| Candidate | $C_x = \{c_{vm}(x), c_{st}(x), c_{tr}(x)\}$ | Total usage cost for composition $x$ |

In addition to this model, as we do not know the exact amount of data to be transferred at runtime between the requested services, we made the following assumptions on data traffic for our application needs:

Assum 1. Users are charged only for data traffic from Cloud to the Internet.
Assum 2. The data transfer inside the same provider is free of charge.
Assum 3. All the requested storage services will store the same amount of data (data is replicated).
Assum 4. The data transfer between two connected nodes is bidirectional.
Assum 5. The requested amount of data traffic is equally distributed between the connected Clouds.

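Based on our mathematical model, the match-making problem can be formulated as follows:

$$
\begin{aligned}
&\max\; th(x),\, av(x) && \text{with } th(x) \ge th(r) \wedge av(x) \ge av(r) \\
&\min\; rt(x),\, lat(x) && \text{with } rt(x) \le rt(r) \wedge lat(x) \le lat(r) \\
&\min\; C_x(T) && \text{with } C_x(T) \le C_r(T)
\end{aligned}
$$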

where $Q_x = \{rt(x), th(x), av(x), lat(x)\}$ denotes the set of aggregated SLA values of response time, throughput, availability, and latency, respectively, for the composite service $x$, and $Q_r = \{rt(r), th(r), av(r), lat(r)\}$ is the set of their minimum or maximum acceptable values. The former is calculated from the corresponding values of the component services by applying the aggregation functions used in [15], as presented in Table 3. $C_x(T)$ denotes the total predicted costs (compute, storage and data transfer cost) for using the composite service $x$ during the time period $T$. Based on the above assumptions, $C_x(T)$ can be computed as follows:

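$$
C_x(T) = T \cdot \left( \sum_{i=1}^{a} c_{vm}(vm_i) + d_{st} \cdot \sum_{i=1}^{b} c_{st}(st_i) + \sum_{\substack{s_k, s_l \in S,\; k \neq l \\ e \in E}} \frac{d_{tr} \cdot c_{tr}(e_{s_k s_l})}{2 \cdot f} \right) \tag{1}
$$

with:

$$
c_{tr}(e_{s_k s_l})_{s_k, s_l \to p_m, p_n} =
\begin{cases}
c_{tr}(p_m) + c_{tr}(p_n) & \text{if } m \neq n \\
0 & \text{else}
\end{cases} \tag{2}
$$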

where $p_m, p_n \in P$ denote the Cloud providers allocated respectively to the requested services $s_k, s_l \in S$. $C_r(T)$ denotes the total user budget for the period of usage $T$. It is calculated as follows:

$$
C_r(T) = T \cdot \left( a \cdot c_{vm}(r) + b \cdot d_{st} \cdot c_{st}(r) + d_{tr} \cdot c_{tr}(r) \right). \tag{3}
$$
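To make Eqs. (1)–(3) concrete, the following is a minimal sketch of the cost and budget computation. It assumes that the traffic sum in Eq. (1) visits each undirected edge once, that each provider has a single compute price regardless of VM type, and that the lease time multiplies all three cost terms; the function and parameter names (`composition_cost`, `placement`, and so on) and the numeric values are hypothetical.

```python
from itertools import combinations


def edge_traffic_price(p_m, p_n, c_tr):
    """Eq. (2): traffic price on an edge whose endpoints run on p_m and p_n."""
    if p_m == p_n:
        return 0.0  # Assum 2: transfer inside the same provider is free
    return c_tr[p_m] + c_tr[p_n]  # both Clouds charge their outgoing traffic


def composition_cost(T, vms, storages, placement, c_vm, c_st, c_tr, d_st, d_tr):
    """Eq. (1): predicted cost of a composition over the lease time T."""
    edges = list(combinations(vms + storages, 2))  # fully connected graph
    f = len(edges)
    compute = sum(c_vm[placement[vm]] for vm in vms)
    storage = d_st * sum(c_st[placement[st]] for st in storages)
    # Assum 4 and 5: traffic is split equally over the f edges and both directions.
    traffic = sum(
        d_tr * edge_traffic_price(placement[k], placement[l], c_tr) / (2 * f)
        for k, l in edges
    )
    return T * (compute + storage + traffic)


def user_budget(T, a, b, c_vm_r, c_st_r, c_tr_r, d_st, d_tr):
    """Eq. (3): total user budget for the period of usage T."""
    return T * (a * c_vm_r + b * d_st * c_st_r + d_tr * c_tr_r)


# Hypothetical example: two VMs on provider A, one storage on provider B.
cost = composition_cost(
    T=10, vms=["vm1", "vm2"], storages=["st1"],
    placement={"vm1": "A", "vm2": "A", "st1": "B"},
    c_vm={"A": 0.1, "B": 0.2}, c_st={"A": 0.01, "B": 0.02},
    c_tr={"A": 0.05, "B": 0.1}, d_st=100, d_tr=60,
)
budget = user_budget(T=10, a=2, b=1, c_vm_r=0.15, c_st_r=0.03, c_tr_r=0.1,
                     d_st=100, d_tr=60)
feasible = cost <= budget  # cost constraint C_x(T) <= C_r(T)
```

In this example only the two VM-to-storage edges cross provider boundaries, so the edge between the two co-located VMs contributes no traffic cost.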

4.2. Sieving matching algorithm

For a comparative study with our HU-GA algorithm, we implemented a simple matching policy called Sieving. Given a service list forming the requested multi-Cloud service composition and a list of candidate providers, the Sieving matching algorithm iterates through the service list and randomly selects for each service a candidate IaaS Cloud that satisfies all functional and non-functional SLA requirements. Therefore, for each selected provider the measured SLA metrics and the usage price must be within the ranges specified by the user in the SLA requirements and the budget, respectively. In addition, the algorithm checks whether the current datacenter capacity load allows the deployment of the requested service type. However, the result set may be empty if none of the Cloud providers fulfills the requested criteria.

The following pseudo-code describes in detail the functionality of the Sieving algorithm using the mathematical model from Table 2:

Listing 1: Sieving Algorithm

    Input: S, P, cvm(r), cst(r), av(r), th(r), rt(r)
    For each s in S Do
        Clear CandidatesList
        For each p in P Do
            If s is Deployable in p Then
                If cvm(p) <= cvm(r) & cst(p) <= cst(r) &
                   av(p) >= av(r) & rt(p) <= rt(r) & th(p) >= th(r) Then
                    add p to CandidatesList
                Endif
            Endif
        Endfor
        If sizeof(CandidatesList) > 0 Then
            Choose random p from CandidatesList
            CloudCompositionMap.add(s, p)
        Else return null
        Endif
    Endfor
    Output: CloudCompositionMap
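A runnable counterpart of Listing 1 could look as follows. This is a sketch under assumptions rather than the implementation used in the evaluation: providers and requirements are represented as plain dictionaries with hypothetical keys (`c_vm`, `c_st`, `av`, `rt`, `th`), and the capacity check is passed in as a caller-supplied predicate `is_deployable`.

```python
import random


def sieving(services, providers, req, is_deployable, rng=random):
    """Randomly pick, for each service, one provider that passes all checks.

    Returns a dict service -> provider name, or None if some service
    has no feasible provider.
    """
    composition = {}
    for s in services:
        # The candidate list is rebuilt for every service.
        candidates = [
            p for p in providers
            if is_deployable(s, p)
            and p["c_vm"] <= req["c_vm"] and p["c_st"] <= req["c_st"]
            and p["av"] >= req["av"] and p["rt"] <= req["rt"]
            and p["th"] >= req["th"]
        ]
        if not candidates:
            return None
        composition[s] = rng.choice(candidates)["name"]
    return composition


# Hypothetical providers and requirements.
providers = [
    {"name": "A", "c_vm": 0.10, "c_st": 0.02, "av": 0.999, "rt": 0.2, "th": 100},
    {"name": "B", "c_vm": 0.50, "c_st": 0.02, "av": 0.999, "rt": 0.2, "th": 100},
]
req = {"c_vm": 0.20, "c_st": 0.05, "av": 0.99, "rt": 0.5, "th": 50}
result = sieving(["vm1", "st1"], providers, req, lambda s, p: True)
```

Because the choice among feasible providers is random, two runs over the same inputs may return different compositions whenever more than one provider passes the filter.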

4.3. Utility-based matching

A major issue of the above described Sieving matching algorithm is the lack of flexibility in the matching of non-functional SLA attributes. Hence, it cannot handle use cases in which, for example, availability is more important than throughput, or in which well-qualified providers should be selected while keeping the total costs low. In addition, the network connectivity between the Clouds and the traffic costs are ignored in the matching. Therefore, we adopted from the Attribute Auction Theory [26] a new economic utility-based matching algorithm, which focuses on the payment of customers and their QoS preferences.


Fig. 5. Functionality of the hybrid utility-based genetic algorithm.

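Table 3
Aggregated SLA attributes for a composite service x.

| SLA attribute | Aggregation function |
|---|---|
| Throughput | $th(x) = \min\, th(s_i) \quad \forall s_i \in S$ |
| Response time | $rt(x) = \max\, rt(s_i) \quad \forall s_i \in S$ |
| Latency | $lat(x) = \dfrac{\sum_{e \in E} lat(e_{s_i s_j})}{f} \quad \forall s_i, s_j \in S,\ i \neq j$ |
| Availability | $av(x) = \prod av(s_i) \quad \forall s_i \in S$ |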
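The aggregation functions of Table 3 translate directly into code. The sketch below assumes that availability aggregates as a product over the component services and that latency is the mean over the f edges; the function name `aggregate_qos` and the sample values are hypothetical.

```python
import math


def aggregate_qos(th, rt, av, edge_latencies):
    """Aggregate per-service SLA values into composite values (Table 3)."""
    return {
        "th": min(th),                                    # slowest service bounds throughput
        "rt": max(rt),                                    # worst response time dominates
        "lat": sum(edge_latencies) / len(edge_latencies), # mean latency over the f edges
        "av": math.prod(av),                              # assumed: product of availabilities
    }


# Hypothetical values for three component services and their three edges.
q_x = aggregate_qos(
    th=[100.0, 80.0, 120.0],
    rt=[0.2, 0.5, 0.3],
    av=[0.99, 0.999, 0.98],
    edge_latencies=[10.0, 20.0, 30.0],
)
```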

The main strategy of the utility-based algorithm is to maximize the user profit for the requested service quality by using a quasi-linear utility function [27]. The user preferences for the non-functional SLA attributes are modeled by weighted scoring functions, whereas all functional requirements must be fulfilled, as in Sieving. The utility that a customer $i$ obtains from using, during the period $T$, a candidate multi-Cloud composite service $x$ that assures a service quality $Q_x$ is computed as follows:

$$
U_{ix}(Q_x, T) = C_{ri}(T) \cdot F_i(Q_x) - C_x(T), \tag{4}
$$

where $C_{ri}(T)$ represents the maximum willingness to pay of consumer $i$ for an “ideal” service quality in the period $T$ (see Eq. (3)), $C_x(T)$ is the total service usage cost (see Eq. (1)), and $F_i(Q_x)$ is the customer's scoring function, which translates the aggregated service quality attribute levels into a relative fulfillment level of the consumer requirements. It is defined as follows:

$$
F_i(Q_x) = \sum_{j=1}^{m} \lambda_i(q_j) \cdot f_i(q_j) \to [0, 1], \tag{5}
$$

where $\lambda_i(q_j)$ and $f_i(q_j)$ denote respectively the relative assessed weight and the fitting function for consumer $i$ regarding the SLA attribute $q_j$, where $\sum_{j=1}^{m} \lambda_i(q_j) = 1$. The fitting function maps each measured SLA attribute, in accordance with the user behavior, to a normalized real value in the interval $[0, 1]$, with 1 representing an ideal expected SLA value. An example of non-linear fitting functions is provided in Section 5.

A candidate Cloud service composition $x$ is optimal if it is feasible and if it leads to the maximum utility value with:

$$
U_{ix_{optimal}}(Q_x, T) = \max_{x \in X} U_{ix}(Q_x, T) \tag{6}
$$

where $X$ is the set of possible Cloud service compositions (solution space). Hence, the match-making problem can be formulated as a search for the Cloud composition with the highest utility value for the user.
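Eqs. (4)–(6) can be illustrated with a small sketch. The linear, clamped fitting functions used here are purely hypothetical placeholders (the paper refers to non-linear fitting functions in Section 5), as are the weights, budget, and candidate values; the exhaustive maximum in `best_composition` only illustrates Eq. (6) and is feasible for tiny solution spaces only.

```python
def clamp01(v):
    """Restrict a value to the interval [0, 1]."""
    return max(0.0, min(1.0, v))


def score(weights, fits, qos):
    """Eq. (5): weighted sum of fitting-function values; weights must sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[q] * fits[q](qos[q]) for q in weights)


def utility(budget, weights, fits, qos, cost):
    """Eq. (4): willingness to pay scaled by the score, minus the usage cost."""
    return budget * score(weights, fits, qos) - cost


def best_composition(candidates, budget, weights, fits):
    """Eq. (6): exhaustive search; candidates are (name, qos, cost) tuples."""
    return max(candidates, key=lambda c: utility(budget, weights, fits, c[1], c[2]))


# Hypothetical preferences: availability matters more than response time.
weights = {"av": 0.6, "rt": 0.4}
fits = {
    "av": lambda v: clamp01((v - 0.9) / 0.1),  # 0.9 -> 0, 1.0 -> 1
    "rt": lambda v: clamp01(1.0 - v / 2.0),    # 0 s -> 1, 2 s -> 0
}
candidates = [
    ("x1", {"av": 0.99, "rt": 0.5}, 40.0),
    ("x2", {"av": 0.95, "rt": 1.0}, 20.0),
]
best = best_composition(candidates, budget=100.0, weights=weights, fits=fits)
```

With these numbers the more expensive candidate wins because its higher quality score outweighs the extra cost, which is the trade-off the utility function is meant to capture.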

4.4. HU-GA matching algorithm

Evolutionary approaches like genetic algorithms are a commonly used method to solve complex optimization problems with a large solution space. In this work, we adopted a single-objective genetic algorithm called hybrid utility-based genetic algorithm (HU-GA) to solve the match-making problem introduced in Section 4.1. The adoption of the algorithm involves six steps, which are depicted in Fig. 5. In the first step, the algorithm takes as input a composite service request represented as an Intercloud graph. In the next step, called pre-sieving, we filter out candidate providers that do not satisfy the QoS requirements for throughput, availability, and response time, similar to the Sieving algorithm. In this way, infeasible solutions are removed from the solution space and the convergence of the algorithm is accelerated.

Fig. 6. Genetic encoding of the composite service candidates.

The third step is to create the population by generating random composite service candidates called individuals. As can be seen in Fig. 6, each individual is represented by a chromosome consisting of multiple genes, which are encoded using a selection Map data structure. Each entry in the map represents a gene that has as key a graph node representing the requested single service and as value the list of candidate providers capable of deploying the service.

In the fourth step, the composite service candidates x (individuals) of each generation are evaluated against a fitness function. In our HU-GA algorithm, the fitness function to be maximized, shown in Eq. (7), is equal to the utility function from Eq. (4). Additionally, we use a death penalty function to penalize candidates that do not satisfy the service constraints and to discard Cloud compositions with a negative utility.

$$\max_{x \in X} \; fitness = \begin{cases} 0 & \text{if constraints are violated} \\ U_i^{x}(Q_x, T) & \text{otherwise.} \end{cases} \qquad (7)$$
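Eq. (7) can be written as a small function. This is a sketch under one stated assumption: a negative utility is treated in the same way as a constraint violation, which is how we read the description of the death penalty above. The utility value itself is passed in, since Eq. (4) is not reproduced here.

```python
def fitness(utility_value, constraints_satisfied):
    """Death-penalty fitness: zero for infeasible candidates, the utility otherwise."""
    # Assumption: compositions with negative utility are discarded like infeasible ones.
    if not constraints_satisfied or utility_value < 0:
        return 0.0
    return utility_value
```

Because infeasible candidates receive the lowest possible fitness, they cannot survive elitist selection once at least one feasible candidate exists.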

The next step is the evolution of the population based on the crossover and mutation genetic operators. Here, the elite candidates with the best fitness values survive into the next generation and are used to create a new population, while the bad candidates are discarded. This step is repeated many times until the algorithm converges. Finally, the optimal Cloud composition is returned as the final solution.

Fig. 7. Cloud Service Broker simulation framework architecture.
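The evolution step described above can be sketched as a generic elitist genetic loop. This is not the Opt4J implementation used in the paper: the population size, generation limit and crossover rate default to the values reported in Section 5.1, whereas the mutation rate, the number of elites, truncation selection, uniform crossover and the fixed generation count (instead of a convergence test) are assumptions made for illustration.

```python
import random

def evolve(selection_map, fitness, pop_size=100, generations=1000,
           crossover_rate=0.95, mutation_rate=0.05, elite=2, seed=0):
    """Return the fittest individual found by a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    nodes = list(selection_map)
    population = [{n: rng.choice(selection_map[n]) for n in nodes}
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = population[:elite]          # elites survive unchanged
        parents = population[:pop_size // 2]   # assumed truncation selection
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            if rng.random() < crossover_rate:  # assumed uniform crossover
                child = {n: (a if rng.random() < 0.5 else b)[n] for n in nodes}
            else:
                child = dict(a)
            for n in nodes:                    # per-gene mutation
                if rng.random() < mutation_rate:
                    child[n] = rng.choice(selection_map[n])
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

# Toy usage: the hypothetical provider "A" is preferred for every service.
def count_a(individual):
    return sum(1 for provider in individual.values() if provider == "A")

toy_map = {f"svc{i}": ["A", "B"] for i in range(4)}
best = evolve(toy_map, count_a, pop_size=30, generations=40)
```

In a real setting, `fitness` would wrap the utility of Eq. (4) together with the death penalty of Eq. (7), and `selection_map` would hold the pre-sieved candidate lists.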

5. Evaluation

We evaluated our proposed HU-GA matching algorithm with a real DNA sequencing workflow application by conducting a series of simulation experiments. The simulation environment and the gathered results are presented in the following subsections.

5.1. Simulation environment

For the purpose of evaluation, we used a broker-based simulation framework called Cloud Service Broker, which we implemented in previous work [28] to investigate the deployment and scheduling of workflows in multi-Cloud environments. Fig. 7 depicts the architecture of the framework, which is developed on top of CloudSim.

The Cloud Service Broker, shown in the middle of the architecture, assists users in finding suitable Cloud services to deploy and execute their workflow applications. Its main component is a Match-Maker that performs a matching process to select the target Clouds for the deployment. A scheduler assigns the workflow tasks to the selected Cloud resources. The architecture also includes a Data Manager to manage the data transfers during the workflow execution. The entire communication with the underlying Cloud providers is realized through standard interfaces offered by provider-hosted Intercloud Gateways. A Workflow Engine deployed on the client side delivers the workflow tasks to the Cloud Service Broker with respect to their execution order and data flow dependencies. As the workflow engine, we use WorkflowSim [29], a CloudSim-based version of the Pegasus WfMS [30].

In order to execute workflows using the framework, the Workflow Engine first receives a workflow description and the SLA requirements from the user. After parsing the description, the Workflow Engine applies different clustering techniques to reduce the number of workflow tasks. The reduced workflow and the user requirements are then forwarded to the Broker. Next, the Match-Maker selects the Cloud resources that fit the user-given requirements by applying different matching policies. After all the requested VMs and Cloud storage are deployed on the selected Clouds, the Workflow Engine transfers the input data from the client to the Cloud storage and then starts to release the workflow tasks with respect to their execution order. During execution, the scheduler assigns each task to a target VM according to different scheduling policies, while the Data Manager manages the Cloud-to-Cloud data transfers. A Replica Catalog stores the list of data replicas by mapping workflow files to their current datacenter locations. Finally, the execution results are transferred to the Cloud storage and can be retrieved via the user interface.

Table 4. Modeled Clouds setup.

Parameter                             Value or range
Host
  CPU cores per host                  8–16
  Host CPU speed                      1860–2660 MHz
  Host RAM size                       8–16 GB
  Host local storage                  1 TB
Network
  Cloud local bandwidth               100 Mb/s
  Cloud local latency                 10 ms
  Intra-continental:
    Cloud-to-Cloud bandwidth          30 Mb/s
    Cloud-to-Cloud latency            25 ms
  Inter-continental:
    Cloud-to-Cloud bandwidth          10 Mb/s
    Cloud-to-Cloud latency            150 ms
Datacenter
  Number of datacenters               20
  Hosts per datacenter                50
  Regions                             Europe, USA, Asia and Australia

For the purpose of evaluation, we implemented the proposed HU-GA algorithm as a new match-making policy in the Match-Maker component of the Broker, using the Java-based Opt4J [31] genetic framework. In addition, we implemented the Sieving algorithm for a comparative study. For all conducted experiments, we configured Opt4J to use a population size of 100 with a maximal generation number of 1000 and a crossover rate of 0.95.

5.2. Datacenters model

For all the conducted simulation experiments, we configured 20 Cloud datacenters located in four world regions (Europe, USA, Asia and Australia). Each compute Cloud is made up of 50 physical hosts, which are equally divided between two different host types with 8 and 16 CPU cores, respectively. The detailed datacenter configuration is provided in Table 4. All 20 modeled datacenters provide compute services with different VM configurations. Additionally, 12 of them provide Cloud storage services.

In order to make our simulation more realistic, we collected the current pay-as-you-go prices for computation, storage and network traffic as well as the SLA metrics of ten real IaaS providers, namely Amazon AWS, ElasticHosts, GoGrid, Rackspace, VoxCLOUD, CloudSigma, OpSource, CityCloud, HP Cloud and Flexiscale. We then mapped the collected pricing policies to the modeled datacenters based on each IaaS provider's regional location. The average Cloud SLA metric values of the last three months for availability, Client-to-Cloud throughput, and response time (for downloading large files from the Cloud) were acquired through CloudHarmony (footnote 4) network tests from the same client host located in Europe. In order to model the network between the datacenters, we defined, based on their regional location, three constant bandwidth and latency values (local, intra-continental and inter-continental, see Table 4). The use of these synthetic values is justified by the lack of freely accessible Cloud-to-Cloud network metrics from CloudHarmony.

4 http://www.cloudharmony.com.

Fig. 8. A sample vertically clustered Epigenomics workflow.
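The three-class network model can be sketched as a lookup on the values of Table 4. The classification rule (same datacenter, same region, different region) follows the description above, treating each region as one continent; the transfer-time formula (latency plus size divided by bandwidth) is a simplifying assumption and not necessarily how the simulator computes transfers.

```python
# (bandwidth in Mb/s, latency in ms) per link class, taken from Table 4.
LINKS = {
    "local": (100.0, 10.0),
    "intra-continental": (30.0, 25.0),
    "inter-continental": (10.0, 150.0),
}

def link_class(dc_a, region_a, dc_b, region_b):
    """Classify the link between two datacenters by their regional location."""
    if dc_a == dc_b:
        return "local"
    return "intra-continental" if region_a == region_b else "inter-continental"

def transfer_time_s(size_megabytes, link):
    """Assumed model: one latency delay plus the serialization time of the payload."""
    bandwidth_mbps, latency_ms = LINKS[link]
    return latency_ms / 1000.0 + size_megabytes * 8.0 / bandwidth_mbps

link = link_class("EC2 EU", "Europe", "CityCloud EU", "Europe")
seconds = transfer_time_s(100.0, link)  # 100 MB between two European Clouds
```

Under this model, moving the same file between continents is roughly three times slower than within a continent, because the bandwidth drops from 30 Mb/s to 10 Mb/s.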

5.3. Use case: epigenomics workflow deployment

For the experimental evaluation of our match-making scheme with a real case study, we investigate the deployment of an Epigenomics bioinformatics application using our previously described simulation framework. Epigenomics is a compute-intensive workflow-based DNA sequencing application, developed by the USC Epigenome Center [32], to map the epigenetic state of human cells on a genome-wide scale. A sample directed acyclic graph (DAG) representation of an Epigenomics workflow with four vertical levels (footnote 5) is illustrated in Fig. 8.

As can be seen from the figure, the workflow takes as input the DNA sequence data generated by a Genetic Analyzer system and splits it into several chunks that can be processed in parallel. The sequences of each data chunk are filtered to remove noisy and contaminating sequences, and then mapped to the correct location in a reference genome. Finally, a global map of the aligned sequences is generated and the sequence density at each position in the genome is calculated [33].

For all our experiments, we use a real Epigenomics XML trace generated by the Pegasus WfMS from a production run of the workflow. Table 5 shows the characteristics of the trace consisting of 997 tasks.

The task descriptions, including runtime and input/output file information, have been imported with the help of the WorkflowSim Workflow Parser. In order to reduce the scheduling overhead, we configured the Clustering Engine to use vertical clustering as the merging technique. In this way, the sequential tasks of each vertical level are merged into a clustered job, so that the total number of tasks is reduced to 260. In addition, we configured the Workflow Engine to release at most five tasks to the broker in each scheduling interval (the default value used in Pegasus). As the scheduling policy, we use the simple Round Robin scheduler, which schedules workflow tasks to the first free available VMs in the composite service regardless of the datacenter location. For more data-intensive workflows, more complex task scheduling policies can be used, such as the data locality driven scheduling that we used in [34] to improve the execution performance of the Montage astronomical application [35] on the Cloud.

5 http://pegasus.isi.edu/applications/dna_sequencing.

Table 5. Epigenomics workflow trace characteristics.

# tasks   # clustered jobs   Input data   Output data   Data traffic
997       260                7 GB         300 MB        505 GB

Table 6. VMs setup; 1 CPU core: 1 GHz Xeon 2007 processor of 1000 MIPS; OS: Linux 64 bits.

VM type   Cores   RAM (GB)   Disk (GB)
Small     1       1.7        75
Medium    2       3.75       150
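The effect of vertical clustering on the task count can be illustrated with a small sketch. It merges chains in which a task has a single child and that child has a single parent; this is our reading of the technique and not the WorkflowSim Clustering Engine itself. The task names form a hypothetical two-lane miniature of the workflow in Fig. 8.

```python
def vertical_clustering(children):
    """Merge single-parent/single-child chains of a DAG into clustered jobs.

    children maps every task (including leaves) to the list of its child tasks.
    """
    parents = {task: [] for task in children}
    for task, kids in children.items():
        for kid in kids:
            parents[kid].append(task)

    def continues_chain(task):
        # True if the task is absorbed by a chain that starts further upstream.
        return len(parents[task]) == 1 and len(children[parents[task][0]]) == 1

    clusters = []
    for task in children:
        if continues_chain(task):
            continue
        chain = [task]
        while (len(children[chain[-1]]) == 1
               and len(parents[children[chain[-1]][0]]) == 1):
            chain.append(children[chain[-1]][0])
        clusters.append(chain)
    return clusters

# Hypothetical miniature workflow: one split, two parallel pipelines, one merge.
dag = {
    "split": ["filter1", "filter2"],
    "filter1": ["convert1"], "convert1": ["map1"], "map1": ["merge"],
    "filter2": ["convert2"], "convert2": ["map2"], "map2": ["merge"],
    "merge": [],
}
clusters = vertical_clustering(dag)
```

Here eight tasks collapse into four clustered jobs, mirroring on a small scale the reduction from 997 tasks to 260 jobs reported in Table 5.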

The use case scenario of an Epigenomics workflow deployment using our simulation framework is depicted in Fig. 9. As can be seen from the figure, the deployment consists of the following steps. In the first step, the user specifies the functional requirements for the workflow deployment by requesting from 10 to 50 VMs and one storage Cloud to store the workflow data. One half of the VMs is of type small and the other half is of type medium. The configuration of each type is presented in Table 6. We assume that all the VMs located in the same datacenter are connected to a shared storage. After acquiring the user QoS and budget requirements, the Workflow Engine forwards the user request to the Cloud Service Broker (1), which selects the suitable Clouds for the deployment based on the configured matching policy. After the requested resources are deployed (2), the Workflow Engine transfers the input data from the Client to the Cloud storage, and then starts the execution of the workflow (3). Finally, the output data is stored in the Cloud storage when the execution is finished and can be fetched from the user Client (4).

For the purpose of evaluation, we modeled two simulation scenarios. In the first scenario, named “unconstrained”, the number of VMs that can be deployed per datacenter is unlimited. In the second scenario, named “constrained”, we added a constraint to the user request that increases the chance of a multiple-Cloud deployment by limiting the maximal number of VMs per datacenter to half of the total requested VMs. For all our simulation experiments, we assume that the user is located in Germany. To collect the simulation results, we repeated each of the experiments ten times on the same host (a notebook with 2 CPU cores, 4 GB RAM and Windows 7 OS), and then computed the average value.
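The constraint of the “constrained” scenario can be expressed as a simple feasibility check. The function name and the list-based placement format are assumptions for illustration; the two sample placements correspond to the 10-VM deployments reported in Table B.12.

```python
from collections import Counter

def satisfies_vm_limit(vm_placement, constrained):
    """vm_placement lists the hosting datacenter of each requested VM."""
    if not constrained:
        return True  # "unconstrained": any number of VMs per datacenter
    limit = len(vm_placement) / 2  # at most half of the requested VMs per datacenter
    return all(count <= limit for count in Counter(vm_placement).values())

single_cloud = ["CityCloud EU"] * 10
two_clouds = ["CityCloud EU"] * 5 + ["CloudSigma EU"] * 5
```

A placement of all ten VMs in one datacenter is only admissible in the “unconstrained” scenario, whereas the 5/5 split satisfies both scenarios.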

5.4. User QoS and budget requirements

The minimal and maximal values requested by the user for the QoS parameters (response time, availability, latency and throughput) to deploy the Epigenomics workflow on the Cloud are given in Table 7. These values are consumed by the Sieving matching algorithm described in Section 4.2 to select random candidate Clouds for the deployment.

The fitting functions and the relative weight for each QoS parameter, which are both part of the input for our proposed HU-GA matching algorithm, are given in Table 8. The fitting functions are set so that a score value higher than 0.8 corresponds to the minimal QoS requirements given in Table 7. It can be seen from the table that in the modeled user request the Cloud-to-Cloud latency has the highest importance for the user because of the large amount of data traffic that needs to be transferred between Clouds, which requires a minimal latency between the Clouds. The second user concern is the availability of the Clouds. In contrast, both the Client-to-Cloud throughput and the response time have only minor significance because of the relatively small amount of input/output data that needs to be transferred from/to the Client. For all our experiments, we fixed the lease time T, in which the requested Cloud resources are provisioned for the user, to two days. The maximum acceptable hourly price for each requested small and medium VM type as well as the maximum per-gigabyte price for data storage and network traffic are given in Table 9. These budget limits express a medium willingness to pay among the users of this type of scientific application.

Fig. 9. DNA sequencing workflow use case.

Table 7. User QoS requirements; response time rt(r); availability av(r); latency lat(r); throughput th(r).

Max response time (s)   Min availability (%)   Max latency (ms)   Min throughput (Mb/s)
25                      95                     50                 12

Table 8. User preferences expressed using fitting functions f and relative weights λ; γ = 0.0005; β = 1 − γ.

Response time:  $\lambda_{rt} = 1/10$,   $f(rt) = \frac{\gamma}{\gamma + \beta e^{0.3(rt - 55)}}$
Availability:   $\lambda_{av} = 3/10$,   $f(av) = \frac{\gamma}{\gamma + \beta e^{-0.9(av - 84)}}$
Latency:        $\lambda_{lat} = 5/10$,  $f(lat) = \frac{\gamma}{\gamma + \beta e^{0.2(lat - 100)}}$
Throughput:     $\lambda_{th} = 1/10$,   $f(th) = 1 - \beta e^{-0.2\,th}$

Table 9. Maximal payment for VMs c_vm(r), storage c_st(r) and traffic c_tr(r).

VM small ($/h)   VM medium ($/h)   Storage ($/GB)   Traffic ($/GB)
0.09             0.18              0.1              0.1
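The fitting functions and weights of Table 8 can be evaluated directly, which also allows checking the statement that the minimal requirements of Table 7 map to scores above 0.8. The helper names and the weighted sum `weighted_qos_score` are illustrative assumptions; the latter only combines the fitting scores and is not the complete utility function of Eq. (4).

```python
import math

GAMMA = 0.0005
BETA = 1 - GAMMA

def logistic(x, slope, offset):
    """Common shape of the response time, availability and latency fitting functions."""
    return GAMMA / (GAMMA + BETA * math.exp(slope * (x - offset)))

FITTING = {
    "rt": lambda rt: logistic(rt, 0.3, 55),       # response time in s
    "av": lambda av: logistic(av, -0.9, 84),      # availability in %
    "lat": lambda lat: logistic(lat, 0.2, 100),   # latency in ms
    "th": lambda th: 1 - BETA * math.exp(-0.2 * th),  # throughput in Mb/s
}
WEIGHTS = {"rt": 0.1, "av": 0.3, "lat": 0.5, "th": 0.1}

def weighted_qos_score(measured):
    """Assumed aggregate: weighted sum of the per-attribute fitting scores."""
    return sum(WEIGHTS[k] * FITTING[k](measured[k]) for k in WEIGHTS)

# Scores obtained exactly at the user limits of Table 7.
limits = {"rt": 25, "av": 95, "lat": 50, "th": 12}
scores_at_limits = {k: FITTING[k](v) for k, v in limits.items()}
```

Evaluating `scores_at_limits` shows that each attribute scores slightly above 0.8 at its limit, with the response time being the closest to that threshold.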

5.5. Time complexity and convergence

In order to evaluate the time complexity of our HU-GA algorithm with an increasing number of requested VMs in the “unconstrained” scenario, we measured in the first experiment the time consumed by the genetic algorithm to find the optimal composite service and then compared the results with the Sieving algorithm. Fig. 10 illustrates the results. As can be seen from the figure, the matching time of HU-GA increases exponentially as the number of VMs increases from 10 to 50, reaching 40 s with 50 VMs. Although the matching times of our approach are up to three orders of magnitude higher than those of the Sieving algorithm, they are negligible compared to the workflow makespan, which can take many hours.

We repeated the above experiment by measuring the number of iterations needed for the convergence of the genetic algorithm for both the “constrained” and “unconstrained” use cases. Additionally, in order to assess the impact of the pre-sieving process used in the HU-GA algorithm on the algorithm convergence, we conducted the experiments first with pre-sieving enabled and then with pre-sieving disabled. The results for all scenarios are presented in Fig. 11. As depicted in the figure, the continual increase of the VM number results for all scenarios in a steady increase of the iteration number, and consequently more time is needed to perform the matching. It can be seen that the fastest convergence is reached with the “pre-sieved” HU-GA in the “unconstrained” scenario, followed by the “pre-sieved” HU-GA in the “constrained” scenario. This result demonstrates the benefit of pre-sieving in improving the genetic algorithm performance. We also observed that starting from 40 VMs, the “unsieved” HU-GA algorithm is terminated (after 1000 iterations) without reaching convergence. Because of this result, all the following experiments are performed with the “pre-sieved” HU-GA algorithm.

5.6. Makespan evaluation

We repeated the previous experiment to measure the makespans in minutes after a single run of the Epigenomics workflow on the Cloud resources allocated using either the Sieving or the HU-GA matching algorithm. In addition, we compared the result with the predicted theoretical makespan value (baseline) calculated based on the number of VMs. For an accurate calculation of the makespan, we extracted from the workflow trace the real delay overhead resulting from clustering, post-scripting and queuing. The results for the “unconstrained” scenario with different numbers of requested VMs are shown in Fig. 12.

Fig. 10. Unconstrained HU-GA and Sieving time complexity for different VM numbers.

Fig. 11. Unconstrained (U)/constrained (C) HU-GA convergence for different VM numbers.

Fig. 12. Workflow makespan in minutes with unconstrained HU-GA and Sieving.

It can be seen from the figure that the makespans of both algorithms are very close to each other. This is explained by our assumption that the VMs on all the matched Clouds have the same hardware configuration and performance, and by the fact that both algorithms tend to select Clouds in the same user region to take advantage of low data transfer time and latency. We also observed that the execution of the workflow within the requested two-day lease time is only possible with more than 30 VMs; otherwise the deadline will be violated and the user will be charged for the additional time needed. The difference from the theoretical makespan is due to the overhead resulting from the file transfer, in particular with a large number of VMs in the composite service.

5.7. QoS evaluation

One of the important criteria for demonstrating the effectiveness of the matching algorithms is the QoS resulting from the deployment of the requested composite services on the matched Clouds. For this purpose, we calculated, for the “unconstrained” and “constrained” simulation scenarios with 50 requested VMs, the aggregated values for the availability, response time, throughput and latency using the aggregation functions presented in Table 3. The results for each SLA attribute with the Sieving and HU-GA algorithms are depicted in Fig. 13. Note that for the latency calculation, we used the previously defined “Inter-Continental” and “Intra-Continental” latency values from Table 4. It can be seen from the figure that, except for the availability attribute, both algorithms are able to keep the QoS values inside the requested user range (lightly colored). The low aggregated availability value obtained with Sieving matching is due to its random strategy in choosing the Clouds, which leads to a multiplicative decrease of the availability, in particular with a high number of component services. Moreover, we observed that the use of HU-GA matching assures better aggregated values for latency and availability, as they are both of high importance for the user (highest weights). The good values of the Client-to-Cloud throughput and response time are explained by the tendency of both algorithms to choose the Clouds located in Europe to reduce the time for transferring input and output data from/to the user client. Furthermore, the “unconstrained” use case gives the best results in terms of latency, as in this case the chance to deploy all the VMs in one Cloud is higher.
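The multiplicative decrease of the availability can be made concrete with a short sketch. It assumes that the aggregated availability is the product of the availabilities of all component services, which matches the description above but is not copied from Table 3, and it assumes 51 components (50 VMs plus one storage service). The per-provider values are taken from Table A.11.

```python
import math

def aggregated_availability(availabilities_percent):
    """Assumed aggregation: product of the component availabilities, in percent."""
    return 100.0 * math.prod(a / 100.0 for a in availabilities_percent)

components = 51  # assumption: 50 VMs plus one storage service
high = aggregated_availability([99.93] * components)  # e.g. CityCloud EU
low = aggregated_availability([99.5] * components)    # e.g. Flexiscale EU
```

Under these assumptions, a composition built only from services with 99.93% availability stays above the requested 95%, while one built from services with 99.5% availability falls clearly below it, which illustrates why a random choice of Clouds can violate the availability requirement for large compositions.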

5.8. Cost evaluation

In this subsection, we evaluate the impact of the matching policies on the execution costs, including costs for compute, storage, and traffic. To this end, we calculated the total charged cost after a single run of the Epigenomics workflow for different numbers of VMs. As the VMs are charged on an hourly basis, we rounded up the makespan to the next full hour. Our cost calculation does not consider additional costs such as license and VM image costs, which are charged by some Cloud providers. For the traffic cost calculation, we measured the amount of data traffic transferred between two Clouds during the execution and then applied the real pricing policies. The resulting total costs for the different use cases are depicted in Fig. 14. We can observe that the HU-GA algorithm allows up to 25% cost-saving compared to Sieving for both the “unconstrained” and “constrained” scenarios. The best cost-saving is achieved with the “unconstrained” HU-GA algorithm due to its tendency to deploy all the requested VMs at the cheapest provider if the required QoS is fulfilled. The small increase of the costs with a higher number of VMs results from the low makespan, which compensates for the additional costs of the leased VMs.
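The hourly billing rule can be sketched as follows. The decomposition into compute, storage and traffic follows the text above, but the exact cost model of Eq. (1) is not reproduced here; the flat per-gigabyte storage charge, the function name and all sample quantities are assumptions, and the prices are the budget limits of Table 9 used only as illustrative values.

```python
import math

def total_cost(makespan_hours, vm_counts, vm_hourly_prices,
               storage_gb, storage_price_gb, traffic_gb, traffic_price_gb):
    """Compute + storage + traffic cost with VMs billed per started hour."""
    billed_hours = math.ceil(makespan_hours)  # round up to the next full hour
    compute = billed_hours * sum(
        vm_counts[vm_type] * vm_hourly_prices[vm_type] for vm_type in vm_counts
    )
    storage = storage_gb * storage_price_gb   # assumed flat per-GB charge
    traffic = traffic_gb * traffic_price_gb   # only inter-Cloud traffic is billed
    return compute + storage + traffic

cost = total_cost(
    makespan_hours=30.2,                       # hypothetical makespan
    vm_counts={"small": 25, "medium": 25},
    vm_hourly_prices={"small": 0.09, "medium": 0.18},
    storage_gb=7.3, storage_price_gb=0.1,
    traffic_gb=20.0, traffic_price_gb=0.1,     # hypothetical traffic volume
)
```

A makespan of 30.2 h is billed as 31 h for every VM, so shortening the run by a few minutes only lowers the compute cost when it crosses a full-hour boundary.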

In order to evaluate the amount of cost-saving in the requested lease period T of two days, we use a metric called Cost–Budget-Ratio (CBR), defined as follows:

$$CBR = \frac{C_x(T)}{C_r(T)} \times 100 \qquad (8)$$

where $C_x(T)$ and $C_r(T)$ respectively denote the total lease cost and the defined user budget for the period T. These are calculated as in Eqs. (1) and (3). Based on the above metric definition, a lower CBR value indicates a higher cost-saving. We repeated the previous experiment to compare the CBR values obtained with different matching policies in the two deployment scenarios. The gathered results are provided in Table 10. It can be seen from the table that the budget constraints could be fulfilled in all scenarios. The results also confirm that HU-GA allows more cost-saving than Sieving by keeping the costs under 60% of the user requested budget. In contrast, a workflow deployment with Sieving is economical only with a small number of VMs. In addition, the results show that the “unconstrained” HU-GA deployment gives the minimal CBR values and consequently ensures the best cost-saving.

Fig. 13. Average aggregated SLA values for unconstrained (U)/constrained (C) HU-GA and Sieving for a VM number of 50.

Fig. 14. Total workflow execution cost with unconstrained (U)/constrained (C) HU-GA and Sieving for different VM numbers.

Table 10. CBR values (%) with unconstrained (U)/constrained (C) HU-GA and Sieving for different VM numbers.

VMs         10      20      30      40      50
U-Sieving   49.22   61.03   66.12   69.85   72.92
C-Sieving   50.27   59.84   66.57   70.95   73.4
U-HU-GA     36.53   46.13   50.67   53.31   55.04
C-HU-GA     42.15   49.48   54.07   56.88   58.77
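The CBR metric of Eq. (8) and the comparison drawn from Table 10 can be reproduced with a few lines. The function implements Eq. (8) directly; the dictionary simply restates the values of Table 10, and the derived quantities (budget share saved, policy with the lowest worst-case CBR) are illustrative additions.

```python
def cbr(total_cost, budget):
    """Cost-Budget-Ratio in percent, following Eq. (8)."""
    return total_cost / budget * 100.0

VM_COUNTS = [10, 20, 30, 40, 50]
TABLE_10 = {
    "U-Sieving": [49.22, 61.03, 66.12, 69.85, 72.92],
    "C-Sieving": [50.27, 59.84, 66.57, 70.95, 73.4],
    "U-HU-GA": [36.53, 46.13, 50.67, 53.31, 55.04],
    "C-HU-GA": [42.15, 49.48, 54.07, 56.88, 58.77],
}

# Share of the budget that remains unspent for each policy and VM count.
saving = {policy: [round(100.0 - value, 2) for value in row]
          for policy, row in TABLE_10.items()}

# Policy whose highest CBR over all VM counts is the lowest.
best_policy = min(TABLE_10, key=lambda policy: max(TABLE_10[policy]))
```

The values confirm the statements above: both HU-GA variants stay below 60% of the budget for every VM count, and the “unconstrained” HU-GA deployment has the lowest CBR throughout.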

5.9. Cost-makespan behavior

The makespan and cost evaluations described in the previous subsections show that the number of VMs used to run the workflow results not only in different execution times but also in different costs. Hence, it is interesting to observe the relationship between performance and cost. We created a diagram, shown in Fig. 15, with one axis (the x-axis) presenting the makespan and the other (the y-axis) depicting the cost.

Fig. 15. The behavior between performance and cost with unconstrained (U)/constrained (C) HU-GA and Sieving.

Observing the figure, it can generally be seen that the cost decreases with increasing makespan. This means that using fewer virtual machines is cheaper but comes at the cost of a longer execution time. Depending on the applied matching algorithm, users have to decide on a cost-performance trade-off. For example, using a 10-VM deployment with “unconstrained” HU-GA, users can save 18% of the total payment while having to accept that the workflow runs four times slower than a deployment with 50 VMs. Therefore, users may choose to lease 50 VMs to improve performance rather than to save costs, and to meet the leasing time deadline.

6. Conclusions and future directions

This work addressed the problem of composite service selection on multi-Cloud with the goal of benefiting users in terms of service quality and costs. After formalizing the problem, we proposed a hybrid utility-based genetic algorithm called HU-GA capable of tackling the matching problem.

Table A.11. Modeled datacenters SLA metrics measured from Germany.

Datacenter        Location      Availability (%)   Response time (s)   Throughput (Mb/s)
EC2 EU            Ireland       99.97              3.63                19.11
EC2 US            Virginia      99.98              8.59                2.39
EC2 JP            Tokyo         99.91              21.19               2.73
ElasticHosts US   Texas         99.96              12.12               1.41
ElasticHosts EU   England       99.99              2.87                15.42
GoGrid US         Virginia      100                8.35                3.13
GoGrid EU         Netherlands   99.96              1.45                48.52
Rackspace EU      England       99.96              2.73                11.28
Rackspace US      Texas         99.96              11.32               1.76
CloudSigma EU     Switzerland   99.95              2.69                29.2
CloudSigma US     Nevada        99.93              12.58               1.58
VoxCLOUD US       New York      99.93              8.27                1.96
VoxCLOUD SG       Singapore     99.93              24.65               0.63
VoxCLOUD EU       Netherlands   99.93              4.76                36.05
OpSource EU       Netherlands   99.98              2.91                27.5
OpSource AU       Australia     99.95              28.89               0.85
OpSource US       Virginia      99.96              8.64                2.47
CityCloud EU      Sweden        99.93              4.18                23.13
HP Cloud US       Nevada        100                9.08                1.34
Flexiscale EU     England       99.5               4.04                24.51

Table B.12. Matched datacenters with unconstrained (U) and constrained (C) HU-GA for different VM numbers; ST = Storage.

VMs             10              20              30              40              50
Deployment      U       C       U       C       U       C       U       C       U       C
CityCloud EU    10 VM   5 VM    20 VM   10 VM   30 VM   15 VM   40 VM   20 VM   50 VM   25 VM
EC2 EU          0       0       0       10 VM   0       15 VM   0       20 VM   0       25 VM
CloudSigma EU   0       5 VM    0       1 ST    0       0       0       0       0       0
VoxCLOUD EU     1 ST    1 ST    1 ST    0       1 ST    1 ST    1 ST    1 ST    1 ST    1 ST

We evaluated the HU-GA algorithm using a broker-based simulation environment with a real DNA sequencing workflow application in different deployment scenarios and compared it with a simple Sieving matching policy. The experimental results show the benefits of HU-GA matching compared to Sieving in reducing the total execution costs as well as improving the QoS, in particular when running the workflow on a large service composition.

In the next step of this research work, we will improve the time performance of the genetic algorithm with large service compositions by implementing a parallel version of the algorithm running on faster hardware. In addition, we will extend the simulation framework to make the simulation more realistic by including more accurate network models and automating the collection of SLA metrics from third-party Cloud monitoring services. Furthermore, the match-making should be resilient to Cloud or network failures that occur during the provisioning process [36] and cause SLA violations. Finally, we will evaluate our matching scheme with popular data-processing workflow applications like MapReduce.

Acknowledgment

We would like to thank Weiwei Chen from the University of Southern California for his contribution to this work by providing us with the WorkflowSim source code.

Appendix A. SLA metrics of the modeled Clouds

Table A.11 shows the SLA metrics of the public Clouds collected from CloudHarmony, which we used in most of our simulation experiments. These data depend on the location of the measurement as well as on the network connection and time. The metrics shown are the average availability and response time of an entire month and the average throughput (for a 5 MB file download) of a single week. They were acquired in Germany from the same client host. In all our experiments, we assume that all the customers are located in Germany.

Appendix B. Matched datacenters

Table B.12 shows the datacenters matched by the HU-GA algorithm for the “constrained” and “unconstrained” scenarios. It can be seen from the table that all the selected datacenters are located in Europe due to their closeness to the user. The most frequently matched datacenter is CityCloud EU, as it offers the cheapest VM prices combined with good SLA metric values. As storage Cloud, VoxCLOUD EU is the most frequently matched, followed by CloudSigma EU. It can also be seen that for the constrained deployment, the VMs are equally deployed on EC2 EU and CityCloud EU. The candidate datacenters matched by the Sieving algorithm are CityCloud EU, Amazon EC2 EU, Flexiscale EU, CloudSigma EU and VoxCLOUD EU.

Appendix C. HU-GA convergence

In all our experiments, we observed at convergence a steady linear increase of the maximal achieved utility with an increasing number of VMs in the composite service. This can be seen in Fig. C.16 for both deployment scenarios.

Fig. C.17 shows a screenshot of the Opt4J convergence plot for the “constrained” deployment scenario with 30 VMs. The same plot when running the algorithm without sieving is depicted in Fig. C.18. In the latter case, the candidate composite services with negative utilities (blue line) are also evaluated; consequently, more iterations are needed for convergence.

Fig. C.16. Maximal utility values at the HU-GA convergence for different VM numbers.

Fig. C.17. Opt4J convergence plot with 30 VMs for constrained sieved HU-GA.

Fig. C.18. Opt4J convergence plot with 30 VMs for constrained unsieved HU-GA. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

References

[1] X. Yang, L. Wang, G. von Laszewski, Recent research advances in e-science,Cluster Comput. 12 (2009) 353–356.

[2] R. Buyya, C.S. Yeo, S. Venugopal, J. Broberg, I. Brandic, Cloud computing andemerging it platforms: vision, hype, and reality for delivering computing asthe 5th utility, Future Gener. Comput. Syst. 25 (2009) 599–616.

[3] L. Wang, D. Chen, Y. Hu, Y. Ma, J. Wang, Towards enabling cyberinfrastructureas a service in clouds, Comput. Electr. Eng. 39 (2013) 3–14.

[4] L. Wang, W. Jie, Towards supporting multiple virtual private computingenvironments on computational Grids, Adv. Eng. Softw. 40 (2009) 239–245.

[5] W. Zhang, L. Wang, D. Liu, W. Song, Y. Ma, P. Liu, D. Chen, Towards building amulti-datacenter infrastructure for massive remote sensing image processing,Concurr. Comput.: Pract. Exper. 25 (2013) 1798–1812.

[6] D. Chen, Z. Liu, L.Wang,M. Dou, J. Chen, H. Li, Natural disastermonitoringwithwireless sensor networks: a case study of data-intensive applications uponlow-cost scalable systems, Mobile Netw. Appl. 18 (2013) 651–663.

[7] G. Juve, E. Deelman, Scientific workflows in the cloud, in: ComputerCommunications and Networks, Springer, London, 2011.

[8] Y. Ma, L. Wang, A.Y. Zomaya, D. Chen, R. Ranjan, Task-tree based large-scalemosaicking for remote sensed imageries with dynamic DAG scheduling, IEEETrans. Parallel Distrib. Syst. (2013).

[9] M.C. Schatz, B. Langmead, S.L. Salzberg, Cloud computing and the DNA datarace, Nature Biotechnol. 28 (2010) 691.

[10] F.F. Costa, Big data in biomedicine, Drug Discov. Today (2013).[11] G. Juve, E. Deelman, G. Berriman, B.P. Berman, P. Maechling, An evaluation

of the cost and performance of scientific workflows on Amazon EC2, J. GridComput. 10 (2012) 5–21.

[12] M. Papazoglou, Web Services: Principles and Technology, Addison-Wesley,2008.

[13] C. Szabo, Q.Z. Sheng, T. Kroeger, Y. Zhang, J. Yu, Science in the cloud: allocationand execution of data-intensive scientific workflows, J. Grid Comput. (2013).

[14] R.N. Calheiros, R. Ranjan, A. Beloglazov, C.A.F.D. Rose, R. Buyya, Cloudsim: atoolkit for modeling and simulation of cloud computing environments andevaluation of resource provisioning algorithms, Softw. - Pract. Exp. 41 (2011)23–50.

[15] L. Zeng, B. Benatallah,M. Dumas, J. Kalagnanam,Q.Z. Sheng, Quality drivenwebservices composition, in: Proceedings of the 12th International Conference onWorld Wide Web, WWW’03, ACM, New York, NY, USA, 2003, pp. 411–421.URL: http://doi.acm.org/10.1145/775152.775211. http://dx.doi.org/10.1145/775152.775211.

[16] A. Dastjerdi, R. Buyya, A Taxonomy of QoS Management and ServiceSelection Methodologies for Cloud Computing, CRC Press, 2011, pp. 109–131.http://dx.doi.org/10.1201/b11149-8.

[17] S.K. Garg, S. Versteeg, R. Buyya, A framework for ranking of cloud computingservices, Future Gener. Comput. Syst. 29 (2013) 1012–1023.

[18] A. Juan-Verdejo, H. Baars, Decision support for partially moving applicationsto the cloud: the example of business intelligence, in: Proceedings of the2013 International Workshop on Hot Topics in Cloud Services, HotTopiCS’13,ACM, New York, NY, USA, 2013, pp. 35–42. URL: http://doi.acm.org/10.1145/2462307.2462316. http://dx.doi.org/10.1145/2462307.2462316.

[19] M. Menzel, R. Ranjan, Cloudgenius: decision support for web server cloudmigration, in: Proceedings of the 21st International Conference onWorldWideWeb, ACM, New York, NY, USA, 2012.

[20] A.V. Dastjerdi, S.K. Garg, R. Buyya, QoS-aware deployment of network ofvirtual appliances across multiple clouds, in: 2011 IEEE Third InternationalConference on Cloud Computing Technology and Science, 2011, pp. 415–423.

[21] M. Zhang, R. Ranjan, S. Nepal, M. Menzel, A. Haller, A declarative recommender system for cloud infrastructure services selection, in: Proceedings of the 9th International Conference on Economics of Grids, Clouds, Systems, and Services, GECON’12, Springer-Verlag, Berlin, Heidelberg, 2012, pp. 102–113. http://dx.doi.org/10.1007/978-3-642-35194-5_8.

[22] Z. Ye, X. Zhou, A. Bouguettaya, Genetic algorithm based QoS-aware service compositions in cloud computing, in: Proceedings of the 16th International Conference on Database Systems for Advanced Applications: Part II, DASFAA’11, Springer-Verlag, Berlin, Heidelberg, 2011, pp. 321–334. URL: http://dl.acm.org/citation.cfm?id=1997251.1997281.

[23] S. Haak, M. Menzel, Autonomic benchmarking for cloud infrastructures: an economic optimization model, in: Proceedings of the 1st ACM/IEEE Workshop on Autonomic Computing in Economics, ACM, 2011, pp. 27–32.

[24] F. Jrad, J. Tao, R. Knapper, C.M. Flath, A. Streit, A utility-based approach for customised cloud service selection, Int. J. Computational Science and Engineering (2014) in press. http://www.inderscience.com/info/ingeneral/forthcoming.php?jcode=ijcse.

[25] X. Yang, Improving portfolio efficiency: a genetic algorithm approach, Comput. Econ. 28 (2006) 1–14.

[26] J. Asker, E. Cantillon, Properties of scoring auctions, Rand J. Econ. 39 (2008) 69–85.

[27] S. Lamparter, S. Ankolekar, S. Grimm, R. Studer, Preference-based selection of highly configurable web services, in: Proc. of the 16th Int. World Wide Web Conference, WWW’07, Banff, Canada, 2007, pp. 1013–1022.

[28] F. Jrad, J. Tao, A. Streit, A broker-based framework for multi-cloud workflows, in: Proceedings of the 2013 International Workshop on Multi-Cloud Applications and Federated Clouds, MultiCloud’13, ACM, New York, NY, USA, 2013, pp. 61–68. URL: http://doi.acm.org/10.1145/2462326.2462339. http://dx.doi.org/10.1145/2462326.2462339.

[29] C. Weiwei, D. Ewa, Workflowsim: a toolkit for simulating scientific workflows in distributed environments, in: The 8th IEEE International Conference on eScience, IEEE, Chicago, 2012, URL: http://isi.edu/%7Ewchen/papers/workflowsim.pdf.

[30] E. Deelman, G. Singh, M.-H. Su, J. Blythe, Y. Gil, C. Kesselman, G. Mehta, K. Vahi, G.B. Berriman, J. Good, et al., Pegasus: a framework for mapping complex scientific workflows onto distributed systems, Sci. Program. 13 (2005) 219–237.

[31] M. Lukasiewycz, M. Glaß, F. Reimann, J. Teich, Opt4J: a modular framework for meta-heuristic optimization, in: Proceedings of the Genetic and Evolutionary Computing Conference, GECCO 2011, Dublin, Ireland, 2011, pp. 1723–1730.

[32] Epigenome, USC Epigenome Center, [Online], 2013. http://epigenome.usc.edu (accessed: 20.12.13).

[33] G.M. Juve, Resource management for scientific workflows (Ph.D. thesis), University of Southern California, 2012.

[34] F. Jrad, J. Tao, I. Brandic, A. Streit, Multi-dimensional resource allocation for data-intensive large-scale cloud applications, in: Proceedings of the International Conference on Cloud Computing and Services Science, CLOSER 2014, Barcelona, Spain, 2014, pp. 691–702.

[35] G.B. Berriman, E. Deelman, J.C. Good, J.C. Jacob, D.S. Katz, C. Kesselman, A.C. Laity, T.A. Prince, G. Singh, M.-H. Su, Montage: a grid-enabled engine for delivering custom science-grade mosaics on demand, in: P.J. Quinn, A. Bridger (Eds.), Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, vol. 5493, 2004, pp. 221–232. http://dx.doi.org/10.1117/12.550551.

[36] E. Deelman, G. Juve, G.B. Berriman, Using clouds for science, is it just kicking the can down the road? in: Proceedings of the International Conference on Cloud Computing and Services Science, CLOSER 2012, Porto, Portugal, 2012.

Foued Jrad is a Ph.D. student at the Steinbuch Centre for Computing of the Karlsruhe Institute of Technology (KIT), Germany. He received his Diploma in Electrical Engineering from the University of Hanover, Germany. His current research interests include Intercloud computing, cloud interoperability and cloud service brokerage.

Jie Tao is a Senior Research Associate at the Steinbuch Centre for Computing of the Karlsruhe Institute of Technology (KIT), Germany. She received her Ph.D. at the Technical University of Munich, Germany. Her research interests are mainly in parallel and distributed computing, with a focus on parallel programming models, performance tools, distributed shared memory systems and grid/cloud computing.

Ivona Brandic is Assistant Professor at the Distributed Systems Group, Information Systems Institute, Vienna University of Technology (TU Wien). Prior to that, she was Assistant Professor at the Department of Scientific Computing, Vienna University. She received her Ph.D. degree from Vienna University of Technology in 2007.

Achim Streit is the Director of Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology (KIT), and a Professor for Distributed and Parallel High Performance Systems, Institute of Telematics, Department of Informatics, Karlsruhe Institute of Technology (KIT), Germany. His research includes high performance computing, Grid computing and Cloud computing.