
CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. (2014)
Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cpe.3204

SPECIAL ISSUE PAPER

Resource preprocessing and optimal task scheduling in cloud computing environments

Zhaobin Liu, Wenyu Qu*,†, Weijiang Liu, Zhiyang Li and Yujie Xu

School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China

SUMMARY

Cloud computing has emerged as an essential infrastructure for many commercial facilities. To realize its promise, effective and efficient scheduling algorithms are fundamentally important. However, conventional scheduling methodologies encounter a number of challenges. When scheduling tasks in cloud systems, making full use of resources and selecting resources effectively are important factors. At the same time, communication delay plays an important role in cloud scheduling: it not only leads to waiting between tasks but also results in long idle intervals on processing units. In this paper, a fuzzy clustering method is used to preprocess the cloud resources effectively. Combining list scheduling with a task duplication scheduling scheme, a new directed acyclic graph based scheduling algorithm, called the earliest finish time duplication (EFTD) algorithm for heterogeneous cloud systems, is presented. EFTD attempts to insert suitable immediate parent nodes of the currently selected node in order to reduce its waiting time on the processor. The case study and experimental results illustrate that the proposed algorithm outperforms the popular heterogeneous earliest finish time (HEFT) algorithms. Copyright © 2014 John Wiley & Sons, Ltd.

Received 30 January 2013; Revised 27 November 2013; Accepted 8 December 2013

KEY WORDS: cloud computing; DAG (directed acyclic graph); fuzzy clustering; task scheduling

1. INTRODUCTION

As a typical distributed system, the cloud is composed of a large number of shared and heterogeneous resources that provide tremendous computing power. Cloud computing resources are wide-area distributed, self-managed, heterogeneous, and subject to dynamic load changes, which makes task scheduling in the cloud environment face much more complex problems than in traditional distributed environments [1]. Task scheduling in the cloud is an important research topic in the field of high-performance heterogeneous cloud computing [2]. Scheduling is the method by which threads, processes, tasks, or data flows are given access to system resources (e.g., processor time and communication bandwidth), usually in order to achieve a target quality of service. The need for a scheduling algorithm arises from the requirement of most modern systems to not only execute more than one task at a time but also transmit multiple tasks simultaneously [3].

To achieve the promising potential of tremendous distributed resources, effective and efficient scheduling algorithms are fundamentally important. Unfortunately, scheduling algorithms for traditional parallel and distributed systems, which usually run on homogeneous and dedicated resources such as computer clusters, do not work well in the new cloud computing circumstances [4].

*Correspondence to: Wenyu Qu, School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China.
†E-mail: [email protected]


In cloud task scheduling, communication delay is an important factor affecting the scheduling algorithm [5, 6]: it not only results in waiting time between tasks but also leaves processing units with long idle intervals. In addition, how to make full use of resources and how to select resources are also important factors [7]. This paper studies cloud task scheduling strategies, taking into account the influence of resource preprocessing and communication delays on the cloud scheduling algorithm. In summary, our main contributions can be summarized in the following two aspects.

On the one hand, traditional cloud task scheduling procedures include little or no resource preprocessing stage. Consequently, traditional methods cannot make good use of resources based on resource characteristics. Thus, our first contribution focuses on the resource preprocessing problem in cloud computing systems. We use a fuzzy clustering method to preprocess the cloud resources. Through fuzzy clustering of resource characteristics [8], the scheduling time spent in the stage of selecting a processing unit can be reduced. The performance comparison illustrates that the arithmetic average method is more suitable for heterogeneous resource preprocessing than other preprocessing methods.

On the other hand, our second contribution is the earliest finish time duplication (EFTD) algorithm. Combining heterogeneous earliest finish time (HEFT) scheduling with task duplication scheduling, when there is a period of idle time on a processor, the EFTD algorithm attempts to insert suitable immediate parent nodes of the currently selected node. This policy reduces the node's waiting time on the processor, which helps to advance the earliest start time of the candidate task. Our case study and experimental results also illustrate that the proposed algorithm is better than traditional HEFT algorithms.

The remainder of the manuscript is organized as follows. Definitions and related work are reviewed in Section 2. Section 3 describes the assumptions, investigates the theoretical analysis of directed acyclic graph (DAG) scheduling, and proposes our EFTD scheduling model and architecture for heterogeneous cloud systems. Fuzzy clustering based resource preprocessing and comparisons are described in Section 4. In Section 5, the EFTD methodology for scheduling heterogeneous cloud systems is presented, and our experimental results and performance evaluation are analyzed. Section 6 concludes the paper and outlines future work.

2. DEFINITIONS AND RELATED WORK

Before presenting the formulation, to facilitate the discussion, we first introduce some definitions, terms, and resource characteristics used in this paper. In order to improve resource scheduling performance, the resources of the target cloud system can be clustered based on their properties. To make further study of resource preprocessing easier, we define the following five characteristics to describe a processing unit in the resource system:

• Processing performance: the average computing ability of a processing unit in the resource system; it represents the average time per task executed by the processing unit. The smaller this value is, the less the processing unit costs during computation, and vice versa.

• Number of links: the number of edges that are connected to a processing unit; it represents the number of connected links of the processing unit.

• Average communication ability: the average weight of the edges that are connected to a processing unit; it represents the average communication ability of the connected links.

• Network position: the product of the number of nodes a processing unit has to cross to reach its farthest unit and the number of nodes with the same span. It represents the position of a processing unit in the network: the greater this value is, the farther the processing unit is from the center of the network, and vice versa.

• Maximum transmission capacity: the maximum weight of the edges that are connected to a processing unit, which represents the maximum transmission capacity of the connected links.

Many cloud scheduling schemes have been proposed for efficient resource scheduling [9]. In the following, we review the major research works on the limitations of traditional cloud scheduling algorithms.


Traditional scheduling algorithms for parallel and distributed computing environments are based on atomic or batch tasks, and most of these models are simple and easy to implement. In a cloud environment, however, this assumption does not hold [10]. Cloud tasks may not be independent but rather related to each other or constrained by assigned priorities [11]. To improve the system performance of a large distributed network, Li et al. [12] addressed the problem of multimedia object placement for transparent data replication; the performance objective is to minimize the total access cost by considering the transmission cost. In more recent work, the authors proposed an energy-efficient and high-accuracy scheme for secure data aggregation to overcome the communication overhead [13]. However, that approach suits only wireless sensor networks, as it does not support heterogeneous cloud computing environments.

In recent years, there has been increasing interest in DAG scheduling heuristics for heterogeneous resources, that is, resources whose capabilities may differ (as opposed to traditional homogeneous resources) [14]. For instance, scientific workflows on clouds pose a new challenge of how to leverage the computing power of cloud and grid computing for scientific workflow applications [15].

A directed acyclic graph can be applied to map task relations in a graph for heterogeneous cloud environments. The DAG technique is based on the repeated execution of two steps: select the node with the highest priority and assign the selected node to a suitable processing unit [16]. Many types of DAG scheduling algorithms have been introduced since the middle of the last century [17], and heuristic DAG scheduling algorithms can be classified into four types. Among these heuristics, the heterogeneous earliest finish time heuristic (HEFT) [18] has been one of the most often cited and used, having the advantage of simplicity and producing generally good schedules with short makespans. HEFT is essentially a list scheduling heuristic. The HEFT algorithm selects the task with the highest upward rank at each step, where the upward rank is defined as the maximum distance from the current node to the exit node, including the computational and communication costs. Nonetheless, the popular HEFT algorithm cannot make full use of the idle time of CPU units.

Abirami and Ramanathan [19] presented an effective scheduling algorithm named linear scheduling for tasks and resources, which schedules both the tasks and the resources. The algorithm mainly focuses on eradicating starvation and deadlock conditions. Virtualization, together with the scheduling algorithm, yields higher resource utilization and system throughput, thus improving the performance of the cloud resources. However, the proposed approach assumes that the task requests satisfy a linear scheduling mode. Kaur and Verma [20] proposed a modified genetic algorithm for single-user jobs in which the fitness function is designed to encourage the formation of solutions that achieve time minimization, and compared it with existing heuristics. Experimental results show that, under heavy loads, the proposed algorithm exhibits good performance. However, the proposed methods are of single-user type and thus are not suitable for general cloud applications.

Maguluri et al. [21] considered a stochastic model of a cloud computing cluster, where jobs arrive according to a stochastic process and request virtual machines that are specified in terms of resources such as CPU, memory, and storage space. A primary contribution is the development of frame-based non-preemptive virtual machine configuration policies. In order to obtain a comprehensive understanding of system performance, a novel analytical model based on a Markov-modulated Poisson process was developed for communication networks in multicluster systems in the presence of spatio-temporal bursty traffic [22]. In [23], the authors proposed a resource preprocessing method for Grid systems, but the method is still in its infancy. Salehi and Buyya [24] proposed two market-oriented scheduling policies that aim at satisfying application deadlines by extending the computational capacity of local resources through hiring resources from cloud providers. The policies have no prior knowledge about the application execution time and are implemented in the Gridbus broker as a user-level broker. However, our work focuses on providing scheduling policies to enhance system speedup, and we do not deal with market-related issues. Chang et al. [25] use a data replication mechanism to generate multiple copies of existing data in order to reduce remote accesses for data-intensive applications. In [26], a novel scheduling heuristic called DAGMap was proposed to address job completion time and resource utilization efficiency.


In [27], the authors combined centralized static scheduling with a dynamic flow of jobs, which makes obtaining analytical results about the performance of their algorithm tractable. However, these general task duplication based methods usually assume a homogeneous network or an infinite resource system.

In contrast to the discussed studies, in order to improve the utilization of processing units in heterogeneous cloud environments, we propose a new DAG based resource preprocessing scheme and the EFTD algorithm for heterogeneous cloud systems. Through resource preprocessing based on the fuzzy clustering method, similar resources can be clustered so as to minimize the scheduling time spent in the selection stage. Moreover, the proposed EFTD approach reduces a task's waiting time on the processor and thus improves the system speedup.

3. PROBLEM STATEMENT AND SYSTEM MODEL

A DAG scheduling model can be represented by a weighted DAG G = (T, E), where T = {ti} is a set of tasks and E = {eij} is a set of edges indicating the dependencies between tasks. eij represents the edge between ti and tj, where ti is the parent node of tj; obviously, tj cannot start unless ti finishes and ti's data reach tj. cij denotes the inter-task communication time between ti and tj. If ti and tj are assigned to the same processing unit, the communication cost is zero, that is, cij = 0. In addition, in a given task graph, t_entry denotes the entry node, which has no parent nodes, and t_exit denotes the exit node, which has no child nodes. If there is more than one entry task or exit task in the DAG, we can add a virtual entry or exit task with zero-cost edges. A task in a cloud is an atomic unit to be scheduled by the scheduler and assigned to a cloud resource.

The DAG target system can be depicted by a graph D = (P, C), where P = {p1, p2, ..., pM} is the collection of processing units (M is the total number of processing units), and pw_x denotes the average computing time of unit p_x. C is the collection of communication abilities on the edges between computing units; cc_xy ∈ C denotes the data transferred per unit time, that is, the average communication ability. We also define b as the heterogeneity coefficient, where b = 0 means homogeneous and b = 1 means fully heterogeneous. In this case, the computing cost w_ix of a task on processing unit p_x can be any value in the range

$$\left[\, pw_x \cdot \left(1 - \frac{b}{2}\right),\; pw_x \cdot \left(1 + \frac{b}{2}\right) \,\right].$$
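As an illustration only (not code from the paper), one common way to realize such a heterogeneous computing cost is to draw w_ix from this range; the uniform distribution and the function name below are assumptions:

```python
import random

def sample_computing_cost(pw_x: float, b: float) -> float:
    """Draw a computing cost w_ix from [pw_x*(1 - b/2), pw_x*(1 + b/2)].

    pw_x : average computing time of processing unit p_x
    b    : heterogeneity coefficient (0 = homogeneous, 1 = fully heterogeneous)
    The uniform distribution is an assumption; the text only states the range.
    """
    return random.uniform(pw_x * (1.0 - b / 2.0), pw_x * (1.0 + b / 2.0))

# Example: with b = 0.5 (the case-study value) and pw_x = 4, costs fall in [3.0, 5.0].
print(sample_computing_cost(4.0, 0.5))
```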

To facilitate the evaluation of the algorithm, the associated scheduling parameters, abbreviations, and notations are formalized as follows:

• EST: earliest start time.

$$EST(t_i, p_j) = \max\left\{ avail[p_j],\ \max_{t_x \in pred(t_i)} \left( AFT(t_x) + \frac{c_{x,i}}{cc_{x,j}} \right) \right\}$$

Here, pred(ti) is the set of immediate predecessors of ti, and avail[pj] is the earliest time at which processing unit pj becomes available to execute a task. AFT(tx) is the actual finish time of task tx.

• EST̄: average EST.

$$\overline{EST}(t_i) = \max_{t_n \in pred(t_i)} \left( \overline{EST}(t_n) + M_p + \frac{c_{n,i}}{M_c} \right)$$

where Mp is the median computing ability of the processing units and Mc is the median transfer ability between processing units. It is clear that $\overline{EST}(t_{entry}) = 0$.

• blevel: bottom level, denotes the longest length from the current node to an exit node.

$$blevel(t_i) = M_p + \max_{t_j \in succ(t_i)} \left( \frac{c_{i,j}}{M_c} + blevel(t_j) \right)$$


Here, succ(ti) is the set of immediate successors (children) of task ti.

• tlevel: top level, denotes the longest length from the entry node to current node.

$$tlevel(t_i) = \min_{t_n \in succ(t_i)} \left( \overline{EST}(t_n) - \frac{c_{i,n}}{M_c} \right) - M_p$$

Here, $tlevel(t_{exit}) = \overline{EST}(t_{exit})$.

• EFT: earliest finish time.

$$EFT(t_i, p_j) = EST(t_i, p_j) + w_{i,j}$$

• COD: the idle interval time available on a processing unit, considered during task duplication.

$$COD(t_i, p_j) = EST(t_i, p_j) - avail[p_j]$$

• DAT(tcp2, pj): data arrival time, denotes the time at which the data from the second key parent node tcp2 arrive at processing unit pj.
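The notation above maps directly onto a small amount of code. The following is a minimal, illustrative sketch (not the authors' implementation); the graph encoding, the helper names, and the toy data at the bottom are assumptions made purely for illustration:

```python
# EST/EFT/blevel sketch for a DAG on heterogeneous processing units.

def blevel(t, succ, Mp, c, Mc, memo):
    """blevel(t) = Mp + max over children k of (c[t,k]/Mc + blevel(k)); Mp for exit nodes."""
    if t not in memo:
        kids = succ.get(t, [])
        memo[t] = Mp + (max(c[(t, k)] / Mc + blevel(k, succ, Mp, c, Mc, memo) for k in kids)
                        if kids else 0.0)
    return memo[t]

def est(t, p, pred, assigned, aft, c, cc, avail):
    """EST(t, p) = max(avail[p], max over parents x of AFT(x) + c[x,t] / cc[proc(x), p])."""
    ready = avail[p]
    for x in pred.get(t, []):
        # communication cost vanishes when the parent is already on the same unit (c_ij = 0)
        comm = 0.0 if assigned[x] == p else c[(x, t)] / cc[(assigned[x], p)]
        ready = max(ready, aft[x] + comm)
    return ready

def eft(t, p, w, pred, assigned, aft, c, cc, avail):
    """EFT(t, p) = EST(t, p) + w[t, p]."""
    return est(t, p, pred, assigned, aft, c, cc, avail) + w[(t, p)]

# Toy example: t1 -> t2, with t1 already finished on p1 at time 3.
succ = {"t1": ["t2"]}
pred = {"t2": ["t1"]}
c = {("t1", "t2"): 4.0}            # data volume on the edge
cc = {("p1", "p2"): 2.0}           # transfer rate between units
w = {("t2", "p2"): 5.0}            # computing cost of t2 on p2
print(eft("t2", "p2", w, pred, {"t1": "p1"}, {"t1": 3.0}, c, cc, {"p2": 0.0}))  # 3 + 4/2 + 5 = 10
```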

In order to reduce the scheduling length in our strategy, we find that it is necessary to cut the communication and computing costs. Our EFTD scheduling model based on DAG includes four stages: a resource preprocessing stage, a priority processing stage, a task allocation stage, and a task duplication stage. The detailed system model and architecture is shown in Figure 1.

The resource providers provide the resources. When a user submits a DAG task, the resource broker selects, through the fuzzy clustering method, the resources that have better comprehensive performance. Finally, with the algorithm used by the user agent and the broker agent, the resource scheduling procedure is realized.

In our approach, the resource clustering preprocessing and the EFTD algorithm are the key components that determine the efficiency of the whole cloud computing facility. In the following two sections, we introduce our resource preprocessing procedure and the EFTD algorithm, respectively.

4. FUZZY CLUSTERING BASED RESOURCE PREPROCESSING

4.1. Procedure of fuzzy clustering

In order to achieve efficient optimization for the cloud environment, a fuzzy clustering based cloud resource preprocessing strategy is carried out. By partitioning all resources into a number of clusters, resources with the same computing capability are grouped into one cluster. Resources within one cluster usually share the same network communication, so they have the same data transfer rate with each other within the cluster; likewise, resources within one cluster have the same data transfer rate to resources in another cluster. The model accommodates both heterogeneous and homogeneous computing environments in terms of computing capability and data communication [28]. According to the problem statement, we define the fuzzy clustering based resource preprocessing procedure as four phases:

• experimental data range standardization;
• computing similarity of experimental data;
• achieving the fuzzy matrix based on a confidence level; and
• evaluating clustering information using an evaluation function.

Figure 1. Earliest finish time duplication (EFTD) system model and architecture.

Qualitative characteristics refer to qualities or properties of cloud computing resources. Each processing unit sn in the set S = {s1, s2, ..., sN} has a pattern vector S'(sn) = {sn1, sn2, ..., sn5}, where sni is the ith (1 ≤ i ≤ m) characteristic value of the nth processing unit. One qualitative feature can be realized in multiple ways, as Section 2 describes. Therefore, we set m = 5, that is, five characteristics, identified as follows:

• t1 represents its processing performance.
• t2 represents its average communication ability.
• t3 represents its maximum transmission capacity.
• t4 represents its network position.
• t5 represents its link number.

The original data sheet S' is shown in Table I. Using the mean and standard deviation of the data S in the target system, we obtain the standardized data. The standardized value p'_ik of each datum is

$$p'_{ik} = \frac{p_{ik} - \bar{t}_k}{S_{t_k}}$$


Table I. Original data.

S'     t1   t2     t3     t4   t5
S1     20   0.05   0.05   15   1
S2     30   0.08   0.05   12   2
S3     35   0.1    0.1    12   1
S4     50   0.13   0.1     9   4
S5     45   0.13   0.1     6   4
S6     50   0.13   0.1     9   3
S7     30   0.1    0.1     8   1
S8     28   0.1    0.1     8   1
S9     48   0.12   0.08    4   5
S10    33   0.1    0.1     4   1
S11    30   0.08   0.08    5   1
S12    35   0.1    0.1     5   1
S13    40   0.12   0.12    5   1


where tk denotes the kth characteristic column (eigenvector) of the original data, $\bar{t}_k$ is the mean value of tk, and $S_{t_k}$ is the standard deviation of tk. Because the resulting standardized data p'_ik are not necessarily in [0, 1], an extreme standardization method is introduced to normalize S' into S'' (illustrated in Table II). The extreme standardization method is defined as

$$p''_{ik} = \frac{p'_{ik} - p'_{k\min}}{p'_{k\max} - p'_{k\min}}$$

where $p'_{k\min}$ is the minimum value and $p'_{k\max}$ is the maximum value among $p'_{1k}, p'_{2k}, \dots, p'_{Nk}$. Thus, the fuzzy similarity relation Rs of the processing units S = {s1, s2, ..., sN} can be calculated under a variety of similarity calculation methodologies. Here, we take the index similarity coefficient method as the first instance:

$$r_{ij} = \frac{1}{n} \sum_{k=1}^{n} e^{-\frac{3}{4} \cdot \frac{(p''_{ik} - p''_{jk})^2}{s''^2_{t_k}}}$$

Table II. Range of standardized data.

S''    t1     t2     t3     t4     t5
S1     0      0      0      1      0
S2     0.33   0.3    0      0.73   0.25
S3     0.5    0.6    0.71   0.73   0
S4     1      0.9    0.71   0.45   0.75
S5     0.83   0.9    0.71   0.18   0.75
S6     1      1      0.71   0.45   0.5
S7     0.33   0.6    0.71   0.36   0
S8     0.27   0.6    0.71   0.36   0
S9     0.93   0.84   0.43   0      1
S10    0.43   0.6    0.71   0      0
S11    0.33   0.36   0.43   0.09   0
S12    0.5    0.6    0.71   0.09   0
S13    0.67   0.84   1      0.09   0


where n stands for the number of characteristics (eigenvalues), i and j belong to [1, 13], and k belongs to [1, 5]. In this case, Rs is

$$R_s = \begin{bmatrix}
1 & 0.61 & 0.36 & 0.03 & 0.02 & 0.08 & 0.31 & 0.34 & 0.04 & 0.26 & 0.4 & 0.24 & 0.21 \\
0.61 & 1 & 0.62 & 0.18 & 0.11 & 0.27 & 0.53 & 0.52 & 0.09 & 0.45 & 0.6 & 0.43 & 0.26 \\
0.36 & 0.62 & 1 & 0.42 & 0.38 & 0.46 & 0.83 & 0.8 & 0.28 & 0.8 & 0.59 & 0.81 & 0.59 \\
0.03 & 0.18 & 0.42 & 1 & 0.88 & 0.94 & 0.47 & 0.47 & 0.66 & 0.34 & 0.2 & 0.38 & 0.45 \\
0.02 & 0.11 & 0.38 & 0.88 & 1 & 0.82 & 0.46 & 0.45 & 0.77 & 0.49 & 0.34 & 0.55 & 0.64 \\
0.08 & 0.27 & 0.46 & 0.94 & 0.82 & 1 & 0.51 & 0.51 & 0.57 & 0.38 & 0.25 & 0.42 & 0.49 \\
0.31 & 0.53 & 0.83 & 0.47 & 0.46 & 0.51 & 1 & 0.99 & 0.31 & 0.86 & 0.73 & 0.88 & 0.62 \\
0.34 & 0.52 & 0.8 & 0.47 & 0.45 & 0.51 & 0.99 & 1 & 0.31 & 0.84 & 0.73 & 0.85 & 0.59 \\
0.04 & 0.09 & 0.28 & 0.66 & 0.77 & 0.57 & 0.31 & 0.31 & 1 & 0.45 & 0.43 & 0.46 & 0.52 \\
0.26 & 0.45 & 0.8 & 0.34 & 0.49 & 0.38 & 0.86 & 0.84 & 0.45 & 1 & 0.79 & 0.98 & 0.74 \\
0.4 & 0.6 & 0.59 & 0.2 & 0.34 & 0.25 & 0.73 & 0.73 & 0.43 & 0.79 & 1 & 0.78 & 0.52 \\
0.24 & 0.43 & 0.81 & 0.38 & 0.55 & 0.42 & 0.88 & 0.85 & 0.46 & 0.98 & 0.78 & 1 & 0.78 \\
0.21 & 0.26 & 0.59 & 0.45 & 0.64 & 0.49 & 0.62 & 0.59 & 0.52 & 0.74 & 0.52 & 0.78 & 1
\end{bmatrix}$$
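The preprocessing pipeline above (standardization, extreme standardization, and the index similarity coefficient) can be sketched as follows, assuming NumPy. This is an illustrative reconstruction rather than the authors' code, and the use of the per-characteristic variance of the normalized data in the exponent is an assumption:

```python
import numpy as np

def standardize(S):
    """z-score each column: p'_ik = (p_ik - mean_k) / std_k."""
    return (S - S.mean(axis=0)) / S.std(axis=0)

def extreme_standardize(P):
    """Min-max rescale each column into [0, 1]: p''_ik = (p'_ik - min_k) / (max_k - min_k)."""
    return (P - P.min(axis=0)) / (P.max(axis=0) - P.min(axis=0))

def index_similarity(P2):
    """Index similarity coefficient: r_ij = (1/n) * sum_k exp(-3/4 * (p''_ik - p''_jk)^2 / s''_k^2)."""
    N, n = P2.shape
    s2 = P2.var(axis=0)                      # assumed: per-characteristic variance of S''
    R = np.ones((N, N))
    for i in range(N):
        for j in range(N):
            d = (P2[i] - P2[j]) ** 2 / s2
            R[i, j] = np.exp(-0.75 * d).sum() / n
    return R

# S holds the 13 processing units of Table I, columns t1..t5.
S = np.array([
    [20, 0.05, 0.05, 15, 1], [30, 0.08, 0.05, 12, 2], [35, 0.10, 0.10, 12, 1],
    [50, 0.13, 0.10,  9, 4], [45, 0.13, 0.10,  6, 4], [50, 0.13, 0.10,  9, 3],
    [30, 0.10, 0.10,  8, 1], [28, 0.10, 0.10,  8, 1], [48, 0.12, 0.08,  4, 5],
    [33, 0.10, 0.10,  4, 1], [30, 0.08, 0.08,  5, 1], [35, 0.10, 0.10,  5, 1],
    [40, 0.12, 0.12,  5, 1],
], dtype=float)
Rs = index_similarity(extreme_standardize(standardize(S)))
print(np.round(Rs, 2))  # values comparable to the Rs matrix above
```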

Meanwhile, the fuzzy equivalence relation Re (the transitive closure of Rs) can be obtained through composition operations on the similarity coefficient matrix:

$$R_e = \begin{bmatrix}
1 & 0.61 & 0.61 & 0.61 & 0.61 & 0.61 & 0.61 & 0.61 & 0.61 & 0.61 & 0.61 & 0.61 & 0.61 \\
0.61 & 1 & 0.62 & 0.62 & 0.62 & 0.62 & 0.62 & 0.62 & 0.62 & 0.62 & 0.62 & 0.62 & 0.62 \\
0.61 & 0.62 & 1 & 0.64 & 0.64 & 0.64 & 0.83 & 0.83 & 0.64 & 0.83 & 0.79 & 0.83 & 0.78 \\
0.61 & 0.62 & 0.64 & 1 & 0.88 & 0.94 & 0.64 & 0.64 & 0.77 & 0.64 & 0.64 & 0.64 & 0.64 \\
0.61 & 0.62 & 0.64 & 0.88 & 1 & 0.88 & 0.64 & 0.64 & 0.77 & 0.64 & 0.64 & 0.64 & 0.64 \\
0.61 & 0.62 & 0.64 & 0.94 & 0.88 & 1 & 0.64 & 0.64 & 0.77 & 0.64 & 0.64 & 0.64 & 0.64 \\
0.61 & 0.62 & 0.83 & 0.64 & 0.64 & 0.64 & 1 & 0.99 & 0.64 & 0.88 & 0.79 & 0.88 & 0.78 \\
0.61 & 0.62 & 0.83 & 0.64 & 0.64 & 0.64 & 0.99 & 1 & 0.64 & 0.88 & 0.79 & 0.88 & 0.78 \\
0.61 & 0.62 & 0.64 & 0.77 & 0.77 & 0.77 & 0.64 & 0.64 & 1 & 0.64 & 0.64 & 0.64 & 0.64 \\
0.61 & 0.62 & 0.83 & 0.64 & 0.64 & 0.64 & 0.88 & 0.88 & 0.64 & 1 & 0.79 & 0.98 & 0.78 \\
0.61 & 0.62 & 0.79 & 0.64 & 0.64 & 0.64 & 0.79 & 0.79 & 0.64 & 0.79 & 1 & 0.79 & 0.78 \\
0.61 & 0.62 & 0.83 & 0.64 & 0.64 & 0.64 & 0.88 & 0.88 & 0.64 & 0.98 & 0.79 & 1 & 0.78 \\
0.61 & 0.62 & 0.78 & 0.64 & 0.64 & 0.64 & 0.78 & 0.78 & 0.64 & 0.78 & 0.78 & 0.78 & 1
\end{bmatrix}$$
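The transitive closure can be obtained by repeatedly composing the relation with itself under max-min composition until a fixed point is reached; a minimal sketch under that assumption:

```python
import numpy as np

def max_min_compose(A, B):
    """Fuzzy max-min composition: (A o B)[i, j] = max_k min(A[i, k], B[k, j])."""
    N = A.shape[0]
    C = np.zeros_like(A)
    for i in range(N):
        for j in range(N):
            C[i, j] = np.max(np.minimum(A[i, :], B[:, j]))
    return C

def transitive_closure(R, max_iters=64):
    """Square the relation (R o R) until it stops changing, yielding the equivalence relation Re."""
    for _ in range(max_iters):
        R2 = max_min_compose(R, R)
        if np.allclose(R2, R):
            return R
        R = R2
    return R

# Re = transitive_closure(Rs)   # Rs from the previous snippet
```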

To achieve different clustering results, the value of the cut set a can be set differently. If a is closer to 1, it requires a higher degree of similarity among the members of a cluster, whereas if a is closer to 0, it tolerates a lower degree of similarity. If we let a = 0.8, then the cut set Rg is calculated as


$$R_g = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

Obviously, the processing units whose entries of 1 fall in the same columns of the Rg matrix can be clustered together. We can obtain the clustering result matrix G by merging the identical columns of Rg.

$$G = \begin{bmatrix}
1 & 2 & 12 & 6 & 9 & 11 & 13 \\
0 & 0 & 10 & 5 & 0 & 0 & 0 \\
0 & 0 & 8 & 4 & 0 & 0 & 0 \\
0 & 0 & 7 & 0 & 0 & 0 & 0 \\
0 & 0 & 3 & 0 & 0 & 0 & 0
\end{bmatrix}$$

The values in G stand for the indices of the corresponding resources, and each column of G represents one cluster; there are seven clusters in the example above. Each cluster's overall performance can then be estimated by the following equation:

$$PERF(CL_j) = \frac{1}{w} \sum_{p_k \in CL_j} \sum_{i=1}^{n'} \alpha_i \cdot P''[k][i]$$

where w is the number of processing units contained in the cluster and αi is the weight of the ith characteristic of the processing units. Its value can generally be obtained from historical data or experimental experience; without loss of generality, we assume here that it is {1/2, 1/8, 1/8, 1/8, 1/8}. As a result, the clustering results above can be sorted by overall performance from high to low, that is, {{p4, p5, p6}, {p9}, {p13}, {p3, p7, p8, p10, p12}, {p2}, {p11}, {p1}}.
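The a-cut, the grouping of units that share the same columns, and the PERF ranking can be sketched as follows; the weight vector is the one assumed in the text, and the helper names are illustrative:

```python
import numpy as np

def alpha_cut(Re, a=0.8):
    """Binary cut-set matrix Rg: 1 where Re >= a, else 0."""
    return (Re >= a).astype(int)

def clusters_from_cut(Rg):
    """Group units whose rows of Rg are identical (they share the same 1s)."""
    groups = {}
    for idx, row in enumerate(Rg):
        groups.setdefault(tuple(row), []).append(idx + 1)   # 1-based labels p1..pN
    return list(groups.values())

def perf(cluster, P2, alpha=(0.5, 0.125, 0.125, 0.125, 0.125)):
    """PERF(CL_j) = (1/w) * sum over p_k in CL_j of sum_i alpha_i * P''[k][i]."""
    w = len(cluster)
    return sum(np.dot(alpha, P2[k - 1]) for k in cluster) / w

# Given Re and the normalized data P2 (= S'' of Table II):
# Rg = alpha_cut(Re, 0.8)
# ranked = sorted(clusters_from_cut(Rg), key=lambda cl: perf(cl, P2), reverse=True)
```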

4.2. Other fuzzy clustering methods

In the preceding section, we used the index similarity coefficient method to obtain the fuzzy similarity relation. In order to fully compare different preprocessing schemes, we obtain, using the same procedure, further fuzzy similarity relations based on other clustering methods.


(1) Cosine method

$$G_1 = \begin{bmatrix}
1 & 2 & 13 \\
0 & 0 & 12 \\
0 & 0 & 11 \\
0 & 0 & 10 \\
0 & 0 & 9 \\
0 & 0 & 8 \\
0 & 0 & 7 \\
0 & 0 & 6 \\
0 & 0 & 5 \\
0 & 0 & 4 \\
0 & 0 & 3
\end{bmatrix}$$

We can obtain the clustering results {{p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13}, {p2}, {p1}} by using the evaluation function of the cosine method.

(2) Correlation coefficient

$$G_2 = \begin{bmatrix}
9 & 2 & 13 \\
6 & 0 & 12 \\
5 & 0 & 11 \\
4 & 0 & 10 \\
1 & 0 & 8 \\
0 & 0 & 7 \\
0 & 0 & 3
\end{bmatrix}$$

We can obtain the clustering results {{p3, p7, p8, p10, p11, p12, p13}, {p1, p4, p5, p6, p9}, {p2}} by using the evaluation function of the correlation coefficient method.

(3) Maximum and minimum method

$$G_3 = \begin{bmatrix}
1 & 2 & 3 & 6 & 12 & 9 & 11 & 13 \\
0 & 0 & 0 & 5 & 10 & 0 & 0 & 0 \\
0 & 0 & 0 & 4 & 8 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 7 & 0 & 0 & 0
\end{bmatrix}$$

We can obtain the clustering results {{p4, p5, p6}, {p9}, {p13}, {p3}, {p7, p8, p10, p12}, {p2}, {p11}, {p1}} by using the evaluation function of the maximum and minimum method.

(4) Minimum of arithmetic mean method

$$G_4 = \begin{bmatrix}
1 & 2 & 13 & 9 & 11 \\
0 & 0 & 12 & 6 & 0 \\
0 & 0 & 10 & 5 & 0 \\
0 & 0 & 8 & 4 & 0 \\
0 & 0 & 7 & 0 & 0 \\
0 & 0 & 3 & 0 & 0
\end{bmatrix}$$

We can obtain the clustering results {{p4, p5, p6, p9}, {p3, p7, p8, p10, p12, p13}, {p2}, {p11}, {p1}} by using the evaluation function of the arithmetic mean method.


(5) Euclidean distance method

$$G_5 = \begin{bmatrix}
5 & 2 & 3 & 6 & 8 & 9 & 13 \\
1 & 0 & 0 & 4 & 7 & 0 & 12 \\
0 & 0 & 0 & 0 & 0 & 0 & 11 \\
0 & 0 & 0 & 0 & 0 & 0 & 10
\end{bmatrix}$$

We can obtain the clustering results {{p4, p6}, {p3}, {p1, p5}, {p10, p11, p12, p13}, {p7, p8}, {p2}} by using the evaluation function of the Euclidean distance method.

(6) Absolute value of the index method

$$G_6 = \begin{bmatrix}
1 & 2 & 3 & 4 & 5 & 6 & 8 & 9 & 12 & 11 & 13 \\
0 & 0 & 0 & 0 & 0 & 0 & 7 & 0 & 10 & 0 & 0
\end{bmatrix}$$

We can obtain the clustering results {{p4}, {p6}, {p9}, {p5}, {p13}, {p3}, {p10, p12}, {p7, p8}, {p2}, {p11}, {p1}} by using the evaluation function of the absolute value of the index method.

(7) Hamming distance method

$$G_7 = \begin{bmatrix}
1 & 2 & 13 & 6 & 9 \\
0 & 0 & 12 & 5 & 0 \\
0 & 0 & 11 & 4 & 0 \\
0 & 0 & 10 & 0 & 0 \\
0 & 0 & 8 & 0 & 0 \\
0 & 0 & 7 & 0 & 0 \\
0 & 0 & 3 & 0 & 0
\end{bmatrix}$$

We can obtain the clustering results {{p4, p5, p6}, {p9}, {p3, p7, p8, p10, p11, p12, p13}, {p2}, {p1}} by using the evaluation function of the Hamming distance method.

(8) NTV method

$$G_8 = \begin{bmatrix}
1 & 2 & 3 & 6 & 8 & 9 & 12 & 11 & 13 \\
0 & 0 & 0 & 5 & 7 & 0 & 10 & 0 & 0 \\
0 & 0 & 0 & 4 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

We can obtain the clustering results {{p4, p5, p6}, {p9}, {p10, p12}, {p13}, {p3}, {p7, p8}, {p2}, {p11}, {p1}} by using the evaluation function of the NTV method.
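For reference, common textbook forms of several of the similarity coefficients named above are sketched below. The paper does not spell these formulas out, so the exact definitions (and the omission of the index, absolute-value-of-index, and NTV variants) should be read as assumptions:

```python
import numpy as np

def cosine_sim(x, y):
    # Cosine of the angle between the two characteristic vectors.
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def correlation_sim(x, y):
    # Correlation coefficient of the two characteristic vectors.
    xd, yd = x - x.mean(), y - y.mean()
    return np.dot(xd, yd) / (np.linalg.norm(xd) * np.linalg.norm(yd))

def max_min_sim(x, y):
    # Maximum-and-minimum method: sum of element-wise minima over sum of maxima.
    return np.minimum(x, y).sum() / np.maximum(x, y).sum()

def arithmetic_mean_min_sim(x, y):
    # Minimum of arithmetic mean method: sum of minima over the arithmetic mean of x and y.
    return np.minimum(x, y).sum() / (0.5 * (x + y).sum())

def euclidean_sim(x, y, c=1.0):
    # Euclidean distance turned into a similarity (c chosen so the result stays in [0, 1]).
    return 1.0 - c * np.linalg.norm(x - y)

def hamming_sim(x, y):
    # Hamming (absolute) distance turned into a similarity.
    return 1.0 - np.abs(x - y).sum() / len(x)
```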

4.3. Preprocessing performance comparison

In order to compare the preprocessing effect more directly, we use the index similarity coefficient method as our base benchmark. With the number of cloud resources varied from 10 to 130, we compare the index coefficient method with each of the other methods. In order to further validate the effectiveness, each comparison experiment is repeated 10 times, and the result is the average value of these 10 runs. Without loss of generality, we define a matching degree of more than 80% as good, less than 40% as poor, and anything in between as general. The results are shown in Figure 2(a)-(h).


Figure 2. (a) Cosine of the angle comparison. (b) Comparison of correlation coefficient. (c) The maximum and minimum contrast. (d) The minimum of arithmetic mean. (e) Comparison of Euclidean distance. (f) Comparison of the absolute value of index. (g) Comparison of Hamming distance. (h) NTV contrast. [Each panel plots the number of good, general, and poor matches against the number of resources, from 10 to 130.]


Looking at the results in Figure 2, we can see that the minimum of arithmetic mean method and the Hamming distance method perform better than the index similarity coefficient method, whereas the cosine method and the maximum and minimum method are a little worse. It is obvious that the minimum of arithmetic mean method and the Hamming distance method are suitable for clustering cloud resources under a heterogeneous environment. Thus, in the following sections, we choose the minimum of arithmetic mean method as our resource preprocessing method, which is feasible and effective.

5. EARLIEST FINISH TIME DUPLICATION ALGORITHM AND PERFORMANCE EVALUATION

5.1. Earliest finish time duplication algorithm

As described in the earlier sections, the HEFT algorithm does not make full use of the idle intervals of CPU units, and general task duplication based algorithms mainly consider homogeneous networks or infinite resource environments. For this reason, combining the traditional HEFT and duplication algorithms, we introduce a new algorithm called EFTD for heterogeneous cloud systems. Our scheduling implementation contains four main modules: the resource preprocessing stage, the priority processing stage, the task allocation stage, and the task duplication stage. The users submit DAG task graphs, and at the same time the resource providers provide resources. Through the clustering method, the resource agent selects the better resources (processing units). Finally, the scheduling algorithm and procedure are carried out by the broker module. In brief, our EFTD algorithm is summarized in Figure 3. The resource preprocessing stage has been introduced above, so in this section the other stages are illustrated as follows.

(1) Priority processing stage

Firstly, we establish the list of ready tasks (LRT) through priority allocation. Tasks in the LRT are those whose parent nodes have all been scheduled. The LRT contains only the entry task(s) at the beginning. As each task is assigned, some tasks are released, so the LRT is updated after every task allocation. A task's priority is based on its blevel value: the larger the blevel, the higher the priority. When a task reaches the top of the LRT, it is selected by the scheduling algorithm.
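A minimal sketch of this priority stage (illustrative only); it assumes the blevel values have already been computed, for example with the helper from the Section 3 snippet:

```python
def init_lrt(tasks, pred):
    """The LRT initially holds only the entry task(s): tasks with no parents."""
    return [t for t in tasks if not pred.get(t)]

def pop_highest_priority(lrt, bl):
    """Select and remove the ready task with the largest blevel (highest priority)."""
    lrt.sort(key=lambda t: bl[t], reverse=True)
    return lrt.pop(0)

def release_children(task, succ, pred, scheduled, lrt):
    """After `task` is scheduled, append children whose parents are now all scheduled."""
    for child in succ.get(task, []):
        if child not in lrt and all(p in scheduled for p in pred.get(child, [])):
            lrt.append(child)
```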

(2) Task allocation stage

In this stage, the candidate tasks are allocated to suitable processing units. This stage includes three phases:

• Computing the earliest start time (EST)

We compute the EST of ti on every processing unit and denote by p1 the processing unit that gives the smallest EST value.

• Computing the earliest finish time (EFT)

In a similar fashion, we choose the processing unit p2 that gives the smallest EFT value.

• Task allocation

If the processing units chosen as p1 and p2 are the same, we allocate the task to p1, remove the task from the LRT, and update the LRT. If the processing units chosen as p1 and p2 are different, we choose the next task with lower priority from the LRT and repeat phase (i) until p1 = p2 is satisfied, then update the LRT. When the tasks in the LRT reach the tail without meeting the allocation condition, we compute the deviation of the task at the head of the LRT:

$$Maindiff(i, j, k) = (w_{i,j} - w_{i,k}) - [EFT(i, j) - EFT(i, k)] - \bar{w}_i$$

where j is the processing unit chosen as p1 and k is the processing unit chosen as p2. If the result is positive, we choose p1; otherwise, we choose p2. The task is then removed from the LRT, and the LRT is updated.

Figure 3. Earliest finish time duplication (EFTD) algorithm.
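A hedged sketch of the allocation test described above (not the authors' code); est_fn and eft_fn stand for the EST/EFT computations of Section 3, and the argument bundling is an assumption:

```python
def try_allocate(task, units, est_fn, eft_fn):
    """Allocation test: est_fn/eft_fn map (task, unit) to EST/EFT values.

    Returns the chosen unit if the EST-best and EFT-best units agree, else None;
    on disagreement the scheduler moves to the next ready task, falling back to
    the Maindiff deviation test when the tail of the LRT is reached.
    """
    p_est = min(units, key=lambda p: est_fn(task, p))   # unit with the smallest EST
    p_eft = min(units, key=lambda p: eft_fn(task, p))   # unit with the smallest EFT
    return p_est if p_est == p_eft else None
```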

(3) Task duplication stage

In order to reduce the resource scheduling length, we should restrain the communication and computing overheads. If the task to be allocated and its predecessor are on the same processing unit, their communication cost is 0. Therefore, our solution is to duplicate up to two key parent nodes of a task. The goal of duplication is to reduce the start time, so that the parent node just copied can be chosen on the processing unit that gives the minimum start time for the candidate task. The key parent node here is the parent node of the task that runs on a different processing unit and whose data arrive latest.

At this stage, the algorithm first tests whether the duplication condition can be met. If the condition is satisfied, it copies the parent node's task onto the child's processing unit. Then, based on the recalculated value of avail[pj], we compute the child's new EST value (called nEST). If nEST is less than the original EST, the duplication is carried out. The conditions of duplication are

Copyright © 2014 John Wiley & Sons, Ltd. Concurrency Computat.: Pract. Exper. (2014)DOI: 10.1002/cpe

Page 15: Resource preprocessing and optimal task scheduling in cloud computing environments

RESOURCE PREPROCESSING AND OPTIMAL TASK SCHEDULING

$$COD(t_i, p_j) > w_{cp,j} \quad \text{and} \quad EFT(t_{cp}, p_j) < EFT(t_i, p_j)$$
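The duplication test can be sketched as follows. This is not the authors' code: in particular, the way nEST is formed from the duplicated parent's finish time and the second key parent's data arrival time (DAT) is an assumption based on the notation in Section 3:

```python
def should_duplicate(task, p_j, key_parent, w, est_fn, eft_fn, avail, dat_second_parent):
    """Decide whether to copy `key_parent` onto unit p_j before scheduling `task`.

    w[(t, p)]          : computing cost of task t on unit p
    est_fn / eft_fn    : callables mapping (task, unit) to EST / EFT
    avail[p]           : earliest idle time of unit p
    dat_second_parent  : DAT of the second key parent on p_j (assumed given)
    """
    cod = est_fn(task, p_j) - avail[p_j]                       # COD(t_i, p_j)
    if not (cod > w[(key_parent, p_j)] and eft_fn(key_parent, p_j) < eft_fn(task, p_j)):
        return False                                           # stated duplication conditions fail
    # nEST after the copy: wait for the duplicated parent and for the second key parent's data
    # (this formulation of nEST is an assumption).
    n_est = max(avail[p_j] + w[(key_parent, p_j)], dat_second_parent)
    return n_est < est_fn(task, p_j)                           # keep the copy only if it helps
```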

5.2. Case study

In our case study, we assume the task graph outlined in Figure 4 and the target system depicted in Figure 5. The heterogeneity factor b is 0.5. We assume that all the tasks will be scheduled onto 3 processing units.

First of all, we can obtain the resource description matrix R from the target system. Based on the resource preprocessing method described in Section 4, we obtain

$$R = \begin{bmatrix}
2 & 4 & 4 & 6 & 1 \\
4 & 2 & 4 & 4 & 3 \\
6 & 1 & 1 & 6 & 1 \\
5 & 2 & 4 & 4 & 3 \\
2 & 4 & 4 & 6 & 1 \\
5 & 2 & 2 & 6 & 1
\end{bmatrix}$$

As a result, we obtain the clustering results {{p3}, {p2, p4}, {p1, p5}, {p6}}. We choose the 3 best processing units as the practical task scheduling processors, that is, p3, p2, and p4. In order to simulate a real application scenario, we assume that the processors' performance has been randomly heterogenized. Table III depicts the computational cost of each task on each unit.

We calculate the BL (blevel) values of each task, as shown in Table IV. The values are sorted from largest to smallest, which determines the tasks' priorities.

Figure 4. Task graph.


Figure 5. Target system.

Table III. Calculation of costs of the tasks.

      p1   p2   p3
N1    5    3    4
N2    3    2    7
N3    3    4    8
N4    7    5    3
N5    6    2    5
N6    3    4    8
N7    4    1    3
N8    2    3    4
N9    4    5    3
N10   6    4    5


Table V illustrates the algorithm's procedure. In step 1, the parent task does not exist, so there is no need for duplication. For steps 2, 3, and 5, the idle time intervals are too short, that is, COD(ti, pj) < wcp,j, so the duplication condition is not met: the time intervals are 0, 0, and 2.5, respectively, and the computing costs of the corresponding key parent nodes are 3, 5, and 3, respectively. For step 6, although the time interval condition is met (COD is 12 and w is 7) and the task start time is later than the predecessor's finish time (12 and 10, respectively), the nEST (13) is larger than the original EST (12). In steps 8 and 10, the intervals are again too short (0 and 3), while the corresponding computing costs are 2 and 4. In step 9, we compare and select the processing unit. Step 11 meets the duplication conditions: the interval time is 8, the predecessor's computing cost is 1,


Table IV. BL values of tasks.

      N1      N2      N3      N4      N5      N6      N7      N8    N9      N10
BL    29.34   23.34   18.34   17.67   19.68   19.34   11.67   11    13.34   6.67

Table V. Running procedure of EFTD algorithm case study.

Step  LRT           EST    EFT    p1 = p2?  Process unit  Duplication condition
1     R1            0      3      Y         p2            Not meet, no copy
2     R2,R5,R6      3      5      Y         p2            Not meet, no copy
3     R5,R6,R3,R4   5      7      Y         p2            Not meet, no copy
4     R6,R3,R4      7      10.5   N
5     R6,R3,R4      6.5    9.5    Y         p1            Not meet, no copy
6     R6,R4         9      12     Y         p3            Not meet, no copy
7     R6,R7,R8      12     14.5   N
8     R6,R7,R8      10     12     Y         p1            Not meet, no copy
9     R6,R7         12     15     NC        p1            Not meet, no copy
10    R9,R7         15     19     Y         p1            Not meet, no copy
11    R7            14.5   15.5   Y         p2            Meet, copy R4
12    R10           19     25     Y         p2            Not meet, no copy


the predecessor task finish time is 12, the predecessor start time is 16, and the data arrival time from the second key predecessor is 14.5. In step 12, although the interval time meets the conditions (the interval time is 4.5 and the predecessor task computing cost is 4), the predecessor task finish time (23) is larger than the task start time (19), so it does not meet the duplication conditions.

We use the HEFT algorithm to execute this example, with the result drawn in Figure 6; we then run the example with EFTD, with the result depicted in Figure 7. Through the performance comparison, we can see that the execution time of our EFTD algorithm is shorter than that of HEFT. Although the resource preprocessing method can effectively improve performance, it pays more attention to constrained cases. In the future, we will use optimal methods [29] and the exchanged crossed cube [30] to further consider this problem for both constrained and unconstrained cases.

5.3. Performance evaluation

5.3.1. Performance methodology. In order to verify the performance of our method (EFTD), we make comparisons with the popular HEFT_AEST and HEFT_TL algorithms. Here, the HEFT_TL algorithm is the HEFT algorithm using the average late departure time (tlevel) as the priority, and the HEFT_AEST algorithm is the HEFT algorithm using the average earliest start time (AEST) as the priority. Besides, we also consider the size of the task graphs and the communication calculation rate (CCR).

In order to compare the algorithms, the following two parameters are proposed.

• Normalized scheduling length (NSL) [31]: defined as the ratio of the scheduling length of an algorithm to the scheduling length of the reference algorithm.

• Speedup [32]: defined as the ratio of the sequential execution time to the parallel execution time. Here, sequential execution refers to running all tasks on the single processing unit that minimizes the total computational cost. That is, it satisfies the following equation:

$$speedup = \frac{\min_{p_j \in P} \sum_{t_i \in T} w_{i,j}}{makespan}$$
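Both metrics are straightforward to compute once a schedule's makespan is known; a small illustrative sketch:

```python
def speedup(w, tasks, units, makespan):
    """speedup = (min over units p of the total sequential cost sum_t w[(t, p)]) / makespan."""
    sequential = min(sum(w[(t, p)] for t in tasks) for p in units)
    return sequential / makespan

def nsl(schedule_length, reference_length):
    """Normalized scheduling length: this algorithm's length over the reference algorithm's."""
    return schedule_length / reference_length
```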


Figure 6. Heterogeneous earliest finish time (HEFT) result.


To test our proposed EFTD algorithm, we choose the CloudSim toolkit [33, 34] as the simulation platform for heterogeneous cloud environments. CloudSim can model a realistic virtualized cloud environment and give a detailed analysis of task processing. In order to further evaluate the results, we extend CloudSim's classes and develop a generator to randomly produce task graphs. Its input parameters include the scheduling algorithm, the DAG tasks, the task dependencies, the target system, the number of selected processing units, the estimated running time of each task on each processing unit, and the communication time between dependent tasks.

5.3.2. Experiment results and analysis. Here, we assume that CCR = 2; we obtain the relationship between the number of nodes and the average NSL shown in Figure 8 and the relationship between the number of nodes and the speedup shown in Figure 9. From Figure 8, we can see that although the NSL of all three algorithms increases with the number of nodes, the EFTD algorithm has a smaller NSL than the other two algorithms in every case. Another phenomenon is that, as the number of nodes increases, the increase in the NSL of the EFTD algorithm is not obvious. Looking at the results in Figure 9, our method also achieves higher speedup than the other two algorithms in all cases.

This means that our algorithm has better scheduling performance under the same CCR condition. The reason is that, during the priority processing and task allocation stages, the EFTD algorithm takes many more task-related factors into account.

To evaluate the effect of CCR on NSL and speedup, we assume that the number of nodes is 100. The relationship between CCR and NSL is depicted in Figure 10, and the relationship between CCR and speedup is shown in Figure 11.

Figure 7. Earliest finish time duplication (EFTD) result.

Figure 8. The relationship between the number of nodes and normalized scheduling length (NSL).

With increasing CCR, EFTD performs better than the HEFT_AEST and HEFT_TL algorithms. Another observation is that, as CCR increases, the improvement of EFTD becomes more and more obvious. The reason is that the EFTD algorithm integrates the idea of task duplication. In fact, an increasing CCR means that the cost of communication is higher than the cost of computing in the application system. Task duplication minimizes the cost of communication; that is, the higher the CCR value, the better the performance of EFTD scheduling.


Figure 9. The relationship between the number of nodes and speedup.

Figure 10. The relationship between the communication calculation rate (CCR) and normalized scheduling length (NSL).

Figure 11. The relationship between the communication calculation rate (CCR) and speedup.


6. CONCLUSIONS AND FUTURE WORK

In this paper, based on fuzzy theory, we present a clustering method to preprocess the cloud resources before scheduling. For heterogeneous cloud computing environments, a new DAG based scheduling algorithm called the EFTD algorithm is presented, which combines HEFT scheduling with a task duplication scheduling scheme. EFTD attempts to insert suitable immediate parent nodes of the currently selected node in order to reduce its waiting time on the processor. The case study and experimental results illustrate that the proposed algorithm achieves better performance than the popular HEFT algorithms in terms of both NSL and speedup.


In future work, we intend to deal with dynamic task scheduling at run time. In addition, we plan to develop a dynamic scheduling methodology and ultimately implement it in our real cloud computing system for practical evaluation instead of CloudSim.

ACKNOWLEDGEMENTS

This work is supported by the National Science Foundation for Distinguished Young Scholars of China under grant no. 61225010; NSFC under grant nos. 61370198, 61370199, 61300187, 61173160, 61173161, 61173162, 61173165, and 61103234; the Program for New Century Excellent Talents in University (NCET-10-0095) of the Ministry of Education of China; the Fundamental Research Funds for the Central Universities under grant nos. 3132014215 and 2012QN029; the China Scholarship Council Program; and the Project of College Students' Innovative and Entrepreneurial Training Program of China under grant nos. 2011022, 2011003, 201211258013, and 201211258014.

REFERENCES

1. Moschakis IA, Karatza HD. Evaluation of gang scheduling performance and cost in a cloud computing system. The Journal of Supercomputing 2012; 59(2):975–992.
2. Erdil DC, Lewis MJ. Dynamic grid load sharing with adaptive dissemination protocols. The Journal of Supercomputing 2012; 59(3):1139–1166.
3. Bittencourt LF, Madeira ERM, Da Fonseca NLS. Scheduling in hybrid clouds. IEEE Communications Magazine 2012; 50(9):42–47.
4. Chen XJ, Zhang J, Li J. Resource management framework for collaborative computing systems over multiple virtual machines. Service Oriented Computing and Applications 2011; 5(4):225–243.
5. Saovapakhiran B, Devetsikiotis M, Michailidis G, Viniotis Y. Average delay SLAs in Cloud computing. 2012 IEEE International Conference on Communications, 2012; 1302–1308.
6. Zheng Q, Tham C-K, Veeravalli B. Dynamic load balancing and pricing in grid computing with communication delay. Journal of Grid Computing 2008; 6(3):239–253.
7. Tomás L, Caminero AC, Caminero B, Carrion C. A strategy to improve resource utilization in grids based on network-aware meta-scheduling in advance. 12th IEEE/ACM International Conference on Grid Computing (GRID 2011); 50–57.
8. Dorn C, Dustdar S. Weighted fuzzy clustering for capability-driven service aggregation. Service Oriented Computing and Applications 2012; 6(2):83–98.
9. Bhatia M. RR based grid scheduling algorithm. Proceedings of the International Conference on Advances in Computing and Artificial Intelligence (ACAI 2011); 120–123.
10. Dümmler J, Kunis R, Rünger G. SEParAT: scheduling support environment for parallel application task graphs. Cluster Computing 2012; 15(3):223–238.
11. Sotomayor B, Montero RS, Llorente IM, Foster IT. Virtual infrastructure management in private and hybrid clouds. IEEE Internet Computing 2009; 13(5):4–22.
12. Li K, Shen H, Chin FYL, Zhang W. Multimedia object placement for transparent data replication. IEEE Transactions on Parallel and Distributed Systems 2007; 18(2):212–224.
13. Li H, Lin K, Li K. Energy-efficient and high-accuracy secure data aggregation in wireless sensor networks. Computer Communications 2011; 34(4):591–597.
14. Bittencourt LF, Sakellariou R, Madeira ERM. DAG scheduling using a lookahead variant of the heterogeneous earliest finish time algorithm. PDP 2010; 27–34.
15. Juve G, Deelman E. Scientific workflows and clouds. Crossroads 2010; 16(3):14–18.
16. Jeong H-Y, Park JH. An efficient cloud storage model for cloud computing environment. GPC 2012; 370–376.
17. Kwok Y-K, Ahmad I. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Computing Surveys 1999; 31(4):406–471.
18. Topcuoglu H, Hariri S, Wu MY. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems 2002; 13(3):260–274.
19. Abirami SP, Ramanathan S. Linear scheduling strategy for resource allocation in cloud environment. International Journal on Cloud Computing: Services and Architecture (IJCCSA) 2012; 2(1):9–17.
20. Kaur S, Verma A. An efficient approach to genetic algorithm for task scheduling in cloud computing environment. International Journal of Information Technology and Computer Science 2012; 4(10):74–79.
21. Maguluri ST, Srikant R, Ying L. Stochastic models of load balancing and scheduling in cloud computing clusters. INFOCOM 2012; 702–710.
22. Wu Y, Min G, Li K, Javadi B. Modeling and analysis of communication networks in multicluster systems under spatio-temporal bursty traffic. IEEE Transactions on Parallel and Distributed Systems 2012; 23(5):902–912.
23. Liu Z, Qin T, Qu W, Liu W. DAG cluster scheduling algorithm for grid computing. IEEE 14th International Conference on Computational Science and Engineering (CSE 2011); 632–636.
24. Salehi MA, Buyya R. Adapting market-oriented scheduling policies for cloud computing. ICA3PP (1) 2010; 351–362.
25. Chang R-S, Chang J-S, Lin S-Y. Job scheduling and data replication on data grids. Future Generation Computer Systems 2007; 23(7):846–860.
26. Cao H, Jin H, Wu X, Wu S, Shi X. DAGMap: efficient and dependable scheduling of DAG workflow job in Grid. The Journal of Supercomputing 2010; 51(2):201–223.
27. Saovapakhiran B, Michailidis G, Devetsikiotis M. Aggregated-DAG scheduling for job flow maximization in heterogeneous Cloud computing. GLOBECOM 2011; 5–9.
28. Lin C, Lu S. Scheduling scientific workflows elastically for cloud computing. IEEE CLOUD 2011; 746–747.
29. Li K, Shen H, Chin FYL, Zheng S-Q. Optimal methods for coordinated enroute web caching for tree networks. ACM Transactions on Internet Technology 2005; 5(3):480–507.
30. Li K, Mu Y, Li K, Min G. Exchanged crossed cube: a novel interconnection network for parallel computation. IEEE Transactions on Parallel and Distributed Systems 2012. doi:10.1109/TPDS.2012.330.
31. Shin KS, Cha MJ, Jang MS. Task scheduling algorithm using minimized duplication in homogeneous systems. Journal of Parallel and Distributed Computing 2008; 68(8):1146–1156.
32. Cao H, Jin H, Wu X, Wu S, Shi X. DAGMap: efficient and dependable scheduling of DAG workflow job in Grid. The Journal of Supercomputing 2010; 5(2):201–223.
33. Calheiros RN, Ranjan R, Beloglazov A, Rose CAFD, Buyya R. CloudSim: a toolkit for modeling and simulation of Cloud computing environments and evaluation of resource provisioning algorithms. Software: Practice and Experience 2011; 41(1):23–50.
34. Beloglazov A, Buyya R. Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in Cloud data centers. Concurrency and Computation: Practice and Experience 2012; 24(13):1397–1420.
