

Proposed algorithm for Optimal Resource Allocation for Cloud Computing Environments

Mohamed Fakhri, Naglaa Sayed Abdelrehem, Fathi Ahmed Amer

Abstract:

Cloud computing delivers shared software, infrastructure, virtual machines (VMs), and other resources as services. It is a utility computing model in which customers obtain the infrastructure they need without buying it and pay only for what they use. Scheduling in cloud computing is the strategy used to decide the most appropriate assignment of the available tasks or jobs to resources. Many cloud scheduling algorithms coordinate tasks with suitable resources so that the resources are used as efficiently as possible with respect to measurement factors such as minimum time, minimum cost, minimum delay, or maximum resource utilization. In this paper, we propose a cloud scheduling technique that works as a three-stage strategy. In the first stage, a job classifier classifies the tasks, which helps to pre-create different types of virtual machines; this saves the time needed to create virtual machines during scheduling and decreases the task scheduling failure rate. In the second stage, we sort the tasks in ascending order of length priority and mark as successful the state of each virtual machine that satisfies the deadline constraint. In the third stage, we match the tasks dynamically to their corresponding virtual machines with the minimum completion time. We compare the algorithm with the Min-Min and Max-Min algorithms using evaluation factors such as the makespan, the average waiting time, the task scheduling failure rate, and the virtual machine utilization rate, with the aim of improving scheduling performance and the load balancing of cloud resources.

Keywords—Cloud computing, scheduling, virtual machines (VMs), makespan, waiting time, completion time, resource utilization.

1. Introduction

Cloud computing is a technology that allows resources, services, or infrastructure to be shared over the internet on demand under a pay-per-use policy. The customer can use storage space, operating systems, servers, processing


capabilities, and any application development requirements. Service providers schedule tasks according to the users' different needs [9], [16]. Cloud service providers allow users to use resources such as memory, storage space, and bandwidth, depending on what is available in the cloud. Tasks with different Quality of Service (QoS) requirements are scheduled in different cloud environments [7], [9]. Cloud computing has grown exponentially in both the research and business communities. In recent years, virtualization technologies and other modern advances have made the cloud more popular [20]. Data centers receive many different cloud applications that request specific services on a pay-per-use basis. Because cloud resources are limited and differ in functionality and capacity, cloud scheduling has become a challenging process [20]. Resource allocation and scheduling affect the performance of networked, distributed, parallel, and cloud computing systems alike. Researchers have suggested a large number of algorithms to find the most effective way to deploy cloud resources. The cloud scheduling process can be generalized into three steps: discovering and filtering resources, selecting suitable resources, and submitting tasks to the selected resources [10]. The main point is to find the most suitable way to allocate users' tasks to matching cloud resources, so as to increase the cloud service provider's profit and improve the quality of service for all tasks [21]. Scheduling tasks means deploying cloud resources in a reasonable way that satisfies the user's requirements and increases the service provider's economic benefits.
Existing task scheduling schemes for the cloud pursue different goals. Some enhance the quality of service (QoS) for user tasks [25], some optimize the deadline guarantee [22], [21], [6], and others improve throughput [8], [23], optimize the makespan [20], or minimize cost [24]. A load-balanced scheduling model balances the load received from several users across the data centers [15]. For service providers, obtaining the best service level agreement and energy benefits is also an urgent issue that must be fully considered [4], [5]. This paper is organized as follows. Section 2 presents related work. The proposed enhanced algorithm is detailed in Section 3. Section 4 compares the proposed scheme with other algorithms. Section 5 gives the conclusion and future work.

2. Related work

PeiYun Zhang and MengChu Zhou proposed a cloud scheduling technique that works as a two-stage strategy. In the first stage, a job classifier classifies tasks based on historical data, which helps to pre-create various types of virtual machines. This saves the time needed to create them during scheduling and decreases the task scheduling failure rate. In the second stage, the tasks are matched dynamically to their corresponding virtual machines. They


compared the algorithm with the Min-Min and Max-Min algorithms using evaluation factors such as the makespan, average waiting time, task scheduling failure rate, and virtual machine utilization rate; their algorithm improved scheduling performance and the load balancing of cloud resources [1]. Indukuri R. Krishnam Raju and G. Jose Moses proposed a cloud scheduling technique that works as a two-stage strategy. Their work is built on scheduling virtual machines for all the jobs requested by customers, where each job has to pass through two virtual machines sequentially to complete its task. They treated the virtual machines as cloud resources to be matched to the required jobs according to their response and waiting times. They compared their algorithm with the First Come First Serve and Shortest Job First algorithms using evaluation factors such as the deadline and the average turnaround time, and their algorithm enhanced scheduling performance by decreasing these metrics [2]. Mokhtar A. Alworafi and Suresha Mallappa proposed a cloud scheduling technique to enhance resource utilization and decrease the average makespan. They sort the tasks in ascending order of length priority and mark as successful the state of each virtual machine that satisfies the deadline constraint. They then match each task to the suitable virtual machine with the minimum processing time. They compared their algorithm with Min-Min, GA, SJF, and Round Robin; their algorithm maximized resource utilization and the task guarantee ratio and reduced the average response time, the number of violations, the violation ratio, the failure ratio, and the makespan [3].
Xinqian and Rajkumar proposed a strategy for assigning virtual machines that improves the energy efficiency of cloud data centers by consolidating more of the reserved virtual machines to obtain the best energy consumption rate. They evaluated their algorithm with the CloudSim simulator and in a real cloud environment. They found that an optimal and fast allocation can be achieved for a class of reserved virtual machines, and that more virtual machines can be merged onto fewer physical machines. Compared with state-of-the-art methods, this improves the total profit and energy consumption by about 24% and 41%, respectively, and enables the cloud data center to serve more requests [4]. Zhou Zhou and Zhigang Hue defined two energy-aware techniques that increase energy efficiency and decrease SLA violations in cloud data centers. They take the application type, memory resources, and CPU into account during virtual machine deployment. Their experiments showed that the algorithm reduces both the data center's energy consumption and the SLA violation ratio [5]. SaeMi Shin and SuKyoung Lee defined a scheduling technique that extends the conservative backfilling algorithm to take advantage of the EDF and LWF algorithms. They receive all the data center jobs, sort them so that jobs with high priority are served first, and then choose the largest backfill job that satisfies the deadline guarantee constraint. Their simulations showed that the algorithm improves scheduling performance by increasing resource utilization and the deadline guarantee ratio [6]. Niloofar Khanghahi and Reza Ravanmehr used different policies for scheduling, simulation, and evaluation depending on the users, data centers, and geographical regions. In each of three scenarios, they kept two of these parameters fixed and changed the third parameter


value to measure the change in total cost and in minimum or maximum processing time. They concluded that performance evaluation of cloud computing still has shortcomings, so additional measures such as delay or service level agreements are required to achieve more accurate evaluation in the future [7]. Atul Vikas Lakra and Dharmendra Kumar Yadav surveyed various task scheduling algorithms for cloud computing environments. All the algorithms are efficient in one way or another. The existing algorithms showed enhanced load balancing, minimized makespan, energy efficiency, quality of service, consistency, maximum resource utilization, effective implementation, fairness among tasks, high profits, and bandwidth utilization over the cloud. However, no scheduling algorithm achieves all of these together, so none of them can be 100% efficient [8]. M. Vijayalakshmi and V. Venkatesa Kumar reviewed four scheduling algorithms: Minimum Completion Time, Round Robin, Random Resource Selection, and Opportunistic Load Balancing. They evaluated their performance using metrics such as makespan, throughput, and total cost. The Round Robin algorithm had a lower cost than the Minimum Completion Time and Opportunistic Load Balancing algorithms. The Random algorithm was the best of all in terms of total cost. As the number of jobs increased, the Random and Round Robin algorithms had the same cost [9]. Dr. Amit Agarwal and Saloni Jain proposed a Generalized Priority Algorithm in which the priority is defined according to user demands. Comparing its execution time with the FCFS and RR algorithms, they found that their algorithm has the minimum execution time [10]. Xiaoping Li and Rubén Ruiz proposed a task scheduling algorithm that combines the tasks into two groups (deadline-based and cost-based).
They arranged the tasks in ascending order of deadline and arranged the cost in descending order of task length. They used different queues with different priorities to execute the tasks, and combined the priority scheduling algorithm with the RR algorithm to improve execution time and throughput [11].

Mehwish Awan and Munam Ali Shah [27] proposed a multi-objective task scheduling algorithm that maps a group of tasks received by the broker to the list of received virtual machines. They reduced the execution time of the workload to the minimum optimized time and compared their algorithm with the FCFS and priority scheduling algorithms. Other QoS parameters could be used to achieve further optimization [13]. Babur Hayat Malik and Javaria Khalid compared several task scheduling algorithms. Their comparison is summarized in Table 1.

Table 1: Comparison of seven related scheduling algorithms.

| Algorithm | Methodology | Parameters | Advantages | Disadvantages |
|---|---|---|---|---|
| First Come First Serve | Manages task scheduling with a FIFO queue. The task that arrives first is executed first on the VM. | 1. Arrival time | 1. Simple and fast execution | 1. Task scheduling is based on arrival time and does not consider any other criteria. 2. Less utilization of VMs. |
| Min-Min Algorithm | Selects the task with the minimum execution time among all tasks. | 1. Makespan | 1. Better makespan | 1. Load imbalance. 2. Poor QoS. |
| Genetic Algorithm | Requires a representation of the solution domain and a suitable function to evaluate the solution domain. | 1. Population size. 2. Crossover probability. 3. Mutation probability. | 1. Can solve mathematical and financial problems more accurately. 2. Concepts are easy to understand. 3. Some applications require less processing time. | 1. Works very slowly. 2. Cannot find exact solutions. 3. The selection method must be appropriate. |
| Greedy Algorithm | Tries to find the global optimum by following a problem-solving heuristic for choosing every step. | 1. Parameter µ. 2. Domain D. 3. Population n. | 1. Easy to implement. 2. Needs fewer resources. 3. Execution is very fast. 4. Scheduling is very fast. | 1. Does not reach a globally optimal solution. 2. Parameters are very difficult to change. |
| Priority-Based Job Scheduling Algorithm | Dependency mode | 1. Priority of each queue | 1. The priority of a process increases with time. 2. Easy to use and user-friendly. 3. Best for applications that require time and resources. 4. Less finish time. | 1. Jobs with the lowest priority are lost when the system crashes. 2. Starvation for the resources they need. |
| Round Robin | Works cyclically: each task has an equal chance to be chosen and receives an equally small unit of time for execution. | 1. Arrival time. 2. Time slice. | 1. Good response time. 2. Balanced load. 3. Less complex. | 1. Pre-emption forces a process out once its time slice expires. |
| Particle Swarm Optimization | Uses a population to find the optimal minimum values that help create a correct order of tasks and schedule each task to a suitable resource. | 1. Inertia. 2. C1, C2 constants. | 1. High utilization of resources. 2. Finds the optimal solution. 3. Minimizes processing time. | 1. Slow convergence if the search space is large. |


3. Proposed algorithm

In this section, we improve the two-stage scheduling scheme of PeiYun Zhang and MengChu Zhou [1] and propose a cloud scheduling technique that works as a three-stage strategy. In the first stage, a job classifier classifies tasks based on historical data and the current state of the cloud environment, which helps to pre-create various types of virtual machines. This saves the time needed to create virtual machines during scheduling and decreases the task scheduling failure rate. In the second stage, we build on the work of Alworafi and Mallappa [3], who proposed a cloud scheduling technique to enhance resource utilization and decrease the average makespan. We sort the tasks in ascending order of length priority, mark as successful the state of each virtual machine that satisfies the deadline constraint so that scheduling can proceed, and discard the unsuccessful ones. In the third stage, we match the tasks dynamically to their corresponding virtual machines with the minimum completion time. We compare the algorithm with the Min-Min and Max-Min algorithms using evaluation factors such as the makespan, average waiting time, task scheduling failure rate, and virtual machine utilization rate, which reflect scheduling performance and the load balancing of cloud resources.

3.1 The Scheduling Framework

Assume we have a VM set V = {1, 2, ..., N}, where vi, i ∈ {1, 2, ..., N}, denotes VM number i. Each vi is defined by four attributes vi(a), a ∈ {1, 2, 3, 4}, which represent the CPU resources (such as the CPU clock speed), the memory resources, the network bandwidth, and the hard disk storage, respectively, so that vi = {vi(1), vi(2), vi(3), vi(4)}. Assume the users provide a set of tasks T = {1, 2, ..., M}, where tj, j ∈ {1, 2, ..., M}, denotes task number j. Task j is defined by the attributes tj = {tj(id), tj(r), tj(d), tj(p), tj(L)}, where:

1. tj(id): the unique ID of task j.
2. tj(r): the requirements of task j, where tj(r) = {tj1, tj2, tj3, tj4} specifies the requirements for CPU, memory, network bandwidth, and hard disk storage.
3. tj(d): the deadline of task j. If the deadline of tj is violated, the task fails to be scheduled.
4. tj(p): the priority of task j. If tj is an urgent job or the job of a high-paying user, it is a high-priority task; otherwise it is a regular job.
5. tj(L): the length of task j.
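The VM and task attributes above map naturally onto two small record types. The following Python sketch is only an illustration; the class and field names (`VM`, `Task`, `cpu`, `requirements`, and so on) are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VM:
    """v_i = {v_i(1), v_i(2), v_i(3), v_i(4)}."""
    cpu: float        # v_i(1): CPU resources, e.g. clock speed
    memory: float     # v_i(2): memory resources
    bandwidth: float  # v_i(3): network bandwidth
    storage: float    # v_i(4): hard disk storage

    def attributes(self) -> Tuple[float, float, float, float]:
        return (self.cpu, self.memory, self.bandwidth, self.storage)


@dataclass(frozen=True)
class Task:
    """t_j = {t_j(id), t_j(r), t_j(d), t_j(p), t_j(L)}."""
    task_id: int
    requirements: Tuple[float, float, float, float]  # t_j(r): CPU, memory, bandwidth, storage
    deadline: float   # t_j(d)
    priority: bool    # t_j(p): True for urgent or high-paying jobs
    length: float     # t_j(L)


# Illustrative values only.
vm = VM(cpu=10, memory=10, bandwidth=50, storage=10)
task = Task(task_id=1, requirements=(10, 10, 48, 10),
            deadline=100.0, priority=True, length=500)
```

Keeping the four resource attributes in a fixed order lets a task's requirements be compared with a VM's attributes position by position.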

In our model, the cloud holds a set of hosts H. Let Hk, k ∈ {1, 2, ..., K}, denote host number k, which can create Gk VMs, i.e., Hk = {vnk | n ∈ {1, 2, ..., Gk}}, where vnk denotes VM number n of host number k. VMs are generated from Hk by virtualization. Task scheduling is defined as a function that maps tasks to VMs, f : T → V. After successful scheduling, each task is scheduled and executed at a suitable VM. To reduce the complexity of the mapping, we match tasks with VM types rather than with individual VMs. Assume a set of VM types type = {1, 2, ..., L}, where L is the number of VM types. Each successfully created VM has a matching VM type. The task classifier then realizes the mapping from tasks to VM types. This reduces the difficulty of task scheduling because the number of VM types can be much smaller than the number of VMs.

Assume a data center DC with a set of servers {S1, S2, ..., Ss}, where Si = {Vi1, Vi2, ..., ViN} is the set of virtual machines on server Si. Each virtual machine has a speed Vs, measured in million instructions per second (MIPS), and each task has a length TL, given by its number of instructions.

The deadline tj(d) of task j is defined as its execution time (ET), which is calculated as follows:

$$ET = \frac{TL}{V_s} \tag{1}$$

The expected execution time (EET) of the task on each VM is computed and compared with the deadline. If a VM meets the deadline, its state is marked as successful and the task continues through the scheduling sequence; otherwise its state is marked as unsuccessful:

$$\text{state}(t_j, v_i) = \begin{cases} \text{successful}, & EET(t_j, v_i) \le t_j(d) \\ \text{unsuccessful}, & \text{otherwise} \end{cases} \tag{2}$$


The task guarantee ratio (Gr) at the hosts is calculated as follows:

∑ ∑ ∑

The task guarantee ratio (Gr) of a given host Hk over its VMs is calculated as follows:

G ∑ ∑

There are two types of tasks. Priority tasks serve urgent and VIP users or users who pay a high price. Ordinary tasks serve ordinary users and noncritical jobs. The priority task guarantee ratio (P) at the hosts is calculated as follows:

P(H) =

∑ ∑ ∑

∑

The priority task guarantee ratio for a given host Hk, at VMi can be calculated as follows:

P(k) =∑ ∑

∑

3.2 Model Description

We define our cloud task scheduling technique as having the following main modules:

1) Task Classifier: the proposed task classifier consists of three stages with three functions. The first stage determines the VM types and their quantities based on historical task scheduling information. The second stage sorts the tasks in ascending order of length priority, marks as successful the state of each virtual machine that satisfies the deadline constraint, and finds the VM with the minimum completion time to match with. The third stage classifies a submitted task and matches it to the most appropriate VM type with the minimum completion time. Once a task is matched, we call it “marked”, and its type is defined as the type of its matched VM. The algorithms used for creating the VM types and for matching a job with a VM type are presented in the next section (Algorithms 1, 2, and 3).

2) Task Sorter: arranges the tasks in ascending order of length priority and marks as successful the state of each virtual machine that satisfies the deadline constraint (Algorithm 2).

3) Task Matcher: matches the tasks to suitable concrete VMs according to the task classifier results. If a task has more than one matched VM, the completion time of the task is computed at each matched VM and the VM with the minimum completion time is selected. This part is implemented by Algorithm 4.

4) Ready Queue: tasks are pushed into the ready queue when idle VMs meet their requirements, so that they can be executed directly. This part is implemented by Algorithm 4.

5) Waiting Queue: tasks are pushed into the waiting queue when no idle VM meets their requirements, and they wait there for suitable VMs during scheduling.


As discussed above, the proposed model consists of three stages:

i. The first stage has the following steps:
1. Get historical data from the historical task scheduling database.
2. Use the task classifier to classify tasks.
3. Store the classified tasks in a database to be used later as historical information.
4. Create a set of VM types.
5. Create a suitable number of VMs of the various types at the hosts.

Input: historical scheduling data stored in (DB)
Output: a set of VM types known as (Types)
1. Types ← ∅;
2. T' ← Data processing of DB;
3. L ← Task types number of T';
4. For i = 1 to L
5.   compute P(T'i);
6. End For
7. K ← TopK(P(T'i));
8. For i = 1 to K
9.   Types ← (create a set of VM types based on the value of P(T'i));
10.   vi ← create VMi;
11. End For
12. Return Types;

Algorithm 1: Create VM Types and Create VMs at Hosts
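A minimal Python sketch of the idea behind Algorithm 1 follows. It assumes that P(T'i) is the share of type-i tasks in the historical data and that VMs are pre-created in proportion to that share. The proportional split, the function names, and the `vm_budget` parameter are illustrative assumptions and are not part of the original algorithm.

```python
from collections import Counter


def create_vm_types(history, top_k):
    """Return the top_k most frequent task types with their share of the history.

    history: iterable of task-type labels taken from past scheduling records
    top_k:   how many VM types to pre-create (hypothetical parameter)
    """
    counts = Counter(history)
    total = sum(counts.values())
    if total == 0:
        return []
    ratios = {task_type: n / total for task_type, n in counts.items()}
    # Keep the K task types with the largest share of the historical workload.
    ranked = sorted(ratios.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def plan_vm_counts(types, vm_budget):
    """Split a VM budget across the selected types in proportion to their share."""
    share_sum = sum(ratio for _, ratio in types)
    return {task_type: max(1, round(vm_budget * ratio / share_sum))
            for task_type, ratio in types}


# Hypothetical history with three task types.
history = ["small"] * 100 + ["medium"] * 400 + ["large"] * 900
vm_types = create_vm_types(history, top_k=2)
vm_plan = plan_vm_counts(vm_types, vm_budget=10)
print(vm_types)
print(vm_plan)
```

With this sample history the two most frequent task types are selected, and most of the VM budget goes to the type that dominates the historical workload.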

ii. The second stage has the following steps:
1. Get an initial task set from the user.
2. Sort the tasks in ascending order of length priority.
3. Calculate the expected execution time (EET) of each task.
4. Compare the EET of each task with its deadline.
5. If EET <= deadline, mark it as a successful task.
6. Otherwise, mark it as an unsuccessful task and put it in the failure queue.

Input: an initial task set (t)
Output: a set of successful tasks that meet the deadline constraint (T)
1. Sort the tasks tj in ascending order based on tj(L).
2. For j = 1 to M // M is the number of tasks
   // Check the VM state:
3.   For i = 1 to N // for each VM resource
4.     Calculate the expected execution time (EET) of the task from the ET calculated above.
5.     If (EET(tj) <= tj(d))
6.       VM state: successful
7.       Count++
8.     Else
9.       VM state: unsuccessful
10.       failure ← tj // put task j into the failure queue
11.     End If
12.   End For
13. End For
14. Return T

Algorithm 2: Mark the VM state for incoming tasks.

iii. The third stage has the following steps:
1. Users send tasks dynamically to the system.
2. The tasks are stored in a task queue.
3. The task classifier takes the urgent tasks first, then the regular tasks, from the task queue.
4. The task classifier returns the information on the VM types pre-created in the first stage.
5. Each task is mapped to a suitable VM type.
6. If there are several matched VMs, the one with the minimum completion time is found.
7. The marked tasks are passed to a matcher, which matches them with specific VMs of suitable types.
8. Each matched pair <task, VM> is pushed into the ready queue.
9. The tasks are scheduled and executed at the appropriate VMs.
10. The scheduling information is returned to the task scheduler.
11. The scheduling information is stored as historical information.
12. The task execution results are returned to the users.

Input: a successful task set (T) from Algorithm 2, (Types) from Algorithm 1
Output: information of marked tasks Tinfo
1. Tinfo ← ∅;
2. K ← |Types|;
3. T ← Sort the tasks in T by tj(p) in descending order;
4. For j = 1 to T.sizeof()
5.   tj ← {tj(id), tj(r), tj(d), tj(p)};
6.   δ ← 0;
7.   For i = 1 to K
8.     P(Yi|tj) ← ∏ P(tj(a)|vi(a)), a = 1, ..., 4;
9.     If ((i', j') == max{P(Yi|tj)} & tj can be executed before tj(d))
10.       Tinfo(j) ← {Yi', tj', δ}; // there is only one matched VM
11.       δ ← δ + 1;
12.     End If
13.     If (δ > 1) // there are many matched VMs
14.       Find the min estimated CT among the successful matched VMs
15.     End If
16.   End For
17.   If (δ == 0)
18.     Yx ← (create a proper VM type);
19.     Tinfo(j) ← <Yx, tj, δ>;
20.   End If
21. End For
22. Return Tinfo;

Algorithm 3: Map Tasks to the VM Types

Some additional operations are needed:

A. Store the results of successful and unsuccessful task scheduling in a database as historical task information.
B. Push tasks in the ready queue back into the waiting queue when they do not satisfy the scheduling conditions.
C. Push tasks in the waiting queue into the ready queue when suitable VMs are available and the ready queue has space.
D. Store the information of <tj, vi> and the execution time in the historical scheduling database when the tasks are scheduled successfully.

Input: information about marked tasks that comes from Algorithm 3 (Tinfo).
Output: matched pairs of VMs and tasks (matched).
1. n ← Tinfo.sizeof();
2. For j = 1 to n & each Tinfo(j) ∈ Tinfo
3.   type ← Tinfo(j).type;
4.   tj ← Tinfo(j).TASK;
5.   δ ← Tinfo(j).δ;
6.   If (there is a free VM i of type Yi)
7.     status ← 1;
8.   End If
9.   If (the ready sequence is not full)
10.     ready ← 1;
11.   End If
12.   If (status == 1 & δ == 1 & ready == 1)
13.     ready sequence ← <vi, tj>;
14.   End If
15.   If ((status == 0 & δ >= 1 & ready == 1) || (δ == 0))
16.     create a new VM i of type Yi;
17.     vi ← {vi(1), vi(2), vi(3), vi(4)};
18.     ready sequence ← <vi, tj>;
19.   End If
20.   Update the task scheduling information;
21.   matched ← <tj, vi>;
22. End For
23. Return matched;

Algorithm 4: Match the Tasks with VMs

3.3 The Scheduling Algorithm

We design the task scheduling algorithm as in [1], following the proposed three-stage strategy:

1. First Stage: Task Classification

Let T' be a historical task set extracted from the historical task scheduling database DB of a cloud computing system, and let T'y be the set of type-y tasks. The ratio P(T'y) is given by:

$$P(T'_y) = \frac{|T'_y|}{|T'|} \tag{3}$$

We define the set of historical tasks processed by a type-y VM as βy = {i : task i is a type-y task processed by a type-y VM}.

Table 2

| Task type | Task requirements | Task deadline | Task count |
|---|---|---|---|
| 1 | 1M memory, 100 floating point operations | Submission time + 100 ms | 100 |
| 2 | … operations | … | 400 |
| 3 | … operations | … | 900 |

Assume we have historical examples of three task types, as given in Table 2. For task type 1, |T'1| = 100 and |T'| = 100 + 400 + 900 = 1400. Substituting into equation (3) gives P(T'1) = 100/1400 ≈ 0.071.

We can therefore pre-create a suitable number of VMs of the defined VM types to save task scheduling time, as in Algorithm 1. Given VM i and task j, we have their attributes tj(r) = {tj(a) | a ∈ {1, 2, 3, 4}} and vi = {vi(a) | a ∈ {1, 2, 3, 4}}. The matching degree P(tj(a)|vi(a)) is computed as follows:

$$P(t_j(a) \mid v_i(a)) = \begin{cases} \dfrac{V_{max}(a) - v_i(a) + t_j(a)}{V_{max}(a)}, & t_j(a) \le v_i(a) \\[2ex] \left(\dfrac{v_i(a)}{t_j(a)}\right)^2, & t_j(a) > v_i(a) \end{cases} \tag{4}$$

where $V_{max}(a) = \max_{k \in type} v_k(a)$ and k denotes the type-k virtual machine. In equation (4), a = 1 denotes the CPU resources, a = 2 the memory resources, a = 3 the network bandwidth, and a = 4 the hard disk storage. Because vi(a) > 0, the matching degree is always greater than 0. Assume that vi(1) = 13, vk(1) = 8, Vmax(1) = 100, and tj(1) = 9. Substituting into equation (4) gives the following matching degrees:

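$$P(t_j(1) \mid v_i(1)) = \frac{100 - 13 + 9}{100} = 0.96$$

$$P(t_j(1) \mid v_k(1)) = \left(\frac{8}{9}\right)^2 \approx 0.79$$

$$P(t_j(1) \mid V_{max}(1)) = \frac{100 - 100 + 9}{100} = 0.09 \approx 0.1$$

From these results, 0.96 > 0.79 > 0.1, which means that tj(1) matches vi(1) best. Note that the match is perfect when tj(a) = vi(a), that is, when P(tj(a)|vi(a)) = 1.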
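The single-attribute comparison can be checked with a few lines of Python. The sketch assumes the piecewise reading of equation (4) in which a VM that covers the requirement scores (Vmax − v + t)/Vmax and a VM that falls short scores (v/t)²; the function name `matching_degree` is hypothetical.

```python
def matching_degree(t, v, v_max):
    """Matching degree of one task requirement t against one VM attribute v.

    Assumed reading of equation (4): a VM that covers the requirement is
    scored by how little surplus it leaves, and a VM that falls short is
    scored by the squared ratio of what it offers to what is required.
    """
    if t <= v:
        return (v_max - v + t) / v_max
    return (v / t) ** 2


t_j, v_max = 9, 100
larger = matching_degree(t_j, 13, v_max)      # VM with some spare capacity
smaller = matching_degree(t_j, 8, v_max)      # VM that falls short of the requirement
largest = matching_degree(t_j, v_max, v_max)  # the largest VM type
print(round(larger, 2), round(smaller, 2), round(largest, 2))
```

Under this reading, a slightly oversized VM scores higher than an undersized one, and a heavily oversized VM scores lowest because most of its capacity would be wasted.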

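Table 3

| VM type | CPU | Memory | Network bandwidth | Hard disk storage | VM count |
|---|---|---|---|---|---|
| 1 | 10 MHz | 10M | 50 Kbps | 10M | 7 |
| 2 | 90 MHz | 90M | 80 Kbps | 90M | 8 |
| 3 | 200 MHz | 200M | 100 Kbps | 200M | 10 |
| 4 | 500 MHz | 500M | 270 Kbps | 500M | 5 |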

Table 3 gives an example of four VM types and their attributes. Each VM type has a specified number of VMs; for example, VM type 1 has seven VMs with the four attributes shown. Let Yi denote VM type i. For task j, the probability that task j belongs to type Yi is computed with a Bayes classifier as follows:

$$P(Y_i \mid t_j) = \prod_{a=1}^{4} P(t_j(a) \mid v_i(a)) \tag{5}$$

The task classifier is designed around the attributes of a task. If the requirements of task j match the attributes of a VM type perfectly, i.e., P(tj(a)|vi(a)) = 1 for all a ∈ {1, 2, 3, 4}, or P(Yi|tj) = 1, then task j selects a type-i VM. When there is no perfect match, for example when one VM has a larger capacity and another a smaller one, the classifier prefers the VM with the larger capacity, as mentioned before. The best matching pair (tj', Yi') is given by the decision function of task j used in Algorithm 3:

$$(i', j') = \arg\max P(Y_i \mid t_j) \tag{6}$$
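The classification step can be sketched in Python as follows. The sketch takes the VM attributes from Table 3 as printed and assumes the piecewise form of the matching degree in which a sufficient VM scores (Vmax − v + t)/Vmax and an insufficient one scores (v/t)². The names `VM_TYPES`, `type_score`, and `classify` are hypothetical, and the numeric scores depend entirely on the attribute values used.

```python
from math import prod

# VM types from Table 3: (CPU in MHz, memory in M, bandwidth in Kbps, disk in M).
VM_TYPES = {
    1: (10, 10, 50, 10),
    2: (90, 90, 80, 90),
    3: (200, 200, 100, 200),
    4: (500, 500, 270, 500),
}
# V_max(a): the largest value of attribute a over all VM types.
V_MAX = tuple(max(v[a] for v in VM_TYPES.values()) for a in range(4))


def matching_degree(t, v, v_max):
    # Assumed form of equation (4).
    return (v_max - v + t) / v_max if t <= v else (v / t) ** 2


def type_score(task_req, vm_attrs):
    # Equation (5): product of the four per-attribute matching degrees.
    return prod(matching_degree(t, v, m)
                for t, v, m in zip(task_req, vm_attrs, V_MAX))


def classify(task_req):
    # Equation (6): pick the VM type with the highest score.
    scores = {y: type_score(task_req, attrs) for y, attrs in VM_TYPES.items()}
    return max(scores, key=scores.get), scores


best_type, scores = classify((10, 10, 48, 10))
print(best_type, {y: round(s, 3) for y, s in scores.items()})
```

A task with small requirements is assigned to the smallest VM type that covers them, because every larger type is penalized for its unused capacity.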

Assume that tj(1) = 10, tj(2) = 10, tj(3) = 48, and tj(4) = 10. According to Table 3, tj(a) = v1(a), so equation (4) gives P(tj(a)|v1(a)) = 1 for all a ∈ {1, 2, 3, 4}. We use these values to calculate

$$P(Y_1 \mid t_j) = \prod_{a=1}^{4} P(t_j(a) \mid v_1(a)) = 1.$$

When i ∈ {2, 3, 4}, tj(a) < vi(a), so we compute:

$$P(Y_2 \mid t_j) = \prod_{a=1}^{4} P(t_j(a) \mid v_2(a)) = \frac{400 - 100 + 10}{400} \cdot \frac{400 - 100 + 10}{400} \cdot \frac{256 - 96 + 48}{256} \cdot \frac{400 - 100 + 10}{400} = 0.378.$$

In the same way, we compute:

$$P(Y_3 \mid t_j) = \prod_{a=1}^{4} P(t_j(a) \mid v_3(a)) = 0.100$$

and

$$P(Y_4 \mid t_j) = \prod_{a=1}^{4} P(t_j(a) \mid v_4(a)) = 0.$$

From the above results, task j is matched with VM type 1. When an initial set of tasks arrives at the task scheduler, the proposed task classifier sorts all the tasks into suitable types, and each task is then matched with its appropriate VM type as shown in Algorithm 3. If there is more than one matched VM, we compute the estimated completion time (CT) of the task on each of them, defined as the sum of the expected execution time of the current task and the expected execution times of all the previous tasks, and select the VM with the lowest completion time:

$$CT_j = \sum_{q=1}^{j} EET_q$$

where q runs over the previous tasks (q < j) and the current task (q = j).
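Selecting the VM with the lowest completion time can be sketched as below. The sketch assumes that the expected execution time of a task is its length divided by the VM speed and that the "previous tasks" are those already queued on the same VM; the function names and the sample queue contents are hypothetical.

```python
def completion_time(queued_lengths, task_length, vm_speed):
    """Estimated completion time on one VM.

    Sums the expected execution time (length / speed) of the tasks
    already queued on the VM and of the new task.
    """
    return (sum(queued_lengths) + task_length) / vm_speed


def pick_vm(task_length, vms):
    """Return the VM name with the lowest completion time.

    vms maps a VM name to (speed, lengths of the tasks already queued on it).
    """
    return min(
        vms,
        key=lambda name: completion_time(vms[name][1], task_length, vms[name][0]),
    )


# Two matched VMs of the same type, with different backlogs (illustrative values).
matched_vms = {
    "V1": (900, [30000, 4000]),  # fast but already loaded
    "V2": (600, [500]),          # slower but almost idle
}
chosen = pick_vm(4000, matched_vms)
print(chosen)
```

The example shows why completion time rather than raw speed decides the match: a slower VM with a short queue can finish the task earlier than a faster VM with a long one.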

2. Stage two: VM State Definition

An initial task set t is obtained from the user, and the tasks are arranged in ascending order of length priority. As in Algorithm 2, the expected execution time (EET) of each task is calculated using equation (1). Assume a set of five tasks {T1, T2, T3, T4, T5} with the lengths given in Table 4, and three VMs {V1, V2, V3} with the speeds given in Table 5.

Table 4

Task name    Task length
T1           100
T2           4000
T3           30000
T4           500
T5           125000


Table 5

VM name V1 V2 V3

VM speed 900 600 200

By substitution in equation (1), the task deadline for each task will be:

T1(d) = 100/900 = 0.11
T2(d) = 4000/600 = 6.67
T3(d) = 30000/200 = 150
T4(d) = 500/900 = 0.56
T5(d) = 125000/600 = 208
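The five values above are consistent with dividing each task length in Table 4 by a VM speed from Table 5, with the tasks paired to V1, V2, V3 cyclically. The sketch below checks that arithmetic. Both the length/speed form of equation (1) and the cyclic pairing are assumptions inferred from the numbers, not statements from the text.

```python
from itertools import cycle

task_lengths = {"T1": 100, "T2": 4000, "T3": 30000, "T4": 500, "T5": 125000}  # Table 4
vm_speeds = {"V1": 900, "V2": 600, "V3": 200}                                 # Table 5

# Assumption: tasks are paired with VMs cyclically in the listed order
# (T1 with V1, T2 with V2, T3 with V3, T4 with V1, T5 with V2).
values = {
    task: round(length / vm_speeds[vm], 2)
    for (task, length), vm in zip(task_lengths.items(), cycle(vm_speeds))
}
print(values)
```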

We then compare the EET of each task with its deadline using equation (2). If the EET is less than or equal to the deadline, the task is marked as successful and continues through the scheduling process; otherwise, it is an unsuccessful task and is put into the failure queue.
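Stage two as a whole (sort by length, compute the EET, test it against the deadline) can be sketched as below. This is an illustration only: the EET is assumed to be length divided by VM speed, the deadlines are hypothetical values chosen to show both outcomes, and `split_by_deadline` is a hypothetical name.

```python
def split_by_deadline(tasks):
    """Sort tasks by length (ascending) and split them by the deadline test.

    tasks: list of (name, length, vm_speed, deadline) tuples.
    Returns (successful task names, failure queue).
    """
    successful, failure_queue = [], []
    for name, length, speed, deadline in sorted(tasks, key=lambda t: t[1]):
        eet = length / speed  # assumed form of equation (1)
        (successful if eet <= deadline else failure_queue).append(name)
    return successful, failure_queue

tasks = [
    ("T1", 100, 900, 1.0),
    ("T2", 4000, 600, 5.0),      # EET is about 6.67 > 5.0, so this task fails
    ("T3", 30000, 200, 150.0),   # EET equals the deadline, so it still succeeds
    ("T4", 500, 900, 1.0),
    ("T5", 125000, 600, 300.0),
]
successful, failure_queue = split_by_deadline(tasks)
print(successful, failure_queue)
```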

3. Stage three: Match successful Tasks with VMs

During the actual scheduling process, we need to dynamically match the tasks classified in the previous stages with suitable VMs. As shown in Algorithm 4, when we match tasks with VMs of the marked types, the VMs may be idle or busy, and each situation needs its own handling. Algorithm 4 therefore has two parts. The first part deals with the case in which type-i VMs are available: if there is a free type-i VM, matching is successful. In this case there are two scenarios. In the first, only one VM matches, and we map the task to that VM directly. In the second, more than one VM matches, and we select the VM that has the minimum completion time among all the matched VMs.

The second part deals with the case in which a proper type-i VM does not exist or is taken by other tasks. If no VM matches the requirements of task j, the scheduling process cannot continue, so we need some saving measures: we must create new VMs to obtain the pair <tj, vi>, which denotes that task j is matched with VM i. Tasks in the waiting queue are pushed into the ready queue when there is a right-type VM and space is available in the ready queue. If, at a given time, the requirements of the tasks in the waiting queue cannot be matched, task scheduling is interrupted and these tasks are pushed into the failure queue.
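The two parts of the matching step can be sketched as follows. This is a simplified illustration, not Algorithm 4 itself: VM creation is left out, the rule that moves a waiting task to the failure queue once its deadline has passed is an assumption, and the dictionary fields and the names `match_task` and `expire_waiting` are hypothetical.

```python
from collections import deque

def match_task(task, vms, waiting_queue):
    """Return the VM chosen for a classified task, or None if the task must wait."""
    def completion_time(vm):
        # EETs already assigned to the VM plus the EET of this task (length / speed).
        return sum(vm["queued"]) + task["length"] / vm["speed"]

    free = [vm for vm in vms if vm["type"] == task["type"] and not vm["busy"]]
    if not free:
        waiting_queue.append(task)  # no free right-type VM: wait for one
        return None
    best = min(free, key=completion_time)  # a single candidate is returned directly
    best["queued"].append(task["length"] / best["speed"])
    return best

def expire_waiting(waiting_queue, failure_queue, now):
    """Move waiting tasks whose deadline has passed to the failure queue (assumed rule)."""
    for _ in range(len(waiting_queue)):
        task = waiting_queue.popleft()
        (failure_queue if now > task["deadline"] else waiting_queue).append(task)

vms = [
    {"name": "V1", "type": 1, "speed": 900, "busy": False, "queued": [3.0]},
    {"name": "V2", "type": 1, "speed": 600, "busy": False, "queued": []},
    {"name": "V3", "type": 2, "speed": 200, "busy": True, "queued": []},
]
waiting, failed = deque(), deque()
chosen = match_task({"name": "T2", "type": 1, "length": 4000, "deadline": 10.0}, vms, waiting)
deferred = match_task({"name": "T3", "type": 2, "length": 30000, "deadline": 5.0}, vms, waiting)
expire_waiting(waiting, failed, now=6.0)
print(chosen["name"], deferred, [t["name"] for t in failed])
```

In this run the type 1 task goes to V2, whose completion time (about 6.67) beats V1's (about 7.44), while the type 2 task waits and is then moved to the failure queue.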


Assume that the latest start time of task j on VM i is defined as S(tj, vi), the execution time of task j on VM i is defined as E(tj, vi), and the deadline of task j is tj(d), as defined before.

Assume that the actual start time of task j on VM i is defined as A(tj, vi). We first schedule the tasks from the ready queue at the corresponding critical time S(tj, vi) of the idle VMs, which ensures that tasks can be executed before their deadlines. Then, if there is enough idle time before the critical time, we schedule tasks before the critical time, as early as possible.

Algorithm 5: Task scheduling algorithm based on a three-stage strategy
Input: matched pairs of VMs and tasks from Algorithm 4 (matched)
Output: the final scheduling information (schedule)
1: schedule[] ← ∅;
2: n ← |matched|;
3: For (j = 1 to n & each pair <tj, vi> ∈ matched at host k)
4:     jik ← 0;
5:     compute S(tj, vi);
6:     compute the idle time slotj of VM Vi before S(tj, vi);
7:     map task j to VM i at slotj;
8:     jik ← 1;
9: End For
10: For j = 1 to n
11:     If A(tj, vi) reaches the minimum of all S(tk, vi), k = (1, 2, ..., n)
12:         schedule task tj at VM Vi at slotj;
13:         schedule[j] ← (<tj, vi> & E(tj, vi) & F(tj, vi));
14:     End If
15: End For
16: Return schedule;
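One way to read Algorithm 5 is sketched below. It is a simplified interpretation, not the authors' implementation. It assumes that the latest start time is S(tj, vi) = tj(d) − E(tj, vi), that the actual start time A(tj, vi) is the moment the VM becomes idle (as early as possible), that the finish time is F = A + E, and that pairs are served in ascending order of S. The tuple layout and the function name `schedule_pairs` are hypothetical.

```python
def schedule_pairs(matched, vm_free_at):
    """Schedule matched (task, vm, exec_time, deadline) pairs.

    vm_free_at maps a VM name to the time at which it becomes idle.
    Returns (plan, failed), where plan holds (task, vm, start, finish) tuples.
    """
    plan, failed = [], []
    # Most urgent first: smallest latest start time S = deadline - exec_time (assumption).
    for task, vm, exec_time, deadline in sorted(matched, key=lambda p: p[3] - p[2]):
        latest_start = deadline - exec_time
        actual_start = vm_free_at[vm]        # start as early as the VM allows
        if actual_start <= latest_start:     # the deadline can still be met
            finish = actual_start + exec_time
            vm_free_at[vm] = finish
            plan.append((task, vm, actual_start, finish))
        else:
            failed.append(task)
    return plan, failed

matched = [
    ("T1", "V1", 2.0, 10.0),  # S = 8
    ("T4", "V1", 3.0, 4.0),   # S = 1
    ("T2", "V2", 5.0, 4.0),   # S = -1: cannot meet its deadline
]
plan, failed = schedule_pairs(matched, {"V1": 0.0, "V2": 0.0})
print(plan, failed)
```

T4 is placed first on V1 because its latest start time is the smallest feasible one, T1 follows it, and T2 is rejected because its execution time already exceeds its deadline.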

3 Comparison and Evaluation Results

We compare our proposed model with two standard cloud scheduling algorithms which are adopted by many researchers as baseline methods [20]:

1) Min–Min algorithm: It is a standard task scheduling technique that assigns tasks to VMs so that they start early and are performed in the minimum time. It needs to determine which VM can start tasks at the earliest time and also finish them fastest [29].

2) Max–Min algorithm: It is a standard task scheduling technique that schedules tasks by considering the difficulty of task execution; that is, it schedules the hardest task first [30].
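For reference, the two baselines can be sketched in their common textbook form. This is not taken from [29] or [30]: it assumes that the execution time is task length divided by VM speed and that "hardest" means the task with the largest minimum completion time. The function name `min_min_or_max_min` and the sample data are hypothetical.

```python
def min_min_or_max_min(lengths, speeds, use_max=False):
    """Return ({task: vm}, makespan) for Min-Min (default) or Max-Min."""
    ready = {vm: 0.0 for vm in speeds}      # time at which each VM becomes free
    pending, assignment = dict(lengths), {}
    while pending:
        # Earliest completion time, and the VM achieving it, for every pending task.
        best = {
            task: min((ready[vm] + length / speed, vm) for vm, speed in speeds.items())
            for task, length in pending.items()
        }
        pick = max if use_max else min      # Max-Min takes the largest of the minima
        task = pick(best, key=lambda t: best[t][0])
        finish, vm = best[task]
        ready[vm] = finish
        assignment[task] = vm
        del pending[task]
    return assignment, max(ready.values())

lengths = {"A": 100, "B": 200, "C": 400}
speeds = {"V1": 2.0, "V2": 1.0}
min_min = min_min_or_max_min(lengths, speeds)
max_min = min_min_or_max_min(lengths, speeds, use_max=True)
print(min_min, max_min)
```

On this small input Min-Min places every task on the fast VM (makespan 350), while Max-Min schedules the longest task first and spreads the load (makespan 250).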

Evaluation Metrics:

1. The time complexity: the time complexity of both the Min–Min and the Max–Min algorithms is O(n^3) [31].

2. The average makespan: it is defined as the total time in which tasks are scheduled and completed in the cloud. The smaller the makespan value, the better the schedule and the quality of service.

$$\text{Average makespan} = \frac{1}{M} \sum_{j=1}^{M} CT_j \quad (9)$$

where CTj is the completion time of task tj in the cloud, and M is the total number of tasks.

3. The task average waiting time: it reflects the overall processing capacity and the throughput of the cloud.

$$AWT = \frac{1}{M} \sum_{j=1}^{M} WT_j \quad (10)$$

where WTj is the waiting time of task tj, and M is the total number of tasks.

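4. The task average completion time: the completion time is defined as the sum of the expected execution time of a task and the expected execution times of all the previous tasks [32]. The estimated completion time can be calculated as:

$$CT_j = \sum_{k=1}^{j} EET_k \quad (11)$$

where EET_k is the expected execution time of the k-th task, and task j is the current task.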

5. The task scheduling failure rate: it reflects the cloud's stability.

$$\text{Failure rate} = \frac{FT}{M} \times 100\% \quad (12)$$

where FT is the number of tasks that have a scheduling failure, and M is the total number of tasks.

6. The VMs utilization rate:

$$\text{Utilization rate} = \frac{T_p}{T_t} \times 100\% \quad (13)$$

where Tp is the time spent processing tasks, and Tt is the total time.

Based on the above evaluation factors, and according to the experimental results of the authors of [1], the expected results of our proposed three-stage scheduling algorithm compared to the two algorithms mentioned above are given in the following table:

Table 6

Algorithm            Time complexity   Average makespan   Average waiting time   Average completion time   Failure rate   VMs utilization rate
Proposed algorithm   Medium            Low                Low                    Low                       Low            High
Min-Min              Medium            Medium             High                   Medium                    High           Medium
Max-Min              Medium            High               Medium                 High                      Medium         Low
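Once a simulation produces per-task timings, the evaluation metrics of equations (9), (10), (12), and (13) could be computed as sketched below. The averages are assumed to be arithmetic means over the tasks, and the two rates are assumed to be the ratios FT/M and Tp/Tt expressed as percentages, following the symbol definitions in the text. The function names and the sample numbers are hypothetical and are not experimental results.

```python
def average(values):
    """Arithmetic mean, used for the average makespan and average waiting time."""
    return sum(values) / len(values)

def failure_rate(failed_tasks, total_tasks):
    """FT / M as a percentage."""
    return failed_tasks / total_tasks * 100

def utilization_rate(processing_time, total_time):
    """Tp / Tt as a percentage."""
    return processing_time / total_time * 100

# Hypothetical timings for three completed tasks out of four submitted.
completion_times = [2.0, 4.0, 6.0]
waiting_times = [0.0, 1.0, 2.0]

metrics = {
    "average_makespan": average(completion_times),
    "average_waiting_time": average(waiting_times),
    "failure_rate_percent": failure_rate(1, 4),
    "vm_utilization_percent": utilization_rate(12.0, 16.0),
}
print(metrics)
```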

4 Conclusion and Future Work

In this paper, we propose a three-stage task scheduling methodology together with its algorithms. The methodology aims to achieve the best task scheduling and execution results and to enhance the cloud's quality of service by relying on historical task scheduling data and a convenient number of pre-created VMs with different resource attributes, which can save much time and minimize the task scheduling failure rate. The paper compares the proposed algorithm theoretically with two standard algorithms, Min-Min and Max-Min. In the near future, we will use CloudSim to implement the proposed algorithm and obtain experimental results to support our expected results.

References:

[1] Zhang, P., & Zhou, M. "Dynamic Cloud Task Scheduling Based on a Two-Stage Strategy". IEEE Transactions on Automation Science and Engineering, 15(2). 772–783. 2018.

[2] Krishnam Raju, I. R., Suresh Varma, P., Rama Sundari, M. V., & Jose Moses, G. "Deadline Aware Two Stage Scheduling Algorithm in Cloud Computing". Indian Journal of Science and Technology. 9(4). 2016.

[3] Alworafi, M. A., & Mallappa, S. "An Enhanced Task Scheduling in Cloud Computing Based on Deadline-Aware Model". International Journal of Grid and High-Performance Computing. 10(1). 31–53. 2018.


[4] Zhang, X., Wu, T., Chen, M., Wei, T., Zhou, J., Hu, S., and Buyya, R. "Energy-Aware Virtual Machine Allocation for Cloud with Resource Reservation". Journal of Systems and Software. 147. 147-161. 2018.

[5] Zhou, Z., Abawajy, J., Chowdhury, M., Hu, Z., Li, K., Cheng, H., … Li, F. "Minimizing SLA violation and power consumption in Cloud data centers using adaptive energy-aware algorithms". Future Generation Computer Systems. 86. 836–850. 2018.

[6] SaeMi Shin, Yena Kim, and SuKyoung Lee. "Deadline-guaranteed scheduling algorithm with improved resource utilization for cloud computing". 2015 12th Annual IEEE Consumer Communications and Networking Conference (CCNC). 2015.

[7] Khanghahi, N., & Ravanmehr, R. "Cloud Computing Performance Evaluation: Issues and Challenges". International Journal on Cloud Computing: Services and Architecture. 3(5). 29–41. 2013.

[8] Lakra, A. V., & Yadav, D. K. "Multi-Objective Tasks Scheduling Algorithm for Cloud Computing Throughput Optimization". Procedia Computer Science. 48. 107–113. 2015.

[9] M. Vijayalakshmi, V. Venkatesa Kumar. "Investigations on Job Scheduling Algorithms in Cloud Computing". International Journal of Advanced Research in Computer Science & Technology (IJARCST). 2(1). 2014.

[10] Amit Agarwal, Saloni Jain. "Efficient Optimal Algorithm of Task Scheduling in Cloud Computing Environment". International Journal of Computer Trends and Technology (IJCTT). 9(7). 344–349. 2014.

[11] Li, X., Qian, L., & Ruiz, R. "Cloud Workflow Scheduling with Deadlines and Time Slot Availability". IEEE Transactions on Services Computing. 11(2). 329–340. 2018.

[12] Kapil Kumar, Abhinav Hans, Navdeep Singh, and Mohit Birdi. "Differences and Problems Task Scheduling Algorithm - A Survey". International Journal of Hybrid Information Technology. 8(6). 145–152. 2015.

[13] Mehwish Awan, Munam Ali Shah. "A Survey on Task Scheduling Algorithms in Cloud Computing Environment". International Journal of Computer and Information Technology. 4(2). 2015.

[14] Sunil Kumar, Sumit Mittal, and Manpreet Singh. "A Comparative Study of Metaheuristics based Task Scheduling in Distributed Environment". Indian Journal of Science and Technology. 10(26). 2017.

[15] Raza Abbas Haidri, C. P. Katti, P. C. Saxena. "A Load Balancing Strategy for Cloud Computing". IEEE International Conference on Signal Propagation and Computer Technology. 2014.

[16] Syed Mohtashim Abbas Bokhari, Umm-e-Habiba, Farooque Azam, and Muhammad Abbas. "Limitations of Service Oriented Architecture and its Combination with Cloud Computing". Bahria University Journal of Information & Communication Technologies. 8(1). 2015.

[17] Anurag S. Barde. "Cloud Computing and Its Vision 2015!!". International Journal of Computer and Communication Engineering. 2(4). 2013.

[18] K C Gouda, Anurag Patro, Dines Dwivedi, Nagaraj Bhat. "Virtualization Approaches in Cloud Computing". International Journal of Computer Trends and Technology (IJCTT). 12(4). 161–166. 2014.

[19] Tsai, W.-T., Sun, X., and Balasooriya, J. "Service-Oriented Cloud Computing Architecture". IEEE Seventh International Conference on Information Technology: New Generations. 2010.

[20] Panda, S. K., & Jana, P. K. "Efficient task scheduling algorithms for heterogeneous multi-cloud environment". The Journal of Supercomputing. 71(4). 1505–1533. 2015.

[21] Zuo, X., Zhang, G., & Tan, W. "Self-Adaptive Learning PSO-Based Deadline Constrained Task Scheduling for Hybrid IaaS Cloud". IEEE Transactions on Automation Science and Engineering. 11(2). 564–573. 2014.

[22] Rodriguez, M. A., & Buyya, R. "Deadline Based Resource Provisioning and Scheduling Algorithm for Scientific Workflows on Clouds". IEEE Transactions on Cloud Computing. 2(2). 222–235. 2014.

[23] Yuan, H., Bi, J., Tan, W., Zhou, M., Li, B. H., & Li, J. "TTSA: An Effective Scheduling Approach for Delay Bounded Tasks in Hybrid Clouds". IEEE Transactions on Cybernetics. 47(11). 3658–3668. 2017.

[24] Jain, N., Menache, I., Naor, J. (Seffi), & Yaniv, J. "Near-Optimal Scheduling Mechanisms for Deadline-Sensitive Jobs in Large Computing Clusters". ACM Transactions on Parallel Computing. 2(1). 1–29. 2015.

[25] Delimitrou, C., & Kozyrakis, C. "QoS-Aware scheduling in heterogeneous data centers with paragon". ACM Transactions on Computer Systems. 31(4). 1–34. 2013.

[26] Chen, C.-Y. "Task Scheduling for Maximizing Performance and Reliability Considering Fault Recovery in Heterogeneous Distributed Systems". IEEE Transactions on Parallel and Distributed Systems. 27(2). 521–532. 2016.

[27] Babur Hayat Malik, Mehwashma Amir, Bilal Mazhar, Shehzad Ali, Rabiya Jalil, Javaria Khalid. "Comparison of Task Scheduling Algorithms in Cloud Environment". (IJACSA) International Journal of Advanced Computer Science and Applications. 9(5). 2018.

[28] Abdulhamid, S. M., Abd Latiff, M. S., Madni, S. H. H., & Abdullahi, M. "Fault tolerance aware scheduling technique for cloud computing environment using dynamic clustering algorithm". Neural Computing and Applications. 29(1). 279–293. 2016.

[29] Li, J., Qiu, M., Ming, Z., Quan, G., Qin, X., & Gu, Z. "Online optimization for scheduling preemptable tasks on IaaS cloud systems". Journal of Parallel and Distributed Computing. 72(5). 666–677. 2012.

[30] Devipriya, S., & Ramesh, C. "Improved Max-min heuristic model for task scheduling in cloud". 2013 International Conference on Green Computing, Communication and Conservation of Energy (ICGCE). 2013.

[31] Rubing Duan, Prodan, R., & Xiaorong Li. "Multi-Objective Game Theoretic Scheduling of Bag-of-Tasks Workflows on Hybrid Clouds". IEEE Transactions on Cloud Computing. 2(1). 29–42. 2014.

[32] Banerjee, S., Adhikari, M., Kar, S., & Biswas, U. "Development and Analysis of a New Cloudlet Allocation Strategy for QoS Improvement in Cloud". Arabian Journal for Science and Engineering. 40(5). 1409–1425. 2015.
