OPTIMAL SERVICE TASK PARTITION AND DISTRIBUTION IN GRID SYSTEM WITH STAR TOPOLOGY
GREGORY LEVITIN, YUAN-SHUN DAI
Adviser: Frank, Yeong-Sung Lin
Presented by Sean Chou

AGENDA

Introduction
The model
Algorithm for determining the pmf of the service time
Numerical example
Conclusions

INTRODUCTION

Grid computing is a newly developed technology for complex systems with large-scale resource sharing, wide-area communication, and multi-institutional collaboration [1].

This is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering.

The sharing is controlled by a resource management system (RMS) [2].

When the RMS receives a service request from a user, the task can be divided into a set of execution blocks (EBs) that are executed in parallel.

The RMS assigns those EBs to available resources for execution.

After the resources finish the assigned jobs, they return the results to the RMS.

The above grid service process can be approximated by a structure with star topology.

The performance of grid computing is of great concern. Grid performance is usually measured by the task execution time (service time).

This index can be significantly improved by using an RMS that divides a task into a set of EBs, which can be executed in parallel by multiple online resources.

Many complicated and time-consuming tasks that could not be carried out before now run well in the grid computing environment.

The service time is a random variable affected by many factors [3]:

1. There are many resources available online that have different task processing speeds.
2. Some resources can fail when running the jobs.
3. The communication links in the grid service can fail during the data transmission.
4. The choice of the group of subtasks assigned to the same EB and running on the same resource can influence the total amount of data transmitted between the RMS and the resource, since different subtasks can use common input data blocks.

Most previous research separated performance and reliability into two different fields and studied them individually.

In fact, however, performance and reliability are closely related and affect each other, particularly when grid computing is implemented.

For example, when a task is fully parallelized into n different EBs executed by n resources simultaneously, the performance is high but the reliability can be low, because the failure of any resource leaves the entire task incomplete.

Therefore, it is worth having some redundant resources execute the same EB, especially when the resources are failure-prone.

However, too much redundancy, although it improves the reliability, can decrease the performance by not fully parallelizing the task.
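The trade-off can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that every resource completes its EB with the same probability p = 0.9 regardless of the EB's size and that failures are independent; the function name and the numbers are not taken from the paper.

```python
def task_reliability(p: float, num_ebs: int, copies: int) -> float:
    """Probability that every EB is completed by at least one of its copies.

    p       -- assumed success probability of one resource on one EB
    num_ebs -- number of execution blocks the task is split into
    copies  -- number of resources executing each EB
    """
    eb_ok = 1.0 - (1.0 - p) ** copies   # at least one copy of the EB succeeds
    return eb_ok ** num_ebs             # all EBs must succeed


# Six resources used in three different ways (p = 0.9 is a made-up value).
for num_ebs, copies in [(6, 1), (3, 2), (1, 6)]:
    r = task_reliability(0.9, num_ebs, copies)
    print(f"{num_ebs} EBs x {copies} copies: reliability = {r:.6f}")
```

With six resources, six single-copy EBs give a reliability of about 0.53, three EBs with two copies each give about 0.97, and one EB with six copies gives about 0.999999. The last option is the most reliable, but every resource then runs the whole task sequentially, which is the performance penalty described above.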

Performance and reliability should be studied together in the grid service analysis.

The first model for evaluating the performance (service time) of a grid with star topology that takes the service reliability into account was presented in [4].

Optimizing the division of a service task into EBs and the distribution of these EBs among the available grid resources can considerably improve the service performance.

This paper presents an algorithm for solving these optimization problems based on the model developed in [4].

THE MODEL

2.1. Service execution by the grid system with star architecture
2.2. Assumptions
2.3. Service execution time
2.4. Service reliability and expected performance

2.1. Service execution by the grid system with star architecture

Different resources are distributed in the grid system.

The considered service can use a given set of resources.

All the resources and communication channels from this set are available at the time when the request for service arrives at the RMS.

Each resource is directly connected to the RMS by a single communication channel, forming the star topology.

The service task consists of subtasks that can be independently executed by different resources.

Different subtasks may need some common input data blocks for their execution.

The subtasks can be grouped into EBs. The input data for any EB consists of input data blocks necessary for executing all the subtasks belonging to this EB.

The request for service (task execution) arrives at the RMS, which forms the EBs and assigns them to different resources for processing. Each resource gets no more than one EB for processing.

The same EB can be assigned to several resources for parallel execution.

If the same EB is processed by several resources, it is completed when the first output is returned to the RMS.

The entire task is completed when all of the EBs are completed and their results are returned to the RMS from the resources.

2.2. Assumptions

Each resource starts processing the assigned EB immediately after it gets all the necessary input data from the RMS through the corresponding communication channel. Each resource sends the output data to the RMS through the same communication channel immediately after it completes the EB.

Each resource has a given constant processing speed when it is available. Each resource has a given constant failure rate.

Each communication channel has a constant data transmission speed (bandwidth) when it is available. Each communication channel has a constant failure rate.

The subtasks belonging to an EB are processed in sequence. The subtask processing time is proportional to its computational complexity.

The data transmission time is proportional to the amount of data transmitted between the RMS and a resource.

The failure rates of the communication channels and resources are the same whether they are idle or loaded (hot standby model). The failures at different resources and communication channels are independent.

The RMS is fully reliable. The time of task processing by the RMS (formation and assignment of EBs, sending them to the resources, receiving the results and integrating them into the entire task output) is negligible when compared with the EBs' processing time.

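2.3. Service execution time

The entire task consists of m subtasks that can be executed independently.

Any EB i consists of a set $\sigma_i$ of subtasks. Because the subtasks of an EB are processed in sequence, the EB's computational complexity is the sum of the complexities of its subtasks.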

Each subtask j needs a set $B_j$ of data blocks as its input and produces an amount $O_j$ of output data.

The set of input data blocks necessary for the execution of EB i is $\bigcup_{j \in \sigma_i} B_j$.

The amount of data to be transmitted from the RMS to the resource executing this EB is the total size of the blocks in this set, with each common block counted once.

The total amount of data (input and output) $D_i$ that should be transmitted between the RMS and a resource executing EB i is this input amount plus the output amounts $O_j$ of all subtasks in $\sigma_i$.
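A small sketch of the EB complexity and of $D_i$, under assumed notation: a complexity value per subtask and a size per input data block (neither symbol appears in the transcript), with $B_j$ and $O_j$ as defined above. All data are invented for illustration.

```python
# Hypothetical data: names and numbers are illustrative, not from the paper.
complexity = {1: 4.0, 2: 6.0, 3: 5.0}                     # complexity of subtask j
inputs = {1: {"b1", "b2"}, 2: {"b2", "b3"}, 3: {"b4"}}    # B_j, input blocks of subtask j
output = {1: 1.0, 2: 2.0, 3: 1.5}                         # O_j, output amount of subtask j
block_size = {"b1": 10.0, "b2": 20.0, "b3": 5.0, "b4": 8.0}


def eb_complexity(subtasks):
    """Complexity of an EB: the sum over its subtasks, which run in sequence."""
    return sum(complexity[j] for j in subtasks)


def eb_data(subtasks):
    """D_i: every distinct input block is sent once, plus all subtask outputs."""
    needed = set().union(*(inputs[j] for j in subtasks))
    return sum(block_size[b] for b in needed) + sum(output[j] for j in subtasks)


print(eb_complexity({1, 2}), eb_data({1, 2}), eb_data({1}) + eb_data({2}))
```

Grouping subtasks 1 and 2 into one EB transmits 38 units, against 31 + 27 = 58 units if they run as two separate EBs, because the shared block b2 is sent only once.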

The EB execution time is defined as the time from the beginning of input data transmission from the RMS to a resource to the end of output data transmission from the resource to the RMS.

Therefore, the random time $t_{ij}$ of EB i completion by resource j can take two possible values: a finite value, equal to the sum of the data transmission and processing times, if resource j and communication channel j do not fail until the execution is completed, and infinity otherwise.
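One way to write the two values, using symbols that are assumptions of this note rather than the transcript's: $C_i$ for the computational complexity of EB i, $x_j$ for the processing speed of resource j, and $b_j$ for the bandwidth of channel j, together with $D_i$ as defined above. Since the processing time is proportional to complexity and the transmission time to the data volume, a plausible reconstruction is:

```latex
t_{ij} =
\begin{cases}
\hat{t}_{ij} = \dfrac{C_i}{x_j} + \dfrac{D_i}{b_j}, & \text{if resource } j \text{ and channel } j \text{ do not fail before completion},\\[6pt]
\infty, & \text{otherwise.}
\end{cases}
```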

EB i can be successfully completed by resource j if this resource and communication link j do not fail before the end of the execution.

For constant failure rates of resource j and communication link j, the probability of EB success is the probability that neither of them fails during the EB execution time; it decreases exponentially with that time.
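Under the stated assumptions (constant failure rates, independent failures, the same rates whether idle or loaded), the resource and its channel must both survive for the whole execution time, which gives an exponential survival probability. The sketch below uses assumed names: `lam_resource` and `lam_channel` for the two failure rates, `speed` and `bandwidth` for the processing and transmission speeds. The numbers are illustrative only.

```python
import math


def eb_time(complexity, data, speed, bandwidth):
    """Failure-free execution time: processing time plus transmission time."""
    return complexity / speed + data / bandwidth


def eb_success_probability(t_hat, lam_resource, lam_channel):
    """Probability that neither the resource nor its channel fails within t_hat.

    Two independent exponential lifetimes both exceed t_hat with
    probability exp(-(lam_resource + lam_channel) * t_hat).
    """
    return math.exp(-(lam_resource + lam_channel) * t_hat)


t_hat = eb_time(complexity=10.0, data=38.0, speed=2.0, bandwidth=19.0)
p = eb_success_probability(t_hat, lam_resource=0.01, lam_channel=0.02)
print(t_hat, round(p, 4))
```

A slower resource or a larger EB increases the execution time and therefore lowers the success probability, which is how partitioning decisions feed into reliability.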

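Assume that each EB i is assigned to the resources composing the set $\omega_i$, such that $\omega_i \cap \omega_j = \emptyset$ for any $i \ne j$.

The random time of EB i completion is the minimum of the completion times $t_{ij}$ over the resources $j \in \omega_i$.

The entire task is completed when all of the EBs (including the slowest one) are completed.

The random task execution time $\Theta$ is therefore the maximum of the EB completion times.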
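In symbols, with $T_i$ introduced here as a label for the completion time of EB i and h denoting the number of EBs, the two statements above can be written as:

```latex
T_i = \min_{j \in \omega_i} t_{ij}, \qquad
\Theta = \max_{1 \le i \le h} T_i .
```

The minimum reflects that the first returned output completes a replicated EB; the maximum reflects that the task waits for its slowest EB.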

2.4. Service reliability and expected performance

In order to estimate both the service reliability and the performance of a grid system, different measures can be used depending on the application.

The system reliability $R(\theta)$ is defined (according to the performability concept [5,6]) as the probability that the correct output is produced in time less than $\theta$.

The service reliability is defined as the probability that the service produces correct outputs regardless of the service time. This index can be referred to as $R(\infty)$.

The conditional expected service time W is considered to be a measure of its performance.
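If the pmf of the task execution time is written as pairs $(\theta_k, q_k)$, where $q_k$ is the probability of the value $\theta_k$ (this pair notation is an assumption of this note), the three indices can plausibly be expressed as follows, with W taken as the expected time conditional on the task being completed:

```latex
R(\theta) = \sum_{k:\,\theta_k < \theta} q_k, \qquad
R(\infty) = \sum_{k:\,\theta_k < \infty} q_k, \qquad
W = \frac{1}{R(\infty)} \sum_{k:\,\theta_k < \infty} \theta_k\, q_k .
```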

The service task partition into EBs (represented by the sets $\sigma_i$, $1 \le i \le h$) and the distribution of the EBs among the resources (represented by the sets $\omega_i$, $1 \le i \le h$) determine the service reliability and performance.

Two optimization problems:

ALGORITHM FOR DETERMINING THE PMF OF THE SERVICE TIME

The procedure used for the evaluation of the service time distribution is based on the universal generating function (u-function) technique.

Its high computational efficiency allows it to be used in optimization procedures where a large number of different solutions must be evaluated.

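The u-function $u_{i,\{j\}}(z)$ defines the pmf of the total completion time $t_{ij}$ for EB i assigned to resource j.

This u-function has two terms: one for completion at the finite execution time, weighted by the probability of EB success, and one for failure, weighted by the complementary probability.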
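Writing $\hat{t}_{ij}$ for the failure-free execution time and $p_{ij}$ for the probability of EB success (both names are assumptions of this note), the standard u-function form for a two-valued random variable would be:

```latex
u_{i,\{j\}}(z) = p_{ij}\, z^{\hat{t}_{ij}} + (1 - p_{ij})\, z^{\infty}
```

Each exponent of z is a possible completion time and each coefficient is its probability.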

The total completion time of EB i assigned to a pair of resources k and j is equal to the minimum of the completion times for the two resources.

To obtain the u-function representing the pmf of this time, a composition operator based on the minimum function should be used.

The u-function representing the pmf of the completion time of EB i assigned to all of the resources from the set $\omega_i$ can be obtained recursively, by adding one resource at a time with the same composition operator.
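A minimal sketch of this recursion, representing a u-function as a dictionary that maps each possible completion time to its probability. The representation and the function names are choices made for this illustration, not the authors' implementation.

```python
import math
from collections import defaultdict
from functools import reduce

INF = math.inf


def single(t_hat, p):
    """u-function of one resource: done at t_hat with probability p, else never."""
    return {t_hat: p, INF: 1.0 - p}


def compose_min(u, v):
    """Pmf of min(X, Y) for independent X with pmf u and Y with pmf v."""
    out = defaultdict(float)
    for a, pa in u.items():
        for b, pb in v.items():
            out[min(a, b)] += pa * pb    # collect like terms
    return dict(out)


def eb_ufunction(resources):
    """Fold all resources assigned to one EB into a single pmf.

    resources -- non-empty list of (t_hat, p) pairs, one per resource
    """
    return reduce(compose_min, (single(t, p) for t, p in resources))


u = eb_ufunction([(7.0, 0.8), (10.0, 0.9)])
print(u)
```

In this example the EB finishes at time 7 with probability 0.8, at time 10 with probability 0.2 x 0.9 = 0.18 (only the slower resource survives), and never with probability 0.2 x 0.1 = 0.02.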

Having the u-functions $u_{i,\omega_i}(z)$ for each EB i ($1 \le i \le h$), one can obtain the u-function representing the pmf of the entire task completion time $\Theta$ by composing them with the maximum function.

The final u-function $U_h(z)$ represents the pmf of the random task completion time $\Theta$ as a sum of terms, one for each possible completion time together with its probability.

Algorithm for determining the service performance/reliability indices for an arbitrary task partition and distribution:
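A minimal sketch of how such an evaluation could be chained together, assuming that the failure-free time and success probability of every EB-resource pair have already been computed. The dictionary representation of u-functions and all names are assumptions of this illustration, and the input numbers are invented.

```python
import math
from collections import defaultdict
from functools import reduce

INF = math.inf


def compose(u, v, op):
    """Pmf of op(X, Y) for independent X with pmf u and Y with pmf v."""
    out = defaultdict(float)
    for a, pa in u.items():
        for b, pb in v.items():
            out[op(a, b)] += pa * pb
    return dict(out)


def service_pmf(assignment):
    """assignment[i] lists (t_hat, p) for every resource running EB i.

    Every EB needs at least one resource. Replicas of an EB are combined
    with min, and the resulting EB pmfs are combined with max.
    """
    eb_pmfs = [
        reduce(lambda u, v: compose(u, v, min),
               ({t: p, INF: 1.0 - p} for t, p in resources))
        for resources in assignment
    ]
    return reduce(lambda u, v: compose(u, v, max), eb_pmfs)


def indices(pmf, theta):
    """Return (R(theta), R(inf), W) for a service time pmf."""
    r_theta = sum(q for t, q in pmf.items() if t < theta)
    r_inf = sum(q for t, q in pmf.items() if t < INF)
    finite_mean = sum(t * q for t, q in pmf.items() if t < INF)
    w = finite_mean / r_inf if r_inf > 0 else INF
    return r_theta, r_inf, w


# Two EBs: the first is replicated on two resources, the second runs on one.
pmf = service_pmf([[(7.0, 0.8), (10.0, 0.9)], [(8.0, 0.95)]])
r_theta, r_inf, w = indices(pmf, theta=9.0)
print(pmf, r_theta, r_inf, w)
```

For this input the task finishes at time 8 with probability 0.76 and at time 10 with probability 0.171, so R(9) is 0.76, the service reliability is 0.931, and the conditional expected service time is about 8.37.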

NUMERICAL EXAMPLE

Formulations (9) and (10) define a complicated NP-complete partitioning/allocation problem.

An exhaustive examination of all possible solutions is not realistic, considering reasonable time limitations.

A heuristic search algorithm is needed that uses only estimates of solution quality and does not require derivative information to determine the next direction of the search.

The genetic algorithm (GA) has been proven to be an effective optimization tool for a large number of complicated problems in reliability engineering [10,11].
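To illustrate the idea, here is a generic GA sketch, not the authors' algorithm or encoding. It assumes a simplified model in which data transmission is ignored and channel failures are folded into a single failure rate per resource; the instance data, the integer-string encoding and the GA parameters are all made up. Fitness is $R(\theta)$, computed in closed form as the product over EBs of the probability that at least one assigned resource finishes before $\theta$.

```python
import math
import random

# Toy instance; every number below is invented for illustration.
COMPLEXITY = [4.0, 6.0, 5.0, 3.0]                                   # one entry per subtask
RESOURCES = [(2.0, 0.01), (1.5, 0.02), (3.0, 0.05), (1.0, 0.005)]   # (speed, failure rate)
THETA = 8.0                                                         # allowed service time
M, N = len(COMPLEXITY), len(RESOURCES)


def fitness(chrom):
    """R(theta): genes 0..M-1 map subtasks to EBs, genes M..M+N-1 map resources to EBs."""
    reliability = 1.0
    for eb in range(N):
        work = sum(c for c, g in zip(COMPLEXITY, chrom[:M]) if g == eb)
        if work == 0:
            continue                                 # empty EB, nothing to execute
        none_in_time = 1.0                           # stays 1.0 if no resource serves this EB
        for (speed, lam), g in zip(RESOURCES, chrom[M:]):
            t_hat = work / speed
            if g == eb and t_hat < THETA:
                none_in_time *= 1.0 - math.exp(-lam * t_hat)
        reliability *= 1.0 - none_in_time
    return reliability


def genetic_search(generations=200, pop_size=30, seed=0):
    rng = random.Random(seed)
    baseline = [0] * (M + N)                         # a single EB run by every resource
    pop = [baseline] + [[rng.randrange(N) for _ in range(M + N)]
                        for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]               # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, M + N)
            child = a[:cut] + b[cut:]                # one-point crossover
            child[rng.randrange(M + N)] = rng.randrange(N)   # point mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


best = genetic_search()
print(best, round(fitness(best), 4))
```

Because each resource carries exactly one EB gene, the resource sets of different EBs are automatically disjoint. The search only ever calls `fitness`, which matches the requirement that the heuristic use estimates of solution quality and no derivative information.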

Consider a grid service that uses six resources distributed in the grid system.

The entire service task can be divided into eight independent subtasks.

The amount of data in each input data block is presented in Table 4.

First, the optimal task partition and distribution problem was solved by the GA for formulation (9).

The solutions for different values of the allowed service time $\theta$ are presented in Tables 5 and 6.

Table 5 contains the obtained task partition into EBs and their distribution among the resources.

Table 6 contains the minimal and maximal possible service times, the service reliability and the conditional expected service time for each obtained solution.

The functions $R(\theta)$ for the obtained solutions are presented in Fig. 2.

It can be seen that the best solution obtained for a certain $\theta$ provides the greatest reliability for this value of the service time, whereas for other values of $\theta$ it provides lower reliability than the solutions obtained for those values.

CONCLUSIONS

Grid technology is a newly developed method for large-scale distributed systems. This technology allows effective distribution of computational tasks among the different resources present in the grid.

The resource management system (RMS) can divide a service task into subtasks and send the subtasks to different resources for parallel execution.

For any given service task, the service reliability and performance indices depend on the task partition into EBs and their distribution among the available resources.

The suggested optimization algorithm is aimed at achieving the greatest reliability/performance by the optimal task partition and distribution.

Most previous research separated performance and reliability into two different fields and studied them individually.

In fact, however, performance and reliability are closely related and affect each other, particularly when grid computing is implemented.

This paper presents an algorithm for solving the optimization problems of task partition and distribution, based on evaluating the performance (service time) of a grid with star topology while taking the service reliability into account.

Thank you for listening.