
Technical Report No. 2006-504

Scheduling Algorithms for Grid Computing:

State of the Art and Open Problems

Fangpeng Dong and Selim G. Akl

School of Computing,

Queen's University

Kingston, Ontario

January 2006

Abstract:

Thanks to advances in wide-area network technologies and the low cost of computing resources, Grid computing came into being and is currently an active research area. One motivation of Grid computing is to aggregate the power of widely distributed resources and provide non-trivial services to users. To achieve this goal, an efficient Grid scheduling system is an essential part of the Grid. Rather than covering the whole Grid scheduling area, this survey provides a review of the subject mainly from the perspective of scheduling algorithms. In this review, the challenges for Grid scheduling are identified. First, the architecture of components involved in scheduling is briefly introduced to provide an intuitive image of the Grid scheduling process. Then various Grid scheduling algorithms are discussed from different points of view, such as static vs. dynamic policies, objective functions, application models, adaptation, QoS constraints, strategies dealing with dynamic behavior of resources, and so on. Based on a comprehensive understanding of the challenges and the state of the art of current research, some general issues worthy of further exploration are proposed.

1. Introduction

    The popularit of the nternet and the a,aila$ilit of poerful computers and

    high-speed netor's as lo-cost commodit components are changing the a e use

    computers toda. These technical opportunities ha,e led to the possi$ilit of using

    geographicall distri$uted and multi-oner resources to sol,e large-scale pro$lems in

    science( engineering( and commerce. Recent research on these topics has led to the

    emergence of a ne paradigm 'non as Grid computing 78.

To realize the potential of these vast distributed resources, effective and efficient scheduling algorithms are fundamentally important. Unfortunately, scheduling algorithms in traditional parallel and distributed systems, which usually run on homogeneous and dedicated resources, e.g. computer clusters, cannot work well in the new circumstances [12]. In this paper, the state of current research on scheduling algorithms for the new generation of computational environments will be surveyed and open problems will be discussed.

The remainder of this paper is organized as follows. An overview of the Grid scheduling problem is presented in Section 2 with a generalized scheduling architecture. In Section 3, the progress made to date in the design and analysis of scheduling algorithms for Grid computing is reviewed. A summary and some research opportunities are offered in Section 4.

2. Overview of the Grid Scheduling Problem

    A computational Grid is a hardare and softare infrastructure that pro,ides

    dependa$le( consistent( per,asi,e( and inepensi,e access to high-end computational

    capa$ilities 458. t is a shared en,ironment implemented ,ia the deploment of a

     persistent( standards-$ased ser,ice infrastructure that supports the creation of( and resource

    sharing ithin( distri$uted communities. Resources can $e computers( storage space(

    instruments( softare applications( and data( all connected through the nternet and a middleare softare laer that pro,ides $asic ser,ices for securit( monitoring( resource

    management( and so forth. Resources oned $ ,arious administrati,e organi9ations are

  • 8/19/2019 Scheduling Algorithms for Grid Computing

    2/45

    shared under locall defined policies that specif hat is shared( ho is alloed to access

    hat( and under hat conditions 4;8. The real and specific pro$lem that underlies the Grid

    concept is coordinated resource sharing and pro$lem sol,ing in dnamic(

    multi-institutional ,irtual organi9ations 448.

From the point of view of scheduling systems, a higher-level abstraction for the Grid can be applied by ignoring some infrastructure components such as authentication, authorization, resource discovery and access control. Thus, in this paper, the following definition for the term Grid is adopted: "... quality-of-service requirements" [10].

To facilitate the discussion, the following frequently used terms are defined:

• A task is an atomic unit to be scheduled by the scheduler and assigned to a resource.

• The properties of a task are parameters like CPU/memory requirement, deadline, priority, etc.

• A job (or metatask, or application) is a set of atomic tasks that will be carried out on a set of resources. Jobs can have a recursive structure, meaning that jobs are composed of sub-jobs and/or tasks, and sub-jobs can themselves be decomposed further into atomic tasks. In this paper, the terms job, application and metatask are interchangeable.

• A resource is something that is required to carry out an operation, for example: a processor for data processing, a data storage device, or a network link for data transporting.

• A site (or node) is an autonomous entity composed of one or multiple resources.

• Task scheduling is the mapping of tasks to a selected group of resources which may be distributed in multiple administrative domains.


2.1 The Grid Scheduling Process and Components

    A Grid is a sstem of high di,ersit( hich is rendered $ ,arious applications(

    middleare components( and resources. 3ut from the point of ,ie of functionalit( e

    can still find a logical architecture of the tas' scheduling su$sstem in Grid. %or eample(

    Dhu 2:8 proposes a common Grid scheduling architecture. Ee can also generali9escheduling

     process in the Grid into three stages! resource disco,ering and filtering(

    resource selecting and scheduling according to certain o$1ecti,es( and 1o$ su$mission 748.

    As a stud of scheduling algorithms is our primar concern here( e focus on the second

    step. 3ased on these o$ser,ations( %ig.  depicts a model of Grid scheduling sstemshich

    functional components are connected $ to tpes of data flo! resourceapplication information

    flo and tas' or tas' scheduling command flo.a

    in

    or 

    %ig. ! A logical Grid scheduling architecture! $ro'en lines sho resource or application

    information

    flos and real lines sho tas' or tas' scheduling command flos.

    3asicall( a Grid scheduler BGSC recei,es applications from Grid users( selects feasi$le

    resources for these applications according to ac>uired information from the Grid

    nformation Ser,ice module( and finall generates application-to-resource mappings( $ased

    on certain o$1ecti,e functions and predicted resource performance. +nli'e their 

    counterparts in traditional parallel and distri$uted sstems( Grid schedulers usuall cannot control Grid resources directl( $ut or' li'e $ro'ers or agents:8( or e,en tightl

    coupled ith the applications as the application-le,el scheduling scheme proposes 8(

  • 8/19/2019 Scheduling Algorithms for Grid Computing

    3/45

    058. The are not necessaril located in the same domain ith the resources hich are

    ,isi$le to them. %ig.  onl shos one Grid scheduler( $ut in realit multiple such

    schedulers might $e deploed( and organi9ed to form different structures Bcentrali9ed(

    hierarchical and decentrali9ed 558C according to different concerns( such as performance

    or scala$ilit. Although a Grid le,el scheduler Bor Fetascheduler as it is sometime referred

    to in the literature( e.g.( in 8C is not an indispensa$le component in the Grid

    : infrastructure Be.g.( it is not included in the Glo$us Tool'it 258( the defacto standardthe Grid

    computing communitC( there is no dou$t that such a scheduling componentcrucial for harnessing

    the potential of Grids as the are epanding >uic'l( incorporating

    resources from supercomputers to des'tops. "ur discussion on scheduling algorithms$ased on the

    assumption that there are such schedulers in a Grid.
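As a rough illustration of how a scheduler might turn predicted resource performance into an application-to-resource mapping, the sketch below assigns each task to the resource with the earliest predicted completion time. The greedy rule, the function name `map_application`, and the units are assumptions made for this example; the text above does not prescribe a particular objective function. The function only proposes a mapping, in keeping with the point that a Grid scheduler acts like a broker and does not control resources directly.

```python
from typing import Dict, Tuple


def map_application(tasks: Dict[str, float],
                    predicted_speed: Dict[str, float]
                    ) -> Tuple[Dict[str, str], float]:
    """Greedily map each task to the resource that would finish it earliest.

    tasks: task name -> amount of work (assumed unit: work units)
    predicted_speed: resource name -> predicted work units per second,
        as might be reported by an information service (hypothetical input)
    Returns the mapping and the predicted makespan in seconds.
    """
    if not predicted_speed:
        raise ValueError("no resources available")
    ready_at = {resource: 0.0 for resource in predicted_speed}
    mapping: Dict[str, str] = {}
    for task, work in tasks.items():
        # Predicted completion = time the resource becomes free + run time.
        best = min(ready_at,
                   key=lambda r: ready_at[r] + work / predicted_speed[r])
        ready_at[best] += work / predicted_speed[best]
        mapping[task] = best
    return mapping, max(ready_at.values())


# Usage: three tasks, two resources with different predicted speeds.
mapping, makespan = map_application(
    {"t1": 100.0, "t2": 60.0, "t3": 50.0},
    {"fast": 10.0, "slow": 5.0},
)
print(mapping)   # {'t1': 'fast', 't2': 'slow', 't3': 'fast'}
print(makespan)  # 15.0
```

Because the speeds are predictions, the resulting makespan is itself only an estimate; a different objective function (cost, deadline satisfaction) would change the `key` used to pick `best`.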

Information about the status of available resources is very important for a Grid scheduler to make a proper schedule, especially when the heterogeneous and dynamic nature of the Grid is taken into account. The role of the Grid information service (GIS) is to provide such information to Grid schedulers. GIS is responsible for collecting and predicting the resource state information, such as CPU capacities, memory size, network bandwidth, software availabilities and load of a site in a particular period. GIS can answer queries for resource information or push information to subscribers. The Globus Monitoring and Discove