Web Services
Load Leveler
Enabling Autonomic Meta-Scheduling in Grid Environments
Objective
• Enable autonomic meta-scheduling across different organizations

Strategic Importance
• Enhance usability: a common job control language across resource domains
• Drive interoperability of schedulers, both proprietary and open-source
• Provide integrated scheduling views for enterprise and grid customers

Technology Benefits
• Meet various user service objectives: policy driven (e.g., capability based, response time based)
• Maximize resource availability to users with location transparency
• Optimize utilization of resources across domains
I: Objectives
Connection API
• Establish and terminate connections between domain meta-schedulers
• Negotiate roles and connection parameters through the interface
• Provider role: provides resources for job execution; responsible for sending out resource information
• Consumer role: uses resources provided by providers; routes job requests to providers
• Heartbeats: exchanged to guarantee the healthy state of the connection

Resource Exchange API
• Exchange the scheduling capability and capacity of the domain controlled by the meta-scheduler
• Exchanged information can be a complete or incremental set of data

Job Management API
• Submit, re-route, and monitor job executions across schedulers
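The three API groups above could be sketched as a single in-memory peer object. This is a toy illustration only; the class and method names are assumptions, not the actual web-service operations of the meta-scheduler.

```python
class MetaSchedulerPeer:
    """Toy peer illustrating the Connection, Resource Exchange,
    and Job Management API groups (names are illustrative)."""

    def __init__(self, name):
        self.name = name
        self.connections = {}   # peer name -> role we negotiated for that peer
        self.resources = {}     # peer name -> last advertised resource data
        self.jobs = {}          # job id -> (JSDL document, target peer)

    # --- Connection API ---
    def connect(self, peer_name, role):
        assert role in ("provider", "consumer")
        self.connections[peer_name] = role

    def heartbeat(self, peer_name):
        # Connection is considered "healthy" only while it is established.
        return peer_name in self.connections

    def disconnect(self, peer_name):
        self.connections.pop(peer_name, None)

    # --- Resource Exchange API: complete or incremental updates ---
    def exchange_resources(self, peer_name, resources, incremental=False):
        if incremental:
            self.resources.setdefault(peer_name, {}).update(resources)
        else:
            self.resources[peer_name] = dict(resources)

    # --- Job Management API ---
    def submit(self, job_id, jsdl, target_peer):
        self.jobs[job_id] = (jsdl, target_peer)
        return job_id
```

Note how an incremental resource exchange merges into the previously advertised set, while a complete exchange replaces it, matching the "complete or incremental set of data" distinction above.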
IV: System architecture
[Figure: Meta-scheduler architecture — Connection, Resource Exchange, and Job Management APIs exposed over Connection Management, Resource Management, and Job Management components (FIU), linked peer-to-peer]
II: P2P Meta-Scheduling
[Figure: Peer-to-peer topology of meta-schedulers across domains — IBM-USA (TDWB), IBM-India (TDWB), IBM (Fork), BSC (BSCgrid, SGE), FIU-GCB (Fork), FIU (LAGrid)]
III: Autonomic Meta-Scheduling
Yanbin Liu¹, S. Masoud Sadjadi², Liana Fong¹, Ivan Rodero³, David Villegas², Selim Kalayci², Norman Bobroff¹, Juan Carlos Martinez²
1: IBM T. J. Watson Research Center, Yorktown Heights, NY 10598
2: Florida International University, Miami, FL 33199
3: Barcelona Supercomputing Center, Barcelona, Spain
C → P: job flow is from consumer (C) to provider (P); resource information flows from P to C.
[Figure: site architecture detail — Job, Resource, and Connection Management components (IBM); JSDL user client, resource repository with connection and job records, LL Proxy; GCB and LAGrid clusters (SGE, Fork)]
Some key aspects of the Meta-Scheduler Protocol:
• Heterogeneous sites: the inner structure of a domain doesn't affect the functionality of the protocol.
• Site autonomy: each meta-scheduler is responsible for its own site, and offers as much information as it wants to other sites.
• Peer-to-peer: no centralized body, no single point of failure.
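The heartbeat exchange that keeps peer connections healthy could be sketched as follows. The class name, timeout value, and timestamp-based failure detection are assumptions for illustration; the actual protocol's heartbeat mechanics are not specified here.

```python
import time

class HeartbeatMonitor:
    """Declares a peer connection unhealthy when heartbeats stop arriving
    (illustrative sketch; timeout is an assumed parameter)."""

    def __init__(self, timeout_seconds=30.0):
        self.timeout = timeout_seconds
        self.last_seen = {}  # peer name -> timestamp of last heartbeat

    def record(self, peer, now=None):
        # Called whenever a heartbeat message arrives from a peer.
        self.last_seen[peer] = time.time() if now is None else now

    def healthy(self, peer, now=None):
        # A peer is healthy if we heard from it within the timeout window.
        now = time.time() if now is None else now
        return peer in self.last_seen and (now - self.last_seen[peer]) <= self.timeout
```

Because there is no centralized body, each meta-scheduler would run its own monitor and independently decide when a peer link has failed.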
[Figure: site architectures — BSC: eNANOS Client (command line, Java API) and eNANOS Resource Broker in a GT4 Container with an Apache Axis2 WS interface, LAGrid plugin, and WS client, scheduling onto the BSCGrid cluster (Fork) and CEPBA resources (LoadLeveler, LL/Fork) via Globus; FIU: Global Scheduling Manager, Resource Manager, and Site Scheduling Manager over Gridway and Globus; IBM: Meta-TDWB with Job, Resource, and Connection Management and a scheduling policy]
1. The User Client takes the job request from the local user. The request is forwarded to the Global Scheduling Manager (GSM).
2. The GSM queries the Resource Manager (RM) for resources. The RM stores information about local and remote resources.
3. If available resources are found on the local site, the job request is forwarded to the Site Scheduling Manager (SSM).
4. The SSM leverages Gridway functionality to submit the job to the grid middleware (Globus).
5. If no resources are available locally, the job request is sent to a remote site through the WS Client.
6. Alternatively, job requests from other peers can be received through the WS layer.
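The local-versus-remote routing decision in the steps above could be sketched as a small function. The function and parameter names are assumptions; the stubs stand in for the real Resource Manager, SSM, and WS Client components.

```python
def route_job(job_req, resource_manager, ssm_submit, ws_client_forward):
    """Sketch of the GSM routing logic (names are illustrative)."""
    local = resource_manager.find_local(job_req)
    if local:
        # Steps 3-4: forward to the SSM, which submits via Gridway/Globus.
        return ssm_submit(job_req, local)
    # Step 5: no local resources, forward to a remote peer via the WS Client.
    return ws_client_forward(job_req)


class StubResourceManager:
    """Stand-in for the RM; returns a fixed list of local resources."""
    def __init__(self, local_resources):
        self.local_resources = local_resources

    def find_local(self, job_req):
        return self.local_resources
```

A job finds local resources and stays on-site, or falls through to the remote path, mirroring steps 3 and 5.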
1. The eNANOS Client forwards user requests to the eNANOS Broker.
2. Remote requests from the P2P infrastructure are managed by a regular WS (Axis2) acting as a wrapper to a GT4 service that implements the LAGrid APIs and protocols. Connections and other data are stored in Resource Properties.
3. Jobs and resources (aggregated data) obtained from local and remote sites are used in eNANOS Resource Broker scheduling. Jobs are executed in the local domain through Globus services, or are forwarded to another meta-scheduler.
4. eNANOS provides its resource data, forwards jobs, and performs other operations (such as sending heartbeats) through a WS Client.
1. Users send job requests in JSDL format to Meta-TDWB using either the TDWB web console or the command line.
2. Connections with other meta-schedulers are created and recorded.
3. Resource information from other meta-schedulers, from local schedulers such as LoadLeveler and TDWB, and from local resources is recorded.
4. Combined with the chosen scheduling policy, the scheduler matches job requests to resources.
5. Based on the allocated resources, job requests are either forwarded to other schedulers or dispatched to the resources directly.
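The policy-driven matching in steps 4-5 could be sketched as a matcher with pluggable policies. The resource fields and policy names here are illustrative assumptions, not the actual TDWB policy set.

```python
def match(job, resources, policy):
    """Pick a resource for a job under a pluggable policy (illustrative).
    Returns None when no resource qualifies, i.e. the job should be
    forwarded to a peer meta-scheduler (step 5)."""
    candidates = [r for r in resources if r["cpus"] >= job["cpus_needed"]]
    if not candidates:
        return None
    return policy(candidates)

# Example policies (step 4): capability based vs. load based.
most_capable = lambda rs: max(rs, key=lambda r: r["cpus"])
least_loaded = lambda rs: min(rs, key=lambda r: r["load"])
```

Swapping the policy function changes which candidate wins without touching the matching code, which is the point of keeping the scheduling policy a separate, chosen component.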
[Figure: Meta-TDWB interfaces — command line and web console; TDWB resources managed through a Tivoli Agent]
• Use of a common protocol for communication
• Initial negotiation phase
  – Meta-scheduler roles
  – Heartbeat rate
  – Type of authentication
  – Quality of service
• Job execution
  – Based on user policies
  – Resource matching
    · Physical resources
    · Scheduler capabilities (FCFS, priority-based)
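The parameters settled in the initial negotiation phase could be captured in a small record, with a toy acceptance check. The field names, authentication labels, and the accept/reject logic are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class ConnectionParams:
    """Parameters agreed during the initial negotiation phase (illustrative)."""
    role: str                  # meta-scheduler role: "provider" or "consumer"
    heartbeat_seconds: float   # agreed heartbeat rate
    auth_type: str             # type of authentication, e.g. "x509"
    qos: str                   # agreed quality-of-service class

def negotiate(offer: dict, supported_auth: set) -> ConnectionParams:
    """Accept a peer's offer only if its authentication type is supported
    (toy logic standing in for the real negotiation)."""
    if offer["auth_type"] not in supported_auth:
        raise ValueError("unsupported authentication type")
    return ConnectionParams(**offer)
```

A real negotiation would likely involve counter-offers over several round trips; this sketch only shows the shape of the agreed parameters.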