Performance-responsive Scheduling for Grid Computing
Dr Stephen Jarvis
High Performance Systems Group
University of Warwick, UK
Context
• Funded by / collaborating with
– UK e-Science Core Programme
– IBM (Watson, Hursley)
– NASA (Ames)
– NEC Europe
– Los Alamos National Laboratory
• Integrate established performance tools into emerging grid middleware
What do we mean by ‘scheduling’?
• User’s view
– Jobs run somewhere on the Grid
– Notion of deadline
– Execution is single domain (includes pre-staging)
• Resource provider’s view
– Don’t mind which jobs are run where
– As long as resources are well/evenly used
– Maintaining customers’ deadlines is important
• System view
– Jobs can run anywhere
– Resources are heterogeneous
– Throughput is important, as are scheduling overheads
Managing through Middleware
• Determine what resources are required (predict)
• Determine what resources are available (discover)
• Map requirements to available resources (schedule)
• Maintain contract of performance (QoS)
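The four middleware steps above (predict, discover, schedule, QoS) can be sketched as a toy pipeline. This is an illustrative sketch only: the `Resource` fields, the `speed` factor and all function names are assumptions made for the example, not part of any middleware described in these slides.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_cpus: int
    speed: float  # hypothetical relative speed factor


def predict(work_units: float, resource: Resource) -> float:
    """Step 1 (predict): estimated run time of the job on one resource."""
    return work_units / resource.speed


def discover(resources: list[Resource], cpus_needed: int) -> list[Resource]:
    """Step 2 (discover): resources that currently have enough free CPUs."""
    return [r for r in resources if r.free_cpus >= cpus_needed]


def schedule(work_units: float, cpus_needed: int,
             resources: list[Resource]) -> Resource | None:
    """Step 3 (schedule): available resource with the lowest predicted run time."""
    candidates = discover(resources, cpus_needed)
    if not candidates:
        return None
    return min(candidates, key=lambda r: predict(work_units, r))


def meets_contract(work_units: float, resource: Resource, deadline: float) -> bool:
    """Step 4 (QoS): does the predicted run time fit inside the deadline?"""
    return predict(work_units, resource) <= deadline


# Made-up resource pool for illustration.
pool = [Resource("a", 4, 1.0), Resource("b", 16, 2.0), Resource("c", 2, 4.0)]
chosen = schedule(100.0, 4, pool)
```

Here resource `c` is the fastest but is filtered out at the discover step because it has too few free CPUs, so the schedule step falls back to `b`.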
Performance Services
• Intra-domain
– Lab- / department-based
– Shared resources under local administration
• Multi-domain
– Campus- / country-based
– Wide-area resource and task management
– Cross domain
Performance Prediction
• Performance prediction tools
• Aim to predict
– Execution time
– Communication usage
– Data and resource requirements
• Provide a best guess as to how an application will execute on a given resource
[Figure: PACE components: PACE user, application, application model, resource, resource model, evaluation engine, model parameters, resource config.]
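The split between an application model, a resource model and an evaluation engine can be illustrated with a small sketch. Everything here is assumed for the example: the field names, the cost formula (compute time plus a latency/bandwidth communication term) and the numbers are hypothetical and do not reproduce PACE's own modelling language or engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplicationModel:
    # Hypothetical description of what the application does.
    flops_per_element: float
    bytes_per_message: float
    messages_per_processor: int


@dataclass(frozen=True)
class ResourceModel:
    # Hypothetical description of what the hardware costs.
    seconds_per_flop: float
    latency_s: float
    seconds_per_byte: float


def evaluate(app: ApplicationModel, res: ResourceModel,
             elements: int, processors: int) -> float:
    """Combine both models with the model parameters (elements, processors)."""
    compute = app.flops_per_element * elements / processors * res.seconds_per_flop
    if processors == 1:
        comm = 0.0  # no messages are exchanged on a single processor
    else:
        per_message = res.latency_s + app.bytes_per_message * res.seconds_per_byte
        comm = app.messages_per_processor * per_message
    return compute + comm


# Made-up models for illustration.
app = ApplicationModel(flops_per_element=10.0, bytes_per_message=1000.0,
                       messages_per_processor=4)
res = ResourceModel(seconds_per_flop=1e-6, latency_s=1e-3, seconds_per_byte=1e-6)
serial_time = evaluate(app, res, elements=1_000_000, processors=1)
parallel_time = evaluate(app, res, elements=1_000_000, processors=4)
```

Keeping the two models separate means the same application model can be re-evaluated against a different resource model without touching the application description.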
Why is prediction useful?
• Scaling properties
• Compare runtime options with
– deadline
– available resources
– priority / other jobs
– etc.
[Figure: run time on SGI Origin2000 (sec), axis 0 to 50, against the number of processors, 1 to 16, for sweep3d, fft, improc, closure, jacobi, memsort and cpi]
Allows runtime scenarios to be explored before deployment
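One way to explore runtime scenarios before deployment is to look up the predicted run time for each processor count and pick the smallest allocation that still meets the deadline. The sketch below assumes a hypothetical table of predictions; the numbers are made up and are not read from the plot.

```python
def smallest_allocation(predicted, deadline, available):
    """Return the fewest processors that meet the deadline, or None.

    predicted: dict mapping processor count -> predicted run time (seconds)
    available: number of processors currently free
    """
    feasible = [
        procs
        for procs, runtime in sorted(predicted.items())
        if procs <= available and runtime <= deadline
    ]
    return feasible[0] if feasible else None


# Hypothetical predictions for one application.
predicted_runtime = {1: 40.0, 2: 22.0, 4: 13.0, 8: 9.0, 16: 8.0}

choice = smallest_allocation(predicted_runtime, deadline=15.0, available=16)
```

Every entry is checked rather than stopping at the first hit in an unsorted scan, since predicted run time need not fall monotonically as processors are added.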
1. Intra-Domain Co-Scheduling
• Augment Condor scheduler with additional performance information
• Scheduler driver, or co-scheduler (called Titan)
• Use predictive data for system improvement
– Time to complete tasks / utilisation of resources
– QoS – ability to meet deadlines
• Handle predictive and non-predictive tasks
Intra-Domain Co-Scheduling
• Non-predictive tasks
• Tasks with prediction data
[Figure: Titan. Labels: portal, pre-execution engine, matchmaker, schedule queue, PACE, GA, cluster connector, Condor, ClassAds, resources; requests from users or other domain schedulers]
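The idea of handling tasks with and without prediction data can be sketched as follows. Note the substitution: this example uses a simple greedy longest-task-first heuristic to place predicted tasks, not the GA component named in the diagram, and the task names, hosts and run times are hypothetical.

```python
import heapq


def co_schedule(tasks, hosts):
    """Place predicted tasks on hosts; pass the rest through untouched.

    tasks: dict mapping task name -> predicted run time in seconds, or None
           when no prediction data exists
    hosts: non-empty list of host names
    """
    predictive = {n: t for n, t in tasks.items() if t is not None}
    passthrough = [n for n, t in tasks.items() if t is None]

    # Min-heap of (time at which the host becomes free, host name).
    heap = [(0.0, h) for h in hosts]
    heapq.heapify(heap)
    plan = {h: [] for h in hosts}

    # Longest predicted task first, each onto the host that frees up earliest.
    for name, runtime in sorted(predictive.items(), key=lambda kv: -kv[1]):
        free_at, host = heapq.heappop(heap)
        plan[host].append(name)
        heapq.heappush(heap, (free_at + runtime, host))

    makespan = max(t for t, _ in heap)
    return plan, passthrough, makespan


# Made-up workload: four predicted tasks and one without prediction data.
tasks = {"a": 30.0, "b": 20.0, "c": 10.0, "d": 10.0, "legacy": None}
plan, passthrough, makespan = co_schedule(tasks, ["h1", "h2"])
```

In this sketch the tasks in `passthrough` would simply be handed to the underlying scheduler unchanged, since there is nothing to optimise without a predicted run time.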
Intra-Domain Deployment
• Without co-scheduler: time to complete = 70.08m
• With co-scheduler: time to complete = 35.19m
2. Multi-Domain Management
• Publish intra-domain perf. data through global information services (MDS)
• Augment service with agent system
– One agent per domain / VO
• When a task is submitted
– Agents query IS, and negotiate to discover best domain to run task
• Scheme is tested on a 256-node exp. Grid
– 16 resource domains; 6 arch. types
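The agent negotiation for multi-domain management (one agent per domain, agents comparing published performance data to find the best domain for a task) might look roughly like the sketch below. The `DomainAgent` class, its fields and the offer formula are assumptions for illustration; no real MDS or agent API is used.

```python
from dataclasses import dataclass


@dataclass
class DomainAgent:
    domain: str
    backlog_s: float     # hypothetical: time until the domain can start new work
    speed_factor: float  # hypothetical: relative speed of its architecture

    def offer(self, reference_runtime_s: float) -> float:
        """Estimated completion time if this domain took the task."""
        return self.backlog_s + reference_runtime_s / self.speed_factor


def best_domain(agents, reference_runtime_s, deadline_s=None):
    """Collect every agent's offer and return (winning domain, all offers).

    The winner is None when even the best offer misses the deadline.
    """
    offers = {a.domain: a.offer(reference_runtime_s) for a in agents}
    winner = min(offers, key=offers.get)
    if deadline_s is not None and offers[winner] > deadline_s:
        return None, offers
    return winner, offers


# Made-up domains for illustration.
agents = [
    DomainAgent("d1", backlog_s=100.0, speed_factor=1.0),
    DomainAgent("d2", backlog_s=10.0, speed_factor=0.5),
    DomainAgent("d3", backlog_s=0.0, speed_factor=2.0),
]
winner, offers = best_domain(agents, reference_runtime_s=60.0)
```

A real negotiation would be distributed across agents; collecting the offers in one place keeps the sketch short.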
Multi-Domain Management
[Figures: charts with a time axis]
Time to complete = 2752s
Time to complete = 467s; an improvement of 83%
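The quoted improvement can be checked with a short calculation, taking improvement as the relative reduction in time to complete. The function name is an assumption for the example; the input figures are the ones quoted on the slides.

```python
def improvement_percent(before: float, after: float) -> float:
    """Relative reduction in completion time, as a percentage."""
    return 100.0 * (before - after) / before


# Multi-domain figures: 2752s reduced to 467s.
multi_domain = improvement_percent(2752, 467)

# Intra-domain figures: 70.08 reduced to 35.19 (the ratio is unit-free).
intra_domain = improvement_percent(70.08, 35.19)
```

The multi-domain result rounds to the stated 83%, and the same calculation on the intra-domain figures gives a reduction of roughly half.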
QoS: Ability to Meet Deadline
[Figure; legend: active, inactive]
Resource usage
[Figure; legend: active, inactive]
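The two measures named above, ability to meet deadlines and resource usage, could be computed along the following lines. These definitions are assumptions for the example (fraction of jobs finishing by their deadline, and busy time as a share of total host time); the slides do not state the exact formulas used, and the data is made up.

```python
def deadline_hit_rate(jobs):
    """Fraction of jobs that finish on or before their deadline.

    jobs: non-empty list of (completion_time, deadline) pairs
    """
    met = sum(1 for done, deadline in jobs if done <= deadline)
    return met / len(jobs)


def utilisation(busy_per_host, makespan):
    """Share of total host time spent busy.

    busy_per_host: non-empty list of busy seconds, one entry per host
    makespan: time at which the last job finishes (seconds, > 0)
    """
    return sum(busy_per_host) / (len(busy_per_host) * makespan)


# Made-up sample data.
jobs = [(10, 15), (20, 18), (30, 30), (5, 4)]
hit_rate = deadline_hit_rate(jobs)
usage = utilisation([30, 20, 10], makespan=30)
```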
Other work
• OGSA compatibility
• Prediction
– Accuracy
– Other prediction techniques
• Workflow (CCGrid 2003)
• Reservation
• V. 1.1, Condor/GT2-based
– www.dcs.warwick.ac.uk/~hpsg
– Documented at HPDC-12/GGF-8, FGCS