Coscheduling in Clusters: Is it a Viable Alternative?

Gyu Sang Choi, Jin-Ha Kim, Deniz Ersoz, Andy B. Yoo, Chita R. Das

Presented by: Richard Huang

Post on 04-Jan-2016

Outline
- Evaluation of scheduling alternatives
- Proposed HYBRID Coscheduling
- Evaluation
- Conclusions
- Discussion

Evaluation of Scheduling Alternatives
- Local Scheduling
  - Processes of a parallel job are scheduled independently on each node
- Batch Scheduling
  - Most popular (Maui, PBS, etc.)
  - Avoids memory swapping, but gives low utilization and high completion time
- Gang Scheduling
  - All processes of a job (gang) are scheduled together for simultaneous execution
  - Faster completion time, but incurs global synchronization costs
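The difference between gang scheduling and local scheduling can be pictured as a table of time slots by nodes. The sketch below is only an illustration under stated assumptions: the function name `gang_schedule`, the first-fit placement rule, and the one-process-per-node model are hypothetical and are not taken from the slides. It places every process of a job in the same time slot, which is the property that defines a gang.

```python
def gang_schedule(jobs, num_nodes):
    """Build a time-slot table in which all processes of a job share one slot.

    jobs: dict mapping job name -> number of processes (assumed one per node).
    Returns a list of slots; each slot is a list of length num_nodes holding
    the job name running on that node, or None if the node is idle.
    Placement is first-fit, which is an assumption made for this sketch.
    """
    slots = []
    for name, width in jobs.items():
        if width > num_nodes:
            raise ValueError(f"job {name!r} needs {width} nodes, only {num_nodes} exist")
        # Look for an existing time slot with enough idle nodes for the whole gang.
        for slot in slots:
            free = [i for i, owner in enumerate(slot) if owner is None]
            if len(free) >= width:
                break
        else:
            # No slot can hold the whole gang, so open a new time slot.
            slot = [None] * num_nodes
            free = list(range(num_nodes))
            slots.append(slot)
        # All processes of the job land in the same slot: they run simultaneously.
        for i in free[:width]:
            slot[i] = name
    return slots


table = gang_schedule({"A": 4, "B": 2, "C": 2}, num_nodes=4)
for t, slot in enumerate(table):
    print(f"slot {t}: {slot}")
```

Under local scheduling there is no such table: each node picks its next process on its own, so two processes of the same job may run at different times. Switching every node from one slot to the next at the same moment is where the global synchronization cost of gang scheduling comes from.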

Communication-Driven Coscheduling
- Dynamic Coscheduling (DCS)
  - Uses incoming message to schedule processes for which messages are destined
- Spin Block (SB)
  - Process waiting for message spins for fixed amount of time before blocking itself
- Periodic Boost (PB)
  - Periodically boosts priority of process with un-consumed messages
- Co-ordinated Coscheduling (CC)
  - Optimizes spinning time to improve performance at both sender and receiver
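The Spin Block idea above (spin for a fixed time, then block) can be sketched in user space. This is a minimal illustration, assuming a `threading.Event` stands in for "a message has arrived"; the function name `spin_block_wait` and its parameters are hypothetical, and a real implementation would live in the message-passing layer rather than in application code.

```python
import threading
import time


def spin_block_wait(message_arrived, spin_seconds, timeout=None):
    """Wait for a message using a spin-then-block strategy.

    message_arrived: threading.Event set by the (simulated) receiver path.
    spin_seconds: fixed spin budget before giving up the CPU.
    Returns "spin" if the message arrived while spinning, "block" if it
    arrived after the process blocked, or "timeout" if it never arrived.
    """
    deadline = time.monotonic() + spin_seconds
    # Phase 1: busy-wait, keeping the CPU in the hope the message is imminent.
    while time.monotonic() < deadline:
        if message_arrived.is_set():
            return "spin"
    # Phase 2: the spin budget is used up, so block and let other
    # ready processes use the CPU until the message shows up.
    if message_arrived.wait(timeout):
        return "block"
    return "timeout"


if __name__ == "__main__":
    ev = threading.Event()
    # Simulate a message that arrives 50 ms from now.
    threading.Timer(0.05, ev.set).start()
    print(spin_block_wait(ev, spin_seconds=0.001, timeout=2.0))
```

Setting `spin_seconds` to zero turns this into immediate blocking, the choice the HYBRID scheme makes at both sender and receiver.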

HYBRID Coscheduling
- Idea:
  - Combines merits of both gang scheduling and communication-driven coscheduling
  - Coschedule ALL processes like gang scheduler
  - Boost process priority during communication phase
- Issues:
  - How to differentiate between computation and communication phases?
  - How to ensure fairness during boosting?

HYBRID Coscheduling
- Boost priority whenever a parallel process enters a collective communication phase
- Immediate blocking used at sender and receiver
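The two HYBRID rules above can be sketched with a simulated process record. Everything named here is an assumption made for illustration: the `Process` class, the priority constants, `collective_phase`, and `blocking_receive` are hypothetical and do not correspond to any real MPI or kernel API. The sketch also assumes the priority is restored when the collective operation ends, which the slides do not state explicitly.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field

BASE_PRIORITY = 0      # hypothetical normal priority
BOOSTED_PRIORITY = 10  # hypothetical boosted priority


@dataclass
class Process:
    name: str
    priority: int = BASE_PRIORITY
    state: str = "running"
    log: list = field(default_factory=list)


@contextmanager
def collective_phase(proc):
    """Boost priority on entry to a collective communication phase.

    Restoring the base priority on exit is an assumption of this sketch.
    """
    proc.priority = BOOSTED_PRIORITY
    proc.log.append("boost")
    try:
        yield proc
    finally:
        proc.priority = BASE_PRIORITY
        proc.log.append("restore")


def blocking_receive(proc, inbox):
    """Immediate blocking: if no message is waiting, block without spinning."""
    if not inbox:
        proc.state = "blocked"
        proc.log.append("block")
        return None
    proc.state = "running"
    return inbox.pop(0)


p = Process("rank0")
with collective_phase(p):
    # Inside the collective the process runs boosted; with an empty inbox
    # it blocks at once instead of spinning.
    blocking_receive(p, inbox=[])
print(p.log)
```

Because the boost is tied to entry into the collective call, this kind of scheme only needs changes in the message-passing layer, which matches the claim made later in the conclusions.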

Traditional and Generic Coscheduling Framework

Evaluation
- 16-node Linux cluster connected through a 16-port Myrinet switch
- 100 mixed applications from NAS
- Two different job allocation schemes
  - PACKING: contiguous nodes assigned to a job to reduce system fragmentation and increase system utilization
  - NO PACKING: parallel processes of a job randomly allocated to available nodes in the system
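The two allocation schemes can be contrasted with a small sketch. It assumes nodes are numbered 0 to N-1 and that "contiguous" means consecutive node numbers; the function names `allocate_packing` and `allocate_no_packing` and the first-fit search are hypothetical, not taken from the evaluated system.

```python
import random


def allocate_packing(free, width):
    """PACKING: return the first run of `width` consecutive free nodes, or None.

    free: list of booleans, free[i] is True if node i is available.
    """
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == width:
                return list(range(run_start, run_start + width))
        else:
            run_len = 0  # the run is broken by a busy node
    return None


def allocate_no_packing(free, width, rng=random):
    """NO PACKING: pick `width` free nodes at random, or None if too few."""
    candidates = [i for i, is_free in enumerate(free) if is_free]
    if len(candidates) < width:
        return None
    return sorted(rng.sample(candidates, width))


free_nodes = [True, False, True, True, True, False]
print(allocate_packing(free_nodes, 3))
print(allocate_no_packing(free_nodes, 3, rng=random.Random(0)))
```

In this example a 4-process job cannot be placed under PACKING, because no four consecutive nodes are free, while NO PACKING can still place it on the four scattered free nodes.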

Performance Comparison

Observations
- Average performance gain for PACKING about 20% compared to NO PACKING
- Under high load, big differences due to waiting times
- Under light load, difference in execution time more pronounced
- Batch scheduler has lowest execution time, followed by HYBRID
- HYBRID has lowest completion time among all scheduling schemes

Explanations
- HYBRID avoids unnecessary spinning
  - Process is blocked immediately if the communication operation is not complete
- HYBRID reduces communication delay
  - Process wakes up immediately upon receipt of a message (since its priority is boosted)
- HYBRID avoids interrupt overheads
  - CC, DCS, and PB need frequent interrupts from NIC to CPU to boost a process’s priority
  - HYBRID boosts only at the beginning of an MPI collective communication
- HYBRID avoids the global synchronization overhead of gang scheduling
  - HYBRID follows implicit coscheduling

Other Results
- Communication-driven coscheduling should deploy a memory-aware allocator to avoid expensive disk activities
- Completing jobs faster can lead to energy savings by using dynamic voltage scaling or shutting down machines

Conclusions
- Coscheduling mechanisms like HYBRID, SB, or CC can give significant performance improvement
- Block-based scheduling techniques had better results because other processes in the ready state can proceed
- HYBRID scheme is the best performer and can be easily implemented on any platform with modifications only in the message-passing layer
- New techniques deployed on a cluster should avoid expensive memory swapping
- Improved efficiency in the scheduling algorithm can translate to a better performance-energy ratio

Discussion
- Can it be true that blocking is always better than spinning?
- How likely is it that clusters and supercomputers will move away from batch scheduling?
- Do people try to save energy by improving the scheduling algorithm?