work stealing scheduler

32
WORK STEALING SCHEDULER 6/16/2010 Work Stealing Scheduler 1

Upload: bracha

Post on 23-Feb-2016

37 views

Category:

Documents


0 download

DESCRIPTION

Work Stealing Scheduler. Announcements. Text books Assignment 1 Assignment 0 results Upcoming Guest lectures. Recommended Textbooks. Patterns for Parallel Programming Timothy Mattson, et.al. The Art of Multiprocessor Programming Maurice Herlihy, Nir Shavit. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Work Stealing Scheduler

Work Stealing Scheduler 1

WORK STEALING SCHEDULER

6/16/2010

Page 2: Work Stealing Scheduler

Work Stealing Scheduler 2

Announcements• Text books• Assignment 1• Assignment 0 results• Upcoming Guest lectures

6/16/2010

Page 3: Work Stealing Scheduler

Work Stealing Scheduler 3

Recommended Textbooks

6/16/2010

The Art of Multiprocessor Programming

Maurice Herlihy, Nir Shavit

Patterns for ParallelProgramming

Timothy Mattson, et.al.

Parallel Programming with Microsoft .NET

http://parallelpatterns.codeplex.com/

Page 4: Work Stealing Scheduler

Work Stealing Scheduler 4

Available on Website• Assignment 1 (due two weeks from now)• Paper for Reading Assignment 1 (due next week)

6/16/2010

Page 5: Work Stealing Scheduler

Work Stealing Scheduler 5

Assignment 0• Thanks for submitting • Provided important feedback to us

6/16/2010

Page 6: Work Stealing Scheduler

Work Stealing Scheduler 6

Percentage Students Answering Yes

6/16/2010

Vector C

lock

Empiric

al Measu

rement

Fence

s

Functi

onal Progra

mming

Lock

Ordering

Recurre

nceACID

Race Condition

0

20

40

60

80

100

Page 7: Work Stealing Scheduler

Work Stealing Scheduler 7

Number of Yes Answers Per Student

6/16/2010

0 1 2 3 4 5 6 7 80

2

4

6

8

10

12

14

Page 8: Work Stealing Scheduler

Work Stealing Scheduler 8

Upcoming Guest Lectures• Apr 12: Todd Mytkowicz • How to (and not to) measure performance

• Apr 19: Shaz Qadeer• Correctness Specifications and Data Race Detection

6/16/2010

Page 9: Work Stealing Scheduler

Work Stealing Scheduler 9

Last Lecture Recap

6/16/2010

Page 10: Work Stealing Scheduler

Work Stealing Scheduler 10

Last Lecture Recap• Parallel computation can be represented as a DAG• Nodes represent sequential computation• Edges represent dependencies

6/16/2010

Split

Sequential Sort

Merge

Sequential Sort

Time

Page 11: Work Stealing Scheduler

Work Stealing Scheduler 11

Last Lecture Recap• Parallel computation can be represented as a DAG

• = Work = time on a single processor

• = Depth = time assuming infinite processors

• = time on P processor using optimal scheduler

6/16/2010

Page 12: Work Stealing Scheduler

Work Stealing Scheduler 12

Last Lecture Recap• Work law:

• Depth law:

6/16/2010

Page 13: Work Stealing Scheduler

Work Stealing Scheduler 13

Last Lecture Recap• Work law:

• Depth law:

• Greedy scheduler is optimal within a factor of 2

6/16/2010

Page 14: Work Stealing Scheduler

Work Stealing Scheduler 14

This Lecture• Design of a greedy scheduler• Task abstraction• Translating high-level abstractions to tasks

6/16/2010

Page 15: Work Stealing Scheduler

Work Stealing Scheduler 15

This Lecture• Design of a greedy scheduler• Task abstraction• Translating high-level abstractions to tasks

6/16/2010

.NET.NETProgram

IntelTBB

TBBProgram

… Scheduler

Page 16: Work Stealing Scheduler

Work Stealing Scheduler 16

(Simple) Tasks• A node in the DAG• Executing a task generates dependent subtasks

6/16/2010

Split(16) Merge(16)

Split(8) Merge(8)Sort(4)

Sort(4)

Split(8) Merge(8)Sort(4)

Sort(4)

Page 17: Work Stealing Scheduler

Work Stealing Scheduler 17

(Simple) Tasks• A node in the DAG• Executing a task generates dependent subtasks• Note: Task C is generated by A or B, whoever finishes last

6/16/2010

Split(16) Merge(16)

Split(8) Merge(8)Sort(4)

Sort(4)

Split(8) Merge(8)Sort(4)

Sort(4)

CA

B

Page 18: Work Stealing Scheduler

Work Stealing Scheduler 18

Design Constraints of the Scheduler• The DAG is generated dynamically• Based on inputs and program control flow• The graph is not known ahead of time

• The amount of work done by a task is dynamic• The weight of each node is not know ahead of time

• Number of processors P can change at runtime• Hardware processors are shared with other processes, kernel

6/16/2010

Page 19: Work Stealing Scheduler

Work Stealing Scheduler 19

Design Requirements of the Scheduler

6/16/2010

Page 20: Work Stealing Scheduler

Work Stealing Scheduler 20

Design Requirements of the Scheduler• Should be greedy• A processor cannot be idle when tasks are pending

• Should limit communication between processors

• Should schedule related tasks in the same processor• Tasks that are likely to access the same cachelines

6/16/2010

Page 21: Work Stealing Scheduler

Work Stealing Scheduler 21

Attempt 0: Centralized Scheduler• “Manager distributes tasks to others”

• Manager: assigns tasks to workers, ensures no worker is idle

• Workers: On task completion, submit generated tasks to the manager

6/16/2010

Page 22: Work Stealing Scheduler

Work Stealing Scheduler 22

Attempt 1: Centralized Work Queue• “All processors share a common work queue”

• Every processor dequeues a task from the work queue

• On task completion, enqueue the generated tasks to the work queue

6/16/2010

Page 23: Work Stealing Scheduler

Work Stealing Scheduler 23

Attempt 2: Work Sharing• “Loaded workers share”

• Every processor pushes and pops tasks into a local work queue

• When the work queue gets large, send tasks to other processors

6/16/2010

Page 24: Work Stealing Scheduler

Work Stealing Scheduler 24

Disadvantages of Work Sharing• If all processors are busy, each will spend time trying to

offload• “Perform communication when busy”

• Difficult to know the load on processors• A processor with two large tasks might take longer than a processor

with five small tasks

• Tasks might get shared multiple times before being executed

• Some processors can be idle while others are loaded• Not greedy

6/16/2010

Page 25: Work Stealing Scheduler

Work Stealing Scheduler 25

Attempt 3: Work Stealing• “Idle workers steal”

• Each processor maintains a local work queue• Pushes generated tasks into the local queue• When local queue is empty, steal a task from another

processor

6/16/2010

Page 26: Work Stealing Scheduler

Work Stealing Scheduler 26

Nice Properties of Work Stealing• Communication done only when idle• No communication when all processors are in full throttle

• Each task is stolen at most once

• This scheduler is greedy, assuming stealers always succeed

• Limited communication• steals on average for some stealing strategies

6/16/2010

Page 27: Work Stealing Scheduler

Work Stealing Scheduler 27

Nice Properties of Work Stealing• Communication done only when idle• No communication when all processors are in full throttle

• Each task is stolen at most once

• This scheduler is greedy, assuming stealers always succeed

• Limited communication• steals on average for some stealing strategies

• Assignment 1 explores different stealing strategies

6/16/2010

Page 28: Work Stealing Scheduler

Work Stealing Scheduler 28

Work Stealing Queue Datastructure• A specialized deque (Double-Ended Queue) with three

operations:• Push : Local processor adds newly created tasks • Pop : Local processor removes task to execute• Steal : Remote processors remove tasks

6/16/2010

Push

Page 29: Work Stealing Scheduler

Work Stealing Scheduler 29

Work Stealing Queue Datastructure• A specialized deque (Double-Ended Queue) with three

operations:• Push : Local processor adds newly created tasks • Pop : Local processor removes task to execute• Steal : Remote processors remove tasks

6/16/2010

Push Pop?

Steal?

Pop?

Steal?

Page 30: Work Stealing Scheduler

Work Stealing Scheduler 30

Work Stealing Queue Datastructure• A specialized deque (Double-Ended Queue) with three

operations:• Push : Local processor adds newly created tasks • Pop : Local processor removes task to execute• Steal : Remote processors remove tasks

6/16/2010

Push

Steal

Pop

Page 31: Work Stealing Scheduler

Work Stealing Scheduler 31

Advantages• Stealers don’t interact with local processor when the

queue has more than one task• Popping recently pushed tasks improves locality• Stealers take the oldest tasks, which are likely to be

the largest (in practice)

6/16/2010

Push

Steal

Pop

Page 32: Work Stealing Scheduler

Work Stealing Scheduler 32

For Assignment 1• We provide an implementation of Work Stealing

Queue• This implementation is thread-safe• Clients can concurrently call the push, pop, steal operations• You don’t need to use additional locks or other

synchronization• Implement the scheduler logic and stealing strategy• Hint: this implementation is single threaded

6/16/2010