lottery and stride scheduling - waldspurger.org

29
Lottery and Stride Scheduling Flexible Proportional-Share Resource Management Carl A. Waldspurger Parallel Software Group MIT Laboratory for Computer Science August 25, 1995

Upload: others

Post on 16-Oct-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Lottery and Stride SchedulingFlexible Proportional-Share Resource Management

Carl A. Waldspurger

Parallel Software GroupMIT Laboratory for Computer Science

August 25, 1995

Overview

Context

Framework

Mechanisms

Prototypes

Diverse Resources

Conclusions

Problem

Environment� multiplex scarce resources� concurrently executing clients� service requests of varying importance

Goals� manage computation rates dynamically� enable flexible application-level policies� promote software engineering principles

Related Work

Priority-Based Scheduling� operating systems� real-time systems

Share-Based Scheduling� fair-share� proportional-share� microeconomic

Rate-Based Network Flow Control� virtual clock, WFQ� AN2 switch

Contributions

New Framework� simple, powerful abstractions� modular resource management

Novel Mechanisms� randomized and deterministic algorithms� precise control over service rates

Resource-Specific Techniques� proportional-share control� locks, memory, disk I/O

Resource Management Framework

Simple� direct control over service rates� resource rights aggregate and vary smoothly

Modular� powerful abstraction mechanism� insulate concurrent modules

Flexible� can express sophisticated policies� adapts to dynamic changes� general-purpose, scalable

Framework Abstractions

Tickets� first-class objects� encapsulate resource rights� proportional throughput� inversely proportional response time

Currencies� modular abstraction mechanism� name, share, protect sets of tickets� flexibly group or isolate sets of clients

Dynamic Management Techniques

Ticket Transfers� explicit transfer between clients� useful when client blocks while waiting� example: synchronous IPC

Ticket Inflation and Deflation� clients create and destroy tickets� effects locally contained by currencies� example: progress-based allocation

Example Currency Graph

base3000

2000base

1000base

bob100

alice300

100alice

task2

100bob

task3

200alice

task1

1task3

1

Computing Values� currency:sum value ofbacking tickets� ticket:compute share ofcurrency value

Example� task2 funding inbase units?� 100

3001000 +

11

100

1002000� 2333 base units

Proportional-Share Mechanisms

Randomized� lottery� multi-winner lottery

Deterministic� stride� hierarchical stride

Evaluation Criteria� throughput accuracy� response-time variability� algorithmic complexity

Lottery Scheduling Example

10 2 5 1 2

0 2 4 6 8 10 12 14 16 18

winner

random [0..19] = 13total = 20

Lottery Scheduling Analysis

Strengths� simple, stateless algorithm� supports dynamic operations� randomization prevents cheating

Weaknesses� guarantees are probabilistic� poor short-term accuracy: O(pna) absolute error� high response-time variability: �=� =

p1� p

Multi-Winner Lottery Example

10 2 5 1 2

0 2 4 6 8 10 12 14 16 18

winner#1

winner#2

winner#3

winner#4

random [0..19] = 13total = 20 #win = 4

total / #win = 5

Multi-Winner Lottery Analysis

Strengths� improves accuracy for large clients� guarantees bnw tT c quanta per superquantum� bounds worst-case response time� improves list-based efficiency

Weaknesses� probabilistic guarantees for small clients� dynamic operations terminate superquantum

Stride Scheduling Example

0 5 10Time (quanta)

0

5

10

15

20

Pas

s V

alu

e

3 : 2 : 1 allocation

Initialization� stride =stride1

tickets� pass = stride� stride1 = 6strides: 2, 3, 6

Allocation� choose client Cwith minimum pass� C.pass += C.stride

Dynamic Stride Allocation Change

stride’

stride

now pass

pass’

remain

remain’

done

now

Allocation Change� tickets! tickets0� stride0 =stride1

tickets 0� remain0 = stride 0

strideremain� pass0 = now + remain0

no updates needed

for other clients

Stride Scheduling Analysis

Strengths� strong deterministic guarantees� throughput error independent of na� maximum relative error is one quantum

Weaknesses� O(nc) absolute error� poor behavior for skewed ticket allocations

Hierarchical Stride Example

18

10 20

2

90 90

10

18 54

12

15 45

6

30 30

5

36 36

1

180 180

10 : 2 : 5 : 1 Ratio

tickets

stride pass Node

Initialization� stride =stride1

tickets� pass = stride� stride1 = 180

Allocation� follow child C withsmaller pass value� C.pass += C.stride

Hierarchical Stride Analysis

Strengths� O(lgnc) absolute error� reduces worst-case response-time variability� avoids worst-case stride scheduling behavior

Weaknesses� can increase response-time variability� actual error can exceed stride scheduling error� complex dynamic operations

Throughput Accuracy Comparison

Time (quanta)

Ab

solu

te E

rro

r (q

uan

ta)

0 200 400 600 800 10000

5

10

Lo

tter

y

0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000

0 200 400 600 800 10000

5

10

Mu

lti-

Win

ner

0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000

0 200 400 600 800 10000

5

10

Str

ide

0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000

0 200 400 600 800 1000

13 Tickets

0

5

10

Hie

rarc

hic

al

0 200 400 600 800 1000

1 Ticket0 200 400 600 800 1000

7 Tickets0 200 400 600 800 1000

3 Tickets

0 20 400

1

2

0 20 400

1

2

0 20 400

1

2

0 20 400

1

2

0 20 400

1

2

0 20 400

1

2

0 20 400

1

2

0 20 400

1

2

Static Allocation

13 : 7 : 3 : 1 Ratio

Mechanisms� lottery� multi-winner (2,4,8)� stride� hierarchical

Response-Time Comparison

Response Time (quanta)

Fre

qu

ency

(lo

g s

cale

)

0 10 20 30 40 500

1

10

100

1000

10000

100000

1000000

Lo

tter

y

0 10 20 30 40 50 0 10 20 30 40 50 0 10 20 30 40 50

0 10 20 30 40 500

1

10

100

1000

10000

100000

1000000

Mu

lti-

Win

ner

0 10 20 30 40 50 0 10 20 30 40 50 0 10 20 30 40 50

0 10 20 30 40 500

1

10

100

1000

10000

100000

1000000

Str

ide

0 10 20 30 40 50 0 10 20 30 40 50 0 10 20 30 40 50

0 10 20 30 40 50

13 Tickets

0

1

10

100

1000

10000

100000

1000000

Hie

rarc

hic

al

0 10 20 30 40 50

7 Tickets0 10 20 30 40 50

3 Tickets0 10 20 30 40 50

1 Ticket

Static Allocation

13 : 7 : 3 : 1 Ratio

Mechanisms� lottery� multi-winner (4)� stride� hierarchical

Prototype Process Schedulers

Lottery Scheduler� modified Mach microkernel� DECStation 5000/125� complete framework implementation

Stride Scheduler� modified Linux kernel� IBM Thinkpad 350C� no ticket transfers or currencies

Low System Overhead

Relative Rate Accuracy

0 2 4 6 8 10Ticket Ratio

0

5

10

15

Ob

serv

ed It

erat

ion

Rat

io

Lottery Scheduler� Dhrystonebenchmark� two tasks� three 60-secondruns for each ratio

Stride Scheduler� arith benchmark� two tasks� three 30-secondruns for each ratio

Dynamic Ticket Deflation

0 500 1000Time (sec)

0

20

40

60

80

100

Cu

mu

lati

ve T

rial

s (t

ho

usa

nd

s)

stride scheduler

Monte-Carlo

simulations

many trials for

accurate results

three tasks

funding based

on relative error

Dynamic Ticket Transfers

0 200 400 600 800Time (sec)

0

10

20

30

40

Qu

erie

s P

roce

ssed

lottery scheduler

query processing

multithreaded

“database” server

three clients

8 : 3 : 1 allocation

Modular Load Insulation

0 100 200 300Time (sec)

0

2

4

6

8

Iter

atio

ns

(mill

ion

s) A

B1

B2

lottery scheduler

currencies A, B

2 : 1 funding

task A

funding 100.A

task B1

funding 100.B

task B2 joins with

funding 100.B

Managing Diverse Resources

Synchronization Resources� locks, condition variables� ticket inheritance, repayment

Space-Shared Resources� inverse lotteries� minimum-funding revocation

Disk I/O Bandwidth

Multiple Resources

Conclusions

General Framework� direct application-level control� simple, modular, flexible� widely applicable

Proportional-Share Algorithms� lottery and stride scheduling� efficient O(lgnc) operations� techniques for locks, memory, disk

Future Directions

Multiple Resources� manage all critical resources� develop tools for adaptive software� microeconomic vs. proportional-share

Human-Computer Interaction� improve application responsiveness� GUI elements for resource management