Priority Based Fair Scheduling: A Memory Scheduler Design for Chip-Multiprocessor Systems


Upload: stu

Post on 06-Feb-2016


Priority Based Fair Scheduling: A Memory Scheduler Design for Chip-Multiprocessor Systems

Tsinghua University

Tsinghua National Laboratory for Information Science and Technology


Background

• “Memory-wall”
– High memory access latency

• DRAM structure
– Channel, rank, bank, row, column …
– Various timing constraints

• Challenge of multi-core
– High parallelism
– More data contention

• Solution
– More memory channels
– Efficient memory scheduler


Motivation

• Thread classification [TCM:Kim:2008]
– Latency-sensitive threads
– Bandwidth-sensitive threads

• A memory scheduler should
– Improve system throughput
– Avoid starvation
– Maintain fairness among threads


Goals

• Requests of latency-sensitive threads
– To be issued as soon as possible

• Requests of bandwidth-sensitive threads
– Avoid unfairness

• Our proposal: PBFS
– Prioritize latency-sensitive threads
– Avoid starvation of bandwidth-sensitive threads


Basic Idea

• Each thread gets a priority
– Ranges from -1 to n

• Top priority (n)
– Latency-sensitive threads

• Bottom priority (0)
– Bandwidth-sensitive threads

• Medium priority (1 to n-1)
– Intermediate threads

• Idle (-1)
– Finished threads or compute-intensive threads
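The four priority levels can be sketched as a small lookup. This is only an illustration: the function name `classify` and the level labels it returns are assumptions, not names taken from the slides.

```python
def classify(priority: int, top: int) -> str:
    """Map a priority value in [-1, top] to the level named on the slide.

    `top` is the current top priority n. The returned labels are
    illustrative names only (hypothetical, not from the slides).
    """
    if not -1 <= priority <= top:
        raise ValueError(f"priority {priority} outside [-1, {top}]")
    if priority == -1:
        return "idle"      # finished or compute-intensive thread
    if priority == top:
        return "top"
    if priority == 0:
        return "bottom"
    return "medium"        # anything in 1 .. top-1


if __name__ == "__main__":
    for p in (-1, 0, 1, 2):
        print(p, classify(p, top=2))
```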


Priority Updating Rules

• Priorities are updated dynamically
– Once a request is issued
  • The issuing thread's priority is decremented by 1
– When no thread has top priority
  • All threads' priorities are incremented by 1
– When a time threshold is reached
  • Identify idle threads
  • Adjust the top priority
    – Extremely unbalanced: increase top priority
    – Extremely balanced: decrease top priority
    – Otherwise: unchanged
    – Upper/lower boundaries are adjusted according to the active threads
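A minimal sketch of the first two update rules, assuming a floor of 0 for active threads, ties broken by lowest thread id, and all threads starting at top priority. The class name `PriorityTable` and its methods are hypothetical. The periodic threshold step (idle detection and top-priority adjustment) is left out because the slides do not give its thresholds.

```python
IDLE = -1  # priority value reserved for idle threads


class PriorityTable:
    """Per-thread priorities with the issue and promotion rules (sketch)."""

    def __init__(self, num_threads: int, top: int) -> None:
        self.top = top
        # Assumption: every thread starts at top priority.
        self.prio = [top] * num_threads

    def on_issue(self, tid: int) -> None:
        """A request of thread `tid` was issued: lower its priority by 1."""
        if self.prio[tid] > 0:          # assumed floor at bottom priority 0
            self.prio[tid] -= 1
        self._promote_if_no_top()

    def _promote_if_no_top(self) -> None:
        """If no active thread holds top priority, raise all active threads by 1."""
        active = [p for p in self.prio if p != IDLE]
        if active and max(active) < self.top:
            self.prio = [p if p == IDLE else p + 1 for p in self.prio]

    def pick(self, pending):
        """Return the pending thread with the highest priority, or None.

        Ties go to the lowest thread id (an assumption of this sketch).
        """
        candidates = [t for t in pending if self.prio[t] != IDLE]
        if not candidates:
            return None
        return max(candidates, key=lambda t: (self.prio[t], -t))


if __name__ == "__main__":
    table = PriorityTable(num_threads=2, top=2)
    table.on_issue(0)
    print(table.prio)   # thread 0 dropped by one level
    table.on_issue(1)
    print(table.prio)   # nobody held top priority, so both were raised
```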


System throughput

• Latency-sensitive threads
– Easily reach top priority
– Are issued as soon as possible

• Example
– 2-core CMP
  • Thread A: latency-sensitive
  • Thread B: bandwidth-sensitive
  • Top priority = 2
  • Initially, both threads' priorities are 2


Example

[Figure: request timelines for Thread A and Thread B and the resulting execution order, with each thread's priority per memory cycle]

Mem. cycle:  0  1  2  3  4  5  6  7  8  9 10 11
Priority A:  2  2  2  1  2  2  2  2  1  2  2  2
Priority B:  1  0  0  0  0  0  0  0  0  0  0  0


Starvation Avoidance

• When a thread issues too many requests in succession
– It is classified as a bandwidth-sensitive thread
– Other threads get more chances to raise their priorities

• Example
– 2-core CMP
  • Thread A: less bandwidth-sensitive
  • Thread B: bandwidth-sensitive
  • Top priority = 2
  • Initially, both threads' priorities are 2
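To make the starvation argument concrete, here is a toy simulation of two threads with different backlogs. It is a sketch under simplifying assumptions not found in the slides: one request is issued per cycle, DRAM timing is ignored, priorities are floored at 0, and ties go to the lowest thread id. It is not meant to reproduce the per-cycle numbers in the slide's figure.

```python
def simulate(backlog, top=2):
    """Return the order in which threads are served.

    backlog[t] is the number of requests thread t has queued at the start
    (hypothetical input format). Each step issues one request from the
    ready thread with the highest priority.
    """
    prio = [top] * len(backlog)       # all threads start at top priority
    left = list(backlog)
    order = []
    while any(left):
        ready = [t for t, n in enumerate(left) if n > 0]
        tid = max(ready, key=lambda t: (prio[t], -t))
        left[tid] -= 1
        prio[tid] = max(prio[tid] - 1, 0)     # issuing lowers priority
        if max(prio) < top:                   # nobody at top: raise everyone
            prio = [p + 1 for p in prio]
        order.append(tid)
    return order


if __name__ == "__main__":
    # Thread 0 has a short backlog, thread 1 a long one.
    print(simulate([3, 10]))
```

With a backlog of 3 and 10 requests, the two threads alternate until the shorter queue drains, so the heavier thread neither monopolizes the channel nor is starved by it.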


Example

[Figure: request timelines for Thread A and Thread B and the resulting execution order, with each thread's priority per memory cycle]

Mem. cycle:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Priority A:  2  2  2  1  1  2  1  2  1  2  1  2  1  2  2  2
Priority B:  1  0  0  0  1  1  1  1  1  1  1  1  1  1  0  0


Hardware overhead

• Hardware support is needed to
– Record the priority of each thread
– Monitor each thread's behavior (read counts within a time interval)
– Maintain flags indicating whether a row buffer can be closed

• The storage overhead is small and the design is easy to implement
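A rough back-of-the-envelope estimate of the storage involved might look like the following. All widths are assumptions chosen for illustration: the slides do not give the counter width, the number of banks, or whether the row-buffer flag is kept per bank.

```python
def storage_bits(num_threads: int, top_priority: int,
                 counter_bits: int, num_banks: int) -> int:
    """Estimate total state in bits (illustrative assumptions only).

    Per thread: one priority register covering -1 .. top_priority and one
    read counter of `counter_bits` bits. Per bank: one row-buffer flag
    (assumed to be 1 bit per bank).
    """
    # Priorities take top_priority + 2 distinct values (-1, 0, ..., top).
    prio_bits = (top_priority + 1).bit_length()
    return num_threads * (prio_bits + counter_bits) + num_banks


if __name__ == "__main__":
    # Hypothetical configuration: 4 threads, top priority 2,
    # 16-bit read counters, 8 banks.
    print(storage_bits(4, 2, 16, 8), "bits")
```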


Evaluation

• USIMM-1.3

• Memory configurations
– 1 channel
– 4 channels

• Benchmarks

• Metrics
– Execution time
– Maximum slowdown
– EDP (energy-delay product)
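For reference, the metrics can be computed as below, using their usual definitions: maximum slowdown is the largest ratio of a thread's shared-run time to its stand-alone time, and EDP is energy multiplied by delay. The function names and input format are assumptions, and the slides do not state which baseline the reported reductions are measured against.

```python
def max_slowdown(shared_times, alone_times):
    """Largest per-thread ratio of shared execution time to stand-alone time."""
    return max(s / a for s, a in zip(shared_times, alone_times))


def edp(energy, delay):
    """Energy-delay product: energy consumed multiplied by execution time."""
    return energy * delay


def reduction_pct(baseline, value):
    """Percentage reduction of `value` relative to `baseline`."""
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(max_slowdown([200, 150], [100, 100]))
    print(edp(2.0, 3.0))
    print(reduction_pct(100.0, 92.5))
```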


Execution Time

• Overall
– CLOSE: 4.2% reduction
– PBFS: 7.5% reduction


Maximum Slowdown

• Overall
– CLOSE: 4.7% reduction
– PBFS: 7.0% reduction


EDP


• Overall
– CLOSE: 9.1% reduction
– PBFS: 13.8% reduction


Summary

• We proposed PBFS
– Classify threads with priority
– Dynamically update threads' priorities
– Guarantee system throughput
– Avoid starvation of bandwidth-sensitive threads
– Low hardware overhead


Thanks
