CBR: Sharing DRAM with Minimum Latency and Bandwidth Guarantees

Post on 31-Dec-2015



TRANSCRIPT

CBR: Sharing DRAM with Minimum Latency and Bandwidth Guarantees

Zefu Dai, Mark Jarvin and Jianwen Zhu

University of Toronto

23/4/19 University of Toronto 2

Background

Consumer electronics is part of everyday life!

Figure labels: SoC, Mem Contr., DRAM

Background

A portable media player SoC example

Figure labels: 6.4, 9.6, 1.2, 164.8, 0.09, 31.0, 156.7, 94 MB/s; annotation: 1000x

Speech bubbles in the figure:

- "Give me 10 KB in 1 us, please."
- "I want the data NOW!!!"
- "I can only supply a maximum of 6.4 GB every second."
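A quick check of the numbers in the speech bubbles, as a rough sketch. It assumes decimal units (1 KB = 1000 bytes, 1 GB = 10^9 bytes), which the slide does not state, and the variable names are illustrative only. Under that assumption, the request for 10 KB in 1 us is a burst demand of 10 GB/s, above the 6.4 GB/s quoted as the maximum supply.

```python
# Assumption: decimal units (1 KB = 1000 bytes, 1 GB = 1e9 bytes).
request_bytes = 10 * 1000      # "Give me 10 KB ..."
window_us = 1                  # "... in 1 us"
peak_gb_per_s = 6.4            # "a maximum of 6.4 GB every second"

# Scale the request to a one-second window using integer arithmetic.
burst_bytes_per_s = request_bytes * 1_000_000 // window_us
burst_gb_per_s = burst_bytes_per_s / 1e9
exceeds_peak = burst_gb_per_s > peak_gb_per_s

print(f"burst demand: {burst_gb_per_s} GB/s, exceeds peak: {exceeds_peak}")
```

The point of the comparison is that a latency-sensitive request can momentarily demand more than the memory can deliver, even when its average bandwidth is small.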

Challenges

Simultaneously satisfy:

- Bandwidth requirements
- Latency requirements

Previous Work

QoS aware
- Bandwidth or latency is heuristically improved

QoS guaranteed
- Guaranteed minimum bandwidth and/or latency

Main Ideas

Start with the Bandwidth Guaranteed Prioritized Queuing (BGPQ) algorithm
- Bandwidth guarantee

Improve it using the Credit Borrow and Repay (CBR) mechanism
- Minimum latency guarantee

Bandwidth Guaranteed Prioritized Queuing

Combine the benefits of both Priority Queuing and Weighted Fair Queuing
- Credit-based Weighted Fair Queuing
- Prioritized service for residual bandwidth allocation

Residual bandwidth:
- The bandwidth assigned to one user that is unused at a specific point in time

BGPQ Algorithm

Case 1: all queues are busy
- No residual bandwidth
- Acts as WFQ

Figure labels: Q0, Q1, Q2 (bandwidth shares 50%, 20%, 30%), Multiplexer, BGPQ Scheduler, Shared Resource.

Initial state: everybody has a credit of zero.
Credits (Q0, Q1, Q2): 0.0, 0.0, 0.0

Step 1: calculate the dynamic credit for each queue.
Credits (Q0, Q1, Q2): 0.5, 0.2, 0.3

Step 2: turn on the switch box and transfer data from the granted queue.

Step 3: subtract 1 from the credit of the granted queue.
Credits (Q0, Q1, Q2): -0.5, 0.2, 0.3

One scheduling cycle is done! Sum of credits = 0!
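The three steps of Case 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `wfq_cycle`, the use of exact fractions, and breaking ties in favour of the lower queue index are all assumptions of this sketch.

```python
from fractions import Fraction

def wfq_cycle(credits, shares):
    """One scheduling cycle when every queue is busy (Case 1)."""
    # Step 1: each queue's dynamic credit is its old credit plus its share.
    credits = [c + s for c, s in zip(credits, shares)]
    # Step 2: grant the queue with the highest dynamic credit
    # (on a tie, max() returns the lower index: an assumption here).
    granted = max(range(len(credits)), key=lambda i: credits[i])
    # Step 3: charge the granted queue one scheduling slot.
    credits[granted] -= 1
    return granted, credits

# Bandwidth shares of Q0, Q1, Q2 as on the slides: 50%, 20%, 30%.
shares = [Fraction(1, 2), Fraction(1, 5), Fraction(3, 10)]

granted, credits = wfq_cycle([Fraction(0)] * 3, shares)
print(granted, [float(c) for c in credits])   # 0 [-0.5, 0.2, 0.3]

# Over ten cycles the grants follow the configured shares.
grants, credits = [0, 0, 0], [Fraction(0)] * 3
for _ in range(10):
    granted, credits = wfq_cycle(credits, shares)
    grants[granted] += 1
print(grants)                                 # [5, 2, 3]
```

Because every cycle adds shares that sum to 1 and subtracts exactly 1, the credits always sum to zero after a cycle, which is the invariant the slide highlights.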

BGPQ Algorithm

Case 2: some queues are empty
- Has residual bandwidth
- Prioritized service on residual bandwidth

Figure labels: Q0, Q1, Q2 (bandwidth shares 50%, 20%, 30%), Multiplexer, BGPQ Scheduler, Shared Resource. Priority: Q0 > Q1 > Q2.

Before the new scheduling cycle: Q1 is empty.
Credits (Q0, Q1, Q2): -0.5, 0.2, 0.3

Step 1: calculate a dynamic credit for each queue. The credit of an empty queue remains unchanged.
Credits (Q0, Q1, Q2): 0.0, 0.2, 0.6

Step 2: allocate the residual bandwidth to the non-empty queue with the highest priority.
Credits (Q0, Q1, Q2): 0.2, 0.2, 0.6

Step 3: transfer data from the granted queue.

Step 4: subtract 1 from the credit of the granted queue.
Credits (Q0, Q1, Q2): 0.2, 0.2, -0.4

One scheduling cycle is done! Sum of credits = 0!
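The four steps of Case 2 can be sketched as follows. This is an illustrative reading of the slides, not the authors' code: the function name `bgpq_cycle`, the `busy` and `priority` parameters, and resolving credit ties in priority order are assumptions. Starting from the credits left by Case 1 with Q1 empty, it reproduces the credits shown in the slides.

```python
from fractions import Fraction

def bgpq_cycle(credits, shares, busy, priority):
    """One BGPQ cycle; busy[i] is False when queue i is empty."""
    credits = list(credits)
    residual = Fraction(0)
    # Step 1: busy queues earn their share. An empty queue keeps its
    # credit unchanged and its share becomes residual bandwidth.
    for i, share in enumerate(shares):
        if busy[i]:
            credits[i] += share
        else:
            residual += share
    candidates = [i for i in priority if busy[i]]
    if not candidates:
        return None, credits          # nothing to schedule this cycle
    # Step 2: the residual goes to the highest-priority non-empty queue.
    credits[candidates[0]] += residual
    # Step 3: grant the non-empty queue with the highest dynamic credit
    # (ties resolved in priority order: an assumption of this sketch).
    granted = max(candidates, key=lambda i: credits[i])
    # Step 4: charge the granted queue one scheduling slot.
    credits[granted] -= 1
    return granted, credits

shares = [Fraction(1, 2), Fraction(1, 5), Fraction(3, 10)]   # Q0, Q1, Q2
start = [Fraction(-1, 2), Fraction(1, 5), Fraction(3, 10)]   # after Case 1
granted, credits = bgpq_cycle(start, shares,
                              busy=[True, False, True], priority=[0, 1, 2])
print(granted, [float(c) for c in credits])   # 2 [0.2, 0.2, -0.4]
```

Q0 receives the residual 0.2 because it has the highest priority, yet Q2 is granted because its credit of 0.6 is still the largest. This is the behaviour the later slides identify as the latency problem.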

BGPQ Advantages

BGPQ = WFQ + PQ
- Bandwidth guarantee
- Prioritized access to residual bandwidth

Low implementation cost:
- 3 adders for credit calculation
- 1 comparator tree to find the highest dynamic credit
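As a software analogue of the comparator tree, the sketch below finds the highest dynamic credit by pairwise comparison, level by level. It only illustrates the idea: the function name `comparator_tree_max` and the tie-break toward the lower queue index are assumptions, and the slides do not describe the actual circuit.

```python
def comparator_tree_max(credits):
    """Return (index, credit) of the largest credit via pairwise comparison.

    Expects a non-empty sequence of credits.
    """
    level = list(enumerate(credits))
    while len(level) > 1:
        next_level = []
        for j in range(0, len(level) - 1, 2):
            left, right = level[j], level[j + 1]
            # One comparator: keep the larger credit, the left input on a tie.
            next_level.append(left if left[1] >= right[1] else right)
        if len(level) % 2:
            # An unpaired input passes straight through to the next level.
            next_level.append(level[-1])
        level = next_level
    return level[0]

print(comparator_tree_max([0.2, 0.2, 0.6]))   # (2, 0.6)
```

For n queues such a tree needs n - 1 comparators arranged in about log2(n) levels, which is one way to read the low implementation cost claimed above.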

BGPQ Disadvantage

Low latency, low bandwidth requirement class:
- No minimum latency guarantee

Minimum latency:
- No need to wait for any request that has lower priority

Latency Problem of BGPQ

Example:

Optimal Scheduling:

Credit Borrow and Repay Mechanism

Borrow
- Allow the low latency requirement class to borrow the scheduling opportunity from other classes

Repay
- Return the credit later when convenient

CBR Mechanism

Case 3: Credit Borrow and Repay
- Maintain a debt queue (DebtQ) for Q0: a borrowed ID FIFO

Figure labels: Q0, Q1, Q2 (bandwidth shares 10%, 20%, 70%), DebtQ, Multiplexer, CBR Scheduler, Shared Resource. Priority: Q0 > Q1 > Q2.

Step 1: calculate the dynamic credit and allocate the residual bandwidth.
Credits (Q0, Q1, Q2): 0.3, 0.0, 0.7

Step 2: re-assign the scheduling opportunity to Q0, and record the borrowed ID.

Step 3: transfer data.

Step 4: subtract 1 from the originally scheduled queue.
Credits (Q0, Q1, Q2): 0.3, 0.0, -0.3

One scheduling cycle is done! Sum of credits = 0!
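The borrow step of Case 3 can be sketched as below. This is one possible reading of the slides, not the authors' implementation: the function name `cbr_borrow_cycle`, its parameters, and the fallback to plain BGPQ behaviour when the DebtQ is full are assumptions. With zero initial credits and Q1 empty, it reproduces the credits shown in the slides.

```python
from collections import deque
from fractions import Fraction

def cbr_borrow_cycle(credits, shares, busy, priority, debt_q, debt_depth=16):
    """One cycle in which the top-priority queue may borrow a slot (Case 3)."""
    credits = list(credits)
    top = priority[0]
    candidates = [i for i in priority if busy[i]]
    if not candidates:
        return None, credits
    # Step 1: dynamic credits, with the residual bandwidth of empty queues
    # given to the highest-priority non-empty queue.
    residual = sum((shares[i] for i in range(len(shares)) if not busy[i]),
                   Fraction(0))
    for i in candidates:
        credits[i] += shares[i]
    credits[candidates[0]] += residual
    scheduled = max(candidates, key=lambda i: credits[i])
    served = scheduled
    # Step 2: if the top-priority queue is waiting behind another queue,
    # it borrows the slot and the lender's ID is recorded in the DebtQ.
    # Assumption: with a full DebtQ no borrowing takes place.
    if busy[top] and scheduled != top and len(debt_q) < debt_depth:
        debt_q.append(scheduled)
        served = top
    # Step 3 would transfer data from `served`.
    # Step 4: the originally scheduled queue is charged for the slot.
    credits[scheduled] -= 1
    return served, credits

shares = [Fraction(1, 10), Fraction(1, 5), Fraction(7, 10)]   # Q0, Q1, Q2
debt_q = deque()
served, credits = cbr_borrow_cycle([Fraction(0)] * 3, shares,
                                   busy=[True, False, True],
                                   priority=[0, 1, 2], debt_q=debt_q)
print(served, [float(c) for c in credits], list(debt_q))
# 0 [0.3, 0.0, -0.3] [2]
```

Q2 pays for a slot it did not use, so Q0 is served immediately while the credit bookkeeping, and with it the bandwidth guarantee, is left untouched.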

CBR Mechanism

Case 4: Credit Repay
- It is time to repay the credit

Figure labels: Q0, Q1, Q2 (bandwidth shares 10%, 20%, 70%), DebtQ, Multiplexer, CBR Scheduler, Shared Resource. Priority: Q0 > Q1 > Q2.

Initial state: Q0 is empty but has debt. It will ‘appear’ to be non-empty.
Credits (Q0, Q1, Q2): 0.3, 0.0, -0.3

Step 1: calculate the dynamic credits and allocate the residual bandwidth.
Credits (Q0, Q1, Q2): 0.6, 0.0, 0.4

Step 2: return the scheduling opportunity and clear the DebtQ.

Step 3: transfer data.

Step 4: subtract 1 from the scheduled queue.
Credits (Q0, Q1, Q2): -0.4, 0.0, 0.4

One scheduling cycle is done! Sum of credits = 0!
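The repay step of Case 4 can be sketched in the same style. Again this is an interpretation under stated assumptions, not the authors' code: the function name `cbr_repay_cycle` is hypothetical, and repaying one DebtQ entry per cycle in FIFO order is a guess at what "clear the DebtQ" means when more than one entry is outstanding. Starting from the credits left by Case 3, it reproduces the credits shown in the slides.

```python
from collections import deque
from fractions import Fraction

def cbr_repay_cycle(credits, shares, busy, priority, debt_q):
    """One cycle in which an empty top-priority queue repays a debt (Case 4)."""
    credits = list(credits)
    top = priority[0]
    # An empty top-priority queue with outstanding debt 'appears' non-empty.
    active = [i for i in priority if busy[i] or (i == top and debt_q)]
    if not active:
        return None, credits
    # Step 1: dynamic credits, with the residual bandwidth of inactive
    # queues given to the highest-priority active queue.
    residual = sum((shares[i] for i in range(len(shares)) if i not in active),
                   Fraction(0))
    for i in active:
        credits[i] += shares[i]
    credits[active[0]] += residual
    scheduled = max(active, key=lambda i: credits[i])
    served = scheduled
    # Step 2: the top queue wins the slot but has no data, so the slot is
    # returned to the oldest lender. This sketch does not check whether the
    # lender still has data; the slides do not cover that situation.
    if scheduled == top and not busy[top] and debt_q:
        served = debt_q.popleft()
    # Step 3 would transfer data from `served`.
    # Step 4: the scheduled queue is charged for the slot.
    credits[scheduled] -= 1
    return served, credits

shares = [Fraction(1, 10), Fraction(1, 5), Fraction(7, 10)]    # Q0, Q1, Q2
start = [Fraction(3, 10), Fraction(0), Fraction(-3, 10)]       # after Case 3
debt_q = deque([2])                                            # Q0 owes Q2 one slot
served, credits = cbr_repay_cycle(start, shares,
                                  busy=[False, False, True],
                                  priority=[0, 1, 2], debt_q=debt_q)
print(served, [float(c) for c in credits], list(debt_q))
# 2 [-0.4, 0.0, 0.4] []
```

After borrowing and repaying, Q2 has been charged once and served once, so over the two cycles it receives exactly the service its credit entitles it to.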

CBR Mechanism

Minimum Latency Guarantee using CBR
- No need to wait for requests in other queues

Worst case: Q0 is not empty while DebtQ is full
- No minimum latency guarantee in this case

Implementation in FPGA

CBR MPMC top level diagram
- Instantiation-time configurable port number
- Run-time programmable priority and bandwidth

Implementation in FPGA

Credit calculation circuit

Sorting Network and CBR


Implementation Cost

8-port CBR-MPMC with a 16-depth DebtQ
- Xilinx Virtex-5 XC5VLX50T
- Speedy DDR backend memory controller

Evaluation

Simulation Framework
- Cycle-accurate C model of MPMC
- Simple close-page DDR memory model
- Trace capturing and converting method

Evaluation

CPU workload trace file (from B. Jacob)
- Cache simulation on standard SPEC2000 integer benchmark

Irregular and low bandwidth requirement: 0.4 memory transactions per 1k instructions.

Evaluation

Accelerator Workload
- ALPBench suite of parallel multimedia applications

Periodically repeated access pattern, high bandwidth requirement: 18.3 memory transactions per 1k instructions.

Results

BGPQ Scheduler
- Latency: number of clock cycles
- Bandwidth: number of memory transactions per 1k clock cycles
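The two metrics can be computed from a transaction log as sketched below. The log format, a list of (issue cycle, completion cycle) pairs, and the function name `summarize` are assumptions of this sketch, and the three sample transactions are made-up illustration values, not results from the talk.

```python
def summarize(transactions, total_cycles):
    """Per-transaction latency and bandwidth in the units used on the slide.

    transactions: list of (issue_cycle, completion_cycle) pairs (assumed format).
    """
    # Latency: number of clock cycles from issue to completion.
    latencies = [done - issued for issued, done in transactions]
    # Bandwidth: number of memory transactions per 1k clock cycles.
    bandwidth = 1000 * len(transactions) / total_cycles
    return latencies, bandwidth

# Made-up sample log covering 2000 clock cycles.
latencies, bandwidth = summarize([(0, 12), (5, 25), (40, 52)], total_cycles=2000)
print(latencies, bandwidth)   # [12, 20, 12] 1.5
```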

Results

CBR Scheduler with a 16-depth DebtQ

Impact of DebtQ Size

Repay conditions:
- DebtQ is full
- Q0 is empty

Figure labels: Q0, Q1, Q2 (bandwidth shares 10%, 20%, 70%), DebtQ, Multiplexer, CBR Scheduler, Shared Resource. Priority: Q0 > Q1 > Q2.
Credits (Q0, Q1, Q2): 0.6, 0.0, 0.4

When DebtQ is full, remaining requests in Q0 will not be served with minimum latency guarantee!
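The effect of a bounded DebtQ can be sketched as a fixed-depth FIFO of lender IDs. The class below is illustrative only: the class and method names are hypothetical, and refusing a borrow when the FIFO is full is this sketch's reading of the statement that Q0 loses its minimum latency guarantee once the DebtQ is full.

```python
from collections import deque

class DebtQ:
    """Fixed-depth FIFO of lender queue IDs (a sketch, not the RTL)."""

    def __init__(self, depth):
        self.depth = depth
        self.ids = deque()

    def is_full(self):
        return len(self.ids) >= self.depth

    def borrow(self, lender_id):
        """Record a borrowed slot; return False if no entry is free."""
        if self.is_full():
            return False          # Q0 must wait its turn like any other queue
        self.ids.append(lender_id)
        return True

    def repay(self):
        """Return the oldest lender's ID, or None if nothing is owed."""
        return self.ids.popleft() if self.ids else None

debt_q = DebtQ(depth=2)
print(debt_q.borrow(2), debt_q.borrow(1), debt_q.borrow(2))   # True True False
print(debt_q.repay())                                         # 2
```

A deeper DebtQ lets Q0 absorb longer bursts before falling back to ordinary scheduling, at the cost of more storage.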

Impact of DebtQ Size

How big is enough for DebtQ?
- Determined by the instantaneous bandwidth requirement

Irregular access pattern means:
- Large range of DebtQ size requirement

Tradeoff
- Resource efficiency vs. performance

Results

Impact of debt queue size

Conclusions

The CBR scheduler can provide minimum bandwidth and latency guarantees

Low implementation cost and power consumption

We expect its successful use in a wide range of multimedia applications

Questions?
