

George Michelogiannakis, William J. Dally

Stanford University

Router Designs for Elastic-Buffer On-Chip Networks

Introduction

Elastic-buffer (EB) flow control was recently proposed.
• Uses the channels as distributed FIFOs.

EB routers are bufferless packet-switched routers.
• They have the benefits of circuit-switched routers, without the overhead of setting up and tearing down circuits.

This work explores the EB router design space.
• By evaluating three representative designs.

SC09: Routers for EB NoCs

The EB Flow-control Idea

[Figure panels: master-slave FF, elastic buffer, pipelined channel, channel as FIFO]


How Elastic Buffer Channels Work

Ready/valid handshake between elastic buffers:
• Ready: At least one free storage slot.
• Valid: Non-empty (driving valid data).

[Figure: channel operation shown cycle by cycle, cycles 1 to 6]
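The ready/valid handshake can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' circuit: each elastic buffer is modeled as a two-slot FIFO, ready and valid are sampled at the start of each cycle, and the names `ElasticBuffer`, `step`, and `run` are hypothetical.

```python
from collections import deque


class ElasticBuffer:
    """Behavioral model of one elastic buffer (assumed two storage slots)."""

    def __init__(self, slots=2):
        self.slots = slots
        self.fifo = deque()

    @property
    def ready(self):
        # Ready: at least one free storage slot.
        return len(self.fifo) < self.slots

    @property
    def valid(self):
        # Valid: non-empty, i.e. driving valid data.
        return len(self.fifo) > 0


def step(channel, incoming, sink_ready):
    """Advance the channel by one cycle.

    Returns (delivered_flit_or_None, incoming_was_accepted).
    All ready/valid signals are sampled before any flit moves.
    """
    n = len(channel)
    moves = []
    for i, eb in enumerate(channel):
        downstream_ready = channel[i + 1].ready if i + 1 < n else sink_ready
        moves.append(eb.valid and downstream_ready)
    accepted = incoming is not None and channel[0].ready

    delivered = None
    for i in reversed(range(n)):  # apply transfers, last stage first
        if moves[i]:
            flit = channel[i].fifo.popleft()
            if i + 1 < n:
                channel[i + 1].fifo.append(flit)
            else:
                delivered = flit
    if accepted:
        channel[0].fifo.append(incoming)
    return delivered, accepted


def run(flits, sink_ready_at, stages=3, max_cycles=100):
    """Push flits through a channel; sink_ready_at(cycle) models back-pressure."""
    channel = [ElasticBuffer() for _ in range(stages)]
    pending, out = deque(flits), []
    for cycle in range(max_cycles):
        incoming = pending[0] if pending else None
        delivered, accepted = step(channel, incoming, sink_ready_at(cycle))
        if accepted:
            pending.popleft()
        if delivered is not None:
            out.append(delivered)
        if not pending and not any(eb.valid for eb in channel):
            break
    return out
```

When the sink stalls, flits pile up inside the channel stages themselves, which is the sense in which the channel acts as a distributed FIFO.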

Use EB Flow-Control Through the Router

VC input-buffered router

EB router

• Input buffer replaced by input EB.
• VC and SW allocators removed. Per-output arbiters instead.
• Three-slot output EB to cover for arbitration done one cycle in advance.
• Look-ahead (LA) routing also applicable to EB networks.
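To see why per-output arbiters can stand in for the allocators, consider that without VCs each input presents at most one request, for a single output. The sketch below assumes round-robin arbitration, which the slides do not specify; `RoundRobinArbiter` and `arbitrate` are hypothetical names, and holding a grant for the duration of a packet is not modeled.

```python
class RoundRobinArbiter:
    """One arbiter per output port (round-robin policy is an assumption)."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.last = num_inputs - 1  # input 0 gets priority first

    def grant(self, requests):
        """requests[i] is True if input i wants this output."""
        for offset in range(1, self.num_inputs + 1):
            i = (self.last + offset) % self.num_inputs
            if requests[i]:
                self.last = i
                return i
        return None


def arbitrate(arbiters, desired_output):
    """desired_output[i] is the output requested by input i, or None.

    Each input requests at most one output, so independent per-output
    arbiters never hand two outputs to the same input.
    """
    grants = {}
    for out, arb in enumerate(arbiters):
        winner = arb.grant([d == out for d in desired_output])
        if winner is not None:
            grants[out] = winner
    return grants


arbiters = [RoundRobinArbiter(3) for _ in range(3)]
first = arbitrate(arbiters, [1, 1, None])   # inputs 0 and 1 contend for output 1
second = arbitrate(arbiters, [1, 1, None])  # priority rotates to the other input
```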


Baseline Router - Issues

Issues constraining the clock cycle time:
• Three-slot EB FSM too complicated: output EB implemented as FIFO.
• Routing is performed serially with switch arbitration.


Enhanced Two-Stage Router

Look-ahead routing to shorten the critical path.

Use two-slot EBs at output and for pipelining.
• Flits are stored in the intermediate EB and wait for a grant.
• Decision to traverse switch made in the same cycle.
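Look-ahead routing shortens the critical path because each router computes the output port for the next hop, so the flit arrives already knowing which output to request. The sketch below assumes dimension-order (XY) routing on a 2D mesh, which the slides do not state; `xy_route`, `lookahead_route`, and `trace` are hypothetical names.

```python
# Coordinate change for each mesh direction (assumed port naming).
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def xy_route(cur, dst):
    """Dimension-order routing: correct X first, then Y."""
    (cx, cy), (dx, dy) = cur, dst
    if dx > cx:
        return "E"
    if dx < cx:
        return "W"
    if dy > cy:
        return "N"
    if dy < cy:
        return "S"
    return "EJECT"


def lookahead_route(cur, out_port, dst):
    """At router `cur`, compute the port the flit will need at the NEXT router."""
    sx, sy = STEP[out_port]
    nxt = (cur[0] + sx, cur[1] + sy)
    return nxt, xy_route(nxt, dst)


def trace(src, dst):
    """Follow a head flit; returns [(router, port used at that router), ...]."""
    cur, port = src, xy_route(src, dst)  # computed once, at injection
    hops = []
    while port != "EJECT":
        hops.append((cur, port))
        cur, port = lookahead_route(cur, port, dst)
    hops.append((cur, "EJECT"))
    return hops


path = trace((0, 0), (2, 1))
```

Because the port used at a router was computed one hop earlier, the route computation for the next hop can overlap with switch arbitration instead of preceding it.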


Enhanced Two-Stage Router – Sync Module

Synchronization module maintains alignment between flits and grants.

Contains an output port EB.
• Stores the chosen output port of the current packet and of any other packets in router stage 1 and the intermediate EB.


Enhanced Two-Stage Router – Sync Module

When the current packet's tail flit is departing:
• Sync. module propagates the next output to the arbiters.
• From the appropriate location.

Sync. module propagates an update to all outputs.
• An output receiving an update from the input it is granting clocks the arbiter output registers at the next edge.


Single-Stage Router

Merges the two router stages to:
• Reduce router latency.
• Avoid pipelining overhead.


Evaluation Methodology

45nm worst-case low-power commercial library.

Synopsys DC and Cadence Encounter.
• 64-bit router datapath. 70% initial area utilization ratio.

Used a cycle-accurate network simulator.

Two scenarios are assumed: each router operates at its own maximum post-P&R frequency, or all routers operate at the same frequency.

8x8 2D mesh. 2mm-long wires. 1 cycle latency.
• Constant packet size of 512 bits.

Averaged over a set of six traffic patterns.

Swept datapath width from 28 to 171 bits.
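With a constant 512-bit packet, the datapath width determines how many flits each packet occupies. The arithmetic below is a sketch that assumes one flit per datapath-width word with the last flit padded; the slides do not state how packets are serialized, and `flits_per_packet` and `padding_bits` are hypothetical names.

```python
PACKET_BITS = 512  # constant packet size from the methodology


def flits_per_packet(width_bits):
    """Number of flits needed to carry one packet (integer ceiling division)."""
    return -(-PACKET_BITS // width_bits)


def padding_bits(width_bits):
    """Unused bits in the last flit under the assumed serialization."""
    return flits_per_packet(width_bits) * width_bits - PACKET_BITS


for width in (28, 64, 171):
    print(width, flits_per_packet(width), padding_bits(width))
# prints:
# 28 19 20
# 64 8 0
# 171 3 1
```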

Placement and Routing Cycle Time

The enhanced two-stage router has a 26% shorter cycle time than the single-stage router, and a 42% shorter cycle time than the baseline two-stage router.
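The two percentages also imply how the single-stage and baseline routers compare with each other. The calculation below is a derived illustration, not a figure reported on the slides; it reads each reduction as (1 - r) times the reference cycle time and normalizes the single-stage cycle time to 1.

```python
# Cycle times normalized to the single-stage router (normalization is an assumption).
single = 1.0
enhanced = (1 - 0.26) * single      # 26% shorter than single-stage
baseline = enhanced / (1 - 0.42)    # enhanced is 42% shorter than baseline

# Implied reduction of single-stage relative to baseline.
single_vs_baseline = 1 - single / baseline

# Clock frequency is the reciprocal of cycle time.
freq_gain_over_single = single / enhanced
freq_gain_over_baseline = baseline / enhanced

print(f"baseline cycle time: {baseline:.3f} x single-stage")
print(f"single-stage is {single_vs_baseline:.1%} shorter than baseline")
print(f"enhanced clock: {freq_gain_over_single:.2f}x single-stage, "
      f"{freq_gain_over_baseline:.2f}x baseline")
```

Under this reading, the single-stage router's cycle time comes out roughly 22% shorter than the baseline's, and the enhanced router clocks about 1.35 times faster than the single-stage router.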


Placement and Routing Energy per Bit

The baseline two-stage router requires 9% less energy per bit than the single-stage router, and 35% less than the enhanced two-stage router.


Placement and Routing Area

Single-stage occupies 30% less area than the enhanced two-stage and 44% less than the baseline two-stage.


Latency-Throughput, Max Frequencies.

Latency increase:
• Enhanced: +1%
• Baseline: +46%


Latency-Throughput, Equal Frequencies.

Latency increase:
• Enhanced: +34%
• Baseline: +32%


Which Router is the Optimal Choice?

Priority    Router choice

Operating at maximum frequencies:
Area        Enhanced two-stage
Energy      Baseline two-stage (closely followed by single-stage)
Latency     Single-stage (depends on effect on channels)

Operating at the same frequency:
Area        Single-stage
Energy      Baseline two-stage (closely followed by single-stage)
Latency     Single-stage


Conclusion

Improved EB router designs can widen the gap compared to VC networks.
• Makes EB look even more attractive.

EB routers are simple designs. Simple designs have numerous advantages.
• A lot of the complexity of VC networks is ignored by some area and power models.

Overall, compared to VC networks: a 43% reduction in power per unit throughput, a 67% reduction in cycle time, and 22% higher throughput per unit area.


Questions?
