Post on 29-Jan-2016
George Michelogiannakis, William J. Dally
Stanford University
Router Designs for Elastic-Buffer On-Chip Networks
Introduction
EB flow-control was recently proposed.
• Uses the channels as distributed FIFOs.
EB routers are bufferless packet-switched routers.
• They have the benefits of circuit-switched routers, without the overhead of setting up and tearing down circuits.
This work explores the EB router design space.
• By evaluating three representative designs.
SC09: Routers for EB NoCs
The EB Flow-control Idea
[Figure panels: master-slave FF, elastic buffer, pipelined channel, channel as FIFO]
How Elastic Buffer Channels Work
Ready/valid handshake between elastic buffers:
• Ready: At least one free storage slot
• Valid: Non-empty (driving valid data)
[Figure: handshake example over cycles 1 to 6]
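The ready/valid rule above can be illustrated with a small cycle-by-cycle model. This is only a sketch under stated assumptions: the two-slot default capacity, the `ElasticBuffer`, `step`, and `run` names, and the `stall_cycles` parameter are all hypothetical and not taken from the slides. A flit advances only when the upstream buffer is valid and the downstream buffer is ready, both sampled at the start of the cycle.

```python
from collections import deque


class ElasticBuffer:
    """Toy elastic buffer: a small FIFO exposing ready/valid (assumed two slots)."""

    def __init__(self, slots=2):
        self.slots = slots
        self.fifo = deque()

    @property
    def ready(self):
        # Ready: at least one free storage slot.
        return len(self.fifo) < self.slots

    @property
    def valid(self):
        # Valid: non-empty, i.e. driving valid data.
        return len(self.fifo) > 0


def step(channel, sink_ready):
    """Advance every buffer by one cycle; return the flit delivered to the sink, if any."""
    n = len(channel)
    # Sample all handshakes first so every transfer uses start-of-cycle state.
    moves = []
    for i in range(n):
        downstream_ready = channel[i + 1].ready if i + 1 < n else sink_ready
        moves.append(channel[i].valid and downstream_ready)
    delivered = None
    # Apply transfers from the sink end backwards.
    for i in reversed(range(n)):
        if moves[i]:
            flit = channel[i].fifo.popleft()
            if i + 1 < n:
                channel[i + 1].fifo.append(flit)
            else:
                delivered = flit
    return delivered


def run(flits, stages=3, stall_cycles=()):
    """Push flits through a channel of elastic buffers; return (cycle, flit) deliveries."""
    channel = [ElasticBuffer() for _ in range(stages)]
    pending = deque(flits)
    out = []
    cycle = 0
    while len(out) < len(flits):
        can_inject = bool(pending) and channel[0].ready
        got = step(channel, sink_ready=cycle not in stall_cycles)
        if got is not None:
            out.append((cycle, got))
        if can_inject:
            channel[0].fifo.append(pending.popleft())
        cycle += 1
    return out
```

Calling `run(list("ABC"))` streams the flits through in order, and passing `stall_cycles={3, 4}` shows the channel absorbing a stalled receiver in its own storage, which is the sense in which the channel behaves as a distributed FIFO.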
Use EB Flow-Control Through the Router
VC input-buffered router
EB router
Input buffer replaced by input EB.
VC & SW allocators removed. Per-output arbiters instead.
Three-slot output EB to cover for arbitration done one cycle in advance.
LA routing also applicable to EB networks.
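To make "per-output arbiters" concrete, here is a minimal sketch in which each output port owns an independent arbiter and each input requests at most one output. The round-robin policy is an assumption chosen for illustration (the slides do not name the arbiter type), and `RoundRobinArbiter` and `arbitrate` are hypothetical names. Holding a grant for the duration of a packet is not modeled.

```python
class RoundRobinArbiter:
    """One arbiter per output port (round-robin policy assumed for illustration)."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.last = num_inputs - 1  # input 0 gets first priority initially

    def grant(self, requests):
        """requests[i] is True if input i wants this output; return winner or None."""
        for offset in range(1, self.num_inputs + 1):
            i = (self.last + offset) % self.num_inputs
            if requests[i]:
                self.last = i
                return i
        return None


def arbitrate(arbiters, wanted_output):
    """wanted_output[i] is the output index requested by input i, or None.

    Returns a dict mapping output index -> granted input index.
    """
    grants = {}
    for out, arb in enumerate(arbiters):
        requests = [w == out for w in wanted_output]
        winner = arb.grant(requests)
        if winner is not None:
            grants[out] = winner
    return grants
```

Because each output decides on its own, no separate VC or switch allocation step is needed in this sketch: two inputs that want the same output simply take turns.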
Baseline Router - Issues
Issues constraining the clock cycle time:
• Three-slot EB FSM too complicated: output EB implemented as FIFO.
• Routing is performed serially with switch arbitration.
[Figure annotations: FIFO; Serially]
Enhanced Two-Stage Router
Look-ahead routing to shorten the critical path.
Use two-slot EBs at output and for pipelining.
• Flits are stored in the intermediate EB and wait for a grant.
• Decision to traverse switch made in the same cycle.
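Look-ahead routing shortens the critical path because the output port a flit needs at a router has already been computed one hop earlier. The sketch below assumes XY dimension-order routing on a 2D mesh purely for illustration; the slides do not state the routing algorithm, and `dor_output_port`, `lookahead_route`, and the port names are hypothetical.

```python
# Coordinate change caused by leaving through each port (assumed mesh orientation).
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def dor_output_port(cur, dest):
    """XY dimension-order routing: resolve the X offset first, then Y."""
    (x, y), (dx, dy) = cur, dest
    if dx > x:
        return "E"
    if dx < x:
        return "W"
    if dy > y:
        return "N"
    if dy < y:
        return "S"
    return "EJECT"


def lookahead_route(cur, out_port, dest):
    """Compute, at the current router, the port to request at the next router."""
    if out_port == "EJECT":
        return None  # the flit leaves the network here
    sx, sy = STEP[out_port]
    next_router = (cur[0] + sx, cur[1] + sy)
    return dor_output_port(next_router, dest)
```

For a flit at (0, 0) heading to (2, 1), `lookahead_route((1, 0), "E", (2, 1))` returns `"N"`: the router at (1, 0) already carries the decision that router (2, 0) will use, so routing no longer has to run serially before switch arbitration.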
Enhanced Two-Stage Router – Sync Module
Synchronization module maintains alignment between flits and grants.
Contains an output port EB.
• Stores the chosen output port of the current packet and of any other packets in router stage 1 and the intermediate EB.
Enhanced Two-Stage Router – Sync Module
When the current packet’s tail flit is departing:
• Sync. module propagates the next output to the arbiters.
• From the appropriate location.
Sync. module propagates an update to all outputs.
• An output receiving an update from the input it is granting clocks the arbiter output regs at the next edge.
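The bookkeeping described above can be pictured as a queue of output ports, one entry per packet currently inside the input path. This is a deliberately simplified sketch: the `SyncModule` class and its method names are hypothetical, and the update signal to the arbiters and the clocking of their output registers are not modeled.

```python
from collections import deque


class SyncModule:
    """Toy model: output ports of in-flight packets, oldest packet first."""

    def __init__(self):
        self.ports = deque()

    def head_arrived(self, output_port):
        # A new packet entered the router; remember the output it chose.
        self.ports.append(output_port)

    def current_output(self):
        # Output port being requested on behalf of the packet at the front.
        return self.ports[0] if self.ports else None

    def tail_departing(self):
        """The current packet's tail flit leaves; return the next output to request."""
        self.ports.popleft()
        return self.current_output()
```

With two packets queued for outputs "E" and "N", the first `tail_departing()` call returns "N", which corresponds to propagating the next output to the arbiters so that grants stay aligned with the flits they belong to.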
Single-Stage Router
Merges the two router stages to:
• Reduce router latency.
• Avoid pipelining overhead.
Evaluation Methodology
45nm worst-case low-power commercial library.
Synopsys DC and Cadence Encounter.
• 64-bit router datapath. 70% initial area utilization ratio.
Used a cycle-accurate network simulator.
We assume either that each router operates at its maximum post-P&R frequency, or that all routers operate at the same frequency.
8x8 2D mesh. 2mm-long wires. 1 cycle latency.
• Constant packet size of 512 bits.
Averaged over a set of six traffic patterns.
Swept datapath width from 28 to 171 bits.
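Since the packet size is fixed at 512 bits, sweeping the datapath width changes how many flits each packet occupies. A quick way to see this, assuming each flit is exactly one datapath width and ignoring any header overhead (both assumptions of this sketch; `flits_per_packet` is a hypothetical helper):

```python
PACKET_BITS = 512  # constant packet size from the methodology


def flits_per_packet(datapath_bits, packet_bits=PACKET_BITS):
    """Ceiling division: number of flits needed to carry one packet."""
    return -(-packet_bits // datapath_bits)


# Endpoints of the sweep plus the 64-bit datapath used for place and route.
flit_counts = {width: flits_per_packet(width) for width in (28, 64, 171)}
```

Under these assumptions a packet takes 19 flits at 28 bits, 8 flits at 64 bits, and 3 flits at 171 bits.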
Placement and Routing Cycle Time
The enhanced two-stage router has a 26% shorter cycle time than the single-stage router, and a 42% shorter cycle time than the baseline two-stage router.
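A cycle-time reduction translates into a clock-frequency ratio through f = 1/T. The short calculation below assumes each percentage is measured relative to the slower router's cycle time; `frequency_gain` is a hypothetical helper introduced only for this arithmetic.

```python
def frequency_gain(cycle_time_reduction):
    """Frequency ratio implied by a fractional cycle-time reduction (f = 1 / T)."""
    return 1.0 / (1.0 - cycle_time_reduction)


vs_single_stage = round(frequency_gain(0.26), 2)  # 1.35
vs_baseline = round(frequency_gain(0.42), 2)      # 1.72
```

Read this way, the enhanced two-stage router could be clocked roughly 1.35x faster than the single-stage router and roughly 1.72x faster than the baseline two-stage router.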
Placement and Routing Energy per Bit
Baseline two-stage requires 9% less energy per bit than the single-stage, and 35% less than the enhanced two-stage.
Placement and Routing Area
Single-stage occupies 30% less area than the enhanced two-stage and 44% less than the baseline two-stage.
Latency-Throughput, Max Frequencies.
Latency increase:
Enhanced: +1%
Baseline: +46%
Latency-Throughput, Equal Frequencies.
Latency increase:
Enhanced: +34%
Baseline: +32%
Which Router is the Optimal Choice?
Priority    Router choice
Operate at maximum frequencies:
Area        Enhanced two-stage
Energy      Baseline two-stage (closely followed by single-stage)
Latency     Single-stage (depends on effect on channels)
Operate at the same frequency:
Area        Single-stage
Energy      Baseline two-stage (closely followed by single-stage)
Latency     Single-stage
Conclusion
Improved EB router designs can widen the gap compared to VC networks.
• Makes EB look even more attractive.
EB routers are simple designs. Simple designs have numerous advantages.
• A lot of the complexity of VC networks is ignored by some area and power models.
Overall, compared to VC: a 43% reduction in power per unit throughput, a 67% reduction in cycle time, and a 22% increase in throughput per unit area.
Questions?