
Slide_1

A Survey of Architectural Design and Implementation Tradeoffs in Network on Chip Systems

Dan Marconett

Next-Generation Networking Systems Lab

University of California, Davis

[email protected]

Slide_2

Overview

• Introduction SoC/NoC

• Architectures

• Routing Strategies

• Energy Dissipation

• Conclusion

Slide_3

SoC

What is System-on-Chip (SoC)?

• Integration of multiple computer components (e.g. microcontroller, memory blocks, timers) onto a single silicon chip

• Each on-chip component is referred to as a block

• Block abstraction enables component-level design of SoC containing multiple proprietary elements

Slide_4

NoC

What is Network-on-Chip (NoC)?

• Leveraging existing computer networking principles to improve inter-component intra-chip communications for SoC

• Each on-chip component is connected by a switch to one or more communication wires

• Improves throughput over the standard bus-based interconnections used in SoC architectures

Slide_5

Overview

• Introduction SoC/NoC

• Architectures

• Routing Strategies

• Energy Dissipation

• Conclusion

Slide_6

Architectures: CLICHE

CLICHÉ: Chip-Level Integration of Communicating Heterogeneous Elements

• Two-dimensional mesh network layout for NoC design

• Each switch is connected to its four nearest neighboring switches and to its target resource block; switches on the edge of the layout have fewer neighbors

• Each connection consists of two unidirectional links
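The mesh connectivity described above can be sketched in a few lines. This is a minimal illustration, not code from the survey: the function names and the (x, y) coordinate scheme are assumptions chosen for the example.

```python
def mesh_neighbors(x, y, cols, rows):
    """Return the switches adjacent to switch (x, y) in a cols x rows mesh."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    # Edge and corner switches simply have fewer in-range neighbors.
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < cols and 0 <= cy < rows]


def mesh_links(cols, rows):
    """List unidirectional links; each adjacent pair appears once per direction."""
    return [((x, y), neighbor)
            for x in range(cols)
            for y in range(rows)
            for neighbor in mesh_neighbors(x, y, cols, rows)]
```

In a 4x4 mesh, an interior switch such as (1, 1) gets four neighbors and a corner switch gets two; `mesh_links(4, 4)` lists every switch-to-switch connection as two unidirectional links.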

Slide_7

Architectures: Folded Torus

Similar to mesh-based architectures

• Wires are wrapped around from the top component to the bottom and rightmost to leftmost

• Smaller hop count

• Higher bandwidth

• Decreased Contention

• Increased chip space usage
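The smaller hop count can be illustrated by comparing minimal hop distances in a mesh and in a torus of the same size. This is a sketch under stated assumptions: folding changes the physical wire layout but not the logical wraparound connectivity, so an ordinary torus distance is used here, and all function names are hypothetical.

```python
from itertools import product


def mesh_hops(src, dst):
    """Minimal hop count between two switches in a 2D mesh."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def torus_hops(src, dst, cols, rows):
    """Minimal hop count when each row and column wraps around."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    return min(dx, cols - dx) + min(dy, rows - dy)


def average_hops(hops, cols, rows):
    """Average hop count over all ordered pairs of distinct switches."""
    nodes = list(product(range(cols), range(rows)))
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    return sum(hops(s, d) for s, d in pairs) / len(pairs)
```

For example, `average_hops(mesh_hops, 4, 4)` can be compared with `average_hops(lambda s, d: torus_hops(s, d, 4, 4), 4, 4)`; the corner-to-corner distance drops from six hops in the mesh to two in the torus.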

Slide_8

Architectures: BFT

BFT: Butterfly Fat Tree

• Each node in the tree model has coordinates (level, position), where level is the depth and position is counted from left to right

• Leaves are component blocks

• Interior nodes are switches

• Four child ports per switch and two parent ports

• log_4(N) levels; the i-th level has N/2^(i+1) switches, where N is the number of leaves (blocks)

• Uses traffic aggregation to reduce congestion
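The per-level switch count can be worked through numerically. The sketch below assumes the slide's formula reads N/2^(i+1) for level i (counting from 1), that the number of levels is log base 4 of N (four child ports per switch), and that N is a power of 4; the function name is hypothetical.

```python
def bft_switches_per_level(n_blocks):
    """Switch counts for levels 1..log4(N) of a butterfly fat tree."""
    if n_blocks < 4:
        raise ValueError("need at least 4 blocks")
    levels, remaining = 0, n_blocks
    while remaining > 1:
        if remaining % 4 != 0:
            raise ValueError("n_blocks must be a power of 4")
        remaining //= 4
        levels += 1
    # Level i holds N / 2^(i+1) switches (assumed reading of the slide).
    return [n_blocks // 2 ** (i + 1) for i in range(1, levels + 1)]
```

With 256 blocks this gives four levels of 64, 32, 16 and 8 switches. The halving per level is consistent with the port counts above: the two parent ports of each switch at one level match the four child ports of half as many switches at the next.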

Slide_9

Architectures: SPIN

SPIN: Scalable, Programmable, Integrated Network

• Leverages the Butterfly Fat Tree design

• Every level now has the same number of switches

• Network size grows as (N log N)/8

• Trades area overhead and decreased power efficiency for higher throughput

• Illustrates the trade-off between performance and power consumption
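To make the growth rate concrete, the sketch below compares total switch counts for SPIN and BFT. It rests on assumptions not stated on the slides: the logarithm is taken base 2, SPIN is modeled as N/4 switches on each of log base 4 of N levels (which sums to (N log2 N)/8), and the BFT per-level count is read as N/2^(i+1). All names are hypothetical.

```python
def levels_for(n_blocks):
    """log4(N) for N a power of 4, computed with integer arithmetic."""
    levels = (n_blocks.bit_length() - 1) // 2
    if n_blocks < 4 or 4 ** levels != n_blocks:
        raise ValueError("n_blocks must be a power of 4")
    return levels


def spin_switches(n_blocks):
    """Assumed model: N/4 switches on every level, i.e. (N log2 N) / 8 in total."""
    return (n_blocks // 4) * levels_for(n_blocks)


def bft_switches(n_blocks):
    """Assumed model: level i holds N / 2^(i+1) switches."""
    return sum(n_blocks // 2 ** (i + 1)
               for i in range(1, levels_for(n_blocks) + 1))
```

For 256 blocks this model gives 256 switches for SPIN against 120 for BFT, which is one way to see where the extra area and power go.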

Slide_10

Architectures: Octagon

Standard model: 8 components, 12 interconnects

• Design complexity increases linearly with number of nodes

• Largest packet travel distance is two hops

• High throughput

• Shortest path routing easy to implement
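One common way to realize shortest-path routing on an eight-node octagon is relative-address routing. The sketch below assumes nodes are numbered 0 to 7 around a ring and that the 12 links are the 8 ring links plus 4 links joining opposite nodes; the numbering and function names are assumptions made for illustration.

```python
def octagon_next_hop(current, dest):
    """Choose the next node from the relative address (dest - current) mod 8."""
    rel = (dest - current) % 8
    if rel == 0:
        return current              # already at the destination
    if rel in (1, 2):
        return (current + 1) % 8    # step clockwise along the ring
    if rel in (6, 7):
        return (current - 1) % 8    # step counterclockwise along the ring
    return (current + 4) % 8        # rel is 3, 4 or 5: take the cross link


def octagon_route(src, dest):
    """Return the full node sequence from src to dest."""
    path = [src]
    while path[-1] != dest:
        path.append(octagon_next_hop(path[-1], dest))
    return path
```

For instance, `octagon_route(0, 3)` crosses to node 4 and then steps back to node 3, so no route in this model exceeds two hops.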

Slide_11

Overview

• Introduction SoC/NoC

• Architectures

• Routing Strategies

• Energy Dissipation

• Conclusion

Slide_12

Routing: Circuit/Packet Switching

Circuit Switching

• Dedicated path, or circuit, is established over which data packets will travel

• Naturally lends itself to time-sensitive guaranteed service due to resource allocation

• Reservation of bandwidth decreases overall throughput and increases average delays

Packet Switching

• Intermediate routers route each packet individually through the network, rather than all packets following a single pre-established path

• Provides for so-called best-effort services

Slide_13

Routing: Wormhole/Virtual Cut Through

Wormhole Switching

• Message is divided into smaller, fixed-length flow control units called flits

• Only the first (header) flit contains routing information; subsequent flits follow it

• Buffer size is significantly reduced because only a limited number of flits need to be buffered at any given time
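The division of a message into flits can be sketched as follows. This is an illustration only: the dictionary fields, the 4-byte flit size, the zero padding and the head/body/tail labels are all assumptions, not a format taken from the survey.

```python
def to_flits(message: bytes, dest: int, flit_size: int = 4):
    """Split a message into one header flit plus fixed-length payload flits."""
    # Only the header flit carries routing information (the destination).
    flits = [{"type": "head", "dest": dest}]
    for start in range(0, len(message), flit_size):
        chunk = message[start:start + flit_size]
        # Pad the last chunk so every payload flit has the same length.
        flits.append({"type": "body", "data": chunk.ljust(flit_size, b"\x00")})
    if len(flits) > 1:
        flits[-1]["type"] = "tail"  # marks the end of the message
    return flits
```

Calling `to_flits(b"hello world", dest=5)` produces a header flit followed by three 4-byte payload flits; a router along the path only needs to hold a few of these at a time rather than the whole message.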

Virtual Cut Through Switching

• Much like Wormhole switching

• Header flit can travel ahead and undergo processing while remaining flits are still navigating the network

• Higher acceptance rates and lower latencies than Wormhole

Slide_14

Routing: Contention

Contention occurs when routers or IP blocks attempt to send data over the same link at the same time

• For circuit switching, contention is resolved at the time of actual connection setup

• For packet switching, contention resolution is handled at a much finer level, by the router buffering and scheduling individual packets of information

• Packet-switched networks achieve better overall performance, at the cost of offering no service guarantees
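Packet-level contention resolution can be pictured as an arbiter in front of each shared link. The sketch below uses round-robin scheduling, which is an assumption made for illustration (the slides do not name a scheduling policy); the class and method names are hypothetical.

```python
from collections import deque


class OutputPort:
    """One output link shared by several buffered input queues."""

    def __init__(self, n_inputs):
        self.queues = [deque() for _ in range(n_inputs)]
        self.next_input = 0  # round-robin pointer (assumed policy)

    def enqueue(self, input_id, packet):
        """Buffer a packet arriving on the given input."""
        self.queues[input_id].append(packet)

    def schedule(self):
        """Send at most one packet per cycle; contending packets stay buffered."""
        n = len(self.queues)
        for offset in range(n):
            i = (self.next_input + offset) % n
            if self.queues[i]:
                self.next_input = (i + 1) % n
                return self.queues[i].popleft()
        return None  # nothing waiting on any input
```

When two inputs have packets for the same link in the same cycle, one is forwarded and the other waits in its buffer, which is why delivery time is not guaranteed under load.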

Slide_15

Overview

• Introduction SoC/NoC

• Architectures

• Routing Strategies

• Energy Dissipation

• Conclusion

Slide_16

Energy Dissipation: Architectures

Two sources of dissipation: switches and wire segments

Many parameters in the architectural design phase affect the key trade-off of performance vs. power dissipation:

• Length of physical wires

• Switching techniques

• Buffer allocation

• Types of guaranteed service

• The topology itself

Slide_17

Energy Dissipation: Architectures (2)

Pande et al. [10] used a simulator to investigate various metrics, including energy dissipation, with respect to the five main architectures

• Average dynamic energy dissipated per event, each layout containing 256 functional blocks

• Energy dissipation increases linearly with the increase of virtual channels for all five architectures

• A small number (4) of virtual channels keeps energy dissipation low without giving up throughput

• When the traffic load was analyzed, it was found that the energy dissipation reached an upper limit when throughput was maximized

• Architectures with more elaborate topologies, and therefore higher degrees of connectivity (such as SPIN and Octagon), have much greater energy dissipation on average (~60 nJ vs. 250-350 nJ)

Slide_18

Energy Dissipation: Switching

How can information be routed from block A to block B so that the constraints on energy consumption are met?

Banerjee et al. [9] address this issue through a modeling approach based on a 4x4 mesh layout

• Virtual Cut Through (VCT) switching versus Wormhole switching

• For both switching techniques, energy dissipation rises linearly with the injection rate of data packets until the network is fully congested, after which it is constant

• Both techniques yield the same power consumption

• VCT switching produces higher acceptance rates and lower latencies than Wormhole switching, so VCT is preferred

Slide_19

Overview

• Introduction SoC/NoC

• Architectures

• Routing Strategies

• Energy Dissipation

• Conclusion

Slide_20

Conclusion

1. More elaborate layouts with higher degrees of connectivity (SPIN and Octagon) were seen to have much higher rates of energy dissipation; however, they also yield increased throughput

2. Elaborate architectures also take up more space on the silicon chip

3. VCT is preferred to Wormhole due to decreased latency, though both have the same energy dissipation for given traffic loads

4. Decide on priorities: communication reliability, energy efficiency, increased throughput, or decreased latency?

Slide_21

References

[1] E. Rijpkema, K. Goossens, A. Radulescu, J. Dielssen, J. van Meerbergen, P. Wielage, and E. Waterlander, “Trade-offs in the Design of a Router with Both Guaranteed and Best-Effort Services for Networks on Chip,” IEE Proceedings Computers and Digital Techniques, vol. 150, no. 5, pp. 294-302, Sept. 2003.

[2] W. Dally, C. Seitz, “Deadlock-free Message Routing in Multiprocessor Interconnection Networks,” IEEE Transactions on Computers, vol. C-34, no. 10, pp. 547-553, May 1987.

[3] S. Kumar, A. Jantsch, J. Soininen, M. Forsell, M. Millberg, J. Oberg, K. Tiensyrja, and A. Hemani, “A Network on Chip Architecture and Design Methodology,” Proceedings International Symposium VLSI (ISVLSI), pp. 117-124, 2002.

[4] W. J. Dally and B. Towles, “Route Packets, Not Wires: On-Chip Interconnection Networks,” Proceedings Design and Automation Conference (DAC), pp. 683-689, 2001.

[5] P. P. Pande, C. Grecu, A. Ivanov, and R. Saleh, “Design of a Switch for Network on Chip Applications,” Proceedings International Symposium on Circuits and Systems (ISCAS), vol. 5, pp. 217-220, May 2003.

[6] P. Guerrier and A. Greiner, “A Generic Architecture for On-Chip Packet-Switched Interconnections,” Proceedings Design and Test in Europe (DATE), pp. 250-256, Mar. 2000.

[7] F. Karim, A. Nguyen, and Sujit Dey, “An Interconnect Architecture For Networking Systems on Chips,” IEEE Micro, vol. 22, no. 5, pp. 36-45, Sept./Oct. 2002.

[8] Arteris, “A comparison of Network-on-Chip and Buses,” http://www.arteris.com/noc_whitepaper.pdf.

[9] N. Banerjee, P. Vellanki, and K. S. Chatha, “A Power and Performance Model for Network-on-Chip Architectures,” Proceedings of the Design, Automation, and Test in Europe Conference and Exhibition (DATE), p. 21250, 2004.

[10] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, “Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures,” IEEE Transactions on Computers, vol. 54, no. 8, pp. 1025-1040, Aug. 2005.
