Slide_1
A Survey of Architectural Design and Implementation Tradeoffs in Network on Chip Systems
Dan Marconett
Next-Generation Networking Systems Lab
University of California, Davis
Slide_2
Overview
• Introduction SoC/NoC
• Architectures
• Routing Strategies
• Energy Dissipation
• Conclusion
Slide_3
SoC
What is System-on-Chip (SoC)?
• Integration of multiple computer components (e.g. microcontroller, memory blocks, timers) onto a single silicon chip
• Each on-chip component is referred to as a block
• Block abstraction enables component-level design of SoC containing multiple proprietary elements
Slide_4
NoC
What is Network-on-Chip (NoC)?
• Leveraging existing computer networking principles to improve inter-component intra-chip communications for SoC
• Each on-chip component is connected by a switch to one or more communication wires
• Improves throughput over standard bus-based interconnections for SoC architectures
Slide_5
Overview
• Introduction SoC/NoC
• Architectures
• Routing Strategies
• Energy Dissipation
• Conclusion
Slide_6
Architectures: CLICHE
CLICHÉ: Chip-Level Integration of Communicating Heterogeneous Elements
• Two-dimensional mesh network layout for NoC design
• Each switch is connected to its four nearest neighboring switches and to its own resource block; switches on the edge of the layout have fewer neighbors
• Each connection consists of two unidirectional links
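The mesh connectivity rule can be sketched in a few lines. This is an illustration only: the function name `mesh_neighbors` and the (x, y) coordinate scheme are assumptions, not part of the CLICHÉ proposal.

```python
def mesh_neighbors(x, y, cols, rows):
    """Return the switches adjacent to switch (x, y) in a cols x rows mesh.

    Hypothetical helper: coordinates and naming are illustrative assumptions.
    """
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    # Edge and corner switches simply have fewer in-range neighbors.
    return [(cx, cy) for cx, cy in candidates if 0 <= cx < cols and 0 <= cy < rows]


print(len(mesh_neighbors(1, 1, 4, 4)))  # interior switch: 4
print(len(mesh_neighbors(0, 1, 4, 4)))  # edge switch: 3
print(len(mesh_neighbors(0, 0, 4, 4)))  # corner switch: 2
```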
Slide_7
Architectures: Folded Torus
Similar to mesh based architectures
• Wires are wrapped around from the top component to the bottom and rightmost to leftmost
• Smaller hop count
• Higher bandwidth
• Decreased Contention
• Increased chip space usage
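The smaller hop count can be checked with a rough sketch that compares minimal hop distances in a k x k mesh and a k x k torus. Folding changes only the physical wire layout, so the hop counts of a folded torus are taken here to equal those of a plain torus. The function names and the 4x4 size are assumptions chosen for illustration.

```python
from itertools import product


def mesh_hops(a, b):
    """Minimal hops between switches a and b in a mesh (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def torus_hops(a, b, k):
    """Minimal hops in a k x k torus, where each dimension wraps around."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return min(dx, k - dx) + min(dy, k - dy)


def hop_stats(k, wrap):
    """Return (average, worst-case) hops over all ordered pairs of distinct switches."""
    nodes = list(product(range(k), repeat=2))
    hops = [torus_hops(a, b, k) if wrap else mesh_hops(a, b)
            for a in nodes for b in nodes if a != b]
    return sum(hops) / len(hops), max(hops)


for name, wrap in (("mesh", False), ("torus", True)):
    avg, worst = hop_stats(4, wrap)
    print(f"{name}: average {avg:.2f} hops, worst case {worst} hops")
```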
Slide_8
Architectures: BFT
BFT: Butterfly Fat Tree
• Each node in tree model has coordinates (level, position) where level is depth and position is from left to right
• Leaves are component blocks
• Interior nodes are switches
• Four child ports per switch and two parent ports
• log4(N) levels; the i-th level has N/2^(i+1) switches, where N is the number of leaves (blocks)
• Use traffic aggregation to reduce congestion
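As a worked example of the level structure, the sketch below tabulates switches per level. It reads the per-level count as N / 2^(i+1) and assumes the number of levels is log base 4 of N (four children per switch); the function name is hypothetical.

```python
def bft_switches_per_level(n_blocks):
    """Switch count at levels 1..log4(N) of a butterfly fat tree.

    Assumes n_blocks is a power of 4 and level i holds N / 2**(i + 1) switches.
    """
    levels, remaining = 0, n_blocks
    while remaining > 1:
        if remaining % 4:
            raise ValueError("n_blocks must be a power of 4")
        remaining //= 4
        levels += 1
    return [n_blocks // 2 ** (i + 1) for i in range(1, levels + 1)]


counts = bft_switches_per_level(256)
print(counts)       # [64, 32, 16, 8]
print(sum(counts))  # 120
```

Each level has half as many switches as the one below it, which is where the traffic aggregation toward the root comes from.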
Slide_9
Architectures: SPIN
SPIN: Scalable, Programmable, Integrated Network
• Leverages the Butterfly Fat Tree design
• Every level now has the same number of switches
• Network grows like (NlogN)/8
• Trades area overhead and decreased power efficiency for higher throughput
• Illustrative of performance vs. power consumption
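The growth figure can be checked with a small sketch. It assumes four children per switch, so each of the log4(N) levels holds N/4 switches, and it reads the logarithm in (N log N)/8 as base 2. Both points are assumptions made so the two counts can be compared; the function name is hypothetical.

```python
import math


def spin_switch_count(n_blocks):
    """Total switches when every one of the log4(N) levels holds N/4 switches."""
    levels, remaining = 0, n_blocks
    while remaining > 1:
        if remaining % 4:
            raise ValueError("n_blocks must be a power of 4")
        remaining //= 4
        levels += 1
    return levels * (n_blocks // 4)


for n in (16, 64, 256):
    closed_form = n * math.log2(n) / 8  # (N log N)/8 with log taken base 2
    print(n, spin_switch_count(n), closed_form)
```

Under the same assumptions, 256 blocks need 256 switches here, against 120 for a butterfly fat tree whose level i holds N/2^(i+1) switches. That difference is the area overhead being traded for throughput.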
Slide_10
Architectures: Octagon
Standard model: 8 components, 12 interconnects
• Design complexity increases linearly with number of nodes
• Largest packet travel distance is two hops
• High throughput
• Shortest path routing easy to implement
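A minimal sketch of shortest-path routing on the standard model follows. It assumes the 12 interconnects form a ring of eight links plus four links joining opposite nodes, and routes on the relative address (destination minus source, modulo 8). The function names are hypothetical.

```python
def octagon_links():
    """The 12 bidirectional links: 8 ring links plus 4 cross links (assumed layout)."""
    ring = {frozenset((i, (i + 1) % 8)) for i in range(8)}
    cross = {frozenset((i, (i + 4) % 8)) for i in range(8)}
    return ring | cross


def octagon_next_hop(src, dst):
    """Pick the next node from the relative address (dst - src) mod 8."""
    rel = (dst - src) % 8
    if rel == 0:
        return src                # already at the destination
    if rel in (1, 2):
        return (src + 1) % 8      # step clockwise
    if rel in (6, 7):
        return (src - 1) % 8      # step counterclockwise
    return (src + 4) % 8          # rel in (3, 4, 5): take the cross link


def octagon_hops(src, dst):
    """Count hops taken by repeatedly applying octagon_next_hop."""
    hops, node = 0, src
    while node != dst:
        node = octagon_next_hop(node, dst)
        hops += 1
    return hops


print(len(octagon_links()))                                          # 12
print(max(octagon_hops(s, d) for s in range(8) for d in range(8)))   # 2
```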
Slide_11
Overview
• Introduction SoC/NoC
• Architectures
• Routing Strategies
• Energy Dissipation
• Conclusion
Slide_12
Routing: Circuit/Packet Switching
Circuit Switching
• Dedicated path, or circuit, is established over which data packets will travel
• Naturally lends itself to time-sensitive guaranteed service due to resource allocation
• Reservation of bandwidth decreases overall throughput and increases average delays
Packet Switching
• Intermediate routers are responsible for routing each packet individually through the network, rather than all packets following a single pre-established path
• Provides for so-called best-effort services
Slide_13
Routing: Wormhole/Virtual Cut Through
Wormhole Switching
• Message is divided up into smaller, fixed length flow units called flits
• Only first flit contains routing information, subsequent flits follow
• Buffer size is significantly reduced due to the limitation on the number of flits needed to be buffered at any given time
Virtual Cut Through Switching
• Much like Wormhole switching
• Header flit can travel ahead and undergo processing while remaining flits are still navigating the network
• Higher acceptance rates and lower latencies than Wormhole
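The division of a message into flits can be sketched as follows. The `Flit` layout, the 4-byte flit size, the zero padding, and the choice of a header flit that carries only the destination are all assumptions made for illustration, not a format taken from any cited design.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Flit:
    kind: str                               # "head", "body", or "tail"
    payload: bytes = b""
    dest: Optional[Tuple[int, int]] = None  # routing info, head flit only


def to_flits(message, dest, flit_bytes=4):
    """Split a message into fixed-length flits behind a single header flit."""
    flits = [Flit("head", dest=dest)]
    for start in range(0, len(message), flit_bytes):
        # Pad the last chunk so every data flit has the same length.
        chunk = message[start:start + flit_bytes].ljust(flit_bytes, b"\x00")
        flits.append(Flit("body", chunk))
    if len(flits) > 1:
        flits[-1].kind = "tail"             # last data flit closes the worm
    return flits


for flit in to_flits(b"hello NoC!", dest=(2, 3)):
    print(flit)
```

Because only the head flit names the destination, a router needs to hold routing state for the worm rather than a full header per flit, which is what keeps the buffers small.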
Slide_14
Routing: Contention
Contention occurs when routers or IP blocks attempt to send data over the same link at the same time
• For Circuit Switching, contention is resolved at the time of actual connection setup
• For packet switching, contention resolution is handled at a much finer level, by the router buffering and scheduling individual packets of information
• Better overall performance for packet switched networks at the cost of lack of service guarantee
Slide_15
Overview
• Introduction SoC/NoC
• Architectures
• Routing Strategies
• Energy Dissipation
• Conclusion
Slide_16
Energy Dissipation: Architectures
Two sources of dissipation: switches and wire segments
Many parameters in the architectural design phase affect the key trade-off of performance vs. power dissipation:
• Length of physical wires
• Switching techniques
• Buffer allocation
• Types of guaranteed service
• The topology itself
Slide_17
Energy Dissipation: Architectures (2)
Pande et al. [10] used a simulator to investigate various metrics, including energy dissipation, with respect to the five main architectures
• Average dynamic energy dissipated per event, each layout containing 256 functional blocks
• Energy dissipation increases linearly with the increase of virtual channels for all five architectures
• Small number (4) of virtual channels will keep energy dissipation low without giving up throughput
• When the traffic load was analyzed, it was found that the energy dissipation reached an upper limit when throughput was maximized
• Architectures with more elaborate topologies, and therefore higher degrees of connectivity (such as SPIN and Octagon), have much greater energy dissipation on average (250-350 nJ vs. ~60 nJ)
Slide_18
Energy Dissipation: Switching
How to route information from block A to block B in such a way that the constraints on energy consumption are maintained
Banerjee et al. [9] address this issue through a modeling approach based on a 4x4 mesh layout
• Virtual Cut Through Switching versus Wormhole Switching
• For both routing techniques, energy dissipation rises linearly with the injection rate of data packets until the network is fully congested, after which it is constant
• Both techniques yield the same power consumption
• Virtual Cut Through switching produces higher acceptance rates and lower latencies than Wormhole switching, therefore VCT is preferred
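The reported shape of the energy curve, linear up to congestion and flat afterwards, can be written as a simple piecewise model. This is a sketch of the trend only, not the model of Banerjee et al. [9]; the function name and the numeric values in the usage lines are placeholders, not measured data.

```python
def energy_dissipation(injection_rate, saturation_rate, energy_per_unit_rate):
    """Energy rises linearly with injection rate, then stays flat once congested.

    All three parameters are hypothetical model inputs in arbitrary units.
    """
    return energy_per_unit_rate * min(injection_rate, saturation_rate)


# Placeholder values chosen only to show the shape of the curve.
for rate in (0.1, 0.2, 0.4, 0.6, 0.8):
    print(rate, energy_dissipation(rate, saturation_rate=0.4, energy_per_unit_rate=100.0))
```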
Slide_19
Overview
• Introduction SoC/NoC
• Architectures
• Routing Strategies
• Energy Dissipation
• Conclusion
Slide_20
Conclusion
1. More elaborate layouts with higher degrees of connectivity (SPIN and Octagon) were seen to have much higher energy dissipation, but they also yield increased throughput
2. Elaborate architectures also take up more space on the silicon chip
3. VCT is preferred to Wormhole due to decreased latency, though both have the same energy dissipation for given traffic loads
4. Decide on priorities: communication reliability, energy efficiency, increased throughput, decreased latency, ...?
Slide_21
References
[1] E. Rijpkema, K. Goossens, A. Radulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander, “Trade-offs in the Design of a Router with Both Guaranteed and Best-Effort Services for Networks on Chip,” IEE Proceedings Computers and Digital Techniques, vol. 150, no. 5, pp. 294-302, Sept. 2003.
[2] W. Dally and C. Seitz, “Deadlock-free Message Routing in Multiprocessor Interconnection Networks,” IEEE Transactions on Computers, vol. C-36, no. 5, pp. 547-553, May 1987.
[3] S. Kumar, A. Jantsch, J. Soininen, M. Forsell, M. Millberg, J. Oberg, K. Tiensyrja, and A. Hemani, “A Network on Chip Architecture and Design Methodology,” Proceedings International Symposium VLSI (ISVLSI), pp. 117-124, 2002.
[4] W. J. Dally and B. Towles, “Route Packets, Not Wires: On-Chip Interconnection Networks,” Proceedings Design and Automation Conference (DAC), pp. 683-689, 2001.
[5] P. P. Pande, C. Grecu, A. Ivanov, and R. Saleh, “Design of a Switch for Network on Chip Applications,” Proceedings International Symposium on Circuits and Systems (ISCAS), vol. 5, pp.217-220, May 2003.
[6] P. Guerrier and A. Greiner, “A Generic Architecture for On-Chip Packet-Switched Interconnections,” Proceedings Design and Test in Europe (DATE), pp. 250-256, Mar. 2000.
[7] F. Karim, A. Nguyen, and Sujit Dey, “An Interconnect Architecture For Networking Systems on Chips,” IEEE Micro, vol. 22, no. 5, pp. 36-45, Sept./Oct. 2002.
[8] Arteris, “A comparison of Network-on-Chip and Buses,” http://www.arteris.com/noc_whitepaper.pdf.
[9] N. Banerjee, P. Vellanki, and K. S. Chatha, “A Power and Performance Model for Network-on-Chip Architectures,” Proceedings of the Design, Automation, and Test in Europe Conference and Exhibition (DATE), p. 21250, 2004.
[10] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, “Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures,” IEEE Transactions on Computers, vol. 54, no. 8, pp. 1025-1040, Aug. 2005.