Communication Synthesis: Buses and Network-on-Chip (NOC) Dr. Eng. Amr T. Abdel-Hamid ELECT 1002 Spring 2008 System-On-a-Chip Design



The SoC nightmare

The architecture is tightly coupled

Source: Prof Jan Rabaey CS-252-2000 UC Berkeley

Figure: The "Board-on-a-Chip" Approach (DMA, CPU, DSP, MemCtrl., MPEG, and I/O blocks on a System Bus, with a Bridge to a Peripheral Bus)


Very long wires

Figure: a wire between points A and B on the die, Year 2005 versus Year 2010, with clock periods of 1 ns (1 GHz) and 0.1 ns (10 GHz)


Why NoC?

• Global wire delays increase exponentially, or linearly if repeaters are inserted
• Even after repeater insertion, the delay may exceed one clock cycle
• In ultra-deep submicron processes, 80% or more of the delay of critical paths will be due to interconnections
• Communication structures need to be designed first, followed by the functional blocks


Homogeneous SoC (MP-SoC)

Figure: eight CPU/MEM pairs attached to an interconnection network (BUS, XBAR)


Why not bus?

The shared-medium arbitrated bus is the most frequently used on-chip interconnect architecture

Pros
• Simple, low area cost, and extensible

Cons
• The intrinsic parasitic resistance and capacitance can be quite high for a long bus line
• Every additional IP block adds to the parasitic capacitance and increases the propagation delay
• The number of IP blocks that can be connected by the bus is limited


On-Chip Communication

Bus-based interconnect
• Low cost
• Easier to implement
• Flexible

Networks on Chip
• Layered approach
• Buses replaced with networked architectures
• Better electrical properties
• Higher bandwidth
• Energy efficiency
• Scalable

Figure: bus-based architectures, irregular architectures, regular architectures


Network on Chip

Figure: layered stack showing separation of concerns (Software, Transport, Network, Wiring; a Data Link Layer is also labeled)

Communication-based Design
• Orthogonalizes function and communication
• Builds on well-known models of computation and a correct-by-construction synthesis flow
• Parallels the layered approach exploited by the communications community


What is Network-on-Chip (NoC)?

• Leverages existing computer networking principles to improve inter-component, intra-chip communication for SoCs

• Each on-chip component is connected by a switch to particular communication wire(s)

• Improves on standard bus-based interconnections for SoC architectures in terms of throughput


SOC Current Trend

Explicitly parallel SoC architectures

Integrating huge amounts of Memory in chip designs

Distributed Shared Memory Environments

Should allow an interconnection-centric design flow and better predictability

Physical design closure

Wire delay dominates gate delay


Design goal of NoC

• High throughput
• Low latency
• Less energy consumption
• Small area requirements

Network-on-Chip Basics:
• Architectures
• Routing Strategies
• Evaluation

Figure 1: NoC Architecture (IP Core, CNI, Router Logic, To/From Network)


Routing: Circuit/Packet Switching

Circuit Switching

• Dedicated path, or circuit, is established over which data packets will travel

• Naturally lends itself to time-sensitive guaranteed service due to resource allocation

• Reservation of bandwidth decreases overall throughput and increases average delays

Packet Switching

• Intermediate routers are responsible for routing each packet individually through the network, rather than having all packets follow a single pre-established path

• Provides for so-called best-effort services


Routing: Wormhole/Virtual Cut Through

Wormhole Switching

• A message is divided into smaller, fixed-length flow units called flits

• Only the first flit contains routing information; subsequent flits follow it

• Buffer size is significantly reduced because only a limited number of flits need to be buffered at any given time

Virtual Cut Through Switching

• Much like Wormhole switching

• Header flit can travel ahead and undergo processing while remaining flits are still navigating the network

• Higher acceptance rates and lower latencies than Wormhole


Wormhole Switching


Routing: Contention

• Contention occurs when routers or IP blocks attempt to send data over the same link at the same time

• For circuit switching, contention is resolved at connection setup time

• For packet switching, contention is resolved at a much finer granularity: the router buffers and schedules individual packets

• Packet-switched networks achieve better overall performance, at the cost of lacking service guarantees
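
The packet-switched case can be sketched in a few lines. This is only an illustration: the `OutputPort` class, the FIFO scheduling policy, and the one-packet-per-cycle link are assumptions made for the example, not details given on the slide.

```python
from collections import deque


class OutputPort:
    """Hypothetical router output link: buffers packets, sends one per cycle."""

    def __init__(self):
        self.queue = deque()  # packets waiting for the link

    def offer(self, packet):
        """A packet arrives and wants this link; buffer it."""
        self.queue.append(packet)

    def cycle(self):
        """Advance one cycle: forward the oldest buffered packet, if any."""
        return self.queue.popleft() if self.queue else None


port = OutputPort()
port.offer("A->east")  # two packets contend for the same link
port.offer("B->east")  # in the same cycle
sent = [port.cycle() for _ in range(3)]
print(sent)
```

Packet B is not dropped; it waits one cycle in the buffer, which is why latency under contention is not guaranteed.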


Architectures: SPIN

• SPIN: Scalable, Programmable, Integrated Network

• Every level has the same number of switches

• The network grows as (N log N)/8

• Trades area overhead and decreased power efficiency for higher throughput

• Illustrates the trade-off between performance and power consumption
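
To get a feel for the growth expression, the sketch below evaluates (N log N)/8 for a few system sizes. The slide does not state the base of the logarithm; base 2 is assumed here, and the function name `spin_size` is made up for the example.

```python
import math


def spin_size(n_blocks):
    """Evaluate (N log N)/8, assuming log base 2 (not stated on the slide)."""
    return n_blocks * math.log2(n_blocks) / 8


for n in (16, 64, 256):
    print(n, spin_size(n))
```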


Architectures: CLICHE

• CLICHÉ: Chip-Level Integration of Communicating Heterogeneous Elements

• Two-dimensional mesh network layout for NoC design

• Every switch is connected to its four nearest neighboring switches and to its target resource block; switches on the edge of the layout have fewer neighbors

• Each connection consists of two unidirectional links
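
The neighbor rule for a mesh can be written down directly. In this sketch the (row, column) coordinates and the name `mesh_neighbors` are illustrative assumptions; the slide gives no addressing scheme.

```python
def mesh_neighbors(row, col, rows, cols):
    """Return the switches adjacent to (row, col) in a rows x cols 2D mesh."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    # Switches on the edge of the layout have no neighbor beyond the boundary.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


print(len(mesh_neighbors(1, 1, 4, 4)))  # interior switch
print(len(mesh_neighbors(0, 1, 4, 4)))  # edge switch
print(len(mesh_neighbors(0, 0, 4, 4)))  # corner switch
```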


Architectures: Torus

• Similar to mesh-based architectures

• Wires wrap around from the top row to the bottom row and from the rightmost column to the leftmost

• Smaller hop count

• Higher bandwidth

• Decreased Contention

• Increased chip space usage
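
The smaller hop count can be checked numerically. The sketch below compares the worst-case switch-to-switch distance in a mesh and a torus of the same size; the 4 x 4 size and the helper names are assumptions chosen to keep the example small.

```python
from itertools import product


def mesh_hops(a, b):
    """Hop distance between switches a and b in a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def torus_hops(a, b, rows, cols):
    """Hop distance in a torus: each dimension may use the wrap-around link."""
    dr, dc = abs(a[0] - b[0]), abs(a[1] - b[1])
    return min(dr, rows - dr) + min(dc, cols - dc)


def diameter(hops, rows, cols):
    """Largest hop distance over all pairs of switches."""
    nodes = list(product(range(rows), range(cols)))
    return max(hops(a, b) for a in nodes for b in nodes)


rows = cols = 4
mesh_d = diameter(mesh_hops, rows, cols)
torus_d = diameter(lambda a, b: torus_hops(a, b, rows, cols), rows, cols)
print(mesh_d, torus_d)
```

For the 4 x 4 case the corner-to-corner trip drops from 6 hops in the mesh to 2 in the torus, because both dimensions can take the wrap-around link.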


Architectures: Folded Torus

• Similar to Torus

• In a torus, the long end-around connections can yield excessive delays

• This is avoided by folding the torus


Architectures: Octagon

• Standard model: 8 components, 12 interconnects

• Design complexity increases linearly with number of nodes

• Largest packet travel distance is two hops

• High throughput

• Shortest path routing easy to implement
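
The 12-link, two-hop figures can be reproduced with a small graph check. The slide does not spell out the wiring; this sketch assumes the usual octagon arrangement in which each node links to its two ring neighbors and to the node directly across. The function names are illustrative.

```python
from collections import deque


def octagon_links(n=8):
    """Ring links plus one cross link per node (assumed octagon wiring)."""
    links = set()
    for i in range(n):
        links.add(frozenset((i, (i + 1) % n)))       # ring neighbor
        links.add(frozenset((i, (i + n // 2) % n)))  # node directly across
    return links


def max_hops(links, n=8):
    """Largest shortest-path distance between any two nodes (BFS from each)."""
    adj = {i: set() for i in range(n)}
    for link in links:
        a, b = tuple(link)
        adj[a].add(b)
        adj[b].add(a)
    worst = 0
    for start in range(n):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


links = octagon_links()
print(len(links), max_hops(links))
```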


Architectures: BFT

• BFT: Butterfly Fat Tree

• Each node in tree model has coordinates (level, position) where level is depth and position is from left to right

• Leaves are component blocks

• Interior nodes are switches

• Four child ports per switch and two parent ports

• log N levels; the i-th level has N/2^(i+1) switches, where N is the number of leaves (blocks)

• Use traffic aggregation to reduce congestion
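
The per-level switch count can be tabulated from the expression above. This sketch assumes the logarithm is base 4 (consistent with four child ports per switch, but not stated on the slide), that levels are numbered from 1 at the bottom, and that the number of leaves is a power of 4; the function name is made up.

```python
def bft_switches_per_level(n_leaves):
    """Switch count at levels 1, 2, ... using N / 2^(i+1).

    Assumes log-base-4 levels and n_leaves a power of 4.
    """
    counts = []
    level = 1
    while 4 ** level <= n_leaves:
        counts.append(n_leaves // 2 ** (level + 1))
        level += 1
    return counts


for n in (64, 256):
    counts = bft_switches_per_level(n)
    print(n, counts, sum(counts))
```

Each level up halves the number of switches, which is where the traffic aggregation toward the root comes from.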


Network interface

Open Core Protocol (OCP)
• An interface standard between IP cores and the interconnection fabric


Packet Format

Type: Head, Data, Tail and Complete

VCID: Virtual Channel Identifier

Route: N-bit route field; the last 2 bits specify the route to be used in the next controller
• 00 - Left
• 01 - Right
• 10 - Straight
• 11 - Extract

Data: Actual data field
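
A possible way to consume the route field hop by hop is sketched below. The 2-bit codes come from the slide; shifting the field right by two bits after each controller, and the names `next_hop` and `walk`, are assumptions made for the example.

```python
DIRECTIONS = {0b00: "Left", 0b01: "Right", 0b10: "Straight", 0b11: "Extract"}


def next_hop(route):
    """Decode the last 2 bits; return (direction, remaining route field)."""
    return DIRECTIONS[route & 0b11], route >> 2  # the shift is an assumption


def walk(route, max_hops=16):
    """Follow the route field until a controller extracts the packet."""
    steps = []
    for _ in range(max_hops):
        direction, route = next_hop(route)
        steps.append(direction)
        if direction == "Extract":
            break
    return steps


# Straight, then Left, then Extract; the first hop sits in the lowest bits.
route = 0b11_00_10
print(walk(route))
```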


Routing Example


Simulation

A simulator is used to investigate various metrics:

• Each system consists of 256 functional IP blocks

• Wormhole routing is used

• The user can choose uniform or localized traffic

• Supports both Poisson and self-similar message injection distributions

A flit is a single 36-bit word, of which 4 bits are used for packet framing.
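
The 36-bit flit layout can be illustrated by packing 32-bit payload words. The slide gives only the widths; placing the framing bits at the top of the word, reusing the Head/Data/Tail/Complete types as framing codes, and the numeric values of those codes are all assumptions for this sketch.

```python
FLIT_BITS = 36
FRAMING_BITS = 4
PAYLOAD_BITS = FLIT_BITS - FRAMING_BITS  # 32 payload bits per flit

# Hypothetical framing codes; the slide does not define the encoding.
HEAD, DATA, TAIL, COMPLETE = 0, 1, 2, 3


def to_flits(words):
    """Pack a message (list of 32-bit words) into 36-bit flits."""
    flits = []
    last = len(words) - 1
    for i, word in enumerate(words):
        if last == 0:
            kind = COMPLETE  # single-flit message
        elif i == 0:
            kind = HEAD
        elif i == last:
            kind = TAIL
        else:
            kind = DATA
        payload = word & ((1 << PAYLOAD_BITS) - 1)
        flits.append((kind << PAYLOAD_BITS) | payload)
    return flits


flits = to_flits([0xDEADBEEF, 1, 2])
print([hex(f) for f in flits])
```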


Area comparison

SPIN and Octagon have a considerably higher silicon area overhead.


Projected performance