1 H Friedman Fully Asynchronous framework for GALS network on chip 2010 Advanced topics for NOC Friedman Harel Seminar in VLSI Architectures (048879) Electronic Engineering Technion EE Department Technion, Haifa, Israel Mentor: Prof Ran Ginosar

Post on 20-Dec-2015

1 H Friedman Fully Asynchronous framework for GALS network on chip 2010

Advanced topics for NOC

Friedman Harel

Seminar in VLSI Architectures (048879)

Electronic Engineering

Technion

EE Department Technion, Haifa, Israel

Mentor: Prof Ran Ginosar

Agenda

• Quality of Service (QoS)
• NoC types
• NoC fundamentals
• Guaranteed and Best-Effort Services for Networks on Chip
• The MANGO Clockless Network-on-Chip
• NoC Design Flow for TDMA and QoS Management
• Q-ANoC Router with Dynamic Virtual Channel Allocation

• Three emerging interconnect paradigms:
• 3-D integration
• Nanophotonic communications
• Wireless interconnects

Agenda

NoC types

SPIN Fat tree (FT)

SPIN network (Scalable Programmable Integrated Network)

ST NoC Ring & Spidergon

3D

ST NoC Spidergon routing: Across first

Path redundancy
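As an illustration of the Across-First idea, here is a minimal sketch. It assumes a Spidergon of n nodes (n even) numbered around the ring, each with a clockwise, a counterclockwise and an across link to node i + n/2. The function name `across_first_path` and the exact distance thresholds are assumptions for illustration, not taken from the slide.

```python
def across_first_path(src, dst, n):
    """Return the node sequence from src to dst in an n-node Spidergon (n even, assumed)."""
    path = [src]
    cur = src
    d = (dst - cur) % n                 # clockwise distance to the destination
    if n // 4 < d < n - n // 4:         # destination in the far part of the ring
        cur = (cur + n // 2) % n        # take the across link first
        path.append(cur)
    while cur != dst:
        d = (dst - cur) % n
        step = 1 if d <= n // 2 else -1  # continue along the shorter ring direction
        cur = (cur + step) % n
        path.append(cur)
    return path
```

For example, in a 16-node ring a packet from node 0 to node 7 crosses to node 8 and then steps back one hop, while a packet to node 3 stays on the ring.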

Mesh network

A torus was tested by NOSTRUM

but was found to be too complicated

TI OMAP MP Multimedia SoC

AMBA = bus, not NoC

Widely marketed ARM-based architecture

Recent NoC Implementations

Philips NXP Viper 2 set-top box SoC

Agenda

NoC - fundamental

Four layer NoC Protocol stack

A system level transaction through the protocol stack

Example of 5x5 Router

Link controller

Routers Queuing

Symmetric and Asymmetric routers

Routers Queuing

Wormhole switching with VC

VC allocation FSM

Forward the packet's flits along the same route on which the header flit was routed

Store and forward (SAF) VS Wormhole switching

Network interface to IP

Building blocks
• FIFOs to buffer data and cross clock domains.
• Decompose messages into packets and packets into flits.
• Inject flits into the network according to a scheduling policy.
• Depacketize data and reassemble it into messages toward the system layer.
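The packetization and reassembly steps listed above can be sketched as follows. This is a minimal illustration, not the interface of any particular NoC: the function names, the head/body/tail flit tagging, and the payload sizes are all assumptions.

```python
def packetize(message: bytes, packet_payload: int = 8, flit_bytes: int = 4):
    """Split a message into packets, and each packet into a head flit plus data flits."""
    packets = []
    for seq, off in enumerate(range(0, len(message), packet_payload)):
        payload = message[off:off + packet_payload]
        flits = [("head", seq)]  # the head flit carries the packet sequence number
        chunks = [payload[i:i + flit_bytes] for i in range(0, len(payload), flit_bytes)]
        for i, chunk in enumerate(chunks):
            kind = "tail" if i == len(chunks) - 1 else "body"
            flits.append((kind, chunk))
        packets.append(flits)
    return packets


def depacketize(packets):
    """Reassemble the message from packets that may arrive in any order."""
    ordered = sorted(packets, key=lambda flits: flits[0][1])
    return b"".join(data for flits in ordered for _kind, data in flits[1:])
```

A 16-byte message with these defaults becomes two packets of three flits each (head, body, tail), and `depacketize` restores the original bytes even if the packets are delivered in reverse order.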

Agenda

NoC QoS

Data classification
• GT (guaranteed throughput) / GS (guaranteed service)
• BE (best effort)

=> Also multiple service levels - SL

TDMA

SDM – Multi VC

Circuit switched - with global flow control and a blocking-relief routing algorithm

Time slot allocation - Time division Multiplexing (TDM)

NXP - Aethereal TM

• Contention free.
• Pipelined time slot allocation.
• Guaranteed incoming packets are serviced in the next time slot.
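A minimal sketch of pipelined, contention-free slot allocation follows. It assumes each link has a slot table of `num_slots` entries and that a connection using slot s on one link uses slot s + 1 (modulo the table size) on the next link of its path. The function name and table layout are hypothetical and are not the Aethereal tool flow.

```python
def allocate_slots(tables, path, conn, num_slots):
    """Reserve one pipelined slot sequence for `conn` along `path`.

    tables maps a link name to a list of num_slots entries (None means free).
    Returns the starting slot, or None when no contention-free sequence exists.
    """
    for start in range(num_slots):
        slots = [(start + hop) % num_slots for hop in range(len(path))]
        if all(tables[link][s] is None for link, s in zip(path, slots)):
            for link, s in zip(path, slots):
                tables[link][s] = conn  # no other connection can claim this slot
            return start
    return None
```

Because a slot is reserved on every link before any data is sent, two guaranteed connections can never compete for the same link in the same slot.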

Spatial Division Multiplexing (SDM)

Allocates a subset of the link's wires to a given circuit for the whole connection lifetime (MANGO).

Block free algorithms should be implemented

Two views of the combined GT-BE router

• GT (guaranteed throughput): the part of the traffic for which the network has to give real-time guarantees (i.e. guaranteed bandwidth, bounded latency).
• BE (best effort): the part of the traffic for which the network guarantees only fairness and gives no bandwidth or timing guarantees.

The three stages of a schedule iteration

Bipartite graph matching

Circuit Switched Router

• Each output port is 64 bits wide, since no control data is necessary.

• Each 64-bit output port is split into four 16-bit-wide lanes.

• Given the 5-port design, 20 input and 20 output lanes therefore exist.

• A 16 x 20 crossbar provides full connectivity between every input and output lane (5-bit configuration per lane: 4 address bits + 1 valid bit).

• The completely static nature of the CS network means that a separate control network is necessary to provide all circuit set-up and tear-down functions.
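The lane and crossbar figures above can be checked with a short calculation. The sketch assumes that an output lane never selects an input lane belonging to its own port, which would explain why each of the 20 output lanes chooses among 16 inputs; that interpretation is an assumption, not something the slide states.

```python
PORTS = 5         # router ports (from the slide)
PORT_WIDTH = 64   # bits per port (from the slide)
LANE_WIDTH = 16   # bits per lane (from the slide)

lanes_per_port = PORT_WIDTH // LANE_WIDTH            # 4 lanes per port
total_lanes = PORTS * lanes_per_port                 # 20 input and 20 output lanes
# Assumption: lanes of the same port are never connected to each other.
selectable_inputs = total_lanes - lanes_per_port     # 16 inputs per output lane
select_bits = (selectable_inputs - 1).bit_length()   # 4 bits address 16 inputs
config_bits_per_lane = select_bits + 1               # plus 1 valid bit = 5 bits
```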

NXP - Aethereal TM

Wormhole Router

• Conventional input-queued architecture with 4-flit-deep buffers at each input.

• A two-stage pipeline is provided.

• A pipeline register is provided between the FIFO and the crossbar.

• For crossbar traversal, the flit at the head of the FIFO is loaded into this register, which drives it across the rest of the data path to verify that no “buffer nearly full” signal is asserted by the destination router.

Control information is appended to each flit rather than being carried in an additional header flit. The 64-bit data path is therefore combined with a one-hot encoded 5-bit next-port identifier for look-ahead routing, two bits each for the destination X and Y addresses, and one bit to identify tail flits, giving a total flit size of 74 bits.
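The 74-bit total can be verified by summing the field widths. The sketch below also packs the fields into a single integer; the field order and the names are assumptions chosen for illustration, since the slide gives only the widths.

```python
# Field widths from the slide; the ordering is an assumption.
WORMHOLE_FLIT_FIELDS = {
    "data": 64,
    "next_port_one_hot": 5,
    "dest_x": 2,
    "dest_y": 2,
    "tail": 1,
}


def flit_size(fields):
    """Total flit width in bits."""
    return sum(fields.values())


def pack_flit(values, fields):
    """Pack field values into one integer, first field in the most significant bits."""
    word = 0
    for name, width in fields.items():
        value = values[name]
        if value < 0 or value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word
```

`flit_size(WORMHOLE_FLIT_FIELDS)` gives 64 + 5 + 2 + 2 + 1 = 74, matching the figure quoted above.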

Wormhole Router

Speculative Virtual Channel Router

• The Spec VC router provides single-cycle flit forwarding by using look-ahead routing and speculative VC and crossbar allocation. It is a conventional input-queued architecture with 4 VCs per port and 4-flit-deep cyclic buffers for each VC.

• Both the VC and switch allocators (based on matrix arbiters) allocate VCs and crossbar ports speculatively for the next clock cycle if necessary. Since both crossbar and link traversal are performed ...

Each flit identifies its VC with a one-hot encoded 4-bit VC identifier. A 5-bit next-port identifier, 4 bits each for the destination X and Y addresses, and one bit to identify tail flits combine with the 64-bit data path to give a total flit size of 82 bits.
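The allocators above are said to be based on matrix arbiters. A minimal behavioral sketch of a matrix arbiter follows: it keeps a priority bit for every pair of requesters and moves the winner to the lowest priority after each grant. The class and method names are hypothetical, and this models behavior only, not the router's circuit.

```python
class MatrixArbiter:
    """Least-recently-served arbiter driven by a pairwise priority matrix."""

    def __init__(self, n):
        self.n = n
        # prio[i][j] is True when requester i currently beats requester j.
        self.prio = [[i < j for j in range(n)] for i in range(n)]

    def grant(self, requests):
        """Return the index of the winning requester, or None if nobody requests."""
        for i in range(self.n):
            if requests[i] and all(
                not requests[j] or self.prio[i][j]
                for j in range(self.n) if j != i
            ):
                # The winner drops to the lowest priority against everyone else.
                for j in range(self.n):
                    if j != i:
                        self.prio[i][j] = False
                        self.prio[j][i] = True
                return i
        return None
```

With requesters 0, 1 and 3 asserting continuously on a 4-input arbiter, successive grants rotate through 0, 1, 3 and back to 0, so no requester is starved.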

Speculative Virtual Channel Router

• VC and switch allocation may be performed concurrently.

• Speculate that waiting packets will succeed in acquiring a VC.

• Prioritize non-speculative requests over speculative ones.

State of the art NoCs and QoS

MANGO approach

Message-passing Asynchronous Network-on-chip providing Guaranteed services over OCP (Open Core Protocol) interfaces

MANGO approach – virtual channels exclusively reserved for guaranteed-bandwidth channels

• Guaranteed Channels, for high priority connections

MANGO approach – virtual channels exclusively reserved for guaranteed-bandwidth channels

ALG – Asynchronous Latency Guarantee

SPA – Static Priority Admission

MANGO approach – Overlapping VC HS

Overlapping VC handshakes in order to maximize link utilization

MANGO approach – Lock / Unlock mechanism for GS channels

One flit per VC at a time

MANGO approach – Credit-based mechanism for BE channels
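Credit-based flow control in general can be sketched as below: the sender holds one credit per free slot in the downstream buffer, spends a credit per flit, and gets it back when the receiver drains a flit. This is a generic illustration with hypothetical names, not MANGO's actual circuit.

```python
from collections import deque


class CreditLink:
    """Sender-side credit counter paired with a fixed-depth receiver buffer."""

    def __init__(self, depth=4):
        self.credits = depth      # one credit per free buffer slot downstream
        self.buffer = deque()

    def send(self, flit):
        """Send a flit if a credit is available; return False when the sender must stall."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.buffer.append(flit)
        return True

    def receive(self):
        """Drain one flit downstream and return its credit to the sender."""
        flit = self.buffer.popleft()
        self.credits += 1
        return flit
```

The sender can never overflow the downstream buffer, because it stalls as soon as its credit count reaches zero.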

Asynchronous Latency (BW) Guarantee (ALG)

Fair-share access

Bounded latency

Asynchronous Latency (BW) Guarantee (ALG)

Asynchronous Latency (BW) Guarantee (ALG)

MANGO approach - Simulation

Multiplexed channel – 2 phase dual rail

MANGO approach - Summary

The MANGO architecture contains a 5 x 5, 33-bit MANGO router using 0.13 µm CMOS standard cells from STMicroelectronics. The router supports 7 independent buffered GS connections on each of the four network ports, in addition to connection-less BE source routing with 4-flit-deep BE buffers on each input port. The local port implements 4 GS ports and 1 BE port. When routing data using the BE router, one bit is reserved to indicate end-of-packet. One router for an average core size of 5 mm^2.

111 Mflits/s

Flexible Wormhole-Switched NoC with Two-Level Priority Data Delivery Service

Flexible Wormhole-Switched NoC with Two-Level Priority Data Delivery Service

Block-free adaptive routing algorithm

The turn model of the adaptive West-First routing algorithm

Inter network LUT

Two priority levels; each direction has its own routing table.
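A minimal sketch of minimal West-First routing on a 2D mesh follows. If the destination lies to the west, the packet must travel west first and has no other choice; otherwise it may pick adaptively among the productive east, north and south directions. The coordinate convention and the function name are assumptions for illustration.

```python
def west_first_outputs(cur, dst):
    """Return the output directions permitted at `cur` for a packet heading to `dst`.

    Coordinates are (x, y); x grows eastward and y grows northward (assumed).
    """
    (cx, cy), (dx, dy) = cur, dst
    if dx < cx:
        return ["W"]          # all westward hops are taken first, non-adaptively
    options = []
    if dx > cx:
        options.append("E")
    if dy > cy:
        options.append("N")
    if dy < cy:
        options.append("S")
    return options            # an empty list means the packet has arrived
```

Because no packet ever turns into the west direction after moving in another direction, the turns that could close a dependency cycle are excluded, which is what makes the scheme block free.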

Flexible Wormhole-Switched NoC with Two-Level Priority Data Delivery Service

NXP - Aethereal TM TDMA

NXP - Aethereal TM Synchronizer

Routing algorithms benchmark: DA vs BFS (NXP - Aethereal TM)

Breadth-first search (BFS) is a basic shortest-path search algorithm that works on unweighted graphs to find paths that are minimal in terms of physical distance.

Dijkstra's algorithm (DA) is a more advanced shortest-path search algorithm that works on weighted, directed graphs with non-negative weights. We use it to implement a more sophisticated routing function that tries to balance the traffic by distributing the communication over the network. The algorithm finds shortest paths in terms of a minimal weighted sum.
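The difference between the two searches can be illustrated on a small graph. In this sketch `adj` maps each node to a dictionary of neighbor weights; treating the weight as a measure of link load is an assumption for illustration, and all names are hypothetical.

```python
import heapq
from collections import deque


def _walk_back(prev, dst):
    """Rebuild the path from the predecessor map, or None if dst was not reached."""
    if dst not in prev:
        return None
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def bfs_path(adj, src, dst):
    """Fewest-hop path; link weights are ignored."""
    prev, queue = {src: None}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return _walk_back(prev, dst)


def dijkstra_path(adj, src, dst):
    """Minimum weighted-sum path; weights must be non-negative."""
    dist, prev, heap = {src: 0}, {src: None}, [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, w in adj[node].items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = nd, node
                heapq.heappush(heap, (nd, nxt))
    return _walk_back(prev, dst)
```

On a graph where the two-hop route A-B-D is heavily loaded and the three-hop route A-C-E-D is lightly loaded, BFS picks the shorter route while Dijkstra's algorithm picks the lighter one, which is the traffic-balancing effect described above.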

Routing algorithms benchmark: DA vs BFS (NXP - Aethereal TM)

QNoC Router – Technion

Multiple VCs and service levels

A pool of VCs for each SL

QNoC Router - Technion

Preemptive Priority - QNoC

Packet structure

QNoC - data flow

Multi-Service-Level Input-Port

Virtual Channel Input Port (VC-IP) Architecture

Multi-Service-Level Output-Port

Virtual Channel Admission Control (VCAC)

Virtual Channel Output Port (VC-OP) Architecture

M-way VC Arbiter

Comparison table

Dynamic Network configuration

Movie application Promotion from wireless 3D game application

Dynamic Network configuration

Agenda

• Quality of Service (QoS)
• NoC fundamentals
• Guaranteed and Best-Effort Services for Networks on Chip
• Dynamic Virtual Channel Allocation
• Single- and multi-level services
• NoC Design Flow for TDMA and QoS Management in GALS

• Three emerging interconnect paradigms:
• 3-D integration
• Nanophotonic communications
• Wireless interconnects

Reduces silicon area and wire length

3D Pros & Cons

3D Cons – Thermal density control

Different types of integration

3D reduces:
1. The number of metal layers.
2. The number of hops in the NoC.
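The hop-count reduction can be illustrated by comparing the average distance between nodes in a 2D mesh and in a 3D mesh with the same number of nodes. This is a sketch under assumptions: uniform traffic between all node pairs, minimal (Manhattan) routing, and mesh sizes chosen for illustration rather than taken from the slides.

```python
from itertools import product


def average_hops(dims):
    """Average Manhattan distance between distinct nodes of a mesh with the given dimensions."""
    nodes = list(product(*(range(d) for d in dims)))
    total = sum(
        sum(abs(a - b) for a, b in zip(u, v))
        for u in nodes for v in nodes
    )
    return total / (len(nodes) * (len(nodes) - 1))


hops_2d = average_hops((8, 8))       # 64 nodes in one layer
hops_3d = average_hops((4, 4, 4))    # 64 nodes folded into 4 layers
```

For 64 nodes, the 8 x 8 mesh averages about 5.33 hops while the 4 x 4 x 4 mesh averages about 3.81 hops.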

3-D Integration

Offers an opportunity to continue performance improvements using CMOS technology through higher integration density, thus overcoming the barrier of interconnect scaling.

Three-Dimensional Integration options

Symmetric NoC Router Design.

Simplest extension to the generic 2D NoC router to facilitate a 3D layout:

Hop-by-hop traversal both within a layer and between layers.

Advantages of Symmetric NoC.

• Simple to implement.
• The number of hops is reduced by folding the 2D design into multiple stacked layers.

Disadvantages of Symmetric NoC.

• Each flit must undergo buffering and arbitration at every hop, adding to the overall delay when moving up and down layers.
• The addition of 2 extra ports necessitates a larger 7x7 crossbar.
• Crossbars scale upward very inefficiently.
• 7x7 crossbars incur area and power overhead.
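A first-order way to see why crossbars scale poorly is to count crosspoints, which grow with the square of the port count. The sketch assumes that area and power track the crosspoint count, which is a simplification and not a figure from the slides.

```python
def crosspoints(ports: int) -> int:
    """Number of crosspoints in a full ports x ports crossbar."""
    return ports * ports


def relative_cost(ports: int, baseline: int = 5) -> float:
    """Crosspoint count relative to the baseline 2D router crossbar (assumed 5x5)."""
    return crosspoints(ports) / crosspoints(baseline)
```

Under this assumption a 7x7 crossbar has 49 crosspoints against 25 for a 5x5 crossbar, roughly a 1.96x increase for only two extra ports.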

3D NoC Bus Hybrid Router Design.

Why this?
• Asymmetric delays between the fast vertical interconnects and the horizontal interconnects that connect cores.
• Vertical distance is negligible compared to intra-layer distance, so a bus can be used between any two layers.

Why a bus?
• Because it provides single-hop communication.

3D NoC Bus Hybrid

The NoC router can be hybridized with a bus link in the vertical dimension, which forms an interface between the NoC domain and the bus (vertical) domain.

3D Network in Memory terms.

Advantages – Increase locality.

Advantages.

Hybrid system provides both performance and area benefits.

Requires 6x6 crossbar.

Disadvantages.

• Flits from different layers wishing to move up or down must arbitrate for access to the shared bus medium.

• Although single-hop traversal improves overall latency, inter-layer bandwidth suffers.

True 3D Router Design

True 3D crossbar implementation enables seamless integration of vertical links in overall router configuration.

The vertical links are now embedded in the crossbar.


Interconnection between the various links in a 3D crossbar would have to be provided by dedicated connection boxes at each layer.

The 2D crossbars of all layers are physically fused into a single three-dimensional crossbar.

Multiple internal paths are present.


Advantages.

Single-hop inter-layer traversal:

Flits entering another layer do not go through an intermediate buffer; instead, they connect directly to the output port of the destination layer. For example, a flit can now move from the western input port of layer 2 to the northern output port of layer 4 in a single hop.

Requires a less power-hungry 5x5 crossbar.


Disadvantages.

The large number of vertical links increases path diversity, i.e., there are multiple possible paths between each source-destination pair.

This increases the complexity of the central arbiter that coordinates inter-layer communication in the 3D crossbar.

Redundancy offered by the full connectivity is rarely utilized by real-world workloads.


Multi-layer 3D NoC Router Design

All the 3D router design options discussed earlier assume that the PEs are still 2D.

For a finer-granularity 3D design, a PE can be split across multiple layers.

A 3D router is required for such multi-layer stacking of processing elements.

Logically, such a multi-layer PE and multi-layer router are identical to the traditional 2D NoC case.


Nanophotonic Interconnection Networks.

Nanophotonic circuits manipulate light signals, similar to the way electrical signals are manipulated in computer chips.

Optical communications can meet the large bandwidth demands of high-performance computing systems.

This future 3D-integrated chip consists of several layers connected to each other by very dense, small-pitch interlayer vias. The lower layer is the processor itself, with many hundreds of individual cores. One or more memory layers are bonded on top to provide fast access to local caches.


Combining Photonic NoC and 3D Integration.


Advantages – Energy efficient.

Bit-rate transparency: An electronic router must switch with every bit of transmitted data, leading to dynamic power dissipation that scales with the bit rate. Photonic switches, by contrast, switch on and off once per message, so their energy dissipation is essentially independent of the bit rate.

Low loss in optical waveguides: At the chip and board scale, the power dissipated on a photonic link is independent of the transmission distance. Energy dissipation remains essentially the same whether a message travels between two processing cores that are a few millimeters or a few centimeters apart.
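The bit-rate argument can be made concrete with a first-order model. This is a minimal sketch, not a device model: the function names and the numeric values (1 pJ per bit, 10 pJ per switching event, one million messages per second) are hypothetical placeholders rather than figures from the slides.

```python
import math


def electronic_switch_power_w(bit_rate_bps, energy_per_bit_j):
    """Dynamic power of an electronic router that switches on every bit."""
    return bit_rate_bps * energy_per_bit_j


def photonic_switch_power_w(messages_per_s, energy_per_switch_j):
    """Power of a photonic switch that toggles once per message.

    The bit rate does not appear: the energy is paid per message,
    not per bit.
    """
    return messages_per_s * energy_per_switch_j


# Hypothetical illustration values (assumptions, not measured data).
ENERGY_PER_BIT_J = 1e-12       # electronic: 1 pJ per switched bit
ENERGY_PER_SWITCH_J = 10e-12   # photonic: 10 pJ per switching event
MESSAGES_PER_S = 1e6

for bit_rate in (1e9, 2e9, 4e9):
    p_elec = electronic_switch_power_w(bit_rate, ENERGY_PER_BIT_J)
    p_phot = photonic_switch_power_w(MESSAGES_PER_S, ENERGY_PER_SWITCH_J)
    print(f"{bit_rate:.0e} b/s: electronic {p_elec:.2e} W, photonic {p_phot:.2e} W")
```

In this model, doubling the bit rate doubles the electronic switching power while the photonic figure stays constant for the same message rate.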


Limitations.

The limitations of photonic technology in terms of computing and storage capabilities pose critical challenges to the design of photonic NoCs.

In particular, flit buffering and control-flit processing, two important functions of any packet-switched NoC, are impractical to implement with optical devices.

At the system level, more research is necessary to understand how to exploit most effectively the high-bandwidth, low-power connectivity offered by optical links to increase the performance of real applications.


Wireless NoC

An alternative to the existing metal/dielectric 2-D interconnect infrastructure is to transmit signals over RF/wireless interconnects.

Using Frequency Division Multiple Access (FDMA) with multiband frequency synthesizers, and using the metal wires available in current CMOS processes as transmission lines, a high-bandwidth RF interconnect can be created for on-chip data transport.


On Chip Antennas.

By replacing multi-hop wireline links in a NoC with high-bandwidth, single-hop, long-range wireless channels, the latency, power consumption and interconnect routing problems of a traditional NoC can be addressed simultaneously.

One proposed design is a wireless NoC based on CMOS Ultra Wideband (UWB) technology.

The particular antennas used in this design achieve a transmission range of 1 mm while being 2.98 mm in length.

A NoC spread over a 20 mm x 20 mm die will therefore require multi-hop communication.
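The multi-hop requirement follows from simple arithmetic on the numbers above. The sketch below uses the 1 mm range and the 20 mm x 20 mm die from the slide; the helper name `min_wireless_hops` is hypothetical, and the model assumes straight-line relaying with a relay available exactly where it is needed, so it gives a lower bound.

```python
import math


def min_wireless_hops(distance_mm, range_mm):
    """Lower bound on the number of wireless hops needed to cover a distance.

    Assumes ideal straight-line relaying; a real placement of antennas
    can only need more hops.
    """
    if range_mm <= 0:
        raise ValueError("range_mm must be positive")
    return math.ceil(distance_mm / range_mm)


DIE_SIDE_MM = 20.0   # die size from the slide
UWB_RANGE_MM = 1.0   # antenna transmission range from the slide

diagonal_mm = math.hypot(DIE_SIDE_MM, DIE_SIDE_MM)  # corner-to-corner distance

edge_hops = min_wireless_hops(DIE_SIDE_MM, UWB_RANGE_MM)
diagonal_hops = min_wireless_hops(diagonal_mm, UWB_RANGE_MM)
print(f"edge to edge: {edge_hops} hops, corner to corner: {diagonal_hops} hops")
```

With these values, crossing the die edge to edge takes at least 20 hops and corner to corner at least 29, so a 1 mm range does not yet deliver single-hop long-range links.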


Silicon integrated on-chip antennas.

Metal zig-zag antennas operating in the tens-of-GHz range.

It was shown that zig-zag monopole antennas of axial length 1-2 mm can achieve a communication range of about 10-15 mm.


Nanoscale Antennas.

Nanoscale antennas based on carbon nanotubes (CNTs) operate in the THz/optical frequency range.

CNT material enables high transmitted powers from nanotube antennas, crucial for long-range communications.

Fig: Carbon Nanotube.


Network Architecture

The major challenges in traditional wire-based on-chip communication networks are the high latency and power consumption of their multi-hop links.

By inserting single-hop, long-range wireless links in place of multi-hop wireline communication, overall system performance can be significantly improved.


WiNoC

The network is divided into multiple small clusters of neighboring cores, called subnets. Wireless links are introduced between the subnets, while intra-subnet communication remains solely wired.

Each subnet is equipped with a wireless base station (WB), which transmits and receives data packets over the wireless channels.


Allocating long-range wireless links between distant subnets makes better use of the limited wireless channels.
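The subnet organization can be sketched as a simple hop-count model. Everything concrete here is an assumption for illustration: a square wired mesh, square subnets, one wireless base station (WB) near the center of each subnet, and a direct wireless link between every pair of base stations. A real WiNoC has only a limited number of wireless channels, so the last assumption is optimistic. The class name `WiNoCModel` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WiNoCModel:
    mesh_side: int    # cores per side of the whole mesh
    subnet_side: int  # cores per side of one subnet

    def base_station(self, core):
        """Core hosting the WB of the subnet that contains `core` (assumed placement)."""
        x, y = core
        s = self.subnet_side
        return (x // s * s + s // 2, y // s * s + s // 2)

    @staticmethod
    def wired_hops(a, b):
        """Manhattan distance, i.e. hop count on a wired 2D mesh."""
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def hops(self, src, dst):
        """Fewest hops using wires only, or wires plus one wireless hop."""
        wired = self.wired_hops(src, dst)
        wb_src, wb_dst = self.base_station(src), self.base_station(dst)
        if wb_src == wb_dst:
            return wired  # same subnet: communication stays on wires
        via_wireless = self.wired_hops(src, wb_src) + 1 + self.wired_hops(wb_dst, dst)
        return min(wired, via_wireless)


net = WiNoCModel(mesh_side=16, subnet_side=4)
far_pair = net.hops((0, 0), (15, 15))   # distant subnets: wireless link helps
near_pair = net.hops((3, 0), (4, 0))    # neighboring subnets: wires are shorter
```

In this model the benefit grows with the distance between subnets, while neighboring subnets are better served by the wired mesh, which matches the point about reserving wireless links for distant subnets.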


Conclusion.

Three-dimensional integration, nanophotonic communication and on-chip wireless links are all promising alternatives to traditional planar metal/dielectric-based interconnects for building the communication infrastructure of future multi-core systems-on-chip.

However, in order to harvest their potential, more research is necessary to address various challenges in multiple areas, including system architecture, circuit design and device fabrication.


3D SoC


Current NoC with off-chip DRAM


Proposed NoC with upper-layer DRAM



