Network On Chip Cache Coherency
Final presentation – Part A
Students: Zemer Tzach, Kalifon Ethan
Instructor: Walter Isaschar
Spring 2008
Agenda
Project’s general concepts.
Design architecture of the router.
Implementation of the router (using HDL
Designer).
Router’s simulations.
Demonstration of our NoC.
Network On Chip - Cache Coherency 2
General Background
Modern CPUs are based on
CMP – Chip Multi-Processor (multi-core).
Improved performance is achieved by
“Distribution and Parallelism”.
Cores interact by using
NoC – Network on Chip.
NoC’s General Diagram
NoC’s Characteristics
Wormhole packet routing.
Packets follow X-Y routing paths.
Units can communicate simultaneously.
Reduced power consumption.
Scalability.
Cache Coherency
Definition: CMP cores use only up-to-date
data.
Originally, Cache Coherency in CMP was
achieved by using a central memory
control unit.
Cache Coherency Protocol Nowadays
Problem Description
Prior Cache Coherency protocols are
not applicable: a NoC has no central unit.
Adding such a unit would damage both the NoC’s
scalability and its parallelism.
Solution Requirements
Must not affect the NoC’s main characteristics
(e.g. scalability).
Avoid “hot spots” and “bottlenecks”.
Minimize use of NoC’s resources.
Solution
Memory control is distributed among a
number of units according to memory
spaces.
Control units are placed as part of the
NoC.
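One way to picture distributing memory control "according to memory spaces" is a static mapping from an address to the control unit that owns it. The sketch below is only an illustration: the number of control units, the size of the address space, the equal contiguous regions, and the name `home_unit` are all assumptions, since the slides do not specify them.

```python
# Hypothetical parameters: the slides give neither the number of control
# units nor the size of each memory space.
NUM_CONTROL_UNITS = 4
MEMORY_SIZE = 1 << 16                      # assumed address space, in bytes
REGION_SIZE = MEMORY_SIZE // NUM_CONTROL_UNITS


def home_unit(address: int) -> int:
    """Return the index of the control unit that owns `address`."""
    if not 0 <= address < MEMORY_SIZE:
        raise ValueError("address out of range")
    # Each unit owns one contiguous, equally sized region (assumed layout).
    return address // REGION_SIZE
```

With such a mapping, a request for a given address is sent only to the unit that owns its region, so no single unit has to see all of the coherency traffic.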
Solution Diagram
Project’s Goals
Primary Goal: Design and
implement Cache Coherency
protocol for CMP.
Implement NoC (including NoC’s router).
Assemble CMP based on NoC.
NoC Packet’s Structure
Packet is divided into flits.
There are four flit types: Start, Body, End
and Idle.
Flit’s Structure
A flit contains two fields: Data and Type.
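The packet and flit structure can be sketched as a small data model. This is a minimal sketch, not the project's HDL: the names `FlitType`, `Flit` and `packetize` are hypothetical, the 8-bit payload follows the data bus width stated later in the deck, and the rule that a packet needs at least a Start and an End flit is an assumption.

```python
from enum import Enum
from typing import List, NamedTuple


class FlitType(Enum):
    START = "start"
    BODY = "body"
    END = "end"
    IDLE = "idle"   # assumed to mean "no packet data on the link"


class Flit(NamedTuple):
    kind: FlitType  # the Type field
    data: int       # the Data field, assumed to be one 8-bit value


def packetize(payload: bytes) -> List[Flit]:
    """Split a payload into one Start flit, Body flits, and one End flit."""
    if len(payload) < 2:
        raise ValueError("this sketch assumes at least a Start and an End flit")
    flits = [Flit(FlitType.START, payload[0])]
    flits += [Flit(FlitType.BODY, byte) for byte in payload[1:-1]]
    flits.append(Flit(FlitType.END, payload[-1]))
    return flits
```

For example, `packetize(bytes([0x12, 1, 2, 3]))` yields four flits whose types are Start, Body, Body, End.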
5 Ports Router
Directs packets according to X-Y routing.
5 ports – North, East, West, South and
Processing Unit.
Processing Units use the network’s
communication protocol.
2 Virtual Channels (VC) per port.
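The X-Y routing decision made at each router can be sketched as follows. The coordinate convention (x grows eastward, y grows northward) and the names `Port`, `xy_route` and `route_path` are assumptions for illustration; the slides state only that routing is X-Y and that each router has these five ports.

```python
from enum import Enum
from typing import List, Tuple


class Port(Enum):
    NORTH = "north"
    EAST = "east"
    WEST = "west"
    SOUTH = "south"
    LOCAL = "processing unit"


def xy_route(cur_x: int, cur_y: int, dst_x: int, dst_y: int) -> Port:
    """Pick the output port: resolve the X offset first, then the Y offset."""
    if dst_x > cur_x:
        return Port.EAST
    if dst_x < cur_x:
        return Port.WEST
    if dst_y > cur_y:
        return Port.NORTH   # assumed: y increases toward the north
    if dst_y < cur_y:
        return Port.SOUTH
    return Port.LOCAL       # arrived: deliver to the Processing Unit


def route_path(src: Tuple[int, int], dst: Tuple[int, int]) -> List[Port]:
    """List the output port chosen at every router from src to dst."""
    x, y = src
    ports = []
    while True:
        port = xy_route(x, y, dst[0], dst[1])
        ports.append(port)
        if port is Port.LOCAL:
            return ports
        if port is Port.EAST:
            x += 1
        elif port is Port.WEST:
            x -= 1
        elif port is Port.NORTH:
            y += 1
        else:
            y -= 1
```

Because every packet finishes its X movement before it turns into the Y dimension, the path between any two routers is fixed and depends only on their coordinates.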
5 Ports Router Structure
Input Port
Receives flits from a router or from a
Processing Unit.
Analyzes and saves the current packet’s
direction.
Switches between Virtual Channels.
Input Port Structure
Output Port
Transmits flits to a router or to a Processing
Unit.
Each Virtual Channel saves the currently
serviced input port (CSIP).
Switches between Virtual Channels.
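The role of the CSIP can be illustrated with a small model of one output Virtual Channel. The slides say only that each VC saves the currently serviced input port; the claim-on-Start and release-on-End rule below is the usual wormhole behaviour and is assumed here, and the class name `OutputVC` is hypothetical.

```python
from typing import Optional


class OutputVC:
    """One output virtual channel that remembers which input port it serves."""

    def __init__(self) -> None:
        self.csip: Optional[int] = None   # currently serviced input port

    def accept(self, input_port: int, flit_type: str) -> bool:
        """Return True if a flit from `input_port` may use this channel."""
        if self.csip is None:
            if flit_type != "start":
                return False              # only a Start flit may claim a free channel
            self.csip = input_port
        elif self.csip != input_port:
            return False                  # channel is held by another packet
        if flit_type == "end":
            self.csip = None              # the End flit releases the channel
        return True
```

While one packet holds a channel, flits from other input ports are refused on that channel. With two Virtual Channels per port, a second packet can use the other channel.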
Output Port Structure
Cross Bar
Transfers flits from an Input Port to the
matching Output Port.
Consists of 5 controllers – one for every
Output Port.
Cross Bar Structure
Network’s Characteristics
The width of the Data bus is 8 bits.
Each port buffer holds at most 4 flits.
The NoC is composed of 9 routers, placed
in a 3x3 grid formation.
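The buffer and bus parameters above can be sketched as a bounded FIFO. The depth of 4 flits and the 8-bit data width come from the slide; the class name `FlitBuffer` and the choice to refuse a push when full (rather than overwrite) are assumptions made for illustration.

```python
from collections import deque
from typing import Optional

BUFFER_DEPTH = 4       # flits per port buffer (from the slide)
DATA_WIDTH_BITS = 8    # width of the Data bus (from the slide)


class FlitBuffer:
    """Bounded FIFO holding the data of up to BUFFER_DEPTH flits."""

    def __init__(self, depth: int = BUFFER_DEPTH) -> None:
        self._fifo = deque()
        self._depth = depth

    def is_full(self) -> bool:
        return len(self._fifo) >= self._depth

    def push(self, data: int) -> bool:
        """Store one flit; return False if the buffer is full."""
        if not 0 <= data < (1 << DATA_WIDTH_BITS):
            raise ValueError("data does not fit in the data bus width")
        if self.is_full():
            return False   # assumed behaviour: the sender must wait
        self._fifo.append(data)
        return True

    def pop(self) -> Optional[int]:
        """Remove and return the oldest flit, or None if the buffer is empty."""
        return self._fifo.popleft() if self._fifo else None
```

A full buffer is what makes a packet block in place, which is the situation the Virtual Channels are meant to relieve.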
Input’s VC Implementation
Output’s VC Implementation
Input Port Implementation
Output Port Implementation
Output’s Controller Implementation
Crossbar_Mux Implementation
Cross Bar Implementation
Router Implementation
Synthesis Parameters
Network’s Performance
Latency of the router is 2 cycles.
Throughput of the router is 1 flit per cycle.
System’s clock frequency is 100 MHz.
Packets can be routed simultaneously.
Packets can bypass each other.
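These figures allow a rough back-of-the-envelope estimate. The sketch below uses the common zero-load wormhole approximation (the head flit pays the router latency at every router, and the remaining flits stream behind it at one flit per cycle). That formula is an assumption, not a measurement from the project, and it ignores contention and any link delay. The 8-bit data width is taken from the network characteristics slide.

```python
CLOCK_HZ = 100_000_000          # 100 MHz (from the slide)
ROUTER_LATENCY_CYCLES = 2       # per router (from the slide)
DATA_WIDTH_BITS = 8             # data bus width (from the slide)
NS_PER_CYCLE = 1_000_000_000 // CLOCK_HZ


def packet_latency_cycles(routers_on_path: int, flits: int) -> int:
    """Zero-load estimate: head flit delay plus one cycle per following flit."""
    return routers_on_path * ROUTER_LATENCY_CYCLES + (flits - 1)


def packet_latency_ns(routers_on_path: int, flits: int) -> int:
    return packet_latency_cycles(routers_on_path, flits) * NS_PER_CYCLE


def link_bandwidth_bits_per_second() -> int:
    """Peak rate of one link at one flit per cycle."""
    return DATA_WIDTH_BITS * CLOCK_HZ
```

Under these assumptions, an 8-flit packet crossing the 3x3 grid from corner to corner passes through 5 routers, giving 5 × 2 + 7 = 17 cycles, or 170 ns, and the peak rate of a single link is 800 Mbit/s.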
Cross Transmit
Routing two packets simultaneously: port 0 to port 2 and port 3 to port 2.
Traffic avoidance by using VC
The first packet, from port 0 to port 1, gets blocked in the output port.
The packet from port 3 to port 1 bypasses it.
Demonstration Diagram
General Description
Dummy units transmit packets.
Destination is set by the switch-buttons.
The Dummy port starts transmitting
according to its push-button.
Project Schedule (1st Semester)
Familiarize with design tools – 3 weeks.
Familiarize with VirtexII Pro FPGA (application
& components) – 4 weeks.
Design & Implement NoC’s router – 5
weeks.
Assemble CMP using our router
implementation – 2 weeks.
Project Schedule (2nd Semester)
Assemble CMP using our router
implementation – 4 weeks.
Design Cache Coherency protocol for CMP
based on faculty research – 4 weeks.
Implement the protocol as part of the
assembled CMP – 6 weeks.