load balancing switch
Final presentation for project. By: Maxim Fudim, Oleg Schtofenmaher. Supervisor: Walter Isaschar. Spring 2008 (Part B).
![Page 1: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/1.jpg)
1
LOAD BALANCING SWITCH
By: Maxim Fudim, Oleg Schtofenmaher
Supervisor: Walter Isaschar
FINAL PRESENTATION FOR PROJECT
Spring 2008 (Part B)
![Page 2: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/2.jpg)
2
General overview
Software solutions for real-time are too slow
Power dissipation limits work frequencies
Greater computing power needed
H/W accelerators can improve S/W processes
Multi-core, multi-threaded systems are the future
![Page 3: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/3.jpg)
3
Project Goals
Multiprocessor environment for parallel processing of vector data streams
Maximal throughput
Configurable hardware
Expandable design
Statistics report
![Page 4: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/4.jpg)
4
System specifications
SW over transparent HW
Interface over PCI
1 Mbit/sec input stream
Vectors of 8 to 1024 chunks
Variable number of processors
System spreads over multiple FPGAs
![Page 5: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/5.jpg)
5
Problem
How to manage the data stream?
How to manage multiple parallel units?
How to achieve full and effective utilization of resources?
![Page 6: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/6.jpg)
6
Solution (Top Level)
Board Level Load Balancing Switch
One system input and output to PCI
Distribute vectors among classes
Local buffers for chip data
![Page 7: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/7.jpg)
7
Solution (Chip Level)
Chip Level Load Balancing Switch
Converting shared resources to “personal” work space
VPUs organized in clusters
Monitoring of each unit’s load
Smart arbitration
Flexible and easy configuration
![Page 8: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/8.jpg)
8
Solution - Tree Distribution Switch
[Diagram labels: SW/HW interface; Class of Service Distribution; three LBS Arbitration blocks, each with three Clusters of VPUs.]
![Page 9: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/9.jpg)
9
Three level Architecture
Level for packet management (Classes): type, size, priority of data
Level for organizing various processing units (Clusters): speed, quantity, resources of processors
Level for fine tuning (VPUs): algorithm, HW acceleration
![Page 10: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/10.jpg)
10
Implementation
![Page 11: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/11.jpg)
11
Board Level
Multi-chip system
Local FIFOs for every chip/class
Classifier for packet management
SW configurable controls
Input and controls over Main Bus
Output via streamed neighboring busses
![Page 12: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/12.jpg)
12
Board Overview
![Page 13: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/13.jpg)
13
Busses Description
![Page 14: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/14.jpg)
14
Board Level diagram
[Diagram labels: S/W emulator or H/W DSP system (input vectors, output reports); PCI Bus; PROCStar II board; four Stratix II 180 FPGAs (LBS1 with the Classifier, LBS2, LBS3, LBS4); eight DDR2 banks; Main Bus: Data In and Controls; Ring Bus; per LBS registers.]
![Page 15: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/15.jpg)
15
Single Chip diagram
[Diagram labels: Stratix II FPGA; Load Balancing Switch (LBS); four NIOS VPUs; Bus Control Block; DDR2 A (FIFO IN); DDR2 B (FIFO OUT); Main Bus (input vectors, data and controls); Right Bus (reports); Left Bus (muxed reports).]
![Page 16: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/16.jpg)
16
PCI-System Interfaces
Software - Hardware Interface:
Input and Output MultiFIFO PCI data bus
MultiFIFO status
LBS 1-4 Interface:
2x32-bit general purpose read registers
2x32-bit general purpose write registers
8-bit information register
Software reset signal
![Page 17: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/17.jpg)
17
PCI-System Interfaces
Classifier:
Global Configuration Register (32 bit)
Global Info Register (32 bit)
Global In Count Register (32 bit)
Global Out Count Register (32 bit)
Global Active Time Register (32 bit)
Global Software reset signal
![Page 18: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/18.jpg)
Board Level Description
Classifier (board level):
Distributes data from Input PCI to Local FIFOs
Handles demands from Local Output Masters
Synchronizes data and controls
Configurable arbitration between LBS classes
Configurable statistics gathering
Timeout mechanism
![Page 19: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/19.jpg)
Board Level Description
Busses Control Block (on every chip):
Parametric pins numbering
Main/Ring Busses routing
Data sampling
FIFO management
Local Grant controls
Local Output FIFO master
![Page 20: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/20.jpg)
Main Bus Interfaces
Input Data & Control Interface:
Input data bus to Local FIFOs
ACK from Local FIFOs
REQ to Local FIFOs
Statistics REQ
![Page 21: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/21.jpg)
21
Main Bus Interfaces
Output Controls Interface:
Demand from Local FIFO Masters
Output Grant ACK from PCI FIFO
End of vector from PCI FIFO Master
![Page 22: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/22.jpg)
RING Bus Interfaces
Output Data Interface:
Output data bus from Local FIFOs
Data Valid from Local Masters
End of output vector from Local Masters
Statistics Data
Statistics Valid
![Page 23: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/23.jpg)
23
Chip Level
Local FIFO for inputs/outputs
Internal clusters configuration
Arbitration, priorities
Statistics, synchronization
![Page 24: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/24.jpg)
24
Single FPGA Top Diagram
[Diagram labels: Stratix II 180 FPGA (LBS 1-4); Load Balancing Switch (LBS); sixteen NIOS clusters; DDR2 Controls Bank A; DDR2 Controls Bank B; I/O – LBS Control Block; Bus Control Block; data flow.]
![Page 25: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/25.jpg)
25
Input System Interface
LBS Input Interface:
64-bit data bus from Input MultiFIFO
Read request and ack. signals
MultiFIFO status flags
SW/HW input signals
![Page 26: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/26.jpg)
26
Output System Interface
LBS Output Interface:
64-bit data bus to Output MultiFIFO
Write request and ack. signals
MultiFIFO status flags
SW/HW input signals
![Page 27: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/27.jpg)
27
Data Packet Format
Packet: Header, Data (1 to N 32-bit words), Tail
Header fields: Unused, Nios Number, Data Length N, Vector ID/Command Type
Field widths shown: 8-bit, 32-bit, 16-bit; Version 4-bit; SW/HW Control 1-bit; Type 1-bit (Data/Command)
Tail: Sync Data
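As a rough illustration of how such a header could be packed and unpacked in software, here is a minimal Python sketch. The slide transcript does not make the bit positions recoverable, so the 64-bit layout, the field order and the names `FIELDS`, `pack_header` and `unpack_header` are all assumptions made for this example only.

```python
# Hypothetical 64-bit header layout. Field order and bit positions are
# assumptions for illustration; they are not taken from the slide.
FIELDS = [  # (name, width in bits), most significant field first
    ("unused", 2),
    ("nios_number", 8),
    ("data_length", 16),
    ("vector_id", 32),
    ("version", 4),
    ("sw_hw_control", 1),
    ("is_command", 1),   # Type bit: 0 = data, 1 = command (assumed encoding)
]


def pack_header(**values):
    """Pack named field values into one integer header word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_header(word):
    """Split a header word back into a dict of field values."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields
```

A round trip such as `unpack_header(pack_header(nios_number=3, data_length=200))` returns the same field values, which is a quick way to check any layout chosen in place of this one.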
![Page 28: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/28.jpg)
28
LBS Top Level View
[Diagram labels: PCI; Stratix II FPGA; FIFO Input Port; Input Reader; input data bus; three Cluster Arbiters, each with a NIOS II System; muxed output data bus; Output Writer; FIFO Output Port; Main Controller unit (control and status); Statistics Reporter.]
![Page 29: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/29.jpg)
Organization of VPUs (Vector Processing Units)
NIOS VPUs joined into clusters
Constant number of clusters
Parametric number of NIOS VPUs in a cluster
Parametric control logic
Variable configuration of NIOS
Different priority for different clusters
![Page 30: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/30.jpg)
30
NIOS Input Interface
Hardware:
64-bit input data bus – from LBS
10-bit data slices counter – from LBS
Write request signal – from LBS
Chip select signal – from LBS
NIOS ready signal – from NIOS
Data ready signal – from LBS
![Page 31: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/31.jpg)
31
NIOS Output Interface
Hardware:
64-bit output data bus – from NIOS
7-bit data slices counter – from LBS
Read request signal – from LBS
Chip select signal – from LBS
Output ready signal – from NIOS
Output taken signal – from LBS
![Page 32: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/32.jpg)
32
Twin VPU System
Input / Output waveform
![Page 33: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/33.jpg)
33
LBS Units Description
Input Reader
Reading data from input FIFO
Writing data to selected cluster
Providing header control bits for main controller
Synchronization checks
Vector length counter
Input time stamp
![Page 34: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/34.jpg)
34
LBS Units Description
Sync Flusher
Flush data on input error
Look for Sync Tail
Parametric number of recovery tries
Failure signal to Error Reporter
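The recovery behaviour listed for the Sync Flusher can be pictured with a small Python sketch. It assumes that one "try" means examining one input word and that the sync tail is a fixed 64-bit pattern; the value of `SYNC_TAIL` and the function name `flush_until_sync` are hypothetical and not taken from the design.

```python
SYNC_TAIL = 0xA5A5A5A5A5A5A5A5  # hypothetical sync pattern, not from the slides


def flush_until_sync(words, max_tries, sync_tail=SYNC_TAIL):
    """Discard input words until the sync tail is found.

    Returns (recovered, remaining). `recovered` is False when the tail was
    not seen within `max_tries` words, which models the failure signal sent
    to the Error Reporter.
    """
    for index, word in enumerate(words):
        if index >= max_tries:
            break                          # recovery budget exhausted
        if word == sync_tail:
            return True, words[index + 1:]  # resume after the tail
    return False, []
```

With `max_tries` as a parameter, the same routine covers both a strict configuration that gives up quickly and a tolerant one that flushes a long corrupted vector.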
![Page 35: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/35.jpg)
35
Input Reader Diagram
![Page 36: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/36.jpg)
36
LBS Units Description
Input Controller - FSM
![Page 37: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/37.jpg)
37
LBS Units Description
Output Writer
Reading data from selected cluster
Writing data to output FIFO
Vector length counter
Output time stamp
![Page 38: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/38.jpg)
38
Output Writer Diagram
![Page 39: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/39.jpg)
39
LBS Units Description
Output Controller - FSM
![Page 40: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/40.jpg)
40
LBS Units Description
Main Controller
Enabling input and output units
Selecting control source (S/W or H/W)
Monitoring clusters’ load via status buses
Selecting clusters for input/output operations
Data validity indication
![Page 41: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/41.jpg)
41
Main Controller
Status Decoders
![Page 42: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/42.jpg)
42
LBS Units Description
MC Status Algorithm
Independent status decoders for input and output
Static priority
Dynamic load
Parametric aging mechanism
Round robin within the same priority group
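One way to read the four ingredients above (static priority, dynamic load, aging, round robin within a priority group) is the following Python sketch. It is not the project's hardware algorithm: the `Port` and `Arbiter` names, the "higher number wins" convention and the `aging_step` parameter are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Port:
    static_priority: int   # higher value wins (assumed convention)
    free: bool = True      # dynamic load: can this cluster take work now?
    waiting: int = 0       # rounds spent ready but not selected


class Arbiter:
    def __init__(self, ports, aging_step=4):
        self.ports = ports
        self.aging_step = aging_step  # waiting rounds per +1 priority boost
        self.last = -1                # index of the most recently granted port

    def effective_priority(self, port):
        # Aging: a port that keeps waiting slowly climbs in priority.
        return port.static_priority + port.waiting // self.aging_step

    def next_port(self):
        ready = [i for i, p in enumerate(self.ports) if p.free]
        if not ready:
            return None
        top = max(self.effective_priority(self.ports[i]) for i in ready)
        group = [i for i in ready
                 if self.effective_priority(self.ports[i]) == top]
        # Round robin: first group member after the last grant, wrapping.
        n = len(self.ports)
        chosen = min(group, key=lambda i: (i - self.last - 1) % n)
        for i in ready:
            port = self.ports[i]
            port.waiting = 0 if i == chosen else port.waiting + 1
        self.last = chosen
        return chosen
```

With ports of priority 1, 1 and 0 and `aging_step=2`, the first two grants alternate between the two high-priority ports, and the low-priority port is granted on the third round once its aging boost lifts it into the top group.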
![Page 43: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/43.jpg)
43
LBS Units Description
MC Status Algorithm
[Diagram labels: Status input; Static Priority / Aging mapping; Dynamic port mapping; RR on Active ports; Next port.]
10
![Page 44: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/44.jpg)
44
Decoding Flow
![Page 45: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/45.jpg)
45
LBS Units Description
Statistics Reporter
Monitoring system activity
Error reporting for software
Counting processed vectors
Throughput = Vectors served / Time of service
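The throughput formula above can be evaluated directly from two counters. The sketch below assumes the reporter exposes a count of served vectors and an active-time count measured in clock cycles, and it borrows the 100 MHz clock quoted on the performance slide; the function name and the counter units are assumptions, not documented register semantics.

```python
def vectors_per_second(out_count, active_cycles, clock_hz=100e6):
    """Throughput = vectors served / time of service.

    `active_cycles` is assumed to count clock cycles of service time,
    so dividing by `clock_hz` converts it to seconds.
    """
    if active_cycles <= 0:
        raise ValueError("active_cycles must be positive")
    service_time_s = active_cycles / clock_hz
    return out_count / service_time_s
```

For example, 500 vectors served during 100,000,000 cycles at 100 MHz is one second of service, so `vectors_per_second(500, 100_000_000)` gives 500.0 vectors per second.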
![Page 46: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/46.jpg)
46
LBS Units Description
Clusters Load Reporter
Monitoring clusters activity
Per VPU active/free status
Sending status by request from Classifier
![Page 47: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/47.jpg)
47
LBS Units Description
Error Reporting
Input Reader error
Recovery synchronization failure
Packet drops
Local Output Master’s error
LBS’s activity
![Page 48: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/48.jpg)
48
LBS Units Description
Cluster Entity
Cluster/VPUs parametric enabling
Cluster/VPUs status reporters
VPUs flow controllers
Watchdogs
NIOS Systems
![Page 49: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/49.jpg)
49
LBS Units Description
Cluster Configs
Define quantity of VPU ports
Define type of VPUs in cluster
Automatic creation of per-VPU control logic
Parametric arbiter for input/output data
![Page 50: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/50.jpg)
50
LBS Units Description
Per Nios Structure
![Page 51: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/51.jpg)
51
LBS Units Description
VPU Controller
Input 4-phase REQ/ACK protocol with NIOS: Nios Ready, Data Ready
Output 4-phase REQ/ACK protocol with NIOS: Output Ready, Output Taken
Smart Status Reporter
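For readers unfamiliar with four-phase (return-to-zero) handshaking, the sketch below steps through the generic REQ/ACK sequence in Python. The slides do not say which of Nios Ready, Data Ready, Output Ready and Output Taken plays the REQ role and which the ACK role, so the sketch uses the generic signal names only; `four_phase_transfer` is a hypothetical helper.

```python
def four_phase_transfer(data_items):
    """Simulate a four-phase REQ/ACK handshake for each data item.

    Returns (received, trace) where `trace` records the (req, ack) pair
    after every signal change.
    """
    req, ack = 0, 0
    received, trace = [], []
    for item in data_items:
        bus = item
        req = 1                  # phase 1: sender drives data, raises REQ
        trace.append((req, ack))
        received.append(bus)
        ack = 1                  # phase 2: receiver latches data, raises ACK
        trace.append((req, ack))
        req = 0                  # phase 3: sender drops REQ
        trace.append((req, ack))
        ack = 0                  # phase 4: receiver drops ACK, link is idle
        trace.append((req, ack))
    return received, trace
```

Each transfer costs four signal transitions, which is the price paid for not needing a shared notion of timing between the LBS side and the NIOS side.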
![Page 52: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/52.jpg)
52
LBS Units Description
VPU Controller
![Page 53: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/53.jpg)
53
VPU Input FSM
![Page 54: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/54.jpg)
54
VPU Output FSM
![Page 55: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/55.jpg)
55
LBS Units Description
General NIOS System
Single processor with in/out buffers
HW accelerated system
Shared resources system with mutex
Multi-processor system with a number of ports to the cluster
![Page 56: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/56.jpg)
56
LBS Units Description
Rafael’s basic NIOS System
SOPC components:
Nios II with custom algorithm
Program memory
Input Vector and Output Vector buffers
HW Accelerator
Timer
![Page 57: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/57.jpg)
57
LBS Units Description
Dummy NIOS System
Stub components:
Input Vector Buffer
Output Vector Buffer
Flow management
Parametric delay for performance analysis
![Page 58: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/58.jpg)
62
Resources & Performance
![Page 59: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/59.jpg)
63
Resource Usage

| Module | Logic utilization | % | Memory (M4K) | % |
|---|---|---|---|---|
| Peripheral IPs (MegaFIFO, PLLs, etc.) | 3,100 | 2 | 16 | 2 |
| User System (All VPUs + LBS) | 42,000 | 30 | 675 | 88 |
| Single VPU | 6,775 | 4.7 | 112 | 15 |
| LBS Logic | 1,350 | 1 | 3 | 0.5 |
| Total usage of chip resources | 45,896 | 32 | 691 | 90 |
| Total available | 143,000 | 100 | 768 | 100 |

Resource usage data for 6 VPU system
VPU resource usage is based on basic VPUs and may be decreased by advanced configurations and policies.
![Page 60: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/60.jpg)
64
Performance of LBS
Theoretical throughput: 100 MHz x 64 bit = 6.4 Gbit/s
Arbitration and routing latency: 2-4 cycles on average
60% throughput for short vectors, up to 95% for long vectors
PCI and slow algorithms are the bottlenecks
1 Mbit/s to 400 Mbit/s real throughput
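The peak figure is simply clock rate times bus width, as the short Python sketch below shows. The second function is an illustrative model only: it assumes each vector pays a fixed number of overhead cycles on top of one cycle per 64-bit word, which is a simplification introduced here and is not claimed to reproduce the 60% and 95% figures from the slide.

```python
def peak_throughput_bits(clock_hz, bus_width_bits):
    """Peak bus throughput in bit/s: one full-width word per clock cycle."""
    return clock_hz * bus_width_bits


def bus_efficiency(words_per_vector, overhead_cycles):
    """Fraction of cycles that carry data under a fixed per-vector overhead.

    Assumed model: a vector occupies `words_per_vector` data cycles plus
    `overhead_cycles` cycles of arbitration and routing.
    """
    return words_per_vector / (words_per_vector + overhead_cycles)
```

`peak_throughput_bits(100_000_000, 64)` gives 6,400,000,000 bit/s, matching the 6.4 Gbit/s on the slide. Under the assumed overhead model, efficiency rises with vector length, which is the qualitative trend the slide reports.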
![Page 61: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/61.jpg)
65
Performance for short vectors

| System | Time of service [sec] | Throughput [Mbit/s] | Impr |
|---|---|---|---|
| SW (on Core2Duo E6600) | 0.1 | 3.2 | |
| 6 VPUs | 0.00209 | 122 | 38 |
| 2 Classes of 6 VPUs | 0.00134 | 191 | 60 |
| 3 Classes of 6 VPUs | 0.00086 | 297 | 93 |
| 4 Classes of 6 VPUs | 0.00064 | 400 | 125 |

Time and throughput for 1000 vectors of 4 chunks each
VPU performance is based on basic VPUs and RR arbitration and may be increased for a given workload, after performance analysis, by defining advanced configurations and policies.
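The "Impr" column appears to be the ratio of each configuration's throughput to the software baseline throughput; that reading is inferred from the numbers rather than stated on the slide. A minimal Python check using the throughput values from the short-vector table (the function name is hypothetical):

```python
def improvement(throughput_mbit_s, baseline_mbit_s):
    """Speed-up factor relative to the software baseline."""
    return throughput_mbit_s / baseline_mbit_s


SW_BASELINE = 3.2  # Mbit/s, software on Core2Duo E6600 (short vectors)
hw_throughput = {   # Mbit/s, from the short-vector table
    "6 VPUs": 122,
    "2 Classes of 6 VPUs": 191,
    "3 Classes of 6 VPUs": 297,
    "4 Classes of 6 VPUs": 400,
}
factors = {name: round(improvement(t, SW_BASELINE))
           for name, t in hw_throughput.items()}
```

Rounded to whole numbers, the computed factors are 38, 60, 93 and 125, in line with the table.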
![Page 62: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/62.jpg)
66
Performance for medium vectors

| System | Time of service [sec] | Throughput [Mbit/s] | Impr |
|---|---|---|---|
| SW (on Core2Duo E6600) | 2.9 | 2.3 | |
| 6 VPUs | 0.28 | 23.4 | 10 |
| 2 Classes of 6 VPUs | 0.15 | 43.5 | 18.5 |
| 3 Classes of 6 VPUs | 0.01 | 66 | 28.7 |
| 4 Classes of 6 VPUs | 0.074 | 88 | 38 |

Time and throughput for 1000 vectors of 200 chunks each
VPU performance is based on basic VPUs and RR arbitration and may be increased for a given workload, after performance analysis, by defining advanced configurations and policies.
![Page 63: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/63.jpg)
67
Performance for long vectors

| System | Time of service [sec] | Throughput [Mbit/s] | Impr |
|---|---|---|---|
| SW (on Core2Duo E6600) | 1.1 | 2.9 | |
| One VPU | 1.224 | 2.62 | 0.89 |
| 6 VPUs | 0.208 | 15.43 | 5.3 |
| 2 Classes of 6 VPUs | 0.11 | 29.1 | 10 |
| 3 Classes of 6 VPUs | 0.074 | 43.69 | 14.8 |
| 4 Classes of 6 VPUs | 0.061 | 52.46 | 18 |

Time and throughput for 100 vectors of 1000 chunks each
VPU performance is based on basic VPUs and RR arbitration and may be increased for a given workload, after performance analysis, by defining advanced configurations and policies.
![Page 64: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/64.jpg)
68
System performance – missing TOAs
[Plot: processing time [sec] (axis 0.085 to 0.115) versus number of missing TOAs (axis 0 to 25).]
![Page 65: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/65.jpg)
69
System performance – noise levels
[Plot: processing time [sec] (axis 0.094 to 0.184) versus noise [%] (axis 0 to 60).]
![Page 66: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/66.jpg)
70
Tasks (part A)
Study relevant tools and environments (GiDEL PROCWizard, API, Quartus, STP…) – Done
Define interfaces with other groups – Done
Define basic algorithm for h/w switching – Done
Implement and debug the switch – Done
Develop stubs for testing – Done
Expand design for several NIOSs – Done
Integration with NIOS system – Done
SW test application for operating and integration with hardware design – Done
![Page 67: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/67.jpg)
71
Tasks (part B)
Increase number of Nioses in clusters – Done
Improve algorithm for priority cluster selection – Done
Expand statistics reports – Done
Expand SW/HW communication – Done
Add error correction/handling – Done
Spread design to several FPGAs – Done
Complete integration with relevant projects – Done
![Page 68: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/68.jpg)
72
Summary
Flexible architecture
LBS with various SW/HW controls and statistics
Up to 4 chip x 16 cluster x 32 VPU system
Fully functional S/W – Board – LBS – NIOS interfaces
Successful hardware and software integration
Working design examples for other teams
![Page 69: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/69.jpg)
73
Conclusions
Tree Switch concept is simple and efficient
Three layers abstraction concept = minimize changes
Buffers for every Class = independence
SAE for every VPU = balancing and performance
Single level of mastering = minimize resources
64-bit buses = maximize throughput
Single data interface to SW = bottleneck for high-speed designs
![Page 70: LOAD BALANCING SWITCH](https://reader031.vdocuments.net/reader031/viewer/2022012917/56816678550346895dda14ef/html5/thumbnails/70.jpg)
74
Conclusions (cont.)
Main bottlenecks are the PCI bus and the VPU algorithm
Throughput varies between 1 Mbit/s and 400 Mbit/s (vector dependent)
The design complies with the requirements
Further improvements in the algorithm will speed up the system and allow a larger number of VPUs