voq crossbar schedulers
DESCRIPTION
Hong Kong University of Science and Technology - MSc(IT) 2003 Fall Semester - Track 1 CSIT560 Internet Infrastructure: Switches and Routers Final Project Presentation #T1-6-2. VoQ Crossbar Schedulers. Group Members: Cyrus LO – 02784753 [email protected] Ricky CHAN - 02784557 [email protected] - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/1.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Hong Kong University of Science and Technology -MSc(IT) 2003
Fall Semester - Track 1
CSIT560 Internet Infrastructure: Switches and Routers
Final Project Presentation #T1-6-2
VoQ Crossbar SchedulersVoQ Crossbar Schedulers
StartStart
Group Members:Cyrus LO – 02784753 [email protected]
Ricky CHAN - 02784557 [email protected]
Ricky KWAN - 02784674 [email protected]
Presented on 1 December 2003
![Page 2: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/2.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Agenda• Part I. Introduction
Project goals (30secs by Ricky Chan)
• Part II. Some well-known MWM/MSM scheduling algorithms What are the differences between MSM & MWM ? Overview of PIM algorithm What is the common problem? (2’30 mins by Ricky Chan)
• Part III. Our findings Requirements and assumptions A new conceptual algorithm –
Real De-synchronized Round-Robin Model (RDESRR) What is RDESRR ? (6 mins by Ricky Kwan)
• Part IV. Experimental results SIM Results Result comparisons : RDESRR Vs SSRR & ISLIP Strengths & Weaknesses (4 mins by Cryus Lo)
• Part V. Conclusion Conclusions (1 min by Cryus Lo)
• Part VI. Q & A (1 min by all)
![Page 3: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/3.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Part I. Introduction
![Page 4: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/4.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Project goals
By further studying and analysis of the pointer desynchronization effect of some iterative arbitration algorithms, our group has the following goals:
•Tackle the common problem of some well-known MSM algorithms
•Propose a new conceptual algorithm for overcoming the problem
•Simulate the proposed algorithm by using SIM
•Justify the results
![Page 5: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/5.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Part II.
Some Well-known MWM/MSM scheduling algorithms
![Page 6: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/6.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
What are the differences between MSM & MWM ?
• Maximum weight matching (MWM) – stable for independent, 100% throughput under any traffic.
• Maximum size matching (MSM) – stable for i.i.d, uniform, admissible traffic.
• Both are too complex to implement, time complexities are O(N3logN) – MWM and O(N5/2) – MSM
•Only rely on some simple iterative algorithms to approximate MSMapproximate MSM•Some simple iterative algorithms:
•Parallel Iterative Matching (PIM)Round-Robin Matching (RRM)iterative SLIP (iSLIP).Fcfs in Round-Robin Matching (FIRM)Static Round-Robin Matching (SRR – SSRR, DSRR, Rotating DSRR & DRDSRRDRDSRR etc)
Derandomized rotating double Derandomized rotating double static round-robin !static round-robin !
![Page 7: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/7.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Overview of PIM algorithm
When no new matching can be found, the algorithm stops.
3.Accept - If an input receives a grant, it accepts one by selecting an output randomly among those that granted to this output..
2. Grant - If an unmatched output receives any requests, it grants to one by randomly selecting a request uniformly over all requests.
1. Request - Each unmatched input sends a request to every output for which it has a queued cell.
The basic matching algorithm. Each iteration of the algorithm follows these three steps:
![Page 8: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/8.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Is any improvement for PIM algorithms ?
• PIM -> RRM -> ISLIP -> FIRM -> Pointer Desynchronized (SRR)
• Some small change on pointers but make big improvement
• SRR is desynchronized the highest priority pointers
• It has no guarantee the grant result is desynchronized.
0
1
2
3
3 02 1
3 02 1
3 02 1
3 02 10
1
2
3
Both grant to 3
![Page 9: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/9.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Is any improvement for PIM algorithms ?
•If all grants are desynchronized.
Expect to get better result.
0
1
2
3
3 02 1
3 02 1
3 02 1
3 02 1
Grant Desynchronized(Real Desynchronized)
![Page 10: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/10.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Part III. Our Findings
![Page 11: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/11.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Assumptions
• Not enough knowledge in Hardware• Look for the theoretical solution• We ignore all hardware implementation issues
Requirements
• The system is stable• 100% Throughput in uniform traffic
•Target to desynchronize the grant results (so called Real Desynchronize)
![Page 12: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/12.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
A new algorithm – RDESRR
Concept:• Real Desynchronized Round Robin Model (RDESRR)
• Based on 2 phases RRM model (Request and Grant)
• Add a small share memory that each outputs can read/write (called Share Bits)
• The size of the memory is 1 bit per input
• If the bit is set, the corresponding input has already granted by an output
• If the bit is not set, the output may grant to corresponding input port
![Page 13: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/13.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
RDESRR Conceptual model
0
1
2
3
0
1
2
3
3 02 1
3 02 1
3 02 1
3 02 1
3
0
1
2
Share Bits
![Page 14: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/14.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
RDESRR model
• 2 phases only
• Request. Each input sends a request to every output for which it has a queued cell.
• Grant. If an output receives any requests, it chooses the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The output check the corresponding bit is set or not, if not set, the output will set the bit and notifies the input its request was granted. Otherwise, the output will look for next request until all requests has gone through. The pointer gi to
the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input. If no request is received, the pointer stays unchanged.
![Page 15: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/15.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
RDESRR Demo - Request
Step 1: Request
0
1
2
3
0
1
2
3
![Page 16: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/16.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
RDESRR Demo – Add a share memory in Output
Step 2: Grant
0
1
2
3
0
1
2
3
3 02 1
3 02 1
3 02 1
3 02 1 3
0
1
2
Share Bits
•Add a small share memory that each outputs can read/write (called Share Bits)
![Page 17: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/17.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
3 02 1
3 02 1
3 02 1
3 02 1
RDESRR Demo – Output check the share bits
0
1
2
3
0
1
2
3
Step 2: Grant
3
0
1
2
Share Bits
•The output check the corresponding bit is set or not
![Page 18: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/18.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
RDESRR Demo – When share bit is occupied
0
1
2
3
0
1
2
3
Step 2: Grant
3
0
1
2
3 02 1
3 02 1
3 02 1
3 02 1
Share Bits
•if not set, the output will set the bit and notifies the input its request was granted•The share bit is First Come First Serve
![Page 19: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/19.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
RDESRR Demo – Output looks for next request
0
1
2
3
0
1
2
3
Step 2: Grant
3 02 1
3 02 1
3 02 1
3 02 1 3
0
1
2
Share Bits
•If set, the output will look for next request until all requests have gone through
![Page 20: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/20.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
RDESRR Demo – All share bits are allocated
0
1
2
3
0
1
2
3
Step 2: Grant
3 02 1
3 02 1
3 02 1
3 02 1 3
0
1
2
Share Bits
•Fully allocate the share bit will result for fully grant all input request
![Page 21: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/21.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
3 02 1
3 02 1
3 02 1
3 02 1
RDESRR Demo – Pointer update/Share bit reset
0
1
2
3
0
1
2
3
3
0
1
2
Share Bits
•The pointer gi to the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input•If no request is received, the pointer stays unchanged•Share bits are also reset
![Page 22: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/22.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
RDESRR model (Recap)
• 2 phases only
• Request. Each input sends a request to every output for which it has a queued cell.
• Grant. If an output receives any requests, it chooses the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The output check the corresponding bit is set or not, if not set, the output will set the bit and notifies the input its request was granted. Otherwise, the output will look for next request until all requests has gone through. The pointer gi to
the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input. If no request is received, the pointer stays unchanged.
![Page 23: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/23.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Part IV. Experimental Results
![Page 24: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/24.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
SIM Results• Run the test for 32x32 port in SIM using –l 1000000
Total Latency Avg Match Size0.1 0.0588 3.1958 0.2 0.1447 6.3938 0.3 0.2686 9.5947 0.4 0.4501 12.7940 0.5 0.7198 15.9960 0.6 1.1398 19.1980 0.7 1.8636 22.3961 0.8 3.2619 25.5986 0.9 7.5087 28.8003 1.0 715.5900 31.9850
RDESRR
![Page 25: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/25.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Result comparisons : RDESRR Vs SSRR & ISLIP
• The latency in load = 1 is 715.59• The average match size in load = 1 is 31.985 (99.95% match)
Average Delay under Uniform Traffic
1.0E-02
1.0E-01
1.0E+00
1.0E+01
1.0E+02
1.0E+03
1.0E+04
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
NormalizedLoad
Ave
rage
Del
ay
RR-DESY
SSRR
ISLIP
![Page 26: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/26.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Strengths • 2 phase operations
• Each input only get one grant, • no arbiter in input port
Unknown factorsParallel processing for ports scheduling.
All outputs access the share bits simultaneously Multi-process protection for Share Bits
1. Prevent deadlock, Check and set bit should be atomic action
2. Any chance for two outputs access same bit at the same time?
• Save Cost•It is simple to implement because base on RR
•Has a lot of potential varieties to study
•May achieve Maximum Size Match after few iterations.
![Page 27: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/27.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Part V. Conclusions
![Page 28: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/28.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Final thought…• Future research and study
1. Hardware implementation feasibility?
2. Apply to ISLIP, FIRM, may get better result?
• Comments on SIM1. We have successfully build the SIM on Linux platform so we expect it
can also be build on Windows platform with Cywin library
2. It cannot simulate the parallel processing behaviour therefore we cannot know what happen in the real scheduler
3. The processing time of SIM is quite slow so we cannot have enough time to simulate more revised algorithms and more iterations
Conclusion: Conclusion: RDESRR is a conceptual model which get RDESRR is a conceptual model which get
the good result in SIMthe good result in SIM
![Page 29: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/29.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Part VI. Q & A
![Page 30: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/30.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
Q & AQ & A
![Page 31: VoQ Crossbar Schedulers](https://reader036.vdocuments.net/reader036/viewer/2022062309/568135cd550346895d9d31da/html5/thumbnails/31.jpg)
41 2 3
1
3
2
44 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
4 13 2
PresentationPresentation End End