The Concurrent Matching Switch Architecture

Bill Lin (University of California, San Diego)

Isaac Keslassy (Technion, Israel)

IEEE INFOCOM, Barcelona, April 23-29, 2006

Motivation

Traffic demands are expected to grow, driven in part by increasing broadband adoption

10x increase in broadband subscriptions in just the last 3 years, already over 100 million subscribers

1.25-2.4 Gbps fiber to homes emerging (GPON, GEPON, EPON, BPON …)

Larger routers needed for consolidation

Operators need scalable routers that provide good performance

Limitations of Previous Routers

Output-Queueing (OQ) Switch

Well known to provide good performance, but scalability is hampered by the need for an internal speedup of N

Crossbar Switches, using Input-Queueing (IQ) or Combined Input-Output Queueing (CIOQ)

Huge body of literature, but scalability is hampered by the need for centralized scheduling and arbitrary per-packet switch configurations

Limitations of Previous Routers

Load-Balanced Routers

No centralized scheduler

Scalable fixed-configuration switch fabric in optics

Guarantees 100% throughput

100 Tb/s design with 160 Gb/s linecards shown

But packets may be delivered “out-of-order”

Basic Load-Balanced Router

[Figure: three stages of linecards (input, intermediate, output). External inputs (In) and outputs (Out) run at rate R; the two meshes between stages use fixed channels of rate R/N. Example packets A1-A3, B1-B2, and C1-C2 are shown.]

[Figure: Basic Load-Balanced Router, next animation step: the example packets A1-A3, B1-B2, and C1-C2 are spread across the linecards.]

Many Fabric Options (any spreading device)

Space: Full uniform mesh

Wavelength: Static WDM

Time: Round-robin switches

Just need fixed uniform rate channels at R/N

No dynamic switch reconfigurations
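As an illustration of the "Time: Round-robin switches" option, the sketch below models one fixed connection pattern. The formula `(port + slot) % n` and the function name `round_robin_peer` are assumptions chosen for the example; the slides only require that every sender-receiver pair gets a fixed channel of rate R/N.

```python
def round_robin_peer(port: int, slot: int, n: int) -> int:
    """Linecard that `port` is connected to during `slot` (assumed pattern)."""
    return (port + slot) % n


n = 4

# Each time slot is a full permutation: no two senders share a receiver.
for slot in range(n):
    receivers = [round_robin_peer(p, slot, n) for p in range(n)]
    assert sorted(receivers) == list(range(n))

# Over any N consecutive slots, each sender meets each receiver exactly once,
# so each sender-receiver pair gets a fixed 1/N share of the line rate R.
for p in range(n):
    met = [round_robin_peer(p, slot, n) for slot in range(n)]
    assert sorted(met) == list(range(n))
```

The pattern depends only on the slot number, never on the traffic, which is why no scheduler or dynamic reconfiguration is needed.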

[Figure: Basic Load-Balanced Router, final animation step, annotated "Out of Order!": the example packets can reach the outputs out of order.]
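A toy simulation can show how mis-sequencing arises. It is a simplified model, not the slides' exact example: one flow toward a single output, packets spread round-robin over the intermediate linecards, and each intermediate linecard served once every N slots. The function name `simulate_basic_lb` and the `backlog` parameter (packets of other flows already queued for that output) are hypothetical.

```python
from collections import deque


def simulate_basic_lb(n, num_packets, backlog):
    """Return the order in which the flow's packets reach the output."""
    # queues[m]: FIFO at intermediate linecard m for the single output studied,
    # pre-filled with backlog[m] packets that belong to other flows.
    queues = [deque(("other", None) for _ in range(backlog[m])) for m in range(n)]
    delivered = []
    slot = 0
    while len(delivered) < num_packets:
        # Departure: linecard (slot % n) is connected to the output in this slot.
        m = slot % n
        if queues[m]:
            kind, seq = queues[m].popleft()
            if kind == "flow":
                delivered.append(seq)
        # Arrival: packet number `slot` of the flow is spread round-robin.
        if slot < num_packets:
            queues[slot % n].append(("flow", slot))
        slot += 1
    return delivered


print(simulate_basic_lb(3, 3, [0, 0, 0]))  # [0, 1, 2]: equal queues, order kept
print(simulate_basic_lb(3, 3, [2, 0, 0]))  # [1, 2, 0]: packet 0 is overtaken
```

Unequal queue lengths at the intermediate linecards are enough to reorder a flow, even though every link runs a fixed schedule.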

Packet Ordering Problem

Out-of-order packet delivery is undesirable (e.g. bad for TCP)

Previous techniques (e.g. EDF, UFS, FOFF)

Accumulate and delay packets at input/middle ports

And/or delay and re-order packets at middle/output ports

However, these techniques are unsatisfactory because they add substantial delays

Impact on Avg. Delay (N = 128, uniform traffic)

[Plot: average delay curves for Basic Load-Balanced, UFS, and FOFF. Annotation: UFS and FOFF show significant delay.]

Concurrent Matching Switch (CMS)

Basic idea

Retain the load-balanced router structure and the scalability of a fixed optical mesh, with no dynamic reconfiguration

Instead of packets, load-balance “request tokens” to N parallel “schedulers”

Each scheduler independently solves its own matching

Packets are delivered in order based on the matching results

Goal is to provide much lower average delay than accumulation-based methods for ensuring packet order, while retaining 100% throughput and scalability
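The basic idea can be sketched in a few lines. This is a minimal model under stated assumptions, not the authors' implementation: the class name `CMSSketch`, its methods, and the greedy matching are all illustrative stand-ins (any matching algorithm could be substituted), and the fixed-mesh timing is abstracted away. It shows why order is kept: a grant names a flow (input, output), and the input always sends the head-of-line packet of that flow.

```python
from collections import deque


class CMSSketch:
    def __init__(self, n):
        self.n = n
        # voq[i][j]: packets buffered at input i for output j (buffers at the INPUT).
        self.voq = [[deque() for _ in range(n)] for _ in range(n)]
        # tokens[m][i][j]: request tokens held by scheduler m for flow (i, j).
        self.tokens = [[[0] * n for _ in range(n)] for _ in range(n)]
        self.delivered = [[] for _ in range(n)]

    def arrive(self, i, j, packet, scheduler):
        """Arrival phase: buffer the packet, send only a token to a scheduler."""
        self.voq[i][j].append(packet)
        self.tokens[scheduler][i][j] += 1

    def match(self, scheduler):
        """Matching phase: greedy matching on one scheduler's counters (assumed)."""
        counters = self.tokens[scheduler]
        used_outputs = set()
        grants = []
        for i in range(self.n):
            for j in range(self.n):
                if counters[i][j] > 0 and j not in used_outputs:
                    counters[i][j] -= 1
                    used_outputs.add(j)
                    grants.append((i, j))
                    break  # at most one grant per input
        return grants

    def depart(self, grants):
        """Departure phase: each granted flow sends its head-of-line packet."""
        for i, j in grants:
            self.delivered[j].append(self.voq[i][j].popleft())


cms = CMSSketch(3)
# Three packets of one flow (input 0 -> output 2); tokens go to schedulers 0, 1, 2.
for seq, scheduler in enumerate([0, 1, 2]):
    cms.arrive(0, 2, ("A", seq), scheduler)
# The schedulers finish in a different order than the tokens were issued.
for scheduler in [2, 0, 1]:
    cms.depart(cms.match(scheduler))
print(cms.delivered[2])  # [('A', 0), ('A', 1), ('A', 2)]
```

Because tokens carry no packet identity, it does not matter which scheduler grants first; the packets of a flow still leave the input buffer in arrival order.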

Architecture

[Figure: three stages of linecards with links at rate R. Annotation: retain the fixed-configuration meshes, BUT move the packet buffers to the input. Example packets A1-A4, B1-B2, and C1-C2 are shown.]

[Figure: Architecture, with N^2 token counters added.]

Arrival Phase

[Figure sequence (4 slides): new packets arrive at the linecards, and the token counters are incremented step by step as the corresponding request tokens are recorded.]

Matching Phase

[Figure sequence (3 slides): matchings are computed from the token counters, and the counters of the matched requests are decremented.]

Departure Phase

[Figure: the same architecture during the departure phase.]

Distributed Operation

All linecards operate in parallel in a fully distributed manner

Arrival, matching, and departure phases overlap in a pipelined manner

Main Ideas

Each middle linecard acts as a “micro-router” with 1/Nth of the arrival traffic

Therefore, it gets N time slots to think about the schedule, so the time complexity is amortized by a factor of N

If each micro-router can guarantee 100% throughput, so can the overall switch

Each micro-router can work the way that it wants, leveraging the huge body of existing work on scheduling

CMS provides a new way of aggregating routers together, and therefore a new way of thinking about scaling routers.
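The amortization argument can be written as a one-line calculation. The symbol T(N), the time one micro-router needs to compute a single matching, is introduced here for illustration and does not appear in the slides.

```latex
\[
\text{scheduling cost per time slot} \;=\; \frac{T(N)}{N},
\qquad
T(N) = O(N) \;\Longrightarrow\; \frac{T(N)}{N} = O(1).
\]
```

In words: a micro-router that has N time slots to produce one matching can afford an algorithm that is N times slower than one that must finish in every slot.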

Practicality

Well-studied randomized approximations to Maximum Weighted Matching have been shown to achieve very good results [Tassiulas 1998] [Giaccone, Prabhakar & Shah, 2003]

These algorithms only require O(N) complexity using sequential hardware, but can provide 100% throughput guarantees with no speedup and good delay results

Amortized over N time slots, CMS with these scheduling algorithms can achieve

O(1) time complexity (independent of switch size)

100% throughput

Good delay results

Packet ordering
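For readers unfamiliar with the randomized approach, here is a minimal sketch in the spirit of [Tassiulas 1998]: draw a fresh random matching and keep it only if it is heavier than the matching used previously. The function names, the use of a permutation list to represent a matching, and the sample weight matrix are assumptions for illustration; how a CMS scheduler would derive weights from its token counters is not specified in the slides.

```python
import random


def matching_weight(matching, weights):
    """Total weight of a matching given as a permutation: input i -> output matching[i]."""
    return sum(weights[i][j] for i, j in enumerate(matching))


def randomized_step(previous, weights, rng):
    """Keep whichever of {previous matching, fresh random matching} is heavier."""
    candidate = list(range(len(previous)))
    rng.shuffle(candidate)  # uniform random permutation, O(N)
    if matching_weight(candidate, weights) > matching_weight(previous, weights):
        return candidate
    return previous


rng = random.Random(0)
weights = [[2, 0, 1], [1, 0, 1], [1, 0, 0]]  # hypothetical per-flow weights
matching = [0, 1, 2]
for _ in range(20):
    new = randomized_step(matching, weights, rng)
    # With fixed weights, the kept matching never gets lighter.
    assert matching_weight(new, weights) >= matching_weight(matching, weights)
    matching = new
```

Each step costs O(N): one shuffle and two weight sums. Spread over the N time slots available to a scheduler, that is the O(1) amortized cost quoted above.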

Experimental Results (N = 128, uniform traffic)

[Plot: average delay curves for Basic Load-Balanced, UFS, FOFF, and CMS. Annotation: difference of N time slots for matching phase.]

Conclusions

CMS is scalable

Leverages scalability of fixed optical meshes

Fully distributed

Can achieve O(1) time complexity

CMS achieves good performance

Guarantees 100% throughput

Guarantees packet ordering

Experimentally achieves low packet delays

CMS provides a new way of thinking about scaling routers and connects the huge body of existing literature on scheduling to load-balanced routers

Thank You