inf5050 – protocols and routing in internet (friday...

68
INF5050 – Protocols and Routing in Internet (Friday 9.2.2018) Presented by Tor Skeie Subject: IP-router architecture

Upload: dinhnguyet

Post on 07-Jul-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

INF5050 – Protocols and Routing in Internet (Friday 9.2.2018)

Presented by Tor Skeie

Subject: IP-router architecture

Page 2: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 2

High PerformanceSwitching and RoutingTelecom Center Workshop: Sept 4, 1997.

This presentation is based on slidesfrom Nick McKeown, with updates

Nick McKeownProfessor of Electrical Engineering and Computer Science, Stanford University

[email protected]

www.stanford.edu/~nickm

Stanford High Performance Networking group: http://klamath.stanford.edu

Page 3: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 3

Outline

Background What is a router?

Why do we need faster routers?

Why are they hard to build?

Architectures and techniques The evolution of router architecture.

IP address lookup.

Packet buffering.

Switching.

The Future

Page 4: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 4

What is Routing?

R3

A

B

C

R1

R2

R4 D

E

FR5

R5F

R3E

R3D

Next HopDestination

Page 5: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 5

What is Routing?

R3

A

B

C

R1

R2

R4 D

E

FR5

R5F

R3E

R3D

Next HopDestination

16 3241

Data

Options (if any)

Destination Address

Source Address

Header ChecksumProtocolTTL

Fragment OffsetFlagsFragment ID

Total Packet LengthT.ServiceHLenVer

20

byt

es

Page 6: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 6

Points of Presence (POPs)

A

B

C

POP1

POP3POP2

POP4 D

E

F

POP5

POP6 POP7POP8

Page 7: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 7

(400 Gb/s)

Where High Performance Routers are Used

R10 R11

R4

R13

R9

R5

R2R1 R6

R3 R7

R12

R16R15

R14

R8

(2.5 Gb/s)

(400 Gb/s)

(400 Gb/s)

(400 Gb/s)

Page 8: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 8

What a Router Looks Like

Cisco CRS-X(CRS-X 16 slot single-shelf on picture)

Juniper M320 (M160 on picture)

2.14m

0.60m

0.91m

Capacity: 12.8Tb/sPower: 11.2kWWeight: 723kg

0.88m

0.65m

0.44m

Capacity: 160Gb/sPower: 3.5kW

Capacity is sum of rates of linecards

Page 9: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Alcatel 7670 RSP Juniper TX8/T640

TX8

ChiaroAvici TSR

Some Multi-rack Routers

Page 10: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 11

Generic Router Architecture

LookupIP Address

UpdateHeader

Header ProcessingData Hdr Data Hdr

1M prefixesOff-chip DRAM

AddressTable

IP Address Next Hop

QueuePacket

BufferMemory

1M packetsOff-chip DRAM

Page 11: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 12

Generic Router Architecture

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

Data Hdr

Data Hdr

Data Hdr

BufferManager

BufferMemory

BufferManager

BufferMemory

BufferManager

BufferMemory

Data Hdr

Data Hdr

Data Hdr

Page 12: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 13

Why do we Need Faster Routers?

1. To prevent routers becoming the bottleneck in the Internet.

2. To increase POP capacity, and to reduce cost,

1. size and

2. power.

Page 13: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 14

0,1

1

10

100

1000

10000

1985 1990 1995 2000

Sp

ec95In

t C

PU

resu

lts

Why we Need Faster Routers 1: To prevent routers from being the bottleneck

0,1

1

10

100

1000

10000

1985 1990 1995 2000

Fib

er

Cap

ac

ity (

Gb

it/s

)

TDM DWDM

Packet processing Power Link Speed

2x / 18 months 2x / 7 months

Source: SPEC95Int & David Miller, Stanford.

More recently transmission speed of Petabit/s has been

demonstratedby Labs (multicore fiber)

Page 14: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 15

POP with large routersPOP with smaller routers

Why we Need Faster Routers 2: To reduce cost, power & complexity of POPs

Ports: Price >$100k, Power > 400W. It is common for 50-60% of ports to be for interconnection.

Page 15: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 16

Why are Fast Routers Difficult to Make?

1. It’s hard to keep up with Moore’s Law:

The bottleneck is memory speed.

Memory speed is not keeping up with Moore’s Law.

Page 16: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 17

1. It’s hard to keep up with Moore’s Law:

The bottleneck is memory speed.

Memory speed is not keeping up with Moore’s Law.

Why are Fast Routers Difficult to Make?Speed of Commercial DRAM

Moore’s Law2x / 18 months

1.1x / 18 months

0,0001

0,001

0,01

0,1

1

10

100

1000

1980 1983 1986 1989 1992 1995 1998 2001 2004 2007 2011

Acc

ess

Tim

e (ns

)

1.1x / 18 months

Moore’s Law2x / 18 months

DDR4 (2018):• Speed: 3200 Mb/s

• Latency: ~13 ns•Capacity: 64GB

Page 17: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 18

Why are Fast Routers Difficult to Make?

1. It’s hard to keep up with Moore’s Law:

The bottleneck is memory speed.

Memory speed is not keeping up with Moore’s Law.

2. Moore’s Law is too slow:

Routers need to improve faster than Moore’s Law.

Page 18: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 19

Router Performance Exceeds Moore’s Law

Growth in capacity of commercial routers: Capacity 1992 ~ 2Gb/s Capacity 1995 ~ 10Gb/s Capacity 1998 ~ 40Gb/s Capacity 2001 ~ 160Gb/s Capacity 2003 ~ 640Gb/s Capacity 2008 ~ 100Tb/s Capacity 2013 ~ 922Tb/s

Average growth rate: 2.2x / 18 months, but the last 5 years: 2.8x / 18 months.

2013:The Cisco CRS-X multishelf router has a capacity of 921.6Tb/s (1152 ports)

Page 19: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 20

Outline

Background What is a router?

Why do we need faster routers?

Why are they hard to build?

Architectures and techniques The evolution of router architecture.

IP address lookup.

Packet buffering.

Switching.

The Future

Page 20: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 21

RouteTable

CPU BufferMemory

LineInterface

MAC

LineInterface

MAC

LineInterface

MAC

Typically <0.5Gb/s aggregate capacity

First Generation Routers

Shared Backplane

Page 21: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 22

Second Generation RoutersRouteTable

CPU

LineCard

BufferMemory

LineCard

MAC

BufferMemory

LineCard

MAC

BufferMemory

FwdingCache

FwdingCache

FwdingCache

MAC

BufferMemory

Typically <5Gb/s aggregate capacity

Page 22: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 23

Third Generation Routers

LineCard

MAC

LocalBufferMemory

CPUCard

LineCard

MAC

LocalBufferMemory

Switched Backplane

FwdingTable

RoutingTable

FwdingTable

Typically <50Gb/s aggregate capacity

Page 23: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 24

Fourth Generation Routers/SwitchesOptics inside a router for the first time

Switch Core Linecards

Optical links

100sof metres

100s Tb/s routers available/in development

Page 24: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 25

Outline

Background What is a router?

Why do we need faster routers?

Why are they hard to build?

Architectures and techniques The evolution of router architecture.

IP address lookup.

Packet buffering.

Switching.

The Future

Page 25: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 26

Generic Router Architecture

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

BufferManager

BufferMemory

BufferManager

BufferMemory

BufferManager

BufferMemory

LookupIP Address

AddressTable

LookupIP Address

AddressTable

LookupIP Address

AddressTable

Page 26: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 27

IP Address Lookup

Why it’s thought to be hard:1. It’s not an exact match: it’s a longest prefix

match.

2. The table is large: about 700,000 entries today, and growing.

3. The lookup must be fast: about 2ns for a 80Gb/s line.

Page 27: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 28

Longest Prefix Match is Harder than Exact Match

• The destination address of an arriving packet does not carry with it the information to determine the length of the longest matching prefix

• Hence, one needs to search among the space of all prefix lengths; as well as the space of all prefixes of a given length

Page 28: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 29

IP Lookups find Longest Prefixes

128.9.16.0/21 128.9.172.0/21

128.9.176.0/24

0 232-1

128.9.0.0/16142.12.0.0/1965.0.0.0/8

128.9.16.14

Routing lookup: Find the longest matching prefix (aka the most specific route) among all prefixes that match the destination address.

Page 29: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 30

Address Tables are Large

Page 30: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 31

Lookups Must be Fast

Year Aggregate Line-rate

Arriving rate of 40B packets (Million pkts/sec)

1997 622 Mb/s 1.94

2001 10 Gb/s 31.25

2006 80 Gb/s 250

2010 140 Gb/s 437.5

2013 400 Gb/s 1250

1. Lookup mechanism must be simple and easy to implement

2. Memory access time is the bottleneck

250Mpps × 2 lookups/pkt = 500 Mlookups/sec → 2ns per lookup

Page 31: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 32

IP Address LookupBinary tries

Example Prefixes:

a) 00001

b) 00010

c) 00011

d) 001

e) 0101

f) 011

g) 100

h) 1010

i) 1100

j) 11110000

e

f g

h i

j

0 1

a b c

d

0

0

1

1

Page 32: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 33

Multi-bit Tries

Depth = WDegree = 2Stride = 1 bit

Binary trieW

Depth = W/kDegree = 2k

Stride = k bits

Multi-ary trie

W/k

Time ~ W/kStorage ~ NW/k * 2k-1

W = longest prefixN = #prefixes

Page 33: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 34

Prefix Length Distribution

99.5% prefixes are 24-bits or shorter

0

10000

20000

30000

40000

50000

60000

70000

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32

Prefix Length

Num

ber

of P

refixes

Source: Geoff Huston, Oct 2001

Page 34: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 35

24-8 Direct Lookup Trie

0000……0000 1111……1111

0 224-1

24 bits

8 bits

0 28-1

When pipelined, allows one lookup per memory access. Inefficient use of memory, though.

Page 35: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 38

Outline

Background What is a router?

Why do we need faster routers?

Why are they hard to build?

Architectures and techniques The evolution of router architecture.

IP address lookup.

Packet buffering.

Switching.

The Future

Page 36: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 40

Conceptual architecture

Line cardshosting

one or moreports

Non-blockingswitchingcore(s)

Arbitration/Control

Bi-

dir

ect

iona

l po

rts B

i-dire

ctional ports

Page 37: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 41

Conceptual Packet Buffering

Non-blockingswitchingcore(s)

Arbitration/Control

Bi-

dir

ect

iona

l po

rts

Bi-d

irectional ports

Input buffering

Page 38: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 43

Arbitration

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

QueuePacket

BufferMemory

QueuePacket

BufferMemory

QueuePacket

BufferMemory

Data Hdr

Data Hdr

Data Hdr

1

2

N

1

2

N

Data Hdr

Data Hdr

Data Hdr

Arbitration

Page 39: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 44

Head of Line Blocking

Page 40: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 45

0% 20% 40% 60% 80% 100%Load

Dela

yA Router with Input Queues

Head of Line Blocking

The best that any queueing system can

achieve.

2 2 58%

Page 41: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 46

0% 20% 40% 60% 80% 100%Load

Dela

yThe Best Performance

The best that any queueing system can

achieve.

Page 42: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 47

Conceptual Packet Buffering

Non-blockingswitchingcore(s)

Arbitration/Control

Bi-

dir

ect

iona

l po

rts

Bi-d

irectional ports

Central buffer

Page 43: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 48

Fast Packet Buffers(http://yuba.stanford.edu/fastbuffers/)

Example: 40Gb/s packet bufferSize = RTT*BW = 10Gb; 40 byte packets

Write Rate, R

1 packetevery 8 ns

Read Rate, R

1 packetevery 8 ns

BufferManager

BufferMemory

Use SRAM?+ fast enough random access time, but

- too low density to store 10Gb of data.

Use DRAM?+ high density means we can store data, but

- too slow (~15ns random access time).

Page 44: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 49

DRAM Buffer Memory

Packet Caches to implement Central buffer

Buffer Manager

SRAM

Arriving

Packets

DepartingPackets12

Q

2

1234

345

123456

Small ingress SRAM cache of FIFO headscache of FIFO tails

5556

9697

8788

57585960

899091

1

Q

2

Small ingress SRAM

1

57 6810 9

79 81011

1214 1315

5052 515354

8688 878991 90

8284 838586

9294 9395 68 7911 10

1

Q

2

DRAM Buffer Memory

b>>1 packets at a time

Page 45: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 50

Conceptual Packet Buffering

Non-blockingswitchingcore(s)

Arbitration/Control

Bi-

dir

ect

iona

l po

rts

Bi-d

irectional ports

Output buffering

Page 46: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 51

Output buffering

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

Data Hdr

Data Hdr

Data Hdr

BufferManager

BufferMemory

BufferManager

BufferMemory

BufferManager

BufferMemory

Data Hdr

Data Hdr

Data Hdr

Page 47: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 54

Speed-up

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

LookupIP Address

UpdateHeader

Header Processing

AddressTable

QueuePacket

BufferMemory

QueuePacket

BufferMemory

QueuePacket

BufferMemory

Data Hdr

Data Hdr

Data Hdr

1

2

N

1

2

N

N times line rate

N times line rate

Page 48: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 55

Conceptual Packet Buffering

Non-blockingswitchingcore(s)

Arbitration/Control

Bi-

dir

ect

iona

l po

rts

Bi-d

irectional ports

Input buffering with a virtual output queue

Page 49: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 56

Virtual Output Queues

Page 50: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 57

Matching

A matching on a graph is a subset of edges of the graph such that no two of them share a vertex in common.

edgevertex

Page 51: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 59

0% 20% 40% 60% 80% 100%Load

Dela

yA Router with Virtual Output Queues

The best that any queueing system can

achieve.

Page 52: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 67

Current Internet Router TechnologySummary

There are three potential bottlenecks: Address lookup, Packet buffering, and Switching.

Techniques exist today for: 100sTb/s Internet routers, with 400Gb/s linecards.

But what comes next…?

Page 53: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 68

Outline

Background What is a router? Why do we need faster routers? Why are they hard to build?

Architectures and techniques The evolution of router architecture. IP address lookup. Packet buffering. Switching.

The Future More parallelism. Eliminating schedulers. Introducing optics into routers.

Page 54: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

The Future

Page 55: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 72

Complex linecards

PhysicalLayer

Framing&

Maintenance

PacketProcessing

Buffer Mgmt&

Scheduling

Buffer Mgmt&

Scheduling

Buffer& StateMemory

Buffer& StateMemory

Typical IP Router Linecard

10Gb/s linecard: Number of gates: 30M Amount of memory: 2Gbits Cost: >$20k Power: 300W

LookupTables

SwitchFabric

Arbitration

Optics

Page 56: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 73

External Parallelism: Multiple Parallel Routers

What we’d like:R

R R

R

The building blocks we’d like to use:

R

RR

R

NxN

IP Router capacity 100s of Tb/s

Page 57: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 74

Multiple parallel routers Load Balancing

R R

R

1

2

k

R

R

R

R/k R/k

R

R

R

Page 58: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 75

Intelligent Packet Load-balancingParallel Packet Switching

1

2

k

1

N

rate, R

rate, R

rate, R

rate, R

1

N

Router

Bufferless

R/k R/k

Demultiplexor Multiplexor

Page 59: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 76

Parallel Packet Switching

AdvantagesSingle-stage of buffering

No excess link capacity

kh a power per subsystem i

kh a memory bandwidth i

kh a lookup rate i

Page 60: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 77

Parallel Packet Switching

AdvantagesLoad-balancing: output links are less

congested

Scalability: new router can be dynamically added

Redundancy

Page 61: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 78

Parallel Packet SwitchTheorem

If Speed-up > 2k/(k+2) @ 2 then a parallel packet switch can precisely emulate a single big router.

Page 62: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 80

Eliminating schedulersTwo-Stage Switch

1

N

1

N

1

N

External Outputs

Internal Inputs

External Inputs

First Round-Robin Second Round-Robin

Load Balancing

Page 63: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 83

Optics in routers

Switch Core Linecards

Optical links

Page 64: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 85

160Gb/s

40Gb/s

40Gb/s

40Gb/s

40Gb/s

Optical2-stageSwitch160-

320Gb/s160-

320Gb/s

100 Tb/s IP Router,

625 linecards, each operating at 160Gb/s.

The Stanford Phicticious Optical Router

Page 65: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 90

References

General1. J. S. Turner “Design of a Broadcast packet switching network”,

IEEE Trans Comm, June 1988, pp. 734-743.

2. C. Partridge et al. “A Fifty Gigabit per second IP Router”, IEEE Trans Networking, 1998.

3. N. McKeown, M. Izzard, A. Mekkittikul, W. Ellersick, M. Horowitz, “The Tiny Tera: A Packet Switch Core”, IEEE Micro Magazine, Jan-Feb 1997.

Fast Packet Buffers1. Sundar Iyer, Ramana Rao, Nick McKeown “Design of a fast

packet buffer”, IEEE HPSR 2001, Dallas.

Page 66: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 91

ReferencesIP Lookups1. A. Brodnik, S. Carlsson, M. Degermark, S. Pink. “Small

Forwarding Tables for Fast Routing Lookups”, Sigcomm 1997, pp 3-14.

2. B. Lampson, V. Srinivasan, G. Varghese. “ IP lookups using multiway and multicolumn search”, Infocom 1998, pp 1248-56, vol. 3.

3. M. Waldvogel, G. Varghese, J. Turner, B. Plattner. “Scalable high speed IP routing lookups”, Sigcomm 1997, pp 25-36.

4. P. Gupta, S. Lin, N. McKeown. “Routing lookups in hardware at memory access speeds”, Infocom 1998, pp 1241-1248, vol. 3.

5. S. Nilsson, G. Karlsson. “Fast address lookup for Internet routers”, IFIP Intl Conf on Broadband Communications, Stuttgart, Germany, April 1-3, 1998.

6. V. Srinivasan, G.Varghese. “Fast IP lookups using controlled prefix expansion”, Sigmetrics, June 1998.

Page 67: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 92

ReferencesSwitching• N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand.

Achieving 100% Throughput in an Input-Queued Switch. IEEE Transactions on Communications, 47(8), Aug 1999.

• A. Mekkittikul and N. W. McKeown, "A practical algorithm to achieve 100% throughput in input-queued switches," in Proceedings of IEEE INFOCOM '98, March 1998.

• L. Tassiulas, “Linear complexity algorithms for maximum throughput in radio networks and input queued switchs,” in Proc. IEEE INFOCOM ‘98, San Francisco CA, April 1998.

• D. Shah, P. Giaccone and B. Prabhakar, “An efficient randomized algorithm for input-queued switch scheduling,” in Proc. Hot Interconnects 2001.

• J. Dai and B. Prabhakar, "The throughput of data switches with and without speedup," in Proceedings of IEEE INFOCOM '00, Tel Aviv, Israel, March 2000, pp. 556 -- 564.

• C.-S. Chang, D.-S. Lee, Y.-S. Jou, “Load balanced Birkhoff-von Neumann switches,” Proceedings of IEEE HPSR ‘01, May 2001, Dallas,

Texas.

Page 68: INF5050 – Protocols and Routing in Internet (Friday 2.2.2018)folk.uio.no/yanzhang/INF5050-2018/IProuter2018.pdf · R10 R11 R4 R13 R9 R5 R2 R1 R6 R3 R7 R12 R16 R15 R14 R8 (2.5 Gb/s)

Nick McKeown 93

ReferencesFuture• C.-S. Chang, D.-S. Lee, Y.-S. Jou, “Load balanced Birkhoff-

von Neumann switches,” Proceedings of IEEE HPSR ‘01, May 2001, Dallas, Texas.

• Pablo Molinero-Fernndez, Nick McKeown "TCP Switching: Exposing circuits to IP" Hot Interconnects IX, Stanford University, August 2001

• S. Iyer, N. McKeown, "Making parallel packet switches practical," in Proc. IEEE INFOCOM `01, April 2001, Alaska.

• I. Keslassy et al. ”Scaling Internet Routers Using Optics”in. Proc. ACM SIGCOMM `03, August 2003, Germany.