TRANSCRIPT
15-744: Computer Networking
L-7 Router Algorithms
Forwarding and Routers
• IP lookup
• Longest prefix matching
• Classification
• Flow monitoring
• Readings
• [EVF03] Bitmap Algorithms for Active Flows on High Speed Links
• [BV01] Scalable Packet Classification (4 pages)
• Optional:
• [D+97] Small Forwarding Tables for Fast Routing Lookups
Outline
• IP route lookup
• Variable prefix match algorithms
• Packet classification
• Flow monitoring
Original IP Route Lookup
• Address classes
• A: 0 | 7-bit network | 24-bit host (16M each)
• B: 10 | 14-bit network | 16-bit host (64K)
• C: 110 | 21-bit network | 8-bit host (255)
• Address would specify prefix for forwarding table
• Simple lookup
Original IP Route Lookup – Example
• www.cmu.edu address 128.2.11.43
• Class B address – class + network is 128.2
• Lookup 128.2 in forwarding table
• Prefix – the part of the address that really matters for routing
• Forwarding table contains:
• List of class+network entries
• A few fixed prefix lengths (8/16/24)
• Large tables
• 2 million class C networks
• 32 bits does not give enough space to encode network location information inside the address – i.e., to create a structured hierarchy
CIDR Revisited
• Supernets
• Assign adjacent net addresses to same org
• Classless routing (CIDR)
• How does this help the routing table?
• Combine routing table entries whenever all nodes with same prefix share same hop
• Routing protocols carry prefix with destination network address
• Longest prefix match for forwarding
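A sketch of this aggregation rule using Python's `ipaddress` module: adjacent prefixes that share a next hop collapse into one routing entry (the next-hop interface name here is hypothetical).

```python
# Sketch of CIDR aggregation: the four sub-allocations below share the
# same (hypothetical) next hop, so their routing entries collapse into
# the provider's single /21.
import ipaddress

routes = {
    "201.10.0.0/22": "ifaceA",
    "201.10.4.0/24": "ifaceA",
    "201.10.5.0/24": "ifaceA",
    "201.10.6.0/23": "ifaceA",
}

nets = [ipaddress.ip_network(p) for p in routes]
collapsed = list(ipaddress.collapse_addresses(nets))
print(collapsed)  # [IPv4Network('201.10.0.0/21')]
```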
CIDR Illustration
Provider is given 201.10.0.0/21
201.10.0.0/22 201.10.4.0/24 201.10.5.0/24 201.10.6.0/23
Provider
CIDR Shortcomings
• Multi-homing
• Customer selecting a new provider
201.10.0.0/21
Provider 1 Provider 2
201.10.0.0/22 201.10.4.0/24 201.10.5.0/24 201.10.6.0/23 or Provider 2 address
Outline
• IP route lookup
• Variable prefix match algorithms
• Packet classification
• Flow monitoring
Trie Using Sample Database
• Sample database:
• P1 = 10*
• P2 = 111*
• P3 = 11001*
• P4 = 1*
• P5 = 0*
• P6 = 1000*
• P7 = 100000*
• P8 = 1000000*
[Trie figure: the prefixes arranged in a binary trie – 0 = left child, 1 = right child; each Pi labels the node reached by its prefix bits]
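A minimal sketch of longest-prefix match over this sample database, using a binary trie (addresses are assumed to be given as bit strings):

```python
# Longest-prefix match over the slide's sample database using a binary
# trie. Each node has children for bits '0'/'1' and an optional route.

class TrieNode:
    def __init__(self):
        self.child = {}     # bit ('0' or '1') -> TrieNode
        self.route = None   # route label if a prefix ends here

def insert(root, prefix, label):
    node = root
    for bit in prefix:
        node = node.child.setdefault(bit, TrieNode())
    node.route = label

def longest_prefix_match(root, addr_bits):
    """Walk the trie bit by bit, remembering the last route seen."""
    node, best = root, None
    for bit in addr_bits:
        if node.route is not None:
            best = node.route
        node = node.child.get(bit)
        if node is None:
            break
    else:
        if node.route is not None:
            best = node.route
    return best

root = TrieNode()
for label, prefix in [("P1", "10"), ("P2", "111"), ("P3", "11001"),
                      ("P4", "1"), ("P5", "0"), ("P6", "1000"),
                      ("P7", "100000"), ("P8", "1000000")]:
    insert(root, prefix, label)

print(longest_prefix_match(root, "1000001"))  # P7 (longer than P4, P1, P6)
print(longest_prefix_match(root, "1100111"))  # P3
```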
How To Do Variable Prefix Match
• Traditional method – Patricia tree
• Arrange route entries into a series of bit tests (bit to test: 0 = left child, 1 = right child)
• Worst case = 32 bit tests
• May need to backtrack
• Problem: memory speed is a bottleneck
[Figure: Patricia tree testing bits 10, 16, and 19 to distinguish default 0/0, 128.2/16, 128.32/16, 128.32.130/24, and 128.32.150/24]
Optimal Expanded Tries
• Pick stride s for root and solve recursively
Speeding up Prefix Match (P+98)
• Cut prefix tree at 16-bit depth
• 64K-bit mask
• Bit = 1 if tree continues below cut (root head)
• Bit = 1 if leaf at depth 16 or less (genuine head)
• Bit = 0 if part of range covered by leaf
Prefix Tree
[Figure: 16-bit mask 1 0 0 0 1 0 1 1 1 0 0 0 1 1 1 1 over positions 0–15; the 1 bits correspond to routes for Port 1, Port 3, Port 5, Port 7, and Port 9]
Prefix Tree (continued)
[Figure: same 16-bit mask; some 1 bits point into Subtree 1, Subtree 2, and Subtree 3 instead of routes]
• Leaf pushing: entries that have pointers plus a prefix have their prefixes pushed down to the leaves
Speeding up Prefix Match (P+98)
• Each 1 corresponds to either a route or a subtree
• Keep array of routes/pointers to subtrees
• Need index into array – how to count # of 1s?
• Keep running count per 16-bit word in base index + code word (6 bits)
• Need to count 1s in last 16-bit word
• Clever tricks
• Subtrees are handled separately
• The Lulea trick speeds up lookup
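The base-index-plus-popcount idea can be sketched as follows (a toy 32-bit mask for illustration; a real Lulea table uses the 64K-bit mask and 6-bit code words):

```python
# Sketch of the Lulea-style counting trick: to index into the array of
# routes/subtree pointers, count the 1s in the bitmask before a given
# bit position. A precomputed base index per 16-bit word plus a popcount
# of the final partial word gives the answer in O(1).

MASK_WORDS = [0b1000101110001111, 0b0000000000000001]  # toy 32-bit mask

# Precompute: base[i] = number of 1s in all words before word i.
base = [0]
for w in MASK_WORDS:
    base.append(base[-1] + bin(w).count("1"))

def ones_before(bit_pos):
    """Number of set bits strictly before bit_pos (bit 0 = MSB of word 0)."""
    word, offset = divmod(bit_pos, 16)
    partial = MASK_WORDS[word] >> (16 - offset) if offset else 0
    return base[word] + bin(partial).count("1")

print(ones_before(4))   # 1: only bit 0 is set among bits 0..3
print(ones_before(16))  # 9: all nine 1s of word 0
```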
Speeding up Prefix Match (P+98)
• Scaling issues
• How would it handle IPv6?
• Update issues
• Other possibilities:
• Why were the cuts done at 16/24/32 bits?
• Improve data structure by shuffling bits
Speeding up Prefix Match - Alternatives
• Route caches
• Temporal locality
• Many packets to same destination
• Other algorithms
• Waldvogel – Sigcomm 97
• Binary search on prefixes
• Works well for larger addresses
• Bremler-Barr – Sigcomm 99
• Clue = prefix length matched at previous hop
• Why is this useful?
• Lampson – Infocom 98
• Binary search on ranges
Speeding up Prefix Match - Alternatives
• Content addressable memory (CAM)
• Hardware-based route lookup
• Input = tag, output = value associated with tag
• Requires exact match with tag
• Multiple cycles (1 per prefix searched) with single CAM
• Multiple CAMs (1 per prefix) searched in parallel
• Ternary CAM
• 0, 1, don't-care values in tag match
• Priority (i.e., longest prefix) by order of entries in CAM
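A software sketch of ternary matching with priority-ordered entries (toy 8-bit keys; a real TCAM compares all entries in parallel in hardware):

```python
# Sketch of a ternary CAM: each entry is (value, mask, result), where
# 0 bits in the mask mean "don't care". Entries are stored in priority
# order (longest prefix first) and the first match wins.

def tcam_lookup(entries, key):
    for value, mask, result in entries:
        if key & mask == value & mask:
            return result
    return None

# Toy 8-bit prefixes, ordered by decreasing prefix length.
entries = [
    (0b11001000, 0b11111000, "11001*"),   # 5-bit prefix
    (0b11000000, 0b11000000, "11*"),      # 2-bit prefix
    (0b00000000, 0b00000000, "default"),  # matches everything
]
print(tcam_lookup(entries, 0b11001101))  # '11001*'
print(tcam_lookup(entries, 0b11100000))  # '11*'
```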
Outline
• IP route lookup
• Variable prefix match algorithms
• Packet classification
• See Gennadi's slides
• Flow monitoring
Packet Classification
• Typical uses:
• Identify flows for QoS
• Firewall filtering
• Requirements:
• Match on multiple fields
• Strict priority among rules
• E.g., 1. no traffic from 128.2.*; 2. OK traffic on port 80
Complexity
• N rules and k header fields, for k > 2:
• O(log^(k-1) N) time and O(N) space, or
• O(log N) time and O(N^k) space
• Special case k = 2 (source and destination): O(log N) time and O(N) space solutions exist
• How many rules?
• Largest for firewalls & similar: ~1700
• Diffserv/QoS much larger: 100K (?)
Bit Vectors
Rule | Field1 | Field2
0 | 00* | 00*
1 | 00* | 01*
2 | 10* | 11*
3 | 11* | 10*
[Figure: one trie per field; each trie node stores a 4-bit vector with bit i = 1 if rule i's prefix for that field matches the addresses below the node. A lookup walks each field's trie (Field 1, then Field 2) and reads off the matching-rule bit vector.]
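The lookup can be sketched as follows, using the rule table above (fields shortened to 2 bits so every prefix is fully specified):

```python
# Sketch of bit-vector classification over the rule table above.
# For each field, build the N-bit vector of rules whose prefix matches
# the packet's value; AND the vectors; the most significant set bit is
# the highest-priority matching rule.

RULES = [("00", "00"), ("00", "01"), ("10", "11"), ("11", "10")]
N = len(RULES)

def field_vector(field, value):
    """Bit (N-1-i) is set iff rule i's prefix for this field matches."""
    vec = 0
    for i, rule in enumerate(RULES):
        if value.startswith(rule[field]):
            vec |= 1 << (N - 1 - i)
    return vec

def classify(f1, f2):
    combined = field_vector(0, f1) & field_vector(1, f2)
    if combined == 0:
        return None
    return N - combined.bit_length()  # index of first matching rule

print(classify("00", "01"))  # 1: rule 1 (00*, 01*) is the first match
print(classify("10", "11"))  # 2
```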
Aggregating Rules [BV01]
• Common case: very few 1's in bit vector → aggregate bits
• OR together A bits at a time → N/A-bit-long aggregate vector
• A typically chosen to match word size
• Can be done hierarchically → aggregate the aggregates
• AND of aggregate bits indicates which groups of A rules have a possible match
• Hopefully only a few 1's in the AND'ed vector
• AND of aggregated bit vectors may have false positives
• Fetch and AND just the bit vectors associated with positive entries
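A sketch of the aggregation (A = 4 for illustration; the paper uses the machine word size; the two rule vectors are made up):

```python
# Sketch of [BV01] aggregation: OR the N-bit rule vector in groups of A
# bits; AND the short aggregate vectors first, then fetch the full words
# only for groups whose aggregate bit survives (false positives possible).

A = 4  # aggregation size

def aggregate(bits):
    """bits: list of 0/1 of length N -> list of N/A aggregate bits."""
    return [int(any(bits[i:i + A])) for i in range(0, len(bits), A)]

v1 = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]   # rules matching on field 1
v2 = [0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0]   # rules matching on field 2

agg = [a & b for a, b in zip(aggregate(v1), aggregate(v2))]
print(agg)  # [1, 0, 1]: only groups 0 and 2 need their full words ANDed

for g, bit in enumerate(agg):
    if bit:  # fetch and AND only these groups' full words
        word = [x & y for x, y in zip(v1[g*A:(g+1)*A], v2[g*A:(g+1)*A])]
        print(g, word)  # group 2 turns out to be a false positive
```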
Rearranging Rules [BV01]
• Problem: false positives may be common
• Solution: reorder rules to minimize false positives
• What about the priority order of rules?
• How to rearrange?
• Heuristic: sort rules based on a single field's values
• First sort by prefix length, then by value
• Moves similar rules close together → reduces false positives
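The sorting heuristic itself is one line (the single-field rules and priorities below are hypothetical):

```python
# Sketch of the reordering heuristic on a single field: sort by prefix
# length, then by value, so rules with similar prefixes land in the same
# aggregation groups. Original priorities are kept alongside each rule
# so matches can still be resolved in priority order afterwards.

rules = [("10", 0), ("0010", 1), ("0011", 2), ("11", 3)]  # (prefix, priority)

reordered = sorted(rules, key=lambda r: (len(r[0]), r[0]))
print(reordered)  # [('10', 0), ('11', 3), ('0010', 1), ('0011', 2)]
```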
Summary: Addressing/Classification
• Router architecture carefully optimized for IP forwarding
• Key challenges:
• Speed of forwarding lookup/classification
• Power consumption
• Some good examples of common-case optimization:
• Routing with a clue
• Classification with few matching rules
• Not checksumming packets
Outline
• IP route lookup
• Variable prefix match algorithms
• Packet classification
• Flow monitoring
• Based on IMC’03 talk by Estan, Varghese, and Fisk
Why count flows?
• Detect port/IP scans
• Identify DoS attacks
• Estimate spreading rate of a worm
• Packet scheduling
[Figure: Dave Plonka's FlowScan]
Existing flow counting solutions
[Figure: a router on a fast link exports NetFlow data across the network to an analysis server at the Network Operations Center, which produces traffic reports. Constraints: router memory size & bandwidth, and network bandwidth for the export.]
Motivating question
• Can we count flows at line speed at the router?
• Wrong solution – counters
• Naïve solution – use hash tables (like NetFlow)
• Our approach – use bitmaps
Bitmap counting algorithms
• A family of algorithms that can be used as building blocks in various systems
• Algorithms can be adapted to the application
• Low memory and per-packet processing
• Generalize flows to distinct header patterns:
• Count flows or source addresses to detect attacks
• Count destination address + port pairs to detect scans
Outline
• Direct bitmap
• Virtual bitmaps
• Multi-resolution bitmaps
• Some results
Bitmap counting – direct bitmap
• Set bits in the bitmap using a hash of the flow ID of incoming packets: HASH(green) = 10001001
• Different flows have different hash values: HASH(blue) = 00100100
• Packets from the same flow always hash to the same bit
• Collisions OK – estimates compensate for them: HASH(violet) = 10010101, HASH(orange) = 11110011, HASH(pink) = 11100000
• As the bitmap fills up, estimates get inaccurate: HASH(yellow) = 01100011
• Solution: use more bits
• Problem: memory scales with the number of flows
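A sketch of direct-bitmap counting with the standard linear-counting estimate n ≈ b·ln(b/z), which is how collisions are compensated for (CRC32 stands in for the per-packet hash; sizes are illustrative):

```python
# Sketch of direct-bitmap flow counting. b = bitmap size in bits,
# z = number of zero bits after all packets are hashed in; the estimate
# n ~ b * ln(b / z) corrects for hash collisions.
import math
import zlib

B = 1024  # bitmap size in bits

def count_flows(packets):
    bitmap = bytearray(B // 8)
    for flow_id in packets:
        h = zlib.crc32(flow_id.encode()) % B   # hash of the flow ID
        bitmap[h // 8] |= 1 << (h % 8)
    zeros = sum(bin(byte ^ 0xFF).count("1") for byte in bitmap)
    return B * math.log(B / zeros)

# 300 distinct flows, each seen in several packets.
flows = [f"10.0.{i // 256}.{i % 256}:80" for i in range(300)]
est = count_flows(flows * 5)
print(round(est))  # statistically close to 300 despite collisions
```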
Bitmap counting – virtual bitmap
• Solution: (a) store only a portion of the bitmap; (b) multiply the estimate by a scaling factor
• HASH(pink) = 11100000
• Problem: estimate inaccurate when few flows are active: HASH(yellow) = 01100011
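A sketch of the virtual bitmap (the sizes are illustrative): flows are hashed into a large virtual space, only a slice is stored, and the linear-counting estimate for the slice is scaled up by the inverse of its coverage.

```python
# Sketch of virtual-bitmap counting: hash flows into a virtual space of
# V bits but store only the first B of them; apply linear counting to
# the stored slice and scale the estimate by V/B.
import math
import zlib

V = 1_000_000   # virtual bitmap size (bits)
B = 512         # physical bits actually stored

def estimate_flows(flow_ids):
    bits = [0] * B
    for fid in flow_ids:
        h = zlib.crc32(fid.encode()) % V
        if h < B:                 # keep only the stored slice
            bits[h] = 1
    zeros = B - sum(bits)
    if zeros == 0:
        return float("inf")       # slice saturated; range too small
    return (V / B) * B * math.log(B / zeros)

flows = [f"flow-{i}" for i in range(200_000)]
est = estimate_flows(flows)
print(round(est))  # statistical estimate, roughly 200000
```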
Bitmap counting – multiple bitmaps
• Solution: use many bitmaps, each accurate for a different range
• HASH(pink) = 11100000, HASH(yellow) = 01100011
• Use the bitmap whose range fits to estimate the number of flows
Bitmap counting – multiresolution bitmap
• Problem: must update up to three bitmaps per packet
• Solution: OR the bitmaps together – combine them into one
• HASH(pink) = 11100000, HASH(yellow) = 01100011
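A rough sketch of the multiresolution idea: a single hash per packet picks a region covering a geometrically smaller slice of hash space, so one update serves ranges from few to many flows. The region layout, sizes, and "not too full" threshold here are simplified relative to the paper.

```python
# Simplified multiresolution bitmap: region r receives roughly a
# 2^-(r+1) fraction of flows; the estimate uses the finest region that
# is not too full, scaled by the inverse of its coverage.
import math
import zlib

REGION_BITS = 64   # bits per region (toy size)
REGIONS = 8

def update(bitmaps, fid):
    h = zlib.crc32(fid.encode()) & 0xFFFFFFFF
    # Region index = number of leading 1 bits (coin-flip style), so
    # region r gets ~2^-(r+1) of flows (the last region's share differs).
    r = 0
    while r < REGIONS - 1 and (h >> (31 - r)) & 1:
        r += 1
    pos = zlib.crc32(b"pos" + fid.encode()) % REGION_BITS
    bitmaps[r][pos] = 1

def estimate(bitmaps):
    for r in range(REGIONS):
        ones = sum(bitmaps[r])
        if ones < 0.7 * REGION_BITS:          # "not too full" threshold
            zeros = REGION_BITS - ones
            lc = REGION_BITS * math.log(REGION_BITS / zeros)
            return lc * 2 ** (r + 1)          # scale by inverse coverage
    return float("inf")

bitmaps = [[0] * REGION_BITS for _ in range(REGIONS)]
for i in range(100):
    update(bitmaps, f"flow{i}")
est = estimate(bitmaps)
print(round(est))  # rough statistical estimate of 100 flows
```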
Error of virtual bitmap
[Figure: average (relative) error vs. flow density (flows/bit)]
100 million flows, error 1%
Hash table*: 1.21 Gbytes
Direct bitmap: 1.29 Mbytes
Virtual bitmap*: 1.88 Kbytes
Multiresolution bitmap: 10.33 Kbytes
Adaptive bitmap
• Virtual bitmap accurately measures the number of flows if the range is known in advance
• Often the number of flows does not change rapidly
• Measurement is repeated
• Can use the previous measurement to tune the virtual bitmap
• Use a small multi-resolution bitmap for tuning
• Combine into a single bitmap (single update)
Triggered bitmap
• Need multiple instances of the counting algorithm (e.g., port scan detection)
• Many instances count few flows
• Triggered bitmap:
• Allocate a small direct bitmap to each new source
• If the number of bits set exceeds a trigger value, allocate a large multiresolution bitmap
Scan detection memory usage
Interval length | Snort (naïve) | Probabilistic counting | Triggered bitmap
12 seconds | 1.94 M | 2.42 M | 0.37 M
600 seconds | 49.60 M | 22.34 M | 5.59 M
A family of counting algorithms
Setting | Algorithm | Applications
General counting | Multiresolution bitmap | Track infections
Narrow range | Virtual bitmap | Triggers (e.g., DoS)
Small counts common | Triggered bitmap | Port scans
Stationarity | Adaptive bitmap | Measurement
Add and delete | Increment-decrement | Scheduling
What is Next?
• Wireless!
• Readings:
• [BM09] In Defense of Wireless Carrier Sense
• [BPSK97] A Comparison of Mechanisms for Improving TCP Performance over Wireless Links (2 sections)
• Optional:
• [BDSC94] MACAW: A Media Access Protocol for Wireless LANs