BGP Scaling Techniques
ISP Training Workshops
[APNIC Training Wiki]

BGP Scaling Techniques
- Original BGP specification and implementation were fine for the Internet of the early 1990s
  - But didn’t scale
- Issues as the Internet grew included:
  - Scaling the iBGP mesh beyond a few peers?
  - Implement new policy without causing flaps and route churning?
  - Keep the network stable, scalable, as well as simple?

BGP Scaling Techniques
- Current Best Practice Scaling Techniques
  - Route Refresh
  - Peer-groups
  - Route Reflectors (and Confederations)
- Deprecated Scaling Techniques
  - Soft Reconfiguration
  - Route Flap Damping

Dynamic Reconfiguration
Non-destructive policy changes

Route Refresh
- Policy Changes:
  - Hard BGP peer reset required after every policy change because the router does not store prefixes that are rejected by policy
- Hard BGP peer reset:
  - Tears down BGP peering
  - Consumes CPU
  - Severely disrupts connectivity for all networks
- Solution:
  - Route Refresh

Route Refresh Capability
- Facilitates non-disruptive policy changes
- No configuration is needed
  - Automatically negotiated at peer establishment
- No additional memory is used
- Requires peering routers to support “route refresh capability” – RFC2918
- Tell peer to resend full BGP announcement
    clear ip bgp x.x.x.x [soft] in
- Resend full BGP announcement to peer
    clear ip bgp x.x.x.x [soft] out

Dynamic Reconfiguration
- Use Route Refresh capability
  - Supported on virtually all routers
  - Find out from “show ip bgp neighbor”
  - Non-disruptive, “Good For the Internet”
- Only hard-reset a BGP peering as a last resort
  - Consider the impact to be equivalent to a router reboot

Cisco’s Soft Reconfiguration
- Now deprecated, but:
- Router normally stores prefixes which have been received from peer after policy application
  - Enabling soft-reconfiguration means router also stores prefixes/attributes received prior to any policy application
  - Uses more memory to keep prefixes whose attributes have been changed or have not been accepted
- Only useful now when the operator needs to know which prefixes have been sent to a router prior to the application of any inbound policy

Cisco’s Soft Reconfiguration
[Figure: flow from one peer through the BGP in process, the BGP table and the BGP out process to another peer. Labels: received; received and used; accepted; discarded; normal; soft; BGP in table.]

Configuring Soft Reconfiguration
    router bgp 100
     neighbor 1.1.1.1 remote-as 101
     neighbor 1.1.1.1 route-map infilter in
     neighbor 1.1.1.1 soft-reconfiguration inbound
    ! Outbound does not need to be configured !
- Then when we change the policy, we issue an exec command
    clear ip bgp 1.1.1.1 soft [in | out]
- Note:
  - When “soft reconfiguration” is enabled, there is no access to the route refresh capability
  - clear ip bgp 1.1.1.1 [in | out] will also do a soft refresh

Peer Groups

Peer Groups
- Problem – how to scale iBGP
  - Large iBGP mesh slow to build
  - iBGP neighbours receive the same update
  - Router CPU wasted on repeat calculations
- Solution – peer-groups
  - Group peers with the same outbound policy
  - Updates are generated once per group
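
As a rough illustration of "updates are generated once per group", the Python sketch below groups neighbours by outbound policy and runs the policy once per group instead of once per neighbour. The function name, the neighbour table and the toy policy are all hypothetical; this is not how any particular router implements peer-groups.

```python
from collections import defaultdict


def build_updates(neighbor_policy, routes, apply_policy):
    """Return (update per neighbour, number of policy runs).

    neighbor_policy maps a neighbour address to the name of its outbound
    policy; apply_policy(policy_name, routes) is supplied by the caller.
    All names here are illustrative assumptions.
    """
    groups = defaultdict(list)
    for neighbor, policy in neighbor_policy.items():
        groups[policy].append(neighbor)

    updates, policy_runs = {}, 0
    for policy, members in groups.items():
        update = apply_policy(policy, routes)  # computed once per group
        policy_runs += 1
        for neighbor in members:
            updates[neighbor] = update         # same update reused for each member
    return updates, policy_runs


# Three example neighbours that share one outbound policy.
neighbors = {"1.1.1.1": "outfilter", "2.2.2.2": "outfilter", "3.3.3.3": "outfilter"}


def toy_policy(name, routes):
    # Hypothetical outbound filter: drop anything inside 10.0.0.0/8.
    return [r for r in routes if not r.startswith("10.")]


updates, runs = build_updates(neighbors, ["10.0.0.0/8", "192.0.2.0/24"], toy_policy)
```

With three neighbours in one group the policy runs once rather than three times, which is the CPU saving the slide refers to.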

Peer Groups – Advantages
- Makes configuration easier
- Makes configuration less prone to error
- Makes configuration more readable
- Lower router CPU load
- iBGP mesh builds more quickly
- Members can have different inbound policy
- Can be used for eBGP neighbours too!

Configuring a Peer Group
    router bgp 100
     neighbor ibgp-peer peer-group
     neighbor ibgp-peer remote-as 100
     neighbor ibgp-peer update-source loopback 0
     neighbor ibgp-peer send-community
     neighbor ibgp-peer route-map outfilter out
     neighbor 1.1.1.1 peer-group ibgp-peer
     neighbor 2.2.2.2 peer-group ibgp-peer
     neighbor 2.2.2.2 route-map infilter in
     neighbor 3.3.3.3 peer-group ibgp-peer
    ! note how 2.2.2.2 has different inbound filter from peer-group !

Configuring a Peer Group
    router bgp 100
     neighbor external-peer peer-group
     neighbor external-peer send-community
     neighbor external-peer route-map set-metric out
     neighbor 160.89.1.2 remote-as 200
     neighbor 160.89.1.2 peer-group external-peer
     neighbor 160.89.1.4 remote-as 300
     neighbor 160.89.1.4 peer-group external-peer
     neighbor 160.89.1.6 remote-as 400
     neighbor 160.89.1.6 peer-group external-peer
     neighbor 160.89.1.6 filter-list infilter in

Peer Groups
- Always configure peer-groups for iBGP
  - Even if there are only a few iBGP peers
  - Easier to scale network in the future
- Consider using peer-groups for eBGP
  - Especially useful for multiple BGP customers using same AS (RFC2270)
  - Also useful at Exchange Points where ISP policy is generally the same to each peer
- Peer-groups are essentially obsoleted
  - But are still widely considered best practice
  - Replaced by update-groups (internal coding – not configurable)
  - Enhanced by peer-templates (allowing more complex constructs)

Route Reflectors
Scaling the iBGP mesh

Scaling iBGP mesh
- Avoid ½n(n-1) iBGP mesh
  - n=1000 ⇒ nearly half a million iBGP sessions!
[Figure: full mesh of 14 routers = 91 iBGP sessions]
- Two solutions
  - Route reflector – simpler to deploy and run
  - Confederation – more complex, has corner case advantages
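
The ½n(n-1) figure is easy to check with a few lines of Python. The sketch below also shows, as an assumed best case, the session count when a single route reflector serves every other router as a client; the function names are illustrative only.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions in a full mesh of n routers: n(n-1)/2."""
    return n * (n - 1) // 2


def single_reflector_sessions(n: int) -> int:
    """Sessions when one router reflects for the other n-1 (assumed topology)."""
    return n - 1


for routers in (14, 1000):
    print(routers, full_mesh_sessions(routers), single_reflector_sessions(routers))
```

For 14 routers this gives 91 full-mesh sessions, and for 1000 routers 499500, the "nearly half a million" quoted above.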

Route Reflector: Principle
[Figure: AS 100 containing routers A, B and C, shown first without and then with a router labelled Route Reflector]

Route Reflector
- Reflector receives path from clients and non-clients
- Selects best path
- If best path is from client, reflect to other clients and non-clients
- If best path is from non-client, reflect to clients only
- Non-meshed clients
- Described in RFC4456
[Figure: AS 100 with routers A, B and C; labels: Reflectors, Clients]
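
The two reflection rules can be sketched in Python as follows. This is a simplified model that only covers the iBGP peers of one reflector; the peer names and the `PeerType` and `reflect_targets` identifiers are hypothetical.

```python
from enum import Enum


class PeerType(Enum):
    CLIENT = "client"
    NON_CLIENT = "non-client"


def reflect_targets(learned_from, peers):
    """Return the iBGP peers a reflector sends its best path to.

    peers maps a peer name to its PeerType; learned_from is the peer
    the best path was received from.
    """
    source_type = peers[learned_from]
    targets = []
    for name, peer_type in peers.items():
        if name == learned_from:
            continue  # never send the path back to the peer it came from
        if source_type is PeerType.CLIENT:
            targets.append(name)  # from a client: other clients and non-clients
        elif peer_type is PeerType.CLIENT:
            targets.append(name)  # from a non-client: clients only
    return sorted(targets)


peers = {
    "B": PeerType.CLIENT,
    "C": PeerType.CLIENT,
    "X": PeerType.NON_CLIENT,
    "Y": PeerType.NON_CLIENT,
}
print(reflect_targets("B", peers))  # best path came from client B
print(reflect_targets("X", peers))  # best path came from non-client X
```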

Route Reflector Topology
- Divide the backbone into multiple clusters
- At least one route reflector and few clients per cluster
- Route reflectors are fully meshed
- Clients in a cluster could be fully meshed
- Single IGP to carry next hop and local routes

Route Reflectors: Loop Avoidance
- Originator_ID attribute
  - Carries the RID of the originator of the route in the local AS (created by the RR)
- Cluster_list attribute
  - The local cluster-id is added when the update is sent by the RR
  - Cluster-id is router-id (address of loopback)
  - Do NOT use bgp cluster-id x.x.x.x
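
A minimal Python sketch of how these two attributes prevent loops, assuming the behaviour described in RFC4456: the reflector sets Originator_ID if it is missing and prepends its cluster-id to Cluster_list, and a route is dropped when a router sees its own router-id as originator or a reflector sees its own cluster-id in the list. The function names and addresses are made up for illustration.

```python
def reflect(my_cluster_id, sender_router_id, originator_id, cluster_list):
    """Attributes a reflector attaches when it reflects a route."""
    if originator_id is None:
        originator_id = sender_router_id          # created by the RR
    return originator_id, [my_cluster_id] + list(cluster_list)


def accept(my_router_id, my_cluster_id, originator_id, cluster_list):
    """Loop check applied to a reflected route on receipt."""
    if originator_id == my_router_id:
        return False                              # our own route came back
    if my_cluster_id in cluster_list:
        return False                              # already passed through this cluster
    return True


# Client 10.0.0.9 sends a route to the reflector whose cluster-id is 10.0.0.1.
originator, clusters = reflect("10.0.0.1", "10.0.0.9", None, [])
```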

Route Reflectors: Redundancy
- Multiple RRs can be configured in the same cluster – not advised!
  - All RRs in the cluster must have the same cluster-id (otherwise it is a different cluster)
- A router may be a client of RRs in different clusters
  - Common today in ISP networks to overlay two clusters – redundancy achieved that way
  - → Each client has two RRs = redundancy

Route Reflectors: Redundancy
[Figure: AS 100 with PoP1, PoP2 and PoP3; labels: Cluster One, Cluster Two]

Route Reflector: Benefits
- Solves iBGP mesh problem
- Packet forwarding is not affected
- Normal BGP speakers co-exist
- Multiple reflectors for redundancy
- Easy migration
- Multiple levels of route reflectors

Route Reflectors: Migration
- Where to place the route reflectors?
  - Follow the physical topology!
  - This will guarantee that the packet forwarding won’t be affected
- Configure one RR at a time
  - Eliminate redundant iBGP sessions
  - Place one RR per cluster

Route Reflectors: Migration
[Figure: routers A to G; labels: AS 100, AS 200, AS 300]
- Migrate small parts of the network, one part at a time.

Configuring a Route Reflector
- Router D configuration:
    router bgp 100
     ...
     neighbor 1.2.3.4 remote-as 100
     neighbor 1.2.3.4 route-reflector-client
     neighbor 1.2.3.5 remote-as 100
     neighbor 1.2.3.5 route-reflector-client
     neighbor 1.2.3.6 remote-as 100
     neighbor 1.2.3.6 route-reflector-client
     ...

BGP Scaling Techniques
- These 3 techniques should be core requirements on all ISP networks
  - Route Refresh (or Soft Reconfiguration)
  - Peer groups
  - Route Reflectors

BGP Confederations

Confederations
- Divide the AS into sub-AS
  - eBGP between sub-AS, but some iBGP information is kept
    - Preserve NEXT_HOP across the sub-AS (IGP carries this information)
    - Preserve LOCAL_PREF and MED
- Usually a single IGP
- Described in RFC5065

Confederations
- Visible to outside world as single AS – “Confederation Identifier”
  - Each sub-AS uses a number from the private space (64512-65534)
- iBGP speakers in sub-AS are fully meshed
  - The total number of neighbors is reduced by limiting the full mesh requirement to only the peers in the sub-AS

Confederations
- Configuration (Router C):
    router bgp 65532
     bgp confederation identifier 200
     bgp confederation peers 65530 65531
     neighbor 141.153.12.1 remote-as 65530
     neighbor 141.153.17.2 remote-as 65531
[Figure: AS 200 containing Sub-AS 65530, Sub-AS 65531 and Sub-AS 65532, with routers A, B and C]

Confederations: Next Hop
[Figure: Confederation 100 with Sub-AS 65001, Sub-AS 65002 and Sub-AS 65003, plus AS 200; routers A to E; labels: 180.10.0.0/16, 180.10.11.1]

Confederation: Principle
- Local preference and MED influence path selection
- Preserve local preference and MED across sub-AS boundary
- Sub-AS eBGP path administrative distance

Confederations: Loop Avoidance
- Sub-AS traversed are carried as part of AS-path
- AS-sequence and AS path length
- Confederation boundary
- AS-sequence should be skipped during MED comparison

Confederations: AS-Sequence
[Figure: Confederation 100 with Sub-AS 65001, Sub-AS 65002, Sub-AS 65003 and Sub-AS 65004; routers A to H. AS paths shown for the prefix:
    180.10.0.0/16 200
    180.10.0.0/16 (65002) 200
    180.10.0.0/16 (65004 65002) 200
    180.10.0.0/16 100 200]
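
The AS paths in this figure can be reproduced with a small Python sketch. It assumes the behaviour described in RFC5065: inside the confederation each sub-AS prepends its own number to a separate confederation sequence (written in parentheses), that sequence is not counted in the AS path length, and at the confederation boundary it is replaced by the confederation identifier. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsPath:
    confed_seq: tuple = ()   # sub-AS numbers, printed in parentheses
    as_seq: tuple = ()       # ordinary AS sequence

    def __str__(self):
        parts = []
        if self.confed_seq:
            parts.append("(" + " ".join(str(a) for a in self.confed_seq) + ")")
        parts.extend(str(a) for a in self.as_seq)
        return " ".join(parts)

    def length(self):
        # Sub-AS numbers are not counted when comparing path lengths.
        return len(self.as_seq)


def send_to_other_sub_as(path, my_sub_as):
    """Advertise across a sub-AS boundary inside the confederation."""
    return AsPath((my_sub_as,) + path.confed_seq, path.as_seq)


def send_outside_confederation(path, confed_id):
    """Advertise to a real external AS: hide the sub-AS numbers."""
    return AsPath((), (confed_id,) + path.as_seq)


def has_loop(path, my_sub_as):
    """A sub-AS rejects a path that already contains its own number."""
    return my_sub_as in path.confed_seq


received = AsPath(as_seq=(200,))                  # 180.10.0.0/16 learned from AS 200
step1 = send_to_other_sub_as(received, 65002)
step2 = send_to_other_sub_as(step1, 65004)
outside = send_outside_confederation(step2, 100)
print(received, "|", step1, "|", step2, "|", outside)
```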

Route Propagation Decisions
- Same as with “normal” BGP:
  - From peer in same sub-AS → only to external peers
  - From external peers → to all neighbors
- “External peers” refers to
  - Peers outside the confederation
  - Peers in a different sub-AS
- Preserve LOCAL_PREF, MED and NEXT_HOP

Confederations (cont.)
- Example (cont.):
    BGP table version is 78, local router ID is 141.153.17.1
    Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
    Origin codes: i - IGP, e - EGP, ? - incomplete
       Network          Next Hop         Metric LocPrf Weight Path
    *> 10.0.0.0         141.153.14.3          0    100      0 (65531) 1 i
    *> 141.153.0.0      141.153.30.2          0    100      0 (65530) i
    *> 144.10.0.0       141.153.12.1          0    100      0 (65530) i
    *> 199.10.10.0      141.153.29.2          0    100      0 (65530) 1 i

More points about confederations
- Can ease “absorbing” other ISPs into your ISP
  - e.g., if one ISP buys another
  - (can use local-as feature to do a similar thing)
- You can use route-reflectors with confederation sub-AS to reduce the sub-AS iBGP mesh

Confederations: Benefits
- Solves iBGP mesh problem
- Packet forwarding not affected
- Can be used with route reflectors
- Policies could be applied to route traffic between sub-AS’s

Confederations: Caveats
- Minimal number of sub-AS
- Sub-AS hierarchy
- Minimal inter-connectivity between sub-AS’s
- Path diversity
- Difficult migration
  - BGP reconfigured into sub-AS
  - must be applied across the network

RRs or Confederations

                  | Internet Connectivity   | Multi-Level Hierarchy | Policy Control | Scalability | Migration Complexity
Confederations    | Anywhere in the Network | Yes                   | Yes            | Medium      | Medium to High
Route Reflectors  | Anywhere in the Network | Yes                   | Yes            | Very High   | Very Low

Most new service provider networks now deploy Route Reflectors from Day One

Route Flap Damping
Network Stability for the 1990s
Network Instability for the 21st Century!

Route Flap Damping
- For many years, Route Flap Damping was a strongly recommended practice
- Now it is strongly discouraged as it causes far greater network instability than it cures
- But first, the theory…

Route Flap Damping
- Route flap
  - Going up and down of path or change in attribute
    - BGP WITHDRAW followed by UPDATE = 1 flap
    - eBGP neighbour going down/up is NOT a flap
  - Ripples through the entire Internet
  - Wastes CPU
- Damping aims to reduce scope of route flap propagation

Route Flap Damping (continued)
- Requirements
  - Fast convergence for normal route changes
  - History predicts future behaviour
  - Suppress oscillating routes
  - Advertise stable routes
- Implementation described in RFC 2439

Operation
- Add penalty (1000) for each flap
  - Change in attribute gets penalty of 500
- Exponentially decay penalty
  - half life determines decay rate
- Penalty above suppress-limit
  - do not advertise route to BGP peers
- Penalty decayed below reuse-limit
  - re-advertise route to BGP peers
  - penalty reset to zero when it is half of reuse-limit
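
The penalty mechanics can be simulated with a short Python sketch. It is a simplified model, not a vendor implementation: time is in minutes, the limits are assumed values (half-life 15, reuse-limit 750, suppress-limit 2000), the maximum suppress time is not modelled, and the class name is hypothetical.

```python
class DampingState:
    """Per-prefix flap damping state (illustrative model only)."""

    def __init__(self, half_life=15.0, reuse_limit=750.0, suppress_limit=2000.0):
        self.half_life = half_life
        self.reuse_limit = reuse_limit
        self.suppress_limit = suppress_limit
        self.penalty = 0.0
        self.suppressed = False
        self.last_update = 0.0

    def _decay(self, now):
        # Exponential decay: the penalty halves every half_life minutes.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now
        if self.penalty < self.reuse_limit:
            self.suppressed = False      # re-advertise the route
        if self.penalty <= self.reuse_limit / 2:
            self.penalty = 0.0           # forget old history

    def flap(self, now, penalty=1000.0):
        """Record a flap (use penalty=500 for an attribute change)."""
        self._decay(now)
        self.penalty += penalty
        if self.penalty > self.suppress_limit:
            self.suppressed = True       # stop advertising the route

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed


state = DampingState()
for minute in (0, 1, 2):                 # three flaps, one minute apart
    state.flap(minute)
    print(minute, round(state.penalty), state.suppressed)
```

In this run the third flap pushes the penalty over the suppress-limit, and the route becomes usable again once roughly two half-lives have passed without further flaps.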

Operation
[Figure: penalty (0 to 4000) plotted against time (0 to 25), with horizontal lines marking the suppress limit and the reuse limit; regions labelled Network Announced, Network Not Announced and Network Re-announced]

Operation
- Only applied to inbound announcements from eBGP peers
- Alternate paths still usable
- Controlled by:
  - Half-life (default 15 minutes)
  - reuse-limit (default 750)
  - suppress-limit (default 2000)
  - maximum suppress time (default 60 minutes)

Configuration
- Fixed damping
    router bgp 100
     bgp dampening [<half-life> <reuse-value> <suppress-penalty> <maximum suppress time>]
- Selective and variable damping
     bgp dampening [route-map <name>]
    route-map <name> permit 10
     match ip address prefix-list FLAP-LIST
     set dampening [<half-life> <reuse-value> <suppress-penalty> <maximum suppress time>]
    ip prefix-list FLAP-LIST permit 192.0.2.0/24 le 32

Operation
- Care required when setting parameters
- Penalty must be less than reuse-limit at the maximum suppress time
- Maximum suppress time and half life must allow penalty to be larger than suppress limit

Configuration
- Examples – ✗
  - bgp dampening 15 500 2500 30
    - reuse-limit of 500 means maximum possible penalty is 2000 – no prefixes suppressed as penalty cannot exceed suppress-limit
- Examples – ✓
  - bgp dampening 15 750 3000 45
    - reuse-limit of 750 means maximum possible penalty is 6000 – suppress limit is easily reached

Maths!
- Maximum value of penalty is
  $$\text{max-penalty} = \text{reuse-limit} \times 2^{\left(\frac{\text{max-suppress-time}}{\text{half-life}}\right)}$$
- Always make sure that suppress-limit is LESS than max-penalty otherwise there will be no route damping
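
A quick Python check of this rule, assuming the maximum penalty is reuse-limit × 2^(max-suppress-time / half-life), which reproduces the figures 2000 and 6000 quoted for the two `bgp dampening` examples. The function names are illustrative; the argument order follows the `bgp dampening` command.

```python
def max_penalty(half_life, reuse_limit, max_suppress_time):
    """Largest penalty a prefix can accumulate (assumed formula, see above)."""
    return reuse_limit * 2 ** (max_suppress_time / half_life)


def damping_can_suppress(half_life, reuse_limit, suppress_limit, max_suppress_time):
    """True when the suppress-limit is reachable, i.e. below the maximum penalty."""
    return suppress_limit < max_penalty(half_life, reuse_limit, max_suppress_time)


# bgp dampening 15 500 2500 30  -> maximum penalty 2000, never suppresses
print(max_penalty(15, 500, 30), damping_can_suppress(15, 500, 2500, 30))
# bgp dampening 15 750 3000 45  -> maximum penalty 6000, suppress limit reachable
print(max_penalty(15, 750, 45), damping_can_suppress(15, 750, 3000, 45))
```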

Route Flap Damping History
- First implementations on the Internet by 1995
- Vendor defaults too severe
  - RIPE Routing Working Group recommendations in ripe-178, ripe-210, and ripe-229
  - http://www.ripe.net/ripe/docs
  - But many ISPs simply switched on the vendors’ default values without thinking

Serious Problems:
- “Route Flap Damping Exacerbates Internet Routing Convergence”
  - Zhuoqing Morley Mao, Ramesh Govindan, George Varghese & Randy H. Katz, August 2002
- “What is the sound of one route flapping?”
  - Tim Griffin, June 2002
- Various work on routing convergence by Craig Labovitz and Abha Ahuja a few years ago
- “Happy Packets”
  - Closely related work by Randy Bush et al

Problem 1:
- One path flaps:
  - BGP speakers pick next best path, announce to all peers, flap counter incremented
  - Those peers see change in best path, flap counter incremented
  - After a few hops, peers see multiple changes simply caused by a single flap → prefix is suppressed

Problem 2:
- Different BGP implementations have different transit time for prefixes
  - Some hold onto prefix for some time before advertising
  - Others advertise immediately
- Race to the finish line causes appearance of flapping, caused by a simple announcement or path change → prefix is suppressed

Solution:
- Do NOT use Route Flap Damping whatever you do!
- RFD will unnecessarily impair access to:
  - Your network and
  - The Internet
- More information contained in RIPE Routing Working Group recommendations:
  - www.ripe.net/ripe/docs/ripe-378.[pdf,html,txt]
- Work is underway to try and find ways of making RFD usable:
  - http://datatracker.ietf.org/doc/draft-ymbk-rfd-usable/