© 2001, Cisco Systems, Inc. All rights reserved. ISP Workshops
BGP Deployment & Scalability
Mike Pennington
Network Consulting Engineer
Cisco Systems, Denver
Basic BGP Review
Border Gateway Protocol
• Routing Protocol used to exchange routing information between networks
exterior gateway protocol
• RFC1771
work in progress to update
draft-ietf-idr-bgp4-17.txt
• Currently Version 4
• Runs over TCP
BGP
• Path Vector Protocol
• Incremental Updates
• Many options for policy enforcement
• Classless Inter Domain Routing (CIDR)
• Widely used for Internet backbone
• Autonomous systems
Path Vector Protocol
• BGP is classified as a path vector routing protocol (see RFC 1322)
A path vector protocol defines a route as a pairing between a destination and the attributes of the path to that destination.
12.6.126.0/24 207.126.96.43 1021 0 6461 7018 6337 11268 i
AS Path: 6461 7018 6337 11268
AS-Path
• Sequence of ASes a route has traversed
• Loop detection
• Apply policy
[Figure: AS 100, AS 200, AS 300, AS 400 and AS 500, with prefixes 170.10.0.0/16, 180.10.0.0/16 and 150.10.0.0/16]
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
150.10.0.0/16 300 400
AS-Path loop detection
[Figure: AS 100, AS 200, AS 300 and AS 500, with prefixes 140.10.0.0/16, 170.10.0.0/16 and 180.10.0.0/16]
Paths received by AS 500:
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
140.10.0.0/16 300
Paths announced onward by AS 500:
140.10.0.0/16 500 300
170.10.0.0/16 500 300 200
180.10.0.0/16 is not announced to AS100 as AS500 sees that it is originated from AS100, and that AS100 is the neighbouring AS – loop detection in action
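The loop check on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the route table is copied from the slide's example as seen from AS 500.

```python
def prepend(local_as, as_path):
    """An AS adds its own number to the front of the path when announcing."""
    return [local_as, *as_path]


def sender_announces(neighbor_as, as_path):
    """Sender-side check from the slide: skip routes that already passed through the neighbour."""
    return neighbor_as not in as_path


def receiver_accepts(local_as, as_path):
    """Receiver-side check: reject any path that already contains our own AS."""
    return local_as not in as_path


# Paths held by AS 500 in the slide's example.
routes_at_as500 = {
    "180.10.0.0/16": [300, 200, 100],
    "170.10.0.0/16": [300, 200],
    "140.10.0.0/16": [300],
}

# What AS 500 would announce to AS 100.
announced_to_as100 = {
    prefix: prepend(500, path)
    for prefix, path in routes_at_as500.items()
    if sender_announces(100, path)
}
print(announced_to_as100)
```

Only 170.10.0.0/16 and 140.10.0.0/16 survive the filter, which matches the two paths shown on the slide. Even if the route were sent, `receiver_accepts` shows that AS 100 would drop it.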
Autonomous System (AS)
• Collection of networks with same routing policy
• Single routing protocol
• Usually under single ownership, trust and administrative control
AS 100
BGP Basics
[Figure: AS 100, AS 101 and AS 102 with routers A, B, C, D and E; the links between them are labelled Peering]
BGP speakers are called peers
BGP General Operation
• Learns multiple paths via internal and external BGP speakers
• Picks the best path and installs in the forwarding table
• Policies applied by influencing the best path selection
External BGP Peering (eBGP)
[Figure: AS 100 and AS 101 with routers A, B and C]
• Between BGP speakers in different AS
• Should be directly connected
• Do not run an IGP between eBGP peers
Internal BGP Peering (iBGP)
• Topology independent
• Each iBGP speaker must peer with every other iBGP speaker in the AS
[Figure: AS 100 with routers A, B, D and E]
Internal BGP (iBGP)
• BGP peer within the same AS
• Not required to be directly connected
• iBGP speakers need to be fully meshed
they originate connected networks
they do not pass on prefixes learned from other iBGP speakers
BGP Attributes
What Is an Attribute?
• Describes the characteristics of a prefix
• Transitive or non-transitive
• Some are mandatory
[Figure: attribute fields including Next Hop, AS Path and MED]
AS-Path
• Sequence of ASes a route has traversed
• Loop detection
• Apply policy
[Figure: AS 100, AS 200, AS 300, AS 400 and AS 500, with prefixes 170.10.0.0/16, 180.10.0.0/16 and 150.10.0.0/16]
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
150.10.0.0/16 300 400
Next Hop
• Next hop to reach a network
• Usually a local network is the next hop in eBGP session
[Figure: AS 100, AS 200 and AS 300 with routers A and B; prefixes 150.10.0.0/16 and 160.10.0.0/16; interface addresses 150.10.1.1 and 150.10.1.2]
150.10.0.0/16 150.10.1.1
160.10.0.0/16 150.10.1.1
Next Hop (continued)
• IGP should carry route to next hops
• Recursive route look-up
• Unlinks BGP from actual physical topology
• Allows IGP to make intelligent forwarding decision
Local Preference
[Figure: AS 100, AS 200, AS 300 and AS 400 with routers A, B, C, D and E; 160.10.0.0/16 is learned over two paths with local preference 500 and 800]
  160.10.0.0/16 500
> 160.10.0.0/16 800
Local Preference
• Local to an AS – non-transitive
local preference set to 100 when heard from neighbouring AS
• Used to influence BGP path selection
determines best path for outbound traffic
• Path with highest local preference wins
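A minimal sketch of the "highest local preference wins" rule, using the 500 and 800 values from the Local Preference figure. The dictionary layout and the `exit` labels are hypothetical; only the comparison itself comes from the slide.

```python
DEFAULT_LOCAL_PREF = 100  # value the slide gives for routes heard from a neighbouring AS


def best_by_local_pref(paths):
    """Pick the path with the highest local preference (missing values use the default)."""
    return max(paths, key=lambda p: p.get("local_pref", DEFAULT_LOCAL_PREF))


paths = [
    {"prefix": "160.10.0.0/16", "exit": "first exit", "local_pref": 500},
    {"prefix": "160.10.0.0/16", "exit": "second exit", "local_pref": 800},
]

best = best_by_local_pref(paths)
print(best["exit"], best["local_pref"])
```

Because local preference is only compared inside the AS, it steers outbound traffic: every router in the AS ends up preferring the same exit.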
Multi-Exit Discriminator (MED)
[Figure: AS 200 and AS 201 with routers A, B and C; 192.68.1.0/24 is announced with MED 1000 on one link and MED 2000 on the other]
Multi-Exit Discriminator
• Inter-AS – non-transitive
metric reset to 0 on announcement to next AS
• Used to convey the relative preference of entry points
determines best path for inbound traffic
• Comparable if paths are from same AS
• IGP metric can be conveyed as MED
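The "comparable if paths are from same AS" rule can be sketched as below. The slide does not say which value wins; the sketch relies on the standard BGP behaviour that the lower MED is preferred. The field names and the neighbour AS number 200 are assumptions for illustration.

```python
def prefer_by_med(path_a, path_b):
    """Return the preferred path, or None when the MEDs are not comparable."""
    if path_a["neighbor_as"] != path_b["neighbor_as"]:
        return None  # different neighbouring ASes: MED is not compared
    # Lower MED is preferred (standard BGP behaviour, assumed here).
    return path_a if path_a["med"] <= path_b["med"] else path_b


link_1 = {"prefix": "192.68.1.0/24", "neighbor_as": 200, "med": 1000}
link_2 = {"prefix": "192.68.1.0/24", "neighbor_as": 200, "med": 2000}
other = {"prefix": "192.68.1.0/24", "neighbor_as": 300, "med": 10}

print(prefer_by_med(link_1, link_2))
print(prefer_by_med(link_1, other))
```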
MED & IGP Metric
• set metric-type internal
enable BGP to advertise a MED which corresponds to the IGP metric values
changes are monitored (and re-advertised if needed) every 600s
bgp dynamic-med-interval <secs>
Community
• BGP attribute
• Used to group destinations
• Represented as two 16-bit integers
• Each destination could be member of multiple communities
• Community attribute carried across ASes
• Useful in applying policies
• Useful in applying policies
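Since a community is two 16-bit integers, it fits in one 32-bit value. The sketch below converts between the `AA:NN` text form and that 32-bit number; the function names are hypothetical.

```python
def encode_community(text):
    """Convert 'AA:NN' into the single 32-bit community value."""
    high, low = (int(part) for part in text.split(":"))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half of a community must fit in 16 bits")
    return (high << 16) | low


def decode_community(number):
    """Convert a 32-bit community value back into 'AA:NN'."""
    return f"{number >> 16}:{number & 0xFFFF}"


print(encode_community("300:1"))
print(decode_community(encode_community("300:9")))
```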
Community
[Figure: AS 100, AS 200, AS 300 and AS 400 with routers A, B, C, D, E and X, plus ISP 1 and ISP 2; prefixes 160.10.0.0/16, 170.10.0.0/16 and 200.10.0.0/16 are tagged with communities]
160.10.0.0/16 300:1
170.10.0.0/16 300:1
200.10.0.0/16 300:9
BGP Deployment Guidelines
Recommended BGP commands for everyone
• ip bgp-community new-format
• no auto-summary
• no synchronization
• bgp deterministic-med
Whatever you do, use of deterministic-med MUST be consistent in your Autonomous System.
Other serious considerations
• For public peering: filter EBGP routes inbound and outbound
Block your own address space inbound
Block RFC 1918 space (inbound and outbound)
Block DSUA space (inbound and outbound):
http://www.ietf.org/internet-drafts/draft-manning-dsua-08.txt
• Use prefix-lists for route-filtering when possible (easier to read than ACLs)
Other serious considerations
• If you carry a default in the IGP, your BGP next-hops ALWAYS resolve (generally not good)
• bgp bestpath compare-routerid
Restores RFC-compliant path selection; OFF by default to reduce update churn, use with discretion
• If you have a large BGP network, consider techniques in the next section
BGP Scaling Techniques
BGP Scaling Techniques
• How to scale iBGP mesh beyond a few peers?
• How to implement new policy without causing flaps and route churning?
• How to reduce the overhead on the routers?
BGP Scaling Techniques
• Dynamic reconfiguration
• Peer groups
• Route flap damping
• Route reflectors
Soft Reconfiguration
Problem:
• Hard BGP peer clear required after every policy change because the router does not store prefixes that are denied by a filter
• Hard BGP peer clearing consumes CPU and affects connectivity for all networks
Solution:
• Soft-reconfiguration
Soft Reconfiguration
[Figure: prefixes received from the peer enter the BGP in process; accepted prefixes go to the BGP table and on through the BGP out process to the peer. Normal: prefixes that are not accepted are discarded. Soft: received prefixes are also kept in a BGP in table (received and used).]
Soft Reconfiguration
• New policy is activated without tearing down and restarting the peering session
• Per-neighbour basis
• Use more memory to keep prefixes whose attributes have been changed or have not been accepted
Configuring Soft reconfiguration
router bgp 100
 neighbor 1.1.1.1 remote-as 101
 neighbor 1.1.1.1 route-map infilter in
 neighbor 1.1.1.1 soft-reconfiguration inbound
! Outbound does not need to be configured !
Then when we change the policy, we issue an exec command
clear ip bgp 1.1.1.1 soft [in | out]
Managing Policy Changes
• clear ip bgp <addr> [soft] [in|out]
<addr> may be any of the following:
x.x.x.x             IP address of a peer
*                   all peers
ASN                 all peers in an AS
external            all external peers
peer-group <name>   all peers in a peer-group
Route Refresh Capability
• Facilitates non-disruptive policy changes
• No configuration is needed
• No additional memory is used
• Requires peering routers to support “route refresh capability” – RFC2918
• clear ip bgp x.x.x.x in tells peer to resend full BGP announcement
Soft Reconfiguration vs Route Refresh
• Use Route Refresh capability if supported
find out from “show ip bgp neighbor”
uses much less memory
• Otherwise use Soft Reconfiguration
Peer Groups
Without peer groups
• iBGP neighbours receive same update
• Large iBGP mesh slow to build
• Router CPU wasted on repeat calculations
Solution – peer groups!
• Group peers with same outbound policy
• Updates are generated once per group
Peer Groups - Advantages
• Makes configuration easier
• Makes configuration less prone to error
• Makes configuration more readable
• Lower router CPU load
• iBGP mesh builds more quickly
• Members can have different inbound policy
• Can be used for eBGP neighbours too!
Configuring Peer Group
router bgp 100
 neighbor ibgp-peer peer-group
 neighbor ibgp-peer remote-as 100
 neighbor ibgp-peer update-source loopback 0
 neighbor ibgp-peer send-community
 neighbor ibgp-peer route-map outfilter out
 neighbor 1.1.1.1 peer-group ibgp-peer
 neighbor 2.2.2.2 peer-group ibgp-peer
 neighbor 2.2.2.2 route-map infilter in
 neighbor 3.3.3.3 peer-group ibgp-peer
! note how 2.2.2.2 has different inbound filter from peer-group !
Route Flap Damping
• Route flap
Going up and down of path or change in attribute
BGP WITHDRAW followed by UPDATE = 1 flap
eBGP neighbour going down/up is NOT a flap
Ripples through the entire Internet
Wastes CPU
• Damping aims to reduce scope of route flap propagation
Route Flap Damping (Continued)
• Requirements
Fast convergence for normal route changes
History predicts future behaviour
Suppress oscillating routes
Advertise stable routes
• Implementation described in RFC2439
Operation
• Add penalty (1000) for each flap
Change in attribute gets penalty of 500
• Exponentially decay penalty
half life determines decay rate
• Penalty above suppress-limit
do not advertise route to BGP peers
• Penalty decayed below reuse-limit
re-advertise route to BGP peers
penalty reset to zero when it is half of reuse-limit
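The penalty rules above can be simulated roughly as follows. This is a simplified sketch, not how a router implements damping: it assumes the penalty is evaluated once per minute, flaps happen on whole minutes, and the parameter defaults are the ones quoted later in the deck (half-life 15 minutes, reuse-limit 750, suppress-limit 2000). The function name is hypothetical.

```python
def simulate(flap_minutes, until, half_life=15.0, reuse=750.0,
             suppress=2000.0, flap_penalty=1000.0):
    """Return a list of (minute, rounded penalty, suppressed?) tuples."""
    flaps = set(flap_minutes)
    penalty = 0.0
    suppressed = False
    history = []
    for minute in range(until + 1):
        if minute > 0:
            # Exponential decay: the penalty halves every half_life minutes.
            penalty *= 0.5 ** (1.0 / half_life)
        if minute in flaps:
            penalty += flap_penalty
        if penalty > suppress:
            suppressed = True       # stop advertising the route
        elif suppressed and penalty < reuse:
            suppressed = False      # re-advertise the route
        if penalty < reuse / 2:
            penalty = 0.0           # forget the flap history
        history.append((minute, round(penalty), suppressed))
    return history


for entry in simulate([0, 1, 2], until=60)[::10]:
    print(entry)
```

With three flaps in consecutive minutes the penalty crosses the suppress-limit on the third flap, and the route stays suppressed until the penalty has decayed below the reuse-limit.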
Operation
[Figure: penalty (0 to 4000) plotted against time (0 to 25), with the suppress limit and reuse limit marked; the curve is annotated Network Announced, Network Not Announced and Network Re-announced]
Operation
• Only applied to inbound announcements from eBGP peers
• Alternate paths still usable
• Controlled by:
Half-life (default 15 minutes)
reuse-limit (default 750)
suppress-limit (default 2000)
maximum suppress time (default 60 minutes)
Configuration
Fixed damping
router bgp 100
 bgp dampening [<half-life> <reuse-value> <suppress-penalty> <maximum suppress time>]
Selective and variable damping
 bgp dampening [route-map <name>]
route-map <name> permit 10
 match ip address prefix-list FLAP-LIST
 set dampening [<half-life> <reuse-value> <suppress-penalty> <maximum suppress time>]
ip prefix-list FLAP-LIST permit 192.0.2.0/24 le 32
Operation
• Care required when setting parameters
• Penalty must be less than reuse-limit at the maximum suppress time
• Maximum suppress time and half life must allow penalty to be larger than suppress limit
Configuration
• Examples - bgp dampening 30 750 3000 60
reuse-limit of 750 means maximum possible penalty is 3000 – no prefixes suppressed as penalty cannot exceed suppress-limit
• Examples - bgp dampening 30 2000 3000 60
reuse-limit of 2000 means maximum possible penalty is 8000 – suppress limit is easily reached
Configuration
• Examples - bgp dampening 15 500 2500 30
reuse-limit of 500 means maximum possible penalty is 2000 – no prefixes suppressed as penalty cannot exceed suppress-limit
• Examples - bgp dampening 15 750 3000 45
reuse-limit of 750 means maximum possible penalty is 6000 – suppress limit is easily reached
Maths!
• Maximum value of penalty is
max-penalty = reuse-limit × 2^(maximum suppress time / half-life)
• Always make sure that suppress-limit is LESS than max-penalty otherwise there will be no route damping
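The four worked examples on the Configuration slides are consistent with max-penalty = reuse-limit × 2^(maximum suppress time / half-life). The sketch below checks each `bgp dampening` example against that relationship; the function names are hypothetical, and the argument order follows the command (half-life, reuse, suppress, maximum suppress time).

```python
def max_penalty(half_life, reuse_limit, max_suppress_time):
    """Largest penalty a route can hold: reuse-limit * 2^(max-suppress-time / half-life)."""
    return reuse_limit * 2 ** (max_suppress_time / half_life)


def damping_can_suppress(half_life, reuse_limit, suppress_limit, max_suppress_time):
    """Damping only has an effect when the suppress-limit is below the maximum penalty."""
    return suppress_limit < max_penalty(half_life, reuse_limit, max_suppress_time)


examples = [
    (30, 750, 3000, 60),
    (30, 2000, 3000, 60),
    (15, 500, 2500, 30),
    (15, 750, 3000, 45),
]
for half_life, reuse, suppress, max_time in examples:
    print(
        f"bgp dampening {half_life} {reuse} {suppress} {max_time}:",
        max_penalty(half_life, reuse, max_time),
        damping_can_suppress(half_life, reuse, suppress, max_time),
    )
```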
Enhancements
• Selective damping based on
AS-path, Community, Prefix
• Variable damping
recommendations for ISPs
http://www.ripe.net/docs/ripe-229.html
• Flap statistics
show ip bgp neighbor <x.x.x.x> [dampened-routes | flap-statistics]
Scaling iBGP mesh
Two solutions
Route reflector – simpler to deploy and run
Confederation – more complex, corner case benefits
13 routers: 78 iBGP sessions!
n=1000: nearly half a million iBGP sessions!
Avoid n(n-1)/2 iBGP mesh
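The session counts quoted on this slide follow directly from n(n-1)/2. A short check, with a hypothetical function name:

```python
def full_mesh_sessions(routers):
    """Number of iBGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


for n in (13, 100, 1000):
    print(n, "routers:", full_mesh_sessions(n), "iBGP sessions")
```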
Route Reflector: Principle
[Figure: AS 100 with routers A, B and C; one router is labelled Route Reflector]
Route Reflector
[Figure: AS 100 with routers A, B and C, labelled as reflectors and clients]
• Reflector receives path from clients and non-clients
• Selects best path
• If best path is from client, reflect to other clients and non-clients
• If best path is from non-client, reflect to clients only
• Non-meshed clients
• Described in RFC2796
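The two reflection rules can be expressed as a small function. This is a sketch of the rules as stated on the slide only; the peer names and the dictionary layout are hypothetical, and paths learned from eBGP peers are not modelled.

```python
def reflect_to(learned_from, peers):
    """List the iBGP peers that receive a best path learned from `learned_from`.

    `peers` maps a peer name to "client" or "non-client".
    """
    source_kind = peers[learned_from]
    targets = []
    for name, kind in peers.items():
        if name == learned_from:
            continue  # never send the path back to the peer it came from
        # From a client: reflect to everyone. From a non-client: clients only.
        if source_kind == "client" or kind == "client":
            targets.append(name)
    return sorted(targets)


peers = {"B": "client", "C": "client", "D": "non-client", "E": "non-client"}
print(reflect_to("B", peers))
print(reflect_to("D", peers))
```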
Route Reflector Topology
• Divide the backbone into multiple clusters
• At least one route reflector and few clients per cluster
• Route reflectors are fully meshed
• Clients in a cluster could be fully meshed
• Single IGP to carry next hop and local routes
Route Reflectors: Loop Avoidance
• Originator_ID attribute
Carries the RID of the originator of the route in the local AS (created by the RR)
• Cluster_list attribute
The local cluster-id is added when the update is sent by the RR
Cluster-id is router-id (address of loopback)
Do NOT use bgp cluster-id x.x.x.x
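A rough model of how the two attributes stop loops. The `Route` class and function names are hypothetical. The discard checks (a reflector drops a route whose cluster list already holds its own cluster-id, and a router drops a route whose originator ID is its own router-id) are the standard route reflection behaviour rather than something spelled out on the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Route:
    prefix: str
    originator_id: Optional[str] = None
    cluster_list: List[str] = field(default_factory=list)


def reflect(route, rr_cluster_id, learned_from_router_id):
    """Return the reflected copy, or None if the route already passed through this cluster."""
    if rr_cluster_id in route.cluster_list:
        return None  # loop: our own cluster-id is already in the list
    return Route(
        prefix=route.prefix,
        # The first reflector sets the originator; later reflectors keep it.
        originator_id=route.originator_id or learned_from_router_id,
        cluster_list=[rr_cluster_id, *route.cluster_list],
    )


def router_accepts(route, own_router_id):
    """A router ignores a reflected route that it originated itself."""
    return route.originator_id != own_router_id


original = Route("192.0.2.0/24")
once = reflect(original, rr_cluster_id="1.1.1.1", learned_from_router_id="9.9.9.9")
twice = reflect(once, rr_cluster_id="2.2.2.2", learned_from_router_id="1.1.1.1")
print(twice)
print(reflect(twice, rr_cluster_id="1.1.1.1", learned_from_router_id="2.2.2.2"))
```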
Route Reflectors: Redundancy
• Multiple RRs can be configured in the same cluster – not advised!
All RRs in the cluster must have the same cluster-id (otherwise it is a different cluster)
• A router may be a client of RRs in different clusters
Common today in ISP networks to overlay two clusters – redundancy achieved that way
Each client has two RRs = redundancy
Route Reflector: Benefits
• Solves iBGP mesh problem
• Packet forwarding is not affected
• Normal BGP speakers co-exist
• Multiple reflectors for redundancy
• Easy migration
• Multiple levels of route reflectors
Configuring a Route Reflector
router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 route-reflector-client
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 route-reflector-client
 neighbor 3.3.3.3 remote-as 100
 neighbor 3.3.3.3 route-reflector-client