
Beyond BGP

Dan Massey, Colorado State University

24 October 04 [email protected]

Internet Routing Challenges Facing Internet Routing

• Internet Has Grown Dramatically
  – Large number of routing entries
  – High volumes of updates
  – Frequent topological changes

• Fault-Model Has Changed Dramatically
  – More malfunctioning components
  – Intentional attacks

• Do we need a fundamentally new routing architecture?

Toward a New Architecture

• One claim: BGP is nearing the end of its useful lifetime
  – The Internet will soon collapse unless we act!!
• Other claim: BGP is the best engineering solution we are likely to produce
  – We need incremental patches to new problems
• Who is right?
• Beyond BGP uses
  – Measurements to assess where we are
  – Identification of (new?) routing requirements
  – Development of changes (incremental or new system) to address the above

How Did We Get To BGP

• Simple Distance Vector Routing Algorithms
  – Used in early Internet routing designs
  – Convey only limited information
  – Prone to long-lasting loops
• Expensive Link State Routing Algorithms
  – Learn the Full Network Topology
  – Signal every change in every link
• Path Vector Routing (BGP)
  – Middle ground that signals some path data
  – But does not signal the full topology

RIP and DBF

RIP
• Keep shortest path only
• B’s route to D: Nexthop=A, Dist=4

Distributed Bellman-Ford (DBF)
• Keep distance info from all neighbors
• B’s route to D: Nexthop=A, Dist=4; Alternate Nexthop=C, Dist=4

[Figure: example topology with nodes A, B, C, D, E, F and advertised distances to D (D:1, D:2, D:2, D:3, D:3, D:infinity)]

• 30sec refreshing interval
• Damping timer to space out two triggered updates: 1~5 seconds
• Poison reverse: B sends infinity distance to A

RIP and DBF:
• Exchange distance info.
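The difference between the two tables can be sketched in a few lines of Python. This is a minimal illustration, not the simulated protocols themselves: the class names, the unit link cost, and the tie-break by neighbor name are assumptions, and 16 is used as "infinity" following RIP's usual convention.

```python
INFINITY = 16  # RIP's conventional "unreachable" metric


class RipTable:
    """Keeps only the current best (next hop, distance) per destination."""

    def __init__(self):
        self.best = {}

    def update(self, dest, neighbor, advertised, link_cost=1):
        dist = min(advertised + link_cost, INFINITY)
        current = self.best.get(dest)
        # Accept news from the current next hop, or a strictly better offer.
        if current is None or current[0] == neighbor or dist < current[1]:
            self.best[dest] = (neighbor, dist)

    def route(self, dest):
        hop, dist = self.best.get(dest, (None, INFINITY))
        return (hop, dist) if dist < INFINITY else None


class DbfTable:
    """Keeps the latest distance heard from every neighbor."""

    def __init__(self):
        self.heard = {}

    def update(self, dest, neighbor, advertised, link_cost=1):
        dist = min(advertised + link_cost, INFINITY)
        self.heard.setdefault(dest, {})[neighbor] = dist

    def route(self, dest):
        options = self.heard.get(dest, {})
        if not options:
            return None
        hop = min(options, key=lambda n: (options[n], n))
        return (hop, options[hop]) if options[hop] < INFINITY else None
```

If B hears distance 3 to D from both A and C and A later advertises infinity, the RIP-style table is left with no route until a later update from C arrives, while the DBF-style table can switch to C at once.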

BGP Background

• Internet: composed of thousands of Autonomous Systems (ASes).
• BGP (Border Gateway Protocol): the de facto inter-AS routing protocol

[Figure: AS A, AS B, AS C, and AS E connected through BGP routers R1, R2, R3, R4, R5, R6]

How BGP works

• Uses path vector protocol
  – similar to distance vector protocol.
  – route includes entire path (sequence of nodes)
  – what if no path available?
• Consider an AS as a node
• B’s route to D:
  – Route via A = <A>
  – Route via C = <C E A>

[Figure: nodes A, B, C, D, E with advertisements D:<A>, D:<A>, D:<E A>, D:<C E A>]
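A small Python sketch of the path vector idea, using the node names from the example. The function names and the shortest-path preference are assumptions for illustration; real BGP route selection involves policy and further attributes.

```python
def loop_free(my_as, path):
    """A path that already contains this AS would form a loop."""
    return my_as not in path


def select(my_as, candidates):
    """Pick the shortest loop-free path, or None if no path is usable."""
    usable = [p for p in candidates if loop_free(my_as, p)]
    return min(usable, key=lambda p: (len(p), p), default=None)


def advertise(my_as, path):
    """Prepend this AS before passing the path on to a neighbor."""
    return [my_as] + list(path)
```

For B, `select("B", [["A"], ["C", "E", "A"]])` prefers `<A>`; if only `<C E A>` remains, that path is chosen; and a node that sees itself in an offered path discards the offer.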

Path Vector Routing Changes

• Worms triggered edge instability
  – Routers crashed due to ARP cache overflow.
  – Links were congested by worm traffic.
• BGP Path Exploration Exacerbates Dynamics
  – B’s route to D: Route via A = <A>, Route via C = <C E A>
  – Obsolete backup path <C E A> is used and convergence is delayed

[Figure: nodes A, B, C, D, E with three "withdraw" messages]

Policies and Policy Withdrawal

• But A could stop advertising to B due to a policy change; path <C E A> is still valid!
• Attach a Failure Withdrawal Community Attribute
  – Only apply the approach to failure withdrawal
• B’s route to D:
  – Route via A = <A>
  – Route via C = <C E A>

[Figure: nodes A, B, C, D, E with a "policy withdraw" message]

BGP Traffic Engineering

• We assumed an AS could be modeled as a node with a single best path to the destination
• But a single AS may advertise more than one path.
  – BGP Traffic Engineering: R4 chooses path <C B A>, R5 chooses path <C E A>
• Divide one AS into Logical ASes such that
  – All routers within a logical AS have the same best path
  – Each logical AS can be modeled as a node.
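One way to picture the logical-AS split is to group an AS's routers by the best path each has selected. This is only a sketch: the function name and input layout are assumptions, and the router `Rx` in the usage note is hypothetical.

```python
from collections import defaultdict


def logical_ases(best_paths):
    """Group the routers of one AS by the best path each one selected.

    best_paths maps a router name to its chosen path (sequence of ASes).
    Each resulting group can then be treated as a single node.
    """
    groups = defaultdict(list)
    for router, path in best_paths.items():
        groups[tuple(path)].append(router)
    return {path: sorted(routers) for path, routers in groups.items()}
```

With R4 on `<C B A>` and both R5 and a hypothetical Rx on `<C E A>`, the AS splits into two logical ASes.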

Number of Updates

[Figure: Number of Updates vs. Number of ASes in Network, for Original BGP and Enhanced BGP]

• Substantial reduction is achieved.
  – E.g. 3766 to 1419 in the 60-AS topology
• MinRouteAdver timer: within 30 seconds, only one advertisement is allowed.
  – It “packs” consecutive changes into one update.
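The packing effect of the MinRouteAdver timer can be sketched as follows. This is a simplified model under stated assumptions: one timer per sender, times given in seconds by the caller, and a hypothetical class name; it is not the timer logic of any particular router.

```python
class MraiSender:
    """Allows at most one advertisement per interval; later changes overwrite
    the pending one, so only the newest route goes out when the timer expires."""

    def __init__(self, interval=30.0):
        self.interval = interval
        self.last_sent = None   # time of the last advertisement
        self.pending = None     # newest route not yet advertised
        self.sent = []          # (time, route) pairs actually sent

    def route_changed(self, now, route):
        self.pending = route
        self.flush(now)

    def flush(self, now):
        ready = self.last_sent is None or now - self.last_sent >= self.interval
        if ready and self.pending is not None:
            self.sent.append((now, self.pending))
            self.last_sent = now
            self.pending = None
```

Three route changes at t=0, t=5, and t=12 produce only two advertisements: the first immediately and the latest route once the 30-second interval has elapsed.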

Convergence time

[Figure: Convergence Time (seconds) vs. Number of ASes in Network, for Original BGP and Enhanced BGP]

• Enhanced BGP reduces the convergence time substantially.
  – E.g. 337.0 seconds to 19.5 seconds in the 60-AS topology
• Elimination of one advertisement can cut convergence time by 30 seconds

Improving Path Vector Convergence

• Infocom 02 [4] uses consistency to detect invalid paths.
  – Reject path <x1, x2, …, xn, r1, r2, …, rm> if r1 is a direct neighbor and r1’s path is not <r1, r2, …, rm>
  – Adjusted to account for policy and implement in BGP
• Infocom 03 [Afek, et al] quickly flushes invalid paths.
  – BGP requires updates be separated by a min interval
  – Send withdraw (to flush route) if blocked by the interval
• Our recent work [5] attaches a new attribute: Root Cause Notification (RCN)
  – Identifies the failed link and includes a sequence number.
  – Allows any route relying on the failed link to be rejected.
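The consistency rule and the failed-link rejection can be sketched in Python. Both functions are illustrative assumptions about data layout: a withdrawn neighbor route is represented as `None`, paths handed to `relies_on` are assumed to include the destination as their last node, and the RCN sequence number is left out of this sketch.

```python
def is_consistent(path, neighbor_paths):
    """Consistency check: for every direct neighbor r1 that appears in the
    path, the rest of the path must equal what r1 itself last announced.

    neighbor_paths maps a direct neighbor to its announced path (starting
    with the neighbor itself), or to None if it has withdrawn the route.
    """
    path = tuple(path)
    for i, node in enumerate(path):
        if node in neighbor_paths and neighbor_paths[node] != path[i:]:
            return False
    return True


def relies_on(path, failed_link):
    """True if the path crosses the link named in a root cause notification."""
    a, b = failed_link
    hops = set(zip(path, path[1:]))
    return (a, b) in hops or (b, a) in hops
```

In the running example, B holds backup `<C E A>`; once direct neighbor A withdraws its route, the check rejects that backup instead of exploring it.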

Analyzing Path Vector Convergence

• Route fail-over has two stages.
• First, nodes inside the blue triangle lose routes and explore backup paths.
  – All short invalid paths are explored
• Second, an edge (a0) eventually selects the valid backup path via Sk.
  – Valid routes begin to propagate through the blue triangle.

Generic Convergence Results

Algorithm     Fail-Over Convergence Bounds
SPVP (BGP)    (N-1)(M + ld) + 3 Pmax (|E| - degree(G,0))
SPVP-AS       (N - degree(G,0))(M + ld) + 3 Pmax (|E| - |E^| + Degree(G^))
SPVP-GF       (N-1) ld + 3 Pmax (|E| - degree(G,0))
SPVP-RCN      Distance(G,0) (ld) + (Pmax) Distance(G,0)

Pmax = Node Processing Delay, ld = Link Delay
M = Minimum Advertisement Interval
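To get a feel for how the bounds compare, three of them can be evaluated directly. The sketch below copies the expressions as written; treating N as the number of nodes, |E| as the number of edges, degree(G,0) as the degree of the destination node, and Distance(G,0) as a single number are assumptions, since the slide does not define them. SPVP-AS is omitted because |E^| and G^ are not defined here.

```python
def spvp_bound(N, M, ld, Pmax, E, deg0):
    """SPVP (BGP): (N-1)(M + ld) + 3 Pmax (|E| - degree(G,0))."""
    return (N - 1) * (M + ld) + 3 * Pmax * (E - deg0)


def spvp_gf_bound(N, ld, Pmax, E, deg0):
    """SPVP-GF: (N-1) ld + 3 Pmax (|E| - degree(G,0))."""
    return (N - 1) * ld + 3 * Pmax * (E - deg0)


def spvp_rcn_bound(distance, ld, Pmax):
    """SPVP-RCN: Distance(G,0) ld + Pmax Distance(G,0)."""
    return distance * ld + Pmax * distance
```

The M term appears only in the SPVP and SPVP-AS expressions, which is why the minimum advertisement interval dominates those bounds whenever M is much larger than ld and Pmax.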


Simulation Results

What About Security?

• Convergence Discussion Neglects Security
  – What if routers send intentionally bad information?
• What is the Simplest Possible Attack?
  – Announce someone else’s routes
  – Example: Suppose Univ. of Colorado announces it is the origin for 129.82.0.0/16
  – In other words, CU announces CSU IP Address Space
• Can this Happen and/or What Would Prevent It?

Multiple Origin AS (MOAS) Cases

• Prefixes originate from Multiple Origin AS (MOAS)
  – Lower curve likely due to valid operational needs
• Spikes are errors that disrupt routing to prefix
  – Includes loss of routes to top level DNS servers

Infrastructure Faults and Attacks

[Figure: Internet cloud with c.gtld-servers.net (192.26.92.30) and a BGP monitor; label "originates route to 192.26.92/24"]

• BGP and DNS Provide No Authentication
  – Faults and attacks can mis-direct traffic.
  – One (of many) examples observed from BGP logs.
  – Server could have replied with false DNS data.
• ISPs announced new path for 20 minutes to 3 hours

BGP-based Solution Example

Example configuration:

    router bgp 59
     neighbor 1.2.3.4 remote-as 52
     neighbor 1.2.3.4 send-community
     neighbor 1.2.3.4 route-map setcommunity out
    route-map setcommunity
     match ip address 18.0.0.0/8
     set community 59:MOAS 58:MOAS additive

[Figure: AS58 and AS59 originate 18.0.0.0/8; AS52 also announces it. Announcements shown:]
• 18/8, PATH<58>, MOAS{58,59}
• 18/8, PATH<59>, MOAS{58,59}
• 18/8, PATH<52>, MOAS{52, 58}
• 18/8, PATH<4>, MOAS{4,58,59}
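A receiver-side check over announcements like those in the figure might look like the sketch below. The rule used here (every announcement for a prefix must carry the same MOAS list, and the origin AS must be on its own list) is an assumption about how the lists are compared, as are the function name and the convention that the origin is the last element of the path.

```python
def moas_conflict(announcements):
    """Return True if the MOAS lists seen for one prefix disagree.

    announcements is a list of (as_path, moas_list) pairs for a single
    prefix; the origin AS is taken to be the last element of as_path.
    """
    seen_lists = set()
    for as_path, moas_list in announcements:
        origin = as_path[-1]
        if origin not in moas_list:
            return True          # origin is not on its own list
        seen_lists.add(frozenset(moas_list))
    return len(seen_lists) > 1   # lists from different origins differ
```

The two announcements carrying MOAS{58,59} agree with each other; adding the announcement with MOAS{52, 58} makes the lists disagree and raises a conflict.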

BGP false origin detection: Simulation Results

[Figure: two panels, (a) One Origin AS and (b) Two Origin AS’s]

A Simple Filter

• Current BGP provides dynamic routes
  – Explore the opposite extreme...
• Select a single static route to each server.
  – Apply AS path filters to block all other announcements.
  – Also filter against more specifics.
• Route changes on a frequency of months, if at all.
  – Change in IP address, origin AS, or transit policy.
  – Adjust route only after off-line verification
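The static filter can be sketched with Python's standard `ipaddress` module. The class name is hypothetical, and the prefixes and AS numbers in the usage note come from documentation ranges rather than from the study.

```python
import ipaddress


class StaticRouteFilter:
    """Accepts one trusted AS path per protected prefix and blocks
    every other announcement for it, including more-specific prefixes."""

    def __init__(self):
        self.trusted = {}  # protected network -> the single allowed AS path

    def trust(self, prefix, as_path):
        self.trusted[ipaddress.ip_network(prefix)] = tuple(as_path)

    def accept(self, prefix, as_path):
        net = ipaddress.ip_network(prefix)
        for protected, allowed in self.trusted.items():
            if net == protected:
                return tuple(as_path) == allowed
            if net.version == protected.version and net.subnet_of(protected):
                return False  # more specific than a protected prefix
        return True  # unprotected prefixes are left to normal BGP
```

After `trust("192.0.2.0/24", (64500, 64501))`, the same prefix over any other path is rejected, as is `192.0.2.128/25` over any path, while unrelated prefixes pass through untouched.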

Why This Works: Theory

• Scale is limited to a small number of routes.
  – No exponential growth in top level DNS servers.
• Loss of a server is tolerable, invalid server is not.
  – Resolvers detect and time-out unreachable servers.
  – Provided surviving servers handle load, cost is some delay.
• Expect predictable properties and stable routes.
  – Servers don’t change without non-trivial effort.
  – Servers located in highly available locations.

Why This Works: Data

• Analysis based on BGP updates from RIPE.
  – Archive of BGP updates sent by each peer.
  – 9 ISPs from US, Europe, and Japan.
  – February 2001 - April 2002
• Some data collection notes
  – Used only peers that exchange full routing tables
    – Otherwise some route changes are hidden by policies
  – Adjusted data to discount multi-hop effect.
    – Multi-hop peering session resets don’t reflect ISP ops.

Impact on Reachability: ISP1 (US/Tier 1)

How Static Are The Routes?

• 3 changes in route to “A” over 14 months.
• 2 (valid) changes in the origin AS
  – 5/19/01 origin AS changed from 6245 to 11840
  – 6/4/01 origin AS changed from 11840 to 19836
• 1 change in transit AS routing policy
  – 11/8/01 (*,10913, 10913, 10913,*) -> (*,10913,*)
  – Could have built filter to allow this...
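A filter that tolerates this kind of change could compare AS paths after collapsing repeated (prepended) AS numbers. The sketch below is one possible approach, not the filter used in the study; the function names are hypothetical, and the ASes surrounding 10913 in the usage note are illustrative.

```python
from itertools import groupby


def collapse_prepends(as_path):
    """Drop consecutive repeats so <10913 10913 10913> becomes <10913>."""
    return tuple(asn for asn, _ in groupby(as_path))


def same_route(path_a, path_b):
    """Treat two paths as equal if they differ only in AS prepending."""
    return collapse_prepends(path_a) == collapse_prepends(path_b)
```

Under this comparison, a path through `10913 10913 10913` and one through a single `10913` count as the same route, while a change of transit AS still counts as a different route.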

What Routes Are Lost?

• Results from 3/1/01 until 5/19/01 AS change.
  – Reduced reachability to “A” from 99.997% to 99.904%
• 18 events when trusted route was withdrawn
  – 2 resulted in no route available (28 secs, 103 secs)
  – 8 instances of a back-up route lasting over 3 minutes
  – Longest lasting back-up advertised for 15 minutes
• Similar results for other time periods and servers.

Example of Filtered Routes

Time       Tail of AS Path
12:35:30   * 19836 19836 19836 19836
16:06:32   * 10913 10913 10913 10913 10913 10913 10913 19836
16:06:59   * 1239 10913 19836
16:07:30   * 701 10913 10913 19836
16:08:30   withdrawal
16:15:55   * 19836 19836 19836 19836

• With filter no route at 16:06:32
• No route at 16:08:30

[Figure: AS 19836, AS 10913, AS 1239, and AS 701; * marks the server]

Worst Case In Study: ISP 3 (Europe)

ISP 3 used one main route and a small number of consistent back-up routes.

Toward a More Balanced Approach

• Required infrequent updates to the filter.
  – Especially useful to automate infrequent tasks.
  – Natural tendency to forget task or forget how to do task
• More paths improves robustness
  – Simple filter allowed only 1 path.
  – ISP3’s reachability can be improved if filter allows two routes…
• Strike a balance between allowing dynamic changes and restricting to trusted paths.

BGP Adaptive Filters

• Slow down the route dynamics and add validation.
  – Apply hysteresis before accepting new paths
• Add options for validating new paths:
  – Believe route based purely on hysteresis
  – Probabilistic query/response testing against known data.
  – Trigger off-line checking (did origin AS really change?)
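The hysteresis option can be sketched as a filter that starts to believe a new path only after it has been announced continuously for a hold time. The class name, the one-hour hold time in the usage note, and the decision to restart the clock whenever the announced path changes are assumptions; the validation options are not modeled.

```python
class HysteresisFilter:
    """Keeps the trusted path until a new path has persisted for hold_seconds."""

    def __init__(self, trusted, hold_seconds):
        self.trusted = tuple(trusted)
        self.hold = hold_seconds
        self.candidate = None  # new path currently under observation
        self.since = None      # time the candidate was first seen

    def observe(self, now, path):
        """Record the path announced at time `now`; return the believed path."""
        path = tuple(path)
        if path == self.trusted:
            self.candidate = None                   # back to normal
        elif path != self.candidate:
            self.candidate, self.since = path, now  # start the clock
        elif now - self.since >= self.hold:
            self.trusted, self.candidate = path, None
        return self.trusted
```

With a one-hour hold time, a back-up path that appears for 15 minutes and then disappears never replaces the trusted path, while a path that is still being announced an hour later does.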

Impacts on Reachability: ISP1

[Figure: panels for Root servers and gTLD servers]

Impacts on Reachability: ISP3

[Figure: panels for Root servers and gTLD servers]

Convergence And Authentication

• BGP Suffers From Both Convergence Problems and Authentication Problems
  – Convergence fixes are good, if no attacks.
  – Authentication fixes work for redundant sites
• Can you improve both convergence and authentication in a realistic environment?
• Do you need to replace BGP?
  – If yes, with what? Would you pick BGP for your new network?
  – If no, what would you do instead?
• Wide Variety of Other Routing Challenges
  – Check out CS 580 and BBGP Project if interested

BGP Measurement and Artifacts

• BGP peers establish TCP session and send full route table (120K+ routes)
  – Updates sent only if routes change.
• Our results show frequent session resets between ISP routers and the monitoring point.
  – Monitoring point sessions cross multiple systems in the Internet.
  – Each reset adds 120K updates.
  – But very few ISP-ISP session resets.
• Our work in [1] presents rules to remove session reset artifacts.

[Figure: timeline showing Initial Table (120K+ routes), then Route Changes, then Initial Table (120K+ routes) again]


BGP Updates During Slammer Worm

BGP Updates During Nimda Worm

[Figure: BGP updates over time, labeled Measurement Artifacts, Routing Changes, and Total Attack]

What Our Analysis Shows: BGP Advertisements on 9/18/2001

[Figure: breakdown of BGP advertisements into BGP Table Exchange, Duplicate Advertisements, New Announcements, Withdraws, and Implicit Withdraws; slice labels 42%, 5%, 8%, 8%, 37% and 40.2%, 37.6%, 8.8%, 8.3%]

• A substantial percentage of the BGP messages during the worm attack were not about route changes

FRTR: Improving Peer Communication

• BGP Updates Are Not (Topology) Event Driven
  – Session resets trigger high volume surges
    – Govindan shows cascade failures can result.
• Lifetime of Invalid Routes is Unbounded
  – Never recover (until reset) if update is somehow lost.
    – Despite TCP, we found cases of “lost” withdrawals.
  – Attacker can poison a route with one update.
  – Soft-state (periodic re-announce) is too costly…
• FRTR Uses Periodic Bloom Filter Digests
  – Digests quickly confirm state after session reset.
  – Periodic digests bound lifetime of faults (w/ high prob).
  – Co-Author Keyur Patel (Cisco) is exploring Cisco development.

FRTR Performance

• For each route at receiver, check against the digest.
  – Bloom filter results in no false negatives.
• Compare total digests for missing route detection.
• False positive possible with known rate.
  – Add salts to reduce the chance of repeated false positives.
• Overhead is a function of digest size and frequency.
  – Work with Cisco suggests a 1.3% overhead increase.
• Complete Details to appear in [2] (DSN 2004)
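The digest check can be illustrated with a small salted Bloom filter. This is a generic Bloom filter sketch, not FRTR's actual digest format: the class name, the digest size, the number of hash functions, the use of SHA-256, and the string encoding of routes are all assumptions.

```python
import hashlib


class BloomDigest:
    """Salted Bloom filter over the sender's routes."""

    def __init__(self, bits=1024, hashes=3, salt=b""):
        self.bits, self.hashes, self.salt = bits, hashes, salt
        self.array = 0  # bit array stored as one Python integer

    def _positions(self, route):
        for i in range(self.hashes):
            data = self.salt + bytes([i]) + route.encode()
            digest = hashlib.sha256(data).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, route):
        for pos in self._positions(route):
            self.array |= 1 << pos

    def __contains__(self, route):
        return all((self.array >> pos) & 1 for pos in self._positions(route))


def stale_routes(receiver_routes, digest):
    """Routes the receiver holds that are definitely not in the sender's digest."""
    return [r for r in receiver_routes if r not in digest]
```

A route the sender holds always matches the digest (no false negatives), so anything `stale_routes` reports is truly stale. A stale route can slip through as a false positive, and sending the next digest with a different salt makes it unlikely that the same route slips through again.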

Packet Delivery during Routing Convergence

• Failures do occur in the Internet
  – 20% of intra-ISP links have a MTTF < 1 day [Diot:IMW02]
  – 40% of Inter-ISP routes have a MTT-Change < 1 day [Labovitz:FTCS-29]
• Routing convergence after failure takes time
  – IS-IS (Intra-ISP protocol): 5+ seconds [Diot:IMW02]
  – BGP (Inter-ISP protocol): 3+ minutes [Labovitz:Sigcomm00]
• Packets can be delivered during convergence

[Figure: example topology with nodes A, B, C, D, E, F, G]

What Is the Goal of Routing

• How to maximize packet delivery during routing convergence?
  – Topological connectivity’s impact?
  – Studying: RIP, Distributed Bellman-Ford (DBF), BGP
  – Previous work focused on: preventing loops, minimizing convergence time and routing overhead
• This problem becomes more important with
  – Larger Internet topology [Huston01] --> higher freq. of component failures
  – Richer connectivity [Huston01] --> potentially helps with more alternate paths
  – Higher bandwidth --> more packets sent during convergence

Simulation conducted

• 7 by 7 mesh topologies similar to those in [Baran64]
• 20 pkts/second
• Measure packet loss, loops, path convergence time, throughput, and e2e delay.
• Simulated node degree range [3 ~ 16]

Packet Losses (I): Observation

[Figure: Packet Loss vs. Node Degree, with one curve for RIP and one for DBF, BGP’ and BGP]

• Packet losses of DBF, BGP’ and BGP decrease to zero at degree 6.
• Richer connectivity helps RIP little.

Packet Loss (II): Lessons Learned

• Keeping alternate paths
• Connectivity Matters
  – No immediate available alternative due to poor connectivity and poison reverse
  – Alternative is more likely with richer connectivity

[Figure: two copies of a topology with nodes A, B, C, D, E, F, labeled "RIP:" and "DBF, BGP:"]

Is an alternate path valid?

• Valid Alternate Paths: not using the failed link
• Poison reverse and BGP’s path information are not enough! [Pei:Infocom2002]
• Richer connectivity --> reduces one single link’s impact, better availability of valid (but may be suboptimal) path

[Figure: topology with nodes A, B, C, C2, D, E, F, U, V, W, X; several nodes show an empty route D: < >]

Transient Loops (I): Observation

[Figure: Losses due to loops vs. Node Degree, with curves for DBF, BGP’ and BGP]

• BGP has the most loops!
• RIP has no loops
• Richer connectivity reduces the chance of looping.

Transient Loops (II): Msg Propagation

• Damping timer slows the msg propagation, causing looping
• Richer connectivity can reduce the chance of looping
• More details in: “A Study of Transient Loops in BGP”

[Figure: topology with nodes A, B, C, D, E, F, U, V, W, X, Y; routes D:<C A E F>, D:<B A E F>, D:<B C A E F>, D:<C B A E F>, and several empty routes D: < >; label "30 seconds!"]

Instantaneous Throughput

[Figure: Throughput (pkts/second) vs. Time, with curves for RIP, DBF, BGP’ and BGP]


Packet Delay During Convergence

Forwarding Path Convergence time

[Figure: convergence time vs. Node Degree]

• Time till there is no routing msg. BGP: 70, BGP’: 10
• Time till the forwarding path from S to D stabilizes. BGP: 13, BGP’: 2
• BGP: no loss at degree 6 or higher
• Shall we still tune MRAI timer to minimize convergence time (with the risk of increasing overhead)?


Packet Delivery After a Failure