Internet Routing Instability
Craig Labovitz, G. Robert Malan, Farnam Jahanian
University of Michigan
Presented by Krishnanand M Kamath
Cause and Effect
Define routing instability
Rapid change of network reachability and topology information
Causes
Router configuration errors
Transient physical and data link problems
Problems with leased lines, router failures, high levels of congestion
Software configuration errors
Effects
Many; among them:
Increased network latency and time to convergence
Dropped and out-of-order delivery of packets
Miserable end-to-end performance
Loss of connectivity in national networks
Route caching architecture and low-end processors for CPU
Pr(cache miss) increases, causing severe CPU load and memory problems
Delays in packet processing; Keep-Alive packets are delayed
Other routers flag the router as down and transmit updates
The "down" router reinitiates its peering sessions
Large state dump transmissions
Yet more routers fail: Route Flap Storm
Solutions
Route Aggregation
Reduces the overall number of networks visible in the core
Requires cooperation between service providers
Redundant connectivity to the Internet (multi-homing)
Route Dampening Algorithms
Not a panacea: legitimate announcements may be delayed
Overall
Multi-homing is exhibiting linear growth
Internet topology is growing increasingly less hierarchical
Topological complexity is increasing
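The remark that route dampening may delay legitimate announcements can be illustrated with a minimal sketch of a penalty-based damper. The slides do not describe the algorithm, so the class name `FlapDamper` and every parameter value (penalty per flap, suppress and reuse thresholds, half-life) are illustrative assumptions, not figures from the talk.

```python
class FlapDamper:
    """Hypothetical per-route damper: each flap adds a penalty that decays exponentially."""

    def __init__(self, penalty_per_flap=1000.0, suppress=2000.0,
                 reuse=750.0, half_life=900.0):
        # All four values are assumed for illustration only.
        self.penalty_per_flap = penalty_per_flap
        self.suppress = suppress      # penalty at which the route is suppressed
        self.reuse = reuse            # penalty below which it is usable again
        self.half_life = half_life    # seconds for the penalty to halve
        self.penalty = 0.0
        self.last = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay of the accumulated penalty since the last event.
        self.penalty *= 0.5 ** ((now - self.last) / self.half_life)
        self.last = now

    def flap(self, now):
        """Record one flap (withdrawal or changed announcement) at time `now`."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress:
            self.suppressed = True

    def usable(self, now):
        """Return True if the route may be advertised at time `now`."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False
        return not self.suppressed
```

With these assumed values, three flaps within 20 seconds suppress the route, and it stays suppressed for about half an hour even if it is perfectly stable afterwards, which is the delay the slide warns about.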
Recall
Updates
Announcements
•New route
•New policy decision for an existing route
Withdrawals
Explicit: associated with a withdrawal message
Implicit: existing route is replaced by the announcement of a new route
Types of Updates
Inter-domain routing updates
Forwarding Instability
Reflects legitimate topological changes that affect the paths on which data will be forwarded between ASes
Routing policy fluctuation
Reflects changes in routing policy information that may not affect forwarding paths between ASes
Pathological Updates
Redundant BGP information that reflects neither routing nor forwarding instability
Major Results
The number of BGP updates is one or more orders of magnitude larger than expected.
Routing information is dominated by pathological updates
Instability and redundant updates exhibit a periodicity of 30 & 60 secs
Instability and redundant updates show a correlation to network usage
Instability is not dominated by a small set of ASes or routes
Discounting policy fluctuation and pathological behavior, there remains a significant level of Internet forwarding instability
Specific architectural and protocol implementation changes in commercial internet routers through collaboration with vendors
Taxonomy
Data Analyzed
Sequences of BGP updates for each (prefix, peer) tuple
Events Identified
•WADiff
A route is explicitly withdrawn as it becomes unreachable and later replaced with an alternative route to the same destination. The alternative route differs in its ASPATH or nexthop attribute information. (Forwarding Instability)
•AADiff
A route is implicitly withdrawn and replaced by an alternative route as the original route becomes unreachable, or a preferred alternative path becomes available. (Forwarding Instability)
Taxonomy (contd.)
Events Identified (contd.)
•WADup
A route is explicitly withdrawn and then re-announced as reachable. This may reflect transient topological failure, or it may represent a pathological oscillation. (Forwarding Instability or Pathological Behavior)
•AADup
A route is implicitly withdrawn and replaced with a duplicate of the original route. A duplicate route is defined as a subsequent route announcement that does not differ in nexthop or ASPATH attribute information. (Pathological Behavior or Routing Policy Fluctuation)
•WWDup
The repeated transmission of BGP withdrawals for a prefix that is currently unreachable. (Pathological Behavior)
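The taxonomy above can be sketched as a small classifier over the update sequence of one (prefix, peer) tuple. This is a simplified illustration, not the authors' tooling: the `Update` type and `classify_sequence` function are hypothetical names, routes are compared only on ASPATH and nexthop as the slides define, and a withdrawal is labelled WWDup only when it directly follows another withdrawal.

```python
from typing import List, NamedTuple, Optional, Tuple


class Update(NamedTuple):
    kind: str                              # "A" = announcement, "W" = withdrawal
    aspath: Optional[Tuple[int, ...]] = None
    nexthop: Optional[str] = None


def classify_sequence(updates: List[Update]) -> List[Optional[str]]:
    """Label each update of one (prefix, peer) sequence; None means no event."""
    labels: List[Optional[str]] = []
    last_announced: Optional[Update] = None   # kept across withdrawals
    prev_kind: Optional[str] = None
    for u in updates:
        label = None
        if u.kind == "W":
            if prev_kind == "W":
                label = "WWDup"               # repeated withdrawal
        else:
            same = (last_announced is not None and
                    (u.aspath, u.nexthop) ==
                    (last_announced.aspath, last_announced.nexthop))
            if prev_kind == "A":
                label = "AADup" if same else "AADiff"   # implicit withdrawal
            elif prev_kind == "W" and last_announced is not None:
                label = "WADup" if same else "WADiff"   # explicit withdrawal
            last_announced = u
        labels.append(label)
        prev_kind = u.kind
    return labels
```

The first announcement of a prefix gets no label because there is no earlier route to compare against; a fuller version would also need to treat a withdrawal for a never-announced prefix as WWDup.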
Methodology
Data Collected: BGP routing messages
Time Period: nine months, starting January 1996
Where: Five of the major U.S. network exchange points
Tool: Unix-based route servers, Multi-threaded Routing Toolkit (MRT)
Gross Observations
We expect:
Instability to scale with the number of globally visible addresses and the total number of available paths
We observe:
For 45,000 prefixes and 1,500 paths: 3 to 6 million updates per day
Pathological Behavior
Disturbing behaviors:
Most of the BGP updates are entirely pathological (WWDup)
A single service provider can have a disproportionate effect on global routing
Causal relationship between the manufacturer of a router and the level of pathological behavior
Routing updates have a regular, specific periodicity of either 30 or 60 seconds
Pathological behavior persists for under five minutes
Origins of Pathologies
Stateless BGP: withdrawals are sent for every explicitly and implicitly withdrawn prefix; no state is kept on information advertised to peers
Plausible Explanations
CSU timer problems
Unjittered 30-second interval timer, self-synchronization
Misconfigured interaction of IGP/BGP protocols
Router vendor software bugs
Unconstrained routing policies
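To illustrate why an unjittered 30-second timer matters, the sketch below compares fixed and jittered firing times. The function name `firing_times` is hypothetical; the choice of scaling each interval by a random factor between 0.75 and 1.0 follows the jitter range suggested in the BGP-4 specification and is an assumption here, not something stated on the slides.

```python
import random
from typing import List


def firing_times(n: int, base: float = 30.0, jitter: float = 0.0,
                 seed: int = 0) -> List[float]:
    """Return the first n firing times of a periodic timer.

    Each interval is base * (1 - jitter * r) with r uniform in [0, 1),
    so jitter=0.25 draws intervals from (0.75 * base, base].
    """
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n):
        t += base * (1.0 - jitter * rng.random())
        times.append(t)
    return times


if __name__ == "__main__":
    # Without jitter every router with the same base interval fires on the
    # same 30-second grid, so once two routers align they stay aligned.
    print(firing_times(4))
    # With jitter, routers seeded differently drift apart over time.
    print(firing_times(4, jitter=0.25, seed=1))
    print(firing_times(4, jitter=0.25, seed=2))
```

Routers that share a fixed grid send their updates in bursts at the same instants, which is one way the 30-second periodicity in the measurements could arise.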
Analysis of Instability
Instability as the sum of AADiff, WADiff and WADup updates
Fine-grained Instability Statistics
There is no correlation between the size of an AS and its proportion of the instability statistics.
Fine-grained Instability Statistics
No single AS or prefix consistently dominates the instability statistics
Instability is evenly distributed across routes
Temporal Properties of Instability
Plausible causes for the periodicity:
Routing software timers, self-synchronization, and routing loops
CSU handshaking timeouts
Flaw in routing protocol
Origins of Internet Routing Instability
Craig Labovitz, G. Robert Malan, Farnam Jahanian
University of Michigan
Introduction
We observed,
Several orders of magnitude more routing updates than expected
Large number of duplicate routing messages
Unexpected frequency components between instability events
Extend earlier analysis by,
Identifying the origins of many of the pathological behaviors
Assessing the impact of the specific commercial router software changes suggested
Proposing additional router software changes that can decrease updates exchanged by an additional 30 percent or more
Major Results
Volume of inter-domain routing updates has decreased by an order of magnitude since April 1997.
The majority of BGP messages consists of redundant announcements
A growing proportion of instability stems from specific changes in Internet architecture coupled with limitations in router software and algorithms.
Instability is not disproportionately dominated by prefixes of specific lengths.
Persistently oscillating routes dominate the BGP traffic generated by a few Internet providers.
Experimentally confirmed a number of origins of pathological routing behavior postulated in the earlier work.
Analysis of Gross Trends
Note,
Dramatic decrease in the number of withdrawals
Number of announcements has doubled over a 28-month period
Growth of BGP announcements is disproportionate to any corresponding increase in the number of routing table entries
Taxonomy
Analyze sequences of BGP updates for each (prefix, peer) tuple
Identify the events,
•AADup:
A route is implicitly withdrawn and replaced with a duplicate of the original route. We define a duplicate route as a subsequent route announcement that does not differ in any BGP path attribute information.
•AADiff:
A route is implicitly withdrawn and replaced by an alternative route as the original route becomes unreachable, or a preferred alternative path becomes available.
•Tup and Tdown
Fluctuation in the reachability of a given prefix
Tup: a currently unreachable prefix is announced as reachable and transitions up
Tdown: an announced route is withdrawn and transitions down
Analysis of Update Categories
AADup behavior stems from:
1. Non-transitive attribute filtering
2. Combination of the BGP minimum advertisement timer with stateless BGP
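The second origin can be made concrete with a toy model, offered as a sketch under stated assumptions rather than a description of any vendor's implementation. It assumes a router that batches local route changes and, when the advertisement timer fires, sends whatever its current best route is. A stateless router sends it regardless; a stateful one remembers what it last sent and stays silent if nothing changed. The function name `updates_sent` and the fixed 30-second timer are hypothetical.

```python
import math
from typing import List, Tuple


def updates_sent(changes: List[Tuple[float, str]], timer: float = 30.0,
                 stateful: bool = False) -> List[Tuple[float, str]]:
    """Return (time, route) announcements sent to a peer.

    `changes` lists local best-route changes as (time, route). Changes are
    batched until the next timer tick, and only the final route in each
    batch is announced.
    """
    final_in_window = {}
    for t, route in sorted(changes):
        window = math.floor(t / timer) + 1     # index of the tick after t
        final_in_window[window] = route        # later changes overwrite earlier
    sent = []
    last_sent = None
    for window in sorted(final_in_window):
        route = final_in_window[window]
        if stateful and route == last_sent:
            continue                           # peer already has this route
        sent.append((window * timer, route))
        last_sent = route
    return sent
```

For example, if the best route is A at t=1, flips to B at t=35 and back to A at t=40, the stateless model announces A at t=30 and A again at t=60. The second announcement is an AADup, which the stateful model suppresses.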
Analysis of AADiffs
Note
Low percentage of ASPath AADiffs
Growth in the number of origin AADiffs is related to architecture and policy issues
Growth in the number of community AADiffs reflects the recent adoption of the community attribute by many ISPs
Oscillations in MED are due to the IBGP-mapped MED policy at two service providers
IBGP Mapped MED
Frequency
Recall,
Frequency defined as the inverse of the inter-arrival time between routing updates
Predominant frequencies have a 30 sec and 60 sec periodicity
Cause,
Frequency components stem from a fixed minimum BGP advertisement timer used by at least one router vendor
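The frequency definition above can be sketched as a histogram of inter-arrival times for one (prefix, peer) update stream. The function names, the 1-second bin width and the sample timestamps are assumptions for illustration; the slides do not say how the authors binned their data.

```python
from collections import Counter
from typing import Iterable


def interarrival_histogram(timestamps: Iterable[float],
                           bin_width: float = 1.0) -> Counter:
    """Count inter-arrival times between consecutive updates, rounded to bins."""
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    return Counter(round(g / bin_width) * bin_width for g in gaps if g > 0)


def dominant_frequency(timestamps: Iterable[float]) -> float:
    """Return 1 / (most common inter-arrival time), in updates per second."""
    period, _count = interarrival_histogram(timestamps).most_common(1)[0]
    return 1.0 / period


if __name__ == "__main__":
    # Hypothetical arrival times (seconds) for one prefix from one peer.
    arrivals = [0, 30, 60, 90, 150, 210]
    print(interarrival_histogram(arrivals))
    print(dominant_frequency(arrivals))
```

A fixed advertisement timer would show up in such a histogram as sharp peaks at the timer interval and its multiples, which matches the 30- and 60-second components reported.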
Prefix Length Statistics
Conclusions
The volume of routing update messages decreased by an order of magnitude following specific software changes on the majority of core Internet backbone routers. The software changes successfully suppressed the generation of pathological withdrawals.
Proposed new software changes that may reduce instability levels by an additional thirty percent.
Instability is well distributed across both autonomous system and prefix space. No single service provider or set of network destinations appears to be at fault.