Availability and Survivability in IP Networks · ICNP 2003 - S. Bhattacharyya and G. Iannaccone...

ICNP 2003 Tutorial: Availability and Survivability in IP Networks. Supratik Bhattacharyya, Sprint Advanced Technology Labs, [email protected]. Gianluca Iannaccone, Intel Research Cambridge, [email protected]. Copyright ©2003 Sprint. All rights reserved.

  • ICNP 2003 Tutorial

    Availability and Survivability in IP Networks

    Supratik Bhattacharyya, Sprint Advanced Technology Labs

    [email protected]

    Gianluca Iannaccone, Intel Research Cambridge

    [email protected]

    Copyright ©2003 Sprint. All rights reserved.

  • Slide 2 (ICNP 2003, S. Bhattacharyya and G. Iannaccone, November 4th, 2003)

    Tutorial Outline

    Part I: Introduction & Background
    Part II: Common approaches to survivability
    Part III: The Sprint experience
    Part IV: Open Issues & Future Directions

  • Slide 3

    Part I: Introduction & Background

  • Slide 4

    Part I - Outline

    IP Networks
    Survivability & Availability
    Scope of this tutorial

  • Slide 6

    What is the Internet?

    Set of networks running the IP protocol
    Networks are loosely connected
    Administered independently
    For our purposes, the Internet ends at the borders of the Internet Service Providers (ISPs).

  • Slide 7

    Loosely connected networks...

    [Figure: networks labeled UUnet, Sprint, AT&T, BT, Tier 2 ISP and Dial-up ISP, interconnected at peering points]

  • Slide 8

    Sub-IP Technologies (Sprint network)

    Dense Wavelength Division Multiplexing (DWDM)
    [one fiber can carry up to 40 λ at OC-192 (10 Gbps) speed]
    SONET Framing
    Cisco HDLC (for protocol multiplexing)
    IP, IS-IS

  • Slide 9

    Topology of today’s tier-1 backbones

    Points-of-Presence (PoPs) connected by long-haul fibers
    – Each PoP contains several backbone routers and access routers to connect customers
    – Two options: many small PoPs or few large PoPs
    – Trade-off: sites to maintain vs. proximity to customers

    Some examples
    – Sprint: IP over SONET with few large PoPs
    – AT&T: IP over SONET with many small PoPs
    – UUNet: MPLS/IP/ATM with many small PoPs

  • Slide 10

    The Sprint U.S. Topology

  • Slide 11

    Protocols for IP networks

    Exterior Gateway Protocol (EGP)
    – Border Gateway Protocol (BGP)
    – Announces reachability between networks
    – Heavily affected by policies
      • “Hot-potato routing” is the norm
      • Based on trust relationship between ISPs
      • Difficult to provide availability/survivability guarantees on BGP (and no incentive to do so)
      • Visibility into problems via the NANOG mailing list...

  • Slide 12

    Protocols for IP networks (cont)

    Interior Gateway Protocols (IGP)
    – Intermediate System to Intermediate System (IS-IS)
    – Open Shortest Path First (OSPF)
    – Routing based on shortest paths
      • ISP assigns a weight (or metric) to each link in the network.
      • Routers run the Dijkstra algorithm over the resulting weighted undirected graph
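
The shortest-path computation mentioned on this slide can be sketched as follows. This is a minimal illustration of Dijkstra's algorithm, not router code: the four-router topology, its link weights and the function name `dijkstra` are all made up for the example.

```python
import heapq


def dijkstra(graph, source):
    """Return the cost of the shortest path from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


# Hypothetical topology; the graph is undirected, so each link is listed
# in both directions with the same weight.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
distances = dijkstra(graph, "A")
```

Raising or lowering a single link weight changes which paths are shortest, which is how the operator steers traffic.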

  • Slide 13

    Part I - Outline

    IP Networks
    Survivability & Availability
    Scope of this tutorial

  • Slide 14

    Definitions

    Survivability: ability to maintain uninterrupted service in the presence of failures
    – It needs a definition of failure scenarios

    Availability: a measure of the disruption in packet forwarding due to failures
    – Different from disruption due to traffic congestion
    – It should be independent of traffic demand
    – More discussion in Part IV

  • Slide 15

    Scope of this tutorial

    What we will cover in detail
    – Sprint’s network design principles
    – Performance of today’s routing equipment
    – Failure characterization in IP networks

    What we will cover briefly
    – Optical protection/restoration
    – MPLS-based approaches
    – Many references can be found in the bibliography

    What we will not cover
    – Survivability across multiple autonomous systems

  • Slide 16

    Part II: Common Approaches to Survivability

  • Slide 17

    Part II - Outline

    Protection vs. Restoration: fundamental trade-offs
    Optical Protection/Restoration
    MPLS-based Protection
    IP Restoration
    Multi-layer approaches

  • Slide 19

    Protection vs. Restoration

    Protection: fixed, pre-determined failure recovery
    – wired in the network
    – provision primary and backup path at the same time

    Restoration: on-demand recovery
    – no need to plan ahead where failures may occur
    – the backup path is not defined a priori

    They can co-exist...
    – they operate at different timescales and layers

    ...but do we need both of them?

  • Slide 20

    Trade-offs: Recovery Speed

    Protection is inherently faster than restoration
    But... how fast does it need to be?
    – Outage durations between 50 and 200 ms will have minimal impact on services.
    – Voice services are slightly affected by outages between 200 ms and 2 s. Some concerns with video applications.
    – [ANSI Technical Report T1.TR.68-2001]

  • Slide 21

    Trade-offs: Deployment Costs

    Restoration makes better use of network resources
    Protection is less flexible than restoration
    Example: Optical protection
    – Fiber provisioning cycle on the order of 12-18 months
    – New fiber requires substantial capital investment
    – Geography works “against” fiber path diversity

    Flexibility is crucial if traffic demand grows too fast
    – Late 90s: Internet traffic “doubling every year”
    – [Coffman and Odlyzko, 2001]

  • Slide 22

    Trade-offs: Failure characteristics

    The network is supposed to survive all failures
    Understanding of failures is crucial
    – frequency and magnitude of failure events
    – what equipment is more prone to fail

    Lower layers cannot address higher layers’ failures
    – e.g., failures of the IP forwarding engine
    – the opposite is not (usually) true

  • Slide 23

    Part II - Outline

    Protection vs. Restoration: fundamental trade-offs
    Optical Protection/Restoration
    MPLS-based Protection
    IP Restoration
    Multi-layer approaches

  • Slide 24

    Optical Protection schemes [Ramamurthy and Mukherjee, 1999]

    Pre-configured backup route and wavelength
    – Dedicated
    – Shared

  • Slide 25

    Dedicated vs. Shared Protection

    [Figure: two panels, “Dedicated” and “Shared”, each showing Primary A, Primary B, Backup A and Backup B]

    Dedicated protection is more robust to failures
    Shared protection uses resources more efficiently
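
The efficiency claim can be made concrete with a small calculation. This is a sketch under a stated assumption that is not on the slide: the primaries sharing a backup link never fail at the same time. The function `backup_capacity` and the wavelength counts are hypothetical.

```python
def backup_capacity(demands, shared):
    """Backup wavelengths to reserve on one link.

    demands: wavelengths each primary path would need on this link after
    failing over. Dedicated protection reserves for every primary (sum).
    Shared protection, assuming at most one primary fails at a time,
    reserves only for the largest one (max).
    """
    if not demands:
        return 0
    return max(demands) if shared else sum(demands)


demands = [2, 3]  # hypothetical: Backup A needs 2 wavelengths, Backup B needs 3
dedicated = backup_capacity(demands, shared=False)
shared = backup_capacity(demands, shared=True)
```

If both primaries did fail together, the shared reservation would fall short of what is needed, which is the sense in which dedicated protection is more robust.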

  • Slide 26

    Optical Protection schemes (WDM) [Ramamurthy and Mukherjee, 1999]

    Pre-configured backup route and wavelength
    – Dedicated: path protection or link protection
    – Shared: path protection or link protection

  • Slide 27

    Path vs. Link Protection

    [Figure: two panels. “Path”: a single Backup. “Link”: Backup 1 and Backup 2 for Link 1 and Link 2]

    Path protection uses resources more efficiently
    Link protection gives longer backup path
    – Primary/Backup paths use same wavelength (may not be feasible)

  • Slide 28

    Comparison of Protection Schemes

    Dedicated link protection is the least efficient
    Usage of resources
    – shared path

  • Slide 29

    Optical Restoration schemes

    Dynamic discovery of path and wavelength
    Again, it can be path-based or link-based

    Restoration efficiency
    – Measured as fraction of connections restored after a failure
    – Path restoration > link restoration

    Restoration time
    – Link restoration < path restoration

  • Slide 30

    Part II - Outline

    Protection vs. Restoration: fundamental trade-offs
    Optical Protection/Restoration
    MPLS-based Protection
    IP Restoration
    Multi-layer approaches

  • Slide 31

    MPLS-based Protection

    Multi Protocol Label Switching
    – Label Switched Path (LSP) uniquely defines the path between source and destination
    – Routers (Label Switched Routers, LSRs) switch packets based on labels and can also assign a different label

    Protection-like scheme
    – Provision primary LSP and backup LSP
    – In case of failure along primary path, first LSR assigns backup label to incoming packets
    – There can be multiple backup LSPs

  • Slide 32

    Options for the backup LSP

    There can be multiple backup LSPs
    All backup paths are equal
    – Selection based on listed order of configuration

    Stand-by knob
    – Maintains backup path in ‘up’ condition
    – Eliminates call-setup delay of secondary LSP
    – Additional state information must be maintained
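
The selection rule on this slide (first usable backup in configuration order) could be sketched as below. The `Lsp` class, the `select_lsp` function and the labels are hypothetical names for illustration; they do not correspond to any vendor's configuration model.

```python
from dataclasses import dataclass


@dataclass
class Lsp:
    label: str
    up: bool  # with the stand-by knob, backups are kept 'up' ahead of time


def select_lsp(primary, backups):
    """Return the LSP the ingress LSR should use, or None if none is usable."""
    if primary.up:
        return primary
    for lsp in backups:  # configuration order; backups are otherwise equal
        if lsp.up:
            return lsp
    return None


primary = Lsp("A", up=False)  # failure along the primary path
backups = [Lsp("B", up=False), Lsp("C", up=True), Lsp("D", up=True)]
chosen = select_lsp(primary, backups)
```

Here label C is chosen over D only because it is listed first, not because its path is better.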

  • Slide 33

    MPLS Protection example

    [Figure: network of LSR 0 through LSR 5 with a Primary LSP (Label A) and a Backup LSP (Label B)]

    LSR 0: MPLS Table
      Label   Next Hop
      A       1
      B       3

    LSR 1: MPLS Table
      Label   Next Hop
      A       5

    LSR 3: MPLS Table
      Label   Next Hop
      B       4

    LSR 4: MPLS Table
      Label   Next Hop
      B       5

  • Slide 34

    LSP Rerouting

    Initiated by ingress LSR (#0 in the example)
    – Exception: Fast Re-Route (we will discuss it in Part IV)

    Conditions that trigger reroute
    – More optimal route becomes available
    – Failure along primary path
    – Preemption
    – Manual configuration change

    Recovery speed
    – With backup in stand-by: hundreds of msec to 1 sec
    – NANOG talk based on experiments in Qwest network

  • Slide 35

    Part II - Outline

    Protection vs. Restoration: fundamental trade-offs
    Optical Protection/Restoration
    MPLS-based Protection
    IP Restoration
    Multi-layer approaches

  • Slide 36

    IP Restoration: IS-IS protocol

    Link State protocol
    – Each node has complete information on the topology
    – Nodes flood list of neighbors and cost to reach them
    – Nodes independently compute their own routing tree

    In case of failure
    – The nodes that identify the failure flood an update message with the new list of neighbors
    – Each node updates its routing tree accordingly
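
The flooding step can be sketched as a toy simulation. It is a simplification under stated assumptions, not the IS-IS protocol machinery: updates carry a plain sequence number, delivery is instantaneous, and the topology, the `flood` function and the database layout are hypothetical.

```python
from collections import deque


def flood(adjacency, lsdbs, start, update):
    """Flood one link-state update from `start`.

    Each node stores the update only if it is newer than what it already
    holds for that origin, and only then passes it on to its neighbors.
    """
    origin, seq, neighbors = update
    queue = deque([start])
    while queue:
        node = queue.popleft()
        known = lsdbs[node].get(origin)
        if known is not None and known[0] >= seq:
            continue  # already holds this update or a newer one: stop here
        lsdbs[node][origin] = (seq, dict(neighbors))
        queue.extend(adjacency[node])
    return lsdbs


# Hypothetical topology after the A-B link has failed: A and B still reach C,
# and D hangs off C.
adjacency = {"A": ["C"], "B": ["C"], "C": ["A", "B", "D"], "D": ["C"]}
# Before the failure every node believed A had neighbors B and C (sequence 1).
lsdbs = {n: {"A": (1, {"B": 10, "C": 10})} for n in adjacency}
# A detects the failure and floods its new neighbor list (sequence 2).
flood(adjacency, lsdbs, "A", ("A", 2, {"C": 10}))
```

Once every database holds the new neighbor list, each node reruns its own shortest-path computation on the updated topology.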

  • Slide 37

    IP Restoration: Example

    [Figure: example topology with nodes A to H and weighted links]

  • Slide 38

    IP Restoration: Example

    [Figure: example topology with nodes A to H and weighted links]

      DST   NEXTHOP
      A     -
      B     B
      C     B
      D     D
      E     D
      F     B, D
      G     D
      H     D

  • Slide 39

    IP Restoration: Example

    [Figure: example topology with nodes A to H and weighted links]

      DST   NEXTHOP
      A     -
      B     B
      C     B
      D     D
      E     D
      F     B, D
      G     D
      H     D

    LSP: E-F down

  • Slide 40

    IP Restoration: Example

    [Figure: example topology with nodes A to H and weighted links]

      DST   NEXTHOP
      A     -
      B     B
      C     B
      D     D
      E     D
      F     B, D
      G     D
      H     D

  • Slide 41

    Part II - Outline

    Protection vs. Restoration: fundamental trade-offs
    Optical Protection/Restoration
    MPLS-based Protection
    IP Restoration
    Multi-layer approaches

  • Slide 42

    Multi-Layered Protection/Restoration

    MPLS/IP over WDM
    Need coordination
    – Multiple schemes should not compete
    – Need escalation strategy based on
      • explicit messaging or,
      • timer settings for detection and completing restoration
    – Higher layers should wait for lower layers first
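
A timer-based escalation rule might look like the sketch below. It only illustrates the idea that the higher layer holds off until the lower layer has had time to detect and complete restoration; the function names, the margin and all millisecond values are assumptions, not figures from the tutorial.

```python
def hold_off(lower_detect_ms, lower_restore_ms, margin_ms=0):
    """Hold-off long enough to cover lower-layer detection and restoration."""
    return lower_detect_ms + lower_restore_ms + margin_ms


def higher_layer_should_react(elapsed_ms, lower_layer_recovered, hold_off_ms):
    """React only after the hold-off expires and the lower layer has not recovered."""
    return elapsed_ms >= hold_off_ms and not lower_layer_recovered


# Hypothetical optical-layer timings
timer = hold_off(lower_detect_ms=10, lower_restore_ms=50, margin_ms=10)
too_early = higher_layer_should_react(30, False, timer)
escalate = higher_layer_should_react(80, False, timer)
not_needed = higher_layer_should_react(80, True, timer)
```

The cost of this design is that failures the lower layer cannot fix are handled later by exactly the hold-off time.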

  • Slide 43

    Part III: The Sprint Experience

  • Slide 44

    Part III - Outline

    IP-based Survivability
    Performance

  • Slide 45

    Part III - Outline

    IP-based Survivability
    – Design Requirements
    – Components of design solutions

    Performance

  • Slide 46

    Requirements for Survivability

    Availability of backup paths
    – must have enough spare capacity
    – must satisfy SLA requirements
      • low loss, end-to-end latency bounds

    Localized failure recovery
    – re-routing should be close to point of failure

    Prevent partitions
    – multiple node/link disjoint paths
    – physical path diversity

  • Slide 47

    Design Solution: Capacity Provisioning

    No admission control
    Redundant capacity for restoration path
    – rule of thumb: average link utilization < 50%

    Must plan for widespread outages
    – Baltimore tunnel fire, meltdown in other ISPs, etc.

    Added benefit
    – SLAs satisfied: negligible loss, no queuing
    – no need for QoS, service classes in the core
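
The reasoning behind the 50% rule of thumb can be checked with simple arithmetic: if two parallel links each run below 50%, one can absorb the other's traffic without exceeding its capacity. The sketch below assumes the failed link's load is split evenly over the survivors; the function name and the load figures are hypothetical.

```python
def utilization_after_failure(loads_gbps, capacity_gbps, failed_index):
    """Utilization of surviving parallel links after one fails.

    Assumes identical link capacities and an even split of the failed
    link's load across the survivors.
    """
    survivors = [load for i, load in enumerate(loads_gbps) if i != failed_index]
    if not survivors:
        raise ValueError("no surviving link to carry the traffic")
    extra = loads_gbps[failed_index] / len(survivors)
    return [(load + extra) / capacity_gbps for load in survivors]


# Two parallel 10 Gbps links, each at 45% utilization (made-up numbers)
after = utilization_after_failure([4.5, 4.5], 10.0, failed_index=0)
```

With both links at 45%, the survivor ends up at 90%, still below capacity; at 60% each it would be asked to carry 120%.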

  • Slide 48

    Design Solution: Network Topology

    Fully meshed inter-PoP topology is not feasible
    – each PoP connected to a subset of other PoPs
      • between 2 and 10
    – reduces probability of network partitioning

    Parallel links between adjacent PoPs
    – terminate on different routers, run on different fibers
    – Added benefit: load balancing

  • Slide 49

    The Sprint U.S. Topology

  • Slide 50

    Point-of-Presence (PoP) Design

  • Slide 51

    Design Solution: IP-Based Restoration

    No protection at optical layer
    – huge capital investment
    – less flexible, provisioning cycle of 12-18 months
    – cannot fix IP layer problems

    Relies solely on IS-IS protocol to re-route traffic around failures

  • Slide 52

    IS-IS routing practices

    Primary focus: inter-PoP paths
    – Selection criterion: end-to-end latency
    – primary and backup paths should traverse same set of PoPs

    Link weights
    – inter-PoP links ~10-63
    – intra-PoP links ~1-4

    Updating weights:
    – set by hand, modified only for large-scale failures

  • Slide 53

    Part III - Outline

    IP-based Survivability
    Performance
    – Analysis of failure patterns
    – Recovery Speed with IS-IS

  • Slide 54

    Performance of IP-based Survivability

    Question 1: Is it sufficient to handle the type, frequency, and scale of failures in the Sprint network?

    Analyze and characterize failures

  • Slide 55

    “Listening” to failures

    All failures are visible at the IP layer
    – Lower layers are not masking out events

    LSPs (here IS-IS link-state PDUs, not MPLS label switched paths) are flooded throughout the entire network
    – A machine that sets up an adjacency with a router is enough to observe and record all the failure events
    – Python Routing Toolkit (http://ipmon.sprint.com/pyrt)

    Three locations: East and West Coast
    – To monitor loss of LSPs
    – To measure propagation delay of LSPs

  • Slide 56

    Definition of failure event

    “Failure”: any event that causes a topology change

    [Figure: timeline from “start” to “end”. Messages “A: Link to B is down” and “B: Link to A is down” mark the failure; messages “A: Link to B is up” and “B: Link to A is up” mark the repair. The interval between start and end is the Time-to-Repair (or “duration”)]
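
Extracting the time-to-repair from such messages could be sketched as follows. The slide does not say which of the two “down” or “up” reports marks start and end, so this sketch makes an explicit assumption: a failure starts at the first “down” report from either end and ends once every end that reported “down” has reported “up”. The event format and function name are hypothetical.

```python
def time_to_repair(events):
    """Return (start, end) failure intervals for one link.

    events: list of (timestamp_s, reporter, state), state "down" or "up".
    Assumption: start = first "down" report, end = the "up" report after
    which no end still considers the link down.
    """
    intervals = []
    start = None
    still_down = set()
    for ts, reporter, state in sorted(events):
        if state == "down":
            if start is None:
                start = ts
            still_down.add(reporter)
        elif state == "up" and start is not None:
            still_down.discard(reporter)
            if not still_down:
                intervals.append((start, ts))
                start = None
    return intervals


# Made-up log for the link between routers A and B
events = [
    (100.0, "A", "down"),
    (100.4, "B", "down"),
    (160.0, "B", "up"),
    (160.5, "A", "up"),
]
intervals = time_to_repair(events)
durations = [end - start for start, end in intervals]
```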

  • Slide 57

    Failures are part of everyday operations

    [Figure: three panels labeled Weekly, Daily and Hourly]

  • Slide 58

    Time between Failures (network-wide)

    43%:

  • Slide 59

    Sources of failures

    Duration can provide hints, e.g.,
    – long (>1hour): fiber cuts, severe failures
    – medium (>10min): router/line card failures
    – short (>1min): line card resets
    – very short (

  • Slide 60

    Network-wide Failure Duration

    [Figure: cumulative fraction of failures]

    40% in 1-60 sec
    40% in 1-15 min
    10% in 15-60 min
    10% >1h
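
A breakdown like the one on this slide can be computed from a list of failure durations. The bucket boundaries below follow the slide; the sample durations are made up and were chosen only so the arithmetic is easy to follow, they are not Sprint measurements.

```python
from collections import Counter

BUCKETS = ("1-60sec", "1-15min", "15-60min", ">1h")


def bucket(duration_s):
    """Map a failure duration in seconds to one of the slide's four ranges."""
    if duration_s < 60:
        return "1-60sec"  # durations under 1 s also land here in this sketch
    if duration_s < 15 * 60:
        return "1-15min"
    if duration_s < 60 * 60:
        return "15-60min"
    return ">1h"


def bucket_fractions(durations_s):
    """Fraction of failures falling in each duration range."""
    if not durations_s:
        raise ValueError("no failures to classify")
    counts = Counter(bucket(d) for d in durations_s)
    return {name: counts[name] / len(durations_s) for name in BUCKETS}


sample = [5, 30, 45, 20, 90, 300, 400, 600, 1200, 7200]  # seconds, made up
fractions = bucket_fractions(sample)
```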

  • Slide 61

    Classification of Failures

    Remove maintenance
    – 9 hrs/week account for 20% of all failure events

    Classify remaining “Unplanned” failures

    Unplanned
    – Shared Link Failures: 30.8%
    – Individual Link Failures: 69.2%

  • Slide 62

    Shared Link Failures: Simultaneous

    2+ links going down at the exact same time
    – Example: failure of a line card with multiple interfaces

    All events turn out to involve a common router
    – Router Related class (16.5%)

    [Figure: Router 0 with a line card connecting it to Router 1, Router 2, Router 3 and Router 4. IS-IS logs on a time axis marked t1 and t2 for links 0–1, 0–2, 0–3 and 0–4, showing an LSP from Router 0 reporting links down and an LSP from Router 0 reporting 3 links up]
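
Detecting simultaneous failures in a log of link-down events could be sketched as below: group events by identical timestamp, keep groups with two or more links, and intersect the link endpoints to find a common router. The event format, router names and function name are hypothetical.

```python
from collections import defaultdict


def simultaneous_failures(down_events):
    """Group link-down events that share a timestamp.

    down_events: list of (timestamp, router_a, router_b).
    Returns (timestamp, links, common_routers) for every group of 2+ links.
    """
    by_time = defaultdict(list)
    for ts, a, b in down_events:
        by_time[ts].append((a, b))
    groups = []
    for ts in sorted(by_time):
        links = by_time[ts]
        if len(links) < 2:
            continue  # an individual link failure, not a shared one
        common = set(links[0])
        for link in links[1:]:
            common &= set(link)
        groups.append((ts, links, common))
    return groups


events = [(10, "R0", "R1"), (10, "R0", "R2"), (10, "R0", "R3"), (42, "R5", "R6")]
groups = simultaneous_failures(events)
```

In this made-up log the three links that fail at time 10 all touch R0, which is the pattern the slide attributes to the router-related class.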

  • Slide 63

    Shared Link Failures: Overlapping

    Some links fail “almost simultaneously”

    Possible causes
    – Shared component fails but the news arrives at the listener with some delay, due to various timers
    – Could be an optical component
    – ...or also a router component

    [Figure: Link 1, Link 2 and Link 3 running over a shared optical part, and a timeline of their overlapping failures with windows W1 and W2]

  • Slide 64

    Classification of Failures (updated)

    Unplanned
    – Shared Link Failures: 30.8%
      • Simultaneous: Router Related 16.5%
      • Overlapping: Optical Related 11.4%, Unspecified 2.9%
    – Individual Link Failures: 69.2%

  • Slide 65

    Individual Link Failures

  • Slide 66

    High vs. Low Failure Links

    Normalized number of failures per link
    High degree of heterogeneity
    A few (2.5% of) links account for half of independent failures
    Roughly two power-laws: -0.73, -1.35
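
The kind of concentration described here (a small share of links producing half of the failures) can be measured from per-link failure counts. This sketch uses made-up counts and a hypothetical function name; it does not reproduce the 2.5% figure.

```python
def fraction_of_links_for_half(failures_per_link):
    """Smallest fraction of links that together account for half of all failures."""
    counts = sorted(failures_per_link, reverse=True)
    total = sum(counts)
    if total == 0:
        raise ValueError("no failures recorded")
    running = 0
    for i, count in enumerate(counts, start=1):
        running += count
        if 2 * running >= total:
            return i / len(counts)
    return 1.0  # unreachable when total > 0, kept for clarity


# Made-up counts: two failure-prone links and ten quiet ones
counts = [40, 10] + [5] * 10
fraction = fraction_of_links_for_half(counts)
```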

  • Slide 67

    High Failure Links

  • Slide 68

    Low failure links

  • Slide 69

    Classification of Failures (updated)

    Unplanned
    – Shared Link Failures: 30.8%
      • Simultaneous: Router Related 16.5%
      • Overlapping: Optical Related 11.4%, Unspecified 2.9%
    – Individual Link Failures: 69.2%
      • High Failure: 38.5%
      • Low Failure: 30.7%

  • Slide 70

    Summary: Lessons learnt

    Failures are part of everyday operations
    Network topology is very dynamic
    – Links are reported down every 30 minutes on average
    – In 80% of the cases links come back up in

  • Slide 71

    Performance of IP-based Survivability

    Question 1: Is it sufficient to handle the type, frequency, and scale of failures in the Sprint network?

    Analyze and characterize failures

    Question 2: Is it fast enough to provide a highly available service across the backbone?

    Evaluate and fine-tune IS-IS

  • Slide 72

    Restoration Steps in IS-IS

    Failure detection
    Failure notification
    Forwarding path re-computation

  • Slide 73

    Failure detection

    For optical layer problems on point-to-point links
    – SONET layer detects failure in 10-20 msec

    Keep-alive messages are used in all other cases
    – software/hardware failures on a router
    – switched networks (e.g., Ethernet)
    – detection on the order of seconds (up to 60s)

  • Slide 74

    Failure notification

    For optical layer problems
    – SONET alarm on adjacent routers
    – Hold-off timer delays notification to IS-IS process to mask out transients

    Notifying other routers
    – LSPs are flooded to the entire network
    – Regulated by “generation” timer
    – At each hop, LSP flooding is rate-related and incurs processing delay

  • Slide 75

    Forward path re-computation

    Routing tree computation
    – CPU intensive, depends on number of nodes
    – Hold-off timer to aggregate multiple LSPs

    Forwarding information update has to propagate to interface cards
    – only in distributed architectures (e.g. Cisco GSR)
    – depends on number of BGP prefixes

  • Slide 76

    Generic GSR architecture

    [Figure: block diagram with a Route Processor holding the Routing Table (RIB); Line Cards, each with MAC, Packet Memory and Fwding Table (FIB); a Switched Backplane; further blocks labeled Line Interface, CPU and Memory]

  • Slide 77

    Restoration Steps

    [Figure: restoration steps annotated on routers: 1a. Failure Detection; 1b. ISIS Notification; 1. Del ISIS adjacency; 2. Generate LSP; 2. LSP Flooding; 2. Forward LSP; 3. SPF & update RIB; 4. Update FIB on linecards]

    Protocol convergence = 1 + 2 + 3
    Service convergence = Protocol convergence + 4

  • Slide 78

    Seven Steps for Restoration

      1. Detect link down                      10-20 ms
      2. Wait to filter out transient flaps    20 ms (2 s)
      3. Wait before sending update out        50 ms
      4. LSP flooding                          ~10 ms/hop
      5. Hold time before SPF                  100 ms (5.5 s)
      6. Compute shortest paths                100-400 ms
      7. Update the routing tables             ~20 pfx/ms

    Expected service disruption: 400 ms-1.2 s (+7 s)
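
The seven steps add up to a simple timing budget. The sketch below sums the per-step figures from the slide for a given flooding distance and number of prefixes to update; the hop count and prefix count in the example are made up, and the values the slide gives in parentheses are not modeled.

```python
def service_disruption_ms(
    hops,
    prefixes,
    detect_ms=(10, 20),          # step 1
    flap_filter_ms=20,           # step 2
    update_wait_ms=50,           # step 3
    flood_ms_per_hop=10,         # step 4
    spf_hold_ms=100,             # step 5
    spf_compute_ms=(100, 400),   # step 6
    fib_rate_pfx_per_ms=20,      # step 7
):
    """Return (low, high) estimates of the service disruption in milliseconds."""
    fixed = (
        flap_filter_ms
        + update_wait_ms
        + hops * flood_ms_per_hop
        + spf_hold_ms
        + prefixes / fib_rate_pfx_per_ms
    )
    low = detect_ms[0] + spf_compute_ms[0] + fixed
    high = detect_ms[1] + spf_compute_ms[1] + fixed
    return low, high


# Hypothetical scenario: update floods over 5 hops, 2000 prefixes to rewrite
low, high = service_disruption_ms(hops=5, prefixes=2000)
```

The fixed timers (steps 2, 3 and 5) contribute 170 ms of the total here regardless of network size.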

  • Slide 79

    Backbone Experiments

    [Figure: experiment involving PoP #1 and PoP #3 (West Coast) and PoP #2 and PoP #4 (East Coast), annotated “after 640ms”]

  • Slide 80

    Summary of Lessons learnt

    Timers dominated the restoration times for historical reasons
    – fear of instability in the network
    – fear of overloading router CPUs

    Sub-second convergence is achievable
    But…
    – need greater predictability
    – need fine tuning, e.g., FIB update process

  • Slide 81

    Part IV: Current Trends/Open Issues

  • Slide 82

    Part IV - Outline

    IP-based restoration
    MPLS Fast Re-Route
    Optical layer protection/restoration
    Service Availability

  • Slide 83

    Part IV - Outline

    IP-based restoration
    – Link weight assignment for transient failures
    – Failure-Insensitive Routing (FIR)
    – IGP Protocol modifications
    – Router architecture

    MPLS Fast Re-Route
    Optical layer protection/restoration
    Service Availability

  • Slide 84

    Link Weight Assignment (LWA)

    Manually configured by ISP operators looking at the network statically
    Operational goals
    – Low end-to-end delays
    – Prevent link overload/congestion

    Main problem: “transient” failures
    – Failures repaired in less than 10 min (~80% of total)
    – IS-IS re-routes around failure
    – Load balancing across the network is sub-optimal

  • 8585ICNP 2003 ICNP 2003 -- S. Bhattacharyya and G. IannacconeS. Bhattacharyya and G. Iannaccone November 4th, 2003November 4th, 2003 --

    LWA: looking at transient failures

    LWA: optimization problem

    [Nucci et al., ITC 2003]
    Consider every single-link failure scenario
    Find a set of weights that
    – minimizes max utilization
    – satisfies end-to-end delay guarantees (from SLA)
    – across all failure scenarios

    Tabu search heuristic to explore solution space
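The optimization above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-node topology, capacities, and demands are hypothetical (not from the tutorial), the delay constraint is omitted, equal-cost multipath is not modelled, and a plain random local search stands in for the tabu search used by Nucci et al.

```python
import heapq
import random

# Hypothetical undirected topology: link -> capacity (not from the tutorial).
LINKS = {("A", "B"): 10.0, ("B", "C"): 10.0, ("C", "D"): 10.0,
         ("A", "D"): 10.0, ("A", "C"): 10.0}
# Hypothetical traffic demands: (source, destination) -> volume.
DEMANDS = {("A", "C"): 6.0, ("B", "D"): 4.0, ("A", "D"): 3.0}


def shortest_path(weights, src, dst, failed=None):
    """Dijkstra over the links that are up; returns the links used, or None."""
    adj = {}
    for (u, v), w in weights.items():
        if (u, v) == failed:
            continue  # skip the failed link in this scenario
        adj.setdefault(u, []).append((v, w, (u, v)))
        adj.setdefault(v, []).append((u, w, (u, v)))
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue
        for nxt, w, link in adj.get(node, []):
            if d + w < dist.get(nxt, float("inf")):
                dist[nxt] = d + w
                prev[nxt] = (node, link)
                heapq.heappush(heap, (d + w, nxt))
    if dst not in dist:
        return None
    path, node = [], dst
    while node != src:
        node, link = prev[node]
        path.append(link)
    return path


def max_utilization(weights, failed=None):
    """Highest load/capacity ratio over all links in one failure scenario."""
    load = {link: 0.0 for link in LINKS}
    for (src, dst), volume in DEMANDS.items():
        path = shortest_path(weights, src, dst, failed)
        if path is None:
            return float("inf")  # demand cannot be routed
        for link in path:
            load[link] += volume
    return max(load[link] / LINKS[link] for link in LINKS)


def worst_case_utilization(weights):
    """Objective: max utilization over no-failure and every single-link failure."""
    scenarios = [None] + list(LINKS)
    return max(max_utilization(weights, f) for f in scenarios)


def local_search(iterations=200, seed=0):
    """Random local search (a stand-in for tabu search): change one weight at a time."""
    rng = random.Random(seed)
    best = {link: 1 for link in LINKS}
    best_cost = worst_case_utilization(best)
    for _ in range(iterations):
        cand = dict(best)
        cand[rng.choice(list(LINKS))] = rng.randint(1, 5)
        cost = worst_case_utilization(cand)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost
```

Calling `local_search()` returns a weight setting and its worst-case utilization; comparing that value with `worst_case_utilization` for all-ones weights shows how much the search gained on this toy input.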

    LWA: Worst case load spreading

    Weight setting helps balance the load during the worst-case failure scenario

    [Figure legend: No failure state; Worst-case failure scenario; Without taking into account transient failures; Taking into account transient failures]

    Failure Insensitive Routing (FIR)

    [Nelakuditi et al., IWQoS 2003]
    Handle single transient failures with
    – no network-wide re-convergence
    – fast restoration of packet forwarding

    Key idea
    – detect failure if a packet is returned by the outgoing interface
    – upon failure detection
      • suppress LSP broadcast
      • use precomputed table to re-route packet

    FIR: Example

    [Figure: example topology with nodes A, B, C, D, E and F; link weights 1, 2, 1, 2, 1 and 3]

    Shortest path from A to F: A->B->E->F

    FIR: Example (cont’d)

    [Figure: example topology with nodes A, B, C, D, E and F; link weights 1, 2, 1, 2, 1 and 3]

    A detects failure of one of (B->E) or (E->F)
    Key link set for F on interface (A->B) is {(B->E), (E->F)}
    Interface-specific forwarding is pre-computed
    “Backwarding” table needed for sending packet backward

    IS-IS/OSPF modifications

    Content-based rules for LSP processing
    – different priorities for important links and messages (up/down)
    – which first: LSP propagation or SPF?

    Second shortest paths
    – avoid re-computing the paths (save 100-400 ms)
    – not clear how to guarantee loop-free alternatives

    Router Architecture

    Incremental SPF
    – some updates don’t change the routing tree
    – no need to recompute the shortest path tree for each update

    Reliable multicast for line-card updates
    – a backbone router may have 16 line cards

    Prioritize routing updates
    – some prefixes are more important than others, e.g., voice/video gateways
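The incremental SPF point can be illustrated with a small check. This is a minimal sketch covering only the simplest case: a link that is not on the current shortest path tree goes down or becomes more expensive, so the tree cannot change. The function names and the three-node topology in the usage note are hypothetical, and real incremental SPF implementations handle many more cases.

```python
import heapq


def spf_tree(adj, root):
    """Dijkstra from root; returns {node: parent} describing the shortest path tree."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (d + w, v))
    return parent


def tree_links(parent):
    """Undirected links that the shortest path tree actually uses."""
    return {frozenset((child, par)) for child, par in parent.items() if par is not None}


def needs_full_spf(parent, u, v, old_w, new_w):
    """Conservative test: skip SPF only when an off-tree link gets worse.

    A cost increase (or failure) on a link the tree does not use leaves every
    shortest distance unchanged, so the existing tree stays valid.
    """
    on_tree = frozenset((u, v)) in tree_links(parent)
    if not on_tree and new_w >= old_w:
        return False
    return True
```

For example, with `adj = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "C": 1}, "C": {"A": 4, "B": 1}}` and root `"A"`, the tree uses A-B and B-C, so raising the cost of A-C needs no recomputation while raising the cost of A-B does.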

    Part IV - Outline

    IP-based restoration
    MPLS Fast Re-Route
    Optical layer protection/restoration
    Service Availability

    MPLS Fast Re-Route

    Basic idea: emulate optical link protection
    Operator assigns additional labels to re-route traffic around the failure of each protected link
    – Same mechanism that is used for path protection
    – Labels provide a temporary “patch” after a failure
    – Gives time to find a “more optimal” solution without interrupting traffic forwarding

    One label is required per link to be protected
    – More complex network management
    – Higher risk of configuration errors

    MPLS Fast Re-Route: Example

    [Figure: nodes A, B, C, D, E and F]

    Primary LSP from A to E

    MPLS Fast Re-Route: Example

    [Figure: nodes A, B, C, D, E and F]

    Has to be enabled on ingress
    – A creates detour around B, B around C, C around D

    MPLS Fast Re-Route: Example

    [Figure: nodes A, B, C, D, E and F]

    B to C link fails
    – B immediately detours around C
    – B signals to A that failure occurred

    MPLS Fast Re-Route: Example

    [Figure: nodes A, B, C, D, E and F]

    A calculates and signals new primary path
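The local-repair step of this example can be mimicked with a toy forwarding function. This is a sketch under assumptions, not a model of real MPLS signalling: the primary LSP A-B-C-D-E follows the slides, but the detour B-F-D is a hypothetical choice (the slides do not say which links the detour uses), and labels are not modelled at all.

```python
# Primary LSP from the example slides.
PRIMARY = ["A", "B", "C", "D", "E"]

# Hypothetical pre-signalled detour, keyed by the protected link. The value is
# a bypass path that starts at the point of local repair and merges back into
# the primary LSP further downstream.
DETOURS = {("B", "C"): ["B", "F", "D"]}


def forward(primary, detours, failed_links):
    """Return the node sequence a packet follows, or None if it is dropped."""
    path = [primary[0]]
    i = 0
    while i < len(primary) - 1:
        here, nxt = primary[i], primary[i + 1]
        if (here, nxt) not in failed_links:
            path.append(nxt)  # normal forwarding along the primary LSP
            i += 1
            continue
        detour = detours.get((here, nxt))
        if detour is None:
            return None  # unprotected link: traffic is lost until re-signalling
        path.extend(detour[1:])  # local repair: switch onto the detour at once
        i = primary.index(detour[-1])  # resume on the primary at the merge point
    return path
```

With no failures, `forward(PRIMARY, DETOURS, set())` follows the primary LSP; with the B-C link failed, the packet takes the detour at B and rejoins at D, which is the temporary "patch" that holds until A signals a new primary path.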

    MPLS Fast Re-Route: Performance

    NANOG presentation on Qwest backbone
    Fast Re-route: 10’s – 100’s of msec
    – Secondary LSP plus stand-by: 100’s msec – 1 sec
    – Disruption of traffic is limited to the detection time and propagation of update to FIBs

    Question: Is the gain worth the complexity?

    Part IV - Outline

    IP-based restoration
    MPLS Fast Re-Route
    Optical layer protection/restoration
    – Traffic grooming
    – Multi-layered approach

    Service Availability

    Traffic Grooming

    Single lightpath has high capacity
    – 10 Gbps, soon to be 40 Gbps
    – most customers need less

    Fill up a lightpath with low-speed streams
    – e.g. send 16 OC-3’s over a single lightpath

    Protection/Restoration
    – can be done at the level of lightpath, or…
    – find backup paths for individual streams
      • need sophisticated switches, costs more
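The capacity arithmetic behind grooming is easy to check. The sketch below uses the standard SONET rule that an OC-n signal runs at n times the OC-1 rate of 51.84 Mbps; the function names are illustrative. Sixteen OC-3 streams add up to an OC-48 (about 2.5 Gbps), while a lightpath at roughly 10 Gbps (OC-192) has room for 64 of them.

```python
OC1_MBPS = 51.84  # SONET OC-1 line rate in Mbps


def oc_rate_mbps(n):
    """Line rate of an OC-n signal in Mbps."""
    return n * OC1_MBPS


def streams_per_lightpath(lightpath_oc, stream_oc):
    """How many OC-<stream_oc> streams fit into one OC-<lightpath_oc> lightpath."""
    return lightpath_oc // stream_oc


if __name__ == "__main__":
    print(oc_rate_mbps(3))               # rate of one OC-3 stream
    print(streams_per_lightpath(48, 3))  # OC-3 streams in an OC-48
    print(streams_per_lightpath(192, 3)) # OC-3 streams in an OC-192
```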

    Traffic Grooming for Survivability

    [Mukherjee et al., 2003]
    Protect sub-lambda granularity connections
    Approaches
    – Protection at lightpath level (PAL)
    – Protection at connection level (PAC)
      • Capacity of a lightpath may be shared by backup and primary paths for a connection (MPAC) or not (SPAC)

    Better to groom working and backup paths separately
    SPAC is best but grooming ports are costly!

    A Joint IP/WDM Technique

    [Fumagalli et al., 2000]
    Hybrid solution
    – only part of traffic protected at WDM layer
    – use simulated annealing to optimize cost
      • WDM cost = total miles of working and protection wavelengths
      • IP cost = mileage of unprotected traffic streams and a “penalty” factor
    – cost function used to tune extent of WDM protection

    Part IV - Outline

    IP-based restoration
    MPLS Fast Re-Route
    Optical layer protection/restoration
    Service Availability

    Motivation

    Traditional QoS objectives less challenging in today’s backbones
    – loss, delay, jitter, etc.

    Need to capture the effect of failures
    – known to cause disruptions

    Availability-based metric in SLA can provide competitive advantage to ISPs

    The challenge of defining availability

    Telephone networks
    – gold standard 5 9’s, in terms of blocked calls
    – ANSI-defined outage index

    IP networks
    – no admission control, connectionless paradigm
    – ISPs offer “port availability” as part of SLA, but this ignores many factors, e.g.,
      • Is a physical path established? Is there an IP route? Is the server up and responding? etc.
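To put the "5 9's" target in concrete terms, the sketch below converts a number of nines into allowed downtime per year. It assumes a 365-day year and treats availability simply as the fraction of time in service; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(nines):
    """Allowed downtime per year for an availability of `nines` nines."""
    unavailability = 10 ** (-nines)  # e.g. 5 nines -> 0.00001
    return unavailability * MINUTES_PER_YEAR


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "nines:", round(downtime_minutes_per_year(n), 3), "min/year")
```

Five nines leaves a little over five minutes of downtime per year, which is why the choice of what counts as "down" matters so much for IP networks.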

    IP service availability for an ISP

    [Diot et al., 2003]
    Definition: how often is packet forwarding available between two end-points?
    Factors
    – network topology
    – IP-to-fiber mapping
    – interdependence of IP-layer elements
    – failure characteristics of links/routers
    – network convergence times

    A strawman definition

    Assume
    – uncongested network, P parallel paths between O-D pair
    – t_k = mean time between failures for any k paths
    – D = constant forwarding outage due to convergence
    – O_k = average failure duration affecting any k paths

    $A = 1 - \left[ \frac{D}{K} \sum_{k=1}^{P-1} \frac{k}{t_k} + \frac{O_P}{t_P} \right]$

    – first term: subset of paths fail, forwarding disrupted due to convergence
    – second term: all paths fail simultaneously
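The strawman formula can be evaluated numerically as follows. This is a sketch with two labelled assumptions: the slide writes the normalizing constant as K without defining it, so the code exposes it as a parameter `norm` and, as a guess, defaults it to the number of parallel paths P; and the input values in the usage note are made up purely to show the call.

```python
def strawman_availability(t, outage_all, convergence, norm=None):
    """Evaluate A = 1 - [ (D/K) * sum_{k=1}^{P-1} k/t_k + O_P/t_P ].

    t           -- list where t[k-1] is the mean time between failures
                   affecting any k paths, for k = 1..P (so P = len(t))
    outage_all  -- O_P, average duration of a failure that takes out all P paths
    convergence -- D, constant forwarding outage due to convergence
    norm        -- the slide's K (undefined there); ASSUMED equal to P if omitted
    All times must be in the same unit.
    """
    P = len(t)
    if norm is None:
        norm = P  # assumption, not stated on the slide
    # Partial failures: forwarding is disrupted only while routing converges.
    partial = (convergence / norm) * sum(k / t[k - 1] for k in range(1, P))
    # Total failure: no path is left for the whole failure duration.
    total = outage_all / t[P - 1]
    return 1.0 - (partial + total)
```

For instance, with two paths, a single-path failure every 30 days, a double failure once a year lasting one hour, and 5 seconds of convergence (all in seconds), `strawman_availability([30 * 86400, 365 * 86400], 3600, 5)` is dominated by the second term, since the total outage lasts far longer than the convergence disruption.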

    Failures in a congested network

    [Farago et al., 2003]
    Path can become unavailable due to congestion or failure
    ρ_j = traffic on link j; p_j = reliability of link j
    Link availability function α_j(p_j, ρ_j)
    – Bounded by [0,1], decreasing function of ρ_j
    – A route is available if all links on the route are simultaneously available
    – Upper and lower bounds derived

    Conclusions

    A wide range of techniques available, but…
    – Need cost-benefit analysis
    – Need to understand network failure characteristics

    Sprint experience
    – Optical layer mechanisms may be faster but IP restoration is satisfactory
    – Need to fine-tune existing mechanisms

    The future
    – Service availability will be an important metric for IP networks
    – End-to-end availability is orders of magnitude harder!
