thesis proposal meeyoung cha resilient design architecture

Post on 21-Jun-2015


1

thesis proposal

Meeyoung Cha

Resilient Design Architecture for Realtime Network Services

2

• Goal: Analytic foundation and resilient design architecture for realtime network services

• Outline
– Challenges for realtime services
– Point-to-point network services
– Point-to-multipoint network services
– Summary and future work

Overview

3

Part 1: Challenges for realtime services

4

• Motivation: Failures are frequent, and routing protocols do not converge fast enough for realtime network services.

• How to provide resiliency against temporary outages? Exploit path diversity!

What’s the problem?

5

Exploiting Path Diversity

Destination

Multiple disjoint paths

ISP Network

Using multiple disjoint paths gives maximum robustness against single points of failure!

Source

6

Scope of Work

Intra-domain

Algorithms

Modeling

Application

ISP backbone network

Point-to-point, point-to-multipoint

Heuristic, optimal

Identify new problems, practical considerations

7

Part 2: Point-to-point Network Services

8

• How to find multiple disjoint paths? Use a node inside the network to relay packets

• Problem: Where to place relay nodes, and how many?

Idea: Placing Relays

overlay path

Destination

default path

relays

ISP Network

Origin

9

• Idea: use disjoint overlay paths along with the default routing path to route around temporary failures.

• Previous work has focused on selecting good relay nodes, assuming relay nodes are already deployed.
– E.g., RON [Anderson SOSP 01], Detour [Savage Micro 99]

• As an ISP, we consider the problem of placing relay nodes well.
– Find a fixed set of relay nodes that offers as much path diversity as possible to all OD pairs.

Problem Definition

10

• Equal Cost Multi Paths (ECMP)

• Completely disjoint paths not possible due to ECMP.

Impact of ECMP on Overlay Path Selection

[Figure: equal-cost paths between access routers (AR) and backbone routers (BR), within a PoP (intra-PoP) and between PoPs (inter-PoP)]

11

Partially Disjoint Overlay Path

We allow partially disjoint overlay paths.

Overlap decreases resiliency. Introduce penalty to quantify the quality degradation.

[Figure: default path from origin o to destination d, and overlay path from o to d through relay r]

12

Penalty for Overlapped Links

[Figure: ECMP example from o to d, with each link labeled by the fraction of traffic it carries (values from 0.125 to 1.0)]

• Impact of a single link failure on a path: the probability that a packet routed on the path encounters the failed link, P[ path od fails | link l fails ]
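The per-link fractions in such an ECMP example can be computed by splitting traffic evenly over all shortest-path next hops. The following is a minimal Python sketch under that assumption (positive link weights, even splitting at every router); the function names and the toy `diamond` topology are hypothetical and do not come from the slides.

```python
import heapq
from collections import defaultdict


def shortest_distances(graph, target):
    """Dijkstra distances from every node to `target` on an undirected graph.

    `graph` maps node -> {neighbor: weight}; weights must be positive.
    """
    dist = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def ecmp_link_fractions(graph, origin, dest):
    """Fraction of origin->dest traffic on each directed link under even ECMP splitting."""
    dist = shortest_distances(graph, dest)
    node_share = defaultdict(float)
    node_share[origin] = 1.0
    link_share = {}
    # Visit nodes from farthest to nearest, so a node's share is complete before it is split.
    for u in sorted(dist, key=dist.get, reverse=True):
        share = node_share[u]
        if share == 0.0 or u == dest:
            continue
        # Next hops are the neighbors that lie on some shortest path to dest.
        next_hops = [v for v, w in graph[u].items()
                     if v in dist and abs(dist[u] - (w + dist[v])) < 1e-9]
        for v in next_hops:
            link_share[(u, v)] = share / len(next_hops)
            node_share[v] += share / len(next_hops)
    return link_share


# Toy topology (hypothetical): two equal-cost paths o-a-d and o-b-d.
diamond = {
    "o": {"a": 1, "b": 1},
    "a": {"o": 1, "d": 1},
    "b": {"o": 1, "d": 1},
    "d": {"a": 1, "b": 1},
}
fractions = ecmp_link_fractions(diamond, "o", "d")
```

In the toy topology each of the four links carries half of the o-to-d traffic, so a failure of any one of them affects half of the packets on the default path.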

13

Penalty: fraction of traffic carried on an “overlapped” link

Penalty Measures


• Penalty of a relay r for OD pair (o,d):

$P_{o,d}(r) = P[\,\text{both } o \to r \to d \text{ and } o \to d \text{ fail} \mid \text{single link failure}\,]$

• Penalty of a relay set R of size k: the sum, over all OD pairs, of the minimum penalty among the relays in R:

$\sum_{(o,d)} \min_{r \in R} P_{o,d}(r)$
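As a small illustration of the relay-set penalty, the sketch below sums, over OD pairs, the smallest per-relay penalty available in the set. It assumes the per-relay penalties have already been computed and stored in a nested dictionary; that data layout, the function name, and the numbers are hypothetical.

```python
def relay_set_penalty(pair_penalty, relays):
    """Sum over OD pairs of the smallest penalty offered by any relay in `relays`.

    `pair_penalty[(o, d)][r]` holds the penalty of relay r for OD pair (o, d).
    """
    if not relays:
        raise ValueError("relay set must not be empty")
    return sum(min(per_relay[r] for r in relays)
               for per_relay in pair_penalty.values())


# Hypothetical penalties for two OD pairs and two candidate relays.
example_penalty = {
    ("a", "b"): {"r1": 0.5, "r2": 0.25},
    ("a", "c"): {"r1": 0.0, "r2": 1.0},
}
both = relay_set_penalty(example_penalty, {"r1", "r2"})  # each pair picks its best relay
```

With both relays available, pair (a,b) uses r2 and pair (a,c) uses r1, so the set penalty is lower than with either relay alone.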

14

• Goal: find a relay set R of size k with minimum penalty

• Optimal solution
– exhaustive search, 0-1 integer programming (IP)

• Greedy selection heuristic
– start with 0 relays
– iteratively make the greedy choice (minimal penalty)
– repeat until k relays are selected

• Local search heuristic
– start with k random relays
– repeat single swaps as long as the penalty is reduced

Placement Algorithms
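The two heuristics can be sketched in a few lines of Python. This is an illustrative sketch only, assuming the per-relay penalties are precomputed in a dictionary `pair_penalty[(o, d)][r]` (a hypothetical layout) and that k does not exceed the number of candidates; it is not the implementation evaluated in this work.

```python
import random


def set_penalty(pair_penalty, relays):
    """Sum over OD pairs of the minimum penalty among the chosen relays."""
    return sum(min(p[r] for r in relays) for p in pair_penalty.values())


def greedy_placement(pair_penalty, candidates, k):
    """Start with no relays; k times, add the relay that gives the lowest total penalty."""
    chosen = []
    remaining = sorted(candidates)
    for _ in range(k):
        best = min(remaining, key=lambda r: set_penalty(pair_penalty, chosen + [r]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def local_search_placement(pair_penalty, candidates, k, seed=0):
    """Start with k random relays; apply single swaps while the penalty decreases."""
    rng = random.Random(seed)
    current = rng.sample(sorted(candidates), k)
    current_cost = set_penalty(pair_penalty, current)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for r in sorted(candidates):
                if r in current:
                    continue
                # Try replacing the relay in slot i with candidate r.
                trial = current[:i] + [r] + current[i + 1:]
                cost = set_penalty(pair_penalty, trial)
                if cost < current_cost:
                    current, current_cost = trial, cost
                    improved = True
    return current


# Hypothetical penalties for two OD pairs and three candidate relays.
toy_penalty = {
    ("a", "b"): {"r1": 0.5, "r2": 0.0, "r3": 0.4},
    ("c", "d"): {"r1": 0.5, "r2": 1.0, "r3": 0.0},
}
greedy = greedy_placement(toy_penalty, ["r1", "r2", "r3"], 2)
local = local_search_placement(toy_penalty, ["r1", "r2", "r3"], 2)
```

Both functions evaluate the full set penalty for every trial, which is simple but not efficient; an incremental update of the per-pair minimum would be the natural optimization for larger topologies.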

15

• Performance evaluation
– Number of relays vs. penalty reduction
– Comparison with other heuristics (random, degree)

• Sensitivity to network dynamics
– Based on topology snapshot data, do the selected relays remain effective as the topology changes?
– Based on network event logs, what fraction of traffic is protected from failures by using relays?

Evaluation Overview

16

• We use an operational tier-1 ISP backbone, with 3 months of topology snapshots and 6 months of event logs.
– Topology: 100 routers, 200 links
– Assume a hypothetical traffic matrix: equal amount of traffic between OD pairs

• Also evaluated with 1 real, 3 inferred, 6 synthetic topologies.

Dataset

17

Sensitivity to Network Dynamics

Relays are relatively insensitive to network dynamics.

5% of nodes are selected as relays

10% of nodes are selected as relays

18

Hypothetical Traffic Loss from Failure Event Logs

complete protection for 75.3% of failures

less than 1% of traffic lost for 92.8% of failures

(failure events)

19

• This is the first work to consider relay placement for path diversity in intra-domain routing.

• We quantify the penalty of using partially disjoint overlay paths; and propose two heuristics for relay placement.

• We evaluate our methods on diverse datasets.
– Our heuristics perform consistently well (near-optimal).
– A small number of relay nodes (≤10%) is good enough.
– Relays are relatively insensitive to network dynamics.
– Also proven effective against real (multiple) failures.

Summary

20

• Publications
– Meeyoung Cha, Sue Moon, Chong-Dae Park, Aman Shaikh, “Placing Relay Nodes for Intra-domain Path Diversity”, Proc. IEEE INFOCOM poster, Mar 2005; Proc. IEEE INFOCOM conference paper, Apr 2006

• Talks
– IEEE INFOCOM 2005, 2006
– DIMACS mixer series, Sep 2005
– Princeton systems group lunch talk, Dec 2005

• Action items
– Journal version in preparation
– Extended idea to inter-domain setting: Meeyoung Cha, Aman Shaikh, Sharad Agarwal, Sue Moon, “On AS Level Path Diversity”, submitted to IMC 06

Accomplishments

21

Part 3: Point-to-multipoint Network Services

22

• IPTV (Internet Protocol TV): distribution of broadcast TV traffic using IP technology

• Growing need for efficient and resilient IPTV design
– 4 million IPTV subscribers in 2005 [1]

• What is the best architecture for supporting IPTV?
– technology (IP, optical)
– hierarchy (hub-and-spoke, meshed)
– multicast routing (cost)
– failure restoration (high availability)

Motivation

[1] http://www.cisco.com/global/DK/docs/presentations/partnere/IPTV-Copenhagen-291105.pdf

23

Service Architecture of IPTV

[Figure: broadcast TV enters at the Super Head Ends (SHE); a backbone distribution network connects the SHEs to the Video Hub Offices (VHO); each VHO serves customers through a regional network. 2 SHEs and 40 VHOs across the US.]

How to design the backbone part of IPTV services (e.g., inter-connecting SHEs and VHOs)?

24

Service Requirements of IPTV

• Cost-effective design
– Each link associated with port / transport cost
– Find minimum cost multicast trees

• Reliable service
– High availability
– Resiliency against single node or link failures
– Two physically disjoint paths from SHEs to VHOs (multilayer problem)

25

SRLG (Shared Risk Link Group)

• Layered architecture

Single link failure → multiple failures in the upper layer

Two disjoint links may belong to a common SRLG
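The difference between link-diverse and SRLG-diverse paths can be made concrete with a short check. The sketch below assumes each path is a list of link identifiers and that a dictionary `srlg_of` maps every link to the set of SRLGs it belongs to; these names and the example data are hypothetical.

```python
def shared_srlgs(path_a, path_b, srlg_of):
    """Return the SRLGs that both paths depend on.

    Each path is a list of link identifiers; `srlg_of[link]` is the set of
    SRLG identifiers that the link belongs to.
    """
    groups_a = set().union(*(srlg_of[link] for link in path_a))
    groups_b = set().union(*(srlg_of[link] for link in path_b))
    return groups_a & groups_b


def is_link_diverse(path_a, path_b):
    """True if the two paths share no link."""
    return not set(path_a) & set(path_b)


def is_srlg_diverse(path_a, path_b, srlg_of):
    """True if no single SRLG failure can take down both paths."""
    return not shared_srlgs(path_a, path_b, srlg_of)


# Hypothetical example: two distinct IP links routed over the same fiber.
srlg_example = {
    "link1": {"fiber_A"},
    "link2": {"fiber_A"},
    "link3": {"fiber_B"},
}
```

In the example, the paths `["link1"]` and `["link2"]` are link-diverse but not SRLG-diverse, because a cut of `fiber_A` fails both.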

26

Path Protection Routing

How to create two trees such that the total cost is minimized and each VHO has physically disjoint paths connecting SHEs?

• 1+1 protection: resources dedicated, data simultaneously sent on two paths (guarding against each other)

[Figure: two SHEs connected to each VHO over SRLG-diverse paths]

27

Link-diverse versus SRLG-diverse

D1 and D3 may be disconnected due to a single fiber cut.

28

Problem Definition

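• Problem: Given sources S and destinations D, inter-connect S and D such that each destination remains connected to at least one of the sources under any single source, link, or SRLG failure.

• Binary decision variables, for source s, destination d, link (i,j), and SRLG b:
– one per (s, d, i, j): equals 1 if link (i,j) is used from s to d
– one per (s, i, j): equals 1 if link (i,j) is ever used by s
– one per (s, d, b): equals 1 if SRLG b is used from s to d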

29

[Formulation: minimize total cost, subject to SRLG diversity and flow conservation constraints]

Integer Programming (IP) Formulation

30

Evaluation Setup

• IP modeling
– GAMS tool: http://www.gams.com/
– ILOG CPLEX IP solver: http://www.ilog.com/

• Dataset: 2 SHE / 40 VHO locations in the US

• IP formulation amenable to realistic topologies!

31

Compared Designs

• Optimal versus heuristic
– Active Path First (APF) heuristic
    • Find multicast tree from one SHE
    • Remove all the SRLGs used in the first tree
    • Find second multicast tree from remaining SHE

• Reduced reliability
– Link diverse (Link-Div)
    • Find link-diverse paths connecting 2 SHEs and 40 VHOs
– Source diverse (Src-Div)
    • Find two multicast trees
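The three APF steps can be sketched as follows. This is a simplified illustration, not the heuristic as evaluated: it substitutes a Dijkstra shortest-path tree for the minimum cost multicast tree, and the data layout (`links` mapping an undirected link to its cost, `srlg_of` mapping a link to its set of SRLGs) and all names are assumptions.

```python
import heapq


def shortest_path_tree(links, source, targets):
    """Links on the Dijkstra shortest paths from `source` to every target.

    `links` maps an undirected link (u, v) -> cost. Returns None if some
    target cannot be reached.
    """
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, []).append((v, cost, (u, v)))
        adj.setdefault(v, []).append((u, cost, (u, v)))
    dist, via = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, cost, link in adj.get(u, []):
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                via[v] = (u, link)
                heapq.heappush(heap, (d + cost, v))
    tree = set()
    for t in targets:
        if t not in dist:
            return None
        while t != source:  # walk back to the source, collecting links
            t, link = via[t]
            tree.add(link)
    return tree


def active_path_first(links, srlg_of, she_a, she_b, vhos):
    """Build the first tree, drop every link sharing an SRLG with it, build the second."""
    first = shortest_path_tree(links, she_a, vhos)
    if first is None:
        return None
    used = set().union(*(srlg_of[link] for link in first))
    remaining = {l: c for l, c in links.items() if not (srlg_of[l] & used)}
    second = shortest_path_tree(remaining, she_b, vhos)
    if second is None:
        return None  # the first tree blocked every SRLG-diverse second tree
    return first, second


# Hypothetical topology: 2 SHEs (s1, s2), 2 VHOs (v1, v2), one SRLG per link.
toy_links = {("s1", "v1"): 1, ("s1", "v2"): 1, ("s2", "v1"): 1, ("s2", "v2"): 1}
toy_srlg = {link: {"g%d" % i} for i, link in enumerate(toy_links)}
trees = active_path_first(toy_links, toy_srlg, "s1", "s2", ["v1", "v2"])
```

Because the first tree is chosen without regard to the second, the function can return None even when an SRLG-diverse pair of trees exists, which is one reason an exact formulation can find solutions a sequential heuristic misses.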

32

More economical than the heuristic. Cost for increased reliability is affordable.

Cost Comparison Across Designs

[Figure: cost of each design, grouped into most reliable and reduced reliability]

33

Summary

• The first work to consider IPTV backbone design.
• 1+1 path protection routing problem modeled.
• Compact Integer Programming formulation proposed.
• IP formulation evaluated using realistic topologies.
– Real topologies amenable to our method
– Cost gain against heuristics
– SRLG diversity shown affordable

34

• Publications
– Meeyoung Cha, Gagan Choudhury, Jennifer Yates, Aman Shaikh, Sue Moon, “Case Study: Resilient Backbone Design for IPTV Services”, Proc. WWW IPTV workshop, May 2006
– Meeyoung Cha, W. Art Chaovalitwongse, Zihui Ge, Jennifer Yates, Sue Moon, “Path Protection Routing with SRLG Constraints to Support IPTV in WDM Mesh Networks”, Proc. IEEE Global Internet Symposium, Apr 2006

• Talks
– AT&T research labs, Feb 2006
– IEEE Global Internet, Apr 2006
– WWW IPTV workshop, May 2006

• Action items
– Journal version in preparation
– Incorporate practical considerations and develop new algorithms

Accomplishments

35

Part 4: Summary and Future Work

36

• Point-to-point communication
– VoIP, online-gaming, VPN applications
– Disjoint overlay paths for robustness
– Relay placement algorithms
– Extensive analyses

• Point-to-multipoint communication
– IPTV application
– Shared Risk Link Group (SRLG) consideration
– New Integer Programming (IP) model
– Extensive analyses

Summary: Resilient Design Architecture

What I have done:

37

• Relay architecture
– Implementation issues
    • Protocol design
    • Router support
    • Billing issues

• Relay placement in inter-domain setting
– Border Gateway Protocol (BGP) path
    • Inference
    • Asymmetries

• Lower layer path diversity
– Incorporate Shared Risk Link Group (SRLG)

Future Work: Point-to-point

What I am going to do:

38

• Network design: source placement
– Given fixed destinations, where to place sources?
– New IP formulation, new algorithms, …

• Guaranteed path performance
– Can we guarantee latency bounds on paths?

• Plasticity and scalability
– Adding more sources and destinations

Future Work: Point-to-multipoint

What I am going to do:

39

• Dynamic change of service requirements
– Change of topology, demand, service expansions
– How to incorporate changes?

• Optimal solutions may be too expensive or infeasible

• What are good heuristics?
– Fast convergence, easy to parallelize

Real-world Service Considerations

sub-optimal from here

Example of immediate work:

40

• Step 1: Pool of feasible solutions
– Use fast heuristic with random parameters

• Step 2: Sort solutions (1st generation)
– Use cost functions to evaluate solutions

Improvement of Existing Algorithm 1

[Figure: feasible solutions S1, S2, …, Sn sorted by cost into Best (20%), Mid (75%), and Worst (5%)]

Genetic Algorithm (GA) based heuristic:

41

• Step 3: Mutate parameters
– Mix best with mid or worst

• Step 4: Sort solutions (2nd generation)

• Step 5: Repeat steps 3 and 4
– Until no improvement found

[Figure: the parameters of the sorted solutions (Best 20%, Mid 75%, Worst 5%) are mutated to produce new feasible solutions (e.g., S1’, S3’, Sn’), which are sorted again into Best, Mid, and Worst]

Improvement of Existing Algorithm 2

Genetic Algorithm (GA) based heuristic:
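The five steps can be sketched as a generic loop. This is an illustrative sketch under several assumptions that are not in the slides: a solution is represented only by its parameter vector, `evaluate(params)` stands for running the fast heuristic with those parameters and returning the cost of the resulting solution, mixing is done by picking each parameter from one of two parents, and the search stops after `patience` generations without improvement.

```python
import random


def ga_search(evaluate, n_params, pool_size=20, patience=3, seed=0):
    """Genetic-algorithm style search over heuristic parameters.

    Returns the best parameter vector found and its cost.
    """
    rng = random.Random(seed)
    # Step 1: pool of candidates generated from random parameters.
    pool = [[rng.random() for _ in range(n_params)] for _ in range(pool_size)]
    # Step 2: sort by cost (lowest first).
    pool.sort(key=evaluate)
    best_cost = evaluate(pool[0])
    n_best = max(1, pool_size // 5)  # top 20% of the pool
    stale = 0
    while stale < patience:
        # Step 3: mix the parameters of a "best" parent with a mid or worst parent.
        children = []
        for _ in range(pool_size):
            a = rng.choice(pool[:n_best])
            b = rng.choice(pool[n_best:])
            children.append([x if rng.random() < 0.5 else y for x, y in zip(a, b)])
        # Step 4: sort the new generation, keeping the previous best unchanged.
        pool = sorted(pool[:n_best] + children, key=evaluate)[:pool_size]
        # Step 5: repeat until no improvement is found.
        cost = evaluate(pool[0])
        if cost < best_cost:
            best_cost, stale = cost, 0
        else:
            stale += 1
    return pool[0], best_cost


def toy_cost(params):
    """Hypothetical cost function: squared distance of each parameter from 0.5."""
    return sum((x - 0.5) ** 2 for x in params)


best_params, best_cost = ga_search(toy_cost, n_params=4)
```

Carrying the best 20% into the next generation means the best cost never gets worse from one generation to the next, at the price of somewhat less diversity in the pool.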

42

Thank you very much!
