survivable ip network design with ospf routing · survivable ip network design with ospf routing...

SURVIVABLE IP NETWORK DESIGN WITH OSPF ROUTING

L.S. BURIOL, M.G.C. RESENDE, AND M. THORUP

Abstract. Internet protocol (IP) traffic follows rules established by routing

protocols. Shortest path based protocols, such as Open Shortest Path First

(OSPF), direct traffic based on arc weights assigned by the network operator.

Each router computes shortest paths and creates destination tables used for

routing flow on the shortest paths. If a router has multiple outgoing links on

shortest paths to a given destination, it splits traffic evenly over these links. It

is also the role of the routing protocol to specify how the network should react

to changes in the network topology, such as arc or router failures. In such

situations, IP traffic is re-routed through the shortest paths not traversing theaffected part of the network. This paper addresses the issue of assigning OSPF

weights and multiplicities to each arc, aiming to design efficient OSPF-routednetworks with minimum total weighted multiplicity (multiplicity multiplied bythe arc length) needed to route the required demand and handle any single arcor router failure. The multiplicities are limited to a discrete set of values andwe assume that the topology is given. We propose an evolutionary algorithmfor this problem, and present results applying it to several real-world probleminstances.

1. Introduction

The Internet is the global network of interconnected communication networks,made up of routers and links connecting the routers. On a network level, theInternet is built up of several autonomous systems (ASs) that typically fall under theadministration of a single institution, such as a company, a university, or a serviceprovider. Routing within a single AS is done by a Interior Gateway Protocol (IGP),while an Exterior Gateway Protocol (EGP) is used to route traffic flow betweenASs. IP traffic is routed in small chunks called packets. A routing table instructsthe router how to forward packets. Given a packet with an IP destination addressin its header, the router retrieves from the table the IP address of the packet’s nexthop.

OSPF (Open Shortest Path First) is the most used intra-domain routing pro-tocol. For this protocol, an integer weight must be assigned to each arc, and theentire network topology and weights are known to each router. With this informa-tion, each router computes the graphs of shortest (weight) paths from each otherrouter in the AS to itself. The graphs need not be trees, since all shortest pathsbetween two routers need to be considered. Demands are routed on the correspond-ing shortest path graphs. At each router s, the total demand leaving this node anddestined to a target router t is evenly split among all links outgoing from router s

Date: September 6, 2004.Key words and phrases. Internet, OSPF routing, network design, survivability, evolutionary

algorithm.AT&T Labs Research Technical Report TD-64KUAW.

1

2 L.S. BURIOL, M.G.C. RESENDE, AND M. THORUP

on the shortest path graphs ending at t. This demand not only consists of demandoriginating at s, but also of demand passing through s on its way to t.

The arcs weights are assigned by the AS operator. The lower the weight, thegreater the chance that traffic will get routed on that arc. Different weight settingslead to different traffic flow patterns. Weights can be set to optimize networkperformance, such as minimize congestion [5, 8, 9], as well as to minimize networkdesign cost. In this paper, we address the latter case.

Given a network topology and predicted traffic demands, the OSPF networkdesign problem is to find a set of OSPF weights that minimizes network cost. Moreprecisely, we are given a directed network G = (N,A), where N is the set of routersand A is the set of potential arcs where capacity can be installed, and a demandmatrix D that, for each pair (s, t) ∈ N×N , specifies the demand dst between s andt. The arc multiplicity of an arc a is the number of parallel links associated with arca. We want to determine a positive integer weight wa ∈ [1, 65535], as well as themultiplicity, for each arc a ∈ A, such that the network cost is minimized. OSPFweights are restricted to be identical for all links on the same arc. Furthermore,we consider each link capacity to be fixed and equal to M . Therefore, each arccapacity is limited to a discrete set of values

0,M, 2M, 3M, . . . .

Multiplicities are determined such that all of the demand can be feasibly routed onthe network, i.e. no arc load exceeds the arc’s capacity. For quality of service (QoS)purposes, the definition of feasible route is slightly different. A feasibly routed flowis such that no arc load exceeds a fraction ρ (0 < ρ ≥ 1) of the arc’s capacity.

For traffic splitting purposes, each parallel link is considered as an independentlink. For clarity, we use arc for naming the structure installed between two nodesand link for each capacitated link inserted in this structure. As an example,consider Figure 1, where a load going through router u is destined to a target routert (not shown in the figure). Arcs (−−→u, v1) and (−−→u, v4) belong to the shortest path graphto destination t and arcs (−−→u, v2) and (−−→u, v3) do not. Let the arc multiplicities of(−−→u, v1) and (−−→u, v4) be 1 and 3, respectively. Then, one fourth of the load will berouted on arc (−−→u, v1), which has one link, and three fourths will be routed on arc(−−→u, v4), which has three links.PSfrag replacements

uu u

v1v1 v1

v2v2 v2

v3v3 v3

v4v4 v4

M = 1M = 1 M = 1

M = 3M = 3 M = 3

loadload load

Figure 1. Load splitting for arc multiplicities. Left: structureof outgoing arcs of node u; Middle: structure considering the arc

concept; Right: structure considering link concept.

Since failures can occur in either arcs or routers, it is desirable to design IPnetworks that are survivable subject to these types of failures. To overcome thecomplexity associated with generating all possible combinations of failures, we limit

SURVIVABLE IP NETWORK DESIGN WITH OSPF ROUTING 3

140000

160000

180000

200000

220000

240000

260000

280000

300000

320000

0.5 0.6 0.7 0.8 0.9 1

netw

ork

cost

PSfrag

replacem

ents

QoS parameter ρf

Figure 2. Effect of varying QoS parameter ρf on network costfor a 74-router, 278-arc network (net-3 in Section 5) with singlearc failure. Five independent runs were done for each parametervalue. The line connects average network cost values. Runs usedρn = .8ρf .

ourselves to single arc or single router failure. Since links on a given arc are in somesense dependent, we consider in this paper single arc failure instead of single linkfailure. Therefore, when an arc fails, all links associated with that arc fail. Whena router fails, all arcs incoming and outgoing to/from this router are disabled.Furthermore, all demands having this router as source or destination are discarded.We also assume that no single arc failure disconnects the network. A failure usuallychanges one or more shortest path graphs in the network. Consequently, demandmay be rerouted on different paths. If there is sufficient capacity such that all ofthe demand can be feasibly rerouted for all possible single arc and single routerfailures, the network is called survivable. In case of failures, the fraction of the arc’scapacity limiting the load is usually larger than the fraction used to define a feasibleflow for the non-failure case. For example, in the case of no failure, a flow would befeasible if no arc load exceeds 70% of the arc’s capacity, while for the case of singlefailure, this fraction could be, say, 90%. These values, which we call ρn and ρf forthe non-failure and failure cases, respectively, depend on the network’s quality ofservice requirements. A high quality of service is associated with a low fraction.Similarly, as shown in Figure 2, the cost of the network is inversely proportional tothe values of the fractions.

Several papers on OSPF routing and on survivable IP network design have re-cently appeared in the literature. Fortz and Thorup [9] propose a local search


algorithm to determine OSPF weights to minimize network congestion. Ericsson,Resende, and Pardalos [8] optimize the same objective function, but with a geneticalgorithm. A local search is added to a similar genetic algorithm in Buriol, Resende,Ribeiro, and Thorup [5]. Bley, Grotschel, and Wessasly [3] address survivable IPnetwork design for single arc or router failure. Capacities are assigned so that aspecified percentage of each demand is satisfied for any single router or arc failure.Capacities are installed in multiples of a capacity unit and can vary between aminimum value and a maximum value if the arc is utilized. Maximum hop countis imposed on each route. Arcs, in this paper, have a single link. Furthermore,routing is done on a single shortest path, i.e. load splitting is not implemented.Link weights are adjusted by local search to minimize the total cost of the network,which depends on the installed capacities. Bley [2] considers a similar problem, butalso takes into account hardware considerations that affect solution feasibility andcost. A solution method based on Lagrangian Relaxation is used. Holmberg andYuan [12] use simulated annealing to determine OSPF weights and arc capacitiesto minimize network cost based on a fixed charge and variable cost model. OSPFrouting with load splitting is used. Arcs in this paper, however, have a singlelink. Brostrom and Holmberg [4] use a similar simulated annealing approach, buttackle the problem of minimizing network cost while maximizing a measure of net-work survivability. Only arc failures are considered. A mixed integer programmingformulation is given. As with the papers above, arcs have a single link.

In this paper, we present a genetic algorithm for finding good-quality solutionsto the survivable network design problem for single arc or single router failureswhere arc multiplicities are considered. When simulating arc or router failures,the shortest path graphs are updated using dynamic shortest path algorithms [6],instead of recomputing the shortest path graphs from scratch. Moreover, OSPFweights are computed and OSPF routing is affected by arc multiplicities. Severalextensions to the basic model are proposed, including identical or different OSPFweights on symmetric arcs, identical or different multiplicities for symmetric arcs,different objective functions, arc weight range, and latency constraints.

The paper is organized as follows. In Section 2, we describe a general evolution-ary algorithm for weight setting in OSPF routing. This algorithm calls a procedurethat, given a weight setting, routes the demands to determine arc multiplicities andcomputes the cost of the network. This procedure is described in Section 3. In Sec-tion 4, we add survivability to the network design process. Computational resultson “real-world” and artificial networks are presented in Section 5. Extensions andconcluding remarks are given in Section 6.

2. Evolutionary algorithm for weight setting in OSPF routing

Evolutionary algorithms, such as genetic algorithms [10, 11], evolve a set ofsolutions (the population) over time (generations) by combining pairs of solutions(mating) from one generation and randomly perturbing them (mutation) to producenew solutions (offspring) that make up the population of the following generation.The process is repeated for a fixed number of generations and the best solution inthe last generation is returned by the algorithm as an approximate solution of theoptimum.

Ericsson, Resende, and Pardalos [8] presented a genetic algorithm for settingOSPF weight in an IP network with known link capacities. The objective function


0

50

100

150

200

250

300

350

400

450

500

50 100 150 200 250 300 350 400 450 500

CP

U ti

me

(1.5

GH

z Ita

nium

-2 s

econ

ds)

population size

50 generations100 generations200 generations500 generations

1000 generations

Figure 3. Genetic algorithm running time as a function of num-ber of population size for 50, 100, 200, 500, and 500 generations.Experiment done on a 74-router, 278-arc network (net-3 in Sec-tion 5) with no router or arc failure.

used was the one proposed by Fortz and Thorup [9] which increasingly penalizesloads that approach and go over link capacities. The same problem was addressedin Buriol, Resende, Ribeiro, and Thorup [5], where a local search procedure wasapplied to the resulting offspring after mating.

Our algorithm uses the same genetic algorithm structure described in [5, 8], butit differs in how the solution is evaluated, besides it considers single arc or routerfailures. In this section, we describe our genetic algorithm optimizing a generalfunction f(w). In the next section, we show how we compute this function. We donot consider survivability issues yet and address those issues later, in Section 4.

Each of the p elements of the population is an integer weight vector, characteriz-ing a solution. Each arc a ∈ A has associated with it an integer weight wa ∈ [1, w].Initially, all but one of the solutions are made up of uniformly randomly generatedweights. The remaining solution is made up of unit weights.

The algorithm is run for N generations. We use the random keys crossoverscheme of Bean [1] for mating and mutation. At each generation, the solutionsare evaluated and sorted in increasing order of objective function value. The NA

solutions with the smallest costs are put in class A, while the NC solutions with thelargest costs are placed in class C. The remaining solutions are assigned to classB.


100000

105000

110000

115000

120000

125000

130000

135000

140000

145000

150000

100 200 300 400 500 600 700 800 900 1000

netw

ork

cost

number of generations

10 elements30 elements50 elements70 elements90 elements

150 elements250 elements500 elements

Figure 4. Network cost as a function of number of genetic al-gorithm generations for population sizes 10, 30, 50, 70, 90, 150,250, and 500 elements. Experiment done on a 74-router, 278-arcnetwork (net-3 in Section 5) with no router or arc failure.

The next generation is produced as follows. All solutions in class A are automat-ically promoted, as is, to the next generation. All solutions in class C are discardedand replaced by random weight solutions in the next generation. The remainingNB solutions of the next generation are generated as follows. To produce each off-spring, one parent (p1) is selected at random (with replacement) from the elementsin class A and one (p2) from B∪C. The i-th weight of the offspring will be the i-thweight of parent p1 with probability π1 > 1/2, the i-th weight of parent p2 withprobability π2 < 1− π1, or a random weight in the interval [1, w] with probability1− π1 − π2.

Several parameters must be set. In Section 5, where computational results aredescribed, we define these values. Guidelines for setting these parameters are asfollows. The larger the population size p, the longer will each generation take to becomputed (see Figure 3). We also expect solution quality to improve by increasingthe number of generations N and/or increasing the population size p (see Figures 4and 5). Of course, running time increases with number of generations (see Figure 6).The mating/mutation probabilities π1 and π2 should be such that π1 > π2 and1− π1 − π2 ≈ 0. NA, NB , and NC should be such that NB > NA > NC .


100000

105000

110000

115000

120000

125000

130000

135000

140000

145000

150000

50 100 150 200 250 300 350 400 450 500

netw

ork

cost

population size

50 generations100 generations200 generations500 generations

1000 generations

Figure 5. Network cost as a function of number of genetic algo-rithm population size for 50, 100, 200, 500, and 500 generations.Experiment done on a 74-router, 278-arc network (net-3 in Sec-tion 5) with no router or arc failure.

3. Heuristic for computing arc multiplicities

In this section, we describe a heuristic that computes arc multiplicities givena topology of potential arcs, lengths and capacities associated with each arc (alllinks of the same arc a have identical capacities ca and lengths lengtha), OSPF arcweights, and a demand matrix. This heuristic not only returns the arc multiplicities,but also computes the network cost f(w) that guides the genetic algorithm ofSection 2. In this section, we describe the heuristic for the no-failure case. Apseudo-code for the heuristic is described. Later, in Section 4, we consider thefull procedure, with single failures. In all pseudo-code, the parameters for somefunctions are omitted for clarity.

Let T be the set of destination routers. Each shortest path graph gt, withdestination t ∈ T , has an |A|-vector, Lt, associated with the arcs, that stores thepartial loads flowing to t traversing each arc a ∈ A. The total load from each arcis represented in the |A|-vector l, which stores the total load traversing each arca ∈ A. For each destination t, the |N |-vectors πt and δt are associated with thenodes. The distance from each node to t is stored in πt, while δt keeps the numberof arc multiplicities (links) outgoing from each node in gt.

The heuristic first routes the demand using OSPF routing rules and assumingeach arc has unit multiplicity. Arc loads are computed and required multiplicities


0

50

100

150

200

250

300

350

400

450

500

100 200 300 400 500 600 700 800 900 1000

CP

U ti

me

(1.5

GH

z Ita

nium

-2 s

econ

ds)

number of generations

10 elements30 elements50 elements70 elements90 elements

150 elements250 elements500 elements

Figure 6. Genetic algorithm running time as a function of num-ber of generations for population sizes 10, 30, 50, 70, 90, 150, 250,and 500 elements. Experiment done on a 74-router, 278-arc net-work (net-3 in Section 5) with no router or arc failure.

are determined. Demand is rerouted assumming the updated multiplicities andupdated arc load are determined. This cycle (routing, computing arc loads, deter-mining multiplicities) is repeated until the demand can be feasibly routed using thecurrent multiplicities.

Figure 7 shows the pseudo-code for the heuristic for computing arc multiplicities.For each potential arc a ∈ A, the arc’s multiplicity µa is initially set to 1 in line 1.For each demand destination t ∈ T , the shortest path graph distances are computedfrom scratch, using Dijkstra’s algorithm [7], in procedure ReverseDijkstra (line 3).Given the weights w and the shortest path distances, the shortest path graph isidentified by procedure ComputeSPG in line 4. Given the shortest path graph gt,δt is computed in line 5 by ComputeDelta. OSPF routes with load splitting arecomputed in line 6 and the partial loads vector Lt is determined by procedureComputePartialLoads for each arc in the shortest path graph gt. The total load lon an arc, computed in line 8 by procedure ComputeTotalLoads, is the sum of allof partial loads. In the implementation, the total loads are actually computed inprocedure ComputePartialLoads.

Next, in line 9, for each arc a ∈ A, the multiplicity µa, and consequently δa, areupdated in UpdateMultandDelta according to the total load la on the arc. Theupdated multiplicity of arc a is the maximum of its current multiplicity µa and the


procedure EvaluateSolution(w, lf , rf )1 forall a ∈ A do µa = 1;2 forall t ∈ T do

3 πt ← ReverseDijkstra(w);4 gt ← ComputeSPG(w, πt);5 δt ← ComputeDelta(gt);6 Lt ← ComputePartialLoads(µ, δ, π, gt);7 end forall

8 l← ComputeTotalLoads(L);9 S ← UpdateMultiandDelta();10 if |S| > 0 UpdateSolution();11 forall a ∈ A if la = 0 then µa = 0;12 f ←

∑

a∈A µa ∗ lengtha;13 return(f , µ);end procedure

Figure 7. Pseudo-code for heuristic for computing arc multiplicities.

minimum multiplicity required to route load la on arc a, dla/(ρ · cae), where ρ issuch that 0 < ρ ≤ 1 and defines the portion of the capacity that can be used, i.e.µa ← max{µa, dla/(ρ ·ca)e}. The arcs whose multiplicities were updated are placedin a set S and the loads are updated in procedure UpdateSolution, described next.In line 11, the arcs with no load have their multiplicities set to 0 and in line 12, thesolution cost is computed as the sum of the products of arc multiplicities and thecorresponding arc lengths. This value, as well as the arc multiplicities, are returnedin line 13.

Pseudo-code for procedure UpdateSolution is given in Figure 8. For each short-est path graph gt, the loads are updated if at least one of the arcs in S belongs togt. In this case, the partial loads are not recomputed from scratch, but are simplyupdated. In line 4 of UpdateSolution, tail nodes u of all arcs (−→u, v) ∈ S belongingto gt are inserted in a priority queue indexed by its distance to t. Nodes u areremoved from the heap in line 6 and considered one by one until the heap is empty.The test in line 7 is only used when considering failures and is always true for theno-failure case. In line 8, the load is evenly split, considering arc multiplicities. Theload σ is computed as the ratio between the sum of the load leaving node u andthe load passing through node u and δt

u. All arcs (−→u, v) ∈ gt outgoing from u havetheir new loads (λ) computed, and if they have changed, the loads are updated inlines 13-14 and node v is inserted in the heap (line 15).

After the loads are updated, some multiplicities may change. In line 21, proce-dure UpdateMultiandDelta computes the set of arcs S for which multiplicities havechanged, and updates the vectors µ and δ. While set S contains at least one arc,the loop from line 2 to line 20 is repeated. In Section 5, we record the number oftimes that loop 2–20 is repeated and present statistics indicating that this numberis small.


procedure UpdateSolution()1 do

2 forall t ∈ T do

3 H ← {};4 forall e = (−→u, v) ∈ S ∩ gt do InsertIntoHeapMax(H,u, πt

u);5 while HeapSize(H) > 0 do

6 u← FindAndDeleteMax(H);7 if πt

u 6=∞ then

8 σ ←(

Dtu +

∑

a∈gt∩IN(u) Lta

)

/δtu;

9 forall e = (u, v) ∈ OUT(u) do

10 if e 6∈ gt then λ← 0;11 else λ← µe ∗ σ;12 if λ 6= Lt

e then

13 le ← le − Lte + λ;

14 Lte ← λ;

15 InsertIntoHeapMax(H, v, πtv);

16 end if

17 end forall

18 end if

19 end while

20 end forall

21 S ← UpdateMultiandDelta();22 until |S| = 0;end procedure

Figure 8. Pseudo-code for the procedure that updates the solution.

4. Survivable network design

In this section, we add survivability to the network design process. We con-sider single arc, single router, and single arc or single router failure. The differ-ence between the genetic algorithm for the no-failure case and the ones for thesingle failure cases is the procedure EvaluateSolution. We describe changes toEvaluateSolution and present the procedure that simulates these failures.

Unlike the procedure in the previous section, the solution evaluation not onlycomputes the multiplicities for the no-failure case, but also updates the multiplic-ities considering every single failure. These changes are shown in the pseudo-codein Figure 9.

Lines 9 and 10 of the pseudo-code of the no-failure version of EvaluateSolutionin Figure 7 have been substituted by lines 9 to 21 in the complete version in Figure 9.Furthermore, in line 23, if in no situation (no-failure or any single failure) arc a hasa positive load, then the arc’s multiplicity µa is set to 0. In the pseudo-code, themaximum load on arc a over no-failure and all single failure simulations is la.

In lines 9 to 21, the multiplicities are updated considering the cases of no-failure,single arc failures, and single router failures, consecutively. In line 10, the multi-plicities are updated for the no-failure case UpdateMultiandDelta. Here, the QoSparameter ρn is used. A change in multiplicity may cause a change in routing which


can consequently cause a change in arc loads. A change in an arc load can leadto another change in the arc’s multiplicity. For this reason, the steps in lines 10to 20 are repeated in a circular loop until no further change in arc loads or mul-tiplicities is detected in a full cycle of the loop by procedure NoMoreChanges. InSection 5, we study the number of times that procedure NoMoreChanges is calledand consequently the number of times that the loop in lines 9–21 is executed.

Single arc failures and single router failures are simulated by the same pro-cedure, SimulateFail. This procedure has two input parameters: a set F of setsF1, F2, . . . , Fq of arcs that are consecutively removed from the graph during the sim-ulation; and the cardinality q of F . For the single arc failure case, SimulateFailfirst takes as input a set of sets Fa of single arcs, where each Fa consists of arca. The second parameter is m, the cardinality of F . For single router failure,SimulateFail takes as input a set F of sets Fi and cardinality n of F . Each Fi inthis case is the set of all incoming and outgoing arcs to and from router i.

procedure EvaluateSolution(w, lf , rf )1 forall a ∈ A do µa = 1;2 forall t ∈ T do

3 πt ← ReverseDijkstra(w);4 gt ← ComputeSPGandLoad(w, πt);5 δt ← ComputeDelta(gt);6 Lt ← ComputePartialLoads(µ, δ, π, gt);7 end forall

8 l← ComputeTotalLoads(L);9 while 1 do

10 S ← UpdateMultiandDelta();11 if |S| > 0 UpdateSolution();12 if NoMoreChanges() then goto OUTLOOP;

13 if lf = 1 do

14 simulateFail(A,m);15 if NoMoreChanges() then goto OUTLOOP;

16 end if

17 if rf = 1 do

18 simulateFail(R,n);19 if NoMoreChanges() then goto OUTLOOP;

20 end if

21 end while

22 OUTLOOP:

23 forall a ∈ A if la = 0 then µa = 0;24 f ←

∑

a∈A µa ∗ lengtha;25 return(f , µ);end procedure

Figure 9. Pseudo-code for updating a solution considering failures.

We conclude this section, with a description of failure simulation, which is de-scribed in the pseudo-code in Figure 10. In the loop in lines 2 to 15 of the pseudo-code, the failure of each set Fi (i = 1, . . . , q) is simulated. The loop is repeated


procedure SimulateFail(F = {F1, F2, . . . , Fq}, q)1 i = 1;2 while NoMoreChanges() do

3 CopySolution();4 forall a ∈ Fi do wa ← wa;5 forall a ∈ Fi do wa ←∞;6 UpdateSPGandLoad();7 forall a ∈ Fi do wa ← wa;8 S ← UpdateMultiandDelta();9 if |S| > 1 then

10 forall t ∈ T do UpdateDelta(δt, gt);11 UpdateSolution();12 end if

13 if i < q then i← i + 1;14 else i← 1;15 end while

end procedure

Figure 10. Pseudo-code for updating the multiplicities and loadssimulating a given set of failures.

until one pass is completed over the entire set F without causing any change inthe arc multiplicities and consequently in the arc loads. This condition is tested inNoMoreChanges. For each simulation, the current solution is copied to an auxiliarysolution in line 3. The current weights of arcs a ∈ Fi are saved in the auxiliaryvector w (line 4) and are set to infinity (line 5). Procedure UpdateSPGandLoad up-dates the shortest path graphs and the total and partial loads of the copied solutionwith weights w used to simulate the failure of arcs a ∈ Fi. The weights are resetto their original values in line 7 and in line 8 the multiplicities are updated for thecopied solution. The QoS parameter ρf is used in UpdateMultiandDelta. If atleast one multiplicity has changed, then for each t ∈ T , the original δt, Lt, and lt

are updated in lines 10 and 11. The counter i is either incremented in line 13 orreset to 1 in line 14 to force the loop to cycle through all of the sets F1, F2, . . . , Fq.

There is another way to simulate failures. Instead of copying the current solution(l and gt, πt, δt, Lt, for each t ∈ T ) to the auxiliary solution (line 3), we simulateeach failure by modifiying the original solution and then undo the modification atthe end of the loop. We implemented and tested this alternative approach, butsuprisingly, is was computationally less efficient than the one we adopted. Perhapsthis is due to the fact that copying a block of memory is done very efficiently bymodern hardware.

In Section 5, we study the number of times that the failure of set Fi is simulated,discriminating between arc failure and router failure simulations.

5. Computational results

We describe computational experiments with a C language implementation ofthe algorithm described in this paper on four test problems derived from real-worldIP networks. The dimensions of the four instances are summarized in Table 1.


Table 1. Test problem network dimensions. For each of the fournetworks used in the computational experiments, this table liststhe number of routers (|N |), the number of potential arcs (|A|),the number of distinct destination routers taking part in the listof demands (|T |), and the number of demand pairs (|D|).

network |N | |A| |T | |D|net-1 10 90 10 90net-2 11 110 11 110net-3 74 278 18 306net-4 71 350 71 4960

The first two instances (net-1 and net-2) are dense networks in which there isdemand between all pairs of routers. The other two instances (net-3 and net-4)are much larger. Both involve sparse networks. In instance net-3, 18 of the 74nodes are destination routers of demand pairs. Because of this, the algorithm workswith only 18 shortest path graphs per solution. In the other instance (net-4), allnodes are destination routers of demand pairs and almost all router pairs havea demand associated with them. For each instance in this study, all links haveidentical capacities.

The C program was compiled with the gcc compiler, version 3.2.3 with optimiza-tion flag -O3 and run on a SGI Altix 3700 Supercluster running RedHat AdvancedServer with SGI ProPack. The cluster is configured with 32 1.5-GHz Itanium-2processors (Rev. 5) and 245 Gb of main memory. Each run was limited to a sin-gle processor. User running times were measured with the getrusage system call.Running times exclude problem input.

As mentioned in Section 2, solution quality improves with population size andnumber of generations. On the other hand, running times increase as population sizeand number of generations increase. Throughout these computational experiments,we use a population of size p = 50, mating/mutation probabilities π1 = .7 andπ2 = .29, and define classes A, B, and C with NA = .25p, NB = .7p, and NC = .05pelements, respectively. Weights can take values in the interval [1, w = 20]. Withrespect to the QoS parameters, we use ρn = 0.8 for no-failure routing and ρf = 0.95for the cases where failures occur. On all experiments, the objective function issum of the weighted multiplicities, where the weighted multiplicity of an arc is themultiplicity of the arc multiplied by its length.

Figure 11 shows network cost as a function of running time for four runs on net-work net-3. The algorithm was run for no-failure, single-arc failure, single-routerfailure, and single-arc/single-router failure. Each run was limited to 10,000 seconds.The figure shows that the algorithm produced designs having costs: 104,528.32 forthe no-failure case, 168,801.88 for single-arc failure, 172,570.86 for single-routerfailure, and 185,709.16 for single-arc/single-router failure. Hence, to achieve pro-tection from single-arc and single-router failures on net-3 results in about 78%more network cost than when survivability is not required. Figure 12 illustratesthe progress of the best solutions produced by the genetic algorithm. The figureplots the relative errors of each run as a function of CPU time. Errors are computedrelative to the value of the best solution found during each run, i.e. all relative er-ror are zero at time 10,000 seconds. The figure shows that for the no-failure run,


100000

120000

140000

160000

180000

200000

220000

240000

260000

280000

0.1 1 10 100 1000 10000

netw

ork

cost

CPU time (1.5 GHz Itanium 2 seconds)

no failure

single router failure

single router/arc failure

single arc failure

Figure 11. Network design cost for genetic algorithm runs for10,000 seconds on instance net-3 for no-failure, single-router fail-ure, single-arc failure, and single-arc or single-router failure.

the algorithm produces solutions within 10% and 1% of the best solution found(104,528.32) in 6.86 and 1921.53 seconds, respectively. For the single-arc failurerun, solutions within 10% and 1% of the best solution found (168,801.88) are foundin, respectively, 374.11 and 3365.66 seconds. In 85.18 and 6925.90 seconds, thealgorithm produces solutions within 10% and 1%, respectively, of the best solutionfound (172,570.86) for the single-router failure run. Finally, for the single-arc /single-router failure run, the algorithm produces solutions within 10% and 1% ofthe best solution found (185,709.16) in 356.78 and 9249.97 seconds, respectively.

In the next experiment, the genetic algorithm (GA) was run on each of thefour test problems using five different random number generator seeds. For prob-lems net-1 and net-2, the algorithm was run for 200 generations, whereas for thelarger problems net-3 and net-4, the number of generations was fixed at 100. Wecompare the average network cost for weights computed by the genetic algorithmwith average network cost produced using unit and random (rand) weights. Forboth random and unit weight solutions, we apply the multiplicity setting heuristicsdescribed in Sections 3 and 4, i.e. repeatedly, demand is routed following OSPFrules, and loads and multiplicities are computed, until there is enough capacity sothat all of the demand can be feasibly routed. Costs for random weight solutionswere calculated by computing 245 random weight designs (five independent runswith 49 random solutions each) and taking the average cost of these designs. We


0.0001

0.001

0.01

0.1

1

0.1 1 10 100 1000 10000

rela

tive

erro

r

CPU time (1.5 GHz Itanium-2 seconds)

no failure

single router failure

single arc failure

single router/arc failure

Figure 12. Relative error with respect to the best solution foundfor genetic algorithm runs for 10,000 seconds on instance net-3 forno-failure, single-router failure, single-arc failure, and single-arc orsingle-router failure.

compute the network costs for the four combinations of arc and router failure /no-failure. We also compute a lower bound (LB) on the network cost as follows.For each target node, we determine a shortest path graph using lengths as arcweights. Demands are routed on shortest paths assuming unit multiplicities andarc loads are computed. New multiplicities are computed as in Section 3, exceptthat their values are not rounded up to an integer, but instead are taken as realnumbers. The arc cost is the product of its real-valued multiplicity and its length.A lower bound on the network design cost is the sum of all arc costs. The resultsare summarized in Table 2, where for each instance, four sets of costs are shown:no arc or router failure, single router failure and no arc failure, single arc failureand no router failure, and both single arc and single router failure. Average CPUtimes for the genetic algorithm (in 1.5-GHz Itanium-2 seconds) are also listed inthe table. Figures 13 and 14 show plots of the data in Table 2.

In almost all cases, the genetic algorithm produced designs where network costincreased when additional failures were considered. The only exception was onnet-1, where the cost for single-arc failure and no router failure was 2773.0, whileit decreased to 2766.8 when single-router failures were also considered. This isprobably due to the fact that demand originating or terminating at failed routersis discarded in the cost computation.


Table 2. Average network cost for random, unit, and genetic algo-rithm arc weight solutions, lower bound cost and genetic algorithmrunning times (seconds on an 1.5-GHz Itanium-2 processor) for theexperiments with no failures, and all single failure combinations.Values are averaged over five independent runs.

failure avg network cost avg time

network arc router random unit GA LB GA

net-1 no no 6202.9 4162.4 1597.1 690.7 1.48

no yes 8894.2 6591.5 2512.8 690.7 7.75

yes no 9126.7 6443.4 2773.0 841.7 34.70

yes yes 9239.3 6964.4 2766.8 841.7 42.04

net-2 no no 2619.1 1805.4 558.8 209.4 1.76

no yes 3729.7 2770.8 1032.2 209.4 9.48

yes no 3748.6 2782.8 1041.4 223.6 46.92

yes yes 3795.9 2801.1 1081.4 223.6 54.77

net-3 no no 221008.7 183195.2 120708.1 51850.2 5.26

no yes 387739.1 264176.4 182941.6 83775.6 116.60

yes no 374835.9 257112.1 174780.8 80040.0 381.70

yes yes 396671.1 260940.4 185379.1 84178.2 470.29

net-4 no no 817779.0 655428.0 566632.0 171642.1 26.58

no yes 1374696.3 1040761.7 941359.4 224954.4 651.96

yes no 1282068.1 876604.1 818249.3 222062.7 2035.37

yes yes 1477997.7 1076690.4 987499.6 245750.0 3017.41

Table 3. Average network cost ratios of random and unit weightsolutions to genetic algorithm network cost and ratios betweenmean genetic algorithm network costs and lower bound. We as-sume that routers do not fail.

network arc failure rand/GA unit/GA GA/LB

net-1 no 3.88 2.61 2.31yes 3.29 2.32 3.29

net-2 no 4.69 3.23 2.67yes 3.60 2.67 4.66

net-3 no 1.83 1.52 2.33yes 2.14 1.47 2.18

net-4 no 1.44 1.16 3.30yes 1.57 1.07 3.68


0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

no/no no/yes yes/no yes/yes

aver

age

netw

ork

cost

single-arc failure / single-router failure

net-1

rand

unit

GA

LB

0

500

1000

1500

2000

2500

3000

3500

4000


aver

age

netw

ork

cost


net-2

rand

unit

GA

LB

Figure 13. Average network costs for instances net-1 and net-2

for random weights, unit weights, and weights determined by GA.A lower bound on network cost is also shown.


0

50000

100000

150000

200000

250000

300000

350000

400000


aver

age

netw

ork

cost


net-3

rand

unit

GA

LB

0

200000

400000

600000

800000

1e+06

1.2e+06

1.4e+06

1.6e+06


aver

age

netw

ork

cost


net-4

rand

unit

GA

LB

Figure 14. Average network costs for instances net-3 and net-4

for random weights, unit weights, and weights determined by GA.A lower bound on network cost is also shown.


Table 3 lists ratios of average network costs of random weight and genetic algo-rithm solutions and of average network costs of unit weight and genetic algorithmsolutions. The table also lists ratios of average genetic algorithm solutions and thelower bounds. The table considers only runs with no failure or single-arc failure.With respect to random weight solutions, the GA solutions were up to almost afactor of five smaller. Likewise, with respect to unit weight solutions, GA solu-tions were up to a factor of three smaller. The lower bounds do not appear to bevery strong with the GA solution reaching almost a factor of 5 of the lower bound.Recall, however, that these runs were limited to only 100 or 200 generations andused a small population of size 50. In contrast, the 1000 generation run with a 500element population of Figure 4 produced a solution with cost 100,880 for net-3

with no failures. This increases the ratio rand/GA from 1.83 to 2.19 and the ratiounit/GA from 1.52 to 1.82 and decreases the ratio GA/LB from 2.33 to 1.94.

We conclude this section with an empirical examination of the computationalcomplexity of the multiplicity setting heuristic described in Sections 3 and 4. We callconvergence loops the set of while loops that are executed in the multiplicity settingheuristic. For the purpose of our analysis, we call the convergence loops L1, L2, L3,and L4. Loop L1 is the while loop in lines 9 to 21 of procedure EvaluateSolution;loop L2 (L3) is the loop in lines 2 to 15 of procedure SimulateFail for arc (router)failure; and L4 is the loop in lines 1 to 22 of procedure UpdateSolution.

Figure 15 shows how the loops are related. Loop L1 calls all the other loops. Itcalls loop L4 in all four combinations of arc and router failure and no-failure. Forthe case of no arc or router failure, loop L1 and loop L4 are executed only onceduring solution evaluation. In the case of arc or router failure, loop L1 is executeduntil one round of no-failure, arc failure, and/or router failure is computed withoutany change in the solution. Since arc failures are simulated before router failures,loop L1 can terminate between simulations. Therefore, the number of calls to loopL2 is always at least as large as the number of calls to loop L3.

Loop L4 is called from the other loops only when at least one multiplicity haschanged. For example, in the last round of each arc or router failure simulation, themultiplicities do not change. Therefore, the number of calls to loop L4 is alwayssmaller than the sum of calls of the three other loops.

Table 5 presents the number of times loops L1, L2, L3, and L4 are called andexecuted for instances net-1 and net-4. The four combinations of arc and routerfailure and no-failure are considered. These counts were made in the experimentsreported in Table 2. For each failure combination, the table lists counts for eachpossible combination of loops. For each loop and instance, there are three columns.Column #called is the average number of times the loop was called; #passes is thenumber of times the loop was executed (with an exception in the case of loop L1),and max# is the maximum number of times the loop was executed by a call. Thefield #passes for loop L1 is the number of times that it calls the other convergenceloops. For example, as shown in the table, in the case of no arc or router failure,loop L1 will execute only once per call, and consequently max# is equal to one.

Since the genetic algorithm was run for 200 generations on the smaller instance(net-1), the #called entry for loop L1 is larger than it is for the larger instance(net-4), which was run for only 100 generations. Since there are 50 individu-als per population, then three random solutions and 34 solutions originated fromcrossover are evaluated per generation, which together with the evaluation of the


PSfrag replacements

uv1

v2

v3

v4

loop L1

loop L2

loop L3

loop L4

load

Figure 15. The relationship between loops in the multiplicity set-ting heuristics.

Table 4. Average number of calls, runs, and maximum runs percall of the convergence loops of networks net-1 and net-4.

failure net-1 net-4

arc router loop #called #passes max# #called #passes max#

no no L1 7413.0 7413.0 1.0 3713.0 3713.0 1.0

L4 7410.6 7924.4 3.2 3714.0 9213.0 5.0

no yes L1 7413.0 27132.8 5.0 3713.0 14994.2 5.4

L3 12298.2 153786.0 32.0 7427.4 1032683.4 369.2

L4 15748.0 16001.2 3.2 131439.8 137829.2 5.40

yes no L1 7413.0 28939.8 5.0 3713.0 15009.6 5.6

L2 14106.2 1605306.8 276.2 7427.8 5259311.4 1877.2

L4 24732.6 25010.8 3.2 431627.4 438850.6 5.2

yes yes L1 7413.0 37727.0 8.0 3713.0 26311.8 10.0

L2 14528.2 1666501.6 262.8 9410.2 6225702.2 1872.6

L3 8354.0 89004.4 25.8 7471.8 976921.8 337.0

L4 31501.2 32202.6 3.2 492128.8 499017.8 5.2

initial solutions (in the GA the initial population is counted as the first generationof solutions), sum up the total number of calls to loop L1.

Loop L2 is executed more times than loop L3, since there are more arcs thanrouters in the networks studied.


The maximum number of times a loop is executed gives an idea about the con-vergence of these loops. For example, on instance net-4, the maximum numberof times loop L2 was executed was for the case of arc and router failure. In thiscase max# = 1872.6, which means that the arc failure simulation ran up to 5.35(= 1872.6/350) times for each arc. Also, for this case, loop L1 was executed up to3.33 times per call, loop L3 was called up to 4.75 (= 337.0/71) times per call, andloop L4 was called up to 5.2 times per call. Note that the above maximum numbersof calls are averaged over five runs each and therefore can be real-valued.

The number of times the loops were executed is related to the instance size andkind of experiment performed. Furthermore, it is an indicator of how long thecomplete experiment took.

6. Concluding remarks

In this paper, we described a new evolutionary algorithm for survivable IP net-work design with OSPF routing. The algorithm works with a set of p solutions.Each solution is characterized by an integer arc weight vector. The algorithm startswith a unit weight solution and p−1 random weight solutions and evolves this pop-ulation over N generations, combining solutions according to a specific recipe. Toassociate a design with each weight vector, a multiplicity setting heuristic deter-mines the number of links that each arc must have so that all demand can befeasibly routed in the network using OSPF routing with route splitting. To achievean efficient implementation, instead of recomputing shortest paths and loads fromscratch, dynamic shortest path algorithms were used whenever possible.

To simplify the description of the algorithm, we omit from the paper many exten-sions that we have incorporated into the algorithm. Most are simple to incorporateinto a genetic algorithm. They include:

• In addition to the network cost described in the paper, we also allow mini-mization of number of links, as well as fixed charge cost model, where onepays Ka to use arc a and an additional ka per link of arc a deployed.• Besides arc and router failures, the algorithm also allows single span failure.

A span is a set of arcs that use a common network structure and are likelyto fail simultaneously. Single span failure is a generalization of single arcand single router failures.• Often one wishes to impose a maximum latency restriction on demand.

The algorithm described in the paper does not restrict any demand fromfollowing a long and winding route. It is simple to maintain the latency ofeach shortest path graph and thus compute the latency of each demand pair.A penalty function with the sum of excess latencies can be added to theobjective function to disfavor solutions that violate the latency restriction.• Two arcs are said to be symmetric if the tail of one is the head of the other

and vice-versa. Often one wishes to impose that OSPF weights be equalon symmetric arcs. Minor modifications of the random weight generationand mating/mutation procedures can achieve this.• Likewise, multiplicities may be required to be the same on symmetric arcs.

A simple modification of the multiplicity setting heuristic allows this typeof restriction.• The implementation allows the input of fixed OSPF weights for a subset

S ⊆ A of the arcs.


• The implementation allows for input of an initial set of multiplicities for allsolutions to allow for network expansion studies.• Offspring solutions resulting from mating of two solutions from the popula-

tion may not be locally optimal with respect to the neighborhoods definedby single link removals and single arc weight increases (as in [5]). We imple-mented such local search procedures. Though they can reduce the designcost of some offspring, we feel that the additional computational effort doesnot justify their use.• The weights produced by the best solution can be further adjusted to opti-

mize a measure of network congestion. We allow the OSPF weights of theset of best solutions in the final population to be optimized with the weightsetting algorithm of Buriol, Resende, Ribeiro, and Thorup [5].

7. Acknowledgments

The first author acknowledges discussions with Paulo Morelato Franca, whichhelped improve an earlier version of the paper. The first author also acknowledgessupport from the Brazilian National Council for Scientific and Technological De-velopment (CNPq), AT&T Labs Research, and the European Project Coevolution

and Self-organization in Dynamical Networks (COSIN).

References

[1] J. C. Bean. Genetic algorithms and random keys for sequencing and optimization. ORSA J.on Comp., 6:154–160, 1994.

[2] A. Bley. A Lagrangian approach for integrated network design and routing in IP Networks. In

Proceedings of International Network Optimization Conference (INOC 2003), pages 107–113,2003.

[3] A. Bley, M. Grotchel, and R. Wessaly. Design of broadband virtual private networks: Modeland heuristics for the B-WiN. In N. Dean, D.F. Hsu, and R. Ravi, editors, Robust communi-

cation networks: Interconnection and survivability, volume 53 of DIMACS Series in DiscreteMathematics and Theoretical Computer Science, pages 1–16. American Mathematical Soci-

ety, 1998.[4] P. Brostrom and K. Holmberg. Multiobjective design of survivable IP networks. Technical

Report LiTH-MAT-R-2004-03, Division of Optimization, Linkoping Institute of Technology,2004.

[5] L.S. Buriol, M.G.C. Resende, C.C. Ribeiro, and M. Thorup. A hybrid genetic algorithm forthe weight setting problem in OSPF/IS-IS routing. Technical Report TD-5NTN5G, AT&T

Labs Research, 2003.

[6] L.S. Buriol, M.G.C. Resende, and M. Thorup. Speeding up dynamic shortest path algorithms.

Technical Report TD-5RJ8B, AT&T Labs Research, 2003.[7] E. Dijkstra. A note on two problems in connection of graphs. Numerical Mathematics, 1:269–

271, 1959.

[8] M. Ericsson, M.G.C. Resende, and P.M. Pardalos. A genetic algorithm for the weight setting

problem in OSPF routing. Journal of Combinatorial Optimization, 6:299–333, 2002.

[9] B. Fortz and M. Thorup. Increasing internet capacity using local search. Technical report,

AT&T Labs Research, 2000. Preliminary short version of this paper published as “Internet

Traffic Engineering by Optimizing OSPF weights,” in Proc. 19th IEEE Conf. on ComputerCommunications (INFOCOM).

[10] D.E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning.Addison-Wesley, 1989.

[11] J.H. Holland. Adaptation in Natural and Artificial Systems. MIT Press, 1975.

[12] K. Holmberg and D. Yuan. Optimization of Internet Protocol network design and routing.Networks, 43(1):39–53, 2004.


(L.S. Buriol) Department of Computer and Systems Science, University of Rome “La

Sapienza”, Via Salaria, 113, Room 247, 00198 Rome, Italy

E-mail address, L.S. Buriol: [email protected]

(M.G.C. Resende) Internet and Network Systems Research Center, AT&T Labs Re-

search, 180 Park Avenue, Room C241, Florham Park, NJ 07932 USA.

E-mail address, M.G.C. Resende: [email protected]

(M. Thorup) Internet and Network Systems Research Center, AT&T Labs Research,

180 Park Avenue, Room C227, Florham Park, NJ 07932 USA.

E-mail address, M. Thorup: [email protected]

survivable ip network design with ospf routing · survivable ip network design with ospf routing...

Documents