
UNIVERSITY OF TORONTO

Multicast Routing for Virtual Network

Functions on SDN

by

Sai Qian Zhang

A thesis submitted in partial fulfillment for the

degree of Master of Applied Science

Department of Electrical and Computer Engineering

University of Toronto

© Copyright by Sai Qian Zhang 2016

University of Toronto

Abstract

Department of Electrical and Computer Engineering

Master of Applied Science

by Sai Qian Zhang

2016

Many multicast services, such as live multimedia distribution and real-time event monitoring, require a multicast mechanism that chains network functions (e.g., firewall, video transcoding). Network Function Virtualization (NFV) is a concept that proposes using virtualization to implement network functions on infrastructure building blocks (such as high-volume servers and virtual machines), where software replaces the functionality of existing purpose-built network equipment. We present an approach for building a multicast mechanism whereby multicast flows are processed by NFV before reaching end users. We propose a routing algorithm and a method for building an appropriate multicast topology. First, we propose an approximation algorithm that gives an offline solution to the problem. We then extend the problem to consider the effect of NFV on bandwidth consumption. Finally, we consider the online version of the problem and present an online algorithm that adjusts the multicast topology based on the dynamic cost of resources.

Acknowledgements

I would like to first express my most sincere appreciation and gratitude to my supervisor,

Professor Alberto Leon-Garcia for his guidance, support and understanding during the course

of my thesis work. I am grateful for the amount of trust and inspiration from him that led my

learning, exploration, and completion of my thesis.

I would like to thank the members of my committee: Professor Raviraj Adve, Professor Elvino

Sousa, and Professor Paul Chow for their evaluation of my work and valuable comments.

I am especially thankful to Dr. Qi Zhang for his useful comments, remarks and engagement throughout the learning process of my master's study, and I am thankful to the SAVI Testbed architect, Hadi Bannazadeh, and Dr. Ali Tizghadam for their support, discussion and help throughout my master's study.

I would also like to thank the other colleagues in my group, Byungchul Park, Thomas Lin,

Spandan Bemby, Pouya Yasrebi, Lilin Zhang, Houman Rastegarfar for their collaboration and

feedback.

I am also thankful for the assistance and administrative support from staff member Vladimirio

Cirillo.

Last but not least, I would like to thank my mum and dad for the unconditional love and

support throughout all my studies. My thesis would not have been possible without them.

Contents

Abstract
Acknowledgements
1 Introduction
1.1 Virtualization in Cloud Environment
1.2 Software Defined Networking
1.3 Multicast Communication
1.4 Network Function Virtualization
1.5 Thesis Structure
2 Related Works
2.1 Multicast Mechanism in Software Defined Networking
2.2 Traditional Multicast Routing Algorithm
2.3 Network Function Virtualization
3 Single Layer NFV-enabled Multicasting Routing Problem
3.1 Problem Statement
3.2 Complexity of NEMP
3.3 Algorithm for NEMP
3.3.1 An Approximation Algorithm
3.3.2 A Solution Algorithm based on Branch and Bound
3.3.3 Dynamic heuristics
3.4 Numerical Results
3.4.1 Cost Analysis
3.4.2 Running Time Analysis
3.5 Implementation
3.5.1 Multicast Mechanism
3.5.2 Example
3.5.3 Evaluation
4 Single Layer Multicasting Routing Problem with variable bandwidth


4.1.1 Algorithm for RMF
4.1.2 Algorithm for IMF
4.2 Dynamic heuristics
4.3 Scheduling of Multiple Multicast Sessions
4.4.1 Cost Analysis
4.4.2 Running Time Analysis
5 Multiple Layer NFV-enabled Multicasting Routing Problem
5.1.1 Two-Approximation Algorithm
5.1.2 Heuristic Algorithm based on TAA
5.2 Build the Multicast Topology with given number of NFV Instances
5.2.1 Problem Statement
5.2.2 Heuristic Algorithm
5.3 SMRP with Time-variant Resource Cost
5.3.1 Introduction to Markov Approximation
5.3.2 Design of the Discrete Markov Chain
5.3.3 Online Algorithm
5.3.4 Reliability of Solution
5.3.5 Cost of Reconfiguration
5.3.6 The Effect of Time Error during State Transition
6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
Bibliography

Chapter 1

Introduction

1.1 Virtualization in Cloud Environment

Cloud computing allows clients to use heterogeneous resources that are managed by third parties over the Internet. Suppose you want to store 5 GB of photos but your laptop has only 1 GB of free space; you can upload the photos online instead. When you store your photos online rather than on your home computer, you are using the storage resources offered by the cloud. As another example, if you want to run a program that needs more CPU and RAM than your laptop can provide, you can offload the task to the cloud. The cloud returns the results to your laptop, so the program does not consume any of your laptop's resources.

Virtualization is a concept that divides hardware infrastructure to create various dedicated resources. It is the basic concept that powers cloud computing. By deploying virtualization, cloud platforms can host several independent applications on a shared hardware resource pool, with the capability to allocate computing power to applications on demand. The computing power is allocated in the form of virtual machines (VMs) that run on a physical machine. Figure 1.1 shows an example: a physical machine in a datacenter has 8 CPUs and 64 GB of RAM, and these hardware resources are shared by three VMs, each running an application for a tenant.

Figure 1.1: Virtual Machines Deployment

1.2 Software Defined Networking

In a cloud environment, a central controller is usually used to decide the configuration of VMs and to deploy them on the hardware resources. Moreover, centralized control of the networks connecting the VMs is deployed in the cloud by leveraging the idea of software-defined networking (SDN). SDN is a network paradigm that separates the data plane from the control plane. A logically centralized controller has the capability to enforce network management policies and configure each SDN switch. SDN allows the control plane to communicate with the network elements in the data plane, and it provides an open protocol to program the flow tables in different switches and routers [1]. SDN converts the distributed control problem into a centralized control problem, so that each router or switch only needs to forward traffic according to the rules dictated by the control plane and does not need to make any routing decisions itself.

1.3 Multicast Communication

Multicast is a fundamental communication style in which packets are sent to multiple destinations simultaneously. In a multicast session, packets are replicated in each router and forwarded to multiple output ports based on the multicast topology. In comparison to unicast communication, where a separate path is set up for each source-destination pair, multicast communication can save a large amount of bandwidth because a packet crosses each shared link only once. Many web-based applications, for example multimedia distribution, video conferencing, software updates, and IPTV, rely on multicast communication to function correctly. For this reason, multicast communication is becoming increasingly popular.

Figure 1.2: Example of service chain

1.4 Network Function Virtualization

While the advent of SDN opens more possibilities for developing new multicast mechanisms, many multicast services in the cloud now require the involvement of middleboxes. Network appliances such as network address translation (NAT), intrusion detection systems (IDS), video transcoders, and deep packet inspection (DPI) are becoming essential in modern network services to achieve the desired network behaviour. For example, in Figure 1.2 the multicast flow is sent from the server to three end users. Before reaching the end users, the packet flow passes through a service chain consisting of three middleboxes: DPI, NAT and transcoder. The packets are first inspected by the DPI, then have their IP addresses translated by the NAT, and are finally transcoded before reaching the end users. Middleboxes are usually implemented in hardware. However, hardware implementations cause problems for deployment and maintenance due to the proprietary nature of the network appliances [2]. For example, integrating a new middlebox into the network and managing it is cumbersome because the new hardware may be incompatible with the existing equipment. Moreover, the cost of providing space and energy for the hardware is high. Network Function Virtualization (NFV) has been proposed to alleviate these problems in the cloud by decoupling network functions from proprietary hardware and running them as software instances on VMs. By utilizing NFV, we can allocate network resources to provide network functionalities while optimizing the network topology and configuration and ensuring higher reliability by including appropriate mechanisms. For example, consider the case of video multicasting. The video streams may require transcoding before reaching the end user, so transcoders must be included in the multicast mechanism. More generally, the multicast mechanism must include both the placement of NFV nodes and multicast path routing. In this work, we study an algorithm that jointly determines the placement of NFV nodes and constructs a multicast topology connecting the source with every end user through NFV nodes.

Several basic multicast routing techniques have been developed for construction of multicast

trees. Finding the optimal tree with the minimum cost is equivalent to finding the Steiner

Tree, which has been proven to be NP-hard. However, a number of heuristic algorithms have

been developed. In [3], an approximation algorithm with factor 2 has been developed. This

approximation factor was later improved to 1.598 in [4]. In [5], a heuristic algorithm to build a

Steiner tree with delay bound was proposed. A shared tree algorithm builds a single tree to be

used by all the multicast sessions. The tree contains a single point called core or rendezvous

point (RP) so that all the packets will be forwarded to RP before reaching destination [6,7].

However, the selection of the optimal RP is also an NP-hard problem.

While multicast routing has been a subject of extensive research, designing multicast services

that involve intermediary processing functions has not been carefully studied. In particular,

computing efficient multicast topologies that involve NFV nodes is still a challenging problem,

because it involves jointly determining the placement of NFV nodes as well as constructing

a multicast topology that connects the source and destinations through the NFV nodes. In

response to the above need, we propose an algorithm for building an NFV-enabled multicast

mechanism on SDN. The controller is responsible for setting up the routing path and selecting

NFV nodes. We also develop algorithms for both dynamic multicast and static multicast.

Moreover, we have implemented the preliminary version of our multicast mechanism on ViNO

[8], an SDN launcher that uses VXLAN tunneling protocol alongside a software switch (Open

vSwitch) to dynamically create network topologies specified by the user.

1.5 Thesis Structure

This thesis is based on my previous work [29,30,31] and is organized as follows. In Chapter 2, we present related work on multicast routing algorithms, multicast routing on SDN, and NFV. In Chapter 3, we provide our problem formulation for the single layer NFV-enabled multicast routing problem and present the approximation algorithm. In Chapter 4, we consider an extension of the problem with variable bandwidth consumption. In Chapter 5, we present the online algorithm for the dynamic version of the problem. We conclude and describe future work in Chapter 6.

Chapter 2

Related Works

2.1 Multicast Mechanism in Software Defined Networking

Multicast has received much attention in modern communication networks because of the popularity of group communication and its advantage in saving bandwidth. Traditional IP multicast suffers from problems of security, reliability and scalability. The advent of SDN helps relieve these concerns by separating the control plane from the data plane. SDN-based multicast frameworks have been widely used and deployed in datacenter networks. Avalanche is an SDN-based system for datacenter networks that enables multicast in commodity switches [9]. In [10], an SDN-based clean-slate multicast scheme that aims to improve the security and controllability of the multicast network is developed. In [11], an SDN-based scalable IP multicast scheme for datacenter networks is proposed and evaluated. The authors of [12] propose a routing algorithm to construct a multicast tree with high reliability. However, none of these works has integrated NFV functionality into the multicast mechanism.

2.2 Traditional Multicast Routing Algorithm

Several basic multicast routing techniques have been developed for constructing multicast trees. Finding the optimal tree with minimum cost can be formulated as a Steiner tree problem, which has been proven NP-hard. In the literature, a number of algorithms have been developed to solve the Steiner tree problem. In [3], an algorithm with an approximation factor of 2 is proposed. This approximation factor was later improved to 1.598 in [4]. In [5], a heuristic algorithm to build a Steiner tree with a delay bound is proposed, and a heuristic algorithm for finding a directed multicast tree with minimum cost is proposed in [13]. Other multicast tree building techniques include the shared tree algorithm, which builds a single tree to be used by all the multicast sessions. The tree contains a single point called the core or rendezvous point (RP), and all packets are forwarded to the RP before reaching the destinations [6] [14]. However, the selection of the optimal RP is also an NP-hard problem [15]. The authors of [7] propose a heuristic algorithm to choose a rendezvous point while minimizing the total weighted cost of routing paths. In [16], a rendezvous-point-based algorithm is proposed to build a multicast tree that satisfies several constraints, including delay and link utilization constraints, while minimizing the total cost. However, all of these papers focus on generating a multicast tree with minimum cost, and none of them considers joint NFV placement and multicast tree construction.

2.3 Network Function Virtualization

We now review some prominent research work on NFV. NFV is a promising research topic, and some research has been done on the design and implementation aspects of NFV. The authors of [17] present a design of a control plane that deals with race conditions during the migration of NFV. In [18], the design and implementation of a virtualized software network function platform is presented. In [19], a high-level control platform is described for directing network flows through a predefined chain of middleboxes with minimum resource consumption. Previous work has also been done on the efficient placement of NFV. The authors of [20] create a dynamic NFV placement and routing algorithm for content delivery networks. The authors of [21] formulate and solve the virtual DPI placement problem with minimum total cost under a delay constraint. In [22], a joint NFV placement and traffic routing algorithm is designed for service function chains, which aims at minimizing the total delay of the traffic subject to constraints on hardware and bandwidth resources. However, to the best of our knowledge, no research paper has yet focused on the joint multicast routing and NFV placement problem.

Chapter 3

Single Layer NFV-enabled

Multicasting Routing Problem

3.1 Problem Statement

We model the network as a graph $G = (V, E)$, which consists of a set of nodes $V$ and a set of undirected links $E$. We define $H \subseteq V$ as the set of candidate nodes on which NFV may be deployed. Each multicast session involves a source host $s \in V$ and a set of destinations $D \subseteq V$. In this context, we define a multicast topology as a subgraph $G' \subseteq G$. For each $G'$, there exists a mapping function $f_{G'} : D \to H'$ that maps each destination $d \in D$ to an NFV node $h \in H'$, where $H' \subseteq H$ denotes the set of NFV nodes used by $G'$. Therefore, $G'$ delivers the multicast content to every $d \in D$ using two paths, one from $s$ to the NFV node $f_{G'}(d)$ and the other from $f_{G'}(d)$ to $d$.

To define the cost of a multicast topology, we assume there is a non-negative fixed cost $w(e)$ for using each edge $e \in E$. Moreover, we assume that running an NFV node on a machine $h \in H$ incurs an activation cost $c(h)$ in terms of resource usage and performance overhead. Therefore, the total cost of constructing a multicast topology consists of the activation costs of the NFV nodes and the sum of the link costs.

Our goal is to find a multicast topology $G'$ that ensures each multicast flow traverses the NFV node(s) before reaching the destination, while minimizing the total topology construction cost. Specifically, we define the total link cost of a subgraph $U \subseteq G$ as

$$C^l(U) = \sum_{e \in U} w(e), \tag{3.1}$$

and define the total NFV activation cost as

$$C^h(U) = \sum_{h \in U} c(h), \tag{3.2}$$

where the sum in (3.2) runs over the NFV nodes used in $U$. Our goal is then to find a multicast topology $U$ that minimizes the sum of the total link cost and the NFV activation cost:

$$C(U) = C^l(U) + C^h(U). \tag{3.3}$$

We call this problem the NFV enabled multicast problem (NEMP).

A concrete example of NEMP is provided in Figure 3.1(a). Consider a real-time video streaming service that delivers a transcoded version of the original video stream from h1 to h2 and h3. Two nodes, {5, 7}, are available for placing the transcoder, with activation costs c(5) = 3 and c(7) = 1 respectively. In this case, the goal of NEMP is to find the minimum cost multicast topology, as shown in Figure 3.1(b).

It is noteworthy that unlike the traditional multicast tree problem, in NEMP each link may be

traversed more than once. For instance, in Figure 3.2, assume the only NFV node is located

at node 4 and D = {2, 3, 5}, s = 1. The multicast routes are:

1. From 1 to 3 : (1, 6, 4, 6, 3)

2. From 1 to 5 : (1, 6, 4, 6, 5)

3. From 1 to 2 : (1, 6, 4, 6, 3, 1, 2)

Figure 3.1: Example of NEMP

Figure 3.2: An example of NEMP solution

In this case, the total cost of this multicast topology is C(U) = w(1, 6) + w(6, 4) + w(6, 4) + w(6, 5) + w(6, 3) + w(3, 1) + w(1, 2) + c(4). The cost of link (6, 4) is counted twice because the flow traverses it twice, once towards the NFV node and once on the way back, whereas the activation cost of the NFV node is counted only once.
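The counting rule in this example can be sketched in Python. This is a minimal illustration rather than the thesis implementation: the function name `topology_cost`, the unit link weights and the activation cost c(4) = 2 are assumptions made for the example. The sketch also assumes that each distinct directed traversal of a link is charged once, which is consistent with the seven link terms listed above.

```python
def topology_cost(walks, weight, nfv_cost, used_nfv):
    """Cost of a multicast topology given one walk per destination.

    Assumption: every distinct directed traversal (u, v) is charged once,
    so a link crossed in both directions is charged twice.
    """
    directed = set()
    for walk in walks:
        # Consecutive node pairs of the walk are its directed traversals.
        directed.update(zip(walk, walk[1:]))
    link_cost = sum(weight[frozenset(edge)] for edge in directed)
    # Each activated NFV node is charged exactly once.
    activation = sum(nfv_cost[h] for h in set(used_nfv))
    return link_cost + activation


# Walks from the example: source 1, NFV node 4, destinations 3, 5 and 2.
walks = [(1, 6, 4, 6, 3), (1, 6, 4, 6, 5), (1, 6, 4, 6, 3, 1, 2)]
# Hypothetical unit weights for the undirected links of the example.
links = [(1, 6), (6, 4), (6, 5), (6, 3), (3, 1), (1, 2)]
weight = {frozenset(link): 1 for link in links}
cost = topology_cost(walks, weight, nfv_cost={4: 2}, used_nfv=[4])
```

With unit weights, the seven directed traversals contribute 7 and the single NFV node contributes its activation cost once.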

3.2 Complexity of NEMP

We first analyze the complexity of the problem:

Theorem 1. NEMP is NP-hard.

Proof. We show that the Steiner tree problem can be reduced to NEMP. The Steiner tree problem aims at finding a minimum cost multicast tree that connects the source to each of the destinations. Given an instance of the Steiner tree problem, we build an NEMP instance on the same graph in which the only candidate location for placing the NFV is the source node. Every flow is then processed at the source, and the activation cost of the source is a constant that does not affect the minimizer, so the optimal solution of this NEMP instance is exactly the solution of the original Steiner tree instance. Since the Steiner tree problem is NP-hard, we conclude that NEMP is NP-hard as well.

3.3 Algorithm for NEMP

Since NEMP is NP-hard, no polynomial-time exact algorithm is known, and the time required to find the exact solution is expected to grow exponentially with the size of the network. We therefore propose heuristic algorithms with low time complexity to solve NEMP. We first present an algorithm that achieves an approximation ratio of 2. Then, we present an exact solution algorithm based on branch-and-bound. Lastly, as end users may dynamically enter and leave the multicast session, dynamic heuristics are proposed to handle arriving and departing users so that they can connect to the multicast topology quickly.

3.3.1 An Approximation Algorithm

Algorithm 1 is our approximation algorithm for NEMP. It first searches for a single NFV node to be used by all the destinations and then constructs a minimum spanning tree among the end users. The traffic flow first traverses the selected NFV node and is then multicast to each end user along the minimum spanning tree. We now prove its approximation guarantee.

Algorithm 1 Approximation algorithm

1: $C_{best} = \infty$
2: for each $(h, d)$ pair $\in H \times D$ do
3:     Construct a shortest path graph $P$ by connecting $s$ to $h$, $h$ to $d$.
4:     Construct a complete subgraph $G_c$ for $D$, where the cost of each edge $(d_i, d_j)$ in $G_c$ is set equal to the distance of the shortest path from $d_i$ to $d_j$ in $G$
5:     Find the minimum spanning tree $T$ from $G_c$ and calculate $C^l(T)$
6:     Connect $P$ with $T$, call the combined graph $G_w$ (since $d$ exists in both $P$ and $T$, we can connect them together). Calculate $C(G_w)$ using eq. (3.3)
7:     if $C(G_w) \le C_{best}$ then
8:         $C_{best} \leftarrow C(G_w)$, $G_{best} \leftarrow G_w$
9:     end if
10: end for
11: Construct the graph $T_{best}$ from $G_{best}$ by replacing every edge in $G_{best}$ with the shortest path in $G$; if there are several shortest paths, randomly select one.
12: Remove any unnecessary edges in $T_{best}$ to get the final output $T_H$

Figure 3.3: An example of the approximation algorithm. (a) shows a network topology $G(V, E)$ with 9 nodes and 12 links. Suppose s = 9, H = {5, 7}, D = {2, 3, 4, 6}, and pick node 7 to be h and node 4 to be d. (b) is the path graph $P$. (c) is the complete graph $G_c$. Assume (d) is the minimum spanning tree $T$ of $G_c$. Connecting $P$ with $T$ gives $G_w$ in (e). Assume this combination gives the least total cost; (f) shows $T_f$ and (g) is $T_H$.
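Steps 1 to 10 of Algorithm 1 can be sketched in Python as follows. This is a hedged illustration, not the thesis code: the names `dijkstra`, `mst_cost` and `approx_nemp_cost`, the edge-list input format and the small test network are assumptions. The sketch assumes a connected graph, returns only the best cost together with the chosen pair (h, d), and omits the path expansion and pruning of steps 11 and 12. Because the spanning tree over D does not depend on (h, d), it is computed once outside the loop.

```python
import heapq
from itertools import product


def dijkstra(adj, src):
    """Shortest-path distances from src over non-negative edge weights."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def mst_cost(terminals, dist):
    """Prim's algorithm on the complete graph G_c over the destinations."""
    terminals = list(terminals)
    in_tree = {terminals[0]}
    total = 0
    while len(in_tree) < len(terminals):
        w, v = min((dist[u][t], t) for u in in_tree
                   for t in terminals if t not in in_tree)
        total += w
        in_tree.add(v)
    return total


def approx_nemp_cost(edges, s, H, D, c):
    """Return (cost, h, d) minimizing dist(s,h) + dist(h,d) + c(h) + MST(D)."""
    adj = {}
    for u, v, w in edges:  # undirected links
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {u: dijkstra(adj, u) for u in {s, *H, *D}}
    tree = mst_cost(D, dist)
    best = None
    for h, d in product(H, D):
        cost = dist[s][h] + dist[h][d] + c[h] + tree
        if best is None or cost < best[0]:
            best = (cost, h, d)
    return best


# Hypothetical 5-node network: source 1, candidates {2, 3}, destinations {4, 5}.
edges = [(1, 2, 1), (2, 3, 1), (3, 4, 1), (3, 5, 1)]
best = approx_nemp_cost(edges, s=1, H=[2, 3], D=[4, 5], c={2: 5, 3: 1})
```

On this small network the sketch selects node 3 as the NFV node. The value it returns is an upper bound on the cost of the final pruned topology, since removing unnecessary edges in step 12 can only lower the cost.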

Theorem 2. Given that there exists an optimal multicast topology for the NEMP, the total cost of $T_H$ is no more than two times that of the optimal multicast topology.

To prove this theorem, we first state two lemmas:

Lemma 1. Let $G_{opt}$ be the optimal multicast topology of NEMP. There exists a multicast tree $T_{opt}$ such that each root-leaf path in $T_{opt}$ is exactly a walk¹ from the source to an end user in $G_{opt}$, and the total cost of $T_{opt}$ equals that of $G_{opt}$.

Proof. For each destination $d$, there must exist a walk $W$ that connects the source $s$ to $d$ in $G_{opt}$. We construct $T_{opt}$ iteratively. In the first iteration, we select a destination node $d$ at random and let the walk between $s$ and $d$ in $G_{opt}$ be the first root-leaf path of $T_{opt}$. In each subsequent iteration, we select a remaining destination $d$ from $D$ and let $W$ denote the walk from $s$ to $d$ in $G_{opt}$. Among all existing root-leaf paths $P$ in $T_{opt}$, we find the $P$ that contains the largest subset $W_1 \subseteq W$ starting from the source $s$; that is, $P$ contains the longest initial subpath of $W$ among all the root-leaf paths. Let $n$ be the end point of $W_1$ and define $W_2 = W \setminus W_1$. We then add $W_2$ to $T_{opt}$ as a new branch attached at $n$. We repeat this for every end user in $G_{opt}$ until all the end users have been added. Since all the links and nodes are copied from the walks in $G_{opt}$ to $T_{opt}$, the total cost of $T_{opt}$ (including the activation cost of the NFV nodes and the total cost of the links) equals that of $G_{opt}$.

This construction is illustrated in Figure 3.4, where we derive $T_{opt}$ from the optimal multicast topology $G_{opt}$ in Figure 3.2. The walk from source node 1 to the first end user 3 is (1, 6, 4, 6, 3) (Figure 3.4(a)). The walk from source node 1 to the second end user 5 is (1, 6, 4, 6, 5); since (1, 6, 4, 6) is already contained in $T_{opt}$, we connect (6, 5) to $T_{opt}$ (Figure 3.4(b)). Finally, the walk from source node 1 to the last end user 2 is (1, 6, 4, 6, 3, 1, 2); since (1, 6, 4, 6, 3) is already contained in $T_{opt}$, we connect (1, 2) to node 3 in $T_{opt}$ (Figure 3.4(c)). The total cost of $T_{opt}$ equals that of $G_{opt}$.
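One way to realize the construction in the proof of Lemma 1 is to treat every walk prefix as a tree node, so that two walks share a branch exactly as long as they share a prefix. The following Python sketch takes this approach. The function name `tree_from_walks` and the prefix-tuple representation are assumptions made for illustration; they are not taken from the thesis.

```python
def tree_from_walks(walks):
    """Merge source-to-destination walks into a tree keyed by walk prefixes.

    Each tree node is a tuple holding the walk prefix that reaches it, so a
    graph node visited several times (such as node 6 in the example) yields
    several distinct tree nodes.
    """
    children = {}
    for walk in walks:
        for i in range(1, len(walk)):
            parent, child = tuple(walk[:i]), tuple(walk[:i + 1])
            # Shared prefixes map to the same parent/child pair, so the set
            # stores each tree edge once.
            children.setdefault(parent, set()).add(child)
    return children


# Walks from the example (source 1, NFV node 4, end users 3, 5 and 2).
walks = [(1, 6, 4, 6, 3), (1, 6, 4, 6, 5), (1, 6, 4, 6, 3, 1, 2)]
tree = tree_from_walks(walks)
num_tree_edges = sum(len(kids) for kids in tree.values())
```

For these three walks the tree has seven edges, which matches the seven link terms in the cost of the topology in Figure 3.2.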

¹ In graph theory, a walk is a sequence of links and nodes in which each link's endpoints are the preceding and following nodes in the sequence.

We also cite the following result from [2]:

Figure 3.4: Example of constructing tree

Lemma 2. Let $T$ be a tree with $m \ge 1$ edges. Then there exists a loop $u_1, u_2, \ldots, u_{2m}$ in $T$, where every $u_i$, $1 \le i \le 2m$, is a vertex in $T$, such that every edge in $T$ appears exactly twice in the loop.
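A loop of the kind described in Lemma 2 can be obtained from a depth-first traversal that records a vertex each time it is entered or returned to. The Python sketch below illustrates this under the assumption that the tree is given as an adjacency dictionary; the function name `doubled_tour` and the sample tree are hypothetical. The returned list repeats the root at the end to close the loop, so it has 2m + 1 entries for a tree with m edges.

```python
def doubled_tour(tree_adj, root):
    """Closed walk over a tree that uses every edge exactly twice."""
    tour = [root]

    def visit(u, parent):
        for v in tree_adj[u]:
            if v != parent:
                tour.append(v)   # go down edge (u, v)
                visit(v, u)
                tour.append(u)   # come back up the same edge

    visit(root, None)
    return tour


# Hypothetical tree with edges 1-2, 1-3 and 3-4.
tree_adj = {1: [2, 3], 2: [1], 3: [1, 4], 4: [3]}
tour = doubled_tour(tree_adj, 1)
# Count how often each undirected edge is used along the tour.
usage = {}
for u, v in zip(tour, tour[1:]):
    key = frozenset((u, v))
    usage[key] = usage.get(key, 0) + 1
```

Each edge is used once when descending and once when returning, which is the doubling argument used in the proof of Theorem 2.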

Proof of Theorem 2. Let $G_{opt}$ denote the optimal solution and $T_{opt}$ denote the tree derived from it. Lemma 1 shows that there exists a $T_{opt}$ whose cost equals that of $G_{opt}$. Define $C_{opt}(v_1, v_2)$ as the cost of the path connecting nodes $v_1$ and $v_2$ in $T_{opt}$, and $C_{shortest}(v_1, v_2)$ as the cost of the shortest path connecting nodes $v_1$ and $v_2$ in $G$. Also denote the leaves of $T_{opt}$ as $d_i$, $1 \le i \le n$ (the order of the leaves is random), where $n$ is the number of leaves in $T_{opt}$. From Lemma 2, there exists a loop $L$ in $T_{opt}$ such that every edge in $T_{opt}$ is traversed exactly twice and every leaf in $T_{opt}$ is visited exactly once. $L$ can be decomposed into three paths:

1. A simple path M1 that connects source s to the first leaf d1.

2. A path M2 that connects first leaf d1 to next leaf d2, and from d2 to the next leaf d3

until to the last leaf dn.

3. A path M3 that connects dn back to source s.

Since each link and each NFV node in $T_{opt}$ appears exactly twice in the loop $L$, we have $C(L) = 2 \times C(T_{opt}) \ge C^l(L)$. The simple path $M_1$ must pass through at least one NFV node, otherwise the flow is not processed before reaching the destination user. Call this node $h_{opt}$, so that

$$C(M_1) = C_{opt}(s, h_{opt}) + C_{opt}(h_{opt}, d_1) + c(h_{opt}).$$

Ignoring the activation cost of all the NFV nodes that $M_2$ traverses, $M_2$ can be viewed as a graph that contains all the end users, so

$$C^l(M_2) = \sum_{i=1}^{n-1} C_{opt}(d_i, d_{i+1}),$$

and $C(M_1) + C^l(M_2)$ must be greater than or equal to

$$C(T_{best}) = \min_{h \in H,\, d \in D} \left( C_{shortest}(s, h) + C_{shortest}(h, d) + c(h) + C^l(T) \right),$$

where $T$ is the minimum spanning tree defined in step 5 of Algorithm 1. Thus $C(L)$ must be greater than or equal to $C(T_{best})$, which is greater than or equal to $C(T_H)$. Therefore

$$C(T_H) \le C(T_{best}) \le C(M_1) + C^l(M_2) \le C(M_1) + C(M_2) \le C(L) = 2C(T_{opt}) = 2C(G_{opt}).$$

We use the example in Figure 3.3(a) to illustrate this proof. Suppose that for the problem in Figure 3.3(a) we have the optimal solution $T_{opt}$ shown in Figure 3.5(a). $C(T_{opt})$ includes the cost of all the links it uses and the activation cost of all the NFV nodes it uses (node 5 and node 7). The loop $L$ that traverses $T_{opt}$ is shown in Figure 3.5(a) and includes:

1. source s to the first leaf node 3, call this simple path M1 (shown in 3.5(b)).

2. first leaf node 3 to next leaf node 4, from node 4 to the next leaf node 6 until to the last

leaf node 2, call this path M2 (shown in 3.5(c)).

3. last leaf node 2 back to source s, call this simple path M3.

The total cost of $L$ includes twice the NFV activation cost in $T_{opt}$ (the activation costs of nodes 5 and 7, since they are traversed twice) and twice the sum of the link costs of $T_{opt}$. Therefore $C(L) = 2 \times C(T_{opt})$. The simple path $M_1$ passes through NFV node 5 and connects to end user node 3. $M_2$ contains all the end users (nodes 3, 4, 2, 6). Hence $2 \times C(T_{opt}) \ge C(M_1) + C(M_2) \ge C(T_{best}) \ge C(T_H)$.

We now analyze the complexity of Algorithm 1. In line 3, for each fixed $h$ and $d$, we need to find the shortest paths from $s$ to $h$ and from $h$ to $d$. In line 4, to calculate the costs of the edges in $G_c$, we need to find the shortest path for each pair of destination users. Therefore, the overall complexity is $O((n^2 + n|H|)(|V| + |E| \log |E|))$, where $n$ is the number of end users, $|H|$ is the size of $H$, $|V|$ is the number of nodes in the network and $|E|$ is the number of edges in the network.

Figure 3.5: Example

3.3.2 A Solution Algorithm based on Branch and Bound

We can also formulate the NEMP as an optimization problem, and find the exact solution

by using a technique such as branch-and-bound. To formulate the problem, we first make

the graph directed by representing each undirected link with two opposite directed links with

equal weight. In presenting the formulation, we will use the definitions given in Table 3.1.

The problem can be formulated as a binary integer programming problem as follows:

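$$
\begin{aligned}
\underset{X_1, X_2, Y_1, Y_2, Z}{\text{minimize}} \quad & P_1^T (Y_1 + Y_2) + P_2^T Z && (3.4) \\
\text{subject to} \quad & M(X_1 + X_2) = A && (3.5) \\
& Z^T M X_1 = -1 && (3.6) \\
& Z^T M X_2 = 1 && (3.7) \\
& X_{1jk} \le Y_{1j} \le 1 \quad (\forall k, j) && (3.8) \\
& X_{2jk} \le Y_{2j} \le 1 \quad (\forall k, j) && (3.9) \\
& R^T Z = 0 && (3.10)
\end{aligned}
$$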

Table 3.1: Definitions of parameters

| Name | Dimension | Description |
|------|-----------|-------------|
| $N$ | - | Number of nodes in the network |
| $K$ | - | Number of undirected links in the network; after replacing each undirected link with two directed links, there will be $2K$ directed links in the graph |
| $n$ | - | Number of destinations |
| $H$ | - | Set of nodes where NFV can be implemented |
| $D$ | - | Set of destinations |
| $M$ | $N \times 2K$ | Incidence matrix, $M_{mi} = 1$ and $M_{ni} = -1$ if there is a directed link $i$ from node $m$ to node $n$ |
| $P_1$ | $2K \times 1$ | Edge cost vector; it contains the cost $w(e)$ of each directed link $e$ |
| $P_2$ | $N \times 1$ | Node cost vector, $P_{2k} = c(k)$ iff node $k$ is in $H$, $P_{2k} = 0$ otherwise |
| $A$ | $N \times n$ | $A_{jk} = 1$ iff $j = s$, $A_{jk} = -1$ iff $j = d_k$ and $A_{jk} = 0$ otherwise |
| $W$ | $1 \times n$ | $W_k = 1$ for each element in $W$ |
| $R$ | $N \times 1$ | $R_k = 1$ if node $k$ is not in $H$, $R_k = 0$ otherwise |
| $S$ | $N \times n$ | $S_{ij} = 1$ if $i$ is the source and $S_{ij} = 0$ for all $i \ne$ source |
| $Q$ | $N \times n$ | $Q_{ij} = 0$ if $i = d_j$ and $Q_{ij} = -1$ otherwise |
| $X_1$ | $2K \times n$ | $X_{1ei} = 1$ iff link $e$ is used by the flow with destination $d_i$ to connect source $s$ with the NFV node |
| $X_2$ | $2K \times n$ | $X_{2ei} = 1$ iff link $e$ is used to reach destination $d_i$ from the NFV node |
| $Y_1$ | $2K \times 1$ | $Y_{1e} = 1$ iff link $e$ is used in the multicast topology to connect source $s$ to any NFV node |
| $Y_2$ | $2K \times 1$ | $Y_{2e} = 1$ iff link $e$ is used in the multicast topology to connect any NFV node to any end user |
| $Z$ | $N \times 1$ | $Z_k = 1$ iff NFV is running on node $k$ |

\begin{align}
& Q \preceq M X_1 \preceq S \tag{3.11} \\
& X_1, X_2, Y_1, Y_2, Z \text{ are binary}
\end{align}

(3.4) gives the total cost of the multicast topology, which includes the cost of each used link and the activation cost of the NFV nodes. (3.5) and (3.11) ensure that a single path is set up between the source s and each end user. (3.6) and (3.7) ensure that the multicast flow passes through at least one NFV node before reaching each end user. (3.8) and (3.9) ensure that the multicast topology is composed of all the single paths between source s and each end user di. (3.10) ensures that only the nodes in H can be used as NFV nodes.
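As a small illustration of how the incidence matrix M encodes constraints (3.5)-(3.7), the following sketch checks them on a three-node line graph. The toy graph, the node numbering and the helper names `incidence_matrix` and `mat_vec` are assumptions made for this example only; they are not part of the thesis implementation.

```python
def incidence_matrix(num_nodes, links):
    """Build the N x 2K incidence matrix M for a list of directed links (m, n)."""
    M = [[0] * len(links) for _ in range(num_nodes)]
    for i, (m, n) in enumerate(links):
        M[m][i] = 1    # link i leaves node m
        M[n][i] = -1   # link i enters node n
    return M


def mat_vec(M, x):
    """Multiply matrix M by column vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


# Hypothetical toy graph: node 0 = source s, node 1 = NFV node, node 2 = destination d_1.
undirected = [(0, 1), (1, 2)]
# Each undirected link becomes two opposite directed links.
links = undirected + [(b, a) for a, b in undirected]  # (0,1), (1,2), (1,0), (2,1)
M = incidence_matrix(3, links)

x1 = [1, 0, 0, 0]  # column of X1: link (0,1) carries the flow from s to the NFV node
x2 = [0, 1, 0, 0]  # column of X2: link (1,2) carries the flow from the NFV node to d_1
z = [0, 1, 0]      # Z: NFV runs on node 1

net_flow = mat_vec(M, [a + b for a, b in zip(x1, x2)])      # column of M(X1 + X2)
z_m_x1 = sum(zk * v for zk, v in zip(z, mat_vec(M, x1)))    # entry of Z^T M X1
z_m_x2 = sum(zk * v for zk, v in zip(z, mat_vec(M, x2)))    # entry of Z^T M X2
```

Here `net_flow` matches the corresponding column of A (1 at the source, −1 at the destination), while the two products show that the first leg ends at the activated NFV node and the second leg starts there.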


We now provide an algorithm for solving the integer programming problem using Branch and bound. Compared to Algorithm 1, our Branch and bound algorithm computes the exact optimal solution of NEMP, at the cost of higher computational overhead. However, because constraints (3.6) and (3.7) multiply the decision variables Z and X, this is a quadratic integer program, and it is hard to apply traditional algorithms to solve it. Instead, we can make the problem linear by trying every combination of NFV nodes and solving the resulting linear integer programming problem (with the NFV nodes fixed) using Branch and bound. The key feature of our algorithm is that we can prune the search space in the following two dimensions:

1. Prune the search space of NFV nodes

We first search for the best NFV-enabled multicast topology while ignoring the activation cost of NFV nodes. Call this topology Tlink; it is the cheapest multicast topology we can build if we neglect the activation cost of NFV nodes. Denote by cmax the total activation cost of the NFV nodes deployed in Tlink. Any other multicast tree has a link cost at least as high as that of Tlink, so if its NFV activation cost is greater than cmax, its total cost must be greater than that of Tlink and it cannot be the optimal solution. Using this fact, the steps for pruning the search space are shown below:

• Change the objective function of the previous problem to ignore the cost of NFV nodes, so the new objective function becomes:

$\min \; P_1^{T}(Y_1 + Y_2)$

• Activate every NFV node in H by setting Zi = 1 if node i ∈ H and Zi = 0 otherwise.

• Solve the new linear integer programming problem using Branch and bound.

• Record the NFV nodes actually used by the solution, since some of the activated NFV nodes may not be used by the multicast topology. Calculate the total activation cost of the used NFV nodes, cmax.


• For each candidate combination of NFV nodes, before solving the linear integer programming problem, calculate the total activation cost c of the combination. If c > cmax, the combination is pruned and there is no need to test it.

2. Prune the search space of links

When solving the linear integer programming problem given the activated NFV node, we can

use the following facts of the multicast topology to prune the search space:

• Links directed to the source s can never carry flow from the source to the NFV node

deployed.

• Links originating from the destination can never carry flow from NFV node to that

destination.

Therefore, multicast topologies that use links in either of these two ways are not considered in our Branch and bound algorithm.
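The two pruning rules can be sketched as simple filters applied before the integer program is solved. This is only an illustrative sketch: the function names `candidate_nfv_sets` and `allowed_links`, the activation costs and the toy link list are hypothetical, and the Branch and bound solver itself is not shown.

```python
from itertools import combinations


def candidate_nfv_sets(activation_cost, c_max):
    """Yield (combination, cost) for every non-empty NFV subset with cost <= c_max."""
    nodes = sorted(activation_cost)
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            cost = sum(activation_cost[h] for h in combo)
            if cost > c_max:
                continue  # pruned: this combination cannot beat T_link
            yield combo, cost


def allowed_links(links, source, destination):
    """Split directed links into those usable before and after the NFV node."""
    # Links directed into the source never carry flow from the source to the NFV node.
    before_nfv = [(u, v) for (u, v) in links if v != source]
    # Links leaving the destination never carry flow from the NFV node to it.
    after_nfv = [(u, v) for (u, v) in links if u != destination]
    return before_nfv, after_nfv


activation_cost = {"h1": 2, "h2": 3, "h3": 6}  # hypothetical costs c(h)
kept = [combo for combo, _ in candidate_nfv_sets(activation_cost, c_max=5)]

links = [(0, 1), (1, 0), (1, 2), (2, 1)]  # node 0 = source, node 2 = destination
before_nfv, after_nfv = allowed_links(links, source=0, destination=2)
```

With cmax = 5, only three of the seven non-empty combinations survive, so the linear integer program needs to be solved for those three only.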

3.3.3 Dynamic heuristics

The algorithms presented in the previous sections have been designed to solve the static version of NEMP. In reality, end users may dynamically join and leave the existing multicast topology, so below we propose a dynamic approach (Algorithms 2 and 3) to deal with this situation. For a newly arriving end user, the algorithm finds the closest end user among the existing end users in the multicast topology and connects the two. For a leaving end user, the algorithm removes the path to that end user if no other existing end user uses that path.

The complexity of the dynamic heuristics is O(n(|V| + |E| log |E|)) for each entering user and O(|E|) for each leaving user, where n is the number of end users, |V| is the number of nodes in the network and |E| is the number of edges in the network. The complexity of the dynamic heuristics is much lower


Algorithm 2 Dynamic heuristic (Entering)

1: for each coming end user d ∈ D do
2:   if new user connects to a node in multicast topology then
3:     Send the flow to that user directly.
4:   end if
5:   if new user connects to a new node outside existing multicast topology then
6:     dmin ← user in the multicast group who has the shortest distance to d. If more than one existing user satisfies this, randomly choose one.
7:     Connect d to dmin with the shortest path.
8:   end if
9: end for

Algorithm 3 Dynamic heuristic (Leaving)

1: for each leaving end user d ∈ D do
2:   for each link e in path to d do
3:     if e does not carry any multicast traffic then
4:       remove e from the existing multicast topology
5:     end if
6:   end for
7: end for

than that of the static algorithms, so users can be quickly added to or removed from the multicast topology.
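The entering case (Algorithm 2) can be sketched as follows. This is a minimal sketch under stated assumptions: the graph is an adjacency dictionary with symmetric weights, the helper names `dijkstra` and `join` are hypothetical, and ties are broken deterministically (by sorted order) rather than randomly as in Algorithm 2.

```python
import heapq


def dijkstra(graph, start):
    """Shortest distances and predecessors from start; graph is {node: {neighbor: weight}}."""
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def join(graph, tree_nodes, users, new_user):
    """Attach new_user to the multicast tree and return the links that were added."""
    added = []
    if new_user not in tree_nodes:
        dist, prev = dijkstra(graph, new_user)
        # Closest existing end user to the newcomer.
        d_min = min(sorted(users), key=lambda u: dist.get(u, float("inf")))
        node = d_min
        while node != new_user:  # walk the shortest path from d_min to new_user
            added.append((node, prev[node]))
            node = prev[node]
        tree_nodes.update(v for _, v in added)
    users.add(new_user)
    return added


# Hypothetical topology: s - a - b - c, with "a" the only existing end user.
graph = {"s": {"a": 1}, "a": {"s": 1, "b": 2}, "b": {"a": 2, "c": 1}, "c": {"b": 1}}
tree_nodes = {"s", "a"}
users = {"a"}
added = join(graph, tree_nodes, users, "c")
```

One Dijkstra run from the new user is enough here because the weights are symmetric; with asymmetric weights the distances would have to be computed from each existing user instead.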

3.4 Numerical Results

We evaluated our multicast algorithms from both static and dynamic perspectives on a wide area network model. We assess the accuracy of the algorithms by comparing the result of the Approximation algorithm with the multicast topology generated by our Branch and bound method on three network models generated by GT-ITM [23]. GT-ITM is a widely used network generator that produces wide area network topologies. The network settings are presented in Table 3.2.

For each network topology, we randomly generate a set of nodes H and a set of end users.

We schedule all the users at the same time. Each scenario is evaluated several times and the


Table 3.2: Properties of networks

Case | Network I | Network II | Network III
Nodes | 10 | 35 | 100
Links (undirected) | 14 | 50 | 127
Size of H | 2 | 7 | 10

average results are presented.

3.4.1 Cost Analysis

In Tables 3.3, 3.4 and 3.5, the cost produced by the Approximation algorithm is normalized to that produced by our Branch and bound method. For Network I, the ratio between the approximation algorithm and the optimal solution increases from 1.25 to 1.31 as the number of users increases. For Network II, the ratio increases from 1.31 to 1.41. For Network III, the ratio increases from 1.47 to 1.56 as the number of users increases. We observe that the gap between the Branch and bound solution and the Approximation algorithm widens as the number of end users grows or the size of the network increases. One possible reason is that a larger set of end users or a larger network offers more ways to build the multicast topology, which makes it harder to find a good solution.

Table 3.3: cost of the multicast topology (NetworkI)

Number of end users | Branch and bound | Approx. algorithm
3 | 1 | 1.25
4 | 1 | 1.30
5 | 1 | 1.31

Table 3.4: cost of the multicast topology (NetworkII)



Table 3.5: cost of the multicast topology (NetworkIII)


3.4.2 Running Time Analysis

Having compared the cost of the different algorithms, we now turn to running time. Figures 3.6-3.8 give the running time of the Approximation method and our Branch and bound method for the three networks. The running time of the approximation algorithm increases from 80ms to 300ms for Network I, from 0.25s to 9s for Network II and from 0.8s to 77s for Network III. We make the following observations. First, the running time of both methods increases as the number of destination users increases, and the growth rate for the Approximation algorithm is much lower than that for Branch and Bound. Second, the network size has a great effect on the running time of both methods, with a higher impact on Branch and Bound.

Figures 3.9-3.11 show the relationship between the total number of end users and the average processing time of each incoming user. The running time of the algorithm increases from 1.5ms to 6.1ms for Network I, from 3.2ms to 4.7ms for Network II and from 4ms to 5.5ms for Network III. The average processing time increases as more users join the multicast group; a possible reason is that as the number of users in the group increases, there are more options for the new end user to connect to. The average processing time grows only gently as the network size increases, which suggests that the algorithm is scalable.

3.5 Implementation

To validate the functionality and implementability of this multicast mechanism, we use ViNO running on the SAVI testbed. The SAVI testbed is an open application platform that aims to efficiently control and manage virtual ICT resources [24]. ViNO is an SDN launcher that uses VXLAN alongside a software switch (Open vSwitch) running on


Figure 3.6: Running time of two static algorithms for network I

Figure 3.7: Running time of two static algorithms for network II


Figure 3.8: Running time of two static algorithms for network III

Figure 3.9: Running time of two dynamic heuristics for network I


Figure 3.10: Running time of two dynamic heuristics for network II

Figure 3.11: Running time of two dynamic heuristics for network III


virtual machines to create an overlay SDN network whose topology is specified by the user. Moreover, a central controller is set up to enforce the network management policies on each network element [8].

3.5.1 Multicast Mechanism

The multicast mechanism can be described by the block diagram in Figure 3.12:

1. User (Tenant) input: the tenant specifies the network topology, link weights, source node and destination nodes in a file with a specific format. The file is then passed to the routing calculator and ViNO.

2. Routing Calculator: it takes the input from the tenant, computes the routing paths, and sends the results to the SDN controller.

3. ViNO: ViNO takes the input from the tenant and generates the overlay network topology based on the input file.

4. SDN controller: the SDN controller takes the output from the Routing Calculator, translates it into a set of OpenFlow forwarding rules, and installs the rules in the OpenFlow switches.

5. A multicast network is set up after the rules have been installed in the switches.

3.5.2 Example

For the implementation, we choose a network topology with four hosts h1, h2, h3, h4, 12 OpenFlow switches and a set of predefined link weights (shown in Figure 3.13). The multicast group consists of three hosts h1, h3 and h4. The source host h1 multicasts packets that contain text messages to end users h3 and h4. An NFV node has also been booted up on h2, which converts the lower case letters it receives into upper case. The text


Figure 3.12: Multicast mechanism diagram

message written in lower case is sent from h1 to h3 and h4; h3 and h4 should receive the processed multicast flow, which contains the same text message in upper case. The paths connecting h1 with h3 and h4 are described below:

From h1 to h3 : (h1, s1, NFV, s6, s10, h3)

From h1 to h4 : (h1, s1, NFV, s1, s2, s4, s7, h4)

We run a simple C program on the NFV node. This program intercepts every packet received on UDP port 5005 and turns the text message inside into upper case. To send text from h1 to h2 and h3, we use the netcat command in Linux to send UDP packets through port 5005. We use the IP address of h3 as the destination IP address of the multicast packets.
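The behaviour of the upper-casing function can be illustrated with a short sketch. It is written in Python rather than C and is not the program used on the testbed: it simply receives UDP datagrams on port 5005 and forwards the upper-cased payload, whereas the original program intercepts packets addressed to h3. The names `to_upper` and `run_relay` and the next-hop address are hypothetical.

```python
import socket


def to_upper(payload: bytes) -> bytes:
    """Return the payload with ASCII lower case letters turned into upper case."""
    return payload.upper()


def run_relay(listen_port=5005, next_hop=("10.0.0.3", 5005)):
    """Receive UDP datagrams, upper-case them and forward them to next_hop.

    next_hop is a placeholder address standing in for an end user such as h3.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", listen_port))
    while True:
        data, _sender = sock.recvfrom(4096)
        sock.sendto(to_upper(data), next_hop)


# The transformation itself can be checked without opening a socket.
sample = to_upper(b"hello nfv")
```

Calling `run_relay()` would start the forwarding loop; it is left uncalled here so that the sketch has no side effects when loaded.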

First, we validate the functionality of the multicast mechanism by sending text messages from host h1. As shown in Figure 3.14, the text message is first sent to the NFV node and processed by h2. Then h2 sends the text in upper case to the end users h3 and h4.


Figure 3.13: Network topology

3.5.3 Evaluation

The total setup time is 57.79 seconds, the average bandwidth used on each link is 47.6 bytes/s and the drop rate is 0 bytes/s. Assuming the overlay network is ready for use, the total setup time is the time from sending the user input to the Routing Calculator until all the forwarding rules have been installed on each switch. The total time used is less than 1 min, and most of it is spent on communication between virtual machines.


Figure 3.14: Screenshots of h1, h2, h3 and h4

Chapter 4

Single Layer Multicasting Routing

Problem with variable bandwidth


In the previous chapter, we assumed that the bandwidth of a flow remains constant after traversing the NFV. In general, however, the bandwidth changes when the traffic traverses an NFV that performs, for example, video compression or decompression. Moreover, we assume that the link cost is associated with the bandwidth consumption: as the bandwidth consumption increases, the link cost also increases, and vice versa. We call this problem the NFV-enabled multicast problem with variable bandwidth (NEMPVB).

We can formulate the NEMPVB as an optimization problem and solve it using a technique such as Branch and bound. In presenting the formulation, we define P1 as the cost of links before traversing the NFVs, P2 as the cost of links after traversing the NFVs and P3 as the cost of nodes. Otherwise, we use the definitions given in Table 3.1. The optimization problem can now be stated as:

\[
\underset{X_1, X_2, Y_1, Y_2, Z}{\text{minimize}} \quad P_1^{\top} Y_1 + P_2^{\top} Y_2 + P_3^{\top} Z \tag{4.1}
\]


Equation (4.1) is the new objective function, which computes the total cost of the multicast topology under variable bandwidth. It includes the cost of each used link before and after traversing the NFV and the activation cost of the NFV nodes. Equation (4.1) together with constraints (3.5)-(3.11) formally defines the NEMPVB.

We can easily show that NEMPVB is also NP-hard. Since the NEMPVB is NP-hard, our main objective is to find polynomial time algorithms that yield a good approximation guarantee. In the following sections, we divide NEMPVB into three separate cases. In the first case, we assume b_before = b_after, namely that the bandwidth usage before and after processing by the NFV components is the same; this is the NEMP, which we solved in the last chapter. The second case is b_before > b_after. We call this problem the Reduced Multicast Flow Bandwidth problem (RMF). Finally, we consider the case where b_before < b_after. We call this problem the Increased Multicast Flow Bandwidth problem (IMF). While these three problems are similar, the approximation algorithms developed for them differ in subtle aspects. In the following subsections, we study RMF and IMF separately.

4.1.1 Algorithm for RMF

In RMF, the bandwidth of the multicast flow decreases after traversing the NFV, and the reduction in bandwidth consumption reduces the link cost. This problem can be solved by almost the same approximation algorithm as the NEMP. The only difference is that the bandwidth requirement of the flow is reduced after traversing the NFV, resulting in two link weight metrics Θbefore and Θreduced. By using link metric Θ, we mean that the link weight used to calculate the distance between two nodes is defined by Θ. The approximation algorithm for RMF is therefore the same as Algorithm 1, except that in line 3 of Algorithm 1, Θbefore is used to calculate the distance from source s to NFV node h and Θreduced is used to calculate the distance from NFV node h to end user d, and in line 4 of Algorithm 1, Θreduced is used to calculate the cost of the edges of Gc. This approximation algorithm also keeps an approximation factor of 2, which can be proven as follows:


First, let Gopt denote the optimal solution. From Lemma 1, we can derive a tree Topt with the

same cost. To facilitate the statement of the algorithm, define:

Copt(v1, v2,Θ): Cost of the path connecting v1 and v2 in Topt under link weight metric Θ.

Cshortest(v1, v2,Θ): Minimum cost of the path connecting v1 and v2 in G under link weight

metric Θ.

Now we state the proof of the approximation algorithm for RMF :

Proof. From Lemma 2, there also exists a loop L in Topt such that every edge in Topt is traversed

exactly twice and every leaf in Topt is visited exactly once. L can be decomposed into three

paths:

1. A simple path M1 that connects source s to the first leaf l1.

2. A path M2 that connects the first leaf l1 to the next leaf l2, l2 to the next leaf l3, and so on until the last leaf ln.

3. A path M3 that connects ln back to source s.

The simple path M1 must pass through at least one NFV node; call this NFV node hopt. The total cost of M1 equals Copt(s, hopt, Θbefore) + Copt(hopt, l1, Θreduced) + c(hopt). Ignoring the activation cost of all the NFV nodes that M2 traverses, M2 can be viewed as a graph that contains all the end users, so the total cost of M2 equals $\sum_{i=1}^{n-1} C_{opt}(l_i, l_{i+1}, \Theta)$, where Θ may be Θbefore or Θreduced depending on whether the flow travels through the NFV node. The combined cost of M1 and M2 must be greater than or equal to that of $T_{best} = \min_{h \in H, d \in D} \left( C_{shortest}(s, h, \Theta_{before}) + C_{shortest}(h, d, \Theta_{reduced}) + c(h) + C^{l}(T) \right)$. Thus the total cost of the loop L must be greater than or equal to the total cost of Tbest, whose total cost is greater than or equal to the cost of the solution TH. Therefore $C(T_H) \le C(T_{best}) \le C(M_1) + C^{l}(M_2) \le C(M_1) + C(M_2) \le C(L) = 2C(T_{opt}) = 2C(G_{opt})$.

The complexity of the above algorithm is the same as that of approximation algorithm for

UMF.


Algorithm 4 Approximation algorithm for IMF

1: Cbest = ∞
2: for each (h, d) pair ∈ H × D do
3:   Construct a shortest path graph P by connecting s to h, h to d. Calculate the cost of P, C(P) = Cshortest(s, h, Θbefore) + Cshortest(h, d, Θincreased) + c(h)
4:   Construct a complete subgraph Gc for D, where the cost of each edge (di, dj) in Gc is set equal to min{ min(h1,h2)∈H (Cshortest(di, h1, Θincreased) + Cshortest(h1, h2, Θbefore) + Cshortest(h2, dj, Θincreased) + c(h1) + c(h2)), Cshortest(di, dj, Θincreased) } in G
5:   Find the minimum spanning tree T from Gc and calculate C^l(T)
6:   Connect P with T, call the combined graph Gw (since d exists in both P and T, we can connect them together). Calculate C(Gw) using eq. (3)
7:   if C(Gw) ≤ Cbest then
8:     Cbest ← C(Gw), Gbest ← Gw
9:   end if
10: end for
11: Construct the graph Tbest from Gbest by replacing every edge in Gbest with the shortest path in G; if there are several shortest paths, randomly select one.
12: Remove any unnecessary edges in Tbest to get final output TH

4.1.2 Algorithm for IMF

In IMF, the bandwidth consumption increases after passing through the NFV nodes, resulting in a higher link cost.

We propose a 2-approximation algorithm (Algorithm 4) for IMF. The main difference between Algorithm 4 and Algorithm 1 lies in lines 3 and 4, where the costs of the edges of P and Gc are calculated differently. The rationale of Algorithm 4 is similar to that of Algorithm 1. As before, we have two link weight metrics, Θbefore and Θincreased.
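To make the edge cost in line 4 of Algorithm 4 concrete, the following sketch evaluates it from precomputed shortest-path distances. It is an illustration only: the function name `imf_edge_cost`, the nested-dictionary distance tables and the numbers are assumptions, and the sketch takes h1 and h2 to be distinct NFV nodes, which the text does not state explicitly.

```python
from itertools import permutations


def imf_edge_cost(di, dj, nfv_nodes, dist_before, dist_increased, activation_cost):
    """Cost of edge (di, dj) in Gc: direct path versus a detour through two NFV nodes."""
    direct = dist_increased[di][dj]
    via_nfv = min(
        (
            dist_increased[di][h1]      # di -> h1 at the increased bandwidth
            + dist_before[h1][h2]       # h1 -> h2 at the original bandwidth
            + dist_increased[h2][dj]    # h2 -> dj at the increased bandwidth
            + activation_cost[h1]
            + activation_cost[h2]
            for h1, h2 in permutations(nfv_nodes, 2)
        ),
        default=float("inf"),  # fewer than two NFV nodes: no detour available
    )
    return min(direct, via_nfv)


# Hypothetical shortest-path distances under the two metrics.
dist_increased = {
    "d1": {"d2": 20, "h1": 2, "h2": 10},
    "h1": {"d2": 10},
    "h2": {"d2": 3},
}
dist_before = {"h1": {"h2": 4}, "h2": {"h1": 4}}
activation_cost = {"h1": 1, "h2": 1}

cost = imf_edge_cost("d1", "d2", ["h1", "h2"], dist_before, dist_increased, activation_cost)
```

In this toy instance the detour d1, h1, h2, d2 costs 2 + 4 + 3 + 1 + 1 = 11, which is cheaper than the direct path at the increased bandwidth (20), so the edge cost is 11.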

Now we state the proof of approximation algorithm for IMF :

Proof. Let Topt denote the optimal tree derived from Gopt. There also exists a loop L in Topt such that every edge in Topt is traversed exactly twice and every leaf in Topt is visited exactly once. As in the previous case, L can be decomposed into three paths M1, M2 and M3. The simple path M1 must pass through at least one NFV node; call this NFV node hopt. The total cost of M1 equals Copt(s, hopt, Θbefore) + Copt(hopt, l1, Θincreased) + c(hopt). M2 consists of paths which


connect the first leaf l1 to the next leaf l2, and so on until the last leaf ln. Let Qi denote the path in L that connects node li and node li+1. There are two categories for Qi:

1. Qi does not pass through any NFV node. In this case the total cost of Qi is Copt(li, li+1, Θincreased).

2. Qi passes through two NFV nodes, h1 and h2. In this case the total cost of Qi is Copt(li, h1, Θincreased) + Copt(h1, h2, Θbefore) + Copt(h2, li+1, Θincreased) + c(h1) + c(h2).

The total cost of M2 equals $\sum_{i=1}^{n-1} C(Q_i)$, where C(Qi) denotes the total cost of Qi. According to the definition of Tbest above, the combined cost of M1 and M2 must be greater than or equal to that of $T_{best} = \min_{h \in H, d \in D} \left( C(P) + C(T) \right)$. Thus the total cost of the loop L must be greater than or equal to the total cost of Tbest, whose total cost is greater than or equal to the cost of the solution TH. Hence $C(T_H) \le C(T_{best}) \le C(L) = 2C(T_{opt}) = 2C(G_{opt})$.

We now analyze the complexity of Algorithm 4. In line 3 of Algorithm 4, we need to find the shortest paths from s to h and from h to d for each (h, d) pair. Moreover, in line 4, to calculate the cost of each edge in Gc, we need to try every combination of h1 and h2 and the shortest paths between them. Therefore, the overall complexity is O(n^2 |H|^2 (|V| + |E| log |E|)), where n is the number of end users, |V| is the number of nodes in the network and |E| is the number of edges in the network.

A limitation of Algorithm 4 is that, in line 4, the multicast flow sent between two end users may first be reverse processed by one NFV node to decrease its bandwidth and then forward processed by another NFV node before reaching the next destination user. Algorithm 4 therefore assumes that an NFV node can perform both forward and reverse processing. Since the bandwidth consumption of the flow increases when traversing the NFV, there should be hardly any loss of original information when the flow is first forward and then reverse processed by an NFV node. An example is lossless video compression, which allows the original video stream to be perfectly recovered from the compressed stream [25].


4.2 Dynamic heuristics

So far, our proposed algorithms focus on static cases where individual users never leave the system until the session ends. In reality, end users may dynamically join and leave the existing multicast topology. The static algorithms are not suited for this case because it is not reasonable to rerun them every time a user joins or leaves the system, due to the high computation overhead. Moreover, a static algorithm may return a completely different multicast topology when a single user joins or leaves, so all the existing multicast trees would need to be reconfigured, which produces a high reconfiguration overhead. Therefore, we propose dynamic heuristics to deal with this situation. Algorithms 2 and 3, proposed in Section 3.3.3, can be used for handling user joining and leaving for both the UMF and RMF problems.

However, in the case of IMF, the bandwidth increases after processing by the NFV, so finding the closest user and connecting it to the new user may result in high bandwidth usage. Therefore we design a new algorithm for the entering operation in the case of IMF (Algorithm 5). When a new end user joins, this dynamic algorithm first finds the shortest distance from every existing end user to the new user and takes the least incremental cost, c1min. It then calculates c2min, the least cost of activating a new NFV node from the set of unused NFV nodes Hinactivated and setting up the flow path through it. If c1min ≤ c2min, it connects the new end user to the closest existing end user; otherwise it activates a new NFV node and sets up a new path. The complexity of this dynamic heuristic is O((n + |H|)(|V| + |E| log |E|)), which is still much lower than that of Algorithm 4. Finally, Algorithm 3 can still be used to handle the case when a user leaves the system for IMF.
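The decision made by the IMF entering heuristic can be sketched as below, assuming the shortest-path distances under both metrics are already available. The function name `join_imf`, the dictionary layout and the numbers are hypothetical, and ties are broken by sorted order for determinism.

```python
def join_imf(d, users, inactive_nfv, source, dist_before, dist_increased, activation_cost):
    """Decide how new user d joins: attach to an existing user or activate a new NFV node."""
    # Option 1: extend the tree from the closest existing end user (increased bandwidth).
    v_min = min(sorted(users), key=lambda v: dist_increased[v][d])
    c1_min = dist_increased[v_min][d]

    # Option 2: activate an unused NFV node and build a new source -> NFV -> d path.
    def new_path_cost(h):
        return dist_before[source][h] + dist_increased[h][d] + activation_cost[h]

    if inactive_nfv:
        h_min = min(sorted(inactive_nfv), key=new_path_cost)
        c2_min = new_path_cost(h_min)
    else:
        h_min, c2_min = None, float("inf")

    if c1_min <= c2_min:
        return ("attach", v_min)
    inactive_nfv.remove(h_min)
    return ("activate", h_min)


# Hypothetical distances: attaching via u2 costs 7, a new path via h2 costs 1 + 2 + 3 = 6.
dist_increased = {"u1": {"d": 9}, "u2": {"d": 7}, "h2": {"d": 2}}
dist_before = {"s": {"h2": 1}}
activation_cost = {"h2": 3}
inactive = {"h2"}

decision = join_imf("d", {"u1", "u2"}, inactive, "s", dist_before, dist_increased, activation_cost)
```

Because 7 > 6 in this instance, the heuristic activates h2 and removes it from the set of inactive NFV nodes.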

4.3 Scheduling of Multiple Multicast Sessions

In the previous sections, we considered routing for a single multicast session. However, in reality, multiple multicast sessions may coexist to deliver their own information to different


Algorithm 5 Dynamic heuristics (entering)

1: For each coming end user d:
2:   For each end user v:
3:     vmin ← argmin_v (Cshortest(v, d, Θincreased))
4:     c1min ← min_v (Cshortest(v, d, Θincreased))
5:   End
6:   For each node h in Hinactivated:
7:     hmin ← argmin_h (Cshortest(s, h, Θbefore) + Cshortest(h, d, Θincreased) + c(h))
8:     c2min ← min_h (Cshortest(s, h, Θbefore) + Cshortest(h, d, Θincreased) + c(h))
9:   End
10:   If c1min ≤ c2min
11:     Add the new user to the group by connecting user d with vmin with the shortest path
12:   End
13:   If c1min > c2min
14:     Hinactivated.remove(hmin)
15:     Add the new user to the group by connecting (s, hmin, d) with the shortest path
16:   End
17: End

end user groups. Without loss of generality, we assume each multicast session has its own priority. In the scenario of multiple multicast sessions, applying the approximation algorithms to each session independently may cause imbalanced resource utilization.

For example, as shown in Figure 4.1(a), two multicast trees (shown with solid and dashed arrows) carry traffic from source 1 to their end users, resulting in a heavy traffic load on NFV node 3 and on links (1, 3), (3, 5) and (3, 6), since they carry the traffic of both sessions. To solve this problem, we can reroute one of the multicast trees to distribute the traffic load to other links and NFV nodes and thereby achieve load balancing. One possible solution is shown in Figure 4.1(b).

We now give the formulation of the problem. Suppose there is a set of multicast sessions T, where each session t ∈ T has its own source node s^t, set H^t of nodes where NFV can be placed and set of destinations D^t. The goal is to minimize the maximum link utilization. This problem can be represented as follows (variable definitions are given in Table 4.1):

minimize u (4.2)


Table 4.1: Definitions of parameters

Name | Dimension | Description
N | - | Number of nodes in the network
K | - | Number of undirected links in the network; after replacing each undirected link with two directed links, there will be 2K directed links in the graph
n^t | - | Number of destinations of multicast session t
H^t | - | Set of nodes where NFV can be implemented for multicast session t
D^t | - | Set of destinations in multicast session t, {d^t_l}, l = 1, 2, ..., n^t
u | - | Maximum utilization rate of links in the network
C_j | - | Capacity of link j
L^t_1 | - | Bandwidth usage of multicast session t before passing NFV
L^t_2 | - | Bandwidth usage of multicast session t after passing NFV
β^t | - | Workload on NFV nodes of multicast session t
M | N × 2K | Incidence matrix; M_mi = 1 and M_ni = −1 if there is a directed link i from node m to node n
A^t | N × n^t | A^t_jk = 1 iff j = s^t, A^t_jk = −1 iff j = d^t_k and A^t_jk = 0 otherwise
W^t | 1 × n^t | W^t_k = 1 for each element in W^t
R^t | N × 1 | R^t_k = 1 if node k is not in H^t, R^t_k = 0 otherwise
Q^t | N × n^t | Q^t_ij = 0 if i = d^t_j and Q^t_ij = −1 otherwise
X^t_1 | 2K × n^t | X^t_1ei = 1 iff link e is used by the flow with destination d^t_i to connect source s^t with the NFV node
X^t_2 | 2K × n^t | X^t_2ei = 1 iff link e is used to reach destination d^t_i from the NFV node
Y^t_1 | 2K × 1 | Y^t_1e = 1 iff link e is used in multicast session t to connect source s to any NFV node
Y^t_2 | 2K × 1 | Y^t_2e = 1 iff link e is used in multicast session t to connect any NFV node to any end user
Z^t | N × 1 | Z^t_k = 1 iff NFV is running on node k in multicast session t


Figure 4.1: Example of Load balancing

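\begin{align}
\text{subject to} \quad & M(X_1^{t} + X_2^{t}) = A^{t} \quad (\forall t) \tag{4.3} \\
& Z^{t\top} M X_1^{t} = -W^{t} \quad (\forall t) \tag{4.4} \\
& Z^{t\top} M X_2^{t} = W^{t} \quad (\forall t) \tag{4.5} \\
& X_{1jk}^{t} \le Y_{1j}^{t} \le 1 \quad (\forall k, j, t) \tag{4.6} \\
& X_{2jk}^{t} \le Y_{2j}^{t} \le 1 \quad (\forall k, j, t) \tag{4.7} \\
& R^{t\top} Z^{t} = 0 \quad (\forall t) \tag{4.8} \\
& Q^{t} \preceq M X_1^{t} \preceq S^{t} \quad (\forall t) \tag{4.9} \\
& \sum_{t} \left( L_1^{t} Y_{1j}^{t} + L_2^{t} Y_{2j}^{t} \right) \le u C_j \quad (\forall j, t) \tag{4.10} \\
& X_1^{t}, X_2^{t}, Y_1^{t}, Y_2^{t}, Z^{t} \text{ are binary}
\end{align}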

In this formulation, constraints (4.3)-(4.9) are the same as those defined for the NEMP in the previous chapter, applied to each session t. Constraint (4.10) calculates the total bandwidth consumption on each link and ensures that the total bandwidth usage on that link does not exceed the utilization rate times the capacity of that link.
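For a fixed routing, the smallest u that satisfies constraint (4.10) is the largest ratio of load to capacity over all links. The sketch below computes it; the function name `max_link_utilization`, the tuple layout of a session and the numbers are assumptions made for illustration.

```python
def max_link_utilization(sessions, capacity):
    """Return the smallest u satisfying (4.10) for given 0/1 link-usage vectors.

    Each session is a tuple (L1, L2, Y1, Y2): bandwidth before and after the NFV
    and the 0/1 vectors marking the links used before and after the NFV.
    """
    load = [0.0] * len(capacity)
    for l1, l2, y1, y2 in sessions:
        for j in range(len(capacity)):
            load[j] += l1 * y1[j] + l2 * y2[j]
    return max(load[j] / capacity[j] for j in range(len(capacity)))


capacity = [10.0, 10.0, 20.0]  # hypothetical link capacities C_j
sessions = [
    (4.0, 2.0, [1, 0, 0], [0, 1, 1]),  # session A
    (3.0, 6.0, [1, 0, 0], [0, 0, 1]),  # session B
]
u = max_link_utilization(sessions, capacity)
```

Both sessions share link 0 before the NFV, so its load is 4 + 3 = 7 out of 10, which gives the maximum utilization of 0.7 in this example.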


Algorithm 6 Load balancing algorithm

1: Sort the multicast sessions t ∈ T from highest to lowest priority
2: Assign initial costs to the nodes and links based on their resource usage
3: For each multicast session t in T (highest to lowest priority):
4:   Run the Approximation algorithm.
5:   For each link in the network:
6:     Calculate the bandwidth usage.
7:     Update the cost of links based on the current utilization rate of the resources.
8:   End
9: End

Our heuristic algorithm works as follows. Assume that we have a set of multicast sessions t ∈ T, each with its own priority. One method to achieve load balancing is to first schedule the multicast session with the highest priority, then update the link costs based on the amount of resource remaining, and repeat until all the sessions are scheduled. Assuming that initially each node has an NFV activation cost and each link has a link cost, the algorithm is presented as Algorithm 6.

To update the cost of link j, we use a function that is inversely proportional to the remaining bandwidth on the link, $\rho_j / (1 - m)$, where ρj is a constant and m is the current utilization rate of the node or link. With this update function, a link or node with a lower load is more likely to be used in the next round of scheduling.
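The scheduling loop and the cost update can be sketched as follows. This is a simplified illustration rather than the thesis implementation: each session has a single bandwidth value, only link costs are updated, a larger number means a higher priority, and the `route` callable is a hypothetical stand-in for the approximation algorithm.

```python
def updated_cost(rho, utilization):
    """Link cost rho / (1 - m); a fully used link gets infinite cost."""
    if utilization >= 1.0:
        return float("inf")
    return rho / (1.0 - utilization)


def schedule(sessions, capacity, rho, route):
    """Route sessions from highest to lowest priority, updating link costs in between."""
    used = {j: 0.0 for j in capacity}
    cost = {j: updated_cost(rho[j], 0.0) for j in capacity}
    plan = {}
    for s in sorted(sessions, key=lambda s: s["priority"], reverse=True):
        links = route(s, cost)  # stand-in for running the approximation algorithm
        plan[s["name"]] = links
        for j in links:
            used[j] += s["bandwidth"]
        for j in capacity:
            cost[j] = updated_cost(rho[j], used[j] / capacity[j])
    return plan, cost


def cheapest_link(session, cost):
    """Toy router: pick the single cheapest link (ties broken by name)."""
    return [min(sorted(cost), key=cost.get)]


capacity = {"a": 10.0, "b": 10.0}  # two hypothetical parallel links
rho = {"a": 1.0, "b": 1.0}
sessions = [
    {"name": "lo", "priority": 1, "bandwidth": 5.0},
    {"name": "hi", "priority": 2, "bandwidth": 5.0},
]
plan, cost = schedule(sessions, capacity, rho, cheapest_link)
```

After the high-priority session takes link "a", its cost doubles from 1 to 2, so the next session is steered onto link "b".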

One limitation of the above algorithm is that it assumes that all multicast sessions have unique priority levels. However, there may be multiple multicast sessions with the same priority, and we need to ensure that each session is treated fairly without exceeding the capacity of the resources, that is, the total bandwidth consumption must not exceed the link capacity. For instance, in Figure 4.2(a), two multicast sessions are built using the approximation algorithm proposed above. Session 1, shown with solid arrows, has s = 9, h = 5, D = {3, 4}. Session 2, shown with dashed arrows, has s = 9, h = {5, 7}, D = {4, 6}. Both trees consume resources on links (9, 5) and (5, 4). Suppose the sum of the bandwidth requirements of session 1 and session 2 exceeds the capacity of link (9, 5). We call this resource contention, since multiple multicast sessions compete for limited link bandwidth. To solve this resource contention problem, we can reroute one of the multicast trees to satisfy the capacity constraint.


Figure 4.2(b) shows an alternative path (dotted arrow), so one of the sessions may take this alternative path. However, since the original multicast topology has been changed, the new multicast topology may have a different cost than the original one. That is, rerouting a flow may incur a reconfiguration cost.

For each overloaded link, suppose we have a set of paths {pn}, n = 0, 1, ..., N, which contains the original path p0 (the link itself) as well as N disjoint alternative paths. Let M denote the total number of multicast flows that the overloaded link carries, let {fm}, m = 1, 2, ..., M, denote the corresponding set of multicast sessions, and define the reconfiguration cost rmn as:

\[
r_{mn} =
\begin{cases}
0 & \text{if } n = 0,\ 1 \le m \le M \\
c[p_n, f_m] - c[p_0, f_m] & \text{if } n \ne 0,\ 1 \le m \le M
\end{cases}
\]

where c[pn, fm] denotes the total cost of the multicast topology of fm if p0 is replaced by pn.

Define Cl as the capacity of link l. The capacity of the path pn, Cpn, is defined as:

\[
C_{p_n} = \min_{l \in p_n} C_l
\]

which is the minimum capacity of the links contained in the path. Define Bm as the bandwidth requirement of multicast flow fm, and xmn as the binary integer variable that indicates whether flow fm is assigned to pn. In order to minimize the total reconfiguration cost, we have the following objective function:

\[
\underset{x}{\text{minimize}} \quad \sum_{m=1}^{M} \sum_{n=0}^{N} r_{mn} x_{mn} \tag{4.11}
\]

We need to assign a path to each multicast session, which gives:

\[
\text{subject to} \quad \sum_{n=0}^{N} x_{mn} = 1 \quad \forall\, 1 \le m \le M \tag{4.12}
\]


We also need to make sure that the total bandwidth usage does not exceed the capacity of each path:

\[
\sum_{m=1}^{M} B_m x_{mn} \le C_{p_n} \quad \forall\, 0 \le n \le N \tag{4.13}
\]

Finally, this is a binary integer programming problem:

\[
x_{mn} \in \{0, 1\} \tag{4.14}
\]
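For very small instances, problem (4.11)-(4.14) can be solved by enumerating every assignment of flows to paths. The sketch below does exactly that; it is a brute-force illustration with exponential running time, not the polynomial-time method discussed in the text, and the function name `best_assignment` and the numbers are hypothetical.

```python
from itertools import product


def best_assignment(r, B, C):
    """Exhaustively solve (4.11)-(4.14).

    r[m][n] is the reconfiguration cost of putting flow m on path n, B[m] the
    bandwidth of flow m and C[n] the capacity of path n (index 0 = original path).
    Returns (cost, assignment) with assignment[m] = n, or None if infeasible.
    """
    num_flows, num_paths = len(B), len(C)
    best = None
    # Choosing exactly one path per flow satisfies (4.12) by construction.
    for assign in product(range(num_paths), repeat=num_flows):
        load = [0.0] * num_paths
        for m, n in enumerate(assign):
            load[n] += B[m]
        if any(load[n] > C[n] for n in range(num_paths)):
            continue  # violates the capacity constraint (4.13)
        cost = sum(r[m][n] for m, n in enumerate(assign))
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best


# Two flows of bandwidth 6 overload the original path p0 (capacity 10).
r = [[0, 3, 5], [0, 2, 4]]
B = [6, 6]
C = [10, 8, 8]
result = best_assignment(r, B, C)
```

In this instance the cheapest feasible choice keeps the first flow on p0 and moves the second flow to p1, for a total reconfiguration cost of 2.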

This problem is a special case of the generalized assignment problem (GAP), which is APX-hard. For a similar problem, [26] presents a polynomial time solution algorithm that relaxes the capacity constraint. We revise the solution in [26] to solve our problem. The algorithm works as follows. First, we convert the problem into a linear programming problem by relaxing the integer constraint (4.14) to

\[
x_{mn} \ge 0 \tag{4.15}
\]

Moreover, we can add another constraint

\[
x_{mn} = 0 \quad \text{if } B_m > C_{p_n} \tag{4.16}
\]

which states that we will not assign fm to pn if fm requires more bandwidth than the capacity of pn. Similar to the result given in [26], we have the following theorem:

Theorem 3. Provided that there is a feasible solution with cost $\pi$ for the linear programming problem (4.11)-(4.13), (4.15), (4.16), we can find in polynomial time an integer solution with total cost no more than $\pi$ in which each path $p_n$ carries at most $2C_{p_n}$.

Proof. First, assume that we have a feasible fractional solution $x_{mn}$, which means that path $p_n$ may carry only part of flow $f_m$. Similar to the proof of Theorem 11.1 in [26], we provide $v_n = \lceil \sum_{m=1}^{M} x_{mn} \rceil$ slots to each path $p_n$, and define the set $S = \{(n, s) : 0 \le n \le N,\ 1 \le s \le v_n\}$ to represent the slot assignment on path $p_n$ and $U = \{1, 2, \ldots, M\}$ to represent the flows. We then build a bipartite graph $F = (U, S, E)$, where there is an edge $e$ between node $m$ and node $(n, s)$ iff $x_{mn} > 0$, and the cost of edge $e$ is set to $r_{mn}$.

$F$ still keeps the following properties [26]:

1. $F$ contains a fractional complete matching for set $U$ with cost $\pi$.

2. Every integer matching of set $U$ in $F$ still has a total bandwidth bound of $2C_{p_n}$ for each path $p_n$.

For 1), we use the same step as in [26]: each flow $f_m$ with $x_{mn} > 0$ is assigned to a slot $(n, s)$, and $y_{m,(n,s)}$ denotes the fraction of flow $f_m$ assigned to slot $(n, s)$. Obviously, we have $x_{mn} > 0$ if $y_{m,(n,s)} > 0$. For 2), we first sort all the flows on the same path $p_n$ according to their bandwidth requirement $B_m$ ($x_{mn} = 1$). Suppose that a total of $\lambda_n$ flows are assigned to path $p_n$. After sorting, we have $B_1 \ge B_2 \ge \cdots \ge B_{\lambda_n}$, and we can still use the scheduling method presented in [26] to construct the graph $F'$, which still satisfies 1). Let $B^{s,n}_{\max}$ denote the maximum bandwidth requirement among the flows assigned to slot $s$ of path $p_n$. Then the total bandwidth requirement on path $p_n$ is at most $\sum_{s=1}^{v_n} B^{s,n}_{\max}$, and we have the following derivation for each path $p_n$ with $0 \le n \le N$:

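$$\sum_{s=1}^{v_n} B^{s,n}_{\max} \le C_{p_n} + \sum_{s=2}^{v_n} B^{s,n}_{\max} \tag{4.17}$$

$$C_{p_n} + \sum_{s=2}^{v_n} B^{s,n}_{\max} \le C_{p_n} + \sum_{s=1}^{v_n - 1} \sum_{m=1}^{M} y_{m,(n,s)} B_m \tag{4.18}$$

$$C_{p_n} + \sum_{s=1}^{v_n - 1} \sum_{m=1}^{M} y_{m,(n,s)} B_m \le C_{p_n} + \sum_{m=1}^{M} \sum_{s=1}^{v_n} y_{m,(n,s)} B_m \tag{4.19}$$

$$C_{p_n} + \sum_{m=1}^{M} \sum_{s=1}^{v_n} y_{m,(n,s)} B_m = C_{p_n} + \sum_{m=1}^{M} x_{mn} B_m \le 2C_{p_n} \tag{4.20}$$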

(4.17) comes from the fact that $B^{1,n}_{\max} \le C_{p_n}$, which constraint (4.16) guarantees. (4.18) comes from the fact that $B^{s+1,n}_{\max} \le \sum_{m=1}^{M} y_{m,(n,s)} B_m$: since all the flows are sorted by bandwidth from largest to smallest, the total fractional sum of bandwidth in slot $s$ must be at least the maximum bandwidth in slot $s + 1$. For (4.19), we reverse the order of summation and add one more term on the right side. The equality in (4.20) holds because summing the bandwidth over each slot and each flow on path $p_n$ equals summing the bandwidth of all the flows on path $p_n$, and the inequality in (4.20) comes from the constraint in (4.13).

Figure 4.2: Example


We evaluate our multicast algorithms on real wide area network models generated by GT-ITM [22]. We demonstrate the accuracy of the algorithms by comparing the result of the approximation algorithm with the multicast topology generated by our Branch-and-Bound method over three network models generated by GT-ITM. The network settings are presented in Table 4.2.

Table 4.2: Properties of networks

Case                 Network I   Network II   Network III
Nodes                10          35           100
Links (undirected)   14          50           127
Size of H            2           7            10


For each network topology, we randomly generate a set of nodes H and a set of end users. We

schedule all the users at the same time.

4.4.1 Cost Analysis

Tables 4.3 to 4.8 compare the total cost of the multicast topology generated by the approximation algorithm with the optimal solution generated by Branch-and-Bound for RMF and IMF. All the results generated by the corresponding approximation algorithms are normalized by the optimal solution generated by our Branch-and-Bound method.

Table 4.3: Cost evaluation for NetworkI (RMF)

Number of end users   Branch and Bound   Approximation algorithm
3                     1                  1.18
4                     1                  1.21
5                     1                  1.25

Table 4.4: Cost evaluation for NetworkII (RMF)


Table 4.5: Cost evaluation for NetworkIII (RMF)


Table 4.6: Cost evaluation for NetworkI (IMF)


Table 4.7: Cost evaluation for NetworkII (IMF)

Table 4.8: Cost evaluation for NetworkIII (IMF)

The variation in the performance of the approximation algorithm is small (within 11 percent). For Network I, the normalization ratio between the approximation algorithm and the optimal solution increases from 1.18 to 1.25 for RMF and from 1.16 to 1.27 for IMF as the number of users increases. For Network II, the normalization ratio increases from 1.35 to 1.42 for RMF and from 1.48 to 1.51 for IMF as the number of users increases. For Network III, the normalization ratio increases from 1.49 to 1.53 for RMF and from 1.57 to 1.62 for IMF as the number of users increases. For the small network, the approximation algorithms for both RMF and IMF keep a low normalization ratio (at most 1.27), which shows that they stay close to the optimal solution. Moreover, the gap between the Branch-and-Bound solution and the approximation algorithm increases as the number of end users or the size of the network grows. One possible reason is that a larger number of end users or a larger network offers more ways to build the multicast topology, so it is harder to find a good solution.

4.4.2 Running Time Analysis

Figures 4.3-4.5 show the running time of the approximation algorithm and Branch-and-Bound for IMF over the three network topologies. The running time of the approximation algorithm increases from 5.5 ms to 94 ms for Network I, from 0.75 s to 10 s for Network II, and from 1 s to 90 s for Network III. The running time of both methods increases as the number of destination users increases, and the running time of the approximation algorithm is much lower than that of Branch-and-Bound. In addition, the network size has a large effect on the running time of both methods, with a greater impact on Branch-and-Bound.

Figures 4.6-4.8 show the relationship between the total number of end users entering and the average time taken to process each end user for IMF over the three network topologies. The running time of the algorithm increases from 1.2 ms to 3.8 ms for Network I, from 4 ms to 42 ms for Network II, and from 20 ms to 0.17 s for Network III. The average processing time increases with the number of users. We believe the reason is that, as the number of users increases, there are more options for a new user to connect, so the algorithm takes longer to find a solution.

Figures 4.9-4.11 show the relationship between the total number of existing end users in the multicast group and the average time to process each leaving user for the dynamic algorithm for IMF. The running time of the algorithm increases from 1.3 ms to 3.6 ms for Network I, from 4 ms to 33 ms for Network II, and from 40 ms to 0.18 s for Network III.

The average processing time again increases with the number of users, and we believe the reason is similar: as the number of users increases, the algorithm takes longer to check whether the links connecting to the leaving end user carry the multicast traffic of other users.

Figure 4.3: Running time of static algorithms over three networks

Figure 4.6: Running time of dynamic heuristics over three networks

Figure 4.9: Running time of dynamic algorithms over three networks


Chapter 5

Multiple Layer NFV-enabled

Multicasting Routing Problem


In the last two chapters, we considered the NFV-enabled Multicasting Routing Problem with a single layer of NFV. In reality, however, the service chain usually contains multiple layers of NFVs. In the example shown in Figure 1.2, the service chain contains three levels of middlebox processing for the video stream: DPI (deep packet inspection), NAT (network address translation), and transcoding. In this chapter, we discuss routing for the multiple-layer NFV-enabled Multicasting Routing Problem.

We are given the network topology $G(V, E)$, which includes a node set $V$ and an edge set $E$. We define $N = \{n_1, n_2, \ldots, n_{|N|}\}$ as the set of network functions in the service function chain, where the traffic has to traverse $n_1, n_2, \ldots, n_{|N|}$ in order. Let $H_i \subseteq V$ denote the set of candidate nodes on which to deploy the NFV instance of network function $n_i$, and let $H = \{H_i\}$ denote the whole set of candidate nodes for the NFV instances. Denote by $s$ the source host and by $D$ the set of destination nodes, with $d \in D$. Moreover, we revise the definition of multicast topology to include the concept of multiple layers of NFVs:


Definition 1. Define the multicast topology as a subgraph $G' \subseteq G$. For each $G'$, there exists a function $f : D \to H_1 \times H_2 \times \cdots \times H_{|N|}$ such that for each $d \in D$, there exists a chain of NFV instances $(h_1, h_2, \ldots, h_{|N|}) \in H_1 \times H_2 \times \cdots \times H_{|N|}$ that processes the traffic before it reaches $d$.

Let $w_e$ denote the cost of link $e$ and $w^i_v$ the cost of implementing network function $n_i$ on node $v$. Different metrics can be used to derive the cost, depending on the purpose. For example, one metric is based on the utilization rate of the remaining bandwidth of the link and the utilization rate of the hardware resources of the node. Further define binary variables $x_{ei}$ $(i = 0, 1, \ldots, |N|)$: $x_{e0} = 1$ if link $e$ is used to direct the multicast flow from $s$ to NFV $n_1$, and $x_{e0} = 0$ otherwise; $x_{ei} = 1$ $(i = 1, 2, \ldots, |N| - 1)$ if link $e$ is used to direct the multicast flow from NFV $n_i$ to NFV $n_{i+1}$, and $x_{ei} = 0$ otherwise; $x_{e|N|} = 1$ if link $e$ is used to direct the multicast flow from NFV $n_{|N|}$ to $d \in D$, and $x_{e|N|} = 0$ otherwise. Let $z_{vi} \in \{0, 1\}$ $(i = 1, 2, \ldots, |N|)$ denote whether an instance of $n_i$ runs on node $v$. Our goal is to minimize the total cost of the multicast topology, which is defined in (5.1).

$$\underset{x_{ei},\, z_{vi} \in \{0, 1\}}{\text{minimize}} \quad \sum_{e \in E} \sum_{i=0}^{|N|} w_e x_{ei} + \sum_{i=1}^{|N|} \sum_{v \in H_i} w^i_v z_{vi} \tag{5.1}$$

Let the binary variable $y_{eid}$ denote whether link $e$ is used to direct the traffic from $n_i$ to $n_{i+1}$ for user $d$, let $y_{e0d}$ denote whether link $e$ is used to direct the traffic from $s$ to $n_1$, and let $y_{e|N|d}$ denote whether link $e$ is used to direct the traffic from $n_{|N|}$ to $d$. Similarly, let the binary variable $u_{vid}$ denote whether NFV $n_i$ runs on node $v$ for user $d$. Equations (5.2) and (5.3) relate $y_{eid}$ to $x_{ei}$ and $u_{vid}$ to $z_{vi}$:

$$y_{eid} \le x_{ei} \le 1 \quad \forall e \in E,\ d \in D,\ i = 0, 1, \ldots, |N| \tag{5.2}$$

$$u_{vid} \le z_{vi} \le 1 \quad \forall v \in V,\ d \in D,\ i = 1, 2, \ldots, |N| \tag{5.3}$$

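Next, define $\mathcal{Q}_i = \{Q_i \subseteq V : h_i \in Q_i,\ h_{i+1} \notin Q_i\}$ $(\forall i = 1, 2, \ldots, |N| - 1)$, $\mathcal{Q}_s = \{Q_s \subseteq V : s \in Q_s,\ h_1 \notin Q_s\}$ and $\mathcal{Q}_d = \{Q_d \subseteq V : d \in Q_d,\ h_{|N|} \notin Q_d\}$. Define $\delta(Q)$ as the set of links in the cut of $Q$, that is, the set of edges in $G$ which have one endpoint in $Q$ and the other endpoint not in $Q$. Equations (5.4)-(5.6) make sure that there is a path from $s$ to each destination user $d \in D$ [26]:

$$\sum_{e \in \delta(Q_s)} y_{e0d} \ge 1 \quad \forall d \in D,\ Q_s \in \mathcal{Q}_s \tag{5.4}$$

$$\sum_{e \in \delta(Q_i)} y_{eid} \ge 1 \quad \forall i = 1, 2, \ldots, |N| - 1,\ d \in D,\ Q_i \in \mathcal{Q}_i \tag{5.5}$$

$$\sum_{e \in \delta(Q_d)} y_{e|N|d} \ge 1 \quad \forall d \in D,\ Q_d \in \mathcal{Q}_d \tag{5.6}$$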

Finally, we have to make sure that the flow traverses at least one NFV instance $h_i$ for each network function $n_i$ in the service function chain before reaching the destination user:

$$\sum_{v \in H_i} u_{vid} \ge 1 \quad \forall d \in D,\ n_i \in N \tag{5.7}$$

Equations (5.1)-(5.7) formally define the problem, which we call the Service Function Chain Enabled Multicast Routing Problem (SMRP).
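To make objective (5.1) concrete, the following Python sketch evaluates the cost of one candidate topology. The data layout is an assumption made for illustration (dictionaries keyed by stage index $i$), and `topology_cost` is a hypothetical helper name, not part of the thesis.

```python
def topology_cost(link_cost, node_cost, used_links, placed):
    """Evaluate objective (5.1) for a given topology.

    link_cost[e]      : w_e
    node_cost[(i, v)] : w^i_v, cost of running function n_i on node v
    used_links[i]     : set of links e with x_{ei} = 1, for i = 0..|N|
    placed[i]         : set of nodes v with z_{vi} = 1, for i = 1..|N|
    """
    # A link used in several stages i is counted once per stage, as in (5.1).
    link_part = sum(link_cost[e] for links in used_links.values() for e in links)
    node_part = sum(node_cost[(i, v)] for i, nodes in placed.items() for v in nodes)
    return link_part + node_part
```

Note that the sketch only evaluates the objective; it does not check the connectivity constraints (5.4)-(5.6) or the coverage constraint (5.7).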

5.1.1 Two-Approximation Algorithm

We next propose a two-approximation algorithm (TAA) based on Algorithm 1, which is described in Algorithm 7. It first searches for a minimum-cost path which connects the source to one of the destination users through the chain of NFVs, then builds a minimum spanning tree among the users and directs the flows to each end user along that tree.

Figure 5.1 presents an example, and Theorem 4 presents the performance bound for Algorithm 7:

Theorem 4. Given the solution $T_{\min}$ generated by Algorithm 7, the total cost of $T_{\min}$ is less than two times that of the optimal multicast topology $T_{opt}$ of SMRP.

To prove Theorem 4, we again use Lemmas 1 and 2 from Chapter 3. We now present the proof for Algorithm 7:

Algorithm 7 Two-Approximation algorithm (TAA)

1: $W_{\min} = \infty$
2: for each $\{h_1, h_2, \ldots, h_{|N|}, d\} \in H_1 \times \cdots \times H_{|N|} \times D$ do
3:   Construct a graph $P$ by connecting $s$ to $h_1$, $h_1$ to $h_2$, ..., $h_{|N|}$ to $d$.
4:   Construct a complete graph $C$ among the end users $D$; the cost of the link between each two users is set to the minimum distance between the two users in $G$.
5:   Find the minimum spanning tree $T$ of $C$ and calculate the total link cost of $T$.
6:   Connect $P$ with $T$ and call the combined graph $G_s$ (since an end user exists in both $P$ and $T$, they can be connected). Calculate the total cost $W$ of $G_s$ defined by equation (5.1).
7:   if $W \le W_{\min}$ then
8:     $W_{\min} = W$, $G_{\min} \leftarrow G_s$
9:   end if
10: end for
11: Replace each link in $G_{\min}$ with the shortest path between its endpoints in $G$. Call the resulting multicast topology $T_{\min}$.
12: Remove any unnecessary edges in $T_{\min}$ to get the final output.

Proof. For ease of interpretation, denote by $W(U)$ the total cost of subgraph $U$, by $W_l(U) = \sum_{e \in U} w_e$ the total link cost of $U$, and by $W_h(U) = \sum_{v \in U} w_v$ the total node cost of $U$. Further denote by $G_{opt}$ the optimal multicast topology, and by $T_{opt}$ the corresponding multicast tree derived from it by using Lemma 1, so that the total cost of $G_{opt}$ equals that of $T_{opt}$. By Lemma 2, there exists a loop $L$ such that every edge in $T_{opt}$ appears exactly twice in $L$. This loop $L$ can be decomposed into three parts:

1. A path $R_1$ which connects source $s$ to one end user $d_1$ through a chain of NFVs.

2. A path $R_2$ which connects all the end users.

3. A path $R_3$ which connects $d_{|D|}$ back to $s$.

Since each link and node in $T_{opt}$ appears exactly twice in the loop $L$, $W(L) = 2 \times W(T_{opt}) \ge W_l(L)$. The path $R_1$ must pass through a chain of NFVs before reaching $d_1$; otherwise the flow is not processed. Denote by $\lambda(a, b)$ the cost of the shortest path between nodes $a$ and $b$ in $G$. Hence

$$W(R_1) \ge W(P) = \min_{h_1, h_2, \ldots, h_{|N|}, d} \Big( \lambda(s, h_1) + \sum_{i=1}^{|N|-1} \lambda(h_i, h_{i+1}) + \lambda(h_{|N|}, d) + \sum_{i=1}^{|N|} w^i_{h_i} \Big).$$

Ignoring the node cost of $R_2$, $R_2$ is just a path which connects all the end users, therefore $W(R_2) \ge W_l(R_2) \ge W(T)$. Therefore

$$W(T_{\min}) \le W(G_{\min}) \le W(R_1) + W_l(R_2) \le W(R_1) + W(R_2) \le W(L) = 2W(T_{opt}).$$

Figure 5.1: An example of TAA. (a) shows a network topology $G(V, E)$ with 10 nodes and 13 links; suppose $s = 10$ and $D = \{2, 3, 4, 6\}$. There are two layers of NFV, with $H_1 = \{5, 7\}$ and $H_2 = \{1, 9\}$. (b) shows the path graph $P$. (c) is the complete graph $C$ among the end users. Assume (d) is the minimum spanning tree $T$ derived from $C$. Connecting $P$ with $T$ gives $G_s$, which is shown in (e); assume $G_s = G_{\min}$. (f) shows $T_{\min}$ and (g) shows $T_{\min}$ after removing the redundant edges.

Figure 5.2 illustrates the proof for TAA with an example. The source node is 1, and the destination users are 2, 6 and 7. The service function chain involves two levels of network functions, which are located at node 3 and node 4. $G_{opt}$ is shown in Figure 5.2(a), where the paths from 1 to the end users are (1, 3, 5, 3, 4, 7), (1, 3, 5, 3, 4, 2) and (1, 3, 5, 3, 4, 2, 6), respectively. A tree $T_{opt}$ of equal cost can be derived as follows. First, the path to node 7 is added to $T_{opt}$, as shown in Figure 5.2(b). The path to node 2 is (1, 3, 5, 3, 4, 2); since (1, 3, 5, 3, 4) is already contained in the path to 7, user 2 can be added by connecting node 2 to node 4, as shown in Figure 5.2(c). Similarly, node 6 can be added to $T_{opt}$ by connecting it to node 2 (Figure 5.2(d)). The total cost of $T_{opt}$ is the same as the total cost of $G_{opt}$. Figure 5.2(e) shows the paths $R_1$, $R_2$, $R_3$ in $T_{opt}$, with $W(R_1) + W(R_2) + W(R_3) = 2W(T_{opt}) = 2W(G_{opt}) \ge W(P) + W(T)$.

Figure 5.2: Example of constructing tree

Next we analyze the complexity of the TAA. In the for loop (lines 2-10), we need to find the shortest paths between the NFV instances of $n_i$ and $n_{i+1}$. In line 4, to calculate the edge costs of $C$, shortest paths need to be calculated for each pair of destination users. Hence, the total complexity of TAA is $O((|D|^2 + |D| \prod_{i=1}^{|N|} |H_i|)(|V| + |E| \log |E|))$, where $|D|$ is the number of users, $|H_i|$ is the number of NFV nodes available to deploy the NFV instance of $n_i$, $|V|$ is the number of nodes and $|E|$ is the number of edges.
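The enumeration in lines 1-10 of Algorithm 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the graph is an adjacency dictionary of a connected undirected graph, `node_cost[i][v]` stands for $w^{i+1}_v$ (zero-based layer index), and the helper names are hypothetical. The sketch returns only the cost $W_{\min}$ with the selected chain and user; expanding links into shortest paths and pruning edges (lines 11-12) are omitted.

```python
import heapq
from itertools import product


def dijkstra(adj, src):
    """Shortest-path costs from src; adj[u] = {v: link_cost}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def mst_cost(nodes, lam):
    """Prim's algorithm on the complete graph C whose edge costs are lam[u][v]."""
    nodes = list(nodes)
    in_tree = {nodes[0]}
    total = 0.0
    while len(in_tree) < len(nodes):
        w, v = min((lam[u][v], v) for u in in_tree for v in nodes if v not in in_tree)
        total += w
        in_tree.add(v)
    return total


def taa_cost(adj, s, H, D, node_cost):
    """Return (W_min, chain, d) over all chains in H_1 x ... x H_|N| and users d."""
    lam = {u: dijkstra(adj, u) for u in adj}  # all-pairs shortest-path costs
    tree = mst_cost(D, lam)                   # T does not depend on the chain
    best = (float("inf"), None, None)
    for combo in product(*H, D):
        chain, d = combo[:-1], combo[-1]
        hops = (s,) + chain + (d,)
        w_path = sum(lam[a][b] for a, b in zip(hops, hops[1:]))
        w_nodes = sum(node_cost[i][h] for i, h in enumerate(chain))
        total = w_path + w_nodes + tree
        if total < best[0]:
            best = (total, chain, d)
    return best
```

Because the spanning tree over $D$ is the same in every iteration, the sketch computes it once outside the loop; the number of iterations is still $|D| \prod_i |H_i|$, in line with the complexity stated above.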

Algorithm 8 Heuristic Algorithm (HA)

1: Define $h'_1 = \arg\min_{h_1 \in H_1} (\lambda(s, h_1) + w^1_{h_1})$.
2: for $i = 2, \ldots, |N|$ do
3:   Define $h'_i = \arg\min_{h_i \in H_i} (\lambda(h'_{i-1}, h_i) + w^i_{h_i})$.
4: end for
5: Define $d' = \arg\min_{d \in D} (\lambda(h'_{|N|}, d) + w^{|N|}_{h_{|N|}})$.
6: Construct a graph $P$ by connecting $s$ to $h'_1$, $h'_1$ to $h'_2$, ..., $h'_{|N|}$ to $d'$.
7: Construct a complete graph $C$ among the end users $D$; the cost of the link between each two users $a$ and $b$ is set to the minimum distance between $a$ and $b$ in $G$.
8: Find the minimum spanning tree $T$ of $C$ and calculate the total link cost of $T$.
9: Connect $P$ with $T$ and call the combined graph $G_s$. Calculate the total cost $W$ of $G_s$ defined by equation (5.1).
10: Replace each link in $G_s$ with the shortest path between its endpoints in $G$. Call the resulting multicast topology $T_{\min}$.
11: Remove any unnecessary edges in $T_{\min}$ to get the final output.

5.1.2 Heuristic Algorithm based on TAA

As the complexity analysis above shows, although TAA provides a bound on its performance, its complexity grows exponentially with the number of network functions $|N|$ in the service function chain. This is acceptable when the number of network functions is small. However, with a large number of network functions and a large network, the complexity is high. Hence we propose a heuristic algorithm (HA) based on TAA, which is described in Algorithm 8. Compared with TAA, HA iteratively finds the NFV on each layer of the service function chain such that the sum of link cost and node cost is minimum. The complexity of HA is $O((|D|^2 + \sum_{i=1}^{|N|} |H_i|)(|V| + |E| \log |E|))$, which grows linearly with $|N|$. As shown by the simulation in Section 5.4, HA saves a large amount of running time at the cost of a 3%-7% increase in the total cost.
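The layer-by-layer selection in lines 1-5 of Algorithm 8 can be sketched as below. The sketch assumes that a table `lam[a][b]` of shortest-path costs $\lambda(a, b)$ has already been computed and that `node_cost[i][v]` holds $w^{i+1}_v$ with a zero-based layer index; the function name `ha_chain` is hypothetical.

```python
def ha_chain(lam, s, H, D, node_cost):
    """Greedy chain selection: pick, layer by layer, the candidate node that
    minimises (shortest-path cost from the previous node) + (node cost).

    Returns (chain, d, cost) where cost is the path-plus-node cost W(P).
    """
    prev, chain, cost = s, [], 0.0
    for i, candidates in enumerate(H):
        h = min(candidates, key=lambda v: lam[prev][v] + node_cost[i][v])
        cost += lam[prev][h] + node_cost[i][h]
        chain.append(h)
        prev = h
    # The node-cost term in line 5 does not depend on d, so only the
    # distance from the last NFV node matters when choosing the user.
    d = min(D, key=lambda v: lam[prev][v])
    cost += lam[prev][d]
    return chain, d, cost
```

Each layer is examined once, so the number of candidate evaluations is $\sum_i |H_i| + |D|$ rather than the product used by TAA; the price is that an early greedy choice cannot be revised later.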


5.2 Build the Multicast Topology with given number of

NFV Instances

5.2.1 Problem Statement

In the last section, we investigated the Service Function Chain Enabled Multicast Routing Problem and presented a two-approximation algorithm. However, SMRP places no requirement on the number of NFV instances deployed for each network function $n_i$, and the TAA proposed above yields a solution with a single NFV instance deployed for each network function $n_i$. In a real scenario, this solution is not reliable: once one of the NFV instances goes down, all the network traffic is disrupted. To guarantee the reliability of the multicast service, we place a lower limit on the number of NFV instances deployed for each network function $n_i$ by adding a new constraint:

$$\sum_{d \in D} \sum_{v \in H_i} u_{vid} \ge \gamma \tag{5.8}$$

where $\gamma$ is the minimum number of NFV instances for each network function $n_i$. We call (5.1)-(5.8) the Reliable Service Function Chain Enabled Multicast Routing Problem (RSMRP).

5.2.2 Heuristic Algorithm

Next we propose a heuristic algorithm for RSMRP. We first classify all the end users into $\gamma$ groups by running a clustering algorithm, then we run the TAA for each user group such that each user group has different NFV instances deployed.

First we state the clustering algorithm, which is described in Algorithm 9. It selects the centre of each group by using the k-center algorithm [26]; all the other users are then grouped according to their distance to the centres. The algorithm for RSMRP is described in Algorithm 10.

Algorithm 9 Clustering Algorithm (CA)

1: Pick one end user $k \in D$ at random. Let $K = \{k\}$
2: while $|K| < \gamma$ do
3:   $l_{large} = 0$
4:   for $d \in D \setminus K$ do
5:     Find $l_d = \min_{k \in K} l_{kd}$, where $l_{kd}$ is the shortest distance between $k$ and $d$
6:     if $l_d \ge l_{large}$ then
7:       $l_{large} = l_d$, $d_{large} = d$
8:     end if
9:   end for
10:  $K = K \cup \{d_{large}\}$
11: end while
12: for $k_i \in K$ do
13:   Set $U_i = \{k_i\}$
14: end for
15: Group the other end users based on their distance to each $k_i$: a user is added to $U_i$ iff $k_i$ is the closest to the user among all $k \in K$. If more than one $k \in K$ satisfies this, randomly pick one.

Algorithm 10 Heuristic Algorithm for Group Routing (HAG)

1: $[U_1, U_2, \ldots, U_\gamma] = \text{Clustering}(D, \gamma)$
2: for each user group $U_i$ do
3:   $[h_1, h_2, \ldots, h_{|N|}] = \text{TAA}(H_1, H_2, \ldots, H_{|N|}, U_i)$, or alternatively $[h_1, h_2, \ldots, h_{|N|}] = \text{HA}(H_1, H_2, \ldots, H_{|N|}, U_i)$
4:   for $j = 1, 2, \ldots, |N|$ do
5:     $H_j = H_j \setminus \{h_j\}$
6:   end for
7: end for

Clustering($D$, $\gamma$) returns all the user groups produced by the Clustering Algorithm. We can choose TAA($H_1, H_2, \ldots, H_{|N|}, U_i$) or HA($H_1, H_2, \ldots, H_{|N|}, U_i$) to construct the multicast topology for the users in $U_i$, based on the Two-Approximation Algorithm (Algorithm 7) or the Heuristic Algorithm (Algorithm 8) proposed above; either call returns the NFV instances used. The used NFV instances are then removed from the sets $H_i$. This operation is repeated until all the user groups are routed. We name the algorithm HAG-TAA if it uses TAA for constructing the multicast topology and HAG-HA if it uses HA.
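The clustering step can be sketched with the standard farthest-first selection of centres. This is an illustrative sketch that assumes a precomputed distance table `lam[a][b]` and $\gamma \le |D|$; the name `cluster_users` is hypothetical, and ties are broken by taking the first candidate rather than a random one.

```python
import random


def cluster_users(D, gamma, lam, seed=None):
    """Split the users D into gamma groups around farthest-first centres.

    lam[a][b] is the shortest distance between users a and b.
    Returns a dict mapping each centre to the list of users in its group.
    """
    rng = random.Random(seed)
    users = list(D)
    centres = [rng.choice(users)]
    while len(centres) < gamma:
        # Next centre: the user whose nearest existing centre is farthest away.
        far = max(
            (u for u in users if u not in centres),
            key=lambda u: min(lam[k][u] for k in centres),
        )
        centres.append(far)
    groups = {k: [k] for k in centres}
    for u in users:
        if u in groups:
            continue
        nearest = min(centres, key=lambda k: lam[k][u])
        groups[nearest].append(u)
    return groups
```

For example, with four users located at positions 0, 1, 10 and 11 on a line and $\gamma = 2$, the two groups are $\{0, 1\}$ and $\{10, 11\}$ regardless of which user is picked first.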

Next we analyze the complexity of CA and HAG. The loop of CA that selects the centres calculates the distances between the end users, and this calculation is repeated $\gamma$ times; therefore the complexity of CA is $O(\gamma |D|^2 (|V| + |E| \log |E|))$. Moreover, the for loop of HAG (lines 2-7) calls TAA or HA $\gamma$ times; therefore the total complexity of HAG-TAA is $O(\gamma (|D|^2 + |D| \prod_{i=1}^{|N|} |H_i|)(|V| + |E| \log |E|))$ and the complexity of HAG-HA is $O(\gamma (|D|^2 + |D| \sum_{i=1}^{|N|} |H_i|)(|V| + |E| \log |E|))$.

5.3 SMRP with Time-variant Resource Cost

In the previous section, we assumed that the link cost $w_e$ and the node cost $w^i_v$ are fixed. However, this assumption is not reasonable in a real scenario. For example, the bandwidth demand and the hardware resource demand of the multicast service may change over time, which causes the link cost and node cost to fluctuate. Instead of creating a static algorithm which gives a one-shot solution for SMRP, we are more interested in developing an online algorithm which adjusts the multicast topology based on the time-variant cost of links and nodes.

Denote by $w^{it}_v$ the cost of implementing the NFV instance of $n_i$ on node $v$ at time $t$, and by $w^t_e$ the cost of link $e$ at time $t$. Denote by $\theta$ the period of reconfiguration; during each period, the multicast topology remains fixed. The problem becomes how to build and dynamically adjust the multicast topology based on the time-variant cost of links $w^t_e$ and nodes $w^{it}_v$ such that the average total cost of the multicast topology is minimized.

5.3.1 Introduction to Markov Approximation

To solve the above problem, we leverage the idea of Markov approximation, which is described in [27]. Markov approximation is a general technique for solving combinatorial optimization problems: it relaxes the objective function by adding an entropy term and solves the problem by achieving efficient time-sharing among all the feasible solutions. More specifically, SMRP can be formulated as a combinatorial optimization problem as follows:

$$\underset{p_f \ge 0}{\text{minimize}} \quad \sum_{f \in F} p_f \phi_f \qquad \text{s.t.} \quad \sum_{f \in F} p_f = 1$$

Here $f \in F$ is a feasible solution of SMRP, and $F$ is the set of all the feasible solutions, that is, all the possible configurations of the multicast topology. $\phi_f$ is the total cost of configuration $f$, which is defined by (5.1), and $p_f$ is the probability that the multicast topology is in configuration $f$. The solution of the above problem is $p_f = 1$ if $f = \arg\min_{f \in F} \phi_f$ and $p_f = 0$ otherwise. The objective function can be relaxed by adding an entropy term:

$$\underset{p_f \ge 0}{\text{minimize}} \quad \sum_{f \in F} p_f \phi_f + \frac{1}{\beta} \sum_{f \in F} p_f \log(p_f)$$

Solving the relaxed version of this problem gives the following solution:

$$p^*_f = \frac{\exp(-\beta \phi_f)}{\sum_{f' \in F} \exp(-\beta \phi_{f'})} \quad \forall f \in F \tag{5.9}$$

where $\beta$ is a large constant. As $\beta$ approaches infinity, the entropy term approaches 0 and the solution approaches the optimal solution of the original problem. Given the probability of each state described above, we want to design a discrete Markov chain whose steady-state probability is the same as the probability given in the solution. Next we describe the design of the Markov chain in detail.
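The effect of $\beta$ in (5.9) can be seen numerically with a few lines of Python. The sketch below is illustrative only; `gibbs` is a hypothetical helper, and the costs are shifted by their minimum before exponentiation purely for numerical stability, which does not change the resulting probabilities.

```python
import math


def gibbs(costs, beta):
    """Probabilities p*_f of (5.9) for a list of configuration costs phi_f."""
    lo = min(costs)
    # exp(-beta * (phi - lo)) avoids underflow; the common factor cancels out.
    weights = [math.exp(-beta * (c - lo)) for c in costs]
    z = sum(weights)
    return [w / z for w in weights]
```

With `beta = 0` every configuration is equally likely, while a large `beta` concentrates almost all the probability on the minimum-cost configuration, which matches the limiting behaviour described above.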

5.3.2 Design of the Discrete Markov Chain

Our goal is to design a discrete Markov chain whose steady-state probability satisfies (5.9). First, we have the following result for the discrete Markov chain in the steady state.

Theorem 5. For any two states $f, f' \in F$ in the discrete Markov chain, if $p_f q_{ff'} = p_{f'} q_{f'f}$ and $\sum_{f \in F} p_f = 1$, where $q_{ff'}$ ($q_{f'f}$) is the state transition probability from state $f$ ($f'$) to state $f'$ ($f$), then the discrete Markov chain is in the steady state with steady-state probability $p_f$ $(\forall f \in F)$.

Proof. Given $p_f q_{ff'} = p_{f'} q_{f'f}$, we sum both sides of the equation over $f' \in F$ and obtain $\sum_{f' \in F} p_f q_{ff'} = \sum_{f' \in F} p_{f'} q_{f'f}$. This indicates that the total probability of leaving $f$ equals the total probability of entering $f$. Together with the fact that the state probabilities sum to 1, this indicates that the Markov chain is in the steady state.

To achieve the steady-state probability given in (5.9) and satisfy the condition of Theorem 5, we define the state transition probability as follows:

$$q_{ff'} \propto \exp(\beta \phi_f)$$

More specifically, we set $q_{ff'} = \alpha \exp(\beta \phi_f)$, where $\alpha$ is a constant that keeps the state transition probability less than or equal to 1. With this choice, $p^*_f q_{ff'} = \alpha / \sum_{g \in F} \exp(-\beta \phi_g) = p^*_{f'} q_{f'f}$, so the condition of Theorem 5 holds for the probabilities in (5.9). For ease of interpretation, we make the following definition:

Definition 2. Given the state $f$, which corresponds to a multicast topology with link set $f_l$ and set $f_{nfv}$ of nodes used to deploy NFVs, the state $f'$ is adjacent to state $f$ iff $f' = \text{Routing}(G \setminus e, H)$ for some $e \in f_l$ or $f' = \text{Routing}(G, H \setminus v)$ for some $v \in f_{nfv}$.

Here $\text{Routing}(G, H)$ is a function which takes $G$ and $H$ as input and returns the multicast topology based on TAA. We then design the Markov chain such that there is a link between two states (i.e. $q_{ff'} > 0$) only if the two states are adjacent. Denote by $A(f)$ the set of adjacent states of $f$. The state transition probability is $q_{ff'} = \alpha \exp(\beta \phi_f)$ $(\forall f' \in A(f))$ and $q_{ff} = 1 - |A(f)| \alpha \exp(\beta \phi_f)$.
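As a sanity check on this construction, the transition matrix can be built for a toy state space and tested against the condition of Theorem 5. The sketch below assumes a symmetric adjacency relation given explicitly as lists of state indices (in the thesis, adjacency is induced by the Routing function); `transition_matrix` is a hypothetical helper.

```python
import math


def transition_matrix(costs, adjacent, alpha, beta):
    """Build Q with q_{ff'} = alpha * exp(beta * phi_f) for f' in A(f)
    and q_{ff} = 1 - |A(f)| * alpha * exp(beta * phi_f).

    costs[f]    : phi_f
    adjacent[f] : list of states adjacent to f (assumed symmetric here)
    """
    n = len(costs)
    Q = [[0.0] * n for _ in range(n)]
    for f in range(n):
        rate = alpha * math.exp(beta * costs[f])
        stay = 1.0 - len(adjacent[f]) * rate
        if stay < 0.0:
            raise ValueError("alpha is too large for a valid probability")
        for g in adjacent[f]:
            Q[f][g] = rate
        Q[f][f] = stay
    return Q
```

On a three-state chain with costs 1, 2, 3, the products $p^*_f q_{ff'}$ and $p^*_{f'} q_{f'f}$ coincide for every adjacent pair when $p^*$ is taken from (5.9), so the chain's stationary distribution is the one given by (5.9).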

5.3.3 Online Algorithm

Next we state our design of the online algorithm (OA) to deal with the time-variant resource cost, which is described in Algorithm 11. During each period $\theta$, the OA calculates the average cost of the links and nodes, then calculates the total cost of the current multicast topology and the state transition probability to the other states by using $q_{ff'} = \alpha \exp(\beta \phi_f)$. The multicast topology is then adjusted accordingly to achieve better performance.

Algorithm 11 Online Algorithm (OA)

1: Given the network topology $G(V, E)$ and the costs $w^{i0}_v$ and $w^0_e$ at $t = 0$, calculate the multicast topology $f = \text{Routing}(G, H)$ by using the TAA, and the total cost $\phi_f$
2: Let $k = 1$.
3: while $t > 0$ do
4:   if $t = k\theta$ then
5:     Update the resource costs $w^{i,k\theta}_v = \frac{1}{\theta} \int_{(k-1)\theta}^{k\theta} w^{it}_v \, dt$ and $w^{k\theta}_e = \frac{1}{\theta} \int_{(k-1)\theta}^{k\theta} w^t_e \, dt$
6:     Calculate the total cost of $f$ with the updated resource costs, calculate $p_{ff} = 1 - |A(f)| \alpha \exp(\beta \phi_f)$, and generate a uniform random variable $j \in [0, 1]$.
7:     if $0 \le j \le 1 - p_{ff}$ then
8:       Calculate an adjacent state $f' = \text{Routing}(G \setminus e, H)$ or $f' = \text{Routing}(G, H \setminus v)$ with the updated resource costs by randomly picking $e$ from $f_l$ or $v$ from $f_{nfv}$, switch the network topology according to $f'$, and let $f = f'$.
9:     end if
10:    $k = k + 1$
11:   end if
12: end while
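One decision step of the online algorithm can be sketched as follows. The sketch abstracts the Routing function behind two callables that are assumptions of this example: `cost_of(f)` returns $\phi_f$ under the updated costs and `neighbours_of(f)` returns the list of adjacent states $A(f)$. The name `oa_step` is hypothetical.

```python
import math
import random


def oa_step(current, cost_of, neighbours_of, alpha, beta, rng=random):
    """Decide whether to keep the current topology or move to an adjacent one.

    Leaves the current state with probability |A(f)| * alpha * exp(beta * phi_f)
    and, when leaving, picks one adjacent state uniformly at random.
    """
    phi = cost_of(current)
    adj = neighbours_of(current)
    p_stay = 1.0 - len(adj) * alpha * math.exp(beta * phi)
    if not 0.0 <= p_stay <= 1.0:
        raise ValueError("alpha is too large for a valid probability")
    if adj and rng.random() < 1.0 - p_stay:
        return rng.choice(adj)
    return current
```

Because the probability of leaving grows with $\exp(\beta \phi_f)$, an expensive topology is abandoned quickly while a cheap one tends to be kept, which is how the chain spends most of its time in low-cost configurations.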

5.3.4 Reliability of Solution

The online algorithm gives a solution which periodically changes its topology based on the

current cost of resources, and multiple NFV instances are used to guarantee the reliability of

the multicast topology.

5.3.5 Cost of Reconfiguration

It is costly to reconfigure the multicast topology. In every period $\theta$, the multicast topology may be adjusted based on the current cost of resources. However, the frequency of reconfiguration can be decreased by adjusting some parameters. We can either:

1. Decrease $\alpha$ such that the probability $p_{ff}$ of staying in the same state increases.

2. Increase the period of reconfiguration $\theta$.

Figure 5.3: Timeline of Reconfiguration

5.3.6 The Effect of Time Error during State Transition

In the above sections, we described an online algorithm to deal with the time-variant cost of resources. One assumption we made is that the multicast topology can be reconfigured by the OA exactly at each multiple of $\theta$. However, in a real network environment, it may be difficult to make the reconfiguration process finish exactly on time, so a time error $\Delta\theta$ (e.g. caused by the time consumed by the reconfiguration process) will exist. For a network with heavy traffic load, the reconfiguration process may take some time and make the duration of each state inaccurate. This is illustrated by the diagram in Figure 5.3. Let $\Delta\theta_{\max}$ denote the maximum magnitude of the time error; the time spent in each state may then range from $\theta - \Delta\theta_{\max}$ to $\theta + \Delta\theta_{\max}$. This time error affects the overall performance of the system, since the fraction of time ($p_f$) spent in each state also changes. Next we give a quantitative analysis of this issue.

Define $M = \{m_1, m_2, \ldots, m_{|M|}\}$ as the set of multicast topologies that are used, in chronological order. For example, $m_2$ is used after $m_1$ and is followed by $m_3$, etc. Let $\theta_{total}$ denote the total time that the multicast topology exists, $\theta_m$ the time that multicast topology $m$ exists ideally, and $\Delta\theta_m$ the time error during the reconfiguration process for multicast topology $m$. Therefore $\theta_m + \Delta\theta_m$ denotes the time that $m$ exists in reality, and we have $\theta_{total} = \sum_{m \in M} \theta_m$. We are interested in finding the maximum change in the average cost caused by the time error $\Delta\theta$. More specifically, we want to maximize the total cost, which is expressed by the following objective:

$$\underset{\Delta\theta}{\text{maximize}} \quad \sum_{m \in M} \frac{\theta_m + \Delta\theta_m}{\theta_{total}} \phi_m \tag{5.10}$$

The sum of all the time errors must equal 0, since the total time $\theta_{total}$ is fixed:

$$\sum_{m \in M} \Delta\theta_m = 0 \tag{5.11}$$

Finally, each time error is bounded:

$$-\Delta\theta_{\max} \le \Delta\theta_m \le \Delta\theta_{\max} \tag{5.12}$$

Rank the total costs $\phi_m$ of the multicast topologies from lowest to highest, which is the order assumed by the index ranges in the solution below, and call this ordered set of $\phi_m$ $\Phi$. Theorem 6 gives the solution of the above problem.

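Theorem 6. Denote by $W$ the ideal total cost generated by OA. Then the maximum cost caused by the time error $\Delta\theta$ is

$$W + \frac{\Delta\theta_{\max}}{\theta_{total}} \Big( \sum_{m=\lceil |\Phi|/2 \rceil}^{|\Phi|} \phi_m - \sum_{m=1}^{\lfloor |\Phi|/2 \rfloor} \phi_m \Big)$$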

Proof. The problem (5.10)− (5.12) is a linear programming problem and we solve the problem

by using Karush Kuhn Tucker (KKT) conditions. And we have the following equations:

− φmθtotal

+ a+ bm − cm = 0 ∀m ∈M∑m∈M

M θm = 0

bm, cm ≥ 0 ∀m ∈M

bm(M θm− M θmax) = 0 ∀m ∈M

cm(M θm+ M θmax) = 0 ∀m ∈M

Contents 68

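If $|\Phi|$ is even, the above equations have the following solution:

$$\begin{aligned}
\Delta\theta_m &= \Delta\theta_{\max} && \text{if } \tfrac{|\Phi|}{2} + 1 \le m \le |\Phi| \\
\Delta\theta_m &= -\Delta\theta_{\max} && \text{if } 1 \le m \le \tfrac{|\Phi|}{2} \\
b_m &= \frac{\phi_m}{\theta_{\mathrm{total}}} - \frac{\phi_{|\Phi|/2} + \phi_{|\Phi|/2+1}}{2\theta_{\mathrm{total}}}, \quad c_m = 0 && \text{if } \tfrac{|\Phi|}{2} + 1 \le m \le |\Phi| \\
c_m &= \frac{\phi_{|\Phi|/2} + \phi_{|\Phi|/2+1}}{2\theta_{\mathrm{total}}} - \frac{\phi_m}{\theta_{\mathrm{total}}}, \quad b_m = 0 && \text{if } 1 \le m \le \tfrac{|\Phi|}{2} \\
a &= \frac{1}{2\theta_{\mathrm{total}}} \left( \phi_{|\Phi|/2} + \phi_{|\Phi|/2+1} \right)
\end{aligned}$$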

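If $|\Phi|$ is odd, the above equations have the following solution:

$$\begin{aligned}
\Delta\theta_m &= \Delta\theta_{\max} && \text{if } \tfrac{|\Phi|+1}{2} + 1 \le m \le |\Phi| \\
\Delta\theta_m &= -\Delta\theta_{\max} && \text{if } 1 \le m \le \tfrac{|\Phi|-1}{2} \\
\Delta\theta_m &= 0 && \text{if } m = \tfrac{|\Phi|+1}{2} \\
b_m &= \frac{\phi_m}{\theta_{\mathrm{total}}} - \frac{\phi_{(|\Phi|+1)/2}}{\theta_{\mathrm{total}}}, \quad c_m = 0 && \text{if } \tfrac{|\Phi|+1}{2} \le m \le |\Phi| \\
c_m &= \frac{\phi_{(|\Phi|+1)/2}}{\theta_{\mathrm{total}}} - \frac{\phi_m}{\theta_{\mathrm{total}}}, \quad b_m = 0 && \text{if } 1 \le m \le \tfrac{|\Phi|-1}{2} \\
a &= \frac{\phi_{(|\Phi|+1)/2}}{\theta_{\mathrm{total}}}
\end{aligned}$$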

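Substituting the above solutions into the objective function, the maximum cost equals

$$W + \frac{\Delta\theta_{\max}}{\theta_{\mathrm{total}}} \left( \sum_{m=\lceil |\Phi|/2 \rceil}^{|\Phi|} \phi_m - \sum_{m=1}^{\lfloor |\Phi|/2 \rfloor} \phi_m \right).$$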
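The worst-case assignment of time errors used in the proof can be checked numerically on small instances. The following is a minimal sketch, not part of the original evaluation: the function names `worst_case_cost` and `brute_force` and the example numbers are assumptions made for illustration. It follows the case analysis in the proof (the most expensive topologies gain $\Delta\theta_{\max}$, the cheapest lose $\Delta\theta_{\max}$, and a middle topology in the odd case keeps its ideal time) and compares the result with an exhaustive search over the extreme points of (5.11)-(5.12).

```python
from itertools import product


def worst_case_cost(phi, theta, d_max):
    """Return (ideal cost W, worst-case cost) for problem (5.10)-(5.12).

    phi[m]   : total cost of topology m
    theta[m] : ideal time that topology m is in use
    d_max    : bound on each time error (Delta theta_max)
    """
    theta_total = sum(theta)
    ideal = sum(t * p for t, p in zip(theta, phi)) / theta_total

    # Sort costs in ascending order. The cheapest half loses d_max of time,
    # the most expensive half gains d_max, and a middle element (odd case)
    # keeps its ideal time, as in the case analysis of the proof.
    ordered = sorted(phi)
    half = len(ordered) // 2
    shift = sum(ordered[len(ordered) - half:]) - sum(ordered[:half])
    return ideal, ideal + d_max / theta_total * shift


def brute_force(phi, theta, d_max):
    """Exhaustive check over extreme points of the feasible set.

    At a vertex of {sum(err) = 0, |err_m| <= d_max} every error is one of
    -d_max, 0, +d_max, so it is enough to enumerate those combinations.
    """
    theta_total = sum(theta)
    best = float("-inf")
    for errs in product((-d_max, 0.0, d_max), repeat=len(phi)):
        if abs(sum(errs)) > 1e-12:
            continue  # violates constraint (5.11)
        cost = sum((t + e) * p for t, e, p in zip(theta, errs, phi)) / theta_total
        best = max(best, cost)
    return best
```

For instance, with hypothetical costs `phi = [4.0, 1.0, 3.0]`, ideal times `theta = [2.0, 3.0, 5.0]` and `d_max = 0.5`, both functions give the same worst-case cost. The exhaustive search is exponential in the number of topologies and is only meant as a sanity check.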


To demonstrate the performance of the algorithms proposed above, we evaluate them over three network models: Abilene (11 nodes, 13 links), GEANT (23 nodes, 74 links) and the real WAN model (100 nodes, 127 links) generated by the GT-ITM software tool [23]. We set $|H_i| = 2$ $(1 \le i \le |N|)$ in Abilene, $|H_i| = 5$ $(1 \le i \le |N|)$ in GEANT, and $|H_i| = 2$ $(1 \le i \le |N|)$ for GT-ITM. We compare the performance of the algorithms by using a metric called the Cost Deduction Ratio (CDR). Let the total cost generated by TAA be $\pi_2$ and the total cost generated by the benchmark algorithm HA be $\pi_1$; then CDR is defined in (5.13). All simulations are run 100 times and the average results are presented.

$$\mathrm{CDR} = (\pi_1 - \pi_2)/\pi_1 \tag{5.13}$$

Algorithm 12 Benchmark Algorithm (BA)

1: for $d \in D$ do
2:     Define $h'_1 = \arg\min_{h_1 \in H_1} (\lambda(s, h_1) + w^1_{h_1})$.
3:     for $i = 2, \ldots, |N|$ do
4:         Define $h'_i = \arg\min_{h_i \in H_i} (\lambda(h'_{i-1}, h_i) + w^i_{h_i})$.
5:     end for
6:     Define $d' = \arg\min_{d \in D} (\lambda(h'_{|N|}, d) + w^{|N|}_{h_{|N|}})$.
7:     Connect $s$ to $h'_1$, $h'_1$ to $h'_2$, ..., $h'_{|N|}$ to $d'$.
8: end for
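The greedy chain selection of Algorithm 12 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the adjacency-dictionary graph representation, the function names `shortest_costs` and `benchmark_route`, and the use of Dijkstra's algorithm to obtain the path cost $\lambda$ are not taken from the thesis, and the last step simply connects the final NFV node to the given end user $d$.

```python
import heapq


def shortest_costs(graph, src):
    """Dijkstra: cheapest path cost from src to every reachable node.

    graph[u][v] is the (non-negative) cost of link (u, v).
    """
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in graph[u].items():
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def benchmark_route(graph, s, candidates, node_cost, d):
    """Greedy service chain for one end user d, in the spirit of Algorithm 12.

    candidates[i]   : nodes that may host the (i+1)-th NFV (the set H_i)
    node_cost[i][h] : cost of deploying that NFV on node h (w^i_h)
    Returns the chosen NFV nodes and the total path plus deployment cost.
    Assumes the graph is connected.
    """
    chain, total, current = [], 0.0, s
    for hosts, costs in zip(candidates, node_cost):
        lam = shortest_costs(graph, current)
        # Pick the host minimizing path cost from the previous hop plus
        # the deployment cost of this NFV.
        best = min(hosts, key=lambda h: lam[h] + costs[h])
        total += lam[best] + costs[best]
        chain.append(best)
        current = best
    total += shortest_costs(graph, current)[d]  # last NFV node to end user
    return chain, total
```

Because each NFV node is chosen using only the previous hop, the sketch never revisits an earlier choice, which is why such a per-user greedy scheme can be expected to cost more than a jointly optimized multicast topology.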

To compare the performance of TAA and HA, we create another Benchmark Algorithm (BA). For each end user, BA finds the nodes on which to deploy NFV, as well as the paths, such that the total cost is smallest; BA is described in Algorithm 12. Similarly, we define CDR′ as in (5.13), with $\pi_1$ the total cost generated by BA and $\pi_2$ the total cost generated by TAA. Tables 5.1-5.3 show the performance of TAA, HA and BA with different numbers of end users over different network topologies. CDR′ ranges from 12% to 48% across the network topologies and numbers of end users, which shows the efficiency of TAA. Moreover, CDR ranges from 3% to 7% across all network topologies and numbers of end users, which demonstrates that the performance of HA and TAA is close.

Table 5.1: Performance of Algorithms

Network   Users   2        3        4        5        6
Abilene   CDR     0.0643   0.0707   0.0495   0.0498   0.0562
          CDR′    0.127    0.188    0.219    0.331    0.404

Table 5.2: Performance of Algorithms

Network   Users   5        8        10       12       15
GEANT     CDR     0.0344   0.0386   0.0413   0.0382   0.0649
          CDR′    0.144    0.160    0.227    0.378    0.448

Table 5.3: Performance of Algorithms

Network   Users   10       20       30       40       50
GTITM     CDR     0.0604   0.0577   0.0523   0.0637   0.0576
          CDR′    0.161    0.198    0.277    0.368    0.479

Figures 5.4-5.6 show the running time of TAA and HA for different numbers of end users over the different network topologies. The running time of HA is much lower than that of TAA. The running time of both algorithms increases with the number of end users, which is reasonable since more calculation is needed as more end users must be scheduled.

[Figure 5.4: Running Time Comparison on Abilene. Running time (logarithmic scale) versus number of end users, for TAA and HA.]

[Figure 5.5: Running Time Comparison on GEANT. Running time (logarithmic scale) versus number of end users.]

[Figure 5.6: Running Time Comparison on GTITM. Running time (logarithmic scale) versus number of end users.]

Next we evaluate the performance of HAG. We evaluate the total cost generated by HAG-TAA and HAG-HA with different numbers of clusters γ, given a fixed number of end users. Tables 5.4-5.5 show the results for the GEANT network with 15 end users and the GTITM network with 50 end users. Here CDR is defined with $\pi_1$ the total cost of HAG-HA and $\pi_2$ the total cost of HAG-TAA. As expected, the results are similar to those in Tables 5.1-5.3. This is because HAG still uses TAA and HA to build the multicast tree for each cluster. Therefore the performance of HAG-TAA and HAG-HA follows a similar tendency.

Table 5.4: Performance of HAG

Network   Groups   2        3        4        5
GEANT     CDR      0.0741   0.0523   0.0412   0.0557

Table 5.5: Performance of HAG

Network   Groups   5        8        10       15
GTITM     CDR      0.0346   0.0478   0.0335   0.0580

Finally, we evaluate the performance of the OA by measuring its average cost. During each round, each link and each node is assigned a cost $w_e$ and $w^i_v$, respectively, drawn from a probability distribution. If the cost of a node or link is less than 0, the cost is set to 0. We evaluate the cost over 1000 rounds. As mentioned above, we can control the parameter α to decrease the number of reconfigurations so that the reconfiguration cost is small. We compare the performance of OA and TAA with different numbers of reconfigurations on GEANT and GTITM by adjusting α. For TAA, the multicast topology is calculated by TAA with the costs $w^{i0}_v$ and $w^0_e$, and it remains fixed over the 1000 rounds. CDR is defined as in (5.13), with $\pi_1$ the average cost of TAA over all the rounds and $\pi_2$ the average cost of OA. Tables 5.6-5.9 show the relation between the number of reconfigurations and CDR over different network topologies. We consider both Gaussian and uniformly distributed costs. For the Gaussian distribution, each link is assigned a cost $w_e$ that is Gaussian distributed with mean 1 and variance 0.1, and each node is assigned a cost $w^i_v$ to deploy the NFV that is Gaussian distributed with mean 10 and variance 1. For the uniform distribution, each link is assigned a uniformly distributed cost with mean 1 and variance 0.1, and each node is assigned a uniformly distributed cost $w^i_v$ with mean 10 and variance 1.

Table 5.6: Performance of OA

Network                 Reconfigurations   50     100    200    400    600
GEANT (Gaussian cost)   CDR                0.13   0.16   0.23   0.29   0.36

Table 5.7: Performance of OA

Network                 Reconfigurations   50     100    200    400    600
GEANT (Uniform cost)    CDR                0.17   0.20   0.27   0.33   0.38

Table 5.8: Performance of OA

Network                 Reconfigurations   50     100    200    400    600
GTITM (Gaussian cost)   CDR                0.18   0.20   0.27   0.30   0.39

Table 5.9: Performance of OA

Network                 Reconfigurations   50     100    200    400    600
GTITM (Uniform cost)    CDR                0.27   0.33   0.36   0.42   0.48

The CDR increases as the number of reconfigurations increases. This is because, under a limited number of rounds (1000 rounds), a larger number of reconfigurations means that the multicast topology is adjusted more frequently based on the current cost of the links and nodes. However, according to the discussion above, the average cost of the multicast topology does not depend on α as the number of rounds approaches infinity. Moreover, the CDR generated on GTITM is larger than the CDR generated on GEANT, because GTITM is larger than GEANT. Hence the variance of the cost of a multicast topology built on GTITM, which is the sum of more random variables, is larger than the variance of the cost of a multicast topology built on GEANT. Therefore, a larger number of reconfigurations saves more on GTITM than on GEANT.
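The per-round cost generation described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and the `rng` argument are hypothetical, and the thesis does not say how the uniform distribution was parameterized. Here the uniform cost is drawn from an interval centered on the mean with half-width $r = \sqrt{3\sigma^2}$, since a uniform distribution on an interval of half-width $r$ has variance $r^2/3$.

```python
import math
import random


def gaussian_cost(mean, variance, rng):
    """Gaussian cost; negative draws are set to 0, as described in the text."""
    return max(0.0, rng.gauss(mean, math.sqrt(variance)))


def uniform_cost(mean, variance, rng):
    """Uniform cost on [mean - r, mean + r] with r = sqrt(3 * variance).

    The half-width r is chosen so that the distribution has the requested
    variance (an assumption about how the uniform costs were generated).
    """
    r = math.sqrt(3.0 * variance)
    return max(0.0, rng.uniform(mean - r, mean + r))


def draw_round(links, nodes, sampler, rng):
    """Draw one round of costs: links with mean 1 and variance 0.1,
    nodes with mean 10 and variance 1."""
    w_e = {e: sampler(1.0, 0.1, rng) for e in links}
    w_v = {v: sampler(10.0, 1.0, rng) for v in nodes}
    return w_e, w_v
```

A call such as `draw_round(links, nodes, gaussian_cost, random.Random(0))` yields one round of link and node costs. If the link costs are independent, the variance of the cost of a tree with $k$ links is about $0.1k$, which illustrates why a topology spanning more links on a larger network shows a larger cost variance.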

Chapter 6

Conclusions and Future Work

6.1 Conclusions

Many multicast services such as live multimedia distribution and real-time event monitoring require multicast mechanisms that involve network functions (e.g. firewall, video transcoding). Network Function Virtualization (NFV) is a concept that proposes using virtualization to implement network functions on infrastructure building blocks (such as high volume servers and virtual machines), where software provides the functionality of existing purpose-built network equipment. In this work, we present routing algorithms for building an NFV-enabled multicast topology on SDN. We consider different scenarios of the routing problem and propose solutions for each of them. First we study a simple yet important case where a single NFV processing step is involved. This case already applies to many scenarios such as video transcoding, packet filtering and intrusion detection. We propose an algorithm for building an NFV-enabled multicast mechanism on SDN for both dynamic and static scenarios. Then we extend the original problem to consider the joint placement of NFVs and routing in the service chain.

Finally, we present an online algorithm, based on the idea of Markov approximation, for the scenario in which the costs of links and nodes are time-variant. The simulations indicate a substantial saving in total cost from our multicast routing algorithms, and a preliminary version of the multicast framework has been implemented on the testbed.

6.2 Future Work

For future work, there are three main areas we can continue to improve:

1. One extension is to include a delay constraint in the formulation of the problem. Ensuring that the assigned requests meet their deadlines is essential for real-time multicast services such as live events and gaming. More specifically, the total delay consists of the transmission delay of the links and the processing delay of the nodes. Furthermore, previous work has shown that NFV may cause abnormal delay variations and throughput instability [28], which makes the problem even more complicated.

2. Another direction in which our work can be extended is the efficient use of heterogeneous resources by NFV instances. Infrastructure components such as servers, switches and virtual machines execute software to provide the functionality of network elements and appliances, such as NAT, proxy servers, video transcoders and deep packet inspection (DPI). However, processing packets for these network functions requires multiple hardware resources, such as CPU, memory and disk storage. Furthermore, different NFV instances may consume different amounts of hardware resources. How to efficiently deploy and use these limited hardware resources to obtain the maximum profit while maintaining the quality of service is another important problem.

3. Finally, robustness of the NFV service is always an important research topic. If an NFV instance does not work properly or a link goes down abruptly, how to quickly reconfigure the multicast topology to minimize the service interruption while keeping the total cost of the multicast topology small is another promising research area.


Bibliography

[1] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, J. Turner. "OpenFlow: Enabling Innovation in Campus Networks", ACM SIGCOMM Computer Communication Review, April 2008.

[2] B. Han, et al. "Network function virtualization: Challenges and opportunities for innovations", IEEE Communications Magazine, Volume 53, Issue 2.

[3] L. Kou, G. Markowsky, L. Berman. "A fast algorithm for Steiner trees", Acta Informatica, Volume 15, Issue 2, pp. 141-145, 1981.

[4] S. Hougardy, H. Prömel. "A 1.598 Approximation Algorithm for the Steiner Problem in Graphs", in the Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 1999.

[5] G. Rouskas, I. Baldine. "Multicast Routing with End-to-End Delay and Delay Variation Constraints", IEEE Journal on Selected Areas in Communications, Vol. 15, No. 3, 1997.

[6] R. Mukherjee, J.W. Atwood. "Rendezvous point relocation in protocol independent multicast - sparse mode", 10th International Conference on Telecommunications (ICT), 2003.

[7] J.-F. Macq, L.A. Wolsey, B. Macq. "A rendezvous point selection algorithm for videoconferencing applications", in the Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), 2002.

[8] S. Bemby, H. Lu, K. Zadeh, H. Bannazadeh, A. Leon-Garcia. "ViNO: SDN Overlay to Allow Seamless Migration Across Heterogeneous Infrastructure", IM 2015.

[9] A. Iyer, P. Kumar, V. Mann. "Avalanche: Data center Multicast using software defined networking", in the Proceedings of IEEE COMSNETS, 2014.


[10] J. Wen-Kang, L. Chun Wang. "A Unified Unicast and Multicast Routing and Forwarding Algorithm for Software-Defined Datacenter Networks", IEEE Journal on Selected Areas in Communications, Vol. 31, No. 12, December 2013.

[11] X. Li, M. Freedman. "Scaling IP multicast on datacenter topologies", in the Proceedings of ACM CoNEXT, 2013.

[12] S. Shen, L. Huang, D. Yang, W. Chen. "Reliable Multicast Routing for Software-Defined Networks", in the Proceedings of IEEE INFOCOM, 2015.

[13] J.E. Klinker. "Multicast tree construction in directed networks", in the Proceedings of IEEE MILCOM, 1996.

[14] M. Castro, A.-M. Kermarrec, A.I.T. Rowstron. "Scribe: a large-scale and decentralized application-level multicast infrastructure", IEEE Journal on Selected Areas in Communications, Volume 20, Issue 8.

[15] W. Ralph, Z. Martina. "Multicast communications: protocol and applications", Morgan Kaufmann Publishers Inc., May 2000.

[16] K. Stachowiak, P. Zwierzykowski. "Rendezvous point based approach to the multi-constrained multicast routing problem", International Journal of Electronics and Communications, June 2014.

[17] A. Jacobson, et al. "OpenNF: Enabling Innovation in Network Function Control", in the Proceedings of ACM SIGCOMM, 2014.

[18] J. Martins, et al. "ClickOS and the Art of Network Function Virtualization", in the Proceedings of USENIX NSDI, 2014.

[19] Y. Zhang, et al. "StEERING: A software-defined networking for inline service chaining", in the Proceedings of IEEE ICNP, 2013.

[20] M. Mangili, et al. "Stochastic Planning for Content Delivery: Unveiling the Benefits of Network Functions Virtualization", IEEE ICNP, 2014.


[21] M. Bouet, et al. "Cost-based placement of vDPI functions in NFV infrastructures", in the Proceedings of IEEE NetSoft, 2015.

[22] M.C. Luizelli, et al. "Piecing together the NFV provisioning puzzle: Efficient placement and chaining of virtual network functions", in the Proceedings of IEEE IM, 2015.

[23] "Modeling Topology of Large Internetworks", http://www.cc.gatech.edu/projects/gtitm/

[24] J. Kang, H. Bannazadeh, A. Leon-Garcia. "SAVI testbed: Control and management of converged virtual ICT resources", IM 2013.

[25] K. Sayood. "Introduction to data compression", Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005. ISBN: 012620862X.

[26] D. Williamson, B. Shmoys. "The Design of Approximation Algorithms", Cambridge University Press.

[27] M. Chen, et al. "Markov Approximation for Combinatorial Network Optimization", in the Proceedings of IEEE INFOCOM, 2010.

[28] G. Wang, T. S. E. Ng. "The Impact of Virtualization on Network Performance of Amazon EC2 Data Center", in the Proceedings of IEEE INFOCOM, 2010.

[29] S. Zhang, et al. "Network Function Virtualization Enabled Multicast Routing on SDN", in the Proceedings of IEEE ICC, 2015.

[30] S. Zhang, et al. "Routing algorithms for network function virtualization enabled multicast topology on SDN", IEEE Transactions on Network and Service Management.

[31] S. Zhang, et al. "Joint NFV Placement and Routing for Multicast Service on SDN", in the Proceedings of IEEE NOMS, 2016.
