UNIVERSITY OF TORONTO
Multicast Routing for Virtual Network
Functions on SDN
by
Sai Qian Zhang
A thesis submitted in partial fulfillment for the
degree of Master of Applied Science
Department of Electrical and Computer Engineering
University of Toronto
© Copyright by Sai Qian Zhang 2016
University of Toronto
Abstract
Department of Electrical and Computer Engineering
Master of Applied Science
by Sai Qian Zhang
2016
Many multicast services such as live multimedia distribution and real-time event monitoring
require constructing a multicast mechanism to chain network functions (e.g. firewall, video
transcoding). Network Function Virtualization (NFV) is a concept that proposes using virtual-
ization to implement network functions on infrastructure building blocks (such as high volume
servers, virtual machines), where software replaces the functionality of existing purpose-built
network equipment. We present an approach for building the multicast mechanism whereby
multicast flows are processed by NFV before reaching end users. We propose a routing algorithm
and a method for building an appropriate multicast topology. First, we propose an
approximation algorithm that gives an offline solution to the problem. We then extend the
problem to consider the effect of NFV on bandwidth consumption. Finally, we consider
the online version of this problem and present an online algorithm to adjust the multicast
topology based on the dynamic cost of resources.
Acknowledgements
I would like to first express my most sincere appreciation and gratitude to my supervisor,
Professor Alberto Leon-Garcia for his guidance, support and understanding during the course
of my thesis work. I am grateful for the amount of trust and inspiration from him that led my
learning, exploration, and completion of my thesis.
I would like to thank the members of my committee: Professor Raviraj Adve, Professor Elvino
Sousa, and Professor Paul Chow for their evaluation of my work and valuable comments.
I am especially thankful to Dr. Qi Zhang for his useful comments, remarks and
engagement throughout the learning process of my master's study, and I am thankful to the SAVI
Testbed architect, Hadi Bannazadeh, and Dr. Ali Tizghadam for their support, discussion and
help throughout my master's study.
I would also like to thank the other colleagues in my group, Byungchul Park, Thomas Lin,
Spandan Bemby, Pouya Yasrebi, Lilin Zhang, and Houman Rastegarfar, for their collaboration and
feedback.
I am also thankful for the assistance and administrative support from staff member Vladimirio
Cirillo.
Last but not least, I would like to thank my mum and dad for the unconditional love and
support throughout all my studies. My thesis would not have been possible without them.
Contents
Abstract
Acknowledgements
1 Introduction
  1.1 Virtualization in Cloud Environment
  1.2 Software Defined Networking
  1.3 Multicast Communication
  1.4 Network Function Virtualization
  1.5 Thesis Structure
2 Related Works
  2.1 Multicast Mechanism in Software Defined Networking
  2.2 Traditional Multicast Routing Algorithm
  2.3 Network Function Virtualization
3 Single Layer NFV-enabled Multicasting Routing Problem
  3.1 Problem Statement
  3.2 Complexity of NEMP
  3.3 Algorithm for NEMP
    3.3.1 An Approximation Algorithm
    3.3.2 A Solution Algorithm based on Branch and Bound
    3.3.3 Dynamic heuristics
  3.4 Numerical Results
    3.4.1 Cost Analysis
    3.4.2 Running Time Analysis
  3.5 Implementation
    3.5.1 Multicast Mechanism
    3.5.2 Example
    3.5.3 Evaluation
4 Single Layer Multicasting Routing Problem with variable bandwidth
  4.1 Problem Statement
    4.1.1 Algorithm for RMF
    4.1.2 Algorithm for IMF
  4.2 Dynamic heuristics
  4.3 Scheduling of Multiple Multicast Sessions
  4.4 Numerical Results
    4.4.1 Cost Analysis
    4.4.2 Running Time Analysis
5 Multiple Layer NFV-enabled Multicasting Routing Problem
  5.1 Problem Statement
    5.1.1 Two-Approximation Algorithm
    5.1.2 Heuristic Algorithm based on TAA
  5.2 Build the Multicast Topology with given number of NFV Instances
    5.2.1 Problem Statement
    5.2.2 Heuristic Algorithm
  5.3 SMRP with Time-variant Resource Cost
    5.3.1 Introduction to Markov Approximation
    5.3.2 Design of the Discrete Markov Chain
    5.3.3 Online Algorithm
    5.3.4 Reliability of Solution
    5.3.5 Cost of Reconfiguration
    5.3.6 The Effect of Time Error during State Transition
  5.4 Numerical Results
6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work
Bibliography
Chapter 1
Introduction
1.1 Virtualization in Cloud Environment
Cloud computing allows clients to use heterogeneous resources that are managed by third
parties over the Internet. Suppose you want to store 5 GB of photos but your laptop has only
1 GB of free space; you can upload the photos online instead. When you store your photos
online rather than on your home computer, you are using the storage resources offered by the
cloud. As another example, if you want to run a software program that consumes more CPU and
RAM than your laptop can provide, you can upload the task to the cloud; the cloud runs it and
returns the outputs or results to your laptop, so the program does not consume your laptop's
resources.
Virtualization is a concept that divides hardware infrastructure to create various dedicated
resources. It is the basic concept that powers cloud computing. By deploying virtualization,
cloud platforms can host several independent applications on a shared hardware resource pool,
with the capability to allocate computing power to applications on demand. The computing
power is allocated in the form of virtual machines (VMs), which run on physical machines.
Figure 1.1 shows such an example: a physical machine in a datacenter has 8 CPUs and 64 GB of
RAM, and the hardware resources are shared by three VMs, each of which runs an application
for a tenant.
Figure 1.1: Virtual Machines Deployment
1.2 Software Defined Networking
In a cloud environment, a central controller is usually used to decide the settings of VMs and
to deploy VMs on the hardware resources. Moreover, centralized control of the networks across
VMs is deployed in the cloud by leveraging the idea of software-defined networking (SDN). SDN is a
network paradigm that separates the data plane from the control plane. A logically-centralized
controller has the capability to enforce the network management policies and configure each
SDN switch. SDN allows the control plane to communicate with the network elements in the
data plane. It provides an open protocol to program the flow tables in different switches and
routers [1]. SDN converts the distributed control problem into a centralized control problem,
so that each router/switch only needs to forward traffic according to the rules dictated
by the control plane and does not need to make any routing decisions itself.
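The division of labour described above can be sketched in a few lines of Python. This is only an illustrative model, not the OpenFlow protocol or any real controller API: the class names `Switch` and `Controller`, the method `install_rule`, and the addresses and port numbers are all hypothetical, and a table miss simply drops the packet here.

```python
class Switch:
    """Data-plane element: it only looks up rules and never computes routes."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}  # destination address -> list of output ports

    def forward(self, dst):
        # Return the output ports for a packet; an empty list means "no rule, drop".
        return self.flow_table.get(dst, [])


class Controller:
    """Logically centralized control plane that programs every switch."""

    def __init__(self, switches):
        self.switches = {sw.name: sw for sw in switches}

    def install_rule(self, switch_name, dst, out_ports):
        # The routing decision is made here and pushed down to the switch.
        self.switches[switch_name].flow_table[dst] = list(out_ports)


s1, s2 = Switch("s1"), Switch("s2")
ctrl = Controller([s1, s2])
ctrl.install_rule("s1", "10.0.0.2", [2])
ctrl.install_rule("s2", "10.0.0.2", [1, 3])  # two ports: the packet is replicated
```

In this toy model the switches hold no routing logic at all; changing a path only requires the controller to overwrite entries in the flow tables.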
1.3 Multicast Communication
Multicast is a fundamental communication style in which packets are sent to multiple des-
tinations simultaneously. In a multicast session, packets are replicated in each router and
forwarded to multiple output ports based on the multicast topology. In comparison to unicast
communications, where a single path is set up for each source-destination pair, multicast
communication can save a significant amount of bandwidth. Many web-based applications, for
example, multimedia distribution, video conferencing, software updates, and IPTV, rely on
multicast communication to function correctly. For this reason, multicast communication is
becoming increasingly popular.
Figure 1.2: Example of service chain
1.4 Network Function Virtualization
While the advent of SDN provides more possibility for developing new multicast mechanisms,
many multicast services in the cloud nowadays require the involvement of middleboxes. Net-
work appliances such as network address translation (NAT), intrusion detection system (IDS),
video transcoders, and deep packet inspection (DPI) are becoming essential in modern network
services to achieve the desired network behaviour. For example, in Figure 1.2, the multicast
flow is sent from the server to three end users. Before reaching the end users, the packet
flow passes through a service chain consisting of three middleboxes: DPI, NAT and transcoder.
The packets are first inspected by the DPI, then have their IP addresses translated by the
NAT, and are finally transcoded before reaching the end users. Usually the middleboxes are
implemented in hardware. However, the hardware implementation raises deployment and
maintenance problems due to the proprietary nature of the network appliances [2]. For example,
integrating a new middlebox into the network and managing it is cumbersome because the new
hardware may be incompatible with the existing equipment. Moreover, the cost of providing
space and energy for the hardware is high. Network Function Virtualization (NFV) has been
proposed to decouple the network functions from proprietary hardware and run them as software
instances on VMs, which alleviates the above problems in the cloud. By utilizing NFV, we can
allocate network resources
to provide network functionalities, while optimizing network topology/configuration, and en-
suring higher reliability by including appropriate mechanisms. For example, consider the case
of video multicasting. The video streams may require transcoding before reaching the end
user, so transcoders must be included into this multicast mechanism. More generally, the mul-
ticast mechanism must include both the placement of NFV nodes and multicast path routing.
In this work, we study such an algorithm to jointly determine the placement of NFV nodes
and to construct a multicast topology to connect the source with every end user through NFV
nodes.
Several basic multicast routing techniques have been developed for construction of multicast
trees. Finding the optimal tree with the minimum cost is equivalent to finding the Steiner
Tree, which has been proven to be NP-hard. However, a number of heuristic algorithms have
been developed. In [3], an approximation algorithm with factor 2 has been developed. This
approximation factor was later improved to 1.598 in [4]. In [5], a heuristic algorithm to build a
Steiner tree with delay bound was proposed. A shared tree algorithm builds a single tree to be
used by all the multicast sessions. The tree contains a single point called core or rendezvous
point (RP) so that all the packets will be forwarded to RP before reaching destination [6,7].
However, the selection of the optimal RP is also an NP-hard problem.
While multicast routing has been a subject of extensive research, designing multicast services
that involve intermediary processing functions has not been carefully studied. In particular,
computing efficient multicast topologies that involve NFV nodes is still a challenging problem,
because it involves jointly determining the placement of NFV nodes as well as constructing
a multicast topology that connects the source and destinations through the NFV nodes. In
response to the above need, we propose an algorithm for building an NFV-enabled multicast
mechanism on SDN. The controller is responsible for setting up the routing path and selecting
NFV nodes. We also develop algorithms for both dynamic multicast and static multicast.
Moreover, we have implemented the preliminary version of our multicast mechanism on ViNO
[8], an SDN launcher that uses VXLAN tunneling protocol alongside a software switch (Open
vSwitch) to dynamically create network topologies specified by the user.
1.5 Thesis Structure
This thesis is based on my previous work [29,30,31] and is organized as follows.
In Chapter 2, we present related work on multicast routing algorithms, multicast
routing on SDN, and NFV. In Chapter 3, we provide our problem formulation for
the single layer NFV-enabled multicast routing problem and present the approximation
algorithm. In Chapter 4, we consider an extension of the problem with variable bandwidth
consumption. In Chapter 5, we present the online algorithm for the dynamic version of the
problem. We conclude and describe future work in Chapter 6.
Chapter 2
Related Works
2.1 Multicast Mechanism in Software Defined Networking
Multicast has received much attention in modern communication networks because of the
popularity of group communication and its advantage in saving bandwidth. As mentioned
before, traditional IP multicast suffers from security, reliability and scalability problems.
The advent of SDN helps relieve these concerns by separating the control plane from the data
plane. SDN-based multicast frameworks have been widely used and deployed in datacenter
networks. In Avalanche, an SDN-based system for datacenter networks is proposed that
enables multicast in commodity switches [9]. In [10], an SDN-based clean-slate multicast
scheme aiming to improve the security and controllability of the multicast network is developed.
In [11], an SDN-based scalable IP multicast scheme for datacenter networks is proposed and
evaluated. A routing algorithm to construct a multicast tree with high reliability is proposed
in [12]. However, none of these works has combined NFV functionality with the multicast
mechanism.
2.2 Traditional Multicast Routing Algorithm
Several basic multicast routing techniques have been developed for constructing the multicast
trees. Finding the optimal tree with the minimum cost can be formulated as a Steiner tree
problem, which has been proven NP-hard. In the literature, a number of algorithms have
been developed to solve the Steiner tree problem. In [3], an algorithm with approximation
factor of 2 has been proposed. This approximation factor was later improved to 1.598 in [4].
In [5], a heuristic algorithm to build a Steiner tree with delay bound was proposed. And a
heuristic algorithm for finding a directed multicast tree with minimum cost is proposed in [13].
Some other multicast tree building techniques include the shared tree algorithm, which builds
a single tree to be used by all the multicast sessions. The tree contains a single point called
core or rendezvous point (RP) so that all the packets will be forwarded to RP before reaching
the destinations [6] [14]. However, the selection of the optimal RP is also an NP-hard problem
[15]. The authors of [7] propose a heuristic algorithm to choose a rendezvous point while
minimizing the total weighted cost of routing paths. In [16] a novel rendezvous point based
algorithm is proposed to build a multicast tree which satisfies several constraints, including
delay and link utilization constraints, while minimizing the total cost. However, all of
these papers focus on generating the multicast tree with minimum cost; none of them
considers joint NFV placement and multicast tree construction.
2.3 Network Function Virtualization
We now review some prominent research on NFV. NFV is a promising research topic, and
several works address the design and implementation aspects of NFV. The authors of [17]
present a design of a control plane to deal with the race condition during the migration of NFV.
In [18], a design and implementation of a virtualized software network function platform was
presented. In [19] a high-level control platform is described for directing network flow through
a predefined chain of middleboxes with minimum resource consumption. Moreover, previous
work has also been done on efficient placement of NFV. The authors of [20] create a dynamic
NFV placement and routing algorithm for content delivery network. The authors of [21] raise
and solve the virtual DPI placement problem with minimum total cost and delay constraint.
In [22] a joint NFV placement and traffic routing algorithm is designed for the service function
chain, which aims at minimizing the total delay of the traffic with constraint on hardware and
bandwidth resources. However, to the best of our knowledge, no research paper has focused
on joint multicast routing and NFV placement problem yet.
Chapter 3
Single Layer NFV-enabled
Multicasting Routing Problem
3.1 Problem Statement
We model the network as a graph G = (V,E), which consists of a set of nodes V and undirected
links E. We define H ⊆ V as a set of candidate nodes on which NFV may be deployed. Each
multicast session involves a source host s ∈ V and a set of destinations D ⊆ V. In this
context, we define a multicast topology as a subgraph G′ ⊆ G. For each G′, there exists a
mapping function fG′ : D → H′ that maps each destination d ∈ D to an NFV node h ∈ H′,
where H′ ⊆ H denotes the set of NFV nodes used by G′. Therefore, G′ delivers the multicast
content to every d ∈ D using two paths, one from s to an NFV node fG′(d), the other from
fG′(d) to d.
To define the cost of a multicast topology, we assume there is a non-negative fixed cost w(e)
for using each edge e ∈ E. Moreover, we assume running an NFV node on each machine h ∈ H
incurs an activation cost c(h) in terms of resource usage and performance overhead. Therefore,
the total cost of constructing a multicast topology consists of the activation cost of each NFV
node and the sum of the link costs.
Our goal is to find a multicast topology G′ that ensures each multicast flow traverses
the NFV node(s) before reaching the destination, while minimizing the total topology
construction cost. Specifically, we define the total link cost of a subgraph U ⊆ G as
\[ C^{l}(U) = \sum_{e \in U} w(e), \tag{3.1} \]
and define the total NFV activation cost as
\[ C^{h}(U) = \sum_{h \in U} c(h), \tag{3.2} \]
then our goal is to find a multicast topology U that minimizes the sum of the total link cost
and the NFV activation cost:
\[ C(U) = C^{l}(U) + C^{h}(U). \tag{3.3} \]
We call this problem the NFV enabled multicast problem (NEMP).
A concrete example of NEMP is provided in Figure 3.1(a). Consider a real-time video streaming
service that delivers a transcoded version of the original video stream from h1 to h2 and h3.
Two nodes {5, 7} are available to place the transcoder with activation cost c(5) = 3 and
c(7) = 1 respectively. In this case, the goal of NEMP is to find the minimum cost multicast
topology as shown in Figure 3.1(b).
It is noteworthy that unlike the traditional multicast tree problem, in NEMP each link may be
traversed more than once. For instance, in Figure 3.2, assume the only NFV node is located
at node 4 and D = {2, 3, 5}, s = 1. The multicast routes are:
1. From 1 to 3 : (1, 6, 4, 6, 3)
2. From 1 to 5 : (1, 6, 4, 6, 5)
3. From 1 to 2 : (1, 6, 4, 6, 3, 1, 2)
Figure 3.1: Example of NEMP
Figure 3.2: An example of NEMP solution
In this case, the total cost of this multicast topology is C(U) = w(1, 6) + w(6, 4) + w(6, 4) +
w(6, 5) + w(6, 3) + w(3, 1) + w(1, 2) + c(4). The cost of link (6, 4) is counted twice since the
flow traverses the link (6, 4) twice (once in each direction), whereas the activation cost of
the NFV node is counted only once.
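The cost accounting in this example can be checked with a short sketch. It assumes one reading of the example, namely that a link is charged once per distinct direction of traversal across all walks and that each NFV node is charged once. The function name `topology_cost`, the unit link weights and the value c(4) = 2 are hypothetical and chosen only for illustration.

```python
def topology_cost(walks, link_cost, activation_cost, nfv_nodes):
    """Total cost C(U) = C^l(U) + C^h(U) of a set of source-to-destination walks."""
    traversals = set()
    for walk in walks:
        # Consecutive node pairs are directed traversals; shared prefixes collapse in the set.
        traversals.update(zip(walk, walk[1:]))
    total_link = sum(link_cost[frozenset(t)] for t in traversals)
    total_nfv = sum(activation_cost[h] for h in set(nfv_nodes))
    return total_link + total_nfv


# The three routes of Figure 3.2 (source 1, NFV node 4, destinations 3, 5 and 2).
walks = [(1, 6, 4, 6, 3), (1, 6, 4, 6, 5), (1, 6, 4, 6, 3, 1, 2)]
edges = [(1, 6), (6, 4), (6, 5), (6, 3), (3, 1), (1, 2)]
link_cost = {frozenset(e): 1 for e in edges}  # hypothetical unit weights w(e) = 1
activation_cost = {4: 2}                      # hypothetical c(4) = 2
cost = topology_cost(walks, link_cost, activation_cost, nfv_nodes=[4])
print(cost)  # 7 link traversals + c(4) = 9
```

With unit weights the seven link terms of the expression above each contribute 1, and link (6, 4) accounts for two of them because it is used in both directions.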
3.2 Complexity of NEMP
We first analyze the complexity of the problem:
Theorem 1. NEMP is NP-hard.
Proof. We show that the Steiner tree problem can be reduced to NEMP. The Steiner
tree problem aims at finding a minimum cost multicast tree that connects the source to each
of the destinations. Given an instance of the Steiner tree problem, we can build an instance of
NEMP by making the source node the only candidate location for placing the NFV. It is easy
to see that the optimal solution of this NEMP instance is exactly the solution to the original
Steiner tree problem. Since the Steiner tree problem is NP-hard, we conclude that NEMP is
NP-hard as well.
3.3 Algorithm for NEMP
Since NEMP is NP-hard, the time required to find the exact solution is expected to grow
exponentially with the size of the network in the worst case. We therefore propose algorithms
with low time complexity to solve NEMP. We first present an algorithm that achieves an
approximation ratio of 2. Then, we present an exact solution algorithm based on
branch-and-bound. Lastly, as end users may dynamically enter and leave the multicast session,
dynamic heuristics are proposed to handle arriving and leaving users so that they can connect
to the multicast topology quickly.
3.3.1 An Approximation Algorithm
Algorithm 1 is our approximation algorithm for NEMP. It first searches for a single NFV node
that is used by all the destination users, and then constructs a minimum spanning tree
among the end users. The traffic flow first traverses the selected NFV node and is then
multicast to each end user along the minimum spanning tree. We now prove its
approximation guarantee.
Algorithm 1 Approximation algorithm
1: Cbest = ∞
2: for each (h, d) pair ∈ H × D do
3:   Construct a shortest path graph P by connecting s to h, h to d.
4:   Construct a complete subgraph Gc for D, where the cost of each edge (di, dj) in Gc is set equal to the distance of the shortest path from di to dj in G
5:   Find the minimum spanning tree T from Gc and calculate C^l(T)
6:   Connect P with T, call the combined graph Gw (since d exists in both P and T, we can connect them together). Calculate C(Gw) using eq. (3.3)
7:   if C(Gw) ≤ Cbest then
8:     Cbest ← C(Gw), Gbest ← Gw
9:   end if
10: end for
11: Construct the graph Tbest from Gbest by replacing every edge in Gbest with the shortest path in G; if there are several shortest paths, randomly select one.
12: Remove any unnecessary edges in Tbest to get the final output TH
Figure 3.3: An example of the approximation algorithm. (a) shows a network topology G(V,E) with 9 nodes and 12 links; suppose s = 9, H = {5, 7}, D = {2, 3, 4, 6}, and pick node 7 to be h and node 4 to be d. (b) is the path graph P. (c) is the complete graph Gc. Assume (d) is the minimum spanning tree T of Gc. Connecting P with T gives Gw in (e). Assume this combination gives the least total cost; (f) shows Tf and (g) is TH.
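The search in lines 1 to 10 of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the thesis implementation: it computes only the cost Cbest and the selected (h, d) pair, it omits the path expansion and pruning of lines 11 and 12, and it assumes a connected graph with comparable node labels. The helper names `dijkstra`, `mst_cost` and `approx_nemp_cost`, and the small test network, are hypothetical.

```python
import heapq
from itertools import product


def dijkstra(adj, src):
    """Shortest-path distances from src; adj[u] is a dict {v: w(u, v)}."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def mst_cost(nodes, dist):
    """Prim's algorithm on the complete graph Gc whose edge costs are shortest-path distances."""
    nodes = list(nodes)
    in_tree = {nodes[0]}
    total = 0
    while len(in_tree) < len(nodes):
        cost, v = min((dist[u][v], v) for u in in_tree for v in nodes if v not in in_tree)
        total += cost
        in_tree.add(v)
    return total


def approx_nemp_cost(adj, s, H, D, c):
    """Return (Cbest, h, d) following lines 1-10 of Algorithm 1."""
    dist = {u: dijkstra(adj, u) for u in set(D) | set(H) | {s}}
    tree = mst_cost(D, dist)  # C^l(T) does not depend on (h, d), so compute it once
    best = (float("inf"), None, None)
    for h, d in product(H, D):
        total = dist[s][h] + dist[h][d] + c[h] + tree  # C(Gw) from eq. (3.3)
        if total <= best[0]:
            best = (total, h, d)
    return best


# Hypothetical 5-node network: source 0, NFV candidates {1, 2}, destinations {3, 4}.
adj = {
    0: {1: 1, 2: 1},
    1: {0: 1, 3: 1, 4: 3},
    2: {0: 1, 3: 5},
    3: {1: 1, 2: 5, 4: 1},
    4: {3: 1, 1: 3},
}
result = approx_nemp_cost(adj, s=0, H=[1, 2], D=[3, 4], c={1: 2, 2: 1})
print(result)  # (5, 1, 3): path 0-1-3 costs 2, c(1) = 2, spanning tree over {3, 4} costs 1
```

Because edges shared between P and T are counted separately here, the returned value is the cost of Gw and can only overestimate the cost of the pruned output TH.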
Theorem 2. Given that there exists an optimal multicast topology for the NEMP, the total
cost of TH is no more than two times that of the optimal multicast topology.
To prove this theorem, we first state two lemmas:
Lemma 1. Let Gopt be the optimal multicast topology of NEMP. There exists a multicast tree
Topt such that each root-leaf path in Topt is exactly a walk¹ from the source to an end user in
Gopt, and the total cost of Topt equals that of Gopt.
Proof. For each destination d, there must exist a walk W that connects the source s to d in
Gopt. We construct Topt iteratively. In the first iteration, we select a destination node d at
random and let the walk between s and d in Gopt be the first root-leaf path of Topt. In each
subsequent iteration, we first select a remaining destination d from D. Let W denote
the walk from s to d in Gopt. Among all existing root-leaf paths P in Topt, we find the P that
contains the largest subset W1 ⊆ W starting from the source s. That is, P contains the longest
subpath of W among all the root-leaf paths. Let n be the end point of W1 and define
W2 = W \ W1; we then add W2 as a new branch of Topt at n. We repeat this for every end user in
Gopt until all the end users have been added. Since all the links and nodes are copied from the
walks in Gopt to Topt, the total cost of Topt (including the activation cost of the NFV nodes and
the total cost of the links) equals that of Gopt.
This can be illustrated by Figure 3.4. We try to derive Topt from the optimal multicast topology
Gopt in Figure 3.2. The walk from source node 1 to the first end user 3 is (1, 6, 4, 6, 3) (3.4(a)).
The walk from source node 1 to second end user 5 is (1, 6, 4, 6, 5), since (1, 6, 4, 6) is already
contained in Topt, we connect (6,5) to Topt (3.4(b)). Finally, the walk from source node 1 to
last end user 2 is (1, 6, 4, 6, 3, 1, 2) and (1, 6, 4, 6, 3) is already contained in Topt, we connect
(1, 2) to node 3 in Topt (3.4(c)). The total cost of Topt equals that of Gopt.
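One way to realise the construction in the proof of Lemma 1 is a prefix tree over the walks: walks that share a prefix from the source share a branch, and the remainder of each walk is attached where it diverges. The sketch below is an assumption about how this could be coded, not the thesis implementation; `build_walk_tree` and `count_tree_nodes` are hypothetical names, and the walks are those of Figure 3.2.

```python
def build_walk_tree(walks):
    """Merge walks that share a prefix from the source into one tree (nested dicts)."""
    root = {}
    for walk in walks:
        node = root
        for v in walk:
            # Reuse the shared prefix W1; a new key starts the new branch W2.
            node = node.setdefault(v, {})
    return root


def count_tree_nodes(tree):
    """Number of tree nodes; network nodes visited twice by a walk are counted twice."""
    return sum(1 + count_tree_nodes(child) for child in tree.values())


walks = [(1, 6, 4, 6, 3), (1, 6, 4, 6, 5), (1, 6, 4, 6, 3, 1, 2)]
tree = build_walk_tree(walks)
num_edges = count_tree_nodes(tree) - 1
print(num_edges)  # 7, one tree edge per link term in the cost of Figure 3.2
```

Note that the tree contains copies of network nodes (nodes 1 and 6 each appear twice), which is why its edge count matches the link terms of the original topology.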
We also cite the following result from [2]:
¹ In graph theory, a walk is a sequence of links and nodes, where each link's endpoints are the preceding and following nodes in the sequence.
Figure 3.4: Example of constructing tree
Lemma 2. Let T be a tree with m ≥ 1 edges. Then there exists a loop in T, u1, u2, ..., u2m,
where every ui, 1 ≤ i ≤ 2m, is a vertex in T , such that every edge in T appears exactly twice
in the loop.
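The loop in Lemma 2 can be obtained from a depth-first traversal that records each edge once on the way down and once on the way back. The following sketch illustrates this on a small hypothetical tree; the function name `double_traversal` and the node labels are assumptions. The returned list repeats the start vertex at the end, so it has 2m + 1 entries for a tree with m edges.

```python
from collections import Counter


def double_traversal(tree, root):
    """Closed walk over a tree that uses every edge exactly twice (DFS order)."""
    tour = [root]

    def visit(u, parent):
        for v in tree[u]:
            if v != parent:
                tour.append(v)  # go down edge (u, v)
                visit(v, u)
                tour.append(u)  # come back up the same edge

    visit(root, None)
    return tour


# Hypothetical tree with m = 4 edges, given as an adjacency list.
tree = {"s": ["a", "b"], "a": ["s", "c", "d"], "b": ["s"], "c": ["a"], "d": ["a"]}
tour = double_traversal(tree, "s")
edge_uses = Counter(frozenset(e) for e in zip(tour, tour[1:]))
print(tour)  # ['s', 'a', 'c', 'a', 'd', 'a', 's', 'b', 's']
```

Each of the four edges appears exactly twice in `edge_uses`, which is the property the proof of Theorem 2 relies on.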
Proof of Theorem 2. Let Gopt denote the optimal solution and Topt denote the tree derived
from it. Lemma 1 shows that there exists a Topt whose cost equals that of Gopt. Define
Copt(v1, v2) as the cost of the path connecting nodes v1 and v2 in Topt, and Cshortest(v1, v2) as
the cost of the shortest path connecting nodes v1 and v2 in G. Also denote the leaves of Topt as
di, 1 ≤ i ≤ n (the order of the leaves is random), where n is the number of leaves in Topt. From
Lemma 2, there exists a loop L in Topt such that every edge in Topt is traversed exactly twice
and every leaf in Topt is visited exactly once. L can be decomposed into three paths:
1. A simple path M1 that connects the source s to the first leaf d1.
2. A path M2 that connects the first leaf d1 to the next leaf d2, then d2 to the next leaf d3,
and so on until the last leaf dn.
3. A path M3 that connects dn back to the source s.
Since each link and NFV node in Topt appears exactly twice in the loop L, C(L) = 2 ×
C(Topt) ≥ C^l(L). The simple path M1 must pass through at least one NFV node; otherwise
the flow is not processed before reaching the destination user. Call this node hopt, so that
C(M1) = Copt(s, hopt) + Copt(hopt, d1) + c(hopt). Ignoring the activation cost of all the NFV
nodes that M2 traverses, M2 can be viewed as a graph that contains all the end users, so
\[ C^{l}(M_2) = \sum_{i=1}^{n-1} C_{opt}(d_i, d_{i+1}), \]
and C(M1) + C^l(M2) must be greater than or equal to
\[ C(T_{best}) = \min_{h \in H,\, d \in D} \left( C_{shortest}(s, h) + C_{shortest}(h, d) + c(h) + C^{l}(T) \right), \]
where T is the minimum spanning tree defined in step 5 of Algorithm 1. Thus C(L) must be
greater than or equal to C(Tbest), which is greater than or equal to C(TH). Therefore
\[ C(T_H) \le C(T_{best}) \le C(M_1) + C^{l}(M_2) \le C(M_1) + C(M_2) \le C(L) = 2C(T_{opt}) = 2C(G_{opt}). \]
We use the example in Figure 3.3(a) to illustrate this proof. Suppose that for the problem in
Figure 3.3(a) we have the optimal solution Topt shown in Figure 3.5(a). C(Topt) includes the
cost of all the links it uses and the activation cost of all the NFV nodes it uses (node 5 and
node 7). The loop L that traverses Topt is shown in Figure 3.5(a), and includes:
1. The source s to the first leaf node 3; call this simple path M1 (shown in Figure 3.5(b)).
2. The first leaf node 3 to the next leaf node 4, then node 4 to the next leaf node 6, and so on
until the last leaf node 2; call this path M2 (shown in Figure 3.5(c)).
3. The last leaf node 2 back to the source s; call this simple path M3.
The total cost of L includes two times of the NFV activation cost in Topt (the activation
cost of node 5 and 7, since they are traversed twice) and two times the sum of the link cost
of Topt. Therefore C(L) = 2 × C(Topt). The simple path M1 passes through NFV node
5 and connects to end user node 3. M2 contains all the end users (node 3,4,2,6). And
2× C(Topt) ≥ C(M1) + C(M2) ≥ C(Tbest) ≥ C(TH).
We now analyze the complexity of Algorithm 1. In lines 2 and 3, for each fixed h and d, we
need to find the shortest paths from s to h and from h to d. In line 4, to calculate the cost of
each edge in Gc, we need to find the shortest paths for each pair of destination users.
Therefore, the overall complexity is $O((n^2 + n|H|)(|V| + |E| \log |E|))$, where n is the
number of end users, |H| is the size of H, |V| is the number of nodes in the network and |E| is
the number of edges in the network.
Figure 3.5: Example
3.3.2 A Solution Algorithm based on Branch and Bound
We can also formulate the NEMP as an optimization problem, and find the exact solution
by using a technique such as branch-and-bound. To formulate the problem, we first make
the graph directed by representing each undirected link with two opposite directed links with
equal weight. In presenting the formulation, we will use the definitions given in Table 3.1.
The problem can be formulated as a binary integer programming problem as follows:
minimizeX1,X2,Y1,Y2,Z
P T1 (Y1 + Y2) + P T
2 Z (3.4)
subject to M(X1 + X2) = A (3.5)
ZTMX1 = −1 (3.6)
ZTMX2 = 1 (3.7)
X1jk ≤ Y1j ≤ 1 (∀k, j) (3.8)
X2jk ≤ Y2j ≤ 1 (∀k, j) (3.9)
RTZ = 0 (3.10)
Contents 18
Table 3.1: Definitions of parameters

| Name | Dimension | Description |
|------|-----------|-------------|
| $N$ | - | Number of nodes in the network |
| $K$ | - | Number of undirected links in the network; after replacing each undirected link with two directed links, there will be $2K$ directed links in the graph |
| $n$ | - | Number of destinations |
| $H$ | - | Set of nodes where NFV can be implemented |
| $D$ | - | Set of destinations |
| $M$ | $N \times 2K$ | Incidence matrix, $M_{mi} = 1$ and $M_{ni} = -1$ if there is a directed link $i$ from node $m$ to node $n$ |
| $P_1$ | $2K \times 1$ | Edge cost vector; it contains the cost $w(e)$ of each directed link $e$ |
| $P_2$ | $N \times 1$ | Node cost vector, $P_{2k} = c(k)$ iff node $k$ is in $H$, $P_{2k} = 0$ otherwise |
| $A$ | $N \times n$ | $A_{jk} = 1$ iff $j = s$, $A_{jk} = -1$ iff $j = d_k$ and $A_{jk} = 0$ otherwise |
| $W$ | $1 \times n$ | $W_k = 1$ for each element in $W$ |
| $R$ | $N \times 1$ | $R_k = 1$ if node $k$ is not in $H$, $R_k = 0$ otherwise |
| $S$ | $N \times n$ | $S_{ij} = 1$ if $i$ = source and $S_{ij} = 0$ for all $i \ne$ source |
| $Q$ | $N \times n$ | $Q_{ij} = 0$ if $i = d_j$ and $Q_{ij} = -1$ otherwise |
| $X_1$ | $2K \times n$ | $X_{1ei} = 1$ iff link $e$ is used by the flow with destination $d_i$ to connect source $s$ with the NFV node |
| $X_2$ | $2K \times n$ | $X_{2ei} = 1$ iff link $e$ is used to reach destination $d_i$ from the NFV node |
| $Y_1$ | $2K \times 1$ | $Y_{1e} = 1$ iff link $e$ is used in the multicast topology to connect source $s$ to any NFV node |
| $Y_2$ | $2K \times 1$ | $Y_{2e} = 1$ iff link $e$ is used in the multicast topology to connect any NFV node to any end user |
| $Z$ | $N \times 1$ | $Z_k = 1$ iff NFV is running on node $k$ |
\begin{align}
& Q \preceq M X_1 \preceq S \tag{3.11} \\
& X_1, X_2, Y_1, Y_2, Z \ \text{are binary}
\end{align}
(3.4) gives the total cost of the multicast topology, which includes the cost of each used
link and the activation cost of the NFV nodes. (3.5) and (3.11) ensure that a single path is set up
between the source s and each end user. (3.6) and (3.7) ensure that the multicast flow passes
through at least one NFV node before reaching each end user. (3.8) and (3.9) ensure that the multicast
topology comprises all the single paths between source s and each end user di. (3.10)
ensures that only the nodes in H can be used as NFV nodes.
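As an illustration of constraints (3.5), (3.6) and (3.7), the following sketch builds the incidence matrix M for a small directed graph and checks one candidate routing for a single destination. The four-node graph, the node numbering and the helper names are assumptions made for this example only; they do not come from the thesis.

```python
# Toy check of constraints (3.5)-(3.7) for a single destination.
# The graph, node indices and helper names are illustrative assumptions.

def incidence_matrix(num_nodes, links):
    """M[m][i] = 1 and M[n][i] = -1 for directed link i = (m, n)."""
    M = [[0] * len(links) for _ in range(num_nodes)]
    for i, (m, n) in enumerate(links):
        M[m][i] = 1
        M[n][i] = -1
    return M

def mat_vec(M, x):
    """Plain matrix-vector product."""
    return [sum(row[i] * x[i] for i in range(len(x))) for row in M]

# Nodes: 0 = source s, 1 = NFV node, 2 = relay, 3 = destination d.
# Each undirected link appears as two opposite directed links.
links = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
M = incidence_matrix(4, links)

x1 = [1, 0, 0, 0, 0, 0]  # source -> NFV segment uses link (0, 1)
x2 = [0, 0, 1, 0, 1, 0]  # NFV -> destination uses links (1, 2) and (2, 3)
z = [0, 1, 0, 0]         # NFV runs on node 1
a = [1, 0, 0, -1]        # column of A: +1 at the source, -1 at d

flow = mat_vec(M, [p + q for p, q in zip(x1, x2)])             # left side of (3.5)
zmx1 = sum(zk * v for zk, v in zip(z, mat_vec(M, x1)))  # left side of (3.6)
zmx2 = sum(zk * v for zk, v in zip(z, mat_vec(M, x2)))  # left side of (3.7)
```

For this routing, `flow` equals `a`, `zmx1` is -1 and `zmx2` is 1, so the first segment ends at the NFV node and the second segment starts there.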
We now provide an algorithm for solving the integer programming problem using Branch
and bound. Compared to Algorithm 1, our Branch and bound algorithm computes
the exact optimal solution of NEMP, at the price of a higher computational overhead.
However, because constraints (3.6) and (3.7) multiply the variables Z and X, this is a
quadratic integer program, and it is hard to apply traditional algorithms
to solve it. Instead, we make the problem linear by trying every combination of NFV nodes
and solving the resulting linear integer programming problem (with the NFV nodes fixed) using Branch
and bound. The key feature of our algorithm is that it prunes the search space in the
following two dimensions:
1. Prune the search space of NFV nodes
We can search for the best NFV-enabled multicast topology without regard to the
activation cost of NFV nodes. Call this topology Tlink; it is the cheapest multicast
topology we can build if we neglect the activation cost of NFV nodes. Denote by cmax the
activation cost of the NFV nodes deployed in Tlink. No multicast tree can have a lower link cost
than Tlink, so if a multicast tree has an NFV activation cost greater than cmax, its total cost
must be greater than that of Tlink and it cannot be the optimal solution. Using this fact, the
steps for pruning the search space are shown below:
• We change the objective function of the previous problem to ignore the cost of NFV nodes,
so the new objective function becomes:
\[ \min \; P_1^\top (Y_1 + Y_2) \]
• Activate every NFV node in H by setting Zi = 1 iff node i ∈ H and Zi = 0 otherwise.
• Solve the new linear integer programming problem by using Branch and bound.
• Record the NFV nodes actually used by the solution, since some of the NFV nodes may
not be used by the multicast topology. Calculate the total activation cost of these used
NFV nodes, cmax.
• Select one combination of the NFV nodes. Before solving the linear integer programming
problem, calculate the total activation cost of this NFV node combination, denoted as
c. If c > cmax, this combination is pruned and there is no need to test it.
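The last pruning step can be sketched as a filter over NFV node combinations. This is a minimal illustration, assuming the activation costs are held in a dictionary and that cmax has already been obtained from Tlink; the function name and the numbers are hypothetical.

```python
from itertools import combinations

def candidate_nfv_sets(activation_cost, c_max):
    """Yield NFV node subsets whose total activation cost does not exceed c_max."""
    nodes = sorted(activation_cost)
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            cost = sum(activation_cost[h] for h in subset)
            if cost <= c_max:  # combinations with c > c_max are pruned
                yield subset, cost

activation_cost = {"h1": 4, "h2": 3, "h3": 6}  # hypothetical c(h) values
c_max = 7                                       # hypothetical cost taken from T_link
kept = list(candidate_nfv_sets(activation_cost, c_max))
```

With these numbers, four of the seven non-empty combinations survive, so only four linear integer programs would remain to be solved.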
2. Prune the search space of links
When solving the linear integer programming problem for a given set of activated NFV nodes, we can
use the following facts about the multicast topology to prune the search space:
• Links directed to the source s can never carry flow from the source to the deployed
NFV node.
• Links originating from a destination can never carry flow from the NFV node to that
destination.
Therefore, multicast topologies that use a link in either of these two ways are not considered
in our Branch and bound algorithm.
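A minimal sketch of the two link-pruning rules, assuming directed links are stored as (tail, head) pairs. In terms of the formulation, dropping a link here corresponds to fixing the matching entry of X1 or X2 to zero before branching; the function name and the toy graph are assumptions for illustration.

```python
def prune_links(links, source, destination):
    """Return the directed links still allowed in each stage of one flow.

    Stage 1 (source -> NFV node): links pointing into the source are dropped.
    Stage 2 (NFV node -> destination): links leaving the destination are dropped.
    """
    stage1 = [(u, v) for (u, v) in links if v != source]
    stage2 = [(u, v) for (u, v) in links if u != destination]
    return stage1, stage2

# Hypothetical three-node line s - a - d, each link given in both directions.
links = [("s", "a"), ("a", "s"), ("a", "d"), ("d", "a")]
stage1, stage2 = prune_links(links, "s", "d")
```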
3.3.3 Dynamic heuristics
The algorithms presented in the previous sections have been designed to solve the static version
of NEMP. In reality, end users may dynamically join and leave the existing multicast topology,
so below we propose a dynamic approach (Algorithms 2 and 3) to deal with this situation. For
the new incoming end users, the algorithm finds the closest end user in distance among the
existing end users in the multicast topology, and then makes a connection between them. For
leaving end users, the algorithm will remove the path to the end user if no other existing end
users use that path.
The complexity of the dynamic heuristics for each entering and each leaving user is O(n(|V| + |E| log |E|))
and O(|E|), respectively, where n is the number of end users, |V| is the number of nodes in the network
and |E| is the number of edges in the network. The complexity of the dynamic heuristics is much lower
Algorithm 2 Dynamic heuristic (Entering)
1: for each coming end user d ∈ D do
2:   if the new user connects to a node in the multicast topology then
3:     Send the flow to that user directly.
4:   end if
5:   if the new user connects to a new node outside the existing multicast topology then
6:     dmin ← the user in the multicast group with the shortest distance to d. If more than one existing user satisfies this, randomly choose one.
7:     Connect d to dmin with the shortest path.
8:   end if
9: end for
Algorithm 3 Dynamic heuristic (Leaving)
1: for each leaving end user d ∈ D do
2:   for each link e in path to d do
3:     if e does not carry any multicast traffic then
4:       remove e from the existing multicast topology
5:     end if
6:   end for
7: end for
than that of the static algorithms, so users can be quickly added to or removed from
the multicast topology.
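The two heuristics can be sketched as follows. This is an illustrative reading of Algorithms 2 and 3, not the thesis implementation: it assumes an undirected graph stored as a dictionary of neighbor weights, keeps one list of tree edges per end user, breaks ties deterministically instead of randomly, and omits the case where the new user already sits on a node of the tree. All names are hypothetical.

```python
import heapq

def dijkstra(graph, src):
    """Shortest distances and predecessors from src."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, prev

def join(graph, paths, new_user):
    """Attach new_user to the closest existing end user via a shortest path."""
    dist, prev = dijkstra(graph, new_user)
    d_min = min(paths, key=lambda u: dist.get(u, float("inf")))
    hop, node = [], d_min
    while node != new_user:  # walk from d_min towards new_user
        hop.append((node, prev[node]))
        node = prev[node]
    paths[new_user] = paths[d_min] + hop
    return d_min

def leave(paths, user):
    """Remove user; return the edges that no remaining user needs."""
    removed = paths.pop(user)
    in_use = {e for p in paths.values() for e in p}
    return [e for e in removed if e not in in_use]

graph = {
    "s": {"h": 1}, "h": {"s": 1, "a": 1},
    "a": {"h": 1, "b": 1, "c": 5},
    "b": {"a": 1, "c": 1}, "c": {"a": 5, "b": 1},
}
paths = {"a": [("s", "h"), ("h", "a")]}  # existing user a, served through NFV node h
first = join(graph, paths, "b")           # b attaches to a
second = join(graph, paths, "c")          # c attaches to b
freed = leave(paths, "c")                 # only edge (b, c) becomes unused
```

Joining costs one shortest-path computation per new user in this sketch, and leaving only scans the stored paths, which mirrors the low cost of the dynamic heuristics compared with rerunning the static algorithm.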
3.4 Numerical Results
We evaluated our multicast algorithms from both static and dynamic perspectives on a wide
area network model. We demonstrate the accuracy of the algorithms by comparing the
result of the Approximation algorithm with the multicast topology generated by our Branch and
bound method on three network models generated by GT-ITM [23]. GT-ITM is a widely used
network generator that simulates wide area network topologies. The network settings are
presented in Table 3.2.
For each network topology, we randomly generate a set of nodes H and a set of end users.
We schedule all the users at the same time. Each scenario is evaluated several times and the
Table 3.2: Properties of networks

| Case | Network I | Network II | Network III |
|------|-----------|------------|-------------|
| nodes | 10 | 35 | 100 |
| links (undirected) | 14 | 50 | 127 |
| size of H | 2 | 7 | 10 |
average results are presented.
3.4.1 Cost Analysis
In Tables 3.3, 3.4 and 3.5, the result generated by the Approximation algorithm is normalized to
that generated by our Branch and bound method. For Network I, the normalization ratio
between the approximation algorithm and the optimal solution increases from 1.25 to 1.31 as the
number of users increases. For Network II, the normalization ratio increases from 1.31 to 1.41.
For Network III, the normalization ratio increases from 1.47 to 1.56 as the number of users
increases. We observe that the gap between the Branch and bound solution and the Approximation
algorithm increases as the number of end users grows or the size of the network increases.
One possible reason is that, as the number of end users or the size of the network
increases, there are more possible ways to build the multicast topology, and it is therefore harder to
find a good solution.
Table 3.3: Cost of the multicast topology (Network I)

| Number of end users | Branch and bound | Approx. algorithm |
|---------------------|------------------|-------------------|
| 3 | 1 | 1.25 |
| 4 | 1 | 1.30 |
| 5 | 1 | 1.31 |
Table 3.4: Cost of the multicast topology (Network II)

| Number of end users | Branch and bound | Approx. algorithm |
|---------------------|------------------|-------------------|
| 5 | 1 | 1.31 |
| 10 | 1 | 1.37 |
| 15 | 1 | 1.41 |
Table 3.5: Cost of the multicast topology (Network III)

| Number of end users | Branch and bound | Approx. algorithm |
|---------------------|------------------|-------------------|
| 10 | 1 | 1.47 |
| 20 | 1 | 1.51 |
| 30 | 1 | 1.56 |
3.4.2 Running Time Analysis
Figures 3.6-3.8 give the running time of the Approximation algorithm and our Branch
and bound method for the three networks. The running time
of the approximation algorithm increases from 80 ms to 300 ms for Network I, from 0.25 s to 9 s for
Network II and from 0.8 s to 77 s for Network III. We make the following observations. First, the
running time of both methods increases as the number of destination users increases. By
comparison, the growth rate of the running time of the Approximation algorithm is much lower than
that of Branch and bound. Second, the network size has a great effect on the running time
of both methods, with a higher impact on Branch and bound.
Figures 3.9-3.11 show the relationship between the total number of end users and the average
processing time of each incoming user. The running time of the algorithm increases from
1.5 ms to 6.1 ms for Network I, from 3.2 ms to 4.7 ms for Network II and from 4 ms to 5.5 ms for Network
III. We can see that the average processing time increases as more users join the multicast
group. A possible reason is that, as the number of users in the group increases, there are more
options for the new end users to connect to. The average processing time grows only gently as
the network size increases, which suggests that the algorithm is scalable.
3.5 Implementation
In order to validate the functionality and implementability of this multicast mechanism, we
use ViNO running on the SAVI testbed. The SAVI testbed is an open application platform
aimed at efficiently controlling and managing virtual ICT resources [24]. ViNO is
an SDN launcher that uses VXLAN alongside a software switch (Open vSwitch) running on
Figure 3.6: Running time of two static algorithms for network I
Figure 3.7: Running time of two static algorithms for network II
Figure 3.8: Running time of two static algorithms for network III
Figure 3.9: Running time of two dynamic heuristics for network I
Figure 3.10: Running time of two dynamic heuristics for network II
Figure 3.11: Running time of two dynamic heuristics for network III
virtual machines to create an overlay SDN network whose topology is specified by the user.
Moreover, a central controller is set up to enforce the network management policies on each
network element [8].
3.5.1 Multicast Mechanism
The multicast mechanism can be described by the block diagram in Figure 3.12:
1. User (Tenant) input: the tenant specifies the network topology, link weights, source node and
destination nodes in a specially formatted file. The file is then passed to
the Routing Calculator and ViNO.
2. Routing Calculator: it takes the input from the tenant, computes the routing paths, and
sends the results to the SDN controller.
3. ViNO: ViNO takes the input from the tenant and generates the overlay network topology
based on the input file.
4. SDN controller: the SDN controller takes the output from the Routing Calculator,
translates it into a set of OpenFlow forwarding rules, and implements the rules in the
OpenFlow switches.
5. A multicast network is set up after the rules have been implemented in the switches.
3.5.2 Example
For the implementation, we choose a network topology with four hosts h1, h2, h3 and h4, 12
OpenFlow switches and a set of predefined link weights (shown in Figure 3.13). The multicast
group consists of three hosts h1, h3 and h4. The source host h1 multicasts packets which
contain text messages to end users h3 and h4. An NFV node has also been booted up
on h2, which converts the lower case letters it receives into upper case. The text
Figure 3.12: Multicast mechanism diagram
message written in lower case is sent from h1 to h3 and h4, and h3 and h4 should receive
the processed multicast flow, which contains the same text message in upper case. The paths
connecting h1 with h3 and h4 are described below:
From h1 to h3 : (h1, s1, NFV, s6, s10, h3)
From h1 to h4 : (h1, s1, NFV, s1, s2, s4, s7, h4)
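To illustrate how such paths could be turned into per-node forwarding state, the sketch below merges the two paths into a table keyed by (node, previous hop). Keying on the previous hop is what separates the two visits to s1 in the second path (before and after the NFV node). This is a simplified stand-in for real OpenFlow rules, which would match on ingress ports and packet headers; the function name and the table layout are assumptions.

```python
from collections import defaultdict

def forwarding_rules(paths):
    """Map (node, previous hop) -> set of next hops, merged over all paths."""
    rules = defaultdict(set)
    for path in paths:
        for prev_hop, node, next_hop in zip(path, path[1:], path[2:]):
            rules[(node, prev_hop)].add(next_hop)
    return dict(rules)

paths = [
    ("h1", "s1", "NFV", "s6", "s10", "h3"),
    ("h1", "s1", "NFV", "s1", "s2", "s4", "s7", "h4"),
]
rules = forwarding_rules(paths)
```

In the resulting table, traffic arriving at s1 from h1 is sent to the NFV node, traffic arriving at s1 from the NFV node is sent to s2, and the NFV node duplicates the processed flow towards s6 and s1.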
We run a simple C program on the NFV node. This program intercepts every packet received on
UDP port 5005 and converts the text message inside into upper case. To send text from h1 to h2
and h3, we use the netcat command in Linux to send UDP packets through port 5005. We use the
IP address of h3 as the destination IP address of the multicast packets.
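The thesis program is written in C and is not reproduced here. As a rough sketch of the same behaviour in Python, the code below upper-cases the payload of one UDP datagram and sends it on. The relay-style forwarding, the function names and the buffer size are assumptions; in the testbed, delivery to h3 and h4 is handled by the OpenFlow rules rather than by the NFV program.

```python
import socket

PORT = 5005  # UDP port named in the text

def process_payload(payload: bytes) -> bytes:
    """Convert ASCII lower-case letters in the payload to upper case."""
    return payload.upper()

def relay_once(sock, forward_addrs):
    """Receive one datagram, upper-case it, and send a copy to each address."""
    data, _sender = sock.recvfrom(2048)
    out = process_payload(data)
    for addr in forward_addrs:
        sock.sendto(out, addr)
    return out

def make_socket(port=PORT):
    """Open a UDP socket bound to the given port on all interfaces."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    return sock
```

A caller would create the socket with `make_socket()` and invoke `relay_once` in a loop, passing the (hypothetical) addresses of the next hops.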
First, we validate the functionality of the multicast mechanism by sending text
messages from the host h1. As shown in Figure 3.14, the text message is first sent to
the NFV node and processed by h2. Then h2 sends the text in upper case to the end users
h3 and h4.
Figure 3.13: Network topology
3.5.3 Evaluation
The total setup time is 57.79 seconds, the average bandwidth used on each link is 47.6 bytes/s
and the drop rate is 0 bytes/s. Assuming the overlay network is ready for use, the total setup time
is measured from sending the user input to the Routing Calculator until all the forwarding rules
have been implemented on each switch. The total time used is less than 1 min. Most of the time
is spent on communication between virtual machines.
Figure 3.14: Screenshots of h1, h2, h3 and h4
Chapter 4
Single Layer Multicasting Routing
Problem with variable bandwidth
4.1 Problem Statement
In the previous chapter, we assumed that the bandwidth of a flow remains constant
after traversing the NFV. However, in general the bandwidth changes when the
traffic traverses an NFV, for example under video compression or decompression. Moreover, we assume that
the link cost is also associated with the bandwidth consumption. The intuition is that
as the bandwidth consumption increases, the link cost also increases, and vice versa. We call
this problem the NFV-enabled multicast problem with variable bandwidth (NEMPVB).
We can formulate the NEMPVB as an optimization problem, and solve it by using a technique
such as Branch and bound. In presenting the formulation, define P1 as the cost of links
before traversing the NFVs, P2 as the cost of links after traversing the NFVs and P3 as the
cost of nodes. The remaining symbols follow the definitions given in Table 3.1. The optimization problem
can now be stated as:
\begin{equation}
\underset{X_1, X_2, Y_1, Y_2, Z}{\text{minimize}} \quad P_1^\top Y_1 + P_2^\top Y_2 + P_3^\top Z \tag{4.1}
\end{equation}
Equation (4.1) is the new objective function, which computes the total cost of the multicast topology
under variable bandwidth. It includes the cost of each used link before and after
traversing the NFV and the activation cost of the NFV nodes. (4.1) together
with (3.5)-(3.11) formally defines the NEMPVB.
We can easily show that NEMPVB is also NP-hard. Since the NEMPVB is NP-hard,
our main objective is to find polynomial-time algorithms that yield good approximation
guarantees. In the following sections, we divide NEMPVB into three separate cases. In
the first case, we assume bbefore = bafter, namely the bandwidth usage before and after
processing by the NFV components is the same. This is the NEMP, which we solved in the last
chapter. The second case is when bbefore > bafter. We call this problem the Reduced Multicast Flow
Bandwidth problem (RMF). Finally, we consider the case where bbefore < bafter. We call this
problem the Increased Multicast Flow Bandwidth problem (IMF). While these three problems
are similar, the approximation algorithms developed for them differ in subtle aspects.
In the following subsections, we study RMF and IMF separately.
4.1.1 Algorithm for RMF
In RMF, the bandwidth of the multicast flow decreases after traversing the NFV, and the reduction
in bandwidth consumption reduces the link cost. This problem can be
solved by almost the same approximation algorithm as the NEMP. The only difference is that
the bandwidth requirement of the flow is reduced after traversing the NFV, resulting in
two link weight metrics Θbefore and Θreduced. By using link metric Θ, we mean that the link
weight used to calculate the distance between two nodes is defined by Θ. The approximation
algorithm for RMF is therefore the same as Algorithm 1, except that in line 3 of Algorithm 1,
Θbefore is used to calculate the distance from source s to NFV node h and Θreduced is used to
calculate the distance from the NFV node to end user d, and in line 4 of Algorithm 1, Θreduced
is used to calculate the cost of the edges of Gc.
This approximation algorithm also keeps an approximation factor of 2, which can be proven
as follows:
First, let Gopt denote the optimal solution. From Lemma 1, we can derive a tree Topt with the
same cost. To facilitate the statement of the algorithm, define:
Copt(v1, v2,Θ): Cost of the path connecting v1 and v2 in Topt under link weight metric Θ.
Cshortest(v1, v2,Θ): Minimum cost of the path connecting v1 and v2 in G under link weight
metric Θ.
Now we state the proof of the approximation algorithm for RMF :
Proof. From Lemma 2, there also exists a loop L in Topt such that every edge in Topt is traversed
exactly twice and every leaf in Topt is visited exactly once. L can be decomposed into three
paths:
1. A simple path M1 that connects source s to the first leaf l1.
2. A path M2 that connects the first leaf l1 to the next leaf l2, then l2 to the next leaf l3, and
so on until the last leaf ln.
3. A path M3 that connects ln back to source s.
The simple path $M_1$ must pass through at least one NFV node; call this NFV node $h_{opt}$.
The total cost of $M_1$ equals $C_{opt}(s, h_{opt}, \Theta_{before}) + C_{opt}(h_{opt}, l_1, \Theta_{reduced}) + c(h_{opt})$. Ignoring
the activation cost of all the NFV nodes that $M_2$ traverses, $M_2$ can be viewed as a graph
that contains all the end users, so the total cost of $M_2$ equals $\sum_{i=1}^{n-1} C_{opt}(l_i, l_{i+1}, \Theta)$, where
$\Theta$ may be $\Theta_{before}$ or $\Theta_{reduced}$ depending on whether the flow travels through the NFV node.
The sum of the costs of $M_1$ and $M_2$ must be greater than or equal to that of
\[ T_{best} = \min_{h \in H,\, d \in D} \big( C_{shortest}(s, h, \Theta_{before}) + C_{shortest}(h, d, \Theta_{reduced}) + c(h) + C^l(T) \big). \]
Thus the total cost of the loop $L$ must be greater than or equal to the total cost of $T_{best}$,
whose total cost is greater than or equal to the cost of the solution $T_H$. Therefore
\[ C(T_H) \le C(T_{best}) \le C(M_1) + C^l(M_2) \le C(M_1) + C(M_2) \le C(L) = 2C(T_{opt}) = 2C(G_{opt}). \]
The complexity of the above algorithm is the same as that of approximation algorithm for
UMF.
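The effect of the two link weight metrics on the source-to-user path chosen in line 3 can be sketched as follows. The sketch assumes, purely for illustration, that Θreduced is obtained by scaling every link weight by a constant factor; the graph, the costs and the function names are hypothetical.

```python
import heapq

def shortest_costs(graph, src):
    """Dijkstra distances from src under the weights stored in graph."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def scale(graph, factor):
    """Assumed model: the second metric is the first one scaled by a constant."""
    return {u: {v: w * factor for v, w in nbrs.items()} for u, nbrs in graph.items()}

def best_source_path(g_before, g_after, s, nfv_cost, users):
    """Minimize C(s, h, before) + C(h, d, after) + c(h) over all (h, d) pairs."""
    from_s = shortest_costs(g_before, s)
    best = None
    for h, c_h in nfv_cost.items():
        from_h = shortest_costs(g_after, h)
        for d in users:
            cost = from_s[h] + from_h[d] + c_h
            if best is None or cost < best[0]:
                best = (cost, h, d)
    return best

g = {
    "s": {"h1": 1, "h2": 3},
    "h1": {"s": 1, "d": 4},
    "h2": {"s": 3, "d": 1},
    "d": {"h1": 4, "h2": 1},
}
nfv_cost = {"h1": 1, "h2": 1}
same_bw = best_source_path(g, g, "s", nfv_cost, ["d"])               # unchanged bandwidth
reduced = best_source_path(g, scale(g, 0.5), "s", nfv_cost, ["d"])  # halved after the NFV
```

With unchanged bandwidth the cheaper choice is h2, whereas with halved post-NFV link costs the choice moves to h1, which is closer to the source: processing early becomes attractive when the flow shrinks.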
Algorithm 4 Approximation algorithm for IMF
1: $C_{best} = \infty$
2: for each $(h, d)$ pair $\in H \times D$ do
3:   Construct a shortest path graph $P$ by connecting $s$ to $h$ and $h$ to $d$. Calculate the cost of $P$, $C(P) = C_{shortest}(s, h, \Theta_{before}) + C_{shortest}(h, d, \Theta_{increased}) + c(h)$
4:   Construct a complete subgraph $G_c$ for $D$, where the cost of each edge $(d_i, d_j)$ in $G_c$ is set equal to $\min\{\min_{(h_1, h_2) \in H} (C_{shortest}(d_i, h_1, \Theta_{increased}) + C_{shortest}(h_1, h_2, \Theta_{before}) + C_{shortest}(h_2, d_j, \Theta_{increased}) + c(h_1) + c(h_2)),\ C_{shortest}(d_i, d_j, \Theta_{increased})\}$ in $G$
5:   Find the minimum spanning tree $T$ from $G_c$ and calculate $C^l(T)$
6:   Connect $P$ with $T$, call the combined graph $G_w$ (since $d$ exists in both $P$ and $T$, we can connect them together). Calculate $C(G_w)$ using eq. (3)
7:   if $C(G_w) \le C_{best}$ then
8:     $C_{best} \leftarrow C(G_w)$, $G_{best} \leftarrow G_w$
9:   end if
10: end for
11: Construct the graph $T_{best}$ from $G_{best}$ by replacing every edge in $G_{best}$ with the shortest path in $G$; if there are several shortest paths, randomly select one.
12: Remove any unnecessary edges in $T_{best}$ to get the final output $T_H$
4.1.2 Algorithm for IMF
In IMF, the bandwidth consumption increases after passing through the NFV nodes, resulting in
a higher link cost.
We propose a 2-approximation algorithm (Algorithm 4) for IMF. The
main difference between Algorithm 4 and Algorithm 1 lies in lines 3 and 4, where the costs of the
edges of P and Gc are calculated differently. The rationale of Algorithm 4 is similar to that of Algorithm 1.
As before, we have two link weight metrics, Θbefore and Θincreased.
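The edge cost used in line 4 of Algorithm 4 can be sketched as below. The sketch assumes that all-pairs shortest distances under both metrics are available (computed here with Floyd-Warshall on a toy graph), that Θincreased is three times Θbefore, and that the minimum ranges over ordered pairs of distinct NFV nodes; these choices and all names are assumptions for illustration.

```python
def all_pairs(nodes, weights):
    """Floyd-Warshall on an undirected weighted graph given as {(u, v): w}."""
    inf = float("inf")
    dist = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for (u, v), w in weights.items():
        dist[u][v] = dist[v][u] = min(dist[u][v], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def gc_edge_cost(d_i, d_j, dist_before, dist_inc, nfv_cost):
    """Cost of edge (d_i, d_j) in G_c: direct path versus detour through two NFV nodes."""
    direct = dist_inc[d_i][d_j]
    via_nfv = min(
        (dist_inc[d_i][h1] + dist_before[h1][h2] + dist_inc[h2][d_j]
         + nfv_cost[h1] + nfv_cost[h2]
         for h1 in nfv_cost for h2 in nfv_cost if h1 != h2),
        default=float("inf"),
    )
    return min(direct, via_nfv)

nodes = ["d1", "h1", "h2", "d2"]
before = {("d1", "h1"): 1, ("h1", "h2"): 10, ("h2", "d2"): 1}
increased = {e: 3 * w for e, w in before.items()}  # assumed scaling of the metric
dist_before = all_pairs(nodes, before)
dist_inc = all_pairs(nodes, increased)

cheap = gc_edge_cost("d1", "d2", dist_before, dist_inc, {"h1": 1, "h2": 1})
costly = gc_edge_cost("d1", "d2", dist_before, dist_inc, {"h1": 20, "h2": 20})
```

With cheap NFV nodes, carrying the flow at its original bandwidth over the long middle link is cheaper than the direct path; with expensive NFV nodes, the direct path under Θincreased is preferred.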
Now we state the proof of approximation algorithm for IMF :
Proof. Let Topt denote the optimal tree derived from Gopt. There also exists a loop L in Topt
such that every edge in Topt is traversed exactly twice and every leaf in Topt is visited exactly
once. As in the previous case, L can be decomposed into three paths M1, M2 and M3. The simple
path M1 must pass through at least one NFV node; call this NFV node hopt. The total cost of
M1 equals Copt(s, hopt, Θbefore) + Copt(hopt, l1, Θincreased) + c(hopt). M2 consists of paths which
connect the first leaf $l_1$ to the next leaf $l_2$, and so on until the last leaf $l_n$. Call the path which connects node $l_i$
and node $l_{i+1}$ in $L$ $Q_i$. There are two categories for $Q_i$:
1. $Q_i$ does not pass through any NFV node. In this case the total cost of $Q_i$ is $C_{opt}(l_i, l_{i+1}, \Theta_{increased})$.
2. $Q_i$ passes through two NFV nodes, $h_1$ and $h_2$. In this case the total cost of $Q_i$ is
$C_{opt}(l_i, h_1, \Theta_{increased}) + C_{opt}(h_1, h_2, \Theta_{before}) + C_{opt}(h_2, l_{i+1}, \Theta_{increased}) + c(h_1) + c(h_2)$.
The total cost of $M_2$ equals $\sum_{i=1}^{n-1} C(Q_i)$. According to the definition of $T_{best}$ above, the sum of the costs of
$M_1$ and $M_2$ must be greater than or equal to that of $T_{best} = \min_{h \in H,\, d \in D} (C(P) + C(T))$. Thus the total cost of
the loop $L$ must be greater than or equal to the total cost of $T_{best}$, whose total cost is greater
than or equal to the cost of the solution $T_H$. Hence
\[ C(T_H) \le C(T_{best}) \le C(L) = 2C(T_{opt}) = 2C(G_{opt}). \]
We now analyze the complexity of Algorithm 4. In line 3 of Algorithm 4, we need to find
the shortest paths from s to h and from h to d for each (h, d) pair. Moreover, in line 4, to calculate
the cost of each edge in Gc, we need to try every combination of h1 and h2 and the shortest paths
between them. Therefore, the overall complexity is O(n^2 |H|^2 (|V| + |E| log |E|)), where n is the
number of end users, |V| is the number of nodes in the network and |E| is the number of edges in the
network.
A limitation of Algorithm 4 is that, in line 4, the multicast flow sent between
two end users may first be reverse processed by one NFV node to decrease its bandwidth and then
forward processed by another NFV node before reaching the next destination user. Algorithm 4
therefore assumes that an NFV node can perform both forward and reverse processing.
Since the bandwidth consumption of the flow increases when traversing the NFV, there
should be hardly any loss of original information when the flow is first forward processed and then reverse
processed by an NFV node. An example is lossless video compression, which allows the
original video stream to be perfectly recovered from the compressed stream [25].
4.2 Dynamic heuristics
So far, our proposed algorithms focus on static cases where individual users never leave the
system until the session ends. In reality, end users may dynamically join and leave the existing
multicast topology. The static algorithms are not suited to this case because it is not
reasonable to run them every time a user joins or leaves the system, due to
the high computation overhead. Moreover, a static algorithm may return a completely different
multicast topology when one user joins or leaves, so all the existing
multicast trees would need to be reconfigured, which produces a high reconfiguration overhead. Therefore,
we propose dynamic heuristics to deal with this situation. Algorithms 2 and 3 proposed
in the last chapter can be used for handling user joining and leaving for both the UMF and
RMF problems.
However, in the case of IMF, because the bandwidth increases after NFV processing, finding the
closest user and connecting it to the new user may result in high bandwidth usage. Therefore
we design a new algorithm for the entering operation in the case of IMF (Algorithm 5). When
a new end user joins, this dynamic algorithm first finds the shortest distance from every
existing end user to the new user and takes the least incremental cost c1min. Then it calculates
c2min, the least cost of activating a new NFV node from the set of unused NFV nodes Hinactivated and setting
up the flow path through it. If c1min ≤ c2min, it connects the new end user to the closest existing
end user; otherwise it activates a new NFV node and sets up a new path. The complexity of
this dynamic heuristic is O((n + |H|)(|V| + |E| log |E|)), which is still much lower than that
of Algorithm 4. Finally, Algorithm 3 can still be used to handle the case when a user leaves
the system for IMF.
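The decision between the two options can be sketched as follows. The sketch assumes that shortest-path costs under both metrics are already available as nested dictionaries and that the activation cost is part of c2min, as the text describes; the names and numbers are hypothetical.

```python
def join_imf(dist_before, dist_inc, source, users, inactive, nfv_cost, d):
    """Attach new user d to an existing user, or activate a new NFV node for it."""
    # Option 1: connect d to the closest existing end user (post-NFV metric).
    v_min = min(users, key=lambda v: dist_inc[v][d])
    c1 = dist_inc[v_min][d]
    # Option 2: activate an unused NFV node and route s -> h -> d.
    c2, h_min = float("inf"), None
    for h in inactive:
        cost = dist_before[source][h] + dist_inc[h][d] + nfv_cost[h]
        if cost < c2:
            c2, h_min = cost, h
    if c1 <= c2:
        return ("attach", v_min, c1)
    inactive.remove(h_min)
    return ("activate", h_min, c2)

# Hypothetical precomputed shortest-path costs.
dist_before = {"s": {"h2": 2}}
dist_inc = {"u1": {"d": 9}, "u2": {"d": 12}, "h2": {"d": 3}}
inactive = {"h2"}
first = join_imf(dist_before, dist_inc, "s", ["u1", "u2"], inactive, {"h2": 1}, "d")
second = join_imf(dist_before, dist_inc, "s", ["u1", "u2"], inactive, {"h2": 1}, "d")
```

The first call activates h2 because 2 + 3 + 1 = 6 is cheaper than attaching to u1 at cost 9. The second call finds no inactive NFV node left and falls back to attaching to u1.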
4.3 Scheduling of Multiple Multicast Sessions
In the previous sections, we considered the routing of a single multicast session. However, in
reality, multiple multicast sessions may coexist to deliver their own information to different
Algorithm 5 Dynamic heuristics (entering)
1: For each coming end user $d$:
2:   For each end user $v$:
3:     $v_{min} \leftarrow \arg\min_v (C_{shortest}(v, d, \Theta_{increased}))$
4:     $c_{1min} \leftarrow \min_v (C_{shortest}(v, d, \Theta_{increased}))$
5:   End
6:   For each node $h$ in $H_{inactivated}$:
7:     $h_{min} \leftarrow \arg\min_h (C_{shortest}(s, h, \Theta_{before}) + C_{shortest}(h, d, \Theta_{increased}) + c_h)$
8:     $c_{2min} \leftarrow \min_h (C_{shortest}(s, h, \Theta_{before}) + C_{shortest}(h, d, \Theta_{increased}) + c_h)$
9:   End
10:  If $c_{1min} \le c_{2min}$
11:    Add the new user to the group by connecting user $d$ with $v_{min}$ with the shortest path
12:  End
13:  If $c_{1min} > c_{2min}$
14:    $H_{inactivated}$.remove($h_{min}$)
15:    Add the new user to the group by connecting $(s, h_{min}, d)$ with the shortest path
16:  End
17: End
end user groups. Without loss of generality, we assume each multicast session has its own priority.
In the scenario of multiple multicast sessions, applying the approximation algorithms to
each session independently may cause imbalanced resource utilization.
For example, as shown in Figure 4.1(a), two multicast trees (shown with solid and dashed
arrows) carry traffic from source 1 to their end users, resulting in a heavy traffic load on NFV
node 3 and links (1, 3), (3, 5) and (3, 6), since they carry the traffic of both sessions. To solve this
problem, we can reroute one of the multicast trees to distribute the traffic load to other links
and NFV nodes and thus achieve load balancing. One possible solution is shown in Figure 4.1(b).
We now give the formulation of the problem. Suppose there is a set of multicast sessions $T$,
where each session $t \in T$ has its own source node $s^t$, a set $H^t$ of nodes where NFV can be placed, and a set of destinations $D^t$.
The goal is to minimize the maximum link utilization. This problem can be represented as
follows (variable definitions are given in Table 4.1):
\begin{equation}
\text{minimize} \quad u \tag{4.2}
\end{equation}
Table 4.1: Definitions of parameters

| Name | Dimension | Description |
|------|-----------|-------------|
| $N$ | - | Number of nodes in the network |
| $K$ | - | Number of undirected links in the network; after replacing each undirected link with two directed links, there will be $2K$ directed links in the graph |
| $n^t$ | - | Number of destinations of multicast session $t$ |
| $H^t$ | - | Set of nodes where NFV can be implemented for multicast session $t$ |
| $D^t$ | - | Set of destinations in multicast session $t$, $\{d_l^t\}$, $l = 1, 2, \dots, n^t$ |
| $u$ | - | Maximum utilization rate of links in the network |
| $C_j$ | - | Capacity of link $j$ |
| $L_1^t$ | - | Bandwidth usage of multicast session $t$ before passing NFV |
| $L_2^t$ | - | Bandwidth usage of multicast session $t$ after passing NFV |
| $\beta^t$ | - | Workload on NFV nodes of multicast session $t$ |
| $M$ | $N \times 2K$ | Incidence matrix, $M_{mi} = 1$ and $M_{ni} = -1$ if there is a directed link $i$ from node $m$ to node $n$ |
| $A^t$ | $N \times n^t$ | $A_{jk}^t = 1$ iff $j = s^t$, $A_{jk}^t = -1$ iff $j = d_k^t$ and $A_{jk}^t = 0$ otherwise |
| $W^t$ | $1 \times n^t$ | $W_k^t = 1$ for each element in $W^t$ |
| $R^t$ | $N \times 1$ | $R_k^t = 1$ if node $k$ is not in $H^t$, $R_k^t = 0$ otherwise |
| $Q^t$ | $N \times n^t$ | $Q_{ij}^t = 0$ if $i = d_j^t$ and $Q_{ij}^t = -1$ otherwise |
| $X_1^t$ | $2K \times n^t$ | $X_{1ei}^t = 1$ iff link $e$ is used by the flow with destination $d_i^t$ to connect source $s^t$ with the NFV node |
| $X_2^t$ | $2K \times n^t$ | $X_{2ei}^t = 1$ iff link $e$ is used to reach destination $d_i^t$ from the NFV node |
| $Y_1^t$ | $2K \times 1$ | $Y_{1e}^t = 1$ iff link $e$ is used in multicast session $t$ to connect source $s^t$ to any NFV node |
| $Y_2^t$ | $2K \times 1$ | $Y_{2e}^t = 1$ iff link $e$ is used in multicast session $t$ to connect any NFV node to any end user |
| $Z^t$ | $N \times 1$ | $Z_k^t = 1$ iff NFV is running on node $k$ in multicast session $t$ |
Figure 4.1: Example of Load balancing
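\begin{align}
\text{subject to} \quad & M (X_1^t + X_2^t) = A^t \quad (\forall t) \tag{4.3} \\
& Z^{t\top} M X_1^t = -W^t \quad (\forall t) \tag{4.4} \\
& Z^{t\top} M X_2^t = W^t \quad (\forall t) \tag{4.5} \\
& X_{1jk}^t \le Y_{1j}^t \le 1 \quad (\forall k, j, t) \tag{4.6} \\
& X_{2jk}^t \le Y_{2j}^t \le 1 \quad (\forall k, j, t) \tag{4.7} \\
& R^{t\top} Z^t = 0 \quad (\forall t) \tag{4.8} \\
& Q^t \preceq M X_1^t \preceq S^t \quad (\forall t) \tag{4.9} \\
& \sum_{t} \left( L_1^t Y_{1j}^t + L_2^t Y_{2j}^t \right) \le u C_j \quad (\forall j) \tag{4.10} \\
& X_1^t, X_2^t, Y_1^t, Y_2^t, Z^t \ \text{are binary}
\end{align}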
In this formulation, constraints (4.3)-(4.9) are the same
as those defined for the NEMP in the previous chapter, applied to each session t. (4.10) calculates the total bandwidth
consumption on each link and makes sure that the total bandwidth usage on that link is at most
the utilization rate times the capacity of that link.
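For fixed routing decisions, the smallest u that satisfies (4.10) is simply the largest load-to-capacity ratio over all links. The sketch below computes it, assuming each session stores the links used before and after the NFV as sets instead of binary vectors; the link names, bandwidths and capacities are hypothetical.

```python
def max_utilization(sessions, capacity):
    """Smallest u satisfying (4.10) when the link usage of every session is fixed."""
    load = {j: 0.0 for j in capacity}
    for sess in sessions:
        for j in sess["Y1"]:   # links carrying the flow before the NFV
            load[j] += sess["L1"]
        for j in sess["Y2"]:   # links carrying the flow after the NFV
            load[j] += sess["L2"]
    return max(load[j] / capacity[j] for j in capacity)

capacity = {(1, 3): 10, (3, 5): 10, (3, 6): 10, (1, 2): 10, (2, 6): 10}
session_a = {"L1": 4, "L2": 2, "Y1": {(1, 3)}, "Y2": {(3, 5), (3, 6)}}
session_b = {"L1": 4, "L2": 2, "Y1": {(1, 3)}, "Y2": {(3, 6)}}
rerouted_b = {"L1": 4, "L2": 2, "Y1": {(1, 2)}, "Y2": {(2, 6)}}

u_shared = max_utilization([session_a, session_b], capacity)
u_balanced = max_utilization([session_a, rerouted_b], capacity)
```

Moving the second session onto its own links halves the maximum utilization in this toy instance, which is the effect the load balancing formulation aims for.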
Algorithm 6 Load balancing algorithm
1: Sort the multicast sessions t ∈ V from highest to lowest priority
2: Assign an initial cost to the nodes and links based on their resource usage
3: For each multicast session t in V (highest to lowest priority):
4:   Run the Approximation algorithm.
5:   For each link in the network:
6:     Calculate the bandwidth usage.
7:     Update the cost of the link based on the current utilization rate of the resources.
8:   End
9: End
Our heuristic algorithm works as follows. Assume that we have a set of multicast sessions
t ∈ V, each with its own priority. One way to achieve load balancing is to first schedule
the multicast session with the highest priority, then update the link costs based on the
amount of resource remaining, and repeat until all the sessions are scheduled. Assuming that initially each
node has an NFV activation cost and each link has a link cost, the algorithm is presented as
Algorithm 6.
To update the cost of link j, we use a function that is inversely proportional to the remaining
bandwidth on the link:
\[ \frac{\rho_j}{1 - m}, \]
where ρj is a constant and m is the current utilization
rate of the node or link. With this update function, a link or node with lower load is more
likely to be used in the next round of scheduling.
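A small numeric sketch of this update rule, with a hypothetical constant ρ = 1, shows how the cost grows as a link fills up. The guard against a fully utilized link is an assumption added to keep the function well defined.

```python
def updated_cost(rho, utilization):
    """Cost rho / (1 - m), which grows as the remaining capacity shrinks."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return rho / (1.0 - utilization)

idle = updated_cost(1.0, 0.0)    # unused link keeps its base cost
half = updated_cost(1.0, 0.5)    # half-loaded link costs twice as much
busy = updated_cost(1.0, 0.75)   # three-quarters loaded link costs four times as much
```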
One limitation of the above algorithm is that it assumes that all multicast sessions have
unique priority levels. However, there may be multiple multicast sessions with
the same priority, and we need to ensure that each session is treated fairly while not exceeding the capacity
of the resources, that is, the total bandwidth consumption must not exceed the link capacity. For
instance, in Figure 4.2(a), two multicast sessions are built using the approximation algorithm
proposed above. Session 1, shown with solid arrows, has s = 9, h = 5, D = {3, 4}. Session 2, shown
with dashed arrows, has s = 9, h = {5, 7}, D = {4, 6}. Both trees consume resources on links
(9, 5) and (5, 4). Suppose the sum of the bandwidth requirements of session 1 and session 2 exceeds
the capacity of link (9, 5). We call this resource contention, since multiple multicast sessions
compete for limited link bandwidth. To solve this resource contention problem, we can
reroute one of the multicast trees to satisfy the capacity constraint.
Figure 4.2(b) shows an alternative path (dotted arrow), so one of the sessions may take this
alternative path. However, since the original multicast topology has been changed, the new
multicast topology may have a different cost from the original one. That is,
rerouting a flow may incur a reconfiguration cost.
For each overloaded link, suppose we have a set of paths $\{p_n\}$ $(n = 0, 1, \dots, N)$ which contains
the original path $p_0$ (which is the link itself) as well as $N$ disjoint alternative paths. Define the
total number of multicast session flows that the overloaded link carries as $M$, the set of
multicast sessions as $\{f_m\}$ $(m = 1, 2, \dots, M)$, and the reconfiguration cost $r_{mn}$ as:
\[ r_{mn} = \begin{cases} 0 & \text{if } n = 0,\ 1 \le m \le M \\ c[p_n, f_m] - c[p_0, f_m] & \text{if } n \ne 0,\ 1 \le m \le M \end{cases} \]
where $c[p_n, f_m]$ denotes the total cost of the multicast topology of $f_m$ if $p_0$ is replaced by $p_n$.
Define $C_l$ as the capacity of link $l$. The capacity of the path $p_n$, $C_{p_n}$, is defined as:
\[ C_{p_n} = \min_{l \in p_n} C_l \]
which is the minimum capacity of the links contained in the path. Define $B_m$ as the bandwidth
requirement of multicast flow $f_m$, and $x_{mn}$ as the binary integer variable that indicates whether flow $f_m$
is assigned to $p_n$. In order to minimize the total reconfiguration cost, we have the following
objective function:
\begin{equation}
\underset{x}{\text{minimize}} \quad \sum_{m=1}^{M} \sum_{n=0}^{N} r_{mn} x_{mn} \tag{4.11}
\end{equation}
We need to assign a path to each multicast session, which gives:
\begin{equation}
\text{subject to} \quad \sum_{n=0}^{N} x_{mn} = 1 \quad \forall\, 1 \le m \le M \tag{4.12}
\end{equation}
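And we need to make sure that the total bandwidth usage does not exceed the capacity of
each path:
\begin{equation}
\sum_{m=1}^{M} B_m x_{mn} \le C_{p_n} \quad \forall\, 0 \le n \le N \tag{4.13}
\end{equation}
Finally, this is a binary integer programming problem:
\begin{equation}
x_{mn} \in \{0, 1\} \tag{4.14}
\end{equation}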
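For very small instances, problem (4.11)-(4.14) can be solved by exhaustive search, which is useful as a reference when checking a faster method. The sketch below enumerates every assignment of flows to paths; it is exponential in the number of flows and is not the algorithm of [26]. The function name and the data are hypothetical.

```python
from itertools import product

def min_reconfiguration(r, B, C):
    """Exhaustively solve (4.11)-(4.14).

    r[m][n]: reconfiguration cost of putting flow m on path n (n = 0 is the original path)
    B[m]:    bandwidth requirement of flow m
    C[n]:    capacity of path n
    """
    num_flows, num_paths = len(B), len(C)
    best_cost, best_assign = None, None
    for assign in product(range(num_paths), repeat=num_flows):
        load = [0] * num_paths
        for m, n in enumerate(assign):
            load[n] += B[m]
        if any(load[n] > C[n] for n in range(num_paths)):
            continue  # violates the capacity constraint (4.13)
        cost = sum(r[m][n] for m, n in enumerate(assign))
        if best_cost is None or cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign

r = [[0, 3, 4],   # flow 1: costs on p0, p1, p2
     [0, 2, 5]]   # flow 2: costs on p0, p1, p2
B = [6, 5]        # together they overload p0 (capacity 10)
C = [10, 8, 6]
result = min_reconfiguration(r, B, C)
```

Here the cheapest feasible choice keeps the first flow on the original path and moves the second flow to p1, at a reconfiguration cost of 2.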
This problem is a special case of the generalized assignment problem (GAP) which is APX-
hard. For this problem, [26] presents a solution algorithm running in polynomial time with
relaxation on the capacity constraint for a similar problem. We need to revise the solution in
[26] to solve our problem. The algorithm works as follows. First, we need to convert it into
linear programming problem by relaxing the integer constraint (4.14) to
xmn ≥ 0 (4.15)
Moreover, we can add another constraint
xmn = 0 if Bm > Cpn (4.16)
which states that we will not assign $f_m$ to $p_n$ if $f_m$ requires more bandwidth than the capacity of $p_n$. Similar to the result given by [26], we have the following theorem:
Theorem 3. Provided that the linear programming problem (4.11)-(4.13), (4.15), (4.16) has a feasible solution with cost $\pi$, we can find in polynomial time an integer solution with total cost no more than $\pi$ in which each path $p_n$ carries at most $2C_{p_n}$.
Proof. First, assume that we have a feasible fractional solution $x_{mn}$, which means that path $p_n$ may carry only part of flow $f_m$. Similar to the proof of Theorem 11.1 in [26], we provide $v_n = \lceil \sum_{m=1}^{M} x_{mn} \rceil$ slots to each path $p_n$, and define the set $S = \{(n, s) : 0 \le n \le N,\ 1 \le s \le v_n\}$ to represent the slots on the paths and $U = \{1, 2, \ldots, M\}$ to represent the flows. We then build a bipartite graph $F = (U, S, E)$, where there is an edge $e$ between node $m$ and node $(n, s)$ iff $x_{mn} > 0$, and the cost of edge $e$ is set to $r_{mn}$.
$F$ still keeps the following properties [26]:
1. $F$ contains a fractional complete matching for set $U$ with cost $\pi$
2. Every integer matching of set $U$ in $F$ still has a total bandwidth bound of $2C_{p_n}$ for each path $p_n$
For 1) we use the same step as in [26]: each flow $f_m$ with $x_{mn} > 0$ is assigned to slots $(n, s)$, and $y_{m,(n,s)}$ denotes the fraction of flow $f_m$ assigned to slot $(n, s)$. Obviously, $y_{m,(n,s)} > 0$ implies $x_{mn} > 0$. For 2), we first sort all the flows on the same path $p_n$ according to their bandwidth requirement $B_m$ ($x_{mn} = 1$). Suppose that a total of $\lambda_n$ flows are assigned to path $p_n$. After sorting, we have $B_1 \ge B_2 \ge \ldots \ge B_{\lambda_n}$, and we can still use the scheduling method presented in [26] to construct the graph $F'$, which still satisfies 1). Let $B^{s,n}_{\max}$ denote the maximum bandwidth requirement among the flows assigned to slot $s$ of path $p_n$. Then the total bandwidth requirement on path $p_n$ is at most $\sum_{s=1}^{v_n} B^{s,n}_{\max}$, and we have the following derivations for each path $p_n$ with $0 \le n \le N$:
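\[
\sum_{s=1}^{v_n} B^{s,n}_{\max} \le C_{p_n} + \sum_{s=2}^{v_n} B^{s,n}_{\max} \tag{4.17}
\]
\[
C_{p_n} + \sum_{s=2}^{v_n} B^{s,n}_{\max} \le C_{p_n} + \sum_{s=1}^{v_n - 1} \sum_{m=1}^{M} y_{m,(n,s)} B_m \tag{4.18}
\]
\[
C_{p_n} + \sum_{s=1}^{v_n - 1} \sum_{m=1}^{M} y_{m,(n,s)} B_m \le C_{p_n} + \sum_{m=1}^{M} \sum_{s=1}^{v_n} y_{m,(n,s)} B_m \tag{4.19}
\]
\[
C_{p_n} + \sum_{m=1}^{M} \sum_{s=1}^{v_n} y_{m,(n,s)} B_m = C_{p_n} + \sum_{m=1}^{M} x_{mn} B_m \le 2 C_{p_n} \tag{4.20}
\]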
(4.17) comes from the fact that $B^{1,n}_{\max} \le C_{p_n}$, which follows from constraint (4.16). (4.18) comes from the fact that $B^{s+1,n}_{\max} \le \sum_{m=1}^{M} y_{m,(n,s)} B_m$: since all the flows are sorted by bandwidth from largest to smallest, the total fractional sum of bandwidth in slot $s$ must be at least the maximum bandwidth in slot $s + 1$. For (4.19) we just reverse the order of summation and add one more term on the right side. The equality in (4.20) holds because summing the bandwidth over every slot and every flow on path $p_n$ equals summing the bandwidth of all the flows on path $p_n$, and the inequality in (4.20) comes from constraint (4.13).
Figure 4.2: Example
4.4 Numerical Results
We evaluate our multicast algorithms on the real wide area network model generated by GT-ITM [22]. We assess the accuracy of the algorithms by comparing the result of the approximation algorithm with the multicast topology generated by our Branch-and-Bound method over three network models generated by GT-ITM. The network settings are presented in Table 4.2.
Table 4.2: Properties of networks
Case                 Network I   Network II   Network III
Nodes                10          35           100
Links (undirected)   14          50           127
Size of H            2           7            10
For each network topology, we randomly generate a set of nodes H and a set of end users. We
schedule all the users at the same time.
4.4.1 Cost Analysis
Tables 4.3 to 4.8 compare the total cost of the multicast topology generated by the approximation algorithm with the optimal solution generated by Branch-and-Bound for RMF and IMF. All results generated by the approximation algorithms are normalized by the optimal solution generated by our Branch-and-Bound method.
Table 4.3: Cost evaluation for NetworkI (RMF)
Number of end users   Branch and Bound   Approximation algorithm
3                     1                  1.18
4                     1                  1.21
5                     1                  1.25

Table 4.4: Cost evaluation for NetworkII (RMF)
Number of end users   Branch and Bound   Approximation algorithm
5                     1                  1.35
10                    1                  1.40
15                    1                  1.42

Table 4.5: Cost evaluation for NetworkIII (RMF)
Number of end users   Branch and Bound   Approximation algorithm
10                    1                  1.49
20                    1                  1.55
30                    1                  1.53

Table 4.6: Cost evaluation for NetworkI (IMF)
Number of end users   Branch and Bound   Approximation algorithm
3                     1                  1.16
4                     1                  1.17
5                     1                  1.27
Table 4.7: Cost evaluation for NetworkII (IMF)
Number of end users   Branch and Bound   Approximation algorithm
5                     1                  1.48
10                    1                  1.49
15                    1                  1.51

Table 4.8: Cost evaluation for NetworkIII (IMF)
Number of end users   Branch and Bound   Approximation algorithm
10                    1                  1.57
20                    1                  1.69
30                    1                  1.62

The variation in the performance of the approximation algorithm is small (within 11 percent). For Network I, the ratio of the cost of the approximation algorithm to the optimal cost increases from 1.18 to 1.25 for RMF and from 1.16 to 1.27 for IMF as the number of users increases. For Network II, the ratio increases from 1.35 to 1.42 for RMF and from 1.48 to 1.51 for IMF. For Network III, the ratio increases from 1.49 to 1.53 for RMF and from 1.57 to 1.62 for IMF between 10 and 30 users, peaking at 1.55 and 1.69 respectively at 20 users. For the small network, the approximation algorithms for both RMF and IMF keep a low normalized cost, which indicates that they perform well. Moreover, the gap between the Branch-and-Bound solution and the approximation algorithm tends to increase as the number of end users or the size of the network grows. One possible reason is that a larger number of end users or a larger network offers more ways to build the multicast topology, which makes it harder to find a good solution.
4.4.2 Running Time Analysis
Figures 4.3-4.5 show the running time of the approximation algorithm and Branch-and-Bound for IMF over the three network topologies. The running time of the approximation algorithm increases from 5.5 ms to 94 ms for Network I, from 0.75 s to 10 s for Network II, and from 1 s to 90 s for Network III. The running time of both methods increases as the number of destination users increases, and the running time of the approximation algorithm is much lower than that of Branch-and-Bound. In addition, the network size has a large effect on the running time of both methods, with a higher impact on Branch-and-Bound.
Figures 4.6-4.8 show the relationship between the total number of end users entering and the average time taken to process each end user for IMF over the three network topologies. The running time of the algorithm increases from 1.2 ms to 3.8 ms for Network I, from 4 ms to 42 ms for Network II, and from 20 ms to 0.17 s for Network III. The average processing time increases as the number of users increases; we believe the reason is that with more users there are more options for a new user to connect, so the algorithm takes longer to find a solution.
Figures 4.9-4.11 show the relationship between the total number of existing end users in the multicast group and the average time to process each leaving user with the dynamic algorithm for IMF. The running time of the algorithm increases from 1.3 ms to 3.6 ms for Network I, from 4 ms to 33 ms for Network II, and from 40 ms to 0.18 s for Network III.
The average processing time again increases as the number of users increases, and we believe the reason is similar: with more users, the algorithm takes longer to check whether the links connecting to the leaving end user carry the multicast traffic of other users.
Figure 4.3: Running time of static algorithms over three networks
Figure 4.4: Running time of static algorithms over three networks
Figure 4.5: Running time of static algorithms over three networks
Figure 4.6: Running time of dynamic heuristics over three networks
Figure 4.7: Running time of dynamic heuristics over three networks
Figure 4.8: Running time of dynamic heuristics over three networks
Figure 4.9: Running time of dynamic algorithms over three networks
Figure 4.10: Running time of dynamic algorithms over three networks
Figure 4.11: Running time of dynamic algorithms over three networks
Chapter 5
Multiple Layer NFV-enabled
Multicasting Routing Problem
5.1 Problem Statement
In the last two chapters, we considered the NFV-enabled Multicasting Routing Problem with a single layer of NFV. In practice, however, the service chain usually contains multiple layers of NFVs. In the example shown in Figure 1.2, the service chain contains three levels of middlebox processing for video streaming: DPI (deep packet inspection), NAT (network address translation), and transcoding. In this chapter, we discuss routing for the multiple-layer NFV-enabled Multicasting Routing Problem.
We are given the network topology $G(V, E)$, which consists of a node set $V$ and an edge set $E$. Let $N = \{n_1, n_2, \ldots, n_{|N|}\}$ denote the set of network functions in the service function chain, where the traffic has to traverse $n_1, n_2, \ldots, n_{|N|}$ in order. Let $H_i \subseteq V$ denote the set of candidate nodes for deploying the NFV instance of network function $n_i$, and let $H = \{H_i\}$ denote the whole collection of candidate node sets. Let $s$ denote the source host and $D$ the set of destination nodes, with $d \in D$. Moreover, we revise the definition of multicast topology to include the concept of multiple layers of NFVs:
Definition 1. Define the multicast topology as a subgraph $G' \subseteq G$. For each $G'$, there exists a function $f : D \to H_1 \times H_2 \times \ldots \times H_{|N|}$ such that for each $d \in D$, there exists a chain of NFV instances $(h_1, h_2, \ldots, h_{|N|}) \in H_1 \times H_2 \times \ldots \times H_{|N|}$ that processes the traffic before it reaches $d$.
Let $w_e$ denote the cost of link $e$ and $w^i_v$ the cost of implementing network function $n_i$ on node $v$. Different metrics can be used to derive the cost, depending on the purpose. For example, one metric is based on the utilization rate of the remaining bandwidth of the link and the utilization rate of the hardware resources of the node. Further define binary variables $x_{ei}$ $(i = 0, 1, \ldots, |N|)$: $x_{e0} = 1$ if link $e$ is used to direct the multicast flow from $s$ to NFV $n_1$ and $x_{e0} = 0$ otherwise; $x_{ei} = 1$ $(i = 1, 2, \ldots, |N| - 1)$ if link $e$ is used to direct the multicast flow from NFV $n_i$ to NFV $n_{i+1}$ and $x_{ei} = 0$ otherwise; and $x_{e|N|} = 1$ if link $e$ is used to direct the multicast flow from NFV $n_{|N|}$ to some $d \in D$ and $x_{e|N|} = 0$ otherwise. Let $z_{vi} \in \{0, 1\}$ $(i = 1, 2, \ldots, |N|)$ denote whether an instance of $n_i$ runs on node $v$. Our goal is to minimize the total cost of the multicast topology, which is defined in (5.1).
\[
\underset{x_{ei},\, z_{vi} \in \{0,1\}}{\text{minimize}} \quad \sum_{e \in E} \sum_{i=0}^{|N|} w_e x_{ei} + \sum_{i=1}^{|N|} \sum_{v \in H_i} w^i_v z_{vi} \tag{5.1}
\]
Let the binary variable $y_{eid}$ denote whether link $e$ is used to direct the traffic from $n_i$ to $n_{i+1}$ for user $d$; $y_{e0d}$ denotes whether link $e$ is used to direct the traffic from $s$ to $n_1$, and $y_{e|N|d}$ whether link $e$ is used to direct the traffic from $n_{|N|}$ to $d$. Similarly, let the binary variable $u_{vid}$ denote whether NFV $n_i$ runs on node $v$ for user $d$. Equations (5.2) and (5.3) give the relations between $y_{eid}$ and $x_{ei}$, and between $u_{vid}$ and $z_{vi}$.
\[
y_{eid} \le x_{ei} \le 1 \quad \forall e \in E,\ d \in D,\ i = 0, 1, \ldots, |N| \tag{5.2}
\]
\[
u_{vid} \le z_{vi} \le 1 \quad \forall v \in V,\ d \in D,\ i = 1, 2, \ldots, |N| \tag{5.3}
\]
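Next, define $\mathcal{Q}_i = \{Q_i \subseteq V : h_i \in Q_i,\ h_{i+1} \notin Q_i\}$ $(\forall i = 1, 2, \ldots, |N| - 1)$, $\mathcal{Q}_s = \{Q_s \subseteq V : s \in Q_s,\ h_1 \notin Q_s\}$ and $\mathcal{Q}_d = \{Q_d \subseteq V : d \in Q_d,\ h_{|N|} \notin Q_d\}$. Let $\delta(Q)$ denote the set of links in the cut of $Q$, that is, the set of edges in $G$ which have one endpoint in $Q$ and the other endpoint not in $Q$. Equations (5.4)-(5.6) make sure that there is a path from $s$ to each destination user $d \in D$ [26]:
\[
\sum_{e \in \delta(Q_s)} y_{e0d} \ge 1 \quad \forall d \in D,\ Q_s \in \mathcal{Q}_s \tag{5.4}
\]
\[
\sum_{e \in \delta(Q_i)} y_{eid} \ge 1 \quad \forall i = 1, 2, \ldots, |N| - 1,\ d \in D,\ Q_i \in \mathcal{Q}_i \tag{5.5}
\]
\[
\sum_{e \in \delta(Q_d)} y_{e|N|d} \ge 1 \quad \forall d \in D,\ Q_d \in \mathcal{Q}_d \tag{5.6}
\]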
Finally, we have to make sure that the flow traverses at least one NFV instance $h_i$ for each network function $n_i$ in the service function chain before reaching the destination user.
\[
\sum_{v \in H_i} u_{vid} \ge 1 \quad \forall d \in D,\ n_i \in N \tag{5.7}
\]
Equations (5.1)-(5.7) formally define the problem, which we call the Service Function Chain Enabled Multicast Routing Problem (SMRP).
5.1.1 Two-Approximation Algorithm
We next propose a two-approximation algorithm (TAA) based on Algorithm 1, which is described in Algorithm 7. It first searches for a minimum-cost path that connects the source to one of the destination users through the chain of NFVs, then builds a minimum spanning tree among the users and directs the flows to each end user along the minimum spanning tree.
Figure 5.1 presents an example and Theorem 4 presents the performance bound for Algorithm 7:
Theorem 4. Given the solution $T_{min}$ generated by Algorithm 7, the total cost of $T_{min}$ is no more than two times that of the optimal multicast topology $T_{opt}$ of SMRP.
To prove Theorem 4, we again use Lemmas 1 and 2 from Chapter 3. We now present the proof for Algorithm 7:
Algorithm 7 Two-Approximation algorithm (TAA)
1: $W_{min} = \infty$
2: for each $\{h_1, h_2, \ldots, h_{|N|}, d\} \in H_1 \times \ldots \times H_{|N|} \times D$ do
3:     Construct a graph $P$ by connecting $s$ to $h_1$, $h_1$ to $h_2$, ..., $h_{|N|}$ to $d$.
4:     Construct a complete graph $C$ among the end users $D$; the cost of the link between each two users is set to the minimum distance between the two users in $G$.
5:     Find the minimum spanning tree $T$ of $C$ and calculate the total link cost of $T$.
6:     Connect $P$ with $T$ and call the combined graph $G_s$ (since an end user exists in both $P$ and $T$, they can be connected). Calculate the total cost $W$ of $G_s$ defined by equation (5.1).
7:     if $W \le W_{min}$ then
8:         $W_{min} = W$, $G_{min} \leftarrow G_s$
9:     end if
10: end for
11: Replace each link in $G_{min}$ with the shortest path between its endpoints in $G$. Call the resulting multicast topology $T_{min}$.
12: Remove any unnecessary edges in $T_{min}$ to get the final output.
Proof. For ease of exposition, let $W(U)$ denote the total cost of subgraph $U$, $W_l(U) = \sum_{e \in U} w_e$ the total link cost of $U$, and $W_h(U) = \sum_{v \in U} w_v$ the total node cost of $U$. Further, let $G_{opt}$ denote the optimal multicast topology, and $T_{opt}$ the corresponding multicast tree derived from it by using Lemma 1. Therefore the total cost of $G_{opt}$ equals that of $T_{opt}$. By Lemma 2, there exists a loop $L$ such that every edge in $T_{opt}$ appears exactly twice in $L$. This loop $L$ can be decomposed into three parts:
1. A path $R_1$ which connects the source $s$ to one end user $d_1$ through a chain of NFVs.
2. A path $R_2$ which connects all the end users.
3. A path $R_3$ which connects $d_{|D|}$ back to $s$.
Since each link and node in $T_{opt}$ appears exactly twice in the loop $L$, $W(L) = 2 W(T_{opt}) \ge W_l(L)$. The path $R_1$ must pass through a chain of NFVs before reaching $d_1$; otherwise the flow is not processed. Let $\lambda(a, b)$ denote the cost of the shortest path between nodes $a$ and $b$ in $G$. Hence
\[
W(R_1) \ge W(P) = \min_{h_1, h_2, \ldots, h_{|N|}, d} \Big( \lambda(s, h_1) + \sum_{i=1}^{|N|-1} \lambda(h_i, h_{i+1}) + \lambda(h_{|N|}, d) + \sum_{i=1}^{|N|} w^i_{h_i} \Big).
\]
Ignoring the node cost of $R_2$, $R_2$ is just a path which connects all the end users, therefore $W(R_2) \ge W_l(R_2) \ge W(T)$. Therefore $W(T_{min}) \le W(G_{min}) \le W(R_1) + W_l(R_2) \le W(R_1) + W(R_2) \le W(L) = 2 W(T_{opt})$.
Figure 5.1: An example of TAA. (a) shows a network topology G(V,E) with 10 nodes and 13 links; suppose s = 10, D = {2, 3, 4, 6}. There are two layers of NFV, with H1 = {5, 7}, H2 = {1, 9}. (b) shows the path graph P. (c) is the complete graph C among the end users. Assume (d) is the minimum spanning tree T derived from C. Connecting P with T gives Gs, which is shown in (e); assume Gs = Gmin. (f) shows Tmin and (g) shows Tmin after removing the redundant edges.
An example is shown in Figure 5.2 to illustrate the proof for TAA. In Figure 5.2, the source node is 1, and the destination users are 2, 6 and 7. The service function chain involves two levels of network functions, which are located at node 3 and node 4. $G_{opt}$ is shown in Figure 5.2(a), where the paths from node 1 to the end users are (1, 3, 5, 3, 4, 7), (1, 3, 5, 3, 4, 2) and (1, 3, 5, 3, 4, 2, 6) respectively. $T_{opt}$ of equal cost can be derived as follows. First, the path to node 7 is added to $T_{opt}$, as shown in Figure 5.2(b). The path to node 2 is (1, 3, 5, 3, 4, 2); since (1, 3, 5, 3, 4) is already contained in the path to node 7, user 2 can be added by connecting node 2 to node 4, as shown in Figure 5.2(c). Similarly, node 6 can be added to $T_{opt}$ by connecting it to node 2 (Figure 5.2(d)). The total cost of $T_{opt}$ is the same as the total cost of $G_{opt}$. Figure 5.2(e) shows the paths $R_1$, $R_2$, $R_3$ in $T_{opt}$, with $W(R_1) + W(R_2) + W(R_3) = 2W(T_{opt}) = 2W(G_{opt}) \ge W(P) + W(T)$.
Figure 5.2: Example of constructing tree
Next we analyze the complexity of TAA. In the for loop (lines 2-10), we need to find the shortest paths between the NFV instances of $n_i$ and $n_{i+1}$. In line 4, to calculate the edge costs of $C$, shortest paths need to be calculated for each pair of destination users. Hence, the total complexity of TAA is $O((|D|^2 + |D| \prod_{i=1}^{|N|} |H_i|)(|V| + |E| \log |E|))$, where $|D|$ is the number of users, $|H_i|$ is the number of NFV nodes available to deploy the NFV instance of $n_i$, $|V|$ is the number of nodes and $|E|$ is the number of edges.
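The selection loop of Algorithm 7 (lines 1-10) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the names `dijkstra`, `mst_cost` and `taa_select` and the data layout (`adj[u][v]` as link cost, `node_cost[i][v]` as the cost of running function $n_{i+1}$ on node $v$) are hypothetical, the graph is assumed connected and undirected, and the cost $W$ of $G_s$ is taken as the cost of $P$ plus the cost of $T$ without merging shared links. The path expansion and pruning of lines 11-12 are omitted.

```python
import heapq
from itertools import product


def dijkstra(adj, src):
    """Shortest-path cost from src to every reachable node; adj[u] = {v: cost}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def mst_cost(nodes, lam):
    """Prim's algorithm on the complete graph over nodes with weights lam[a][b]."""
    nodes = list(nodes)
    in_tree = {nodes[0]}
    total = 0.0
    while len(in_tree) < len(nodes):
        a, b = min(((a, b) for a in in_tree for b in nodes if b not in in_tree),
                   key=lambda ab: lam[ab[0]][ab[1]])
        total += lam[a][b]
        in_tree.add(b)
    return total


def taa_select(adj, s, H, node_cost, D):
    """Enumerate every NFV chain and entry user, as in lines 1-10 of Algorithm 7.

    Returns (W_min, chain, entry_user).
    """
    lam = {u: dijkstra(adj, u) for u in adj}   # all-pairs shortest-path costs
    tree = mst_cost(D, lam)                    # T does not depend on the chain
    best = (float("inf"), None, None)
    for *chain, d in product(*H, D):
        hops = [s, *chain, d]
        w = sum(lam[a][b] for a, b in zip(hops, hops[1:]))       # link cost of P
        w += sum(node_cost[i][h] for i, h in enumerate(chain))   # node cost of P
        if w + tree < best[0]:
            best = (w + tree, tuple(chain), d)
    return best
```

The loop over `product(*H, D)` visits $|D| \prod_i |H_i|$ combinations, which is the term that makes the running time grow exponentially with the length of the service function chain.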
Algorithm 8 Heuristic Algorithm (HA)
1: Define $h'_1 = \arg\min_{h_1 \in H_1} (\lambda(s, h_1) + w^1_{h_1})$.
2: for $i = 2, \ldots, |N|$ do
3:     Define $h'_i = \arg\min_{h_i \in H_i} (\lambda(h'_{i-1}, h_i) + w^i_{h_i})$.
4: end for
5: Define $d' = \arg\min_{d \in D} (\lambda(h'_{|N|}, d) + w^{|N|}_{h_{|N|}})$.
6: Construct a graph $P$ by connecting $s$ to $h'_1$, $h'_1$ to $h'_2$, ..., $h'_{|N|}$ to $d'$.
7: Construct a complete graph $C$ among the end users $D$; the cost of the link between each two users $a$ and $b$ is set to the minimum distance between $a$ and $b$ in $G$.
8: Find the minimum spanning tree $T$ of $C$ and calculate the total link cost of $T$.
9: Connect $P$ with $T$ and call the combined graph $G_s$. Calculate the total cost $W$ of $G_s$ defined by equation (5.1).
10: Replace each link in $G_s$ with the shortest path between its endpoints in $G$. Call the resulting multicast topology $T_{min}$.
11: Remove any unnecessary edges in $T_{min}$ to get the final output.
5.1.2 Heuristic Algorithm based on TAA
As stated above, although TAA provides a bound on its performance, its complexity grows exponentially with the number of network functions $|N|$ in the service function chain. This is acceptable when the number of network functions is small. However, with a large number of network functions and a large network, the complexity is high. Hence we propose a heuristic algorithm (HA) based on TAA, which is described in Algorithm 8. In contrast to TAA, HA iteratively picks the NFV instance on each layer of the service function chain such that the sum of link cost and node cost is minimum. The complexity of HA is $O((|D|^2 + \sum_{i=1}^{|N|} |H_i|)(|V| + |E| \log |E|))$, which grows linearly with $|N|$. As shown by the simulation in Section 5.4, HA substantially reduces the running time at the cost of a 3%-7% increase in the total cost.
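The greedy layer-by-layer choice of Algorithm 8 (lines 1-5) can be sketched as below. The function name `ha_select` and the assumption that the shortest-path costs `lam[a][b]` are precomputed are illustrative and not taken from the thesis. The node-cost term in line 5 does not depend on $d$, so the sketch selects the entry user by distance alone.

```python
def ha_select(lam, s, H, node_cost, D):
    """Greedy chain selection in the spirit of Algorithm 8, lines 1-5.

    lam[a][b]:       shortest-path cost between a and b (assumed precomputed)
    H[i]:            candidate nodes for the (i+1)-th network function
    node_cost[i][v]: cost of running that function on node v
    Returns (chain, entry_user, cost_of_P).
    """
    prev, chain, cost = s, [], 0.0
    for i, candidates in enumerate(H):
        # Pick the instance minimising link cost from the previous hop plus node cost.
        h = min(candidates, key=lambda v: lam[prev][v] + node_cost[i][v])
        cost += lam[prev][h] + node_cost[i][h]
        chain.append(h)
        prev = h
    # Entry user: the destination closest to the last NFV instance.
    d = min(D, key=lambda u: lam[prev][u])
    return tuple(chain), d, cost + lam[prev][d]
```

Each layer is examined once, so the number of candidate evaluations is $\sum_i |H_i|$ rather than $\prod_i |H_i|$; the price is that an early greedy choice may rule out a cheaper chain overall.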
5.2 Build the Multicast Topology with given number of
NFV Instances
5.2.1 Problem Statement
In the last section, we investigated the Service Function Chain Enabled Multicast Routing Problem and presented a two-approximation algorithm. However, SMRP places no requirement on the number of NFV instances deployed for each network function $n_i$, and the TAA proposed above produces a solution with a single NFV instance deployed for each network function $n_i$. In practice, this solution is not reliable: once one of the NFV instances goes down, all the network traffic is disrupted. To guarantee the reliability of the multicast service, we place a lower limit on the number of NFV instances deployed for each network function $n_i$ by adding a new constraint:
\[
\sum_{d \in D} \sum_{v \in H_i} u_{vid} \ge \gamma \tag{5.8}
\]
where $\gamma$ is the minimum number of NFV instances for each network function $n_i$. We call (5.1)-(5.8) the Reliable Service Function Chain Enabled Multicast Routing Problem (RSMRP).
5.2.2 Heuristic Algorithm
Next we propose a heuristic algorithm for RSMRP. We first classify all the end users into γ
groups by running a clustering algorithm, then we run the TAA for each user group such that
each user group has different NFV instances deployed.
First we state the clustering algorithm. The clustering algorithm selects the centre of each group by using the k-center algorithm [26]. Then all the users are grouped based on their distance to the centres. The clustering algorithm is described in Algorithm 9, and the algorithm for RSMRP is described in Algorithm 10.

Algorithm 9 Clustering Algorithm (CA)
1: Pick one end user $k \in D$ at random. Let $K = \{k\}$
2: while $|K| < \gamma$ do
3:     $l_{large} = 0$
4:     for $d \in D$ do
5:         for $k \in K$ do
6:             Find the shortest distance $l_{kd}$ between $k$ and $d$
7:             if $l_{kd} \ge l_{large}$ then
8:                 $l_{large} = l_{kd}$, $d_{large} = d$
9:             end if
10:         end for
11:     end for
12:     $K = K \cup \{d_{large}\}$
13: end while
14: for $k_i \in K$ do
15:     Set $U_i = \{k_i\}$
16: end for
17: Group the other end users based on the distance between the user and $k_i$: a user is grouped into $U_i$ iff $k_i$ is the closest to the user among all $k \in K$. If more than one $k \in K$ satisfies this, randomly pick one.

Algorithm 10 Heuristic Algorithm for Group Routing (HAG)
1: $[U_1, U_2, \ldots, U_\gamma] = Clustering(D, \gamma)$
2: for each user group $U_i$ do
3:     $[h_1, h_2, \ldots, h_{|N|}] = TAA(H_1, H_2, \ldots, H_{|N|}, U_i)$, or alternatively $[h_1, h_2, \ldots, h_{|N|}] = HA(H_1, H_2, \ldots, H_{|N|}, U_i)$
4:     for $j = 1, 2, \ldots, |N|$ do
5:         $H_j = H_j \setminus \{h_j\}$
6:     end for
7: end for

$Clustering(D, \gamma)$ returns all the user groups produced by the Clustering Algorithm. We can then choose $TAA(H_1, H_2, \ldots, H_{|N|}, U_i)$ or $HA(H_1, H_2, \ldots, H_{|N|}, U_i)$ to construct the multicast topology for the users in $U_i$ based on the Two-Approximation Algorithm (Algorithm 7) or the Heuristic Algorithm (Algorithm 8) proposed above; either call returns the NFV instances it used. The used NFV instances are then removed from the sets $H_i$. This operation is repeated until all the user groups are routed. We call the resulting algorithm HAG-TAA if it uses TAA for constructing the multicast topology and HAG-HA if it uses HA.
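A compact way to realise the clustering step is the standard farthest-first form of the greedy k-center procedure, sketched below. This is an assumption about how the centre selection could be implemented rather than a transcription of Algorithm 9: each new centre is the user farthest from its nearest existing centre, ties are broken by list order instead of at random, and the function name `cluster_users`, the `first` argument and the precomputed distance table `dist` are hypothetical. The sketch assumes $\gamma \le |D|$.

```python
def cluster_users(users, dist, gamma, first=None):
    """Greedy k-center (farthest-first) clustering of end users into gamma groups.

    users: list of end users
    dist:  dist[a][b] is the shortest-path distance between a and b in G
    first: optional initial centre (Algorithm 9 picks it at random)
    Returns a dict mapping each centre to the list of users in its group.
    """
    centres = [first if first is not None else users[0]]
    while len(centres) < gamma:
        # Next centre: the user that is farthest from its nearest existing centre.
        nxt = max((u for u in users if u not in centres),
                  key=lambda u: min(dist[c][u] for c in centres))
        centres.append(nxt)
    groups = {c: [c] for c in centres}
    for u in users:
        if u in groups:
            continue
        # Assign every remaining user to its closest centre (ties: earliest centre).
        nearest = min(centres, key=lambda c: dist[c][u])
        groups[nearest].append(u)
    return groups
```

Each group can then be routed independently with TAA or HA, removing the NFV instances already used from the candidate sets as in Algorithm 10.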
Next we analyze the complexity of CA and HAG. The while loop of CA (lines 2-13) calculates the distances between the end users, and this calculation repeats $\gamma$ times; therefore the complexity of CA is $O(\gamma |D|^2 (|V| + |E| \log |E|))$. Moreover, the for loop of HAG (lines 2-7) calls TAA/HA $\gamma$ times; therefore the total complexity of HAG-TAA is $O(\gamma (|D|^2 + |D| \prod_{i=1}^{|N|} |H_i|)(|V| + |E| \log |E|))$ and the complexity of HAG-HA is $O(\gamma (|D|^2 + |D| \sum_{i=1}^{|N|} |H_i|)(|V| + |E| \log |E|))$.
5.3 SMRP with Time-variant Resource Cost
In the previous section, we assumed that the link cost $w_e$ and the node cost $w^i_v$ are fixed. However, this assumption is not realistic. For example, the bandwidth demand and the hardware resource demand of the multicast service may change over time, which causes the link cost and node cost to fluctuate. Instead of creating a static algorithm which gives a one-shot solution for SMRP, we are more interested in developing an online algorithm which adjusts the multicast topology based on the time-variant cost of links and nodes.
Let $w^{it}_v$ denote the cost of implementing the NFV instance $n_i$ on node $v$ at time $t$, and $w^t_e$ the cost of link $e$ at time $t$. Let $\theta$ denote the period of reconfiguration; during each period, the multicast topology remains fixed. The problem becomes how to build and dynamically adjust the multicast topology based on the time-variant costs of links $w^t_e$ and nodes $w^{it}_v$ such that the average total cost of the multicast topology is minimized.
5.3.1 Introduction to Markov Approximation
To solve the above problem, we leverage the idea of Markov Approximation, which is described in [27]. Markov Approximation is a general technique for solving combinatorial optimization problems: it relaxes the objective function by adding an entropy term and solves the problem by achieving efficient time-sharing among all the feasible solutions. More specifically, SMRP can be formulated as a combinatorial optimization problem as follows:
\[
\underset{p_f \ge 0}{\text{minimize}} \quad \sum_{f \in F} p_f \phi_f \qquad \text{s.t.} \quad \sum_{f \in F} p_f = 1
\]
Here $f \in F$ is a feasible solution of SMRP, and $F$ is the set of all feasible solutions, that is, all possible configurations of the multicast topology. $\phi_f$ is the total cost of configuration $f$, which is defined by (5.1), and $p_f$ is the probability that the multicast topology is in configuration $f$. The solution of the above problem is clearly $p_f = 1$ if $f = \arg\min_{f \in F} \phi_f$ and $p_f = 0$ otherwise. The objective function can be relaxed by adding an entropy term:
\[
\underset{p_f \ge 0}{\text{minimize}} \quad \sum_{f \in F} p_f \phi_f + \frac{1}{\beta} \sum_{f \in F} p_f \log(p_f)
\]
By solving the relaxed version of this problem, we have the following solution:
\[
p^*_f = \frac{\exp(-\beta \phi_f)}{\sum_{f' \in F} \exp(-\beta \phi_{f'})} \quad \forall f \in F \tag{5.9}
\]
where $\beta$ is a large constant. As $\beta$ approaches infinity, the entropy term approaches 0 and the solution approaches the optimal solution of the original problem. Given the probability of each state described above, we want to design a discrete Markov chain whose steady-state probability is the same as the probability given in the solution. Next we describe the design of the Markov chain in detail.
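The effect of $\beta$ in (5.9) can be checked numerically on a toy set of configuration costs. The sketch below is illustrative only; the helper names `gibbs_distribution` and `expected_cost` are assumptions, and the costs are shifted by their minimum before exponentiation purely for numerical stability, which leaves the distribution unchanged.

```python
import math


def gibbs_distribution(costs, beta):
    """Return p*_f = exp(-beta*phi_f) / sum_f' exp(-beta*phi_f'), as in (5.9)."""
    lo = min(costs)  # shifting by the minimum cancels out in the ratio
    weights = [math.exp(-beta * (c - lo)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def expected_cost(costs, beta):
    """Average cost sum_f p*_f * phi_f under the distribution above."""
    p = gibbs_distribution(costs, beta)
    return sum(pf * c for pf, c in zip(p, costs))
```

For $\beta = 0$ every configuration is equally likely, so the expected cost is the plain average of the costs; as $\beta$ grows, the probability mass concentrates on the cheapest configuration and the expected cost approaches the minimum.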
5.3.2 Design of the Discrete Markov Chain
Our goal is to design a discrete Markov chain whose steady-state probability satisfies (5.9). First, we have the following result for the steady state of the discrete Markov chain.
Theorem 5. For any two states $f, f' \in F$ in the discrete Markov chain, let $q_{ff'}$ ($q_{f'f}$) denote the state transition probability from state $f$ ($f'$) to state $f'$ ($f$). If $p_f q_{ff'} = p_{f'} q_{f'f}$ and $\sum_{f \in F} p_f = 1$, then the discrete Markov chain is in the steady state with steady-state probability $p_f$ $(\forall f \in F)$.
Proof. Given $p_f q_{ff'} = p_{f'} q_{f'f}$, we can sum both sides of the equation over $f' \in F$, which gives $\sum_{f' \in F} p_f q_{ff'} = \sum_{f' \in F} p_{f'} q_{f'f}$. Since $\sum_{f' \in F} q_{ff'} = 1$, the left side equals $p_f$, so the total probability of leaving $f$ equals the total probability of entering $f$. Together with the fact that the state probabilities sum to 1, this indicates that the Markov chain is in the steady state.
To achieve the steady-state probability given in (5.9) while satisfying the condition in Theorem 5, we define the state transition probability as follows:
\[
q_{ff'} \propto \exp(\beta \phi_f)
\]
More specifically, we set $q_{ff'} = \alpha \exp(\beta \phi_f)$, where $\alpha$ is a constant chosen so that the state transition probabilities are at most 1. For ease of exposition, we make the following definition:
Definition 2. Given the state $f$, which corresponds to a multicast topology with link set $f_l$ and set $f_{nfv}$ of nodes used to deploy NFVs, the state $f'$ is adjacent to state $f$ iff $f' = Routing(G \setminus e, H)$ for some $e \in f_l$ or $f' = Routing(G, H \setminus v)$ for some $v \in f_{nfv}$.
Here $Routing(G, H)$ is a function which takes $G$ and $H$ as input and returns the multicast topology based on TAA. We design the Markov chain such that there is a link between two states (i.e. $q_{ff'} > 0$) only if the two states are adjacent. Let $A(f)$ denote the set of adjacent states of $f$. The state transition probability is $q_{ff'} = \alpha \exp(\beta \phi_f)$ $(\forall f' \in A(f))$ and $q_{ff} = 1 - |A(f)| \alpha \exp(\beta \phi_f)$.
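The claim that these transition probabilities yield the distribution (5.9) can be checked numerically on a small chain. The sketch below builds the transition matrix for a toy state space and finds its steady state by repeated multiplication. It assumes that adjacency is symmetric (if $f'$ is adjacent to $f$ then $f$ is adjacent to $f'$), which the detailed balance condition of Theorem 5 needs; the function names and the toy numbers are illustrative assumptions.

```python
import math


def transition_matrix(costs, adjacency, beta, alpha):
    """Build q with q[f][g] = alpha*exp(beta*phi_f) for each adjacent g,
    and the remaining probability mass on the self-loop q[f][f]."""
    n = len(costs)
    q = [[0.0] * n for _ in range(n)]
    for f in range(n):
        rate = alpha * math.exp(beta * costs[f])
        for g in adjacency[f]:
            q[f][g] = rate
        q[f][f] = 1.0 - len(adjacency[f]) * rate
        assert q[f][f] >= 0.0, "alpha is too large for a valid probability"
    return q


def stationary(q, steps=2000):
    """Approximate the steady-state distribution by power iteration."""
    n = len(q)
    p = [1.0 / n] * n
    for _ in range(steps):
        p = [sum(p[f] * q[f][g] for f in range(n)) for g in range(n)]
    return p
```

With three mutually adjacent states of cost 1.0, 2.0 and 1.5, $\beta = 1$ and $\alpha = 0.05$, the steady state returned by `stationary` agrees with $\exp(-\beta\phi_f)$ normalised over the three states: a costly state is left quickly, so the chain spends less time in it.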
5.3.3 Online Algorithm
Next we state our design of the online algorithm (OA) to deal with the time-variant resource cost, which is described in Algorithm 11. During each period $\theta$, the OA calculates the average cost of each link and node, then calculates the total cost of the current multicast topology and the state transition probability to the other states by using $q_{ff'} = \alpha \exp(\beta \phi_f)$. The multicast topology is adjusted accordingly to achieve better performance.

Algorithm 11 Online Algorithm (OA)
1: Given the network topology $G(V, E)$, $w^{i0}_v$ and $w^0_e$ at $t = 0$, calculate the multicast topology $f$ by using the TAA, $f = Routing(G, H)$, and the total cost $\phi_f$
2: Let $k = 1$.
3: while $t > 0$ do
4:     if $t = k\theta$ then
5:         Update the resource costs $w^{i,k\theta}_v = \frac{1}{\theta} \int_{(k-1)\theta}^{k\theta} w^{it}_v \, dt$ and $w^{k\theta}_e = \frac{1}{\theta} \int_{(k-1)\theta}^{k\theta} w^t_e \, dt$
6:         Calculate the total cost of $f$ with the updated resource costs, calculate $p_{ff} = 1 - |A(f)| \alpha \exp(\beta \phi_f)$, and generate a uniform random variable $j$.
7:         if $0 \le j \le 1 - p_{ff}$ then
8:             Calculate an adjacent state $f' = Routing(G \setminus e, H)$ or $f' = Routing(G, H \setminus v)$ with the updated resource costs by randomly picking $e$ from $f_l$ or $v$ from $f_{nfv}$, switch the network topology according to $f'$, and let $f = f'$.
9:         end if
10:         $k = k + 1$
11:     end if
12: end while
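The decision taken at the end of each period (lines 6-9 of Algorithm 11) can be sketched as a single function. This is a simplified illustration, not the thesis implementation: `cost_of` and `neighbours` are hypothetical callbacks standing in for the evaluation of (5.1) under the averaged costs and for the $Routing$ calls that generate adjacent states, and the uniform random variable is assumed to be drawn from $[0, 1)$.

```python
import math
import random


def oa_step(f, cost_of, neighbours, beta, alpha, rng=random):
    """One reconfiguration decision of the online algorithm.

    f:          current configuration (any hashable value)
    cost_of:    function returning phi_f under the averaged costs of the last period
    neighbours: function returning the list A(f) of adjacent configurations
    Returns the configuration to use during the next period.
    """
    adjacent = neighbours(f)
    # Probability of leaving f, i.e. 1 - p_ff = |A(f)| * alpha * exp(beta * phi_f).
    leave_prob = len(adjacent) * alpha * math.exp(beta * cost_of(f))
    if adjacent and rng.random() < min(1.0, leave_prob):
        return rng.choice(adjacent)  # each adjacent state is equally likely
    return f
```

Calling `oa_step` once per period $\theta$ with freshly averaged costs reproduces the random walk over configurations; a smaller `alpha` makes the topology stay put more often.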
5.3.4 Reliability of Solution
The online algorithm gives a solution which periodically changes its topology based on the
current cost of resources, and multiple NFV instances are used to guarantee the reliability of
the multicast topology.
5.3.5 Cost of Reconfiguration
It is costly to reconfigure the multicast topology. In every period $\theta$, the multicast topology may be adjusted based on the current cost of resources. However, the frequency of reconfiguration can be decreased by adjusting some parameters. We can either:
1. Decrease $\alpha$ such that the probability $p_{ff}$ of staying in the same state increases.
2. Increase the period of reconfiguration $\theta$.
Figure 5.3: Timeline of Reconfiguration
5.3.6 The Effect of Time Error during State Transition
In the above sections, we described an online algorithm to deal with the time-variant cost of resources. One assumption we made is that the multicast topology can be reconfigured by the OA at each multiple of $\theta$. However, in a real network environment, it may be difficult to reconfigure the multicast topology such that the reconfiguration process finishes exactly at each multiple of $\theta$; a time error $\Delta\theta$ (e.g. caused by the time consumed by the reconfiguration process) will exist. For a network with a heavy traffic load, the reconfiguration process may take some time and make the duration of each state inaccurate. This is illustrated in Figure 5.3. Let $\Delta\theta_{max}$ denote the maximum time error; the time spent in each state may then range from $\theta - \Delta\theta_{max}$ to $\theta + \Delta\theta_{max}$. This time error affects the overall performance of the system, since the fraction of time ($p_f$) spent in each state also changes. Next we give a quantitative analysis of this issue.
Define $M = \{m_1, m_2, \ldots, m_{|M|}\}$ as the set of multicast topologies that are used, in chronological order. For example, $m_2$ is used after $m_1$ and is followed by $m_3$, etc. Let $\theta_{total}$ denote the total time that the multicast topology exists, $\theta_m$ the time that multicast topology $m$ exists ideally, and $\Delta\theta_m$ the time error of the reconfiguration process for multicast topology $m$. Therefore $\theta_m + \Delta\theta_m$ is the time that $m$ exists in reality, and we have $\theta_{total} = \sum_{m \in M} \theta_m$. We are interested in finding the maximum change in the average cost caused by the time error $\Delta\theta$. More specifically, we want to maximize the average cost, which is given by the following objective:
\[
\underset{\Delta\theta}{\text{maximize}} \quad \sum_{m \in M} \frac{\theta_m + \Delta\theta_m}{\theta_{total}} \phi_m \tag{5.10}
\]
The sum of all the time errors must equal 0, since the total time $\theta_{total}$ is fixed:
\[
\sum_{m \in M} \Delta\theta_m = 0 \tag{5.11}
\]
Finally, the time error is bounded:
\[
-\Delta\theta_{max} \le \Delta\theta_m \le \Delta\theta_{max} \tag{5.12}
\]
Rank the total cost of the multicast topology φm from highest to lowest, call this ordered set
of φm Φ. Theorem 6 gives the solution of the above problem.
Theorem 6. Denote the ideal total cost generated by OA W , then the maximum cost caused
by the time error M θ is W + Mθmax
θtotal(∑|Φ|
m=d|Φ|/2e φm −∑b|Φ|/2c
m=1 φm)
Proof. The problem (5.10)− (5.12) is a linear programming problem and we solve the problem
by using Karush Kuhn Tucker (KKT) conditions. And we have the following equations:
− φmθtotal
+ a+ bm − cm = 0 ∀m ∈M∑m∈M
M θm = 0
bm, cm ≥ 0 ∀m ∈M
bm(M θm− M θmax) = 0 ∀m ∈M
cm(M θm+ M θmax) = 0 ∀m ∈M
Contents 68
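If $|\Phi|$ is even, the above equations have the following solution:
\[
\begin{aligned}
\Delta\theta_m &= \Delta\theta_{\max} && \text{if } \tfrac{|\Phi|}{2} + 1 \le m \le |\Phi| \\
\Delta\theta_m &= -\Delta\theta_{\max} && \text{if } 1 \le m \le \tfrac{|\Phi|}{2} \\
b_m &= \frac{\phi_m}{\theta_{total}} - \frac{\phi_{|\Phi|/2} + \phi_{|\Phi|/2+1}}{2\theta_{total}}, \quad c_m = 0 && \text{if } \tfrac{|\Phi|}{2} + 1 \le m \le |\Phi| \\
c_m &= \frac{\phi_{|\Phi|/2} + \phi_{|\Phi|/2+1}}{2\theta_{total}} - \frac{\phi_m}{\theta_{total}}, \quad b_m = 0 && \text{if } 1 \le m \le \tfrac{|\Phi|}{2} \\
a &= \frac{1}{2\theta_{total}} \left( \phi_{|\Phi|/2} + \phi_{|\Phi|/2+1} \right)
\end{aligned}
\]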
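If $|\Phi|$ is odd, the above equations have the following solution:
\[
\begin{aligned}
\Delta\theta_m &= \Delta\theta_{\max} && \text{if } \tfrac{|\Phi|+1}{2} + 1 \le m \le |\Phi| \\
\Delta\theta_m &= -\Delta\theta_{\max} && \text{if } 1 \le m \le \tfrac{|\Phi|-1}{2} \\
\Delta\theta_m &= 0 && \text{if } m = \tfrac{|\Phi|+1}{2} \\
b_m &= \frac{\phi_m}{\theta_{total}} - \frac{\phi_{(|\Phi|+1)/2}}{\theta_{total}}, \quad c_m = 0 && \text{if } \tfrac{|\Phi|+1}{2} \le m \le |\Phi| \\
c_m &= \frac{\phi_{(|\Phi|+1)/2}}{\theta_{total}} - \frac{\phi_m}{\theta_{total}}, \quad b_m = 0 && \text{if } 1 \le m \le \tfrac{|\Phi|-1}{2} \\
a &= \frac{\phi_{(|\Phi|+1)/2}}{\theta_{total}}
\end{aligned}
\]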
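Substituting the above solutions into the objective function, the maximum cost equals
\[
W + \frac{\Delta\theta_{\max}}{\theta_{total}} \left( \sum_{m=\lceil |\Phi|/2 \rceil + 1}^{|\Phi|} \phi_m - \sum_{m=1}^{\lfloor |\Phi|/2 \rfloor} \phi_m \right).
\]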
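The assignment of time errors found in the proof can be checked numerically on small instances. The following is a minimal sketch, not part of the original evaluation: the function names `worst_case_cost` and `brute_force` are hypothetical, and the brute-force check relies on the assumption that the optimum of the linear program is attained with every $\Delta\theta_m$ equal to $-\Delta\theta_{\max}$, $0$ or $+\Delta\theta_{\max}$, as in the solution above.

```python
from itertools import product


def worst_case_cost(phi, theta, d_max):
    """Worst-case average cost when each time error is bounded by d_max.

    phi[m] is the total cost of topology m, theta[m] its ideal duration.
    """
    theta_total = sum(theta)
    # Ideal average cost W: each topology weighted by its share of time.
    w = sum(p * t for p, t in zip(phi, theta)) / theta_total
    ranked = sorted(phi)  # ascending, so the costliest topologies come last
    half = len(ranked) // 2
    # The costliest half gains d_max of time and the cheapest half loses
    # d_max; with an odd count the middle topology keeps its ideal time.
    gain = sum(ranked[len(ranked) - half:]) - sum(ranked[:half])
    return w + d_max / theta_total * gain


def brute_force(phi, theta, d_max):
    """Enumerate errors in {-d_max, 0, +d_max} that sum to zero."""
    theta_total = sum(theta)
    best = float("-inf")
    for errs in product((-1, 0, 1), repeat=len(phi)):
        if sum(errs) != 0:
            continue  # violates the zero-sum constraint (5.11)
        cost = sum(p * (t + e * d_max) for p, t, e in zip(phi, theta, errs))
        best = max(best, cost / theta_total)
    return best
```

The enumeration grows as $3^{|M|}$, so it is only meant as a sanity check for a handful of topologies; the sorted assignment itself costs one sort.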
5.4 Numerical Results
To demonstrate the performance of the algorithms proposed above, we evaluate them over three network models: Abilene (11 nodes, 13 links), GEANT (23 nodes, 74 links) and a WAN model (100 nodes, 127 links) generated by the GT-ITM software tool [23]. We set $|H_i| = 2$ $(1 \le i \le |N|)$ in Abilene, $|H_i| = 5$ $(1 \le i \le |N|)$ in GEANT, and $|H_i| = 2$ $(1 \le i \le |N|)$ for GT-ITM. We compare the algorithms using a metric called the Cost Deduction Ratio (CDR): if the total cost generated by TAA is $\pi_2$ and the total cost generated by the benchmark algorithm HA is $\pi_1$, then CDR is defined as in (5.13). All simulations are run 100 times and the average results are presented.
\[
\mathrm{CDR} = \frac{\pi_1 - \pi_2}{\pi_1} \tag{5.13}
\]

Algorithm 12 Benchmark Algorithm (BA)
1: for $d \in D$ do
2:   Define $h'_1 = \arg\min_{h_1 \in H_1} \left( \lambda(s, h_1) + w^1_{h_1} \right)$.
3:   for $i = 2, \ldots, |N|$ do
4:     Define $h'_i = \arg\min_{h_i \in H_i} \left( \lambda(h'_{i-1}, h_i) + w^i_{h_i} \right)$.
5:   end for
6:   Define $d' = \arg\min_{d \in D} \left( \lambda(h'_{|N|}, d) + w^{|N|}_{h_{|N|}} \right)$.
7:   Connect $s$ to $h'_1$, $h'_1$ to $h'_2$, ..., $h'_{|N|}$ to $d'$.
8: end for
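The greedy chain selection of the Benchmark Algorithm (Algorithm 12) can be sketched in Python as follows. This is an illustrative reading rather than the implementation used in the simulations: the function names and the adjacency-dictionary graph format are hypothetical, $\lambda(u, v)$ is assumed to be the cost of the cheapest path between $u$ and $v$, the graph is assumed to be connected, and the last chosen node is connected to each end user in turn, which is one interpretation of line 6.

```python
import heapq


def shortest_costs(graph, src):
    """Dijkstra: cost of the cheapest path from src to each reachable node."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def benchmark_chain(graph, s, candidates, deploy_cost):
    """Greedily pick one node per NFV in chain order (lines 2-5).

    candidates[i] plays the role of H_{i+1}; deploy_cost[i][h] is the cost
    of deploying NFV i+1 on node h.
    """
    chain, total, prev = [], 0.0, s
    for cands, cost in zip(candidates, deploy_cost):
        lam = shortest_costs(graph, prev)
        best = min(cands, key=lambda h: lam[h] + cost[h])
        total += lam[best] + cost[best]
        chain.append(best)
        prev = best
    return chain, total


def benchmark_algorithm(graph, s, users, candidates, deploy_cost):
    """Cost of the route s -> h'_1 -> ... -> h'_|N| -> d for each end user d."""
    chain, chain_cost = benchmark_chain(graph, s, candidates, deploy_cost)
    lam = shortest_costs(graph, chain[-1] if chain else s)
    return chain, {d: chain_cost + lam[d] for d in users}
```

Because each step only looks at the next NFV, the chain is chosen without regard to where the end users are, which is why BA serves as a baseline rather than as a competitive method.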
To compare the performance of TAA and HA, we create another Benchmark Algorithm (BA). For each end user, BA finds the nodes on which to deploy the NFVs, as well as the paths, such that the total cost is smallest; BA is described in Algorithm 12. Similarly, we define CDR' as the CDR in which $\pi_1$ is the total cost generated by BA and $\pi_2$ is the total cost generated by TAA. Tables 5.1-5.3 show the performance of TAA, HA and BA for different numbers of end users over different network topologies. CDR' ranges from 12% to 48% across network topologies and numbers of end users, which shows the efficiency of TAA. Moreover, the CDR ranges from 3% to 7% across all network topologies and numbers of end users, which demonstrates that the performance of HA and TAA is close. Figures 5.4-5.6 show the running time of TAA and HA for different numbers of end users over different network topologies. The running time of HA is much lower than that of TAA. The running time of both algorithms increases as the number of end users increases; this is reasonable, since more calculation is needed as more end users need to be scheduled.

Table 5.1: Performance of Algorithms
Network   Users   2        3        4        5        6
Abilene   CDR     0.0643   0.0707   0.0495   0.0498   0.0562
          CDR'    0.127    0.188    0.219    0.331    0.404

Table 5.2: Performance of Algorithms
Network   Users   5        8        10       12       15
GEANT     CDR     0.0344   0.0386   0.0413   0.0382   0.0649
          CDR'    0.144    0.160    0.227    0.378    0.448

Table 5.3: Performance of Algorithms
Network   Users   10       20       30       40       50
GTITM     CDR     0.0604   0.0577   0.0523   0.0637   0.0576
          CDR'    0.161    0.198    0.277    0.368    0.479

Figure 5.4: Running Time Comparison on Abilene (running time of TAA and HA, logarithmic axis, versus number of end users from 2 to 6)
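The CDR values reported in Tables 5.1-5.3 follow definition (5.13). A minimal sketch of the computation is given below; the helper names are hypothetical, and averaging the per-run ratios (rather than taking the ratio of averaged costs) is an assumption, since the text only states that results are averaged over 100 runs.

```python
from statistics import mean


def cdr(pi_1, pi_2):
    """Cost Deduction Ratio (5.13): relative saving of pi_2 over pi_1."""
    if pi_1 <= 0:
        raise ValueError("baseline cost pi_1 must be positive")
    return (pi_1 - pi_2) / pi_1


def average_cdr(baseline_costs, candidate_costs):
    """Mean CDR over paired runs, one (baseline, candidate) pair per run."""
    return mean(cdr(b, c) for b, c in zip(baseline_costs, candidate_costs))
```

For CDR the baseline is HA and the candidate is TAA; for CDR' the baseline is BA and the candidate is again TAA.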
Next we evaluate the performance of HAG by measuring the total cost generated by HAG-TAA and HAG-HA for different numbers of clusters $\gamma$, given a fixed number of end users. Tables 5.4-5.5 show the results for the GEANT network with 15 end users and the GTITM network with 50 end users. Here CDR is defined with $\pi_1$ the total cost of HAG-HA and $\pi_2$ the total cost of HAG-TAA. As expected, these data show a result similar to Tables 5.1-5.3. This is because HAG still uses TAA and HA to build the multicast tree for each cluster; the performance of HAG-TAA and HAG-HA should therefore follow a similar tendency.

Table 5.4: Performance of HAG
Network   Groups   2        3        4        5
GEANT     CDR      0.0741   0.0523   0.0412   0.0557

Figure 5.5: Running Time Comparison on GEANT (running time of TAA and HA, logarithmic axis, versus number of end users from 5 to 15)

Figure 5.6: Running Time Comparison on GTITM (running time of TAA and HA, logarithmic axis, versus number of end users from 10 to 50)
Table 5.5: Performance of HAG
Network   Groups   5        8        10       15
GTITM     CDR      0.0346   0.0478   0.0335   0.0580

Finally, we evaluate the performance of OA by measuring its average cost. During each round, each link and each node is assigned a cost, $w_e$ and $w^i_v$ respectively, drawn from a probability distribution. In the Gaussian case, each node is assigned a cost to deploy the NFV that is Gaussian distributed with mean 10 and variance 1. If the cost of a node or link is less than 0, it is set to 0. We evaluate the cost over 1000 rounds. As mentioned above, we can control the parameter $\alpha$ to decrease the number of reconfigurations so that the reconfiguration cost is small. We compare the performance of OA and TAA for different numbers of reconfigurations on GEANT and GTITM by adjusting $\alpha$. For TAA, the multicast topology is calculated using the costs $w^{i0}_v$ and $w^{0}_e$ and remains fixed over the 1000 rounds. The CDR is defined as in (5.13), with $\pi_1$ the average cost of TAA over all the rounds and $\pi_2$ the average cost of OA. Tables 5.6-5.9 show the relation between the number of reconfigurations and CDR over different network topologies.

Table 5.6: Performance of OA
GEANT (Gaussian cost)   Reconfigurations   50     100    200    400    600
                        CDR                0.13   0.16   0.23   0.29   0.36

Table 5.7: Performance of OA
GEANT (Uniform cost)    Reconfigurations   50     100    200    400    600
                        CDR                0.17   0.20   0.27   0.33   0.38

Table 5.8: Performance of OA
GTITM (Gaussian cost)   Reconfigurations   50     100    200    400    600
                        CDR                0.18   0.20   0.27   0.30   0.39

Table 5.9: Performance of OA
GTITM (Uniform cost)    Reconfigurations   50     100    200    400    600
                        CDR                0.27   0.33   0.36   0.42   0.48

During each round, each link and each node is assigned a cost that is either Gaussian distributed or uniformly distributed. For the Gaussian distribution, each link is assigned a cost $w_e$ with mean 1 and variance 0.1, and each node is assigned a cost $w^i_v$ with mean 10 and variance 1. For the uniform distribution, each link is assigned a uniformly distributed cost with mean 1 and variance 0.1, and each node is assigned a uniformly distributed cost $w^i_v$ with mean 10 and variance 1.

The CDR increases as the number of reconfigurations increases. This is because, under a limited number of rounds (1000 rounds), the multicast topology is adjusted more frequently to the current cost of the links and nodes as the number of reconfigurations increases. However, according to the discussion above, the average cost of the multicast topology does not depend on $\alpha$ as the number of rounds approaches infinity. Moreover, the CDR generated on GTITM is larger than the CDR generated on GEANT, because GTITM is larger than GEANT. Hence the variance of the cost of a multicast topology built on GTITM, which equals the sum of more random variables, is larger than the variance of the cost of a multicast topology built on GEANT. Therefore, a larger number of reconfigurations saves more on GTITM than on GEANT.
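The per-round cost model used in this experiment can be sketched as follows. This is an illustration under stated assumptions, not the simulation code: the function names are hypothetical, a single cost per node is drawn (the dependence of $w^i_v$ on the NFV index $i$ is omitted), and the uniform costs are drawn from the symmetric interval whose half-width $h$ satisfies $h^2/3 = \sigma^2$, which is the interval implied by a given mean and variance.

```python
import math
import random


def gaussian_cost(rng, mean, variance):
    """Gaussian cost, set to 0 when the draw is negative."""
    return max(0.0, rng.gauss(mean, math.sqrt(variance)))


def uniform_cost(rng, mean, variance):
    """Uniform cost with the given mean and variance.

    A uniform law on [mean - h, mean + h] has variance h**2 / 3,
    so the half-width is h = sqrt(3 * variance).
    """
    half_width = math.sqrt(3.0 * variance)
    return max(0.0, rng.uniform(mean - half_width, mean + half_width))


def sample_round(rng, links, nodes, draw):
    """Costs for one round: links with (mean 1, variance 0.1),
    nodes with (mean 10, variance 1), using the sampler `draw`."""
    link_cost = {e: draw(rng, 1.0, 0.1) for e in links}
    node_cost = {v: draw(rng, 10.0, 1.0) for v in nodes}
    return link_cost, node_cost
```

With these parameters the uniform link cost lies in roughly $[0.45, 1.55]$ and the uniform node cost in roughly $[8.27, 11.73]$, so the clipping at 0 only ever acts on the Gaussian draws.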
Chapter 6
Conclusions and Future Work
6.1 Conclusions
Many multicast services such as live multimedia distribution and real-time event monitoring require multicast mechanisms that involve network functions (e.g. firewall, video transcoding). Network Function Virtualization (NFV) is a concept that proposes using virtualization to implement network functions on infrastructure building blocks (such as high volume servers, virtual machines), where software provides the functionality of existing purpose-built network equipment. In this work, we present routing algorithms for building an NFV-enabled multicast topology on SDN. We consider different scenarios of the routing problem and propose solutions for each of them. First we study a simple yet important case where a single NFV processing step is involved. This case already applies to many scenarios such as video transcoding, packet filtering and intrusion detection. We propose an algorithm for building an NFV-enabled multicast mechanism on SDN for both dynamic and static scenarios. Then we extend the original problem to consider the joint placement of NFVs and routing in the service chain.
Finally, we present an online algorithm, based on the idea of Markov approximation, to deal with the scenario in which the costs of links and nodes are time-variant. The simulations indicate substantial savings in total cost from our multicast routing algorithms (TAA reduces the total cost by 12% to 48% relative to the benchmark algorithm BA), and a preliminary implementation of the multicast framework has been deployed on the testbed.
6.2 Future Work
For future work, there are three main areas we can continue to improve:
1. One extension is to include a delay constraint in the formulation of the problem. Ensuring that the assigned requests meet their deadlines is essential for real-time multicast services such as live events and gaming. More specifically, the total delay consists of the transmission delay of the links and the processing delay of the nodes. Furthermore, previous work has shown that NFV may cause abnormal delay variations and throughput instability [28], which makes the problem even more complicated.
2. Another direction in which our work can be extended is the efficient use of heterogeneous resources by NFV instances. Infrastructure components such as servers, switches and virtual machines execute software to provide the functionality of network elements and appliances, such as NAT, proxy servers, video transcoders and deep packet inspection (DPI). However, processing packets for these network functions needs multiple hardware resources, such as CPU, memory and disk storage. Furthermore, different NFV instances may consume different amounts of hardware resources. How to efficiently deploy and use these limited hardware resources to obtain the maximum profit while maintaining the quality of service is another important problem.
3. Finally, the robustness of the NFV service is always an important research topic. If an NFV instance does not work properly or a link goes down abruptly, how to quickly reconfigure the multicast topology to minimize the service interruption, while keeping the total cost of the multicast topology small, is another promising research area.
Bibliography
[1] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, J. Turner. "OpenFlow: Enabling Innovation in Campus Networks", ACM SIGCOMM Computer Communication Review, April 2008.
[2] B. Han, et al. "Network function virtualization: Challenges and opportunities for innovations", IEEE Communications Magazine, Volume 53, Issue 2.
[3] L. Kou, G. Markowsky, L. Berman. "A fast algorithm for Steiner trees", Acta Informatica, Volume 15, Issue 2, pp. 141-145, 1981.
[4] S. Hougardy, H. Prömel. "A 1.598 Approximation Algorithm for the Steiner Problem in Graphs", in the Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 1999.
[5] G. Rouskas, I. Baldine. "Multicast Routing with End-to-End Delay and Delay Variation Constraints", IEEE Journal on Selected Areas in Communications, Vol. 15, No. 3, 1997.
[6] R. Mukherjee, J.W. Atwood. "Rendezvous point relocation in protocol independent multicast - sparse mode", 10th International Conference on Telecommunications (ICT), 2003.
[7] J.-F. Macq, L.A. Wolsey, B. Macq. "A rendezvous point selection algorithm for videoconferencing applications", IEEE International Conference on Multimedia and Expo (ICME), 2002.
[8] S. Bemby, H. Lu, K. Zadeh, H. Bannazadeh, A. Leon-Garcia. "ViNO: SDN Overlay to Allow Seamless Migration Across Heterogeneous Infrastructure", IM 2015.
[9] A. Iyer, P. Kumar, V. Mann. "Avalanche: Data center Multicast using software defined networking", in the Proceedings of IEEE COMSNETS, 2014.
[10] J. Wen-Kang, L. Chun Wang. "A Unified Unicast and Multicast Routing and Forwarding Algorithm for Software-Defined Datacenter Networks", IEEE Journal on Selected Areas in Communications, Vol. 31, No. 12, December 2013.
[11] X. Li, M. Freedman. "Scaling IP multicast on datacenter topologies", in the Proceedings of ACM CoNEXT, 2013.
[12] S. Shen, L. Huang, D. Yang, W. Chen. "Reliable Multicast Routing for Software-Defined Networks", in the Proceedings of IEEE INFOCOM, 2015.
[13] J.E. Klinker. "Multicast tree construction in directed networks", in the Proceedings of IEEE MILCOM, 1996.
[14] M. Castro, A.-M. Kermarrec, A.I.T. Rowstron. "Scribe: a large-scale and decentralized application-level multicast infrastructure", IEEE Journal on Selected Areas in Communications, Volume 20, Issue 8.
[15] W. Ralph, Z. Martina. "Multicast communications: protocol and applications", Morgan Kaufmann Publishers Inc., May 2000.
[16] K. Stachowiak, P. Zwierzykowski. "Rendezvous point based approach to the multi-constrained multicast routing problem", International Journal of Electronics and Communications, June 2014.
[17] A. Jacobson, et al. "OpenNF: Enabling Innovation in Network Function Control", in the Proceedings of ACM SIGCOMM, 2014.
[18] J. Martins, et al. "ClickOS and the Art of Network Function Virtualization", in the Proceedings of USENIX NSDI, 2014.
[19] Y. Zhang, et al. "StEERING: A software-defined networking for inline service chaining", in the Proceedings of IEEE ICNP, 2013.
[20] M. Mangili, et al. "Stochastic Planning for Content Delivery: Unveiling the Benefits of Network Functions Virtualization", IEEE ICNP, 2014.
[21] M. Bouet, et al. "Cost-based placement of vDPI functions in NFV infrastructures", in the Proceedings of IEEE NetSoft, 2015.
[22] M.C. Luizelli, et al. "Piecing together the NFV provisioning puzzle: Efficient placement and chaining of virtual network functions", in the Proceedings of IEEE IM, 2015.
[23] "Modeling Topology of Large Internetworks", http://www.cc.gatech.edu/projects/gtitm/
[24] J. Kang, H. Bannazadeh, A. Leon-Garcia. "SAVI testbed: Control and management of converged virtual ICT resources", IM 2013.
[25] K. Sayood. "Introduction to data compression", Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005. ISBN: 012620862X.
[26] D. Williamson, B. Shmoys. "The Design of Approximation Algorithms", Cambridge University Press.
[27] M. Chen, et al. "Markov Approximation for Combinatorial Network Optimization", in the Proceedings of IEEE INFOCOM, 2010.
[28] G. Wang, T. S. E. Ng. "The Impact of Virtualization on Network Performance of Amazon EC2 Data Center", in the Proceedings of IEEE INFOCOM, 2010.
[29] S. Zhang, et al. "Network Function Virtualization Enabled Multicast Routing on SDN", in the Proceedings of IEEE ICC, 2015.
[30] S. Zhang, et al. "Routing algorithms for network function virtualization enabled multicast topology on SDN", IEEE Transactions on Network and Service Management.
[31] S. Zhang, et al. "Joint NFV Placement and Routing for Multicast Service on SDN", in the Proceedings of IEEE NOMS, 2016.