

A TECHNICAL SEMINAR REPORT

SUBMITTED BY

Gunasekaran, D (621513510002)

in partial fulfillment for the award of the degree

of

MASTER OF TECHNOLOGY

in

INFORMATION TECHNOLOGY

MAHENDRA COLLEGE OF ENGINEERING

SALEM

ANNA UNIVERSITY: CHENNAI 600 025

MAY 2014

RECENT ADVANCEMENTS IN CLOUD COMPUTING: A CASE STUDY APPROACH


ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this technical seminar report “RECENT ADVANCEMENTS IN CLOUD COMPUTING: A CASE STUDY APPROACH” is the bonafide work of

Gunasekaran, D (621513510002)

who carried out the work under my supervision.

Submitted for the technical seminar held on __________

---------------------- --------------------- Internal Examiner External Examiner

SIGNATURE

Mrs. A.LOGANAYAKI, M.E.,

SUPERVISOR

Assistant Professor,

Department of Information

Technology,

Mahendra College of Engineering,

Minnampalli, Salem

SIGNATURE

HEAD OF THE DEPARTMENT

Department of Information

Technology,

Mahendra College of Engineering,

Minnampalli, Salem


ABSTRACT

Cloud computing is the latest buzzword in the IT industry. This Internet-based technology, with its flexibility, capacity, and processing power, has realised the service-oriented idea and created a new ecosystem in the computing world. Cloud capabilities have moved the IT industry a giant step forward, and large, well-known enterprises have adopted cloud computing, transferring their processing and storage to it. This report discusses the basics of cloud computing and then reviews two important and emerging aspects of it: performance evaluation and network virtualisation. Because of the popularity and spread of the cloud across organisations, performance evaluation is of special importance and can help users make the right decisions.

Network virtualisation is key to the current and future success of cloud computing. The report reviews the key reasons for virtualisation and several networking technologies that have recently been developed, or are being developed in various standards bodies, including software defined networking (SDN), which is the key to network programmability. OpenADN, a scheme for application delivery in a multi-cloud environment, is also briefly reviewed.

Finally, the design and implementation of Baadal, an academic cloud built in-house at IIT Delhi, is reviewed as a case study to highlight recent achievements within the sphere of cloud computing.


ACKNOWLEDGEMENT

I take immense pleasure in expressing my humble note of gratitude to our

honourable Chairman Shri.M.G.BHARATHKUMAR, M.A., B.Ed., and our

young and dynamic Managing Directors Er.Ba.MAHENDHIRAN, B.E., and

Er.Ba.MAHA AJAY PRASATH, B.E.,M.S(U.S.A)., who have provided excellent

facilities to complete the technical seminar successfully.

I also express my gratitude and thanks to our honourable Principal

Dr.R.ASOKAN, M.Tech., Ph.D., F.I.E., F.T.A., for providing all facilities for

carrying out the technical seminar work.

I take immense pleasure in expressing my heart-felt gratitude to our Dean,

Dr. S.KRISHNAKUMAR, M.Tech., Ph.D., for his guidance and sustained

encouragement for the successful completion of this report.

I wish to express my sense of gratitude and sincere thanks to our Head of the

Department Dr. N.SATISH, M.E., Ph.D., of Information Technology for his

valuable guidance and resources provided for completion of the technical seminar

report.

I express my profound sense of thanks with deepest respect and gratitude to

my Guide Mrs. A.LOGANAYAKI, M.E., Assistant Professor, Department of

Information technology for her valuable and precious guidance for completion of

this report.


TABLE OF CONTENTS

Abstract
1  Introduction
2  Cloud Computing
3  Evolution and Potential
4  Virtualisation
5  Performance Evaluation
6  Network Virtualisation
7  Design and Implementation of Academic Cloud: Baadal at IIT Delhi
8  Conclusion
References


CHAPTER 1

INTRODUCTION

As more aspects of our work and life move online, and the Web expands beyond a communication medium to become a platform for business and society, a new paradigm of large-scale distributed computing has emerged. Cloud computing has very quickly become one of the hottest topics, if not the hottest one, for practising engineers and academics building large-scale networks and Internet applications in domains related to engineering, science, and art. Nowadays, everyone is talking about cloud computing. In academia, numerous research papers, tutorials, workshops, and panels on this emerging topic have been presented at major conferences and published in top-level computer science journals and magazines, and several universities have added courses dedicated to cloud computing principles. A plethora of blogs, forums, and discussion groups on the subject is available on the Web. In industry, companies are devoting great resources to cloud computing, either by building their own infrastructures or by developing innovative cloud services.

Cloud computing is a new multidisciplinary research field, considered to be the evolution and convergence of several independent computing trends such as Internet delivery, “pay-as-you-go” utility computing, elasticity, virtualisation, grid computing, distributed computing, storage, content outsourcing, security, and Web 2.0. However, the multidisciplinary nature of cloud computing has raised questions in the research community about how novel this paradigm really is, because it includes almost everything that existing technologies already do. This report attempts to demystify cloud computing, highlight its innovative aspects, and identify its major technical and non-technical challenges.


CHAPTER 2

CLOUD COMPUTING

2.1 Definition

Even though we cannot precisely define the cloud because it is an evolving paradigm, the US National Institute of Standards and Technology’s definition covers the most important aspects of the cloud vision. NIST [1] defines cloud computing as “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction”. This cloud model is composed of five essential characteristics, three service models, and four deployment models.

2.2 Essential Characteristics

On-demand self-service. A consumer can unilaterally provision computing

capabilities, such as server time and network storage, as needed automatically

without requiring human interaction with each service provider.

Broad network access. Capabilities are available over the network and

accessed through standard mechanisms that promote use by heterogeneous thin or

thick client platforms (e.g., mobile phones, tablets, laptops, and workstations).

Resource pooling. The provider’s computing resources are pooled to serve

multiple consumers using a multi-tenant model, with different physical and virtual

resources dynamically assigned and reassigned according to consumer demand.

There is a sense of location independence in that the customer generally has no

control or knowledge over the exact location of the provided resources but may be

able to specify location at a higher level of abstraction (e.g., country, state, or data

centre). Examples of resources include storage, processing, memory, and network

bandwidth.


Rapid elasticity. Capabilities can be elastically provisioned and released, in

some cases automatically, to scale rapidly outward and inward commensurate with

demand. To the consumer, the capabilities available for provisioning often appear to

be unlimited and can be appropriated in any quantity at any time.

Measured service. Cloud systems automatically control and optimize

resource use by leveraging a metering capability at some level of abstraction

appropriate to the type of service (e.g., storage, processing, bandwidth, and active

user accounts). Resource usage can be monitored, controlled, and reported,

providing transparency for both the provider and consumer of the utilized service.
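To make the metering idea concrete, the following is a minimal illustrative sketch (not taken from any particular provider) of how per-tenant usage records might be aggregated and turned into a charge; the record format and per-unit rates are assumptions chosen purely for illustration.

    from collections import defaultdict

    # Hypothetical per-unit rates, for illustration only.
    RATES = {"cpu_hours": 0.05, "storage_gb_months": 0.02, "bandwidth_gb": 0.01}

    def aggregate_usage(records):
        """Sum raw usage records (tenant, metric, amount) per tenant and metric."""
        usage = defaultdict(lambda: defaultdict(float))
        for tenant, metric, amount in records:
            usage[tenant][metric] += amount
        return usage

    def bill(usage):
        """Convert aggregated usage into a per-tenant charge using RATES."""
        return {tenant: sum(RATES[m] * v for m, v in metrics.items())
                for tenant, metrics in usage.items()}

    records = [("tenant-a", "cpu_hours", 120.0),
               ("tenant-a", "storage_gb_months", 500.0),
               ("tenant-b", "bandwidth_gb", 80.0)]
    print(bill(aggregate_usage(records)))   # {'tenant-a': 16.0, 'tenant-b': 0.8}

Such a metering loop is what gives both provider and consumer the transparency described above: the same aggregated records drive monitoring, reporting, and charging.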

2.3 Service Models

Software as a Service (SaaS). The capability provided to the consumer is to

use the provider’s applications running on a cloud infrastructure. The applications

are accessible from various client devices through either a thin client interface, such

as a web browser (e.g., web-based email), or a program interface. The consumer

does not manage or control the underlying cloud infrastructure including network,

servers, operating systems, storage, or even individual application capabilities, with

the possible exception of limited user-specific application configuration settings.

Platform as a Service (PaaS). The capability provided to the consumer is to

deploy onto the cloud infrastructure consumer-created or acquired applications

created using programming languages, libraries, services, and tools supported by the

provider. The consumer does not manage or control the underlying cloud

infrastructure including network, servers, operating systems, or storage, but has

control over the deployed applications and possibly configuration settings for the

application-hosting environment.

Infrastructure as a Service (IaaS). The capability provided to the consumer

is to provision processing, storage, networks, and other fundamental computing

resources where the consumer is able to deploy and run arbitrary software, which

can include operating systems and applications. The consumer does not manage or

control the underlying cloud infrastructure but has control over operating systems,


storage, and deployed applications; and possibly limited control of select networking

components (e.g., host firewalls).
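As an illustration of the IaaS model, the sketch below uses the boto3 library to request a virtual machine from Amazon EC2 (the IaaS service named earlier in this chapter) and later release it. The AMI ID, region and instance type are placeholders, and configured AWS credentials are assumed; this is a sketch of self-service provisioning, not a production script.

    import boto3

    # Placeholder values; a real AMI ID and region must be supplied.
    REGION = "us-east-1"
    AMI_ID = "ami-0123456789abcdef0"

    ec2 = boto3.resource("ec2", region_name=REGION)

    # Provision one small instance: the consumer controls the OS and what runs
    # on it, but not the underlying physical infrastructure.
    instances = ec2.create_instances(ImageId=AMI_ID,
                                     InstanceType="t2.micro",
                                     MinCount=1, MaxCount=1)
    vm = instances[0]
    vm.wait_until_running()
    vm.reload()
    print("Launched", vm.id, "in state", vm.state["Name"])

    # Release the resource when it is no longer needed (pay-as-you-go).
    vm.terminate()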

2.4 Deployment Models

Private cloud. The cloud infrastructure is provisioned for exclusive use by a

single organization comprising multiple consumers (e.g., business units). It may be

owned, managed, and operated by the organization, a third party, or some

combination of them, and it may exist on or off premises.

Community cloud. The cloud infrastructure is provisioned for exclusive use

by a specific community of consumers from organizations that have shared concerns

(e.g., mission, security requirements, policy, and compliance considerations). It may

be owned, managed, and operated by one or more of the organizations in the

community, a third party, or some combination of them, and it may exist on or off

premises.

Public cloud. The cloud infrastructure is provisioned for open use by the

general public. It may be owned, managed, and operated by a business, academic, or

government organization, or some combination of them. It exists on the premises of

the cloud provider.

Hybrid cloud. The cloud infrastructure is a composition of two or more

distinct cloud infrastructures (private, community, or public) that remain unique

entities, but are bound together by standardized or proprietary technology that

enables data and application portability (e.g., cloud bursting for load balancing

between clouds).


CHAPTER 3

EVOLUTION AND POTENTIAL

3.1 Evolution

Figure 3.1 below shows the evolution of cloud computing as a paradigm. Since the launch of Amazon Web Services in 2002, the proliferation of firms entering this business has been tremendous. Figure 3.2 depicts an alternate view of the evolution process, which has ushered in the ubiquity era, and Figure 3.3 summarises in a nutshell how the computing era has evolved over time.

Figure 3.1- Cloud computing timeline. Cloud computing has evolved from previous computing paradigms going back to the days of mainframes:

· 1961 - John McCarthy envisions that ‘computation’ might someday be organised as a public utility.
· 1967 - IBM (key contributor Jim Rymarczyk) launches the CP-67 software, one of IBM’s first attempts at virtualising mainframe operating systems.
· The term grid computing originates from Ian Foster and Carl Kesselman’s work, The Grid: Blueprint for a New Computing Infrastructure.
· 1999 - Salesforce.com introduces the concept of delivering enterprise applications via a website.
· 2002 - Amazon Web Services provides a suite of cloud-based services, including storage and computation.
· 2006 - Amazon launches Elastic Compute Cloud (EC2) as a commercial web service that small companies and individuals can rent to run their own computer applications.
· 2008 - Private cloud models make their appearance.
· 2009 - Cloud providers offer browser-based enterprise applications.
· 2010 - The Cloud 2.0 model emerges.


Figure 3.2- A different perspective of cloud computing evolution

Figure 3.3 – March to the Ubiquity Era


3.2 Cloud Computing Growth and Potential

The user base of cloud services, and the revenue generated by way of business opportunities, have seen a meteoric rise. Today, telcos have around a 5% share of nearly $20Bn p.a. cloud services revenue, with a 25% compound annual growth rate (CAGR) forecast to 2015. Most market forecasts expect the total cloud services market to reach $45-50Bn in revenue by 2015. Applying these views to an extrapolated 'mid-point' forecast of the cloud market in 2015 implies that telcos will take just under $9Bn revenue from the cloud by 2014, increasing today's $1Bn share nine-fold. Figure 3.4 shows the growth forecast and the current market players in cloud computing services.

Figure 3.4- Cloud services current players and market growth

3.3 Reference Architecture - The Conceptual Reference Model

Figure 3.5 below presents an overview of the NIST cloud computing

reference architecture, which identifies the major actors, their activities and

functions in cloud computing. The diagram depicts a generic high-level architecture

and is intended to facilitate the understanding of the requirements, uses,

characteristics and standards of cloud computing.


Figure 3.5 – NIST Reference Architecture

A brief role/definition of the key actors in cloud computing is shown in Figure 3.6.

Cloud Consumer: A person or organization that maintains a business relationship with, and uses service from, Cloud Providers.

Cloud Provider: A person, organization, or entity responsible for making a service available to interested parties.

Cloud Auditor: A party that can conduct independent assessment of cloud services, information system operations, performance and security of the cloud implementation.

Cloud Broker: An entity that manages the use, performance and delivery of cloud services, and negotiates relationships between Cloud Providers and Cloud Consumers.

Cloud Carrier: An intermediary that provides connectivity and transport of cloud services from Cloud Providers to Cloud Consumers.

Figure 3.6- Actors of cloud computing


It also makes economic sense for enterprises and start-ups to migrate to the cloud, owing to the benefits that accrue over time and the very short set-up time required to start using cloud services. Figure 3.7 shows that the acquisition cost, the visible tip of the iceberg, is only about 10% of the total cost compared with the hidden costs of operation and maintenance.

Figure 3.7-Total Cost of IT infrastructure

3.4 Services on the cloud

The services offered to consumers encompass almost everything, including scientific computing, as depicted in Figure 3.8.

Figure 3.8 – Services offered on the cloud


The grid in Figure 3.9 shows the types of services offered by the various layers of cloud computing.

Figure 3.9- Grid showing the services offered by cloud computing


CHAPTER 4

VIRTUALISATION

Virtualisation is one of the bedrocks of cloud computing, the other being multi-tenancy. All services offered on the cloud depend on these two fundamental pillars, as shown in Figure 4.1.

Figure 4.1- Pillars of Cloud Computing

The Internet has resulted in the virtualisation of many aspects of our life. Today, our workplaces are virtual, we shop virtually, we receive virtual education, entertainment is virtual, and, of course, much of our computing is virtual. The key enabler for all of these virtualisations is the Internet and various computer networking technologies. It turns out that computer networking itself has to be virtualised, and several new standards and technologies have been developed for network virtualisation; these are surveyed in Chapter 6.

There are many reasons why we need to virtualise resources. The five most

common reasons are:

Sharing: When a resource is too big for a single user, it is best to divide it

into multiple virtual pieces, as is the case with today’s multi-core processors. Each

processor can run multiple virtual machines (VMs), and each machine can be used

by a different user. The same applies to high-speed links and large-capacity disks.


Isolation: Multiple users sharing a resource may not trust each other, so it is

important to provide isolation among users. Users using one virtual component

should not be able to monitor the activities or interfere with the activities of other

users. This may apply even if different users belong to the same organization since

different departments of the organization (e.g., finance and engineering) may have

data that is confidential to the department.

Aggregation: If the resource is too small, it is possible to combine many such resources into a larger virtual resource that behaves like a single large one. This is the case with storage, where a large number of inexpensive, unreliable disks can be combined to make up large, reliable storage.

Dynamics: Often resource requirements change fast due to user mobility, and

a way to reallocate the resource quickly is required. This is easier with virtual

resources than with physical resources.

Ease of management: Last, but probably the most important reason for virtualisation, is the ease of management. Virtual devices are easier to manage because they are software-based and expose a uniform interface through standard abstractions.
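The aggregation and ease-of-management points can be illustrated with a small sketch: several small, individually unreliable block stores are hidden behind one uniform “virtual disk” interface that mirrors every write, so a read succeeds as long as at least one replica survives. This is only a toy model of the idea, written for this report, not a real storage system.

    class VirtualDisk:
        """One uniform interface in front of several small backing stores."""

        def __init__(self, backends):
            self.backends = backends          # list of dict-like block stores

        def write(self, block_no, data):
            for b in self.backends:           # mirror the block to every backend
                b[block_no] = data

        def read(self, block_no):
            for b in self.backends:           # any surviving replica will do
                if block_no in b:
                    return b[block_no]
            raise IOError("block %d lost on all backends" % block_no)

    disk = VirtualDisk([{}, {}, {}])          # three cheap, unreliable "disks"
    disk.write(0, b"hello")
    disk.backends[0].clear()                  # simulate one backend failing
    print(disk.read(0))                       # still returns b"hello"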

Virtualisation is not a new concept to computer scientists. Memory was the first computer component to be virtualised: it was an expensive part of the original computers, and virtual memory concepts were developed in the 1970s. The study and comparison of page replacement algorithms was a popular research topic then, and today's computers have very sophisticated, multi-level caching for memory. Storage virtualisation was a natural next step, with virtual disks and virtual compact disc (CD) drives leading to cloud storage today. Virtualisation of desktops resulted in thin clients, which brought a significant reduction in capital as well as operational expenditure, eventually leading to virtualisation of servers and cloud computing.

However, there has recently been significant renewed interest in network virtualisation, fuelled primarily by cloud computing. Several new standards have been, and are still being, developed, and software defined networking (SDN) also helps in network virtualisation.

The efficiency and effectiveness of cloud computing as a service depend intrinsically on performance and continued innovation. A review of the literature available on performance evaluation of cloud computing was therefore carried out, and is presented in the next chapter.


CHAPTER 5

PERFORMANCE EVALUATION

Cloud computing resources must be compatible, high-performance and powerful. High performance is one of the advantages of the cloud and must be satisfactory for every service. The performance of services, and of anything related to the cloud, affects both users and service providers, so performance evaluation is important for both. There are many methods for performance prediction and evaluation; the following are used in the evaluation process:

· Evaluation based on criteria and characteristics
· Evaluation based on simulation

Another category that can be considered for evaluating cloud performance is evaluation classified by the three layers of cloud services.

5.1 Factors Affecting Performance

Nowadays, the term “performance” is more than a classic concept; it includes more extensive notions such as reliability, energy efficiency, scalability and so on. Because of the extent of cloud computing environments and the large number of enterprises and ordinary users working in them, many factors can affect the performance of cloud computing and its resources. Some of the important factors are as follows:

· Security: the impact of security on cloud performance may seem slightly strange, but the impact of security on network infrastructure is well established. For example, DDoS attacks have a wide impact on network performance; if one happens, it greatly reduces network performance and also affects response time. If this or any similar risk threatens the cloud environment, it is a big concern for users and providers.

· Recovery: when data in the cloud suffer errors and failures, or are lost for any reason, the time required for data retrieval and the volume of data that is recoverable affect cloud performance. For example, if data recovery takes a long time, cloud performance and customer satisfaction suffer, because most organisations that are cloud users need quick access to their data, and their services are very important to them.

· Service level agreements: when a user wants to use cloud services, an agreement is signed between user and provider which describes the user's requests, the capabilities of the provider, fees, penalties, and so on. Seen from the user's point of view, the better, more optimal and more timely the fulfilment of the agreed requests, the higher the perceived performance; the same view holds for providers.

· Network bandwidth: this factor affects performance and can also serve as a criterion for evaluation. For example, if the bandwidth is too low to serve customers, performance will be low too.

· Storage capacity: physical memory can also affect the performance criteria. This factor is more significant when evaluating the performance of cloud infrastructure.

· Buffer capacity: if servers cannot serve a request immediately, it is buffered in temporary memory, so buffer capacity affects performance. If the buffer capacity is low, many requests will be rejected and performance will therefore be low.

· Disk capacity: this can also have a negative or positive impact on performance in the cloud.

· Fault tolerance: this factor has a special effect on the performance of the cloud environment. For example, if part of a data centre fails but it can still provide at least the minimum services, performance is preserved.

· Availability: when cloud services are easy to access and always available, performance increases.

· Number of users: if a data centre has many users, and this number exceeds its rated capacity, the performance of its services is reduced.

· Location: the distance between data centres and a user's location is also an important factor that affects performance from the users' point of view.

Other factors that can affect performance are as follows:

· Usability
· Scalability
· Workload
· Repetition or redundancy
· Processor power
· Latency

5.2 Simulation Category

The simulation-based evaluation uses three categories, based on the major components of a cloud environment. Specific metrics are used, and the categories have been selected because data centres, users and geographic region are all important in cloud computing environments.

Simulation and evaluation based on data centres. Evaluation is done by modifying the virtual machine, memory and bandwidth configuration. The results show that response time for some users improves along with processing time, but an additional data centre is often only extra cost; for low-volume request loads, an additional data centre does not cause significant changes in processing time. The average and maximum service times per request across data centres proved similar to the earlier values once the same number of data centres was reached. The review shows that the average service time decreases as the number of centres increases, but beyond a certain point the reduction from each additional data centre is small while the cost is too high. Changing the number of processors in a data centre has the greatest impact on processing time, and also the greatest impact on cost.

Simulation and evaluation based on users. Next, the results of changing the number of users and the volume of work are evaluated. It can be concluded that if a data centre is loaded beyond its rated user capacity, not only will it not be profitable, it will also lower the efficiency of that centre. The results show that increasing the number of requests per unit time has little impact on response time or on the processing time of the data centres; unlike the other measures, however, it does affect the amount of data transferred and thus the cost of data transfer. More information is transferred as the number of requests increases, so costs also increase.
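A simple queueing sketch (using the textbook M/M/1 response-time formula, an assumption introduced here rather than taken from the reviewed paper) illustrates why loading a data centre beyond its rated capacity hurts efficiency so sharply: as the request arrival rate approaches the service rate, the average response time grows without bound.

    def mm1_response_time(arrival_rate, service_rate):
        """Average response time of an M/M/1 queue: W = 1 / (mu - lambda)."""
        if arrival_rate >= service_rate:
            return float("inf")               # overloaded: queue grows without bound
        return 1.0 / (service_rate - arrival_rate)

    service_rate = 100.0                      # requests/second one data centre can serve
    for load in (0.5, 0.8, 0.95, 0.99, 1.05): # offered load as a fraction of capacity
        w = mm1_response_time(load * service_rate, service_rate)
        print("load %.0f%% -> avg response %.4f s" % (load * 100, w))

At 50% load the average response time is 0.02 s, at 99% load it is already 1 s, and above 100% it diverges, which matches the observation that an overrated data centre loses efficiency.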

Simulation and evaluation based on geographical region. The impact of the geographical location of users and data centres is studied to determine how the criteria are affected when data centres and users are in the same region, or far from each other in different regions. The results show that these changes affect cost and the other measures, and that it is better for users and data centres to be in the same region, or at least to have minimal geographic spread. The processing rate of a data centre is effectively reduced when the user is far away from it, because response time increases, so users may send fewer requests to that data centre.


CHAPTER 6

NETWORK VIRTUALISATION

6.1 Introduction

A computer network starts with a network interface card (NIC) in the host, which is connected to a layer 2 (L2) network segment (Ethernet, WiFi, etc.). Several L2 network segments may be interconnected via switches (a.k.a. bridges) to form an L2 network, which is one subnet in a layer 3 (L3) network (IPv4 or IPv6). Multiple L3 networks are connected via routers (a.k.a. gateways) to form the Internet. A single data centre may have several L2/L3 networks, and several data centres may be interconnected via L2/L3 switches. Each of these network components - NIC, L2 network, L2 switch, L3 network, L3 router, data centre, and the Internet - needs to be virtualised. There are multiple, often competing, standards for virtualisation of several of these components, and several new ones are being developed.

When a VM moves from one subnet to another, its IP address must change, which complicates routing. It is well known that IP addresses are both locators and system identifiers, so when a system moves, its L3 identifier changes. In spite of all the developments in mobile IP, it is significantly simpler to move systems within one subnet (within one L2 domain) than between subnets. This is because the IEEE 802 addresses used in L2 networks (both Ethernet and WiFi) are system identifiers (not locators) and do not change when a system moves. Therefore, when a network connection spans multiple L2 networks via L3 routers, it is often desirable to create a virtual L2 network that spans the entire network; in a loose sense, several IP networks together appear as one Ethernet network.

6.2 Virtualisation Of NICs

Each computer system needs at least one L2 NIC (Ethernet card) for communication; therefore, each physical system has at least one physical NIC. However, if we run multiple VMs on the system, each VM needs its own virtual NIC. As shown in Figure 6.1, one way to solve this problem is for the “hypervisor” software that provides processor virtualisation to also implement as many virtual NICs (vNICs) as there are VMs. These vNICs are interconnected via a virtual switch (vSwitch), which is connected to the physical NIC (pNIC). Multiple pNICs are connected to a physical switch (pSwitch). We use this notation of a p-prefix for physical and a v-prefix for virtual objects; in the figures, virtual objects are shown by dotted lines, while physical objects are shown by solid lines.

Figure 6.1- Three approaches to NIC virtualization

Virtualization of the NIC may seem straightforward. However, there is significant industry competition, and different segments of the networking industry have come up with competing standards. Figure 6.1 shows three different approaches.

The first approach, providing a software vNIC via the hypervisor, is the one proposed by VM software vendors. This virtual Ethernet bridge (VEB) approach has the virtue of being transparent and straightforward. Its opponents point out that there is significant software overhead, and that vNICs may not be easily manageable by external network management software; vNICs may also not provide all the features that today's pNICs provide.

In the second approach, pNIC vendors (or pNIC chip vendors) have their own solution, which provides virtual NIC ports using single-root I/O virtualization (SR-IOV) on the peripheral component interconnect (PCI) bus.

In the third approach, switch vendors (or pSwitch chip vendors) have yet another set of solutions that provide virtual channels for inter-VM communication using a virtual Ethernet port aggregator (VEPA), which simply passes frames to an external switch that implements the inter-VM communication policies and reflects some traffic back to other VMs in the same machine. IEEE 802.1Qbg specifies both VEB and VEPA.
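The first (VEB) approach is the one exposed by hypervisors such as KVM managed through libvirt: each guest gets a vNIC plugged into a software bridge (vSwitch) on the host. The sketch below hot-plugs such a vNIC using the libvirt Python bindings; it assumes a running libvirt daemon, an existing guest named "vm1" and a host bridge named "br0", all of which are illustrative names rather than anything defined in this report.

    import libvirt

    # Assumed names: a defined guest "vm1" and an existing Linux bridge "br0".
    VNIC_XML = """
    <interface type='bridge'>
      <source bridge='br0'/>     <!-- host-side vSwitch / bridge -->
      <model type='virtio'/>     <!-- para-virtual NIC model for performance -->
    </interface>
    """

    conn = libvirt.open("qemu:///system")     # connect to the local KVM hypervisor
    dom = conn.lookupByName("vm1")
    dom.attachDevice(VNIC_XML)                # hot-plug the vNIC into the running VM
    print("Interfaces of vm1:", dom.XMLDesc(0).count("<interface"))
    conn.close()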

6.3 Virtualisation of Switches

A typical Ethernet switch has 32-128 ports. The number of physical machines that need to be connected on an L2 network is typically much larger than this, so several layers of switches are needed to form an L2 network. The IEEE Bridge Port Extension standard, 802.1BR, shown in Figure 6.2, allows a virtual bridge with a large number of ports to be formed using port extenders, which are simple relays and may be physical or virtual (like a vSwitch).

Figure 6.2- IEEE 802.1BR bridge port extension

6.4 Virtualisation in LAN Clouds

One additional problem in the cloud environment is that multiple VMs in a

single physical machine may belong to different clients and thus need to be in

different virtual LANs (VLANs). As discussed earlier, each of these VLANs may

span several data centres interconnected via L3 networks, as shown in Fig. 6.3.


Figure 6.3- Different virtual machines may be in different VLANs

Again, there are a number of competing proposals to solve this problem. VMware and several partner companies have proposed Virtual eXtensible LANs (VXLANs). Network Virtualization using Generic Routing Encapsulation (NVGRE) and the Stateless Transport Tunnelling (STT) protocol are two other proposals being considered in the Network Virtualization over L3 (NVO3) working group of the Internet Engineering Task Force (IETF).
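A rough sketch of the idea behind VXLAN: the original L2 (Ethernet) frame is wrapped in a VXLAN header carrying a 24-bit virtual network identifier (VNI) and then carried inside an ordinary UDP/IP packet between the tunnel endpoints. The code below builds only the 8-byte VXLAN header (per the RFC 7348 layout) and prepends it to a dummy inner frame; it is a protocol illustration, not a working tunnel endpoint.

    import struct

    VXLAN_FLAG_VNI_VALID = 0x08               # "I" flag: VNI field is valid

    def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
        """Prepend an 8-byte VXLAN header to an inner L2 frame."""
        if not 0 <= vni < 2**24:
            raise ValueError("VNI must fit in 24 bits")
        flags_and_reserved = VXLAN_FLAG_VNI_VALID << 24   # flags byte + 3 reserved bytes
        vni_and_reserved = vni << 8                       # 24-bit VNI + 1 reserved byte
        header = struct.pack("!II", flags_and_reserved, vni_and_reserved)
        return header + inner_frame                       # would then go inside UDP/IP

    inner = b"\x00" * 14 + b"payload"          # dummy Ethernet header + payload
    packet = vxlan_encapsulate(inner, vni=5001)
    print(len(packet), packet[:8].hex())       # 8-byte header, flags=0x08, VNI=5001

Because the 12-bit VLAN ID of classic Ethernet allows only 4096 virtual networks, the 24-bit VNI is what lets a multi-tenant cloud keep millions of tenants separated over shared L3 infrastructure.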

6.5 Network Function Virtualisation

Standard multi-core processors are now so fast that it is possible to design

networking devices using software modules that run on standard processors. By

combining many different functional modules, any networking device - L2 switch,

L3 router, application delivery controller, and so on - can be composed cost

effectively and with acceptable performance. The Network Function Virtualization

(NFV) group of the European Telecommunications Standards Institute (ETSI) is

working on developing standards to enable this.

6.6 Software Defined Networking

Software defined networking is the latest revolution in networking

innovations. All components of the networking industry, including network

equipment vendors, Internet service providers, cloud service providers, and users,


are working on or looking forward to various aspects of SDN. SDN consists of four

innovations:

· Separation of the control and data planes

· Centralization of the control plane

· Programmability of the control plane

· Standardization of application programming interfaces (APIs)

Each of these innovations is explained briefly below.

6.7 Separation of the Control Plane and Data Plane

Networking protocols are often arranged in three planes: data, control, and management. The data plane consists of all the messages that are generated by users. To transport these messages, the network needs to do some housekeeping work, such as finding the shortest path using L3 routing protocols such as Open Shortest Path First (OSPF) or L2 forwarding protocols such as Spanning Tree. The messages used for this purpose are called control messages and are essential for network operation. In addition, the network manager may want to keep track of traffic statistics and the state of various pieces of networking equipment; this is done via network management. Management, although important, differs from control in that it is optional and is often not done for small networks such as home networks.

One of the key innovations of SDN is that control should be separated from the data plane. The data plane consists of forwarding packets using the forwarding tables prepared by the control plane. The control logic is separated and implemented in a controller that prepares the forwarding tables, while the switches implement a greatly simplified data-plane (forwarding) logic. This reduces the complexity and cost of the switches significantly.
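The following toy sketch (pure Python, not using any real controller such as Floodlight or OpenDaylight) illustrates this separation: switches hold only dumb forwarding tables, while a central controller that knows the whole topology computes shortest paths and installs the entries, much as an SDN controller would do over a southbound API. The topology and switch names are invented for the example.

    from collections import deque

    class Switch:
        """Data plane only: forward packets by table lookup, no routing logic."""
        def __init__(self, name):
            self.name, self.table = name, {}   # destination -> next hop

    class Controller:
        """Centralised control plane: knows the topology, computes the tables."""
        def __init__(self, links):
            self.links = links                 # adjacency list {switch: [neighbours]}

        def install_routes(self, switches):
            for src in self.links:
                # BFS shortest paths from src, then program src's forwarding table.
                prev, seen, queue = {}, {src}, deque([src])
                while queue:
                    node = queue.popleft()
                    for nbr in self.links[node]:
                        if nbr not in seen:
                            seen.add(nbr)
                            prev[nbr] = node
                            queue.append(nbr)
                for dst in prev:
                    hop = dst
                    while prev[hop] != src:    # walk back to find the first hop
                        hop = prev[hop]
                    switches[src].table[dst] = hop

    links = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}
    switches = {n: Switch(n) for n in links}
    Controller(links).install_routes(switches)
    print(switches["s1"].table)                # {'s2': 's2', 's3': 's2'}

Changing policy here means changing only the controller's program; the switches never change, which is exactly the programmability argument made in Sections 6.9 and 6.10.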

6.8 Centralisation of Control Plane

The U.S. Department of Defense funded Advanced Research Projects Agency Network (ARPAnet) research in the 1960s to counter the threat that the entire nationwide communication system could be disrupted if the telecommunication centres, which were highly centralised and owned by a single company at that time, were to be attacked. ARPAnet researchers therefore came up with a totally distributed architecture in which communication continues and packets find a path (if one exists) even if many of the routers become non-operational. Both the data and control planes were totally distributed; for example, each router participates in preparing the routing tables by exchanging reachability information with its neighbours, its neighbours' neighbours, and so on. This distributed control paradigm was one of the pillars of Internet design and was unquestioned until a few years ago.

Centralisation, which was considered a bad thing until a few years ago, is now considered good, and for good reason. Most organisations and teams are run using centralised control: if an employee falls sick, he or she simply calls the boss, and the boss makes arrangements for the work to continue in his or her absence. Now consider what would happen in an organisation that is totally distributed. The sick employee, say John, would have to call all his co-workers and tell them that he is sick; they would tell other employees, and it would take quite a while before everyone knew about John's sickness and decided what, if anything, to do to alleviate the problem until John recovers. This is quite inefficient, but it is how current Internet control protocols work. Centralising control makes sensing the state, and adjusting the control dynamically based on state changes, much faster than with distributed protocols.

Of course, centralisation has scaling issues, but so do distributed methods. In both cases we need to divide the network into subsets or areas small enough to have a common control strategy. A clear advantage of centralised control is that state changes or policy changes propagate much faster than in a totally distributed system. Also, standby controllers can be used to take over if the main controller fails. Note that the data plane is still fully distributed.

6.9 Programmable Control Plane

Now that the control plane is centralized in a central controller, it is easy for

the network manager to implement control changes by simply changing the control

program. In effect, with a suitable API, one can implement a variety of policies and

change them dynamically as the system states or needs change.


This programmable control plane is the most important aspect of the SDN. A

programmable control plane in effect allows the network to be divided into several

virtual networks that have very different policies and yet reside on a shared

hardware infrastructure. Dynamically changing the policy would be very difficult

and slow with a totally distributed control plane.

6.10 Standardisation of API

SDN consists of a centralised control plane with a southbound API for

communication with the hardware infrastructure and a northbound API for

communication with the network applications. The control plane can be further

subdivided into a hypervisor layer and a control system layer. A number of

controllers are already available. Floodlight is one example. OpenDaylight is a

multi-company effort to develop an open source controller. A networking hypervisor

called FlowVisor that acts as a transparent proxy between forwarding hardware and

multiple controllers is also available.

The main southbound API is OpenFlow, which is being standardised by the Open Networking Foundation. A number of proprietary southbound APIs also exist, such as OnePK from Cisco; these latter ones are especially suitable for legacy equipment from the respective vendors. Some argue that a number of previously existing control and management protocols, such as the Extensible Messaging and Presence Protocol (XMPP), Interface to the Routing System (I2RS), Software Driven Networking Protocol (SDNP), Active Virtual Network Management Protocol (AVNP), Simple Network Management Protocol (SNMP), Network Configuration (NETCONF), Forwarding and Control Element Separation (ForCES), Path Computation Element (PCE), and Content Delivery Network Interconnection (CDNI), are also potential southbound APIs. However, given that each of these was developed for another specific application, they have limited applicability as a general-purpose southbound control API.

Northbound APIs have not been standardised yet. Each controller may have a

different programming interface. Until this API is standardised, development of


network applications for SDN will be limited. There is also a need for an east-west

API that will allow different controllers from neighbouring domains or in the same

domain to communicate with each other.

The networking industry has shown enormous interest in SDN. SDN is expected to make networks programmable and easily partitionable and virtualisable. These features are required for cloud computing, where the network infrastructure is shared by a number of competing entities. Also, given the simplified data plane, the forwarding elements are expected to become very cheap, standard hardware. Thus, SDN is expected to reduce both capital and operational expenditure for service providers, cloud service providers, and enterprise data centres that use large numbers of switches and routers.

SDN is like a tsunami that is taking over other parts of the computing industry as well. More and more devices are following the software defined path, with most of the logic implemented in software over standard processors; thus, today we have software defined base stations, software defined optical switches, software defined routers, and so on.

Regardless of what happens to current approaches to SDN, it is certain that the networks of tomorrow will be more programmable than today's. Programmability will become a common feature of all networking hardware, so that a large number of devices can be programmed (aka orchestrated) simultaneously. The exact APIs that become common will be decided by transition strategies, since billions of legacy networking devices will need to be included in any orchestration.

It must be pointed out that NFV and SDN are highly complementary technologies; they are not dependent on each other.

6.11 Open Application Delivery Using SDN

While current SDN-based efforts are mostly restricted to L3 and below (network traffic), SDN may be extended to manage application traffic above L3 as well. Application traffic management involves enforcing application deployment and delivery policies on application traffic flows, which may be identified by the type of application, the application deployment context (application partitioning and replication, intermediary service access for security, performance, etc.), user and server contexts (load, mobility, failures, etc.), and application QoS requirements. This is required because delivering modern Internet-scale applications has become increasingly complex, even inside a single private data centre.

Key features of OpenADN:

· OpenADN takes network virtualisation to the extreme of making the global Internet look like a single virtual data centre to each application service provider (ASP).

· Proxies can be located anywhere on the global Internet; of course, they should be located close to users and servers for optimal performance.

· Backward compatibility means that legacy traffic can pass through OpenADN boxes, and OpenADN traffic can pass through legacy boxes.

· No changes to the core Internet are necessary, since only some edge devices need to be OpenADN/SDN/OpenFlow-aware; the remaining devices and routers can remain legacy.

· Incremental deployment can start with just a few OpenADN-aware OpenFlow switches.

· There are economic incentives for first adopters: ISPs that deploy a few of these switches, and the ASPs that use OpenADN, benefit immediately from the technology.

· ISPs keep complete control over their network resources, while ASPs keep complete control over their application data, which may be confidential and encrypted.


CHAPTER 7

DESIGN AND IMPLEMENTATION OF ACADEMIC CLOUD:

BAADAL AT IIT DELHI

7.1 Introduction

Cloud computing is becoming increasingly popular for its better usability, lower cost, higher utilisation, and better management. Apart from publicly available cloud infrastructure such as Amazon EC2, Microsoft Azure, or Google App Engine, many enterprises are setting up "private clouds". Private clouds are internal to the organisation and hence provide more security and privacy, as well as better control over usage, cost and pricing models. They are becoming increasingly popular not just with large organisations but also with medium-sized organisations that run a few tens to a few hundreds of IT services.

An academic institution (university) can benefit significantly from a private cloud infrastructure to service its IT, research, and teaching requirements. The paper [6] discusses the experience of setting up a private cloud infrastructure at the Indian Institute of Technology (IIT) Delhi, which has around 8000 students, 450 faculty members, more than 1000 workstations, and around a hundred server-grade machines to manage the IT infrastructure. With many different departments and research groups requiring compute infrastructure for their teaching, research, and other IT services, IIT Delhi has many different "labs" and "server rooms" scattered across the campus. The aim is to consolidate this compute infrastructure by setting up a private cloud and providing VMs to the campus community to run their workloads. This can significantly reduce hardware, power, and management costs, and also relieve individual research groups of management headaches. The cloud infrastructure consists of around 30 servers, each with 24 cores, and 10 TB of shared SAN-based storage, all connected over 10 Gbps fibre. Virtual machines run on this hardware using KVM, and the hosts are managed by a custom management layer developed using Python and libvirt.


7.2 Salient Design Features of the Academic Cloud

While implementing the private cloud infrastructure, the team came across several issues that had previously not been addressed by commercial cloud offerings. Some of the main challenges faced by the team are discussed below.

Workflow: In an academic environment the concern is the simplicity and usability of the workflow for researchers (e.g., Ph.D. students, research staff, faculty members) and administrators (system administrators, policy makers and enforcers, approvers of resource usage). For authentication, the cloud service is integrated with a campus-wide LDAP server to leverage existing authentication mechanisms, and it is also integrated with the campus-wide mail and Kerberos servers. A researcher creates a request, which must be approved by the concerned faculty member before it is approved by the cloud administrator. Both the faculty member and the cloud administrator can change the request parameters (e.g., number of cores, memory size, disk size), after which a one-click installation of the virtual machine follows. As soon as the virtual machine is installed, the faculty member and the student are informed and given a VNC console password that they can use to access the virtual machine.
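The request/approval flow described above can be pictured as a small state machine. The sketch below is a simplified model written for this report (the actual Baadal code is a web2py application and is not reproduced here); the state names and fields are assumptions made for illustration.

    class VMRequest:
        """Simplified model of a Baadal-style VM request moving through approvals."""

        STATES = ("requested", "faculty_approved", "admin_approved", "installed")

        def __init__(self, requester, cores, memory_gb, disk_gb):
            self.requester = requester
            self.params = {"cores": cores, "memory_gb": memory_gb, "disk_gb": disk_gb}
            self.state = "requested"

        def approve(self, role, **changes):
            """Faculty approve first, then the cloud admin; either may edit parameters."""
            self.params.update(changes)
            if role == "faculty" and self.state == "requested":
                self.state = "faculty_approved"
            elif role == "admin" and self.state == "faculty_approved":
                self.state = "admin_approved"
            else:
                raise ValueError("approval by %s not allowed in state %s" % (role, self.state))

        def install(self):
            if self.state != "admin_approved":
                raise ValueError("request not fully approved")
            self.state = "installed"
            return "vnc-password-sent-to-%s" % self.requester   # placeholder notification

    req = VMRequest("phd.student", cores=4, memory_gb=8, disk_gb=100)
    req.approve("faculty")
    req.approve("admin", cores=2)        # the administrator trims the request
    print(req.state, req.params, req.install())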

Cost and Freedom: In an academic setting, both cost and the freedom to tweak the software matter. For this reason, free and open-source infrastructure was chosen; enterprise solutions such as those provided by VMware are both expensive and restrictive. The virtualisation stack comprising KVM, libvirt, and web2py is open-source and freely available.

Workload Performance: Researchers typically need a large number of VMs executing complex simulations and communicating with each other through message-passing interfaces like MPI. Both compute and I/O performance are critical for such workloads, so hardware and software were chosen to provide the maximum performance possible. For example, the best possible bandwidths between the physical hosts, storage arrays, and external network switches are ensured with the available hardware. Similarly, the best available emulated devices in the virtual machine monitor are used, and wherever possible para-virtual devices are used for maximum performance.


Maximising Resource Usage: Currently, dedicated high-performance server-class hardware is used to host the cloud infrastructure, with custom scheduling and admission-control policies employed to maximise resource usage. In future, the plan is to use the idle capacity of labs and server rooms to build a larger cloud infrastructure at minimal cost. A typical lab contains tens to a few hundred commodity desktop machines, each having one or more CPUs and a few hundred GB of storage, connected over 100 Mbps or 1 Gbps Ethernet; often these clusters of computers are also connected to a shared Network-Attached Storage (NAS) device. For example, there are around 150 commodity computers in the Computer Science department, and the typical utilisation of these desktops is very low (1-10%). The intention is to use this "community" infrastructure for running the cloud service. The VMs will run in the background, causing no interference to the applications or the experience of the workstation user, which can significantly improve the resource utilisation of lab machines.

7.3 Challenges

Reliability: In lab environments, it is common for desktops to switch off or become disconnected at random. These failures can be due to several reasons, including manual reboots, network cables being pulled out, power outages, or physical hardware failures. Work is in progress on techniques for keeping redundant VM images so that the system can recover from such failures.

Network and Storage Topology: Most cloud offerings use shared storage (SAN/NAS). Such shared storage can become a single point of failure, and highly reliable storage arrays tend to be expensive. The use of the disk storage attached to each computer to provide a high-performance shared storage pool with built-in redundancy is under investigation. Similarly, redundancy in the network topology is required to tolerate network failures.

Scheduling: Scheduling of VMs on server-class hardware has been well studied and is implemented in current cloud offerings. Scheduling algorithms are being developed for commodity hardware, where network bandwidths are lower, storage is distributed, and redundancy must be implemented; for example, the scheduling algorithm maintains redundant copies of a VM in separate physical environments.

Encouraging Responsible Behaviour: Public clouds charge their users for CPU, disk, and network usage on per CPU-hour, GB-month, and Gbps-month metrics. Instead of a strict pricing model, Baadal relies on good community behaviour, supported by different categories of users.

7.4 Ubuntu Enterprise Cloud

Ubuntu Enterprise Cloud integrates the open-source Eucalyptus private cloud platform, making it possible to create a private cloud with much less configuration than installing Linux first and then Eucalyptus. The Ubuntu/Eucalyptus internal cloud offering is designed to be compatible with Amazon's EC2 public cloud service, which adds further ease of use. On the other hand, one needs to be familiar with both Ubuntu and Eucalyptus, as the team was frequently required to search beyond the Ubuntu documentation because of Ubuntu Enterprise Cloud's dependence on Eucalyptus. For example, it was observed that Ubuntu had weak documentation for customising images, which is an important step in deploying a cloud. Further, even though the architecture is quite stable and worth using, it does not meet the requirement for a custom-tailored interface suited to an academic or research environment like IIT Delhi.

7.5 VMware vCloud

VMware vCloud offers on-demand cloud infrastructure so that end users can consume virtual resources with maximum agility. It offers consolidated data centres and an option to deploy workloads on shared infrastructure with built-in security and role-based access control. Migration of workloads between different clouds, and integration with existing management systems using customer extensions, APIs, and open cross-cloud standards, are among the most convincing arguments for using it for a private cloud. Despite these features, and despite being one of the most stable cloud platforms, VMware vCloud may not be an ideal solution for an academic institution owing to the high licensing costs attached to it, though it might prove ideal for an enterprise with a sufficiently large budget.


7.6 Baadal: The Workflow Management Tool for Academic Requirements

Baadal is currently based on KVM as the hypervisor and on the libvirt API, which serves as a toolkit for interacting with the virtualisation capabilities. The choice of libvirt is guided by the fact that libvirt works with a variety of hypervisors, including KVM, Xen, and VMware; thus, the underlying hypervisor technology can be changed at a later stage with minimal effort.
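To show the kind of call Baadal's management layer makes, here is a minimal libvirt sketch that defines and starts a KVM guest from a domain XML description and then lists the running domains. The XML shown is a bare-bones assumption (a real template also carries disk, network and VNC configuration, without which the guest will not actually boot an OS), and a running libvirtd reachable at qemu:///system is assumed.

    import libvirt

    # Minimal illustrative domain definition; a real one also describes disks,
    # network interfaces and a VNC console.
    DOMAIN_XML = """
    <domain type='kvm'>
      <name>demo-vm</name>
      <memory unit='MiB'>1024</memory>
      <vcpu>1</vcpu>
      <os><type arch='x86_64'>hvm</type></os>
    </domain>
    """

    conn = libvirt.open("qemu:///system")     # same hypervisor connection Baadal uses
    dom = conn.defineXML(DOMAIN_XML)          # register the guest with libvirt
    dom.create()                              # start it (equivalent of 'virsh start')
    print("Running domains:", [d.name() for d in conn.listAllDomains()])
    dom.destroy()                             # power off
    dom.undefine()                            # remove the definition again
    conn.close()

Because the same calls work against Xen or VMware drivers, the management layer stays the same even if the hypervisor is swapped, which is precisely the portability argument made above.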

The management software is exposed in two layers, namely a web-based interface and a command-line interface (CLI). The web-based interface is built using web2py, an MVC-based Python framework, and Python is used for the command-line interface as well. The choice of Python as the primary language for the entire project is supported by the excellent support and documentation of the libvirt community.


CHAPTER 8

CONCLUSION

Cloud computing is a result of advances in virtualisation of computing, storage, and networking. Network virtualisation is still in its infancy: numerous standards related to it have recently been developed in the IEEE and the Internet Engineering Task Force (IETF), and several are still being developed. One of the key recent developments in this direction is software defined networking. The key innovations of SDN are separation of the control and data planes, centralisation of control, programmability, and standard southbound, northbound, and east-west APIs; these will allow a large number of devices to be orchestrated (programmed) easily. OpenFlow is the standard southbound API being defined by the Open Networking Foundation. Work is also ongoing on OpenADN, a network application based on SDN that enables application partitioning and delivery in a multi-cloud environment.

The recent developments in cloud computing were reviewed from the available literature, with specific reference to performance evaluation of cloud environments and innovation in the field of network virtualisation. The design and implementation of Baadal, an academic cloud built at IIT Delhi with in-house effort, was examined as a case study to highlight developments within the country in the sphere of cloud computing.


REFERENCES

1. National Institute of Standards and Technology, U.S. Department of Commerce, Special Publication 800-145, 'The NIST Definition of Cloud Computing'.

2. Raj Jain and Subharthi Paul (2013), 'Network Virtualization and Software-Defined Networking for Cloud Computing: A Survey', IEEE Communications Magazine, November 2013.

3. Niloofar Khanghahi and Reza Ravanmehr (2013), 'Cloud Computing Performance Evaluation: Issues and Challenges', International Journal on Cloud Computing: Services and Architecture (IJCCSA), Vol. 3, No. 5, October 2013.

4. George Pallis (2010), 'Cloud Computing: The New Frontier of Internet Computing', IEEE Computer Society, September/October 2010.

5. Christian Vecchiola, Suraj Pandey, and Rajkumar Buyya, 'High-Performance Cloud Computing: A View of Scientific Applications', Cloud Computing and Distributed Systems (CLOUDS) Laboratory, Department of Computer Science and Software Engineering.

6. Abhishek Gupta, Jatin Kumar, Daniel Mathew, Sorav Bansal, Subhashis Banerjee and Huzur Saran, 'Design and Implementation of the Workflow of an Academic Cloud', IIT Delhi.

7. http://www.iitd.ac.in/content/baadal-iitd-computing-cloud

8. http://web.mit.edu/6.897/www/readings.html

9. http://nhatnguyen.net/what-is-cloud-computing.aspx