ALMA MATER STUDIORUM - UNIVERSITÀ DI BOLOGNA
SCUOLA DI INGEGNERIA E ARCHITETTURA
DIPARTIMENTO DI INFORMATICA SCIENZA E INGEGNERIA
CORSO DI LAUREA MAGISTRALE IN INGEGNERIA INFORMATICA
TESI DI LAUREA
in
RETI DI CALCOLATORI M
DESIGN THE SUPPORT FOR GRANTING REQUIRED SLA IN PUBLIC CLOUD ENVIRONMENTS
BASED ON CLOUD FOUNDRY
CANDIDATO: Guido Davide Dall'Olio
RELATORE: Chiar.mo Prof. Ing. Antonio Corradi
CORRELATORI:
Ing. Diana J. Arroyo Dr. Ing. Luca Foschini
Ing. Darrell Reimer Dr. Ing. Malgorzata Steinder
Anno Accademico 2012/13
Sessione III
Design the support for granting required SLA in public Cloud Environments
based on Cloud Foundry
Guido Davide Dall'Olio
Key words:
PaaS
Cloud Foundry
Isolation
Contents
Introduction 13
1 Introduction to Cloud Computing 15
1.1 Cloud Computing . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2 Different Clouds . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.2.1 Public Cloud . . . . . . . . . . . . . . . . . . . . . . . 20
1.2.2 Private Cloud . . . . . . . . . . . . . . . . . . . . . . . 21
1.2.3 Private vs Public Cloud . . . . . . . . . . . . . . . . . 22
1.2.4 Hybrid Cloud . . . . . . . . . . . . . . . . . . . . . . . 24
1.3 Service Level Agreement in the Cloud . . . . . . . . . . . . . . 24
2 Cloud layers and their uses 27
2.1 Cloud Layers . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.1 IaaS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.1.2 PaaS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.1.3 SaaS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2 Main Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2.1 Google Cloud Platform . . . . . . . . . . . . . . . . . . 34
2.2.2 Amazon Web Services . . . . . . . . . . . . . . . . . . 35
3 Cloud Foundry 37
3.1 A Good Choice . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.2 The Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2.1 NATS . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.2.2 Cloud Controller . . . . . . . . . . . . . . . . . . . . . 41
3.2.3 Droplet Execution Agent . . . . . . . . . . . . . . . . . 44
3.2.4 Warden . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2.5 Router . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.2.6 Health Manager . . . . . . . . . . . . . . . . . . . . . . 47
3.2.7 User Account and Authentication Server . . . . . . . . 48
3.2.8 Services . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3 Roles and Organizations . . . . . . . . . . . . . . . . . . . . . 51
3.4 Command Line Client . . . . . . . . . . . . . . . . . . . . . . 53
3.5 Applications Guidelines . . . . . . . . . . . . . . . . . . . . . . 54
3.6 Interaction and Usage . . . . . . . . . . . . . . . . . . . . . . 56
3.6.1 Staging . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.6.2 Start of an Application . . . . . . . . . . . . . . . . . . 59
4 BOSH 63
4.1 BOSH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2 The Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2.1 Stemcell . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2.2 Jobs and Packages . . . . . . . . . . . . . . . . . . . . 66
4.2.3 BOSH Agent . . . . . . . . . . . . . . . . . . . . . . . 69
4.2.4 Blobstore . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.2.5 Director . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.3 BOSH Manifest . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5 Cloud Foundry Deployment 77
5.1 The deployment . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2 Local deployment . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.1 CF Nise Installer . . . . . . . . . . . . . . . . . . . . . 79
5.2.2 Local Development Environment . . . . . . . . . . . . 80
5.3 Distributed Deployment . . . . . . . . . . . . . . . . . . . . . 83
5.3.1 Micro BOSH . . . . . . . . . . . . . . . . . . . . . . . 84
5.3.2 The Steps Involved . . . . . . . . . . . . . . . . . . . . 84
5.3.3 Distributed Development Environment . . . . . . . . . 87
5.3.4 OpenStack . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.3.5 Deploying Micro BOSH . . . . . . . . . . . . . . . . . 91
5.3.6 Deploying a distributed Cloud Foundry . . . . . . . . . 92
6 Application Isolation in Cloud Foundry 101
6.1 Isolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.2 Virtualization . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.3 Process Groups - Control Groups . . . . . . . . . . . . . . . . 104
6.3.1 Hierarchy and Subsystems . . . . . . . . . . . . . . . . 104
6.3.2 An example of usage . . . . . . . . . . . . . . . . . . . 107
6.4 Containers: a lightweight approach . . . . . . . . . . . . . . . . 108
6.4.1 LXC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.4.2 Warden . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.4.3 Docker . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.4.4 Docker vs Warden . . . . . . . . . . . . . . . . . . . . 118
6.5 Risks of a Container Based isolation . . . . . . . . . . . . . . . 120
7 Improving provided isolation 125
7.1 The current simple isolation . . . . . . . . . . . . . . . . . . . 125
7.2 Where to hook up a virtualization isolation: the Stack . . . . 127
7.2.1 Current Stack usage . . . . . . . . . . . . . . . . . . . 128
7.2.2 Cloud Foundry Stack in details . . . . . . . . . . . . . 129
7.2.3 Our proposal to employ the Stack . . . . . . . . . . . . 133
7.2.4 Integrate the change with BOSH . . . . . . . . . . . . 139
7.3 Enabling a Dynamic Provisioning . . . . . . . . . . . . . . . . 140
7.3.1 Heat . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
7.3.2 Scaling DEA nodes with Heat . . . . . . . . . . . . . . 144
8 Isolation and Co-location Performance 149
8.1 Application Tested . . . . . . . . . . . . . . . . . . . . . . . . 151
8.1.1 CPU - intensive application . . . . . . . . . . . . . . . 154
8.1.2 Network - intensive application . . . . . . . . . . . . . 158
8.1.3 Disk I/O - intensive application . . . . . . . . . . . . . 161
8.1.4 Distributed Application . . . . . . . . . . . . . . . . . 163
8.1.5 Media Stream over network . . . . . . . . . . . . . . . 165
8.1.6 Multi-tier Application . . . . . . . . . . . . . . . . . . 167
8.2 Technical Conclusions . . . . . . . . . . . . . . . . . . . . . . 170
Conclusions and Future work 173
List of Figures
1.1 Cloud Computing . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.2 Virtual Machine Monitor and Virtualization . . . . . . . . . . 18
1.3 Deployment Models . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1 The cloud computing stack . . . . . . . . . . . . . . . . . . . . 28
2.2 Google Cloud Platform . . . . . . . . . . . . . . . . . . . . . . 35
3.1 A Triangle of choice . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2 Cloud Foundry Architecture . . . . . . . . . . . . . . . . . . . 40
3.3 Organization and Roles . . . . . . . . . . . . . . . . . . . . . . 52
3.4 Start of an application . . . . . . . . . . . . . . . . . . . . . . 60
4.1 BOSH architecture . . . . . . . . . . . . . . . . . . . . . . . . 65
4.2 BOSH Director and Agent basic interaction . . . . . . . . . . 70
4.3 BOSH APIs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.1 MicroBosh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2 Final BOSH deployment . . . . . . . . . . . . . . . . . . . . . 94
6.1 Xen and KVM virtualization . . . . . . . . . . . . . . . . . . . 103
6.2 A single hierarchy can have one or more subsystems attached . 105
6.3 Attaching multiple subsystems . . . . . . . . . . . . . . . . . . 106
6.4 Lightweight virtualization layers . . . . . . . . . . . . . . . . . 109
6.5 Container vs Virtualization . . . . . . . . . . . . . . . . . . . 110
6.6 DEA and Warden interaction . . . . . . . . . . . . . . . . . . 114
6.7 Docker features and Virtualization . . . . . . . . . . . . . . . 117
6.8 Container and Virtual Machine comparison . . . . . . . . . . . 121
7.1 DEAs and Applications . . . . . . . . . . . . . . . . . . . . . . 126
7.2 DEA pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
7.3 DEA advertisements . . . . . . . . . . . . . . . . . . . . . . . 130
7.4 Cloud Controller selection process . . . . . . . . . . . . . . . . 132
7.5 New DEA advertisements . . . . . . . . . . . . . . . . . . . . 134
7.6 DEA and Cloud Controller configuration files . . . . . . . . . 135
7.7 New pools and processing . . . . . . . . . . . . . . . . . . . . 136
7.8 A new Controller processing . . . . . . . . . . . . . . . . . . . 137
7.9 Heat architecture . . . . . . . . . . . . . . . . . . . . . . . . . 143
7.10 Cloud Foundry scaler, using Heat . . . . . . . . . . . . . . . . 145
7.11 A different Stack is running . . . . . . . . . . . . . . . . . . 147
8.1 Virtual Machines and test configuration . . . . . . . . . . . . . 152
8.2 Two VCPU deployment for test . . . . . . . . . . . . . . . . . 153
8.3 Four VCPU deployment for test . . . . . . . . . . . . . . . . . 153
8.4 Whetstone Benchmark host score . . . . . . . . . . . . . . . . 154
8.5 Whetstone Benchmark two VCPU average container score . . 156
8.6 Whetstone Benchmark two VCPU average execution time . . 156
8.7 Whetstone Benchmark four VCPU average container score . . 157
8.8 Whetstone Benchmark four VCPU average execution time . . 157
8.9 Two VCPU deployment iperf test . . . . . . . . . . . . . . . . 158
8.10 TCP average Bandwidth . . . . . . . . . . . . . . . . . . . . . 159
8.11 UDP two VCPU average Bandwidth . . . . . . . . . . . . . . 160
8.12 UDP four VCPU average Bandwidth . . . . . . . . . . . . . . 160
8.13 Disk intensive I/O Write Disk speed . . . . . . . . . . . . . . . 161
8.14 Disk intensive I/O total average execution time . . . . . . . . 162
8.15 Two VCPU deployment distributed computation test . . . . . 163
8.16 Average communication Bandwidth sending the chunk . . . . 164
8.17 Average execution time . . . . . . . . . . . . . . . . . . . . . . 164
8.18 Two VCPU deployment media stream test . . . . . . . . . . . 165
8.19 Average connection speed during media transfer two VCPU . 166
8.20 Average connection speed during media transfer four VCPU . 167
8.21 Two VCPU deployment Multi-tier test . . . . . . . . . . . . . 167
8.22 Average execution time two VCPU . . . . . . . . . . . . . . . 168
8.23 Average transfer speed two VCPU . . . . . . . . . . . . . . . . 169
8.24 Average execution time four VCPU . . . . . . . . . . . . . . . 169
8.25 Average transfer speed four VCPU . . . . . . . . . . . . . . . 170
Introduction
Cloud Computing, that is, providing computer resources as a service, is a technology revolution offering flexible IT usage in a cost-efficient, pay-per-use way. The Cloud approach can be applied to the application development process through special platforms and environments that provide access to remote resources. One of these models, Platform as a Service (PaaS), offers software companies the opportunity to create applications more easily, concentrating on business processes instead of coding and maintenance, to reduce costs associated with hardware and software, to anticipate possible scalability problems, and to carry out the whole development lifecycle within the same environment. In the last 12 months the adoption of PaaS has increased dramatically and it is now one of the fastest growing areas of all cloud computing services. Gartner estimates a steep rise in PaaS adoption, forecasting an increase in spending to more than $2.9 billion by 2016, and that every organization will run some or all of its business software on public or private PaaS.
Early PaaS offerings, however, restricted developers to specific or non-standard development frameworks, a limited set of application services, or a single, vendor-operated Cloud service. These incompatible platforms inhibit application portability, locking developers into a particular offering and restricting the movement of applications across Cloud providers or even into an enterprise's own datacenter. Cloud Foundry is a modern application platform built specifically to simplify the end-to-end development, deployment and operation of Cloud era applications; it is an open source Cloud Computing project that offers a platform supporting many languages and many
services. Thanks to its openness it can be adapted and partially changed, but also integrated to fulfill many tasks in different environments. Cloud Foundry represents a new generation of application platform, architected specifically for Cloud Computing environments and delivered as a service from enterprise datacenters and public Cloud service providers. The project is not tied to any single Cloud environment; rather, Cloud Foundry supports deployment to any public and private Cloud environment. However, each Cloud platform has to face some challenges, such as application portability, security, scalability and SLA delivery.
The open source platform project sacrifices solid run-time application isolation for an easier architecture and deployment process; however, in certain scenarios, some applications should not run or be co-located with others of a different classification or requiring different SLAs. This work presents a proposed solution, and the research behind it, to meet customers' required SLAs on Cloud Foundry by granting application separation and isolation: during the thesis work we defined a change in the PaaS architecture that can add new features while always preserving backward compatibility with the applications developed so far. Moreover, the newly added placement option enforces stricter and more effective boundaries between the applications hosted on the PaaS, allowing more advanced and efficient SLAs.
In Chapter 1 the idea and concept of Cloud Computing is presented, while in Chapter 2 we explore the different Cloud layers and the different solutions available to enterprises. Chapter 3 deeply analyzes Cloud Foundry's architecture and characteristics, highlighting its qualities and use cases, while in Chapter 4 BOSH, a specific deployer, is introduced. Chapter 5 shows how to install and deploy the open project; then Chapter 6 examines application placement weaknesses and drawbacks present in Cloud Foundry. Chapter 7 explains the proposed change and its painless integration, compatible with the current release, while in Chapter 8 many use cases, applications and tests are carried out to quantify and understand the real benefits of the different isolated placement.
Chapter 1
Introduction to Cloud Computing
Cloud computing has recently emerged as one of the most common terms in the Information and Communications Technology (ICT) industry. Several Information Technology (IT) vendors promise to offer computation, storage, and application hosting services and to provide coverage in several continents, offering specific Service Level Agreements (SLAs) and ensuring performance and uptime promises for their services. These "clouds" are distinguished by exposing resources such as computation, data storage and applications as standards-based Web services, and by following a pricing model where customers are charged based on their utilization of computational resources, storage, and transfer of data. Nowadays we are experiencing an ever more continuous transition to the Cloud, mostly because it aims to cut costs and helps users focus on their core business instead of being impeded by IT obstacles.
1.1 Cloud Computing
We can track the roots of cloud computing by observing the advancement of
several technologies, especially in hardware (virtualization, multi-core chips),
Internet technologies (Web services, service-oriented architectures, Web 2.0),
distributed computing (clusters, grids), and systems management (autonomic computing, data center automation). Figure 1.1 shows the convergence of technology fields that significantly advanced and contributed to the advent of cloud computing. While these emerging services have increased interoperability and usability and reduced the cost of computation, application hosting, and content storage and delivery, there is significant complexity involved in ensuring that applications and services can scale as needed to achieve consistent and reliable operation under peak loads.
Figure 1.1: Cloud Computing

Cloud vendors, researchers, and practitioners alike are working to ensure that potential users are educated about the benefits of cloud computing and the best way to harness the full potential of the cloud. Cloud Computing is closely connected to the concept of utility computing, described by a business model for on-demand delivery of computing power: a scenario where
consumers pay providers based on usage, similar to the way in which we currently obtain services from traditional public utilities such as water, electricity and gas. However, while the realization of real utility computing appears closer than ever, its acceptance is currently restricted to cloud experts due to the perceived complexities of interacting with cloud computing providers.
From a final user perspective, Cloud Computing can be seen as a set of useful functions and resources that hide how their internals work, so that the customer does not need to worry about them. Before this concept took hold, access to, management of, and processing of data were achieved through bare-metal configurations of gradually increasing performance; now computing itself, which may be considered fully virtualized, allows computers to be built from distributed components such as processing, storage, data, and software resources.
One of the main aims of Cloud Computing is to allow access to large amounts of computing power in a fully virtualized manner, by aggregating resources and offering a single system view. In addition, an important aim of this technology has been delivering computing as a utility [2]. Cloud computing has been coined as an umbrella term to describe a category of sophisticated on-demand computing services; it denotes a model in which a computing infrastructure is viewed as a "cloud", from which businesses and individuals access applications from anywhere in the world on demand [3]. The main principle behind this model is offering computing, storage, and software "as a service".
Three main aspects can generally be considered as new Cloud features [4]:
Perception of infinite computing resources on demand;
Removal of an excessive up-front investment in resources;
Ability to pay for short-term resource use only when necessary.
These seem like simple points, but achieving them is complex.
We are currently experiencing a switch in the IT world, from in-house generated computing power to utility-supplied computing resources delivered
over the Internet as Web services. Cloud computing services are usually backed by large-scale data centers composed of thousands of computers. Such data centers are built to serve many users and host many disparate applications. For this purpose, hardware virtualization can be considered a perfect fit to overcome most operational issues of data center building and maintenance, as it allows running multiple operating systems and software stacks on a single physical platform. A software layer, the Virtual Machine Monitor (VMM), also called a hypervisor, mediates access to the physical hardware, presenting to each guest operating system a Virtual Machine (VM), which is a set of virtual platform interfaces [5], as shown in Figure 1.2.

Figure 1.2: Virtual Machine Monitor and Virtualization
Traditionally, the perceived benefits were improvements in sharing and utilization, better manageability, and higher reliability. More recently, with the adoption of virtualization on a broad range of server and client systems, researchers and practitioners have been emphasizing three basic capabilities regarding management of workload in a virtualized system, namely isolation, consolidation and migration [6]. Workload isolation is achieved since all program instructions are fully confined inside a VM, which leads to improvements in security. Better reliability is also achieved because software failures
inside one VM do not affect others [5]. Moreover, better performance control is attained, since the execution of one VM should not affect the performance of another VM. The consolidation of several individual and heterogeneous workloads onto a single physical platform leads to better system utilization. Workload migration, also referred to as application mobility, aims at facilitating hardware maintenance, load balancing, and disaster recovery. It is done by encapsulating a guest operating system (OS) state within a VM and allowing it to be suspended, fully serialized, migrated to a different platform, and resumed immediately or preserved to be restored at a later date.
Certain features of a cloud are essential to enable services that truly represent the cloud computing model and satisfy the expectations of consumers, and cloud offerings grant:
Self-service: consumers of cloud computing services expect on-demand, nearly instant access to resources. To support this expectation, clouds must allow self-service access, so that customers can request, customize, pay for, and use services without the intervention of human operators [7];
Per-usage metering and billing: cloud computing eliminates up-front commitment by users, allowing them to request and use only the necessary amount. For this reason, clouds must implement features that allow efficient trading of services, such as pricing, accounting, and billing;
Elasticity: cloud computing gives the illusion of infinite computing resources available on demand. Therefore users expect clouds to rapidly provide resources in any quantity at any time. In particular, it is expected that additional resources can be provisioned, possibly automatically, when an application's load increases, and released when the load decreases;
Customization: resources rented from the cloud must be highly customizable. In the case of infrastructure services, customization means allowing users to deploy specialized virtual appliances and to be given, for example, privileged (root) access to the virtual servers.
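The elasticity feature described above amounts to a simple control loop: observe the load, provision when it rises, release when it falls. The sketch below illustrates only the decision rule; the thresholds and the scaling step are our own illustrative assumptions, not any real provider's policy or API.

```python
# Toy elasticity rule: decide how many instances to run from the average load.
# Thresholds (0.8 / 0.3) are illustrative assumptions, not real provider values.

def desired_instances(current: int, avg_load: float,
                      scale_up_at: float = 0.8,
                      scale_down_at: float = 0.3) -> int:
    """Return the instance count the platform should converge to."""
    if avg_load > scale_up_at:
        return current + 1          # load is high: provision one more instance
    if avg_load < scale_down_at and current > 1:
        return current - 1          # load is low: release one instance
    return current                  # load is in range: keep the pool as-is

print(desired_instances(2, 0.9))    # high load: scale out to 3
```

A real platform would replace the returned counts with calls to the provider's provisioning API, and would damp the decision over a time window to avoid oscillating between scale-out and scale-in.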
1.2 Different Clouds
Although cloud computing has emerged mainly from the appearance of public computing utilities, other deployment models, with variations in physical location and distribution, have been adopted. In this sense, regardless of its service class, a cloud can be classified as public, private or hybrid [7] based on its model of deployment, as shown in Figure 1.3.

Figure 1.3: Deployment Models

In most cases, establishing a private cloud means restructuring an existing infrastructure by adding virtualization and cloud-like interfaces. This allows users to interact with the local data center while experiencing the same advantages of public clouds, most notably a self-service interface, privileged access to virtual servers, and per-usage metering and billing. A public cloud, on the other hand, can be shared by several organizations and can support a specific community that has shared concerns (e.g., mission, security requirements, policy and compliance considerations) [7]. A hybrid cloud, finally, takes shape when a private cloud is supplemented with computing capacity from public clouds [9].
1.2.1 Public Cloud
This is the most common model. A public cloud, or external cloud, describes cloud computing in the traditional mainstream sense, whereby resources are dynamically provisioned via publicly accessible Web applications/Web services (SOAP or RESTful interfaces) from an off-site third-party provider.
Commonly, when services and applications rely on public visibility and reachability from the Internet, a public cloud is the first choice. To run a public
cloud, a service provider first needs to define the services that will be offered to enterprises that want to place their workloads in the cloud: an offer for many customers. This is the reason why Cloud providers, such as Amazon EC2, can host a large number of applications [10] in a multitude of independently managed virtual machines, and can offer a collection of remote computing services that together make up a platform over the Internet. Those who choose this approach need not worry about computational resource supply or availability issues, as the provider will take care of them entirely. The main benefits of using a public cloud service can be summarized as:
Easy and inexpensive set-up, because hardware, application and bandwidth costs are covered by the provider;
Scalability to meet needs;
Very few or no wasted resources.
Nothing comes for free: typically, public clouds are subject to billing services based on a "pay-per-use" basis and a time-based resource usage calculation. The provider shares resources and bills customers on a fine-grained utility computing basis; the user pays only for the capacity of the provisioned resources at a particular time. Usually, from an organization's point of view, a public cloud is chosen when no sensitive data is involved, when enterprise public services are needed, or when company data centers cannot fulfill load requests.
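The "pay-per-use" scheme described above is, at its core, a fine-grained metering calculation: time-based usage of provisioned capacity plus consumed resources such as data transfer. A minimal sketch, where the two rates are made-up figures for illustration only, not any real provider's pricing:

```python
# Fine-grained utility billing: charge only for what was provisioned and used.
# Both rates below are illustrative assumptions, not real provider prices.

HOURLY_RATE = 0.05      # cost per instance-hour of provisioned capacity
RATE_PER_GB = 0.09      # cost per GB of outbound data transfer

def bill(instance_hours: float, gb_transferred: float) -> float:
    """Time-based resource usage plus data transfer, as in pay-per-use billing."""
    return instance_hours * HOURLY_RATE + gb_transferred * RATE_PER_GB

print(round(bill(10, 2), 2))   # 10 instance-hours and 2 GB out
```

The point of the model is visible in the formula itself: an idle customer who provisions nothing pays nothing, which is what distinguishes utility billing from a flat hosting fee.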
1.2.2 Private Cloud
A private cloud (also called internal cloud or corporate cloud) is a marketing term for a proprietary computing architecture that provides hosted services to a limited number of people behind a firewall. The private cloud is the cloud infrastructure operated solely for an organization. It can be managed by the organization or by a third party, and can exist on-premises or off-premises. It aims at providing public cloud functionality, but on private resources,
while maintaining control over the organization's data and resources to meet its security and governance requirements. It usually consists of a compute platform with these goals:
Simplicity: allow service provisioning, setup and compute capability for an organization's users in a self-service manner;
Potency: automate and provide well-managed virtualized environments;
Management: optimize computing resources and servers' utilization;
Adaptability: support specific workloads.
Differently from public clouds, instead of a pay-as-you-go model there could be other schemes in place, which take into account the usage of the cloud and proportionally bill the different departments or sections of the enterprise. Private clouds have the advantage of keeping the core business operations in-house, relying on the existing IT infrastructure and reducing the burden of maintaining it once the cloud has been set up. In spite of these advantages, private clouds cannot easily scale out in the case of peak demand, and integration with public clouds could be a solution to the increased load. However, some drawbacks may occur, since the configuration and installation of the private infrastructure is a mandatory phase: a totally private configuration comes at the cost of an initial investment.
1.2.3 Private vs Public Cloud
After an initial enthusiasm for this new trend, it soon became evident that a solution built on outsourcing the entire IT infrastructure to third parties would not be applicable in many cases, especially when there are critical operations to be performed and security concerns to consider. Moreover, with the public cloud distributed anywhere on the planet, legal issues arise that simply make it difficult to rely on a virtual public infrastructure for any IT operation. As an example, data location and confidentiality are two of the
major issues that scare stakeholders away from moving into the cloud: data that might be secure in one country may not be secure in another [2]. In many cases, though, users of cloud services don't know where their information is held, and different laws can apply. It could be stored in a data center in either Europe, where the European Union favors very strict protection of privacy, or America, where laws such as the U.S. Patriot Act invest government and other agencies [16] with virtually limitless powers to access information, including that belonging to companies. In addition, enterprises already have their own IT infrastructures. In spite of this, the distinctive feature of cloud computing still remains appealing, and the possibility of replicating in-house (on their own IT infrastructure) the resource and service provisioning model proposed by cloud computing led to the development of the private cloud concept. In this scenario, security concerns are less critical, since sensitive information does not flow out of the private infrastructure.
Moreover, existing IT resources can be better utilized, since the private cloud becomes accessible to all the divisions of the enterprise. Another interesting opportunity that comes with private clouds is the possibility of testing applications and systems at a comparatively lower price than on public clouds, before deploying them on the public virtual infrastructure. For enterprises, there are some key advantages in the use of a private cloud:
Customer information protection: despite what public cloud offerings state about their specific level of security, in-house security is easier to maintain and to rely on;
Infrastructure ensuring Service Level Agreements (SLAs): quality of service implies that specific operations such as appropriate clustering and failover, data replication, system monitoring and maintenance, disaster recovery, and other uptime services can be commensurate with the application's needs. While public cloud vendors provide some of these features, not all of them are available as needed;
Compliance with standard procedures and operations: if organizations are subject to third-party compliance standards, specific procedures have to be put in place when deploying and executing
applications. This might not be possible in the case of a virtual public infrastructure.
However, private clouds may not easily scale. Hence hybrid clouds, which are the result of a private cloud growing and provisioning resources from a public cloud, are likely to be the best option in many cases.
1.2.4 Hybrid Cloud
A hybrid cloud is a cloud infrastructure composed of two or more clouds, either private or public, that remain separate entities but are bound together by standardized technology that enables data and application portability. Hybrid clouds allow exploiting existing IT infrastructures, maintaining sensitive information within the premises, and naturally growing and shrinking by provisioning external resources and releasing them when no longer needed. Security concerns are then limited to the public portion of the cloud, which can be used to perform operations with less stringent constraints but that are still part of the system workload. Hybrid clouds change their composition and topology over time. They form as a result of dynamic conditions such as peak demands or specific SLAs attached to the applications currently in execution. An open and extensible architecture that allows easily plugging in new components and rapidly integrating new features is of great value in this case.
1.3 Service Level Agreement in the Cloud
A Service Level Agreement (SLA) is a contract between a network service provider and a customer that specifies, usually in measurable terms, what services the network service provider will furnish. SLAs are offered by providers to express their commitment to the delivery of a certain QoS. To customers, it
serves as a warranty. An SLA usually includes availability and performance guarantees. Additionally, metrics must be agreed upon by all parties, as well
as penalties for violating these expectations. Service Level Agreements can
prove to be a useful instrument in facilitating enterprises' trust in cloud-based
services. Cloud providers are typically not directly exposed to the service se-
mantics or the SLAs that service owners may contract with their end users.
The capacity requirements are, thus, less predictable and more elastic. The use of reservations may be insufficient, and capacity planning and optimizations are required instead. The cloud provider's task is, therefore, to make sure that resource allocation requests are satisfied with specific probability and timeliness. These requirements are formalized in infrastructure SLAs between the service owner and the cloud provider, separate from the high-level SLAs between the service owner and its end users. In many cases, either the service owner is not resourceful enough to perform an exact service sizing, or service workloads are hard to anticipate in advance. Therefore, to protect high-level SLAs, the cloud provider should cater for elasticity on demand.
There are two types of SLAs from the perspective of hosting, at two different levels:

- Infrastructure: the infrastructure provider manages and offers guarantees on the availability of the infrastructure, namely server machines, power, network connectivity, and so on. Enterprises manage their own applications deployed on these server machines. The machines are leased to the customers and are isolated from the machines of other customers. In such dedicated hosting environments, a practical example of a service level is a Quality of Service (QoS) condition related to the availability of the system CPU, data storage, and network for efficient execution of the application at peak loads.

- Application: in the application co-location hosting model, server capacity is made available to the applications based solely on their resource demands. Hence, the service providers are flexible in allocating and de-allocating computing resources among the co-located applications. Therefore, the service providers are also responsible for ensuring that their customers' application Service Level Objectives are met.
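The notion of an availability guarantee with penalties can be made concrete with a small sketch. The SLO target and the service-credit tiers below are illustrative assumptions, not drawn from any real provider's contract:

```python
# Hypothetical sketch: evaluating an availability Service Level Objective.

def availability(total_minutes, downtime_minutes):
    """Return the achieved availability as a percentage."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes

def penalty_credit(achieved, slo=99.9):
    """Return the service-credit percentage owed for an SLO breach."""
    if achieved >= slo:
        return 0      # SLO met: no penalty
    if achieved >= 99.0:
        return 10     # minor breach: 10% credit
    return 25         # major breach: 25% credit

# A 30-day month with 50 minutes of downtime misses a 99.9% target.
month = 30 * 24 * 60
print(round(availability(month, 50), 3))        # 99.884
print(penalty_credit(availability(month, 50)))  # 10
```

Real contracts differ in how downtime is measured and excluded (maintenance windows, per-region scopes), but the structure — a measurable metric, a target, and a penalty schedule — is the same.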
It is also possible for a customer and the service provider to mutually agree upon a set of SLAs with different performance and cost structures, rather than a single SLA. The customer has the flexibility to choose any of the agreed SLAs from the available offerings and, at runtime, can switch between the different SLAs.
Currently, cloud solutions come with primitive or reduced SLAs [2]. This is surely bound to change: as the cloud market gets crowded with an increasing number of cloud offers, providers will have to gain some competitive differentiation to capture a larger share of the market. This is particularly true for market segments represented by enterprises and large organizations, where those entities will be particularly interested in choosing offerings with sophisticated SLAs providing more assurances. Many businesses are ready to move, and many have already migrated to the cloud.
Having introduced the idea of the Cloud, we understand why many enterprise applications are moving in this direction and which kinds of SLAs are implied. We now present the different layers of the Cloud and how the different models interact with each other.
Chapter 2
Cloud layers and its uses
Cloud computing services are divided into three classes, according to the abstraction level of the capability provided and the service model of providers: a cloud provider typically offers subscription-based access to infrastructure (Infrastructure as a Service), platforms (Platform as a Service), and applications (Software as a Service), popularly referred to as IaaS, PaaS, and SaaS. Figure 2.1 depicts the layered organization of the cloud stack from physical infrastructure to applications.
2.1 Cloud Layers
The abstraction levels can also be viewed as a layered architecture where services of a higher layer can be composed from services of the underlying layer [14]. Cloud development environments are built on top of infrastructure services to offer application development and deployment capabilities; at this level, various programming models, libraries, APIs, and mashup editors enable the creation of a range of business, Web, and scientific applications. Once deployed in the cloud, these applications can be consumed by end users. We start our description from the lowest layer and move up to the most abstract.
Figure 2.1: The cloud computing stack
2.1.1 IaaS
Offering virtualized resources (computation, storage, and communication) on demand is known as Infrastructure as a Service (IaaS) [9]. A cloud infrastructure enables on-demand provisioning of servers running several choices of operating systems and a customized software stack. Infrastructure services are considered to be the bottom layer of cloud computing systems [11]. Amazon Web Services mainly offers IaaS, which in the case of its EC2 service means offering VMs with a software stack that can be customized in much the same way as an ordinary physical server; OpenStack provides the same abstraction to its final consumers. Users are given privileges to perform numerous activities on the server, such as starting and stopping it, customizing it by installing software packages, attaching virtual disks to it, and configuring access permissions and firewall rules. A key challenge IaaS providers face when building a cloud infrastructure is managing physical and virtual resources, namely servers, storage, and networks, in a holistic fashion. The orchestration of resources must be performed in a way that rapidly and dynamically provisions resources to applications [9]. Public Infrastructure as a Service providers commonly offer virtual servers containing one or more CPUs, running several choices of operating systems and a customized software stack. In addition, storage space and communication facilities are often provided.
In spite of being based on a common set of features, IaaS offerings can be distinguished by the availability of specialized features that influence the cost-benefit ratio experienced by user applications when moved to the cloud. The most relevant features are:
- Geographic presence: to improve availability and responsiveness, a provider of worldwide services would typically build several data centers distributed around the world;

- User interfaces and access to servers: ideally, a public IaaS provider must offer multiple means of access to its cloud, thus catering for various users and their preferences. Graphical User Interfaces (GUIs) are preferred by end users who need to launch, customize, and monitor a few virtual servers and do not necessarily need to repeat the process several times. On the other hand, Command Line Interfaces (CLIs) offer more flexibility and the possibility of automating repetitive tasks via scripts;

- Advance reservation of capacity: advance reservations allow users to request that an IaaS provider reserve resources for a specific time frame in the future, thus ensuring that cloud resources will be available at that time;

- Automatic scaling and load balancing: elasticity is a key characteristic of the cloud computing model. Applications often need to scale up and down to meet varying load conditions, so automatic scaling is a highly desirable feature of IaaS clouds. It allows users to set conditions for when they want their applications to scale up and down, based on application-specific metrics such as transactions per second, number of simultaneous users, request latency, and so forth;
- Hypervisor and operating system choice: IaaS providers need expertise in Linux, networking, virtualization, metering, resource management, and many other low-level aspects to successfully deploy and maintain their cloud offerings.
One of the most well-known open-source IaaS platforms is OpenStack [12]: a cloud-computing project that aims to provide the "ubiquitous open source cloud computing platform for public and private clouds".
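The automatic-scaling feature described above amounts to evaluating user-defined conditions against application metrics. A minimal sketch, with thresholds and metric names chosen purely for illustration:

```python
# Illustrative threshold-based scaling rules; not any provider's actual API.

def scaling_decision(instances, requests_per_sec, latency_ms,
                     max_rps_per_instance=100, max_latency_ms=250,
                     min_instances=1):
    """Return the new instance count suggested by simple threshold rules."""
    # Scale up when either per-instance load or latency exceeds its threshold.
    if (requests_per_sec / instances > max_rps_per_instance
            or latency_ms > max_latency_ms):
        return instances + 1
    # Scale down when the remaining instances could comfortably absorb the load.
    if (instances > min_instances
            and requests_per_sec / (instances - 1) < max_rps_per_instance * 0.7):
        return instances - 1
    return instances

print(scaling_decision(2, 350, 120))  # 3: per-instance load too high
print(scaling_decision(4, 120, 90))   # 3: load fits comfortably on fewer nodes
```

Real IaaS auto-scalers add cooldown periods and hysteresis so the fleet does not oscillate, but the core is exactly this kind of rule evaluation.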
2.1.2 PaaS
Platform as a Service (PaaS) is a category of cloud computing services that provides, as a service, a computing platform and a set of software subsystems or components needed to perform a task without further external dependencies. Typically an IaaS is not agile enough for developers: developers who adopt infrastructure-layer clouds become responsible for managing their virtual machines, and need to understand more about the infrastructure, the VMM, and the OS than when they were using traditional IT. For service providers, as the number of VMs grows, it becomes very difficult to manage and keep track of which virtual machine is running which applications; it becomes a logistical nightmare as the ecosystem of users, as well as applications, grows within the cloud infrastructure.

In addition to infrastructure-oriented clouds, which provide raw computing and storage services, another approach is to offer a higher level of abstraction that makes a cloud easily programmable: the PaaS. Public Platform as a Service providers commonly offer a deployment environment that allows users to create and run their applications with little or no concern for low-level details of the platform. Such a cloud platform offers an environment in which developers create and deploy applications without necessarily needing to know how many processors or how much memory their applications will be using. In addition, multiple programming models and specialized services (e.g., data access, authentication, and payments) are offered as building blocks for new applications [13]. Specific programming languages and frameworks
are made available in the platform, as well as other services such as persistent
data storage and in-memory caches. Typical features of these platforms are:
- Programming models, languages, and frameworks: PaaS providers usually support multiple programming languages. The languages most commonly used in platforms include Python (e.g., Google AppEngine), Java (e.g., Google AppEngine, Cloud Foundry), .NET languages (e.g., Microsoft Azure), Ruby (e.g., Heroku, Cloud Foundry), and Node.js (e.g., Cloud Foundry). A variety of software frameworks are usually made available to PaaS developers, depending on application focus. Providers that focus on Web and enterprise application hosting offer popular frameworks such as Ruby on Rails, Spring, and Java EE (frameworks that can be used in the Cloud Foundry PaaS), and sometimes well-defined APIs too;

- Persistence options: a persistence layer is essential to allow applications to record their state and recover it in case of crashes, as well as to store user data. In the cloud computing domain we can rely on two common solutions: relational databases and distributed storage technologies. Typically PaaS providers offer several solutions, or connection mechanisms, to integrate these persistence options with the applications;

- Scalability: a PaaS is usually built to let an agile team work and iterate quickly on software. Application scalability is not only an operations issue but a development issue as well, and a PaaS provides these functions out of the box. Moreover, scaling an application and running it in production is offered as a fundamental feature; in this way the downtime penalty to be paid, when a specific scaling is required, is drastically reduced.
Cloud consumers of PaaS can employ the tools and execution resources pro-
vided by cloud providers to develop, test, deploy and manage the applications
hosted in a cloud environment. PaaS consumers can be application develop-
ers who design and implement application software, application testers who
run and test applications in cloud-based environments, application deployers
who publish applications into the cloud, and application administrators who
configure and monitor application performance on a platform.
Moreover, an additional desired feature for a PaaS is portability, regarding both the applications and the PaaS itself. Selecting the right PaaS has a significant impact on keeping an application portable. Basically, an application should be portable among several deployments of the same PaaS with no issues, as a PaaS offers developers a set of services that are independent of the infrastructure, ensuring that the application and the operational tools integrated by developers are agnostic of any cloud infrastructure. Moreover, a more portable PaaS is capable of being installed on many IaaS offerings, thus increasing the portability of the application independently from the IaaS (open and portable PaaS offerings, like Cloud Foundry [20], can be deployed to public or private cloud configurations, giving you the most flexible deployment alternatives).
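In Cloud Foundry, for instance, the deployment description lives in a declarative application manifest that the platform interprets the same way on any of its deployments, public or private. The application name and sizes below are purely illustrative:

```yaml
---
# Illustrative manifest.yml: the same file can be pushed, unchanged,
# to any Cloud Foundry deployment, regardless of the underlying IaaS.
applications:
- name: hello-app        # hypothetical application name
  memory: 256M           # per-instance memory limit
  instances: 2           # number of application instances to run
```

Because the manifest names only platform-level concepts (instances, memory, services), nothing in it ties the application to a particular cloud infrastructure.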
2.1.3 SaaS
Software as a Service (SaaS) is a software delivery model in which both software and data are hosted entirely on the cloud, giving the consumer the capability to access the provider's applications, running on a cloud infrastructure, from various client devices through a thin-client interface such as a web browser.

In the SaaS domain, cloud applications can be built as compositions of other services from the same or different providers. Services such as user authentication, e-mail, payroll management, and calendars are examples of building blocks that can be reused and combined in a business solution in case a single, ready-made system does not provide all those features. Many building blocks and solutions are now available in public marketplaces. Applications reside at the top of the cloud stack; the services provided by this layer are typically accessed by end users through Web portals. SaaS cloud offerings are focused on supporting the use of large software packages while leveraging
cloud benefits. This layer is the most abstract one: most users of these packages access the services and applications directly, totally unaware of the underlying cloud support. Traditional desktop applications such as word processing and spreadsheets can now be accessed as a service on the Web. This model of delivering applications alleviates the burden of software maintenance for customers and simplifies development and testing for providers [14][15]. The SaaS model has no physical need for indirect distribution, since it is not distributed physically and is deployed almost instantaneously. Therefore SaaS customers have no hardware or software to buy, install, maintain, or update; access to applications is easy, as only an Internet connection is required. Applications, especially line-of-business services (large customizable business solutions aimed at facilitating business processes), are normally designed for ease of use and based upon proven business architectures. The advantages of this approach include:
- Multitenant architecture: all users and applications share a single, common infrastructure and code base that is centrally maintained. Because SaaS vendor clients are all on the same infrastructure and code base, vendors can innovate more quickly and save the valuable development time previously spent on maintaining numerous versions of outdated code;

- Customization: the ability to easily customize applications to fit enterprise business processes without affecting the common infrastructure. Because of the way SaaS is architected, these customizations are unique to each company or user and are always preserved through upgrades. That means SaaS providers can make upgrades more often, with less customer risk and much lower adoption cost;

- Scalability: most of the software runs on the provider's infrastructure, and the same provider is responsible for its availability and scalability.
Applications are steadily being moved to clouds, where they are exposed as services delivered via the Internet to user agents or humans and accessed through the ubiquitous web browser. In a SaaS approach, most of the time we do not have to worry about the installation, setup, and running of the application, because the service provider takes care of it; good realizations, for example, can be found in Google Apps [17] (a cloud-based productivity suite) and Microsoft Office 365 [18] (an online office suite based on the cloud).
2.2 Main Platforms
Before diving completely into a cloud technology, it is interesting to take a look at enterprise companies and how well the Cloud has been integrated and used by them in recent years. The cloud ecosystem is evolving very fast and several businesses have to deal with it: enterprise businesses need to use clouds, not build them. A cloud technology should be seen as a commodity, not as a different way to achieve the same tasks or reach the same clients differently. Nowadays, two of the main providers offering a suite of tools and a platform to ease the move towards Cloud Computing are Google and Amazon.
2.2.1 Google Cloud Platform
Google Cloud Platform [19] is a set of services that enables developers to build, test, and deploy applications on Google's reliable infrastructure. Generally, we talk about cloud computing when taking applications and running them on infrastructure other than our own. As developers, we should see the cloud as a service that provides resources to our applications. Built on the same infrastructure that allows Google to return billions of search results in milliseconds, the platform lets us rapidly develop, deploy, and iterate applications without worrying about system administration, as Google completely manages the application life cycle, database, and storage servers. As shown in Figure 2.2, the offering mainly covers two cloud layers: IaaS and PaaS.
Figure 2.2: Google Cloud Platform

Thanks to a solid infrastructure, the Compute Engine, the platform is capable of handling millions of requests, and applications can automatically scale up to handle the most demanding workloads and scale down when traffic subsides. Google's compute infrastructure provides consistent CPU, memory, and disk performance, while the network and edge cache serve responses rapidly to users across the world.
While Cloud Platform offers both a fully managed platform and flexible virtual machines, App Engine, a PaaS, supports application development when focusing only on code is required. In addition, if storage is required, the service offering comprises databases such as MySQL or NoSQL stores.
2.2.2 Amazon Web Services
Amazon Web Services (AWS) is a suite of remote services that together make up a cloud computing platform. While with Google Cloud Platform we have an environment at a higher logical level, closer to developers and applications, with AWS we reach a lower level, meaning direct access to virtual machines, configurations, and more adaptability. The most well-known services offered are EC2, S3, Route 53, and ELB. Elastic Compute Cloud (EC2) is the central service, allowing users to rent virtual computers and deploy applications on virtual machines called instances; users can select the most suitable flavor, a purpose-specific instance type with a different size or number of CPUs, GPUs, and amount of memory. Amazon Simple Storage Service (S3) is an online file storage web service providing storage through Web Services interfaces, based on a highly durable and available store for a variety of content, ranging from web applications to media files. Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service, designed to give developers and businesses an extremely reliable and cost-effective way to route end users to Internet applications by translating names into numeric IP addresses; Route 53 effectively connects user requests to infrastructure running in AWS and can also be used to route users to infrastructure outside of AWS. Finally, Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances, enabling users to achieve greater levels of fault tolerance in their applications by seamlessly providing the required amount of load-balancing capacity needed to distribute application traffic.
While with AWS we can tweak the architecture more and touch the lower layer directly, by configuring it and developing specific services that interface with the cloud suite, with Google Cloud Platform we get the same benefits at the cost of a more opaque architecture: quick to develop on and scalable, but not transparent.
After this overview of the different Cloud layers, we know what a PaaS is and why, for certain purposes, it is a good fit: its strong points are fast development, a pre-provisioned environment, and easy scalability. As we have seen, many businesses are running on the Cloud, and several technologies and approaches are utilized. The market offers different solutions and suites to meet developers' needs; many of them provide different layers of abstraction and environments to work with. However, all of these offerings come in a bundle, with a fairly static set of choices based on the vendor's technologies. On the other hand, some good projects at the Platform level exist; we want to take a look at a specific PaaS that has great momentum today. We are going to introduce the open PaaS par excellence: Cloud Foundry.
Chapter 3
Cloud Foundry
In the cloud era, the application platform delivered as a service, often called Platform as a Service (PaaS), makes it much easier to deploy, run, and scale applications. At the state of the art, some PaaS offerings support a limited set of languages and frameworks, do not deliver key application services, and restrict deployment to a single cloud. Cloud Foundry [20] is the industry's open PaaS and provides a choice of clouds, frameworks, and application services. As an open source project, there is a broad community both contributing to and supporting Cloud Foundry [21].
Open cloud and open source are only part of the transformation underway; there are also continuous innovation and high-velocity agile development along the way. Some open source projects foster inclusiveness and sacrifice velocity, while others increase velocity at the expense of transparency. Cloud Foundry's unique vision is to foster contributions from a broad community of developers, users, customers, partners, and independent software vendors while advancing development of the platform at extreme velocity. Cloud Foundry exists to provide a platform for this community of customers, partners, and even former competitors to collaborate, teach, share, and learn together, accelerating the pace of innovation and contribution.
3.1 A Good Choice
Cloud Foundry's openness does not stop at the code, but extends to different environments. Being an open Platform as a Service is about having the ability to make the choices [22] that best fit developers, as represented in Figure 3.1, such as:

- Choice of Developer Frameworks: the platform supports several common frameworks such as Spring for Java, Rails and Sinatra for Ruby, and Node.js. There is also support for Grails on Groovy and other JVM-based frameworks integrated into Cloud Foundry. While for now the choice is restricted to those languages, the project will integrate other languages as Cloud Foundry matures;

- Choice of Application Services: application services allow developers to take advantage of data, messaging, and web services as building blocks for their applications. Cloud Foundry currently offers abstract logical components that link applications to external services like MySQL, MongoDB, and Redis [23]. In addition, the services can be extended: the PaaS offers interfaces and constructs to link, from scratch, services that are not natively supported;

- Choice of Clouds: Cloud Foundry can run on a variety of clouds; both private and public are supported [24], and it is up to the developer and the organization where they want to run it. Cloud Foundry can run on top of OpenStack or Amazon Web Services as well;

- Type of Usage: the platform's code is open-sourced at CloudFoundry.org under the Apache License, making it easy for anyone to adopt and use the technology in virtually any way they want. This is one of the best ways to avoid the risk of lock-in and to foster additional innovation.
Figure 3.1: A Triangle of choice
Cloud Foundry is an interoperable PaaS framework that gives users freedom of choice across cloud infrastructures, application programming models, and cloud applications. Therefore developers no longer have to worry about virtual machine configuration or environment setup; a deployment can be really sped up, as we discussed in Section 2.1.2.
It appears clear now that Cloud Foundry needs to offer portability, extensibility, and scalability to comply with PaaS standards. The architecture itself demands modularity and cross-compatibility between the different IaaS offerings, in order to provide an environment ready to be used.
3.2 The Architecture
Cloud Foundry has been designed with a simple but effective concept in mind: "The closer to the center of the system, the dumber the code should be" [27]. Distributed systems raise fundamentally hard-to-solve problems [25]; every component that cooperates to form the entire system should be as simple as it possibly can be, and still do its job properly.
The architecture is both portable across different infrastructures and fundamentally extensible itself. The components are modular and loosely coupled:
they know each other only in a loosely coupled way, via a publish-subscribe message system called NATS, on which all direct calls to the different agents, and all the tasks and requests, travel. Every component in this system is horizontally scalable and self-reconfigurable in case of failure, meaning it is possible to add as many copies of each component as needed to support the load of a cloud, in any order, with resilience properties always in mind. Since everything is very decoupled, it does not even really matter where each component resides or runs.

We can break the whole system down into five main components, as displayed in Figure 3.2, plus a message bus: the Cloud Controller, the Health Manager, the Router, the DEAs (Droplet Execution Agents), and a set of Services.
Figure 3.2: Cloud Foundry Architecture
3.2.1 NATS
NATS is a lightweight publish-subscribe and distributed queuing cloud messaging system, Cloud Foundry's message bus, written in Ruby. It is the system that all the internal components communicate on.
The different agents within the architecture fire off messages and receive them from other components on different subjects. When the other components have completed their tasks, they usually send a message back on the NATS bus. Other components have the option to listen to what is happening on NATS and perform peripheral tasks, such as ensuring the DNS is correctly configured for deployed applications, logging activity, or managing scalability. The NATS client is also built with EventMachine [26], which means communication is asynchronous and does not block the invoker, which can immediately handle any NATS messages that are pushed to it; the pub-sub system additionally offers multiple subjects for different communications and tasks. When each daemon first boots, it connects to the NATS message bus, subscribes to the subjects it cares about (i.e., provision or heartbeat signals), and also begins to publish its own heartbeats and notifications. In this way we are able to replicate almost any component, as we only require the NATS endpoint, for each component, to acquire the connection and the message flow for each task.
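The subscribe-then-publish pattern described above can be sketched with a toy in-process bus. Subject and field names below are illustrative, and the real NATS client is asynchronous rather than this synchronous simplification:

```python
# Toy sketch of the publish-subscribe pattern NATS provides.
from collections import defaultdict

class Bus:
    """Minimal message bus: components subscribe to subjects, then publish."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, subject, callback):
        self.subscribers[subject].append(callback)

    def publish(self, subject, message):
        # Deliver the message to every subscriber of this subject.
        for callback in self.subscribers[subject]:
            callback(message)

bus = Bus()
received = []

# A component boots, subscribes to the subjects it cares about...
bus.subscribe("dea.heartbeat", received.append)

# ...and another component fires off a message on that subject.
bus.publish("dea.heartbeat", {"dea_id": 7, "state": "RUNNING"})
print(received)  # [{'dea_id': 7, 'state': 'RUNNING'}]
```

The key property, as in Cloud Foundry, is that the publisher never names the receiver: any number of copies of a component can subscribe to the same subject, which is what makes the horizontal replication described above possible.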
3.2.2 Cloud Controller
The Cloud Controller is the main orchestrator of the system. It is an application that uses EventMachine [28] to be fully asynchronous and Sinatra [29] (a web application library) to expose REST APIs. This component exposes the main REST interface that the Command Line Interface (CLI) tool "cf" talks to. The orchestrator of the system wears many hats, the main ones being:

- Control of the life cycle of an application;
- Initiation of the staging process of a new application;
- Selection of the best DEA agent;
- Reception of information from the Health Manager about applications;
- Control of clients' access credentials;
- Management of spaces, organizations, and users;
- Binding of services to the applications.
The Cloud Controller maintains a database (CC DB) with tables for organizations, spaces, applications, services, service instances, user roles, and tasks. Relying on the data structure created during the deployment and first run of this component, the Controller takes care of several tasks.
Each time a command is issued via the CF CLI, the Cloud Controller checks whether the user is authenticated (authentication is performed by providing a UAA token in the Authorization HTTP header) and whether the user has the right role, combined with a set of permissions, to manage the life cycle of the applications. There is an access validation whenever users try to access their associated space and organization. While the Cloud Controller can answer REST calls via Sinatra and provide the right endpoints for all client requests, by using Sequel (an object-relational mapping tool) it can asynchronously update the Postgres database and dynamically be notified of changes. The Cloud Controller can be seen as a Model View Controller (MVC) application, where: the model is persisted in a database and associated with specific logic classes at run time; the view is represented by the REST APIs that offer specific endpoints for the issued CF CLI commands; and the controller is partially obtained through specific REST logic and partially through a set of specific classes that interact directly with other Cloud Foundry components, via NATS, and are driven by database updates and events. The high-level architecture of this version of the Cloud Controller can be summarized as follows:

- Sinatra HTTP framework;
- Sequel ORM;
- Thread per request, currently using Thin in threaded mode;
- NATS-based communication with other CF components.
By adopting these components, the Cloud Controller can offer clients specific APIs that grant:

- Consistency across all resource URLs, parameters, request-response bodies, and error responses;
- Partial updates: a resource can be updated by providing a subset of its attributes;
- Pagination support for each of the collections;
- Filtering support for each of the collections.
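The pagination and filtering behaviour can be sketched independently of any real Cloud Controller endpoint; the field names and response shape below are illustrative assumptions:

```python
# Illustrative sketch of a paginated, filterable collection listing.

def list_collection(records, page=1, per_page=2, **filters):
    """Return one page of records matching simple equality filters."""
    matched = [r for r in records
               if all(r.get(k) == v for k, v in filters.items())]
    start = (page - 1) * per_page
    return {
        "total_results": len(matched),
        "page": page,
        "resources": matched[start:start + per_page],
    }

apps = [{"name": "a", "space": "dev"}, {"name": "b", "space": "dev"},
        {"name": "c", "space": "prod"}]
result = list_collection(apps, page=1, per_page=2, space="dev")
print(result["total_results"], [r["name"] for r in result["resources"]])
# 2 ['a', 'b']
```

A real API would expose `page`, `per_page`, and the filters as query-string parameters and add links to the next and previous pages, but the server-side logic reduces to this filter-then-slice step.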
A developer will typically interact with the Cloud Controller only during the first process of "pushing" an application to Cloud Foundry, which translates into a simple upload of the application and a transfer of only the files that are really required to run the piece of software. The deployment of an application always starts with an initial push. Thanks to the CF CLI and the Cloud Controller, the application's files are fingerprinted so that the orchestrator can keep track of changes, like a built-in version control system. The client then sends only the objects that the cloud requires in order to create a full "droplet" (a droplet is a tarball of all the application's code plus its dependencies, all wrapped up with a start and a stop button).
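The fingerprinting idea can be sketched as follows: hash each file's content so the client uploads only objects the cloud does not already hold. The helper names are assumptions for illustration:

```python
# Sketch of content fingerprinting for incremental uploads.
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-1 digest used to identify a file's content."""
    return hashlib.sha1(data).hexdigest()

def files_to_upload(local_files, known_shas):
    """Return the names of files whose content the server has never seen."""
    return [name for name, data in local_files.items()
            if fingerprint(data) not in known_shas]

app = {"app.rb": b"puts 'hello'", "Gemfile": b"gem 'sinatra'"}
# Suppose the server already stores the Gemfile's content.
cached = {fingerprint(b"gem 'sinatra'")}
print(files_to_upload(app, cached))  # ['app.rb']
```

Because the key is derived from the content rather than the file name, an unchanged file is never re-uploaded even if it is renamed or shared between applications.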
Moreover, the Cloud Controller is in charge of managing a blob store, containing:
- Resources: files that are uploaded to the Cloud Controller with a unique SHA, such that they can be reused without re-uploading;
- Application packages: un-staged files that represent an application;
- Droplets: the result of taking an application package, processing a buildpack, and getting it ready to run.
The blob store uses the Fog library (a Ruby cloud services library), so that it can use abstractions like Amazon S3 or an NFS-mounted file system for storage.
3.2.3 Droplet Execution Agent
This is an agent that runs on each node that actually runs the applications.
So in any particular cloud build of Cloud Foundry, there will be more DEA
nodes than any other type of node in a typical setup. Each DEA can be
configured to advertise a different capacity and a different built-in image for
the applications, identified via a "stack" label. So not all DEA nodes are of
the same size or able to run the same applications. The DEA itself is written
in Ruby and takes care of managing an application instance's life cycle. It can
be instructed by the Cloud Controller to start and stop application instances.
It keeps track of all started instances, and periodically broadcasts messages
about their state over NATS (meant to be picked up by the Health Manager).
The Droplet Execution Agents were designed to be as modular as possible.
We do not need to know the exact id of a DEA node, or make a direct call
to start or stop an application; when we talk to a DEA we need a service, an
agent that can fulfill the request. NATS has a central role here: the nodes
publish an advertise message with their capabilities, and the orchestrator,
the Cloud Controller, browses all the messages to find the most suitable node.
We can configure each execution node with a different capacity, stack, or
disk size in order to create pools of execution agents.
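The advertise-and-select pattern can be illustrated with a toy model. All names and fields here are invented for the sketch; a real DEA advertisement carries more detail:

```python
# Toy model of DEA advertise messages and orchestrator selection:
# match the requested stack, require enough free memory, and prefer
# the node with the most spare capacity.
adverts = [
    {"dea_id": "dea-1", "stack": "lucid64", "available_memory": 2048},
    {"dea_id": "dea-2", "stack": "lucid64", "available_memory": 512},
    {"dea_id": "dea-3", "stack": "windows", "available_memory": 8192},
]

def pick_dea(adverts, stack, memory_needed):
    candidates = [a for a in adverts
                  if a["stack"] == stack and a["available_memory"] >= memory_needed]
    if not candidates:
        return None
    return max(candidates, key=lambda a: a["available_memory"])["dea_id"]

assert pick_dea(adverts, "lucid64", 1024) == "dea-1"
```

Because selection works on messages rather than on node identities, any agent that can fulfill the request is interchangeable, which is the modularity goal stated above.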
The DEA does not necessarily care what language an app is written in. All
it sees are droplets: a droplet is a simple wrapper around an application that
takes one input, the port number to serve HTTP requests on, and has two
"buttons", start and stop. So the DEA treats droplets as black boxes: when
it receives a new droplet to run, it tells it what port to bind to and runs
the start script. A droplet, again, is just a tarball of the application, wrapped
up in a start/stop script, with all the configuration files rewritten to bind
to the proper database. Once the DEA tells the droplet what port to listen
on for HTTP requests and runs its start script, the app binds to the correct
port; the DEA then broadcasts the location of the new application on the bus
so the Routers can learn about it. If the app did not start successfully, log
messages are returned to the CF client that pushed the app, telling the user
why the app did not start. To summarize, the key functions of a Droplet
Execution Agent (DEA) are:
Stage applications: a DEA uses the appropriate buildpack to stage the application; the result of this process is a droplet;
Manage Warden containers: after the staging process the applications run in Warden containers, which the DEA is in charge of controlling;
Run droplets: a DEA manages the lifecycle of each application instance running on it, starting and stopping droplets upon request of the Cloud Controller. The DEA monitors the state of a started application instance, and periodically broadcasts application state messages over NATS for consumption by the Health Manager.
To guarantee good availability, the DEA periodically checks the health of the
applications running on it. If a URL is mapped to an application, the DEA
attempts to connect to the port assigned to the application. If the application
port is accepting connections, the DEA considers the application state
to be "Running". If there is no URL mapped to the application, the DEA
checks the system process table for the application's process PID; if the PID
exists, the DEA considers the application state to be "Running".
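The two checks just described can be written down directly. This is a simplified sketch (the real DEA logic differs in detail); the port check uses a plain TCP connect, and the PID check uses signal 0:

```python
import os, socket

# Sketch of the two DEA health checks: TCP connect for URL-mapped apps,
# process-table (PID) lookup otherwise.
def port_accepting(host, port, timeout=0.5):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pid_alive(pid):
    try:
        os.kill(pid, 0)   # signal 0: existence check only, nothing is sent
        return True
    except OSError:
        return False

def app_state(url_mapped, host, port, pid):
    if url_mapped:
        return "RUNNING" if port_accepting(host, port) else "CRASHED"
    return "RUNNING" if pid_alive(pid) else "CRASHED"

# Demo: a local listening socket stands in for an app with a mapped URL.
srv = socket.socket(); srv.bind(("127.0.0.1", 0)); srv.listen(1)
port = srv.getsockname()[1]
assert app_state(True, "127.0.0.1", port, None) == "RUNNING"
assert app_state(False, None, None, os.getpid()) == "RUNNING"
srv.close()
```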
3.2.4 Warden
Warden is the container for the different apps on DEA nodes. Warden's
primary goal is to provide a simple API for managing isolated environments.
These isolated environments (or containers) can be limited in terms of CPU
usage, memory usage, disk usage, and network access. The isolation is
achieved by namespacing kernel resources that would otherwise be shared,
because the applications are co-located on the same node. The intended
level of isolation is such that multiple containers present on the same
host should not be aware of each other's presence. This means that these
containers are given (among others) their own PID (Process ID) namespace,
network namespace, and mount namespace, while resource control is done
by using Control Groups (cgroups). Every container is placed in its own
control group, where it is configured to use an equal slice of CPU compared to
other containers, and a maximum amount of memory it may use. Warden
is a daemon that manages containers and can be controlled via a simple API,
rather than a set of tools that are individually executed. While the Linux
backend for Warden was initially implemented with LXC, the current version
no longer depends on it, because running LXC out of the box is a very
opaque and static process [30]. There is little control over when different
parts of the container start process are executed, and how they relate to each
other. Because Warden relies on a very small subset of the functionality that
LXC offers, this tool executes pre-configured hooks at different stages of the
container start process, so that required resources can be set up without
worrying about concurrency issues. These hooks make the start process more
transparent, allowing for easier debugging when parts of this process are not
working as expected.
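The cgroup configuration described above (an equal CPU slice per container, plus a memory ceiling) amounts to writing a few values under the container's control group. The sketch below only builds the settings as data; the paths and the helper are schematic, not Warden's actual file layout:

```python
# Illustrative only: the kind of per-container cgroup settings Warden
# applies. Paths and values are schematic, not Warden's real files.
def cgroup_settings(handle, memory_limit_bytes, cpu_shares=1024):
    base = f"/sys/fs/cgroup/instance-{handle}"
    return {
        f"{base}/cpu/cpu.shares": cpu_shares,                 # equal slice per container
        f"{base}/memory/memory.limit_in_bytes": memory_limit_bytes,
    }

settings = cgroup_settings("abc123", 256 * 1024 * 1024)
```

Because every container gets the same `cpu.shares` value, the kernel scheduler divides CPU time evenly between containers under contention, while the memory limit is a hard cap.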
3.2.5 Router
The Router routes traffic coming into Cloud Foundry to the appropriate
component, usually the Cloud Controller or a running application on a DEA
node. The Router is implemented in Go. Implementing a custom router in
Go gives full control over every connection to the router, which makes it
easier to support WebSockets and other types of traffic. All routing logic is
contained in a single process, removing unnecessary latency. When gorouter
is used in Cloud Foundry, it receives route updates via NATS from the
Droplet Execution Agents. By default, routes that have not been updated in
two minutes are pruned. Therefore, to maintain an active route, it needs to
be updated at least every two minutes. In this way we guarantee updated
routes and a sort of monitoring: if an application has an entry in the router
table, we can assume that it is running and reachable, otherwise it would
have been removed. If the DEA node or the application itself crashes, the
gorouter will lose the entry after a few minutes.
In a larger production setup there is a pool of Routers load balanced behind
Nginx or some other load balancer. These Routers listen on the bus for
notifications from the DEA nodes about new apps coming online and apps
going offline. When they get a real-time update they change their in-memory
routing table, which they consult in order to properly route requests. So a
request coming into the system goes through Nginx, or some other HTTP
termination endpoint, which then load balances across a pool of identical
Routers. One of the Routers answers the request and inspects its headers
just enough to find the Host: header, so it can pick out the name of the
application the request is headed for. It then does a basic hash lookup in the
routing table to find the list of potential backends that represent this
particular application.
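The register/prune/lookup cycle can be condensed into a small in-memory table. This is a simplified model in the spirit of gorouter, not its Go source; the two-minute staleness window matches the pruning rule described above:

```python
import random

# Simplified in-memory routing table: register routes from DEA updates,
# prune entries not refreshed within the staleness window, then pick a
# backend for the Host: header.
STALE_AFTER = 120  # seconds without an update before a route is pruned

class RoutingTable:
    def __init__(self):
        self.routes = {}   # host -> {backend: last_seen}

    def register(self, host, backend, now):
        self.routes.setdefault(host, {})[backend] = now

    def prune(self, now):
        for host, backends in list(self.routes.items()):
            fresh = {b: t for b, t in backends.items() if now - t < STALE_AFTER}
            if fresh:
                self.routes[host] = fresh
            else:
                del self.routes[host]

    def lookup(self, host, now):
        self.prune(now)
        backends = self.routes.get(host)
        return random.choice(list(backends)) if backends else None

table = RoutingTable()
table.register("app.example.com", "10.0.0.5:61001", now=0)
table.register("app.example.com", "10.0.0.6:61002", now=0)
assert table.lookup("app.example.com", now=60) is not None
assert table.lookup("app.example.com", now=200) is None  # both routes stale
```

The pruning inside `lookup` is what turns the routing table into the implicit health monitor the text describes: a crashed DEA simply stops refreshing its routes, and they vanish.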
3.2.6 Health Manager
The Health Manager is a standalone daemon that has a copy of the same
models the Cloud Controller has and can currently see into the same database
as the Cloud Controller. At a configured interval, the daemon wakes up
and scans the database of the Cloud Controller to see what the state of
the world "should be", then inspects the real state to make sure it matches
the desired one. If there are things that do not match, it sends specific
messages back to the Cloud Controller to correct the incongruity. This is
how the loss of an application, or even of a whole DEA node, is handled.
If an application goes down, the Health Manager will notice and will quickly
remedy the situation by signaling the Cloud Controller to start a new
instance. If a DEA node completely fails, the app instances running on it
are redistributed across the grid of remaining DEA nodes. The Health
Manager monitors the state of the applications and ensures that started
applications are indeed running, and that their versions and numbers of
instances are correct. The Health Manager is essential to ensuring that apps
running on Cloud Foundry remain available and scale correctly; it is needed
to restart applications whenever the DEA running an app shuts down for
any reason, Warden kills the app because it violated a quota, or the
application process exits with a non-zero exit code.
Conceptually, this is done by maintaining the actual state of applications and
comparing it against the desired state. When discrepancies are found, actions
are initiated to bring the applications to the desired state, i.e. start/stop
commands are issued for missing or extra instances, respectively. The current
Cloud Foundry release uses this component, but a brand new version will
soon be released under the name "HM9000".
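The desired-versus-actual comparison is a small reconciliation loop. The sketch below is conceptual (function and variable names are ours, not the Health Manager's): it takes desired instance counts and actual ones, and emits the start/stop commands needed to converge:

```python
# Conceptual sketch of the Health Manager reconciliation loop: compare
# desired instance counts (from the Cloud Controller's database) with
# what is actually running, and emit corrective commands.
def reconcile(desired, actual):
    """Return (start, stop) commands that converge actual -> desired."""
    start, stop = [], []
    for app, want in desired.items():
        have = actual.get(app, 0)
        if have < want:
            start.append((app, want - have))    # missing instances
        elif have > want:
            stop.append((app, have - want))     # extra instances
    for app in actual:
        if app not in desired:                  # app should not run at all
            stop.append((app, actual[app]))
    return start, stop

desired = {"web": 3, "worker": 1}
actual = {"web": 1, "old-app": 2}
start, stop = reconcile(desired, actual)
assert start == [("web", 2), ("worker", 1)]
assert stop == [("old-app", 2)]
```

A DEA node failure shows up here simply as several apps with `have < want`, which is why the instances naturally get redistributed across the remaining nodes.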
3.2.7 User Account and Authentication Server
Also called the UAA, this is the identity management service for Cloud
Foundry. Its primary role is as an OAuth2 provider, issuing tokens for client
applications to use when they act on behalf of Cloud Foundry users. It can
also authenticate users with their Cloud Foundry credentials, and can act as
a Single Sign-On (SSO) service using those credentials (or others). It has
endpoints for managing user accounts and for registering OAuth2 clients, as
well as various other management functions. It provides single sign-on for
web applications and secures Cloud Foundry resources. In addition, it grants
access tokens to client applications for use in accessing Resource Servers in
the platform, including the Cloud Controller. It is a plain Spring MVC
webapp that provides:
OAuth2 authorize and token endpoints;
A login endpoint, to allow querying for login prompts;
A check token endpoint, to allow resource servers to obtain information about an access token submitted by an OAuth2 client;
A Simple Cloud Identity Management (SCIM) user provisioning endpoint;
OpenID Connect endpoints to support authentication, get user info and check ids.
Authentication can usually be performed by command line clients by
submitting credentials directly to the authorization endpoint.
3.2.8 Services
Cloud Foundry Services are add-ons that can be provisioned alongside an
application. There are two ways in which Cloud Foundry enables developers
to add services to their applications: Managed Services and User-provided
Service Instances. Managed Services have been integrated with Cloud Foundry
via APIs and provision new service instances and credentials on demand,
while User-provided Service Instances are a mechanism to deliver credentials
to applications for service instances which have been pre-provisioned outside
of Cloud Foundry.
-
50 Cloud Foundry
3.2.8.1 User-provided Service Instances
Sometimes we only need to provide a simple endpoint and a set of credentials
to our application before we push it to the PaaS. If we know the connection
parameters before the deployment, or simply do not need any Broker logic,
we can inject these settings during the publication phase. The Cloud Foundry
CLI will prompt us for the basic, static information that we can associate
with our service whenever we want to add a Service Instance, an entity
represented by a set of hostname, port and password.
Service Instances enable developers to use external services with their
applications using familiar workflows; the user-provided ones (as opposed to
those provided via a Service Broker) are service instances which have been
provisioned outside of Cloud Foundry, as we can only define the parameters
to connect to, without providing any kind of additional logic. For example, a
DBA may provide a developer with credentials to an Oracle database managed
outside of, and unknown to, Cloud Foundry. Rather than hard-coding
credentials for these instances into an application, it is possible to create a
mock service instance in Cloud Foundry to represent the external resource
using the familiar create-service command, and provide whatever credentials
the application requires. Once created, user-provided service instances
behave just like other service instances.
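At runtime, Cloud Foundry delivers bound credentials to the application through the VCAP_SERVICES environment variable as JSON. The sketch below shows how an app might read them; the service name and credential fields are invented for the example, and the exact JSON structure is simplified:

```python
import json, os

# Simulate the environment Cloud Foundry would provide to a bound app
# (structure sketched; a real VCAP_SERVICES document carries more fields).
os.environ["VCAP_SERVICES"] = json.dumps({
    "user-provided": [{
        "name": "legacy-oracle",
        "credentials": {"host": "db.internal", "port": 1521, "password": "s3cret"},
    }]
})

def service_credentials(name):
    """Find the credentials of a bound service instance by its name."""
    services = json.loads(os.environ.get("VCAP_SERVICES", "{}"))
    for instances in services.values():
        for inst in instances:
            if inst.get("name") == name:
                return inst["credentials"]
    return None

creds = service_credentials("legacy-oracle")
assert creds["host"] == "db.internal"
```

This is the mechanism that makes hard-coding credentials unnecessary: the app looks them up by service name, regardless of whether the instance is user-provided or broker-managed.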
3.2.8.2 Managed Services
Cloud Foundry provides an API which is used to integrate services with
Cloud Foundry. Each time a new Managed Service is added, an interaction
starts where the Cloud Controller is the client and the manager of those
services is the Service Broker. The APIs involved are RESTful HTTP APIs;
they should not be confused with the version of the Cloud Controller API,
which is often used to refer to the version of Cloud Foundry itself (when
one refers to Cloud Foundry v2, one typically refers to the Cloud Controller
version). The services API is versioned independently of the Cloud
Controller API.
When we need to offer a Managed Service, we need to provide a logical
construct called a Service Broker. The Broker is the term used to refer to a
component which implements the service broker API and offers an endpoint
to the Cloud Controller during the provisioning of the required service. In
general, service brokers advertise a catalog of service offerings and service
plans to Cloud Foundry, and receive calls from Cloud Foundry for five
functions: fetch catalog, create service, bind service, unbind service and
delete service. What a broker does with each call can vary between services;
it is totally up to the business logic that a developer would add. In general,
the create command reserves resources on a service and bind delivers to an
application the information necessary for accessing the resource. The
reserved resource is called a Service Instance; what a service instance
represents can vary by service: it could be a single database on a
multi-tenant server, a dedicated cluster, or even just an account on a web
application.
How a Service Broker handles the life cycle of all the services it creates is
again up to the service provider/developer. Basically, Cloud Foundry only
requires that the service provider implement the service broker API: a broker
can be implemented as a separate application, or by adding the required
HTTP endpoints to an existing service; we only need to comply with a
specific set of REST APIs.
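The five broker operations can be sketched as plain functions. This is a skeleton with invented names and a toy catalog; a real broker would expose these as the HTTP endpoints the Cloud Controller calls:

```python
# Skeleton of the five broker operations: fetch catalog, create, bind,
# unbind, delete. A real broker exposes these over HTTP; here they are
# plain functions over in-memory state.
CATALOG = {"services": [{"name": "toy-db", "plans": [{"name": "small"}]}]}
instances, bindings = {}, {}

def fetch_catalog():
    return CATALOG

def create_service(instance_id, plan):
    instances[instance_id] = {"plan": plan}          # reserve the resource

def bind_service(instance_id, binding_id):
    # deliver the connection info an app needs for this instance
    bindings[binding_id] = {"credentials": {"uri": f"toydb://{instance_id}"}}
    return bindings[binding_id]

def unbind_service(binding_id):
    bindings.pop(binding_id, None)

def delete_service(instance_id):
    instances.pop(instance_id, None)

create_service("inst-1", "small")
info = bind_service("inst-1", "bind-1")
assert info["credentials"]["uri"] == "toydb://inst-1"
unbind_service("bind-1"); delete_service("inst-1")
assert not instances and not bindings
```

The split between create (reserve a resource) and bind (hand credentials to one application) is what lets a single Service Instance be shared by several apps, each with its own binding.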
3.3 Roles and Organizations
As shown in Figure 3.3, Cloud Foundry offers many meta objects within the
concept of the organization. Each organization is a logical abstraction that
encompasses three things: domains, spaces and users. A domain is exactly
a domain name, like acme.com or foo.net. This feature allows the final user
to associate applications with custom domains registered to an organization.
Each application deployed on the PaaS is always a web application that needs
to be reachable from the Internet. Via a domain, we can configure several
applications and aggregate them under the same Internet domain name.
Typically, when we use Cloud Foundry for the first time, a default domain
is available to all spaces. Domains can also be multi-level and contain
sub-domains, like "store" in "store.acme.com". Domain objects belong to an
organization and are associated with zero or many spaces within the organization,
moreover, they are not directly bound to an application; a child of a domain
object called a route is.

Figure 3.3: Organization and Roles

A route is associated with an application and binds the application to the
domain. Once a web application is pushed to Cloud Foundry, a route must
be provided. The route is, in the end, a subdomain that lets the Router
component forward all requests to the right application. The Space, as
shown in Figure 3.3, is always part of an organization; in addition, every
organization can have multiple spaces. The concept of spaces provides
separation and boundaries for all the applications. The default spaces for a
standard Cloud Foundry installation are development, test, and production.
In each space we can deploy multiple applications.
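The organization/domain/space/route relationships just described can be captured in a toy model. The structure below is our own illustration of Figure 3.3, not a Cloud Foundry data model:

```python
# Toy model of the relationships in Figure 3.3: an organization owns
# domains and spaces; a route (subdomain + domain) binds an app to a
# domain inside a space.
org = {
    "name": "acme",
    "domains": ["acme.com"],
    "spaces": {"development": [], "test": [], "production": []},
}

def make_route(subdomain, domain, app, space):
    assert domain in org["domains"], "domain must belong to the organization"
    url = f"{subdomain}.{domain}"
    org["spaces"][space].append({"app": app, "route": url})
    return url

assert make_route("store", "acme.com", "storefront", "production") == "store.acme.com"
```

Note that the route, not the domain, is what the Router component matches against the Host: header of an incoming request.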
In order to control and manage the users and the whole organization, some
permissions are required, such as:
Org manager: the org-admin permission is used to edit the Access Control List (ACL) on the organization. The org-admin permission is required in order to create or delete an app-space, to enumerate app-spaces, to manage organization-level features, to change the plan for an organization, and to add users to the organization;
-
3.4 Command Line Client 53
Org audit: the org audit permission gives the user rights to see all organization-level and application-space-level reports, and also all organization-space-level and app-space-level events;
The app space permissions are required to handle applications and services;
some of them are:
App space manager: this permission is required to edit the ACL on an app-space. In addition, it is required to add additional managers, to invite developers, and to enable/disable/add features to the app-space which can then be used by applications within the app-space. The admin permission on an app-space does not give one the ability to create or delete app-spaces; this function is considered to be an operation on the org object;
Developer: this permission is required in order to perform all operations against apps and services within the app-space. With this permission it is possible to create, delete, stop, change instance count, bind/unbind services, read logs and files, read stats, enumerate apps, and change app settings. If we were to map this to today's current system, all users have the developer permission for their account;
App space audit: the audit permission is required to read all state from the app-space and all contained apps. If all users have this audit access, they can do anything that is non-destructive. They can enu-