
+ CS 325: CS Hardware and Software Organization and Architecture

Cloud Architectures

+

“Computation may someday be organized as a public utility”

- John McCarthy, 1961

+Outline

Introduction
  Software as a Service (SaaS)
  Platform as a Service (PaaS)
  Infrastructure as a Service (IaaS)

Background

Computational Resource Load Balancing

+Cloud Computing Architecture

What is cloud computing?

+Introduction

Scalable resource hosting
  Storage Computational Software APIs Applications

Tailored services
  Software as a Service (SaaS)
  Platform as a Service (PaaS)
  Infrastructure as a Service (IaaS)

Billed like a utility
  Monthly, depending on usage

+Introduction

No formal definition!

A set of service-oriented architectures that allow users to access a number of resources in a way that is scalable, elastic, on-demand, and cost-efficient.

[Figure: Client, Server, Cloud Interface, Compute Service, Storage Service, Other Services]

+Introduction

[Figure: Server, Cloud Interface, Compute Service, Storage Service, Other Services]

Infrastructure as a Service (IaaS) [2-4]

Lowest service level in cloud stack.

Provides compute, storage, and networking services using hardware virtualization.

Platform as a Service (PaaS) [2-4]

Software as a Service (SaaS) [2-4]

2. Edmonds, A., S. Johnston, T. Metsch, and G. Mazzaferro
3. Liu, F., J. Tong, J. Mao, R. Bohn, J. Messina, M. Badger, and D.
4. Canonical Group Ltd.

+Introduction

Typical General Purpose Private Cloud Architecture (Eucalyptus [5])

5. Eucalyptus Systems

+Types of Clouds

Public Cloud
  Marketed based on:
    Resources offered
    Availability
    Security
    Price

Local Cloud
  Cloud architectures tailored to an organization’s needs

Hybrid Cloud
  Combination of public and local cloud resources

+Common Public Cloud Vendors

+

Those public vendors are great, but what if an organization wants to build its own local cloud?

+Implementation Considerations of a Local Cloud

Cloud resource maintenance

Security Software Hardware Network Users

Computational resource power requirements
  With scalability comes increased power demands

Storage resource power requirements


Cloud Architecture Background

+Background

1960s: Concept of delivering computing resources through a global network

1970s: Computer clusters

1990s: Grid computing

2000s: Cloud, an evolution of grid and cluster computing

+Cloud Layers

Clients – Thick client, thin client, mobile client
Application Layer – SaaS
Platform Layer – PaaS
Infrastructure Layer – IaaS
Hardware Layer – Physical cloud resources

[Figure: layer stack from top to bottom: Client, Application, Platform, Infrastructure, Hardware]

+Local Cloud Architecture - Eucalyptus

Open source cloud architectures have different names for components, but share basic concepts.

Five components:
  Cloud Controller Node
  Walrus (Image) Storage Node
  User Persistent Storage Node
  Cluster Controller Node
  Compute Node

+Cloud Controller Node

The entry point into the cloud for:
  Administrators
  Developers
  Project managers
  End users

The CLC:
  Queries other components for information about resources
  Makes high-level scheduling decisions
  Makes requests to Cluster Controller Nodes

As the interface to the cloud architecture, the CLC is responsible for exposing and managing the underlying virtualized resources (servers, network, and storage).

Users can access the CLC through command line tools or a web interface.

+Walrus (Image) Controller Node

Used for storing virtual machine images and snapshots of user VM images.

When a user requests resources from the cloud architecture, those resources are given (if available) in the form of virtual machines. More on this later.

+Storage Controller Node

Holds user generated data.

The cloud architecture needs a place to hold user data even when the user is not currently active.

+Cluster Controller Node

Used for communication between CLC and Compute Node Controllers.

The CC gathers information about a set of NCs and schedules virtual machine execution on specific NCs.

The CC also manages the virtual machine networks.
  DHCP service for VMs located on NCs
  IP of NCs also managed by CC

All NCs associated with a single CC must be on the same subnet.
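The same-subnet requirement can be checked mechanically. The sketch below uses Python's standard `ipaddress` module; the function name `ncs_share_subnet`, the addresses, and the /24 prefix are illustrative assumptions and not part of Eucalyptus.

```python
import ipaddress
from typing import Iterable

def ncs_share_subnet(cc_subnet: str, nc_addresses: Iterable[str]) -> bool:
    """Return True if every NC address lies inside the CC's subnet."""
    network = ipaddress.ip_network(cc_subnet)
    return all(ipaddress.ip_address(addr) in network for addr in nc_addresses)

# Hypothetical addresses, for illustration only.
same = ncs_share_subnet("192.168.10.0/24", ["192.168.10.11", "192.168.10.12"])
mixed = ncs_share_subnet("192.168.10.0/24", ["192.168.10.11", "192.168.20.5"])
print(same, mixed)
```

In the second call one NC sits on a different subnet, so under the rule above it could not be attached to this CC.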

+Compute Node Controller

Used for hosting virtual machines.

The NC controls VM activities:
  Launch/execution
  Inspection
  Migration
  Termination

The NC also:
  Fetches and maintains a local cache of VM images from the Walrus.
  Queries and controls the host OS and hypervisor in response to queries and control requests from the CC.
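The local image cache can be pictured as follows. This is a toy sketch, not Eucalyptus code: the `ImageCache` class, the `fake_walrus` function, and the image ID `emi-demo` are all hypothetical stand-ins, and a real NC would store images on disk rather than in memory.

```python
from typing import Callable, Dict

class ImageCache:
    """Toy local cache of VM images, keyed by image ID."""

    def __init__(self, fetch_from_walrus: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_walrus      # hypothetical stand-in for a Walrus download
        self._images: Dict[str, bytes] = {}  # image ID -> image contents
        self.remote_fetches = 0              # how often Walrus was actually contacted

    def get(self, image_id: str) -> bytes:
        """Return the image, contacting Walrus only on a cache miss."""
        if image_id not in self._images:
            self._images[image_id] = self._fetch(image_id)
            self.remote_fetches += 1
        return self._images[image_id]

def fake_walrus(image_id: str) -> bytes:
    """Pretend image store that fabricates bytes for any ID (assumption)."""
    return f"image-bytes-for-{image_id}".encode()

cache = ImageCache(fake_walrus)
cache.get("emi-demo")  # miss: fetched from the stand-in Walrus
cache.get("emi-demo")  # hit: served from the local cache
print(cache.remote_fetches)
```

Launching the same image twice on one NC costs a single transfer, which is the point of keeping the cache local.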

+Compute Node Controller

Maintains the resources that are given to end users
  CPUs, RAM, local disk (HDD, SSD) resources
  In the form of virtual machines

Hosts virtual machines using a hypervisor
  Xen, KVM, ESXi
  Grid Nvidia VGX

Hybrid approach to hypervisor selection is common.
  Windows, Linux, Mac OS X

+Notes on Resource Virtualization

Cloud architectures generally provide physical resources to end users in the form of virtual machines.

Virtual machines execute as process instances within an instance manager called a “hypervisor”.
  Allows multiple guest operating systems to run on a single host.

+Notes on Resource Virtualization

| Full virtualization                      | Paravirtualization                                 | Kernel-based virtualization                            |
|------------------------------------------|----------------------------------------------------|--------------------------------------------------------|
| Unmodified guest kernel                  | Modified guest kernel                              | Unmodified guest kernel                                |
| Not aware of hypervisor                  | Aware of hypervisor                                | Not aware of hypervisor                                |
| Open or closed source OS                 | No closed-source OS support                        | Open or closed source OS                               |
| Slowest due to device emulation overhead | May have better performance due to modified kernel | Best performance due to matching guest and host kernel |

+Common IaaS Architecture


Computational Resource Load Balancing and Consolidation

+Computational Resource Load Management

Cloud size increases in two areas:
  Computational power
  Storage capacity

As the cloud grows, power management of compute nodes becomes necessary.

Load Balancing:
  Distribute VM requests evenly across compute nodes to ensure high resource availability.
  Disadvantage: Higher power consumption

Load Consolidation:
  Maximize utilization on as few compute nodes as possible to reduce power consumption.
  Disadvantage: Higher resource latency
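The two policies differ only in which node receives the next VM. The sketch below contrasts them under simplifying assumptions that are not from the slides: every VM takes one slot, every node has the same slot capacity, and the node names are made up.

```python
from typing import Callable, Dict, List, Optional

def place_balanced(load: Dict[str, int], capacity: int) -> Optional[str]:
    """Load balancing: pick the least-loaded node that still has room."""
    candidates = [n for n, used in load.items() if used < capacity]
    return min(candidates, key=lambda n: load[n]) if candidates else None

def place_consolidated(load: Dict[str, int], capacity: int) -> Optional[str]:
    """Load consolidation: pick the most-loaded node that still has room."""
    candidates = [n for n, used in load.items() if used < capacity]
    return max(candidates, key=lambda n: load[n]) if candidates else None

def simulate(place: Callable[[Dict[str, int], int], Optional[str]],
             nodes: List[str], capacity: int, requests: int) -> Dict[str, int]:
    """Place `requests` one-slot VMs and return the resulting load per node."""
    load = {n: 0 for n in nodes}
    for _ in range(requests):
        node = place(load, capacity)
        if node is None:  # cloud is full
            break
        load[node] += 1
    return load

def active_nodes(load: Dict[str, int]) -> int:
    """Nodes hosting at least one VM, i.e. nodes that must stay powered on."""
    return sum(1 for used in load.values() if used > 0)

nodes = ["nc1", "nc2", "nc3", "nc4"]  # hypothetical node names
balanced = simulate(place_balanced, nodes, capacity=4, requests=4)
consolidated = simulate(place_consolidated, nodes, capacity=4, requests=4)
print(active_nodes(balanced), active_nodes(consolidated))
```

With four requests, balancing keeps all four nodes busy while consolidation packs everything onto one, which is the power versus availability trade-off in miniature.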

+Computational Resource Load Balancing

Solution: Power Aware Computational Resource Load Balancing

Power down compute nodes when they are not needed to reduce wasted power consumption.
  Trade-off between power consumption and resource availability.

+Computational Resource Load Balancing

Resource Load Balancing Algorithm:
  Hosted on the cloud’s Cluster Controller
  Requests are handled by “powered on” compute nodes
  All available compute nodes are active

+Computational Resource Load Balancing

+Computational Resource Load Consolidation

Resource Load Consolidation Algorithm:
  Hosted on the cloud’s Cluster Controller
  Only the minimum number of compute nodes are active to handle the current job requests
  When a request asks for resources that are not available, a new compute node will power on.

How to handle consolidation state over time?
  Consolidation algorithm runs based on threshold, performs virtual machine live migrations, and powers down unneeded node controllers.

Adverse effects of power cycling over time?
  Complete shutdown/reboot process
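One way to picture a threshold-driven consolidation pass is sketched below. It is an illustration under stated assumptions, not the algorithm used in the course project: every VM occupies one slot, all nodes share one capacity, a node is drained only when its utilization is below the threshold and the other active nodes can absorb all of its VMs, and a "migration" is just a bookkeeping move.

```python
from typing import Dict, List, Tuple

def consolidate(load: Dict[str, int], capacity: int, threshold: float
                ) -> Tuple[Dict[str, int], List[Tuple[str, str]], List[str]]:
    """Run one consolidation pass.

    Returns the new load map, the (source, target) migrations performed,
    and the nodes that were emptied and can be powered down.
    """
    load = dict(load)  # work on a copy
    migrations: List[Tuple[str, str]] = []
    powered_down: List[str] = []
    # Lightest nodes are the cheapest to drain, so consider them first.
    for src in sorted(load, key=lambda n: load[n]):
        # Nodes with no VMs are treated as already powered off.
        if load[src] == 0 or load[src] / capacity >= threshold:
            continue
        targets = [n for n in load if n != src and load[n] > 0]
        spare = sum(capacity - load[n] for n in targets)
        if spare < load[src]:
            continue  # not enough room elsewhere; leave this node running
        # Fill the fullest targets first to keep the active set small.
        for dst in sorted(targets, key=lambda n: load[n], reverse=True):
            while load[src] > 0 and load[dst] < capacity:
                load[src] -= 1
                load[dst] += 1
                migrations.append((src, dst))
        powered_down.append(src)
    return load, migrations, powered_down

# Hypothetical cluster state: VM count per node, 4 slots per node.
state = {"nc1": 3, "nc2": 1, "nc3": 1, "nc4": 0}
new_state, moves, off = consolidate(state, capacity=4, threshold=0.5)
print(new_state, moves, off)
```

Here nc2's single VM moves to nc1 and nc2 can be powered down, while nc3 stays on because no active node has a free slot left for its VM.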

+Computational Resource Load Balancing

+Computational Resource Load Management – Turtle Project

A small computer, Turtle, was built (~$100) to test the effect of constant power cycling.

Heartbeat – 2 minutes
  Server powers on Turtle (cron job – every 2 minutes, Wake-on-LAN)
  Turtle boots Ubuntu Server 11.04 and MySQL client
  Turtle writes to server MySQL DB (timestamp)
  Turtle stays powered on for approximately 1 minute, then powers off.
  # of power cycles determined by counting the # of timestamps in the server DB
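The counting step amounts to one row per boot and a `COUNT(*)` over the table. The sketch below uses an in-memory SQLite database as a stand-in for the server's MySQL DB; the table name `heartbeat`, the column name, and the start time are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

# In-memory SQLite stands in for the server's MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heartbeat (booted_at TEXT NOT NULL)")

def record_boot(db: sqlite3.Connection, when: datetime) -> None:
    """What Turtle does after each boot: write one timestamp row."""
    db.execute("INSERT INTO heartbeat (booted_at) VALUES (?)", (when.isoformat(),))

def count_power_cycles(db: sqlite3.Connection) -> int:
    """Each stored timestamp corresponds to one successful power cycle."""
    return db.execute("SELECT COUNT(*) FROM heartbeat").fetchone()[0]

start = datetime(2012, 1, 1)  # arbitrary illustrative start time
for i in range(5):            # five simulated boots, 2 minutes apart
    record_boot(conn, start + timedelta(minutes=2 * i))

print(count_power_cycles(conn))
```

A boot that fails before the write leaves no row, so the count reflects successful cycles only.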

After 6 weeks: 13,000 successful power cycles
  1 year avg: 36 cycles/day
  2 year avg: 18.2 cycles/day
  3 year avg: 11.9 cycles/day
  4 year avg: 8.9 cycles/day
  5 year avg: 7.2 cycles/day

+Computational Resource Load Management – Turtle Project

Turtle eventually died after 11 months of continuous power cycling.
  CPU fan malfunctioned, causing the CPU to overheat.
  Over 118,000 heartbeats

Organizations keep computation resources 3-5 years

Experiment shows power cycling has no adverse effect
