eucalyptus: an open source infrastructure for elastic computing research

31
EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research Rich Wolski Chris Grzegorczyk, Dan Nurmi, Graziano Obertelli, Shriram Rajagopalan, Sunil Soman, Lamia Youseff, Dmitrii Zagorodnov Computer Science Department University of California, Santa Barbara

Upload: wesley-wallace

Post on 31-Dec-2015

22 views

Category:

Documents


3 download

DESCRIPTION

EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research. Rich Wolski Chris Grzegorczyk, Dan Nurmi, Graziano Obertelli, Shriram Rajagopalan, Sunil Soman, Lamia Youseff, Dmitrii Zagorodnov Computer Science Department University of California, Santa Barbara. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

EUCALYPTUS:An Open Source Infrastructure for

Elastic Computing ResearchRich Wolski

Chris Grzegorczyk, Dan Nurmi, Graziano Obertelli, Shriram Rajagopalan, Sunil

Soman, Lamia Youseff, Dmitrii Zagorodnov

Computer Science Department

University of California, Santa Barbara

Page 2: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Exciting Weather Forecasts

Page 3: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Commercial Cloud Formation

Page 4: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

What is a Cloud?

SLAs

Web Services

Virtualization

Page 5: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

How do they work?

• What can and cannot easily be hosted in a cloud?

• What extensions or modifications are required to support a wider variety of services and applications?—Scientific computing—Data assimilation—Multiplayer gaming

• How can cloud computing be coupled with other distributed software systems and infrastructure?—How should clouds and mobile devices (e.g. cell phones)

interact?

• Open Source Cloud—Simple—Extensible—Based on widely available and popular technologies—Easy to install and maintain

Page 6: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

The Skies are Opening

• Nimbus (Freeman and Keahey, University of Chicago)—Client-side cloud-computing interface to Globus-enabled

TeraPort cluster at U of C—Based on GT4 and the Globus Virtual Workspace Service

– Lots of cool features– Great if local resources are GT4 proficient– Tutorials and documentation in “grid space”

• Enomalism—Start-up company distributing open source —REST APIs—User “dashboard”—Multi-virtulaization support—Lost of extended cloud services—Beta version now available for download from SourceForge

Page 7: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

• Elastic Utility Computing Architecture Linking Your Programs To Useful Systems

• Web services based implementation of elastic/utility/cloud computing infrastructure—Linux image hosting ala Amazon

• How do we know if it is a cloud?—Try and emulate an existing cloud: EC2 + S3—Works with command-line tools from Amazon w/o

modification—Enables leverage of emerging EC2 value-added service

venues (e.g. Rightscale)

• Functions as a software overlay—Existing installation should not be violated (too

much)

• “One-button” install using Rocks—“System Administrators are people too.”

Page 8: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Goals for Eucalyptus

• Foster research in elastic/cloud/utility computing —models of service provisioning, scheduling, SLA

formulation, hypervisor portability and feature enhancement, etc.

• Experimentation vehicle prior to buying commercial services—“Tech Preview” using local machines with local

system administration support

• Provide a debugging and development platform for EC2 (and other clouds)—Allow the environment to be set up and tested

before it is instantiated in a for-fee environment

• Provide a basic software development platform for the open source community—E.g. the “Linux Experience”

• Not a designed as a replacement technology for EC2 or any other cloud service

Page 9: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Challenges

• Extensibility—Simple architecture and open internal APIs

• Client-side interface—Amazon’s EC2 interface and functionality (familiar and

testable)

• Networking—Virtual private network per cloud—Must function as an overlay => cannot supplant local

networking

• Security—Must be compatible with local security policies

• Packaging, installation, maintenance—system administration staff is an important constituency

for uptake

Page 10: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Eucalyptus Architecture: WS-Cloud

Client-side APITranslator

Cloud Controller

Cluster Controller Node Controller

Amazon EC2 Interface

Database

Page 11: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

EC2 Compatibility

• Interface is based on Amazon’s published WSDL—2008 compliant except for

– static IP address assignment– Security groups

—“Availability” zones correspond to individual clusters—Uses the EC2 command-line tools downloaded from

Amazon—REST interface

• S3 support/emulation: not yet, but on its way—Images accessed by file system name instead of S3 handle

for the moment– Unless user wants to use the actual S3 and pay for the

egress charges

• System administration is different—Eucalyptus defines its own Cloud Admin. tool set for user

accounting and cloud management

Page 12: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Networking

• Eucalyptus does not assume that all worker nodes will have publicly routable IP addresses

—Each cloud allocation will have one or more public IP addresses

—All cloud images have access to a private network interface

• Two types of networks internal to a cloud allocation—Virtual private network

– Uses VDE interfaced to Xen that is set up dynamically– Substantial performance hit within a cluster– Allows a cloud allocation to span clusters

—High-performance private network (availability zone)– Bypasses VDE and uses local cluster network for each

allocation– Runs at “native” network speed (I.e. with Xen)– Cloud allocations cannot span clusters

• Availability zone approach fits with Amazon’s high-level semantics

Page 13: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Virtual Network: Ethernet Overlay

vde vde vde

vde

vde vde vde

vde

ssl

Page 14: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Performance of the Virtual Network

Page 15: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Security

• All Eucalyptus components use WS-security for authentication—Encryption of inter-component communication is not

enabled by default– Configuration option

• Ssh key generation and installation ala EC2 is implemented—Cloud controller generates the public/private key pairs and

installs them

• User sign-up is web based—User specifies a password and submits sign-up request—Cert is generated but withheld until admin. approves

request—User gains access to cert. through password-protected web

page– Similar to EC2 model without the credit cards

Page 16: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Packaging, Installation, and Deployment

• Rocks—“One-button” install per cluster—Requires Rocks V (the most current release) for Xen

support—If you know what you are doing, RPMs can be

extracted and installed manually—Multiple clusters requires a configuration file

– Multi-cluster configuration tools ala Rocks not readily available

• Build-from-source—“Many-button” install

– Instructions, scripts, rsync, and perseverance

• Single-machine “cloud”– All components run in dom0– Need to resolve port-conflicts by hand

Page 17: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

What’s it Made Out Of?

• Axis2 and Axis2c version 1.4.0

• Hibernate 3.2.2

• HSQLDB 1.8.0

• jetty 6.1.9

• JiBX (March 30th sourceforge)

• Mule 2.0.1

• Rampart version 1.3

• libvirt version 0.4.2

• socat-1.6.0

• VDE version 2.2.0-pre2

Page 18: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Eucalyptus Public Cloud

• Free, time limited access to a Eucalyptus installation at UCSB—Only installed images can be run (i.e. no image uploading)—4 VM limit—6 hour limit—Reverse firewall

• Configuration—8 Pentium Xeon processors (3.2 GHz)—2.5 GB of memory per image—36 GB of disk space—1 Gb enet interconnect—Local availability zone only (i.e. no VDE)—Debian 4.0, Linux v2.6.18-xen-3.1—Xen 3.2

Demo

Page 19: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

EC2 and EPC Throughput

Page 20: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

EC2 and EPC RTT

Page 21: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Single Instance

Page 22: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Four Instances

Page 23: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Eight Instances

Page 24: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Version History• Eucalyptus version 1.0 became available for public

release 5/28/08 (Rocks binary only)

• Version is 1.1 shipped 7/1/2008—Bug fixes—Decent WS-security implementation—REST interface—Source code release—Build-from source “guidance” scripts and instructions

• Version 1.2 shipped 8/1/2008—Primarily a bug-fix release—Upgrade mechanism (instead of re-install)

• Version 1.3 shipped 8/23/2008—Amazon changed their client-side tools

Page 25: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Next Releases

• Version 1.4 (expected 11/5/2008)—S3 support uses local file system—Administrator definable SLAs—Cross cluster layer 2 networking—Elastic IPs and security groups, metadata service—User-defined image management and registration

• Version 1.5 (expected 1/1/09)—Elastic Block Store (EBS)—VLAN safe layer 3 networking—Credential federation support—DB managed configuration support—Distributed DB state management (maybe)

• Should be fully 2008 interface compatible in Release 1.5

Page 26: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Next Generation Eucalyptus Networking

• Multiple networking implementations—Open Source + academic environment == overlay or

nothing—Some sites are willing to tolerate a more invasive

networking approach in exchange for performance and scalability

—Three different approaches– Exploit Xen network interface isolation and VLANS

+ software only approach - will make Eucalyptus more Xen dependent

– IP-tables and NATs + high-level software only approach - possible conflicts with existing IP-tables

configuration(s)– Hardware-supported VLANs and trunking

+ fast and scalable - requires on-line access to VLAN configuration

interface

Page 27: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

More Plans

• Hypervisor religiosity and secularism—Current implementation uses a subset of the libvirt

interface– Xen, VMWare, kvm

—Eucalyptus + Xen + VMWare “works” but is clearly not the right answer

—HyperV– Initial study makes it look quite doable for

virtualization support– Understanding the networking is next on the list– Port of the Eucalyptus components to .Net

• UCSB Campus Cloud(s)—UC Cyberinfrastructure pilot—Test installation up at California Nanosystems Institute

(CNSI)—Leverage UCSB VMWare installation and Eucalyptus

installation at SDSC—Requires a very rich user accounting system

Page 28: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Ancillary Projects

• Google App Engine—AppDrop will run App Engine inside EC2—Port AppDrop to Eucalyptus—Port App Engine to Hbase and/or Hypertable—Should provide an interesting research vehicle

• Rightscale—Local enterprise focused on providing Ruby-on-Rails

infrastructure for EC2—“Turing Test” for Eucalyptus

– Can Rightscale “tell” that it isn’t talking to EC2?—Requires that the REST interface be solid—Testing now against the EPC

Page 29: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Clouds Versus Grids

• Clouds and Grids are distinct

• Cloud—Full private cluster is provisioned—Individual user can only get a tiny fraction of the total

resource pool—No support for cloud federation except through the client

interface—Opaque with respect to resources

• Grid—Built so that individual users can get most, if not all of the

resources in a single request—Middleware approach takes federation as a first principle—Resources are exposed, often as bare metal

• These differences mandate different architectures for each

Page 30: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Lessons Learned so Far

• Open source for cloud computing constrains design more than we thought it would—More of the technical challenge centers on dealing with

local configuration choices—Multi-cluster service ensemble really isn’t a typical open

source tool– Do we really need a laptop edition?

• Administrators in the “real world” still build clusters by hand—We thought the use of Rocks early on would make us

heroes -- it hasn’t—In HPC space, admin time is *really* expensive

• There are few, if any, cloud configuration tools available—Red Hat, Debian, CentOS, Ubuntu => linux packaging and

deployment—Rocks => cluster packaging and deployment—??? => cloud packaging and deployment?

Page 31: EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research

Thanks, More Information, and Help!

• National Science Foundation—VGrADS Project

• SDSC, CNSI, IU, Rice University

• RightScale.com

• The Eucalyptus Development Team at UCSB is—Chris Grzegorczyk -- [email protected]—Dan Nurmi -- [email protected]—Graziano Obertelli -- [email protected]—Shriram Rajagopalan -- [email protected]—Sunil Soman -- [email protected]—Lamia Youseff -- [email protected]—Dmitrii Zagordnov -- [email protected]

[email protected]

• http://eucalyptus.cs.ucsb.edu