architecting for the cloud: lessons learned from 100 ... · cto, cloud platforms, citrix...

34
CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang

Upload: others

Post on 23-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

CTO, Cloud Platforms, Citrix

Architecting for the cloud: lessons

learned from 100 CloudStack

deployments

Sheng Liang

Page 2: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

2008

Sept 2008:

VMOps

Founded

2009

Nov 2009:

CloudStack

1.0 GA

2010

May 2010:

Cloud.com

Launch &

CloudStack

2.0 GA

2011

July 2011:

Citrix

Acquires

Cloud.com

2012

April 2012:

Apache

CloudStack

CloudStack History

Page 3: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Open Source Xen HypervisorOpen Source Xen Hypervisor

Amazon Proprietary Orchestration SoftwareAmazon Proprietary Orchestration Software

EC2 APIEC2 API

Amazon eCommerce PlatformAmazon eCommerce Platform

Networking StorageCommodity

Servers

The inventor of IaaS cloud – Amazon EC2

Page 4: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Open Source Xen HypervisorOpen Source Xen Hypervisor

Amazon Proprietary Orchestration SoftwareAmazon Proprietary Orchestration Software

EC2 APIEC2 API

Amazon eCommerce PlatformAmazon eCommerce Platform

Networking StorageCommodity

Servers

XenServer

CloudStack

CloudPortal

Cloud APIs

ESX Hyper-V KVM OVM

CloudStack is inspired by Amazon EC2

Page 5: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

There will be 1000s of clouds

IT

SP

Ow

ner

| O

pera

tor

Horizontal

General Purpose

Vertical

Special Purpose

Desktop

Cloud

Data center mgmt

and automation

Page 6: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Learning from 100s of CloudStack deployments

EnterpriseService Providers Web 2.0

Page 7: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

What is the biggest difference between

traditional-style data center automation

and Amazon-style cloud?

Page 8: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008
Page 9: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008
Page 10: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

How to handle failures

Page 11: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

11

8%Kashi Venkatesh Vishwanath and

Nachiappan Nagappan, Characterizing

Cloud Computing Hardware Reliability,

SoCC’10

Annual Failure Rate of servers

• Server failure comes from:ᵒ 70% - hard diskᵒ 6% - RAID controllerᵒ 5% - memoryᵒ 18% - other factors

• Application can still fail for

other reasons:ᵒ Network failureᵒ Software bugsᵒ Human admin error

Page 12: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Internet

Core Routers

Access Routers

Aggregation Switches

Load Balancers

Top of Rack Switches

Servers

Page 13: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

13

40%

Phillipa Gill, Navendu Jain & Nachiappan

Nagappan, Understanding Network Failures

in Data Centers: Measurement, Analysis

and Implications, SIGCOMM 2011

Effectiveness of network

redundancy in reducing failures

•Bugs in failover

mechanism

• Incorrect configuration

•Protocol issues such

as TCP back-off,

timeouts, and spanning

tree reconfiguration

Page 14: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

A. Promise users VM, storage, and networking will

never fail

B. Backup VM for users and restore for users

when failure happens

C.Tell users to expect failure. Users to backup

VM and handle failure themselves

-- no strategy to handle failures

Page 15: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

zCloud

East

ZoneAWS

East

Zone

zCloud

West

ZoneAWS

West

Zone

Page 16: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

zCloud

East

ZoneAWS

East

Zone

zCloud

West

ZoneAWS

West

Zone

Design for

Failure

Page 17: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Cloud workloads

Traditional-Style

Reliable hardware, backup entire

cloud, and restore for users when

failure happens

Amazon-Style

Tell users to expect failure.

Users to build apps that can

withstand infrastructure failure

Link aggregation

Storage multi-pathing

VM HA, fault tolerance

VM live migration

VM backup/snapshots

Multi-site redundancy

Chaos monkey

Ephemeral resources

Strong consistency Eventual consistency

Page 18: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Designing a zone for a traditional workload

vCenter/XenCentervCenter/XenCenter

Hypervisor

Cluster

Hypervisor

Cluster

Hypervisor

Cluster

Hypervisor

Cluster

Hypervisor

Cluster

Hypervisor

Cluster

Enterprise Networking (e.g., VLAN)Enterprise Networking (e.g., VLAN)

Enterprise Storage (e.g., SAN)Enterprise Storage (e.g., SAN)

Hypervisor

Storage

SAN

Networking

L2 VLANs

Network Services

Load Balancing VPN

Multi-tier Apps

Ent App Mgmt

vSphere or XenServer Enterprise

Traditional-Style Availability Zone

Page 19: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Designing a zone for an Amazon-style workload

Hypervisor

Storage

Local EBS

Networking

L3 SDN based L2 Elastic IP

Network Services

Security Groups ELB

Multi-tier Apps

3rd Party Tools (e.g.,

RightScale, enStratus)

3rd Party Tools (e.g.,

RightScale, enStratus)

XenServer or KVM

GSLB

Software Defined Networks

(e.g., Security Groups, EIP, ELB,...)

Software Defined Networks

(e.g., Security Groups, EIP, ELB,...)

Amazon-Style Availability Zone

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Server

Racks

Elastic Block StorageElastic Block Storage

Page 20: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Users

Availability Zone 1

NetScaler

ELB/

GSLB

NetScaler

Availability Zone 2

?

Object store is critical for Amazon-style cloud

NetScaler

Availability Zone 2

Storage Cloud

Page 21: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Object StorageObject Storage

AWS-style

Availability

Zone

AWS-style

Availability

Zone

AWS-style

Availability

Zone

AWS-style

Availability

Zone

AWS-style

Availability

Zone

AWS-style

Availability

Zone

Same Cloud can Support Both Styles

Traditional

Style

Availability

Zone

Traditional

Style

Availability

Zone

Apache CloudStack

Mgmt Server

Traditional

Style

Availability

Zone

Traditional

Style

Availability

Zone

Replication/DR

Page 22: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Tests for a “true” cloud app

• Does it require SAN or VLAN?

• Does it run in multiple data centers?

• Does it involve a distributed object store?

• Is there a single point of failure?

Page 23: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Learning from 100s of CloudStack deployments

EnterpriseService Providers Web 2.0

Traditional-style Mostly Amazon-style Mostly traditional style

Page 24: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Router

L3 Core Switch

Top of Rack

Switch

………… …Availability Zone 1

ServersObject Store

Pod 1 Pod 2 Pod 3 Pod N

Load Balancer

Internet

Availability Zone 2Primary CloudStack

Mgmt Server Cluster

Primary

MySQL

CloudStack Admin

Backup

MySQL

Standby CloudStack

Mgmt Server Cluster

Page 25: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

DB

Security

Group

Web

Security

Group

Layer 3 cloud networking (security groups)

… …

Web

VM

Web

VM

Web

VM

Web

VM

Web

VM

Web

VM

Web

VM

Web

VM

DB

VM

DB

VM

Web

VM

Web

VM

DB

VM

DB

VM

Web

VM

Web

VM

Page 26: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Layer 2 VLAN networking

… …

User

2

User

2User

1

User

1

User

1

User

1

User

1

User

1

User

1

User

1

User

2

User

2

Page 27: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

OVS networking

… …

User

2

User

2User

1

User

1

User

1

User

1

User

1

User

1

User

1

User

1

User

2

User

2

OVS

OVS

OVS

OVSOVS

GRE Key 2GRE Key 1

Page 28: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Multi-tier virtual networking

App subnet

10.1.2.0/24

App VM

1

App VM

2

Web VM

1

Web VM

2

Web VM

3

Web VM

4

Web subnet

10.1.1.0/24

DB Subnet

10.1.3.0/24

DB VM 1

Customer

Premises

Customer

Premises

IPSec VPN

InternetInternet

MPLS VLAN

Network Services

• IPAM

• DNS

• LB [intra]

• S-2-S VPN

• Static Routes

• ACLs

• NAT, PF

• FW [ingress & egress]

• BGP

NetScaler VPX

GRE Key 1

GRE Key 2

GRE Key 3

Virtual Router

Public VLAN

Page 29: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Network flexibility

Network Services

� L2 connectivity

� IPAM

� DNS

� Routing

� ACL

� Firewall

� NAT

� VPN

� LB

� IDS

� IPS

Network Isolation

� No isolation

� VLAN isolation

� SDN overlays

� L3 isolation

Service Providers

� Virtual appliances

� Hardware firewalls

� LB appliances

� SDN controllers

� IDS /IPS appliances

� VRF

� Hypervisor

Page 30: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

“The Apache Way”

• Collaborative software development

• Commercial-friendly standard license

• Consistently high quality software

• Respectful, honest, technical-based interaction

• Faithful implementation of standards

• Security as a mandatory feature

Page 31: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Apache CloudStack Community

Pre Apache

Move (Jan

2012)

June

Actuals

# of companies

endorsing project

1 68

# of companies

participating

10 140

# of developers

working on project

40 238

Page 32: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Apache CloudStack community projects

• SDNᵒ Nicira

ᵒ Midokura

ᵒ Big Switch Networks

ᵒ Stratosphere

• Backup/DRᵒ Sungard

• Networkingᵒ Cisco

ᵒ Brocade (ADX)

• Smart Storageᵒ Hadoop + S3 API for object store

ᵒ NetApp (FlexPod, object store)

ᵒ Basho RIAK CS

ᵒ Caringo object store

ᵒ Cloudian S3

• PaaSᵒ CloudFoundry implementation through

IronFoundry and Stackato teams

ᵒ Engine Yard

ᵒ Cumulogic

ᵒ GigaSpaces

Page 33: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

Workload requirements drive cloud architecture

There is real demand for SDN in cloud infrastructure

Open source developers drive cloud adoption

Page 34: Architecting for the cloud: lessons learned from 100 ... · CTO, Cloud Platforms, Citrix Architecting for the cloud: lessons learned from 100 CloudStack deployments Sheng Liang. 2008

More info

http://cloudstack.org