Download - Cloud stack overview
CloudStack Overview
Written by: Chiradeep Vittal, Alex Huang @ Citrix
Revised by: Gavin Lee, Zhennan Sun @ TCloud Computing
Outline
• Overview of CloudStack • Problem Definition• Feature set overview• Network• Storage• MS internals• System VMs• System Interactions• Roadmap• Comparisons
•Multi-tenant cloud orchestration platform– Turnkey– Hypervisor agnostic– Scalable– Secure– Open source, open standards– Deploys on premise or as a hosted
solution– BSS, self service portal. (Not ASL)– Extensive networking service
•Deliver cloud services faster and cheaper
What is CloudStack?
Build your cloud the way the world’s most successful clouds
are built
CloudStack Supports Multiple Cloud Strategies
Multi-tenantPublic Cloud
• Dedicated resources
• Security & total control
• Internal network• Managed by
Enterprise or 3rd party
• Mix of shared and dedicated resources
• Elastic scaling• Pay as you go• Public internet,
VPN access
Hosted Enterprise Cloud
• Dedicated resources
• Security• SLA bound• 3rd party owned
and operated
Private Clouds Public Clouds
On-premise Enterprise Cloud
Compute
CloudStack Provides On-demand Access to Infrastructure Through a Self-Service Portal
Network Storage
Admin
Users
Org A
Admin
Users
Org BUsers
End User
Admin
Open Flexible Platform
Compute
XenServer VMware KVMOracle VM Bare metal
Hypervisor
Storage
Local Disk iSCSI NFSFiber Channel Swift
Block & Object
Network
Connection Type Isolation Load
balancerFirewall VPN
Network & Network Services
Primary Storage Secondary Storage
Problem Definition
• Offer a scalable, flexible, manageable IAAS platform that follows established cloud computing paradigms
• IAAS– Orchestrate physical and virtual resources to offer self-service
infrastructure provisioning and monitoring• Scalable
– 1 -> N hypervisors / VMs / virtual resources– 1 -> N end users
• Flexible– Handle new physical resource types
• Hypervisors, storage, networking
– Add new APIs– Add new services– Add new network models
Problem Definition (contd)
• Manageable– Hide complexity of underlying resources– Rich functional end-user and admin UI– Admin API to automate operations– Easy install, upgrade for small -> large clouds– Simple scaling, automated resilience
• Established Paradigms– EC2 –inspired
• Semantic variations based on cloud provider needs, hypervisor capabilities
Feture Set Overview
Select Operating System• Windows, Linux
Select Compute Offering• CPU & RAM
Select Disk Offering• Volume Size
Select Network Offering• Network & Services
Create VM
Create Custom Virtual Machines via Service Offerings
Dashboard Provides Overview of Consumed Resources
• Running, Stopped & Total VMs
• Public IPs
• Private networks
• Latest Events
Virtual Machine Management
Users
Start
Stop
Restart
Destroy
VM Operations VM Access
• CPU Utilized
• Network Read
• Network Writes
VM StatusChange
Service Offering
2 CPUs
1 GB RAM
20 GB
20 Mbps
4 CPUs
4 GB RAM
200 GB
100 Mbps
Volume & Snapshot Management
Volume
VM 1Add / Delete
Volumes
Schedule Snapshots
Hourly
Daily
Weekly
MonthlyNow
Create Templates from Volumes
Volume Template
View Snapshot History
….
Network & Network Services
• Create Networks and attach VMs
• Acquire public IP address for NAT & load balancing
• Control traffic to VM using ingress and egress firewall rules
• Set up rules to load balance traffic between VMs
• Hosts• Servers onto which services will be provisioned
• Primary Storage• VM storage
• Cluster• A grouping of hosts and their associated storage
• Pod• Collection of clusters
• Network• Within the same L2 switch
• Secondary Storage• Template, snapshot and ISO storage
• Zone• Collection of pods, network offerings and secondary
storage
• Management Server Farm• Responsible for all management and provisioning tasks
Core CloudStack Components
Zone
CloudStack Pod
Cluster
Host
Host
PrimaryStorage
VM
VM
CloudStack Pod
ClusterSecondary
Storage Network
Pod 1
….
Cluster N
Access Layer
Host 2
Cluster 1
CloudStack Deployment Architecture
Host 1
Hypervisor is the basic unit of scale.
Cluster consists of one ore more hosts of same hypervisor
All hosts in cluster have access to shared (primary) storage
Pod is one or more clusters, usually with L2 switches.
Availability Zone has one or more pods, has access to secondary storage.
One or more zones represent cloud
PrimaryStorage
Zone 1
….
L3 core
SecondaryStorage
Pod N
CloudStack Management
Server
Internet
CloudStack Cloud Architecture
Zone1
Data Center 1
Cloud
Data Center 2
Zone 3
Zone 2
Data Center 2
Zone 3
Zone 2
Data Center 2
Zone 3
Zone 2
Data Center 2
Zone 3
Zone 2
Data Center 2
Zone 3
Zone 2
Data Center 3
Zone 4 CloudStack Cloud can have one or more Availability Zones (AZ).
Management Server Managing Multiple Zones
Zone1
Data Center 1
Cloud
Data Center 2
Zone 3
Zone 2
Data Center 2
Zone 3
Zone 2
Data Center 2
Zone 3
Zone 2
Data Center 2
Zone 3
Zone 2
Data Center 2
Zone 3
Zone 2
Data Center 3
Zone 4
MgmtServer
Single Management Server can manage multiple zones
Zones can be geographically distributed but low latency links are expected for better performance
Single MS node can manage up to 10K hosts.
Multiple MS nodes can be deployed as cluster for scale or redundancy
Management Server Deployment Architecture
MS MySQLDB
Back UpDB
InfrastructureResources
User API
Admin API
Load Balancer
MS
MS
MSMySQL
DB
InfrastructureResources
User API
Admin API
Single-node Deployment
Multi-node Deployment
MS is stateless. MS can be deployed as physical server or VM
Single MS node can manage up to 10K hosts. Multiple nodes can be deployed for scale or redundancy
Commercial: RHEL 5.4+; FOSS: Ubuntu 10.0.4, Fedora 16
Replication
Pod 1
Host 2
Cluster 1
Host 1PrimaryStorage
L3 switch
SecondaryStorage
L2 switch
CloudStack Storage
• Configured at Cluster-level. Close to hosts for better performance
• Stores all disk volumes for VMs in a cluster
• Cluster can have one or more primary storages
• Local disk, iSCSI, FC or NFS
Primary Storage
• Configured at Zone-level
• Stores all Templates, ISOs and Snapshots
• Zone can have one or more secondary storages
• NFS, OpenStack Swift
Secondary Storage
• Primary Storage• Cluster level storage for VMs• Connected directly to hosts• NFS, iSCSI, FC and Local
• Secondary Storage• Zone level storage for template, ISOs and
snapshots• NFS or OpenStack Swift via CloudStack
System VM
• Templates and ISOs• Imported into CloudStack• Can be private or public
Understanding the Role of Storage and Templates
Zone
Secondary Storage
Pod
Cluster
Host
HostPrimary Storage
Template
1. User Requests Instance
2. Provision Optional Network Services
3. Copy instance template from secondary storage to primary storage on appropriate cluster
4. Create any requested data volumes on primary storage for the cluster
5. Create instance
6. Start instance
Provisioning Process
Zone
Secondary Storage
Pod
Cluster
Host
HostPrimary Storage
VM
Template
XenServer Resource Pool
• Integrates directly with XenServer Pool Master
• Snapshots at host level
• System VM control channel at host level
• Network management is host level
Citrix XenServerCloudStack Manager
XenServer Pool Master Host
XenServer Host
XenServer Host
XenServer Host
XenServer Host
• Integrates with ovs-agent
• Snapshots at host level
• System VM control channel at host level
• Network management is host level
• Does not use OVM Manager
• All templates must be from Oracle
• CloudStack configures ocfs2 nodes
• Requires “helper” cluster• XenServer, KVM or vSphere
Oracle VMCloudStack Manager
OVM Host
OVS Agent
OVM Host
OVS Agent
OVM Host
OVS Agent
OVM Host
OVS Agent
• Integrates with libvirt using Cloud Agent
• Snapshots at host level
• System VM control channel at host level
• Network management is host level
• Only RHEL 6, not RHEV• Also supports Ubuntu 10.04
RedHat Enterprise Linux (KVM)
KVM Host
Cloud Agent
Libvirt
KVM Host
Cloud Agent
Libvirt
CloudStack Manager
• Integration through vCenter
• System VM control channel via CloudStack private network
• Snapshot and volume management via Secondary Storage VM
• Networking via vSphere vSwitch
VMware vSphereCloudStack Manager
Data Center
vSphere Cluster
vSphere Host
vSphere Host
vSphere Host
vSphere Cluster
vSphere Host
vSphere HostvCenter
Management Server Interaction with Hypervisors
Management Server
XenServer
ESX
vCenter
KVM
Agent
OVM
Agent
XAPI HTTPS
• XS 5.6, 5.6FP1, 5.6 SP2, 6.0
• Incremental Snapshots
• VHD
• NFS, iSCSI, FC & Local disk
• Storage over-provisioning: NFS
• ESX 4.1, 5.0 (coming)
• Full Snapshots
• VMDK
• NFS, iSCSI, FC & Local disk
• Storage over-provisioning: NFS, iSCSI
• RHEL 6.0, 6.1, 6.2 (coming)
• Full Snapshots (not live)
• QCOW2
• NFS, iSCSI & FC
• Storage over-provisioning: NFS
• OVM 2.2
• No Snapshots
• RAW
• NFS & iSCSi
• No storage over-provisioning
Multi-tenancy & Account ManagementCloud
• Domain is a unit of isolation that represents a customer org, business unit or a reseller
• Domain can have arbitrary levels of sub-domains
• A Domain can have one or more accounts
• An Account represents one or more users and is the basic unit of isolation
• Admin can limit resources at the Account or Domain levels
Admin
Org A
Admin
Reseller A
Domain
Domain
Admin
Org C
Sub-Domain
User 1
User 2
Group B
Account
Group A
Account
VMs, IPs, Snapshots…
VMs, IPs, Snapshots…
Resources
Resources
CloudStack Network
Network Terminology• Traffic type
– Guest: The tenant network to which instances are attached– Storage: The physical network which connects the hypervisor to primary storage– Management: Control Plane traffic between CloudStack management server and
hypervisor clusters– Public:
“Outside” the cloud [usually Internet]Shared public VLANs trunked down to all hypervisors
• Network type– Shared, one subnet serve multiple users
Direct. 1 subnetDirect tagged. VLAN, multiple subnet
– Isolated, different subnet for different userVirtual (tagged)
• All traffic can be multiplexed on to the same underlying physical network using VLANs– Usually Management network is untagged– Storage network usually on separate nic (or bond)
• Admin informs CloudStack how to map these network types to the underlying physical network– Configure traffic labels on the hypervisor– Configure traffic labels on Admin UI
Physical Network• Zone level • Defined by NIC• Assigned with traffic type (P, G, M, S)• Associated by label/vswitch name • Attached with device as service provider
Network Offering• Only for Guest traffic• Guest network type: Shared or Isolated• Defined a set of network services, such as
DHCP, Firewall, VPN, NAT…• Bandwidth
Tag
Guest Network • Instance of Network Offering• Shared: created by Admin• Isolated: Created and owned by user• One virtual router for one network• Cross pod, within Zone• VLAN id picked from the pool
VM Instance
• Choose the instantiated guest network• IP is arbitrary
Router
L3 Core Switch
Access Layer
Switches
………… …
Availability Zone
Servers
CloudStack MS Cluster
Secondary Storage
Pod 1 Pod 2 Pod 3 Pod N
MySQL
Load Balancer
Operations Admin and Cloud API
Users
Physical Network
…
DB Security Group
WebSecurity Group
Network Isolation
… …
Web VM
Web VM
Web VM
Web VM
DB VM
Web VM
DB VM
Web VM
Network Isolation (Security Group, L3)Guest 1
VM 1
Guest 2 VM 1
Guest 1 VM 2
Guest 2 VM 2
Public Internet
10.1.0.1
10.1.0.2
10.1.0.3
10.1.0.4
10.1.16.12Load
BalancerGuest 2
VM 3
Guest 1 VM 3
Guest 1 VM 4
10.1.16.21
10.1.16.47
10.1.16.85
L3 Core Switch
Pod 1 L2 Switch
Pod 3 L2 Switch
10.1.16.1
…
…10.1.8.1Pod 2 L2 Switch
Hypervisor 1
Hypervisor N
Hypervisor 8
Access Switch(es)
VLAN 101 Traffic
…
Pod K
CLUSTER 1
…
CLUSTER 4
Core (L3) Network
…
Pod M Pod N
Network Isolation (VLAN, L2)
Hypervisor N+1
VLAN 102 Traffic
Hypervisor
R
R V
VV
V
HypervisorV V
V
RTenant VMTenant Virtual Router
Guest virtual network
Guest 1 VM 1
Guest 1 VM 2
Guest 1 VM 3
Guest 1 VM 4
Public Internet
Public Network
Guest Virtual Network 10.1.1.0/24
Gateway address 10.1.1.1
NATDHCPLoad BalancingVPN
Public IP address 65.37.141.1165.37.141.36
Guest address 10.1.1.2Guest address 10.1.1.3Guest address 10.1.1.4Guest address 10.1.1.5
Guest 1 Virtual Router
Guest 2 VM 1
Guest 2 VM 2
Guest 2 VM 3
Guest Virtual Network 10.1.1.0/24
Gateway address 10.1.1.1
NATDHCPLoad BalancingVPN
Guest address 10.1.1.2Guest address 10.1.1.3Guest address 10.1.1.4
Guest 2 Virtual Router
Public IP address 65.37.141.2465.37.141.80
Guest Virtual Network With Physical Device
Public Network/Internet
Guest Virtual Network 10.1.1.1/8VLAN 100
Gateway address 10.1.1.1
DHCP, DNSNATLoad BalancingVPN
Public IP 65.37.141.11
10.1.1.1Guest VM 1
10.1.1.3Guest VM 2
10.1.1.4Guest VM 3
10.1.1.5Guest VM 4
CSVirtual Router
Public Network/Internet
Guest Virtual Network 10.1.1.1/8VLAN 100
Private IP10.1.1.112
DHCP, DNS
Public IP 65.37.141.112
10.1.1.1Guest VM 1
10.1.1.3Guest VM 2
10.1.1.4Guest VM 3
10.1.1.5Guest VM 4
NetScalerLoad
Blancer
Private IP10.1.1.111
Public IP 65.37.141.111 Juniper
SRXFirewall
CS Virtual Router provides Network Services External Devices provide Network Services
CSVirtual Router
Layer-3 Guest Network
Public Network65.11.0.0/16
65.11.1.2Guest VM
1
Guest VM 2
Guest VM 3
Guest VM 4
Public Network/Internet
NetScalerLoad
Blancer
Network Services Managed Externally Network Services Managed by CS
65.11.1.3
65.11.1.4
65.11.1.5
DHCP, DNS
CSVirtual Router
Security Group 1
Security Group 2
10.1.2.3Guest VM
1
Guest VM 2
Guest VM 3
Guest VM 4
10.2.12.4
10.5.2.99
10.1.2.18
DHCP, DNS
CSVirtual Router
Security Group 1
Security Group 2
EIP, ELB
65.11.1.265.11.1.3
65.11.1.4
L3 switch
Multi-tier network
Public Network/Internet
Private IP10.1.1.112
DHCP, DNSUser-data
Public IP 65.37.141.112
10.1.1.1Web VM
1
10.1.1.3Web VM
2
10.1.1.4Web VM
3
10.1.1.5Web VM
4
NetscalerLoad
Balancer
Private IP10.1.1.111
Public IP 65.37.141.111
Juniper SRX
Firewall
Multi-tier network
CSVirtual Router
CSVirtual Router
Virtual Network 10.1.1.0/24VLAN 100
Virtual Network 10.1.2.0/24VLAN 1001
10.1.2.21
10.1.2.18
10.1.2.38
10.1.2.39
10.1.2.31App VM
1 10.1.3.21
Virtual Network 10.1.3.0/24VLAN 141
10.1.2.24App VM
2 10.1.3.45
10.1.3.24 DB VM 1
CSVirtual Router
DHCP, DNS, User-data
DHCP, DNSUser-data,Source-NAT, VPN
Public IP 65.37.141.115
Multi-tier unified [vision]
10.1.1.1Web VM
1
10.1.1.3Web VM
2
10.1.1.4Web VM
3
10.1.1.5Web VM
4
Virtual Network 10.1.1.0/24VLAN 100
Virtual Network 10.1.2.0/24VLAN 1001
10.1.2.31App VM
1
Virtual Network 10.1.3.0/24VLAN 141
10.1.2.24App VM
2
10.1.3.24 DB VM 1
CSVirtual Router /
OtherCustomerPremises
IPSec or SSL site-to-site VPN
Internet
Monitoring VLAN
Virtual Router Services• IPAM• DNS• LB [intra]• S-2-S VPN• Static Routes• ACLs• NAT, PF• FW [ingress & egress]• BGP
Loadbalancer
Multi-tier unified with SDN[vision]
10.1.1.1Web VM
1
10.1.1.3Web VM
2
10.1.1.4Web VM
3
10.1.1.5Web VM
4
Overlay Network 10.1.1.0/24
Overlay Network 10.1.2.0/24
10.1.2.31App VM
1
Overlay Network 10.1.3.0/24
10.1.2.24App VM
2
10.1.3.24 DB VM 1
CSVirtual Router /
OtherCustomerPremises
IPSec or SSL site-to-site VPN
Internet
Monitoring VLAN
Virtual Router Services• IPAM• DNS• LB [intra]• S-2-S VPN• Static Routes• ACLs• NAT, PF• FW [ingress & egress]• BGP
LoadbalancerVirtual Appliance
• Cloud provider defines the feature set for guest networks
• Toggle features or service levels– Security groups on/off– Load balancer on/off– Load balancer software/hardware– VPN, firewall, port forwarding
• User chooses network offering when creating network
• Enables upgrade between network offerings
• Default offerings built-in– For classic CloudStack networking
Network Offerings
CloudStack Storage
Storage
Zone-Level Layer 3 Switch
Pod 2
Pod N
Private Network
Computing Server 1
Computing Server 3
Computing Server 2
Computing Server 4
Pod-Level Layer-2 Switch
Primary Storage
Primary Storage
Pod 1
Scale-Out NFS
Clus
ter 2
Clus
ter 1 Primary
Storage
Scale-Out NFS
• Primary Storage – Block device to the VM– IOPs intensive– Accessible from host or cluster
wide– Supports storage tiering
• WORM Storage– Secondary Storage or Object
Store for templates, ISO, and snapshot archiving
– High capacity• CloudStack manages the storage
between the two to achieve maximum benefit and resiliency
Primary Storage Support Matrix
Type XenServer VmWare KVM
Local Disk Supported Supported Supported
iSCSI Supported Supported Not Supported
Fiber Channel Supported Supported Not Supported
NFS Supported Supported Supported
VM Storage Network
Supported Supported Supported
Storage Tiering
• Supported via storage tags for primary storage• Specify a tag when adding a storage pool• Specify a tag when adding a disk offering• Only storage pools with the tag will be
allocated for the volume
WORM Storage
• Write Once Read Many storage pattern is supported by two different storage types– Secondary Storage (NFS Server within an
availability zone)– Object Store (Swift implementation for cross-zone)
• Objective for WORM storage– High capacity, cheap storage– Easy to increase capacity
• Used to store templates, ISOs, and snapshots
Snapshots
• Snapshots are used as backups for DRS• Taken on the primary storage and moved to
secondary storage• Supports individual snapshots and recurring
snapshots• Full snapshots on VmWare and KVM. Need help.• Incremental snapshots on XenServer• Allows backup network traffic to be specified in zone
to segregate the backup network traffic from other network traffic types
MS Internals
• Architecture• Workflow• High Availability• Scalability
Inside a Management Server
APIServlet
AsyncJob
QueueMgr
CS API
ServicesAPI
Cmds
Responses
cmd.execute()
Kernel
PluginsPlugins
Plugins
Agent Manager
ResourcesAgent API(Commands)
HypervisorNativeAPIs
LocalOrRemote
NetworkDeviceAPI
MySQL
Old Architecture
Pros• Agile development for
existing developers• Scales well horizontallyCons• Monolithic• Difficult to educate
new and third-party developers
• Easy to introduce bugs
52
XenServer
Resource
XenServer
Resource
Agent ManagerAgent Manager
API LayerAPI LayerEC2EC2 CloudStackCloudStack
Virt
ual M
achi
ne M
anag
erVi
rtua
l Mac
hine
Man
ager
KVM Resour
ce
KVM Resour
ce
SRX Resour
ce
SRX Resour
ce
F5 Resour
ce
F5 Resour
ce
NetScaler
Resource
NetScaler
Resource
Other Resourc
es
Other Resourc
es
Access ControlAccess Control
Stor
age
Man
ager
Stor
age
Man
ager
Net
wor
k M
anag
erN
etw
ork
Man
ager
Cons
ole
Prox
y M
anag
erCo
nsol
e Pr
oxy
Man
ager
Snap
shot
Man
ager
Snap
shot
Man
ager
Tem
plat
e M
anag
erTe
mpl
ate
Man
ager
Asyn
c Jo
b M
anag
erAs
ync
Job
Man
ager
…
New Deployment Architecture• Scales horizontally to
different pressure points• Automatically scales
service VMs in zones to facilitate most efficient data path transfers
• Fault isolation between API servers and Execution Servers and resources within zones
API Server
New Architecture – API Server• API Server isolates
integration code from Execution Server
• API Server can horizontally scale to handle traffic
• Easily adds other API compatibility
• Easily exposes API needed by third party vendors
UI Cloud Portal CLI
Other Clients
Pluggable API Engine
OAM&P API End User API
EC2 API
Other APIs
ACL & Authentication
- Accounts, Domains, and Projects
- ACL, limits checking
Management Services- Resource
management- Configuration- Additional
operations added by third party
REST
Framework- Job Queue - Database Access Layer- OSGi
Integration
New Architecture – Execution Server• Execution Server protected by
job queue• Kernel kept small for stability.
It only drives processes.• Plugins provide mappings of
virtual entities to physical resources
• Third party plugins to provide vendor differentiation in CloudStack
• Communicates with resources within data center over message bus
Execution Server
Kernel• Drives long running VM operations• Syncs between resources managed
and DB• Generates events
Framework• Cluster Management• Job Management• Alert & Event Management• Database Access Layer• Messaging Layer
Plugins• Storage
Handling• Network
Handling• Deployment
planning• Hypervisor
Handling
• Component Framework (OSGi)
• Transaction Management
Services API
New Architecture – Resources• Resources are carried in service
VMs to be in close network proximity to the physical resources it manages
• Easily scales to utilize the most abundant resource in data center (CPU & RAM)
• Communicates with Execution Server over message bus (JSON)
• Can be replicated for fault tolerance
Agent
Hypervisor Resources
Network Resources
Storage Resources
Image & Template Resources
Snapshot Resources
Management Server
Kernel- Drives long running VM
operations- Syncs between resources
managed and DB- Generates events
Resource Management
Cluster Management
JobManagement
UI Cloud Portal CLI
Other Clients
Deployment Planning
Network Configurations
Network Elements
Hypervisor Gurus
DatabaseAccess
Alert & EventManagement
Plug
in A
PI
Hypervisor Resources
Network Resources
Storage Resources
ImageResources
SnapshotResources
REST API
OAM&P API End User API EC2 API Pluggable Service API EngineOther APIs
Security Adapters
Account Management Connectors
ACL & Authentication- Accounts, Domains, and Projects- ACL, limits checking
Services API
Serv
ices
API
Console Proxy Management
Template Access
HA
Usage Calculations
Additional Services
Event BusMessage Bus
Kernel Module
• Understands how to orchestrate long running processes (i.e. VM starts, Snapshot copies, Template propagation)
• Well defined process steps• Calls Plugin API to execute functionalities that
it needs
Plugins
• Various ways to add more capability to CloudStack
• Implements clearly defined interfaces• All operations must be idempotent• All calls are at transaction boundaries• Compiles only against the Plugin API module
Anatomy of a Plugin
ServerResource- Optional. Required if
Plugin needs to be co-located with the resource
- Implements translation layer to talk to resource
- Communicates with server component via JSON
ServerResource- Optional. Required if
Plugin needs to be co-located with the resource
- Implements translation layer to talk to resource
- Communicates with server component via JSON
Rest API- Optional. Required only if needs to expose configuration API to admin.
Plug
in A
PI
Data Access Layer
Implmentation
Anatomy of a Plugin
• Can be two jars: server component to be deployed on management server and an optional ServerResource component to be deployed co-located with the resource
• Server component can implement multiple Plugin APIs to affect its feature
• Can expose its own API through Pluggable Service so administrators can configure the plugin
• As an example, OVS plugin actually implements both NetworkGuru and NetworkElement
Plugin Interfaces Available• NetworkGuru – Implements various network isolation technologies
and ip address technologies• NetworkElement – Facilitate network services on network elements
to support a VM (i.e. DNS, DHCP, LB, VPN, Port Forwarding, etc)• DeploymentPlanner – Different algorithms to place a VM and
volumes.• Investigator – Ways to find out if a host is down or VM is down.• Fencer – Ways to fence off a VM if the state is unknown• UserAuthenticator – Methods of authenticating a user• SecurityChecker – ACL access• HostAllocator – Provides different ways to allocate host• StoragePoolAllocator – Provides different ways to allocate volumes
Adding a Plugin to CloudStack
• Components are configured through components.xml
• Supports DAO, Manager, and Adapter patterns• Open to other component frameworks (OSGi a
possibility)
Components.xml Example<components.xml> <system-integrity-checker class="com.cloud.upgrade.DatabaseUpgradeChecker"> <checker name="ManagementServerNode" class="com.cloud.cluster.ManagementServerNode"/> <checker name="EncryptionSecretKeyChecker" class="com.cloud.utils.crypt.EncryptionSecretKeyChecker"/> <checker name="DatabaseIntegrityChecker" class="com.cloud.upgrade.DatabaseIntegrityChecker"/> <checker name="DatabaseUpgradeChecker" class="com.cloud.upgrade.PremiumDatabaseUpgradeChecker"/> </system-integrity-checker> <interceptor library="com.cloud.configuration.DefaultInterceptorLibrary"/> <management-server class="com.cloud.server.ManagementServerExtImpl" library="com.cloud.configuration.PremiumComponentLibrary"> <adapters key="com.cloud.storage.allocator.StoragePoolAllocator"> <adapter name="LocalStorage" class="com.cloud.storage.allocator.LocalStoragePoolAllocator"/> <adapter name="Storage" class="com.cloud.storage.allocator.FirstFitStoragePoolAllocator"/> </adapters> <pluggableservice name="VirtualRouterElementService" key="com.cloud.network.element.VirtualRouterElementService" class="com.cloud.network.element.VirtualRouterElement"/> </management-server></components.xml>
ServerResource
• Translation layer between CloudStack commands and resource API
• May be Co-located with resource• Have no access to DB• API defined in JSON messages
DAO
• SQL generation done mostly in GenericDaoBase• Uses JPA annotations• Very little code to write for each individual DAO• Database Access Layer for Kernel• No support for more complicated features such as
fetch strategy• Welcome to use other types of ORM in other
modules but like to hear about preferred library. (Hibernate is out due to licensing issues)
Example DAO// ExampleVO.java@Entity@Table(name=“example”)public class ExampleVO { @Id @GeneratedValue(strategy= GenerationType.IDENTITY) @Column(name=“id”) long id;
@Column(name=“name”) String name;
@Column(name=“value”) String value;}
// ExampleDao.javapublic interface ExampleDao extends GenericDao<ExampleVO, Long> {}
// ExampleDaoImpl.java@Local(value=ExampleDao.class)public class ExampleDaoImpl extends GenericDaoBase<ExampleVO, Long> implements ExampleDao {
protected ExampleDaoImpl() { }}
Kernel
Sequence Flow for deploy VMEnd User Rest API
SecurityCheckers
User VM Mgr
Network Mgr
Storage MgrJob
SchedulingVirtualMachine Mgr
Network Guru
Deploy VM
ACL Checks
Allocate Entity in CS
Allocate VM
Allocate NIC
Allocate Volume
Allocate IP
Schedules Deploy Job
Returns with job id, VM id
Query Job Result
Returns with job status
Sequence Flow for deploy VMJob Threads
Network Element
User VM Mgr
Network Mgr
Storage Mgr
VirtualMachine Mgr
Network Guru
Start VM
Start VM
Prepare Nics
Notify that Nic is about to be started in network
Reserve resources for Nic
Services API
Start User VM
Agent Calls
Prepare Volumes
Template Mgr
DeploymentPlanner
Get a Deployment Plan (Host and StoragePool)
Prepare template on Primary Storage
Agent Calls
Agent Start VM Call
Stores job result
Server Resource
High Availability
High Availability
• Service Offering contains a flag for whether HA should be supported for the VM
• Does not use the native HA capability of hypervisors for XenServer and KVM
• Uses adapters to fine tune HA process
Triggering High Availability
VM HA are triggered via the following methods:• VM Sync detects out of band VM changes• Resource Management detects that a resource is
unreachable and its state can not be determined.• VM start/stop has been sent to the resource but
resource does not return• Details of how high availability is done is at
http://docs.cloudstack.org/CloudStack_Documentation/Design_Documents/CloudStack_High_Availability_-_Developer's_Guide
High Availability
Has more Investigators
?
Is VM Up or Down?
Investigation Needed?
Has VM changed since
work scheduled? Cancel Work
Start VM
Reschedule Work
Completed Work
Fence off VM?
More Fencers??
Is hypervisor host Up or
Down?
Yes
No
Yes
No
Up
Unknown
Down
Yes
No
Success
Failure
Yes
No
Yes
No
Up
Down
• Investigation – Uses investigators to find out if VM is
alive or down– Each investigator returns three states
• Up• Down• Unknown
• Fencing– Uses fencers to fence off the VM
from accessing storage to ensure VM is not corrupted
– Each Fencer returns three states• Fenced• Unable to Fence• Don’t know how to fence
• Restart– Restarts the VM
Scalability
Current Status
• 10k resources managed per management server node
• Scales out horizontally (must disable stats collector)• Real production deployment of tens of thousands of
resources• Internal testing with software simulators up to 30k
physical resources with 30k VMs managed by 4 management server nodes
• We believe we can at least double that scale per management server node
Balancing Incoming Requests• Each management server has two worker thread pools for incoming
requests: effectively two servers in one.– Executor threads provided by tomcat– Job threads waiting on job queue
• All incoming requests that requires mostly DB operations are short in duration and are executed by executor threads because incoming requests are already load balanced by the load balancer
• All incoming requests needing resources, which often have long running durations, are checked against ACL by the executor threads and then queued and picked up by job threads.
• # of job threads are scaled to the # of DB connections available to the management server
• Requests may take a long time depending on the constraint of the resources but they don’t fail.
Comparison of two Approaches
• Stats Collector – collects capacity statistics– Fires every five minutes to collect stats about host CPU and memory
capacity– Smart server and dumb client model: Resource only collects info and
management server processes– Runs the same way on every management server
• VM Sync– Fires every minute– Peer to peer model: Resource does a full sync on connection and
delta syncs thereafter. Management server trusts on resource for correct information.
– Only runs against resources connected to the management server node
Numbers• Assume 10k hosts and 500k VMs (50 VMs per host)• Stats Collector
– Fires off 10k requests every 5 minutes or 33 requests a second.– Bad but not too bad: Occupies 33 threads every second.– But just wait:
• 2 management servers: 66 requests• 3 management servers: 99 requests
– It gets worse as # of management servers increase because it did not auto-balance across management servers
– Oh but it gets worse still: Because the 10k hosts is now spread across 3 management servers. While it’s 99 requests generated, the number of threads involved is three-fold because requests need to be routed to the right management server.
– It keeps the management server at 20% busy even at no load from incoming requests• VM Sync
– Fires off 1 request at resource connection to sync about 50 VMs– Then, push from resource as resource knows what it has pushed before and only pushes
changes that are out-of-band.– So essentially no threads occupied for a much larger data set.
Resource Load Balancing• As management server is added into the cluster, resources are rebalanced
seamlessly.– MS2 signals to MS1 to hand over a resource– MS1 wait for the commands on the resources to finish– MS1 holds further commands in a queue– MS1 signals to MS2 to take over– MS2 connects– MS2 signals to MS1 to complete transfer– MS1 discards its resource and flows the commands being held to MS2
• Listeners are provided to business logic to listen on connection status and adjusts work based on who’s connected.
• By only working on resources that are connected to the management server the process is on, work is auto-balanced between management servers.
• Also reduces the message routing between the management servers.
CloudStack System VMs
CloudStack System VMs
• System VMs optimize and scale the data path on behalf of CloudStack– Stateless, can be destroyed and recreated from database state– Highly Available– Communicates with Management Server over management network– Usually have 3 interfaces: control(linked-local), mgmt and public
• Console Proxy VM – Provides AJAX-style HTTP-only console viewer– Grabs VNC output from hypervisor– Scales out (more spawned) as load increases– Java-based server Communicates with MS
• Secondary Storage VM– Provides image (template) management services– Download from HTTP file share or Swift– Copy between zones– Scale out to handle multiple NFS mounts– Java-based server communicates with MS
CloudStack System VMs
• Virtual Router VM – Provides multiple network services– IPAM (DHCP), DNS, NAT, Source NAT, Firewall, Port Forwarding, VPN– User-data, Meta-data, guest SSH keys and password change server– Redundancy via VRRP– MS configures VR over SSH
• Proxied via the hypervisor on XS and KVM
System VM spec
• Debian 6.0 ("Squeeze"), 2.6.32 kernel with the latest security patches from the Debian security APT repository. No extraneous accounts
• 32-bit for enhanced performance on Xen/VMWare• Only essential software packages are installed. Services such as, printing, ftp, telnet, X, kudzu,
dns, sendmail are not installed.• SSHd only listens on the private/link-local interface. SSH port has been changed to a non-
standard port (3922). SSH logins only using keys (keys are generated at install time and are unique for every customer)
• pvops kernel with Xen paravirt drivers + KVM virtio drivers + VMware tools for optimum performance on all hypervisors. Xen tools inclusion allows performance monitoring
• Template is built from scratch and is not polluted with any old logs or history• Latest versions of haproxy, iptables, ipsec, apache from debian repository ensures improved
security and speed• Latest version of jre from Sun/Oracle ensures improved security and speed
System VM contd
• SSH keys and password are unique to cloud installation
• Code can be patched by restarting system vm– Mounts a special ISO file with latest code at boot– If ISO contents differ, patch and reboot
• Same system vm works on XS, KVM, VMWare– Bootstrap step for the cloud is to install the template
for this system vm• Ready to be re-purposed for other specialized tasks
Interactions
CloudStack
Cloud user{API client (Fog/etc)}
End User UI
Admin UI
MySQL
CloudStackClustered
CloudStackManagement
Server
Domain Admin
UI
CS Admin & End-user API
Cloud user{ec2 API client }
ec2 API
Monitoring CS API vSphere ClusterPrimaryStorage
vcenter
Cluster Mgmt
XS ClusterPrimaryStorage
vCenter API
XAPI
KVM ClusterPrimaryStorageJSON
OVM Cluster PrimaryStorage
XenApi
NetConf
Nitro APIJuniper SRX
Netscaler
Console Proxy VMConsole
Proxy VM
JSON
Cloud user
HTTPSAjax Console
VNC
Sec. StorageVM
NFS Server
NFSSec. Storage
VM
HTTP (Template Download)
HTTP (Template Copy)
HTTP (Swift)
NFS
Router VMRouter VM
Router VM
JSON
{Proxied} SSH
CloudStack Roadmap
2012
Apr Jul OctFeb
► Inter-Vlan Routing
► Multi-tier App
► Site-to-Site VPNs
► AWS-style tags
► VM Tiers
2013
Acton
► Swift Integration
► Support XenServer 6
► Support Vsphere 5
► Netscaler Integration
► Refine Resource Management
► UI refinement
► LDAP/AD Authentication
► Clustered LVM support
Burbank ?
► Hyper-V (win 8)► OpenvSwitch Support
► VMWare Distributed vSwitch Support
► Cisco Nexus 1000v Support
► Upload Volume
Bonita Campo
Feb
► AWS-style Regions
► IPv6
► Resource Scaling
► Dedicated Resource Module
► Scalability (50K hosts)
► Plugin Architecture
► Hypervisor Enhancement
CloudStack vs. OpenStack vs. Eucalyptus
CloudStack
• Mainly written in Java• ASL2.0 license• Has more than 100 production clouds (Around May, 2012)• Support private/hybrid/public cloud• Scale to 30K physical host in commercial environment• Support XenServer/Vsphere/KVM/OVM/Hyper-V/Baremetal as
hypervisor• Multiple geographically distributed datacenters management• Flexible and rich network functionality• Easy installation and management• Amazon EC2 API compatible• Well documented • Active community
OpenStack
• Mainly written in Python• ASL2.0 license• Support private/hybrid/public cloud• Immature for commercial usage• Support XenServer/Vsphere/KVM/Xen/Hyper-V as hypervisor• Network is a single point of failure• Weak VPN support for enterprise hybrid cloud• All inter-module communication are based on MQ• Not well documented • A bit hard to install• Amazon EC2 API partially compatible
Eucalyptus (Open Source edition)
• Mainly written in Java• GPLv3 license• Focus on private cloud• Support KVM/Xen as hypervisor• Fully compatible with Amazon EC2• Fully compatible with Amazon S3 via Walrus• EBS support via AoE and iSCSI• Both web UI and command line tools for cloud administration• Well documented• Difficult to getting started