Integrating CloudStack & Ceph
DESCRIPTION
Wido den Hollander (@widoh) gave a great presentation on his work to integrate CloudStack with Ceph.
TRANSCRIPT
Ceph as storage for CloudStack
Wido den Hollander <[email protected]>
Who am I?
● Wido den Hollander
– Part of the Ceph community since 2010
– Co-owner of a Dutch hosting company
– Committer and PMC member for Apache CloudStack
● Developed:
– phprados
– rados-java
– libvirt RBD storage pool support
– CloudStack integration
● Work as a Ceph and CloudStack consultant
Ceph
Ceph is a unified, open source distributed object store
Auto recovery
● Recovery when an OSD fails
● Data migration when the cluster expands or contracts
Traditional vs Distributed
● Traditional storage systems don't scale that well
– All have their limitations: number of disks, shelves, CPUs, network connections, etc.
– Scaling usually meant buying a second system
● Migrating data requires service windows
● Ceph clusters can grow and shrink without service interruptions
– We don't want to watch rsync copying over data and wasting our time
● Ceph runs on commodity hardware
– Just add more nodes to add capacity
– Ceph fits in smaller budgets
Hardware failure is the rule
● As systems grow, hardware failure becomes more frequent
– A system with 1,000 nodes will see daily hardware issues
– We don't want to get out of bed when a machine fails at 03:00 on Sunday morning
● Commodity hardware is cheaper, but less reliable. Ceph mitigates that
RBD: the RADOS Block Device
● Ceph is an object store
– Store billions of objects in pools
– RADOS is the heart of Ceph
● RBD block devices are striped over RADOS objects
– Default stripe size is 4 MB
– All objects are distributed over all available Object Store Daemons (OSDs)
– A 40 GB image consists of 10,000 potential objects
– Thin provisioned
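The striping arithmetic above can be sketched in a few lines: a block-device offset maps to an object index, and an image's size divided by the object size gives the number of potential backing objects. The slide's figures use decimal units (note the real RBD default is 4 MiB, i.e. 4 × 2^20 bytes, so a 40 GiB image would actually map to 10,240 objects); objects are only created once written, which is what makes the image thin provisioned.

```python
# Sketch of how an RBD image is striped over RADOS objects, using the
# decimal arithmetic from the slides (4 MB objects, 40 GB image).
OBJECT_SIZE = 4 * 1000 * 1000  # 4 MB stripe size, decimal as in the talk

def object_count(image_size_bytes, object_size=OBJECT_SIZE):
    """Number of potential RADOS objects backing an image (ceiling division)."""
    return (image_size_bytes + object_size - 1) // object_size

def object_index(offset, object_size=OBJECT_SIZE):
    """Which backing object a byte offset of the block device falls into."""
    return offset // object_size

image_size = 40 * 1000 * 1000 * 1000   # a 40 GB image
print(object_count(image_size))        # 10000 potential objects
print(object_index(123 * 1000 * 1000)) # byte offset 123 MB lands in object 30
```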
RADOS Block Device
RBD for Primary Storage
● In 4.0, RBD support for Primary Storage for KVM was added
– No support for VMware or Xen
– Xen support is being worked on (not by me)
● Live migration is supported
● Snapshot and backup support (4.2)
● Cloning when deploying from templates
● Run System VMs from RBD (4.2)
● Uses the rados-java bindings
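With RBD as Primary Storage, a KVM guest's disk is not a file path but a network disk that libvirt hands to QEMU. A sketch of what such a libvirt `<disk>` element looks like, built and checked with Python's standard library; the pool name, image name, and monitor address are made-up placeholders, not values CloudStack itself generates:

```python
import xml.etree.ElementTree as ET

# Sketch of a libvirt <disk> element for an RBD-backed volume, as a KVM
# host would receive it. Pool/image names and the monitor host below are
# hypothetical placeholders.
disk_xml = """
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='cloudstack/vm-disk-1'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
"""

disk = ET.fromstring(disk_xml)
source = disk.find('source')
print(disk.get('type'))        # network
print(source.get('protocol'))  # rbd
print(source.get('name'))      # cloudstack/vm-disk-1
```

The `name` attribute carries the `pool/image` pair, and the `<host>` elements point QEMU at the Ceph monitors, so no image file ever exists on the hypervisor itself.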
RBD for Primary Storage
System Virtual Machines
● Perform cluster tasks, e.g.:
– DHCP
– Serving metadata to Instances
– Loadbalancing
– Copying data between clusters
– Run in between user Instances
● They can now run from RBD due to a change in the way they get their metadata
– The old way was dirty and had to be replaced
● It created a small disk with metadata files
rados-java bindings
● Developed to have the KVM Agent perform snapshotting and cloning
– libvirt doesn't know how to do this, but it would be best if it did
● Uses JNA, so deployment is easy
● Binds both librados and librbd
● Available on github.com/ceph/rados-java
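Deploying from a template is where these bindings earn their keep: RBD layering requires snapshotting the template image, protecting that snapshot, and then cloning it into a copy-on-write child. The real bindings are Java; below is a language-neutral sketch of that call sequence in Python against a minimal stand-in interface (the method and snapshot names are illustrative, not the actual rados-java API):

```python
# Sketch of the snapshot -> protect -> clone sequence that RBD layering
# requires when deploying an instance disk from a template. `librbd` is a
# stand-in for the real bindings; method names are illustrative only.

def deploy_from_template(librbd, template, instance):
    """Clone a template image into a fresh instance disk via RBD layering."""
    snap = 'cloudstack-base-snap'            # hypothetical snapshot name
    librbd.create_snapshot(template, snap)   # freeze the template's state
    librbd.protect_snapshot(template, snap)  # clones require a protected snapshot
    librbd.clone(template, snap, instance)   # copy-on-write child image
    return snap

# Minimal in-memory stand-in, just to demonstrate the call order.
class FakeLibrbd:
    def __init__(self):
        self.calls = []
    def create_snapshot(self, image, snap):
        self.calls.append(('snap_create', image, snap))
    def protect_snapshot(self, image, snap):
        self.calls.append(('snap_protect', image, snap))
    def clone(self, image, snap, child):
        self.calls.append(('clone', image, snap, child))

fake = FakeLibrbd()
deploy_from_template(fake, 'routing-template', 'i-2-10-VM')
print([c[0] for c in fake.calls])  # ['snap_create', 'snap_protect', 'clone']
```

Because the clone is copy-on-write, instances deploy in seconds regardless of template size.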
Future plans
● Add RBD write caching
– Write-cache setting per Disk Offering
● none (default), write-back and write-through
– Probably in 4.3
● Native RADOS support for Secondary Storage
– Secondary Storage already supports S3
– Ceph has a S3-compatible gateway
● Moving logic from the KVM Agent into libvirt
– Like snapshotting and cloning RBD images
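The planned per-Disk-Offering write-cache setting maps naturally onto the `cache` attribute of libvirt's disk `<driver>` element. A hypothetical sketch of that wiring, assuming the three option names from the slide; the mapping and validation logic here are an assumption about how the feature could work, not the shipped implementation:

```python
# Sketch: map the per-Disk-Offering write-cache setting from the slides
# onto libvirt's cache= attribute. This mapping is an assumption about
# how the planned 4.3 feature could be wired up.
LIBVIRT_CACHE_MODES = {
    'none': 'none',                  # default: no host-side write caching
    'write-back': 'writeback',       # fast, needs guest flushes for safety
    'write-through': 'writethrough', # safer, slower writes
}

def driver_element(cache_setting):
    """Return a libvirt <driver> line for a Disk Offering cache setting."""
    mode = LIBVIRT_CACHE_MODES.get(cache_setting)
    if mode is None:
        raise ValueError('unknown cache setting: %s' % cache_setting)
    return "<driver name='qemu' type='raw' cache='%s'/>" % mode

print(driver_element('write-back'))  # <driver name='qemu' type='raw' cache='writeback'/>
```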
Help is needed!
● Code is tested, but more testing is always welcome
● Adding more RBD logic into libvirt
– Snapshotting RBD images
– Cloning RBD images
– This makes the CloudStack code cleaner and helps other users who also use libvirt with RBD
● Improving the rados-java bindings
– Not feature complete yet
Thanks
● Find me on:
– E-Mail: [email protected]
– IRC: widodh @ Freenode / wido @ OFTC
– Skype: widodh / contact42on
– Twitter: widodh