private cloud as google does it - linuxtag.org · - xen, kvm, lxc · - drbd, lvm, file ... ganeti...

19
Ganeti Private Cloud as Google does it Helga Velroyen <[email protected]> Linuxtag Berlin, May 9th, 2014 · ·

Upload: doanbao

Post on 07-Jul-2018

219 views

Category:

Documents


0 download

TRANSCRIPT

GanetiPrivate Cloud as Google does it

Helga Velroyen <[email protected]>Linuxtag Berlin, May 9th, 2014

·

·

A Ganeti Cluster

Instance: a virtualization guestNode: a virtualization hostNodegroup: a homogeneous set of nodesCluster: a set of nodes, managed as a collective, partitioned by nodegroups

·

·

·

·

4/20

What can it do?

Manage clusters of physical machinesDeploy virtual machines on them

·

·

Resiliency to failure (distributed storage)Live migrationEase of repairs and hardware swapsCluster balancing

-

-

-

-

5/20

Ideas

Interact with the cluster as an entity, instead of the individual machines.Making the virtualization entry level as low as possible

Scale to enterprise ecosystems

·

·

Easy to install/manageLightweight (no "expensive" dependencies)No specialized hardware needed (eg. SANs)Start small, grow big

-

-

-

-

·

Manage simultaneously from 1 to ~200 host machinesAccess to advanced features (distributed storage, live migration, clusterbalancing)

-

-

6/20

Technologies

Linux and standard utils (iproute2, bridge-utils, ssh)Hypervisors:

Storage:

Programming languages:

·

·

Xen, KVM, LXC-

·

DRBD, LVM, file, distributed storage, Ceph/Gluster-

·

Python, Haskell-

7/20

Controlling Ganeti

(*) Programmable interfaces

Command line (*)RAPI (Rest-full http interface) (*)Webinterfaces:

·

·

·

Ganeti Web manager, aiming for admins, but includes "self-servicemanagement" for usersganetimgr web manager, simplified multicluster web manager for endusersSynnefo, complete cloud service solution, OpenStack API compatible

-

-

-

8/20

Production clusterAs we use it in a Google Datacentre

Ganeti node group / rack

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Ganeti node group / rack

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Per m

achine monitoring

Ganeti node group / rack

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Ganeti node

Remote API

SSH access

Ganeti cluster

current master node

Per m

achine monitoring

Per m

achine monitoring

... ... ...

9/20

Fleet at Google

Ganeti clustertype Dedicatedmaint window A

Ganeti cluster

maint window Btype Dedicated

Ganeti cluster

maint window Atype General

Datacenter Y

Ganeti clustertype Generalmaint window A

Ganeti clustertype Generalmaint window B

Ganeti clustertype Ubiquityno maint window

Datacenter X

Ganeti cluster

no maint windowtype Office

Office ZURICH

VirgilEuripidesDradis

Ganeti clustertype Generalmaint window A

Ganeti clustertype Generalmaint window B

Ganeti clustertype Ubiquityno maint window

Datacenter Z

VM transfer

Fleet Management

10/20

Instance provisioning at Google

type General

Ganeti cluster

Ganeti cluster

type Ubiquity

type Dedicated

Ganeti cluster

Monitoring

Ganeti cluster

type General

Virgil

Alloc request

Machine DB

scan capacityRAPI interface

gather capacity

11/20

Auto node repair at Google

Ganeti HW

Ganeti HW

Ganeti HW

Ganeti HW

broken HW

Ganeti HW Ganeti cluster

Euripides Virgil

Machine database

Send machine

Tell cluster to evacuate

to repairs (2)

the broken machine (4)

Monitoring ­ detects fault (1)

Mark machine broken (3)

Send to repairs (5)

12/20

Auto node readd at Google

Watches machine for 24hrs (2)

Euripides Machine DB

Virgil

Dradis

Ganeti HW

Ganeti HW

Ganeti HW

Ganeti HW

repaired HW

Ganeti HW

Detects machinewas repaired (1)

Tells Virgil to reintegrate machine (3)

Configure machine (4)

Tell cluster to add it (5)

Mark machine serving (6)

13/20

Ganeti 2.8, 2.9

2.8.4

2.9.6

DowngradingAutorepair toolHrollerImprovements on storage, monitoring

·

·

·

·

DRBD 8.4 supportContinued work on monitoring, storage, hroller

·

·

14/20

Ganeti 2.10

2.10.3, available in debian wheezy backports, debian jessi

Cross-cluster instance moves:

Cluster balancing based on CPU loadKVM: Hotplug support, direct access to RBD storageGaneti upgrades!

·

automatic node allocation on destination clusterconvert disk templates on the fly

-

-

·

·

·

15/20

Updates

In the past, updating Ganeti was a pain:

From 2.10 on, Ganeti comes with a built-in upgrade mechanism:

Note that you still have to install the new and deinstall the old packagesmanually.

/etc/init.d/ganeti stop // on all nodesapt-get install ganeti2=2.7.1-1 ganeti-htools=2.7.1-1 // on all nodes/usr/lib/ganeti/tools/cfgupgrade // on master/etc/init.d/ganeti start // on all nodesgnt-cluster redist-conf // on master... // lots of other steps, depending on the version// If something goes wrong, fix the mess manually.

apt-get install ganeti-2.11 // on all nodesgnt-cluster upgrade --to 2.11 // on mastergnt-cluster upgrade --to 2.10 // to roll back

16/20

Ganeti 2.11

Current stable release, 2.11.0.

RPC security: individual node certificatesCompression for instance moves / backups / importsConfigurable SSH ports per node groupGluster support (experimental)

·

·

·

·

17/20

Current and Future development

No guarantees!

Google Summer of Code:

Network improvements (IPv6, more flexibility)Storage: more work on shared storageHeterogeneous clustersImprovements on cross-cluster instances moves

·

·

·

·

Make LXC support production-readyConversion between arbitrary disk templates

·

·

18/20

Open Source Events

Confirmed:

Not confirmed yet:

Linuxcon Japan, Tokyo, May 20th 2014Ganeticon, Portland, Oregon, September

·

·

Linuxcon North America, Chicago, AugustFrOSCon, St. Augustin, Germany, AugustLISA '14, Seattle, November

·

·

·

19/20

Thank You!Questions?

© 2010 - 2014 GoogleUse under GPLv2+ or CC-by-SASome images borrowed / modified from Lance Albertson, IustinPop, and Guido TrotterSome slides were borrowed / modified from Tom Limoncelli

·

··

·

·

·