CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/ IT Configuration Activities Gavin McCance Online Cross-experiment Meeting, 14 June 2012


Page 1

CERN IT Department
CH-1211 Genève 23
Switzerland
www.cern.ch/it

IT Configuration Activities

Gavin McCance
Online Cross-experiment Meeting, 14 June 2012

Page 2

Why?

• We’re changing the tools we use to manage the centre
• Ten years ago, we were big in compute
  – There were no real IT ops tools at our scale, so we developed our own
  – Our tools are becoming increasingly brittle and high maintenance
  – Inefficiencies exist but root cause cannot be easily identified
  – Learning curve remains high
• About to expand to new remote tier-0
• Our needs are no longer special

Page 3

Why?

• Last few years have seen an explosion in the IT operations tool space
  – Configuration, management and monitoring
  – Large, supportive user communities
• Strategy is absolute minimum development
  – Other than involvement in upstream projects

Page 4

Scaling challenges: hosts and people

• Currently we have 10k hosts
• We’ll add another 5k in the medium term and move to VMs
  – 50–300k “hosts” depending on how we chop the CPUs up
• Many, diverse applications (“clusters”) managed by different teams
• ...and 700+ other “unmanaged” Linux nodes in VMs that could benefit from a simple configuration system

Page 5

What’s the config stack?

• Based around the Puppet tool and eco-system
  – Declarative configuration tool
  – Scales well
  – Very active, wide community
  – Very well integrated with other tools
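To illustrate what "declarative" means in practice, here is a minimal Python sketch of the idea: you state the desired end state, and the tool changes the system only when it differs from that state, so repeated runs are harmless. This is not Puppet code or Puppet's implementation; the function name `ensure_line` and the file it manages are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def ensure_line(path: str, line: str) -> bool:
    """Make sure `line` is present in the file at `path` (hypothetical helper).

    Returns True if the file had to be changed, False if it already
    matched the desired state.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in content.splitlines():
        return False  # already in the desired state, nothing to do
    # Start on a fresh line if the existing content lacks a trailing newline.
    prefix = "" if content == "" or content.endswith("\n") else "\n"
    with open(path, "a") as handle:
        handle.write(prefix + line + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "motd")
        print(ensure_line(target, "managed by config tool"))  # first run changes the file
        print(ensure_line(target, "managed by config tool"))  # second run is a no-op
```

The second call changes nothing because the desired state already holds, which is the property that lets a configuration run be repeated safely across many hosts.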

Page 6

Deployment status

• ~140 nodes in test with single puppetmaster
  – Will soon be expanding to 4k (virtual) nodes on load-balanced puppet setup
  – Integrating with OpenStack for VMs
• Investigating and understanding tools
  – IT-internal “early adopters” starting (castor, lxbatch, lxplus, webservices, …)
• Foreman dashboard as front-end and ENC (external node classifier)
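For readers unfamiliar with the ENC role: Puppet can call an external program with a node name and read back a YAML document listing the classes and parameters for that node. The sketch below is a toy classifier in Python that shows this contract. It is not Foreman's code, and the hostname prefixes, class names and the `cluster` parameter are assumptions made up for the example.

```python
import sys

# Hypothetical hostname-prefix -> Puppet classes mapping, for illustration only.
CLUSTERS = {
    "lxplus": ["base", "interactive"],
    "lxbatch": ["base", "batch_worker"],
}


def classify(hostname: str) -> dict:
    """Return the classifier answer for one node as a plain dictionary."""
    for prefix, classes in CLUSTERS.items():
        if hostname.startswith(prefix):
            return {"classes": classes, "parameters": {"cluster": prefix}}
    # Nodes that match no known cluster only get the baseline class.
    return {"classes": ["base"], "parameters": {"cluster": "unmanaged"}}


def to_yaml(answer: dict) -> str:
    """Render the small, fixed-shape answer as YAML without extra libraries."""
    lines = ["---", "classes:"]
    lines += [f"  - {name}" for name in answer["classes"]]
    lines.append("parameters:")
    lines += [f"  {key}: {value}" for key, value in answer["parameters"].items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The node name arrives as the first command-line argument.
    node = sys.argv[1] if len(sys.argv) > 1 else "lxplus001.example.org"
    sys.stdout.write(to_yaml(classify(node)))
```

Keeping classification outside the manifests means the mapping from hosts to clusters can live in a dashboard or database, which is the part Foreman plays in this setup.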

Page 7

Major bits

• Puppet and Foreman dashboard using git to version the templates
  – We’re putting “useful to others” modules in https://github.com/cernops
  – We’ve added integration of Puppet to the CERN CA
  – Hiera for cluster-specific parameterisation
    • Should make modules more portable in the future.
• Our software (and scripts) are built using Koji -> mash -> yum
• Automation: Looking at Crucible for automated configuration-code-review
• Keeping Lemon for monitoring (for now) though changing alarms to use messaging notifications
• mcollective for task orchestration
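The Hiera point above is about keeping site- and cluster-specific values out of the modules themselves. A rough Python sketch of hierarchical lookup follows: a key is searched from the most specific level to the most general, and the first match wins. The level names, keys and values here are hypothetical, and real Hiera reads its hierarchy and data from configuration files rather than an in-memory dictionary.

```python
from typing import Any, Dict, List

# Hypothetical data, keyed by hierarchy level.
DATA: Dict[str, Dict[str, Any]] = {
    "node/lxplus001": {"ntp_server": "ntp-test.example.org"},
    "cluster/lxplus": {"max_sessions": 200},
    "common": {"ntp_server": "ntp.example.org", "max_sessions": 50},
}


def hierarchy_for(node: str, cluster: str) -> List[str]:
    """Levels to search, ordered from most specific to most general."""
    return [f"node/{node}", f"cluster/{cluster}", "common"]


def lookup(key: str, node: str, cluster: str) -> Any:
    """Return the first value found for `key` walking down the hierarchy."""
    for level in hierarchy_for(node, cluster):
        values = DATA.get(level, {})
        if key in values:
            return values[key]
    raise KeyError(key)


if __name__ == "__main__":
    print(lookup("ntp_server", "lxplus001", "lxplus"))    # node-level override
    print(lookup("max_sessions", "lxplus001", "lxplus"))  # cluster-level value
    print(lookup("max_sessions", "b001", "lxbatch"))      # falls back to common
```

Because a module only asks for a key, the same module can be reused at another site by swapping the data, which is the portability benefit the slide refers to.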

Page 8

Current arch


Page 9

mcollective: task orchestration


• Broadcast

• Run

• Collect

• Very fast response

• Automatable
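The broadcast, run, collect pattern listed above can be sketched in a few lines of Python. This is only an in-process simulation using a thread pool; it does not use mcollective's own API or any messaging middleware, and `fake_agent` is a hypothetical stand-in for whatever action a real agent would run on each node.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def broadcast_run_collect(nodes: List[str], task: Callable[[str], str]) -> Dict[str, str]:
    """Send `task` to every node at once and gather the replies by node name."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        replies = pool.map(task, nodes)   # broadcast and run, in parallel
        return dict(zip(nodes, replies))  # collect, keyed by node


def fake_agent(node: str) -> str:
    # Stand-in for a real remote action such as restarting a service.
    return f"{node}: ok"


if __name__ == "__main__":
    for node, reply in broadcast_run_collect(["node1", "node2", "node3"], fake_agent).items():
        print(node, "->", reply)
```

Since every node works at the same time, the total wait is roughly that of the slowest responder rather than the sum of all of them, which is why this style of orchestration responds quickly and is easy to script.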

Page 10

Interesting CERNish modules

• Will be putting things in https://github.com/cernops
• Modules
  – AFS
  – Keytab, Kerberos
  – CVMFS
  – SSO with Apache httpd
  – SSL Apache load-balancer
  – CERN auth with LDAP (SSSD)
  – CERN Lemon
  – + usual OS level configurations
• OpenStack integration
• Cloud-init auto-registration into Puppet

Page 11

Summary

• We’re moving to standard tools for configuration (and VM + monitoring)
• We’re gaining experience using Puppet and friends
  – Internal IT early adopters now
  – On track to move our IT services in 2013
• We’re interested in collaborating on the work