FROM ZERO TO DELIVERY IN A LARGE ENTERPRISE
PUPPET CAMP LONDON
ALAN SCHWARZENBERGER (Head of Engineering – Tools & Automation)
CHRIS SPENCE (All round nice guy)
17th NOVEMBER 2014
Outline – Background
• Scale
– many applications
– each application has multiple contributing groups
– many data centres around globe
– many servers (>10k)
• Complexity
– many OS image variants
– competing solutions from different teams
– most products are 24x7
• Organisational tension
• Previous failed attempts (not Puppet)
Outline – Why & Aims
• Why?
– Simplify delivery to decrease cost, effort and time
– Improve quality & transparency of change implementation
– Standardise the way that change is performed
– Introduce common workflows
• Aims
– Automated deployments & configuration management
– Standard open source tools
– Replace non-scalable legacy tools
– Provide consistency from dev through to production
– Modular and loosely coupled
– Self service
– Single scaled platform
Outline – Starting point
• Previous toolsets
– didn't scale, hard to build workflows around, hard to use
• Small pockets of Puppet, Chef, Rundeck etc
• Greenfield site for Puppet
• Other initiatives informed solution
– data aggregator
– release management UI
• External integrations with other tools using REST API
• Workflow implied by overall data model and business structures
• Multi-tenant problems
Design & Architecture
• Puppet master solution, with Hiera and External Node Classifier
• Resilient, horizontally scaled (mix of DNS, F5, Apache)
• MoM, Compilers, Puppet DB, Puppet Dashboard
• Directory Environments
• Separate artifact distribution with caches around the globe
– This could be a talk all on its own!
– It turns out distributing things is hard.
• Servers modelled in Hiera, templated configuration
puppetmaster::servers:
  puppetmaster: upg-dev-moma.amers1.ciscloud
  servers:
    compiler1:
      group: compilers
    compiler2:
      group: compilers
    mom1:
      group: mom
      primary: true
    mom2:
      group: mom
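As a rough illustration of "servers modelled in Hiera, templated configuration", the sketch below renders one line per compiler from a hash shaped like the data above. The template text, the `members` helper and the use of port 8140 (Puppet's default master port) are assumptions made for the example, not the configuration used in the talk.

```ruby
require 'erb'
require 'yaml'

# Hypothetical data, shaped like the "servers" hash on the slide.
data = YAML.safe_load(<<~YAML)
  servers:
    compiler1:
      group: compilers
    compiler2:
      group: compilers
    mom1:
      group: mom
      primary: true
    mom2:
      group: mom
YAML

# Return the sorted names of all servers that belong to the given group.
def members(servers, group)
  servers.select { |_name, attrs| attrs['group'] == group }.keys.sort
end

# Illustrative template: one balancer member line per compiler.
template = ERB.new(<<~TEMPLATE, trim_mode: '-')
  <%- compilers.each do |name| -%>
  BalancerMember https://<%= name %>:8140
  <%- end -%>
TEMPLATE

compilers = members(data['servers'], 'compilers')
rendered = template.result_with_hash(compilers: compilers)
puts rendered
```

Adding a compiler then only needs a new entry in the data; the template stays unchanged.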
Filling in the missing bits
• Hiera at scale
• Resilient CA
• Handling server-side Ruby
• Resilient & scaled PuppetDB
• Merging Hiera from multiple contributors into a directory environment
• Aggregating puppet code from multiple contributors for one stack
• How do you release?
– Components and Component Versions
– Component Stacks, Component Stack Versions
– Deployment Groups
Filling in the missing bits – Hiera at scale
• Don’t put every value into Hiera
– consider performance implications
– keep values tightly coupled to code in the code
• Use for what changes in different builds
• Our chosen hierarchy:
---
:hierarchy:
  - "host/%{::fqdn}"
  - "server_type/%{server_type}/server_environment_class/%{server_environment_class}/deployment_location_code/%{deployment_location_code}"
  - "server_type/%{server_type}/server_environment_class/%{server_environment_class}"
  - "server_environment_class/%{server_environment_class}/deployment_location_code/%{deployment_location_code}"
  - "server_environment_class/%{server_environment_class}/server_environment_class_instance/%{server_environment_class_instance}"
  - "server_type/%{server_type}"
  - "deployment_location_code/%{deployment_location_code}"
  - "server_environment_class/%{server_environment_class}/region/%{region}"
  - "server_environment_class/%{server_environment_class}"
  - default
:backends:
  - yaml
:yaml:
  :datadir: "/platform/puppet/hieradata/%{::environment}"
:merge_behavior: deeper
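To make the lookup order concrete, here is a minimal sketch of how a hierarchy like this expands into data source paths for one node. It is not Hiera's own code: the `resolve_hierarchy` helper, the sample variable values and the shortened hierarchy are all assumptions, and dropping entries whose variables are unset is a simplification chosen for readability.

```ruby
# Expand Hiera-style %{var} and %{::var} tokens in each hierarchy entry.
# Entries that reference a variable missing from the scope are dropped
# (a simplification made by this sketch).
def resolve_hierarchy(hierarchy, scope)
  resolved = hierarchy.map do |entry|
    missing = false
    path = entry.gsub(/%\{(?:::)?(\w+)\}/) do
      value = scope[Regexp.last_match(1)]
      missing = true if value.nil?
      value.to_s
    end
    missing ? nil : path
  end
  resolved.compact
end

# A shortened version of the hierarchy, highest priority first.
hierarchy = [
  'host/%{::fqdn}',
  'server_type/%{server_type}/server_environment_class/%{server_environment_class}',
  'server_environment_class/%{server_environment_class}/region/%{region}',
  'server_type/%{server_type}',
  'default'
]

# Hypothetical values for one node; "region" is deliberately unset.
scope = {
  'fqdn' => 'web01.example.com',
  'server_type' => 'web',
  'server_environment_class' => 'prod'
}

paths = resolve_hierarchy(hierarchy, scope)
puts paths
```

Each resulting path is relative to the `:datadir` and would be looked up as a YAML file, most specific first.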
Filling in the missing bits – Resilient CA
• Building a resilient pair of CA servers
– Uses the Puppet CA
– You can’t just rsync!
– CAsync!
• Copies certificates
• Merges the inventory files
• Works out latest CRL
– Don’t forget the serial number offset!
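The inventory merge step might look roughly like the sketch below. This is not CAsync itself: `merge_inventories` is a hypothetical helper, and the line format (hex serial, validity dates, subject) and the offset value are assumptions. It shows why the serial number offset matters: if both CAs issue from the same range, two different certificates can share a serial and the merge cannot tell them apart.

```ruby
# Merge the text of several CA inventory files, keyed by serial number.
# Comment lines and blank lines are skipped; the first occurrence of a
# serial wins, so entries already synced to both CAs appear once.
def merge_inventories(*inventories)
  entries = {}
  inventories.each do |text|
    text.each_line do |line|
      line = line.strip
      next if line.empty? || line.start_with?('#')
      serial = line.split(/\s+/, 2).first.hex
      entries[serial] ||= line
    end
  end
  entries.sort.map { |_serial, line| line }
end

# Hypothetical inventories: CA "a" issues from 0x0001 upwards, CA "b"
# from an assumed offset of 0x10000 so that new serials never collide.
ca_a = <<~INVENTORY
  # Inventory of signed certificates
  0x0001 2014-11-01T10:00:00GMT 2019-10-31T10:00:00GMT /CN=ca
  0x0002 2014-11-02T10:00:00GMT 2019-11-01T10:00:00GMT /CN=web01.example.com
INVENTORY

ca_b = <<~INVENTORY
  0x0002 2014-11-02T10:00:00GMT 2019-11-01T10:00:00GMT /CN=web01.example.com
  0x10001 2014-11-03T10:00:00GMT 2019-11-02T10:00:00GMT /CN=db01.example.com
INVENTORY

merged = merge_inventories(ca_a, ca_b)
puts merged
```

Copying the certificates and working out the latest CRL are separate steps that this sketch does not cover.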
Filling in the missing bits – Server-side Ruby
• Server-side Ruby is immutable – only one version!
– functions, types, providers, core Puppet modules (incl. Forge modules)
– Directory Environments – use basemodulepath for the server-side Ruby
– Multiple teams must contribute server-side Ruby to a single place
Filling in the missing bits – Scaling PuppetDB
• PuppetDB with read slaves
• Puppet Labs modules with a wrapper class
• Set up PostgreSQL write-ahead log (WAL) archiving
Filling in the missing bits – Merge & Aggregate
• Merging Hiera from multiple contributors into a directory environment
– Hieramerge - deep merge hashes
– right to left priority wins
• Aggregating puppet code from multiple contributors for one stack
– Components, component stacks and deployment groups
foo = HieraMerge.new
foo.dirs = [
  "/base/module/path/hieradata",
  "/thing/otherthing/1.6.6.6/hieradata",
  "/thing/penguinthing/1.2.3/hieradata"
]
foo.target = "/out/put/hieradata"
foo.merge
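The core of a deep hash merge can be sketched in a few lines of plain Ruby. This is not the hieramerge implementation: `deep_merge` and `merge_all` are hypothetical helpers, and the sketch assumes that "right to left priority" means a later directory in the list overrides an earlier one.

```ruby
# Recursively merge two hashes. Nested hashes are merged key by key;
# for any other value the right-hand side wins.
def deep_merge(left, right)
  left.merge(right) do |_key, left_value, right_value|
    if left_value.is_a?(Hash) && right_value.is_a?(Hash)
      deep_merge(left_value, right_value)
    else
      right_value
    end
  end
end

# Fold a list of layers into one hash, later layers taking priority
# (an assumption of this sketch).
def merge_all(layers)
  layers.reduce({}) { |merged, layer| deep_merge(merged, layer) }
end

# Hypothetical contents of the same Hiera file from two contributors.
base = { 'ntp' => { 'servers' => ['ntp1'], 'enabled' => true } }
app  = { 'ntp' => { 'servers' => ['ntp2'] }, 'app' => { 'port' => 8080 } }

merged = merge_all([base, app])
puts merged.inspect
```

Arrays are replaced rather than concatenated here; a real tool would need to decide that behaviour explicitly.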
How do you release?
• How do you release software?
– Components – what we release, e.g. MyWebApp
– Component Versions – a point version, e.g. MyWebApp-1.3.37
• That’s probably pretty normal
• But we’re releasing multiple apps at a time from different groups
– Component Stacks – MyWebApp + Monitoring
– Component Stack Versions – MyWebApp 1.3.37 + Monitoring version 9
Components
• Components
– discrete sets of functionality
– informed by organisation structures & application responsibilities
• Component Versions
– a released, published version of a discrete set of functionality
– expected to be self-contained without external dependencies other than on the base module
– consumed into an RPM and become immutable (thanks to Jordan Sissel for FPM)
– never deployed on their own unless they are put in a component stack
{
  component: "winning",
  versionvalidated: "true",
  componentreleaseversion: "1.3.37",
  packageurl: "http://gitserver.mcgitserver/project/winning.git"
},
{
  component: "losing",
  versionvalidated: "false",
  componentreleaseversion: "6.6.6",
  packageurl: "http://gitserver.mcgitserver/otherproject/losing.git",
  versionvalidationerror: "failed to download modules"
},
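Records like these lend themselves to simple queries. The sketch below picks the newest validated version of a component; `latest_validated` is a hypothetical helper, the extra "1.3.4" record is invented for the example, and nothing in the talk says the data layer exposes exactly this query.

```ruby
require 'rubygems' # Gem::Version gives numeric, segment-wise comparison

# Return the highest validated version of a component, or nil if the
# component has no validated versions.
def latest_validated(records, component)
  records
    .select { |r| r[:component] == component && r[:versionvalidated] == 'true' }
    .map { |r| Gem::Version.new(r[:componentreleaseversion]) }
    .max
end

# Hypothetical records using the field names from the slide.
records = [
  { component: 'winning', versionvalidated: 'true',  componentreleaseversion: '1.3.4' },
  { component: 'winning', versionvalidated: 'true',  componentreleaseversion: '1.3.37' },
  { component: 'losing',  versionvalidated: 'false', componentreleaseversion: '6.6.6' }
]

puts latest_validated(records, 'winning')
puts latest_validated(records, 'losing').inspect
```

Comparing with `Gem::Version` rather than as strings matters here: as plain strings, "1.3.4" would sort above "1.3.37".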
Component Stacks & Deployment Groups
• Component Stacks
– one or more components which together describe the desired overall state of a node
• Component Stack Version
– a released, published version of code, RPM packaged
– rpm contains:
• the hieradata, merged with hieramerge
• an environment.conf
• RPM dependencies to component versions
– at install time use yum for dependency resolution
– reuse of different component versions across stacks
– component stack version becomes a Puppet Directory Environment
• Deployment Group
– nodes that need same component stack version
– nodes that will be upgraded together (e.g. A/B side)
– ENC integration – the deployment group controls which component stack version a node gets
{
  version: "9.1.1",
  versionvalidated: "true",
  componentstackname: "nigel",
  componentversion: [
    {
      component: {
        component: "allyourbase",
        versionvalidated: "true",
        componentreleaseversion: "22.7"
      }
    },
    {
      component: {
        component: "arebelongtous",
        versionvalidated: "true",
        componentreleaseversion: "3.141"
      }
    }
  ]
}
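The ENC integration described above could be sketched as follows: the node maps to a deployment group, the group maps to a component stack version, and that version is returned as the node's Puppet environment. The `classify` helper, the lookup tables and the `name_version` environment naming scheme are assumptions for illustration. What Puppet itself fixes is that an ENC prints a YAML hash with `classes`, `parameters` and `environment` keys, and that environment names may only contain lowercase letters, digits and underscores.

```ruby
require 'yaml'

# Build ENC output for a node: look up its deployment group, then the
# component stack version assigned to that group.
def classify(node, node_groups, group_stacks)
  group = node_groups.fetch(node)
  stack = group_stacks.fetch(group)
  # Dots are not valid in environment names, so replace them.
  environment = "#{stack[:name]}_#{stack[:version].tr('.', '_')}"
  {
    'classes'     => {},
    'environment' => environment,
    'parameters'  => { 'deploymentgroup' => group }
  }
end

# Hypothetical data; in the talk this comes from the data layer.
node_groups  = { 'web01.example.com' => 'web-a-side' }
group_stacks = { 'web-a-side' => { name: 'nigel', version: '9.1.1' } }

result = classify('web01.example.com', node_groups, group_stacks)
puts result.to_yaml
```

Upgrading an A or B side then means changing one group-to-stack mapping rather than touching individual nodes.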
Build workflow
• Jenkins build server in full control
– Remove deploy-time dependency on version control system & build server
– Enforces immutability at point in time
– BYO version control system (gitlab/github/svn)
• driven from data in the data layer
• we support reading from git directly (r10k)
• we support people doing their own thing and giving us an artifact (tar/zip)
What’s the process?
• Workflow
– Component version built into rpm
– Component versions built into a stack rpm with merged hieradata
– Component stacks installed on puppetmasters using puppet
– Stateful description of all versions of all apps/states through data layer
– Validated/deprecated releases are marked
– Data available to puppet master, so we can install and purge stacks using puppet and yum
• We are treating Puppet releases as Software releases
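The install-and-purge step reduces to comparing what the data layer says should be present with what is installed. A minimal sketch, assuming hypothetical package names and a `stack_plan` helper that are not from the talk:

```ruby
# Work out which stack packages to install and which to purge, given
# the desired set (from the data layer) and the installed set.
def stack_plan(desired, installed)
  {
    install: (desired - installed).sort,
    purge:   (installed - desired).sort
  }
end

# Hypothetical package names for component stack versions.
desired   = ['stack-nigel-9.1.1', 'stack-nigel-9.2.0']
installed = ['stack-nigel-9.0.0', 'stack-nigel-9.1.1']

plan = stack_plan(desired, installed)
puts plan.inspect
```

In the workflow above, Puppet and yum would act on such a plan, with yum resolving the component version dependencies of each stack RPM.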
Other interesting things
• Standard internal Travis CI-like per-repo testing
• Self-service development for groups
– managed dev master pinned to production release
– has own CA, so downstream is independent
– no ENC
– iterate on published componentstackversions
– quick iteration/self-service/pre-publish
• Custom reporting events into the data layer
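require 'yaml'

# Load the per-repo test configuration; fall back to defaults if absent.
begin
  config = YAML.load_file('.config.yaml')
rescue Errno::ENOENT
  config = {}
end

# 'command' may be a single string or a list of commands.
command = config['command'] || 'rake lint && rake syntax'
[command].flatten.each do |check|
  system(check)
  # Stop at the first failing check and propagate the failure.
  exit(1) unless $?.exitstatus == 0
end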
Future and next steps
• Adoption – going from 200 to >10000 nodes
• Reporting into data layer from PuppetDB for a custom dashboard
• Component dependencies, because they are currently hidden/not explicit
• User experience of building stacks needs improving
• Fully automated acceptance testing
– rspec-puppet gives some guarantees
• Open sourcing interesting, novel & useful things
– casync
– hieramerge
• Resolving split responsibility between OS and app
Contact us
Alan Schwarzenberger
Or via LinkedIn
Chris Spence
github.com/fiddyspence
Or via LinkedIn