Test Driven Infrastructure Development - PuppetConf 2013

Test driven Infrastructure development Tomas Doran bobtfish@bobtfish.net @bobtfish

Upload: puppet-labs

Post on 07-Jul-2015

DESCRIPTION

"Test Driven Infrastructure Development" by Tomas Doran, Senior Systems Administrator, TIM Group.

Presentation Overview: Continuous deployment of puppet code feels like a holy grail; naive approaches are a minefield for stability. It is easy to write code that works on existing machines but doesn’t work on newly provisioned machines. Whilst there are tools like rspec-puppet to help with testing your code, they don’t help with system-level tests. One way to solve this problem is to build an infrastructure and run end-to-end tests against it. This talk will cover the approach my team has taken to this, defined by the development cycle we wanted: test-driven development, fast feedback, and confidence in the repeatability of builds, with an automated and continuous deployment pipeline to take changes from the first push through to production.

Speaker Bio: Tomas currently works as senior sysadmin at TIM Group, developing application and infrastructure automation solutions. Tom came to the dark side of systems & devops after being a professional Perl developer for many years, and having worked in other varied fields such as security, QA and management. He’s an avid open source coder and core maintainer of the Catalyst and Plack projects, as well as having over 100 CPAN modules and 200 GitHub projects. He speaks regularly at technical conferences on a number of topics spanning development, architecture, automation, security and systems administration.

TRANSCRIPT

Page 1: Test Driven Infrastructure Development - PuppetConf 2013

Test driven Infrastructure development

Tomas Doran bobtfish@bobtfish.net @bobtfish

Page 7: Test Driven Infrastructure Development - PuppetConf 2013

• High availability!

• Automated testing of all infrastructure changes

• Entirely repeatable application environments

• High confidence in changes

• Continuous integration and deployment for infrastructure

Today, I’m going to talk about the promised land! And by ‘repeatable’, I mean I need to be able to spin up an arbitrary set of servers for any environment I want, whenever I want - so _all_ the configuration of all the instances has to be dynamic!

Page 8: Test Driven Infrastructure Development - PuppetConf 2013

So who the hell am I?

Page 9: Test Driven Infrastructure Development - PuppetConf 2013

Dev

Infrastructure automation nut! Ex-backend web developer, ex-security, currently fixing puppet at Yelp!

Page 15: Test Driven Infrastructure Development - PuppetConf 2013

Dev / Ops

• Developer viewpoint

• Grass IS greener

• Think of your infra as an agile software project...

• What workflow do I want?

The state of repeatability and testing in infrastructures is generally shocking. This leads to systems/operations teams being averse to change and conservative, which slows the business down! Why isn’t your infrastructure an agile software project?

Page 16: Test Driven Infrastructure Development - PuppetConf 2013

The state of the art

I’m going to talk about how I think the generally accepted way of doing some things is fundamentally broken! But let’s start with a simple description of the issues I’m worrying about.

Page 17: Test Driven Infrastructure Development - PuppetConf 2013

CM = state machine

Each change puppet makes (or attempts to make) is a state transition. Each circle represents the configuration state of the server: what is on disc, which services are running, etc.

Page 18: Test Driven Infrastructure Development - PuppetConf 2013

Non deterministic

This is the key observation here - you don’t know which way puppet’s gonna jump :) In this case it doesn’t matter, as the two operations are orthogonal.

Page 19: Test Driven Infrastructure Development - PuppetConf 2013

Convergent!

Convergence is when each run of puppet takes you nearer to 0 changes, but the next run still makes additional changes. The classic way to screw this up is to miss a dependency in your code.

Page 20: Test Driven Infrastructure Development - PuppetConf 2013

Convergent!

Of course, this doesn’t happen - the first step goes BANG, then mysql gets installed, which creates /etc/mysql. The second puppet run _then_ sets the config up.

Page 21: Test Driven Infrastructure Development - PuppetConf 2013

err: /Stage[main]//File[/etc/mysql/my.cnf]/ensure: change from absent to file failed: Could not set 'file on ensure: No such file or directory - /etc/mysql/my.cnf.puppettmp_3706 at /home/tdoran/test.pp:4

Aaand in your puppet logs, you get.

Page 22: Test Driven Infrastructure Development - PuppetConf 2013

Purple text of rage!

err: /Stage[main]//File[/etc/mysql/my.cnf]/ensure: change from absent to file failed: Could not set 'file on ensure: No such file or directory - /etc/mysql/my.cnf.puppettmp_3706 at /home/tdoran/test.pp:4

THE PURPLE TEXT OF RAGE

Page 23: Test Driven Infrastructure Development - PuppetConf 2013

Convergent!

(Shamelessly stolen from https://www.usenix.org/legacy/publications/library/proceedings/lisa02/tech/full_papers/traugott/traugott.pdf)

Aaand your machine is convergent - i.e. it gets towards the desired state in a number of steps.
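To make the "number of steps" concrete, here is a minimal Ruby sketch of the mysql example. It is a toy model, not puppet itself: the host is just a set of paths, and the method names (`install_mysql`, `write_my_cnf`, `converge`) are invented for illustration. It assumes the file resource happens to be applied before the package because no dependency was declared.

```ruby
require 'set'

# Hypothetical miniature of a host: just the set of paths that exist on disc.
# Each resource returns :changed, :unchanged or :failed when applied.
def install_mysql(fs)
  # Installing the package creates /etc/mysql as a side effect.
  fs.add?('/etc/mysql') ? :changed : :unchanged
end

def write_my_cnf(fs)
  # Writing the file fails while its parent directory is missing.
  return :failed unless fs.include?('/etc/mysql')

  fs.add?('/etc/mysql/my.cnf') ? :changed : :unchanged
end

# Apply the resources in the given order, run after run, until a run
# changes nothing. Returns the per-run results.
def converge(order, fs = Set.new, max_runs: 5)
  history = []
  max_runs.times do
    results = order.map { |name| send(name, fs) }
    history << results
    break if results.all? { |result| result == :unchanged }
  end
  history
end

# No dependency declared, so assume the file is applied first.
converge(%i[write_my_cnf install_mysql]).each_with_index do |results, i|
  puts "run #{i + 1}: #{results.inspect}"
end
```

In this toy model the first run fails on the file, the second run fixes it, and only the third run reports no changes.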

Page 25: Test Driven Infrastructure Development - PuppetConf 2013

Fixable!

• before

• require

• subscribe

• notify

As I noted, this all happens because you missed a dependency. This is the easy case, where puppet can detect that and tell you! It’s also entirely possible for this to be totally silent. It is, though, totally possible to write your puppet code well enough to need EXACTLY 1 puppet run to fully provision a server!
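Why declared dependencies get you down to one run can be sketched in a few lines of Ruby. This is an illustration of dependency ordering in general, not puppet's actual implementation; the resource table and the `apply_order` method are assumptions made up for the example.

```ruby
# Hypothetical resource table: name => the names it requires
# (the same idea as a `require` relationship).
RESOURCES = {
  'File[/etc/mysql/my.cnf]' => ['Package[mysql-server]'],
  'Service[mysql]'          => ['File[/etc/mysql/my.cnf]'],
  'Package[mysql-server]'   => []
}.freeze

# Depth-first ordering: every resource is placed after everything it requires.
def apply_order(resources)
  ordered = []
  visiting = {}
  visit = lambda do |name|
    next if ordered.include?(name)
    raise ArgumentError, "dependency cycle at #{name}" if visiting[name]

    visiting[name] = true
    resources.fetch(name).each { |dep| visit.call(dep) }
    visiting.delete(name)
    ordered << name
  end
  resources.each_key { |name| visit.call(name) }
  ordered
end

puts apply_order(RESOURCES)
```

With the relationships declared, the package always comes before the config file and the file before the service, so nothing has to wait for a second run.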

Page 26: Test Driven Infrastructure Development - PuppetConf 2013

Fixable!

• before

• require

• subscribe

• notify

What about an entire infrastructure?

The $64,000 question is....

Page 27: Test Driven Infrastructure Development - PuppetConf 2013

A whole stack

Let’s start simple, but semi-realistic. Gonna ignore databases. Gonna ignore monitoring. Gonna ignore the n[eo]twork.

Page 32: Test Driven Infrastructure Development - PuppetConf 2013

Exported resources

• Inter machine dependencies

• Unidirectional!

• Known graph - webs, proxies, lbs

• Puppetroll (github.com/youdevise/puppetroll)

Each layer of systems can publish data to the systems which depend on it (i.e. webs register, proxies find the webs + register themselves, lbs then find the proxy). Given you know the dependencies, you can get consistent runs by ordering them.
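Ordering the runs for a known, unidirectional graph can be sketched as below. This is only an illustration of the idea and makes no claim about how Puppetroll works; the `CONSUMES` table and `run_waves` method are invented names.

```ruby
# Hypothetical role graph from the slide: each role lists the roles whose
# exported data it consumes.
CONSUMES = {
  'lb'    => ['proxy'],
  'proxy' => ['web'],
  'web'   => []
}.freeze

# Group roles into waves: a role joins a wave once everything it consumes
# has run in an earlier wave. Roles within one wave could run in parallel.
def run_waves(consumes)
  done = []
  waves = []
  until done.size == consumes.size
    wave = consumes.keys
                   .reject { |role| done.include?(role) }
                   .select { |role| (consumes[role] - done).empty? }
    raise ArgumentError, 'circular dependency between roles' if wave.empty?

    waves << wave
    done.concat(wave)
  end
  waves
end

run_waves(CONSUMES).each_with_index do |wave, i|
  puts "wave #{i + 1}: run puppet on #{wave.join(', ')}"
end
```

The sketch deliberately raises when two roles depend on each other: at that point no single ordering exists, and you are back to running puppet repeatedly.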

Page 33: Test Driven Infrastructure Development - PuppetConf 2013

Exported resources

(Shameless ripoff of http://xkcd.com/1171/ )

Ordering dependent. Hard to test (in isolation). Slooow (have to run in order)

Page 34: Test Driven Infrastructure Development - PuppetConf 2013

Co-dependence

And if we really are talking about entire infrastructures... then maybe we need some of these.

Page 35: Test Driven Infrastructure Development - PuppetConf 2013

Co-dependence

:( You _know_ that if everything is dynamically configured, you’re gonna have to do multiple puppet runs per server... Do we _really_ want to keep running puppet till it stops changing things?

Page 40: Test Driven Infrastructure Development - PuppetConf 2013

The solution - an external model

• Represent system as a set of ruby classes

• DSL for describing environments

• Dependencies

• Domain knowledge

Use your software model to generate a set of machines for an environment. And generate config for puppet to apply to each system to configure it. Add super secret special sauce (lots and lots of mcollective!)
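As a rough idea of what "a set of ruby classes" plus "a DSL for describing environments" could look like, here is a minimal sketch. Every name in it (`Machine`, `Environment`, `service`, `dependencies_of`, the hostname scheme) is hypothetical and is not taken from the tooling described in the talk.

```ruby
# Hypothetical model classes; all names are invented for this sketch.
Machine = Struct.new(:hostname, :role, :depends_on)

class Environment
  attr_reader :name, :machines

  def initialize(name, &block)
    @name = name
    @machines = []
    instance_eval(&block)
  end

  # DSL verb: declare `count` machines of one role.
  def service(role, count:, depends_on: [])
    count.times do |i|
      hostname = format('%s-%s-%03d', name, role, i + 1)
      @machines << Machine.new(hostname, role, depends_on)
    end
  end

  # Domain knowledge lives in the model: resolve role dependencies to hosts.
  def dependencies_of(machine)
    machines.select { |m| machine.depends_on.include?(m.role) }.map(&:hostname)
  end
end

def build_ci_environment
  Environment.new('ci') do
    service :jenkinsapp, count: 2
    service :lb, count: 2, depends_on: [:jenkinsapp]
  end
end

env = build_ci_environment
env.machines.each do |machine|
  puts "#{machine.hostname} -> #{env.dependencies_of(machine).inspect}"
end
```

Because the model knows every machine up front, each node's dependencies can be computed before anything is provisioned, rather than discovered over several puppet runs.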

Page 41: Test Driven Infrastructure Development - PuppetConf 2013

This is a simplified / minimal example jenkins environment - just 4 machines (2 web apps, 2 load balancers)

Page 42: Test Driven Infrastructure Development - PuppetConf 2013

ENC data!

Our external node classifier generates this for each of the 4 machines, which translates to puppet code run on the server. Note how every server gets all of its dependencies. There’s a companion data structure sent to the agent which actually provisions the virtual machines.
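The slide's actual ENC data is not in this transcript, so the following is only a guess at its general shape. A puppet ENC prints a YAML document per node with keys such as `classes`, `parameters` and `environment`; the class name `role::loadbalancer`, its `backends` parameter and the `enc_for_lb` method below are invented for illustration.

```ruby
require 'yaml'

# Hypothetical ENC payload for one load balancer. In practice the backend
# list would come from the external model, not be passed in by hand.
def enc_for_lb(backends)
  {
    'classes' => {
      # The class name and its parameter are invented for this sketch.
      'role::loadbalancer' => { 'backends' => backends }
    },
    'environment' => 'production'
  }
end

puts enc_for_lb(%w[ci-jenkinsapp-001 ci-jenkinsapp-002]).to_yaml
```

The point is that the load balancer's class parameters already name its backends, so its first puppet run has everything it needs.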

Page 43: Test Driven Infrastructure Development - PuppetConf 2013

Call tree looks something like this: Model all the nodes, allocate all their IPs. Make calls to KVM servers to provision machines. VMs start, boot, run puppet, send cert to puppetmaster, --waitforcert. Central provisioning asks ‘do we have a cert’, waits, then signs it. Looks up DNS and ENC to compile the catalog. Catalog shipped to node, which runs puppet. Provisioning uses MCO to determine when puppet has finished. When all nodes are up, check nrpe is all green on all nodes, then run end-to-end app tests!

Page 44: Test Driven Infrastructure Development - PuppetConf 2013

Automate all the things

Suddenly, I have massive power. I can write a small script to bring up a whole production-like environment, run tests against it, tear it down. I can do this against the latest puppet changes, and only promote them to run on production servers when the tests pass!
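The shape of such a script might look like the sketch below. All four steps are passed in as stand-in callables because the real provisioning, testing and promotion code is not shown in these notes; `validate_change` and its parameter names are assumptions.

```ruby
# Every callable passed in here is a stand-in for a real step.
def validate_change(provision:, run_tests:, teardown:, promote:)
  env = provision.call
  begin
    passed = run_tests.call(env)
    # Only promote the puppet change when the end-to-end tests pass.
    promote.call if passed
    passed
  ensure
    # Always destroy the throwaway environment, even if the tests blow up.
    teardown.call(env)
  end
end

log = []
ok = validate_change(
  provision: -> { log << :provision; 'test-env' },
  run_tests: ->(env) { log << :test; env == 'test-env' },
  teardown:  ->(_env) { log << :teardown },
  promote:   -> { log << :promote }
)
puts "promoted: #{ok}, steps: #{log.inspect}"
```

Putting the teardown in `ensure` is one way to keep failed test runs from leaking environments.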

Page 45: Test Driven Infrastructure Development - PuppetConf 2013

BDD infrastructure

Behavior driven development - given I have a high-level model of the systems comprising an infrastructure, I can then write equally high-level tests to assert the behavior of that infrastructure.

Page 46: Test Driven Infrastructure Development - PuppetConf 2013

BDD infrastructure

• Given

For example...

Page 55: Test Driven Infrastructure Development - PuppetConf 2013

BDD infrastructure

• Given – the Service has finished being provisioned

• And – all monitoring related to the service is passing

• When – we destroy a single member of the service

• Then – we expect all monitoring at the service level to be passing

• And – we expect all monitoring at the single machine level to be failing

Yes, I am suggesting regression testing your load balancer setup...
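The scenario above can be sketched as a plain Ruby check against a toy service. `FakeService` and the `expect` helper are invented for illustration; a real test would provision machines and query monitoring (for example nrpe) instead of flipping in-memory flags.

```ruby
# Toy stand-in for a load-balanced service with per-machine health flags.
class FakeService
  def initialize(members)
    @alive = members.to_h { |member| [member, true] }
  end

  def destroy(member)
    @alive[member] = false
  end

  def machine_check_passing?(member)
    @alive.fetch(member)
  end

  # The service is healthy while at least one member can take traffic.
  def service_check_passing?
    @alive.values.any?
  end
end

def expect(description, condition)
  raise "FAILED: #{description}" unless condition

  puts "ok - #{description}"
end

# Given the service has finished being provisioned
service = FakeService.new(%w[app-001 app-002])
# And all monitoring related to the service is passing
expect('service monitoring passes', service.service_check_passing?)
# When we destroy a single member of the service
service.destroy('app-001')
# Then service-level monitoring still passes
expect('service monitoring still passes', service.service_check_passing?)
# And machine-level monitoring for the destroyed member fails
expect('machine monitoring fails', !service.machine_check_passing?('app-001'))
```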

Page 59: Test Driven Infrastructure Development - PuppetConf 2013

Is this for real?

• Yes!

• We actually built this; the core parts are on GitHub

• Deployed real applications to production at TIM Group

Page 60: Test Driven Infrastructure Development - PuppetConf 2013

• High availability!

• Automated testing of all infrastructure changes

• Entirely repeatable application environments

• High confidence in changes

• Continuous integration and deployment for infrastructure

This is my promised land!