Transforming IT Operations: A Survey of Effective Practices

Shawn Winnington-Ball, Information Systems And Technology
03 December 2013

Introduction

• There are some fundamental problems in IT that people are working to solve

• Let’s examine some of the ideas and approaches

• Knowledge gleaned from various sources: books, blogs, articles

• A selection of what I find compelling

IT/business

• IT is a critical function in the achievement of business goals
  – Business goals have become IT goals

• We are past the point of no return: we can no longer fall back to manual processes
  – IT risk is therefore business risk

IT/business

• In our digital society, there is tremendous value in using IT to create novel ways of enhancing our experiences
  – Digitalization (Gartner)

• Business success is tied to IT success, and how creatively and capably the IT hammer can be wielded

IT risks

• What are some of the risks that might prevent us from achieving our IT goals?
  – There’s too much work to do already
  – Fixed culture: ‘we’ve always done it this way’
  – Insufficiently resilient/secure IT infrastructure
  – Silo mentality: the right people aren’t talking about the right things
  – Insufficient understanding of true priorities

The approaches

• From here on out, the context is IT operations

• The Phoenix Project: IT is in the toilet, and the miraculous recovery

• The DevOps movement: bury the hatchet

• IT process improvement efforts, culture change

IT is a mess

• The situation: too much to do, everything chaotic, messy, unordered

• Where do you begin when overwhelmed?

• Tough to build a house with a jumbled pile of bricks, lumber, screws and shingles

• The right work isn’t getting done: inefficient practices and processes

Unclogging the pipes

• Analyze active work, see the big picture
  – Who spends the time on this work?
  – Which of the work is repeatable?
  – Which of it requires specialized knowledge?
  – What are the organization’s true priorities and how does the work fit with them? Is there a disconnect?

Unclogging the pipes

• Collect the work, categorize it
  – Projects, Infrastructure, Changes, Unplanned
  – Infrastructure development/maintenance work is internal project work: call it as such
  – 20,000-foot view: what are All The Things currently underway?
  – This is our Work in Progress, active tasks
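The categorization step above can be sketched in a few lines of Python. This is only an illustration: the `WorkItem` record, its fields and the sample items are hypothetical, and only the four category names come from the slide.

```python
from collections import Counter
from dataclasses import dataclass

# The four categories named on the slide.
CATEGORIES = ("project", "infrastructure", "change", "unplanned")


@dataclass(frozen=True)
class WorkItem:
    """Hypothetical record for one piece of IT work."""
    title: str
    category: str        # expected to be one of CATEGORIES
    owner: str
    active: bool = True  # True while the item counts as work in progress


def wip_by_category(items):
    """Count active (in-progress) items per category."""
    counts = Counter({category: 0 for category in CATEGORIES})
    for item in items:
        if item.category not in CATEGORIES:
            raise ValueError(f"unknown category: {item.category!r}")
        if item.active:
            counts[item.category] += 1
    return dict(counts)


# Hypothetical sample data, for illustration only.
SAMPLE_ITEMS = [
    WorkItem("Migrate mail server", "project", "alice"),
    WorkItem("Patch hypervisors", "infrastructure", "bob"),
    WorkItem("Firewall rule update", "change", "bob"),
    WorkItem("Restore crashed database", "unplanned", "carol"),
    WorkItem("Replace failed disk", "unplanned", "bob"),
    WorkItem("Retire old file share", "project", "alice", active=False),
]

if __name__ == "__main__":
    print(wip_by_category(SAMPLE_ITEMS))
```

A per-category count like this gives the high-level view of work in progress; a large "unplanned" count is the signal to look for underlying causes.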

Unclogging the pipes

• Clear the backlog: what is preventing the work from getting done?
  – Constraints and bottlenecks
  – Systematically clear them

• Low-hanging fruit: cease unplanned work
  – Underlying causes: why does IT break?

Unclogging the pipes

• Steady ongoing changes: make them less prone to causing unplanned work

• Technical debt: taking shortcuts now will cause pain later

• Control the release of work into IT

• Demand outstrips capacity: don’t auto-accept new commitments

Unclogging the pipes

• Determine total IT capacity. What commitments can we reasonably take on?
  – Isolate key projects and freeze ongoing efforts for everything else
  – Identify the work that only one person does and standardize it, document the process
  – Elevate preventative work: if it breaks often, it gets the most attention
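One way to picture "what commitments can we reasonably take on" is a simple admission rule: accept work in priority order until capacity runs out, then freeze the rest. The sketch below assumes capacity and estimates are expressed in hours and that a lower priority number means more important; the function name and the sample requests are hypothetical.

```python
def plan_commitments(capacity_hours, requests):
    """Split requests into accepted and deferred lists.

    requests: iterable of (name, priority, estimated_hours) tuples,
    where a lower priority number means more important.
    Once one request does not fit, it and everything less important
    is deferred (frozen) rather than squeezed in around it.
    """
    remaining = capacity_hours
    accepted, deferred = [], []
    frozen = False
    for name, _priority, hours in sorted(requests, key=lambda r: r[1]):
        if not frozen and hours <= remaining:
            accepted.append(name)
            remaining -= hours
        else:
            frozen = True
            deferred.append(name)
    return accepted, deferred


# Hypothetical sample data, for illustration only.
SAMPLE_REQUESTS = [
    ("Nice-to-have report", 3, 30),
    ("Key project A", 1, 60),
    ("Key project B", 2, 50),
]

if __name__ == "__main__":
    print(plan_commitments(100, SAMPLE_REQUESTS))
```

The strict cut-off is a deliberate choice in this sketch: letting small low-priority items slip in behind a deferred key project would hide the fact that demand exceeds capacity.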

Unclogging the pipes

• ‘Setting the tempo by our constraints’
  – Say NO now but say YES later once the backlog is clear
  – It’s easy to be honest about your capabilities when you have a clear picture

Free and clear

• What can these ideas bring about?
  – Reduction in chaos
  – Ordered approach to work, priorities-based
  – No more uncontrolled change
  – Honest assessment of true capabilities

DevOps overview

• What is DevOps? A collaborative approach to how IT development and operations relate

• Tension between creating and maintaining
  – Development: fast, agile, creative
  – Production: stable, predictable, resilient

• Reconciling different perspectives

DevOps overview

• Born from the Agile development movement: fast code release, quick sprints

• Speed is of the essence: companies need to keep up with competition, provide value quicker and more often, more reliably

• The DevOps philosophy is summed up in three guiding principles…

DevOps – First Way

1. Systems Thinking
  – Performance of the entire system
  – Fast flow of work: continuous integration and deployment; small Legos, not big bricks
  – Understand that value is generated in IT from left to right: development to production, always moving forward
  – “Reduce friction, increase velocity” (Farr)

DevOps – Second Way

2. Amplify feedback loops
  – Bring developers closer to their live code: if the sysadmin is on call, why not the developer?
  – Shorten the time between learning of failures and correcting them
  – When the system is broken, fix it before completing the work itself

DevOps – Third Way

3. Culture of continual improvement and learning
  – Take risks, fail quickly, move on
  – Prevent failures from reaching production
  – The basis of improvement is practice and repetition: make it habitual and widespread
  – Test your supposed resilience: break things on purpose to see what happens

DevOps: the toys

• Infrastructure as code: heavy use of configuration management

• Versioned environments, automated deployments

• Graph anything and everything

• DevOps isn’t about tools, but they are invaluable in establishing the culture
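The core idea behind "infrastructure as code" is that a desired state is declared and a tool works out the changes needed to reach it, so that running it twice changes nothing the second time. The sketch below illustrates that idea with plain dictionaries; it is not how any particular configuration management tool works, and the function names and settings are hypothetical.

```python
def plan_changes(desired, actual):
    """Return the actions needed to move `actual` to `desired`.

    Both arguments are flat dicts of setting name to value.
    Each action is a tuple (operation, key, value).
    """
    actions = []
    for key, value in sorted(desired.items()):
        if key not in actual:
            actions.append(("add", key, value))
        elif actual[key] != value:
            actions.append(("update", key, value))
    # Anything present but not declared is drift and gets removed.
    for key in sorted(actual.keys() - desired.keys()):
        actions.append(("remove", key, None))
    return actions


def apply_changes(actual, actions):
    """Apply planned actions to a copy of `actual` and return the result."""
    result = dict(actual)
    for operation, key, value in actions:
        if operation == "remove":
            result.pop(key, None)
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    # Hypothetical desired and observed state, for illustration only.
    desired = {"ntp": "installed", "sshd_port": 22}
    actual = {"sshd_port": 2222, "telnetd": "installed"}
    plan = plan_changes(desired, actual)
    converged = apply_changes(actual, plan)
    print(plan)
    print(plan_changes(desired, converged))  # empty: nothing left to do
```

Because the desired state lives in a file, it can be versioned and reviewed like any other code, which is what makes versioned environments and automated deployments possible.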

The Visible Ops

• Prescriptive guide based on ITIL

• ITIL doesn’t tell you where to begin; daunting effort

• Authors provide 4 distinct phases of process improvement

• Case study based: what do the shining stars have in common?

The Visible Ops

• “80% of outages caused by operator and application errors”

• Cultural problems
  – Change management is made too tough
  – “Cowboy culture”; misplaced sense of agility
  – Reactive, always firefighting, never planning
  – Constantly chasing audit requirements

The Visible Ops

• Characteristics of high-performing orgs
  – High availability as measured by MTBF and MTTR
  – High throughput of successful changes
  – Investment early in IT lifecycle: release mgmt
  – Visible audit controls
  – IT ops and security working closely, mentor/mentee
  – Low amounts of unplanned work
  – Server-to-admin ratio greater than 100:1

The Visible Ops

• “Stabilize the patient”
  – Identify most problematic infrastructure
  – Publish change policy: Thou Shalt Not Touch
  – Create designated change windows
  – Use Tripwire to verify compliance
  – Create a Change Advisory Board comprising stakeholders; use a change request tracking system
  – Initiate change management meetings (to authorize changes) and daily change briefings (to announce them)
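A designated change window is easy to express as data, which makes the policy checkable rather than a matter of memory. The sketch below is an assumption-laden illustration: the window schedule and the function name are hypothetical, and windows are assumed not to cross midnight.

```python
from datetime import datetime, time

# Hypothetical policy: weekday index (Monday = 0) mapped to the
# (start, end) of that day's change window. End time is exclusive.
CHANGE_WINDOWS = {
    1: (time(20, 0), time(23, 0)),  # Tuesday evening
    5: (time(6, 0), time(10, 0)),   # Saturday morning
}


def in_change_window(when, windows=CHANGE_WINDOWS):
    """Return True if the datetime `when` falls inside a change window."""
    window = windows.get(when.weekday())
    if window is None:
        return False
    start, end = window
    return start <= when.time() < end


if __name__ == "__main__":
    # 3 December 2013 was a Tuesday.
    print(in_change_window(datetime(2013, 12, 3, 21, 0)))
    print(in_change_window(datetime(2013, 12, 4, 21, 0)))
```

A check like this could be run against the timestamps of detected changes, so that anything made outside a window is flagged for the daily change briefing.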

The Visible Ops

• “Catch and Release” & “Find Fragile Artifacts”
  – Interrogate all systems, ask many questions of them
  – Find the systems that are unique, scary, important, and historically problematic
  – Determine how many unique configurations you actually have
  – Document systems and services and interdependencies in a CMDB
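Counting unique configurations can be approximated by fingerprinting each system's configuration and grouping hosts that share a fingerprint. The sketch below assumes each host's configuration has already been collected into a JSON-serializable dictionary; the host names, settings and function names are hypothetical.

```python
import hashlib
import json
from collections import defaultdict


def config_fingerprint(config):
    """Return a stable SHA-256 hex digest for a configuration dict."""
    # Sorting keys makes the digest independent of key order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_by_configuration(inventory):
    """Map each fingerprint to the sorted list of hosts that share it."""
    groups = defaultdict(list)
    for host, config in inventory.items():
        groups[config_fingerprint(config)].append(host)
    return {fp: sorted(hosts) for fp, hosts in groups.items()}


def one_of_a_kind(groups):
    """Return hosts whose configuration is shared with no other host."""
    return sorted(hosts[0] for hosts in groups.values() if len(hosts) == 1)


# Hypothetical inventory, for illustration only.
SAMPLE_INVENTORY = {
    "web1": {"os": "linux", "packages": ["httpd", "php"]},
    "web2": {"packages": ["httpd", "php"], "os": "linux"},
    "db1": {"os": "linux", "packages": ["postgresql"]},
}

if __name__ == "__main__":
    groups = group_by_configuration(SAMPLE_INVENTORY)
    print(len(groups), "unique configurations")
    print("one of a kind:", one_of_a_kind(groups))
```

The one-of-a-kind hosts are natural candidates for the "unique, scary" list, and the group count is a number that can be tracked as it comes down.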

The Visible Ops

• Create a Repeatable Build Library
  – Infrastructure as fuses; replace, don’t fix
  – Engineer builds for fragile infrastructure
  – Reduce unique configurations in production
  – Create ‘Golden Builds’: system images
  – Identify lowest common denominators across the environment

The Visible Ops

• Continual Improvement
  – Metrics: can’t manage what you can’t measure
  – Fact-, not belief-based management
  – MTTR and MTBF are key, affected by release stage planning efforts

• Closed loop between phases 1-3
  – Release, controls, resolution
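As a worked example of the two key metrics, the sketch below computes MTTR and MTBF from a list of outages. Definitions vary between organizations; this sketch assumes MTTR is total downtime divided by the number of outages, and MTBF is total uptime in the observation period divided by the number of outages. The function name and sample outages are hypothetical.

```python
from datetime import datetime, timedelta


def mttr_and_mtbf(period_start, period_end, outages):
    """Return (MTTR, MTBF) as timedeltas for one observation period.

    outages: list of (down_at, restored_at) datetimes that fall inside
    the period and do not overlap each other.
    """
    if not outages:
        raise ValueError("at least one outage is needed to compute a mean")
    downtime = sum((end - start for start, end in outages), timedelta())
    uptime = (period_end - period_start) - downtime
    count = len(outages)
    return downtime / count, uptime / count


if __name__ == "__main__":
    # Hypothetical 30-day period with two outages (2 hours and 4 hours).
    start = datetime(2013, 11, 1)
    end = datetime(2013, 12, 1)
    outages = [
        (datetime(2013, 11, 5, 9, 0), datetime(2013, 11, 5, 11, 0)),
        (datetime(2013, 11, 20, 14, 0), datetime(2013, 11, 20, 18, 0)),
    ]
    mttr, mtbf = mttr_and_mtbf(start, end, outages)
    availability = mtbf / (mtbf + mttr)
    print(mttr, mtbf, round(availability, 4))
```

With these inputs the period is 720 hours and downtime is 6 hours, so MTTR is 3 hours, MTBF is 357 hours, and availability is 357/360, roughly 99.2%.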

LISA 2011

• SREs at Google: Tom Limoncelli
  – Disconnect between dev and prod, competition brings them closer out of necessity
  – Faster feature release, pent-up waterfall methods no longer suffice
  – Dev teams run their own services for 6+ months
  – SREs provide self-service to devs: systems, storage, bandwidth, monitoring, docs (videos, wikis, SLA metrics)

LISA 2011

• Deployinator at Etsy: Erik Kastner, John Goulah
  – Speed and agility valued: 30+ code deploys/day
  – “Be wrong as fast as possible”
  – Graph everything that can be measured
  – The entire company is on IRC, up to CEO
  – Code push announcements are published via IRC bot

LISA 2011

• Puppet: Luke Kanies
  – A pep talk for an obstinate, slow-moving sector
  – Competition drives innovation: do it better and faster than the next person
  – Zynga was adding 1000 servers per week (!)
  – Cloud computing is independence and self-service, not doing it all yourself, relying on sub-contractors

LISA 2011

• Game Day: Jesse Robbins, Opscode
  – Things happen, adjust your response to them
  – Determine the MTTR on your own terms
  – Rules:
    • Preparation: goals are to mitigate impact, reduce MTTR and improve MTBF
    • Participation: all hands on deck, everyone suffers together
    • Exercises: ‘trigger and expose latent’ defects, start small
  – Work up to full data centre outage!
  – Essentially positive outlook, can-do attitude

IT culture

• Tools, tools, tools is the typical mantra

• Discuss the ideas, habits and beliefs that underpin our approach to our jobs and IT

• Technology is rapid, people aren’t
  – “Give People priority. If a few more projects spent a third or more of their time, effort and money on People aspects (consultation, collaboration, walkthroughs, training, pilots, training, coaching, training, support, feedback...) instead of Technology and ITIL consultants, we might have some more successful ITSM implementations.” (Rob England, itskeptic.org)

IT culture

• How do you compel people to change their views and habits?
  – Address ‘how is this time any different?’
  – Address ‘how does this affect me?’ and ‘what do I stand to gain from it?’
  – Courage to tell it like it is: be honest and don’t avoid conflict out of fear
  – Be vulnerable, share your personal story

Conclusions

• Many great ideas on how to advance IT operations to meet business goals

• Perhaps we just need ideas to flourish in small pockets?
  – Can’t ordain cultural change: find places where it will grow and support the good ideas
  – Organize more places to connect like-minded people

Sources, inspiration

• The Visible Ops, The Phoenix Project – Behr, Kim, Spafford

• http://itskeptic.org (ITSM consultant, kinda grouchy, great critical perspective)

• http://blogs.pinkelephant.com/troy (ITSM consultant, several years of blog material)

• “SRE@Google Limoncelli”

• “Opscode Gameday LISA 2011”

• http://agile.dzone.com/articles/agile-its-second-decade-0

• http://itrevolution.com/learn-more-about-concepts-in-phoenix-project/

• http://itrevolution.com/nick-galbreath-on-integrating-information-security-into-devops/

• http://itrevolution.com/one-of-the-best-devops-talks-on-it-transformation-continuously-deploying-culture-by-rembetsy-and-mcdonnell-velocity-london-2012/

• http://itrevolution.com/the-three-ways-principles-underpinning-devops/

• http://itrevolution.com/video-of-my-2012-puppetconf-keynote/

• http://noelbruton.wordpress.com/2013/11/23/the-phoenix-project-exposes-itils-anti-management-backwardness

• https://speakerdeck.com/atalanta/how-not-to-do-devops

• http://venturebeat.com/2013/09/30/an-idiots-guide-to-devops

Questions
