peter holditch devops

Realising the true value of DevOps The DevOps Payrise

Upload: peter-holditch

Post on 17-Dec-2014


Category:

Engineering



DESCRIPTION

The "DevOps pay-rise" presentation I gave at tcube on 18th September 2014.

TRANSCRIPT

Page 1: Peter holditch   devops

Realising the true value of DevOps

The DevOps Payrise

Page 2: Peter holditch   devops

@pholditch

Peter Holditch

Senior Sales Engineer

Page 3: Peter holditch   devops

DevOps?

Page 4: Peter holditch   devops

Developers working together with Operations to get things done faster in an automated and repeatable way

Page 5: Peter holditch   devops

DevOps Success?

Page 6: Peter holditch   devops
Page 7: Peter holditch   devops

Typical Dev Day

1. Look at the overnight integration tests
2. Buy chocolates for the team if you broke the build
3. Scramble to fix the build
4. Pick the top priority item from your backlog
5. Start coding
6. Get dragged into troubleshooting prod. incidents
7. Hastily check in new code as you ran out of time

Page 8: Peter holditch   devops

What do developers care about?

Learn

Innovate

Eat Pizza

Page 9: Peter holditch   devops

What does development really care about?

Page 10: Peter holditch   devops

What did the Business care about?

£

Page 11: Peter holditch   devops

Features = £

Even though the business never measured it.

Page 12: Peter holditch   devops

OPS: “Everything is fine from our end.”

Page 13: Peter holditch   devops

Typical Ops Day

1. Open 30 new tickets
2. Make 200 phone calls
3. Attend executive P1 status update meeting
4. Argue about what a P1 and a P2 really are
5. Reprioritise P2 tickets to P1
6. Reprioritise P3 tickets to P2
7. Close tickets as ‘Cannot reproduce’ or ‘Duplicate’

Page 14: Peter holditch   devops

What do operators care about?

Page 15: Peter holditch   devops

What does operations really care about?

P1s

SLAs

Page 16: Peter holditch   devops

What did the Business care about?

£

Page 17: Peter holditch   devops

P1 = £

Even though the business could never prove it.

Page 18: Peter holditch   devops

How the Business often view dev & ops

Page 19: Peter holditch   devops

How L2 & L3 Support often view dev & ops

Page 20: Peter holditch   devops

False Alarms

Site is down

404 Errors

My search is slow

Page 21: Peter holditch   devops

2am Friday - #FFS

We have had an alert that the load on one of your staging servers is critical.

Page 22: Peter holditch   devops

How much time do false alarms waste?

Role      Hours Per Week   Cost Per Week   Cost Per Year
Ops       20               £400            £20,800
L2        10               £200            £10,400
L3        15               £300            £15,600
Hosting   6                £120            £6,240
Network   6                £120            £6,240
CMS       10               £200            £10,400
Total     67               £1,340          £69,680

Conservative estimates assuming £20/hour
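The arithmetic behind this table can be reproduced with a short sketch. The per-role hours and the £20/hour rate come from the slide; the 52-week year is an assumption that matches the slide's yearly figures, and the function name is illustrative only. Summing the per-role hours gives 67 hours per week, which agrees with the £1,340 weekly total at £20/hour.

```python
# Hours per week lost to false alarms, per role (figures from the slide).
HOURS_PER_WEEK = {"Ops": 20, "L2": 10, "L3": 15, "Hosting": 6, "Network": 6, "CMS": 10}
HOURLY_RATE_GBP = 20  # the slide's conservative estimate
WEEKS_PER_YEAR = 52   # assumption: consistent with the slide's yearly figures


def false_alarm_costs(hours_per_week, rate=HOURLY_RATE_GBP, weeks=WEEKS_PER_YEAR):
    """Return {role: (hours, cost_per_week, cost_per_year)} plus a 'Total' row."""
    rows = {role: (h, h * rate, h * rate * weeks) for role, h in hours_per_week.items()}
    # Column-wise sums over the per-role rows give the totals.
    rows["Total"] = tuple(sum(column) for column in zip(*rows.values()))
    return rows


costs = false_alarm_costs(HOURS_PER_WEEK)
for role, (hours, weekly, yearly) in costs.items():
    print(f"{role:<8} {hours:>3} h  £{weekly:>6,}  £{yearly:>7,}")
```

Changing `HOURLY_RATE_GBP` to a fully loaded staff cost is the quickest way to see how conservative the £20/hour figure is.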

Page 23: Peter holditch   devops

How much revenue did the business lose?

No idea

Page 24: Peter holditch   devops

Typical Day

1. Open 30 new tickets
2. Make 300 phone calls
3. Attend executive P1 status update meeting
4. Argue about what a P1 and a P2 really are
5. Reprioritise P2 tickets to P1
6. Reprioritise P3 tickets to P2
7. Close tickets as ‘Cannot reproduce’ or ‘Duplicate’

1. Look at the overnight integration tests
2. Buy chocolates for the team if you broke the build
3. Scramble to fix the build
4. Pick the top priority item from your backlog
5. Start coding
6. Get dragged into troubleshooting prod. incidents
7. Hastily check in new code as you ran out of time

Page 25: Peter holditch   devops

Things that would help

1. Automation

2. Collaboration

3. Better Tooling

4. Business Metrics

Page 26: Peter holditch   devops

Things that could justify them

1. Baseline the starting point

2. Measure progress

3. Calculate Business Impact

4. Promote success not problems

5. Demonstrate value
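As a rough illustration of steps 1 to 3, the sketch below takes the false-alarm cost of £1,340 per week from the earlier slide as the baseline and compares it with a later measurement. The later figure of £900 and the function name are hypothetical, chosen only to show the calculation.

```python
BASELINE_WEEKLY_COST_GBP = 1_340  # false-alarm cost per week, from the earlier slide


def business_impact(baseline_weekly, current_weekly, weeks_per_year=52):
    """Compare a new weekly cost measurement against the baseline."""
    weekly_saving = baseline_weekly - current_weekly
    return {
        "weekly_saving": weekly_saving,
        "yearly_saving": weekly_saving * weeks_per_year,
        "improvement_pct": 100 * weekly_saving / baseline_weekly,
    }


# Hypothetical measurement taken some months after the baseline.
impact = business_impact(BASELINE_WEEKLY_COST_GBP, current_weekly=900)
print(f"Saving: £{impact['yearly_saving']:,} per year "
      f"({impact['improvement_pct']:.1f}% below baseline)")
```

Reporting the yearly saving rather than the remaining cost is one way to "promote success not problems".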

Page 27: Peter holditch   devops

Modern-day User Expectations…

Page 28: Peter holditch   devops

3 billion daily transactions

250 milliseconds

500+ updates/yr

Spot the App…

Page 29: Peter holditch   devops

1 million+ servers

100 million GB

1,000 man years

1,500 miles

Konstantin Karpov

User Expectations

Page 30: Peter holditch   devops

Web server 1

Internet

Firewall

Load Balancer

Web server 2

Database

Page 31: Peter holditch   devops

Napkin architecture…

Page 32: Peter holditch   devops

Key:

= bad

= not bad

Page 33: Peter holditch   devops

Pre-Production APM - “Non-Production Data”

Development Operations

Dev Test Staging Live

Monitor & Manage Profile QA Load Test

Pre-Production Production

Page 34: Peter holditch   devops

Production APM - “Production Data”


Development Operations

Dev Test Staging Live

Monitor & Manage

Pre-Production Production

Profile QA Load Test

Page 35: Peter holditch   devops

tools can be helpful

Page 36: Peter holditch   devops

right tools

right hands

right use

Page 37: Peter holditch   devops

How much time and £ do these tools save?

INFRASTRUCTURE AUTOMATION

Page 38: Peter holditch   devops

How much time and £ do these tools save?

DEPLOYMENT AUTOMATION

Page 39: Peter holditch   devops

How much time and £ do these tools save?

LOG AUTOMATION

LogStash

Page 40: Peter holditch   devops

How much time and £ do these tools save?

MONITORING

Page 41: Peter holditch   devops
Page 42: Peter holditch   devops

severe outage?

Page 43: Peter holditch   devops

PLAN FOR FAILURE!

be stronger than the weakest link

Page 44: Peter holditch   devops

Traditional monitoring approach is limited

APPLICATION

BUSINESS TRANSACTION

Server

OS DB

MQ

Web

JVM

Silo’d domain visibility

EXISTING APPROACH

EXPANDED APPROACH

Business transaction

99.9% 99.9% 99.9% 99.9%

END USER EXPERIENCE
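One way to read the four 99.9% figures: each silo can report a healthy number while the end user still sees something worse. The sketch below assumes, purely for illustration, that a business transaction depends on four components in series and that their failures are independent; under that assumption the end-to-end availability is the product of the parts, roughly 99.6%.

```python
import math

# Assumption: four independent silos (for example Web, JVM, MQ, DB),
# each reporting 99.9% availability, all needed by every transaction.
silo_availability = [0.999, 0.999, 0.999, 0.999]


def end_to_end_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    return math.prod(availabilities)


combined = end_to_end_availability(silo_availability)
minutes_per_year = 365 * 24 * 60
print(f"End-to-end availability: {combined:.3%}")
print(f"Expected downtime: {(1 - combined) * minutes_per_year:.0f} minutes per year")
```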

Page 45: Peter holditch   devops

How many of you use performance management tools?

Page 46: Peter holditch   devops
Page 47: Peter holditch   devops

Identify early

Troubleshoot fast

Resolve quickly

Quantify impact


Page 48: Peter holditch   devops

FOCUS

Page 49: Peter holditch   devops

Big data is BAD


Page 50: Peter holditch   devops

Big monitoring data is BAD

Page 51: Peter holditch   devops

Keep Everything?


Page 52: Peter holditch   devops


Keep Nothing?

Page 53: Peter holditch   devops

just what you need

Page 54: Peter holditch   devops

servers cores storage 80TB 92700

MONITORING ENVIRONMENT

8%

1200 servers

300,000 trans/min

IT ENVIRONMENT

Page 55: Peter holditch   devops

smart data

actionable, intelligent, information

Page 56: Peter holditch   devops

IS THIS PERSON PERFORMING WELL?

Blood pressure: 165/100

Heart rate: 150bpm

Page 57: Peter holditch   devops


are we talking about this person?

Page 58: Peter holditch   devops

OR this person?

Page 59: Peter holditch   devops

What data could we collect?

Attribute                Person 1   Person 2
Heart Rate               150        150
Blood Pressure           180/90     180/90
Eye Color                Blue       Brown
Blood Type               O+         O-
White Blood Cell Count   3.5        3.8
Hair Color               Brown      Blue
Height                   180cm      175cm
Shoe size                11         10
Weight                   180kg      94kg
Current activity         sitting    skating

Page 60: Peter holditch   devops

IS PERSON 2 PERFORMING WELL?

Time: 12min 44sec

Distance: 10,000 metres

Record time: 12min 58sec (baseline)

Page 61: Peter holditch   devops

New Olympic Record Jorrit Bergsma 10,000m winner

Page 62: Peter holditch   devops

average response time with historical baseline
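The same idea applies to response times: a reading only means something relative to its historical baseline. The following is a minimal sketch, not any particular product's algorithm; the sample data, the function name and the three-standard-deviation tolerance are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_anomalous(current_ms, history_ms, tolerance=3.0):
    """Flag a response time well above the historical baseline.

    The baseline is the mean of past readings; the alert threshold sits
    `tolerance` standard deviations above it.
    """
    baseline = mean(history_ms)
    spread = stdev(history_ms)
    return current_ms > baseline + tolerance * spread


# Hypothetical average response times (ms) for the same hour on previous days.
history = [210, 190, 205, 220, 198, 202, 215]

print(is_anomalous(230, history))  # slightly slow, still within the normal range
print(is_anomalous(900, history))  # far outside the baseline
```

A fixed threshold such as "alert above 500 ms" would miss a service that normally answers in 50 ms and has slowed to 400 ms; comparing against the baseline catches that case.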

Page 63: Peter holditch   devops

User & IT perspective

Analytics

Correlation

Intelligent alerting

Resolution path

monitoring platforms should do the heavy lifting

Page 64: Peter holditch   devops


Don’t be this person…

Page 65: Peter holditch   devops


plan ahead

anticipate needs

intended purpose

Page 66: Peter holditch   devops

And remember: Monitoring is not all traffic lights…

Page 67: Peter holditch   devops

Understand the impact of slow performance

* Screenshot from US e-Commerce AppDynamics Customer

Application Revenue

Application Errors

Application Response time

$64,499 per min

$11,987 per min

10.1 s

100 ms
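To put a number on the impact, the two revenue figures can be compared directly. The sketch assumes that $64,499 per minute corresponds to normal performance (100 ms) and $11,987 per minute to the degraded period (10.1 s); the transcript does not state this pairing explicitly, and the 30-minute duration is hypothetical.

```python
# Figures from the slide; pairing them with the response times is an assumption.
NORMAL_REVENUE_PER_MIN = 64_499    # USD, assumed to match the 100 ms response time
DEGRADED_REVENUE_PER_MIN = 11_987  # USD, assumed to match the 10.1 s response time


def revenue_lost(duration_min,
                 normal=NORMAL_REVENUE_PER_MIN,
                 degraded=DEGRADED_REVENUE_PER_MIN):
    """Revenue forgone while the application runs in the degraded state."""
    return (normal - degraded) * duration_min


per_minute = revenue_lost(1)
half_hour = revenue_lost(30)  # hypothetical 30-minute slowdown
print(f"${per_minute:,} lost per minute, ${half_hour:,} over 30 minutes")
```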

Page 68: Peter holditch   devops

Understand the benefit of an application release

Application Revenue

Application Response time

code release 1

code release 2

code release 3

$44,499 per min

$58,237 per min

1.9 s

3.1 s
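The benefit of a release can be expressed as a percentage change in both metrics. This sketch assumes the release moved response time from 3.1 s to 1.9 s and revenue from $44,499 to $58,237 per minute; the transcript lists the four values without saying which is before and which is after, so the direction is an assumption.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Assumed direction: slower and lower revenue before, faster and higher after.
revenue_uplift = percent_change(44_499, 58_237)
response_change = percent_change(3.1, 1.9)

print(f"Revenue per minute: {revenue_uplift:+.1f}%")
print(f"Response time: {response_change:+.1f}%")
```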

Page 69: Peter holditch   devops