Operational Costs of Technical Debt

The Operational Cost of Technical Debt Kurt Andersen @drkurta

Upload: kurt-andersen

Post on 23-Aug-2014


DESCRIPTION

Slide deck from my presentation at #velocityconf 2014: http://oreil.ly/NtSknc

TRANSCRIPT

Page 1: Operational Costs of Technical Debt

The Operational Cost of Technical Debt

Kurt Andersen

@drkurta

Page 2: Operational Costs of Technical Debt

Kurt Andersen

LinkedIn Site Reliability

@drkurta

Page 3: Operational Costs of Technical Debt

In our daily work, there are things which slow us down or make us inefficient


Page 4: Operational Costs of Technical Debt

INversion

LinkedIn 2002 2011

Page 5: Operational Costs of Technical Debt

With awareness and will, you can fix those problems


Page 6: Operational Costs of Technical Debt

Kafka 0.7 → 0.8

Upgrading from 0.7

0.8, the release in which added replication, was our first backwards-incompatible release: major changes were made . . .

The upgrade from 0.7 to 0.8.x requires a special tool for migration.

This migration can be done without downtime.

from https://kafka.apache.org/documentation.html

Page 7: Operational Costs of Technical Debt

We have all dealt with processes, systems or procedures that are lodged in the past


Page 8: Operational Costs of Technical Debt

LinkedIn’s use of memcached

Page 9: Operational Costs of Technical Debt


What do we need to solve the problems of technical debt?

Examples in action:

• INversion

• Kafka 0.7 → 0.8 migration

• 6-year-old version of memcached

• wire-line format change

Page 10: Operational Costs of Technical Debt

From Java-serialized objects via RPC to REST+JSON

Page 11: Operational Costs of Technical Debt

If we recognize the problems and evaluate the costs correctly,

we make better decisions about how to spend our efforts


Page 12: Operational Costs of Technical Debt

Technical Debt is a Decision


Page 13: Operational Costs of Technical Debt

Technical debt accumulates from a series of small choices


Page 14: Operational Costs of Technical Debt

"Doing it this way" is good enough for now

INversion

Page 15: Operational Costs of Technical Debt

We'll just skip version N+1 and look at N+2 or higher


Page 16: Operational Costs of Technical Debt

Changing to version X is going to take a lot of work


Page 17: Operational Costs of Technical Debt

"This is the way we do things, it is not open to discussion"


Page 18: Operational Costs of Technical Debt

Infrastructure becomes technical debt when the focus is on shiny new features

Page 19: Operational Costs of Technical Debt

"We can't afford the time to upgrade infrastructure, we have to ship features A + B"


Page 20: Operational Costs of Technical Debt

"What have you done for me lately" is more sellable than preventing problems


Y2k

Page 21: Operational Costs of Technical Debt

Past decisions become debt unless they are updated to reflect new realities

Page 22: Operational Costs of Technical Debt

Assumptions/predictions which are made early in the design process can be way off the mark.


Page 23: Operational Costs of Technical Debt

Mary, Mary, quite contrary
How does your system scale?

INversion

Page 24: Operational Costs of Technical Debt

“One in a million” happens multiple times per hour or minute at web scale
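The arithmetic behind this slide can be checked with a minimal sketch. The request rates below are hypothetical round numbers, not figures from the talk, and `rare_events_per_hour` is an illustrative helper name.

```python
def rare_events_per_hour(requests_per_second: float, probability: float) -> float:
    """Expected occurrences per hour of an event with the given per-request probability."""
    return requests_per_second * 3600 * probability


if __name__ == "__main__":
    # Hypothetical traffic levels, chosen only to show the scaling.
    for rps in (1_000, 10_000, 100_000):
        per_hour = rare_events_per_hour(rps, 1e-6)
        print(f"{rps:>7} req/s: {per_hour:.1f} per hour, {per_hour / 60:.2f} per minute")
```

At an assumed 100,000 requests per second, a one-in-a-million event is expected about 360 times per hour, or roughly 6 times per minute.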


Page 25: Operational Costs of Technical Debt

What are the direct costs of technical debt?


Page 26: Operational Costs of Technical Debt

System outages and errors increase


Page 27: Operational Costs of Technical Debt

The development process became more and more bogged down in conflict resolution under the branch-based development model

INversion

Page 28: Operational Costs of Technical Debt


Page 29: Operational Costs of Technical Debt

Teams develop work-arounds and procedures that are more complicated than the problem

Page 30: Operational Costs of Technical Debt

Signs you are dealing with tech debt:


1) “cult” ops

2) “red face” quotient

3) working around problems rather than fixing them


Page 33: Operational Costs of Technical Debt

New features are blocked when the infrastructure can’t deal with new loads.

Page 34: Operational Costs of Technical Debt

Capacity uplifts become increasingly painful or impossible

Page 35: Operational Costs of Technical Debt

Constant rollbacks and rework cause stress on everyone, dev and ops alike

Page 36: Operational Costs of Technical Debt

What are the indirect costs of technical debt?


Page 37: Operational Costs of Technical Debt

Technical debt devalues ops in favor of new feature development


Page 38: Operational Costs of Technical Debt

"No one gets promoted for retiring debt"


Page 39: Operational Costs of Technical Debt

“Our ops guys are so good, they can make anything work”


Page 40: Operational Costs of Technical Debt

Supporting zombies leads to finger-pointing and avoidance


Page 41: Operational Costs of Technical Debt

Zombies are unsupported and unsupportable


Page 42: Operational Costs of Technical Debt

Zombies require active intervention to stop


Page 43: Operational Costs of Technical Debt

Technical debt leads to demoralization


Page 44: Operational Costs of Technical Debt

Being constantly reactive is no fun


Page 45: Operational Costs of Technical Debt

Friction for teams like customer support makes it harder than necessary to provide excellent support


Page 46: Operational Costs of Technical Debt

How do you balance retiring technical debt against other development work?


Page 47: Operational Costs of Technical Debt

Recognize debt choices and decisions


Page 48: Operational Costs of Technical Debt

Never say "never"


Page 49: Operational Costs of Technical Debt

Keep an open mind


Page 50: Operational Costs of Technical Debt

Revisit old decisions as usage and requirements change


Page 51: Operational Costs of Technical Debt

Measure the right things


Page 52: Operational Costs of Technical Debt

Time to Repair and Effort

Page 53: Operational Costs of Technical Debt

Impact frequency, severity and reach


Page 54: Operational Costs of Technical Debt

Error rates


Page 55: Operational Costs of Technical Debt

Capacity/Headroom
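Three of the measurements named on these slides (time to repair, error rates, and capacity headroom) can be computed from very little data. The sketch below is one possible way to do so. The `Incident` record, the function names, and the definition of headroom as the unused fraction of capacity at peak are all assumptions made for illustration, not definitions from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Sequence


@dataclass
class Incident:
    """Hypothetical incident record: when a problem was detected and when it was resolved."""
    detected: datetime
    resolved: datetime


def mean_time_to_repair(incidents: Sequence[Incident]) -> timedelta:
    """Average of (resolved - detected) over all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((i.resolved - i.detected for i in incidents), timedelta())
    return total / len(incidents)


def error_rate(errors: int, requests: int) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return errors / requests if requests else 0.0


def headroom(peak_load: float, capacity: float) -> float:
    """Unused fraction of capacity at peak load (assumed definition)."""
    return 1.0 - peak_load / capacity
```

Tracking these values over time, rather than as one-off snapshots, is what makes a growing debt visible: repair times creep up, error rates rise, and headroom shrinks.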


Page 56: Operational Costs of Technical Debt

If you were implementing package X today, what would you do differently?


Page 57: Operational Costs of Technical Debt

Evaluate all the costs: either to fix or to tolerate
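One rough way to compare the two costs is a break-even calculation: a one-time cost to fix against a recurring cost to tolerate. This is a minimal sketch under that simplifying assumption. The function name and the hour figures are hypothetical, and real evaluations would also weigh outage risk and morale, which do not reduce to hours.

```python
def break_even_months(fix_cost_hours: float, monthly_toleration_hours: float) -> float:
    """Months after which a one-time fix has cost less than continuing to tolerate the debt."""
    if monthly_toleration_hours <= 0:
        return float("inf")  # tolerating costs nothing, so the fix never pays for itself
    return fix_cost_hours / monthly_toleration_hours


if __name__ == "__main__":
    # Hypothetical numbers: a 120-hour upgrade versus 30 hours per month of work-arounds.
    print(break_even_months(120, 30))
```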


Page 58: Operational Costs of Technical Debt

Make active decisions


Page 59: Operational Costs of Technical Debt

What is your job?

Page 60: Operational Costs of Technical Debt


How did our examples turn out?

• INversion
• memcached
• Kafka 0.7 → 0.8
• Rest.Li

INversion

1. Check code into trunk
2. Peer review
3. Release from trunk
4. Continuous integration
5. Service owners own their services
6. Canary all deployments
7. New features ramped, not binary
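The last item in the list above, ramping a feature instead of switching it on for everyone at once, can be illustrated with a percentage-based gate. This is a generic sketch of the technique, not LinkedIn's implementation. The `in_ramp` function, the member and feature identifiers, and the choice of SHA-256 for bucketing are all assumptions.

```python
import hashlib


def in_ramp(member_id: str, feature: str, percent: float) -> bool:
    """Return True if this member falls inside the current ramp percentage (0 to 100).

    Hashing the feature name together with the member id gives each member a
    stable bucket, so raising the percentage only ever adds members.
    """
    digest = hashlib.sha256(f"{feature}:{member_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # stable bucket in 0..9999
    return bucket < percent * 100


if __name__ == "__main__":
    # Hypothetical ramp schedule for a hypothetical feature.
    for pct in (1, 5, 25, 100):
        print(pct, in_ramp("member-42", "new-feed", pct))
```

Because the bucket is deterministic, a member who sees the feature at 5% still sees it at 25%, and dropping the percentage back to 0 turns the feature off for everyone without a redeploy.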


Page 64: Operational Costs of Technical Debt

Moving beyond the debt crisis


Page 65: Operational Costs of Technical Debt
Page 66: Operational Costs of Technical Debt

Advance our standards, set upon our foes
Our ancient word of courage, fair Saint George,
Inspire us with the spleen of fiery dragons!
Upon them! victory sits on our helms.

Richard III, Act V, Scene 3

Page 67: Operational Costs of Technical Debt

Transforming the way the world works.

Kurt Andersen

@drkurta

Page 68: Operational Costs of Technical Debt

Appendix

Page 69: Operational Costs of Technical Debt

Values

Members first

Relationships matter

Be open, honest, and constructive

Demand excellence

Take intelligent risks

Act like an owner