cloud foundry summit 2015: using service brokers to manage data lifecycle

Using Service Brokers to Manage Data Lifecycle Josh Kruck | @krujos [email protected] github.com/krujos

Upload: pivotal

Post on 25-Jul-2015


TRANSCRIPT

Using Service Brokers to Manage Data Lifecycle

Josh Kruck | @krujos [email protected]

github.com/krujos

2

What are some of the operational problems with data?

3

Primary

Primary

DR Backup

Snapshots

Business Critical Data Lifecycle

RTO 00:05 RPO 01:00

First 12 hours

Replica

Backup

4

Primary

Backup Backup

Primary

Snapshots

Replica

Backup

Business Critical Data Lifecycle

RTO 00:05 RPO 01:00

First 24 hours

DR

5

525,600 minutes

6

5476 copies

7

8

(capex is easy, just buy more stuff)

copies aren’t really the problem!

9

The real problem is

5476 copies are…

10

managed by 3 systems

[“storage”, “backup”, “rdbms”]

11

and 5 teams.

[“storage”, “backup”, “offsite provider”, “app owner”, “dba”]

12

(you shouldn't buy more people)

opex is the problem

13

what’s the read/write load on the copy?

14

0

5475 copies doing nothing for your business

Joshua Kruck
We talk about IT as a cost center. This is why.

15

Why all this talk about backups and stuff?

?

16

Good code needs good tests.

Good tests need good data.

Good data needs… a copy.

A play in 3 acts

so let’s get one!

17

“I don’t think we have any copies of that”

18

“I’m not allowed to have prod logs, much less the db”

19

we can do it, this one time: file a ticket.

20

Solved! But did we create another problem?

21

Once you find a copy, it needs a curator

Sizing (don’t use all of 10 TB of prod to test)

But your sample must represent the entirety of the dataset.

Representative curation is futile with most datasets (unknown unknowns).

Sizing means you restrict your tests to what you left in.

Sizing hides performance issues (missing index)

So maybe it’s not worth it….

22

Once you find a copy, it needs a curator

Sanitize it!

Can’t have SSNs and credit card numbers in test
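
As a rough illustration of what sanitizing can involve, here is a minimal Python sketch that masks anything shaped like a US SSN or a 16-digit card number in free text. The two patterns and the `sanitize` name are assumptions made for this example; real sanitization usually runs inside the database and has to cover far more than two regexes.

```python
import re

# Hypothetical patterns: an SSN written as 123-45-6789, and a 16-digit card
# number written as four groups separated by spaces, dashes, or nothing.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?(\d{4})\b")


def sanitize(text: str) -> str:
    """Mask SSNs entirely and keep only the last four digits of card numbers."""
    text = SSN_RE.sub("XXX-XX-XXXX", text)
    return CARD_RE.sub(lambda m: "XXXX-XXXX-XXXX-" + m.group(1), text)


print(sanitize("ssn 123-45-6789, card 4111 1111 1111 1111"))
```

Keeping the last four card digits is a design choice for the sketch: it leaves test data recognizable without keeping the full number.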

23

Once you find a copy, it needs a curator

Delete!

old data smells funny.

24

Once you find a copy, it needs a curator

Refresh!

GOTO 10

25

hard|complex

manual

infrequent

error prone

handoffs

deletion

ownership

Curation is expensive

26

A manual process that starts with a ticket is the wrong solution

27

The sum of the mess is worth more than its parts

There are 5475 secondary copies with no load. Can we leverage them for testing?

Fix: Let CF manage your data.

28

How?

29

most copies do nothing, but when the sky is falling you need them

first do no harm

30

cf create-service

Copy Data

Sanitize Data

cf push <app>

Test

cf delete <app> -r -f

cf delete-service

Pattern:
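
The pattern can be sketched as a Python context manager that wraps the cf CLI, so that the copy is always disposed of even when the tests fail. This is only a sketch under assumptions: the `disposable_copy` and `run_cf` names are made up, the service, plan, and app names are placeholders, the explicit `bind-service` step is not on the slide, and it assumes the broker copies and sanitizes the data synchronously during `cf create-service`.

```python
import subprocess
from contextlib import contextmanager


def run_cf(*args):
    """Run one cf CLI command and fail loudly if it exits non-zero."""
    subprocess.run(["cf", *args], check=True)


@contextmanager
def disposable_copy(service, plan, instance, app, run=run_cf):
    """Provision a data copy, push the app against it, and always clean up."""
    # Assumption: the broker copies and sanitizes the data while provisioning.
    run("create-service", service, plan, instance)
    try:
        run("push", app, "--no-start")
        run("bind-service", app, instance)
        run("start", app)
        yield
    finally:
        # Delete the app first so the service instance is no longer in use.
        run("delete", app, "-r", "-f")
        run("delete-service", instance, "-f")


# Dry run: record the commands instead of calling the real cf CLI.
recorded = []
with disposable_copy("prod-copy", "sanitized", "orders-copy", "orders",
                     run=lambda *args: recorded.append("cf " + " ".join(args))):
    recorded.append("(run tests here)")
print("\n".join(recorded))
```

Passing a different `run` function keeps the sketch testable without a Cloud Foundry target; dropping the argument makes it shell out to `cf`.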

31

How do you fill in that hand-wavy part in the middle?

32

Putting the E in Enterprise

Buy a CDM Product

Actifio, Delphix, ViPR

Great if they support your workloads!

And you can consume the form factors they deliver

33

Based on technology to allow layered writes

Layered FS (Docker, Docker, Docker)?

Clones, Linked Clones, VM Snaps

Writeable Snapshots (FlexClone, XtremIO, LVM Snaps)

Building is harder than buying

BYO
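
To make "layered writes" concrete, here is a toy Python sketch of the idea behind writable snapshots and linked clones: reads fall through to a read-only base, while writes land in a thin private layer. It uses an in-memory `ChainMap` purely as an analogy; the sample data and the `writable_clone` name are made up, and real products work at the block or filesystem level.

```python
from collections import ChainMap
from types import MappingProxyType

# The "snapshot": a read-only view of production rows (made-up sample data).
prod_snapshot = MappingProxyType({"user:1": "alice", "user:2": "bob"})


def writable_clone(base):
    """Return a clone whose writes land in a private, initially empty layer."""
    return ChainMap({}, base)


clone = writable_clone(prod_snapshot)
clone["user:2"] = "test-user"   # stored in the clone's own layer
clone["user:3"] = "carol"       # new key, also only in the clone

print(clone["user:1"])          # read falls through to the snapshot
print(clone["user:2"])          # read is served from the clone's layer
print(dict(prod_snapshot))      # the snapshot itself is unchanged
print(len(clone.maps[0]))       # the clone stores only what it changed
```

The clone costs only as much as the changes made to it, which is what makes handing out many test copies of a large dataset plausible.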

34

cf create-service

Snap Prod VM

Spin up VM

Allocate IP

Sanitize Data in PG

cf push demo

Test

Dispose

AMI and Postgres Demo
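
The demo steps map naturally onto the three broker operations Cloud Foundry calls: provision, bind, and deprovision. Below is a minimal Python sketch of that mapping, not the code of the actual broker. Every name in it (`Infra`, `CopyBroker`, `FakeInfra`, the credentials URI layout, the database name) is hypothetical, and the infrastructure is faked in memory so the sketch runs without a cloud account.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class Infra(Protocol):
    """Hypothetical infrastructure hooks; not the API of any real library."""
    def snapshot_prod(self) -> str: ...
    def launch_vm(self, image_id: str) -> str: ...
    def allocate_ip(self, vm_id: str) -> str: ...
    def sanitize(self, ip: str) -> None: ...
    def destroy_vm(self, vm_id: str) -> None: ...


@dataclass
class Copy:
    vm_id: str
    ip: str


@dataclass
class CopyBroker:
    infra: Infra
    instances: Dict[str, Copy] = field(default_factory=dict)

    def provision(self, instance_id: str) -> None:
        """What `cf create-service` would trigger: snap, boot, address, sanitize."""
        image_id = self.infra.snapshot_prod()
        vm_id = self.infra.launch_vm(image_id)
        try:
            ip = self.infra.allocate_ip(vm_id)
            self.infra.sanitize(ip)
        except Exception:
            # Never leave a half-built, unsanitized copy running.
            self.infra.destroy_vm(vm_id)
            raise
        self.instances[instance_id] = Copy(vm_id, ip)

    def bind(self, instance_id: str) -> dict:
        """Credentials handed to the app; the URI layout is an assumption."""
        copy = self.instances[instance_id]
        return {"credentials": {"uri": f"postgres://{copy.ip}:5432/db"}}

    def deprovision(self, instance_id: str) -> None:
        """What `cf delete-service` would trigger: dispose of the copy."""
        copy = self.instances.pop(instance_id)
        self.infra.destroy_vm(copy.vm_id)


class FakeInfra:
    """In-memory stand-in that just logs each step."""
    def __init__(self):
        self.log = []

    def snapshot_prod(self):
        self.log.append("snapshot")
        return "image-1"

    def launch_vm(self, image_id):
        self.log.append(f"launch {image_id}")
        return "vm-1"

    def allocate_ip(self, vm_id):
        self.log.append(f"allocate {vm_id}")
        return "10.0.0.5"

    def sanitize(self, ip):
        self.log.append(f"sanitize {ip}")

    def destroy_vm(self, vm_id):
        self.log.append(f"destroy {vm_id}")


infra = FakeInfra()
broker = CopyBroker(infra)
broker.provision("instance-1")
print(broker.bind("instance-1"))
broker.deprovision("instance-1")
print(infra.log)
```

Sanitizing inside `provision`, before any credentials exist, means an app can never bind to an unsanitized copy.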

35

https://github.com/krujos/data-lifecycle-service-broker

please help!