How would you recover? Lessons from 2016’s most interesting disasters

Upload: databarracks

Post on 22-Jan-2018


TRANSCRIPT

Page 1

How would you recover? Lessons from 2016’s most interesting disasters

Page 2

www.databarracks.com

INTRO & AGENDA

Duration: 30 mins (including Q&A)

Type questions on the right

• The most common causes of data loss in 2016

• Examination of 6 real disasters suffered by Databarracks customers in 2016

• What mistakes led to the disaster

• Recommendations for becoming more resilient

*Slides will be made available and sent out following this session

Page 3

THE BCPCAST

http://www.thebcpcast.com/

Page 4

THE COST OF IT DOWNTIME

http://costofitdowntime.com/

Page 5

CYBER ATTACK AS THE LEADING CAUSE OF DATA LOSS

Chart (2013 to 2016): share of respondents citing cyber attack as the leading cause of data loss. Values shown: 4%, 6%, 8%, 9%.

datahealthcheck.databarracks.com

Page 6

FROM THE DATA HEALTH CHECK – LEADING CAUSE OF DATA LOSS

Chart: leading causes of data loss. Labels: cyber attack (9%), hardware failure, human error. Other values shown: 16%, 23%.

datahealthcheck.databarracks.com

Page 7

https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/

Page 8

GITLAB

LVM snapshots are by default only taken once every 24 hours. YP happened to run one manually about 6 hours prior to the outage.

Regular backups seem to also only be taken once per 24 hours, though YP has not yet been able to figure out where they are stored. According to JN these don’t appear to be working, producing files only a few bytes in size.

SH: It looks like pg_dump may be failing because PostgreSQL 9.2 binaries are being run instead of 9.6 binaries. This happens because omnibus only uses Pg 9.6 if data/PG_VERSION is set to 9.6, but on workers this file does not exist. As a result it defaults to 9.2, failing silently. No SQL dumps were made as a result. Fog gem may have cleaned out older backups.

Disk snapshots in Azure are enabled for the NFS server, but not for the DB servers.

The synchronisation process removes webhooks once it has synchronised data to staging. Unless we can pull these from a regular backup from the past 24 hours they will be lost.

The replication procedure is super fragile, prone to error, relies on a handful of random shell scripts, and is badly documented.

Our backups to S3 apparently don’t work either: the bucket is empty.

“So in other words, out of 5 backup/replication techniques deployed none are working reliably or set up in the first place.”
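The GitLab notes describe backups that failed silently, leaving files only a few bytes in size and an empty bucket. A minimal sketch of an automated check for that failure mode follows, in Python. The function name check_backup, the 1 KiB size floor and the 24-hour age limit are illustrative assumptions, not values from GitLab or Databarracks.

```python
import time
from pathlib import Path


def check_backup(path, min_bytes=1024, max_age_hours=24, now=None):
    """Return a list of problems found with one backup file (empty list = looks OK).

    min_bytes and max_age_hours are hypothetical thresholds; pick values that
    match the size and schedule of your own backups.
    """
    now = time.time() if now is None else now
    p = Path(path)
    if not p.is_file():
        return [f"{p}: missing"]

    problems = []
    info = p.stat()
    # A dump of "only a few bytes" usually means the backup tool failed silently.
    if info.st_size < min_bytes:
        problems.append(
            f"{p}: only {info.st_size} bytes (expected at least {min_bytes})"
        )
    # A stale file means the scheduled job has stopped running.
    age_hours = (now - info.st_mtime) / 3600
    if age_hours > max_age_hours:
        problems.append(f"{p}: last modified {age_hours:.1f} hours ago")
    return problems
```

A check like this only confirms that a file exists, is not trivially small and is recent. It does not prove the backup can be restored, so it complements a test restore rather than replacing one.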

Page 9

HAVE YOU TESTED DR IN THE LAST 12 MONTHS?

Page 10

HOW WOULD YOU RECOVER?

Page 11

CASE STUDY 1 – MAJOR PLAYERS

Installation

Contact with C&C

Search

Encryption

Ransom

Page 12

LESSONS #1

If your users need to open attachments from unknown sources:

• How can you limit the damage a ransomware attack might inflict?

• How quickly would you be able to recover?

• How much data would be lost?
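The last question can be made concrete: the data lost is whatever changed between the most recent usable backup and the moment of the attack. The sketch below, in Python, computes that window. The function name data_loss_window and the example schedule (a backup at midnight each day, an attack at 15:30) are hypothetical and chosen only for illustration.

```python
from datetime import datetime


def data_loss_window(good_backups, incident):
    """Return the time between the last good backup and the incident.

    good_backups: datetimes of backups known to be restorable.
    incident: datetime at which data started being encrypted or lost.
    Returns None if no backup predates the incident.
    """
    # Only backups taken before the incident are safe to restore from.
    earlier = [b for b in good_backups if b <= incident]
    if not earlier:
        return None
    return incident - max(earlier)


# Hypothetical schedule: one backup at midnight on each of three days.
backups = [datetime(2016, 6, day, 0, 0) for day in (1, 2, 3)]
attack = datetime(2016, 6, 3, 15, 30)
print(data_loss_window(backups, attack))  # 15:30:00
```

With a once-a-day schedule the worst case approaches a full 24 hours of lost work, which is one way to judge whether the backup frequency matches what the business can tolerate.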

Page 13

CASE STUDY 2 – RANSOMWARE #2

Installation

Contact with C&C

Search

Encryption

Ransom

IT manager leaves business

Page 14

LESSONS #2

• What happens if the person or people responsible for your IT aren’t available?

• How many people have access beyond what they really need?

• Do you remove access properly for leavers?

Page 15

CASE STUDY 3 – STOLEN SERVERS

Page 16

LESSONS #3

• How secure is your office, server room and data centre? (Do you have CCTV and access control?)

• Is there a possibility of data loss through the physical removal of your hardware?

• How long would it take to source replacement hardware or to recover at a second site?

• Would you be able to do so if you only lost a small sub-set of systems?

Page 17

CASE STUDY 4 – FILE SYSTEM PERMISSIONS ERROR

Page 18

LESSONS #4

If you made a similar mistake:

• How long would it realistically take before you found out?

• How long would it take to restore normal permissions?

• What would be the impact if all employees had access to sensitive customer, financial and HR data?
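One way to shorten the time before such a mistake is noticed is a scheduled audit of the sensitive folders. The sketch below, in Python, lists anything under a directory that every local user can read or write. It assumes POSIX permission bits, so it would not apply as written to Windows file shares with NTFS ACLs, and the function name world_accessible is hypothetical. The slides do not say which platform the customer used.

```python
import os
import stat


def world_accessible(root):
    """Yield paths under root whose POSIX mode grants read or write to 'others'.

    Assumes POSIX-style permissions; the root directory itself is not checked.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            # Symlink modes are not meaningful; the target is checked where it lives.
            if stat.S_ISLNK(mode):
                continue
            if mode & (stat.S_IROTH | stat.S_IWOTH):
                yield path
```

Running this daily against an HR or finance folder and alerting on any output would turn an open-ended exposure into one of at most a day.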

Page 19

CASE STUDY 5 – CONDEMNED BUILDING

Page 20

LESSONS #5

• What would you do if you lost access to your premises?

• Does your backup and recovery plan account for all users needing to continue to operate?

Page 21

CASE STUDY 6 – HOLBORN FIRE, YORK FLOODS, RANSOMWARE

Page 22

LESSONS #6

• Do you have skills in place to recover from a range of different risks?

• Does your service provider have the capacity to cope with multiple invocations in parallel?

Page 23

SUMMARY

• Put measures in place to limit the damage of a ransomware attack

• Do not give users more access than they need

• Make server rooms secure with CCTV and access control

• Make sure your DR plans cover the case where all users need to continue working

Page 24

RESOURCES

• The Business Continuity Podcast

– http://www.thebcpcast.com/

• Tabletop testing simulator

– https://tools.databarracks.com/dr-tabletop-simulation/index.html

• The Cost of IT Downtime

– http://costofitdowntime.com/

• Data Health Check

– http://datahealthcheck.databarracks.com/

• GitLab data loss

– https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/

Page 25

Thank you