High Availability Redundancy vs Backup vs Archiving Databases (MySQL, PostgreSQL, MongoDB), Data Rafał Gołąb <[email protected]> Kraków, 19.02.2015r.

Upload: codibly-software-house

Post on 07-Aug-2015


High Availability Redundancy vs Backup vs Archiving

Databases (MySQL, PostgreSQL, MongoDB), Data

Rafał Gołąb <[email protected]>

Kraków, 19.02.2015r.

Agenda

1. Redundancy, backup, archiving theory

2. Motivation

3. Objectives

4. Solutions

5. RAID & Replication & Archiving Tools

6. MySQL Backup Tools (real examples)

7. Conclusions

Redundancy, backup, archiving theory

● Redundancy
- establishes a straight copy of an entire system, ready to take over if the original system fails

● Backup
- creates a second copy of data at specific points in time
- ideally keeps multiple historic copies
- must be consistent

● Archiving
- makes a primary copy of selected data with the aim of retaining it long-term
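The "multiple historic copies" point can be illustrated with a small retention sketch. The function name `backups_to_keep` and the policy itself (keep the newest 7 daily copies plus the first copy of each of the 3 most recent months) are assumptions made for illustration, not something the slides prescribe.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, daily=7, monthly=3):
    """Return the set of backup dates to retain under a hypothetical policy."""
    ordered = sorted(backup_dates, reverse=True)      # newest first
    keep = set(ordered[:daily])                       # newest `daily` copies

    # Oldest backup seen in each (year, month)
    first_of_month = {}
    for d in sorted(backup_dates):
        first_of_month.setdefault((d.year, d.month), d)

    # Keep the first backup of each of the `monthly` most recent months
    newest_months = sorted(first_of_month, reverse=True)[:monthly]
    keep.update(first_of_month[m] for m in newest_months)
    return keep


# 60 consecutive daily backups ending on the date of the talk
last = date(2015, 2, 19)
all_backups = [last - timedelta(days=i) for i in range(60)]
kept = backups_to_keep(all_backups)
expired = sorted(set(all_backups) - kept)
```

Anything in `expired` could be deleted or moved to an archive, depending on how long the data has to be retained.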

Motivation

Murphy’s law: Anything that can go wrong, will go wrong.

Motivation

http://highscalability.com/blog/2014/2/17/how-the-aolcom-architecture-evolved-to-99999-availability-8.html

Objectives

● understand how big the problem is

● sleep well

● extend knowledge

● know the differences

● increase data safety

What can happen?

● location

● networking

● hardware

● operating system

● data storage

● app layer

What can we do?

● load balancing

● fail-over

● disaster recovery

Detailed problem solving

● DNS problems
- round robin
- low TTL
- GSLB (dnsmadeeasy.com, akamai)

● HTTP problems
- HAProxy, nginx (LB algorithms)
- memcache servers
- failover IP addresses

● MAIL problems
- several MX servers
- LB SMTP servers

● DATABASES & STORAGE problems
- next part of presentation
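Round robin, mentioned above for DNS and for HTTP load balancers, can be sketched in a few lines. This is only an illustration of the algorithm with a basic fail-over step; the class `RoundRobinBalancer` and the backend names are hypothetical, and this is not how HAProxy or nginx are implemented or configured.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in turn, skipping those marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


lb = RoundRobinBalancer(["web1", "web2", "web3"])
first_round = [lb.next_backend() for _ in range(4)]

lb.mark_down("web2")            # simulate a failed server
after_failure = [lb.next_backend() for _ in range(3)]
```

After `web2` is marked down, requests keep rotating over the remaining servers, which is the fail-over behaviour the slide lists next to load balancing.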

RAID (Redundant Array of Independent Disks)

RAID 0 (striping)

● Advantages
- read/write speed
- capacity

● Disadvantages
- not fault-tolerant

[Diagram labels: 2TB, 1TB, 1TB, 1TB]

RAID 1 (mirroring)

● Advantages
- keeps a copy of the data on every disk (tolerates N-1 disk failures)
- fastest reads

● Disadvantages
- slow writes
- capacity (limited to the smallest disk)

[Diagram label: rebuild]

RAID 10 (striping + mirroring)

● Advantages
- keeps a copy of the data
- fast read/write

● Disadvantages
- needs identical disks (capacity and speed)

RAID 5

● Advantages
- fast reads, like RAID 0
- fault-tolerant (one disk)

● Disadvantages
- slow writes
- slow read/write during a disk failure
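The single-disk fault tolerance of RAID 5 rests on parity. A simplified sketch of the idea using XOR over byte blocks follows; real RAID 5 works on stripes and rotates the parity block across disks, which is omitted here, and the helper name `xor_blocks` is made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data disk, plus one parity block
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The disk holding data[1] fails: rebuild it from the survivors and the parity
rebuilt = xor_blocks([data[0], data[2], parity])
```

Every read of a lost block needs all surviving blocks plus parity, which is one way to see why reads and writes slow down during a disk failure.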

RAID 6

● Advantages
- fault-tolerant (two disks)
- faster reads than a single disk

● Disadvantages
- cost
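The capacity and fault-tolerance trade-offs of the RAID levels above can be put side by side with a little arithmetic. The sketch below assumes the textbook model in which every member contributes only as much as the smallest disk; the function name `raid_summary` is hypothetical, and for RAID 10 only the guaranteed tolerance (one disk) is reported, although more failures can be survived if they hit different mirror pairs.

```python
def raid_summary(level, disk_sizes_tb):
    """Return (usable capacity in TB, guaranteed tolerated disk failures)."""
    n = len(disk_sizes_tb)
    smallest = min(disk_sizes_tb)

    if level == 0 and n >= 2:
        return n * smallest, 0               # striping only, no redundancy
    if level == 1 and n >= 2:
        return smallest, n - 1               # every disk holds a full copy
    if level == 5 and n >= 3:
        return (n - 1) * smallest, 1         # one disk's worth of parity
    if level == 6 and n >= 4:
        return (n - 2) * smallest, 2         # two disks' worth of parity
    if level == 10 and n >= 4 and n % 2 == 0:
        return (n // 2) * smallest, 1        # striped mirror pairs
    raise ValueError(f"unsupported RAID level {level} with {n} disks")


four_disks = [1, 1, 1, 1]                    # four 1 TB disks
comparison = {level: raid_summary(level, four_disks) for level in (0, 1, 5, 6, 10)}
```

With four 1 TB disks this gives 4 TB and no protection for RAID 0, down to 1 TB and three tolerated failures for RAID 1, with RAID 5, 6 and 10 in between.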

RAID

WHAT IS RAID?

!!! REDUNDANCY !!!

Replication

REPLICATION

WHAT IS REPLICATION?

!!! REDUNDANCY !!!

LVM (Logical Volume Manager)

● PV (physical volume), VG (volume group), LV (logical volume)
- pvdisplay, vgdisplay, lvdisplay
- pvs, vgs, lvs
- pvcreate, vgcreate, lvcreate
- pvremove, vgremove, lvremove
- lvextend, lvreduce

● Advantages
- can span several physical disks
- supports resizing LVs on the fly
- supports snapshots

● RAID + LVM = safe and flexible storage

LVM - snapshots

LVM snapshots allow for a consistent backup even if files are open during the backup. The snapshot volume needs enough space to store changes that occur during the backup.
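How much space the snapshot volume needs can be estimated from how fast the data changes and how long the backup runs. The sketch below is a rough rule of thumb rather than an LVM recommendation: the safety factor of 2, the change rate, and the names `vg0`, `dbdata` and `dbsnap` are all assumptions. It only builds the `lvcreate` command line and does not run it.

```python
import math


def snapshot_size_gb(change_rate_gb_per_hour, backup_hours, safety_factor=2.0):
    """Estimate the snapshot size: expected changes during the backup, with headroom."""
    return math.ceil(change_rate_gb_per_hour * backup_hours * safety_factor)


def lvcreate_snapshot_cmd(origin_lv, snapshot_name, size_gb):
    """Build (but do not execute) an lvcreate call that creates a snapshot."""
    return ["lvcreate", "--snapshot",
            "--size", f"{size_gb}G",
            "--name", snapshot_name,
            origin_lv]


# Hypothetical numbers: 1.25 GB of changes per hour, backup takes 2 hours
size = snapshot_size_gb(1.25, 2)
cmd = lvcreate_snapshot_cmd("/dev/vg0/dbdata", "dbsnap", size)
```

If the snapshot fills up before the backup finishes it becomes invalid, so erring on the large side is the safer choice.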

[Diagram labels: 100GB, 5GB, 100GB]

LVM

WHAT IS LVM?

!!! BACKUP SYSTEM !!!

MySQL Backup Tools (real examples)

● mylvmbackup

● xtrabackup
- no table locks
- only for InnoDB

● mysqldump
- table locks
- slow recovery
- for small databases
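The table-locking behaviour of mysqldump depends on the options it is run with. As a hedged sketch: `--single-transaction` dumps transactional (InnoDB) tables from a consistent snapshot without locking them, while `--lock-all-tables` holds a global read lock for the whole dump. The helper `mysqldump_cmd` and the output path are hypothetical; the function only assembles the command line and does not connect to a server.

```python
def mysqldump_cmd(output_file, transactional=True):
    """Build (but do not execute) a mysqldump call for all databases."""
    cmd = ["mysqldump", "--all-databases", f"--result-file={output_file}"]
    if transactional:
        # Consistent snapshot for InnoDB tables, no table locks
        cmd.append("--single-transaction")
    else:
        # Global read lock: consistent for non-transactional tables, blocks writes
        cmd.append("--lock-all-tables")
    return cmd


innodb_dump = mysqldump_cmd("/backup/all.sql")
locking_dump = mysqldump_cmd("/backup/all.sql", transactional=False)
```

Either way the result is a logical dump that has to be replayed statement by statement on restore, which is why recovery is slow for large databases.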

Archiving Tools

● cron

● tar + gzip

● rsync

● scp

● ...
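The tar + gzip step from the list above can also be done from a script. A minimal sketch using Python's standard `tarfile` module follows; the function name `archive_directory` and the date-stamped file naming are assumptions for illustration. Scheduling it with cron and copying the result off-site with rsync or scp are left out.

```python
import tarfile
from datetime import date
from pathlib import Path


def archive_directory(source_dir, dest_dir, today=None):
    """Pack source_dir into a date-stamped .tar.gz inside dest_dir."""
    source = Path(source_dir)
    stamp = (today or date.today()).isoformat()
    target = Path(dest_dir) / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(str(target), "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name
        tar.add(str(source), arcname=source.name)
    return target
```

Calling `archive_directory("/var/backups/dumps", "/mnt/archive")` (both paths hypothetical) would produce a file such as `dumps-2015-02-19.tar.gz` in the destination directory.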

Conclusions

● High availability is a complex problem and differs in each organisation.

● The best practice for protecting your data is to use all of the solutions (redundancy, backup and archiving) where possible.

● Redundancy isn’t backup

● Backup is more important than redundancy

● LVM is the best solution for preparing database backups

Thank you for your attention. Questions?

Rafał Gołąb
Linux System Administrator

E-mail: [email protected]
Tel.: (+48) 506 514 543

www.codibly.com