databases and the cloud

2012-02-27 TTKK 1

Databases and the cloudHenrik Ingo

TTKK2012-02-27

Please share and re-use this presentation, licensed under Creative Commons Attribution license.

2011-10-25 2

2011-10-25 3

History

...things that used to be difficult at different points in time.

2012-02-27 TTKK 4

1994

2012-02-27 TTKK 5

1994

USSR fell 3 years ago

Finland yet to win first gold medal in hockey

Windows 3.0 replacing MS-DOS

Windows 95 did not yet exist

I learn how to write .bat scripts from my dad

Word 6.0 replacing WordPerfect 5.1

Excel 5.0 replacing Lotus

I had not yet used Internet

2012-02-27 TTKK 6

1994

C++ & Visual Basic

LANPrinter Database

Client - Server architecture

SQLStandard Interpreted

Flexible and expressive command line env

"SQL for secretaries"Good for English speakers

2000: Bad for IDE w IntelliSense

Stored procedures rule

DBA is king of business logic

2012-02-27 TTKK 7

1997

2012-02-27 TTKK 8

1997

2012-02-27 TTKK 9

Learning SQL

NoSQL advocates say:

SQL is hard to learn

To really scale, you must de-normalize

Most people don't get normalization right

Impedance mismatch between oo and sql

I just need a simple key-value store

Henrik says:

MS Access easy to learn

Darn, I always got normalization

I know, teaching n:n relations was always fun

INSERT INTO t ... serialize($obj)

SELECT value FROM tWHERE key=?;

2012-02-27 TTKK 10

2000

MS Access not supported in Linux

PostgreSQL and MySQL

Learn SQL for real

phpMyAdmin to ease the pain

2012-02-27 TTKK 11

2005

Graduate & do websites for a living.

Spend 3-6 days creating and re-creating properly normalized DB schema

In MS Access I just clicked next, next, next and ok.

Time is money

Invent NoSQL:PHP serialize()MySQL BLOB

2012-02-27 TTKK 12

2008

Join MySQL AB

Confused about synchronous vs asynchronous replication

Learn a lot about MySQL NDB Cluster

2011-10-25 13

High Availability

...and why it is more difficultfor databases.

2011-10-25 14

PerformanceTransactions / second (throughput)

Response time (latency)Percentiles (95% - 99%)

DurabilitySpeaking of databasesCommitted data is not lostD in ACID

High Availability

Get any response at all (tps > 0)Measured as percentile (99.999%)

Replicas, snapshotspoint in time, backups

ClusteringMonitoring

Failover

ReplicationRedundancy

2011-10-25 15

Uptime

Percentile target Max downtime per year

90% 36 days

99% 3.65 days

99.5% 1.83 days

99.9% 8.76 hours

99.99% 52.56 minutes

99.999% 5.26 minutes

99.9999% 31.5 seconds

Beyond system availability: Average downtime per user.

2011-10-25 16

Clustering frameworks - general

Failover

HeartbeatCorosync

VM of choiceRed Hat ClusterSolaris Cluster

...

2011-10-25 17

Clustering frameworks - DB

Failover

HeartbeatCorosync

MMMVM of choice

MHATungsten Enterprise

Solaris Cluster...

2011-10-25 18

Sounds simple. What could possibly go wrong?

Old Master must stop service (VIP, os, DB). But it is not responding, so how do you make it stop?

Polling from the outside. Interval = 1 sec, 10 sec, 60 sec!

What if replication fails first and client transactions don't?

Polling connectivity of DB nodes but not client p.o.v.

Failover can be expensive (SAN, DRBD) -> false positives costly

2011-10-25 19

Disk Disk

Active-Active Shared disk clustering. (State of the art?)

Failover

Oracle RAC(ScaleDB?)

2011-10-25 20


Well, actually it's pretty good.Data integrity protection is good.But...

SAN is considered the biggest SPOF of all.

Recovery time on single node failure is +60 sec

Recovery time? Because internally each node will lock some pages and process them locally.

(Bloody expensive)

Disk

2011-10-25 21

Synchronous multi-master

Failover

NDBGalera

2011-10-25 22


Synchronous Multi Master replication rocks :-)

Failure detection inherent in replication protocol.

Instant failovers.

Bonus: Both Galera and NDB provision new nodes automatically.

Problem is solved. Time for new problems...

2011-10-25 23

Performance

SAN has "some" overhead compared to local disk

DRBD = 50% performance penalty

Replication, when implemented correctly, has 0 performance penalty.

Galera and NDB = more performance

2011-10-25 24

Is a clustering solution part of the solution or the part of the problem?

"Causes of Downtime in Production MySQL Servers" by Baron Schwartz:

#1: Human error #2: SAN

Complex clustering framework + SAN =

More problems, not less!

Galera (and NDB) =

Replication based, no SAN or DRBD

No "failover moment", no false positives

No clustering framework needed

No load balancer needed (JDBC loadbalance)

Simple and elegant!

2011-10-25 25

Scale-out and elasticity


2012-02-27 TTKK 26

Scale-out

Invented by MySQL / LAMP stack.

Laughed at by other RDBMSes

Now everyone does it. Because the Internet is too big.

Originally with read-only replicas. Then sharding.

Easy for http, inconvenient for databases.

NoSQL systems do it really well.

MySQL NDB does it really well and Galera pretty well.

2012-02-27 TTKK 27

DBA's life is more interesting!

HTTP

Stateless

Usually can partition/shard

Scale-out = boot more servers

Writes = Write to the database

RDBMS

Where everyone else stores their state

Needs expertise to partition/shard

Scale-out = Boot more servers. Backup DB. Restore DB. Setup replication. Tweak application code...

Write = Which partition/node? Beware of read-only slaves. Beware of eventual consistency...

2012-02-27 TTKK 28

But it can be done

Automating DB deployments is more complex. But not impossible.

NDB and Galera handle data provisioning really well. But deploying the empty nodes still manual labor.

Scale-out happens because you must. Scale-down will never happen if it's too much work.

MySQL

Amazon RDS, others.

Severalnines

Also supports Galera and NDB

Scalr

PostgreSQL

Heroku

EnterpriseDB

NoSQL

Usually do this relatively well.

2011-10-25 29

NoSQL

...and why it is more difficultfor relational databases.

2012-02-27 TTKK 30

It used to be

2012-02-27 TTKK 31

The future is

2012-02-27 TTKK 32

The future is

All Open SourceAll Open Source

2012-02-27 TTKK 33

Things NoSQL guys do really well

No SQL (parsing)

Schemaless = Win! for agile development

HA with quorum consistency: R + W > N

Transparent sharding

Graph databases

N:N relationships, what's the big deal?

Actually makes sense to give up on SQL!

Map Reduce

Bypass ETL, get clean data

Implemented in Java or Python (or Erlang)

Reading tip: Original Amazon Dynamo

paper

Reading tip: Original Amazon Dynamo

paper

2012-02-27 TTKK 34

Best of both worlds

NoSQLNo SQL.

Simple key-value store.

...and secondary indexes

Quorum consistency.

Transparent sharding

Graph databases

Map Reduce against text files

Java and Python.

MySQLHandlerSocketMemcache API, NDB API

BLOBSELECT v FROM ... WHERE k=?

Functional indexesVirtual columns, etc...

Synchronous replicationGalera, NDB

We have it tooNDB, Spider + proprietary solutions

Damn N:N relations!

Map Reduce against RDBMS

C++ can be done rightDrizzle

2011-10-25 35

Cloud


2011-10-25 36

4 different DB deployments

Server HW

VM

DB process

User (schema)

2011-10-25 37

Consider memory utilization"All of computation is just different layers of caching."

Dedicated HWGreat performance€€€€Not cloud

Virtualization overhead

"Safety margin" of unallocated memory per VM.

Memory allocation per VM is fixed (without reboot) -> cache of idle DB instances is wasted.

No virtualization overhead

Memory allocation per DB instance is fixed (without restart) -> cache of idle DB instances is wasted.

No virtualization overhead.

Busy schema can use more cache and idle schema evicted from cache.

2011-10-25 38

Multi-tenancy

Web hosting w MySQL

Cpanel = hack

Not "cloud". Users expect "dedicated" database instance.

Drizzle

True multi-tenancy, "virtualization for databases"

Ready but experimental

2012-02-27 TTKK 39

Thank you

databases and the cloud

Education

secrecovery time

clustering solution

expensive san

mysql ndb

servers scale

simple key

mysqllearn sql

san complex