Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Founder CTO, ParElastic

Scalability and database virtualization: How virtualizing your databases improves performance and lowers costs. New York City MySQL Meetup, October 3, 2013


DESCRIPTION

http://www.DatabaseMonth.com/database/parelastic-database-scalability

TRANSCRIPT

Page 1: Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Founder CTO, ParElastic

Scalability and database virtualization

How virtualizing your databases improves performance, and lowers costs

New York City MySQL Meetup, October 3, 2013


What’s this presentation about?

• Scalability and the database tier

  • What’s the problem?

  • How did we get here?

  • Some proposed solutions

• What are parallel databases?

• What’s ParElastic?

• How do I get ParElastic?

• Q&A

October 3, 2013 Scalability and the database tier | NYC MySQL Meetup 2

Tweet this presentation: #parelastic


What is the scalability problem?


• Has many faces

  • Connections and Concurrency

  • Data Volume and Retention Period

  • Databases and Tenants

  • Read vs. Write

• Your problem(s)

  • May be more than one

  • May change over time


Connections and Concurrency

• More [Active] Connections

  • Worse Performance

• Sizing your database


Data Volume and Retention Period

• Longer Retention Period

  • More Data

• More Data

  • Worse Performance

• Progressive deterioration

  • All data in memory

  • All indexes in memory

  • Not enough memory


Databases and “Tenants”

• Common paradigm in SaaS applications

  • Each tenant’s application instance has a database

  • Several databases on each database instance

• More databases per instance

  • Worse Performance


In one customer engagement, we were told that no more than 1,000 tenants could be placed on one database instance before performance became unacceptable.


Read vs. Write

• Simple read (SELECT) queries could scale well

  • Key-based lookups

  • With favorable indexes

• Things that cause heartburn

  • Complex joins (with large data sets)

  • Sorts

  • Aggregation

• Reads are easier to scale than writes


How did we get here? A brief history lesson


How did we get here? [1]

• A combination of factors

  • Changes in the application user/usage

  • Driven by the Internet and mobile computing

  • “News Cycles” are getting shorter

• Economics

  • Commodity computing is cheap and getting cheaper

  • Solutions that can “scale-out” win, others lose

• Ability to leverage higher core-densities

  • Other databases do a better job at this than MySQL

  • MySQL would do great if you had a 20GHz processor ;)


How did we get here? [2]

• The Evolution of the Database Management System

  • A battle between “generalized” and “specialized”

• The Relational Database Management System (RDBMS)

  • Designed for monolithic systems

  • SMP

  • Scale-Up

• Applications evolve quickly!

  • Databases respond slowly


How did we get here? [3]

• Moore’s Law

  • Scale-Up seemed like a fine answer

  • But there are limits …


How did we get here? [4]

• Database architectures traditionally were

  • Shared CPU/Memory/Disk

  • Also known as “Shared-Everything”

• But “Shared-Everything” doesn’t scale

  • At least not for databases


A server costing twice as much doesn’t always give you twice as much database “power”. You reach a point of diminishing returns.


How did we get here? [5]

• You can pay more but you may not get more


Source: Amazon RDS TPC-C Benchmark. Md. Borhan Uddin, Bo He, Radu Sion, Cloud Computing Center, SUNY Stony Brook. Viewed online http://digitalpiglet.org/research/sion2010cloud-rds.pdf


Some proposed solutions


• Several strategies have been advocated

  • Cache, Cache, Cache, …

  • Get a bigger server [a.k.a. Scale-Up]

  • Sharding [a form of Scale-Out]

  • NoSQL or NewSQL [typically Scale-Out]

  • Replication and variants

• We look at each one in more detail


Cache, Cache, Cache!


caching transitive verb to cache

cache noun

Temporary computer storage used for quick retrieval of data in order to increase processing speed.

• Caching only addresses ‘read’; not ‘write’

• Social Media workloads are ‘write heavy’, ‘interactive’ and ‘highly personalized’

That’s easy! Do some caching!
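A minimal sketch of why caching helps reads but not writes. The `CachedStore` class, its counters, and the dictionary standing in for the database are all hypothetical names used only for illustration; this is not tied to any particular caching product.

```python
class CachedStore:
    """Read-through cache in front of a (simulated) database."""

    def __init__(self):
        self.database = {}   # stands in for the real database
        self.cache = {}      # temporary storage for quick retrieval
        self.db_reads = 0    # reads that actually reached the database
        self.db_writes = 0   # writes that actually reached the database

    def read(self, key):
        if key in self.cache:
            return self.cache[key]          # served from cache, no database work
        self.db_reads += 1
        value = self.database.get(key)
        self.cache[key] = value
        return value

    def write(self, key, value):
        self.db_writes += 1                 # every write still hits the database
        self.database[key] = value
        self.cache.pop(key, None)           # drop the stale cached copy


store = CachedStore()
store.write("user:1", "v0")
for _ in range(100):
    store.read("user:1")
for i in range(100):
    store.write("user:1", f"v{i + 1}")

print(store.db_reads, store.db_writes)
```

In this sketch, 100 reads of the same key reach the database once, while every one of the 101 writes still does.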


Get a bigger server [Scale-Up]


I will use a bigger database server

Can I even get a bigger server?

What if m2.4xlarge isn’t enough?

Maybe I just have too much data?

Maybe I have too many users?


Sharding [a form of Scale-Out]


shard noun \ˈshärd\

a piece or fragment of a brittle substance <shards of glass>; broadly : a small piece or part

sharding noun \ˈshär-diŋ\

(a) to make one’s application brittle or fragmented;

(b) to take one big problem and make many small problems;

(c) to complicate an application while claiming to solve a scalability problem;

(d) to decrease developer productivity;

(e) a bad idea;

(f) sharding library: a mechanism that attempts (unsuccessfully) to hide the bad taste of sharding

Sharding will solve my problem!


NoSQL or NewSQL?


• Yes, I have to rewrite my application

• Yes, not all queries will work

• No, there’s no standard query language

• No, most do not have ACID guarantees; hell some don’t even guarantee Durability

• Yes, most are somewhat untried science-experiments

• More flavors than Ben & Jerry’s Ice Cream [yes, really]

• But, all the cool kids are doing it!

You need NoSQL or NewSQL!


Replication and variants

• Replication-based solutions (typically called clustering)

  • Many copies of the data

  • Distribute queries across the copies

  • Keep the copies synchronized: like herding cats

  • Write bottleneck

• Read/Write splitting

  • Single Master (gets all the writes)

  • Many Slaves (share the reads)

  • Unpredictable latency

  • Write bottleneck
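Read/write splitting can be sketched as a small router. This is only an illustration under stated assumptions: the `ReadWriteSplitter` class and the server names are hypothetical, and classifying a statement by whether it starts with `SELECT` is a deliberate simplification.

```python
import itertools


class ReadWriteSplitter:
    """Send writes to the single master and spread reads over the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replica_cycle = itertools.cycle(replicas)  # simple round-robin

    def route(self, sql):
        # Assumption: anything starting with SELECT is a read; everything else is a write.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replica_cycle)
        return self.master


router = ReadWriteSplitter("master", ["replica1", "replica2"])
statements = [
    "SELECT * FROM t",
    "INSERT INTO t VALUES (1)",
    "SELECT 1",
    "UPDATE t SET a = 2",
    "SELECT 2",
]
targets = [router.route(sql) for sql in statements]
print(targets)
```

Adding replicas gives the reads more places to go, but every write is still routed to the one master, which is the write bottleneck the slide refers to.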


What about MySQL Cluster?

• MySQL Cluster is a strange beast

• For best results, you must use the NDB interface

• Only supports the NDB storage engine

• Primarily a distributed in-memory Key-Value Store

  • That is ACID compliant and supports joins and things if you use the SQL interface

  • But no one tells you about the performance of this path!

• Published benchmarks are all “FlexAsync” which talk directly to the NDB interface

  • And READ-ONLY


For more details visit http://www.parelastic.com/blog/mysql-cluster-and-benchmarks

Or stick around after the presentation and we can chat!


What are parallel databases?


• A database architecture proposed in 1992¹

• Very successfully applied to many database problems

  • Oracle Exadata, Netezza, Teradata, Greenplum, …

• An example of the “Shared Nothing” database paradigm²


¹ Parallel Database Systems: The future of high performance database processing [1992, DeWitt, Gray, ftp://ftp.cs.wisc.edu/pub/techreports/1992/TR1079.pdf]

² The Case for Shared Nothing [1986, Stonebraker, http://db.cs.berkeley.edu/papers/hpts85-nothing.pdf]


How parallel databases execute queries


Image from “Parallel Database Systems: The future of high performance database processing” [1992, DeWitt, Gray, ftp://ftp.cs.wisc.edu/pub/techreports/1992/TR1079.pdf]


Benefits of parallel databases

• Linear improvement in “reads”

• Linear improvements in “writes”

• Better than linear improvement in “joins”

• Better than linear improvement in “aggregation”

• Better than linear improvement in “sorts”


For more details, refer to “Parallel Database Systems: The future of high performance database processing” [1992, DeWitt, Gray, ftp://ftp.cs.wisc.edu/pub/techreports/1992/TR1079.pdf]


Parallel Databases vs. Sharding


• Parallel Database

  • Database architecture

  • Application is data location agnostic

  • Application perceives a single database

  • Requires no application rewrites

  • Application is not constrained by parallel database architecture

  • A parallel database handles any schema

• Sharding

  • Application architecture

  • Application is data location aware

  • Application perceives a collection of databases

  • Requires application rewrites

  • Application is constrained to the limitations of the sharding architecture

  • Not all schemas are shard’able
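The contrast between "data location aware" and "data location agnostic" can be sketched in a few lines. This is an illustration only: plain dictionaries stand in for database servers, and `pick`, `app_insert_customer`, `app_count_customers`, and `SingleDatabaseView` are hypothetical names, not part of any product.

```python
import zlib


def pick(nodes, key):
    """Deterministically map a key to one of the nodes (hash placement)."""
    return nodes[zlib.crc32(str(key).encode()) % len(nodes)]


# Sharding: the application holds the list of databases and routes every call.
shards = [{}, {}, {}]


def app_insert_customer(customer_id, name):
    pick(shards, customer_id)[customer_id] = name   # application chooses the shard


def app_count_customers():
    return sum(len(shard) for shard in shards)      # application fans out and merges


# Parallel database: similar routing exists, but behind a single database interface.
class SingleDatabaseView:
    def __init__(self, node_count):
        self._nodes = [{} for _ in range(node_count)]  # hidden from the application

    def insert_customer(self, customer_id, name):
        pick(self._nodes, customer_id)[customer_id] = name

    def count_customers(self):
        return sum(len(node) for node in self._nodes)


db = SingleDatabaseView(node_count=3)
for cid in range(10):
    app_insert_customer(cid, f"customer-{cid}")   # sharded path: routing lives in the app
    db.insert_customer(cid, f"customer-{cid}")    # single-database path

print(app_count_customers(), db.count_customers())
```

Both paths store the same rows; the difference is where the routing and merging logic lives, and therefore what has to change in the application when the number of nodes changes.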


What is ParElastic? Hypervisor for databases


What is ParElastic?

• An approach to relational database virtualization

• Addresses issues of scalability in relational databases

• A parallel database architecture

  • Built on standard MySQL or MySQL variant databases

  • Horizontal Scalability

  • Elastic


ParElastic: System Architecture

10/7/2013 Flex Your Database | ParElastic® Database Virtualization Engine 30

ParElastic Architecture protected by US8214356, “Apparatus for elastic database processing with heterogeneous data”


Data Distribution: How it works

• User data is “distributed” across multiple storage nodes

• Queries are executed in parallel by some [or all] nodes

• Multiple distribution models supported

  • Range

  • Hash

  • Broadcast

  • Random

• ParElastic guarantees co-location and query execution
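The four distribution models named above can be sketched as placement functions. This is a generic illustration, not ParElastic's implementation: the node names, the range boundaries, and the reading of "broadcast" as "a copy on every node" are assumptions made for the example.

```python
import bisect
import random
import zlib

NODES = ["node0", "node1", "node2", "node3"]   # hypothetical storage nodes


def place_by_range(key, upper_bounds=(250, 500, 750)):
    """Range: contiguous key ranges map to nodes (boundaries are made up)."""
    return [NODES[bisect.bisect_right(upper_bounds, key)]]


def place_by_hash(key):
    """Hash: a deterministic hash of the key selects the node."""
    return [NODES[zlib.crc32(str(key).encode()) % len(NODES)]]


def place_broadcast(key):
    """Broadcast (assumed meaning): every node keeps a copy of the row."""
    return list(NODES)


def place_random(key, rng=random):
    """Random: any node may receive the row; lookups must ask all nodes."""
    return [rng.choice(NODES)]


print(place_by_range(100), place_by_range(800))
print(place_by_hash(42) == place_by_hash(42))
print(place_broadcast(1))
```

Because hash placement is deterministic, rows from two tables that share a key value land on the same node, which is one common way that co-location for joins is obtained in shared-nothing systems.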


Storage Elasticity: How it works

• A “generational scheme”

• Storage Nodes added over time

  • Each creates a new “generation”

• Unnecessary to migrate large amounts of data

  • A key drawback with “sharding” that requires “resharding”
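To see why resharding is a drawback, consider a naive sharding scheme that places each row on node `key % node_count`. The sketch below does not model the generational scheme; it only illustrates the cost being avoided, and the modulo placement and integer keys are assumptions chosen for simplicity.

```python
def node_for(key, node_count):
    """Naive modulo sharding: the node depends on the total number of nodes."""
    return key % node_count


def fraction_moved(keys, old_count, new_count):
    """Fraction of keys whose node changes when the node count changes."""
    moved = sum(1 for k in keys if node_for(k, old_count) != node_for(k, new_count))
    return moved / len(keys)


keys = range(10_000)
moved = fraction_moved(keys, old_count=4, new_count=5)
print(f"{moved:.0%} of rows change nodes")
```

With these assumptions, growing from 4 to 5 nodes relocates 80% of the rows, even though only 20% of the capacity is new.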


Storage Elasticity protected by US8478790, US8386532 and other patents.


ParElastic: How It Works


ParElastic: Simple query processing example

SELECT COUNT(*) FROM CUSTOMER;

count(*)
--------
2771

(1 row affected)

PROVISION 1 DYNAMIC NODE

ON DYNAMIC NODE
CREATE TEMP TABLE T1 ( C INT );

ON ALL STORAGE NODES
SELECT COUNT(*) FROM CUSTOMER
AND REDISTRIBUTE TO T1

ON DYNAMIC NODE
SELECT SUM(C) FROM T1;
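The plan above (a partial count on every storage node, then a sum on the dynamic node) can be imitated with in-memory SQLite databases standing in for the MySQL nodes. This is only a sketch: `make_storage_node` and `distributed_count` are hypothetical helpers, the storage nodes are visited one after another rather than in parallel, and the REDISTRIBUTE step is emulated with a plain INSERT.

```python
import sqlite3


def make_storage_node(customer_ids):
    """Create one simulated storage node holding a slice of CUSTOMER."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CUSTOMER (ID INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO CUSTOMER (ID) VALUES (?)",
                     [(i,) for i in customer_ids])
    return conn


def distributed_count(storage_nodes):
    """Run the per-node COUNT, collect the partials in T1, then SUM them."""
    dynamic = sqlite3.connect(":memory:")            # the "dynamic node"
    dynamic.execute("CREATE TEMP TABLE T1 (C INT)")
    for node in storage_nodes:                       # sequential here, parallel in a real system
        (partial,) = node.execute("SELECT COUNT(*) FROM CUSTOMER").fetchone()
        dynamic.execute("INSERT INTO T1 (C) VALUES (?)", (partial,))
    (total,) = dynamic.execute("SELECT SUM(C) FROM T1").fetchone()
    dynamic.close()
    return total


NODE_COUNT = 3
# Spread 2771 customer rows over the nodes by ID modulo NODE_COUNT.
nodes = [make_storage_node(range(n, 2771, NODE_COUNT)) for n in range(NODE_COUNT)]
total = distributed_count(nodes)
print(total)
```

Only one small row per storage node travels to the dynamic node, regardless of how many customer rows each node holds.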


ParElastic Performance Benefits

• Connection Scalability

  • ParElastic Tier Elasticity; have more or less ParElastic servers

• Storage / Data Volume Scalability

  • Add ParElastic Persistent Nodes as data volumes increase

  • Multiple machines working together

• Workloads are variable

  • Compute Node Elasticity; have more or less as required

• Databases and Tenants [SaaS applications]

  • ParElastic Adaptive Multi-tenancy™

• No application change

• Queries processed by, data stored on standard MySQL!


ParElastic Multi-Tenancy


ParElastic Concurrency [1]


ParElastic Concurrency [2]


ParElastic data “ingest”


Tests conducted in the Amazon cloud. Native MySQL was tested on an m1.xlarge server with standard MySQL and standard EBS volumes. The test driver was a c1.xlarge server, to provide sufficient CPU headroom to generate load. ParElastic was run with 5 and 15 persistent storage nodes, identically configured (m1.xlarge, standard MySQL, standard EBS volumes). The 15-node test employed two c1.xlarge test drivers. Best ParElastic performance was with 10 threads, 10 persistent storage nodes and an insert batch size of 5,000 tuples per insert batch. Best native MySQL performance was with 2 threads and a batch size of 10,000 tuples per insert batch.

One Million rows/s! 15 Storage Nodes, 2 ParElastic Servers


What’s the ParElastic Overhead?


Configuration 1 (native MySQL)

Machine 1: Test Client

Machine 2: mysqld

Query Time 15.72ms

Configuration 2 (with ParElastic)

Machine 1: Test Client

Machine 2: ParElastic

Machine 3: mysqld

Machine 4: mysqld …

Query Time 17.03ms

Network RTT 0.35ms

ParElastic overhead ~ 1.31ms


Characterizing ParElastic Performance

• A “fixed cost”, the overhead per query

• A “variable cost” for query processing

• Consider this example, a simple “COUNT” query.
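The fixed-plus-variable cost view can be written as a small model. The two query times are the ones quoted in the overhead test (15.72ms and 17.03ms); the assumption that the variable part divides evenly across nodes is an idealization made for this sketch, not a measured result, and `estimated_query_ms` is a hypothetical helper.

```python
# Fixed cost per query: difference between the two measured query times (ms).
FIXED_OVERHEAD_MS = 17.03 - 15.72


def estimated_query_ms(single_node_ms, node_count, fixed_ms=FIXED_OVERHEAD_MS):
    """Fixed overhead plus a variable cost that is assumed to split evenly across nodes."""
    return fixed_ms + single_node_ms / node_count


for n in (1, 2, 4, 8):
    print(n, round(estimated_query_ms(15.72, n), 2))
```

Under this idealized model, the fixed overhead dominates only for queries that are already very short; for longer queries, the divided variable cost quickly outweighs it.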


Some things to keep in mind

• Horizontal Scale-Out benefits from

  • Being “stateless”, or at least having less state

  • Adhering to a truly “shared nothing” approach

• Horizontal Scale-Out is impeded by

  • Complex or Shared “State”

  • Things that violate the “shared nothing” paradigm


What is ParElastic?

• An approach to relational database virtualization

  • "A Hypervisor for the Database Tier"

• Scale out database capacity across many servers

  • Effectively handle workloads too big for one server

• Share this pool of databases among many applications

  • Efficiently allocate database capacity to workload

• An elastic, multi-tenant, parallel database architecture

  • Built on standard MySQL or MySQL variant databases

  • Horizontal Scalability

  • Elastic


Some target markets

• Database Virtualization – “Hypervisor for the Database”

  • Reduce capex and simplify administration for development and test

• SaaS Enablement

  • Simplified deployment of SaaS applications using multi-tenancy

• High Volume Database Applications

  • High traffic websites (e.g. social, ecommerce, on-line games)

  • High speed data ingest (e.g. click tracking, sensor arrays, mobile)


Where do I get ParElastic?


Getting ParElastic

• For Evaluations

  • Available at no charge on Amazon Marketplace

  • Preconfigured for evaluation purposes; not performance testing

  • Runs completely on a single EC2 instance

• For Larger Configurations

  • Contact ParElastic

  • Email: [email protected]

  • Twitter: @parelastic

  • Web: http://www.parelastic.com


Getting ParElastic

• On the Amazon AWS Marketplace (aws.amazon.com/marketplace)

• Quick start guide and simple (two-step) setup wizard provided.


Conclusion


• Database Scalability is a very real problem

  • The Cloud has put a very complicated wrinkle in it

• The problem was seen before with commodity servers

  • Virtualization was able to address this problem

• Several “hacks” have been proposed

  • Not really solutions, just hacks

• ParElastic is a database virtualization solution

  • Based on standard relational databases

  • Provides benefits of horizontal scalability and multi-tenancy

• ParElastic is available for evaluation on many platforms

  • Free evaluation also available on Amazon Marketplace


Contacting ParElastic


• Look us up online

  – http://www.parelastic.com

• Watch an explainer video

  – http://www.parelastic.com/video

• Contact us

  – Email: [email protected]


Q&A


Image Credits

• Moore’s Law

  • Wikipedia [http://commons.wikimedia.org/wiki/File%3ATransistor_Count_and_Moore's_Law_-_2011.svg]

• Hercules slays the Hydra

  • Wikipedia [http://commons.wikimedia.org/wiki/File%3AHercules_slaying_the_Hydra.jpg]

• CPU History

  • Phillip E. Ross, “Why CPU Frequency Stalled” [http://spectrum.ieee.org/computing/hardware/why-cpu-frequency-stalled]

• Herding Cats

  • Image from [http://wodongatafe.wordpress.com/2011/05/27/herding-cats-or-facilitating-a-webinar-whats-the-difference/]
