Scaling MySQL: Benefits of Automatic Data Distribution
DESCRIPTION
In this webinar, we cover how ScaleBase provides transparent data distribution to its clients, overcoming caveats and hiding the complexity involved in data distribution from the application.
TRANSCRIPT
Webinar: Scaling MySQL Benefits of Automatic Data Distribution
December 13, 2012
Agenda
1. Who We Are
2. The Scalability Problem
3. Benefits of Automatic Data Distribution
4. Customer ROI/Case Studies
5. Q & A
(please type questions directly into the GoToWebinar side panel)
Who We Are
Presenters:
Paul Campaniello, VP of Global Marketing: 25-year technology veteran with marketing experience at Mendix, Lumigent, Savantis and Precise.
Doron Levari, Founder: a technologist and long-time veteran of the database industry. Prior to founding ScaleBase, Doron was CEO of Aluna.
Pain Points – The Scalability Problem
• Thousands of new online and mobile apps launching every day
• Demand climbs for these apps and databases can’t keep up
• App must provide uninterrupted access and availability
• Database performance and scalability are critical
Big Data = Big Scaling Needs
The 451 Group & Teradata
Big Data = Transactions + Interactions + Observations
[Figure: data volume (Megabytes, Gigabytes, Terabytes, Petabytes) plotted against increasing data variety and complexity, with regions labeled ERP, CRM, WEB and BIG DATA]
Data types shown in the figure: Purchase Detail; Purchase Record; Payment Record; Segmentation; Offer Details; Customer Touches; Support Contacts; Web Logs; Offer History; A/B Testing; Dynamic Pricing; Affiliate Networks; Search Marketing; Behavioral Targeting; Dynamic Funnels; Sensors/RFID/Devices; User Click Stream; Mobile Web; Sentiment; User Generated Content; Social Interactions & Feeds; Spatial & GPS Coordinates; External Demographics; Business Data Feeds; HD Video, Audio, Images; Speech to Text; Product/Service Logs; SMS/MMS
Scalability Pain
[Figure: infrastructure cost ($) over time, with curves for Predicted Demand, Actual Demand, Traditional Hardware and Dynamic Scaling; annotated with "Large Capital Expenditure", "Opportunity Cost" and "You just lost customers"]
Ongoing “Scaling MySQL” Series
• August 16 & September 20, 2012
– Scaling MySQL: Scale Up versus Scale Out
• October 23, 2012
– Methods and challenges to Scale out MySQL
• Today
– Benefits of Automatic Data Distribution
• January 17, 2013
– Catch 22 of read-write splitting
The Database Engine is the Bottleneck...
• Every write operation is at least 4 write operations inside the DB:
– Data segment
– Index segment
– Undo segment
– Transaction log
• And multiple activities in the DB engine memory:
– Buffer management
– Locking
– Thread locks/semaphores
– Recovery tasks
Now multiply by 10 TB accessed by 10,000 concurrent sessions
COI – Customer, Order, Item
CUSTOMER
C_ID  NAME     LOCATION  RANK
1     John     MA        10
2     James    AL        9
3     Peter    CA        10
4     Chris    FL        8
5     Oliver   MA        9
6     Allan    MA        9
7     Janette  CA        8
8     David    MD        10
ORDER
O_ID  C_ID  DATE
1     1     2012-02-01
2     1     2012-02-01
3     2     2012-02-01
4     6     2012-02-01
5     6     2012-02-01
6     8     2012-02-01
ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
1      1     3      1
2      1     6      2
3      2     4      1
4      2     2      2
5      2     1      5
6      3     1      1
7      3     6      5
8      4     8      3
9      4     9      4
10     5     2      6
11     6     1      5
ITEM
I_ID  NAME
1     iPhone
2     iPad
3     iPad Mini
4     Kindle
5     Kindle Fire
6     Galaxy S3
Requirements
• Every day:
• Updates
– 30,000 new customers
– 1,000,000 new orders, average of 5 items per order
– Items catalog is updated once a day, nightly, at 11 pm
• Queries
– Top customers (rank 9 and up)
– New orders, joins across the board…
Throughput
Latency
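To give the daily volumes above a sense of scale, they can be converted into per-second rates. This is a rough sketch that assumes the load is spread evenly over 24 hours, which real traffic rarely is; the variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily volumes taken from the requirements slide.
new_customers_per_day = 30_000
new_orders_per_day = 1_000_000
items_per_order = 5  # average stated on the slide

# Each order produces one ORDER row plus, on average, five ORDER_ITEM rows.
order_items_per_day = new_orders_per_day * items_per_order
rows_per_day = new_customers_per_day + new_orders_per_day + order_items_per_day


def per_second(daily_count):
    """Average rate per second, assuming a perfectly even load (an assumption)."""
    return daily_count / SECONDS_PER_DAY


print(f"order items per day: {order_items_per_day:,}")
print(f"inserted rows per day: {rows_per_day:,}")
print(f"average orders per second: {per_second(new_orders_per_day):.2f}")
print(f"average inserted rows per second: {per_second(rows_per_day):.2f}")
```

Peak-hour rates would be higher than these averages, which is why both throughput and latency appear as requirements.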
Splitting the data
• CUSTOMER – random (hash)
• ORDER – derivative (C_ID)
• ORDER_ITEM – transitive (O_ID -> C_ID)
• ITEM – global table
Sliced Database
DB - 1
CUSTOMER
C_ID  NAME     LOCATION  RANK
1     John     MA        10
4     Chris    FL        8
7     Janette  CA        8
ORDER
O_ID  C_ID  DATE
1     1     2012-02-01
2     1     2012-02-01
ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
1      1     3      1
2      1     6      2
3      2     4      1
4      2     2      2
5      2     1      5
ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3
DB - 2
CUSTOMER
C_ID  NAME     LOCATION  RANK
2     James    AL        9
5     Oliver   MA        9
8     David    MD        10
ORDER
O_ID  C_ID  DATE
3     2     2012-02-01
6     8     2012-02-01
ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
6      3     1      1
7      3     6      5
11     6     1      5
ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3
DB - 3
CUSTOMER
C_ID  NAME     LOCATION  RANK
3     Peter    CA        10
6     Allan    MA        9
ORDER
O_ID  C_ID  DATE
4     6     2012-02-01
5     6     2012-02-01
ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
8      4     8      3
9      4     9      4
10     5     2      6
ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3
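The point of this layout is that every child row sits in the same database as its parent. The sketch below checks that property against the key columns copied from this slide. The dictionary layout and the function name are illustrative assumptions, not part of any ScaleBase tool.

```python
# Key columns copied from the slide, one entry per database.
DATABASES = {
    "DB - 1": {
        "customer_ids": {1, 4, 7},
        "orders": {1: 1, 2: 1},                         # O_ID -> C_ID
        "order_items": {1: 1, 2: 1, 3: 2, 4: 2, 5: 2},  # OI_ID -> O_ID
    },
    "DB - 2": {
        "customer_ids": {2, 5, 8},
        "orders": {3: 2, 6: 8},
        "order_items": {6: 3, 7: 3, 11: 6},
    },
    "DB - 3": {
        "customer_ids": {3, 6},
        "orders": {4: 6, 5: 6},
        "order_items": {8: 4, 9: 4, 10: 5},
    },
}


def misplaced_rows(db):
    """Return child rows whose parent row is not in the same database."""
    problems = []
    for o_id, c_id in db["orders"].items():
        if c_id not in db["customer_ids"]:
            problems.append(("ORDER", o_id))
    for oi_id, o_id in db["order_items"].items():
        if o_id not in db["orders"]:
            problems.append(("ORDER_ITEM", oi_id))
    return problems


for name, db in DATABASES.items():
    print(name, "misplaced rows:", misplaced_rows(db))
```

An empty list for every database means that joins between CUSTOMER, ORDER and ORDER_ITEM never need to cross database boundaries for this data set.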
Requirements
• Every day:
• Updates
– 30,000 new customers
– 1,000,000 new orders, average of 5 items per order
– Items catalog is updated once a day, nightly, at 11 pm
• Queries
– Top customers (rank 9 and up)
– New orders, joins across the board…
Throughput
Distribution
Parallelism
Latency
Automatic Data Distribution
• The ultimate way to scale
• Provides significant performance improvements
• The only way to really improve both reads and writes
• Good for scaling high session-volume reads and writes
• Good for scaling high data-volume reads and writes
• Home-grown implementations have drawbacks
Scale Out Features and Benefits
Feature | Benefit
Parallel query execution | Great performance of cross-db queries & maintenance commands
Query result aggregation | Support of sophisticated cross-db queries, even with ORDER BY, GROUP BY, LIMIT, aggregate functions…
Online data redistribution | Flexibility: no need to over-provision; no downtime
100% compatible MySQL proxy | Applications unmodified; standard MySQL tools and interfaces
MySQL databases untouched | Data is safe within MySQL; InnoDB/MyISAM/any
Data distribution review and analysis | Optimization of data distribution policy
Data consistency verifier | Validate system-wide data consistency
Real-time monitoring and alerts | Simplify management, reduce TCO
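The first two features, parallel query execution and query result aggregation, follow a scatter-gather pattern: send the same query to every database, then merge the partial results. The sketch below illustrates the idea for the "top customers, rank 9 and up" query using the customer rows from the sliced-database example. It is an assumption-laden illustration in plain Python, with in-memory lists standing in for MySQL databases and hypothetical function names; it does not show how the ScaleBase proxy is built.

```python
import heapq
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# (C_ID, NAME, LOCATION, RANK) rows per database, from the sliced-database example.
CUSTOMER_SHARDS = {
    "DB - 1": [(1, "John", "MA", 10), (4, "Chris", "FL", 8), (7, "Janette", "CA", 8)],
    "DB - 2": [(2, "James", "AL", 9), (5, "Oliver", "MA", 9), (8, "David", "MD", 10)],
    "DB - 3": [(3, "Peter", "CA", 10), (6, "Allan", "MA", 9)],
}


def order_key(row):
    """Sort key equivalent to ORDER BY RANK DESC, C_ID ASC."""
    c_id, _name, _location, rank = row
    return (-rank, c_id)


def top_in_shard(rows, min_rank, limit):
    """Stand-in for one database running WHERE RANK >= ? ORDER BY ... LIMIT ?."""
    matching = [row for row in rows if row[3] >= min_rank]
    return sorted(matching, key=order_key)[:limit]


def top_customers(min_rank, limit):
    """Query every shard in parallel, then merge the sorted partial results."""
    shards = list(CUSTOMER_SHARDS.values())
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda rows: top_in_shard(rows, min_rank, limit), shards))
    return list(heapq.merge(*partials, key=order_key))[:limit]


def customers_per_location(min_rank):
    """GROUP BY LOCATION with COUNT(*): sum the per-shard partial counts."""
    total = Counter()
    for rows in CUSTOMER_SHARDS.values():
        total.update(row[2] for row in rows if row[3] >= min_rank)
    return dict(total)


print([row[1] for row in top_customers(min_rank=9, limit=4)])
print(customers_per_location(min_rank=9))
```

Each shard only needs to return its own top `limit` rows, because no row outside a shard's local top `limit` can appear in the global top `limit`. Aggregates such as COUNT and SUM combine by adding partial results, while an average would need both a sum and a count from every shard.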
Scale Out Provides Immediate & Tangible Value
[Diagram: two application servers, BI and management tools; Databases A, B, C and D, each paired with a standby (Standby A to Standby D)]
Typical Scale Out (ScaleBase) Deployment
[Diagram: two application servers, BI and management tools; ScaleBase Data Traffic Manager and ScaleBase Central Management; Databases A, B, C and D, each paired with a standby (Standby A to Standby D)]
Choose Your Scale-out Path
[Chart: Database Size versus # of concurrent sessions, with regions labeled "1 DB? Good for me!", "Read/Write Splitting" and "Data Distribution"]
Scaling Out Achieves Unlimited Scalability
[Chart: throughput versus number of databases; y-axis from 0 to 160000; legend: Throughput (TPM), Total DB Size (MB), # Connections]
Number of Databases: 1, 2, 4, 6, 8, 10, 14
Throughput (TPM): 6000, 12000, 24000, 36000, 48000, 60000, 84000
Second plotted series (legend label not recoverable): 500, 500, 1000, 1500, 1500, 2000, 2500
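The throughput values on this chart can be checked for linearity with a few lines of arithmetic. The sketch below divides each throughput figure by the number of databases; the pairing of the two series is read off the chart data, and the variable names are illustrative.

```python
# Values read from the chart.
databases = [1, 2, 4, 6, 8, 10, 14]
throughput_tpm = [6000, 12000, 24000, 36000, 48000, 60000, 84000]

# Throughput contributed by each database at every measured cluster size.
per_database_tpm = [tpm / count for tpm, count in zip(throughput_tpm, databases)]

for count, tpm, per_db in zip(databases, throughput_tpm, per_database_tpm):
    print(f"{count:>2} databases: {tpm:>6} TPM total, {per_db:.0f} TPM per database")

# Linear scaling means the per-database figure does not drop as databases are added.
is_linear = len(set(per_database_tpm)) == 1
print("linear scaling:", is_linear)
```

A constant per-database figure across all seven measurements is what linear scale-out looks like in this data; it says nothing about behavior beyond 14 databases.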
Detailed Scale Out Case Studies
Nokia
• Device Apps App
• Availability
• Scalability
• Geo-clustering
• 100 Apps
• 300 MySQL DB
Solar Edge
• Next Gen Monitoring App
• Massive Scale
• Monitors real time data from thousands of distributed systems
Mozilla
• New Product/ Next Gen App/ AppStore
• Scalability
• Geo-sharding
AppDynamics
• Next gen APM company
• Scalability for the Netflix implementation
Summary
• Database scalability is a significant problem
– App explosion, Big Data, Mobile
• Scale Up helps somewhat, but Scale Out provides a long-term, cost-effective solution
• ScaleBase has an effective Scale Out solution with a proven ROI
– Improves performance & requires NO changes to your existing infrastructure
• Choose your scale-out path...
– The ScaleBase platform enables you to start with R/W splitting and grow into automatic data distribution
Questions (please enter directly into the GTW side panel)
617.630.2800
www.ScaleBase.com
Thank You