How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation


Upload: rakuten-inc

Post on 07-Nov-2014


DESCRIPTION

Session slide at db tech showcase 2012 How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation - About Rakuten - Rakuten database environment and operational issues - What is Clustrix? - Clustrix verification results and implementation effectiveness - Summary

TRANSCRIPT

Page 1: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

October 17th, 2012
Ryutaro Yada (矢田 龍太郎)

Database Platform Group
Global Infrastructure Development Dept.

Rakuten, Inc.

Rakuten case study: Reducing DB management costs by introducing Clustrix

How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Page 2: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Introduction

Ryutaro Yada
First employed by Rakuten in 2008

Present job
Development of a platform database to support Rakuten
Testing and evaluation of new techniques and new architectures with a view to adopting them

Previous functions
Promotion of Oracle business with a specified customer
Establishing a collaborative network with Oracle, developing and verifying new solutions, etc.

LinkedIn profile: http://www.linkedin.com/pub/ryutaro-yada/32/368/4b0

Page 3: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Agenda

About Rakuten
Rakuten database environment and operational issues
What is Clustrix?
Clustrix verification results and implementation effectiveness
Summary

Page 4: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Introduction to Rakuten

Rakuten market

About 3,000 employees (Group: approx. 7,000)
More than 40 services provided, including the marketplace and travel
More than 120,000 contracted firms; more than 80,000,000 registered products
Group total transaction volume: 3.2 trillion yen (2011)

Page 5: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Rakuten Global Expansion

Our Goal is to become the No. 1 Internet Service in the World

[Figure: world map marking Rakuten group services by country. Legend: Ichiba (EC), Travel, Performance marketing. Labels that survive in the transcript: Taiwan, LS (UK).]

*To be open soon

Page 6: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Rakuten’s Global Position

Rakuten is aiming to be the world’s largest internet firm.
A firm and highly flexible infrastructure is required to achieve this goal.

[Chart: Retail / auction site global ranking 2011, based on number of unique visitors. Sites shown: Amazon, eBay, Alibaba, Apple, Rakuten, Walmart. Axis from 0 to 300,000.]

Source: comScore Media Metrics

Page 7: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Rakuten Database

Breakdown by number of databases: approx. 80% MySQL (more than 1,100)
More than 350 MySQL database servers
MySQL has the largest share

[Chart: Number of databases by RDBMS in the production environment. Segments: MySQL, Informix, Oracle, PostgreSQL, Teradata.]

STG and DEV each have the same number of databases

Page 8: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

MySQL Database Issue (1)

Data Sharding Operations
Required for scaling
Instance/database/table splitting, data redistribution
Correction of application code, control of database access

Data Protection, HA Securing
Replication cannot guarantee zero data loss at failure
Switchover/switchback management takes a lot of effort
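To illustrate why sharding forces both application changes and data redistribution, here is a minimal sketch of application-side shard routing. It is not Rakuten's code: the function names `shard_for` and `fraction_moved`, the CRC32-modulo scheme, and the shard counts are all assumptions chosen for illustration.

```python
import zlib


def shard_for(key: str, n_shards: int) -> int:
    """Pick a shard by hashing the key (hypothetical modulo scheme)."""
    return zlib.crc32(key.encode("utf-8")) % n_shards


def fraction_moved(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)


# Hypothetical keys; any row identifier would do.
keys = [f"user-{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_n=4, new_n=5)
print(f"{moved:.0%} of rows change shard when going from 4 to 5 shards")
```

With plain modulo routing, most keys land on a different shard after one shard is added, which corresponds to the data redistribution work listed on the slide. Every query path in the application also has to call the routing function, which corresponds to the application code corrections.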

Page 9: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

MySQL Database Issue (2)

Online Maintainability
Schema modification, index addition and rebuild
Locking, access concentration

Number of Units Tends to Increase
Slaves for load distribution, redundant slave configurations
Servers tend to be prepared per individual service (differing service levels, avoiding maintenance coordination)
CPU efficiency decreases; data center costs increase

Page 10: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix Characteristics

What is Clustrix?

Appliance-style database server
Cluster database

NewSQL = LegacySQL + NoSQL
LegacySQL: SQL access, transaction consistency
NoSQL: Scalability, high performance

Fault-tolerance function
MySQL compatibility
Access is usually through the MySQL protocol

Page 11: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix Provision Model


2 Models

Page 12: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Looking at Clustrix

SSD
InfiniBand
Low latency
High performance

Page 13: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix Operation

Distributed arrangement on the physical layer
Redundancy protection, auto rebalance
Parallel query execution

[Diagram labels: SQL, SQL, SQL, SQL]

The query, not the data, is migrated (this concept differs from Oracle RAC)
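The idea of moving the query to the data can be sketched as follows. This is only an illustration of the general technique, not Clustrix's actual implementation: the data slices, the function names, and the aggregation are hypothetical, and the per-node work runs sequentially here where a real cluster would run it in parallel.

```python
from collections import Counter

# Hypothetical data slices, one per node; in a real cluster these would
# live on different machines.
NODE_SLICES = [
    [("books", 3), ("travel", 1)],
    [("books", 2), ("golf", 5)],
    [("travel", 4), ("golf", 1)],
]


def local_sum_by_category(rows):
    """Runs where the rows live and returns only a small partial result."""
    partial = Counter()
    for category, qty in rows:
        partial[category] += qty
    return partial


def run_query():
    """Coordinator: send the query fragment to each node, merge the partials."""
    total = Counter()
    for rows in NODE_SLICES:  # conceptually executed in parallel on each node
        total.update(local_sum_by_category(rows))
    return dict(total)


print(run_query())
```

Only the small per-node partial results travel between nodes; the rows themselves stay where they are stored.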

Page 14: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation


TPC-C Benchmark Result

Page 15: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

GUI


Page 16: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation


Useful Command Interface

Page 17: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix Implementation Cases

Rakuten is the first case in Japan
Numerous foreign cases

Page 18: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Verification Points

Performance
Scalability
Fault-tolerance verification
Online schema modification

Page 19: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

OLTP Performance Results (1)

Operation: Insert (throughput in ops/sec, rounded to one decimal place)

                     p3       p12       p24       p48       p96      p192
Single           4014.7    8350.8   10022.3   10448.3   10520.1   10214.0
Clx 3 nodes      6301.6   18530.3   26182.8   30021.3   27581.9   24401.3
Clx 4 nodes      6090.5   20584.4   30544.8   38545.2   36837.1   33221.7
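As a worked example of reading the insert results, the following sketch computes each configuration's peak throughput and the resulting scaling ratios. The numbers are the slide's values rounded to one decimal place; the variable and function names are made up for this illustration.

```python
# Insert throughput (ops/sec) from the slide, rounded to one decimal place.
LEVELS = ["p3", "p12", "p24", "p48", "p96", "p192"]
INSERT = {
    "Single": [4014.7, 8350.8, 10022.3, 10448.3, 10520.1, 10214.0],
    "Clx 3 nodes": [6301.6, 18530.3, 26182.8, 30021.3, 27581.9, 24401.3],
    "Clx 4 nodes": [6090.5, 20584.4, 30544.8, 38545.2, 36837.1, 33221.7],
}


def peak(series):
    """Return (level, throughput) at the highest measured throughput."""
    best = max(range(len(series)), key=lambda i: series[i])
    return LEVELS[best], series[best]


peaks = {name: peak(values) for name, values in INSERT.items()}
speedup_vs_single = peaks["Clx 4 nodes"][1] / peaks["Single"][1]
gain_3_to_4 = peaks["Clx 4 nodes"][1] / peaks["Clx 3 nodes"][1]

for name, (level, ops) in peaks.items():
    print(f"{name}: peak {ops:.1f} ops/sec at {level}")
print(f"4 nodes vs single server: {speedup_vs_single:.2f}x")
print(f"4 nodes vs 3 nodes: {gain_3_to_4:.2f}x (ideal would be {4 / 3:.2f}x)")
```

On these figures the 4-node cluster peaks at roughly 3.7 times the single server's peak insert rate, and adding the fourth node gives about 1.28 times the 3-node peak, a little under the ideal 1.33.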

Page 20: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

OLTP (2)

Operation: Update (throughput in ops/sec, rounded to one decimal place)

                     p3       p12       p24       p48       p96      p192
Single           3854.2    8018.6   12186.2   13385.8   13395.1   11538.3
Clx 3 nodes      3377.4   10741.8   16505.8   16964.0   16189.9   15379.6
Clx 4 nodes      3683.0   12679.3   19737.6   22232.7   21568.4   21303.7

Page 21: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

OLTP (3)

Operation: Read (throughput in ops/sec, rounded to one decimal place)

                     p3       p12       p24       p48       p96      p192
Single           6134.4   26773.7   44388.8   56144.3   57926.6   49362.5
Clx 3 nodes      5050.2   17380.7   27803.8   39693.8   49822.3   56847.8
Clx 4 nodes      5959.9   20794.2   34743.3   54383.0   70302.3   76000.6

Page 22: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

OLTP (4)

Operation: Mix (throughput in ops/sec, rounded to one decimal place)

                     p3       p12       p24       p48       p96      p192
Single           3976.8    8587.2   11632.6   12946.3   13122.3   12769.5
Clx 3 nodes      3113.1   12941.0   21264.6   26759.8   25976.3   25334.2
Clx 4 nodes      5150.5   15220.8   25601.7   34647.4   34697.7   30804.1

Page 23: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Complex and Heavy SQL Comparison

                                  Clustrix         IA with SSD        SPARC with SAN
J) Count+GroupBy+OrderBy+Limit    1.9s (3.4s)      2.1s (8.5s)        3.4s (409.32s)
K) Count+GroupBy+OrderBy+Limit    0.7s (1.13s)     5.9s (7.49s)       13.0s (39.41s)
L) 2000 of IN+GroupBy             3.8s (8.97s)     106.5s (103.77s)   193.0s (321.68s)
M) Case+OrderBy                   31.0s (45.66s)   47.3s (60.9s)      90.5s (112.24s)

Page 24: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Example of Performance Improvements

Example improvement for a particular service
Before: 116.8ms
After: 21.4ms

Page 25: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Fault-Tolerance Inspection

No.  Failure Test Item                        Downtime
1    Front network (port1)                    No
2    Front network (port2)                    No
3    Internal network (primary)               < 12s
4    Internal network (standby)               No
5    MySQL instance                           < 4s
6    Node OS                                  < 4s
7    Online data disk (SSD) failure           < 5s
8    Log/work data disk (SATA) failure        No
9    Infiniband switch (primary)              < 12s
10   Infiniband switch (standby)              No
11   Front network (port1 & 2)                < 18s
12   Internal network (primary & standby)     < 12s

[Diagram: DB nodes, Front SW1, Front SW2, Infiniband SW1 and Infiniband SW2, annotated with failure points 1 to 12.]

Page 26: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Time Required for Online Maintenance

Implementation Time

                Small    Medium    Large
Create Column   1.6s     13.5s     149.8s
Create Index    1.6s     13.0s     172.7s
Drop Column     1.5s     13.8s     125.5s
Drop Index      0.5s     0.5s      0.5s

Table Rows and Size

                Small          Medium           Large
Row             50,000         500,000          5,000,000
Size (byte)     113,639,424    1,063,190,528    10,696,130,560
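One way to read the two tables together is to convert the Create Column timings into rows processed per second. The sketch below does that arithmetic. It assumes the Medium and Large Create Column values are in seconds like the rest of the table, and the names used are illustrative only.

```python
# Figures taken from the two tables on the slide.
ROWS = {"Small": 50_000, "Medium": 500_000, "Large": 5_000_000}
# Assumption: all three Create Column timings are in seconds.
CREATE_COLUMN_SECONDS = {"Small": 1.6, "Medium": 13.5, "Large": 149.8}


def rows_per_second(rows: int, seconds: float) -> float:
    """Average number of rows processed per second of the schema change."""
    return rows / seconds


rates = {
    size: rows_per_second(ROWS[size], CREATE_COLUMN_SECONDS[size])
    for size in ROWS
}
for size, rate in rates.items():
    print(f"{size}: {rate:,.0f} rows/sec")
```

The three rates fall between about 31,000 and 37,000 rows per second, which suggests that the time for this operation grows roughly in proportion to table size. Drop Index, by contrast, takes 0.5s at every size.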

Page 27: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Impacts During Online Schema Modification

Online execution: 5 million rows, total table size 10 GB

No impact on access performance for anything other than the table being modified
Some impact on performance of access to the table being modified (so schedule the work for low-impact periods, such as at night)

Page 28: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix Implementation Impacts: Release from Sharding (1)

[Diagram: Before and After configurations of multiple DB servers]

No more sharding!

Page 29: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix Implementation Impacts: Release from Sharding (2)

[Chart: man-months (axis 0 to 14) for as-is versus to-be, split into DBA and APP. In the case of a large-scale sharding project: actual effort compared with the original.]

No need for correction of application
No need for DB distribution
Sharding effort reduced by over 90% for both application engineers and DBAs

Page 30: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix Implementation Impacts: Cost Reductions due to Consolidation (1)

Sufficient performance scalability
Fault tolerance ready for mission-critical use
No data loss
High online maintainability that doesn’t affect other services

Existing MySQL databases can be consolidated onto Clustrix

Page 31: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix Implementation Impacts: Cost Reductions due to Consolidation (2)

Consolidation of all existing MySQL within Clustrix
Number of servers will be reduced to 10%
Monthly system costs will be reduced to 40%

Page 32: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Back-up Structure

[Diagram: Clustrix cluster (DB Node 1, DB Node 2, DB Node 3) linked by replication to a MySQL DB; NAS attached over NFS.]

Slave as first backup
Backup by mysqldump

Page 33: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Data Migration Procedure

[Diagram: a MySQL DB linked by replication to Clustrix PRO (three DB nodes) and to Clustrix DEV (three DB nodes).]

Replication to DEV for verification
Replication to PRO for migration
Switch the application access point to PRO
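The three steps can be sketched as a toy simulation. Nothing here reflects Rakuten's actual tooling: dictionaries stand in for the databases, a list stands in for the replication stream, and every name is hypothetical.

```python
# All names are hypothetical; dicts stand in for databases and a list
# stands in for the replication stream.

def replicate(source_log, target):
    """Apply every replicated write (key, value) to the target in order."""
    for key, value in source_log:
        target[key] = value


def in_sync(source, target) -> bool:
    """Verification step: the target must hold exactly the source's data."""
    return source == target


mysql = {}
write_log = []


def write(key, value):
    """A write on the existing MySQL side, recorded for replication."""
    mysql[key] = value
    write_log.append((key, value))


write("item:1", "book")
write("item:2", "golf club")

clustrix_dev, clustrix_pro = {}, {}

# Step 1: replicate to DEV and verify there.
replicate(write_log, clustrix_dev)
assert in_sync(mysql, clustrix_dev)

# Step 2: replicate to PRO for the actual migration.
replicate(write_log, clustrix_pro)

# Step 3: switch the application access point only once PRO has caught up.
access_point = "clustrix_pro" if in_sync(mysql, clustrix_pro) else "mysql"
print(access_point)
```

The point of the ordering is that the application keeps using the existing MySQL database until the replicated copy in PRO is confirmed to match, so the switch itself is only a change of access point.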

Page 34: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Other Advantages of Clustrix

Auto-defrag
Cordial support service
Advice regarding structure
Troubleshooting
Tuning advice
Etc.

Page 35: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Operational Issues Resolved with Clustrix

Issue                                 With Clustrix
Data sharding operations              Unnecessary, operational cost reduction
Data protection, HA securing          Possible
Online maintenance                    Possible
Tendency for large number of units    Consolidation possible, cost reduction

Page 36: How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation

Clustrix at Rakuten

An important database platform
Provided as Database-as-a-Service
No lead time
Usage-based pricing structure