High Availability, Disaster Recovery and Extreme Read Scaling using Binlog Servers

Jean-François Gagné (jeanfrancois DOT gagne AT booking.com)

Presented at Percona Live London 2014

Upload: jean-francois-gagne

Post on 08-Jul-2015


DESCRIPTION

At Booking.com, we are adding new components to our replication architecture: Binlog Servers. They will allow us to reach an extreme number of slaves replicating from a single master (more than 100 slaves and growing). We think that people with more modest replication installations can also benefit from Binlog Servers, as they ease high availability deployment and simplify remote site replication. Moreover, we are convinced that the Binlog Server will become an enabler for parallel replication, especially in remote site deployments. We will be happy to give you all the details during this talk.

TRANSCRIPT

Page 1: High Availability, Disaster Recovery and Extreme Read Scaling using Binlog Servers

High Availability, Disaster Recovery

and Extreme Read Scaling

using Binlog Servers

Jean-François Gagné

jeanfrancois DOT gagne AT booking.com

Presented at Percona Live London 2014

Page 2

Booking.com

Page 3

Booking.com’

● Based in Amsterdam since 1996

● Online Hotel and Accommodation Agent:

  ● 135 offices worldwide

  ● +540.000 properties in 207 countries

  ● 42 languages (website and customer service)

● Part of the Priceline Group

● And we are using MySQL:

  ● >3000 servers, ~90% replicating

  ● ~100 masters: ~10 with >50 slaves and ~4 with >100 slaves

Page 4

Binlog Server: Session Summary

1. Replication

2. What is the Binlog Server

3. Extreme Read Scaling

4. Remote Site Replication and Easy Disaster Recovery

5. Easy High Availability

6. Other Use-Cases

7. Impacts on the Ecosystem

Page 5

Binlog Server: Replication

● One master / one or more slaves

● The master records all its writes in a journal: binary logs

● Each slave:

  ● Downloads the journal from the master and saves it locally (IO thread): relay logs

  ● Executes the relay logs on the local database (SQL thread)

  ● Can produce binary logs to act as a master for other slaves

● Replication is:

  ● Asynchronous → lag: slaves are only eventually consistent

  ● Single-threaded → slower than the master

Page 6

Binlog Server: Booking.com’’

● Typical replication deployment:

                 -----
                 | M |
                 -----
                   |
  +------+-- ... --+---------------+-------- ...
  |      |         |               |
-----  -----     -----           -----
| S1|  | S2|     | Sn|           | M1|
-----  -----     -----           -----
                                   |
                                   +-- ... --+
                                   |         |
                                 -----     -----
                                 | T1|     | Tm|
                                 -----     -----

Page 7

Binlog Server: What

● Binlog Server (BLS): a daemon that:

  ● Downloads binary logs from the master

  ● Saves them in the same structure as the master

  ● Serves the binary logs to slaves

-----        / \
| A | --->  / X \
-----       -----
  |           |
-----       -----
| B |       | B |
-----       -----

● A and X are the same from the point of view of B:

  ● By design, the binlogs served by A and X are the same
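The property above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: binlogs are modelled as byte strings keyed by file name, and the class and method names (`BinlogSource`, `Master`, `BinlogServer`, `download_from`) are hypothetical, not the API of any real Binlog Server.

```python
class BinlogSource:
    """Anything a slave can replicate from: a master or a Binlog Server."""

    def __init__(self):
        self.files = {}  # binlog file name -> raw content

    def read(self, file_name, position):
        """Serve everything after `position` in `file_name`, as a slave would request."""
        return self.files[file_name][position:]


class Master(BinlogSource):
    def write(self, file_name, event):
        """Append an event to a binlog file (stands in for a committed write)."""
        self.files[file_name] = self.files.get(file_name, b"") + event


class BinlogServer(BinlogSource):
    def download_from(self, upstream):
        """Copy what is missing, keeping the same file names and offsets as upstream."""
        for name in upstream.files:
            have = len(self.files.get(name, b""))
            self.files[name] = self.files.get(name, b"") + upstream.read(name, have)


a = Master()
a.write("binlog.000001", b"event-1;")
a.write("binlog.000001", b"event-2;")
x = BinlogServer()
x.download_from(a)
# A slave B asking for (file, position) gets the same bytes from A or from X.
print(x.read("binlog.000001", 8) == a.read("binlog.000001", 8))
```

Because file names and offsets are preserved, a slave can switch between A and X without translating its replication coordinates.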

Page 8

Binlog Server: Read Scaling

● Typical replication topology for read scaling:

              -----
              | M |
              -----
                |
  +------+------+--- ... ---+
  |      |      |           |
-----  -----  -----       -----
| S1|  | S2|  | S3|       | Sn|
-----  -----  -----       -----

● With too many slaves, the NIC of M is overloaded:

  ● 100 slaves x 1 MByte/s = 800 Mbit/s → very close to 1 Gbit/s

  ● OSC (online schema change) or purging data in RBR (row-based replication) becomes hard

  ● Slaves lag or, worse, the master is unreachable for writes
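The NIC arithmetic can be checked with a few lines. This is a back-of-the-envelope sketch: the 1 MByte/s binlog rate per slave and the figure of 4 Binlog Servers are assumptions chosen for illustration, and the function name is hypothetical.

```python
def master_egress_mbit(direct_consumers, binlog_mbyte_per_s):
    """Outbound binlog traffic of the master in Mbit/s (8 bits per byte)."""
    return direct_consumers * binlog_mbyte_per_s * 8


NIC_MBIT = 1000  # a 1 Gbit/s network interface

direct = master_egress_mbit(100, 1.0)    # 100 slaves attached to the master
through_bls = master_egress_mbit(4, 1.0)  # only 4 Binlog Servers attached

print(direct, direct / NIC_MBIT)            # 800.0 0.8
print(through_bls, through_bls / NIC_MBIT)  # 32.0 0.032
```

A burst of writes (for example an online schema change in row-based replication) raises the per-slave rate and pushes the direct case past the NIC capacity.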

Page 9

Binlog Server: Read Scaling’

● Typical solution: Intermediary Masters (IM):

                 -----
                 | M |
                 -----
  +----------------+------ ... ------+
  |                |                 |
-----            -----             -----
| M1|            | M2|             | Mm|
-----            -----             -----
  +------+  ...    +--- ...   +--- ... ---+
  |      |         |          |           |
-----  -----     -----      -----       -----
| S1|  | S2|     | T1|      |...|       | Zi|
-----  -----     -----      -----       -----

● But IMs bring new problems:

  ● Lag of an IM → all its slaves are lagging

  ● Failure of an IM → all its slaves stop replicating

  ● Rogue transaction on an IM → corruption of all its slaves

Page 10

Binlog Server: Read Scaling’’

● Can the IM problems be fixed?

● Shared disk for HA:

  ● Filers or doubling the number of servers

  ● HA needs sync_binlog=1 + innodb_flush_log_at_trx_commit=1

● After a crash of an IM:

  ● It needs InnoDB recovery → its slaves are mostly useless

  ● And its cache is cold → replication will lag

● GTIDs to the rescue:

  ● They allow slave repointing :-)

  ● But do not completely solve the lag problem :-(

  ● And we cannot migrate online :-( :-(

Page 11

Binlog Server: Read Scaling’’’

● New Solution: BLS

                 -----
                 | M |
                 -----
  +----------------+------ ... ------+
  |                |                 |
 / \              / \               / \
/ I1\            / I2\             / Im\
-----            -----             -----
  +------+  ...    +--- ...   +--- ... ---+
  |      |         |          |           |
-----  -----     -----      -----       -----
| S1|  | S2|     | Si|      | Sj|       | Sn|
-----  -----     -----      -----       -----

● If a BLS fails, repoint its slaves to other BLSs:

  ● This is easy: the binlogs on all BLSs are the same by design, and the same as the ones from the master
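Repointing a slave could look like the sketch below. It assumes classic file/position replication and builds standard `CHANGE MASTER TO` statements from two fields of `SHOW SLAVE STATUS` (`Relay_Master_Log_File` and `Exec_Master_Log_Pos`). The helper name and the host name `bls2.example` are hypothetical; running the statements against the slave is left out.

```python
def repoint_statements(slave_status, new_host, new_port=3306):
    """Build the SQL that moves a slave to another Binlog Server.

    slave_status: dict holding the relevant SHOW SLAVE STATUS columns.
    The coordinates are reused unchanged because every BLS serves the
    same binlog files, with the same offsets, as the master.
    """
    file_name = slave_status["Relay_Master_Log_File"]
    position = int(slave_status["Exec_Master_Log_Pos"])
    return [
        "STOP SLAVE",
        (
            f"CHANGE MASTER TO MASTER_HOST='{new_host}', MASTER_PORT={new_port}, "
            f"MASTER_LOG_FILE='{file_name}', MASTER_LOG_POS={position}"
        ),
        "START SLAVE",
    ]


status = {"Relay_Master_Log_File": "binlog.000042", "Exec_Master_Log_Pos": "1337"}
for statement in repoint_statements(status, "bls2.example"):
    print(statement + ";")
```

With an intermediate master in place of the BLS, the same operation would need a translation of coordinates (or GTIDs), since an IM writes its own binlogs with its own file names and positions.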

Page 12

Binlog Server: Remote Site

● Typical deployment for remote site:

              -----
              | A |
              -----
  +------+------+---------------+
  |      |      |               |
-----  -----  -----           -----
| B |  | C |  | D |           | E |
-----  -----  -----           -----
                         +------+------+
                         |      |      |
                       -----  -----  -----
                       | F |  | G |  | H |
                       -----  -----  -----

● E is an IM → same problems as with slave scaling.

Page 13

Binlog Server: Remote Site’

● Ideally, we would like this:

              -----
              | A |
              -----
  +------+------+---------------+------+------+------+
  |      |      |               |      |      |      |
-----  -----  -----           -----  -----  -----  -----
| B |  | C |  | D |           | E |  | F |  | G |  | H |
-----  -----  -----           -----  -----  -----  -----

● No lag and no Single Point of Failure (SPOF)

● But no master on remote site for writes (solved problem)

● And expensive in WAN bandwidth

Page 14

Binlog Server: Remote Site’’

● New solution: a BLS on the remote site:

              -----
              | A |
              -----
  +------+------+---------------+
  |      |      |               |
-----  -----  -----            / \
| B |  | C |  | D |           / X \
-----  -----  -----           -----
                         +------+------+------+
                         |      |      |      |
                       -----  -----  -----  -----
                       | E |  | F |  | G |  | H |
                       -----  -----  -----  -----

Page 15

Binlog Server: Remote Site’’’

● Or deploy 2 BLSs to get better resilience:

              -----
              | A |
              -----
  +------+------+---------------+
  |      |      |               |
-----  -----  -----            / \           / \
| B |  | C |  | D |           / X \ ------> / Y \
-----  -----  -----           -----         -----
                             +------+      +------+
                             |      |      |      |
                           -----  -----  -----  -----
                           | E |  | F |  | G |  | H |
                           -----  -----  -----  -----

● If Y fails, repoint G and H to X.

● If X fails, repoint Y to A and E and F to Y.
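The two failure rules can be written down as a small topology update. This is a sketch under assumptions: the topology is a plain dict mapping each node to its upstream, orphaned slaves go to the first surviving BLS, and the function name is hypothetical.

```python
def handle_bls_failure(upstream, binlog_servers, failed):
    """Return a new node -> upstream mapping after the Binlog Server `failed` is lost.

    - A BLS that replicated from the failed BLS takes over its upstream.
    - A plain slave of the failed BLS is repointed to a surviving BLS.
    """
    survivors = [b for b in binlog_servers if b != failed]
    if not survivors:
        raise ValueError("no surviving Binlog Server to repoint to")
    new_upstream = {}
    for node, parent in upstream.items():
        if node == failed:
            continue  # the failed BLS leaves the topology
        if parent != failed:
            new_upstream[node] = parent  # unaffected node
        elif node in binlog_servers:
            new_upstream[node] = upstream[failed]  # chained BLS moves one level up
        else:
            new_upstream[node] = survivors[0]  # orphaned slave
    return new_upstream


topology = {"B": "A", "C": "A", "D": "A", "X": "A", "Y": "X",
            "E": "X", "F": "X", "G": "Y", "H": "Y"}

print(handle_bls_failure(topology, ["X", "Y"], "Y"))  # G and H now replicate from X
print(handle_bls_failure(topology, ["X", "Y"], "X"))  # Y from A; E and F from Y
```

No coordinate translation is needed in either case, because X and Y hold the same binlogs as A.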

Page 16

Binlog Server: Remote Site’’’’

● Interesting property: if A fails, E, F, G and H converge to a common state.

              -----
              | A |
              -----
  +------+------+---------------+
  |      |      |               |
-----  -----  -----            / \
| B |  | C |  | D |           / X \
-----  -----  -----           -----
                         +------+------+------+
                         |      |      |      |
                       -----  -----  -----  -----
                       | E |  | F |  | G |  | H |
                       -----  -----  -----  -----

● New master election is easy on the remote site.

Page 17

Binlog Server: High Availability

● This property can be used for HA:

                        -----
                        | A |
                        -----
                          |
                         / \
                        / X \
                        -----
  +------+------+------+------+------+------+------+
  |      |      |      |      |      |      |      |
-----  -----  -----  -----  -----  -----  -----  -----
| B |  | C |  | D |  | E |  | F |  | G |  | H |  | I |
-----  -----  -----  -----  -----  -----  -----  -----

Page 18

Binlog Server: HA, DR, and RS

● With this deployment spanning many data centers:

            -----
            | M |
            -----
              |
  +--- ... ---+--- ... ------------+------------ ...
  |           |                    |
 / \         / \                  / \         / \
/ I1\       / Ix\                / J1\ ----> / Jy\
-----       -----                -----       -----
  |           |                    |           |
  +-- ...     +-- ...              +-- ...     +-- ... --+
  |           |                    |           |         |
-----       -----                -----       -----     -----
| S1|       | Si|                | T1|       | Ti|     | Tn|
-----       -----                -----       -----     -----

Page 19

Binlog Server: HA, DR, and RS’

● If M fails:

            -----
            | M |  <--- Failed master
            -----
                  /--- Most up to date BLS
                 /
                /
 / \         / \                  / \         / \
/ I1\ <---- / Ix\ -------------> / J1\ ----> / Jy\
-----       -----                -----       -----
  |           |                    |           |
  +-- ...     +-- ...              +-- ...     +-- ... --+
  |           |                    |           |         |
-----       -----                -----       -----     -----
| S1|       | Si|                | T1|       | Ti|     | Tn|
-----       -----                -----       -----     -----
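Finding the most up-to-date BLS amounts to comparing binlog coordinates. The sketch below assumes each BLS reports the last (file, position) it downloaded from M and that binlog file names end in a numeric sequence; the function names and sample values are hypothetical.

```python
def binlog_coordinates(file_and_position):
    """Turn ('binlog.000042', 5300) into a sortable (42, 5300) tuple."""
    file_name, position = file_and_position
    return int(file_name.rsplit(".", 1)[1]), position


def most_up_to_date(bls_positions):
    """Name of the BLS that downloaded the most binlogs from the failed master."""
    return max(bls_positions, key=lambda name: binlog_coordinates(bls_positions[name]))


positions = {
    "I1": ("binlog.000042", 1200),
    "Ix": ("binlog.000042", 5300),
    "J1": ("binlog.000041", 99000),
    "Jy": ("binlog.000041", 5000),
}
print(most_up_to_date(positions))  # Ix
```

Once the other BLSs have been chained behind that one, all slaves converge to the same state and any of them can be promoted.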

Page 20

Binlog Server: HA, DR, and RS’’

● A primary BLS on all sites might simplify things:

            -----
            | M |
            -----
              |
  +--------------- ... ------------+------------ ...
  |                                |
 / \         / \                  / \         / \
/ I1\ ----> / Ix\                / J1\ ----> / Jy\
-----       -----                -----       -----
  |           |                    |           |
  +-- ...     +-- ...              +-- ...     +-- ... --+
  |           |                    |           |         |
-----       -----                -----       -----     -----
| S1|       | Si|                | T1|       | Ti|     | Tn|
-----       -----                -----       -----     -----

Page 21

Binlog Server: Other Use-Cases

● Better Crash-Safe Replication

  ● http://blog.booking.com/better_crash_safe_replication_for_mysql.html

● And Better Parallel Replication

Page 22

Binlog Server: Better // Replication

● What is parallel (//) replication:

  ● Transactions committing together on the master are executed in parallel on slaves

● In other words: transactions finishing at the same time on the master are started at the same time on the slave

● There is no guarantee that these transactions will complete at the same time on the slave

● What is the impact when we have Intermediary Masters?

Page 23

Binlog Server: Better // Replication’

● Four transactions on X, Y and Z:

-----
| X |
-----
  |
-----
| Y |
-----
  |
-----
| Z |
-----

● IM might stall the // replication pipeline

● To fully benefit from // replication, IM must disappear

● The Binlog Server allows exactly that


On X:
----Time---->
T1 B---C
T2 B---C
T3 B-------C
T4 B-------C

On Y:
----Time---->
B---C
B---C
B-------C
B-------C

On Z:
----Time--------->
B---C
B---C
B-------C
B-------C

Page 24

Binlog Server: Impact on Ecosystem

● MHA (and other HA tools):

  ● Parsing relay logs is not needed anymore

  ● Promoting a new master is always needed

● GTIDs:

  ● Less useful in replication topologies

  ● Still useful in Group Communication solutions

● Binlog Tailers + Semi-Sync Replication:

  ● s/Binlog Tailers/Binlog Servers/

Page 25

Binlog Server: Impact on Ecosystem’

● Intermediary Master:

  ● GTIDs only solve one of its problems

  ● Even with GTIDs, there is still lag/delay

● Without GTIDs:

  ● SPOF for all its slaves

  ● Or performance killer if we deploy HA with shared disk

● Replicating through an IM looks wrong → log-slave-updates might become less useful

● The Binlog Servers should work with any version of MySQL (5.7, 5.6, 5.5 and 5.1) or MariaDB (10.1, 10.0, 5.5, 5.3, 5.2 and 5.1).

Page 26

Binlog Server: Links

● http://blog.booking.com/

● http://blog.booking.com/mysql_slave_scaling_and_more.html

● http://blog.booking.com/better_crash_safe_replication_for_mysql.html

● http://blog.booking.com/better_parrallel_replication_for_mysql.html

● https://workingatbooking.com/

● https://mariadb.com/blog/mariadb-replication-maxscale-and-need-binlog-server

● https://mariadb.com/blog/maxscale-proxy-mysql-replication-relay

● https://mariadb.com/blog/maxscale-proxy-replication-relay-part-2-slave-side

Soon: the MaxScale Binlog Server Plugin.

Page 27

Questions

Jean-François Gagné

jeanfrancois DOT gagne AT booking.com