etcd based postgresql ha cluster

etcd based PostgreSQL HA cluster TL;DR: github.com/compose/template-etcd-based-postgres-ha

Upload: winsletts

Post on 15-Jul-2015


TRANSCRIPT

Page 1: etcd based PostgreSQL HA Cluster

etcd based PostgreSQL HA cluster

TL;DR: github.com/compose/template-etcd-based-postgres-ha

Page 2: etcd based PostgreSQL HA Cluster

Introduction

Chris Winslett

@winsletts

compose.io

reading the top 5 comments on Imgur since 2012

Page 3: etcd based PostgreSQL HA Cluster

How we started using PostgreSQL

MongoDB was a primary datastore

launched project to understand financial metrics

required data exploration, which is brutal in MongoDB

Page 4: etcd based PostgreSQL HA Cluster

Our database product

our platform runs databases

these databases scale automatically as a customer increases data size

Page 5: etcd based PostgreSQL HA Cluster

Our database product

could we run PostgreSQL on our platform?

Page 6: etcd based PostgreSQL HA Cluster

Database operational requirements

• replicated

• highly-available

• no human interaction for failover

• minimize core-engine modifications

• customers use entire deployment

Page 7: etcd based PostgreSQL HA Cluster

Tools investigated

repmgr with pgpool II

required human interaction for failover

does not use PostgreSQL streaming

pgpool was flaky on failover

Page 8: etcd based PostgreSQL HA Cluster

Tools investigated

PostgreSQL streaming replication

no automatic failover

Page 9: etcd based PostgreSQL HA Cluster

Tools investigated

bi-directional replication, i.e. master-master

only runs on one database per cluster

requires a patch on core engine

Page 10: etcd based PostgreSQL HA Cluster

is automated failover too ambitious with PostgreSQL?

Page 11: etcd based PostgreSQL HA Cluster

Learned from tools investigation

PostgreSQL should not be the canonical store of its own state; we investigated:

serf - not consensus based

consul - runs with consensus

etcd - runs with consensus

Page 12: etcd based PostgreSQL HA Cluster

Consul

we built the prototype on Consul

using:

locking sessions

health checks

code at: https://github.com/MongoHQ/consul_ha

Page 13: etcd based PostgreSQL HA Cluster

Consul

Code at: https://github.com/MongoHQ/consul_ha

Tight coupling between:

Consul interaction and

HA decision loop

Page 14: etcd based PostgreSQL HA Cluster

Consul Diagram 1

Page 15: etcd based PostgreSQL HA Cluster

Final Consul Diagram

Page 16: etcd based PostgreSQL HA Cluster

Consul Results

amazing

automatically growing and shrinking Consul clusters

health checks to prevent unhealthy secondaries from acquiring locks

Page 17: etcd based PostgreSQL HA Cluster

Consul

until we ran into massive swap allocation.

40 GB swap allocation.

fine for prototypes, not for production.

Page 18: etcd based PostgreSQL HA Cluster

Results from Consul

HA PostgreSQL is possible

but we need a tool that uses our resources more wisely.

Page 19: etcd based PostgreSQL HA Cluster

Switch to etcd

because of what we’d learned from Consul, it took a day to have a working sample on etcd

Page 20: etcd based PostgreSQL HA Cluster

Modern etcd diagram

[Flowchart with two panels: "Start-up Process" and "Running Loop".]

Decision nodes: Connect to etcd?; Is data directory empty?; Win race to set initialization key?; Leader owns key?; Do I own leader key?; Is leader key owned?; Acquire leader lock?; Am I following the correct leader?; Am I the healthiest member?; Am I the leader?

Action nodes: Start; Initialize database; Take over lead TTL key; Start PostgreSQL as a leaderless Secondary; pg_basebackup from leader; Start Postgres; Update leader TTL lock; Promote to leader; follow proper leader; wait 5 seconds; Wait 30 seconds
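
One pass of the running loop might look roughly like the sketch below. It only uses the questions named in the diagram (is the leader key owned, do I own it, am I the healthiest member, am I following the correct leader). The order of the checks, the `View` fields, and the action names are assumptions made for illustration, not the code from the repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class View:
    """Snapshot of what one node sees; all field names are hypothetical."""
    me: str                    # this node's name
    leader_key: Optional[str]  # current value of the leader key, None if unowned
    following: Optional[str]   # node this one replicates from, None if it leads
    healthiest: bool           # result of the "Am I the healthiest member?" check


def next_action(v: View) -> str:
    """Return the action for one pass of the running loop (assumed ordering)."""
    if v.leader_key is None:
        # Nobody owns the leader key: only the healthiest member tries the lock.
        return "acquire_lock_and_promote" if v.healthiest else "wait"
    if v.leader_key == v.me:
        # This node owns the key: refresh the TTL so it stays leader.
        return "update_leader_ttl"
    if v.following != v.leader_key:
        # Some other node leads, and this node is not replicating from it.
        return "follow_proper_leader"
    return "wait"
```

Keeping the decision as a pure function over a snapshot is one way to keep the HA logic separate from the etcd and PostgreSQL calls.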

Page 21: etcd based PostgreSQL HA Cluster

etcd features used

consensus

recursive

ttl

prevValue

prevExist

https://coreos.com/docs/distributed-configuration/etcd-api/

Page 22: etcd based PostgreSQL HA Cluster

etcd: recursive

used to find all members known to a cluster
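
As a rough illustration, the member lookup could be a single recursive GET against the etcd v2 keys API. The endpoint address and the `/service/<scope>/members` key layout are assumptions for this sketch; `recursive=true` is the v2 query parameter.

```python
import json
from urllib.request import Request

ETCD = "http://127.0.0.1:2379"  # assumed local etcd endpoint


def members_request(scope):
    """Build a recursive GET for everything under the members directory."""
    # Hypothetical key layout: /service/<scope>/members/<node-name>
    url = f"{ETCD}/v2/keys/service/{scope}/members?recursive=true"
    return Request(url, method="GET")


def parse_members(body):
    """Map key -> value for each non-directory child in an etcd v2 response."""
    nodes = json.loads(body)["node"].get("nodes", [])
    return {n["key"]: n["value"] for n in nodes if not n.get("dir")}
```

The request can be sent with `urllib.request.urlopen(members_request("demo"))` and the response body passed to `parse_members`.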

Page 23: etcd based PostgreSQL HA Cluster

etcd: ttl

used with our keep alive from a PostgreSQL runner

Page 24: etcd based PostgreSQL HA Cluster

etcd: prevValue

used in conjunction with TTL to ensure the leader remains the leader when updating the TTL
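
A minimal sketch of that keep-alive, assuming the etcd v2 keys API: the leader rewrites its own key with a fresh `ttl`, and `prevValue` makes the write conditional on the key still holding the leader's name. The endpoint, the `/service/<scope>/leader` key layout, and the 30-second TTL are assumptions for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

ETCD = "http://127.0.0.1:2379"  # assumed local etcd endpoint


def refresh_leader_request(scope, my_name, ttl=30):
    """Build a compare-and-swap PUT that renews the leader key's TTL.

    prevValue tells etcd to apply the write only if the key still holds
    my_name, so a node that has lost leadership cannot renew the key.
    """
    # Hypothetical key layout: /service/<scope>/leader
    query = urlencode({"prevValue": my_name})
    url = f"{ETCD}/v2/keys/service/{scope}/leader?{query}"
    body = urlencode({"value": my_name, "ttl": ttl}).encode()
    return Request(url, data=body, method="PUT")
```

When sent with `urllib.request.urlopen`, a failed comparison is rejected by etcd v2 with a "Compare failed" error, which `urlopen` surfaces as an `HTTPError`. The caller can treat that as "I am no longer the leader".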

Page 25: etcd based PostgreSQL HA Cluster

etcd: prevExist

used to create a deployment initialization race
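
The initialization race can be sketched as a create-only write: every new node tries to PUT the same key with `prevExist=false`, and only the first write succeeds. The endpoint and the `/service/<scope>/initialize` key name are assumptions for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

ETCD = "http://127.0.0.1:2379"  # assumed local etcd endpoint


def init_race_request(scope, my_name):
    """Build a PUT that succeeds only if the initialization key does not exist."""
    # Hypothetical key layout: /service/<scope>/initialize
    url = f"{ETCD}/v2/keys/service/{scope}/initialize?prevExist=false"
    body = urlencode({"value": my_name}).encode()
    return Request(url, data=body, method="PUT")
```

The node whose request succeeds initializes the database. Every other node gets an error from etcd and starts as a secondary instead.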

Page 26: etcd based PostgreSQL HA Cluster

Improved with etcd

removed tight coupling in classes:

HA decision process

etcd state interaction

PostgreSQL handler

Page 27: etcd based PostgreSQL HA Cluster

Issues with etcd

overly aggressive about consensus

instructions for optimization at https://coreos.com/docs/cluster-management/debugging/etcd-tuning/

Page 28: etcd based PostgreSQL HA Cluster

Issues with etcd

overly aggressive about consensus

we quit running etcd alongside PostgreSQL because we wanted expanding PostgreSQL clusters

Page 29: etcd based PostgreSQL HA Cluster

Time for live demo?