Easy PostgreSQL Clustering with Patroni

Ants Aasma, 21.02.2017

Introduction

About me

- Support engineer at Cybertec
- Helping others run PostgreSQL for 5 years
- Helping myself run PostgreSQL since the 7.4 days

What are we going to talk about

- Patroni: a tool to build high-availability clusters with PostgreSQL
  - https://github.com/zalando/patroni
- Questions welcome during the talk

PostgreSQL clustering state of the art

- pgpool2, Pacemaker, repmgr, ...

What Patroni does

Parts of the HA problem

- Detect failure
- Promote new master
- Route clients to the correct master

Why is HA tricky

- PostgreSQL provides single-master replication
- Having more than one master is worse than having none
- Need to agree on who gets to be master
  - When servers fail
  - When networks fail
  - When things almost work but not quite

Distributed databases to the rescue

- Distributed consensus algorithms were invented to solve this
  - Paxos, Raft
- Many existing distributed consensus databases:
  - etcd
  - Consul
  - ZooKeeper

How Patroni solves the HA problem

- Each node runs a Patroni agent
- The Patroni agent runs a constant loop to
  - check the health of the local PostgreSQL
  - check the health of the cluster
  - fix things when they are not ideal
- A distributed consensus store is used to pick a leader
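
The loop described above can be sketched roughly as follows. This is an illustrative model, not Patroni's actual code: `InMemoryDCS`, `Node` and `run_cycle` are hypothetical names, and the consensus store is replaced by a plain in-process object.

```python
class InMemoryDCS:
    """Stand-in for etcd/Consul/ZooKeeper: holds a single leader key."""

    def __init__(self):
        self.leader = None

    def acquire_leader(self, name):
        # "Create if absent"; a real store makes this atomic cluster-wide.
        if self.leader is None:
            self.leader = name
        return self.leader == name


class Node:
    def __init__(self, name):
        self.name = name
        self.postgres_running = True
        self.role = "replica"


def run_cycle(node, dcs):
    """One iteration of the agent loop; returns the action taken."""
    # 1. Health of the local PostgreSQL.
    if not node.postgres_running:
        if dcs.leader == node.name:
            dcs.leader = None  # give up leadership so another node can take over
        node.postgres_running = True
        node.role = "replica"
        return "restarted postgres as replica"
    # 2. Health of the cluster: is there a leader, and is it me?
    if dcs.acquire_leader(node.name):
        if node.role != "master":
            node.role = "master"
            return "promoted"
        return "no action, I am the leader"
    # 3. Fix things that are not ideal: somebody else leads, so follow.
    if node.role != "replica":
        node.role = "replica"
        return "demoted"
    return "no action, following " + dcs.leader
```

Calling `run_cycle` for two nodes in turn promotes the first and leaves the second following it; marking the leader's PostgreSQL as down lets the other node win the key on its next cycle.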

How Patroni works

- Picks one node to initialize the database
- Clones new nodes joining the cluster using pg_basebackup
- Monitors health of PostgreSQL
- Promotes a new master if the existing master fails
- Sets primary_conninfo on other standbys
- Rejoins old master using pg_rewind
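
As a rough illustration of the clone and rejoin steps, the sketch below builds the command lines and standby settings involved. The helper names are hypothetical and the exact options Patroni passes are not stated in the talk; the flags shown are standard `pg_basebackup` and `pg_rewind` options, and the settings are the usual `recovery.conf` entries for PostgreSQL versions of that era.

```python
def clone_command(data_dir, leader_conninfo):
    """argv for cloning a new replica from the current leader."""
    return ["pg_basebackup", "-D", data_dir, "-d", leader_conninfo, "-X", "stream"]


def rewind_command(data_dir, leader_conninfo):
    """argv for resynchronising a former master with the new leader."""
    return [
        "pg_rewind",
        "--target-pgdata=" + data_dir,
        "--source-server=" + leader_conninfo,
    ]


def recovery_settings(leader_conninfo):
    """Settings that make a node stream from the given leader."""
    return {
        "standby_mode": "on",
        "primary_conninfo": leader_conninfo,
        "recovery_target_timeline": "latest",
    }
```

The returned lists can be passed to `subprocess.run` as they are; keeping command construction separate from execution makes the steps easy to inspect.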

Load balancer

- Patroni does not do client routing or virtual IP movement
- libpq expects to see a single host to connect to
  - This will be fixed in PostgreSQL 10
  - JDBC already supports this directly
- Use your favourite load balancer to perform the connection forwarding
  - HAProxy/nginx + VIP failover, or run on app servers and connect to localhost
  - F5 BigIP, etc.
  - Your cloud provider's load balancer
  - Customize your connection pooler
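
Whatever load balancer is used, the routing decision boils down to a health check per node. The sketch below assumes that each node exposes an HTTP health-check URL that answers 200 only on the current master (an assumption about the setup, not something stated on this slide); `pick_master` and `http_probe` are hypothetical helper names.

```python
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return the HTTP status of a health-check URL, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 503 from a node that is not the master
    except OSError:
        return None  # connection refused, timeout, DNS failure, ...


def pick_master(nodes, probe=http_probe):
    """Return the only node whose check answers 200, else None."""
    healthy = [node for node in nodes if probe(node["api"]) == 200]
    # Refuse to route if zero or several nodes claim to be master.
    return healthy[0] if len(healthy) == 1 else None
```

Returning `None` when more than one node answers 200 follows the earlier point that having more than one master is worse than having none.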

Patroni architecture

Figure 1: Patroni overview

Demo time

Local etcd for testing

    # Download
    curl -sL https://github.com/coreos/etcd/releases/download/v3.1.1/etcd-v3.1.1-linux-amd64.tar.gz | tar xz

    # And run
    etcd-v3.1.1-linux-amd64/etcd

Setting up Patroni

    # Install Patroni
    virtualenv --quiet patroni-venv && source patroni-venv/bin/activate
    pip install patroni

    # Sample config
    wget https://github.com/zalando/patroni/raw/master/postgres0.yml
    vim postgres0.yml

    # Ready to go
    patroni postgres0.yml

Second node

    # Change node name, datadir name and ports
    sed -e 's/postgresql0/postgresql1/' \
        -e 's/:5432/:5433/' \
        -e 's/:8008/:8009/' postgres0.yml > postgres1.yml

    patroni postgres1.yml

Setting up a load balancer

    # Get the system HAProxy package installed
    sudo apt-get/yum/... install haproxy

    # Get confd (-L follows the redirect that GitHub release downloads use)
    curl -LO https://github.com/kelseyhightower/confd/releases/download/v0.11.0/confd-0.11.0-linux-amd64
    chmod a+x confd-0.11.0-linux-amd64 && ln -s confd-0.11.0-linux-amd64 confd

    # Get Patroni confd config examples
    curl -sL https://github.com/zalando/patroni/archive/v1.2.3.tar.gz | tar xz

    # Adjust HAProxy to run locally
    sed -i 's#/etc/haproxy' patroni-1.2.3/extras/confd/conf.d/haproxy.toml
    sed -i 's#/var/run/#haproxy/#' patroni-1.2.3/extras/confd/conf.d/haproxy.toml

Setting up a load balancer continued

    # Run confd
    ./confd -prefix=/service/batman -backend etcd \
        -node http://localhost:2379/ \
        -interval 10 \
        -confdir patroni-1.2.3/extras/confd/

Wrapping up

How we avoid split brain

- All nodes try to acquire the leader key in the DCS (distributed configuration store)
- The leader key has a timeout; the master runs a loop that keeps updating the leader key
- If the DCS gives an error, PostgreSQL gets demoted
- If DCS access times out, PostgreSQL gets demoted
- If we discover that the leader key has timed out, PostgreSQL gets demoted
- When there is no leader, other nodes try to contact the previous leader
- If Patroni is not responding, the load balancer removes that node from rotation
- Future versions will have kernel watchdog support
  - If Patroni does not get to run, the whole OS gets rebooted
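
The leader-key rules above can be modelled with a key that expires unless its holder keeps refreshing it. This is a minimal sketch, not Patroni's implementation: `LeaderKey` and `master_step` are hypothetical names, and time is passed in explicitly so the behaviour is deterministic.

```python
class LeaderKey:
    """A key that expires unless its holder keeps refreshing it."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.holder = None
        self.expires_at = 0.0

    def _expire(self, now):
        if self.holder is not None and now >= self.expires_at:
            self.holder = None

    def acquire(self, name, now):
        """Take the key if it is free; report whether `name` holds it."""
        self._expire(now)
        if self.holder is None:
            self.holder = name
            self.expires_at = now + self.ttl
        return self.holder == name

    def refresh(self, name, now):
        """Extend the timeout; fails if the key expired or belongs to another node."""
        self._expire(now)
        if self.holder != name:
            return False
        self.expires_at = now + self.ttl
        return True


def master_step(key, name, now, dcs_ok=True):
    """One iteration of the master's loop: 'master' to carry on, else 'demote'."""
    if not dcs_ok:  # error or timeout while talking to the DCS
        return "demote"
    if not key.refresh(name, now):  # the leader key has timed out
        return "demote"
    return "master"
```

With a 30-second timeout, a master that refreshes at t=20 stays master, while one that next gets to run at t=60 finds its key expired, demotes itself, and another node can acquire the key.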

Leader elections

- If there is no leader key in the cluster, the remaining nodes
  - check whether the old leader is still responding
  - contact all other members; the ones with the most xlog get to participate in the leader race
  - check whether they are too far behind the last known master xlog position
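
The three checks can be combined into one eligibility function. This is a sketch under assumptions: `race_candidates` and its parameters (including the `maximum_lag` threshold in bytes) are hypothetical names chosen for illustration, and xlog positions are simplified to plain integers.

```python
def race_candidates(xlog_positions, last_master_xlog, maximum_lag,
                    old_leader_alive=False):
    """Return the members allowed to race for the leader key.

    xlog_positions maps each reachable member name to its xlog position.
    """
    # If the old leader still responds, nobody should take over.
    if old_leader_alive or not xlog_positions:
        return []
    best = max(xlog_positions.values())
    # Even the most advanced member is too far behind the old master.
    if last_master_xlog - best > maximum_lag:
        return []
    # Only the members with the most xlog may participate.
    return sorted(name for name, pos in xlog_positions.items() if pos == best)
```

The race itself is then decided by whichever candidate manages to create the leader key first.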

That’s all

Thank you

- Questions?
- If you need professional support, contact us: info@cybertec.at

Extra content

Configuration

- Configuration is merged from the cluster configuration stored in etcd and the local configuration given as a parameter
- Patroni does configuration management for PostgreSQL
- You can update the Patroni configuration through the REST API

    curl -XPATCH \
        -d '{"postgresql": {"parameters": {"work_mem": "32MB"}}}' \
        http://localhost:8008/config

- Patroni updates the PostgreSQL configuration on all nodes and issues SIGHUP
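
The merging of cluster-wide and local configuration can be pictured as a recursive dictionary merge. The sketch below is illustrative only: `deep_merge` is a hypothetical helper, the sample values are made up, and it assumes that local values win on conflict.

```python
import copy


def deep_merge(base, override):
    """Return a new dict where override's values win and nested dicts are merged."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Cluster-wide configuration as it might be stored in etcd (example values).
cluster = {"postgresql": {"parameters": {"work_mem": "4MB", "max_connections": 100}}}

# The PATCH request from the slide changes one parameter and keeps the rest.
patch = {"postgresql": {"parameters": {"work_mem": "32MB"}}}
cluster = deep_merge(cluster, patch)

# Node-specific settings from the local YAML file (example values).
local = {"postgresql": {"listen": "127.0.0.1:5432"}}
effective = deep_merge(cluster, local)
```

After these steps `effective` contains the patched `work_mem`, the untouched `max_connections`, and the node's own `listen` address.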

Other nifty features

- Schedule a node restart in the future
  - Optionally, only if there are config changes that require a restart
  - Optionally, only if still running a specified version
- Voluntary restart runs a checkpoint before shutdown
- Run a manual failover
  - Schedule a failover in the future

Nifty features, continued.

- Clone new nodes from a special node instead of the master
- Clone new nodes using a backup
- Turn off cluster management to do whatever
