postgresql replication tutorial

PostgreSQL: Understanding replication

Hans-Jürgen Schönig

www.postgresql-support.de

Welcome to PostgreSQL replication

What you will learn

- How PostgreSQL writes data
- What the transaction log does
- How to set up streaming replication
- How to handle Point-In-Time-Recovery
- Managing conflicts
- Monitoring replication
- More advanced techniques

How PostgreSQL writes data

Writing a row of data

- Understanding how PostgreSQL writes data is key to understanding replication
- Vital to understand PITR
- A lot of potential to tune the system

Write the log first (1)

- It is not possible to send data to a data table directly.
- What if the system crashes during a write?
- A data file could end up with broken data at potentially unknown positions
- Corruption is not an option

Write the log first (2)

- Data goes to the xlog (= WAL) first
- WAL is short for “Write Ahead Log”
- IMPORTANT: The xlog DOES NOT contain SQL
- It contains BINARY changes

The xlog

- The xlog consists of a set of 16 MB files
- The xlog consists of various types of records (heap changes, btree changes, etc.)
- It has to be flushed to disk on commit to achieve durability
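To make the "set of 16 MB files" more tangible, here is a minimal sketch that maps an xlog position to the segment file containing it. It assumes the default 16 MB segment size and the file naming used since PostgreSQL 9.3; the helper names parse_lsn and wal_file_name are made up for this illustration.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default segment size of 16 MB
SEGMENTS_PER_XLOG_ID = 0x100000000 // WAL_SEGMENT_SIZE  # 256 with 16 MB segments


def parse_lsn(text: str) -> int:
    """Convert an xlog position such as '0/16B3740' into a byte offset."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_file_name(timeline: int, lsn: int) -> str:
    """Return the name of the segment file that contains the given position."""
    segment_number = lsn // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (
        timeline,
        segment_number // SEGMENTS_PER_XLOG_ID,
        segment_number % SEGMENTS_PER_XLOG_ID,
    )


print(wal_file_name(1, parse_lsn("0/16B3740")))  # 000000010000000000000001
```

The first 8 hex digits of the file name are the timeline ID, which becomes relevant when a slave is promoted.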

Expert tip: Debugging the xlog

- Change WAL_DEBUG in src/include/pg_config_manual.h
- Recompile PostgreSQL

NOTE: This is not for normal use but just for training purposes

Enabling wal_debug

test=# SET wal_debug TO on;

test=# SET client_min_messages TO debug;

Observing changes

- Every change will go to the screen now
- It helps to understand how PostgreSQL works
- Apart from debugging: The practical purpose is limited

Making changes

- Data goes to the xlog first
- Then data is put into shared buffers
- At some later point data is written to the data files
- This does not happen instantly, leaving a lot of room for optimization and tuning

A consistent view of the data

- Data is not sent to those data files instantly.
- Still: End users will have a consistent view of the data
- When a query comes in, it checks the I/O cache (= shared buffers) and asks the data files only in case of a cache miss.
- The xlog is about the physical level, not about the logical level

Sustaining writes

- We cannot write to the xlog forever without recycling it.
- The xlog is recycled during a so-called “checkpoint”.
- Before the xlog can be recycled, data must be stored safely in those data files
- Checkpoints have a huge impact on performance
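The write path described on the last few slides can be sketched as a toy model. This is only an illustration of the order of events (log first, then cache, data files later, log recycled at a checkpoint). The class ToyDatabase and its key/value records are invented for this sketch; the real xlog contains binary page changes, not key/value pairs.

```python
class ToyDatabase:
    """Toy model of log-first writes; not how PostgreSQL stores data."""

    def __init__(self):
        self.xlog = []            # durable: changes since the last checkpoint
        self.data_files = {}      # durable: contents of the data files
        self.shared_buffers = {}  # volatile: lost when the system crashes

    def write(self, key, value):
        self.xlog.append((key, value))    # 1. the change goes to the log first
        self.shared_buffers[key] = value  # 2. then into the cache

    def read(self, key):
        # Check the cache first; ask the data files only on a cache miss.
        if key in self.shared_buffers:
            return self.shared_buffers[key]
        return self.data_files.get(key)

    def checkpoint(self):
        # Store the data safely in the data files, then recycle the log.
        self.data_files.update(self.shared_buffers)
        self.xlog.clear()

    def crash_and_recover(self):
        self.shared_buffers = {}      # the cache is gone after a crash
        for key, value in self.xlog:  # replay the log to repair the state
            self.shared_buffers[key] = value


db = ToyDatabase()
db.write("a", 1)
db.checkpoint()
db.write("b", 2)
db.crash_and_recover()
print(db.read("a"), db.read("b"))  # 1 2
```

The value "a" survives because the checkpoint stored it in the data files, and "b" survives because it is still in the log and gets replayed.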

Checkpoint parameters:

- checkpoint_timeout = 15min
- max_wal_size = 5GB
- min_wal_size = 800MB
- checkpoint_completion_target = 0.5
- checkpoint_warning = 30s

Checkpointing too frequently

- Checkpointing is expensive
- PostgreSQL warns about too frequent checkpoints
- This is what checkpoint_warning is good for

min_wal_size and max_wal_size (1)

- This is a replacement for checkpoint_segments
- Now the xlog size is auto-tuned
- The new configuration was introduced in PostgreSQL 9.5

Instead of having a single knob (checkpoint_segments) that both triggers checkpoints and determines how many segments to recycle, they are now separate concerns. There is still an internal variable called CheckpointSegments, which triggers checkpoints. But it no longer determines how many segments to recycle at a checkpoint. That is now auto-tuned by keeping a moving average of the distance between checkpoints (in bytes), and trying to keep that many segments in reserve.

The advantage of this is that you can set max_wal_size very high, but the system won’t actually consume that much space if there isn’t any need for it. The min_wal_size sets a floor for that; you can effectively disable the auto-tuning behavior by setting min_wal_size equal to max_wal_size.
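The auto-tuning idea can be illustrated with a small sketch. The smoothing weights (0.90 / 0.10) and the simple clamping between min_wal_size and max_wal_size are assumptions made for this example; the real server applies additional factors. All sizes are in MB and the function names are invented.

```python
def update_distance_estimate(estimate: float, distance: float) -> float:
    """Moving average of the xlog volume written between two checkpoints."""
    if distance > estimate:
        return distance  # react immediately when the write load grows
    return 0.90 * estimate + 0.10 * distance  # decay slowly when it shrinks


def xlog_to_keep(estimate: float, min_wal_size: float, max_wal_size: float) -> float:
    """Keep the estimate in reserve, but stay between the floor and the ceiling."""
    return max(min_wal_size, min(estimate, max_wal_size))


estimate = 0.0
for distance in (1000, 400, 400):  # MB written between successive checkpoints
    estimate = update_distance_estimate(estimate, distance)

print(round(estimate))                     # 886
print(xlog_to_keep(estimate, 800, 5120))   # stays above the 800 MB floor
print(xlog_to_keep(100, 800, 5120))        # 800: the floor wins
```

Setting both limits to the same value makes xlog_to_keep return a constant, which mirrors the remark about disabling the auto-tuning.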

How does it impact replication?

- The xlog has all the changes needed and can therefore be used for replication.
- Copying data files is not enough to achieve a consistent view of the data
- It has some implications related to base backups

Setting up streaming replication

The basic process

- S: Install PostgreSQL on the slave (no initdb)
- M: Adapt postgresql.conf
- M: Adapt pg_hba.conf
- M: Restart PostgreSQL
- S: Pull a base backup
- S: Start the slave

Changing postgresql.conf

- wal_level: Ensure that there is enough xlog generated by the master (recovering a server needs more xlog than just simple crash-safety)
- max_wal_senders: A streaming slave connects to the master and fetches xlog through a wal sender. A base backup will also need 1 or 2 wal senders
- hot_standby: This is not needed on the master, where it is ignored, but setting it now saves some work on the slave later on

Changing pg_hba.conf

- Rules for replication have to be added.
- Note that “all” databases does not include replication
- A separate rule has to be added, which explicitly states “replication” in the second column
- Replication rules work just like any other pg_hba.conf rule
- Remember: The first matching line wins
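The two rules of thumb above (first match wins, “all” does not cover replication) can be modeled in a few lines. This is a simplified sketch, not the server's matching code: it ignores the connection type column and many pg_hba.conf features, and the networks and the user name repl are hypothetical.

```python
import ipaddress

# (database, user, address, method): a simplified subset of the pg_hba.conf columns
RULES = [
    ("all", "all", "192.168.0.0/24", "md5"),
    ("replication", "repl", "192.168.0.0/24", "md5"),
    ("all", "all", "0.0.0.0/0", "reject"),
]


def database_matches(rule_db: str, requested_db: str) -> bool:
    # In this model "replication" stands for a streaming replication connection.
    if requested_db == "replication":
        return rule_db == "replication"  # "all" does not cover replication
    return rule_db in ("all", requested_db)


def first_matching_rule(rules, database, user, client_ip):
    """Return the first rule that matches; later rules are never consulted."""
    ip = ipaddress.ip_address(client_ip)
    for rule_db, rule_user, network, method in rules:
        if (database_matches(rule_db, database)
                and rule_user in ("all", user)
                and ip in ipaddress.ip_network(network)):
            return (rule_db, rule_user, network, method)
    return None


print(first_matching_rule(RULES, "test", "hans", "192.168.0.10"))
print(first_matching_rule(RULES, "replication", "repl", "192.168.0.10"))
```

The replication connection skips the first line even though it says “all” and is only accepted by the explicit replication rule.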

Restarting PostgreSQL

- To activate those settings in postgresql.conf the master has to be restarted.
- If only pg_hba.conf is changed, a simple SIGHUP (pg_ctl reload) is enough.

Using pg_basebackup (1)

- pg_basebackup will fetch a copy of the data from the master
- While pg_basebackup is running, the master is fully operational (no downtime needed)
- pg_basebackup connects through a database connection and copies all data files as they are
- In most cases this does not create a consistent backup
- The xlog is needed to “repair” the base backup (this is exactly what happens during xlog replay anyway)

Using pg_basebackup (2)

pg_basebackup -h master.com -D /slave \
    --xlog-method=stream --checkpoint=fast -R

xlog-method: Self-contained backups

- By default a base backup is not self-contained.
- The database does not start up without additional xlog.
- This is fine for Point-In-Time-Recovery because there is an archive around.
- For streaming it can be a problem.
- --xlog-method=stream opens a second connection to fetch xlog during the base backup

checkpoint=fast: Instant backups

- By default pg_basebackup starts as soon as the master checkpoints.
- This can take a while.
- --checkpoint=fast makes the master checkpoint instantly.
- In case of a small backup an instant checkpoint speeds things up.

-R: Generating a config file

- For a simple streaming setup all PostgreSQL has to know is already passed to pg_basebackup (host, port, etc.).
- -R automatically generates a recovery.conf file, which is quite ok in most cases.

Backup throttling

- --max-rate=RATE: maximum transfer rate to transfer data directory (in kB/s, or use suffix “k” or “M”)
- If your master is weak, a pg_basebackup running at full speed can lead to high response times and disk wait.
- Slowing down the backup can help to make sure the master stays responsive.

Adjusting recovery.conf

- A basic setup needs:
  - primary_conninfo: A connect string pointing to the master server
  - standby_mode = on: Tells the system to stream instantly
- Additional configuration parameters are available
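To show what such a basic file looks like, the following sketch renders the two settings named above. It applies to the recovery.conf mechanism covered on these slides (PostgreSQL versions before 12). The helper names, the port and the user repl are assumptions for this example; master.com is the host used in the pg_basebackup call earlier.

```python
def quote(value: str) -> str:
    """Quote a value for recovery.conf; embedded single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def build_recovery_conf(host: str, port: int, user: str) -> str:
    """Render a minimal recovery.conf for a streaming slave."""
    conninfo = f"host={host} port={port} user={user}"
    lines = [
        "standby_mode = " + quote("on"),
        "primary_conninfo = " + quote(conninfo),
    ]
    return "\n".join(lines) + "\n"


print(build_recovery_conf("master.com", 5432, "repl"), end="")
# standby_mode = 'on'
# primary_conninfo = 'host=master.com port=5432 user=repl'
```

In practice pg_basebackup -R writes an equivalent file, so a generator like this is mainly useful for understanding or templating the result.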

Starting up the slave

- Make sure the slave has connected to the master
- Make sure it has reached a consistent state
- Check for wal sender and wal receiver processes

Promoting a slave to master

- Promoting a slave to a master is easy:

pg_ctl -D ... promote

- After promotion recovery.conf will be renamed to recovery.done

One word about security

- So far replication has been done as superuser
- This is not necessary
- Creating a user that can do just replication makes sense

CREATE ROLE foo ... REPLICATION ... NOSUPERUSER;

Monitoring replication

Simple checks

- The most basic and most simplistic check is to check for
  - wal sender (on the master)
  - wal receiver (on the slave)
- Without those processes the party is over

More detailed analysis

- pg_stat_replication contains a lot of information
- Make sure an entry for each slave is there
- Check for replication lag

Checking for replication lag

- A sustained lag is not a good idea.
- The distance between the sender and the receiver can be measured in bytes

SELECT client_addr,
       pg_xlog_location_diff(pg_current_xlog_location(),
                             sent_location)
FROM   pg_stat_replication;

- In asynchronous replication the replication lag can vary dramatically (for example during CREATE INDEX, etc.)
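The byte distance reported by the query above is plain arithmetic on xlog positions. The sketch below redoes that calculation on the client side, which can be handy in a monitoring script that already has the two positions as text. The function names are invented for this example.

```python
def parse_lsn(text: str) -> int:
    """Convert an xlog position such as '0/3000060' into a byte offset."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_in_bytes(current_location: str, sent_location: str) -> int:
    """Bytes of xlog written on the master but not yet sent to the slave."""
    return parse_lsn(current_location) - parse_lsn(sent_location)


print(lag_in_bytes("0/3000060", "0/3000000"))  # 96
```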

Creating large clusters

Handling more than 2 nodes

- A simple 2 node cluster is easy.
- In case of more than 2 servers, life is a bit harder.
- If you have two slaves and the master fails: Who is going to be the new master?
- Unless you want to resync all your data, you should elect the server that has already received the most data
- Comparing xlog positions is necessary
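Comparing xlog positions has one pitfall: they are hexadecimal values in text form, so comparing them as strings gives wrong answers. A minimal sketch of the election step, assuming the positions were collected from each slave beforehand (for example with pg_last_xlog_receive_location()); the slave names and the function most_advanced are hypothetical.

```python
def parse_lsn(text: str) -> int:
    """Convert an xlog position such as '0/50002A8' into a byte offset."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def most_advanced(positions: dict) -> str:
    """Return the name of the slave with the highest xlog position."""
    return max(positions, key=lambda name: parse_lsn(positions[name]))


positions = {"slave1": "0/5000140", "slave2": "0/50002A8"}
print(most_advanced(positions))  # slave2
```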

Timeline issues

- When a slave is promoted the timeline ID is incremented
- Master and slave have to be in the same timeline
- With two remaining servers, it is important to connect one server to the other first and do the promotion AFTERWARDS.
- This ensures that the timeline switch is already replicated from the new master to the surviving slave.

Cascading slaves

- Slaves can be connected to slaves
- Cascading can make sense to reduce bandwidth requirements
- Cascading can take load from the master
- Use pg_basebackup to fetch data from a slave as if it were a master

Conflicts

How conflicts happen

- During replication conflicts can happen
- Example: The master might want to remove a row still visible to a reading transaction on the slave

What happens during a conflict

- PostgreSQL will terminate a database connection after some time
  - max_standby_archive_delay = 30s
  - max_standby_streaming_delay = 30s
- Those settings define the maximum time the slave holds back replay for a conflicting query before replay is resumed.
- In rare cases a connection might be aborted quite soon.

Reducing conflicts

- Conflicts can be reduced nicely by setting hot_standby_feedback.
- The slave will send its oldest transaction ID to tell the master that cleanup has to be deferred.

Making replication more reliable

What happens if a slave reboots?

- If a slave is gone for too long, the master might recycle its transaction log
- The slave needs a full history of the xlog
- Setting wal_keep_segments on the master helps to prevent the master from recycling transaction log too early
- I recommend always using wal_keep_segments to make sure that a slave can be started after a pg_basebackup

Making use of replication slots

- Replication slots have been added in PostgreSQL 9.4
- There are two types of replication slots:
  - Physical replication slots (for streaming)
  - Logical replication slots (for logical decoding)

Configuring replication slots

- Change max_replication_slots and restart the master
- Run ...

test=# SELECT *
       FROM pg_create_physical_replication_slot('some_name');
 slot_name | xlog_position
-----------+---------------
 some_name |
(1 row)

Tweaking the slave

- Add this replication slot to primary_slot_name on the slave:

primary_slot_name = 'some_name'

- The master will ensure that xlog is only recycled when it has been consumed by the slave.

A word of caution

- If a slave is removed, make sure the replication slot is dropped.
- Otherwise the master might run out of disk space.
- NEVER use replication slots without monitoring the size of the xlog on the sender.
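A very small monitoring check along these lines could simply add up the files in the xlog directory and compare the total with a limit. This is a sketch only: the function names are invented, the limit is whatever fits your disk, and the demonstration runs against a temporary directory standing in for the real xlog directory.

```python
import os
import tempfile


def xlog_size_bytes(xlog_dir: str) -> int:
    """Total size of the regular files directly inside the xlog directory."""
    total = 0
    with os.scandir(xlog_dir) as entries:
        for entry in entries:
            if entry.is_file():
                total += entry.stat().st_size
    return total


def xlog_too_large(xlog_dir: str, limit_bytes: int) -> bool:
    """True if the xlog directory has grown beyond the given limit."""
    return xlog_size_bytes(xlog_dir) > limit_bytes


# Demonstration on a temporary directory with two small stand-in files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("000000010000000000000001", "000000010000000000000002"):
        with open(os.path.join(demo_dir, name), "wb") as handle:
            handle.write(b"\0" * 1024)
    print(xlog_size_bytes(demo_dir))       # 2048
    print(xlog_too_large(demo_dir, 1024))  # True
```

A check like this belongs in whatever alerting system is already in place, so that a forgotten slot is noticed before the disk fills up.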

Key advantages of replication slots

- The difference between master and slave can be arbitrary.
- During bulk load or CREATE INDEX this can be essential.
- It can help to overcome the problems caused by slow networks.
- It can help to avoid resyncs.

Moving to synchronous replication

Synchronous vs. asynchronous

- Asynchronous replication: Commits on the slave can happen long after the commit on the master.
- Synchronous replication: A transaction has to be written to a second server before the commit is confirmed.
- Synchronous replication potentially adds some network latency to every commit

The application name

- During normal operations the application_name setting can be used to assign a name to a database connection.
- In case of synchronous replication this variable is used to determine synchronous candidates.

Configuring synchronous replication:

- Master:
  - add names to synchronous_standby_names
- Slave:
  - add an application_name to your connect string in primary_conninfo
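How the two settings interact can be sketched as follows: the master walks through the names in synchronous_standby_names and treats the first one that is currently connected as the synchronous standby. This is a simplified model of the 9.x behavior, not the server's code; the function name and the slave names are hypothetical.

```python
def pick_synchronous_standby(synchronous_standby_names: str, connected: list):
    """Return the application_name of the standby treated as synchronous."""
    for name in (n.strip() for n in synchronous_standby_names.split(",")):
        if name == "*" and connected:
            return connected[0]  # "*" accepts any connected standby
        if name in connected:
            return name          # first listed name that is connected
    return None                  # no candidate is connected


print(pick_synchronous_standby("slave1, slave2", ["slave2", "slave3"]))  # slave2
print(pick_synchronous_standby("slave1", ["slave3"]))                    # None
```

The second call returns None because slave3 connected under a name that is not listed, which is why the application_name in primary_conninfo has to match an entry on the master.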

Fail safety

- Synchronous replication needs 2 active servers
- If fewer than two servers are left, commits will wait until a second server is available.
- Use AT LEAST 3 servers for synchronous replication to avoid risk.

Point-In-Time-Recovery

What it does

- PITR can be used to reach (almost) any point after a base backup.
- It is more of a backup strategy than a replication thing.
- Replication and PITR can be combined.

Configuring for PITR

- S: create an archive (ideally this is not on the master)
- M: Change postgresql.conf
  - set wal_level
  - set max_wal_senders (if pg_basebackup is desired)
  - set archive_mode to on
  - set a proper archive_command to archive xlog
- M: adapt pg_hba.conf (if pg_basebackup is desired)
- M: restart the master
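What a “proper” archive_command has to do can be sketched in a few lines: copy the file, never overwrite something that is already archived, and report success only if the copy really worked. The script below is an illustration under those assumptions, not a hardened tool (for example, it does not fsync the copy); the function names are invented.

```python
import os
import shutil


def archive_wal(source_path: str, archive_dir: str) -> int:
    """Copy one xlog file into the archive; return 0 only on success."""
    target = os.path.join(archive_dir, os.path.basename(source_path))
    if os.path.exists(target):
        return 1  # never overwrite a file that is already archived
    temporary = target + ".tmp"
    shutil.copyfile(source_path, temporary)
    os.rename(temporary, target)  # the file appears under its final name in one step
    return 0


def main(argv: list) -> int:
    """Entry point: argv[1] is the file to archive, argv[2] the archive directory."""
    return archive_wal(argv[1], argv[2])
```

Saved as a script that calls sys.exit(main(sys.argv)), this could be referenced from archive_command with %p as the first argument (the path of the file to archive) and the archive directory as the second; the script location is up to you. A non-zero exit code tells PostgreSQL to keep the file and retry later.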

pg_basebackup, etc.

- Perform a pg_basebackup as shown before
- --xlog-method=stream and -R are not needed
- In the archive a .backup file will be available after pg_basebackup
- You can delete all xlog files older than the oldest base backup you want to keep.
- The .backup file will guide you
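How the .backup file guides the cleanup: its name starts with the name of the first xlog segment the base backup needs, so every segment that sorts before it is no longer required for that backup. A minimal sketch, assuming all files belong to the same timeline; the function name and the file names in the example are made up.

```python
def removable_xlog_files(archive_files: list, backup_file: str) -> list:
    """List archived xlog segments that are older than the given .backup file."""
    first_needed = backup_file[:24]  # segment in which the base backup started
    return sorted(
        name for name in archive_files
        if len(name) == 24 and name < first_needed
    )


archive = [
    "000000010000000000000001",
    "000000010000000000000002",
    "000000010000000000000003",
    "000000010000000000000003.00000028.backup",
    "000000010000000000000004",
]
print(removable_xlog_files(archive, "000000010000000000000003.00000028.backup"))
# ['000000010000000000000001', '000000010000000000000002']
```

The pg_archivecleanup tool shipped with PostgreSQL performs this kind of cleanup; the sketch only shows the comparison behind it.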

Restoring from a crash

- Take a base backup.
- Write a recovery.conf file:
  - restore_command: Tell PostgreSQL where to find xlog
  - recovery_target_time (optional): Use a timestamp to tell the system how far to recover
- Start the server
- Make sure the system has reached a consistent state

More config options

recovery_min_apply_delay: Delayed replay

- This setting allows you to tell the slave that a certain delay is desired.
- Example: A stock broker might want to provide you with 15-minute-old data

pause_at_recovery_target

- Make sure that recovery does not simply end once the specified point in time is reached.
- Make PostgreSQL wait when a certain point is reached.
- This is essential in case you do not know precisely how far to recover

recovery_target_name

- Sometimes you want to recover to a certain point in time, which has been specified before.
- To specify a point in time run ...

SELECT pg_create_restore_point('some_name');

- Use this name in recovery.conf to recover to this very specific point

Finally . . .

Contact us . . .

Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria

- More than 15 years of PostgreSQL experience:
  - Training
  - Consulting
  - 24x7 support
