PostgreSQL Hangout: Replication Features in v9.4
DESCRIPTION
- See the new enhancements in v9.4 which take away the pain of guessing the right wal_keep_segments
- See the new time-lagging replication capability in v9.4
- Short intro to logical replication introduced in v9.4
TRANSCRIPT
Promising New Replication Features in
PostgreSQL v9.4
A quick recap!
Earlier we saw:
- What is Streaming Replication
- How to set up Streaming Replication in v9.3
- How v9.3 enhancements made switchover and switchback easier
- How v9.3 enhancements ease the setup of Replication using pg_basebackup
What are we going to do today
- See the new enhancements in v9.4 which take away the pain of guessing the right wal_keep_segments
- See the new time-lagging replication capability in v9.4
- Short intro to logical replication introduced in v9.4
Parameter Changes in v9.4
- New recovery.conf parameters:
  - primary_slot_name
  - recovery_min_apply_delay
- New postgresql.conf parameter:
  - max_replication_slots
- New parameter value in postgresql.conf:
  - wal_level can now be set to logical
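A minimal sketch of where these parameters live (the slot name and delay value are illustrative, taken from the demo later in this session):

```
# postgresql.conf (primary)
wal_level = logical           # new value in v9.4; hot_standby suffices for physical replication
max_replication_slots = 2     # new parameter in v9.4

# recovery.conf (secondary)
primary_slot_name = 'testingv94'    # new parameter in v9.4
recovery_min_apply_delay = '1min'   # new parameter in v9.4
```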
Replication Slot
How DBAs do it today
- Guess a proper value for wal_keep_segments based on transaction volume
- Keep monitoring the transaction rate
- Increase wal_keep_segments proactively
What if you ‘guessed’ a wrong value?
- A smaller value means replication may go out of sync
  - Need to rebuild the secondary node from a base backup of the Primary node
  - Set up the replication again
  - Guess the wal_keep_segments value again
- A larger value means you might be wasting storage space
- Rebuild replication if the secondary server goes down
- Archiving WALs to avoid these issues = more storage
How is that going to change
- Create a replication slot on the primary server
- Add it in recovery.conf on the secondary server
- The primary server will keep WAL files until the server using the replication slot has received them
- No guesswork!
- If the secondary server goes down, pending WALs are still kept on the Primary Server
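The steps above can be sketched as follows (the slot name 'testingv94' is the one used in the demo; run this on the primary):

```sql
-- On the primary: create a physical replication slot
SELECT * FROM pg_create_physical_replication_slot('testingv94');

-- Verify the slot and the oldest WAL position it is holding back
SELECT slot_name, slot_type, active, restart_lsn
FROM pg_replication_slots;
```

On the secondary, reference the slot with primary_slot_name = 'testingv94' in recovery.conf.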
Caveats
- If the secondary server goes down for a long time:
  - WAL files will continue to accumulate on the primary server
  - The replication slot needs to be dropped manually in such cases
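A sketch of how to watch for and clean up a stale slot on the primary (the query and threshold you alert on are up to you; function names are the v9.4 ones):

```sql
-- On the primary: how much WAL is each slot holding back?
SELECT slot_name,
       pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) AS retained_bytes
FROM pg_replication_slots;

-- Drop a slot whose consumer is gone for good
SELECT pg_drop_replication_slot('testingv94');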
Time-lagging DR
Why would you need it?
- Ever tried Point-in-Time recovery?
  - Stop the Production Database
  - Restore from a backup
  - Reapply the transaction log/archived WAL since the backup
  - Stop at the time just before the application issue/bug introduced data inconsistency/corruption
- What if the backup size is huge?
- What if there are too many archived WALs to be applied?
- Higher Recovery Time = Higher Down Time = Loss of business
Set up a time-lagging DR
- Set up a time-lagging DR in PostgreSQL v9.4 with an acceptable amount of time-lag. Let’s say 2 hours
- If there is a need for Point-in-Time recovery:
  - Stop the Primary server
  - Apply only the pending WALs (not since the last backup, but only 2 hours’ worth)
  - Stop recovery before the point of corruption
  - Promote the secondary server to be primary
  - Change the connection configuration
- Less time taken to bring up the Server = Reduced Loss of Business
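A hedged sketch of the secondary’s recovery.conf for such a setup (host, port, and slot name are the demo values from later in this session; the 2-hour delay matches the example above):

```
# recovery.conf on the time-lagging secondary
standby_mode = on
primary_conninfo = 'host=127.0.0.1 port=5532 user=postgres'
primary_slot_name = 'testingv94'
recovery_min_apply_delay = '2h'   # stay two hours behind the primary
```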
Backdated Reporting and Time-travel Queries
- Do correlation/comparative queries to check profit margin as compared to yesterday
- Pull data from the Primary Server and a Secondary Server lagging by a day
- Pull reports from yesterday’s database
- Pause recovery on the secondary and pull reports
- Reduces the downtime needed on the Primary DB for end-of-day reporting
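Pausing and resuming replay on the secondary can be done with the v9.4 admin functions:

```sql
-- On the secondary: freeze replay while reports run
SELECT pg_xlog_replay_pause();
SELECT pg_is_xlog_replay_paused();   -- true while paused

-- Resume replay once reporting is done
SELECT pg_xlog_replay_resume();
```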
Demo
- Primary Server running on port 5532 on localhost
- Secondary Server running on port 6532 on localhost
- postgresql.conf on primary:
  - max_replication_slots = 2
  - max_wal_senders = 2
  - wal_level = hot_standby
  - archive_mode = off  # no archive setup
- Create a replication slot on primary:
  - select * from pg_create_physical_replication_slot('testingv94');
Demo
- recovery.conf on Secondary Server:
  - standby_mode = on
  - primary_conninfo = 'host=127.0.0.1 port=5532 user=postgres'
  - primary_slot_name = 'testingv94'
  - recovery_min_apply_delay = 1min
- postgresql.conf on secondary:
  - hot_standby = on
Logical Replication