Configure Streaming Replication (Record Based Log Shipping) in PostgreSQL
In the recent two blogs, we discuss briefly on Replication available options in PostgreSql Click here to read more. And How To Configure File-Based Log Shipping in PostgreSQL Click here to read more. In this blog, we will implement Record Based Log Shipping also know as Streaming Replication in PostgreSQL.
PostgreSQL implements file-based log shipping by transferring WAL files to Standby Server and they are shipped only after they reach the threshold value of 16MB (default size). This can cause a delay in the replication process and increase the chances of a data loss due to a possible master crash.
The answer to the above problem is Stream Replication. The database servers stream WAL records in chunks to ensure that the data is always in sync. The standby servers receive the WAL chunks by establishing a connection with the master server. The advantage of streaming WAL records is that it doesn’t wait for the capacity to be full, these are streamed immediately. This helps in keeping the standby server up-to-date.
Streaming replication is asynchronous by default however, it supports synchronous replication mode as well.
Below are setup details and the same will be used in this demonstration.
Sr No. | Hostname | IP | Role |
1 | test-machine01 | 192.168.114.177 | Master / Primary Server |
2 | test-machine02 | 192.168.114.176 | Standby / Secondary Server |
Below are the high-level steps we will follow to set up our replication.
1. Configure Passwordless Authentication SSH connection from test-machine01 to test-machine02.
2. Configure Primary Server for Stream Replication.
3. Replication Entry in pg_hba.conf File in Primary Server.
4. Take a Base Backup of Primary Server.
5. Start Standby Server.
6. Test Stream Replication.
7. Monitor Stream Replication.
Step 1. Configure Passwordless Authentication: We need to configure the Passwordless Authentication SSH connection from test-machine01 to test-machine02 as we will be using rsync OS utility for continuous WAL archiving in test-machine02. Use OS command ssh-keygen to generate public and private keys in test-machine01. Once keys are generated use OS command ssh-copy-id to transfer the public key to test-machine02.
[root@test-machine01 ~]# su - postgres
Last login: Wed Jun 16 17:44:29 +03 2021 on pts/2
-bash-4.2$
-bash-4.2$
-bash-4.2$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/var/lib/pgsql/.ssh/id_rsa):
Created directory '/var/lib/pgsql/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /var/lib/pgsql/.ssh/id_rsa.
Your public key has been saved in /var/lib/pgsql/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:jiouqVEgyZz+dmsHM+H0yzn4GgdhYnqBkbJ3Es/ralw postgres@test-machine02
The key's randomart image is:
+---[RSA 2048]----+
| .. |
|+o= |
|=* B o |
|+.= Bo. |
| +.+ooo S |
| .o E=.+ |
|.o = oB.+ |
|o.= o++* |
|o+oooo+.. |
+----[SHA256]-----+
-bash-4.2$
-bash-4.2$ ls -ltr .ssh
total 8
-rw-r--r--. 1 postgres postgres 405 Jun 22 10:12 id_rsa.pub
-rw-------. 1 postgres postgres 1679 Jun 22 10:12 id_rsa
-bash-4.2$
-bash-4.2$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@test-machine02
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/var/lib/pgsql/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
barman@test-machine01's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'postgres@test-machine02'"
and check to make sure that only the key(s) you wanted were added.
-bash-4.2$
-bash-4.2$ ssh 'postgres@test-machine02'
Last login: Tue Jun 22 10:32:10 2021
-bash-4.2$
-bash-4.2$ exit
logout
Connection to test-machine02 closed.
-bash-4.2$
Step 2.1. Configure Primary Server: Create Role: repluser. This role will be used by Standby DB to connect to Primary DB. Also, configure the below parameters in primary server postgresql.conf. After parameters are edited restart the PostgreSQL cluster to apply these parameters.
wal_level: This is used to enable Postgres WAL replication/ streaming replication. The possible values here are replica, minimal, and logical.
wal_log_hints: It is required for the pg_rewind capability, this helps when the standby server is out of sync with the master server.
max_wal_senders: It is used to specify the maximum numbers of concurrent connections that can be established with the standby servers.
wal_keep_size: An integer parameter that define the size of the log files segments that should be kept in the pg_wal directory. wal_keep_size = wal_keep_segments * wal_segment_size (typically 16MB).
max_slot_wal_keep_size: Sets the maximum WAL size that can be reserved by replication slots.
[root@test-machine01 ~]# su - postgres
Last login: Thu Jul 22 13:55:34 +03 2021 on pts/2
-bash-4.2$ psql
psql (13.1)
Type "help" for help.
postgres=# CREATE USER repluser REPLICATION PASSWORD 'Root@1234';
CREATE ROLE
postgres=#
postgres=# alter system set archive_mode=on;
ALTER SYSTEM
postgres=# alter system set archive_command='test ! -f postgres@test-machine02:/u01/wal_archive/%f && rsync -a %p postgres@test-machine02:/u01/wal_archive/%f';
ALTER SYSTEM
postgres=# alter system set wal_keep_size='320 MB';
ALTER SYSTEM
postgres=# alter system set max_slot_wal_keep_size='1024 MB';
ALTER SYSTEM
postgres=# alter system set wal_log_hints=on;
ALTER SYSTEM
postgres=#
[root@test-machine01 ~]#
[root@test-machine01 ~]# systemctl restart postgresql-13
[root@test-machine01 ~]#
postgres=#
postgres=#
postgres=# select name,setting,unit from pg_settings where name in ('wal_level','archive_mode','archive_command','max_wal_senders','wal_keep_size','max_slot_wal_keep_size','wal_log_hints');
name | setting | unit
------------------------+------------------------------------------------------------------------------------------------------------------+------
archive_command | test ! -f postgres@test-machine02:/u01/wal_archive/%f && rsync -a %p postgres@test-machine02:/u01/wal_archive/%f |
archive_mode | on |
max_slot_wal_keep_size | 1024 | MB
max_wal_senders | 10 |
wal_keep_size | 320 | MB
wal_level | replica |
wal_log_hints | on |
(7 rows)
postgres=#
Step 2.2 Verify WAL Arching working fine: Perform a manual WAL switch using function pg_switch_wal() in Primary DB and verify WAL archived to Standby Server.
[root@test-machine01 ~]# su - postgres
Last login: Thu Jul 22 12:06:05 +03 2021 on pts/3
-bash-4.2$ psql
psql (13.1)
Type "help" for help
postgres=#
postgres=# select pg_switch_wal();
pg_switch_wal
---------------
0/240000F0
(1 row)
-bash-4.2$ hostname
test-machine02
-bash-4.2$ pwd
/u01/wal_archive
-bash-4.2$
-bash-4.2$ ls -ltr
total 16384
-rw-r-----. 1 postgres postgres 16777216 Jul 22 13:51 000000050000000000000024
-bash-4.2$
Step 3. Replication Entry in pg_hba.conf File in Primary Server: To execute pg_basebackup command from the Remote Server and To connect Standby DB replication user repluser to Primary we need to update pg_hba.conf and tell PostgreSQL Server that you will get a replication connection request from Remote Server. Otherwise, you will receive an error. Once changes are done in pg_hba.conf reload the changes with pg_ctl reload or SELECT pg_reload_conf(), Please note we added the line “host replication all 192.168.114.0/24 scram-sha-256“ at end of the file.
-bash-4.2$ pwd
/var/lib/pgsql/13/data
-bash-4.2$
-bash-4.2$ vi pg_hba.conf
host replication all 192.168.114.0/24 scram-sha-256
:wq!
-bash-4.2$ pwd
/var/lib/pgsql/13/data
-bash-4.2$ cat pg_hba.conf
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all peer
# IPv4 local connections:
host all all 127.0.0.1/32 scram-sha-256
host all all 192.168.114.0/24 scram-sha-256
# IPv6 local connections:
host all all ::1/128 scram-sha-256
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all peer
host replication all 127.0.0.1/32 scram-sha-256
host replication all ::1/128 scram-sha-256
host replication all 192.168.114.0/24 scram-sha-256
-bash-4.2$
postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
postgres=#
Step 4.1. Take a Base Backup of Primary Server from Standby Server: Before we take base backup of Primary DB. We need to create the same directory structure for non-default tablespace in Standby Server. In Primary we have 2 non default tablespace myts01:/u01/pg1_tbls & myts02:/u01/pg2_tbls. Also, remove all files from PG_DATA of standby database. We will take a remote backup from Standby Server using utility pg_basebackup. For more commands on pg_basebackup Click here.
[root@test-machine02 ~]# systemctl stop postgresql-13
[root@test-machine02 ~]#
[root@test-machine02 ~]# rm -rf /u01/pg1_tbls/*
[root@test-machine02 ~]# rm -rf /u01/pg2_tbls/*
[root@test-machine02 ~]# rm -rf /var/lib/pgsql/13/data/*
[root@test-machine02 ~]#
[root@test-machine02 ~]# su - postgres
Last login: Sun Jul 25 13:34:21 +03 2021 on pts/1
-bash-4.2$
-bash-4.2$ pg_basebackup -h test-machine01 -p 5432 -U repluser -c fast -C -S slot_standby1 -R -D /var/lib/pgsql/13/data -Fp -Xs -P -v
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/39000028 on timeline 5
pg_basebackup: starting background WAL receiver
pg_basebackup: created replication slot "slot_standby1"
74399/74399 kB (100%), 3/3 tablespaces
pg_basebackup: write-ahead log end point: 0/39000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
-bash-4.2$
-bash-4.2$
-bash-4.2$
-bash-4.2$ cd /var/lib/pgsql/13/data
-bash-4.2$
-bash-4.2$ ls -ltr
total 388
drwx------. 3 postgres postgres 84 Jul 25 12:52 pg_wal
-rw-------. 1 postgres postgres 226 Jul 25 12:52 backup_label
drwx------. 6 postgres postgres 54 Jul 25 12:52 base
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_replslot
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_notify
drwx------. 4 postgres postgres 36 Jul 25 12:52 pg_multixact
drwx------. 4 postgres postgres 68 Jul 25 12:52 pg_logical
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_dynshmem
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_commit_ts
drwx------. 2 postgres postgres 4096 Jul 25 12:52 log
drwx------. 2 postgres postgres 18 Jul 25 12:52 pg_xact
-rw-------. 1 postgres postgres 3 Jul 25 12:52 PG_VERSION
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_twophase
drwx------. 2 postgres postgres 32 Jul 25 12:52 pg_tblspc
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_subtrans
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_stat_tmp
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_stat
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_snapshots
drwx------. 2 postgres postgres 6 Jul 25 12:52 pg_serial
-rw-------. 1 postgres postgres 1636 Jul 25 12:52 pg_ident.conf
-rw-------. 1 postgres postgres 28083 Jul 25 12:52 postgresql.conf
-rw-------. 1 postgres postgres 4704 Jul 25 12:52 pg_hba.conf
-rw-------. 1 postgres postgres 583 Jul 25 12:52 old_postgresql.auto.conf
drwx------. 2 postgres postgres 4096 Jul 25 12:52 global
-rw-------. 1 postgres postgres 30 Jul 25 12:52 current_logfiles
-rw-------. 1 postgres postgres 259 Jul 25 12:52 backup_label.old
-rw-------. 1 postgres postgres 601 Jul 25 12:52 postgresql.auto.conf
-rw-------. 1 postgres postgres 0 Jul 25 12:52 standby.signal
-rw-------. 1 postgres postgres 320032 Jul 25 12:52 backup_manifest
-bash-4.2$
Step 4.2. Verify PG_DATA folder in Standby Server: Please note in above pg_basebackup we are using option -C -S slot_standby & -R These option will add parameters primary_slot_name & primary_conninfo in configuration file postgresql.auto.conf. And standby.signal in the data directory this file will cause PostgreSQL Server to enter in standby mode.
-bash-4.2$ cd /var/lib/pgsql/13/data
-bash-4.2$
-bash-4.2$ ls -l standby.signal
-rw-------. 1 postgres postgres 0 Jul 25 12:52 standby.signal
-bash-4.2$
-bash-4.2$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
archive_mode = 'on'
archive_command = 'test ! -f postgres@test-machine02:/u01/wal_archive/%f && rsync -a %p postgres@test-machine02:/u01/wal_archive/%f'
wal_keep_size = '320 MB'
max_slot_wal_keep_size = '1024 MB'
wal_log_hints = 'on'
primary_conninfo = 'user=repluser password=''Root@1234'' channel_binding=prefer host=''test-machine01'' port=5432 sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'
primary_slot_name = 'slot_standby1'
-bash-4.2$
Step 5. Start Standby Server and Verify log file: Once the above steps are done start Standby PostgreSSQ Server and verify logfile. You can also verify from function pg_is_in_recovery() whether Standby DB is in recovery mode.
[root@test-machine02 ~]#
[root@test-machine02 ~]# systemctl start postgresql-13
[root@test-machine02 ~]#
postgres=#
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
postgres=#
postgres=# CREATE TABLE my_table_test(col1 varchar, col2 varchar);
ERROR: cannot execute CREATE TABLE in a read-only transaction
postgres=#
-bash-4.2$ pwd
/var/lib/pgsql/13/data/log
-bash-4.2$ vi postgresql-Sun.log
2021-07-25 12:55:34.826 +03 [130312] LOG: starting PostgreSQL 13.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2021-07-25 12:55:34.828 +03 [130312] LOG: listening on IPv4 address "0.0.0.0", port 5432
2021-07-25 12:55:34.828 +03 [130312] LOG: listening on IPv6 address "::", port 5432
2021-07-25 12:55:34.829 +03 [130312] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-07-25 12:55:34.830 +03 [130312] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-07-25 12:55:34.833 +03 [130316] LOG: database system was interrupted; last known up at 2021-07-25 12:52:59 +03
2021-07-25 12:55:34.858 +03 [130316] LOG: entering standby mode
2021-07-25 12:55:34.860 +03 [130316] LOG: redo starts at 0/39000028
2021-07-25 12:55:34.860 +03 [130316] LOG: consistent recovery state reached at 0/39000100
2021-07-25 12:55:34.861 +03 [130312] LOG: database system is ready to accept read only connections
2021-07-25 12:55:34.895 +03 [130320] LOG: started streaming WAL from primary at 0/3A000000 on timeline 5
Step 6. Test Standby Configuration: Since this is configuration is Record Based Log Shipping changes from Primary will be replicated to Standby immediately. Replication process will not wait till WAL files are filled to 16MB.
Primary DB
postgres=#
postgres=# \connect sample1
You are now connected to database "sample1" as user "postgres".
sample1=#
sample1=# CREATE TABLE my_table_test1 (col1 varchar, col2 varchar);
CREATE TABLE
sample1=# INSERT INTO my_table_test1 VALUES ('row1', 'this is row 1'), ('row2', 'this is row 2');
INSERT 0 2
sample1=#
Standby DB
postgres=#
postgres=# \connect sample1
You are now connected to database "sample1" as user "postgres".
sample1=#
sample1=# select * from my_table_test1 ;
col1 | col2
------+---------------
row1 | this is row 1
row2 | this is row 2
(2 rows)
sample1=#
Step 7. Monitor Stream Replication: We can monitor replication performance from Primary or Standby Cluster using below queries. To understand the below function refer PostgreSql Doc : System Administration Functions Click Here.
Primary DB
postgres=# \x
Expanded display is on.
postgres=#
postgres=# select pg_current_wal_lsn ();
-[ RECORD 1 ]------+-----------
pg_current_wal_lsn | 0/3B000000
postgres=#
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 39216
usesysid | 33764
usename | repluser
application_name | walreceiver
client_addr | 192.168.114.176
client_hostname |
client_port | 35618
backend_start | 2021-07-25 12:55:42.745883+03
backend_xmin |
state | streaming
sent_lsn | 0/3B000000
write_lsn | 0/3B000000
flush_lsn | 0/3B000000
replay_lsn | 0/3B000000
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
reply_time | 2021-07-25 13:49:19.786463+03
postgres=# select * from pg_replication_slots;
-[ RECORD 1 ]-------+--------------
slot_name | slot_standby1
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 39216
xmin |
catalog_xmin |
restart_lsn | 0/3B000000
confirmed_flush_lsn |
wal_status | reserved
safe_wal_size | 1090519040
postgres=#
postgres=# select
postgres-# pid,
postgres-# application_name,
postgres-# pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) sending_lag,
postgres-# pg_wal_lsn_diff(sent_lsn, flush_lsn) receiving_lag,
postgres-# pg_wal_lsn_diff(flush_lsn, replay_lsn) replaying_lag,
postgres-# pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) total_lag
postgres-# from pg_stat_replication;
-[ RECORD 1 ]----+------------
pid | 39216
application_name | walreceiver
sending_lag | 0
receiving_lag | 0
replaying_lag | 0
total_lag | 0
postgres=#
Standby DB
postgres=#
postgres=#
postgres=# select pg_last_wal_receive_lsn ();
-[ RECORD 1 ]-----------+-----------
pg_last_wal_receive_lsn | 0/3B000000
postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid | 130320
status | streaming
receive_start_lsn | 0/3A000000
receive_start_tli | 5
written_lsn | 0/3B000000
flushed_lsn | 0/3B000000
received_tli | 5
last_msg_send_time | 2021-07-25 13:49:37.653767+03
last_msg_receipt_time | 2021-07-25 13:49:29.785219+03
latest_end_lsn | 0/3B000000
latest_end_time | 2021-07-25 13:07:59.720781+03
slot_name | slot_standby1
sender_host | test-machine01
sender_port | 5432
conninfo | user=repluser password=******** channel_binding=prefer dbname=replication host=test-machine01 port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any
postgres=#
postgres=# select pg_is_in_recovery(),pg_is_wal_replay_paused(), pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();
-[ RECORD 1 ]-----------------+------------------------------
pg_is_in_recovery | t
pg_is_wal_replay_paused | f
pg_last_wal_receive_lsn | 0/3B000000
pg_last_wal_replay_lsn | 0/3B000000
pg_last_xact_replay_timestamp | 2021-07-25 12:58:31.030197+03
postgres=#
postgres=#
Reference: PostgreSql Doc: https://www.postgresql.org/docs/current/warm-standby.html
This document is just for learning purpose and always validate in the LAB environment first before applying in the LIVE environment.
Hope so you like this article!
Please share your valuable feedback/comments/subscribe and follow us below and don’t forget to click on the bell icon to get the most recent update. Click here to understand more about our pursuit.
Related articles
- Setup Logical Replication using pglogical in PostgreSQL
- Backups from the Standby Server using pgBackRest in PostgreSQL
- Setup Streaming Replication with pgBackRest in PostgreSQL
- Configure automatic failover with repmgrd in PostgreSQL
- Perform Manual Failover & Switchover using repmgr in PostgreSQL