Replication slots in PostgreSQL

Before version 9.0, the way to build a warm standby server in PostgreSQL was WAL archiving. Version 9.0 introduced streaming replication, along with the ability to run "hot", read-only standby servers. The upcoming PostgreSQL 9.4 adds a new piece of streaming replication functionality called replication slots. Let's look at what they are and how they differ from the previous methods.
At the time of writing, the first release candidate is available. The test stand consists of two fresh machines running Ubuntu 14.04. The build and installation procedure is the same for the primary and standby servers. We install from source; first install the required packages:

sudo apt-get update && sudo apt-get -y install g++ make checkinstall libreadline6-dev zlib1g-dev 

Download and unpack the archive from the repository:

wget https://ftp.postgresql.org/pub/source/v9.4rc1/postgresql-9.4rc1.tar.gz 
tar xzf postgresql-9.4rc1.tar.gz 

Build and install the package:

cd postgresql-9.4rc1/ 
./configure 
make 
sudo checkinstall 

By default, the database binaries are installed to /usr/local/pgsql/.
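
If you prefer not to type the full path every time, the directory can optionally be added to PATH (a convenience step only; the rest of the article uses full paths):

export PATH=$PATH:/usr/local/pgsql/bin
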
Add the postgres user to the system:

sudo useradd -M postgres 

Create a directory for a cluster:

sudo mkdir -p /data/db 
sudo chown postgres:postgres /data/db 
sudo chmod 0700 /data/db 

The next steps are performed on the primary server. Initialize the cluster:

sudo -u postgres /usr/local/pgsql/bin/initdb -D /data/db

Besides the cluster structure itself, initdb creates the default configuration files. Create a pg_log directory inside the cluster to hold the logs:

sudo -u postgres mkdir /data/db/pg_log 
sudo -u postgres chmod 0700 /data/db/pg_log 

Add entries to pg_hba.conf so that users can connect to the server and the standby can fetch WAL from the primary:

host all all 192.168.1.0/24 md5 
host replication replica 192.168.1.108/32 md5 

In the configuration file postgresql.conf, set the following parameters:
listen_addresses = '*' — listen for incoming connections on all interfaces
wal_level = hot_standby — the WAL format required for replication
max_wal_senders = 2 — the number of simultaneous replication connections
logging_collector = on — put the logs in pg_log

Start the cluster:

sudo -u postgres /usr/local/pgsql/bin/pg_ctl -D /data/db start

Check the state of the server processes:

ps aux | grep postgres 
postgres 21295 0.0 0.0 604 23700 ? Ss 13:39 0:00 postgres: logger process 
postgres 21297 0.0 13.6 170880 138408 ? Ss 13:39 0:01 postgres: checkpointer process 
postgres 21298 0.0 5.0 170784 51076 ? Ss 13:39 0:00 postgres: writer process 
postgres 21299 0.0 0.5 5148 170648 ? Ss 13:39 0:00 postgres: wal writer process 
postgres 21300 0.0 0.1 171052 1836 ? Ss 13:39 0:00 postgres: autovacuum launcher process 
postgres 21301 0.2 0.1 25924 1060 ? Ss 13:39 0:17 postgres: stats collector process

Create the user replica with replication privileges:

/usr/local/pgsql/bin/psql -U postgres -c "create user replica with replication encrypted password '123'" 

Let's create a test database with data:

/usr/local/pgsql/bin/createdb -U postgres testdb 
/usr/local/pgsql/bin/psql -U postgres -d testdb -c "create table testtable (id serial, data text)" 
/usr/local/pgsql/bin/psql -U postgres -d testdb -c "insert into testtable select nextval('testtable_id_seq'::regclass), md5(generate_series(1,1000000)::text)"

Now set up the standby server.
Create a directory for the cluster:

sudo mkdir -p /data/db 
sudo chmod 0700 /data/db 
sudo chown postgres:postgres /data/db

Use the pg_basebackup utility to take a backup of the primary server (-X s streams the WAL needed during the backup along with the data):

sudo -u postgres /usr/local/pgsql/bin/pg_basebackup -h 192.168.1.103 -U replica -D /data/db -X s 

pg_basebackup copies the entire contents of the cluster, including the configuration files, so the only change needed is to set the hot_standby parameter to on in postgresql.conf.
Create a recovery.conf file in the cluster directory with the parameters for connecting to the primary server:

standby_mode='on' 
primary_conninfo='host=192.168.1.103 port=5432 user=replica password=123' 

Start the cluster on the standby server:

sudo -u postgres /usr/local/pgsql/bin/pg_ctl -D /data/db start

After that, a wal sender process should start on the primary server and a wal receiver process on the standby:

On the primary server:

postgres 21295 0.0 0.0 604 23700 ? Ss 13:39 0:00 postgres: logger process 

postgres 21298 0.0 0.2 2252 170784 ? Ss 13:39 0:00 postgres: writer process 
postgres 21299 0.0 0.5 5148 170648 ? Ss 13:39 0:00 postgres: wal writer process 
postgres 21300 0.0 0.1 1804 171052 ? Ss 13:39 0:00 postgres: autovacuum launcher process 
postgres 21301 0.0 0.1 25924 1060 ? Ss 13:39 0:00 postgres: stats collector process 
postgres 21323 0.0 0.2 171048 2108 ? Ss 13:46 0:00 postgres: wal sender process replica 192.168.1.108(56673) streaming 0/4E000210 

On the standby server:

postgres 15150 0.0 0.0 23700 612 ? Ss 13:46 0:00 postgres: logger process 
postgres 15151 0.0 0.1 1496 170788 ? Ss 13:46 0:00 postgres: startup process recovering 00000001000000000000004E 
postgres 15152 0.0 0.0 170680 944 ? Ss 13:46 0:00 postgres: checkpointer process 
postgres 15153 0.0 0.1 170680 1204 ? Ss 13:46 0:00 postgres: writer process 
postgres 15154 0.0 0.0 25792 648 ? Ss 13:46 0:00 postgres: stats collector process 
postgres 15155 0.6 0.1 174956 1660 ? Ss 13:46 0:00 postgres: wal receiver process streaming 0/4E000138


Check the replication status through the pg_stat_replication view on the primary server:

testdb=# select * from pg_stat_replication; 
-[ RECORD 1 ]----+------------------------------ 
pid | 21987 
usesysid | 16384 
usename | replica 
application_name | walreceiver 
client_addr | 192.168.1.108 
client_hostname | 
client_port | 56674 
backend_start | 2014-11-25 18:30:09.206434+03 
backend_xmin | 
state | streaming 
sent_location | 0/5A2D8E60 
write_location | 0/5A2D8E60 
flush_location | 0/5A2D8E60 
replay_location | 0/5A2D8E60 
sync_priority | 0 
sync_state | async 

We can see that the primary and the standby are in sync. Now let's insert some more test data and, immediately afterwards, check the replication status again.

testdb=# insert into testtable select nextval('testtable_id_seq'::regclass), md5(generate_series(1,1000000)::text); 
INSERT 0 1000000 
testdb=# select * from pg_stat_replication; 
-[ RECORD 1 ]----+------------------------------ 
pid | 21987 
usesysid | 16384 
usename | replica 
application_name | walreceiver 
client_addr | 192.168.1.108 
client_hostname | 
client_port | 56674 
backend_start | 2014-11-25 18:30:09.206434+03 
backend_xmin | 
state | streaming 
sent_location | 0/63800000 
write_location | 0/63680000 
flush_location | 0/63680000 
replay_location | 0/6136E160 
sync_priority | 0 
sync_state | async

Here we can see that the standby has received all the WAL from the primary but has not yet had time to apply all of it, so it lags slightly behind. By default PostgreSQL replicates asynchronously. WAL is written as binary segment files of a fixed size (16 MB) located in the pg_xlog directory; how many of them are kept is influenced by the checkpoint_segments and wal_keep_segments parameters. When the amount of changed data in the cluster exceeds the total size of the WAL segments, the checkpointer process flushes the changes into the data files, after which the WAL segments are recycled. In the current stable version of PostgreSQL the primary server does not take the state of its standbys into account, so if a standby falls too far behind, the primary will recycle WAL segments before the standby has fetched them. Let's try to simulate this situation.
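
The lag can also be expressed in bytes directly from the same view, for example with a query along these lines (a monitoring sketch; pg_current_xlog_location() and pg_xlog_location_diff() are the 9.x function names):

testdb=# select client_addr, state, pg_xlog_location_diff(pg_current_xlog_location(), replay_location) as replay_lag_bytes from pg_stat_replication; 
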
Temporarily block the standby server from connecting to port 5432:

sudo iptables -A OUTPUT -m tcp -p tcp --dport 5432 -j DROP

Insert some more data on the primary server:

testdb=# insert into testtable select nextval('testtable_id_seq'::regclass), md5(generate_series(1,1000000)::text); 
INSERT 0 1000000

Drop the iptables rule and look at the logs on the standby server, where we find an unpleasant picture:

LOG: started streaming WAL from primary at 0/78000000 on timeline 1 
FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000078 has already been removed

The primary server removed WAL segments before the standby managed to fetch them, so the base backup now has to be taken all over again. The root of the problem is that the primary does not take the state of the standby into account: with network problems or simply a slow link between the servers, combined with a heavy write load on the primary, there is a real risk of breaking replication. A partial solution is to increase the wal_keep_segments parameter and enable WAL archiving.
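
For example, that partial workaround could look roughly like this in postgresql.conf on the primary (the values and the archive path are purely illustrative):

wal_keep_segments = 128                    # always keep this many extra segments in pg_xlog (~2 GB at 16 MB each)
archive_mode = on
archive_command = 'cp %p /mnt/archive/%f'  # copy every completed WAL segment to an archive location
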
PostgreSQL 9.4, however, introduces replication slots. Let's see how they work. On the primary server, create a replication slot:

testdb=# SELECT pg_create_physical_replication_slot('standby_slot'); 
-[ RECORD 1 ]------------------------+----------------- 
pg_create_physical_replication_slot | (standby_slot,) 


testdb=# select * from pg_replication_slots; 
-[ RECORD 1 ]+------------- 
slot_name | standby_slot 
plugin | 
slot_type | physical 
datoid | 
database | 
active | f 
xmin | 
catalog_xmin | 
restart_lsn | 
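
The slot is not yet in use: active is f and restart_lsn is empty. Once the standby attaches to the slot, restart_lsn starts advancing, and a query along these lines (a monitoring sketch using the 9.4 column names shown above) shows how much WAL the primary is retaining for the slot:

testdb=# select slot_name, active, restart_lsn, pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) as retained_bytes from pg_replication_slots; 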

On the standby server, add the following line to the existing recovery.conf:
primary_slot_name = 'standby_slot'
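
recovery.conf is only read at server start, so restart the standby for the new parameter to take effect, for example:

sudo -u postgres /usr/local/pgsql/bin/pg_ctl -D /data/db restart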

After restarting the standby server, disconnect it from the primary again and insert test data on the primary that exceeds the total size of the WAL segments:

testdb=# insert into testtable select nextval('testtable_id_seq'::regclass), md5(generate_series(1,10000000)::text); 

Check the WAL-related settings, and then the actual number of segment files in the pg_xlog directory:

testdb=# show checkpoint_segments; 
-[ RECORD 1 ]-------+-- 
checkpoint_segments | 3 

testdb=# show wal_keep_segments; 
-[ RECORD 1]-----+-- 
wal_keep_segments | 0 

testdb=# \! ls /data/db/pg_xlog | wc -l 
50

The maximum number of WAL files in the system is normally estimated by the formula (2 + checkpoint_completion_target) * checkpoint_segments + 1; with the default checkpoint_completion_target of 0.5 and the checkpoint_segments = 3 shown above, that works out to (2 + 0.5) * 3 + 1 = 8.5, i.e. no more than about 9 segments (roughly 140 MB).
However, the current number of WAL segments in the system is much higher. A replication slot keeps track of how much WAL each standby server has already consumed, so WAL segments now accumulate on the primary until the last standby picks them up or until the slot is dropped. As the standby downloads the WAL, the pg_xlog directory on the primary shrinks again. After dropping the iptables rule on the standby server, the logs show that replication has resumed.

testdb=# \! tail -f /data/db/pg_log/postgresql-2014-11-27_191036.log 
Is the server running on host "192.168.1.103" and accepting TCP/IP connections on port 5432? 
LOG: started streaming WAL from primary at 0/A0000000 on timeline 1
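
One caveat: a slot that nobody consumes keeps forcing the primary to retain WAL indefinitely, so when a standby is decommissioned its slot should be removed, for example:

testdb=# select pg_drop_replication_slot('standby_slot'); 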

Replication slots are an excellent tool for improving the reliability and convenience of replication in PostgreSQL.

Replication slots are described in the official PostgreSQL documentation: www.postgresql.org/docs/9.4/static/warm-standby.html#STREAMING-REPLICATION-SLOTS
Article based on information from habrahabr.ru
