Install and Configure pg_repack in PostgreSQL
pg_repack is a PostgreSQL extension which lets you remove bloat from tables and indexes, and optionally restore the physical order of clustered indexes. Unlike CLUSTER and VACUUM FULL it works online, without holding an exclusive lock on the processed tables during processing. pg_repack is efficient to boot, with performance comparable to using CLUSTER directly.
Notes:
Only superusers can use the utility.
The target table must have a PRIMARY KEY or at least a UNIQUE index on a NOT NULL column.
Performing a full-table repack requires free disk space about twice as large as the target table(s) and its indexes. For example, if the total size of the tables and indexes to be reorganized is 2GB, an additional 4GB of disk space is required.
Below are the high-level steps we will follow to set up pg_repack
1. Installing pg_repack extension
2. Configure postgresql.conf
3. Create pg_repack extension
4. Use pg_repack to Rebuild Tables Online
Step 1. Installing pg_repack extension: Install pg_repack package using OS yum command. Please note below command will work if you have postgreSQL repository already configured. Click here to configure PostgreSQL repository. Once package is installed, use command pg_repack –version to check version and pg_repack –help to get full available options.
[root@test-machine02 tmp]#
[root@test-machine02 tmp]# yum install pg_repack13
Loaded plugins: langpacks, ulninfo
pgdg-common | 2.9 kB 00:00:00
pgdg10 | 3.6 kB 00:00:00
pgdg11 | 3.6 kB 00:00:00
pgdg12 | 3.6 kB 00:00:00
pgdg13 | 3.6 kB 00:00:00
pgdg95 | 3.6 kB 00:00:00
pgdg96 | 3.6 kB 00:00:00
vault.centos.org_centos_7.4.1708_extras_x86_64_ | 3.4 kB 00:00:00
(1/9): mongodb-org-4.4/7Server/primary_db | 74 kB 00:00:00
(2/9): pgdg-common/7Server/x86_64/primary_db | 141 kB 00:00:00
(3/9): pgdg10/7Server/x86_64/primary_db | 348 kB 00:00:01
(4/9): pgdg12/7Server/x86_64/primary_db | 227 kB 00:00:00
(5/9): pgdg96/7Server/x86_64/primary_db | 337 kB 00:00:00
(6/9): pgdg13/7Server/x86_64/primary_db | 143 kB 00:00:00
(7/9): pgdg11/7Server/x86_64/primary_db | 372 kB 00:00:01
Package pg_repack13 is obsoleted by pg_repack_13, trying to install pg_repack_13-1.4.7-1.rhel7.x86_64 instead
Resolving Dependencies
--> Running transaction check
---> Package pg_repack_13.x86_64 0:1.4.7-1.rhel7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
=============================================================================================================================================================================
Package Arch Version Repository Size
=============================================================================================================================================================================
Installing:
pg_repack_13 x86_64 1.4.7-1.rhel7 pgdg13 124 k
Transaction Summary
=============================================================================================================================================================================
Install 1 Package
Total download size: 124 k
Installed size: 307 k
Is this ok [y/d/N]: y
Downloading packages:
pg_repack_13-1.4.7-1.rhel7.x86_64.rpm | 124 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : pg_repack_13-1.4.7-1.rhel7.x86_64 1/1
Verifying : pg_repack_13-1.4.7-1.rhel7.x86_64 1/1
Installed:
pg_repack_13.x86_64 0:1.4.7-1.rhel7
Complete!
[root@test-machine02 tmp]#
[root@test-machine02 tmp]# su - postgres
Last login: Sun Oct 17 09:12:05 +03 2021 on pts/0
[postgres@test-machine02 ~]$ pg_repack --version
pg_repack 1.4.7
[postgres@test-machine02 ~]$
[postgres@test-machine02 ~]$
[postgres@test-machine02 ~]$ pg_repack --help
pg_repack re-organizes a PostgreSQL database.
Usage:
pg_repack [OPTION]... [DBNAME]
Options:
-a, --all repack all databases
-t, --table=TABLE repack specific table only
-I, --parent-table=TABLE repack specific parent table and its inheritors
-c, --schema=SCHEMA repack tables in specific schema only
-s, --tablespace=TBLSPC move repacked tables to a new tablespace
-S, --moveidx move repacked indexes to TBLSPC too
-o, --order-by=COLUMNS order by columns instead of cluster keys
-n, --no-order do vacuum full instead of cluster
-N, --dry-run print what would have been repacked
-j, --jobs=NUM Use this many parallel jobs for each table
-i, --index=INDEX move only the specified index
-x, --only-indexes move only indexes of the specified table
Generic options:
-e, --echo echo queries
-E, --elevel=LEVEL set output message level
--help show this help, then exit
--version output version information, then exit
Read the website for details: <https://reorg.github.io/pg_repack/>.
Report bugs to <https://github.com/reorg/pg_repack/issues>.
[postgres@test-machine02 ~]$
Step 2. Configure postgresql.conf: We need to add pg_repack to shared_preload_libraries. For that, just set this parameter in postgresql.conf.
[postgres@test-machine02 ~]$
[postgres@test-machine02 ~]$ cd $PGDATA
[postgres@test-machine02 data]$ ls -ltr *.conf
-rw-------. 1 postgres postgres 1636 Aug 11 17:30 pg_ident.conf
-rw-------. 1 postgres postgres 27974 Aug 14 14:55 postgresql.conf
-rw-------. 1 postgres postgres 574 Aug 24 11:36 postgresql.auto.conf
-rw-------. 1 postgres postgres 4772 Sep 29 16:45 pg_hba.conf
[postgres@test-machine02 data]$ vi postgresql.conf
shared_preload_libraries = 'pg_repack'
:wq!
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$ systemctl restart postgresql-13
==== AUTHENTICATING FOR org.freedesktop.systemd1.manage-units ===
Authentication is required to manage system services or units.
Authenticating as: root
Password:
==== AUTHENTICATION COMPLETE ===
[postgres@test-machine02 data]$
Step 3. Create pg_repack extension: Before using pg_repack, we must create pg_repack extension in each database where we wish to run it.
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$ psql sample1
psql (13.4)
Type "help" for help.
sample1=# CREATE EXTENSION pg_repack;
CREATE EXTENSION
sample1=#
sample1=# \dx
List of installed extensions
Name | Version | Schema | Description
-----------+---------+------------+--------------------------------------------------------------
pg_repack | 1.4.7 | public | Reorganize tables in PostgreSQL databases with minimal locks
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(2 rows)
sample1=#
Step 4. Use pg_repack to Rebuild Tables Online:
Use –dry-run option with pg_repack to see if this table/database can be rebuilt online.
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$ pg_repack --dry-run -d sample1 --table actor
INFO: Dry run enabled, not executing repack
INFO: repacking table "public.actor"
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$ pg_repack --dry-run -d sample1 --table employee
INFO: Dry run enabled, not executing repack
WARNING: relation "public.employee" must have a primary key or not-null unique keys
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$ pg_repack --dry-run -d sample1
INFO: Dry run enabled, not executing repack
INFO: repacking table "public.actor"
INFO: repacking table "public.address"
INFO: repacking table "public.category"
INFO: repacking table "public.city"
INFO: repacking table "public.country"
INFO: repacking table "public.customer"
INFO: repacking table "public.film"
INFO: repacking table "public.film_actor"
INFO: repacking table "public.film_category"
INFO: repacking table "public.inventory"
INFO: repacking table "public.language"
INFO: repacking table "public.payment"
INFO: repacking table "public.pgbench_accounts"
INFO: repacking table "public.pgbench_branches"
INFO: repacking table "public.pgbench_tellers"
INFO: repacking table "public.rental"
INFO: repacking table "public.staff"
INFO: repacking table "public.store"
[postgres@test-machine02 data]$
Rebuild Full PostgreSQL Cluster: Attempt to repack all the databases of the PostgreSQL cluster using -a or –all option. Databases, where the pg_repack extension is not installed will be skipped.
[postgres@test-machine02 ~]$
[postgres@test-machine02 ~]$ pg_repack --all
INFO: repacking database "sample1"
INFO: repacking table "public.actor"
INFO: repacking table "public.address"
INFO: repacking table "public.category"
INFO: repacking table "public.city"
INFO: repacking table "public.country"
INFO: repacking table "public.customer"
INFO: repacking table "public.film"
INFO: repacking table "public.film_actor"
INFO: repacking table "public.film_category"
INFO: repacking table "public.inventory"
INFO: repacking table "public.language"
INFO: repacking table "public.payment"
INFO: repacking table "public.pgbench_accounts"
INFO: repacking table "public.pgbench_branches"
INFO: repacking table "public.pgbench_tellers"
INFO: repacking table "public.rental"
INFO: repacking table "public.staff"
INFO: repacking table "public.store"
INFO: repacking database "sample2"
INFO: database "sample2" skipped: pg_repack 1.4.7 is not installed in the database
INFO: repacking database "sample3"
INFO: database "sample3" skipped: pg_repack 1.4.7 is not installed in the database
INFO: repacking database "template1"
INFO: database "template1" skipped: pg_repack 1.4.7 is not installed in the database
[postgres@test-machine02 ~]$
Rebuild Database Online: Reorganize the specified database only. Please note -d or –dbname option can be specified only once.
[postgres@test-machine02 ~]$
[postgres@test-machine02 ~]$ pg_repack -d sample1 -d sample2
ERROR: option -d, --dbname should be specified only once: 'sample2'
[postgres@test-machine02 ~]$
[postgres@test-machine02 data]$ pg_repack -d sample1
INFO: repacking table "public.actor"
INFO: repacking table "public.address"
INFO: repacking table "public.category"
INFO: repacking table "public.city"
INFO: repacking table "public.country"
INFO: repacking table "public.customer"
INFO: repacking table "public.film"
INFO: repacking table "public.film_actor"
INFO: repacking table "public.film_category"
INFO: repacking table "public.inventory"
INFO: repacking table "public.language"
INFO: repacking table "public.payment"
INFO: repacking table "public.pgbench_accounts"
INFO: repacking table "public.pgbench_branches"
INFO: repacking table "public.pgbench_tellers"
INFO: repacking table "public.rental"
INFO: repacking table "public.staff"
INFO: repacking table "public.store"
[postgres@test-machine02 data]$
Rebuild Schema Online: Repack the tables in the specified schema(s) only. Multiple schemas may be repacked by writing multiple -c or –schema options. It may be used in conjunction with –tablespace to move tables to a different tablespace.
[postgres@test-machine02 ~]$ pg_repack -d sample1 --schema public
INFO: repacking table "public.actor"
INFO: repacking table "public.address"
INFO: repacking table "public.category"
INFO: repacking table "public.city"
INFO: repacking table "public.country"
INFO: repacking table "public.customer"
WARNING: relation "public.employee" must have a primary key or not-null unique keys
INFO: repacking table "public.film"
INFO: repacking table "public.film_actor"
INFO: repacking table "public.film_category"
INFO: repacking table "public.inventory"
INFO: repacking table "public.language"
WARNING: relation "public.my_table_test1" must have a primary key or not-null unique keys
INFO: repacking table "public.payment"
INFO: repacking table "public.pgbench_accounts"
INFO: repacking table "public.pgbench_branches"
WARNING: relation "public.pgbench_history" must have a primary key or not-null unique keys
INFO: repacking table "public.pgbench_tellers"
INFO: repacking table "public.rental"
INFO: repacking table "public.staff"
INFO: repacking table "public.store"
[postgres@test-machine02 ~]$
Rebuild Tables Online: Reorganize the specified table(s) only. Multiple tables may be reorganized by writing multiple -t or –table options.
[postgres@test-machine02 data]$
[postgres@test-machine02 data]$ pg_repack -d sample1 --table actor
INFO: repacking table "public.actor"
[postgres@test-machine02 data]$
Move the specified table to tablespace pg1_tbls: Reorganize the specified table(s) only. Multiple tables may be reorganized by writing multiple -t or –table options. It may be used in conjunction with –tablespace to move the table to a different tablespace.
[postgres@test-machine02 ~]$ pg_repack -d sample1 --table actor --tablespace pg1_tbls
INFO: repacking table "public.actor"
[postgres@test-machine02 ~]$
Move the specified table & indexes to tablespace pg1_tbls: Also move the indexes of the repacked tables to the tablespace specified by the –tablespace option.
[postgres@test-machine02 ~]$ pg_repack -d sample1 --table actor --moveidx --tablespace pg2_tbls
INFO: repacking table "public.actor"
[postgres@test-machine02 ~]$
Move all indexes of the table to tablespace pg2_tbls: Repack only the indexes of the specified table(s), which must be specified with the –table or –parent-table options.
[postgres@test-machine02 ~]$ pg_repack -d sample1 --table actor --only-indexes --tablespace pg2_tbls
INFO: repacking indexes of "actor"
INFO: repacking index "public.actor_pkey"
INFO: repacking index "public.idx_actor_last_name"
[postgres@test-machine02 ~]$
Move the specified index to tablespace pg1_tbls: Repack the specified index(es) only. Multiple indexes may be repacked by writing multiple -i or –index options. It may be used in conjunction with –tablespace to move the index to a different tablespace.
[postgres@test-machine02 ~]$
[postgres@test-machine02 ~]$ pg_repack -d sample1 --index idx_actor_last_name --tablespace pg1_tbls
INFO: repacking index "public.idx_actor_last_name"
[postgres@test-machine02 ~]$
Running pg_repack in parallel jobs: Use option -j to add the number of parallel processes to rebuild operation.
[postgres@test-machine02 ~]$ pg_repack -d sample1 --table actor -j 4
NOTICE: Setting up workers.conns
INFO: repacking table "public.actor"
LOG: Initial worker 0 to build index: CREATE UNIQUE INDEX index_16562 ON repack.table_16427 USING btree (actor_id) TABLESPACE pg2_tbls
LOG: Initial worker 1 to build index: CREATE INDEX index_16593 ON repack.table_16427 USING btree (last_name)
LOG: Command finished in worker 0: CREATE UNIQUE INDEX index_16562 ON repack.table_16427 USING btree (actor_id) TABLESPACE pg2_tbls
LOG: Command finished in worker 1: CREATE INDEX index_16593 ON repack.table_16427 USING btree (last_name)
[postgres@test-machine02 ~]$
Reference : https://reorg.github.io/pg_repack/
This document is just for learning purpose and always validate in the LAB environment first before applying in the LIVE environment.
Hope so you like this article!
Please share your valuable feedback/comments/subscribe and follow us below and don’t forget to click on the bell icon to get the most recent update. Click here to understand more about our pursuit.
Related Articles
- Access the MSSQL database from PostgreSQL using TDS Foreign Data Wrapper (TDS_FDW)
- Access MySql database from PostgreSQL using MySQL Foreign Data Wrapper (MYSQL_FDW)
- Configure and monitor using pg_profile in PostgreSQL
- Monitor PostgreSQL Cluster using pgCenter
- Install and Configure pg_repack in PostgreSQL
thank you so much for nice and useful article.
– Satya