PostgreSQL DBA Guide
downloading package
On Red Hat systems we use RPM (Red Hat Package Manager) packages as the fundamental tool. Repository RPM files define the software repositories that make installing software packages straightforward. The process unfolds as follows:
1. Selecting the Operating System (OS) Version: This step is akin to choosing the foundation upon
which our digital infrastructure will be built.
2. Utilizing the Yum Repository Method: We opt for the Yum repository method, a robust approach akin
to gaining access to a comprehensive software library.
3. PostgreSQL Version Selection: A pivotal decision arises as we must specify the PostgreSQL
version. For our purposes, let's opt for PostgreSQL version 12.
Use the wget command below to download the repository package from the website:
wget [link to download]
The PostgreSQL 12 package listing is at https://yum.postgresql.org/packages/#pg12
Or use the install command, which you can get from the website itself.
An additional package we need to install is postgresql12-contrib, which includes all the additional (contrib) features. To locate the packages, the syntax is: yum list postgresql12*
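A minimal sketch for a RHEL/CentOS 7 machine (the repo RPM URL follows the PGDG pattern; pick the one matching your OS version):
# install the PGDG repository RPM
sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
# list the available PostgreSQL 12 packages
yum list postgresql12*
# install the server plus the contrib package
sudo yum install -y postgresql12-server postgresql12-contrib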
configuring PostgreSQL post installation
During the installation of PostgreSQL on a Linux system, you may have noticed that the installation
process lacks certain user interactions that are typically associated with software installations. Unlike
some other software, PostgreSQL installation on Linux doesn't ask you for important details such as the
data directory, installation directory, port number, or even the PostgreSQL password.
You cannot log in to the postgres user directly with a password at first; only root can switch to it. Once you have switched (su - postgres from root), you can set the password using the passwd command as the root user. The installer has already set the PGDATA path for the user; you just need to add the binaries' path to the user's PATH:
export PATH=/usr/pgsql-12/bin:$PATH
When you want to use a different user, you will have to edit the environment PATH for that user too, as sketched below.
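A sketch of making the change persistent for a user (assuming the PGDG binary layout):
# append to the user's ~/.bash_profile so the PATH survives new logins
echo 'export PATH=/usr/pgsql-12/bin:$PATH' >> ~/.bash_profile
source ~/.bash_profile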
3. database cluster
initdb
initdb syntax
shutdown option
syntax
reload vs restart
pg_controldata
syntax
see data directory
initdb
initdb ships with the server packages; once they are installed you have to initialize a storage area (the data directory), for example:
/var/lib/pgsql/data/
initdb syntax
You have to execute initdb as the postgres user.
syntax:
initdb -D /usr/var/lib/pgsql/main
pg_ctl -D /usr/var/lib/pgsql/main
To run the command you have to give the full path of the initdb binary, followed by the option and the data directory. Make sure the command is not executed as the root user. If you run into a permissions error, you also have to change the owner of the folder using chown:
chown postgres:postgres /postgresql/data/
Then run the command again and it should execute correctly:
/usr/lib/postgresql/13/bin/initdb -D /postgresql/data/
Now go to the directory you created and you will see the files created in the new data directory. Once we have created the cluster it won't start automatically; we need to start it with the command below. As with the initdb command, you have to give the full path of pg_ctl or edit the environment to include the pg_ctl path.
pg_ctl start
/usr/lib/postgresql/13/bin/pg_ctl start -D /postgresql/data/
If you run into an error because the new cluster tries to use port 5432, which is already taken by the default cluster, change the port in the new cluster's postgresql.conf.
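A sketch of the fix (5444 is the port used in the connection example below):
# edit /postgresql/data/postgresql.conf and set:  port = 5444
# then restart the new cluster
/usr/lib/postgresql/13/bin/pg_ctl restart -D /postgresql/data/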
To connect to the new cluster we have created, use the syntax below, and make sure to provide the correct port for the cluster:
psql -U postgres [or any user you want] -p 5444 [port number]
\q
You can check whether the cluster is running by using the below command:
/usr/lib/postgresql/13/bin/pg_ctl status -D /postgresql/data/
shutdown option
syntax
For this purpose I will use the smart option to shut down the new cluster we have created; the rest of the shutdown options work basically the same way.
pg_ctl stop -m smart -D [data directory]    (you can also use immediate or fast)
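For reference, a quick sketch of the three modes:
# smart: wait for all clients to disconnect, then shut down
# fast: roll back active transactions, disconnect clients, shut down
# immediate: abort without a clean shutdown; crash recovery runs on the next start
/usr/lib/postgresql/13/bin/pg_ctl stop -m fast -D /postgresql/data/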
reload vs restart
Reload is used when we make changes to the configuration files; it only loads the new config, without restarting the service. Some changes in the config won't take effect unless we restart the service.
restart: gracefully shuts down all activity, closes all open files, and starts again with the new configuration.
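A minimal sketch of triggering a reload:
/usr/lib/postgresql/13/bin/pg_ctl reload -D /postgresql/data/
-- or from within psql:
select pg_reload_conf();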
pg_controldata
syntax
pg_controldata /var/lib/pgsql/main
to see the data directory connect psql client and use the below command
show data_directory;
4. PostgreSQL Directory layout
bin folder
data
log folder
bin folder
Here you will find all the PostgreSQL utility binaries, such as initdb & pg_ctl.
data
This folder holds the cluster's data; its location might change if multiple clusters are created, but by default it is /var/lib/postgresql/13/main. Here you will find all the data related to the databases and the configuration files for the cluster.
log folder
The PostgreSQL log is very important; it can help you troubleshoot the status of PostgreSQL.
postgresql.conf
postgresql.auto.conf
reset all
pg_ident.conf
pg_hba.conf
postgresql.conf
The file contains parameters that configure and manage the performance of the database server. When we run the initdb command to create a cluster, it creates a default postgresql.conf file. pg_settings is a view that contains all the information about the configuration in postgresql.conf; you can query it to get the parameters that are set, for example to list parameters that were recently edited and are pending a restart to apply (see the queries below).
To get the list of columns in pg_settings we can use the option below, which is useful in case you want to build a query yourself:
\d pg_settings
select name , setting , category , boot_val from pg_settings where sourcefile =
'/etc/postgresql/13/main/postgresql.conf';
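And a sketch of the pending-restart query mentioned above (pending_restart is a boolean column of pg_settings):
select name, setting, pending_restart from pg_settings where pending_restart;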
postgresql.auto.conf
This file holds parameters that were modified from the CLI rather than by editing postgresql.conf itself. Editing postgresql.conf directly is critical, so it is better to change parameters from the psql command line using the ALTER SYSTEM command, keeping in mind that some parameters require a restart before they take effect in PostgreSQL.
To view the current value, use the SHOW command with the parameter name, or check it from the CLI with a query against pg_settings, as sketched below.
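A minimal sketch using work_mem as the example parameter:
show work_mem;
select name, setting, unit from pg_settings where name = 'work_mem';
-- change it from the CLI; this is written to postgresql.auto.conf
alter system set work_mem = '10MB';
select pg_reload_conf();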
We have changed the work_mem parameter from 3 to 10; let's check whether this is reflected in the postgresql.conf file.
The value hasn't changed. Why? Because the change is not written to the postgresql.conf file; it is recorded in postgresql.auto.conf, and if you check that file you will find the change you made to work_mem.
the file will be located in the data directory of the db-cluster
/var/lib/postgresql/13/main/
If you check the content of the file you will find the new value for work_mem. PostgreSQL loads the value from postgresql.auto.conf and it overrides the value in postgresql.conf.
reset all
If you want to reset all the changes you made from the psql CLI using the ALTER SYSTEM command, you can use the statements below.
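A sketch of the reset:
alter system reset all;
-- or reset a single parameter:
alter system reset work_mem;
select pg_reload_conf();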
pg _ident.conf
This file is part of the authentication configuration of PostgreSQL: it allows you to map an OS user to a PostgreSQL user, matching a username at the OS level with a user in the PostgreSQL database. For example, I have a user called ahmed that exists at the OS level but not in PostgreSQL, so I will map it to the PostgreSQL user called postgres.
In the file you are asked to give a friendly name for the map, then add the OS username in the system-username column, and the PostgreSQL username in the postgresql_username column.
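A sketch of the mapping (the map name mymap is illustrative; it is referenced from pg_hba.conf with the option map=mymap):
# MAPNAME    SYSTEM-USERNAME    PG-USERNAME
mymap        ahmed              postgres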
after that you need to reload PostgreSQL
pg_hba.conf
Controls client authentication between the PostgreSQL server and the client applications.
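A sketch of a typical entry (the address range is illustrative):
# TYPE   DATABASE   USER   ADDRESS         METHOD
host     all        all    10.10.10.0/24   md5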
creating database
createdb utility
drop database
from psql
dropdb utility
create user
using --interactive
drop user
privilege
creating database
createdb utility
syntax
createdb [database name]
1. using the -U option
I will create a database called lenovo. When logged in as the postgres OS user the syntax is simply:
~$ createdb lenovo
If you are logged in as another user such as root, add the -U option to connect as the postgres database user: createdb -U postgres lenovo
create database from psql
the syntax :
If you don't specify the owner, the owner will automatically be the user that issued the command.
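A minimal sketch using the names from the examples above:
create database lenovo owner postgres;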
You can use the option below and it will display the connection info, including the database, user, and port:
\conninfo
drop database
To drop a database, just as with create, you can do it from psql or from the command line.
from psql
#drop database [database_name]
dropdb utility
This utility allows you to drop a database from the command line.
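For example, to drop the database created earlier:
dropdb -U postgres lenovo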
create user
While creating a user it is important to note that the username must be unique and must not start with 'pg_'. The user postgres is created automatically during installation; it holds the superuser privilege, which allows it to create users and grant role privileges. The postgres user has all privileges with grant option.
As usual, you can use --help to check the different options available. For this purpose I will create a user which is not a superuser, and I will specify a password for it, so the syntax is:
root@PostgreSQLSTG ~# createuser -U postgres -P -S [username]
postgres@PostgreSQLSTG ~# createuser -P -S [username]
using --interactive
For a more interactive option, the createuser utility accepts --interactive, which asks you for the different parameters you have to specify; this option exists for ease of use.
createuser --interactive
create user from psql
syntax
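A minimal sketch (the role name follows the earlier examples; the password is a placeholder):
create user ahmed with password 'secret' nosuperuser;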
drop user
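A minimal sketch (role name from the example above):
drop user ahmed;
-- or from the shell, using the dropuser utility:
dropuser -U postgres ahmed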
privilege
A privilege is the right to execute a particular SQL statement, or the right to access another user's objects. When you issue \du in psql you will see the users with their cluster-level privileges.
syntax
When you create a table, the owner is the user who created it, and other users will not be able to select from the table. To solve this you have to grant the SELECT privilege to the user, as sketched below.
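A sketch (the table and role names are illustrative):
grant select on store to ahmed;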
table inheritance
copy table
table inheritance
Child tables inherit all columns from the parent (master) table, while also being allowed to have additional fields unique to each child table. Furthermore, the ONLY keyword can be used to specify that a query should exclusively target a particular table and not its children. Please note that any updates or deletions made to the parent table will automatically propagate to the child tables due to the inheritance relationship.
We will create a child table called online_booking and specify its inheritance using the INHERITS syntax:
inherits(parent_table_name)
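A sketch, assuming the parent table is called order as in the text (quoted because order is a reserved word):
create table online_booking (
    booking_channel varchar(50)   -- an extra column unique to the child table (illustrative)
) inherits ("order");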
To check the description of the child table and verify that it inherits from the parent "order" table, inspect its metadata:
\d online_booking
You will notice that the child table inherits all the columns from the parent table. We can also use \d+ order on the parent table; it gives more info about the parent and shows whether there are child tables.
In a child table you can update without any special requirement, but updating the parent table requires caution, because any change will be reflected in its children.
copy table
Copying is used to duplicate the structure of a table along with its data into another table. You implement this by creating a new table using AS TABLE followed by the existing table name. Below is the syntax (with data):
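A sketch (table names are illustrative):
-- copy the structure along with the data
create table store_copy as table store;
-- copy the structure only, without rows
create table store_empty as table store with no data;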
default tablespaces
pg_default
pg_global.
creating tablespace.
drop tablespace
To avoid load on a single disk, we can segregate the data being stored across different parts of the disk (or different disks).
default tablespaces
PostgreSQL comes with two out-of-the-box tablespaces, namely pg_default & pg_global. The default location for tablespaces is the data directory.
pg_default
All newly created databases use the pg_default tablespace, unless overridden by the TABLESPACE clause while creating the database.
pg_global.
pg_global holds the shared system catalogs that are common to all databases in the cluster.
creating tablespace.
to show the tablespace that are in the db_cluster you can use the below command
select * from pg_tablespace ;
To create a new tablespace, first create a directory where the tablespace will store its data.
syntax:
When creating the new directory you will need to set the required permissions. Now, when you create any new object, such as a table, a database, or even an index, you can specify the tablespace that will store the object, as sketched below.
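A sketch of the full sequence (paths are illustrative; table1 is the tablespace used in the examples below):
# create the directory and give ownership to postgres
sudo mkdir -p /data/tbs1
sudo chown postgres:postgres /data/tbs1
-- then, from psql:
create tablespace table1 location '/data/tbs1';
create table archived_orders (id int) tablespace table1;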
If you check the directory of the tablespace you will find new files created there.
move table between tablespaces
The table store is not assigned to any tablespace, so if you find its tablespace column empty it means the table is assigned to the default tablespace.
alter table [table-name] set tablespace [tablespace-name];
alter table store set tablespace table1;
select pg_relation_filepath('store');
drop tablespace
Note that you cannot drop a tablespace while there are objects in it.
the syntax:
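For example, for the tablespace created above:
drop tablespace table1;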
PostgreSQL creates temporary tablespaces for actions required to complete a query, for example a sorting query. A temporary tablespace doesn't store permanent data; its contents are removed after you shut down the database.
syntax:
create tablespace [tablespace-chosen-name] owner [username] LOCATION '[filepath]'
The owner parameter is not required; in case you don't specify it, the temp tablespace owner will be the postgres user. After creating a temp tablespace there is one change that must be made in the PostgreSQL configuration: there is a parameter called temp_tablespaces where you have to specify the temp tablespace name, as sketched below.
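A sketch (the tablespace name and path are illustrative):
create tablespace temp_tbs location '/data/temptbs';
alter system set temp_tablespaces = 'temp_tbs';
select pg_reload_conf();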
9. backup and restore
type of backup
1. logical backup
2. physical backup
logical backup
physical backup
offline backup
type of backup
1. logical backup
A logical backup refers to the process of converting the data in a database into a straightforward text
format. This typically involves creating scripts of the database, table, or database cluster, which are then
saved as SQL files.
The scripts for tables and databases are composed of SQL statements. By executing these SQL
statements, the tables and databases can be recreated
2. physical backup
A physical backup entails duplicating the actual files used for the storage and retrieval of the database.
These backups are typically in binary form, which is not human-readable.
Physical backups can be categorized into two types:
Offline Backup: This type of backup is performed when the system is shut down. During this time, a
backup of the data directory is taken.
Online Backup: This is conducted while the system is operational and running, with users actively
connected to it.
logical backup
To take a logical backup in PostgreSQL, you can use the built-in utility pg_dump. The syntax for using
this utility is as follows:
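pg_dump -U [username] [database_name] > [path_to_backup_file]/backup_file_name.sql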
Replace [username] with your PostgreSQL username, [database_name] with the name of the database
you want to back up, and [path_to_backup_file]/backup_file_name.sql with the path and name of the file
where you want to store the backup.
For example:
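pg_dump -U postgres dvdrental > /share/dvdrental_backup.sql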
In this example, postgres is the username, dvdrental is the database name, and the backup will be
stored in the specified path /share/dvdrental_backup.sql. Make sure that you have the necessary
permissions for the folder where you intend to store the backup. If you don't specify an extension for the
file, it will be automatically saved with the .sql extension, which is the standard for SQL files.
If you check the file content using any preferred text editor, such as less, you will find only SQL statements in the file.
taking backup of the entire cluster
To perform a logical backup of a PostgreSQL database cluster, you can use the built-in utility
pg_dumpall . This utility is designed to back up all the databases in the cluster, and the user executing
pg_dumpall needs to have full superuser privileges.
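pg_dumpall -U postgres > /share/dbcluster_backup.sql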
In this example, postgres is the superuser, and the backup of the entire database cluster will be
stored at the specified path /share/dbcluster_backup.sql. Remember to ensure that you have the
necessary permissions for the folder where the backup file will be stored. By default, if you don't specify
an extension for the file, it will be saved with the .sql extension.
While running pg_dumpall for a PostgreSQL database cluster backup, it's normal for the utility to
prompt for the password multiple times. This happens because it requests the password for each
database in the cluster.
This approach is suitable for smaller databases due to its straightforward nature. However, it is not
recommended for large databases as it can be time-consuming to generate the backup. For larger
databases, more efficient methods or tools that can handle large volumes of data more effectively might
be preferable.
In the case of logical backups, where the dump file can become quite large, there are strategies to
manage the file size and storage requirements:
1.Compression: You can compress the dump file to reduce its size. This is particularly useful when
dealing with large databases, as it can significantly decrease the space needed for storage. Most
compression tools can reduce the file size substantially, making it easier to handle and store.
syntax :
example:
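A sketch using gzip with the dvdrental example from earlier:
pg_dump -U [username] [database_name] | gzip > [backup_file].sql.gz
pg_dump -U postgres dvdrental | gzip > /share/dvdrental_backup.sql.gz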
2- Splitting the File: If you have limited space in your operating system or wish to distribute the storage
of the dump across different partitions, you can split the dump file into smaller parts. This approach
allows you to manage storage more effectively, especially when dealing with constraints in disk space or
organizational policies on data storage.
To split the output of pg_dumpall into smaller files, you can indeed use the split command in
Unix/Linux systems. The -b option allows you to specify the size of each split file. You can specify the
size in kilobytes (k) , megabytes (m) , or gigabytes (g).
For example, if you want to split the dump into 1KB chunks, you would use the 1k option. Similarly, you
can use m for megabytes and g for gigabytes.
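A sketch (the output prefix is illustrative):
# split the cluster dump into 1KB chunks; use m or g for megabytes or gigabytes
pg_dumpall -U postgres | split -b 1k - /share/dbcluster_backup_part_
# to restore, concatenate the parts and feed them to psql
cat /share/dbcluster_backup_part_* | psql -U postgres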
To test the process I have gone ahead and dropped the database dvdrental; then I will attempt to restore the database.
In this case I will drop a table and restore only that table; this is one of the common scenarios you will encounter in production.
command to restore table
syntax:
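A sketch of both scenarios (the actor table of the dvdrental sample is used for the single-table case):
# restore the whole dropped database: recreate it, then replay the dump
createdb -U postgres dvdrental
psql -U postgres dvdrental < /share/dvdrental_backup.sql
# single table: dump just that table before dropping it, then replay that file
pg_dump -U postgres -t actor dvdrental > /share/actor_backup.sql
psql -U postgres dvdrental < /share/actor_backup.sql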
physical backup
offline backup
It's important to note that when we restore the database, the PostgreSQL server must be shut down during the restore. A partial restore or single-table restore is not possible, because we back up the entire data directory, and when we restore, we restore the entire data directory.
To start, I have two database clusters; using the commands below we can list them and stop one of them:
pg_lsclusters
pg_ctlcluster 13 ahmed stop
The backup is created in the folder you are currently in, so it is better to change your directory to the folder where you want to store the backup, as sketched below.
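A sketch of the whole offline backup (paths follow the Debian layout used above; the backup folder is illustrative):
pg_ctlcluster 13 ahmed stop
cd /backup
tar -czf ahmed_cluster_backup.tar.gz /var/lib/postgresql/13/ahmed
pg_ctlcluster 13 ahmed start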
In this type of backup the system stays available and online for users to do their work, while in the background a continuous backup is taken. In PostgreSQL we use a continuous archiving method to implement online backup.
WAL files: similar to the MS SQL log file; transactions are written to the WAL files upon commit before they are written to the data files. This is done to ensure that in case of a crash we can recover the system using the WAL files.
Archiver: the archiver's role is to copy the WAL files to another safe location. To achieve online backup we use continuous archiving of WAL files, or in a better context, continuous copying of WAL files to safe locations.
In case of a disaster, assume I have a full backup from Sunday and the system crashes Monday at 10:00 AM, and you need to recover the system up to 10:00 AM because you don't want to lose data. In this case you will need the full backup taken on Sunday plus all the archived WAL files. This method of restore is called point-in-time recovery.
Also look for the parameter 'archive_mode': uncomment it and change it from off to on. You can verify it with:
show archive_mode;
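A minimal archiving setup in postgresql.conf might look like this (the archive destination is illustrative):
wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/%f'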
We can use pg_start_backup(), which just informs PostgreSQL that we are going to start a backup (note: this is not the real backup), and pg_stop_backup() to end it. The argument is a label for the backup, not a database name:
select pg_start_backup('backup_label');
select pg_stop_backup();
For now we have archived WAL for a certain database from psql. We have set up archive backup, but you must keep in mind that the archiver requires a full (base) backup to be available; without that, the archived WAL is totally useless.
This is where pg_basebackup comes in: a full online backup means a full backup taken while the system is online.
Now go back to the psql console and stop the backup that we started:
select pg_stop_backup('f');
A base backup can be used for replication and point-in-time recovery. pg_basebackup doesn't put the database in backup mode; the database stays accessible while the backup is running. pg_basebackup cannot take a single object, a single table, or a single database; instead it only takes a full backup of the entire db-cluster.
syntax:
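A typical invocation (the -U user and -D target directory are added here for completeness):
pg_basebackup -h localhost -U postgres -D /share/basebackup -Ft -z -P -Xs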
-h the host to connect to; in my case localhost, but you can take the backup from a remote server
-Ft the output format you want (tar)
-z stores the backup in gzip-compressed format
-P lets you view the progress of the backup
-Xs means you want a backup of the entire database along with all the transactions that happen during the backup (WAL streaming)
10. Maintenance in PostgreSQL (course closure)
analyse command
vacuum freeze
vacuum
what is vacuum
what is vacuum full
routine reindexing
rebuild index
cluster
example:
explain select * from actor;
the query will display the following
Cost here is an estimate that combines the number of pages read and the CPU work; the planner calculates it and presents the result. For a sequential scan the cost is roughly:
cost = (number of pages read * seq_page_cost) + (number of rows scanned * cpu_tuple_cost)
Cost Analysis:
The cost displays two values:
1. Startup cost: the estimated cost before the first row can be returned.
2. Total cost: the estimated cost to return all the rows.
Row Details:
Rows Read: This indicates the number of rows that the query will read from the database.
Data Width:
Row Size: This represents the size or width of each row's data.
The EXPLAIN cost changes depending on the query; for example, the query below adds a WHERE clause that filters the result. You can see the cost is reduced because we are only checking certain rows, as sketched below.
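A sketch against the dvdrental actor table (actor_id is its primary key):
explain select * from actor where actor_id = 5;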
updating planner statistics / analyse
The PostgreSQL query planner relies on statistical information about the contents of tables in order to generate good plans for queries.
If the statistical information about the table, including the number of rows or the number of columns, remains static and is not updated, the query optimizer may struggle to create effective execution plans. Keeping these statistics up to date is crucial for the optimizer to generate optimal query plans that adapt to changes in the data distribution and size.
Tables in general keep changing: rows are updated, deleted, and newly inserted, so what's going on in the table has to be communicated to the optimizer. Inaccurate or outdated statistics may lead to the optimizer choosing a poor execution plan, which may lead to database performance degradation.
analyse command
The ANALYZE command collects information about row count, average row size, and row sampling. We can run ANALYZE automatically by enabling the autovacuum daemon (enabled by default) or run the ANALYZE command manually.
I have uploaded the insert file below to the server; we will use psql to run the inserts.
insert.sql
Now we will see what information the optimizer has about the table. Internally the rows are stored in pages; let's see what the optimizer knows about the number of pages and the number of rows, as sketched below.
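A sketch of that check via the pg_class catalog:
select relname, relpages, reltuples from pg_class where relname = 'tel_directory';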
So let's see the query execution plan for select * from tel_directory;
explain select * from tel_directory;
The execution plan is wrong: it says it will read 864 rows while there are 5000 rows, as shown by the count query. This means the optimizer doesn't have an accurate picture of the table.
Now we will run the ANALYZE command; this tells the optimizer the number of rows and pages:
analyze [table-name];
Now the optimizer knows that there are 5000 rows and that 32 pages are used to store them. Let's check EXPLAIN again; we will see the plan now uses the exact number of rows.
how to enable autovacuum
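autovacuum is controlled by a configuration parameter and is on by default; a sketch to check and enable it:
show autovacuum;
alter system set autovacuum = on;
select pg_reload_conf();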
vacuum freeze
is a critical maintenance operation in PostgreSQL. It's specifically designed to mark rows as "frozen,"
which means they are no longer subject to being vacuumed or removed by the system. This process is
essential for ensuring the stability and performance of the database, particularly in scenarios where data
changes frequently
Transaction ID Wraparound Prevention: PostgreSQL uses Transaction IDs (XIDs) to track the age
and status of transactions. Since XIDs are finite and wrap around, there's a risk of a "transaction ID
wraparound" if the XID limit is reached. When this happens, it can lead to data corruption and
downtime. VACUUM FREEZE helps prevent this by recycling old XIDs, making them available for
reuse.
Performance Optimization: As rows become "frozen," they are excluded from regular vacuuming
processes. This reduces the overhead of vacuum operations and helps maintain consistent
database performance over time.
Example:
Imagine you have a PostgreSQL database that's been running for several years, with frequent data
modifications. The XID counter has been steadily increasing. If you don't perform VACUUM FREEZE,
you risk reaching the XID limit, which could result in a catastrophic database failure.
PostgreSQL provides the utility vacuumdb to VACUUM FREEZE a table or an entire database.
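A sketch of both forms:
# freeze an entire database from the shell
vacuumdb --freeze -U postgres dvdrental
-- or a single table from psql:
vacuum freeze tel_directory;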
vacuum
PostgreSQL doesn't UPDATE IN PLACE or DELETE a row directly from the disk. When you delete records from a table they are not actually deleted: they are just marked as deleted and kept as old versions. As these old versions pile up and become obsolete, they cause fragmentation and bloating in the table, so tables and indexes will look exactly the same size even after you delete a lot of records.
to solve this we will use vacuum and vacuum full
what is vacuum
During vacuum there is no exclusive lock on the table, meaning we can run vacuum during business hours and it will not affect performance.
what is vacuum full
vacuum full rewrites the entire contents of the table into a new disk file with no extra space. vacuum full takes a lot more time than vacuum, because it rewrites the entire content of the table to disk again.
SELECT pg_size_pretty(pg_total_relation_size('your_table_name'));
or \dt+ [table-name]
you can see the size reduced
\dt+ [ pattern ]
Lists all tables, or only those that match the pattern; the + form shows extra information such as the table size.
We have created a table called tel_directory with 9000 rows; this is the current size of the table:
\dt+ tel_directory
Autovacuum is enabled by default; if it stayed enabled we would not be able to observe the size difference from a manual vacuum or vacuum full after deleting rows, so it is disabled for this test.
Now that 3002 records have been deleted, let's check the row count and size. The size didn't decrease, because the records are not removed from the table; they are marked as old versions, meaning they still occupy disk space.
state.sql
The size of the table increased, but the table is now considered bloated (fragmented): it contains live records alongside records that are deleted but not removed, only marked as old. 'Bloating' is when the size of the table grows even though it holds the same amount of live records.
Now I will drop this table and start again: create the table and insert the records from the insert.sql file (truncate would likewise remove all records in the table). I will delete the rows as before and check the count and the size after the delete.
# vacuum(verbose) tel_directory;
The verbose option shows what was done and the status after the command finishes; it is optional. You can use the plain syntax below if you prefer:
vacuum tel_directory;
Now we will insert the rows from state.sql and check whether the space changed.
vacuum(full,verbose) tel_directory;
The size, as shown below, has decreased by a huge margin compared to a normal vacuum: vacuum full has removed all the dead rows and released the space back to the OS.
routine reindexing
A fragmented index has pages where the logical order, based on the key values, differs from the physical ordering inside the data file. A heavily fragmented index can degrade query performance, because additional I/O is required to locate the data the index points to. REINDEX rebuilds an index from the data stored in the index's table and eliminates the empty space between pages.
syntax : reindex index <index-name> .
There are internal views in PostgreSQL that show statistics about indexes, and extensions that add more of these features; we are looking for the extension that shows index statistics. To install the extension, go to the psql console and type the command below.
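The extension that exposes index statistics such as leaf_fragmentation is pgstattuple; a sketch:
create extension pgstattuple;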
to use the extension first check how many index you have in the database.
\di
Now I will check the status of one of the indexes: whether the index is bloated or fragmented, or working fine.
Note: the extension must be installed on the database whose index you want to check.
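A sketch of the check (the index name is illustrative):
select leaf_fragmentation from pgstatindex('tel_directory_pkey');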
The most important column here is leaf_fragmentation: if the leaf fragmentation is around 50 or more, the index is considered fragmented.
rebuild index
syntax :
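A sketch (index name as above):
reindex index tel_directory_pkey;
-- or rebuild every index on the table:
reindex table tel_directory;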
cluster
CLUSTER instructs PostgreSQL to cluster the table based on the index specified by index_name. Clustering is a one-time operation: when the table receives subsequent changes, those changes are not clustered, so you will have to run CLUSTER again.
Now, to demonstrate clustering, I will create a table and insert rows in random order. Then I will cluster the table using the index I have created; after that we will run a simple query and check whether the records come back in order, as sketched below.
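A sketch of the demonstration (the index and column names are illustrative):
create index idx_tel_id on tel_directory (id);
cluster tel_directory using idx_tel_id;
select * from tel_directory limit 10;   -- rows now come back in index order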
prerequisite :
Configuring Master
Configuring Slave
monitoring replication
prerequisite :
1. Administrative privileges (sudo access) are required for modifying configuration files.
2. Temporarily disable the firewall to facilitate the necessary configurations.
3. Root access is essential for instances where deletion of slave data files is necessary.
We will implement master-slave streaming replication using two VMs running Ubuntu 22.04 (Jammy). Both nodes will have PostgreSQL 12 installed and are configured with IPs in the same subnet. Below are the VM details:
1.postgresqlDB01 : 10.10.10.77
2.postgresqlDB02 : 10.10.10.78
installing PostgreSQL
first check if PostgreSQL 12 is available in Ubuntu repositories using the below command
apt-cache madison postgresql-12
If PostgreSQL 12 isn't available in our current repository, the next step involves adding the PostgreSQL
12 repository to our Ubuntu machine. This requires importing the GPG key and then adding the
repository. To accomplish this, we'll use the following command.
Note: Internet access on the VM is a prerequisite for these steps
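The commands below follow the PGDG documentation (newer releases use signed-by keyrings rather than apt-key):
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get update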
We've successfully updated the repository. Now, we'll proceed with the installation of PostgreSQL 12.
The following command will be used for this purpose
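sudo apt-get install -y postgresql-12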
Configuring Master
Now we will create a database user "replication" who will carry out the task of replication. Switch to the postgres user, which is added to Ubuntu automatically when PostgreSQL is installed:
su - postgres
If you are unable to switch to the postgres user, reset its password first (sudo passwd postgres). Then create the database user "replication" using the command below; note down the username and password, you will need them in the slave configuration.
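A sketch (the password is a placeholder):
create user replication with replication login encrypted password 'yourpassword';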
\du
Next, we'll adjust the maximum number of connections permitted for the replication user. This is done by
executing the following command within the psql client.
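A sketch (the limit value is illustrative):
alter user replication connection limit 10;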
The configuration files for PostgreSQL are typically found in the following directory:
/etc/postgresql/12/main
We need to modify a file named postgresql.conf to configure this server as the master. Both nano and vim are suitable editors for this task; I will be using nano.
Note : Make sure to use a user account with sudo privileges or the root account for these steps, as
administrative access is required to edit the postgresql.conf file.
sudo nano /etc/postgresql/12/main/postgresql.conf
Edit the following parameter in the postgresql.conf file. If the line is currently commented out,
uncomment it to activate the option
you can use Ctrl+w to search and go to desire line
listen_addresses = '*'
wal_level = replica
max_wal_senders = 10
wal_keep_segments = 64
once done click Ctrl+x then y , then hit enter
Now the slave server needs authentication for replication. Append the following line to the /etc/postgresql/12/main/pg_hba.conf file:
sudo nano /etc/postgresql/12/main/pg_hba.conf
# Replace 10.10.10.78 with the slave server's private IP
host    replication    replication    10.10.10.78/24    md5
The next step is to restart the PostgreSQL service:
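sudo systemctl restart postgresql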
We have finished all configuration in Master Server and our master server is ready for replication
Configuring Slave
As mentioned earlier, remember the password set for the replication user we created. In the following step you will be prompted to enter the password for this replication user.
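One common sequence for the fetch (the data directory follows the Debian layout; -R writes the standby settings automatically):
sudo systemctl stop postgresql
sudo -u postgres rm -rf /var/lib/postgresql/12/main/*
sudo -u postgres pg_basebackup -h 10.10.10.77 -U replication -D /var/lib/postgresql/12/main -P -R -Xs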
Once the fetching process is complete, proceed to start the PostgreSQL service
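sudo systemctl start postgresql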
Congratulations, you have successfully replicated your database! You can verify this by making any
change in the master database and observing that it gets immediately replicated in the slave database
First, create a database named TestDB . Use the following SQL command:
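CREATE DATABASE "TestDB";  -- quoted so the mixed-case name matches \c TestDB below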
\c TestDB
Then, create a table. Let's say you create a table named example_table :
CREATE TABLE example_table (
id SERIAL PRIMARY KEY,
data VARCHAR(100)
);
Finally, on the slave server, check if the data has been replicated. You can do this by querying
the same table on the slave server:
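\c TestDB
SELECT * FROM example_table;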
monitoring replication
We can verify the replication status by using the following command. If the state displays 'streaming', it
indicates that everything is functioning correctly
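Run this on the master:
SELECT client_addr, state FROM pg_stat_replication;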
failover
Before we start the failover, we need to verify the sync between the master and the slave. On the slave, run the command below; if it returns 0 it means there is no delay.
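One way to check, comparing received and replayed WAL (returns 0 when fully caught up):
SELECT pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn());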
Once you have confirmed everything is fine, you need to stop the services on the master, either by shutting down the server or stopping the PostgreSQL service. For this purpose I will stop the PostgreSQL service:
systemctl stop postgresql
promote the standby to read-write
The next step is to promote the standby server so that it can both read and write, using the following command:
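A sketch using the Debian wrapper, with the psql alternative:
sudo pg_ctlcluster 12 main promote
-- or, from psql on the standby (PostgreSQL 12+):
SELECT pg_promote();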
Edit pg_hba.conf on the slave server and add the IP of the master server so it can act as a slave:
cd /etc/postgresql/12/main
echo "host replication replication 10.10.10.80/24 md5" >> pg_hba.conf
Create standby.signal on Old Primary
Create the standby.signal file on the old primary to ensure it starts as a standby when brought back
online:
touch /var/lib/postgresql/12/main/standby.signal
Run the command below: if it returns t, the database is in read-only (standby) mode; if it returns f, the database is in read-write mode.
SELECT pg_is_in_recovery();
This query will return a single boolean value:
true if the server is a standby and false if it's the primary
failback
First verify that the database is syncing by running the command below on the slave server; the result should be 0.
cd /etc/postgresql/12/main/
nano pg_hba.conf
promote the slave (old-master)
touch /var/lib/postgresql/12/main/standby.signal
cd /etc/postgresql/12/main
nano postgresql.auto.conf
12. logical replication
Logical replication is different from streaming replication: where streaming replication ships the WAL from the master, logical replication sends the actual changes, such as insert into t1 values (1, 'value').
[Diagram: logical replication (source database → replica database) versus physical replication (source database → replica database)]
physical replication
In physical replication, the replica is forced to copy the whole cluster (schemas, tables, and so on) from the master server. Physical replication therefore cannot replicate a single table.
logical replication
As mentioned, logical replication only replicates the changes, which gives it the advantage of being able to replicate a single table. Here the primary database sends the DML commands to be replayed on the standby server.
here's the scoop:
1. Decoding WAL Records: In PostgreSQL, Write-Ahead Logging (WAL) records all changes made to
the database. When using logical replication, the first step is to decode these WAL records. Think of
it like decrypting a secret code; this process extracts the actual changes that were made to the data.
2. Streaming to Replica Server: Once those changes are decoded, they're streamed over to the
replica server. This is like sending a live feed of the changes happening in the source database to
the replica, ensuring it stays up to date with the latest data modifications.
3. Applying Statements on Replica: On the replica server, these decoded changes are then applied
as SQL statements. It's like having a copycat follow along with the source database's actions,
executing the same SQL commands to mimic the changes.
So, in a nutshell, PostgreSQL goes through this process of decoding, streaming, and applying changes
to keep the replica database in sync with the source. It's like a well-choreographed dance of data
replication!
[Diagram: the source database writes data; the decoded changes are streamed live to the replica database, which applies them]
In this whole setup the primary server is called the publisher server, and the replica is called the subscriber server, similar to MS SQL replication.
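A minimal sketch of the publisher/subscriber setup (names and connection string are illustrative; the subscriber must already have a matching empty t1):
-- on the publisher (primary):
create publication my_pub for table t1;
-- on the subscriber (replica):
create subscription my_sub
    connection 'host=10.10.10.77 dbname=postgres user=replication password=yourpassword'
    publication my_pub;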
physical replication
sequence of steps: