Commit 989be08

Support multiple synchronous standby servers.

Previously synchronous replication offered only the ability to confirm that all changes made by a transaction had been transferred to at most one synchronous standby server.

This commit extends synchronous replication so that it supports multiple synchronous standby servers. It enables users to consider one or more standby servers as synchronous, and increase the level of transaction durability by ensuring that transaction commits wait for replies from all of those synchronous standbys.

Multiple synchronous standby servers are configured in synchronous_standby_names which is extended to support new syntax of 'num_sync ( standby_name [ , ... ] )', where num_sync specifies the number of synchronous standbys that transaction commits need to wait for replies from and standby_name is the name of a standby server. The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before is also still supported. It's the same as new syntax with num_sync=1.

This commit doesn't include "quorum commit" feature which was discussed in pgsql-hackers. Synchronous standbys are chosen based on their priorities. synchronous_standby_names determines the priority of each standby for being chosen as a synchronous standby. The standbys whose names appear earlier in the list are given higher priority and will be considered as synchronous. Other standby servers appearing later in this list represent potential synchronous standbys.

The regression test for multiple synchronous standbys is not included in this commit. It should come later.

Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs, Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen, Rajeev Rastogi

Many thanks to the various individuals who were involved in discussing and developing this feature.

1 parent 2143f5e commit 989be08
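The commit message says that commits wait for replies from all of the chosen synchronous standbys. A rough Python sketch of that rule follows; it is an illustration only and is not taken from the PostgreSQL source. WAL positions are modelled as plain integers, and the names `releasable_lsn` and `commit_can_proceed` are hypothetical. The sketch assumes that a "reply" means the standby has reported a position at or beyond the commit record.

```python
from typing import Dict, List, Optional


def releasable_lsn(sync_standbys: List[str],
                   reported_lsn: Dict[str, int],
                   num_sync: int) -> Optional[int]:
    """Return the highest WAL position confirmed by every synchronous standby.

    Returns None when there are fewer than num_sync synchronous standbys,
    in which case no waiting commit can be released.
    """
    if not sync_standbys or len(sync_standbys) < num_sync:
        return None
    # A commit is confirmed only when all synchronous standbys have replied,
    # so the slowest standby decides how far waiters can be released.
    return min(reported_lsn[name] for name in sync_standbys)


def commit_can_proceed(commit_lsn: int, released: Optional[int]) -> bool:
    """A waiting commit proceeds once its record is covered by `released`."""
    return released is not None and commit_lsn <= released
```

With two synchronous standbys at positions 120 and 100, only commits whose records end at or before 100 would be released in this model; a commit at 110 keeps waiting until the slower standby catches up.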

13 files changed

+806
-171
lines changed

doc/src/sgml/config.sgml

+42 -7
@@ -2906,34 +2906,69 @@ include_dir 'conf.d'
  </term>
  <listitem>
  <para>
- Specifies a comma-separated list of standby names that can support
+ Specifies a list of standby names that can support
  <firstterm>synchronous replication</>, as described in
  <xref linkend="synchronous-replication">.
- At any one time there will be at most one active synchronous standby;
+ There will be one or more active synchronous standbys;
  transactions waiting for commit will be allowed to proceed after
- this standby server confirms receipt of their data.
- The synchronous standby will be the first standby named in this list
+ these standby servers confirm receipt of their data.
+ The synchronous standbys will be those whose names appear
+ earlier in this list, and
  that is both currently connected and streaming data in real-time
  (as shown by a state of <literal>streaming</literal> in the
  <link linkend="monitoring-stats-views-table">
  <literal>pg_stat_replication</></link> view).
  Other standby servers appearing later in this list represent potential
- synchronous standbys.
- If the current synchronous standby disconnects for whatever reason,
+ synchronous standbys. If any of the current synchronous
+ standbys disconnects for whatever reason,
  it will be replaced immediately with the next-highest-priority standby.
  Specifying more than one standby name can allow very high availability.
  </para>
+ <para>
+ This parameter specifies a list of standby servers by using
+ either of the following syntaxes:
+ <synopsis>
+ <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="parameter">standby_name</replaceable> [, ...] )
+ <replaceable class="parameter">standby_name</replaceable> [, ...]
+ </synopsis>
+ where <replaceable class="parameter">num_sync</replaceable> is
+ the number of synchronous standbys that transactions need to
+ wait for replies from,
+ and <replaceable class="parameter">standby_name</replaceable>
+ is the name of a standby server. For example, a setting of
+ <literal>'3 (s1, s2, s3, s4)'</> makes transaction commits wait
+ until their WAL records are received by three higher priority standbys
+ chosen from standby servers <literal>s1</>, <literal>s2</>,
+ <literal>s3</> and <literal>s4</>.
+ </para>
+ <para>
+ The second syntax was used before <productname>PostgreSQL</>
+ version 9.6 and is still supported. It's the same as the first syntax
+ with <replaceable class="parameter">num_sync</replaceable>=1.
+ For example, both settings of <literal>'1 (s1, s2)'</> and
+ <literal>'s1, s2'</> have the same meaning; either <literal>s1</>
+ or <literal>s2</> is chosen as a synchronous standby.
+ </para>
  <para>
  The name of a standby server for this purpose is the
  <varname>application_name</> setting of the standby, as set in the
  <varname>primary_conninfo</> of the standby's WAL receiver. There is
  no mechanism to enforce uniqueness. In case of duplicates one of the
- matching standbys will be chosen to be the synchronous standby, though
+ matching standbys will be considered as higher priority, though
  exactly which one is indeterminate.
  The special entry <literal>*</> matches any
  <varname>application_name</>, including the default application name
  of <literal>walreceiver</>.
  </para>
+ <note>
+ <para>
+ The <replaceable class="parameter">standby_name</replaceable>
+ must be enclosed in double quotes if a comma (<literal>,</>),
+ a double quote (<literal>"</>), <!-- " font-lock sanity -->
+ a left parentheses (<literal>(</>), a right parentheses (<literal>)</>)
+ or a space is used in the name of a standby server.
+ </para>
+ </note>
  <para>
  If no synchronous standby names are specified here, then synchronous
  replication is not enabled and transaction commits will not wait for
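The two syntaxes documented in this hunk can be illustrated with a small Python sketch. It is not the flex/bison grammar this commit adds in syncrep_gram and syncrep_scanner, and the function name `parse_sync_standby_names` is hypothetical. It assumes that a doubled double quote inside a quoted name stands for one literal quote, and it returns `(0, [])` for an empty setting, since the documentation says synchronous replication is not enabled in that case.

```python
import re
from typing import List, Tuple

# One token per match: a double-quoted name, a parenthesis or comma,
# or a bare word. Treating "" as an escaped quote is an assumption.
_TOKEN = re.compile(r'\s*(?:"((?:[^"]|"")*)"|([(),])|([^\s(),"]+))')


def _tokenize(value: str) -> List[Tuple[str, str]]:
    tokens = []
    pos = 0
    value = value.rstrip()
    while pos < len(value):
        m = _TOKEN.match(value, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        quoted, punct, word = m.groups()
        if quoted is not None:
            tokens.append(("name", quoted.replace('""', '"')))
        elif punct is not None:
            tokens.append((punct, punct))
        else:
            tokens.append(("word", word))
        pos = m.end()
    return tokens


def parse_sync_standby_names(value: str) -> Tuple[int, List[str]]:
    """Return (num_sync, names in priority order) for a setting string."""
    tokens = _tokenize(value)
    if not tokens:
        return 0, []  # empty setting: synchronous replication disabled
    first_kind, first_text = tokens[0]
    if (len(tokens) >= 2 and first_kind == "word"
            and re.fullmatch(r"[0-9]+", first_text)
            and tokens[1][0] == "("):
        # First syntax: num_sync ( standby_name [, ...] )
        if tokens[-1][0] != ")":
            raise ValueError("missing closing parenthesis")
        num_sync = int(first_text)
        body = tokens[2:-1]
    else:
        # Second syntax: standby_name [, ...], same as num_sync = 1
        num_sync = 1
        body = tokens
    if len(body) % 2 == 0:
        raise ValueError("expected a standby name")
    names = []
    for i, (kind, text) in enumerate(body):
        if i % 2 == 0 and kind in ("word", "name"):
            names.append(text)  # even positions hold names
        elif i % 2 == 1 and kind == ",":
            continue            # odd positions hold separators
        else:
            raise ValueError(f"unexpected token {text!r}")
    return num_sync, names
```

For instance, `parse_sync_standby_names('1 (s1, s2)')` and `parse_sync_standby_names('s1, s2')` give the same result in this sketch, matching the documented equivalence of the two forms.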

doc/src/sgml/high-availability.sgml

+58 -18
@@ -1027,10 +1027,12 @@ primary_slot_name = 'node_a_slot'

  <para>
  Synchronous replication offers the ability to confirm that all changes
- made by a transaction have been transferred to one synchronous standby
- server. This extends the standard level of durability
+ made by a transaction have been transferred to one or more synchronous
+ standby servers. This extends that standard level of durability
  offered by a transaction commit. This level of protection is referred
- to as 2-safe replication in computer science theory.
+ to as 2-safe replication in computer science theory, and group-1-safe
+ (group-safe and 1-safe) when <varname>synchronous_commit</> is set to
+ <literal>remote_write</>.
  </para>

  <para>
@@ -1084,8 +1086,8 @@ primary_slot_name = 'node_a_slot'
  In the case that <varname>synchronous_commit</> is set to
  <literal>remote_apply</>, the standby sends reply messages when the commit
  record is replayed, making the transaction visible.
- If the standby is the first matching standby, as specified in
- <varname>synchronous_standby_names</> on the primary, the reply
+ If the standby is chosen as the synchronous standby, from a priority
+ list of <varname>synchronous_standby_names</> on the primary, the reply
  messages from that standby will be used to wake users waiting for
  confirmation that the commit record has been received. These parameters
  allow the administrator to specify which standby servers should be
@@ -1126,6 +1128,40 @@ primary_slot_name = 'node_a_slot'

  </sect3>

+ <sect3 id="synchronous-replication-multiple-standbys">
+ <title>Multiple Synchronous Standbys</title>
+
+ <para>
+ Synchronous replication supports one or more synchronous standby servers;
+ transactions will wait until all the standby servers which are considered
+ as synchronous confirm receipt of their data. The number of synchronous
+ standbys that transactions must wait for replies from is specified in
+ <varname>synchronous_standby_names</>. This parameter also specifies
+ a list of standby names, which determines the priority of each standby
+ for being chosen as a synchronous standby. The standbys whose names
+ appear earlier in the list are given higher priority and will be considered
+ as synchronous. Other standby servers appearing later in this list
+ represent potential synchronous standbys. If any of the current
+ synchronous standbys disconnects for whatever reason, it will be replaced
+ immediately with the next-highest-priority standby.
+ </para>
+ <para>
+ An example of <varname>synchronous_standby_names</> for multiple
+ synchronous standbys is:
+ <programlisting>
+ synchronous_standby_names = '2 (s1, s2, s3)'
+ </programlisting>
+ In this example, if four standby servers <literal>s1</>, <literal>s2</>,
+ <literal>s3</> and <literal>s4</> are running, the two standbys
+ <literal>s1</> and <literal>s2</> will be chosen as synchronous standbys
+ because their names appear early in the list of standby names.
+ <literal>s3</> is a potential synchronous standby and will take over
+ the role of synchronous standby when either of <literal>s1</> or
+ <literal>s2</> fails. <literal>s4</> is an asynchronous standby since
+ its name is not in the list.
+ </para>
+ </sect3>
+
  <sect3 id="synchronous-replication-performance">
  <title>Planning for Performance</title>

@@ -1171,19 +1207,21 @@ primary_slot_name = 'node_a_slot'
  <title>Planning for High Availability</title>

  <para>
- Commits made when <varname>synchronous_commit</> is set to <literal>on</>,
- <literal>remote_apply</> or <literal>remote_write</> will wait until the
- synchronous standby responds. The response may never occur if the last, or
- only, standby should crash.
+ <varname>synchronous_standby_names</> specifies the number and
+ names of synchronous standbys that transaction commits made when
+ <varname>synchronous_commit</> is set to <literal>on</>,
+ <literal>remote_apply</> or <literal>remote_write</> will wait for
+ responses from. Such transaction commits may never be completed
+ if any one of synchronous standbys should crash.
  </para>

  <para>
- The best solution for avoiding data loss is to ensure you don't lose
- your last remaining synchronous standby. This can be achieved by naming multiple
+ The best solution for high availability is to ensure you keep as many
+ synchronous standbys as requested. This can be achieved by naming multiple
  potential synchronous standbys using <varname>synchronous_standby_names</>.
- The first named standby will be used as the synchronous standby. Standbys
- listed after this will take over the role of synchronous standby if the
- first one should fail.
+ The standbys whose names appear earlier in the list will be used as
+ synchronous standbys. Standbys listed after these will take over
+ the role of synchronous standby if one of current ones should fail.
  </para>

  <para>
@@ -1208,13 +1246,15 @@ primary_slot_name = 'node_a_slot'
  they show as committed on the primary. The guarantee we offer is that
  the application will not receive explicit acknowledgement of the
  successful commit of a transaction until the WAL data is known to be
- safely received by the standby.
+ safely received by all the synchronous standbys.
  </para>

  <para>
- If you really do lose your last standby server then you should disable
- <varname>synchronous_standby_names</> and reload the configuration file
- on the primary server.
+ If you really cannot keep as many synchronous standbys as requested
+ then you should decrease the number of synchronous standbys that
+ transaction commits must wait for responses from
+ in <varname>synchronous_standby_names</> (or disable it) and
+ reload the configuration file on the primary server.
  </para>

  <para>
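The priority rule and the `'2 (s1, s2, s3)'` example added to high-availability.sgml can be traced with a short Python sketch. It is a simplified model, not the server's selection code: the function name `classify_standbys` is hypothetical, and the sketch ignores the `*` wildcard and duplicate `application_name` values.

```python
from typing import Dict, List, Set


def classify_standbys(num_sync: int,
                      priority_list: List[str],
                      streaming: Set[str]) -> Dict[str, List[str]]:
    """Split connected standbys into sync, potential and async groups.

    priority_list is the ordered name list from synchronous_standby_names;
    streaming is the set of standbys currently connected and streaming.
    """
    # Listed standbys that are streaming, kept in priority order.
    candidates = [name for name in priority_list if name in streaming]
    return {
        # The first num_sync candidates are the synchronous standbys.
        "sync": candidates[:num_sync],
        # Remaining listed standbys can take over if a sync standby fails.
        "potential": candidates[num_sync:],
        # Standbys not named in the list are asynchronous.
        "async": sorted(streaming - set(priority_list)),
    }


running = {"s1", "s2", "s3", "s4"}
before = classify_standbys(2, ["s1", "s2", "s3"], running)
# Simulate s1 disconnecting: s3 is promoted from potential to sync.
after = classify_standbys(2, ["s1", "s2", "s3"], running - {"s1"})
```

In this model `before` has `s1` and `s2` as synchronous, `s3` as potential and `s4` as asynchronous, which mirrors the documentation example; once `s1` is removed from the streaming set, `s3` moves into the synchronous group.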

src/backend/Makefile

+3 -1
@@ -203,7 +203,7 @@ distprep:
  $(MAKE) -C parser gram.c gram.h scan.c
  $(MAKE) -C bootstrap bootparse.c bootscanner.c
  $(MAKE) -C catalog schemapg.h postgres.bki postgres.description postgres.shdescription
- $(MAKE) -C replication repl_gram.c repl_scanner.c
+ $(MAKE) -C replication repl_gram.c repl_scanner.c syncrep_gram.c syncrep_scanner.c
  $(MAKE) -C storage/lmgr lwlocknames.h
  $(MAKE) -C utils fmgrtab.c fmgroids.h errcodes.h
  $(MAKE) -C utils/misc guc-file.c
@@ -320,6 +320,8 @@ maintainer-clean: distclean
  catalog/postgres.shdescription \
  replication/repl_gram.c \
  replication/repl_scanner.c \
+ replication/syncrep_gram.c \
+ replication/syncrep_scanner.c \
  storage/lmgr/lwlocknames.c \
  storage/lmgr/lwlocknames.h \
  utils/fmgroids.h \

src/backend/replication/.gitignore

+2
@@ -1,2 +1,4 @@
  /repl_gram.c
  /repl_scanner.c
+ /syncrep_gram.c
+ /syncrep_scanner.c

src/backend/replication/Makefile

+8 -3
@@ -15,7 +15,7 @@ include $(top_builddir)/src/Makefile.global
  override CPPFLAGS := -I. -I$(srcdir) $(CPPFLAGS)

  OBJS = walsender.o walreceiverfuncs.o walreceiver.o basebackup.o \
- repl_gram.o slot.o slotfuncs.o syncrep.o
+ repl_gram.o slot.o slotfuncs.o syncrep.o syncrep_gram.o

  SUBDIRS = logical

@@ -24,5 +24,10 @@ include $(top_srcdir)/src/backend/common.mk
  # repl_scanner is compiled as part of repl_gram
  repl_gram.o: repl_scanner.c

- # repl_gram.c and repl_scanner.c are in the distribution tarball, so
- # they are not cleaned here.
+ # syncrep_scanner is complied as part of syncrep_gram
+ syncrep_gram.o: syncrep_scanner.c
+ syncrep_scanner.c: FLEXFLAGS = -CF -p
+ syncrep_scanner.c: FLEX_NO_BACKUP=yes
+
+ # repl_gram.c, repl_scanner.c, syncrep_gram.c and syncrep_scanner.c
+ # are in the distribution tarball, so they are not cleaned here.
