Commit 5286105
Cascading replication feature for streaming log-based replication.
Standby servers can now have WALSender processes, which can work with either WALReceiver or archive_commands to pass data. Fully updated docs, including new conceptual terms of sending server, upstream and downstream servers. WALSenders terminated when promote to master. Fujii Masao, review, rework and doc rewrite by Simon Riggs
1 parent 3d4890c commit 5286105

10 files changed

+423
-189
lines changed
doc/src/sgml/config.sgml

+78-49
@@ -1962,24 +1962,26 @@ SET ENABLE_SEQSCAN TO OFF;
 <para>
 These settings control the behavior of the built-in
 <firstterm>streaming replication</> feature (see
-<xref linkend="streaming-replication">).
-Some parameters must be set on the master server, while others must be
-set on the standby server(s) that will receive replication data.
+<xref linkend="streaming-replication">). Servers will be either a
+Master or a Standby server. Masters can send data, while Standby(s)
+are always receivers of replicated data. When cascading replication
+(see <xref linkend="cascading-replication">) is used, Standby server(s)
+can also be senders, as well as receivers.
+Parameters are mainly for Sending and Standby servers, though some
+parameters have meaning only on the Master server. Settings may vary
+across the cluster without problems if that is required.
 </para>
 
-<sect2 id="runtime-config-replication-master">
-<title>Master Server</title>
+<sect2 id="runtime-config-replication-sender">
+<title>Sending Server(s)</title>
 
 <para>
-These parameters can be set on the primary server that is
+These parameters can be set on any server that is
 to send replication data to one or more standby servers.
-Note that in addition to these parameters,
-<xref linkend="guc-wal-level"> must be set appropriately on the master
-server, and you will typically want to enable WAL archiving as
-well (see <xref linkend="runtime-config-wal-archiving">).
-The values of these parameters on standby servers are irrelevant,
-although you may wish to set them there in preparation for the
-possibility of a standby becoming the master.
+The master is always a sending server, so these parameters must
+always be set on the master.
+The role and meaning of these parameters does not change after a
+standby becomes the master.
 </para>
 
 <variablelist>
@@ -2034,10 +2036,11 @@ SET ENABLE_SEQSCAN TO OFF;
 <filename>pg_xlog</>
 directory, in case a standby server needs to fetch them for streaming
 replication. Each segment is normally 16 megabytes. If a standby
-server connected to the primary falls behind by more than
-<varname>wal_keep_segments</> segments, the primary might remove
+server connected to the sending server falls behind by more than
+<varname>wal_keep_segments</> segments, the sending server might remove
 a WAL segment still needed by the standby, in which case the
-replication connection will be terminated. (However, the standby
+replication connection will be terminated. Downstream connections
+will also eventually fail as a result. (However, the standby
 server can recover by fetching the segment from archive, if WAL
 archiving is in use.)
 </para>
@@ -2050,42 +2053,13 @@ SET ENABLE_SEQSCAN TO OFF;
 doesn't keep any extra segments for standby purposes, so the number
 of old WAL segments available to standby servers is a function of
 the location of the previous checkpoint and status of WAL
-archiving. This parameter has no effect on restartpoints.
+archiving.
 This parameter can only be set in the
 <filename>postgresql.conf</> file or on the server command line.
 </para>
 </listitem>
 </varlistentry>
 
-<varlistentry id="guc-vacuum-defer-cleanup-age" xreflabel="vacuum_defer_cleanup_age">
-<term><varname>vacuum_defer_cleanup_age</varname> (<type>integer</type>)</term>
-<indexterm>
-<primary><varname>vacuum_defer_cleanup_age</> configuration parameter</primary>
-</indexterm>
-<listitem>
-<para>
-Specifies the number of transactions by which <command>VACUUM</> and
-<acronym>HOT</> updates will defer cleanup of dead row versions. The
-default is zero transactions, meaning that dead row versions can be
-removed as soon as possible, that is, as soon as they are no longer
-visible to any open transaction. You may wish to set this to a
-non-zero value on a primary server that is supporting hot standby
-servers, as described in <xref linkend="hot-standby">. This allows
-more time for queries on the standby to complete without incurring
-conflicts due to early cleanup of rows. However, since the value
-is measured in terms of number of write transactions occurring on the
-primary server, it is difficult to predict just how much additional
-grace time will be made available to standby queries.
-This parameter can only be set in the <filename>postgresql.conf</>
-file or on the server command line.
-</para>
-<para>
-You should also consider setting <varname>hot_standby_feedback</>
-as an alternative to using this parameter.
-</para>
-</listitem>
-</varlistentry>
-
 <varlistentry id="guc-replication-timeout" xreflabel="replication_timeout">
 <term><varname>replication_timeout</varname> (<type>integer</type>)</term>
 <indexterm>
@@ -2095,7 +2069,7 @@ SET ENABLE_SEQSCAN TO OFF;
 <para>
 Terminate replication connections that are inactive longer
 than the specified number of milliseconds. This is useful for
-the primary server to detect a standby crash or network outage.
+the sending server to detect a standby crash or network outage.
 A value of zero disables the timeout mechanism. This parameter
 can only be set in
 the <filename>postgresql.conf</> file or on the server command line.
@@ -2110,6 +2084,26 @@ SET ENABLE_SEQSCAN TO OFF;
 </listitem>
 </varlistentry>
 
+</variablelist>
+</sect2>
+
+<sect2 id="runtime-config-replication-master">
+<title>Master Server</title>
+
+<para>
+These parameters can be set on the master/primary server that is
+to send replication data to one or more standby servers.
+Note that in addition to these parameters,
+<xref linkend="guc-wal-level"> must be set appropriately on the master
+server, and may also want to enable WAL archiving as
+well (see <xref linkend="runtime-config-wal-archiving">).
+The values of these parameters on standby servers are irrelevant,
+although you may wish to set them there in preparation for the
+possibility of a standby becoming the master.
+</para>
+
+<variablelist>
+
 <varlistentry id="guc-synchronous-standby-names" xreflabel="synchronous_standby_names">
 <term><varname>synchronous_standby_names</varname> (<type>string</type>)</term>
 <indexterm>
@@ -2161,6 +2155,35 @@ SET ENABLE_SEQSCAN TO OFF;
 </listitem>
 </varlistentry>
 
+<varlistentry id="guc-vacuum-defer-cleanup-age" xreflabel="vacuum_defer_cleanup_age">
+<term><varname>vacuum_defer_cleanup_age</varname> (<type>integer</type>)</term>
+<indexterm>
+<primary><varname>vacuum_defer_cleanup_age</> configuration parameter</primary>
+</indexterm>
+<listitem>
+<para>
+Specifies the number of transactions by which <command>VACUUM</> and
+<acronym>HOT</> updates will defer cleanup of dead row versions. The
+default is zero transactions, meaning that dead row versions can be
+removed as soon as possible, that is, as soon as they are no longer
+visible to any open transaction. You may wish to set this to a
+non-zero value on a primary server that is supporting hot standby
+servers, as described in <xref linkend="hot-standby">. This allows
+more time for queries on the standby to complete without incurring
+conflicts due to early cleanup of rows. However, since the value
+is measured in terms of number of write transactions occurring on the
+primary server, it is difficult to predict just how much additional
+grace time will be made available to standby queries.
+This parameter can only be set in the <filename>postgresql.conf</>
+file or on the server command line.
+</para>
+<para>
+You should also consider setting <varname>hot_standby_feedback</>
+on standby server(s) as an alternative to using this parameter.
+</para>
+</listitem>
+</varlistentry>
+
 </variablelist>
 </sect2>

@@ -2261,7 +2284,7 @@ SET ENABLE_SEQSCAN TO OFF;
 <para>
 Specifies the minimum frequency for the WAL receiver
 process on the standby to send information about replication progress
-to the primary, where it can be seen using the
+to the primary or upstream standby, where it can be seen using the
 <link linkend="monitoring-stats-views-table">
 <literal>pg_stat_replication</></link> view. The standby will report
 the last transaction log position it has written, the last position it
@@ -2276,7 +2299,7 @@ SET ENABLE_SEQSCAN TO OFF;
 The default value is 10 seconds.
 </para>
 <para>
-When <xref linkend="guc-replication-timeout"> is enabled on the primary,
+When <xref linkend="guc-replication-timeout"> is enabled on a sending server,
 <varname>wal_receiver_status_interval</> must be enabled, and its value
 must be less than the value of <varname>replication_timeout</>.
 </para>
@@ -2291,6 +2314,7 @@ SET ENABLE_SEQSCAN TO OFF;
 <listitem>
 <para>
 Specifies whether or not a hot standby will send feedback to the primary
+or upstream standby
 about queries currently executing on the standby. This parameter can
 be used to eliminate query cancels caused by cleanup records, but
 can cause database bloat on the primary for some workloads.
@@ -2299,6 +2323,11 @@ SET ENABLE_SEQSCAN TO OFF;
 <literal>off</literal>. This parameter can only be set in the
 <filename>postgresql.conf</> file or on the server command line.
 </para>
+<para>
+If cascaded replication is in use the feedback is passed upstream
+until it eventually reaches the primary. Standbys make no other use
+of feedback they receive other than to pass upstream.
+</para>
 </listitem>
 </varlistentry>
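
As a rough illustration of how the reorganised parameter groups above might map onto configuration files, here is a minimal sketch. The values are illustrative assumptions, not recommendations taken from the commit; only the parameter names and their units come from the documentation being changed.

```
# postgresql.conf on any sending server (the master, or a cascading standby)
max_wal_senders = 3              # illustrative value
wal_keep_segments = 32           # illustrative value; each segment is normally 16 megabytes
replication_timeout = 60000      # milliseconds; 0 disables the timeout

# postgresql.conf on the master only
wal_level = hot_standby          # must be set appropriately on the master

# postgresql.conf on each standby
wal_receiver_status_interval = 10   # seconds; must be less than replication_timeout
hot_standby_feedback = on           # feedback is passed upstream until it reaches the primary
```

Note the unit mismatch: replication_timeout is given in milliseconds while wal_receiver_status_interval is given in seconds, so 10 here is compared against 60 seconds on the sending server.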

doc/src/sgml/high-availability.sgml

+61-1
@@ -877,8 +877,66 @@ primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
 network delay, or that the standby is under heavy load.
 </para>
 </sect3>
+</sect2>
+
+<sect2 id="cascading-replication">
+<title>Cascading Replication</title>
+
+<indexterm zone="high-availability">
+<primary>Cascading Replication</primary>
+</indexterm>
+
+<para>
+The cascading replication feature allows a standby server to accept replication
+connections and stream WAL records to other standbys, acting as a relay.
+This can be used to reduce the number of direct connections to the master
+and also to minimise inter-site bandwidth overheads.
+</para>
 
+<para>
+A standby acting as both a receiver and a sender is known as a cascading
+standby. Standbys that are more directly connected to the master are known
+as upstream servers, while those standby servers further away are downstream
+servers. Cascading replication does not place limits on the number or
+arrangement of downstream servers, though each standby connects to only
+one upstream server which eventually links to a single master/primary
+server.
+</para>
+
+<para>
+A cascading standby sends not only WAL records received from the
+master but also those restored from the archive. So even if the replication
+connection in some upstream connection is terminated, streaming replication
+continues downstream for as long as new WAL records are available.
+</para>
+
+<para>
+Cascading replication is currently asynchronous. Synchronous replication
+(see <xref linkend="synchronous-replication">) settings have no effect on
+cascading replication at present.
+</para>
+
+<para>
+Hot Standby feedback propagates upstream, whatever the cascaded arrangement.
+</para>
+
+<para>
+Promoting a cascading standby terminates the immediate downstream replication
+connections which it serves. This is because the timeline becomes different
+between standbys, and they can no longer continue replication. The
+effected standby(s) may reconnect to reestablish streaming replication.
+</para>
+
+<para>
+To use cascading replication, set up the cascading standby so that it can
+accept replication connections, i.e., set <varname>max_wal_senders</>,
+<varname>hot_standby</> and authentication option (see
+<xref linkend="streaming-replication"> and <xref linkend="hot-standby">).
+Also set <varname>primary_conninfo</> in the downstream standby to point
+to the cascading standby.
+</para>
 </sect2>
+
 <sect2 id="synchronous-replication">
 <title>Synchronous Replication</title>

@@ -955,7 +1013,9 @@ primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
 confirmation that the commit record has been received. These parameters
 allow the administrator to specify which standby servers should be
 synchronous standbys. Note that the configuration of synchronous
-replication is mainly on the master.
+replication is mainly on the master. Named standbys must be directly
+connected to the master; the master knows nothing about downstream
+standby servers using cascaded replication.
 </para>
 
 <para>
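
The setup steps added in the new Cascading Replication section can be sketched as the following configuration fragments. This is a minimal sketch under stated assumptions: the address 192.168.1.60 for the cascading standby and the 192.168.1.0/24 subnet are hypothetical, the user and password reuse the placeholder values from the existing primary_conninfo example, and md5 is just one possible authentication method.

```
# postgresql.conf on the cascading standby (hypothetical address 192.168.1.60)
hot_standby = on
max_wal_senders = 2              # illustrative value; allows downstream connections

# pg_hba.conf on the cascading standby: allow replication connections
# (subnet and auth method are assumptions)
host  replication  foo  192.168.1.0/24  md5

# recovery.conf on the downstream standby: point at the cascading standby
standby_mode = 'on'
primary_conninfo = 'host=192.168.1.60 port=5432 user=foo password=foopass'
```

If the cascading standby is later promoted, its downstream connections are terminated because the timeline changes, so the downstream standby would need to reconnect as described in the added text.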
