
Commit 878bd9a

pg_rewind docs: clarify handling of remote servers
1 parent 3ebc88e commit 878bd9a

1 file changed (+49 -45 lines)


doc/src/sgml/ref/pg_rewind.sgml

@@ -16,7 +16,7 @@ PostgreSQL documentation
 
 <refnamediv>
 <refname>pg_rewind</refname>
-<refpurpose>synchronize a <productname>PostgreSQL</productname> data directory with another data directory that was forked from the first one</refpurpose>
+<refpurpose>synchronize a <productname>PostgreSQL</productname> data directory with another data directory that was forked from it</refpurpose>
 </refnamediv>
 
 <refsynopsisdiv>
@@ -44,56 +44,56 @@ PostgreSQL documentation
 <application>pg_rewind</> is a tool for synchronizing a PostgreSQL cluster
 with another copy of the same cluster, after the clusters' timelines have
 diverged. A typical scenario is to bring an old master server back online
-after failover, as a standby that follows the new master.
+after failover as a standby that follows the new master.
 </para>
 
 <para>
 The result is equivalent to replacing the target data directory with the
-source one. All files are copied, including configuration files. The
+source one. Only changed blocks from relation files are copied;
+all other files are copied in full, including configuration files. The
 advantage of <application>pg_rewind</> over taking a new base backup, or
 tools like <application>rsync</>, is that <application>pg_rewind</> does
-not require reading through all unchanged files in the cluster. That makes
-it a lot faster when the database is large and only a small portion of it
-differs between the clusters.
+not require reading through unchanged blocks in the cluster. This makes
+it a lot faster when the database is large and only a small
+fraction of blocks differ between the clusters.
 </para>
 
 <para>
 <application>pg_rewind</> examines the timeline histories of the source
 and target clusters to determine the point where they diverged, and
 expects to find WAL in the target cluster's <filename>pg_xlog</> directory
 reaching all the way back to the point of divergence. The point of divergence
-could be found either on target timeline, source timeline or their common
+can be found either on the target timeline, the source timeline, or their common
 ancestor. In the typical failover scenario where the target cluster was
-shut down soon after the divergence, that is not a problem, but if the
-target cluster had run for a long time after the divergence, the old WAL
-files might not be present anymore. In that case, they can be manually
-copied from the WAL archive to the <filename>pg_xlog</> directory. Fetching
-missing files from a WAL archive automatically is currently not supported.
-Besides, <application>pg_rewind</> use cases are not limited by failover.
-For instance, standby server could be promoted, run some writes and
-then be returned back as standby.
+shut down soon after the divergence, this is not a problem, but if the
+target cluster ran for a long time after the divergence, the old WAL
+files might no longer be present. In that case, they can be manually
+copied from the WAL archive to the <filename>pg_xlog</> directory, or
+fetched on startup by configuring <filename>recovery.conf</>. The use of
+<application>pg_rewind</> is not limited to failover, e.g. a standby
+server can be promoted, run some write transactions, and then rewound
+to become a standby again.
 </para>
 
 <para>
-When the target server is started up for the first time after running
+When the target server is started for the first time after running
 <application>pg_rewind</>, it will go into recovery mode and replay all
 WAL generated in the source server after the point of divergence.
 If some of the WAL was no longer available in the source server when
-<application>pg_rewind</> was run, and therefore could not be copied by
-<application>pg_rewind</> session, it needs to be made available when the
-target server is started up. That can be done by creating a
+<application>pg_rewind</> was run, and therefore could not be copied by the
+<application>pg_rewind</> session, it must be made available when the
+target server is started. This can be done by creating a
 <filename>recovery.conf</> file in the target data directory with a
 suitable <varname>restore_command</>.
 </para>
 
 <para>
 <application>pg_rewind</> requires that the target server either has
-the <xref linkend="guc-wal-log-hints"> option is enabled
-in <filename>postgresql.conf</> or that data checksums were enabled when
+the <xref linkend="guc-wal-log-hints"> option enabled
+in <filename>postgresql.conf</> or data checksums enabled when
 the cluster was initialized with <application>initdb</>. Neither of these
-are currently on by default.
-<xref linkend="guc-full-page-writes"> must also be enabled. That is the
-default.
+are currently on by default. <xref linkend="guc-full-page-writes">
+must also be set to <literal>on</>, but is enabled by default.
 </para>
 </refsect1>
 
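The recovery.conf arrangement described in this hunk can be pictured with a minimal sketch; the archive path and connection settings below are illustrative placeholders, not values from this commit:

    # recovery.conf in the target data directory (illustrative values only)
    standby_mode = 'on'
    primary_conninfo = 'host=newmaster port=5432 user=postgres'
    restore_command = 'cp /mnt/server/archivedir/%f %p'

Here restore_command retrieves archived WAL segments that pg_rewind could not copy, while standby_mode and primary_conninfo make the rewound server follow the new master.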
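The prerequisites in the final paragraph of the hunk come down to one of two setup choices, sketched below with assumed paths:

    # Either: enable hint-bit WAL logging in postgresql.conf
    # (takes effect only after a server restart)
    wal_log_hints = on
    full_page_writes = on    # the default; must not be turned off

    # Or: create the cluster with data checksums from the start
    initdb --data-checksums -D /var/lib/pgsql/data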
@@ -111,7 +111,7 @@ PostgreSQL documentation
 <listitem>
 <para>
 This option specifies the target data directory that is synchronized
-with the source. The target server must shut down cleanly before
+with the source. The target server must be shut down cleanly before
 running <application>pg_rewind</application>
 </para>
 </listitem>
@@ -121,9 +121,9 @@ PostgreSQL documentation
 <term><option>--source-pgdata=<replaceable class="parameter">directory</replaceable></option></term>
 <listitem>
 <para>
-Specifies path to the data directory of the source server, to
-synchronize the target with. This option requires the source server
-to be cleanly shut down.
+Specifies the file system path to the data directory of the source
+server to synchronize the target with. This option requires the
+source server to be cleanly shut down.
 </para>
 </listitem>
 </varlistentry>
@@ -135,8 +135,8 @@ PostgreSQL documentation
 Specifies a libpq connection string to connect to the source
 <productname>PostgreSQL</> server to synchronize the target with.
 The connection must be a normal (non-replication) connection
-with superuser access. This option requires the server to be running
-and not in recovery mode.
+with superuser access. This option requires the source
+server to be running and not in recovery mode.
 </para>
 </listitem>
 </varlistentry>
@@ -157,7 +157,7 @@ PostgreSQL documentation
 <listitem>
 <para>
 Enables progress reporting. Turning this on will deliver an approximate
-progress report while copying data over from the source cluster.
+progress report while copying data from the source cluster.
 </para>
 </listitem>
 </varlistentry>
@@ -205,38 +205,42 @@ PostgreSQL documentation
 <title>How it works</title>
 
 <para>
-The basic idea is to copy everything from the new cluster to the old
-cluster, except for the blocks that we know to be the same.
+The basic idea is to copy all file system-level changes from the source
+cluster to the target cluster:
 </para>
 
 <procedure>
 <step>
 <para>
-Scan the WAL log of the old cluster, starting from the last checkpoint
-before the point where the new cluster's timeline history forked off
-from the old cluster. For each WAL record, make a note of the data
-blocks that were touched. This yields a list of all the data blocks
-that were changed in the old cluster, after the new cluster forked off.
+Scan the WAL log of the target cluster, starting from the last
+checkpoint before the point where the source cluster's timeline
+history forked off from the target cluster. For each WAL record,
+record each data block that was touched. This yields a list of all
+the data blocks that were changed in the target cluster, after the
+source cluster forked off.
 </para>
 </step>
 <step>
 <para>
-Copy all those changed blocks from the new cluster to the old cluster.
+Copy all those changed blocks from the source cluster to
+the target cluster, either using direct file system access
+(<option>--source-pgdata</>) or SQL (<option>--source-server</>).
 </para>
 </step>
 <step>
 <para>
-Copy all other files such as <filename>clog</filename> and configuration files from the new cluster
-to the old cluster, everything except the relation files.
+Copy all other files such as <filename>pg_clog</filename> and
+configuration files from the source cluster to the target cluster
+(everything except the relation files).
 </para>
 </step>
 <step>
 <para>
-Apply the WAL from the new cluster, starting from the checkpoint
+Apply the WAL from the source cluster, starting from the checkpoint
 created at failover. (Strictly speaking, <application>pg_rewind</>
-doesn't apply the WAL, it just creates a backup label file indicating
-that when <productname>PostgreSQL</> is started, it will start replay
-from that checkpoint and apply all the required WAL.)
+doesn't apply the WAL, it just creates a backup label file that
+makes <productname>PostgreSQL</> start by replaying all WAL from
+that checkpoint forward.)
 </para>
 </step>
 </procedure>
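Steps 1 and 2 of this procedure can be made concrete with a small, self-contained toy model. The following is an illustrative Python sketch, not pg_rewind's actual C implementation; the relation file names and record list are invented:

    # Toy model of pg_rewind's block map: WAL scanning and per-block copying.
    # Real pg_rewind decodes binary WAL records in C; here the records are
    # pre-digested into (relation file, block number) pairs for illustration.
    from collections import defaultdict

    BLOCK_SIZE = 8192  # PostgreSQL's default page size

    # Step 1: every block touched on the target after the point of divergence.
    # (Hypothetical record stream; duplicate touches collapse in the set below.)
    target_wal_since_fork = [
        ("base/16384/16385", 0),
        ("base/16384/16385", 7),
        ("base/16384/16390", 3),
        ("base/16384/16385", 0),
    ]

    changed_blocks = defaultdict(set)
    for relfile, blkno in target_wal_since_fork:
        changed_blocks[relfile].add(blkno)

    # Step 2: overwrite exactly those blocks with the source cluster's copy.
    def copy_block(src_dir, dst_dir, relfile, blkno):
        """Copy one 8 kB block of a relation file from source to target."""
        with open(f"{src_dir}/{relfile}", "rb") as src, \
             open(f"{dst_dir}/{relfile}", "r+b") as dst:
            src.seek(blkno * BLOCK_SIZE)
            dst.seek(blkno * BLOCK_SIZE)
            dst.write(src.read(BLOCK_SIZE))

    for relfile, blocks in sorted(changed_blocks.items()):
        for blkno in sorted(blocks):
            # copy_block("/srv/pg/source_data", "/srv/pg/target_data", relfile, blkno)
            print(f"would copy block {blkno} of {relfile} from source to target")

The real tool derives the (file, block) pairs by decoding WAL records, and it also handles file creations, truncations, and deletions, which this sketch ignores.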
