@@ -73,10 +73,12 @@ PostgreSQL documentation
transfer mechanism. <application>pg_dump</application> can be used to
backup an entire database, then <application>pg_restore</application>
can be used to examine the archive and/or select which parts of the
- database are to be restored. The most flexible output file format is
- the <quote>custom</quote> format (<option>-Fc</option>). It allows
- for selection and reordering of all archived items, and is compressed
- by default.
+ database are to be restored. The most flexible output file formats are
+ the <quote>custom</quote> format (<option>-Fc</option>) and the
+ <quote>directory</quote> format (<option>-Fd</option>). They allow
+ for selection and reordering of all archived items, support parallel
+ restoration, and are compressed by default. The <quote>directory</quote>
+ format is the only format that supports parallel dumps.
</para>

<para>
@@ -251,7 +253,8 @@ PostgreSQL documentation
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
can be compressed with the <application>gzip</application> tool.
- This format is compressed by default.
+ This format is compressed by default and also supports parallel
+ dumps.
</para>
</listitem>
</varlistentry>
@@ -285,6 +288,62 @@ PostgreSQL documentation
</listitem>
</varlistentry>

+ <varlistentry>
+ <term><option>-j <replaceable class="parameter">njobs</replaceable></></term>
+ <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></></term>
+ <listitem>
+ <para>
+ Run the dump in parallel by dumping <replaceable class="parameter">njobs</replaceable>
+ tables simultaneously. This option reduces the time of the dump, but it also
+ increases the load on the database server. You can only use this option with the
+ directory output format because this is the only output format where multiple processes
+ can write their data at the same time.
+ </para>
+ <para>
+ <application>pg_dump</> will open <replaceable class="parameter">njobs</replaceable>
+ + 1 connections to the database, so make sure your <xref linkend="guc-max-connections">
+ setting is high enough to accommodate all connections.
+ </para>
+ <para>
+ Requesting exclusive locks on database objects while running a parallel dump could
+ cause the dump to fail. The reason is that the <application>pg_dump</> master process
+ requests shared locks on the objects that the worker processes are going to dump later,
+ in order to make sure that nobody deletes them while the dump is running.
+ If another client then requests an exclusive lock on a table, that lock will not be
+ granted but will be queued waiting for the shared lock of the master process to be
+ released. Consequently, any other access to the table will not be granted either and
+ will queue after the exclusive lock request. This includes the worker process trying
+ to dump the table. Without any precautions this would be a classic deadlock situation.
+ To detect this conflict, the <application>pg_dump</> worker process requests another
+ shared lock using the <literal>NOWAIT</> option. If the worker process is not granted
+ this shared lock, somebody else must have requested an exclusive lock in the meantime
+ and there is no way to continue with the dump, so <application>pg_dump</> has no choice
+ but to abort the dump.
+ </para>
+ <para>
+ For a consistent backup, the database server needs to support synchronized snapshots,
+ a feature that was introduced in <productname>PostgreSQL</productname> 9.2. With this
+ feature, database clients can ensure they see the same dataset even though they use
+ different connections. <command>pg_dump -j</command> uses multiple database
+ connections; it connects to the database once with the master process and
+ once again for each worker job. Without the synchronized snapshot feature, the
+ different worker jobs wouldn't be guaranteed to see the same data in each connection,
+ which could lead to an inconsistent backup.
+ </para>
+ <para>
+ If you want to run a parallel dump of a pre-9.2 server, you need to make sure that the
+ database content doesn't change between the time the master connects to the
+ database and the time the last worker job has connected to the database. The easiest way to
+ do this is to halt any data-modifying processes (DDL and DML) accessing the database
+ before starting the backup. You also need to specify the
+ <option>--no-synchronized-snapshots</option> parameter when running
+ <command>pg_dump -j</command> against a pre-9.2 <productname>PostgreSQL</productname>
+ server.
+ </para>
+ </listitem>
+ </varlistentry>
+
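The connection count described for <option>-j</option> (njobs + 1) can be checked before starting a dump. The following is only a minimal shell sketch: the helper names `dump_connections` and `dump_fits` are hypothetical, and the number of free connection slots is assumed to be supplied by the caller (for example, `max_connections` minus the sessions already in use); it is not something pg_dump reports.

```shell
#!/bin/sh
# Hypothetical helper: number of connections a parallel dump opens.
# Per the -j description: one per worker job plus one for the master.
dump_connections() {
    njobs=$1
    echo $((njobs + 1))
}

# Hypothetical helper: succeed if the dump fits into the free slots.
# free_slots is an assumption supplied by the caller; it is not
# queried from the server here.
dump_fits() {
    njobs=$1
    free_slots=$2
    [ "$(dump_connections "$njobs")" -le "$free_slots" ]
}

# Example: 5 worker jobs need 6 connections.
dump_connections 5
```

With 5 worker jobs, `dump_fits 5 6` succeeds and `dump_fits 5 5` fails, which mirrors the advice to size `max_connections` for all njobs + 1 connections.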
<varlistentry>
<term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
<term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
@@ -690,6 +749,17 @@ PostgreSQL documentation
</listitem>
</varlistentry>

+ <varlistentry>
+ <term><option>--no-synchronized-snapshots</></term>
+ <listitem>
+ <para>
+ This option allows running <command>pg_dump -j</> against a pre-9.2
+ server; see the documentation of the <option>-j</option> parameter
+ for more details.
+ </para>
+ </listitem>
+ </varlistentry>
+
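The rule for pre-9.2 servers can be expressed as a small decision on the server version. This is a sketch only: the helper `parallel_dump_cmd` is hypothetical, it merely prints a command line rather than running it, and it assumes the caller already knows the server version as an integer in the style of `server_version_num` (90105 for 9.1.5, 90200 for 9.2.0).

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a parallel pg_dump command line.
# $1 = server version as an integer (assumed supplied by the caller,
#      e.g. 90105 for 9.1.5, 90200 for 9.2.0)
# $2 = njobs, $3 = database name, $4 = output directory
parallel_dump_cmd() {
    version_num=$1
    njobs=$2
    dbname=$3
    outdir=$4
    snapshot_flag=""
    # Servers older than 9.2 lack synchronized snapshots, so pg_dump -j
    # needs --no-synchronized-snapshots there.
    if [ "$version_num" -lt 90200 ]; then
        snapshot_flag=" --no-synchronized-snapshots"
    fi
    echo "pg_dump -Fd -j ${njobs}${snapshot_flag} -f ${outdir} ${dbname}"
}

parallel_dump_cmd 90105 5 mydb dumpdir
parallel_dump_cmd 90200 5 mydb dumpdir
```

The flag only disables the snapshot synchronization; halting data-modifying processes for the duration of the connection phase remains the operator's responsibility.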
<varlistentry>
<term><option>--no-tablespaces</option></term>
<listitem>
@@ -1082,6 +1152,15 @@ CREATE DATABASE foo WITH TEMPLATE template0;
</screen>
</para>

+ <para>
+ To dump a database into a directory-format archive in parallel with
+ 5 worker jobs:
+
+ <screen>
+ <prompt>$</prompt> <userinput>pg_dump -Fd mydb -j 5 -f dumpdir</userinput>
+ </screen>
+ </para>
+
<para>
To reload an archive file into a (freshly created) database named
<literal>newdb</>: