Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit ef65f6f

Browse files
committed
Prevent WAL logging when COPY is done in the same transation that
created it. Simon Riggs
1 parent 693c85d commit ef65f6f

File tree

10 files changed

+182
-31
lines changed

10 files changed

+182
-31
lines changed

doc/src/sgml/perform.sgml

Lines changed: 47 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.59 2006/10/07 20:59:03 petere Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.60 2007/01/25 02:17:25 momjian Exp $ -->
22

33
<chapter id="performance-tips">
44
<title>Performance Tips</title>
@@ -800,7 +800,9 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
800800
prepared <command>INSERT</command> statement, and then use
801801
<command>EXECUTE</command> as many times as required. This avoids
802802
some of the overhead of repeatedly parsing and planning
803-
<command>INSERT</command>.
803+
<command>INSERT</command>. Different interfaces provide this facility
804+
in different ways; look for Prepared Statements in the interface
805+
documentation.
804806
</para>
805807

806808
<para>
@@ -809,6 +811,20 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
809811
<command>INSERT</command>, even if <command>PREPARE</> is used and
810812
multiple insertions are batched into a single transaction.
811813
</para>
814+
815+
<para>
816+
<command>COPY</command> is fastest when used within the same
817+
transaction as an earlier <command>CREATE TABLE</command> or
818+
<command>TRUNCATE</command> command. In those cases, no WAL
819+
needs to be written because in case of an error, the files
820+
containing the newly loaded data will be removed automatically.
821+
<command>CREATE TABLE AS SELECT</command> is also optimized
822+
to avoid writing WAL. <command>COPY</command> and
823+
<command>CREATE TABLE AS SELECT</command> will write WAL
824+
when <xref linkend="guc-archive-command"> is set and will not
825+
therefore be optimized in that case.
826+
</para>
827+
812828
</sect2>
813829

814830
<sect2 id="populate-rm-indexes">
@@ -877,6 +893,29 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
877893
</para>
878894
</sect2>
879895

896+
<sect2 id="populate-pitr">
897+
<title>Turn off <varname>archive_command</varname></title>
898+
899+
<para>
900+
When loading large amounts of data you may want to unset the
901+
<xref linkend="guc-archive-command"> before loading. It may be
902+
faster to take a new base backup once the load has completed
903+
than to allow a large archive to accumulate.
904+
</para>
905+
906+
<para>
907+
This is particularly important advice because certain commands
908+
will perform more slowly when <varname>archive_command</varname>
909+
is set, as a result of their needing to write large amounts of WAL.
910+
This applies to the following commands:
911+
<command>CREATE TABLE AS SELECT</command>,
912+
<command>CREATE INDEX</command> and also <command>COPY</command>, when
913+
it is executed in the same transaction as a prior
914+
<command>CREATE TABLE</command> or <command>TRUNCATE</command> command.
915+
</para>
916+
917+
</sect2>
918+
880919
<sect2 id="populate-analyze">
881920
<title>Run <command>ANALYZE</command> Afterwards</title>
882921

@@ -914,8 +953,12 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
914953
the first several guidelines are handled automatically. What is left
915954
for you to do is to set appropriate (i.e., larger than normal) values
916955
for <varname>maintenance_work_mem</varname> and
917-
<varname>checkpoint_segments</varname> before loading the dump script,
918-
and then to run <command>ANALYZE</> afterwards.
956+
<varname>checkpoint_segments</varname>, as well as unsetting
957+
<varname>archive_command</varname> before loading the dump script,
958+
and then to run <command>ANALYZE</> afterwards and resetting
959+
<varname>archive_command</varname> if required. All of the
960+
parameters can be reset once the load has completed without needing
961+
to restart the server, as described in <xref linkend="config-setting">.
919962
</para>
920963

921964
<para>

src/backend/access/heap/heapam.c

Lines changed: 40 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/access/heap/heapam.c,v 1.224 2007/01/09 22:00:59 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/access/heap/heapam.c,v 1.225 2007/01/25 02:17:25 momjian Exp $
1212
*
1313
*
1414
* INTERFACE ROUTINES
@@ -28,6 +28,7 @@
2828
* heap_update - replace a tuple in a relation with another tuple
2929
* heap_markpos - mark scan position
3030
* heap_restrpos - restore position to marked location
31+
* heap_sync - sync heap, for when no WAL has been written
3132
*
3233
* NOTES
3334
* This file contains the heap_ routines which implement
@@ -50,6 +51,7 @@
5051
#include "miscadmin.h"
5152
#include "pgstat.h"
5253
#include "storage/procarray.h"
54+
#include "storage/smgr.h"
5355
#include "utils/inval.h"
5456
#include "utils/lsyscache.h"
5557
#include "utils/relcache.h"
@@ -1358,7 +1360,7 @@ heap_get_latest_tid(Relation relation,
13581360
* that all new tuples go into new pages not containing any tuples from other
13591361
* transactions, that the relation gets fsync'd before commit, and that the
13601362
* transaction emits at least one WAL record to ensure RecordTransactionCommit
1361-
* will decide to WAL-log the commit.
1363+
* will decide to WAL-log the commit. (see heap_sync() comments also)
13621364
*
13631365
* use_fsm is passed directly to RelationGetBufferForTuple, which see for
13641366
* more info.
@@ -1418,7 +1420,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
14181420
*/
14191421
if (HeapTupleHasExternal(tup) ||
14201422
(MAXALIGN(tup->t_len) > TOAST_TUPLE_THRESHOLD))
1421-
heaptup = toast_insert_or_update(relation, tup, NULL);
1423+
heaptup = toast_insert_or_update(relation, tup, NULL, use_wal);
14221424
else
14231425
heaptup = tup;
14241426

@@ -1535,6 +1537,18 @@ simple_heap_insert(Relation relation, HeapTuple tup)
15351537
return heap_insert(relation, tup, GetCurrentCommandId(), true, true);
15361538
}
15371539

1540+
/*
1541+
* fast_heap_insert - insert a tuple with options to improve speed
1542+
*
1543+
* Currently, this routine allows specifying additional options for speed
1544+
* in certain cases, such as WAL-avoiding COPY command
1545+
*/
1546+
Oid
1547+
fast_heap_insert(Relation relation, HeapTuple tup, bool use_wal)
1548+
{
1549+
return heap_insert(relation, tup, GetCurrentCommandId(), use_wal, use_wal);
1550+
}
1551+
15381552
/*
15391553
* heap_delete - delete a tuple
15401554
*
@@ -2086,11 +2100,11 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
20862100
*
20872101
* Note: below this point, heaptup is the data we actually intend to
20882102
* store into the relation; newtup is the caller's original untoasted
2089-
* data.
2103+
* data. (We always use WAL for toast table updates.)
20902104
*/
20912105
if (need_toast)
20922106
{
2093-
heaptup = toast_insert_or_update(relation, newtup, &oldtup);
2107+
heaptup = toast_insert_or_update(relation, newtup, &oldtup, true);
20942108
newtupsize = MAXALIGN(heaptup->t_len);
20952109
}
20962110
else
@@ -3966,3 +3980,24 @@ heap2_desc(StringInfo buf, uint8 xl_info, char *rec)
39663980
else
39673981
appendStringInfo(buf, "UNKNOWN");
39683982
}
3983+
3984+
/* ----------------
3985+
* heap_sync - sync a heap, for use when no WAL has been written
3986+
*
3987+
* ----------------
3988+
*/
3989+
void
3990+
heap_sync(Relation rel)
3991+
{
3992+
if (!rel->rd_istemp)
3993+
{
3994+
/*
3995+
* If we skipped using WAL, and it's not a temp relation,
3996+
* we must force the relation down to disk before it's
3997+
* safe to commit the transaction. This requires forcing
3998+
* out any dirty buffers and then doing a forced fsync.
3999+
*/
4000+
FlushRelationBuffers(rel);
4001+
smgrimmedsync(rel->rd_smgr);
4002+
}
4003+
}

src/backend/access/heap/tuptoaster.c

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/access/heap/tuptoaster.c,v 1.68 2007/01/05 22:19:22 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/access/heap/tuptoaster.c,v 1.69 2007/01/25 02:17:26 momjian Exp $
1212
*
1313
*
1414
* INTERFACE ROUTINES
@@ -42,7 +42,7 @@
4242
#undef TOAST_DEBUG
4343

4444
static void toast_delete_datum(Relation rel, Datum value);
45-
static Datum toast_save_datum(Relation rel, Datum value);
45+
static Datum toast_save_datum(Relation rel, Datum value, bool use_wal);
4646
static varattrib *toast_fetch_datum(varattrib *attr);
4747
static varattrib *toast_fetch_datum_slice(varattrib *attr,
4848
int32 sliceoffset, int32 length);
@@ -342,7 +342,7 @@ toast_delete(Relation rel, HeapTuple oldtup)
342342
* ----------
343343
*/
344344
HeapTuple
345-
toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup)
345+
toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup, bool use_wal)
346346
{
347347
HeapTuple result_tuple;
348348
TupleDesc tupleDesc;
@@ -612,7 +612,7 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup)
612612
i = biggest_attno;
613613
old_value = toast_values[i];
614614
toast_action[i] = 'p';
615-
toast_values[i] = toast_save_datum(rel, toast_values[i]);
615+
toast_values[i] = toast_save_datum(rel, toast_values[i], use_wal);
616616
if (toast_free[i])
617617
pfree(DatumGetPointer(old_value));
618618

@@ -724,7 +724,7 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup)
724724
i = biggest_attno;
725725
old_value = toast_values[i];
726726
toast_action[i] = 'p';
727-
toast_values[i] = toast_save_datum(rel, toast_values[i]);
727+
toast_values[i] = toast_save_datum(rel, toast_values[i], use_wal);
728728
if (toast_free[i])
729729
pfree(DatumGetPointer(old_value));
730730

@@ -972,7 +972,7 @@ toast_compress_datum(Datum value)
972972
* ----------
973973
*/
974974
static Datum
975-
toast_save_datum(Relation rel, Datum value)
975+
toast_save_datum(Relation rel, Datum value, bool use_wal)
976976
{
977977
Relation toastrel;
978978
Relation toastidx;
@@ -1057,7 +1057,7 @@ toast_save_datum(Relation rel, Datum value)
10571057
if (!HeapTupleIsValid(toasttup))
10581058
elog(ERROR, "failed to build TOAST tuple");
10591059

1060-
simple_heap_insert(toastrel, toasttup);
1060+
fast_heap_insert(toastrel, toasttup, use_wal);
10611061

10621062
/*
10631063
* Create the index entry. We cheat a little here by not using

src/backend/catalog/index.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/catalog/index.c,v 1.276 2007/01/09 02:14:11 tgl Exp $
11+
* $PostgreSQL: pgsql/src/backend/catalog/index.c,v 1.277 2007/01/25 02:17:26 momjian Exp $
1212
*
1313
*
1414
* INTERFACE ROUTINES
@@ -1245,6 +1245,9 @@ setNewRelfilenode(Relation relation)
12451245

12461246
heap_close(pg_class, RowExclusiveLock);
12471247

1248+
/* Remember we did this in current transaction, to allow later optimisations */
1249+
relation->rd_newRelfilenodeSubid = GetCurrentSubTransactionId();
1250+
12481251
/* Make sure the relfilenode change is visible */
12491252
CommandCounterIncrement();
12501253
}

src/backend/commands/copy.c

Lines changed: 51 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/commands/copy.c,v 1.274 2007/01/05 22:19:25 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/commands/copy.c,v 1.275 2007/01/25 02:17:26 momjian Exp $
1212
*
1313
*-------------------------------------------------------------------------
1414
*/
@@ -1652,6 +1652,7 @@ CopyFrom(CopyState cstate)
16521652
ExprContext *econtext; /* used for ExecEvalExpr for default atts */
16531653
MemoryContext oldcontext = CurrentMemoryContext;
16541654
ErrorContextCallback errcontext;
1655+
bool use_wal = true; /* By default, we use WAL to log db changes */
16551656

16561657
Assert(cstate->rel);
16571658

@@ -1843,6 +1844,28 @@ CopyFrom(CopyState cstate)
18431844
nfields = file_has_oids ? (attr_count + 1) : attr_count;
18441845
field_strings = (char **) palloc(nfields * sizeof(char *));
18451846

1847+
/*
1848+
* Check for performance optimization by avoiding WAL writes
1849+
*
1850+
* If archive logging is not be enabled *and* either
1851+
* - table is created in same transaction as this COPY
1852+
* - table data is now being written to new relfilenode
1853+
* then we can safely avoid writing WAL. Why?
1854+
* The data files for the table plus toast table/index, plus any indexes
1855+
* will all be dropped at the end of the transaction if it fails, so we
1856+
* do not need to worry about inconsistent states.
1857+
* As mentioned in comments in utils/rel.h, the in-same-transaction test is
1858+
* not completely reliable, since rd_createSubId can be reset to zero in
1859+
* certain cases before the end of the creating transaction.
1860+
* We are doing this for performance only, so we only need to know:
1861+
* if rd_createSubid != InvalidSubTransactionId then it is *always* just
1862+
* created. If we have PITR enabled, then we *must* use_wal
1863+
*/
1864+
if ((cstate->rel->rd_createSubid != InvalidSubTransactionId ||
1865+
cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId)
1866+
&& !XLogArchivingActive())
1867+
use_wal = false;
1868+
18461869
/* Initialize state variables */
18471870
cstate->fe_eof = false;
18481871
cstate->eol_type = EOL_UNKNOWN;
@@ -2076,7 +2099,7 @@ CopyFrom(CopyState cstate)
20762099
ExecConstraints(resultRelInfo, slot, estate);
20772100

20782101
/* OK, store the tuple and create index entries for it */
2079-
simple_heap_insert(cstate->rel, tuple);
2102+
fast_heap_insert(cstate->rel, tuple, use_wal);
20802103

20812104
if (resultRelInfo->ri_NumIndices > 0)
20822105
ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false);
@@ -2093,6 +2116,32 @@ CopyFrom(CopyState cstate)
20932116
}
20942117
}
20952118

2119+
/*
2120+
* If we skipped writing WAL for heaps, then we need to sync
2121+
*/
2122+
if (!use_wal)
2123+
{
2124+
/* main heap */
2125+
heap_sync(cstate->rel);
2126+
2127+
/* main heap indexes, if any */
2128+
/* we always use WAL for index inserts, so no need to sync */
2129+
2130+
/* toast heap, if any */
2131+
if (OidIsValid(cstate->rel->rd_rel->reltoastrelid))
2132+
{
2133+
Relation toastrel;
2134+
2135+
toastrel = heap_open(cstate->rel->rd_rel->reltoastrelid,
2136+
AccessShareLock);
2137+
heap_sync(toastrel);
2138+
heap_close(toastrel, AccessShareLock);
2139+
}
2140+
2141+
/* toast index, if toast heap */
2142+
/* we always use WAL for index inserts, so no need to sync */
2143+
}
2144+
20962145
/* Done, clean up */
20972146
error_context_stack = errcontext.previous;
20982147

src/backend/executor/execMain.c

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@
2626
*
2727
*
2828
* IDENTIFICATION
29-
* $PostgreSQL: pgsql/src/backend/executor/execMain.c,v 1.283 2007/01/05 22:19:27 momjian Exp $
29+
* $PostgreSQL: pgsql/src/backend/executor/execMain.c,v 1.284 2007/01/25 02:17:26 momjian Exp $
3030
*
3131
*-------------------------------------------------------------------------
3232
*/
@@ -2516,11 +2516,7 @@ CloseIntoRel(QueryDesc *queryDesc)
25162516
*/
25172517
if (!estate->es_into_relation_use_wal &&
25182518
!estate->es_into_relation_descriptor->rd_istemp)
2519-
{
2520-
FlushRelationBuffers(estate->es_into_relation_descriptor);
2521-
/* FlushRelationBuffers will have opened rd_smgr */
2522-
smgrimmedsync(estate->es_into_relation_descriptor->rd_smgr);
2523-
}
2519+
heap_sync(estate->es_into_relation_descriptor);
25242520

25252521
/* close rel, but keep lock until commit */
25262522
heap_close(estate->es_into_relation_descriptor, NoLock);

0 commit comments

Comments
 (0)