Commit 60b2444

Add code to prevent transaction ID wraparound by enforcing a safe limit in GetNewTransactionId(). Since the limit value has to be computed before we run any real transactions, this requires adding code to database startup to scan pg_database and determine the oldest datfrozenxid. This can conveniently be combined with the first stage of an attack on the problem that the 'flat file' copies of pg_shadow and pg_group are not properly updated during WAL recovery. The code I've added to startup resides in a new file src/backend/utils/init/flatfiles.c, and it is responsible for rewriting the flat files as well as initializing the XID wraparound limit value. This will eventually allow us to get rid of GetRawDatabaseInfo too, but we'll need an initdb so we can add a trigger to pg_database.
1 parent 617d16f commit 60b2444
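
The startup scan described in the commit message can be pictured with a small sketch. This is not the code from flatfiles.c (that file is not shown on this page); the DbEntry struct and find_oldest_db() are hypothetical names, and skipping databases with datallowconn = false is an assumption taken from the documentation change in this commit, which says such databases are assumed to be properly frozen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* XIDs below 3 are reserved (invalid, bootstrap, frozen). */
#define FirstNormalTransactionId ((TransactionId) 3)

/* Hypothetical stand-in for the pg_database columns the scan needs. */
typedef struct DbEntry
{
    const char   *datname;
    TransactionId datfrozenxid;
    bool          datallowconn;
} DbEntry;

/*
 * Circular "a is older than b" test for normal XIDs.  The subtraction is
 * done modulo 2^32 and the sign of the result decides the order (this
 * relies on the usual two's-complement conversion to int32_t).
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Return the connectable database with the oldest normal datfrozenxid,
 * or NULL if there is none.
 */
static const DbEntry *
find_oldest_db(const DbEntry *dbs, size_t n)
{
    const DbEntry *oldest = NULL;

    assert(dbs != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (!dbs[i].datallowconn)
            continue;           /* assumed to be frozen already */
        if (dbs[i].datfrozenxid < FirstNormalTransactionId)
            continue;           /* special XIDs never wrap */
        if (oldest == NULL ||
            xid_precedes(dbs[i].datfrozenxid, oldest->datfrozenxid))
            oldest = &dbs[i];
    }
    return oldest;
}
```

The database returned here would be the one named in the warning and error messages, and its datfrozenxid would be the value handed to SetTransactionIdLimit().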

15 files changed, +1191 -571 lines changed

doc/src/sgml/maintenance.sgml

+45 -21
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.40 2004/12/27 22:30:10 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.41 2005/02/20 02:21:26 tgl Exp $
 -->
 
 <chapter id="maintenance">
@@ -290,7 +290,7 @@ $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.40 2004/12/27 22:30:10 tgl
 transaction's XID is <quote>in the future</> and should not be visible
 to the current transaction. But since transaction IDs have limited size
 (32 bits at this writing) a cluster that runs for a long time (more
-than 4 billion transactions) will suffer <firstterm>transaction ID
+than 4 billion transactions) would suffer <firstterm>transaction ID
 wraparound</>: the XID counter wraps around to zero, and all of a sudden
 transactions that were in the past appear to be in the future &mdash; which
 means their outputs become invisible. In short, catastrophic data loss.
@@ -313,8 +313,13 @@ $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.40 2004/12/27 22:30:10 tgl
 In practice this isn't an onerous requirement, but since the
 consequences of failing to meet it can be complete data loss (not
 just wasted disk space or slow performance), some special provisions
-have been made to help database administrators keep track of the
-time since the last <command>VACUUM</>. The remainder of this
+have been made to help database administrators avoid disaster.
+For each database in the cluster, <productname>PostgreSQL</productname>
+keeps track of the time of the last database-wide <command>VACUUM</>.
+When any database approaches the billion-transaction danger level,
+the system begins to emit warning messages. If nothing is done, it
+will eventually shut down normal operations until appropriate
+manual maintenance is done. The remainder of this
 section gives the details.
 </para>

@@ -363,7 +368,8 @@ $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.40 2004/12/27 22:30:10 tgl
 statistics in the system table <literal>pg_database</>. In particular,
 the <literal>datfrozenxid</> column of a database's
 <literal>pg_database</> row is updated at the completion of any
-database-wide <command>VACUUM</command> operation (i.e., <command>VACUUM</> that does not
+database-wide <command>VACUUM</command> operation (i.e.,
+<command>VACUUM</> that does not
 name a specific table). The value stored in this field is the freeze
 cutoff XID that was used by that <command>VACUUM</> command. All normal
 XIDs older than this cutoff XID are guaranteed to have been replaced by
@@ -391,12 +397,37 @@ SELECT datname, age(datfrozenxid) FROM pg_database;
 
 <programlisting>
 play=# VACUUM;
-WARNING: some databases have not been vacuumed in 1613770184 transactions
-HINT: Better vacuum them within 533713463 transactions, or you may have a wraparound failure.
+WARNING: database "mydb" must be vacuumed within 177009986 transactions
+HINT: To avoid a database shutdown, execute a full-database VACUUM in "mydb".
 VACUUM
 </programlisting>
 </para>
 
+<para>
+If the warnings emitted by <command>VACUUM</> go ignored, then
+<productname>PostgreSQL</productname> will begin to emit a warning
+like the above on every transaction start once there are fewer than 10
+million transactions left until wraparound. If those warnings also are
+ignored, the system will shut down and refuse to execute any new
+transactions once there are fewer than 1 million transactions left
+until wraparound:
+
+<programlisting>
+play=# select 2+2;
+ERROR: database is shut down to avoid wraparound data loss in database "mydb"
+HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
+</programlisting>
+
+The 1-million-transaction safety margin exists to let the
+administrator recover without data loss, by manually executing the
+required <command>VACUUM</> commands. However, since the system will not
+execute commands once it has gone into the safety shutdown mode,
+the only way to do this is to stop the postmaster and use a standalone
+backend to execute <command>VACUUM</>. The shutdown mode is not enforced
+by a standalone backend. See the <xref linkend="app-postgres"> reference
+page for details about using a standalone backend.
+</para>
+
 <para>
 <command>VACUUM</> with the <command>FREEZE</> option uses a more
 aggressive freezing policy: row versions are frozen if they are old enough
@@ -410,26 +441,19 @@ VACUUM
 It should also be used to prepare any user-created databases that
 are to be marked <literal>datallowconn</> = <literal>false</> in
 <literal>pg_database</>, since there isn't any convenient way to
-<command>VACUUM</command> a database that you can't connect to. Note that
-<command>VACUUM</command>'s automatic warning message about
-unvacuumed databases will ignore <literal>pg_database</> entries
-with <literal>datallowconn</> = <literal>false</>, so as to avoid
-giving false warnings about these databases; therefore it's up to
-you to ensure that such databases are frozen correctly.
+<command>VACUUM</command> a database that you can't connect to.
 </para>
 
 <warning>
 <para>
-To be sure of safety against transaction wraparound, it is necessary
-to vacuum <emphasis>every</> table, including system catalogs, in
-<emphasis>every</> database at least once every billion transactions.
-We have seen data loss situations caused by people deciding that they
-only needed to vacuum their active user tables, rather than issuing
-database-wide vacuum commands. That will appear to work fine ...
-for a while.
+A database that is marked <literal>datallowconn</> = <literal>false</>
+in <literal>pg_database</> is assumed to be properly frozen; the
+automatic warnings and wraparound protection shutdown do not take
+such databases into account. Therefore it's up to you to ensure
+you've correctly frozen a database before you mark it with
+<literal>datallowconn</> = <literal>false</>.
 </para>
 </warning>
-
 </sect2>
 </sect1>
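
The documentation text in this file depends on XIDs being compared modulo 2^32, which is why a row that is too old suddenly looks as if it came from the future. The following is a minimal sketch of that arithmetic, assuming 32-bit XIDs as the text states. The helper names xid_in_past() and xid_age() are made up for the illustration and are not the backend's own functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * True if xid is logically older than cur.  The difference is taken
 * modulo 2^32 and read as a signed value, so each XID sees about two
 * billion XIDs in its past and about two billion in its future.
 * (Relies on the usual two's-complement conversion to int32_t.)
 */
static bool
xid_in_past(TransactionId xid, TransactionId cur)
{
    return (int32_t) (xid - cur) < 0;
}

/*
 * Number of transactions between xid and cur, in the spirit of
 * age(datfrozenxid) in the monitoring query quoted in the hunk header.
 */
static int32_t
xid_age(TransactionId xid, TransactionId cur)
{
    return (int32_t) (cur - xid);
}
```

With xid = 100, the row is still in the past when the counter is at 100 + 2^31 - 1, but one step beyond the halfway point xid_in_past() flips to false: the unfrozen row now appears to be in the future, which is the data loss the chapter warns about.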

src/backend/access/transam/varsup.c

+112 -1
@@ -6,7 +6,7 @@
  * Copyright (c) 2000-2005, PostgreSQL Global Development Group
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/access/transam/varsup.c,v 1.60 2005/01/01 05:43:06 momjian Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/varsup.c,v 1.61 2005/02/20 02:21:28 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -16,8 +16,10 @@
 #include "access/clog.h"
 #include "access/subtrans.h"
 #include "access/transam.h"
+#include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/proc.h"
+#include "utils/builtins.h"
 
 
 /* Number of OIDs to prefetch (preallocate) per XLOG write */
@@ -46,6 +48,37 @@ GetNewTransactionId(bool isSubXact)
 
     xid = ShmemVariableCache->nextXid;
 
+    /*
+     * Check to see if it's safe to assign another XID. This protects
+     * against catastrophic data loss due to XID wraparound. The basic
+     * rules are: warn if we're past xidWarnLimit, and refuse to execute
+     * transactions if we're past xidStopLimit, unless we are running in
+     * a standalone backend (which gives an escape hatch to the DBA who
+     * ignored all those warnings).
+     *
+     * Test is coded to fall out as fast as possible during normal operation,
+     * ie, when the warn limit is set and we haven't violated it.
+     */
+    if (TransactionIdFollowsOrEquals(xid, ShmemVariableCache->xidWarnLimit) &&
+        TransactionIdIsValid(ShmemVariableCache->xidWarnLimit))
+    {
+        if (IsUnderPostmaster &&
+            TransactionIdFollowsOrEquals(xid, ShmemVariableCache->xidStopLimit))
+            ereport(ERROR,
+                    (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                     errmsg("database is shut down to avoid wraparound data loss in database \"%s\"",
+                            NameStr(ShmemVariableCache->limit_datname)),
+                     errhint("Stop the postmaster and use a standalone backend to VACUUM in \"%s\".",
+                             NameStr(ShmemVariableCache->limit_datname))));
+        else
+            ereport(WARNING,
+                    (errmsg("database \"%s\" must be vacuumed within %u transactions",
+                            NameStr(ShmemVariableCache->limit_datname),
+                            ShmemVariableCache->xidWrapLimit - xid),
+                     errhint("To avoid a database shutdown, execute a full-database VACUUM in \"%s\".",
+                             NameStr(ShmemVariableCache->limit_datname))));
+    }
+
     /*
      * If we are allocating the first XID of a new page of the commit log,
      * zero out that commit-log page before returning. We must do this
@@ -137,6 +170,84 @@ ReadNewTransactionId(void)
     return xid;
 }
 
+/*
+ * Determine the last safe XID to allocate given the currently oldest
+ * datfrozenxid (ie, the oldest XID that might exist in any database
+ * of our cluster).
+ */
+void
+SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
+                      Name oldest_datname)
+{
+    TransactionId xidWarnLimit;
+    TransactionId xidStopLimit;
+    TransactionId xidWrapLimit;
+    TransactionId curXid;
+
+    Assert(TransactionIdIsValid(oldest_datfrozenxid));
+
+    /*
+     * The place where we actually get into deep trouble is halfway around
+     * from the oldest potentially-existing XID. (This calculation is
+     * probably off by one or two counts, because the special XIDs reduce the
+     * size of the loop a little bit. But we throw in plenty of slop below,
+     * so it doesn't matter.)
+     */
+    xidWrapLimit = oldest_datfrozenxid + (MaxTransactionId >> 1);
+    if (xidWrapLimit < FirstNormalTransactionId)
+        xidWrapLimit += FirstNormalTransactionId;
+
+    /*
+     * We'll refuse to continue assigning XIDs in interactive mode once
+     * we get within 1M transactions of data loss. This leaves lots
+     * of room for the DBA to fool around fixing things in a standalone
+     * backend, while not being significant compared to total XID space.
+     * (Note that since vacuuming requires one transaction per table
+     * cleaned, we had better be sure there's lots of XIDs left...)
+     */
+    xidStopLimit = xidWrapLimit - 1000000;
+    if (xidStopLimit < FirstNormalTransactionId)
+        xidStopLimit -= FirstNormalTransactionId;
+
+    /*
+     * We'll start complaining loudly when we get within 10M transactions
+     * of the stop point. This is kind of arbitrary, but if you let your
+     * gas gauge get down to 1% of full, would you be looking for the
+     * next gas station? We need to be fairly liberal about this number
+     * because there are lots of scenarios where most transactions are
+     * done by automatic clients that won't pay attention to warnings.
+     * (No, we're not gonna make this configurable. If you know enough to
+     * configure it, you know enough to not get in this kind of trouble in
+     * the first place.)
+     */
+    xidWarnLimit = xidStopLimit - 10000000;
+    if (xidWarnLimit < FirstNormalTransactionId)
+        xidWarnLimit -= FirstNormalTransactionId;
+
+    /* Grab lock for just long enough to set the new limit values */
+    LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+    ShmemVariableCache->xidWarnLimit = xidWarnLimit;
+    ShmemVariableCache->xidStopLimit = xidStopLimit;
+    ShmemVariableCache->xidWrapLimit = xidWrapLimit;
+    namecpy(&ShmemVariableCache->limit_datname, oldest_datname);
+    curXid = ShmemVariableCache->nextXid;
+    LWLockRelease(XidGenLock);
+
+    /* Log the info */
+    ereport(LOG,
+            (errmsg("transaction ID wrap limit is %u, limited by database \"%s\"",
+                    xidWrapLimit, NameStr(*oldest_datname))));
+    /* Give an immediate warning if past the wrap warn point */
+    if (TransactionIdFollowsOrEquals(curXid, xidWarnLimit))
+        ereport(WARNING,
+                (errmsg("database \"%s\" must be vacuumed within %u transactions",
+                        NameStr(*oldest_datname),
+                        xidWrapLimit - curXid),
+                 errhint("To avoid a database shutdown, execute a full-database VACUUM in \"%s\".",
+                         NameStr(*oldest_datname))));
+}
+
+
 /* ----------------------------------------------------------------
  * object id generation support
  * ----------------------------------------------------------------
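
The limit arithmetic in SetTransactionIdLimit() and the check added to GetNewTransactionId() can be tried outside the backend with a small standalone sketch. It is a simplification under stated assumptions: there is no shared memory or locking, the special-XID adjustments from the diff are omitted, and XidLimits, XidVerdict, compute_limits() and check_xid() are names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId     ((TransactionId) 0)
#define FirstNormalTransactionId ((TransactionId) 3)
#define MaxTransactionId         ((TransactionId) 0xFFFFFFFF)

/* Hypothetical container for the three limits kept in shared memory. */
typedef struct XidLimits
{
    TransactionId warn;
    TransactionId stop;
    TransactionId wrap;
} XidLimits;

typedef enum XidVerdict { XID_OK, XID_WARN, XID_STOP } XidVerdict;

/* Circular comparison: true if a is at or after b (modulo 2^32). */
static bool
xid_follows_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) >= 0;
}

/*
 * Same constants as the diff: wrap is halfway around from the oldest
 * datfrozenxid, stop is 1M before wrap, warn is 10M before stop.
 * Simplification: results that land on a special XID are not adjusted.
 */
static XidLimits
compute_limits(TransactionId oldest_datfrozenxid)
{
    XidLimits l;

    assert(oldest_datfrozenxid >= FirstNormalTransactionId);

    l.wrap = oldest_datfrozenxid + (MaxTransactionId >> 1);
    l.stop = l.wrap - 1000000;
    l.warn = l.stop - 10000000;
    return l;
}

/*
 * Classify the next XID.  A standalone backend (under_postmaster false)
 * only ever warns, which is the escape hatch the code comment describes.
 */
static XidVerdict
check_xid(TransactionId xid, const XidLimits *l, bool under_postmaster)
{
    if (l->warn == InvalidTransactionId)
        return XID_OK;          /* limits not set yet */
    if (!xid_follows_or_equals(xid, l->warn))
        return XID_OK;          /* the fast path in normal operation */
    if (under_postmaster && xid_follows_or_equals(xid, l->stop))
        return XID_STOP;
    return XID_WARN;
}
```

Under this arithmetic the first warning fires 11,000,000 XIDs before the wrap limit (10,000,000 before the stop limit), and the "must be vacuumed within %u transactions" figure is simply wrap minus the current XID.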

src/backend/access/transam/xact.c

+34 -14
@@ -10,7 +10,7 @@
  *
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.195 2004/12/31 21:59:29 pgsql Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.196 2005/02/20 02:21:28 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -28,14 +28,14 @@
 #include "commands/async.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
-#include "commands/user.h"
 #include "executor/spi.h"
 #include "libpq/be-fsstubs.h"
 #include "miscadmin.h"
 #include "storage/fd.h"
 #include "storage/proc.h"
 #include "storage/sinval.h"
 #include "storage/smgr.h"
+#include "utils/flatfiles.h"
 #include "utils/guc.h"
 #include "utils/inval.h"
 #include "utils/memutils.h"
@@ -1354,6 +1354,7 @@ StartTransaction(void)
      * start processing
      */
     s->state = TRANS_START;
+    s->transactionId = InvalidTransactionId; /* until assigned */
 
     /*
      * Make sure we've freed any old snapshot, and reset xact state
@@ -1464,9 +1465,9 @@ CommitTransaction(void)
     /* NOTIFY commit must come before lower-level cleanup */
     AtCommit_Notify();
 
-    /* Update the flat password file if we changed pg_shadow or pg_group */
+    /* Update flat files if we changed pg_database, pg_shadow or pg_group */
     /* This should be the last step before commit */
-    AtEOXact_UpdatePasswordFile(true);
+    AtEOXact_UpdateFlatFiles(true);
 
     /* Prevent cancel/die interrupt while cleaning up */
     HOLD_INTERRUPTS();
@@ -1654,10 +1655,14 @@ AbortTransaction(void)
     AtAbort_Portals();
     AtEOXact_LargeObject(false); /* 'false' means it's abort */
     AtAbort_Notify();
-    AtEOXact_UpdatePasswordFile(false);
+    AtEOXact_UpdateFlatFiles(false);
 
-    /* Advertise the fact that we aborted in pg_clog. */
-    RecordTransactionAbort();
+    /*
+     * Advertise the fact that we aborted in pg_clog (assuming that we
+     * got as far as assigning an XID to advertise).
+     */
+    if (TransactionIdIsValid(s->transactionId))
+        RecordTransactionAbort();
 
     /*
      * Let others know about no transaction in progress by me. Note that
@@ -2034,10 +2039,25 @@ AbortCurrentTransaction(void)
 
     switch (s->blockState)
     {
-            /*
-             * we aren't in a transaction, so we do nothing.
-             */
         case TBLOCK_DEFAULT:
+            if (s->state == TRANS_DEFAULT)
+            {
+                /* we are idle, so nothing to do */
+            }
+            else
+            {
+                /*
+                 * We can get here after an error during transaction start
+                 * (state will be TRANS_START). Need to clean up the
+                 * incompletely started transaction. First, adjust the
+                 * low-level state to suppress warning message from
+                 * AbortTransaction.
+                 */
+                if (s->state == TRANS_START)
+                    s->state = TRANS_INPROGRESS;
+                AbortTransaction();
+                CleanupTransaction();
+            }
             break;
 
             /*
@@ -3277,8 +3297,8 @@ CommitSubTransaction(void)
     AtEOSubXact_LargeObject(true, s->subTransactionId,
                             s->parent->subTransactionId);
     AtSubCommit_Notify();
-    AtEOSubXact_UpdatePasswordFile(true, s->subTransactionId,
-                                   s->parent->subTransactionId);
+    AtEOSubXact_UpdateFlatFiles(true, s->subTransactionId,
+                                s->parent->subTransactionId);
 
     CallSubXactCallbacks(SUBXACT_EVENT_COMMIT_SUB, s->subTransactionId,
                          s->parent->subTransactionId);
@@ -3387,8 +3407,8 @@ AbortSubTransaction(void)
     AtEOSubXact_LargeObject(false, s->subTransactionId,
                             s->parent->subTransactionId);
     AtSubAbort_Notify();
-    AtEOSubXact_UpdatePasswordFile(false, s->subTransactionId,
-                                   s->parent->subTransactionId);
+    AtEOSubXact_UpdateFlatFiles(false, s->subTransactionId,
+                                s->parent->subTransactionId);
 
     /* Advertise the fact that we aborted in pg_clog. */
     if (TransactionIdIsValid(s->transactionId))
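
The xact.c changes exist because GetNewTransactionId() can now raise an error while a transaction is still starting, before any XID has been assigned. The toy state machine below illustrates the two resulting rules: a half-started transaction still has to be aborted and cleaned up, and an abort is recorded only when an XID exists. It is a loose model, not the backend's code. The Xact struct, the abort_records counter and both function names are hypothetical, and the real AbortTransaction()/CleanupTransaction() pair is collapsed into one step.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)

typedef enum TransState
{
    TRANS_DEFAULT,              /* idle */
    TRANS_START,                /* transaction is starting up */
    TRANS_INPROGRESS            /* transaction is running */
} TransState;

/* Hypothetical, heavily reduced transaction state. */
typedef struct Xact
{
    TransState    state;
    TransactionId xid;           /* InvalidTransactionId until assigned */
    int           abort_records; /* simulated abort records written */
} Xact;

/* Stand-in for AbortTransaction() followed by CleanupTransaction(). */
static void
abort_transaction(Xact *x)
{
    /* Modeled as an assertion; the diff only mentions a warning here. */
    assert(x->state == TRANS_INPROGRESS);

    /* Record the abort only if we got as far as assigning an XID. */
    if (x->xid != InvalidTransactionId)
        x->abort_records++;

    x->state = TRANS_DEFAULT;
    x->xid = InvalidTransactionId;
}

/* Stand-in for the TBLOCK_DEFAULT branch of AbortCurrentTransaction(). */
static void
abort_current(Xact *x)
{
    if (x->state == TRANS_DEFAULT)
        return;                 /* idle, nothing to do */

    /* An error during startup leaves TRANS_START; normalize it first. */
    if (x->state == TRANS_START)
        x->state = TRANS_INPROGRESS;
    abort_transaction(x);
}
```

A transaction that fails during startup ends up idle again with no abort record, while one that already holds an XID records exactly one.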

src/backend/bootstrap/bootstrap.c

+3 -1
@@ -8,7 +8,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/bootstrap/bootstrap.c,v 1.198 2005/01/14 21:08:44 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/bootstrap/bootstrap.c,v 1.199 2005/02/20 02:21:31 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -39,6 +39,7 @@
 #include "storage/proc.h"
 #include "tcop/tcopprot.h"
 #include "utils/builtins.h"
+#include "utils/flatfiles.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
 #include "utils/lsyscache.h"
@@ -407,6 +408,7 @@ BootstrapMain(int argc, char *argv[])
             bootstrap_signals();
             StartupXLOG();
             LoadFreeSpaceMap();
+            BuildFlatFiles(false);
             proc_exit(0); /* startup done */
 
         case BS_XLOG_BGWRITER:
