
Commit 1e55e7d

Add wraparound failsafe to VACUUM.
Add a failsafe mechanism that is triggered by VACUUM when it notices that the table's relfrozenxid and/or relminmxid are dangerously far in the past. VACUUM checks the age of the table dynamically, at regular intervals.

When the failsafe triggers, VACUUM takes extraordinary measures to finish as quickly as possible so that relfrozenxid and/or relminmxid can be advanced. VACUUM will stop applying any cost-based delay that may be in effect. VACUUM will also bypass any further index vacuuming and heap vacuuming -- it only completes whatever remaining pruning and freezing is required. Bypassing index/heap vacuuming is enabled by commit 8523492, which made it possible to dynamically trigger the mechanism already used within VACUUM when it is run with INDEX_CLEANUP off.

It is expected that the failsafe will almost always trigger within an autovacuum to prevent wraparound, long after the autovacuum began. However, the failsafe mechanism can trigger in any VACUUM operation. Even in a non-aggressive VACUUM, where we're likely to not advance relfrozenxid, it still seems like a good idea to finish off remaining pruning and freezing. An aggressive/anti-wraparound VACUUM will be launched immediately afterwards. Note that the anti-wraparound VACUUM that follows will itself trigger the failsafe, usually before it even begins its first (and only) pass over the heap.

The failsafe is controlled by two new GUCs: vacuum_failsafe_age, and vacuum_multixact_failsafe_age. There are no equivalent reloptions, since that isn't expected to be useful. The GUCs have rather high defaults (both default to 1.6 billion), and are expected to generally only be used to make the failsafe trigger sooner/more frequently.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzmgH3ySGYeC-m-eOBsa2=sDwa292-CFghV4rESYo39FsQ@mail.gmail.com
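The age check that the commit message describes can be illustrated with a minimal standalone sketch. This is not the code added by the commit: the names `Xid`, `xid_age` and `failsafe_should_trigger` are hypothetical, and the sketch ignores PostgreSQL's special (non-normal) XID values. It only shows why a 32-bit modular subtraction gives a usable age even after the XID counter has wrapped past zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's 32-bit TransactionId type. */
typedef uint32_t Xid;

/*
 * Age of old_xid relative to next_xid, in transactions.  Unsigned
 * subtraction wraps modulo 2^32, so the result stays correct even when the
 * XID counter has wrapped past zero since old_xid was assigned.
 */
static uint32_t
xid_age(Xid next_xid, Xid old_xid)
{
    return next_xid - old_xid;
}

/* True when the table's relfrozenxid is older than the failsafe age. */
static bool
failsafe_should_trigger(Xid next_xid, Xid relfrozenxid, uint32_t failsafe_age)
{
    /* The documentation describes a settable range of zero to 2.1 billion. */
    assert(failsafe_age <= 2100000000u);

    return xid_age(next_xid, relfrozenxid) > failsafe_age;
}
```

With the default threshold of 1.6 billion, a table whose relfrozenxid is 1.65 billion transactions behind the next XID would trigger the failsafe in this sketch, while one that is 1.5 billion behind would not.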
1 parent 4f0b096 commit 1e55e7d

6 files changed: +316 -13 lines changed


doc/src/sgml/config.sgml

+66
@@ -8644,6 +8644,39 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>

+     <varlistentry id="guc-vacuum-failsafe-age" xreflabel="vacuum_failsafe_age">
+      <term><varname>vacuum_failsafe_age</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>vacuum_failsafe_age</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the maximum age (in transactions) that a table's
+        <structname>pg_class</structname>.<structfield>relfrozenxid</structfield>
+        field can attain before <command>VACUUM</command> takes
+        extraordinary measures to avoid system-wide transaction ID
+        wraparound failure. This is <command>VACUUM</command>'s
+        strategy of last resort. The failsafe typically triggers
+        when an autovacuum to prevent transaction ID wraparound has
+        already been running for some time, though it's possible for
+        the failsafe to trigger during any <command>VACUUM</command>.
+       </para>
+       <para>
+        When the failsafe is triggered, any cost-based delay that is
+        in effect will no longer be applied, and further non-essential
+        maintenance tasks (such as index vacuuming) are bypassed.
+       </para>
+       <para>
+        The default is 1.6 billion transactions. Although users can
+        set this value anywhere from zero to 2.1 billion,
+        <command>VACUUM</command> will silently adjust the effective
+        value to no less than 105% of <xref
+        linkend="guc-autovacuum-freeze-max-age"/>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-vacuum-multixact-freeze-table-age" xreflabel="vacuum_multixact_freeze_table_age">
       <term><varname>vacuum_multixact_freeze_table_age</varname> (<type>integer</type>)
       <indexterm>
@@ -8690,6 +8723,39 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>

+     <varlistentry id="guc-multixact-failsafe-age" xreflabel="vacuum_multixact_failsafe_age">
+      <term><varname>vacuum_multixact_failsafe_age</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>vacuum_multixact_failsafe_age</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the maximum age (in transactions) that a table's
+        <structname>pg_class</structname>.<structfield>relminmxid</structfield>
+        field can attain before <command>VACUUM</command> takes
+        extraordinary measures to avoid system-wide multixact ID
+        wraparound failure. This is <command>VACUUM</command>'s
+        strategy of last resort. The failsafe typically triggers when
+        an autovacuum to prevent transaction ID wraparound has already
+        been running for some time, though it's possible for the
+        failsafe to trigger during any <command>VACUUM</command>.
+       </para>
+       <para>
+        When the failsafe is triggered, any cost-based delay that is
+        in effect will no longer be applied, and further non-essential
+        maintenance tasks (such as index vacuuming) are bypassed.
+       </para>
+       <para>
+        The default is 1.6 billion multixacts. Although users can set
+        this value anywhere from zero to 2.1 billion,
+        <command>VACUUM</command> will silently adjust the effective
+        value to no less than 105% of <xref
+        linkend="guc-autovacuum-multixact-freeze-max-age"/>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-bytea-output" xreflabel="bytea_output">
       <term><varname>bytea_output</varname> (<type>enum</type>)
       <indexterm>
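Both new documentation entries say that VACUUM silently raises the effective failsafe age to at least 105% of the matching autovacuum freeze limit. A minimal sketch of that clamping rule follows. The helper `effective_failsafe_age` is hypothetical, and the integer formulation (multiply by 105, divide by 100 in 64-bit arithmetic) is an assumption made here to keep the example exact; it is not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the failsafe age VACUUM effectively uses, given the
 * configured failsafe age and the matching autovacuum freeze limit.  The
 * floor is 105% of the freeze limit, computed in 64-bit integer arithmetic
 * to avoid overflow and rounding surprises.
 */
static int32_t
effective_failsafe_age(int32_t failsafe_age, int32_t freeze_max_age)
{
    int64_t floor_age = ((int64_t) freeze_max_age * 105) / 100;

    assert(failsafe_age >= 0 && freeze_max_age >= 0);

    return (failsafe_age > floor_age) ? failsafe_age : (int32_t) floor_age;
}
```

For example, with a freeze limit of 200 million (an example value), a configured failsafe age of zero would be treated as 210 million, while the default of 1.6 billion would be used unchanged.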

src/backend/access/heap/vacuumlazy.c

+164 -12
@@ -103,6 +103,13 @@
 #define VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL 50 /* ms */
 #define VACUUM_TRUNCATE_LOCK_TIMEOUT 5000 /* ms */

+/*
+ * When a table is small (i.e. smaller than this), save cycles by avoiding
+ * repeated failsafe checks
+ */
+#define FAILSAFE_MIN_PAGES \
+    ((BlockNumber) (((uint64) 4 * 1024 * 1024 * 1024) / BLCKSZ))
+
 /*
  * When a table has no indexes, vacuum the FSM after every 8GB, approximately
  * (it won't be exact because we only vacuum FSM after processing a heap page
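As a worked example of the FAILSAFE_MIN_PAGES arithmetic in this hunk, the sketch below evaluates the same expression for an assumed block size of 8192 bytes. BLCKSZ is a build-time setting, so 8192 is only an assumption here, and the `SKETCH_` names and `is_small_table` helper are hypothetical. The cast to a 64-bit type matters because 4 * 1024 * 1024 * 1024 does not fit in a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8192 is a common block size; BLCKSZ is a build-time setting. */
#define SKETCH_BLCKSZ 8192

/* Same arithmetic as FAILSAFE_MIN_PAGES: 4GB expressed in pages. */
#define SKETCH_FAILSAFE_MIN_PAGES \
    ((uint32_t) (((uint64_t) 4 * 1024 * 1024 * 1024) / SKETCH_BLCKSZ))

/* Compile-time check of the arithmetic: 4294967296 / 8192 = 524288 pages. */
static_assert(SKETCH_FAILSAFE_MIN_PAGES == 524288, "4GB in 8KB pages");

/* Hypothetical helper: is a table of rel_pages blocks at or below the cutoff? */
static int
is_small_table(uint32_t rel_pages)
{
    return rel_pages <= SKETCH_FAILSAFE_MIN_PAGES;
}
```

Under that assumption, a table of up to 524288 pages counts as small for the purposes of skipping repeated failsafe checks.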
@@ -299,6 +306,8 @@ typedef struct LVRelState
     /* Do index vacuuming/cleanup? */
     bool do_index_vacuuming;
     bool do_index_cleanup;
+    /* Wraparound failsafe in effect? (implies !do_index_vacuuming) */
+    bool do_failsafe;

     /* Buffer access strategy and parallel state */
     BufferAccessStrategy bstrategy;
@@ -393,12 +402,13 @@ static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
                             GlobalVisState *vistest,
                             LVPagePruneState *prunestate);
 static void lazy_vacuum(LVRelState *vacrel);
-static void lazy_vacuum_all_indexes(LVRelState *vacrel);
+static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
 static void lazy_vacuum_heap_rel(LVRelState *vacrel);
 static int lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
                                  Buffer buffer, int tupindex, Buffer *vmbuffer);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup,
                                     LVRelState *vacrel);
+static bool lazy_check_wraparound_failsafe(LVRelState *vacrel);
 static void do_parallel_lazy_vacuum_all_indexes(LVRelState *vacrel);
 static void do_parallel_lazy_cleanup_all_indexes(LVRelState *vacrel);
 static void do_parallel_vacuum_or_cleanup(LVRelState *vacrel, int nworkers);
@@ -544,6 +554,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
                      &vacrel->indrels);
     vacrel->do_index_vacuuming = true;
     vacrel->do_index_cleanup = true;
+    vacrel->do_failsafe = false;
     if (params->index_cleanup == VACOPT_TERNARY_DISABLED)
     {
         vacrel->do_index_vacuuming = false;
@@ -888,6 +899,12 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
     vacrel->indstats = (IndexBulkDeleteResult **)
         palloc0(vacrel->nindexes * sizeof(IndexBulkDeleteResult *));

+    /*
+     * Before beginning scan, check if it's already necessary to apply
+     * failsafe
+     */
+    lazy_check_wraparound_failsafe(vacrel);
+
     /*
      * Allocate the space for dead tuples. Note that this handles parallel
      * VACUUM initialization as part of allocating shared memory space used
@@ -1311,12 +1328,17 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
              * Periodically perform FSM vacuuming to make newly-freed
              * space visible on upper FSM pages. Note we have not yet
              * performed FSM processing for blkno.
+             *
+             * Call lazy_check_wraparound_failsafe() here, too, since we
+             * also don't want to do that too frequently, or too
+             * infrequently.
              */
             if (blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
             {
                 FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
                                         blkno);
                 next_fsm_block_to_vacuum = blkno;
+                lazy_check_wraparound_failsafe(vacrel);
             }

             /*
@@ -1450,6 +1472,13 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
          * make available in cases where it's possible to truncate the
          * page's line pointer array.
          *
+         * Note: It's not in fact 100% certain that we really will call
+         * lazy_vacuum_heap_rel() -- lazy_vacuum() might yet opt to skip
+         * index vacuuming (and so must skip heap vacuuming). This is
+         * deemed okay because it only happens in emergencies. (Besides,
+         * we start recording free space in the FSM once index vacuuming
+         * has been abandoned.)
+         *
          * Note: The one-pass (no indexes) case is only supposed to make
          * it this far when there were no LP_DEAD items during pruning.
          */
@@ -1499,7 +1528,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)

     /*
      * Vacuum the remainder of the Free Space Map. We must do this whether or
-     * not there were indexes.
+     * not there were indexes, and whether or not we bypassed index vacuuming.
      */
     if (blkno > next_fsm_block_to_vacuum)
         FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
@@ -1953,6 +1982,11 @@ lazy_scan_prune(LVRelState *vacrel,

 /*
  * Remove the collected garbage tuples from the table and its indexes.
+ *
+ * In rare emergencies, the ongoing VACUUM operation can be made to skip both
+ * index vacuuming and index cleanup at the point we're called. This avoids
+ * having the whole system refuse to allocate further XIDs/MultiXactIds due to
+ * wraparound.
  */
 static void
 lazy_vacuum(LVRelState *vacrel)
@@ -1969,11 +2003,30 @@ lazy_vacuum(LVRelState *vacrel)
         return;
     }

-    /* Okay, we're going to do index vacuuming */
-    lazy_vacuum_all_indexes(vacrel);
-
-    /* Remove tuples from heap */
-    lazy_vacuum_heap_rel(vacrel);
+    if (lazy_vacuum_all_indexes(vacrel))
+    {
+        /*
+         * We successfully completed a round of index vacuuming. Do related
+         * heap vacuuming now.
+         */
+        lazy_vacuum_heap_rel(vacrel);
+    }
+    else
+    {
+        /*
+         * Failsafe case.
+         *
+         * we attempted index vacuuming, but didn't finish a full round/full
+         * index scan. This happens when relfrozenxid or relminmxid is too
+         * far in the past.
+         *
+         * From this point on the VACUUM operation will do no further index
+         * vacuuming or heap vacuuming. It will do any remaining pruning that
+         * may be required, plus other heap-related and relation-level
+         * maintenance tasks. But that's it.
+         */
+        Assert(vacrel->do_failsafe);
+    }

     /*
      * Forget the now-vacuumed tuples -- just press on
@@ -1983,17 +2036,31 @@ lazy_vacuum(LVRelState *vacrel)

 /*
  * lazy_vacuum_all_indexes() -- Main entry for index vacuuming
+ *
+ * Returns true in the common case when all indexes were successfully
+ * vacuumed. Returns false in rare cases where we determined that the ongoing
+ * VACUUM operation is at risk of taking too long to finish, leading to
+ * wraparound failure.
  */
-static void
+static bool
 lazy_vacuum_all_indexes(LVRelState *vacrel)
 {
+    bool allindexes = true;
+
     Assert(!IsParallelWorker());
     Assert(vacrel->nindexes > 0);
     Assert(vacrel->do_index_vacuuming);
     Assert(vacrel->do_index_cleanup);
     Assert(TransactionIdIsNormal(vacrel->relfrozenxid));
     Assert(MultiXactIdIsValid(vacrel->relminmxid));

+    /* Precheck for XID wraparound emergencies */
+    if (lazy_check_wraparound_failsafe(vacrel))
+    {
+        /* Wraparound emergency -- don't even start an index scan */
+        return false;
+    }
+
     /* Report that we are now vacuuming indexes */
     pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
                                  PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
@@ -2008,26 +2075,50 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
             vacrel->indstats[idx] =
                 lazy_vacuum_one_index(indrel, istat, vacrel->old_live_tuples,
                                       vacrel);
+
+            if (lazy_check_wraparound_failsafe(vacrel))
+            {
+                /* Wraparound emergency -- end current index scan */
+                allindexes = false;
+                break;
+            }
         }
     }
     else
     {
         /* Outsource everything to parallel variant */
         do_parallel_lazy_vacuum_all_indexes(vacrel);
+
+        /*
+         * Do a postcheck to consider applying wraparound failsafe now. Note
+         * that parallel VACUUM only gets the precheck and this postcheck.
+         */
+        if (lazy_check_wraparound_failsafe(vacrel))
+            allindexes = false;
     }

     /*
      * We delete all LP_DEAD items from the first heap pass in all indexes on
-     * each call here. This makes the next call to lazy_vacuum_heap_rel()
-     * safe.
+     * each call here (except calls where we choose to do the failsafe). This
+     * makes the next call to lazy_vacuum_heap_rel() safe (except in the event
+     * of the failsafe triggering, which prevents the next call from taking
+     * place).
      */
     Assert(vacrel->num_index_scans > 0 ||
            vacrel->dead_tuples->num_tuples == vacrel->lpdead_items);
+    Assert(allindexes || vacrel->do_failsafe);

-    /* Increase and report the number of index scans */
+    /*
+     * Increase and report the number of index scans.
+     *
+     * We deliberately include the case where we started a round of bulk
+     * deletes that we weren't able to finish due to the failsafe triggering.
+     */
     vacrel->num_index_scans++;
     pgstat_progress_update_param(PROGRESS_VACUUM_NUM_INDEX_VACUUMS,
                                  vacrel->num_index_scans);
+
+    return allindexes;
 }

 /*
@@ -2320,6 +2411,67 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup, LVRelState *vacrel)
     return (offnum <= maxoff);
 }

+/*
+ * Trigger the failsafe to avoid wraparound failure when vacrel table has a
+ * relfrozenxid and/or relminmxid that is dangerously far in the past.
+ *
+ * Triggering the failsafe makes the ongoing VACUUM bypass any further index
+ * vacuuming and heap vacuuming. It also stops the ongoing VACUUM from
+ * applying any cost-based delay that may be in effect.
+ *
+ * Returns true when failsafe has been triggered.
+ *
+ * Caller is expected to call here before and after vacuuming each index in
+ * the case of two-pass VACUUM, or every VACUUM_FSM_EVERY_PAGES blocks in the
+ * case of no-indexes/one-pass VACUUM.
+ *
+ * There is also a precheck before the first pass over the heap begins, which
+ * is helpful when the failsafe initially triggers during a non-aggressive
+ * VACUUM -- the automatic aggressive vacuum to prevent wraparound that
+ * follows can independently trigger the failsafe right away.
+ */
+static bool
+lazy_check_wraparound_failsafe(LVRelState *vacrel)
+{
+    /* Avoid calling vacuum_xid_failsafe_check() very frequently */
+    if (vacrel->num_index_scans == 0 &&
+        vacrel->rel_pages <= FAILSAFE_MIN_PAGES)
+        return false;
+
+    /* Don't warn more than once per VACUUM */
+    if (vacrel->do_failsafe)
+        return true;
+
+    if (unlikely(vacuum_xid_failsafe_check(vacrel->relfrozenxid,
+                                           vacrel->relminmxid)))
+    {
+        Assert(vacrel->do_index_vacuuming);
+        Assert(vacrel->do_index_cleanup);
+
+        vacrel->do_index_vacuuming = false;
+        vacrel->do_index_cleanup = false;
+        vacrel->do_failsafe = true;
+
+        ereport(WARNING,
+                (errmsg("abandoned index vacuuming of table \"%s.%s.%s\" as a failsafe after %d index scans",
+                        get_database_name(MyDatabaseId),
+                        vacrel->relnamespace,
+                        vacrel->relname,
+                        vacrel->num_index_scans),
+                 errdetail("table's relfrozenxid or relminmxid is too far in the past"),
+                 errhint("Consider increasing configuration parameter \"maintenance_work_mem\" or \"autovacuum_work_mem\".\n"
+                         "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
+
+        /* Stop applying cost limits from this point on */
+        VacuumCostActive = false;
+        VacuumCostBalance = 0;
+
+        return true;
+    }
+
+    return false;
+}
+
 /*
  * Perform lazy_vacuum_all_indexes() steps in parallel
  */
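The state transitions in lazy_check_wraparound_failsafe() can be exercised outside the server with a pared-down model. The sketch below is a toy, not PostgreSQL code: `SketchVacState`, `sketch_check_failsafe` and `SKETCH_MIN_PAGES` are hypothetical names, the boolean `table_too_old` stands in for the result of vacuum_xid_failsafe_check(), a counter replaces the WARNING, and the page cutoff assumes 8KB blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, pared-down stand-in for the LVRelState fields the check uses. */
typedef struct SketchVacState
{
    uint32_t rel_pages;
    int      num_index_scans;
    bool     do_index_vacuuming;
    bool     do_index_cleanup;
    bool     do_failsafe;
    bool     cost_delay_active;  /* models VacuumCostActive */
    int      warnings_emitted;   /* models the one-time WARNING */
} SketchVacState;

#define SKETCH_MIN_PAGES 524288u /* 4GB of 8KB pages; block size is assumed */

/* table_too_old stands in for the result of vacuum_xid_failsafe_check(). */
static bool
sketch_check_failsafe(SketchVacState *st, bool table_too_old)
{
    /* Small table with no index scans so far: skip the check entirely. */
    if (st->num_index_scans == 0 && st->rel_pages <= SKETCH_MIN_PAGES)
        return false;

    /* Already triggered: report it again without repeating side effects. */
    if (st->do_failsafe)
        return true;

    if (table_too_old)
    {
        assert(st->do_index_vacuuming && st->do_index_cleanup);

        st->do_index_vacuuming = false;
        st->do_index_cleanup = false;
        st->do_failsafe = true;
        st->warnings_emitted++;
        st->cost_delay_active = false;  /* stop applying cost-based delay */
        return true;
    }

    return false;
}
```

In this model, once the failsafe has triggered, later calls keep returning true but emit no further warning, and a small table with no completed index scans never reaches the age check at all.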
@@ -3173,7 +3325,7 @@ lazy_space_alloc(LVRelState *vacrel, int nworkers, BlockNumber nblocks)
      * be used for an index, so we invoke parallelism only if there are at
      * least two indexes on a table.
      */
-    if (nworkers >= 0 && vacrel->nindexes > 1)
+    if (nworkers >= 0 && vacrel->nindexes > 1 && vacrel->do_index_vacuuming)
     {
         /*
          * Since parallel workers cannot access data in temporary tables, we
