Commit 83c39a1

Ensure vacuum removes all visibly dead tuples older than OldestXmin
If vacuum fails to remove a tuple with xmax older than VacuumCutoffs->OldestXmin and younger than GlobalVisState->maybe_needed, it may attempt to freeze the tuple's xmax and then ERROR out in pre-freeze checks with "cannot freeze committed xmax".

Fix this by having vacuum always remove tuples older than OldestXmin.

It is possible for GlobalVisState->maybe_needed to precede OldestXmin if maybe_needed is forced to go backward while vacuum is running. This can happen if a disconnected standby with a running transaction older than VacuumCutoffs->OldestXmin reconnects to the primary after vacuum initially calculates GlobalVisState and OldestXmin.

In back branches starting with 14, the first version using GlobalVisState, failing to remove tuples older than OldestXmin during pruning caused vacuum to infinitely loop in lazy_scan_prune(), as investigated on this [1] thread. After 1ccc1e0 removed the retry loop in lazy_scan_prune() and stopped comparing tuples to OldestXmin, the hang could no longer happen, but we could still attempt to freeze dead tuples with xmax older than OldestXmin -- resulting in an ERROR.

Fix this by always removing dead tuples with xmax older than VacuumCutoffs->OldestXmin. This is okay because the standby won't replay the tuple removal until the tuple is removable. Thus, the worst that can happen is a recovery conflict.

[1] https://postgr.es/m/20240415173913.4zyyrwaftujxthf2%40awork3.anarazel.de#1b216b7768b5bd577a3d3d51bd5aadee

Back-patch through 14

Author: Melanie Plageman
Reviewed-by: Peter Geoghegan, Robert Haas, Andres Freund, Heikki Linnakangas, and Noah Misch
Discussion: https://postgr.es/m/CAAKRu_bDD7oq9ZwB2OJqub5BovMG6UjEYsoK2LVttadjEqyRGg%40mail.gmail.com
1 parent 5784a49 commit 83c39a1
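
The failure window described in the commit message can be sketched as a small standalone model. This is an illustration under simplifying assumptions, not PostgreSQL code: Xid, xid_older(), removed_before_fix() and freeze_check_errors() are hypothetical names, the visibility test is reduced to a single comparison against maybe_needed, and the pre-freeze check is reduced to "a surviving tuple with a committed xmax older than OldestXmin is an error".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;   /* hypothetical stand-in for TransactionId */

/* Wraparound-aware "a is older than b", valid for normal XIDs only (>= 3). */
static bool xid_older(Xid a, Xid b)
{
    assert(a >= 3 && b >= 3);
    return (int32_t) (a - b) < 0;
}

/*
 * Assumed pre-fix behavior: a deleted tuple is pruned only when its xmax is
 * older than maybe_needed, which may have moved backward since vacuum began.
 */
static bool removed_before_fix(Xid xmax, Xid maybe_needed)
{
    return xid_older(xmax, maybe_needed);
}

/*
 * Simplified pre-freeze check: a tuple that survived pruning but whose
 * committed xmax is older than OldestXmin cannot be frozen.
 */
static bool freeze_check_errors(Xid xmax, Xid oldest_xmin, Xid maybe_needed)
{
    bool survives = !removed_before_fix(xmax, maybe_needed);

    return survives && xid_older(xmax, oldest_xmin);
}
```

With xmax = 1000, OldestXmin = 1100 and maybe_needed regressed to 900, the tuple survives pruning yet falls below the freeze cutoff, which is the window the commit closes. If maybe_needed had stayed at or above OldestXmin, the model reports no error.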

2 files changed: +29 -8 lines changed

src/backend/access/heap/pruneheap.c

Lines changed: 22 additions & 1 deletion
@@ -325,6 +325,8 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  *
  * cutoffs contains the freeze cutoffs, established by VACUUM at the beginning
  * of vacuuming the relation. Required if HEAP_PRUNE_FREEZE option is set.
+ * cutoffs->OldestXmin is also used to determine if dead tuples are
+ * HEAPTUPLE_RECENTLY_DEAD or HEAPTUPLE_DEAD.
  *
  * presult contains output parameters needed by callers, such as the number of
  * tuples removed and the offsets of dead items on the page after pruning.
@@ -922,8 +924,27 @@ heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup, Buffer buffer)
     if (res != HEAPTUPLE_RECENTLY_DEAD)
         return res;
 
+    /*
+     * For VACUUM, we must be sure to prune tuples with xmax older than
+     * OldestXmin -- a visibility cutoff determined at the beginning of
+     * vacuuming the relation. OldestXmin is used for freezing determination
+     * and we cannot freeze dead tuples' xmaxes.
+     */
+    if (prstate->cutoffs &&
+        TransactionIdIsValid(prstate->cutoffs->OldestXmin) &&
+        NormalTransactionIdPrecedes(dead_after, prstate->cutoffs->OldestXmin))
+        return HEAPTUPLE_DEAD;
+
+    /*
+     * Determine whether or not the tuple is considered dead when compared
+     * with the provided GlobalVisState. On-access pruning does not provide
+     * VacuumCutoffs. And for vacuum, even if the tuple's xmax is not older
+     * than OldestXmin, GlobalVisTestIsRemovableXid() could find the row dead
+     * if the GlobalVisState has been updated since the beginning of vacuuming
+     * the relation.
+     */
     if (GlobalVisTestIsRemovableXid(prstate->vistest, dead_after))
-        res = HEAPTUPLE_DEAD;
+        return HEAPTUPLE_DEAD;
 
     return res;
 }
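
The order of the two checks in the patched heap_prune_satisfies_vacuum() can be modeled in isolation as follows. This is a minimal sketch, not the PostgreSQL implementation: the types and function names are hypothetical, GlobalVisTestIsRemovableXid() is approximated by a single comparison against maybe_needed, and an OldestXmin of 0 stands in for an invalid XID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;                           /* stand-in for TransactionId */
typedef enum { RECENTLY_DEAD, DEAD } Status;    /* stand-ins for HEAPTUPLE_* */

typedef struct { Xid oldest_xmin; } Cutoffs;    /* 0 means "invalid" here */
typedef struct { const Cutoffs *cutoffs; Xid maybe_needed; } PruneCtx;

/* Modulo-2^32 comparison, so ordering stays correct across XID wraparound. */
static bool xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

static Status classify(const PruneCtx *ctx, Xid dead_after)
{
    assert(dead_after >= 3);    /* normal XIDs only */

    /* VACUUM path: anything older than OldestXmin must be removed. */
    if (ctx->cutoffs != NULL && ctx->cutoffs->oldest_xmin != 0 &&
        xid_precedes(dead_after, ctx->cutoffs->oldest_xmin))
        return DEAD;

    /* Both paths: the (possibly updated) horizon may allow removal too. */
    if (xid_precedes(dead_after, ctx->maybe_needed))
        return DEAD;

    return RECENTLY_DEAD;
}

/* On-access pruning supplies no cutoffs. */
static Status classify_on_access(Xid dead_after, Xid maybe_needed)
{
    PruneCtx ctx = { NULL, maybe_needed };

    return classify(&ctx, dead_after);
}

/* VACUUM supplies the OldestXmin it computed at the start of the run. */
static Status classify_vacuum(Xid dead_after, Xid oldest_xmin, Xid maybe_needed)
{
    Cutoffs  cutoffs = { oldest_xmin };
    PruneCtx ctx = { &cutoffs, maybe_needed };

    return classify(&ctx, dead_after);
}
```

In this model, on-access pruning never consults OldestXmin, so the same tuple can be RECENTLY_DEAD for an on-access prune and DEAD for VACUUM when maybe_needed has fallen behind OldestXmin.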

src/backend/access/heap/vacuumlazy.c

Lines changed: 7 additions & 7 deletions
@@ -438,13 +438,13 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
      * as an upper bound on the XIDs stored in the pages we'll actually scan
      * (NewRelfrozenXid tracking must never be allowed to miss unfrozen XIDs).
      *
-     * Next acquire vistest, a related cutoff that's used in pruning. We
-     * expect vistest will always make heap_page_prune_and_freeze() remove any
-     * deleted tuple whose xmax is < OldestXmin. lazy_scan_prune must never
-     * become confused about whether a tuple should be frozen or removed. (In
-     * the future we might want to teach lazy_scan_prune to recompute vistest
-     * from time to time, to increase the number of dead tuples it can prune
-     * away.)
+     * Next acquire vistest, a related cutoff that's used in pruning. We use
+     * vistest in combination with OldestXmin to ensure that
+     * heap_page_prune_and_freeze() always removes any deleted tuple whose
+     * xmax is < OldestXmin. lazy_scan_prune must never become confused about
+     * whether a tuple should be frozen or removed. (In the future we might
+     * want to teach lazy_scan_prune to recompute vistest from time to time,
+     * to increase the number of dead tuples it can prune away.)
      */
     vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs);
     vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel);
