
Commit 43e409c

Backpatch nbtree page deletion hardening.

Postgres 14 commit 5b861ba taught nbtree VACUUM to tolerate buggy opclasses. VACUUM's inability to locate a to-be-deleted page's downlink in the parent page was logged instead of throwing an error. VACUUM could just press on with vacuuming the index, and vacuuming the table as a whole.

There are now anecdotal reports of this error causing problems that were much more disruptive than the underlying index corruption ever could be. Anything that makes VACUUM unable to make forward progress against one table/index ultimately risks making the system enter xidStopLimit mode. There is no good reason to take any chances here, so backpatch the hardening commit.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzm9HR6Pow=t-iQa57zT8qmX6_M4h14F-pTtb=xFDW5FBA@mail.gmail.com
Backpatch: 10-13 (all supported versions that lacked the hardening)

1 parent b70db6c commit 43e409c

1 file changed: +17 -1 lines changed

src/backend/access/nbtree/nbtpage.c

Lines changed: 17 additions & 1 deletion
@@ -2388,10 +2388,26 @@ _bt_lock_subtree_parent(Relation rel, BlockNumber child, BTStack stack,
      */
     pbuf = _bt_getstackbuf(rel, stack, child);
     if (pbuf == InvalidBuffer)
-        ereport(ERROR,
+    {
+        /*
+         * Failed to "re-find" a pivot tuple whose downlink matched our child
+         * block number on the parent level -- the index must be corrupt.
+         * Don't even try to delete the leafbuf subtree.  Just report the
+         * issue and press on with vacuuming the index.
+         *
+         * Note: _bt_getstackbuf() recovers from concurrent page splits that
+         * take place on the parent level.  Its approach is a near-exhaustive
+         * linear search.  This also gives it a surprisingly good chance of
+         * recovering in the event of a buggy or inconsistent opclass.  But we
+         * don't rely on that here.
+         */
+        ereport(LOG,
                 (errcode(ERRCODE_INDEX_CORRUPTED),
                  errmsg_internal("failed to re-find parent key in index \"%s\" for deletion target page %u",
                                  RelationGetRelationName(rel), child)));
+        return false;
+    }
+
     parent = stack->bts_blkno;
     parentoffset = stack->bts_offset;
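
The control-flow change in this hunk can be illustrated outside of PostgreSQL with a small, self-contained C sketch. It is not the nbtree code: `find_downlink`, `lock_parent_for_deletion`, and `vacuum_pass` are hypothetical names invented for the illustration, a plain array stands in for the parent page, and `fprintf(stderr, ...)` stands in for `ereport(LOG, ...)`. The only point it makes is the one from the commit message: a failed downlink lookup is reported and the page is skipped, so the rest of the pass still completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the parent-level lookup: linear search for the
 * slot whose downlink equals "child".  Returns the slot index, or -1 if no
 * downlink matches (the "index must be corrupt" case in the patch).
 */
static int
find_downlink(const unsigned *downlinks, size_t ndownlinks, unsigned child)
{
    for (size_t i = 0; i < ndownlinks; i++)
    {
        if (downlinks[i] == child)
            return (int) i;
    }
    return -1;
}

/*
 * Mirrors the shape of the patched check: on lookup failure, log the problem
 * and return false so the caller can skip this page, rather than aborting
 * the whole operation.
 */
static bool
lock_parent_for_deletion(const unsigned *downlinks, size_t ndownlinks,
                         unsigned child)
{
    if (find_downlink(downlinks, ndownlinks, child) < 0)
    {
        fprintf(stderr,
                "LOG: failed to re-find parent key for deletion target page %u\n",
                child);
        return false;
    }
    return true;
}

/*
 * Toy "vacuum pass" over a list of deletion targets.  Returns the number of
 * pages that could be deleted; *nskipped receives the number of pages that
 * were skipped because their downlink was not found.
 */
static size_t
vacuum_pass(const unsigned *downlinks, size_t ndownlinks,
            const unsigned *targets, size_t ntargets, size_t *nskipped)
{
    size_t ndeleted = 0;

    assert(nskipped != NULL);
    *nskipped = 0;

    for (size_t i = 0; i < ntargets; i++)
    {
        if (lock_parent_for_deletion(downlinks, ndownlinks, targets[i]))
            ndeleted++;
        else
            (*nskipped)++;      /* press on with the remaining targets */
    }
    return ndeleted;
}
```

With downlinks `{10, 11, 13}` and targets `{11, 12, 13}`, the pass logs one message for page 12 and still processes page 13. Had the lookup failure been treated as a fatal error, the pass would have stopped at page 12.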
