Commit 2adadf0

Check for interrupts inside the nbtree page deletion code.

When deleting pages, the nbtree code has to walk through the siblings of a tree node. When those sibling links are corrupted, that can lead to endless loops, which are currently not interruptible. This is especially problematic if autovacuum is repeatedly blocked on such indexes, as it can be hard to get out of that situation without resorting to single-user mode.

Thus add interrupt checks to appropriate places in such loops. Unfortunately, in one of the cases it's not easy to do so.

Between 9.3 and 9.4 the page deletion (and page split) code changed significantly. Before that, it was significantly less robust against interruptions. Therefore don't backpatch to 9.3.

Author: Andres Freund
Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de
Backpatch: 9.4-

1 parent 7da22d8 commit 2adadf0
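
The reasoning in the commit message can be illustrated with a small toy model. This is only a sketch, not PostgreSQL code: `ToyPage`, `toy_lock`, `toy_unlock`, `toy_check_for_interrupts`, `toy_walk_siblings`, and the `max_steps` budget are all hypothetical names invented for this example. It assumes, as the comments in the patch state, that an interrupt check made while a lock is held has no effect, so the check only helps once the lock has been released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real definitions. */
#define NO_PAGE (-1)

typedef struct ToyPage
{
    int next;                       /* index of the right sibling, or NO_PAGE */
} ToyPage;

static int  holdoff_count = 0;      /* > 0 while a toy "lock" is held */
static bool cancel_pending = false; /* set when a cancel request arrives */

void toy_lock(void)   { holdoff_count++; }
void toy_unlock(void) { assert(holdoff_count > 0); holdoff_count--; }

/* Returns true only if a cancel is pending AND no lock is currently held. */
bool toy_check_for_interrupts(void)
{
    if (holdoff_count > 0 || !cancel_pending)
        return false;
    cancel_pending = false;
    return true;
}

/*
 * Follow sibling links starting at 'start'.
 *
 * Returns the number of pages visited, -1 if the walk was cancelled, or -2
 * if 'max_steps' was exhausted (a stand-in for "would loop forever").
 * 'cancel_after' simulates a cancel request arriving after that many steps.
 */
int toy_walk_siblings(const ToyPage *pages, int start, int cancel_after,
                      bool check_after_unlock, int max_steps)
{
    int visited = 0;
    int cur = start;

    assert(pages != NULL);
    cancel_pending = false;

    while (cur != NO_PAGE)
    {
        if (visited >= max_steps)
            return -2;

        toy_lock();
        int next = pages[cur].next;
        visited++;
        if (visited == cancel_after)
            cancel_pending = true;

        /* Checking here does nothing: the lock is still held. */
        (void) toy_check_for_interrupts();
        toy_unlock();

        /* Only a check made after the lock is released can take effect. */
        if (check_after_unlock && toy_check_for_interrupts())
            return -1;

        cur = next;
    }
    return visited;
}
```

With a corrupted chain whose last page points back to the first, the walk can be cancelled only when `check_after_unlock` is true; otherwise it runs until the artificial step budget is used up, which mirrors the uninterruptible loop the commit describes.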

1 file changed, +21 -0 lines changed

src/backend/access/nbtree/nbtpage.c

Lines changed: 21 additions & 0 deletions
@@ -1282,6 +1282,7 @@ _bt_pagedel(Relation rel, Buffer buf)
         rightsib_empty = false;
         while (P_ISHALFDEAD(opaque))
         {
+            /* will check for interrupts, once lock is released */
             if (!_bt_unlink_halfdead_page(rel, buf, &rightsib_empty))
             {
                 /* _bt_unlink_halfdead_page already released buffer */
@@ -1294,6 +1295,12 @@ _bt_pagedel(Relation rel, Buffer buf)
 
         _bt_relbuf(rel, buf);
 
+        /*
+         * Check here, as calling loops will have locks held, preventing
+         * interrupts from being processed.
+         */
+        CHECK_FOR_INTERRUPTS();
+
         /*
          * The page has now been deleted. If its right sibling is completely
          * empty, it's possible that the reason we haven't deleted it earlier
@@ -1545,6 +1552,12 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
 
     LockBuffer(leafbuf, BUFFER_LOCK_UNLOCK);
 
+    /*
+     * Check here, as calling loops will have locks held, preventing
+     * interrupts from being processed.
+     */
+    CHECK_FOR_INTERRUPTS();
+
     /*
      * If the leaf page still has a parent pointing to it (or a chain of
      * parents), we don't unlink the leaf page yet, but the topmost remaining
@@ -1603,6 +1616,14 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
             /* step right one page */
             leftsib = opaque->btpo_next;
             _bt_relbuf(rel, lbuf);
+
+            /*
+             * It'd be good to check for interrupts here, but it's not easy to
+             * do so because a lock is always held. This block isn't
+             * frequently reached, so hopefully the consequences of not
+             * checking interrupts aren't too bad.
+             */
+
             if (leftsib == P_NONE)
             {
                 elog(LOG, "no left sibling (concurrent deletion?) of block %u in \"%s\"",
