Commit eb7a6b9

Fix query-cancel handling in spgdoinsert().
Knowing that a buggy opclass could cause an infinite insertion loop, spgdoinsert() intended to allow its loop to be interrupted by query cancel. However, that never actually worked, because in iterations after the first, we'd be holding buffer lock(s) which would cause InterruptHoldoffCount to be positive, preventing servicing of the interrupt.

To fix, check if an interrupt is pending, and if so fall out of the insertion loop and service the interrupt after we've released the buffers. If it was indeed a query cancel, that's the end of the matter. If it was a non-canceling interrupt reason, make use of the existing provision to retry the whole insertion. (This isn't as wasteful as it might seem, since any upper-level index tuples we already created should be usable in the next attempt.)

While there's no known instance of such a bug in existing release branches, it still seems like a good idea to back-patch this to all supported branches, since the behavior is fairly nasty if a loop does happen --- not only is it uncancelable, but it will quickly consume memory to the point of an OOM failure. In any case, this code is certainly not working as intended.

Per report from Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
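
The control flow described in the commit message can be sketched in isolation. The following is a minimal standalone model, not PostgreSQL code: `interrupt_pending`, `cancel_requested`, `holdoff_count`, `service_interrupts()`, `sketch_insert()` and the `InsertStatus` enum are hypothetical stand-ins for the backend's interrupt state and macros, and a cancel is reported through a return value here, whereas the real backend raises an error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for backend state; not PostgreSQL's real variables. */
static bool interrupt_pending = false;  /* some interrupt has arrived */
static bool cancel_requested = false;   /* that interrupt is a query cancel */
static int  holdoff_count = 0;          /* > 0 while a buffer lock is held */

typedef enum { INSERT_DONE, INSERT_RETRY, INSERT_CANCELED } InsertStatus;

static void lock_buffer(void)   { holdoff_count++; }
static void unlock_buffer(void) { assert(holdoff_count > 0); holdoff_count--; }

/* Models servicing interrupts: a no-op while any lock is held. */
static void
service_interrupts(void)
{
    if (!interrupt_pending || holdoff_count > 0)
        return;
    interrupt_pending = false;
    cancel_requested = false;
}

/*
 * steps_needed: loop iterations the insertion needs (huge for a broken opclass).
 * interrupt_at: iteration at which an interrupt arrives, or -1 for never.
 * is_cancel:    whether that interrupt is a query cancel.
 */
static InsertStatus
sketch_insert(int steps_needed, int interrupt_at, bool is_cancel)
{
    bool result = true;
    bool locked = false;
    bool was_cancel;

    for (int step = 0;; step++)
    {
        if (step == interrupt_at)
        {
            interrupt_pending = true;
            cancel_requested = is_cancel;
        }

        /* Only test the flag here; servicing would be a no-op under the lock. */
        if (interrupt_pending)
        {
            result = false;
            break;
        }

        if (!locked)
        {
            lock_buffer();
            locked = true;
        }

        if (step + 1 >= steps_needed)
            break;              /* insertion finished normally */
    }

    if (locked)
        unlock_buffer();

    /* With all locks released, the interrupt can finally be serviced. */
    assert(holdoff_count == 0);
    was_cancel = cancel_requested;
    service_interrupts();

    if (was_cancel)
        return INSERT_CANCELED;
    return result ? INSERT_DONE : INSERT_RETRY;
}
```

In this model, `sketch_insert(1000000, 2, true)` leaves the loop on the third iteration and reports a cancel, while the same call with `is_cancel` set to `false` reports that the caller should retry.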
1 parent e47f93f commit eb7a6b9

1 file changed, +45 -7 lines changed

src/backend/access/spgist/spgdoinsert.c (+45 -7)
@@ -1905,13 +1905,14 @@ spgSplitNodeAction(Relation index, SpGistState *state,
  * Insert one item into the index.
  *
  * Returns true on success, false if we failed to complete the insertion
- * because of conflict with a concurrent insert. In the latter case,
- * caller should re-call spgdoinsert() with the same args.
+ * (typically because of conflict with a concurrent insert). In the latter
+ * case, caller should re-call spgdoinsert() with the same args.
  */
 bool
 spgdoinsert(Relation index, SpGistState *state,
             ItemPointer heapPtr, Datum *datums, bool *isnulls)
 {
+    bool result = true;
     TupleDesc leafDescriptor = state->leafTupDesc;
     bool isnull = isnulls[spgKeyColumn];
     int level = 0;
@@ -2012,16 +2013,33 @@ spgdoinsert(Relation index, SpGistState *state,
     parent.offnum = InvalidOffsetNumber;
     parent.node = -1;
 
+    /*
+     * Before entering the loop, try to clear any pending interrupt condition.
+     * If a query cancel is pending, we might as well accept it now not later;
+     * while if a non-canceling condition is pending, servicing it here avoids
+     * having to restart the insertion and redo all the work so far.
+     */
+    CHECK_FOR_INTERRUPTS();
+
     for (;;)
     {
         bool isNew = false;
 
         /*
          * Bail out if query cancel is pending. We must have this somewhere
          * in the loop since a broken opclass could produce an infinite
-         * picksplit loop.
+         * picksplit loop. However, because we'll be holding buffer lock(s)
+         * after the first iteration, ProcessInterrupts() wouldn't be able to
+         * throw a cancel error here. Hence, if we see that an interrupt is
+         * pending, break out of the loop and deal with the situation below.
+         * Set result = false because we must restart the insertion if the
+         * interrupt isn't a query-cancel-or-die case.
          */
-        CHECK_FOR_INTERRUPTS();
+        if (INTERRUPTS_PENDING_CONDITION())
+        {
+            result = false;
+            break;
+        }
 
         if (current.blkno == InvalidBlockNumber)
         {
@@ -2140,10 +2158,14 @@ spgdoinsert(Relation index, SpGistState *state,
              * spgAddNode and spgSplitTuple cases will loop back to here to
              * complete the insertion operation. Just in case the choose
              * function is broken and produces add or split requests
-             * repeatedly, check for query cancel.
+             * repeatedly, check for query cancel (see comments above).
              */
     process_inner_tuple:
-            CHECK_FOR_INTERRUPTS();
+            if (INTERRUPTS_PENDING_CONDITION())
+            {
+                result = false;
+                break;
+            }
 
             innerTuple = (SpGistInnerTuple) PageGetItem(current.page,
                                                         PageGetItemId(current.page, current.offnum));
@@ -2267,5 +2289,21 @@ spgdoinsert(Relation index, SpGistState *state,
         UnlockReleaseBuffer(parent.buffer);
     }
 
-    return true;
+    /*
+     * We do not support being called while some outer function is holding a
+     * buffer lock (or any other reason to postpone query cancels). If that
+     * were the case, telling the caller to retry would create an infinite
+     * loop.
+     */
+    Assert(INTERRUPTS_CAN_BE_PROCESSED());
+
+    /*
+     * Finally, check for interrupts again. If there was a query cancel,
+     * ProcessInterrupts() will be able to throw the error here. If it was
+     * some other kind of interrupt that can just be cleared, return false to
+     * tell our caller to retry.
+     */
+    CHECK_FOR_INTERRUPTS();
+
+    return result;
 }
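
Because spgdoinsert() may now return false for reasons other than a concurrent-insert conflict, callers rely on the retry contract stated in its header comment. The sketch below illustrates that contract with a standalone model. It is an assumption-laden illustration, not the actual caller: `SketchInsertState`, `sketch_try_insert()`, `sketch_reset_scratch()` and `sketch_insert_with_retry()` are hypothetical names, and the "transient failure" counter stands in for a conflict or a non-canceling interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-insert bookkeeping; not a PostgreSQL structure. */
typedef struct
{
    int transient_failures_left;    /* attempts that will report "retry" */
    int attempts;                   /* calls made so far */
    int resets;                     /* scratch-state resets between attempts */
} SketchInsertState;

/* Stand-in for one insertion attempt: false means "call me again". */
static bool
sketch_try_insert(SketchInsertState *st)
{
    assert(st != NULL);
    st->attempts++;
    if (st->transient_failures_left > 0)
    {
        st->transient_failures_left--;
        return false;
    }
    return true;
}

/* Stand-in for discarding per-attempt scratch state before retrying. */
static void
sketch_reset_scratch(SketchInsertState *st)
{
    st->resets++;
}

/* Retry with the same arguments until the attempt succeeds. */
static int
sketch_insert_with_retry(SketchInsertState *st)
{
    while (!sketch_try_insert(st))
        sketch_reset_scratch(st);
    return st->attempts;
}
```

The loop terminates only if each failed attempt leaves the cause of the failure cleared, which is why the patched function asserts that interrupts can be processed before it returns.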
