Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit aa210e0

Browse files
Fix nbtree backward scan race condition comments.
Remove comments that supposed that holding a pin was a useful interlock for _bt_walk_left(). There are times when _bt_walk_left() doesn't hold either a lock or a pin on any page, so clearly this can't be true. _bt_walk_left() is even prepared to deal with concurrent deletion of both the original page and any pages to its left. Oversight in commit 2ed5b87.
1 parent dc3f9bc commit aa210e0

File tree

1 file changed

+8
-19
lines changed

1 file changed

+8
-19
lines changed

src/backend/access/nbtree/nbtsearch.c

+8-19
Original file line numberDiff line numberDiff line change
@@ -2036,8 +2036,8 @@ _bt_steppage(IndexScanDesc scan, ScanDirection dir)
20362036
* _bt_readnextpage() -- Read next page containing valid data for scan
20372037
*
20382038
* On success exit, so->currPos is updated to contain data from the next
2039-
* interesting page. Caller is responsible to release lock and pin on
2040-
* buffer on success. We return true to indicate success.
2039+
* interesting page, and we return true. Caller must release the lock (and
2040+
* maybe the pin) on the buffer on success exit.
20412041
*
20422042
* If there are no more matching records in the given direction, we drop all
20432043
* locks and pins, set so->currPos.buf to InvalidBuffer, and return false.
@@ -2127,18 +2127,9 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir)
21272127
*
21282128
* It might be possible to rearrange this code to have less overhead
21292129
* in pinning and locking, but that would require capturing the left
2130-
* pointer when the page is initially read, and using it here, along
2131-
* with big changes to _bt_walk_left() and the code below. It is not
2132-
* clear whether this would be a win, since if the page immediately to
2133-
* the left splits after we read this page and before we step left, we
2134-
* would need to visit more pages than with the current code.
2135-
*
2136-
* Note that if we change the code so that we drop the pin for a scan
2137-
* which uses a non-MVCC snapshot, we will need to modify the code for
2138-
* walking left, to allow for the possibility that a referenced page
2139-
* has been deleted. As long as the buffer is pinned or the snapshot
2140-
* is MVCC the page cannot move past the half-dead state to fully
2141-
* deleted.
2130+
* sibling block number when the page is initially read, and then
2131+
* optimistically starting there (rather than pinning the page twice).
2132+
* It is not clear that this would be worth the complexity.
21422133
*/
21432134
if (BTScanPosIsPinned(so->currPos))
21442135
_bt_lockbuf(rel, so->currPos.buf, BT_READ);
@@ -2243,9 +2234,8 @@ _bt_parallel_readpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir)
22432234
* Returns InvalidBuffer if there is no page to the left (no lock is held
22442235
* in that case).
22452236
*
2246-
* When working on a non-leaf level, it is possible for the returned page
2247-
* to be half-dead; the caller should check that condition and step left
2248-
* again if it's important.
2237+
* It is possible for the returned leaf page to be half-dead; caller must
2238+
* check that condition and step left again when required.
22492239
*/
22502240
static Buffer
22512241
_bt_walk_left(Relation rel, Buffer buf)
@@ -2288,8 +2278,7 @@ _bt_walk_left(Relation rel, Buffer buf)
22882278
* anymore, not that its left sibling got split more than four times.
22892279
*
22902280
* Note that it is correct to test P_ISDELETED not P_IGNORE here,
2291-
* because half-dead pages are still in the sibling chain. Caller
2292-
* must reject half-dead pages if wanted.
2281+
* because half-dead pages are still in the sibling chain.
22932282
*/
22942283
tries = 0;
22952284
for (;;)

0 commit comments

Comments
 (0)