
Commit 44096c3

Prevent access to no-longer-pinned buffer in heapam_tuple_lock().
heap_fetch() used to have a "keep_buf" parameter that told it to return ownership of the buffer pin to the caller after finding that the requested tuple TID exists but is invisible to the specified snapshot. This was thoughtlessly removed in commit 5db6df0, which broke heapam_tuple_lock() (formerly EvalPlanQualFetch) because that function needs to do more accesses to the tuple even if it's invisible. The net effect is that we would continue to touch the page for a microsecond or two after releasing pin on the buffer. Usually no harm would result; but if a different session decided to defragment the page concurrently, we could see garbage data and mistakenly conclude that there's no newer tuple version to chain up to. (It's hard to say whether this has happened in the field. The bug was actually found thanks to a later change that allowed valgrind to detect accesses to non-pinned buffers.)

The most reasonable way to fix this is to reintroduce keep_buf, although I made it behave slightly differently: buffer ownership is passed back only if there is a valid tuple at the requested TID. In HEAD, we can just add the parameter back to heap_fetch(). To avoid an API break in the back branches, introduce an additional function heap_fetch_extended() in those branches.

In HEAD there is an additional, less obvious API change: tuple->t_data will be set to NULL in all cases where buffer ownership is not returned, in particular when the tuple exists but fails the time qual (and !keep_buf). This is to defend against any other callers attempting to access non-pinned buffers. We concluded that making that change in back branches would be more likely to introduce problems than cure any.

In passing, remove a comment about heap_fetch that was obsoleted by 9a8ee1d.

Per bug #17462 from Daniil Anisimov. Back-patch to v12 where the bug was introduced.

Discussion: https://postgr.es/m/17462-9c98a0f00df9bd36@postgresql.org
1 parent 9144fa2 commit 44096c3
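The keep_buf contract described in the commit message can be sketched with a small self-contained model. Everything here is invented for illustration (a single mock buffer, a global pin counter, and mock_fetch_extended standing in for heap_fetch_extended); it is not PostgreSQL's real buffer-manager API. The point it demonstrates: when a tuple exists but fails the snapshot check, the function returns false either way, but only with keep_buf=true does the caller end up owning the buffer pin.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the keep_buf contract; types and the single mock
 * buffer are hypothetical, not PostgreSQL's real buffer manager. */

#define InvalidBuffer 0

typedef int Buffer;             /* buffer ID; 0 means "no buffer" */

static int pin_count = 0;       /* pins currently held on the mock buffer */

static Buffer
MockReadBuffer(void)
{
    pin_count++;                /* reading a page acquires a pin */
    return 1;
}

static void
MockReleaseBuffer(Buffer buf)
{
    (void) buf;
    pin_count--;                /* releasing the buffer drops the pin */
}

/* Mimics the fixed behavior for a tuple that exists at the TID but
 * fails the snapshot test: the result is false either way, but with
 * keep_buf=true pin ownership is handed back to the caller. */
static bool
mock_fetch_extended(Buffer *userbuf, bool keep_buf)
{
    Buffer      buffer = MockReadBuffer();

    /* ... pretend the visibility check said "invisible" ... */
    if (keep_buf)
        *userbuf = buffer;      /* caller now owns the pin */
    else
    {
        MockReleaseBuffer(buffer);
        *userbuf = InvalidBuffer;
    }
    return false;
}
```

With keep_buf=false the caller must not touch the page afterwards, since no pin is held; with keep_buf=true (as heapam_tuple_lock() now requests) the caller may keep examining the tuple to chase its update chain, and is responsible for the eventual release of the pin.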

File tree

3 files changed: +43 −15 lines

src/backend/access/heap/heapam.c

Lines changed: 33 additions & 7 deletions

@@ -1366,10 +1366,13 @@ heap_getnextslot(TableScanDesc sscan, ScanDirection direction, TupleTableSlot *s
  * must unpin the buffer when done with the tuple.
  *
  * If the tuple is not found (ie, item number references a deleted slot),
- * then tuple->t_data is set to NULL and false is returned.
+ * then tuple->t_data is set to NULL, *userbuf is set to InvalidBuffer,
+ * and false is returned.
  *
  * If the tuple is found but fails the time qual check, then false is returned
- * but tuple->t_data is left pointing to the tuple.
+ * and *userbuf is set to InvalidBuffer, but tuple->t_data is left pointing
+ * to the tuple.  (Note that it is unsafe to dereference tuple->t_data in
+ * this case, but callers might choose to test it for NULL-ness.)
  *
  * heap_fetch does not follow HOT chains: only the exact TID requested will
  * be fetched.
@@ -1388,6 +1391,25 @@ heap_fetch(Relation relation,
 		   Snapshot snapshot,
 		   HeapTuple tuple,
 		   Buffer *userbuf)
+{
+	return heap_fetch_extended(relation, snapshot, tuple, userbuf, false);
+}
+
+/*
+ *	heap_fetch_extended - fetch tuple even if it fails snapshot test
+ *
+ * If keep_buf is true, then upon finding a tuple that is valid but fails
+ * the snapshot check, we return the tuple pointer in tuple->t_data and the
+ * buffer ID in *userbuf, keeping the buffer pin, just as if it had passed
+ * the snapshot.  (The function result is still "false" though.)
+ * If keep_buf is false then this behaves identically to heap_fetch().
+ */
+bool
+heap_fetch_extended(Relation relation,
+					Snapshot snapshot,
+					HeapTuple tuple,
+					Buffer *userbuf,
+					bool keep_buf)
 {
 	ItemPointer tid = &(tuple->t_self);
 	ItemId		lp;
@@ -1470,9 +1492,14 @@ heap_fetch(Relation relation,
 		return true;
 	}

-	/* Tuple failed time qual */
-	ReleaseBuffer(buffer);
-	*userbuf = InvalidBuffer;
+	/* Tuple failed time qual, but maybe caller wants to see it anyway. */
+	if (keep_buf)
+		*userbuf = buffer;
+	else
+	{
+		ReleaseBuffer(buffer);
+		*userbuf = InvalidBuffer;
+	}

 	return false;
 }
@@ -1495,8 +1522,7 @@ heap_fetch(Relation relation,
  * are vacuumable, false if not.
  *
  * Unlike heap_fetch, the caller must already have pin and (at least) share
- * lock on the buffer; it is still pinned/locked at exit.  Also unlike
- * heap_fetch, we do not report any pgstats count; caller may do so if wanted.
+ * lock on the buffer; it is still pinned/locked at exit.
  */
 bool
 heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,

src/backend/access/heap/heapam_handler.c

Lines changed: 7 additions & 8 deletions

@@ -399,7 +399,8 @@ heapam_tuple_lock(Relation relation, ItemPointer tid, Snapshot snapshot,
 					 errmsg("tuple to be locked was already moved to another partition due to concurrent update")));

 		tuple->t_self = *tid;
-		if (heap_fetch(relation, &SnapshotDirty, tuple, &buffer))
+		if (heap_fetch_extended(relation, &SnapshotDirty, tuple,
+								&buffer, true))
 		{
 			/*
 			 * If xmin isn't what we're expecting, the slot must have
@@ -498,6 +499,7 @@ heapam_tuple_lock(Relation relation, ItemPointer tid, Snapshot snapshot,
 			 */
 			if (tuple->t_data == NULL)
 			{
+				Assert(!BufferIsValid(buffer));
 				return TM_Deleted;
 			}
@@ -507,8 +509,7 @@ heapam_tuple_lock(Relation relation, ItemPointer tid, Snapshot snapshot,
 			if (!TransactionIdEquals(HeapTupleHeaderGetXmin(tuple->t_data),
 									 priorXmax))
 			{
-				if (BufferIsValid(buffer))
-					ReleaseBuffer(buffer);
+				ReleaseBuffer(buffer);
 				return TM_Deleted;
 			}
@@ -524,22 +525,20 @@ heapam_tuple_lock(Relation relation, ItemPointer tid, Snapshot snapshot,
 			 *
 			 * As above, it should be safe to examine xmax and t_ctid
 			 * without the buffer content lock, because they can't be
-			 * changing.
+			 * changing.  We'd better hold a buffer pin though.
 			 */
 			if (ItemPointerEquals(&tuple->t_self, &tuple->t_data->t_ctid))
 			{
 				/* deleted, so forget about it */
-				if (BufferIsValid(buffer))
-					ReleaseBuffer(buffer);
+				ReleaseBuffer(buffer);
 				return TM_Deleted;
 			}

 			/* updated, so look at the updated row */
 			*tid = tuple->t_data->t_ctid;
 			/* updated row should have xmin matching this xmax */
 			priorXmax = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			if (BufferIsValid(buffer))
-				ReleaseBuffer(buffer);
+			ReleaseBuffer(buffer);
 			/* loop back to fetch next in chain */
 		}
 	}

src/include/access/heapam.h

Lines changed: 3 additions & 0 deletions

@@ -124,6 +124,9 @@ extern bool heap_getnextslot(TableScanDesc sscan,

 extern bool heap_fetch(Relation relation, Snapshot snapshot,
 					   HeapTuple tuple, Buffer *userbuf);
+extern bool heap_fetch_extended(Relation relation, Snapshot snapshot,
+								HeapTuple tuple, Buffer *userbuf,
+								bool keep_buf);
 extern bool heap_hot_search_buffer(ItemPointer tid, Relation relation,
 								   Buffer buffer, Snapshot snapshot, HeapTuple heapTuple,
 								   bool *all_dead, bool first_call);
