Commit 2f29fd4

Handle new HOT chains in index-build table scans
When a table is scanned by heapam_index_build_range_scan (née
IndexBuildHeapScan) and the table lock being held allows concurrent data
changes, it is possible for new HOT chains to sprout in a page that were
unknown when the scan of a page happened. This leads to an error such
as
  ERROR: failed to find parent tuple for heap-only tuple at (X,Y) in table "tbl"
because the root tuple was not present when we first obtained the list
of the page's root tuples. This can be fixed by re-obtaining the list
of root tuples, if we see that a heap-only tuple appears to point to a
non-existing root.

This was reported by Anastasia as occurring for BRIN summarization
(which exists since 9.5), but I think it could theoretically also happen
with CREATE INDEX CONCURRENTLY (much older) or REINDEX CONCURRENTLY
(very recent). It seems a happy coincidence that BRIN forces us to
backpatch this all the way to 9.5.

Reported-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Diagnosed-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Co-authored-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/602d8487-f0b2-5486-0088-0f372b2549fa@postgrespro.ru
Backpatch: 9.5 - master

1 parent 8782ea2 commit 2f29fd4

2 files changed: +23 -2 lines changed

src/backend/access/heap/heapam_handler.c

Lines changed: 20 additions & 0 deletions
@@ -1322,6 +1322,12 @@ heapam_index_build_range_scan(Relation heapRelation,
          * buffer continuously while visiting the page, so no pruning
          * operation can occur either.
          *
+         * In cases with only ShareUpdateExclusiveLock on the table, it's
+         * possible for some HOT tuples to appear that we didn't know about
+         * when we first read the page. To handle that case, we re-obtain the
+         * list of root offsets when a HOT tuple points to a root item that we
+         * don't know about.
+         *
          * Also, although our opinions about tuple liveness could change while
          * we scan the page (due to concurrent transaction commits/aborts),
          * the chain root locations won't, so this info doesn't need to be
@@ -1623,6 +1629,20 @@ heapam_index_build_range_scan(Relation heapRelation,

                 offnum = ItemPointerGetOffsetNumber(&heapTuple->t_self);

+                /*
+                 * If a HOT tuple points to a root that we don't know
+                 * about, obtain root items afresh. If that still fails,
+                 * report it as corruption.
+                 */
+                if (root_offsets[offnum - 1] == InvalidOffsetNumber)
+                {
+                    Page page = BufferGetPage(hscan->rs_cbuf);
+
+                    LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
+                    heap_get_root_tuples(page, root_offsets);
+                    LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_UNLOCK);
+                }
+
                 if (!OffsetNumberIsValid(root_offsets[offnum - 1]))
                     ereport(ERROR,
                             (errcode(ERRCODE_DATA_CORRUPTED),

src/backend/access/heap/pruneheap.c

Lines changed: 3 additions & 2 deletions
@@ -732,7 +732,7 @@ heap_page_prune_execute(Buffer buffer,
  * root_offsets[k - 1] = j.
  *
  * The passed-in root_offsets array must have MaxHeapTuplesPerPage entries.
- * We zero out all unused entries.
+ * Unused entries are filled with InvalidOffsetNumber (zero).
  *
  * The function must be called with at least share lock on the buffer, to
  * prevent concurrent prune operations.
@@ -747,7 +747,8 @@ heap_get_root_tuples(Page page, OffsetNumber *root_offsets)
     OffsetNumber offnum,
                 maxoff;

-    MemSet(root_offsets, 0, MaxHeapTuplesPerPage * sizeof(OffsetNumber));
+    MemSet(root_offsets, InvalidOffsetNumber,
+           MaxHeapTuplesPerPage * sizeof(OffsetNumber));

     maxoff = PageGetMaxOffsetNumber(page);
     for (offnum = FirstOffsetNumber; offnum <= maxoff; offnum = OffsetNumberNext(offnum))
