Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 9ff6867

Browse files
committed
Fix crash in brininsertcleanup during logical replication.
Logical replication crashes if the subscriber's partitioned table has a BRIN index. There are two independently blamable causes, and this patch fixes both: 1. brininsertcleanup fails if called twice for the same IndexInfo, because it half-destroys its BrinInsertState but leaves it still linked from ii_AmCache. brininsert would also fail in that state, so it's pretty hard to see any advantage to this coding. Fully remove the BrinInsertState, instead, so that a new brininsert call would create a new cache. 2. A logical replication subscriber sometimes does ExecOpenIndices twice on the same ResultRelInfo, followed by doing ExecCloseIndices twice; the second call reaches the brininsertcleanup bug. Quite aside from tickling unexpected cases in aminsertcleanup methods, this seems very wasteful, because the IndexInfos built in the first ExecOpenIndices call are just lost during the second call, and have to be rebuilt at possibly-nontrivial cost. We should establish a coding rule that you don't do that. The problematic coding is that when the target table is partitioned, apply_handle_tuple_routing calls ExecFindPartition which does ExecOpenIndices (and expects that ExecCleanupTupleRouting will close the indexes again). Using the ResultRelInfo made by ExecFindPartition, it calls apply_handle_delete_internal or apply_handle_insert_internal, both of which think they need to do ExecOpenIndices/ExecCloseIndices for themselves. They do in the main non-partitioned code paths, but not here. The simplest fix is to pull their ExecOpenIndices/ExecCloseIndices calls out and put them in the call sites for the non-partitioned cases. (We could have refactored apply_handle_update_internal similarly, but I did not do so today because there's no bug there: the partitioned code path doesn't call it.) Also, remove the always-duplicative open/close calls within apply_handle_tuple_routing itself. Since brininsertcleanup and indeed the whole aminsertcleanup mechanism are new in v17, there's no observable bug in older branches. A case could be made for trying to avoid these duplicative open/close calls in the older branches, but for now it seems not worth the trouble and risk of new bugs. Bug: #18815 Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com> Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org Backpatch-through: 17
1 parent a1b4f28 commit 9ff6867

File tree

3 files changed

+35
-17
lines changed

3 files changed

+35
-17
lines changed

src/backend/access/brin/brin.c

+5-3
Original file line numberDiff line numberDiff line change
@@ -511,16 +511,18 @@ brininsertcleanup(Relation index, IndexInfo *indexInfo)
511511
BrinInsertState *bistate = (BrinInsertState *) indexInfo->ii_AmCache;
512512

513513
/* bail out if cache not initialized */
514-
if (indexInfo->ii_AmCache == NULL)
514+
if (bistate == NULL)
515515
return;
516516

517+
/* do this first to avoid dangling pointer if we fail partway through */
518+
indexInfo->ii_AmCache = NULL;
519+
517520
/*
518521
* Clean up the revmap. Note that the brinDesc has already been cleaned up
519522
* as part of its own memory context.
520523
*/
521524
brinRevmapTerminate(bistate->bis_rmAccess);
522-
bistate->bis_rmAccess = NULL;
523-
bistate->bis_desc = NULL;
525+
pfree(bistate);
524526
}
525527

526528
/*

src/backend/replication/logical/worker.c

+26-14
Original file line numberDiff line numberDiff line change
@@ -2454,8 +2454,13 @@ apply_handle_insert(StringInfo s)
24542454
apply_handle_tuple_routing(edata,
24552455
remoteslot, NULL, CMD_INSERT);
24562456
else
2457-
apply_handle_insert_internal(edata, edata->targetRelInfo,
2458-
remoteslot);
2457+
{
2458+
ResultRelInfo *relinfo = edata->targetRelInfo;
2459+
2460+
ExecOpenIndices(relinfo, true);
2461+
apply_handle_insert_internal(edata, relinfo, remoteslot);
2462+
ExecCloseIndices(relinfo);
2463+
}
24592464

24602465
finish_edata(edata);
24612466

@@ -2482,16 +2487,18 @@ apply_handle_insert_internal(ApplyExecutionData *edata,
24822487
{
24832488
EState *estate = edata->estate;
24842489

2485-
/* We must open indexes here. */
2486-
ExecOpenIndices(relinfo, true);
2490+
/* Caller should have opened indexes already. */
2491+
Assert(relinfo->ri_IndexRelationDescs != NULL ||
2492+
!relinfo->ri_RelationDesc->rd_rel->relhasindex ||
2493+
RelationGetIndexList(relinfo->ri_RelationDesc) == NIL);
2494+
2495+
/* Caller will not have done this bit. */
2496+
Assert(relinfo->ri_onConflictArbiterIndexes == NIL);
24872497
InitConflictIndexes(relinfo);
24882498

24892499
/* Do the insert. */
24902500
TargetPrivilegesCheck(relinfo->ri_RelationDesc, ACL_INSERT);
24912501
ExecSimpleRelationInsert(relinfo, estate, remoteslot);
2492-
2493-
/* Cleanup. */
2494-
ExecCloseIndices(relinfo);
24952502
}
24962503

24972504
/*
@@ -2816,8 +2823,14 @@ apply_handle_delete(StringInfo s)
28162823
apply_handle_tuple_routing(edata,
28172824
remoteslot, NULL, CMD_DELETE);
28182825
else
2819-
apply_handle_delete_internal(edata, edata->targetRelInfo,
2826+
{
2827+
ResultRelInfo *relinfo = edata->targetRelInfo;
2828+
2829+
ExecOpenIndices(relinfo, false);
2830+
apply_handle_delete_internal(edata, relinfo,
28202831
remoteslot, rel->localindexoid);
2832+
ExecCloseIndices(relinfo);
2833+
}
28212834

28222835
finish_edata(edata);
28232836

@@ -2851,7 +2864,11 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
28512864
bool found;
28522865

28532866
EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
2854-
ExecOpenIndices(relinfo, false);
2867+
2868+
/* Caller should have opened indexes already. */
2869+
Assert(relinfo->ri_IndexRelationDescs != NULL ||
2870+
!localrel->rd_rel->relhasindex ||
2871+
RelationGetIndexList(localrel) == NIL);
28552872

28562873
found = FindReplTupleInLocalRel(edata, localrel, remoterel, localindexoid,
28572874
remoteslot, &localslot);
@@ -2892,7 +2909,6 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
28922909
}
28932910

28942911
/* Cleanup. */
2895-
ExecCloseIndices(relinfo);
28962912
EvalPlanQualEnd(&epqstate);
28972913
}
28982914

@@ -3131,7 +3147,6 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
31313147
* work already done above to find the local tuple in the
31323148
* partition.
31333149
*/
3134-
ExecOpenIndices(partrelinfo, true);
31353150
InitConflictIndexes(partrelinfo);
31363151

31373152
EvalPlanQualSetSlot(&epqstate, remoteslot_part);
@@ -3181,8 +3196,6 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
31813196
get_namespace_name(RelationGetNamespace(partrel_new)),
31823197
RelationGetRelationName(partrel_new));
31833198

3184-
ExecOpenIndices(partrelinfo, false);
3185-
31863199
/* DELETE old tuple found in the old partition. */
31873200
EvalPlanQualSetSlot(&epqstate, localslot);
31883201
TargetPrivilegesCheck(partrelinfo->ri_RelationDesc, ACL_DELETE);
@@ -3217,7 +3230,6 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
32173230
remoteslot_part);
32183231
}
32193232

3220-
ExecCloseIndices(partrelinfo);
32213233
EvalPlanQualEnd(&epqstate);
32223234
}
32233235
break;

src/test/subscription/t/013_partition.pl

+4
Original file line numberDiff line numberDiff line change
@@ -49,6 +49,10 @@
4949
$node_subscriber1->safe_psql('postgres',
5050
"CREATE TABLE tab1 (c text, a int PRIMARY KEY, b text) PARTITION BY LIST (a)"
5151
);
52+
# make a BRIN index to test aminsertcleanup logic in subscriber
53+
$node_subscriber1->safe_psql('postgres',
54+
"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)"
55+
);
5256
$node_subscriber1->safe_psql('postgres',
5357
"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)"
5458
);

0 commit comments

Comments
 (0)