Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit a71cfc5

Browse files
committed
Fix bugs in RETURNING in cross-partition UPDATE cases.
If the source and destination partitions don't have identical rowtypes (for example, one has dropped columns the other lacks), then the planSlot contents will be different because of that. If the query has a RETURNING list that tries to return resjunk columns out of the planSlot, that is columns from tables that were joined to the target table, we'd get errors or wrong answers. That's because we used the RETURNING list generated for the destination partition, which expects a planSlot matching that partition's subplan. The most practical fix seems to be to convert the updated destination tuple back to the source partition's rowtype, and then apply the RETURNING list generated for the source partition. This avoids making fragile assumptions about whether the per-subpartition subplans generated all the resjunk columns in the same order. This has been broken since v11 introduced cross-partition UPDATE. The lack of field complaints shows that non-identical partitions aren't a common case; therefore, don't stress too hard about making the conversion efficient. There's no such bug in HEAD, because commit 86dc900 got rid of per-target-relation variance in the contents of the planSlot. Hence, patch v11-v13 only. Amit Langote and Etsuro Fujita, small changes by me Discussion: https://postgr.es/m/CA+HiwqE_UK1jTSNrjb8mpTdivzd3dum6mK--xqKq0Y9VmfwWQA@mail.gmail.com
1 parent 574a1b8 commit a71cfc5

File tree

3 files changed

+141
-12
lines changed

3 files changed

+141
-12
lines changed

src/backend/executor/nodeModifyTable.c

Lines changed: 76 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -149,34 +149,40 @@ ExecCheckPlanOutput(Relation resultRel, List *targetList)
149149
/*
150150
* ExecProcessReturning --- evaluate a RETURNING list
151151
*
152-
* resultRelInfo: current result rel
152+
* projectReturning: the projection to evaluate
153+
* resultRelOid: result relation's OID
153154
* tupleSlot: slot holding tuple actually inserted/updated/deleted
154155
* planSlot: slot holding tuple returned by top subplan node
155156
*
157+
* In cross-partition UPDATE cases, projectReturning and planSlot are as
158+
* for the source partition, and tupleSlot must conform to that. But
159+
* resultRelOid is for the destination partition.
160+
*
156161
* Note: If tupleSlot is NULL, the FDW should have already provided econtext's
157162
* scan tuple.
158163
*
159164
* Returns a slot holding the result tuple
160165
*/
161166
static TupleTableSlot *
162-
ExecProcessReturning(ResultRelInfo *resultRelInfo,
167+
ExecProcessReturning(ProjectionInfo *projectReturning,
168+
Oid resultRelOid,
163169
TupleTableSlot *tupleSlot,
164170
TupleTableSlot *planSlot)
165171
{
166-
ProjectionInfo *projectReturning = resultRelInfo->ri_projectReturning;
167172
ExprContext *econtext = projectReturning->pi_exprContext;
168173

169174
/* Make tuple and any needed join variables available to ExecProject */
170175
if (tupleSlot)
171176
econtext->ecxt_scantuple = tupleSlot;
177+
else
178+
Assert(econtext->ecxt_scantuple);
172179
econtext->ecxt_outertuple = planSlot;
173180

174181
/*
175-
* RETURNING expressions might reference the tableoid column, so
176-
* reinitialize tts_tableOid before evaluating them.
182+
* RETURNING expressions might reference the tableoid column, so be sure
183+
* we expose the desired OID, ie that of the real target relation.
177184
*/
178-
econtext->ecxt_scantuple->tts_tableOid =
179-
RelationGetRelid(resultRelInfo->ri_RelationDesc);
185+
econtext->ecxt_scantuple->tts_tableOid = resultRelOid;
180186

181187
/* Compute the RETURNING expressions */
182188
return ExecProject(projectReturning);
@@ -368,13 +374,25 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot, CmdType cmdtype
368374
* For INSERT, we have to insert the tuple into the target relation
369375
* and insert appropriate tuples into the index relations.
370376
*
377+
* slot contains the new tuple value to be stored.
378+
* planSlot is the output of the ModifyTable's subplan; we use it
379+
* to access "junk" columns that are not going to be stored.
380+
* In a cross-partition UPDATE, srcSlot is the slot that held the
381+
* updated tuple for the source relation; otherwise it's NULL.
382+
*
383+
* returningRelInfo is the resultRelInfo for the source relation of a
384+
* cross-partition UPDATE; otherwise it's the current result relation.
385+
* We use it to process RETURNING lists, for reasons explained below.
386+
*
371387
* Returns RETURNING result if any, otherwise NULL.
372388
* ----------------------------------------------------------------
373389
*/
374390
static TupleTableSlot *
375391
ExecInsert(ModifyTableState *mtstate,
376392
TupleTableSlot *slot,
377393
TupleTableSlot *planSlot,
394+
TupleTableSlot *srcSlot,
395+
ResultRelInfo *returningRelInfo,
378396
EState *estate,
379397
bool canSetTag)
380398
{
@@ -677,8 +695,45 @@ ExecInsert(ModifyTableState *mtstate,
677695
ExecWithCheckOptions(WCO_VIEW_CHECK, resultRelInfo, slot, estate);
678696

679697
/* Process RETURNING if present */
680-
if (resultRelInfo->ri_projectReturning)
681-
result = ExecProcessReturning(resultRelInfo, slot, planSlot);
698+
if (returningRelInfo->ri_projectReturning)
699+
{
700+
/*
701+
* In a cross-partition UPDATE with RETURNING, we have to use the
702+
* source partition's RETURNING list, because that matches the output
703+
* of the planSlot, while the destination partition might have
704+
* different resjunk columns. This means we have to map the
705+
* destination tuple back to the source's format so we can apply that
706+
* RETURNING list. This is expensive, but it should be an uncommon
707+
* corner case, so we won't spend much effort on making it fast.
708+
*
709+
* We assume that we can use srcSlot to hold the re-converted tuple.
710+
* Note that in the common case where the child partitions both match
711+
* the root's format, previous optimizations will have resulted in
712+
* slot and srcSlot being identical, cueing us that there's nothing to
713+
* do here.
714+
*/
715+
if (returningRelInfo != resultRelInfo && slot != srcSlot)
716+
{
717+
Relation srcRelationDesc = returningRelInfo->ri_RelationDesc;
718+
AttrMap *map;
719+
720+
map = build_attrmap_by_name_if_req(RelationGetDescr(resultRelationDesc),
721+
RelationGetDescr(srcRelationDesc));
722+
if (map)
723+
{
724+
TupleTableSlot *origSlot = slot;
725+
726+
slot = execute_attr_map_slot(map, slot, srcSlot);
727+
slot->tts_tid = origSlot->tts_tid;
728+
slot->tts_tableOid = origSlot->tts_tableOid;
729+
free_attrmap(map);
730+
}
731+
}
732+
733+
result = ExecProcessReturning(returningRelInfo->ri_projectReturning,
734+
RelationGetRelid(resultRelationDesc),
735+
slot, planSlot);
736+
}
682737

683738
return result;
684739
}
@@ -1027,7 +1082,9 @@ ldelete:;
10271082
}
10281083
}
10291084

1030-
rslot = ExecProcessReturning(resultRelInfo, slot, planSlot);
1085+
rslot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
1086+
RelationGetRelid(resultRelationDesc),
1087+
slot, planSlot);
10311088

10321089
/*
10331090
* Before releasing the target tuple again, make sure rslot has a
@@ -1203,6 +1260,7 @@ lreplace:;
12031260
{
12041261
bool tuple_deleted;
12051262
TupleTableSlot *ret_slot;
1263+
TupleTableSlot *orig_slot = slot;
12061264
TupleTableSlot *epqslot = NULL;
12071265
PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing;
12081266
int map_index;
@@ -1309,6 +1367,7 @@ lreplace:;
13091367
mtstate->rootResultRelInfo, slot);
13101368

13111369
ret_slot = ExecInsert(mtstate, slot, planSlot,
1370+
orig_slot, resultRelInfo,
13121371
estate, canSetTag);
13131372

13141373
/* Revert ExecPrepareTupleRouting's node change. */
@@ -1505,7 +1564,9 @@ lreplace:;
15051564

15061565
/* Process RETURNING if present */
15071566
if (resultRelInfo->ri_projectReturning)
1508-
return ExecProcessReturning(resultRelInfo, slot, planSlot);
1567+
return ExecProcessReturning(resultRelInfo->ri_projectReturning,
1568+
RelationGetRelid(resultRelationDesc),
1569+
slot, planSlot);
15091570

15101571
return NULL;
15111572
}
@@ -2154,7 +2215,9 @@ ExecModifyTable(PlanState *pstate)
21542215
* ExecProcessReturning by IterateDirectModify, so no need to
21552216
* provide it here.
21562217
*/
2157-
slot = ExecProcessReturning(resultRelInfo, NULL, planSlot);
2218+
slot = ExecProcessReturning(resultRelInfo->ri_projectReturning,
2219+
RelationGetRelid(resultRelInfo->ri_RelationDesc),
2220+
NULL, planSlot);
21582221

21592222
estate->es_result_relation_info = saved_resultRelInfo;
21602223
return slot;
@@ -2244,6 +2307,7 @@ ExecModifyTable(PlanState *pstate)
22442307
slot = ExecPrepareTupleRouting(node, estate, proute,
22452308
resultRelInfo, slot);
22462309
slot = ExecInsert(node, slot, planSlot,
2310+
NULL, estate->es_result_relation_info,
22472311
estate, node->canSetTag);
22482312
/* Revert ExecPrepareTupleRouting's state change. */
22492313
if (proute)

src/test/regress/expected/update.out

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -445,6 +445,46 @@ UPDATE range_parted set c = 95 WHERE a = 'b' and b > 10 and c > 100 returning (r
445445
part_c_1_100 | b | 17 | 95 | 19 |
446446
(6 rows)
447447

448+
-- The following tests computing RETURNING when the source and the destination
449+
-- partitions of a UPDATE row movement operation have different tuple
450+
-- descriptors, which has been shown to be problematic in the cases where the
451+
-- RETURNING targetlist contains non-target relation attributes that are
452+
-- computed by referring to the source partition plan's output tuple. Also,
453+
-- a trigger on the destination relation may change the tuple, which must be
454+
-- reflected in the RETURNING output, so we test that too.
455+
CREATE TABLE part_c_1_c_20 (LIKE range_parted);
456+
ALTER TABLE part_c_1_c_20 DROP a, DROP b, ADD a text, ADD b bigint;
457+
ALTER TABLE range_parted ATTACH PARTITION part_c_1_c_20 FOR VALUES FROM ('c', 1) TO ('c', 20);
458+
CREATE FUNCTION trigfunc () RETURNS TRIGGER LANGUAGE plpgsql as $$ BEGIN NEW.e := 'in trigfunc()'; RETURN NEW; END; $$;
459+
CREATE TRIGGER trig BEFORE INSERT ON part_c_1_c_20 FOR EACH ROW EXECUTE FUNCTION trigfunc();
460+
UPDATE range_parted r set a = 'c' FROM (VALUES ('a', 1), ('a', 10), ('b', 12)) s(x, y) WHERE s.x = r.a AND s.y = r.b RETURNING tableoid::regclass, *;
461+
tableoid | a | b | c | d | e | x | y
462+
---------------+---+----+-----+---+---------------+---+----
463+
part_c_1_c_20 | c | 1 | 1 | 1 | in trigfunc() | a | 1
464+
part_c_1_c_20 | c | 10 | 200 | 1 | in trigfunc() | a | 10
465+
part_c_1_c_20 | c | 12 | 96 | 1 | in trigfunc() | b | 12
466+
(3 rows)
467+
468+
DROP TRIGGER trig ON part_c_1_c_20;
469+
DROP FUNCTION trigfunc;
470+
:init_range_parted;
471+
CREATE FUNCTION trigfunc () RETURNS TRIGGER LANGUAGE plpgsql as $$ BEGIN RETURN NULL; END; $$;
472+
CREATE TRIGGER trig BEFORE INSERT ON part_c_1_c_20 FOR EACH ROW EXECUTE FUNCTION trigfunc();
473+
UPDATE range_parted r set a = 'c' FROM (VALUES ('a', 1), ('a', 10), ('b', 12)) s(x, y) WHERE s.x = r.a AND s.y = r.b RETURNING tableoid::regclass, *;
474+
tableoid | a | b | c | d | e | x | y
475+
----------+---+---+---+---+---+---+---
476+
(0 rows)
477+
478+
:show_data;
479+
partname | a | b | c | d | e
480+
--------------+---+----+-----+----+---
481+
part_c_1_100 | b | 13 | 97 | 2 |
482+
part_d_15_20 | b | 15 | 105 | 16 |
483+
part_d_15_20 | b | 17 | 105 | 19 |
484+
(3 rows)
485+
486+
DROP TABLE part_c_1_c_20;
487+
DROP FUNCTION trigfunc;
448488
-- Transition tables with update row movement
449489
:init_range_parted;
450490
CREATE FUNCTION trans_updatetrigfunc() RETURNS trigger LANGUAGE plpgsql AS

src/test/regress/sql/update.sql

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -236,6 +236,31 @@ DROP VIEW upview;
236236
UPDATE range_parted set c = 95 WHERE a = 'b' and b > 10 and c > 100 returning (range_parted), *;
237237
:show_data;
238238

239+
-- The following tests computing RETURNING when the source and the destination
240+
-- partitions of a UPDATE row movement operation have different tuple
241+
-- descriptors, which has been shown to be problematic in the cases where the
242+
-- RETURNING targetlist contains non-target relation attributes that are
243+
-- computed by referring to the source partition plan's output tuple. Also,
244+
-- a trigger on the destination relation may change the tuple, which must be
245+
-- reflected in the RETURNING output, so we test that too.
246+
CREATE TABLE part_c_1_c_20 (LIKE range_parted);
247+
ALTER TABLE part_c_1_c_20 DROP a, DROP b, ADD a text, ADD b bigint;
248+
ALTER TABLE range_parted ATTACH PARTITION part_c_1_c_20 FOR VALUES FROM ('c', 1) TO ('c', 20);
249+
CREATE FUNCTION trigfunc () RETURNS TRIGGER LANGUAGE plpgsql as $$ BEGIN NEW.e := 'in trigfunc()'; RETURN NEW; END; $$;
250+
CREATE TRIGGER trig BEFORE INSERT ON part_c_1_c_20 FOR EACH ROW EXECUTE FUNCTION trigfunc();
251+
UPDATE range_parted r set a = 'c' FROM (VALUES ('a', 1), ('a', 10), ('b', 12)) s(x, y) WHERE s.x = r.a AND s.y = r.b RETURNING tableoid::regclass, *;
252+
253+
DROP TRIGGER trig ON part_c_1_c_20;
254+
DROP FUNCTION trigfunc;
255+
256+
:init_range_parted;
257+
CREATE FUNCTION trigfunc () RETURNS TRIGGER LANGUAGE plpgsql as $$ BEGIN RETURN NULL; END; $$;
258+
CREATE TRIGGER trig BEFORE INSERT ON part_c_1_c_20 FOR EACH ROW EXECUTE FUNCTION trigfunc();
259+
UPDATE range_parted r set a = 'c' FROM (VALUES ('a', 1), ('a', 10), ('b', 12)) s(x, y) WHERE s.x = r.a AND s.y = r.b RETURNING tableoid::regclass, *;
260+
:show_data;
261+
262+
DROP TABLE part_c_1_c_20;
263+
DROP FUNCTION trigfunc;
239264

240265
-- Transition tables with update row movement
241266
:init_range_parted;

0 commit comments

Comments
 (0)