Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit fb2d645

Browse files
committed
Revise make_partition_pruneinfo to not use its partitioned_rels input.
It turns out that the calculation of [Merge]AppendPath.partitioned_rels in allpaths.c is faulty and sometimes omits relevant non-leaf partitions, allowing an assertion added by commit a929e17 to trigger. Rather than fix that, it seems better to get rid of those fields altogether. We don't really need the info until create_plan time, and calculating it once for the selected plan should be cheaper than calculating it for each append path we consider. As a first step, teach make_partition_pruneinfo to collect the relevant partitioned tables for itself. It's not hard to do so by traversing from child tables up to parents using the AppendRelInfo links. While here, make some minor stylistic improvements; mainly, don't use the "Relids" alias for bitmapsets that are not identities of any relation considered by the planner. Try to document the logic better, too. No backpatch, as there does not seem to be a live problem before a929e17. Also no new regression test; the code where the bug was will be gone at the end of this patch series, so it seems a bit pointless to memorialize the issue. Tom Lane and David Rowley, per reports from Andreas Seltenreich and Jaime Casanova. Discussion: https://postgr.es/m/87sg8tqhsl.fsf@aurora.ydns.eu Discussion: https://postgr.es/m/CAJKUy5gCXDSmFs2c=R+VGgn7FiYcLCsEFEuDNNLGfoha=pBE_g@mail.gmail.com
1 parent 3696a60 commit fb2d645

File tree

1 file changed

+149
-50
lines changed

1 file changed

+149
-50
lines changed

src/backend/partitioning/partprune.c

+149-50
Original file line numberDiff line numberDiff line change
@@ -138,10 +138,12 @@ typedef struct PruneStepResult
138138
} PruneStepResult;
139139

140140

141+
static List *add_part_relids(List *allpartrelids, Bitmapset *partrelids);
141142
static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
142143
RelOptInfo *parentrel,
144+
List *prunequal,
145+
Bitmapset *partrelids,
143146
int *relid_subplan_map,
144-
Relids partrelids, List *prunequal,
145147
Bitmapset **matchedsubplans);
146148
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
147149
PartClauseTarget target,
@@ -213,67 +215,105 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
213215
*
214216
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
215217
* of scan paths for its child rels.
216-
*
217-
* 'partitioned_rels' is a List containing Lists of relids of partitioned
218-
* tables (a/k/a non-leaf partitions) that are parents of some of the child
219-
* rels. Here we attempt to populate the PartitionPruneInfo by adding a
220-
* 'prune_infos' item for each sublist in the 'partitioned_rels' list.
221-
* However, some of the sets of partitioned relations may not require any
222-
* run-time pruning. In these cases we'll simply not include a 'prune_infos'
223-
* item for that set and instead we'll add all the subplans which belong to
224-
* that set into the PartitionPruneInfo's 'other_subplans' field. Callers
225-
* will likely never want to prune subplans which are mentioned in this field.
226-
*
227-
* 'prunequal' is a list of potential pruning quals.
218+
* 'prunequal' is a list of potential pruning quals (i.e., restriction
219+
* clauses that are applicable to the appendrel).
228220
*/
229221
PartitionPruneInfo *
230222
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
231-
List *subpaths, List *partitioned_rels,
223+
List *subpaths, List *unused_partitioned_rels,
232224
List *prunequal)
233225
{
234226
PartitionPruneInfo *pruneinfo;
235227
Bitmapset *allmatchedsubplans = NULL;
228+
List *allpartrelids;
229+
List *prunerelinfos;
236230
int *relid_subplan_map;
237231
ListCell *lc;
238-
List *prunerelinfos;
239232
int i;
240233

241234
/*
242-
* Construct a temporary array to map from planner relids to subplan
243-
* indexes. For convenience, we use 1-based indexes here, so that zero
244-
* can represent an un-filled array entry.
235+
* Scan the subpaths to see which ones are scans of partition child
236+
* relations, and identify their parent partitioned rels. (Note: we must
237+
* restrict the parent partitioned rels to be parentrel or children of
238+
* parentrel, otherwise we couldn't translate prunequal to match.)
239+
*
240+
* Also construct a temporary array to map from partition-child-relation
241+
* relid to the index in 'subpaths' of the scan plan for that partition.
242+
* (Use of "subplan" rather than "subpath" is a bit of a misnomer, but
243+
* we'll let it stand.) For convenience, we use 1-based indexes here, so
244+
* that zero can represent an un-filled array entry.
245245
*/
246+
allpartrelids = NIL;
246247
relid_subplan_map = palloc0(sizeof(int) * root->simple_rel_array_size);
247248

248-
/*
249-
* relid_subplan_map maps relid of a leaf partition to the index in
250-
* 'subpaths' of the scan plan for that partition.
251-
*/
252249
i = 1;
253250
foreach(lc, subpaths)
254251
{
255252
Path *path = (Path *) lfirst(lc);
256253
RelOptInfo *pathrel = path->parent;
257254

258-
Assert(IS_SIMPLE_REL(pathrel));
259-
Assert(pathrel->relid < root->simple_rel_array_size);
260-
/* No duplicates please */
261-
Assert(relid_subplan_map[pathrel->relid] == 0);
255+
/* We don't consider partitioned joins here */
256+
if (pathrel->reloptkind == RELOPT_OTHER_MEMBER_REL)
257+
{
258+
RelOptInfo *prel = pathrel;
259+
Bitmapset *partrelids = NULL;
262260

263-
relid_subplan_map[pathrel->relid] = i++;
261+
/*
262+
* Traverse up to the pathrel's topmost partitioned parent,
263+
* collecting parent relids as we go; but stop if we reach
264+
* parentrel. (Normally, a pathrel's topmost partitioned parent
265+
* is either parentrel or a UNION ALL appendrel child of
266+
* parentrel. But when handling partitionwise joins of
267+
* multi-level partitioning trees, we can see an append path whose
268+
* parentrel is an intermediate partitioned table.)
269+
*/
270+
do
271+
{
272+
AppendRelInfo *appinfo;
273+
274+
Assert(prel->relid < root->simple_rel_array_size);
275+
appinfo = root->append_rel_array[prel->relid];
276+
prel = find_base_rel(root, appinfo->parent_relid);
277+
if (!IS_PARTITIONED_REL(prel))
278+
break; /* reached a non-partitioned parent */
279+
/* accept this level as an interesting parent */
280+
partrelids = bms_add_member(partrelids, prel->relid);
281+
if (prel == parentrel)
282+
break; /* don't traverse above parentrel */
283+
} while (prel->reloptkind == RELOPT_OTHER_MEMBER_REL);
284+
285+
if (partrelids)
286+
{
287+
/*
288+
* Found some relevant parent partitions, which may or may not
289+
* overlap with partition trees we already found. Add new
290+
* information to the allpartrelids list.
291+
*/
292+
allpartrelids = add_part_relids(allpartrelids, partrelids);
293+
/* Also record the subplan in relid_subplan_map[] */
294+
/* No duplicates please */
295+
Assert(relid_subplan_map[pathrel->relid] == 0);
296+
relid_subplan_map[pathrel->relid] = i;
297+
}
298+
}
299+
i++;
264300
}
265301

266-
/* We now build a PartitionedRelPruneInfo for each partitioned rel. */
302+
/*
303+
* We now build a PartitionedRelPruneInfo for each topmost partitioned rel
304+
* (omitting any that turn out not to have useful pruning quals).
305+
*/
267306
prunerelinfos = NIL;
268-
foreach(lc, partitioned_rels)
307+
foreach(lc, allpartrelids)
269308
{
270-
Relids partrelids = (Relids) lfirst(lc);
309+
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
271310
List *pinfolist;
272311
Bitmapset *matchedsubplans = NULL;
273312

274313
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
314+
prunequal,
315+
partrelids,
275316
relid_subplan_map,
276-
partrelids, prunequal,
277317
&matchedsubplans);
278318

279319
/* When pruning is possible, record the matched subplans */
@@ -299,7 +339,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
299339
pruneinfo->prune_infos = prunerelinfos;
300340

301341
/*
302-
* Some subplans may not belong to any of the listed partitioned rels.
342+
* Some subplans may not belong to any of the identified partitioned rels.
303343
* This can happen for UNION ALL queries which include a non-partitioned
304344
* table, or when some of the hierarchies aren't run-time prunable. Build
305345
* a bitmapset of the indexes of all such subplans, so that the executor
@@ -321,28 +361,86 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
321361
return pruneinfo;
322362
}
323363

364+
/*
365+
* add_part_relids
366+
* Add new info to a list of Bitmapsets of partitioned relids.
367+
*
368+
* Within 'allpartrelids', there is one Bitmapset for each topmost parent
369+
* partitioned rel. Each Bitmapset contains the RT indexes of the topmost
370+
* parent as well as its relevant non-leaf child partitions. Since (by
371+
* construction of the rangetable list) parent partitions must have lower
372+
* RT indexes than their children, we can distinguish the topmost parent
373+
* as being the lowest set bit in the Bitmapset.
374+
*
375+
* 'partrelids' contains the RT indexes of a parent partitioned rel, and
376+
* possibly some non-leaf children, that are newly identified as parents of
377+
* some subpath rel passed to make_partition_pruneinfo(). These are added
378+
* to an appropriate member of 'allpartrelids'.
379+
*
380+
* Note that the list contains only RT indexes of partitioned tables that
381+
* are parents of some scan-level relation appearing in the 'subpaths' that
382+
* make_partition_pruneinfo() is dealing with. Also, "topmost" parents are
383+
* not allowed to be higher than the 'parentrel' associated with the append
384+
* path. In this way, we avoid expending cycles on partitioned rels that
385+
* can't contribute useful pruning information for the problem at hand.
386+
* (It is possible for 'parentrel' to be a child partitioned table, and it
387+
* is also possible for scan-level relations to be child partitioned tables
388+
* rather than leaf partitions. Hence we must construct this relation set
389+
* with reference to the particular append path we're dealing with, rather
390+
* than looking at the full partitioning structure represented in the
391+
* RelOptInfos.)
392+
*/
393+
static List *
394+
add_part_relids(List *allpartrelids, Bitmapset *partrelids)
395+
{
396+
Index targetpart;
397+
ListCell *lc;
398+
399+
/* We can easily get the lowest set bit this way: */
400+
targetpart = bms_next_member(partrelids, -1);
401+
Assert(targetpart > 0);
402+
403+
/* Look for a matching topmost parent */
404+
foreach(lc, allpartrelids)
405+
{
406+
Bitmapset *currpartrelids = (Bitmapset *) lfirst(lc);
407+
Index currtarget = bms_next_member(currpartrelids, -1);
408+
409+
if (targetpart == currtarget)
410+
{
411+
/* Found a match, so add any new RT indexes to this hierarchy */
412+
currpartrelids = bms_add_members(currpartrelids, partrelids);
413+
lfirst(lc) = currpartrelids;
414+
return allpartrelids;
415+
}
416+
}
417+
/* No match, so add the new partition hierarchy to the list */
418+
return lappend(allpartrelids, partrelids);
419+
}
420+
324421
/*
325422
* make_partitionedrel_pruneinfo
326-
* Build a List of PartitionedRelPruneInfos, one for each partitioned
327-
* rel. These can be used in the executor to allow additional partition
328-
* pruning to take place.
329-
*
330-
* Here we generate partition pruning steps for 'prunequal' and also build a
331-
* data structure which allows mapping of partition indexes into 'subpaths'
332-
* indexes.
333-
*
334-
* If no non-Const expressions are being compared to the partition key in any
335-
* of the 'partitioned_rels', then we return NIL to indicate no run-time
336-
* pruning should be performed. Run-time pruning would be useless since the
337-
* pruning done during planning will have pruned everything that can be.
338-
*
339-
* On non-NIL return, 'matchedsubplans' is set to the subplan indexes which
340-
* were matched to this partition hierarchy.
423+
* Build a List of PartitionedRelPruneInfos, one for each interesting
424+
* partitioned rel in a partitioning hierarchy. These can be used in the
425+
* executor to allow additional partition pruning to take place.
426+
*
427+
* parentrel: rel associated with the appendpath being considered
428+
* prunequal: potential pruning quals, represented for parentrel
429+
* partrelids: Set of RT indexes identifying relevant partitioned tables
430+
* within a single partitioning hierarchy
431+
* relid_subplan_map[]: maps child relation relids to subplan indexes
432+
* matchedsubplans: on success, receives the set of subplan indexes which
433+
* were matched to this partition hierarchy
434+
*
435+
* If we cannot find any useful run-time pruning steps, return NIL.
436+
* However, on success, each rel identified in partrelids will have
437+
* an element in the result list, even if some of them are useless.
341438
*/
342439
static List *
343440
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
441+
List *prunequal,
442+
Bitmapset *partrelids,
344443
int *relid_subplan_map,
345-
Relids partrelids, List *prunequal,
346444
Bitmapset **matchedsubplans)
347445
{
348446
RelOptInfo *targetpart = NULL;
@@ -2421,11 +2519,12 @@ get_steps_using_prefix_recurse(GeneratePruningStepsContext *context,
24212519
*/
24222520
Assert(list_length(step_exprs) == cur_keyno ||
24232521
!bms_is_empty(step_nullkeys));
2522+
24242523
/*
24252524
* Note also that for hash partitioning, each partition key should
24262525
* have either equality clauses or an IS NULL clause, so if a
2427-
* partition key doesn't have an expression, it would be specified
2428-
* in step_nullkeys.
2526+
* partition key doesn't have an expression, it would be specified in
2527+
* step_nullkeys.
24292528
*/
24302529
Assert(context->rel->part_scheme->strategy
24312530
!= PARTITION_STRATEGY_HASH ||

0 commit comments

Comments
 (0)