Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit dd4134e

Browse files
committed
Revisit handling of UNION ALL subqueries with non-Var output columns.
In commit 57664ed I tried to fix a bug reported by Teodor Sigaev by making non-simple-Var output columns distinct (by wrapping their expressions with dummy PlaceHolderVar nodes). This did not work too well. Commit b28ffd0 fixed some ensuing problems with matching to child indexes, but per a recent report from Claus Stadler, constraint exclusion of UNION ALL subqueries was still broken, because constant-simplification didn't handle the injected PlaceHolderVars well either. On reflection, the original patch was quite misguided: there is no reason to expect that EquivalenceClass child members will be distinct. So instead of trying to make them so, we should ensure that we can cope with the situation when they're not. Accordingly, this patch reverts the code changes in the above-mentioned commits (though the regression test cases they added stay). Instead, I've added assorted defenses to make sure that duplicate EC child members don't cause any problems. Teodor's original problem ("MergeAppend child's targetlist doesn't match MergeAppend") is addressed more directly by revising prepare_sort_from_pathkeys to let the parent MergeAppend's sort list guide creation of each child's sort list. In passing, get rid of add_sort_column; as far as I can tell, testing for duplicate sort keys at this stage is dead code. Certainly it doesn't trigger often enough to be worth expending cycles on in ordinary queries. And keeping the test would've greatly complicated the new logic in prepare_sort_from_pathkeys, because comparing pathkey list entries against a previous output array requires that we not skip any entries in the list. Back-patch to 9.1, like the previous patches. The only known issue in this area that wasn't caused by the ill-advised previous patches was the MergeAppend planning failure, which of course is not relevant before 9.1. It's possible that we need some of the new defenses against duplicate child EC entries in older branches, but until there's some clear evidence of that I'm going to refrain from back-patching further.
1 parent aef5fe7 commit dd4134e

File tree

16 files changed

+411
-258
lines changed

16 files changed

+411
-258
lines changed

src/backend/optimizer/README

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -496,6 +496,14 @@ it's possible that it belongs to more than one. We keep track of all the
496496
families to ensure that we can make use of an index belonging to any one of
497497
the families for mergejoin purposes.)
498498

499+
An EquivalenceClass can contain "em_is_child" members, which are copies
500+
of members that contain appendrel parent relation Vars, transposed to
501+
contain the equivalent child-relation variables or expressions. These
502+
members are *not* full-fledged members of the EquivalenceClass and do not
503+
affect the class's overall properties at all. They are kept only to
504+
simplify matching of child-relation expressions to EquivalenceClasses.
505+
Most operations on EquivalenceClasses should ignore child members.
506+
499507

500508
PathKeys
501509
--------

src/backend/optimizer/path/equivclass.c

Lines changed: 41 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -491,6 +491,15 @@ add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
491491
* sortref is the SortGroupRef of the originating SortGroupClause, if any,
492492
* or zero if not. (It should never be zero if the expression is volatile!)
493493
*
494+
* If rel is not NULL, it identifies a specific relation we're considering
495+
* a path for, and indicates that child EC members for that relation can be
496+
* considered. Otherwise child members are ignored. (Note: since child EC
497+
* members aren't guaranteed unique, a non-NULL value means that there could
498+
* be more than one EC that matches the expression; if so it's order-dependent
499+
* which one you get. This is annoying but it only happens in corner cases,
500+
* so for now we live with just reporting the first match. See also
501+
* generate_implied_equalities_for_indexcol and match_pathkeys_to_index.)
502+
*
494503
* If create_it is TRUE, we'll build a new EquivalenceClass when there is no
495504
* match. If create_it is FALSE, we just return NULL when no match.
496505
*
@@ -511,6 +520,7 @@ get_eclass_for_sort_expr(PlannerInfo *root,
511520
Oid opcintype,
512521
Oid collation,
513522
Index sortref,
523+
Relids rel,
514524
bool create_it)
515525
{
516526
EquivalenceClass *newec;
@@ -548,6 +558,13 @@ get_eclass_for_sort_expr(PlannerInfo *root,
548558
{
549559
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
550560

561+
/*
562+
* Ignore child members unless they match the request.
563+
*/
564+
if (cur_em->em_is_child &&
565+
!bms_equal(cur_em->em_relids, rel))
566+
continue;
567+
551568
/*
552569
* If below an outer join, don't match constants: they're not as
553570
* constant as they look.
@@ -1505,6 +1522,7 @@ reconsider_outer_join_clause(PlannerInfo *root, RestrictInfo *rinfo,
15051522
{
15061523
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
15071524

1525+
Assert(!cur_em->em_is_child); /* no children yet */
15081526
if (equal(outervar, cur_em->em_expr))
15091527
{
15101528
match = true;
@@ -1626,6 +1644,7 @@ reconsider_full_join_clause(PlannerInfo *root, RestrictInfo *rinfo)
16261644
foreach(lc2, cur_ec->ec_members)
16271645
{
16281646
coal_em = (EquivalenceMember *) lfirst(lc2);
1647+
Assert(!coal_em->em_is_child); /* no children yet */
16291648
if (IsA(coal_em->em_expr, CoalesceExpr))
16301649
{
16311650
CoalesceExpr *cexpr = (CoalesceExpr *) coal_em->em_expr;
@@ -1747,6 +1766,8 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
17471766
{
17481767
EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);
17491768

1769+
if (em->em_is_child)
1770+
continue; /* ignore children here */
17501771
if (equal(item1, em->em_expr))
17511772
item1member = true;
17521773
else if (equal(item2, em->em_expr))
@@ -1800,6 +1821,9 @@ add_child_rel_equivalences(PlannerInfo *root,
18001821
{
18011822
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
18021823

1824+
if (cur_em->em_is_child)
1825+
continue; /* ignore children here */
1826+
18031827
/* Does it reference (only) parent_rel? */
18041828
if (bms_equal(cur_em->em_relids, parent_rel->relids))
18051829
{
@@ -1908,7 +1932,16 @@ generate_implied_equalities_for_indexcol(PlannerInfo *root,
19081932
!bms_is_subset(rel->relids, cur_ec->ec_relids))
19091933
continue;
19101934

1911-
/* Scan members, looking for a match to the indexable column */
1935+
/*
1936+
* Scan members, looking for a match to the indexable column. Note
1937+
* that child EC members are considered, but only when they belong to
1938+
* the target relation. (Unlike regular members, the same expression
1939+
* could be a child member of more than one EC. Therefore, it's
1940+
* potentially order-dependent which EC a child relation's index
1941+
* column gets matched to. This is annoying but it only happens in
1942+
* corner cases, so for now we live with just reporting the first
1943+
* match. See also get_eclass_for_sort_expr.)
1944+
*/
19121945
cur_em = NULL;
19131946
foreach(lc2, cur_ec->ec_members)
19141947
{
@@ -1933,6 +1966,9 @@ generate_implied_equalities_for_indexcol(PlannerInfo *root,
19331966
Oid eq_op;
19341967
RestrictInfo *rinfo;
19351968

1969+
if (other_em->em_is_child)
1970+
continue; /* ignore children here */
1971+
19361972
/* Make sure it'll be a join to a different rel */
19371973
if (other_em == cur_em ||
19381974
bms_overlap(other_em->em_relids, rel->relids))
@@ -2187,8 +2223,10 @@ eclass_useful_for_merging(EquivalenceClass *eclass,
21872223
{
21882224
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);
21892225

2190-
if (!cur_em->em_is_child &&
2191-
!bms_overlap(cur_em->em_relids, rel->relids))
2226+
if (cur_em->em_is_child)
2227+
continue; /* ignore children here */
2228+
2229+
if (!bms_overlap(cur_em->em_relids, rel->relids))
21922230
return true;
21932231
}
21942232

src/backend/optimizer/path/indxpath.c

Lines changed: 8 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -2157,7 +2157,14 @@ match_pathkeys_to_index(IndexOptInfo *index, List *pathkeys,
21572157
if (pathkey->pk_eclass->ec_has_volatile)
21582158
return;
21592159

2160-
/* Try to match eclass member expression(s) to index */
2160+
/*
2161+
* Try to match eclass member expression(s) to index. Note that child
2162+
* EC members are considered, but only when they belong to the target
2163+
* relation. (Unlike regular members, the same expression could be a
2164+
* child member of more than one EC. Therefore, the same index could
2165+
* be considered to match more than one pathkey list, which is OK
2166+
* here. See also get_eclass_for_sort_expr.)
2167+
*/
21612168
foreach(lc2, pathkey->pk_eclass->ec_members)
21622169
{
21632170
EquivalenceMember *member = (EquivalenceMember *) lfirst(lc2);
@@ -2580,15 +2587,6 @@ match_index_to_operand(Node *operand,
25802587
{
25812588
int indkey;
25822589

2583-
/*
2584-
* Ignore any PlaceHolderVar nodes above the operand. This is needed so
2585-
* that we can successfully use expression-index constraints pushed down
2586-
* through appendrels (UNION ALL). It's safe because a PlaceHolderVar
2587-
* appearing in a relation-scan-level expression is certainly a no-op.
2588-
*/
2589-
while (operand && IsA(operand, PlaceHolderVar))
2590-
operand = (Node *) ((PlaceHolderVar *) operand)->phexpr;
2591-
25922590
/*
25932591
* Ignore any RelabelType node above the operand. This is needed to be
25942592
* able to apply indexscanning in binary-compatible-operator cases. Note:

src/backend/optimizer/path/pathkeys.c

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -221,6 +221,11 @@ canonicalize_pathkeys(PlannerInfo *root, List *pathkeys)
221221
* If the PathKey is being generated from a SortGroupClause, sortref should be
222222
* the SortGroupClause's SortGroupRef; otherwise zero.
223223
*
224+
* If rel is not NULL, it identifies a specific relation we're considering
225+
* a path for, and indicates that child EC members for that relation can be
226+
* considered. Otherwise child members are ignored. (See the comments for
227+
* get_eclass_for_sort_expr.)
228+
*
224229
* create_it is TRUE if we should create any missing EquivalenceClass
225230
* needed to represent the sort key. If it's FALSE, we return NULL if the
226231
* sort key isn't already present in any EquivalenceClass.
@@ -237,6 +242,7 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
237242
bool reverse_sort,
238243
bool nulls_first,
239244
Index sortref,
245+
Relids rel,
240246
bool create_it,
241247
bool canonicalize)
242248
{
@@ -268,7 +274,7 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
268274
/* Now find or (optionally) create a matching EquivalenceClass */
269275
eclass = get_eclass_for_sort_expr(root, expr, opfamilies,
270276
opcintype, collation,
271-
sortref, create_it);
277+
sortref, rel, create_it);
272278

273279
/* Fail if no EC and !create_it */
274280
if (!eclass)
@@ -320,6 +326,7 @@ make_pathkey_from_sortop(PlannerInfo *root,
320326
(strategy == BTGreaterStrategyNumber),
321327
nulls_first,
322328
sortref,
329+
NULL,
323330
create_it,
324331
canonicalize);
325332
}
@@ -546,6 +553,7 @@ build_index_pathkeys(PlannerInfo *root,
546553
reverse_sort,
547554
nulls_first,
548555
0,
556+
index->rel->relids,
549557
false,
550558
true);
551559

@@ -636,6 +644,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
636644
sub_member->em_datatype,
637645
sub_eclass->ec_collation,
638646
0,
647+
rel->relids,
639648
false);
640649

641650
/*
@@ -680,6 +689,9 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
680689
Oid sub_expr_coll = sub_eclass->ec_collation;
681690
ListCell *k;
682691

692+
if (sub_member->em_is_child)
693+
continue; /* ignore children here */
694+
683695
foreach(k, sub_tlist)
684696
{
685697
TargetEntry *tle = (TargetEntry *) lfirst(k);
@@ -719,6 +731,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
719731
sub_expr_type,
720732
sub_expr_coll,
721733
0,
734+
rel->relids,
722735
false);
723736

724737
/*
@@ -910,6 +923,7 @@ initialize_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
910923
lefttype,
911924
((OpExpr *) clause)->inputcollid,
912925
0,
926+
NULL,
913927
true);
914928
restrictinfo->right_ec =
915929
get_eclass_for_sort_expr(root,
@@ -918,6 +932,7 @@ initialize_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
918932
righttype,
919933
((OpExpr *) clause)->inputcollid,
920934
0,
935+
NULL,
921936
true);
922937
}
923938

0 commit comments

Comments
 (0)