Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 805f798

Browse files
committed
Revisit handling of UNION ALL subqueries with non-Var output columns.
In commit 57664ed I tried to fix a bug reported by Teodor Sigaev by making non-simple-Var output columns distinct (by wrapping their expressions with dummy PlaceHolderVar nodes). This did not work too well. Commit b28ffd0 fixed some ensuing problems with matching to child indexes, but per a recent report from Claus Stadler, constraint exclusion of UNION ALL subqueries was still broken, because constant-simplification didn't handle the injected PlaceHolderVars well either. On reflection, the original patch was quite misguided: there is no reason to expect that EquivalenceClass child members will be distinct. So instead of trying to make them so, we should ensure that we can cope with the situation when they're not. Accordingly, this patch reverts the code changes in the above-mentioned commits (though the regression test cases they added stay). Instead, I've added assorted defenses to make sure that duplicate EC child members don't cause any problems. Teodor's original problem ("MergeAppend child's targetlist doesn't match MergeAppend") is addressed more directly by revising prepare_sort_from_pathkeys to let the parent MergeAppend's sort list guide creation of each child's sort list. In passing, get rid of add_sort_column; as far as I can tell, testing for duplicate sort keys at this stage is dead code. Certainly it doesn't trigger often enough to be worth expending cycles on in ordinary queries. And keeping the test would've greatly complicated the new logic in prepare_sort_from_pathkeys, because comparing pathkey list entries against a previous output array requires that we not skip any entries in the list. Back-patch to 9.1, like the previous patches. The only known issue in this area that wasn't caused by the ill-advised previous patches was the MergeAppend planning failure, which of course is not relevant before 9.1. It's possible that we need some of the new defenses against duplicate child EC entries in older branches, but until there's some clear evidence of that I'm going to refrain from back-patching further.
1 parent 0cb4a0b commit 805f798

File tree

16 files changed

+410
-258
lines changed

16 files changed

+410
-258
lines changed

src/backend/optimizer/README

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -496,6 +496,14 @@ it's possible that it belongs to more than one. We keep track of all the
496496
families to ensure that we can make use of an index belonging to any one of
497497
the families for mergejoin purposes.)
498498

499+
An EquivalenceClass can contain "em_is_child" members, which are copies
500+
of members that contain appendrel parent relation Vars, transposed to
501+
contain the equivalent child-relation variables or expressions. These
502+
members are *not* full-fledged members of the EquivalenceClass and do not
503+
affect the class's overall properties at all. They are kept only to
504+
simplify matching of child-relation expressions to EquivalenceClasses.
505+
Most operations on EquivalenceClasses should ignore child members.
506+
499507

500508
PathKeys
501509
--------

src/backend/optimizer/path/equivclass.c

Lines changed: 40 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -491,6 +491,15 @@ add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
491491
* sortref is the SortGroupRef of the originating SortGroupClause, if any,
492492
* or zero if not. (It should never be zero if the expression is volatile!)
493493
*
494+
* If rel is not NULL, it identifies a specific relation we're considering
495+
* a path for, and indicates that child EC members for that relation can be
496+
* considered. Otherwise child members are ignored. (Note: since child EC
497+
* members aren't guaranteed unique, a non-NULL value means that there could
498+
* be more than one EC that matches the expression; if so it's order-dependent
499+
* which one you get. This is annoying but it only happens in corner cases,
500+
* so for now we live with just reporting the first match. See also
501+
* generate_implied_equalities_for_indexcol and match_pathkeys_to_index.)
502+
*
494503
* If create_it is TRUE, we'll build a new EquivalenceClass when there is no
495504
* match. If create_it is FALSE, we just return NULL when no match.
496505
*
@@ -511,6 +520,7 @@ get_eclass_for_sort_expr(PlannerInfo *root,
511520
Oid opcintype,
512521
Oid collation,
513522
Index sortref,
523+
Relids rel,
514524
bool create_it)
515525
{
516526
EquivalenceClass *newec;
@@ -548,6 +558,13 @@ get_eclass_for_sort_expr(PlannerInfo *root,
548558
{
549559
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
550560

561+
/*
562+
* Ignore child members unless they match the request.
563+
*/
564+
if (cur_em->em_is_child &&
565+
!bms_equal(cur_em->em_relids, rel))
566+
continue;
567+
551568
/*
552569
* If below an outer join, don't match constants: they're not as
553570
* constant as they look.
@@ -1505,6 +1522,7 @@ reconsider_outer_join_clause(PlannerInfo *root, RestrictInfo *rinfo,
15051522
{
15061523
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
15071524

1525+
Assert(!cur_em->em_is_child); /* no children yet */
15081526
if (equal(outervar, cur_em->em_expr))
15091527
{
15101528
match = true;
@@ -1626,6 +1644,7 @@ reconsider_full_join_clause(PlannerInfo *root, RestrictInfo *rinfo)
16261644
foreach(lc2, cur_ec->ec_members)
16271645
{
16281646
coal_em = (EquivalenceMember *) lfirst(lc2);
1647+
Assert(!coal_em->em_is_child); /* no children yet */
16291648
if (IsA(coal_em->em_expr, CoalesceExpr))
16301649
{
16311650
CoalesceExpr *cexpr = (CoalesceExpr *) coal_em->em_expr;
@@ -1747,6 +1766,8 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
17471766
{
17481767
EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);
17491768

1769+
if (em->em_is_child)
1770+
continue; /* ignore children here */
17501771
if (equal(item1, em->em_expr))
17511772
item1member = true;
17521773
else if (equal(item2, em->em_expr))
@@ -1800,6 +1821,9 @@ add_child_rel_equivalences(PlannerInfo *root,
18001821
{
18011822
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
18021823

1824+
if (cur_em->em_is_child)
1825+
continue; /* ignore children here */
1826+
18031827
/* Does it reference (only) parent_rel? */
18041828
if (bms_equal(cur_em->em_relids, parent_rel->relids))
18051829
{
@@ -1891,7 +1915,16 @@ find_eclass_clauses_for_index_join(PlannerInfo *root, RelOptInfo *rel,
18911915
if (!bms_overlap(outer_relids, cur_ec->ec_relids))
18921916
continue;
18931917

1894-
/* Scan members, looking for indexable columns */
1918+
/*
1919+
* Scan members, looking for indexable columns. Note
1920+
* that child EC members are considered, but only when they belong to
1921+
* the target relation. (Unlike regular members, the same expression
1922+
* could be a child member of more than one EC. Therefore, it's
1923+
* potentially order-dependent which EC a child relation's index
1924+
* column gets matched to. This is annoying but it only happens in
1925+
* corner cases, so for now we live with just reporting the first
1926+
* match. See also get_eclass_for_sort_expr.)
1927+
*/
18951928
foreach(lc2, cur_ec->ec_members)
18961929
{
18971930
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
@@ -1914,6 +1947,8 @@ find_eclass_clauses_for_index_join(PlannerInfo *root, RelOptInfo *rel,
19141947
EquivalenceMember *outer_em = (EquivalenceMember *) lfirst(lc3);
19151948
Oid eq_op;
19161949

1950+
if (outer_em->em_is_child)
1951+
continue; /* ignore children here */
19171952
if (!bms_is_subset(outer_em->em_relids, outer_relids))
19181953
continue;
19191954
eq_op = select_equality_operator(cur_ec,
@@ -2144,8 +2179,10 @@ eclass_useful_for_merging(EquivalenceClass *eclass,
21442179
{
21452180
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);
21462181

2147-
if (!cur_em->em_is_child &&
2148-
!bms_overlap(cur_em->em_relids, rel->relids))
2182+
if (cur_em->em_is_child)
2183+
continue; /* ignore children here */
2184+
2185+
if (!bms_overlap(cur_em->em_relids, rel->relids))
21492186
return true;
21502187
}
21512188

src/backend/optimizer/path/indxpath.c

Lines changed: 8 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1489,7 +1489,14 @@ match_index_to_pathkeys(IndexOptInfo *index, List *pathkeys)
14891489
if (pathkey->pk_eclass->ec_has_volatile)
14901490
return NIL;
14911491

1492-
/* Try to match eclass member expression(s) to index */
1492+
/*
1493+
* Try to match eclass member expression(s) to index. Note that child
1494+
* EC members are considered, but only when they belong to the target
1495+
* relation. (Unlike regular members, the same expression could be a
1496+
* child member of more than one EC. Therefore, the same index could
1497+
* be considered to match more than one pathkey list, which is OK
1498+
* here. See also get_eclass_for_sort_expr.)
1499+
*/
14931500
foreach(lc2, pathkey->pk_eclass->ec_members)
14941501
{
14951502
EquivalenceMember *member = (EquivalenceMember *) lfirst(lc2);
@@ -2285,15 +2292,6 @@ match_index_to_operand(Node *operand,
22852292
{
22862293
int indkey;
22872294

2288-
/*
2289-
* Ignore any PlaceHolderVar nodes above the operand. This is needed so
2290-
* that we can successfully use expression-index constraints pushed down
2291-
* through appendrels (UNION ALL). It's safe because a PlaceHolderVar
2292-
* appearing in a relation-scan-level expression is certainly a no-op.
2293-
*/
2294-
while (operand && IsA(operand, PlaceHolderVar))
2295-
operand = (Node *) ((PlaceHolderVar *) operand)->phexpr;
2296-
22972295
/*
22982296
* Ignore any RelabelType node above the operand. This is needed to be
22992297
* able to apply indexscanning in binary-compatible-operator cases. Note:

src/backend/optimizer/path/pathkeys.c

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -224,6 +224,11 @@ canonicalize_pathkeys(PlannerInfo *root, List *pathkeys)
224224
* If the PathKey is being generated from a SortGroupClause, sortref should be
225225
* the SortGroupClause's SortGroupRef; otherwise zero.
226226
*
227+
* If rel is not NULL, it identifies a specific relation we're considering
228+
* a path for, and indicates that child EC members for that relation can be
229+
* considered. Otherwise child members are ignored. (See the comments for
230+
* get_eclass_for_sort_expr.)
231+
*
227232
* create_it is TRUE if we should create any missing EquivalenceClass
228233
* needed to represent the sort key. If it's FALSE, we return NULL if the
229234
* sort key isn't already present in any EquivalenceClass.
@@ -240,6 +245,7 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
240245
bool reverse_sort,
241246
bool nulls_first,
242247
Index sortref,
248+
Relids rel,
243249
bool create_it,
244250
bool canonicalize)
245251
{
@@ -271,7 +277,7 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
271277
/* Now find or (optionally) create a matching EquivalenceClass */
272278
eclass = get_eclass_for_sort_expr(root, expr, opfamilies,
273279
opcintype, collation,
274-
sortref, create_it);
280+
sortref, rel, create_it);
275281

276282
/* Fail if no EC and !create_it */
277283
if (!eclass)
@@ -323,6 +329,7 @@ make_pathkey_from_sortop(PlannerInfo *root,
323329
(strategy == BTGreaterStrategyNumber),
324330
nulls_first,
325331
sortref,
332+
NULL,
326333
create_it,
327334
canonicalize);
328335
}
@@ -554,6 +561,7 @@ build_index_pathkeys(PlannerInfo *root,
554561
reverse_sort,
555562
nulls_first,
556563
0,
564+
index->rel->relids,
557565
false,
558566
true);
559567

@@ -677,6 +685,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
677685
sub_member->em_datatype,
678686
sub_eclass->ec_collation,
679687
0,
688+
rel->relids,
680689
false);
681690

682691
/*
@@ -721,6 +730,9 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
721730
Oid sub_expr_coll = sub_eclass->ec_collation;
722731
ListCell *k;
723732

733+
if (sub_member->em_is_child)
734+
continue; /* ignore children here */
735+
724736
foreach(k, sub_tlist)
725737
{
726738
TargetEntry *tle = (TargetEntry *) lfirst(k);
@@ -760,6 +772,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
760772
sub_expr_type,
761773
sub_expr_coll,
762774
0,
775+
rel->relids,
763776
false);
764777

765778
/*
@@ -951,6 +964,7 @@ initialize_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
951964
lefttype,
952965
((OpExpr *) clause)->inputcollid,
953966
0,
967+
NULL,
954968
true);
955969
restrictinfo->right_ec =
956970
get_eclass_for_sort_expr(root,
@@ -959,6 +973,7 @@ initialize_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
959973
righttype,
960974
((OpExpr *) clause)->inputcollid,
961975
0,
976+
NULL,
962977
true);
963978
}
964979

0 commit comments

Comments
 (0)