Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit f3b3b8d

Browse files
committed
Compute correct em_nullable_relids in get_eclass_for_sort_expr().
Bug #8591 from Claudio Freire demonstrates that get_eclass_for_sort_expr must be able to compute valid em_nullable_relids for any new equivalence class members it creates. I'd worried about this in the commit message for db9f0e1, but claimed that it wasn't a problem because multi-member ECs should already exist when it runs. That is transparently wrong, though, because this function is also called by initialize_mergeclause_eclasses, which runs during deconstruct_jointree. The example given in the bug report (which the new regression test item is based upon) fails because the COALESCE() expression is first seen by initialize_mergeclause_eclasses rather than process_equivalence. Fixing this requires passing the appropriate nullable_relids set to get_eclass_for_sort_expr, and it requires new code to compute that set for top-level expressions such as ORDER BY, GROUP BY, etc. We store the top-level nullable_relids in a new field in PlannerInfo to avoid computing it many times. In the back branches, I've added the new field at the end of the struct to minimize ABI breakage for planner plugins. There doesn't seem to be a good alternative to changing get_eclass_for_sort_expr's API signature, though. There probably aren't any third-party extensions calling that function directly; moreover, if there are, they probably need to think about what to pass for nullable_relids anyway. Back-patch to 9.2, like the previous patch in this area.
1 parent c7b849a commit f3b3b8d

File tree

8 files changed

+119
-7
lines changed

8 files changed

+119
-7
lines changed

src/backend/nodes/outfuncs.c

+1
Original file line numberDiff line numberDiff line change
@@ -1698,6 +1698,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
16981698
WRITE_UINT_FIELD(query_level);
16991699
WRITE_NODE_FIELD(plan_params);
17001700
WRITE_BITMAPSET_FIELD(all_baserels);
1701+
WRITE_BITMAPSET_FIELD(nullable_baserels);
17011702
WRITE_NODE_FIELD(join_rel_list);
17021703
WRITE_INT_FIELD(join_cur_level);
17031704
WRITE_NODE_FIELD(init_plans);

src/backend/optimizer/path/equivclass.c

+17-2
Original file line numberDiff line numberDiff line change
@@ -510,6 +510,13 @@ add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
510510
* equivalence class it is a member of; if none, optionally build a new
511511
* single-member EquivalenceClass for it.
512512
*
513+
* expr is the expression, and nullable_relids is the set of base relids
514+
* that are potentially nullable below it. We actually only care about
515+
* the set of such relids that are used in the expression; but for caller
516+
* convenience, we perform that intersection step here. The caller need
517+
* only be sure that nullable_relids doesn't omit any nullable rels that
518+
* might appear in the expr.
519+
*
513520
* sortref is the SortGroupRef of the originating SortGroupClause, if any,
514521
* or zero if not. (It should never be zero if the expression is volatile!)
515522
*
@@ -538,13 +545,15 @@ add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
538545
EquivalenceClass *
539546
get_eclass_for_sort_expr(PlannerInfo *root,
540547
Expr *expr,
548+
Relids nullable_relids,
541549
List *opfamilies,
542550
Oid opcintype,
543551
Oid collation,
544552
Index sortref,
545553
Relids rel,
546554
bool create_it)
547555
{
556+
Relids expr_relids;
548557
EquivalenceClass *newec;
549558
EquivalenceMember *newem;
550559
ListCell *lc1;
@@ -555,6 +564,12 @@ get_eclass_for_sort_expr(PlannerInfo *root,
555564
*/
556565
expr = canonicalize_ec_expression(expr, opcintype, collation);
557566

567+
/*
568+
* Get the precise set of nullable relids appearing in the expression.
569+
*/
570+
expr_relids = pull_varnos((Node *) expr);
571+
nullable_relids = bms_intersect(nullable_relids, expr_relids);
572+
558573
/*
559574
* Scan through the existing EquivalenceClasses for a match
560575
*/
@@ -629,8 +644,8 @@ get_eclass_for_sort_expr(PlannerInfo *root,
629644
if (newec->ec_has_volatile && sortref == 0) /* should not happen */
630645
elog(ERROR, "volatile EquivalenceClass has no sortref");
631646

632-
newem = add_eq_member(newec, copyObject(expr), pull_varnos((Node *) expr),
633-
NULL, false, opcintype);
647+
newem = add_eq_member(newec, copyObject(expr), expr_relids,
648+
nullable_relids, false, opcintype);
634649

635650
/*
636651
* add_eq_member doesn't check for volatile functions, set-returning

src/backend/optimizer/path/pathkeys.c

+28-4
Original file line numberDiff line numberDiff line change
@@ -154,6 +154,9 @@ pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys)
154154
* Given an expression and sort-order information, create a PathKey.
155155
* The result is always a "canonical" PathKey, but it might be redundant.
156156
*
157+
* expr is the expression, and nullable_relids is the set of base relids
158+
* that are potentially nullable below it.
159+
*
157160
* If the PathKey is being generated from a SortGroupClause, sortref should be
158161
* the SortGroupClause's SortGroupRef; otherwise zero.
159162
*
@@ -169,6 +172,7 @@ pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys)
169172
static PathKey *
170173
make_pathkey_from_sortinfo(PlannerInfo *root,
171174
Expr *expr,
175+
Relids nullable_relids,
172176
Oid opfamily,
173177
Oid opcintype,
174178
Oid collation,
@@ -204,8 +208,8 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
204208
equality_op);
205209

206210
/* Now find or (optionally) create a matching EquivalenceClass */
207-
eclass = get_eclass_for_sort_expr(root, expr, opfamilies,
208-
opcintype, collation,
211+
eclass = get_eclass_for_sort_expr(root, expr, nullable_relids,
212+
opfamilies, opcintype, collation,
209213
sortref, rel, create_it);
210214

211215
/* Fail if no EC and !create_it */
@@ -227,6 +231,7 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
227231
static PathKey *
228232
make_pathkey_from_sortop(PlannerInfo *root,
229233
Expr *expr,
234+
Relids nullable_relids,
230235
Oid ordering_op,
231236
bool nulls_first,
232237
Index sortref,
@@ -248,6 +253,7 @@ make_pathkey_from_sortop(PlannerInfo *root,
248253

249254
return make_pathkey_from_sortinfo(root,
250255
expr,
256+
nullable_relids,
251257
opfamily,
252258
opcintype,
253259
collation,
@@ -461,9 +467,13 @@ build_index_pathkeys(PlannerInfo *root,
461467
nulls_first = index->nulls_first[i];
462468
}
463469

464-
/* OK, try to make a canonical pathkey for this sort key */
470+
/*
471+
* OK, try to make a canonical pathkey for this sort key. Note we're
472+
* underneath any outer joins, so nullable_relids should be NULL.
473+
*/
465474
cpathkey = make_pathkey_from_sortinfo(root,
466475
indexkey,
476+
NULL,
467477
index->sortopfamily[i],
468478
index->opcintype[i],
469479
index->indexcollations[i],
@@ -551,11 +561,14 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
551561
* expression is *not* volatile in the outer query: it's just
552562
* a Var referencing whatever the subquery emitted. (IOW, the
553563
* outer query isn't going to re-execute the volatile
554-
* expression itself.) So this is okay.
564+
* expression itself.) So this is okay. Likewise, it's
565+
* correct to pass nullable_relids = NULL, because we're
566+
* underneath any outer joins appearing in the outer query.
555567
*/
556568
outer_ec =
557569
get_eclass_for_sort_expr(root,
558570
outer_expr,
571+
NULL,
559572
sub_eclass->ec_opfamilies,
560573
sub_member->em_datatype,
561574
sub_eclass->ec_collation,
@@ -643,6 +656,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
643656
/* See if we have a matching EC for that */
644657
outer_ec = get_eclass_for_sort_expr(root,
645658
outer_expr,
659+
NULL,
646660
sub_eclass->ec_opfamilies,
647661
sub_expr_type,
648662
sub_expr_coll,
@@ -748,6 +762,13 @@ build_join_pathkeys(PlannerInfo *root,
748762
* The resulting PathKeys are always in canonical form. (Actually, there
749763
* is no longer any code anywhere that creates non-canonical PathKeys.)
750764
*
765+
* We assume that root->nullable_baserels is the set of base relids that could
766+
* have gone to NULL below the SortGroupClause expressions. This is okay if
767+
* the expressions came from the query's top level (ORDER BY, DISTINCT, etc)
768+
* and if this function is only invoked after deconstruct_jointree. In the
769+
* future we might have to make callers pass in the appropriate
770+
* nullable-relids set, but for now it seems unnecessary.
771+
*
751772
* 'sortclauses' is a list of SortGroupClause nodes
752773
* 'tlist' is the targetlist to find the referenced tlist entries in
753774
*/
@@ -769,6 +790,7 @@ make_pathkeys_for_sortclauses(PlannerInfo *root,
769790
Assert(OidIsValid(sortcl->sortop));
770791
pathkey = make_pathkey_from_sortop(root,
771792
sortkey,
793+
root->nullable_baserels,
772794
sortcl->sortop,
773795
sortcl->nulls_first,
774796
sortcl->tleSortGroupRef,
@@ -824,6 +846,7 @@ initialize_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
824846
restrictinfo->left_ec =
825847
get_eclass_for_sort_expr(root,
826848
(Expr *) get_leftop(clause),
849+
restrictinfo->nullable_relids,
827850
restrictinfo->mergeopfamilies,
828851
lefttype,
829852
((OpExpr *) clause)->inputcollid,
@@ -833,6 +856,7 @@ initialize_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
833856
restrictinfo->right_ec =
834857
get_eclass_for_sort_expr(root,
835858
(Expr *) get_rightop(clause),
859+
restrictinfo->nullable_relids,
836860
restrictinfo->mergeopfamilies,
837861
righttype,
838862
((OpExpr *) clause)->inputcollid,

src/backend/optimizer/plan/initsplan.c

+20
Original file line numberDiff line numberDiff line change
@@ -649,6 +649,9 @@ deconstruct_jointree(PlannerInfo *root)
649649
Assert(root->parse->jointree != NULL &&
650650
IsA(root->parse->jointree, FromExpr));
651651

652+
/* this is filled as we scan the jointree */
653+
root->nullable_baserels = NULL;
654+
652655
result = deconstruct_recurse(root, (Node *) root->parse->jointree, false,
653656
&qualscope, &inner_join_rels,
654657
&postponed_qual_list);
@@ -790,6 +793,7 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
790793
left_inners,
791794
right_inners,
792795
nonnullable_rels,
796+
nullable_rels,
793797
ojscope;
794798
List *leftjoinlist,
795799
*rightjoinlist;
@@ -823,6 +827,8 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
823827
*inner_join_rels = *qualscope;
824828
/* Inner join adds no restrictions for quals */
825829
nonnullable_rels = NULL;
830+
/* and it doesn't force anything to null, either */
831+
nullable_rels = NULL;
826832
break;
827833
case JOIN_LEFT:
828834
case JOIN_ANTI:
@@ -837,6 +843,7 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
837843
*qualscope = bms_union(leftids, rightids);
838844
*inner_join_rels = bms_union(left_inners, right_inners);
839845
nonnullable_rels = leftids;
846+
nullable_rels = rightids;
840847
break;
841848
case JOIN_SEMI:
842849
leftjoinlist = deconstruct_recurse(root, j->larg,
@@ -851,6 +858,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
851858
*inner_join_rels = bms_union(left_inners, right_inners);
852859
/* Semi join adds no restrictions for quals */
853860
nonnullable_rels = NULL;
861+
862+
/*
863+
* Theoretically, a semijoin would null the RHS; but since the
864+
* RHS can't be accessed above the join, this is immaterial
865+
* and we needn't account for it.
866+
*/
867+
nullable_rels = NULL;
854868
break;
855869
case JOIN_FULL:
856870
leftjoinlist = deconstruct_recurse(root, j->larg,
@@ -865,16 +879,22 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
865879
*inner_join_rels = bms_union(left_inners, right_inners);
866880
/* each side is both outer and inner */
867881
nonnullable_rels = *qualscope;
882+
nullable_rels = *qualscope;
868883
break;
869884
default:
870885
/* JOIN_RIGHT was eliminated during reduce_outer_joins() */
871886
elog(ERROR, "unrecognized join type: %d",
872887
(int) j->jointype);
873888
nonnullable_rels = NULL; /* keep compiler quiet */
889+
nullable_rels = NULL;
874890
leftjoinlist = rightjoinlist = NIL;
875891
break;
876892
}
877893

894+
/* Report all rels that will be nulled anywhere in the jointree */
895+
root->nullable_baserels = bms_add_members(root->nullable_baserels,
896+
nullable_rels);
897+
878898
/*
879899
* For an OJ, form the SpecialJoinInfo now, because we need the OJ's
880900
* semantic scope (ojscope) to pass to distribute_qual_to_rels. But

src/include/nodes/relation.h

+10-1
Original file line numberDiff line numberDiff line change
@@ -150,10 +150,19 @@ typedef struct PlannerInfo
150150
/*
151151
* all_baserels is a Relids set of all base relids (but not "other"
152152
* relids) in the query; that is, the Relids identifier of the final join
153-
* we need to form.
153+
* we need to form. This is computed in make_one_rel, just before we
154+
* start making Paths.
154155
*/
155156
Relids all_baserels;
156157

158+
/*
159+
* nullable_baserels is a Relids set of base relids that are nullable by
160+
* some outer join in the jointree; these are rels that are potentially
161+
* nullable below the WHERE clause, SELECT targetlist, etc. This is
162+
* computed in deconstruct_jointree.
163+
*/
164+
Relids nullable_baserels;
165+
157166
/*
158167
* join_rel_list is a list of all join-relation RelOptInfos we have
159168
* considered in this planning run. For small problems we just scan the

src/include/optimizer/paths.h

+1
Original file line numberDiff line numberDiff line change
@@ -109,6 +109,7 @@ extern Expr *canonicalize_ec_expression(Expr *expr,
109109
extern void reconsider_outer_join_clauses(PlannerInfo *root);
110110
extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
111111
Expr *expr,
112+
Relids nullable_relids,
112113
List *opfamilies,
113114
Oid opcintype,
114115
Oid collation,

src/test/regress/expected/join.out

+29
Original file line numberDiff line numberDiff line change
@@ -2900,6 +2900,35 @@ select f1, unique2, case when unique2 is null then f1 else 0 end
29002900
0 | 0 | 0
29012901
(1 row)
29022902

2903+
--
2904+
-- another case with equivalence clauses above outer joins (bug #8591)
2905+
--
2906+
explain (costs off)
2907+
select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
2908+
from tenk1 a left join tenk1 b on b.thousand = a.unique1 left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
2909+
where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
2910+
QUERY PLAN
2911+
---------------------------------------------------------------------------------------------
2912+
Nested Loop Left Join
2913+
-> Nested Loop Left Join
2914+
Filter: (COALESCE(b.twothousand, a.twothousand) = 44)
2915+
-> Index Scan using tenk1_unique2 on tenk1 a
2916+
Index Cond: (unique2 = 5530)
2917+
-> Bitmap Heap Scan on tenk1 b
2918+
Recheck Cond: (thousand = a.unique1)
2919+
-> Bitmap Index Scan on tenk1_thous_tenthous
2920+
Index Cond: (thousand = a.unique1)
2921+
-> Index Scan using tenk1_unique2 on tenk1 c
2922+
Index Cond: ((unique2 = COALESCE(b.twothousand, a.twothousand)) AND (unique2 = 44))
2923+
(11 rows)
2924+
2925+
select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
2926+
from tenk1 a left join tenk1 b on b.thousand = a.unique1 left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
2927+
where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
2928+
unique1 | unique1 | unique1 | coalesce
2929+
---------+---------+---------+----------
2930+
(0 rows)
2931+
29032932
--
29042933
-- test ability to push constants through outer join clauses
29052934
--

src/test/regress/sql/join.sql

+13
Original file line numberDiff line numberDiff line change
@@ -790,6 +790,19 @@ select f1, unique2, case when unique2 is null then f1 else 0 end
790790
from int4_tbl a left join tenk1 b on f1 = unique2
791791
where (case when unique2 is null then f1 else 0 end) = 0;
792792

793+
--
794+
-- another case with equivalence clauses above outer joins (bug #8591)
795+
--
796+
797+
explain (costs off)
798+
select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
799+
from tenk1 a left join tenk1 b on b.thousand = a.unique1 left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
800+
where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
801+
802+
select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
803+
from tenk1 a left join tenk1 b on b.thousand = a.unique1 left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
804+
where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
805+
793806
--
794807
-- test ability to push constants through outer join clauses
795808
--

0 commit comments

Comments
 (0)