Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 9df8f90

Browse files
committed
Fix some issues with improper placement of outer join clauses.
After applying outer-join identity 3 in the forward direction, it was possible for the planner to mistakenly apply a qual clause from above the two outer joins at the now-lower join level. This can give the wrong answer, since a value that would get nulled by the now-upper join might not yet be null. To fix, when we perform such a transformation, consider that the now-lower join hasn't really completed the outer join it's nominally responsible for and thus its relid set should not include that OJ's relid (nor should its output Vars have that nullingrel bit set). Instead we add those bits when the now-upper join is performed. The existing rules for qual placement then suffice to prevent higher qual clauses from dropping below the now-upper join. There are a few complications from needing to consider transitive closures in case multiple pushdowns have happened, but all in all it's not a very complex patch. This is all new logic (from 2489d76) so no need to back-patch. The added test cases all have the same results as in v15. Tom Lane and Richard Guo Discussion: https://postgr.es/m/0b819232-4b50-f245-1c7d-c8c61bf41827@postgrespro.ru
1 parent 867be9c commit 9df8f90

File tree

14 files changed

+450
-86
lines changed

14 files changed

+450
-86
lines changed

src/backend/optimizer/README

+21
Original file line numberDiff line numberDiff line change
@@ -507,6 +507,27 @@ problem for join relation identification either, since whether a semijoin
507507
has been completed is again implicit in the set of base relations
508508
included in the join.
509509

510+
As usual, outer join identity 3 complicates matters. If we start with
511+
(A leftjoin B on (Pab)) leftjoin C on (Pbc)
512+
then the parser will have marked any C Vars appearing above these joins
513+
with the RT index of the B/C join. If we now transform to
514+
A leftjoin (B leftjoin C on (Pbc)) on (Pab)
515+
then it would appear that a clause using only such Vars could be pushed
516+
down and applied as a filter clause (not a join clause) at the lower
517+
B/C join. But *this might not give the right answer* since the clause
518+
might see a non-null value for the C Var that will be replaced by null
519+
once the A/B join is performed. We handle this by saying that the
520+
pushed-down join hasn't completely performed the work of the B/C join
521+
and hence is not entitled to include that outer join relid in its
522+
relid set. When we form the A/B join, both outer joins' relids will
523+
be added to its relid set, and then the upper clause will be applied
524+
at the correct join level. (Note there is no problem when identity 3
525+
is applied in the other direction: if we started with the second form
526+
then upper C Vars are marked with both outer join relids, so they
527+
cannot drop below whichever join is applied second.) Similarly,
528+
Vars representing the output of a pushed-down join do not acquire
529+
nullingrel bits for that join until after the upper join is performed.
530+
510531
There is one additional complication for qual clause placement, which
511532
occurs when we have made multiple versions of an outer-join clause as
512533
described previously (that is, we have both "Pbc" and "Pb*c" forms of

src/backend/optimizer/path/costsize.c

+4-2
Original file line numberDiff line numberDiff line change
@@ -4789,7 +4789,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
47894789
norm_sjinfo.ojrelid = 0;
47904790
norm_sjinfo.commute_above_l = NULL;
47914791
norm_sjinfo.commute_above_r = NULL;
4792-
norm_sjinfo.commute_below = NULL;
4792+
norm_sjinfo.commute_below_l = NULL;
4793+
norm_sjinfo.commute_below_r = NULL;
47934794
/* we don't bother trying to make the remaining fields valid */
47944795
norm_sjinfo.lhs_strict = false;
47954796
norm_sjinfo.semi_can_btree = false;
@@ -4957,7 +4958,8 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
49574958
sjinfo.ojrelid = 0;
49584959
sjinfo.commute_above_l = NULL;
49594960
sjinfo.commute_above_r = NULL;
4960-
sjinfo.commute_below = NULL;
4961+
sjinfo.commute_below_l = NULL;
4962+
sjinfo.commute_below_r = NULL;
49614963
/* we don't bother trying to make the remaining fields valid */
49624964
sjinfo.lhs_strict = false;
49634965
sjinfo.semi_can_btree = false;

src/backend/optimizer/path/equivclass.c

+12-9
Original file line numberDiff line numberDiff line change
@@ -1366,19 +1366,20 @@ generate_base_implied_equalities_broken(PlannerInfo *root,
13661366
* commutative duplicates, i.e. if the algorithm selects "a.x = b.y" but
13671367
* we already have "b.y = a.x", we return the existing clause.
13681368
*
1369-
* If we are considering an outer join, ojrelid is the associated OJ relid,
1370-
* otherwise it's zero.
1369+
* If we are considering an outer join, sjinfo is the associated OJ info,
1370+
* otherwise it can be NULL.
13711371
*
13721372
* join_relids should always equal bms_union(outer_relids, inner_rel->relids)
1373-
* plus ojrelid if that's not zero. We could simplify this function's API by
1374-
* computing it internally, but most callers have the value at hand anyway.
1373+
* plus whatever add_outer_joins_to_relids() would add. We could simplify
1374+
* this function's API by computing it internally, but most callers have the
1375+
* value at hand anyway.
13751376
*/
13761377
List *
13771378
generate_join_implied_equalities(PlannerInfo *root,
13781379
Relids join_relids,
13791380
Relids outer_relids,
13801381
RelOptInfo *inner_rel,
1381-
Index ojrelid)
1382+
SpecialJoinInfo *sjinfo)
13821383
{
13831384
List *result = NIL;
13841385
Relids inner_relids = inner_rel->relids;
@@ -1396,8 +1397,10 @@ generate_join_implied_equalities(PlannerInfo *root,
13961397
nominal_inner_relids = inner_rel->top_parent_relids;
13971398
/* ECs will be marked with the parent's relid, not the child's */
13981399
nominal_join_relids = bms_union(outer_relids, nominal_inner_relids);
1399-
if (ojrelid != 0)
1400-
nominal_join_relids = bms_add_member(nominal_join_relids, ojrelid);
1400+
nominal_join_relids = add_outer_joins_to_relids(root,
1401+
nominal_join_relids,
1402+
sjinfo,
1403+
NULL);
14011404
}
14021405
else
14031406
{
@@ -1418,7 +1421,7 @@ generate_join_implied_equalities(PlannerInfo *root,
14181421
* At inner joins, we can be smarter: only consider eclasses mentioning
14191422
* both input rels.
14201423
*/
1421-
if (ojrelid != 0)
1424+
if (sjinfo && sjinfo->ojrelid != 0)
14221425
matching_ecs = get_eclass_indexes_for_relids(root, nominal_join_relids);
14231426
else
14241427
matching_ecs = get_common_eclass_indexes(root, nominal_inner_relids,
@@ -1467,7 +1470,7 @@ generate_join_implied_equalities(PlannerInfo *root,
14671470
* generate_join_implied_equalities_for_ecs
14681471
* As above, but consider only the listed ECs.
14691472
*
1470-
* For the sole current caller, we can assume ojrelid == 0, that is we are
1473+
* For the sole current caller, we can assume sjinfo == NULL, that is we are
14711474
* not interested in outer-join filter clauses. This might need to change
14721475
* in future.
14731476
*/

src/backend/optimizer/path/indxpath.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -3364,7 +3364,7 @@ check_index_predicates(PlannerInfo *root, RelOptInfo *rel)
33643364
otherrels),
33653365
otherrels,
33663366
rel,
3367-
0));
3367+
NULL));
33683368

33693369
/*
33703370
* Normally we remove quals that are implied by a partial index's

src/backend/optimizer/path/joinrels.c

+115-8
Original file line numberDiff line numberDiff line change
@@ -691,6 +691,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
691691
Relids joinrelids;
692692
SpecialJoinInfo *sjinfo;
693693
bool reversed;
694+
List *pushed_down_joins = NIL;
694695
SpecialJoinInfo sjinfo_data;
695696
RelOptInfo *joinrel;
696697
List *restrictlist;
@@ -710,9 +711,12 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
710711
return NULL;
711712
}
712713

713-
/* If we have an outer join, add its RTI to form the canonical relids. */
714-
if (sjinfo && sjinfo->ojrelid != 0)
715-
joinrelids = bms_add_member(joinrelids, sjinfo->ojrelid);
714+
/*
715+
* Add outer join relid(s) to form the canonical relids. Any added outer
716+
* joins besides sjinfo itself are appended to pushed_down_joins.
717+
*/
718+
joinrelids = add_outer_joins_to_relids(root, joinrelids, sjinfo,
719+
&pushed_down_joins);
716720

717721
/* Swap rels if needed to match the join info. */
718722
if (reversed)
@@ -740,7 +744,8 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
740744
sjinfo->ojrelid = 0;
741745
sjinfo->commute_above_l = NULL;
742746
sjinfo->commute_above_r = NULL;
743-
sjinfo->commute_below = NULL;
747+
sjinfo->commute_below_l = NULL;
748+
sjinfo->commute_below_r = NULL;
744749
/* we don't bother trying to make the remaining fields valid */
745750
sjinfo->lhs_strict = false;
746751
sjinfo->semi_can_btree = false;
@@ -753,7 +758,8 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
753758
* Find or build the join RelOptInfo, and compute the restrictlist that
754759
* goes with this particular joining.
755760
*/
756-
joinrel = build_join_rel(root, joinrelids, rel1, rel2, sjinfo,
761+
joinrel = build_join_rel(root, joinrelids, rel1, rel2,
762+
sjinfo, pushed_down_joins,
757763
&restrictlist);
758764

759765
/*
@@ -775,6 +781,108 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
775781
return joinrel;
776782
}
777783

784+
/*
785+
* add_outer_joins_to_relids
786+
* Add relids to input_relids to represent any outer joins that will be
787+
* calculated at this join.
788+
*
789+
* input_relids is the union of the relid sets of the two input relations.
790+
* Note that we modify this in-place and return it; caller must bms_copy()
791+
* it first, if a separate value is desired.
792+
*
793+
* sjinfo represents the join being performed.
794+
*
795+
* If the current join completes the calculation of any outer joins that
796+
* have been pushed down per outer-join identity 3, those relids will be
797+
* added to the result along with sjinfo's own relid. If pushed_down_joins
798+
* is not NULL, then also the SpecialJoinInfos for such added outer joins will
799+
* be appended to *pushed_down_joins (so caller must initialize it to NIL).
800+
*/
801+
Relids
802+
add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
803+
SpecialJoinInfo *sjinfo,
804+
List **pushed_down_joins)
805+
{
806+
/* Nothing to do if this isn't an outer join with an assigned relid. */
807+
if (sjinfo == NULL || sjinfo->ojrelid == 0)
808+
return input_relids;
809+
810+
/*
811+
* If it's not a left join, we have no rules that would permit executing
812+
* it in non-syntactic order, so just form the syntactic relid set. (This
813+
* is just a quick-exit test; we'd come to the same conclusion anyway,
814+
* since its commute_below_l and commute_above_l sets must be empty.)
815+
*/
816+
if (sjinfo->jointype != JOIN_LEFT)
817+
return bms_add_member(input_relids, sjinfo->ojrelid);
818+
819+
/*
820+
* We cannot add the OJ relid if this join has been pushed into the RHS of
821+
* a syntactically-lower left join per OJ identity 3. (If it has, then we
822+
* cannot claim that its outputs represent the final state of its RHS.)
823+
* There will not be any other OJs that can be added either, so we're
824+
* done.
825+
*/
826+
if (!bms_is_subset(sjinfo->commute_below_l, input_relids))
827+
return input_relids;
828+
829+
/* OK to add OJ's own relid */
830+
input_relids = bms_add_member(input_relids, sjinfo->ojrelid);
831+
832+
/*
833+
* Contrariwise, if we are now forming the final result of such a commuted
834+
* pair of OJs, it's time to add the relid(s) of the pushed-down join(s).
835+
* We can skip this if this join was never a candidate to be pushed up.
836+
*/
837+
if (sjinfo->commute_above_l)
838+
{
839+
Relids commute_above_rels = bms_copy(sjinfo->commute_above_l);
840+
ListCell *lc;
841+
842+
/*
843+
* The current join could complete the nulling of more than one
844+
* pushed-down join, so we have to examine all the SpecialJoinInfos.
845+
* Because join_info_list was built in bottom-up order, it's
846+
* sufficient to traverse it once: an ojrelid we add in one loop
847+
* iteration would not have affected decisions of earlier iterations.
848+
*/
849+
foreach(lc, root->join_info_list)
850+
{
851+
SpecialJoinInfo *othersj = (SpecialJoinInfo *) lfirst(lc);
852+
853+
if (othersj == sjinfo ||
854+
othersj->ojrelid == 0 || othersj->jointype != JOIN_LEFT)
855+
continue; /* definitely not interesting */
856+
857+
if (!bms_is_member(othersj->ojrelid, commute_above_rels))
858+
continue;
859+
860+
/* Add it if not already present but conditions now satisfied */
861+
if (!bms_is_member(othersj->ojrelid, input_relids) &&
862+
bms_is_subset(othersj->min_lefthand, input_relids) &&
863+
bms_is_subset(othersj->min_righthand, input_relids) &&
864+
bms_is_subset(othersj->commute_below_l, input_relids))
865+
{
866+
input_relids = bms_add_member(input_relids, othersj->ojrelid);
867+
/* report such pushed down outer joins, if asked */
868+
if (pushed_down_joins != NULL)
869+
*pushed_down_joins = lappend(*pushed_down_joins, othersj);
870+
871+
/*
872+
* We must also check any joins that othersj potentially
873+
* commutes with. They likewise must appear later in
874+
* join_info_list than othersj itself, so we can visit them
875+
* later in this loop.
876+
*/
877+
commute_above_rels = bms_add_members(commute_above_rels,
878+
othersj->commute_above_l);
879+
}
880+
}
881+
}
882+
883+
return input_relids;
884+
}
885+
778886
/*
779887
* populate_joinrel_with_paths
780888
* Add paths to the given joinrel for given pair of joining relations. The
@@ -1534,9 +1642,8 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
15341642

15351643
/* Build correct join relids for child join */
15361644
child_joinrelids = bms_union(child_rel1->relids, child_rel2->relids);
1537-
if (child_sjinfo->ojrelid != 0)
1538-
child_joinrelids = bms_add_member(child_joinrelids,
1539-
child_sjinfo->ojrelid);
1645+
child_joinrelids = add_outer_joins_to_relids(root, child_joinrelids,
1646+
child_sjinfo, NULL);
15401647

15411648
/* Find the AppendRelInfo structures */
15421649
appinfos = find_appinfos_by_relids(root, child_joinrelids, &nappinfos);

src/backend/optimizer/plan/analyzejoins.c

+10-6
Original file line numberDiff line numberDiff line change
@@ -88,8 +88,11 @@ remove_useless_joins(PlannerInfo *root, List *joinlist)
8888
*/
8989
innerrelid = bms_singleton_member(sjinfo->min_righthand);
9090

91-
/* Compute the relid set for the join we are considering */
92-
joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
91+
/*
92+
* Compute the relid set for the join we are considering. We can
93+
* assume things are done in syntactic order.
94+
*/
95+
joinrelids = bms_union(sjinfo->syn_lefthand, sjinfo->syn_righthand);
9396
if (sjinfo->ojrelid != 0)
9497
joinrelids = bms_add_member(joinrelids, sjinfo->ojrelid);
9598

@@ -201,8 +204,8 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
201204
if (!rel_supports_distinctness(root, innerrel))
202205
return false;
203206

204-
/* Compute the relid set for the join we are considering */
205-
inputrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
207+
/* Compute the syntactic relid set for the join we are considering */
208+
inputrelids = bms_union(sjinfo->syn_lefthand, sjinfo->syn_righthand);
206209
Assert(sjinfo->ojrelid != 0);
207210
joinrelids = bms_copy(inputrelids);
208211
joinrelids = bms_add_member(joinrelids, sjinfo->ojrelid);
@@ -399,7 +402,8 @@ remove_rel_from_query(PlannerInfo *root, int relid, int ojrelid,
399402
/* relid cannot appear in these fields, but ojrelid can: */
400403
sjinfo->commute_above_l = bms_del_member(sjinfo->commute_above_l, ojrelid);
401404
sjinfo->commute_above_r = bms_del_member(sjinfo->commute_above_r, ojrelid);
402-
sjinfo->commute_below = bms_del_member(sjinfo->commute_below, ojrelid);
405+
sjinfo->commute_below_l = bms_del_member(sjinfo->commute_below_l, ojrelid);
406+
sjinfo->commute_below_r = bms_del_member(sjinfo->commute_below_r, ojrelid);
403407
}
404408

405409
/*
@@ -665,7 +669,7 @@ reduce_unique_semijoins(PlannerInfo *root)
665669
joinrelids,
666670
sjinfo->min_lefthand,
667671
innerrel,
668-
0),
672+
NULL),
669673
innerrel->joininfo);
670674

671675
/* Test whether the innerrel is unique for those clauses. */

0 commit comments

Comments
 (0)