
Commit 6bef118

Restructure code that is responsible for ensuring that clauseless joins are considered when it is necessary to do so because of a join-order restriction (that is, an outer-join or IN-subselect construct). The former coding was a bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario Weilguni's recent bug report.

His specific problem was that an IN could be turned into a "clauseless" join due to constant-propagation removing the IN's joinclause, and if the IN's subselect involved more than one relation and there was more than one such IN linking to the same upper relation, then the only valid join orders involve "bushy" plans but we would fail to consider the specific paths needed to get there. (See the example case added to the join regression test.) On examining the code I wonder if there weren't some other problem cases too; in particular it seems that GEQO was defending against a different set of corner cases than the main planner was.

There was also an efficiency problem, in that when we did realize we needed a clauseless join because of an IN, we'd consider clauseless joins against every other relation whether this was sensible or not.

It seems a better design is to use the outer-join and in-clause lists as a backup heuristic, just as the rule of joining only where there are joinclauses is a heuristic: we'll join two relations if they have a usable joinclause *or* this might be necessary to satisfy an outer-join or IN-clause join order restriction. I refactored the code to have just one place considering this instead of three, and made sure that it covered all the cases that any of them had been considering.

Backpatch as far as 8.1 (which has only the IN-clause form of the disease). By rights 8.0 and 7.4 should have the bug too, but they accidentally fail to fail, because the joininfo structure used in those releases preserves some memory of there having once been a joinclause between the inner and outer sides of an IN, and so it leads the code in the right direction anyway. I'll be conservative and not touch them.
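
The rule stated in the commit message (join two relations if they share a usable joinclause, or if a join-order restriction might require it) can be sketched in a few lines of C. This is a minimal toy model under stated assumptions, not the planner's code: `RelSet`, `ToyQuery`, and every `toy_` function are hypothetical names, relation sets are plain bitmasks, and only the IN-clause form of the restriction is modeled (both sides must lie within one IN subselect).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): bit i of a RelSet stands for base relation i. */
typedef unsigned int RelSet;

/* A join clause, reduced to the set of relations it mentions. */
typedef struct { RelSet mentioned; } ToyJoinClause;

/* An IN restriction, reduced to the relations of its subselect. */
typedef struct { RelSet righthand; } ToyInRestriction;

typedef struct
{
    const ToyJoinClause    *clauses;
    size_t                  nclauses;
    const ToyInRestriction *ins;
    size_t                  nins;
} ToyQuery;

/* Primary heuristic: some clause mentions relations on both sides. */
static bool
toy_have_joinclause(const ToyQuery *q, RelSet a, RelSet b)
{
    for (size_t i = 0; i < q->nclauses; i++)
        if ((q->clauses[i].mentioned & a) != 0 &&
            (q->clauses[i].mentioned & b) != 0)
            return true;
    return false;
}

/* Backup heuristic (simplified): both sides lie inside one IN subselect. */
static bool
toy_have_in_restriction(const ToyQuery *q, RelSet a, RelSet b)
{
    for (size_t i = 0; i < q->nins; i++)
        if ((a & ~q->ins[i].righthand) == 0 &&
            (b & ~q->ins[i].righthand) == 0)
            return true;
    return false;
}

/* Single decision point: join on a clause *or* on a restriction. */
bool
toy_should_consider_join(const ToyQuery *q, RelSet a, RelSet b)
{
    assert(a != 0 && b != 0 && (a & b) == 0);   /* non-empty and disjoint */
    return toy_have_joinclause(q, a, b) ||
           toy_have_in_restriction(q, a, b);
}
```

With one clause linking relations 0 and 1 and one IN subselect over relations 2 and 3, `toy_should_consider_join` accepts the pairs (0,1) and (2,3) and rejects (0,2), so the restriction acts only as a fallback for the pair it actually concerns.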
1 parent 1820650 commit 6bef118

8 files changed: +223 -151 lines changed

src/backend/optimizer/README

Lines changed: 10 additions & 6 deletions
@@ -105,12 +105,16 @@ that are either base rels or joinrels constructed per sub-join-lists.
 We can join these rels together in any order the planner sees fit.
 The standard (non-GEQO) planner does this as follows:
 
-Consider joining each RelOptInfo to each other RelOptInfo specified in its
-RelOptInfo.joininfo, and generate a Path for each possible join method for
-each such pair. (If we have a RelOptInfo with no join clauses, we have no
-choice but to generate a clauseless Cartesian-product join; so we consider
-joining that rel to each other available rel. But in the presence of join
-clauses we will only consider joins that use available join clauses.)
+Consider joining each RelOptInfo to each other RelOptInfo for which there
+is a usable joinclause, and generate a Path for each possible join method
+for each such pair. (If we have a RelOptInfo with no join clauses, we have
+no choice but to generate a clauseless Cartesian-product join; so we
+consider joining that rel to each other available rel. But in the presence
+of join clauses we will only consider joins that use available join
+clauses. Note that join-order restrictions induced by outer joins and
+IN clauses are treated as if they were real join clauses, to ensure that
+we find a workable join order in cases where those restrictions force a
+clauseless join to be done.)
 
 If we only had two relations in the list, we are done: we just pick
 the cheapest path for the join RelOptInfo. If we had more than two, we now
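
The first-level pairing rule in the README text above can be illustrated with a small sketch. It is a hedged toy model, not the planner's data structures: `ToyJoinGraph`, `TOY_MAX_RELS`, and the `toy_` functions are hypothetical, and a "usable joinclause" is reduced to a symmetric boolean matrix. Under the README's wording, a join-order restriction could be modeled simply as one more `true` entry in that matrix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RELS 8

/*
 * Hypothetical input: clause[i][j] is true when rels i and j share a usable
 * join clause.  The matrix is assumed to be symmetric.
 */
typedef struct
{
    size_t nrels;
    bool   clause[TOY_MAX_RELS][TOY_MAX_RELS];
} ToyJoinGraph;

/* Does rel i have a join clause to any other rel? */
static bool
toy_has_any_clause(const ToyJoinGraph *g, size_t i)
{
    for (size_t j = 0; j < g->nrels; j++)
        if (j != i && g->clause[i][j])
            return true;
    return false;
}

/*
 * Mark every pair worth considering at the first join level and return how
 * many pairs were marked.  A pair qualifies if it shares a clause, or if
 * either member has no clauses at all (the Cartesian-product fallback).
 * Only off-diagonal entries of consider[][] are written.
 */
size_t
toy_first_level_pairs(const ToyJoinGraph *g,
                      bool consider[TOY_MAX_RELS][TOY_MAX_RELS])
{
    size_t count = 0;

    assert(g->nrels <= TOY_MAX_RELS);
    for (size_t i = 0; i < g->nrels; i++)
    {
        for (size_t j = i + 1; j < g->nrels; j++)
        {
            bool ok = g->clause[i][j] ||
                      !toy_has_any_clause(g, i) ||
                      !toy_has_any_clause(g, j);

            consider[i][j] = consider[j][i] = ok;
            if (ok)
                count++;
        }
    }
    return count;
}
```

For three rels where only rels 0 and 1 share a clause, all three pairs are marked, because rel 2 has no clauses and must be joined by Cartesian product. For four rels with clauses 0-1 and 2-3, only those two pairs are marked.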

src/backend/optimizer/geqo/geqo_eval.c

Lines changed: 5 additions & 43 deletions
@@ -6,7 +6,7 @@
  * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/backend/optimizer/geqo/geqo_eval.c,v 1.84 2007/02/13 02:31:02 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/optimizer/geqo/geqo_eval.c,v 1.85 2007/02/16 00:14:01 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -253,52 +253,14 @@ static bool
 desirable_join(PlannerInfo *root,
                RelOptInfo *outer_rel, RelOptInfo *inner_rel)
 {
-    ListCell *l;
-
     /*
-     * Join if there is an applicable join clause.
+     * Join if there is an applicable join clause, or if there is a join
+     * order restriction forcing these rels to be joined.
      */
-    if (have_relevant_joinclause(root, outer_rel, inner_rel))
+    if (have_relevant_joinclause(root, outer_rel, inner_rel) ||
+        have_join_order_restriction(root, outer_rel, inner_rel))
         return true;
 
-    /*
-     * Join if the rels overlap the same outer-join side and don't already
-     * implement the outer join. This is needed to ensure that we can find a
-     * valid solution in a case where an OJ contains a clauseless join.
-     */
-    foreach(l, root->oj_info_list)
-    {
-        OuterJoinInfo *ojinfo = (OuterJoinInfo *) lfirst(l);
-
-        /* ignore full joins --- other mechanisms preserve their ordering */
-        if (ojinfo->is_full_join)
-            continue;
-        if (bms_overlap(outer_rel->relids, ojinfo->min_righthand) &&
-            bms_overlap(inner_rel->relids, ojinfo->min_righthand) &&
-            !bms_overlap(outer_rel->relids, ojinfo->min_lefthand) &&
-            !bms_overlap(inner_rel->relids, ojinfo->min_lefthand))
-            return true;
-        if (bms_overlap(outer_rel->relids, ojinfo->min_lefthand) &&
-            bms_overlap(inner_rel->relids, ojinfo->min_lefthand) &&
-            !bms_overlap(outer_rel->relids, ojinfo->min_righthand) &&
-            !bms_overlap(inner_rel->relids, ojinfo->min_righthand))
-            return true;
-    }
-
-    /*
-     * Join if the rels are members of the same IN sub-select. This is needed
-     * to ensure that we can find a valid solution in a case where an IN
-     * sub-select has a clauseless join.
-     */
-    foreach(l, root->in_info_list)
-    {
-        InClauseInfo *ininfo = (InClauseInfo *) lfirst(l);
-
-        if (bms_is_subset(outer_rel->relids, ininfo->righthand) &&
-            bms_is_subset(inner_rel->relids, ininfo->righthand))
-            return true;
-    }
-
     /* Otherwise postpone the join till later. */
     return false;
 }
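
To make the removed outer-join loop easier to follow, here is a standalone sketch of the same overlap test using plain bitmasks in place of `Bitmapset`. It models only the deleted GEQO check shown in the hunk above. It is not the body of `have_join_order_restriction`, which does not appear in this part of the diff. `RelSet`, `ToyOuterJoin`, and `toy_oj_forces_join` are hypothetical names, and the struct copies just the three `OuterJoinInfo` fields the removed code reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): bit i of a RelSet stands for base relation i. */
typedef unsigned int RelSet;

/* Stand-in for the OuterJoinInfo fields used by the removed loop. */
typedef struct
{
    RelSet min_lefthand;
    RelSet min_righthand;
    bool   is_full_join;
} ToyOuterJoin;

static bool
toy_overlap(RelSet a, RelSet b)
{
    return (a & b) != 0;
}

/*
 * Return true if both rels touch the same side of some non-full outer join
 * and neither touches the other side, which is the condition the removed
 * loop used to permit a clauseless join.
 */
bool
toy_oj_forces_join(const ToyOuterJoin *ojs, size_t n,
                   RelSet outer, RelSet inner)
{
    assert((outer & inner) == 0);       /* rels to be joined are disjoint */

    for (size_t i = 0; i < n; i++)
    {
        const ToyOuterJoin *oj = &ojs[i];

        /* full joins are skipped, as in the removed code */
        if (oj->is_full_join)
            continue;

        /* both within the right-hand side only */
        if (toy_overlap(outer, oj->min_righthand) &&
            toy_overlap(inner, oj->min_righthand) &&
            !toy_overlap(outer, oj->min_lefthand) &&
            !toy_overlap(inner, oj->min_lefthand))
            return true;

        /* both within the left-hand side only */
        if (toy_overlap(outer, oj->min_lefthand) &&
            toy_overlap(inner, oj->min_lefthand) &&
            !toy_overlap(outer, oj->min_righthand) &&
            !toy_overlap(inner, oj->min_righthand))
            return true;
    }
    return false;
}
```

For an outer join whose left side is relation 0 and whose right side is relations 1 and 2, the function returns true for the pair (1, 2) and false for (0, 1). Marking the same outer join as a full join makes it return false for (1, 2) as well.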
