Commit d19798e

Fix set_joinrel_size_estimates() to estimate outer-join sizes more accurately: we have to distinguish the effects of the join's own ON clauses from the effects of pushed-down clauses. Failing to do so was a quick hack long ago, but it's time to be smarter. Per example from Thomas H.
1 parent dcbdf9b commit d19798e
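
The row estimates this commit computes can be written compactly. The symbols are notation assumed here, not names from the source: $N_o$ and $N_i$ are the outer and inner input row counts, $s_j$ is the selectivity of the join's own ON clauses (`jselec`), and $s_p$ is the selectivity of the pushed-down clauses (`pselec`).

```latex
\begin{align*}
N_{\text{inner}} &= N_o \, N_i \, s_j
  && \text{(all clauses are counted in } s_j\text{)} \\
N_{\text{left}}  &= \max(N_o N_i s_j,\; N_o) \cdot s_p \\
N_{\text{right}} &= \max(N_o N_i s_j,\; N_i) \cdot s_p \\
N_{\text{full}}  &= \max(N_o N_i s_j,\; N_o,\; N_i) \cdot s_p
\end{align*}
```

The clamp applies only to the join-clause term; the pushed-down selectivity multiplies the clamped value, because those clauses filter rows after the outer join has produced them.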

1 file changed: +61 -17 lines changed

src/backend/optimizer/path/costsize.c

Lines changed: 61 additions & 17 deletions
@@ -54,7 +54,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.167 2006/10/04 00:29:53 momjian Exp $
+ * $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.168 2006/11/10 01:21:41 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -1948,7 +1948,8 @@ set_joinrel_size_estimates(PlannerInfo *root, RelOptInfo *rel,
                            JoinType jointype,
                            List *restrictlist)
 {
-    Selectivity selec;
+    Selectivity jselec;
+    Selectivity pselec;
     double nrows;
     UniquePath *upath;

@@ -1957,20 +1958,60 @@ set_joinrel_size_estimates(PlannerInfo *root, RelOptInfo *rel,
      * clauses that become restriction clauses at this join level; we are not
      * double-counting them because they were not considered in estimating the
      * sizes of the component rels.
+     *
+     * For an outer join, we have to distinguish the selectivity of the
+     * join's own clauses (JOIN/ON conditions) from any clauses that were
+     * "pushed down". For inner joins we just count them all as joinclauses.
      */
-    selec = clauselist_selectivity(root,
-                                   restrictlist,
-                                   0,
-                                   jointype);
+    if (IS_OUTER_JOIN(jointype))
+    {
+        List *joinquals = NIL;
+        List *pushedquals = NIL;
+        ListCell *l;
+
+        /* Grovel through the clauses to separate into two lists */
+        foreach(l, restrictlist)
+        {
+            RestrictInfo *rinfo = (RestrictInfo *) lfirst(l);
+
+            Assert(IsA(rinfo, RestrictInfo));
+            if (rinfo->is_pushed_down)
+                pushedquals = lappend(pushedquals, rinfo);
+            else
+                joinquals = lappend(joinquals, rinfo);
+        }
+
+        /* Get the separate selectivities */
+        jselec = clauselist_selectivity(root,
+                                        joinquals,
+                                        0,
+                                        jointype);
+        pselec = clauselist_selectivity(root,
+                                        pushedquals,
+                                        0,
+                                        jointype);
+
+        /* Avoid leaking a lot of ListCells */
+        list_free(joinquals);
+        list_free(pushedquals);
+    }
+    else
+    {
+        jselec = clauselist_selectivity(root,
+                                        restrictlist,
+                                        0,
+                                        jointype);
+        pselec = 0.0; /* not used, keep compiler quiet */
+    }

     /*
      * Basically, we multiply size of Cartesian product by selectivity.
      *
-     * If we are doing an outer join, take that into account: the output must
-     * be at least as large as the non-nullable input. (Is there any chance
-     * of being even smarter?) (XXX this is not really right, because it
-     * assumes all the restriction clauses are join clauses; we should figure
-     * pushed-down clauses separately.)
+     * If we are doing an outer join, take that into account: the joinqual
+     * selectivity has to be clamped using the knowledge that the output must
+     * be at least as large as the non-nullable input. However, any
+     * pushed-down quals are applied after the outer join, so their
+     * selectivity applies fully.
      *
      * For JOIN_IN and variants, the Cartesian product is figured with respect
      * to a unique-ified input, and then we can clamp to the size of the other
@@ -1979,38 +2020,41 @@ set_joinrel_size_estimates(PlannerInfo *root, RelOptInfo *rel,
     switch (jointype)
     {
         case JOIN_INNER:
-            nrows = outer_rel->rows * inner_rel->rows * selec;
+            nrows = outer_rel->rows * inner_rel->rows * jselec;
             break;
         case JOIN_LEFT:
-            nrows = outer_rel->rows * inner_rel->rows * selec;
+            nrows = outer_rel->rows * inner_rel->rows * jselec;
             if (nrows < outer_rel->rows)
                 nrows = outer_rel->rows;
+            nrows *= pselec;
             break;
         case JOIN_RIGHT:
-            nrows = outer_rel->rows * inner_rel->rows * selec;
+            nrows = outer_rel->rows * inner_rel->rows * jselec;
             if (nrows < inner_rel->rows)
                 nrows = inner_rel->rows;
+            nrows *= pselec;
             break;
         case JOIN_FULL:
-            nrows = outer_rel->rows * inner_rel->rows * selec;
+            nrows = outer_rel->rows * inner_rel->rows * jselec;
             if (nrows < outer_rel->rows)
                 nrows = outer_rel->rows;
             if (nrows < inner_rel->rows)
                 nrows = inner_rel->rows;
+            nrows *= pselec;
             break;
         case JOIN_IN:
         case JOIN_UNIQUE_INNER:
             upath = create_unique_path(root, inner_rel,
                                        inner_rel->cheapest_total_path);
-            nrows = outer_rel->rows * upath->rows * selec;
+            nrows = outer_rel->rows * upath->rows * jselec;
             if (nrows > outer_rel->rows)
                 nrows = outer_rel->rows;
             break;
         case JOIN_REVERSE_IN:
         case JOIN_UNIQUE_OUTER:
             upath = create_unique_path(root, outer_rel,
                                        outer_rel->cheapest_total_path);
-            nrows = upath->rows * inner_rel->rows * selec;
+            nrows = upath->rows * inner_rel->rows * jselec;
             if (nrows > inner_rel->rows)
                 nrows = inner_rel->rows;
             break;
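
To see why the order of clamping matters, the following standalone sketch compares the pre-patch and post-patch arithmetic outside the planner. It is not code from the commit: `SketchJoinType`, `clamp_to_preserved_side`, `estimate_rows_before`, and `estimate_rows_after` are hypothetical names, and the sketch assumes the old combined selectivity is simply the product of the join-clause and pushed-down selectivities (independent clauses).

```c
#include <assert.h>

/* Hypothetical stand-in for the planner's JoinType; only four cases are modeled. */
typedef enum
{
    SKETCH_JOIN_INNER,
    SKETCH_JOIN_LEFT,
    SKETCH_JOIN_RIGHT,
    SKETCH_JOIN_FULL
} SketchJoinType;

/* Raise nrows to at least the size of each non-nullable input of the join. */
double
clamp_to_preserved_side(SketchJoinType jt, double nrows,
                        double outer_rows, double inner_rows)
{
    if ((jt == SKETCH_JOIN_LEFT || jt == SKETCH_JOIN_FULL) && nrows < outer_rows)
        nrows = outer_rows;
    if ((jt == SKETCH_JOIN_RIGHT || jt == SKETCH_JOIN_FULL) && nrows < inner_rows)
        nrows = inner_rows;
    return nrows;
}

/* Pre-patch arithmetic: one selectivity for every clause, clamp applied last. */
double
estimate_rows_before(SketchJoinType jt, double outer_rows, double inner_rows,
                     double selec)
{
    assert(selec >= 0.0 && selec <= 1.0);
    return clamp_to_preserved_side(jt, outer_rows * inner_rows * selec,
                                   outer_rows, inner_rows);
}

/*
 * Post-patch arithmetic: clamp the join-clause estimate first, then let the
 * pushed-down selectivity apply in full. For an inner join pselec is ignored,
 * mirroring the patch, where all clauses are folded into jselec.
 */
double
estimate_rows_after(SketchJoinType jt, double outer_rows, double inner_rows,
                    double jselec, double pselec)
{
    double nrows;

    assert(jselec >= 0.0 && jselec <= 1.0);
    assert(pselec >= 0.0 && pselec <= 1.0);

    nrows = clamp_to_preserved_side(jt, outer_rows * inner_rows * jselec,
                                    outer_rows, inner_rows);
    if (jt != SKETCH_JOIN_INNER)
        nrows *= pselec;
    return nrows;
}
```

For a left join of two 1000-row inputs with a join-clause selectivity of 0.0005 and a pushed-down selectivity of 0.25, `estimate_rows_before` (given the combined selectivity 0.0005 * 0.25) clamps up to 1000 rows, so the pushed-down filter has no effect on the estimate. `estimate_rows_after` clamps the join-clause result to 1000 and then applies the filter, giving 250 rows.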
