Commit a3f945a

Adjust estimate_num_groups() to not clamp per-relation group count estimate to less than the number of values estimated for any one grouping Var, as suggested by Manfred. This is intuitively right, and what's more it puts the plan choices in the subselect regression test back the way they were before ...
1 parent 48522fd commit a3f945a
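
As a reading aid, the per-relation clamp described in the commit message can be written as a formula. The symbols are shorthand introduced here, not names from the source: T stands for rel->tuples, k for relvarcount, d_i for the ndistinct of the i-th grouping Var of the relation, and d_max for the largest of those values. This is a sketch of the clamp step only, as it appears in the diff below, not of everything estimate_num_groups() does.

```latex
\[
\text{clamp} =
\begin{cases}
T, & k = 1,\\[2pt]
\min\bigl(T,\ \max(0.1\,T,\ d_{\max})\bigr), & k > 1,
\end{cases}
\qquad
\text{reldistinct} = \min\Bigl(\prod_{i=1}^{k} d_i,\ \text{clamp}\Bigr)
\]
```

Before this commit the k > 1 case was simply 0.1 T, so the estimate could fall below d_max.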

2 files changed: 18 additions, 4 deletions

src/backend/utils/adt/selfuncs.c

Lines changed: 16 additions & 2 deletions
@@ -15,7 +15,7 @@
  *
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.170 2005/01/28 20:34:25 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.171 2005/02/01 23:07:58 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -2043,6 +2043,7 @@ estimate_num_groups(Query *root, List *groupExprs, double input_rows)
 GroupVarInfo *varinfo1 = (GroupVarInfo *) linitial(varinfos);
 RelOptInfo *rel = varinfo1->rel;
 double reldistinct = varinfo1->ndistinct;
+double relmaxndistinct = reldistinct;
 int relvarcount = 1;
 List *newvarinfos = NIL;

@@ -2057,6 +2058,8 @@ estimate_num_groups(Query *root, List *groupExprs, double input_rows)
 if (varinfo2->rel == varinfo1->rel)
 {
     reldistinct *= varinfo2->ndistinct;
+    if (relmaxndistinct < varinfo2->ndistinct)
+        relmaxndistinct = varinfo2->ndistinct;
     relvarcount++;
 }
 else
@@ -2075,12 +2078,23 @@ estimate_num_groups(Query *root, List *groupExprs, double input_rows)
 /*
  * Clamp to size of rel, or size of rel / 10 if multiple Vars.
  * The fudge factor is because the Vars are probably correlated
- * but we don't know by how much.
+ * but we don't know by how much. We should never clamp to less
+ * than the largest ndistinct value for any of the Vars, though,
+ * since there will surely be at least that many groups.
  */
 double clamp = rel->tuples;
 
 if (relvarcount > 1)
+{
     clamp *= 0.1;
+    if (clamp < relmaxndistinct)
+    {
+        clamp = relmaxndistinct;
+        /* for sanity in case some ndistinct is too large: */
+        if (clamp > rel->tuples)
+            clamp = rel->tuples;
+    }
+}
 if (reldistinct > clamp)
     reldistinct = clamp;
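
To make the effect of the new clamp easier to see, here is a minimal standalone sketch of the per-relation logic from the hunks above. It is not PostgreSQL code: the function name clamp_rel_groups and its array-based signature are hypothetical, chosen so the logic can be compiled and tried outside the planner. It covers only the clamp step shown in this diff.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of PostgreSQL): mirrors the per-relation
 * clamp from the diff. ndistinct[] holds the estimated number of distinct
 * values for each grouping Var of one relation; rel_tuples plays the role
 * of rel->tuples.
 */
static double
clamp_rel_groups(const double *ndistinct, int nvars, double rel_tuples)
{
    double reldistinct;
    double relmaxndistinct;
    double clamp;
    int    i;

    assert(nvars >= 1);

    reldistinct = ndistinct[0];
    relmaxndistinct = reldistinct;

    /* Multiply the per-Var estimates and remember the largest one. */
    for (i = 1; i < nvars; i++)
    {
        reldistinct *= ndistinct[i];
        if (relmaxndistinct < ndistinct[i])
            relmaxndistinct = ndistinct[i];
    }

    /* Clamp to size of rel, or size of rel / 10 if multiple Vars ... */
    clamp = rel_tuples;
    if (nvars > 1)
    {
        clamp *= 0.1;
        /* ... but never below the largest single-Var ndistinct. */
        if (clamp < relmaxndistinct)
        {
            clamp = relmaxndistinct;
            /* Sanity bound in case some ndistinct exceeds the rel size. */
            if (clamp > rel_tuples)
                clamp = rel_tuples;
        }
    }
    if (reldistinct > clamp)
        reldistinct = clamp;

    return reldistinct;
}
```

For example, with rel_tuples = 1000 and two grouping Vars whose ndistinct values are 500 and 4, the product is 2000 and the old rule would clamp it to 100. With this change the clamp is raised to 500, so the sketch returns 500.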

src/test/regress/expected/subselect.out

Lines changed: 2 additions & 2 deletions
@@ -134,11 +134,11 @@ SELECT '' AS five, f1 AS "Correlated Field"
 WHERE f3 IS NOT NULL);
  five | Correlated Field
 ------+------------------
+      |                2
       |                3
       |                1
-      |                3
-      |                2
       |                2
+      |                3
 (5 rows)
 
 --
