Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 33e9915

Browse files
committed
Use fuzzy not exact cost comparison for the final tie-breaker in add_path.
Instead of an exact cost comparison, use a fuzzy comparison with 1e-10 delta after all other path metrics have proved equal. This is to avoid having platform-specific roundoff behaviors determine the choice when two paths are really the same to our cost estimators. Adjust the recently-added test case that made it obvious we had a problem here.
1 parent 09ff76f commit 33e9915

File tree

2 files changed

+38
-22
lines changed

2 files changed

+38
-22
lines changed

src/backend/optimizer/util/pathnode.c

+30-14
Original file line numberDiff line numberDiff line change
@@ -125,8 +125,11 @@ compare_fractional_path_costs(Path *path1, Path *path2,
125125
*
126126
* We use fuzzy comparisons so that add_path() can avoid keeping both of
127127
* a pair of paths that really have insignificantly different cost.
128-
* The fuzz factor is 1% of the smaller cost. (XXX does this percentage
129-
* need to be user-configurable?)
128+
*
129+
* The fuzz_factor argument must be 1.0 plus delta, where delta is the
130+
* fraction of the smaller cost that is considered to be a significant
131+
* difference. For example, fuzz_factor = 1.01 makes the fuzziness limit
132+
* be 1% of the smaller cost.
130133
*
131134
* The two paths are said to have "equal" costs if both startup and total
132135
* costs are fuzzily the same. Path1 is said to be better than path2 if
@@ -138,27 +141,27 @@ compare_fractional_path_costs(Path *path1, Path *path2,
138141
* dominates the other across the whole performance spectrum.
139142
*/
140143
static PathCostComparison
141-
compare_path_costs_fuzzily(Path *path1, Path *path2)
144+
compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
142145
{
143146
/*
144147
* Check total cost first since it's more likely to be different; many
145148
* paths have zero startup cost.
146149
*/
147-
if (path1->total_cost > path2->total_cost * 1.01)
150+
if (path1->total_cost > path2->total_cost * fuzz_factor)
148151
{
149152
/* path1 fuzzily worse on total cost */
150-
if (path2->startup_cost > path1->startup_cost * 1.01)
153+
if (path2->startup_cost > path1->startup_cost * fuzz_factor)
151154
{
152155
/* ... but path2 fuzzily worse on startup, so DIFFERENT */
153156
return COSTS_DIFFERENT;
154157
}
155158
/* else path2 dominates */
156159
return COSTS_BETTER2;
157160
}
158-
if (path2->total_cost > path1->total_cost * 1.01)
161+
if (path2->total_cost > path1->total_cost * fuzz_factor)
159162
{
160163
/* path2 fuzzily worse on total cost */
161-
if (path1->startup_cost > path2->startup_cost * 1.01)
164+
if (path1->startup_cost > path2->startup_cost * fuzz_factor)
162165
{
163166
/* ... but path1 fuzzily worse on startup, so DIFFERENT */
164167
return COSTS_DIFFERENT;
@@ -167,12 +170,12 @@ compare_path_costs_fuzzily(Path *path1, Path *path2)
167170
return COSTS_BETTER1;
168171
}
169172
/* fuzzily the same on total cost */
170-
if (path1->startup_cost > path2->startup_cost * 1.01)
173+
if (path1->startup_cost > path2->startup_cost * fuzz_factor)
171174
{
172175
/* ... but path1 fuzzily worse on startup, so path2 wins */
173176
return COSTS_BETTER2;
174177
}
175-
if (path2->startup_cost > path1->startup_cost * 1.01)
178+
if (path2->startup_cost > path1->startup_cost * fuzz_factor)
176179
{
177180
/* ... but path2 fuzzily worse on startup, so path1 wins */
178181
return COSTS_BETTER1;
@@ -354,7 +357,11 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
354357

355358
p1_next = lnext(p1);
356359

357-
costcmp = compare_path_costs_fuzzily(new_path, old_path);
360+
/*
361+
* Do a fuzzy cost comparison with 1% fuzziness limit. (XXX does this
362+
* percentage need to be user-configurable?)
363+
*/
364+
costcmp = compare_path_costs_fuzzily(new_path, old_path, 1.01);
358365

359366
/*
360367
* If the two paths compare differently for startup and total cost,
@@ -403,15 +410,24 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
403410
/*
404411
* Same pathkeys and outer rels, and fuzzily
405412
* the same cost, so keep just one; to decide
406-
* which, first check rows and then do an
407-
* exact cost comparison.
413+
* which, first check rows and then do a fuzzy
414+
* cost comparison with very small fuzz limit.
415+
* (We used to do an exact cost comparison,
416+
* but that results in annoying
417+
* platform-specific plan variations due to
418+
* roundoff in the cost estimates.) If things
419+
* are still tied, arbitrarily keep only the
420+
* old path. Notice that we will keep only
421+
* the old path even if the less-fuzzy
422+
* comparison decides the startup and total
423+
* costs compare differently.
408424
*/
409425
if (new_path->rows < old_path->rows)
410426
remove_old = true; /* new dominates old */
411427
else if (new_path->rows > old_path->rows)
412428
accept_new = false; /* old dominates new */
413-
else if (compare_path_costs(new_path, old_path,
414-
TOTAL_COST) < 0)
429+
else if (compare_path_costs_fuzzily(new_path, old_path,
430+
1.0000000001) == COSTS_BETTER1)
415431
remove_old = true; /* new dominates old */
416432
else
417433
accept_new = false; /* old equals or dominates new */

src/test/regress/expected/join.out

+8-8
Original file line numberDiff line numberDiff line change
@@ -2798,17 +2798,17 @@ select b.unique1 from
27982798
Sort Key: b.unique1
27992799
-> Nested Loop Left Join
28002800
-> Seq Scan on int4_tbl i2
2801-
-> Nested Loop Left Join
2802-
Join Filter: (b.unique1 = 42)
2803-
-> Nested Loop
2801+
-> Nested Loop
2802+
-> Seq Scan on int4_tbl i1
2803+
-> Nested Loop Left Join
2804+
Join Filter: (b.unique1 = 42)
28042805
-> Nested Loop
2805-
-> Seq Scan on int4_tbl i1
28062806
-> Index Scan using tenk1_thous_tenthous on tenk1 b
28072807
Index Cond: ((thousand = i1.f1) AND (i2.f1 = tenthous))
2808-
-> Index Scan using tenk1_unique1 on tenk1 a
2809-
Index Cond: (unique1 = b.unique2)
2810-
-> Index Only Scan using tenk1_thous_tenthous on tenk1 c
2811-
Index Cond: (thousand = a.thousand)
2808+
-> Index Scan using tenk1_unique1 on tenk1 a
2809+
Index Cond: (unique1 = b.unique2)
2810+
-> Index Only Scan using tenk1_thous_tenthous on tenk1 c
2811+
Index Cond: (thousand = a.thousand)
28122812
(15 rows)
28132813

28142814
select b.unique1 from

0 commit comments

Comments
 (0)