Commit a90c950

Prevent overly large and NaN row estimates in relations
Given a query with enough joins, the query planner's row estimate, after being repeatedly multiplied by the join selectivity, could exceed the limits of the double data type and become infinite. To give an indication of how extreme a case is required to hit this, the particular example reported required 379 joins to a table without any statistics, which resulted in 1.0/DEFAULT_NUM_DISTINCT being used for the join selectivity. This eventually caused the row estimates to go infinite and resulted in an assert failure in initial_cost_mergejoin(), where the infinite row estimate was multiplied by an outerstartsel of 0.0, resulting in NaN. The failing assert verified that NaN <= Inf, which is false.

To get around this we use clamp_row_est() to cap row estimates at a maximum of 1e100. This value is thought to be low enough that costs derived from it would remain within the bounds of what the double type can represent.

Aside from fixing the failing Assert, this has the added benefit that add_path() will still receive proper numerical values as costs, which will allow it to make saner choices when determining the cheaper path in extreme cases such as the one described above.

Additionally, we get rid of the isnan() checks in the join costing functions. The actual case which originally triggered the addition of those checks never made it to the mailing lists. It seems likely that the new code being added to clamp_row_est() will make those checks redundant, so just remove them.

The fairly harmless assert failure problem also exists in the back branches; however, a more minimalistic fix will be applied there.

Reported-by: Onder Kalaci
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/DM6PR21MB1211FF360183BCA901B27F04D80B0@DM6PR21MB1211.namprd21.prod.outlook.com
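The failure mode described in the commit message can be reproduced outside the planner with plain double arithmetic. The following is a minimal sketch, not planner code: the helper names, the figure of 2550 rows per table, and the use of 1.0/200 as a stand-in for 1.0/DEFAULT_NUM_DISTINCT are all assumptions made for illustration. The second function is a simplified form of the check that failed, not the actual Assert.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical helper: stack njoins joins by repeatedly applying
 * rows = rows * table_rows * selec, with no clamping at all.
 */
double
unclamped_join_rows(int njoins, double table_rows, double selec)
{
    double rows = table_rows;

    for (int i = 0; i < njoins; i++)
        rows = rows * table_rows * selec;   /* overflows to +Inf eventually */
    return rows;
}

/*
 * Simplified form of the failing check (an assumption, not the real
 * Assert): the rows skipped at the start must not exceed the total rows.
 */
bool
skip_rows_in_range(double rows, double startsel)
{
    double skip_rows = rows * startsel;     /* Inf * 0.0 yields NaN */

    return skip_rows <= rows;               /* any comparison with NaN is false */
}
```

Under these assumptions, calling unclamped_join_rows(379, 2550.0, 1.0 / 200.0) overflows to infinity, and skip_rows_in_range() then returns false for that value with a start selectivity of 0.0, while a finite estimate such as 1e100 passes the same check.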
1 parent d5a9a66 commit a90c950

1 file changed: +22 -13 lines changed
src/backend/optimizer/path/costsize.c (+22 -13)

@@ -107,6 +107,13 @@
  */
 #define APPEND_CPU_COST_MULTIPLIER 0.5
 
+/*
+ * Maximum value for row estimates. We cap row estimates to this to help
+ * ensure that costs based on these estimates remain within the range of what
+ * double can represent. add_path() wouldn't act sanely given infinite or NaN
+ * cost values.
+ */
+#define MAXIMUM_ROWCOUNT 1e100
 
 double seq_page_cost = DEFAULT_SEQ_PAGE_COST;
 double random_page_cost = DEFAULT_RANDOM_PAGE_COST;
@@ -189,11 +196,14 @@ double
 clamp_row_est(double nrows)
 {
     /*
-     * Force estimate to be at least one row, to make explain output look
-     * better and to avoid possible divide-by-zero when interpolating costs.
-     * Make it an integer, too.
+     * Avoid infinite and NaN row estimates. Costs derived from such values
+     * are going to be useless. Also force the estimate to be at least one
+     * row, to make explain output look better and to avoid possible
+     * divide-by-zero when interpolating costs. Make it an integer, too.
      */
-    if (nrows <= 1.0)
+    if (nrows > MAXIMUM_ROWCOUNT || isnan(nrows))
+        nrows = MAXIMUM_ROWCOUNT;
+    else if (nrows <= 1.0)
         nrows = 1.0;
     else
         nrows = rint(nrows);
@@ -2737,12 +2747,11 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
     QualCost restrict_qual_cost;
     double ntuples;
 
-    /* Protect some assumptions below that rowcounts aren't zero or NaN */
-    if (outer_path_rows <= 0 || isnan(outer_path_rows))
+    /* Protect some assumptions below that rowcounts aren't zero */
+    if (outer_path_rows <= 0)
         outer_path_rows = 1;
-    if (inner_path_rows <= 0 || isnan(inner_path_rows))
+    if (inner_path_rows <= 0)
         inner_path_rows = 1;
-
     /* Mark the path with the correct row estimate */
     if (path->path.param_info)
         path->path.rows = path->path.param_info->ppi_rows;
@@ -2952,10 +2961,10 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
                 innerendsel;
     Path sort_path; /* dummy for result of cost_sort */
 
-    /* Protect some assumptions below that rowcounts aren't zero or NaN */
-    if (outer_path_rows <= 0 || isnan(outer_path_rows))
+    /* Protect some assumptions below that rowcounts aren't zero */
+    if (outer_path_rows <= 0)
         outer_path_rows = 1;
-    if (inner_path_rows <= 0 || isnan(inner_path_rows))
+    if (inner_path_rows <= 0)
         inner_path_rows = 1;
 
     /*
@@ -3185,8 +3194,8 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
                 rescannedtuples;
     double rescanratio;
 
-    /* Protect some assumptions below that rowcounts aren't zero or NaN */
-    if (inner_path_rows <= 0 || isnan(inner_path_rows))
+    /* Protect some assumptions below that rowcounts aren't zero */
+    if (inner_path_rows <= 0)
         inner_path_rows = 1;
 
     /* Mark the path with the correct row estimate */
