
Commit f0a8515

Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans.
When the inner side of a nestloop SEMI or ANTI join is an indexscan that uses all the join clauses as indexquals, it can be presumed that both matched and unmatched outer rows will be processed very quickly: for matched rows, we'll stop after fetching one row from the indexscan, while for unmatched rows we'll have an indexscan that finds no matching index entries, which should also be quick. The planner already knew about this, but it was nonetheless charging for at least one full run of the inner indexscan, as a consequence of concerns about the behavior of materialized inner scans --- but those concerns don't apply in the fast case. If the inner side has low cardinality (many matching rows) this could make an indexscan plan look far more expensive than it actually is. To fix, rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we don't add the inner scan cost until we've inspected the indexquals, and then we can add either the full-run cost or just the first tuple's cost as appropriate.

Experimentation with this fix uncovered another problem: add_path and friends were coded to disregard cheap startup cost when considering parameterized paths. That's usually okay (and desirable, because it thins the path herd faster); but in this fast case for SEMI/ANTI joins, it could result in throwing away the desired plain indexscan path in favor of a bitmap scan path before we ever get to the join costing logic. In the many-matching-rows cases of interest here, a bitmap scan will do a lot more work than required, so this is a problem. To fix, add a per-relation flag consider_param_startup that works like the existing consider_startup flag, but applies to parameterized paths, and set it for relations that are the inside of a SEMI or ANTI join.

To make this patch reasonably safe to back-patch, care has been taken to avoid changing the planner's behavior except in the very narrow case of SEMI/ANTI joins with inner indexscans. There are places in compare_path_costs_fuzzily and add_path_precheck that are not terribly consistent with the new approach, but changing them will affect planner decisions at the margins in other cases, so we'll leave that for a HEAD-only fix.

Back-patch to 9.3; before that, the consider_startup flag didn't exist, meaning that the second aspect of the patch would be too invasive.

Per a complaint from Peter Holzer and analysis by Tomas Vondra.
1 parent de17fe4 commit f0a8515

File tree

7 files changed: +175 -86 lines changed

src/backend/nodes/outfuncs.c

+1
@@ -1745,6 +1745,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_FLOAT_FIELD(rows, "%.0f");
 	WRITE_INT_FIELD(width);
 	WRITE_BOOL_FIELD(consider_startup);
+	WRITE_BOOL_FIELD(consider_param_startup);
 	WRITE_NODE_FIELD(reltargetlist);
 	WRITE_NODE_FIELD(pathlist);
 	WRITE_NODE_FIELD(ppilist);

src/backend/optimizer/README

+8-1
@@ -795,7 +795,7 @@ a nestloop that provides parameters to the lower join's inputs). While we
 do not ignore merge joins entirely, joinpath.c does not fully explore the
 space of potential merge joins with parameterized inputs. Also, add_path
 treats parameterized paths as having no pathkeys, so that they compete
-only on total cost and rowcount; they don't get preference for producing a
+only on cost and rowcount; they don't get preference for producing a
 special sort order. This creates additional bias against merge joins,
 since we might discard a path that could have been useful for performing
 a merge without an explicit sort step. Since a parameterized path must
@@ -804,6 +804,13 @@ uninteresting, these choices do not affect any requirement for the final
 output order of a query --- they only make it harder to use a merge join
 at a lower level. The savings in planning work justifies that.
 
+Similarly, parameterized paths do not normally get preference in add_path
+for having cheap startup cost; that's seldom of much value when on the
+inside of a nestloop, so it seems not worth keeping extra paths solely for
+that. An exception occurs for parameterized paths for the RHS relation of
+a SEMI or ANTI join: in those cases, we can stop the inner scan after the
+first match, so it's primarily startup not total cost that we care about.
+
 
 LATERAL subqueries
 ------------------

src/backend/optimizer/path/allpaths.c

+47
@@ -56,6 +56,7 @@ int geqo_threshold;
 join_search_hook_type join_search_hook = NULL;
 
 
+static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
@@ -141,6 +142,9 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 		root->all_baserels = bms_add_member(root->all_baserels, brel->relid);
 	}
 
+	/* Mark base rels as to whether we care about fast-start plans */
+	set_base_rel_consider_startup(root);
+
 	/*
 	 * Generate access paths for the base rels.
 	 */
@@ -160,6 +164,49 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	return rel;
 }
 
+/*
+ * set_base_rel_consider_startup
+ *	  Set the consider_[param_]startup flags for each base-relation entry.
+ *
+ * For the moment, we only deal with consider_param_startup here; because the
+ * logic for consider_startup is pretty trivial and is the same for every base
+ * relation, we just let build_simple_rel() initialize that flag correctly to
+ * start with.  If that logic ever gets more complicated it would probably
+ * be better to move it here.
+ */
+static void
+set_base_rel_consider_startup(PlannerInfo *root)
+{
+	/*
+	 * Since parameterized paths can only be used on the inside of a nestloop
+	 * join plan, there is usually little value in considering fast-start
+	 * plans for them.  However, for relations that are on the RHS of a SEMI
+	 * or ANTI join, a fast-start plan can be useful because we're only going
+	 * to care about fetching one tuple anyway.
+	 *
+	 * To minimize growth of planning time, we currently restrict this to
+	 * cases where the RHS is a single base relation, not a join; there is no
+	 * provision for consider_param_startup to get set at all on joinrels.
+	 * Also we don't worry about appendrels.  costsize.c's costing rules for
+	 * nestloop semi/antijoins don't consider such cases either.
+	 */
+	ListCell   *lc;
+
+	foreach(lc, root->join_info_list)
+	{
+		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
+
+		if ((sjinfo->jointype == JOIN_SEMI || sjinfo->jointype == JOIN_ANTI) &&
+			bms_membership(sjinfo->syn_righthand) == BMS_SINGLETON)
+		{
+			int			varno = bms_singleton_member(sjinfo->syn_righthand);
+			RelOptInfo *rel = find_base_rel(root, varno);
+
+			rel->consider_param_startup = true;
+		}
+	}
+}
+
 /*
  * set_base_rel_sizes
  *	  Set the size estimates (rows and widths) for each base-relation entry.

src/backend/optimizer/path/costsize.c

+77-47
@@ -1662,7 +1662,8 @@ cost_group(Path *path, PlannerInfo *root,
  * estimate and getting a tight lower bound.  We choose to not examine the
  * join quals here, since that's by far the most expensive part of the
  * calculations.  The end result is that CPU-cost considerations must be
- * left for the second phase.
+ * left for the second phase; and for SEMI/ANTI joins, we must also postpone
+ * incorporation of the inner path's run cost.
  *
  * 'workspace' is to be filled with startup_cost, total_cost, and perhaps
  * other data to be used by final_cost_nestloop
@@ -1710,44 +1711,16 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 
 	if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
 	{
-		double		outer_matched_rows;
-		Selectivity inner_scan_frac;
-
 		/*
 		 * SEMI or ANTI join: executor will stop after first match.
 		 *
-		 * For an outer-rel row that has at least one match, we can expect the
-		 * inner scan to stop after a fraction 1/(match_count+1) of the inner
-		 * rows, if the matches are evenly distributed.  Since they probably
-		 * aren't quite evenly distributed, we apply a fuzz factor of 2.0 to
-		 * that fraction.  (If we used a larger fuzz factor, we'd have to
-		 * clamp inner_scan_frac to at most 1.0; but since match_count is at
-		 * least 1, no such clamp is needed now.)
-		 *
-		 * A complicating factor is that rescans may be cheaper than first
-		 * scans.  If we never scan all the way to the end of the inner rel,
-		 * it might be (depending on the plan type) that we'd never pay the
-		 * whole inner first-scan run cost.  However it is difficult to
-		 * estimate whether that will happen, so be conservative and always
-		 * charge the whole first-scan cost once.
-		 */
-		run_cost += inner_run_cost;
-
-		outer_matched_rows = rint(outer_path_rows * semifactors->outer_match_frac);
-		inner_scan_frac = 2.0 / (semifactors->match_count + 1.0);
-
-		/* Add inner run cost for additional outer tuples having matches */
-		if (outer_matched_rows > 1)
-			run_cost += (outer_matched_rows - 1) * inner_rescan_run_cost * inner_scan_frac;
-
-		/*
-		 * The cost of processing unmatched rows varies depending on the
-		 * details of the joinclauses, so we leave that part for later.
+		 * Getting decent estimates requires inspection of the join quals,
+		 * which we choose to postpone to final_cost_nestloop.
 		 */
 
 		/* Save private data for final_cost_nestloop */
-		workspace->outer_matched_rows = outer_matched_rows;
-		workspace->inner_scan_frac = inner_scan_frac;
+		workspace->inner_run_cost = inner_run_cost;
+		workspace->inner_rescan_run_cost = inner_rescan_run_cost;
 	}
 	else
 	{
@@ -1764,7 +1737,6 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_nestloop */
 	workspace->run_cost = run_cost;
-	workspace->inner_rescan_run_cost = inner_rescan_run_cost;
 }
 
 /*
@@ -1788,7 +1760,6 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	double		inner_path_rows = inner_path->rows;
 	Cost		startup_cost = workspace->startup_cost;
 	Cost		run_cost = workspace->run_cost;
-	Cost		inner_rescan_run_cost = workspace->inner_rescan_run_cost;
 	Cost		cpu_per_tuple;
 	QualCost	restrict_qual_cost;
 	double		ntuples;
@@ -1807,42 +1778,101 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	if (!enable_nestloop)
 		startup_cost += disable_cost;
 
-	/* cost of source data */
+	/* cost of inner-relation source data (we already dealt with outer rel) */
 
 	if (path->jointype == JOIN_SEMI || path->jointype == JOIN_ANTI)
 	{
-		double		outer_matched_rows = workspace->outer_matched_rows;
-		Selectivity inner_scan_frac = workspace->inner_scan_frac;
-
 		/*
 		 * SEMI or ANTI join: executor will stop after first match.
 		 */
+		Cost		inner_run_cost = workspace->inner_run_cost;
+		Cost		inner_rescan_run_cost = workspace->inner_rescan_run_cost;
+		double		outer_matched_rows;
+		Selectivity inner_scan_frac;
 
-		/* Compute number of tuples processed (not number emitted!) */
+		/*
+		 * For an outer-rel row that has at least one match, we can expect the
+		 * inner scan to stop after a fraction 1/(match_count+1) of the inner
+		 * rows, if the matches are evenly distributed.  Since they probably
+		 * aren't quite evenly distributed, we apply a fuzz factor of 2.0 to
+		 * that fraction.  (If we used a larger fuzz factor, we'd have to
+		 * clamp inner_scan_frac to at most 1.0; but since match_count is at
+		 * least 1, no such clamp is needed now.)
+		 */
+		outer_matched_rows = rint(outer_path_rows * semifactors->outer_match_frac);
+		inner_scan_frac = 2.0 / (semifactors->match_count + 1.0);
+
+		/*
+		 * Compute number of tuples processed (not number emitted!).  First,
+		 * account for successfully-matched outer rows.
+		 */
 		ntuples = outer_matched_rows * inner_path_rows * inner_scan_frac;
 
 		/*
-		 * For unmatched outer-rel rows, there are two cases.  If the inner
-		 * path is an indexscan using all the joinquals as indexquals, then an
-		 * unmatched row results in an indexscan returning no rows, which is
-		 * probably quite cheap.  We estimate this case as the same cost to
-		 * return the first tuple of a nonempty scan.  Otherwise, the executor
-		 * will have to scan the whole inner rel; not so cheap.
+		 * Now we need to estimate the actual costs of scanning the inner
+		 * relation, which may be quite a bit less than N times inner_run_cost
+		 * due to early scan stops.  We consider two cases.  If the inner path
+		 * is an indexscan using all the joinquals as indexquals, then an
+		 * unmatched outer row results in an indexscan returning no rows,
+		 * which is probably quite cheap.  Otherwise, the executor will have
+		 * to scan the whole inner rel for an unmatched row; not so cheap.
		 */
 		if (has_indexed_join_quals(path))
 		{
+			/*
+			 * Successfully-matched outer rows will only require scanning
+			 * inner_scan_frac of the inner relation.  In this case, we don't
+			 * need to charge the full inner_run_cost even when that's more
+			 * than inner_rescan_run_cost, because we can assume that none of
+			 * the inner scans ever scan the whole inner relation.  So it's
+			 * okay to assume that all the inner scan executions can be
+			 * fractions of the full cost, even if materialization is reducing
+			 * the rescan cost.  At this writing, it's impossible to get here
+			 * for a materialized inner scan, so inner_run_cost and
+			 * inner_rescan_run_cost will be the same anyway; but just in
+			 * case, use inner_run_cost for the first matched tuple and
+			 * inner_rescan_run_cost for additional ones.
+			 */
+			run_cost += inner_run_cost * inner_scan_frac;
+			if (outer_matched_rows > 1)
+				run_cost += (outer_matched_rows - 1) * inner_rescan_run_cost * inner_scan_frac;
+
+			/*
+			 * Add the cost of inner-scan executions for unmatched outer rows.
+			 * We estimate this as the same cost as returning the first tuple
+			 * of a nonempty scan.  We consider that these are all rescans,
+			 * since we used inner_run_cost once already.
+			 */
 			run_cost += (outer_path_rows - outer_matched_rows) *
 				inner_rescan_run_cost / inner_path_rows;
 
 			/*
-			 * We won't be evaluating any quals at all for these rows, so
+			 * We won't be evaluating any quals at all for unmatched rows, so
 			 * don't add them to ntuples.
 			 */
 		}
 		else
 		{
+			/*
+			 * Here, a complicating factor is that rescans may be cheaper than
+			 * first scans.  If we never scan all the way to the end of the
+			 * inner rel, it might be (depending on the plan type) that we'd
+			 * never pay the whole inner first-scan run cost.  However it is
+			 * difficult to estimate whether that will happen (and it could
+			 * not happen if there are any unmatched outer rows!), so be
+			 * conservative and always charge the whole first-scan cost once.
+			 */
+			run_cost += inner_run_cost;
+
+			/* Add inner run cost for additional outer tuples having matches */
+			if (outer_matched_rows > 1)
+				run_cost += (outer_matched_rows - 1) * inner_rescan_run_cost * inner_scan_frac;
+
+			/* Add inner run cost for unmatched outer tuples */
 			run_cost += (outer_path_rows - outer_matched_rows) *
 				inner_rescan_run_cost;
+
+			/* And count the unmatched join tuples as being processed */
 			ntuples += (outer_path_rows - outer_matched_rows) *
 				inner_path_rows;
 		}
