
Commit d3fdec6

Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans.
When the inner side of a nestloop SEMI or ANTI join is an indexscan that
uses all the join clauses as indexquals, it can be presumed that both
matched and unmatched outer rows will be processed very quickly: for
matched rows, we'll stop after fetching one row from the indexscan, while
for unmatched rows we'll have an indexscan that finds no matching index
entries, which should also be quick. The planner already knew about this,
but it was nonetheless charging for at least one full run of the inner
indexscan, as a consequence of concerns about the behavior of materialized
inner scans --- but those concerns don't apply in the fast case. If the
inner side has low cardinality (many matching rows) this could make an
indexscan plan look far more expensive than it actually is. To fix,
rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we
don't add the inner scan cost until we've inspected the indexquals, and
then we can add either the full-run cost or just the first tuple's cost as
appropriate.

Experimentation with this fix uncovered another problem: add_path and
friends were coded to disregard cheap startup cost when considering
parameterized paths. That's usually okay (and desirable, because it thins
the path herd faster); but in this fast case for SEMI/ANTI joins, it could
result in throwing away the desired plain indexscan path in favor of a
bitmap scan path before we ever get to the join costing logic. In the
many-matching-rows cases of interest here, a bitmap scan will do a lot
more work than required, so this is a problem. To fix, add a per-relation
flag consider_param_startup that works like the existing consider_startup
flag, but applies to parameterized paths, and set it for relations that
are the inside of a SEMI or ANTI join.

To make this patch reasonably safe to back-patch, care has been taken to
avoid changing the planner's behavior except in the very narrow case of
SEMI/ANTI joins with inner indexscans. There are places in
compare_path_costs_fuzzily and add_path_precheck that are not terribly
consistent with the new approach, but changing them will affect planner
decisions at the margins in other cases, so we'll leave that for a
HEAD-only fix.

Back-patch to 9.3; before that, the consider_startup flag didn't exist,
meaning that the second aspect of the patch would be too invasive.

Per a complaint from Peter Holzer and analysis by Tomas Vondra.
1 parent 00ca051 commit d3fdec6

File tree

7 files changed: +175 -86 lines changed

src/backend/nodes/outfuncs.c

Lines changed: 1 addition & 0 deletions
@@ -1742,6 +1742,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_FLOAT_FIELD(rows, "%.0f");
 	WRITE_INT_FIELD(width);
 	WRITE_BOOL_FIELD(consider_startup);
+	WRITE_BOOL_FIELD(consider_param_startup);
 	WRITE_NODE_FIELD(reltargetlist);
 	WRITE_NODE_FIELD(pathlist);
 	WRITE_NODE_FIELD(ppilist);

src/backend/optimizer/README

Lines changed: 8 additions & 1 deletion
@@ -798,7 +798,7 @@ a nestloop that provides parameters to the lower join's inputs). While we
 do not ignore merge joins entirely, joinpath.c does not fully explore the
 space of potential merge joins with parameterized inputs. Also, add_path
 treats parameterized paths as having no pathkeys, so that they compete
-only on total cost and rowcount; they don't get preference for producing a
+only on cost and rowcount; they don't get preference for producing a
 special sort order. This creates additional bias against merge joins,
 since we might discard a path that could have been useful for performing
 a merge without an explicit sort step. Since a parameterized path must
@@ -807,6 +807,13 @@ uninteresting, these choices do not affect any requirement for the final
 output order of a query --- they only make it harder to use a merge join
 at a lower level. The savings in planning work justifies that.
 
+Similarly, parameterized paths do not normally get preference in add_path
+for having cheap startup cost; that's seldom of much value when on the
+inside of a nestloop, so it seems not worth keeping extra paths solely for
+that. An exception occurs for parameterized paths for the RHS relation of
+a SEMI or ANTI join: in those cases, we can stop the inner scan after the
+first match, so it's primarily startup not total cost that we care about.
+
 
 LATERAL subqueries
 ------------------

src/backend/optimizer/path/allpaths.c

Lines changed: 47 additions & 0 deletions
@@ -47,6 +47,7 @@ int geqo_threshold;
 join_search_hook_type join_search_hook = NULL;
 
 
+static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
@@ -131,6 +132,9 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 		root->all_baserels = bms_add_member(root->all_baserels, brel->relid);
 	}
 
+	/* Mark base rels as to whether we care about fast-start plans */
+	set_base_rel_consider_startup(root);
+
 	/*
 	 * Generate access paths for the base rels.
 	 */
@@ -150,6 +154,49 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	return rel;
 }
 
+/*
+ * set_base_rel_consider_startup
+ *	  Set the consider_[param_]startup flags for each base-relation entry.
+ *
+ * For the moment, we only deal with consider_param_startup here; because the
+ * logic for consider_startup is pretty trivial and is the same for every base
+ * relation, we just let build_simple_rel() initialize that flag correctly to
+ * start with.  If that logic ever gets more complicated it would probably
+ * be better to move it here.
+ */
+static void
+set_base_rel_consider_startup(PlannerInfo *root)
+{
+	/*
+	 * Since parameterized paths can only be used on the inside of a nestloop
+	 * join plan, there is usually little value in considering fast-start
+	 * plans for them.  However, for relations that are on the RHS of a SEMI
+	 * or ANTI join, a fast-start plan can be useful because we're only going
+	 * to care about fetching one tuple anyway.
+	 *
+	 * To minimize growth of planning time, we currently restrict this to
+	 * cases where the RHS is a single base relation, not a join; there is no
+	 * provision for consider_param_startup to get set at all on joinrels.
+	 * Also we don't worry about appendrels.  costsize.c's costing rules for
+	 * nestloop semi/antijoins don't consider such cases either.
+	 */
+	ListCell   *lc;
+
+	foreach(lc, root->join_info_list)
+	{
+		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
+
+		if ((sjinfo->jointype == JOIN_SEMI || sjinfo->jointype == JOIN_ANTI) &&
+			bms_membership(sjinfo->syn_righthand) == BMS_SINGLETON)
+		{
+			int			varno = bms_singleton_member(sjinfo->syn_righthand);
+			RelOptInfo *rel = find_base_rel(root, varno);
+
+			rel->consider_param_startup = true;
+		}
+	}
+}
+
 /*
  * set_base_rel_sizes
  *	  Set the size estimates (rows and widths) for each base-relation entry.

src/backend/optimizer/path/costsize.c

Lines changed: 77 additions & 47 deletions
@@ -1654,7 +1654,8 @@ cost_group(Path *path, PlannerInfo *root,
  * estimate and getting a tight lower bound.  We choose to not examine the
  * join quals here, since that's by far the most expensive part of the
  * calculations.  The end result is that CPU-cost considerations must be
- * left for the second phase.
+ * left for the second phase; and for SEMI/ANTI joins, we must also postpone
+ * incorporation of the inner path's run cost.
  *
  * 'workspace' is to be filled with startup_cost, total_cost, and perhaps
  * other data to be used by final_cost_nestloop
@@ -1702,44 +1703,16 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 
 	if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
 	{
-		double		outer_matched_rows;
-		Selectivity inner_scan_frac;
-
 		/*
 		 * SEMI or ANTI join: executor will stop after first match.
 		 *
-		 * For an outer-rel row that has at least one match, we can expect the
-		 * inner scan to stop after a fraction 1/(match_count+1) of the inner
-		 * rows, if the matches are evenly distributed.  Since they probably
-		 * aren't quite evenly distributed, we apply a fuzz factor of 2.0 to
-		 * that fraction.  (If we used a larger fuzz factor, we'd have to
-		 * clamp inner_scan_frac to at most 1.0; but since match_count is at
-		 * least 1, no such clamp is needed now.)
-		 *
-		 * A complicating factor is that rescans may be cheaper than first
-		 * scans.  If we never scan all the way to the end of the inner rel,
-		 * it might be (depending on the plan type) that we'd never pay the
-		 * whole inner first-scan run cost.  However it is difficult to
-		 * estimate whether that will happen, so be conservative and always
-		 * charge the whole first-scan cost once.
-		 */
-		run_cost += inner_run_cost;
-
-		outer_matched_rows = rint(outer_path_rows * semifactors->outer_match_frac);
-		inner_scan_frac = 2.0 / (semifactors->match_count + 1.0);
-
-		/* Add inner run cost for additional outer tuples having matches */
-		if (outer_matched_rows > 1)
-			run_cost += (outer_matched_rows - 1) * inner_rescan_run_cost * inner_scan_frac;
-
-		/*
-		 * The cost of processing unmatched rows varies depending on the
-		 * details of the joinclauses, so we leave that part for later.
+		 * Getting decent estimates requires inspection of the join quals,
+		 * which we choose to postpone to final_cost_nestloop.
 		 */
 
 		/* Save private data for final_cost_nestloop */
-		workspace->outer_matched_rows = outer_matched_rows;
-		workspace->inner_scan_frac = inner_scan_frac;
+		workspace->inner_run_cost = inner_run_cost;
+		workspace->inner_rescan_run_cost = inner_rescan_run_cost;
 	}
 	else
 	{
@@ -1756,7 +1729,6 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_nestloop */
 	workspace->run_cost = run_cost;
-	workspace->inner_rescan_run_cost = inner_rescan_run_cost;
 }
 
 /*
@@ -1780,7 +1752,6 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	double		inner_path_rows = inner_path->rows;
 	Cost		startup_cost = workspace->startup_cost;
 	Cost		run_cost = workspace->run_cost;
-	Cost		inner_rescan_run_cost = workspace->inner_rescan_run_cost;
 	Cost		cpu_per_tuple;
 	QualCost	restrict_qual_cost;
 	double		ntuples;
@@ -1799,42 +1770,101 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	if (!enable_nestloop)
 		startup_cost += disable_cost;
 
-	/* cost of source data */
+	/* cost of inner-relation source data (we already dealt with outer rel) */
 
 	if (path->jointype == JOIN_SEMI || path->jointype == JOIN_ANTI)
 	{
-		double		outer_matched_rows = workspace->outer_matched_rows;
-		Selectivity inner_scan_frac = workspace->inner_scan_frac;
-
 		/*
 		 * SEMI or ANTI join: executor will stop after first match.
 		 */
+		Cost		inner_run_cost = workspace->inner_run_cost;
+		Cost		inner_rescan_run_cost = workspace->inner_rescan_run_cost;
+		double		outer_matched_rows;
+		Selectivity inner_scan_frac;
 
-		/* Compute number of tuples processed (not number emitted!) */
+		/*
+		 * For an outer-rel row that has at least one match, we can expect the
+		 * inner scan to stop after a fraction 1/(match_count+1) of the inner
+		 * rows, if the matches are evenly distributed.  Since they probably
+		 * aren't quite evenly distributed, we apply a fuzz factor of 2.0 to
+		 * that fraction.  (If we used a larger fuzz factor, we'd have to
+		 * clamp inner_scan_frac to at most 1.0; but since match_count is at
+		 * least 1, no such clamp is needed now.)
+		 */
+		outer_matched_rows = rint(outer_path_rows * semifactors->outer_match_frac);
+		inner_scan_frac = 2.0 / (semifactors->match_count + 1.0);
+
+		/*
+		 * Compute number of tuples processed (not number emitted!).  First,
+		 * account for successfully-matched outer rows.
+		 */
 		ntuples = outer_matched_rows * inner_path_rows * inner_scan_frac;
 
 		/*
-		 * For unmatched outer-rel rows, there are two cases.  If the inner
-		 * path is an indexscan using all the joinquals as indexquals, then an
-		 * unmatched row results in an indexscan returning no rows, which is
-		 * probably quite cheap.  We estimate this case as the same cost to
-		 * return the first tuple of a nonempty scan.  Otherwise, the executor
-		 * will have to scan the whole inner rel; not so cheap.
+		 * Now we need to estimate the actual costs of scanning the inner
+		 * relation, which may be quite a bit less than N times inner_run_cost
+		 * due to early scan stops.  We consider two cases.  If the inner path
+		 * is an indexscan using all the joinquals as indexquals, then an
+		 * unmatched outer row results in an indexscan returning no rows,
+		 * which is probably quite cheap.  Otherwise, the executor will have
+		 * to scan the whole inner rel for an unmatched row; not so cheap.
 		 */
 		if (has_indexed_join_quals(path))
 		{
+			/*
+			 * Successfully-matched outer rows will only require scanning
+			 * inner_scan_frac of the inner relation.  In this case, we don't
+			 * need to charge the full inner_run_cost even when that's more
+			 * than inner_rescan_run_cost, because we can assume that none of
+			 * the inner scans ever scan the whole inner relation.  So it's
+			 * okay to assume that all the inner scan executions can be
+			 * fractions of the full cost, even if materialization is reducing
+			 * the rescan cost.  At this writing, it's impossible to get here
+			 * for a materialized inner scan, so inner_run_cost and
+			 * inner_rescan_run_cost will be the same anyway; but just in
+			 * case, use inner_run_cost for the first matched tuple and
+			 * inner_rescan_run_cost for additional ones.
+			 */
+			run_cost += inner_run_cost * inner_scan_frac;
+			if (outer_matched_rows > 1)
+				run_cost += (outer_matched_rows - 1) * inner_rescan_run_cost * inner_scan_frac;
+
+			/*
+			 * Add the cost of inner-scan executions for unmatched outer rows.
+			 * We estimate this as the same cost as returning the first tuple
+			 * of a nonempty scan.  We consider that these are all rescans,
+			 * since we used inner_run_cost once already.
+			 */
 			run_cost += (outer_path_rows - outer_matched_rows) *
 				inner_rescan_run_cost / inner_path_rows;
 
 			/*
-			 * We won't be evaluating any quals at all for these rows, so
+			 * We won't be evaluating any quals at all for unmatched rows, so
 			 * don't add them to ntuples.
 			 */
 		}
 		else
 		{
+			/*
+			 * Here, a complicating factor is that rescans may be cheaper than
+			 * first scans.  If we never scan all the way to the end of the
+			 * inner rel, it might be (depending on the plan type) that we'd
+			 * never pay the whole inner first-scan run cost.  However it is
+			 * difficult to estimate whether that will happen (and it could
+			 * not happen if there are any unmatched outer rows!), so be
+			 * conservative and always charge the whole first-scan cost once.
+			 */
+			run_cost += inner_run_cost;
+
+			/* Add inner run cost for additional outer tuples having matches */
+			if (outer_matched_rows > 1)
+				run_cost += (outer_matched_rows - 1) * inner_rescan_run_cost * inner_scan_frac;
+
+			/* Add inner run cost for unmatched outer tuples */
 			run_cost += (outer_path_rows - outer_matched_rows) *
 				inner_rescan_run_cost;
+
+			/* And count the unmatched join tuples as being processed */
 			ntuples += (outer_path_rows - outer_matched_rows) *
 				inner_path_rows;
 		}
