Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 3ced883

Browse files
committed
Simplify query_planner's API by having it return the top-level RelOptInfo.
Formerly, query_planner returned one or possibly two Paths for the topmost join relation, so that grouping_planner didn't see the join RelOptInfo (at least not directly; it didn't have any hesitation about examining cheapest_path->parent, though). However, correct selection of the Paths involved a significant amount of coupling between query_planner and grouping_planner, a problem which has gotten worse over time. It seems best to give up on this API choice and instead return the topmost RelOptInfo explicitly. Then grouping_planner can pull out the Paths it wants from the rel's path list. In this way we can remove all knowledge of grouping behaviors from query_planner. The only real benefit of the old way is that in the case of an empty FROM clause, we never made any RelOptInfos at all, just a Path. Now we have to gin up a dummy RelOptInfo to represent the empty FROM clause. That's not a very big deal though. While at it, simplify query_planner's API a bit more by having the caller set up root->tuple_fraction and root->limit_tuples, rather than passing those values as separate parameters. Since query_planner no longer does anything with either value, requiring it to fill the PlannerInfo fields seemed pretty arbitrary. This patch just rearranges code; it doesn't (intentionally) change any behaviors. Followup patches will do more interesting things.
1 parent 841c29c commit 3ced883

File tree

8 files changed

+247
-277
lines changed

8 files changed

+247
-277
lines changed

src/backend/optimizer/README

+3-3
Original file line numberDiff line numberDiff line change
@@ -372,10 +372,10 @@ generated during the optimization process are marked with their sort order
372372

373373
It is also possible to avoid an explicit sort step to implement a user's
374374
ORDER BY clause if the final path has the right ordering already, so the
375-
sort ordering is of interest even at the top level. query_planner() will
375+
sort ordering is of interest even at the top level. grouping_planner() will
376376
look for the cheapest path with a sort order matching the desired order,
377-
and grouping_planner() will compare its cost to the cost of using the
378-
cheapest-overall path and doing an explicit sort.
377+
then compare its cost to the cost of using the cheapest-overall path and
378+
doing an explicit sort on that.
379379

380380
When we are generating paths for a particular RelOptInfo, we discard a path
381381
if it is more expensive than another known path that has the same or better

src/backend/optimizer/plan/planagg.c

+21-22
Original file line numberDiff line numberDiff line change
@@ -316,6 +316,7 @@ find_minmax_aggs_walker(Node *node, List **context)
316316
Assert(aggref->agglevelsup == 0);
317317
if (list_length(aggref->args) != 1)
318318
return true; /* it couldn't be MIN/MAX */
319+
319320
/*
320321
* ORDER BY is usually irrelevant for MIN/MAX, but it can change the
321322
* outcome if the aggsortop's operator class recognizes non-identical
@@ -329,6 +330,7 @@ find_minmax_aggs_walker(Node *node, List **context)
329330
*/
330331
if (aggref->aggorder != NIL)
331332
return true;
333+
332334
/*
333335
* We might implement the optimization when a FILTER clause is present
334336
* by adding the filter to the quals of the generated subquery.
@@ -399,9 +401,8 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
399401
TargetEntry *tle;
400402
NullTest *ntest;
401403
SortGroupClause *sortcl;
402-
Path *cheapest_path;
404+
RelOptInfo *final_rel;
403405
Path *sorted_path;
404-
double dNumGroups;
405406
Cost path_cost;
406407
double path_fraction;
407408

@@ -470,37 +471,35 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
470471
* Generate the best paths for this query, telling query_planner that we
471472
* have LIMIT 1.
472473
*/
473-
query_planner(subroot, parse->targetList, 1.0, 1.0,
474-
minmax_qp_callback, NULL,
475-
&cheapest_path, &sorted_path, &dNumGroups);
474+
subroot->tuple_fraction = 1.0;
475+
subroot->limit_tuples = 1.0;
476+
477+
final_rel = query_planner(subroot, parse->targetList,
478+
minmax_qp_callback, NULL);
476479

477480
/*
478-
* Fail if no presorted path. However, if query_planner determines that
479-
* the presorted path is also the cheapest, it will set sorted_path to
480-
* NULL ... don't be fooled. (This is kind of a pain here, but it
481-
* simplifies life for grouping_planner, so leave it be.)
481+
* Get the best presorted path, that being the one that's cheapest for
482+
* fetching just one row. If there's no such path, fail.
482483
*/
484+
if (final_rel->rows > 1.0)
485+
path_fraction = 1.0 / final_rel->rows;
486+
else
487+
path_fraction = 1.0;
488+
489+
sorted_path =
490+
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
491+
subroot->query_pathkeys,
492+
NULL,
493+
path_fraction);
483494
if (!sorted_path)
484-
{
485-
if (cheapest_path &&
486-
pathkeys_contained_in(subroot->sort_pathkeys,
487-
cheapest_path->pathkeys))
488-
sorted_path = cheapest_path;
489-
else
490-
return false;
491-
}
495+
return false;
492496

493497
/*
494498
* Determine cost to get just the first row of the presorted path.
495499
*
496500
* Note: cost calculation here should match
497501
* compare_fractional_path_costs().
498502
*/
499-
if (sorted_path->parent->rows > 1.0)
500-
path_fraction = 1.0 / sorted_path->parent->rows;
501-
else
502-
path_fraction = 1.0;
503-
504503
path_cost = sorted_path->startup_cost +
505504
path_fraction * (sorted_path->total_cost - sorted_path->startup_cost);
506505

src/backend/optimizer/plan/planmain.c

+20-207
Original file line numberDiff line numberDiff line change
@@ -20,14 +20,10 @@
2020
*/
2121
#include "postgres.h"
2222

23-
#include "miscadmin.h"
24-
#include "optimizer/cost.h"
2523
#include "optimizer/pathnode.h"
2624
#include "optimizer/paths.h"
2725
#include "optimizer/placeholder.h"
2826
#include "optimizer/planmain.h"
29-
#include "optimizer/tlist.h"
30-
#include "utils/selfuncs.h"
3127

3228

3329
/*
@@ -36,86 +32,58 @@
3632
* which may involve joins but not any fancier features.
3733
*
3834
* Since query_planner does not handle the toplevel processing (grouping,
39-
* sorting, etc) it cannot select the best path by itself. It selects
40-
* two paths: the cheapest path that produces all the required tuples,
41-
* independent of any ordering considerations, and the cheapest path that
42-
* produces the expected fraction of the required tuples in the required
43-
* ordering, if there is a path that is cheaper for this than just sorting
44-
* the output of the cheapest overall path. The caller (grouping_planner)
45-
* will make the final decision about which to use.
35+
* sorting, etc) it cannot select the best path by itself. Instead, it
36+
* returns the RelOptInfo for the top level of joining, and the caller
37+
* (grouping_planner) can choose one of the surviving paths for the rel.
38+
* Normally it would choose either the rel's cheapest path, or the cheapest
39+
* path for the desired sort order.
4640
*
47-
* Input parameters:
4841
* root describes the query to plan
4942
* tlist is the target list the query should produce
5043
* (this is NOT necessarily root->parse->targetList!)
51-
* tuple_fraction is the fraction of tuples we expect will be retrieved
52-
* limit_tuples is a hard limit on number of tuples to retrieve,
53-
* or -1 if no limit
5444
* qp_callback is a function to compute query_pathkeys once it's safe to do so
5545
* qp_extra is optional extra data to pass to qp_callback
5646
*
57-
* Output parameters:
58-
* *cheapest_path receives the overall-cheapest path for the query
59-
* *sorted_path receives the cheapest presorted path for the query,
60-
* if any (NULL if there is no useful presorted path)
61-
* *num_groups receives the estimated number of groups, or 1 if query
62-
* does not use grouping
63-
*
6447
* Note: the PlannerInfo node also includes a query_pathkeys field, which
6548
* tells query_planner the sort order that is desired in the final output
6649
* plan. This value is *not* available at call time, but is computed by
6750
* qp_callback once we have completed merging the query's equivalence classes.
6851
* (We cannot construct canonical pathkeys until that's done.)
69-
*
70-
* tuple_fraction is interpreted as follows:
71-
* 0: expect all tuples to be retrieved (normal case)
72-
* 0 < tuple_fraction < 1: expect the given fraction of tuples available
73-
* from the plan to be retrieved
74-
* tuple_fraction >= 1: tuple_fraction is the absolute number of tuples
75-
* expected to be retrieved (ie, a LIMIT specification)
76-
* Note that a nonzero tuple_fraction could come from outer context; it is
77-
* therefore not redundant with limit_tuples. We use limit_tuples to determine
78-
* whether a bounded sort can be used at runtime.
7952
*/
80-
void
53+
RelOptInfo *
8154
query_planner(PlannerInfo *root, List *tlist,
82-
double tuple_fraction, double limit_tuples,
83-
query_pathkeys_callback qp_callback, void *qp_extra,
84-
Path **cheapest_path, Path **sorted_path,
85-
double *num_groups)
55+
query_pathkeys_callback qp_callback, void *qp_extra)
8656
{
8757
Query *parse = root->parse;
8858
List *joinlist;
8959
RelOptInfo *final_rel;
90-
Path *cheapestpath;
91-
Path *sortedpath;
9260
Index rti;
9361
double total_pages;
9462

95-
/* Make tuple_fraction, limit_tuples accessible to lower-level routines */
96-
root->tuple_fraction = tuple_fraction;
97-
root->limit_tuples = limit_tuples;
98-
99-
*num_groups = 1; /* default result */
100-
10163
/*
10264
* If the query has an empty join tree, then it's something easy like
10365
* "SELECT 2+2;" or "INSERT ... VALUES()". Fall through quickly.
10466
*/
10567
if (parse->jointree->fromlist == NIL)
10668
{
107-
/* We need a trivial path result */
108-
*cheapest_path = (Path *)
109-
create_result_path((List *) parse->jointree->quals);
110-
*sorted_path = NULL;
69+
/* We need a dummy joinrel to describe the empty set of baserels */
70+
final_rel = build_empty_join_rel(root);
71+
72+
/* The only path for it is a trivial Result path */
73+
add_path(final_rel, (Path *)
74+
create_result_path((List *) parse->jointree->quals));
75+
76+
/* Select cheapest path (pretty easy in this case...) */
77+
set_cheapest(final_rel);
11178

11279
/*
11380
* We still are required to call qp_callback, in case it's something
11481
* like "SELECT 2+2 ORDER BY 1".
11582
*/
11683
root->canon_pathkeys = NIL;
11784
(*qp_callback) (root, qp_extra);
118-
return;
85+
86+
return final_rel;
11987
}
12088

12189
/*
@@ -259,165 +227,10 @@ query_planner(PlannerInfo *root, List *tlist,
259227
*/
260228
final_rel = make_one_rel(root, joinlist);
261229

230+
/* Check that we got at least one usable path */
262231
if (!final_rel || !final_rel->cheapest_total_path ||
263232
final_rel->cheapest_total_path->param_info != NULL)
264233
elog(ERROR, "failed to construct the join relation");
265234

266-
/*
267-
* If there's grouping going on, estimate the number of result groups. We
268-
* couldn't do this any earlier because it depends on relation size
269-
* estimates that were set up above.
270-
*
271-
* Then convert tuple_fraction to fractional form if it is absolute, and
272-
* adjust it based on the knowledge that grouping_planner will be doing
273-
* grouping or aggregation work with our result.
274-
*
275-
* This introduces some undesirable coupling between this code and
276-
* grouping_planner, but the alternatives seem even uglier; we couldn't
277-
* pass back completed paths without making these decisions here.
278-
*/
279-
if (parse->groupClause)
280-
{
281-
List *groupExprs;
282-
283-
groupExprs = get_sortgrouplist_exprs(parse->groupClause,
284-
parse->targetList);
285-
*num_groups = estimate_num_groups(root,
286-
groupExprs,
287-
final_rel->rows);
288-
289-
/*
290-
* In GROUP BY mode, an absolute LIMIT is relative to the number of
291-
* groups not the number of tuples. If the caller gave us a fraction,
292-
* keep it as-is. (In both cases, we are effectively assuming that
293-
* all the groups are about the same size.)
294-
*/
295-
if (tuple_fraction >= 1.0)
296-
tuple_fraction /= *num_groups;
297-
298-
/*
299-
* If both GROUP BY and ORDER BY are specified, we will need two
300-
* levels of sort --- and, therefore, certainly need to read all the
301-
* tuples --- unless ORDER BY is a subset of GROUP BY. Likewise if we
302-
* have both DISTINCT and GROUP BY, or if we have a window
303-
* specification not compatible with the GROUP BY.
304-
*/
305-
if (!pathkeys_contained_in(root->sort_pathkeys, root->group_pathkeys) ||
306-
!pathkeys_contained_in(root->distinct_pathkeys, root->group_pathkeys) ||
307-
!pathkeys_contained_in(root->window_pathkeys, root->group_pathkeys))
308-
tuple_fraction = 0.0;
309-
310-
/* In any case, limit_tuples shouldn't be specified here */
311-
Assert(limit_tuples < 0);
312-
}
313-
else if (parse->hasAggs || root->hasHavingQual)
314-
{
315-
/*
316-
* Ungrouped aggregate will certainly want to read all the tuples, and
317-
* it will deliver a single result row (so leave *num_groups 1).
318-
*/
319-
tuple_fraction = 0.0;
320-
321-
/* limit_tuples shouldn't be specified here */
322-
Assert(limit_tuples < 0);
323-
}
324-
else if (parse->distinctClause)
325-
{
326-
/*
327-
* Since there was no grouping or aggregation, it's reasonable to
328-
* assume the UNIQUE filter has effects comparable to GROUP BY. Return
329-
* the estimated number of output rows for use by caller. (If DISTINCT
330-
* is used with grouping, we ignore its effects for rowcount
331-
* estimation purposes; this amounts to assuming the grouped rows are
332-
* distinct already.)
333-
*/
334-
List *distinctExprs;
335-
336-
distinctExprs = get_sortgrouplist_exprs(parse->distinctClause,
337-
parse->targetList);
338-
*num_groups = estimate_num_groups(root,
339-
distinctExprs,
340-
final_rel->rows);
341-
342-
/*
343-
* Adjust tuple_fraction the same way as for GROUP BY, too.
344-
*/
345-
if (tuple_fraction >= 1.0)
346-
tuple_fraction /= *num_groups;
347-
348-
/* limit_tuples shouldn't be specified here */
349-
Assert(limit_tuples < 0);
350-
}
351-
else
352-
{
353-
/*
354-
* Plain non-grouped, non-aggregated query: an absolute tuple fraction
355-
* can be divided by the number of tuples.
356-
*/
357-
if (tuple_fraction >= 1.0)
358-
tuple_fraction /= final_rel->rows;
359-
}
360-
361-
/*
362-
* Pick out the cheapest-total path and the cheapest presorted path for
363-
* the requested pathkeys (if there is one). We should take the tuple
364-
* fraction into account when selecting the cheapest presorted path, but
365-
* not when selecting the cheapest-total path, since if we have to sort
366-
* then we'll have to fetch all the tuples. (But there's a special case:
367-
* if query_pathkeys is NIL, meaning order doesn't matter, then the
368-
* "cheapest presorted" path will be the cheapest overall for the tuple
369-
* fraction.)
370-
*
371-
* The cheapest-total path is also the one to use if grouping_planner
372-
* decides to use hashed aggregation, so we return it separately even if
373-
* this routine thinks the presorted path is the winner.
374-
*/
375-
cheapestpath = final_rel->cheapest_total_path;
376-
377-
sortedpath =
378-
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
379-
root->query_pathkeys,
380-
NULL,
381-
tuple_fraction);
382-
383-
/* Don't return same path in both guises; just wastes effort */
384-
if (sortedpath == cheapestpath)
385-
sortedpath = NULL;
386-
387-
/*
388-
* Forget about the presorted path if it would be cheaper to sort the
389-
* cheapest-total path. Here we need consider only the behavior at the
390-
* tuple fraction point.
391-
*/
392-
if (sortedpath)
393-
{
394-
Path sort_path; /* dummy for result of cost_sort */
395-
396-
if (root->query_pathkeys == NIL ||
397-
pathkeys_contained_in(root->query_pathkeys,
398-
cheapestpath->pathkeys))
399-
{
400-
/* No sort needed for cheapest path */
401-
sort_path.startup_cost = cheapestpath->startup_cost;
402-
sort_path.total_cost = cheapestpath->total_cost;
403-
}
404-
else
405-
{
406-
/* Figure cost for sorting */
407-
cost_sort(&sort_path, root, root->query_pathkeys,
408-
cheapestpath->total_cost,
409-
final_rel->rows, final_rel->width,
410-
0.0, work_mem, limit_tuples);
411-
}
412-
413-
if (compare_fractional_path_costs(sortedpath, &sort_path,
414-
tuple_fraction) > 0)
415-
{
416-
/* Presorted path is a loser */
417-
sortedpath = NULL;
418-
}
419-
}
420-
421-
*cheapest_path = cheapestpath;
422-
*sorted_path = sortedpath;
235+
return final_rel;
423236
}

0 commit comments

Comments
 (0)