Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit e83bb10

Browse files
committed
Adjust definition of cheapest_total_path to work better with LATERAL.
In the initial cut at LATERAL, I kept the rule that cheapest_total_path was always unparameterized, which meant it had to be NULL if the relation has no unparameterized paths. It turns out to work much more nicely if we always have *some* path nominated as cheapest-total for each relation. In particular, let's still say it's the cheapest unparameterized path if there is one; if not, take the cheapest-total-cost path among those of the minimum available parameterization. (The first rule is actually a special case of the second.) This allows reversion of some temporary lobotomizations I'd put in place. In particular, the planner can now consider hash and merge joins for joins below a parameter-supplying nestloop, even if there aren't any unparameterized paths available. This should bring planning of LATERAL-containing queries to the same level as queries not using that feature. Along the way, simplify management of parameterized paths in add_path() and friends. In the original coding for parameterized paths in 9.2, I tried to minimize the logic changes in add_path(), so it just treated parameterization as yet another dimension of comparison for paths. We later made it ignore pathkeys (sort ordering) of parameterized paths, on the grounds that ordering isn't a useful property for the path on the inside of a nestloop, so we might as well get rid of useless parameterized paths as quickly as possible. But we didn't take that reasoning as far as we should have. Startup cost isn't a useful property inside a nestloop either, so add_path() ought to discount startup cost of parameterized paths as well. Having done that, the secondary sorting I'd implemented (in add_parameterized_path) is no longer needed --- any parameterized path that survives add_path() at all is worth considering at higher levels. So this should be a bit faster as well as simpler.
1 parent 9fe6da5 commit e83bb10

File tree

8 files changed

+217
-252
lines changed

8 files changed

+217
-252
lines changed

src/backend/optimizer/README

+16-1
Original file line numberDiff line numberDiff line change
@@ -789,7 +789,7 @@ a nestloop that provides parameters to the lower join's inputs). While we
789789
do not ignore merge joins entirely, joinpath.c does not fully explore the
790790
space of potential merge joins with parameterized inputs. Also, add_path
791791
treats parameterized paths as having no pathkeys, so that they compete
792-
only on cost and rowcount; they don't get preference for producing a
792+
only on total cost and rowcount; they don't get preference for producing a
793793
special sort order. This creates additional bias against merge joins,
794794
since we might discard a path that could have been useful for performing
795795
a merge without an explicit sort step. Since a parameterized path must
@@ -799,4 +799,19 @@ output order of a query --- they only make it harder to use a merge join
799799
at a lower level. The savings in planning work justifies that.
800800

801801

802+
LATERAL subqueries
803+
------------------
804+
805+
As of 9.3 we support SQL-standard LATERAL references from subqueries in
806+
FROM (and also functions in FROM). The planner implements these by
807+
generating parameterized paths for any RTE that contains lateral
808+
references. In such cases, *all* paths for that relation will be
809+
parameterized by at least the set of relations used in its lateral
810+
references. (And in turn, join relations including such a subquery might
811+
not have any unparameterized paths.) All the other comments made above for
812+
parameterized paths still apply, though; in particular, each such path is
813+
still expected to enforce any join clauses that can be pushed down to it,
814+
so that all paths of the same parameterization have the same rowcount.
815+
816+
802817
-- bjm & tgl

src/backend/optimizer/geqo/geqo_eval.c

+2-8
Original file line numberDiff line numberDiff line change
@@ -102,18 +102,12 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
102102
joinrel = gimme_tree(root, tour, num_gene);
103103
best_path = joinrel->cheapest_total_path;
104104

105-
/*
106-
* If no unparameterized path, use the cheapest parameterized path for
107-
* costing purposes. XXX revisit this after LATERAL dust settles
108-
*/
109-
if (!best_path)
110-
best_path = linitial(joinrel->cheapest_parameterized_paths);
111-
112105
/*
113106
* compute fitness
114107
*
115108
* XXX geqo does not currently support optimization for partial result
116-
* retrieval --- how to fix?
109+
* retrieval, nor do we take any cognizance of possible use of
110+
* parameterized paths --- how to fix?
117111
*/
118112
fitness = best_path->total_cost;
119113

src/backend/optimizer/path/allpaths.c

+1-2
Original file line numberDiff line numberDiff line change
@@ -722,7 +722,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
722722
* the unparameterized Append path we are constructing for the parent.
723723
* If not, there's no workable unparameterized path.
724724
*/
725-
if (childrel->cheapest_total_path)
725+
if (childrel->cheapest_total_path->param_info == NULL)
726726
subpaths = accumulate_append_subpath(subpaths,
727727
childrel->cheapest_total_path);
728728
else
@@ -932,7 +932,6 @@ generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
932932
cheapest_startup = cheapest_total =
933933
childrel->cheapest_total_path;
934934
/* Assert we do have an unparameterized path for this child */
935-
Assert(cheapest_total != NULL);
936935
Assert(cheapest_total->param_info == NULL);
937936
}
938937

src/backend/optimizer/path/joinpath.c

+52-33
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,9 @@
2222
#include "optimizer/paths.h"
2323

2424

25+
#define PATH_PARAM_BY_REL(path, rel) \
26+
((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
27+
2528
static void sort_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
2629
RelOptInfo *outerrel, RelOptInfo *innerrel,
2730
List *restrictlist, List *mergeclause_list,
@@ -503,18 +506,24 @@ sort_inner_and_outer(PlannerInfo *root,
503506
* cheapest-startup-cost input paths later, and only if they don't need a
504507
* sort.
505508
*
506-
* This function intentionally does not consider parameterized input paths
507-
* (implicit in the fact that it only looks at cheapest_total_path, which
508-
* is always unparameterized). If we did so, we'd have a combinatorial
509-
* explosion of mergejoin paths of dubious value. This interacts with
510-
* decisions elsewhere that also discriminate against mergejoins with
511-
* parameterized inputs; see comments in src/backend/optimizer/README.
509+
* This function intentionally does not consider parameterized input
510+
* paths, except when the cheapest-total is parameterized. If we did so,
511+
* we'd have a combinatorial explosion of mergejoin paths of dubious
512+
* value. This interacts with decisions elsewhere that also discriminate
513+
* against mergejoins with parameterized inputs; see comments in
514+
* src/backend/optimizer/README.
512515
*/
513516
outer_path = outerrel->cheapest_total_path;
514517
inner_path = innerrel->cheapest_total_path;
515518

516-
/* Punt if either rel has only parameterized paths */
517-
if (!outer_path || !inner_path)
519+
/*
520+
* If either cheapest-total path is parameterized by the other rel, we
521+
* can't use a mergejoin. (There's no use looking for alternative input
522+
* paths, since these should already be the least-parameterized available
523+
* paths.)
524+
*/
525+
if (PATH_PARAM_BY_REL(outer_path, innerrel) ||
526+
PATH_PARAM_BY_REL(inner_path, outerrel))
518527
return;
519528

520529
/*
@@ -714,16 +723,23 @@ match_unsorted_outer(PlannerInfo *root,
714723
break;
715724
}
716725

726+
/*
727+
* If inner_cheapest_total is parameterized by the outer rel, ignore it;
728+
* we will consider it below as a member of cheapest_parameterized_paths,
729+
* but the other possibilities considered in this routine aren't usable.
730+
*/
731+
if (PATH_PARAM_BY_REL(inner_cheapest_total, outerrel))
732+
inner_cheapest_total = NULL;
733+
717734
/*
718735
* If we need to unique-ify the inner path, we will consider only the
719736
* cheapest-total inner.
720737
*/
721738
if (save_jointype == JOIN_UNIQUE_INNER)
722739
{
723-
/* XXX for the moment, don't crash on LATERAL --- rethink this */
740+
/* No way to do this with an inner path parameterized by outer rel */
724741
if (inner_cheapest_total == NULL)
725742
return;
726-
727743
inner_cheapest_total = (Path *)
728744
create_unique_path(root, innerrel, inner_cheapest_total, sjinfo);
729745
Assert(inner_cheapest_total);
@@ -756,15 +772,13 @@ match_unsorted_outer(PlannerInfo *root,
756772
/*
757773
* We cannot use an outer path that is parameterized by the inner rel.
758774
*/
759-
if (bms_overlap(PATH_REQ_OUTER(outerpath), innerrel->relids))
775+
if (PATH_PARAM_BY_REL(outerpath, innerrel))
760776
continue;
761777

762778
/*
763779
* If we need to unique-ify the outer path, it's pointless to consider
764780
* any but the cheapest outer. (XXX we don't consider parameterized
765781
* outers, nor inners, for unique-ified cases. Should we?)
766-
*
767-
* XXX does nothing for LATERAL, rethink
768782
*/
769783
if (save_jointype == JOIN_UNIQUE_OUTER)
770784
{
@@ -844,8 +858,8 @@ match_unsorted_outer(PlannerInfo *root,
844858
if (save_jointype == JOIN_UNIQUE_OUTER)
845859
continue;
846860

847-
/* Can't do anything else if inner has no unparameterized paths */
848-
if (!inner_cheapest_total)
861+
/* Can't do anything else if inner rel is parameterized by outer */
862+
if (inner_cheapest_total == NULL)
849863
continue;
850864

851865
/* Look for useful mergeclauses (if any) */
@@ -1126,10 +1140,14 @@ hash_inner_and_outer(PlannerInfo *root,
11261140
Path *cheapest_total_outer = outerrel->cheapest_total_path;
11271141
Path *cheapest_total_inner = innerrel->cheapest_total_path;
11281142

1129-
/* Punt if either rel has only parameterized paths */
1130-
if (!cheapest_startup_outer ||
1131-
!cheapest_total_outer ||
1132-
!cheapest_total_inner)
1143+
/*
1144+
* If either cheapest-total path is parameterized by the other rel, we
1145+
* can't use a hashjoin. (There's no use looking for alternative
1146+
* input paths, since these should already be the least-parameterized
1147+
* available paths.)
1148+
*/
1149+
if (PATH_PARAM_BY_REL(cheapest_total_outer, innerrel) ||
1150+
PATH_PARAM_BY_REL(cheapest_total_inner, outerrel))
11331151
return;
11341152

11351153
/* Unique-ify if need be; we ignore parameterized possibilities */
@@ -1169,7 +1187,8 @@ hash_inner_and_outer(PlannerInfo *root,
11691187
cheapest_total_inner,
11701188
restrictlist,
11711189
hashclauses);
1172-
if (cheapest_startup_outer != cheapest_total_outer)
1190+
if (cheapest_startup_outer != NULL &&
1191+
cheapest_startup_outer != cheapest_total_outer)
11731192
try_hashjoin_path(root,
11741193
joinrel,
11751194
jointype,
@@ -1193,16 +1212,17 @@ hash_inner_and_outer(PlannerInfo *root,
11931212
ListCell *lc1;
11941213
ListCell *lc2;
11951214

1196-
try_hashjoin_path(root,
1197-
joinrel,
1198-
jointype,
1199-
sjinfo,
1200-
semifactors,
1201-
param_source_rels,
1202-
cheapest_startup_outer,
1203-
cheapest_total_inner,
1204-
restrictlist,
1205-
hashclauses);
1215+
if (cheapest_startup_outer != NULL)
1216+
try_hashjoin_path(root,
1217+
joinrel,
1218+
jointype,
1219+
sjinfo,
1220+
semifactors,
1221+
param_source_rels,
1222+
cheapest_startup_outer,
1223+
cheapest_total_inner,
1224+
restrictlist,
1225+
hashclauses);
12061226

12071227
foreach(lc1, outerrel->cheapest_parameterized_paths)
12081228
{
@@ -1212,7 +1232,7 @@ hash_inner_and_outer(PlannerInfo *root,
12121232
* We cannot use an outer path that is parameterized by the
12131233
* inner rel.
12141234
*/
1215-
if (bms_overlap(PATH_REQ_OUTER(outerpath), innerrel->relids))
1235+
if (PATH_PARAM_BY_REL(outerpath, innerrel))
12161236
continue;
12171237

12181238
foreach(lc2, innerrel->cheapest_parameterized_paths)
@@ -1223,8 +1243,7 @@ hash_inner_and_outer(PlannerInfo *root,
12231243
* We cannot use an inner path that is parameterized by
12241244
* the outer rel, either.
12251245
*/
1226-
if (bms_overlap(PATH_REQ_OUTER(innerpath),
1227-
outerrel->relids))
1246+
if (PATH_PARAM_BY_REL(innerpath, outerrel))
12281247
continue;
12291248

12301249
if (outerpath == cheapest_startup_outer &&

src/backend/optimizer/plan/planmain.c

+2-1
Original file line numberDiff line numberDiff line change
@@ -267,7 +267,8 @@ query_planner(PlannerInfo *root, List *tlist,
267267
*/
268268
final_rel = make_one_rel(root, joinlist);
269269

270-
if (!final_rel || !final_rel->cheapest_total_path)
270+
if (!final_rel || !final_rel->cheapest_total_path ||
271+
final_rel->cheapest_total_path->param_info != NULL)
271272
elog(ERROR, "failed to construct the join relation");
272273

273274
/*

0 commit comments

Comments
 (0)