Commit 14231a4

Avoid creation of useless EquivalenceClasses during planning.
Zoltan Boszormenyi exhibited a test case in which planning time was dominated by construction of EquivalenceClasses and PathKeys that had no actual relevance to the query (and in fact got discarded immediately). This happened because we generated PathKeys describing the sort ordering of every index on every table in the query, and only after that checked to see if the sort ordering was relevant. The EC/PK construction code is O(N^2) in the number of ECs, which is all right for the intended number of such objects, but it gets out of hand if there are ECs for lots of irrelevant indexes.

To fix, twiddle the handling of mergeclauses a little bit to ensure that every interesting EC is created before we begin path generation. (This doesn't cost anything --- in fact I think it's a bit cheaper than before --- since we always eventually created those ECs anyway.) Then, if an index column can't be found in any pre-existing EC, we know that that sort ordering is irrelevant for the query. Instead of creating a useless EC, we can just not build a pathkey for the index column in the first place. The index will still be considered if it's useful for non-order-related reasons, but we will think of its output as unsorted.
1 parent f184de3 commit 14231a4
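
The lookup-only idea in the commit message can be sketched in plain C with toy types. This is an illustration, not PostgreSQL code: `ToyPlanner`, `toy_get_eclass`, and `toy_build_index_pathkeys` are hypothetical names, an EC is reduced to a single column id, and the sketch assumes pathkey construction simply stops at the first index column that matches no existing EC.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ECS 16

/* Toy stand-in for the planner's EC list: each EC holds one column id. */
typedef struct ToyPlanner
{
    int ec_cols[TOY_MAX_ECS];
    int n_ecs;
} ToyPlanner;

/*
 * Return the index of the EC containing col. If there is none, either
 * create a single-member EC (create_it == true) or return -1.
 */
static int
toy_get_eclass(ToyPlanner *p, int col, bool create_it)
{
    for (int i = 0; i < p->n_ecs; i++)
    {
        if (p->ec_cols[i] == col)
            return i;
    }
    if (!create_it)
        return -1;              /* caller only wanted a lookup */
    assert(p->n_ecs < TOY_MAX_ECS);
    p->ec_cols[p->n_ecs] = col;
    return p->n_ecs++;
}

/*
 * Build "pathkeys" (here just EC indexes) for an index's columns.
 * Lookup only: a column with no pre-existing EC ends the list, so an
 * irrelevant index yields zero pathkeys and creates no new EC.
 * Returns the number of pathkeys written to pathkeys_out.
 */
static int
toy_build_index_pathkeys(ToyPlanner *p, const int *index_cols, int ncols,
                         int *pathkeys_out)
{
    int n = 0;

    for (int i = 0; i < ncols; i++)
    {
        int ec = toy_get_eclass(p, index_cols[i], false);

        if (ec < 0)
            break;              /* ordering beyond this point is irrelevant */
        pathkeys_out[n++] = ec;
    }
    return n;
}
```

In this toy model an index whose leading column is in no EC is still usable as an index; it just reports no sort ordering, which mirrors the "think of its output as unsorted" wording above.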

6 files changed: +220 -62 lines changed

src/backend/optimizer/README

+33-3
@@ -632,9 +632,39 @@ sort ordering was important; and so using the same PathKey for both sort
 orderings doesn't create any real problem.
 
 
+Order of processing for EquivalenceClasses and PathKeys
+-------------------------------------------------------
+
+As alluded to above, there is a specific sequence of phases in the
+processing of EquivalenceClasses and PathKeys during planning. During the
+initial scanning of the query's quals (deconstruct_jointree followed by
+reconsider_outer_join_clauses), we construct EquivalenceClasses based on
+mergejoinable clauses found in the quals. At the end of this process,
+we know all we can know about equivalence of different variables, so
+subsequently there will be no further merging of EquivalenceClasses.
+At that point it is possible to consider the EquivalenceClasses as
+"canonical" and build canonical PathKeys that reference them. Before
+we reach that point (actually, before entering query_planner at all)
+we also ensure that we have constructed EquivalenceClasses for all the
+expressions used in the query's ORDER BY and related clauses. These
+classes might or might not get merged together, depending on what we
+find in the quals.
+
+Because all the EquivalenceClasses are known before we begin path
+generation, we can use them as a guide to which indexes are of interest:
+if an index's column is not mentioned in any EquivalenceClass then that
+index's sort order cannot possibly be helpful for the query. This allows
+short-circuiting of much of the processing of create_index_paths() for
+irrelevant indexes.
+
+There are some cases where planner.c constructs additional
+EquivalenceClasses and PathKeys after query_planner has completed.
+In these cases, the extra ECs/PKs are needed to represent sort orders
+that were not considered during query_planner. Such situations should be
+minimized since it is impossible for query_planner to return a plan
+producing such a sort order, meaning a explicit sort will always be needed.
+Currently this happens only for queries involving multiple window functions
+with different orderings, for which extra sorts are needed anyway.
 
-Though Bob Devine <bob.devine@worldnet.att.net> was not involved in the
-coding of our optimizer, he is available to field questions about
-optimizer topics.
 
 -- bjm & tgl
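
The phase ordering described in the new README section can be modelled with a small toy in C. Everything here is an assumption made for illustration (`ToyEcState`, the `canonical` flag, and the `toy_*` functions do not exist in PostgreSQL): variables are small integers, an EC is just an id stored per variable, and merging is refused once qual scanning is declared finished.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_VARS 8

typedef struct ToyEcState
{
    int  ec_of[TOY_MAX_VARS];   /* EC id per variable, -1 = in no EC */
    bool canonical;             /* true once qual scanning is finished */
} ToyEcState;

static void
toy_init(ToyEcState *s)
{
    for (int v = 0; v < TOY_MAX_VARS; v++)
        s->ec_of[v] = -1;
    s->canonical = false;
}

/* Give var a single-member EC if it has none (as for ORDER BY exprs). */
static void
toy_ensure_ec(ToyEcState *s, int var)
{
    assert(var >= 0 && var < TOY_MAX_VARS);
    if (s->ec_of[var] < 0)
        s->ec_of[var] = var;
}

/*
 * Handle a mergejoinable clause "a = b" by merging the two ECs.
 * Only allowed while scanning quals; returns false once ECs are canonical.
 */
static bool
toy_process_equivalence(ToyEcState *s, int a, int b)
{
    if (s->canonical)
        return false;
    toy_ensure_ec(s, a);
    toy_ensure_ec(s, b);

    int keep = s->ec_of[a];
    int drop = s->ec_of[b];

    for (int v = 0; v < TOY_MAX_VARS; v++)
    {
        if (s->ec_of[v] == drop)
            s->ec_of[v] = keep;
    }
    return true;
}

/* End of qual scanning: no further merging from here on. */
static void
toy_finish_qual_scan(ToyEcState *s)
{
    s->canonical = true;
}

/* An index column's ordering can matter only if some EC mentions it. */
static bool
toy_sort_order_relevant(const ToyEcState *s, int var)
{
    return s->canonical && s->ec_of[var] >= 0;
}
```

A typical sequence in this model is `toy_ensure_ec` for the ORDER BY variables, then `toy_process_equivalence` for each qual, then `toy_finish_qual_scan`, after which `toy_sort_order_relevant` can be used to filter index columns.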

src/backend/optimizer/path/equivclass.c

+38-7
@@ -78,6 +78,10 @@ static bool reconsider_full_join_clause(PlannerInfo *root,
  * join. (This is the reason why we need a failure return. It's more
  * convenient to check this case here than at the call sites...)
  *
+ * On success return, we have also initialized the clause's left_ec/right_ec
+ * fields to point to the EquivalenceClass representing it. This saves lookup
+ * effort later.
+ *
  * Note: constructing merged EquivalenceClasses is a standard UNION-FIND
  * problem, for which there exist better data structures than simple lists.
  * If this code ever proves to be a bottleneck then it could be sped up ---
@@ -106,6 +110,10 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
                       *em2;
     ListCell *lc1;
 
+    /* Should not already be marked as having generated an eclass */
+    Assert(restrictinfo->left_ec == NULL);
+    Assert(restrictinfo->right_ec == NULL);
+
     /* Extract info from given clause */
     Assert(is_opclause(clause));
     opno = ((OpExpr *) clause)->opno;
@@ -236,8 +244,10 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
         {
             ec1->ec_sources = lappend(ec1->ec_sources, restrictinfo);
             ec1->ec_below_outer_join |= below_outer_join;
+            /* mark the RI as associated with this eclass */
+            restrictinfo->left_ec = ec1;
+            restrictinfo->right_ec = ec1;
             /* mark the RI as usable with this pair of EMs */
-            /* NB: can't set left_ec/right_ec until merging is finished */
             restrictinfo->left_em = em1;
             restrictinfo->right_em = em2;
             return true;
@@ -266,6 +276,9 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
         ec2->ec_relids = NULL;
         ec1->ec_sources = lappend(ec1->ec_sources, restrictinfo);
         ec1->ec_below_outer_join |= below_outer_join;
+        /* mark the RI as associated with this eclass */
+        restrictinfo->left_ec = ec1;
+        restrictinfo->right_ec = ec1;
         /* mark the RI as usable with this pair of EMs */
         restrictinfo->left_em = em1;
         restrictinfo->right_em = em2;
@@ -276,6 +289,9 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
         em2 = add_eq_member(ec1, item2, item2_relids, false, item2_type);
         ec1->ec_sources = lappend(ec1->ec_sources, restrictinfo);
         ec1->ec_below_outer_join |= below_outer_join;
+        /* mark the RI as associated with this eclass */
+        restrictinfo->left_ec = ec1;
+        restrictinfo->right_ec = ec1;
         /* mark the RI as usable with this pair of EMs */
         restrictinfo->left_em = em1;
         restrictinfo->right_em = em2;
@@ -286,6 +302,9 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
         em1 = add_eq_member(ec2, item1, item1_relids, false, item1_type);
         ec2->ec_sources = lappend(ec2->ec_sources, restrictinfo);
         ec2->ec_below_outer_join |= below_outer_join;
+        /* mark the RI as associated with this eclass */
+        restrictinfo->left_ec = ec2;
+        restrictinfo->right_ec = ec2;
         /* mark the RI as usable with this pair of EMs */
         restrictinfo->left_em = em1;
         restrictinfo->right_em = em2;
@@ -311,6 +330,9 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
 
         root->eq_classes = lappend(root->eq_classes, ec);
 
+        /* mark the RI as associated with this eclass */
+        restrictinfo->left_ec = ec;
+        restrictinfo->right_ec = ec;
         /* mark the RI as usable with this pair of EMs */
         restrictinfo->left_em = em1;
         restrictinfo->right_em = em2;
@@ -362,15 +384,19 @@ add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
 /*
  * get_eclass_for_sort_expr
  *    Given an expression and opfamily info, find an existing equivalence
- *    class it is a member of; if none, build a new single-member
+ *    class it is a member of; if none, optionally build a new single-member
  *    EquivalenceClass for it.
  *
  * sortref is the SortGroupRef of the originating SortGroupClause, if any,
  * or zero if not. (It should never be zero if the expression is volatile!)
  *
+ * If create_it is TRUE, we'll build a new EquivalenceClass when there is no
+ * match. If create_it is FALSE, we just return NULL when no match.
+ *
  * This can be used safely both before and after EquivalenceClass merging;
  * since it never causes merging it does not invalidate any existing ECs
- * or PathKeys.
+ * or PathKeys. However, ECs added after path generation has begun are
+ * of limited usefulness, so usually it's best to create them beforehand.
  *
  * Note: opfamilies must be chosen consistently with the way
  * process_equivalence() would do; that is, generated from a mergejoinable
@@ -382,7 +408,8 @@ get_eclass_for_sort_expr(PlannerInfo *root,
                          Expr *expr,
                          Oid expr_datatype,
                          List *opfamilies,
-                         Index sortref)
+                         Index sortref,
+                         bool create_it)
 {
     EquivalenceClass *newec;
     EquivalenceMember *newem;
@@ -426,8 +453,12 @@ get_eclass_for_sort_expr(PlannerInfo *root,
         }
     }
 
+    /* No match; does caller want a NULL result? */
+    if (!create_it)
+        return NULL;
+
     /*
-     * No match, so build a new single-member EC
+     * OK, build a new single-member EC
      *
      * Here, we must be sure that we construct the EC in the right context. We
      * can assume, however, that the passed expr is long-lived.
@@ -1094,8 +1125,8 @@ create_join_clause(PlannerInfo *root,
     rinfo->parent_ec = parent_ec;
 
     /*
-     * We can set these now, rather than letting them be looked up later,
-     * since this is only used after EC merging is complete.
+     * We know the correct values for left_ec/right_ec, ie this particular EC,
+     * so we can just set them directly instead of forcing another lookup.
      */
     rinfo->left_ec = ec;
     rinfo->right_ec = ec;

src/backend/optimizer/path/joinpath.c

+1-1
@@ -1041,7 +1041,7 @@ select_mergejoin_clauses(PlannerInfo *root,
          * mergejoin is not really all that big a deal, and so it's not clear
          * that improving this is important.
          */
-        cache_mergeclause_eclasses(root, restrictinfo);
+        update_mergeclause_eclasses(root, restrictinfo);
 
         if (EC_MUST_BE_REDUNDANT(restrictinfo->left_ec) ||
             EC_MUST_BE_REDUNDANT(restrictinfo->right_ec))
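
The diff above sets left_ec/right_ec as soon as a clause is processed, and the call site now "updates" rather than "caches" them. The body of update_mergeclause_eclasses is not shown in this excerpt, so the following is only a guess at the general pattern, written as a self-contained C toy: it assumes that an EC absorbed by a later merge keeps a forwarding pointer to its replacement, and that updating means following those pointers. `ToyEC`, `ToyClause`, `merged_into`, and the `toy_*` functions are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyEC
{
    int           id;
    struct ToyEC *merged_into;  /* non-NULL once absorbed by another EC */
} ToyEC;

typedef struct ToyClause
{
    ToyEC *left_ec;
    ToyEC *right_ec;
} ToyClause;

/* Record the EC on the clause at the time the clause is processed. */
static void
toy_mark_clause(ToyClause *c, ToyEC *ec)
{
    c->left_ec = ec;
    c->right_ec = ec;
}

/* Absorb loser into winner, leaving a forwarding pointer behind. */
static void
toy_merge(ToyEC *winner, ToyEC *loser)
{
    loser->merged_into = winner;
}

/* Refresh possibly stale pointers by chasing forwarding links. */
static void
toy_update_clause_ecs(ToyClause *c)
{
    assert(c->left_ec != NULL && c->right_ec != NULL);
    while (c->left_ec->merged_into != NULL)
        c->left_ec = c->left_ec->merged_into;
    while (c->right_ec->merged_into != NULL)
        c->right_ec = c->right_ec->merged_into;
}
```

Under these assumptions the early assignment is cheap and never wrong for long: a pointer that was correct when set can go stale only through a later merge, and a short pointer chase repairs it without searching the full EC list again.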
