
Commit e3b9852

Teach planner how to rearrange join order for some classes of OUTER JOIN.
Per my recent proposal. I ended up basing the implementation on the existing mechanism for enforcing valid join orders of IN joins --- the rules for valid outer-join orders are somewhat similar.
1 parent 1a6aaaa commit e3b9852

23 files changed, 955 additions and 708 deletions.

doc/src/sgml/config.sgml

Lines changed: 10 additions & 33 deletions
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.38 2005/12/09 15:51:13 petere Exp $
+$PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.39 2005/12/20 02:30:35 tgl Exp $
 -->
 <chapter Id="runtime-config">
 <title>Server Configuration</title>
@@ -2028,6 +2028,7 @@ SELECT * FROM parent WHERE key = 2400;
 this many items. Smaller values reduce planning time but may
 yield inferior query plans. The default is 8. It is usually
 wise to keep this less than <xref linkend="guc-geqo-threshold">.
+For more information see <xref linkend="explicit-joins">.
 </para>
 </listitem>
 </varlistentry>
@@ -2039,48 +2040,24 @@ SELECT * FROM parent WHERE key = 2400;
 </indexterm>
 <listitem>
 <para>
-The planner will rewrite explicit inner <literal>JOIN</>
-constructs into lists of <literal>FROM</> items whenever a
-list of no more than this many items in total would
-result. Prior to <productname>PostgreSQL</> 7.4, joins
-specified via the <literal>JOIN</literal> construct would
-never be reordered by the query planner. The query planner has
-subsequently been improved so that inner joins written in this
-form can be reordered; this configuration parameter controls
-the extent to which this reordering is performed.
-<note>
-<para>
-At present, the order of outer joins specified via the
-<literal>JOIN</> construct is never adjusted by the query
-planner; therefore, <varname>join_collapse_limit</> has no
-effect on this behavior. The planner may be improved to
-reorder some classes of outer joins in a future release of
-<productname>PostgreSQL</productname>.
-</para>
-</note>
+The planner will rewrite explicit <literal>JOIN</>
+constructs (except <literal>FULL JOIN</>s) into lists of
+<literal>FROM</> items whenever a list of no more than this many items
+would result. Smaller values reduce planning time but may
+yield inferior query plans.
 </para>
 
 <para>
 By default, this variable is set the same as
 <varname>from_collapse_limit</varname>, which is appropriate
 for most uses. Setting it to 1 prevents any reordering of
-inner <literal>JOIN</>s. Thus, the explicit join order
+explicit <literal>JOIN</>s. Thus, the explicit join order
 specified in the query will be the actual order in which the
 relations are joined. The query planner does not always choose
 the optimal join order; advanced users may elect to
 temporarily set this variable to 1, and then specify the join
-order they desire explicitly. Another consequence of setting
-this variable to 1 is that the query planner will behave more
-like the <productname>PostgreSQL</productname> 7.3 query
-planner, which some users might find useful for backward
-compatibility reasons.
-</para>
-
-<para>
-Setting this variable to a value between 1 and
-<varname>from_collapse_limit</varname> might be useful to
-trade off planning time against the quality of the chosen plan
-(higher values produce better plans).
+order they desire explicitly.
+For more information see <xref linkend="explicit-joins">.
 </para>
 </listitem>
 </varlistentry>

doc/src/sgml/perform.sgml

Lines changed: 28 additions & 13 deletions
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.54 2005/11/04 23:14:00 petere Exp $
+$PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.55 2005/12/20 02:30:35 tgl Exp $
 -->
 
 <chapter id="performance-tips">
@@ -627,7 +627,7 @@ SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
 </para>
 
 <para>
-When the query involves outer joins, the planner has much less freedom
+When the query involves outer joins, the planner has less freedom
 than it does for plain (inner) joins. For example, consider
 <programlisting>
 SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
@@ -637,16 +637,30 @@ SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
 emitted for each row of A that has no matching row in the join of B and C.
 Therefore the planner has no choice of join order here: it must join
 B to C and then join A to that result. Accordingly, this query takes
-less time to plan than the previous query.
+less time to plan than the previous query. In other cases, the planner
+may be able to determine that more than one join order is safe.
+For example, given
+<programlisting>
+SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);
+</programlisting>
+it is valid to join A to either B or C first. Currently, only
+<literal>FULL JOIN</> completely constrains the join order. Most
+practical cases involving <literal>LEFT JOIN</> or <literal>RIGHT JOIN</>
+can be rearranged to some extent.
 </para>
 
 <para>
 Explicit inner join syntax (<literal>INNER JOIN</>, <literal>CROSS
 JOIN</>, or unadorned <literal>JOIN</>) is semantically the same as
-listing the input relations in <literal>FROM</>, so it does not need to
-constrain the join order. But it is possible to instruct the
-<productname>PostgreSQL</productname> query planner to treat
-explicit inner <literal>JOIN</>s as constraining the join order anyway.
+listing the input relations in <literal>FROM</>, so it does not
+constrain the join order.
+</para>
+
+<para>
+Even though most kinds of <literal>JOIN</> don't completely constrain
+the join order, it is possible to instruct the
+<productname>PostgreSQL</productname> query planner to treat all
+<literal>JOIN</> clauses as constraining the join order anyway.
 For example, these three queries are logically equivalent:
 <programlisting>
 SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
@@ -660,7 +674,8 @@ SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
 </para>
 
 <para>
-To force the planner to follow the <literal>JOIN</> order for inner joins,
+To force the planner to follow the join order laid out by explicit
+<literal>JOIN</>s,
 set the <xref linkend="guc-join-collapse-limit"> run-time parameter to 1.
 (Other possible values are discussed below.)
 </para>
@@ -697,9 +712,9 @@ FROM x, y,
 WHERE somethingelse;
 </programlisting>
 This situation might arise from use of a view that contains a join;
-the view's <literal>SELECT</> rule will be inserted in place of the view reference,
-yielding a query much like the above. Normally, the planner will try
-to collapse the subquery into the parent, yielding
+the view's <literal>SELECT</> rule will be inserted in place of the view
+reference, yielding a query much like the above. Normally, the planner
+will try to collapse the subquery into the parent, yielding
 <programlisting>
 SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
 </programlisting>
@@ -722,12 +737,12 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
 linkend="guc-join-collapse-limit">
 are similarly named because they do almost the same thing: one controls
 when the planner will <quote>flatten out</> subselects, and the
-other controls when it will flatten out explicit inner joins. Typically
+other controls when it will flatten out explicit joins. Typically
 you would either set <varname>join_collapse_limit</> equal to
 <varname>from_collapse_limit</> (so that explicit joins and subselects
 act similarly) or set <varname>join_collapse_limit</> to 1 (if you want
 to control join order with explicit joins). But you might set them
-differently if you are trying to fine-tune the trade off between planning
+differently if you are trying to fine-tune the trade-off between planning
 time and run time.
 </para>
 </sect1>
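
Reviewer's note: the new documentation text states that for `a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id)` it is valid to join A to either B or C first. The following is a minimal, self-contained C sketch of that claim and is not planner code. Everything in it (the `Row` type, `left_join`, `join_orders_agree`, the use of `-1` as a stand-in for NULL, single-column B and C tables) is a hypothetical simplification chosen for illustration. It evaluates the query in both orders with naive nested loops and compares the resulting row multisets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NULL_ID  (-1)   /* stand-in for SQL NULL; real ids are assumed >= 0 */
#define MAX_ROWS 64     /* fixed capacity, enough for small demonstrations */

typedef struct
{
    int a_id, bid, cid; /* columns of A */
    int b_id;           /* B.id, or NULL_ID when null-extended */
    int c_id;           /* C.id, or NULL_ID when null-extended */
} Row;

/*
 * LEFT JOIN the rows in "in" against a single-column table "ids".
 * The join key is in.bid when join_b is true, otherwise in.cid.
 */
static size_t
left_join(const Row *in, size_t n_in, const int *ids, size_t n_ids,
          bool join_b, Row *out)
{
    size_t n_out = 0;

    for (size_t i = 0; i < n_in; i++)
    {
        int  key = join_b ? in[i].bid : in[i].cid;
        bool matched = false;

        for (size_t j = 0; j < n_ids; j++)
        {
            if (key == NULL_ID || ids[j] != key)
                continue;           /* NULL never matches anything */
            assert(n_out < MAX_ROWS);
            out[n_out] = in[i];
            if (join_b)
                out[n_out].b_id = ids[j];
            else
                out[n_out].c_id = ids[j];
            n_out++;
            matched = true;
        }
        if (!matched)
        {
            assert(n_out < MAX_ROWS);
            out[n_out++] = in[i];   /* keep the left row, null-extended */
        }
    }
    return n_out;
}

/* Order result rows by (a_id, b_id, c_id) so two results can be compared. */
static int
row_cmp(const void *x, const void *y)
{
    const Row *p = x;
    const Row *q = y;

    if (p->a_id != q->a_id) return (p->a_id < q->a_id) ? -1 : 1;
    if (p->b_id != q->b_id) return (p->b_id < q->b_id) ? -1 : 1;
    if (p->c_id != q->c_id) return (p->c_id < q->c_id) ? -1 : 1;
    return 0;
}

/* True if joining B first and joining C first yield the same rows. */
bool
join_orders_agree(const Row *a, size_t na,
                  const int *b, size_t nb, const int *c, size_t nc)
{
    Row    ab[MAX_ROWS], abc[MAX_ROWS], ac[MAX_ROWS], acb[MAX_ROWS];
    size_t n_ab = left_join(a, na, b, nb, true, ab);
    size_t n_abc = left_join(ab, n_ab, c, nc, false, abc);
    size_t n_ac = left_join(a, na, c, nc, false, ac);
    size_t n_acb = left_join(ac, n_ac, b, nb, true, acb);

    if (n_abc != n_acb)
        return false;
    qsort(abc, n_abc, sizeof(Row), row_cmp);
    qsort(acb, n_acb, sizeof(Row), row_cmp);
    for (size_t i = 0; i < n_abc; i++)
        if (row_cmp(&abc[i], &acb[i]) != 0)
            return false;
    return true;
}
```

Both join clauses here refer only to columns of A, which is why the two orders commute in this sketch. The nested case shown earlier in the diff, `a LEFT JOIN (b JOIN c ...)`, is a different shape and is not modeled.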

src/backend/nodes/copyfuncs.c

Lines changed: 20 additions & 1 deletion
@@ -15,7 +15,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.322 2005/11/26 22:14:56 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.323 2005/12/20 02:30:35 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -1277,6 +1277,22 @@ _copyRestrictInfo(RestrictInfo *from)
     return newnode;
 }
 
+/*
+ * _copyOuterJoinInfo
+ */
+static OuterJoinInfo *
+_copyOuterJoinInfo(OuterJoinInfo *from)
+{
+    OuterJoinInfo *newnode = makeNode(OuterJoinInfo);
+
+    COPY_BITMAPSET_FIELD(min_lefthand);
+    COPY_BITMAPSET_FIELD(min_righthand);
+    COPY_SCALAR_FIELD(is_full_join);
+    COPY_SCALAR_FIELD(lhs_strict);
+
+    return newnode;
+}
+
 /*
  * _copyInClauseInfo
  */
@@ -2906,6 +2922,9 @@ copyObject(void *from)
         case T_RestrictInfo:
             retval = _copyRestrictInfo(from);
             break;
+        case T_OuterJoinInfo:
+            retval = _copyOuterJoinInfo(from);
+            break;
         case T_InClauseInfo:
             retval = _copyInClauseInfo(from);
             break;

src/backend/nodes/equalfuncs.c

Lines changed: 15 additions & 1 deletion
@@ -18,7 +18,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.258 2005/11/22 18:17:11 momjian Exp $
+ * $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.259 2005/12/20 02:30:35 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
*/
@@ -613,6 +613,17 @@ _equalRestrictInfo(RestrictInfo *a, RestrictInfo *b)
613613
return true;
614614
}
615615

616+
static bool
617+
_equalOuterJoinInfo(OuterJoinInfo *a, OuterJoinInfo *b)
618+
{
619+
COMPARE_BITMAPSET_FIELD(min_lefthand);
620+
COMPARE_BITMAPSET_FIELD(min_righthand);
621+
COMPARE_SCALAR_FIELD(is_full_join);
622+
COMPARE_SCALAR_FIELD(lhs_strict);
623+
624+
return true;
625+
}
626+
616627
static bool
617628
_equalInClauseInfo(InClauseInfo *a, InClauseInfo *b)
618629
{
@@ -1954,6 +1965,9 @@ equal(void *a, void *b)
         case T_RestrictInfo:
             retval = _equalRestrictInfo(a, b);
             break;
+        case T_OuterJoinInfo:
+            retval = _equalOuterJoinInfo(a, b);
+            break;
         case T_InClauseInfo:
             retval = _equalInClauseInfo(a, b);
             break;
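
Reviewer's note: the copy and equal functions added above treat the two relid sets (`min_lefthand`, `min_righthand`) differently from the two flags (`is_full_join`, `lhs_strict`): sets are copied and compared by content, flags by value. The sketch below illustrates that distinction in standalone C. It is an assumption-laden toy rather than the PostgreSQL definitions: `ToyRelids` is a hypothetical one-word stand-in for a bitmapset, the `toy_*` names are invented, and only the four field names are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for a bitmapset: one heap-allocated 64-bit word. */
typedef struct ToyRelids
{
    uint64_t bits;
} ToyRelids;

static ToyRelids *
toy_relids_make(uint64_t bits)
{
    ToyRelids *r = malloc(sizeof(ToyRelids));

    assert(r != NULL);
    r->bits = bits;
    return r;
}

/* Field names follow the diff; the types are simplified for this sketch. */
typedef struct ToyOuterJoinInfo
{
    ToyRelids *min_lefthand;
    ToyRelids *min_righthand;
    bool       is_full_join;
    bool       lhs_strict;
} ToyOuterJoinInfo;

/* Deep copy: set-valued fields get fresh storage, scalars are assigned. */
ToyOuterJoinInfo *
toy_copy_oj(const ToyOuterJoinInfo *from)
{
    ToyOuterJoinInfo *newnode = malloc(sizeof(ToyOuterJoinInfo));

    assert(newnode != NULL);
    newnode->min_lefthand = toy_relids_make(from->min_lefthand->bits);
    newnode->min_righthand = toy_relids_make(from->min_righthand->bits);
    newnode->is_full_join = from->is_full_join;
    newnode->lhs_strict = from->lhs_strict;
    return newnode;
}

/* Equality: compare set contents rather than pointers, then the scalars. */
bool
toy_equal_oj(const ToyOuterJoinInfo *a, const ToyOuterJoinInfo *b)
{
    if (a->min_lefthand->bits != b->min_lefthand->bits)
        return false;
    if (a->min_righthand->bits != b->min_righthand->bits)
        return false;
    if (a->is_full_join != b->is_full_join)
        return false;
    if (a->lhs_strict != b->lhs_strict)
        return false;
    return true;
}
```

A copy made this way compares equal to its source, and modifying a set in the copy leaves the source untouched, which is the property a node-copy routine needs.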

src/backend/nodes/outfuncs.c

Lines changed: 16 additions & 2 deletions
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.264 2005/11/28 04:35:30 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.265 2005/12/20 02:30:35 tgl Exp $
  *
  * NOTES
  * Every node type that can appear in stored rules' parsetrees *must*
@@ -1167,6 +1167,7 @@ _outPlannerInfo(StringInfo str, PlannerInfo *node)
     WRITE_NODE_FIELD(left_join_clauses);
     WRITE_NODE_FIELD(right_join_clauses);
     WRITE_NODE_FIELD(full_join_clauses);
+    WRITE_NODE_FIELD(oj_info_list);
     WRITE_NODE_FIELD(in_info_list);
     WRITE_NODE_FIELD(query_pathkeys);
     WRITE_NODE_FIELD(group_pathkeys);
@@ -1201,7 +1202,6 @@ _outRelOptInfo(StringInfo str, RelOptInfo *node)
     WRITE_FLOAT_FIELD(tuples, "%.0f");
     WRITE_NODE_FIELD(subplan);
     WRITE_NODE_FIELD(baserestrictinfo);
-    WRITE_BITMAPSET_FIELD(outerjoinset);
     WRITE_NODE_FIELD(joininfo);
     WRITE_BITMAPSET_FIELD(index_outer_relids);
     WRITE_NODE_FIELD(index_inner_paths);
@@ -1265,6 +1265,17 @@ _outInnerIndexscanInfo(StringInfo str, InnerIndexscanInfo *node)
     WRITE_NODE_FIELD(best_innerpath);
 }
 
+static void
+_outOuterJoinInfo(StringInfo str, OuterJoinInfo *node)
+{
+    WRITE_NODE_TYPE("OUTERJOININFO");
+
+    WRITE_BITMAPSET_FIELD(min_lefthand);
+    WRITE_BITMAPSET_FIELD(min_righthand);
+    WRITE_BOOL_FIELD(is_full_join);
+    WRITE_BOOL_FIELD(lhs_strict);
+}
+
 static void
 _outInClauseInfo(StringInfo str, InClauseInfo *node)
 {
@@ -2019,6 +2030,9 @@ _outNode(StringInfo str, void *obj)
         case T_InnerIndexscanInfo:
             _outInnerIndexscanInfo(str, obj);
             break;
+        case T_OuterJoinInfo:
+            _outOuterJoinInfo(str, obj);
+            break;
         case T_InClauseInfo:
             _outInClauseInfo(str, obj);
             break;
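
Reviewer's note: `_outOuterJoinInfo` writes a node label followed by one entry per field. The standalone sketch below shows roughly what such a serializer could emit. It assumes the usual node-dump conventions (a braced node, `:name value` pairs, relid sets written as `(b ...)`); those conventions are not shown in this diff, so treat the exact text format as an assumption. The `toy_*` functions and the 64-bit masks standing in for bitmapsets are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_BUF_MIN 512 /* large enough for two full 64-member sets */

/* Append " :name (b m1 m2 ...)" listing the set bits of mask. */
static void
toy_out_relids(char *buf, size_t size, const char *name, uint64_t mask)
{
    size_t len = strlen(buf);

    len += (size_t) snprintf(buf + len, size - len, " :%s (b", name);
    for (int i = 0; i < 64; i++)
        if (mask & (UINT64_C(1) << i))
            len += (size_t) snprintf(buf + len, size - len, " %d", i);
    snprintf(buf + len, size - len, ")");
}

/* Append " :name true" or " :name false". */
static void
toy_out_bool(char *buf, size_t size, const char *name, bool value)
{
    size_t len = strlen(buf);

    snprintf(buf + len, size - len, " :%s %s", name, value ? "true" : "false");
}

/* Serialize the four fields in the same order as the function in the diff. */
void
toy_out_outer_join_info(char *buf, size_t size,
                        uint64_t min_lefthand, uint64_t min_righthand,
                        bool is_full_join, bool lhs_strict)
{
    size_t len;

    assert(size >= TOY_BUF_MIN);
    snprintf(buf, size, "{OUTERJOININFO");
    toy_out_relids(buf, size, "min_lefthand", min_lefthand);
    toy_out_relids(buf, size, "min_righthand", min_righthand);
    toy_out_bool(buf, size, "is_full_join", is_full_join);
    toy_out_bool(buf, size, "lhs_strict", lhs_strict);
    len = strlen(buf);
    snprintf(buf + len, size - len, "}");
}
```

For a left-hand set {1}, a right-hand set {2, 3}, `is_full_join` false and `lhs_strict` true, this sketch produces `{OUTERJOININFO :min_lefthand (b 1) :min_righthand (b 2 3) :is_full_join false :lhs_strict true}`.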
