Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit d3d55ce

Browse files
committed
Remove useless self-joins
The Self Join Elimination (SJE) feature removes an inner join of a plain table to itself in the query tree if is proved that the join can be replaced with a scan without impacting the query result. Self join and inner relation are replaced with the outer in query, equivalence classes, and planner info structures. Also, inner restrictlist moves to the outer one with removing duplicated clauses. Thus, this optimization reduces the length of the range table list (this especially makes sense for partitioned relations), reduces the number of restriction clauses === selectivity estimations, and potentially can improve total planner prediction for the query. The SJE proof is based on innerrel_is_unique machinery. We can remove a self-join when for each outer row: 1. At most one inner row matches the join clause. 2. Each matched inner row must be (physically) the same row as the outer one. In this patch we use the next approach to identify a self-join: 1. Collect all merge-joinable join quals which look like a.x = b.x 2. Add to the list above the baseretrictinfo of the inner table. 3. Check innerrel_is_unique() for the qual list. If it returns false, skip this pair of joining tables. 4. Check uniqueness, proved by the baserestrictinfo clauses. To prove the possibility of self-join elimination inner and outer clauses must have an exact match. The relation replacement procedure is not trivial and it is partly combined with the one, used to remove useless left joins. Tests, covering this feature, were added to join.sql. Some regression tests changed due to self-join removal logic. Discussion: https://postgr.es/m/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru Author: Andrey Lepikhov, Alexander Kuzmenkov Reviewed-by: Tom Lane, Robert Haas, Andres Freund, Simon Riggs, Jonathan S. Katz Reviewed-by: David Rowley, Thomas Munro, Konstantin Knizhnik, Heikki Linnakangas Reviewed-by: Hywel Carver, Laurenz Albe, Ronan Dunklau, vignesh C, Zhihong Yu Reviewed-by: Greg Stark, Jaime Casanova, Michał Kłeczek, Alena Rybakina Reviewed-by: Alexander Korotkov
1 parent 8f0fd47 commit d3d55ce

File tree

14 files changed

+2457
-68
lines changed

14 files changed

+2457
-68
lines changed

doc/src/sgml/config.sgml

+16
Original file line numberDiff line numberDiff line change
@@ -5306,6 +5306,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
53065306
</listitem>
53075307
</varlistentry>
53085308

5309+
<varlistentry id="guc-enable_self_join_removal" xreflabel="enable_self_join_removal">
5310+
<term><varname>enable_self_join_removal</varname> (<type>boolean</type>)
5311+
<indexterm>
5312+
<primary><varname>enable_self_join_removal</varname> configuration parameter</primary>
5313+
</indexterm>
5314+
</term>
5315+
<listitem>
5316+
<para>
5317+
Enables or disables the query planner's optimization which analyses
5318+
the query tree and replaces self joins with semantically equivalent
5319+
single scans. Takes into consideration only plain tables.
5320+
The default is <literal>on</literal>.
5321+
</para>
5322+
</listitem>
5323+
</varlistentry>
5324+
53095325
<varlistentry id="guc-enable-seqscan" xreflabel="enable_seqscan">
53105326
<term><varname>enable_seqscan</varname> (<type>boolean</type>)
53115327
<indexterm>

src/backend/optimizer/path/indxpath.c

+39
Original file line numberDiff line numberDiff line change
@@ -3494,6 +3494,22 @@ bool
34943494
relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
34953495
List *restrictlist,
34963496
List *exprlist, List *oprlist)
3497+
{
3498+
return relation_has_unique_index_ext(root, rel, restrictlist,
3499+
exprlist, oprlist, NULL);
3500+
}
3501+
3502+
/*
3503+
* relation_has_unique_index_ext
3504+
* Same as relation_has_unique_index_for(), but supports extra_clauses
3505+
* parameter. If extra_clauses isn't NULL, return baserestrictinfo clauses
3506+
* which were used to derive uniqueness.
3507+
*/
3508+
bool
3509+
relation_has_unique_index_ext(PlannerInfo *root, RelOptInfo *rel,
3510+
List *restrictlist,
3511+
List *exprlist, List *oprlist,
3512+
List **extra_clauses)
34973513
{
34983514
ListCell *ic;
34993515

@@ -3549,6 +3565,7 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
35493565
{
35503566
IndexOptInfo *ind = (IndexOptInfo *) lfirst(ic);
35513567
int c;
3568+
List *exprs = NIL;
35523569

35533570
/*
35543571
* If the index is not unique, or not immediately enforced, or if it's
@@ -3600,6 +3617,24 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
36003617
if (match_index_to_operand(rexpr, c, ind))
36013618
{
36023619
matched = true; /* column is unique */
3620+
3621+
if (bms_membership(rinfo->clause_relids) == BMS_SINGLETON)
3622+
{
3623+
MemoryContext oldMemCtx =
3624+
MemoryContextSwitchTo(root->planner_cxt);
3625+
3626+
/*
3627+
* Add filter clause into a list allowing caller to
3628+
* know if uniqueness have made not only by join
3629+
* clauses.
3630+
*/
3631+
Assert(bms_is_empty(rinfo->left_relids) ||
3632+
bms_is_empty(rinfo->right_relids));
3633+
if (extra_clauses)
3634+
exprs = lappend(exprs, rinfo);
3635+
MemoryContextSwitchTo(oldMemCtx);
3636+
}
3637+
36033638
break;
36043639
}
36053640
}
@@ -3642,7 +3677,11 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
36423677

36433678
/* Matched all key columns of this index? */
36443679
if (c == ind->nkeycolumns)
3680+
{
3681+
if (extra_clauses)
3682+
*extra_clauses = exprs;
36453683
return true;
3684+
}
36463685
}
36473686

36483687
return false;

0 commit comments

Comments
 (0)