Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 60be3f9

Browse files
author
Richard Guo
committed
Avoid unnecessary wrapping for more complex expressions
When pulling up a subquery that is under an outer join, if the subquery's target list contains a strict expression that uses a subquery variable, it's okay to pull up the expression without wrapping it in a PlaceHolderVar: if the subquery variable is forced to NULL by the outer join, the expression result will come out as NULL too. If the strict expression does not contain any subquery variables, the current code always wraps it in a PlaceHolderVar. While this is not incorrect, the analysis could be tighter: if the strict expression contains any variables of rels that are under the same lowest nulling outer join as the subquery, we can also avoid wrapping it. This is safe because if the subquery variable is forced to NULL by the outer join, the variables of rels that are under the same lowest nulling outer join will also be forced to NULL, resulting in the expression evaluating to NULL as well. Therefore, it's not necessary to force the expression to be evaluated below the outer join. It could be beneficial to get rid of such PHVs because they could imply lateral dependencies, which force us to resort to nestloop joins. This patch checks if the lateral references in the strict expression contain any variables of rels under the same lowest nulling outer join as the subquery, and avoids wrapping the expression in that case. This is fundamentally a generalization of the optimizations for bare Vars and PHVs introduced in commit f64ec81. No backpatch as this could result in plan changes. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4_ENtfRdLaM_bXAxiKRYO7DmwDBDG4_2=VTDi0mJP-jAw@mail.gmail.com
1 parent 2364f61 commit 60be3f9

File tree

3 files changed

+186
-17
lines changed

3 files changed

+186
-17
lines changed

src/backend/optimizer/prep/prepjointree.c

Lines changed: 50 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -2649,26 +2649,29 @@ pullup_replace_vars_callback(Var *var,
26492649
{
26502650
/*
26512651
* If the node contains Var(s) or PlaceHolderVar(s) of the
2652-
* subquery being pulled up, and does not contain any
2653-
* non-strict constructs, then instead of adding a PHV on top
2654-
* we can add the required nullingrels to those Vars/PHVs.
2655-
* (This is fundamentally a generalization of the above cases
2656-
* for bare Vars and PHVs.)
2652+
* subquery being pulled up, or of rels that are under the
2653+
* same lowest nulling outer join as the subquery, and does
2654+
* not contain any non-strict constructs, then instead of
2655+
* adding a PHV on top we can add the required nullingrels to
2656+
* those Vars/PHVs. (This is fundamentally a generalization
2657+
* of the above cases for bare Vars and PHVs.)
26572658
*
26582659
* This test is somewhat expensive, but it avoids pessimizing
26592660
* the plan in cases where the nullingrels get removed again
26602661
* later by outer join reduction.
26612662
*
26622663
* Note that we don't force wrapping of expressions containing
26632664
* lateral references, so long as they also contain Vars/PHVs
2664-
* of the subquery. This is okay because of the restriction
2665-
* to strict constructs: if the subquery's Vars/PHVs have been
2666-
* forced to NULL by an outer join then the end result of the
2667-
* expression will be NULL too, regardless of the lateral
2668-
* references. So it's not necessary to force the expression
2669-
* to be evaluated below the outer join. This can be a very
2670-
* valuable optimization, because it may allow us to avoid
2671-
* using a nested loop to pass the lateral reference down.
2665+
* of the subquery, or of rels that are under the same lowest
2666+
* nulling outer join as the subquery. This is okay because
2667+
* of the restriction to strict constructs: if those Vars/PHVs
2668+
* have been forced to NULL by an outer join then the end
2669+
* result of the expression will be NULL too, regardless of
2670+
* the lateral references. So it's not necessary to force the
2671+
* expression to be evaluated below the outer join. This can
2672+
* be a very valuable optimization, because it may allow us to
2673+
* avoid using a nested loop to pass the lateral reference
2674+
* down.
26722675
*
26732676
* This analysis could be tighter: in particular, a non-strict
26742677
* construct hidden within a lower-level PlaceHolderVar is not
@@ -2679,10 +2682,40 @@ pullup_replace_vars_callback(Var *var,
26792682
* membership of the node, but if it's non-lateral then any
26802683
* level-zero var must belong to the subquery.
26812684
*/
2682-
if ((rcon->target_rte->lateral ?
2683-
bms_overlap(pull_varnos(rcon->root, newnode),
2684-
rcon->relids) :
2685-
contain_vars_of_level(newnode, 0)) &&
2685+
bool contain_nullable_vars = false;
2686+
2687+
if (!rcon->target_rte->lateral)
2688+
{
2689+
if (contain_vars_of_level(newnode, 0))
2690+
contain_nullable_vars = true;
2691+
}
2692+
else
2693+
{
2694+
Relids all_varnos;
2695+
2696+
all_varnos = pull_varnos(rcon->root, newnode);
2697+
if (bms_overlap(all_varnos, rcon->relids))
2698+
contain_nullable_vars = true;
2699+
else
2700+
{
2701+
nullingrel_info *nullinfo = rcon->nullinfo;
2702+
int varno;
2703+
2704+
varno = -1;
2705+
while ((varno = bms_next_member(all_varnos, varno)) >= 0)
2706+
{
2707+
Assert(varno > 0 && varno <= nullinfo->rtlength);
2708+
if (bms_is_subset(nullinfo->nullingrels[rcon->varno],
2709+
nullinfo->nullingrels[varno]))
2710+
{
2711+
contain_nullable_vars = true;
2712+
break;
2713+
}
2714+
}
2715+
}
2716+
}
2717+
2718+
if (contain_nullable_vars &&
26862719
!contain_nonstrict_functions(newnode))
26872720
{
26882721
/* No wrap needed */

src/test/regress/expected/subselect.out

Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1848,6 +1848,109 @@ order by 1, 2;
18481848
4567890123456789 | 9135780246913578
18491849
(11 rows)
18501850

1851+
-- strict expressions containing variables of rels under the same lowest
1852+
-- nulling outer join can escape being wrapped
1853+
explain (verbose, costs off)
1854+
select t1.q1, x from
1855+
int8_tbl t1 left join
1856+
(int8_tbl t2 inner join
1857+
lateral (select t2.q1+1 as x, * from int8_tbl t3) t3 on t2.q2 = t3.q2)
1858+
on t1.q2 = t2.q2
1859+
order by 1, 2;
1860+
QUERY PLAN
1861+
--------------------------------------------------------
1862+
Sort
1863+
Output: t1.q1, ((t2.q1 + 1))
1864+
Sort Key: t1.q1, ((t2.q1 + 1))
1865+
-> Hash Right Join
1866+
Output: t1.q1, (t2.q1 + 1)
1867+
Hash Cond: (t2.q2 = t1.q2)
1868+
-> Hash Join
1869+
Output: t2.q1, t2.q2
1870+
Hash Cond: (t2.q2 = t3.q2)
1871+
-> Seq Scan on public.int8_tbl t2
1872+
Output: t2.q1, t2.q2
1873+
-> Hash
1874+
Output: t3.q2
1875+
-> Seq Scan on public.int8_tbl t3
1876+
Output: t3.q2
1877+
-> Hash
1878+
Output: t1.q1, t1.q2
1879+
-> Seq Scan on public.int8_tbl t1
1880+
Output: t1.q1, t1.q2
1881+
(19 rows)
1882+
1883+
select t1.q1, x from
1884+
int8_tbl t1 left join
1885+
(int8_tbl t2 inner join
1886+
lateral (select t2.q1+1 as x, * from int8_tbl t3) t3 on t2.q2 = t3.q2)
1887+
on t1.q2 = t2.q2
1888+
order by 1, 2;
1889+
q1 | x
1890+
------------------+------------------
1891+
123 | 124
1892+
123 | 124
1893+
123 | 124
1894+
123 | 4567890123456790
1895+
123 | 4567890123456790
1896+
4567890123456789 | 124
1897+
4567890123456789 | 124
1898+
4567890123456789 | 4567890123456790
1899+
4567890123456789 | 4567890123456790
1900+
4567890123456789 | 4567890123456790
1901+
4567890123456789 | 4567890123456790
1902+
(11 rows)
1903+
1904+
-- otherwise we need to wrap the strict expressions
1905+
explain (verbose, costs off)
1906+
select t1.q1, x from
1907+
int8_tbl t1 left join
1908+
(int8_tbl t2 left join
1909+
lateral (select t2.q1+1 as x, * from int8_tbl t3) t3 on t2.q2 = t3.q2)
1910+
on t1.q2 = t2.q2
1911+
order by 1, 2;
1912+
QUERY PLAN
1913+
--------------------------------------------------
1914+
Sort
1915+
Output: t1.q1, ((t2.q1 + 1))
1916+
Sort Key: t1.q1, ((t2.q1 + 1))
1917+
-> Hash Right Join
1918+
Output: t1.q1, ((t2.q1 + 1))
1919+
Hash Cond: (t2.q2 = t1.q2)
1920+
-> Nested Loop Left Join
1921+
Output: t2.q2, ((t2.q1 + 1))
1922+
-> Seq Scan on public.int8_tbl t2
1923+
Output: t2.q1, t2.q2
1924+
-> Seq Scan on public.int8_tbl t3
1925+
Output: t3.q2, (t2.q1 + 1)
1926+
Filter: (t2.q2 = t3.q2)
1927+
-> Hash
1928+
Output: t1.q1, t1.q2
1929+
-> Seq Scan on public.int8_tbl t1
1930+
Output: t1.q1, t1.q2
1931+
(17 rows)
1932+
1933+
select t1.q1, x from
1934+
int8_tbl t1 left join
1935+
(int8_tbl t2 left join
1936+
lateral (select t2.q1+1 as x, * from int8_tbl t3) t3 on t2.q2 = t3.q2)
1937+
on t1.q2 = t2.q2
1938+
order by 1, 2;
1939+
q1 | x
1940+
------------------+------------------
1941+
123 | 124
1942+
123 | 124
1943+
123 | 124
1944+
123 | 4567890123456790
1945+
123 | 4567890123456790
1946+
4567890123456789 | 124
1947+
4567890123456789 | 124
1948+
4567890123456789 | 4567890123456790
1949+
4567890123456789 | 4567890123456790
1950+
4567890123456789 | 4567890123456790
1951+
4567890123456789 | 4567890123456790
1952+
(11 rows)
1953+
18511954
-- lateral references for simple Vars can escape being wrapped if the
18521955
-- referenced rel is under the same lowest nulling outer join
18531956
explain (verbose, costs off)

src/test/regress/sql/subselect.sql

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -939,6 +939,39 @@ select t1.q1, x from
939939
on t1.q2 = t2.q2
940940
order by 1, 2;
941941

942+
-- strict expressions containing variables of rels under the same lowest
943+
-- nulling outer join can escape being wrapped
944+
explain (verbose, costs off)
945+
select t1.q1, x from
946+
int8_tbl t1 left join
947+
(int8_tbl t2 inner join
948+
lateral (select t2.q1+1 as x, * from int8_tbl t3) t3 on t2.q2 = t3.q2)
949+
on t1.q2 = t2.q2
950+
order by 1, 2;
951+
952+
select t1.q1, x from
953+
int8_tbl t1 left join
954+
(int8_tbl t2 inner join
955+
lateral (select t2.q1+1 as x, * from int8_tbl t3) t3 on t2.q2 = t3.q2)
956+
on t1.q2 = t2.q2
957+
order by 1, 2;
958+
959+
-- otherwise we need to wrap the strict expressions
960+
explain (verbose, costs off)
961+
select t1.q1, x from
962+
int8_tbl t1 left join
963+
(int8_tbl t2 left join
964+
lateral (select t2.q1+1 as x, * from int8_tbl t3) t3 on t2.q2 = t3.q2)
965+
on t1.q2 = t2.q2
966+
order by 1, 2;
967+
968+
select t1.q1, x from
969+
int8_tbl t1 left join
970+
(int8_tbl t2 left join
971+
lateral (select t2.q1+1 as x, * from int8_tbl t3) t3 on t2.q2 = t3.q2)
972+
on t1.q2 = t2.q2
973+
order by 1, 2;
974+
942975
-- lateral references for simple Vars can escape being wrapped if the
943976
-- referenced rel is under the same lowest nulling outer join
944977
explain (verbose, costs off)

0 commit comments

Comments
 (0)