
Commit 5ebaaa4

Implement SQL-standard LATERAL subqueries.
This patch implements the standard syntax of LATERAL attached to a sub-SELECT in FROM, and also allows LATERAL attached to a function in FROM, since set-returning function calls are expected to be one of the principal use-cases.

The main change here is a rewrite of the mechanism for keeping track of which relations are visible for column references while the FROM clause is being scanned. The parser "namespace" lists are no longer lists of bare RTEs, but are lists of ParseNamespaceItem structs, which carry an RTE pointer as well as some visibility-controlling flags. Aside from supporting LATERAL correctly, this lets us get rid of the ancient hacks that required rechecking subqueries and JOIN/ON and function-in-FROM expressions for invalid references after they were initially parsed. Invalid column references are now always correctly detected on sight.

In passing, remove assorted parser error checks that are now dead code by virtue of our having gotten rid of add_missing_from, as well as some comments that are obsolete for the same reason. (It was mainly add_missing_from that caused so much fudging here in the first place.)

The planner support for this feature is very minimal, and will be improved in future patches. It works well enough for testing purposes, though.

catversion bump forced due to new field in RangeTblEntry.
1 parent 5078be4 commit 5ebaaa4

41 files changed, with 1300 additions and 481 deletions.

doc/src/sgml/keywords.sgml

+1 -1
@@ -2444,7 +2444,7 @@
 </row>
 <row>
 <entry><token>LATERAL</token></entry>
-<entry></entry>
+<entry>reserved</entry>
 <entry>reserved</entry>
 <entry>reserved</entry>
 <entry></entry>

doc/src/sgml/queries.sgml

+82 -1
@@ -590,7 +590,7 @@ SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
 <para>
 Subqueries specifying a derived table must be enclosed in
 parentheses and <emphasis>must</emphasis> be assigned a table
-alias name. (See <xref linkend="queries-table-aliases">.) For
+alias name (as in <xref linkend="queries-table-aliases">). For
 example:
 <programlisting>
 FROM (SELECT * FROM table1) AS alias_name
@@ -697,6 +697,87 @@ SELECT *
 expand to.
 </para>
 </sect3>
+
+<sect3 id="queries-lateral">
+<title><literal>LATERAL</> Subqueries</title>
+
+<indexterm zone="queries-lateral">
+<primary>LATERAL</>
+<secondary>in the FROM clause</>
+</indexterm>
+
+<para>
+Subqueries and table functions appearing in <literal>FROM</> can be
+preceded by the key word <literal>LATERAL</>. This allows them to
+reference columns provided by preceding <literal>FROM</> items.
+(Without <literal>LATERAL</literal>, each <literal>FROM</> item is
+evaluated independently and so cannot cross-reference any other
+<literal>FROM</> item.)
+A <literal>LATERAL</literal> item can appear at top level in the
+<literal>FROM</> list, or within a <literal>JOIN</> tree; in the latter
+case it can also refer to any items that are on the left-hand side of a
+<literal>JOIN</> that it is on the right-hand side of.
+</para>
+
+<para>
+When a <literal>FROM</> item contains <literal>LATERAL</literal>
+cross-references, evaluation proceeds as follows: for each row of the
+<literal>FROM</> item providing the cross-referenced column(s), or
+set of rows of multiple <literal>FROM</> items providing the
+columns, the <literal>LATERAL</literal> item is evaluated using that
+row or row set's values of the columns. The resulting row(s) are
+joined as usual with the rows they were computed from. This is
+repeated for each row or set of rows from the column source table(s).
+</para>
+
+<para>
+A trivial example of <literal>LATERAL</literal> is
+<programlisting>
+SELECT * FROM foo, LATERAL (SELECT * FROM bar WHERE bar.id = foo.bar_id) ss;
+</programlisting>
+This is not especially useful since it has exactly the same result as
+the more conventional
+<programlisting>
+SELECT * FROM foo, bar WHERE bar.id = foo.bar_id;
+</programlisting>
+<literal>LATERAL</literal> is primarily useful when the cross-referenced
+column is necessary for computing the row(s) to be joined. A common
+application is providing an argument value for a set-returning function.
+For example, supposing that <function>vertices(polygon)</> returns the
+set of vertices of a polygon, we could identify close-together vertices
+of polygons stored in a table with:
+<programlisting>
+SELECT p1.id, p2.id, v1, v2
+FROM polygons p1, polygons p2,
+LATERAL vertices(p1.poly) v1,
+LATERAL vertices(p2.poly) v2
+WHERE (v1 &lt;-&gt; v2) &lt; 10 AND p1.id != p2.id;
+</programlisting>
+This query could also be written
+<programlisting>
+SELECT p1.id, p2.id, v1, v2
+FROM polygons p1 CROSS JOIN LATERAL vertices(p1.poly) v1,
+polygons p2 CROSS JOIN LATERAL vertices(p2.poly) v2
+WHERE (v1 &lt;-&gt; v2) &lt; 10 AND p1.id != p2.id;
+</programlisting>
+or in several other equivalent formulations.
+</para>
+
+<para>
+It is often particularly handy to <literal>LEFT JOIN</> to a
+<literal>LATERAL</literal> subquery, so that source rows will appear in
+the result even if the <literal>LATERAL</literal> subquery produces no
+rows for them. For example, if <function>get_product_names()</> returns
+the names of products made by a manufacturer, but some manufacturers in
+our table currently produce no products, we could find out which ones
+those are like this:
+<programlisting>
+SELECT m.name
+FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON true
+WHERE pname IS NULL;
+</programlisting>
+</para>
+</sect3>
 </sect2>

 <sect2 id="queries-where">
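
The evaluation rule described in the new queries.sgml text (evaluate the LATERAL item once per source row, then join its output back to that row) can be modeled outside the database. What follows is a minimal Python sketch, not PostgreSQL code: the helper `lateral_join`, the sample `manufacturers` rows, and the Python stand-in for `get_product_names()` are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional, Tuple

Row = Dict[str, Any]


def lateral_join(
    left_rows: Iterable[Row],
    lateral_item: Callable[[Row], Iterable[Any]],
    left_outer: bool = False,
) -> Iterator[Tuple[Row, Optional[Any]]]:
    """Model of `left [LEFT] JOIN LATERAL item(left) ON true` (hypothetical helper)."""
    for row in left_rows:
        produced_any = False
        # The LATERAL item is re-evaluated using the current row's column values.
        for produced in lateral_item(row):
            produced_any = True
            yield row, produced
        # A LEFT JOIN keeps source rows for which the item produced nothing,
        # extending them with a null (None here).
        if left_outer and not produced_any:
            yield row, None


# Hypothetical sample data standing in for the tables used in the docs.
manufacturers: List[Row] = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Idle Co"}]
products_by_manufacturer: Dict[int, List[str]] = {1: ["anvil", "rocket"]}


def get_product_names(m: Row) -> List[str]:
    """Stand-in for the set-returning function get_product_names(m.id)."""
    return products_by_manufacturer.get(m["id"], [])


inner = [(m["name"], p) for m, p in lateral_join(manufacturers, get_product_names)]
outer = [
    (m["name"], p)
    for m, p in lateral_join(manufacturers, get_product_names, left_outer=True)
]
without_products = [name for name, p in outer if p is None]
```

In this model, `inner` corresponds to the comma-separated `LATERAL` form (manufacturers without products drop out), `outer` to the `LEFT JOIN LATERAL ... ON true` form, and `without_products` to the `WHERE pname IS NULL` filter.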

doc/src/sgml/ref/select.sgml

+89 -13
@@ -50,10 +50,10 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
 <phrase>where <replaceable class="parameter">from_item</replaceable> can be one of:</phrase>

 [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
-( <replaceable class="parameter">select</replaceable> ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ]
+[ LATERAL ] ( <replaceable class="parameter">select</replaceable> ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ]
 <replaceable class="parameter">with_query_name</replaceable> [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
-<replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] | <replaceable class="parameter">column_definition</replaceable> [, ...] ) ]
-<replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] ) AS ( <replaceable class="parameter">column_definition</replaceable> [, ...] )
+[ LATERAL ] <replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] | <replaceable class="parameter">column_definition</replaceable> [, ...] ) ]
+[ LATERAL ] <replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] ) AS ( <replaceable class="parameter">column_definition</replaceable> [, ...] )
 <replaceable class="parameter">from_item</replaceable> [ NATURAL ] <replaceable class="parameter">join_type</replaceable> <replaceable class="parameter">from_item</replaceable> [ ON <replaceable class="parameter">join_condition</replaceable> | USING ( <replaceable class="parameter">join_column</replaceable> [, ...] ) ]

 <phrase>and <replaceable class="parameter">with_query</replaceable> is:</phrase>
@@ -284,8 +284,8 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
 The <literal>FROM</literal> clause specifies one or more source
 tables for the <command>SELECT</command>. If multiple sources are
 specified, the result is the Cartesian product (cross join) of all
-the sources. But usually qualification conditions
-are added to restrict the returned rows to a small subset of the
+the sources. But usually qualification conditions are added (via
+<literal>WHERE</>) to restrict the returned rows to a small subset of the
 Cartesian product.
 </para>

@@ -414,17 +414,18 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
 </para>

 <para>
-A <literal>JOIN</literal> clause combines two
-<literal>FROM</> items. Use parentheses if necessary to
-determine the order of nesting. In the absence of parentheses,
-<literal>JOIN</literal>s nest left-to-right. In any case
-<literal>JOIN</literal> binds more tightly than the commas
-separating <literal>FROM</> items.
+A <literal>JOIN</literal> clause combines two <literal>FROM</>
+items, which for convenience we will refer to as <quote>tables</>,
+though in reality they can be any type of <literal>FROM</> item.
+Use parentheses if necessary to determine the order of nesting.
+In the absence of parentheses, <literal>JOIN</literal>s nest
+left-to-right. In any case <literal>JOIN</literal> binds more
+tightly than the commas separating <literal>FROM</>-list items.
 </para>

 <para><literal>CROSS JOIN</> and <literal>INNER JOIN</literal>
 produce a simple Cartesian product, the same result as you get from
-listing the two items at the top level of <literal>FROM</>,
+listing the two tables at the top level of <literal>FROM</>,
 but restricted by the join condition (if any).
 <literal>CROSS JOIN</> is equivalent to <literal>INNER JOIN ON
 (TRUE)</>, that is, no rows are removed by qualification.
@@ -449,7 +450,7 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
 joined rows, plus one row for each unmatched right-hand row
 (extended with nulls on the left). This is just a notational
 convenience, since you could convert it to a <literal>LEFT
-OUTER JOIN</> by switching the left and right inputs.
+OUTER JOIN</> by switching the left and right tables.
 </para>

 <para><literal>FULL OUTER JOIN</> returns all the joined rows, plus
@@ -495,6 +496,47 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
 </para>
 </listitem>
 </varlistentry>
+
+<varlistentry>
+<term><literal>LATERAL</literal></term>
+<listitem>
+<para>The <literal>LATERAL</literal> key word can precede a
+sub-<command>SELECT</command> or function-call <literal>FROM</>
+item. This allows the sub-<command>SELECT</command> or function
+expression to refer to columns of <literal>FROM</> items that appear
+before it in the <literal>FROM</> list. (Without
+<literal>LATERAL</literal>, each <literal>FROM</> item is evaluated
+independently and so cannot cross-reference any other
+<literal>FROM</> item.) A <literal>LATERAL</literal> item can
+appear at top level in the <literal>FROM</> list, or within a
+<literal>JOIN</> tree; in the latter case it can also refer to any
+items that are on the left-hand side of a <literal>JOIN</> that it is
+on the right-hand side of.
+</para>
+
+<para>
+When a <literal>FROM</> item contains <literal>LATERAL</literal>
+cross-references, evaluation proceeds as follows: for each row of the
+<literal>FROM</> item providing the cross-referenced column(s), or
+set of rows of multiple <literal>FROM</> items providing the
+columns, the <literal>LATERAL</literal> item is evaluated using that
+row or row set's values of the columns. The resulting row(s) are
+joined as usual with the rows they were computed from. This is
+repeated for each row or set of rows from the column source table(s).
+</para>
+
+<para>
+The column source table(s) must be <literal>INNER</> or
+<literal>LEFT</> joined to the <literal>LATERAL</literal> item, else
+there would not be a well-defined set of rows from which to compute
+each set of rows for the <literal>LATERAL</literal> item. Thus,
+although a construct such as <literal><replaceable>X</> RIGHT JOIN
+LATERAL <replaceable>Y</></literal> is syntactically valid, it is
+not actually allowed for <replaceable>Y</> to reference
+<replaceable>X</>.
+</para>
+</listitem>
+</varlistentry>
 </variablelist>
 </para>
 </refsect2>
@@ -1532,6 +1574,26 @@ SELECT distance, employee_name FROM employee_recursive;
 else the query will loop indefinitely. (See <xref linkend="queries-with">
 for more examples.)
 </para>
+
+<para>
+This example uses <literal>LATERAL</> to apply a set-returning function
+<function>get_product_names()</> for each row of the
+<structname>manufacturers</> table:
+
+<programlisting>
+SELECT m.name AS mname, pname
+FROM manufacturers m, LATERAL get_product_names(m.id) pname;
+</programlisting>
+
+Manufacturers not currently having any products would not appear in the
+result, since it is an inner join. If we wished to include the names of
+such manufacturers in the result, we could do:
+
+<programlisting>
+SELECT m.name AS mname, pname
+FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON true;
+</programlisting>
+</para>
 </refsect1>

 <refsect1>
@@ -1611,6 +1673,20 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
 </para>
 </refsect2>

+<refsect2>
+<title>Function Calls in <literal>FROM</literal></title>
+
+<para>
+<productname>PostgreSQL</productname> allows a function call to be
+written directly as a member of the <literal>FROM</> list. In the SQL
+standard it would be necessary to wrap such a function call in a
+sub-<command>SELECT</command>; that is, the syntax
+<literal>FROM <replaceable>func</>(...) <replaceable>alias</></literal>
+is approximately equivalent to
+<literal>FROM (SELECT <replaceable>func</>(...)) <replaceable>alias</></literal>.
+</para>
+</refsect2>
+
 <refsect2>
 <title>Namespace Available to <literal>GROUP BY</literal> and <literal>ORDER BY</literal></title>

src/backend/nodes/copyfuncs.c

+3
@@ -1973,6 +1973,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
     COPY_NODE_FIELD(ctecolcollations);
     COPY_NODE_FIELD(alias);
     COPY_NODE_FIELD(eref);
+    COPY_SCALAR_FIELD(lateral);
     COPY_SCALAR_FIELD(inh);
     COPY_SCALAR_FIELD(inFromCl);
     COPY_SCALAR_FIELD(requiredPerms);
@@ -2250,6 +2251,7 @@ _copyRangeSubselect(const RangeSubselect *from)
 {
     RangeSubselect *newnode = makeNode(RangeSubselect);

+    COPY_SCALAR_FIELD(lateral);
     COPY_NODE_FIELD(subquery);
     COPY_NODE_FIELD(alias);

@@ -2261,6 +2263,7 @@ _copyRangeFunction(const RangeFunction *from)
 {
     RangeFunction *newnode = makeNode(RangeFunction);

+    COPY_SCALAR_FIELD(lateral);
     COPY_NODE_FIELD(funccallnode);
     COPY_NODE_FIELD(alias);
     COPY_NODE_FIELD(coldeflist);

src/backend/nodes/equalfuncs.c

+3
@@ -2161,6 +2161,7 @@ _equalWindowDef(const WindowDef *a, const WindowDef *b)
 static bool
 _equalRangeSubselect(const RangeSubselect *a, const RangeSubselect *b)
 {
+    COMPARE_SCALAR_FIELD(lateral);
     COMPARE_NODE_FIELD(subquery);
     COMPARE_NODE_FIELD(alias);

@@ -2170,6 +2171,7 @@ _equalRangeSubselect(const RangeSubselect *a, const RangeSubselect *b)
 static bool
 _equalRangeFunction(const RangeFunction *a, const RangeFunction *b)
 {
+    COMPARE_SCALAR_FIELD(lateral);
     COMPARE_NODE_FIELD(funccallnode);
     COMPARE_NODE_FIELD(alias);
     COMPARE_NODE_FIELD(coldeflist);
@@ -2287,6 +2289,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
     COMPARE_NODE_FIELD(ctecolcollations);
     COMPARE_NODE_FIELD(alias);
     COMPARE_NODE_FIELD(eref);
+    COMPARE_SCALAR_FIELD(lateral);
     COMPARE_SCALAR_FIELD(inh);
     COMPARE_SCALAR_FIELD(inFromCl);
     COMPARE_SCALAR_FIELD(requiredPerms);

src/backend/nodes/outfuncs.c

+3
@@ -2362,6 +2362,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
             break;
     }

+    WRITE_BOOL_FIELD(lateral);
     WRITE_BOOL_FIELD(inh);
     WRITE_BOOL_FIELD(inFromCl);
     WRITE_UINT_FIELD(requiredPerms);
@@ -2565,6 +2566,7 @@ _outRangeSubselect(StringInfo str, const RangeSubselect *node)
 {
     WRITE_NODE_TYPE("RANGESUBSELECT");

+    WRITE_BOOL_FIELD(lateral);
     WRITE_NODE_FIELD(subquery);
     WRITE_NODE_FIELD(alias);
 }
@@ -2574,6 +2576,7 @@ _outRangeFunction(StringInfo str, const RangeFunction *node)
 {
     WRITE_NODE_TYPE("RANGEFUNCTION");

+    WRITE_BOOL_FIELD(lateral);
     WRITE_NODE_FIELD(funccallnode);
     WRITE_NODE_FIELD(alias);
     WRITE_NODE_FIELD(coldeflist);

src/backend/nodes/readfuncs.c

+1
@@ -1222,6 +1222,7 @@ _readRangeTblEntry(void)
             break;
     }

+    READ_BOOL_FIELD(lateral);
     READ_BOOL_FIELD(inh);
     READ_BOOL_FIELD(inFromCl);
     READ_UINT_FIELD(requiredPerms);

src/backend/optimizer/geqo/geqo_eval.c

+10 -1
@@ -56,6 +56,7 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
     MemoryContext mycontext;
     MemoryContext oldcxt;
     RelOptInfo *joinrel;
+    Path *best_path;
     Cost fitness;
     int savelength;
     struct HTAB *savehash;
@@ -99,14 +100,22 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)

     /* construct the best path for the given combination of relations */
     joinrel = gimme_tree(root, tour, num_gene);
+    best_path = joinrel->cheapest_total_path;
+
+    /*
+     * If no unparameterized path, use the cheapest parameterized path for
+     * costing purposes. XXX revisit this after LATERAL dust settles
+     */
+    if (!best_path)
+        best_path = linitial(joinrel->cheapest_parameterized_paths);

     /*
      * compute fitness
      *
      * XXX geqo does not currently support optimization for partial result
      * retrieval --- how to fix?
      */
-    fitness = joinrel->cheapest_total_path->total_cost;
+    fitness = best_path->total_cost;

     /*
      * Restore join_rel_list to its former state, and put back original