
Commit 44d5be0

Implement SQL-standard WITH clauses, including WITH RECURSIVE.

There are some unimplemented aspects: recursive queries must use UNION ALL (should allow UNION too), and we don't have SEARCH or CYCLE clauses. These might or might not get done for 8.4, but even without them it's a pretty useful feature.

There are also a couple of small loose ends and definitional quibbles, which I'll send a memo about to pgsql-hackers shortly. But let's land the patch now so we can get on with other development.

Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane

1 parent 607b2be commit 44d5be0


77 files changed, +5893 -313 lines changed

doc/src/sgml/errcodes.sgml

Lines changed: 7 additions & 1 deletion
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.24 2008/05/15 22:39:48 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.25 2008/10/04 21:56:52 tgl Exp $ -->
 
 <appendix id="errcodes-appendix">
 <title><productname>PostgreSQL</productname> Error Codes</title>
@@ -990,6 +990,12 @@
 <entry>grouping_error</entry>
 </row>
 
+<row>
+<entry><literal>42P19</literal></entry>
+<entry>INVALID RECURSION</entry>
+<entry>invalid_recursion</entry>
+</row>
+
 <row>
 <entry><literal>42830</literal></entry>
 <entry>INVALID FOREIGN KEY</entry>

doc/src/sgml/queries.sgml

Lines changed: 191 additions & 9 deletions
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.45 2008/02/15 22:17:06 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.46 2008/10/04 21:56:52 tgl Exp $ -->
 
 <chapter id="queries">
 <title>Queries</title>
@@ -28,10 +28,11 @@
 used to specify queries. The general syntax of the
 <command>SELECT</command> command is
 <synopsis>
-SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional><replaceable>sort_specification</replaceable></optional>
+<optional>WITH <replaceable>with_queries</replaceable></optional> SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional><replaceable>sort_specification</replaceable></optional>
 </synopsis>
 The following sections describe the details of the select list, the
-table expression, and the sort specification.
+table expression, and the sort specification. <literal>WITH</>
+queries are treated last since they are an advanced feature.
 </para>
 
 <para>
@@ -107,7 +108,7 @@ SELECT random();
 
 <sect2 id="queries-from">
 <title>The <literal>FROM</literal> Clause</title>
-
+
 <para>
 The <xref linkend="sql-from" endterm="sql-from-title"> derives a
 table from one or more other tables given in a comma-separated
@@ -211,7 +212,7 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
 <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> )
 <replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable>
 </synopsis>
-
+
 <para>
 The words <literal>INNER</literal> and
 <literal>OUTER</literal> are optional in all forms.
@@ -303,7 +304,7 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
 </para>
 </listitem>
 </varlistentry>
-
+
 <varlistentry>
 <term><literal>RIGHT OUTER JOIN</></term>
 
@@ -326,7 +327,7 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
 </para>
 </listitem>
 </varlistentry>
-
+
 <varlistentry>
 <term><literal>FULL OUTER JOIN</></term>
 
@@ -1042,7 +1043,7 @@ SELECT a AS value, b + c AS sum FROM ...
 <para>
 If no output column name is specified using <literal>AS</>,
 the system assigns a default column name. For simple column references,
-this is the name of the referenced column. For function
+this is the name of the referenced column. For function
 calls, this is the name of the function. For complex expressions,
 the system will generate a generic name.
 </para>
@@ -1302,7 +1303,7 @@ SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;
 <programlisting>
 SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong
 </programlisting>
-This restriction is made to reduce ambiguity. There is still
+This restriction is made to reduce ambiguity. There is still
 ambiguity if an <literal>ORDER BY</> item is a simple name that
 could match either an output column name or a column from the table
 expression. The output column is used in such cases. This would
@@ -1455,4 +1456,185 @@ SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression
 
 </sect1>
 
+
+<sect1 id="queries-with">
+<title><literal>WITH</literal> Queries</title>
+
+<indexterm zone="queries-with">
+<primary>WITH</primary>
+<secondary>in SELECT</secondary>
+</indexterm>
+
+<indexterm>
+<primary>common table expression</primary>
+<see>WITH</see>
+</indexterm>
+
+<para>
+<literal>WITH</> provides a way to write subqueries for use in a larger
+<literal>SELECT</> query. The subqueries can be thought of as defining
+temporary tables that exist just for this query. One use of this feature
+is to break down complicated queries into simpler parts. An example is:
+
+<programlisting>
+WITH regional_sales AS (
+SELECT region, SUM(amount) AS total_sales
+FROM orders
+GROUP BY region
+), top_regions AS (
+SELECT region
+FROM regional_sales
+WHERE total_sales &gt; (SELECT SUM(total_sales)/10 FROM regional_sales)
+)
+SELECT region,
+product,
+SUM(quantity) AS product_units,
+SUM(amount) AS product_sales
+FROM orders
+WHERE region IN (SELECT region FROM top_regions)
+GROUP BY region, product;
+</programlisting>
+
+which displays per-product sales totals in only the top sales regions.
+This example could have been written without <literal>WITH</>,
+but we'd have needed two levels of nested sub-SELECTs. It's a bit
+easier to follow this way.
+</para>
+
+<para>
+The optional <literal>RECURSIVE</> modifier changes <literal>WITH</>
+from a mere syntactic convenience into a feature that accomplishes
+things not otherwise possible in standard SQL. Using
+<literal>RECURSIVE</>, a <literal>WITH</> query can refer to its own
+output. A very simple example is this query to sum the integers from 1
+through 100:
+
+<programlisting>
+WITH RECURSIVE t(n) AS (
+VALUES (1)
+UNION ALL
+SELECT n+1 FROM t WHERE n &lt; 100
+)
+SELECT sum(n) FROM t;
+</programlisting>
+
+The general form of a recursive <literal>WITH</> query is always a
+<firstterm>non-recursive term</>, then <literal>UNION ALL</>, then a
+<firstterm>recursive term</>, where only the recursive term can contain
+a reference to the query's own output. Such a query is executed as
+follows:
+</para>
+
+<procedure>
+<title>Recursive Query Evaluation</title>
+
+<step performance="required">
+<para>
+Evaluate the non-recursive term. Include all its output rows in the
+result of the recursive query, and also place them in a temporary
+<firstterm>working table</>.
+</para>
+</step>
+
+<step performance="required">
+<para>
+So long as the working table is not empty, repeat these steps:
+</para>
+<substeps>
+<step performance="required">
+<para>
+Evaluate the recursive term, substituting the current contents of
+the working table for the recursive self-reference. Include all its
+output rows in the result of the recursive query, and also place them
+in a temporary <firstterm>intermediate table</>.
+</para>
+</step>
+
+<step performance="required">
+<para>
+Replace the contents of the working table with the contents of the
+intermediate table, then empty the intermediate table.
+</para>
+</step>
+</substeps>
+</step>
+</procedure>
+
+<note>
+<para>
+Strictly speaking, this process is iteration not recursion, but
+<literal>RECURSIVE</> is the terminology chosen by the SQL standards
+committee.
+</para>
+</note>
+
+<para>
+In the example above, the working table has just a single row in each step,
+and it takes on the values from 1 through 100 in successive steps. In
+the 100th step, there is no output because of the <literal>WHERE</>
+clause, and so the query terminates.
+</para>
+
+<para>
+Recursive queries are typically used to deal with hierarchical or
+tree-structured data. A useful example is this query to find all the
+direct and indirect sub-parts of a product, given only a table that
+shows immediate inclusions:
+
+<programlisting>
+WITH RECURSIVE included_parts(sub_part, part, quantity) AS (
+SELECT sub_part, part, quantity FROM parts WHERE part = 'our_product'
+UNION ALL
+SELECT p.sub_part, p.part, p.quantity
+FROM included_parts pr, parts p
+WHERE p.part = pr.sub_part
+)
+SELECT sub_part, SUM(quantity) as total_quantity
+FROM included_parts
+GROUP BY sub_part
+</programlisting>
+</para>
+
+<para>
+When working with recursive queries it is important to be sure that
+the recursive part of the query will eventually return no tuples,
+or else the query will loop indefinitely. A useful trick for
+development purposes is to place a <literal>LIMIT</> in the parent
+query. For example, this query would loop forever without the
+<literal>LIMIT</>:
+
+<programlisting>
+WITH RECURSIVE t(n) AS (
+SELECT 1
+UNION ALL
+SELECT n+1 FROM t
+)
+SELECT n FROM t LIMIT 100;
+</programlisting>
+
+This works because <productname>PostgreSQL</productname>'s implementation
+evaluates only as many rows of a <literal>WITH</> query as are actually
+demanded by the parent query. Using this trick in production is not
+recommended, because other systems might work differently.
+</para>
+
+<para>
+A useful property of <literal>WITH</> queries is that they are evaluated
+only once per execution of the parent query, even if they are referred to
+more than once by the parent query or sibling <literal>WITH</> queries.
+Thus, expensive calculations that are needed in multiple places can be
+placed within a <literal>WITH</> query to avoid redundant work. Another
+possible application is to prevent unwanted multiple evaluations of
+functions with side-effects.
+However, the other side of this coin is that the optimizer is less able to
+push restrictions from the parent query down into a <literal>WITH</> query
+than an ordinary sub-query. The <literal>WITH</> query will generally be
+evaluated as stated, without suppression of rows that the parent query
+might discard afterwards. (But, as mentioned above, evaluation might stop
+early if the reference(s) to the query demand only a limited number of
+rows.)
+</para>
+
+</sect1>
+
 </chapter>
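
The "Recursive Query Evaluation" procedure documented in the queries.sgml hunk above can be sketched outside the database. The Python below is only an illustration of the working-table / intermediate-table loop as the documentation describes it, not PostgreSQL's executor code. The function name recursive_query, the recursive_term callback, and the sample parts rows are all hypothetical names and data made up for this sketch. The generator is lazy, which loosely mirrors the documented behaviour that a LIMIT in the parent query stops evaluation early.

```python
from itertools import islice


def recursive_query(non_recursive_rows, recursive_term):
    """Lazily yield the rows of a recursive WITH query (illustrative only).

    recursive_term is a hypothetical callback: it receives the working
    table (a list of rows) and returns the rows of the recursive term.
    """
    # Step 1: evaluate the non-recursive term, emit its rows, and place
    # them in the working table.
    working = list(non_recursive_rows)
    yield from working

    # Step 2: repeat while the working table is not empty.
    while working:
        intermediate = []
        # Step 2a: the self-reference sees only the working table.
        for row in recursive_term(working):
            intermediate.append(row)
            yield row
        # Step 2b: the intermediate table replaces the working table.
        working = intermediate


# WITH RECURSIVE t(n) AS (VALUES (1) UNION ALL SELECT n+1 FROM t WHERE n < 100)
# SELECT sum(n) FROM t;
total = sum(recursive_query([1], lambda w: [n + 1 for n in w if n < 100]))

# Same query without the WHERE clause: endless, unless the consumer stops
# asking for rows (the analogue of LIMIT 100 in the parent query).
unbounded = recursive_query([1], lambda w: [n + 1 for n in w])
first_100 = list(islice(unbounded, 100))

# Hypothetical parts table for the included_parts example:
# each row is (sub_part, part, quantity).
parts = [
    ("wheel", "our_product", 4),
    ("frame", "our_product", 1),
    ("spoke", "wheel", 32),
    ("tube", "frame", 3),
    ("bolt", "other_product", 10),
]
seed = [p for p in parts if p[1] == "our_product"]
included = recursive_query(
    seed,
    # Join the working table (pr) to parts (p) on p.part = pr.sub_part.
    lambda working: [p for pr in working for p in parts if p[1] == pr[0]],
)

# SELECT sub_part, SUM(quantity) FROM included_parts GROUP BY sub_part
total_quantity = {}
for sub_part, _part, quantity in included:
    total_quantity[sub_part] = total_quantity.get(sub_part, 0) + quantity
```

With the sample rows, the bolt row never appears in the result because it belongs to a different product and is never reached from the seed rows. As in the documented query, quantities are summed as stored and are not multiplied along the parent chain.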
