Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 3696a60

Browse files
committed
SEARCH and CYCLE clauses
This adds the SQL standard feature that adds the SEARCH and CYCLE clauses to recursive queries to be able to do produce breadth- or depth-first search orders and detect cycles. These clauses can be rewritten into queries using existing syntax, and that is what this patch does in the rewriter. Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com
1 parent bb513b3 commit 3696a60

28 files changed

+2301
-33
lines changed

doc/src/sgml/queries.sgml

+63-1
Original file line numberDiff line numberDiff line change
@@ -2218,6 +2218,39 @@ SELECT * FROM search_tree <emphasis>ORDER BY depth</emphasis>;
22182218
in any case.
22192219
</para>
22202220
</tip>
2221+
2222+
<para>
2223+
There is built-in syntax to compute a depth- or breadth-first sort column.
2224+
For example:
2225+
2226+
<programlisting>
2227+
WITH RECURSIVE search_tree(id, link, data) AS (
2228+
SELECT t.id, t.link, t.data
2229+
FROM tree t
2230+
UNION ALL
2231+
SELECT t.id, t.link, t.data
2232+
FROM tree t, search_tree st
2233+
WHERE t.id = st.link
2234+
) <emphasis>SEARCH DEPTH FIRST BY id SET ordercol</emphasis>
2235+
SELECT * FROM search_tree ORDER BY ordercol;
2236+
2237+
WITH RECURSIVE search_tree(id, link, data) AS (
2238+
SELECT t.id, t.link, t.data
2239+
FROM tree t
2240+
UNION ALL
2241+
SELECT t.id, t.link, t.data
2242+
FROM tree t, search_tree st
2243+
WHERE t.id = st.link
2244+
) <emphasis>SEARCH BREADTH FIRST BY id SET ordercol</emphasis>
2245+
SELECT * FROM search_tree ORDER BY ordercol;
2246+
</programlisting>
2247+
This syntax is internally expanded to something similar to the above
2248+
hand-written forms. The <literal>SEARCH</literal> clause specifies whether
2249+
depth- or breadth first search is wanted, the list of columns to track for
2250+
sorting, and a column name that will contain the result data that can be
2251+
used for sorting. That column will implicitly be added to the output rows
2252+
of the CTE.
2253+
</para>
22212254
</sect3>
22222255

22232256
<sect3 id="queries-with-cycle">
@@ -2305,10 +2338,39 @@ SELECT * FROM search_graph;
23052338
</para>
23062339
</tip>
23072340

2341+
<para>
2342+
There is built-in syntax to simplify cycle detection. The above query can
2343+
also be written like this:
2344+
<programlisting>
2345+
WITH RECURSIVE search_graph(id, link, data, depth) AS (
2346+
SELECT g.id, g.link, g.data, 1
2347+
FROM graph g
2348+
UNION ALL
2349+
SELECT g.id, g.link, g.data, sg.depth + 1
2350+
FROM graph g, search_graph sg
2351+
WHERE g.id = sg.link
2352+
) <emphasis>CYCLE id SET is_cycle TO true DEFAULT false USING path</emphasis>
2353+
SELECT * FROM search_graph;
2354+
</programlisting>
2355+
and it will be internally rewritten to the above form. The
2356+
<literal>CYCLE</literal> clause specifies first the list of columns to
2357+
track for cycle detection, then a column name that will show whether a
2358+
cycle has been detected, then two values to use in that column for the yes
2359+
and no cases, and finally the name of another column that will track the
2360+
path. The cycle and path columns will implicitly be added to the output
2361+
rows of the CTE.
2362+
</para>
2363+
23082364
<tip>
23092365
<para>
23102366
The cycle path column is computed in the same way as the depth-first
2311-
ordering column show in the previous section.
2367+
ordering column show in the previous section. A query can have both a
2368+
<literal>SEARCH</literal> and a <literal>CYCLE</literal> clause, but a
2369+
depth-first search specification and a cycle detection specification would
2370+
create redundant computations, so it's more efficient to just use the
2371+
<literal>CYCLE</literal> clause and order by the path column. If
2372+
breadth-first ordering is wanted, then specifying both
2373+
<literal>SEARCH</literal> and <literal>CYCLE</literal> can be useful.
23122374
</para>
23132375
</tip>
23142376

doc/src/sgml/ref/select.sgml

+44
Original file line numberDiff line numberDiff line change
@@ -73,6 +73,8 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
7373
<phrase>and <replaceable class="parameter">with_query</replaceable> is:</phrase>
7474

7575
<replaceable class="parameter">with_query_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] AS [ [ NOT ] MATERIALIZED ] ( <replaceable class="parameter">select</replaceable> | <replaceable class="parameter">values</replaceable> | <replaceable class="parameter">insert</replaceable> | <replaceable class="parameter">update</replaceable> | <replaceable class="parameter">delete</replaceable> )
76+
[ SEARCH { BREADTH | DEPTH } FIRST BY <replaceable>column_name</replaceable> [, ...] SET <replaceable>search_seq_col_name</replaceable> ]
77+
[ CYCLE <replaceable>column_name</replaceable> [, ...] SET <replaceable>cycle_mark_col_name</replaceable> TO <replaceable>cycle_mark_value</replaceable> DEFAULT <replaceable>cycle_mark_default</replaceable> USING <replaceable>cycle_path_col_name</replaceable> ]
7678

7779
TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
7880
</synopsis>
@@ -276,6 +278,48 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
276278
queries that do not use recursion or forward references.
277279
</para>
278280

281+
<para>
282+
The optional <literal>SEARCH</literal> clause computes a <firstterm>search
283+
sequence column</firstterm> that can be used for ordering the results of a
284+
recursive query in either breadth-first or depth-first order. The
285+
supplied column name list specifies the row key that is to be used for
286+
keeping track of visited rows. A column named
287+
<replaceable>search_seq_col_name</replaceable> will be added to the result
288+
column list of the <literal>WITH</literal> query. This column can be
289+
ordered by in the outer query to achieve the respective ordering. See
290+
<xref linkend="queries-with-search"/> for examples.
291+
</para>
292+
293+
<para>
294+
The optional <literal>CYCLE</literal> clause is used to detect cycles in
295+
recursive queries. The supplied column name list specifies the row key
296+
that is to be used for keeping track of visited rows. A column named
297+
<replaceable>cycle_mark_col_name</replaceable> will be added to the result
298+
column list of the <literal>WITH</literal> query. This column will be set
299+
to <replaceable>cycle_mark_value</replaceable> when a cycle has been
300+
detected, else to <replaceable>cycle_mark_default</replaceable>.
301+
Furthermore, processing of the recursive union will stop when a cycle has
302+
been detected. <replaceable>cycle_mark_value</replaceable> and
303+
<replaceable>cycle_mark_default</replaceable> must be constants and they
304+
must be coercible to a common data type, and the data type must have an
305+
inequality operator. (The SQL standard requires that they be character
306+
strings, but PostgreSQL does not require that.) Furthermore, a column
307+
named <replaceable>cycle_path_col_name</replaceable> will be added to the
308+
result column list of the <literal>WITH</literal> query. This column is
309+
used internally for tracking visited rows. See <xref
310+
linkend="queries-with-cycle"/> for examples.
311+
</para>
312+
313+
<para>
314+
Both the <literal>SEARCH</literal> and the <literal>CYCLE</literal> clause
315+
are only valid for recursive <literal>WITH</literal> queries. The
316+
<replaceable>with_query</replaceable> must be a <literal>UNION</literal>
317+
(or <literal>UNION ALL</literal>) of two <literal>SELECT</literal> (or
318+
equivalent) commands (no nested <literal>UNION</literal>s). If both
319+
clauses are used, the column added by the <literal>SEARCH</literal> clause
320+
appears before the columns added by the <literal>CYCLE</literal> clause.
321+
</para>
322+
279323
<para>
280324
The primary query and the <literal>WITH</literal> queries are all
281325
(notionally) executed at the same time. This implies that the effects of

src/backend/catalog/dependency.c

+15
Original file line numberDiff line numberDiff line change
@@ -2264,6 +2264,21 @@ find_expr_references_walker(Node *node,
22642264
context->addrs);
22652265
/* fall through to examine substructure */
22662266
}
2267+
else if (IsA(node, CTECycleClause))
2268+
{
2269+
CTECycleClause *cc = (CTECycleClause *) node;
2270+
2271+
if (OidIsValid(cc->cycle_mark_type))
2272+
add_object_address(OCLASS_TYPE, cc->cycle_mark_type, 0,
2273+
context->addrs);
2274+
if (OidIsValid(cc->cycle_mark_collation))
2275+
add_object_address(OCLASS_COLLATION, cc->cycle_mark_collation, 0,
2276+
context->addrs);
2277+
if (OidIsValid(cc->cycle_mark_neop))
2278+
add_object_address(OCLASS_OPERATOR, cc->cycle_mark_neop, 0,
2279+
context->addrs);
2280+
/* fall through to examine substructure */
2281+
}
22672282
else if (IsA(node, Query))
22682283
{
22692284
/* Recurse into RTE subquery or not-yet-planned sublink subquery */

src/backend/nodes/copyfuncs.c

+40
Original file line numberDiff line numberDiff line change
@@ -2589,6 +2589,38 @@ _copyOnConflictClause(const OnConflictClause *from)
25892589
return newnode;
25902590
}
25912591

2592+
static CTESearchClause *
2593+
_copyCTESearchClause(const CTESearchClause *from)
2594+
{
2595+
CTESearchClause *newnode = makeNode(CTESearchClause);
2596+
2597+
COPY_NODE_FIELD(search_col_list);
2598+
COPY_SCALAR_FIELD(search_breadth_first);
2599+
COPY_STRING_FIELD(search_seq_column);
2600+
COPY_LOCATION_FIELD(location);
2601+
2602+
return newnode;
2603+
}
2604+
2605+
static CTECycleClause *
2606+
_copyCTECycleClause(const CTECycleClause *from)
2607+
{
2608+
CTECycleClause *newnode = makeNode(CTECycleClause);
2609+
2610+
COPY_NODE_FIELD(cycle_col_list);
2611+
COPY_STRING_FIELD(cycle_mark_column);
2612+
COPY_NODE_FIELD(cycle_mark_value);
2613+
COPY_NODE_FIELD(cycle_mark_default);
2614+
COPY_STRING_FIELD(cycle_path_column);
2615+
COPY_LOCATION_FIELD(location);
2616+
COPY_SCALAR_FIELD(cycle_mark_type);
2617+
COPY_SCALAR_FIELD(cycle_mark_typmod);
2618+
COPY_SCALAR_FIELD(cycle_mark_collation);
2619+
COPY_SCALAR_FIELD(cycle_mark_neop);
2620+
2621+
return newnode;
2622+
}
2623+
25922624
static CommonTableExpr *
25932625
_copyCommonTableExpr(const CommonTableExpr *from)
25942626
{
@@ -2598,6 +2630,8 @@ _copyCommonTableExpr(const CommonTableExpr *from)
25982630
COPY_NODE_FIELD(aliascolnames);
25992631
COPY_SCALAR_FIELD(ctematerialized);
26002632
COPY_NODE_FIELD(ctequery);
2633+
COPY_NODE_FIELD(search_clause);
2634+
COPY_NODE_FIELD(cycle_clause);
26012635
COPY_LOCATION_FIELD(location);
26022636
COPY_SCALAR_FIELD(cterecursive);
26032637
COPY_SCALAR_FIELD(cterefcount);
@@ -5682,6 +5716,12 @@ copyObjectImpl(const void *from)
56825716
case T_OnConflictClause:
56835717
retval = _copyOnConflictClause(from);
56845718
break;
5719+
case T_CTESearchClause:
5720+
retval = _copyCTESearchClause(from);
5721+
break;
5722+
case T_CTECycleClause:
5723+
retval = _copyCTECycleClause(from);
5724+
break;
56855725
case T_CommonTableExpr:
56865726
retval = _copyCommonTableExpr(from);
56875727
break;

src/backend/nodes/equalfuncs.c

+36
Original file line numberDiff line numberDiff line change
@@ -2841,13 +2841,43 @@ _equalOnConflictClause(const OnConflictClause *a, const OnConflictClause *b)
28412841
return true;
28422842
}
28432843

2844+
static bool
2845+
_equalCTESearchClause(const CTESearchClause *a, const CTESearchClause *b)
2846+
{
2847+
COMPARE_NODE_FIELD(search_col_list);
2848+
COMPARE_SCALAR_FIELD(search_breadth_first);
2849+
COMPARE_STRING_FIELD(search_seq_column);
2850+
COMPARE_LOCATION_FIELD(location);
2851+
2852+
return true;
2853+
}
2854+
2855+
static bool
2856+
_equalCTECycleClause(const CTECycleClause *a, const CTECycleClause *b)
2857+
{
2858+
COMPARE_NODE_FIELD(cycle_col_list);
2859+
COMPARE_STRING_FIELD(cycle_mark_column);
2860+
COMPARE_NODE_FIELD(cycle_mark_value);
2861+
COMPARE_NODE_FIELD(cycle_mark_default);
2862+
COMPARE_STRING_FIELD(cycle_path_column);
2863+
COMPARE_LOCATION_FIELD(location);
2864+
COMPARE_SCALAR_FIELD(cycle_mark_type);
2865+
COMPARE_SCALAR_FIELD(cycle_mark_typmod);
2866+
COMPARE_SCALAR_FIELD(cycle_mark_collation);
2867+
COMPARE_SCALAR_FIELD(cycle_mark_neop);
2868+
2869+
return true;
2870+
}
2871+
28442872
static bool
28452873
_equalCommonTableExpr(const CommonTableExpr *a, const CommonTableExpr *b)
28462874
{
28472875
COMPARE_STRING_FIELD(ctename);
28482876
COMPARE_NODE_FIELD(aliascolnames);
28492877
COMPARE_SCALAR_FIELD(ctematerialized);
28502878
COMPARE_NODE_FIELD(ctequery);
2879+
COMPARE_NODE_FIELD(search_clause);
2880+
COMPARE_NODE_FIELD(cycle_clause);
28512881
COMPARE_LOCATION_FIELD(location);
28522882
COMPARE_SCALAR_FIELD(cterecursive);
28532883
COMPARE_SCALAR_FIELD(cterefcount);
@@ -3735,6 +3765,12 @@ equal(const void *a, const void *b)
37353765
case T_OnConflictClause:
37363766
retval = _equalOnConflictClause(a, b);
37373767
break;
3768+
case T_CTESearchClause:
3769+
retval = _equalCTESearchClause(a, b);
3770+
break;
3771+
case T_CTECycleClause:
3772+
retval = _equalCTECycleClause(a, b);
3773+
break;
37383774
case T_CommonTableExpr:
37393775
retval = _equalCommonTableExpr(a, b);
37403776
break;

src/backend/nodes/nodeFuncs.c

+41-1
Original file line numberDiff line numberDiff line change
@@ -1566,6 +1566,12 @@ exprLocation(const Node *expr)
15661566
case T_OnConflictClause:
15671567
loc = ((const OnConflictClause *) expr)->location;
15681568
break;
1569+
case T_CTESearchClause:
1570+
loc = ((const CTESearchClause *) expr)->location;
1571+
break;
1572+
case T_CTECycleClause:
1573+
loc = ((const CTECycleClause *) expr)->location;
1574+
break;
15691575
case T_CommonTableExpr:
15701576
loc = ((const CommonTableExpr *) expr)->location;
15711577
break;
@@ -1909,6 +1915,7 @@ expression_tree_walker(Node *node,
19091915
case T_NextValueExpr:
19101916
case T_RangeTblRef:
19111917
case T_SortGroupClause:
1918+
case T_CTESearchClause:
19121919
/* primitive node types with no expression subnodes */
19131920
break;
19141921
case T_WithCheckOption:
@@ -2148,6 +2155,16 @@ expression_tree_walker(Node *node,
21482155
return true;
21492156
}
21502157
break;
2158+
case T_CTECycleClause:
2159+
{
2160+
CTECycleClause *cc = (CTECycleClause *) node;
2161+
2162+
if (walker(cc->cycle_mark_value, context))
2163+
return true;
2164+
if (walker(cc->cycle_mark_default, context))
2165+
return true;
2166+
}
2167+
break;
21512168
case T_CommonTableExpr:
21522169
{
21532170
CommonTableExpr *cte = (CommonTableExpr *) node;
@@ -2156,7 +2173,13 @@ expression_tree_walker(Node *node,
21562173
* Invoke the walker on the CTE's Query node, so it can
21572174
* recurse into the sub-query if it wants to.
21582175
*/
2159-
return walker(cte->ctequery, context);
2176+
if (walker(cte->ctequery, context))
2177+
return true;
2178+
2179+
if (walker(cte->search_clause, context))
2180+
return true;
2181+
if (walker(cte->cycle_clause, context))
2182+
return true;
21602183
}
21612184
break;
21622185
case T_List:
@@ -2615,6 +2638,7 @@ expression_tree_mutator(Node *node,
26152638
case T_NextValueExpr:
26162639
case T_RangeTblRef:
26172640
case T_SortGroupClause:
2641+
case T_CTESearchClause:
26182642
return (Node *) copyObject(node);
26192643
case T_WithCheckOption:
26202644
{
@@ -3019,6 +3043,17 @@ expression_tree_mutator(Node *node,
30193043
return (Node *) newnode;
30203044
}
30213045
break;
3046+
case T_CTECycleClause:
3047+
{
3048+
CTECycleClause *cc = (CTECycleClause *) node;
3049+
CTECycleClause *newnode;
3050+
3051+
FLATCOPY(newnode, cc, CTECycleClause);
3052+
MUTATE(newnode->cycle_mark_value, cc->cycle_mark_value, Node *);
3053+
MUTATE(newnode->cycle_mark_default, cc->cycle_mark_default, Node *);
3054+
return (Node *) newnode;
3055+
}
3056+
break;
30223057
case T_CommonTableExpr:
30233058
{
30243059
CommonTableExpr *cte = (CommonTableExpr *) node;
@@ -3031,6 +3066,10 @@ expression_tree_mutator(Node *node,
30313066
* recurse into the sub-query if it wants to.
30323067
*/
30333068
MUTATE(newnode->ctequery, cte->ctequery, Node *);
3069+
3070+
MUTATE(newnode->search_clause, cte->search_clause, CTESearchClause *);
3071+
MUTATE(newnode->cycle_clause, cte->cycle_clause, CTECycleClause *);
3072+
30343073
return (Node *) newnode;
30353074
}
30363075
break;
@@ -3913,6 +3952,7 @@ raw_expression_tree_walker(Node *node,
39133952
}
39143953
break;
39153954
case T_CommonTableExpr:
3955+
/* search_clause and cycle_clause are not interesting here */
39163956
return walker(((CommonTableExpr *) node)->ctequery, context);
39173957
default:
39183958
elog(ERROR, "unrecognized node type: %d",

0 commit comments

Comments
 (0)