Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 0436f6b

Browse files
committed
Disallow set-returning functions inside CASE or COALESCE.
When we reimplemented SRFs in commit 69f4b9c, our initial choice was to allow the behavior to vary from historical practice in cases where a SRF call appeared within a conditional-execution construct (currently, only CASE or COALESCE). But that was controversial to begin with, and subsequent discussion has resulted in a consensus that it's better to throw an error instead of executing the query differently from before, so long as we can provide a reasonably clear error message and a way to rewrite the query. Hence, add a parser mechanism to allow detection of such cases during parse analysis. The mechanism just requires storing, in the ParseState, a pointer to the set-returning FuncExpr or OpExpr most recently emitted by parse analysis. Then the parsing functions for CASE and COALESCE can detect the presence of a SRF in their arguments by noting whether this pointer changes while analyzing their arguments. Furthermore, if it does, it provides a suitable error cursor location for the complaint. (This means that if there's more than one SRF in the arguments, the error will point at the last one to be analyzed not the first. While connoisseurs of parsing behavior might find that odd, it's unlikely the average user would ever notice.) While at it, we can also provide more specific error messages than before about some pre-existing restrictions, such as no-SRFs-within-aggregates. Also, reject at parse time cases where a NULLIF or IS DISTINCT FROM construct would need to return a set. We've never supported that, but the restriction is depended on in more subtle ways now, so it seems wise to detect it at the start. Also, provide some documentation about how to rewrite a SRF-within-CASE query using a custom wrapper SRF. It turns out that the information_schema.user_mapping_options view contained an instance of exactly the behavior we're now forbidding; but rewriting it makes it more clear and safer too. initdb forced because of user_mapping_options change. Patch by me, with error message suggestions from Alvaro Herrera and Andres Freund, pursuant to a complaint from Regina Obe. Discussion: https://postgr.es/m/000001d2d5de$d8d66170$8a832450$@pcorp.us
1 parent 39da0f7 commit 0436f6b

File tree

17 files changed

+249
-91
lines changed

17 files changed

+249
-91
lines changed

doc/src/sgml/xfunc.sgml

+58-25
Original file line numberDiff line numberDiff line change
@@ -997,6 +997,29 @@ SELECT name, listchildren(name) FROM nodes;
997997
the <literal>LATERAL</> syntax.
998998
</para>
999999

1000+
<para>
1001+
<productname>PostgreSQL</>'s behavior for a set-returning function in a
1002+
query's select list is almost exactly the same as if the set-returning
1003+
function had been written in a <literal>LATERAL FROM</>-clause item
1004+
instead. For example,
1005+
<programlisting>
1006+
SELECT x, generate_series(1,5) AS g FROM tab;
1007+
</programlisting>
1008+
is almost equivalent to
1009+
<programlisting>
1010+
SELECT x, g FROM tab, LATERAL generate_series(1,5) AS g;
1011+
</programlisting>
1012+
It would be exactly the same, except that in this specific example,
1013+
the planner could choose to put <structname>g</> on the outside of the
1014+
nestloop join, since <structname>g</> has no actual lateral dependency
1015+
on <structname>tab</>. That would result in a different output row
1016+
order. Set-returning functions in the select list are always evaluated
1017+
as though they are on the inside of a nestloop join with the rest of
1018+
the <literal>FROM</> clause, so that the function(s) are run to
1019+
completion before the next row from the <literal>FROM</> clause is
1020+
considered.
1021+
</para>
1022+
10001023
<para>
10011024
If there is more than one set-returning function in the query's select
10021025
list, the behavior is similar to what you get from putting the functions
@@ -1028,32 +1051,19 @@ SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM tab;
10281051
</para>
10291052

10301053
<para>
1031-
This behavior also means that set-returning functions will be evaluated
1032-
even when it might appear that they should be skipped because of a
1033-
conditional-evaluation construct, such as <literal>CASE</>
1034-
or <literal>COALESCE</>. For example, consider
1054+
Set-returning functions cannot be used within conditional-evaluation
1055+
constructs, such as <literal>CASE</> or <literal>COALESCE</>. For
1056+
example, consider
10351057
<programlisting>
10361058
SELECT x, CASE WHEN x &gt; 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
10371059
</programlisting>
1038-
It might seem that this should produce five repetitions of input
1039-
rows that have <literal>x &gt; 0</>, and a single repetition of those
1040-
that do not; but actually it will produce five repetitions of every
1041-
input row. This is because <function>generate_series()</> is run first,
1042-
and then the <literal>CASE</> expression is applied to its result rows.
1043-
The behavior is thus comparable to
1044-
<programlisting>
1045-
SELECT x, CASE WHEN x &gt; 0 THEN g ELSE 0 END
1046-
FROM tab, LATERAL generate_series(1,5) AS g;
1047-
</programlisting>
1048-
It would be exactly the same, except that in this specific example,
1049-
the planner could choose to put <structname>g</> on the outside of the
1050-
nestloop join, since <structname>g</> has no actual lateral dependency
1051-
on <structname>tab</>. That would result in a different output row
1052-
order. Set-returning functions in the select list are always evaluated
1053-
as though they are on the inside of a nestloop join with the rest of
1054-
the <literal>FROM</> clause, so that the function(s) are run to
1055-
completion before the next row from the <literal>FROM</> clause is
1056-
considered.
1060+
It might seem that this should produce five repetitions of input rows
1061+
that have <literal>x &gt; 0</>, and a single repetition of those that do
1062+
not; but actually, because <function>generate_series(1, 5)</> would be
1063+
run in an implicit <literal>LATERAL FROM</> item before
1064+
the <literal>CASE</> expression is ever evaluated, it would produce five
1065+
repetitions of every input row. To reduce confusion, such cases produce
1066+
a parse-time error instead.
10571067
</para>
10581068

10591069
<note>
@@ -1078,11 +1088,34 @@ SELECT x, CASE WHEN x &gt; 0 THEN g ELSE 0 END
10781088
functions. Also, nested set-returning functions did not work as
10791089
described above; instead, a set-returning function could have at most
10801090
one set-returning argument, and each nest of set-returning functions
1081-
was run independently. The behavior for conditional execution
1082-
(set-returning functions inside <literal>CASE</> etc) was different too.
1091+
was run independently. Also, conditional execution (set-returning
1092+
functions inside <literal>CASE</> etc) was previously allowed,
1093+
complicating things even more.
10831094
Use of the <literal>LATERAL</> syntax is recommended when writing
10841095
queries that need to work in older <productname>PostgreSQL</> versions,
10851096
because that will give consistent results across different versions.
1097+
If you have a query that is relying on conditional execution of a
1098+
set-returning function, you may be able to fix it by moving the
1099+
conditional test into a custom set-returning function. For example,
1100+
<programlisting>
1101+
SELECT x, CASE WHEN y &gt; 0 THEN generate_series(1, z) ELSE 5 END FROM tab;
1102+
</programlisting>
1103+
could become
1104+
<programlisting>
1105+
CREATE FUNCTION case_generate_series(cond bool, start int, fin int, els int)
1106+
RETURNS SETOF int AS $$
1107+
BEGIN
1108+
IF cond THEN
1109+
RETURN QUERY SELECT generate_series(start, fin);
1110+
ELSE
1111+
RETURN QUERY SELECT els;
1112+
END IF;
1113+
END$$ LANGUAGE plpgsql;
1114+
1115+
SELECT x, case_generate_series(y &gt; 0, 1, z, 5) FROM tab;
1116+
</programlisting>
1117+
This formulation will work the same in all versions
1118+
of <productname>PostgreSQL</>.
10861119
</para>
10871120
</note>
10881121
</sect2>

src/backend/catalog/information_schema.sql

+5-3
Original file line numberDiff line numberDiff line change
@@ -2936,12 +2936,14 @@ CREATE VIEW user_mapping_options AS
29362936
SELECT authorization_identifier,
29372937
foreign_server_catalog,
29382938
foreign_server_name,
2939-
CAST((pg_options_to_table(um.umoptions)).option_name AS sql_identifier) AS option_name,
2939+
CAST(opts.option_name AS sql_identifier) AS option_name,
29402940
CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user)
29412941
OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE'))
2942-
OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user) THEN (pg_options_to_table(um.umoptions)).option_value
2942+
OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user)
2943+
THEN opts.option_value
29432944
ELSE NULL END AS character_data) AS option_value
2944-
FROM _pg_user_mappings um;
2945+
FROM _pg_user_mappings um,
2946+
pg_options_to_table(um.umoptions) opts;
29452947

29462948
GRANT SELECT ON user_mapping_options TO PUBLIC;
29472949

src/backend/executor/functions.c

+1
Original file line numberDiff line numberDiff line change
@@ -388,6 +388,7 @@ sql_fn_post_column_ref(ParseState *pstate, ColumnRef *cref, Node *var)
388388
param = ParseFuncOrColumn(pstate,
389389
list_make1(subfield),
390390
list_make1(param),
391+
pstate->p_last_srf,
391392
NULL,
392393
cref->location);
393394
}

src/backend/parser/parse_agg.c

+8
Original file line numberDiff line numberDiff line change
@@ -705,6 +705,14 @@ check_agg_arguments_walker(Node *node,
705705
}
706706
/* Continue and descend into subtree */
707707
}
708+
/* We can throw error on sight for a set-returning function */
709+
if ((IsA(node, FuncExpr) &&((FuncExpr *) node)->funcretset) ||
710+
(IsA(node, OpExpr) &&((OpExpr *) node)->opretset))
711+
ereport(ERROR,
712+
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
713+
errmsg("aggregate function calls cannot contain set-returning function calls"),
714+
errhint("You might be able to move the set-returning function into a LATERAL FROM item."),
715+
parser_errposition(context->pstate, exprLocation(node))));
708716
/* We can throw error on sight for a window function */
709717
if (IsA(node, WindowFunc))
710718
ereport(ERROR,

src/backend/parser/parse_clause.c

+32-6
Original file line numberDiff line numberDiff line change
@@ -572,6 +572,8 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r)
572572
List *pair = (List *) lfirst(lc);
573573
Node *fexpr;
574574
List *coldeflist;
575+
Node *newfexpr;
576+
Node *last_srf;
575577

576578
/* Disassemble the function-call/column-def-list pairs */
577579
Assert(list_length(pair) == 2);
@@ -618,13 +620,25 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r)
618620
Node *arg = (Node *) lfirst(lc);
619621
FuncCall *newfc;
620622

623+
last_srf = pstate->p_last_srf;
624+
621625
newfc = makeFuncCall(SystemFuncName("unnest"),
622626
list_make1(arg),
623627
fc->location);
624628

625-
funcexprs = lappend(funcexprs,
626-
transformExpr(pstate, (Node *) newfc,
627-
EXPR_KIND_FROM_FUNCTION));
629+
newfexpr = transformExpr(pstate, (Node *) newfc,
630+
EXPR_KIND_FROM_FUNCTION);
631+
632+
/* nodeFunctionscan.c requires SRFs to be at top level */
633+
if (pstate->p_last_srf != last_srf &&
634+
pstate->p_last_srf != newfexpr)
635+
ereport(ERROR,
636+
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
637+
errmsg("set-returning functions must appear at top level of FROM"),
638+
parser_errposition(pstate,
639+
exprLocation(pstate->p_last_srf))));
640+
641+
funcexprs = lappend(funcexprs, newfexpr);
628642

629643
funcnames = lappend(funcnames,
630644
FigureColname((Node *) newfc));
@@ -638,9 +652,21 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r)
638652
}
639653

640654
/* normal case ... */
641-
funcexprs = lappend(funcexprs,
642-
transformExpr(pstate, fexpr,
643-
EXPR_KIND_FROM_FUNCTION));
655+
last_srf = pstate->p_last_srf;
656+
657+
newfexpr = transformExpr(pstate, fexpr,
658+
EXPR_KIND_FROM_FUNCTION);
659+
660+
/* nodeFunctionscan.c requires SRFs to be at top level */
661+
if (pstate->p_last_srf != last_srf &&
662+
pstate->p_last_srf != newfexpr)
663+
ereport(ERROR,
664+
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
665+
errmsg("set-returning functions must appear at top level of FROM"),
666+
parser_errposition(pstate,
667+
exprLocation(pstate->p_last_srf))));
668+
669+
funcexprs = lappend(funcexprs, newfexpr);
644670

645671
funcnames = lappend(funcnames,
646672
FigureColname(fexpr));

0 commit comments

Comments
 (0)