Commit 112676f

Doc: improve documentation about composite-value usage.
Create a section specifically for the syntactic rules around whole-row variable usage, such as expansion of "foo.*". This was previously documented only haphazardly, with some critical info buried in unexpected places like xfunc-sql-composite-functions. Per repeated questions in different mailing lists.

Discussion: <16288.1479610770@sss.pgh.pa.us>
1 parent 275e8c8 commit 112676f

4 files changed, +231 -72 lines changed

doc/src/sgml/queries.sgml (+2 -1)
@@ -1457,7 +1457,8 @@ SELECT tbl1.a, tbl2.a, tbl1.b FROM ...
 <programlisting>
 SELECT tbl1.*, tbl2.a FROM ...
 </programlisting>
-(See also <xref linkend="queries-where">.)
+See <xref linkend="rowtypes-usage"> for more about
+the <replaceable>table_name</><literal>.*</> notation.
 </para>

 <para>

doc/src/sgml/rowtypes.sgml (+207 -7)
@@ -19,7 +19,7 @@
 column of a table can be declared to be of a composite type.
 </para>

-<sect2>
+<sect2 id="rowtypes-declaring">
 <title>Declaration of Composite Types</title>

 <para>
@@ -90,7 +90,7 @@ CREATE TABLE inventory_item (
 </sect2>

 <sect2>
-<title>Composite Value Input</title>
+<title>Constructing Composite Values</title>

 <indexterm>
 <primary>composite type</primary>
@@ -101,8 +101,9 @@ CREATE TABLE inventory_item (
 To write a composite value as a literal constant, enclose the field
 values within parentheses and separate them by commas. You can put double
 quotes around any field value, and must do so if it contains commas or
-parentheses. (More details appear below.) Thus, the general format of a
-composite constant is the following:
+parentheses. (More details appear <link
+linkend="rowtypes-io-syntax">below</link>.) Thus, the general format of
+a composite constant is the following:
 <synopsis>
 '( <replaceable>val1</replaceable> , <replaceable>val2</replaceable> , ... )'
 </synopsis>
@@ -129,7 +130,8 @@ CREATE TABLE inventory_item (
 the generic type constants discussed in <xref
 linkend="sql-syntax-constants-generic">. The constant is initially
 treated as a string and passed to the composite-type input conversion
-routine. An explicit type specification might be necessary.)
+routine. An explicit type specification might be necessary to tell
+which type to convert the constant to.)
 </para>

 <para>
@@ -143,7 +145,7 @@ ROW('fuzzy dice', 42, 1.99)
 ROW('', 42, NULL)
 </programlisting>
 The ROW keyword is actually optional as long as you have more than one
-field in the expression, so these can simplify to:
+field in the expression, so these can be simplified to:
 <programlisting>
 ('fuzzy dice', 42, 1.99)
 ('', 42, NULL)
@@ -153,7 +155,7 @@ ROW('', 42, NULL)
 </para>
 </sect2>

-<sect2>
+<sect2 id="rowtypes-accessing">
 <title>Accessing Composite Types</title>

 <para>
@@ -198,6 +200,11 @@ SELECT (my_func(...)).field FROM ...

 Without the extra parentheses, this will generate a syntax error.
 </para>
+
+<para>
+The special field name <literal>*</> means <quote>all fields</>, as
+further explained in <xref linkend="rowtypes-usage">.
+</para>
 </sect2>

 <sect2>
@@ -243,6 +250,199 @@ INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
 </para>
 </sect2>

+<sect2 id="rowtypes-usage">
+<title>Using Composite Types in Queries</title>
+
+<para>
+There are various special syntax rules and behaviors associated with
+composite types in queries. These rules provide useful shortcuts,
+but can be confusing if you don't know the logic behind them.
+</para>
+
+<para>
+In <productname>PostgreSQL</>, a reference to a table name (or alias)
+in a query is effectively a reference to the composite value of the
+table's current row. For example, if we had a table
+<structname>inventory_item</> as shown
+<link linkend="rowtypes-declaring">above</link>, we could write:
+<programlisting>
+SELECT c FROM inventory_item c;
+</programlisting>
+This query produces a single composite-valued column, so we might get
+output like:
+<programlisting>
+           c
+------------------------
+ ("fuzzy dice",42,1.99)
+(1 row)
+</programlisting>
+Note however that simple names are matched to column names before table
+names, so this example works only because there is no column
+named <structfield>c</> in the query's tables.
+</para>
+
+<para>
+The ordinary qualified-column-name
+syntax <replaceable>table_name</><literal>.</><replaceable>column_name</>
+can be understood as applying <link linkend="field-selection">field
+selection</link> to the composite value of the table's current row.
+(For efficiency reasons, it's not actually implemented that way.)
+</para>
+
+<para>
+When we write
+<programlisting>
+SELECT c.* FROM inventory_item c;
+</programlisting>
+then, according to the SQL standard, we should get the contents of the
+table expanded into separate columns:
+<programlisting>
+    name    | supplier_id | price
+------------+-------------+-------
+ fuzzy dice |          42 |  1.99
+(1 row)
+</programlisting>
+as if the query were
+<programlisting>
+SELECT c.name, c.supplier_id, c.price FROM inventory_item c;
+</programlisting>
+<productname>PostgreSQL</> will apply this expansion behavior to
+any composite-valued expression, although as shown <link
+linkend="rowtypes-accessing">above</link>, you need to write parentheses
+around the value that <literal>.*</> is applied to whenever it's not a
+simple table name. For example, if <function>myfunc()</> is a function
+returning a composite type with columns <structfield>a</>,
+<structfield>b</>, and <structfield>c</>, then these two queries have the
+same result:
+<programlisting>
+SELECT (myfunc(x)).* FROM some_table;
+SELECT (myfunc(x)).a, (myfunc(x)).b, (myfunc(x)).c FROM some_table;
+</programlisting>
+</para>
+
+<tip>
+<para>
+<productname>PostgreSQL</> handles column expansion by
+actually transforming the first form into the second. So, in this
+example, <function>myfunc()</> would get invoked three times per row
+with either syntax. If it's an expensive function you may wish to
+avoid that, which you can do with a query like:
+<programlisting>
+SELECT (m).* FROM (SELECT myfunc(x) AS m FROM some_table OFFSET 0) ss;
+</programlisting>
+The <literal>OFFSET 0</> clause keeps the optimizer
+from <quote>flattening</> the sub-select to arrive at the form with
+multiple calls of <function>myfunc()</>.
+</para>
+</tip>
+
+<para>
+The <replaceable>composite_value</><literal>.*</> syntax results in
+column expansion of this kind when it appears at the top level of
+a <link linkend="queries-select-lists"><command>SELECT</> output
+list</link>, a <link linkend="dml-returning"><literal>RETURNING</>
+list</link> in <command>INSERT</>/<command>UPDATE</>/<command>DELETE</>,
+a <link linkend="queries-values"><literal>VALUES</> clause</link>, or
+a <link linkend="sql-syntax-row-constructors">row constructor</link>.
+In all other contexts (including when nested inside one of those
+constructs), attaching <literal>.*</> to a composite value does not
+change the value, since it means <quote>all columns</> and so the
+same composite value is produced again. For example,
+if <function>somefunc()</> accepts a composite-valued argument,
+these queries are the same:
+
+<programlisting>
+SELECT somefunc(c.*) FROM inventory_item c;
+SELECT somefunc(c) FROM inventory_item c;
+</programlisting>
+
+In both cases, the current row of <structname>inventory_item</> is
+passed to the function as a single composite-valued argument.
+Even though <literal>.*</> does nothing in such cases, using it is good
+style, since it makes clear that a composite value is intended. In
+particular, the parser will consider <literal>c</> in <literal>c.*</> to
+refer to a table name or alias, not to a column name, so that there is
+no ambiguity; whereas without <literal>.*</>, it is not clear
+whether <literal>c</> means a table name or a column name, and in fact
+the column-name interpretation will be preferred if there is a column
+named <literal>c</>.
+</para>
+
+<para>
+Another example demonstrating these concepts is that all these queries
+mean the same thing:
+<programlisting>
+SELECT * FROM inventory_item c ORDER BY c;
+SELECT * FROM inventory_item c ORDER BY c.*;
+SELECT * FROM inventory_item c ORDER BY ROW(c.*);
+</programlisting>
+All of these <literal>ORDER BY</> clauses specify the row's composite
+value, resulting in sorting the rows according to the rules described
+in <xref linkend="composite-type-comparison">. However,
+if <structname>inventory_item</> contained a column
+named <structfield>c</>, the first case would be different from the
+others, as it would mean to sort by that column only. Given the column
+names previously shown, these queries are also equivalent to those above:
+<programlisting>
+SELECT * FROM inventory_item c ORDER BY ROW(c.name, c.supplier_id, c.price);
+SELECT * FROM inventory_item c ORDER BY (c.name, c.supplier_id, c.price);
+</programlisting>
+(The last case uses a row constructor with the key word <literal>ROW</>
+omitted.)
+</para>
+
+<para>
+Another special syntactical behavior associated with composite values is
+that we can use <firstterm>functional notation</> for extracting a field
+of a composite value. The simple way to explain this is that
+the notations <literal><replaceable>field</>(<replaceable>table</>)</>
+and <literal><replaceable>table</>.<replaceable>field</></>
+are interchangeable. For example, these queries are equivalent:
+
+<programlisting>
+SELECT c.name FROM inventory_item c WHERE c.price &gt; 1000;
+SELECT name(c) FROM inventory_item c WHERE price(c) &gt; 1000;
+</programlisting>
+
+Moreover, if we have a function that accepts a single argument of a
+composite type, we can call it with either notation. These queries are
+all equivalent:
+
+<programlisting>
+SELECT somefunc(c) FROM inventory_item c;
+SELECT somefunc(c.*) FROM inventory_item c;
+SELECT c.somefunc FROM inventory_item c;
+</programlisting>
+</para>
+
+<para>
+This equivalence between functional notation and field notation
+makes it possible to use functions on composite types to implement
+<quote>computed fields</>.
+<indexterm>
+<primary>computed field</primary>
+</indexterm>
+<indexterm>
+<primary>field</primary>
+<secondary>computed</secondary>
+</indexterm>
+An application using the last query above wouldn't need to be directly
+aware that <literal>somefunc</> isn't a real column of the table.
+</para>
+
+<tip>
+<para>
+Because of this behavior, it's unwise to give a function that takes a
+single composite-type argument the same name as any of the fields of
+that composite type. If there is ambiguity, the field-name
+interpretation will be preferred, so that such a function could not be
+called without tricks. One way to force the function interpretation is
+to schema-qualify the function name, that is, write
+<literal><replaceable>schema</>.<replaceable>func</>(<replaceable>compositevalue</>)</literal>.
+</para>
+</tip>
+</sect2>
+
 <sect2 id="rowtypes-io-syntax">
 <title>Composite Type Input and Output Syntax</title>
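
Note on the rowtypes.sgml changes above: the behaviors the new section describes can be tried in a short psql session. The following is only a sketch and not part of the commit. It assumes a scratch database in which no inventory_item table exists yet, and price_with_tax is a hypothetical function introduced purely for illustration.

```sql
-- Run inside a transaction so nothing is left behind.
BEGIN;

-- Same columns as the inventory_item example used in the documentation.
CREATE TABLE inventory_item (
    name        text,
    supplier_id integer,
    price       numeric
);
INSERT INTO inventory_item VALUES ('fuzzy dice', 42, 1.99);

-- Whole-row reference: one composite-valued output column.
SELECT c FROM inventory_item c;

-- Column expansion: one output column per field.
SELECT c.* FROM inventory_item c;

-- Hypothetical function taking the row type as its single argument.
CREATE FUNCTION price_with_tax(inventory_item) RETURNS numeric
    AS $$ SELECT $1.price * 1.1 $$ LANGUAGE sql;

-- Functional notation and field notation are interchangeable,
-- so all three calls pass the current row to the function.
SELECT price_with_tax(c)   FROM inventory_item c;
SELECT price_with_tax(c.*) FROM inventory_item c;
SELECT c.price_with_tax    FROM inventory_item c;

-- Sorting by the row's composite value.
SELECT * FROM inventory_item c ORDER BY c.*;

ROLLBACK;
```

The function name deliberately differs from every column name, following the advice in the new tip that a field-name match would otherwise win.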

doc/src/sgml/syntax.sgml (+9 -7)
@@ -1449,12 +1449,13 @@ $1.somecolumn
 </para>

 <para>
-In a select list (see <xref linkend="queries-select-lists">), you
-can ask for all fields of a composite value by
+You can ask for all fields of a composite value by
 writing <literal>.*</literal>:
 <programlisting>
 (compositecol).*
 </programlisting>
+This notation behaves differently depending on context;
+see <xref linkend="rowtypes-usage"> for details.
 </para>
 </sect2>

@@ -1531,7 +1532,7 @@ sqrt(2)
 interchangeable. This behavior is not SQL-standard but is provided
 in <productname>PostgreSQL</> because it allows use of functions to
 emulate <quote>computed fields</>. For more information see
-<xref linkend="xfunc-sql-composite-functions">.
+<xref linkend="rowtypes-usage">.
 </para>
 </note>
 </sect2>
@@ -2291,7 +2292,8 @@ SELECT ROW(1,2.5,'this is a test');
 <replaceable>rowvalue</replaceable><literal>.*</literal>,
 which will be expanded to a list of the elements of the row value,
 just as occurs when the <literal>.*</> syntax is used at the top level
-of a <command>SELECT</> list. For example, if table <literal>t</> has
+of a <command>SELECT</> list (see <xref linkend="rowtypes-usage">).
+For example, if table <literal>t</> has
 columns <literal>f1</> and <literal>f2</>, these are the same:
 <programlisting>
 SELECT ROW(t.*, 42) FROM t;
@@ -2302,9 +2304,9 @@ SELECT ROW(t.f1, t.f2, 42) FROM t;
 <note>
 <para>
 Before <productname>PostgreSQL</productname> 8.2, the
-<literal>.*</literal> syntax was not expanded, so that writing
-<literal>ROW(t.*, 42)</> created a two-field row whose first field
-was another row value. The new behavior is usually more useful.
+<literal>.*</literal> syntax was not expanded in row constructors, so
+that writing <literal>ROW(t.*, 42)</> created a two-field row whose first
+field was another row value. The new behavior is usually more useful.
 If you need the old behavior of nested row values, write the inner
 row value without <literal>.*</literal>, for instance
 <literal>ROW(t, 42)</>.
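
Note on the last hunk: the difference between the expanded and the nested form of a row constructor is easy to see side by side. This is a minimal sketch rather than text from the commit; the table t with columns f1 and f2 follows the example in syntax.sgml, while the column types and the sample row are assumptions made here for illustration.

```sql
-- Run inside a transaction so the scratch table is discarded afterwards.
BEGIN;

-- Column types are assumed; the documentation names only the columns.
CREATE TABLE t (f1 integer, f2 text);
INSERT INTO t VALUES (1, 'one');

-- .* is expanded inside the row constructor: a three-field row,
-- the same as ROW(t.f1, t.f2, 42).
SELECT ROW(t.*, 42) FROM t;
SELECT ROW(t.f1, t.f2, 42) FROM t;

-- Without .*, the first field is itself a row value: a two-field row.
SELECT ROW(t, 42) FROM t;

ROLLBACK;
```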
