author Tom Lane 2020-12-09 17:40:37 +0000
committer Tom Lane 2020-12-09 17:40:37 +0000
commit c7aba7c14efdbd9fc1bb44b4cb83bedee0c6a6fc
tree d6980ca2951d353475957a56b58866cd4fafcdd3 /doc
parent 8b069ef5dca97cd737a5fd64c420df3cd61ec1c9
Support subscripting of arbitrary types, not only arrays.
This patch generalizes the subscripting infrastructure so that any data type can be subscripted, if it provides a handler function to define what that means. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). It's expected that other types that want to use subscripting notation will define their own handlers. (This patch provides no such new features, though; it only lays the foundation for them.)

To do this, move the parser's semantic processing of subscripts (including coercion to whatever data type is required) into a method callback supplied by the handler. On the execution side, replace the ExecEvalSubscriptingRef* layer of functions with direct calls to callback-supplied execution routines. (Thus, essentially no new run-time overhead should be caused by this patch. Indeed, there is room to remove some overhead by supplying specialized execution routines. This patch does a little bit in that line, but more could be done.)

Additional work is required here and there to remove formerly hard-wired assumptions about the result type, collation, etc of a SubscriptingRef expression node; and to remove assumptions that the subscript values must be integers.

One useful side-effect of this is that we now have a less squishy mechanism for identifying whether a data type is a "true" array: instead of wiring in weird rules about typlen, we can look to see if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this to be bulletproof, we have to forbid user-defined types from using that handler directly; but there seems no good reason for them to do so.

This patch also removes assumptions that the number of subscripts is limited to MAXDIM (6), or indeed has any hard-wired limit. That limit still applies to types handled by array_subscript_handler or raw_array_subscript_handler, but to discourage other dependencies on this constant, I've moved it from c.h to utils/array.h.

Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov, Peter Eisentraut, Pavel Stehule

Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
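
The dispatch idea in the message above can be pictured with a small toy model. This is only a sketch in Python, not PostgreSQL code: `TypeEntry`, `subscript`, and `is_true_array` are hypothetical names invented for illustration, and the two handler functions merely borrow the names of the real C handlers. It assumes one-dimensional values, a lower bound of one for variable-length arrays, and that out-of-range subscripts yield NULL (modeled as `None`).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Toy model only: the field names echo the pg_type columns discussed in the
# commit message, but nothing here is PostgreSQL's actual C API.


def array_subscript_handler(value: Sequence, index: int):
    """Model of variable-length arrays: subscripts start at one."""
    pos = index - 1
    # Out-of-range subscripts are modeled as None (standing in for SQL NULL).
    return value[pos] if 0 <= pos < len(value) else None


def raw_array_subscript_handler(value: Sequence, index: int):
    """Model of fixed-length array-like types: subscripts start at zero."""
    return value[index] if 0 <= index < len(value) else None


@dataclass(frozen=True)
class TypeEntry:
    name: str
    # None plays the role of typsubscript = 0 (type is not subscriptable).
    typsubscript: Optional[Callable[[Sequence, int], object]] = None
    # None plays the role of typelem = 0.
    typelem: Optional[str] = None


def subscript(entry: TypeEntry, value: Sequence, index: int):
    """Dispatch to whatever handler the type provides, if any."""
    if entry.typsubscript is None:
        raise TypeError(f"cannot subscript type {entry.name}")
    return entry.typsubscript(value, index)


def is_true_array(entry: TypeEntry) -> bool:
    """A type counts as a 'true' array only if it uses the array handler."""
    return entry.typsubscript is array_subscript_handler


# Hypothetical catalog entries for illustration.
int4_array = TypeEntry("_int4", array_subscript_handler, "int4")
point = TypeEntry("point", raw_array_subscript_handler, "float8")
int4 = TypeEntry("int4")
```

In this model, `subscript(int4_array, [10, 20, 30], 1)` picks the first element while `subscript(point, (1.5, 2.5), 0)` does the same for the fixed-length type, mirroring the one-based versus zero-based behavior described in the documentation changes below. Only `int4_array` passes `is_true_array`, since the test looks at the handler rather than at any length rule.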
Diffstat (limited to 'doc')
-rw-r--r-- doc/src/sgml/catalogs.sgml | 38
-rw-r--r-- doc/src/sgml/ref/create_type.sgml | 76
2 files changed, 90 insertions, 24 deletions
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 79069ddfabe..62711ee83ff 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8742,24 +8742,36 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>typsubscript</structfield> <type>regproc</type>
+ (references <link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.<structfield>oid</structfield>)
+ </para>
+ <para>
+ Subscripting handler function's OID, or zero if this type doesn't
+ support subscripting. Types that are <quote>true</quote> array
+ types have <structfield>typsubscript</structfield>
+ = <function>array_subscript_handler</function>, but other types may
+ have other handler functions to implement specialized subscripting
+ behavior.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
<structfield>typelem</structfield> <type>oid</type>
(references <link linkend="catalog-pg-type"><structname>pg_type</structname></link>.<structfield>oid</structfield>)
</para>
<para>
If <structfield>typelem</structfield> is not 0 then it
- identifies another row in <structname>pg_type</structname>.
- The current type can then be subscripted like an array yielding
- values of type <structfield>typelem</structfield>. A
- <quote>true</quote> array type is variable length
- (<structfield>typlen</structfield> = -1),
- but some fixed-length (<structfield>typlen</structfield> &gt; 0) types
- also have nonzero <structfield>typelem</structfield>, for example
- <type>name</type> and <type>point</type>.
- If a fixed-length type has a <structfield>typelem</structfield> then
- its internal representation must be some number of values of the
- <structfield>typelem</structfield> data type with no other data.
- Variable-length array types have a header defined by the array
- subroutines.
+ identifies another row in <structname>pg_type</structname>,
+ defining the type yielded by subscripting. This should be 0
+ if <structfield>typsubscript</structfield> is 0. However, it can
+ be 0 when <structfield>typsubscript</structfield> isn't 0, if the
+ handler doesn't need <structfield>typelem</structfield> to
+ determine the subscripting result type.
+ Note that a <structfield>typelem</structfield> dependency is
+ considered to imply physical containment of the element type in
+ this type; so DDL changes on the element type might be restricted
+ by the presence of this type.
</para></entry>
</row>
diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 970b517db9f..d909ee0d33b 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -43,6 +43,7 @@ CREATE TYPE <replaceable class="parameter">name</replaceable> (
[ , TYPMOD_IN = <replaceable class="parameter">type_modifier_input_function</replaceable> ]
[ , TYPMOD_OUT = <replaceable class="parameter">type_modifier_output_function</replaceable> ]
[ , ANALYZE = <replaceable class="parameter">analyze_function</replaceable> ]
+ [ , SUBSCRIPT = <replaceable class="parameter">subscript_function</replaceable> ]
[ , INTERNALLENGTH = { <replaceable class="parameter">internallength</replaceable> | VARIABLE } ]
[ , PASSEDBYVALUE ]
[ , ALIGNMENT = <replaceable class="parameter">alignment</replaceable> ]
@@ -196,8 +197,9 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
<replaceable class="parameter">receive_function</replaceable>,
<replaceable class="parameter">send_function</replaceable>,
<replaceable class="parameter">type_modifier_input_function</replaceable>,
- <replaceable class="parameter">type_modifier_output_function</replaceable> and
- <replaceable class="parameter">analyze_function</replaceable>
+ <replaceable class="parameter">type_modifier_output_function</replaceable>,
+ <replaceable class="parameter">analyze_function</replaceable>, and
+ <replaceable class="parameter">subscript_function</replaceable>
are optional. Generally these functions have to be coded in C
or another low-level language.
</para>
@@ -319,6 +321,26 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The optional <replaceable class="parameter">subscript_function</replaceable>
+ allows the data type to be subscripted in SQL commands. Specifying this
+ function does not cause the type to be considered a <quote>true</quote>
+ array type; for example, it will not be a candidate for the result type
+ of <literal>ARRAY[]</literal> constructs. But if subscripting a value
+ of the type is a natural notation for extracting data from it, then
+ a <replaceable class="parameter">subscript_function</replaceable> can
+ be written to define what that means. The subscript function must be
+ declared to take a single argument of type <type>internal</type>, and
+ return an <type>internal</type> result, which is a pointer to a struct
+ of methods (functions) that implement subscripting.
+ The detailed API for subscript functions appears
+ in <filename>src/include/nodes/subscripting.h</filename>;
+ it may also be useful to read the array implementation
+ in <filename>src/backend/utils/adt/arraysubs.c</filename>.
+ Additional information appears in
+ <xref linkend="sql-createtype-array"/> below.
+ </para>
+
+ <para>
While the details of the new type's internal representation are only
known to the I/O functions and other functions you create to work with
the type, there are several properties of the internal representation
@@ -428,11 +450,12 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
</para>
<para>
- To indicate that a type is an array, specify the type of the array
+ To indicate that a type is a fixed-length array type,
+ specify the type of the array
elements using the <literal>ELEMENT</literal> key word. For example, to
define an array of 4-byte integers (<type>int4</type>), specify
- <literal>ELEMENT = int4</literal>. More details about array types
- appear below.
+ <literal>ELEMENT = int4</literal>. For more details,
+ see <xref linkend="sql-createtype-array"/> below.
</para>
<para>
@@ -456,7 +479,7 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
</para>
</refsect2>
- <refsect2>
+ <refsect2 id="sql-createtype-array" xreflabel="Array Types">
<title>Array Types</title>
<para>
@@ -469,14 +492,16 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
repeated until a non-colliding name is found.)
This implicitly-created array type is variable length and uses the
built-in input and output functions <literal>array_in</literal> and
- <literal>array_out</literal>. The array type tracks any changes in its
+ <literal>array_out</literal>. Furthermore, this type is what the system
+ uses for constructs such as <literal>ARRAY[]</literal> over the
+ user-defined type. The array type tracks any changes in its
element type's owner or schema, and is dropped if the element type is.
</para>
<para>
You might reasonably ask why there is an <option>ELEMENT</option>
option, if the system makes the correct array type automatically.
- The only case where it's useful to use <option>ELEMENT</option> is when you are
+ The main case where it's useful to use <option>ELEMENT</option> is when you are
making a fixed-length type that happens to be internally an array of a number of
identical things, and you want to allow these things to be accessed
directly by subscripting, in addition to whatever operations you plan
@@ -485,13 +510,32 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
using <literal>point[0]</literal> and <literal>point[1]</literal>.
Note that
this facility only works for fixed-length types whose internal form
- is exactly a sequence of identical fixed-length fields. A subscriptable
- variable-length type must have the generalized internal representation
- used by <literal>array_in</literal> and <literal>array_out</literal>.
+ is exactly a sequence of identical fixed-length fields.
For historical reasons (i.e., this is clearly wrong but it's far too
late to change it), subscripting of fixed-length array types starts from
zero, rather than from one as for variable-length arrays.
</para>
+
+ <para>
+ Specifying the <option>SUBSCRIPT</option> option allows a data type to
+ be subscripted, even though the system does not otherwise regard it as
+ an array type. The behavior just described for fixed-length arrays is
+ actually implemented by the <option>SUBSCRIPT</option> handler
+ function <function>raw_array_subscript_handler</function>, which is
+ used automatically if you specify <option>ELEMENT</option> for a
+ fixed-length type without also writing <option>SUBSCRIPT</option>.
+ </para>
+
+ <para>
+ When specifying a custom <option>SUBSCRIPT</option> function, it is
+ not necessary to specify <option>ELEMENT</option> unless
+ the <option>SUBSCRIPT</option> handler function needs to
+ consult <structfield>typelem</structfield> to find out what to return.
+ Be aware that specifying <option>ELEMENT</option> causes the system to
+ assume that the new type contains, or is somehow physically dependent on,
+ the element type; thus for example changing properties of the element
+ type won't be allowed if there are any columns of the dependent type.
+ </para>
</refsect2>
</refsect1>
@@ -655,6 +699,16 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
</varlistentry>
<varlistentry>
+ <term><replaceable class="parameter">subscript_function</replaceable></term>
+ <listitem>
+ <para>
+ The name of a function that defines what subscripting a value of the
+ data type does.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><replaceable class="parameter">internallength</replaceable></term>
<listitem>
<para>