
Commit e3b0117

Implement comparison of generic records (composite types), and invent a pseudo-type record[] to represent arrays of possibly-anonymous composite types. Since composite datums carry their own type identification, no extra knowledge is needed at the array level.

The main reason for doing this right now is that it is necessary to support the general case of detection of cycles in recursive queries: if you need to compare more than one column to detect a cycle, you need to compare a ROW() to an array built from ROW()s, at least if you want to do it as the spec suggests.

Add some documentation and regression tests concerning the cycle detection issue.
1 parent d6dfa1e commit e3b0117

18 files changed, +809 -22 lines changed

doc/src/sgml/func.sgml (+15 -1)

@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.448 2008/10/03 07:33:08 heikki Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.449 2008/10/13 16:25:19 tgl Exp $ -->
 
 <chapter id="functions">
 <title>Functions and Operators</title>
@@ -10667,6 +10667,20 @@ AND
 be either true or false, never null.
 </para>
 
+<note>
+<para>
+The SQL specification requires row-wise comparison to return NULL if the
+result depends on comparing two NULL values or a NULL and a non-NULL.
+<productname>PostgreSQL</productname> does this only when comparing the
+results of two row constructors or comparing a row constructor to the
+output of a subquery (as in <xref linkend="functions-subquery">).
+In other contexts where two composite-type values are compared, two
+NULL field values are considered equal, and a NULL is considered larger
+than a non-NULL. This is necessary in order to have consistent sorting
+and indexing behavior for composite types.
+</para>
+</note>
+
 </sect2>
 </sect1>
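The ordering rule described in the func.sgml note above (two NULL fields compare equal, and a NULL sorts after any non-NULL) can be illustrated with a small standalone sketch. This is not the backend's comparison code, which is not part of this excerpt. The `Field` struct and `compare_rows` function are hypothetical names, and the fields are assumed to be plain integers for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one field of a composite value. */
typedef struct
{
    bool is_null;
    int  value;     /* ignored when is_null is true */
} Field;

/*
 * Compare two rows field by field, left to right.
 * Returns <0, 0 or >0; the result is never "unknown", which is what
 * makes the ordering usable for sorting and indexing.
 */
int
compare_rows(const Field *a, const Field *b, size_t nfields)
{
    assert(a != NULL && b != NULL);

    for (size_t i = 0; i < nfields; i++)
    {
        if (a[i].is_null && b[i].is_null)
            continue;       /* two NULL fields are considered equal */
        if (a[i].is_null)
            return 1;       /* NULL is considered larger than non-NULL */
        if (b[i].is_null)
            return -1;
        if (a[i].value != b[i].value)
            return (a[i].value < b[i].value) ? -1 : 1;
    }
    return 0;               /* all fields matched */
}
```

Because every pair of rows gets a definite answer, the comparison is a total order. The SQL-specification behavior of returning NULL would not provide one.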

doc/src/sgml/queries.sgml (+80 -3)

@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.47 2008/10/07 19:27:03 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.48 2008/10/13 16:25:19 tgl Exp $ -->
 
 <chapter id="queries">
 <title>Queries</title>
@@ -1604,8 +1604,85 @@ GROUP BY sub_part
 the recursive part of the query will eventually return no tuples,
 or else the query will loop indefinitely. Sometimes, using
 <literal>UNION</> instead of <literal>UNION ALL</> can accomplish this
-by discarding rows that duplicate previous output rows; this catches
-cycles that would otherwise repeat. A useful trick for testing queries
+by discarding rows that duplicate previous output rows. However, often a
+cycle does not involve output rows that are completely duplicate: it may be
+necessary to check just one or a few fields to see if the same point has
+been reached before. The standard method for handling such situations is
+to compute an array of the already-visited values. For example, consider
+the following query that searches a table <structname>graph</> using a
+<structfield>link</> field:
+
+<programlisting>
+WITH RECURSIVE search_graph(id, link, data, depth) AS (
+        SELECT g.id, g.link, g.data, 1
+        FROM graph g
+      UNION ALL
+        SELECT g.id, g.link, g.data, sg.depth + 1
+        FROM graph g, search_graph sg
+        WHERE g.id = sg.link
+)
+SELECT * FROM search_graph;
+</programlisting>
+
+This query will loop if the <structfield>link</> relationships contain
+cycles. Because we require a <quote>depth</> output, just changing
+<literal>UNION ALL</> to <literal>UNION</> would not eliminate the looping.
+Instead we need to recognize whether we have reached the same row again
+while following a particular path of links. We add two columns
+<structfield>path</> and <structfield>cycle</> to the loop-prone query:
+
+<programlisting>
+WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
+        SELECT g.id, g.link, g.data, 1,
+          ARRAY[g.id],
+          false
+        FROM graph g
+      UNION ALL
+        SELECT g.id, g.link, g.data, sg.depth + 1,
+          path || ARRAY[g.id],
+          g.id = ANY(path)
+        FROM graph g, search_graph sg
+        WHERE g.id = sg.link AND NOT cycle
+)
+SELECT * FROM search_graph;
+</programlisting>
+
+Aside from preventing cycles, the array value is often useful in its own
+right as representing the <quote>path</> taken to reach any particular row.
+</para>
+
+<para>
+In the general case where more than one field needs to be checked to
+recognize a cycle, use an array of rows. For example, if we needed to
+compare fields <structfield>f1</> and <structfield>f2</>:
+
+<programlisting>
+WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
+        SELECT g.id, g.link, g.data, 1,
+          ARRAY[ROW(g.f1, g.f2)],
+          false
+        FROM graph g
+      UNION ALL
+        SELECT g.id, g.link, g.data, sg.depth + 1,
+          path || ARRAY[ROW(g.f1, g.f2)],
+          ROW(g.f1, g.f2) = ANY(path)
+        FROM graph g, search_graph sg
+        WHERE g.id = sg.link AND NOT cycle
+)
+SELECT * FROM search_graph;
+</programlisting>
+</para>
+
+<tip>
+<para>
+Omit the <literal>ROW()</> syntax in the common case where only one field
+needs to be checked to recognize a cycle. This allows a simple array
+rather than a composite-type array to be used, gaining efficiency.
+</para>
+</tip>
+
+<para>
+A helpful trick for testing queries
 when you are not certain if they might loop is to place a <literal>LIMIT</>
 in the parent query. For example, this query would loop forever without
 the <literal>LIMIT</>:
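
The path-array technique documented in the queries.sgml hunk above can also be sketched outside SQL. The following is a simplified illustration, not anything taken from the commit. `GraphRow`, `walk_links` and `MAX_PATH` are hypothetical names. It assumes each id is unique and each row has at most one outgoing link, whereas the SQL join handles any number of matching rows. Unlike the SQL version, it does not emit the revisited row itself; it only reports that a cycle was found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PATH 64     /* arbitrary cap chosen for this sketch */

/* Hypothetical in-memory stand-in for one row of the "graph" table. */
typedef struct
{
    int id;
    int link;           /* id of the row this one points to; -1 = none */
} GraphRow;

/* Mirrors "g.id = ANY(path)": is id already among the visited ids? */
static bool
in_path(const int *path, size_t len, int id)
{
    for (size_t i = 0; i < len; i++)
        if (path[i] == id)
            return true;
    return false;
}

/* Linear lookup by id; returns NULL when no such row exists. */
static const GraphRow *
find_row(const GraphRow *rows, size_t nrows, int id)
{
    for (size_t i = 0; i < nrows; i++)
        if (rows[i].id == id)
            return &rows[i];
    return NULL;
}

/*
 * Follow link fields starting from start_id, recording visited ids in
 * path (which must have room for MAX_PATH entries).  Returns the number
 * of distinct rows visited and sets *cycle when the walk stopped because
 * it reached a row that was already on the path.
 */
size_t
walk_links(const GraphRow *rows, size_t nrows, int start_id,
           int *path, bool *cycle)
{
    size_t          depth = 0;
    const GraphRow *cur = find_row(rows, nrows, start_id);

    assert(path != NULL && cycle != NULL);
    *cycle = false;

    while (cur != NULL && depth < MAX_PATH)
    {
        if (in_path(path, depth, cur->id))
        {
            *cycle = true;      /* same row reached again on this path */
            break;
        }
        path[depth++] = cur->id;
        cur = find_row(rows, nrows, cur->link);
    }
    return depth;
}
```

Checking several fields instead of one, as in the `ROW(g.f1, g.f2)` variant, would amount to storing a small struct per path entry and comparing all of its members in `in_path`.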

src/backend/commands/indexcmds.c (+3 -2)

@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/commands/indexcmds.c,v 1.179 2008/08/25 22:42:32 tgl Exp $
+ *    $PostgreSQL: pgsql/src/backend/commands/indexcmds.c,v 1.180 2008/10/13 16:25:19 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -795,7 +795,8 @@ ComputeIndexAttrs(IndexInfo *indexInfo,
         atttype = attform->atttypid;
         ReleaseSysCache(atttuple);
     }
-    else if (attribute->expr && IsA(attribute->expr, Var))
+    else if (attribute->expr && IsA(attribute->expr, Var) &&
+             ((Var *) attribute->expr)->varattno != InvalidAttrNumber)
     {
         /* Tricky tricky, he wrote (column) ... treat as simple attr */
         Var *var = (Var *) attribute->expr;

src/backend/parser/parse_coerce.c (+61 -3)

@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/parser/parse_coerce.c,v 2.168 2008/10/06 17:39:26 tgl Exp $
+ *    $PostgreSQL: pgsql/src/backend/parser/parse_coerce.c,v 2.169 2008/10/13 16:25:19 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -46,6 +46,7 @@ static Node *coerce_record_to_complex(ParseState *pstate, Node *node,
                          CoercionContext ccontext,
                          CoercionForm cformat,
                          int location);
+static bool is_complex_array(Oid typid);
 
 
 /*
@@ -402,6 +403,21 @@ coerce_type(ParseState *pstate, Node *node,
         /* NB: we do NOT want a RelabelType here */
         return node;
     }
+#ifdef NOT_USED
+    if (inputTypeId == RECORDARRAYOID &&
+        is_complex_array(targetTypeId))
+    {
+        /* Coerce record[] to a specific complex array type */
+        /* not implemented yet ... */
+    }
+#endif
+    if (targetTypeId == RECORDARRAYOID &&
+        is_complex_array(inputTypeId))
+    {
+        /* Coerce a specific complex array type to record[] */
+        /* NB: we do NOT want a RelabelType here */
+        return node;
+    }
     if (typeInheritsFrom(inputTypeId, targetTypeId))
     {
         /*
@@ -492,6 +508,23 @@ can_coerce_type(int nargs, Oid *input_typeids, Oid *target_typeids,
             ISCOMPLEX(inputTypeId))
             continue;
 
+#ifdef NOT_USED                 /* not implemented yet */
+        /*
+         * If input is record[] and target is a composite array type,
+         * assume we can coerce (may need tighter checking here)
+         */
+        if (inputTypeId == RECORDARRAYOID &&
+            is_complex_array(targetTypeId))
+            continue;
+#endif
+
+        /*
+         * If input is a composite array type and target is record[], accept
+         */
+        if (targetTypeId == RECORDARRAYOID &&
+            is_complex_array(inputTypeId))
+            continue;
+
         /*
          * If input is a class type that inherits from target, accept
          */
@@ -1724,8 +1757,8 @@ IsPreferredType(TYPCATEGORY category, Oid type)
  * invokable, no-function-needed pg_cast entry.  Also, a domain is always
  * binary-coercible to its base type, though *not* vice versa (in the other
  * direction, one must apply domain constraint checks before accepting the
- * value as legitimate).  We also need to special-case the polymorphic
- * ANYARRAY type.
+ * value as legitimate).  We also need to special-case various polymorphic
+ * types.
  *
  * This function replaces IsBinaryCompatible(), which was an inherently
  * symmetric test.  Since the pg_cast entries aren't necessarily symmetric,
@@ -1765,6 +1798,16 @@ IsBinaryCoercible(Oid srctype, Oid targettype)
         if (type_is_enum(srctype))
             return true;
 
+    /* Also accept any composite type as coercible to RECORD */
+    if (targettype == RECORDOID)
+        if (ISCOMPLEX(srctype))
+            return true;
+
+    /* Also accept any composite array type as coercible to RECORD[] */
+    if (targettype == RECORDARRAYOID)
+        if (is_complex_array(srctype))
+            return true;
+
     /* Else look in pg_cast */
     tuple = SearchSysCache(CASTSOURCETARGET,
                            ObjectIdGetDatum(srctype),
@@ -2002,3 +2045,18 @@ find_typmod_coercion_function(Oid typeId,
 
     return result;
 }
+
+/*
+ * is_complex_array
+ *      Is this type an array of composite?
+ *
+ * Note: this will not return true for record[]; check for RECORDARRAYOID
+ * separately if needed.
+ */
+static bool
+is_complex_array(Oid typid)
+{
+    Oid         elemtype = get_element_type(typid);
+
+    return (OidIsValid(elemtype) && ISCOMPLEX(elemtype));
+}
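
The one-directional rule added to parse_coerce.c above (a specific composite array type is accepted where record[] is wanted, while the reverse stays under NOT_USED) can be mimicked in a standalone sketch. Everything prefixed `demo_` or `DEMO_` is hypothetical: the OIDs are made up, and a tiny static table stands in for the system catalogs that the real `get_element_type()` and `ISCOMPLEX()` consult. Following the comment on `is_complex_array`, the sketch treats record itself as not composite, so record[] is not reported as a complex array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid 0

/* Made-up OIDs for illustration only; not the real catalog values. */
#define DEMO_INT4OID         1
#define DEMO_INT4ARRAYOID    2
#define DEMO_POINT3OID       3  /* some named composite type */
#define DEMO_POINT3ARRAYOID  4
#define DEMO_RECORDOID       5
#define DEMO_RECORDARRAYOID  6

/* Hypothetical catalog entry: element type (if an array) and kind. */
typedef struct
{
    Oid  typid;
    Oid  elemtype;      /* InvalidOid when the type is not an array */
    bool is_composite;  /* plays the role of ISCOMPLEX() */
} DemoType;

static const DemoType demo_catalog[] = {
    {DEMO_INT4OID,        InvalidOid,      false},
    {DEMO_INT4ARRAYOID,   DEMO_INT4OID,    false},
    {DEMO_POINT3OID,      InvalidOid,      true},
    {DEMO_POINT3ARRAYOID, DEMO_POINT3OID,  false},
    {DEMO_RECORDOID,      InvalidOid,      false},
    {DEMO_RECORDARRAYOID, DEMO_RECORDOID,  false},
};

static const DemoType *
demo_lookup(Oid typid)
{
    size_t n = sizeof(demo_catalog) / sizeof(demo_catalog[0]);

    for (size_t i = 0; i < n; i++)
        if (demo_catalog[i].typid == typid)
            return &demo_catalog[i];
    return NULL;
}

/* Same shape as is_complex_array(): an array whose element is composite. */
bool
demo_is_complex_array(Oid typid)
{
    const DemoType *t = demo_lookup(typid);
    const DemoType *elem;

    if (t == NULL || t->elemtype == InvalidOid)
        return false;
    elem = demo_lookup(t->elemtype);
    return elem != NULL && elem->is_composite;
}

/* Only the two branches added to IsBinaryCoercible(), plus identity. */
bool
demo_is_binary_coercible(Oid srctype, Oid targettype)
{
    const DemoType *src = demo_lookup(srctype);

    if (srctype == targettype)
        return true;
    if (targettype == DEMO_RECORDOID && src != NULL && src->is_composite)
        return true;
    if (targettype == DEMO_RECORDARRAYOID && demo_is_complex_array(srctype))
        return true;
    return false;       /* the real function goes on to consult pg_cast */
}
```

The asymmetry matches the commit message: composite datums carry their own type identification, so widening to record[] needs no extra work at the array level, whereas narrowing record[] to a specific array type is left unimplemented in this commit.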
