Commit 91e18db

Docs updates for cross-type hashing.
1 parent 8076c8c commit 91e18db

2 files changed, +41 -15 lines changed

doc/src/sgml/xindex.sgml

Lines changed: 16 additions & 7 deletions
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.58 2007/02/01 00:28:18 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.59 2007/02/06 04:38:31 tgl Exp $ -->
 
 <sect1 id="xindex">
 <title>Interfacing Extensions To Indexes</title>
@@ -139,7 +139,7 @@
 </table>
 
 <para>
-Hash indexes express only bitwise equality, and so they use only one
+Hash indexes support only equality comparisons, and so they use only one
 strategy, shown in <xref linkend="xindex-hash-strat-table">.
 </para>
 
@@ -162,7 +162,7 @@
 </table>
 
 <para>
-GiST indexes are even more flexible: they do not have a fixed set of
+GiST indexes are more flexible: they do not have a fixed set of
 strategies at all. Instead, the <quote>consistency</> support routine
 of each particular GiST operator class interprets the strategy numbers
 however it likes. As an example, several of the built-in GiST index
@@ -802,14 +802,23 @@ ALTER OPERATOR FAMILY integer_ops USING btree ADD
 operator in the family there must be a support function having the same
 two input data types as the operator. It is recommended that a family be
 complete, i.e., for each combination of data types, all operators are
-included. An operator class should include just the non-cross-type
+included. Each operator class should include just the non-cross-type
 operators and support function for its data type.
 </para>
 
 <para>
-At this writing, hash indexes do not support cross-type operations,
-and so there is little use for a hash operator family larger than one
-operator class. This is expected to be relaxed in the future.
+To build a multiple-data-type hash operator family, compatible hash
+support functions must be created for each data type supported by the
+family. Here compatibility means that the functions are guaranteed to
+return the same hash code for any two values that are considered equal
+by the family's equality operators, even when the values are of different
+types. This is usually difficult to accomplish when the types have
+different physical representations, but it can be done in some cases.
+Notice that there is only one support function per data type, not one
+per equality operator. It is recommended that a family be complete, i.e.,
+provide an equality operator for each combination of data types.
+Each operator class should include just the non-cross-type equality
+operator and the support function for its data type.
 </para>
 
 <para>
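
Illustration (not part of the commit): the compatibility requirement described in the new paragraph can be sketched in C for integers of different widths. This is a minimal sketch under stated assumptions, not PostgreSQL's implementation. All names (`sketch_mix32`, `sketch_hash_int16`, `sketch_hash_int32`, `sketch_hash_int64`, `sketch_check_compatibility`) are hypothetical, and the mixing step is an arbitrary stand-in for a real hash function. The point is only that there is one function per data type, and that equal values of different types reach the mixer with identical bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit mixing step; a stand-in, NOT PostgreSQL's hash function. */
static uint32_t sketch_mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x85ebca6bU;
    x ^= x >> 13;
    x *= 0xc2b2ae35U;
    x ^= x >> 16;
    return x;
}

/* One support function per data type (illustrative names). */
uint32_t sketch_hash_int32(int32_t v)
{
    return sketch_mix32((uint32_t) v);
}

uint32_t sketch_hash_int16(int16_t v)
{
    /* Widen first, so equal 16-bit and 32-bit values feed identical bits. */
    return sketch_hash_int32((int32_t) v);
}

uint32_t sketch_hash_int64(int64_t v)
{
    uint32_t lo = (uint32_t) v;
    uint32_t hi = (uint32_t) ((uint64_t) v >> 32);

    /*
     * Fold the high half into the low half. For a value that fits in 32
     * bits, hi is all zeros (v >= 0) or all ones (v < 0), so the XOR
     * leaves lo unchanged and the result matches sketch_hash_int32.
     */
    lo ^= (v >= 0) ? hi : ~hi;
    return sketch_mix32(lo);
}

/* Checks the cross-type property on a few sample values. */
void sketch_check_compatibility(void)
{
    const int16_t samples[] = { 0, 1, -1, 42, -12345, INT16_MAX, INT16_MIN };
    int n = (int) (sizeof samples / sizeof samples[0]);

    for (int i = 0; i < n; i++)
    {
        int16_t v = samples[i];

        assert(sketch_hash_int16(v) == sketch_hash_int32((int32_t) v));
        assert(sketch_hash_int32((int32_t) v) == sketch_hash_int64((int64_t) v));
    }
}
```

Values that do not fit in the narrower type still hash to something, but only values that compare equal across types are required to collide.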

doc/src/sgml/xoper.sgml

Lines changed: 25 additions & 8 deletions
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/xoper.sgml,v 1.41 2007/02/01 19:10:24 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/xoper.sgml,v 1.42 2007/02/06 04:38:31 tgl Exp $ -->
 
 <sect1 id="xoper">
 <title>User-Defined Operators</title>
@@ -85,7 +85,7 @@ SELECT (a + b) AS c FROM test_complex;
 appropriate, because they can make for considerable speedups in execution
 of queries that use the operator. But if you provide them, you must be
 sure that they are right! Incorrect use of an optimization clause can
-result in server process crashes, subtly wrong output, or other Bad Things.
+result in slow queries, subtly wrong output, or other Bad Things.
 You can always leave out an optimization clause if you are not sure
 about it; the only consequence is that queries might run slower than
 they need to.
@@ -326,8 +326,8 @@ table1.column1 OP table2.column2
 The <literal>HASHES</literal> clause, if present, tells the system that
 it is permissible to use the hash join method for a join based on this
 operator. <literal>HASHES</> only makes sense for a binary operator that
-returns <literal>boolean</>, and in practice the operator had better be
-equality for some data type.
+returns <literal>boolean</>, and in practice the operator must represent
+equality for some data type or pair of data types.
 </para>
 
 <para>
@@ -337,7 +337,13 @@ table1.column1 OP table2.column2
 join will never compare them at all, implicitly assuming that the
 result of the join operator must be false. So it never makes sense
 to specify <literal>HASHES</literal> for operators that do not represent
-some form of equality.
+some form of equality. In most cases it is only practical to support
+hashing for operators that take the same data type on both sides.
+However, sometimes it is possible to design compatible hash functions
+for two or more datatypes; that is, functions that will generate the
+same hash codes for <quote>equal</> values, even though the values
+have different representations. For example, it's fairly simple
+to arrange this property when hashing integers of different widths.
 </para>
 
 <para>
@@ -346,9 +352,9 @@ table1.column1 OP table2.column2
 the operator, since of course the referencing operator family couldn't
 exist yet. But attempts to use the operator in hash joins will fail
 at run time if no such operator family exists. The system needs the
-operator family to find the data-type-specific hash function for the
-operator's input data type. Of course, you must also create a suitable
-hash function before you can create the operator family.
+operator family to find the data-type-specific hash function(s) for the
+operator's input data type(s). Of course, you must also create suitable
+hash functions before you can create the operator family.
 </para>
 
 <para>
@@ -366,6 +372,17 @@ table1.column1 OP table2.column2
 to ensure it generates the same hash value as positive zero.
 </para>
 
+<para>
+A hash-joinable operator must have a commutator (itself if the two
+operand data types are the same, or a related equality operator
+if they are different) that appears in the same operator family.
+If this is not the case, planner errors might occur when the operator
+is used. Also, it is a good idea (but not strictly required) for
+a hash operator family that supports multiple datatypes to provide
+equality operators for every combination of the datatypes; this
+allows better optimization.
+</para>
+
 <note>
 <para>
 The function underlying a hash-joinable operator must be marked
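
Illustration (not part of the commit): the context line in the last hunk about negative zero generating the same hash value as positive zero can be sketched in C. This is a minimal sketch under stated assumptions, not PostgreSQL's code. The names `sketch_fmix32`, `sketch_hash_float` and `sketch_check_zero` are hypothetical, `float` is assumed to be a 32-bit IEEE value, and NaN handling is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit mixing step; a stand-in for a real hash function. */
static uint32_t sketch_fmix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x85ebca6bU;
    x ^= x >> 13;
    x *= 0xc2b2ae35U;
    x ^= x >> 16;
    return x;
}

/* Assumes float is a 32-bit IEEE value; NaNs are not treated specially. */
uint32_t sketch_hash_float(float f)
{
    uint32_t bits;

    /*
     * +0.0 and -0.0 compare equal but have different bit patterns, so
     * both are mapped to one fixed hash code before the bits are read.
     */
    if (f == 0.0f)
        return 0;

    memcpy(&bits, &f, sizeof bits);
    return sketch_fmix32(bits);
}

/* Checks that the two zeros, which compare equal, also hash equal. */
void sketch_check_zero(void)
{
    assert(sketch_hash_float(0.0f) == sketch_hash_float(-0.0f));
}
```

Hashing the raw bits without the zero check would send the two zeros to different hash codes, and a hash join would then never compare them.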
