|
1 |
| -<!-- $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.55 2007/01/20 23:13:01 tgl Exp $ --> |
| 1 | +<!-- $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.56 2007/01/23 20:45:28 tgl Exp $ --> |
2 | 2 |
|
3 | 3 | <sect1 id="xindex">
|
4 | 4 | <title>Interfacing Extensions To Indexes</title>
|
|
18 | 18 | complex numbers in ascending absolute value order.
|
19 | 19 | </para>
|
20 | 20 |
|
21 |
| - <note> |
22 |
| - <para> |
23 |
| - Prior to <productname>PostgreSQL</productname> release 7.3, it was |
24 |
| - necessary to make manual additions to the system catalogs |
25 |
| - <classname>pg_amop</>, <classname>pg_amproc</>, and |
26 |
| - <classname>pg_opclass</> in order to create a user-defined |
27 |
| - operator class. That approach is now deprecated in favor of using |
28 |
| - <xref linkend="sql-createopclass" endterm="sql-createopclass-title">, |
29 |
| - which is a much simpler and less error-prone way of creating the |
30 |
| - necessary catalog entries. |
31 |
| - </para> |
32 |
| - </note> |
| 21 | + <para> |
| 22 | + Operator classes can be grouped into <firstterm>operator families</> |
| 23 | + to show the relationships between semantically compatible classes. |
| 24 | + When only a single data type is involved, an operator class is sufficient, |
| 25 | + so we'll focus on that case first and then return to operator families. |
| 26 | + </para> |
33 | 27 |
|
34 |
| - <sect2 id="xindex-im"> |
| 28 | + <sect2 id="xindex-opclass"> |
35 | 29 | <title>Index Methods and Operator Classes</title>
|
36 | 30 |
|
37 | 31 | <para>
|
|
282 | 276 | </table>
|
283 | 277 |
|
284 | 278 | <para>
|
285 |
| - Note that all strategy operators return Boolean values. In |
| 279 | + Notice that all strategy operators return Boolean values. In |
286 | 280 | practice, all operators defined as index method strategies must
|
287 | 281 | return type <type>boolean</type>, since they must appear at the top
|
288 | 282 | level of a <literal>WHERE</> clause to be used with an index.
|
|
309 | 303 | functions should play each of these roles for a given data type and
|
310 | 304 | semantic interpretation. The index method defines the set
|
311 | 305 | of functions it needs, and the operator class identifies the correct
|
312 |
| - functions to use by assigning them to the <quote>support function numbers</>. |
| 306 | + functions to use by assigning them to the <quote>support function numbers</> |
| 307 | + specified by the index method. |
313 | 308 | </para>
|
314 | 309 |
|
315 | 310 | <para>
|
|
329 | 324 | <tbody>
|
330 | 325 | <row>
|
331 | 326 | <entry>
|
332 |
| - Compare two keys and return an integer less than zero, zero, or |
333 |
| - greater than zero, indicating whether the first key is less than, equal to, |
334 |
| - or greater than the second. |
| 327 | + Compare two keys and return an integer less than zero, zero, or |
| 328 | + greater than zero, indicating whether the first key is less than, |
| 329 | + equal to, or greater than the second. |
335 | 330 | </entry>
|
336 | 331 | <entry>1</entry>
|
337 | 332 | </row>
|
|
456 | 451 | <para>
|
457 | 452 | Unlike strategy operators, support functions return whichever data
|
458 | 453 | type the particular index method expects; for example in the case
|
459 |
| - of the comparison function for B-trees, a signed integer. |
| 454 | + of the comparison function for B-trees, a signed integer. The number |
| 455 | + and types of the arguments to each support function are likewise |
| 456 | + dependent on the index method. For B-tree and hash the support functions |
| 457 | + take the same input data types as do the operators included in the operator |
| 458 | + class, but this is not the case for most GIN and GiST support functions. |
460 | 459 | </para>
|
461 | 460 | </sect2>
|
462 | 461 |
|
@@ -644,69 +643,192 @@ CREATE OPERATOR CLASS complex_abs_ops
|
644 | 643 | </para>
|
645 | 644 | </sect2>
|
646 | 645 |
|
647 |
| - <sect2 id="xindex-opclass-crosstype"> |
648 |
| - <title>Cross-Data-Type Operator Classes</title> |
| 646 | + <sect2 id="xindex-opfamily"> |
| 647 | + <title>Operator Classes and Operator Families</title> |
649 | 648 |
|
650 | 649 | <para>
|
651 | 650 | So far we have implicitly assumed that an operator class deals with
|
652 | 651 | only one data type. While there certainly can be only one data type in
|
653 | 652 | a particular index column, it is often useful to index operations that
|
654 |
| - compare an indexed column to a value of a different data type. This is |
655 |
| - presently supported by the B-tree and GiST index methods. |
656 |
| - </para> |
657 |
| - |
658 |
| - <para> |
659 |
| - B-trees require the left-hand operand of each operator to be the indexed |
660 |
| - data type, but the right-hand operand can be of a different type. There |
661 |
| - must be a support function having a matching signature. For example, |
662 |
| - the built-in operator class for type <type>bigint</> (<type>int8</>) |
663 |
| - allows cross-type comparisons to <type>int4</> and <type>int2</>. It |
664 |
| - could be duplicated by this definition: |
| 653 | + compare an indexed column to a value of a different data type. Also, |
| 654 | + if there is use for a cross-data-type operator in connection with an |
| 655 | + operator class, it is often the case that the other data type has a |
| 656 | + related operator class of its own. It is helpful to make the connections |
| 657 | + between related classes explicit, because this can aid the planner in |
| 658 | + optimizing SQL queries (particularly for B-tree operator classes, since |
| 659 | + the planner contains a great deal of knowledge about how to work with them). |
| 660 | + </para> |
| 661 | + |
| 662 | + <para> |
| 663 | + To handle these needs, <productname>PostgreSQL</productname> |
| 664 | + uses the concept of an <firstterm>operator |
| 665 | + family</><indexterm><primary>operator family</></indexterm>. |
| 666 | + An operator family contains one or more operator classes, and may also |
| 667 | + contain indexable operators and corresponding support functions that |
| 668 | + belong to the family as a whole but not to any single class within the |
| 669 | + family. We say that such operators and functions are <quote>loose</> |
| 670 | + within the family, as opposed to being bound into a specific class. |
| 671 | + Typically each operator class contains single-data-type operators |
| 672 | + while cross-data-type operators are loose in the family. |
| 673 | + </para> |
| 674 | + |
| 675 | + <para> |
| 676 | + All the operators and functions in an operator family must have compatible |
| 677 | + semantics, where the compatibility requirements are set by the index |
| 678 | + method. You might therefore wonder why bother to single out particular |
| 679 | + subsets of the family as operator classes; and indeed for many purposes |
| 680 | + the class divisions are irrelevant and the family is the only interesting |
| 681 | + grouping. The reason for defining operator classes is that they specify |
| 682 | + how much of the family is needed to support any particular index. |
| 683 | + If there is an index using an operator class, then that operator class |
| 684 | + cannot be dropped without dropping the index — but other parts of |
| 685 | + the operator family, namely other operator classes and loose operators, |
| 686 | + could be dropped. Thus, an operator class should be specified to contain |
| 687 | + the minimum set of operators and functions that are reasonably needed |
| 688 | + to work with an index on a specific data type, and then related but |
| 689 | + non-essential operators can be added as loose members of the operator |
| 690 | + family. |
| 691 | + </para> |
| 692 | + |
| 693 | + <para> |
| 694 | + As an example, <productname>PostgreSQL</productname> has a built-in |
| 695 | + B-tree operator family <literal>integer_ops</>, which includes operator |
| 696 | + classes <literal>int8_ops</>, <literal>int4_ops</>, and |
| 697 | + <literal>int2_ops</> for indexes on <type>bigint</> (<type>int8</>), |
| 698 | + <type>integer</> (<type>int4</>), and <type>smallint</> (<type>int2</>) |
| 699 | + columns respectively. The family also contains cross-data-type comparison |
| 700 | + operators allowing any two of these types to be compared, so that an index |
| 701 | + on one of these types can be searched using a comparison value of another |
| 702 | + type. The family could be duplicated by these definitions: |
665 | 703 |
|
666 | 704 | <programlisting>
|
| 705 | +CREATE OPERATOR FAMILY integer_ops USING btree; |
| 706 | + |
667 | 707 | CREATE OPERATOR CLASS int8_ops
|
668 |
| -DEFAULT FOR TYPE int8 USING btree AS |
| 708 | +DEFAULT FOR TYPE int8 USING btree FAMILY integer_ops AS |
669 | 709 | -- standard int8 comparisons
|
670 | 710 | OPERATOR 1 < ,
|
671 | 711 | OPERATOR 2 <= ,
|
672 | 712 | OPERATOR 3 = ,
|
673 | 713 | OPERATOR 4 >= ,
|
674 | 714 | OPERATOR 5 > ,
|
675 |
| - FUNCTION 1 btint8cmp(int8, int8) , |
| 715 | + FUNCTION 1 btint8cmp(int8, int8) ; |
| 716 | + |
| 717 | +CREATE OPERATOR CLASS int4_ops |
| 718 | +DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS |
| 719 | + -- standard int4 comparisons |
| 720 | + OPERATOR 1 < , |
| 721 | + OPERATOR 2 <= , |
| 722 | + OPERATOR 3 = , |
| 723 | + OPERATOR 4 >= , |
| 724 | + OPERATOR 5 > , |
| 725 | + FUNCTION 1 btint4cmp(int4, int4) ; |
676 | 726 |
|
677 |
| - -- cross-type comparisons to int2 (smallint) |
| 727 | +CREATE OPERATOR CLASS int2_ops |
| 728 | +DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS |
| 729 | + -- standard int2 comparisons |
| 730 | + OPERATOR 1 < , |
| 731 | + OPERATOR 2 <= , |
| 732 | + OPERATOR 3 = , |
| 733 | + OPERATOR 4 >= , |
| 734 | + OPERATOR 5 > , |
| 735 | + FUNCTION 1 btint2cmp(int2, int2) ; |
| 736 | + |
| 737 | +ALTER OPERATOR FAMILY integer_ops USING btree ADD |
| 738 | + -- cross-type comparisons int8 vs int2 |
678 | 739 | OPERATOR 1 < (int8, int2) ,
|
679 | 740 | OPERATOR 2 <= (int8, int2) ,
|
680 | 741 | OPERATOR 3 = (int8, int2) ,
|
681 | 742 | OPERATOR 4 >= (int8, int2) ,
|
682 | 743 | OPERATOR 5 > (int8, int2) ,
|
683 | 744 | FUNCTION 1 btint82cmp(int8, int2) ,
|
684 | 745 |
|
685 |
| - -- cross-type comparisons to int4 (integer) |
| 746 | + -- cross-type comparisons int8 vs int4 |
686 | 747 | OPERATOR 1 < (int8, int4) ,
|
687 | 748 | OPERATOR 2 <= (int8, int4) ,
|
688 | 749 | OPERATOR 3 = (int8, int4) ,
|
689 | 750 | OPERATOR 4 >= (int8, int4) ,
|
690 | 751 | OPERATOR 5 > (int8, int4) ,
|
691 |
| - FUNCTION 1 btint84cmp(int8, int4) ; |
| 752 | + FUNCTION 1 btint84cmp(int8, int4) , |
| 753 | + |
| 754 | + -- cross-type comparisons int4 vs int2 |
| 755 | + OPERATOR 1 < (int4, int2) , |
| 756 | + OPERATOR 2 <= (int4, int2) , |
| 757 | + OPERATOR 3 = (int4, int2) , |
| 758 | + OPERATOR 4 >= (int4, int2) , |
| 759 | + OPERATOR 5 > (int4, int2) , |
| 760 | + FUNCTION 1 btint42cmp(int4, int2) , |
| 761 | + |
| 762 | + -- cross-type comparisons int4 vs int8 |
| 763 | + OPERATOR 1 < (int4, int8) , |
| 764 | + OPERATOR 2 <= (int4, int8) , |
| 765 | + OPERATOR 3 = (int4, int8) , |
| 766 | + OPERATOR 4 >= (int4, int8) , |
| 767 | + OPERATOR 5 > (int4, int8) , |
| 768 | + FUNCTION 1 btint48cmp(int4, int8) , |
| 769 | + |
| 770 | + -- cross-type comparisons int2 vs int8 |
| 771 | + OPERATOR 1 < (int2, int8) , |
| 772 | + OPERATOR 2 <= (int2, int8) , |
| 773 | + OPERATOR 3 = (int2, int8) , |
| 774 | + OPERATOR 4 >= (int2, int8) , |
| 775 | + OPERATOR 5 > (int2, int8) , |
| 776 | + FUNCTION 1 btint28cmp(int2, int8) , |
| 777 | + |
| 778 | + -- cross-type comparisons int2 vs int4 |
| 779 | + OPERATOR 1 < (int2, int4) , |
| 780 | + OPERATOR 2 <= (int2, int4) , |
| 781 | + OPERATOR 3 = (int2, int4) , |
| 782 | + OPERATOR 4 >= (int2, int4) , |
| 783 | + OPERATOR 5 > (int2, int4) , |
| 784 | + FUNCTION 1 btint24cmp(int2, int4) ; |
692 | 785 | </programlisting>
|
693 | 786 |
|
694 | 787 | Notice that this definition <quote>overloads</> the operator strategy and
|
695 |
| - support function numbers. This is allowed (for B-tree operator classes |
696 |
| - only) so long as each instance of a particular number has a different |
697 |
| - right-hand data type. The instances that are not cross-type are the |
698 |
| - default or primary operators of the operator class. |
| 788 | + support function numbers: each number occurs multiple times within the |
| 789 | + family. This is allowed so long as each instance of a |
| 790 | + particular number has distinct input data types. The instances that have |
| 791 | + both input types equal to an operator class's input type are the |
| 792 | + primary operators and support functions for that operator class, |
| 793 | + and in most cases should be declared as part of the operator class rather |
| 794 | + than as loose members of the family. |
| 795 | + </para> |
| 796 | + |
| 797 | + <para> |
| 798 | + In a B-tree operator family, all the operators in the family must sort |
| 799 | + compatibly, meaning that the transitive laws hold across all the data types |
| 800 | + supported by the family: <quote>if A = B and B = C, then A = |
| 801 | + C</>, and <quote>if A < B and B < C, then A < C</>. For each |
| 802 | + operator in the family there must be a support function having the same |
| 803 | + two input data types as the operator. It is recommended that a family be |
| 804 | + complete, i.e., for each combination of data types, all operators are |
| 805 | + included. An operator class should include just the non-cross-type |
| 806 | + operators and support function for its data type. |
699 | 807 | </para>
|
700 | 808 |
|
701 | 809 | <para>
|
702 |
| - GiST indexes do not allow overloading of strategy or support function |
703 |
| - numbers, but it is still possible to get the effect of supporting |
704 |
| - multiple right-hand data types, by assigning a distinct strategy number |
705 |
| - to each operator that needs to be supported. The <literal>consistent</> |
706 |
| - support function must determine what it needs to do based on the strategy |
707 |
| - number, and must be prepared to accept comparison values of the appropriate |
708 |
| - data types. |
| 810 | + At this writing, hash indexes do not support cross-type operations, |
| 811 | + and so there is little use for a hash operator family larger than one |
| 812 | + operator class. This is expected to be relaxed in the future. |
709 | 813 | </para>
|
| 814 | + |
| 815 | + <para> |
| 816 | + GIN and GiST indexes do not have any explicit notion of cross-data-type |
| 817 | + operations. The set of operators supported is just whatever the primary |
| 818 | + support functions for a given operator class can handle. |
| 819 | + </para> |
| 820 | + |
| 821 | + <note> |
| 822 | + <para> |
| 823 | + Prior to <productname>PostgreSQL</productname> 8.3, there was no concept |
| 824 | + of operator families, and so any cross-data-type operators intended to be |
| 825 | + used with an index had to be bound directly into the index's operator |
| 826 | + class. While this approach still works, it is deprecated because it |
| 827 | + makes an index's dependencies too broad, and because the planner can |
| 828 | + handle cross-data-type comparisons more effectively when both data types |
| 829 | + have operators in the same operator family. |
| 830 | + </para> |
| 831 | + </note> |
710 | 832 | </sect2>
|
711 | 833 |
|
712 | 834 | <sect2 id="xindex-opclass-dependencies">
|
@@ -774,7 +896,8 @@ DEFAULT FOR TYPE int8 USING btree AS
|
774 | 896 | </para>
|
775 | 897 |
|
776 | 898 | <para>
|
777 |
| - Normally, declaring an operator as a member of an operator class means |
| 899 | + Normally, declaring an operator as a member of an operator class |
| 900 | + (or family) means |
778 | 901 | that the index method can retrieve exactly the set of rows
|
779 | 902 | that satisfy a <literal>WHERE</> condition using the operator. For example,
|
780 | 903 | <programlisting>
|
|
0 commit comments