@@ -664,13 +664,6 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
</listitem>
</varlistentry>

- <varlistentry>
- <term><literal>de-u-co-phonebk-x-icu</literal></term>
- <listitem>
- <para>German collation, phone book variant</para>
- </listitem>
- </varlistentry>
-
<varlistentry>
<term><literal>de-AT-x-icu</literal></term>
<listitem>
@@ -683,13 +676,6 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
</listitem>
</varlistentry>

- <varlistentry>
- <term><literal>de-AT-u-co-phonebk-x-icu</literal></term>
- <listitem>
- <para>German collation for Austria, phone book variant</para>
- </listitem>
- </varlistentry>
-
<varlistentry>
<term><literal>und-x-icu</literal> (for <quote>undefined</quote>)</term>
<listitem>
@@ -709,6 +695,90 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
will draw an error along the lines of <quote>collation "de-x-icu" for
encoding "WIN874" does not exist</>.
</para>
+
+ <para>
+ ICU allows collations to be customized beyond the basic language+country
+ set that is preloaded by <command>initdb</command>. Users are encouraged
+ to define their own collation objects that make use of these facilities to
+ suit the sorting behavior to their requirements. Here are some examples:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>CREATE COLLATION "de-u-co-phonebk-x-icu" (provider = icu, locale = 'de-u-co-phonebk')</literal></term>
+ <listitem>
+ <para>German collation with phone book collation type</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION "und-u-co-emoji-x-icu" (provider = icu, locale = 'und-u-co-emoji')</literal></term>
+ <listitem>
+ <para>
+ Root collation with Emoji collation type, per Unicode Technical Standard #51
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION digitslast (provider = icu, locale = 'en-u-kr-latn-digit')</literal></term>
+ <listitem>
+ <para>
+ Sort digits after Latin letters. (The default is digits before letters.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper')</literal></term>
+ <listitem>
+ <para>
+ Sort upper-case letters before lower-case letters. (The default is
+ lower-case letters first.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION special (provider = icu, locale = 'en-u-kf-upper-kr-latn-digit')</literal></term>
+ <listitem>
+ <para>
+ Combines both of the above options.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION numeric (provider = icu, locale = 'en-u-kn-true')</literal></term>
+ <listitem>
+ <para>
+ Numeric ordering, sorts sequences of digits by their numeric value,
755
+ for example: <literal>A-21</literal> < <literal>A-123</literal>
756
+ (also known as natural sort).
757
+ </para>
758
+ </listitem>
759
+ </varlistentry>
760
+ </variablelist>
761
+
762
+ See <ulink url="http://unicode.org/reports/tr35/tr35-collation.html">Unicode
763
+ Technical Standard #35</ulink>
764
+ and <ulink url="https://tools.ietf.org/html/bcp47">BCP 47</ulink> for
765
+ details. The list of possible collation types (<literal>co</literal>
766
+ subtag) can be found in
767
+ the <ulink url="http://www.unicode.org/repos/cldr/trunk/common/bcp47/collation.xml">CLDR
768
+ repository</ulink>.
769
+ The <ulink url="https://ssl.icu-project.org/icu-bin/locexp">ICU Locale
770
+ Explorer</ulink> can be used to check the details of a particular locale
771
+ definition.
772
+ </para>
773
+
774
+ <para>
775
+ Note that while this system allows creating collations that <quote>ignore
776
+ case</quote> or <quote>ignore accents</quote> or similar (using
777
+ the <literal>ks</literal> key), PostgreSQL does not at the moment allow
778
+ such collations to act in a truly case- or accent-insensitive manner. Any
779
+ strings that compare equal according to the collation but are not
780
+ byte-wise equal will be sorted according to their byte values.
781
+ </para>
</sect4>
</sect3>
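
The added documentation text can be tried out with a short session. The following is a minimal sketch, not part of the commit: it assumes a server built with ICU support and an ICU version that honors the `kn` and `ks` keywords, and the collation names ending in `_demo` are hypothetical names chosen for illustration. The locale strings `en-u-kn-true` and the `ks` key come from the text in the diff; the specific value `und-u-ks-level2` is an assumed example of a case-ignoring setting.

```sql
-- Collation names ending in _demo are hypothetical, chosen for this sketch.
-- The locale string below is the one used in the documentation's example list.
CREATE COLLATION natural_sort_demo (provider = icu, locale = 'en-u-kn-true');

-- With numeric ordering, digit sequences are compared by numeric value,
-- so A-21 should sort before A-123 (the documented example).
SELECT v
FROM (VALUES ('A-123'), ('A-21'), ('A-3')) AS t(v)
ORDER BY v COLLATE natural_sort_demo;

-- Assumed example of the ks key: secondary strength (level2), at which
-- ICU treats strings that differ only in case as equal.
CREATE COLLATION ignore_case_demo (provider = icu, locale = 'und-u-ks-level2');

-- Per the note in the diff, strings that ICU considers equal but that are
-- not byte-wise equal are still distinguished by their byte values, so this
-- comparison is expected to yield false on a server that behaves as described.
SELECT 'abc'::text = 'ABC'::text COLLATE ignore_case_demo AS equal_under_collation;
```

If the first query returns the rows in plain character order instead (A-123 before A-21), that would suggest the ICU library in use did not apply the `kn` setting.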