@@ -58,9 +58,9 @@ SELECT * FROM tab WHERE lower(col) = LOWER(?);
  The <type>citext</> data type allows you to eliminate calls
  to <function>lower</> in SQL queries, and allows a primary key to
  be case-insensitive. <type>citext</> is locale-aware, just
- like <type>text</>, which means that the comparison of upper case and
+ like <type>text</>, which means that the matching of upper case and
  lower case characters is dependent on the rules of
- the <literal>LC_CTYPE</> locale setting. Again, this behavior is
+ the database's <literal>LC_CTYPE</> setting. Again, this behavior is
  identical to the use of <function>lower</> in queries. But because it's
  done transparently by the data type, you don't have to remember to do
  anything special in your queries.
@@ -97,17 +97,25 @@ SELECT * FROM users WHERE nick = 'Larry';

  <sect2>
  <title>String Comparison Behavior</title>
+
+ <para>
+ <type>citext</> performs comparisons by converting each string to lower
+ case (as though <function>lower</> were called) and then comparing the
+ results normally. Thus, for example, two strings are considered equal
+ if <function>lower</> would produce identical results for them.
+ </para>
+
  <para>
  In order to emulate a case-insensitive collation as closely as possible,
- there are <type>citext</>-specific versions of a number of the comparison
+ there are <type>citext</>-specific versions of a number of string-processing
  operators and functions. So, for example, the regular expression
  operators <literal>~</> and <literal>~*</> exhibit the same behavior when
- applied to <type>citext</>: they both compare case-insensitively.
+ applied to <type>citext</>: they both match case-insensitively.
  The same is true
  for <literal>!~</> and <literal>!~*</>, as well as for the
  <literal>LIKE</> operators <literal>~~</> and <literal>~~*</>, and
  <literal>!~~</> and <literal>!~~*</>. If you'd like to match
- case-sensitively, you can always cast to <type>text</> before comparing.
+ case-sensitively, you can cast the operator's arguments to <type>text</>.
  </para>

  <para>
@@ -168,10 +176,10 @@ SELECT * FROM users WHERE nick = 'Larry';
  <itemizedlist>
  <listitem>
  <para>
- <type>citext</>'s behavior depends on
+ <type>citext</>'s case-folding behavior depends on
  the <literal>LC_CTYPE</> setting of your database. How it compares
- values is therefore determined when
- <application>initdb</> is run to create the cluster. It is not truly
+ values is therefore determined when the database is created.
+ It is not truly
  case-insensitive in the terms defined by the Unicode standard.
  Effectively, what this means is that, as long as you're happy with your
  collation, you should be happy with <type>citext</>'s comparisons. But
@@ -181,6 +189,20 @@ SELECT * FROM users WHERE nick = 'Larry';
  </para>
  </listitem>

+ <listitem>
+ <para>
+ As of <productname>PostgreSQL</> 9.1, you can attach a
+ <literal>COLLATE</> specification to <type>citext</> columns or data
+ values. Currently, <type>citext</> operators will honor a non-default
+ <literal>COLLATE</> specification while comparing case-folded strings,
+ but the initial folding to lower case is always done according to the
+ database's <literal>LC_CTYPE</> setting (that is, as though
+ <literal>COLLATE "default"</> were given). This may be changed in a
+ future release so that both steps follow the input <literal>COLLATE</>
+ specification.
+ </para>
+ </listitem>
+
  <listitem>
  <para>
  <type>citext</> is not as efficient as <type>text</> because the
@@ -198,20 +220,20 @@ SELECT * FROM users WHERE nick = 'Larry';
  contexts. The standard answer is to use the <type>text</> type and
  manually use the <function>lower</> function when you need to compare
  case-insensitively; this works all right if case-insensitive comparison
- is needed only infrequently. If you need case-insensitive most of
- the time and case-sensitive infrequently, consider storing the data
+ is needed only infrequently. If you need case-insensitive behavior most
+ of the time and case-sensitive infrequently, consider storing the data
  as <type>citext</> and explicitly casting the column to <type>text</>
- when you want case-sensitive comparison. In either situation, you
- will need two indexes if you want both types of searches to be fast.
+ when you want case-sensitive comparison. In either situation, you will
+ need two indexes if you want both types of searches to be fast.
  </para>
  </listitem>

  <listitem>
  <para>
  The schema containing the <type>citext</> operators must be
  in the current <varname>search_path</> (typically <literal>public</>);
- if it is not, a normal case-sensitive <type>text</> comparison
- is performed.
+ if it is not, the normal case-sensitive <type>text</> operators
+ will be invoked instead.
  </para>
  </listitem>
  </itemizedlist>
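
Reviewer note: the comparison rule added in the "String Comparison Behavior" hunk (fold both strings as though lower were called, then compare the results normally) can be modeled in a few lines. The sketch below is only an illustration of that rule, not the extension's implementation. The class name CIText and the method as_text are hypothetical, and Python's str.lower() is assumed as a stand-in for PostgreSQL's lower(), which follows the database's LC_CTYPE setting and may fold some characters differently.

```python
from functools import total_ordering


@total_ordering
class CIText:
    """Toy model of the documented rule: fold with lower(), then compare."""

    def __init__(self, value: str) -> None:
        # The original spelling is kept; only comparisons use the folded form.
        self.value = value

    def _folded(self) -> str:
        # Assumption: str.lower() approximates PostgreSQL's lower().
        return self.value.lower()

    def __eq__(self, other):
        if not isinstance(other, CIText):
            return NotImplemented
        return self._folded() == other._folded()

    def __lt__(self, other):
        if not isinstance(other, CIText):
            return NotImplemented
        return self._folded() < other._folded()

    def __hash__(self):
        # Equal values must hash equally, so hash the folded form.
        return hash(self._folded())

    def as_text(self) -> str:
        # Models casting to text: comparisons become case-sensitive again.
        return self.value


if __name__ == "__main__":
    print(CIText("Larry") == CIText("larry"))
    print(CIText("Larry").as_text() == CIText("larry").as_text())
```

The as_text method mirrors the advice in the patch to cast an operator's arguments to text when a case-sensitive match is wanted.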