1
- <!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.45 2003/09/30 03: 22:33 tgl Exp $ -->
1
+ <!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.46 2003/11/06 22:21:47 tgl Exp $ -->
2
2
3
3
<chapter id="indexes">
4
4
<title id="indexes-title">Indexes</title>
@@ -77,7 +77,7 @@ CREATE INDEX test1_id_index ON test1 (id);
77
77
than a sequential table scan. But you may have to run the
78
78
<command>ANALYZE</command> command regularly to update
79
79
statistics to allow the query planner to make educated decisions.
80
- Also read <xref linkend="performance-tips"> for information about
80
+ See <xref linkend="performance-tips"> for information about
81
81
how to find out whether an index is used and when and why the
82
82
planner may choose <emphasis>not</emphasis> to use an index.
83
83
</para>
@@ -106,8 +106,8 @@ CREATE INDEX test1_id_index ON test1 (id);
106
106
107
107
<para>
108
108
<productname>PostgreSQL</productname> provides several index types:
109
- B-tree, R-tree, GiST, and Hash. Each index type is more appropriate for
110
- a particular query type because of the algorithm it uses .
109
+ B-tree, R-tree, GiST, and Hash. Each index type uses a different
110
+ algorithm that is best suited to different types of queries .
111
111
<indexterm>
112
112
<primary>index</primary>
113
113
<secondary>B-tree</secondary>
@@ -116,9 +116,10 @@ CREATE INDEX test1_id_index ON test1 (id);
116
116
<primary>B-tree</primary>
117
117
<see>index</see>
118
118
</indexterm>
119
- By
120
- default, the <command>CREATE INDEX</command> command will create a
121
- B-tree index, which fits the most common situations. In
119
+ By default, the <command>CREATE INDEX</command> command will create a
120
+ B-tree index, which fits the most common situations. B-trees can
121
+ handle equality and range queries on data that can be sorted into
122
+ some ordering. In
122
123
particular, the <productname>PostgreSQL</productname> query planner
123
124
will consider using a B-tree index whenever an indexed column is
124
125
involved in a comparison using one of these operators:
@@ -154,7 +155,7 @@ CREATE INDEX test1_id_index ON test1 (id);
154
155
<primary>R-tree</primary>
155
156
<see>index</see>
156
157
</indexterm>
157
- R-tree indexes are especially suited for spatial data. To create
158
+ R-tree indexes are suited for queries on spatial data. To create
158
159
an R-tree index, use a command of the form
159
160
<synopsis>
160
161
CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING RTREE (<replaceable>column</replaceable>);
@@ -185,6 +186,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
185
186
<primary>hash</primary>
186
187
<see>index</see>
187
188
</indexterm>
189
+ Hash indexes can only handle simple equality comparisons.
188
190
The query planner will consider using a hash index whenever an
189
191
indexed column is involved in a comparison using the
190
192
<literal>=</literal> operator. The following command is used to
@@ -195,19 +197,18 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
195
197
<note>
196
198
<para>
197
199
Testing has shown <productname>PostgreSQL</productname>'s hash
198
- indexes to be similar or slower than B-tree indexes, and the
199
- index size and build time for hash indexes is much worse. Hash
200
- indexes also suffer poor performance under high concurrency. For
200
+ indexes to perform no better than B-tree indexes, and the
201
+ index size and build time for hash indexes is much worse. For
201
202
these reasons, hash index use is presently discouraged.
202
203
</para>
203
204
</note>
204
205
</para>
205
206
206
207
<para>
207
- The B-tree index is an implementation of Lehman-Yao
208
+ The B-tree index method is an implementation of Lehman-Yao
208
209
high-concurrency B-trees. The R-tree index method implements
209
210
standard R-trees using Guttman's quadratic split algorithm. The
210
- hash index is an implementation of Litwin's linear hashing. We
211
+ hash index method is an implementation of Litwin's linear hashing. We
211
212
mention the algorithms used solely to indicate that all of these
212
213
index methods are fully dynamic and do not have to be optimized
213
214
periodically (as is the case with, for example, static hash methods).
@@ -233,7 +234,7 @@ CREATE TABLE test2 (
233
234
name varchar
234
235
);
235
236
</programlisting>
236
- (Say , you keep your <filename class="directory">/dev</filename>
237
+ (say , you keep your <filename class="directory">/dev</filename>
237
238
directory in a database...) and you frequently make queries like
238
239
<programlisting>
239
240
SELECT name FROM test2 WHERE major = <replaceable>constant</replaceable> AND minor = <replaceable>constant</replaceable>;
@@ -263,8 +264,8 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
263
264
<literal>a</literal> and <literal>b</literal>, or in queries
264
265
involving only <literal>a</literal>, but not in other combinations.
265
266
(In a query involving <literal>a</literal> and <literal>c</literal>
266
- the planner might choose to use the index for
267
- <literal>a</literal> only and treat <literal>c</literal> like an
267
+ the planner could choose to use the index for
268
+ <literal>a</literal>, while treating <literal>c</literal> like an
268
269
ordinary unindexed column.) Of course, each column must be used with
269
270
operators appropriate to the index type; clauses that involve other
270
271
operators will not be considered.
@@ -310,16 +311,16 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
310
311
<para>
311
312
When an index is declared unique, multiple table rows with equal
312
313
indexed values will not be allowed. Null values are not considered
313
- equal.
314
+ equal. A multicolumn unique index will only reject cases where all
315
+ of the indexed columns are equal in two rows.
314
316
</para>
315
317
316
318
<para>
317
- <productname>PostgreSQL</productname> automatically creates unique
318
- indexes when a table is declared with a unique constraint or a
319
- primary key, on the columns that make up the primary key or unique
320
- columns (a multicolumn index, if appropriate), to enforce that
321
- constraint. A unique index can be added to a table at any later
322
- time, to add a unique constraint.
319
+ <productname>PostgreSQL</productname> automatically creates a unique
320
+ index when a unique constraint or a primary key is defined for a table.
321
+ The index covers the columns that make up the primary key or unique
322
+ columns (a multicolumn index, if appropriate), and is the mechanism
323
+ that enforces the constraint.
323
324
</para>
324
325
325
326
<note>
@@ -328,6 +329,9 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
328
329
<literal>ALTER TABLE ... ADD CONSTRAINT</literal>. The use of
329
330
indexes to enforce unique constraints could be considered an
330
331
implementation detail that should not be accessed directly.
332
+ One should, however, be aware that there's no need to manually
333
+ create indexes on unique columns; doing so would just duplicate
334
+ the automatically-created index.
331
335
</para>
332
336
</note>
333
337
</sect1>
@@ -362,6 +366,14 @@ CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
362
366
</programlisting>
363
367
</para>
364
368
369
+ <para>
370
+ If we were to declare this index <literal>UNIQUE</>, it would prevent
371
+ creation of rows whose <literal>col1</> values differ only in case,
372
+ as well as rows whose <literal>col1</> values are actually identical.
373
+ Thus, indexes on expressions can be used to enforce constraints that
374
+ are not definable as simple unique constraints.
375
+ </para>
376
+
365
377
<para>
366
378
As another example, if one often does queries like this:
367
379
<programlisting>
@@ -409,7 +421,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
409
421
In practice the default operator class for the column's data type is
410
422
usually sufficient. The main point of having operator classes is
411
423
that for some data types, there could be more than one meaningful
412
- ordering . For example, we might want to sort a complex-number data
424
+ index behavior . For example, we might want to sort a complex-number data
413
425
type either by absolute value or by real part. We could do this by
414
426
defining two operator classes for the data type and then selecting
415
427
the proper class when making an index.
@@ -419,20 +431,6 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
419
431
There are also some built-in operator classes besides the default ones:
420
432
421
433
<itemizedlist>
422
- <listitem>
423
- <para>
424
- The operator classes <literal>box_ops</literal> and
425
- <literal>bigbox_ops</literal> both support R-tree indexes on the
426
- <type>box</type> data type. The difference between them is
427
- that <literal>bigbox_ops</literal> scales box coordinates down,
428
- to avoid floating-point exceptions from doing multiplication,
429
- addition, and subtraction on very large floating-point
430
- coordinates. If the field on which your rectangles lie is about
431
- 20 000 square units or larger, you should use
432
- <literal>bigbox_ops</literal>.
433
- </para>
434
- </listitem>
435
-
436
434
<listitem>
437
435
<para>
438
436
The operator classes <literal>text_pattern_ops</literal>,
@@ -644,7 +642,8 @@ SELECT * FROM orders WHERE order_nr = 3501;
644
642
create, it would probably be too slow to be of any real use.)
645
643
The system can recognize simple inequality implications, for example
646
644
<quote>x < 1</quote> implies <quote>x < 2</quote>; otherwise
647
- the predicate condition must exactly match the query's <literal>WHERE</> condition
645
+ the predicate condition must exactly match part of the query's
646
+ <literal>WHERE</> condition
648
647
or the index will not be recognized to be usable.
649
648
</para>
650
649
@@ -723,7 +722,8 @@ CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
723
722
maintenance and tuning, it is still important to check
724
723
which indexes are actually used by the real-life query workload.
725
724
Examining index usage for an individual query is done with the
726
- <command>EXPLAIN</> command; its application for this purpose is
725
+ <xref linkend="sql-explain" endterm="sql-explain-title">
726
+ command; its application for this purpose is
727
727
illustrated in <xref linkend="using-explain">.
728
728
It is also possible to gather overall statistics about index usage
729
729
in a running server, as described in <xref linkend="monitoring-stats">.
@@ -740,7 +740,8 @@ CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
740
740
<itemizedlist>
741
741
<listitem>
742
742
<para>
743
- Always run <command>ANALYZE</command> first. This command
743
+ Always run <xref linkend="sql-analyze" endterm="sql-analyze-title">
744
+ first. This command
744
745
collects statistics about the distribution of the values in the
745
746
table. This information is required to guess the number of rows
746
747
returned by a query, which is needed by the planner to assign
@@ -813,8 +814,8 @@ CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
813
814
run-time parameters (described in <xref linkend="runtime-config">).
814
815
An inaccurate selectivity estimate is due to
815
816
insufficient statistics. It may be possible to help this by
816
- tuning the statistics-gathering parameters (see <command>ALTER
817
- TABLE</command> reference ).
817
+ tuning the statistics-gathering parameters (see
818
+ <xref linkend="sql-altertable" endterm="sql-altertable-title"> ).
818
819
</para>
819
820
820
821
<para>
0 commit comments