
Commit fc8d970

Replace functional-index facility with expressional indexes. Any column
of an index can now be a computed expression instead of a simple variable. Restrictions on expressions are the same as for predicates (only immutable functions, no sub-selects). This fixes problems recently introduced with inlining SQL functions, because the inlining transformation is applied to both expression trees so the planner can still match them up. Along the way, improve efficiency of handling index predicates (both predicates and index expressions are now cached by the relcache) and fix 7.3 oversight that didn't record dependencies of predicate expressions.
1 parent e5f1959 commit fc8d970


50 files changed, +1348 -1280 lines changed

contrib/dblink/dblink.c

+1 -4
@@ -1492,10 +1492,7 @@ get_pkey_attnames(Oid relid, int16 *numatts)
 /* we're only interested if it is the primary key */
 if (index->indisprimary == TRUE)
 {
-i = 0;
-while (index->indkey[i++] != 0)
-(*numatts)++;
-
+*numatts = index->indnatts;
 if (*numatts > 0)
 {
 result = (char **) palloc(*numatts * sizeof(char *));
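
Reviewer's sketch for the hunk above: the catalogs.sgml change in this same commit makes a zero in indkey mean "this index column is an expression", so counting entries up to the first zero can undercount. The C below illustrates that general hazard under stated assumptions. DemoIndexForm and DEMO_MAX_KEYS are hypothetical stand-ins for the real pg_index struct and INDEX_MAX_KEYS, and the commit does not say whether the case can arise for a primary-key index.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_KEYS 32        /* assumption: stands in for INDEX_MAX_KEYS */

/* Hypothetical, simplified stand-in for a pg_index row; not the real struct. */
typedef struct DemoIndexForm
{
    int16_t indnatts;                   /* number of index columns */
    int16_t indkey[DEMO_MAX_KEYS];      /* table column numbers; 0 = expression */
} DemoIndexForm;

/* Old approach: count entries until the first zero (bounded here for safety). */
int
count_by_zero_scan(const DemoIndexForm *index)
{
    int n = 0;

    while (n < DEMO_MAX_KEYS && index->indkey[n] != 0)
        n++;
    return n;
}

/* New approach: rely on the explicit column count stored with the index. */
int
count_by_indnatts(const DemoIndexForm *index)
{
    assert(index->indnatts >= 0 && index->indnatts <= DEMO_MAX_KEYS);
    return index->indnatts;
}
```

For an index whose indkey is "1 0 3" with indnatts = 3, the zero scan stops after the first column while the indnatts lookup reports all three.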

doc/src/sgml/catalogs.sgml

+28 -26
@@ -1,6 +1,6 @@
 <!--
 Documentation of the system catalogs, directed toward PostgreSQL developers
-$Header: /cvsroot/pgsql/doc/src/sgml/catalogs.sgml,v 2.70 2003/05/08 22:19:55 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/catalogs.sgml,v 2.71 2003/05/28 16:03:55 tgl Exp $
 -->

 <chapter id="catalogs">
@@ -1933,26 +1933,18 @@
 <entry>The OID of the <structname>pg_class</> entry for the table this index is for</entry>
 </row>

-<row>
-<entry><structfield>indproc</structfield></entry>
-<entry><type>regproc</type></entry>
-<entry><literal>pg_proc.oid</literal></entry>
-<entry>The function's OID if this is a functional index,
-else zero</entry>
-</row>
-
 <row>
 <entry><structfield>indkey</structfield></entry>
 <entry><type>int2vector</type></entry>
 <entry>pg_attribute.attnum</entry>
 <entry>
-This is an array of up to
-<symbol>INDEX_MAX_KEYS</symbol> values that indicate which
-table columns this index pertains to. For example a value of
-<literal>1 3</literal> would mean that the first and the third
-column make up the index key. For a functional index, these
-columns are the inputs to the function, and the function's return
-value is the index key.
+This is an array of <structfield>indnatts</structfield> (up to
+<symbol>INDEX_MAX_KEYS</symbol>) values that indicate which
+table columns this index indexes. For example a value of
+<literal>1 3</literal> would mean that the first and the third table
+columns make up the index key. A zero in this array indicates that the
+corresponding index attribute is an expression over the table columns,
+rather than a simple column reference.
 </entry>
 </row>

@@ -1961,17 +1953,18 @@
 <entry><type>oidvector</type></entry>
 <entry>pg_opclass.oid</entry>
 <entry>
-For each column in the index key this contains a reference to
+For each column in the index key this contains the OID of
 the operator class to use. See
 <structname>pg_opclass</structname> for details.
 </entry>
 </row>

 <row>
-<entry><structfield>indisclustered</structfield></entry>
-<entry><type>bool</type></entry>
+<entry><structfield>indnatts</structfield></entry>
+<entry><type>int2</type></entry>
 <entry></entry>
-<entry>If true, the table was last clustered on this index.</entry>
+<entry>The number of columns in the index (duplicates
+<literal>pg_class.relnatts</literal>)</entry>
 </row>

 <row>
@@ -1990,19 +1983,28 @@
 </row>

 <row>
-<entry><structfield>indreference</structfield></entry>
-<entry><type>oid</type></entry>
+<entry><structfield>indisclustered</structfield></entry>
+<entry><type>bool</type></entry>
 <entry></entry>
-<entry>unused</entry>
+<entry>If true, the table was last clustered on this index.</entry>
+</row>
+
+<row>
+<entry><structfield>indexprs</structfield></entry>
+<entry><type>text</type></entry>
+<entry></entry>
+<entry>Expression trees (in <function>nodeToString()</function> representation)
+for index attributes that are not simple column references. This is a
+list with one element for each zero entry in <structfield>indkey</>.
+Null if all index attributes are simple references.</entry>
 </row>

 <row>
 <entry><structfield>indpred</structfield></entry>
 <entry><type>text</type></entry>
 <entry></entry>
-<entry>Expression tree (in the form of a <function>nodeToString()</function> representation)
-for partial index predicate. Empty string if not a partial
-index.</entry>
+<entry>Expression tree (in <function>nodeToString()</function> representation)
+for partial index predicate. Null if not a partial index.</entry>
 </row>
 </tbody>
 </tgroup>
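
Reviewer's sketch for the catalog text above: the new wording says indexprs holds one element for each zero entry in indkey, which implies the k-th zero pairs with the k-th expression. The C below shows that pairing only. It assumes a hypothetical flattened form of indexprs (an array of strings) in place of the real nodeToString() list, and demo_index_attr_expr is an illustrative name, not a PostgreSQL function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the expression text for index attribute attno (0-based), or NULL
 * if that attribute is a plain column reference.  exprs is a hypothetical
 * flattened form of indexprs: one string per zero entry in indkey, in order.
 */
const char *
demo_index_attr_expr(const int16_t *indkey, int indnatts,
                     const char *const *exprs, int nexprs,
                     int attno)
{
    int zeros_seen = 0;     /* zero entries that precede attno */
    int i;

    assert(attno >= 0 && attno < indnatts);

    if (indkey[attno] != 0)
        return NULL;        /* simple column reference */

    for (i = 0; i < attno; i++)
    {
        if (indkey[i] == 0)
            zeros_seen++;
    }

    /* A well-formed entry has exactly one expression per zero. */
    return (zeros_seen < nexprs) ? exprs[zeros_seen] : NULL;
}
```

With indkey "2 0 5 0" and two expressions, attributes 1 and 3 map to the first and second expression, and attributes 0 and 2 return NULL because they are simple column references.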

doc/src/sgml/indices.sgml

+60 -49
@@ -1,4 +1,4 @@
-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.41 2003/05/15 15:50:18 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.42 2003/05/28 16:03:55 tgl Exp $ -->

 <chapter id="indexes">
 <title id="indexes-title">Indexes</title>
@@ -20,8 +20,7 @@
 <title>Introduction</title>

 <para>
-The classical example for the need of an index is if there is a
-table similar to this:
+Suppose we have a table similar to this:
 <programlisting>
 CREATE TABLE test1 (
 id integer,
@@ -32,24 +31,24 @@ CREATE TABLE test1 (
 <programlisting>
 SELECT content FROM test1 WHERE id = <replaceable>constant</replaceable>;
 </programlisting>
-Ordinarily, the system would have to scan the entire
-<structname>test1</structname> table row by row to find all
+With no advance preparation, the system would have to scan the entire
+<structname>test1</structname> table, row by row, to find all
 matching entries. If there are a lot of rows in
-<structname>test1</structname> and only a few rows (possibly zero
-or one) returned by the query, then this is clearly an inefficient
-method. If the system were instructed to maintain an index on the
-<structfield>id</structfield> column, then it could use a more
+<structname>test1</structname> and only a few rows (perhaps only zero
+or one) that would be returned by such a query, then this is clearly an
+inefficient method. But if the system has been instructed to maintain an
+index on the <structfield>id</structfield> column, then it can use a more
 efficient method for locating matching rows. For instance, it
 might only have to walk a few levels deep into a search tree.
 </para>

 <para>
-A similar approach is used in most books of non-fiction: Terms and
+A similar approach is used in most books of non-fiction: terms and
 concepts that are frequently looked up by readers are collected in
 an alphabetic index at the end of the book. The interested reader
 can scan the index relatively quickly and flip to the appropriate
-page, and would not have to read the entire book to find the
-interesting location. As it is the task of the author to
+page(s), rather than having to read the entire book to find the
+material of interest. Just as it is the task of the author to
 anticipate the items that the readers are most likely to look up,
 it is the task of the database programmer to foresee which indexes
 would be of advantage.
@@ -73,13 +72,14 @@ CREATE INDEX test1_id_index ON test1 (id);

 <para>
 Once the index is created, no further intervention is required: the
-system will use the index when it thinks it would be more efficient
+system will update the index when the table is modified, and it will
+use the index in queries when it thinks this would be more efficient
 than a sequential table scan. But you may have to run the
 <command>ANALYZE</command> command regularly to update
 statistics to allow the query planner to make educated decisions.
 Also read <xref linkend="performance-tips"> for information about
 how to find out whether an index is used and when and why the
-planner may choose to <emphasis>not</emphasis> use an index.
+planner may choose <emphasis>not</emphasis> to use an index.
 </para>

 <para>
@@ -198,7 +198,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
 than B-tree indexes, and the index size and build time for hash
 indexes is much worse. Hash indexes also suffer poor performance
 under high concurrency. For these reasons, hash index use is
-discouraged.
+presently discouraged.
 </para>
 </note>
 </para>
@@ -250,14 +250,13 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
 Currently, only the B-tree and GiST implementations support multicolumn
 indexes. Up to 32 columns may be specified. (This limit can be
 altered when building <productname>PostgreSQL</productname>; see the
-file <filename>pg_config.h</filename>.)
+file <filename>pg_config_manual.h</filename>.)
 </para>

 <para>
 The query planner can use a multicolumn index for queries that
-involve the leftmost column in the index definition and any number
-of columns listed to the right of it without a gap (when
-used with appropriate operators). For example,
+involve the leftmost column in the index definition plus any number
+of columns listed to the right of it, without a gap. For example,
 an index on <literal>(a, b, c)</literal> can be used in queries
 involving all of <literal>a</literal>, <literal>b</literal>, and
 <literal>c</literal>, or in queries involving both
@@ -266,7 +265,9 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
 (In a query involving <literal>a</literal> and <literal>c</literal>
 the planner might choose to use the index for
 <literal>a</literal> only and treat <literal>c</literal> like an
-ordinary unindexed column.)
+ordinary unindexed column.) Of course, each column must be used with
+operators appropriate to the index type; clauses that involve other
+operators will not be considered.
 </para>

 <para>
@@ -283,8 +284,8 @@ SELECT name FROM test2 WHERE major = <replaceable>constant</replaceable> OR mino
 <para>
 Multicolumn indexes should be used sparingly. Most of the time,
 an index on a single column is sufficient and saves space and time.
-Indexes with more than three columns are almost certainly
-inappropriate.
+Indexes with more than three columns are unlikely to be helpful
+unless the usage of the table is extremely stylized.
 </para>
 </sect1>

@@ -332,19 +333,19 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
 </sect1>


-<sect1 id="indexes-functional">
-<title>Functional Indexes</title>
+<sect1 id="indexes-expressional">
+<title>Indexes on Expressions</title>

-<indexterm zone="indexes-functional">
+<indexterm zone="indexes-expressional">
 <primary>indexes</primary>
-<secondary>on functions</secondary>
+<secondary>on expressions</secondary>
 </indexterm>

 <para>
-For a <firstterm>functional index</firstterm>, an index is defined
-on the result of a function applied to one or more columns of a
-single table. Functional indexes can be used to obtain fast access
-to data based on the result of function calls.
+An index column need not be just a column of the underlying table,
+but can be a function or scalar expression computed from one or
+more columns of the table. This feature is useful to obtain fast
+access to tables based on the results of computations.
 </para>

 <para>
@@ -362,20 +363,29 @@ CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
 </para>

 <para>
-The function in the index definition can take more than one
-argument, but they must be table columns, not constants.
-Functional indexes are always single-column (namely, the function
-result) even if the function uses more than one input column; there
-cannot be multicolumn indexes that contain function calls.
+As another example, if one often does queries like this:
+<programlisting>
+SELECT * FROM people WHERE (first_name || ' ' || last_name) = 'John Smith';
+</programlisting>
+then it might be worth creating an index like this:
+<programlisting>
+CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
+</programlisting>
 </para>

-<tip>
-<para>
-The restrictions mentioned in the previous paragraph can easily be
-worked around by defining a custom function to use in the index
-definition that computes any desired result internally.
-</para>
-</tip>
+<para>
+The syntax of the <command>CREATE INDEX</> command normally requires
+writing parentheses around index expressions, as shown in the second
+example. The parentheses may be omitted when the expression is just
+a function call, as in the first example.
+</para>
+
+<para>
+Index expressions are relatively expensive to maintain, since the
+derived expression(s) must be computed for each row upon insertion
+or whenever it is updated. Therefore they should be used only when
+queries that can use the index are very frequent.
+</para>
 </sect1>


@@ -391,8 +401,8 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
 The operator class identifies the operators to be used by the index
 for that column. For example, a B-tree index on the type <type>int4</type>
 would use the <literal>int4_ops</literal> class; this operator
-class includes comparison functions for values of type <type>int4</type>. In
-practice the default operator class for the column's data type is
+class includes comparison functions for values of type <type>int4</type>.
+In practice the default operator class for the column's data type is
 usually sufficient. The main point of having operator classes is
 that for some data types, there could be more than one meaningful
 ordering. For example, we might want to sort a complex-number data
@@ -427,24 +437,25 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
 <literal>name_pattern_ops</literal> support B-tree indexes on
 the types <type>text</type>, <type>varchar</type>,
 <type>char</type>, and <type>name</type>, respectively. The
-difference to the ordinary operator classes is that the values
+difference from the ordinary operator classes is that the values
 are compared strictly character by character rather than
 according to the locale-specific collation rules. This makes
 these operator classes suitable for use by queries involving
 pattern matching expressions (<literal>LIKE</literal> or POSIX
 regular expressions) if the server does not use the standard
-<quote>C</quote> locale. As an example, to index a
+<quote>C</quote> locale. As an example, you might index a
 <type>varchar</type> column like this:
 <programlisting>
 CREATE INDEX test_index ON test_table (col varchar_pattern_ops);
 </programlisting>
-If you do use the C locale, you should instead create an index
-with the default operator class. Also note that you should
+If you do use the C locale, you may instead create an index
+with the default operator class, and it will still be useful
+for pattern-matching queries. Also note that you should
 create an index with the default operator class if you want
 queries involving ordinary comparisons to use an index. Such
 queries cannot use the
 <literal><replaceable>xxx</replaceable>_pattern_ops</literal>
-operator classes. It is possible, however, to create multiple
+operator classes. It is allowed to create multiple
 indexes on the same column with different operator classes.
 </para>
 </listitem>
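
Reviewer's sketch for the multicolumn paragraph revised earlier in this file: the documentation text says the planner can use the leftmost index column plus any number of columns to its right, without a gap. The C below encodes only that stated rule. It is not planner code, it ignores the operator-appropriateness condition the same paragraph adds, and demo_usable_prefix_len is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Given which index columns a query constrains (in index-definition order),
 * return how many leading columns can be used without a gap.
 */
int
demo_usable_prefix_len(const bool *constrained, int ncols)
{
    int n = 0;

    assert(ncols >= 0);
    while (n < ncols && constrained[n])
        n++;
    return n;
}
```

For an index on (a, b, c), a query constraining a and c yields a prefix of length 1, matching the documentation's remark that c may be treated like an ordinary unindexed column. A query constraining only b yields 0.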

doc/src/sgml/plpgsql.sgml

+4 -3
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/plpgsql.sgml,v 1.18 2003/04/27 22:21:22 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/plpgsql.sgml,v 1.19 2003/05/28 16:03:55 tgl Exp $
 -->

 <chapter id="plpgsql">
@@ -136,9 +136,10 @@ END;
 <para>
 Except for input/output conversion and calculation functions
 for user-defined types, anything that can be defined in C language
-functions can also be done with <application>PL/pgSQL</application>. For example, it is possible to
+functions can also be done with <application>PL/pgSQL</application>.
+For example, it is possible to
 create complex conditional computation functions and later use
-them to define operators or use them in functional indexes.
+them to define operators or use them in index expressions.
 </para>

 <sect2 id="plpgsql-advantages">
