@@ -975,7 +975,7 @@ static char *str_param_default = "default";
 /*
  * Sample validator: checks that string is not longer than 8 bytes.
  */
-static void
+static void
 validate_my_string_relopt(const char *value)
 {
     if (strlen(value) > 8)
@@ -987,7 +987,7 @@ validate_my_string_relopt(const char *value)
 /*
  * Sample filler: switches characters to lower case.
  */
-static Size
+static Size
 fill_my_string_relopt(const char *value, void *ptr)
 {
     char *tmp = str_tolower(value, strlen(value), DEFAULT_COLLATION_OID);
@@ -1157,36 +1157,52 @@ my_sortsupport(PG_FUNCTION_ARGS)
  <title>Implementation</title>

  <sect2 id="gist-buffering-build">
-  <title>GiST Buffering Build</title>
+  <title>GiST Index Build Methods</title>
+
+  <para>
+   The simplest way to build a GiST index is just to insert all the entries,
+   one by one. This tends to be slow for large indexes, because if the
+   index tuples are scattered across the index and the index is large enough
+   to not fit in cache, a lot of random I/O will be
+   needed. <productname>PostgreSQL</productname> supports two alternative
+   methods for initial build of a GiST index: <firstterm>sorted</firstterm>
+   and <firstterm>buffered</firstterm> modes.
+  </para>
+
+  <para>
+   The sorted method is only available if each of the opclasses used by the
+   index provides a <function>sortsupport</function> function, as described
+   in <xref linkend="gist-extensibility"/>. If they do, this method is
+   usually the best, so it is used by default.
+  </para>
+
   <para>
-   Building large GiST indexes by simply inserting all the tuples tends to be
-   slow, because if the index tuples are scattered across the index and the
-   index is large enough to not fit in cache, the insertions need to perform
-   a lot of random I/O. Beginning in version 9.2, PostgreSQL supports a more
-   efficient method to build GiST indexes based on buffering, which can
-   dramatically reduce the number of random I/Os needed for non-ordered data
-   sets. For well-ordered data sets the benefit is smaller or non-existent,
-   because only a small number of pages receive new tuples at a time, and
-   those pages fit in cache even if the index as whole does not.
+   The buffered method works by not inserting tuples directly into the index
+   right away. It can dramatically reduce the amount of random I/O needed
+   for non-ordered data sets. For well-ordered data sets the benefit is
+   smaller or non-existent, because only a small number of pages receive new
+   tuples at a time, and those pages fit in cache even if the index as a
+   whole does not.
   </para>

   <para>
-   However, buffering index build needs to call the <function>penalty</function>
-   function more often, which consumes some extra CPU resources. Also, the
-   buffers used in the buffering build need temporary disk space, up to
+   The buffered method needs to call the <function>penalty</function>
+   function more often than the simple method does, which consumes some
+   extra CPU resources. Also, the buffers need temporary disk space, up to
    the size of the resulting index. Buffering can also influence the quality
    of the resulting index, in both positive and negative directions. That
    influence depends on various factors, like the distribution of the input
    data and the operator class implementation.
   </para>

   <para>
-   By default, a GiST index build switches to the buffering method when the
-   index size reaches <xref linkend="guc-effective-cache-size"/>. It can
-   be manually turned on or off by the <literal>buffering</literal> parameter
-   to the CREATE INDEX command. The default behavior is good for most cases,
-   but turning buffering off might speed up the build somewhat if the input
-   data is ordered.
+   If sorting is not possible, then by default a GiST index build switches
+   to the buffering method when the index size reaches
+   <xref linkend="guc-effective-cache-size"/>. Buffering can be manually
+   forced or prevented by the <literal>buffering</literal> parameter to the
+   CREATE INDEX command. The default behavior is good for most cases, but
+   turning buffering off might speed up the build somewhat if the input data
+   is ordered.
   </para>

  </sect2>
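
For readers of this change, the `buffering` parameter described in the last paragraph of the patched section might be exercised roughly as follows. This is a minimal sketch, not part of the patch: the table `places`, its columns, and the index names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE places (id int, loc point);

-- No parameter given: the build method is chosen automatically,
-- as described in the patched documentation text.
CREATE INDEX places_loc_default ON places USING gist (loc);

-- Force the buffered build method.
CREATE INDEX places_loc_buffered ON places USING gist (loc)
    WITH (buffering = on);

-- Prevent buffering, which might help somewhat when the input is already ordered.
CREATE INDEX places_loc_unbuffered ON places USING gist (loc)
    WITH (buffering = off);
```

The resulting indexes should be equivalent for queries; the parameter only influences how the initial build is carried out.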