
Commit 7526e10

BRIN auto-summarization
Previously, only VACUUM would cause a page range to get initially summarized by BRIN indexes, which for some use cases happens too long after the inserts occur. To avoid the delay, have brininsert request a summarization run for the previous range as soon as the first tuple is inserted into the first page of the next range. Autovacuum is in charge of processing these requests, after doing all the regular vacuuming/analyzing work on tables.

This doesn't impose any new tasks on autovacuum, because autovacuum was already in charge of doing summarizations. The only actual effect is to change the timing, i.e. that it occurs earlier. For this reason, we don't go to any great lengths to record these requests very robustly; if they are lost because of a server crash or restart, they will happen at a later time anyway.

Most of the new code here is in autovacuum, which can now be told about "work items" to process. This can be used for other things such as GIN pending list cleaning and perhaps visibility map bit setting, both of which are currently invoked during vacuum but do not really depend on vacuum taking place.

The requests are at the page range level, a granularity for which we did not have SQL-level access; we only had index-level summarization requests via brin_summarize_new_values(). It seems reasonable to add SQL-level access to range-level summarization too, so add a function brin_summarize_range() to do that.

Authors: Álvaro Herrera, based on sketch from Simon Riggs.
Reviewed-by: Thomas Munro.
Discussion: https://postgr.es/m/20170301045823.vneqdqkmsd4as4ds@alvherre.pgsql
1 parent 7220c7b commit 7526e10
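
The trigger condition described in the commit message can be sketched in isolation. The following is a minimal, standalone illustration, not the code from the commit: the typedefs are stand-ins for PostgreSQL's BlockNumber and OffsetNumber, and the names range_start, should_request_summarization and FIRST_OFFSET are hypothetical helpers introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL types; assumptions made for this sketch only. */
typedef uint32_t BlockNumber;
typedef uint16_t OffsetNumber;

#define FIRST_OFFSET 1			/* plays the role of FirstOffsetNumber */

/* Return the first block of the page range that contains blk. */
static BlockNumber
range_start(BlockNumber blk, BlockNumber pages_per_range)
{
	assert(pages_per_range > 0);
	return (blk / pages_per_range) * pages_per_range;
}

/*
 * Decide whether an insertion at (blk, off) should request summarization
 * of the previous page range.  This is true only when the tuple is the
 * first one on the first page of a range other than the first range.
 * On success, *prev_blk is set to the last block of the previous range.
 */
static bool
should_request_summarization(BlockNumber blk, OffsetNumber off,
							 BlockNumber pages_per_range,
							 BlockNumber *prev_blk)
{
	BlockNumber start = range_start(blk, pages_per_range);

	if (start > 0 && start == blk && off == FIRST_OFFSET)
	{
		*prev_blk = start - 1;
		return true;
	}
	return false;
}
```

With 128 pages per range, an insertion at block 128, offset 1 yields a request for block 127, which lies in the previous range. Insertions at block 0, at block 129, or at offset 2 of block 128 yield no request. In the actual patch the request is additionally skipped when the previous range already has a summary tuple and when the autosummarize option is off.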

File tree

13 files changed

+672
-25
lines changed


doc/src/sgml/brin.sgml

+7-2
@@ -74,9 +74,14 @@
    tuple; those tuples remain unsummarized until a summarization run is
    invoked later, creating initial summaries.
    This process can be invoked manually using the
-   <function>brin_summarize_new_values(regclass)</function> function,
-   or automatically when <command>VACUUM</command> processes the table.
+   <function>brin_summarize_range(regclass, bigint)</function> or
+   <function>brin_summarize_new_values(regclass)</function> functions;
+   automatically when <command>VACUUM</command> processes the table;
+   or by automatic summarization executed by autovacuum, as insertions
+   occur. (This last trigger is disabled by default and can be enabled
+   with the <literal>autosummarize</literal> parameter.)
   </para>
+
  </sect2>
 </sect1>

doc/src/sgml/func.sgml

+9-1
@@ -19683,6 +19683,13 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
        <entry><type>integer</type></entry>
        <entry>summarize page ranges not already summarized</entry>
       </row>
+      <row>
+       <entry>
+        <literal><function>brin_summarize_range(<parameter>index</> <type>regclass</>, <parameter>blockNumber</> <type>bigint</type>)</function></literal>
+       </entry>
+       <entry><type>integer</type></entry>
+       <entry>summarize the page range covering the given block, if not already summarized</entry>
+      </row>
       <row>
        <entry>
         <literal><function>gin_clean_pending_list(<parameter>index</> <type>regclass</>)</function></literal>
@@ -19700,7 +19707,8 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
     that are not currently summarized by the index; for any such range
     it creates a new summary index tuple by scanning the table pages.
     It returns the number of new page range summaries that were inserted
-    into the index.
+    into the index. <function>brin_summarize_range</> does the same, except
+    it only summarizes the range that covers the given block number.
    </para>

    <para>

doc/src/sgml/ref/create_index.sgml

+11-1
@@ -382,7 +382,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
    </variablelist>

    <para>
-    <acronym>BRIN</> indexes accept a different parameter:
+    <acronym>BRIN</> indexes accept different parameters:
    </para>

    <variablelist>
@@ -396,6 +396,16 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
     </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><literal>autosummarize</></term>
+    <listitem>
+    <para>
+     Defines whether a summarization run is invoked for the previous page
+     range whenever an insertion is detected on the next one.
+    </para>
+    </listitem>
+   </varlistentry>
    </variablelist>
   </refsect2>

src/backend/access/brin/brin.c

+106-18
@@ -26,6 +26,7 @@
 #include "catalog/pg_am.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "utils/builtins.h"
@@ -60,10 +61,12 @@ typedef struct BrinOpaque
     BrinDesc *bo_bdesc;
 } BrinOpaque;

+#define BRIN_ALL_BLOCKRANGES InvalidBlockNumber
+
 static BrinBuildState *initialize_brin_buildstate(Relation idxRel,
                            BrinRevmap *revmap, BlockNumber pagesPerRange);
 static void terminate_brin_buildstate(BrinBuildState *state);
-static void brinsummarize(Relation index, Relation heapRel,
+static void brinsummarize(Relation index, Relation heapRel, BlockNumber pageRange,
               double *numSummarized, double *numExisting);
 static void form_and_insert_tuple(BrinBuildState *state);
 static void union_tuples(BrinDesc *bdesc, BrinMemTuple *a,
@@ -126,8 +129,11 @@ brinhandler(PG_FUNCTION_ARGS)
  * with those of the new tuple. If the tuple values are not consistent with
  * the summary tuple, we need to update the index tuple.
  *
+ * If autosummarization is enabled, check if we need to summarize the previous
+ * page range.
+ *
  * If the range is not currently summarized (i.e. the revmap returns NULL for
- * it), there's nothing to do.
+ * it), there's nothing to do for this tuple.
  */
 bool
 brininsert(Relation idxRel, Datum *values, bool *nulls,
@@ -136,30 +142,59 @@ brininsert(Relation idxRel, Datum *values, bool *nulls,
            IndexInfo *indexInfo)
 {
     BlockNumber pagesPerRange;
+    BlockNumber origHeapBlk;
+    BlockNumber heapBlk;
     BrinDesc *bdesc = (BrinDesc *) indexInfo->ii_AmCache;
     BrinRevmap *revmap;
     Buffer buf = InvalidBuffer;
     MemoryContext tupcxt = NULL;
     MemoryContext oldcxt = CurrentMemoryContext;
+    bool autosummarize = BrinGetAutoSummarize(idxRel);

     revmap = brinRevmapInitialize(idxRel, &pagesPerRange, NULL);

+    /*
+     * origHeapBlk is the block number where the insertion occurred. heapBlk
+     * is the first block in the corresponding page range.
+     */
+    origHeapBlk = ItemPointerGetBlockNumber(heaptid);
+    heapBlk = (origHeapBlk / pagesPerRange) * pagesPerRange;
+
     for (;;)
     {
         bool need_insert = false;
         OffsetNumber off;
         BrinTuple *brtup;
         BrinMemTuple *dtup;
-        BlockNumber heapBlk;
         int keyno;

         CHECK_FOR_INTERRUPTS();

-        heapBlk = ItemPointerGetBlockNumber(heaptid);
-        /* normalize the block number to be the first block in the range */
-        heapBlk = (heapBlk / pagesPerRange) * pagesPerRange;
-        brtup = brinGetTupleForHeapBlock(revmap, heapBlk, &buf, &off, NULL,
-                                         BUFFER_LOCK_SHARE, NULL);
+        /*
+         * If auto-summarization is enabled and we just inserted the first
+         * tuple into the first block of a new non-first page range, request a
+         * summarization run of the previous range.
+         */
+        if (autosummarize &&
+            heapBlk > 0 &&
+            heapBlk == origHeapBlk &&
+            ItemPointerGetOffsetNumber(heaptid) == FirstOffsetNumber)
+        {
+            BlockNumber lastPageRange = heapBlk - 1;
+            BrinTuple *lastPageTuple;
+
+            lastPageTuple =
+                brinGetTupleForHeapBlock(revmap, lastPageRange, &buf, &off,
+                                         NULL, BUFFER_LOCK_SHARE, NULL);
+            if (!lastPageTuple)
+                AutoVacuumRequestWork(AVW_BRINSummarizeRange,
+                                      RelationGetRelid(idxRel),
+                                      lastPageRange);
+            brin_free_tuple(lastPageTuple);
+        }
+
+        brtup = brinGetTupleForHeapBlock(revmap, heapBlk, &buf, &off,
+                                         NULL, BUFFER_LOCK_SHARE, NULL);

         /* if range is unsummarized, there's nothing to do */
         if (!brtup)
@@ -747,7 +782,7 @@ brinvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)

     brin_vacuum_scan(info->index, info->strategy);

-    brinsummarize(info->index, heapRel,
+    brinsummarize(info->index, heapRel, BRIN_ALL_BLOCKRANGES,
                   &stats->num_index_tuples, &stats->num_index_tuples);

     heap_close(heapRel, AccessShareLock);
@@ -765,7 +800,8 @@ brinoptions(Datum reloptions, bool validate)
     BrinOptions *rdopts;
     int numoptions;
     static const relopt_parse_elt tab[] = {
-        {"pages_per_range", RELOPT_TYPE_INT, offsetof(BrinOptions, pagesPerRange)}
+        {"pages_per_range", RELOPT_TYPE_INT, offsetof(BrinOptions, pagesPerRange)},
+        {"autosummarize", RELOPT_TYPE_BOOL, offsetof(BrinOptions, autosummarize)}
     };

     options = parseRelOptions(reloptions, validate, RELOPT_KIND_BRIN,
@@ -791,13 +827,40 @@ brinoptions(Datum reloptions, bool validate)
  */
 Datum
 brin_summarize_new_values(PG_FUNCTION_ARGS)
+{
+    Datum relation = PG_GETARG_DATUM(0);
+
+    return DirectFunctionCall2(brin_summarize_range,
+                               relation,
+                               Int64GetDatum((int64) BRIN_ALL_BLOCKRANGES));
+}
+
+/*
+ * SQL-callable function to summarize the indicated page range, if not already
+ * summarized. If the second argument is BRIN_ALL_BLOCKRANGES, all
+ * unsummarized ranges are summarized.
+ */
+Datum
+brin_summarize_range(PG_FUNCTION_ARGS)
 {
     Oid indexoid = PG_GETARG_OID(0);
+    int64 heapBlk64 = PG_GETARG_INT64(1);
+    BlockNumber heapBlk;
     Oid heapoid;
     Relation indexRel;
     Relation heapRel;
     double numSummarized = 0;

+    if (heapBlk64 > BRIN_ALL_BLOCKRANGES || heapBlk64 < 0)
+    {
+        char *blk = psprintf(INT64_FORMAT, heapBlk64);
+
+        ereport(ERROR,
+                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                 errmsg("block number out of range: %s", blk)));
+    }
+    heapBlk = (BlockNumber) heapBlk64;
+
     /*
      * We must lock table before index to avoid deadlocks. However, if the
      * passed indexoid isn't an index then IndexGetRelation() will fail.
@@ -837,7 +900,7 @@ brin_summarize_new_values(PG_FUNCTION_ARGS)
                         RelationGetRelationName(indexRel))));

     /* OK, do it */
-    brinsummarize(indexRel, heapRel, &numSummarized, NULL);
+    brinsummarize(indexRel, heapRel, heapBlk, &numSummarized, NULL);

     relation_close(indexRel, ShareUpdateExclusiveLock);
     relation_close(heapRel, ShareUpdateExclusiveLock);
@@ -1063,17 +1126,17 @@ summarize_range(IndexInfo *indexInfo, BrinBuildState *state, Relation heapRel,
 }

 /*
- * Scan a complete BRIN index, and summarize each page range that's not already
- * summarized. The index and heap must have been locked by caller in at
- * least ShareUpdateExclusiveLock mode.
+ * Summarize page ranges that are not already summarized. If pageRange is
+ * BRIN_ALL_BLOCKRANGES then the whole table is scanned; otherwise, only the
+ * page range containing the given heap page number is scanned.
  *
  * For each new index tuple inserted, *numSummarized (if not NULL) is
  * incremented; for each existing tuple, *numExisting (if not NULL) is
  * incremented.
  */
 static void
-brinsummarize(Relation index, Relation heapRel, double *numSummarized,
-              double *numExisting)
+brinsummarize(Relation index, Relation heapRel, BlockNumber pageRange,
+              double *numSummarized, double *numExisting)
 {
     BrinRevmap *revmap;
     BrinBuildState *state = NULL;
@@ -1082,15 +1145,40 @@ brinsummarize(Relation index, Relation heapRel, double *numSummarized,
     BlockNumber heapBlk;
     BlockNumber pagesPerRange;
     Buffer buf;
+    BlockNumber startBlk;
+    BlockNumber endBlk;
+
+    /* determine range of pages to process; nothing to do for an empty table */
+    heapNumBlocks = RelationGetNumberOfBlocks(heapRel);
+    if (heapNumBlocks == 0)
+        return;

     revmap = brinRevmapInitialize(index, &pagesPerRange, NULL);

+    if (pageRange == BRIN_ALL_BLOCKRANGES)
+    {
+        startBlk = 0;
+        endBlk = heapNumBlocks;
+    }
+    else
+    {
+        startBlk = (pageRange / pagesPerRange) * pagesPerRange;
+        /* Nothing to do if start point is beyond end of table */
+        if (startBlk > heapNumBlocks)
+        {
+            brinRevmapTerminate(revmap);
+            return;
+        }
+        endBlk = startBlk + pagesPerRange;
+        if (endBlk > heapNumBlocks)
+            endBlk = heapNumBlocks;
+    }
+
     /*
      * Scan the revmap to find unsummarized items.
      */
     buf = InvalidBuffer;
-    heapNumBlocks = RelationGetNumberOfBlocks(heapRel);
-    for (heapBlk = 0; heapBlk < heapNumBlocks; heapBlk += pagesPerRange)
+    for (heapBlk = startBlk; heapBlk < endBlk; heapBlk += pagesPerRange)
     {
         BrinTuple *tup;
         OffsetNumber off;
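
The block-number validation in brin_summarize_range() and the start/end computation in brinsummarize() can be tried out separately from the server. The sketch below is a standalone approximation under stated assumptions: BlockNumber is modelled as uint32_t, ALL_BLOCKRANGES stands in for BRIN_ALL_BLOCKRANGES (InvalidBlockNumber, assumed here to be 0xFFFFFFFF), and block_from_int64 and scan_bounds are hypothetical helper names that do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL definitions; assumptions for this sketch. */
typedef uint32_t BlockNumber;

#define ALL_BLOCKRANGES ((BlockNumber) 0xFFFFFFFF)

/*
 * Convert a SQL-level bigint into a block number.  Values below zero or
 * above ALL_BLOCKRANGES are rejected, as in the range check in the diff.
 */
static bool
block_from_int64(int64_t value, BlockNumber *out)
{
	if (value < 0 || value > (int64_t) ALL_BLOCKRANGES)
		return false;
	*out = (BlockNumber) value;
	return true;
}

/*
 * Compute the half-open interval [*start, *end) of heap blocks to scan.
 * Returns false when there is nothing to do: an empty table, or a
 * requested range that starts beyond the end of the table.
 */
static bool
scan_bounds(BlockNumber page_range, BlockNumber pages_per_range,
			BlockNumber heap_blocks, BlockNumber *start, BlockNumber *end)
{
	assert(pages_per_range > 0);

	if (heap_blocks == 0)
		return false;

	if (page_range == ALL_BLOCKRANGES)
	{
		*start = 0;
		*end = heap_blocks;
		return true;
	}

	/* Normalize to the first block of the containing range. */
	*start = (page_range / pages_per_range) * pages_per_range;
	if (*start > heap_blocks)
		return false;

	/* Clamp the end of the range to the end of the table. */
	*end = *start + pages_per_range;
	if (*end > heap_blocks)
		*end = heap_blocks;
	return true;
}
```

For a 200-block table with 128 pages per range, requesting block 130 gives the interval [128, 200), while requesting ALL_BLOCKRANGES gives [0, 200). The comparison against the table size uses `>` to mirror the diff; when the start equals the table size the interval is simply empty.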

src/backend/access/brin/brin_revmap.c

+5-1
@@ -205,7 +205,11 @@ brinGetTupleForHeapBlock(BrinRevmap *revmap, BlockNumber heapBlk,
     /* normalize the heap block number to be the first page in the range */
     heapBlk = (heapBlk / revmap->rm_pagesPerRange) * revmap->rm_pagesPerRange;

-    /* Compute the revmap page number we need */
+    /*
+     * Compute the revmap page number we need. If Invalid is returned (i.e.,
+     * the revmap page hasn't been created yet), the requested page range is
+     * not summarized.
+     */
     mapBlk = revmap_get_blkno(revmap, heapBlk);
     if (mapBlk == InvalidBlockNumber)
     {

src/backend/access/common/reloptions.c

+9
@@ -92,6 +92,15 @@

 static relopt_bool boolRelOpts[] =
 {
+    {
+        {
+            "autosummarize",
+            "Enables automatic summarization on this BRIN index",
+            RELOPT_KIND_BRIN,
+            AccessExclusiveLock
+        },
+        false
+    },
     {
         {
             "autovacuum_enabled",
