Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit d22a09d

Browse files
committed
Support GiST index support functions that want to cache data across calls.
pg_trgm was already doing this unofficially, but the implementation hadn't been thought through very well and leaked memory. Restructure the core GiST code so that it actually works, and document it. Ordinarily this would have required an extra memory context creation/destruction for each GiST index search, but I was able to avoid that in the normal case of a non-rescanned search by finessing the handling of the RBTree. It used to have its own context always, but now shares a context with the scan-lifespan data structures, unless there is more than one rescan call. This should make the added overhead unnoticeable in typical cases.
1 parent 79edb2b commit d22a09d

File tree

6 files changed

+206
-82
lines changed

6 files changed

+206
-82
lines changed

doc/src/sgml/gist.sgml

Lines changed: 30 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -86,11 +86,6 @@
8686
reuse, and a clean interface.
8787
</para>
8888

89-
</sect1>
90-
91-
<sect1 id="gist-implementation">
92-
<title>Implementation</title>
93-
9489
<para>
9590
There are seven methods that an index operator class for
9691
<acronym>GiST</acronym> must provide, and an eighth that is optional.
@@ -642,35 +637,54 @@ my_distance(PG_FUNCTION_ARGS)
642637

643638
</variablelist>
644639

640+
<para>
641+
All the GiST support methods are normally called in short-lived memory
642+
contexts; that is, <varname>CurrentMemoryContext</> will get reset after
643+
each tuple is processed. It is therefore not very important to worry about
644+
pfree'ing everything you palloc. However, in some cases it's useful for a
645+
support method to cache data across repeated calls. To do that, allocate
646+
the longer-lived data in <literal>fcinfo-&gt;flinfo-&gt;fn_mcxt</>, and
647+
keep a pointer to it in <literal>fcinfo-&gt;flinfo-&gt;fn_extra</>. Such
648+
data will survive for the life of the index operation (e.g., a single GiST
649+
index scan, index build, or index tuple insertion). Be careful to pfree
650+
the previous value when replacing a <literal>fn_extra</> value, or the leak
651+
will accumulate for the duration of the operation.
652+
</para>
653+
654+
</sect1>
655+
656+
<sect1 id="gist-implementation">
657+
<title>Implementation</title>
658+
645659
<sect2 id="gist-buffering-build">
646660
<title>GiST buffering build</title>
647661
<para>
648662
Building large GiST indexes by simply inserting all the tuples tends to be
649663
slow, because if the index tuples are scattered across the index and the
650664
index is large enough to not fit in cache, the insertions need to perform
651-
a lot of random I/O. PostgreSQL from version 9.2 supports a more efficient
652-
method to build GiST indexes based on buffering, which can dramatically
653-
reduce number of random I/O needed for non-ordered data sets. For
654-
well-ordered datasets the benefit is smaller or non-existent, because
655-
only a small number of pages receive new tuples at a time, and those pages
656-
fit in cache even if the index as whole does not.
665+
a lot of random I/O. Beginning in version 9.2, PostgreSQL supports a more
666+
efficient method to build GiST indexes based on buffering, which can
667+
dramatically reduce the number of random I/Os needed for non-ordered data
668+
sets. For well-ordered datasets the benefit is smaller or non-existent,
669+
because only a small number of pages receive new tuples at a time, and
670+
those pages fit in cache even if the index as whole does not.
657671
</para>
658672

659673
<para>
660674
However, buffering index build needs to call the <function>penalty</>
661675
function more often, which consumes some extra CPU resources. Also, the
662676
buffers used in the buffering build need temporary disk space, up to
663-
the size of the resulting index. Buffering can also infuence the quality
664-
of the produced index, in both positive and negative directions. That
677+
the size of the resulting index. Buffering can also influence the quality
678+
of the resulting index, in both positive and negative directions. That
665679
influence depends on various factors, like the distribution of the input
666-
data and operator class implementation.
680+
data and the operator class implementation.
667681
</para>
668682

669683
<para>
670-
By default, the index build switches to the buffering method when the
684+
By default, a GiST index build switches to the buffering method when the
671685
index size reaches <xref linkend="guc-effective-cache-size">. It can
672686
be manually turned on or off by the <literal>BUFFERING</literal> parameter
673-
to the CREATE INDEX clause. The default behavior is good for most cases,
687+
to the CREATE INDEX command. The default behavior is good for most cases,
674688
but turning buffering off might speed up the build somewhat if the input
675689
data is ordered.
676690
</para>

src/backend/access/gist/gist.c

Lines changed: 49 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -94,25 +94,29 @@ gistinsert(PG_FUNCTION_ARGS)
9494
IndexUniqueCheck checkUnique = (IndexUniqueCheck) PG_GETARG_INT32(5);
9595
#endif
9696
IndexTuple itup;
97-
GISTSTATE giststate;
98-
MemoryContext oldCtx;
99-
MemoryContext insertCtx;
97+
GISTSTATE *giststate;
98+
MemoryContext oldCxt;
10099

101-
insertCtx = createTempGistContext();
102-
oldCtx = MemoryContextSwitchTo(insertCtx);
100+
giststate = initGISTstate(r);
103101

104-
initGISTstate(&giststate, r);
102+
/*
103+
* We use the giststate's scan context as temp context too. This means
104+
* that any memory leaked by the support functions is not reclaimed until
105+
* end of insert. In most cases, we aren't going to call the support
106+
* functions very many times before finishing the insert, so this seems
107+
* cheaper than resetting a temp context for each function call.
108+
*/
109+
oldCxt = MemoryContextSwitchTo(giststate->tempCxt);
105110

106-
itup = gistFormTuple(&giststate, r,
111+
itup = gistFormTuple(giststate, r,
107112
values, isnull, true /* size is currently bogus */ );
108113
itup->t_tid = *ht_ctid;
109114

110-
gistdoinsert(r, itup, 0, &giststate);
115+
gistdoinsert(r, itup, 0, giststate);
111116

112117
/* cleanup */
113-
freeGISTstate(&giststate);
114-
MemoryContextSwitchTo(oldCtx);
115-
MemoryContextDelete(insertCtx);
118+
MemoryContextSwitchTo(oldCxt);
119+
freeGISTstate(giststate);
116120

117121
PG_RETURN_BOOL(false);
118122
}
@@ -1213,47 +1217,64 @@ gistSplit(Relation r,
12131217
}
12141218

12151219
/*
1216-
* Fill a GISTSTATE with information about the index
1220+
* Create a GISTSTATE and fill it with information about the index
12171221
*/
1218-
void
1219-
initGISTstate(GISTSTATE *giststate, Relation index)
1222+
GISTSTATE *
1223+
initGISTstate(Relation index)
12201224
{
1225+
GISTSTATE *giststate;
1226+
MemoryContext scanCxt;
1227+
MemoryContext oldCxt;
12211228
int i;
12221229

1230+
/* safety check to protect fixed-size arrays in GISTSTATE */
12231231
if (index->rd_att->natts > INDEX_MAX_KEYS)
12241232
elog(ERROR, "numberOfAttributes %d > %d",
12251233
index->rd_att->natts, INDEX_MAX_KEYS);
12261234

1235+
/* Create the memory context that will hold the GISTSTATE */
1236+
scanCxt = AllocSetContextCreate(CurrentMemoryContext,
1237+
"GiST scan context",
1238+
ALLOCSET_DEFAULT_MINSIZE,
1239+
ALLOCSET_DEFAULT_INITSIZE,
1240+
ALLOCSET_DEFAULT_MAXSIZE);
1241+
oldCxt = MemoryContextSwitchTo(scanCxt);
1242+
1243+
/* Create and fill in the GISTSTATE */
1244+
giststate = (GISTSTATE *) palloc(sizeof(GISTSTATE));
1245+
1246+
giststate->scanCxt = scanCxt;
1247+
giststate->tempCxt = scanCxt; /* caller must change this if needed */
12271248
giststate->tupdesc = index->rd_att;
12281249

12291250
for (i = 0; i < index->rd_att->natts; i++)
12301251
{
12311252
fmgr_info_copy(&(giststate->consistentFn[i]),
12321253
index_getprocinfo(index, i + 1, GIST_CONSISTENT_PROC),
1233-
CurrentMemoryContext);
1254+
scanCxt);
12341255
fmgr_info_copy(&(giststate->unionFn[i]),
12351256
index_getprocinfo(index, i + 1, GIST_UNION_PROC),
1236-
CurrentMemoryContext);
1257+
scanCxt);
12371258
fmgr_info_copy(&(giststate->compressFn[i]),
12381259
index_getprocinfo(index, i + 1, GIST_COMPRESS_PROC),
1239-
CurrentMemoryContext);
1260+
scanCxt);
12401261
fmgr_info_copy(&(giststate->decompressFn[i]),
12411262
index_getprocinfo(index, i + 1, GIST_DECOMPRESS_PROC),
1242-
CurrentMemoryContext);
1263+
scanCxt);
12431264
fmgr_info_copy(&(giststate->penaltyFn[i]),
12441265
index_getprocinfo(index, i + 1, GIST_PENALTY_PROC),
1245-
CurrentMemoryContext);
1266+
scanCxt);
12461267
fmgr_info_copy(&(giststate->picksplitFn[i]),
12471268
index_getprocinfo(index, i + 1, GIST_PICKSPLIT_PROC),
1248-
CurrentMemoryContext);
1269+
scanCxt);
12491270
fmgr_info_copy(&(giststate->equalFn[i]),
12501271
index_getprocinfo(index, i + 1, GIST_EQUAL_PROC),
1251-
CurrentMemoryContext);
1272+
scanCxt);
12521273
/* opclasses are not required to provide a Distance method */
12531274
if (OidIsValid(index_getprocid(index, i + 1, GIST_DISTANCE_PROC)))
12541275
fmgr_info_copy(&(giststate->distanceFn[i]),
12551276
index_getprocinfo(index, i + 1, GIST_DISTANCE_PROC),
1256-
CurrentMemoryContext);
1277+
scanCxt);
12571278
else
12581279
giststate->distanceFn[i].fn_oid = InvalidOid;
12591280

@@ -1273,10 +1294,15 @@ initGISTstate(GISTSTATE *giststate, Relation index)
12731294
else
12741295
giststate->supportCollation[i] = DEFAULT_COLLATION_OID;
12751296
}
1297+
1298+
MemoryContextSwitchTo(oldCxt);
1299+
1300+
return giststate;
12761301
}
12771302

12781303
void
12791304
freeGISTstate(GISTSTATE *giststate)
12801305
{
1281-
/* no work */
1306+
/* It's sufficient to delete the scanCxt */
1307+
MemoryContextDelete(giststate->scanCxt);
12821308
}

src/backend/access/gist/gistbuild.c

Lines changed: 21 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@ typedef enum
5454
typedef struct
5555
{
5656
Relation indexrel;
57-
GISTSTATE giststate;
57+
GISTSTATE *giststate;
5858
GISTBuildBuffers *gfbb;
5959

6060
int64 indtuples; /* number of tuples indexed */
@@ -63,7 +63,6 @@ typedef struct
6363
Size freespace; /* amount of free space to leave on pages */
6464

6565
GistBufferingMode bufferingMode;
66-
MemoryContext tmpCtx;
6766
} GISTBuildState;
6867

6968
static void gistInitBuffering(GISTBuildState *buildstate);
@@ -146,7 +145,14 @@ gistbuild(PG_FUNCTION_ARGS)
146145
RelationGetRelationName(index));
147146

148147
/* no locking is needed */
149-
initGISTstate(&buildstate.giststate, index);
148+
buildstate.giststate = initGISTstate(index);
149+
150+
/*
151+
* Create a temporary memory context that is reset once for each tuple
152+
* processed. (Note: we don't bother to make this a child of the
153+
* giststate's scanCxt, so we have to delete it separately at the end.)
154+
*/
155+
buildstate.giststate->tempCxt = createTempGistContext();
150156

151157
/* initialize the root page */
152158
buffer = gistNewBuffer(index);
@@ -184,12 +190,6 @@ gistbuild(PG_FUNCTION_ARGS)
184190
buildstate.indtuples = 0;
185191
buildstate.indtuplesSize = 0;
186192

187-
/*
188-
* create a temporary memory context that is reset once for each tuple
189-
* processed.
190-
*/
191-
buildstate.tmpCtx = createTempGistContext();
192-
193193
/*
194194
* Do the heap scan.
195195
*/
@@ -208,9 +208,9 @@ gistbuild(PG_FUNCTION_ARGS)
208208

209209
/* okay, all heap tuples are indexed */
210210
MemoryContextSwitchTo(oldcxt);
211-
MemoryContextDelete(buildstate.tmpCtx);
211+
MemoryContextDelete(buildstate.giststate->tempCxt);
212212

213-
freeGISTstate(&buildstate.giststate);
213+
freeGISTstate(buildstate.giststate);
214214

215215
/*
216216
* Return statistics
@@ -440,10 +440,10 @@ gistBuildCallback(Relation index,
440440
IndexTuple itup;
441441
MemoryContext oldCtx;
442442

443-
oldCtx = MemoryContextSwitchTo(buildstate->tmpCtx);
443+
oldCtx = MemoryContextSwitchTo(buildstate->giststate->tempCxt);
444444

445445
/* form an index tuple and point it at the heap tuple */
446-
itup = gistFormTuple(&buildstate->giststate, index, values, isnull, true);
446+
itup = gistFormTuple(buildstate->giststate, index, values, isnull, true);
447447
itup->t_tid = htup->t_self;
448448

449449
if (buildstate->bufferingMode == GIST_BUFFERING_ACTIVE)
@@ -458,15 +458,15 @@ gistBuildCallback(Relation index,
458458
* locked, we call gistdoinsert directly.
459459
*/
460460
gistdoinsert(index, itup, buildstate->freespace,
461-
&buildstate->giststate);
461+
buildstate->giststate);
462462
}
463463

464464
/* Update tuple count and total size. */
465465
buildstate->indtuples += 1;
466466
buildstate->indtuplesSize += IndexTupleSize(itup);
467467

468468
MemoryContextSwitchTo(oldCtx);
469-
MemoryContextReset(buildstate->tmpCtx);
469+
MemoryContextReset(buildstate->giststate->tempCxt);
470470

471471
if (buildstate->bufferingMode == GIST_BUFFERING_ACTIVE &&
472472
buildstate->indtuples % BUFFERING_MODE_TUPLE_SIZE_STATS_TARGET == 0)
@@ -520,7 +520,7 @@ static bool
520520
gistProcessItup(GISTBuildState *buildstate, IndexTuple itup,
521521
GISTBufferingInsertStack *startparent)
522522
{
523-
GISTSTATE *giststate = &buildstate->giststate;
523+
GISTSTATE *giststate = buildstate->giststate;
524524
GISTBuildBuffers *gfbb = buildstate->gfbb;
525525
Relation indexrel = buildstate->indexrel;
526526
GISTBufferingInsertStack *path;
@@ -652,7 +652,7 @@ gistbufferinginserttuples(GISTBuildState *buildstate, Buffer buffer,
652652

653653
is_split = gistplacetopage(buildstate->indexrel,
654654
buildstate->freespace,
655-
&buildstate->giststate,
655+
buildstate->giststate,
656656
buffer,
657657
itup, ntup, oldoffnum,
658658
InvalidBuffer,
@@ -720,7 +720,7 @@ gistbufferinginserttuples(GISTBuildState *buildstate, Buffer buffer,
720720
* buffers that will eventually be inserted to them.
721721
*/
722722
gistRelocateBuildBuffersOnSplit(gfbb,
723-
&buildstate->giststate,
723+
buildstate->giststate,
724724
buildstate->indexrel,
725725
path, buffer, splitinfo);
726726

@@ -919,7 +919,7 @@ gistProcessEmptyingQueue(GISTBuildState *buildstate)
919919
}
920920

921921
/* Free all the memory allocated during index tuple processing */
922-
MemoryContextReset(CurrentMemoryContext);
922+
MemoryContextReset(buildstate->giststate->tempCxt);
923923
}
924924
}
925925
}
@@ -938,7 +938,7 @@ gistEmptyAllBuffers(GISTBuildState *buildstate)
938938
MemoryContext oldCtx;
939939
int i;
940940

941-
oldCtx = MemoryContextSwitchTo(buildstate->tmpCtx);
941+
oldCtx = MemoryContextSwitchTo(buildstate->giststate->tempCxt);
942942

943943
/*
944944
* Iterate through the levels from top to bottom.
@@ -970,7 +970,7 @@ gistEmptyAllBuffers(GISTBuildState *buildstate)
970970
nodeBuffer->queuedForEmptying = true;
971971
gfbb->bufferEmptyingQueue =
972972
lcons(nodeBuffer, gfbb->bufferEmptyingQueue);
973-
MemoryContextSwitchTo(buildstate->tmpCtx);
973+
MemoryContextSwitchTo(buildstate->giststate->tempCxt);
974974
}
975975
gistProcessEmptyingQueue(buildstate);
976976
}

src/backend/access/gist/gistget.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -307,12 +307,12 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
307307
* Must call gistindex_keytest in tempCxt, and clean up any leftover
308308
* junk afterward.
309309
*/
310-
oldcxt = MemoryContextSwitchTo(so->tempCxt);
310+
oldcxt = MemoryContextSwitchTo(so->giststate->tempCxt);
311311

312312
match = gistindex_keytest(scan, it, page, i, &recheck);
313313

314314
MemoryContextSwitchTo(oldcxt);
315-
MemoryContextReset(so->tempCxt);
315+
MemoryContextReset(so->giststate->tempCxt);
316316

317317
/* Ignore tuple if it doesn't match */
318318
if (!match)

0 commit comments

Comments
 (0)