
Commit 590b045

Improve memory management and performance of tuplestore.c
Here we make tuplestore.c use a generation.c memory context rather than allocating tuples into the CurrentMemoryContext, which primarily is the ExecutorState or PortalHoldContext memory context. Not having a dedicated context can cause the CurrentMemoryContext to become bloated when pfree'd chunks are not reused by future tuples.

Using generation speeds up users of tuplestore.c, such as the Materialize, WindowAgg and CTE Scan executor nodes. The main reason for the speedup is that generation.c is more memory efficient than aset.c memory contexts. Specifically, generation does not round allocation sizes up to the next power-of-2 value. This both saves memory, allowing more tuples to fit in work_mem, and makes the memory usage more compact so that it fits on fewer cachelines. One benchmark showed up to a 22% performance increase in a query containing a Materialize node. Much higher gains are possible if the memory reduction prevents tuplestore.c from spilling to disk. This is especially true for WindowAgg nodes, where improvements of several thousand times are possible if the memory reductions made here prevent tuplestore from spilling to disk.

Additionally, a generation.c memory context is much better suited to this job as it works well with FIFO palloc/pfree patterns, which is exactly how tuplestore.c uses it. Because of the way generation.c allocates memory, tuples stored consecutively in a tuplestore are much more likely to be stored consecutively in memory. This allows the CPU's hardware prefetcher to work more efficiently, as it sees a more predictable pattern and can load the cachelines for the next tuple from RAM before the executor needs them.

Using a dedicated memory context for storing tuples also allows us to clean up the memory used by the tuplestore more efficiently, as we can reset or delete the context rather than looping over all stored tuples and pfree'ing them one by one.

Also, remove a badly placed USEMEM call in readtup_heap(). The tuple wasn't being allocated in the Tuplestorestate's context, so there was no need to adjust the memory consumed by the tuplestore there.

Author: David Rowley
Reviewed-by: Matthias van de Meent, Dmitry Dolgov
Discussion: https://postgr.es/m/CAApHDvp5Py9g4Rjq7_inL3-MCK1Co2CRt_YWFwTU2zfQix0p4A@mail.gmail.com
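As a minimal sketch of the lifecycle described above (not part of the patch, and assuming PostgreSQL backend code where CurrentMemoryContext is set), a dedicated generation context is created once, tuples are palloc'd and pfree'd in FIFO order, and cleanup is a single reset or delete. The function and context names below are illustrative only:

#include "postgres.h"
#include "utils/memutils.h"

/*
 * Illustrative only: a generation.c context used for FIFO-style tuple
 * storage, freed oldest-first and then released all at once.
 */
static void
fifo_context_sketch(void)
{
    MemoryContext tupcontext;
    void       *first;
    void       *second;

    /* Same call pattern as the patch, but with a made-up context name */
    tupcontext = GenerationContextCreate(CurrentMemoryContext,
                                         "sketch tuples",
                                         ALLOCSET_DEFAULT_SIZES);

    /* Tuples are allocated as they are stored... */
    first = MemoryContextAlloc(tupcontext, 100);
    second = MemoryContextAlloc(tupcontext, 100);

    /* ...and freed oldest-first, the FIFO pattern generation.c handles well */
    pfree(first);
    pfree(second);

    /*
     * Bulk cleanup: one reset (as tuplestore_clear now does) or one delete
     * (as tuplestore_end now does) replaces pfree'ing each stored tuple.
     */
    MemoryContextReset(tupcontext);
    MemoryContextDelete(tupcontext);
}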
1 parent 53abb1e commit 590b045

File tree

1 file changed: +40 -15 lines changed


src/backend/utils/sort/tuplestore.c

@@ -266,7 +266,14 @@ tuplestore_begin_common(int eflags, bool interXact, int maxKBytes)
     state->availMem = state->allowedMem;
     state->maxSpace = 0;
     state->myfile = NULL;
-    state->context = CurrentMemoryContext;
+
+    /*
+     * The palloc/pfree pattern for tuple memory is in a FIFO pattern.  A
+     * generation context is perfectly suited for this.
+     */
+    state->context = GenerationContextCreate(CurrentMemoryContext,
+                                             "tuplestore tuples",
+                                             ALLOCSET_DEFAULT_SIZES);
     state->resowner = CurrentResourceOwner;
 
     state->memtupdeleted = 0;
@@ -429,14 +436,38 @@ tuplestore_clear(Tuplestorestate *state)
     if (state->myfile)
         BufFileClose(state->myfile);
     state->myfile = NULL;
-    if (state->memtuples)
+
+#ifdef USE_ASSERT_CHECKING
     {
+        int64       availMem = state->availMem;
+
+        /*
+         * Below, we reset the memory context for storing tuples.  To save
+         * from having to always call GetMemoryChunkSpace() on all stored
+         * tuples, we adjust the availMem to forget all the tuples and just
+         * recall USEMEM for the space used by the memtuples array.  Here we
+         * just Assert that's correct and the memory tracking hasn't gone
+         * wrong anywhere.
+         */
         for (i = state->memtupdeleted; i < state->memtupcount; i++)
-        {
-            FREEMEM(state, GetMemoryChunkSpace(state->memtuples[i]));
-            pfree(state->memtuples[i]);
-        }
+            availMem += GetMemoryChunkSpace(state->memtuples[i]);
+
+        availMem += GetMemoryChunkSpace(state->memtuples);
+
+        Assert(availMem == state->allowedMem);
     }
+#endif
+
+    /* clear the memory consumed by the memory tuples */
+    MemoryContextReset(state->context);
+
+    /*
+     * Zero the used memory and re-consume the space for the memtuples array.
+     * This saves having to FREEMEM for each stored tuple.
+     */
+    state->availMem = state->allowedMem;
+    USEMEM(state, GetMemoryChunkSpace(state->memtuples));
+
     state->status = TSS_INMEM;
     state->truncated = false;
     state->memtupdeleted = 0;
@@ -458,16 +489,11 @@ tuplestore_clear(Tuplestorestate *state)
 void
 tuplestore_end(Tuplestorestate *state)
 {
-    int         i;
-
     if (state->myfile)
         BufFileClose(state->myfile);
-    if (state->memtuples)
-    {
-        for (i = state->memtupdeleted; i < state->memtupcount; i++)
-            pfree(state->memtuples[i]);
-        pfree(state->memtuples);
-    }
+
+    MemoryContextDelete(state->context);
+    pfree(state->memtuples);
     pfree(state->readptrs);
     pfree(state);
 }
@@ -1578,7 +1604,6 @@ readtup_heap(Tuplestorestate *state, unsigned int len)
     MinimalTuple tuple = (MinimalTuple) palloc(tuplen);
     char       *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
 
-    USEMEM(state, GetMemoryChunkSpace(tuple));
     /* read in the tuple proper */
     tuple->t_len = tuplen;
     BufFileReadExact(state->myfile, tupbody, tupbodylen);
0 commit comments
