Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: postgresql-cfbot/postgresql
Failed to load repositories. Confirm that selected base ref is valid, then try again.
Loading
base: cf/5748~1
Choose a base ref
...
head repository: postgresql-cfbot/postgresql
Failed to load repositories. Confirm that selected head ref is valid, then try again.
Loading
compare: cf/5748
Choose a head ref
  • 19 commits
  • 31 files changed
  • 2 contributors

Commits on May 31, 2025

  1. Improve our support for Valgrind's leak tracking.

    When determining whether an allocated chunk is still reachable,
    Valgrind will consider only pointers within what it believes to be
    allocated chunks.  Normally, all of a block obtained from malloc()
    would be considered "allocated" --- but it turns out that if we use
    VALGRIND_MEMPOOL_ALLOC to designate sub-section(s) of a malloc'ed
    block as allocated, all the rest of that malloc'ed block is ignored.
    This leads to lots of false positives of course.  In particular,
    in any multi-malloc-block context, all but the primary block were
    reported as leaked.  We also had a problem with context "ident"
    strings, which were reported as leaked unless there was some other
    pointer to them besides the one in the context header.
    
    To fix, we need to use VALGRIND_MEMPOOL_ALLOC to designate
    a context's management structs (the context struct itself and
    any per-block headers) as allocated chunks.  That forces moving
    the VALGRIND_CREATE_MEMPOOL/VALGRIND_DESTROY_MEMPOOL calls into
    the per-context-type code, so that the pool identifier can be
    made as soon as we've allocated the initial block, but otherwise
    it's fairly straightforward.  Note that in Valgrind's eyes there
    is no distinction between these allocations and the allocations
    that the mmgr modules hand out to user code.  That's fine for
    now, but perhaps someday we'll want to do better yet.
    
    When reading this patch, it's helpful to start with the comments
    added at the head of mcxt.c.
    
    Author: Andres Freund <andres@anarazel.de>
    Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    Discussion: https://postgr.es/m/20210317181531.7oggpqevzz6bka3g@alap3.anarazel.de
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    fadb777 View commit details
    Browse the repository at this point in the history
  2. Fix MemoryContextAllocAligned's interaction with Valgrind.

    Arrange that only the "aligned chunk" part of the allocated space is
    included in a Valgrind vchunk.  This suppresses complaints about that
    vchunk being possibly lost because PG is retaining only pointers to
    the aligned chunk.  Also make sure that trailing wasted space is
    marked NOACCESS.
    
    As a tiny performance improvement, arrange that MCXT_ALLOC_ZERO zeroes
    only the returned "aligned chunk", not the wasted padding space.
    
    In passing, fix GetLocalBufferStorage to use MemoryContextAllocAligned
    instead of rolling its own implementation, which was equally broken
    according to Valgrind.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    3de170f View commit details
    Browse the repository at this point in the history
  3. Silence complaints about leaked dynahash storage.

    Because dynahash.c never frees hashtable storage except by deleting
    the whole hashtable context, it doesn't bother to track the individual
    blocks of elements allocated by element_alloc().  This results in
    "possibly lost" complaints from Valgrind except when the first element
    of each block is actively in use.  (Otherwise it'll be on a freelist,
    but very likely only reachable via "interior pointers" within element
    blocks, which doesn't satisfy Valgrind.)
    
    To fix, if we're building with USE_VALGRIND, expend an extra pointer's
    worth of space in each element block so that we can chain them all
    together from the HTAB header.  Skip this in shared hashtables though:
    Valgrind doesn't track those, and we'd need additional locking to make
    it safe to manipulate a shared chain.
    
    While here, update a comment obsoleted by 9c911ec.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    46706ce View commit details
    Browse the repository at this point in the history
  4. Silence complaints involving dlist_node lists.

    Put the dlist_node fields of catctup and catclist structs first.
    This ensures that the dlist pointers point to the starts of these
    palloc blocks, and thus that Valgrind won't consider them
    "possibly lost".
    
    The postmaster's PMChild structs and the autovac launcher's avl_dbase
    structs also have the dlist_node-is-not-first problem, but putting it
    first still wouldn't silence the warning because we bulk-allocate
    those structs in an array, so that Valgrind sees a single allocation.
    Commonly the first array element will be pointed to only from some
    later element, so that the reference would be an interior pointer even
    if it pointed to the array start.  (This is the same issue fixed in
    the previous patch for dynahash elements.)  Since these are pretty
    simple data structures, I don't feel too bad about faking out Valgrind
    by just keeping a static pointer to the array start.
    
    This is all quite hacky, and it's not hard to imagine usages where
    we'd need some other idea in order to have reasonable leak tracking of
    structures that are only accessible via dlist_node lists.  But this
    patch seems to be enough to silence this class of leakage complaints
    for the moment.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    da5ff0b View commit details
    Browse the repository at this point in the history
  5. Silence complaints about save_ps_display_args.

    Valgrind seems not to consider the global "environ" variable as a
    valid root pointer; so when we allocate a new environment array,
    it claims that data is leaked.  To fix that, keep our own
    statically-allocated copy of the pointer, similarly to the
    previous patch.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    1e50d59 View commit details
    Browse the repository at this point in the history
  6. Don't leak the startup-packet buffer in ProcessStartupPacket.

    This is the first actual leakage bug fix in this patch series.
    
    The amount of memory regained is quite negligible of course.
    But we don't want Valgrind whining about this in every session.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    9e5ba62 View commit details
    Browse the repository at this point in the history
  7. Silence complaints about autovacuum workers.

    Free a couple of data structures manually near the end of the run
    when USE_VALGRIND, and ensure that the final vac_update_datfrozenxid()
    call is done in a non-permanent context.  This doesn't have any real
    effect on the process's total memory consumption, since we're going to
    exit as soon as that last transaction is done.  But it does pacify
    Valgrind.
    
    In combination with commit 02502c1, these fixes reduce reported
    leakage in autovacuum workers to zero in my tests.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    28258da View commit details
    Browse the repository at this point in the history
  8. Fix leakage of pq_mq_handle in a parallel worker.

    The amount of storage involved here is quite negligible,
    but without this change Valgrind will complain about every
    parallel worker process.
    
    While at it, move mq_putmessage's "Assert(pq_mq_handle != NULL)"
    to someplace where it's not trivially useless.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    e9e51eb View commit details
    Browse the repository at this point in the history
  9. Reduce leakage during PL/pgSQL function compilation.

    format_procedure leaks memory, so run it in a short-lived context
    not the session-lifespan cache context for the PL/pgSQL function.
    
    parse_datatype called the core parser in the function's cache context,
    thus leaking potentially a lot of storage into that context.  We were
    also being a bit careless with the TypeName structures made in that
    code path and others.  Most of the time we don't need to retain the
    TypeName, so make sure it is made in the short-lived temp context,
    and copy it only if we do need to retain it.
    
    These are far from the only leaks in PL/pgSQL compilation, but
    they're the biggest as far as I've seen, and further improvement
    looks like it'd require delicate and bug-prone surgery.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    59fbba4 View commit details
    Browse the repository at this point in the history
  10. Suppress complaints about leaks in function cache loading.

    PL/pgSQL and SQL-function parsing leak some stuff into the long-lived
    function cache context.  This isn't really a huge practical problem,
    since it's not a large amount of data and the cruft will be recovered
    if we have to re-parse the function.  It's not clear that it's worth
    working any harder than the previous patch did to eliminate these
    leak complaints, so instead silence them with a suppression rule.
    
    This suppression rule also hides the fact that CachedFunction structs
    are intentionally leaked in some cases because we're unsure if any
    fn_extra pointers remain.  That might be nice to do something about
    eventually, but it's not clear how.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    ea2f5fa View commit details
    Browse the repository at this point in the history
  11. Suppress complaints about leaks in TS dictionary loading.

    Like the situation with function cache loading, text search
    dictionary loading functions tend to leak some cruft into the
    dictionary's long-lived cache context.  To judge by the examples in
    the core regression tests, not very many bytes are at stake.
    Moreover, I don't see a way to prevent such leaks without changing the
    API for TS template initialization functions: right now they do not
    have to worry about making sure that their results are long-lived.
    
    Hence, I think we should install a suppression rule rather than trying
    to fix this completely.  However, I did grab some low-hanging fruit:
    several places were leaking the result of get_tsearch_config_filename.
    This seems worth doing mostly because they are inconsistent with other
    dictionaries that were freeing it already.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    a48c894 View commit details
    Browse the repository at this point in the history
  12. Fix leaks in load_domaintype_info().

    load_domaintype_info() intentionally leaks some intermediate cruft
    into the long-lived DomainConstraintCache's memory context, reasoning
    that the amount of leakage will typically not be much so it's not
    worth doing a copyObject() of the final tree to avoid that.  But
    Valgrind knows nothing of engineering tradeoffs and complains anyway.
    
    On the whole, the copyObject doesn't cost that much and this is
    surely not a performance-critical code path, so let's do it the
    clean way.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    5d23463 View commit details
    Browse the repository at this point in the history
  13. Silence complaints about leaks in PlanCacheComputeResultDesc.

    CompleteCachedPlan intentionally doesn't worry about small
    leaks from PlanCacheComputeResultDesc.  However, Valgrind
    knows nothing of engineering tradeoffs and complains anyway.
    Silence it by doing things the hard way if USE_VALGRIND.
    
    I don't really love this patch, because it makes the handling
    of plansource->resultDesc different from the handling of query
    dependencies and search_path just above, which likewise are willing
    to accept small leaks into the cached plan's context.  However,
    those cases aren't provoking Valgrind complaints.  (Perhaps in a
    CLOBBER_CACHE_ALWAYS build, they would?)  For the moment, this
    makes the src/pl/plpgsql tests leak-free according to Valgrind.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    842efeb View commit details
    Browse the repository at this point in the history
  14. Fix leaks when replacing or deleting a GUC placeholder variable.

    MarkGUCPrefixReserved didn't bother to clean up removed placeholder
    GUCs at all, which shows up as a leak in one regression test.
    It seems appropriate for it to do as much cleanup as
    define_custom_variable does when replacing placeholders, so factor
    that code out into a helper function.
    
    Testing showed that define_custom_variable's logic was one brick shy
    of a load too: it forgot to free the separate allocation for the
    placeholder's name.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    1037817 View commit details
    Browse the repository at this point in the history
  15. Fix leak in logicalrep_worker_detach().

    This runs in a long-lived context, so Valgrind complains about it.
    It's not clear to me that very much memory can actually be lost.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    9664ba6 View commit details
    Browse the repository at this point in the history
  16. Fix leak in evtcache.c's DecodeTextArrayToBitmapset().

    If the presented array is toasted, this neglected to free the
    detoasted copy, which was then leaked into EventTriggerCacheContext.
    
    I'm a bit distressed by the amount of code that BuildEventTriggerCache
    is willing to run while switched into a long-lived cache context.
    Although the detoasted array is the only leak that Valgrind reports,
    let's tighten things up while we're here.  (DecodeTextArrayToBitmapset
    is still run in the cache context, so doing that doesn't remove the
    need for the detoast fix.  But it reduces the surface area for other
    leaks.)
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    2af8edb View commit details
    Browse the repository at this point in the history
  17. Silence complaints about leaks in postmaster.

    Valgrind complains about the postmaster's socket-files and
    lock-files lists being leaked, which we can silence by just
    not nulling out the static pointers to them.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    296550a View commit details
    Browse the repository at this point in the history
  18. Fix leaks in startup process.

    These leaks are of absolutely no real-world consequence,
    since they are small one-time leaks in a one-time process.
    But this is the last step to get to zero reported leaks
    in a run of the core regression tests, so let's do it.
    
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
    tglsfdc authored and Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    32fd40f View commit details
    Browse the repository at this point in the history
  19. [CF 5748] v3 - Improve Valgrind support and remove some memory leaks

    This branch was automatically generated by a robot using patches from an
    email thread registered at:
    
    https://commitfest.postgresql.org/patch/5748
    
    The branch will be overwritten each time a new patch version is posted to
    the thread, and also periodically to check for bitrot caused by changes
    on the master branch.
    
    Patch(es): https://www.postgresql.org/message-id/2884224.1748035274@sss.pgh.pa.us
    Author(s): Tom Lane
    Commitfest Bot committed May 31, 2025
    Configuration menu
    Copy the full SHA
    836db92 View commit details
    Browse the repository at this point in the history
Loading