-
Notifications
You must be signed in to change notification settings - Fork 2
Comparing changes
Open a pull request
base repository: postgresql-cfbot/postgresql
base: cf/5227~1
head repository: postgresql-cfbot/postgresql
compare: cf/5227
- 17 commits
- 58 files changed
- 4 contributors
Commits on Apr 1, 2025
-
docs: Add acronym and glossary entries for I/O and AIO
These are fairly basic, but better than nothing. While there are several opportunities to link to these entries, this patch does not add any. They will however be referenced by future patches. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20250326183102.92.nmisch@google.com
Configuration menu - View commit details
-
Copy full SHA for e2dee06 - Browse repository at this point
Copy the full SHA e2dee06View commit details -
The new view lists all IO handles that are currently in use and is mainly useful for PG developers, but may also be useful when tuning PG. FIXME: - catversion bump before commit Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Configuration menu - View commit details
-
Copy full SHA for 69e7e6b - Browse repository at this point
Copy the full SHA 69e7e6bView commit details -
aio: Add README.md explaining higher level design
Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
Configuration menu - View commit details
-
Copy full SHA for b35b9c1 - Browse repository at this point
Copy the full SHA b35b9c1View commit details -
To make the tests possible, a few functions from bufmgr.c/localbuf.c had to be exported, via buf_internals.h. Reviewed-by: Noah Misch <noah@leadboat.com> Co-authored-by: Andres Freund <andres@anarazel.de> Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Configuration menu - View commit details
-
Copy full SHA for 927f90f - Browse repository at this point
Copy the full SHA 927f90fView commit details -
md: Add comment & assert to buffer-zeroing path in md[start]readv()
mdreadv() has a codepath to zero out buffers when a read returns zero bytes, guarded by a check for zero_damaged_pages || InRecovery. The InRecovery codepath to zero out buffers in mdreadv() appears to be unreachable. The only known paths to reach mdreadv()/mdstartreadv() in recovery are XLogReadBufferExtended(), vm_readbuf(), and fsm_readbuf(), each of which takes care to extend the relation if necessary. This looks to either have been the case for a long time, or the code was never reachable. The zero_damaged_pages path is incomplete, as as missing segments are not created. Putting blocks into the buffer-pool that do not exist on disk is rather problematic, as such blocks will, at least initially, not be found by scans that rely on smgrnblocks(), as they are beyond EOF. It also can cause weird problems with relation extension, as relation extension does not expect blocks beyond EOF to exist. Therefore we would like to remove that path. mdstartreadv(), which I added in e5fe570b51c, does not implement this zeroing logic. I had started a discussion about that a while ago (linked below), but forgot to act on the conclusion of the discussion, namely to disable the in-memory-zeroing behavior. We could certainly implement equivalent zeroing logic in mdstartreadv(), but it would have to be more complicated due to potential differences in the zero_damaged_pages setting between the definer and completor of IO. Given that we want to remove the logic, that does not seem worth implementing the necessary logic. For now, put an Assert(false) comments documenting this choice into mdreadv() and comments documenting the deprecation of the path in mdreadv() and the non-implementation of it in mdstartreadv(). If we, during testing, discover that we do need the path, we can implement it at that time. Discussion: https://postgr.es/m/postgr.es/m/20250330024513.ac.nmisch@google.com Discussion: https://postgr.es/m/postgr.es/m/3qxxsnciyffyf3wyguiz4besdp5t5uxvv3utg75cbcszojlz7p@uibfzmnukkbd
Configuration menu - View commit details
-
Copy full SHA for d88688b - Browse repository at this point
Copy the full SHA d88688bView commit details -
Author: Reviewed-by: Discussion: https://postgr.es/m/ Backpatch:
Configuration menu - View commit details
-
Copy full SHA for 81b0f39 - Browse repository at this point
Copy the full SHA 81b0f39View commit details -
aio: Add errcontext for processing I/Os for another backend
Push an ErrorContextCallback adding additional detail about the process performing the I/O and the owner of the I/O when those are not the same. For io_method worker, this adds context specifying which process owns the I/O that the I/O worker is processing. For io_method io_uring, this adds context only when a backend is *completing* I/O for another backend. It specifies the pid of the owning process. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/20250325141120.8e.nmisch%40google.com
Configuration menu - View commit details
-
Copy full SHA for 527b9dd - Browse repository at this point
Copy the full SHA 527b9ddView commit details -
aio: Experimental heuristics to increase batching in read_stream.c
Author: Reviewed-by: Discussion: https://postgr.es/m/ Backpatch:
Configuration menu - View commit details
-
Copy full SHA for 615ce93 - Browse repository at this point
Copy the full SHA 615ce93View commit details -
aio: Implement smgr/md/fd write support
TODO: - Right now the sync.c integration with smgr.c/md.c isn't properly safe to use in a critical section The only reason it doesn't immediately fail is that it's reasonably rare that RegisterSyncRequest() fails *and* either: - smgropen()->hash_search(HASH_ENTER) decides to resize the hash table, even though the lookup is guaranteed to succeed for io_method=worker. - an io_method=uring completion is run in a different backend and smgropen() needs to build a new entry and thus needs to allocate memory For a bit I thought this could be worked around easily enough by not doing an smgropen() in mdsyncfiletag(), or adding a "fallible" smgropen() and instead just opening the file directly. That actually does kinda solve the problem, but only because the memory allocation in PathNameOpenFile() uses malloc(), not palloc() and thus doesn't trigger - temp_file_limit implementation
Configuration menu - View commit details
-
Copy full SHA for a7d518a - Browse repository at this point
Copy the full SHA a7d518aView commit details -
Configuration menu - View commit details
-
Copy full SHA for eb254b3 - Browse repository at this point
Copy the full SHA eb254b3View commit details -
bufmgr: Implement AIO write support
As of this commit there are no users of these AIO facilities, that'll come in later commits. Problems with AIO writes: - Write logic needs to be rebased on-top of the patch series to not hit bit dirty buffers while IO is going on The performance impact of doing the memory copies is rather substantial, as on intel memory bandwidth is *the* IO bottleneck even just for the checksum computation, without a copy. That makes the memory copy for something like bounce buffers hurt really badly. And the memory usage of bounce buffers is also really concerning. And even without checksums, several filesystems *really* don't like buffers getting modified during DIO writes. Which I think would mean we ought to use bounce buffers for *all* writes, which would impose a *very* substantial overhead (basically removing the benefit of DMA happening off-cpu). - I think it requires new lwlock.c infrastructure (as v1 of aio had), to make LockBuffer(BUFFER_LOCK_EXCLUSIVE) etc wait in a concurrency safe manner for in-progress writes I can think of ways to solve this purely in bufmgr.c, but only in ways that would cause other problems (e.g. setting BM_IO_IN_PROGRESS before waiting for an exclusive lock) and/or expensive.
Configuration menu - View commit details
-
Copy full SHA for c779e9b - Browse repository at this point
Copy the full SHA c779e9bView commit details -
This is likely never going to anywhere - Thomas Munro is working on something more complete. But I needed a way to exercise aio for checkpointer / bgwriter.
Configuration menu - View commit details
-
Copy full SHA for 6074c30 - Browse repository at this point
Copy the full SHA 6074c30View commit details -
bufmgr: use AIO in checkpointer, bgwriter
This is far from ready - just included to be able to exercise AIO writes and get some preliminary numbers. In all likelihood this will instead be based on-top of work by Thomas Munro instead of the preceding commit. TODO; - This doesn't implement bgwriter_flush_after, checkpointer_flush_after I think that's not too hard to do, it's mainly round tuits. - The queuing logic doesn't carefully respect pin limits That might be ok for checkpointer and bgwriter, but the infrastructure should be usable outside of this as well.
Configuration menu - View commit details
-
Copy full SHA for b9d5e77 - Browse repository at this point
Copy the full SHA b9d5e77View commit details -
Ensure a resowner exists for all paths that may perform AIO
Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi
Configuration menu - View commit details
-
Copy full SHA for e93050c - Browse repository at this point
Copy the full SHA e93050cView commit details -
Temporary: Increase BAS_BULKREAD size
Without this we only can execute very little AIO for sequential scans, as there's just not enough buffers in the ring. This isn't the right fix, as just increasing the ring size can have negative performance implications in workloads where the kernel has all the data cached. Author: Reviewed-By: Discussion: https://postgr.es/m/ Backpatch:
Configuration menu - View commit details
-
Copy full SHA for e1a943b - Browse repository at this point
Copy the full SHA e1a943bView commit details -
For benchmarking it's quite annoying that the first time a memory is touched has completely different perf characteristics than subsequent accesses. Using MAP_POPULATE reduces that substantially.
Configuration menu - View commit details
-
Copy full SHA for b33bd81 - Browse repository at this point
Copy the full SHA b33bd81View commit details -
This branch was automatically generated by a robot using patches from an email thread registered at: https://commitfest.postgresql.org/patch/5227 The branch will be overwritten each time a new patch version is posted to the thread, and also periodically to check for bitrot caused by changes on the master branch. Patch(es): https://www.postgresql.org/message-id/7s6fclfekpcxoaaorwrq67v4vmgixf2dcjcgznuj7vxs3ie3wq@okqbi5vabqbx Author(s): Andres Freund
Commitfest Bot committedApr 1, 2025 Configuration menu - View commit details
-
Copy full SHA for 75f5663 - Browse repository at this point
Copy the full SHA 75f5663View commit details
This comparison is taking too long to generate.
Unfortunately it looks like we can’t render this comparison for you right now. It might be too big, or there might be something weird with your repository.
You can try running this command locally to see the comparison on your machine:
git diff cf/5227~1...cf/5227