
Commit fa91d4c

Make parallel worker shutdown complete entirely via before_shmem_exit().
This is a step toward storing stats in dynamic shared memory. As dynamic shared memory segments are detached from just after before_shmem_exit() callbacks are processed, but before on_shmem_exit() callbacks are, no stats can be collected after before_shmem_exit() callbacks have been processed.

Parallel worker shutdown can cause stats to be emitted during DSM detach callbacks, e.g. for SharedFileSet (which closes its files, which can cause fd.c to emit stats about temporary files). Therefore parallel worker shutdown needs to complete during the processing of before_shmem_exit callbacks.

One might think this problem could instead be solved by carefully ordering the attaching to DSM segments, so that the pgstats segments get detached from later than the parallel query ones. That turns out to not work because the stats hash might need to grow, which can cause new segments to be allocated, which then will be detached from earlier.

There are two code changes:

First, call ParallelWorkerShutdown() via before_shmem_exit. That's a good idea on its own, because other shutdown callbacks like ShutdownPostgres and ShutdownAuxiliaryProcess are called via before_*.

Second, explicitly detach from the parallel query DSM segment, thereby ensuring all stats are emitted during ParallelWorkerShutdown().

There are nicer solutions to these problems, but it's not obvious which of those solutions is the correct one. As the shared memory stats work already is a huge amount of work...

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
Discussion: https://postgr.es/m/20210803023612.iziacxk5syn2r4ut@alap3.anarazel.de
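
The ordering problem described in the commit message can be sketched with a small standalone model. This is only an illustration, not PostgreSQL code: every name below (`simulate_exit`, `toy_dsm_detach`, `report_temp_file_stat`) is hypothetical, and the model assumes, as the message states for the planned shared-memory stats design, that stats can no longer be collected once the before_shmem_exit() phase has finished.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
static bool stats_available;	/* true until the "before" phase has ended */
static int	stats_emitted;		/* stats that reached the stats subsystem */
static int	stats_lost;			/* stats reported after it was shut down */

/* Stand-in for fd.c reporting temp-file usage when a file is closed. */
static void
report_temp_file_stat(void)
{
	if (stats_available)
		stats_emitted++;
	else
		stats_lost++;
}

/* Stand-in for a DSM detach callback (e.g. SharedFileSet closing its files). */
static void
toy_dsm_detach(bool *attached)
{
	if (!*attached)
		return;					/* already detached, nothing left to report */
	*attached = false;
	report_temp_file_stat();
}

/*
 * Simulate process exit and return the number of lost stats.
 *
 * detach_in_before_phase = true models the worker shutdown callback
 * detaching explicitly while "before" callbacks are still running.
 */
int
simulate_exit(bool detach_in_before_phase)
{
	bool		attached = true;

	stats_available = true;
	stats_emitted = 0;
	stats_lost = 0;

	/* Phase 1: "before" callbacks run; stats are still collectable. */
	if (detach_in_before_phase)
		toy_dsm_detach(&attached);
	stats_available = false;	/* assumption: no stats after this phase */

	/* Phase 2: any segment that is still attached gets detached now. */
	toy_dsm_detach(&attached);

	/* Phase 3: "on" callbacks would run here, too late for stats. */

	assert(stats_emitted + stats_lost == 1);
	return stats_lost;
}
```

In this model, `simulate_exit(false)` loses the one stat because the detach callback only fires in phase 2, while `simulate_exit(true)` emits it during phase 1. That is the effect the explicit detach in ParallelWorkerShutdown() is meant to achieve.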
1 parent ee3f8d3 commit fa91d4c

1 file changed: +13 -1 lines changed

src/backend/access/transam/parallel.c

Lines changed: 13 additions & 1 deletion
@@ -1305,7 +1305,7 @@ ParallelWorkerMain(Datum main_arg)
 	/* Arrange to signal the leader if we exit. */
 	ParallelLeaderPid = fps->parallel_leader_pid;
 	ParallelLeaderBackendId = fps->parallel_leader_backend_id;
-	on_shmem_exit(ParallelWorkerShutdown, (Datum) 0);
+	before_shmem_exit(ParallelWorkerShutdown, PointerGetDatum(seg));
 
 	/*
 	 * Now we can find and attach to the error queue provided for us. That's
@@ -1507,13 +1507,25 @@ ParallelWorkerReportLastRecEnd(XLogRecPtr last_xlog_end)
  * This guards against the case where we exit uncleanly without sending an
  * ErrorResponse to the leader, for example because some code calls proc_exit
  * directly.
+ *
+ * Also explicitly detach from dsm segment so that subsystems using
+ * on_dsm_detach() have a chance to send stats before the stats subsystem is
+ * shut down as part of a before_shmem_exit() hook.
+ *
+ * One might think this could instead be solved by carefully ordering the
+ * attaching to dsm segments, so that the pgstats segments get detached from
+ * later than the parallel query one. That turns out to not work because the
+ * stats hash might need to grow which can cause new segments to be allocated,
+ * which then will be detached from earlier.
  */
 static void
 ParallelWorkerShutdown(int code, Datum arg)
 {
 	SendProcSignal(ParallelLeaderPid,
 				   PROCSIG_PARALLEL_MESSAGE,
 				   ParallelLeaderBackendId);
+
+	dsm_detach((dsm_segment *) DatumGetPointer(arg));
 }
 
 /*
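
The second hunk relies on the callback receiving the segment pointer through its Datum argument. A minimal standalone sketch of that pattern follows. It is a toy model, not the PostgreSQL implementation: `toy_datum`, `toy_segment`, `toy_before_exit`, `toy_run_exit_callbacks` and `toy_worker_shutdown` are all hypothetical names, and the registry only assumes that exit callbacks run in reverse order of registration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are PostgreSQL types or functions. */
typedef uintptr_t toy_datum;	/* pointer-sized value, playing the Datum role */
typedef void (*toy_exit_cb) (int code, toy_datum arg);

typedef struct toy_segment
{
	int			attached;		/* 1 while the segment is mapped */
	int			detach_calls;	/* how many times detach really happened */
} toy_segment;

#define TOY_MAX_CALLBACKS 8

static struct
{
	toy_exit_cb fn;
	toy_datum	arg;
}			toy_callbacks[TOY_MAX_CALLBACKS];
static int	toy_ncallbacks;

/* Register a callback together with its pointer-sized argument. */
void
toy_before_exit(toy_exit_cb fn, toy_datum arg)
{
	assert(toy_ncallbacks < TOY_MAX_CALLBACKS);
	toy_callbacks[toy_ncallbacks].fn = fn;
	toy_callbacks[toy_ncallbacks].arg = arg;
	toy_ncallbacks++;
}

/* Run callbacks in reverse order of registration, then empty the list. */
void
toy_run_exit_callbacks(int code)
{
	while (--toy_ncallbacks >= 0)
		toy_callbacks[toy_ncallbacks].fn(code,
										 toy_callbacks[toy_ncallbacks].arg);
	toy_ncallbacks = 0;
}

/* Shutdown callback: recover the segment from the argument and detach it. */
void
toy_worker_shutdown(int code, toy_datum arg)
{
	toy_segment *seg = (toy_segment *) arg;

	(void) code;				/* unused in this sketch */
	if (seg->attached)
	{
		seg->attached = 0;
		seg->detach_calls++;
	}
}
```

A caller would register the callback with `toy_before_exit(toy_worker_shutdown, (toy_datum) &seg)` and later invoke `toy_run_exit_callbacks(0)`. Passing the pointer at registration time means the callback needs no global variable to find the segment, which mirrors the change from `(Datum) 0` to `PointerGetDatum(seg)` in the first hunk.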
