Commit 27b2c6a

Don't launch new child processes after we've been told to shut down.

Once we've received a shutdown signal (SIGINT or SIGTERM), we should not launch any more child processes, even if we get signals requesting such. The normal code path for spawning backends has always understood that, but the postmaster's infrastructure for hot standby and autovacuum didn't get the memo. As reported by Hari Babu in bug #7643, this could lead to failure to shut down at all in some cases, such as when SIGINT is received just before the startup process sends PMSIGNAL_RECOVERY_STARTED: we'd launch a bgwriter and checkpointer, and then those processes would have no idea that they ought to quit. Similarly, launching a new autovacuum worker would result in waiting till it finished before shutting down.

Also, switch the order of the code blocks in reaper() that detect startup process crash versus shutdown termination. Once we've sent it a signal, we should not consider that exit(1) is surprising. This is just a cosmetic fix since shutdown occurs correctly anyway, but better not to log a phony complaint about startup process crash.

Back-patch to 9.0. Some parts of this might be applicable before that, but given the lack of prior complaints I'm not going to worry too much about older branches.

1 parent 5cb0e33 commit 27b2c6a
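
The guard this commit adds to sigusr1_handler() can be illustrated with a small standalone sketch. This is not postmaster code: `Supervisor`, `Request`, `check_request()` and `handle_requests()` are hypothetical names invented for the example, and only the `Shutdown == NoShutdown` test mirrors the diff. Each request is consumed first and acted on only when no shutdown is in progress, following the operand order of the patched conditions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the postmaster's shutdown level. */
typedef enum { NoShutdown = 0, SmartShutdown, FastShutdown } ShutdownLevel;

/* Hypothetical request kinds, loosely modelled on the PMSIGNAL_* reasons. */
typedef enum {
    REQ_START_AUTOVAC_WORKER,
    REQ_START_WALRECEIVER,
    NUM_REQUESTS
} Request;

/* Hypothetical supervisor state; not a PostgreSQL structure. */
typedef struct {
    ShutdownLevel shutdown;          /* current shutdown level */
    bool pending[NUM_REQUESTS];      /* requests raised by children */
    int children_started;            /* count of launches performed */
} Supervisor;

/* Report whether a request was pending, clearing it either way. */
bool check_request(Supervisor *s, Request r)
{
    bool was_pending = s->pending[r];

    s->pending[r] = false;
    return was_pending;
}

/*
 * Process all pending requests. A child is launched only if the request
 * was pending AND no shutdown has been requested; a request that arrives
 * during shutdown is consumed and dropped.
 */
void handle_requests(Supervisor *s)
{
    for (int r = 0; r < NUM_REQUESTS; r++)
    {
        if (check_request(s, (Request) r) && s->shutdown == NoShutdown)
            s->children_started++;   /* stands in for starting a child */
    }
}
```

In this sketch a request raised after the shutdown level changes is cleared without starting anything, so no late child is left running for the supervisor to wait on.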

1 file changed, 23 additions and 20 deletions
src/backend/postmaster/postmaster.c

Lines changed: 23 additions & 20 deletions
@@ -2261,9 +2261,9 @@ pmdie(SIGNAL_ARGS)
             if (pmState == PM_RECOVERY)
             {
                 /*
-                 * Only startup, bgwriter, and checkpointer should be active
-                 * in this state; we just signaled the first two, and we don't
-                 * want to kill checkpointer yet.
+                 * Only startup, bgwriter, walreceiver, and/or checkpointer
+                 * should be active in this state; we just signaled the first
+                 * three, and we don't want to kill checkpointer yet.
                  */
                 pmState = PM_WAIT_BACKENDS;
             }
@@ -2354,6 +2354,18 @@ reaper(SIGNAL_ARGS)
         {
             StartupPID = 0;

+            /*
+             * Startup process exited in response to a shutdown request (or it
+             * completed normally regardless of the shutdown request).
+             */
+            if (Shutdown > NoShutdown &&
+                (EXIT_STATUS_0(exitstatus) || EXIT_STATUS_1(exitstatus)))
+            {
+                pmState = PM_WAIT_BACKENDS;
+                /* PostmasterStateMachine logic does the rest */
+                continue;
+            }
+
             /*
              * Unexpected exit of startup process (including FATAL exit)
              * during PM_STARTUP is treated as catastrophic. There are no
@@ -2368,18 +2380,6 @@ reaper(SIGNAL_ARGS)
                 ExitPostmaster(1);
             }

-            /*
-             * Startup process exited in response to a shutdown request (or it
-             * completed normally regardless of the shutdown request).
-             */
-            if (Shutdown > NoShutdown &&
-                (EXIT_STATUS_0(exitstatus) || EXIT_STATUS_1(exitstatus)))
-            {
-                pmState = PM_WAIT_BACKENDS;
-                /* PostmasterStateMachine logic does the rest */
-                continue;
-            }
-
             /*
              * After PM_STARTUP, any unexpected exit (including FATAL exit) of
              * the startup process is catastrophic, so kill other children,
@@ -4283,7 +4283,7 @@ sigusr1_handler(SIGNAL_ARGS)
      * first. We don't want to go back to recovery in that case.
      */
     if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED) &&
-        pmState == PM_STARTUP)
+        pmState == PM_STARTUP && Shutdown == NoShutdown)
     {
         /* WAL redo has started. We're out of reinitialization. */
         FatalError = false;
@@ -4300,7 +4300,7 @@ sigusr1_handler(SIGNAL_ARGS)
         pmState = PM_RECOVERY;
     }
     if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
-        pmState == PM_RECOVERY)
+        pmState == PM_RECOVERY && Shutdown == NoShutdown)
     {
         /*
          * Likewise, start other special children as needed.
@@ -4331,7 +4331,8 @@ sigusr1_handler(SIGNAL_ARGS)
         signal_child(SysLoggerPID, SIGUSR1);
     }

-    if (CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER))
+    if (CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER) &&
+        Shutdown == NoShutdown)
     {
         /*
          * Start one iteration of the autovacuum daemon, even if autovacuuming
@@ -4345,7 +4346,8 @@ sigusr1_handler(SIGNAL_ARGS)
         start_autovac_launcher = true;
     }

-    if (CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER))
+    if (CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER) &&
+        Shutdown == NoShutdown)
     {
         /* The autovacuum launcher wants us to start a worker process. */
         StartAutovacuumWorker();
@@ -4354,7 +4356,8 @@ sigusr1_handler(SIGNAL_ARGS)
     if (CheckPostmasterSignal(PMSIGNAL_START_WALRECEIVER) &&
         WalReceiverPID == 0 &&
         (pmState == PM_STARTUP || pmState == PM_RECOVERY ||
-         pmState == PM_HOT_STANDBY || pmState == PM_WAIT_READONLY))
+         pmState == PM_HOT_STANDBY || pmState == PM_WAIT_READONLY) &&
+        Shutdown == NoShutdown)
     {
         /* Startup Process wants us to start the walreceiver process. */
         WalReceiverPID = StartWalReceiver();
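
The effect of reordering the blocks in reaper() can be sketched as a small classification function. This is a simplified illustration, not the postmaster's logic: `StartupExitKind` and `classify_startup_exit()` are hypothetical, the boolean and integer parameters stand in for `Shutdown > NoShutdown`, `pmState == PM_STARTUP` and the exit status macros, and the failure conditions are assumptions inferred from the comments visible in the diff context.

```c
#include <stdbool.h>

/* Hypothetical outcome labels; the real reaper() acts directly instead. */
typedef enum {
    STARTUP_EXIT_SHUTDOWN,          /* expected exit after a shutdown request */
    STARTUP_EXIT_STARTUP_FAILURE,   /* unexpected exit during startup */
    STARTUP_EXIT_CRASH,             /* unexpected exit after startup */
    STARTUP_EXIT_NORMAL             /* clean exit, no shutdown pending */
} StartupExitKind;

/*
 * Classify a startup process exit. The shutdown check comes first, as in
 * the patched ordering: once shutdown has been requested, exit code 0 or 1
 * is expected and must not be reported as a startup failure.
 */
StartupExitKind
classify_startup_exit(bool shutdown_requested, bool in_pm_startup,
                      int exit_code)
{
    if (shutdown_requested && (exit_code == 0 || exit_code == 1))
        return STARTUP_EXIT_SHUTDOWN;

    /* Assumed condition: any nonzero exit while still starting up. */
    if (in_pm_startup && exit_code != 0)
        return STARTUP_EXIT_STARTUP_FAILURE;

    /* Assumed condition: any other nonzero exit is treated as a crash. */
    if (exit_code != 0)
        return STARTUP_EXIT_CRASH;

    return STARTUP_EXIT_NORMAL;
}
```

With the two first checks swapped, as before the patch, an exit code of 1 during startup would be classified as a startup failure even when a shutdown had been requested, which corresponds to the phony complaint the commit message describes.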
