Commit bf26179

Fix various bugs in postmaster SIGKILL processing
Clamp the minimum sleep time during immediate shutdown or crash to a minimum of zero, not a maximum of one second. The previous code could result in a negative sleep time, leading to failure in select() calls.

Also, on crash recovery, reset AbortStartTime as soon as SIGKILL is sent or abort processing has commenced instead of waiting until the startup process completes. Reset AbortStartTime as soon as SIGKILL is sent, too, to avoid doing that repeatedly.

Per trouble report from Jeff Janes on CAMkU=1xd3=wFqZwwuXPWe4BQs3h1seYo8LV9JtSjW5RodoPxMg@mail.gmail.com

Author: MauMau
1 parent 2d6c0f1 commit bf26179
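
The clamp change described in the message can be illustrated with a minimal standalone sketch. It is not postmaster code: the function names are hypothetical, timestamps are plain `long` seconds, and the value 5 for `SIGKILL_CHILDREN_AFTER_SECS` is an assumption made only for the illustration. It compares the old `Min(..., 1)` expression with the new `Max(..., 0)` one.

```c
#include <assert.h>

/* Assumed value, for illustration only. */
#define SIGKILL_CHILDREN_AFTER_SECS 5

#define Min(a, b) ((a) < (b) ? (a) : (b))
#define Max(a, b) ((a) > (b) ? (a) : (b))

/*
 * Old expression: Min(..., 1) caps the result at one second but puts no
 * lower bound on it, so it goes negative once the deadline has passed.
 */
long
sleep_secs_old(long now, long abort_start)
{
    assert(abort_start > 0);
    return Min(SIGKILL_CHILDREN_AFTER_SECS - (now - abort_start), 1);
}

/*
 * New expression: Max(..., 0) returns the real time left and never
 * drops below zero.
 */
long
sleep_secs_new(long now, long abort_start)
{
    assert(abort_start > 0);
    return Max(SIGKILL_CHILDREN_AFTER_SECS - (now - abort_start), 0);
}
```

With an abort that started at second 100 and a current time of 108, the old expression yields -3 and the new one yields 0. At second 102 the old expression yields 1 while the new one yields the actual 3 seconds remaining.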

1 file changed, +10 -5 lines changed

src/backend/postmaster/postmaster.c

+10-5
@@ -1422,9 +1422,9 @@ DetermineSleepTime(struct timeval * timeout)
     {
         if (AbortStartTime > 0)
         {
-            /* remaining time, but at least 1 second */
-            timeout->tv_sec = Min(SIGKILL_CHILDREN_AFTER_SECS -
-                                  (time(NULL) - AbortStartTime), 1);
+            /* time left to abort; clamp to 0 in case it already expired */
+            timeout->tv_sec = Max(SIGKILL_CHILDREN_AFTER_SECS -
+                                  (time(NULL) - AbortStartTime), 0);
             timeout->tv_usec = 0;
         }
         else
@@ -1676,10 +1676,13 @@ ServerLoop(void)
          * Note we also do this during recovery from a process crash.
          */
         if ((Shutdown >= ImmediateShutdown || (FatalError && !SendStop)) &&
+            AbortStartTime > 0 &&
             now - AbortStartTime >= SIGKILL_CHILDREN_AFTER_SECS)
         {
             /* We were gentle with them before. Not anymore */
             TerminateChildren(SIGKILL);
+            /* reset flag so we don't SIGKILL again */
+            AbortStartTime = 0;

             /*
              * Additionally, unless we're recovering from a process crash, it's
@@ -2584,7 +2587,7 @@ reaper(SIGNAL_ARGS)
                  * Startup succeeded, commence normal operations
                  */
                 FatalError = false;
-                AbortStartTime = 0;
+                Assert(AbortStartTime == 0);
                 ReachedNormalRunning = true;
                 pmState = PM_RUN;

@@ -3544,6 +3547,8 @@ PostmasterStateMachine(void)
             StartupPID = StartupDataBase();
             Assert(StartupPID != 0);
             pmState = PM_STARTUP;
+            /* crash recovery started, reset SIGKILL flag */
+            AbortStartTime = 0;
         }
     }

@@ -4737,7 +4742,7 @@ sigusr1_handler(SIGNAL_ARGS)
         {
             /* WAL redo has started. We're out of reinitialization. */
             FatalError = false;
-            AbortStartTime = 0;
+            Assert(AbortStartTime == 0);

             /*
              * Crank up the background tasks. It doesn't matter if this fails,
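
The ServerLoop hunk above can be read as a one-shot flag pattern: the timestamp doubles as an "abort in progress" flag and is cleared once the SIGKILL round has been sent. The following is a rough sketch of that idea only. `PmSketch`, `server_loop_tick`, and the counter standing in for `TerminateChildren(SIGKILL)` are hypothetical names, and the value 5 for `SIGKILL_CHILDREN_AFTER_SECS` is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value, for illustration only. */
#define SIGKILL_CHILDREN_AFTER_SECS 5

/* Hypothetical stand-in for the relevant postmaster state. */
typedef struct PmSketch
{
    long abort_start_time;  /* 0 means no abort is being timed */
    int  sigkill_rounds;    /* counts simulated TerminateChildren(SIGKILL) calls */
} PmSketch;

/*
 * One pass of the check: act only while the flag is set and the grace
 * period has elapsed, then clear the flag so later passes do nothing.
 */
void
server_loop_tick(PmSketch *pm, long now)
{
    assert(pm != NULL);

    if (pm->abort_start_time > 0 &&
        now - pm->abort_start_time >= SIGKILL_CHILDREN_AFTER_SECS)
    {
        pm->sigkill_rounds++;       /* stand-in for TerminateChildren(SIGKILL) */
        pm->abort_start_time = 0;   /* reset flag so we don't SIGKILL again */
    }
}
```

Without the reset, every later pass would satisfy the elapsed-time test again and resend the signal. The extra `abort_start_time > 0` guard matters for the same reason: with the flag cleared to 0, `now - 0` would otherwise always exceed the limit.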
