Commit a39e78b

Block signals while computing the sleep time in postmaster's main loop.
DetermineSleepTime() was previously called without blocked signals. That's not good, because it allows signal handlers to interrupt its workings.

DetermineSleepTime() was added in 9.3 with the addition of background workers (da07a1e), where it only read from BackgroundWorkerList. Since 9.4, where dynamic background workers were added (7f7485a), the list is also manipulated in DetermineSleepTime(). That's bad because the list now can be persistently corrupted if modified by both a signal handler and DetermineSleepTime().

This was discovered during the investigation of hangs on buildfarm member anole. It's unclear whether this bug is the source of these hangs or not, but it's worth fixing either way. I have confirmed that it can cause crashes.

It luckily looks like this only can cause problems when bgworkers are actively used.

Discussion: 20140929193733.GB14400@awork2.anarazel.de

Backpatch to 9.3 where background workers were introduced.
1 parent 32984d8 commit a39e78b

1 file changed (+16 -10 lines changed)

src/backend/postmaster/postmaster.c

Lines changed: 16 additions & 10 deletions
@@ -1482,6 +1482,8 @@ DetermineSleepTime(struct timeval * timeout)

 /*
  * Main idle loop of postmaster
+ *
+ * NB: Needs to be called with signals blocked
  */
 static int
 ServerLoop(void)
@@ -1503,34 +1505,38 @@ ServerLoop(void)
         /*
          * Wait for a connection request to arrive.
          *
+         * We block all signals except while sleeping. That makes it safe for
+         * signal handlers, which again block all signals while executing, to
+         * do nontrivial work.
+         *
          * If we are in PM_WAIT_DEAD_END state, then we don't want to accept
-         * any new connections, so we don't call select() at all; just sleep
-         * for a little bit with signals unblocked.
+         * any new connections, so we don't call select(), and just sleep.
          */
         memcpy((char *) &rmask, (char *) &readmask, sizeof(fd_set));

-        PG_SETMASK(&UnBlockSig);
-
         if (pmState == PM_WAIT_DEAD_END)
         {
+            PG_SETMASK(&UnBlockSig);
+
             pg_usleep(100000L); /* 100 msec seems reasonable */
             selres = 0;
+
+            PG_SETMASK(&BlockSig);
         }
         else
         {
             /* must set timeout each time; some OSes change it! */
             struct timeval timeout;

+            /* Needs to run with blocked signals! */
             DetermineSleepTime(&timeout);

+            PG_SETMASK(&UnBlockSig);
+
             selres = select(nSockets, &rmask, NULL, NULL, &timeout);
-        }

-        /*
-         * Block all signals until we wait again. (This makes it safe for our
-         * signal handlers to do nontrivial work.)
-         */
-        PG_SETMASK(&BlockSig);
+            PG_SETMASK(&BlockSig);
+        }

         /* Now check the select() result */
         if (selres < 0)
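
The ordering this hunk establishes (compute the timeout with signals blocked, unblock only around the sleep, block again before looking at the result) can be sketched outside PostgreSQL as follows. This is an illustration under stated assumptions, not the postmaster's code: `compute_timeout`, `wait_for_readable` and `demo_wait_on_pipe` are hypothetical names, `compute_timeout` is only a stand-in for DetermineSleepTime(), and plain `sigprocmask()` is used where the diff uses `PG_SETMASK`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for DetermineSleepTime(): fills in the timeout. */
static void
compute_timeout(long usec, struct timeval *timeout)
{
    timeout->tv_sec = usec / 1000000L;
    timeout->tv_usec = usec % 1000000L;
}

/*
 * One wait step. Signals are unblocked only while select() sleeps.
 * Returns select()'s result: 1 if fd is readable, 0 on timeout, -1 on error.
 */
int
wait_for_readable(int fd, long timeout_usec)
{
    sigset_t    all;
    sigset_t    saved;
    fd_set      rmask;
    struct timeval timeout;
    int         selres;
    int         rc;

    assert(timeout_usec >= 0);

    /* Block everything; "saved" is the mask the caller was running with. */
    sigfillset(&all);
    rc = sigprocmask(SIG_BLOCK, &all, &saved);
    assert(rc == 0);
    (void) rc;

    /* Signals blocked: state shared with handlers may be touched here. */
    compute_timeout(timeout_usec, &timeout);
    FD_ZERO(&rmask);
    FD_SET(fd, &rmask);

    /* Unblock only for the sleep itself, then block again right away. */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    selres = select(fd + 1, &rmask, NULL, NULL, &timeout);
    sigprocmask(SIG_SETMASK, &all, NULL);

    /* A real loop would inspect selres here, still with signals blocked. */

    /* Restore the caller's mask so the sketch has no lasting side effect. */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return selres;
}

/* Small self-check: returns 1 if an empty pipe times out and a filled one is ready. */
int
demo_wait_on_pipe(void)
{
    int         fds[2];
    int         empty;
    int         ready = -1;

    if (pipe(fds) != 0)
        return -1;

    empty = wait_for_readable(fds[0], 10000L);
    if (write(fds[1], "x", 1) == 1)
        ready = wait_for_readable(fds[0], 10000L);

    close(fds[0]);
    close(fds[1]);
    return (empty == 0 && ready == 1) ? 1 : 0;
}
```

Unlike ServerLoop(), which per the new header comment is entered with signals already blocked, this sketch blocks and restores the mask itself so that it can be called from any context.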
