Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 6322e84

Browse files
committed
Change StatementCancelHandler() to check the DoingCommandRead flag to decide
whether to execute an immediate interrupt, rather than testing whether LockWaitCancel() cancelled a lock wait. The old way misclassified the case where we were blocked in ProcWaitForSignal(), and arguably would misclassify any other future additions of new ImmediateInterruptOK states too. This allows reverting the old kluge that gave LockWaitCancel() a return value, since no callers care anymore. Improve comments in the various implementations of PGSemaphoreLock() to explain that on some platforms, the assumption that semop() exits after a signal is wrong, and so we must ensure that the signal handler itself throws elog if we want cancel or die interrupts to be effective. Per testing related to bug #3883, though this patch doesn't solve those problems fully. Perhaps this change should be back-patched, but since pre-8.3 branches aren't really relying on autovacuum to respond to SIGINT, it doesn't seem critical for them.
1 parent ace1b29 commit 6322e84

File tree

6 files changed

+39
-64
lines changed

6 files changed

+39
-64
lines changed

src/backend/port/posix_sema.c

Lines changed: 5 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
* Portions Copyright (c) 1994, Regents of the University of California
1212
*
1313
* IDENTIFICATION
14-
* $PostgreSQL: pgsql/src/backend/port/posix_sema.c,v 1.19 2008/01/01 19:45:51 momjian Exp $
14+
* $PostgreSQL: pgsql/src/backend/port/posix_sema.c,v 1.20 2008/01/26 19:55:08 tgl Exp $
1515
*
1616
*-------------------------------------------------------------------------
1717
*/
@@ -241,37 +241,10 @@ PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
241241
int errStatus;
242242

243243
/*
244-
* Note: if errStatus is -1 and errno == EINTR then it means we returned
245-
* from the operation prematurely because we were sent a signal. So we
246-
* try and lock the semaphore again.
247-
*
248-
* Each time around the loop, we check for a cancel/die interrupt. We
249-
* assume that if such an interrupt comes in while we are waiting, it will
250-
* cause the sem_wait() call to exit with errno == EINTR, so that we will
251-
* be able to service the interrupt (if not in a critical section
252-
* already).
253-
*
254-
* Once we acquire the lock, we do NOT check for an interrupt before
255-
* returning. The caller needs to be able to record ownership of the lock
256-
* before any interrupt can be accepted.
257-
*
258-
* There is a window of a few instructions between CHECK_FOR_INTERRUPTS
259-
* and entering the sem_wait() call. If a cancel/die interrupt occurs in
260-
* that window, we would fail to notice it until after we acquire the lock
261-
* (or get another interrupt to escape the sem_wait()). We can avoid this
262-
* problem by temporarily setting ImmediateInterruptOK to true before we
263-
* do CHECK_FOR_INTERRUPTS; then, a die() interrupt in this interval will
264-
* execute directly. However, there is a huge pitfall: there is another
265-
* window of a few instructions after the sem_wait() before we are able to
266-
* reset ImmediateInterruptOK. If an interrupt occurs then, we'll lose
267-
* control, which means that the lock has been acquired but our caller did
268-
* not get a chance to record the fact. Therefore, we only set
269-
* ImmediateInterruptOK if the caller tells us it's OK to do so, ie, the
270-
* caller does not need to record acquiring the lock. (This is currently
271-
* true for lockmanager locks, since the process that granted us the lock
272-
* did all the necessary state updates. It's not true for Posix semaphores
273-
* used to implement LW locks or emulate spinlocks --- but the wait time
274-
* for such locks should not be very long, anyway.)
244+
* See notes in sysv_sema.c's implementation of PGSemaphoreLock.
245+
* Just as that code does for semop(), we handle both the case where
246+
* sem_wait() returns errno == EINTR after a signal, and the case
247+
* where it just keeps waiting.
275248
*/
276249
do
277250
{

src/backend/port/sysv_sema.c

Lines changed: 14 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/port/sysv_sema.c,v 1.22 2008/01/01 19:45:51 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/port/sysv_sema.c,v 1.23 2008/01/26 19:55:08 tgl Exp $
1212
*
1313
*-------------------------------------------------------------------------
1414
*/
@@ -377,10 +377,11 @@ PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
377377
* from the operation prematurely because we were sent a signal. So we
378378
* try and lock the semaphore again.
379379
*
380-
* Each time around the loop, we check for a cancel/die interrupt. We
381-
* assume that if such an interrupt comes in while we are waiting, it will
382-
* cause the semop() call to exit with errno == EINTR, so that we will be
383-
* able to service the interrupt (if not in a critical section already).
380+
* Each time around the loop, we check for a cancel/die interrupt. On
381+
* some platforms, if such an interrupt comes in while we are waiting,
382+
* it will cause the semop() call to exit with errno == EINTR, allowing
383+
* us to service the interrupt (if not in a critical section already)
384+
* during the next loop iteration.
384385
*
385386
* Once we acquire the lock, we do NOT check for an interrupt before
386387
* returning. The caller needs to be able to record ownership of the lock
@@ -403,6 +404,14 @@ PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
403404
* did all the necessary state updates. It's not true for SysV semaphores
404405
* used to implement LW locks or emulate spinlocks --- but the wait time
405406
* for such locks should not be very long, anyway.)
407+
*
408+
* On some platforms, signals marked SA_RESTART (which is most, for us)
409+
* will not interrupt the semop(); it will just keep waiting. Therefore
410+
* it's necessary for cancel/die interrupts to be serviced directly by
411+
* the signal handler. On these platforms the behavior is really the same
412+
* whether the signal arrives just before the semop() begins, or while it
413+
* is waiting. The loop on EINTR is thus important only for other types
414+
* of interrupts.
406415
*/
407416
do
408417
{

src/backend/port/win32_sema.c

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66
* Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
77
*
88
* IDENTIFICATION
9-
* $PostgreSQL: pgsql/src/backend/port/win32_sema.c,v 1.6 2008/01/01 19:45:51 momjian Exp $
9+
* $PostgreSQL: pgsql/src/backend/port/win32_sema.c,v 1.7 2008/01/26 19:55:08 tgl Exp $
1010
*
1111
*-------------------------------------------------------------------------
1212
*/
@@ -124,6 +124,12 @@ PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
124124
wh[0] = *sema;
125125
wh[1] = pgwin32_signal_event;
126126

127+
/*
128+
* As in other implementations of PGSemaphoreLock, we need to check
129+
* for cancel/die interrupts each time through the loop. But here,
130+
* there is no hidden magic about whether the syscall will internally
131+
* service a signal --- we do that ourselves.
132+
*/
127133
do
128134
{
129135
ImmediateInterruptOK = interruptOK;

src/backend/storage/lmgr/proc.c

Lines changed: 3 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.198 2008/01/01 19:45:52 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.199 2008/01/26 19:55:08 tgl Exp $
1212
*
1313
*-------------------------------------------------------------------------
1414
*/
@@ -486,20 +486,18 @@ HaveNFreeProcs(int n)
486486
/*
487487
* Cancel any pending wait for lock, when aborting a transaction.
488488
*
489-
* Returns true if we had been waiting for a lock, else false.
490-
*
491489
* (Normally, this would only happen if we accept a cancel/die
492490
* interrupt while waiting; but an ereport(ERROR) while waiting is
493491
* within the realm of possibility, too.)
494492
*/
495-
bool
493+
void
496494
LockWaitCancel(void)
497495
{
498496
LWLockId partitionLock;
499497

500498
/* Nothing to do if we weren't waiting for a lock */
501499
if (lockAwaited == NULL)
502-
return false;
500+
return;
503501

504502
/* Turn off the deadlock timer, if it's still running (see ProcSleep) */
505503
disable_sig_alarm(false);
@@ -538,12 +536,6 @@ LockWaitCancel(void)
538536
* wakeup signal isn't harmful, and it seems not worth expending cycles to
539537
* get rid of a signal that most likely isn't there.
540538
*/
541-
542-
/*
543-
* Return true even if we were kicked off the lock before we were able to
544-
* remove ourselves.
545-
*/
546-
return true;
547539
}
548540

549541

src/backend/tcop/postgres.c

Lines changed: 8 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/tcop/postgres.c,v 1.541 2008/01/01 19:45:52 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/tcop/postgres.c,v 1.542 2008/01/26 19:55:08 tgl Exp $
1212
*
1313
* NOTES
1414
* this is the "main" module of the postgres backend and
@@ -2449,10 +2449,9 @@ die(SIGNAL_ARGS)
24492449
/* bump holdoff count to make ProcessInterrupts() a no-op */
24502450
/* until we are done getting ready for it */
24512451
InterruptHoldoffCount++;
2452+
LockWaitCancel(); /* prevent CheckDeadLock from running */
24522453
DisableNotifyInterrupt();
24532454
DisableCatchupInterrupt();
2454-
/* Make sure CheckDeadLock won't run while shutting down... */
2455-
LockWaitCancel();
24562455
InterruptHoldoffCount--;
24572456
ProcessInterrupts();
24582457
}
@@ -2498,20 +2497,16 @@ StatementCancelHandler(SIGNAL_ARGS)
24982497
* waiting for input, however.
24992498
*/
25002499
if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
2501-
CritSectionCount == 0)
2500+
CritSectionCount == 0 && !DoingCommandRead)
25022501
{
25032502
/* bump holdoff count to make ProcessInterrupts() a no-op */
25042503
/* until we are done getting ready for it */
25052504
InterruptHoldoffCount++;
2506-
if (LockWaitCancel())
2507-
{
2508-
DisableNotifyInterrupt();
2509-
DisableCatchupInterrupt();
2510-
InterruptHoldoffCount--;
2511-
ProcessInterrupts();
2512-
}
2513-
else
2514-
InterruptHoldoffCount--;
2505+
LockWaitCancel(); /* prevent CheckDeadLock from running */
2506+
DisableNotifyInterrupt();
2507+
DisableCatchupInterrupt();
2508+
InterruptHoldoffCount--;
2509+
ProcessInterrupts();
25152510
}
25162511
}
25172512

src/include/storage/proc.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77
* Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
10-
* $PostgreSQL: pgsql/src/include/storage/proc.h,v 1.103 2008/01/01 19:45:59 momjian Exp $
10+
* $PostgreSQL: pgsql/src/include/storage/proc.h,v 1.104 2008/01/26 19:55:08 tgl Exp $
1111
*
1212
*-------------------------------------------------------------------------
1313
*/
@@ -166,7 +166,7 @@ extern void ProcQueueInit(PROC_QUEUE *queue);
166166
extern int ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable);
167167
extern PGPROC *ProcWakeup(PGPROC *proc, int waitStatus);
168168
extern void ProcLockWakeup(LockMethod lockMethodTable, LOCK *lock);
169-
extern bool LockWaitCancel(void);
169+
extern void LockWaitCancel(void);
170170

171171
extern void ProcWaitForSignal(void);
172172
extern void ProcSendSignal(int pid);

0 commit comments

Comments
 (0)