Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit ac36e6f

Browse files
committed
Move CheckRecoveryConflictDeadlock() call to a safer place.
This kluge was inserted in a spot apparently chosen at random: the lock manager's state is not yet fully set up for the wait, and in particular LockWaitCancel hasn't been armed by setting lockAwaited, so the ProcLock will not get cleaned up if the ereport is thrown. This seems to not cause any observable problem in trivial test cases, because LockReleaseAll will silently clean up the debris; but I was able to cause failures with tests involving subtransactions. Fixes breakage induced by commit c85c941. Back-patch to all affected branches.
1 parent 2e53bd5 commit ac36e6f

File tree

4 files changed

+22
-19
lines changed

4 files changed

+22
-19
lines changed

src/backend/storage/ipc/standby.c

Lines changed: 12 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -459,24 +459,25 @@ SendRecoveryConflictWithBufferPin(ProcSignalReason reason)
459459

460460
/*
461461
* In Hot Standby perform early deadlock detection. We abort the lock
462-
* wait if are about to sleep while holding the buffer pin that Startup
463-
* process is waiting for. The deadlock occurs because we can only be
464-
* waiting behind an AccessExclusiveLock, which can only clear when a
465-
* transaction completion record is replayed, which can only occur when
466-
* Startup process is not waiting. So if Startup process is waiting we
467-
* never will clear that lock, so if we wait we cause deadlock. If we
468-
* are the Startup process then no need to check for deadlocks.
462+
* wait if we are about to sleep while holding the buffer pin that Startup
463+
* process is waiting for.
464+
*
465+
* Note: this code is pessimistic, because there is no way for it to
466+
* determine whether an actual deadlock condition is present: the lock we
467+
* need to wait for might be unrelated to any held by the Startup process.
468+
* Sooner or later, this mechanism should get ripped out in favor of somehow
469+
* accounting for buffer locks in DeadLockCheck(). However, errors here
470+
* seem to be very low-probability in practice, so for now it's not worth
471+
* the trouble.
469472
*/
470473
void
471-
CheckRecoveryConflictDeadlock(LWLockId partitionLock)
474+
CheckRecoveryConflictDeadlock(void)
472475
{
473-
Assert(!InRecovery);
476+
Assert(!InRecovery); /* do not call in Startup process */
474477

475478
if (!HoldingBufferPinThatDelaysRecovery())
476479
return;
477480

478-
LWLockRelease(partitionLock);
479-
480481
/*
481482
* Error message should match ProcessInterrupts() but we avoid calling
482483
* that because we aren't handling an interrupt at this point. Note that

src/backend/storage/lmgr/lock.c

Lines changed: 0 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -844,13 +844,6 @@ LockAcquireExtended(const LOCKTAG *locktag,
844844
return LOCKACQUIRE_NOT_AVAIL;
845845
}
846846

847-
/*
848-
* In Hot Standby perform early deadlock detection in normal backends.
849-
* If deadlock found we release partition lock but do not return.
850-
*/
851-
if (RecoveryInProgress() && !InRecovery)
852-
CheckRecoveryConflictDeadlock(partitionLock);
853-
854847
/*
855848
* Set bitmask of locks this process already holds on this object.
856849
*/

src/backend/storage/lmgr/proc.c

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -934,6 +934,15 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
934934
*/
935935
LWLockRelease(partitionLock);
936936

937+
/*
938+
* Also, now that we will successfully clean up after an ereport, it's
939+
* safe to check to see if there's a buffer pin deadlock against the
940+
* Startup process. Of course, that's only necessary if we're doing
941+
* Hot Standby and are not the Startup process ourselves.
942+
*/
943+
if (RecoveryInProgress() && !InRecovery)
944+
CheckRecoveryConflictDeadlock();
945+
937946
/* Reset deadlock_state before enabling the signal handler */
938947
deadlock_state = DS_NOT_YET_CHECKED;
939948

src/include/storage/standby.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@ extern void ResolveRecoveryConflictWithDatabase(Oid dbid);
3535

3636
extern void ResolveRecoveryConflictWithBufferPin(void);
3737
extern void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
38-
extern void CheckRecoveryConflictDeadlock(LWLockId partitionLock);
38+
extern void CheckRecoveryConflictDeadlock(void);
3939

4040
/*
4141
* Standby Rmgr (RM_STANDBY_ID)

0 commit comments

Comments
 (0)