Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 9ab3b2b

Browse files
committed
Fix possibility of self-deadlock in ResolveRecoveryConflictWithBufferPin().
The tests added in 9f8a050 failed nearly reliably on FreeBSD in CI, and occasionally on the buildfarm. That turns out to be caused not by a bug in the test, but by a longstanding bug in recovery conflict handling. The standby timeout handler, used by ResolveRecoveryConflictWithBufferPin(), executed SendRecoveryConflictWithBufferPin() inside a signal handler. A bad idea, because the deadlock timeout handler (or a spurious latch set) could have interrupted ProcWaitForSignal(). If unlucky that could cause a self-deadlock on ProcArrayLock, if the deadlock check is in SendRecoveryConflictWithBufferPin()->CancelDBBackends(). To fix, set a flag in StandbyTimeoutHandler(), and check the flag in ResolveRecoveryConflictWithBufferPin(). Subsequently the recovery conflict tests will be backpatched. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-
1 parent 5ab8e80 commit 9ab3b2b

File tree

1 file changed

+10
-10
lines changed

1 file changed

+10
-10
lines changed

src/backend/storage/ipc/standby.c

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,7 @@ static HTAB *RecoveryLockLists;
4545

4646
/* Flags set by timeout handlers */
4747
static volatile sig_atomic_t got_standby_deadlock_timeout = false;
48+
static volatile sig_atomic_t got_standby_delay_timeout = false;
4849
static volatile sig_atomic_t got_standby_lock_timeout = false;
4950

5051
static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
@@ -792,7 +793,8 @@ ResolveRecoveryConflictWithBufferPin(void)
792793
}
793794

794795
/*
795-
* Wait to be signaled by UnpinBuffer().
796+
* Wait to be signaled by UnpinBuffer() or for the wait to be interrupted
797+
* by one of the timeouts established above.
796798
*
797799
* We assume that only UnpinBuffer() and the timeout requests established
798800
* above can wake us up here. WakeupRecovery() called by walreceiver or
@@ -801,7 +803,9 @@ ResolveRecoveryConflictWithBufferPin(void)
801803
*/
802804
ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
803805

804-
if (got_standby_deadlock_timeout)
806+
if (got_standby_delay_timeout)
807+
SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
808+
else if (got_standby_deadlock_timeout)
805809
{
806810
/*
807811
* Send out a request for hot-standby backends to check themselves for
@@ -827,6 +831,7 @@ ResolveRecoveryConflictWithBufferPin(void)
827831
* individually, but that'd be slower.
828832
*/
829833
disable_all_timeouts(false);
834+
got_standby_delay_timeout = false;
830835
got_standby_deadlock_timeout = false;
831836
}
832837

@@ -886,8 +891,8 @@ CheckRecoveryConflictDeadlock(void)
886891
*/
887892

888893
/*
889-
* StandbyDeadLockHandler() will be called if STANDBY_DEADLOCK_TIMEOUT
890-
* occurs before STANDBY_TIMEOUT.
894+
* StandbyDeadLockHandler() will be called if STANDBY_DEADLOCK_TIMEOUT is
895+
* exceeded.
891896
*/
892897
void
893898
StandbyDeadLockHandler(void)
@@ -897,16 +902,11 @@ StandbyDeadLockHandler(void)
897902

898903
/*
899904
* StandbyTimeoutHandler() will be called if STANDBY_TIMEOUT is exceeded.
900-
* Send out a request to release conflicting buffer pins unconditionally,
901-
* so we can press ahead with applying changes in recovery.
902905
*/
903906
void
904907
StandbyTimeoutHandler(void)
905908
{
906-
/* forget any pending STANDBY_DEADLOCK_TIMEOUT request */
907-
disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false);
908-
909-
SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
909+
got_standby_delay_timeout = true;
910910
}
911911

912912
/*

0 commit comments

Comments
 (0)