Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 8709dcc

Browse files
author
Amit Kapila
committed
Fix the race condition in ReplicationSlotAcquire().
After commit f41d846, a process could acquire and use a replication slot that had just been invalidated, leading to failures while accessing WAL. To ensure that we don't accidentally start using invalid slots, we must perform the invalidation check after acquiring the slot or under the spinlock where we associate the slot with a particular process. We choose the earlier method to keep the code simple. Reported-by: Hou Zhijie <houzj.fnst@fujitsu.com> Author: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CABdArM7J-LbGoMPGUPiFiLOyB_TZ5+YaZb=HMES0mQqzVTn8Gg@mail.gmail.com
1 parent 845511a commit 8709dcc

File tree

1 file changed

+16
-16
lines changed

1 file changed

+16
-16
lines changed

src/backend/replication/slot.c

Lines changed: 16 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -580,19 +580,6 @@ ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
580580
name)));
581581
}
582582

583-
/* Invalid slots can't be modified or used before accessing the WAL. */
584-
if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
585-
{
586-
LWLockRelease(ReplicationSlotControlLock);
587-
588-
ereport(ERROR,
589-
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
590-
errmsg("can no longer access replication slot \"%s\"",
591-
NameStr(s->data.name)),
592-
errdetail("This replication slot has been invalidated due to \"%s\".",
593-
GetSlotInvalidationCauseName(s->data.invalidated)));
594-
}
595-
596583
/*
597584
* This is the slot we want; check if it's active under some other
598585
* process. In single user mode, we don't need this check.
@@ -650,12 +637,25 @@ ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
650637
else if (!nowait)
651638
ConditionVariableCancelSleep(); /* no sleep needed after all */
652639

653-
/* Let everybody know we've modified this slot */
654-
ConditionVariableBroadcast(&s->active_cv);
655-
656640
/* We made this slot active, so it's ours now. */
657641
MyReplicationSlot = s;
658642

643+
/*
644+
* We need to check for invalidation after making the slot ours to avoid
645+
* the possible race condition with the checkpointer that can otherwise
646+
* invalidate the slot immediately after the check.
647+
*/
648+
if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
649+
ereport(ERROR,
650+
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
651+
errmsg("can no longer access replication slot \"%s\"",
652+
NameStr(s->data.name)),
653+
errdetail("This replication slot has been invalidated due to \"%s\".",
654+
GetSlotInvalidationCauseName(s->data.invalidated)));
655+
656+
/* Let everybody know we've modified this slot */
657+
ConditionVariableBroadcast(&s->active_cv);
658+
659659
/*
660660
* The call to pgstat_acquire_replslot() protects against stats for a
661661
* different slot, from before a restart or such, being present during

0 commit comments

Comments
 (0)