Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 459c64d

Browse files
committed
Fix concurrent locking of tuple update chain
If several sessions are concurrently locking a tuple update chain with nonconflicting lock modes using an old snapshot, and they all succeed, it may happen that some of them fail because of restarting the loop (due to a concurrent Xmax change) and getting an error in the subsequent pass while trying to obtain a tuple lock that they already have in some tuple version. This can only happen with very high concurrency (where a row is being both updated and FK-checked by multiple transactions concurrently), but it's been observed in the field and can have unpleasant consequences such as an FK check failing to see a tuple that definitely exists: ERROR: insert or update on table "child_table" violates foreign key constraint "fk_constraint_name" DETAIL: Key (keyid)=(123456) is not present in table "parent_table". (where the key is observably present in the table). Discussion: https://postgr.es/m/20170714210011.r25mrff4nxjhmf3g@alvherre.pgsql
1 parent c28e4f4 commit 459c64d

File tree

1 file changed

+37
-4
lines changed

1 file changed

+37
-4
lines changed

src/backend/access/heap/heapam.c

+37-4
Original file line numberDiff line numberDiff line change
@@ -5524,8 +5524,10 @@ compute_new_xmax_infomask(TransactionId xmax, uint16 old_infomask,
55245524
* with the given xid, does the current transaction need to wait, fail, or can
55255525
* it continue if it wanted to acquire a lock of the given mode? "needwait"
55265526
* is set to true if waiting is necessary; if it can continue, then
5527-
* HeapTupleMayBeUpdated is returned. In case of a conflict, a different
5528-
* HeapTupleSatisfiesUpdate return code is returned.
5527+
* HeapTupleMayBeUpdated is returned. If the lock is already held by the
5528+
* current transaction, return HeapTupleSelfUpdated. In case of a conflict
5529+
* with another transaction, a different HeapTupleSatisfiesUpdate return code
5530+
* is returned.
55295531
*
55305532
* The held status is said to be hypothetical because it might correspond to a
55315533
* lock held by a single Xid, i.e. not a real MultiXactId; we express it this
@@ -5548,8 +5550,9 @@ test_lockmode_for_conflict(MultiXactStatus status, TransactionId xid,
55485550
if (TransactionIdIsCurrentTransactionId(xid))
55495551
{
55505552
/*
5551-
* Updated by our own transaction? Just return failure. This
5552-
* shouldn't normally happen.
5553+
* The tuple has already been locked by our own transaction. This is
5554+
* very rare but can happen if multiple transactions are trying to
5555+
* lock an ancient version of the same tuple.
55535556
*/
55545557
return HeapTupleSelfUpdated;
55555558
}
@@ -5748,6 +5751,22 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
57485751
members[i].xid,
57495752
mode, &needwait);
57505753

5754+
/*
5755+
* If the tuple was already locked by ourselves in a
5756+
* previous iteration of this (say heap_lock_tuple was
5757+
* forced to restart the locking loop because of a change
5758+
* in xmax), then we hold the lock already on this tuple
5759+
* version and we don't need to do anything; and this is
5760+
* not an error condition either. We just need to skip
5761+
* this tuple and continue locking the next version in the
5762+
* update chain.
5763+
*/
5764+
if (result == HeapTupleSelfUpdated)
5765+
{
5766+
pfree(members);
5767+
goto next;
5768+
}
5769+
57515770
if (needwait)
57525771
{
57535772
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
@@ -5808,6 +5827,19 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
58085827

58095828
result = test_lockmode_for_conflict(status, rawxmax, mode,
58105829
&needwait);
5830+
5831+
/*
5832+
* If the tuple was already locked by ourselves in a previous
5833+
* iteration of this (say heap_lock_tuple was forced to
5834+
* restart the locking loop because of a change in xmax), then
5835+
* we hold the lock already on this tuple version and we don't
5836+
* need to do anything; and this is not an error condition
5837+
* either. We just need to skip this tuple and continue
5838+
* locking the next version in the update chain.
5839+
*/
5840+
if (result == HeapTupleSelfUpdated)
5841+
goto next;
5842+
58115843
if (needwait)
58125844
{
58135845
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
@@ -5868,6 +5900,7 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
58685900

58695901
END_CRIT_SECTION();
58705902

5903+
next:
58715904
/* if we find the end of update chain, we're done. */
58725905
if (mytup.t_data->t_infomask & HEAP_XMAX_INVALID ||
58735906
ItemPointerEquals(&mytup.t_self, &mytup.t_data->t_ctid) ||

0 commit comments

Comments
 (0)