Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 8c34876

Browse files
committed
Fix concurrent locking of tuple update chain
If several sessions are concurrently locking a tuple update chain with nonconflicting lock modes using an old snapshot, and they all succeed, it may happen that some of them fail because of restarting the loop (due to a concurrent Xmax change) and getting an error in the subsequent pass while trying to obtain a tuple lock that they already have in some tuple version. This can only happen with very high concurrency (where a row is being both updated and FK-checked by multiple transactions concurrently), but it's been observed in the field and can have unpleasant consequences such as an FK check failing to see a tuple that definitely exists: ERROR: insert or update on table "child_table" violates foreign key constraint "fk_constraint_name" DETAIL: Key (keyid)=(123456) is not present in table "parent_table". (where the key is observably present in the table). Discussion: https://postgr.es/m/20170714210011.r25mrff4nxjhmf3g@alvherre.pgsql
1 parent 3b7bbee commit 8c34876

File tree

1 file changed

+37
-4
lines changed

1 file changed

+37
-4
lines changed

src/backend/access/heap/heapam.c

+37-4
Original file line numberDiff line numberDiff line change
@@ -5550,8 +5550,10 @@ compute_new_xmax_infomask(TransactionId xmax, uint16 old_infomask,
55505550
* with the given xid, does the current transaction need to wait, fail, or can
55515551
* it continue if it wanted to acquire a lock of the given mode? "needwait"
55525552
* is set to true if waiting is necessary; if it can continue, then
5553-
* HeapTupleMayBeUpdated is returned. In case of a conflict, a different
5554-
* HeapTupleSatisfiesUpdate return code is returned.
5553+
* HeapTupleMayBeUpdated is returned. If the lock is already held by the
5554+
* current transaction, return HeapTupleSelfUpdated. In case of a conflict
5555+
* with another transaction, a different HeapTupleSatisfiesUpdate return code
5556+
* is returned.
55555557
*
55565558
* The held status is said to be hypothetical because it might correspond to a
55575559
* lock held by a single Xid, i.e. not a real MultiXactId; we express it this
@@ -5574,8 +5576,9 @@ test_lockmode_for_conflict(MultiXactStatus status, TransactionId xid,
55745576
if (TransactionIdIsCurrentTransactionId(xid))
55755577
{
55765578
/*
5577-
* Updated by our own transaction? Just return failure. This
5578-
* shouldn't normally happen.
5579+
* The tuple has already been locked by our own transaction. This is
5580+
* very rare but can happen if multiple transactions are trying to
5581+
* lock an ancient version of the same tuple.
55795582
*/
55805583
return HeapTupleSelfUpdated;
55815584
}
@@ -5774,6 +5777,22 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
57745777
members[i].xid,
57755778
mode, &needwait);
57765779

5780+
/*
5781+
* If the tuple was already locked by ourselves in a
5782+
* previous iteration of this (say heap_lock_tuple was
5783+
* forced to restart the locking loop because of a change
5784+
* in xmax), then we hold the lock already on this tuple
5785+
* version and we don't need to do anything; and this is
5786+
* not an error condition either. We just need to skip
5787+
* this tuple and continue locking the next version in the
5788+
* update chain.
5789+
*/
5790+
if (result == HeapTupleSelfUpdated)
5791+
{
5792+
pfree(members);
5793+
goto next;
5794+
}
5795+
57775796
if (needwait)
57785797
{
57795798
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
@@ -5834,6 +5853,19 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
58345853

58355854
result = test_lockmode_for_conflict(status, rawxmax, mode,
58365855
&needwait);
5856+
5857+
/*
5858+
* If the tuple was already locked by ourselves in a previous
5859+
* iteration of this (say heap_lock_tuple was forced to
5860+
* restart the locking loop because of a change in xmax), then
5861+
* we hold the lock already on this tuple version and we don't
5862+
* need to do anything; and this is not an error condition
5863+
* either. We just need to skip this tuple and continue
5864+
* locking the next version in the update chain.
5865+
*/
5866+
if (result == HeapTupleSelfUpdated)
5867+
goto next;
5868+
58375869
if (needwait)
58385870
{
58395871
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
@@ -5894,6 +5926,7 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
58945926

58955927
END_CRIT_SECTION();
58965928

5929+
next:
58975930
/* if we find the end of update chain, we're done. */
58985931
if (mytup.t_data->t_infomask & HEAP_XMAX_INVALID ||
58995932
ItemPointerEquals(&mytup.t_self, &mytup.t_data->t_ctid) ||

0 commit comments

Comments
 (0)