Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit f5d9212

Browse files
committed
Avoid deadlock during orphan temp table removal.
If temp tables have dependencies (such as sequences) then it's possible for autovacuum's cleanup of orphan temp tables to deadlock against an incoming backend that's trying to clean out the temp namespace for its own use. That can happen because RemoveTempRelations' performDeletion call can visit objects within the namespace in an order different from the order in which a per-table deletion will visit them. To fix, observe that performDeletion will begin by taking an exclusive lock on the temp namespace (even though it won't actually delete it). So, if we can get a shared lock on the namespace, we can be sure we're not running concurrently with RemoveTempRelations, while also not conflicting with ordinary use of the namespace. This requires introducing a conditional version of LockDatabaseObject, but that's no big deal. (It's surprising we've got along without that this long.) Report and patch by Mikhail Zhilin. Back-patch to all supported branches. Discussion: https://postgr.es/m/c43ce028-2bc2-4865-9b89-3f706246eed5@postgrespro.ru
1 parent 8f8511d commit f5d9212

File tree

3 files changed

+60
-1
lines changed

3 files changed

+60
-1
lines changed

src/backend/postmaster/autovacuum.c

Lines changed: 20 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -75,6 +75,7 @@
7575
#include "catalog/dependency.h"
7676
#include "catalog/namespace.h"
7777
#include "catalog/pg_database.h"
78+
#include "catalog/pg_namespace.h"
7879
#include "commands/dbcommands.h"
7980
#include "commands/vacuum.h"
8081
#include "lib/ilist.h"
@@ -2270,6 +2271,24 @@ do_autovacuum(void)
22702271
continue;
22712272
}
22722273

2274+
/*
2275+
* Try to lock the temp namespace, too. Even though we have lock on
2276+
* the table itself, there's a risk of deadlock against an incoming
2277+
* backend trying to clean out the temp namespace, in case this table
2278+
* has dependencies (such as sequences) that the backend's
2279+
* performDeletion call might visit in a different order. If we can
2280+
* get AccessShareLock on the namespace, that's sufficient to ensure
2281+
* we're not running concurrently with RemoveTempRelations. If we
2282+
* can't, back off and let RemoveTempRelations do its thing.
2283+
*/
2284+
if (!ConditionalLockDatabaseObject(NamespaceRelationId,
2285+
classForm->relnamespace, 0,
2286+
AccessShareLock))
2287+
{
2288+
UnlockRelationOid(relid, AccessExclusiveLock);
2289+
continue;
2290+
}
2291+
22732292
/* OK, let's delete it */
22742293
ereport(LOG,
22752294
(errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"",
@@ -2287,7 +2306,7 @@ do_autovacuum(void)
22872306

22882307
/*
22892308
* To commit the deletion, end current transaction and start a new
2290-
* one. Note this also releases the lock we took.
2309+
* one. Note this also releases the locks we took.
22912310
*/
22922311
CommitTransactionCommand();
22932312
StartTransactionCommand();

src/backend/storage/lmgr/lmgr.c

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -990,6 +990,44 @@ LockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
990990
AcceptInvalidationMessages();
991991
}
992992

993+
/*
994+
* ConditionalLockDatabaseObject
995+
*
996+
* As above, but only lock if we can get the lock without blocking.
997+
* Returns true iff the lock was acquired.
998+
*/
999+
bool
1000+
ConditionalLockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
1001+
LOCKMODE lockmode)
1002+
{
1003+
LOCKTAG tag;
1004+
LOCALLOCK *locallock;
1005+
LockAcquireResult res;
1006+
1007+
SET_LOCKTAG_OBJECT(tag,
1008+
MyDatabaseId,
1009+
classid,
1010+
objid,
1011+
objsubid);
1012+
1013+
res = LockAcquireExtended(&tag, lockmode, false, true, true, &locallock);
1014+
1015+
if (res == LOCKACQUIRE_NOT_AVAIL)
1016+
return false;
1017+
1018+
/*
1019+
* Now that we have the lock, check for invalidation messages; see notes
1020+
* in LockRelationOid.
1021+
*/
1022+
if (res != LOCKACQUIRE_ALREADY_CLEAR)
1023+
{
1024+
AcceptInvalidationMessages();
1025+
MarkLockClear(locallock);
1026+
}
1027+
1028+
return true;
1029+
}
1030+
9931031
/*
9941032
* UnlockDatabaseObject
9951033
*/

src/include/storage/lmgr.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -92,6 +92,8 @@ extern void SpeculativeInsertionWait(TransactionId xid, uint32 token);
9292
/* Lock a general object (other than a relation) of the current database */
9393
extern void LockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
9494
LOCKMODE lockmode);
95+
extern bool ConditionalLockDatabaseObject(Oid classid, Oid objid,
96+
uint16 objsubid, LOCKMODE lockmode);
9597
extern void UnlockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
9698
LOCKMODE lockmode);
9799

0 commit comments

Comments
 (0)