Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 1192ba8

Browse files
committed
Avoid potential deadlock in InitCatCachePhase2().
Opening a catcache's index could require reading from that cache's own catalog, which of course would acquire AccessShareLock on the catalog. So the original coding here risks locking index before heap, which could deadlock against another backend trying to get exclusive locks in the normal order. Because InitCatCachePhase2 is only called when a backend has to start up without a relcache init file, the deadlock was seldom seen in the field. (And by the same token, there's no need to worry about any performance disadvantage; so not much point in trying to distinguish exactly which catalogs have the risk.) Bug report, diagnosis, and patch by Nikhil Sontakke. Additional commentary by me. Back-patch to all supported branches.
1 parent 6e8e7cc commit 1192ba8

File tree

2 files changed

+19
-0
lines changed

2 files changed

+19
-0
lines changed

src/backend/utils/cache/catcache.c

+9
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@
2626
#ifdef CATCACHE_STATS
2727
#include "storage/ipc.h" /* for on_proc_exit */
2828
#endif
29+
#include "storage/lmgr.h"
2930
#include "utils/builtins.h"
3031
#include "utils/fmgroids.h"
3132
#include "utils/inval.h"
@@ -967,8 +968,16 @@ InitCatCachePhase2(CatCache *cache, bool touch_index)
967968
{
968969
Relation idesc;
969970

971+
/*
972+
* We must lock the underlying catalog before opening the index to
973+
* avoid deadlock, since index_open could possibly result in reading
974+
* this same catalog, and if anyone else is exclusive-locking this
975+
* catalog and index they'll be doing it in that order.
976+
*/
977+
LockRelationOid(cache->cc_reloid, AccessShareLock);
970978
idesc = index_open(cache->cc_indexoid, AccessShareLock);
971979
index_close(idesc, AccessShareLock);
980+
UnlockRelationOid(cache->cc_reloid, AccessShareLock);
972981
}
973982
}
974983

src/backend/utils/cache/relcache.c

+10
Original file line numberDiff line numberDiff line change
@@ -1651,6 +1651,12 @@ RelationClose(Relation relation)
16511651
* We assume that at the time we are called, we have at least AccessShareLock
16521652
* on the target index. (Note: in the calls from RelationClearRelation,
16531653
* this is legitimate because we know the rel has positive refcount.)
1654+
*
1655+
* If the target index is an index on pg_class or pg_index, we'd better have
1656+
* previously gotten at least AccessShareLock on its underlying catalog,
1657+
* else we are at risk of deadlock against someone trying to exclusive-lock
1658+
* the heap and index in that order. This is ensured in current usage by
1659+
* only applying this to indexes being opened or having positive refcount.
16541660
*/
16551661
static void
16561662
RelationReloadIndexInfo(Relation relation)
@@ -3611,6 +3617,10 @@ RelationGetIndexPredicate(Relation relation)
36113617
* Attribute numbers are offset by FirstLowInvalidHeapAttributeNumber so that
36123618
* we can include system attributes (e.g., OID) in the bitmap representation.
36133619
*
3620+
* Caller had better hold at least RowExclusiveLock on the target relation
3621+
* to ensure that it has a stable set of indexes. This also makes it safe
3622+
* (deadlock-free) for us to take locks on the relation's indexes.
3623+
*
36143624
* The returned result is palloc'd in the caller's memory context and should
36153625
* be bms_free'd when not needed anymore.
36163626
*/

0 commit comments

Comments
 (0)