Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit ac86eda

Browse files
committed
Fix incorrect order of database-locking operations in InitPostgres().
We should set MyProc->databaseId after acquiring the per-database lock, not beforehand. The old way risked deadlock against processes trying to copy or delete the target database, since they would first acquire the lock and then wait for processes with matching databaseId to exit; that left a window wherein an incoming process could set its databaseId and then block on the lock, while the other process had the lock and waited in vain for the incoming process to exit. CountOtherDBBackends() would time out and fail after 5 seconds, so this just resulted in an unexpected failure not a permanent lockup, but it's still annoying when it happens. A real-world example of a use-case is that short-duration connections to a template database should not cause CREATE DATABASE to fail. Doing it in the other order should be fine since the contract has always been that processes searching the ProcArray for a database ID must hold the relevant per-database lock while searching. Thus, this actually removes the former race condition that required an assumption that storing to MyProc->databaseId is atomic. It's been like this for a long time, so back-patch to all active branches.
1 parent 2a9b019 commit ac86eda

File tree

1 file changed

+21
-7
lines changed

1 file changed

+21
-7
lines changed

src/backend/utils/init/postinit.c

Lines changed: 21 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -800,20 +800,20 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
800800
strcpy(out_dbname, dbname);
801801
}
802802

803-
/* Now we can mark our PGPROC entry with the database ID */
804-
/* (We assume this is an atomic store so no lock is needed) */
805-
MyProc->databaseId = MyDatabaseId;
806-
807803
/*
808804
* Now, take a writer's lock on the database we are trying to connect to.
809805
* If there is a concurrently running DROP DATABASE on that database, this
810806
* will block us until it finishes (and has committed its update of
811807
* pg_database).
812808
*
813809
* Note that the lock is not held long, only until the end of this startup
814-
* transaction. This is OK since we are already advertising our use of
815-
* the database in the PGPROC array; anyone trying a DROP DATABASE after
816-
* this point will see us there.
810+
* transaction. This is OK since we will advertise our use of the
811+
* database in the ProcArray before dropping the lock (in fact, that's the
812+
* next thing to do). Anyone trying a DROP DATABASE after this point will
813+
* see us in the array once they have the lock. Ordering is important for
814+
* this because we don't want to advertise ourselves as being in this
815+
* database until we have the lock; otherwise we create what amounts to a
816+
* deadlock with CountOtherDBBackends().
817817
*
818818
* Note: use of RowExclusiveLock here is reasonable because we envision
819819
* our session as being a concurrent writer of the database. If we had a
@@ -825,6 +825,20 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
825825
LockSharedObject(DatabaseRelationId, MyDatabaseId, 0,
826826
RowExclusiveLock);
827827

828+
/*
829+
* Now we can mark our PGPROC entry with the database ID.
830+
*
831+
* We assume this is an atomic store so no lock is needed; though actually
832+
* things would work fine even if it weren't atomic. Anyone searching the
833+
* ProcArray for this database's ID should hold the database lock, so they
834+
* would not be executing concurrently with this store. A process looking
835+
* for another database's ID could in theory see a chance match if it read
836+
* a partially-updated databaseId value; but as long as all such searches
837+
* wait and retry, as in CountOtherDBBackends(), they will certainly see
838+
* the correct value on their next try.
839+
*/
840+
MyProc->databaseId = MyDatabaseId;
841+
828842
/*
829843
* Recheck pg_database to make sure the target database hasn't gone away.
830844
* If there was a concurrent DROP DATABASE, this ensures we will die

0 commit comments

Comments
 (0)