
Commit 6b5d8e1

ReleaseRelationBuffers() failed to check for I/O in progress on a buffer
it wants to release. This leads to a race condition: does the backend that's trying to flush the buffer do so before the one that's deleting the relation does so? Usually no problem, I expect, but on occasion this could lead to hard-to-reproduce complaints from md.c, especially mdblindwrt.
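The wait-and-recheck pattern the fix introduces can be sketched in isolation. This is a hypothetical, single-threaded model: `BufDesc`, `wait_io`, and `release_relation_buffer` are simplified stand-ins, not PostgreSQL's actual structures; only the flag names echo the real code.

```c
#include <assert.h>
#include <stdbool.h>

#define BM_DIRTY          (1 << 0)
#define BM_IO_IN_PROGRESS (1 << 1)

typedef struct
{
    int relid;                  /* relation this buffer currently holds */
    int flags;
} BufDesc;

/*
 * Stand-in for WaitIO: pretend the flushing backend finished its write and,
 * in this scenario, the buffer was then recycled for another relation.
 */
static void
wait_io(BufDesc *buf, int recycled_owner)
{
    buf->flags &= ~BM_IO_IN_PROGRESS;
    buf->relid = recycled_owner;    /* buffer may change hands while we slept */
}

/*
 * Clear the dirty state of any buffer belonging to 'relid', but never while
 * a write is in flight -- and recheck ownership after every wait, because
 * the buffer may no longer belong to the relation being deleted.
 */
static void
release_relation_buffer(BufDesc *buf, int relid, int recycled_owner)
{
recheck:
    if (buf->relid == relid)
    {
        if (buf->flags & BM_IO_IN_PROGRESS)
        {
            wait_io(buf, recycled_owner);
            goto recheck;       /* ownership may have changed; check again */
        }
        buf->flags &= ~BM_DIRTY;
    }
}
```

Without the recheck, the releasing backend would clear the dirty bit on a buffer that a concurrent flusher was mid-write on, which is exactly the race the commit message describes.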
1 parent 610dfa6 commit 6b5d8e1

File tree

1 file changed

+56
-22
lines changed


src/backend/storage/buffer/bufmgr.c

Lines changed: 56 additions & 22 deletions
@@ -7,7 +7,7 @@
  *
  *
  * IDENTIFICATION
- *	  $Header: /cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v 1.66 1999/11/16 04:13:56 momjian Exp $
+ *	  $Header: /cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v 1.67 1999/11/22 01:19:42 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -1056,8 +1056,13 @@ BufferSync()
 
 
 /*
- * WaitIO -- Block until the IO_IN_PROGRESS flag on 'buf'
- *		is cleared.  Because IO_IN_PROGRESS conflicts are
+ * WaitIO -- Block until the IO_IN_PROGRESS flag on 'buf' is cleared.
+ *
+ * Should be entered with buffer manager spinlock held; releases it before
+ * waiting and re-acquires it afterwards.
+ *
+ * OLD NOTES:
+ *		Because IO_IN_PROGRESS conflicts are
  *		expected to be rare, there is only one BufferIO
  *		lock in the entire system.  All processes block
  *		on this semaphore when they try to use a buffer
@@ -1069,15 +1074,13 @@ BufferSync()
  *		is simple, but efficient enough if WaitIO is
  *		rarely called by multiple processes simultaneously.
  *
- * ProcSleep atomically releases the spinlock and goes to
- *		sleep.
- *
- * Note: there is an easy fix if the queue becomes long.
- *		save the id of the buffer we are waiting for in
- *		the queue structure.  That way signal can figure
- *		out which proc to wake up.
+ * NEW NOTES:
+ *		The above is true only on machines without test-and-set
+ *		semaphores (which we hope are few, these days).  On better
+ *		hardware, each buffer has a spinlock that we can wait on.
  */
 #ifdef HAS_TEST_AND_SET
+
 static void
 WaitIO(BufferDesc *buf, SPINLOCK spinlock)
 {
@@ -1087,7 +1090,8 @@ WaitIO(BufferDesc *buf, SPINLOCK spinlock)
 		SpinAcquire(spinlock);
 }
 
-#else							/* HAS_TEST_AND_SET */
+#else							/* !HAS_TEST_AND_SET */
+
 IpcSemaphoreId WaitIOSemId;
 IpcSemaphoreId WaitCLSemId;
 
@@ -1387,26 +1391,30 @@ RelationGetNumberOfBlocks(Relation relation)
  *
  *		this function unmarks all the dirty pages of a relation
  *		in the buffer pool so that at the end of transaction
- *		these pages will not be flushed.
+ *		these pages will not be flushed.  This is used when the
+ *		relation is about to be deleted.  We assume that the caller
+ *		holds an exclusive lock on the relation, which should assure
+ *		that no new buffers will be acquired for the rel meanwhile.
+ *
  *		XXX currently it sequentially searches the buffer pool, should be
  *		changed to more clever ways of searching.
  * --------------------------------------------------------------------
  */
 void
 ReleaseRelationBuffers(Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
+	bool		holding = false;
 	int			i;
-	int			holding = 0;
 	BufferDesc *buf;
 
 	if (rel->rd_myxactonly)
 	{
 		for (i = 0; i < NLocBuffer; i++)
 		{
 			buf = &LocalBufferDescriptors[i];
-			if ((buf->flags & BM_DIRTY) &&
-				(buf->tag.relId.relId == RelationGetRelid(rel)))
-				buf->flags &= ~BM_DIRTY;
+			if (buf->tag.relId.relId == relid)
+				buf->flags &= ~(BM_DIRTY | BM_JUST_DIRTIED);
 		}
 		return;
 	}
@@ -1417,21 +1425,47 @@ ReleaseRelationBuffers(Relation rel)
 		if (!holding)
 		{
 			SpinAcquire(BufMgrLock);
-			holding = 1;
+			holding = true;
 		}
-		if ((buf->flags & BM_DIRTY) &&
-			(buf->tag.relId.dbId == MyDatabaseId) &&
-			(buf->tag.relId.relId == RelationGetRelid(rel)))
+recheck:
+		if (buf->tag.relId.dbId == MyDatabaseId &&
+			buf->tag.relId.relId == relid)
 		{
-			buf->flags &= ~BM_DIRTY;
+			/*
+			 * If there is I/O in progress, better wait till it's done;
+			 * don't want to delete the relation out from under someone
+			 * who's just trying to flush the buffer!
+			 */
+			if (buf->flags & BM_IO_IN_PROGRESS)
+			{
+				WaitIO(buf, BufMgrLock);
+				/*
+				 * By now, the buffer very possibly belongs to some other
+				 * rel, so check again before proceeding.
+				 */
+				goto recheck;
+			}
+			/* Now we can do what we came for */
+			buf->flags &= ~(BM_DIRTY | BM_JUST_DIRTIED);
+			CommitInfoNeedsSave[i - 1] = 0;
+			/*
+			 * Release any refcount we may have.
+			 *
+			 * This is very probably dead code, and if it isn't then it's
+			 * probably wrong.  I added the Assert to find out --- tgl 11/99.
+			 */
 			if (!(buf->flags & BM_FREE))
 			{
+				/* Assert checks that buffer will actually get freed! */
+				Assert(PrivateRefCount[i - 1] == 1 &&
+					   buf->refcount == 1);
+				/* ReleaseBuffer expects we do not hold the lock at entry */
 				SpinRelease(BufMgrLock);
-				holding = 0;
+				holding = false;
 				ReleaseBuffer(i);
 			}
 		}
 	}
+
 	if (holding)
 		SpinRelease(BufMgrLock);
 }
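Per the new comment, WaitIO must be entered with the buffer-manager spinlock held; it releases that lock before blocking and re-acquires it before returning, so other backends (including the one finishing the I/O) can make progress in the meantime. A minimal sketch of that discipline, using POSIX threads in place of PostgreSQL's spinlocks; all names here (`buf_mgr_lock`, `wait_io`, `flusher`) are illustrative, not from the source:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t buf_mgr_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  io_done = PTHREAD_COND_INITIALIZER;
static bool io_in_progress = false;

/*
 * Caller holds buf_mgr_lock; we drop it while blocked (like
 * SpinRelease(spinlock) in WaitIO) and re-acquire it before returning
 * (like SpinAcquire(spinlock)), so the flusher can finish its write.
 */
static void
wait_io(void)
{
    pthread_mutex_unlock(&buf_mgr_lock);
    pthread_mutex_lock(&io_lock);
    while (io_in_progress)
        pthread_cond_wait(&io_done, &io_lock);
    pthread_mutex_unlock(&io_lock);
    pthread_mutex_lock(&buf_mgr_lock);
}

/* Stand-in for the backend flushing the buffer: completes the write. */
static void *
flusher(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&io_lock);
    io_in_progress = false;         /* "write completed" */
    pthread_cond_signal(&io_done);
    pthread_mutex_unlock(&io_lock);
    return NULL;
}
```

Because the flag is only read and written under `io_lock`, and the wait loop rechecks it, there is no lost-wakeup window regardless of whether the flusher runs before or after the waiter blocks.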
