
Commit 3b4d85c
Cope with catcache entries becoming stale during detoasting.
We've long had a policy that any toasted fields in a catalog tuple should be pulled in-line before entering the tuple in a catalog cache. However, that requires access to the catalog's toast table, and we'll typically do AcceptInvalidationMessages while opening the toast table. So it's possible that the catalog tuple is outdated by the time we finish detoasting it. Since no cache entry exists yet, we can't mark the entry stale during AcceptInvalidationMessages, and instead we'll press forward and build an apparently-valid cache entry. The upshot is that we have a race condition whereby an out-of-date entry could be made in a backend's catalog cache, and persist there indefinitely, causing indeterminate misbehavior.

To fix, use the existing systable_recheck_tuple code to recheck whether the catalog tuple is still up-to-date after we finish detoasting it. If not, loop around and restart the process of searching the catalog and constructing cache entries from the top. The case is rare enough that this shouldn't create any meaningful performance penalty, even in the SearchCatCacheList case where we need to tear down and reconstruct the whole list.

Indeed, the case is so rare that AFAICT it doesn't occur during our regression tests, and there doesn't seem to be any easy way to build a test that would exercise it reliably. To allow testing of the retry code paths, add logic (in USE_ASSERT_CHECKING builds only) that randomly pretends that the recheck failed about one time out of a thousand. This is enough to ensure that we'll pass through the retry paths during most regression test runs.

By adding an extra level of looping, this commit creates a need to reindent most of SearchCatCacheMiss and SearchCatCacheList. I'll do that separately, to allow putting those changes in .git-blame-ignore-revs.

Patch by me; thanks to Alexander Lakhin for having built a test case to prove the bug is real, and to Xiaoran Wang for review. Back-patch to all supported branches.

Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
1 parent 35cc9f1 commit 3b4d85c
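
In outline, the fix has the shape of the minimal standalone C sketch below. This is a hedged illustration, not PostgreSQL code: build_entry() is a hypothetical stand-in for CatalogCacheCreateEntry plus the systable_recheck_tuple visibility recheck, and the rand() test mimics the commit's USE_ASSERT_CHECKING failure injection of about one time in a thousand.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for CatalogCacheCreateEntry: build an entry from
 * freshly scanned data, but report failure if the source went stale while
 * we were working (simulated here at random, mirroring the commit's
 * debug-build hook that fails about 0.1% of the time).
 */
static bool
build_entry(int key, int *entry_out)
{
    if (rand() <= RAND_MAX / 1000)
        return false;           /* pretend the staleness recheck failed */
    *entry_out = key * 2;       /* stand-in for the real entry contents */
    return true;
}

int
main(void)
{
    int     entry;
    bool    stale;

    /*
     * The retry loop the commit wraps around the catalog scans: upon a
     * staleness failure, discard partial work and redo the whole lookup
     * from the top.
     */
    do
    {
        stale = false;
        if (!build_entry(42, &entry))
            stale = true;       /* start the scan over */
    } while (stale);

    printf("entry = %d\n", entry);
    return 0;
}

The real patch applies this shape twice, in SearchCatCacheMiss and SearchCatCacheList; the list case has the extra wrinkle that refcounts on already-collected list members must be released before retrying, as the diff below shows.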

File tree

1 file changed: +123 -40 lines

src/backend/utils/cache/catcache.c

Lines changed: 123 additions & 40 deletions
@@ -89,10 +89,10 @@ static void CatCachePrintStats(int code, Datum arg);
 static void CatCacheRemoveCTup(CatCache *cache, CatCTup *ct);
 static void CatCacheRemoveCList(CatCache *cache, CatCList *cl);
 static void CatalogCacheInitializeCache(CatCache *cache);
-static CatCTup *CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp,
+static CatCTup *CatalogCacheCreateEntry(CatCache *cache,
+                                        HeapTuple ntp, SysScanDesc scandesc,
                                         Datum *arguments,
-                                        uint32 hashValue, Index hashIndex,
-                                        bool negative);
+                                        uint32 hashValue, Index hashIndex);
 
 static void CatCacheFreeKeys(TupleDesc tupdesc, int nkeys, int *attnos,
                              Datum *keys);
@@ -1321,6 +1321,7 @@ SearchCatCacheMiss(CatCache *cache,
     SysScanDesc scandesc;
     HeapTuple   ntp;
     CatCTup    *ct;
+    bool        stale;
     Datum       arguments[CATCACHE_MAXKEYS];
 
     /* Initialize local parameter array */
@@ -1329,16 +1330,6 @@ SearchCatCacheMiss(CatCache *cache,
     arguments[2] = v3;
     arguments[3] = v4;
 
-    /*
-     * Ok, need to make a lookup in the relation, copy the scankey and fill
-     * out any per-call fields.
-     */
-    memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
-    cur_skey[0].sk_argument = v1;
-    cur_skey[1].sk_argument = v2;
-    cur_skey[2].sk_argument = v3;
-    cur_skey[3].sk_argument = v4;
-
     /*
      * Tuple was not found in cache, so we have to try to retrieve it directly
      * from the relation.  If found, we will add it to the cache; if not
@@ -1353,9 +1344,28 @@ SearchCatCacheMiss(CatCache *cache,
      * will eventually age out of the cache, so there's no functional problem.
      * This case is rare enough that it's not worth expending extra cycles to
      * detect.
+     *
+     * Another case, which we *must* handle, is that the tuple could become
+     * outdated during CatalogCacheCreateEntry's attempt to detoast it (since
+     * AcceptInvalidationMessages can run during TOAST table access).  We do
+     * not want to return already-stale catcache entries, so we loop around
+     * and do the table scan again if that happens.
      */
     relation = table_open(cache->cc_reloid, AccessShareLock);
 
+    do
+    {
+        /*
+         * Ok, need to make a lookup in the relation, copy the scankey and
+         * fill out any per-call fields.  (We must re-do this when retrying,
+         * because systable_beginscan scribbles on the scankey.)
+         */
+        memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
+        cur_skey[0].sk_argument = v1;
+        cur_skey[1].sk_argument = v2;
+        cur_skey[2].sk_argument = v3;
+        cur_skey[3].sk_argument = v4;
+
     scandesc = systable_beginscan(relation,
                                   cache->cc_indexoid,
                                   IndexScanOK(cache, cur_skey),
@@ -1364,12 +1374,18 @@ SearchCatCacheMiss(CatCache *cache,
                                   cur_skey);
 
     ct = NULL;
+    stale = false;
 
     while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
     {
-        ct = CatalogCacheCreateEntry(cache, ntp, arguments,
-                                     hashValue, hashIndex,
-                                     false);
+        ct = CatalogCacheCreateEntry(cache, ntp, scandesc, NULL,
+                                     hashValue, hashIndex);
+        /* upon failure, we must start the scan over */
+        if (ct == NULL)
+        {
+            stale = true;
+            break;
+        }
         /* immediately set the refcount to 1 */
         ResourceOwnerEnlargeCatCacheRefs(CurrentResourceOwner);
         ct->refcount++;
@@ -1378,6 +1394,7 @@ SearchCatCacheMiss(CatCache *cache,
     }
 
     systable_endscan(scandesc);
+    } while (stale);
 
     table_close(relation, AccessShareLock);
 
@@ -1396,9 +1413,11 @@ SearchCatCacheMiss(CatCache *cache,
     if (IsBootstrapProcessingMode())
         return NULL;
 
-    ct = CatalogCacheCreateEntry(cache, NULL, arguments,
-                                 hashValue, hashIndex,
-                                 true);
+    ct = CatalogCacheCreateEntry(cache, NULL, NULL, arguments,
+                                 hashValue, hashIndex);
+
+    /* Creating a negative cache entry shouldn't fail */
+    Assert(ct != NULL);
 
     CACHE_elog(DEBUG2, "SearchCatCache(%s): Contains %d/%d tuples",
                cache->cc_relname, cache->cc_ntup, CacheHdr->ch_ntup);
@@ -1605,7 +1624,8 @@ SearchCatCacheList(CatCache *cache,
      * We have to bump the member refcounts temporarily to ensure they won't
      * get dropped from the cache while loading other members. We use a PG_TRY
      * block to ensure we can undo those refcounts if we get an error before
-     * we finish constructing the CatCList.
+     * we finish constructing the CatCList.  ctlist must be valid throughout
+     * the PG_TRY block.
      */
     ResourceOwnerEnlargeCatCacheListRefs(CurrentResourceOwner);
 
@@ -1616,19 +1636,23 @@ SearchCatCacheList(CatCache *cache,
         ScanKeyData cur_skey[CATCACHE_MAXKEYS];
         Relation    relation;
         SysScanDesc scandesc;
-
-        /*
-         * Ok, need to make a lookup in the relation, copy the scankey and
-         * fill out any per-call fields.
-         */
-        memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
-        cur_skey[0].sk_argument = v1;
-        cur_skey[1].sk_argument = v2;
-        cur_skey[2].sk_argument = v3;
-        cur_skey[3].sk_argument = v4;
+        bool        stale;
 
         relation = table_open(cache->cc_reloid, AccessShareLock);
 
+        do
+        {
+            /*
+             * Ok, need to make a lookup in the relation, copy the scankey and
+             * fill out any per-call fields.  (We must re-do this when
+             * retrying, because systable_beginscan scribbles on the scankey.)
+             */
+            memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
+            cur_skey[0].sk_argument = v1;
+            cur_skey[1].sk_argument = v2;
+            cur_skey[2].sk_argument = v3;
+            cur_skey[3].sk_argument = v4;
+
         scandesc = systable_beginscan(relation,
                                       cache->cc_indexoid,
                                       IndexScanOK(cache, cur_skey),
@@ -1639,6 +1663,8 @@ SearchCatCacheList(CatCache *cache,
         /* The list will be ordered iff we are doing an index scan */
         ordered = (scandesc->irel != NULL);
 
+        stale = false;
+
         while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
         {
             uint32      hashValue;
@@ -1681,9 +1707,32 @@ SearchCatCacheList(CatCache *cache,
             if (!found)
             {
                 /* We didn't find a usable entry, so make a new one */
-                ct = CatalogCacheCreateEntry(cache, ntp, arguments,
-                                             hashValue, hashIndex,
-                                             false);
+                ct = CatalogCacheCreateEntry(cache, ntp, scandesc, NULL,
+                                             hashValue, hashIndex);
+                /* upon failure, we must start the scan over */
+                if (ct == NULL)
+                {
+                    /*
+                     * Release refcounts on any items we already had.  We dare
+                     * not try to free them if they're now unreferenced, since
+                     * an error while doing that would result in the PG_CATCH
+                     * below doing extra refcount decrements.  Besides, we'll
+                     * likely re-adopt those items in the next iteration, so
+                     * it's not worth complicating matters to try to get rid
+                     * of them.
+                     */
+                    foreach(ctlist_item, ctlist)
+                    {
+                        ct = (CatCTup *) lfirst(ctlist_item);
+                        Assert(ct->c_list == NULL);
+                        Assert(ct->refcount > 0);
+                        ct->refcount--;
+                    }
+                    /* Reset ctlist in preparation for new try */
+                    ctlist = NIL;
+                    stale = true;
+                    break;
+                }
             }
 
             /* Careful here: add entry to ctlist, then bump its refcount */
@@ -1693,6 +1742,7 @@ SearchCatCacheList(CatCache *cache,
         }
 
         systable_endscan(scandesc);
+        } while (stale);
 
         table_close(relation, AccessShareLock);
 
@@ -1800,22 +1850,42 @@ ReleaseCatCacheList(CatCList *list)
  * CatalogCacheCreateEntry
  *      Create a new CatCTup entry, copying the given HeapTuple and other
  *      supplied data into it.  The new entry initially has refcount 0.
+ *
+ * To create a normal cache entry, ntp must be the HeapTuple just fetched
+ * from scandesc, and "arguments" is not used.  To create a negative cache
+ * entry, pass NULL for ntp and scandesc; then "arguments" is the cache
+ * keys to use.  In either case, hashValue/hashIndex are the hash values
+ * computed from the cache keys.
+ *
+ * Returns NULL if we attempt to detoast the tuple and observe that it
+ * became stale.  (This cannot happen for a negative entry.)  Caller must
+ * retry the tuple lookup in that case.
  */
 static CatCTup *
-CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
-                        uint32 hashValue, Index hashIndex,
-                        bool negative)
+CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, SysScanDesc scandesc,
+                        Datum *arguments,
+                        uint32 hashValue, Index hashIndex)
 {
     CatCTup    *ct;
     HeapTuple   dtp;
     MemoryContext oldcxt;
 
-    /* negative entries have no tuple associated */
     if (ntp)
     {
         int         i;
 
-        Assert(!negative);
+        /*
+         * The visibility recheck below essentially never fails during our
+         * regression tests, and there's no easy way to force it to fail for
+         * testing purposes.  To ensure we have test coverage for the retry
+         * paths in our callers, make debug builds randomly fail about 0.1% of
+         * the times through this code path, even when there's no toasted
+         * fields.
+         */
+#ifdef USE_ASSERT_CHECKING
+        if (random() <= (MAX_RANDOM_VALUE / 1000))
+            return NULL;
+#endif
 
         /*
          * If there are any out-of-line toasted fields in the tuple, expand
@@ -1825,7 +1895,20 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
         * something using a slightly stale catcache entry.
         */
        if (HeapTupleHasExternal(ntp))
+       {
            dtp = toast_flatten_tuple(ntp, cache->cc_tupdesc);
+
+           /*
+            * The tuple could become stale while we are doing toast table
+            * access (since AcceptInvalidationMessages can run then), so we
+            * must recheck its visibility afterwards.
+            */
+           if (!systable_recheck_tuple(scandesc, ntp))
+           {
+               heap_freetuple(dtp);
+               return NULL;
+           }
+       }
        else
            dtp = ntp;
 
@@ -1864,7 +1947,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     }
     else
     {
-        Assert(negative);
+        /* Set up keys for a negative cache entry */
         oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
         ct = (CatCTup *) palloc(sizeof(CatCTup));
 
@@ -1886,7 +1969,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     ct->c_list = NULL;
     ct->refcount = 0;           /* for the moment */
     ct->dead = false;
-    ct->negative = negative;
+    ct->negative = (ntp == NULL);
     ct->hash_value = hashValue;
 
     dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
