Commit ded8919

Advance the stop point for multixact offset creation only at checkpoint.
Commit b69bf30 advanced the stop point at vacuum time, but this has subsequently been shown to be unsafe as a result of analysis by myself and Thomas Munro and testing by Thomas Munro. The crux of the problem is that the SLRU deletion logic may get confused about what to remove if, at exactly the right time during the checkpoint process, the head of the SLRU crosses what used to be the tail.

This patch, by me, fixes the problem by advancing the stop point only following a checkpoint. This has the additional advantage of making the removal logic work during recovery more like the way it works during normal running, which is probably good.

At least one of the calls to DetermineSafeOldestOffset which this patch removes was already dead, because MultiXactAdvanceOldest is called only during recovery and DetermineSafeOldestOffset was set up to do nothing during recovery. That, however, is inconsistent with the principle that recovery and normal running should work similarly, and was confusing to boot.

Along the way, fix some comments that previous patches in this area neglected to update. It's not clear to me whether there's any concrete basis for the decision to use only half of the multixact ID space, but it's neither necessary nor sufficient to prevent multixact member wraparound, so the comments should not say otherwise.
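
The ordering hazard described in the commit message can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToySlru`, `toy_write`, `toy_truncate`, `toy_stop_for`, `toy_scenario`, and the 8-segment circular space are all hypothetical and are not PostgreSQL code. The real SLRU logic is considerably more involved. The model shows just one thing: if the stop point moves forward before the checkpoint deletes the old segments, writers may refill those segments and the deletion then destroys new data.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NSEGS 8				/* hypothetical tiny circular segment space */

typedef struct ToySlru
{
	int			owner[TOY_NSEGS];	/* 0 = empty, else generation that wrote it */
	int			head;			/* next segment a writer will fill */
	int			stop;			/* writers must never fill this segment */
} ToySlru;

/* Stop point sits one segment behind the oldest segment still needed. */
static int
toy_stop_for(int oldest_seg)
{
	return (oldest_seg + TOY_NSEGS - 1) % TOY_NSEGS;
}

/* Fill up to 'want' segments with generation 'gen', halting at the stop point. */
static int
toy_write(ToySlru *s, int gen, int want)
{
	int			written = 0;

	assert(want >= 0);
	while (written < want && s->head != s->stop)
	{
		s->owner[s->head] = gen;
		s->head = (s->head + 1) % TOY_NSEGS;
		written++;
	}
	return written;
}

/* Delete segments in [from, to); count how many held generation 'new_gen'. */
static int
toy_truncate(ToySlru *s, int from, int to, int new_gen)
{
	int			lost = 0;

	for (int i = from; i != to; i = (i + 1) % TOY_NSEGS)
	{
		if (s->owner[i] == new_gen)
			lost++;
		s->owner[i] = 0;
	}
	return lost;
}

/* Returns the number of freshly written segments destroyed by truncation. */
int
toy_scenario(bool advance_stop_before_truncate)
{
	ToySlru		s = {{0}, 0, 0};
	int			old_tail = 0;
	int			new_tail = 4;

	/* Generation 1 occupies segments 0..5; the head is at segment 6. */
	for (int i = 0; i < 6; i++)
		s.owner[i] = 1;
	s.head = 6;
	s.stop = toy_stop_for(old_tail);

	/* "Vacuum time" advance: the stop point moves before anything is deleted. */
	if (advance_stop_before_truncate)
		s.stop = toy_stop_for(new_tail);

	/* Writers of generation 2 try to add four segments. */
	(void) toy_write(&s, 2, 4);

	/* "Checkpoint": delete what used to be [old_tail, new_tail). */
	int			lost = toy_truncate(&s, old_tail, new_tail, 2);

	/* Only now is it safe to advance the stop point. */
	s.stop = toy_stop_for(new_tail);
	return lost;
}
```

In this model, `toy_scenario(true)` loses two newly written segments because the head wraps past the old tail before deletion, while `toy_scenario(false)` loses none because writers halt at the old stop point until truncation has finished.
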
1 parent 7b3f0f8 commit ded8919

1 file changed: +17 -26 lines changed


src/backend/access/transam/multixact.c

@@ -2062,8 +2062,6 @@ TrimMultiXact(void)
 	}
 
 	LWLockRelease(MultiXactMemberControlLock);
-
-	DetermineSafeOldestOffset(MultiXactState->oldestMultiXactId);
 }
 
 /*
@@ -2167,13 +2165,11 @@ SetMultiXactIdLimit(MultiXactId oldest_datminmxid, Oid oldest_datoid)
 	Assert(MultiXactIdIsValid(oldest_datminmxid));
 
 	/*
-	 * Since multixacts wrap differently from transaction IDs, this logic is
-	 * not entirely correct: in some scenarios we could go for longer than 2
-	 * billion multixacts without seeing any data loss, and in some others we
-	 * could get in trouble before that if the new pg_multixact/members data
-	 * stomps on the previous cycle's data. For lack of a better mechanism we
-	 * use the same logic as for transaction IDs, that is, start taking action
-	 * halfway around the oldest potentially-existing multixact.
+	 * We pretend that a wrap will happen halfway through the multixact ID
+	 * space, but that's not really true, because multixacts wrap differently
+	 * from transaction IDs. Note that, separately from any concern about
+	 * multixact IDs wrapping, we must ensure that multixact members do not
+	 * wrap. Limits for that are set in DetermineSafeOldestOffset, not here.
 	 */
 	multiWrapLimit = oldest_datminmxid + (MaxMultiXactId >> 1);
 	if (multiWrapLimit < FirstMultiXactId)
@@ -2228,8 +2224,6 @@ SetMultiXactIdLimit(MultiXactId oldest_datminmxid, Oid oldest_datoid)
 	curMulti = MultiXactState->nextMXact;
 	LWLockRelease(MultiXactGenLock);
 
-	DetermineSafeOldestOffset(oldest_datminmxid);
-
 	/* Log the info */
 	ereport(DEBUG1,
 			(errmsg("MultiXactId wrap limit is %u, limited by database with OID %u",
@@ -2324,8 +2318,6 @@ MultiXactAdvanceOldest(MultiXactId oldestMulti, Oid oldestMultiDB)
 {
 	if (MultiXactIdPrecedes(MultiXactState->oldestMultiXactId, oldestMulti))
 		SetMultiXactIdLimit(oldestMulti, oldestMultiDB);
-	else
-		DetermineSafeOldestOffset(oldestMulti);
 }
 
 /*
@@ -2503,19 +2495,11 @@ DetermineSafeOldestOffset(MultiXactId oldestMXact)
 	MultiXactOffset oldestOffset;
 
 	/*
-	 * Can't do this while initdb'ing or in the startup process while
-	 * replaying WAL: the segment file to read might have not yet been
-	 * created, or already been removed.
-	 */
-	if (IsBootstrapProcessingMode() || InRecovery)
-		return;
-
-	/*
-	 * Determine the offset of the oldest multixact. Normally, we can read
-	 * the offset from the multixact itself, but there's an important special
-	 * case: if there are no multixacts in existence at all, oldestMXact
-	 * obviously can't point to one. It will instead point to the multixact
-	 * ID that will be assigned the next time one is needed.
+	 * We determine the safe upper bound for offsets of new xacts by reading
+	 * the offset of the oldest multixact, and going back one segment. This
+	 * way, the sequence of multixact member segments will always have a
+	 * one-segment hole at a minimum. We start spewing warnings a few
+	 * complete segments before that.
 	 */
 	LWLockAcquire(MultiXactGenLock, LW_SHARED);
 	if (MultiXactState->nextMXact == oldestMXact)
@@ -2852,6 +2836,13 @@ TruncateMultiXact(void)
 	SimpleLruTruncate(MultiXactOffsetCtl,
 					  MultiXactIdToOffsetPage(oldestMXact));
 
+
+	/*
+	 * Now, and only now, we can advance the stop point for multixact members.
+	 * If we did it any sooner, the segments we deleted above might already
+	 * have been overwritten with new members. That would be bad.
+	 */
+	DetermineSafeOldestOffset(oldestMXact);
 }
 
 /*
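
The "halfway through the multixact ID space" arithmetic in the SetMultiXactIdLimit hunk can be sketched in isolation. This is a minimal standalone sketch, not the PostgreSQL function. The names `ToyMultiXactId`, `TOY_FIRST_MXID`, `TOY_MAX_MXID`, and `toy_wrap_limit` are hypothetical. The constant values (first valid ID 1, maximum ID 0xFFFFFFFF) and the body of the `if` (skipping past the invalid ID after the 32-bit addition wraps) are assumptions, since the diff shows only the condition and not the constants or the branch body.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the values are assumptions, not taken from the diff. */
typedef uint32_t ToyMultiXactId;

#define TOY_FIRST_MXID	((ToyMultiXactId) 1)
#define TOY_MAX_MXID	((ToyMultiXactId) 0xFFFFFFFF)

/*
 * Compute a wrap limit half the ID space ahead of the oldest multixact.
 * Unsigned 32-bit addition wraps modulo 2^32, so the result may land on 0,
 * which is below the first valid ID; in that case step forward past it.
 */
ToyMultiXactId
toy_wrap_limit(ToyMultiXactId oldest)
{
	ToyMultiXactId limit;

	assert(oldest >= TOY_FIRST_MXID);	/* caller must pass a valid ID */

	limit = oldest + (TOY_MAX_MXID >> 1);
	if (limit < TOY_FIRST_MXID)
		limit += TOY_FIRST_MXID;	/* assumed branch body: skip the invalid ID */
	return limit;
}
```

For instance, `toy_wrap_limit(1)` yields 0x80000000, and `toy_wrap_limit(0x80000001)` wraps to 0 and is bumped to 1. As the new comment notes, this limit concerns multixact IDs only. Member wraparound is bounded separately in DetermineSafeOldestOffset.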
