Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 98f58a3

Browse files
committed
Fix Hot-Standby initialization of clog and subtrans.
These bugs can cause data loss on standbys started with hot_standby=on at the moment they start to accept read only queries, by marking committed transactions as uncommited. The likelihood of such corruptions is small unless the primary has a high transaction rate. 5a031a5 fixed bugs in HS's startup logic by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both clog and subtrans are written to before that. This only failed to fail in common cases because the usage of ExtendCLOG in procarray.c was superflous since clog extensions are actually WAL logged. f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing extensions of pg_subtrans due to the former commit's changes - which are not WAL logged - by performing the extensions when switching to a state > STANDBY_INITIALIZED and not performing xid assignments before that - again missing the fact that ExtendCLOG is unneccessary - but screwed up twice: Once because latestObservedXid wasn't updated anymore in that state due to the earlier commit and once by having an off-by-one error in the loop performing extensions. This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed between the start of the checkpoint recovery started from and the first xl_running_xact record old transactions commit bits in pg_clog could be overwritten if they started and committed in that window. Fix this mess by not performing ExtendCLOG() in HS at all anymore since it's unneeded and evidently dangerous and by performing subtrans extensions even before reaching STANDBY_SNAPSHOT_PENDING. Analysis and patch by Andres Freund. Reported by Christophe Pettus. Backpatch down to 9.0, like the previous commit that caused this.
1 parent 1a3d104 commit 98f58a3

File tree

2 files changed

+41
-29
lines changed

2 files changed

+41
-29
lines changed

src/backend/access/transam/clog.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -622,7 +622,7 @@ ExtendCLOG(TransactionId newestXact)
622622
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
623623

624624
/* Zero the page and make an XLOG entry about it */
625-
ZeroCLOGPage(pageno, !InRecovery);
625+
ZeroCLOGPage(pageno, true);
626626

627627
LWLockRelease(CLogControlLock);
628628
}

src/backend/storage/ipc/procarray.c

Lines changed: 40 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -473,7 +473,7 @@ ProcArrayClearTransaction(PGPROC *proc)
473473
* ProcArrayInitRecovery -- initialize recovery xid mgmt environment
474474
*
475475
* Remember up to where the startup process initialized the CLOG and subtrans
476-
* so we can ensure its initialized gaplessly up to the point where necessary
476+
* so we can ensure it's initialized gaplessly up to the point where necessary
477477
* while in recovery.
478478
*/
479479
void
@@ -483,9 +483,10 @@ ProcArrayInitRecovery(TransactionId initializedUptoXID)
483483
Assert(TransactionIdIsNormal(initializedUptoXID));
484484

485485
/*
486-
* we set latestObservedXid to the xid SUBTRANS has been initialized upto
487-
* so we can extend it from that point onwards when we reach a consistent
488-
* state in ProcArrayApplyRecoveryInfo().
486+
* we set latestObservedXid to the xid SUBTRANS has been initialized upto,
487+
* so we can extend it from that point onwards in
488+
* RecordKnownAssignedTransactionIds, and when we get consistent in
489+
* ProcArrayApplyRecoveryInfo().
489490
*/
490491
latestObservedXid = initializedUptoXID;
491492
TransactionIdRetreat(latestObservedXid);
@@ -661,17 +662,23 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
661662
pfree(xids);
662663

663664
/*
664-
* latestObservedXid is set to the the point where SUBTRANS was started up
665-
* to, initialize subtrans from thereon, up to nextXid - 1.
665+
* latestObservedXid is at least set to the the point where SUBTRANS was
666+
* started up to (c.f. ProcArrayInitRecovery()) or to the biggest xid
667+
* RecordKnownAssignedTransactionIds() was called for. Initialize
668+
* subtrans from thereon, up to nextXid - 1.
669+
*
670+
* We need to duplicate parts of RecordKnownAssignedTransactionId() here,
671+
* because we've just added xids to the known assigned xids machinery that
672+
* haven't gone through RecordKnownAssignedTransactionId().
666673
*/
667674
Assert(TransactionIdIsNormal(latestObservedXid));
675+
TransactionIdAdvance(latestObservedXid);
668676
while (TransactionIdPrecedes(latestObservedXid, running->nextXid))
669677
{
670-
ExtendCLOG(latestObservedXid);
671678
ExtendSUBTRANS(latestObservedXid);
672-
673679
TransactionIdAdvance(latestObservedXid);
674680
}
681+
TransactionIdRetreat(latestObservedXid); /* = running->nextXid - 1 */
675682

676683
/* ----------
677684
* Now we've got the running xids we need to set the global values that
@@ -756,10 +763,6 @@ ProcArrayApplyXidAssignment(TransactionId topxid,
756763

757764
Assert(standbyState >= STANDBY_INITIALIZED);
758765

759-
/* can't do anything useful unless we have more state setup */
760-
if (standbyState == STANDBY_INITIALIZED)
761-
return;
762-
763766
max_xid = TransactionIdLatest(topxid, nsubxids, subxids);
764767

765768
/*
@@ -786,6 +789,10 @@ ProcArrayApplyXidAssignment(TransactionId topxid,
786789
for (i = 0; i < nsubxids; i++)
787790
SubTransSetParent(subxids[i], topxid, false);
788791

792+
/* KnownAssignedXids isn't maintained yet, so we're done for now */
793+
if (standbyState == STANDBY_INITIALIZED)
794+
return;
795+
789796
/*
790797
* Uses same locking as transaction commit
791798
*/
@@ -2661,18 +2668,11 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
26612668
{
26622669
Assert(standbyState >= STANDBY_INITIALIZED);
26632670
Assert(TransactionIdIsValid(xid));
2671+
Assert(TransactionIdIsValid(latestObservedXid));
26642672

26652673
elog(trace_recovery(DEBUG4), "record known xact %u latestObservedXid %u",
26662674
xid, latestObservedXid);
26672675

2668-
/*
2669-
* If the KnownAssignedXids machinery isn't up yet, do nothing.
2670-
*/
2671-
if (standbyState <= STANDBY_INITIALIZED)
2672-
return;
2673-
2674-
Assert(TransactionIdIsValid(latestObservedXid));
2675-
26762676
/*
26772677
* When a newly observed xid arrives, it is frequently the case that it is
26782678
* *not* the next xid in sequence. When this occurs, we must treat the
@@ -2683,22 +2683,34 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
26832683
TransactionId next_expected_xid;
26842684

26852685
/*
2686-
* Extend clog and subtrans like we do in GetNewTransactionId() during
2687-
* normal operation using individual extend steps. Typical case
2688-
* requires almost no activity.
2686+
* Extend subtrans like we do in GetNewTransactionId() during normal
2687+
* operation using individual extend steps. Note that we do not need
2688+
* to extend clog since its extensions are WAL logged.
2689+
*
2690+
* This part has to be done regardless of standbyState since we
2691+
* immediately start assigning subtransactions to their toplevel
2692+
* transactions.
26892693
*/
26902694
next_expected_xid = latestObservedXid;
2691-
TransactionIdAdvance(next_expected_xid);
2692-
while (TransactionIdPrecedesOrEquals(next_expected_xid, xid))
2695+
while (TransactionIdPrecedes(next_expected_xid, xid))
26932696
{
2694-
ExtendCLOG(next_expected_xid);
2697+
TransactionIdAdvance(next_expected_xid);
26952698
ExtendSUBTRANS(next_expected_xid);
2699+
}
2700+
Assert(next_expected_xid == xid);
26962701

2697-
TransactionIdAdvance(next_expected_xid);
2702+
/*
2703+
* If the KnownAssignedXids machinery isn't up yet, there's nothing
2704+
* more to do since we don't track assigned xids yet.
2705+
*/
2706+
if (standbyState <= STANDBY_INITIALIZED)
2707+
{
2708+
latestObservedXid = xid;
2709+
return;
26982710
}
26992711

27002712
/*
2701-
* Add the new xids onto the KnownAssignedXids array.
2713+
* Add (latestObservedXid, xid] onto the KnownAssignedXids array.
27022714
*/
27032715
next_expected_xid = latestObservedXid;
27042716
TransactionIdAdvance(next_expected_xid);

0 commit comments

Comments
 (0)