
Commit edbd1b4

Add more LOG messages when starting and ending recovery from a backup
Three LOG messages are added in the recovery code paths, providing
information that can be useful to track corruption issues depending on
the state of the cluster, telling that:
- Recovery has started from a backup_label.
- Recovery is restarting from a backup start LSN, without a
  backup_label.
- Recovery has completed from a backup.

This was originally applied on HEAD as of 1d35f70, and there is
consensus that this can be useful for older versions. This applies
cleanly down to 15, so do it down to this version for now (older
versions have heavily refactored the WAL recovery paths, making the
change less straight-forward to do).

Author: Andres Freund
Reviewed-by: David Steele, Laurenz Albe, Michael Paquier
Discussion: https://postgr.es/m/20231117041811.vz4vgkthwjnwp2pp@awork3.anarazel.de
Backpatch-through: 15
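As an illustration of how the three messages could be told apart when reading a server log, here is a minimal sketch. The enum and the function `classify_backup_log_line` are hypothetical names introduced for this example and are not part of PostgreSQL. It assumes the messages appear untranslated, with the English prefixes used in the diff.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the three situations the commit message lists. */
typedef enum
{
	BACKUP_LOG_NONE,		/* line is not one of the three messages */
	BACKUP_LOG_STARTING,	/* recovery started from a backup_label */
	BACKUP_LOG_RESTARTING,	/* recovery restarted without a backup_label */
	BACKUP_LOG_COMPLETED	/* backup recovery reached its end LSN */
} BackupLogKind;

/*
 * Classify one log line by the message prefixes added in this commit.
 * Assumes untranslated (English) messages.
 */
static BackupLogKind
classify_backup_log_line(const char *line)
{
	/*
	 * Test "restarting" before "starting": the former contains the latter
	 * as a substring, so the order of the checks matters.
	 */
	if (strstr(line, "restarting backup recovery with redo LSN") != NULL)
		return BACKUP_LOG_RESTARTING;
	if (strstr(line, "starting backup recovery with redo LSN") != NULL)
		return BACKUP_LOG_STARTING;
	if (strstr(line, "completed backup recovery with redo LSN") != NULL)
		return BACKUP_LOG_COMPLETED;
	return BACKUP_LOG_NONE;
}
```

A log that shows neither a "starting" nor a "restarting" line for a node built from a base backup would be a hint that backup_label was omitted, which is the case the new messages are meant to make easier to rule in or out.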
1 parent f57a580 commit edbd1b4


1 file changed, +34 -0 lines changed


src/backend/access/transam/xlogrecovery.c

Lines changed: 34 additions & 0 deletions
@@ -623,6 +623,22 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		if (StandbyModeRequested)
 			EnableStandbyMode();
 
+		/*
+		 * Omitting backup_label when creating a new replica, PITR node etc.
+		 * unfortunately is a common cause of corruption.  Logging that
+		 * backup_label was used makes it a bit easier to exclude that as the
+		 * cause of observed corruption.
+		 *
+		 * Do so before we try to read the checkpoint record (which can fail),
+		 * as otherwise it can be hard to understand why a checkpoint other
+		 * than ControlFile->checkPoint is used.
+		 */
+		ereport(LOG,
+				(errmsg("starting backup recovery with redo LSN %X/%X, checkpoint LSN %X/%X, on timeline ID %u",
+						LSN_FORMAT_ARGS(RedoStartLSN),
+						LSN_FORMAT_ARGS(CheckPointLoc),
+						CheckPointTLI)));
+
 		/*
 		 * When a backup_label file is present, we want to roll forward from
 		 * the checkpoint it identifies, rather than using pg_control.
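The `%X/%X` placeholders in the new message each consume two 32-bit values, the high and low halves of a 64-bit WAL position, which is what `LSN_FORMAT_ARGS` supplies. The standalone sketch below shows that split outside the server. `DemoLsn` and `format_lsn` are hypothetical names for this example, standing in for `XLogRecPtr` and the macro.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for PostgreSQL's XLogRecPtr, a 64-bit WAL position. */
typedef uint64_t DemoLsn;

/*
 * Write lsn into buf as "HIGH/LOW" in uppercase hex without padding,
 * the shape produced by the %X/%X placeholders.  Hypothetical helper,
 * not a PostgreSQL function.  Returns snprintf's result.
 */
static int
format_lsn(char *buf, size_t buflen, DemoLsn lsn)
{
	return snprintf(buf, buflen, "%" PRIX32 "/%" PRIX32,
					(uint32_t) (lsn >> 32),	/* high 32 bits */
					(uint32_t) lsn);		/* low 32 bits */
}
```

With this helper, the value `0x0000000103000028` is rendered as `1/3000028`, the same shape as the LSNs in the three new log messages.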
@@ -761,6 +777,16 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 				EnableStandbyMode();
 		}
 
+		/*
+		 * For the same reason as when starting up with backup_label present,
+		 * emit a log message when we continue initializing from a base
+		 * backup.
+		 */
+		if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
+			ereport(LOG,
+					(errmsg("restarting backup recovery with redo LSN %X/%X",
+							LSN_FORMAT_ARGS(ControlFile->backupStartPoint))));
+
 		/* Get the last valid checkpoint record. */
 		CheckPointLoc = ControlFile->checkPoint;
 		CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;
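Taken together, the first two hunks emit at most one startup message: "starting" when a backup_label file is present, "restarting" when there is none but pg_control still records a backup start point. The sketch below models only that decision, as far as it can be read from the hunks. `startup_backup_message` and its parameters are hypothetical, and `DEMO_INVALID_LSN` mirrors `InvalidXLogRecPtr`, which is 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for XLogRecPtr and InvalidXLogRecPtr (0 in PostgreSQL). */
typedef uint64_t DemoLsn;
#define DEMO_INVALID_LSN ((DemoLsn) 0)

/*
 * Return the prefix of the startup message that would be logged, or NULL
 * when neither applies.  Simplified model, not the server's control flow.
 */
static const char *
startup_backup_message(bool have_backup_label, DemoLsn control_backup_start)
{
	/* backup_label present: recovery begins from the label's checkpoint. */
	if (have_backup_label)
		return "starting backup recovery";

	/* No label, but pg_control shows a backup recovery was in progress. */
	if (control_backup_start != DEMO_INVALID_LSN)
		return "restarting backup recovery";

	/* Ordinary startup: no backup-related message. */
	return NULL;
}
```

The third case is the one that matters for diagnosis: a node restored from a base backup that logs neither message never saw a backup_label or a recorded backup start point.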
@@ -2123,6 +2149,9 @@ CheckRecoveryConsistency(void)
 	if (!XLogRecPtrIsInvalid(backupEndPoint) &&
 		backupEndPoint <= lastReplayedEndRecPtr)
 	{
+		XLogRecPtr saveBackupStartPoint = backupStartPoint;
+		XLogRecPtr saveBackupEndPoint = backupEndPoint;
+
 		elog(DEBUG1, "end of backup reached");
 
 		/*
@@ -2133,6 +2162,11 @@ CheckRecoveryConsistency(void)
 		backupStartPoint = InvalidXLogRecPtr;
 		backupEndPoint = InvalidXLogRecPtr;
 		backupEndRequired = false;
+
+		ereport(LOG,
+				(errmsg("completed backup recovery with redo LSN %X/%X and end LSN %X/%X",
+						LSN_FORMAT_ARGS(saveBackupStartPoint),
+						LSN_FORMAT_ARGS(saveBackupEndPoint))));
 	}
 
 	/*
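The two `save...` locals in `CheckRecoveryConsistency` exist because the backup start and end points are reset to invalid before the completion message is emitted. Without the copies, the message would print 0/0. The sketch below reproduces that ordering in a standalone form. All names in it (`demo_backup_start`, `demo_backup_end`, `demo_check_consistency`) are hypothetical, and the message is written to a buffer rather than passed to `ereport`.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint64_t DemoLsn;
#define DEMO_INVALID_LSN ((DemoLsn) 0)

/* Hypothetical stand-ins for the file-level state in xlogrecovery.c. */
static DemoLsn demo_backup_start = DEMO_INVALID_LSN;
static DemoLsn demo_backup_end = DEMO_INVALID_LSN;

/*
 * Returns 1 and fills msg once replay has reached the backup end point,
 * otherwise returns 0 and leaves msg untouched.
 */
static int
demo_check_consistency(DemoLsn last_replayed_end, char *msg, size_t msglen)
{
	if (demo_backup_end != DEMO_INVALID_LSN &&
		demo_backup_end <= last_replayed_end)
	{
		/* Copy first: the state is cleared before the message is built. */
		DemoLsn		save_start = demo_backup_start;
		DemoLsn		save_end = demo_backup_end;

		demo_backup_start = DEMO_INVALID_LSN;
		demo_backup_end = DEMO_INVALID_LSN;

		snprintf(msg, msglen,
				 "completed backup recovery with redo LSN %X/%X and end LSN %X/%X",
				 (unsigned int) (save_start >> 32), (unsigned int) save_start,
				 (unsigned int) (save_end >> 32), (unsigned int) save_end);
		return 1;
	}
	return 0;
}
```

Because the end point is cleared on the first successful check, later calls return 0, so the completion message is produced once per backup recovery in this model.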
