
Commit d661532

Also trigger restartpoints based on max_wal_size on standby.
When archive recovery and restartpoints were initially introduced, checkpoint_segments was ignored on the grounds that the files restored from archive don't consume any space in the recovery server. That was changed in later releases, but even then it was arguably a feature rather than a bug: performing restartpoints as often as checkpoints during normal operation might be excessive, yet you might nevertheless not want to waste a lot of space on pre-allocated WAL by setting checkpoint_segments to a high value. But now that we have separate min_wal_size and max_wal_size settings, you can bound WAL usage with max_wal_size and still avoid consuming excessive space by setting min_wal_size to a lower value, so that argument is moot.

There are still some issues with actually limiting the space usage to max_wal_size: restartpoints in recovery can only start after seeing the checkpoint record, while a checkpoint starts flushing buffers as soon as the redo pointer is set. Restartpoints are paced to happen at the same leisurely speed as checkpoints, determined by checkpoint_completion_target, but because they are started later, max_wal_size can be exceeded by up to one checkpoint cycle's worth of WAL, depending on checkpoint_completion_target. That still seems better than not trying at all, and max_wal_size is a soft limit anyway.

The documentation already claimed that max_wal_size is obeyed in recovery, so this just fixes the behaviour to match the docs. However, add some weasel-words there to mention that max_wal_size may well be exceeded by some amount in recovery.
1 parent 6ab4d38 commit d661532
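
The pacing described in the message comes down to a single comparison: a restartpoint, like a checkpoint, is considered on schedule as long as the fraction of work it has completed is at least the fraction of one checkpoint cycle's worth of WAL consumed since it started, scaled by checkpoint_completion_target. The sketch below is a minimal, free-standing version of that WAL-based check; the function and parameter names are illustrative only, and the real logic is IsCheckpointOnSchedule() in src/backend/postmaster/checkpointer.c (patched in the diff further down), which also applies a time-based check that is omitted here.

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* byte position in the WAL stream */

/*
 * Illustrative sketch, not PostgreSQL code: the real function reads GUCs
 * and shared state instead of taking arguments.
 */
static bool
restartpoint_on_schedule(double progress,          /* fraction of work done, 0.0 - 1.0 */
                         XLogRecPtr current_ptr,   /* replay position in recovery, insert
                                                    * position in normal operation */
                         XLogRecPtr start_ptr,     /* position noted when the (restart)point
                                                    * started */
                         double completion_target, /* checkpoint_completion_target */
                         int checkpoint_segments,  /* segments per cycle, derived from
                                                    * max_wal_size */
                         uint64_t wal_segment_size)/* 16 MB by default */
{
    double elapsed_xlogs;

    /* Aim to finish after completion_target of a checkpoint cycle. */
    progress *= completion_target;

    /* Fraction of one cycle's WAL consumed since this (restart)point started. */
    elapsed_xlogs = ((double) (current_ptr - start_ptr) / wal_segment_size) /
        checkpoint_segments;

    if (progress < elapsed_xlogs)
        return false;           /* behind schedule: write buffers without napping */

    return true;                /* on schedule: keep sleeping between writes */
}

During recovery the caller would pass the last replayed record's position as current_ptr (GetXLogReplayRecPtr in the patch); in normal operation it passes the current insert position (GetInsertRecPtr).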

File tree

3 files changed: +29 -13 lines changed


doc/src/sgml/wal.sgml

+5 -1

@@ -590,7 +590,11 @@
     A restartpoint is triggered when a checkpoint record is reached if at
     least <varname>checkpoint_timeout</> seconds have passed since the last
     restartpoint, or if WAL size is about to exceed
-    <varname>max_wal_size</>.
+    <varname>max_wal_size</>. However, because of limitations on when a
+    restartpoint can be performed, <varname>max_wal_size</> is often exceeded
+    during recovery, by up to one checkpoint cycle's worth of WAL.
+    (<varname>max_wal_size</> is never a hard limit anyway, so you should
+    always leave plenty of headroom to avoid running out of disk space.)
    </para>

    <para>

src/backend/access/transam/xlog.c

+1 -1

@@ -10943,7 +10943,7 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
     * Request a restartpoint if we've replayed too much xlog since the
     * last one.
     */
-    if (StandbyModeRequested && bgwriterLaunched)
+    if (bgwriterLaunched)
    {
        if (XLogCheckpointNeeded(readSegNo))
        {

src/backend/postmaster/checkpointer.c

+23 -11

@@ -475,10 +475,12 @@ CheckpointerMain(void)
 
            /*
             * Initialize checkpointer-private variables used during
-            * checkpoint
+            * checkpoint.
             */
            ckpt_active = true;
-           if (!do_restartpoint)
+           if (do_restartpoint)
+               ckpt_start_recptr = GetXLogReplayRecPtr(NULL);
+           else
                ckpt_start_recptr = GetInsertRecPtr();
            ckpt_start_time = now;
            ckpt_cached_elapsed = 0;

@@ -720,7 +722,7 @@ CheckpointWriteDelay(int flags, double progress)
 
 /*
  * IsCheckpointOnSchedule -- are we on schedule to finish this checkpoint
- *      in time?
+ *      (or restartpoint) in time?
  *
  * Compares the current progress against the time/segments elapsed since last
  * checkpoint, and returns true if the progress we've made this far is greater

@@ -757,17 +759,27 @@ IsCheckpointOnSchedule(double progress)
     * compares against RedoRecptr, so this is not completely accurate.
     * However, it's good enough for our purposes, we're only calculating an
     * estimate anyway.
+    *
+    * During recovery, we compare last replayed WAL record's location with
+    * the location computed before calling CreateRestartPoint. That maintains
+    * the same pacing as we have during checkpoints in normal operation, but
+    * we might exceed max_wal_size by a fair amount. That's because there can
+    * be a large gap between a checkpoint's redo-pointer and the checkpoint
+    * record itself, and we only start the restartpoint after we've seen the
+    * checkpoint record. (The gap is typically up to CheckPointSegments *
+    * checkpoint_completion_target where checkpoint_completion_target is the
+    * value that was in effect when the WAL was generated).
     */
-   if (!RecoveryInProgress())
-   {
+   if (RecoveryInProgress())
+       recptr = GetXLogReplayRecPtr(NULL);
+   else
        recptr = GetInsertRecPtr();
-       elapsed_xlogs = (((double) (recptr - ckpt_start_recptr)) / XLogSegSize) / CheckPointSegments;
+   elapsed_xlogs = (((double) (recptr - ckpt_start_recptr)) / XLogSegSize) / CheckPointSegments;
 
-       if (progress < elapsed_xlogs)
-       {
-           ckpt_cached_elapsed = elapsed_xlogs;
-           return false;
-       }
+   if (progress < elapsed_xlogs)
+   {
+       ckpt_cached_elapsed = elapsed_xlogs;
+       return false;
    }
 
    /*
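
To put a rough, illustrative number on the overshoot described in the new comment: the gap between a checkpoint's redo pointer and its checkpoint record is typically up to CheckPointSegments * checkpoint_completion_target segments. If, say, CheckPointSegments works out to 64 and checkpoint_completion_target is 0.9 (hypothetical values, not defaults), a restartpoint can start up to about 57 segments late, which at the default 16 MB segment size is roughly 900 MB of WAL beyond what the pacing alone would allow. This is why the documentation change above treats max_wal_size as a soft target that recovery may exceed, and why headroom on the standby's WAL volume is still advisable.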
