Commit 2dd9322

Move BKP_REMOVABLE bit from individual WAL records to WAL page headers.

Removing this bit from xl_info allows us to restore the old limit of four (not three) separate pages touched by a WAL record, which is needed for the upcoming SP-GiST feature, and will likely be useful elsewhere in future.

When we implemented XLR_BKP_REMOVABLE in 2007, we had to do it like that because no special WAL-visible action was taken when starting a backup. However, now we force a segment switch when starting a backup, so a compressing WAL archiver (such as pglesslog) that uses the state shown in the current page header will not be fooled as to removability of backup blocks. The only downside is that the archiver will not return to compressing mode for up to one WAL page after the backup is over, which is a small price to pay for getting back the extra xl_info bit. In any case the archiver could look for XLOG_BACKUP_END records if it thought it was worth the trouble to do so.

Bump XLOG_PAGE_MAGIC since this is effectively a change in WAL format.
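
For illustration only, the archiver-side decision described in the message might be sketched as below. This is a minimal sketch, not code from pglesslog or from this commit: `DemoPageHeader` and `demo_page_blocks_removable` are hypothetical names, and the struct is a simplified stand-in for the real page header. Only the two constants are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from this commit's diff. */
#define XLOG_PAGE_MAGIC   0xD069
#define XLP_BKP_REMOVABLE 0x0004

/* Hypothetical, simplified stand-in for the real WAL page header. */
typedef struct DemoPageHeader
{
    uint16_t xlp_magic;
    uint16_t xlp_info;
} DemoPageHeader;

/*
 * Hypothetical helper: may backup blocks of records that begin in this
 * page be compressed away?  A page whose magic does not match the new
 * WAL format is conservatively treated as not removable.
 */
static bool
demo_page_blocks_removable(const DemoPageHeader *hdr)
{
    assert(hdr != NULL);
    if (hdr->xlp_magic != XLOG_PAGE_MAGIC)
        return false;
    return (hdr->xlp_info & XLP_BKP_REMOVABLE) != 0;
}
```

Under these assumptions, a page written during a backup has the flag clear, so the helper returns false for it and the archiver would leave its full-page images alone.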
1 parent 8409b60 commit 2dd9322

4 files changed, +44 -42 lines changed


src/backend/access/transam/README

+1 -1
@@ -473,7 +473,7 @@ the same page, only BKP(1) would have been set.
 For this reason as well as the risk of deadlocking on buffer locks, it's best
 to design WAL records so that they reflect small atomic actions involving just
 one or a few pages. The current XLOG infrastructure cannot handle WAL records
-involving references to more than three shared buffers, anyway.
+involving references to more than four shared buffers, anyway.
 
 In the case where the WAL record contains enough information to re-generate
 the entire contents of a page, do *not* show that page's buffer ID in the

src/backend/access/transam/xlog.c

+34 -26
@@ -970,19 +970,6 @@ begin:;
         }
     }
 
-    /*
-     * If we backed up any full blocks and online backup is not in progress,
-     * mark the backup blocks as removable. This allows the WAL archiver to
-     * know whether it is safe to compress archived WAL data by transforming
-     * full-block records into the non-full-block format.
-     *
-     * Note: we could just set the flag whenever !forcePageWrites, but
-     * defining it like this leaves the info bit free for some potential other
-     * use in records without any backup blocks.
-     */
-    if ((info & XLR_BKP_BLOCK_MASK) && !Insert->forcePageWrites)
-        info |= XLR_BKP_REMOVABLE;
-
     /*
      * If there isn't enough space on the current XLOG page for a record
      * header, advance to the next page (leaving the unused space as zeroes).
@@ -1601,6 +1588,21 @@ AdvanceXLInsertBuffer(bool new_segment)
     NewPage ->xlp_pageaddr.xlogid = NewPageEndPtr.xlogid;
     NewPage ->xlp_pageaddr.xrecoff = NewPageEndPtr.xrecoff - XLOG_BLCKSZ;
 
+    /*
+     * If online backup is not in progress, mark the header to indicate that
+     * WAL records beginning in this page have removable backup blocks. This
+     * allows the WAL archiver to know whether it is safe to compress archived
+     * WAL data by transforming full-block records into the non-full-block
+     * format. It is sufficient to record this at the page level because we
+     * force a page switch (in fact a segment switch) when starting a backup,
+     * so the flag will be off before any records can be written during the
+     * backup. At the end of a backup, the last page will be marked as all
+     * unsafe when perhaps only part is unsafe, but at worst the archiver
+     * would miss the opportunity to compress a few records.
+     */
+    if (!Insert->forcePageWrites)
+        NewPage->xlp_info |= XLP_BKP_REMOVABLE;
+
     /*
      * If first page of an XLOG segment file, make it a long header.
      */
@@ -8849,19 +8851,6 @@ do_pg_start_backup(const char *backupidstr, bool fast, char **labelfile)
                  errmsg("backup label too long (max %d bytes)",
                         MAXPGPATH)));
 
-    /*
-     * Force an XLOG file switch before the checkpoint, to ensure that the WAL
-     * segment the checkpoint is written to doesn't contain pages with old
-     * timeline IDs. That would otherwise happen if you called
-     * pg_start_backup() right after restoring from a PITR archive: the first
-     * WAL segment containing the startup checkpoint has pages in the
-     * beginning with the old timeline ID. That can cause trouble at recovery:
-     * we won't have a history file covering the old timeline if pg_xlog
-     * directory was not included in the base backup and the WAL archive was
-     * cleared too before starting the backup.
-     */
-    RequestXLogSwitch();
-
     /*
      * Mark backup active in shared memory. We must do full-page WAL writes
      * during an on-line backup even if not doing so at other times, because
@@ -8902,6 +8891,25 @@ do_pg_start_backup(const char *backupidstr, bool fast, char **labelfile)
     {
         bool gotUniqueStartpoint = false;
 
+        /*
+         * Force an XLOG file switch before the checkpoint, to ensure that the
+         * WAL segment the checkpoint is written to doesn't contain pages with
+         * old timeline IDs. That would otherwise happen if you called
+         * pg_start_backup() right after restoring from a PITR archive: the
+         * first WAL segment containing the startup checkpoint has pages in
+         * the beginning with the old timeline ID. That can cause trouble at
+         * recovery: we won't have a history file covering the old timeline if
+         * pg_xlog directory was not included in the base backup and the WAL
+         * archive was cleared too before starting the backup.
+         *
+         * This also ensures that we have emitted a WAL page header that has
+         * XLP_BKP_REMOVABLE off before we emit the checkpoint record.
+         * Therefore, if a WAL archiver (such as pglesslog) is trying to
+         * compress out removable backup blocks, it won't remove any that
+         * occur after this point.
+         */
+        RequestXLogSwitch();
+
         do
         {
             /*
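
The page-level marking added to AdvanceXLInsertBuffer() can be modelled in isolation. The following is a minimal sketch, assuming a hypothetical `DemoInsertState` struct in place of the real shared insert state and a hypothetical `demo_init_page_info` helper. Only `forcePageWrites` and `XLP_BKP_REMOVABLE` are names from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value taken from this commit's diff. */
#define XLP_BKP_REMOVABLE 0x0004

/* Hypothetical stand-in for the shared WAL insert state. */
typedef struct DemoInsertState
{
    bool forcePageWrites;   /* true while an online backup is running */
} DemoInsertState;

/*
 * Hypothetical helper: compute xlp_info for a freshly initialized page.
 * The flag is set only when no online backup is in progress, which is
 * the same condition the new hunk tests.
 */
static uint16_t
demo_init_page_info(const DemoInsertState *insert)
{
    uint16_t xlp_info = 0;

    assert(insert != NULL);
    if (!insert->forcePageWrites)
        xlp_info |= XLP_BKP_REMOVABLE;
    return xlp_info;
}
```

Because the flag is fixed when a page is initialized, a page started while the backup was still active keeps the flag clear even after the backup ends. That is the "up to one WAL page" delay mentioned in the commit message.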

src/include/access/xlog.h

+5 -13
@@ -29,7 +29,7 @@
  *      backup block data
  *      ...
  *
- * where there can be zero to three backup blocks (as signaled by xl_info flag
+ * where there can be zero to four backup blocks (as signaled by xl_info flag
  * bits). XLogRecord structs always start on MAXALIGN boundaries in the WAL
  * files, and we round up SizeOfXLogRecord so that the rmgr data is also
  * guaranteed to begin on a MAXALIGN boundary. However, no padding is added
@@ -66,24 +66,16 @@ typedef struct XLogRecord
 
 /*
  * If we backed up any disk blocks with the XLOG record, we use flag bits in
- * xl_info to signal it. We support backup of up to 3 disk blocks per XLOG
+ * xl_info to signal it. We support backup of up to 4 disk blocks per XLOG
  * record.
  */
-#define XLR_BKP_BLOCK_MASK 0x0E /* all info bits used for bkp blocks */
-#define XLR_MAX_BKP_BLOCKS 3
+#define XLR_BKP_BLOCK_MASK 0x0F /* all info bits used for bkp blocks */
+#define XLR_MAX_BKP_BLOCKS 4
 #define XLR_SET_BKP_BLOCK(iblk) (0x08 >> (iblk))
 #define XLR_BKP_BLOCK_1 XLR_SET_BKP_BLOCK(0) /* 0x08 */
 #define XLR_BKP_BLOCK_2 XLR_SET_BKP_BLOCK(1) /* 0x04 */
 #define XLR_BKP_BLOCK_3 XLR_SET_BKP_BLOCK(2) /* 0x02 */
-
-/*
- * Bit 0 of xl_info is set if the backed-up blocks could safely be removed
- * from a compressed version of XLOG (that is, they are backed up only to
- * prevent partial-page-write problems, and not to ensure consistency of PITR
- * recovery). The compression algorithm would need to extract data from the
- * blocks to create an equivalent non-full-page XLOG record.
- */
-#define XLR_BKP_REMOVABLE 0x01
+#define XLR_BKP_BLOCK_4 XLR_SET_BKP_BLOCK(3) /* 0x01 */
 
 /* Sync methods */
 #define SYNC_METHOD_FSYNC 0
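
To show how the reclaimed bit extends the backup-block flags, here is a small sketch built on the macros above. The three macro definitions are copied from the new side of the diff. `demo_set_bkp_block` and `demo_count_bkp_blocks` are hypothetical helpers written for illustration and are not part of the PostgreSQL source.

```c
#include <assert.h>
#include <stdint.h>

/* Definitions as they appear after this commit. */
#define XLR_BKP_BLOCK_MASK      0x0F
#define XLR_MAX_BKP_BLOCKS      4
#define XLR_SET_BKP_BLOCK(iblk) (0x08 >> (iblk))

/* Hypothetical helper: flag backup block iblk (0-based) in xl_info. */
static uint8_t
demo_set_bkp_block(uint8_t xl_info, int iblk)
{
    assert(iblk >= 0 && iblk < XLR_MAX_BKP_BLOCKS);
    return (uint8_t) (xl_info | XLR_SET_BKP_BLOCK(iblk));
}

/* Hypothetical helper: count the backup blocks flagged in xl_info. */
static int
demo_count_bkp_blocks(uint8_t xl_info)
{
    int count = 0;

    for (int i = 0; i < XLR_MAX_BKP_BLOCKS; i++)
    {
        if (xl_info & XLR_SET_BKP_BLOCK(i))
            count++;
    }
    return count;
}
```

With the old mask of 0x0E, bit 0x01 was reserved for XLR_BKP_REMOVABLE. In this sketch, `demo_set_bkp_block(0, 3)` yields 0x01, the bit now used by XLR_BKP_BLOCK_4.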

src/include/access/xlog_internal.h

+4 -2
@@ -71,7 +71,7 @@ typedef struct XLogContRecord
 /*
  * Each page of XLOG file has a header like this:
  */
-#define XLOG_PAGE_MAGIC 0xD068 /* can be used as WAL version indicator */
+#define XLOG_PAGE_MAGIC 0xD069 /* can be used as WAL version indicator */
 
 typedef struct XLogPageHeaderData
 {
@@ -106,8 +106,10 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader;
 #define XLP_FIRST_IS_CONTRECORD 0x0001
 /* This flag indicates a "long" page header */
 #define XLP_LONG_HEADER 0x0002
+/* This flag indicates backup blocks starting in this page are optional */
+#define XLP_BKP_REMOVABLE 0x0004
 /* All defined flag bits in xlp_info (used for validity checking of header) */
-#define XLP_ALL_FLAGS 0x0003
+#define XLP_ALL_FLAGS 0x0007
 
 #define XLogPageHeaderSize(hdr) \
     (((hdr)->xlp_info & XLP_LONG_HEADER) ? SizeOfXLogLongPHD : SizeOfXLogShortPHD)
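
The comment on XLP_ALL_FLAGS says the mask is used for validity checking, which is why it has to grow together with the new flag. A minimal sketch of such a check follows. `demo_info_bits_valid`, `demo_self_check` and the `XLP_ALL_FLAGS_OLD` name are hypothetical and introduced only for this illustration. The flag values are the ones shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values from the diff; the _OLD name is invented for comparison. */
#define XLP_FIRST_IS_CONTRECORD 0x0001
#define XLP_LONG_HEADER         0x0002
#define XLP_BKP_REMOVABLE       0x0004
#define XLP_ALL_FLAGS_OLD       0x0003
#define XLP_ALL_FLAGS           0x0007

/* Hypothetical helper: true if xlp_info has no bits outside all_flags. */
static bool
demo_info_bits_valid(uint16_t xlp_info, uint16_t all_flags)
{
    return (xlp_info & (uint16_t) ~all_flags) == 0;
}

/* Usage example: the same page header under the old and the new mask. */
static void
demo_self_check(void)
{
    uint16_t info = XLP_LONG_HEADER | XLP_BKP_REMOVABLE;

    /* The old mask does not know the new flag and rejects the page. */
    assert(!demo_info_bits_valid(info, XLP_ALL_FLAGS_OLD));
    /* The widened mask accepts it. */
    assert(demo_info_bits_valid(info, XLP_ALL_FLAGS));
}
```

In this sketch, a reader still using the old mask rejects any page that carries the new flag. That is consistent with the commit bumping XLOG_PAGE_MAGIC and treating the change as a WAL format change.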
