Commit 3cd1ba1

Fix comments about WAL rule "write xlog before data" versus pg_multixact.
Recovery does not achieve its goal of zeroing all pg_multixact entries whose accompanying WAL records never reached disk. Remove that claim and justify its expendability. Detail the need for TrimMultiXact(), which has little in common with the TrimCLOG() rationale. Merge two tightly-related comments. Stop presenting pg_multixact as specific to heap_lock_tuple(); PostgreSQL 9.3 extended its use to heap_update(). Noticed while investigating a report from Andres Freund.
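
The ordering argument in the revised comment can be illustrated with a toy model. This is a minimal sketch, not PostgreSQL code: `ToyWal`, `toy_wal_insert`, `toy_wal_flush`, `toy_wal_is_durable`, and `toy_heap_page_write` are hypothetical names, and the log is reduced to two counters. The only assumption is that WAL is flushed sequentially, so flushing through the heap record's LSN also flushes the earlier MultiXactId-creation record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not PostgreSQL's real WAL API. */
typedef uint64_t ToyLsn;

typedef struct ToyWal
{
    ToyLsn insert_pos;   /* LSN of the most recently inserted record */
    ToyLsn flushed_pos;  /* every record with LSN <= this is on disk */
} ToyWal;

/* Append a record and return its LSN; LSNs are strictly increasing. */
ToyLsn
toy_wal_insert(ToyWal *wal)
{
    return ++wal->insert_pos;
}

/* Flush the log sequentially, up to and including `upto`. */
void
toy_wal_flush(ToyWal *wal, ToyLsn upto)
{
    assert(upto <= wal->insert_pos);
    if (upto > wal->flushed_pos)
        wal->flushed_pos = upto;
}

/* Has the record at `lsn` reached disk? */
bool
toy_wal_is_durable(const ToyWal *wal, ToyLsn lsn)
{
    return lsn <= wal->flushed_pos;
}

/*
 * Write out a heap page stamped with page_lsn while honoring
 * "write xlog before data": the log is flushed through page_lsn first.
 */
void
toy_heap_page_write(ToyWal *wal, ToyLsn page_lsn)
{
    toy_wal_flush(wal, page_lsn);
    /* ... the page itself would be written here ... */
}
```

In this model, a record inserted before the heap record always has a smaller LSN, so it is durable by the time the heap page carrying the MultiXactId is written, even though the creating module never flushed it itself.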
1 parent 253de19 commit 3cd1ba1

1 file changed, with 21 additions and 25 deletions

src/backend/access/transam/multixact.c

Lines changed: 21 additions & 25 deletions
@@ -24,17 +24,21 @@
  * since it would get completely confused if someone inquired about a bogus
  * MultiXactId that pointed to an intermediate slot containing an XID.)
  *
- * XLOG interactions: this module generates an XLOG record whenever a new
- * OFFSETs or MEMBERs page is initialized to zeroes, as well as an XLOG record
- * whenever a new MultiXactId is defined.  This allows us to completely
- * rebuild the data entered since the last checkpoint during XLOG replay.
- * Because this is possible, we need not follow the normal rule of
- * "write WAL before data"; the only correctness guarantee needed is that
- * we flush and sync all dirty OFFSETs and MEMBERs pages to disk before a
- * checkpoint is considered complete.  If a page does make it to disk ahead
- * of corresponding WAL records, it will be forcibly zeroed before use anyway.
- * Therefore, we don't need to mark our pages with LSN information; we have
- * enough synchronization already.
+ * XLOG interactions: this module generates a record whenever a new OFFSETs or
+ * MEMBERs page is initialized to zeroes, as well as an
+ * XLOG_MULTIXACT_CREATE_ID record whenever a new MultiXactId is defined.
+ * This module ignores the WAL rule "write xlog before data," because it
+ * suffices that actions recording a MultiXactId in a heap xmax do follow that
+ * rule.  The only way for the MXID to be referenced from any data page is for
+ * heap_lock_tuple() or heap_update() to have put it there, and each generates
+ * an XLOG record that must follow ours.  The normal LSN interlock between the
+ * data page and that XLOG record will ensure that our XLOG record reaches
+ * disk first.  If the SLRU members/offsets data reaches disk sooner than the
+ * XLOG records, we do not care; after recovery, no xmax will refer to it.  On
+ * the flip side, to ensure that all referenced entries _do_ reach disk, this
+ * module's XLOG records completely rebuild the data entered since the last
+ * checkpoint.  We flush and sync all dirty OFFSETs and MEMBERs pages to disk
+ * before each checkpoint is considered complete.
  *
  * Like clog.c, and unlike subtrans.c, we have to preserve state across
  * crashes and ensure that MXID and offset numbering increases monotonically
@@ -795,19 +799,7 @@ MultiXactIdCreateFromMembers(int nmembers, MultiXactMember *members)
 	 */
 	multi = GetNewMultiXactId(nmembers, &offset);
 
-	/*
-	 * Make an XLOG entry describing the new MXID.
-	 *
-	 * Note: we need not flush this XLOG entry to disk before proceeding. The
-	 * only way for the MXID to be referenced from any data page is for
-	 * heap_lock_tuple() to have put it there, and heap_lock_tuple() generates
-	 * an XLOG record that must follow ours.  The normal LSN interlock between
-	 * the data page and that XLOG record will ensure that our XLOG record
-	 * reaches disk first.  If the SLRU members/offsets data reaches disk
-	 * sooner than the XLOG record, we do not care because we'll overwrite it
-	 * with zeroes unless the XLOG record is there too; see notes at top of
-	 * this file.
-	 */
+	/* Make an XLOG entry describing the new MXID. */
 	xlrec.mid = multi;
 	xlrec.moff = offset;
 	xlrec.nmembers = nmembers;
@@ -2037,7 +2029,11 @@ TrimMultiXact(void)
 
 	/*
 	 * Zero out the remainder of the current offsets page.  See notes in
-	 * TrimCLOG() for motivation.
+	 * TrimCLOG() for background.  Unlike CLOG, some WAL record covers every
+	 * pg_multixact SLRU mutation.  Since, also unlike CLOG, we ignore the WAL
+	 * rule "write xlog before data," nextMXact successors may carry obsolete,
+	 * nonzero offset values.  Zero those so case 2 of GetMultiXactIdMembers()
+	 * operates normally.
 	 */
 	entryno = MultiXactIdToOffsetEntry(nextMXact);
 	if (entryno != 0)
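
The last hunk stops at the `if (entryno != 0)` test, so the zeroing step itself is not shown. Below is a minimal, self-contained sketch of what "zero out the remainder of the current offsets page" amounts to. It is not the PostgreSQL implementation: every `toy_`/`TOY_` name is hypothetical, the page is a plain array rather than an SLRU buffer, and it assumes an 8192-byte block with 4-byte offset entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins (assumed values): 8 kB pages holding 4-byte offsets. */
#define TOY_BLCKSZ 8192
typedef uint32_t ToyMultiXactOffset;
#define TOY_OFFSETS_PER_PAGE (TOY_BLCKSZ / sizeof(ToyMultiXactOffset))

/* Index of a MultiXactId's entry within its offsets page. */
size_t
toy_mxid_to_offset_entry(uint32_t mxid)
{
    return mxid % TOY_OFFSETS_PER_PAGE;
}

/*
 * Zero every offsets entry from next_mxact's slot through the end of the
 * page, so stale nonzero values left by a pre-crash write cannot be
 * mistaken for valid offsets.  Returns the number of entries zeroed.
 */
size_t
toy_trim_offsets_page(ToyMultiXactOffset *page, uint32_t next_mxact)
{
    size_t entryno = toy_mxid_to_offset_entry(next_mxact);

    if (entryno == 0)
        return 0;   /* next_mxact begins a new page; nothing to trim here */

    assert(entryno < TOY_OFFSETS_PER_PAGE);
    memset(page + entryno, 0,
           TOY_BLCKSZ - entryno * sizeof(ToyMultiXactOffset));
    return TOY_OFFSETS_PER_PAGE - entryno;
}
```

Entries below `next_mxact`'s slot are left untouched; only the slot for `next_mxact` and its successors on the same page are cleared.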
