Commit 07e8b6a

Don't allow walsender to send WAL data until it's been safely fsync'd on the master. Otherwise a subsequent crash could cause the master to lose WAL that has already been applied on the slave, resulting in the slave being out of sync and soon corrupt. Per recent discussion and an example from Robert Haas.

Fujii Masao
1 parent 8f4e121 commit 07e8b6a
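
The failure mode described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `ToyWal` type and every `toy_*` function are hypothetical names invented for illustration and are not PostgreSQL code, WAL positions are flattened to plain integers, and a crash is modeled as the worst case in which everything written but not yet fsync'd is lost.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */
typedef uint64_t ToyLsn;

typedef struct ToyWal
{
    ToyLsn write_pos;   /* handed to the OS, possibly only in page cache */
    ToyLsn flush_pos;   /* known to be fsync'd, survives a crash */
} ToyWal;

/* Worst-case crash: everything past the flush position is lost. */
void toy_crash(ToyWal *wal)
{
    assert(wal->flush_pos <= wal->write_pos);
    wal->write_pos = wal->flush_pos;
}

/* Old rule: the sender may stream up to the write position. */
ToyLsn toy_send_limit_old(const ToyWal *wal)
{
    return wal->write_pos;
}

/* New rule: the sender may stream only up to the flush position. */
ToyLsn toy_send_limit_new(const ToyWal *wal)
{
    return wal->flush_pos;
}

/* Nonzero if the standby holds WAL that the master no longer has. */
int toy_standby_diverged(const ToyWal *master, ToyLsn standby_pos)
{
    return standby_pos > master->write_pos;
}
```

With a master at write position 200 and flush position 100, a standby fed under the old rule can reach 200 and is ahead of the master after the crash, while a standby fed under the new rule stops at 100 and stays consistent.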

3 files changed, 20 additions and 17 deletions


src/backend/access/transam/xlog.c

Lines changed: 5 additions & 4 deletions
@@ -7,7 +7,7 @@
  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.424 2010/06/14 06:04:21 heikki Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.425 2010/06/17 16:41:25 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -6803,17 +6803,18 @@ GetInsertRecPtr(void)
 }

 /*
- * GetWriteRecPtr -- Returns the current write position.
+ * GetFlushRecPtr -- Returns the current flush position, ie, the last WAL
+ * position known to be fsync'd to disk.
  */
 XLogRecPtr
-GetWriteRecPtr(void)
+GetFlushRecPtr(void)
 {
     /* use volatile pointer to prevent code rearrangement */
     volatile XLogCtlData *xlogctl = XLogCtl;
     XLogRecPtr recptr;

     SpinLockAcquire(&xlogctl->info_lck);
-    recptr = xlogctl->LogwrtResult.Write;
+    recptr = xlogctl->LogwrtResult.Flush;
     SpinLockRelease(&xlogctl->info_lck);

     return recptr;
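
The accessor above takes a consistent snapshot of a shared position under a short-lived lock. A minimal standalone sketch of that pattern follows, assuming C11 `<stdatomic.h>` in place of PostgreSQL's spinlock primitives. The `ToyXLogCtl` type and all `toy_*` functions are hypothetical stand-ins rather than the real `XLogCtlData` API, and positions are plain integers rather than `XLogRecPtr`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the shared WAL control state. */
typedef struct ToyLogwrtResult
{
    uint64_t Write;     /* last position written out */
    uint64_t Flush;     /* last position known to be fsync'd */
} ToyLogwrtResult;

typedef struct ToyXLogCtl
{
    atomic_flag info_lck;           /* protects LogwrtResult */
    ToyLogwrtResult LogwrtResult;
} ToyXLogCtl;

void toy_init(ToyXLogCtl *ctl)
{
    atomic_flag_clear(&ctl->info_lck);
    ctl->LogwrtResult.Write = 0;
    ctl->LogwrtResult.Flush = 0;
}

void toy_lock(atomic_flag *lck)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(lck, memory_order_acquire))
    {
    }
}

void toy_unlock(atomic_flag *lck)
{
    atomic_flag_clear_explicit(lck, memory_order_release);
}

/* Writer side: publish new positions; flush never exceeds write. */
void toy_advance(ToyXLogCtl *ctl, uint64_t write_pos, uint64_t flush_pos)
{
    assert(flush_pos <= write_pos);
    toy_lock(&ctl->info_lck);
    ctl->LogwrtResult.Write = write_pos;
    ctl->LogwrtResult.Flush = flush_pos;
    toy_unlock(&ctl->info_lck);
}

/* Reader side: copy the flush position while holding the lock. */
uint64_t toy_get_flush_pos(ToyXLogCtl *ctl)
{
    uint64_t recptr;

    toy_lock(&ctl->info_lck);
    recptr = ctl->LogwrtResult.Flush;
    toy_unlock(&ctl->info_lck);
    return recptr;
}
```

The lock is held only long enough to copy one field, which keeps contention low when many readers poll the position.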

src/backend/replication/walsender.c

Lines changed: 13 additions & 11 deletions
@@ -3,8 +3,9 @@
  * walsender.c
  *
  * The WAL sender process (walsender) is new as of Postgres 9.0. It takes
- * charge of XLOG streaming sender in the primary server. At first, it is
- * started by the postmaster when the walreceiver in the standby server
+ * care of sending XLOG from the primary server to a single recipient.
+ * (Note that there can be more than one walsender process concurrently.)
+ * It is started by the postmaster when the walreceiver of a standby server
  * connects to the primary server and requests XLOG streaming replication.
  * It attempts to keep reading XLOG records from the disk and sending them
  * to the standby server, as long as the connection is alive (i.e., like
@@ -23,13 +24,11 @@
  * This instruct walsender to send any outstanding WAL, including the
  * shutdown checkpoint record, and then exit.
  *
- * Note that there can be more than one walsender process concurrently.
  *
  * Portions Copyright (c) 2010-2010, PostgreSQL Global Development Group
  *
- *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/replication/walsender.c,v 1.26 2010/06/03 23:00:14 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/replication/walsender.c,v 1.27 2010/06/17 16:41:25 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -641,7 +640,7 @@ XLogRead(char *buf, XLogRecPtr recptr, Size nbytes)
 }

 /*
- * Read up to MAX_SEND_SIZE bytes of WAL that's been written to disk,
+ * Read up to MAX_SEND_SIZE bytes of WAL that's been flushed to disk,
  * but not yet sent to the client, and send it.
  *
  * msgbuf is a work area in which the output message is constructed. It's
@@ -663,11 +662,14 @@ XLogSend(char *msgbuf, bool *caughtup)
     WalDataMessageHeader msghdr;

     /*
-     * Attempt to send all data that's already been written out from WAL
-     * buffers (note it might not yet be fsync'd to disk). We cannot go
-     * further than that given the current implementation of XLogRead().
+     * Attempt to send all data that's already been written out and fsync'd
+     * to disk. We cannot go further than what's been written out given the
+     * current implementation of XLogRead(). And in any case it's unsafe to
+     * send WAL that is not securely down to disk on the master: if the master
+     * subsequently crashes and restarts, slaves must not have applied any WAL
+     * that gets lost on the master.
      */
-    SendRqstPtr = GetWriteRecPtr();
+    SendRqstPtr = GetFlushRecPtr();

     /* Quick exit if nothing to do */
     if (XLByteLE(SendRqstPtr, sentPtr))
@@ -679,7 +681,7 @@ XLogSend(char *msgbuf, bool *caughtup)
     /*
      * Figure out how much to send in one message. If there's no more than
      * MAX_SEND_SIZE bytes to send, send everything. Otherwise send
-     * MAX_SEND_SIZE bytes, but round back to logfile or page boundary.
+     * MAX_SEND_SIZE bytes, but round back to logfile or page boundary.
      *
      * The rounding is not only for performance reasons. Walreceiver
      * relies on the fact that we never split a WAL record across two
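
The "round back" wording in the last hunk can be made concrete with a small sketch of how a sender might pick the end of the next message. It assumes flat 64-bit positions, a hypothetical page size of 8192 bytes, and a hypothetical cap of 16 pages per message. `toy_send_end` and both constants are invented for illustration and are not the actual `XLogSend()` logic. Only the page boundary case is modeled; logfile boundaries are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants, chosen only for this illustration. */
#define TOY_PAGE_SIZE      8192u
#define TOY_MAX_SEND_SIZE  (16u * TOY_PAGE_SIZE)

/*
 * Given the position already sent and the flush position, return the
 * position up to which the next message should extend.
 */
uint64_t toy_send_end(uint64_t sent_pos, uint64_t flush_pos)
{
    uint64_t end;

    /* Quick exit: nothing flushed beyond what was already sent. */
    if (flush_pos <= sent_pos)
        return sent_pos;

    /* Tentatively take a full message, then round back to a page start. */
    end = sent_pos + TOY_MAX_SEND_SIZE;
    end -= end % TOY_PAGE_SIZE;

    /* Never go past what is safely on disk. */
    if (flush_pos < end)
        end = flush_pos;

    return end;
}
```

Because the cap is at least one page, rounding back can never move the end to or before `sent_pos`, so each call makes progress whenever flushed data is pending.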

src/include/access/xlog.h

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@
  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/include/access/xlog.h,v 1.112 2010/06/10 07:49:23 heikki Exp $
+ * $PostgreSQL: pgsql/src/include/access/xlog.h,v 1.113 2010/06/17 16:41:25 tgl Exp $
  */
 #ifndef XLOG_H
 #define XLOG_H
@@ -294,7 +294,7 @@ extern bool CreateRestartPoint(int flags);
 extern void XLogPutNextOid(Oid nextOid);
 extern XLogRecPtr GetRedoRecPtr(void);
 extern XLogRecPtr GetInsertRecPtr(void);
-extern XLogRecPtr GetWriteRecPtr(void);
+extern XLogRecPtr GetFlushRecPtr(void);
 extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
 extern TimeLineID GetRecoveryTargetTLI(void);
