
Commit f32678c

Reduce delay for last logicalrep feedback message when master goes idle.
The regression tests contain numerous cases where we do some activity on a master server and then wait till the slave has ack'd flushing its copy of that transaction. Because WAL flush on the slave is asynchronous to the logicalrep worker process, the worker cannot send such a feedback message during the LogicalRepApplyLoop iteration where it processes the last data from the master. In the previous coding, the feedback message would come out only when the loop's WaitLatchOrSocket call returned WL_TIMEOUT. That requires one full second of delay (NAPTIME_PER_CYCLE); and to add insult to injury, it could take more than that if the WaitLatchOrSocket was interrupted a few times by latch-setting events.

In reality we can expect the slave's walwriter process to have flushed the WAL data after, more or less, WalWriterDelay (typically 200ms). Hence, if there are unacked transactions pending, make the wait delay only that long rather than the full NAPTIME_PER_CYCLE. Also, move one of the send_feedback() calls into the loop main line, so that we'll check for the need to send feedback even if we were woken by a latch event and not either socket data or timeout.

It's not clear how much this matters for production purposes, but it's definitely helpful for testing.

Discussion: https://postgr.es/m/30864.1498861103@sss.pgh.pa.us
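
The timeout choice described in the message can be sketched in isolation. This is a minimal standalone illustration, not the worker.c code: choose_wait_time_ms() is a hypothetical helper, the have_unflushed flag stands in for the worker's check for pending unacked transactions, and the two constants simply reuse the figures quoted in the message (one second, and a typical 200ms for WalWriterDelay, which is configurable in a real server).

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only, taken from the figures in the commit message. */
#define NAPTIME_PER_CYCLE_MS 1000L  /* "one full second" */
#define WAL_WRITER_DELAY_MS   200L  /* "typically 200ms"; an assumption here */

/*
 * Hypothetical helper: pick the sleep time for one apply-loop iteration.
 * have_unflushed is true when some applied transactions have not yet
 * been acknowledged as flushed.
 */
long
choose_wait_time_ms(bool have_unflushed, long wal_writer_delay_ms)
{
    assert(wal_writer_delay_ms > 0);

    /* Pending acks: wake up about when the walwriter should have flushed. */
    if (have_unflushed)
        return wal_writer_delay_ms;

    /* Nothing pending: no urgency, use the normal nap time. */
    return NAPTIME_PER_CYCLE_MS;
}
```

Under these assumptions, a worker with pending acks sleeps for 200ms instead of 1000ms before it next gets a chance to report the flush.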
1 parent 799f8bc commit f32678c

1 file changed: +16 -5 lines changed

src/backend/replication/logical/worker.c

Lines changed: 16 additions & 5 deletions
@@ -52,6 +52,7 @@
 
 #include "postmaster/bgworker.h"
 #include "postmaster/postmaster.h"
+#include "postmaster/walwriter.h"
 
 #include "replication/decode.h"
 #include "replication/logical.h"
@@ -1027,6 +1028,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
         bool        endofstream = false;
         TimestampTz last_recv_timestamp = GetCurrentTimestamp();
         bool        ping_sent = false;
+        long        wait_time;
 
         CHECK_FOR_INTERRUPTS();
 
@@ -1114,11 +1116,11 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
 
                 len = walrcv_receive(wrconn, &buf, &fd);
             }
-
-            /* confirm all writes at once */
-            send_feedback(last_received, false, false);
         }
 
+        /* confirm all writes so far */
+        send_feedback(last_received, false, false);
+
         if (!in_remote_transaction)
         {
             /*
@@ -1147,12 +1149,21 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
         }
 
         /*
-         * Wait for more data or latch.
+         * Wait for more data or latch. If we have unflushed transactions,
+         * wake up after WalWriterDelay to see if they've been flushed yet (in
+         * which case we should send a feedback message). Otherwise, there's
+         * no particular urgency about waking up unless we get data or a
+         * signal.
          */
+        if (!dlist_is_empty(&lsn_mapping))
+            wait_time = WalWriterDelay;
+        else
+            wait_time = NAPTIME_PER_CYCLE;
+
         rc = WaitLatchOrSocket(MyLatch,
                                WL_SOCKET_READABLE | WL_LATCH_SET |
                                WL_TIMEOUT | WL_POSTMASTER_DEATH,
-                               fd, NAPTIME_PER_CYCLE,
+                               fd, wait_time,
                                WAIT_EVENT_LOGICAL_APPLY_MAIN);
 
         /* Emergency bailout if postmaster has died */
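
The effect of moving send_feedback() into the loop main line can be illustrated with a toy model of wake-ups. This is a rough sketch under stated assumptions, not PostgreSQL code: WakeReason, WakeEvent, checks_feedback() and first_useful_feedback_ms() are hypothetical names, the wake-up reasons are simplified to three cases, and the model assumes that any feedback check made at or after the flush time produces the ack the master is waiting for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified reasons for the apply loop waking up (an assumption). */
typedef enum { WAKE_SOCKET_DATA, WAKE_LATCH, WAKE_TIMEOUT } WakeReason;

typedef struct
{
    WakeReason reason;  /* why the wait returned */
    long       at_ms;   /* wake-up time, in ms since the last data was applied */
} WakeEvent;

/* Does the loop reach a send_feedback() call after this kind of wake-up? */
bool
checks_feedback(WakeReason reason, bool new_coding)
{
    if (new_coding)
        return true;    /* the call sits in the loop main line */

    /* Previous coding: only after receiving data, or on timeout. */
    return reason == WAKE_SOCKET_DATA || reason == WAKE_TIMEOUT;
}

/*
 * Time of the first wake-up at or after flush_done_ms at which feedback
 * would be checked, or -1 if none of the listed wake-ups qualifies.
 */
long
first_useful_feedback_ms(const WakeEvent *events, size_t n,
                         long flush_done_ms, bool new_coding)
{
    assert(events != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (events[i].at_ms >= flush_done_ms &&
            checks_feedback(events[i].reason, new_coding))
            return events[i].at_ms;
    }
    return -1;
}
```

For a made-up sequence of latch wake-ups at 50ms and 300ms followed by a timeout at 1300ms, with the flush done at 200ms, this model reports the first useful check at 1300ms for the previous coding and at 300ms for the new one. That matches the message's point that latch events used to push the feedback out past one second.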
