
Commit 1861b20

Correct base backup throttling
Throttling for sending a base backup in walsender is broken for the case where there is a lot of WAL traffic, because the latch used to put the walsender to sleep is also signalled by regular WAL traffic (and each signal causes an additional batch of data to be sent); the net effect is that there is little or no actual throttling. This is undesirable, so rewrite the sleep into a loop to achieve the desired effect.

Author: Jeff Janes, small tweaks by me
Reviewed-by: Antonin Houska
Discussion: https://postgr.es/m/CAMkU=1xH6mde-yL-Eo1TKBGNd0PB1-TMxvrNvqcAkN-qr2E9mw@mail.gmail.com
1 parent a1af1e7 commit 1861b20

1 file changed: +26 -11 lines changed

src/backend/replication/basebackup.c

+26-11
@@ -1336,10 +1336,7 @@ _tarWriteDir(const char *pathbuf, int basepathlen, struct stat *statbuf,
 static void
 throttle(size_t increment)
 {
-    TimeOffset  elapsed,
-                elapsed_min,
-                sleep;
-    int         wait_result;
+    TimeOffset  elapsed_min;
 
     if (throttling_counter < 0)
         return;
@@ -1348,14 +1345,28 @@ throttle(size_t increment)
     if (throttling_counter < throttling_sample)
         return;
 
-    /* Time elapsed since the last measurement (and possible wake up). */
-    elapsed = GetCurrentTimestamp() - throttled_last;
-    /* How much should have elapsed at minimum? */
-    elapsed_min = elapsed_min_unit * (throttling_counter / throttling_sample);
-    sleep = elapsed_min - elapsed;
-    /* Only sleep if the transfer is faster than it should be. */
-    if (sleep > 0)
+    /* How much time should have elapsed at minimum? */
+    elapsed_min = elapsed_min_unit *
+        (throttling_counter / throttling_sample);
+
+    /*
+     * Since the latch could be set repeatedly because of concurrently WAL
+     * activity, sleep in a loop to ensure enough time has passed.
+     */
+    for (;;)
     {
+        TimeOffset  elapsed,
+                    sleep;
+        int         wait_result;
+
+        /* Time elapsed since the last measurement (and possible wake up). */
+        elapsed = GetCurrentTimestamp() - throttled_last;
+
+        /* sleep if the transfer is faster than it should be */
+        sleep = elapsed_min - elapsed;
+        if (sleep <= 0)
+            break;
+
         ResetLatch(MyLatch);
 
         /* We're eating a potentially set latch, so check for interrupts */
@@ -1372,6 +1383,10 @@ throttle(size_t increment)
 
         if (wait_result & WL_LATCH_SET)
             CHECK_FOR_INTERRUPTS();
+
+        /* Done waiting? */
+        if (wait_result & WL_TIMEOUT)
+            break;
     }
 
     /*
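
To illustrate why a single wait throttles poorly when the latch is set by unrelated WAL activity, here is a minimal standalone sketch of the two strategies. It is not PostgreSQL code: `SimClock`, `sim_wait`, `throttle_once`, `throttle_loop`, and the `WAIT_*` flags are hypothetical stand-ins for the clock, `WaitLatch`, and the `WL_*` result bits, and the simulated early wakeup (returning after one tenth of the requested timeout) is an assumption chosen only to make the effect visible.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these are PostgreSQL APIs. */
typedef int64_t usec_t;

#define WAIT_LATCH_SET 1   /* woken early by a (simulated) latch signal */
#define WAIT_TIMEOUT   2   /* the full timeout elapsed */

typedef struct SimClock
{
    usec_t now;            /* simulated current time, in microseconds */
    int    spurious_left;  /* early wakeups still to be delivered */
} SimClock;

/*
 * Simulated wait: while early wakeups remain, return after only a tenth
 * of the requested timeout and report WAIT_LATCH_SET; otherwise advance
 * the clock by the whole timeout and report WAIT_TIMEOUT.
 */
static int
sim_wait(SimClock *clk, usec_t timeout)
{
    if (clk->spurious_left > 0)
    {
        clk->spurious_left--;
        clk->now += timeout / 10;
        return WAIT_LATCH_SET;
    }
    clk->now += timeout;
    return WAIT_TIMEOUT;
}

/* Old shape: wait once, whatever the reason for waking up. */
static usec_t
throttle_once(SimClock *clk, usec_t last, usec_t elapsed_min)
{
    usec_t remaining = elapsed_min - (clk->now - last);

    if (remaining > 0)
        (void) sim_wait(clk, remaining);
    return clk->now - last;    /* time actually spent */
}

/* New shape: recompute the remaining time and wait again until done. */
static usec_t
throttle_loop(SimClock *clk, usec_t last, usec_t elapsed_min)
{
    for (;;)
    {
        usec_t remaining = elapsed_min - (clk->now - last);

        if (remaining <= 0)
            break;              /* enough time has already passed */
        if (sim_wait(clk, remaining) & WAIT_TIMEOUT)
            break;              /* slept for the full remaining time */
    }
    return clk->now - last;    /* time actually spent */
}
```

With three simulated early wakeups and a minimum of 1000 microseconds, the single-wait version returns after 100 microseconds, while the loop keeps waiting until the full 1000 microseconds have passed.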
