Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit cbab940

Browse files
committed
Server-side fix for delayed NOTIFY and SIGTERM processing.
Commit 4f85fde introduced some code that was meant to ensure that we'd process cancel, die, sinval catchup, and notify interrupts while waiting for client input. But there was a flaw: it supposed that the process latch would be set upon arrival at secure_read() if any such interrupt was pending. In reality, we might well have cleared the process latch at some earlier point while those flags remained set -- particularly notifyInterruptPending, which can't be handled as long as we're within a transaction. To fix the NOTIFY case, also attempt to process signals (except ProcDiePending) before trying to read. Also, if we see that ProcDiePending is set before we read, forcibly set the process latch to ensure that we will handle that signal promptly if no data is available. I also made it set the process latch on the way out, in case there is similar logic elsewhere. (It remains true that we won't service ProcDiePending here unless we need to wait for input.) The code for handling ProcDiePending during a write needs those changes, too. Also be a little more careful about when to reset whereToSendOutput, and improve related comments. Back-patch to 9.5 where this code was added. I'm not entirely convinced that older branches don't have similar issues, but the complaint at hand is just about the >= 9.5 code. Jeff Janes and Tom Lane Discussion: https://postgr.es/m/CAOYf6ec-TmRYjKBXLLaGaB-jrd=mjG1Hzn1a1wufUAR39PQYhw@mail.gmail.com
1 parent 5dfb53f commit cbab940

File tree

2 files changed

+62
-35
lines changed

2 files changed

+62
-35
lines changed

src/backend/libpq/be-secure.c

Lines changed: 12 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -123,6 +123,9 @@ secure_read(Port *port, void *ptr, size_t len)
123123
ssize_t n;
124124
int waitfor;
125125

126+
/* Deal with any already-pending interrupt condition. */
127+
ProcessClientReadInterrupt(false);
128+
126129
retry:
127130
#ifdef USE_SSL
128131
waitfor = 0;
@@ -186,9 +189,8 @@ secure_read(Port *port, void *ptr, size_t len)
186189
}
187190

188191
/*
189-
* Process interrupts that happened while (or before) receiving. Note that
190-
* we signal that we're not blocking, which will prevent some types of
191-
* interrupts from being processed.
192+
* Process interrupts that happened during a successful (or non-blocking,
193+
* or hard-failed) read.
192194
*/
193195
ProcessClientReadInterrupt(false);
194196

@@ -225,6 +227,9 @@ secure_write(Port *port, void *ptr, size_t len)
225227
ssize_t n;
226228
int waitfor;
227229

230+
/* Deal with any already-pending interrupt condition. */
231+
ProcessClientWriteInterrupt(false);
232+
228233
retry:
229234
waitfor = 0;
230235
#ifdef USE_SSL
@@ -263,17 +268,16 @@ secure_write(Port *port, void *ptr, size_t len)
263268

264269
/*
265270
* We'll retry the write. Most likely it will return immediately
266-
* because there's still no data available, and we'll wait for the
267-
* socket to become ready again.
271+
* because there's still no buffer space available, and we'll wait
272+
* for the socket to become ready again.
268273
*/
269274
}
270275
goto retry;
271276
}
272277

273278
/*
274-
* Process interrupts that happened while (or before) sending. Note that
275-
* we signal that we're not blocking, which will prevent some types of
276-
* interrupts from being processed.
279+
* Process interrupts that happened during a successful (or non-blocking,
280+
* or hard-failed) write.
277281
*/
278282
ProcessClientWriteInterrupt(false);
279283

src/backend/tcop/postgres.c

Lines changed: 50 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -300,7 +300,7 @@ interactive_getc(void)
300300

301301
c = getc(stdin);
302302

303-
ProcessClientReadInterrupt(true);
303+
ProcessClientReadInterrupt(false);
304304

305305
return c;
306306
}
@@ -505,8 +505,9 @@ ReadCommand(StringInfo inBuf)
505505
/*
506506
* ProcessClientReadInterrupt() - Process interrupts specific to client reads
507507
*
508-
* This is called just after low-level reads. That might be after the read
509-
* finished successfully, or it was interrupted via interrupt.
508+
* This is called just before and after low-level reads.
509+
* 'blocked' is true if no data was available to read and we plan to retry,
510+
* false if about to read or done reading.
510511
*
511512
* Must preserve errno!
512513
*/
@@ -517,23 +518,31 @@ ProcessClientReadInterrupt(bool blocked)
517518

518519
if (DoingCommandRead)
519520
{
520-
/* Check for general interrupts that arrived while reading */
521+
/* Check for general interrupts that arrived before/while reading */
521522
CHECK_FOR_INTERRUPTS();
522523

523-
/* Process sinval catchup interrupts that happened while reading */
524+
/* Process sinval catchup interrupts, if any */
524525
if (catchupInterruptPending)
525526
ProcessCatchupInterrupt();
526527

527-
/* Process sinval catchup interrupts that happened while reading */
528+
/* Process notify interrupts, if any */
528529
if (notifyInterruptPending)
529530
ProcessNotifyInterrupt();
530531
}
531-
else if (ProcDiePending && blocked)
532+
else if (ProcDiePending)
532533
{
533534
/*
534-
* We're dying. It's safe (and sane) to handle that now.
535+
* We're dying. If there is no data available to read, then it's safe
536+
* (and sane) to handle that now. If we haven't tried to read yet,
537+
* make sure the process latch is set, so that if there is no data
538+
* then we'll come back here and die. If we're done reading, also
539+
* make sure the process latch is set, as we might've undesirably
540+
* cleared it while reading.
535541
*/
536-
CHECK_FOR_INTERRUPTS();
542+
if (blocked)
543+
CHECK_FOR_INTERRUPTS();
544+
else
545+
SetLatch(MyLatch);
537546
}
538547

539548
errno = save_errno;
@@ -542,9 +551,9 @@ ProcessClientReadInterrupt(bool blocked)
542551
/*
543552
* ProcessClientWriteInterrupt() - Process interrupts specific to client writes
544553
*
545-
* This is called just after low-level writes. That might be after the read
546-
* finished successfully, or it was interrupted via interrupt. 'blocked' tells
547-
* us whether the
554+
* This is called just before and after low-level writes.
555+
* 'blocked' is true if no data could be written and we plan to retry,
556+
* false if about to write or done writing.
548557
*
549558
* Must preserve errno!
550559
*/
@@ -553,25 +562,39 @@ ProcessClientWriteInterrupt(bool blocked)
553562
{
554563
int save_errno = errno;
555564

556-
/*
557-
* We only want to process the interrupt here if socket writes are
558-
* blocking to increase the chance to get an error message to the client.
559-
* If we're not blocked there'll soon be a CHECK_FOR_INTERRUPTS(). But if
560-
* we're blocked we'll never get out of that situation if the client has
561-
* died.
562-
*/
563-
if (ProcDiePending && blocked)
565+
if (ProcDiePending)
564566
{
565567
/*
566-
* We're dying. It's safe (and sane) to handle that now. But we don't
567-
* want to send the client the error message as that a) would possibly
568-
* block again b) would possibly lead to sending an error message to
569-
* the client, while we already started to send something else.
568+
* We're dying. If it's not possible to write, then we should handle
569+
* that immediately, else a stuck client could indefinitely delay our
570+
* response to the signal. If we haven't tried to write yet, make
571+
* sure the process latch is set, so that if the write would block
572+
* then we'll come back here and die. If we're done writing, also
573+
* make sure the process latch is set, as we might've undesirably
574+
* cleared it while writing.
570575
*/
571-
if (whereToSendOutput == DestRemote)
572-
whereToSendOutput = DestNone;
576+
if (blocked)
577+
{
578+
/*
579+
* Don't mess with whereToSendOutput if ProcessInterrupts wouldn't
580+
* do anything.
581+
*/
582+
if (InterruptHoldoffCount == 0 && CritSectionCount == 0)
583+
{
584+
/*
585+
* We don't want to send the client the error message, as a)
586+
* that would possibly block again, and b) it would likely
587+
* lead to loss of protocol sync because we may have already
588+
* sent a partial protocol message.
589+
*/
590+
if (whereToSendOutput == DestRemote)
591+
whereToSendOutput = DestNone;
573592

574-
CHECK_FOR_INTERRUPTS();
593+
CHECK_FOR_INTERRUPTS();
594+
}
595+
}
596+
else
597+
SetLatch(MyLatch);
575598
}
576599

577600
errno = save_errno;

0 commit comments

Comments
 (0)