Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 79f17d4

Browse files
committed
Don't run atexit callbacks in quickdie signal handlers.
exit() is not async-signal safe. Even if the libc implementation is, 3rd party libraries might have installed unsafe atexit() callbacks. After receiving SIGQUIT, we really just want to exit as quickly as possible, so we don't really want to run the atexit() callbacks anyway. The original report by Jimmy Yih was a self-deadlock in startup_die(). However, this patch doesn't address that scenario; the signal handling while waiting for the startup packet is more complicated. But at least this alleviates similar problems in the SIGQUIT handlers, like that reported by Asim R P later in the same thread. Backpatch to 9.3 (all supported versions). Discussion: https://www.postgresql.org/message-id/CAOMx_OAuRUHiAuCg2YgicZLzPVv5d9_H4KrL_OFsFP%3DVPekigA%40mail.gmail.com
1 parent a3deecb commit 79f17d4

File tree

7 files changed

+77
-108
lines changed

7 files changed

+77
-108
lines changed

src/backend/postmaster/bgworker.c

Lines changed: 9 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -644,28 +644,21 @@ SanityCheckBackgroundWorker(BackgroundWorker *worker, int elevel)
644644
static void
645645
bgworker_quickdie(SIGNAL_ARGS)
646646
{
647-
sigaddset(&BlockSig, SIGQUIT); /* prevent nested calls */
648-
PG_SETMASK(&BlockSig);
649-
650-
/*
651-
* We DO NOT want to run proc_exit() callbacks -- we're here because
652-
* shared memory may be corrupted, so we don't want to try to clean up our
653-
* transaction. Just nail the windows shut and get out of town. Now that
654-
* there's an atexit callback to prevent third-party code from breaking
655-
* things by calling exit() directly, we have to reset the callbacks
656-
* explicitly to make this work as intended.
657-
*/
658-
on_exit_reset();
659-
660647
/*
661-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
662-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
648+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
649+
* because shared memory may be corrupted, so we don't want to try to
650+
* clean up our transaction. Just nail the windows shut and get out of
651+
* town. The callbacks wouldn't be safe to run from a signal handler,
652+
* anyway.
653+
*
654+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
655+
* a system reset cycle if someone sends a manual SIGQUIT to a random
663656
* backend. This is necessary precisely because we don't clean up our
664657
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
665658
* should ensure the postmaster sees this as a crash, too, but no harm in
666659
* being doubly sure.)
667660
*/
668-
exit(2);
661+
_exit(2);
669662
}
670663

671664
/*

src/backend/postmaster/bgwriter.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -409,27 +409,21 @@ BackgroundWriterMain(void)
409409
static void
410410
bg_quickdie(SIGNAL_ARGS)
411411
{
412-
PG_SETMASK(&BlockSig);
413-
414412
/*
415-
* We DO NOT want to run proc_exit() callbacks -- we're here because
416-
* shared memory may be corrupted, so we don't want to try to clean up our
417-
* transaction. Just nail the windows shut and get out of town. Now that
418-
* there's an atexit callback to prevent third-party code from breaking
419-
* things by calling exit() directly, we have to reset the callbacks
420-
* explicitly to make this work as intended.
421-
*/
422-
on_exit_reset();
423-
424-
/*
425-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
426-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
413+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
414+
* because shared memory may be corrupted, so we don't want to try to
415+
* clean up our transaction. Just nail the windows shut and get out of
416+
* town. The callbacks wouldn't be safe to run from a signal handler,
417+
* anyway.
418+
*
419+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
420+
* a system reset cycle if someone sends a manual SIGQUIT to a random
427421
* backend. This is necessary precisely because we don't clean up our
428422
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
429423
* should ensure the postmaster sees this as a crash, too, but no harm in
430424
* being doubly sure.)
431425
*/
432-
exit(2);
426+
_exit(2);
433427
}
434428

435429
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/postmaster/checkpointer.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -823,27 +823,21 @@ IsCheckpointOnSchedule(double progress)
823823
static void
824824
chkpt_quickdie(SIGNAL_ARGS)
825825
{
826-
PG_SETMASK(&BlockSig);
827-
828826
/*
829-
* We DO NOT want to run proc_exit() callbacks -- we're here because
830-
* shared memory may be corrupted, so we don't want to try to clean up our
831-
* transaction. Just nail the windows shut and get out of town. Now that
832-
* there's an atexit callback to prevent third-party code from breaking
833-
* things by calling exit() directly, we have to reset the callbacks
834-
* explicitly to make this work as intended.
835-
*/
836-
on_exit_reset();
837-
838-
/*
839-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
840-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
827+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
828+
* because shared memory may be corrupted, so we don't want to try to
829+
* clean up our transaction. Just nail the windows shut and get out of
830+
* town. The callbacks wouldn't be safe to run from a signal handler,
831+
* anyway.
832+
*
833+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
834+
* a system reset cycle if someone sends a manual SIGQUIT to a random
841835
* backend. This is necessary precisely because we don't clean up our
842836
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
843837
* should ensure the postmaster sees this as a crash, too, but no harm in
844838
* being doubly sure.)
845839
*/
846-
exit(2);
840+
_exit(2);
847841
}
848842

849843
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/postmaster/startup.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -69,27 +69,21 @@ static void StartupProcSigHupHandler(SIGNAL_ARGS);
6969
static void
7070
startupproc_quickdie(SIGNAL_ARGS)
7171
{
72-
PG_SETMASK(&BlockSig);
73-
74-
/*
75-
* We DO NOT want to run proc_exit() callbacks -- we're here because
76-
* shared memory may be corrupted, so we don't want to try to clean up our
77-
* transaction. Just nail the windows shut and get out of town. Now that
78-
* there's an atexit callback to prevent third-party code from breaking
79-
* things by calling exit() directly, we have to reset the callbacks
80-
* explicitly to make this work as intended.
81-
*/
82-
on_exit_reset();
83-
8472
/*
85-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
86-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
73+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
74+
* because shared memory may be corrupted, so we don't want to try to
75+
* clean up our transaction. Just nail the windows shut and get out of
76+
* town. The callbacks wouldn't be safe to run from a signal handler,
77+
* anyway.
78+
*
79+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
80+
* a system reset cycle if someone sends a manual SIGQUIT to a random
8781
* backend. This is necessary precisely because we don't clean up our
8882
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
8983
* should ensure the postmaster sees this as a crash, too, but no harm in
9084
* being doubly sure.)
9185
*/
92-
exit(2);
86+
_exit(2);
9387
}
9488

9589

src/backend/postmaster/walwriter.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -319,27 +319,21 @@ WalWriterMain(void)
319319
static void
320320
wal_quickdie(SIGNAL_ARGS)
321321
{
322-
PG_SETMASK(&BlockSig);
323-
324322
/*
325-
* We DO NOT want to run proc_exit() callbacks -- we're here because
326-
* shared memory may be corrupted, so we don't want to try to clean up our
327-
* transaction. Just nail the windows shut and get out of town. Now that
328-
* there's an atexit callback to prevent third-party code from breaking
329-
* things by calling exit() directly, we have to reset the callbacks
330-
* explicitly to make this work as intended.
331-
*/
332-
on_exit_reset();
333-
334-
/*
335-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
336-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
323+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
324+
* because shared memory may be corrupted, so we don't want to try to
325+
* clean up our transaction. Just nail the windows shut and get out of
326+
* town. The callbacks wouldn't be safe to run from a signal handler,
327+
* anyway.
328+
*
329+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
330+
* a system reset cycle if someone sends a manual SIGQUIT to a random
337331
* backend. This is necessary precisely because we don't clean up our
338332
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
339333
* should ensure the postmaster sees this as a crash, too, but no harm in
340334
* being doubly sure.)
341335
*/
342-
exit(2);
336+
_exit(2);
343337
}
344338

345339
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/replication/walreceiver.c

Lines changed: 13 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -860,27 +860,21 @@ WalRcvShutdownHandler(SIGNAL_ARGS)
860860
static void
861861
WalRcvQuickDieHandler(SIGNAL_ARGS)
862862
{
863-
PG_SETMASK(&BlockSig);
864-
865863
/*
866-
* We DO NOT want to run proc_exit() callbacks -- we're here because
867-
* shared memory may be corrupted, so we don't want to try to clean up our
868-
* transaction. Just nail the windows shut and get out of town. Now that
869-
* there's an atexit callback to prevent third-party code from breaking
870-
* things by calling exit() directly, we have to reset the callbacks
871-
* explicitly to make this work as intended.
872-
*/
873-
on_exit_reset();
874-
875-
/*
876-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
877-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
878-
* backend. This is necessary precisely because we don't clean up our
879-
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
880-
* should ensure the postmaster sees this as a crash, too, but no harm in
881-
* being doubly sure.)
864+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
865+
* because shared memory may be corrupted, so we don't want to try to
866+
* clean up our transaction. Just nail the windows shut and get out of
867+
* town. The callbacks wouldn't be safe to run from a signal handler,
868+
* anyway.
869+
*
870+
* Note we use _exit(2) not _exit(0). This is to force the postmaster
871+
* into a system reset cycle if someone sends a manual SIGQUIT to a
872+
* random backend. This is necessary precisely because we don't clean up
873+
* our shared memory state. (The "dead man switch" mechanism in
874+
* pmsignal.c should ensure the postmaster sees this as a crash, too, but
875+
* no harm in being doubly sure.)
882876
*/
883-
exit(2);
877+
_exit(2);
884878
}
885879

886880
/*

src/backend/tcop/postgres.c

Lines changed: 19 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -2616,6 +2616,16 @@ quickdie(SIGNAL_ARGS)
26162616
whereToSendOutput = DestNone;
26172617

26182618
/*
2619+
* Notify the client before exiting, to give a clue on what happened.
2620+
*
2621+
* It's dubious to call ereport() from a signal handler. It is certainly
2622+
* not async-signal safe. But it seems better to try, than to disconnect
2623+
* abruptly and leave the client wondering what happened. It's remotely
2624+
* possible that we crash or hang while trying to send the message, but
2625+
* receiving a SIGQUIT is a sign that something has already gone badly
2626+
* wrong, so there's not much to lose. Assuming the postmaster is still
2627+
* running, it will SIGKILL us soon if we get stuck for some reason.
2628+
*
26192629
* Ideally this should be ereport(FATAL), but then we'd not get control
26202630
* back...
26212631
*/
@@ -2630,24 +2640,20 @@ quickdie(SIGNAL_ARGS)
26302640
" database and repeat your command.")));
26312641

26322642
/*
2633-
* We DO NOT want to run proc_exit() callbacks -- we're here because
2634-
* shared memory may be corrupted, so we don't want to try to clean up our
2635-
* transaction. Just nail the windows shut and get out of town. Now that
2636-
* there's an atexit callback to prevent third-party code from breaking
2637-
* things by calling exit() directly, we have to reset the callbacks
2638-
* explicitly to make this work as intended.
2639-
*/
2640-
on_exit_reset();
2641-
2642-
/*
2643-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
2644-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
2643+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
2644+
* because shared memory may be corrupted, so we don't want to try to
2645+
* clean up our transaction. Just nail the windows shut and get out of
2646+
* town. The callbacks wouldn't be safe to run from a signal handler,
2647+
* anyway.
2648+
*
2649+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
2650+
* a system reset cycle if someone sends a manual SIGQUIT to a random
26452651
* backend. This is necessary precisely because we don't clean up our
26462652
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
26472653
* should ensure the postmaster sees this as a crash, too, but no harm in
26482654
* being doubly sure.)
26492655
*/
2650-
exit(2);
2656+
_exit(2);
26512657
}
26522658

26532659
/*

0 commit comments

Comments
 (0)