
Commit 45e004f

On Windows, retry process creation if we fail to reserve shared memory.
We've heard occasional reports of backend launch failing because pgwin32_ReserveSharedMemoryRegion() fails, indicating that something has already used that address space in the child process. It's not very clear what, given that we disable ASLR in Windows builds, but suspicion falls on antivirus products. It'd be better if we didn't have to disable ASLR, anyway. So let's try to ameliorate the problem by retrying the process launch after such a failure, up to 100 times.

Patch by me, based on previous work by Amit Kapila and others. This is a longstanding issue, so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com
1 parent d137a6d commit 45e004f

1 file changed, +15 -7 lines changed


src/backend/postmaster/postmaster.c

@@ -4510,6 +4510,7 @@ internal_forkexec(int argc, char *argv[], Port *port)
 static pid_t
 internal_forkexec(int argc, char *argv[], Port *port)
 {
+	int			retry_count = 0;
 	STARTUPINFO si;
 	PROCESS_INFORMATION pi;
 	int			i;
@@ -4527,6 +4528,9 @@ internal_forkexec(int argc, char *argv[], Port *port)
 	Assert(strncmp(argv[1], "--fork", 6) == 0);
 	Assert(argv[2] == NULL);
 
+	/* Resume here if we need to retry */
+retry:
+
 	/* Set up shared memory for parameter passing */
 	ZeroMemory(&sa, sizeof(sa));
 	sa.nLength = sizeof(sa);
@@ -4618,22 +4622,26 @@ internal_forkexec(int argc, char *argv[], Port *port)
 
 	/*
 	 * Reserve the memory region used by our main shared memory segment before
-	 * we resume the child process.
+	 * we resume the child process. Normally this should succeed, but if ASLR
+	 * is active then it might sometimes fail due to the stack or heap having
+	 * gotten mapped into that range. In that case, just terminate the
+	 * process and retry.
 	 */
 	if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
 	{
-		/*
-		 * Failed to reserve the memory, so terminate the newly created
-		 * process and give up.
-		 */
+		/* pgwin32_ReserveSharedMemoryRegion already made a log entry */
 		if (!TerminateProcess(pi.hProcess, 255))
 			ereport(LOG,
 					(errmsg_internal("could not terminate process that failed to reserve memory: error code %lu",
 									 GetLastError())));
 		CloseHandle(pi.hProcess);
 		CloseHandle(pi.hThread);
-		return -1;				/* logging done made by
-								 * pgwin32_ReserveSharedMemoryRegion() */
+		if (++retry_count < 100)
+			goto retry;
+		ereport(LOG,
+				(errmsg("giving up after too many tries to reserve shared memory"),
+				 errhint("This might be caused by ASLR or antivirus software.")));
+		return -1;
 	}
 
 	/*
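
The bounded-retry control flow in this patch can be illustrated in isolation. The following is a minimal, portable sketch, not the PostgreSQL code: `launch_with_retry`, `launch_attempt_fn`, `fail_n_times`, and `MAX_LAUNCH_TRIES` are hypothetical names introduced for illustration, and the simulated attempt stands in for "create the child process, then call pgwin32_ReserveSharedMemoryRegion()". It uses a loop rather than `goto retry`, but makes the same number of attempts (at most 100) before giving up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: mirrors the limit of 100 total attempts used in the patch. */
#define MAX_LAUNCH_TRIES 100

/* Hypothetical stand-in for one "create child and reserve memory" attempt. */
typedef bool (*launch_attempt_fn) (void *state);

/*
 * Call attempt() until it succeeds or max_tries attempts have been made.
 * Returns the number of attempts used (1..max_tries) on success, -1 on failure.
 */
static int
launch_with_retry(launch_attempt_fn attempt, void *state, int max_tries)
{
	int			tries;

	assert(attempt != NULL && max_tries > 0);

	for (tries = 1; tries <= max_tries; tries++)
	{
		if (attempt(state))
			return tries;

		/*
		 * In the patch, this is where the failed child is terminated and its
		 * handles are closed before the next attempt.
		 */
	}

	fprintf(stderr, "giving up after too many tries to reserve shared memory\n");
	return -1;
}

/* Simulated attempt: fails while the counter pointed to by state is positive. */
static bool
fail_n_times(void *state)
{
	int		   *remaining_failures = state;

	if (*remaining_failures > 0)
	{
		(*remaining_failures)--;
		return false;
	}
	return true;
}
```

With a counter initialized to 3, `launch_with_retry(fail_n_times, &counter, MAX_LAUNCH_TRIES)` succeeds on the fourth attempt; with a counter of 100 it exhausts all attempts and returns -1, which corresponds to the patch's "giving up" path.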

0 commit comments
