Commit af2c5aa

Improve performance of timezone loading, especially pg_timezone_names view.
tzparse() would attempt to load the "posixrules" timezone database file on each call. That might seem like it would only be an issue when selecting a POSIX-style zone name rather than a zone defined in the timezone database, but it turns out that each zone definition file contains a POSIX-style zone string and tzload() will call tzparse() to parse that. Thus, when scanning the whole timezone file tree as we do in the pg_timezone_names view, "posixrules" was read repetitively for each zone definition file. Fix that by caching the file on first use within any given process. (We cache other zone definitions for the life of the process, so there seems little reason not to cache this one as well.) This probably won't help much in processes that never run pg_timezone_names, but even one additional SET of the timezone GUC would come out ahead.

An even worse problem for pg_timezone_names is that pg_open_tzfile() has an inefficient way of identifying the canonical case of a zone name: it basically re-descends the directory tree to the zone file. That's not awful for an individual "SET timezone" operation, but it's pretty horrid when we're inspecting every zone in the database. And it's pointless too because we already know the canonical spelling, having just read it from the filesystem. Fix by teaching pg_open_tzfile() to avoid the directory search if it's not asked for the canonical name, and backfilling the proper result in pg_tzenumerate_next().

In combination these changes seem to make the pg_timezone_names view about 3x faster to read, for me. Since a scan of pg_timezone_names has up to now been one of the slowest queries in the regression tests, this should help some little bit for buildfarm cycle times.

Back-patch to all supported branches, not so much because it's likely that users will care much about the view's performance as because tracking changes in the upstream IANA timezone code is really painful if we don't keep all the branches in sync.

Discussion: https://postgr.es/m/27962.1493671706@sss.pgh.pa.us
1 parent 23c6eb0 commit af2c5aa

3 files changed, +60 -8 lines changed

src/timezone/README

+3
@@ -79,6 +79,9 @@ other exposed names.
 slightly modified the API of the former, in part because it now relies
 on our own pg_open_tzfile() rather than opening files for itself.
 
+* tzparse() is adjusted to cache the result of loading the TZDEFRULES
+zone, so that that's not repeated more than once per process.
+
 * There's a fair amount of code we don't need and have removed,
 including all the nonstandard optional APIs. We have also added
 a few functions of our own at the bottom of localtime.c.

src/timezone/localtime.c

+22 -1
@@ -54,6 +54,13 @@ static const char gmt[] = "GMT";
 static const pg_time_t time_t_min = MINVAL(pg_time_t, TYPE_BIT(pg_time_t));
 static const pg_time_t time_t_max = MAXVAL(pg_time_t, TYPE_BIT(pg_time_t));
 
+/*
+ * We cache the result of trying to load the TZDEFRULES zone here.
+ * tzdefrules_loaded is 0 if not tried yet, +1 if good, -1 if failed.
+ */
+static struct state tzdefrules_s;
+static int tzdefrules_loaded = 0;
+
 /*
  * The DST rules to use if TZ has no rules and we can't load TZDEFRULES.
  * We default to US rules as of 1999-08-17.
@@ -942,7 +949,21 @@ tzparse(const char *name, struct state * sp, bool lastditch)
         charcnt = stdlen + 1;
         if (sizeof sp->chars < charcnt)
             return false;
-        load_ok = tzload(TZDEFRULES, NULL, sp, false) == 0;
+
+        /*
+         * This bit also differs from the IANA code, which doesn't make any
+         * attempt to avoid repetitive loadings of the TZDEFRULES zone.
+         */
+        if (tzdefrules_loaded == 0)
+        {
+            if (tzload(TZDEFRULES, NULL, &tzdefrules_s, false) == 0)
+                tzdefrules_loaded = 1;
+            else
+                tzdefrules_loaded = -1;
+        }
+        load_ok = (tzdefrules_loaded > 0);
+        if (load_ok)
+            memcpy(sp, &tzdefrules_s, sizeof(struct state));
     }
     if (!load_ok)
         sp->leapcnt = 0; /* so, we're off a little */
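
The caching change in tzparse() follows a common "try once, remember the outcome" pattern. Below is a minimal standalone sketch of that pattern, not the PostgreSQL code itself: `struct rules`, `expensive_load()`, `load_calls`, and `get_default_rules()` are hypothetical names invented for illustration, standing in for `struct state`, `tzload(TZDEFRULES, ...)`, and the logic added to tzparse().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for "struct state"; the real struct is much larger. */
struct rules
{
    int  typecnt;
    char chars[16];
};

/* Counts how often the expensive load really runs (illustration only). */
int load_calls = 0;

/* Hypothetical stand-in for tzload(TZDEFRULES, ...); returns 0 on success. */
int
expensive_load(struct rules *out)
{
    load_calls++;
    out->typecnt = 2;
    strcpy(out->chars, "EST");
    return 0;
}

/* Cached result: 0 = not tried yet, +1 = loaded OK, -1 = load failed. */
static struct rules cached_rules;
static int cached_state = 0;

/*
 * Copy the default rules into *out, running the expensive load at most
 * once per process.  A failed load is remembered too, so it is not retried.
 * Returns true if the rules are available.
 */
bool
get_default_rules(struct rules *out)
{
    assert(out != NULL);

    if (cached_state == 0)
        cached_state = (expensive_load(&cached_rules) == 0) ? 1 : -1;

    if (cached_state < 0)
        return false;

    /* Hand the caller a private copy, as the patch does with memcpy(). */
    memcpy(out, &cached_rules, sizeof(struct rules));
    return true;
}
```

The three-valued flag is what lets a failed load be cached as well as a successful one; a plain boolean "loaded" flag would retry the file on every call whenever it is missing.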

src/timezone/pgtz.c

+35 -7
@@ -80,12 +80,37 @@ pg_open_tzfile(const char *name, char *canonname)
     int fullnamelen;
     int orignamelen;
 
+    /* Initialize fullname with base name of tzdata directory */
+    strlcpy(fullname, pg_TZDIR(), sizeof(fullname));
+    orignamelen = fullnamelen = strlen(fullname);
+
+    if (fullnamelen + 1 + strlen(name) >= MAXPGPATH)
+        return -1; /* not gonna fit */
+
+    /*
+     * If the caller doesn't need the canonical spelling, first just try to
+     * open the name as-is. This can be expected to succeed if the given name
+     * is already case-correct, or if the filesystem is case-insensitive; and
+     * we don't need to distinguish those situations if we aren't tasked with
+     * reporting the canonical spelling.
+     */
+    if (canonname == NULL)
+    {
+        int result;
+
+        fullname[fullnamelen] = '/';
+        /* test above ensured this will fit: */
+        strcpy(fullname + fullnamelen + 1, name);
+        result = open(fullname, O_RDONLY | PG_BINARY, 0);
+        if (result >= 0)
+            return result;
+        /* If that didn't work, fall through to do it the hard way */
+    }
+
     /*
      * Loop to split the given name into directory levels; for each level,
      * search using scan_directory_ci().
      */
-    strlcpy(fullname, pg_TZDIR(), sizeof(fullname));
-    orignamelen = fullnamelen = strlen(fullname);
     fname = name;
     for (;;)
     {
@@ -97,8 +122,6 @@ pg_open_tzfile(const char *name, char *canonname)
             fnamelen = slashptr - fname;
         else
             fnamelen = strlen(fname);
-        if (fullnamelen + 1 + fnamelen >= MAXPGPATH)
-            return -1; /* not gonna fit */
         if (!scan_directory_ci(fullname, fname, fnamelen,
                                fullname + fullnamelen + 1,
                                MAXPGPATH - fullnamelen - 1))
@@ -458,10 +481,11 @@ pg_tzenumerate_next(pg_tzenum *dir)
 
         /*
          * Load this timezone using tzload() not pg_tzset(), so we don't fill
-         * the cache
+         * the cache. Also, don't ask for the canonical spelling: we already
+         * know it, and pg_open_tzfile's way of finding it out is pretty
+         * inefficient.
          */
-        if (tzload(fullname + dir->baselen, dir->tz.TZname, &dir->tz.state,
-                   true) != 0)
+        if (tzload(fullname + dir->baselen, NULL, &dir->tz.state, true) != 0)
         {
             /* Zone could not be loaded, ignore it */
             continue;
@@ -473,6 +497,10 @@ pg_tzenumerate_next(pg_tzenum *dir)
             continue;
         }
 
+        /* OK, return the canonical zone name spelling. */
+        strlcpy(dir->tz.TZname, fullname + dir->baselen,
+                sizeof(dir->tz.TZname));
+
         /* Timezone loaded OK. */
         return &dir->tz;
     }
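
The pg_open_tzfile() change is a "cheap attempt first, expensive search only if needed" pattern. The sketch below illustrates the idea outside PostgreSQL and is not the actual implementation: `open_zone_file()`, `open_slow_ci()`, and `SKETCH_MAXPATH` are hypothetical names, and the slow case-insensitive directory walk is left as a placeholder that always fails.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_MAXPATH 1024     /* hypothetical limit, standing in for MAXPGPATH */

/*
 * Hypothetical placeholder for the slow path: a per-directory-level,
 * case-insensitive search that would also fill in canonname.
 * It is not implemented in this sketch and always reports failure.
 */
int
open_slow_ci(const char *basedir, const char *name, char *canonname)
{
    (void) basedir;
    (void) name;
    (void) canonname;
    return -1;
}

/*
 * Open basedir/name read-only.  If the caller passes canonname == NULL,
 * it does not need the canonical spelling, so try a direct open() first
 * and fall back to the slow search only if that fails.
 * Returns a file descriptor, or -1 on failure.
 */
int
open_zone_file(const char *basedir, const char *name, char *canonname)
{
    char path[SKETCH_MAXPATH];

    assert(basedir != NULL && name != NULL);

    /* One up-front length check covers the whole path. */
    if (strlen(basedir) + 1 + strlen(name) >= sizeof(path))
        return -1;

    if (canonname == NULL)
    {
        int fd;

        snprintf(path, sizeof(path), "%s/%s", basedir, name);
        fd = open(path, O_RDONLY);
        if (fd >= 0)
            return fd;
        /* Direct open failed; fall through to the slow path. */
    }

    return open_slow_ci(basedir, name, canonname);
}
```

A caller that already knows the exact on-disk spelling, as pg_tzenumerate_next() does after reading the directory, would pass NULL for `canonname` and copy the name it already holds, which mirrors the backfill added in the last hunk.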
