
Commit b2cbced

Support timezone abbreviations that sometimes change.
Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world.

To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed.

The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve.

While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06d et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014.

This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed.

This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset.

In passing, update the SGML documentation to refer to the Olson/zoneinfo/zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.

1 parent 90063a7 commit b2cbced

29 files changed, +2045 -664 lines changed

contrib/btree_gist/btree_ts.c

+3 -19
@@ -200,27 +200,11 @@ tstz_dist(PG_FUNCTION_ARGS)
  **************************************************/
 
 
-static Timestamp
+static inline Timestamp
 tstz_to_ts_gmt(TimestampTz ts)
 {
-    Timestamp   gmt;
-    int         val,
-                tz;
-
-    gmt = ts;
-    DecodeSpecial(0, "gmt", &val);
-
-    if (ts < DT_NOEND && ts > DT_NOBEGIN)
-    {
-        tz = val * 60;
-
-#ifdef HAVE_INT64_TIMESTAMP
-        gmt -= (tz * INT64CONST(1000000));
-#else
-        gmt -= tz;
-#endif
-    }
-    return gmt;
+    /* No timezone correction is needed, since GMT is offset 0 by definition */
+    return (Timestamp) ts;
 }
 
 
doc/src/sgml/config.sgml

+3 -3
@@ -6003,9 +6003,9 @@ SET XML OPTION { DOCUMENT | CONTENT };
 Sets the collection of time zone abbreviations that will be accepted
 by the server for datetime input. The default is <literal>'Default'</>,
 which is a collection that works in most of the world; there are
-also <literal>'Australia'</literal> and <literal>'India'</literal>, and other collections can be defined
-for a particular installation. See <xref
-linkend="datetime-appendix"> for more information.
+also <literal>'Australia'</literal> and <literal>'India'</literal>,
+and other collections can be defined for a particular installation.
+See <xref linkend="datetime-config-files"> for more information.
 </para>
 </listitem>
 </varlistentry>

doc/src/sgml/datatype.sgml

+28 -11
@@ -2325,7 +2325,7 @@ January 8 04:05:06 1999 PST
 but continue to be prone to arbitrary changes, particularly with
 respect to daylight-savings rules.
 <productname>PostgreSQL</productname> uses the widely-used
-<literal>zoneinfo</> (Olson) time zone database for information about
+IANA (Olson) time zone database for information about
 historical time zone rules. For times in the future, the assumption
 is that the latest known rules for a given time zone will
 continue to be observed indefinitely far into the future.
@@ -2390,8 +2390,8 @@ January 8 04:05:06 1999 PST
 The recognized time zone names are listed in the
 <literal>pg_timezone_names</literal> view (see <xref
 linkend="view-pg-timezone-names">).
-<productname>PostgreSQL</productname> uses the widely-used
-<literal>zoneinfo</> time zone data for this purpose, so the same
+<productname>PostgreSQL</productname> uses the widely-used IANA
+time zone data for this purpose, so the same time zone
 names are also recognized by much other software.
 </para>
 </listitem>
@@ -2427,7 +2427,7 @@ January 8 04:05:06 1999 PST
 When a daylight-savings zone abbreviation is present,
 it is assumed to be used
 according to the same daylight-savings transition rules used in the
-<literal>zoneinfo</> time zone database's <filename>posixrules</> entry.
+IANA time zone database's <filename>posixrules</> entry.
 In a standard <productname>PostgreSQL</productname> installation,
 <filename>posixrules</> is the same as <literal>US/Eastern</>, so
 that POSIX-style time zone specifications follow USA daylight-savings
@@ -2438,9 +2438,25 @@ January 8 04:05:06 1999 PST
 </itemizedlist>
 
 In short, this is the difference between abbreviations
-and full names: abbreviations always represent a fixed offset from
-UTC, whereas most of the full names imply a local daylight-savings time
-rule, and so have two possible UTC offsets.
+and full names: abbreviations represent a specific offset from UTC,
+whereas many of the full names imply a local daylight-savings time
+rule, and so have two possible UTC offsets. As an example,
+<literal>2014-06-04 12:00 America/New_York</> represents noon local
+time in New York, which for this particular date was Eastern Daylight
+Time (UTC-4). So <literal>2014-06-04 12:00 EDT</> specifies that
+same time instant. But <literal>2014-06-04 12:00 EST</> specifies
+noon Eastern Standard Time (UTC-5), regardless of whether daylight
+savings was nominally in effect on that date.
+</para>
+
+<para>
+To complicate matters, some jurisdictions have used the same timezone
+abbreviation to mean different UTC offsets at different times; for
+example, in Moscow <literal>MSK</> has meant UTC+3 in some years and
+UTC+4 in others. <application>PostgreSQL</> interprets such
+abbreviations according to whatever they meant (or had most recently
+meant) on the specified date; but, as with the <literal>EST</> example
+above, this is not necessarily the same as local civil time on that date.
 </para>
 
 <para>
@@ -2457,13 +2473,14 @@ January 8 04:05:06 1999 PST
 </para>
 
 <para>
-In all cases, timezone names are recognized case-insensitively.
-(This is a change from <productname>PostgreSQL</productname> versions
-prior to 8.2, which were case-sensitive in some contexts but not others.)
+In all cases, timezone names and abbreviations are recognized
+case-insensitively. (This is a change from <productname>PostgreSQL</>
+versions prior to 8.2, which were case-sensitive in some contexts but
+not others.)
 </para>
 
 <para>
-Neither full names nor abbreviations are hard-wired into the server;
+Neither timezone names nor abbreviations are hard-wired into the server;
 they are obtained from configuration files stored under
 <filename>.../share/timezone/</> and <filename>.../share/timezonesets/</>
 of the installation directory
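
Reviewer's sketch: the "whatever they meant (or had most recently meant) on the specified date" rule added to datatype.sgml above can be illustrated with a small C lookup. This is only a sketch under stated assumptions, not the backend's implementation: the AbbrevMeaning struct, the abbrev_offset_on function, and the moscow_history table are hypothetical names, and the table is a deliberately simplified history of MSK (UTC+3, then UTC+4 from 2011-03-27, then UTC+3 again from 2014-10-26), not real IANA data.

```c
#include <stddef.h>
#include <string.h>

/* One historical meaning of an abbreviation within a single zone. */
typedef struct
{
    const char *abbrev;     /* abbreviation, e.g. "MSK" */
    int         since_date; /* first date it applied, encoded as YYYYMMDD */
    int         utc_offset; /* seconds east of UTC */
} AbbrevMeaning;

/*
 * Simplified, illustrative history for Moscow (an assumption, not tzdata).
 * Rows must be sorted by since_date in ascending order.
 */
const AbbrevMeaning moscow_history[] = {
    {"MSK", 19700101, 3 * 3600},
    {"MSK", 20110327, 4 * 3600},
    {"MSK", 20141026, 3 * 3600},
};

/*
 * Find the offset the abbreviation had on the given date, or most recently
 * before it.  Returns 0 and sets *offset on success; returns -1 if the
 * abbreviation was never in use on or before that date.
 */
int
abbrev_offset_on(const AbbrevMeaning *hist, size_t n,
                 const char *abbrev, int date, int *offset)
{
    int         found = -1;
    size_t      i;

    for (i = 0; i < n; i++)
    {
        if (hist[i].since_date > date)
            break;              /* later rows cannot apply to this date */
        if (strcmp(hist[i].abbrev, abbrev) == 0)
        {
            *offset = hist[i].utc_offset;   /* keep the latest match so far */
            found = 0;
        }
    }
    return found;
}
```

With this table, a date in mid-2012 resolves MSK to 14400 seconds (UTC+4), while dates in 2010 or 2015 resolve to 10800 seconds (UTC+3). A fixed-offset abbreviation would simply skip the history scan, which is why the commit keeps plain numeric definitions for abbreviations that have never changed.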

doc/src/sgml/datetime.sgml

+20 -15
@@ -374,22 +374,27 @@
 these formats:
 
 <synopsis>
-<replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable>
-<replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable> D
+<replaceable>zone_abbreviation</replaceable> <replaceable>offset</replaceable>
+<replaceable>zone_abbreviation</replaceable> <replaceable>offset</replaceable> D
+<replaceable>zone_abbreviation</replaceable> <replaceable>time_zone_name</replaceable>
 @INCLUDE <replaceable>file_name</replaceable>
 @OVERRIDE
 </synopsis>
 </para>
 
 <para>
-A <replaceable>time_zone_name</replaceable> is just the abbreviation
-being defined. The <replaceable>offset</replaceable> is the zone's
+A <replaceable>zone_abbreviation</replaceable> is just the abbreviation
+being defined. The <replaceable>offset</replaceable> is the equivalent
 offset in seconds from UTC, positive being east from Greenwich and
 negative being west. For example, -18000 would be five hours west
 of Greenwich, or North American east coast standard time. <literal>D</>
-indicates that the zone name represents local daylight-savings time
-rather than standard time. Since all known time zone offsets are on
-15 minute boundaries, the number of seconds has to be a multiple of 900.
+indicates that the zone name represents local daylight-savings time rather
+than standard time. Alternatively, a <replaceable>time_zone_name</> can
+be given, in which case that time zone definition is consulted, and the
+abbreviation's meaning in that zone is used. This alternative is
+recommended only for abbreviations whose meaning has historically varied,
+as looking up the meaning is noticeably more expensive than just using
+a fixed integer value.
 </para>
 
 <para>
@@ -400,24 +405,24 @@
 
 <para>
 The <literal>@OVERRIDE</> syntax indicates that subsequent entries in the
-file can override previous entries (i.e., entries obtained from included
-files). Without this, conflicting definitions of the same timezone
-abbreviation are considered an error.
+file can override previous entries (typically, entries obtained from
+included files). Without this, conflicting definitions of the same
+timezone abbreviation are considered an error.
 </para>
 
 <para>
 In an unmodified installation, the file <filename>Default</> contains
 all the non-conflicting time zone abbreviations for most of the world.
 Additional files <filename>Australia</> and <filename>India</> are
 provided for those regions: these files first include the
-<literal>Default</> file and then add or modify timezones as needed.
+<literal>Default</> file and then add or modify abbreviations as needed.
 </para>
 
 <para>
 For reference purposes, a standard installation also contains files
 <filename>Africa.txt</>, <filename>America.txt</>, etc, containing
 information about every time zone abbreviation known to be in use
-according to the <literal>zoneinfo</> timezone database. The zone name
+according to the IANA timezone database. The zone name
 definitions found in these files can be copied and pasted into a custom
 configuration file as needed. Note that these files cannot be directly
 referenced as <varname>timezone_abbreviations</> settings, because of
@@ -426,9 +431,9 @@
 
 <note>
 <para>
-If an error occurs while reading the time zone data sets, no new value is
-applied but the old set is kept. If the error occurs while starting the
-database, startup fails.
+If an error occurs while reading the time zone abbreviation set, no new
+value is applied and the old set is kept. If the error occurs while
+starting the database, startup fails.
 </para>
 </note>
 
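
Reviewer's sketch: the three abbreviation line formats documented in datetime.sgml above (fixed offset, fixed offset with D, and the new zone-name form) could be told apart roughly as follows. This is a minimal illustration, not the server's parser: AbbrevEntry, EntryKind, and parse_abbrev_line are hypothetical names, the field-length limits are arbitrary, and the sketch deliberately rejects @INCLUDE and @OVERRIDE lines and does not handle comments.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum
{
    ENTRY_FIXED,                /* "abbrev offset" or "abbrev offset D" */
    ENTRY_DYNAMIC               /* "abbrev time_zone_name" */
} EntryKind;

typedef struct
{
    char        abbrev[16];
    EntryKind   kind;
    int         offset;         /* seconds east of UTC; ENTRY_FIXED only */
    bool        is_dst;         /* trailing "D" seen; ENTRY_FIXED only */
    char        zone[64];       /* zone name; ENTRY_DYNAMIC only */
} AbbrevEntry;

/* Parse one abbreviation definition line; returns true on success. */
bool
parse_abbrev_line(const char *line, AbbrevEntry *out)
{
    char        second[64];
    char        third[8];
    char       *end;
    long        val;
    int         nfields;

    memset(out, 0, sizeof(*out));
    nfields = sscanf(line, "%15s %63s %7s", out->abbrev, second, third);
    if (nfields < 2 || out->abbrev[0] == '@')
        return false;           /* blank, incomplete, or a directive line */

    /* A fully numeric second field means a fixed offset in seconds. */
    val = strtol(second, &end, 10);
    if (end != second && *end == '\0')
    {
        out->kind = ENTRY_FIXED;
        out->offset = (int) val;
        if (nfields == 3)
        {
            if (strcmp(third, "D") != 0)
                return false;   /* only "D" may follow an offset */
            out->is_dst = true;
        }
        return true;
    }

    /* Otherwise the second field names the zone to consult. */
    if (nfields != 2)
        return false;
    out->kind = ENTRY_DYNAMIC;
    strcpy(out->zone, second);  /* safe: both buffers are 64 bytes */
    return true;
}
```

For example, "EST -18000" parses as a fixed standard-time entry, "EDT -14400 D" as a fixed daylight-savings entry, and a line such as "MSK Europe/Moscow" as a dynamic entry whose offset must be looked up in the named zone.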

doc/src/sgml/installation.sgml

+1 -1
@@ -1108,7 +1108,7 @@ su - postgres
 <para>
 <productname>PostgreSQL</> includes its own time zone database,
 which it requires for date and time operations. This time zone
-database is in fact compatible with the <quote>zoneinfo</> time zone
+database is in fact compatible with the IANA time zone
 database provided by many operating systems such as FreeBSD,
 Linux, and Solaris, so it would be redundant to install it again.
 When this option is used, the system-supplied time zone database

src/backend/utils/adt/date.c

+23 -8
@@ -2695,24 +2695,39 @@ timetz_zone(PG_FUNCTION_ARGS)
     pg_tz      *tzp;
 
     /*
-     * Look up the requested timezone. First we look in the date token table
-     * (to handle cases like "EST"), and if that fails, we look in the
-     * timezone database (to handle cases like "America/New_York"). (This
-     * matches the order in which timestamp input checks the cases; it's
-     * important because the timezone database unwisely uses a few zone names
-     * that are identical to offset abbreviations.)
+     * Look up the requested timezone. First we look in the timezone
+     * abbreviation table (to handle cases like "EST"), and if that fails, we
+     * look in the timezone database (to handle cases like
+     * "America/New_York"). (This matches the order in which timestamp input
+     * checks the cases; it's important because the timezone database unwisely
+     * uses a few zone names that are identical to offset abbreviations.)
      */
     text_to_cstring_buffer(zone, tzname, sizeof(tzname));
+
+    /* DecodeTimezoneAbbrev requires lowercase input */
     lowzone = downcase_truncate_identifier(tzname,
                                            strlen(tzname),
                                            false);
 
-    type = DecodeSpecial(0, lowzone, &val);
+    type = DecodeTimezoneAbbrev(0, lowzone, &val, &tzp);
 
     if (type == TZ || type == DTZ)
-        tz = val * MINS_PER_HOUR;
+    {
+        /* fixed-offset abbreviation */
+        tz = -val;
+    }
+    else if (type == DYNTZ)
+    {
+        /* dynamic-offset abbreviation, resolve using current time */
+        pg_time_t   now = (pg_time_t) time(NULL);
+        struct pg_tm *tm;
+
+        tm = pg_localtime(&now, tzp);
+        tz = DetermineTimeZoneAbbrevOffset(tm, tzname, tzp);
+    }
     else
     {
+        /* try it as a full zone name */
         tzp = pg_tzset(tzname);
         if (tzp)
         {