Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 6b0faf7

Browse files
committed
Make collation-aware system catalog columns use "C" collation.
Up to now we allowed text columns in system catalogs to use collation "default", but that isn't really safe because it might mean something different in template0 than it means in a database cloned from template0. In particular, this could mean that cloned pg_statistic entries for such columns weren't entirely valid, possibly leading to bogus planner estimates, though (probably) not any outright failures. In the wake of commit 5e09280, a better solution is available: if we label such columns with "C" collation, then their pg_statistic entries will also use that collation and hence will be valid independently of the database collation. This also provides a cleaner solution for indexes on such columns than the hack added by commit 0b28ea7: the indexes will naturally inherit "C" collation and don't have to be forced to use text_pattern_ops. Also, with the planned improvement of type "name" to be collation-aware, this policy will apply cleanly to both text and name columns. Because of the pg_statistic angle, we should also apply this policy to the tables in information_schema. This patch does that by adjusting information_schema's textual domain types to specify "C" collation. That has the user-visible effect that order-sensitive comparisons to textual information_schema view columns will now use "C" collation by default. The SQL standard says that the collation of those view columns is implementation-defined, so I think this is legal per spec. At some point this might allow for translation of such comparisons into indexable conditions on the underlying "name" columns, although additional work will be needed before that can happen. Discussion: https://postgr.es/m/19346.1544895309@sss.pgh.pa.us
1 parent b2d9e17 commit 6b0faf7

File tree

9 files changed

+60
-32
lines changed

9 files changed

+60
-32
lines changed

src/backend/access/common/scankey.c

+4-4
Original file line numberDiff line numberDiff line change
@@ -64,9 +64,9 @@ ScanKeyEntryInitialize(ScanKey entry,
6464
* It cannot handle NULL arguments, unary operators, or nondefault operators,
6565
* but we need none of those features for most hardwired lookups.
6666
*
67-
* We set collation to DEFAULT_COLLATION_OID always. This is appropriate
68-
* for textual columns in system catalogs, and it will be ignored for
69-
* non-textual columns, so it's not worth trying to be more finicky.
67+
* We set collation to C_COLLATION_OID always. This is the correct value
68+
* for all collation-aware columns in system catalogs, and it will be ignored
69+
* for other column types, so it's not worth trying to be more finicky.
7070
*
7171
* Note: CurrentMemoryContext at call should be as long-lived as the ScanKey
7272
* itself, because that's what will be used for any subsidiary info attached
@@ -83,7 +83,7 @@ ScanKeyInit(ScanKey entry,
8383
entry->sk_attno = attributeNumber;
8484
entry->sk_strategy = strategy;
8585
entry->sk_subtype = InvalidOid;
86-
entry->sk_collation = DEFAULT_COLLATION_OID;
86+
entry->sk_collation = C_COLLATION_OID;
8787
entry->sk_argument = argument;
8888
fmgr_info(procedure, &entry->sk_func);
8989
}

src/backend/bootstrap/bootstrap.c

+9
Original file line numberDiff line numberDiff line change
@@ -744,6 +744,15 @@ DefineAttr(char *name, char *type, int attnum, int nullness)
744744
attrtypes[attnum]->attndims = 0;
745745
}
746746

747+
/*
748+
* If a system catalog column is collation-aware, force it to use C
749+
* collation, so that its behavior is independent of the database's
750+
* collation. This is essential to allow template0 to be cloned with a
751+
* different database collation.
752+
*/
753+
if (OidIsValid(attrtypes[attnum]->attcollation))
754+
attrtypes[attnum]->attcollation = C_COLLATION_OID;
755+
747756
attrtypes[attnum]->attstattarget = -1;
748757
attrtypes[attnum]->attcacheoff = -1;
749758
attrtypes[attnum]->atttypmod = -1;

src/backend/catalog/genbki.pl

+7-1
Original file line numberDiff line numberDiff line change
@@ -167,6 +167,9 @@
167167
my $BOOTSTRAP_SUPERUSERID =
168168
Catalog::FindDefinedSymbolFromData($catalog_data{pg_authid},
169169
'BOOTSTRAP_SUPERUSERID');
170+
my $C_COLLATION_OID =
171+
Catalog::FindDefinedSymbolFromData($catalog_data{pg_collation},
172+
'C_COLLATION_OID');
170173
my $PG_CATALOG_NAMESPACE =
171174
Catalog::FindDefinedSymbolFromData($catalog_data{pg_namespace},
172175
'PG_CATALOG_NAMESPACE');
@@ -693,7 +696,10 @@ sub morph_row_for_pgattr
693696

694697
# set attndims if it's an array type
695698
$row->{attndims} = $type->{typcategory} eq 'A' ? '1' : '0';
696-
$row->{attcollation} = $type->{typcollation};
699+
700+
# collation-aware catalog columns must use C collation
701+
$row->{attcollation} = $type->{typcollation} != 0 ?
702+
$C_COLLATION_OID : 0;
697703

698704
if (defined $attr->{forcenotnull})
699705
{

src/backend/catalog/information_schema.sql

+3-3
Original file line numberDiff line numberDiff line change
@@ -208,15 +208,15 @@ CREATE DOMAIN cardinal_number AS integer
208208
* CHARACTER_DATA domain
209209
*/
210210

211-
CREATE DOMAIN character_data AS character varying;
211+
CREATE DOMAIN character_data AS character varying COLLATE "C";
212212

213213

214214
/*
215215
* 5.5
216216
* SQL_IDENTIFIER domain
217217
*/
218218

219-
CREATE DOMAIN sql_identifier AS character varying;
219+
CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
220220

221221

222222
/*
@@ -243,7 +243,7 @@ CREATE DOMAIN time_stamp AS timestamp(2) with time zone
243243
* YES_OR_NO domain
244244
*/
245245

246-
CREATE DOMAIN yes_or_no AS character varying(3)
246+
CREATE DOMAIN yes_or_no AS character varying(3) COLLATE "C"
247247
CONSTRAINT yes_or_no_check CHECK (value IN ('YES', 'NO'));
248248

249249

src/backend/utils/cache/catcache.c

+7-6
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@
2222
#include "access/tuptoaster.h"
2323
#include "access/valid.h"
2424
#include "access/xact.h"
25+
#include "catalog/pg_collation.h"
2526
#include "catalog/pg_operator.h"
2627
#include "catalog/pg_type.h"
2728
#include "miscadmin.h"
@@ -1014,8 +1015,8 @@ CatalogCacheInitializeCache(CatCache *cache)
10141015
/* Fill in sk_strategy as well --- always standard equality */
10151016
cache->cc_skey[i].sk_strategy = BTEqualStrategyNumber;
10161017
cache->cc_skey[i].sk_subtype = InvalidOid;
1017-
/* Currently, there are no catcaches on collation-aware data types */
1018-
cache->cc_skey[i].sk_collation = InvalidOid;
1018+
/* If a catcache key requires a collation, it must be C collation */
1019+
cache->cc_skey[i].sk_collation = C_COLLATION_OID;
10191020

10201021
CACHE4_elog(DEBUG2, "CatalogCacheInitializeCache %s %d %p",
10211022
cache->cc_relname,
@@ -1961,10 +1962,10 @@ CatCacheCopyKeys(TupleDesc tupdesc, int nkeys, int *attnos,
19611962
NameData srcname;
19621963

19631964
/*
1964-
* Must be careful in case the caller passed a C string where a
1965-
* NAME is wanted: convert the given argument to a correctly
1966-
* padded NAME. Otherwise the memcpy() done by datumCopy() could
1967-
* fall off the end of memory.
1965+
* Must be careful in case the caller passed a C string where a NAME
1966+
* is wanted: convert the given argument to a correctly padded NAME.
1967+
* Otherwise the memcpy() done by datumCopy() could fall off the end
1968+
* of memory.
19681969
*/
19691970
if (att->atttypid == NAMEOID)
19701971
{

src/include/catalog/catversion.h

+1-1
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,6 @@
5353
*/
5454

5555
/* yyyymmddN */
56-
#define CATALOG_VERSION_NO 201812181
56+
#define CATALOG_VERSION_NO 201812182
5757

5858
#endif

src/include/catalog/indexing.h

+3-3
Original file line numberDiff line numberDiff line change
@@ -310,10 +310,10 @@ DECLARE_UNIQUE_INDEX(pg_default_acl_oid_index, 828, on pg_default_acl using btre
310310
DECLARE_UNIQUE_INDEX(pg_db_role_setting_databaseid_rol_index, 2965, on pg_db_role_setting using btree(setdatabase oid_ops, setrole oid_ops));
311311
#define DbRoleSettingDatidRolidIndexId 2965
312312

313-
DECLARE_UNIQUE_INDEX(pg_seclabel_object_index, 3597, on pg_seclabel using btree(objoid oid_ops, classoid oid_ops, objsubid int4_ops, provider text_pattern_ops));
313+
DECLARE_UNIQUE_INDEX(pg_seclabel_object_index, 3597, on pg_seclabel using btree(objoid oid_ops, classoid oid_ops, objsubid int4_ops, provider text_ops));
314314
#define SecLabelObjectIndexId 3597
315315

316-
DECLARE_UNIQUE_INDEX(pg_shseclabel_object_index, 3593, on pg_shseclabel using btree(objoid oid_ops, classoid oid_ops, provider text_pattern_ops));
316+
DECLARE_UNIQUE_INDEX(pg_shseclabel_object_index, 3593, on pg_shseclabel using btree(objoid oid_ops, classoid oid_ops, provider text_ops));
317317
#define SharedSecLabelObjectIndexId 3593
318318

319319
DECLARE_UNIQUE_INDEX(pg_extension_oid_index, 3080, on pg_extension using btree(oid oid_ops));
@@ -333,7 +333,7 @@ DECLARE_UNIQUE_INDEX(pg_policy_polrelid_polname_index, 3258, on pg_policy using
333333
DECLARE_UNIQUE_INDEX(pg_replication_origin_roiident_index, 6001, on pg_replication_origin using btree(roident oid_ops));
334334
#define ReplicationOriginIdentIndex 6001
335335

336-
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, on pg_replication_origin using btree(roname text_pattern_ops));
336+
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, on pg_replication_origin using btree(roname text_ops));
337337
#define ReplicationOriginNameIndex 6002
338338

339339
DECLARE_UNIQUE_INDEX(pg_partitioned_table_partrelid_index, 3351, on pg_partitioned_table using btree(partrelid oid_ops));

src/test/regress/expected/opr_sanity.out

+14-7
Original file line numberDiff line numberDiff line change
@@ -2060,18 +2060,25 @@ ORDER BY 1;
20602060
-- a representational error in pg_index, but simply wrong catalog design.
20612061
-- It's bad because we expect to be able to clone template0 and assign the
20622062
-- copy a different database collation. It would especially not work for
2063-
-- shared catalogs. Note that although text columns will show a collation
2064-
-- in indcollation, they're still okay to index with text_pattern_ops,
2065-
-- so allow that case.
2063+
-- shared catalogs.
2064+
SELECT relname, attname, attcollation
2065+
FROM pg_class c, pg_attribute a
2066+
WHERE c.oid = attrelid AND c.oid < 16384 AND
2067+
c.relkind != 'v' AND -- we don't care about columns in views
2068+
attcollation != 0 AND
2069+
attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
2070+
relname | attname | attcollation
2071+
---------+---------+--------------
2072+
(0 rows)
2073+
2074+
-- Double-check that collation-sensitive indexes have "C" collation, too.
20662075
SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
20672076
FROM (SELECT indexrelid, indrelid,
20682077
unnest(indclass) as iclass, unnest(indcollation) as icoll
20692078
FROM pg_index
20702079
WHERE indrelid < 16384) ss
2071-
WHERE icoll != 0 AND iclass !=
2072-
(SELECT oid FROM pg_opclass
2073-
WHERE opcname = 'text_pattern_ops' AND opcmethod =
2074-
(SELECT oid FROM pg_am WHERE amname = 'btree'));
2080+
WHERE icoll != 0 AND
2081+
icoll != (SELECT oid FROM pg_collation WHERE collname = 'C');
20752082
indexrelid | indrelid | iclass | icoll
20762083
------------+----------+--------+-------
20772084
(0 rows)

src/test/regress/sql/opr_sanity.sql

+12-7
Original file line numberDiff line numberDiff line change
@@ -1333,16 +1333,21 @@ ORDER BY 1;
13331333
-- a representational error in pg_index, but simply wrong catalog design.
13341334
-- It's bad because we expect to be able to clone template0 and assign the
13351335
-- copy a different database collation. It would especially not work for
1336-
-- shared catalogs. Note that although text columns will show a collation
1337-
-- in indcollation, they're still okay to index with text_pattern_ops,
1338-
-- so allow that case.
1336+
-- shared catalogs.
1337+
1338+
SELECT relname, attname, attcollation
1339+
FROM pg_class c, pg_attribute a
1340+
WHERE c.oid = attrelid AND c.oid < 16384 AND
1341+
c.relkind != 'v' AND -- we don't care about columns in views
1342+
attcollation != 0 AND
1343+
attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
1344+
1345+
-- Double-check that collation-sensitive indexes have "C" collation, too.
13391346

13401347
SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
13411348
FROM (SELECT indexrelid, indrelid,
13421349
unnest(indclass) as iclass, unnest(indcollation) as icoll
13431350
FROM pg_index
13441351
WHERE indrelid < 16384) ss
1345-
WHERE icoll != 0 AND iclass !=
1346-
(SELECT oid FROM pg_opclass
1347-
WHERE opcname = 'text_pattern_ops' AND opcmethod =
1348-
(SELECT oid FROM pg_am WHERE amname = 'btree'));
1352+
WHERE icoll != 0 AND
1353+
icoll != (SELECT oid FROM pg_collation WHERE collname = 'C');

0 commit comments

Comments
 (0)