[PATCH] Opclass parameters - Mailing list pgsql-hackers
From | Nikita Glukhov |
---|---|
Subject | [PATCH] Opclass parameters |
Date | |
Msg-id | d22c3a18-31c7-1879-fc11-4c1ce2f5e5af@postgrespro.ru Whole thread Raw |
Responses |
Re: [PATCH] Opclass parameters
|
List | pgsql-hackers |
Hi hackers. I would like to present patch set implementing opclass parameters. This feature was recently presented at pgconf.ru: http://www.sai.msu.su/~megera/postgres/talks/opclass_pgconf.ru-2018.pdf A analogous work was already done by Nikolay Shaplov two years ago: https://www.postgresql.org/message-id/5213596.TqFRiqmCTe%40nataraj-amd64 But this patches are not based on it, although they are very similar. Opclass parameters can give user ability to: * Define the values of the constants that are hardcoded now in the opclasses depending on the indexed data. * Specify what to index for non-atomic data types (arrays, json[b], tsvector). Partial index can only filter whole rows. * Specify what indexing algorithm to use depending on the indexed data. Description of patches: 1. Infrastructure for opclass parameters. SQL grammar is changed only for CREATE INDEX statement: parenthesized parameters in reloptions format are added after column's opclass name. Default opclass can be specified with DEFAULT keyword: CREATE INDEX idx ON tab USING am ( {expr {opclass | DEFAULT} ({name=value} [,...])} [,...] ); Example for contrib/intarray: CREATE INDEX ON arrays USING gist ( arr gist__intbig_ops (siglen = 32), arr DEFAULT (numranges = 100) ); \d arrays Table "public.arrays" Column | Type | Collation | Nullable | Default --------+-----------+-----------+----------+--------- arr | integer[] | | | Indexes: "arrays_arr_arr1_idx" gist (arr gist__intbig_ops (siglen='32'), arr gist__int_ops (numranges='100')) I decided to store parameters in text[] column pg_index.indoptions near to existing columns like indkey, indcollation, indclass, indoption. I-th element of indoptions[] is a text array of parameters of i-th index column serialized into a string. Each parameter is stored as 'name=value' text string like ordinal reloptions. There is another way to store opclass parameters: store them in the existing column pg_attribute.attoptions (as it was done by Nikolay Shaplov) and there will be no need to serialize reloptions to a text array element. Example query showing how parameters are stored: SELECT ARRAY( SELECT (pg_identify_object('pg_opclass'::regclass, opcid, 0)).name FROM unnest(indclass::int[]) opcid ) indclass, indoptions FROM pg_index WHERE indoptions IS NOT NULL; indclass | indoptions ----------------------------------+------------------------------------ {gist__intbig_ops,gist__int_ops} | {"{siglen=32}","{numranges=100}"} {jsonb_path_ops} | {"{projection=$.tags[*].term}"} (2 rows) Each access method supporting opclass parameters specifies amopclassoptions routine for transformation of text[] parameters datum into a binary bytea structure which will be cached in RelationData and IndexOptInfo structures: typedef bytea *(*amopclassoptions_function) ( Relation index, AttrNumber colnum, Datum indoptions, bool validate ); If access method wants simply to delegate parameters processing to one of column opclass's support functions, then it can use index_opclass_options_generic() subroutine in its amopclassoptions implementation: bytea *index_opclass_options_generic( Relation relation, AttrNumber attnum, uint16 procnum, Datum indoptions, bool validate ); This support functions must have the following signature: internal (options internal, validate bool). Opclass parameters are passed as a text[] reloptions datum, returned pointer to a bytea structure with parsed parameter values. Opclass can use new functions parseLocalRelOptions(), parseAndFillLocalRelOptions() for reloptions parsing. This functions differ from the standard parseRelOptions() in that a local array of reloptions descriptions is passed here, not a global relopt_kind. But it seems that reloptions processing still needs deeper refactoring like the one already done by Nikolay Shaplov (https://www.postgresql.org/message-id/flat/2146419.veIEZdk4E4%40x200m#2146419.veIEZdk4E4@x200m). 2. Opclass parameters support in GiST indices. Parametrized GiST opclass specifies optional 10th (GIST_OPCLASSOPT_PROC) support function with the following signature: internal (options internal, validate bool) Returned parsed bytea pointer with parameters will be passed to all support functions in the last argument. 3. Opclass parameters support in GIN indices. Everything is the same as for GiST, except for the optional support function number which is 7 (GIN_OPCLASSOPTIONS_PROC) here. 4. Opclass parameters for GiST tsvector_ops 5. Opclass parameters for contrib/intarray 6. Opclass parameters for contrib/ltree 7. Opclass parameters for contrib/pg_trgm 8. Opclass parameters for contrib/hstore This 5 patches for GiST opclasses are very similar: added optional 'siglen' parameter for specifying signature length. Default signature length is left equal to the hardcoded value that was here before. Also added 'numranges' parameter for gist__int_ops. We also have two more complex unfinished patches for GIN opclasses which should be posted in separate threads: * tsvector_ops: added parameter 'weights' for specification of indexed lexeme's weight groups. This parameter can reduce index size and its build/update time and can also eliminate recheck. By default, all weights are indexed within the same group. * jsonb_ops: added jsonpath parameter 'projection' for specification of indexed paths in jsonb (this patch depends on SQL/JSON jsonpath patch). Analogically to tsvector_ops, this parameter can reduce index size and its build/update time, but can not eliminate recheck. -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
- 0001-opclass-parameters-v01.patch
- 0002-opclass-parameters-GiST-v01.patch
- 0003-opclass-parameters-GIN-v01.patch
- 0004-opclass-parameters-GiST-tsvector_ops-v01.patch
- 0005-opclass-parameters-contrib_intarray-v01.patch
- 0006-opclass-parameters-contrib_ltree-v01.patch
- 0007-opclass-parameters-contrib_pg_trgm-v01.patch
- 0008-opclass-parameters-contrib_hstore-v01.patch
pgsql-hackers by date: