Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-04-12Fix GIN's shimTriConsistentFn to not corrupt its input.Tom Lane
Commit 0f21db36d made an assumption that GIN triConsistentFns would not modify their input entryRes[] arrays. But in fact, the "shim" triConsistentFn that we use for opclasses that don't supply their own did exactly that, potentially leading to wrong answers from a GIN index search. Through bad luck, none of the test cases that we have for such opclasses exposed the bug. One response to this could be that the assumption of consistency check functions not modifying entryRes[] arrays is a bad one, but it still seems reasonable to me. Notably, shimTriConsistentFn is itself assuming that with respect to the underlying boolean consistentFn, so it's sure being self-centered in supposing that it gets to do so. Fortunately, it's quite simple to fix shimTriConsistentFn to restore the entry-time state of entryRes[], so let's do that instead. This issue doesn't affect any core GIN opclasses, since they all supply their own triConsistentFns. It does affect contrib modules btree_gin, hstore, and intarray. Along the way, I (tgl) noticed that shimTriConsistentFn failed to pick up on a "recheck" flag returned by its first call to the boolean consistentFn. This may be only a latent problem, since it would be unlikely for a consistentFn to set recheck for the all-false case and not any other cases. (Indeed, none of our contrib modules do that.) Nonetheless, it's formally wrong. Reported-by: Vinod Sridharan <vsridh90@gmail.com> Author: Vinod Sridharan <vsridh90@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com Backpatch-through: 13
2024-01-07Fix integer-overflow problem in intarray's g_int_decompress().Tom Lane
An array element equal to INT_MAX gave this code indigestion, causing an infinite loop that surely ended in SIGSEGV. We fixed some nearby problems awhile ago (cf 757c5182f) but missed this. Report and diagnosis by Alexander Lakhin (bug #18273); patch by me Discussion: https://postgr.es/m/18273-9a832d1da122600c@postgresql.org
2023-09-26Fix another bug in parent page splitting during GiST index build.Heikki Linnakangas
Yet another bug in the ilk of commits a7ee7c851 and 741b88435. In 741b88435, we took care to clear the memorized location of the downlink when we split the parent page, because splitting the parent page can move the downlink. But we missed that even *updating* a tuple on the parent can move it, because updating a tuple on a gist page is implemented as a delete+insert, so the updated tuple gets moved to the end of the page. This commit fixes the bug in two different ways (belt and suspenders): 1. Clear the downlink when we update a tuple on the parent page, even if it's not split. This the same approach as in commits a7ee7c851 and 741b88435. I also noticed that gistFindCorrectParent did not clear the 'downlinkoffnum' when it stepped to the right sibling. Fix that too, as it seems like a clear bug even though I haven't been able to find a test case to hit that. 2. Change gistFindCorrectParent so that it treats 'downlinkoffnum' merely as a hint. It now always first checks if the downlink is still at that location, and if not, it scans the page like before. That's more robust if there are still more cases where we fail to clear 'downlinkoffnum' that we haven't yet uncovered. With this, it's no longer necessary to meticulously clear 'downlinkoffnum', so this makes the previous fixes unnecessary, but I didn't revert them because it still seems nice to clear it when we know that the downlink has moved. Also add the test case using the same test data that Alexander posted. I tried to reduce it to a smaller test, and I also tried to reproduce this with different test data, but I was not able to, so let's just include what we have. Backpatch to v12, like the previous fixes. Reported-by: Alexander Lakhin Discussion: https://www.postgresql.org/message-id/18129-caca016eaf0c3702@postgresql.org
2023-06-15intarray: Prevent out-of-bound memory reads with gist__int_opsMichael Paquier
As gist__int_ops stands in intarray, it is possible to store GiST entries for leaf pages that can cause corruptions when decompressed. Leaf nodes are stored as decompressed all the time by the compression method, and the decompression method should map with that, retrieving the contents of the page without doing any decompression. However, the code authorized the insertion of leaf page data with a higher number of array items than what can be supported, generating a NOTICE message to inform about this matter (199 for a 8k page, for reference). When calling the decompression method, a decompression would be attempted on this leaf node item but the contents should be retrieved as they are. The NOTICE message generated when dealing with the compression of a leaf page and too many elements in the input array for gist__int_ops has been introduced by 08ee64e, removing the marker stored in the array to track if this is actually a leaf node. However, it also missed the fact that the decompression path should do nothing for a leaf page. Hence, as the code stand, a too-large array would be stored as uncompressed but the decompression path would attempt a decompression rather that retrieving the contents as they are. This leads to various problems. First, even if 08ee64e tried to address that, it is possible to do out-of-bound chunk writes with a large input array, with the backend informing about that with WARNINGs. On decompression, retrieving the stored leaf data would lead to incorrect memory reads, leading to crashes or even worse. Perhaps somebody would be interested in expanding the number of array items that can be handled in a leaf page for this operator in the future, which would require revisiting the choice done in 08ee64e, but based on the lack of reports about this problem since 2005 it does not look so. For now, this commit prevents the insertion of data for leaf pages when using more array items that the code can handle on decompression, switching the NOTICE message to an ERROR. If one wishes to use more array items, gist__intbig_ops is an optional choice. While on it, use ERRCODE_PROGRAM_LIMIT_EXCEEDED as error code when a limit is reached, because that's what the module is facing in such cases. Author: Ankit Kumar Pandey, Alexander Lakhin Reviewed-by: Richard Guo, Michael Paquier Discussion: https://postgr.es/m/796b65c3-57b7-bddf-b0d5-a8afafb8b627@gmail.com Discussion: https://postgr.es/m/17888-f72930e6b5ce8c14@postgresql.org Backpatch-through: 11
2023-02-27Rework pg_input_error_message(), now renamed pg_input_error_info()Michael Paquier
pg_input_error_info() is now a SQL function able to return a row with more than just the error message generated for incorrect data type inputs when these are able to handle soft failures, returning more contents of ErrorData, as of: - The error message (same as before). - The error detail, if set. - The error hint, if set. - SQL error code. All the regression tests that relied on pg_input_error_message() are updated to reflect the effects of the rename. Per discussion with Tom Lane and Andrew Dunstan. Author: Nathan Bossart Discussion: https://postgr.es/m/139a68e1-bd1f-a9a7-b5fe-0be9845c6311@dunslane.net
2022-12-28Convert contrib/intarray's bqarr_in() to report errors softlyAndrew Dunstan
Reviewed by Tom Lane and Amul Sul Discussion: https://postgr.es/m/49e598c2-cfe8-0928-b6fb-d0cc51aab626@dunslane.net
2020-03-30Implement operator class parametersAlexander Korotkov
PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2019-08-06Fix intarray's GiST opclasses to not fail for empty arrays with <@.Tom Lane
contrib/intarray considers "arraycol <@ constant-array" to be indexable, but its GiST opclass code fails to reliably find index entries for empty array values (which of course should trivially match such queries). This is because the test condition to see whether we should descend through a non-leaf node is wrong. Unfortunately, empty array entries could be anywhere in the index, as these index opclasses are currently designed. So there's no way to fix this except by lobotomizing <@ indexscans to scan the whole index ... which is what this patch does. That's pretty unfortunate: the performance is now actually worse than a seqscan, in most cases. We'd be better off to remove <@ from the GiST opclasses entirely, and perhaps a future non-back-patchable patch will do so. In the meantime, applications whose performance is adversely impacted have a couple of options. They could switch to a GIN index, which doesn't have this bug, or they could replace "arraycol <@ constant-array" with "arraycol <@ constant-array AND arraycol && constant-array". That will provide about the same performance as before, and it will find all non-empty subsets of the given constant-array, which is all that could reliably be expected of the query before. While at it, add some more regression test cases to improve code coverage of contrib/intarray. In passing, adjust resize_intArrayType so that when it's returning an empty array, it uses construct_empty_array for that rather than cowboy hacking on the input array. While the hack produces an array that looks valid for most purposes, it isn't bitwise equal to empty arrays produced by other code paths, which could have subtle odd effects. I don't think this code path is performance-critical enough to justify such shortcuts. (Back-patch this part only as far as v11; before commit 01783ac36 we were not careful about this in other intarray code paths either.) Back-patch the <@ fixes to all supported versions, since this was broken from day one. Patch by me; thanks to Alexander Korotkov for review. Discussion: https://postgr.es/m/458.1565114141@sss.pgh.pa.us
2018-07-09Fix yet more problems with incorrectly-constructed zero-length arrays.Tom Lane
Commit 716ea626a attempted to fix the problem of building 1-D zero-size arrays once and for all. But it turns out that contrib/intarray has some code that doesn't use construct_array() but just builds arrays by hand, so it didn't get the memo. This appears to affect all of subarray(), intset_subtract(), inner_int_union(), inner_int_inter(), and intarray_concat_arrays(). Back-patch into v11. In the past we've not back-patched this type of change, but since v11 is still in beta it seems all right to include this fix in it. Besides it's more consistent to make the fix in v11 where 716ea626a appeared. Report and patch by Alexey Kryuchkov, some cosmetic adjustments by me Report: https://postgr.es/m/153053285112.13258.434620894305716755@wrigleys.postgresql.org Discussion: https://postgr.es/m/CAN85JcYphDLYt4CpMDLZjjNVqGDrFJ5eS3YF=wLAhFoDQuBsyg@mail.gmail.com
2016-11-29Test all contrib-created operator classes with amvalidate.Tom Lane
I'd supposed that people would do this manually when creating new operator classes, but the folly of that was exposed today. The tests seem fast enough that we can just apply them during the normal regression tests. contrib/isn fails the checks for lack of complete sets of cross-type operators. That's a nice-to-have policy rather than a functional requirement, so leave it as-is, but insert ORDER BY in the query to ensure consistent cross-platform output. Discussion: https://postgr.es/m/7076.1480446837@sss.pgh.pa.us
2015-07-21Add selectivity estimation functions for intarray operators.Heikki Linnakangas
Uriy Zhuravlev and Alexander Korotkov, reviewed by Jeff Janes, some cleanup by me.
2012-02-17Fix longstanding error in contrib/intarray's int[] & int[] operator.Tom Lane
The array intersection code would give wrong results if the first entry of the correct output array would be "1". (I think only this value could be at risk, since the previous word would always be a lower-bound entry with that fixed value.) Problem spotted by Julien Rouhaud, initial patch by Guillaume Lelarge, cosmetic improvements by me.
2011-02-14Convert contrib modules to use the extension facility.Tom Lane
This isn't fully tested as yet, in particular I'm not sure that the "foo--unpackaged--1.0.sql" scripts are OK. But it's time to get some buildfarm cycles on it. sepgsql is not converted to an extension, mainly because it seems to require a very nonstandard installation process. Dimitri Fontaine and Tom Lane
2007-09-14Remove ill-considered (not to mention undocumented) attempt to makeTom Lane
contrib/intarray's GIN opclass override the built-in default. Per bug #3048 and other complaints.
2006-09-10Rename contrib contains/contained-by operators to @> and <@, per discussion.Tom Lane
2006-05-03Make GIN opclass worked with intarray extensionsTeodor Sigaev
2003-07-24Error message editing in contrib (mostly by Joe Conway --- thanks Joe!)Tom Lane
2002-10-21SET autocommit no longer needed in /contrib because pg_regress.sh doesBruce Momjian
it automatically now on regression session startup.
2002-10-18Update /contrib for "autocommit TO 'on'".Bruce Momjian
Create objects in public schema. Make spacing/capitalization consistent. Remove transaction block use for object creation. Remove unneeded function GRANTs.
2002-08-10August 6, 2002Bruce Momjian
1. Reworked patch from Andrey Oktyabrski (ano@spider.ru) with functions: icount, sort, sort_asc, uniq, idx, subarray operations: #, +, -, |, & FUNCTIONS: int icount(int[]) - the number of elements in intarray int[] sort(int[], 'asc' | 'desc') - sort intarray int[] sort(int[]) - sort in ascending order int[] sort_asc(int[]),sort_desc(int[]) - shortcuts for sort int[] uniq(int[]) - returns unique elements int idx(int[], int item) - returns index of first intarray matching element to item, or '0' if matching failed. int[] subarray(int[],int START [, int LEN]) - returns part of intarray starting from element number START (from 1) and length LEN. OPERATIONS: int[] && int[] - overlap - returns TRUE if arrays has at least one common elements. int[] @ int[] - contains - returns TRUE if left array contains right array int[] ~ int[] - contained - returns TRUE if left array is contained in right array # int[] - return the number of elements in array int[] + int - push element to array ( add to end of array) int[] + int[] - merge of arrays (right array added to the end of left one) int[] - int - remove entries matched by right argument from array int[] - int[] - remove left array from right int[] | int - returns intarray - union of arguments int[] | int[] - returns intarray as a union of two arrays int[] & int[] - returns intersection of arrays Oleg Bartunov
2001-09-23please apply attached patch to current CVS.Bruce Momjian
Changes: 1. Added support for boolean queries (indexable operator @@, looks like a @@ '1|(2&3)' 2. Some code cleanup and optimization Regards, Oleg
2001-08-21Restructure pg_opclass, pg_amop, and pg_amproc per previous discussions inTom Lane
pgsql-hackers. pg_opclass now has a row for each opclass supported by each index AM, not a row for each opclass name. This allows pg_opclass to show directly whether an AM supports an opclass, and furthermore makes it possible to store additional information about an opclass that might be AM-dependent. pg_opclass and pg_amop now store "lossy" and "haskeytype" information that we previously expected the user to remember to provide in CREATE INDEX commands. Lossiness is no longer an index-level property, but is associated with the use of a particular operator in a particular index opclass. Along the way, IndexSupportInitialize now uses the syscaches to retrieve pg_amop and pg_amproc entries. I find this reduces backend launch time by about ten percent, at the cost of a couple more special cases in catcache.c's IndexScanOK. Initial work by Oleg Bartunov and Teodor Sigaev, further hacking by Tom Lane. initdb forced.
2001-03-17Update contrib intarray to Jan 25 version.Bruce Momjian
2001-01-12commit Oleg and Teodor's RD-tree implementation ... this provides theMarc G. Fournier
regression tests for the GiST changes ... this should be integrated into the regular regression tests similar to Vadim's SPI contrib stuff ...