Fix conversion of SIMILAR TO regexes for character classes
The code that translates SIMILAR TO pattern matching expressions to
POSIX-style regular expressions did not consider that square brackets
can be nested. For example, in an expression like [[:alpha:]%_], the
logic replaced the placeholders '_' and '%', although it should have left
them untouched inside the character class.
This commit fixes the conversion logic by tracking the nesting level of
square brackets marking character class areas, while considering that
in expressions like []] or [^]] the first closing square bracket is a
regular character. Multiple tests are added to show how the conversions
should or should not apply while in a character class area, with
specific cases added for all the characters converted outside character
classes like an opening parenthesis '(', dollar sign '$', etc.
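
The bracket tracking described above can be illustrated with a minimal
sketch. This is not the code committed to regexp.c: the function name
similar_to_posix_sketch and the variables depth and at_start are
hypothetical, only '%' and '_' are converted, escape handling is left out,
and every '[' seen inside a character class is assumed to open a nested
level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the regexp.c implementation: convert the
 * SIMILAR TO placeholders '%' and '_' to their POSIX equivalents, but only
 * outside square brackets.  Escapes and all other special characters are
 * deliberately ignored.
 */
void
similar_to_posix_sketch(const char *pat, char *out, size_t outlen)
{
    size_t      o = 0;
    int         depth = 0;      /* nesting level of '['; 0 means outside */
    int         at_start = 0;   /* 1 right after the opening '[', 2 right
                                 * after "[^", 0 otherwise */

    assert(pat != NULL && out != NULL);
    /* Worst case: every input character expands to two output characters. */
    assert(outlen >= 2 * strlen(pat) + 1);

    for (; *pat != '\0'; pat++)
    {
        char        c = *pat;

        if (depth > 0)
        {
            /* Inside a character class everything is copied verbatim. */
            out[o++] = c;

            if (c == ']' && at_start == 0)
                depth--;        /* a real closing bracket */
            else if (c == '[')
                depth++;        /* nested class such as [:alpha:] */

            /*
             * A ']' directly after '[' or "[^" is a regular character, so
             * remember whether we are still at the start of the class.
             */
            at_start = (c == '^' && at_start == 1) ? 2 : 0;
        }
        else if (c == '[')
        {
            out[o++] = c;
            depth = 1;
            at_start = 1;
        }
        else if (c == '%')
        {
            out[o++] = '.';
            out[o++] = '*';
        }
        else if (c == '_')
            out[o++] = '.';
        else
            out[o++] = c;
    }
    out[o] = '\0';
}
```

With this sketch, the pattern [[:alpha:]%_] comes through unchanged, while
a%[]_]_ becomes a.*[]_]. because only the placeholders outside the brackets
are converted.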
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
Backpatch-through: 13
Branch
------
REL_15_STABLE
Details
-------
https://git.postgresql.org/pg/commitdiff/b3e99115e44c5040c949c99d081ff3812e6ec4a3
Modified Files
--------------
src/backend/utils/adt/regexp.c | 38 +++++++++++++++++----
src/test/regress/expected/strings.out | 62 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/strings.sql | 20 +++++++++++
3 files changed, 114 insertions(+), 6 deletions(-)