Commit 1b5d34c

Docs: add an explicit example about controlling overall greediness of REs.
Per discussion of bug #13538.
1 parent 3bdd7f9 commit 1b5d34c

1 file changed: +28 -1 lines changed

doc/src/sgml/func.sgml

@@ -5203,10 +5203,37 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
 The quantifiers <literal>{1,1}</> and <literal>{1,1}?</>
 can be used to force greediness or non-greediness, respectively,
 on a subexpression or a whole RE.
+This is useful when you need the whole RE to have a greediness attribute
+different from what's deduced from its elements. As an example,
+suppose that we are trying to separate a string containing some digits
+into the digits and the parts before and after them. We might try to
+do that like this:
+<screen>
+SELECT regexp_matches('abc01234xyz', '(.*)(\d+)(.*)');
+<lineannotation>Result: </lineannotation><computeroutput>{abc0123,4,xyz}</computeroutput>
+</screen>
+That didn't work: the first <literal>.*</> is greedy so
+it <quote>eats</> as much as it can, leaving the <literal>\d+</> to
+match at the last possible place, the last digit. We might try to fix
+that by making it non-greedy:
+<screen>
+SELECT regexp_matches('abc01234xyz', '(.*?)(\d+)(.*)');
+<lineannotation>Result: </lineannotation><computeroutput>{abc,0,""}</computeroutput>
+</screen>
+That didn't work either, because now the RE as a whole is non-greedy
+and so it ends the overall match as soon as possible. We can get what
+we want by forcing the RE as a whole to be greedy:
+<screen>
+SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
+<lineannotation>Result: </lineannotation><computeroutput>{abc,01234,xyz}</computeroutput>
+</screen>
+Controlling the RE's overall greediness separately from its components'
+greediness allows great flexibility in handling variable-length patterns.
 </para>
 
 <para>
-Match lengths are measured in characters, not collating elements.
+When deciding what is a longer or shorter match,
+match lengths are measured in characters, not collating elements.
 An empty string is considered longer than no match at all.
 For example:
 <literal>bb*</>
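
The added text demonstrates only the `{1,1}` direction (forcing the whole RE to be greedy). The converse case mentioned in the context lines, `{1,1}?` forcing non-greediness, might be sketched as below. This query is illustrative and not part of the commit; it reuses the sample string from the diff and wraps an otherwise greedy pattern in a non-capturing group, on the assumption that the greediness rules described in the surrounding documentation apply.

```sql
-- Illustrative only: not taken from the commit.
-- The inner pattern starts with a greedy (.*), so by itself the RE is greedy.
-- Wrapping it in (?: ... ){1,1}? should give the RE as a whole a non-greedy
-- attribute, so the overall match is expected to end as early as possible,
-- while each subexpression keeps its own greediness inside that match.
SELECT regexp_matches('abc01234xyz', '(?:(.*)(\d+)(.*)){1,1}?');
```

Comparing its output with the three queries in the diff shows how the overall attribute and the per-subexpression attributes interact.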
