@@ -5203,10 +5203,37 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
The quantifiers <literal>{1,1}</> and <literal>{1,1}?</>
can be used to force greediness or non-greediness, respectively,
on a subexpression or a whole RE.
+ This is useful when you need the whole RE to have a greediness attribute
+ different from what's deduced from its elements. As an example,
+ suppose that we are trying to separate a string containing some digits
+ into the digits and the parts before and after them. We might try to
+ do that like this:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(.*)(\d+)(.*)');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc0123,4,xyz}</computeroutput>
+ </screen>
+ That didn't work: the first <literal>.*</> is greedy so
+ it <quote>eats</> as much as it can, leaving the <literal>\d+</> to
+ match at the last possible place, the last digit. We might try to fix
+ that by making it non-greedy:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(.*?)(\d+)(.*)');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc,0,""}</computeroutput>
+ </screen>
+ That didn't work either, because now the RE as a whole is non-greedy
+ and so it ends the overall match as soon as possible. We can get what
+ we want by forcing the RE as a whole to be greedy:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc,01234,xyz}</computeroutput>
+ </screen>
+ Controlling the RE's overall greediness separately from its components'
+ greediness allows great flexibility in handling variable-length patterns.
</para>
<para>
- Match lengths are measured in characters, not collating elements.
+ When deciding what is a longer or shorter match,
+ match lengths are measured in characters, not collating elements.
An empty string is considered longer than no match at all.
For example:
<literal>bb*</>