
Commit 1a8b9fb

Extend the unknowns-are-same-as-known-inputs type resolution heuristic.
For a very long time, one of the parser's heuristics for resolving ambiguous operator calls has been to assume that unknown-type literals are of the same type as the other input (if it's known). However, this was only used in the first step of quickly checking for an exact-types match, and thus did not help in resolving matches that require coercion, such as matches to polymorphic operators. As we add more polymorphic operators, this becomes more of a problem.

This patch adds another use of the same heuristic as a last-ditch check before failing to resolve an ambiguous operator or function call. In particular this will let us define the range inclusion operator in a less limited way (to come in a follow-on patch).
1 parent bf4f96b commit 1a8b9fb
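
The heuristic described in the commit message can be illustrated with a small standalone sketch. This is not the PostgreSQL implementation: the `ToyType` enum, `ToyCandidate` struct, `toy_can_coerce()`, `toy_last_gasp()`, and `toy_demo_array_inclusion()` are all hypothetical names invented for illustration, and the coercion check is a deliberately tiny stand-in for the real implicit-coercion rules. It only shows the shape of the idea: if every known-type input has the same type, pretend the unknown inputs have that type too, and accept a candidate only if it is the single one that can take that type at every position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy type IDs; PostgreSQL itself uses catalog Oids. */
typedef enum
{
    T_UNKNOWN = 0,
    T_INT4ARRAY,
    T_ANYARRAY,
    T_ANYELEMENT,
    T_ANYRANGE
} ToyType;

#define TOY_MAX_ARGS 4

typedef struct
{
    const char *label;
    int         nargs;
    ToyType     args[TOY_MAX_ARGS];
} ToyCandidate;

/* Toy stand-in for an implicit-coercion check (an assumption for this sketch). */
static bool
toy_can_coerce(ToyType from, ToyType to)
{
    if (from == to || to == T_ANYELEMENT)
        return true;
    return from == T_INT4ARRAY && to == T_ANYARRAY;
}

/*
 * Return the unique candidate that accepts the common known type at every
 * argument position, or NULL if the heuristic does not apply or the match
 * is not unique.
 */
const ToyCandidate *
toy_last_gasp(const ToyType *input, int nargs,
              const ToyCandidate *cands, size_t ncands)
{
    ToyType     known = T_UNKNOWN;
    int         nunknowns = 0;
    const ToyCandidate *match = NULL;

    assert(nargs > 0 && nargs <= TOY_MAX_ARGS);

    /* Find the single known type, if there is exactly one. */
    for (int i = 0; i < nargs; i++)
    {
        if (input[i] == T_UNKNOWN)
            nunknowns++;
        else if (known == T_UNKNOWN)
            known = input[i];
        else if (known != input[i])
            return NULL;        /* known inputs disagree */
    }
    if (nunknowns == 0 || known == T_UNKNOWN)
        return NULL;            /* need both known and unknown inputs */

    /* Treat every input as the known type and look for a unique match. */
    for (size_t c = 0; c < ncands; c++)
    {
        bool        ok = (cands[c].nargs == nargs);

        for (int i = 0; ok && i < nargs; i++)
            ok = toy_can_coerce(known, cands[c].args[i]);
        if (ok)
        {
            if (match != NULL)
                return NULL;    /* not unique, give up */
            match = &cands[c];
        }
    }
    return match;
}

/* Models int4[] <@ unknown; returns the index of the chosen candidate, or -1. */
int
toy_demo_array_inclusion(void)
{
    static const ToyCandidate cands[] = {
        {"array inclusion", 2, {T_ANYARRAY, T_ANYARRAY}},
        {"range inclusion", 2, {T_ANYELEMENT, T_ANYRANGE}}
    };
    const ToyType input[2] = {T_INT4ARRAY, T_UNKNOWN};
    const ToyCandidate *m = toy_last_gasp(input, 2, cands, 2);

    return m ? (int) (m - cands) : -1;
}
```

In this toy model, `toy_demo_array_inclusion()` picks the array-inclusion candidate, because an integer array cannot be accepted at the `anyrange` position of the range-inclusion candidate.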

2 files changed, +147 -26 lines changed


doc/src/sgml/typeconv.sgml

Lines changed: 48 additions & 8 deletions
@@ -304,13 +304,18 @@ without more clues. Now discard
 candidates that do not accept the selected type category. Furthermore,
 if any candidate accepts a preferred type in that category,
 discard candidates that accept non-preferred types for that argument.
+Keep all candidates if none survive these tests.
+If only one candidate remains, use it; else continue to the next step.
 </para>
 </step>
 <step performance="required">
 <para>
-If only one candidate remains, use it. If no candidate or more than one
-candidate remains,
-then fail.
+If there are both <type>unknown</type> and known-type arguments, and all
+the known-type arguments have the same type, assume that the
+<type>unknown</type> arguments are also of that type, and check which
+candidates can accept that type at the <type>unknown</type>-argument
+positions. If exactly one candidate passes this test, use it.
+Otherwise, fail.
 </para>
 </step>
 </substeps>
@@ -376,7 +381,7 @@ be interpreted as type <type>text</type>.
 </para>
 
 <para>
-Here is a concatenation on unspecified types:
+Here is a concatenation of two values of unspecified types:
 <screen>
 SELECT 'abc' || 'def' AS "unspecified";
 
@@ -394,7 +399,7 @@ and finds that there are candidates accepting both string-category and
 bit-string-category inputs. Since string category is preferred when available,
 that category is selected, and then the
 preferred type for strings, <type>text</type>, is used as the specific
-type to resolve the unknown literals as.
+type to resolve the unknown-type literals as.
 </para>
 </example>
 
@@ -450,6 +455,36 @@ SELECT ~ CAST('20' AS int8) AS "negation";
 </para>
 </example>
 
+<example>
+<title>Array Inclusion Operator Type Resolution</title>
+
+<para>
+Here is another example of resolving an operator with one known and one
+unknown input:
+<screen>
+SELECT array[1,2] &lt;@ '{1,2,3}' as "is subset";
+
+ is subset
+-----------
+ t
+(1 row)
+</screen>
+The <productname>PostgreSQL</productname> operator catalog has several
+entries for the infix operator <literal>&lt;@</>, but the only two that
+could possibly accept an integer array on the left-hand side are
+array inclusion (<type>anyarray</> <literal>&lt;@</> <type>anyarray</>)
+and range inclusion (<type>anyelement</> <literal>&lt;@</> <type>anyrange</>).
+Since none of these polymorphic pseudo-types (see <xref
+linkend="datatype-pseudo">) are considered preferred, the parser cannot
+resolve the ambiguity on that basis. However, the last resolution rule tells
+it to assume that the unknown-type literal is of the same type as the other
+input, that is, integer array. Now only one of the two operators can match,
+so array inclusion is selected. (Had range inclusion been selected, we would
+have gotten an error, because the string does not have the right format to be
+a range literal.)
+</para>
+</example>
+
 </sect1>
 
 <sect1 id="typeconv-func">
@@ -594,13 +629,18 @@ the correct choice cannot be deduced without more clues.
 Now discard candidates that do not accept the selected type category.
 Furthermore, if any candidate accepts a preferred type in that category,
 discard candidates that accept non-preferred types for that argument.
+Keep all candidates if none survive these tests.
+If only one candidate remains, use it; else continue to the next step.
 </para>
 </step>
 <step performance="required">
 <para>
-If only one candidate remains, use it. If no candidate or more than one
-candidate remains,
-then fail.
+If there are both <type>unknown</type> and known-type arguments, and all
+the known-type arguments have the same type, assume that the
+<type>unknown</type> arguments are also of that type, and check which
+candidates can accept that type at the <type>unknown</type>-argument
+positions. If exactly one candidate passes this test, use it.
+Otherwise, fail.
 </para>
 </step>
 </substeps>

src/backend/parser/parse_func.c

Lines changed: 99 additions & 18 deletions
@@ -618,14 +618,16 @@ func_select_candidate(int nargs,
                       Oid *input_typeids,
                       FuncCandidateList candidates)
 {
-    FuncCandidateList current_candidate;
-    FuncCandidateList last_candidate;
+    FuncCandidateList current_candidate,
+                first_candidate,
+                last_candidate;
     Oid *current_typeids;
     Oid current_type;
     int i;
     int ncandidates;
     int nbestMatch,
-                nmatch;
+                nmatch,
+                nunknowns;
     Oid input_base_typeids[FUNC_MAX_ARGS];
     TYPCATEGORY slot_category[FUNC_MAX_ARGS],
                 current_category;
@@ -651,9 +653,22 @@ func_select_candidate(int nargs,
      * take a domain as an input datatype. Such a function will be selected
      * over the base-type function only if it is an exact match at all
      * argument positions, and so was already chosen by our caller.
+     *
+     * While we're at it, count the number of unknown-type arguments for use
+     * later.
      */
+    nunknowns = 0;
     for (i = 0; i < nargs; i++)
-        input_base_typeids[i] = getBaseType(input_typeids[i]);
+    {
+        if (input_typeids[i] != UNKNOWNOID)
+            input_base_typeids[i] = getBaseType(input_typeids[i]);
+        else
+        {
+            /* no need to call getBaseType on UNKNOWNOID */
+            input_base_typeids[i] = UNKNOWNOID;
+            nunknowns++;
+        }
+    }
 
     /*
      * Run through all candidates and keep those with the most matches on
749764
return candidates;
750765

751766
/*
752-
* Still too many candidates? Try assigning types for the unknown columns.
753-
*
754-
* NOTE: for a binary operator with one unknown and one non-unknown input,
755-
* we already tried the heuristic of looking for a candidate with the
756-
* known input type on both sides (see binary_oper_exact()). That's
757-
* essentially a special case of the general algorithm we try next.
767+
* Still too many candidates? Try assigning types for the unknown inputs.
758768
*
759-
* We do this by examining each unknown argument position to see if we can
769+
* If there are no unknown inputs, we have no more heuristics that apply,
770+
* and must fail.
771+
*/
772+
if (nunknowns == 0)
773+
return NULL; /* failed to select a best candidate */
774+
775+
/*
776+
* The next step examines each unknown argument position to see if we can
760777
* determine a "type category" for it. If any candidate has an input
761778
* datatype of STRING category, use STRING category (this bias towards
762779
* STRING is appropriate since unknown-type literals look like strings).
@@ -770,9 +787,9 @@ func_select_candidate(int nargs,
770787
* Having completed this examination, remove candidates that accept the
771788
* wrong category at any unknown position. Also, if at least one
772789
* candidate accepted a preferred type at a position, remove candidates
773-
* that accept non-preferred types.
774-
*
775-
* If we are down to one candidate at the end, we win.
790+
* that accept non-preferred types. If just one candidate remains,
791+
* return that one. However, if this rule turns out to reject all
792+
* candidates, keep them all instead.
776793
*/
777794
resolved_unknowns = false;
778795
for (i = 0; i < nargs; i++)
@@ -835,6 +852,7 @@ func_select_candidate(int nargs,
835852
{
836853
/* Strip non-matching candidates */
837854
ncandidates = 0;
855+
first_candidate = candidates;
838856
last_candidate = NULL;
839857
for (current_candidate = candidates;
840858
current_candidate != NULL;
@@ -874,15 +892,78 @@ func_select_candidate(int nargs,
874892
if (last_candidate)
875893
last_candidate->next = current_candidate->next;
876894
else
877-
candidates = current_candidate->next;
895+
first_candidate = current_candidate->next;
878896
}
879897
}
880-
if (last_candidate) /* terminate rebuilt list */
898+
899+
/* if we found any matches, restrict our attention to those */
900+
if (last_candidate)
901+
{
902+
candidates = first_candidate;
903+
/* terminate rebuilt list */
881904
last_candidate->next = NULL;
905+
}
906+
907+
if (ncandidates == 1)
908+
return candidates;
882909
}
883910

884-
if (ncandidates == 1)
885-
return candidates;
911+
/*
912+
* Last gasp: if there are both known- and unknown-type inputs, and all
913+
* the known types are the same, assume the unknown inputs are also that
914+
* type, and see if that gives us a unique match. If so, use that match.
915+
*
916+
* NOTE: for a binary operator with one unknown and one non-unknown input,
917+
* we already tried this heuristic in binary_oper_exact(). However, that
918+
* code only finds exact matches, whereas here we will handle matches that
919+
* involve coercion, polymorphic type resolution, etc.
920+
*/
921+
if (nunknowns < nargs)
922+
{
923+
Oid known_type = UNKNOWNOID;
924+
925+
for (i = 0; i < nargs; i++)
926+
{
927+
if (input_base_typeids[i] == UNKNOWNOID)
928+
continue;
929+
if (known_type == UNKNOWNOID) /* first known arg? */
930+
known_type = input_base_typeids[i];
931+
else if (known_type != input_base_typeids[i])
932+
{
933+
/* oops, not all match */
934+
known_type = UNKNOWNOID;
935+
break;
936+
}
937+
}
938+
939+
if (known_type != UNKNOWNOID)
940+
{
941+
/* okay, just one known type, apply the heuristic */
942+
for (i = 0; i < nargs; i++)
943+
input_base_typeids[i] = known_type;
944+
ncandidates = 0;
945+
last_candidate = NULL;
946+
for (current_candidate = candidates;
947+
current_candidate != NULL;
948+
current_candidate = current_candidate->next)
949+
{
950+
current_typeids = current_candidate->args;
951+
if (can_coerce_type(nargs, input_base_typeids, current_typeids,
952+
COERCION_IMPLICIT))
953+
{
954+
if (++ncandidates > 1)
955+
break; /* not unique, give up */
956+
last_candidate = current_candidate;
957+
}
958+
}
959+
if (ncandidates == 1)
960+
{
961+
/* successfully identified a unique match */
962+
last_candidate->next = NULL;
963+
return last_candidate;
964+
}
965+
}
966+
}
886967

887968
return NULL; /* failed to select a best candidate */
888969
} /* func_select_candidate() */
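
The `first_candidate` change in the last hunk is what makes "keep all candidates if none survive" possible: the list head is only replaced once at least one candidate has been kept. A minimal standalone sketch of that pattern follows. The `Node` type, `filter_or_keep_all()`, and the two predicates are hypothetical names made up for this illustration, not part of `parse_func.c`, and the sketch assumes a plain singly linked list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node standing in for a candidate entry. */
typedef struct Node
{
    int          value;
    struct Node *next;
} Node;

/*
 * Unlink nodes rejected by keep(). If no node is accepted, the original
 * list is returned untouched and *nkept is set to 0.
 */
Node *
filter_or_keep_all(Node *candidates, bool (*keep) (const Node *), int *nkept)
{
    Node       *first = candidates;
    Node       *last = NULL;
    int         n = 0;

    assert(keep != NULL && nkept != NULL);

    for (Node *cur = candidates; cur != NULL; cur = cur->next)
    {
        if (keep(cur))
        {
            last = cur;
            n++;
        }
        else if (last)
            last->next = cur->next; /* splice out after a kept node */
        else
            first = cur->next;      /* only a tentative new head so far */
    }

    /* Commit the filtered list only if something survived. */
    if (last)
    {
        candidates = first;
        last->next = NULL;
    }

    *nkept = n;
    return candidates;
}

/* Example predicates for the sketch. */
bool
keep_even(const Node *n)
{
    return n->value % 2 == 0;
}

bool
keep_none(const Node *n)
{
    (void) n;
    return false;
}
```

Until some node is kept, no `next` pointer is modified, so abandoning the tentative head leaves the input list exactly as it was. That is why the filter can fall back to the full candidate set without making a copy first.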
