Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit ddbe8dc

Browse files
committed
Add a heuristic to transformAExprIn() to make it prefer expanding "x IN (list)"
into an OR of equality comparisons, rather than x = ANY(ARRAY[...]), when there are Vars in the right-hand side. This avoids a performance regression compared to pre-8.2 releases, in cases where the OR form can be optimized into scans of multiple indexes. Limit the possible downside by preferring this form only when the list isn't very long (I set the cutoff at 32 elements, which is a bit arbitrary but in the right ballpark). Per discussion with Jim Nasby. In passing, also make it try the OR form if it cannot select a common type for the array elements; we've seen a complaint or two about how the OR form worked for such cases and ARRAY doesn't.
1 parent 312b1a9 commit ddbe8dc

File tree

2 files changed

+35
-15
lines changed

2 files changed

+35
-15
lines changed

src/backend/parser/parse_coerce.c

+7-4
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/parser/parse_coerce.c,v 2.169 2008/10/13 16:25:19 tgl Exp $
11+
* $PostgreSQL: pgsql/src/backend/parser/parse_coerce.c,v 2.170 2008/10/25 17:19:09 tgl Exp $
1212
*
1313
*-------------------------------------------------------------------------
1414
*/
@@ -1085,13 +1085,14 @@ parser_coercion_errposition(ParseState *pstate,
10851085
/*
10861086
* select_common_type()
10871087
* Determine the common supertype of a list of input expressions.
1088-
* This is used for determining the output type of CASE and UNION
1089-
* constructs.
1088+
* This is used for determining the output type of CASE, UNION,
1089+
* and similar constructs.
10901090
*
10911091
* 'exprs' is a *nonempty* list of expressions. Note that earlier items
10921092
* in the list will be preferred if there is doubt.
10931093
* 'context' is a phrase to use in the error message if we fail to select
1094-
* a usable type.
1094+
* a usable type. Pass NULL to have the routine return InvalidOid
1095+
* rather than throwing an error on failure.
10951096
* 'which_expr': if not NULL, receives a pointer to the particular input
10961097
* expression from which the result type was taken.
10971098
*/
@@ -1166,6 +1167,8 @@ select_common_type(ParseState *pstate, List *exprs, const char *context,
11661167
/*
11671168
* both types in different categories? then not much hope...
11681169
*/
1170+
if (context == NULL)
1171+
return InvalidOid;
11691172
ereport(ERROR,
11701173
(errcode(ERRCODE_DATATYPE_MISMATCH),
11711174
/*------

src/backend/parser/parse_expr.c

+28-11
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/parser/parse_expr.c,v 1.235 2008/10/06 17:39:26 tgl Exp $
11+
* $PostgreSQL: pgsql/src/backend/parser/parse_expr.c,v 1.236 2008/10/25 17:19:09 tgl Exp $
1212
*
1313
*-------------------------------------------------------------------------
1414
*/
@@ -20,6 +20,7 @@
2020
#include "miscadmin.h"
2121
#include "nodes/makefuncs.h"
2222
#include "nodes/nodeFuncs.h"
23+
#include "optimizer/var.h"
2324
#include "parser/analyze.h"
2425
#include "parser/parse_coerce.h"
2526
#include "parser/parse_expr.h"
@@ -974,29 +975,45 @@ transformAExprIn(ParseState *pstate, A_Expr *a)
974975
}
975976

976977
/*
977-
* If not forced by presence of RowExpr, try to resolve a common scalar
978-
* type for all the expressions, and see if it has an array type. (But if
979-
* there's only one righthand expression, we may as well just fall through
980-
* and generate a simple = comparison.)
978+
* We prefer a boolean tree to ScalarArrayOpExpr if any of these are true:
979+
*
980+
* 1. We have a RowExpr anywhere.
981+
*
982+
* 2. There's only one righthand expression --- best to just generate a
983+
* simple = comparison.
984+
*
985+
* 3. There's a reasonably small number of righthand expressions and
986+
* they contain any Vars. This is a heuristic to support cases like
987+
* WHERE '555-1212' IN (tab.home_phone, tab.work_phone), which can be
988+
* optimized into an OR of indexscans on different indexes so long as
989+
* it's left as an OR tree. (It'd be better to leave this decision
990+
* to the planner, no doubt, but the amount of code required to reformat
991+
* the expression later on seems out of proportion to the benefit.)
981992
*/
982-
if (!haveRowExpr && list_length(rexprs) != 1)
993+
if (!(haveRowExpr ||
994+
list_length(rexprs) == 1 ||
995+
(list_length(rexprs) <= 32 &&
996+
contain_vars_of_level((Node *) rexprs, 0))))
983997
{
984998
List *allexprs;
985999
Oid scalar_type;
9861000
Oid array_type;
9871001

9881002
/*
989-
* Select a common type for the array elements. Note that since the
990-
* LHS' type is first in the list, it will be preferred when there is
991-
* doubt (eg, when all the RHS items are unknown literals).
1003+
* Try to select a common type for the array elements. Note that
1004+
* since the LHS' type is first in the list, it will be preferred when
1005+
* there is doubt (eg, when all the RHS items are unknown literals).
9921006
*
9931007
* Note: use list_concat here not lcons, to avoid damaging rexprs.
9941008
*/
9951009
allexprs = list_concat(list_make1(lexpr), rexprs);
996-
scalar_type = select_common_type(pstate, allexprs, "IN", NULL);
1010+
scalar_type = select_common_type(pstate, allexprs, NULL, NULL);
9971011

9981012
/* Do we have an array type to use? */
999-
array_type = get_array_type(scalar_type);
1013+
if (OidIsValid(scalar_type))
1014+
array_type = get_array_type(scalar_type);
1015+
else
1016+
array_type = InvalidOid;
10001017
if (array_type != InvalidOid)
10011018
{
10021019
/*

0 commit comments

Comments
 (0)