Commit de97072

Allow merge and hash joins to occur on arbitrary expressions (anything not
containing a volatile function), rather than only on 'Var = Var' clauses as
before. This makes it practical to do flatten_join_alias_vars at the start
of planning, which in turn eliminates a bunch of klugery inside the planner
to deal with alias vars.

As a free side effect, we now detect implied equality of non-Var
expressions; for example in
    SELECT ... WHERE a.x = b.y and b.y = 42
we will deduce a.x = 42 and use that as a restriction qual on a. Also, we
can remove the restriction introduced 12/5/02 to prevent pullup of
subqueries whose targetlists contain sublinks.

Still TODO: make statistical estimation routines in selfuncs.c and
costsize.c smarter about expressions that are more complex than plain Vars.
The need for this is considerably greater now that we have to be able to
estimate the suitability of merge and hash join techniques on such
expressions.
1 parent 0eed62f commit de97072

32 files changed, with 520 additions and 659 deletions
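
The implied-equality deduction described in the commit message can be illustrated with a small standalone sketch. This is not the planner's code: the expression ids, the `EqSet` type, and the helper names are all hypothetical, and a plain union-find stands in for whatever bookkeeping the planner really uses. It only shows why recording `a.x = b.y` and `b.y = 42` is enough to conclude `a.x = 42`.

```c
#include <stdbool.h>

/* Hypothetical ids for the expressions in: WHERE a.x = b.y AND b.y = 42 */
enum { EXPR_A_X, EXPR_B_Y, EXPR_CONST_42, NUM_EXPRS };

/* Union-find parent array: each expression starts in its own set. */
typedef struct EqSet
{
    int parent[NUM_EXPRS];
} EqSet;

static void
eqset_init(EqSet *s)
{
    for (int i = 0; i < NUM_EXPRS; i++)
        s->parent[i] = i;
}

/* Find the representative of e's set, shortening the path as we go. */
static int
eqset_find(EqSet *s, int e)
{
    while (s->parent[e] != e)
    {
        s->parent[e] = s->parent[s->parent[e]];
        e = s->parent[e];
    }
    return e;
}

/* Record an equality clause "l = r" by merging the two sets. */
static void
eqset_add_equality(EqSet *s, int l, int r)
{
    int lroot = eqset_find(s, l);
    int rroot = eqset_find(s, r);

    s->parent[lroot] = rroot;
}

/* True if "l = r" follows from the equalities recorded so far. */
static bool
eqset_implies(EqSet *s, int l, int r)
{
    return eqset_find(s, l) == eqset_find(s, r);
}
```

After adding the two clauses from the example query, `eqset_implies(&s, EXPR_A_X, EXPR_CONST_42)` reports true, which corresponds to the derived restriction qual on a.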

doc/src/sgml/xoper.sgml

Lines changed: 30 additions & 1 deletion
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/xoper.sgml,v 1.21 2003/01/06 01:20:40 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/xoper.sgml,v 1.22 2003/01/15 19:35:35 tgl Exp $
 -->
 
 <Chapter Id="xoper">
@@ -375,6 +375,27 @@ table1.column1 OP table2.column2
     equality operators that are (or could be) implemented by <function>memcmp()</function>.
    </para>
 
+   <note>
+    <para>
+     The function underlying a hashjoinable operator must be marked
+     immutable or stable. If it is volatile, the system will never
+     attempt to use the operator for a hash join.
+    </para>
+   </note>
+
+   <note>
+    <para>
+     If a hashjoinable operator has an underlying function that is marked
+     strict, the
+     function must also be complete: that is, it should return TRUE or
+     FALSE, never NULL, for any two non-NULL inputs. If this rule is
+     not followed, hash-optimization of <literal>IN</> operations may
+     generate wrong results. (Specifically, <literal>IN</> might return
+     FALSE where the correct answer per spec would be NULL; or it might
+     yield an error complaining that it wasn't prepared for a NULL result.)
+    </para>
+   </note>
+
   </sect2>
 
   <sect2>
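
The "complete" requirement in the second note added above is easier to see with three-valued logic written out. The following is a minimal sketch, not PostgreSQL code: `TriValue`, `in_per_spec`, `in_hashed_sketch`, and both comparison functions are hypothetical names, and the hashed variant is only a stand-in that assumes a failed probe is reported as FALSE.

```c
#include <stdbool.h>
#include <stddef.h>

/* SQL-style three-valued result (hypothetical type for this sketch). */
typedef enum { TV_FALSE, TV_TRUE, TV_NULL } TriValue;

typedef TriValue (*EqFunc) (int a, int b);

/* A complete equality function: always TRUE or FALSE for non-NULL inputs. */
static TriValue
eq_complete(int a, int b)
{
    return (a == b) ? TV_TRUE : TV_FALSE;
}

/*
 * A hypothetical incomplete equality function: it gives up and returns
 * NULL whenever either (non-NULL) input is negative.
 */
static TriValue
eq_incomplete(int a, int b)
{
    if (a < 0 || b < 0)
        return TV_NULL;
    return (a == b) ? TV_TRUE : TV_FALSE;
}

/* IN evaluated item by item: TRUE if any match, else NULL if any NULL. */
static TriValue
in_per_spec(int needle, const int *list, size_t n, EqFunc eq)
{
    bool saw_null = false;

    for (size_t i = 0; i < n; i++)
    {
        TriValue r = eq(needle, list[i]);

        if (r == TV_TRUE)
            return TV_TRUE;
        if (r == TV_NULL)
            saw_null = true;
    }
    return saw_null ? TV_NULL : TV_FALSE;
}

/*
 * Stand-in for a hashed IN: the linear search plays the role of a hash
 * probe, and a failed probe is assumed to mean FALSE.  The comparison
 * function is never consulted for non-matching entries.
 */
static TriValue
in_hashed_sketch(int needle, const int *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (list[i] == needle)
            return TV_TRUE;
    }
    return TV_FALSE;
}
```

With the list `{1, -2, 3}` and the value 5, the item-by-item evaluation using `eq_incomplete` yields NULL, while the hashed stand-in yields FALSE. That mismatch is the kind of wrong result the note warns about; with `eq_complete` both paths agree.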
@@ -472,6 +493,14 @@ table1.column1 OP table2.column2
     </itemizedlist>
    </para>
 
+   <note>
+    <para>
+     The function underlying a mergejoinable operator must be marked
+     immutable or stable. If it is volatile, the system will never
+     attempt to use the operator for a merge join.
+    </para>
+   </note>
+
    <note>
     <para>
      <literal>GROUP BY</> and <literal>DISTINCT</> operations require each
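
The volatility rule stated in the hash join and merge join notes amounts to a simple gate. A minimal sketch follows, assuming a hypothetical `JoinOperator` record and hypothetical helper names; the real planner reads these properties from the system catalogs rather than from a struct like this.

```c
#include <stdbool.h>

/* Volatility categories of the function underlying an operator. */
typedef enum { FUNC_IMMUTABLE, FUNC_STABLE, FUNC_VOLATILE } FuncVolatility;

/* Hypothetical summary of the operator properties relevant to joins. */
typedef struct JoinOperator
{
    const char    *name;
    bool           hashjoinable;    /* operator is marked hashjoinable */
    bool           mergejoinable;   /* operator is marked mergejoinable */
    FuncVolatility volatility;      /* volatility of underlying function */
} JoinOperator;

/* A volatile underlying function rules the operator out for hash joins. */
static bool
usable_for_hash_join(const JoinOperator *op)
{
    return op->hashjoinable && op->volatility != FUNC_VOLATILE;
}

/* The same rule applies to merge joins. */
static bool
usable_for_merge_join(const JoinOperator *op)
{
    return op->mergejoinable && op->volatility != FUNC_VOLATILE;
}
```

An operator flagged hashjoinable and mergejoinable but backed by a volatile function fails both checks, so only a nested-loop style evaluation of the clause would remain available for it.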

src/backend/nodes/copyfuncs.c

Lines changed: 3 additions & 1 deletion
@@ -15,7 +15,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- * $Header: /cvsroot/pgsql/src/backend/nodes/copyfuncs.c,v 1.235 2003/01/10 21:08:10 tgl Exp $
+ * $Header: /cvsroot/pgsql/src/backend/nodes/copyfuncs.c,v 1.236 2003/01/15 19:35:35 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -1059,6 +1059,8 @@ _copyRestrictInfo(RestrictInfo *from)
     COPY_NODE_FIELD(subclauseindices); /* XXX probably bad */
     COPY_SCALAR_FIELD(eval_cost);
     COPY_SCALAR_FIELD(this_selec);
+    COPY_INTLIST_FIELD(left_relids);
+    COPY_INTLIST_FIELD(right_relids);
     COPY_SCALAR_FIELD(mergejoinoperator);
     COPY_SCALAR_FIELD(left_sortop);
     COPY_SCALAR_FIELD(right_sortop);

src/backend/nodes/equalfuncs.c

Lines changed: 5 additions & 5 deletions
@@ -18,7 +18,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- * $Header: /cvsroot/pgsql/src/backend/nodes/equalfuncs.c,v 1.179 2003/01/10 21:08:10 tgl Exp $
+ * $Header: /cvsroot/pgsql/src/backend/nodes/equalfuncs.c,v 1.180 2003/01/15 19:35:37 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -464,10 +464,10 @@ _equalRestrictInfo(RestrictInfo *a, RestrictInfo *b)
     COMPARE_NODE_FIELD(clause);
     COMPARE_SCALAR_FIELD(ispusheddown);
     /*
-     * We ignore subclauseindices, eval_cost, this_selec, left/right_pathkey,
-     * and left/right_bucketsize, since they may not be set yet, and should be
-     * derivable from the clause anyway. Probably it's not really necessary
-     * to compare any of these remaining fields ...
+     * We ignore subclauseindices, eval_cost, this_selec, left/right_relids,
+     * left/right_pathkey, and left/right_bucketsize, since they may not be
+     * set yet, and should be derivable from the clause anyway. Probably it's
+     * not really necessary to compare any of these remaining fields ...
      */
     COMPARE_SCALAR_FIELD(mergejoinoperator);
     COMPARE_SCALAR_FIELD(left_sortop);

src/backend/nodes/outfuncs.c

Lines changed: 3 additions & 1 deletion
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- * $Header: /cvsroot/pgsql/src/backend/nodes/outfuncs.c,v 1.192 2003/01/10 21:08:11 tgl Exp $
+ * $Header: /cvsroot/pgsql/src/backend/nodes/outfuncs.c,v 1.193 2003/01/15 19:35:39 tgl Exp $
  *
  * NOTES
  * Every node type that can appear in stored rules' parsetrees *must*
@@ -952,6 +952,8 @@ _outRestrictInfo(StringInfo str, RestrictInfo *node)
     WRITE_NODE_FIELD(clause);
     WRITE_BOOL_FIELD(ispusheddown);
     WRITE_NODE_FIELD(subclauseindices);
+    WRITE_INTLIST_FIELD(left_relids);
+    WRITE_INTLIST_FIELD(right_relids);
     WRITE_OID_FIELD(mergejoinoperator);
     WRITE_OID_FIELD(left_sortop);
     WRITE_OID_FIELD(right_sortop);

src/backend/nodes/print.c

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- * $Header: /cvsroot/pgsql/src/backend/nodes/print.c,v 1.58 2002/12/12 15:49:28 tgl Exp $
+ * $Header: /cvsroot/pgsql/src/backend/nodes/print.c,v 1.59 2003/01/15 19:35:39 tgl Exp $
  *
  * HISTORY
  * AUTHOR DATE MAJOR EVENT
@@ -370,10 +370,10 @@ print_expr(Node *expr, List *rtable)
         {
             char *opname;
 
-            print_expr((Node *) get_leftop(e), rtable);
+            print_expr(get_leftop(e), rtable);
             opname = get_opname(((OpExpr *) e)->opno);
             printf(" %s ", ((opname != NULL) ? opname : "(invalid operator)"));
-            print_expr((Node *) get_rightop(e), rtable);
+            print_expr(get_rightop(e), rtable);
         }
         else
             printf("an expr");

src/backend/optimizer/README

Lines changed: 4 additions & 2 deletions
@@ -251,8 +251,10 @@ Optimizer Data Structures
 
 RelOptInfo - a relation or joined relations
 
- RestrictInfo - restriction clauses, like "x = 3"
- JoinInfo - join clauses, including the relids needed for the join
+ RestrictInfo - WHERE clauses, like "x = 3" or "y = z"
+                (note the same structure is used for restriction and
+                join clauses)
+ JoinInfo - join clauses associated with a particular pair of relations
 
  Path - every way to generate a RelOptInfo(sequential,index,joins)
   SeqScan - a plain Path node with pathtype = T_SeqScan
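
The README change above notes that one structure now serves both restriction clauses such as "x = 3" and join clauses such as "y = z". One way to picture the distinction is by counting the relations a clause references. The sketch below is an assumption-laden illustration: it uses a bitmask of hypothetical relation ids for brevity (the commit itself stores relids as integer lists), and `classify_clause` is not a planner function.

```c
/* Hypothetical compact relid set: bit n set means relation n is used. */
typedef unsigned int RelidSet;

#define RELID_BIT(n) (1u << (n))

typedef enum { CLAUSE_RESTRICTION, CLAUSE_JOIN } ClauseKind;

/* Count the relations referenced by clearing one set bit per step. */
static int
relid_count(RelidSet relids)
{
    int n = 0;

    while (relids != 0)
    {
        relids &= relids - 1;   /* clears the lowest set bit */
        n++;
    }
    return n;
}

/*
 * A clause touching more than one relation acts as a join clause;
 * otherwise it restricts a single relation.
 */
static ClauseKind
classify_clause(RelidSet relids)
{
    return (relid_count(relids) > 1) ? CLAUSE_JOIN : CLAUSE_RESTRICTION;
}
```

Under these assumptions, "x = 3" with x in relation 1 classifies as a restriction, while "y = z" with y in relation 1 and z in relation 2 classifies as a join clause.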

src/backend/optimizer/path/clausesel.c

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- * $Header: /cvsroot/pgsql/src/backend/optimizer/path/clausesel.c,v 1.54 2002/12/12 15:49:28 tgl Exp $
+ * $Header: /cvsroot/pgsql/src/backend/optimizer/path/clausesel.c,v 1.55 2003/01/15 19:35:39 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -266,12 +266,12 @@ addRangeClause(RangeQueryClause **rqlist, Node *clause,
 
     if (varonleft)
     {
-        var = (Node *) get_leftop((Expr *) clause);
+        var = get_leftop((Expr *) clause);
         is_lobound = !isLTsel; /* x < something is high bound */
     }
     else
     {
-        var = (Node *) get_rightop((Expr *) clause);
+        var = get_rightop((Expr *) clause);
         is_lobound = isLTsel; /* something < x is low bound */
     }
 

src/backend/optimizer/path/costsize.c

Lines changed: 10 additions & 15 deletions
@@ -42,7 +42,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- * $Header: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v 1.99 2003/01/12 22:35:29 tgl Exp $
+ * $Header: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v 1.100 2003/01/15 19:35:39 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -752,7 +752,6 @@ cost_mergejoin(Path *path, Query *root,
     Cost cpu_per_tuple;
     QualCost restrict_qual_cost;
     RestrictInfo *firstclause;
-    Var *leftvar;
     double outer_rows,
            inner_rows;
     double ntuples;
@@ -779,9 +778,7 @@ cost_mergejoin(Path *path, Query *root,
                      &firstclause->left_mergescansel,
                      &firstclause->right_mergescansel);
 
-    leftvar = get_leftop(firstclause->clause);
-    Assert(IsA(leftvar, Var));
-    if (VARISRELMEMBER(leftvar->varno, outer_path->parent))
+    if (is_subseti(firstclause->left_relids, outer_path->parent->relids))
     {
         /* left side of clause is outer */
         outerscansel = firstclause->left_mergescansel;
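
The cost_mergejoin hunk above replaces a single-Var membership test with a subset test: the left side of the clause is the outer side if every relation it references belongs to the outer path's relation. A standalone sketch of that idea follows. It works on plain int arrays rather than PostgreSQL lists, and `int_member`, `int_subset`, and `which_side_is_outer` are hypothetical names, not the planner's `is_subseti`.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if x occurs in the first n entries of set. */
static bool
int_member(int x, const int *set, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (set[i] == x)
            return true;
    }
    return false;
}

/* True if every element of a (length na) also occurs in b (length nb). */
static bool
int_subset(const int *a, size_t na, const int *b, size_t nb)
{
    for (size_t i = 0; i < na; i++)
    {
        if (!int_member(a[i], b, nb))
            return false;
    }
    return true;
}

typedef enum { LEFT_IS_OUTER, RIGHT_IS_OUTER } OuterSide;

/*
 * Decide which side of a join clause belongs to the outer relation,
 * given the relids used by the clause's left-hand expression.
 */
static OuterSide
which_side_is_outer(const int *left_relids, size_t nleft,
                    const int *outer_relids, size_t nouter)
{
    return int_subset(left_relids, nleft, outer_relids, nouter)
        ? LEFT_IS_OUTER : RIGHT_IS_OUTER;
}
```

A subset test is needed because the left side may now be an expression over several relations, for example one that references relations 1 and 2 while the outer path joins relations 1, 2, and 4.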
@@ -935,14 +932,9 @@ cost_hashjoin(Path *path, Query *root,
     foreach(hcl, hashclauses)
     {
         RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(hcl);
-        Var *left,
-            *right;
         Selectivity thisbucketsize;
 
         Assert(IsA(restrictinfo, RestrictInfo));
-        /* these must be OK, since check_hashjoinable accepted the clause */
-        left = get_leftop(restrictinfo->clause);
-        right = get_rightop(restrictinfo->clause);
 
         /*
          * First we have to figure out which side of the hashjoin clause is the
@@ -952,27 +944,30 @@ cost_hashjoin(Path *path, Query *root,
          * a large query, we cache the bucketsize estimate in the RestrictInfo
          * node to avoid repeated lookups of statistics.
          */
-        if (VARISRELMEMBER(right->varno, inner_path->parent))
+        if (is_subseti(restrictinfo->right_relids, inner_path->parent->relids))
         {
             /* righthand side is inner */
             thisbucketsize = restrictinfo->right_bucketsize;
             if (thisbucketsize < 0)
             {
                 /* not cached yet */
-                thisbucketsize = estimate_hash_bucketsize(root, right,
+                thisbucketsize = estimate_hash_bucketsize(root,
+                                    (Var *) get_rightop(restrictinfo->clause),
                                                           virtualbuckets);
                 restrictinfo->right_bucketsize = thisbucketsize;
             }
         }
         else
         {
-            Assert(VARISRELMEMBER(left->varno, inner_path->parent));
+            Assert(is_subseti(restrictinfo->left_relids,
+                              inner_path->parent->relids));
             /* lefthand side is inner */
             thisbucketsize = restrictinfo->left_bucketsize;
             if (thisbucketsize < 0)
             {
                 /* not cached yet */
-                thisbucketsize = estimate_hash_bucketsize(root, left,
+                thisbucketsize = estimate_hash_bucketsize(root,
+                                    (Var *) get_leftop(restrictinfo->clause),
                                                           virtualbuckets);
                 restrictinfo->left_bucketsize = thisbucketsize;
             }
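
The cost_hashjoin hunks above rely on a caching convention: a negative bucketsize in the RestrictInfo means "not computed yet", and the estimate is stored back after the first lookup. A minimal sketch of that pattern follows. `ClauseCache`, `expensive_estimate`, and the 0.25 result are made up for illustration; only the negative-sentinel convention comes from the diff.

```c
/* Hypothetical stand-in for the cached fields of a RestrictInfo. */
typedef struct ClauseCache
{
    double right_bucketsize;    /* negative means "not cached yet" */
} ClauseCache;

/* Counts how often the costly estimation actually runs. */
static int estimate_calls = 0;

/* Placeholder for a statistics lookup; the value 0.25 is arbitrary. */
static double
expensive_estimate(void)
{
    estimate_calls++;
    return 0.25;
}

/* Return the cached estimate, computing and storing it on first use. */
static double
get_right_bucketsize(ClauseCache *cache)
{
    if (cache->right_bucketsize < 0)
        cache->right_bucketsize = expensive_estimate();
    return cache->right_bucketsize;
}
```

Initializing the field to -1 and calling `get_right_bucketsize` repeatedly triggers `expensive_estimate` only once, which matters because the same clause can be costed for many candidate join paths.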
@@ -1088,7 +1083,7 @@ estimate_hash_bucketsize(Query *root, Var *var, int nbuckets)
      * Lookup info about var's relation and attribute; if none available,
      * return default estimate.
      */
-    if (!IsA(var, Var))
+    if (var == NULL || !IsA(var, Var))
         return 0.1;
 
     relid = getrelid(var->varno, root->rtable);
