Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 2abd7ae

Browse files
committed
Fix representation of hash keys in Hash/HashJoin nodes.
In 5f32b29 I changed the creation of HashState.hashkeys to actually use HashState as the parent (instead of HashJoinState, which was incorrect, as they were executed below HashState), to fix the problem of hashkeys expressions otherwise relying on slot types appropriate for HashJoinState, rather than HashState as would be correct. That reliance was only introduced in 12, which is why it previously worked to use HashJoinState as the parent (although I'd be unsurprised if there were problematic cases). Unfortunately that's not a sufficient solution, because before this commit, the to-be-hashed expressions referenced inner/outer as appropriate for the HashJoin, not Hash. That didn't have obvious bad consequences, because the slots containing the tuples were put into ecxt_innertuple when hashing a tuple for HashState (even though Hash doesn't have an inner plan). There are less common cases where this can cause visible problems however (rather than just confusion when inspecting such executor trees). E.g. "ERROR: bogus varno: 65000", when explaining queries containing a HashJoin where the subsidiary Hash node's hash keys reference a subplan. While normally hashkeys aren't displayed by EXPLAIN, if one of those expressions references a subplan, that subplan may be printed as part of the Hash node - which then failed because an inner plan was referenced, and Hash doesn't have that. It seems quite possible that there's other broken cases, too. Fix the problem by properly splitting the expression for the HashJoin and Hash nodes at plan time, and have them reference the proper subsidiary node. While other workarounds are possible, fixing this correctly seems easy enough. It was a pretty ugly hack to have ExecInitHashJoin put the expression into the already initialized HashState, in the first place. I decided to not just split inner/outer hashkeys inside make_hashjoin(), but also to separate out hashoperators and hashcollations at plan time. Otherwise we would have ended up having two very similar loops, one at plan time and the other during executor startup. The work seems to more appropriately belong to plan time, anyway. Reported-By: Nikita Glukhov, Alexander Korotkov Author: Andres Freund Reviewed-By: Tom Lane, in an earlier version Discussion: https://postgr.es/m/CAPpHfdvGVegF_TKKRiBrSmatJL2dR9uwFCuR+teQ_8tEXU8mxg@mail.gmail.com Backpatch: 12-
1 parent a9f301d commit 2abd7ae

File tree

11 files changed

+330
-47
lines changed

11 files changed

+330
-47
lines changed

src/backend/executor/nodeHash.c

+14-8
Original file line numberDiff line numberDiff line change
@@ -157,15 +157,16 @@ MultiExecPrivateHash(HashState *node)
157157
econtext = node->ps.ps_ExprContext;
158158

159159
/*
160-
* get all inner tuples and insert into the hash table (or temp files)
160+
* Get all tuples from the node below the Hash node and insert into the
161+
* hash table (or temp files).
161162
*/
162163
for (;;)
163164
{
164165
slot = ExecProcNode(outerNode);
165166
if (TupIsNull(slot))
166167
break;
167168
/* We have to compute the hash value */
168-
econtext->ecxt_innertuple = slot;
169+
econtext->ecxt_outertuple = slot;
169170
if (ExecHashGetHashValue(hashtable, econtext, hashkeys,
170171
false, hashtable->keepNulls,
171172
&hashvalue))
@@ -281,7 +282,7 @@ MultiExecParallelHash(HashState *node)
281282
slot = ExecProcNode(outerNode);
282283
if (TupIsNull(slot))
283284
break;
284-
econtext->ecxt_innertuple = slot;
285+
econtext->ecxt_outertuple = slot;
285286
if (ExecHashGetHashValue(hashtable, econtext, hashkeys,
286287
false, hashtable->keepNulls,
287288
&hashvalue))
@@ -388,8 +389,9 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
388389
/*
389390
* initialize child expressions
390391
*/
391-
hashstate->ps.qual =
392-
ExecInitQual(node->plan.qual, (PlanState *) hashstate);
392+
Assert(node->plan.qual == NIL);
393+
hashstate->hashkeys =
394+
ExecInitExprList(node->hashkeys, (PlanState *) hashstate);
393395

394396
return hashstate;
395397
}
@@ -1773,9 +1775,13 @@ ExecParallelHashTableInsertCurrentBatch(HashJoinTable hashtable,
17731775
* ExecHashGetHashValue
17741776
* Compute the hash value for a tuple
17751777
*
1776-
* The tuple to be tested must be in either econtext->ecxt_outertuple or
1777-
* econtext->ecxt_innertuple. Vars in the hashkeys expressions should have
1778-
* varno either OUTER_VAR or INNER_VAR.
1778+
* The tuple to be tested must be in econtext->ecxt_outertuple (thus Vars in
1779+
* the hashkeys expressions need to have OUTER_VAR as varno). If outer_tuple
1780+
* is false (meaning it's the HashJoin's inner node, Hash), econtext,
1781+
* hashkeys, and slot need to be from Hash, with hashkeys/slot referencing and
1782+
* being suitable for tuples from the node below the Hash. Conversely, if
1783+
* outer_tuple is true, econtext is from HashJoin, and hashkeys/slot need to
1784+
* be appropriate for tuples from HashJoin's outer node.
17791785
*
17801786
* A true result means the tuple's hash value has been successfully computed
17811787
* and stored at *hashvalue. A false result means the tuple cannot match

src/backend/executor/nodeHashjoin.c

+4-36
Original file line numberDiff line numberDiff line change
@@ -600,14 +600,8 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
600600
HashJoinState *hjstate;
601601
Plan *outerNode;
602602
Hash *hashNode;
603-
List *lclauses;
604-
List *rclauses;
605-
List *rhclauses;
606-
List *hoperators;
607-
List *hcollations;
608603
TupleDesc outerDesc,
609604
innerDesc;
610-
ListCell *l;
611605
const TupleTableSlotOps *ops;
612606

613607
/* check for unsupported flags */
@@ -730,36 +724,10 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
730724
hjstate->hj_CurSkewBucketNo = INVALID_SKEW_BUCKET_NO;
731725
hjstate->hj_CurTuple = NULL;
732726

733-
/*
734-
* Deconstruct the hash clauses into outer and inner argument values, so
735-
* that we can evaluate those subexpressions separately. Also make a list
736-
* of the hash operator OIDs, in preparation for looking up the hash
737-
* functions to use.
738-
*/
739-
lclauses = NIL;
740-
rclauses = NIL;
741-
rhclauses = NIL;
742-
hoperators = NIL;
743-
hcollations = NIL;
744-
foreach(l, node->hashclauses)
745-
{
746-
OpExpr *hclause = lfirst_node(OpExpr, l);
747-
748-
lclauses = lappend(lclauses, ExecInitExpr(linitial(hclause->args),
749-
(PlanState *) hjstate));
750-
rclauses = lappend(rclauses, ExecInitExpr(lsecond(hclause->args),
751-
(PlanState *) hjstate));
752-
rhclauses = lappend(rhclauses, ExecInitExpr(lsecond(hclause->args),
753-
innerPlanState(hjstate)));
754-
hoperators = lappend_oid(hoperators, hclause->opno);
755-
hcollations = lappend_oid(hcollations, hclause->inputcollid);
756-
}
757-
hjstate->hj_OuterHashKeys = lclauses;
758-
hjstate->hj_InnerHashKeys = rclauses;
759-
hjstate->hj_HashOperators = hoperators;
760-
hjstate->hj_Collations = hcollations;
761-
/* child Hash node needs to evaluate inner hash keys, too */
762-
((HashState *) innerPlanState(hjstate))->hashkeys = rhclauses;
727+
hjstate->hj_OuterHashKeys = ExecInitExprList(node->hashkeys,
728+
(PlanState *) hjstate);
729+
hjstate->hj_HashOperators = node->hashoperators;
730+
hjstate->hj_Collations = node->hashcollations;
763731

764732
hjstate->hj_JoinState = HJ_BUILD_HASHTABLE;
765733
hjstate->hj_MatchedOuter = false;

src/backend/nodes/copyfuncs.c

+4
Original file line numberDiff line numberDiff line change
@@ -899,6 +899,9 @@ _copyHashJoin(const HashJoin *from)
899899
* copy remainder of node
900900
*/
901901
COPY_NODE_FIELD(hashclauses);
902+
COPY_NODE_FIELD(hashoperators);
903+
COPY_NODE_FIELD(hashcollations);
904+
COPY_NODE_FIELD(hashkeys);
902905

903906
return newnode;
904907
}
@@ -1066,6 +1069,7 @@ _copyHash(const Hash *from)
10661069
/*
10671070
* copy remainder of node
10681071
*/
1072+
COPY_NODE_FIELD(hashkeys);
10691073
COPY_SCALAR_FIELD(skewTable);
10701074
COPY_SCALAR_FIELD(skewColumn);
10711075
COPY_SCALAR_FIELD(skewInherit);

src/backend/nodes/outfuncs.c

+4
Original file line numberDiff line numberDiff line change
@@ -761,6 +761,9 @@ _outHashJoin(StringInfo str, const HashJoin *node)
761761
_outJoinPlanInfo(str, (const Join *) node);
762762

763763
WRITE_NODE_FIELD(hashclauses);
764+
WRITE_NODE_FIELD(hashoperators);
765+
WRITE_NODE_FIELD(hashcollations);
766+
WRITE_NODE_FIELD(hashkeys);
764767
}
765768

766769
static void
@@ -863,6 +866,7 @@ _outHash(StringInfo str, const Hash *node)
863866

864867
_outPlanInfo(str, (const Plan *) node);
865868

869+
WRITE_NODE_FIELD(hashkeys);
866870
WRITE_OID_FIELD(skewTable);
867871
WRITE_INT_FIELD(skewColumn);
868872
WRITE_BOOL_FIELD(skewInherit);

src/backend/nodes/readfuncs.c

+4
Original file line numberDiff line numberDiff line change
@@ -2096,6 +2096,9 @@ _readHashJoin(void)
20962096
ReadCommonJoin(&local_node->join);
20972097

20982098
READ_NODE_FIELD(hashclauses);
2099+
READ_NODE_FIELD(hashoperators);
2100+
READ_NODE_FIELD(hashcollations);
2101+
READ_NODE_FIELD(hashkeys);
20992102

21002103
READ_DONE();
21012104
}
@@ -2274,6 +2277,7 @@ _readHash(void)
22742277

22752278
ReadCommonPlan(&local_node->plan);
22762279

2280+
READ_NODE_FIELD(hashkeys);
22772281
READ_OID_FIELD(skewTable);
22782282
READ_INT_FIELD(skewColumn);
22792283
READ_BOOL_FIELD(skewInherit);

src/backend/optimizer/plan/createplan.c

+38
Original file line numberDiff line numberDiff line change
@@ -222,9 +222,12 @@ static NestLoop *make_nestloop(List *tlist,
222222
static HashJoin *make_hashjoin(List *tlist,
223223
List *joinclauses, List *otherclauses,
224224
List *hashclauses,
225+
List *hashoperators, List *hashcollations,
226+
List *hashkeys,
225227
Plan *lefttree, Plan *righttree,
226228
JoinType jointype, bool inner_unique);
227229
static Hash *make_hash(Plan *lefttree,
230+
List *hashkeys,
228231
Oid skewTable,
229232
AttrNumber skewColumn,
230233
bool skewInherit);
@@ -4380,9 +4383,14 @@ create_hashjoin_plan(PlannerInfo *root,
43804383
List *joinclauses;
43814384
List *otherclauses;
43824385
List *hashclauses;
4386+
List *hashoperators = NIL;
4387+
List *hashcollations = NIL;
4388+
List *inner_hashkeys = NIL;
4389+
List *outer_hashkeys = NIL;
43834390
Oid skewTable = InvalidOid;
43844391
AttrNumber skewColumn = InvalidAttrNumber;
43854392
bool skewInherit = false;
4393+
ListCell *lc;
43864394

43874395
/*
43884396
* HashJoin can project, so we don't have to demand exact tlists from the
@@ -4474,10 +4482,29 @@ create_hashjoin_plan(PlannerInfo *root,
44744482
}
44754483
}
44764484

4485+
/*
4486+
* Collect hash related information. The hashed expressions are
4487+
* deconstructed into outer/inner expressions, so they can be computed
4488+
* separately (inner expressions are used to build the hashtable via Hash,
4489+
* outer expressions to perform lookups of tuples from HashJoin's outer
4490+
* plan in the hashtable). Also collect operator information necessary to
4491+
* build the hashtable.
4492+
*/
4493+
foreach(lc, hashclauses)
4494+
{
4495+
OpExpr *hclause = lfirst_node(OpExpr, lc);
4496+
4497+
hashoperators = lappend_oid(hashoperators, hclause->opno);
4498+
hashcollations = lappend_oid(hashcollations, hclause->inputcollid);
4499+
outer_hashkeys = lappend(outer_hashkeys, linitial(hclause->args));
4500+
inner_hashkeys = lappend(inner_hashkeys, lsecond(hclause->args));
4501+
}
4502+
44774503
/*
44784504
* Build the hash node and hash join node.
44794505
*/
44804506
hash_plan = make_hash(inner_plan,
4507+
inner_hashkeys,
44814508
skewTable,
44824509
skewColumn,
44834510
skewInherit);
@@ -4504,6 +4531,9 @@ create_hashjoin_plan(PlannerInfo *root,
45044531
joinclauses,
45054532
otherclauses,
45064533
hashclauses,
4534+
hashoperators,
4535+
hashcollations,
4536+
outer_hashkeys,
45074537
outer_plan,
45084538
(Plan *) hash_plan,
45094539
best_path->jpath.jointype,
@@ -5545,6 +5575,9 @@ make_hashjoin(List *tlist,
55455575
List *joinclauses,
55465576
List *otherclauses,
55475577
List *hashclauses,
5578+
List *hashoperators,
5579+
List *hashcollations,
5580+
List *hashkeys,
55485581
Plan *lefttree,
55495582
Plan *righttree,
55505583
JoinType jointype,
@@ -5558,6 +5591,9 @@ make_hashjoin(List *tlist,
55585591
plan->lefttree = lefttree;
55595592
plan->righttree = righttree;
55605593
node->hashclauses = hashclauses;
5594+
node->hashoperators = hashoperators;
5595+
node->hashcollations = hashcollations;
5596+
node->hashkeys = hashkeys;
55615597
node->join.jointype = jointype;
55625598
node->join.inner_unique = inner_unique;
55635599
node->join.joinqual = joinclauses;
@@ -5567,6 +5603,7 @@ make_hashjoin(List *tlist,
55675603

55685604
static Hash *
55695605
make_hash(Plan *lefttree,
5606+
List *hashkeys,
55705607
Oid skewTable,
55715608
AttrNumber skewColumn,
55725609
bool skewInherit)
@@ -5579,6 +5616,7 @@ make_hash(Plan *lefttree,
55795616
plan->lefttree = lefttree;
55805617
plan->righttree = NULL;
55815618

5619+
node->hashkeys = hashkeys;
55825620
node->skewTable = skewTable;
55835621
node->skewColumn = skewColumn;
55845622
node->skewInherit = skewInherit;

src/backend/optimizer/plan/setrefs.c

+44
Original file line numberDiff line numberDiff line change
@@ -107,6 +107,7 @@ static Plan *set_append_references(PlannerInfo *root,
107107
static Plan *set_mergeappend_references(PlannerInfo *root,
108108
MergeAppend *mplan,
109109
int rtoffset);
110+
static void set_hash_references(PlannerInfo *root, Plan *plan, int rtoffset);
110111
static Node *fix_scan_expr(PlannerInfo *root, Node *node, int rtoffset);
111112
static Node *fix_scan_expr_mutator(Node *node, fix_scan_expr_context *context);
112113
static bool fix_scan_expr_walker(Node *node, fix_scan_expr_context *context);
@@ -646,6 +647,9 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
646647
break;
647648

648649
case T_Hash:
650+
set_hash_references(root, plan, rtoffset);
651+
break;
652+
649653
case T_Material:
650654
case T_Sort:
651655
case T_Unique:
@@ -1419,6 +1423,36 @@ set_mergeappend_references(PlannerInfo *root,
14191423
return (Plan *) mplan;
14201424
}
14211425

1426+
/*
1427+
* set_hash_references
1428+
* Do set_plan_references processing on a Hash node
1429+
*/
1430+
static void
1431+
set_hash_references(PlannerInfo *root, Plan *plan, int rtoffset)
1432+
{
1433+
Hash *hplan = (Hash *) plan;
1434+
Plan *outer_plan = plan->lefttree;
1435+
indexed_tlist *outer_itlist;
1436+
1437+
/*
1438+
* Hash's hashkeys are used when feeding tuples into the hashtable,
1439+
* therefore have them reference Hash's outer plan (which itself is the
1440+
* inner plan of the HashJoin).
1441+
*/
1442+
outer_itlist = build_tlist_index(outer_plan->targetlist);
1443+
hplan->hashkeys = (List *)
1444+
fix_upper_expr(root,
1445+
(Node *) hplan->hashkeys,
1446+
outer_itlist,
1447+
OUTER_VAR,
1448+
rtoffset);
1449+
1450+
/* Hash doesn't project */
1451+
set_dummy_tlist_references(plan, rtoffset);
1452+
1453+
/* Hash nodes don't have their own quals */
1454+
Assert(plan->qual == NIL);
1455+
}
14221456

14231457
/*
14241458
* copyVar
@@ -1754,6 +1788,16 @@ set_join_references(PlannerInfo *root, Join *join, int rtoffset)
17541788
inner_itlist,
17551789
(Index) 0,
17561790
rtoffset);
1791+
1792+
/*
1793+
* HashJoin's hashkeys are used to look for matching tuples from its
1794+
* outer plan (not the Hash node!) in the hashtable.
1795+
*/
1796+
hj->hashkeys = (List *) fix_upper_expr(root,
1797+
(Node *) hj->hashkeys,
1798+
outer_itlist,
1799+
OUTER_VAR,
1800+
rtoffset);
17571801
}
17581802

17591803
/*

src/include/nodes/execnodes.h

-3
Original file line numberDiff line numberDiff line change
@@ -1852,7 +1852,6 @@ typedef struct MergeJoinState
18521852
*
18531853
* hashclauses original form of the hashjoin condition
18541854
* hj_OuterHashKeys the outer hash keys in the hashjoin condition
1855-
* hj_InnerHashKeys the inner hash keys in the hashjoin condition
18561855
* hj_HashOperators the join operators in the hashjoin condition
18571856
* hj_HashTable hash table for the hashjoin
18581857
* (NULL if table not built yet)
@@ -1883,7 +1882,6 @@ typedef struct HashJoinState
18831882
JoinState js; /* its first field is NodeTag */
18841883
ExprState *hashclauses;
18851884
List *hj_OuterHashKeys; /* list of ExprState nodes */
1886-
List *hj_InnerHashKeys; /* list of ExprState nodes */
18871885
List *hj_HashOperators; /* list of operator OIDs */
18881886
List *hj_Collations;
18891887
HashJoinTable hj_HashTable;
@@ -2222,7 +2220,6 @@ typedef struct HashState
22222220
PlanState ps; /* its first field is NodeTag */
22232221
HashJoinTable hashtable; /* hash table for the hashjoin */
22242222
List *hashkeys; /* list of ExprState nodes */
2225-
/* hashkeys is same as parent's hj_InnerHashKeys */
22262223

22272224
SharedHashInfo *shared_info; /* one entry per worker */
22282225
HashInstrumentation *hinstrument; /* this worker's entry */

src/include/nodes/plannodes.h

+14
Original file line numberDiff line numberDiff line change
@@ -737,6 +737,14 @@ typedef struct HashJoin
737737
{
738738
Join join;
739739
List *hashclauses;
740+
List *hashoperators;
741+
List *hashcollations;
742+
743+
/*
744+
* List of expressions to be hashed for tuples from the outer plan, to
745+
* perform lookups in the hashtable over the inner plan.
746+
*/
747+
List *hashkeys;
740748
} HashJoin;
741749

742750
/* ----------------
@@ -899,6 +907,12 @@ typedef struct GatherMerge
899907
typedef struct Hash
900908
{
901909
Plan plan;
910+
911+
/*
912+
* List of expressions to be hashed for tuples from Hash's outer plan,
913+
* needed to put them into the hashtable.
914+
*/
915+
List *hashkeys; /* hash keys for the hashjoin condition */
902916
Oid skewTable; /* outer join key's table OID, or InvalidOid */
903917
AttrNumber skewColumn; /* outer join key's column #, or zero */
904918
bool skewInherit; /* is outer join rel an inheritance tree? */

0 commit comments

Comments
 (0)