Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit ec63622

Browse files
committed
Fix usage of the parse tree for estimate_num_groups() in set operations
recurse_set_operations() uses the parse tree for the group number estimation, because of the "varno 0" hack. At the same time 2489d76 made root->parse and corresponding parent_root->simple_rte_array[]->subquery distinct copies of the parse tree, while d3d55ce introduced self-join removal replacing relid of removed relation only in one of the copies. The present commit fixes this bug by making recurse_set_operations() call estimate_num_groups() with the copy of the parse tree processed by self-join removal. In future, we may think about maintaining just one copy of the parse tree and/or keeping removed relids as aliases. Reported-by: Zuming Jiang Bug: #18170 Discussion: https://postgr.es/m/flat/18170-f1d17bf9a0d58b24%40postgresql.org Author: Richard Guo, Alexander Korotkov Reviewed-by: Andrei Lepikhov
1 parent a237a07 commit ec63622

File tree

3 files changed

+29
-2
lines changed

3 files changed

+29
-2
lines changed

src/backend/optimizer/prep/prepunion.c

+5-2
Original file line numberDiff line numberDiff line change
@@ -324,14 +324,17 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
324324
* anyway; otherwise do statistical estimation.
325325
*
326326
* XXX you don't really want to know about this: we do the estimation
327-
* using the subquery's original targetlist expressions, not the
327+
* using the subroot->parse's original targetlist expressions, not the
328328
* subroot->processed_tlist which might seem more appropriate. The
329329
* reason is that if the subquery is itself a setop, it may return a
330330
* processed_tlist containing "varno 0" Vars generated by
331331
* generate_append_tlist, and those would confuse estimate_num_groups
332332
* mightily. We ought to get rid of the "varno 0" hack, but that
333333
* requires a redesign of the parsetree representation of setops, so
334334
* that there can be an RTE corresponding to each setop's output.
335+
* Note, we use this not subquery's targetlist but subroot->parse's
336+
* targetlist, because it was revised by self-join removal. subquery's
337+
* targetlist might contain the references to the removed relids.
335338
*/
336339
if (pNumGroups)
337340
{
@@ -341,7 +344,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
341344
*pNumGroups = subpath->rows;
342345
else
343346
*pNumGroups = estimate_num_groups(subroot,
344-
get_tlist_exprs(subquery->targetList, false),
347+
get_tlist_exprs(subroot->parse->targetList, false),
345348
subpath->rows,
346349
NULL,
347350
NULL);

src/test/regress/expected/join.out

+17
Original file line numberDiff line numberDiff line change
@@ -6759,6 +6759,23 @@ WHERE c3.id=c2.id AND c3.id*c2.id=c1.id*c1.id;
67596759
Filter: ((id IS NOT NULL) AND ((id * id) IS NOT NULL))
67606760
(3 rows)
67616761

6762+
-- Check the usage of a parse tree by the set operations (bug #18170)
6763+
explain (costs off)
6764+
SELECT c1.code FROM emp1 c1 LEFT JOIN emp1 c2 ON c1.id = c2.id
6765+
WHERE c2.id IS NOT NULL
6766+
EXCEPT ALL
6767+
SELECT c3.code FROM emp1 c3;
6768+
QUERY PLAN
6769+
----------------------------------------------
6770+
HashSetOp Except All
6771+
-> Append
6772+
-> Subquery Scan on "*SELECT* 1"
6773+
-> Seq Scan on emp1 c2
6774+
Filter: (id IS NOT NULL)
6775+
-> Subquery Scan on "*SELECT* 2"
6776+
-> Seq Scan on emp1 c3
6777+
(7 rows)
6778+
67626779
-- We can remove the join even if we find the join can't duplicate rows and
67636780
-- the base quals of each side are different. In the following case we end up
67646781
-- moving quals over to s1 to make it so it can't match any rows.

src/test/regress/sql/join.sql

+7
Original file line numberDiff line numberDiff line change
@@ -2573,6 +2573,13 @@ explain (costs off)
25732573
SELECT count(*) FROM emp1 c1, emp1 c2, emp1 c3
25742574
WHERE c3.id=c2.id AND c3.id*c2.id=c1.id*c1.id;
25752575

2576+
-- Check the usage of a parse tree by the set operations (bug #18170)
2577+
explain (costs off)
2578+
SELECT c1.code FROM emp1 c1 LEFT JOIN emp1 c2 ON c1.id = c2.id
2579+
WHERE c2.id IS NOT NULL
2580+
EXCEPT ALL
2581+
SELECT c3.code FROM emp1 c3;
2582+
25762583
-- We can remove the join even if we find the join can't duplicate rows and
25772584
-- the base quals of each side are different. In the following case we end up
25782585
-- moving quals over to s1 to make it so it can't match any rows.

0 commit comments

Comments
 (0)