Commit 4d72e2b

subselect notes from Vadim.
1 parent d30ad52 commit 4d72e2b

1 file changed

+156
-0
lines changed
  • src/backend/optimizer/plan

src/backend/optimizer/plan/README

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
Subselect notes from Vadim.

From owner-pgsql-hackers@hub.org Fri Feb 13 09:01:19 1998
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
        by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id JAA11576
        for <maillist@candle.pha.pa.us>; Fri, 13 Feb 1998 09:01:17 -0500 (EST)
Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id IAA09761 for <maillist@candle.pha.pa.us>; Fri, 13 Feb 1998 08:41:22 -0500 (EST)
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id IAA08135; Fri, 13 Feb 1998 08:40:17 -0500 (EST)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 13 Feb 1998 08:38:42 -0500 (EST)
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id IAA06646 for pgsql-hackers-outgoing; Fri, 13 Feb 1998 08:38:35 -0500 (EST)
Received: from dune.krasnet.ru (dune.krasnet.ru [193.125.44.86]) by hub.org (8.8.8/8.7.5) with ESMTP id IAA04568 for <hackers@postgreSQL.org>; Fri, 13 Feb 1998 08:37:16 -0500 (EST)
Received: from sable.krasnoyarsk.su (dune.krasnet.ru [193.125.44.86])
        by dune.krasnet.ru (8.8.7/8.8.7) with ESMTP id UAA13717
        for <hackers@postgreSQL.org>; Fri, 13 Feb 1998 20:51:03 +0700 (KRS)
        (envelope-from vadim@sable.krasnoyarsk.su)
Message-ID: <34E44FBA.D64E7997@sable.krasnoyarsk.su>
Date: Fri, 13 Feb 1998 20:50:50 +0700
From: "Vadim B. Mikheev" <vadim@sable.krasnoyarsk.su>
Organization: ITTS (Krasnoyarsk)
X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
MIME-Version: 1.0
To: PostgreSQL Developers List <hackers@postgreSQL.org>
Subject: [HACKERS] Subselects are in CVS...
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@hub.org
Precedence: bulk
Status: OR

Here are some implementation notes and open issues...

First, the implementation uses a new type of parameter - PARAM_EXEC - to
deal with correlation Vars. When query_planner() is called, it first tries
to replace every upper-query Var referenced in the current query with a
Param of this type. Some global variables are used to keep the mapping of
Vars to Params and of Params to Vars.
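
To make the Var-to-Param replacement concrete, here is a minimal Python sketch of the idea. It is a model for illustration only, not the planner's C code: the Var, Param and ParamMap classes, the levels_up field and the nested-tuple expression form are all assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    levels_up: int  # hypothetical field: 0 = current query, 1 = parent, ...
    rel: str
    attr: str


@dataclass(frozen=True)
class Param:
    kind: str       # always "PARAM_EXEC" in this sketch
    param_id: int


class ParamMap:
    """Two-way mapping: upper-level Var -> Param and Param id -> Var."""

    def __init__(self):
        self.var_to_param = {}
        self.param_to_var = []

    def param_for(self, var):
        # Reuse the Param if this Var was already seen, else allocate a new id.
        if var not in self.var_to_param:
            self.var_to_param[var] = Param("PARAM_EXEC", len(self.param_to_var))
            self.param_to_var.append(var)
        return self.var_to_param[var]


def replace_upper_vars(expr, pmap):
    """Walk a nested-tuple expression, swapping upper-level Vars for Params."""
    if isinstance(expr, Var):
        return pmap.param_for(expr) if expr.levels_up > 0 else expr
    if isinstance(expr, tuple):
        return tuple(replace_upper_vars(e, pmap) for e in expr)
    return expr  # operator names and constants pass through unchanged


pmap = ParamMap()
# Models "y2 = y", where y belongs to the parent query.
qual = ("=", Var(0, "test2", "y2"), Var(1, "tmp", "y"))
new_qual = replace_upper_vars(qual, pmap)
```

After the call, new_qual holds a Param in place of the parent's y, and pmap.param_to_var[0] maps that Param back to the original Var.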

After this, all of the current query's SubLinks are processed: for each
SubLink found in the query's qual, union_planner() (the old planner()
function) is called to plan the corresponding subselect (union_planner()
calls query_planner() for a "simple" query and supports UNIONs). After the
subselect is planned, the optimizer knows whether it is a correlated,
uncorrelated or _indirectly_ correlated query (one that references some
grand-parent Vars but no parent ones: uncorrelated from the parent's point
of view).
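
The three-way classification can be sketched as a small Python function. This is only a model under stated assumptions: the function name and the convention that outer references are given as "levels up" integers (1 = parent, 2 or more = further ancestors) are hypothetical, not taken from the planner.

```python
def classify_subquery(levels_up_referenced):
    """Classify a planned subquery by the outer query levels it references.

    levels_up_referenced: iterable of ints, one per outer Var used by the
    subquery (1 = parent query, 2 = grand-parent, and so on).
    """
    levels = set(levels_up_referenced)
    if 1 in levels:
        return "correlated"
    if levels:
        # Only grand-parent (or higher) Vars: uncorrelated as seen by the parent.
        return "indirectly correlated"
    return "uncorrelated"
```

For example, a subquery that uses only a grand-parent Var would be passed [2] and classified as indirectly correlated.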

For uncorrelated and indirectly correlated subqueries of EXPRession or
EXISTS type, SubLinks are replaced with "normal" clauses from the
SubLink->Oper list (I changed this list to be a list of EXPR nodes,
not just Oper ones). The right sides of these nodes are replaced with
PARAM_EXEC parameters. This is the second use of the new parameter type.
At run-time these parameters get their values from the result of subquery
evaluation (i.e. from the target list of the subquery). The execution plan
of the subquery itself becomes an init plan of the parent query. An InitPlan
knows which parameters are to get values from the subquery's results and
is executed "on demand" (for the query select * from table where x > 0 and
y > (select max(a) from table_a), the subquery will not be executed at all
if there are no tuples with x > 0 _and_ y is not used in an index scan).
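
The "on demand" behaviour can be illustrated with a short Python sketch. It assumes a plain sequential scan (y is not used in an index scan) and models the InitPlan as a lazily evaluated value; the class, its method names and the sample data are hypothetical.

```python
class InitPlan:
    """Runs its subquery at most once, the first time its Param is needed."""

    def __init__(self, run_subquery):
        self.run_subquery = run_subquery
        self.done = False
        self.value = None
        self.executions = 0  # kept only so the sketch can be inspected

    def param_value(self):
        if not self.done:
            self.value = self.run_subquery()
            self.executions += 1
            self.done = True
        return self.value


def seq_scan(rows, init_plan):
    # "and" short-circuits: the Param is only fetched for rows with x > 0.
    return [r for r in rows if r["x"] > 0 and r["y"] > init_plan.param_value()]


table_a = [3, 7, 5]  # stands in for column a of table_a
lazy = InitPlan(lambda: max(table_a))
no_match = seq_scan([{"x": 0, "y": 9}, {"x": -1, "y": 9}], lazy)
```

Because no row has x > 0, lazy.executions stays at zero: the subquery is never run.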

SubLinks for subqueries of all other types are transformed into a
new type of Expr node - SUBPLAN_EXPR. Expr->args are just the correlation
variables from the _parent_ query. Expr->oper is a new SubPlan node.

This node is used for InitPlans too. It keeps the subquery's range table,
the indices of Params which are to get values from _parent_ query Vars
(i.e. from Expr->args), the indices of Params into which the subquery's
results are to be substituted (this is for InitPlans), the SubLink
and the subquery's execution plan.

The Plan node was changed to know about dependencies on Params from
parent queries and InitPlans, and to keep a list of changed Params
(from the above) so that it is re-scanned if this list is not NULL.
Also added are a list of InitPlans (actually, all of them for the current
query are in the topmost plan node now) and a list of other SubPlans (from
plan->qual), used to initialize them and to let them know about changed
Params (from the list of their "interests").
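
The changed-Params bookkeeping might be modelled roughly as below. This is a simplified Python sketch, not the executor's data structures: the PlanNode class, its fields and the two-step notify/rescan protocol are assumptions made for illustration, and an empty set plays the role of a NULL list.

```python
class PlanNode:
    def __init__(self, name, depends_on=(), children=()):
        self.name = name
        self.depends_on = set(depends_on)  # Param ids this node is interested in
        self.children = list(children)
        self.changed_params = set()        # empty set stands in for NULL
        self.rescans = 0

    def notify_changed(self, param_ids):
        """Record which of the changed Params each node in the subtree depends on."""
        self.changed_params |= self.depends_on & set(param_ids)
        for child in self.children:
            child.notify_changed(param_ids)

    def rescan_if_needed(self):
        """Re-scan only the nodes whose changed-Param set is not empty."""
        if self.changed_params:
            self.rescans += 1
            self.changed_params.clear()
        for child in self.children:
            child.rescan_if_needed()


scan_a = PlanNode("scan_a", depends_on={0})  # reads Param 0
scan_b = PlanNode("scan_b")                  # depends on no Params
top = PlanNode("top", children=[scan_a, scan_b])

top.notify_changed({0})  # the parent supplied a new value for Param 0
top.rescan_if_needed()
```

Only scan_a is re-scanned here; scan_b keeps its state because none of its "interests" changed.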

After all SubLinks are processed, query_planner() calls the qual
canonificator and does its "normal" work. Thanks to the use of Params, the
optimizer is mostly unchanged.

Well, the Executor. To get subplans re-evaluated without ExecutorStart()
and ExecutorEnd() (without opening and closing relations and indices
and without many palloc() and pfree() calls - this is what SQL-funcs do
on each call), ExecReScan() now supports most Plan types...

Explanation of EXPLAIN.

vac=> explain select * from tmp where x >= (select max(x2) from test2
where y2 = y and exists (select * from tempx where tx = x));
NOTICE: QUERY PLAN:

Seq Scan on tmp (cost=40.03 size=101 width=8)
  SubPlan
  ^^^^^^^ subquery is in Seq Scan's qual, its plan is below
    -> Aggregate (cost=2.05 size=0 width=0)
         InitPlan
         ^^^^^^^^ EXISTS subsubquery is InitPlan of subquery
           -> Seq Scan on tempx (cost=4.33 size=1 width=4)
         -> Result (cost=2.05 size=0 width=0)
            ^^^^^^ EXISTS subsubquery was transformed into Param
                   and so we have Result node here
              -> Index Scan on test2 (cost=2.05 size=1 width=4)

Open issues.

1. No read permissions checking (easy, just not done yet).
2. readfuncs.c can't read subplans (easy, not critical, because we
   currently don't use the ascii representation of execution plans anywhere).
3. ExecReScan() doesn't support all plan types. At least support for
   MergeJoin has to be implemented.
4. Memory leaks in ExecReScan().
5. I need advice: if a subquery introduced with NOT IN doesn't return
   any tuples then the qualification fails, yes?
6. Regression tests !!!!!!!!!!!!!!!!!!!!
   (Could we use data/queries from MySQL's crash.me?
   Copyrighted? Could they give us rights?)
7. Performance.
   - Should be good when the subquery is transformed into an InitPlan.
   - Something should be done for uncorrelated subqueries introduced
     with ANY/ALL - keep thinking. Currently, the subplan will be re-scanned
     for each parent tuple - very slow...
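
On issue 5, the following Python sketch models how SQL's three-valued logic treats NOT IN, as far as I understand the standard: x NOT IN (subquery) is NOT (x = ANY (subquery)), and ANY over an empty result is false, so the qualification would pass rather than fail. The function name is hypothetical and None stands in for both NULL and UNKNOWN; this is not what the code in CVS is known to do.

```python
def sql_not_in(x, values):
    """Model of SQL "x NOT IN (values)": returns True, False or None (UNKNOWN)."""
    unknown = False
    for v in values:
        if x is None or v is None:
            unknown = True   # comparison with NULL is UNKNOWN
        elif x == v:
            return False     # a definite match makes NOT IN false
    return None if unknown else True


empty_result = sql_not_in(5, [])
```

With an empty subquery result the loop never runs, so empty_result is True even when x itself is NULL.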

Results of some tests. TMP is a table with x, y (int4-s), x in 0-9,
y = 100 - x, 1000 tuples (10 duplicates of each tuple). TEST2 is a table
with x2, y2 (int4-s), x2 in 1-99, y2 = 100 - x2, 10000 tuples (100 dups).

Trying

select * from tmp where x >= (select max(x2) from test2 where y2 = y);

and

begin;
select y as ty, max(x2) as mx into table tsub from test2, tmp
where y2 = y group by ty;
vacuum tsub;
select x, y from tmp, tsub where x >= mx and y = ty;
drop table tsub;
end;

Without index on test2(y2):

SubSelect         -> 320 sec
Using temp table  ->  32 sec

With index on test2(y2):

SubSelect         ->  17 sec (2M of memory)
Using temp table  ->  32 sec (12M of memory: -S 8192)
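
The two formulations timed above can be checked for equivalence on a scaled-down copy of the test data. The sketch below is an assumption-laden stand-in: it uses SQLite through Python's sqlite3 module rather than PostgreSQL, replaces "select ... into table" with "create table ... as", uses far fewer duplicate rows, and does not try to reproduce the timings.

```python
import sqlite3


def compare_forms(tmp_dups=2, test2_dups=3):
    """Run the subselect form and the temp-table form; return both row lists."""
    con = sqlite3.connect(":memory:")
    con.execute("create table tmp (x int, y int)")
    con.execute("create table test2 (x2 int, y2 int)")
    con.executemany(
        "insert into tmp values (?, ?)",
        [(x, 100 - x) for x in range(10) for _ in range(tmp_dups)])
    con.executemany(
        "insert into test2 values (?, ?)",
        [(x2, 100 - x2) for x2 in range(1, 100) for _ in range(test2_dups)])

    # Form 1: correlated subselect (y refers to the outer tmp row).
    subselect = con.execute(
        "select x, y from tmp "
        "where x >= (select max(x2) from test2 where y2 = y)").fetchall()

    # Form 2: materialize the aggregate, then join against it.
    con.execute(
        "create table tsub as "
        "select y as ty, max(x2) as mx from test2, tmp "
        "where y2 = y group by y")
    joined = con.execute(
        "select x, y from tmp, tsub where x >= mx and y = ty").fetchall()
    con.close()
    return sorted(subselect), sorted(joined)


sub_rows, join_rows = compare_forms()
```

Both forms drop the x = 0 rows: the subselect yields NULL for them because no test2 row has y2 = 100, and the join finds no matching tsub row.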

Vadim
