Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 1a8a4e5

Browse files
committed
Code review for foreign/custom join pushdown patch.
Commit e7cb7ee included some design decisions that seem pretty questionable to me, and there was quite a lot of stuff not to like about the documentation and comments. Clean up as follows: * Consider foreign joins only between foreign tables on the same server, rather than between any two foreign tables with the same underlying FDW handler function. In most if not all cases, the FDW would simply have had to apply the same-server restriction itself (far more expensively, both for lack of caching and because it would be repeated for each combination of input sub-joins), or else risk nasty bugs. Anyone who's really intent on doing something outside this restriction can always use the set_join_pathlist_hook. * Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist to better reflect what they're for, and allow these custom scan tlists to be used even for base relations. * Change make_foreignscan() API to include passing the fdw_scan_tlist value, since the FDW is required to set that. Backwards compatibility doesn't seem like an adequate reason to expect FDWs to set it in some ad-hoc extra step, and anyway existing FDWs can just pass NIL. * Change the API of path-generating subroutines of add_paths_to_joinrel, and in particular that of GetForeignJoinPaths and set_join_pathlist_hook, so that various less-used parameters are passed in a struct rather than as separate parameter-list entries. The objective here is to reduce the probability that future additions to those parameter lists will result in source-level API breaks for users of these hooks. It's possible that this is even a small win for the core code, since most CPU architectures can't pass more than half a dozen parameters efficiently anyway. I kept root, joinrel, outerrel, innerrel, and jointype as separate parameters to reduce code churn in joinpath.c --- in particular, putting jointype into the struct would have been problematic because of the subroutines' habit of changing their local copies of that variable. * Avoid ad-hocery in ExecAssignScanProjectionInfo. It was probably all right for it to know about IndexOnlyScan, but if the list is to grow we should refactor the knowledge out to the callers. * Restore nodeForeignscan.c's previous use of the relcache to avoid extra GetFdwRoutine lookups for base-relation scans. * Lots of cleanup of documentation and missed comments. Re-order some code additions into more logical places.
1 parent c594c75 commit 1a8a4e5

File tree

26 files changed

+701
-653
lines changed

26 files changed

+701
-653
lines changed

contrib/file_fdw/file_fdw.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -561,7 +561,8 @@ fileGetForeignPlan(PlannerInfo *root,
561561
scan_clauses,
562562
scan_relid,
563563
NIL, /* no expressions to evaluate */
564-
best_path->fdw_private);
564+
best_path->fdw_private,
565+
NIL /* no custom tlist */ );
565566
}
566567

567568
/*

contrib/postgres_fdw/postgres_fdw.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -872,7 +872,8 @@ postgresGetForeignPlan(PlannerInfo *root,
872872
local_exprs,
873873
scan_relid,
874874
params_list,
875-
fdw_private);
875+
fdw_private,
876+
NIL /* no custom tlist */ );
876877
}
877878

878879
/*

doc/src/sgml/custom-scan.sgml

Lines changed: 71 additions & 63 deletions
Original file line numberDiff line numberDiff line change
@@ -32,12 +32,13 @@
3232
</para>
3333

3434
<sect1 id="custom-scan-path">
35-
<title>Implementing Custom Paths</title>
35+
<title>Creating Custom Scan Paths</title>
3636

3737
<para>
38-
A custom scan provider will typically add paths by setting the following
39-
hook, which is called after the core code has generated what it believes
40-
to be the complete and correct set of access paths for the relation.
38+
A custom scan provider will typically add paths for a base relation by
39+
setting the following hook, which is called after the core code has
40+
generated what it believes to be the complete and correct set of access
41+
paths for the relation.
4142
<programlisting>
4243
typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
4344
RelOptInfo *rel,
@@ -74,37 +75,36 @@ typedef struct CustomPath
7475
can support mark and restore. Both capabilities are optional.
7576
<structfield>custom_private</> can be used to store the custom path's
7677
private data. Private data should be stored in a form that can be handled
77-
by <literal>nodeToString</>, so that debugging routines which attempt to
78+
by <literal>nodeToString</>, so that debugging routines that attempt to
7879
print the custom path will work as designed. <structfield>methods</> must
7980
point to a (usually statically allocated) object implementing the required
8081
custom path methods, of which there are currently only two, as further
8182
detailed below.
8283
</para>
8384

8485
<para>
85-
A custom scan provider can also add join paths; in this case, the scan
86-
must produce the same output as would normally be produced by the join
87-
it replaces. To do this, the join provider should set the following hook.
88-
This hook may be invoked repeatedly for the same pair of relations, with
89-
different combinations of inner and outer relations; it is the
90-
responsibility of the hook to minimize duplicated work.
86+
A custom scan provider can also provide join paths. Just as for base
87+
relations, such a path must produce the same output as would normally be
88+
produced by the join it replaces. To do this, the join provider should
89+
set the following hook, and then within the hook function,
90+
create <structname>CustomPath</> path(s) for the join relation.
9191
<programlisting>
9292
typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
9393
RelOptInfo *joinrel,
9494
RelOptInfo *outerrel,
9595
RelOptInfo *innerrel,
96-
List *restrictlist,
9796
JoinType jointype,
98-
SpecialJoinInfo *sjinfo,
99-
SemiAntiJoinFactors *semifactors,
100-
Relids param_source_rels,
101-
Relids extra_lateral_rels);
97+
JoinPathExtraData *extra);
10298
extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
10399
</programlisting>
100+
101+
This hook will be invoked repeatedly for the same join relation, with
102+
different combinations of inner and outer relations; it is the
103+
responsibility of the hook to minimize duplicated work.
104104
</para>
105105

106106
<sect2 id="custom-scan-path-callbacks">
107-
<title>Custom Path Callbacks</title>
107+
<title>Custom Scan Path Callbacks</title>
108108

109109
<para>
110110
<programlisting>
@@ -125,7 +125,7 @@ void (*TextOutCustomPath) (StringInfo str,
125125
const CustomPath *node);
126126
</programlisting>
127127
Generate additional output when <function>nodeToString</> is invoked on
128-
this custom path. This callback is optional. Since
128+
this custom path. This callback is optional. Since
129129
<function>nodeToString</> will automatically dump all fields in the
130130
structure that it can see, including <structfield>custom_private</>, this
131131
is only useful if the <structname>CustomPath</> is actually embedded in a
@@ -135,7 +135,7 @@ void (*TextOutCustomPath) (StringInfo str,
135135
</sect1>
136136

137137
<sect1 id="custom-scan-plan">
138-
<title>Implementing Custom Plans</title>
138+
<title>Creating Custom Scan Plans</title>
139139

140140
<para>
141141
A custom scan is represented in a finished plan tree using the following
@@ -146,9 +146,9 @@ typedef struct CustomScan
146146
Scan scan;
147147
uint32 flags;
148148
List *custom_exprs;
149-
List *custom_ps_tlist;
150149
List *custom_private;
151-
List *custom_relids;
150+
List *custom_scan_tlist;
151+
Bitmapset *custom_relids;
152152
const CustomScanMethods *methods;
153153
} CustomScan;
154154
</programlisting>
@@ -158,49 +158,57 @@ typedef struct CustomScan
158158
<structfield>scan</> must be initialized as for any other scan, including
159159
estimated costs, target lists, qualifications, and so on.
160160
<structfield>flags</> is a bitmask with the same meaning as in
161-
<structname>CustomPath</>. <structfield>custom_exprs</> should be used to
161+
<structname>CustomPath</>.
162+
<structfield>custom_exprs</> should be used to
162163
store expression trees that will need to be fixed up by
163164
<filename>setrefs.c</> and <filename>subselect.c</>, while
164-
<literal>custom_private</> should be used to store other private data that
165-
is only used by the custom scan provider itself. Plan trees must be able
166-
to be duplicated using <function>copyObject</>, so all the data stored
167-
within these two fields must consist of nodes that function can handle.
168-
<literal>custom_relids</> is set by the core code to the set of relations
169-
which this scan node must handle; except when this scan is replacing a
170-
join, it will have only one member.
165+
<structfield>custom_private</> should be used to store other private data
166+
that is only used by the custom scan provider itself.
167+
<structfield>custom_scan_tlist</> can be NIL when scanning a base
168+
relation, indicating that the custom scan returns scan tuples that match
169+
the base relation's rowtype. Otherwise it is a targetlist describing
170+
the actual scan tuples. <structfield>custom_scan_tlist</> must be
171+
provided for joins, and could be provided for scans if the custom scan
172+
provider can compute some non-Var expressions.
173+
<structfield>custom_relids</> is set by the core code to the set of
174+
relations (rangetable indexes) that this scan node handles; except when
175+
this scan is replacing a join, it will have only one member.
171176
<structfield>methods</> must point to a (usually statically allocated)
172177
object implementing the required custom scan methods, which are further
173178
detailed below.
174179
</para>
175180

176181
<para>
177182
When a <structname>CustomScan</> scans a single relation,
178-
<structfield>scan.scanrelid</> should be the range table index of the table
179-
to be scanned, and <structfield>custom_ps_tlist</> should be
180-
<literal>NULL</>. When it replaces a join, <structfield>scan.scanrelid</>
181-
should be zero, and <structfield>custom_ps_tlist</> should be a list of
182-
<structname>TargetEntry</> nodes. This is necessary because, when a join
183-
is replaced, the target list cannot be constructed from the table
184-
definition. At execution time, this list will be used to initialize the
185-
tuple descriptor of the <structname>TupleTableSlot</>. It will also be
186-
used by <command>EXPLAIN</>, when deparsing.
183+
<structfield>scan.scanrelid</> must be the range table index of the table
184+
to be scanned. When it replaces a join, <structfield>scan.scanrelid</>
185+
should be zero.
186+
</para>
187+
188+
<para>
189+
Plan trees must be able to be duplicated using <function>copyObject</>,
190+
so all the data stored within the <quote>custom</> fields must consist of
191+
nodes that that function can handle. Furthermore, custom scan providers
192+
cannot substitute a larger structure that embeds
193+
a <structname>CustomScan</> for the structure itself, as would be possible
194+
for a <structname>CustomPath</> or <structname>CustomScanState</>.
187195
</para>
188196

189197
<sect2 id="custom-scan-plan-callbacks">
190-
<title>Custom Scan Callbacks</title>
198+
<title>Custom Scan Plan Callbacks</title>
191199
<para>
192200
<programlisting>
193201
Node *(*CreateCustomScanState) (CustomScan *cscan);
194202
</programlisting>
195203
Allocate a <structname>CustomScanState</> for this
196204
<structname>CustomScan</>. The actual allocation will often be larger than
197205
required for an ordinary <structname>CustomScanState</>, because many
198-
scan types will wish to embed that as the first field of a large structure.
206+
providers will wish to embed that as the first field of a larger structure.
199207
The value returned must have the node tag and <structfield>methods</>
200-
set appropriately, but the other fields need not be initialized at this
208+
set appropriately, but other fields should be left as zeroes at this
201209
stage; after <function>ExecInitCustomScan</> performs basic initialization,
202210
the <function>BeginCustomScan</> callback will be invoked to give the
203-
custom scan state a chance to do whatever else is needed.
211+
custom scan provider a chance to do whatever else is needed.
204212
</para>
205213

206214
<para>
@@ -209,23 +217,21 @@ void (*TextOutCustomScan) (StringInfo str,
209217
const CustomScan *node);
210218
</programlisting>
211219
Generate additional output when <function>nodeToString</> is invoked on
212-
this custom plan. This callback is optional. Since a
213-
<structname>CustomScan</> must be copyable by <function>copyObject</>,
214-
custom scan providers cannot substitute a larger structure that embeds a
215-
<structname>CustomScan</> for the structure itself, as would be possible
216-
for a <structname>CustomPath</> or <structname>CustomScanState</>.
217-
Therefore, providing this callback is unlikely to be useful.
220+
this custom plan node. This callback is optional. Since
221+
<function>nodeToString</> will automatically dump all fields in the
222+
structure, including the substructure of the <quote>custom</> fields,
223+
there is usually not much need for this callback.
218224
</para>
219225
</sect2>
220226
</sect1>
221227

222-
<sect1 id="custom-scan-scan">
223-
<title>Implementing Custom Scans</title>
228+
<sect1 id="custom-scan-execution">
229+
<title>Executing Custom Scans</title>
224230

225231
<para>
226232
When a <structfield>CustomScan</> is executed, its execution state is
227233
represented by a <structfield>CustomScanState</>, which is declared as
228-
follows.
234+
follows:
229235
<programlisting>
230236
typedef struct CustomScanState
231237
{
@@ -237,7 +243,9 @@ typedef struct CustomScanState
237243
</para>
238244

239245
<para>
240-
<structfield>ss</> must be initialized as for any other scanstate;
246+
<structfield>ss</> is initialized as for any other scanstate,
247+
except that if the scan is for a join rather than a base relation,
248+
<literal>ss.ss_currentRelation</> is left NULL.
241249
<structfield>flags</> is a bitmask with the same meaning as in
242250
<structname>CustomPath</> and <structname>CustomScan</>.
243251
<structfield>methods</> must point to a (usually statically allocated)
@@ -247,8 +255,8 @@ typedef struct CustomScanState
247255
structure embedding the above as its first member.
248256
</para>
249257

250-
<sect2 id="custom-scan-scan-callbacks">
251-
<title>Custom Execution-Time Callbacks</title>
258+
<sect2 id="custom-scan-execution-callbacks">
259+
<title>Custom Scan Execution Callbacks</title>
252260

253261
<para>
254262
<programlisting>
@@ -257,8 +265,8 @@ void (*BeginCustomScan) (CustomScanState *node,
257265
int eflags);
258266
</programlisting>
259267
Complete initialization of the supplied <structname>CustomScanState</>.
260-
Some initialization is performed by <function>ExecInitCustomScan</>, but
261-
any private fields should be initialized here.
268+
Standard fields have been initialized by <function>ExecInitCustomScan</>,
269+
but any private fields should be initialized here.
262270
</para>
263271

264272
<para>
@@ -276,8 +284,8 @@ TupleTableSlot *(*ExecCustomScan) (CustomScanState *node);
276284
void (*EndCustomScan) (CustomScanState *node);
277285
</programlisting>
278286
Clean up any private data associated with the <literal>CustomScanState</>.
279-
This method is required, but may not need to do anything if the associated
280-
data does not exist or will be cleaned up automatically.
287+
This method is required, but it does not need to do anything if there is
288+
no associated data or it will be cleaned up automatically.
281289
</para>
282290

283291
<para>
@@ -293,8 +301,8 @@ void (*ReScanCustomScan) (CustomScanState *node);
293301
void (*MarkPosCustomScan) (CustomScanState *node);
294302
</programlisting>
295303
Save the current scan position so that it can subsequently be restored
296-
by the <function>RestrPosCustomScan</> callback. This calback is optional,
297-
and need only be supplied if
304+
by the <function>RestrPosCustomScan</> callback. This callback is
305+
optional, and need only be supplied if the
298306
<literal>CUSTOMPATH_SUPPORT_MARK_RESTORE</> flag is set.
299307
</para>
300308

@@ -304,7 +312,7 @@ void (*RestrPosCustomScan) (CustomScanState *node);
304312
</programlisting>
305313
Restore the previous scan position as saved by the
306314
<function>MarkPosCustomScan</> callback. This callback is optional,
307-
and need only be supplied if
315+
and need only be supplied if the
308316
<literal>CUSTOMPATH_SUPPORT_MARK_RESTORE</> flag is set.
309317
</para>
310318

@@ -314,8 +322,8 @@ void (*ExplainCustomScan) (CustomScanState *node,
314322
List *ancestors,
315323
ExplainState *es);
316324
</programlisting>
317-
Output additional information on <command>EXPLAIN</> that involves
318-
custom-scan node. This callback is optional. Common data stored in the
325+
Output additional information for <command>EXPLAIN</> of a custom-scan
326+
plan node. This callback is optional. Common data stored in the
319327
<structname>ScanState</>, such as the target list and scan relation, will
320328
be shown even without this callback, but the callback allows the display
321329
of additional, private state.

0 commit comments

Comments
 (0)