Commit f644c3b

doc: Update parallel join documentation for Parallel Shared Hash.
Thomas Munro

Discussion: http://postgr.es/m/CAEepm=3XdL=+bn3=WQVCCT5wwfAEv-4onKpk+XQZdwDXv6etzA@mail.gmail.com
1 parent 649f179 commit f644c3b

1 file changed: +32 -15 lines changed


doc/src/sgml/parallel.sgml

Lines changed: 32 additions & 15 deletions
@@ -323,23 +323,40 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
 more other tables using a nested loop, hash join, or merge join. The
 inner side of the join may be any kind of non-parallel plan that is
 otherwise supported by the planner provided that it is safe to run within
-a parallel worker. For example, if a nested loop join is chosen, the
-inner plan may be an index scan which looks up a value taken from the outer
-side of the join.
+a parallel worker. Depending on the join type, the inner side may also be
+a parallel plan.
 </para>
 
-<para>
-Each worker will execute the inner side of the join in full. This is
-typically not a problem for nested loops, but may be inefficient for
-cases involving hash or merge joins. For example, for a hash join, this
-restriction means that an identical hash table is built in each worker
-process, which works fine for joins against small tables but may not be
-efficient when the inner table is large. For a merge join, it might mean
-that each worker performs a separate sort of the inner relation, which
-could be slow. Of course, in cases where a parallel plan of this type
-would be inefficient, the query planner will normally choose some other
-plan (possibly one which does not use parallelism) instead.
-</para>
+<itemizedlist>
+<listitem>
+<para>
+In a <emphasis>nested loop join</emphasis>, the inner side is always
+non-parallel. Although it is executed in full, this is efficient if
+the inner side is an index scan, because the outer tuples and thus
+the loops that look up values in the index are divided over the
+cooperating processes.
+</para>
+</listitem>
+<listitem>
+<para>
+In a <emphasis>merge join</emphasis>, the inner side is always
+a non-parallel plan and therefore executed in full. This may be
+inefficient, especially if a sort must be performed, because the work
+and resulting data are duplicated in every cooperating process.
+</para>
+</listitem>
+<listitem>
+<para>
+In a <emphasis>hash join</emphasis> (without the "parallel" prefix),
+the inner side is executed in full by every cooperating process
+to build identical copies of the hash table. This may be inefficient
+if the hash table is large or the plan is expensive. In a
+<emphasis>parallel hash join</emphasis>, the inner side is a
+<emphasis>parallel hash</emphasis> that divides the work of building
+a shared hash table over the cooperating processes.
+</para>
+</listitem>
+</itemizedlist>
 </sect2>
 
 <sect2 id="parallel-aggregation">
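
Reviewer's illustration (not part of the commit): the difference the new text draws between a hash join and a parallel hash join can be sketched as a toy model. This is a sequential simulation in Python, not PostgreSQL's implementation; the function names, the round-robin split, and the `build_work` counter are all hypothetical and exist only to count how many inner tuples get hashed in total across the cooperating processes.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (join key, payload); a made-up row shape for this sketch


def split_round_robin(rows: List[Row], n: int) -> List[List[Row]]:
    """Divide rows over n simulated processes (assumed distribution scheme)."""
    return [rows[i::n] for i in range(n)]


def probe(table: Dict[int, List[str]], chunk: List[Row]) -> List[Tuple[int, str, str]]:
    """Look up each outer row of one process's chunk in a hash table."""
    return [(k, ov, iv) for k, ov in chunk for iv in table.get(k, [])]


def hash_join_private(outer: List[Row], inner: List[Row], workers: int):
    """Plain hash join: every process reads the whole inner side and
    builds its own identical copy of the hash table."""
    build_work = 0
    result = []
    for chunk in split_round_robin(outer, workers):
        table: Dict[int, List[str]] = {}
        for k, iv in inner:  # inner side executed in full, per process
            table.setdefault(k, []).append(iv)
            build_work += 1
        result.extend(probe(table, chunk))
    return sorted(result), build_work


def hash_join_shared(outer: List[Row], inner: List[Row], workers: int):
    """Parallel hash join: the processes divide the inner side and
    insert into one shared hash table, which all of them then probe."""
    build_work = 0
    table: Dict[int, List[str]] = {}
    for chunk in split_round_robin(inner, workers):
        for k, iv in chunk:  # each process hashes only its share
            table.setdefault(k, []).append(iv)
            build_work += 1
    result = []
    for chunk in split_round_robin(outer, workers):
        result.extend(probe(table, chunk))
    return sorted(result), build_work
```

Under these assumptions both functions return the same joined rows, but the first one's build work grows with the number of processes while the second one's stays at the size of the inner side. The model ignores synchronization, memory limits, and batching, all of which matter in a real executor.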
