
Commit fb60478

Improve JIT docs.
Author: John Naylor and Andres Freund
Discussion: https://postgr.es/m/CAJVSVGUs-VcwSY7-Kx-GQe__8hvWuA4Uhyf3gxoMXeiZqebE9g@mail.gmail.com
1 parent c1de1a3 commit fb60478

3 files changed, +56 -55 lines changed

doc/src/sgml/func.sgml

+2 -2

@@ -15945,8 +15945,8 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
      <row>
       <entry><literal><function>pg_jit_available()</function></literal></entry>
       <entry><type>boolean</type></entry>
-      <entry>is <acronym>JIT</acronym> available in this session (see <xref
-      linkend="jit"/>)? Returns <literal>false</literal> if <xref
+      <entry>is <acronym>JIT</acronym> compilation available in this session
+      (see <xref linkend="jit"/>)? Returns <literal>false</literal> if <xref
       linkend="guc-jit"/> is set to false.</entry>
      </row>

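For illustration, a minimal libpq sketch that queries the function documented above and prints its result; the connection string is a placeholder and the program is only a usage example, not part of the patch.

    /* Minimal libpq sketch: report whether JIT compilation is available in
     * the current session.  Connection string is a placeholder. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("dbname=postgres");
        PGresult   *res;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return EXIT_FAILURE;
        }

        res = PQexec(conn, "SELECT pg_jit_available()");
        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            printf("pg_jit_available: %s\n", PQgetvalue(res, 0, 0)); /* "t" or "f" */
        else
            fprintf(stderr, "query failed: %s", PQerrorMessage(conn));

        PQclear(res);
        PQfinish(conn);
        return EXIT_SUCCESS;
    }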
doc/src/sgml/jit.sgml

+15 -14

@@ -18,7 +18,7 @@
  </para>

  <sect1 id="jit-reason">
-  <title>What is <acronym>JIT</acronym>?</title>
+  <title>What is <acronym>JIT</acronym> compilation?</title>

  <para>
   Just-in-time compilation (<acronym>JIT</acronym>) is the process of turning
@@ -33,7 +33,7 @@

  <para>
   <productname>PostgreSQL</productname> has builtin support to perform
-   <acronym>JIT</acronym> using <ulink
+   <acronym>JIT</acronym> compilation using <ulink
   url="https://llvm.org/"><productname>LLVM</productname></ulink> when
   <productname>PostgreSQL</productname> was built with
   <literal>--with-llvm</literal> (see <xref linkend="configure-with-llvm"/>).
@@ -97,15 +97,15 @@
  <title>When to <acronym>JIT</acronym>?</title>

  <para>
-   <acronym>JIT</acronym> is beneficial primarily for long-running CPU bound
-   queries. Frequently these will be analytical queries. For short queries
-   the overhead of performing <acronym>JIT</acronym> will often be higher than
-   the time it can save.
+   <acronym>JIT</acronym> compilation is beneficial primarily for long-running
+   CPU bound queries. Frequently these will be analytical queries. For short
+   queries the added overhead of performing <acronym>JIT</acronym> compilation
+   will often be higher than the time it can save.
  </para>

  <para>
-   To determine whether <acronym>JIT</acronym> is used, the total cost of a
-   query (see <xref linkend="planner-stats-details"/> and <xref
+   To determine whether <acronym>JIT</acronym> compilation is used, the total
+   cost of a query (see <xref linkend="planner-stats-details"/> and <xref
   linkend="runtime-config-query-constants"/>) is used.
  </para>

@@ -117,9 +117,9 @@

  <para>
   If the planner, based on the above criterion, decided that
-   <acronym>JIT</acronym> is beneficial, two further decisions are
+   <acronym>JIT</acronym> compilation is beneficial, two further decisions are
   made. Firstly, if the query is more costly than the <xref
-   linkend="guc-jit-optimize-above-cost"/>, GUC expensive optimizations are
+   linkend="guc-jit-optimize-above-cost"/> GUC, expensive optimizations are
   used to improve the generated code. Secondly, if the query is more costly
   than the <xref linkend="guc-jit-inline-above-cost"/> GUC, short functions
   and operators used in the query will be inlined. Both of these operations
@@ -187,8 +187,9 @@ SET
 └─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
 </programlisting>
  As visible here, <acronym>JIT</acronym> was used, but inlining and
-  optimization were not. If <xref linkend="guc-jit-optimize-above-cost"/>,
-  <xref linkend="guc-jit-inline-above-cost"/> were lowered, just like <xref
+  expensive optimization were not. If <xref
+  linkend="guc-jit-optimize-above-cost"/>, <xref
+  linkend="guc-jit-inline-above-cost"/> were lowered, just like <xref
  linkend="guc-jit-above-cost"/>, that would change.
 </para>
 </sect1>
@@ -197,8 +198,8 @@ SET
  <title>Configuration</title>

  <para>
-   <xref linkend="guc-jit"/> determines whether <acronym>JIT</acronym> is
-   enabled or disabled.
+   <xref linkend="guc-jit"/> determines whether <acronym>JIT</acronym>
+   compilation is enabled or disabled.
  </para>

  <para>

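The threshold logic documented in these hunks can be summarized as a small sketch. The names JitChoices and decide_jit are invented for illustration; the real decision is made inside the planner/executor and additionally requires that a JIT provider could be loaded at all (see pg_jit_available() above).

    /* Simplified illustration of the documented cost-threshold decisions.
     * A value of -1 for a threshold disables the corresponding behavior. */
    #include <stdbool.h>

    typedef struct JitChoices
    {
        bool        use_jit;        /* perform JIT compilation at all? */
        bool        optimize;       /* apply expensive optimizations? */
        bool        inline_funcs;   /* inline short functions/operators? */
    } JitChoices;

    static JitChoices
    decide_jit(double total_cost,
               double jit_above_cost,
               double jit_optimize_above_cost,
               double jit_inline_above_cost)
    {
        JitChoices  c = {false, false, false};

        if (jit_above_cost >= 0 && total_cost > jit_above_cost)
        {
            c.use_jit = true;
            c.optimize = (jit_optimize_above_cost >= 0 &&
                          total_cost > jit_optimize_above_cost);
            c.inline_funcs = (jit_inline_above_cost >= 0 &&
                              total_cost > jit_inline_above_cost);
        }
        return c;
    }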
src/backend/jit/README

+39 -39
@@ -13,20 +13,20 @@ the CPU that just handles that expression, yielding a speedup.
 That this is done at query execution time, possibly even only in cases
 the relevant task is done a number of times, makes it JIT, rather than
 ahead-of-time (AOT). Given the way JIT compilation is used in
-postgres, the lines between interpretation, AOT and JIT are somewhat
+PostgreSQL, the lines between interpretation, AOT and JIT are somewhat
 blurry.

 Note that the interpreted program turned into a native program does
 not necessarily have to be a program in the classical sense. E.g. it
-is highly beneficial JIT compile tuple deforming into a native
+is highly beneficial to JIT compile tuple deforming into a native
 function just handling a specific type of table, despite tuple
 deforming not commonly being understood as a "program".


 Why JIT?
 ========

-Parts of postgres are commonly bottlenecked by comparatively small
+Parts of PostgreSQL are commonly bottlenecked by comparatively small
 pieces of CPU intensive code. In a number of cases that is because the
 relevant code has to be very generic (e.g. handling arbitrary SQL
 level expressions, over arbitrary tables, with arbitrary extensions
@@ -49,11 +49,11 @@ particularly beneficial for removing branches during tuple deforming.
 How to JIT
 ==========

-Postgres, by default, uses LLVM to perform JIT. LLVM was chosen
+PostgreSQL, by default, uses LLVM to perform JIT. LLVM was chosen
 because it is developed by several large corporations and therefore
 unlikely to be discontinued, because it has a license compatible with
-PostgreSQL, and because its LLVM IR can be generated from C
-using the clang compiler.
+PostgreSQL, and because its IR can be generated from C using the Clang
+compiler.


 Shared Library Separation
@@ -68,22 +68,22 @@ An additional benefit of doing so is that it is relatively easy to
 evaluate JIT compilation that does not use LLVM, by changing out the
 shared library used to provide JIT compilation.

-To achieve this code, e.g. expression evaluation, intending to perform
-JIT, calls a LLVM independent wrapper located in jit.c to do so. If
-the shared library providing JIT support can be loaded (i.e. postgres
-was compiled with LLVM support and the shared library is installed),
-the task of JIT compiling an expression gets handed of to shared
-library. This obviously requires that the function in jit.c is allowed
-to fail in case no JIT provider can be loaded.
+To achieve this, code intending to perform JIT (e.g. expression evaluation)
+calls an LLVM independent wrapper located in jit.c to do so. If the
+shared library providing JIT support can be loaded (i.e. PostgreSQL was
+compiled with LLVM support and the shared library is installed), the task
+of JIT compiling an expression gets handed off to the shared library. This
+obviously requires that the function in jit.c is allowed to fail in case
+no JIT provider can be loaded.

 Which shared library is loaded is determined by the jit_provider GUC,
 defaulting to "llvmjit".

 Cloistering code performing JIT into a shared library unfortunately
 also means that code doing JIT compilation for various parts of code
 has to be located separately from the code doing so without
-JIT. E.g. the JITed version of execExprInterp.c is located in
-jit/llvm/ rather than executor/.
+JIT. E.g. the JIT version of execExprInterp.c is located in jit/llvm/
+rather than executor/.


 JIT Context
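To make the wrapper/provider split concrete, an alternative provider is just a shared library that fills in a callback struct from its init function. The sketch below assumes the provider interface from src/include/jit/jit.h; the callback member names are an assumption, since this README excerpt does not spell them out.

    /* Sketch of an alternative JIT provider's entry point.  The callback
     * struct and member names (reset_after_error, release_context,
     * compile_expr) are assumed from src/include/jit/jit.h. */
    #include "postgres.h"
    #include "fmgr.h"
    #include "jit/jit.h"
    #include "nodes/execnodes.h"

    PG_MODULE_MAGIC;

    static void
    my_reset_after_error(void)
    {
        /* throw away any provider state kept across an aborted transaction */
    }

    static void
    my_release_context(JitContext *context)
    {
        /* free generated code and bookkeeping attached to this context */
    }

    static bool
    my_compile_expr(ExprState *state)
    {
        /* returning false falls back to interpreted expression evaluation */
        return false;
    }

    void
    _PG_jit_provider_init(JitProviderCallbacks *cb)
    {
        cb->reset_after_error = my_reset_after_error;
        cb->release_context = my_release_context;
        cb->compile_expr = my_compile_expr;
    }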
@@ -105,9 +105,9 @@ implementations.

 Emitting individual functions separately is more expensive than
 emitting several functions at once, and emitting them together can
-provide additional optimization opportunities. To facilitate that the
-LLVM provider separates function definition from emitting them in an
-executable way.
+provide additional optimization opportunities. To facilitate that, the
+LLVM provider separates defining functions from optimizing and
+emitting functions in an executable manner.

 Creating functions into the current mutable module (a module
 essentially is LLVM's equivalent of a translation unit in C) is done
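As a generic illustration of defining a function into a module separately from optimizing and emitting it, the LLVM C API lets one build IR without producing machine code yet. This standalone sketch is not PostgreSQL code; it only demonstrates the separation the paragraph above describes.

    /* Standalone LLVM C API sketch: define an add() function inside a module
     * without optimizing or emitting it; those would be separate, later steps. */
    #include <llvm-c/Core.h>

    int
    main(void)
    {
        LLVMModuleRef     module = LLVMModuleCreateWithName("example_module");
        LLVMTypeRef       param_types[] = {LLVMInt64Type(), LLVMInt64Type()};
        LLVMTypeRef       fn_type = LLVMFunctionType(LLVMInt64Type(), param_types, 2, 0);
        LLVMValueRef      fn = LLVMAddFunction(module, "add", fn_type);
        LLVMBasicBlockRef entry = LLVMAppendBasicBlock(fn, "entry");
        LLVMBuilderRef    builder = LLVMCreateBuilder();

        LLVMPositionBuilderAtEnd(builder, entry);
        LLVMBuildRet(builder,
                     LLVMBuildAdd(builder, LLVMGetParam(fn, 0),
                                  LLVMGetParam(fn, 1), "sum"));

        /* At this point the function exists only as IR in the module;
         * nothing has been optimized or emitted as executable code. */
        LLVMDumpModule(module);

        LLVMDisposeBuilder(builder);
        LLVMDisposeModule(module);
        return 0;
    }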
@@ -127,7 +127,7 @@ used.
 Error Handling
 --------------

-There are two aspects to error handling. Firstly, generated (LLVM IR)
+There are two aspects of error handling. Firstly, generated (LLVM IR)
 and emitted functions (mmap()ed segments) need to be cleaned up both
 after a successful query execution and after an error. This is done by
 registering each created JITContext with the current resource owner,
@@ -140,12 +140,12 @@ cleaning up emitted code upon ERROR, but there's also the chance that
 LLVM itself runs out of memory. LLVM by default does *not* use any C++
 exceptions. Its allocations are primarily funneled through the
 standard "new" handlers, and some direct use of malloc() and
-mmap(). For the former a 'new handler' exists
-http://en.cppreference.com/w/cpp/memory/new/set_new_handler for the
-latter LLVM provides callback that get called upon failure
-(unfortunately mmap() failures are treated as fatal rather than OOM
-errors). What we've, for now, chosen to do, is to have two functions
-that LLVM using code must use:
+mmap(). For the former a 'new handler' exists:
+http://en.cppreference.com/w/cpp/memory/new/set_new_handler
+For the latter LLVM provides callbacks that get called upon failure
+(unfortunately mmap() failures are treated as fatal rather than OOM errors).
+What we've chosen to do for now is have two functions that LLVM using code
+must use:
 extern void llvm_enter_fatal_on_oom(void);
 extern void llvm_leave_fatal_on_oom(void);
 before interacting with LLVM code.
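A sketch of the usage pattern these two functions imply: any code that calls into LLVM is bracketed by the fatal-on-OOM handlers. Here emit_expression_code() is a hypothetical stand-in for the actual LLVM-using code.

    /* Sketch of the enter/leave pattern described above, so an allocation
     * failure inside LLVM escalates to FATAL instead of leaving the process
     * in an inconsistent state.  emit_expression_code() is hypothetical. */
    extern void llvm_enter_fatal_on_oom(void);
    extern void llvm_leave_fatal_on_oom(void);

    static void
    emit_expression_code(void *context)
    {
        /* ... build modules, define functions, optimize, emit ... */
        (void) context;
    }

    static void
    compile_expression(void *context)
    {
        llvm_enter_fatal_on_oom();
        emit_expression_code(context);
        llvm_leave_fatal_on_oom();
    }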
@@ -160,31 +160,31 @@ the handlers instead are reset on toplevel sigsetjmp() level.

 Using a relatively small enter/leave protected section of code, rather
 than setting up these handlers globally, avoids negative interactions
-with extensions that might use C++ like e.g. postgis. As LLVM code
+with extensions that might use C++ such as PostGIS. As LLVM code
 generation should never execute arbitrary code, just setting these
 handlers temporarily ought to suffice.


 Type Synchronization
 --------------------

-To able to generate code performing tasks that are done in "interpreted"
-postgres, it obviously is required that code generation knows about at
-least a few postgres types. While it is possible to inform LLVM about
+To be able to generate code that can perform tasks done by "interpreted"
+PostgreSQL, it obviously is required that code generation knows about at
+least a few PostgreSQL types. While it is possible to inform LLVM about
 type definitions by recreating them manually in C code, that is failure
 prone and labor intensive.

 Instead there is one small file (llvmjit_types.c) which references each of
 the types required for JITing. That file is translated to bitcode at
 compile time, and loaded when LLVM is initialized in a backend.

-That works very well to synchronize the type definition, unfortunately
+That works very well to synchronize the type definition, but unfortunately
 it does *not* synchronize offsets as the IR level representation doesn't
-know field names. Instead required offsets are maintained as defines in
-the original struct definition. E.g.
+know field names. Instead, required offsets are maintained as defines in
+the original struct definition, like so:
 #define FIELDNO_TUPLETABLESLOT_NVALID 9
 int tts_nvalid; /* # of valid values in tts_values */
-while that still needs to be defined, it's only required for a
+While that still needs to be defined, it's only required for a
 relatively small number of fields, and it's bunched together with the
 struct definition, so it's easily kept synchronized.

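To illustrate the convention, the define sits directly next to the field it numbers inside the struct definition. The ExampleSlot struct below is hypothetical and heavily abbreviated; it only shows the placement pattern.

    /* Abbreviated, hypothetical illustration of the FIELDNO convention:
     * the field number lives directly above the field it refers to, so both
     * are kept in sync when the struct changes. */
    typedef struct ExampleSlot
    {
        /* ... preceding fields ... */
    #define FIELDNO_EXAMPLESLOT_NVALID 9
        int         nvalid;     /* # of valid values */
        /* ... remaining fields ... */
    } ExampleSlot;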
@@ -193,12 +193,12 @@ Inlining
 --------

 One big advantage of JITing expressions is that it can significantly
-reduce the overhead of postgres's extensible function/operator
-mechanism, by inlining the body of called functions / operators.
+reduce the overhead of PostgreSQL's extensible function/operator
+mechanism, by inlining the body of called functions/operators.

 It obviously is undesirable to maintain a second implementation of
 commonly used functions, just for inlining purposes. Instead we take
-advantage of the fact that the clang compiler can emit LLVM IR.
+advantage of the fact that the Clang compiler can emit LLVM IR.

 The ability to do so allows us to get the LLVM IR for all operators
 (e.g. int8eq, float8pl etc), without maintaining two copies. These
@@ -225,7 +225,7 @@ Caching
 Currently it is not yet possible to cache generated functions, even
 though that'd be desirable from a performance point of view. The
 problem is that the generated functions commonly contain pointers into
-per-execution memory. The expression evaluation functionality needs to
+per-execution memory. The expression evaluation machinery needs to
 be redesigned a bit to avoid that. Basically all per-execution memory
 needs to be referenced as an offset to one block of memory stored in
 an ExprState, rather than absolute pointers into memory.
@@ -278,7 +278,7 @@ Currently there are a number of GUCs that influence JITing:
 - jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if query has
   higher cost.

-whenever a query's total cost is above these limits, JITing is
+Whenever a query's total cost is above these limits, JITing is
 performed.

 Alternative costing models, e.g. by generating separate paths for
@@ -291,5 +291,5 @@ individual expressions.
 The obvious seeming approach of JITing expressions individually after
 a number of execution turns out not to work too well. Primarily
 because emitting many small functions individually has significant
-overhead. Secondarily because the time till JITing occurs causes
+overhead. Secondarily because the time until JITing occurs causes
 relative slowdowns that eat into the gain of JIT compilation.
