@@ -13,20 +13,20 @@ the CPU that just handles that expression, yielding a speedup.
That this is done at query execution time, possibly even only in cases
the relevant task is done a number of times, makes it JIT, rather than
ahead-of-time (AOT). Given the way JIT compilation is used in
- postgres, the lines between interpretation, AOT and JIT are somewhat
+ PostgreSQL, the lines between interpretation, AOT and JIT are somewhat
blurry.

Note that the interpreted program turned into a native program does
not necessarily have to be a program in the classical sense. E.g. it
- is highly beneficial JIT compile tuple deforming into a native
+ is highly beneficial to JIT compile tuple deforming into a native
function just handling a specific type of table, despite tuple
deforming not commonly being understood as a "program".
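
To make the tuple deforming point concrete, here is a conceptual sketch in
plain C (hypothetical types and layout, not PostgreSQL source): the generic
routine re-interprets a column descriptor on every call, while the
"generated" routine is specialized for one table whose columns are known to
be a 4-byte and an 8-byte integer at fixed offsets.

    #include <stdint.h>
    #include <string.h>

    typedef struct ColumnDesc { int len; } ColumnDesc;  /* hypothetical */

    /* Generic: loops and branches per attribute, driven by the descriptor. */
    static void
    deform_generic(const char *tup, int natts, const ColumnDesc *desc,
                   int64_t *values)
    {
        int off = 0;

        for (int i = 0; i < natts; i++)
        {
            if (desc[i].len == 4)
            {
                int32_t v;
                memcpy(&v, tup + off, sizeof(v));
                values[i] = v;
            }
            else
            {
                int64_t v;
                memcpy(&v, tup + off, sizeof(v));
                values[i] = v;
            }
            off += desc[i].len;
        }
    }

    /* Specialized: fixed offsets and widths, no loop, no per-column branches. */
    static void
    deform_two_columns(const char *tup, int64_t *values)
    {
        int32_t c0;
        int64_t c1;

        memcpy(&c0, tup + 0, sizeof(c0));
        memcpy(&c1, tup + 4, sizeof(c1));
        values[0] = c0;
        values[1] = c1;
    }

Eliminating that per-attribute branching and descriptor traversal is where
most of the deforming speedup comes from.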

Why JIT?
========

- Parts of postgres are commonly bottlenecked by comparatively small
+ Parts of PostgreSQL are commonly bottlenecked by comparatively small
pieces of CPU intensive code. In a number of cases that is because the
relevant code has to be very generic (e.g. handling arbitrary SQL
level expressions, over arbitrary tables, with arbitrary extensions
@@ -49,11 +49,11 @@ particularly beneficial for removing branches during tuple deforming.
How to JIT
==========

- Postgres, by default, uses LLVM to perform JIT. LLVM was chosen
+ PostgreSQL, by default, uses LLVM to perform JIT. LLVM was chosen
because it is developed by several large corporations and therefore
unlikely to be discontinued, because it has a license compatible with
- PostgreSQL, and because its LLVM IR can be generated from C
- using the clang compiler.
+ PostgreSQL, and because its IR can be generated from C using the Clang
+ compiler.

Shared Library Separation
@@ -68,22 +68,22 @@ An additional benefit of doing so is that it is relatively easy to
evaluate JIT compilation that does not use LLVM, by changing out the
shared library used to provide JIT compilation.

- To achieve this code, e.g. expression evaluation, intending to perform
- JIT, calls a LLVM independent wrapper located in jit.c to do so. If
- the shared library providing JIT support can be loaded (i.e. postgres
- was compiled with LLVM support and the shared library is installed),
- the task of JIT compiling an expression gets handed of to shared
- library. This obviously requires that the function in jit.c is allowed
- to fail in case no JIT provider can be loaded.
+ To achieve this, code intending to perform JIT (e.g. expression evaluation)
+ calls an LLVM independent wrapper located in jit.c to do so. If the
+ shared library providing JIT support can be loaded (i.e. PostgreSQL was
+ compiled with LLVM support and the shared library is installed), the task
+ of JIT compiling an expression gets handed off to the shared library. This
+ obviously requires that the function in jit.c is allowed to fail in case
+ no JIT provider can be loaded.

Which shared library is loaded is determined by the jit_provider GUC,
defaulting to "llvmjit".
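
To illustrate the shape of this indirection (a sketch modeled loosely on
jit.h/jit.c, with a hypothetical provider_loaded flag; the real callback
struct and loading logic differ in detail), the provider library fills in a
set of function pointers, and the wrapper simply reports failure when no
provider could be loaded:

    #include <stdbool.h>
    #include <stddef.h>

    struct ExprState;                       /* opaque here */
    struct JitContext;

    typedef struct JitProviderCallbacks
    {
        bool    (*compile_expr) (struct ExprState *state);
        void    (*release_context) (struct JitContext *context);
        void    (*reset_after_error) (void);
    } JitProviderCallbacks;

    static JitProviderCallbacks provider;   /* filled in by the loaded library */
    static bool provider_loaded;            /* hypothetical "did loading work?" flag */

    /* LLVM independent wrapper: callers treat "false" as "no JIT available"
     * and silently fall back to interpreted execution. */
    bool
    jit_compile_expr(struct ExprState *state)
    {
        if (!provider_loaded || provider.compile_expr == NULL)
            return false;
        return provider.compile_expr(state);
    }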

Cloistering code performing JIT into a shared library unfortunately
also means that code doing JIT compilation for various parts of code
has to be located separately from the code doing so without
- JIT. E.g. the JITed version of execExprInterp.c is located in
- jit/llvm/ rather than executor/.
+ JIT. E.g. the JIT version of execExprInterp.c is located in jit/llvm/
+ rather than executor/.

JIT Context
@@ -105,9 +105,9 @@ implementations.

Emitting individual functions separately is more expensive than
emitting several functions at once, and emitting them together can
- provide additional optimization opportunities. To facilitate that the
- LLVM provider separates function definition from emitting them in an
- executable way.
+ provide additional optimization opportunities. To facilitate that, the
+ LLVM provider separates defining functions from optimizing and
+ emitting functions in an executable manner.

Creating functions into the current mutable module (a module
essentially is LLVM's equivalent of a translation unit in C) is done
@@ -127,7 +127,7 @@ used.
Error Handling
--------------

- There are two aspects to error handling. Firstly, generated (LLVM IR)
+ There are two aspects of error handling. Firstly, generated (LLVM IR)
and emitted functions (mmap()ed segments) need to be cleaned up both
after a successful query execution and after an error. This is done by
registering each created JITContext with the current resource owner,
@@ -140,12 +140,12 @@ cleaning up emitted code upon ERROR, but there's also the chance that
LLVM itself runs out of memory. LLVM by default does *not* use any C++
exceptions. Its allocations are primarily funneled through the
standard "new" handlers, and some direct use of malloc() and
- mmap(). For the former a 'new handler' exists
- http://en.cppreference.com/w/cpp/memory/new/set_new_handler for the
- latter LLVM provides callback that get called upon failure
- (unfortunately mmap() failures are treated as fatal rather than OOM
- errors). What we've, for now, chosen to do, is to have two functions
- that LLVM using code must use:
+ mmap(). For the former a 'new handler' exists:
+ http://en.cppreference.com/w/cpp/memory/new/set_new_handler
+ For the latter LLVM provides callbacks that get called upon failure
+ (unfortunately mmap() failures are treated as fatal rather than OOM errors).
+ What we've chosen to do for now is have two functions that LLVM using code
+ must use:
extern void llvm_enter_fatal_on_oom(void);
extern void llvm_leave_fatal_on_oom(void);
before interacting with LLVM code.
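
The intended usage is a small bracket around the LLVM interactions, for
example (a sketch; the middle comment stands for whatever LLVM API calls the
caller makes):

    llvm_enter_fatal_on_oom();
    /* ... build IR, optimize, emit code via LLVM ... */
    llvm_leave_fatal_on_oom();

Inside the bracket an out-of-memory condition is escalated to a fatal error
rather than surfacing through a mechanism LLVM cannot cope with.
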
@@ -160,31 +160,31 @@ the handlers instead are reset on toplevel sigsetjmp() level.

Using a relatively small enter/leave protected section of code, rather
than setting up these handlers globally, avoids negative interactions
- with extensions that might use C++ like e.g. postgis. As LLVM code
+ with extensions that might use C++ such as PostGIS. As LLVM code
generation should never execute arbitrary code, just setting these
handlers temporarily ought to suffice.

Type Synchronization
--------------------

- To able to generate code performing tasks that are done in "interpreted"
- postgres, it obviously is required that code generation knows about at
- least a few postgres types. While it is possible to inform LLVM about
+ To be able to generate code that can perform tasks done by "interpreted"
+ PostgreSQL, it obviously is required that code generation knows about at
+ least a few PostgreSQL types. While it is possible to inform LLVM about
type definitions by recreating them manually in C code, that is failure
prone and labor intensive.

Instead there is one small file (llvmjit_types.c) which references each of
the types required for JITing. That file is translated to bitcode at
compile time, and loaded when LLVM is initialized in a backend.
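
The "references" can be as simple as otherwise-unused global variables of
the relevant types, so their full definitions are carried into the bitcode
(a rough sketch of the idea only; the actual contents and headers used by
llvmjit_types.c differ in detail):

    #include "postgres.h"
    #include "access/htup.h"
    #include "executor/tuptable.h"

    /* Referencing the types forces their definitions into the bitcode. */
    TupleTableSlot  StructTupleTableSlot;
    HeapTupleData   StructHeapTupleData;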

- That works very well to synchronize the type definition, unfortunately
+ That works very well to synchronize the type definition, but unfortunately
it does *not* synchronize offsets as the IR level representation doesn't
- know field names. Instead required offsets are maintained as defines in
- the original struct definition. E.g.
+ know field names. Instead, required offsets are maintained as defines in
+ the original struct definition, like so:
#define FIELDNO_TUPLETABLESLOT_NVALID 9
int tts_nvalid; /* # of valid values in tts_values */
- while that still needs to be defined, it's only required for a
+ While that still needs to be defined, it's only required for a
relatively small number of fields, and it's bunched together with the
struct definition, so it's easily kept synchronized.
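
On the code generation side such a define is then used as the field index
when building a struct member access. A sketch using the LLVM C API (older
C API spellings; newer LLVM releases use the LLVMBuildStructGEP2/LLVMBuildLoad2
variants, and the thin helpers PostgreSQL wraps around these calls are
omitted; builder and v_slot are assumed to be set up by the caller):

    #include <llvm-c/Core.h>
    #include "postgres.h"
    #include "executor/tuptable.h"    /* FIELDNO_TUPLETABLESLOT_NVALID */

    /* Emit IR that loads slot->tts_nvalid from a TupleTableSlot pointer. */
    static LLVMValueRef
    load_slot_nvalid(LLVMBuilderRef builder, LLVMValueRef v_slot)
    {
        LLVMValueRef v_nvalidp;

        v_nvalidp = LLVMBuildStructGEP(builder, v_slot,
                                       FIELDNO_TUPLETABLESLOT_NVALID,
                                       "v_nvalidp");
        return LLVMBuildLoad(builder, v_nvalidp, "v_nvalid");
    }

If the define and the struct member ever drift apart, the generated code
silently reads the wrong field, which is why the define is kept directly
next to the member it describes.
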
@@ -193,12 +193,12 @@ Inlining
--------

One big advantage of JITing expressions is that it can significantly
- reduce the overhead of postgres's extensible function/operator
- mechanism, by inlining the body of called functions / operators.
+ reduce the overhead of PostgreSQL's extensible function/operator
+ mechanism, by inlining the body of called functions/operators.

It obviously is undesirable to maintain a second implementation of
commonly used functions, just for inlining purposes. Instead we take
- advantage of the fact that the clang compiler can emit LLVM IR.
+ advantage of the fact that the Clang compiler can emit LLVM IR.
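
For illustration, an operator implementation is ordinary C; the following is
a simplified sketch in the style of int8eq, not the verbatim source.
Compiling such files to bitcode with Clang gives the JIT provider function
bodies it can later splice into generated expression code:

    #include "postgres.h"
    #include "fmgr.h"

    /* bigint equality: written once, used both compiled and inlined */
    Datum
    int8eq(PG_FUNCTION_ARGS)
    {
        int64   arg1 = PG_GETARG_INT64(0);
        int64   arg2 = PG_GETARG_INT64(1);

        PG_RETURN_BOOL(arg1 == arg2);
    }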
201
+ advantage of the fact that the Clang compiler can emit LLVM IR.
202
202
203
203
The ability to do so allows us to get the LLVM IR for all operators
204
204
(e.g. int8eq, float8pl etc), without maintaining two copies. These
@@ -225,7 +225,7 @@ Caching
Currently it is not yet possible to cache generated functions, even
though that'd be desirable from a performance point of view. The
problem is that the generated functions commonly contain pointers into
- per-execution memory. The expression evaluation functionality needs to
+ per-execution memory. The expression evaluation machinery needs to
be redesigned a bit to avoid that. Basically all per-execution memory
needs to be referenced as an offset to one block of memory stored in
an ExprState, rather than absolute pointers into memory.
@@ -278,7 +278,7 @@ Currently there are a number of GUCs that influence JITing:
- jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if query has
  higher cost.

- whenever a query's total cost is above these limits, JITing is
+ Whenever a query's total cost is above these limits, JITing is
performed.
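
The decision itself is a set of threshold comparisons against the plan's
total cost, roughly like the following sketch (flag names here are
illustrative, not the actual planner symbols; jit_enabled and the three
*_above_cost GUC variables are assumed to be visible):

    int         jitFlags = 0;

    if (jit_enabled && jit_above_cost >= 0 &&
        top_plan->total_cost > jit_above_cost)
    {
        jitFlags |= JIT_PERFORM;            /* generate code, unoptimized */

        if (jit_optimize_above_cost >= 0 &&
            top_plan->total_cost > jit_optimize_above_cost)
            jitFlags |= JIT_OPTIMIZE;       /* also run the expensive -O passes */

        if (jit_inline_above_cost >= 0 &&
            top_plan->total_cost > jit_inline_above_cost)
            jitFlags |= JIT_INLINE;         /* also attempt inlining */
    }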

Alternative costing models, e.g. by generating separate paths for
@@ -291,5 +291,5 @@ individual expressions.
The obvious seeming approach of JITing expressions individually after
a number of executions turns out not to work too well. Primarily
because emitting many small functions individually has significant
- overhead. Secondarily because the time till JITing occurs causes
+ overhead. Secondarily because the time until JITing occurs causes
relative slowdowns that eat into the gain of JIT compilation.