  <para>
   <application>pg_test_timing</> is a tool to measure the timing overhead
   on your system and confirm that the system time never moves backwards.
   Systems that are slow to collect timing data can give less accurate
   <command>EXPLAIN ANALYZE</command> results.
  </para>
 </refsect1>
  <title>Interpreting results</title>

  <para>
   Good results will show most (>90%) individual timing calls take less than
   one microsecond. Average per loop overhead will be even lower, below 100
   nanoseconds. This example from an Intel i7-860 system using a TSC clock
   source shows excellent performance:

<screen>
Testing timing overhead for 3 seconds.
...
Histogram of timing durations:
...
      2:    2999652   3.59518%
      1:   80435604  96.40465%
</screen>
  </para>

  <para>
   Note that different units are used for the per loop time than the
   histogram. The loop can have resolution within a few nanoseconds (nsec),
   while the individual timing calls can only resolve down to one microsecond
   (usec).
  </para>
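
  <para>
   As a rough cross-check of the figures above, the per loop time can be
   recovered from the histogram itself: the three second run collected about
   83.4 million timing calls in the two buckets shown, which works out to
   roughly 36 ns per loop, consistent with the sub-100 ns expectation above.
   For example, using <application>psql</application> as a calculator (the
   column name is arbitrary, and the rare slower buckets are ignored):

<screen>
SELECT round(3 * 1000000000.0 / (80435604 + 2999652)) AS approx_ns_per_loop;
</screen>
  </para>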

 </refsect2>
 <refsect2>
  <title>Measuring executor timing overhead</title>

  <para>
   When the query executor is running a statement using
   <command>EXPLAIN ANALYZE</command>, individual operations are timed as well
   as showing a summary. The overhead of your system can be checked by
   counting rows with the <application>psql</application> program:

<screen>
CREATE TABLE t AS SELECT * FROM generate_series(1,100000);
\timing
SELECT COUNT(*) FROM t;
EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
</screen>
  </para>

  <para>
   On the i7-860 system measured above, the count query runs in 9.8 ms while
   the <command>EXPLAIN ANALYZE</command> version takes 16.6 ms, each
   processing just over 100,000 rows. That 6.8 ms difference means the timing
   overhead per row is 68 ns, about twice what
   <application>pg_test_timing</> estimated it would be. Even that relatively
   small amount of overhead makes the fully timed count statement take almost
   70% longer. On more substantial queries, the timing overhead would be less
   problematic.
  </para>
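
  <para>
   The 68 ns figure is just the difference between the two timings spread
   over the rows processed. To redo that arithmetic for your own
   measurements, the same <application>psql</application> session can be
   used; the query below simply restates the numbers above (the column name
   is arbitrary):

<screen>
SELECT round((16.6 - 9.8) * 1000000 / 100000) AS overhead_ns_per_row;
</screen>
  </para>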

 </refsect2>

 <refsect2>
  <title>Changing time sources</title>
  <para>
   On some newer Linux systems, it's possible to change the clock source used
   to collect timing data at any time. A second example shows the slowdown
   possible from switching to the slower acpi_pm time source, on the same
   system used for the fast results above:

<screen>
# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
# echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
# pg_test_timing
...
Histogram of timing durations:
...
      2:    2990371  72.05956%
      1:    1155682  27.84870%
</screen>
  </para>

  <para>
   In this configuration, the sample <command>EXPLAIN ANALYZE</command> above
   takes 115.9 ms. That's 1061 nsec of timing overhead, again a small multiple
   of what's measured directly by this utility. That much timing overhead
   means the actual query itself is only taking a tiny fraction of the
   accounted for time; most of it is being consumed in overhead instead. Any
   <command>EXPLAIN ANALYZE</command> totals involving many timed operations
   would be inflated significantly by this much timing overhead.
  </para>
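
  <para>
   To make "tiny fraction" concrete: with these figures, only about 9.8 ms of
   the 115.9 ms reported is the query itself, roughly 8.5%. A
   back-of-the-envelope check of that ratio, again done through
   <application>psql</application> (the column name is arbitrary):

<screen>
SELECT round(9.8 / 115.9 * 100, 1) AS query_pct_of_reported_time;
</screen>
  </para>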

  <para>
   FreeBSD also allows changing the time source on the fly, and it logs
   information about the timer selected during boot:

<screen>
dmesg | grep "Timecounter"
sysctl kern.timecounter.hardware=TSC
</screen>
  </para>

  <para>
   Other systems may only allow setting the time source on boot. On older
   Linux systems the "clock" kernel setting is the only way to make this sort
   of change. And even on some more recent ones, the only option you'll see
   for a clock source is "jiffies". Jiffies are the older Linux software clock
   implementation, which can have good resolution when it's backed by fast
   enough timing hardware, as in this example:

<screen>
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
jiffies
$ dmesg | grep time.c
time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
time.c: Detected 2400.153 MHz processor.
$ pg_test_timing
Testing timing overhead for 3 seconds.
Per timing duration including loop overhead: 97.75 ns
Histogram of timing durations:
...
      2:    2993204   9.75277%
      1:   27694571  90.23734%
</screen>
  </para>

 </refsect2>

 <refsect2>
<title>Clock hardware and timing accuracy</title>

  <para>
   Collecting accurate timing information is normally done on computers using
   hardware clocks with various levels of accuracy. With some hardware the
   operating system can pass the system clock time almost directly to
   programs. A system clock can also be derived from a chip that simply
   provides timing interrupts, periodic ticks at some known time interval. In
   either case, operating system kernels provide a clock source that hides
   these details. But the accuracy of that clock source and how quickly it can
   return results varies based on the underlying hardware.
  </para>

  <para>
   Inaccurate timekeeping can result in system instability. Test any change
   to the clock source very carefully. Operating system defaults are sometimes
   made to favor reliability over best accuracy. And if you are using a virtual
   machine, look into the recommended time sources compatible with it. Virtual
   hardware faces additional difficulties when emulating timers, and there are
   often per-operating-system settings suggested by vendors.
  </para>
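
  <para>
   On a Linux guest, the same sysfs interface shown earlier reports what the
   hypervisor is providing. For example, a KVM guest will typically select a
   paravirtualized clock source (the exact name varies by hypervisor and
   kernel version):

<screen>
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock
</screen>
  </para>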

  <para>
   The Time Stamp Counter (TSC) clock source is the most accurate one available
   on current generation CPUs. It's the preferred way to track the system time
   when it's supported by the operating system and the TSC clock is
   reliable. There are several ways that TSC can fail to provide an accurate
   timing source, making it unreliable. Older systems can have a TSC clock that
   varies based on the CPU temperature, making it unusable for timing. Trying
   to use TSC on some older multi-core CPUs can give a reported time that's
   inconsistent among multiple cores. This can result in the time going
   backwards, a problem this program checks for. And even the newest systems
   can fail to provide accurate TSC timing with very aggressive power saving
   configurations.
  </para>

  <para>
   Newer operating systems may check for the known TSC problems and switch to a
   slower, more stable clock source when they are seen. If your system
   supports TSC time but doesn't default to that, it may be disabled for a good
   reason. And some operating systems may not detect all the possible problems
   correctly, or will allow using TSC even in situations where it's known to be
   inaccurate.
  </para>
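
  <para>
   On Linux you can confirm which source the kernel actually selected, and
   search the boot log for any TSC problems it detected, with commands along
   these lines (messages vary by kernel version):

<screen>
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
$ dmesg | grep -i tsc
</screen>
  </para>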

  <para>
   The High Precision Event Timer (HPET) is the preferred timer on systems
   where it's available and TSC is not accurate. The timer chip itself is
   programmable to allow up to 100 nanosecond resolution, but you may not see
   that much accuracy in your system clock.
  </para>

  <para>
   Advanced Configuration and Power Interface (ACPI) provides a Power
   Management (PM) Timer, which Linux refers to as the acpi_pm. The clock
   derived from acpi_pm will at best provide 300 nanosecond resolution.
  </para>

  <para>
   Timers used on older PC hardware include the 8254 Programmable Interval
   Timer (PIT), the real-time clock (RTC), the Advanced Programmable Interrupt
   Controller (APIC) timer, and the Cyclone timer. These timers aim for
   millisecond resolution.
  </para>
 </refsect2>
</refsect1>