Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit efb0ef9

Browse files
committed
Track I/O timing for temporary file blocks in EXPLAIN (BUFFERS)
Previously, the output of EXPLAIN (BUFFERS) option showed only the I/O timing spent reading and writing shared and local buffers. This commit adds on top of that the I/O timing for temporary buffers in the output of EXPLAIN (for spilled external sorts, hashes, materialization. etc). This can be helpful for users in cases where the I/O related to temporary buffers is the bottleneck. Like its cousin, this information is available only when track_io_timing is enabled. Playing the patch, this is showing an extra overhead of up to 1% even when using gettimeofday() as implementation for interval timings, which is slightly within the usual range noise still that's measurable. Author: Masahiko Sawada Reviewed-by: Georgios Kokolatos, Melanie Plageman, Julien Rouhaud, Ranier Vilela Discussion: https://postgr.es/m/CAD21AoAJgotTeP83p6HiAGDhs_9Fw9pZ2J=_tYTsiO5Ob-V5GQ@mail.gmail.com
1 parent dafae97 commit efb0ef9

File tree

7 files changed

+147
-68
lines changed

7 files changed

+147
-68
lines changed

doc/src/sgml/ref/explain.sgml

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -174,14 +174,15 @@ ROLLBACK;
174174
Include information on buffer usage. Specifically, include the number of
175175
shared blocks hit, read, dirtied, and written, the number of local blocks
176176
hit, read, dirtied, and written, the number of temp blocks read and
177-
written, and the time spent reading and writing data file blocks
178-
(in milliseconds) if <xref linkend="guc-track-io-timing"/> is enabled.
179-
A <emphasis>hit</emphasis> means that a read was avoided because the block was
180-
found already in cache when needed.
177+
written, and the time spent reading and writing data file blocks and
178+
temporary file blocks (in milliseconds) if
179+
<xref linkend="guc-track-io-timing"/> is enabled. A
180+
<emphasis>hit</emphasis> means that a read was avoided because the block
181+
was found already in cache when needed.
181182
Shared blocks contain data from regular tables and indexes;
182183
local blocks contain data from temporary tables and indexes;
183-
while temp blocks contain short-term working data used in sorts, hashes,
184-
Materialize plan nodes, and similar cases.
184+
while temporary blocks contain short-term working data used in sorts,
185+
hashes, Materialize plan nodes, and similar cases.
185186
The number of blocks <emphasis>dirtied</emphasis> indicates the number of
186187
previously unmodified blocks that were changed by this query; while the
187188
number of blocks <emphasis>written</emphasis> indicates the number of

src/backend/commands/explain.c

Lines changed: 34 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -3521,8 +3521,11 @@ show_buffer_usage(ExplainState *es, const BufferUsage *usage, bool planning)
35213521
usage->temp_blks_written > 0);
35223522
bool has_timing = (!INSTR_TIME_IS_ZERO(usage->blk_read_time) ||
35233523
!INSTR_TIME_IS_ZERO(usage->blk_write_time));
3524+
bool has_temp_timing = (!INSTR_TIME_IS_ZERO(usage->temp_blk_read_time) ||
3525+
!INSTR_TIME_IS_ZERO(usage->temp_blk_write_time));
35243526
bool show_planning = (planning && (has_shared ||
3525-
has_local || has_temp || has_timing));
3527+
has_local || has_temp || has_timing ||
3528+
has_temp_timing));
35263529

35273530
if (show_planning)
35283531
{
@@ -3587,16 +3590,33 @@ show_buffer_usage(ExplainState *es, const BufferUsage *usage, bool planning)
35873590
}
35883591

35893592
/* As above, show only positive counter values. */
3590-
if (has_timing)
3593+
if (has_timing || has_temp_timing)
35913594
{
35923595
ExplainIndentText(es);
35933596
appendStringInfoString(es->str, "I/O Timings:");
3594-
if (!INSTR_TIME_IS_ZERO(usage->blk_read_time))
3595-
appendStringInfo(es->str, " read=%0.3f",
3596-
INSTR_TIME_GET_MILLISEC(usage->blk_read_time));
3597-
if (!INSTR_TIME_IS_ZERO(usage->blk_write_time))
3598-
appendStringInfo(es->str, " write=%0.3f",
3599-
INSTR_TIME_GET_MILLISEC(usage->blk_write_time));
3597+
3598+
if (has_timing)
3599+
{
3600+
appendStringInfoString(es->str, " shared/local");
3601+
if (!INSTR_TIME_IS_ZERO(usage->blk_read_time))
3602+
appendStringInfo(es->str, " read=%0.3f",
3603+
INSTR_TIME_GET_MILLISEC(usage->blk_read_time));
3604+
if (!INSTR_TIME_IS_ZERO(usage->blk_write_time))
3605+
appendStringInfo(es->str, " write=%0.3f",
3606+
INSTR_TIME_GET_MILLISEC(usage->blk_write_time));
3607+
if (has_temp_timing)
3608+
appendStringInfoChar(es->str, ',');
3609+
}
3610+
if (has_temp_timing)
3611+
{
3612+
appendStringInfoString(es->str, " temp");
3613+
if (!INSTR_TIME_IS_ZERO(usage->temp_blk_read_time))
3614+
appendStringInfo(es->str, " read=%0.3f",
3615+
INSTR_TIME_GET_MILLISEC(usage->temp_blk_read_time));
3616+
if (!INSTR_TIME_IS_ZERO(usage->temp_blk_write_time))
3617+
appendStringInfo(es->str, " write=%0.3f",
3618+
INSTR_TIME_GET_MILLISEC(usage->temp_blk_write_time));
3619+
}
36003620
appendStringInfoChar(es->str, '\n');
36013621
}
36023622

@@ -3633,6 +3653,12 @@ show_buffer_usage(ExplainState *es, const BufferUsage *usage, bool planning)
36333653
ExplainPropertyFloat("I/O Write Time", "ms",
36343654
INSTR_TIME_GET_MILLISEC(usage->blk_write_time),
36353655
3, es);
3656+
ExplainPropertyFloat("Temp I/O Read Time", "ms",
3657+
INSTR_TIME_GET_MILLISEC(usage->temp_blk_read_time),
3658+
3, es);
3659+
ExplainPropertyFloat("Temp I/O Write Time", "ms",
3660+
INSTR_TIME_GET_MILLISEC(usage->temp_blk_write_time),
3661+
3, es);
36363662
}
36373663
}
36383664
}

src/backend/executor/instrument.c

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -237,6 +237,8 @@ BufferUsageAdd(BufferUsage *dst, const BufferUsage *add)
237237
dst->temp_blks_written += add->temp_blks_written;
238238
INSTR_TIME_ADD(dst->blk_read_time, add->blk_read_time);
239239
INSTR_TIME_ADD(dst->blk_write_time, add->blk_write_time);
240+
INSTR_TIME_ADD(dst->temp_blk_read_time, add->temp_blk_read_time);
241+
INSTR_TIME_ADD(dst->temp_blk_write_time, add->temp_blk_write_time);
240242
}
241243

242244
/* dst += add - sub */
@@ -259,6 +261,10 @@ BufferUsageAccumDiff(BufferUsage *dst,
259261
add->blk_read_time, sub->blk_read_time);
260262
INSTR_TIME_ACCUM_DIFF(dst->blk_write_time,
261263
add->blk_write_time, sub->blk_write_time);
264+
INSTR_TIME_ACCUM_DIFF(dst->temp_blk_read_time,
265+
add->temp_blk_read_time, sub->temp_blk_read_time);
266+
INSTR_TIME_ACCUM_DIFF(dst->temp_blk_write_time,
267+
add->temp_blk_write_time, sub->temp_blk_write_time);
262268
}
263269

264270
/* helper functions for WAL usage accumulation */

src/backend/storage/file/buffile.c

Lines changed: 28 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -429,6 +429,8 @@ static void
429429
BufFileLoadBuffer(BufFile *file)
430430
{
431431
File thisfile;
432+
instr_time io_start;
433+
instr_time io_time;
432434

433435
/*
434436
* Advance to next component file if necessary and possible.
@@ -440,10 +442,14 @@ BufFileLoadBuffer(BufFile *file)
440442
file->curOffset = 0L;
441443
}
442444

445+
thisfile = file->files[file->curFile];
446+
447+
if (track_io_timing)
448+
INSTR_TIME_SET_CURRENT(io_start);
449+
443450
/*
444451
* Read whatever we can get, up to a full bufferload.
445452
*/
446-
thisfile = file->files[file->curFile];
447453
file->nbytes = FileRead(thisfile,
448454
file->buffer.data,
449455
sizeof(file->buffer),
@@ -458,6 +464,13 @@ BufFileLoadBuffer(BufFile *file)
458464
FilePathName(thisfile))));
459465
}
460466

467+
if (track_io_timing)
468+
{
469+
INSTR_TIME_SET_CURRENT(io_time);
470+
INSTR_TIME_SUBTRACT(io_time, io_start);
471+
INSTR_TIME_ADD(pgBufferUsage.temp_blk_read_time, io_time);
472+
}
473+
461474
/* we choose not to advance curOffset here */
462475

463476
if (file->nbytes > 0)
@@ -485,6 +498,8 @@ BufFileDumpBuffer(BufFile *file)
485498
while (wpos < file->nbytes)
486499
{
487500
off_t availbytes;
501+
instr_time io_start;
502+
instr_time io_time;
488503

489504
/*
490505
* Advance to next component file if necessary and possible.
@@ -507,6 +522,10 @@ BufFileDumpBuffer(BufFile *file)
507522
bytestowrite = (int) availbytes;
508523

509524
thisfile = file->files[file->curFile];
525+
526+
if (track_io_timing)
527+
INSTR_TIME_SET_CURRENT(io_start);
528+
510529
bytestowrite = FileWrite(thisfile,
511530
file->buffer.data + wpos,
512531
bytestowrite,
@@ -517,6 +536,14 @@ BufFileDumpBuffer(BufFile *file)
517536
(errcode_for_file_access(),
518537
errmsg("could not write to file \"%s\": %m",
519538
FilePathName(thisfile))));
539+
540+
if (track_io_timing)
541+
{
542+
INSTR_TIME_SET_CURRENT(io_time);
543+
INSTR_TIME_SUBTRACT(io_time, io_start);
544+
INSTR_TIME_ADD(pgBufferUsage.temp_blk_write_time, io_time);
545+
}
546+
520547
file->curOffset += bytestowrite;
521548
wpos += bytestowrite;
522549

src/include/executor/instrument.h

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -33,8 +33,10 @@ typedef struct BufferUsage
3333
int64 local_blks_written; /* # of local disk blocks written */
3434
int64 temp_blks_read; /* # of temp blocks read */
3535
int64 temp_blks_written; /* # of temp blocks written */
36-
instr_time blk_read_time; /* time spent reading */
37-
instr_time blk_write_time; /* time spent writing */
36+
instr_time blk_read_time; /* time spent reading blocks */
37+
instr_time blk_write_time; /* time spent writing blocks */
38+
instr_time temp_blk_read_time; /* time spent reading temp blocks */
39+
instr_time temp_blk_write_time; /* time spent writing temp blocks */
3840
} BufferUsage;
3941

4042
/*

src/test/regress/expected/explain.out

Lines changed: 62 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -85,56 +85,6 @@ select explain_filter('explain (analyze, buffers, format text) select * from int
8585
Execution Time: N.N ms
8686
(3 rows)
8787

88-
select explain_filter('explain (analyze, buffers, format json) select * from int8_tbl i8');
89-
explain_filter
90-
------------------------------------
91-
[ +
92-
{ +
93-
"Plan": { +
94-
"Node Type": "Seq Scan", +
95-
"Parallel Aware": false, +
96-
"Async Capable": false, +
97-
"Relation Name": "int8_tbl",+
98-
"Alias": "i8", +
99-
"Startup Cost": N.N, +
100-
"Total Cost": N.N, +
101-
"Plan Rows": N, +
102-
"Plan Width": N, +
103-
"Actual Startup Time": N.N, +
104-
"Actual Total Time": N.N, +
105-
"Actual Rows": N, +
106-
"Actual Loops": N, +
107-
"Shared Hit Blocks": N, +
108-
"Shared Read Blocks": N, +
109-
"Shared Dirtied Blocks": N, +
110-
"Shared Written Blocks": N, +
111-
"Local Hit Blocks": N, +
112-
"Local Read Blocks": N, +
113-
"Local Dirtied Blocks": N, +
114-
"Local Written Blocks": N, +
115-
"Temp Read Blocks": N, +
116-
"Temp Written Blocks": N +
117-
}, +
118-
"Planning": { +
119-
"Shared Hit Blocks": N, +
120-
"Shared Read Blocks": N, +
121-
"Shared Dirtied Blocks": N, +
122-
"Shared Written Blocks": N, +
123-
"Local Hit Blocks": N, +
124-
"Local Read Blocks": N, +
125-
"Local Dirtied Blocks": N, +
126-
"Local Written Blocks": N, +
127-
"Temp Read Blocks": N, +
128-
"Temp Written Blocks": N +
129-
}, +
130-
"Planning Time": N.N, +
131-
"Triggers": [ +
132-
], +
133-
"Execution Time": N.N +
134-
} +
135-
]
136-
(1 row)
137-
13888
select explain_filter('explain (analyze, buffers, format xml) select * from int8_tbl i8');
13989
explain_filter
14090
--------------------------------------------------------
@@ -276,6 +226,68 @@ select explain_filter('explain (buffers, format json) select * from int8_tbl i8'
276226
]
277227
(1 row)
278228

229+
-- Check output including I/O timings. These fields are conditional
230+
-- but always set in JSON format, so check them only in this case.
231+
set track_io_timing = on;
232+
select explain_filter('explain (analyze, buffers, format json) select * from int8_tbl i8');
233+
explain_filter
234+
------------------------------------
235+
[ +
236+
{ +
237+
"Plan": { +
238+
"Node Type": "Seq Scan", +
239+
"Parallel Aware": false, +
240+
"Async Capable": false, +
241+
"Relation Name": "int8_tbl",+
242+
"Alias": "i8", +
243+
"Startup Cost": N.N, +
244+
"Total Cost": N.N, +
245+
"Plan Rows": N, +
246+
"Plan Width": N, +
247+
"Actual Startup Time": N.N, +
248+
"Actual Total Time": N.N, +
249+
"Actual Rows": N, +
250+
"Actual Loops": N, +
251+
"Shared Hit Blocks": N, +
252+
"Shared Read Blocks": N, +
253+
"Shared Dirtied Blocks": N, +
254+
"Shared Written Blocks": N, +
255+
"Local Hit Blocks": N, +
256+
"Local Read Blocks": N, +
257+
"Local Dirtied Blocks": N, +
258+
"Local Written Blocks": N, +
259+
"Temp Read Blocks": N, +
260+
"Temp Written Blocks": N, +
261+
"I/O Read Time": N.N, +
262+
"I/O Write Time": N.N, +
263+
"Temp I/O Read Time": N.N, +
264+
"Temp I/O Write Time": N.N +
265+
}, +
266+
"Planning": { +
267+
"Shared Hit Blocks": N, +
268+
"Shared Read Blocks": N, +
269+
"Shared Dirtied Blocks": N, +
270+
"Shared Written Blocks": N, +
271+
"Local Hit Blocks": N, +
272+
"Local Read Blocks": N, +
273+
"Local Dirtied Blocks": N, +
274+
"Local Written Blocks": N, +
275+
"Temp Read Blocks": N, +
276+
"Temp Written Blocks": N, +
277+
"I/O Read Time": N.N, +
278+
"I/O Write Time": N.N, +
279+
"Temp I/O Read Time": N.N, +
280+
"Temp I/O Write Time": N.N +
281+
}, +
282+
"Planning Time": N.N, +
283+
"Triggers": [ +
284+
], +
285+
"Execution Time": N.N +
286+
} +
287+
]
288+
(1 row)
289+
290+
set track_io_timing = off;
279291
-- SETTINGS option
280292
-- We have to ignore other settings that might be imposed by the environment,
281293
-- so printing the whole Settings field unfortunately won't do.

src/test/regress/sql/explain.sql

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -65,12 +65,17 @@ select explain_filter('explain select * from int8_tbl i8');
6565
select explain_filter('explain (analyze) select * from int8_tbl i8');
6666
select explain_filter('explain (analyze, verbose) select * from int8_tbl i8');
6767
select explain_filter('explain (analyze, buffers, format text) select * from int8_tbl i8');
68-
select explain_filter('explain (analyze, buffers, format json) select * from int8_tbl i8');
6968
select explain_filter('explain (analyze, buffers, format xml) select * from int8_tbl i8');
7069
select explain_filter('explain (analyze, buffers, format yaml) select * from int8_tbl i8');
7170
select explain_filter('explain (buffers, format text) select * from int8_tbl i8');
7271
select explain_filter('explain (buffers, format json) select * from int8_tbl i8');
7372

73+
-- Check output including I/O timings. These fields are conditional
74+
-- but always set in JSON format, so check them only in this case.
75+
set track_io_timing = on;
76+
select explain_filter('explain (analyze, buffers, format json) select * from int8_tbl i8');
77+
set track_io_timing = off;
78+
7479
-- SETTINGS option
7580
-- We have to ignore other settings that might be imposed by the environment,
7681
-- so printing the whole Settings field unfortunately won't do.

0 commit comments

Comments
 (0)