
Commit 249cf07

Create and use wait events for read, write, and fsync operations.
Previous commits, notably 53be0b1 and 6f3bd98, made it possible to see from pg_stat_activity when a backend was stuck waiting for another backend, but it's also fairly common for a backend to be stuck waiting for an I/O. Add wait events for those operations, too.

Rushabh Lathia, with further hacking by me. Reviewed and tested by Michael Paquier, Amit Kapila, Rajkumar Raghuwanshi, and Rahila Syed.

Discussion: http://postgr.es/m/CAGPqQf0LsYHXREPAZqYGVkDqHSyjf=KsD=k0GTVPAuzyThh-VQ@mail.gmail.com
1 parent 928250a commit 249cf07
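
The bracketing pattern that this commit applies around raw system calls can be sketched as below. This is a minimal, self-contained illustration and not PostgreSQL code: `demo_wait_start`, `demo_wait_end`, the `DEMO_WAIT_*` constants, and `demo_write_file` are hypothetical stand-ins for `pgstat_report_wait_start()`, `pgstat_report_wait_end()`, and the `WAIT_EVENT_*` values, and the "published" state is only a pair of globals. It assumes a POSIX system.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical wait-event identifiers; 0 means "not waiting". */
#define DEMO_WAIT_NONE       0u
#define DEMO_WAIT_FILE_WRITE 1u
#define DEMO_WAIT_FILE_SYNC  2u

/* Stand-in for the state a monitoring view would read. */
uint32_t demo_current_wait = DEMO_WAIT_NONE;
uint32_t demo_last_wait = DEMO_WAIT_NONE;

/* Record that the caller is about to block on the given event. */
void demo_wait_start(uint32_t event)
{
    demo_current_wait = event;
    demo_last_wait = event;
}

/* Record that the caller is no longer blocked. */
void demo_wait_end(void)
{
    demo_current_wait = DEMO_WAIT_NONE;
}

/*
 * Create (or truncate) path, write len bytes, and fsync the file.
 * Each potentially blocking call is bracketed by start/end reports.
 * Returns 0 on success, -1 on failure.
 */
int demo_write_file(const char *path, const void *buf, size_t len)
{
    ssize_t written;
    int     rc;
    int     fd;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    demo_wait_start(DEMO_WAIT_FILE_WRITE);
    written = write(fd, buf, len);
    demo_wait_end();
    if (written < 0 || (size_t) written != len)
    {
        close(fd);
        return -1;
    }

    demo_wait_start(DEMO_WAIT_FILE_SYNC);
    rc = fsync(fd);
    demo_wait_end();

    if (close(fd) != 0 || rc != 0)
        return -1;
    return 0;
}
```

One difference from the diff is deliberate: the sketch clears the wait state before checking the result, whereas in the commit the `pgstat_report_wait_end()` call follows the `ereport(ERROR, ...)` check.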

21 files changed, +782 -29 lines changed

doc/src/sgml/monitoring.sgml

+271
@@ -716,6 +716,12 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
         point.
        </para>
       </listitem>
+      <listitem>
+       <para>
+        <literal>IO</>: The server process is waiting for an I/O to complete.
+        <literal>wait_event</> will identify the specific wait point.
+       </para>
+      </listitem>
      </itemizedlist>
     </entry>
    </row>
@@ -1272,6 +1278,271 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
       <entry><literal>RecoveryApplyDelay</></entry>
       <entry>Waiting to apply WAL at recovery because it is delayed.</entry>
      </row>
+     <row>
+      <entry morerows="66"><literal>IO</></entry>
+      <entry><literal>BufFileRead</></entry>
+      <entry>Waiting for a read from a buffered file.</entry>
+     </row>
+     <row>
+      <entry><literal>BufFileWrite</></entry>
+      <entry>Waiting for a write to a buffered file.</entry>
+     </row>
+     <row>
+      <entry><literal>ControlFileRead</></entry>
+      <entry>Waiting for a read from the control file.</entry>
+     </row>
+     <row>
+      <entry><literal>ControlFileSync</></entry>
+      <entry>Waiting for the control file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>ControlFileSyncUpdate</></entry>
+      <entry>Waiting for an update to the control file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>ControlFileWrite</></entry>
+      <entry>Waiting for a write to the control file.</entry>
+     </row>
+     <row>
+      <entry><literal>ControlFileWriteUpdate</></entry>
+      <entry>Waiting for a write to update the control file.</entry>
+     </row>
+     <row>
+      <entry><literal>CopyFileRead</></entry>
+      <entry>Waiting for a read during a file copy operation.</entry>
+     </row>
+     <row>
+      <entry><literal>CopyFileWrite</></entry>
+      <entry>Waiting for a write during a file copy operation.</entry>
+     </row>
+     <row>
+      <entry><literal>DataFileExtend</></entry>
+      <entry>Waiting for a relation data file to be extended.</entry>
+     </row>
+     <row>
+      <entry><literal>DataFileFlush</></entry>
+      <entry>Waiting for a relation data file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>DataFileImmediateSync</></entry>
+      <entry>Waiting for an immediate synchronization of a relation data file to stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>DataFilePrefetch</></entry>
+      <entry>Waiting for an asynchronous prefetch from a relation data file.</entry>
+     </row>
+     <row>
+      <entry><literal>DataFileRead</></entry>
+      <entry>Waiting for a read from a relation data file.</entry>
+     </row>
+     <row>
+      <entry><literal>DataFileSync</></entry>
+      <entry>Waiting for changes to a relation data file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>DataFileTruncate</></entry>
+      <entry>Waiting for a relation data file to be truncated.</entry>
+     </row>
+     <row>
+      <entry><literal>DataFileWrite</></entry>
+      <entry>Waiting for a write to a relation data file.</entry>
+     </row>
+     <row>
+      <entry><literal>DSMFillZeroWrite</></entry>
+      <entry>Waiting to write zero bytes to a dynamic shared memory backing file.</entry>
+     </row>
+     <row>
+      <entry><literal>LockFileAddToDataDirRead</></entry>
+      <entry>Waiting for a read while adding a line to the data directory lock file.</entry>
+     </row>
+     <row>
+      <entry><literal>LockFileAddToDataDirSync</></entry>
+      <entry>Waiting for data to reach stable storage while adding a line to the data directory lock file.</entry>
+     </row>
+     <row>
+      <entry><literal>LockFileAddToDataDirWrite</></entry>
+      <entry>Waiting for a write while adding a line to the data directory lock file.</entry>
+     </row>
+     <row>
+      <entry><literal>LockFileCreateRead</></entry>
+      <entry>Waiting to read while creating the data directory lock file.</entry>
+     </row>
+     <row>
+      <entry><literal>LockFileCreateSync</></entry>
+      <entry>Waiting for data to reach stable storage while creating the data directory lock file.</entry>
+     </row>
+     <row>
+      <entry><literal>LockFileCreateWrite</></entry>
+      <entry>Waiting for a write while creating the data directory lock file.</entry>
+     </row>
+     <row>
+      <entry><literal>LockFileReCheckDataDirRead</></entry>
+      <entry>Waiting for a read during recheck of the data directory lock file.</entry>
+     </row>
+     <row>
+      <entry><literal>LogicalRewriteCheckpointSync</></entry>
+      <entry>Waiting for logical rewrite mappings to reach stable storage during a checkpoint.</entry>
+     </row>
+     <row>
+      <entry><literal>LogicalRewriteMappingSync</></entry>
+      <entry>Waiting for mapping data to reach stable storage during a logical rewrite.</entry>
+     </row>
+     <row>
+      <entry><literal>LogicalRewriteMappingWrite</></entry>
+      <entry>Waiting for a write of mapping data during a logical rewrite.</entry>
+     </row>
+     <row>
+      <entry><literal>LogicalRewriteSync</></entry>
+      <entry>Waiting for logical rewrite mappings to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>LogicalRewriteWrite</></entry>
+      <entry>Waiting for a write of logical rewrite mappings.</entry>
+     </row>
+     <row>
+      <entry><literal>RelationMapRead</></entry>
+      <entry>Waiting for a read of the relation map file.</entry>
+     </row>
+     <row>
+      <entry><literal>RelationMapSync</></entry>
+      <entry>Waiting for the relation map file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>RelationMapWrite</></entry>
+      <entry>Waiting for a write to the relation map file.</entry>
+     </row>
+     <row>
+      <entry><literal>ReorderBufferRead</></entry>
+      <entry>Waiting for a read during reorder buffer management.</entry>
+     </row>
+     <row>
+      <entry><literal>ReorderBufferWrite</></entry>
+      <entry>Waiting for a write during reorder buffer management.</entry>
+     </row>
+     <row>
+      <entry><literal>ReorderLogicalMappingRead</></entry>
+      <entry>Waiting for a read of a logical mapping during reorder buffer management.</entry>
+     </row>
+     <row>
+      <entry><literal>ReplicationSlotRead</></entry>
+      <entry>Waiting for a read from a replication slot control file.</entry>
+     </row>
+     <row>
+      <entry><literal>ReplicationSlotRestoreSync</></entry>
+      <entry>Waiting for a replication slot control file to reach stable storage while restoring it to memory.</entry>
+     </row>
+     <row>
+      <entry><literal>ReplicationSlotSync</></entry>
+      <entry>Waiting for a replication slot control file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>ReplicationSlotWrite</></entry>
+      <entry>Waiting for a write to a replication slot control file.</entry>
+     </row>
+     <row>
+      <entry><literal>SLRUFlushSync</></entry>
+      <entry>Waiting for SLRU data to reach stable storage during a checkpoint or database shutdown.</entry>
+     </row>
+     <row>
+      <entry><literal>SLRURead</></entry>
+      <entry>Waiting for a read of an SLRU page.</entry>
+     </row>
+     <row>
+      <entry><literal>SLRUSync</></entry>
+      <entry>Waiting for SLRU data to reach stable storage following a page write.</entry>
+     </row>
+     <row>
+      <entry><literal>SLRUWrite</></entry>
+      <entry>Waiting for a write of an SLRU page.</entry>
+     </row>
+     <row>
+      <entry><literal>SnapbuildRead</></entry>
+      <entry>Waiting for a read of a serialized historical catalog snapshot.</entry>
+     </row>
+     <row>
+      <entry><literal>SnapbuildSync</></entry>
+      <entry>Waiting for a serialized historical catalog snapshot to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>SnapbuildWrite</></entry>
+      <entry>Waiting for a write of a serialized historical catalog snapshot.</entry>
+     </row>
+     <row>
+      <entry><literal>TimelineHistoryFileSync</></entry>
+      <entry>Waiting for a timeline history file received via streaming replication to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>TimelineHistoryFileWrite</></entry>
+      <entry>Waiting for a write of a timeline history file received via streaming replication.</entry>
+     </row>
+     <row>
+      <entry><literal>TimelineHistoryRead</></entry>
+      <entry>Waiting for a read of a timeline history file.</entry>
+     </row>
+     <row>
+      <entry><literal>TimelineHistorySync</></entry>
+      <entry>Waiting for a newly created timeline history file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>TimelineHistoryWrite</></entry>
+      <entry>Waiting for a write of a newly created timeline history file.</entry>
+     </row>
+     <row>
+      <entry><literal>TwophaseFileRead</></entry>
+      <entry>Waiting for a read of a two phase state file.</entry>
+     </row>
+     <row>
+      <entry><literal>TwophaseFileSync</></entry>
+      <entry>Waiting for a two phase state file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>TwophaseFileWrite</></entry>
+      <entry>Waiting for a write of a two phase state file.</entry>
+     </row>
+     <row>
+      <entry><literal>WALBootstrapSync</></entry>
+      <entry>Waiting for WAL to reach stable storage during bootstrapping.</entry>
+     </row>
+     <row>
+      <entry><literal>WALBootstrapWrite</></entry>
+      <entry>Waiting for a write of a WAL page during bootstrapping.</entry>
+     </row>
+     <row>
+      <entry><literal>WALCopyRead</></entry>
+      <entry>Waiting for a read when creating a new WAL segment by copying an existing one.</entry>
+     </row>
+     <row>
+      <entry><literal>WALCopySync</></entry>
+      <entry>Waiting for a new WAL segment created by copying an existing one to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>WALCopyWrite</></entry>
+      <entry>Waiting for a write when creating a new WAL segment by copying an existing one.</entry>
+     </row>
+     <row>
+      <entry><literal>WALInitSync</></entry>
+      <entry>Waiting for a newly initialized WAL file to reach stable storage.</entry>
+     </row>
+     <row>
+      <entry><literal>WALInitWrite</></entry>
+      <entry>Waiting for a write while initializing a new WAL file.</entry>
+     </row>
+     <row>
+      <entry><literal>WALRead</></entry>
+      <entry>Waiting for a read from a WAL file.</entry>
+     </row>
+     <row>
+      <entry><literal>WALSenderTimelineHistoryRead</></entry>
+      <entry>Waiting for a read from a timeline history file during walsender timeline command.</entry>
+     </row>
+     <row>
+      <entry><literal>WALSyncMethodAssign</></entry>
+      <entry>Waiting for data to reach stable storage while assigning WAL sync method.</entry>
+     </row>
+     <row>
+      <entry><literal>WALWrite</></entry>
+      <entry>Waiting for a write to a WAL file.</entry>
+     </row>
     </tbody>
    </tgroup>
   </table>

src/backend/access/heap/rewriteheap.c

+14 -3
@@ -119,6 +119,8 @@
 
 #include "lib/ilist.h"
 
+#include "pgstat.h"
+
 #include "replication/logical.h"
 #include "replication/slot.h"
 
@@ -916,7 +918,8 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
          * Note that we deviate from the usual WAL coding practices here,
          * check the above "Logical rewrite support" comment for reasoning.
          */
-        written = FileWrite(src->vfd, waldata_start, len);
+        written = FileWrite(src->vfd, waldata_start, len,
+                            WAIT_EVENT_LOGICAL_REWRITE_WRITE);
         if (written != len)
             ereport(ERROR,
                     (errcode_for_file_access(),
@@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state)
     hash_seq_init(&seq_status, state->rs_logical_mappings);
     while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) != NULL)
     {
-        if (FileSync(src->vfd) != 0)
+        if (FileSync(src->vfd, WAIT_EVENT_LOGICAL_REWRITE_SYNC) != 0)
             ereport(ERROR,
                     (errcode_for_file_access(),
                      errmsg("could not fsync file \"%s\": %m", src->path)));
@@ -1141,11 +1144,13 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
      * Truncate all data that's not guaranteed to have been safely fsynced (by
      * previous record or by the last checkpoint).
      */
+    pgstat_report_wait_start(WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE);
     if (ftruncate(fd, xlrec->offset) != 0)
         ereport(ERROR,
                 (errcode_for_file_access(),
                  errmsg("could not truncate file \"%s\" to %u: %m",
                         path, (uint32) xlrec->offset)));
+    pgstat_report_wait_end();
 
     /* now seek to the position we want to write our data to */
     if (lseek(fd, xlrec->offset, SEEK_SET) != xlrec->offset)
@@ -1159,20 +1164,24 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
     len = xlrec->num_mappings * sizeof(LogicalRewriteMappingData);
 
     /* write out tail end of mapping file (again) */
+    pgstat_report_wait_start(WAIT_EVENT_LOGICAL_REWRITE_MAPPING_WRITE);
     if (write(fd, data, len) != len)
         ereport(ERROR,
                 (errcode_for_file_access(),
                  errmsg("could not write to file \"%s\": %m", path)));
+    pgstat_report_wait_end();
 
     /*
      * Now fsync all previously written data. We could improve things and only
      * do this for the last write to a file, but the required bookkeeping
      * doesn't seem worth the trouble.
      */
+    pgstat_report_wait_start(WAIT_EVENT_LOGICAL_REWRITE_MAPPING_SYNC);
     if (pg_fsync(fd) != 0)
         ereport(ERROR,
                 (errcode_for_file_access(),
                  errmsg("could not fsync file \"%s\": %m", path)));
+    pgstat_report_wait_end();
 
     CloseTransientFile(fd);
 }
@@ -1266,10 +1275,12 @@ CheckPointLogicalRewriteHeap(void)
              * changed or have only been created since the checkpoint's start,
              * but it's currently not deemed worth the effort.
              */
-            else if (pg_fsync(fd) != 0)
+            pgstat_report_wait_start(WAIT_EVENT_LOGICAL_REWRITE_CHECKPOINT_SYNC);
+            if (pg_fsync(fd) != 0)
                 ereport(ERROR,
                         (errcode_for_file_access(),
                          errmsg("could not fsync file \"%s\": %m", path)));
+            pgstat_report_wait_end();
             CloseTransientFile(fd);
         }
     }
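
In the `FileWrite` and `FileSync` hunks above, the call sites do not bracket the I/O themselves; they pass a wait event as an extra argument. A plausible reading is that the wrapper reports the wait on the caller's behalf. The sketch below illustrates that shape only. `demo_file_write`, `demo_file_sync`, the `DEMO_EVENT_*` constants, and the two globals are hypothetical names invented for this example, not the PostgreSQL functions, and the code assumes a POSIX system.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical event identifiers; 0 means "not waiting". */
#define DEMO_EVENT_NONE          0u
#define DEMO_EVENT_REWRITE_WRITE 1u
#define DEMO_EVENT_REWRITE_SYNC  2u

/* Stand-in for the state a monitoring view would read. */
uint32_t demo_wait_event = DEMO_EVENT_NONE;
uint32_t demo_last_event = DEMO_EVENT_NONE;

/*
 * Write wrapper: the caller names the wait point, and the wrapper
 * reports it for the duration of the write() call only.
 */
ssize_t demo_file_write(int fd, const void *buf, size_t len,
                        uint32_t wait_event_info)
{
    ssize_t rc;

    assert(buf != NULL);

    demo_wait_event = demo_last_event = wait_event_info;
    rc = write(fd, buf, len);
    demo_wait_event = DEMO_EVENT_NONE;
    return rc;
}

/* Sync wrapper with the same convention; returns fsync()'s result. */
int demo_file_sync(int fd, uint32_t wait_event_info)
{
    int rc;

    demo_wait_event = demo_last_event = wait_event_info;
    rc = fsync(fd);
    demo_wait_event = DEMO_EVENT_NONE;
    return rc;
}
```

With this shape, each call site only has to choose which event describes its I/O, as in `demo_file_write(fd, buf, len, DEMO_EVENT_REWRITE_WRITE)`, and the reporting cannot be forgotten at individual call sites.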
