
Commit 6912acc

Replication lag tracking for walsenders
Adds write_lag, flush_lag and replay_lag columns to pg_stat_replication.

Implements a lag tracker module that reports the lag times based upon measurements of the time taken for recent WAL to be written, flushed and replayed and for the sender to hear about it. These times represent the commit lag that was (or would have been) introduced by each synchronous commit level, if the remote server was configured as a synchronous standby. For an asynchronous standby, the replay_lag column approximates the delay before recent transactions became visible to queries. If the standby server has entirely caught up with the sending server and there is no more WAL activity, the most recently measured lag times will continue to be displayed for a short time and then show NULL.

Physical replication lag tracking is automatic. Logical replication tracking is possible but is the responsibility of the logical decoding plugin. Tracking is a private module operating within each walsender individually, with values reported to shared memory. Module not used outside of walsender.

Design and code is good enough now to commit - kudos to the author. In many ways a difficult topic, with important and subtle behaviour so this should be expected to generate discussion and multiple open items: Test now!

Author: Thomas Munro, following designs by Fujii Masao and Simon Riggs
Review: Simon Riggs, Ian Barwick and Craig Ringer
1 parent 7c4f524 commit 6912acc
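
The commit message describes the tracker only in outline, so the following is a minimal sketch of the idea and not the code added to walsender: remember when each WAL position was flushed locally, then, when the standby reports a position, measure the time since the newest remembered sample at or before that position. All names here (LagTracker, lag_tracker_write, lag_tracker_read, LAG_BUFFER_SIZE) are hypothetical, times are assumed to be plain microsecond counts, and only one stage is modelled, whereas the commit reports write, flush and replay separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LAG_BUFFER_SIZE 8       /* hypothetical capacity, for illustration */

typedef struct
{
    uint64_t    lsn;            /* WAL position flushed locally */
    int64_t     time_us;        /* local flush time, in microseconds */
} LagSample;

typedef struct
{
    LagSample   samples[LAG_BUFFER_SIZE];
    int         head;           /* next slot to write */
    int         tail;           /* oldest unconsumed sample */
    int         count;          /* number of stored samples */
} LagTracker;

void
lag_tracker_init(LagTracker *t)
{
    assert(t != NULL);
    t->head = t->tail = t->count = 0;
}

/* Remember that WAL up to lsn was flushed locally at time_us.
 * Returns false (and drops the sample) when the buffer is full. */
bool
lag_tracker_write(LagTracker *t, uint64_t lsn, int64_t time_us)
{
    assert(t != NULL);
    if (t->count == LAG_BUFFER_SIZE)
        return false;
    t->samples[t->head].lsn = lsn;
    t->samples[t->head].time_us = time_us;
    t->head = (t->head + 1) % LAG_BUFFER_SIZE;
    t->count++;
    return true;
}

/* The standby reported reaching reported_lsn, and we heard about it at now_us.
 * Consumes every sample at or before that position and returns the lag of
 * the newest one, or -1 when no sample was covered (no new measurement). */
int64_t
lag_tracker_read(LagTracker *t, uint64_t reported_lsn, int64_t now_us)
{
    bool        found = false;
    int64_t     sample_time = 0;

    assert(t != NULL);
    while (t->count > 0 && t->samples[t->tail].lsn <= reported_lsn)
    {
        sample_time = t->samples[t->tail].time_us;
        t->tail = (t->tail + 1) % LAG_BUFFER_SIZE;
        t->count--;
        found = true;
    }
    return found ? now_us - sample_time : -1;
}
```

In this sketch a return value of -1 plays the role of "no fresh measurement"; a caller could keep showing the previous lag for a short time and then report NULL, which matches the behaviour the commit message describes for an idle, fully caught-up standby.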

8 files changed, with 370 additions and 7 deletions

doc/src/sgml/monitoring.sgml

Lines changed: 69 additions & 0 deletions
@@ -1695,6 +1695,36 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
      <entry>Last transaction log position replayed into the database on this
      standby server</entry>
     </row>
+    <row>
+     <entry><structfield>write_lag</></entry>
+     <entry><type>interval</></entry>
+     <entry>Time elapsed between flushing recent WAL locally and receiving
+      notification that this standby server has written it (but not yet
+      flushed it or applied it). This can be used to gauge the delay that
+      <literal>synchronous_commit</literal> level
+      <literal>remote_write</literal> incurred while committing if this
+      server was configured as a synchronous standby.</entry>
+    </row>
+    <row>
+     <entry><structfield>flush_lag</></entry>
+     <entry><type>interval</></entry>
+     <entry>Time elapsed between flushing recent WAL locally and receiving
+      notification that this standby server has written and flushed it
+      (but not yet applied it). This can be used to gauge the delay that
+      <literal>synchronous_commit</literal> level
+      <literal>remote_flush</literal> incurred while committing if this
+      server was configured as a synchronous standby.</entry>
+    </row>
+    <row>
+     <entry><structfield>replay_lag</></entry>
+     <entry><type>interval</></entry>
+     <entry>Time elapsed between flushing recent WAL locally and receiving
+      notification that this standby server has written, flushed and
+      applied it. This can be used to gauge the delay that
+      <literal>synchronous_commit</literal> level
+      <literal>remote_apply</literal> incurred while committing if this
+      server was configured as a synchronous standby.</entry>
+    </row>
     <row>
      <entry><structfield>sync_priority</></entry>
      <entry><type>integer</></entry>
@@ -1745,6 +1775,45 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
    listed; no information is available about downstream standby servers.
   </para>
 
+  <para>
+   The lag times reported in the <structname>pg_stat_replication</structname>
+   view are measurements of the time taken for recent WAL to be written,
+   flushed and replayed and for the sender to know about it. These times
+   represent the commit delay that was (or would have been) introduced by each
+   synchronous commit level, if the remote server was configured as a
+   synchronous standby. For an asynchronous standby, the
+   <structfield>replay_lag</structfield> column approximates the delay
+   before recent transactions became visible to queries. If the standby
+   server has entirely caught up with the sending server and there is no more
+   WAL activity, the most recently measured lag times will continue to be
+   displayed for a short time and then show NULL.
+  </para>
+
+  <para>
+   Lag times work automatically for physical replication. Logical decoding
+   plugins may optionally emit tracking messages; if they do not, the tracking
+   mechanism will simply display NULL lag.
+  </para>
+
+  <note>
+   <para>
+    The reported lag times are not predictions of how long it will take for
+    the standby to catch up with the sending server assuming the current
+    rate of replay. Such a system would show similar times while new WAL is
+    being generated, but would differ when the sender becomes idle. In
+    particular, when the standby has caught up completely,
+    <structname>pg_stat_replication</structname> shows the time taken to
+    write, flush and replay the most recent reported WAL position rather than
+    zero as some users might expect. This is consistent with the goal of
+    measuring synchronous commit and transaction visibility delays for
+    recent write transactions.
+    To reduce confusion for users expecting a different model of lag, the
+    lag columns revert to NULL after a short time on a fully replayed idle
+    system. Monitoring systems should choose whether to represent this
+    as missing data, zero or continue to display the last known value.
+   </para>
+  </note>
+
   <table id="pg-stat-wal-receiver-view" xreflabel="pg_stat_wal_receiver">
    <title><structname>pg_stat_wal_receiver</structname> View</title>
    <tgroup cols="3">
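
The note added above leaves it to monitoring systems to decide how a NULL lag is represented: as missing data, as zero, or as the last known value. As a rough illustration of that choice, here is a small sketch of a client-side display policy. It is not part of the commit, and every name in it (NullLagPolicy, LagReading, LagDisplay, lag_display_update) is hypothetical; lag values are assumed to have already been converted to microseconds by the client.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The three options named in the documentation note. */
typedef enum
{
    LAG_AS_MISSING,             /* pass NULL through as "no data" */
    LAG_AS_ZERO,                /* show NULL as zero lag */
    LAG_AS_LAST_KNOWN           /* keep showing the last non-NULL value */
} NullLagPolicy;

typedef struct
{
    bool        has_value;      /* false models SQL NULL */
    int64_t     lag_us;         /* lag in microseconds when has_value */
} LagReading;

typedef struct
{
    NullLagPolicy policy;
    bool        have_last;      /* has any non-NULL reading been seen? */
    int64_t     last_us;        /* most recent non-NULL reading */
} LagDisplay;

/* Map one reading from pg_stat_replication to what the dashboard shows. */
LagReading
lag_display_update(LagDisplay *d, LagReading in)
{
    LagReading  out = {false, 0};

    assert(d != NULL);
    if (in.has_value)
    {
        /* Fresh measurement: remember it and show it unchanged. */
        d->have_last = true;
        d->last_us = in.lag_us;
        return in;
    }

    switch (d->policy)
    {
        case LAG_AS_ZERO:
            out.has_value = true;
            out.lag_us = 0;
            break;
        case LAG_AS_LAST_KNOWN:
            if (d->have_last)
            {
                out.has_value = true;
                out.lag_us = d->last_us;
            }
            break;
        case LAG_AS_MISSING:
            break;              /* leave out as "no data" */
    }
    return out;
}
```

Which policy is appropriate depends on the consumer: alerting on synchronous commit delay probably wants missing data, while a chart meant to show "how far behind is the standby" may read more naturally with zero.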

src/backend/access/transam/xlog.c

Lines changed: 14 additions & 0 deletions
@@ -11555,6 +11555,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 {
     static TimestampTz last_fail_time = 0;
     TimestampTz now;
+    bool        streaming_reply_sent = false;
 
     /*-------
      * Standby mode is implemented by a state machine:
@@ -11877,6 +11878,19 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
                         break;
                     }
 
+                    /*
+                     * Since we have replayed everything we have received so
+                     * far and are about to start waiting for more WAL, let's
+                     * tell the upstream server our replay location now so
+                     * that pg_stat_replication doesn't show stale
+                     * information.
+                     */
+                    if (!streaming_reply_sent)
+                    {
+                        WalRcvForceReply();
+                        streaming_reply_sent = true;
+                    }
+
                     /*
                      * Wait for more WAL to arrive. Time out after 5 seconds
                      * to react to a trigger file promptly.
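
The hunk above uses a local flag so that the forced reply is sent once per call, rather than on every wakeup of the wait loop. The pattern can be sketched in isolation as follows. This is a toy model under stated assumptions, not PostgreSQL code: FakeStandby, force_reply, wal_arrived and wait_for_wal are hypothetical stand-ins, and the real function has many more states than this loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the standby's state; not a PostgreSQL structure. */
typedef struct
{
    int         replies_sent;       /* how many forced replies were sent */
    int         polls_remaining;    /* wakeups until WAL "arrives" */
} FakeStandby;

/* Stand-in for telling the upstream server our replay location. */
static void
force_reply(FakeStandby *s)
{
    s->replies_sent++;
}

/* Stand-in for one timed wait; returns true once WAL has arrived. */
static bool
wal_arrived(FakeStandby *s)
{
    return --s->polls_remaining <= 0;
}

/* Wait until WAL arrives, reporting our position once up front.
 * Returns the number of wakeups the wait took. */
int
wait_for_wal(FakeStandby *s)
{
    bool        reply_sent = false;
    int         wakeups = 0;

    assert(s != NULL);
    for (;;)
    {
        /* Send the reply only on the first pass through the loop. */
        if (!reply_sent)
        {
            force_reply(s);
            reply_sent = true;
        }

        wakeups++;
        if (wal_arrived(s))
            break;
    }
    return wakeups;
}
```

However many times the loop wakes up, the upstream server hears about the replay position only once per call, which avoids flooding it with identical replies while still keeping its view from going stale.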

src/backend/catalog/system_views.sql

Lines changed: 3 additions & 0 deletions
@@ -705,6 +705,9 @@ CREATE VIEW pg_stat_replication AS
             W.write_location,
             W.flush_location,
             W.replay_location,
+            W.write_lag,
+            W.flush_lag,
+            W.replay_lag,
             W.sync_priority,
             W.sync_state
     FROM pg_stat_get_activity(NULL) AS S
