
Commit a9885f2

doc: add section about heap-only tuples (HOT)
Reported-by: Jonathan S. Katz
Discussion: https://postgr.es/m/c59ffbd5-96ac-a5a5-a401-14f627ca1405@postgresql.org
Backpatch-through: 11
1 parent a4a24fe commit a9885f2

9 files changed, 87 additions and 12 deletions

doc/src/sgml/acronyms.sgml

Lines changed: 1 addition & 3 deletions
@@ -299,9 +299,7 @@
 <term><acronym>HOT</acronym></term>
 <listitem>
 <para>
-<ulink
-url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=HEAD">Heap-Only
-Tuples</ulink>
+<link linkend="storage-hot">Heap-Only Tuples</link>
 </para>
 </listitem>
 </varlistentry>

doc/src/sgml/btree.sgml

Lines changed: 3 additions & 2 deletions
@@ -708,8 +708,9 @@ options(<replaceable>relopts</replaceable> <type>local_relopts *</type>) returns
 entry. <quote>Version duplicates</quote> may sometimes accumulate
 and adversely affect query latency and throughput. This typically
 occurs with <command>UPDATE</command>-heavy workloads where most
-individual updates cannot apply the <acronym>HOT</acronym>
-optimization (often because at least one indexed column gets
+individual updates cannot apply the
+<link linkend="storage-hot"><acronym>HOT</acronym> optimization</link>
+(often because at least one indexed column gets
 modified, necessitating a new set of index tuple versions &mdash;
 one new tuple for <emphasis>each and every</emphasis> index). In
 effect, B-Tree deduplication ameliorates index bloat caused by

doc/src/sgml/catalogs.sgml

Lines changed: 1 addition & 1 deletion
@@ -4287,7 +4287,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
 <para>
 If true, queries must not use the index until the <structfield>xmin</structfield>
 of this <structname>pg_index</structname> row is below their <symbol>TransactionXmin</symbol>
-event horizon, because the table may contain broken HOT chains with
+event horizon, because the table may contain broken <link linkend="storage-hot">HOT chains</link> with
 incompatible rows that they can see
 </para></entry>
 </row>

doc/src/sgml/config.sgml

Lines changed: 2 additions & 1 deletion
@@ -4195,7 +4195,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
 <listitem>
 <para>
 Specifies the number of transactions by which <command>VACUUM</command> and
-<acronym>HOT</acronym> updates will defer cleanup of dead row versions. The
+<link linkend="storage-hot"><acronym>HOT</acronym> updates</link>
+will defer cleanup of dead row versions. The
 default is zero transactions, meaning that dead row versions can be
 removed as soon as possible, that is, as soon as they are no longer
 visible to any open transaction. You may wish to set this to a

doc/src/sgml/indexam.sgml

Lines changed: 2 additions & 1 deletion
@@ -45,7 +45,8 @@
 extant versions of the same logical row; to an index, each tuple is
 an independent object that needs its own index entry. Thus, an
 update of a row always creates all-new index entries for the row, even if
-the key values did not change. (HOT tuples are an exception to this
+the key values did not change. (<link linkend="storage-hot">HOT
+tuples</link> are an exception to this
 statement; but indexes do not deal with those, either.) Index entries for
 dead tuples are reclaimed (by vacuuming) when the dead tuples themselves
 are reclaimed.

doc/src/sgml/indices.sgml

Lines changed: 4 additions & 2 deletions
@@ -103,7 +103,9 @@ CREATE INDEX test1_id_index ON test1 (id);
 
 <para>
 After an index is created, the system has to keep it synchronized with the
-table. This adds overhead to data manipulation operations.
+table. This adds overhead to data manipulation operations. Indexes can
+also prevent the creation of <link linkend="storage-hot">heap-only
+tuples</link>.
 Therefore indexes that are seldom or never used in queries
 should be removed.
 </para>
@@ -733,7 +735,7 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
 <para>
 Index expressions are relatively expensive to maintain, because the
 derived expression(s) must be computed for each row insertion
-and non-HOT update. However, the index expressions are
+and <link linkend="storage-hot">non-HOT update.</link> However, the index expressions are
 <emphasis>not</emphasis> recomputed during an indexed search, since they are
 already stored in the index. In both examples above, the system
 sees the query as just <literal>WHERE indexedcolumn = 'constant'</literal>

doc/src/sgml/monitoring.sgml

Lines changed: 1 addition & 1 deletion
@@ -3721,7 +3721,7 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 <structfield>n_tup_upd</structfield> <type>bigint</type>
 </para>
 <para>
-Number of rows updated (includes HOT updated rows)
+Number of rows updated (includes <link linkend="storage-hot">HOT updated rows</link>)
 </para></entry>
 </row>
 

doc/src/sgml/ref/create_table.sgml

Lines changed: 3 additions & 1 deletion
@@ -1357,7 +1357,9 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 to the indicated percentage; the remaining space on each page is
 reserved for updating rows on that page. This gives <command>UPDATE</command>
 a chance to place the updated copy of a row on the same page as the
-original, which is more efficient than placing it on a different page.
+original, which is more efficient than placing it on a different
+page, and makes <link linkend="storage-hot">heap-only tuple
+updates</link> more likely.
 For a table whose entries are never updated, complete packing is the
 best choice, but in heavily updated tables smaller fillfactors are
 appropriate. This parameter cannot be set for TOAST tables.

doc/src/sgml/storage.sgml

Lines changed: 70 additions & 0 deletions
@@ -1070,4 +1070,74 @@ data. Empty in ordinary tables.</entry>
 </sect2>
 </sect1>
 
+<sect1 id="storage-hot">
+
+<title>Heap-Only Tuples (<acronym>HOT</acronym>)</title>
+
+<para>
+To allow for high concurrency, <productname>PostgreSQL</productname>
+uses <link linkend="mvcc-intro">multiversion concurrency
+control</link> (<acronym>MVCC</acronym>) to store rows. However,
+<acronym>MVCC</acronym> has some downsides for update queries.
+Specifically, updates require new versions of rows to be added to
+tables. This can also require new index entries for each updated row,
+and removal of old versions of rows and their index entries can be
+expensive.
+</para>
+
+<para>
+To help reduce the overhead of updates,
+<productname>PostgreSQL</productname> has an optimization called
+heap-only tuples (<acronym>HOT</acronym>). This optimization is
+possible when:
+
+<itemizedlist>
+<listitem>
+<para>
+The update does not modify any columns referenced by the table's
+indexes, including expression and partial indexes.
+</para>
+</listitem>
+<listitem>
+<para>
+There is sufficient free space on the page containing the old row
+for the updated row.
+</para>
+</listitem>
+</itemizedlist>
+
+In such cases, heap-only tuples provide two optimizations:
+
+<itemizedlist>
+<listitem>
+<para>
+New index entries are not needed to represent updated rows.
+</para>
+</listitem>
+<listitem>
+<para>
+Old versions of updated rows can be completely removed during normal
+operation, including <command>SELECT</command>s, instead of requiring
+periodic vacuum operations. (This is possible because indexes
+do not reference their <link linkend="storage-page-layout">page
+item identifiers</link>.)
+</para>
+</listitem>
+</itemizedlist>
+</para>
+
+<para>
+In summary, heap-only tuple updates can only be created
+if columns used by indexes are not updated. You can
+increase the likelihood of sufficient page space for
+<acronym>HOT</acronym> updates by decreasing a table's <link
+linkend="sql-createtable"><literal>fillfactor</literal></link>.
+If you don't, <acronym>HOT</acronym> updates will still happen because
+new rows will naturally migrate to new pages and existing pages with
+sufficient free space for new row versions. The system view <link
+linkend="monitoring-pg-stat-all-tables-view">pg_stat_all_tables</link>
+allows monitoring of the occurrence of HOT and non-HOT updates.
+</para>
+</sect1>
+
 </chapter>
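
Illustrative note (not part of the commit): the two conditions listed in the new storage-hot section can be read as a simple predicate. The sketch below is a toy model under stated assumptions, not PostgreSQL's actual implementation. The function name can_use_hot and all of its parameters are hypothetical, and "indexed columns" is assumed to already include every column referenced by any index, including expression indexes and partial-index predicates.

```python
def can_use_hot(updated_columns, indexed_columns, page_free_bytes, new_row_bytes):
    """Toy model of the two HOT conditions described in the storage-hot section.

    All names here are hypothetical; this is not PostgreSQL code.
    """
    # Condition 1: the update must not modify any column referenced by an index
    # (expression and partial indexes count as well).
    touches_index = bool(set(updated_columns) & set(indexed_columns))
    # Condition 2: the page holding the old row version must have room
    # for the new row version.
    fits_on_page = new_row_bytes <= page_free_bytes
    return (not touches_index) and fits_on_page


# Updating only a non-indexed column, with enough free space on the page.
print(can_use_hot({"balance"}, {"id", "email"}, 200, 120))  # True
# Updating an indexed column rules out HOT regardless of free space.
print(can_use_hot({"email"}, {"id", "email"}, 200, 120))    # False
# No indexed column touched, but the page is too full for the new version.
print(can_use_hot({"balance"}, {"id", "email"}, 50, 120))   # False
```

In this model, lowering a table's fillfactor corresponds to starting with a larger page_free_bytes, which is why the patch describes it as making heap-only tuple updates more likely.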
