
Commit 5866218

author
Commitfest Bot
committed
[CF 5117] VACUUM FULL / CLUSTER CONCURRENTLY
This branch was automatically generated by a robot using patches from an
email thread registered at:

https://commitfest.postgresql.org/patch/5117

The branch will be overwritten each time a new patch version is posted to
the thread, and also periodically to check for bitrot caused by changes
on the master branch.

Patch(es): https://www.postgresql.org/message-id/55563.1743784734@localhost
Author(s): Antonin Houska
2 parents 8a51027 + 86989a4 commit 5866218

File tree

25 files changed

+1259
-207
lines changed

doc/src/sgml/monitoring.sgml

Lines changed: 230 additions & 0 deletions
@@ -400,6 +400,14 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
400400
</entry>
401401
</row>
402402

403+
<row>
404+
<entry><structname>pg_stat_progress_repack</structname><indexterm><primary>pg_stat_progress_repack</primary></indexterm></entry>
405+
<entry>One row for each backend running
406+
<command>REPACK</command>, showing current progress. See
407+
<xref linkend="repack-progress-reporting"/>.
408+
</entry>
409+
</row>
410+
403411
<row>
404412
<entry><structname>pg_stat_progress_basebackup</structname><indexterm><primary>pg_stat_progress_basebackup</primary></indexterm></entry>
405413
<entry>One row for each WAL sender process streaming a base backup,
@@ -5943,6 +5951,228 @@ FROM pg_stat_get_backend_idset() AS backendid;
59435951
</table>
59445952
</sect2>
59455953

5954+
<sect2 id="repack-progress-reporting">
5955+
<title>REPACK Progress Reporting</title>
5956+
5957+
<indexterm>
5958+
<primary>pg_stat_progress_repack</primary>
5959+
</indexterm>
5960+
5961+
<para>
5962+
Whenever <command>REPACK</command> is running,
5963+
the <structname>pg_stat_progress_repack</structname> view will contain a
5964+
row for each backend that is currently running the command. The tables
5965+
below describe the information that will be reported and provide
5966+
information about how to interpret it.
5967+
</para>
5968+
5969+
<table id="pg-stat-progress-repack-view" xreflabel="pg_stat_progress_repack">
5970+
<title><structname>pg_stat_progress_repack</structname> View</title>
5971+
<tgroup cols="1">
5972+
<thead>
5973+
<row>
5974+
<entry role="catalog_table_entry"><para role="column_definition">
5975+
Column Type
5976+
</para>
5977+
<para>
5978+
Description
5979+
</para></entry>
5980+
</row>
5981+
</thead>
5982+
5983+
<tbody>
5984+
<row>
5985+
<entry role="catalog_table_entry"><para role="column_definition">
5986+
<structfield>pid</structfield> <type>integer</type>
5987+
</para>
5988+
<para>
5989+
Process ID of backend.
5990+
</para></entry>
5991+
</row>
5992+
5993+
<row>
5994+
<entry role="catalog_table_entry"><para role="column_definition">
5995+
<structfield>datid</structfield> <type>oid</type>
5996+
</para>
5997+
<para>
5998+
OID of the database to which this backend is connected.
5999+
</para></entry>
6000+
</row>
6001+
6002+
<row>
6003+
<entry role="catalog_table_entry"><para role="column_definition">
6004+
<structfield>datname</structfield> <type>name</type>
6005+
</para>
6006+
<para>
6007+
Name of the database to which this backend is connected.
6008+
</para></entry>
6009+
</row>
6010+
6011+
<row>
6012+
<entry role="catalog_table_entry"><para role="column_definition">
6013+
<structfield>relid</structfield> <type>oid</type>
6014+
</para>
6015+
<para>
6016+
OID of the table being repacked.
6017+
</para></entry>
6018+
</row>
6019+
6020+
<row>
6021+
<entry role="catalog_table_entry"><para role="column_definition">
6022+
<structfield>command</structfield> <type>text</type>
6023+
</para>
6024+
<para>
6025+
The command that is running. Currently, the only value
6026+
is <literal>REPACK</literal>.
6027+
</para></entry>
6028+
</row>
6029+
6030+
<row>
6031+
<entry role="catalog_table_entry"><para role="column_definition">
6032+
<structfield>phase</structfield> <type>text</type>
6033+
</para>
6034+
<para>
6035+
Current processing phase. See <xref linkend="repack-phases"/>.
6036+
</para></entry>
6037+
</row>
6038+
6039+
<row>
6040+
<entry role="catalog_table_entry"><para role="column_definition">
6041+
<structfield>repack_index_relid</structfield> <type>oid</type>
6042+
</para>
6043+
<para>
6044+
If the table is being scanned using an index, this is the OID of the
6045+
index being used; otherwise, it is zero.
6046+
</para></entry>
6047+
</row>
6048+
6049+
<row>
6050+
<entry role="catalog_table_entry"><para role="column_definition">
6051+
<structfield>heap_tuples_scanned</structfield> <type>bigint</type>
6052+
</para>
6053+
<para>
6054+
Number of heap tuples scanned.
6055+
This counter only advances when the phase is
6056+
<literal>seq scanning heap</literal>,
6057+
<literal>index scanning heap</literal>
6058+
or <literal>writing new heap</literal>.
6059+
</para></entry>
6060+
</row>
6061+
6062+
<row>
6063+
<entry role="catalog_table_entry"><para role="column_definition">
6064+
<structfield>heap_tuples_written</structfield> <type>bigint</type>
6065+
</para>
6066+
<para>
6067+
Number of heap tuples written.
6068+
This counter only advances when the phase is
6069+
<literal>seq scanning heap</literal>,
6070+
<literal>index scanning heap</literal>
6071+
or <literal>writing new heap</literal>.
6072+
</para></entry>
6073+
</row>
6074+
6075+
<row>
6076+
<entry role="catalog_table_entry"><para role="column_definition">
6077+
<structfield>heap_blks_total</structfield> <type>bigint</type>
6078+
</para>
6079+
<para>
6080+
Total number of heap blocks in the table. This number is reported
6081+
as of the beginning of <literal>seq scanning heap</literal>.
6082+
</para></entry>
6083+
</row>
6084+
6085+
<row>
6086+
<entry role="catalog_table_entry"><para role="column_definition">
6087+
<structfield>heap_blks_scanned</structfield> <type>bigint</type>
6088+
</para>
6089+
<para>
6090+
Number of heap blocks scanned. This counter only advances when the
6091+
phase is <literal>seq scanning heap</literal>.
6092+
</para></entry>
6093+
</row>
6094+
6095+
<row>
6096+
<entry role="catalog_table_entry"><para role="column_definition">
6097+
<structfield>index_rebuild_count</structfield> <type>bigint</type>
6098+
</para>
6099+
<para>
6100+
Number of indexes rebuilt. This counter only advances when the phase
6101+
is <literal>rebuilding index</literal>.
6102+
</para></entry>
6103+
</row>
6104+
</tbody>
6105+
</tgroup>
6106+
</table>
6107+
6108+
<table id="repack-phases">
6109+
<title>REPACK Phases</title>
6110+
<tgroup cols="2">
6111+
<colspec colname="col1" colwidth="1*"/>
6112+
<colspec colname="col2" colwidth="2*"/>
6113+
<thead>
6114+
<row>
6115+
<entry>Phase</entry>
6116+
<entry>Description</entry>
6117+
</row>
6118+
</thead>
6119+
6120+
<tbody>
6121+
<row>
6122+
<entry><literal>initializing</literal></entry>
6123+
<entry>
6124+
The command is preparing to begin scanning the heap. This phase is
6125+
expected to be very brief.
6126+
</entry>
6127+
</row>
6128+
<row>
6129+
<entry><literal>seq scanning heap</literal></entry>
6130+
<entry>
6131+
The command is currently scanning the table using a sequential scan.
6132+
</entry>
6133+
</row>
6134+
<row>
6135+
<entry><literal>index scanning heap</literal></entry>
6136+
<entry>
6137+
<command>REPACK</command> is currently scanning the table using an index scan.
6138+
</entry>
6139+
</row>
6140+
<row>
6141+
<entry><literal>sorting tuples</literal></entry>
6142+
<entry>
6143+
<command>REPACK</command> is currently sorting tuples.
6144+
</entry>
6145+
</row>
6146+
<row>
6147+
<entry><literal>writing new heap</literal></entry>
6148+
<entry>
6149+
<command>REPACK</command> is currently writing the new heap.
6150+
</entry>
6151+
</row>
6152+
<row>
6153+
<entry><literal>swapping relation files</literal></entry>
6154+
<entry>
6155+
The command is currently swapping newly-built files into place.
6156+
</entry>
6157+
</row>
6158+
<row>
6159+
<entry><literal>rebuilding index</literal></entry>
6160+
<entry>
6161+
The command is currently rebuilding an index.
6162+
</entry>
6163+
</row>
6164+
<row>
6165+
<entry><literal>performing final cleanup</literal></entry>
6166+
<entry>
6167+
The command is performing final cleanup. When this phase is
6168+
completed, <command>REPACK</command> will end.
6169+
</entry>
6170+
</row>
6171+
</tbody>
6172+
</tgroup>
6173+
</table>
6174+
</sect2>
6175+
59466176
<sect2 id="copy-progress-reporting">
59476177
<title>COPY Progress Reporting</title>
59486178
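
As a reading aid for the `pg_stat_progress_repack` columns added above, here is a minimal sketch, not part of the patch, of how a monitoring script might turn one row of the view into a percentage. The helper name `repack_scan_progress` and the dict-shaped row are assumptions made for illustration; only the column names and phase strings come from the documentation in this diff.

```python
def repack_scan_progress(row):
    """Return the percentage of heap blocks scanned, or None if not meaningful.

    `row` is a hypothetical dict holding one row of pg_stat_progress_repack.
    Per the documentation above, heap_blks_scanned only advances during the
    "seq scanning heap" phase, so no percentage is reported in other phases.
    """
    if row["phase"] != "seq scanning heap":
        return None
    total = row["heap_blks_total"]
    if not total:
        # Empty table or total not yet reported: avoid dividing by zero.
        return None
    return 100.0 * row["heap_blks_scanned"] / total


# Example row, with made-up numbers.
example = {
    "phase": "seq scanning heap",
    "heap_blks_total": 200,
    "heap_blks_scanned": 50,
}
print(repack_scan_progress(example))
```

Returning `None` outside the sequential-scan phase is a design choice of this sketch: during an index scan the block counter does not advance, so a percentage would be misleading.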

doc/src/sgml/ref/allfiles.sgml

Lines changed: 1 addition & 0 deletions
@@ -167,6 +167,7 @@ Complete list of usable sgml source files in this directory.
167167
<!ENTITY refreshMaterializedView SYSTEM "refresh_materialized_view.sgml">
168168
<!ENTITY reindex SYSTEM "reindex.sgml">
169169
<!ENTITY releaseSavepoint SYSTEM "release_savepoint.sgml">
170+
<!ENTITY repack SYSTEM "repack.sgml">
170171
<!ENTITY reset SYSTEM "reset.sgml">
171172
<!ENTITY revoke SYSTEM "revoke.sgml">
172173
<!ENTITY rollback SYSTEM "rollback.sgml">

doc/src/sgml/ref/cluster.sgml

Lines changed: 17 additions & 62 deletions
@@ -42,17 +42,23 @@ CLUSTER [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
4242
<replaceable class="parameter">table_name</replaceable>.
4343
</para>
4444

45-
<para>
46-
When a table is clustered, it is physically reordered
47-
based on the index information. Clustering is a one-time operation:
48-
when the table is subsequently updated, the changes are
49-
not clustered. That is, no attempt is made to store new or
50-
updated rows according to their index order. (If one wishes, one can
51-
periodically recluster by issuing the command again. Also, setting
52-
the table's <literal>fillfactor</literal> storage parameter to less than
53-
100% can aid in preserving cluster ordering during updates, since updated
54-
rows are kept on the same page if enough space is available there.)
55-
</para>
45+
<warning>
46+
<para>
47+
The <command>CLUSTER</command> command is deprecated in favor of
48+
<xref linkend="sql-repack"/>.
49+
</para>
50+
</warning>
51+
52+
<note>
53+
<para>
54+
<xref linkend="sql-repack-notes-on-clustering"/> explains how clustering
55+
works, whether it is initiated by <command>CLUSTER</command> or
56+
by <command>REPACK</command>. The notable difference between the two is
57+
that <command>REPACK</command> does not remember the index used last
58+
time. Thus if you don't specify an index, <command>REPACK</command>
59+
rewrites the table but does not try to cluster it.
60+
</para>
61+
</note>
5662

5763
<para>
5864
When a table is clustered, <productname>PostgreSQL</productname>
@@ -136,63 +142,12 @@ CLUSTER [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
136142
on the table.
137143
</para>
138144

139-
<para>
140-
In cases where you are accessing single rows randomly
141-
within a table, the actual order of the data in the
142-
table is unimportant. However, if you tend to access some
143-
data more than others, and there is an index that groups
144-
them together, you will benefit from using <command>CLUSTER</command>.
145-
If you are requesting a range of indexed values from a table, or a
146-
single indexed value that has multiple rows that match,
147-
<command>CLUSTER</command> will help because once the index identifies the
148-
table page for the first row that matches, all other rows
149-
that match are probably already on the same table page,
150-
and so you save disk accesses and speed up the query.
151-
</para>
152-
153-
<para>
154-
<command>CLUSTER</command> can re-sort the table using either an index scan
155-
on the specified index, or (if the index is a b-tree) a sequential
156-
scan followed by sorting. It will attempt to choose the method that
157-
will be faster, based on planner cost parameters and available statistical
158-
information.
159-
</para>
160-
161145
<para>
162146
While <command>CLUSTER</command> is running, the <xref
163147
linkend="guc-search-path"/> is temporarily changed to <literal>pg_catalog,
164148
pg_temp</literal>.
165149
</para>
166150

167-
<para>
168-
When an index scan is used, a temporary copy of the table is created that
169-
contains the table data in the index order. Temporary copies of each
170-
index on the table are created as well. Therefore, you need free space on
171-
disk at least equal to the sum of the table size and the index sizes.
172-
</para>
173-
174-
<para>
175-
When a sequential scan and sort is used, a temporary sort file is
176-
also created, so that the peak temporary space requirement is as much
177-
as double the table size, plus the index sizes. This method is often
178-
faster than the index scan method, but if the disk space requirement is
179-
intolerable, you can disable this choice by temporarily setting <xref
180-
linkend="guc-enable-sort"/> to <literal>off</literal>.
181-
</para>
182-
183-
<para>
184-
It is advisable to set <xref linkend="guc-maintenance-work-mem"/> to
185-
a reasonably large value (but not more than the amount of RAM you can
186-
dedicate to the <command>CLUSTER</command> operation) before clustering.
187-
</para>
188-
189-
<para>
190-
Because the planner records statistics about the ordering of
191-
tables, it is advisable to run <link linkend="sql-analyze"><command>ANALYZE</command></link>
192-
on the newly clustered table.
193-
Otherwise, the planner might make poor choices of query plans.
194-
</para>
195-
196151
<para>
197152
Because <command>CLUSTER</command> remembers which indexes are clustered,
198153
one can cluster the tables one wants clustered manually the first time,
