
Commit 6f44213

michail-nikolaev authored and Commitfest Bot committed
Use auxiliary indexes for concurrent index operations
Replace the second full table scan in concurrent index builds with an auxiliary index approach:

- create a STIR auxiliary index with the same predicate (if any) as the main index
- use it to track tuples inserted during the first phase
- merge the auxiliary index with the main index during validation, catching the new index up with any tuples missed during the first phase
- automatically drop the auxiliary index once the main index is ready

To merge the main and auxiliary indexes:

- index_bulk_delete is called for both, and the TIDs are put into tuplesorts
- both tuplesorts are sorted
- both tuplesorts are scanned with two pointers, looking for TIDs present in the auxiliary index but absent from the main one
- all such TIDs are put into a tuplestore
- all TIDs in the tuplestore are fetched using a read stream; the tuplestore is used in heapam_index_validate_scan_read_stream_next to provide the next page to prefetch
- if a fetched tuple is alive, it is inserted into the main index

This eliminates the need for a second full table scan during validation, improving performance especially for large tables.

Affects both CREATE INDEX CONCURRENTLY and REINDEX INDEX CONCURRENTLY operations.
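The merge step described above boils down to a two-pointer walk over two sorted TID streams. The following standalone C sketch illustrates just that walk using plain sorted arrays instead of the tuplesort/tuplestore/read-stream machinery the commit actually uses; the Tid struct and merge_missing function are illustrative names, not code from the patch.

/*
 * Simplified, self-contained sketch of the two-pointer merge described
 * above.  The real patch operates on tuplesorts produced by
 * index_bulk_delete and feeds the result through a tuplestore and a read
 * stream; here the TID lists are plain sorted arrays.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Tid { uint32_t block; uint16_t offset; } Tid;

static int
tid_cmp(const void *a, const void *b)
{
    const Tid *x = a, *y = b;

    if (x->block != y->block)
        return (x->block < y->block) ? -1 : 1;
    if (x->offset != y->offset)
        return (x->offset < y->offset) ? -1 : 1;
    return 0;
}

/* Report every TID present in aux[] but absent from main_[]; both sorted. */
static void
merge_missing(const Tid *aux, size_t naux, const Tid *main_, size_t nmain)
{
    size_t  i = 0, j = 0;

    while (i < naux)
    {
        int c = (j < nmain) ? tid_cmp(&aux[i], &main_[j]) : -1;

        if (c == 0)             /* present in both: already indexed */
        {
            i++;
            j++;
        }
        else if (c > 0)         /* main pointer is behind: advance it */
            j++;
        else                    /* missing from the main index */
        {
            /* the real code fetches the heap tuple here and, if it is
             * still alive, inserts it into the main index */
            printf("missing: (%u,%u)\n",
                   (unsigned) aux[i].block, (unsigned) aux[i].offset);
            i++;
        }
    }
}

int
main(void)
{
    Tid aux[]   = {{1, 3}, {2, 1}, {2, 5}, {7, 2}};
    Tid main_[] = {{1, 3}, {2, 5}};

    qsort(aux, 4, sizeof(Tid), tid_cmp);
    qsort(main_, 2, sizeof(Tid), tid_cmp);
    merge_missing(aux, 4, main_, 2);
    return 0;
}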
1 parent 3d30a78 commit 6f44213

File tree

19 files changed: +1101 -348 lines

doc/src/sgml/monitoring.sgml

Lines changed: 19 additions & 7 deletions

@@ -6314,6 +6314,18 @@ FROM pg_stat_get_backend_idset() AS backendid;
        information for this phase.
       </entry>
      </row>
+     <row>
+      <entry><literal>waiting for writers to use auxiliary index</literal></entry>
+      <entry>
+       <command>CREATE INDEX CONCURRENTLY</command> or <command>REINDEX CONCURRENTLY</command> is waiting for transactions
+       with write locks that can potentially see the table to finish, to ensure the auxiliary index is used for new
+       tuples in future transactions.
+       This phase is skipped when not in concurrent mode.
+       Columns <structname>lockers_total</structname>, <structname>lockers_done</structname>
+       and <structname>current_locker_pid</structname> contain the progress
+       information for this phase.
+      </entry>
+     </row>
      <row>
       <entry><literal>building index</literal></entry>
       <entry>
@@ -6354,13 +6366,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
       </entry>
      </row>
      <row>
-      <entry><literal>index validation: scanning table</literal></entry>
+      <entry><literal>index validation: merging indexes</literal></entry>
       <entry>
-       <command>CREATE INDEX CONCURRENTLY</command> is scanning the table
-       to validate the index tuples collected in the previous two phases.
+       <command>CREATE INDEX CONCURRENTLY</command> is merging the content of the auxiliary index into the target index.
        This phase is skipped when not in concurrent mode.
-       Columns <structname>blocks_total</structname> (set to the total size of the table)
-       and <structname>blocks_done</structname> contain the progress information for this phase.
+       Columns <structname>tuples_total</structname> (set to the number of tuples to be merged)
+       and <structname>tuples_done</structname> contain the progress information for this phase.
       </entry>
      </row>
      <row>
@@ -6377,8 +6388,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry><literal>waiting for readers before marking dead</literal></entry>
       <entry>
-       <command>REINDEX CONCURRENTLY</command> is waiting for transactions
-       with read locks on the table to finish, before marking the old index dead.
+       <command>CREATE INDEX CONCURRENTLY</command> is waiting for transactions
+       with read locks on the table to finish, before marking the auxiliary index as dead.
+       <command>REINDEX CONCURRENTLY</command> is also waiting before marking the old index as dead.
        This phase is skipped when not in concurrent mode.
        Columns <structname>lockers_total</structname>, <structname>lockers_done</structname>
        and <structname>current_locker_pid</structname> contain the progress
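The new and renamed phases above surface through the pg_stat_progress_create_index view. As a rough illustration (not part of the patch), a client could poll that view while a concurrent build runs in another session; the sketch below uses libpq, and the connection string and polling interval are placeholders.

/*
 * Minimal libpq sketch that polls pg_stat_progress_create_index once per
 * second while a concurrent index build runs elsewhere.  Adjust the
 * connection string for your environment.
 */
#include <stdio.h>
#include <unistd.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=postgres");   /* placeholder */

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    for (;;)
    {
        PGresult *res = PQexec(conn,
            "SELECT phase, lockers_done, lockers_total, "
            "       tuples_done, tuples_total "
            "FROM pg_stat_progress_create_index");

        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            PQclear(res);
            break;
        }

        /* One row per backend currently running CREATE INDEX / REINDEX. */
        for (int i = 0; i < PQntuples(res); i++)
            printf("phase=%s lockers=%s/%s tuples=%s/%s\n",
                   PQgetvalue(res, i, 0),
                   PQgetvalue(res, i, 1), PQgetvalue(res, i, 2),
                   PQgetvalue(res, i, 3), PQgetvalue(res, i, 4));

        PQclear(res);
        sleep(1);
    }

    PQfinish(conn);
    return 0;
}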

doc/src/sgml/ref/create_index.sgml

Lines changed: 18 additions & 16 deletions

@@ -620,25 +620,25 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
    out writes.  This method is invoked by specifying the
    <literal>CONCURRENTLY</literal> option of <command>CREATE INDEX</command>.
    When this option is used,
-   <productname>PostgreSQL</productname> must perform two scans of the table, and in
-   addition it must wait for all existing transactions that could potentially
-   modify or use the index to terminate.  Thus
-   this method requires more total work than a standard index build and takes
+   <productname>PostgreSQL</productname> must perform a table scan followed by
+   a validation phase, and in addition it must wait for all existing transactions
+   that could potentially modify or use the index to terminate.  Thus
+   this method requires more total work than a standard index build and may take
    significantly longer to complete.  However, since it allows normal
    operations to continue while the index is built, this method is useful for
    adding new indexes in a production environment.  Of course, the extra CPU
    and I/O load imposed by the index creation might slow other operations.
   </para>

   <para>
-   In a concurrent index build, the index is actually entered as an
-   <quote>invalid</quote> index into
-   the system catalogs in one transaction, then two table scans occur in
-   two more transactions.  Before each table scan, the index build must
+   In a concurrent index build, the main and auxiliary indexes are actually
+   entered as <quote>invalid</quote> indexes into
+   the system catalogs in one transaction, then two phases occur in
+   multiple transactions.  Before each phase, the index build must
    wait for existing transactions that have modified the table to terminate.
-   After the second scan, the index build must wait for any transactions
+   After the second phase, the index build must wait for any transactions
    that have a snapshot (see <xref linkend="mvcc"/>) predating the second
-   scan to terminate, including transactions used by any phase of concurrent
+   phase to terminate, including transactions used by any phase of concurrent
    index builds on other tables, if the indexes involved are partial or have
    columns that are not simple column references.
    Then finally the index can be marked <quote>valid</quote> and ready for use,
@@ -651,10 +651,11 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
   <para>
    If a problem arises while scanning the table, such as a deadlock or a
    uniqueness violation in a unique index, the <command>CREATE INDEX</command>
-   command will fail but leave behind an <quote>invalid</quote> index. This index
-   will be ignored for querying purposes because it might be incomplete;
-   however it will still consume update overhead. The <application>psql</application>
-   <command>\d</command> command will report such an index as <literal>INVALID</literal>:
+   command will fail but leave behind an <quote>invalid</quote> index and its
+   associated auxiliary index.  These indexes
+   will be ignored for querying purposes because they might be incomplete;
+   however they will still consume update overhead.  The <application>psql</application>
+   <command>\d</command> command will report such indexes as <literal>INVALID</literal>:

 <programlisting>
 postgres=# \d tab
@@ -664,11 +665,12 @@ postgres=# \d tab
  col | integer |           |          |
 Indexes:
     "idx" btree (col) INVALID
+    "idx_ccaux" stir (col) INVALID
 </programlisting>

    The recommended recovery
-   method in such cases is to drop the index and try again to perform
-   <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is
+   method in such cases is to drop these indexes and try again to perform
+   <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is
    to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>).
   </para>
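The recovery procedure recommended above (drop the leftover invalid indexes, including the _ccaux auxiliary one, then retry) could be scripted roughly as follows. This is a hedged sketch, not part of the patch: the table and index names ("tab", "col", "idx") follow the documentation example, the connection string is a placeholder, and error handling is minimal. The catalog query relies only on the existing pg_index.indisvalid flag.

/*
 * Sketch of recovering from a failed concurrent build: find invalid
 * leftover indexes on the table, drop them, and retry the build.
 */
#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn   *conn = PQconnectdb("dbname=postgres");   /* placeholder */
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    /* Invalid indexes left behind by a failed concurrent build,
     * e.g. "idx" and its auxiliary "idx_ccaux". */
    res = PQexec(conn,
        "SELECT indexrelid::regclass "
        "FROM pg_index "
        "WHERE indrelid = 'tab'::regclass AND NOT indisvalid");

    if (PQresultStatus(res) == PGRES_TUPLES_OK)
    {
        for (int i = 0; i < PQntuples(res); i++)
        {
            char sql[256];

            snprintf(sql, sizeof(sql), "DROP INDEX %s",
                     PQgetvalue(res, i, 0));
            PQclear(PQexec(conn, sql));
        }
    }
    PQclear(res);

    /* Retry; CONCURRENTLY cannot run in a transaction block, which is
     * fine here because each PQexec statement autocommits. */
    PQclear(PQexec(conn, "CREATE INDEX CONCURRENTLY idx ON tab (col)"));

    PQfinish(conn);
    return 0;
}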

doc/src/sgml/ref/reindex.sgml

Lines changed: 25 additions & 16 deletions

@@ -368,9 +368,8 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { DA
    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
    of writes.  This method is invoked by specifying the
    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>.  When this option
-   is used, <productname>PostgreSQL</productname> must perform two scans of the table
-   for each index that needs to be rebuilt and wait for termination of
-   all existing transactions that could potentially use the index.
+   is used, <productname>PostgreSQL</productname> must perform several steps to ensure data
+   consistency while allowing normal operations to continue.
    This method requires more total work than a standard index
    rebuild and takes significantly longer to complete as it needs to wait
    for unfinished transactions that might modify the index.  However, since
@@ -388,7 +387,7 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { DA
   <orderedlist>
    <listitem>
     <para>
-     A new transient index definition is added to the catalog
+     A new transient index definition and an auxiliary index are added to the catalog
      <literal>pg_index</literal>.  This definition will be used to replace
      the old index.  A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
      session level is taken on the indexes being reindexed as well as their
@@ -398,7 +397,15 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { DA

    <listitem>
     <para>
-     A first pass to build the index is done for each new index.  Once the
+     The auxiliary index is marked as "ready for inserts", making
+     it visible to other sessions.  This index efficiently tracks all new
+     tuples during the reindex process.
+    </para>
+   </listitem>
+
+   <listitem>
+    <para>
+     The new main index is built by scanning the table.  Once the
      index is built, its flag <literal>pg_index.indisready</literal> is
      switched to <quote>true</quote> to make it ready for inserts, making it
      visible to other sessions once the transaction that performed the build
@@ -409,9 +416,9 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { DA

    <listitem>
     <para>
-     Then a second pass is performed to add tuples that were added while the
-     first pass was running.  This step is also done in a separate
-     transaction for each index.
+     A validation phase merges any missing entries from the auxiliary index
+     into the main index, ensuring all concurrent changes are captured.
+     This step is also done in a separate transaction for each index.
     </para>
    </listitem>

@@ -428,15 +435,15 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { DA

    <listitem>
     <para>
-     The old indexes have <literal>pg_index.indisready</literal> switched to
+     The old and auxiliary indexes have <literal>pg_index.indisready</literal> switched to
      <quote>false</quote> to prevent any new tuple insertions, after waiting
      for running queries that might reference the old index to complete.
     </para>
    </listitem>

    <listitem>
     <para>
-     The old indexes are dropped.  The <literal>SHARE UPDATE
+     The old and auxiliary indexes are dropped.  The <literal>SHARE UPDATE
      EXCLUSIVE</literal> session locks for the indexes and the table are
      released.
     </para>
@@ -447,11 +454,11 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { DA
   <para>
    If a problem arises while rebuilding the indexes, such as a
    uniqueness violation in a unique index, the <command>REINDEX</command>
-   command will fail but leave behind an <quote>invalid</quote> new index in addition to
-   the pre-existing one.  This index will be ignored for querying purposes
-   because it might be incomplete; however it will still consume update
+   command will fail but leave behind an <quote>invalid</quote> new index and its auxiliary index in addition to
+   the pre-existing one.  These indexes will be ignored for querying purposes
+   because they might be incomplete; however they will still consume update
    overhead.  The <application>psql</application> <command>\d</command> command will report
-   such an index as <literal>INVALID</literal>:
+   such indexes as <literal>INVALID</literal>:

 <programlisting>
 postgres=# \d tab
@@ -462,12 +469,14 @@ postgres=# \d tab
 Indexes:
     "idx" btree (col)
     "idx_ccnew" btree (col) INVALID
+    "idx_ccaux" stir (col) INVALID
+
 </programlisting>

    If the index marked <literal>INVALID</literal> is suffixed
-   <literal>_ccnew</literal>, then it corresponds to the transient
+   <literal>_ccnew</literal> or <literal>_ccaux</literal>, then it corresponds to the transient or auxiliary
    index created during the concurrent operation, and the recommended
-   recovery method is to drop it using <literal>DROP INDEX</literal>,
+   recovery method is to drop these indexes using <literal>DROP INDEX</literal>,
    then attempt <command>REINDEX CONCURRENTLY</command> again.
    If the invalid index is instead suffixed <literal>_ccold</literal>,
    it corresponds to the original index which could not be dropped;

src/backend/access/heap/README.HOT

Lines changed: 9 additions & 4 deletions

@@ -375,6 +375,11 @@ constraint on which updates can be HOT. Other transactions must include
 such an index when determining HOT-safety of updates, even though they
 must ignore it for both insertion and searching purposes.

+Also, a special auxiliary index is created the same way.  It is marked as
+"ready for inserts" without any actual table scan.  Its purpose is to collect
+new tuples inserted into the table while our target index is still "not ready
+for inserts".
+
 We must do this to avoid making incorrect index entries. For example,
 suppose we are building an index on column X and we make an index entry for
 a non-HOT tuple with X=1. Then some other backend, unaware that X is an
@@ -394,10 +399,10 @@ As above, we point the index entry at the root of the HOT-update chain but we
 use the key value from the live tuple.

 We mark the index open for inserts (but still not ready for reads) then
-we again wait for transactions which have the table open.  Then we take
-a second reference snapshot and validate the index.  This searches for
-tuples missing from the index, and inserts any missing ones.  Again,
-the index entries have to have TIDs equal to HOT-chain root TIDs, but
+we again wait for transactions which have the table open.  Then we validate
+the index.  This uses the auxiliary index to find tuples missing from the
+main index, and inserts any missing ones that are visible to the reference snapshot.
+Again, the index entries have to have TIDs equal to HOT-chain root TIDs, but
 the value to be inserted is the one from the live tuple.

 Then we wait until every transaction that could have a snapshot older than
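The requirement that index entries carry HOT-chain root TIDs is the subtle part of the validation step above. The toy program below models a single page's line pointers and shows how a heap-only tuple's offset is mapped back to its chain root before an index entry would be made; it is a conceptual illustration with made-up data, not the server's actual heap_get_root_tuples() code.

/*
 * Toy model of mapping heap-only tuples back to their HOT-chain root.
 * Validation must insert missing entries under the root TID, not under
 * the heap-only member's own TID.
 */
#include <stdio.h>

#define NITEMS  6
#define INVALID 0

typedef struct Item
{
    int next;        /* offset of the next chain member, or INVALID */
    int heap_only;   /* 1 if this tuple has no index entries of its own */
} Item;

int
main(void)
{
    /* offsets 1..5; offset 2 is a root whose chain is 2 -> 4 -> 5 */
    Item page[NITEMS] = {
        {0, 0},            /* unused slot 0 */
        {INVALID, 0},      /* 1: ordinary tuple */
        {4, 0},            /* 2: HOT-chain root */
        {INVALID, 0},      /* 3: ordinary tuple */
        {5, 1},            /* 4: heap-only, updated again */
        {INVALID, 1},      /* 5: heap-only, live version */
    };
    int root_of[NITEMS] = {0};

    /* Walk each chain from its root, recording the root for every member. */
    for (int off = 1; off < NITEMS; off++)
    {
        if (page[off].heap_only)
            continue;           /* not a chain root */
        for (int m = off; m != INVALID; m = page[m].next)
            root_of[m] = off;
    }

    /* A missing tuple found at offset 5 must be indexed under offset 2. */
    printf("index entry for offset 5 points at root offset %d\n", root_of[5]);
    return 0;
}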
