Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit d28cd3e

Browse files
committed
At update of non-LP_NORMAL TID, fail instead of corrupting page header.
The right mix of DDL and VACUUM could corrupt a catalog page header such that PageIsVerified() durably fails, requiring a restore from backup. This affects only catalogs that both have a syscache and have DDL code that uses syscache tuples to construct updates. One of the test permutations shows a variant not yet fixed. This makes !TransactionIdIsValid(TM_FailureData.xmax) possible with TM_Deleted. I think core and PGXN are indifferent to that. Per bug #17821 from Alexander Lakhin. Back-patch to v13 (all supported versions). The test case is v17+, since it uses INJECTION_POINT. Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
1 parent 81772a4 commit d28cd3e

File tree

10 files changed

+490
-5
lines changed

10 files changed

+490
-5
lines changed

src/backend/access/heap/heapam.c

Lines changed: 48 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -49,8 +49,10 @@
4949
#include "storage/predicate.h"
5050
#include "storage/procarray.h"
5151
#include "utils/datum.h"
52+
#include "utils/injection_point.h"
5253
#include "utils/inval.h"
5354
#include "utils/spccache.h"
55+
#include "utils/syscache.h"
5456

5557

5658
static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
@@ -3254,6 +3256,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
32543256
interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
32553257

32563258
block = ItemPointerGetBlockNumber(otid);
3259+
INJECTION_POINT("heap_update-before-pin");
32573260
buffer = ReadBuffer(relation, block);
32583261
page = BufferGetPage(buffer);
32593262

@@ -3269,7 +3272,51 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
32693272
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
32703273

32713274
lp = PageGetItemId(page, ItemPointerGetOffsetNumber(otid));
3272-
Assert(ItemIdIsNormal(lp));
3275+
3276+
/*
3277+
* Usually, a buffer pin and/or snapshot blocks pruning of otid, ensuring
3278+
* we see LP_NORMAL here. When the otid origin is a syscache, we may have
3279+
* neither a pin nor a snapshot. Hence, we may see other LP_ states, each
3280+
* of which indicates concurrent pruning.
3281+
*
3282+
* Failing with TM_Updated would be most accurate. However, unlike other
3283+
* TM_Updated scenarios, we don't know the successor ctid in LP_UNUSED and
3284+
* LP_DEAD cases. While the distinction between TM_Updated and TM_Deleted
3285+
* does matter to SQL statements UPDATE and MERGE, those SQL statements
3286+
* hold a snapshot that ensures LP_NORMAL. Hence, the choice between
3287+
* TM_Updated and TM_Deleted affects only the wording of error messages.
3288+
* Settle on TM_Deleted, for two reasons. First, it avoids complicating
3289+
* the specification of when tmfd->ctid is valid. Second, it creates
3290+
* error log evidence that we took this branch.
3291+
*
3292+
* Since it's possible to see LP_UNUSED at otid, it's also possible to see
3293+
* LP_NORMAL for a tuple that replaced LP_UNUSED. If it's a tuple for an
3294+
* unrelated row, we'll fail with "duplicate key value violates unique".
3295+
* XXX if otid is the live, newer version of the newtup row, we'll discard
3296+
* changes originating in versions of this catalog row after the version
3297+
* the caller got from syscache. See syscache-update-pruned.spec.
3298+
*/
3299+
if (!ItemIdIsNormal(lp))
3300+
{
3301+
Assert(RelationSupportsSysCache(RelationGetRelid(relation)));
3302+
3303+
UnlockReleaseBuffer(buffer);
3304+
Assert(!have_tuple_lock);
3305+
if (vmbuffer != InvalidBuffer)
3306+
ReleaseBuffer(vmbuffer);
3307+
tmfd->ctid = *otid;
3308+
tmfd->xmax = InvalidTransactionId;
3309+
tmfd->cmax = InvalidCommandId;
3310+
*update_indexes = TU_None;
3311+
3312+
bms_free(hot_attrs);
3313+
bms_free(sum_attrs);
3314+
bms_free(key_attrs);
3315+
bms_free(id_attrs);
3316+
/* modified_attrs not yet initialized */
3317+
bms_free(interesting_attrs);
3318+
return TM_Deleted;
3319+
}
32733320

32743321
/*
32753322
* Fill in enough data in oldtup for HeapDetermineColumnsInfo to work

src/backend/utils/cache/inval.c

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -123,6 +123,7 @@
123123
#include "storage/sinval.h"
124124
#include "storage/smgr.h"
125125
#include "utils/catcache.h"
126+
#include "utils/injection_point.h"
126127
#include "utils/inval.h"
127128
#include "utils/memdebug.h"
128129
#include "utils/memutils.h"
@@ -1134,6 +1135,8 @@ AtEOXact_Inval(bool isCommit)
11341135
/* Must be at top of stack */
11351136
Assert(transInvalInfo->my_level == 1 && transInvalInfo->parent == NULL);
11361137

1138+
INJECTION_POINT("AtEOXact_Inval-with-transInvalInfo");
1139+
11371140
if (isCommit)
11381141
{
11391142
/*

src/include/access/tableam.h

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -136,7 +136,8 @@ typedef enum TU_UpdateIndexes
136136
*
137137
* xmax is the outdating transaction's XID. If the caller wants to visit the
138138
* replacement tuple, it must check that this matches before believing the
139-
* replacement is really a match.
139+
* replacement is really a match. This is InvalidTransactionId if the target
140+
* was !LP_NORMAL (expected only for a TID retrieved from syscache).
140141
*
141142
* cmax is the outdating command's CID, but only when the failure code is
142143
* TM_SelfModified (i.e., something in the current transaction outdated the

src/test/modules/injection_points/Makefile

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,15 +5,16 @@ OBJS = \
55
$(WIN32RES) \
66
injection_points.o \
77
injection_stats.o \
8-
injection_stats_fixed.o
8+
injection_stats_fixed.o \
9+
regress_injection.o
910
EXTENSION = injection_points
1011
DATA = injection_points--1.0.sql
1112
PGFILEDESC = "injection_points - facility for injection points"
1213

1314
REGRESS = injection_points reindex_conc
1415
REGRESS_OPTS = --dlpath=$(top_builddir)/src/test/regress
1516

16-
ISOLATION = basic inplace
17+
ISOLATION = basic inplace syscache-update-pruned
1718

1819
TAP_TESTS = 1
1920

Lines changed: 87 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,87 @@
1+
Parsed test spec with 4 sessions
2+
3+
starting permutation: cachefill1 at2 waitprunable4 vac4 grant1 wakeinval4 wakegrant4
4+
step cachefill1: SELECT FROM vactest.reloid_catcache_set('vactest.orig50');
5+
step at2:
6+
CREATE TRIGGER to_set_relhastriggers BEFORE UPDATE ON vactest.orig50
7+
FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
8+
<waiting ...>
9+
step waitprunable4: CALL vactest.wait_prunable();
10+
step vac4: VACUUM pg_class;
11+
step grant1: GRANT SELECT ON vactest.orig50 TO PUBLIC; <waiting ...>
12+
step wakeinval4:
13+
SELECT FROM injection_points_detach('AtEOXact_Inval-with-transInvalInfo');
14+
SELECT FROM injection_points_wakeup('AtEOXact_Inval-with-transInvalInfo');
15+
<waiting ...>
16+
step at2: <... completed>
17+
step wakeinval4: <... completed>
18+
step wakegrant4:
19+
SELECT FROM injection_points_detach('heap_update-before-pin');
20+
SELECT FROM injection_points_wakeup('heap_update-before-pin');
21+
<waiting ...>
22+
step grant1: <... completed>
23+
ERROR: tuple concurrently deleted
24+
step wakegrant4: <... completed>
25+
26+
starting permutation: cachefill1 at2 waitprunable4 vac4 grant1 wakeinval4 mkrels4 wakegrant4
27+
step cachefill1: SELECT FROM vactest.reloid_catcache_set('vactest.orig50');
28+
step at2:
29+
CREATE TRIGGER to_set_relhastriggers BEFORE UPDATE ON vactest.orig50
30+
FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
31+
<waiting ...>
32+
step waitprunable4: CALL vactest.wait_prunable();
33+
step vac4: VACUUM pg_class;
34+
step grant1: GRANT SELECT ON vactest.orig50 TO PUBLIC; <waiting ...>
35+
step wakeinval4:
36+
SELECT FROM injection_points_detach('AtEOXact_Inval-with-transInvalInfo');
37+
SELECT FROM injection_points_wakeup('AtEOXact_Inval-with-transInvalInfo');
38+
<waiting ...>
39+
step at2: <... completed>
40+
step wakeinval4: <... completed>
41+
step mkrels4:
42+
SELECT FROM vactest.mkrels('intruder', 1, 100); -- repopulate LP_UNUSED
43+
44+
step wakegrant4:
45+
SELECT FROM injection_points_detach('heap_update-before-pin');
46+
SELECT FROM injection_points_wakeup('heap_update-before-pin');
47+
<waiting ...>
48+
step grant1: <... completed>
49+
ERROR: duplicate key value violates unique constraint "pg_class_oid_index"
50+
step wakegrant4: <... completed>
51+
52+
starting permutation: snap3 cachefill1 at2 mkrels4 r3 waitprunable4 vac4 grant1 wakeinval4 at4 wakegrant4 inspect4
53+
step snap3: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT;
54+
step cachefill1: SELECT FROM vactest.reloid_catcache_set('vactest.orig50');
55+
step at2:
56+
CREATE TRIGGER to_set_relhastriggers BEFORE UPDATE ON vactest.orig50
57+
FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
58+
<waiting ...>
59+
step mkrels4:
60+
SELECT FROM vactest.mkrels('intruder', 1, 100); -- repopulate LP_UNUSED
61+
62+
step r3: ROLLBACK;
63+
step waitprunable4: CALL vactest.wait_prunable();
64+
step vac4: VACUUM pg_class;
65+
step grant1: GRANT SELECT ON vactest.orig50 TO PUBLIC; <waiting ...>
66+
step wakeinval4:
67+
SELECT FROM injection_points_detach('AtEOXact_Inval-with-transInvalInfo');
68+
SELECT FROM injection_points_wakeup('AtEOXact_Inval-with-transInvalInfo');
69+
<waiting ...>
70+
step at2: <... completed>
71+
step wakeinval4: <... completed>
72+
step at4: ALTER TABLE vactest.child50 INHERIT vactest.orig50;
73+
step wakegrant4:
74+
SELECT FROM injection_points_detach('heap_update-before-pin');
75+
SELECT FROM injection_points_wakeup('heap_update-before-pin');
76+
<waiting ...>
77+
step grant1: <... completed>
78+
step wakegrant4: <... completed>
79+
step inspect4:
80+
SELECT relhastriggers, relhassubclass FROM pg_class
81+
WHERE oid = 'vactest.orig50'::regclass;
82+
83+
relhastriggers|relhassubclass
84+
--------------+--------------
85+
f |f
86+
(1 row)
87+
Lines changed: 86 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,86 @@
1+
Parsed test spec with 4 sessions
2+
3+
starting permutation: cachefill1 at2 waitprunable4 vac4 grant1 wakeinval4 wakegrant4
4+
step cachefill1: SELECT FROM vactest.reloid_catcache_set('vactest.orig50');
5+
step at2:
6+
CREATE TRIGGER to_set_relhastriggers BEFORE UPDATE ON vactest.orig50
7+
FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
8+
<waiting ...>
9+
step waitprunable4: CALL vactest.wait_prunable();
10+
step vac4: VACUUM pg_class;
11+
step grant1: GRANT SELECT ON vactest.orig50 TO PUBLIC; <waiting ...>
12+
step wakeinval4:
13+
SELECT FROM injection_points_detach('AtEOXact_Inval-with-transInvalInfo');
14+
SELECT FROM injection_points_wakeup('AtEOXact_Inval-with-transInvalInfo');
15+
<waiting ...>
16+
step at2: <... completed>
17+
step wakeinval4: <... completed>
18+
step wakegrant4:
19+
SELECT FROM injection_points_detach('heap_update-before-pin');
20+
SELECT FROM injection_points_wakeup('heap_update-before-pin');
21+
<waiting ...>
22+
step grant1: <... completed>
23+
step wakegrant4: <... completed>
24+
25+
starting permutation: cachefill1 at2 waitprunable4 vac4 grant1 wakeinval4 mkrels4 wakegrant4
26+
step cachefill1: SELECT FROM vactest.reloid_catcache_set('vactest.orig50');
27+
step at2:
28+
CREATE TRIGGER to_set_relhastriggers BEFORE UPDATE ON vactest.orig50
29+
FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
30+
<waiting ...>
31+
step waitprunable4: CALL vactest.wait_prunable();
32+
step vac4: VACUUM pg_class;
33+
step grant1: GRANT SELECT ON vactest.orig50 TO PUBLIC; <waiting ...>
34+
step wakeinval4:
35+
SELECT FROM injection_points_detach('AtEOXact_Inval-with-transInvalInfo');
36+
SELECT FROM injection_points_wakeup('AtEOXact_Inval-with-transInvalInfo');
37+
<waiting ...>
38+
step at2: <... completed>
39+
step wakeinval4: <... completed>
40+
step mkrels4:
41+
SELECT FROM vactest.mkrels('intruder', 1, 100); -- repopulate LP_UNUSED
42+
43+
step wakegrant4:
44+
SELECT FROM injection_points_detach('heap_update-before-pin');
45+
SELECT FROM injection_points_wakeup('heap_update-before-pin');
46+
<waiting ...>
47+
step grant1: <... completed>
48+
step wakegrant4: <... completed>
49+
50+
starting permutation: snap3 cachefill1 at2 mkrels4 r3 waitprunable4 vac4 grant1 wakeinval4 at4 wakegrant4 inspect4
51+
step snap3: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT;
52+
step cachefill1: SELECT FROM vactest.reloid_catcache_set('vactest.orig50');
53+
step at2:
54+
CREATE TRIGGER to_set_relhastriggers BEFORE UPDATE ON vactest.orig50
55+
FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
56+
<waiting ...>
57+
step mkrels4:
58+
SELECT FROM vactest.mkrels('intruder', 1, 100); -- repopulate LP_UNUSED
59+
60+
step r3: ROLLBACK;
61+
step waitprunable4: CALL vactest.wait_prunable();
62+
step vac4: VACUUM pg_class;
63+
step grant1: GRANT SELECT ON vactest.orig50 TO PUBLIC; <waiting ...>
64+
step wakeinval4:
65+
SELECT FROM injection_points_detach('AtEOXact_Inval-with-transInvalInfo');
66+
SELECT FROM injection_points_wakeup('AtEOXact_Inval-with-transInvalInfo');
67+
<waiting ...>
68+
step at2: <... completed>
69+
step wakeinval4: <... completed>
70+
step at4: ALTER TABLE vactest.child50 INHERIT vactest.orig50;
71+
step wakegrant4:
72+
SELECT FROM injection_points_detach('heap_update-before-pin');
73+
SELECT FROM injection_points_wakeup('heap_update-before-pin');
74+
<waiting ...>
75+
step grant1: <... completed>
76+
ERROR: tuple concurrently updated
77+
step wakegrant4: <... completed>
78+
step inspect4:
79+
SELECT relhastriggers, relhassubclass FROM pg_class
80+
WHERE oid = 'vactest.orig50'::regclass;
81+
82+
relhastriggers|relhassubclass
83+
--------------+--------------
84+
t |t
85+
(1 row)
86+

src/test/modules/injection_points/injection_points--1.0.sql

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -97,3 +97,11 @@ CREATE FUNCTION injection_points_stats_fixed(OUT numattach int8,
9797
RETURNS record
9898
AS 'MODULE_PATHNAME', 'injection_points_stats_fixed'
9999
LANGUAGE C STRICT;
100+
101+
--
102+
-- regress_injection.c functions
103+
--
104+
CREATE FUNCTION removable_cutoff(rel regclass)
105+
RETURNS xid8
106+
AS 'MODULE_PATHNAME'
107+
LANGUAGE C CALLED ON NULL INPUT;

src/test/modules/injection_points/meson.build

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ injection_points_sources = files(
88
'injection_points.c',
99
'injection_stats.c',
1010
'injection_stats_fixed.c',
11+
'regress_injection.c',
1112
)
1213

1314
if host_system == 'windows'
@@ -44,8 +45,9 @@ tests += {
4445
'specs': [
4546
'basic',
4647
'inplace',
48+
'syscache-update-pruned',
4749
],
48-
'runningcheck': false, # align with GNU make build system
50+
'runningcheck': false, # see syscache-update-pruned
4951
},
5052
'tap': {
5153
'env': {
Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
/*--------------------------------------------------------------------------
2+
*
3+
* regress_injection.c
4+
* Functions supporting test-specific subject matter.
5+
*
6+
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
7+
* Portions Copyright (c) 1994, Regents of the University of California
8+
*
9+
* IDENTIFICATION
10+
* src/test/modules/injection_points/regress_injection.c
11+
*
12+
* -------------------------------------------------------------------------
13+
*/
14+
15+
#include "postgres.h"
16+
17+
#include "access/table.h"
18+
#include "fmgr.h"
19+
#include "miscadmin.h"
20+
#include "storage/procarray.h"
21+
#include "utils/xid8.h"
22+
23+
/*
24+
* removable_cutoff - for syscache-update-pruned.spec
25+
*
26+
* Wrapper around GetOldestNonRemovableTransactionId(). In general, this can
27+
* move backward. runningcheck=false isolation tests can reasonably prevent
28+
* that. For the causes of backward movement, see
29+
* postgr.es/m/CAEze2Wj%2BV0kTx86xB_YbyaqTr5hnE_igdWAwuhSyjXBYscf5-Q%40mail.gmail.com
30+
* and the header comment for ComputeXidHorizons(). One can assume this
31+
* doesn't move backward if one arranges for concurrent activity not to reach
32+
* AbortTransaction() and not to allocate an XID while connected to another
33+
* database. Non-runningcheck tests can control most concurrent activity,
34+
* except autovacuum and the isolationtester control connection. Neither
35+
* allocates XIDs, and AbortTransaction() in those would justify test failure.
36+
*/
37+
PG_FUNCTION_INFO_V1(removable_cutoff);
38+
Datum
39+
removable_cutoff(PG_FUNCTION_ARGS)
40+
{
41+
Relation rel = NULL;
42+
TransactionId xid;
43+
FullTransactionId next_fxid_before,
44+
next_fxid;
45+
46+
/* could take other relkinds callee takes, but we've not yet needed it */
47+
if (!PG_ARGISNULL(0))
48+
rel = table_open(PG_GETARG_OID(0), AccessShareLock);
49+
50+
/*
51+
* No lock or snapshot necessarily prevents oldestXid from advancing past
52+
* "xid" while this function runs. That concerns us only in that we must
53+
* not ascribe "xid" to the wrong epoch. (That may never arise in
54+
* isolation testing, but let's set a good example.) As a crude solution,
55+
* retry until nextXid doesn't change.
56+
*/
57+
next_fxid = ReadNextFullTransactionId();
58+
do
59+
{
60+
CHECK_FOR_INTERRUPTS();
61+
next_fxid_before = next_fxid;
62+
xid = GetOldestNonRemovableTransactionId(rel);
63+
next_fxid = ReadNextFullTransactionId();
64+
} while (!FullTransactionIdEquals(next_fxid, next_fxid_before));
65+
66+
if (rel)
67+
table_close(rel, AccessShareLock);
68+
69+
PG_RETURN_FULLTRANSACTIONID(FullTransactionIdFromAllowableAt(next_fxid,
70+
xid));
71+
}

0 commit comments

Comments
 (0)