Commit 3e701a0

Fix heap_page_prune's problem with failing to send cache invalidation
messages if the calling transaction aborts later on.  Collapsing out line
pointer redirects is a done deal as soon as we complete the page update,
so syscache *must* be notified even if the VACUUM FULL as a whole doesn't
complete.

To fix, add some functionality to inval.c to allow the pending inval
messages to be sent immediately while heap_page_prune is still running.
The implementation is a bit chintzy: it will only work in the context of
VACUUM FULL.  But that's all we need now, and it can always be extended
later if needed.

Per my trouble report of a week ago.
1 parent f4bce7e commit 3e701a0
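
In outline, the fix brackets the pruning work with the two new inval.c entry
points; any invalidation requests queued in between are flushed immediately at
the closing call instead of waiting for transaction commit. A condensed sketch
of the calling pattern, distilled from the pruneheap.c diff below (the elided
middle is the pruning logic; this is an illustration, not a drop-in excerpt):

	if (redirect_move)
		BeginNonTransactionalInvalidation();	/* asserts nothing is queued yet */

	/*
	 * ... prune the page; heap_prune_chain() queues an inval via
	 * CacheInvalidateHeapTuple() for each redirection target it moves ...
	 */

	if (redirect_move)
		EndNonTransactionalInvalidation();		/* sends the queued invals now */

An error anywhere between the two calls simply abandons the queued messages,
which is harmless: an inval that is never sent (or a useless one that is)
changes nothing.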

3 files changed: +124, -14 lines

src/backend/access/heap/pruneheap.c

+25 -12
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/access/heap/pruneheap.c,v 1.7 2008/03/08 21:57:59 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/access/heap/pruneheap.c,v 1.8 2008/03/13 18:00:32 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -158,7 +158,14 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
 	 * much logic as possible out of the critical section, and also ensures
 	 * that WAL replay will work the same as the normal case.
 	 *
-	 * First, initialize the new pd_prune_xid value to zero (indicating no
+	 * First, inform inval.c that upcoming CacheInvalidateHeapTuple calls
+	 * are nontransactional.
+	 */
+	if (redirect_move)
+		BeginNonTransactionalInvalidation();
+
+	/*
+	 * Initialize the new pd_prune_xid value to zero (indicating no
 	 * prunable tuples). If we find any tuples which may soon become
 	 * prunable, we will save the lowest relevant XID in new_prune_xid.
 	 * Also initialize the rest of our working state.
@@ -191,6 +198,18 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
 							 redirect_move);
 	}
 
+	/*
+	 * Send invalidation messages for any tuples we are about to move.
+	 * It is safe to do this now, even though we could theoretically still
+	 * fail before making the actual page update, because a useless cache
+	 * invalidation doesn't hurt anything.  Also, no one else can reload the
+	 * tuples while we have exclusive buffer lock, so it's not too early to
+	 * send the invals.  This avoids sending the invals while inside the
+	 * critical section, which is a good thing for robustness.
+	 */
+	if (redirect_move)
+		EndNonTransactionalInvalidation();
+
 	/* Any error while applying the changes is critical */
 	START_CRIT_SECTION();
 
@@ -587,16 +606,10 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		/*
 		 * If we are going to implement a redirect by moving tuples, we have
 		 * to issue a cache invalidation against the redirection target tuple,
-		 * because its CTID will be effectively changed by the move.  It is
-		 * safe to do this now, even though we might fail before reaching the
-		 * actual page update, because a useless cache invalidation doesn't
-		 * hurt anything.  Furthermore we really really don't want to issue
-		 * the inval inside the critical section, since it has a nonzero
-		 * chance of causing an error (eg, due to running out of memory).
-		 *
-		 * XXX this is not really right because CacheInvalidateHeapTuple
-		 * expects transactional semantics, and our move isn't transactional.
-		 * FIXME!!
+		 * because its CTID will be effectively changed by the move.  Note that
+		 * CacheInvalidateHeapTuple only queues the request, it doesn't send it;
+		 * if we fail before reaching EndNonTransactionalInvalidation, nothing
+		 * happens and no harm is done.
 		 */
 		if (OffsetNumberIsValid(redirect_target))
 		{

src/backend/utils/cache/inval.c

+94 -1
@@ -80,7 +80,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/utils/cache/inval.c,v 1.83 2008/01/01 19:45:53 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/utils/cache/inval.c,v 1.84 2008/03/13 18:00:32 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -964,6 +964,99 @@ CommandEndInvalidationMessages(void)
 								&transInvalInfo->CurrentCmdInvalidMsgs);
 }
 
+
+/*
+ * BeginNonTransactionalInvalidation
+ *		Prepare for invalidation messages for nontransactional updates.
+ *
+ * A nontransactional invalidation is one that must be sent whether or not
+ * the current transaction eventually commits.  We arrange for all invals
+ * queued between this call and EndNonTransactionalInvalidation() to be sent
+ * immediately when the latter is called.
+ *
+ * Currently, this is only used by heap_page_prune(), and only when it is
+ * invoked during VACUUM FULL's first pass over a table.  We expect therefore
+ * that we are not inside a subtransaction and there are no already-pending
+ * invalidations.  This could be relaxed by setting up a new nesting level of
+ * invalidation data, but for now there's no need.  Note that heap_page_prune
+ * knows that this function does not change any state, and therefore there's
+ * no need to worry about cleaning up if there's an elog(ERROR) before
+ * reaching EndNonTransactionalInvalidation (the invals will just be thrown
+ * away if that happens).
+ */
+void
+BeginNonTransactionalInvalidation(void)
+{
+	/* Must be at top of stack */
+	Assert(transInvalInfo != NULL && transInvalInfo->parent == NULL);
+
+	/* Must not have any previously-queued activity */
+	Assert(transInvalInfo->PriorCmdInvalidMsgs.cclist == NULL);
+	Assert(transInvalInfo->PriorCmdInvalidMsgs.rclist == NULL);
+	Assert(transInvalInfo->CurrentCmdInvalidMsgs.cclist == NULL);
+	Assert(transInvalInfo->CurrentCmdInvalidMsgs.rclist == NULL);
+	Assert(transInvalInfo->RelcacheInitFileInval == false);
+}
+
+/*
+ * EndNonTransactionalInvalidation
+ *		Process queued-up invalidation messages for nontransactional updates.
+ *
+ * We expect to find messages in CurrentCmdInvalidMsgs only (else there
+ * was a CommandCounterIncrement within the "nontransactional" update).
+ * We must process them locally and send them out to the shared invalidation
+ * message queue.
+ *
+ * We must also reset the lists to empty and explicitly free memory (we can't
+ * rely on end-of-transaction cleanup for that).
+ */
+void
+EndNonTransactionalInvalidation(void)
+{
+	InvalidationChunk *chunk;
+	InvalidationChunk *next;
+
+	/* Must be at top of stack */
+	Assert(transInvalInfo != NULL && transInvalInfo->parent == NULL);
+
+	/* Must not have any prior-command messages */
+	Assert(transInvalInfo->PriorCmdInvalidMsgs.cclist == NULL);
+	Assert(transInvalInfo->PriorCmdInvalidMsgs.rclist == NULL);
+
+	/*
+	 * At present, this function is only used for CTID-changing updates;
+	 * since the relcache init file doesn't store any tuple CTIDs, we
+	 * don't have to invalidate it.  That might not be true forever
+	 * though, in which case we'd need code similar to AtEOXact_Inval.
+	 */
+
+	/* Send out the invals */
+	ProcessInvalidationMessages(&transInvalInfo->CurrentCmdInvalidMsgs,
+								LocalExecuteInvalidationMessage);
+	ProcessInvalidationMessages(&transInvalInfo->CurrentCmdInvalidMsgs,
+								SendSharedInvalidMessage);
+
+	/* Clean up and release memory */
+	for (chunk = transInvalInfo->CurrentCmdInvalidMsgs.cclist;
+		 chunk != NULL;
+		 chunk = next)
+	{
+		next = chunk->next;
+		pfree(chunk);
+	}
+	for (chunk = transInvalInfo->CurrentCmdInvalidMsgs.rclist;
+		 chunk != NULL;
+		 chunk = next)
+	{
+		next = chunk->next;
+		pfree(chunk);
+	}
+	transInvalInfo->CurrentCmdInvalidMsgs.cclist = NULL;
+	transInvalInfo->CurrentCmdInvalidMsgs.rclist = NULL;
+	transInvalInfo->RelcacheInitFileInval = false;
+}
+
+
 /*
  * CacheInvalidateHeapTuple
  *		Register the given tuple for invalidation at end of command
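
One constraint implied by the header comment on EndNonTransactionalInvalidation:
nothing between the Begin/End pair may advance the command counter, because
CommandEndInvalidationMessages() would migrate the queued messages into
PriorCmdInvalidMsgs and the Asserts in the End call would then fail. A sketch
of the misuse this guards against (hypothetical code, not part of the commit;
rel and tuple are placeholders):

	BeginNonTransactionalInvalidation();
	CacheInvalidateHeapTuple(rel, tuple);	/* queued in CurrentCmdInvalidMsgs */
	CommandCounterIncrement();				/* WRONG: would move the queued inval
											 * into PriorCmdInvalidMsgs, tripping
											 * the Assert in the End call */
	EndNonTransactionalInvalidation();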

src/include/utils/inval.h

+5 -1
@@ -7,7 +7,7 @@
  * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/include/utils/inval.h,v 1.41 2008/01/01 19:45:59 momjian Exp $
+ * $PostgreSQL: pgsql/src/include/utils/inval.h,v 1.42 2008/03/13 18:00:32 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -37,6 +37,10 @@ extern void PostPrepare_Inval(void);
 
 extern void CommandEndInvalidationMessages(void);
 
+extern void BeginNonTransactionalInvalidation(void);
+
+extern void EndNonTransactionalInvalidation(void);
+
 extern void CacheInvalidateHeapTuple(Relation relation, HeapTuple tuple);
 
 extern void CacheInvalidateRelcache(Relation relation);
