
Commit 7c99dc5

Amit Kapila committed
Fix ALTER SUBSCRIPTION ... SET PUBLICATION ... command.

The problem is that ALTER SUBSCRIPTION ... SET PUBLICATION ... restarts the apply worker, and after the restart the apply worker uses the existing slot and replication origin corresponding to the subscription. It is possible that the origin was not updated before the restart, so the WAL start location points to a position before the publication named by SET PUBLICATION existed. That can lead to an error like: "ERROR: publication "pub1" does not exist". Once this error occurs, the apply worker can never proceed and always returns the same error.

We decided to skip loading the publication if the publication does not exist. The publication is loaded later, and the relation entry is updated, when the publication gets created.

We decided not to backpatch this as this is a behaviour change, and we don't see field reports. This problem was found through intermittent buildfarm failures.

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/flat/CALDaNm0-n8FGAorM%2BbTxkzn%2BAOUyx5%3DL_XmnvOP6T24%2B-NcBKg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1+T-ETXeRM4DHWzGxBpKafLCp__5bPA_QZfFQp7-0wj4Q@mail.gmail.com

1 parent 4618045 commit 7c99dc5

2 files changed, +57 -3 lines changed
src/backend/replication/pgoutput/pgoutput.c

+14 -2
@@ -1764,6 +1764,11 @@ pgoutput_shutdown(LogicalDecodingContext *ctx)
 
 /*
  * Load publications from the list of publication names.
+ *
+ * Here, we skip the publications that don't exist yet. This will allow us
+ * to silently continue the replication in the absence of a missing publication.
+ * This is required because we allow the users to create publications after they
+ * have specified the required publications at the time of replication start.
  */
 static List *
 LoadPublications(List *pubnames)
@@ -1774,9 +1779,16 @@ LoadPublications(List *pubnames)
 	foreach(lc, pubnames)
 	{
 		char	   *pubname = (char *) lfirst(lc);
-		Publication *pub = GetPublicationByName(pubname, false);
+		Publication *pub = GetPublicationByName(pubname, true);
 
-		result = lappend(result, pub);
+		if (pub)
+			result = lappend(result, pub);
+		else
+			ereport(WARNING,
+					errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+					errmsg("skipped loading publication: %s", pubname),
+					errdetail("The publication does not exist at this point in the WAL."),
+					errhint("Create the publication if it does not exist."));
 	}
 
 	return result;
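
The change above switches the lookup to a "missing is OK" mode and filters out names that do not resolve. As a rough illustration of that pattern outside the server, here is a minimal standalone sketch. Everything in it is hypothetical: `known_pubs`, `lookup_pub`, and `load_pubs` are stand-ins invented for this example, plain arrays replace PostgreSQL's `List`, and `fprintf` replaces `ereport`. It is not the pgoutput code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the set of publications that currently exist. */
static const char *const known_pubs[] = {"tap_pub_1", "tap_pub_2"};

/*
 * Hypothetical lookup following the missing_ok convention: returns the
 * catalog entry, or NULL when the name is unknown. With missing_ok == 0 the
 * real server would raise an error; this sketch only prints a message.
 */
static const char *
lookup_pub(const char *name, int missing_ok)
{
    for (size_t i = 0; i < sizeof(known_pubs) / sizeof(known_pubs[0]); i++)
        if (strcmp(known_pubs[i], name) == 0)
            return known_pubs[i];

    if (!missing_ok)
        fprintf(stderr, "ERROR: publication \"%s\" does not exist\n", name);
    return NULL;
}

/*
 * Store the publications that exist in out[] and warn about the others.
 * out[] must have room for n entries. Returns the number of entries stored.
 */
static size_t
load_pubs(const char *const *names, size_t n, const char **out)
{
    size_t loaded = 0;

    assert(names != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
    {
        const char *pub = lookup_pub(names[i], 1);

        if (pub)
            out[loaded++] = pub;
        else
            fprintf(stderr, "WARNING: skipped loading publication: %s\n",
                    names[i]);
    }
    return loaded;
}
```

Calling `load_pubs` with the names `tap_pub_1`, `tap_pub_3`, `tap_pub_2` stores the two known entries in order and prints one warning for `tap_pub_3`, instead of stopping at the first unknown name.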

src/test/subscription/t/024_add_drop_pub.pl

+43 -1
@@ -1,7 +1,9 @@
 
 # Copyright (c) 2021-2025, PostgreSQL Global Development Group
 
-# This test checks behaviour of ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION
+# This test checks behaviour of ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION and
+# ensures that creating a publication associated with a subscription at a later
+# point of time does not break logical replication.
 use strict;
 use warnings FATAL => 'all';
 use PostgreSQL::Test::Cluster;
@@ -80,6 +82,46 @@
 	"SELECT count(*), min(a), max(a) FROM tab_1");
 is($result, qq(20|1|10), 'check initial data is copied to subscriber');
 
+# Ensure that setting a missing publication to the subscription does not
+# disrupt existing logical replication. Instead, it should log a warning
+# while allowing replication to continue. Additionally, verify that replication
+# resumes after the missing publication is created for the publication table.
+
+# Create table on publisher and subscriber
+$node_publisher->safe_psql('postgres', "CREATE TABLE tab_3 (a int)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_3 (a int)");
+
+# Set the subscription with a missing publication
+$node_subscriber->safe_psql('postgres',
+	"ALTER SUBSCRIPTION tap_sub SET PUBLICATION tap_pub_3");
+
+my $offset = -s $node_publisher->logfile;
+
+$node_publisher->safe_psql('postgres',"INSERT INTO tab_3 values(1)");
+
+# Verify that a warning is logged.
+$node_publisher->wait_for_log(
+	qr/WARNING: ( [A-Z0-9]+:)? skipped loading publication: tap_pub_3/, $offset);
+
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION tap_pub_3 FOR TABLE tab_3");
+
+$node_subscriber->safe_psql('postgres',
+	"ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION");
+
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub');
+
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_3 values(2)");
+
+$node_publisher->wait_for_catchup('tap_sub');
+
+# Verify that the insert operation gets replicated to subscriber after
+# publication is created.
+$result = $node_subscriber->safe_psql('postgres',
+	"SELECT * FROM tab_3");
+is($result, qq(1
+2), 'check that the incremental data is replicated after the publication is created');
+
 # shutdown
 $node_subscriber->stop('fast');
 $node_publisher->stop('fast');
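
The log pattern in the test has an optional group, `( [A-Z0-9]+:)?`, which appears intended to tolerate a SQLSTATE code between the severity and the message. The sketch below checks that reading with POSIX extended regular expressions. The helper `log_line_matches` is a hypothetical name, and the sample log lines are assumptions about how the server formats the warning at default and verbose error verbosity, not output captured from a real run.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* The pattern from the test, written as a POSIX extended regex. */
static const char *const skip_warning_pattern =
    "WARNING: ( [A-Z0-9]+:)? skipped loading publication: tap_pub_3";

/*
 * Hypothetical helper: returns 1 if line contains a match for the POSIX
 * extended regex pattern, 0 if it does not or if the pattern fails to compile.
 */
static int
log_line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int     matched;

    assert(pattern != NULL && line != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;

    matched = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

Under those assumptions, the pattern matches both `WARNING:  skipped loading publication: tap_pub_3` (two spaces, no code) and `WARNING:  55000: skipped loading publication: tap_pub_3` (with a SQLSTATE), and does not match a line naming a different publication.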