Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 53fe7e6

Browse files
committed
Harden postgres_fdw tests against unexpected cache flushes.
postgres_fdw will close its remote session if an sinval cache reset occurs, since it's possible that that means some FDW parameters changed. We had two tests that were trying to ensure that the session remains alive by setting debug_discard_caches = 0; but that's not sufficient. Even though the tests seem stable enough in the buildfarm, they flap a lot under CI. In the first test, which is checking the ability to recover from a lost connection, we can stabilize the results by just not caring whether pg_terminate_backend() finds a victim backend. If a reset did happen, there won't be a session to terminate anymore, but the test can proceed anyway. (Arguably, we are then not testing the unintentional-disconnect case, but as long as that scenario is exercised in most runs I think it's fine; testing the reset-driven case is of value too.) In the second test, which is trying to verify the application_name displayed in pg_stat_activity by a remote session, we had a race condition in that the remote session might go away before we can fetch its pg_stat_activity entry. We can close that race and make the test more certainly test what it intends to by arranging things so that the remote session itself fetches its pg_stat_activity entry (based on PID rather than a somewhat-circular assumption about the application name). Both tests now demonstrably pass under debug_discard_caches = 1, so we can remove that hack. Back-patch into relevant back branches. Discussion: https://postgr.es/m/20230226194340.u44bkfgyz64c67i6@awork3.anarazel.de
1 parent 46647cc commit 53fe7e6

File tree

2 files changed

+64
-81
lines changed

2 files changed

+64
-81
lines changed

contrib/postgres_fdw/expected/postgres_fdw.out

Lines changed: 38 additions & 56 deletions
Original file line numberDiff line numberDiff line change
@@ -9926,11 +9926,6 @@ WARNING: there is no transaction in progress
99269926
-- Change application_name of remote connection to special one
99279927
-- so that we can easily terminate the connection later.
99289928
ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check');
9929-
-- If debug_discard_caches is active, it results in
9930-
-- dropping remote connections after every transaction, making it
9931-
-- impossible to test termination meaningfully. So turn that off
9932-
-- for this test.
9933-
SET debug_discard_caches = 0;
99349929
-- Make sure we have a remote connection.
99359930
SELECT 1 FROM ft1 LIMIT 1;
99369931
?column?
@@ -9939,13 +9934,12 @@ SELECT 1 FROM ft1 LIMIT 1;
99399934
(1 row)
99409935

99419936
-- Terminate the remote connection and wait for the termination to complete.
9942-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
9937+
-- (If a cache flush happens, the remote connection might have already been
9938+
-- dropped; so code this step in a way that doesn't fail if no connection.)
9939+
DO $$ BEGIN
9940+
PERFORM pg_terminate_backend(pid, 180000) FROM pg_stat_activity
99439941
WHERE application_name = 'fdw_retry_check';
9944-
pg_terminate_backend
9945-
----------------------
9946-
t
9947-
(1 row)
9948-
9942+
END $$;
99499943
-- This query should detect the broken connection when starting new remote
99509944
-- transaction, reestablish new connection, and then succeed.
99519945
BEGIN;
@@ -9958,21 +9952,17 @@ SELECT 1 FROM ft1 LIMIT 1;
99589952
-- If we detect the broken connection when starting a new remote
99599953
-- subtransaction, we should fail instead of establishing a new connection.
99609954
-- Terminate the remote connection and wait for the termination to complete.
9961-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
9955+
DO $$ BEGIN
9956+
PERFORM pg_terminate_backend(pid, 180000) FROM pg_stat_activity
99629957
WHERE application_name = 'fdw_retry_check';
9963-
pg_terminate_backend
9964-
----------------------
9965-
t
9966-
(1 row)
9967-
9958+
END $$;
99689959
SAVEPOINT s;
99699960
-- The text of the error might vary across platforms, so only show SQLSTATE.
99709961
\set VERBOSITY sqlstate
99719962
SELECT 1 FROM ft1 LIMIT 1; -- should fail
99729963
ERROR: 08006
99739964
\set VERBOSITY default
99749965
COMMIT;
9975-
RESET debug_discard_caches;
99769966
-- =============================================================================
99779967
-- test connection invalidation cases and postgres_fdw_get_connections function
99789968
-- =============================================================================
@@ -11629,74 +11619,66 @@ HINT: There are no valid options in this context.
1162911619
-- ===================================================================
1163011620
-- test postgres_fdw.application_name GUC
1163111621
-- ===================================================================
11632-
--- Turn debug_discard_caches off for this test to make sure that
11633-
--- the remote connection is alive when checking its application_name.
11634-
SET debug_discard_caches = 0;
11622+
-- To avoid race conditions in checking the remote session's application_name,
11623+
-- use this view to make the remote session itself read its application_name.
11624+
CREATE VIEW my_application_name AS
11625+
SELECT application_name FROM pg_stat_activity WHERE pid = pg_backend_pid();
11626+
CREATE FOREIGN TABLE remote_application_name (application_name text)
11627+
SERVER loopback2
11628+
OPTIONS (schema_name 'public', table_name 'my_application_name');
11629+
SELECT count(*) FROM remote_application_name;
11630+
count
11631+
-------
11632+
1
11633+
(1 row)
11634+
1163511635
-- Specify escape sequences in application_name option of a server
1163611636
-- object so as to test that they are replaced with status information
11637-
-- expectedly.
11637+
-- expectedly. Note that we are also relying on ALTER SERVER to force
11638+
-- the remote session to be restarted with its new application name.
1163811639
--
1163911640
-- Since pg_stat_activity.application_name may be truncated to less than
1164011641
-- NAMEDATALEN characters, note that substring() needs to be used
1164111642
-- at the condition of test query to make sure that the string consisting
1164211643
-- of database name and process ID is also less than that.
1164311644
ALTER SERVER loopback2 OPTIONS (application_name 'fdw_%d%p');
11644-
SELECT 1 FROM ft6 LIMIT 1;
11645-
?column?
11646-
----------
11647-
1
11648-
(1 row)
11649-
11650-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
11645+
SELECT count(*) FROM remote_application_name
1165111646
WHERE application_name =
1165211647
substring('fdw_' || current_database() || pg_backend_pid() for
1165311648
current_setting('max_identifier_length')::int);
11654-
pg_terminate_backend
11655-
----------------------
11656-
t
11649+
count
11650+
-------
11651+
1
1165711652
(1 row)
1165811653

1165911654
-- postgres_fdw.application_name overrides application_name option
1166011655
-- of a server object if both settings are present.
11656+
ALTER SERVER loopback2 OPTIONS (SET application_name 'fdw_wrong');
1166111657
SET postgres_fdw.application_name TO 'fdw_%a%u%%';
11662-
SELECT 1 FROM ft6 LIMIT 1;
11663-
?column?
11664-
----------
11665-
1
11666-
(1 row)
11667-
11668-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
11658+
SELECT count(*) FROM remote_application_name
1166911659
WHERE application_name =
1167011660
substring('fdw_' || current_setting('application_name') ||
1167111661
CURRENT_USER || '%' for current_setting('max_identifier_length')::int);
11672-
pg_terminate_backend
11673-
----------------------
11674-
t
11662+
count
11663+
-------
11664+
1
1167511665
(1 row)
1167611666

11667+
RESET postgres_fdw.application_name;
1167711668
-- Test %c (session ID) and %C (cluster name) escape sequences.
11678-
SET postgres_fdw.application_name TO 'fdw_%C%c';
11679-
SELECT 1 FROM ft6 LIMIT 1;
11680-
?column?
11681-
----------
11682-
1
11683-
(1 row)
11684-
11685-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
11669+
ALTER SERVER loopback2 OPTIONS (SET application_name 'fdw_%C%c');
11670+
SELECT count(*) FROM remote_application_name
1168611671
WHERE application_name =
1168711672
substring('fdw_' || current_setting('cluster_name') ||
1168811673
to_hex(trunc(EXTRACT(EPOCH FROM (SELECT backend_start FROM
1168911674
pg_stat_get_activity(pg_backend_pid()))))::integer) || '.' ||
1169011675
to_hex(pg_backend_pid())
1169111676
for current_setting('max_identifier_length')::int);
11692-
pg_terminate_backend
11693-
----------------------
11694-
t
11677+
count
11678+
-------
11679+
1
1169511680
(1 row)
1169611681

11697-
--Clean up
11698-
RESET postgres_fdw.application_name;
11699-
RESET debug_discard_caches;
1170011682
-- ===================================================================
1170111683
-- test parallel commit
1170211684
-- ===================================================================

contrib/postgres_fdw/sql/postgres_fdw.sql

Lines changed: 26 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -3100,18 +3100,16 @@ ROLLBACK;
31003100
-- so that we can easily terminate the connection later.
31013101
ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check');
31023102

3103-
-- If debug_discard_caches is active, it results in
3104-
-- dropping remote connections after every transaction, making it
3105-
-- impossible to test termination meaningfully. So turn that off
3106-
-- for this test.
3107-
SET debug_discard_caches = 0;
3108-
31093103
-- Make sure we have a remote connection.
31103104
SELECT 1 FROM ft1 LIMIT 1;
31113105

31123106
-- Terminate the remote connection and wait for the termination to complete.
3113-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
3107+
-- (If a cache flush happens, the remote connection might have already been
3108+
-- dropped; so code this step in a way that doesn't fail if no connection.)
3109+
DO $$ BEGIN
3110+
PERFORM pg_terminate_backend(pid, 180000) FROM pg_stat_activity
31143111
WHERE application_name = 'fdw_retry_check';
3112+
END $$;
31153113

31163114
-- This query should detect the broken connection when starting new remote
31173115
-- transaction, reestablish new connection, and then succeed.
@@ -3121,17 +3119,17 @@ SELECT 1 FROM ft1 LIMIT 1;
31213119
-- If we detect the broken connection when starting a new remote
31223120
-- subtransaction, we should fail instead of establishing a new connection.
31233121
-- Terminate the remote connection and wait for the termination to complete.
3124-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
3122+
DO $$ BEGIN
3123+
PERFORM pg_terminate_backend(pid, 180000) FROM pg_stat_activity
31253124
WHERE application_name = 'fdw_retry_check';
3125+
END $$;
31263126
SAVEPOINT s;
31273127
-- The text of the error might vary across platforms, so only show SQLSTATE.
31283128
\set VERBOSITY sqlstate
31293129
SELECT 1 FROM ft1 LIMIT 1; -- should fail
31303130
\set VERBOSITY default
31313131
COMMIT;
31323132

3133-
RESET debug_discard_caches;
3134-
31353133
-- =============================================================================
31363134
-- test connection invalidation cases and postgres_fdw_get_connections function
31373135
-- =============================================================================
@@ -3850,49 +3848,52 @@ ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
38503848
-- ===================================================================
38513849
-- test postgres_fdw.application_name GUC
38523850
-- ===================================================================
3853-
--- Turn debug_discard_caches off for this test to make sure that
3854-
--- the remote connection is alive when checking its application_name.
3855-
SET debug_discard_caches = 0;
3851+
-- To avoid race conditions in checking the remote session's application_name,
3852+
-- use this view to make the remote session itself read its application_name.
3853+
CREATE VIEW my_application_name AS
3854+
SELECT application_name FROM pg_stat_activity WHERE pid = pg_backend_pid();
3855+
3856+
CREATE FOREIGN TABLE remote_application_name (application_name text)
3857+
SERVER loopback2
3858+
OPTIONS (schema_name 'public', table_name 'my_application_name');
3859+
3860+
SELECT count(*) FROM remote_application_name;
38563861

38573862
-- Specify escape sequences in application_name option of a server
38583863
-- object so as to test that they are replaced with status information
3859-
-- expectedly.
3864+
-- expectedly. Note that we are also relying on ALTER SERVER to force
3865+
-- the remote session to be restarted with its new application name.
38603866
--
38613867
-- Since pg_stat_activity.application_name may be truncated to less than
38623868
-- NAMEDATALEN characters, note that substring() needs to be used
38633869
-- at the condition of test query to make sure that the string consisting
38643870
-- of database name and process ID is also less than that.
38653871
ALTER SERVER loopback2 OPTIONS (application_name 'fdw_%d%p');
3866-
SELECT 1 FROM ft6 LIMIT 1;
3867-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
3872+
SELECT count(*) FROM remote_application_name
38683873
WHERE application_name =
38693874
substring('fdw_' || current_database() || pg_backend_pid() for
38703875
current_setting('max_identifier_length')::int);
38713876

38723877
-- postgres_fdw.application_name overrides application_name option
38733878
-- of a server object if both settings are present.
3879+
ALTER SERVER loopback2 OPTIONS (SET application_name 'fdw_wrong');
38743880
SET postgres_fdw.application_name TO 'fdw_%a%u%%';
3875-
SELECT 1 FROM ft6 LIMIT 1;
3876-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
3881+
SELECT count(*) FROM remote_application_name
38773882
WHERE application_name =
38783883
substring('fdw_' || current_setting('application_name') ||
38793884
CURRENT_USER || '%' for current_setting('max_identifier_length')::int);
3885+
RESET postgres_fdw.application_name;
38803886

38813887
-- Test %c (session ID) and %C (cluster name) escape sequences.
3882-
SET postgres_fdw.application_name TO 'fdw_%C%c';
3883-
SELECT 1 FROM ft6 LIMIT 1;
3884-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
3888+
ALTER SERVER loopback2 OPTIONS (SET application_name 'fdw_%C%c');
3889+
SELECT count(*) FROM remote_application_name
38853890
WHERE application_name =
38863891
substring('fdw_' || current_setting('cluster_name') ||
38873892
to_hex(trunc(EXTRACT(EPOCH FROM (SELECT backend_start FROM
38883893
pg_stat_get_activity(pg_backend_pid()))))::integer) || '.' ||
38893894
to_hex(pg_backend_pid())
38903895
for current_setting('max_identifier_length')::int);
38913896

3892-
--Clean up
3893-
RESET postgres_fdw.application_name;
3894-
RESET debug_discard_caches;
3895-
38963897
-- ===================================================================
38973898
-- test parallel commit
38983899
-- ===================================================================

0 commit comments

Comments
 (0)