Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Don't use Asserts to check for violations of replication protocol.
authorTom Lane <tgl@sss.pgh.pa.us>
Sat, 12 Jun 2021 16:59:15 +0000 (12:59 -0400)
committerTom Lane <tgl@sss.pgh.pa.us>
Sat, 12 Jun 2021 16:59:15 +0000 (12:59 -0400)
Using an Assert to check the validity of incoming messages is an
extremely poor decision.  In a debug build, it should not be that easy
for a broken or malicious remote client to crash the logrep worker.
The consequences could be even worse in non-debug builds, which will
fail to make such checks at all, leading to who-knows-what misbehavior.
Hence, promote every Assert that could possibly be triggered by wrong
or out-of-order replication messages to a full test-and-ereport.

To avoid bloating the set of messages the translation team has to cope
with, establish a policy that replication protocol violation error
reports don't need to be translated.  Hence, all the new messages here
use errmsg_internal().  A couple of old messages are changed likewise
for consistency.

Along the way, fix some non-idiomatic or outright wrong uses of
hash_search().

Most of these mistakes are new with the "streaming replication"
patch (commit 464824323), but a couple go back a long way.
Back-patch as appropriate.

Discussion: https://postgr.es/m/1719083.1623351052@sss.pgh.pa.us

src/backend/replication/logical/reorderbuffer.c
src/backend/replication/logical/worker.c

index 941ca9afe49a41d9e68754e36d6926b54ed04fc0..c1def5ca3f9d08e62b20e3aad8f1b72a657d5625 100644 (file)
@@ -1331,7 +1331,7 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
        ent = (ReorderBufferTupleCidEnt *)
            hash_search(txn->tuplecid_hash,
                        (void *) &key,
-                       HASH_ENTER | HASH_FIND,
+                       HASH_ENTER,
                        &found);
        if (!found)
        {
index f6650a2ee927b2f2aaa44e7c73d1a9c9a76f2dc6..571b0e717e155a1132b6366c59f02d80ea0c009c 100644 (file)
@@ -501,7 +501,14 @@ apply_handle_commit(StringInfo s)
 
    logicalrep_read_commit(s, &commit_data);
 
-   Assert(commit_data.commit_lsn == remote_final_lsn);
+   if (commit_data.commit_lsn != remote_final_lsn)
+       ereport(ERROR,
+               (errcode(ERRCODE_PROTOCOL_VIOLATION),
+                errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
+                                (uint32) (commit_data.commit_lsn >> 32),
+                                (uint32) commit_data.commit_lsn,
+                                (uint32) (remote_final_lsn >> 32),
+                                (uint32) remote_final_lsn)));
 
    /* The synchronization worker runs in single transaction. */
    if (IsTransactionState() && !am_tablesync_worker())