@@ -470,6 +470,30 @@ recovery()
Check consistency of the cluster state against current metadata and perform recovery,
if needed (reconfigure LR channels, repair FDW, etc.).
+ ``` plpgsql
+ monitor(deadlock_check_timeout_sec int = 5, rm_node_timeout_sec int = 60)
+ ```
+ Monitor the cluster for distributed deadlocks and node failures.
+ This function is intended to be executed at the shardlord; when launched at any other node, it is redirected to the shardlord.
+ It starts an infinite loop which polls all cluster nodes, collecting local *lock graphs* from them.
+ The poll period is specified by the `deadlock_check_timeout_sec` parameter (default value is 5 seconds).
+ Local lock graphs are combined into a global lock graph, which is analyzed for the presence of loops.
+ A loop in the lock graph means a distributed deadlock. The monitor function tries to resolve the deadlock by canceling one or more
+ backends involved in the deadlock loop (using the `pg_cancel_backend` function, which does not terminate the backend but tries to
+ cancel its current query).
+ Since not all backends are blocked in an active query state, the cancel may need to be sent several times.
+ Right now the canceled backend is chosen randomly within the deadlock loop.
+
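The graph analysis just described can be sketched in a few lines. This is an illustrative model, not pg_shardman's actual implementation: the wait-for graph is a mapping from a waiting backend (a hypothetical `(node, pid)` pair) to the backend holding the lock, and the helper names are invented:

```python
import random

def find_deadlock_loop(wait_for):
    """Find a cycle in a wait-for graph given as {waiter: holder}.
    Vertices are arbitrary hashable backend ids, e.g. (node, pid) pairs.
    Returns the cycle as a list of vertices, or None if there is no loop."""
    for start in wait_for:
        path, seen = [], set()
        v = start
        while v in wait_for and v not in seen:
            seen.add(v)
            path.append(v)
            v = wait_for[v]
        if v in seen:                    # we walked back into our own path
            return path[path.index(v):]  # keep only the loop, drop the tail
    return None

def choose_victim(loop):
    # monitor() currently picks the backend to cancel at random
    return random.choice(loop)

# Backend 100 on n1 and backend 200 on n2 wait for each other: deadlock.
graph = {("n1", 100): ("n2", 200),
         ("n2", 200): ("n1", 100),
         ("n3", 300): ("n1", 100)}
loop = find_deadlock_loop(graph)  # [("n1", 100), ("n2", 200)]
victim = choose_victim(loop)      # would be handed to pg_cancel_backend
```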
+ Local lock graphs collected from all nodes do not form a consistent global snapshot, so there is a possibility of false deadlocks:
+ edges in a deadlock loop may correspond to different moments of time. To prevent false deadlock detection, the monitor function
+ doesn't react to a detected deadlock immediately. Instead, the deadlock loop found at the previous iteration is compared with the
+ current deadlock loop, and only if they are equal is the deadlock reported and recovery performed.
+
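The two-iteration confirmation requires that two sightings of the same cycle compare equal even if the traversal entered the cycle at different vertices. One way to do that is to normalize each loop by rotation; a sketch under the same illustrative model (these helpers are mine, not pg_shardman code):

```python
def normalize_loop(loop):
    """Rotate a cycle so it starts at its smallest vertex; two detections
    of the same loop then compare equal regardless of entry point.
    Assumes both sightings traverse the wait-for edges in the same
    direction, which they do, since the edges are directed."""
    if not loop:
        return ()
    i = loop.index(min(loop))
    return tuple(loop[i:] + loop[:i])

def deadlock_confirmed(previous_loop, current_loop):
    """React only when the same loop is seen on two consecutive polls;
    a single sighting may be a false positive, because the local lock
    graphs are not a consistent snapshot."""
    return (previous_loop is not None and current_loop is not None
            and normalize_loop(previous_loop) == normalize_loop(current_loop))
```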
+ If some node is unreachable, the monitor function prints a corresponding error message and retries access until the
+ `rm_node_timeout_sec` timeout expires. After that, the node is removed from the cluster using the `shardman.rm_node` function.
+ If the redundancy level is non-zero, primary partitions from the disabled node are replaced with replicas.
+
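The failure handling above is essentially a retry loop with a deadline. A testable sketch, with hypothetical `ping` and `rm_node` callbacks standing in for a real liveness check and for `shardman.rm_node`, and an injectable clock so the policy can be exercised without a cluster:

```python
import time

def watch_unreachable_node(node_id, ping, rm_node,
                           rm_node_timeout_sec=60, retry_sec=5,
                           clock=time.monotonic, sleep=time.sleep):
    """Keep retrying an unreachable node; once rm_node_timeout_sec has
    passed without a successful ping, remove the node from the cluster.
    Returns True if the node was removed, False if it came back."""
    deadline = clock() + rm_node_timeout_sec
    while clock() < deadline:
        if ping(node_id):
            return False                       # node is reachable again
        print(f"node {node_id} is unreachable, will retry")
        sleep(retry_sec)
    rm_node(node_id)                           # stands in for shardman.rm_node
    return True
```

With non-zero redundancy, removal is not data loss: replicas of the removed node's primary partitions take over, so the retry-then-remove policy trades a bounded outage for automatic failover.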
## Transactions
When using vanilla PostgreSQL, local changes are handled by PostgreSQL as usual
-- so if your queries touch only one node, you are safe. Distributed
@@ -489,6 +513,5 @@ be made new shardlord at any moment.
## Some limitations:
* You should not touch `sync_standby_names` manually while using pg_shardman.
* The shardlord itself can't be a worker node for now.
- * ALTER TABLE for sharded is mostly not supported.
* All [limitations of `pg_pathman`](https://github.com/postgrespro/pg_pathman/wiki/Known-limitations),
e.g. we don't support global primary keys and foreign keys to sharded tables.