MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11

MySQL Database Architectures
Disaster Recovery Solutions
Introducing MySQL InnoDB ClusterSet

Kenny Gryp
MySQL Product Manager

Safe Harbor Statement
The following is intended to outline our general product direction.
It is intended for information purposes only, and may not be incorporated
into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing decisions.
The development, release and timing of any features or functionality described
for Oracle's products remain at the sole discretion of Oracle.
Copyright © 2021 Oracle and/or its affiliates.

IT Disasters & Outages: Primary Causes

On-site power failure is the biggest cause of significant outages

IT Disasters & Outages: Costs are Rising
Over half of those who had experienced an outage reported a cost of more than $100,000.

IT Disasters and Outages: Examples

5-hour computer outage cost us $150 million. The airline eventually
canceled about 1,000 flights on the day of the outage and grounded an
additional 1,000 flights over the following two days.

Millions of websites offline after fire at French cloud services firm. The
fire is expected to cost the company more than €105 million.

Tens of thousands of passengers were stranded in cities around the
world due to cancellation of about 130 flights and the delay of 200.

Millions of bank customers were unable to access online accounts. The
bank took almost 2 days to recover and get back to normal functioning.

Past, Present & Future

'Past' - Manual
Setting up a replication topology was usually done manually, taking
many steps, including user management, restoring backups, configuring
replication...
MySQL only offered the technical pieces, leaving it up to the user to
set up an (always customized) architecture
This even required other software, bringing lots of work for DBAs and
experts, who spent their time automating and integrating their
customized architecture

Present - Solutions!
2016 - MySQL InnoDB Cluster
RPO = 0
RTO = seconds (automatic failover)
MySQL Group Replication: Automatic membership changes,
network partition handling, consistency...
MySQL Shell to provide a powerful interface that helps in
automating and integrating all components
InnoDB CLONE to automatically provision members, fully integrated
in InnoDB
MySQL Router
MySQL Server

Present - Solutions!
2020 - MySQL InnoDB ReplicaSet
RPO != 0
RTO = minutes (manual failover)
'Classic', 'asynchronous' replication-based solution, fully integrated
MySQL Shell
MySQL Router
MySQL Server

Concepts - RTO & RPO
RTO: Recovery Time Objective
How long does it take to recover from a single failure
RPO: Recovery Point Objective
How much data can be lost when a failure occurs
Types of Failure:

High Availability: Single Server Failure, Network Partition

Disaster Recovery: Full Region/Network Failure

Human Error: Little Bobby Tables
Business Requirements

MySQL InnoDB Cluster & MySQL InnoDB ReplicaSet are solutions
implemented to meet various High Availability requirements.
Introducing MySQL InnoDB ClusterSet,
which is one of the key options to meet Disaster Recovery requirements.

MySQL InnoDB ClusterSet

High Availability (Failure Within a Region)
RPO = 0
RTO = seconds (automatic failover)
Disaster Recovery (Region Failure)
RPO != 0
RTO = minutes or more (manual failover)
No write performance impact
Features
Easy to use
Familiar interface and usability

mysqlsh, CLONE, ...
Add/remove nodes/clusters online
Router integration, no need to reconfigure application if the topology changes
One or more REPLICA MySQL InnoDB Clusters attached to a PRIMARY MySQL InnoDB Cluster

MySQL InnoDB ClusterSet - 3 Datacenters

MySQL InnoDB ClusterSet - Not every Cluster has to be 3 nodes
Each replica is a MySQL InnoDB Cluster that can have 1-9 members.

Replication Enhancements
Features in replication that made ClusterSet possible:
8.0.22: Automatic connection failover for Async Replication Channels
8.0.23: Automatic connection failover for Async Replication Channels using Group Replication
8.0.24: Make skip-replica-start a global, persistable, read-only system variable.
8.0.26: Group Replication Member actions (configurable super_read_only on PRIMARY member)
8.0.26: Specify the UUID used to log View_change_log_event
8.0.27: Asynchronous Replication Channel configuration automatically follows the PRIMARY member

Configuration
Commands

Create MySQL InnoDB Cluster
Start with setting up a regular MySQL InnoDB Cluster:
mysqlsh> \c root@localhost:3331
mysqlsh> \sql create schema sbtest;
mysqlsh> bru = dba.createCluster("BRU")
mysqlsh> bru.addInstance('localhost:3332')
mysqlsh> bru.addInstance('localhost:3333')
mysqlsh> bru.status()
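
The steps above can be wrapped in a small helper so they are repeatable. This is a minimal sketch, not an official recipe: `deployCluster` and its parameters are hypothetical names, and `dba` is passed in explicitly so the function can be exercised outside MySQL Shell. It only relies on `dba.createCluster()` and `Cluster.addInstance()` as used on the slide.

```javascript
// Sketch: create a cluster on the instance the shell is connected to,
// then add the remaining members one by one.
// `deployCluster` is a hypothetical helper, not part of AdminAPI.
function deployCluster(dba, name, extraInstances) {
  // createCluster() acts on the current global session (here: localhost:3331).
  const cluster = dba.createCluster(name);
  for (const address of extraInstances) {
    // Each added instance is provisioned and joined to the group.
    cluster.addInstance(address);
  }
  return cluster;
}
```

In MySQL Shell's JavaScript mode, connected to the first instance, this could be called as `bru = deployCluster(dba, 'BRU', ['localhost:3332', 'localhost:3333'])`.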

Create ClusterSet
mysqlsh> clusterset = bru.createClusterSet('clusterset')
A new ClusterSet will be created based on the Cluster 'BRU'.
- Validating Cluster 'BRU' for ClusterSet compliance.
- Creating InnoDB ClusterSet 'clusterset' on 'BRU'...
- Updating metadata...
ClusterSet successfully created. Use ClusterSet.createReplicaCluster() to add Replica Clusters to it.
<ClusterSet:cluster>

Check ClusterSet Status
mysqlsh> clusterset.status()
{
"clusters": {
"BRU": {
"clusterRole": "PRIMARY",
"globalStatus": "OK",
"primary": "127.0.0.1:3331"
}
},
"domainName": "clusterset",
"globalPrimaryInstance": "127.0.0.1:3331",
"primaryCluster": "BRU",
"status": "HEALTHY",
"statusText": "All Clusters available."
}
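
For monitoring, the status document above can be reduced to a short summary. The sketch below assumes the status is available as a plain JavaScript object (for example, parsed from saved JSON output) and reads only the fields shown above; `summarizeClusterSet` is a hypothetical name.

```javascript
// Sketch: reduce a ClusterSet status document to a few health facts.
// Only fields visible in the output above are used.
function summarizeClusterSet(status) {
  // Collect the names of clusters whose globalStatus is anything but "OK".
  const unhealthyClusters = Object.entries(status.clusters)
    .filter(([, info]) => info.globalStatus !== "OK")
    .map(([name]) => name);
  return {
    healthy: status.status === "HEALTHY" && unhealthyClusters.length === 0,
    primaryCluster: status.primaryCluster,
    unhealthyClusters,
  };
}
```

Treating every value other than "OK" as unhealthy is a deliberately conservative choice for this sketch.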

Add Replica Cluster
Supports incremental recovery (binlog) & full recovery (CLONE)
mysqlsh> lis = clusterset.createReplicaCluster('localhost:4441', 'LIS')
mysqlsh> lis.addInstance('localhost:4442')
mysqlsh> lis.addInstance('localhost:4443')
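
The choice between incremental and full recovery can be made explicit through the `recoveryMethod` option. The following is a sketch under that assumption; `addReplicaCluster` is a hypothetical helper, and the `clusterset` handle is passed in so the function does not depend on shell globals.

```javascript
// Sketch: create a Replica Cluster from a seed instance and add more members,
// stating the provisioning method instead of relying on the default.
// recoveryMethod is assumed to be "auto", "clone" or "incremental".
function addReplicaCluster(clusterset, name, seedInstance, extraInstances, recoveryMethod) {
  const options = { recoveryMethod: recoveryMethod || "auto" };
  const replica = clusterset.createReplicaCluster(seedInstance, name, options);
  for (const address of extraInstances) {
    replica.addInstance(address, options);
  }
  return replica;
}
```

For example, `lis = addReplicaCluster(clusterset, 'LIS', 'localhost:4441', ['localhost:4442', 'localhost:4443'], 'clone')` would request a full CLONE-based provisioning for every member.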

Check ClusterSet Status
mysqlsh> bru.status()
mysqlsh> lis.status()
{
"clusters": {
"BRU": {
"primary": "127.0.0.1:3331"
},
"LIS": {
"clusterRole": "REPLICA",
"clusterSetReplicationStatus": "OK",
"globalStatus": "OK"
}
},
}

Or, to get everything in one command:
mysqlsh> clusterset.status({extended:1})

Add second Replica Cluster
mysqlsh> rom = clusterset.createReplicaCluster('localhost:5551', 'ROM')
mysqlsh> rom.addInstance('localhost:5552')
mysqlsh> rom.addInstance('localhost:5553')
mysqlsh> rom.status()
{
"clusters": {
"ROM": {
},
"BRU": {
"primary": "127.0.0.1:3331"
},
"LIS": {
}
},
}

Router Integration

Configure your application to connect to a local MySQL
Router, which in turn connects to the ClusterSet.

Router Target Modes:
  follow the PRIMARY cluster
    Writes & Reads go to the PRIMARY Cluster
  connect to the configured target cluster
    When target cluster is not PRIMARY:
      only read traffic is open
      writes will be denied
    When target cluster is PRIMARY:
      write port opens
Features:
  Configurable per Router instance
  Configuration can be changed ONLINE in mysqlsh
  Deploy 2 types of routers:
    target PRIMARY to send writes to PRIMARY
    define target cluster to keep read traffic local
  INVALIDATED clusters can still be used for read traffic (configurable)
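
The target mode rules above can be written down as a small decision function. This is only a model of the listed rules, not Router's implementation: `openPorts` and the mode label "fixed" are hypothetical names, and the sketch ignores the case where the PRIMARY cluster itself is unavailable.

```javascript
// Model of the routing rules listed above.
// targetMode: "primary" (follow the PRIMARY cluster) or
//             "fixed"   (hypothetical label for a named target cluster).
// invalidatedPolicy: "accept_ro" or "drop_all".
function openPorts(targetMode, targetIsPrimary, targetInvalidated, invalidatedPolicy) {
  if (targetMode === "primary") {
    // Reads and writes both go to the PRIMARY cluster.
    return { read: true, write: true };
  }
  if (targetInvalidated) {
    // An invalidated target never accepts writes; reads depend on the policy.
    return { read: invalidatedPolicy === "accept_ro", write: false };
  }
  // A healthy named target always serves reads; writes only while it is PRIMARY.
  return { read: true, write: targetIsPrimary };
}
```

A Router pinned to a REPLICA cluster therefore yields `{ read: true, write: false }` until that cluster is promoted.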

Router Integration - 3DC


Router Integration
Commands

Bootstrap Router
Same as MySQL InnoDB Cluster & MySQL InnoDB ReplicaSet
$ sudo mysqlrouter --bootstrap root@localhost:3331 --user=mysqlrouter
$ sudo systemctl start mysqlrouter
$ sudo tail -f /var/log/mysqlrouter/mysqlrouter.log

Changing Router Configuration Options
Change the target_cluster:
mysqlsh> clusterset.setRoutingOption('instance-....com::system', 'target_cluster', 'ROM')
mysqlsh> clusterset.setRoutingOption('instance-....com::system', 'target_cluster', 'BRU')
mysqlsh> clusterset.setRoutingOption('instance-....com::system', 'target_cluster', 'LIS')
mysqlsh> clusterset.setRoutingOption('instance-....com::system', 'target_cluster', 'primary')
target_cluster is also configurable during mysqlrouter --bootstrap with
--conf-target-cluster or --conf-target-cluster-by-name
Change the invalidated_cluster_policy:
When the target cluster is invalidated, should it still accept reads, knowing that they will be stale, or should all traffic be dropped?
mysqlsh> clusterset.setRoutingOption('instance-....com::system', 'invalidated_cluster_policy', 'accept_ro')
mysqlsh> clusterset.setRoutingOption('instance-....com::system', 'invalidated_cluster_policy', 'drop_all')
Change the default setting for routers:
mysqlsh> clusterset.setRoutingOption('target_cluster', 'LIS')
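
When several Routers need different settings, the calls above can be driven from a single table. The sketch below assumes the three-argument form of `setRoutingOption()` shown above; `applyRouting`, the shape of `assignments` and the validation step are hypothetical additions.

```javascript
// Policy values shown on the slide.
const INVALIDATED_POLICIES = ["accept_ro", "drop_all"];

// Sketch: apply per-Router routing options from a map of
// router name -> { option: value }.
function applyRouting(clusterset, assignments) {
  // First pass: reject unknown policy values before changing anything.
  for (const options of Object.values(assignments)) {
    const policy = options.invalidated_cluster_policy;
    if (policy !== undefined && !INVALIDATED_POLICIES.includes(policy)) {
      throw new Error("unsupported invalidated_cluster_policy: " + policy);
    }
  }
  // Second pass: issue one setRoutingOption() call per option.
  for (const [router, options] of Object.entries(assignments)) {
    for (const [option, value] of Object.entries(options)) {
      clusterset.setRoutingOption(router, option, value);
    }
  }
}
```

Validating the whole table first means a typo in one entry does not leave the Routers half reconfigured.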

Managing
Commands

Change PRIMARY member in PRIMARY cluster
mysqlsh> bru.setPrimaryInstance('localhost:3332')
Setting instance 'localhost:3332' as the primary instance of cluster 'BRU'...
Instance '127.0.0.1:3331' was switched from PRIMARY to SECONDARY.
Instance '127.0.0.1:3332' was switched from SECONDARY to PRIMARY.
Instance '127.0.0.1:3333' remains SECONDARY.
WARNING: The cluster internal session is not the primary member anymore. For cluster management operations
please obtain a fresh cluster handle using dba.getCluster().
The instance 'localhost:3332' was successfully elected as primary.

Change PRIMARY member in REPLICA cluster
mysqlsh> lis.setPrimaryInstance('localhost:4442')
Setting instance 'localhost:4442' as the primary instance of cluster 'LIS'...
Instance '127.0.0.1:4442' was switched from SECONDARY to PRIMARY.
Instance '127.0.0.1:4443' remains SECONDARY.
Instance '127.0.0.1:4441' was switched from PRIMARY to SECONDARY.
WARNING: The cluster internal session is not the primary member anymore. For cluster management operations
please obtain a fresh cluster handle using dba.getCluster().
The instance 'localhost:4442' was successfully elected as primary.

Switchover - Changing PRIMARY Cluster - setPrimaryCluster()
mysqlsh> clusterset.setPrimaryCluster('LIS')
Switching the primary cluster of the clusterset to 'LIS'
- Verifying clusterset status
-- Checking cluster BRU - Cluster 'BRU' is available
-- Checking cluster ROM - Cluster 'ROM' is available
-- Checking cluster LIS - Cluster 'LIS' is available
- Refreshing replication account of demoted cluster
- Synchronizing transaction backlog at 127.0.0.1:4442
- Updating metadata
- Updating topology
-- Changing replication source of 127.0.0.1:3331 to 127.0.0.1:4442
- Acquiring locks in replicaset instances
-- Pre-synchronizing SECONDARIES
-- Acquiring global lock at PRIMARY & SECONDARIES
- Synchronizing remaining transactions at promoted primary
- Updating replica clusters
Cluster 'LIS' was promoted to PRIMARY of the clusterset. The PRIMARY instance is '127.0.0.1:4442'

Failover to another Cluster
mysqlsh> \c root@localhost:3331
mysqlsh> clusterset=dba.getClusterSet()
mysqlsh> clusterset.forcePrimaryCluster('BRU')
Failing-over primary cluster of the clusterset to 'BRU'
- Verifying primary cluster status
None of the instances of the PRIMARY cluster 'LIS' could be reached.
- Verifying clusterset status
-- Checking cluster BRU
Cluster 'BRU' is available
-- Checking cluster ROM
Cluster 'ROM' is available
-- Checking whether target cluster has the most recent GTID set
- Promoting cluster 'BRU'
- Updating metadata
PRIMARY cluster failed-over to 'BRU'. The PRIMARY instance is '127.0.0.1:3331'
Former PRIMARY cluster was INVALIDATED, transactions that were not yet replicated may be lost.
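
Because a failover can lose transactions while a switchover cannot, it helps to decide explicitly which of the two applies. The sketch below only recommends an action from a status document and never performs it. It assumes the status is a plain object with the `clusters`, `primaryCluster` and `globalStatus` fields shown earlier, and that an unreachable PRIMARY cluster reports a `globalStatus` other than "OK"; `chooseClusterChange` is a hypothetical name.

```javascript
// Sketch: recommend how to make `target` the PRIMARY cluster.
// Returns "none", "blocked", "setPrimaryCluster" or "forcePrimaryCluster".
function chooseClusterChange(status, target) {
  const targetInfo = status.clusters[target];
  if (!targetInfo) {
    throw new Error("unknown cluster: " + target);
  }
  if (target === status.primaryCluster) {
    return "none"; // already PRIMARY
  }
  if (targetInfo.globalStatus !== "OK") {
    return "blocked"; // do not promote a cluster that is itself unhealthy
  }
  const primaryInfo = status.clusters[status.primaryCluster];
  const primaryHealthy = Boolean(primaryInfo) && primaryInfo.globalStatus === "OK";
  // Healthy PRIMARY: clean switchover. Otherwise a failover is the only option,
  // and it should remain an operator decision because data may be lost.
  return primaryHealthy ? "setPrimaryCluster" : "forcePrimaryCluster";
}
```

Keeping the result a recommendation matches the manual failover model described in this deck.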

Removing a Cluster from the ClusterSet
mysqlsh> clusterset.removeCluster('LIS')

ClusterSet Scenarios

PRIMARY Cluster PRIMARY member Crash/Partition
Automatic handling of InnoDB Cluster state changes
When there is a newly elected PRIMARY member in a cluster
Works on failures in PRIMARY and REPLICA clusters
Asynchronous replication is automatically
reconfigured after primary change

PRIMARY Cluster PRIMARY member Crash/Partition - Automatic!

REPLICA Cluster PRIMARY member Crash/Partition - Automatic!

Changing Primary - Change Primary Cluster on Healthy System
Switchover
One command that does it all: setPrimaryCluster()
Asynchronous replication channels between clusters
are automatically reconfigured
Consistency guaranteed
All routers will immediately redirect if needed
(depending on target mode)

Changing Primary - setPrimaryCluster()

Datacenter Crash/Partition
One command to invalidate the PRIMARY cluster and
promote a new PRIMARY cluster: forcePrimaryCluster()
Replication of the other REPLICA clusters will be reconfigured
Split Brain Warning
  local Routers that cannot connect to other clusters will
  not learn about new topology
  if datacenter is network partitioned, it will continue to
  operate as PRIMARY

Datacenter Crash/Partition - forcePrimaryCluster()

Datacenter Crash/Partition - Router Integration
Router Integration
Routers will learn about new topology and redirect
traffic
Routers that come back will learn about new topology
and abandon the old PRIMARY Cluster (e.g. failed DC
comes back online)

Datacenter Crash/Partition - Multiple REPLICA clusters Support


Group Replication Crash/Partition
Router Integration
When GR is offline:
  network partition
  no quorum
  full cluster lost (e.g. power outage)
Router instances will follow PRIMARY (depending on
target mode)

Group Replication Crash/Partition - forcePrimaryCluster() & Router

MySQL InnoDB ClusterSet - Restrictions
Requires Server, Router & Shell version 8.0.27 or higher
Only works with Single Primary mode in InnoDB Cluster
Asynchronous replication between clusters, not semi-sync
(use a single cluster spread across regions if RPO = 0 is required)
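
The version restriction is easy to check up front. A minimal sketch, assuming version strings of the form `major.minor.patch` with an optional suffix; `meetsMinimum` is a hypothetical helper and the default minimum is the 8.0.27 requirement stated above.

```javascript
// Sketch: compare a version string against the ClusterSet minimum.
// parseInt() ignores a trailing suffix such as "-commercial".
function meetsMinimum(version, minimum = "8.0.27") {
  const have = version.split(".").map((part) => parseInt(part, 10) || 0);
  const need = minimum.split(".").map((part) => parseInt(part, 10) || 0);
  for (let i = 0; i < 3; i++) {
    const a = have[i] || 0;
    const b = need[i] || 0;
    if (a !== b) {
      return a > b; // the first differing component decides
    }
  }
  return true; // all three components are equal
}
```

The same check applies to Server, Router and Shell, since all three must meet the minimum.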

Concepts - RTO & RPO
RTO: Recovery Time Objective
How long does it take to recover from a single failure
RPO: Recovery Point Objective
How much data can be lost when a failure occurs
Types of Failure:

High Availability: Single Server Failure, Network Partition

Disaster Recovery: Full Region/Network Failure

Human Error: Little Bobby Tables

High Availability - Single Region
MySQL InnoDB Cluster
RPO = 0
RTO = Seconds
MySQL InnoDB ReplicaSet
RPO != 0
RTO = Minutes+ (manual failover)
👍🏽 Best write performance
👎🏼 Manual failover

Disaster Recovery - Multi Region
MySQL InnoDB Cluster
RPO = 0
RTO = Seconds
👍🏽 Multi-Region Multi-Primary
👎🏼 3 DC
👎🏼 Requires very stable WAN
👎🏼 Write performance affected by latency between DCs

Disaster Recovery - Multi Region
RPO != 0
RTO = Minutes+ (manual failover)
👍🏽 RPO = 0 & RTO = seconds within Region (HA)
👍🏽 Write performance (no sync to other region required)
👎🏼 Higher RTO: Manual failover
👎🏼 RPO != 0 when region fails
