MySQL Day Paris 2018 - MySQL InnoDB Cluster; A complete High Availability solution for MySQL

MySQL InnoDB Cluster
Copyright 2018, Oracle and/or its affiliates. All rights reserved
A complete High Availability solution for MySQL
Olivier Dasini
MySQL Principal Solutions Architect EMEA
olivier.dasini@oracle.com
Twitter : @freshdaz
Blog : http://dasini.net/blog

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Safe Harbor Statement
The following is intended to outline our general product direction. It is
intended for information purposes only, and may not be incorporated
into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing
decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole
discretion of Oracle.
2

Me, Myself & I
➢
MySQL Geek
✔ Addicted to MySQL for 15+ years!
✔ Playing with databases for 20+ years
➢
MySQL Writer, Blogger and Speaker
✔ Also former : DBA, Consultant, Architect, Trainer, ...
➢
MySQL Principal Solutions Architect EMEA at Oracle
➢
Stay tuned! :
✔ Twitter : @freshdaz
✔ Blog : http://dasini.net/blog
3
Olivier DASINI

Copyright © 2018 Oracle and/or its affiliates. All rights reserved. |
Program Agenda
1
2
3
Why High Availability?
Demo : Deploying MySQL InnoDB Cluster
4

Copyright 2018, Oracle and/or its affiliates. All rights reserved
Virtually all organizations require their
most critical systems to be highly available
5
100%
High Availability

High Availability: Factors
• Environment
– Redundant servers in different datacenters and geographical areas will protect you
against regional issues—power grid failures, hurricanes, earthquakes, etc.
• Hardware
– Each part of your hardware stack—networking, storage, servers—should be
redundant
• Software
– Every layer of the software stack needs to be duplicated and distributed across
separate hardware and environments
• Data
– Data loss and inconsistency/corruption must be prevented by having multiple copies
of each piece of data, with consistency checks and guarantees for each change
7

High Availability: The Causes of Downtime
8
40.00%
40.00%
20.00%
Software/Application Human Error Hardware
* Source: Gartner Group 1998 survey
A study by the Gartner Group
projected that through 2015,
80% of downtime will be due to
people and process issues

High Availability: The Business Cost of
Downtime
• Calculate a cost per minute of downtime
– Average revenue generated per-minute over a year
– Cost of not meeting any customer SLAs
– Factor in costs that are harder to quantify
1. Revenue
2. Reputation
3. Customer sentiment
4. Stock price
5. Service’s success
6. Company’s very existence
9
THIS is why
HA matters!

Program Agenda
1
2
3
10

Collection of products that work together
to provide a complete HHigh AAvailability solution for MySQL

InnoDB Cluster
12
App Servers with
MySQL Router
MySQL Group Replication
“High Availability becomes a core
first class feature of MySQL!”
MySQL Shell
Setup, Manage,
Orchestrate

A single product: MySQL
• All components created together
• Tested together
• Packaged together
Flexible and Modern
• C++ 11
• Protocol Buffers
• Developer friendly
InnoDB Cluster – Goals
Easy to use
• A single client: MySQL Shell
• Easy packaging
• Homogenous servers
Scale-out
• Sharded clusters
• Federated system of N replica sets
• Each replica set manages a shard

InnoDB Cluster: The Big Picture
P
App
S S
MySQL Shell and Orchestration Tooling
App
Simple Mapping,
State and Extra
Metadata
Control, Coordinate,
Provision
Monitoring (MEM)
MySQL Router
MySQL Router

Native, built-in High Availability for your MySQL databases
15

Group Replication
16
App Servers with
MySQL Router
MySQL Shell
Setup, Manage,
Orchestrate

Group Replication: What Is It?
• Group Replication library
– Implementation of Replicated Database State Machine theory
• MySQL GCS is based on Paxos (variant of Mencius)
– Provides virtually synchronous replication for MySQL 5.7+
– Supported on all MySQL platforms
• Linux, Windows, Solaris, OSX, FreeBSD
“Single/Multi-master update everywhere replication plugin for MySQL with built-in
automatic distributed recovery, conflict detection and group
membership.”
17
http://dasini.net/blog/2016/11/08/deployer-un-cluster-mysql-group-replication/

• A Highly Available distributed MySQL database service
– Clustering eliminates single points of failure (No SPOF)
• Allows for online maintenance
– Removes the need for handling server fail-over
– Provides fault tolerance
– Enables update everywhere setups
– Automates group reconfiguration (handling of crashes, failures, re-connects)
– Provides a highly available replicated database
– Automatically ensures data consistency
• Detects and handles conflicts
• Prevents data loss
• Prevents data corruption
18
Group Replication: What Does It Provide?

Group Replication: What Sets It Apart?
• Built by the MySQL Engineering Team
– Natively integrated into Server: InnoDB, Replication, GTIDs, Performance
Schema, SYS
– Built-in, no need for separate downloads
– Available on all platforms [Linux, Windows, Solaris, FreeBSD, etc]
• Better performance than similar offerings
– MySQL GCS has optimized network protocol that reduces the impact on latency
• Easier monitoring
– Simple Performance Schema tables for group and node status/stats
– Native support for Group Replication in MySQL Enterprise Monitor
• Modern full stack MySQL HA being built around it
19

Group Replication: Architecture
Node Types
R: Traffic routers/proxies: mysqlrouter, ProxySQL, HAProxy...
M: mysqld nodes participating in Group Replication
20

Group Replication: Performance
21

Group Replication: Performance Comparison
22

Single Primary Mode
• Configuration mode that makes a single member act as a writeable master (PRIMARY) and the
rest of the members act as hot-standbys (SECONDARIES)
– The group itself coordinates automatically to figure out which is the member that will act as the PRIMARY,
through an automatic primary election mechanism
– Secondaries are automatically set to read-only
• Single_primary mode is the default mode
– Closer to classic asynchronous replication setups, simpler to reason about the beginning
– Avoids some limitations of multi-primary mode by default
• The current PRIMARY member UUID can be know by executing the following SQL statement:
23
SQL> SELECT * FROM performance_schema.replication_group_members WHERE MEMBER_ROLE='PRIMARY'G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: da6e3c8e-a3cb-11e8-84dc-0242ac13000b
MEMBER_HOST: mysql_8.0_node1
MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
MEMBER_ROLE: PRIMARY
MEMBER_VERSION: 8.0.12
SQL> SELECT * FROM performance_schema.replication_group_members WHERE MEMBER_ROLE='PRIMARY'G
*************************** 1. row ***************************
MEMBER_ID: da6e3c8e-a3cb-11e8-84dc-0242ac13000b
MEMBER_HOST: mysql_8.0_node1
MEMBER_PORT: 3306

Multi-primary Mode
• Configuration mode that makes all members writeable
– Enabled by setting option --group_replication_single_primary_mode to
OFF
• Any two transactions on different servers can write to the same tuple
• Conflicts will be detected and dealt with
– First committer wins rule
24

Full stack secure connections
• Following the industry standards, Group Replication supports secure
connections along the complete stack
– Client connections
– Distributed recovery connections
– Connections between members
• IP Whitelisting
– Restrict which hosts are allowed to connect to the group
– By default it is set to the values AUTOMATIC, which allow connections from private
subnetworks active on the host
25
http://mysqlhighavailability.com/mysql-group-replication-securing-the-perimeter/

Prioritize member for the Primary Member Election
• group_replication_member_weight
– allows users to influence primary member election
– takes integer value between 0 and 100
– default value = 50
• The first primary member is still the member which bootstrapped the group
irrespective of group_replication_member_weight value.
26
http://mysqlhighavailability.com/group-replication-prioritise-member-for-the-primary-member-election/
node1> SET GLOBAL group_replication_member_weight= 90;

Parallel applier support
• Group Replication now also takes full advantage of parallel binary log applier
infrastructure
– Reduces applier lad and improves replication performance considerably
– Configured in the sale way as asynchronous replication
27
slave_parallel_workers=<NUMBER>
slave_parallel_type=logical_clock
slave_preserve_commit_order=ON
slave_parallel_workers=<NUMBER>
slave_parallel_type=logical_clock
slave_preserve_commit_order=ON

• Optimize the system as a whole
– Sometimes it is beneficial to delay some parts of a distributed system it to improve the throughput of the system as a
whole
– In MySQL Group Replication, it is used to
●
keep writers operating below the sustained capacity of the system;
●
reduce buffering stress on the replication pipeline;
●
protect the correct execution of the system.
• Designed as a safety measure
– Throttling will never be active while the system is operating below its sustained capacity.
• To better support unbalanced systems and unfriedly workloads
– Keep members closer for faster failover
– Keep members closer for reduced replication lag
– Reduce the number of transactions aborts in multi-master
• Make sure new members can always join write-intensive groups
– Nodes entering the group need to catch up previous work while also storing current work to apply later
– In case excess capacity is not available, the cluster will need to be put at lower throughput for new members to catch up
28
Flow Control

• MySQL 5.7 - Basic configuration options
– group-replicaton-fow-control-mode = QUOTA | DISABLED
– group-replicaton-fow-control-certifier-threshold = 0..n
– group-replicaton-fow-control-applier-threshold = 0..n
• Certifier/applier thresholds
– The thresholds are the point at which the flow-control system will delay the writes at the master
– The default is set to 25000 and should be kept larger then one second of sustained commit rate
– But some members will be up to 25000 transactions delayed, if, and only if, they are unable to keep up with the writer members
• MySQL 8.0.3 introduce additional options to fine-tune the heuristics
– group_replicaton_fow_control_min_quota = X commits/s
– group_replicaton_fow_control_min_recovery_quota = X commits/s
– group_replicaton_fow_control_max_commit_quota = X commits/s
– group_replicaton_fow_control_member_quota_percent = Y %
– group_replicaton_fow_control_period = Z seconds
– group_replicaton_fow_control_hold_percent = Y %
– group_replicaton_fow_control_release_percent = Y %
29
Flow Control options

Storage Engine InnoDB
Distributed Architecture Shared nothing
Network failure protection Quorum
NoSQL APIs MySQL Document Store / InnoDB Memcached
Replication mode Virtually Synchronous
Database Failover Automatic
OS Supported Linux, Windows, Solaris, OSX, FreeBSD
Operational Complexity Medium
Administration Standard (MySQL)
Group Replication overview

Shell
App Servers with
MySQL Router
MySQL Shell
Setup, Manage,
Orchestrate
”MySQL Shell provides the developer and DBA with a single intuitive,
flexible, and powerful interface for all MySQL related tasks!”

Shell
32Confidential - Oracle Internal 32
MySQL
Server 5.7
MySQL 8.0
Upgrade Checker
Prompt Themes
Auto Completion
&
Command History
MySQL
Server 8.0
Document Store
X DevAPI
InnoDB ClusterSQL CLI
Output Formats
(Table, JSON,
Tabbed)
Batch Execution
JavaScript
Python
SQL

Shell – Key Features
33
Interface for Development and Administration of MySQL
• Scripting for Javascript, Python, and SQL mode
• Supports MySQL Standard and X Protocols
• Document and Relational Models
• CRUD Document and Relational APIs via scripting
• Traditional Table, JSON, Tab Separated output results formats
• Both Interactive and Batch operations

Shell – Admin API
• Creation and Management of InnoDB Clusters
– Hide the complexity of Configuration, Provisioning, Orchestration
– JS> dba.help()
• The global variable 'dba' is used to access the MySQL AdminAPI
• Perform DBA operations
– Manage MySQL InnoDB clusters
• Create clusters
• Get cluster info
• Start/Stop MySQL Instances
• Validate MySQL instances
• …
Database Administration Interface
34

App Servers with
MySQL Router
MySQL Shell
Setup, Manage,
Orchestrate
Router

• MySQL Router is lightweight middleware that provides transparent
routing between your application and back-end MySQL Servers.
• MySQL Router also provides High Availability and Scalability by
effectively routing database traffic to appropriate back-end MySQL Servers.
• The pluggable architecture also enables developers to extend MySQL
Router for custom use cases.
36
Router

Router
• Native support for MySQL InnoDB clusters
– Understands Group Replication topology
– Utilizes metadata schema stored on each member
• Bootstraps itself and sets up client routing for the GR cluster
• Allows for intelligent client routing into the GR cluster
• Supports multi-master and single primary modes
• Core improvements
– Logging
– Monitoring
– Performance
– Security
37

InnoDB Cluster
38
App Servers with
MySQL Router
MySQL Shell
Setup, Manage,
Orchestrate

Program Agenda
1
2
3
39

Demo :
Deploying MySQL InnoDB Cluster
in Single-Primary Mode
40

InnoDB Cluster: Architecture

Testing MySQL InnoDB Cluster in a sandbox
Cluster admin user
• Does not have to be the root account
• The preferred method to create users to administer the cluster is using the
clusterAdmin option with the dba.configureInstance(), and
cluster.addInstance() operations
https://dev.mysql.com/doc/refman/8.0/en/mysql-innodb-cluster-production-deployment.html

Testing MySQL InnoDB Cluster in a sandbox – Demo 1/7
http://dasini.net/blog/2017/03/13/tester-mysql-innodb-cluster/
1/ Connect to MySQL Shell
$ mysqlsh
2/ Deploy 3 local MySQL instances on ports : 3310 - 3320 - 3330
JS> dba.deploySandboxInstance(3310)
• Deploying local sandbox MySQL Instances
Now we have 3 MySQL server
instances running on ports 3310,
3320 and 3330.

Stop
JS> dba.stopSandboxInstance(3310);
Start
JS> dba.startSandboxInstance(3320);
Kill : simulate an unexpected halt while testing failover
JS> dba.killSandboxInstance(3330);
Delete : Completely removes the sandbox instance from your file system
JS> dba.deleteSandboxInstance(3330);
• Managing local sandbox MySQL Instances
Once a sandbox instance is running, it is possible to change its status at any time :

Check instance configuration before the deployment
JS> dba.checkInstanceConfiguration('root@localhost:3310')
• Checking the configuration
Configuration issues could be fixed manually or with configureLocalInstance
e.g.
JS> dba.configureLocalInstance('root@localhost:3310')

1/ Connect to instance 3310
JS> c root@localhost:3310
2/ Create the cluster
JS> var cluster=dba.createCluster('demoCluster');
3/ Check the transaction set and add node 3320 to the cluster
JS> cluster.checkInstanceState('root@localhost:3320')
JS> cluster.addInstance('root@localhost:3320')
4/ Check the transaction set and add node 3330 to the cluster
JS> cluster.checkInstanceState('root@localhost:3330')
JS> cluster.addInstance('root@localhost:3330')
5/ Check cluster status
JS> cluster.status()
• Deploying the InnoDB Cluster

47
var cluster = dba.getCluster('demoCluster'); cluster.status();
{
"clusterName": "demoCluster",
"defaultReplicaSet": {
"name": "default",
"primary": "localhost:3310",
"ssl": "REQUIRED",
"status": "OK",
"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
"topology": {
"localhost:3310": {
"address": "localhost:3310",
"mode": "R/W",
"readReplicas": {},
"role": "HA",
"status": "ONLINE"
},
"localhost:3320": {
"mode": "R/O",
"readReplicas": {},
"role": "HA",
"status": "ONLINE"
},
"localhost:3330": {
"mode": "R/O",
"readReplicas": {},
"role": "HA",
"status": "ONLINE"
}
}
},
"groupInformationSourceMember": "mysql://root@localhost:3310"
}

# Bootstrap MySQL Router (listening on 3306)
$ mysqlrouter --bootstrap root@localhost:3310 --directory /tmp/demoSandbox --conf-base-port 3306
# Start MySQL router
$ /tmp/demoSandbox/start.sh
• Deploying the Router
The following connection information can be used to connect to the cluster.
Classic MySQL protocol connections to cluster 'demoSandbox':
- Read/Write Connections: localhost : 3306
- Read/Only Connections: localhost : 3307
X protocol connections to cluster 'demoSandbox':
- Read/Write Connections: localhost : 3308
- Read/Only Connections: localhost : 3309

1/ Connect to MySQL Router (3306 or default port = 6446)
$ mysqlsh --uri=root@localhost:3306
2/ Detect primary node’s port
JS> sql
SQL> SELECT @@port; -- display 3310
3/ Primary node crash!
SQL> js
JS> dba.killSandboxInstance(3310)
• Failover
4/ Select primary node’s port
JS> c root@localhost:3306
SQL> SELECT @@port; -- display 3320
5/ Member is back to the group
SQL> js
JS> dba.startSandboxInstance(3310)
JS> var cluster = dba.getCluster('demoCluster')
JS> cluster.status()

50
Administration - Monitoring

Enterprise Monitor
• Native holistic support for Group Replication clusters
– Intelligent monitoring and alerting
– Topology views
– Detailed metrics and graphs
– Best Practice advice
• Monitoring of MySQL Group Replication
51

52
• Group Replication with 3 online nodes
• Group Replication with 3 nodes :
– 2 online
– 1 unreachable
Enterprise Monitor

53
• Asynchronous replication between
– Group Replication cluster : master
– Standalone instances : slaves
Enterprise Monitor

54
• Asynchronous replication between
– Group Replication cluster 1 : master
– Group Replication cluster 2 : slave
Enterprise Monitor

Monitoring: replication_group_member_stats
• Useful to understand :
– how the applier queue is growing
– how many conflicts have been found
– how many transactions were checked
– which transactions are committed everywhere
●
Important for monitoring the performance of the members connected in the group
node1> SELECT * FROM performance_schema.replication_group_member_statsG
*************************** 1. row ***************************
VIEW_ID: 14845735801161197:3
MEMBER_ID: 00014001-1111-1111-1111-111111111111
COUNT_TRANSACTIONS_IN_QUEUE: 0
COUNT_TRANSACTIONS_CHECKED: 0
COUNT_CONFLICTS_DETECTED: 0
COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
TRANSACTIONS_COMMITTED_ALL_MEMBERS: 4e0f05b7-d9d0-11e6-87cf-002710cccc64:1-2
LAST_CONFLICT_FREE_TRANSACTION:
55

Monitoring: replication_group_members 1/2
• Used for monitoring the status of the different server instances that are tracked in the current view
node1> SELECT * FROM performance_schema.replication_group_membersG
*************************** 1. row ***************************
MEMBER_ID: 00014001-1111-1111-1111-111111111111
MEMBER_HOST: localhost
MEMBER_PORT: 14001
*************************** 2. row ***************************
MEMBER_ID: 00014002-2222-2222-2222-222222222222
MEMBER_PORT: 14002
MEMBER_ROLE: SECONDARY
*************************** 2. row ***************************
MEMBER_ID: 00014003-3333-3333-3333-333333333333
MEMBER_PORT: 14003
MEMBER_ROLE: SECONDARY
56

Monitoring: replication_group_members 2/2
• replication_group_members table is updated whenever there is a view change
• There are various states that a server instance can be in
• If servers are communicating properly, all report the same states for all servers
• If there is a network partition, or a server leaves the group, then different information
may be reported, depending on which server is queried
57
Field Description Group Synchronized
ONLINE
The member is ready to serve as a fully functional group member, meaning that the
client can connect and start executing transactions Yes
RECOVERING
The member is in the process of becoming an active member of the group and is
currently going through the recovery process, receiving state information from a
donor
No
OFFLINE The plugin is loaded but the member does not belong to any group No
ERROR
The state of the local node. Whenever there is an error on the recovery phase or
while applying changes, the server enters this state No
UNREACHABLE
Whenever the local failure detector suspects that a given server is not reachable,
because maybe it has crashed or was disconnected involuntarily, it shows that
server's state as 'unreachable'
No

Monitoring: replication_connection_status
• Show information regarding Group Replication :
– transactions that have been received from the group and queued in the applier queue (the relay log)
– Recovery
node1> SELECT * FROM performance_schema.replication_connection_statusG
*************************** 1. row ***************************
GROUP_NAME: 4e0f05b7-d9d0-11e6-87cf-002710cccc64
SOURCE_UUID: 4e0f05b7-d9d0-11e6-87cf-002710cccc64
THREAD_ID: NULL
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
RECEIVED_TRANSACTION_SET: 4e0f05b7-d9d0-11e6-87cf-002710cccc64:1-2
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
58

Monitoring: replication_applier_status
• The state of the Group Replication related channels and thread
• If there are many different worker threads applying transactions then the
worker tables can also be used to monitor what each worker thread is
doing
node1> SELECT * FROM performance_schema.replication_applier_statusG
*************************** 1. row ***************************
SERVICE_STATE: ON
REMAINING_DELAY: NULL
COUNT_TRANSACTIONS_RETRIES: 0
59

Links
• Tutoriel – Déployer MySQL 8.0 InnoDB Cluster
– http://dasini.net/blog/2018/08/30/tutoriel-deployer-mysql-8-0-innodb-cluster/
• MySQL InnoDB Cluster Documentation
– https://dev.mysql.com/doc/refman/8.0/en/mysql-innodb-cluster-userguide.html
• MySQL Group Replication Documentation
– https://dev.mysql.com/doc/refman/8.0/en/group-replication.html
• Tutoriel – Configurer ProxySQL 1.4 pour MySQL 5.7 Group Replication
– http://dasini.net/blog/2018/01/09/configurer-proxysql-1-4-pour-mysql-5-7-group-replication/
• Tester MySQL InnoDB Cluster
– http://dasini.net/blog/2017/03/13/tester-mysql-innodb-cluster/
• Adopte un… cluster MySQL Group Replication
– http://dasini.net/blog/2017/04/10/adopte-un-cluster-mysql-group-replication/
• Déployer un cluster MySQL Group Replication
– http://dasini.net/blog/2016/11/08/deployer-un-cluster-mysql-group-replication/
60

Links
• MEB – User Guide
– https://dev.mysql.com/doc/mysql-enterprise-backup/8.0/en/
• MEB – Using MySQL Enterprise Backup with Group Replication
– https://dev.mysql.com/doc/mysql-enterprise-backup/8.0/en/meb-group-replication.html
• MEB - Using MySQL Enterprise Backup with Group Replication
– https://dev.mysql.com/doc/refman/8.0/en/group-replication-enterprise-backup.html
• MySQL InnoDB Cluster: is the router a single point of failure ?
– https://lefred.be/content/mysql-innodb-cluster-is-the-router-a-single-point-of-failure/
• MySQL Router HA with Pacemaker
– https://lefred.be/content/mysql-router-ha-with-pacemaker/
• MySQL Router HA with Keepalived
– https://lefred.be/content/mysql-router-ha-with-keepalived/
• Docker Compose Setup for InnoDB Cluster
– https://mysqlrelease.com/2018/03/docker-compose-setup-for-innodb-cluster/
61

Thank you!

MySQL Day Paris 2018 - MySQL InnoDB Cluster; A complete High Availability solution for MySQL

MySQL Day Paris 2018 - MySQL InnoDB Cluster; A complete High Availability solution for MySQL

More Related Content

MySQL Day Paris 2018 - MySQL InnoDB Cluster; A complete High Availability solution for MySQL