MySQL High Availability and Disaster Recovery with Continuent, a VMware company

© 2015 VMware Inc. All rights reserved.
MySQL High Availability and Disaster
Recovery
Featuring Continuent
Robert Hodges
January 2015

Continuent Quick Introduction
History
•  2004: Continuent established in USA
•  2009: 3rd Generation Continuent Tungsten (aka VMware Continuent) ships
•  2014: 100+ customers running business-critical applications
•  Oct 2014: Acquisition by VMware: now part of the vCloud Air Business Unit
•  Oct 2015: Continuent solutions available through VMware sales
Products
Industry-leading clustering and replication for open source DBMS
•  Clustering: Commercial-grade HA, performance scaling, and data management for MySQL
•  Replication: Flexible, high-performance data movement

Continuent Facts
Business-Critical Deployment Examples
•  High Availability for MySQL: Largest cluster deployment performs 800M+ transactions/day on 275 TB of relational data
•  Business Continuity: Cross-site cluster topologies widely deployed including primary/DR and multi-master
•  High Performance Replication: Largest installations transfer billions of transactions daily using high speed, parallel replication
•  Heterogeneous Integration: Customers replicate from MySQL to Oracle, Hadoop, Redshift, Vertica, and others
•  Real-time Analytics: Optimized data loading for data warehouses with deployments of up to 200 MySQL masters feeding to Hadoop

The Dream: multiple, active DBMS servers with identical data over distance
•  High availability
•  Updates propagated immediately to all servers
•  Transparent read/write to any server
•  High performance

Synchronous multi-master clusters claim to deliver on the dream
[Diagram: three servers each hold table foo and submit concurrent updates; an ordering layer places the updates in a single global order.]
[1] id=1, data=6
[2] id=1, data=5
[3] id=7, data=25

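Synchronous multi-master introduces new problems
[Diagram: three servers each hold table foo and submit concurrent updates; an ordering layer places the updates in a single global order, and the second update to id=1 is rejected.]
[1] id=1, data=6
[2] id=1, data=5 (REJECTED!)
[3] id=7, data=25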
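
The rejection in this example can be sketched as first-committer-wins certification over a globally ordered stream of writes. This is only an illustration of the general idea, not the implementation of any particular product; the `Write` type, its `base_version` field, and the `certify` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Write:
    row_id: int
    data: int
    base_version: int  # hypothetical: row version the originating server saw


def certify(ordered_writes):
    """Apply writes in global order; reject writes based on a stale row version."""
    versions = {}  # row_id -> current version
    table = {}     # row_id -> data
    results = []
    for w in ordered_writes:
        current = versions.get(w.row_id, 0)
        if w.base_version != current:
            # Another write to the same row was ordered first: conflict.
            results.append("REJECTED")
            continue
        table[w.row_id] = w.data
        versions[w.row_id] = current + 1
        results.append("COMMITTED")
    return table, results


# The three updates from the slide, all issued against the same starting state.
writes = [Write(1, 6, 0), Write(1, 5, 0), Write(7, 25, 0)]
table, results = certify(writes)
print(results)  # ['COMMITTED', 'REJECTED', 'COMMITTED']
print(table)    # {1: 6, 7: 25}
```

The application that issued update [2] sees a transaction failure even though it did nothing wrong locally, which is the first problem listed on the next slide.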

…That grow as data scale in volume and distance
•  Transaction failures due to conflicts
•  Operations like SELECT FOR UPDATE not supported
•  Slow writes due to synchronous messaging
•  Large transactions lock system or cause failures
•  Cross-site replication is unstable

Can master/slave clusters offer the same benefits?
•  High availability?
•  Updates propagated immediately?
•  Transparent read/write to any server?
•  High performance?

Continuent
Master/Slave Clusters

Continuent Clustering: HA, DR and Performance Scaling
Benefits
•  24x7 data access
•  SQL load balancing
•  Simple management
•  Off-the-shelf MySQL
[Diagram: application stacks connect through the Continuent Connector to db1 (master), db2 (slave), and db3 (slave).]

Continuent clusters add HA and scaling without taking features away
[Diagram: two Continuent Connectors in front of one master and two slaves; each DBMS node runs a Manager and a Replicator.]

Continuent Connector operates as an intelligent proxy to
the DBMS
•  Any MySQL client can connect
•  Connector initiates connections on behalf of client to the DBMS
[Diagram: the application speaks the MySQL protocol (COM_QUERY, COM_INIT_DB, COM_DROP_DB, …) to the Connector, which connects to the MySQL master and two MySQL slaves.]

Connector minimizes overhead from proxying
•  Pass-through operation after connection
•  Full transparency and low overhead for clients
[Diagram: a COM_QUERY packet carrying SELECT * FROM foo passes through the Connector to the DBMS, and a packet with OK and the ResultSet (Rows: 1) is returned to the client.]
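
For readers unfamiliar with the packets named on these slides: in the MySQL client/server protocol each packet starts with a 3-byte little-endian payload length and a 1-byte sequence number, and the first payload byte of a command packet is the command code (COM_INIT_DB is 0x02, COM_QUERY is 0x03, COM_DROP_DB is 0x06). The sketch below shows how a proxy could frame and inspect such a packet. It is not the Connector's code, and the function names are hypothetical.

```python
# Command codes from the MySQL client/server protocol.
COMMANDS = {0x02: "COM_INIT_DB", 0x03: "COM_QUERY", 0x06: "COM_DROP_DB"}


def build_command_packet(command, argument, sequence_id=0):
    """Frame a command: 3-byte length, 1-byte sequence id, then the payload."""
    payload = bytes([command]) + argument.encode("utf-8")
    header = len(payload).to_bytes(3, "little") + bytes([sequence_id])
    return header + payload


def parse_command_packet(packet):
    """Return (command name, sequence id, argument text) for one packet."""
    length = int.from_bytes(packet[0:3], "little")
    sequence_id = packet[3]
    payload = packet[4:4 + length]
    name = COMMANDS.get(payload[0], "UNKNOWN")
    return name, sequence_id, payload[1:].decode("utf-8")


packet = build_command_packet(0x03, "SELECT * FROM foo")
print(parse_command_packet(packet))  # ('COM_QUERY', 0, 'SELECT * FROM foo')
```

A pass-through proxy only needs to read the header and command byte to decide where a packet goes; the rest of the payload can be forwarded untouched.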

Continuent SmartScale provides session load balancing
•  Initial write goes to master
•  Reads go to replicas if it is safe to do so.
[Diagram: session “X” connects and inserts data (initial write), and its binlog position is tracked. Master: write committed. Slave 1: not received. Slave 2: not received.]

•  Auto-commit reads are eligible to go to slave
•  Reads stay on master until a slave catches up
[Diagram: the session selects data. Master: write committed. Slave 1: not received. Slave 2: received but not applied. Result: read from master, NO read from slave.]

•  Reads go to slave when it has caught up with master
•  Session tags may be schema name or supplied by application
[Diagram: the session selects data. Master: write committed. Slave 1: received but not applied. Slave 2: received and applied. Result: read from slave.]
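
The three states shown on these slides can be summarized as one routing rule: a read may go to a slave only if it is an auto-commit read and the slave has applied (not merely received) the session's last write. The sketch below is a minimal model of that rule under stated assumptions; positions are plain integers, and `choose_read_server` is a hypothetical name, not part of the Connector.

```python
def choose_read_server(session_write_pos, slave_applied_pos, autocommit=True):
    """Pick a server for a read issued by a session.

    session_write_pos: position of the session's last committed write.
    slave_applied_pos: mapping of slave name -> last position applied there.
    """
    if not autocommit:
        # Reads inside an explicit transaction stay on the master.
        return "master"
    for name in sorted(slave_applied_pos):
        if slave_applied_pos[name] >= session_write_pos:
            return name
    # No slave has caught up with this session yet.
    return "master"


# Slide 1: write committed at position 100, neither slave has received it.
print(choose_read_server(100, {"slave1": 0, "slave2": 0}))     # master
# Slide 2: slave2 has received the write but applied only up to 99.
print(choose_read_server(100, {"slave1": 0, "slave2": 99}))    # master
# Slide 3: slave2 has applied position 100.
print(choose_read_server(100, {"slave1": 99, "slave2": 100}))  # slave2
```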

Connectors can be configured to support different levels of service
[Diagram: one Connector labeled SmartScale and one labeled Strict Consistency in front of the same master and two slaves; each DBMS node runs a Manager and a Replicator.]

Demo: Transparent
connectivity to replicas

Continuent clusters automatically monitor all cluster
nodes for failure
[Diagram: one master and two slaves.]

Cluster rules fail over master if DBMS no longer
accepts network connections
[Diagram: the master (marked X) has failed; two slaves remain.]
1. Detect non-responsive node
2. Halt incoming connections
3. Find and promote most up-to-date slave
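
The failover steps above can be sketched as follows. This is a simplified model rather than the actual cluster rules: the `Node` type, the `applied_seqno` field, and the `fail_over` function are hypothetical, and "most up-to-date" is assumed to mean the highest applied sequence number.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str           # "master", "slave", or "shunned"
    reachable: bool     # does the DBMS accept network connections?
    applied_seqno: int  # last transaction applied on this node


def fail_over(nodes):
    """Return the name of the master after applying the failover rule."""
    master = next(n for n in nodes if n.role == "master")
    if master.reachable:           # 1. detect non-responsive node
        return master.name
    master.role = "shunned"        # 2. stop routing connections to it
    candidates = [n for n in nodes if n.role == "slave" and n.reachable]
    if not candidates:
        raise RuntimeError("no reachable slave to promote")
    best = max(candidates, key=lambda n: n.applied_seqno)
    best.role = "master"           # 3. promote most up-to-date slave
    return best.name


nodes = [
    Node("db1", "master", False, 120),
    Node("db2", "slave", True, 118),
    Node("db3", "slave", True, 120),
]
print(fail_over(nodes))  # db3
```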

Failed nodes can be reprovisioned from a backup with
a single management command
[Diagram: the new master, one slave, and the old master (marked X) as a shunned node.]
4. Administrator inspects and recovers old master

Continuent clusters support zero-downtime maintenance
operations from parameter changes to app upgrade
•  Task: change the InnoDB log file size
•  Problem: requires a mysqld restart, hence can cause
application downtime
•  Constraint: avoid application-visible restart
•  Solution: upgrade nodes in succession

Rolling maintenance proceeds node-by-node starting with
slaves and proceeding to master
Slave upgrade
•  Shun slave
•  Resize journal, restart mysqld
•  Return node to cluster
•  Discard and reprovision on failure
Slave upgrade
•  Repeat for remaining slave(s)
Switch
•  Switch master to promote an upgraded slave
Master upgrade
•  Upgrade old master
•  Maintenance is now done!
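
The ordering of a rolling maintenance run can be written down as a simple plan generator. The sketch below only produces descriptive step labels; they are not real management commands, and `rolling_maintenance_plan` is a hypothetical helper. It assumes the first upgraded slave is the one promoted.

```python
def rolling_maintenance_plan(master, slaves):
    """List maintenance steps: slaves first, then switch, then the old master."""
    steps = []
    for slave in slaves:
        steps.append(f"shun {slave}")
        steps.append(f"maintain {slave}")  # e.g. resize log, restart mysqld
        steps.append(f"return {slave} to cluster")
    new_master = slaves[0]  # assumption: promote the first upgraded slave
    steps.append(f"switch master from {master} to {new_master}")
    steps.append(f"shun {master}")
    steps.append(f"maintain {master}")
    steps.append(f"return {master} to cluster")
    return steps


plan = rolling_maintenance_plan("db1", ["db2", "db3"])
for step in plan:
    print(step)
```

At every point in the plan exactly one node is out of the cluster, so applications always have a master to write to.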

Transaction Scaling
with Master/Slave
Topologies

Size and transaction activity on business data
depend on many factors
[Chart: SaaS Datasets, Size of Top 1000 Customers. X-axis: Customers. Y-axis: Dataset Size in Gigabytes (0 to 1400).]
•  Max=1214GB
•  99th percentile=290GB
•  Median=2.6GB
Source: Statistics provided by Continuent customer

DBMS workloads are correspondingly varied
•  Complex queries
•  Large batch operations
•  Small online transactions
•  Analytic reports
[Diagram: one master and two slaves, each running a Manager and a Replicator.]

Asynchronous replication decouples transaction
processing on master and slave DBMS nodes
[Diagram: on the master, the Replicator reads the MySQL DBMS logs and writes to the THL (transactions + metadata). On the slave, the Replicator downloads transactions via the network into its own THL (transactions + metadata) and applies them to MySQL using JDBC.]

Parallel apply maximizes DBMS I/O bandwidth
when updating replicas
[Diagram: Slave Replicator Pipeline. Transactions flow from the master replicator into the slave's THL (transactions + metadata) and then into a parallel queue. The pipeline is divided into stages, each built from Extract, Filter, and Apply steps.]
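
One way to picture the parallel queue is as a partitioner that spreads transactions over several apply channels while keeping related transactions in order. The slide does not say how transactions are assigned to channels; the sketch below assumes each transaction carries a shard key (for example, a schema name) and that all transactions with the same key must stay on one channel. The `partition` function and the tuple layout are hypothetical.

```python
import zlib
from collections import defaultdict


def partition(transactions, channels):
    """Assign (seqno, shard, sql) tuples to channels by a stable hash of shard."""
    queues = defaultdict(list)
    for seqno, shard, sql in transactions:
        # crc32 is stable across runs, unlike Python's built-in str hash.
        channel = zlib.crc32(shard.encode("utf-8")) % channels
        queues[channel].append((seqno, shard, sql))
    return dict(queues)


transactions = [
    (1, "tenant_a", "INSERT ..."),
    (2, "tenant_b", "UPDATE ..."),
    (3, "tenant_a", "UPDATE ..."),
    (4, "tenant_c", "DELETE ..."),
    (5, "tenant_b", "INSERT ..."),
]
queues = partition(transactions, channels=3)
for channel, queue in sorted(queues.items()):
    print(channel, [seqno for seqno, _, _ in queue])
```

Each queue can then be applied by its own worker, so independent shards no longer wait on each other while ordering within a shard is preserved.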

Demo: Scalable
transaction processing

Distributing Data
between Regions and
to Other DBMS types

Continuent Disaster Recovery creates composite clusters
that span sites and are ready for immediate failover
[Diagram: the SJC master service (one master, two slaves) feeds the NYC slave service (one relay, two slaves) through cross-region replication (async master/slave).]

Continuent multi-master, cross-site clusters operate independent, active clusters on 2 or more remote sites
[Diagram: the SJC service and the NYC service each run one master and two slaves; the sites are linked by cross-region replication (async multi-master).]

The same replication mechanism supports real-time
loading of data warehouses
[Diagram: the SJC service (one master, two slaves) replicates into a Hadoop cluster.]

Master/slave clustering is a robust technology for enterprise data management!
•  Very high availability
•  Updates propagated without cost to applications
•  Transparent connectivity with full SQL semantics
•  Very high performance

Continuent offers…
•  Highly available clusters of off-the-shelf MySQL servers
•  Zero-downtime maintenance and upgrade
•  High performance regardless of data volume or distance
•  Replication over regions to DR sites as well as non-
MySQL data warehouses

For more information, contact us:
Robert Noyes
Alliance Manager, USA & Canada
rnoyes@vmware.com
+1 (650) 575-0958
Philippe Bernard
Alliance Manager, EMEA & APAC
pbernard@vmware.com
+41 79 347 1385
Eero Teerikorpi
Sr. Director, Strategic Alliances
eteerikorpi@vmware.com
+1 (408) 431-3305
