© 2015 VMware Inc. All rights reserved.
MySQL High Availability and Disaster Recovery
Featuring Continuent
Robert Hodges
January 2015
Continuent Quick Introduction
History
•  2004: Continuent established in USA
•  2009: 3rd Generation Continuent Tungsten (aka VMware Continuent) ships
•  2014: 100+ customers running business-critical applications
•  Oct 2014: Acquisition by VMware; now part of the vCloud Air Business Unit
•  Oct 2015: Continuent solutions available through VMware sales
Products
•  Industry-leading clustering and replication for open source DBMS
•  Clustering: commercial-grade HA, performance scaling, and data management for MySQL
•  Replication: flexible, high-performance data movement
Continuent Facts: Business-Critical Deployment Examples
•  High Availability for MySQL: Largest cluster deployment performs 800M+ transactions/day on 275 TB of relational data
•  Business Continuity: Cross-site cluster topologies widely deployed, including primary/DR and multi-master
•  High Performance Replication: Largest installations transfer billions of transactions daily using high-speed, parallel replication
•  Heterogeneous Integration: Customers replicate from MySQL to Oracle, Hadoop, Redshift, Vertica, and others
•  Real-time Analytics: Optimized data loading for data warehouses, with deployments of up to 200 MySQL masters feeding Hadoop
Select Continuent Customers
Too Good To Be True
The Dream: multiple, active DBMS servers with identical data over distance
•  High availability
•  Updates propagated immediately to all servers
•  Transparent read/write to any server
•  High performance
Synchronous multi-master clusters claim to deliver on the dream
[Figure: three servers each accept a write to table foo (id=1, data=6; id=1, data=5; id=7, data=25). An ordering step sequences the writes for all servers: [1] id=1, data=6; [2] id=1, data=5; [3] id=7, data=25.]
Synchronous multi-master introduces new problems
[Figure: the same three writes to table foo pass through the ordering step ([1] id=1, data=6; [2] id=1, data=5; [3] id=7, data=25), but the write id=1, data=5 is marked REJECTED!]
…That grow as data scales in volume and distance
•  Transaction failures due to conflicts
•  Operations like SELECT ... FOR UPDATE not supported
•  Slow writes due to synchronous messaging
•  Large transactions lock the system or cause failures
•  Cross-site replication is unstable
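The rejected write in the figure can be illustrated with a toy model. The following is a minimal sketch, assuming a simple "first committer wins" certification rule; the `Write` class, the `snapshot` field, and the `certify` function are hypothetical names invented for this illustration and do not describe any particular product's algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    node: str      # server that accepted the write
    row_id: int    # primary key in table foo
    data: int      # new value
    snapshot: int  # hypothetical: last global sequence number the node had seen


def certify(ordered_writes):
    """Apply writes in global order; reject a write if its row changed after its snapshot."""
    last_change = {}  # row_id -> sequence number of the last accepted write
    table = {}        # row_id -> data
    results = []
    for seqno, w in enumerate(ordered_writes, start=1):
        if last_change.get(w.row_id, 0) > w.snapshot:
            # Another server already changed this row: conflict.
            results.append((seqno, w, "REJECTED"))
            continue
        table[w.row_id] = w.data
        last_change[w.row_id] = seqno
        results.append((seqno, w, "COMMITTED"))
    return table, results


# The three writes from the slide, all issued before any was replicated.
writes = [
    Write("server-1", row_id=1, data=6, snapshot=0),
    Write("server-2", row_id=1, data=5, snapshot=0),
    Write("server-3", row_id=7, data=25, snapshot=0),
]
table, results = certify(writes)
for seqno, w, status in results:
    print(f"[{seqno}] id={w.row_id}, data={w.data}: {status}")
```

In this model the second write fails only because two servers touched the same row concurrently, which is why such failures become more likely as write volume and replication distance grow.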
Can master/slave clusters offer the same benefits?
•  High availability?
•  Updates propagated immediately?
•  Transparent read/write to any server?
•  High performance?
Continuent Master/Slave Clusters
Continuent Clustering: HA, DR and Performance Scaling
Benefits
•  24x7 data access
•  SQL load balancing
•  Simple management
•  Off-the-shelf MySQL
[Figure: two application stacks, each with a Continuent Connector, connect to a three-node cluster: db1 (master), db2 (slave), and db3 (slave).]
Continuent clusters add HA and scaling without taking features away
[Figure: two Continuent Connectors in front of a slave, a master, and a slave; each node runs a Manager and a Replicator.]
Continuent Connector operates as an intelligent proxy to the DBMS
•  Any MySQL client can connect
•  Connector initiates connections to the DBMS on behalf of the client
[Figure: the application speaks the MySQL protocol (COM_QUERY, COM_INIT_DB, COM_DROP_DB, …) to the Connector, which connects to a MySQL master and two MySQL slaves.]
Connector minimizes overhead from proxying
•  Pass-through operation after connection
•  Full transparency and low overhead for clients
[Figure: the application sends a COM_QUERY packet (SELECT * FROM foo) through the Connector to the MySQL nodes (one master, two slaves); the reply packet (OK, ResultSet, Rows: 1) passes back through the Connector.]
Continuent SmartScale provides session load balancing
•  Initial write goes to master
•  Reads go to replicas if it is safe to do so
[Figure: initial write. The application connects through the Connector and inserts data; the Connector tracks the binlog position for session “X”. The write is committed on the master and not yet received by either slave.]
Continuent SmartScale provides session load balancing
•  Auto-commit reads are eligible to go to a slave
•  Reads stay on master until a slave catches up
[Figure: select data. The write for session “X” is committed on the master; one slave has not received it and the other has received but not applied it. The Connector reads from the master, not from a slave.]
Continuent SmartScale provides session load balancing
•  Reads go to a slave when it has caught up with master
•  Session tags may be the schema name or supplied by the application
[Figure: select data. The write for session “X” is committed on the master; one slave has received but not applied it and the other has received and applied it. The Connector reads from the slave that has applied the write.]
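The three SmartScale slides describe a routing rule that can be sketched in a few lines. This is a minimal illustration, assuming the Connector remembers the binlog position of each session's last write and compares it with the position each slave has applied; `Replica`, `SessionRouter`, and their methods are hypothetical names, not the Connector's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    applied_position: int = 0  # last binlog position applied on this slave


@dataclass
class SessionRouter:
    master: str
    replicas: list
    # session tag -> binlog position of the session's last committed write
    session_position: dict = field(default_factory=dict)

    def record_write(self, session, committed_position):
        """Writes always go to the master; remember where the session's data lives."""
        self.session_position[session] = committed_position
        return self.master

    def route_read(self, session, autocommit=True):
        """Send a read to a slave only if it has applied the session's last write."""
        if not autocommit:
            return self.master  # reads inside a transaction stay on the master
        needed = self.session_position.get(session, 0)
        for replica in self.replicas:
            if replica.applied_position >= needed:
                return replica.name
        return self.master  # no slave has caught up yet


db2, db3 = Replica("db2"), Replica("db3")
router = SessionRouter(master="db1", replicas=[db2, db3])

router.record_write("X", committed_position=100)
before = router.route_read("X")  # slaves have not applied position 100
db3.applied_position = 100       # db3 catches up
after = router.route_read("X")
print(before, after)
```

The design point is that consistency is tracked per session: other sessions that have not written recently can still read from a slave that lags behind session “X”.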
Connectors can be configured to support different levels of service
[Figure: two Continuent Connectors, one configured for SmartScale and one for Strict Consistency, in front of a slave, a master, and a slave; each node runs a Manager and a Replicator.]
Demo: Transparent connectivity to replicas
Failover and Maintenance
Continuent clusters automatically monitor all cluster nodes for failure
[Figure: a Continuent Connector in front of one master and two slaves.]
Cluster rules fail over master if DBMS no longer accepts network connections
1. Detect non-responsive node
2. Halt incoming connections
3. Find and promote most up-to-date slave
[Figure: a Continuent Connector in front of a failed master (marked X) and two slaves.]
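The failover steps above can be sketched as a small function. This is an illustrative model only, assuming each node reports whether it accepts connections and how far it has applied the replication stream; the `fail_over` function and the `online` / `applied_position` fields are hypothetical and do not reflect the cluster manager's real rules engine.

```python
def fail_over(nodes, master):
    """Return (new_master, steps) for a cluster described by a dict of node states.

    nodes maps a node name to {"online": bool, "applied_position": int}.
    """
    if nodes[master]["online"]:
        return master, []  # nothing to do

    steps = [
        f"1. detected non-responsive node {master}",
        "2. halted incoming connections",
    ]
    # Only online slaves are candidates for promotion.
    candidates = {
        name: state
        for name, state in nodes.items()
        if name != master and state["online"]
    }
    if not candidates:
        raise RuntimeError("no online slave available for promotion")

    # Promote the slave that has applied the most transactions.
    new_master = max(candidates, key=lambda n: candidates[n]["applied_position"])
    steps.append(f"3. promoted most up-to-date slave {new_master}")
    return new_master, steps


cluster = {
    "db1": {"online": False, "applied_position": 120},  # failed master
    "db2": {"online": True, "applied_position": 118},
    "db3": {"online": True, "applied_position": 120},
}
new_master, steps = fail_over(cluster, master="db1")
print(new_master)
print("\n".join(steps))
```

Choosing the slave with the highest applied position minimizes the number of transactions that could be missing after promotion.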
Failed nodes can be reprovisioned from a backup with a single management command
4. Administrator inspects and recovers old master
[Figure: a Continuent Connector in front of the new master, a slave, and the shunned old master (marked X).]
Continuent clusters support zero-downtime maintenance operations, from parameter changes to application upgrades
•  Task: change the InnoDB log file size
•  Problem: requires a mysqld restart, which can cause application downtime
•  Constraint: avoid application-visible restart
•  Solution: upgrade nodes in succession
Rolling maintenance proceeds node by node, starting with slaves and proceeding to master
Sequence: Slave upgrade, Slave upgrade, Switch, Master upgrade
•  Shun slave
•  Resize journal, restart mysqld
•  Return node to cluster
•  Discard and reprovision on failure
•  Repeat for remaining slave(s)
•  Switch master to promote an upgraded slave
•  Upgrade old master
•  Maintenance is now done!
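The ordering of a rolling upgrade can be written down as a plan generator. This is a minimal sketch of the sequence described above; the action names (`shun`, `upgrade`, `recover`, `switch`) are labels chosen for this illustration and are not presented as actual cluster management commands.

```python
def rolling_maintenance_plan(master, slaves):
    """Return an ordered list of (action, node) steps for a rolling upgrade."""
    if not slaves:
        raise ValueError("rolling maintenance needs at least one slave")

    plan = []
    # Upgrade every slave first; the master keeps serving the application.
    for slave in slaves:
        plan.append(("shun", slave))     # take the slave out of the cluster
        plan.append(("upgrade", slave))  # e.g. resize the log file, restart mysqld
        plan.append(("recover", slave))  # return the node to the cluster

    # Promote an already upgraded slave, so the old master becomes a slave.
    plan.append(("switch", slaves[0]))

    # Finally upgrade the old master like any other slave.
    plan.append(("shun", master))
    plan.append(("upgrade", master))
    plan.append(("recover", master))
    return plan


plan = rolling_maintenance_plan("db1", ["db2", "db3"])
for action, node in plan:
    print(f"{action:8s} {node}")
```

Because the only node restarted at any time is a shunned slave, the application never connects to a node that is in the middle of a restart.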
Transaction Scaling with Master/Slave Topologies
Size and transaction activity on business data depend on many factors
[Chart: SaaS Datasets, Size of Top 1000 Customers. Y-axis: dataset size in gigabytes (0 to 1400); X-axis: customers. Max = 1214 GB; 99th percentile = 290 GB; median = 2.6 GB. Source: statistics provided by a Continuent customer.]
DBMS workloads are correspondingly varied
•  Complex queries
•  Large batch operations
•  Small online transactions
•  Analytic reports
[Figure: a slave, a master, and a slave; each node runs a Manager and a Replicator.]
Asynchronous replication decouples transaction processing on master and slave DBMS nodes
[Figure: on the master, a Replicator reads the MySQL DBMS logs and writes transactions plus metadata to its THL. The slave Replicator downloads transactions via the network into its own THL (transactions plus metadata) and applies them to the slave MySQL using JDBC.]
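The decoupling shown in the figure can be modeled with two small classes. This is a simplified sketch, assuming each THL is an append-only list of numbered events; `MasterReplicator`, `SlaveReplicator`, and their methods are hypothetical, and the dictionary update stands in for applying SQL through JDBC.

```python
class MasterReplicator:
    """Extracts committed transactions from the DBMS log into a local THL."""

    def __init__(self):
        self.thl = []  # append-only list of {"seqno": int, "txn": (key, value)}

    def extract(self, dbms_log):
        # Pick up only the log entries that are not yet in the THL.
        for txn in dbms_log[len(self.thl):]:
            self.thl.append({"seqno": len(self.thl) + 1, "txn": txn})


class SlaveReplicator:
    """Downloads events into its own THL and applies them at its own pace."""

    def __init__(self):
        self.thl = []
        self.applied_seqno = 0
        self.table = {}  # stand-in for the slave DBMS

    def download(self, master):
        # Copy events the slave has not stored yet (the network transfer).
        self.thl.extend(master.thl[len(self.thl):])

    def apply(self, limit=None):
        pending = self.thl[self.applied_seqno:]
        if limit is not None:
            pending = pending[:limit]
        for event in pending:
            key, value = event["txn"]
            self.table[key] = value  # stand-in for "apply using JDBC"
            self.applied_seqno = event["seqno"]


dbms_log = [(1, 6), (7, 25)]  # transactions already committed on the master
master, slave = MasterReplicator(), SlaveReplicator()

master.extract(dbms_log)
slave.download(master)
slave.apply(limit=1)  # the slave is allowed to lag behind
print(len(slave.thl), slave.applied_seqno, slave.table)
slave.apply()
print(slave.applied_seqno, slave.table)
```

The master's commits never wait on `download` or `apply`, which is the sense in which transaction processing on the two nodes is decoupled; the cost is that the slave can lag, as the intermediate state shows.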
Parallel apply maximizes DBMS I/O bandwidth when updating replicas
[Figure: slave replicator pipeline. The master replicator's THL (transactions plus metadata) feeds a series of stages on the slave, each made up of Extract, Filter, and Apply steps; a parallel queue feeds several such stages that apply to the slave in parallel.]
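One way to picture the parallel queue is to partition events into independent channels. The sketch below assumes transactions are assigned to channels by a shard key such as the schema name, so that order is preserved within a shard; that partitioning rule, and the `partition`, `apply_channel`, and `parallel_apply` functions, are assumptions made for this illustration rather than a description of the replicator's internals.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition(events, channels):
    """Assign each (seqno, shard, statement) event to a channel by shard name."""
    queues = [[] for _ in range(channels)]
    for seqno, shard, statement in events:
        # crc32 gives a stable assignment across runs, unlike the built-in hash().
        index = zlib.crc32(shard.encode("utf-8")) % channels
        queues[index].append((seqno, shard, statement))
    return queues


def apply_channel(queue):
    """Apply one channel's events in order; returns the sequence numbers applied."""
    applied = []
    for seqno, shard, statement in queue:
        # Stand-in for executing the statement against the slave DBMS.
        applied.append(seqno)
    return applied


def parallel_apply(events, channels=4):
    queues = partition(events, channels)
    with ThreadPoolExecutor(max_workers=channels) as pool:
        return list(pool.map(apply_channel, queues))


events = [
    (1, "sales", "INSERT ..."),
    (2, "hr", "UPDATE ..."),
    (3, "sales", "UPDATE ..."),
    (4, "ops", "INSERT ..."),
    (5, "hr", "DELETE ..."),
]
for channel, seqnos in enumerate(parallel_apply(events, channels=3)):
    print(channel, seqnos)
```

Events for the same shard always land in the same channel, so they are applied in their original order, while unrelated shards can proceed concurrently and keep more of the slave's I/O capacity busy.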
Demo: Scalable transaction processing
Distributing Data between Regions and to Other DBMS types
Continuent Disaster Recovery creates composite clusters that span sites and are ready for immediate failover
[Figure: SJC master service (one master, two slaves) and NYC slave service (one relay, two slaves), each behind a Continuent Connector, linked by cross-region asynchronous master/slave replication.]
Continuent multi-master, cross-site clusters operate independent, active clusters on 2 or more remote sites
[Figure: SJC service and NYC service, each with one master and two slaves behind a Continuent Connector, linked by cross-region asynchronous multi-master replication.]
The same replication mechanism supports real-time loading of data warehouses
[Figure: SJC service (one master and two slaves behind a Continuent Connector) replicating to a Hadoop cluster.]
Wrap-Up
Master/slave clustering is a robust technology for enterprise data management!
•  Very high availability
•  Updates propagated without cost to applications
•  Transparent connectivity with full SQL semantics
•  Very high performance
Continuent offers…
•  Highly available clusters of off-the-shelf MySQL servers
•  Zero-downtime maintenance and upgrade
•  High performance regardless of data volume or distance
•  Replication over regions to DR sites as well as non-MySQL data warehouses
For more information, contact us:
Robert Noyes
Alliance Manager, USA & Canada
rnoyes@vmware.com
+1 (650) 575-0958
Philippe Bernard
Alliance Manager, EMEA & APAC
pbernard@vmware.com
+41 79 347 1385
Eero Teerikorpi
Sr. Director, Strategic Alliances
eteerikorpi@vmware.com
+1 (408) 431-3305
