DR to The Cloud
with VMware Site Recovery Manager and
Rackspace Disaster Recovery Planning Services
Paul Croteau, Rackspace
Bryan Evans, VMware
BCO5170
#BCO5170
Agenda
• Simple and Reliable DR with vSphere and SRM
  • SRM Simplifies and Automates the Traditional Runbook Model for DR
• SRM as a Service
  • Thousands of Customers Already Rely on SRM
  • DR to The Cloud with SRM is offered by an expanding list of Service Providers
• Rackspace® Replication Manager & the Managed Hosting Advantage
  • Identify and Understand the Problem
  • Plan Accordingly
  • Maintain Your Focus By Working With Partners
Simple and Reliable DR with vSphere and SRM
Challenges of Traditional Disaster Recovery
[Figure: three challenges of traditional DR: Expensive, Complex Recovery Plans, Unreliable Failovers. Diagram labels: Apps, Hosts, Storage, Network; Software, Hosts, Storage, Facilities; >$10K per app]
Failure to meet business requirements
• Long RTOs: days to weeks
• Too much time and resources consumed
vSphere Provides the Best Foundation for Disaster Recovery
Flexible Infrastructure
• Eliminate need for identical hardware across sites
• Enable waterfalling of equipment to recovery site
Simple Application Protection
• Entire system (including application, OS, and data) is stored as virtual machine files
• Entire system can be protected with data protection tools
Cost-Efficient Infrastructure
• Reduced hardware requirements at recovery site
• Use recovery hardware to run low-priority apps
[Figure: vSphere properties: Encapsulation, Consolidation, Hardware Independence]
Site Recovery Manager Complements vSphere for DR
SRM
• Streamline planned migrations and automated failback
• Automated DR failover and non-disruptive testing
• Simple management of recovery and migration plans
• vSphere Replication
vSphere
• Encapsulation for simple recovery of entire systems
• Hardware independence at failover site
• Consolidation to reduce costs
[Figure labels: Traditional DR, VMware]
SRM Simplifies Setup and Management of Recovery Plans
From Complex Runbooks…
• Weeks or months to set up
• Error-prone
• Quickly fall out of sync with apps and infrastructure changes
…to Simple Recovery Plans
• Simple recovery plan set up in minutes
• Fewer steps mean far less room for errors
• Simple to keep in sync with changes
SRM Reduces Recovery Risk with Frequent Testing
Traditional disaster recovery: lack of confidence in DR process
• During the testing gap, organizations can’t be sure that they can recover the current IT environment
• A failover scenario may take days or weeks to complete, leaving the business at extreme risk
[Figure: Recovery Risk vs. Time for Traditional Disaster Recovery, showing the testing gap between DR tests]
SRM Reduces Recovery Risk with Frequent Testing
SRM provides assurance that DR objectives will be met
[Figure: Recovery Risk vs. Time in two panels. Traditional Disaster Recovery: testing gap between DR tests. Site Recovery Manager: frequent DR testing.]
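The testing-gap idea above can be made concrete with a small calculation. The following is a minimal sketch, not part of SRM: the function name and the sample dates are hypothetical, and it only measures the longest stretch of days during which no DR test was run (including the time since the most recent test).

```python
from datetime import date


def longest_testing_gap(test_dates, today):
    """Return the longest run of days without a DR test.

    The period between the most recent test and `today` counts as a gap too.
    """
    if not test_dates:
        raise ValueError("at least one DR test date is required")
    checkpoints = sorted(test_dates) + [today]
    gaps = [(later - earlier).days
            for earlier, later in zip(checkpoints, checkpoints[1:])]
    return max(gaps)


# Hypothetical schedules: one test per year versus one test per quarter.
today = date(2013, 8, 1)
annual = [date(2012, 6, 1), date(2013, 6, 1)]
quarterly = [date(2013, 1, 1), date(2013, 4, 1), date(2013, 7, 1)]

print(longest_testing_gap(annual, today))
print(longest_testing_gap(quarterly, today))
```

A shorter longest gap means less time during which the environment may have drifted away from the last verified recovery plan.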
SRM Provides Broad Choice of Replication Options
vSphere Replication
Simple, cost-efficient replication for Tier 2 applications and smaller sites
Storage-based Replication
High-performance replication for business-critical applications in larger sites
[Figure: Site A (Primary) and Site B (Recovery), each running vCenter Server, Site Recovery Manager, and vSphere, connected by vSphere Replication and storage-based replication]
Automated Failback to Streamline Bi-Directional Migrations
Overview
Re-protect VMs from Site B to Site A
• Reverse replication
• Apply reverse resource mapping
Automate failover from Site B to Site A
• Reverse original recovery plan
Restrictions
• Does not apply if Site A has undergone major changes or been rebuilt
• Not available with vSphere Replication
Benefits
Simplify failback process
• Automate replication management
• Eliminate need to set up new recovery plan
Streamline frequent bi-directional migrations
[Figure: Automated Failback between Site A and Site B (vSphere at each site): reverse replication, reverse original recovery plan]
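As an illustration of what "apply reverse resource mapping" means conceptually, here is a minimal sketch that inverts a Site A to Site B mapping so it can be used in the Site B to Site A direction. This is not SRM's implementation or API; the function name and the resource names are hypothetical, and the mapping is modeled as a plain dictionary.

```python
def reverse_mapping(site_a_to_b):
    """Invert a Site A -> Site B resource mapping.

    Raises ValueError if two Site A resources map to the same Site B
    resource, because such a mapping cannot be reversed unambiguously.
    """
    reversed_map = {}
    for source, target in site_a_to_b.items():
        if target in reversed_map:
            raise ValueError(f"{target!r} has more than one source; cannot reverse")
        reversed_map[target] = source
    return reversed_map


# Hypothetical resource names, for illustration only.
forward = {"prod-network": "dr-network", "prod-datastore": "dr-datastore"}
print(reverse_mapping(forward))
```

The ambiguity check mirrors the restriction on the slide in spirit: a reversal is only straightforward when the original site still matches what the mapping describes.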
Planned Migrations For App Consistency & No Data Loss
Overview
Two workflows can be applied to recovery plans:
• DR failover
• Planned migration
Planned migration ensures application consistency and no data loss during migration
• Graceful shutdown of production VMs in application-consistent state
• Data sync to complete replication of VMs
• Recover fully replicated VMs
Benefits
Better support for planned migrations
• No loss of data during migration process
• Recover ‘application-consistent’ VMs at recovery site
[Figure: Planned Migration from Site A to Site B (vSphere at each site, linked by replication)]
1. Shut down production VMs
2. Sync data, stop replication and present LUNs to vSphere
3. Recover app-consistent VMs
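The ordering of the three planned-migration steps matters: recovery should only start once shutdown and the final data sync have succeeded. The sketch below models that ordering in plain Python. It does not call SRM; the step names come from the slide, while the function and the stand-in actions (lambdas that simply report success) are hypothetical.

```python
def run_planned_migration(steps):
    """Run (name, action) steps in order and stop at the first failure.

    Returns (completed_step_names, failed_step_name_or_None). Stopping early
    means VMs are never recovered from data that was not fully synced.
    """
    completed = []
    for name, action in steps:
        if not action():
            return completed, name
        completed.append(name)
    return completed, None


# Stand-in actions: each returns True to signal success.
steps = [
    ("shut down production VMs", lambda: True),
    ("sync data, stop replication, present LUNs to vSphere", lambda: True),
    ("recover app-consistent VMs", lambda: True),
]

completed, failed = run_planned_migration(steps)
print(completed, failed)
```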
Automate DR Failover Processes
Overview
Automatically detect site failures
• Require user to manually initiate failover
Automate recovery process
• Stop replication and present replicated LUNs to vSphere
• Execute user-defined recovery plan
Benefits
Ensure fast and predictable failovers and migrations
• Consistently meet business requirements
Minimize risk of user errors
[Figure: DR Failover from Site A to Site B (vSphere at each site, linked by replication)]
1. Raise alert when heartbeat lost
2. User initiates failover
3. Stop replication and present LUNs to vSphere
4. Recover VMs
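The failover flow separates detection (automatic) from the decision to fail over (manual). A minimal sketch of that separation follows. It is not SRM code: the function names and the 30-second timeout are assumptions chosen for illustration, and timestamps are plain numbers of seconds.

```python
def site_status(last_heartbeat, now, timeout_seconds=30.0):
    """Return "alert" when no heartbeat has arrived within the timeout, else "ok".

    The 30-second default is an arbitrary illustrative value.
    """
    return "alert" if now - last_heartbeat > timeout_seconds else "ok"


def should_fail_over(status, user_confirmed):
    """Failover starts only when an alert is raised AND a user initiates it."""
    return status == "alert" and user_confirmed


status = site_status(last_heartbeat=100.0, now=200.0)
print(status, should_fail_over(status, user_confirmed=False))
print(status, should_fail_over(status, user_confirmed=True))
```

Keeping the human confirmation in the loop avoids failing over on a transient network problem between the sites.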
Testing and Executing Recovery Plans
[Screenshot callouts: steps in recovery plan; status and time stamps; when to execute; user confirmation message]
SRM as a Service
Thousands of Customers Already Rely on SRM
• Introduced in Q2 2008
• Over 1 million VMs protected
• 10,000+ customers
“If your organization is already taking advantage of virtualization, then adding Site Recovery Manager to handle disaster recovery is a no-brainer.”
- Jerry Wilkin, Senior Systems Administrator, Dayton Superior Corp
vCenter Site Recovery Manager Ensures Simple, Reliable DR
Provide cost-efficient replication of applications to failover site
• Support for vSphere Replication
• Broad support for storage-based replication
Simplify management of recovery and migration plans
• Replace manual runbooks with centralized recovery plans
• From weeks to minutes to set up new plan
Automate failover and migration processes for reliable recovery
• Enable frequent non-disruptive testing
• Ensure fast, automated failover
• Automate failback processes
Site Recovery Manager complements vSphere to provide the simplest and most reliable disaster protection and site migration for all applications
[Figure: Site A (Primary) and Site B (Recovery), each with servers running VMware vSphere, VMware vCenter Server, and Site Recovery Manager]
DR to the Cloud With SRM Offered By Several VSPP Partners
DR to the Cloud services currently offered by:
• Rackspace
• FusionStorm
• Hosting.com
• IIJ (Japan)
• iLand
• SunGard
• Terremark
• VeriStor
Providers offer a variety of pricing, packaging, service levels and deployment options
[Figure: public cloud service provider]
Establish the Business Problem
• IT focus should be on innovation, not Disaster Recovery
• Do you have DR staff or expertise onsite?
• Do you have the time or ability to design, implement, and test recovery solutions that meet your business requirements?
• Do you know your business requirements?
• Do you know your cost of downtime?
Business Impact Analysis
• You own the DR process.
• How do you prioritize?
• What exactly are you protecting?
  • Business Operations
  • Revenue
  • Data
  • Customers
  • All of the above
• Do you know your cost of downtime?
[Figure: Multi-Layer Business View]
The Numbers
• The average cost of data center downtime across industries: approximately $5,600 per minute
• For a partial data center outage, averaging 59 minutes in length, average costs were approximately $258,000
• For total data center outages, which had an average recovery time of 134 minutes, average hourly costs were approximately $680,000
• 93% of companies that lost their data for 10 days or more filed for bankruptcy within one year of the disaster, and 50% filed for bankruptcy immediately
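As a rough way to apply the per-minute figure above to your own situation, the sketch below multiplies an outage duration by a cost rate. It is a back-of-the-envelope estimate only: the function name is hypothetical, the default rate is the cross-industry average quoted above, and the per-outage averages in the other bullets are separately reported figures, so they will not equal rate times duration.

```python
COST_PER_MINUTE_USD = 5_600  # cross-industry average quoted above


def downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Estimate the cost of an outage as duration times a flat per-minute rate."""
    if minutes < 0:
        raise ValueError("outage duration cannot be negative")
    return minutes * cost_per_minute


# One hour of downtime at the quoted average rate.
print(downtime_cost(60))
```

Replacing the default rate with a figure derived from your own revenue and operations gives a more meaningful number than the industry average.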
RTO vs RTA
• RTO (Recovery Time Objective) is the target when architecting the initial recovery solution
• RTA is the Recovery Time Actual at a single point in time after deployment
[Figure: outage timeline marking RPO, RTO, RTA, and Recovery Completed]
Need to Understand Time Discrepancy Between RTO and RTA
• Failover tests are critical and help you understand the Recovery Time Actual
• SRM enables non-disruptive failover testing without impacting production systems
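One way to track the discrepancy between RTO and RTA is to record the recovery time measured in each failover test and compare the worst result against the target. The sketch below assumes times in minutes; the function name and the sample measurements are hypothetical.

```python
def rta_report(rto_minutes, measured_rta_minutes):
    """Compare measured recovery times (RTA) from failover tests with the RTO.

    Uses the worst (longest) measurement, since the objective must hold
    for every recovery, not just the average one.
    """
    if not measured_rta_minutes:
        raise ValueError("at least one failover test result is required")
    worst = max(measured_rta_minutes)
    return {
        "worst_rta": worst,
        "gap": worst - rto_minutes,  # positive means the target was missed
        "meets_rto": worst <= rto_minutes,
    }


# Hypothetical results from three failover tests against a 4-hour RTO.
print(rta_report(240, [190, 265, 230]))
```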
[Figure: DR tiers 1 to 4, ranging from hot through warm to cold, with RTO, RPO, and price ($$$ to $) per tier. Range labels shown in the figure: 0-2, 2-6, 4-24+, 24-48+, 0-24, 0-24. Technologies shown: GSLB, DNS Failover, Array-based Replication, Host-based Replication, DB Replication (Transactional), DB Replication (Log Shipping), VM Replication, MBU (Disk), MBU (Tape), MBU (Offsite)]
These are elements of DR, not an end-to-end solution: process, policies and procedures are missing.
Downtime Calculator
www.rackspace.com/disaster-recovery-planning/
Rackspace DR Planning Services
• DR specialists
• VMware Certified Professionals
• EMC & NetApp certified engineers
• Designed and deployed thousands of DR-related environments just like yours
• US & UK DCs available for SRM
[Figure: Sample DR Reference Architecture]
Rackspace Replication Manager Description
• Powered by VMware® vCenter™ Site Recovery Manager™
• Relies on Rackspace Array-based Replication
• EMC RecoverPoint or NetApp SnapMirror
[Figure: Rackspace Replication Manager]
Rackspace Assists with Testing and Provides Reports
• Rackspace includes a failover test every three months to ensure you meet your DR compliance requirements
  • Companies don’t test their failover plans enough
  • Some replication services charge per test, which gets expensive
  • The failover/failback process can be risky in production
  • Risk dictates extensive planning around every test
• We offer a Recovery Time SLA based on a failover test every three months
Large VM Environments Are Dynamic and Change Constantly
• Change management in source and target sites is challenging
• Manual runbooks vs SRM automation
Manual runbooks
• Weeks or months to set up
• Error-prone
• Quickly fall out of sync with apps and infrastructure changes
SRM automation
• Simple recovery plan set up in minutes
• Fewer steps mean far less room for errors
• Simple to keep in sync with changes
Resiliency Solutions Portfolio
• SRM is the linchpin, dedicated storage, >1hr replication
• Database Replication and DNS failover
• All backed by Fanatical Support
[Figure: Prod and DR sites; labels: RPA, SRM, vSphere-based, Array-based]
Key Takeaways
• VMware vCenter Site Recovery Manager, in conjunction with vSphere, enables simple and reliable DR at much lower cost than traditional DR
• Site Recovery Manager is expanding options for single-site customers by working with Service Providers to build out SRM as a Service offerings
• DR is owned by the business but can be facilitated through outsourced relationships. Know your cost of downtime, prioritize your apps, then select the right tools and partners
Other VMware Activities Related to This Session
• HOL: HOL-SDC-1305, Business Continuity and Disaster Recovery In Action
• Group Discussions: BCO1003-GD, Disaster Recovery and Replication with Ken Werneburg
THANK YOU
VMworld 2013: DR to The Cloud with VMware Site Recovery Manager and Rackspace Disaster Recovery Planning Services