Data Center Awareness
    Scott Hernandez, 10gen
Agenda
●   Infrastructure
●   Players and moving pieces
●   Goals and challenges
●   MongoDB solutions
    ○ Now (Replication, Sharding, Tagging)
    ○ Future (+=, Shard affinity, more Tagging)
Multiple Online Data Centers
● Europe           ● USA
  ○ App              ○ App
  ○ Data             ○ Data


● Asia
  ○ App
Infrastructure
●   Geographically disparate clients
●   Geo-DNS/Load Balancer (LB)
●   Accelerators/Reverse-Proxies
●   Server/Application
Players Introduction
● Users

● Data Centers
  ○ Application Servers
  ○ Databases


● Application Awareness
  ○ Configuration
  ○ Why?
Goals
● Reduce network
    ○ latency
    ○ Inter-dc traffic
● Localize resource use
    ○ Reduce failure cases
    ○ Increase availability
    ○ Isolate dependencies
●   Provide multiple active sites
●   Partition geo/regional data
Challenges
● Data consistency
    ○ User experience
    ○ Backup and operational needs
● Scaling
● Partitioning/Sharding
Multiple Online Data Centers
● Europe           ● USA
  ○ App              ○ App
  ○ Data             ○ Data


● Asia
  ○ App
Non-default behaviors
Default:
● Primary read/write
● No stale reads
Multi-DC needs:
● Read locally
● Support some stale reads
Replication
● Replica Sets
  ○ Possible to read from non-primary replicas
  ○ Copy of data
  ○ Write Concern
    ■ Tagging
    ■ Verifiable writes


● Provides
  ○ Isolation (possible stale reads)
  ○ Availability
  ○ Distribution of read, possibly stale (WriteConcern)
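The tagging and write-concern bullets above can be illustrated with a replica set configuration document. This is a minimal sketch written as a Python dict, not a configuration taken from the talk: the set name, host names, tag values, and the mode name multi_dc are hypothetical. The fields members[].tags and settings.getLastErrorModes are the replica set configuration fields MongoDB uses for tagged write concerns.

```python
# Hypothetical three-member replica set spread over three data centers.
# Host names, tag values, and the mode name "multi_dc" are illustrative only.
replica_set_config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "ny1.example.net:27017", "tags": {"dc": "usa-ny"}},
        {"_id": 1, "host": "ca1.example.net:27017", "tags": {"dc": "usa-ca"}},
        {"_id": 2, "host": "eu1.example.net:27017", "tags": {"dc": "europe"}},
    ],
    "settings": {
        # A write using w="multi_dc" is acknowledged once it has reached
        # members carrying at least 2 distinct values of the "dc" tag.
        "getLastErrorModes": {"multi_dc": {"dc": 2}},
    },
}


def distinct_tag_values(config, tag):
    """Return the set of values that members declare for one tag."""
    return {
        member["tags"][tag]
        for member in config["members"]
        if tag in member.get("tags", {})
    }
```

A mode can only be satisfied if the set declares at least as many distinct tag values as the mode requires, which distinct_tag_values makes easy to check before applying a configuration.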
Replica Writes
[Diagram. Labels: Client, Write (Replica_Safe), Ack; replica set members USA-NY, USA-CA, EUROPE; second Client, Query (EUROPE).]
Tagged Writes
Uses
● Multiple Racks/DCs
● Backups
● Disaster Recovery
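How a tagged write gets acknowledged can be sketched as a small check over the members that have confirmed the write. This is a simplified model under stated assumptions, not server code: the function name satisfies_mode and the tag names dc and use are hypothetical, and a mode such as {"dc": 2} is read as "at least 2 distinct values of the dc tag".

```python
def satisfies_mode(mode, acked_member_tags):
    """Check a tagged write-concern mode against the members that acked.

    mode: e.g. {"dc": 2} means "at least 2 distinct values of tag 'dc'".
    acked_member_tags: one tag dict per member that acknowledged the write.
    """
    for tag, required in mode.items():
        # Collect the distinct values of this tag among acknowledging members.
        seen = {tags[tag] for tags in acked_member_tags if tag in tags}
        if len(seen) < required:
            return False
    return True


# Two acks from the same data center do not satisfy a two-DC mode.
same_dc = [{"dc": "usa-ny"}, {"dc": "usa-ny", "use": "backup"}]
# Acks from two different data centers do.
two_dcs = [{"dc": "usa-ny"}, {"dc": "usa-ca"}]
```

The same check covers the backup and disaster recovery uses: a mode like {"use": 1} would require that at least one member tagged for backups has the write.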
Sharding + Replication
● Range-based sharding (chunks)
  ○ Not tag aware
  ○ Random distribution (balancer)
● Shards made up of Replica Sets
  ○ All advantages
● Writes to (primary) shard per chunk
● Reads
  ○ From Primary by default
  ○ Optional non-primary reads
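Range-based routing to a shard per chunk can be sketched with a sorted list of chunk lower bounds. The shard key user_id, the bounds, and the shard names below are hypothetical, and the lookup is only an illustration of the idea, not how the router is implemented.

```python
import bisect

# Hypothetical chunk table for a shard key "user_id": each entry is the
# inclusive lower bound of a chunk and the shard that currently owns it.
# A chunk runs up to (but excluding) the next lower bound.
CHUNK_LOWER_BOUNDS = [float("-inf"), 1000, 5000, 9000]
CHUNK_OWNERS = ["shard-b", "shard-a", "shard-c", "shard-a"]


def shard_for_key(key):
    """Return the shard owning the chunk whose range contains key."""
    # bisect_right finds the first bound greater than key; the chunk that
    # contains key is the one just before it.
    index = bisect.bisect_right(CHUNK_LOWER_BOUNDS, key) - 1
    return CHUNK_OWNERS[index]
```

Note that ownership here has nothing to do with where the client is located, which is the "not tag aware" limitation on the slide: neighbouring ranges can land on shards in different data centers.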
New Features
● ReadPreference
  ○   Primary (only)
  ○   Secondary (preferred)
  ○   PrimaryFirst?
  ○ SecondaryOnly?
  ○   Any (closest)


● Replica Sets
  ○ ReadPreference
● Sharding
  ○ Tagged balancing
  ○ ReadPreference
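Member selection under a read preference can be sketched as follows. The mode strings are the ones MongoDB drivers later adopted (primary, primaryPreferred, secondary, secondaryPreferred, nearest), not the tentative names on the slide. The member fields is_primary and ping_ms are hypothetical stand-ins, and always picking the single lowest-latency candidate is a simplification.

```python
def select_member(members, mode):
    """Pick a member to read from, or None if the mode cannot be satisfied.

    members: list of dicts with "host", "is_primary", and "ping_ms"
    (ping_ms is a hypothetical stand-in for measured latency).
    """
    primaries = [m for m in members if m["is_primary"]]
    secondaries = [m for m in members if not m["is_primary"]]

    def closest(candidates):
        # Lowest latency wins; None when there is no candidate at all.
        return min(candidates, key=lambda m: m["ping_ms"]) if candidates else None

    if mode == "primary":
        return closest(primaries)
    if mode == "primaryPreferred":
        return closest(primaries) or closest(secondaries)
    if mode == "secondary":
        return closest(secondaries)
    if mode == "secondaryPreferred":
        return closest(secondaries) or closest(primaries)
    if mode == "nearest":
        return closest(members)
    raise ValueError(f"unknown read preference mode: {mode}")
```

With a primary in USA-NY and a secondary in EUROPE, a European application server using nearest or secondaryPreferred would read from the local member and accept possibly stale data, while primary keeps the default behaviour.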
Sharding - Reads
● Local when non-primary
● Tagged
  ○ Custom Tagging
  ○ By region/dc/rack?
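Tagged reads by region, data center, or rack can be sketched as matching an ordered list of tag sets against member tags. The function name members_matching and the tag names are hypothetical; the fallback order (most specific first, an empty tag set matching everything) is the assumption this sketch makes.

```python
def members_matching(members, tag_sets):
    """Return members matching the first tag set that matches anything.

    tag_sets is an ordered list of tag dicts, most specific first; an empty
    dict matches every member and works as a final fallback.
    """
    for tag_set in tag_sets:
        matched = [
            member for member in members
            if all(member["tags"].get(k) == v for k, v in tag_set.items())
        ]
        if matched:
            return matched
    return []
```

For example, an application in a New York rack could ask for [{"dc": "usa-ny", "rack": "r1"}, {"region": "usa"}, {}]: same rack first, then any member in the region, then anything available.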
Sharding - Balancing/Distribution
● Tag chunks/ranges
● Possible super-chunks (grouping)
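Tagging chunks or ranges so the balancer keeps them on matching shards can be sketched like this. Everything here is an assumption made for illustration: the shard names, the tags EU and US, the string shard keys prefixed with a region, and the rule that a chunk inherits a tag only when a tagged range fully contains it.

```python
# Hypothetical tag assignments: which shards carry which tag, and which
# shard-key ranges (inclusive lower bound, exclusive upper bound) are tagged.
SHARD_TAGS = {"shard-eu": {"EU"}, "shard-us1": {"US"}, "shard-us2": {"US"}}
TAG_RANGES = [("eu:", "eu;", "EU"), ("us:", "us;", "US")]


def tag_for_chunk(chunk_min, chunk_max):
    """Return the tag whose range fully contains the chunk, else None."""
    for range_min, range_max, tag in TAG_RANGES:
        if range_min <= chunk_min and chunk_max <= range_max:
            return tag
    return None


def allowed_shards(chunk_min, chunk_max):
    """Shards the balancer may place this chunk on under the tag rules."""
    tag = tag_for_chunk(chunk_min, chunk_max)
    if tag is None:
        return sorted(SHARD_TAGS)  # untagged chunks may live anywhere
    return sorted(shard for shard, tags in SHARD_TAGS.items() if tag in tags)
```

The range bounds rely on ";" sorting immediately after ":" so that every key starting with "eu:" falls inside the EU range. Grouping all chunks that share a tag is one way to read the "super-chunks" idea on the slide.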

MongoDB Datacenter Awareness (mongosf2012)
