The document discusses strategies for managing data across multiple online data centers. Replication provides high availability and distributes copies of the data across regions. Sharding on top of replica sets partitions the data while keeping the advantages of replication. New MongoDB features such as read preferences and tagged balancing help keep reads, and the data they touch, local to each data center.
8. Multiple Online Data Centers
● Europe
○ App
○ Data
● USA
○ App
○ Data
● Asia
○ App
9. Non-default behaviors
Default:
● Primary read/write
● No stale reads
Multi-DC needs:
● Read locally
● Support some stale reads
10. Replication
● Replica Sets
○ Possible to read from non-primary replicas
○ Copy of data
○ Write Concern
■ Tagging
■ Verifiable writes
● Provides
○ Isolation (possible stale reads)
○ Availability
○ Distribution of reads, possibly stale (WriteConcern)
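
The "Tagging" and "Verifiable writes" bullets above can be illustrated with a small model of a tag-based write concern. This is a minimal sketch, not driver code: the names `Member`, `satisfies_mode`, and `MULTI_DC` are hypothetical, and the rule it assumes is that a mode such as `{"dc": 2}` is satisfied once the write has been acknowledged by members carrying at least two distinct values of the `dc` tag.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """A replica set member and its tags (hypothetical model, not a driver type)."""
    name: str
    tags: dict = field(default_factory=dict)


def satisfies_mode(mode, acked_members):
    """Return True if the acknowledging members satisfy a tag-based mode.

    mode maps a tag key to the number of distinct tag values that must
    have acknowledged the write, e.g. {"dc": 2} means "two different DCs".
    """
    for key, needed in mode.items():
        distinct_values = {m.tags[key] for m in acked_members if key in m.tags}
        if len(distinct_values) < needed:
            return False
    return True


# Assumed topology: two members in Europe, one in the USA.
eu1 = Member("eu1", {"dc": "europe"})
eu2 = Member("eu2", {"dc": "europe"})
us1 = Member("us1", {"dc": "usa"})

MULTI_DC = {"dc": 2}  # the write must reach two distinct data centers

in_one_dc = satisfies_mode(MULTI_DC, [eu1, eu2])   # False: both acks are from Europe
in_two_dcs = satisfies_mode(MULTI_DC, [eu1, us1])  # True: Europe and USA
```

Counting distinct tag values rather than members is what makes the write verifiable across data centers: two acknowledgements from the same site do not count as geographic redundancy.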
13. Sharding + Replication
● Range-based sharding (chunks)
○ Not tag aware
○ Random distribution (balancer)
● Shards made up of Replica Sets
○ All advantages
● Writes go to the primary of the shard that owns the chunk
● Reads
○ From Primary by default
○ Optional non-primary reads
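
Range-based routing as outlined on this slide can be sketched as a lookup over sorted chunk boundaries. This is an illustrative model under stated assumptions: the chunk boundaries, shard names, and host names below are invented, and the real router and balancer are considerably more involved.

```python
import bisect

# Each chunk covers [lower_bound, next_lower_bound); bounds are kept sorted.
# Boundaries and shard assignments are made-up sample data.
CHUNK_LOWER_BOUNDS = ["", "g", "n", "t"]
CHUNK_TO_SHARD = ["shardA", "shardB", "shardA", "shardC"]

# Every shard is a replica set; only its primary accepts writes.
SHARD_PRIMARY = {
    "shardA": "a1.example.net",
    "shardB": "b1.example.net",
    "shardC": "c1.example.net",
}


def shard_for_key(shard_key):
    """Find the shard that owns the chunk containing shard_key."""
    index = bisect.bisect_right(CHUNK_LOWER_BOUNDS, shard_key) - 1
    return CHUNK_TO_SHARD[index]


def write_target(shard_key):
    """Writes always go to the primary of the owning shard."""
    return SHARD_PRIMARY[shard_for_key(shard_key)]


alice_shard = shard_for_key("alice")      # "shardA" (chunk ["", "g"))
mallory_shard = shard_for_key("mallory")  # "shardB" (chunk ["g", "n"))
zed_primary = write_target("zed")         # "c1.example.net" (chunk ["t", ...))
```

Note that nothing in this lookup knows where a shard physically lives, which is the "not tag aware" limitation listed above: a chunk can land on any shard regardless of region.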
14. New Features
● ReadPreference
○ Primary (only)
○ Secondary (preferred)
○ PrimaryFirst?
○ SecondaryOnly?
○ Any (closest)
● Replica Sets
○ ReadPreference
● Sharding
○ Tagged balancing
○ ReadPreference
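
The read preference modes listed above can be modeled as a member selection function. This is a hedged sketch: the mode strings follow the names MongoDB drivers later standardized (`primary`, `primaryPreferred`, `secondary`, `secondaryPreferred`, `nearest`), and mapping them onto the slide's tentative names is an assumption. `ReplicaMember` and `select_member` are hypothetical, and picking the single lowest-latency member is a simplification of what real drivers do.

```python
from dataclasses import dataclass


@dataclass
class ReplicaMember:
    """Hypothetical view of a replica set member as seen by one client."""
    host: str
    is_primary: bool
    ping_ms: float


def _closest(candidates):
    """Return the lowest-latency candidate, or None if there are none."""
    return min(candidates, key=lambda m: m.ping_ms) if candidates else None


def select_member(members, mode):
    """Pick the member a read would be sent to under the given mode."""
    primaries = [m for m in members if m.is_primary]
    secondaries = [m for m in members if not m.is_primary]
    if mode == "primary":
        return _closest(primaries)
    if mode == "primaryPreferred":
        return _closest(primaries) or _closest(secondaries)
    if mode == "secondary":
        return _closest(secondaries)
    if mode == "secondaryPreferred":
        return _closest(secondaries) or _closest(primaries)
    if mode == "nearest":
        return _closest(members)
    raise ValueError(f"unknown read preference mode: {mode}")


# Assumed latencies as measured by an application server in the USA.
members = [
    ReplicaMember("eu.example.net", is_primary=True, ping_ms=120.0),
    ReplicaMember("us.example.net", is_primary=False, ping_ms=5.0),
    ReplicaMember("asia.example.net", is_primary=False, ping_ms=200.0),
]

default_read = select_member(members, "primary").host           # "eu.example.net"
local_read = select_member(members, "secondaryPreferred").host  # "us.example.net"
```

With the default mode the USA application pays a cross-ocean round trip on every read. Switching to a secondary-preferring or nearest mode trades that latency for the possibility of stale reads.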
15. Sharding - Reads
● Local when non-primary
● Tagged
○ Custom Tagging
○ By region/dc/rack?
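
Tagged reads by region, data center, or rack can be sketched as matching an ordered list of tag sets against member tags. This is a minimal model under stated assumptions: the host names and tag values are invented, `match_tag_sets` is a hypothetical helper, and the rule it assumes is that the first tag set matching at least one member wins, with an empty tag set matching every member.

```python
def match_tag_sets(members, tag_sets):
    """Return the hosts matching the first satisfiable tag set.

    members  -- list of (host, tags) pairs, where tags is a dict
    tag_sets -- ordered list of dicts, most specific preference first
    """
    for wanted in tag_sets:
        matched = [
            host
            for host, tags in members
            if all(tags.get(key) == value for key, value in wanted.items())
        ]
        if matched:
            return matched
    return []


# Made-up topology tagged by region, data center, and rack.
members = [
    ("eu-1", {"region": "europe", "dc": "ams", "rack": "r1"}),
    ("us-1", {"region": "usa", "dc": "nyc", "rack": "r7"}),
    ("us-2", {"region": "usa", "dc": "sfo", "rack": "r2"}),
]

# Prefer the local data center, then the local region, then anything.
same_dc = match_tag_sets(members, [{"dc": "nyc"}, {"region": "usa"}, {}])
# ["us-1"]

# No member is tagged dc=lon, so the region-level tag set is used instead.
same_region = match_tag_sets(members, [{"dc": "lon"}, {"region": "usa"}])
# ["us-1", "us-2"]
```

Ordering the tag sets from most to least specific gives a graceful fallback: a client reads from its own rack or data center when possible and widens the search only when no local member qualifies.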