DAT201- Understanding AWS Database Options
Sundar Raghavan – Amazon RDS
Zac Sprackett – Vice President of Operations with SugarCRM
Michael Thomas – Principal Software Engineer with Scopely
November 13, 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Today’s discussion
AWS Database Options and Decision Factors
Best Practice Tips and Techniques
SugarCRM
Scopely
Q&A
Starting with the Customer
• How many of you use databases on AWS?
• How many of you use Amazon RDS, Amazon DynamoDB, Amazon
Redshift, or Amazon ElastiCache?
• How many of you have a well-defined DR strategy for your
databases?
• How many of you are building geospatial and context-sensitive
applications?
• We suggest that you attend Werner’s keynote!
Introducing: Cross Region Support
US GovCloud (US ITAR Region -- Oregon)
US West x 2 (N. California and Oregon)
US East (Northern Virginia)
LATAM (Sao Paulo)
Europe West (Dublin)
Asia Pacific Region (Singapore)
Asia Pacific Region (Tokyo)
Australia Region (Australia)
>10 data centers in US East alone
9 AWS Regions including 25 Availability Zones and growing
46 world-wide points of presence
• RDS Snapshot Copy
• All engines
Zoopla
“We are very happy with RDS cross region snapshot copy feature as it gives
us the ability to copy our data from one AWS region to another AWS region
with minimal effort.
Prior to this feature, it used to take 3 days and a number of manual steps to
copy our snapshots. Now we have an automated process that helps us to
achieve disaster recovery capabilities in just few steps.”
Joel Callaway, IT Operations Manager
Zoopla Property Group Ltd, UK
Your Mission is Clear
1. Zero to App in ____ Minutes
2. Zero to Millions of users in ____ Days
3. Zero to “Hero” in ____ Months
Focus on your App
Your Stack
Load balancer
Application tier

Database tier
Your Stack of Worries
Load balancer
Security, Scale, Availability…

Application tier
Security, Innovation, Scale, Performance, Availability…

Database tier
Security, Innovation, Scale, Transactions, Performance, Durability, Availability, Skills…
Spectrum of Database Options
SQL

NoSQL

Do-it Yourself

Fully
Managed



Low Cost

Not available
on AWS

High Cost
Spectrum of Options
SQL

NoSQL

Do-it Yourself

Fully
Managed
Spectrum of Options
SQL
• Do-it Yourself: MySQL, Oracle, SQL Server, MariaDB, Vertica, ParAccel, …
• Fully Managed: MySQL, Oracle, SQL Server, Amazon Redshift
Spectrum of Options
NoSQL
• Do-it Yourself: MongoDB, Cassandra, Redis, Memcached
• Fully Managed: DynamoDB, ElastiCache (Memcached), ElastiCache (Redis), SimpleDB
Thinking About the Questions
• Should I use MySQL or PostgreSQL?
• Should I use SQL or NoSQL?
• Should I use MongoDB, Cassandra, or DynamoDB?
• Should I use Redis, Memcached, or ElastiCache?
Actually, Thinking About the Right Questions
• What are my transactional and consistency needs?
• What are my scale and latency needs?
• What are my read/write, storage and IOPS needs?
• What are my time to market and server control needs?
Factors to Consider
Factors | SQL | NoSQL
Application | App with complex business logic? | Web app with lots of users?
Transactions | Complex txns, joins, updates? | Simple data model, updates, queries?
Scale | Developer managed | Automatic, on-demand scaling
Performance | Developer architected | Consistent, high performance at scale
Availability | Architected for fail-over | Seamless and transparent
Core Skills | SQL + Java/Ruby/Python/PHP | NoSQL + Java/Ruby/Python/PHP

Best of both worlds: Possible to use SQL and NoSQL models in one app
Factors to Consider
Self-Managed Service
• Full control over the instance, db and OS parameters
• Upgrades, back-ups, fail-over are yours to manage
• All aspects of security are managed by you
• Complex replication topologies and data management

Managed Service
• Off-load the infrastructure and software management
• Automate database life-cycle with APIs
• Focus on database access and app security
• Limited control over replication topologies
Pace of Innovation – a Bonus
RDS team launched 23+ features
• SQL Server TDE, Version upgrade
• Oracle TDE, Statspack, Fine grain access, 3TB/30K IOPS
• Cross Region Snapshot Copy, Parallel replica, Chained replica
• Multi-AZ SLA, Log access, VPC groups, …

NoSQL team launched 10+ features
• Redis engine support
• Amazon DynamoDB Fine grain access control
• Amazon DynamoDB local, Geospatial indexing library
• Transaction library, Local secondary index, parallel scan

Redshift team launched 20+ features
• Encryption with HSM support
• Audit logging, SNS notification, snapshot sharing
• COPY from Amazon EMR/HDFS/SSH
• Faster resize, improved concurrency, distributed tables, …
Amazon RDS is a managed SQL database service.

Choice of Database engines
Simple to deploy and scale
Reliable and cost effective
Without any operational burden
Optimizing for Developer Productivity
Focus on the “innovation”
• Schema design
• Query construction
• Query optimization

Off load the “administration”
• Migration
• Backup and recovery
• Patching
• Configuration
• Software upgrades
• Storage upgrades
• Frequent server upgrades
• Hardware crash
Optimizing for Developer Productivity
• Multiple databases per instance
• Use MySQL tools & drivers
• Quickly set up Read Replicas
• High availability Multi-AZ option (99.95% SLA)
• Ability to promote Read Replicas, rename as Master
• Diagnostics
• Native MySQL replication
• SSL for encryption over the wire
• Monitor metrics
• Not provided: shell, super user or direct file system access (Think security!)

Read Replica setup: MySQL manual OR Amazon RDS console
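The read-replica workflow listed above (create a replica, later promote it) can also be driven through the API rather than the console. The following is a minimal sketch, assuming a boto3 RDS client and its `create_db_instance_read_replica` and `promote_read_replica` calls; the instance identifiers are hypothetical, and `DryRunClient` is an illustrative stub (not part of boto3) so the flow can be exercised without AWS credentials.

```python
def add_read_replica(rds_client, source_instance_id, replica_id):
    """Ask RDS to create a read replica of an existing DB instance."""
    return rds_client.create_db_instance_read_replica(
        DBInstanceIdentifier=replica_id,
        SourceDBInstanceIdentifier=source_instance_id,
    )


def promote_replica(rds_client, replica_id):
    """Promote a read replica to a standalone, writable DB instance."""
    return rds_client.promote_read_replica(DBInstanceIdentifier=replica_id)


class DryRunClient:
    """Hypothetical stub: records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method name is accepted and simply recorded with its arguments.
        def record(**kwargs):
            self.calls.append((name, kwargs))
            return {}
        return record


if __name__ == "__main__":
    client = DryRunClient()  # swap in boto3.client("rds") for a real run
    add_read_replica(client, "prod-mysql", "prod-mysql-replica-1")
    promote_replica(client, "prod-mysql-replica-1")
    for call in client.calls:
        print(call)
```

Passing the client in as an argument keeps the functions testable and makes the AWS dependency explicit.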
ElastiCache is a managed caching service.

Easy to set up and operate cache clusters
Supports Memcached and Redis engines
Scale cache clusters with push button ease
Ultra fast response time for read scaling
Without any operational burden
ElastiCache is a Performance Booster
[Diagram: Clients → Elastic Load Balancing → EC2 App Instances → cache (Redis master and read replica) and RDS MySQL DB instance with PIOPS; arrows labeled “App Reads” and “Cache Updates”]
• Cache: serve most read queries, in-memory performance
• RDS MySQL DB instance with PIOPS: read/write queries, SSD performance
Amazon DynamoDB is a managed NoSQL
database service.
Store and retrieve any amount of data
Scale throughput to millions of IO
Single digit millisecond latencies
Without any operational burden
Optimizing for Developer Productivity
• Manage tables: CreateTable, UpdateTable, DeleteTable, DescribeTable, ListTables
• “Select”, “insert”, “update” items: PutItem, GetItem, UpdateItem, DeleteItem
• Query specific items OR scan the full table: Query, Scan
• Bulk select or update (max 1MB): BatchGetItem, BatchWriteItem
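Because the bulk operations are capped per request, callers usually split a large write into several calls. The sketch below only builds request bodies in the shape BatchWriteItem expects; it assumes the documented cap of 25 put/delete requests per call, and the table name and items are hypothetical. It does not check the payload size limit.

```python
def chunked(items, size=25):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_write_requests(table_name, items, size=25):
    """Build one BatchWriteItem-style request body per chunk of items."""
    return [
        {
            "RequestItems": {
                table_name: [{"PutRequest": {"Item": item}} for item in chunk]
            }
        }
        for chunk in chunked(items, size)
    ]


if __name__ == "__main__":
    # Hypothetical items in DynamoDB's low-level attribute-value format.
    turns = [
        {"GameId": {"S": "game-1"}, "Turn": {"N": str(n)}} for n in range(60)
    ]
    requests = build_batch_write_requests("GameTurns", turns)
    print([len(r["RequestItems"]["GameTurns"]) for r in requests])
```

Each resulting dictionary could then be passed to a DynamoDB client's batch write call; unprocessed items returned by the service would still need a retry loop, which is omitted here.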
Amazon Redshift is a managed data warehouse
service.
Petabyte scale columnar database
Fast response time (~10x faster than typical relational stores)
Under $1,000 per TB per year
Without any operational burden
So, what are the tips and techniques for
successful deployments?
Thousands of Successful Deployments
Two Highlights
• SugarCRM: CRM Software (Zac Sprackett)
• Scopely: Gaming Platform (Mike Thomas)
Crafting Loyal Customers with SugarCRM
Every Customer. Every User. Every Time.
S. Zachariah Sprackett, VP of Operations, SugarCRM
November 13, 2013

SugarCRM
• Redefining Customer Relationship Management
• Unique product bundling
– On Premise and Hosted offerings

• Manifest destiny
– Source code access and SQL database per customer

• Scale
– From one seat customers to multi thousand seat customers

• Globally distributed customer base
Deployment Models

Traditional SaaS

SugarCRM
Application Stack
MySQL

Apache

PHP
HTML5 & JavaScript

Elastic Search

Shadow

Linux
Email Archiving
Background Jobs
Cloud Stacks
Amazon SES
ElastiCache

RDS DB
Instance

Cloud
Provider

RDS DB
Instance Read
Replica
EC2 Web Servers

EC2 Job Servers

Amazon S3

EC2 Elastic
Search

Amazon Glacier
Cloud Providers

Route 53

EC2 HA
Proxy

Managed
Elastic IP

Cloud Stack
EC2 HA
Proxy
Management Console

Globally Distributed
Cloud Providers
Delivering On Time and On Budget
• Amazon lets you easily spin up testing environments
– Testing only works if you make use of it. Don’t make assumptions
– Monitor everything

• Change in cost model can surprise finance
– Planned capital expenditures versus after the fact operational expenditures
– Use reserved instances
– Third party tools such as Cloudability can help alert you of issues early

• Manage access keys effectively to control cost
– Learn to love AWS Identity and Access Management (IAM)
Things to Watch Out For
• Understand your IO requirements
– Make effective use of each of instance backed, Amazon EBS and Provisioned IOPS file systems
• Use the heck out of read replicas
• Snapshots are incredibly useful
– But not available from a read replica
• ElastiCache is not clustered across availability zones
• Watch out for the SLA
– 99.95% for a region even across two AZ’s
– This doesn’t include user error
• Cold Standby is not instant on
– Don’t get stuck waiting for deployments in a forced failover scenario
• Don’t use the default parameter group for Amazon RDS
– Unless you really like restarting databases
• You still need DBAs and Ops but they get to do cooler stuff
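As a worked example of what the 99.95% figure permits, the arithmetic can be sketched as below. This assumes availability is measured over a 30-day month; the measurement window is an assumption, not something stated on the slide.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted by an availability SLA over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    # 30 days = 43,200 minutes; 0.05% of that is about 21.6 minutes.
    print(round(allowed_downtime_minutes(99.95), 1))
```

Roughly twenty minutes a month is a small budget once user error, which the slide notes is not covered, is taken into account.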
We’re Hiring
Email: zac@sugarcrm.com
Free Trials: http://www.sugacrm.com/try-sugar
Scopely
Michael Thomas – Principal Software Engineer with Scopely
November 13, 2013

ABOUT SCOPELY
Our technical infrastructure allows developers to build games efficiently for both iOS and Android.

Millions of Users
Billions of Turns
All titles have reached the Top 5 in the App Store, and the last three have been #1.
Challenges
• Build a single platform to support many different
kinds of games – asynchronous turn based, single
player, synchronous, etc.
• Scale up and down as games are tested, launched,
grow, and are retired.
• We are not an infrastructure company – we must
focus on building features that support game
development.
Platform Features
• Accounts / authentication
• Gameplay / state persistence
• Chat / messaging
• In game economy
• Facebook integration
• Gifting
• Single Player state tracking
• Promotion / cross-promotion system
• Statistics
• Tournaments
• Achievements
• Email targeting
• Suggested friends
• In game news system
• External partner integration
• Invitation attribution
• Push notifications
• Content management
• Generic storage API
• Application / device configuration
• AB Testing
Different Features/Different Requirements
• Dynamic scaling (game launches, promotions, tests)
• High write/read ratio (playing turns)
• Transactional consistency (real money purchases)
• Indexed data (user accounts)
• Complex, real-time data (leaderboards)
Operational Data Storage
Scopely Gaming Platform
• ElastiCache: Memcached for performance, scalability, and cost savings.
• S3: Amazon S3 for asset and image storage.
• ElastiCache: Redis for fast, complex caching and message passing.
• DynamoDB: Amazon DynamoDB for unbounded data with heavy write load.
• RDS: MySQL for bounded, transactional, queryable data.
Analytics Data Pipeline
Scopely Gaming Platform
1. SQS: In-Flight Events
2. EC2: Message Loader
3. S3: Staged Messages
4. EMR: Transformer
5. S3: Processed Data
6. EC2: Redshift Loader
7. Redshift Data Warehouse
RDS: Process / Job Tracking
Schema Mapping DSL
from centipede.schema.table import Table
from centipede.attributes import *

# GemsTurnHelper is used below; its import is not shown on the slide.
class GemsTurn(Table):
    # Each column is declared as: name = Type, function extracting the value from a message
    user_id = Integer, lambda message: message['Data']['GameData']['CurrentPlayerId']
    current_turn = Integer, lambda message: message['Data']['GameData']['CurrentTurn']
    end_date = Timestamp, lambda message: message['Data']['GameData']['EndDate']
    expiration = Timestamp, lambda message: message['Data']['GameData']['Expiration']
    game_id = Guid, lambda message: message['Data']['GameData']['GameId']
    resigning_user_id = Integer, lambda message: message['Data']['GameData']['ResigningPlayerId']
    start_context = Integer, lambda message: message['Data']['GameData']['StartContext']
    start_date = Timestamp, lambda message: message['Data']['GameData']['StartDate']
    status = Integer, lambda message: message['Data']['GameData']['Status']
    tournament_id = Guid, lambda message: message['Data']['GameData']['TournamentId']
    tournament_price_category = Integer, lambda message: message['Data']['GameData']['TournamentPriceCategory']
    tournament_price_paid = Integer, lambda message: message['Data']['GameData']['TournamentPricePaid']
    tutorial_type = Integer, lambda message: message['Data']['GameData']['TutorialType']
    winning_user_id = Integer, lambda message: message['Data']['GameData']['WinningPlayerId']
    # Per-player columns: read from the current (or opposing) player's entry in Players
    awards = List, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['Awards']
    coins_gathered = List, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['CoinsGathered']
    custom_statistics = VarChar, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['CustomStatistics']
    has_hidden_game = Boolean, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['HasHiddenGame']
    last_nudge_date = Timestamp, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['LastNudgeDate']
    score = Integer, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['Score']
    score_for_award = Integer, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['ScoreForAward']
    opponent_user_id = Integer, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.opponent_user_index(message)]['UserId']
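The slide shows only the declarations, not how a mapping like this is applied. The following is a minimal sketch of one way a declarative mapping of this shape could turn an event message into a flat row. It is not the centipede library, whose internals are not shown; the `Table` base class, the type markers, and `TurnSummary` here are all hypothetical stand-ins.

```python
# Hypothetical type markers standing in for the DSL's attribute types.
Integer = "INTEGER"
Guid = "CHAR(36)"


class Table:
    """Hypothetical base class: collects (type, extractor) pairs from subclasses."""

    @classmethod
    def columns(cls):
        # A column is any class attribute declared as `Type, extractor_function`.
        return {
            name: value
            for name, value in vars(cls).items()
            if isinstance(value, tuple) and len(value) == 2 and callable(value[1])
        }

    @classmethod
    def to_row(cls, message):
        # Run every extractor against the message to build one flat row.
        return {
            name: extract(message)
            for name, (_sql_type, extract) in cls.columns().items()
        }


class TurnSummary(Table):
    user_id = Integer, lambda m: m["Data"]["GameData"]["CurrentPlayerId"]
    game_id = Guid, lambda m: m["Data"]["GameData"]["GameId"]


if __name__ == "__main__":
    event = {"Data": {"GameData": {"CurrentPlayerId": 42, "GameId": "abc-123"}}}
    print(TurnSummary.to_row(event))
```

Keeping the column type next to the extractor means the same class can drive both table creation and row loading, which is presumably the appeal of the approach.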
Use Case: Leaderboards
Redis
• “What is my rank in today’s tournament?”
• Hard to cache since a single player getting a new high score changes everyone’s rank
• Highly optimized schema required 4 m2.2xlarge RDS nodes
• Latency for “what is my rank” could be above 100ms
• Redis sorted sets provide exactly what we need. Two m2.xlarge instances are more than enough. Rank query is now in single digit milliseconds.
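To make the sorted-set idea concrete, here is a small in-memory sketch of the two operations a leaderboard needs: set a player's score and ask for that player's rank. It is a pure-Python stand-in, not Redis itself; the method names mirror the Redis ZADD and ZREVRANK commands, and the player names and scores are hypothetical.

```python
import bisect


class InMemorySortedSet:
    """Hypothetical stand-in mimicking two Redis sorted-set commands."""

    def __init__(self):
        self._scores = {}    # member -> current score
        self._ordered = []   # (score, member) pairs kept in ascending order

    def zadd(self, member, score):
        """Insert a member or update its score, keeping the order intact."""
        old = self._scores.get(member)
        if old is not None:
            index = bisect.bisect_left(self._ordered, (old, member))
            del self._ordered[index]
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def zrevrank(self, member):
        """Return the 0-based rank with the highest score first, or None."""
        score = self._scores.get(member)
        if score is None:
            return None
        index = bisect.bisect_left(self._ordered, (score, member))
        return len(self._ordered) - 1 - index


if __name__ == "__main__":
    board = InMemorySortedSet()
    board.zadd("alice", 120)
    board.zadd("bob", 300)
    board.zadd("carol", 250)
    print(board.zrevrank("carol"))  # second-highest score
    board.zadd("alice", 400)        # a new high score shifts everyone else
    print(board.zrevrank("bob"))
```

Because the structure stays ordered on every write, a rank lookup is an index computation rather than a count over all rows, which is the property the slide credits for the drop in latency.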
Use Case: Game/Turn State
DynamoDB
• Extremely high throughput. Extremely large dataset.
• Semi-structured data – each game models “state” differently.
• Always queried by UserID or GameID.
• Maxed out an Amazon RDS instance – instead of spending time sharding / optimizing Amazon RDS, we moved to Amazon DynamoDB.
• Saves operational time and development time by not having to worry about growing games/adding new games/traffic spikes.
Use Case: User Accounts
MySQL (RDS)
• Need to maintain uniqueness across multiple columns (email, username, etc.)
• Queryable on multiple facets (email, username, external identifier)
• Entire table needs to be scanned regularly (promotions)
• Bounded data size
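The requirements above (uniqueness on several columns, lookups on several facets) map naturally onto relational constraints. The sketch below uses Python's built-in SQLite as a stand-in for MySQL on RDS so that it runs anywhere; the table layout and column names are hypothetical, not Scopely's schema.

```python
import sqlite3

LOOKUP_COLUMNS = {"email", "username", "external_id"}


def create_accounts_db():
    """Create an in-memory accounts table with per-column uniqueness."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE accounts (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL UNIQUE,
            username TEXT NOT NULL UNIQUE,
            external_id TEXT UNIQUE
        )
        """
    )
    return conn


def register(conn, email, username, external_id=None):
    """Insert an account; return False if any unique column collides."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO accounts (email, username, external_id) "
                "VALUES (?, ?, ?)",
                (email, username, external_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def find_by(conn, column, value):
    """Look an account up by one of the indexed facets."""
    if column not in LOOKUP_COLUMNS:
        raise ValueError("unsupported lookup column: " + column)
    query = "SELECT id, email, username FROM accounts WHERE " + column + " = ?"
    return conn.execute(query, (value,)).fetchone()


if __name__ == "__main__":
    db = create_accounts_db()
    print(register(db, "a@example.com", "alice", "fb-1"))
    print(register(db, "a@example.com", "alice2"))  # duplicate email
    print(find_by(db, "username", "alice"))
```

The database enforces uniqueness atomically, which is difficult to reproduce in a key-value store where each unique attribute would need its own conditional write.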
Use Case: Global Caching
Memcached (ElastiCache)
• Cache everything possible in Memcached, including entities in both Amazon DynamoDB and RDS.
• A single interface providing session caching, Memcached caching, and Amazon DynamoDB access encourages consistent use of caching.
Use Case: Global Caching
// Method bodies are omitted on the slide.
public class CoherentStorage
{
    public Cache L1Cache { get; set; }        // per-request cache
    public Cache L2Cache { get; set; }        // Memcached
    public DynamoClient Dynamo { get; set; }  // backing store
    private readonly Games _game;

    public CoherentStorage(Games game)
    {
        _game = game;
        L1Cache = Cache.Request;
        L2Cache = Cache.GetMemcached(String.Format("{0}GameState", game));
        Dynamo = DynamoClient.Instance;
    }

    public void Save(object instance) { }
    public void Delete(object instance) { }
    public T Get<T>(object id, bool skipCache = false, bool consistentRead = true) { }
}

Memcached
(ElastiCache)
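Since the slide omits the method bodies, the following Python sketch shows one plausible way a layered store of this shape could behave: look in the per-request cache, then the shared cache, then the backing store, and keep all three in step on writes. It is an assumption about the pattern, not Scopely's implementation; plain dictionaries stand in for the request cache, Memcached, and DynamoDB.

```python
class CoherentStore:
    """Hypothetical two-level cache in front of a backing store."""

    def __init__(self, l1_cache, l2_cache, backing):
        self.l1 = l1_cache      # stand-in for the per-request cache
        self.l2 = l2_cache      # stand-in for Memcached
        self.backing = backing  # stand-in for DynamoDB

    def get(self, key, skip_cache=False):
        """Read through the caches unless the caller asks to bypass them."""
        if not skip_cache:
            for cache in (self.l1, self.l2):
                if key in cache:
                    value = cache[key]
                    self.l1[key] = value  # keep the nearest cache warm
                    return value
        value = self.backing.get(key)
        if value is not None:
            self.l1[key] = value
            self.l2[key] = value
        return value

    def save(self, key, value):
        """Write to the backing store first, then refresh both caches."""
        self.backing[key] = value
        self.l1[key] = value
        self.l2[key] = value

    def delete(self, key):
        """Remove the key everywhere so no stale copy survives."""
        self.backing.pop(key, None)
        self.l1.pop(key, None)
        self.l2.pop(key, None)


if __name__ == "__main__":
    store = CoherentStore({}, {}, {"game-1": {"turn": 3}})
    print(store.get("game-1"))                   # miss, then cached
    store.backing["game-1"] = {"turn": 4}        # out-of-band update
    print(store.get("game-1"))                   # still the cached copy
    print(store.get("game-1", skip_cache=True))  # forces a fresh read
```

Routing every read and write through one object is what makes caching the default; callers opt out explicitly with `skip_cache` rather than opting in.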
Tips & Traps
• Know your data – use reasonable heuristics for expected
data growth.
• Each data storage technology introduces some level of
operational and engineering overhead. Choose wisely.
• Get creative with Amazon DynamoDB.
• Prepare for the unexpected with Metadata columns in
MySQL.
Please give us your feedback on this
presentation

DAT201
As a thank you, we will select prize
winners daily for completed surveys!


Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
