
1

Introduction to Amazon DynamoDB
Sean Shriver
NoSQL Solutions Architect
AWS Solution Architecture
22 February 2017

2

Agenda
• Brief history of data processing
• Relational (SQL) vs. nonrelational (NoSQL)
• NoSQL solutions on AWS
• Amazon DynamoDB’s fully managed features
• Demo – serverless applications

3

Data volume since 2010
• 90% of stored data generated in
last 2 years
• 1 terabyte of data in 2010 equals
6.5 petabytes today
• Linear correlation between data
pressure and technical innovation
• No reason these trends will not
continue over time

4

Timeline of database technology
[Chart axis: data pressure]

5

Technology adoption and the hype curve

6

Relational (SQL) vs.
nonrelational (NoSQL)

7

Relational vs. nonrelational databases
Traditional SQL: scale up (one DB, primary and secondary)
NoSQL: scale out (many DBs)

8

SQL vs. NoSQL schema design
NoSQL design optimizes for
compute instead of storage
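
The trade-off on this slide can be sketched with plain Python dictionaries. This is only an illustration under assumed data: the product catalog, its field names, and both helper functions are hypothetical, and nothing here talks to a real database.

```python
# Hypothetical product-catalog data; all names are illustrative.

# Normalized (relational style): facts are split across "tables" and
# reassembled with a join at read time.
products = {1: {"type": "book", "title": "Example Book"}}
books = {1: {"author": "A. Writer", "pages": 320}}


def read_normalized(product_id):
    """Join two 'tables' on product_id to rebuild one product."""
    row = dict(products[product_id])
    row.update(books[product_id])  # the join costs an extra lookup per read
    return row


# Denormalized (NoSQL style): the item is stored the way it is read,
# so a read is a single key lookup at the price of extra storage.
catalog = {
    1: {"type": "book", "title": "Example Book",
        "author": "A. Writer", "pages": 320},
}


def read_denormalized(product_id):
    """Single lookup; no join needed."""
    return catalog[product_id]
```

Both functions return the same product; the denormalized read does less work per request because the data was duplicated at write time.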

9

Why NoSQL?
SQL | NoSQL
Optimized for storage | Optimized for compute
Normalized/relational | Denormalized/hierarchical
Ad hoc queries | Instantiated views
Scale vertically | Scale horizontally
Good for OLAP | Built for OLTP at scale

10

NoSQL solutions on AWS
• Bring your own NoSQL (or) use Amazon DynamoDB
• The widest range of NoSQL options
• Avoid the overhead of provisioning hardware
• Visit https://aws.amazon.com/nosql/document/
Options shown: MongoDB, Cassandra, Couchbase, MarkLogic, Amazon DynamoDB

11

NoSQL solutions using Amazon EC2 and EBS
DB hosted on-premises DB hosted on Amazon EC2

12

The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester
Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed
spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in
the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
The Forrester Wave™: Big Data NoSQL, Q3 2016

13

Amazon DynamoDB
Run your business, not your database

14

DynamoDB Benefits
• Fully managed
• Fast, consistent performance
• Highly scalable
• Flexible
• Event-driven programming
• Fine-grained access control

15

Fully managed service = automated operations
DB hosted on-premises DB hosted on Amazon EC2

16

Fully managed service = automated operations
DB hosted on-premises DynamoDB

17

Consistently low latency at scale
PREDICTABLE
PERFORMANCE!

18

High availability and durability
WRITES
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)
READS
• Strongly or eventually consistent
• No latency trade-off
Designed to support 99.99% availability
Built for high durability

19

Customer use cases

20

Amazon’s Path to DynamoDB
RDBMS
DynamoDB

21

Major League Baseball Fields Big Data,
Excitement with Amazon DynamoDB
MLBAM (MLB Advanced Media) is a full service solutions
provider, operating a powerful content delivery platform.
“For the first time, we can measure things we’ve never
been able to measure before.”
Joe Inzerillo
Executive Vice President and CTO, MLBAM
• MLBAM can scale to support many games on a
single day.
• Amazon DynamoDB powers queries and supports the
fast data retrieval required.
• MLBAM distributes 25,000 live events annually and
10 million streams daily.

22

Redfin Is Revolutionizing Home Buying and
Selling with Amazon DynamoDB
Redfin is a full-service real estate company with local
agents and online tools to help people buy & sell homes.
“We have billions of records on DynamoDB being
refreshed daily or hourly or even by seconds.”
Yong Huang
Director, Big Data Analytics, Redfin
• Redfin provides property and agent details and
ratings through its websites and apps.
• With DynamoDB, latency for “similar” properties
improved from 2 seconds to just 12 milliseconds.
• Redfin stores and processes five billion items in
DynamoDB.

23

Duolingo Scales to Store Over 31 Billion Items
Using DynamoDB
Duolingo is a free language learning service where
users help translate the web and rate translations.
“Using AWS, we can handle traffic spikes that expand up
to seven times the amount of normal traffic.”
Severin Hacker
CTO, Duolingo
• Duolingo stores data about each user to be able to
generate personalized lessons.
• The MySQL database couldn’t keep up with
Duolingo’s rate of growth.
• By using the scalable database service, data store
capacity increased from 100 million to more than four
billion items.
• Duolingo has the capacity to scale to support over
8 million active users.

24

Nexon Scales Mobile Gaming with Amazon
DynamoDB
Nexon is a leading South Korean video game developer
and a pioneer in the world of interactive entertainment.
“By using AWS, we decreased our initial investment
costs, and only pay for what we use.”
Chunghoon Ryu
Department Manager, Nexon
• Nexon used Amazon DynamoDB as its
primary game database for a new blockbuster
mobile game, HIT.
• HIT became the #1 Mobile Game in Korea
within the first day of launch and has > 2M
registered users.
• Nexon’s HIT leverages DynamoDB to deliver
steady latency of less than 10 ms and a
fantastic mobile gaming experience for
170,000 concurrent players.

25

Ad Tech Gaming Mobile IoT Web
Scaling high-velocity use cases with DynamoDB

26

That sounds really good. How
do I get started?
Let’s create a table.

27

Products
Product_Id
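
As a minimal sketch of what creating this table could look like programmatically, the snippet below builds the arguments of a DynamoDB CreateTable request for a Products table keyed on Product_Id. The numeric key type ("N") and the capacity values of 5 are assumptions made for illustration; the slides do not state them. The helper name create_table_params is hypothetical. The code only builds the request and does not contact AWS.

```python
def create_table_params(table_name, partition_key, read_units=5, write_units=5):
    """Build the arguments for a DynamoDB CreateTable request."""
    return {
        "TableName": table_name,
        # Only key attributes are declared up front; other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "N"},  # assumed numeric
        ],
        # "HASH" marks the partition key.
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
        ],
        # Provisioned capacity; the values here are arbitrary placeholders.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


params = create_table_params("Products", "Product_Id")

# With boto3 installed and AWS credentials configured, the request could be sent with:
#   import boto3
#   boto3.client("dynamodb").create_table(**params)
```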

28

Introduction to Amazon DynamoDB

29

DynamoDB table structure
Table, items, attributes
Partition key (mandatory)
• Key-value access pattern
• Determines data distribution
Sort key (optional)
• Model 1:N relationships
• Enables rich query capabilities
Query capabilities
• All items for key
• ==, <, >, >=, <=
• “begins with”
• “between”
• “contains”
• “in”
• sorted results
• counts
• top/bottom N values
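
To make the roles of the two keys concrete, here is a toy in-memory model, not DynamoDB itself: the partition key selects a group of items, and the sort key orders items within that group so that prefix, range, and top-N lookups are cheap. The class ToyTable, the key values, and the order data are all hypothetical.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        # partition key -> list of (sort key, item), kept sorted by sort key
        self._partitions = defaultdict(list)

    def put(self, pk, sk, item):
        rows = self._partitions[pk]
        rows[:] = [r for r in rows if r[0] != sk]  # same key overwrites
        rows.append((sk, item))
        rows.sort(key=lambda r: r[0])

    def query(self, pk, condition=lambda sk: True, descending=False, limit=None):
        """Return items for one partition key, filtered and ordered by sort key."""
        rows = [r for r in self._partitions.get(pk, []) if condition(r[0])]
        if descending:
            rows.reverse()
        return [item for _, item in rows[:limit]]


orders = ToyTable()
orders.put("customer#1", "2017-01-15", {"total": 30})
orders.put("customer#1", "2017-02-03", {"total": 12})
orders.put("customer#1", "2017-02-20", {"total": 99})
orders.put("customer#2", "2017-02-01", {"total": 5})

# "begins with", "between", and top-N, all scoped to one partition key
feb = orders.query("customer#1", lambda sk: sk.startswith("2017-02"))
early = orders.query("customer#1", lambda sk: "2017-01-01" <= sk <= "2017-02-10")
latest = orders.query("customer#1", descending=True, limit=1)
```

Modeling a 1:N relationship then amounts to giving the N child items the same partition key and distinct sort keys, as the orders of customer#1 do here.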

30

Global secondary index (GSI)
Alternate partition and/or sort key
Index is across all partition keys
Table: A1 (partition), A2, A3, A4, A5
GSIs:
• KEYS_ONLY: A2 (partition), A1 (item key)
• INCLUDE A3: A5 (partition), A4 (sort), A1 (item key), A3 (projected)
• ALL: A4 (partition), A5 (sort), A1 (item key), A2 (projected), A3 (projected)
Online indexing
Read capacity units (RCUs) and write capacity units (WCUs)
are provisioned separately for GSIs
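
The three projection types on this slide can be sketched as a small function that decides which attributes of a table item would be copied into an index. This is an illustrative model only; the function name project and its parameters are hypothetical, and the attribute names A1 to A5 follow the slide.

```python
def project(item, table_key, index_keys, projection, include=()):
    """Return the attributes a secondary index would hold for one item."""
    # Index keys and the table's own key are always carried into the index.
    keep = set(index_keys) | set(table_key)
    if projection == "INCLUDE":
        keep |= set(include)      # plus the explicitly listed attributes
    elif projection == "ALL":
        keep = set(item)          # every attribute of the item
    elif projection != "KEYS_ONLY":
        raise ValueError(f"unknown projection type: {projection}")
    return {name: value for name, value in item.items() if name in keep}


item = {"A1": 1, "A2": 2, "A3": 3, "A4": 4, "A5": 5}
table_key = ("A1",)

keys_only = project(item, table_key, ("A2",), "KEYS_ONLY")
include_a3 = project(item, table_key, ("A5", "A4"), "INCLUDE", include=("A3",))
everything = project(item, table_key, ("A4", "A5"), "ALL")
```

A narrower projection stores and writes less data per index entry; a query that needs attributes outside the projection has to go back to the table for them.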

31

How do GSI updates work?
[Diagram: the client writes to the table (primary table partitions);
2. Asynchronous update (in progress) to the global secondary index]
If GSIs don’t have enough write capacity, table writes will be throttled!
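
The asynchronous step can be illustrated with a toy model in which a table write is acknowledged first and the index is brought up to date later. This is a sketch under simplifying assumptions: it handles inserts only, it does not model capacity or throttling, and the class ToyTableWithIndex and its methods are hypothetical.

```python
from collections import deque


class ToyTableWithIndex:
    """Table plus one index that is updated after the table write returns."""

    def __init__(self, index_key):
        self.index_key = index_key
        self.table = {}          # primary key -> item
        self.index = {}          # index key value -> list of primary keys
        self._pending = deque()  # index updates not yet applied

    def put(self, pk, item):
        self.table[pk] = item             # 1. the table write completes
        self._pending.append((pk, item))  # 2. the index update is queued

    def propagate(self):
        """Apply queued index updates (the asynchronous step)."""
        while self._pending:
            pk, item = self._pending.popleft()
            self.index.setdefault(item[self.index_key], []).append(pk)

    def query_index(self, value):
        return list(self.index.get(value, []))


t = ToyTableWithIndex(index_key="category")
t.put("p1", {"category": "book"})
before = t.query_index("book")  # index has not caught up yet
t.propagate()
after = t.query_index("book")
```

The gap between before and after is why reads from a GSI are eventually consistent.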

32

Local secondary index (LSI)
Alternate sort key attribute
Index is local to a partition key
Table: A1 (partition), A2 (sort), A3, A4, A5
LSIs:
• KEYS_ONLY: A1 (partition), A3 (sort), A2 (item key)
• INCLUDE A3: A1 (partition), A4 (sort), A2 (item key), A3 (projected)
• ALL: A1 (partition), A5 (sort), A2 (item key), A3 (projected), A4 (projected)
10 GB maximum per partition key; LSIs limit the
number of range keys!

33

LSI or GSI?
LSI can be modeled as a GSI
If data size in an item collection > 10 GB, use GSI
If eventual consistency is okay for your scenario, use
GSI!

34

Advanced topics in DynamoDB
• Design patterns and best practices
• Data modeling
• Understanding Partitions
• DynamoDB Scaling

35

Demo
Serverless Web Apps with Amazon
DynamoDB, API Gateway, and AWS Lambda

36

Simple serverless web application – use case

37

Elastic event driven applications
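
As a rough sketch of the event-driven pattern, the function below has the shape of an AWS Lambda handler that receives a batch of DynamoDB Streams records and summarizes each change. The sample event is hand-written for illustration, and the Product_Id and Title attributes are hypothetical; whether NewImage is present depends on how the stream is configured.

```python
def handler(event, context):
    """Summarize item-level changes delivered in a DynamoDB Streams event."""
    changes = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        changes.append({
            "action": record["eventName"],        # INSERT, MODIFY, or REMOVE
            "keys": change["Keys"],               # key of the changed item
            "new_image": change.get("NewImage"),  # may be absent, e.g. on REMOVE
        })
    return changes


# Hand-written sample event for local testing; values are illustrative.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"Product_Id": {"N": "101"}},
                "NewImage": {
                    "Product_Id": {"N": "101"},
                    "Title": {"S": "Example"},
                },
            },
        }
    ]
}

result = handler(sample_event, None)
```

In a real deployment the function body would react to each change, for example by updating an aggregate or sending a notification, instead of returning a summary.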

42

Demo

43

DynamoDB Pricing & Free Tier
• Free Tier
  - 25 GB of storage
  - 25 reads per second
  - 25 writes per second
• Pricing for additional usage in US East (N. Virginia)
  - $0.25 per GB per month
  - Write throughput: $0.0065 per hour for every 10 units of write capacity
  - Read throughput: $0.0065 per hour for every 50 units of read capacity
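
The prices on this slide can be turned into a rough monthly estimate. The sketch below makes several assumptions that the slide does not state: a 720-hour (30-day) month, free-tier reads and writes treated as 25 capacity units each, and partial blocks of capacity billed proportionally. The prices are the 2017 figures from the slide, and the function name monthly_cost is hypothetical.

```python
HOURS_PER_MONTH = 720  # assumption: a 30-day month

# Free tier from the slide (reads/writes treated as capacity units: assumption)
FREE_STORAGE_GB = 25
FREE_READ_UNITS = 25
FREE_WRITE_UNITS = 25

# Prices from the slide, US East (N. Virginia)
PRICE_PER_GB_MONTH = 0.25
PRICE_PER_10_WRITE_UNITS_HOUR = 0.0065
PRICE_PER_50_READ_UNITS_HOUR = 0.0065


def monthly_cost(storage_gb, read_units, write_units):
    """Estimate the monthly bill in USD beyond the free tier."""
    storage = max(storage_gb - FREE_STORAGE_GB, 0) * PRICE_PER_GB_MONTH
    writes = (max(write_units - FREE_WRITE_UNITS, 0) / 10
              * PRICE_PER_10_WRITE_UNITS_HOUR * HOURS_PER_MONTH)
    reads = (max(read_units - FREE_READ_UNITS, 0) / 50
             * PRICE_PER_50_READ_UNITS_HOUR * HOURS_PER_MONTH)
    return round(storage + writes + reads, 2)


estimate = monthly_cost(storage_gb=100, read_units=500, write_units=100)
```

For 100 GB, 500 read units, and 100 write units this works out to $18.75 for storage, $35.10 for writes, and $44.46 for reads, about $98.31 per month under the stated assumptions.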

44

Resources
Amazon DynamoDB: https://aws.amazon.com/dynamodb/
NoSQL on AWS: https://aws.amazon.com/nosql/document/
Upcoming session: Hands on Lab: Introduction to DynamoDB

45

aws.amazon.com/activate
Everything and Anything Startups
Need to Get Started on AWS

Editor's Notes

  1. We will look at the history of databases, and we’ll discuss relational database and non-relational databases, and the differences. I’ll introduce Amazon DynamoDB and we’ll look at customer references who have built scalable applications using this technology.
  2. To fully appreciate the need for NoSQL, let’s start by looking at how much data volume has grown in the last 5 years. 90% of data was generated in the last 2 years. 1 TB vs. 6.5 PB. To put that into perspective, businesses with multi-TB databases have seen them explode to multi-PB databases. As data volume increased, we started innovating data processing systems that would scale to process the large volume of data.
  3. We started by remembering everything (human brain) and advanced to writing things down (for centuries). As data pressure increased we saw magnetic storage, file systems, and then finally relational databases. 40 years. Table normalization was designed to eliminate duplicates and save storage costs. Multiple tables – complex SQL joins – resource intensive. Optimize for the costlier asset. AGNOSTIC TO ACCESS PATTERNS -- great for ad hoc queries – NOT optimized. Businesses are seeing the limitations in relational databases. Switching to NoSQL.
  4. Every time there is a new technology - initial excitement with early adopters… They may run into roadblocks. It’s the same with NoSQL. The goal of this presentation is to explain the difference between relational and NoSQL databases. And as you gain more experience with this technology you will start to realize the benefits of NoSQL for your application. And that will help you cross the chasm in getting started with DynamoDB.
  5. Let’s deep dive into the differences between relational and non-relational databases. Why? Databases are a crucial part of your application and your choice of database technology will determine how your application scales. To understand the benefits of NoSQL…
  6. Relational: Data is normalized. To enable joins, you are tied to a single partition and a single system. Performance depends on the hardware specs of the primary server. To improve performance, optimize, then move to a bigger box. You may still run out of headroom. Create read replicas. You will still run out. Scale UP. NoSQL: NoSQL databases were designed specifically to overcome scalability issues. They scale “out” data using distributed clusters and low-cost hardware, for throughput and low latency. Therefore, using NoSQL, businesses can scale virtually without limit.
  7. Generic product catalog. Table relationships in normalized. A product could be a book – say the Harry Potter Series. There’s a 1:1 relationship. Or it could be a movie.. You can imagine the types of queries that you’d have to execute. 1. Show me all the movies starring. 2. the entire product catalog. This is Resource intensive – perform complex join ** NoSQL you have to ask – how will the application access the data? optimize for the costlier asset. No joins. Just a select. Hierarchical structures. Designed by keeping in mind Access patterns. Via duplication of data (storage), optimized for compute, it is fast.
  8. Businesses are starting to see scalability problems with relational databases. I once had a customer say they top out with relational at around 3,000 requests per second and had to scale up to move to bigger hardware. With NoSQL, we have a technology that can easily scale to 100s of nodes, or even 1000s, and the scalability bottleneck goes away. Excellent for OLTP applications that scale, real time data access, fast, low latency, user cannot wait. == They store data in a denormalized hierarchical view, which makes it faster and easier to access the data.
  9. Using AWS, you can easily get started with a variety of NoSQL solutions. For those customers who want full control over their NoSQL databases but who don’t want to manage hardware infrastructure, you can run your database on AWS and choose from a variety of database engines – Cassandra, Couchbase, MarkLogic or MongoDB. You will use Amazon EC2 and Amazon EBS and have to think about availability and scalability. If instead you just want to focus on building your application, then you can use the fully managed Amazon DynamoDB. All our solutions offer flexible, pay-as-you-go pricing, so you can quickly and easily scale at a low cost. You can download the AWS whitepapers on getting started with these NoSQL technologies [cassandra, mongo, rdbms to nosql]. The key takeaway from this slide is that we offer the widest range of NoSQL options and, no matter which one you choose, you don’t have to worry about provisioning hardware and you will get the benefits of the underlying AWS global cloud infrastructure. In the next few slides, I will give an overview of getting started with MongoDB on AWS and then we will discuss Amazon DynamoDB. == Cassandra – distributed open source, handles large amounts of data providing high availability with no single point of failure. Couchbase - a high-performance distributed key-value store. MongoDB - open source, high performance document database.
  10. Those of you who are involved in spinning up and managing your own servers surely realize how resource intensive it is to manage your own infrastructure. It is easy to underestimate the cost and complexity of maintaining…. You have to think about power, cooling, OS maintenance and patching. Now imagine managing a 1000 node cluster; this can become very resource intensive. Amazon EC2 is an AWS service that provides resizable compute capacity in the cloud. A database hosted on an EC2 instance takes away some of the overhead. But you still need to think about scalability and availability.
  11. So, this brings us to Amazon DynamoDB, which is what we are going to discuss today. Let’s take a closer look.
  12. Fully managed – With just a few clicks on the AWS console, create a table that is highly scalable, highly available, and gives you fast, consistent, predictable performance. No need to launch or maintain any servers. Tell DynamoDB your read/write throughput, and DynamoDB will scale to meet your application’s requirements. Only pay for what you use. You get all of this with just a few clicks. Key takeaway: Using DynamoDB, customers get consistent, single-digit millisecond latency at any scale. == DynamoDB supports both document and key-value store models, and offers a range of features including global secondary indexes, fine-grained access control via AWS Identity and Access Management, support for event-driven programming, and more. == Fully Managed: Amazon DynamoDB is a fully managed cloud NoSQL database service – you simply create a database table, set your throughput, and let the service handle the rest. You no longer need to worry about database management tasks such as hardware or software provisioning, setup and configuration, software patching, operating a reliable, distributed database cluster, or partitioning data over multiple instances as you scale. Fast, Consistent Performance: Amazon DynamoDB is designed to deliver consistent, fast performance at any scale for all applications. Average service-side latencies are typically single-digit milliseconds. As your data volumes grow and application performance demands increase, Amazon DynamoDB uses automatic partitioning and SSD technologies to meet your throughput requirements and deliver low latencies at any scale. Highly Scalable: When creating a table, simply specify how much request capacity you require. If your throughput requirements change, simply update your table's request capacity using the AWS Management Console or the Amazon DynamoDB APIs. Amazon DynamoDB manages all the scaling behind the scenes, and you are still able to achieve your prior throughput levels while scaling is underway.
Flexible: Amazon DynamoDB supports both document and key-value data structures, giving you the flexibility to design the architecture that is optimal for your application. Event-Driven Programming: Amazon DynamoDB integrates with AWS Lambda to provide triggers, which enable you to architect applications that automatically react to data changes. Fine-Grained Access Control: Amazon DynamoDB integrates with AWS Identity and Access Management (IAM) for fine-grained access control for users within your organization. You can assign unique security credentials to each user and control each user's access to services and resources. http://aws.amazon.com/dynamodb
  13. Those of you who are involved in spinning up and managing your own servers surely realize how resource intensive it is to manage your own infrastructure. It is easy to underestimate the cost and complexity of maintaining…. You have to think about power, cooling, OS maintenance and patching. Now imagine managing a 1000 node cluster; this can become very resource intensive. Amazon EC2 is an AWS service that provides resizable compute capacity in the cloud. A database hosted on an EC2 instance takes away some of the overhead. But you still need to think about scalability and availability.
  14. This is the value that is built into DynamoDB. With DynamoDB, you get an easy-to-use database. You don’t have to spin up any servers. You can easily design serverless scalable applications with DynamoDB. You get scalability and multi-AZ replication without designing a distributed system. You get ongoing security upgrades, software improvements, cost reduction efforts, monitoring…without any effort at all. DynamoDB is a fully managed service, so you have all of that benefit built into it. We built DynamoDB to just work so you can focus on your app.
  15. In any business, as your business scales up, you need a way to easily scale to meet the traffic, and to get consistent, predictable latency at any scale. You need a way to scale down as your business needs change. DynamoDB was designed to offer consistent and predictable single-digit millisecond latency, at any scale. And you only pay for what you use. No limit on throughput. No limit on size – PB of data, any number of items. The latency characteristics of DynamoDB are under 10 milliseconds and highly consistent. Most importantly, the data is durable in DynamoDB, constantly replicated across multiple data centers and persisted to SSD storage. Predictable Performance: This is obviously something that’s important and valuable in any industry, whether it’s powering the New York Times recommendation engine, storing and retrieving game data for the game Fruit Ninja, or powering queries and fast data retrieval for Major League Baseball Advanced Media. Predictable performance at scale is a must-have for many web apps, and DynamoDB was designed specifically to deliver on this.
  16. 13/35. 4 more regions. DynamoDB is highly durable. AWS has a concept of regions and Availability Zones. An AWS region is a geographic area. Each region has multiple Availability Zones. Each AZ has 1 or more physical DCs. They have redundant power and cooling, and are interconnected via high speed, low latency fiber. Take for example the AWS region in N. Virginia. It has 4 AZs. When you create a DynamoDB table in N. Virginia, we will replicate the data to 3 AZs. All the data is stored on SSDs. A lot of value built into DynamoDB – a few clicks.
  17. Growing number of customers in the Mobile, IoT, Gaming space are using DynamoDB.
  18. Amazon’s path from relational databases to NoSQL reflects the journey many customers are now taking. Amazon.com, the online retail business, runs on one of the world’s largest web infrastructures. Back in 2004, Amazon.com was using relational Oracle databases and was unable to scale them. Maintenance and administration. In order to keep Amazon.com highly scalable to support all the incoming traffic, an internal project investigated options… “If availability, durability, and scalability are the priority, what would the database look like?” This resulted in a whitepaper that described what the database should look like. This paper paved the way for many NoSQL technologies out there today. This was also the beginning of DynamoDB. Database as a Swiss Army knife: hundreds of applications built on RDBMS, poor scalability (Q4 was a pain), poor availability, exorbitantly high costs for h/w, software, admin. Dynamo = replicated DHT with consistency management. Specialist tool with limited query and simpler consistency. Problem: required significant effort to maintain. DynamoDB was designed to deliver consistently high performance at any scale: predictable performance, massively scalable, fully managed, low cost.
  19. Major League Baseball – A great example of a customer using DynamoDB to build IoT solution. Amazon DynamoDB powers queries required to support many games on a single day. When there are only a few games, it dials down throughput to save money; MLBAM only pays for the capacity it uses. === STORY BACKGROUND MLBAM (MLB Advanced Media) is a full service solutions provider, operating a powerful content delivery platform. Amazon DynamoDB powers queries and supports the fast data retrieval required to support many games on a single day. MLBAM distributes 25,000 live events annually and 10 million streams daily. SOLUTION AND BENEFITS MLBAM only pays for the capacity it uses. When there are only a few games, it dials down throughput to save money. MLBAM can focus on what it does best, rather than spending resources managing clusters of non-relational (NoSQL) databases. On big game days, MLBAM can quickly scale up DynamoDB read and write capacity to meet its demand without increased latency. ADDITIONAL INFORMATION https://aws.amazon.com/solutions/case-studies/major-league-baseball-mlbam/
  20. A customer who is using DynamoDB to power their Mobile applications -- Redfin – people use this application for searching buying and selling homes. More than 10,000 customers buy or sell homes with Redfin each year. == STORY BACKGROUND Redfin offers full-service real estate brokerage services with local agents and online tools to help people buy & sell homes. Redfin built technology to make customers smarter and faster when buying and selling homes. More than 10,000 customers buy or sell homes with Redfin each year. SOLUTION AND BENEFITS Redfin connects users with properties and agents. Redfin uses DynamoDB to deliver insights to its website and apps. DynamoDB stores property scores, recommendations, property data (e.g., sold, est. value), agent scoring (i.e., how the agent is performing). Redfin websites and apps consume these data from DynamoDB. Using Amazon DynamoDB, Amazon Redshift, Amazon EMR, Amazon S3 ADDITIONAL INFORMATION [Coming December 2015]
  21. Duolingo provides a free language-learning app that uses crowd sourcing to translate web content as users learn. Duolingo has to be able to scale to manage new users and in addition, expand the service to offer new languages. DynamoDB is Duolingo’s largest and most active data store. Elastic Load Balancing distributes web and mobile traffic across approximately 170 Amazon Elastic Compute Cloud (Amazon EC2) instances. STORY BACKGROUND Duolingo provides a free language-learning app that uses crowd sourcing to translate web content as users learn. In 2012, Apple named the Duolingo app iPhone App of the Year. Duolingo has to be able to scale to manage new users and in addition, expand the service to offer new languages. SOLUTION AND BENEFITS Learned about Amazon DynamoDB at re:Invent 2012. DynamoDB is Duolingo’s largest and most active data store. The company also uses Amazon Relational Database Service (Amazon RDS) running MySQL with provisioned IOPS storage. Elastic Load Balancing distributes web and mobile traffic across approximately 170 Amazon Elastic Compute Cloud (Amazon EC2) instances. Using Amazon DynamoDB, Amazon EC2, Elastic Load Balancing, Amazon SNS, Amazon SQS, Amazon VPC, Amazon CloudFront and Amazon CloudWatch ADDITIONAL INFORMATION https://aws.amazon.com/solutions/case-studies/duolingo
  22. Nexon is a leading South Korean video game developer. Its blockbuster game, HIT, attracts over 2 million players and was ranked the #1 mobile game in Korea on the day of its launch. Nexon used Amazon DynamoDB to scale and to provide a reliable user experience. === STORY BACKGROUND Nexon is a leading South Korean video game developer and a pioneer in the world of interactive entertainment. Nexon provides 150 games to 150 countries, including FIFA Online 3, MapleStory, and Sudden Attack. As of 2014, sales reached $1.6 billion, with 60% from overseas business. Nexon used DynamoDB as its primary game database for a new blockbuster mobile game, HIT. SOLUTION AND BENEFITS DynamoDB serves as the primary game database, offering low latency and scale to match player demand. Despite a steady increase in the size of the data, DynamoDB delivered steady latency of less than 10 ms. This enabled Nexon to provide a reliable service to HIT users, which was the foundation for the success of HIT. ADDITIONAL INFORMATION https://aws.amazon.com/solutions/case-studies/nexon/
  23. Here are just a few examples of customers achieving tremendous scale with DynamoDB. And what do customers want? They want predictable, consistent, low-latency performance at scale, and DynamoDB was designed specifically to deliver on this. ==
Ad Tech
AdRoll http://aws.amazon.com/solutions/case-studies/adroll/
DataXu http://info.qubole.com/how-dataxu-manages-big-data
AdBrain http://www.adbrain.com/careers-generalapp/
DoApp https://aws.amazon.com/solutions/case-studies/doapp/
VidRoll https://aws.amazon.com/solutions/case-studies/vidroll/
Fiksu https://aws.amazon.com/solutions/case-studies/fiksu/
TubeMogul https://www.tubemogul.com/engineering/using-contextual-information-in-programmatic-advertising/
TCC https://github.com/TheClimateCorporation/mandolin
Gaming
Supercell http://aws.amazon.com/solutions/case-studies/supercell/
Zynga https://aws.amazon.com/solutions/case-studies/zynga/
Nexon http://aws.amazon.com/solutions/case-studies/nexon
PennyPop http://aws.amazon.com/solutions/case-studies/battle-camp/
Frontier http://aws.amazon.com/solutions/case-studies/frontier-games/
Scopely https://aws.amazon.com/solutions/case-studies/scopely/
Unalis https://aws.amazon.com/solutions/case-studies/unalis/
IoT
MLBAM http://aws.amazon.com/solutions/case-studies/major-league-baseball-mlbam/
ACTi https://aws.amazon.com/solutions/case-studies/acti-case-study/
Canary https://aws.amazon.com/solutions/case-studies/canary/
Dropcam https://aws.amazon.com/solutions/case-studies/dropcam/
MediaTek https://aws.amazon.com/solutions/case-studies/mediatek/
Devicescape https://aws.amazon.com/solutions/case-studies/devicescape/
Mobile
Duolingo http://aws.amazon.com/solutions/case-studies/duolingo-case-study-dynamodb/
Mapbox https://www.mapbox.com/blog/scaling-the-mapbox-infrastructure-with-dynamodb-streams/
Redfin http://aws.amazon.com/solutions/case-studies/redfin/ and https://www.youtube.com/watch?v=YiaPjILR9zw
Remind https://aws.amazon.com/solutions/case-studies/remind/
Infraware http://aws.amazon.com/solutions/case-studies/infraware/
Myriad http://aws.amazon.com/solutions/case-studies/myriad-group/
Peak http://aws.amazon.com/solutions/case-studies/peak/
Web
Expedia https://aws.amazon.com/solutions/case-studies/expedia/
Nordstrom https://aws.amazon.com/solutions/case-studies/nordstrom/
JustGiving http://aws.amazon.com/solutions/case-studies/justgiving/
Tokyu Hands https://aws.amazon.com/blogs/aws/how-tokyu-hands-architected-a-cost-effective-shopping-system-with-amazon-dynamodb/
jobandtalent https://aws.amazon.com/solutions/case-studies/jobandtalent/
Tigerspike http://aws.amazon.com/solutions/case-studies/tigerspike/
  24. Amazon DynamoDB is a fully managed service. To get started with Amazon DynamoDB, you simply create a table.
  25. After you log on to the AWS console, select DynamoDB, and click Create table, here’s what the screen looks like. Specify a table name and a “partition key”. The partition key is like a primary key and uniquely identifies a row. Next, if required, change the number of reads and writes per second that the table should support, or accept the defaults and click Create.
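The same choices from the console screen (table name, partition key, read and write capacity) can be sketched as a request payload. This is a minimal sketch, not the demo's actual setup: the table name `Feedback`, the key name `FeedbackId`, the string key type, and the default of 5 capacity units are all assumptions made for illustration. The field names follow the DynamoDB CreateTable request shape.

```python
def build_create_table_params(table_name, partition_key, read_units=5, write_units=5):
    """Return keyword arguments shaped like a DynamoDB CreateTable request."""
    return {
        "TableName": table_name,
        # Only key attributes are declared up front; other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},  # "S" = string (assumed)
        ],
        # "HASH" is the API's name for the partition key.
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical table and key names, used only for this example.
params = build_create_table_params("Feedback", "FeedbackId")
print(params)
```

With boto3 installed and credentials configured, a payload like this could be passed as `boto3.client("dynamodb").create_table(**params)`; the sketch stops short of making the call so that it runs without an AWS account.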
  26. And you’ve created your table. This table, which you created in just a few clicks, is highly scalable, highly available, and designed to provide consistent, low-millisecond latency at scale.
  27. Attributes can vary between items: each item can have a different set of attributes from the other items (as with any NoSQL database). Partition key: the primary key. It uniquely identifies each item (on its own when there is no sort key) and also determines how the data is partitioned and stored. Optional sort key: together with the partition key, it gives you a composite key. Sort keys help to model 1:many relationships and are useful in range queries.
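To make the composite key idea concrete, here is a toy in-memory model, not DynamoDB itself. The class `ToyTable` and its method names are hypothetical. It shows items with different attribute sets, a partition key that groups items, and a sort key that keeps each group ordered so a range query is cheap.

```python
from bisect import bisect_left, bisect_right, insort


class ToyTable:
    """In-memory illustration of partition key + sort key (not the real service)."""

    def __init__(self):
        self._sort_keys = {}  # partition key -> sorted list of sort keys
        self._items = {}      # (partition key, sort key) -> item

    def put_item(self, pk, sk, **attributes):
        if (pk, sk) not in self._items:
            insort(self._sort_keys.setdefault(pk, []), sk)
        # Items are schemaless: each one carries whatever attributes it was given.
        self._items[(pk, sk)] = {"pk": pk, "sk": sk, **attributes}

    def query(self, pk, sk_from, sk_to):
        """Return the items in one partition whose sort key is in [sk_from, sk_to]."""
        keys = self._sort_keys.get(pk, [])
        lo, hi = bisect_left(keys, sk_from), bisect_right(keys, sk_to)
        return [self._items[(pk, sk)] for sk in keys[lo:hi]]


table = ToyTable()
table.put_item("customer-1", "2017-01-15", order_id="A1", total=30)
table.put_item("customer-1", "2017-02-03", order_id="A2", gift=True)
table.put_item("customer-2", "2017-02-10", order_id="B1")

# One customer, many orders (1:many), filtered by a date range on the sort key.
february = table.query("customer-1", "2017-02-01", "2017-02-28")
print(february)
```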
  28. Some applications might need to perform many kinds of queries, using a variety of different attributes as query criteria. Global secondary indexes (GSIs) act like parallel, or secondary, tables. A GSI can have a partition key that is different from the table's, and it can also have an alternate sort key. Example: a table of customers, orders, and dates, where the index is partitioned by order ID and queried for a date range. Note: When you create a GSI, you must specify read and write capacity units for the expected workload on that index.
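As a sketch of what a GSI definition carries, the helper below builds one entry for the `GlobalSecondaryIndexes` list of a CreateTable request. The index name, the attribute names `OrderId` and `OrderDate`, the `ALL` projection, and the capacity values are assumptions chosen to match the example in the notes. The point is that the index has its own key schema and its own provisioned throughput.

```python
def build_gsi(index_name, partition_key, sort_key, read_units, write_units):
    """Return one entry for the GlobalSecondaryIndexes list of a CreateTable request."""
    return {
        "IndexName": index_name,
        # The index key schema may differ completely from the table's key schema.
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
            {"AttributeName": sort_key, "KeyType": "RANGE"},
        ],
        # Copy every attribute into the index (an assumption; narrower projections exist).
        "Projection": {"ProjectionType": "ALL"},
        # A GSI is provisioned separately from its table.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical names: partition the index by order ID, sort by order date.
gsi = build_gsi("OrdersByDate", "OrderId", "OrderDate", read_units=5, write_units=5)
print(gsi)
```

Any attribute used in an index key schema also has to appear in the table's `AttributeDefinitions`.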
  29. Customers often ask whether they should use an LSI or a GSI. Think of a GSI as a parallel table that DynamoDB populates asynchronously, so it is eventually consistent. GSI updates typically happen within a second. Throughput for the GSI is important, because it affects how soon the GSI is updated. Note: When you create a GSI, you must specify read and write capacity units for the expected workload on that index. One table update results in 0, 1, or 2 GSI updates.
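The "0, 1, or 2 GSI updates" rule can be sketched as a small function. This is a simplified model under stated assumptions: every attribute is projected into the index, and each call represents a table write that actually changes the item. The function name `gsi_writes` is hypothetical.

```python
def gsi_writes(old_item, new_item, index_key_attrs):
    """Estimate how many index writes one table write causes (simplified model)."""

    def index_key(item):
        # An item appears in the index only if it has every index key attribute.
        if item is None or any(attr not in item for attr in index_key_attrs):
            return None
        return tuple(item[attr] for attr in index_key_attrs)

    old_key, new_key = index_key(old_item), index_key(new_item)
    if old_key is None and new_key is None:
        return 0  # the item is not in the index before or after the write
    if old_key is None or new_key is None:
        return 1  # one put into the index, or one delete from it
    if old_key != new_key:
        return 2  # delete the old index entry, then put the new one
    return 1      # same index key; a projected attribute changed


print(gsi_writes({"id": 1}, {"id": 1, "note": "x"}, ["status"]))
print(gsi_writes({"id": 1}, {"id": 1, "status": "open"}, ["status"]))
print(gsi_writes({"id": 1, "status": "open"}, {"id": 1, "status": "closed"}, ["status"]))
```

This is why write capacity on the index matters: a workload that frequently changes an index key attribute costs two index writes per table write.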
  30. Some applications only need to query data using the table's primary key; however, there may be situations where an alternate sort key would be helpful. For those, you can use local secondary indexes (LSIs). An LSI is collocated on the same partition as the item in the table, which gives us consistency: when an item is updated, the LSI is updated, and then the write is acknowledged. An LSI is partitioned by the same partition key as the parent table but has a different sort key. Say there is a table of customers and orders: an LSI can use the order date as its sort key to support date-range queries. A local secondary index maintains an alternate sort key for a given partition key value.
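A sketch of an LSI definition, for contrast with a GSI: the helper below builds one entry for the `LocalSecondaryIndexes` list of a CreateTable request and reuses the table's partition key, changing only the sort key. The attribute names `CustomerId`, `OrderId`, and `OrderDate`, and the index name, are assumptions for illustration.

```python
def build_lsi(table_key_schema, index_name, alternate_sort_key):
    """Return one entry for the LocalSecondaryIndexes list of a CreateTable request."""
    # An LSI must share the table's partition key, so read it from the table schema.
    table_partition_key = next(
        key["AttributeName"] for key in table_key_schema if key["KeyType"] == "HASH"
    )
    return {
        "IndexName": index_name,
        "KeySchema": [
            {"AttributeName": table_partition_key, "KeyType": "HASH"},
            {"AttributeName": alternate_sort_key, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
        # No ProvisionedThroughput here: an LSI uses the table's capacity.
    }


# Hypothetical table: one customer has many orders.
table_key_schema = [
    {"AttributeName": "CustomerId", "KeyType": "HASH"},
    {"AttributeName": "OrderId", "KeyType": "RANGE"},
]
lsi = build_lsi(table_key_schema, "OrdersByDate", "OrderDate")
print(lsi)
```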
  31. GSIs give you more flexibility. You can have only 5 LSIs and 5 GSIs; however, GSIs can be created after the table is created, whereas LSIs must be created when the table is defined. An LSI can be modeled as a GSI. If the data in an item collection exceeds 10 GB (for example, many orders for one customerID), a GSI is the only choice, because an LSI limits the data size for a given partition key value. If eventual consistency is okay for your scenario, use a GSI; it works for 99% of the scenarios out there.
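The guidance on this slide can be written down as a small decision helper. It is only a sketch of the rules of thumb in these notes, not an official selection algorithm; the function name and its parameters are hypothetical.

```python
LSI_ITEM_COLLECTION_LIMIT_GB = 10  # size limit per partition key value when LSIs exist


def choose_index(needs_strong_consistency, same_partition_key,
                 table_already_exists, max_item_collection_gb):
    """Suggest "LSI" or "GSI" using the rules of thumb from the talk."""
    lsi_possible = (
        same_partition_key                 # an LSI reuses the table's partition key
        and not table_already_exists       # an LSI must be defined at table creation
        and max_item_collection_gb <= LSI_ITEM_COLLECTION_LIMIT_GB
    )
    if needs_strong_consistency and lsi_possible:
        return "LSI"
    # Otherwise a GSI is the remaining option; its reads are eventually consistent.
    return "GSI"


print(choose_index(True, True, False, 2))
print(choose_index(True, True, False, 50))
print(choose_index(False, True, False, 2))
```

When strong consistency is required but an LSI is ruled out, the helper still returns "GSI"; in that case the application has to tolerate eventual consistency or the data model needs another look.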
  32. For those of you who want to learn more, there is a session later today that will cover advanced topics.
  33. I’ll show you a demo of building a serverless web app, and we’ll also look at the integration capabilities of DynamoDB with AWS services. DynamoDB is a managed NoSQL offering from AWS, and we are looking for talented engineers to help build the next generation of this service. Contact Raja for more details.
  34. We will build a web application that asks you for feedback and stores it securely on the AWS Cloud. The website is a simple HTML/JavaScript web interface. All the non-PII data (the names of the superheroes and the mission details) is stored in Amazon DynamoDB. All the PII data is stored in Amazon S3 with server-side encryption (SSE). When we created this application, I had two main objectives. One, I did not want to spin up or manage any servers. Two, I wanted to take advantage of the high availability, scalability, and durability features of AWS services.
  35. Amazon S3 is secure, durable, highly scalable cloud storage where you can store and retrieve any amount of data. In this demo, I will access a website over the internet. The website is a simple HTML/JavaScript web interface, and it is stored in Amazon S3. The application is a simple web interface that retrieves flight schedules, flight numbers, wait lists, etc. stored in a DynamoDB table. To set this up, I did not have to spin up any servers, so there are no servers to maintain. I am taking advantage of the fully managed capabilities of AWS services to securely access my data. All I did was create my application and let AWS handle the infrastructure and the scaling.
  36. So we said that API Gateway acts as a front door to the “business logic”. My business logic runs on AWS Lambda. AWS Lambda is a fully managed compute service: you just write the code and upload it. In this example, the APIs created in API Gateway, the front door, call the business logic running in AWS Lambda functions.
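To give a feel for what "business logic on Lambda" looks like, here is a minimal handler sketch. It is not the demo's actual code. It assumes API Gateway's Lambda proxy integration (the request body arrives as a JSON string in `event["body"]`), and the field names `hero` and `mission` are hypothetical. The storage step is injected as a function so the sketch runs locally without AWS.

```python
import json


def make_handler(save_item):
    """Build a Lambda-style handler that stores feedback using save_item(item)."""

    def handler(event, context):
        try:
            feedback = json.loads(event.get("body") or "{}")
        except json.JSONDecodeError:
            return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
        # Hypothetical required fields for the feedback form.
        if not isinstance(feedback, dict) or "hero" not in feedback:
            return {"statusCode": 400, "body": json.dumps({"error": "hero is required"})}
        save_item(feedback)
        return {"statusCode": 200, "body": json.dumps({"stored": True})}

    return handler


# Local stand-in for the database: a plain list.
stored = []
handler = make_handler(stored.append)
ok = handler({"body": json.dumps({"hero": "Storm", "mission": "demo"})}, None)
bad = handler({"body": "not json"}, None)
print(ok, bad, stored)
```

In a deployed function, `save_item` could wrap a DynamoDB write such as `boto3.resource("dynamodb").Table(name).put_item(Item=item)`, with the table name supplied through configuration.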
  37. And all the data is stored in DynamoDB. DynamoDB is a fully managed NoSQL database service that provides consistent, single-digit millisecond latency at any scale. [CLICK] So if you put this together, I will show you a demo where I access a website hosted in an S3 bucket, which uses API Gateway calls to send requests to Lambda backends that store the data in DynamoDB.
  38. https://s3-us-west-2.amazonaws.com/aesf-content/index.html?ea=alias&fn=Andy&ln=Jassy https://120418261770.signin.aws.amazon.com/console
  39. You can get started with your first serverless web application on AWS by taking advantage of the DynamoDB free tier, which can handle up to 200 million requests per month for free. == As part of the AWS Free Tier, DynamoDB customers get 25 GB of storage, 25 writes per second, and 25 reads per second. This lets you handle up to 200 million requests per month, so you can deploy a proof of concept and begin testing the live cloud service. The DynamoDB free tier does not expire at the end of your 12-month AWS Free Tier term. http://aws.amazon.com/free
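One way to arrive at a figure close to 200 million is sketched below. This is my reading of the arithmetic, not an official AWS calculation. It assumes small items and eventually consistent reads, which consume half a read capacity unit each, so 25 read units serve about 50 reads per second.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def free_tier_requests_per_month(days=31, read_units=25, write_units=25,
                                 eventually_consistent=True):
    """Estimate the requests per month that the free-tier capacity can serve."""
    # An eventually consistent read costs half a read capacity unit.
    reads_per_second = read_units * (2 if eventually_consistent else 1)
    writes_per_second = write_units
    return (reads_per_second + writes_per_second) * SECONDS_PER_DAY * days


monthly = free_tier_requests_per_month()
print(monthly)  # 200880000 for a 31-day month
```

With strongly consistent reads the same capacity serves fewer requests (50 per second instead of 75), so the 200 million figure depends on the read consistency the application chooses.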