
Cloud AWS - Database On AWS

The document discusses relational databases on AWS including Amazon RDS and Aurora. It covers topics like database migration, monitoring, security, high availability, automated backups and database snapshots. The presentation also includes information on performance insights and read replicas.

Uploaded by

MINH120

Learn & Lab

Database
Week 5 – Module 5

Tuan Vo
Solution Architect

© 2021, Amazon Web Services, Inc. or its Affiliates.


Agenda – Module 5

• Relational Databases on AWS
• Database Migration
• Purpose-Built Databases - DynamoDB
• Data Lake Introduction
• Kahoot Game
• Labs


Amazon RDS



Amazon Relational Database Service (Amazon RDS)
Managed relational database service with a choice of six popular database engines

Microsoft SQL Server

• Easy to administer: No need for infrastructure provisioning or installing and maintaining database software
• Available and durable: Automatic Multi-AZ data replication; automated backup, snapshots, and failover
• Highly scalable: Scale database compute and storage with a few clicks with no application downtime
• Fast and secure: SSD storage and guaranteed provisioned I/O; data encryption at rest and in transit


Amazon RDS - fully managed

Spend time innovating & building new apps, not managing infrastructure

You
• Schema design
• Query construction
• Query optimization

AWS
• Automatic fail-over
• Backup & recovery
• Isolation & security
• Industry compliance
• Push-button scaling
• Automated patching & upgrades
• Advanced monitoring
• Routine maintenance


Monitoring RDS/Aurora databases

• Instance: Amazon CloudWatch
  • CPU / Memory / IOPS / Network
  • Per minute metric storage in Amazon CloudWatch
• Operating System: Amazon RDS Enhanced Monitoring
  • Process / Thread list
  • Per second metric storage in Amazon CloudWatch Logs
• Database Engine: Amazon RDS Performance Insights
  • SQL / State / User / Host ("Database Load")
  • Per second metric storage in Amazon RDS


Performance Insights increases productivity

Amazon RDS Performance Insights measures database load over time

• Easy to identify database bottlenecks
  • Top SQL/most intensive queries
• Enables problem discovery
• Adjustable timeframe
  • Hour, day, week, and longer
• Available for all Amazon RDS database engines
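Performance Insights expresses database load as average active sessions. As a rough illustration of that metric (not the service's implementation), the sketch below averages hypothetical per-instant session samples and ranks SQL statements by their share of the load. The sample data and the function names are assumptions made for this example.

```python
from collections import Counter

def database_load(samples):
    """Average active sessions: mean number of active sessions per sample."""
    if not samples:
        return 0.0
    return sum(len(sample) for sample in samples) / len(samples)

def top_sql(samples, n=3):
    """Rank SQL statements by their average number of active sessions."""
    counts = Counter(sql for sample in samples for sql in sample)
    return [(sql, hits / len(samples)) for sql, hits in counts.most_common(n)]

# Hypothetical samples: each inner list holds the SQL text of every session
# that was active at one sampling instant.
samples = [
    ["SELECT a", "SELECT a", "UPDATE b"],
    ["SELECT a"],
    ["SELECT a", "UPDATE b"],
    [],
]

print(database_load(samples))  # 1.5
print(top_sql(samples))        # [('SELECT a', 1.0), ('UPDATE b', 0.5)]
```

A statement that dominates this ranking is the kind of "Top SQL" entry the slide refers to.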


Security and compliance

• Network security
  • Amazon Virtual Private Cloud (VPC) security groups act as a virtual firewall to control inbound and outbound traffic
• Resource access permissions
  • AWS Identity and Access Management (IAM) provides resource-level role permission controls
• Data encryption
  • Encryption at rest using AWS KMS or Oracle/Microsoft TDE
  • SSL protection for data in transit
• Compliance and assurance programs for finance, healthcare, government, and more
  • HIPAA eligibility under a Business Associate Agreement (BAA) with AWS
• Active Directory / Kerberos integration
  • RDS for Oracle, SQL Server, PostgreSQL

Multi-AZ deployments
Enterprise-grade high availability

Fault tolerance across multiple data centers
• Automatic failover
• Synchronous replication
• Enabled with one click

[Diagram labels: Application servers; Primary; Standby; Database failure; New standby; Availability Zone A; Availability Zone B]


Read Replicas
Read scaling and disaster recovery

RDS for MySQL, PostgreSQL, MariaDB, and Oracle
• Relieve pressure on your master node with additional read capacity
• Bring data close to your applications in different regions
• Promote a read replica to a master for faster recovery in the event of disaster

[Diagram labels: Primary (read/write); asynchronous replication; Read replica (read only); BI/reporting application server]


Automated backups
Point-in-time recovery for your DB instance

• Scheduled daily volume backup of entire instance
• Archive database change logs
• 35-day maximum retention
• Minimal impact on database performance
• Taken from standby when running Multi-AZ

Every day during your backup window, RDS creates a storage volume snapshot of your instance.
Every five minutes, RDS backs up the transaction logs of your database.
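The slide describes point-in-time recovery as a daily snapshot plus archived transaction logs. One way to picture how the two combine is the sketch below: it picks the newest snapshot at or before the requested time and reports the log window that would need replaying. The function name, arguments, and dates are hypothetical; this is not how RDS is implemented.

```python
from datetime import datetime

def plan_restore(snapshots, latest_restorable, target):
    """Return (base_snapshot, log_replay_window) for a point-in-time restore.

    snapshots: datetimes of the daily volume snapshots still within retention.
    latest_restorable: time of the most recent transaction-log backup.
    """
    if target > latest_restorable:
        raise ValueError("target is after the latest restorable time")
    earlier = [s for s in snapshots if s <= target]
    if not earlier:
        raise ValueError("no retained snapshot at or before the target")
    base = max(earlier)
    # Restore the base snapshot, then replay archived logs from base up to target.
    return base, (base, target)

# Hypothetical schedule: snapshots at 03:00 on three consecutive days.
snapshots = [datetime(2021, 6, day, 3, 0) for day in (1, 2, 3)]
latest_restorable = datetime(2021, 6, 3, 10, 5)

base, window = plan_restore(snapshots, latest_restorable, datetime(2021, 6, 2, 17, 30))
print(base)  # 2021-06-02 03:00:00
```

Because logs are backed up every five minutes, the latest restorable time in this model trails the present by at most a few minutes.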


Database snapshots
Backups of your entire DB instance in Amazon S3

• Always incremental
• Amazon S3 → 99.999999999% durability
• Supports encryption
• Copy across accounts, across regions

[Diagram labels: Volume; Bucket; Snapshot 1; Snapshot 2; Snapshot 3]


Amazon Aurora



You asked for a cost-effective, enterprise database…

So, we designed Amazon Aurora - enterprise database at open source price, delivered as a managed service

• Speed and availability of high-end commercial databases
• Simplicity and cost-effectiveness of open source databases
• Drop-in compatibility with MySQL and PostgreSQL
• Simple pay as you go pricing


Amazon Aurora is fast…

up to 5x the throughput of MySQL; 3x the throughput of PostgreSQL


Traditional Database Architecture

Databases are all about I/O…

Design principles over the last 40+ years:
• Increase I/O bandwidth
• Decrease number of I/Os consumed

[Diagram: compute node (SQL, Transactions, Caching, Logging) with attached storage]


Traditional Database Architecture… in the Cloud

Compute and storage have different lifetimes
• Instances fail and may be replaced
• Instances are shut down
• Instances are scaled up/down
• Instances are added to cluster to scale out

Compute and storage are best decoupled for scalability, availability and durability

[Diagram: compute node (SQL, Transactions, Caching, Logging) with attached storage]


Scale-out, distributed, multi-tenant storage architecture

• Purpose-built log-structured distributed storage
• Storage volume is striped across hundreds of storage nodes
• Storage nodes with locally attached SSDs
• Continuous backup to Amazon S3

[Diagram: cluster storage volume spanning AZ 1, AZ 2, and AZ 3, with backup to Amazon S3]

Scale-out, distributed, multi-tenant storage architecture

• Six copies of data, two in each Availability Zone, to protect against AZ+1 failure modes
• Storage volume segmented in 10 GB protection groups (PG)
• Each PG contains six 10 GB segments, copies of the same data on different storage nodes, two in each AZ
• Storage volume grows automatically by adding PGs, up to 64 TB

[Diagram: cluster storage volume spanning AZ 1, AZ 2, and AZ 3]
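The segment arithmetic on this slide can be checked with a short sketch. The constants come from the slide; the function names are hypothetical, and treating 64 TB as 64 x 1024 GB is an assumption made only to keep the example concrete.

```python
import math

SEGMENT_GB = 10             # size of one protection group (from the slide)
COPIES_PER_PG = 6           # six segment copies per protection group
COPIES_PER_AZ = 2           # two copies in each of three AZs
MAX_VOLUME_GB = 64 * 1024   # slide's 64 TB limit, read here as 64 * 1024 GB

def protection_groups(data_gb):
    """Number of 10 GB protection groups needed to hold data_gb of data."""
    if not 0 <= data_gb <= MAX_VOLUME_GB:
        raise ValueError("volume size outside the supported range")
    return math.ceil(data_gb / SEGMENT_GB)

def segment_layout(data_gb):
    """Count the segments the volume is spread over, in total and per AZ."""
    pgs = protection_groups(data_gb)
    return {
        "protection_groups": pgs,
        "segments": pgs * COPIES_PER_PG,
        "segments_per_az": pgs * COPIES_PER_AZ,
    }

print(segment_layout(25))
# {'protection_groups': 3, 'segments': 18, 'segments_per_az': 6}
```

Growing the volume in this model only ever appends protection groups, which matches the slide's description of automatic growth.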


High durability storage system, tolerant of AZ+1 failures

Using quorum model for writes and reads:
• 4 out of 6 for writes
• 3 out of 6 for reads (recovery)

Maintains write capability if an AZ fails, maintains read capability if AZ + 1 storage node fails.

Self-healing architecture rebalances hot storage nodes, rebuilds segments from failed hardware

[Diagram: cluster storage volume spanning AZ 1, AZ 2, and AZ 3]


More: https://aws.amazon.com/blogs/database/amazon-aurora-under-the-hood-quorum-and-correlated-failure/
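The quorum numbers above can be verified with a few lines of arithmetic. This is a minimal sketch that assumes two copies per AZ across three AZs, as stated on the earlier slide, and that any extra failed node sits in a surviving AZ; the function names are hypothetical.

```python
TOTAL_COPIES = 6
COPIES_PER_AZ = 2
WRITE_QUORUM = 4
READ_QUORUM = 3

def surviving_copies(failed_azs=0, extra_failed_nodes=0):
    """Copies still reachable after whole-AZ failures plus extra node failures."""
    lost = failed_azs * COPIES_PER_AZ + extra_failed_nodes
    return max(TOTAL_COPIES - lost, 0)

def can_write(failed_azs=0, extra_failed_nodes=0):
    return surviving_copies(failed_azs, extra_failed_nodes) >= WRITE_QUORUM

def can_read(failed_azs=0, extra_failed_nodes=0):
    return surviving_copies(failed_azs, extra_failed_nodes) >= READ_QUORUM

# Read and write quorums must overlap, and two write quorums must overlap.
assert READ_QUORUM + WRITE_QUORUM > TOTAL_COPIES
assert 2 * WRITE_QUORUM > TOTAL_COPIES

print(can_write(failed_azs=1), can_read(failed_azs=1))   # True True
print(can_write(failed_azs=1, extra_failed_nodes=1),
      can_read(failed_azs=1, extra_failed_nodes=1))      # False True
```

Losing one AZ leaves four copies, exactly the write quorum; losing one more node leaves three, which is still enough to read and rebuild.
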
Amazon Aurora cluster topology

• Up to 16 DB instances/nodes in a regional cluster, spanning multiple AZs
• One (first) is always the writer/primary.
• Storage volume shared with readers. Readers open volume in read only mode (MySQL: innodb_read_only = 1, PostgreSQL: transaction_read_only=on).

*Aurora Serverless does not expose compute DB instances directly, and has no readers

[Diagram: DB cluster with one writer and two readers (each with SQL, Transactions, Caching) on a shared cluster storage volume spanning AZ 1, AZ 2, and AZ 3]

Accessing your Aurora databases

• Managed DB service, no OS or filesystem level access
• Connect to writer using Cluster (DNS) Endpoint – always points to writer!
• Round robin load balancing for reads using Reader (DNS) Endpoint (excludes writer except on single node clusters)
• Custom (DNS) Endpoints, read replica auto scaling supported as well

[Diagram: Cluster Endpoint pointing to the writer and Reader Endpoint pointing to two readers, all on a shared cluster storage volume spanning AZ 1, AZ 2, and AZ 3]
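The endpoint behaviour described above can be modelled in a few lines. This is a toy sketch with hypothetical class and instance names: the real endpoints are DNS names, so balancing applies to new connections rather than to individual queries.

```python
from itertools import cycle

class ClusterEndpoints:
    """Toy model of the cluster (writer) endpoint and the reader endpoint."""

    def __init__(self, writer, readers):
        self.writer = writer
        self.readers = list(readers)
        # The reader endpoint excludes the writer unless the cluster has no readers.
        self._targets = cycle(self.readers or [self.writer])

    def cluster_endpoint(self):
        """Always resolves to the current writer."""
        return self.writer

    def reader_endpoint(self):
        """Resolves to the next reader in round-robin order."""
        return next(self._targets)

cluster = ClusterEndpoints("instance-1", ["instance-2", "instance-3"])
print([cluster.reader_endpoint() for _ in range(4)])
# ['instance-2', 'instance-3', 'instance-2', 'instance-3']

single_node = ClusterEndpoints("instance-1", [])
print(single_node.reader_endpoint())  # instance-1
```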


Tolerating compute failures

• Any reader node can be promoted to writer/primary
• Failover tier determines preference on failover reader candidates. Lower values more preferred.
• Failed instances/nodes will be replaced after failover and come online as readers.
• NEW! Readers do not reboot on writer restart/failover (Aurora MySQL ≥2.10).

[Diagram: Cluster Endpoint and Reader Endpoint over three instances with failover tiers 0, 0, and 5; after the writer fails, a reader is promoted to writer; shared cluster storage volume spanning AZ 1, AZ 2, and AZ 3]
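The failover tier rule ("lower values more preferred") can be expressed as a small selection function. This is a sketch with hypothetical instance names; it only narrows the candidates to the most preferred tier and leaves any tie-breaking within that tier out of scope.

```python
def failover_candidates(readers):
    """Return the readers in the lowest (most preferred) failover tier.

    readers: list of (instance_name, failover_tier) pairs.
    """
    if not readers:
        return []
    best_tier = min(tier for _, tier in readers)
    return [name for name, tier in readers if tier == best_tier]

# Tiers mirror the slide's diagram: the failed writer was tier 0,
# leaving one tier-0 reader and one tier-5 reader.
readers = [("reader-a", 0), ("reader-b", 5)]
print(failover_candidates(readers))  # ['reader-a']
```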


Writes and asynchronous readers

• Storage is log structured, Aurora doesn’t flush data pages (or full blocks) to storage
• Redo logs (change vectors for data pages) get streamed continuously to storage nodes by writer, in parallel
• Redo logs also streamed to readers for buffer pool and read view updates.
• Low reader lag < ~20ms

[Diagram: one writer and two readers (each with SQL, Transactions, Caching) over a cluster storage volume spanning AZ 1, AZ 2, and AZ 3, with backup to Amazon S3]

Database Migration



Overview

• Simple to use
• Minimal downtime
• Supports widely used databases
• Low cost
• Fast and easy to setup
• Reliable


Supports widely used databases

Sources*
• Oracle
• SQL Server
• Azure SQL
• PostgreSQL
• MySQL
• SAP ASE
• MongoDB
• Amazon S3
• IBM DB2

Targets**
• Oracle
• SQL Server
• PostgreSQL
• MySQL
• Amazon Redshift
• SAP ASE
• Amazon S3
• Amazon DynamoDB
• Amazon Kinesis
• Amazon Elasticsearch Service

[Diagram label: On-premises database]

* https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.html
** https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.html


Fast and easy to setup

Set up a migration task in minutes

1. Create a replication instance to run the migration
2. Connect to the source database
3. Connect to the target database
4. Create a task
5. Run the task

You can use different tasks with different settings for different environments


AWS Schema Conversion Tool (SCT)

The AWS Schema Conversion Tool helps automate many database schema and code conversion tasks when migrating to a new database engine.

Features
• Schema conversion between database engines
• Database Migration Assessment report for choosing the best target engine
• Code browser that highlights places where manual edits are required


AWS Purpose-Built Databases



The best tool for a job usually differs by use case

Build new applications with purpose-built databases



Purpose-built databases

• Relational: Amazon Aurora, Amazon RDS
• Key-value: Amazon DynamoDB
• Document: Amazon DocumentDB
• Wide column: Amazon Keyspaces
• In-memory: Amazon ElastiCache
• Graph: Amazon Neptune
• Time series: Amazon Timestream
• Ledger: Amazon QLDB




Amazon DynamoDB



The Amazon NoSQL journey

• Dec 2004: Database scalability challenges
• Oct 2007: Dynamo paper published
• Jan 2012: DynamoDB general availability
• Q3 2016: DynamoDB leader in Gartner MQ, Forrester Wave
• Today: Tier 0 service powering most of Amazon


Retail

The internal Amazon.com Herd system supports 100s of millions of active workflows.

Migrated from Oracle to DynamoDB

• Improved customer experience: Workflow processing delays dropped from 1 second to 100 milliseconds.
• Reduced cost: Scaling and maintenance effort dropped 10 times.
• Reduced complexity and risk: Retired more than 300 Oracle hosts.

Amazon DynamoDB supports multiple high-traffic sites and systems including Alexa, the Amazon.com sites, and 442 Amazon fulfillment centers. Across the 66-hour 2020 Prime Day, these sources made 16.4 trillion calls to the DynamoDB API, peaking at 80.1 million requests per second.
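To put the Prime Day figures in proportion, the arithmetic below derives the average request rate from the slide's own numbers and compares it with the quoted peak. Only the three constants come from the slide; the variable names are made up for this example.

```python
TOTAL_CALLS = 16.4e12      # DynamoDB API calls across the event
DURATION_HOURS = 66        # length of the event
PEAK_RPS = 80.1e6          # quoted peak requests per second

average_rps = TOTAL_CALLS / (DURATION_HOURS * 3600)
peak_to_average = PEAK_RPS / average_rps

print(f"average: {average_rps / 1e6:.1f} million requests/s")  # average: 69.0 million requests/s
print(f"peak/average: {peak_to_average:.2f}")                  # peak/average: 1.16
```

The peak was therefore only about 16% above the 66-hour average, meaning the load was sustained rather than a short spike.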


Performance at any scale

• High request volume: Many millions of requests per second per table
• Consistent low latency: Millisecond variance


You work with tables…

Table 1, Table 2, Table 3

DynamoDB does the rest under the hood…

[Diagram: partitions T1.p1 … T1.pn of Table 1 placed on Server 1 … Server N; each partition serves 1K WCU or 3K RCU and holds up to 10 GB]

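The per-partition limits on this slide (1K WCU or 3K RCU, up to 10 GB) suggest a back-of-the-envelope estimate of how many partitions a table might need. The formula below is a commonly cited rule of thumb and an assumption of this sketch, not a documented DynamoDB algorithm; DynamoDB manages partitions itself and real behaviour can differ.

```python
import math

WCU_PER_PARTITION = 1000   # from the slide
RCU_PER_PARTITION = 3000   # from the slide
GB_PER_PARTITION = 10      # from the slide

def estimate_partitions(wcu, rcu, size_gb):
    """Rough lower bound on partitions needed for throughput and for size."""
    by_throughput = math.ceil(wcu / WCU_PER_PARTITION + rcu / RCU_PER_PARTITION)
    by_size = math.ceil(size_gb / GB_PER_PARTITION)
    return max(by_throughput, by_size, 1)

print(estimate_partitions(wcu=2000, rcu=3000, size_gb=25))  # 3
print(estimate_partitions(wcu=0, rcu=0, size_gb=95))        # 10
```

The practical point is that throughput is divided among partitions, so a key design that concentrates traffic on one partition cannot use the whole table's capacity.
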
DynamoDB Table
[Diagram: a table whose items each carry attribute A1 (partition key) and A2 (sort key), plus other attributes (A3 to A7) that vary from item to item]

Partition key
• Mandatory
• Key-value access pattern
• Determines data distribution

Sort key
• Optional
• Model 1:N relationships
• Enables rich query capabilities

All items for a partition key
• ==, <, >, >=, <=
• "begins with"
• "between"
• sorted results
• counts
• top/bottom N values
• paged responses

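The query capabilities listed above follow from keeping each partition key's items ordered by sort key. The in-memory model below is a sketch of that idea only; the class, the helper functions, and the sample keys are hypothetical and this is not the DynamoDB API.

```python
from bisect import bisect_left

class Table:
    """Toy table: items grouped by partition key, ordered by sort key."""

    def __init__(self):
        self._partitions = {}  # partition key -> list of (sort key, item)

    def put_item(self, pk, sk, item):
        rows = self._partitions.setdefault(pk, [])
        i = bisect_left([key for key, _ in rows], sk)
        if i < len(rows) and rows[i][0] == sk:
            rows[i] = (sk, item)        # same primary key: overwrite
        else:
            rows.insert(i, (sk, item))  # keep rows sorted by sort key

    def query(self, pk, condition=lambda sk: True, limit=None, descending=False):
        rows = self._partitions.get(pk, [])
        matches = [item for sk, item in rows if condition(sk)]
        if descending:
            matches.reverse()
        return matches if limit is None else matches[:limit]

def begins_with(prefix):
    return lambda sk: sk.startswith(prefix)

def between(low, high):
    return lambda sk: low <= sk <= high  # inclusive on both ends

orders = Table()
orders.put_item("customer#1", "2021-03-15", {"total": 30})
orders.put_item("customer#1", "2021-01-05", {"total": 10})
orders.put_item("customer#1", "2021-02-10", {"total": 20})
orders.put_item("customer#2", "2021-02-11", {"total": 99})

print(orders.query("customer#1", begins_with("2021-02")))              # [{'total': 20}]
print(len(orders.query("customer#1", between("2021-01-01", "2021-02-28"))))  # 2
print(orders.query("customer#1", limit=1, descending=True))            # [{'total': 30}]
```

Here one customer (partition key) owns many orders (sort keys), which is the 1:N relationship the slide mentions.
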
Item Distribution

[Diagram: aggregates are mapped by the hash of their partition key onto the keyspace of the DynamoDB table "Orders"]

Keyspace: Hash.MIN = 0 (00) to Hash.MAX = FF
• Partition A: 00 to 55
• Partition B: 55 to AA
• Partition C: AA to FF

Items
• OrderId: 1, CountryCode: 1, ASIN: [B00X4WHP5E], Hash(1) = 7B
• OrderId: 2, CountryCode: 1, ASIN: [B00OQVZDJM], Hash(2) = 48
• OrderId: 3, CountryCode: 1, ASIN: [B00U3FPN4U], Hash(3) = CD

Related data (aggregate) is stored together for efficient access

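The mapping from hash value to partition on this slide is a range lookup, which can be sketched as follows. The hash values are copied from the slide because the real hash function is internal to DynamoDB; treating each range's upper bound as inclusive is an assumption of this example.

```python
from bisect import bisect_left

# Upper bound of each partition's hash range, from the slide's keyspace.
UPPER_BOUNDS = [0x55, 0xAA, 0xFF]
PARTITIONS = ["Partition A", "Partition B", "Partition C"]

# Hash values as printed on the slide (OrderId -> hash), not computed here.
SLIDE_HASHES = {1: 0x7B, 2: 0x48, 3: 0xCD}

def partition_for(hash_value):
    """Find the partition whose hash range contains hash_value."""
    if not 0x00 <= hash_value <= 0xFF:
        raise ValueError("hash value outside the keyspace")
    return PARTITIONS[bisect_left(UPPER_BOUNDS, hash_value)]

for order_id, hash_value in SLIDE_HASHES.items():
    print(order_id, f"{hash_value:02X}", partition_for(hash_value))
# 1 7B Partition B
# 2 48 Partition A
# 3 CD Partition C
```

Consecutive order IDs land on different partitions, which is how hashing spreads load evenly across the table.
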
Path of a PutItem request
[Diagram: grids of RR nodes in Availability Zone 1, Availability Zone 2, and Availability Zone 3, connected through the network]


Data Lake Introduction



Companies are increasingly embracing data-driven decision making and fostering an open culture where data is not siloed within departments.


Changing Requirements for Analytics
I WANT SUPPORT FOR . . .

• Any scale, concurrency, with low cost, high throughput & performance
• Data from new sources, streaming, batch, real-time
• Increasingly diverse types of data
• Democratization of data – usage by many people of various skills, make it easy to run & operate
• Choice of tools, techniques, and applications

Using the Right Tool for the Task

[Diagram: data flows from Data Sources (Systems of Record, Systems of Engagement, Sensor & Log Data, External Data) through the stages Move Data, Raw Data, Prepared Data, Consumable Data, Insights, and Consumption.]

• Services along the flow: Amazon Kinesis, AWS Database Migration Service, Amazon S3, Amazon S3 Glacier, AWS Glue, Amazon Neptune, Amazon Redshift, Amazon DynamoDB, Amazon Elasticsearch Service, Amazon ElastiCache, Amazon Athena, Amazon QuickSight, Amazon SageMaker, AWS Lambda, Amazon API Gateway
• Data Processing, Metadata Management: Amazon Athena, AWS Glue, Amazon EMR
• Machine Learning: Amazon Transcribe, Amazon Rekognition, Amazon Comprehend, Amazon SageMaker
• Security, Identity and Compliance; Management and Governance: AWS KMS, AWS IAM, AWS CloudTrail, Amazon CloudWatch, AWS CloudFormation, AWS Config
• Consumers:
  • Data Scientist: Exploration, Integration, Predictive Models
  • Data Experts: Ad-hoc Reports, Create KPIs
  • Business Users: Dashboarding, Use KPIs, Slice & Dice
  • Downstream Systems: Data Feeds, Information Hub
  • Data and Insights Applications: Actionable Insights at the Point of Impact


Serverless data lakes and analytics

[Diagram: sources (Web app data, Amazon RDS, Other databases, On-premises data, Streaming data) → Amazon S3 → AWS Glue crawler → AWS Glue Data Catalog → Amazon Athena → Amazon QuickSight]


AWS Glue—Data Catalog
Make data discoverable

• Automatically discovers data and stores schema
• Catalog makes data searchable, and available for ETL
• Catalog contains table and job definitions
• Computes statistics to make queries efficient

[Diagram labels: Glue Data Catalog; Discover data and extract schema; Compliance]


AWS Glue—ETL Service
Make ETL scripting and deployment easy

• Automatically generates ETL code
• Code is customizable with Python and Spark
• Endpoints provided to edit, debug, test code
• Jobs are scheduled or event-based
• Serverless


Amazon Athena
Example Query



QuickSight
Create Beautiful, Interactive Dashboards

• Add rich interactivity like filters, drill downs, zooming, and more
• Blazing fast navigation
• Accessible on any device
• Data Refresh
• Publish to everyone with a click


Labs



Mandatory lab

Migrate an Oracle DB to Aurora:
https://000043.awsstudygroup.com/vi/2-oracle-aurora/

Prizes:
• 5 AWS T-shirts for the 5 participants who complete the lab fastest.
• 5 AWS T-shirts awarded at random to 5 participants who complete the lab.
• Gifts for participants who find errors in the lab or suggest good ideas to improve it.

Notes:
- Take a screenshot of the AWS Console after completing the labs and post it to the lab-week-3 channel on Slack.
- The screenshot must include the Account ID (in the top-right corner of the AWS Console).
- Lab accounts can be used until 5 PM on Monday of next week.


Optional lab

Build a Data Lake on AWS:
https://000070.awsstudygroup.com/vi/

Notes:
- This lab is optional; no submission is required.


Kahoot Game



Thank you!
