AWS Big Data Solution Overview
Ivan Cheng (鄭志帆)
AWS Solutions Architect
What is Big Data?
When your data sets become so large and complex
you have to start innovating around how to
collect, store, process, analyze, and share them.
GB → TB → PB → EB → ZB
Big Data: Unconstrained Growth
Unstructured data growth is explosive
95% of the 1.2 zettabytes of data in the digital universe is unstructured
Machine data and IoT will only steepen the curve
70% of this data is user-generated content
Source: IDC, The Internet of Things: Getting Ready to Embrace Its Impact on the Digital Economy, March 2016.
The Cloud Was Built for Big Data
Elastic and highly scalable
+ No upfront capital expense
+ Only pay for what you use
+ Available on-demand
= The cloud removes constraints
Data → Ingest/Collect → Store → Process/Analyze → Consume/Visualize → Answers & insights
Start here, with a business case
Time to answer (Latency)
Cost
Evolution of Analytics
Retrospective analysis and reporting
Here-and-now real-time processing and dashboards
Predictions to enable smart applications
AWS Big Data Benefits
Immediate Availability. Deploy instantly. No hardware to procure,
no infrastructure to maintain & scale.
Broad & Deep Capabilities. Over 50 services and 100s of features
to support virtually any big data application & workload.
Trusted & Secure. Designed to meet the strictest requirements.
Continuously audited, including certifications such as ISO 27001,
FedRAMP, DoD CSM, and PCI DSS.
Hundreds of Partners & Solutions. Get help from a consulting partner
or choose from hundreds of tools and applications across the entire data
management stack.
[Diagram: AWS big data services across the Collect, Store, and Analyze stages]
AWS Data Pipeline, AWS Database Migration Service, Amazon EMR, Amazon Glacier, Amazon S3, Amazon Kinesis, AWS Direct Connect, Amazon Machine Learning, Amazon Redshift, Amazon DynamoDB, AWS IoT, AWS Snowball, Amazon QuickSight, Amazon Athena, Amazon EC2, Amazon Elasticsearch Service, AWS Lambda, AWS Glue
Key AWS Certifications and Assurance Programs
AWS Big Data Customer Success
AWS Big Data Partners
AWS Big Data Service Overview
Data Movement
AWS Database Migration Service
AWS Direct Connect
AWS Import/Export & Snowball
AWS Storage Gateway
Storage and Databases
Amazon S3
• Store an unlimited number of objects
• Designed for 99.999999999% durability
• Serves as a data lake, integrating with other AWS services (Amazon Kinesis, Amazon Redshift, Amazon EMR, etc.)
• Low cost with tiered storage (Standard, IA, Amazon Glacier) via lifecycle policies
• Secure: SSL in transit, client-side or server-side encryption at rest
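The tiered-storage bullet can be made concrete with a small sketch. It only builds a lifecycle configuration as a plain dictionary, in the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the `logs/` prefix, the 30-day and 90-day thresholds, and the helper name `build_lifecycle_rule` are illustrative assumptions, not values from the deck.

```python
import json


def build_lifecycle_rule(prefix: str, ia_after_days: int, glacier_after_days: int) -> dict:
    """Build a lifecycle configuration that tiers objects under `prefix`.

    Objects move to Standard-IA first and to Glacier later. All thresholds
    are caller-supplied; nothing here talks to AWS.
    """
    if glacier_after_days <= ia_after_days:
        raise ValueError("Glacier transition must come after the IA transition")
    return {
        "Rules": [
            {
                "ID": f"tier-{prefix.strip('/') or 'all'}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Hypothetical example: tier log files after 30 and 90 days.
config = build_lifecycle_rule("logs/", ia_after_days=30, glacier_after_days=90)
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, a dictionary like this would be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`, together with a bucket name of your own.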
Amazon DynamoDB
• Fully managed NoSQL database
• Fast, consistent performance (single-digit millisecond latency at any scale)
• Highly scalable: automatic scaling of throughput capacity
• Highly available and durable
• Store an unlimited amount of data
Amazon Aurora
• Fully managed relational database service
• MySQL- and PostgreSQL-compatible relational database with up to 5x better performance running on the same hardware
• Security, availability, and reliability of commercial databases at 1/10th the cost
• Designed to offer greater than 99.99% availability
• Automatically grows storage as needed, from 10 GB up to 64 TB
• Achieve up to 500,000 reads and 100,000 writes per second
Amazon Redshift
• Fully managed, petabyte-scale, relational, massively parallel processing (MPP) data warehouse
• Built-in end-to-end security, including SSL connections and cluster encryption
• Fault-tolerant: automatically recovers from disk and node failures
• Data automatically backed up to Amazon S3
• $1,000/TB/year; start at $0.25/hour. Provision in minutes; scale from 160 GB to 2 PB of compressed data with just a few clicks
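A short arithmetic sketch of what the two price points on the Redshift slide imply over a year. It assumes a cluster that runs around the clock and uses only the slide's figures; the deck does not say which node types or commitment terms each figure applies to, so treat the result as illustration, not a quote.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def yearly_cost_from_hourly(rate_per_hour: float) -> float:
    """Yearly cost of something billed by the hour and left running all year."""
    return rate_per_hour * HOURS_PER_YEAR


def yearly_cost_from_capacity(terabytes: float, rate_per_tb_year: float = 1000.0) -> float:
    """Yearly cost at a flat per-terabyte rate (default: the slide's $1,000/TB/year)."""
    return terabytes * rate_per_tb_year


entry_cluster = yearly_cost_from_hourly(0.25)   # the slide's entry rate
hundred_tb = yearly_cost_from_capacity(100)     # hypothetical 100 TB warehouse
print(entry_cluster, hundred_tb)
```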
Analytic Frameworks
Amazon EMR
• Managed Hadoop framework
• Apache Hadoop, Hive, Spark, Zeppelin, Presto, HBase, Phoenix, Tez, Flink, etc.
• Auto Scaling clusters with support for on-demand and spot pricing
• Support for end-to-end encryption, IAM/VPC, S3 client-side encryption with customer managed keys and AWS KMS
• Integrates with Amazon S3, Amazon DynamoDB, Amazon Kinesis, and Amazon Redshift
[Diagram: framework logos, including Pig; Amazon EMR connected to Amazon S3 via EMRFS]
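To illustrate the mix of on-demand and spot pricing, here is a sketch that assembles a cluster request as a plain dictionary, using parameter names from boto3's EMR `run_job_flow` call. The release label, instance types, counts, and the helper name `build_emr_request` are assumptions chosen for the example; the two role names are the EMR defaults and may differ in your account.

```python
def build_emr_request(name: str, core_count: int, spot_task_count: int) -> dict:
    """Describe a cluster with on-demand master/core nodes and spot task nodes.

    Nothing is sent to AWS; the dictionary is only constructed.
    """
    def group(label: str, role: str, market: str, count: int) -> dict:
        return {
            "Name": label,
            "InstanceRole": role,
            "Market": market,
            "InstanceType": "m4.large",  # assumption: any supported type works here
            "InstanceCount": count,
        }

    return {
        "Name": name,
        "ReleaseLabel": "emr-5.8.0",  # assumption: pick the release you actually need
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}, {"Name": "Hive"}],
        "Instances": {
            "InstanceGroups": [
                group("master", "MASTER", "ON_DEMAND", 1),
                group("core", "CORE", "ON_DEMAND", core_count),
                group("task", "TASK", "SPOT", spot_task_count),
            ],
            # Terminate the cluster once its steps are done, so billing stops.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


request = build_emr_request("nightly-etl", core_count=2, spot_task_count=4)
markets = {g["InstanceRole"]: g["Market"] for g in request["Instances"]["InstanceGroups"]}
print(markets)
```

Keeping master and core nodes on-demand and putting only task nodes on spot is a common choice, because task nodes hold no HDFS data and can be reclaimed without data loss.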
Amazon Elasticsearch Service
• Fully managed, reliable, and scalable Elasticsearch service
• Support for the ELK stack (Elasticsearch, Logstash, Kibana)
• Integration options with other AWS services (CloudWatch Logs, Amazon DynamoDB, Amazon S3, Amazon Kinesis)
• Use cases: log analytics, full-text search, application monitoring, and more
Amazon Athena
• Serverless query service for querying data in S3 using standard SQL with no infrastructure to manage
• Supports multiple data formats, including text, CSV, TSV, JSON, Avro, ORC, and Parquet
• Pay per query, based on the amount of data scanned; you pay only when you run queries. If you compress your data, you pay less and your queries run faster
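The pay-per-scan point is easy to check with arithmetic. The sketch below assumes a flat price per terabyte scanned; the $5.00 default and the 4:1 compression ratio are illustrative assumptions, not figures from the deck, and any per-query minimum charge is ignored.

```python
TB = 1024 ** 4  # bytes in one terabyte (binary)


def query_cost(bytes_scanned: int, price_per_tb: float = 5.0) -> float:
    """Cost of one query when billing is proportional to bytes scanned.

    `price_per_tb` is an assumed rate; check current pricing for your region.
    """
    return bytes_scanned / TB * price_per_tb


raw_bytes = 2 * TB                 # hypothetical uncompressed data set
compressed_bytes = raw_bytes // 4  # assumed 4:1 compression ratio

raw_cost = query_cost(raw_bytes)
compressed_cost = query_cost(compressed_bytes)
print(raw_cost, compressed_cost)
```

Under these assumptions the compressed scan costs a quarter of the raw one, which is the effect the slide describes.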
Familiar Technologies Under the Covers
Presto: used for SQL queries
• In-memory distributed query engine
• ANSI-SQL compatible with extensions
Hive: used for DDL functionality
• Complex data types
• Multitude of formats
• Supports data partitioning
Amazon QuickSight
• Fast, cloud-powered business analytics
• Easy to use, no infrastructure to manage
• Quick calculations with SPICE
• 1/10th the cost of legacy BI software
• Accessed from any browser or mobile device
AWS Glue
• Fully managed ETL (extract, transform, load) service
• Integrated data catalog, automatic schema discovery, ETL code generation, flexible job scheduler
• Integrated across a wide range of AWS services (Amazon RDS, databases running on Amazon EC2, Amazon Athena, etc.)
How AWS Glue Works
1. Build your data catalog
2. Generate and edit transformations
3. Schedule and run your jobs
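As a rough picture of what "automatic schema discovery" in step 1 means, the toy sketch below infers column types from a few sample records. It is not AWS Glue's crawler logic and uses no Glue API; the type names, the widening rule, and the sample records are all assumptions made for illustration.

```python
def infer_type(value) -> str:
    """Map a Python value to a coarse column type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    return "string"


def infer_schema(records: list) -> dict:
    """Infer one type per column, widening when records disagree."""
    schema = {}
    for record in records:
        for column, value in record.items():
            seen = infer_type(value)
            previous = schema.setdefault(column, seen)
            if previous != seen:
                # Integers and floats widen to double; any other mix becomes string.
                numeric_mix = {previous, seen} == {"bigint", "double"}
                schema[column] = "double" if numeric_mix else "string"
    return schema


# Hypothetical sample records, as a crawler might read them from a JSON file.
samples = [
    {"id": 1, "price": 9.5, "name": "widget"},
    {"id": 2, "price": 10, "name": "gadget", "in_stock": True},
]
schema = infer_schema(samples)
print(schema)
```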
Real-time Analytics
Amazon Kinesis
• Fully managed streaming application
• Scalable: handle any amount of streaming data
• Ingest, buffer, and process data in real time
• React quickly: derive insight in seconds
Amazon Kinesis
Amazon Kinesis Streams: Build your own custom applications that process or analyze streaming data
Amazon Kinesis Firehose: Easily load massive volumes of streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service
Amazon Kinesis Analytics: Easily analyze data streams using standard SQL queries
Amazon Kinesis Streams
• Reliably ingest and durably store streaming data at low
cost
• Build custom real-time applications to process
streaming data
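For anyone building a custom application on Kinesis Streams, it helps to see how a record finds its shard: the service takes the MD5 hash of the record's partition key and maps the resulting 128-bit integer to the shard whose hash key range contains it. The sketch below reproduces that mapping under one simplifying assumption, namely that the shards split the hash space evenly (after resharding, real ranges can be uneven). The function name and device IDs are hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would receive `partition_key`.

    Assumes `shard_count` shards with equal, contiguous hash key ranges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


# Records with the same partition key always land on the same shard,
# which is what preserves per-key ordering.
for device_id in ["sensor-1", "sensor-2", "sensor-3"]:
    print(device_id, shard_for_key(device_id, shard_count=4))
```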
Amazon Kinesis Firehose
Reliably ingest and deliver batched, compressed, and encrypted
data to S3, Amazon Redshift, and Amazon Elasticsearch Service
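The "batched, compressed" delivery described for Firehose can be pictured with a small in-memory buffer. This is a sketch of the general pattern, not Firehose's implementation: it flushes on size only (the service also flushes on a time interval), skips encryption, and uses a hypothetical `BatchBuffer` class with a tiny 32-byte threshold so the behavior is visible.

```python
import gzip


class BatchBuffer:
    """Collect records and hand them to `deliver` as one gzip-compressed batch."""

    def __init__(self, max_bytes: int, deliver):
        self.max_bytes = max_bytes
        self.deliver = deliver  # callable that receives the compressed payload
        self._records = []
        self._size = 0

    def put(self, record: bytes) -> None:
        self._records.append(record)
        self._size += len(record)
        if self._size >= self.max_bytes:
            self.flush()

    def flush(self) -> None:
        if not self._records:
            return
        # Newline-delimited records, compressed as a single object.
        payload = gzip.compress(b"\n".join(self._records) + b"\n")
        self.deliver(payload)
        self._records = []
        self._size = 0


delivered = []  # stands in for the destination, e.g. objects written to S3
buffer = BatchBuffer(max_bytes=32, deliver=delivered.append)
for i in range(5):
    buffer.put(f"event-{i:02d}-payload".encode())  # 16 bytes per record
buffer.flush()  # deliver whatever is left at shutdown
print(len(delivered), "batches delivered")
```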
Amazon Kinesis Analytics
Interact with streaming data in real time using SQL
AWS Marketplace for Big Data Solutions
Hundreds of big data products are immediately available through AWS Marketplace
Advanced Analytics | Database and Data Enablement | Business Intelligence
Fully integrated | 1-click deployment | Pay-as-you-go pricing
Modern Data Analytics Architecture on AWS
[Diagram, built up over several slides: modern data architecture]
Data sources: Transactions, Web logs / cookies, ERP, Connected devices, Social media
Ingest: AWS Database Migration Service, AWS Direct Connect, Internet interfaces, Amazon Kinesis
Scale (Batch): Amazon S3 (raw data), Amazon EMR and AWS Glue (ETL), Amazon S3 (staged data, the data lake)
Advanced analytics: MLlib, deep learning, Amazon ML
Speed (Real-time): event capture (Amazon Kinesis), stream analysis (Amazon EMR), event scoring (Amazon AI), event handler (AWS Lambda), response handler (AWS Lambda)
Serving: data warehouse (Amazon Redshift), legacy apps (Amazon RDS), schemaless (Amazon Elasticsearch Service), direct query (Amazon Athena), near-zero latency (Amazon DynamoDB), semi/unstructured (Amazon EMR), Amazon QuickSight, Amazon API Gateway
Consumers: data analysts, data scientists, business users, engagement platforms, automation / events
Across all layers: AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS
Modern data architecture: insights to enhance business applications, new digital services
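The real-time path in this architecture ends in an AWS Lambda event handler fed by Amazon Kinesis. A minimal sketch of such a handler follows. The Kinesis-to-Lambda event shape (base64-encoded data under `Records[].kinesis.data`) is the standard one; the payload fields `event_id` and `score`, the 0.8 threshold, and the helper `_fake_record` are assumptions made for the example.

```python
import base64
import json

ALERT_THRESHOLD = 0.8  # assumption: scores at or above this trigger a response


def handler(event, context):
    """Decode each Kinesis record and collect the IDs of high-scoring events."""
    alerts = []
    for record in event["Records"]:
        # Kinesis delivers record data to Lambda base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        if payload.get("score", 0.0) >= ALERT_THRESHOLD:
            alerts.append(payload["event_id"])
    return {"processed": len(event["Records"]), "alerts": alerts}


def _fake_record(body: dict) -> dict:
    """Build a record shaped like the ones Lambda receives (for local testing)."""
    data = base64.b64encode(json.dumps(body).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


sample_event = {
    "Records": [
        _fake_record({"event_id": "a1", "score": 0.93}),
        _fake_record({"event_id": "b2", "score": 0.12}),
    ]
}
result = handler(sample_event, None)
print(result)
```

Because the handler is a plain function, it can be exercised locally with a fabricated event, as above, before it is deployed.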
Reference Architecture
Sample Reference Architecture: Data Lake
[Diagram: data lake reference architecture; >5 PB, up to 75 billion events per day]
Users, Data Providers
Data Ingestion Services (Auto Scaling EC2)
Source of Truth (S3)
Normalization ETL Clusters (EMR)
Optimization ETL Clusters (EMR)
Query Optimized (S3)
Batch Analytic Clusters
Ad Hoc Query Cluster (EMR)
Query Clusters (EMR)
Data Marts (Amazon Redshift)
Analytics Apps (Auto Scaling EC2)
Athena, Glue
Shared Data Services
Shared Metastore (RDS)
Reference Data (RDS)
Data Catalog & Lineage Services (Auto Scaling EC2)
Cluster Mgt & Workflow Services (Auto Scaling EC2)
Enterprise Data Warehouse
[Diagram: Data Sources, Amazon S3, Amazon EMR, Amazon S3, Amazon Redshift, Amazon Athena, Amazon QuickSight]
Data → Ingest/Collect → Store → Process/Analyze → Consume/Visualize → Outcomes & insights
NORDSTROM
• Personalized recommendations within seconds (down from 15-20 min)
• Scale the expertise of stylists to all shoppers
• Reduce costs by two orders of magnitude
…
[Diagram: Mobile Users, Desktop Users, Online Stylist, Analytics Tools; Amazon Kinesis, AWS Lambda, Amazon DynamoDB, AWS Lambda, Amazon S3 (Data Storage), Amazon Redshift]
Big Data on AWS:
https://aws.amazon.com/big-data/
Thank you!

AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Welcome & AWS Big Data Solution Overview

  • 1. AWS Big Data Solution Overview Ivan Cheng (鄭志帆) AWS Solutions Architect
  • 2. What is Big Data? When your data sets become so large and complex you have to start innovating around how to collect, store, process, analyze, and share them.
  • 3. Big Data: Unconstrained Growth (scale: GB, TB, PB, EB, ZB). Unstructured data growth is explosive. 95% of the 1.2 zettabytes of data in the digital universe is unstructured. 70% of this data is user-generated content. Machine data and IoT will only steepen the curve. Source: IDC, The Internet of Things: Getting Ready to Embrace Its Impact on the Digital Economy, March 2016.
  • 4. The Cloud Was Built for Big Data
  • 5. Elastic and highly scalable + No upfront capital expense + Only pay for what you use + Available on-demand = the Cloud removes constraints
  • 6. Data -> Ingest/Collect -> Store -> Process/Analyze -> Consume/Visualize -> Answers & insights. Start here with a business case. Time to answer (latency). Cost.
  • 7. Evolution of Analytics: Retrospective analysis and reporting; here-and-now real-time processing and dashboards; predictions to enable smart applications
  • 8. AWS Big Data Benefits: Immediate Availability. Deploy instantly. No hardware to procure, no infrastructure to maintain & scale. Broad & Deep Capabilities. Over 50 services and 100s of features to support virtually any big data application & workload. Trusted & Secure. Designed to meet the strictest requirements. Continuously audited, including certifications such as ISO 27001, FedRAMP, DoD CSM, and PCI DSS. Hundreds of Partners & Solutions. Get help from a consulting partner or choose from hundreds of tools and applications across the entire data management stack.
  • 9. AWS big data services (Collect, Store, Analyze): AWS Data Pipeline, AWS Database Migration Service, EMR, Amazon Glacier, S3, Amazon Kinesis, Direct Connect, Amazon Machine Learning, Amazon Redshift, DynamoDB, AWS IoT, AWS Snowball, QuickSight, Amazon Athena, EC2, Amazon Elasticsearch Service, Lambda, AWS Glue
  • 10. Key AWS Certifications and Assurance Programs
  • 11. AWS Big Data Customer Success
  • 12. AWS Big Data Partners
  • 13. AWS Big Data Service Overview
  • 14. Data Movement: AWS Database Migration Service; AWS Direct Connect; AWS Import/Export & Snowball; AWS Storage Gateway
  • 16. Amazon S3: Store an unlimited number of objects; designed for 99.999999999% durability; serves as a data lake, integrating with other AWS services (Amazon Kinesis, Amazon Redshift, Amazon EMR, etc.); low cost with tiered storage (Standard, IA, Amazon Glacier) via lifecycle policy; secure, with SSL in transit and client-side or server-side encryption at rest
  • 17. Amazon DynamoDB: Fully managed NoSQL database; fast, consistent performance (single-digit millisecond latency at any scale); highly scalable, with automatic scaling of throughput capacity; highly available and durable; stores an unlimited amount of data
  • 18. Amazon Aurora: Fully managed relational database service; MySQL- and PostgreSQL-compatible relational database with up to 5x better performance running on the same hardware; security, availability, and reliability of commercial databases at 1/10th the cost; designed to offer greater than 99.99% availability; automatically grows storage as needed, from 10 GB up to 64 TB; achieves up to 500,000 reads and 100,000 writes per second
  • 19. Amazon Redshift: Fully managed, petabyte-scale, relational MPP data warehouse; built-in end-to-end security, including SSL connections and cluster encryption; fault-tolerant, automatically recovering from disk and node failures; data automatically backed up to Amazon S3; $1,000/TB/year, starting at $0.25/hour; provision in minutes and scale from 160 GB to 2 PB of compressed data with just a few clicks
  • 21. Amazon EMR: Managed Hadoop framework; Apache Hadoop, Hive, Spark, Zeppelin, Presto, HBase, Phoenix, Tez, Flink, etc.; Auto Scaling clusters with support for on-demand and spot pricing; support for end-to-end encryption, IAM/VPC, and S3 client-side encryption with customer-managed keys and AWS KMS; integrates with Amazon S3, Amazon DynamoDB, Amazon Kinesis, and Amazon Redshift
  • 23. Amazon Elasticsearch Service: Fully managed, reliable, and scalable Elasticsearch service; support for ELK; integration options with other AWS services (CloudWatch Logs, Amazon DynamoDB, Amazon S3, Amazon Kinesis); use cases: log analytics, full-text search, application monitoring, and more
  • 24. Amazon Athena: Serverless query service for querying data in S3 using standard SQL, with no infrastructure to manage; supports multiple data formats, including text, CSV, TSV, JSON, Avro, ORC, and Parquet; pay per query, only when you’re running queries, based on data scanned. If you compress your data, you pay less and your queries run faster
  • 25. Familiar Technologies Under the Covers. Used for SQL queries: in-memory distributed query engine; ANSI-SQL compatible with extensions. Used for DDL functionality: complex data types; multitude of formats; supports data partitioning
  • 26. Amazon QuickSight: Fast, cloud-powered business analytics; easy to use, no infrastructure to manage; quick calculations with SPICE; 1/10th the cost of legacy BI software; accessed from any browser or mobile device
  • 27. AWS Glue: Fully managed ETL (extract, transform, load) service; integrated data catalog, automatic schema discovery, ETL code generation, flexible job scheduler; integrated across a wide range of AWS services (Amazon RDS, databases running on Amazon EC2, Amazon Athena, etc.)
  • 28. How AWS Glue Works: (1) Build your data catalog; (2) Generate and edit transformations; (3) Schedule and run your jobs
  • 30. Amazon Kinesis: Fully managed streaming application; scalable, handling any amount of streaming data; ingest, buffer, and process data in real time; react quickly and derive insight in seconds
  • 31. Amazon Kinesis. Amazon Kinesis Streams: build your own custom applications that process or analyze streaming data. Amazon Kinesis Firehose: easily load massive volumes of streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch. Amazon Kinesis Analytics: easily analyze data streams using standard SQL queries
  • 32. Amazon Kinesis Streams: Reliably ingest and durably store streaming data at low cost; build custom real-time applications to process streaming data
  • 33. Amazon Kinesis Firehose: Reliably ingest and deliver batched, compressed, and encrypted data to S3, Amazon Redshift, and Amazon Elasticsearch Service
  • 34. Amazon Kinesis Analytics: Interact with streaming data in real time using SQL
  • 35. AWS Marketplace for Big Data Solutions: Hundreds of big data products are immediately available through the AWS Marketplace. Categories: Advanced Analytics; Database and Data Enablement; Business Intelligence. Fully Integrated | 1-click deployment | Pay-as-you-go pricing
  • 36. Modern Data Analytics Architecture on AWS
  • 37-45. Modern data architecture: insights to enhance business applications, new digital services (one diagram, built up across these slides).
    Data sources: Transactions; Web logs / cookies; ERP; Connected devices; Social media.
    Ingest: AWS Database Migration; AWS Direct Connect; Internet Interfaces; Amazon Kinesis.
    Scale (Batch): Amazon S3 Raw Data; ETL (Amazon EMR, AWS Glue); Amazon S3 Staged Data (Data Lake); Advanced Analytics (MLlib, Deep Learning, Amazon ML).
    Speed (Real-time): Event Capture (Amazon Kinesis); Stream Analysis (Amazon EMR); Event Scoring (Amazon AI); Event Handler (AWS Lambda); Response Handler (AWS Lambda).
    Serving: Data Warehouse (Amazon Redshift); Legacy Apps (Amazon RDS); Schemaless (Amazon Elasticsearch); Direct Query (Amazon Athena); Near-Zero Latency (Amazon DynamoDB); Semi/Unstructured (Amazon EMR); Amazon QuickSight; Amazon API Gateway.
    Served to: Data analysts; Data scientists; Business users; Engagement platforms; Automation / events.
    Across all layers: AWS CloudTrail; AWS IAM; Amazon CloudWatch; AWS KMS.
  • 47. Sample Reference Architecture: Data Lake (Athena, Glue)
  • 48. Data lake components: Users; Data Providers; Data Ingestion Services (Auto Scaling EC2); Source of Truth (S3); Normalization ETL Clusters (EMR); Optimization ETL Clusters (EMR); Query Optimized (S3); Batch Analytic Clusters; Ad Hoc Query Cluster (EMR); Query Clusters (EMR); Data Marts (Amazon Redshift); Analytics Apps (Auto Scaling EC2). Shared Data Services: Shared Metastore (RDS); Reference Data (RDS); Data Catalog & Lineage Services (Auto Scaling EC2); Cluster Mgt & Workflow Services (Auto Scaling EC2). Scale: >5 PB, up to 75 billion events per day
  • 50. NORDSTROM: Data -> Ingest/Collect -> Store -> Process/Analyze -> Consume/Visualize -> Outcomes & insights. Outcomes: personalized recommendations within seconds (from 15-20 min); scale the expertise of stylists to all shoppers; reduce costs by two orders of magnitude … Components: Mobile Users; Desktop Users; Analytics Tools; Online Stylist; Amazon Redshift; Amazon Kinesis; AWS Lambda; Amazon DynamoDB; Amazon S3 Data Storage
  • 51. Big Data on AWS: https://aws.amazon.com/big-data/
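
The Amazon S3 slide mentions low-cost tiered storage (Standard, IA, Amazon Glacier) driven by a lifecycle policy. A minimal sketch of what such a policy could look like follows. It only builds the configuration as a plain dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the prefix, rule ID, and day thresholds are illustrative assumptions, not values from the deck.

```python
import json


def tiered_storage_lifecycle(prefix: str, ia_after_days: int, glacier_after_days: int) -> dict:
    """Build a lifecycle configuration that moves objects to colder tiers over time.

    All argument values are chosen by the caller; nothing here is prescribed by the slides.
    """
    if not 0 < ia_after_days < glacier_after_days:
        raise ValueError("expected 0 < ia_after_days < glacier_after_days")
    return {
        "Rules": [
            {
                "ID": f"tier-{prefix.strip('/') or 'all'}",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    # Standard -> Infrequent Access
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    # Infrequent Access -> Glacier
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Hypothetical example: age out objects under "logs/" after 30 and 90 days.
config = tiered_storage_lifecycle("logs/", ia_after_days=30, glacier_after_days=90)
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, the dictionary could be applied with `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config)`; the bucket name is left to the reader.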
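
The Amazon Athena slide states that queries are billed by data scanned, so compressing data lowers the bill. The arithmetic can be sketched as below. The price per terabyte, the dataset size, and the 4:1 compression ratio are all hypothetical placeholders; check current pricing before relying on any number.

```python
TB = 10**12  # assumption: decimal terabytes


def query_cost(bytes_scanned: int, price_per_tb: float) -> float:
    """Cost of one query when billing is proportional to bytes scanned."""
    return bytes_scanned / TB * price_per_tb


PRICE_PER_TB = 5.0          # hypothetical price, in dollars per TB scanned
raw_bytes = 2 * TB          # hypothetical uncompressed dataset
compressed_bytes = raw_bytes // 4  # hypothetical 4:1 compression ratio

raw_cost = query_cost(raw_bytes, PRICE_PER_TB)
compressed_cost = query_cost(compressed_bytes, PRICE_PER_TB)
print(f"full scan, uncompressed: ${raw_cost:.2f}")
print(f"full scan, compressed:   ${compressed_cost:.2f}")
```

Under these assumptions the compressed scan costs a quarter as much, because the cost scales linearly with the bytes read.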
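
The Amazon Kinesis Streams slides describe building custom applications on top of a stream. One detail that helps when designing such applications is how records are spread across shards: Kinesis hashes each record's partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. The sketch below assumes the hash space is split evenly across shards, which is a simplification; real streams can have uneven ranges after resharding. The function name and sample keys are hypothetical.

```python
import hashlib

HASH_SPACE = 2**128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would own this partition key.

    Assumes shard_count equal-sized hash key ranges (illustrative only).
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so the remainder of an uneven division lands in the last shard.
    return min(hash_key // range_size, shard_count - 1)


for key in ["device-001", "device-002", "device-003"]:
    print(key, "-> shard", shard_for_key(key, shard_count=4))
```

Because the mapping depends only on the key, all records with the same partition key land on the same shard, which preserves their relative order for a consumer.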
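
The Amazon Kinesis Firehose slide describes delivering batched, compressed data to destinations such as S3. The idea of buffering records and compressing each batch can be sketched locally as follows. This is not Firehose's implementation: it buffers on size only (the service also flushes on a time interval), and the threshold and record format are assumptions made for illustration.

```python
import gzip
from typing import Iterable, List


def batch_and_compress(records: Iterable[bytes], max_batch_bytes: int) -> List[bytes]:
    """Group newline-delimited records into batches and gzip each batch."""
    batches: List[bytes] = []
    buffer: List[bytes] = []
    size = 0
    for record in records:
        line = record + b"\n"
        # Flush before the buffer would exceed the size threshold.
        if buffer and size + len(line) > max_batch_bytes:
            batches.append(gzip.compress(b"".join(buffer)))
            buffer, size = [], 0
        buffer.append(line)
        size += len(line)
    if buffer:  # flush whatever is left at the end
        batches.append(gzip.compress(b"".join(buffer)))
    return batches


# Hypothetical input: ten small JSON events, 13 bytes each with the newline.
records = [f'{{"event": {i}}}'.encode() for i in range(10)]
batches = batch_and_compress(records, max_batch_bytes=40)
print(len(batches), "compressed batches")
```

Each element of `batches` is an independent gzip object, so a downstream reader can decompress the batches one at a time.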