SQL Based Data Processing in Amazon ECS

Build a configuration-driven, codeless extract-transform-load (ETL) alternative using a containerized ETL framework (ARC) that simplifies and accelerates data processing with Apache Spark.
[Architecture diagram. Components shown: User; Amazon S3 (job flow file store and batch data); AWS Lambda; Amazon CloudWatch Events; Amazon CloudWatch (ETL job log); a VPC with public and private subnets across Availability Zone 1 and Availability Zone 2; Application Load Balancer; AWS Secrets Manager; Amazon Managed Streaming for Apache Kafka (stream data); AWS PrivateLink; an Amazon ECS service on AWS Fargate hosting the ETL IDE (notebook); Amazon ECS tasks on AWS Fargate running ETL jobs; Amazon ECR (image pull).]

1. The user creates an extract-transform-load (ETL) data pipeline based on the ARC framework and SQL scripts in an interactive ARC Jupyter notebook. The pipeline is hosted in Amazon Elastic Container Service (Amazon ECS).

2. The notebook and ETL jobs process batch and stream data via AWS PrivateLink. Traffic between the ETL processes and the data stores does not leave the Amazon network.

3. The ARC Jupyter notebook produces a job flow configuration JSON file. The user uploads this file and the SQL scripts to Amazon S3, either through an automated CI/CD deployment process or manually.

4. An Amazon S3 file arrival event triggers an AWS Lambda function.

5. The Lambda function spins up an Amazon ECS task that either processes batch data transiently or processes stream data continuously in a long-running container. Each job has isolated compute resources.

6. Amazon CloudWatch Events schedules and orchestrates recurring ARC ETL jobs and ECS tasks, using either the AWS Fargate or the Amazon EC2 launch type.

7. The ARC ETL job generates granular application logs for each data processing stage. Amazon CloudWatch offers monitoring and alerting capabilities.
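
Steps 4 and 5 can be illustrated with a minimal Python sketch of a Lambda handler that reacts to an S3 file arrival event and starts one Fargate task per job file. This is not the reference implementation. The cluster name, task definition, container name, subnet ID, the ETL_CONF_URI environment variable, and the s3:// URI scheme are all assumptions made for illustration and would need to match the actual deployment.

```python
from urllib.parse import unquote_plus

# Hypothetical deployment values; replace with the real resource names.
CLUSTER = "arc-etl-cluster"
TASK_DEFINITION = "arc-etl-job"
CONTAINER_NAME = "arc"
PRIVATE_SUBNETS = ["subnet-EXAMPLE"]


def build_run_task_args(event):
    """Translate an S3 event notification into arguments for ecs.run_task."""
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(s3_info["object"]["key"])
    return {
        "cluster": CLUSTER,
        "taskDefinition": TASK_DEFINITION,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": PRIVATE_SUBNETS,
                # Keep the task in the private subnet, as in the diagram.
                "assignPublicIp": "DISABLED",
            }
        },
        "overrides": {
            "containerOverrides": [
                {
                    "name": CONTAINER_NAME,
                    # Assumed variable name for passing the job flow file
                    # location to the ETL container.
                    "environment": [
                        {"name": "ETL_CONF_URI", "value": f"s3://{bucket}/{key}"}
                    ],
                }
            ]
        },
    }


def lambda_handler(event, context):
    """Lambda entry point: start one isolated ECS task for the uploaded file."""
    import boto3  # imported here so the helper above has no AWS dependency

    ecs = boto3.client("ecs")
    response = ecs.run_task(**build_run_task_args(event))
    return [task["taskArn"] for task in response.get("tasks", [])]
```

Keeping the argument construction in a separate function means it can be checked locally with a sample event before the handler is deployed. Because each uploaded job flow file starts its own task, every job gets the isolated compute resources described in step 5.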
Reviewed for technical accuracy March 8, 2021
© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved. AWS Reference Architecture
