by Peter Dalton, Principal Consultant AWS and Taz Sayed, Sr Technical Account Manager AWS
AWS Data & Analytics Week is an opportunity to learn about Amazon’s family of managed analytics services. These services provide easy, scalable, reliable, and cost-effective ways to manage your data in the cloud. We explain the fundamentals and take a technical deep dive into the Amazon Redshift data warehouse; data lake services including Amazon EMR, Amazon Athena, and Amazon Redshift Spectrum; log analytics with Amazon Elasticsearch Service; and data preparation and placement services with AWS Glue and Amazon Kinesis. You will learn how to get started, how to support applications, and how to scale.
17. Amazon Kinesis: Streaming Data Done the AWS Way
Makes it easy to capture, deliver, and process real-time data streams
Pay as you go, no up-front costs
Elastically scalable
Right services for your specific use cases
Real-time latencies
Easy to provision, deploy, and manage
18. Streaming Data Scenarios Across Verticals
Scenarios/Verticals | Accelerated Ingest-Transform-Load | Continuous Metrics Generation | Responsive Data Analysis
Digital Ad Tech/Marketing | Publisher, bidder data aggregation | Advertising metrics like coverage, yield, and conversion | User engagement with ads, optimized bid/buy engines
IoT | Sensor, device telemetry data ingestion | Operational metrics and dashboards | Device operational intelligence and alerts
Gaming | Online data aggregation, e.g., top 10 players | Massively multiplayer online game (MMOG) live dashboard | Leader board generation, player-skill match
Consumer Online | Clickstream analytics | Metrics like impressions and page views | Recommendation engines, proactive care
19. Amazon Kinesis Firehose
Load massive volumes of streaming data into Amazon S3, Redshift and Elasticsearch
• Zero administration: Capture and deliver streaming data into Amazon S3, Amazon Redshift, and other destinations without writing an application or managing infrastructure.
• Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 secs using simple configurations.
• Seamless elasticity: Seamlessly scales to match data throughput w/o intervention.
• Serverless ETL using AWS Lambda: Firehose can invoke your Lambda function to transform incoming source data.
Capture and submit streaming data
Firehose loads streaming data continuously into Amazon S3, Redshift and Elasticsearch
Analyze streaming data using your favorite BI tools
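The serverless ETL bullet above can be sketched as a small Lambda handler. This is a minimal sketch, assuming the incoming records are JSON objects; the `processed` field it adds is a hypothetical enrichment chosen only for illustration. Firehose passes each record base64-encoded with a `recordId`, and expects every record back with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform a batch of Firehose records and return them in the expected shape."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
        except ValueError:
            payload = None  # not valid base64 / UTF-8 / JSON

        if not isinstance(payload, dict):
            # Hand the record back unchanged and mark it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        # Hypothetical enrichment: flag the record as processed.
        payload["processed"] = True

        # A trailing newline keeps delivered objects line-delimited.
        data = (json.dumps(payload) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(data).decode("utf-8"),
        })
    return {"records": output}
```

Records that cannot be parsed are returned as `ProcessingFailed` rather than silently dropped, so they remain visible for later inspection.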
20. Amazon Kinesis Firehose
Capture IT & App Logs, Device & Sensor Data, and more
Enable near-real time analytics using existing tools
Sources: AWS Platform SDKs, Mobile SDKs, Kinesis Agent, AWS IoT
Destinations: Amazon S3, Amazon Redshift
• Send data from IT infrastructure, mobile devices, and sensors
• Integrated with AWS SDK, Agents, and AWS IoT
• Fully-managed service to capture streaming data
• Elastic w/o resource provisioning
• Pay-as-you-go: 3.5 cents/GB transferred
• Batch, compress, and encrypt data before loads
• Loads data into Redshift tables based on the COPY command
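On the producer side, sending data through the AWS SDK might look like the sketch below. It assumes a boto3 Firehose client (`boto3.client("firehose")`) is passed in as `client`, and that a single `PutRecordBatch` call accepts at most 500 records and 4 MiB; the helper names `chunk_records` and `send_batches` are hypothetical, not part of any AWS library.

```python
MAX_RECORDS_PER_BATCH = 500            # assumed PutRecordBatch record limit
MAX_BYTES_PER_BATCH = 4 * 1024 * 1024  # assumed PutRecordBatch size limit (4 MiB)


def chunk_records(records, max_count=MAX_RECORDS_PER_BATCH,
                  max_bytes=MAX_BYTES_PER_BATCH):
    """Yield lists of byte strings that each fit within one batch call."""
    batch, batch_bytes = [], 0
    for data in records:
        # Start a new batch when adding this record would exceed a limit.
        if batch and (len(batch) >= max_count
                      or batch_bytes + len(data) > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(data)
        batch_bytes += len(data)
    if batch:
        yield batch


def send_batches(client, stream_name, records):
    """Send records in batches; return how many the service reported as failed."""
    failed = 0
    for batch in chunk_records(records):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=[{"Data": data} for data in batch],
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

A caller would typically retry the failed records; that step is left out here to keep the sketch short.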
24. Amazon Kinesis Data Firehose vs. Amazon Kinesis Data Streams
Amazon Kinesis Data Streams is for use cases that require custom processing per incoming record, with sub-second processing latency and a choice of stream processing frameworks.
Amazon Kinesis Data Firehose is for use cases that require zero administration, the ability to use existing analytics tools based on Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service, and a data latency of 60 seconds or higher.
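The 60-second figure follows from buffering: Firehose collects records and delivers them as a batch once a size or time threshold is reached. The class below is a simplified model of that behavior, not the service's implementation. The 5 MiB size threshold is an assumption, the 60-second interval comes from the slides, and `DeliveryBuffer` is a hypothetical name. Unlike the real service, this model only checks the timer when a new record arrives.

```python
class DeliveryBuffer:
    """Toy model of size-or-time buffering before delivery."""

    def __init__(self, max_bytes=5 * 1024 * 1024, max_seconds=60):
        self.max_bytes = max_bytes      # assumed size threshold
        self.max_seconds = max_seconds  # interval taken from the slides
        self._records = []
        self._size = 0
        self._opened_at = None

    def add(self, record, now):
        """Buffer a record; return the flushed batch if a threshold was hit, else None."""
        if self._opened_at is None:
            self._opened_at = now  # the clock starts with the first buffered record
        self._records.append(record)
        self._size += len(record)
        if (self._size >= self.max_bytes
                or now - self._opened_at >= self.max_seconds):
            return self.flush()
        return None

    def flush(self):
        """Return everything buffered so far and reset the buffer."""
        batch, self._records = self._records, []
        self._size = 0
        self._opened_at = None
        return batch
```

With a low-volume source the time threshold dominates, so a record can wait up to the full interval before delivery. Per-record consumers on Kinesis Data Streams avoid that wait, which is the trade-off the comparison above describes.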