Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
SlideShare a Scribd company logo
Digital Media Ingest and Storage
Options on AWS
Erik Durand
Amazon Web Services
Content has Gravity and is getting heavier …
…it’s easier to move processing to the content
4k/8k
Content
Where is the problem?
More Bandwidth
$$$$$
More Powerful
Compute
$$$$$
Way more Storage
$$$$$
Some Progress
(ABR, HEVC, VP10)
Where is the sliding scale on my Infrastructure?
Amazon EFS
File
Amazon EBS
Amazon EC2
Instance Store
Block
Amazon S3 Amazon Glacier
Object
Data Transfer
AWS Direct
Connect
AWS
Snowball
ISV Connectors Amazon
Kinesis
Firehose
S3 Transfer
Acceleration
Storage
Gateway
AWS Storage is a platform
A Concept - the Content Lake
Inspired from Data Lake (Coined by James Dixon in 2010)
A single store of all of digital content that you create and
acquire in any form or factor
•Don’t assume any resolutions/formats (for now or future)
•It is up to the consumer (application consuming the content) to use the
appropriate infrastructure for processing
Amazon S3 : the Content Lake
• Durable, cost-effective and fast
• Highly scalable front-end
– Multi-part uploads (parallel writes)
– Range-gets (parallel reads)
• No need for capacity planning or
provisioning
• Use Amazon S3 with on-premises
storage in a hybrid model
• Secure
S3 scalability: buckets and objects
1 PB raw storage
800 TB usable storage
600 TB allocated storage
400 TB application data
S3 capacity pricing—pay only for what you use
AWS Cloud
Storage
Object Storage Options
S3 Standard S3 Standard - Infrequent
Access
Amazon Glacier
Active data Archive dataInfrequently accessed data
- Transition Standard to Standard-IA
- Transition Standard-IA to Amazon Glacier
- Expiration lifecycle policy
- Versioning support
- Prefix support
Data Lifecycle Management
T T+3 days T+5 days T+ 15 days T + 25 days T + 30 days T + 60 days T + 90 days T + 150 days T + 250 days T + 365 days
Data access frequency over time
Save money on storage
58% saving over S3 Standard
44% saving over S3 Standard-IA
* Assumes the highest public pricing tier
Securing your data on AWS
• AWS alignment with the latest MPAA cloud
based application guidelines for content security
– August 2015
• VPC private endpoint for Amazon S3 – enables
a true private workflow capability
• Encryption & key management capabilities
• Amazon Glacier Vault for high-value
media/originals
Hydrating the Content Lake
Amazon S3 Amazon S3
(multi-part Upload)
Direct Connect
N x 1G | 10G
Massively Scalable Front-end
S3 Transfer
Acceleration AWS Snowball
Storage Gateway
Avere - Demonstrated M&E Success
• Who uses Avere? Movie studios for the top-20
blockbusters of 2015 for special effects
Customer Challenges
• Scale rendering and transcoding performance
• Cost, space & power
• Managing storage silos
• High latency of data access over WAN
• Add compute resources at peak times
• Need for 2-3 months, no long-term commitment
• Do NOT want to rewrite applications
Avere Benefits
• Hot data stored on RAM & SSD within FXT cache
• Bulk of data can remain on-prem or on inexpensive
S3
• Caching of remote data eliminates WAN latency
• Clustering provides scalable NAS performance
• Hybrid model - FXT filer on-prem and/or vFXT on EC2
• Pay only for what is used
Optimizing Data Transfer Method #1:
The Highway
What is AWS Snowball? Petabyte-scale data transport
E-ink shipping
label
Ruggedized case
“8.5G impact”
All data encrypted
end-to-end
Rain- and dust-
resistant
Tamper-resistant
case and
electronics
80 TB
10 GE network
How it works
Use cases: AWS Import/Export Snowball
Cloud
Migration
Disaster
Recovery
Data Center
Decommission
Content
Distribution
How fast is Snowball?
• Less than 1 day to transfer 200TB via 3x10G connections with 3
Snowballs, less than 1 week including shipping
• Number of days to transfer 200TB via the Internet at typical utilizations
Internet Connection Speed
Utilization 1Gbps 500Mbps 300Mbps 150Mbps
25% 71 141 236 471
50% 36 71 118 236
75% 24 47 225 157
Optimizing Data Transfer Method #2:
The Internet
Complicated
Setup and Management
Hard to Optimize
Performance
Expensive
Optimizing Internet Performance Is Hard
Prioprietary
Introducing Amazon S3 transfer acceleration
S3 Bucket
AWS Edge
Location
Uploader
Optimized
Throughput!
Typically 50%–400% faster
Change your endpoint, not your code
54 global edge locations
No firewall exceptions
No client software required
Rio De
Janeiro
Warsaw New York Atlanta Madrid Virginia Melbourne Paris Los
Angeles
Seattle Tokyo Singapore
Time[hrs]
500 GB upload from these edge locations to a bucket in Singapore
Public Internet
How fast is S3 transfer acceleration?
S3 Transfer Acceleration
We have customers uploading large files from all over
the world.
We’ve seen performance improvements in excess of
500% in some cases.
- Emery Wells, Cofounder/CTO
”
“
Use case: media uploads
Regional Lakes…
AWS is available today in the U.S., Brazil, Europe, Japan,
Singapore, Australia, and China. Additional regions in India, Korea,
the UK, and Ohio are expected to come online over the next 12 –
18 months.
Over 1 million active customers across
190 countries
2,000 government agencies
5,000 educational institutions
12 regions
32 availability zones
54 edge locations
Region
Edge Location
Source
(Virginia)
Destination
(Oregon)
• Only replicates new PUTs. Once
S3 is configured, all new uploads
into a source bucket will be
replicated
• Entire bucket or prefix based
• 1:1 replication between any 2
regions
Use cases
Compliance - store data hundreds of miles apart
Lower latency - distribute data to remote customers/partners)
S3 cross-region replication
Automated, fast, and reliable asynchronous replication of data across AWS regions
Amazon S3
Amazon S3 (range-gets)
Direct Connect
N x 1G | 10G
Massively Scalable S3 Front-end
EBS
Instance
Store
c
Massively Scalable
Compute on AWS Cloud
On-Prem Apps
Consuming the Content Lake
Media Workloads (redefined)
EBS
Instance
Store
Amazon EBS/EFS/EC2 Instance Store
Process
Partner/Affiliate/
Service Provider
User Delivery/ConsumptionVFX/Production
On-Prem Apps
Archive
Amazon Glacier (Life Cycle Policies)
c
c
Direct Connect
Content Access Transfer
Disposable Infrastructure
Auto-scaling
Workload specific
Amazon S3
EFS
Q&A
Learn more at: http://aws.amazon.com/s3/
http://aws.amazon.com/glacier/
http://aws.amazon.com/importexport/
eddurand@amazon.com

More Related Content

Digital Media Ingest and Storage Options on AWS

  • 1. Digital Media Ingest and Storage Options on AWS Erik Durand Amazon Web Services
  • 2. Content has Gravity and is getting heavier … …it’s easier to move processing to the content 4k/8k Content
  • 3. Where is the problem? More Bandwidth $$$$$ More Powerful Compute $$$$$ Way more Storage $$$$$ Some Progress (ABR, HEVC, VP10)
  • 4. Where is the sliding scale on my Infrastructure?
  • 5. Amazon EFS File Amazon EBS Amazon EC2 Instance Store Block Amazon S3 Amazon Glacier Object Data Transfer AWS Direct Connect AWS Snowball ISV Connectors Amazon Kinesis Firehose S3 Transfer Acceleration Storage Gateway AWS Storage is a platform
  • 6. A Concept - the Content Lake Inspired from Data Lake (Coined by James Dixon in 2010) A single store of all of digital content that you create and acquire in any form or factor •Don’t assume any resolutions/formats (for now or future) •It is up to the consumer (application consuming the content) to use the appropriate infrastructure for processing
  • 7. Amazon S3 : the Content Lake • Durable, cost-effective and fast • Highly scalable front-end – Multi-part uploads (parallel writes) – Range-gets (parallel reads) • No need for capacity planning or provisioning • Use Amazon S3 with on-premises storage in a hybrid model • Secure
  • 9. 1 PB raw storage 800 TB usable storage 600 TB allocated storage 400 TB application data S3 capacity pricing—pay only for what you use AWS Cloud Storage
  • 10. Object Storage Options S3 Standard S3 Standard - Infrequent Access Amazon Glacier Active data Archive dataInfrequently accessed data
  • 11. - Transition Standard to Standard-IA - Transition Standard-IA to Amazon Glacier - Expiration lifecycle policy - Versioning support - Prefix support Data Lifecycle Management T T+3 days T+5 days T+ 15 days T + 25 days T + 30 days T + 60 days T + 90 days T + 150 days T + 250 days T + 365 days Data access frequency over time
  • 12. Save money on storage 58% saving over S3 Standard 44% saving over S3 Standard-IA * Assumes the highest public pricing tier
  • 13. Securing your data on AWS • AWS alignment with the latest MPAA cloud based application guidelines for content security – August 2015 • VPC private endpoint for Amazon S3 – enables a true private workflow capability • Encryption & key management capabilities • Amazon Glacier Vault for high-value media/originals
  • 14. Hydrating the Content Lake Amazon S3 Amazon S3 (multi-part Upload) Direct Connect N x 1G | 10G Massively Scalable Front-end S3 Transfer Acceleration AWS Snowball Storage Gateway
  • 15. Avere - Demonstrated M&E Success • Who uses Avere? Movie studios for the top-20 blockbusters of 2015 for special effects Customer Challenges • Scale rendering and transcoding performance • Cost, space & power • Managing storage silos • High latency of data access over WAN • Add compute resources at peak times • Need for 2-3 months, no long-term commitment • Do NOT want to rewrite applications Avere Benefits • Hot data stored on RAM & SSD within FXT cache • Bulk of data can remain on-prem or on inexpensive S3 • Caching of remote data eliminates WAN latency • Clustering provides scalable NAS performance • Hybrid model - FXT filer on-prem and/or vFXT on EC2 • Pay only for what is used
  • 16. Optimizing Data Transfer Method #1: The Highway
  • 17. What is AWS Snowball? Petabyte-scale data transport E-ink shipping label Ruggedized case “8.5G impact” All data encrypted end-to-end Rain- and dust- resistant Tamper-resistant case and electronics 80 TB 10 GE network
  • 19. Use cases: AWS Import/Export Snowball Cloud Migration Disaster Recovery Data Center Decommission Content Distribution
  • 20. How fast is Snowball? • Less than 1 day to transfer 200TB via 3x10G connections with 3 Snowballs, less than 1 week including shipping • Number of days to transfer 200TB via the Internet at typical utilizations Internet Connection Speed Utilization 1Gbps 500Mbps 300Mbps 150Mbps 25% 71 141 236 471 50% 36 71 118 236 75% 24 47 225 157
  • 21. Optimizing Data Transfer Method #2: The Internet
  • 22. Complicated Setup and Management Hard to Optimize Performance Expensive Optimizing Internet Performance Is Hard Prioprietary
  • 23. Introducing Amazon S3 transfer acceleration S3 Bucket AWS Edge Location Uploader Optimized Throughput! Typically 50%–400% faster Change your endpoint, not your code 54 global edge locations No firewall exceptions No client software required
  • 24. Rio De Janeiro Warsaw New York Atlanta Madrid Virginia Melbourne Paris Los Angeles Seattle Tokyo Singapore Time[hrs] 500 GB upload from these edge locations to a bucket in Singapore Public Internet How fast is S3 transfer acceleration? S3 Transfer Acceleration
  • 25. We have customers uploading large files from all over the world. We’ve seen performance improvements in excess of 500% in some cases. - Emery Wells, Cofounder/CTO ” “ Use case: media uploads
  • 26. Regional Lakes… AWS is available today in the U.S., Brazil, Europe, Japan, Singapore, Australia, and China. Additional regions in India, Korea, the UK, and Ohio are expected to come online over the next 12 – 18 months. Over 1 million active customers across 190 countries 2,000 government agencies 5,000 educational institutions 12 regions 32 availability zones 54 edge locations Region Edge Location
  • 27. Source (Virginia) Destination (Oregon) • Only replicates new PUTs. Once S3 is configured, all new uploads into a source bucket will be replicated • Entire bucket or prefix based • 1:1 replication between any 2 regions Use cases Compliance - store data hundreds of miles apart Lower latency - distribute data to remote customers/partners) S3 cross-region replication Automated, fast, and reliable asynchronous replication of data across AWS regions
  • 28. Amazon S3 Amazon S3 (range-gets) Direct Connect N x 1G | 10G Massively Scalable S3 Front-end EBS Instance Store c Massively Scalable Compute on AWS Cloud On-Prem Apps Consuming the Content Lake
  • 29. Media Workloads (redefined) EBS Instance Store Amazon EBS/EFS/EC2 Instance Store Process Partner/Affiliate/ Service Provider User Delivery/ConsumptionVFX/Production On-Prem Apps Archive Amazon Glacier (Life Cycle Policies) c c Direct Connect Content Access Transfer Disposable Infrastructure Auto-scaling Workload specific Amazon S3 EFS
  • 30. Q&A Learn more at: http://aws.amazon.com/s3/ http://aws.amazon.com/glacier/ http://aws.amazon.com/importexport/ eddurand@amazon.com