[AWS LA Media & Entertainment Event 2015]: Digital Media Ingest & Storage Options on AWS
Post on 07-Jan-2017
Digital Media Ingest and Storage Options on AWS
Guy Farber, Amazon Web Services
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Content has Gravity and is getting heavier …
…it’s easier to move processing to the content
4K/8K Content
Where is the problem?
More Bandwidth $$$$$
More Powerful Compute $$$$$
Way More Storage $$$$$
Some Progress (ABR, HEVC, VP10)
File Block Object
AWS Storage options for digital media
Amazon EFS
Amazon EBS
Amazon EC2 instance storage
Amazon S3
Amazon Glacier
A Concept - the Content Lake
Inspired by the Data Lake (term coined by James Dixon in 2010)
A single store of all the digital content that you create and acquire, in any form or factor
• Don’t assume any resolutions/formats (now or in the future)
• It is up to the consumer (the application consuming the content) to use the appropriate infrastructure for processing
Amazon S3 : the Content Lake
• Durable, cost-effective and fast
• Highly scalable front-end
  – Multi-part uploads (parallel writes)
  – Range-gets (parallel reads)
• No need for capacity planning or provisioning
• Use Amazon S3 with on-premises storage in a hybrid model
• Secure
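Parallel writes (multi-part uploads) and parallel reads (range-gets) both depend on splitting an object into byte ranges. The following is a minimal sketch of that planning step only; the helper names `plan_byte_ranges` and `range_header` and the 8 MiB default part size are illustrative assumptions, not taken from the slides, and no AWS call is made.

```python
def plan_byte_ranges(object_size, part_size=8 * 1024 * 1024):
    """Split an object of object_size bytes into inclusive (start, end) ranges.

    part_size is an illustrative default. For real multi-part uploads, S3
    requires every part except the last to be at least 5 MB.
    """
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < object_size:
        end = min(start + part_size, object_size) - 1  # inclusive end offset
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start, end):
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    # A hypothetical 20 MiB object split into 8 MiB ranges.
    for start, end in plan_byte_ranges(20 * 1024 * 1024):
        print(range_header(start, end))
```

Each range can then be fetched or uploaded by a separate worker, which is where the parallelism comes from.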
Hydrating the Content Lake
Amazon S3
Amazon S3 (multi-part upload)
Direct Connect
N x 1G | 10G
Massively Scalable Front-end
Introducing AWS Import/Export Snowball
Scale and Speed
• Up to 50TB Capacity per device
• 10Gbps and 1Gbps connectivity
• Parallel data transfer enables PBs transferred in a week
Secure
• Tamper-resistant enclosure
• 256-bit encryption with KMS
• Secure data erasure
Simple
• Manage entire process through AWS Console
• Lightweight data transfer client
• Notifications
What is Snowball? Petabyte-scale data transport
• E-ink shipping label
• Ruggedized case (“8.5G impact”)
• All data encrypted end-to-end
• 50 TB
• 10G network
• Rain & dust resistant
• Tamper-resistant case & electronics
Can I drop it?
• No (please don’t)
• Snowball is its own box
• Has had many drop tests already
• Can handle 8.5G impacts
• Designed for shipping
What does it cost?
• $200 / job plus shipping
• Includes 10 days to fill the device at your site
• $15/day after the tenth day on site
• Standard Amazon S3 charges apply
• $0.03/GB to transfer data out
• $0.00/GB to transfer data in
How fast is that truck full of drives?
• Less than 1 day to transfer 250TB via 5x10G connections with 5 Snowballs, less than 1 week including shipping
• Number of days to transfer 250TB via the Internet at typical utilizations

              Internet connection speed
Utilization   1Gbps   500Mbps   300Mbps   150Mbps
25%           95      190       316       632
50%           47      95        158       316
75%           32      63        105       211
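The figures in the table can be reproduced with simple arithmetic. The sketch below assumes 1 TB = 1024 GB and 1 GB = 8 gigabits; that unit convention is inferred from the table values rather than stated on the slide, and the function name `transfer_days` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(terabytes, link_gbps, utilization):
    """Days needed to move `terabytes` over a link at a given utilization.

    Assumption (inferred from the slide's table): 1 TB = 1024 GB and
    1 GB = 8 gigabits.
    """
    gigabits = terabytes * 1024 * 8
    seconds = gigabits / (link_gbps * utilization)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    speeds_gbps = (1.0, 0.5, 0.3, 0.15)
    for utilization in (0.25, 0.50, 0.75):
        row = [round(transfer_days(250, gbps, utilization)) for gbps in speeds_gbps]
        print(f"{utilization:.0%}", row)
```

The same function applied to 50TB gives the per-Snowball comparison used later in the deck.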
What does it cost?
Example 1:
• 250TB loaded on to 5 Snowballs
• 8 days at your site
• 5 * $200 = $1,000 plus shipping
Example 2:
• 30TB exported on to 1 Snowball
• 8 days at your site
• $200 + 30TB * $0.03/GB = $1,121.60 plus shipping
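Both examples follow from the price list given earlier ($200 per job, 10 days included, $15 per extra day, $0.03/GB for data out). A minimal sketch of that arithmetic follows; it assumes one job per device and 1 TB = 1024 GB (inferred from the $1,121.60 result), and it excludes shipping and standard Amazon S3 charges. The function name `snowball_cost` is hypothetical.

```python
from decimal import Decimal

JOB_FEE = Decimal("200")        # per job (assumed: one job per device)
INCLUDED_DAYS = 10              # days on site included in the job fee
EXTRA_DAY_FEE = Decimal("15")   # per day after the tenth day on site
EXPORT_PER_GB = Decimal("0.03") # data transferred out; data in is $0.00/GB
GB_PER_TB = 1024                # assumption inferred from example 2


def snowball_cost(devices, days_on_site, export_tb=0):
    """Cost in dollars, excluding shipping and standard S3 charges."""
    extra_days = max(0, days_on_site - INCLUDED_DAYS)
    per_device = JOB_FEE + extra_days * EXTRA_DAY_FEE
    export_fee = Decimal(export_tb) * GB_PER_TB * EXPORT_PER_GB
    return devices * per_device + export_fee


if __name__ == "__main__":
    print(snowball_cost(devices=5, days_on_site=8))               # example 1
    print(snowball_cost(devices=1, days_on_site=8, export_tb=30)) # example 2
```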
Edge Locations
Availability Zone
Region
Dallas (2)
St. Louis
Miami
Jacksonville
Los Angeles (2)
Seattle
Ashburn (3)
Newark
New York (3)
Dublin
London (2)
Amsterdam (2)
Stockholm
Frankfurt (2)
Paris (2)
Singapore (2)
Hong Kong (2)
Tokyo (2)
Sao Paulo
South Bend
San Jose
Palo Alto
Hayward
Osaka
Milan
Sydney
Madrid
Seoul
Mumbai
Chennai
Regional Lakes …
Source (Virginia)
Destination (Oregon)
• Only replicates new PUTs. Once S3 is configured, all new uploads into a source bucket will be replicated
• Entire bucket or prefix based
• 1:1 replication between any 2 regions
Use cases
• Compliance - store data hundreds of miles apart
• Lower latency - distribute data to remote customers/partners
S3 cross-region replication
Automated, fast, and reliable asynchronous replication of data across AWS regions
Amazon S3
Amazon S3 (range-gets)
Direct Connect
N x 1G | 10G
Massively Scalable S3 Front-end
EBS
Instance Store
Massively Scalable Compute on AWS Cloud
On-Prem Apps
Consuming the Content Lake
Object life cycle from hot to cold
S3 Standard
• Primary data
• 11 9’s of durability
• 2.75c – 3c per GB/month, $338 - $369 per TB/year
S3 – Infrequent Access
• Active archives
• Mezzanine files
• 11 9’s of durability
• 1.25c per GB/month, $154 per TB/year
• 1c per GB for retrievals
Glacier
• Deep/offline archives
• WORM-compliant data
• 11 9’s of durability
• 0.7c per GB/month, $86 per TB/year
Data tiering using Life Cycle Policies
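The per-TB/year figures quoted for each tier follow from the per-GB/month prices. A small sketch of the conversion is below; it assumes 1 TB = 1024 GB, which is inferred from the rounded slide figures rather than stated, and the dictionary labels are illustrative.

```python
GB_PER_TB = 1024       # assumption: matches the rounded figures on the slide
MONTHS_PER_YEAR = 12

# Prices in dollars per GB per month, as quoted on the slide (2015).
PRICE_PER_GB_MONTH = {
    "S3 Standard (low)": 0.0275,
    "S3 Standard (high)": 0.03,
    "S3 Infrequent Access": 0.0125,
    "Glacier": 0.007,
}


def dollars_per_tb_year(price_per_gb_month):
    """Convert a per-GB/month price to a per-TB/year price."""
    return price_per_gb_month * GB_PER_TB * MONTHS_PER_YEAR


if __name__ == "__main__":
    for tier, price in PRICE_PER_GB_MONTH.items():
        print(f"{tier}: ${round(dollars_per_tb_year(price))} per TB/year")
```

Retrieval charges (such as the 1c per GB for Infrequent Access) are not included in this conversion.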
Actual customer quote: “$0.0125?! OMG I will take all your storage!!!”
1 PB raw storage
800 TB usable storage
600 TB allocated storage
400 TB application data
S3 capacity pricing—pay only for what you use!
AWS Cloud
Storage
Securing your data on S3
• AWS alignment with the latest MPAA cloud-based application guidelines for content security – August 2015
• VPC private endpoint for Amazon S3 – enables a true private workflow capability
• Encryption & key management capabilities
• Amazon Glacier Vault for high-value media/originals
Preserve, retrieve, and restore every version of every object stored in your bucket
S3 automatically adds new versions and preserves deleted objects with delete markers
Easily control the number of versions kept by using lifecycle expiration policies
Easy to turn on in the AWS Management Console
Key = photo.gif, ID = 121212
Key = photo.gif, ID = 111111
Versioning Enabled
PUT Key = photo.gif
S3 versioning
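The versioning behavior described on this slide (each PUT adds a version, a delete adds a marker, older versions stay retrievable) can be illustrated with a toy in-memory model. This is not the S3 API: the class `VersionedBucket`, its methods, and the sequential version IDs are hypothetical and exist only to show the semantics.

```python
import itertools

_DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy in-memory model of versioning semantics; not the S3 API."""

    def __init__(self):
        self._versions = {}             # key -> list of (version_id, data)
        self._ids = itertools.count(1)  # illustrative sequential version IDs

    def _append(self, key, data):
        version_id = str(next(self._ids))
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def put(self, key, data):
        """Each PUT adds a new version instead of overwriting."""
        return self._append(key, data)

    def delete(self, key):
        """A delete adds a delete marker; earlier versions are preserved."""
        return self._append(key, _DELETE_MARKER)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # Latest version wins; a delete marker on top hides the object.
            if not versions or versions[-1][1] is _DELETE_MARKER:
                raise KeyError(key)
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id and data is not _DELETE_MARKER:
                return data
        raise KeyError((key, version_id))


if __name__ == "__main__":
    bucket = VersionedBucket()
    first = bucket.put("photo.gif", b"original")
    bucket.put("photo.gif", b"edited")
    bucket.delete("photo.gif")
    print(bucket.get("photo.gif", first))  # older version is still retrievable
```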
Amazon S3 event notifications
Delivers notifications to Amazon SNS, Amazon SQS, or AWS Lambda when events occur in Amazon S3
S3
Events
SNS topic
SQS queue
Lambda function
Notifications
Foo() {
…
}
Support for notification when objects are created via Put, Post, Copy, or Multipart Upload.
Support for notification when objects are deleted, as well as with filtering on prefixes and suffixes for all types of notifications.
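As an illustration of the Lambda target, the sketch below shows a handler that extracts the event name, bucket, and object key from an S3 notification message. The handler name, the bucket name `content-lake`, and the sample key are hypothetical; the field names follow the S3 event message structure, in which object keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Return (event name, bucket, key) for each record in an S3 notification."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys in S3 event messages are URL-encoded (spaces become "+").
        key = unquote_plus(s3["object"]["key"])
        results.append((record["eventName"], bucket, key))
    return results


if __name__ == "__main__":
    # Hypothetical, trimmed-down event for local experimentation.
    sample_event = {
        "Records": [
            {
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": "content-lake"},
                    "object": {"key": "masters/my+clip.mov"},
                },
            }
        ]
    }
    print(handler(sample_event))
```

In a real function, each tuple would be handed to whatever processing step the workflow needs; that step is omitted here because it depends on the AWS SDK and on the specific pipeline.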
Reference Architecture – Content Processing Pipeline (Using Lambda)
S3 multi-part API
S3 as backend storage for content files accessible to other processing tasks
Amazon Elastic Transcoder
S3 Notification
Trigger a Lambda function to start a transcoding job
Ingest
S3 Notification
Lambda function to generate a signed URL to share the file
Update CMS or Metadata
Elastic File System - Rendering in the Cloud
• Designed to support petabyte scale file systems
• Throughput scales linearly with storage
• Same latency spec across each AZ
• Thousands of concurrent NFS connections
• Works great for large I/O sizes
• Pay only for what you use, not what you provision
• Managed with multi-copy durability
Media Workloads (redefined)
EBS
Instance Store
Amazon EBS/EFS/EC2 Instance Store
Process
Partner/Affiliate/Service Provider
User Delivery/Consumption
VFX/Production
On-Prem Apps
Archive
Amazon Glacier (Life Cycle Policies)
Direct Connect
Content Access Transfer
Disposable Infrastructure
Auto-scaling
Workload specific
Amazon S3
EFS
How is my data transported securely?
• Strong chain of custody
• Tamper-resistant case
• Tamper-resistant electronics (TPM)
• Each Snowball is erased according to NIST 800-88 media sanitization guidelines between every job
How fast is that truck full of drives?
• Less than 1 day to transfer 50TB via a 10G connection with Snowball, less than 1 week including shipping
• Number of days to transfer 50TB via the Internet at typical utilizations

              Internet connection speed
Utilization   1Gbps   500Mbps   300Mbps   150Mbps
25%           19      38        63        126
50%           9       19        32        63
75%           6       13        21        42
What does it cost?
Example 1:
• 40TB loaded on to 1 Snowball
• 2 days at your site
• $200 plus shipping
Example 2:
• 30TB loaded on to 1 Snowball
• 12 days at your site
• $200 + 2 * $15/day = $230 plus shipping
Media Storage Services
Amazon EBS
Block storage for use with Amazon EC2
Amazon S3
Massively scalable storage & front-end
11 9’s of durability
Internet scale storage via API
Amazon Glacier
$0.01/GB/month
11 9’s of durability
Multiple copies across different DCs
Storage for archiving and backup
EC2
EBS
Amazon EFS
Shared file storage for use with Amazon EC2
EC2
EFS
Massively scalable
Storage up & down
Scalable Performance
Up to 16TB/volume
Up to 20K IOPS
SSD backed
Encryption