Automated Media Workflows in the Cloud (MED304) | AWS re:Invent 2013
DESCRIPTION
Ingesting, storing, processing, and delivering a large library of content involves massive complexity. This session walks through sample code that leverages AWS services to perform all of these tasks while coordinating the activities with Amazon Simple Workflow Service (SWF). Along the way you are introduced to best practices for cost optimization, monitoring, reporting, and exception and error handling. In addition to the sample workflow, a guest speaker from Netflix takes the audience on a deep dive into their “digital supply chain,” where you learn how they have automated the movement of data all the way from the studios to the last mile. Services covered include Amazon SWF, Amazon Simple Storage Service (S3), Amazon Glacier, Amazon Elastic Compute Cloud (EC2), Amazon Elastic Transcoder, Amazon Mechanical Turk, and Amazon CloudFront.
TRANSCRIPT
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
MED304 – Automated Media Workflows in the Cloud
John Mancuso, Amazon Web Services
November 14, 2013
Agenda
• Why automate
• Workflow steps
• Automating the workflow
• Demo of an end-to-end media workflow
• How Netflix approaches their digital supply chain
Why Automate?
[Chart: format evolution – Analog, VCD, DVD, 720p, 1080p (3D), 2K, 4K – plotted against growing file sizes and user counts]
Scenario
• At any given time, company X produces 10 broadcast-quality shows
• Each show comprises 200 thirty-minute episodes per year
• High-res post-production copies of each show are temporarily stored at company X’s studio in Tokyo
• The content must be made available for distribution to consumers via web, mobile devices, and media players
• The high-res content must be archived for future access
Media Workflow
Ingest → Processing → Discovery & Delivery
Amazon Simple Workflow Service (SWF)
Amazon Storage Services: Amazon S3 (Standard & RRS), Amazon Glacier
Ingest
[Diagram: studio content flowing into Amazon S3 – US East]
Ingest – Data Transfer
[Diagram: the AWS Command Line Interface (CLI) uploading to Amazon S3 using parallel multipart uploads (server side)]
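The AWS CLI parallelizes multipart uploads automatically. For reference, here is a minimal sketch of the same technique with boto (the library referenced later in the demo); the bucket name, file path, part size, and thread count are illustrative assumptions.

import math
import os
from multiprocessing.pool import ThreadPool

import boto

BUCKET = 'my-highres-bucket'    # assumed placeholder
SOURCE = '/tmp/episode001.mxf'  # assumed placeholder
PART_SIZE = 50 * 1024 * 1024    # 50 MB per part

conn = boto.connect_s3()
bucket = conn.get_bucket(BUCKET)
mp = bucket.initiate_multipart_upload(os.path.basename(SOURCE))

def upload_part(part_num):
    # Each thread uses its own connection; boto connections are not thread safe
    b = boto.connect_s3().get_bucket(BUCKET)
    # Re-attach to the in-progress upload by its id
    up = [u for u in b.get_all_multipart_uploads() if u.id == mp.id][0]
    offset = (part_num - 1) * PART_SIZE
    nbytes = min(PART_SIZE, os.path.getsize(SOURCE) - offset)
    with open(SOURCE, 'rb') as fp:
        fp.seek(offset)
        up.upload_part_from_file(fp, part_num, size=nbytes)

num_parts = int(math.ceil(os.path.getsize(SOURCE) / float(PART_SIZE)))
ThreadPool(processes=8).map(upload_part, range(1, num_parts + 1))
mp.complete_upload()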
Ingest – Data Transfer
[Diagram: Tsunami UDP transfer from the source to Amazon EC2, then upload to Amazon S3]
Ingest – Timing Comparison: 885 MB Video File

Method                           Time                    Improvement
Single thread to S3              13 minutes 25 seconds   --
Multiple threads to S3           1 minute                93% reduction
Tsunami UDP + multiple threads   15 s + 7 s = 22 s       63% further reduction

Instance size: cc2.8xlarge; OS: Amazon Linux
Ingest – Code Snippet

import os

def doWork_INGEST(remoteIP, remoteFileName, s3Key_HighRes):
    # Transfer from the studio server using Tsunami UDP
    cmd_s = '/usr/local/bin/tsunami connect {} set rate 500m get {} quit'
    cmd_s = cmd_s.format(remoteIP, remoteFileName)
    execCMD(cmd_s)
    # Upload to S3 using the AWS CLI
    s3Path = 's3://{}/{}'
    s3Path = s3Path.format(s3Bucket_HighRes, s3Key_HighRes)
    cmd_s = 'aws s3 cp {} {} --region us-east-1'
    cmd_s = cmd_s.format(remoteFileName, s3Path)
    execCMD(cmd_s)
    # Delete the local copy fetched by Tsunami UDP
    os.remove(remoteFileName)
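The helper execCMD is not shown in the deck. A minimal sketch, assuming it should raise on a non-zero exit status so the surrounding SWF activity can report failure:

import shlex
import subprocess

def execCMD(cmd_s):
    # Run the command and raise CalledProcessError on a non-zero exit,
    # letting the surrounding activity report failure back to SWF
    subprocess.check_call(shlex.split(cmd_s))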
Processing
• Transcoding
• Thumbnail selection
• Archiving of high-res videos
Processing – Transcoding
[Diagram: Amazon Elastic Transcoder reads the high-res file from Amazon S3 and writes outputs to Amazon S3 (RRS)]
Transcoding – Code Snippet

def doWork_PROCESS_TRANSCODE(s3Key_HighRes, s3PreFix_TranscodeRoot):
    etc = ElasticTranscoderConnection()
    job_input_name = {"Key": s3Key_HighRes, "FrameRate": "auto",
                      "Resolution": "auto", "AspectRatio": "auto",
                      "Interlaced": "auto", "Container": "auto"}
    job_outputs = [
        {"Key": "MP4.mp4", "ThumbnailPattern": "MP4{count}",
         "Rotate": "auto", "PresetId": ET_PresetId_MP4},
        {"Key": "HLS", "ThumbnailPattern": "HLS{count}",
         "Rotate": "auto", "PresetId": ET_PresetId_HLS}]
    job = etc.create_job(pipeline_id=ET_Pipeline_ID,
                         input_name=job_input_name,
                         outputs=job_outputs,
                         output_key_prefix=s3PreFix_TranscodeRoot)
    jid = job['Job']['Id']
    # Ideally you would leverage the SNS capabilities of Elastic Transcoder
    # to signal SWF on completion instead of polling
    waitForCompletion(etc, jid)
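waitForCompletion is likewise left out of the deck. As the comment says, an SNS notification is the better pattern; as a stopgap, a simple polling sketch against the Elastic Transcoder read_job call could look like this (the poll interval is an arbitrary choice):

import time

def waitForCompletion(etc, jid, poll_seconds=15):
    # Poll the transcode job until Elastic Transcoder reports a terminal state
    while True:
        status = etc.read_job(jid)['Job']['Status']
        if status == 'Complete':
            return
        if status in ('Canceled', 'Error'):
            raise RuntimeError('Job {} ended with status {}'.format(jid, status))
        time.sleep(poll_seconds)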
Processing – Thumbnail Selection
[Diagram: candidate thumbnails in Amazon S3 (RRS), Amazon DynamoDB, and Amazon Mechanical Turk]
Thumbnail Selection – Code Snippet

def getRequest(s3WebPath_Thumbnails):
    request_params = {"Title": "Thumbnail Selection",
                      "Description": "Please choose a thumbnail",
                      "MaxAssignments": "1",
                      "HITLayoutId": MTurk_HITLAYOUTID,
                      "Reward": {"Amount": "0.10", "CurrencyCode": "USD"},
                      "LifetimeInSeconds": "300",
                      "AssignmentDurationInSeconds": "300",
                      "HITLayoutParameter": [
                          {"Name": "image1", "Value": s3WebPath_Thumbnails + "MP400001.png"},
                          # ... image2 through image9 elided ...
                          {"Name": "image10", "Value": s3WebPath_Thumbnails + "MP400010.png"},
                      ]
                      }
    return request_params
Thumbnail Selection – Code Snippet

def doWork_PROCESS_THUMBNAIL(s3PreFix_Thumbnails):
    m = mturkcore.MechanicalTurk()
    mtc = MTurkConnection()
    s3WebPath_Thumbnails = 'http://{}.s3-website-us-east-1.amazonaws.com/{}'
    s3WebPath_Thumbnails = s3WebPath_Thumbnails.format(s3Bucket_Thumbs, s3PreFix_Thumbnails)
    request_params = getRequest(s3WebPath_Thumbnails)
    hit = m.create_request("CreateHIT", request_params)
    hid = hit['CreateHITResponse']['HIT']['HITId']
    # Wait for an answer
    answer = getAnswer(mtc, hid)
    # Get the image name from the answer (e.g. 'image7' -> '00007')
    answer = answer[5:]
    answer = answer.zfill(5)
    imagekey = '{}MP4{}.png'
    imagekey = imagekey.format(s3WebPath_Thumbnails, answer)
    return imagekey
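getAnswer is not defined on the slide either. A rough sketch with boto's MTurkConnection, polling until the assignment arrives and returning the worker's free-text answer (assumed here to be a value like 'image7', matching the answer[5:] parsing above):

import time

def getAnswer(mtc, hid, poll_seconds=30):
    # Poll the HIT until the single assignment is submitted, then return
    # the first free-text field of the first answer (e.g. 'image7')
    while True:
        assignments = mtc.get_assignments(hid)
        if assignments:
            return assignments[0].answers[0][0].fields[0]
        time.sleep(poll_seconds)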
Processing – Archiving of High-res Videos
[Diagram: high-res files in Amazon S3 transition to Amazon Glacier]
Archiving – Code Snippet

def doWork_PROCESS_ARCHIVE(s3Key_HighRes):
    # Move the high-res video to a path in S3 configured to archive
    # to Amazon Glacier with a lifecycle policy
    s3PathA = 's3://{}/{}'
    s3PathA = s3PathA.format(s3Bucket_HighRes, s3Key_HighRes)
    s3PathB = 's3://{}/toArchive/{}'
    s3PathB = s3PathB.format(s3Bucket_HighRes, s3Key_HighRes)
    cmd_s = 'aws s3 mv {} {} --region us-east-1'
    cmd_s = cmd_s.format(s3PathA, s3PathB)
    execCMD(cmd_s)
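The toArchive/ prefix only works because a lifecycle policy is attached to it. A one-time setup sketch with boto (the rule id, bucket name, and 0-day transition are assumptions):

import boto
from boto.s3.lifecycle import Lifecycle, Rule, Transition

s3Bucket_HighRes = 'my-highres-bucket'  # assumed placeholder

conn = boto.connect_s3()
bucket = conn.get_bucket(s3Bucket_HighRes)

# Transition anything under toArchive/ to Amazon Glacier immediately
lifecycle = Lifecycle()
lifecycle.append(Rule('archive-highres', 'toArchive/', 'Enabled',
                      transition=Transition(days=0, storage_class='GLACIER')))
bucket.configure_lifecycle(lifecycle)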
Discovery & Delivery
[Diagram: a CMS running on Amazon EC2 serves content from Amazon S3 (RRS) through Amazon CloudFront]
Automating the Workflow
Amazon Simple Workflow (SWF)
• SWF maintains distributed application state
  – Tracks workflow executions
  – Dispatches tasks (activities & deciders)
  – Retains history
  – Provides visibility
• Activity tasks do the “work” associated with a workflow step
• Decider tasks determine which activity task should come next
• Activities & deciders can run anywhere (on premises or in the cloud)
Decider Logic
1. Start: Task = GetDecisionTask.
2. Task exists? If no, poll again.
3. Filter the event list to ['ActivityTaskCompleted', 'WorkflowExecutionStarted'].
4. All activities completed? If yes, signal completion of the execution.
5. Otherwise, NextActivity = ACTIVITIES[len(EventList)].
6. Is it the first activity? If yes, NextActivity.Input = the execution input; if no, NextActivity.Input = PreviousActivity.Result.
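A rough decider loop implementing the flow above with boto's SWF Layer1 client. The domain, task list names, and ACTIVITIES table are assumed stand-ins for the demo's configuration, and the history is assumed to fit on one page (the slide indexes ACTIVITIES by len(EventList); here the started event is subtracted out explicitly):

import boto.swf.layer1 as swf

DOMAIN = 'MediaWorkflow'            # assumed
DECIDER_TASK_LIST = 'MediaDecider'  # assumed
ACTIVITY_TASK_LIST = 'MediaWorker'  # assumed
ACTIVITIES = [{'name': 'Ingest', 'version': '1.0'},
              {'name': 'Transcode', 'version': '1.0'}]  # abbreviated

swf_l1 = swf.Layer1()
while True:
    task = swf_l1.poll_for_decision_task(DOMAIN, DECIDER_TASK_LIST)
    if 'taskToken' not in task:
        continue  # the long poll timed out; poll again
    # Keep only the two event types the flowchart cares about
    events = [e for e in task['events']
              if e['eventType'] in ('ActivityTaskCompleted',
                                    'WorkflowExecutionStarted')]
    done = len(events) - 1  # subtract the WorkflowExecutionStarted event
    if done == len(ACTIVITIES):
        # All activities completed: signal completion of the execution
        decision = {'decisionType': 'CompleteWorkflowExecution'}
    else:
        if done == 0:
            # First activity: pass the workflow execution's input along
            nxt_input = events[0][
                'workflowExecutionStartedEventAttributes'].get('input', '')
        else:
            # Otherwise pass the previous activity's result
            nxt_input = events[-1][
                'activityTaskCompletedEventAttributes'].get('result', '')
        decision = {'decisionType': 'ScheduleActivityTask',
                    'scheduleActivityTaskDecisionAttributes': {
                        'activityId': 'activity-{}'.format(done),
                        'activityType': ACTIVITIES[done],
                        'input': nxt_input,
                        'taskList': {'name': ACTIVITY_TASK_LIST}}}
    swf_l1.respond_decision_task_completed(task['taskToken'], [decision])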
Activity Worker – Code Snippet

import json
import boto.swf.layer1 as swf
from mwf_Ingest import *

swf_l1 = swf.Layer1()
while True:
    task = swf_l1.poll_for_activity_task(domain['name'], workflow_type['task_list'])
    if 'taskToken' in task:
        task_token = task['taskToken']
        task_input = json.loads(task['input'])
        try:
            if task['activityType']['name'] == activities[0]['name']:
                remoteIP = task_input['remoteIP']
                remoteFileName = task_input['remoteFileName']
                # Random S3 key, keeping the original file extension
                s3Key_HighRes = get_rand() + remoteFileName[remoteFileName.rindex('.'):]
                doWork_INGEST(remoteIP, remoteFileName, s3Key_HighRes)
                dataToPass = {'s3Key_HighRes': s3Key_HighRes}
                task_status_s = json.dumps(dataToPass)
                out = swf_l1.respond_activity_task_completed(task_token, task_status_s)
        except Exception:
            out = swf_l1.respond_activity_task_failed(task_token, '', '')
Workflow Steps
• Start workflow execution (see the sketch after this list)
• Ingest (transfer the file to Amazon EC2 using Tsunami UDP & upload to Amazon S3)
• Transcode file (multiple output formats)
• Select thumbnail
• Archive high-res file
• Signal completion of execution
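For the first step, a minimal sketch of starting an execution with boto; the domain, workflow type, and input fields are assumptions mirroring what the ingest activity above expects:

import json
import uuid
import boto.swf.layer1 as swf

swf_l1 = swf.Layer1()
# Domain, workflow type, and input fields are assumed to match the demo
run = swf_l1.start_workflow_execution(
    'MediaWorkflow',                   # domain (assumed)
    'exec-{}'.format(uuid.uuid4()),    # unique workflow id
    'MediaWorkflowType', '1.0',        # workflow type name / version (assumed)
    task_list='MediaDecider',
    input=json.dumps({'remoteIP': '203.0.113.10',
                      'remoteFileName': 'episode001.mxf'}))
print run['runId']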
Scalability & Fault Tolerance Analysis

Step                                    Is Scalable?   Is Fault Tolerant?
Ingest
Transcode
Archive to Amazon Glacier
Amazon Mechanical Turk for thumbnails
Delivery with Amazon CloudFront
Automation elements
Demo
External references: MTurkCore, Boto
Netflix’s Transcoding Transformation
Tony Koinov, Director of Engineering, Netflix
Netflix Media in AWS
• Matrix: The Netflix media pipeline
• MAPLE: New generation media pipeline
• Concluding thoughts
Netflix Media Pipeline
[Diagram: FTP ingest, media processing on EC2, storage in S3, delivery via Open Connect]
Driving to Hollywood Game

Rules of the Game

Buy                     Lease
200 MPH!                85 MPH
Purchase only           Lease, cancel anytime
Quantities limited      Unlimited quantity
It breaks, you fix it   It breaks, replace it, no charge
Pay for parking         No parking, just walk away
Obsolete in 1 year      Brand new each year
Industry Heritage: Optimize for Latency
• Interactive editing
  – Master creation
  – DVD/Blu-ray authoring
  – Edits for television
Netflix 2008
• Custom data center
• Custom GPU encoders
• Fixed size
• New format needed – PC, Mac, Xbox
• Content library doubled
• Frequent HW failures
• Fail! Catalog incomplete
Fall 2009 – Launch Netflix PS3 Player
• First 100% AWS transcode
• New format, unique to Netflix PS3 player
• Encode recipe nailed down late
• 3 weeks to transcode the entire catalog
Netflix 2009 to Present
• US East AWS
• Variable-sized EC2 farm
• S3 for storage
• Optimized for throughput, not latency
• No more missed deadlines – devices, catalogs, countries
Spring 2010 – Launch Netflix iPad Player
• Launch April 10th
• Apple approached us in mid-February
• Grew EC2 farm to 4,000 instances
• Entire library transcoded in 2 weeks
• New format ready for launch
Netflix Media Pipeline
[Diagram repeated: FTP ingest, media processing on EC2, storage in S3, delivery via Open Connect]
For Netflix, Throughput Trumps Latency
• Think horizontal, not vertical
• Priuses move more people than Ferraris
• Frequent re-encodes of growing libraries
• Netflix is nimble because of AWS
More Proof That Horizontal Wins
• New countries, new content
• Codec innovation
AWS Handles Netflix Scale
• 6 regional catalogs
• 4 formats supported today – 1 VC-1, 3 H.264
  – Multiple bit rates per format
• 10s of 1000s of hours of content
• Petabytes of S3 storage
Netflix Media in AWS
• Matrix: The Netflix media pipeline
• MAPLE: New generation media pipeline
• Concluding thoughts
New Generation: Address Faults and Latency
• More than 1 week for a 4K transcode
• 2–3 days for an HD transcode
• Fault intolerant
• Maintenance is challenging
• Often too slow
  – Day after broadcast
  – Redelivery of damaged content
[Diagram: a C1 Medium EC2 instance working against S3, annotated with ~700 Mbps and 10–16 Mbps data rates]
MAPLE: Massively Parallel Encoding
• 5-minute chunks – close to real time
• Fault tolerant
• Easy maintenance
• Addresses low-latency use cases
  – Day after broadcast
  – Redelivery of damaged content
[Diagram: S3 and EC2]
Netflix Media in AWS
• Matrix: The Netflix media pipeline
• MAPLE: New generation media pipeline
• Concluding thoughts
We Would Do It All Over Again
• Don’t be fooled by IT cost comparisons
  – We don’t administer the gear
• 6,000 EC2 instances
• Petabytes of storage
• High network traffic
  – Storage is durable
  – It is a moving target
• You cannot put a price on nimble