TRANSCRIPT
Greening The Video Transcoding Service With Low-Cost Hardware Transcoders
Peng Liu, Jongwon Yoon, Lance Johnson, Suman Banerjee
University of Wisconsin-Madison
Video Streaming Service Is Popular
Mobile Video Will Generate Three-Quarters of Mobile Data Traffic by 2020. Source: Cisco VNI Mobile, 2016
Wireless Live Video Streaming
• Challenges to stream live video to mobile devices: glitch/jitter, interruption, buffering
A TV Streaming Service On Campus
• We have a deployment based on UDP multicast
• We developed a new system based on Adaptive BitRate (ABR) streaming over HTTP
ABR Streaming Over HTTP
[Figure: Source Video, Transcoders 1 to 3, Media Server with Index File, and Video Player connected over HTTP]
• Video chunks with the same content and duration but different bitrates and qualities: 1200 kbps, 800 kbps, 600 kbps, 400 kbps
• Each chunk is 2 seconds long
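The bitrate ladder on this slide lets a player switch variants at chunk boundaries. The following is a minimal sketch of one possible selection rule, assuming the player estimates its throughput and picks the highest variant that fits within a safety margin. The function names, the 0.8 safety factor, and the fallback policy are illustrative assumptions, not details from the talk.

```python
# Bitrate ladder (kbps) and chunk duration taken from the slide.
VARIANTS_KBPS = [1200, 800, 600, 400]
CHUNK_SECONDS = 2


def pick_variant(throughput_kbps, variants=VARIANTS_KBPS, safety=0.8):
    """Return the highest bitrate that fits within safety * throughput.

    Falls back to the lowest variant when nothing fits (hypothetical policy).
    """
    budget = throughput_kbps * safety
    fitting = [v for v in variants if v <= budget]
    return max(fitting) if fitting else min(variants)


def chunk_size_kbytes(bitrate_kbps, seconds=CHUNK_SECONDS):
    """Approximate size of one chunk in kilobytes (8 kbit = 1 kB)."""
    return bitrate_kbps * seconds / 8
```

Because every variant carries the same content in chunks of the same duration, a switch only changes which file is fetched for the next chunk number.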
Challenges Of Video Transcoding
• High power consumption on GPP
• High computational complexity
• High throughput does not help for live video
• Video quality is critical
Goal
A low-cost, highly efficient transcoder cluster to provide reliable service for a live video streaming service.
Outline
• Video transcoding overview
• Video decoder and encoder selection
• VideoCoreCluster architecture
• Implementation
• Evaluation
• Deployment and summary
Video Transcoder Implementations
[Figure: two options compared, with a ✔ marking the selected one]
• Combination of decoder and encoder (H.264 Decoder + H.264 Encoder): flexible
• Specialized H.264 to H.264 transcoder (Entropy Decoding, Motion Prediction, Motion Compensation, Transform, Motion Estimation, Entropy Encoding, …): efficient
Available Video Decoder/Encoder Implementations
Implementation | Advantages | Disadvantages
Software on GPP | Flexible, good video quality | High power consumption
GPU-based | Medium power consumption | Expensive, inflexible
FPGA-based | Medium power consumption, flexible | Expensive
Hardware (ASIC or hardware IP in SoCs) | Low power consumption | Inflexible, medium video quality
Decoder & Encoder In Our System
• The GPU (VideoCore IV) of the BCM2835 has a high-performance hardware H.264 video encoder and decoder.
BCM2835 Raspberry Pi
VideoCoreCluster Architecture
[Figure: components are the Cluster Manager, Dashboard, MQTT Broker, multiple Transcoders, Media Server, and Player; links are labeled MQTT, RTMP, and HTTP]
Media Server
• Receives RTMP pushes
• Cuts the video stream into chunks
• Generates the index files on the fly
• Supports both HLS and MPEG-DASH
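To make "generates the index files on the fly" concrete, here is a minimal sketch of what a live HLS media playlist for a sliding window of 2-second chunks could look like. The tags are standard HLS; the function name, the chunk file naming, and the window size are assumptions for illustration and do not describe the media server used in the talk.

```python
def media_playlist(first_seq, num_chunks, chunk_seconds=2, prefix="chunk"):
    """Build a live HLS media playlist covering chunks first_seq .. first_seq + num_chunks - 1.

    No #EXT-X-ENDLIST tag is written, so players treat the stream as live
    and keep re-fetching the playlist.
    """
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{chunk_seconds}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_seq}",
    ]
    for seq in range(first_seq, first_seq + num_chunks):
        lines.append(f"#EXTINF:{chunk_seconds:.3f},")
        lines.append(f"{prefix}_{seq}.ts")  # hypothetical file naming
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Playlist after chunks 10, 11, and 12 have been cut from the stream.
    print(media_playlist(first_seq=10, num_chunks=3))
```

Each time a new chunk is cut, the server would regenerate the playlist with the window advanced by one, which is why the index has to be produced on the fly for live video.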
Cluster Manager
[Figure: Scheduler, triggered by events, with Assign Job and Reclaim Job operations]
• Task lists with different priorities (0, 1, 2), each holding tasks such as task0, task1, task2, …
• Transcoder list: transcoder1, transcoder2, transcoder3
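The scheduler slide names three ingredients: prioritized task lists, a transcoder list, and assign/reclaim operations triggered by events. The sketch below shows one way these could fit together. It assumes that a lower priority number is served first, that each transcoder has a fixed task capacity, and that the least-loaded transcoder is chosen; none of these policies are stated in the talk, and the class and method names are hypothetical.

```python
from collections import deque


class Scheduler:
    """Toy event-driven scheduler with prioritized task lists (assumed policies)."""

    def __init__(self, num_priorities=3, capacity=3):
        self.pending = [deque() for _ in range(num_priorities)]
        self.capacity = capacity  # max tasks per transcoder (hypothetical limit)
        self.assigned = {}        # transcoder name -> list of (priority, task)

    def add_transcoder(self, name):
        """Event: a transcoder joined the cluster."""
        self.assigned.setdefault(name, [])
        self._assign()

    def submit(self, task, priority):
        """Event: a new transcoding task arrived."""
        self.pending[priority].append(task)
        self._assign()

    def reclaim(self, name):
        """Event: a transcoder failed or left; requeue its tasks at the front."""
        for priority, task in reversed(self.assigned.pop(name, [])):
            self.pending[priority].appendleft(task)
        self._assign()

    def _assign(self):
        """Hand pending tasks to the least-loaded transcoder, highest priority first."""
        for priority, queue in enumerate(self.pending):
            while queue:
                free = [t for t, jobs in self.assigned.items()
                        if len(jobs) < self.capacity]
                if not free:
                    return
                target = min(free, key=lambda t: len(self.assigned[t]))
                self.assigned[target].append((priority, queue.popleft()))
```

Running every operation through the same `_assign` step keeps the logic event-driven: nothing is polled, and a reclaimed task competes for capacity under the same priority rules as a new one.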
Transcoder Software Overview
[Figure: a Cluster Agent (MQTT) and Transcoding Workers 1, 2, and 3 (one RTMP link each)]
Transcoding Worker Implementation
[Figure: software and hardware stack]
• Transcoding Worker
• GStreamer Framework: Demux plugins, gst-omx plugins, Mux plugins, RTMP plugins
• OpenMAX IL
• Linux Kernel and Driver
• Hardware: VideoCore IV GPU, ARM, Ethernet Controller, . . .
Transcoder Synchronization
[Figure: Transcoder A and Transcoder B transcode Channel 1 (1.2 Mbps) into 800 kbps and 600 kbps streams; a chunk spans the frames between two IDR frames]
• Chunk numbers are generated from frames' timestamps
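Deriving chunk numbers from frame timestamps means independent transcoders agree on numbering without talking to each other. A minimal sketch of the idea follows, assuming the chunk number is the timestamp divided by the chunk duration; the exact rule used in the system is not given on the slide, and the naming pattern is hypothetical.

```python
CHUNK_MS = 2000  # 2-second chunks


def chunk_number(timestamp_ms, chunk_ms=CHUNK_MS):
    """Map a frame timestamp to a chunk number (assumed rule: integer division)."""
    return timestamp_ms // chunk_ms


def chunk_names(stream, idr_timestamps_ms):
    """Name the chunk starting at each IDR frame; the name pattern is hypothetical."""
    return [f"{stream}_{chunk_number(ts)}" for ts in idr_timestamps_ms]


# Two transcoders receive the same source timestamps, so their numbering agrees
# even though they run on separate devices.
idr_ts = [0, 2000, 4000, 6000]
a = chunk_names("ABC_800kbps", idr_ts)
b = chunk_names("ABC_600kbps", idr_ts)
```

With this scheme a player can move from chunk N of one bitrate to chunk N+1 of another and still see continuous content.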
Transcoder Failure Handling
[Figure: chunk numbers of ABC_1200kbps, ABC_800kbps, ABC_600kbps, and ABC_400kbps run …, 994, 995, 996, then restart at 0, 1, …]
• Task migration
• Reset RTMP connections to reset timestamps
Evaluation Setup
• Transcoding worker on Raspberry Pi Model B
vs.
• FFmpeg H.264 decoder + x264 H.264 encoder on an Intel i5 processor with all the CPU capabilities (MMX, SSE2, AVX, etc.) enabled
Video Quality Evaluation
[Figure: PSNR vs Bitrate; x-axis: Bitrate (kbps), 600 to 2400; y-axis: PSNR (dB), 30 to 42]
[Figure: SSIM vs Bitrate; x-axis: Bitrate (kbps), 600 to 2400; y-axis: SSIM, 0.8 to 0.98]
Series in both plots: VideoCore IV, x264-ultrafast, x264-superfast, x264-veryfast, x264-medium, x264-veryslow
Source: An HD channel (1280x720, 30fps) with bitrate 4Mbps
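For readers unfamiliar with the PSNR metric used in this evaluation, the standard definition is 10·log10(MAX² / MSE), where MSE is the mean squared error between reference and transcoded samples. The sketch below applies it to flat lists of 8-bit samples; it only illustrates the formula and is not the measurement tooling used for the slides.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)
```

Higher values mean the transcoded frame is closer to the source, which is why the curves rise as the bitrate increases.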
Transcoding Speed And Efficiency Evaluation
Transcoding Speed: frame rate (fps)
Transcoder | SD | HD
VideoCore IV | 120 | 50
x264-superfast | 690 | 210
x264-medium | 229 | 67
Video Sources: SD: 720x480, 30fps, 1.2Mbps; HD: 1280x720, 30fps, 4Mbps
Transcoding Efficiency Estimation (SD)
Transcoder | Speed (fps) | Power Consumption (W) | Efficiency
VideoCore IV | 120 | 2.5 | 7
x264-superfast | 690 | 100 | 1
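The efficiency figures can be reproduced from the speed and power numbers, assuming "efficiency" means frames per second per watt normalized to x264-superfast. That reading is an assumption, but it matches the 7 to 1 ratio reported for SD video.

```python
# SD measurements from the efficiency table.
measurements = {
    "VideoCore IV": {"fps": 120, "watts": 2.5},
    "x264-superfast": {"fps": 690, "watts": 100},
}


def fps_per_watt(m):
    """Transcoding speed delivered per watt of power drawn."""
    return m["fps"] / m["watts"]


def normalized_efficiency(data, baseline):
    """Efficiency of each transcoder relative to the chosen baseline."""
    base = fps_per_watt(data[baseline])
    return {name: fps_per_watt(m) / base for name, m in data.items()}


efficiency = normalized_efficiency(measurements, baseline="x264-superfast")
```

VideoCore IV delivers 120 / 2.5 = 48 fps per watt against 690 / 100 = 6.9 fps per watt for x264-superfast, a ratio of about 6.96, which rounds to the 7 shown in the table.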
Deployment
• We are deploying VideoCoreCluster in a hybrid and incremental way
• Currently 27 channels are being transcoded
• More than 4000 sessions and about 480 total watching hours in a month (April 2016)
Summary
• A low-cost, highly efficient video transcoder cluster (VideoCoreCluster) for a live video streaming service
Thanks! Q&A