EE5359:MULTIMEDIA PROCESSING
SWETHAA ALLIYALAMANGALAM JAYARAMAN
1001053849
THE UNIVERSITY OF TEXAS AT ARLINGTON
Interim Report on:
COMPARISON AND ANALYSIS OF INTRA PREDICTION EFFICIENCY IN HEVC, H.264, VP9 and AVS China PART 2
UNDER THE GUIDANCE OF DR. K.R.RAO
ELECTRICAL ENGINEERING DEPARTMENT,
THE UNIVERSITY OF TEXAS AT ARLINGTON
Acronyms
AVC      Advanced Video Coding
AVS      Audio Video Standard
ADST     Asymmetric Discrete Sine Transform
AU       Access Unit
BBC      British Broadcasting Corporation
BD-BR    Bjøntegaard-Delta Bit-Rate
BD-PSNR  Bjøntegaard-Delta Peak Signal-to-Noise Ratio
CABAC    Context-Adaptive Binary Arithmetic Coding
CTU      Coding Tree Unit
CU       Coding Unit
DBF      De-Blocking Filter
DC       Direct Current
DCT      Discrete Cosine Transform
DFT      Discrete Fourier Transform
DST      Discrete Sine Transform
HD       High Definition
HDTV     High Definition Television
HEVC     High Efficiency Video Coding
ISO      International Organization for Standardization
ITU-T    International Telecommunication Union (Telecommunication Standardization Sector)
JPEG     Joint Photographic Experts Group
JVT      Joint Video Team
MB       Macroblock
MPEG     Moving Picture Experts Group
MSE      Mean Square Error
NAL      Network Abstraction Layer
NGOV     Next Generation Open Video
OBMC     Overlapped Block-based Motion Compensation
PSNR     Peak Signal-to-Noise Ratio
PU       Prediction Unit
QCIF     Quarter Common Intermediate Format
RD       Rate Distortion
RDO      Rate Distortion Optimization
SAO      Sample Adaptive Offset
SDTI     Serial Data Transport Interface
SMPTE    Society of Motion Picture and Television Engineers
SSIM     Structural Similarity Index
TM       True Motion
TU       Transform Unit
UVLC     Universal Variable Length Code
VC       Video Coding
VLC      Variable Length Coding
Objective
This project aims to compare video coding standards such as HEVC (High Efficiency Video Coding), H.264/AVC, VP9, DIRAC and AVS (Audio Video Standard) China Part 2 on the basis of intra-prediction efficiency.
The comparison will be carried out using performance metrics such as PSNR [5], SSIM [45], MSE [5], BD-PSNR [4], BD-BR [4], and computational complexity.
The tests will be carried out using the HM Test Model 16.3 [8], JM Software 18.6 [9], the WebM Project's encoder [7], DIRAC software [10] and the AVS China reference software [34] for HEVC, H.264/AVC, VP9, DIRAC and AVS Part 2 respectively.
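As a rough illustration of two of the metrics named above, the following Python sketch computes MSE and PSNR between an original and a decoded block. The function names, the 8-bit peak value of 255 and the tiny 2x2 sample blocks are assumptions made for this example; they are not taken from the reference encoders.

```python
import math

def mse(ref, test):
    """Mean square error between two equally sized 2-D sample arrays."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count

def psnr(ref, test, peak=255):
    """PSNR in dB. peak=255 assumes 8-bit samples (an assumption here)."""
    err = mse(ref, test)
    if err == 0:
        return math.inf  # identical blocks: no distortion
    return 10 * math.log10(peak ** 2 / err)

# Hypothetical 2x2 luma blocks, used only to exercise the functions.
original = [[52, 55], [61, 59]]
decoded = [[54, 55], [60, 58]]

print(mse(original, decoded))   # 1.5
print(psnr(original, decoded))  # PSNR of the decoded block in dB
```

In practice the reference encoders report PSNR per frame and per colour component; the sketch only shows the underlying formula.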
Basics (1)
Pixel: An abbreviation for picture element.
Image: An image is an array, or a matrix, of square pixels arranged in columns and rows [29].
Video: A sequence of images processed electronically into an analog or digital format and displayed on a screen with sufficient rapidity to create the illusion of motion and continuity.
Multimedia Signal: The integration of several media sources, such as video, audio, graphics, animation and text, in a meaningful way to convey some information [30].
Figure 1: Representation of an Image
Video Compression: The process of reducing the amount of data needed to represent video by removing redundant data.
Video Decompression: The inverse process of video compression.
Basics (2)
Inter-pixel Redundancy: The correlation between the pixels of an image frame, or between the pixels of a group of successive image or video frames [31].
Spatial Redundancy (Intra-Frame Redundancy): The statistical correlation between pixels within an image frame [31].
Temporal Redundancy (Inter-Frame Redundancy): The statistical correlation between pixels from successive frames in a temporal image or video sequence [31].
Figure 2: Classification of Types of Redundancies
Coding Tools: The tools that help in the process of video compression by removing the redundancies.
Coding Efficiency: The ability to minimize the bit rate necessary to represent video content at a given level of video quality or, as alternatively formulated, to maximize the video quality achievable within a given available bit rate.
Basics (3)
Macroblocks: The input video frame is initially partitioned into blocks of the same size called macroblocks [22].
The compression and decoding process works within each macroblock.
A macroblock is sub-partitioned into smaller blocks to perform prediction.
The aim of the prediction process is to reduce data redundancy and therefore not store excessive information in the coded bit stream [22].
There are two basic types of
prediction: intra and inter.
Intra-prediction works within a
current video frame and is based
upon the compressed and
decoded data available for the
block being predicted.
Inter-prediction is used for motion
compensation: a similar region on
previously coded frames close to
the current block is used for
prediction.
What is Intra – Prediction?
Intra-prediction is carried out within the current video frame and predicts the current block from the already encoded and decoded data available in that frame [14].
Intra-prediction plays a key role in determining the compression efficiency of the whole codec [11].
It was first proposed in 1952 and was later applied in the transform domain in standards such as H.261 and H.263 [12].
Table 1: Intra-Prediction among various video coding standards at a glance
VIDEO CODING STANDARD    BLOCK SIZE                    NUMBER OF PREDICTION MODES
AVS PART 2               8x8 block                     5 (0-4)
VP9                      Super blocks up to 64x64      10
H.264/AVC                4x4 and 16x16                 9 or 4
HEVC                     16x16, 32x32 or 64x64 CTU     35 (0-34)
AVS (Audio Video Standard) PART 2:
The AVS Video Coding standard was developed by the China Audio Video Coding Standard (AVS) working group.
AVS Part 2 focuses on high-definition digital video broadcasting and high-density storage media. It is also known as AVS1-P2 in AVS [18].
It is also called the AVS Jizhun Profile.
The spatial intra prediction is based on 8x8 blocks.
It uses decoded information in the current frame as the reference of prediction, exploiting statistical spatial dependencies between pixels within a picture.
5 luminance intra-prediction modes.
4 chrominance intra prediction modes.
Mode 0: Horizontal
Mode 1: Vertical
Mode 2: DC Mode: each pixel of current
block is predicted by an average of the
vertically and horizontally corresponding
reference pixels
Mode 3: Diagonal down left
Mode 4: Diagonal down right
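The horizontal, vertical and DC modes listed above can be sketched in Python as follows. This is a simplified illustration that follows the slide's wording for the DC mode (a per-pixel average of the reference samples above and to the left); it leaves out any reference-sample filtering and the two diagonal modes, and the function and argument names are assumptions made for this example.

```python
def predict_block(mode, top, left):
    """Return an n x n predicted block as a list of rows.

    top[x]  : reconstructed sample directly above column x
    left[y] : reconstructed sample directly left of row y
    """
    n = len(top)
    if mode == "vertical":
        # Every row repeats the reference row above the block.
        return [[top[x] for x in range(n)] for _ in range(n)]
    if mode == "horizontal":
        # Every column repeats the reference column left of the block.
        return [[left[y]] * n for y in range(n)]
    if mode == "dc":
        # Rounded average of the vertically and horizontally
        # corresponding reference samples, as described in the text.
        return [[(top[x] + left[y] + 1) >> 1 for x in range(n)]
                for y in range(n)]
    raise ValueError("mode not covered by this sketch: " + mode)

# Hypothetical reference samples for one 8x8 luminance block.
top = [100] * 8
left = [50] * 8
dc_block = predict_block("dc", top, left)
print(dc_block[0])  # [75, 75, 75, 75, 75, 75, 75, 75]
```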
AVS China Part 2: Encoder and Decoder
Figure 3: AVS China PART 2: Encoder [34]
Figure 4: AVS China PART 2: Decoder [34]
Intra frame prediction in AVS China Part 2
The most probable mode is predicted from the intra prediction modes of the neighboring blocks.
This helps to reduce the average number of bits needed to describe the intra prediction mode in the video bit stream.
The reconstructed pixels of neighboring blocks, before deblocking filtering, are used as reference pixels for the current block, as shown in Figure 4.
Figure 4: AVS China Part 2: Macroblock partitioning
Figure 5: AVS China Part 2: Neighbour pixels in luminance prediction [31]
Figure 6: AVS China Part 2: Five Luminance intra prediction modes [20]
VP9
Like DIRAC, it is an open-source, license-free video compression standard developed by Google [26].
It aims to reduce the bit rate by 50% compared to its predecessor at the same video quality [27].
VP9 introduces super-blocks (SB) of size up to 64x64 and allows recursive decomposition all the way down to 4x4.
Unlike in H.265, these blocks do not need to be square, so 64x32 or 4x8 blocks can be used for greater efficiency.
Figure 10: Partitioning of a Super Block in VP9 [22]
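To make the recursive decomposition concrete, the short Python sketch below lists the block sizes that can arise when a square block is either kept whole, cut into two halves (horizontally or vertically) or split into four quadrants that are decomposed again. The function name and the (width, height) tuple convention are assumptions made for this example.

```python
def vp9_block_sizes(sb_size=64, min_size=4):
    """List (width, height) block sizes reachable from one super-block."""
    sizes = []
    n = sb_size
    while n >= min_size:
        sizes.append((n, n))            # square block left unsplit
        if n > min_size:
            sizes.append((n, n // 2))   # horizontal cut: two wide halves
            sizes.append((n // 2, n))   # vertical cut: two tall halves
        n //= 2                         # four-way split: recurse on quadrants
    return sizes

sizes = vp9_block_sizes()
print(len(sizes))          # 13
print((64, 32) in sizes)   # True
print((4, 8) in sizes)     # True
```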
Partitioning of a Super Block and Intra Prediction Modes in VP9
A large part of the coding efficiency improvements achieved in VP9 can be attributed to the incorporation of larger prediction block sizes.
VP9 has 10 intra prediction modes for reconstructing these blocks.
For blocks of 4x4:
DC, Vertical, Horizontal, TM (True Motion), Horizontal Up, Left Diagonal, Vertical Right, Vertical Left, Right Diagonal, and Horizontal Down.
For blocks from 8x8 to 64x64:
DC_PRED (DC prediction)
TM_PRED (True-motion prediction)
H_PRED (Horizontal prediction)
V_PRED (Vertical prediction)
D27 (angle 27 degrees)
D45 (angle 45 degrees)
D63 (angle 63 degrees)
D117 (angle 117 degrees)
D135 (angle 135 degrees)
D153 (angle 153 degrees)
Figure 13: Angular intra prediction modes for VP9 [14]
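Of the modes listed above, TM_PRED is the least self-explanatory. A minimal Python sketch is given below, assuming the usual True-Motion rule in which each predicted sample is the left neighbour plus the above neighbour minus the above-left corner sample, clipped to the valid sample range. The function names and the 8-bit depth are assumptions made for this example.

```python
def clip_pixel(value, bit_depth=8):
    """Clamp a value to the valid sample range (8-bit assumed)."""
    return max(0, min(value, (1 << bit_depth) - 1))

def tm_predict(top, left, top_left):
    """True-Motion prediction: pred[y][x] = left[y] + top[x] - top_left."""
    return [[clip_pixel(l + t - top_left) for t in top] for l in left]

# Hypothetical neighbouring samples for a 4x4 block.
top = [100, 110, 120, 130]
left = [90, 95, 250, 5]
pred = tm_predict(top, left, top_left=100)
print(pred[0])  # [90, 100, 110, 120]
print(pred[2])  # [250, 255, 255, 255]  (clipped at 255)
```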
H.264/AVC
H.264/AVC is a video coding standard developed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC MPEG Video Group, which together formed the Joint Video Team (JVT) [21].
Each PU is predicted from neighboring image data in the same picture, using DC prediction (an average value for the PU), planar prediction (fitting a plane surface to the PU) or directional prediction (extrapolating from neighboring data) [40].
Mode 0: Vertical Prediction
Mode 1: Horizontal Prediction
Mode 2: DC Prediction
Mode 3: Diagonal Down-Left Prediction
Mode 4: Diagonal Down-Right Prediction
Mode 5: Vertical Right Prediction
Mode 6: Horizontal Down Prediction
Mode 7: Vertical Left Prediction
Mode 8: Horizontal Up Prediction
Figure 14: Intra_4x4 Prediction in H.264/AVC[40]
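The first three Intra_4x4 modes listed above (0: vertical, 1: horizontal, 2: DC) can be sketched in Python as follows, together with a simple mode choice. The sum of absolute differences (SAD) used to pick a mode is an assumption made for this illustration; a real encoder such as JM would typically use rate-distortion optimization, and the six directional modes are left out.

```python
def intra4x4_predict(mode, top, left):
    """Predict a 4x4 block from the 4 samples above and the 4 to the left."""
    if mode == 0:   # vertical: copy the row above into every row
        return [list(top) for _ in range(4)]
    if mode == 1:   # horizontal: copy the left column into every column
        return [[left[y]] * 4 for y in range(4)]
    if mode == 2:   # DC: rounded mean of the 8 neighbouring samples
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0-2 are covered by this sketch")

def sad(block_a, block_b):
    """Sum of absolute differences between two 4x4 blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_mode(block, top, left):
    """Pick the mode (0, 1 or 2) with the smallest SAD."""
    return min(range(3),
               key=lambda m: sad(block, intra4x4_predict(m, top, left)))

# Hypothetical block whose columns continue the samples above it.
top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_mode(block, top, left))  # 0 (vertical)
```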
H.264/AVC: Encoder and Decoder
Figure 15: H.264/AVC: Encoder [36]
Figure 16: H.264/AVC: Decoder [36]
HEVC(HIGH EFFICIENCY VIDEO CODING)
High Efficiency Video Coding (HEVC) is the latest Video Coding format [10].
It challenges the state-of-the-art H.264/AVC [17] video coding standard, which is in current use in the industry, by reducing the bit rate by 50% while retaining the same video quality.
On 13 April 2013 [17], the HEVC standard, also called H.265, was approved by ITU-T. The Joint Collaborative Team on Video Coding (JCT-VC) is a group of video coding experts from ITU-T Study Group 16 (VCEG) and ISO/IEC JTC 1/SC 29/WG 11 (MPEG).
Figure 18: Coding Tree Unit splitting example with solid lines for CU split:
a) with PU splitting depicted as dotted lines
b) with TU splitting depicted as dotted lines [14]
PREDICTION BLOCK SIZES AND MACROBLOCK CONCEPT in HEVC[14]:
The concept of macroblock in HEVC [9] is represented by the Coding Tree Unit (CTU). CTU size can be 16x16, 32x32 or 64x64.
Larger CTU size aims to improve the efficiency of block partitioning on high resolution video sequence.
Larger blocks motivate the introduction of quad-tree partitioning of a CTU into smaller coding units (CUs).
A coding unit is a bottom-level quad-tree syntax element of CTU splitting.
The CU contains a prediction unit (PU) and a transform unit (TU).
The TU is a syntax element responsible for storing transform data.
Allowed TU sizes are 32x32, 16x16, 8x8 and 4x4.
The PU is a syntax element to store prediction data like the intra prediction angle or inter-prediction motion vector.
The CU can contain up to four prediction units.
CU splitting into PUs can be 2Nx2N, 2NxN, Nx2N, NxN, 2NxnU, 2NxnD, nLx2N and nRx2N (Figure 19), where 2N is the size of the CU being split.
In the intra prediction mode only 2Nx2N PU splitting is allowed.
An NxN PU split is also possible for a bottom level CU that cannot be
further split into sub CUs.
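The quad-tree partitioning of a CTU into CUs described above can be sketched in Python as follows. The split decision is passed in as a callback because the actual encoder decision (rate-distortion based in HM) is outside the scope of this illustration; the function name, the callback and the minimum CU size of 8x8 are assumptions made for this example.

```python
def split_ctu(x, y, size, should_split, min_cu=8):
    """Return the leaf CUs of a CTU as (x, y, size) tuples.

    should_split(x, y, size) is a hypothetical callback standing in
    for the encoder's split decision.
    """
    if size > min_cu and should_split(x, y, size):
        half = size // 2
        cus = []
        for dy in (0, half):          # visit the four quadrants
            for dx in (0, half):
                cus.extend(split_ctu(x + dx, y + dy, half,
                                     should_split, min_cu))
        return cus
    return [(x, y, size)]             # leaf of the quad-tree: one CU

# Example: split the 64x64 CTU once, then split only its top-left quadrant.
def decision(x, y, size):
    return size == 64 or (size == 32 and x == 0 and y == 0)

cus = split_ctu(0, 0, 64, decision)
print(len(cus))   # 7
print(cus[0])     # (0, 0, 16)
```

The leaf CUs always tile the CTU exactly, which can be checked by summing their areas.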
Figure 19: Prediction Unit Splitting in HEVC [14]
Figure 20: Prediction Modes in HEVC [14]
Figure 21: Luma Intra Prediction Modes in HEVC [14]
Test Sequences [31]
Name                               Resolution    Frame Rate
Bridge-close_cif.yuv               352x288       30 fps
BQMall_832x480_60.yuv              832x480       60 fps
BasketballDrive_1280x720_50.yuv    1280x720      50 fps
Kimono_1920x1080_24.yuv            1920x1080     24 fps
Claire_qcif.yuv                    176x144       15 fps
Sample Command Line Parameters
HEVC: HM 16.4 Software
C:\HEVC\bin\vc10\Win32\Release> TAppEncoder.exe -c C:\HEVC\cfg\encoder_intra_main.cfg -wdt 832 -hgt 480 -fr 60 -f 10 -i C:\HEVC\test_seq\BQMall_832x480_60.yuv
H.264: JM 18.3 Software
C:\h_264\bin> lencod.exe -f encoder_main.cfg -p InputFile = "C:\HEVC\test_seq\bridge-close_cif.yuv" -p FramesToBeEncoded = 10 -p SourceWidth = 352 -p SourceHeight = 288 -p QPISlice = 32 -p FrameRate = 30.0 -p ProfileIDC = 77 -p LevelIDC = 40 -p Intraperiod = 1
VP9: WebM Project’s VP9 Encoder
vpxenc Kimono_1920x1080_24.yuv -o kimono.webm \
  --codec=vp9 --i420 --width=1920 --height=1080 --passes=2 -t 0 \
  --rt --good --cpu-used=0 --end-usage=q \
  --auto-alt-ref=1 --fps=24000/1001 --verbose --psnr \
  --lag-in-frames=25 --kf-max-dist=1 \
  --min-q=32 --max-q=32
AVS China Part 2: AVS China Reference Software
lencod.exe -f encoder_ai.cfg -p InputFile = "C:\HEVC\test_seq\bridge-close_cif.yuv" -p FramesToBeEncoded = 10 -p SourceWidth = 352 -p SourceHeight = 288 -p TraceFile = "log_bridge.txt" -p OutputFile = "test_bridge.avs" -p QPIFrame = 32 -p FrameRate = 5 -p ChromaFormat = 1
R-D Plots
Each plot shows PSNR (dB) against bitrate (kbps) for HEVC, H.264, VP9 and AVS China Part 2.
[R-D Plot: claire_qcif.yuv]
[R-D Plot: bridge-close_cif.yuv]
[R-D Plot: BQMall_832x480_60.yuv]
[R-D Plot: BasketballDrive_1280x720_50.yuv]
[R-D Plot: Kimono_1920x1080_24.yuv]
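R-D curves like these are the input to the BD-PSNR metric named in the Objective. The Python sketch below follows the general idea of Bjøntegaard's method: fit a third-order polynomial to PSNR as a function of log10(bitrate) for each codec, integrate both over the common bitrate range, and divide the difference by the width of that range. It assumes exactly four R-D points per curve (for example QP 22, 27, 32 and 37), so the cubic is an exact interpolation and Simpson's rule integrates it exactly. All function names and the sample numbers are assumptions made for this example and are not results from this report.

```python
import math

def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def integrate_cubic(xs, ys, lo, hi):
    """Simpson's rule on [lo, hi]; exact for the cubic through 4 points."""
    mid = (lo + hi) / 2
    f = lambda x: lagrange_eval(xs, ys, x)
    return (hi - lo) / 6 * (f(lo) + 4 * f(mid) + f(hi))

def bd_psnr(rates_a, psnr_a, rates_b, psnr_b):
    """Average PSNR gain (dB) of codec B over codec A."""
    log_a = [math.log10(r) for r in rates_a]
    log_b = [math.log10(r) for r in rates_b]
    lo = max(min(log_a), min(log_b))   # overlapping log-bitrate range
    hi = min(max(log_a), max(log_b))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in bitrate")
    area_a = integrate_cubic(log_a, psnr_a, lo, hi)
    area_b = integrate_cubic(log_b, psnr_b, lo, hi)
    return (area_b - area_a) / (hi - lo)

# Made-up R-D points (kbps, dB): codec B is 1 dB better at every rate.
rates = [100, 200, 400, 800]
curve_a = [30.0, 33.0, 36.0, 38.0]
curve_b = [31.0, 34.0, 37.0, 39.0]
gain = bd_psnr(rates, curve_a, rates, curve_b)
print(round(gain, 6))  # 1.0
```

BD-BR can be obtained in the same way by swapping the roles of the two axes, that is, fitting log10(bitrate) as a function of PSNR.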
Encoding Time Comparison
Each plot shows encoding time (sec) against quantization parameter (22, 27, 32, 37) for HEVC, H.264, VP9 and AVS China Part 2.
[Encoding Time Comparison: claire_qcif.yuv]
[Encoding Time Comparison: bridge-close_cif.yuv]
[Encoding Time Comparison: BQMall_832x480_60.yuv]
[Encoding Time Comparison: BasketballDrive_1280x720_50.yuv]
[Encoding Time Comparison: Kimono_1920x1080_24.yuv]
References
[1] I. Richardson, “The H.264 Advanced Video Compression Standard”, Wiley, 2010.
[2] K.R. Rao, D.N. Kim and J.J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.
[3] BD-BR and BD-PSNR: G. Bjøntegaard, “Calculation of average PSNR differences between RD-curves”, ITU-T Q.6/SG16 VCEG 13th Meeting, Document VCEG-M33, Austin, USA, Apr. 2001.
[4] PSNR and MSE: http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/VELDHUIZEN/node18.html
[5] The WebM Project’s VP9 Encoder: http://www.webmproject.org/vp9/
[6] The HM Test Model 16.3: http://hevc.hhi.fraunhofer.de/HM-doc/
[7] JM Software 18.6: http://iphome.hhi.de/suehring/tml/
[8] S. Ma et al, “Overview of IEEE 1857 video coding standard”, access the link: http://ieeexplore.ieee.org/xpl/abstractAuthors.jsp?reload=true&arnumber=6738308
[9] Z. Wang et al, “Image quality assessment: From error visibility to structural similarity”, IEEE Trans. Image Processing, vol. 13, pp. 600–612, Apr. 2004.
[10] G.J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001-1016, Dec. 2013.
[11] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Intra Prediction Efficiency and Performance Comparison of HEVC and VP9”, S. Sukumaran, 2014.
[12] ITU-T Recommendation H.263, “Video coding for low bit-rate communication”, Feb. 1998
[13] J. Ostermann et al, “Video Coding with H.264/AVC: Tools, performance and complexity”, IEEE Circuits and Systems Magazine, vol. 4, pp. 7-28, First Quarter 2004.
[14] M.P. Sharabayko et al, “Intra Compression Efficiency in VP9 and HEVC”, Applied Mathematical Sciences, Vol. 7, no. 137, pp. 6803-6824, Hikari Ltd, 2013.
[15] G. Bjontegaard, “Coding improvement by using 4x4 blocks for motion vectors and transform”, Nov. 1997.
[16] Z. Nan et al, “Spatial Prediction Based Intra-Coding [video-coding]”, IEEE International Conference on Multimedia and Expo (ICME), Vol. 1, pp. 97-100, June 2004.
[17] T. Wiegand et al, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, pp. 560-576, Jul. 2003.
[18] W. Gao et al, “AVS Video Coding Standard”, Intelligent Studies in Computational Intelligence Volume 280, pp.125-166, 2010.
[19] W. Gao et al, “AVS- the Chinese Next Generation Video Coding Standard”, http://www.avs.org.cn/reference/AVS%20NAB%20Paper%20Final03.pdf
[20] L. Yu et al, “An Overview of AVS-Video: tools, performance and complexity”, Visual Communications and Image Processing 2005, Proc. of SPIE, vol. 5960, pp. 596021, July 31, 2006.
[21] ISO/IEC JTC1/SC29/WG11 (MPEG), “Coding of audio-visual objects - Part 10: Advanced Video Coding”, International Standard 14496-10, ISO/IEC, 2004.
[22] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Comparative study of Intra Frame Coding efficiency in HEVC and VP9”, S. Kodpadi, 2014.
[23] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Study and implementation of Video compression standards (H.264/AVC and Dirac).” , K.V.Dhonsale, Spring 2012.
[24] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Video compression standard for high definition video: A comparative study of H.264, Dirac Pro and AVS part 2”, S.P.Gangavati, Spring 2012.
[25] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Audio Video Standard for Mobile (AVS-M)”, S.Devaraju, Spring 2009.
[26] Video Test Sequences: http://ultravideo.cs.tut.fi/#testsequences
[27] "VP-Next Overview and Progress Update" (PDF). WebM Project (Google). Retrieved 2012-12-29. Available on: http://downloads.webmproject.org/ngov2012/pdf/04-ngovproject-update.pdf
[28] T. Davies, “A modified rate-distortion optimization strategy for hybrid wavelet video coding”, IEEE International Conference on Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006, Vol.: 2, pp. II, Publication Date: 14-19 May 2006.
[29] Basics of Image Processing: https://www.spacetelescope.org/static/projects/fits_liberator/image_processing.pdf
[30] Multimedia Processing: http://nptel.iitk.ac.in/courses/Webcourse-contents/IIT%20Kharagpur/Multimedia%20Processing/pdf/ssg_m1l1.pdf
[31] L. Yu, S. Chen and J. Wang, “Overview of AVS-video coding standards”, Signal Processing: Image Communication, Vol. 24, Issue 4, Special Issue on AVS and its Application, pp. 247- 262, April 2009.
[32] “Dirac Pro web page” at http://www.bbc.co.uk/rd/projects/dirac/diracpro.shtml
[33] M.Wien: High Efficiency Video Coding: Coding Tools and Specification, Springer, 2015.
[34] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Video compression standard for high definition video: A comparative study of H.264, Dirac Pro and AVS part 2”, S. Gangavati, 2012.
[35] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Performance Analysis of Dirac Pro with H.264 Intra frame coding”, P. Kharwandikar, 2010.
[36] X. Zhou, E. Q. Li, and Y.-K. Chen, “Implementation of H.264 decoder on general purpose processors with media instructions”, SPIE Conference on Image and Video Communications and Processing, vol. 5022, pp.224-235, May 2003.
[37] D. Marpe et al, “The H.264/MPEG4 advanced video coding standard and its applications”, IEEE Communications Magazine, Vol. 44, pp. 134-143, Aug. 2006.
[38] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.
[39] HM Software manual: http://hevc.hhi.fraunhofer.de/
[40] I.E.Richardson, “Coding Video : A Practical guide to HEVC and beyond”, Wiley, 11 May 2015.
[41] L.Yu et al, “Overview of AVS coding standards”, Signal Processing: Image Communication. Vol. 24, pp. 247 –262 , April 2009.
[42] L. Fang et al, “Overview of AVS Standard”, Multimedia and Expo, 2004. ICME '04. 2004 IEEE International Conference, Vol. 1, pp. 423-426, June 2004.
[46] “P1857 - IEEE Draft Standard for Advanced Audio and Video Coding”, pp. 1-190, 26 October 2012.