
EE5359:MULTIMEDIA PROCESSING

SWETHAA ALLIYALAMANGALAM JAYARAMAN

1001053849

[email protected]

THE UNIVERSITY OF TEXAS AT ARLINGTON

Interim Report on:

COMPARISON AND ANALYSIS OF INTRA PREDICTION EFFICIENCY IN HEVC, H.264, VP9 and AVS China PART 2

UNDER THE GUIDANCE OF DR. K.R.RAO

ELECTRICAL ENGINEERING DEPARTMENT,

THE UNIVERSITY OF TEXAS AT ARLINGTON

Acronyms

AVC: Advanced Video Coding
AVS: Audio Video Standard
ADST: Asymmetric Discrete Sine Transform
AU: Access Unit
BBC: British Broadcasting Corporation
BD-BR: Bjøntegaard-Delta Bit-Rate
BD-PSNR: Bjøntegaard-Delta Peak Signal-to-Noise Ratio
CABAC: Context-Adaptive Binary Arithmetic Coding
CTU: Coding Tree Unit
CU: Coding Unit
DBF: De-Blocking Filter
DC: Direct Current
DCT: Discrete Cosine Transform
DFT: Discrete Fourier Transform
DST: Discrete Sine Transform
HD: High Definition
HDTV: High Definition Television
HEVC: High Efficiency Video Coding
ISO: International Organization for Standardization
ITU-T: International Telecommunication Union (Telecommunication Standardization Sector)
JPEG: Joint Photographic Experts Group
JVT: Joint Video Team
MB: Macroblock
MPEG: Moving Picture Experts Group
MSE: Mean Square Error
NAL: Network Abstraction Layer
NGOV: Next Generation Open Video
OBMC: Overlapped Block-based Motion Compensation
PSNR: Peak Signal-to-Noise Ratio
PU: Prediction Unit
QCIF: Quarter Common Intermediate Format
RD: Rate Distortion
RDO: Rate Distortion Optimization
SAO: Sample Adaptive Offset
SDTI: Serial Data Transport Interface
SMPTE: Society of Motion Picture and Television Engineers
SSIM: Structural Similarity Index
TM: True Motion
TU: Transform Unit
UVLC: Universal Variable Length Code
VC: Video Coding
VLC: Variable Length Coding

Objective

This project aims to compare video coding standards such as HEVC (High Efficiency Video Coding), H.264/AVC, VP9, DIRAC and AVS (Audio Video Standard) China Part 2 on the basis of intra-prediction efficiency.

The comparison will be carried out using performance metrics such as PSNR [5], SSIM [45], MSE [5], BD-PSNR [4], BD-BR [4] and computational complexity.

The tests will be carried out using the HM Test Model 16.3 [8], JM Software 18.6 [9], the WebM Project’s Encoder [7], DIRAC Software [10] and AVS China Reference Software [34] for HEVC, H.264/AVC, VP9, DIRAC and AVS Part 2, respectively.
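As an illustration of two of the metrics named above, the following minimal sketch computes MSE and PSNR between an original and a decoded frame. It assumes 8-bit samples (peak value 255) stored as lists of rows; the function names and the 2x2 sample blocks are hypothetical and chosen only for illustration.

```python
import math

def mse(ref, test):
    """Mean squared error between two equally sized frames (lists of rows)."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count

def psnr(ref, test, peak=255):
    """PSNR in dB; `peak` is the maximum sample value (255 for 8-bit video)."""
    err = mse(ref, test)
    if err == 0:
        return math.inf  # identical frames have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical 2x2 luma blocks, for illustration only.
original = [[52, 55], [61, 59]]
decoded = [[54, 55], [60, 58]]
print(mse(original, decoded), round(psnr(original, decoded), 2))
```

In practice the encoders report these values themselves; a helper like this is mainly useful for cross-checking the numbers from different reference software on the same decoded output.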

Basics (1)

Pixel: An abbreviation for picture element.

Image: An image is an array, or a matrix, of square pixels arranged in columns and rows [29].

Video: A sequence of images processed electronically into an analog or digital format and displayed on a screen with sufficient rapidity as to create the illusion of motion and continuity.

Multimedia Signal: It is the integration of several media sources, such as video, audio, graphics, animation and text, in a meaningful way to convey some information [30].

Figure 1: Representation of an Image

Video Compression: It is the process of lessening the amount of data needed for representation of the videos by removing redundant data.

Video Decompression: It is the inverse process of video compression.

Basics (2)

Inter-pixel Redundancy: It refers to the correlation present between the pixels of an image frame or the pixels of a group of successive image or video frames [31].

Spatial Redundancy (Intra-Frame Redundancy): It represents the statistical correlation between pixels within an image frame [31].

Temporal Redundancy (Inter-Frame Redundancy): It is concerned with the statistical correlation between pixels from successive frames in a temporal image or video sequence [31].

Figure 2: Classification of Types of Redundancies

Coding Tools: These are the tools which help in the process of video compression by removing the redundancies.

Coding Efficiency: It is the ability to minimize the bit rate necessary for representation of video content to reach a given level of video quality or, as alternatively formulated, to maximize the video quality achievable within a given available bit rate.

Basics (3)

Macroblocks: The input video frame is initially partitioned into blocks of the same size called macroblocks [22].

The compression and decoding process works within each macroblock.

A macroblock is sub-partitioned into smaller blocks to perform prediction.

The aim of the prediction process is to reduce data redundancy and, therefore, not store excessive information in the coded bit stream [22].

There are two basic types of prediction: intra and inter.

Intra-prediction works within the current video frame and is based upon the compressed and decoded data available for the block being predicted.

Inter-prediction is used for motion compensation: a similar region on previously coded frames close to the current block is used for prediction.

What is Intra-Prediction?

Intra-prediction is carried out within the current video frame and predicts the current block from the already encoded and decoded data available in that frame [14].

Intra-prediction plays a key role in determining the compression efficiency of the whole codec [11].

It was initially proposed in 1952 and was later applied in the transform domain in standards such as H.261 and H.263 [12].

Table 1: Intra-prediction among various video coding standards at a glance

Video coding standard | Block size | Number of prediction modes
AVS Part 2 | 8x8 block | 5 (0-4)
VP9 | Super blocks up to 64x64 | 10
H.264/AVC | 4x4 and 16x16 | 9 (4x4) or 4 (16x16)
HEVC | 16x16, 32x32 or 64x64 CTU | 35 (0-34)

AVS (Audio Video Standard) PART 2:

The AVS Video Coding standard was developed by the China Audio Video Coding Standard (AVS) working group.

AVS Part 2 focuses on high-definition digital video broadcasting and high-density storage media. It is also known as AVS1-P2 in AVS [18].

It is also called the AVS Jizhun profile.

The spatial intra prediction is based on 8x8 blocks.

It uses decoded information in the current frame as the reference of prediction, exploiting statistical spatial dependencies between pixels within a picture.

5 luminance intra-prediction modes.

4 chrominance intra prediction modes.

Mode 0: Horizontal

Mode 1: Vertical

Mode 2: DC Mode: each pixel of the current block is predicted by an average of the vertically and horizontally corresponding reference pixels

Mode 3: Diagonal down left

Mode 4: Diagonal down right
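The three non-diagonal modes listed above can be sketched as follows. This is a simplified illustration that follows the description on this slide (DC as the rounded average of the vertically and horizontally aligned reference pixels); it is not the normative AVS procedure, which also specifies how reference pixels are filtered and handled at picture boundaries. The function name, the string mode labels and the reference values are hypothetical.

```python
def predict_8x8(mode, top, left):
    """Return an 8x8 predicted block from 8 top and 8 left reference samples."""
    n = 8
    if mode == "vertical":      # copy the row above the block downwards
        return [[top[x] for x in range(n)] for _ in range(n)]
    if mode == "horizontal":    # copy the column left of the block rightwards
        return [[left[y]] * n for y in range(n)]
    if mode == "dc":            # rounded average of the aligned top and left samples
        return [[(top[x] + left[y] + 1) >> 1 for x in range(n)] for y in range(n)]
    raise ValueError("unsupported mode: " + mode)

# Hypothetical reference samples, for illustration only.
top = [100] * 8
left = [50, 60, 70, 80, 90, 100, 110, 120]
dc_block = predict_8x8("dc", top, left)
print(dc_block[0][0], dc_block[7][0])
```

The encoder subtracts the chosen predicted block from the original block and codes only the residual, which is why a well-matched mode reduces the bit rate.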

AVS China Part 2: Encoder and Decoder

Figure 3: AVS China Part 2: Encoder [34]

Figure 4: AVS China Part 2: Decoder [34]

Intra frame prediction in AVS China Part 2

The most probable mode is predicted from the intra prediction modes of the neighboring blocks.

This helps to reduce the average number of bits needed to describe the intra prediction mode in the video bit stream.

The reconstructed pixels of neighboring blocks, before deblocking filtering, are used as reference pixels for the current block, as shown in Figure 4.

Figure 4: AVS China Part 2: Macroblock partitioning

Figure 5: AVS China Part 2: Neighbour pixels in luminance prediction [31]

Figure 6: AVS China Part 2: Five luminance intra prediction modes [20]

VP9

Like DIRAC, VP9 is an open-source and free-license video compression standard developed by Google [26].

It also aims to reduce the bit rate by 50% compared to its predecessor at the same video quality [27].

VP9 introduces super-blocks (SB) of size up to 64x64 and allows breakdown using recursive decomposition all the way down to 4x4.

Unlike in H.265, these blocks do not need to be square, so VP9 can use 64x32 or 4x8 blocks for greater efficiency.
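The recursive decomposition described above can be illustrated by enumerating the block shapes it reaches. The sketch below assumes that at each square size a block may be kept whole, split horizontally into two halves, split vertically into two halves, or split into four squares that are partitioned again; the function name is hypothetical.

```python
def vp9_block_sizes(superblock=64, smallest=4):
    """Enumerate (width, height) shapes reachable from one superblock."""
    sizes = []
    n = superblock
    while n >= smallest:
        sizes.append((n, n))            # no split: keep the square block
        if n > smallest:
            sizes.append((n, n // 2))   # horizontal split: two n x n/2 blocks
            sizes.append((n // 2, n))   # vertical split: two n/2 x n blocks
        n //= 2                         # four-way split, then repeat one level down
    return sizes

sizes = vp9_block_sizes()
print(len(sizes), sizes)
```

Under these assumptions the enumeration yields 13 shapes, from 64x64 down to 4x4, including the rectangular 64x32 and 4x8 blocks mentioned above.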

Figure 10: Partitioning of a Super Block in VP9 [22]

VP9: Encoder and Decoder

Figure 11: VP9: Encoder [22]

Figure 12: VP9: Decoder [22]

Partitioning of a Super Block and Intra Prediction Modes in VP9

A large part of the coding efficiency improvements achieved in VP9 can be attributed to the incorporation of larger prediction block sizes.

VP9 has 10 intra prediction modes for reconstructing these blocks.

For blocks of 4x4:

DC, Vertical, Horizontal, TM (True Motion), Horizontal Up, Left Diagonal, Vertical Right, Vertical Left, Right Diagonal, and Horizontal Down.

For blocks from 8x8 to 64x64:

DC_PRED (DC prediction)

TM_PRED (True-motion prediction)

H_PRED (Horizontal prediction)

V_PRED (Vertical prediction)

D27 (angle 27 degrees)

D45 (angle 45 degrees)

D63 (angle 63 degrees)

D117 (angle 117 degrees)

D135 (angle 135 degrees)

D153 (angle 153 degrees)

Figure 13: Angular intra prediction modes for VP9 [14]
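Of the modes listed above, TM_PRED is the one without a counterpart in the other standards compared here. A minimal sketch, assuming the commonly described formulation in which each predicted sample is the left neighbour plus the top neighbour minus the top-left neighbour, clipped to the valid sample range; the function name and sample values are hypothetical.

```python
def tm_pred(top, left, top_left, bit_depth=8):
    """True-motion prediction: pred[y][x] = clip(left[y] + top[x] - top_left)."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(l + t - top_left, 0), max_val) for t in top] for l in left]

# Hypothetical 4x4 neighbourhood, for illustration only.
top = [100, 110, 120, 130]
left = [90, 95, 200, 250]
block = tm_pred(top, left, top_left=100)
print(block[0], block[3])
```

The mode propagates the horizontal gradient of the top row and the vertical gradient of the left column into the block, which suits smooth gradients that the purely directional modes handle poorly.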

H.264/AVC

H.264/AVC is a video coding standard developed by the Joint Video Team (JVT), formed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC MPEG Video Group [21].

Each PU is predicted from neighboring image data in the same picture, using DC prediction (an average value for the PU), planar prediction (fitting a plane surface to the PU) or directional prediction (extrapolating from neighboring data) [40].

Mode 0: Vertical Prediction

Mode 1: Horizontal Prediction

Mode 2: DC Prediction

Mode 3: Diagonal Down-Left Prediction

Mode 4: Diagonal Down-Right Prediction

Mode 5: Vertical Right Prediction

Mode 6: Horizontal Down Prediction

Mode 7: Vertical Left Prediction

Mode 8: Horizontal Up Prediction

Figure 14: Intra_4x4 Prediction in H.264/AVC[40]
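To show how an encoder might choose among these modes, the sketch below builds the predictions for modes 0 to 2 (vertical, horizontal, DC) of a 4x4 block and picks the one with the smallest sum of absolute differences (SAD). The SAD criterion is an assumption made for brevity; reference encoders typically use rate-distortion optimization instead. All names and sample values are hypothetical, and the six directional modes are omitted.

```python
def intra4x4_candidates(top, left):
    """Predicted 4x4 blocks for the vertical, horizontal and DC modes."""
    dc = (sum(top) + sum(left) + 4) >> 3   # rounded mean of the 8 neighbours
    return {
        "vertical": [list(top) for _ in range(4)],
        "horizontal": [[l] * 4 for l in left],
        "dc": [[dc] * 4 for _ in range(4)],
    }

def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def best_mode(block, top, left):
    """Name of the candidate mode with the smallest SAD against `block`."""
    cands = intra4x4_candidates(top, left)
    return min(cands, key=lambda m: sad(block, cands[m]))

# Hypothetical block with vertical stripes, for illustration only.
top = [10, 20, 30, 40]
left = [25, 25, 25, 25]
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_mode(block, top, left))
```

For this striped block the vertical predictor reproduces the content exactly, so its residual is zero and it is selected.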

H.264/AVC: Encoder and Decoder

Figure 15: H.264/AVC: Encoder [36]

Figure 16: H.264/AVC: Decoder [36]

HEVC (HIGH EFFICIENCY VIDEO CODING)

High Efficiency Video Coding (HEVC) is the latest video coding format [10].

It challenges the state-of-the-art H.264/AVC [17] video coding standard, which is in current use in the industry, by reducing the bit rate by 50% while retaining the same video quality.

On 13 April 2013 [17], the HEVC standard, also called H.265, was approved by ITU-T. It was developed by the Joint Collaborative Team on Video Coding (JCT-VC), a group of video coding experts from ITU-T Study Group 16 (VCEG) and ISO/IEC JTC 1/SC 29/WG 11 (MPEG).

HEVC: Encoder and Decoder

Figure 16: HEVC: Encoder [37]

Figure 17: HEVC: Decoder [38]

Figure 18: Coding Tree Unit splitting example with solid lines for CU split: a) with PU splitting depicted as dotted lines; b) with TU splitting depicted as dotted lines [14]

PREDICTION BLOCK SIZES AND MACROBLOCK CONCEPT in HEVC [14]:

The concept of macroblock in HEVC [9] is represented by the Coding Tree Unit (CTU). CTU size can be 16x16, 32x32 or 64x64.

A larger CTU size aims to improve the efficiency of block partitioning on high-resolution video sequences.

Larger blocks motivate the introduction of quad-tree partitioning of a CTU into smaller coding units (CUs).

A coding unit is a bottom-level quad-tree syntax element of CTU splitting.

The CU contains a prediction unit (PU) and a transform unit (TU).

The TU is a syntax element responsible for storing transform data.

Allowed TU sizes are 32x32, 16x16, 8x8 and 4x4.

The PU is a syntax element to store prediction data like the intra prediction angle or inter-prediction motion vector.

The CU can contain up to four prediction units.

CU splitting into PUs can be 2Nx2N, 2NxN, Nx2N, NxN, 2NxnU, 2NxnD, nLx2N and nRx2N (Figure 19), where 2N is the size of the CU being split.

In the intra prediction mode only 2Nx2N PU splitting is allowed.

An NxN PU split is also possible for a bottom-level CU that cannot be further split into sub-CUs.
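The PU split names above map to rectangles inside the CU. The sketch below lists them as (x, y, width, height) tuples for a CU of side 2N, assuming the asymmetric splits divide the CU at one quarter of its side; the function name and the tuple layout are hypothetical.

```python
def pu_partitions(mode, size):
    """PU rectangles (x, y, width, height) for a CU of side `size` (= 2N)."""
    h, q = size // 2, size // 4
    table = {
        "2Nx2N": [(0, 0, size, size)],
        "2NxN":  [(0, 0, size, h), (0, h, size, h)],
        "Nx2N":  [(0, 0, h, size), (h, 0, h, size)],
        "NxN":   [(0, 0, h, h), (h, 0, h, h), (0, h, h, h), (h, h, h, h)],
        "2NxnU": [(0, 0, size, q), (0, q, size, size - q)],
        "2NxnD": [(0, 0, size, size - q), (0, size - q, size, q)],
        "nLx2N": [(0, 0, q, size), (q, 0, size - q, size)],
        "nRx2N": [(0, 0, size - q, size), (size - q, 0, q, size)],
    }
    return table[mode]

# Splits available to intra-coded CUs according to the text above.
INTRA_SPLITS = ("2Nx2N", "NxN")

for name in INTRA_SPLITS:
    print(name, pu_partitions(name, 16))
```

Every split tiles the CU completely, and no split produces more than four PUs, matching the limit stated above.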

Figure 19 : Prediction Unit Splitting in HEVC [14]

Figure 20: Prediction Modes in HEVC [14]

Figure 21: Luma Intra Prediction Modes in HEVC [14]

Test Sequences [31]

Name | Resolution | Frame rate
Bridge-close_cif.yuv | 352x288 | 30 fps
BQMall_832x480_60.yuv | 832x480 | 60 fps
BasketballDrive_1280x720_50.yuv | 1280x720 | 50 fps
Kimono_1920x1080_24.yuv | 1920x1080 | 24 fps
Claire_qcif.yuv | 176x144 | 15 fps

Sample Command Line Parameters

HEVC: HM 16.4 Software

C:\HEVC\bin\vc10\Win32\Release> TAppEncoder.exe -c C:\HEVC\cfg\encoder_intra_main.cfg -wdt 832 -hgt 480 -fr 60 -f 10 -i C:\HEVC\test_seq\BQMall_832x480_60.yuv

H.264: JM 18.3 Software

C:\h_264\bin> lencod.exe -f encoder_main.cfg -p InputFile = "C:\HEVC\test_seq\bridge-close_cif.yuv" -p FramesToBeEncoded = 10 -p SourceWidth = 352 -p SourceHeight = 288 -p QPISlice = 32 -p FrameRate = 30.0 -p ProfileIDC = 77 -p LevelIDC = 40 -p Intraperiod = 1

VP9: WebM Project’s VP9 Encoder

vpxenc Kimono_1920x1080_24.yuv -o kimono.webm \
  --codec=vp9 --i420 --width=1920 --height=1080 --passes=2 -t 0 \
  --rt --good --cpu-used=0 --end-usage=q \
  --auto-alt-ref=1 --fps=24000/1001 --verbose --psnr \
  --lag-in-frames=25 --kf-max-dist=1 \
  --min-q=32 --max-q=32

AVS China Part 2: AVS China Reference Software

lencod.exe -f encoder_ai.cfg -p InputFile = "C:\HEVC\test_seq\bridge-close_cif.yuv" -p FramesToBeEncoded = 10 -p SourceWidth = 352 -p SourceHeight = 288 -p TraceFile = "log_bridge.txt" -p OutputFile = "test_bridge.avs" -p QPIFrame = 32 -p FrameRate = 5 -p ChromaFormat = 1
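One way to script such runs is sketched below for the HM encoder only: it builds one argument list per quantization parameter, using the four QP values that appear on the encoding-time plots in this report. The helper name and default paths are hypothetical, and the `-q` option is assumed to be HM's QP switch; check it against the HM software manual for the version in use before relying on it.

```python
def hm_intra_command(yuv_path, width, height, fps, frames, qp,
                     encoder="TAppEncoder.exe",
                     config=r"C:\HEVC\cfg\encoder_intra_main.cfg"):
    """Build the argument list for one all-intra HM encode (nothing is executed)."""
    return [encoder, "-c", config, "-i", yuv_path,
            "-wdt", str(width), "-hgt", str(height),
            "-fr", str(fps), "-f", str(frames),
            "-q", str(qp)]   # assumed QP option, see the note above

QPS = (22, 27, 32, 37)
commands = [
    hm_intra_command(r"C:\HEVC\test_seq\BQMall_832x480_60.yuv", 832, 480, 60, 10, qp)
    for qp in QPS
]
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where the encoder is installed; timing that call gives one point on an encoding-time plot.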

Test Results


R-D Plots

[Figures: R-D plots of PSNR (dB) versus bitrate (kbps) comparing HEVC, H.264, VP9 and AVS China Part 2, one plot per test sequence: claire_qcif.yuv, bridge-close_cif.yuv, BQMall_832x480_60.yuv, BasketballDrive_1280x720_50.yuv and Kimono_1920x1080_24.yuv]

Encoding Time Comparison

[Figures: encoding time (sec) versus quantization parameter (22, 27, 32, 37) comparing HEVC, H.264, VP9 and AVS China Part 2, one plot per test sequence: claire_qcif.yuv, bridge-close_cif.yuv, BQMall_832x480_60.yuv, BasketballDrive_1280x720_50.yuv and Kimono_1920x1080_24.yuv]

References

[1] I. Richardson, “The H.264 Advanced Video Compression Standard”, Wiley, 2010.

[2] K.R. Rao, D.N. Kim and J.J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.

[3] BD-BR and BD-PSNR: G. Bjøntegaard, “Calculation of average PSNR differences between RD-curves”, ITU-T Q.6/SG16 VCEG 13th Meeting, Document VCEG-M33, Austin, USA, Apr. 2001.

[4] PSNR and MSE: http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/VELDHUIZEN/node18.html

[5] The WebM Project’s VP9 Encoder: http://www.webmproject.org/vp9/

[6] The HM Test Model 16.3: http://hevc.hhi.fraunhofer.de/HM-doc/

[7] JM Software 18.6: http://iphome.hhi.de/suehring/tml/

[8] S. Ma et al, “Overview of IEEE 1857 video coding standard”, available at: http://ieeexplore.ieee.org/xpl/abstractAuthors.jsp?reload=true&arnumber=6738308

[9] Z. Wang et al, “Image quality assessment: From error visibility to structural similarity”, IEEE Trans. Image Processing, vol. 13, pp. 600-612, Apr. 2004.

[10] G.J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001-1016, Dec. 2013.

[11] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Intra Prediction Efficiency and Performance Comparison of HEVC and VP9”, S. Sukumaran, 2014.

[12] ITU-T Recommendation H.263, “Video coding for low bit-rate communication”, Feb. 1998

[13] J. Ostermann et al, “Video Coding with H.264/AVC: Tools, performance and complexity”, IEEE Circuits and Systems Magazine, vol. 4, pp. 7-28, First Quarter 2004.

[14] M.P. Sharabayko et al, “Intra Compression Efficiency in VP9 and HEVC”, Applied Mathematical Sciences, Vol. 7, no. 137, pp. 6803-6824, Hikari Ltd, 2013.

[15] G. Bjontegaard, “Coding improvement by using 4x4 blocks for motion vectors and transform”, Nov. 1997.

[16] Z. Nan et al, “Spatial Prediction Based Intra-Coding [video-coding]”, IEEE International Conference on Multimedia and Expo (ICME), Vol. 1, pp. 97-100, June 2004.

[17] T. Wiegand et al, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, pp. 560-576, Jul. 2003.

[18] W. Gao et al, “AVS Video Coding Standard”, Intelligent Studies in Computational Intelligence Volume 280, pp.125-166, 2010.

[19] W. Gao et al, “AVS- the Chinese Next Generation Video Coding Standard”, http://www.avs.org.cn/reference/AVS%20NAB%20Paper%20Final03.pdf

[20] L. Yu et al, “An Overview of AVS-Video: tools, performance and complexity”, Visual Communications and Image Processing 2005, Proc. of SPIE, vol. 5960, pp. 596021, July 31, 2006.

[21] ISO/IEC JTC1/SC29/WG11 (MPEG), “Coding of audio-visual objects - Part 10: Advanced Video Coding”, International Standard 14496-10, ISO/IEC, 2004.

[22] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Comparative study of Intra Frame Coding efficiency in HEVC and VP9”, S. Kodpadi, 2014.

[23] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Study and implementation of Video compression standards (H.264/AVC and Dirac).” , K.V.Dhonsale, Spring 2012.

[24] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Video compression standard for high definition video: A comparative study of H.264, Dirac Pro and AVS part 2”, S.P.Gangavati, Spring 2012.

[25] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Audio Video Standard for Mobile (AVS-M)”, S.Devaraju, Spring 2009.

[26] Video Test Sequences: http://ultravideo.cs.tut.fi/#testsequences

[27] "VP-Next Overview and Progress Update" (PDF). WebM Project (Google). Retrieved 2012-12-29. Available on: http://downloads.webmproject.org/ngov2012/pdf/04-ngovproject-update.pdf

[28] T. Davies, “A modified rate-distortion optimization strategy for hybrid wavelet video coding”, IEEE International Conference on Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006, Vol.: 2, pp. II, Publication Date: 14-19 May 2006.

[29] Basics of Image Processing: https://www.spacetelescope.org/static/projects/fits_liberator/image_processing.pdf

[30] Multimedia Processing: http://nptel.iitk.ac.in/courses/Webcourse-contents/IIT%20Kharagpur/Multimedia%20Processing/pdf/ssg_m1l1.pdf

[31] L. Yu, S. Chen and J. Wang, “Overview of AVS-video coding standards”, Signal Processing: Image Communication, Vol. 24, Issue 4, Special Issue on AVS and its Application, pp. 247- 262, April 2009.

[32] “Dirac Pro web page” at http://www.bbc.co.uk/rd/projects/dirac/diracpro.shtml

[33] M. Wien, “High Efficiency Video Coding: Coding Tools and Specification”, Springer, 2015.

[34] Access the http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Video compression standard for high definition video: A comparative study of H.264, Dirac Pro and AVS part 2”, S. Gangavati, 2012.

[35] Access the website http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html Project on: “Performance Analysis of Dirac Pro with H.264 Intra frame coding”, P. Kharwandikar, 2010.

[36] X. Zhou, E. Q. Li, and Y.-K. Chen, “Implementation of H.264 decoder on general purpose processors with media instructions”, SPIE Conference on Image and Video Communications and Processing, vol. 5022, pp.224-235, May 2003.

[37] D. Marpe et al, “The H.264/MPEG4 advanced video coding standard and its applications”, IEEE Communications Magazine, Vol. 44, pp. 134-143, Aug. 2006.

[38] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.

[39] HM Software manual: http://hevc.hhi.fraunhofer.de/

[40] I.E. Richardson, “Coding Video: A Practical guide to HEVC and beyond”, Wiley, 11 May 2015.

[41] L. Yu et al, “Overview of AVS coding standards”, Signal Processing: Image Communication, Vol. 24, pp. 247-262, April 2009.

[42] L. Fang et al, “Overview of AVS Standard”, Multimedia and Expo, 2004. ICME '04. 2004 IEEE International Conference, Vol. 1, pp. 423-426, June 2004.

[46] “P1857 - IEEE Draft Standard for Advanced Audio and Video Coding”, pp. 1-190, 26 October 2012.