parallel optimization of intra mode selection in hevc using open mp

58
Parallel optimization of intra mode selection in HEVC using Open MP

Upload: peregrine-poole

Post on 27-Dec-2015

226 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Parallel optimization of intra mode selection in HEVC using Open MP

Parallel optimization of intra mode selection in HEVC using Open MP

Page 2: Parallel optimization of intra mode selection in HEVC using Open MP

Why video compression ?• Increase in use of High resolution video streaming over

internet. Eg: Youtube.• Eliminate large storage requirements for multimedia data.• Reduce time interval to transmit compressed video or image

file over limited bandwidth

Page 3: Parallel optimization of intra mode selection in HEVC using Open MP

Basic video compression technique• Instead of compression each and every frame of a video

sequence we basically compress the difference between current and previous frame

Page 4: Parallel optimization of intra mode selection in HEVC using Open MP

Evolution of video coding standards

Page 5: Parallel optimization of intra mode selection in HEVC using Open MP

Introduction to HEVC• Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T

ISO/IEC started its work on a new standard for High Efficiency Video Coding (HEVC).

• HEVC [1][11][16][17] implements the same hybrid approach as H.264 [27] which includes both temporal and spatial predictions.

• HEVC divides the image into varying block sizes up to 64x64 pixels as compared to 16x16 pixels in H.264.

• It aims at 50% compression gain over H.264 while maintaining similar video quality

Page 6: Parallel optimization of intra mode selection in HEVC using Open MP

HEVC Encoder

Page 7: Parallel optimization of intra mode selection in HEVC using Open MP

HEVC Decoder

Page 8: Parallel optimization of intra mode selection in HEVC using Open MP

Intra Prediction• HEVC contains several elements for improving the efficiency of

intra prediction over earlier approaches.

• The introduced methods can model accurately different structures as well as smooth regions with gradually changing sample values.

• HEVC introduces 33 angular prediction modes along with planar and DC prediction modes.

• In HEVC there are four effective intra prediction block sizes ranging from 4 by 4 to 32 by 32 samples, each of which supports 33 distinct prediction directions.

Page 9: Parallel optimization of intra mode selection in HEVC using Open MP

Intra Prediction

An example of angular prediction when operating on sixth row of an 8x8 block

Page 10: Parallel optimization of intra mode selection in HEVC using Open MP

Types of partitioning of intra coded CU into PUs [54]

Intra Prediction

Page 11: Parallel optimization of intra mode selection in HEVC using Open MP

Intra Prediction in H.264

H.264 intra prediction modes

Page 12: Parallel optimization of intra mode selection in HEVC using Open MP

Intra Prediction in HEVC

HEVC intra prediction modes

Page 13: Parallel optimization of intra mode selection in HEVC using Open MP

Intra Prediction modes in HEVC

• HEVC has 33 different angular modes, while mode 0 and mode 1 are termed as planar and DC mode respectively

• The introduced methods can model accurately different structures as well as smooth regions with gradually changing sample values.

Page 14: Parallel optimization of intra mode selection in HEVC using Open MP

Intra Prediction modes in HEVC

Size of a PB Number of PBs in 64x64 CU

Number of modes to be tested in one PB

Total number of modes to be tested at this level

32x32 4 35 140

16x16 16 35 560

8x8 64 35 2240

4x4 256 35 8960

Total 11,900

Current problem- complexity and encoded time [74]

Page 15: Parallel optimization of intra mode selection in HEVC using Open MP

Intra mode decision in HEVC• In H.264/AVC mode coding approach, a single most probable

mode was derived based on relative RD cost value

• HEVC defines three most probable modes for each PU based on the modes of the neighboring PUs.

• The algorithm is based on deciding the best possible subset of modes that yield the smallest sum of absolute transformed differences (SATD) between original pixels and predicted pixels

Page 17: Parallel optimization of intra mode selection in HEVC using Open MP

Intra mode decision in HEVC

Page 18: Parallel optimization of intra mode selection in HEVC using Open MP

Points to be considered

1. There is no inter thread dependency. 2. Restructure the part of code which is shared by multiple

threads so that there is no dependency. Also if possible mark those variable private or make use of locks.

3. For every loop you parallelize, check whether work is equally distributed between each of the iterations.

4. Beware of deadlocks5. Use tools such as profiler which will help to find the hotspot

in your program.6. Determine optimum number of threads to be used

according to number of processor.

Page 19: Parallel optimization of intra mode selection in HEVC using Open MP

Proposed Algorithm

1. Sum of absolute differences is calculated using parallel approach in which the each thread is assigned different number of modes to find best possible subset of N modes corresponding to different Pus.

2. MPM (3 best modes) are included as additional candidates

3. Calculate RD cost of subset of N modes and MPM

4. Calculate best mode with the least RD cost

Page 20: Parallel optimization of intra mode selection in HEVC using Open MP

Profiler instrumentation data

Page 21: Parallel optimization of intra mode selection in HEVC using Open MP

Serial processing of intra prediction modes in HEVC [75]

Page 22: Parallel optimization of intra mode selection in HEVC using Open MP

Proposed algorithm for mode decision [75]

Page 23: Parallel optimization of intra mode selection in HEVC using Open MP

Proposed algorithm for mode decision• A parallel approach has been implemented to reduce the

encoding time required for decision making process.

• It aims at dividing the modes among multiple threads and calculating RD cost values and other decision parameters corresponding to each thread simultaneously and then comparing it.

• A maximum of two threads are run in parallel on two different cores to eliminate data dependency.

• It had been ensured to reducing data dependency in other functions which relate to work sharing among threads during decision making process.

Page 24: Parallel optimization of intra mode selection in HEVC using Open MP

What is Open MP ?• Open MP is basically an abbreviation for ‘Open Multi-

Processing’

• An Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared memory parallelism

• The OpenMP API is intended to support programs that will execute correctly both as parallel programs and as sequential programs

• Comprised of three primary API components:• Compiler Directives• Runtime Library Routines• Environment Variables

Page 25: Parallel optimization of intra mode selection in HEVC using Open MP

General code structure in C/C++#include <omp.h> main () { int var1, var2, var3;#pragma omp parallel private(var1, var2) shared(var3) { Parallel section executed by all threads . Other OpenMP directives . Run-time Library calls . All threads join master thread and disband } Resume serial code . . . }

Page 26: Parallel optimization of intra mode selection in HEVC using Open MP

Open MP components

Page 27: Parallel optimization of intra mode selection in HEVC using Open MP

Multi threading in Open MP

Fork join parallelism in Open MP

Page 28: Parallel optimization of intra mode selection in HEVC using Open MP

Results• Performance was evaluated using HM 9.1 software [11]• Intra main profile is used for coding with all intra and frame rate is

set to 50 fps.• Proposed algorithm is evaluated with 4 QPs of 22,27,32 and 37 using

standard test sequences [15] recommended by JCT-VC• Total 30 frames were used for each video test sequence

No. Sequence Name Resolution Type Approx. size

(MB)

No. of frames

1 RaceHorses_416x240_30.yuv 416x240 WQVGA 44 30

2 BasketballDrillText_832x480_50.yuv 832x480 WVGA 294 30

3 SlideEditing_1280x720_30.yuv 1280x720 SD 405 30

4 Kimono1_1920x1080_24.yuv 1920x1080 HD 729 30

5 PeopleOnStreet_2560x1600_30.yuv 2560x1600 WQHD 900 30

Page 29: Parallel optimization of intra mode selection in HEVC using Open MP

Test Sequences

Page 30: Parallel optimization of intra mode selection in HEVC using Open MP

Encoding time gain

22 27 32 370

500

1000

1500

2000

2500

3000

3500

4000

2746.764

3713.0743420.087

1422.315

2297.383

2876.004 2814.824

1289.585

Original

Proposed

BasketballDrillText_832x480_50

22 27 32 370

200

400

600

800

1000

1200

498.279442.88

965.389

421.117481.626

343.807

850.714

377.409

Original

Proposed

RaceHorses_416x240_30

22 27 32 370

1000

2000

3000

4000

5000

6000

3636.988 3487.9153177.193

5276.146

2076.008

2574.293

1356.849

4249.525

OriginalProposed

SlideEditing_1280x720_30

22 27 32 370

2000

4000

6000

8000

10000

12000

10408.007

7651.664

6187.364

7402.71

5225.91 5163.971

3774.613275.808

Original

Proposed

Kimono1_1920x1080_24

Enc

odin

g ti

me

(sec

)

Enc

odin

g ti

me

(sec

)

Enc

odin

g ti

me

(sec

)

Enc

odin

g ti

me

(sec

)

QP QP

QP QP

Page 31: Parallel optimization of intra mode selection in HEVC using Open MP

Encoding time gain

22 27 32 370

1000

2000

3000

4000

5000

6000

3636.9883487.915

3177.193

5276.146

2076.008

2574.293

1356.849

4249.525

OriginalProposed

PeopleOnStreet_2560x1600_30E

ncod

ing

time

(sec

)

QP

Page 32: Parallel optimization of intra mode selection in HEVC using Open MP

Percentage reduction in encoding time

22 27 32 37

-25.00

-20.00

-15.00

-10.00

-5.00

0.00

-3.34

-22.37

-11.88-10.38

Original vs Proposed

RaceHorses_416x240_30

22 27 32 37

-40.00

-35.00

-30.00

-25.00

-20.00

-15.00

-10.00

-5.00

0.00

-36.26

-22.54

-17.70

-9.33

BasketballDrillText_832x480_50

Original vs Proposed

22 27 32 37

-70.00

-60.00

-50.00

-40.00

-30.00

-20.00

-10.00

0.00

-42.92

-26.19

-57.29

-19.46

Original vs Proposed

SlideEditing_1280x720_30

22 27 32 37

-60.00

-50.00

-40.00

-30.00

-20.00

-10.00

0.00

-20.91

-32.51-38.99

-55.75

Kimono1_1920x1080_24

Original vs Proposed

% d

ecre

ase

in e

ncod

ing

tim

e

% d

ecre

ase

in e

ncod

ing

tim

e

QPQP

QPQP

% d

ecre

ase

in e

ncod

ing

tim

e

% d

ecre

ase

in e

ncod

ing

tim

e

Page 33: Parallel optimization of intra mode selection in HEVC using Open MP

Percentage reduction in encoding time

22 27 32 37

-70.00

-60.00

-50.00

-40.00

-30.00

-20.00

-10.00

0.00

-18.95-20.85

-59.89

-38.64

Original vs Proposed

PeopleOnStreet_2560x1600_30

% d

ecre

ase

in e

ncod

ing

tim

e

QP

Page 34: Parallel optimization of intra mode selection in HEVC using Open MP

BD-PSNR

22 27 32 37

-0.2-0.18-0.16-0.14-0.12

-0.1-0.08-0.06-0.04-0.02

0

-0.1696-0.1832

-0.0459 -0.051

Original vs Proposed

RaceHorses_416x240_30

QPQPQPQP

22 27 32 37

-0.25

-0.2

-0.15

-0.1

-0.05

0

-0.0359

-0.1408

-0.1897

-0.2357Original vs Proposed

SlideEditing_1280x720_30

QP

22 27 32 37

-0.18

-0.16

-0.14

-0.12

-0.1

-0.08

-0.06

-0.04

-0.02

0

-0.0459

-0.13

-0.1079

-0.1552

Kimono1_1920x1080_24

Original vs ProposedQP

BD

-PS

NR

(dB

)

BD

-PS

NR

(dB

)B

D-P

SN

R (

dB)

BD

-PS

NR

(dB

)

22 27 32 37

-0.35

-0.3

-0.25

-0.2

-0.15

-0.1

-0.05

0-0.0108

-0.1279

-0.2519

-0.2927

BasketballDrillText_832x480_50

Original vs Proposed

Page 35: Parallel optimization of intra mode selection in HEVC using Open MP

BD-PSNR

22 27 32 37

-0.45

-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

-0.1

-0.05

0

-0.0168

-0.3903

-0.1959

-0.2727

Original vs Proposed

PeopleOnStreet_2560x1600_30

BD

-PSN

R (

dB)

QP

Page 36: Parallel optimization of intra mode selection in HEVC using Open MP

BD- bitrate

22 27 32 370

0.5

1

1.5

2

2.5

3

3.5

1.3172

2.1152.5314

2.8799

Original vs Proposed

RaceHorses_416x240_30

BD

-Bitr

ate

(kbp

s)

QP

22 27 32 370

0.5

1

1.5

2

2.5

3

3.5

1.89732.1516

2.65112.974

Original vs Proposed

SlideEditing_1280x720_30

BD

-Bitr

ate

(kbp

s)

QP

22 27 32 370

0.5

1

1.5

2

2.5

0.2843

0.9519

1.6041

2.228

Kimono1_1920x1080_24

Original vs ProposedQP

BD

-Bit

rate

(kb

ps)

BD

-Bit

rate

(kb

ps)

BD

-Bit

rate

(kb

ps)

QP22 27 32 37

00.5

11.5

22.5

33.5

44.5

5

0.3959

1.6653

3.3677

4.4054

BasketballDrillText_832x480_50

Original vs Proposed

Page 37: Parallel optimization of intra mode selection in HEVC using Open MP

BD- bitrate

22 27 32 370

1

2

3

4

5

6

1.3127

2.43617

3.5837

4.9853

Original vs Proposed

PeopleOnStreet_2560x1600_30B

D-B

itra

te (

kbps

)

QP

Page 38: Parallel optimization of intra mode selection in HEVC using Open MP

Rate distortion plot

500 1000 1500 2000 2500 3000 3500 4000 4500 5000 550030

32

34

36

38

40

42

44

OriginalProposed

PS

NR

(dB

)

Bitrate (kbps)

0 5000 10000 15000 20000 2500030

32

34

36

38

40

42

44

OriginalProposed

RaceHorses_416x240_30

PS

NR

(dB

)

Bitrate (kbps)

10000 15000 20000 25000 30000 35000 4000032

34

36

38

40

42

44

46

Orig-inalPro-posed

SlideEditing_1280x720_30

PS

NR

(dB

)

Bitrate (kbps) 20004000

60008000

1000012000

1400016000

1800020000

32

34

36

38

40

42

44

46

OriginalProposed

Kimono1_1920x1080_24

PS

NR

(dB

)

Bitrate (kbps)

Page 39: Parallel optimization of intra mode selection in HEVC using Open MP

Rate distortion plot

10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 11000032

34

36

38

40

42

44

46

OriginalProposed

PeopleOnStreet_2560x1600_30

Bitrate (kbps)

PS

NR

(dB

)

Page 40: Parallel optimization of intra mode selection in HEVC using Open MP

Encoded bit stream size

22 27 32 370

200000

400000

600000

800000

1000000

1200000

1400000

1600000

1800000

1558651

865639

478681

272846

1558890

873784

487692

278709

Original

Proposed

BasketballDrillText_832x480_50

Enc

oded

bit

stre

am s

ize

(Kb)

QP

Page 41: Parallel optimization of intra mode selection in HEVC using Open MP

Encoded bit stream size

22 27 32 370

100000

200000

300000

400000

500000

600000

700000

629610

392677

225283

120599

634061

395992

228033

122480

OriginalProposed

RaceHorses_416x240_30

Enc

oded

bit

stre

am s

ize

(Kb)

QP

Page 42: Parallel optimization of intra mode selection in HEVC using Open MP

22 27 32 370

500000

1000000

1500000

2000000

2500000

3000000

3500000

4000000

4500000

5000000

4291504

3205925

2432408

1815529

4343397

3251155

2466488

1855430

Original

Proposed

SlideEditing_1280x720_30

Enc

oded

bit

stre

am s

ize

(Kb)

QP

Encoded bit stream size

Page 43: Parallel optimization of intra mode selection in HEVC using Open MP

22 27 32 370

500000

1000000

1500000

2000000

2500000

30000002844373

1634344

977321

577370

2854385

1641297

983985

583038

OriginalProposed

Kimono1_1920x1080_24

Enc

oded

bit

stre

am s

ize

(Kb)

QP

Encoded bit stream size

Page 44: Parallel optimization of intra mode selection in HEVC using Open MP

22 27 32 370

2000000

4000000

6000000

8000000

10000000

12000000

1400000013010500

7538987

4283144

2493804

13123471

7633611

4351610

2550682

OriginalProposed

PeopleOnStreet_2560x1600_30

Enc

oded

bit

stre

am s

ize

(Kb)

QP

Encoded bit stream size

Page 45: Parallel optimization of intra mode selection in HEVC using Open MP

Conclusions• Parallel approach was used to optimize mode decision

process for intra prediction in HEVC.

• We can achieve approximately 35-40% reduction in encoding time on an average as compared to HM 9.1 encoder.

• There is negligible drop in image quality with slight increase in bitrate for different quantization parameter values on various test sequences

Page 46: Parallel optimization of intra mode selection in HEVC using Open MP

Future work • In parallel computing model like Computer Unified Device

Architecture (CUDA) [82] invented by NVIDIA, the thread creation overhead can be reduced as CUDA threads are extremely light weight, with very low creation overheads and switching time.

• The use of POSIX threads (pthreads) [84] provides thread pool which reduces overhead in thread creation as they already pre-allocated before the master thread begins dispatching threads to work. The tasks are processed in order, usually faster and can be done by creating a thread per task [83].

Page 47: Parallel optimization of intra mode selection in HEVC using Open MP

[1] G.J. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE Trans. CSVT, vol. 22, pp.1649-1668, Dec.2012. [2] C.C.Chi et al, “Parallel scalability and efficiency of HEVC parallelization approaches”, IEEE Trans. CSVT, vol. 22, pp.1827-1838, Dec.2012.

[3] J. Lainema et al, ”Intra coding of the HEVC standard”, IEEE Trans. CSVT,vol.22, pp.1792-1801, Dec.2012. [4] F. Bossen et al, “HEVC complexity and implementation analysis”, IEEE Trans. CSVT, vol. 22, pp.1685-1696, Dec.2012. [5] P. Hanhart et al, “Subjective quality evaluation of the upcoming HEVC video compression standard” SPIE Applications of digital image processing XXXV, vol.8499, pp.8499-30, Aug.2012. [6] J.-R Ohm, et al, “Comparison of the coding efficiency of video coding standards- including high efficiency video coding (HEVC)” , IEEE Trans. CSVT , vol.22, pp.1669-1684, Dec.2012. [7] X. Zhang, S. Liu and S. Lei, ”Intra mode coding in HEVC standard”, Visual Communications and Image Processing, VCIP 2012, pp. 1-6, San Diego, CA, Nov.2012. [8] Y.Duan, “An optimized real time multi-thread HEVC decoder”, Visual Communications and Image Processing, VCIP 2012, pp.1, San Diego, CA, Nov.2012. [9] G. Correa et al, “Performance and computational complexity assessment of high efficiency video encoders”, IEEE Trans. CSVT, vol.22, pp.1899-1909, Dec.2012. [10] A.Saxena, F. Fernandes and Y. Reznik, ”Fast transforms for intra-prediction-based image and video coding,” in Proc. IEEE Data Compression Conference (DCC’13), pp.13-22, Snowbird, UT, Mar.2013. [11]HEVC open source software (encoder/decoder) https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-9.1-dev/

References

Page 48: Parallel optimization of intra mode selection in HEVC using Open MP

References[12] Introduction to parallel computing https://computing.llnl.gov/tutorials/parallel_comp/#Whatis [13] Information about quad tree structure of HEVC http://codesequoia.wordpress.com/2012/10/28/hevc-ctu-cu-ctb-cb-pb-and-tb/

[14] Guide into OpenMP: Easy multithreading programming for C++ http://bisqwit.iki.fi/story/howto/openmp/ [15] Website for downloading test sequence for research purposeshttp://media.xiph.org/video/derf/ [16] Information on developments in HEVC NGVC- Next generation video coding http://bisqwit.iki.fi/story/howto/openmp/ [17] F. Bossen, D. Flynn and K. Suhring (July 2012), “HEVC reference software manual” online available: http://phenix.int-evry.fr/jct/doc_end_user/documents/6_Torino/wg11/JCTVC-F634-v2.zip [18] JCT-VC documents are publicly available at http://ftp3.itu.ch/av-arch/jctvc-site and http://phenix.it-sudparis.eu/jct/ [19] Information about basics of video compressionhttp://desktopvideo.about.com/od/videoonyourwebsite/qt/compress.htm [20] Video research projecthttp://www1.curriculum.edu.au/videoresearch/compression.htm [21] Video compression basics http://wolfcrow.com/blog/what-is-video-compression/

[

Page 49: Parallel optimization of intra mode selection in HEVC using Open MP

References [22] T.L Silva et al, ”HEVC intra coding acceleration based on tree inter-level mode correlation”, SPA 2013 Sep.2013, Poznan, Poland

[23] A. Saxena and F. Fernanades, “Mode dependent DCT/DST for intra prediction in block based image/video coding”, IEEE ICIP, pp. 1685-1688, Sept. 2011.

[24] H. Zhang and Z. Ma, ”Fast intra prediction for high efficiency video coding ”, IEEE Trans. CSVT, vol. 24, pp.660-668, April 2014. [25] M. Zhang, C. Zhao and J. Xu, ”An adaptive fast intra mode decision in HEVC ”, IEEE ICIP 2012, pp.221-224, Orlando, FL, Sept.- Oct.2012. [25] K. Chen et al, ”Efficient SIMD optimization of HEVC encoder over X86 processors”, APSIPA, pp. 1732-1745, Los Angeles, CA, Dec. 2012. [26] Y. Kim et al, “A fast intra-prediction method in HEVC using rate-distortion estimation based on Hadamard transform”, ETRI Journal, vol.35, #2, pp.270-280, Apr.2013. [27] T. Wiegand et al., ”Overview of the H.264/AVC Video Coding Standard”, IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 560-576, Jul.2003. [28] M. Khan et al, “An adaptive complexity reduction scheme with fast prediction unit decision for HEVC Intra encoding”, IEEE ICIP, pp. 1578-1582, Sept. 2013.

Page 50: Parallel optimization of intra mode selection in HEVC using Open MP

References[29] ITU-T, “Video Codec for Audiovisual Services at px64 Kbit/s”, ITU-T Recommendation H.261, Version 1: Nov. 1990; Version 2: Mar. 1993. [30] ITU-T, “Video coding for low bit rate communication”, ITU-T Recommendation H.263, Nov. 1995 (and subsequent editions)

[31] ISO/IEC JTC 1, “Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s – Part 2: Video”, ISO/IEC 11172 (MPEG-1),1993. [32] ISO/IEC JTC 1, “Coding of audio-visual objects – Part 2: Visual”, ISO/IEC 14496-2 (MPEG-4 Visual version 1), April 1999 (and subsequent editions). [33] ITU-T and ISO/IEC JTC 1, “Generic coding of moving pictures and associated audio information- Part 2: Video”, ITU-T Recommendation H.262 and ISO/IEC 13818-2 (MPEG @ Video), Nov. 1994. [34] ITU-T and ISO/IEC JTC 1, “Advanced Video Coding for generic audio-visual services”, ITU-T Recommendation H.264 and ISO/IEC 14496-10 (AVC), May 2003 (and subsequent editions). [35] K.R. Rao, D.N. Kim and J.J. Hwang, “Video coding standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.

Page 51: Parallel optimization of intra mode selection in HEVC using Open MP

References[36] Generic quadtree based approach for block partitioninghttp://www.hhi.fraunhofer.de/fields-of-competence/image-processing/research-groups/image-video-coding/hevc-high-efficiency-video-coding/generic-quadtree-based-approach-for-block-partitioning.html [37] Information about context based binary arithmetic codinghttp://www.hhi.fraunhofer.de/de/kompetenzfelder/image-processing/research-groups/image-video-coding/statistical-modeling-coding/context-based-adaptive-binary-arithmetic-coding-cabac.html [38] Basics of videohttp://lea.hamradio.si/~s51kq/V-BAS.HTM [39] Circuit implementation of High-Efficiency video coding toolshttp://dspace.mit.edu/bitstream/handle/1721.1/75691/820028934.pdf?sequence=1 [40] A. Norkin et al., ”HEVC Deblocking Filter”, IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1746-1754, Dec.2012.A. Norkin et al., ”Corrections to HEVC Deblocking Filter”, IEEE Trans. Circuits Syst. Video Technol., vol. 23, no. 12, pp. 2141, Dec.2013. [41] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC- J0292r1, July 2012.

Page 52: Parallel optimization of intra mode selection in HEVC using Open MP

References[43] G.J. Sullivan et al, “Standardized Extension of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected topics in Signal processing, vol. 7, no. 6, pp.1001-1016, Dec.2013.

[44] Special issue on emerging research and standards in next generation video coding, IEEE Trans. CSVT, vol.22, pp.1646-1909, Dec. 2012. [45] G.J. Sullivan, “HEVC; The next generation in video compression”. Keynote speech, Visual Communications and Image Processing, VCIP 2012, San Diego, CA, 27-30 Nov.2012.

[46] M.T Pourazad et al,”HEVC: The new gold standard for video compression”, IEEE CE magazine vol.1, issue 3, pp. 36-46, July 2012.

[47] J-R. Ohm, T. Wiegand and G.J. Sullivan, “Video coding progress; The high efficiency video coding (HEVC) standard and its future extensions”, IEEE ICASSP, Tutorial, Vancouver, Canada, 2013. [48] Several papers on HEVC in the poster session IVMSP-P3: Video coding II, IEEE ICASSP 2013, Vancouver, Canada, June 2013. [49] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html [50] JCT-VC documents can be found in JCT-VC document management system http://phenix.int-evry.fr/ict

Page 53: Parallel optimization of intra mode selection in HEVC using Open MP

References[51] VCEG & JCT documents available fromhttp://wftp3.itu.int/av-arch in the video-site and jvt-site folders

[52] HEVC encoded bit streamsftp://ftp.kw.bbc.co.uk/hevc/hm-11.0-anchors/bitstreams/ [53] J. Lainema and K. Ugur, “Angular Intra Prediction in High Efficiency Video Coding”, IEEE 13th International Workshop on Multimedia Signal Processing (MMSP) , pp. 1-5, Oct. 2011

[54] T. Silva, L.Agostini and L.Cruz, “Fast HEVC Intra Prediction Mode Decision Based on Edge Direction Information”, 20th European Signal Processing Conference (EUSIPCO 2012), pp. 1214-1218, Aug. 2012. [55] High Efficiency Video Coding Test Model Available in: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-4.0/,2011 [56] J. Sole et al, “Transform Coefficient Coding in HEVC”, IEEE Trans. CSVT, vol. 22, no.12 pp.1765-1777, Dec.2012. [57] Introductions to parallel programming https://computing.llnl.gov/tutorials/parallel_comp/

Page 54: Parallel optimization of intra mode selection in HEVC using Open MP

References[58] Introduction to parallel programming and MapReducehttps://courses.cs.washington.edu/courses/cse490h/07wi/readings/IntroductionToParallelProgrammingAndMapReduce.pdf [59] Information about shared memoryhttp://www.csl.mtu.edu/cs4411.ck/www/NOTES/process/shm/what-is-shm.html [60] Definition of threadhttp://www.techopedia.com/definition/27857/thread [61] Information about message passing interfacehttp://www.hpcvl.org/faqs/programming/mpi-message-passing-interface [62] Information about Data parallel programming modelhttp://insidehpc.com/2006/03/what-is-data-parallel-programming/

[63] Information about Load imbalance https://software.intel.com/en-us/articles/load-balancing-between-threads [64] Description about race around condition and deadlocks http://support.microsoft.com/kb/317723

Page 55: Parallel optimization of intra mode selection in HEVC using Open MP

References[65] Information and example for deadlockhttp://www.roseindia.net/java/thread/deadlocks.shtml [66] Open MP componentshttp://www.capsl.udel.edu/courses/cpeg421/2012/slides/openmp_tutorial_04_06_2012.pdf [67] OpenMP Run time library routineshttps://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-D3FC1F0B-DD99-4176-B9B5-56EEE72E81A7.htm [68] OpenMP environment variablehttp://msdn.microsoft.com/en-us/library/6sfk977f.aspx [69] OpenMP tutorialhttps://computing.llnl.gov/tutorials/openMP/#Abstract [70] Information about OpenMP clauseshttps://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/optaps/common/optaps_par_dirs.htm [71] Information regarding OpenMP environment variableshttp://msdn.microsoft.com/en-us/library/tt15eb9t.aspx

Page 56: Parallel optimization of intra mode selection in HEVC using Open MP

References[72] Official website for information on OpenMPhttp://openmp.org/wp/

[72] Official website for information on OpenMPhttp://openmp.org/wp/ [73] D.P. Kumar, “Intra Frame Luma Prediction using Neural Networks in HEVC”, website: www.uta.edu/faculty/krrrao/dip ,Thesis, University of Texas at Arlington, UMI Dissertation Publishing, May 2013. [74] M.Mrak, A. Gabriellini and D.Flynn, “Parallel processing for combined intra prediction in high efficiency video coding”, IEEE ICIP, pp.3489 -3492, Sept. 2011. [75] J.Rehman and Y. Zhang,”Fast Intra Prediction mode decision using parallel processing”, Proceedings of the fourth international conference on machine learning and cybernetics, pp.5094-5098, Aug. 2005 [76]Overhead in openMP parameters http://www.embedded.com/design/mcus-processors-and-socs/4007155/Using-OpenMP-for-programming-parallel-threads-in-multicore-applications-Part-2

[77] X. Li et al, “Rate-Complexity-Distortion evaluation for hybrid video coding”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, pp. 957- 970, July 2011.

Page 57: Parallel optimization of intra mode selection in HEVC using Open MP

References[78] V.Sze, M.Budagavi, G.J. Sullivan, “High Efficiency Video Coding (HEVC)- Algorithm and Architectures”, Springer, 2014.

[79] Spiros N Agathos, P. Hadjidoukas and V. Dimakopoulos,“Task based execution of Nested OpenMP loops”, Springer, 2012. [80] R.Chandra et al, “Parallel programming in OpenMP”, Academic Press, 2001. [81] R.Eigenmann, B. Supinski, “OpenMP in a New Era of Parallelism”, 4th International Workshop, IWOMP 2008 West Lafayette, IN, USA proceedings, Springer 2008. [82] Getting Started with CUDA http://www.nvidia.com/content/cudazone/download/Getting_Started_w_CUDA_Training_NVISION08.pdf[83] Multithreaded programming guidehttp://docs.oracle.com/cd/E19253-01/816-5137/ggedn/index.html

[84] Information about Pthreadhttp://pubs.opengroup.org/onlinepubs/007908775/xsh/pthread.h.html

Page 58: Parallel optimization of intra mode selection in HEVC using Open MP

References[85] H. Jain,” Fast intra mode decision in high efficiency video coding”, M.S. Thesis, EE Dept., UTA, Aug 2014. Access from www.uta.edu/faculty/krrrao/dip [86] S. Gangavati, ”Complexity reduction of H.264 using parallel programming”, M.S. Thesis,EE Dept.,UTA, Dec 2012. Access from www.uta.edu/faculty/krrrao/dip [87] T. Saxena, “Reducing the encoding time for H.264 baseline profile using parallel programming techniques”, M.S. Thesis, EE Dept., UTA, June 2012. Access from www.uta.edu/faculty/krrrao/dip