A Comparative Study of Depth Map Coding Schemes for 3D Video

Harsh Nayyar, Nirabh Regmi, Audrey Wei

March 10th, 2011

EE 398A: Image and Video Compression

Professor Girod

Overview

• Background & Motivation
• Research Methodology
• Results & Performance Comparisons
  – Block Transforms (DCT, KLT)
  – Block Truncation Coding (BTC)
• Conclusion
• Questions


Background & Motivation

• 3D Compression
  – Issue: Bit rate scales linearly with the number of views
  – Proposed solution: Code 2-3 views along with depth maps to synthesize intermediate views [Wiegand et al.]
    • Requires good depth maps
• Depth Maps
  – Desirable to preserve edges
  – Not typical images


Research Methodology

• Block Transform Coding
  – DCT and KLT
• Block Truncation Coding
  – Constant and adaptive block sizes
• Distortion is measured on the synthesized view, with the view synthesized from the uncompressed depth maps as the reference
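The distortion measure above can be sketched as a PSNR between two synthesized views. This is a minimal illustration, not the authors' code: the function name `psnr`, the 8-bit peak value, and the toy arrays standing in for the two synthesized views are all assumptions.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized images (8-bit peak assumed)."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical stand-ins: in the study these would be the Camera 2 view
# synthesized from uncompressed depth maps and from coded depth maps.
rng = np.random.default_rng(0)
view_from_original_depth = rng.integers(0, 200, size=(64, 64)).astype(np.float64)
view_from_coded_depth = view_from_original_depth.copy()
view_from_coded_depth[:8, :8] += 4.0   # pretend depth coding errors shifted one patch

print("PSNR (dB):", psnr(view_from_original_depth, view_from_coded_depth))
```

Measuring on the synthesized view rather than on the depth map itself means a depth error only counts to the extent that it changes the rendered image.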


System Overview

[Figure: block diagram. Inputs: Left Image, Right Image, (Compressed) Left Depth Map, (Compressed) Right Depth Map → View Synthesis → Intermediate Image]


Evaluation Methodology

• Test Sequences: Balloons & Kendo
• Depth Maps: Cameras 1 & 3
• Synthesized Views: Camera 2

Acknowledgement: Tanimoto Lab, Nagoya University


Discrete Cosine Transform (DCT)

• Block Matrix Sizes: M = 8, 16
• Uniform Quantizer
  – Step Sizes: $2^1$ – $2^8$
• Entropy Coding
• Type used: DCT-II
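The block DCT pipeline on this slide can be sketched as follows. The slide only says "uniform quantizer"; the mid-tread rounding used here, the helper names `dct2_matrix` and `code_block`, and the toy edge block are assumptions for illustration. Entropy coding of the indices is omitted.

```python
import numpy as np

def dct2_matrix(M):
    """Orthonormal M x M DCT-II matrix; each row is one basis vector."""
    n = np.arange(M)
    k = n.reshape(-1, 1)
    C = np.sqrt(2.0 / M) * np.cos(np.pi * (2 * n + 1) * k / (2 * M))
    C[0, :] /= np.sqrt(2.0)
    return C

def code_block(block, C, step):
    """Separable 2-D DCT, uniform quantization, and reconstruction of one block."""
    coeffs = C @ block @ C.T
    # Mid-tread rounding is an assumption; these indices would be entropy coded.
    indices = np.round(coeffs / step).astype(int)
    recon = C.T @ (indices * step) @ C
    return indices, recon

M, step = 8, 2 ** 4                      # one of the slide's block sizes and step sizes
C = dct2_matrix(M)
block = np.full((M, M), 50.0)
block[:, M // 2:] = 200.0                # toy depth block with a vertical edge
indices, recon = code_block(block, C, step)

print("nonzero coefficients:", np.count_nonzero(indices))
print("max abs error:", np.abs(recon - block).max())
```

A sharp depth edge spreads energy over several horizontal frequencies, so coarse quantization tends to produce ringing near the edge.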


Discrete Cosine Transform (cont.)

[Figure: quantizer step size = $2^8$]

[Figure: Balloons error, M = 8, Q = 128]


Karhunen-Loeve Transform (KLT)

• Block Matrix Sizes: M = 8, 16
• Uniform Quantizer
  – Step Sizes: $2^1$ – $2^8$
• Entropy Coding
• Training Set: composed from both views

[Figure: training-set construction from both views; dimension labels as extracted: M x M, m x n x p, M, 2mnp, M]
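One common way to obtain a KLT basis from a training set is to take the eigenvectors of the sample covariance. The slide does not say how the training matrix is arranged, so this sketch makes its own choice: each flattened M x M block from both views is one training vector. The function names and the synthetic "depth maps" are hypothetical.

```python
import numpy as np

def blocks_as_rows(image, M):
    """Cut an image into non-overlapping M x M blocks, one flattened block per row."""
    h, w = image.shape
    h, w = h - h % M, w - w % M
    tiles = image[:h, :w].reshape(h // M, M, w // M, M).swapaxes(1, 2)
    return tiles.reshape(-1, M * M).astype(np.float64)

def klt_basis(training_rows):
    """Mean, basis (rows = eigenvectors), and eigenvalues in decreasing order."""
    mean = training_rows.mean(axis=0)
    cov = np.cov(training_rows, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]
    return mean, eigvecs[:, order].T, eigvals[order]

# Synthetic stand-ins for the left and right depth maps (assumption).
rng = np.random.default_rng(0)
ramp = np.add.outer(np.arange(32.0), 2.0 * np.arange(32.0))
left_depth = ramp + rng.normal(0, 1, ramp.shape)
right_depth = ramp[:, ::-1] + rng.normal(0, 1, ramp.shape)

M = 4
training = np.vstack([blocks_as_rows(left_depth, M),
                      blocks_as_rows(right_depth, M)])   # training set from both views
mean, basis, eigvals = klt_basis(training)

coeffs = (training - mean) @ basis.T      # forward KLT
recon = coeffs @ basis + mean             # inverse KLT (before any quantization)
print("share of variance in first coefficient:", eigvals[0] / eigvals.sum())
```

Unlike the DCT, the basis depends on the training data, so it would have to be known to the decoder as well.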


Karhunen-Loeve Transform (cont.)

[Figure: Balloons error, M = 8, Q = 128]


Block Truncation Coding (BTC)

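• Good at preserving edges
• Quantized values per block: a & b

  $a = \bar{X} - \sigma \sqrt{\dfrac{q}{m - q}}$

  $b = \bar{X} + \sigma \sqrt{\dfrac{m - q}{q}}$

  if $X_i \le X_{th}$, output = a

  if $X_i > X_{th}$, output = b

  for $i = 1, 2, \ldots, M^2$, where $X_{th} = \bar{X}$ and q = # of $X_i$'s > $X_{th}$

  ($\bar{X}$ and $\sigma$ are the block mean and standard deviation; $m = M^2$ is the number of pixels in the block)

• Block Matrix Sizes: M = 2, 4, 8, 16, 32, 64
• Entropy Coding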
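The two-level quantizer can be sketched for a single block as below, following the moment-preserving BTC of Delp and Mitchell with the block mean as threshold. The function names are illustrative, a and b are kept as floats (the slides quantize and entropy code them), and the handling of flat blocks is an assumption.

```python
import numpy as np

def btc_encode(block):
    """Return (a, b, bitmap) for one block; bitmap is True where X_i > mean."""
    x = np.asarray(block, dtype=np.float64)
    m = x.size
    mean, sigma = x.mean(), x.std()
    bitmap = x > mean
    q = int(bitmap.sum())
    if q == 0 or q == m:          # flat block: one level is enough (assumed handling)
        return mean, mean, bitmap
    a = mean - sigma * np.sqrt(q / (m - q))
    b = mean + sigma * np.sqrt((m - q) / q)
    return a, b, bitmap

def btc_decode(a, b, bitmap):
    """Rebuild the block: b where the bitmap is set, a elsewhere."""
    return np.where(bitmap, b, a)

# A toy 4 x 4 depth block with a vertical edge between two depth levels.
edge_block = np.full((4, 4), 50.0)
edge_block[:, 2:] = 200.0
a, b, bitmap = btc_encode(edge_block)
recon = btc_decode(a, b, bitmap)
print("a =", a, "b =", b, "max abs error =", np.abs(recon - edge_block).max())
```

A block that contains only two depth levels is reproduced exactly, which is one way to see why BTC suits the sharp edges in depth maps.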


Block Truncation Coding (cont.)

[Figure: comparison of M = 8 and M = 4, annotated with a gap of ~1.1 dB]

[Figure: Balloons error, M = 64]

Adaptive BTC

• Spend bits where necessary
  – Large blocks handle background (low rate)
  – Small blocks handle edges (high rate)
• Make block size selection based on Lagrangian cost function
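A block-size decision driven by a Lagrangian cost J = D + λR can be sketched as a quadtree comparison: code a block whole, or split it into four and keep whichever costs less. This is not the authors' implementation. Two simplifications are assumed and marked in the code: distortion is the squared error on the depth block itself (the slides measure it on the synthesized view, jointly for both depth maps), and the rate is a fixed raw bit count instead of entropy-coded symbols.

```python
import numpy as np

SIZE_BITS, LEVEL_BITS = 3, 16    # assumed: 3 bits for block size, 8 bits each for a and b

def btc_reconstruct(x):
    """Two-level moment-preserving BTC reconstruction of one block."""
    mean, sigma = x.mean(), x.std()
    bitmap = x > mean
    q, m = int(bitmap.sum()), x.size
    if q == 0 or q == m:
        return np.full(x.shape, mean)
    a = mean - sigma * np.sqrt(q / (m - q))
    b = mean + sigma * np.sqrt((m - q) / q)
    return np.where(bitmap, b, a)

def choose_blocks(x, top, left, lam, m_min=2):
    """Return (cost, leaves) minimizing J = D + lam * R over quadtree splits of x."""
    M = x.shape[0]
    distortion = float(np.sum((x - btc_reconstruct(x)) ** 2))   # depth-domain SSE (assumption)
    rate = SIZE_BITS + LEVEL_BITS + M * M                       # raw bitmap bits (assumption)
    best = (distortion + lam * rate, [(top, left, M)])
    if M > m_min:
        h = M // 2
        cost, leaves = 0.0, []
        for dy in (0, h):
            for dx in (0, h):
                c, sub = choose_blocks(x[dy:dy + h, dx:dx + h],
                                       top + dy, left + dx, lam, m_min)
                cost += c
                leaves += sub
        if cost < best[0]:
            best = (cost, leaves)
    return best

# Toy 16 x 16 depth block: flat background with two extra depth levels in one corner.
depth = np.full((16, 16), 50.0)
depth[0:4, 0:4] = 200.0
depth[4:8, 0:4] = 120.0

cost_lo, leaves_lo = choose_blocks(depth, 0, 0, lam=1.0)
cost_hi, leaves_hi = choose_blocks(depth, 0, 0, lam=1e6)
print("small lambda:", sorted(size for _, _, size in leaves_lo))
print("large lambda:", sorted(size for _, _, size in leaves_hi))
```

With a small λ the detailed corner is split into small blocks while the flat background stays in large ones; with a very large λ rate dominates and the whole block is coded as one.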


Adaptive BTC (cont.)

• Lagrangian cost function, $J = D + \lambda R$
  – Joint cost of both depth maps
  – Distortion (D) computed from the synthesized view
  – $\lambda = 0.2Q^2$, $Q = 2^0$ – $2^8$
• Bit rate (R) calculation
  – 6 block sizes (M = 2-64): 3 bits
  – Quantized values, a & b: Entropy coding
  – Positions of a & b in the block: Run Length Coding & Entropy coding

    a b b        1 0 0
    b b a   →    0 0 1
    b a b        0 1 0
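The position bitmap from the 3 x 3 example on this slide can be run-length coded as sketched below, and the multiplier follows the slide's relation λ = 0.2Q² as reconstructed from the garbled text. The raster scan order and the function names are assumptions, and the entropy coding of the run lengths is omitted.

```python
import numpy as np

def run_lengths(bitmap):
    """Raster-scan a binary bitmap; return (first_bit, list of run lengths)."""
    bits = np.asarray(bitmap, dtype=np.uint8).ravel()     # row-by-row scan (assumption)
    change = np.flatnonzero(bits[1:] != bits[:-1]) + 1    # indices where a new run starts
    bounds = np.concatenate(([0], change, [bits.size]))
    return int(bits[0]), np.diff(bounds).tolist()

def from_run_lengths(first_bit, runs, shape):
    """Inverse of run_lengths: rebuild the bitmap from alternating runs."""
    values = [(first_bit + i) % 2 for i in range(len(runs))]
    return np.repeat(values, runs).reshape(shape)

def lagrange_multiplier(Q):
    """lambda = 0.2 * Q^2, as read from the slide."""
    return 0.2 * Q ** 2

# Bitmap from the slide's example: 1 marks a pixel coded as a, 0 marks b.
bitmap = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0]])
first_bit, runs = run_lengths(bitmap)
print(first_bit, runs)                 # 1 [1, 4, 1, 1, 1, 1]
print(lagrange_multiplier(32))
```

Run lengths pay off on large background blocks, where the bitmap is mostly a few long runs; on small edge blocks the runs are short and the gain is limited.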


Adaptive BTC (cont.)

[Figure: Adaptive BTC results, annotated "as Mmax increases"]


Final Results



Final Results (cont.)

Balloons error images (frame 1), one figure per scheme:

• Scheme: DCT (M = 8, Q = 64), PSNR = 37.65 dB, Rate = 0.07465 bpp
• Scheme: Fixed BTC (M = 32), PSNR = 38.6070 dB, Rate = 0.0703 bpp
• Scheme: A-BTC (Mmax = 64, Q = 32), PSNR = 41.4849 dB, Rate = 0.0622 bpp


Conclusion

• Depth Maps
  – Not ordinary images
  – Important to preserve edges
• Adaptive BTC technique can optimally trade off rate and synthesized distortion
• Fixed BTC outperforms DCT, KLT without side information about synthesized distortion
• Adaptive BTC outperforms DCT, KLT, Fixed BTC


Future Work

• Adaptive BTC
  – Joint Lagrangian cost based on all possible ways of breaking down blocks in the pair of views
    • Our implementation is sub-optimal
  – Investigate heuristics to perform block sub-division top-down rather than bottom-up
  – Preserve higher moments in BTC
    • Only preserved up to the 2nd moment
  – Larger block sizes
    • Only used up to Mmax = 64


References

• N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Trans. Comput., vol. C-23, pp. 90-93, 1974.
• Balloons & Kendo Sequences, Nagoya University Tanimoto Laboratory, http://www.tanimoto.nuee.nagoya-u.ac.jp/.
• E. Delp and O. Mitchell, “Image compression using block truncation coding,” IEEE Trans. Commun., vol. 27, no. 9, pp. 1335-1342, Sep. 1979.
• Z. Li and M. Drew, “Karhunen-Loeve Transform,” in Fundamentals of Multimedia. Upper Saddle River: Pearson Education, 2004, ch. 8, sec. 5.2, pp. 220-222.
• P. Merkle, Y. Morvan, A. Smolic, D. Farin, K. Müller, P. H. N. de With, and T. Wiegand, “The effects of multiview depth video compression on multiview rendering,” Signal Process., Image Commun., vol. 24, no. 1+2, pp. 73-88, Jan. 2009.
• K. Müller, P. Merkle, and T. Wiegand, “3-D video representation using depth maps,” Proceedings of the IEEE, vol. PP, no. 99, pp. 1-14, 2010.
