COMPARISON OF 8 × 8 INTEGER DCTs USED IN H.264, AVS-CHINA AND VC-1 VIDEO CODECS
Submitted by,
Ashwini Urs and Sharath Patil
Under guidance of
Dr. K. R. Rao
Introduction
Integer DCT
• The Karhunen-Loève transform (KLT) is the statistically optimal transform.
• The performance of the DCT is close to that of the KLT [1].
• The DCT is a well-known transform and is widely used by the majority of coding standards.
• Though the integer DCT contains only integers, its energy-packing ability is similar to that of the DCT [1].
Integer DCT (Continued)
• Integer cosine transform does not involve floating point computations and hence is used in video coding standards such as H.264 [2], VC-1 [3] and AVS [4].
• Integer cosine transform has been implemented with transform sizes of 4, 8 and 16 [1].
• Even larger size transforms (up to 64) have been used for high resolution videos to achieve higher coding gain [1].
Integer DCTs compared
Integer DCT matrix for AVS-China, H.264 and VC-1
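AVS-China [4]
   8    8    8    8    8    8    8    8
  10    9    6    2   -2   -6   -9  -10
  10    4   -4  -10  -10   -4    4   10
   9   -2  -10   -6    6   10    2   -9
   8   -8   -8    8    8   -8   -8    8
   6  -10    2    9   -9   -2   10   -6
   4  -10   10   -4   -4   10  -10    4
   2   -6    9  -10   10   -9    6   -2
H.264 [2]
   8    8    8    8    8    8    8    8
  12   10    6    3   -3   -6  -10  -12
   8    4   -4   -8   -8   -4    4    8
  10   -3  -12   -6    6   12    3  -10
   8   -8   -8    8    8   -8   -8    8
   6  -12    3   10  -10   -3   12   -6
   4   -8    8   -4   -4    8   -8    4
   3   -6   10  -12   12  -10    6   -3
VC-1 [3]
  12   12   12   12   12   12   12   12
  16   15    9    4   -4   -9  -15  -16
  16    6   -6  -16  -16   -6    6   16
  15   -4  -16   -9    9   16    4  -15
  12  -12  -12   12   12  -12  -12   12
   9  -16    4   15  -15   -4   16   -9
   6  -16   16   -6   -6   16  -16    6
   4   -9   15  -16   16  -15    9   -4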
Integer DCT matrix for AVS-China, H.264 and VC-1
• The orthogonality of the 3 matrices was checked by evaluating $[\mathrm{INTDCT}_i]\,[\mathrm{INTDCT}_i]^{T}$ for each matrix.
• Each product is a diagonal matrix, so the rows of every matrix are mutually orthogonal, although not of unit norm:
1. AVS-China = diag (512, 442, 464, 442, 512, 442, 464, 442)
2. H.264 = diag (512, 578, 320, 578, 512, 578, 320, 578)
3. VC-1 = diag (1152, 1156, 1168, 1156, 1152, 1156, 1168, 1156)
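The orthogonality check can be reproduced in a few lines of Python. This is a sketch, not the authors' code: the helper `int_dct8` and its argument names are hypothetical, and the signs assume the usual even/odd-symmetric layout of 8 × 8 integer DCT matrices.

```python
def int_dct8(a, b, c, d, e, f, g):
    """Build an 8x8 integer DCT from its seven coefficient magnitudes (hypothetical helper)."""
    return [
        [a,  a,  a,  a,  a,  a,  a,  a],
        [b,  c,  d,  e, -e, -d, -c, -b],
        [f,  g, -g, -f, -f, -g,  g,  f],
        [c, -e, -b, -d,  d,  b,  e, -c],
        [a, -a, -a,  a,  a, -a, -a,  a],
        [d, -b,  e,  c, -c, -e,  b, -d],
        [g, -f,  f, -g, -g,  f, -f,  g],
        [e, -d,  c, -b,  b, -c,  d, -e],
    ]

TRANSFORMS = {
    "AVS-China": int_dct8(8, 10, 9, 6, 2, 10, 4),
    "H.264": int_dct8(8, 12, 10, 6, 3, 8, 4),
    "VC-1": int_dct8(12, 16, 15, 9, 4, 16, 6),
}

def gram(t):
    """Return the product T * T^T as a list of rows."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in t] for r1 in t]

def is_diagonal(m):
    """True when every off-diagonal entry of the square matrix m is zero."""
    n = len(m)
    return all(m[i][j] == 0 for i in range(n) for j in range(n) if i != j)

for name, t in TRANSFORMS.items():
    g = gram(t)
    print(name, "diag:", [g[i][i] for i in range(8)], "orthogonal rows:", is_diagonal(g))
```

Because the entries are integers, the off-diagonal test is exact and needs no floating-point tolerance.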
Order-16 Integer DCT matrix used in AVS-China [26]
[Matrix T16 (16 × 16), built from the coefficients 2, 4, 6, 8, 9 and 10 of the AVS-China 8 × 8 integer DCT with each entry repeated over a pair of adjacent columns; see [26] for the signed entries.]
Comparison of the properties of integer DCTs
Comparison of integer DCT matrices
• The properties of the 3 integer DCT matrices were compared by considering a covariance matrix R for a Markov-I process with ρ = 0.95 and N = 8.
• $R_{jk} = \rho^{|j-k|}$ for j, k = 0, 1, …, N-1, where ρ is the adjacent correlation coefficient.
• The covariance matrix in the transform domain is given by
$$[\tilde{\Sigma}] = [\mathrm{DOT}]\,[\Sigma]\,[\mathrm{DOT}]^{*T}$$
where DOT is the discrete orthogonal transform and [Σ] is the covariance matrix in the spatial domain (here [Σ] = R).
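A minimal sketch of the transform-domain covariance for the AVS-China matrix follows. It assumes that each integer basis row is first scaled to unit norm so that the transform is orthonormal. The slides do not state this step, and the function names are illustrative only.

```python
import math

# AVS-China 8x8 integer DCT (signs follow the usual even/odd-symmetric layout).
AVS = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [10, 9, 6, 2, -2, -6, -9, -10],
    [10, 4, -4, -10, -10, -4, 4, 10],
    [9, -2, -10, -6, 6, 10, 2, -9],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -10, 2, 9, -9, -2, 10, -6],
    [4, -10, 10, -4, -4, 10, -10, 4],
    [2, -6, 9, -10, 10, -9, 6, -2],
]

def markov_cov(n, rho):
    """Covariance matrix of a first-order Markov process: R[j][k] = rho**|j-k|."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]

def transform_cov(t, rho):
    """Return U R U^T, where U is t with every row scaled to unit norm (assumption)."""
    n = len(t)
    r = markov_cov(n, rho)
    u = [[x / math.sqrt(sum(v * v for v in row)) for x in row] for row in t]
    ur = [[sum(u[i][k] * r[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[sum(ur[i][k] * u[j][k] for k in range(n)) for j in range(n)] for i in range(n)]

sigma_t = transform_cov(AVS, 0.95)
variances = [sigma_t[k][k] for k in range(8)]  # diagonal = transform-domain variances
print([round(v, 4) for v in variances])
```

With unit-norm rows the trace is preserved, so the variances sum to N = 8.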
Properties used for comparison of integer DCTs
1. Variance distribution: The diagonal elements of $[\tilde{\Sigma}]$ correspond to the variances in the transform domain [7].
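2. Rate versus distortion: $R_D$ is the minimum average rate (bits/sample) for coding a signal at a specified distortion D [7]. For a fixed average distortion D, the rate distortion function $R_D$ is computed as
$$R_D = \frac{1}{N}\sum_{k=0}^{N-1}\max\left(0,\ \tfrac{1}{2}\log_2\frac{\tilde{\sigma}_{kk}^2}{\theta}\right), \qquad D = \frac{1}{N}\sum_{k=0}^{N-1}\min\left(\theta,\ \tilde{\sigma}_{kk}^2\right)$$
Choose values of θ between 0.1 and 1. For each value of θ, D and $R_D$ are calculated [7].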
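The sweep over θ can be sketched as below. It assumes the usual Gaussian reverse water-filling expressions: D is the mean of min(θ, σ²) and R_D is the mean of max(0, ½ log2(σ²/θ)). It also assumes unit-norm basis rows. The function names are hypothetical.

```python
import math

# AVS-China 8x8 integer DCT (signs follow the usual even/odd-symmetric layout).
AVS = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [10, 9, 6, 2, -2, -6, -9, -10],
    [10, 4, -4, -10, -10, -4, 4, 10],
    [9, -2, -10, -6, 6, 10, 2, -9],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -10, 2, 9, -9, -2, 10, -6],
    [4, -10, 10, -4, -4, 10, -10, 4],
    [2, -6, 9, -10, 10, -9, 6, -2],
]

def transform_variances(t, rho):
    """Diagonal of U R U^T for R[j][l] = rho**|j-l|, with U the unit-norm rows of t."""
    n = len(t)
    return [
        sum(row[j] * row[l] * rho ** abs(j - l) for j in range(n) for l in range(n))
        / sum(x * x for x in row)
        for row in t
    ]

def rate_distortion(variances, theta):
    """Return (D, R_D) for one value of the parameter theta."""
    n = len(variances)
    d = sum(min(theta, v) for v in variances) / n
    r = sum(max(0.0, 0.5 * math.log2(v / theta)) for v in variances) / n
    return d, r

variances = transform_variances(AVS, 0.95)
for step in range(1, 11):  # theta = 0.1, 0.2, ..., 1.0
    theta = step / 10
    d, r = rate_distortion(variances, theta)
    print(f"theta={theta:.1f}  D={d:.4f}  R_D={r:.4f}")
```

Plotting the (D, R_D) pairs for each transform would give curves of the kind compared in the results.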
Properties used for comparison of integer DCTs
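3. Normalized basis restriction error, $J_m$: The compaction of energy in a few transform coefficients can be represented by the normalized basis restriction error defined as [7]:
$$J_m = \frac{\sum_{k=m}^{N-1}\tilde{\sigma}_{kk}^2}{\sum_{k=0}^{N-1}\tilde{\sigma}_{kk}^2}, \qquad m = 0, 1, \ldots, N-1$$
where the $\tilde{\sigma}_{kk}^2$ are arranged in decreasing order [7].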
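As an illustration, the normalized basis restriction error can be computed from the transform-domain variances. The sketch uses the AVS-China matrix and again assumes unit-norm basis rows. `basis_restriction_error` is a hypothetical name.

```python
# AVS-China 8x8 integer DCT (signs follow the usual even/odd-symmetric layout).
AVS = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [10, 9, 6, 2, -2, -6, -9, -10],
    [10, 4, -4, -10, -10, -4, 4, 10],
    [9, -2, -10, -6, 6, 10, 2, -9],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -10, 2, 9, -9, -2, 10, -6],
    [4, -10, 10, -4, -4, 10, -10, 4],
    [2, -6, 9, -10, 10, -9, 6, -2],
]

def transform_variances(t, rho):
    """Diagonal of U R U^T for R[j][l] = rho**|j-l|, with U the unit-norm rows of t."""
    n = len(t)
    return [
        sum(row[j] * row[l] * rho ** abs(j - l) for j in range(n) for l in range(n))
        / sum(x * x for x in row)
        for row in t
    ]

def basis_restriction_error(variances):
    """Return [J_0, ..., J_{N-1}] with the variances sorted in decreasing order."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    return [sum(ordered[m:]) / total for m in range(len(ordered))]

j_m = basis_restriction_error(transform_variances(AVS, 0.95))
for m, value in enumerate(j_m):
    print(f"m={m}  J_m={value:.4f}")
```

J_0 is 1 by construction, and the faster J_m falls with m, the better the energy compaction.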
Properties used for comparison of integer DCTs
4. Residual correlation: The extent of decorrelation in the transform domain can be gauged by the correlation left undone by the discrete transform, which is measured by the absolute sum of the cross-covariances (off-diagonal elements) in the transform domain, i.e.,
$$\sum_{i=0}^{N-1}\ \sum_{\substack{j=0 \\ j\neq i}}^{N-1}\left|\tilde{\sigma}_{ij}^2\right|$$
for N = 8 as a function of ρ [7].
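A possible way to evaluate the residual correlation as a function of ρ is sketched below for the AVS-China matrix. It takes the absolute sum of the off-diagonal entries of the transform-domain covariance and assumes unit-norm basis rows. The names are illustrative.

```python
import math

# AVS-China 8x8 integer DCT (signs follow the usual even/odd-symmetric layout).
AVS = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [10, 9, 6, 2, -2, -6, -9, -10],
    [10, 4, -4, -10, -10, -4, 4, 10],
    [9, -2, -10, -6, 6, 10, 2, -9],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -10, 2, 9, -9, -2, 10, -6],
    [4, -10, 10, -4, -4, 10, -10, 4],
    [2, -6, 9, -10, 10, -9, 6, -2],
]

def transform_cov(t, rho):
    """Return U R U^T for R[j][l] = rho**|j-l|, with U the unit-norm rows of t."""
    n = len(t)
    norms = [math.sqrt(sum(x * x for x in row)) for row in t]
    return [
        [
            sum(t[i][j] * t[k][l] * rho ** abs(j - l) for j in range(n) for l in range(n))
            / (norms[i] * norms[k])
            for k in range(n)
        ]
        for i in range(n)
    ]

def residual_correlation(t, rho):
    """Absolute sum of the off-diagonal entries of the transform-domain covariance."""
    cov = transform_cov(t, rho)
    n = len(cov)
    return sum(abs(cov[i][j]) for i in range(n) for j in range(n) if i != j)

for rho in (0.1, 0.3, 0.5, 0.7, 0.9, 0.95):
    print(f"rho={rho:.2f}  residual={residual_correlation(AVS, rho):.4f}")
```

At ρ = 0 the input is already uncorrelated, so the measure is zero for any orthonormal transform.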
Properties used for comparison of integer DCTs
5. Transform coding gain $G_{TC}$: Transform coding gain is defined as the ratio of the arithmetic mean to the geometric mean of the variances
$$G_{TC} = \frac{\frac{1}{N}\sum_{i=0}^{N-1}\tilde{\sigma}_{ii}^2}{\left(\prod_{i=0}^{N-1}\tilde{\sigma}_{ii}^2\right)^{1/N}}$$
where $\tilde{\sigma}_{ii}^2$ is the variance of the ith coefficient in the transform domain.
• As the sum of all the variances is invariant under an orthogonal transformation, $G_{TC}$ is maximized by minimizing the geometric mean [7].
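The coding gain can be evaluated for all three matrices as follows. This is a sketch under the assumption of unit-norm basis rows. `int_dct8` and `coding_gain` are hypothetical helpers, and the printed values depend on that scaling, so they need not reproduce the figures reported in the slides.

```python
import math

def int_dct8(a, b, c, d, e, f, g):
    """Build an 8x8 integer DCT from its seven coefficient magnitudes (hypothetical helper)."""
    return [
        [a,  a,  a,  a,  a,  a,  a,  a],
        [b,  c,  d,  e, -e, -d, -c, -b],
        [f,  g, -g, -f, -f, -g,  g,  f],
        [c, -e, -b, -d,  d,  b,  e, -c],
        [a, -a, -a,  a,  a, -a, -a,  a],
        [d, -b,  e,  c, -c, -e,  b, -d],
        [g, -f,  f, -g, -g,  f, -f,  g],
        [e, -d,  c, -b,  b, -c,  d, -e],
    ]

TRANSFORMS = {
    "AVS-China": int_dct8(8, 10, 9, 6, 2, 10, 4),
    "H.264": int_dct8(8, 12, 10, 6, 3, 8, 4),
    "VC-1": int_dct8(12, 16, 15, 9, 4, 16, 6),
}

def transform_variances(t, rho):
    """Diagonal of U R U^T for R[j][l] = rho**|j-l|, with U the unit-norm rows of t."""
    n = len(t)
    return [
        sum(row[j] * row[l] * rho ** abs(j - l) for j in range(n) for l in range(n))
        / sum(x * x for x in row)
        for row in t
    ]

def coding_gain(variances):
    """Ratio of the arithmetic mean to the geometric mean of the variances."""
    n = len(variances)
    arithmetic = sum(variances) / n
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return arithmetic / geometric

gains = {name: coding_gain(transform_variances(t, 0.95)) for name, t in TRANSFORMS.items()}
for name, gain in gains.items():
    print(f"{name}: G_TC = {gain:.4f}")
```

The geometric mean is computed through logarithms to avoid underflow when many small variances are multiplied.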
Results and Conclusion
Variance distribution versus N
Rate versus distortion
Normalized basis restriction error versus samples retained m
Residual correlation versus correlation co-efficient
Conclusion
• The variance distribution, normalized basis restriction error and transform coding gain of the 3 integer DCTs are closely comparable.
• The transform coding gains GTC for AVS, H.264 and VC-1 are 8.2916, 8.0155 and 7.5477 respectively, so AVS achieves the highest GTC.
• For a fixed average distortion D, the rate distortion characteristics of H.264 and AVS are indistinguishable.
• The residual correlation for ρ > 0.5 is indistinguishable for the 3 integer DCTs.
References
[1] C. Fong and W. Cham, “Simple order-16 integer transform for video coding”, The Chinese University of Hong Kong, Shatin, Hong Kong.
[2] S.K.Kwon, A.Tamhankar and K.R.Rao, “Overview of H.264 / MPEG-4 Part 10” J. Visual Communication and Image Representation, vol. 17, pp.186-216, April 2006.
[3] S. Srinivasan , et al, “Windows Media Video 9: overview and applications”, Signal Processing: Image Communication, vol. 19, Issue 9, pp. 851-875, Oct. 2004
[4] W. Gao et al., “AVS – The Chinese next-generation video coding standard,” National association of broadcasters, Las Vegas, 2004
[5] R. Joshi, Y. Reznik and M. Karczewicz, “Efficient large size transforms for high-performance video coding”, Qualcomm Inc., San Diego, CA, USA.
[6] “Integer DCT for AVS China”, INTDCT6 - http://www-ee.uta.edu/dip/Courses/EE5355/ee5355.htm.
[7] “Comparison of discrete transforms”, http://www-ee.uta.edu/dip/Courses/EE5355/ee5355.htm.
[8] N.Ahmed, T.Natarajan and K.R.Rao, “Discrete cosine transform”, IEEE trans. computers, Vol. X, pp.90-93, 1974.
[9] A.K.Jain, “Fundamentals of digital image processing”, Prentice hall, 1989.
[10] A.T. Hinds, “Design of high-performance fixed-point transforms using the common factor method”, Ricoh I infoprint solutions company, Boulder, CO, USA.
[11] T.Wiegand, et al “Overview of the H.264/AVC video coding standard”, IEEE Trans. on Circuit and Systems for Video Technology, vol.13, pp. 560-576, July 2003.
[12] T. Wiegand and G. J. Sullivan, “The H.264 video coding standard”, IEEE Signal Processing Magazine, vol. 24, pp. 148-153, March 2007.
[13] D. Marpe, T. Wiegand and G. J. Sullivan, “The H.264/MPEG-4 AVC standard and its applications”, IEEE Communications Magazine, vol. 44, pp. 134-143, Aug. 2006.
[14] A. Puri, X. Chen and A. Luthra, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal processing: image communication, vol. 19, pp. 793-849, Oct. 2004.
[15] M.Fieldler, “Implementation of basic H.264/AVC decoder”, seminar paper at Chemnitz university of technology, June 2004.
[16] R. Schäfer, T. Wiegand and H. Schwarz, “The emerging H.264/AVC standard”, EBU Technical Review, Jan. 2003.
[17] D. Marpe, T. Wiegand and S. Gordon, “H.264/MPEG4-AVC fidelity range extensions: tools, profiles, performance, and application areas”, in IEEE International Conference on Image Processing, vol. 1, pp. I-593-6, 2005.
[18] S. Saponara et al, "The JVT advanced video coding standard: complexity and performance analysis on a tool-by-tool basis," in Packet Video Workshop, Nantes, France, April 2003.
[19] VC-1 technical overview - http://www.microsoft.com/windows/windowsmedia/howto/articles/vc1techoverview.aspx
[20] S. Srinivasan and S. L. Regunathan, “An overview of VC-1”, SPIE / VCIP, vol. 5960, pp. 720-728, July 2005.
[21] AVS Video Expert Group, “Information technology – Advanced coding of audio and video – Part 2: Video (AVS1-P2 JQP FCD 1.0),” Audio Video Coding Standard Group of China (AVS), Doc. AVS-N1538, Sept. 2008.
[22] AVS Video Expert Group, “Information technology – Advanced coding of audio and video – Part 3: Audio,” Audio Video Coding Standard Group of China (AVS), Doc. AVS-N1551, Sept. 2008.
[23] L. Yu et al., “Overview of AVS-Video: Tools, performance and complexity,” SPIE VCIP, vol. 5960, pp. 596021-1~596021-12, Beijing, China, July 2005.
[24] L. Fan, S. Ma and F. Wu, “Overview of AVS video standard,” IEEE Int’l Conf. on Multimedia and Expo, ICME '04, vol. 1, pp. 423–426, Taipei, Taiwan, June 2004.
[25] Special issue on 'AVS and its Applications', Signal Processing: Image Communication, vol. 24, pp. 245-344, April 2009.
[26] C. K. Fong and W. K. Cham, “Simple order-16 integer transform for video coding”, http://www-ee.uta.edu/Dip/Courses/EE5355/INTDCT5.pdf