Research Areas - University of Texas at Arlington (2012-01-06)

RESEARCH AREAS

General research areas.

Image/Video Coding, including the emerging technologies around reconfigurable video coding, scalable video coding, multiview video coding, high-performance video coding, and next-generation video coding, as well as efficient hardware/software implementations for high-definition real-time video coding. Image/Video Analysis, including compressed-domain object detection, multi-target identification/tracking, image (extraction, restoration, and enhancement), and facial expression recognition. Other Topics, including Real-time Embedded Systems, Video Surveillance, Image/Video Watermarking, HW/SW Co-design of Multimedia Systems, as well as Reconfigurable and Multicore Architecture.

The following book has 90 chapters. Each chapter can lead to a project.

Advanced Concepts for Intelligent Vision Systems

7th International Conference, ACIVS 2005, Antwerp, Belgium, September 20-23, 2005. Proceedings

Book Series

Lecture Notes in Computer Science

Publisher

Springer Berlin / Heidelberg

ISSN

0302-9743 (Print) 1611-3349 (Online)

Volume

Volume 3708/2005

DOI

10.1007/11558484

Copyright

2005

ISBN

978-3-540-29032-2

Subject Collection

Computer Science

SpringerLink Date

Wednesday, October 05, 2005

Please access from

http://www.springerlink.com/content/7tuamycrnqvq/?p=ca8d7d742137450b97f6a136a9e3ecaf&pi=0

MULTIPLE DESCRIPTION CODING

1. I. Radulovic et al, Multiple description video coding with H.264/AVC redundant pictures, IEEE Trans. CSVT, vol. 20, pp. 144-148, Jan. 2010.

See also references cited at the end of this paper.

2. V. K Goyal, Multiple description coding: Compression meets the network, IEEE SP Magazine, vol. 18, pp. 74-93, Sept. 2001.

These papers can lead to several projects.

VIDEO OBJECT SEGMENTATION IN COMPRESSED DOMAIN

F. Porikli, F. Bashir and H. Sun, Compressed domain video object segmentation, IEEE Trans. CSVT, Vol. 19, 2009.

High efficiency video coding (HEVC)

Joint Collaborative Team on Video Coding (JCT-VC), TMuC (Test Model under Consideration), http://hevc.kw.bbc.co.uk/svn/jctvc-tmuc, 2010.

http://www.h265.net/ has information on developments in HEVC and NGVC (next-generation video coding).

Some of the tools contributing to the gain are:

(1) RD Picture Decision

(2) RDO_Q (from Qualcomm)

(3) MDDT (from Qualcomm)

(4) New Offset (from Qualcomm)

(5) Adaptive Interpolation Filter (from Qualcomm & Nokia)

(6) Block Adaptive Loop Filter (BALF) (from Toshiba)

(7) Bigger Blocks and Bigger Transforms (32x32 and 64x64) (from Qualcomm)

(8) Motion Vector Competition (from France Telecom)

(9) Template Matching

JVT KTA reference software (KTA: key technical areas)

http://iphome.hhi.de/suehring/tml/download/KTA/

G.J. Sullivan and J.-R. Ohm, Recent developments in standardization of high efficiency video coding, Proc. SPIE, vol. 7798, pp. 77980V-1 through V-7, San Diego, CA, Aug. 2010. Many other papers.

IEEE Trans. on CSVT, vol. 20, Special section on high efficiency video coding (several papers), Dec. 2010.

IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 7, Nov. 2011, Introduction to the Issue on Emerging Technologies for Video Compression (several papers on HEVC).

(see M. Karczewicz et al, A hybrid video coder based on extended macroblock sizes, improved interpolation and flexible motion representation, IEEE Trans. CSVT, Vol.20, pp. 1698-1708, Dec. 2010.) (several other papers)

T. Wiegand, B. Bross, W.-J. Han, J.-R. Ohm, and G. J. Sullivan, WD3: Working Draft 3 of High-Efficiency Video Coding, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T VCEG and ISO/IEC MPEG, Doc. JCTVC-E603, Geneva, CH, March 2011. (HEVC standard emerging)

Seyoon Jeong, Sung-Chang Lim, Hahyun Lee, Jongho Kim, Jin Soo Choi, and Haechul Choi, Highly Efficient Video Codec for Entertainment-Quality, ETRI Journal, vol. 33, pp. 145-154, April 2011. http://etrij.etri.re.kr ([email protected])

This paper introduces many new coding tools in H.264 resulting in improved coding efficiency. Evaluate these tools and explore other tools for improving the coding efficiency of H.264 leading to H.265. Also review the references.

HIGH EFFICIENCY VIDEO CODING (HEVC) 3/4/11

ITU-T documents

http://wftp3.itu.int/av-arch/jvt-site/

The latest draft meeting report for the last meeting is at:

http://ftp3.itu.int/av-arch/jctvc-site/2011_01_D_Daegu.

There is a lot of information in that meeting report, and it contains pointers to where to find more (e.g., where to find documents and software).

One trick is to search the report for the string "Decision:".
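The search trick can be sketched in a few lines of Python. This is a minimal illustration only: the function name `find_decisions` and the sample text are hypothetical, and a real meeting report would first have to be saved as plain text.

```python
def find_decisions(report_text, marker="Decision:"):
    """Return (line_number, text) pairs for every line containing the marker."""
    hits = []
    for number, line in enumerate(report_text.splitlines(), start=1):
        if marker in line:
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical excerpt; not taken from any actual meeting report.
    sample = (
        "Discussion of tool A.\n"
        "Decision: Adopt tool A.\n"
        "Further study was requested for tool B.\n"
        "  Decision: Keep tool B in the core experiment.\n"
    )
    for number, text in find_decisions(sample):
        print(number, text)
```

Listing the hits with their line numbers makes it easy to jump back into the report for the surrounding discussion.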

The output documents (of that meeting and the preceding one) are another good place to look. And the AHG reports and CE (core experiments) reports.

We have another meeting starting in two weeks, so a lot more information will be showing up soon, and of course some aspects of the design will change.

IEEE Trans. Circuits and Systems for Video Technology has a special section (several papers) devoted to the project. Vol. 20, Dec. 2010.

It would be helpful if the students could actually contribute to the effort instead of just writing reports that only you will read.

Two ways to contribute include testing and improving the reference software and improving the draft standard text specification. It is easy to find places in the text that are incomplete, incorrect, vague, grammatically bad, inconsistent, or otherwise needing improvement. (For the text especially, finding problems is not enough; what we need are the solutions to the problems.)

Obviously though, we would only want competent help, not interference or messages on our email reflector saying things like "Hello, I am working on a report for my class project about the entropy coding in HEVC, so please explain how it works to me and give me some software and test results to put into my report which, by the way, is due next week."

vceg-document: http://ftp3.itu.ch/av-arch/video-site/0801_Ant/

jvt-document: http://ftp3.itu.ch/av-arch/jvt-site/

These documents will help you understand the KTA software:

VCEG-AE08 [J. Jung, T. K. Tan] KTA 1.2 software manual

VCEG-AE09 [J. Jung, G. Laroche] Performance evaluation of KTA 1.2 software

Wiener Spatial Filtering in VCEG KTA

Please check the various post-filter and loop-filter proposals. It is also useful to look at the MPEG-4 AVC/H.264 proposals about the post-filter hint SEI message: JVT-S030, JVT-T039, JVT-U035.

These documents can be accessed from the JVT ftp site:

http://wftp3.itu.int/av-arch/jvt-site/2006_04_Geneva/JVT-S030.zip

http://wftp3.itu.int/av-arch/jvt-site/2006_07_Klagenfurt/JVT-T039.zip

http://wftp3.itu.int/av-arch/jvt-site/2006_10_Hangzhou/JVT-U035.zip

Implement loop filter (on/off) and post filter (on/off) on 4:4:4 HD sequences and evaluate the performance. Design other post filters.

Variable block size spatially varying transforms. See the following papers.

1. C. Zhang, K. Ugur, J. Lainema and M. Gabbouj, Video coding using spatially varying transform, in T. Wada, F. Huang and S. Lin (Eds.): PSIVT, LNCS 5414, pp. 796-806, 2009. {kemal.ugur, jani.lainema}@nokia.com, {cixun.zhang, moncef.gabbouj}@tut.fi

2. C. Zhang, K. Ugur, J. Lainema and M. Gabbouj, Video coding using variable block size spatially varying transforms,.

These two papers address variable block size and also the location in an MB (where to apply the transform). Implementing these two algorithms can lead to new research topics.

J. Chen et al, Efficient video coding using legacy algorithmic approaches, IEEE Trans. on Multimedia (accepted Sept. 2011). Will have access to this paper soon. This has led the MPEG standards group to work on a Type-1 video coding standard. This paper can lead to several research projects and also to a comparison with H.264.

Video codec projects please access

http://www.compression.ru/video/video_codecs.htm

WebM project www.webmproject.org/code

WebM supports VP8 video and Vorbis audio. Explore and implement this (VP8: On2 Technologies, now Google; Adobe Flash). http://blog.webmproject.org/2010/05/introducing-webm-open-web-media-project.html

http://blog.webmproject.org/about/faq

http://blog.webmproject.org/about

INTERPOLATION

G. Anbarjafari, and H. Demirel, "Image Super Resolution Based on Interpolation of Wavelet Domain High Frequency Subbands and the Spatial Domain Input Image," ETRI Journal, vol.32, no.3, June 2010, pp.390-394. DOI:10.4218/etrij.10.0109.0303

Abstract :

In this paper, we propose a new super-resolution technique based on interpolation of the high-frequency subband images obtained by discrete wavelet transform (DWT) and the input image. The proposed technique uses DWT to decompose an image into different subband images. Then the high-frequency subband images and the input low-resolution image have been interpolated, followed by combining all these images to generate a new super-resolved image by using inverse DWT. The proposed technique has been tested on Lena, Elaine, Pepper, and Baboon. The quantitative peak signal-to-noise ratio (PSNR) and visual results show the superiority of the proposed technique over the conventional and state-of-art image resolution enhancement techniques. For Lena's image, the PSNR is 7.93 dB higher than the bicubic interpolation. File: interpolation.

Key word :

Static image super resolution, discrete wavelet transform.

DOI :

10.4218/etrij.10.0109.0303

Implement this and compare with various other interpolation techniques using MSE, PSNR, UIQI, SSIM, PEVQ (perceptual evaluation of video quality), CZD (Czekanowski distance, which measures differences between pixels), etc.
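As a starting point for the comparison, the two simplest metrics in the list can be sketched as follows. This is a minimal pure-Python illustration, assuming 8-bit grayscale images stored as lists of rows; the function names `mse` and `psnr` are chosen here for illustration and do not come from the paper.

```python
import math


def mse(ref, test):
    """Mean squared error between two equally sized grayscale images."""
    if len(ref) != len(test) or any(len(r) != len(t) for r, t in zip(ref, test)):
        raise ValueError("images must have the same dimensions")
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count


def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    interpolated = [[12, 18], [30, 40]]
    print(mse(original, interpolated), psnr(original, interpolated))
```

The perceptual metrics (UIQI, SSIM, PEVQ) need local statistics or a full perceptual model and are better taken from an established implementation.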

H.264

JSVM; Joint scalable video model

You can find more information on the current implementation of rate control in the JSVM reference software in the JVT document JVT-W043:

http://ftp3.itu.int/av-arch/jvt-site/2007_04_SanJose/JVT-W043.zip

Note that this RC (rate control) algorithm controls the bit rate only on the base layer. The enhancement layers are still being coded with the fixed pre-determined QP values.

JM reference software manual http://iphome.hhi.de/suehring/tml/JM%20Reference%20Software%20Manual%20(JVT-AE010).pdf.

B.M.K. Aswathappa and K.R. Rao, Rate-Distortion Optimization (RDO) using Structural Information in H.264 Strictly Intra-frame Encoder, IEEE Southeastern Symposium on Systems Theory, pp.367-370, Tyler, TX, March 2010.

Extend the above technique - RDO based on SSIM (structural similarity metric) - to inter frame encoder in H.264.
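One way to picture an SSIM-based rate-distortion cost is sketched below. It is a simplified illustration under stated assumptions, not the method of the cited paper: SSIM is computed once over a whole block (no sliding Gaussian window, population variance), the distortion is taken as 1 - SSIM, and the names `ssim_block`, `rd_cost`, and `best_mode` are hypothetical.

```python
def ssim_block(x, y, peak=255):
    """Single-window SSIM between two blocks given as flat lists of samples."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("blocks must be non-empty and of equal size")
    c1 = (0.01 * peak) ** 2  # stabilizing constants from the usual SSIM definition
    c2 = (0.03 * peak) ** 2
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) * (a - mx) for a in x) / n
    vy = sum((b - my) * (b - my) for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2)
    )


def rd_cost(original, reconstructed, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R with D = 1 - SSIM (an assumption)."""
    return (1.0 - ssim_block(original, reconstructed)) + lam * rate_bits


def best_mode(original, candidates, lam):
    """Pick the (name, reconstruction, rate_bits) candidate with the lowest cost."""
    return min(candidates, key=lambda c: rd_cost(original, c[1], c[2], lam))[0]
```

For inter frames the same cost could be evaluated per candidate partition and motion vector; the choice of lambda for an SSIM-based distortion is an open design question.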

TUTORIAL ON H.264

LSI Logic Corporation: H.264/MPEG-4 AVC Video Compression Tutorial, available in:

http://www.cs.ucla.edu/classes/fall03/cs218/paper/H.264_MPEG4_Tutorial.pdf

Panasonic Corporation: AVC-Intra (H.264 Intra) Compression Tutorial, available in:

ftp://ftp.panasonic.com/pub/Panasonic/Drivers/PBTS/papers/WP_AVC-Intra.pdf

(1) K. Yu et al, Practical real time video codec for mobile devices, vol. III, pp. 509-512, ICME 2003.

Developed a practical low complexity real-time video codec for mobile devices. Reduces computational cost in ME, integer DCT, DCT/quantizer bypass. Applied to H.263. Extend this to H.261, MPEG1,2,4 (MPEG 4 Visual, SP, ASP) and to H.264 baseline profile. SP ; simple profile, ASP: Advanced simple profile.

Please access the paper P. Carrillo, H. Kalva and T. Pin, "Low complexity H.264 video encoding", SPIE, vol. 7443, paper # 74430A, Aug. 2009 (Applications of digital image processing). By applying machine learning to video coding, the authors are able to reduce the complexity of a highly optimized encoder by about 63%. This is based on reducing the complexity of mode selection (16x16 to 4x4 block sizes) for ME/MC in inter frames, which is computationally exhaustive. This may be very useful in mobile devices (SMBA project - Korea). Please implement this algorithm (simulation) and verify the results obtained in this paper. Explore implementing this algorithm in AVS China video and SMPTE VC-1 in order to reduce the encoder complexity. Both these standards have multiple block size ME/MC functions similar to those in H.264.

(2) G. Lakhani, Optimal Huffman coding of DCT blocks, IEEE Trans. CSVT, vol.14, pp.522-527, April 2004.

Huffman coding in JPEG baseline. Can similar techniques be applied to other video coding standards (H.261, MPEG series, H.263 3D VLC, etc.)? See Table I about the # of bits for each image (comparison).
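For readers new to the topic, the basic Huffman construction can be sketched as below. This is the textbook algorithm, not the optimization proposed by Lakhani; it also ignores the 16-bit code length limit that JPEG baseline imposes. The function names and the symbol frequencies in the example are made up for illustration.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a dict of symbol frequencies."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        return {symbol: 1 for symbol in freqs}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), [s]) for s, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for symbol in group1 + group2:
            lengths[symbol] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), group1 + group2))
    return lengths


def total_bits(freqs, lengths):
    """Number of bits needed to code all symbols with the given lengths."""
    return sum(freqs[s] * lengths[s] for s in freqs)


if __name__ == "__main__":
    freqs = {"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}
    lengths = huffman_code_lengths(freqs)
    print(lengths, total_bits(freqs, lengths))
```

Comparing `total_bits` for a default table against a table built from the actual symbol statistics of an image gives a first feel for the kind of savings tabulated in the paper.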

(3) M. Horowitz et al, H.264 baseline profile decoder complexity analysis,

IEEE Trans. CSVT, Vol. 13, pp. 704-716, July 2003 and

V. Lappalainen, A. Hallapuro and T.D. Hamalainen, Complexity of optimized H.26L video decoder implementation, IEEE Trans. CSVT, Vol. 13, pp. 717-725, July 2003.

Develop similar complexity analysis for H.264 Main Profile (both encoder/decoder) and compare with MPEG-2 Main Profile.

Doctoral thesis

Kemal Ugur, Improved prediction methods for low complexity, high quality video coding 8 Nov. 2010., Tampere Univ. of Technology, Tampere, Finland

This thesis can be downloaded from

http://www.cs.tut.fi/~moncef/publications/ugur.pdf

This can lead to research projects.

Reference papers:

a) A. Molino et al, Low complexity video codec for mobile video conferencing, EUSIPCO 2004, Vienna, Austria, Sept. 2004. (http://videoprocessing.ucsd.edu) ( http://www.vlsilab.polito.it)

b) M. Li et al, DCT-based phase correlation motion estimation, IEEE ICIP 2004, Singapore, Oct. 2004.

c) M. Song, A. Cai and J. Sun, Motion estimation in DCT domain (contact [email protected]), Proc. 1996 IEEE Intl. Conf. on Communication Technology, vol.12, pp. 670-674, Beijing, China, 1996.

d) S-F. Chang and D.G. Messerschmitt, Manipulation and compositing of MC-DCT compressed video, IEEE JSAC, vol. 13, pp. 1-11, Jan. 1995.

ME/MC (generally implemented in the spatial domain) is computationally intensive. ME/MC in the transform domain may simplify this. Implementation complexity is a critical factor in designing codecs for wireless (mobile) communications. Consider this and other functions in a codec based on, e.g., H.264 (baseline profile) and other standards.
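To make the cost of spatial-domain motion estimation concrete, here is a minimal sketch of full-search block matching with the sum of absolute differences (SAD). Frames are assumed to be grayscale lists of rows, and the function names are illustrative. For a search range of R and a block of N x N samples, up to (2R+1)^2 candidates are tested at N^2 operations each, which is why faster search strategies or transform-domain methods are of interest.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a current and a reference block."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, cx, cy, size, search_range):
    """Exhaustively test every displacement within +/- search_range samples.

    Returns ((dx, dy), cost) for the best match that lies inside the frame.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

A quick check is to build a current frame as a shifted copy of the reference frame and confirm that the recovered vector equals the shift.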

3a) S.S. Basavanhalli, Complexity Analysis of H.264 baseline decoder on ARM9TDMI processor, M.S. Thesis, EE Dept., UTA, Dec. 2005.

3b) T. Bhatia, SIMD optimization of H.264 high profile HD decoder, M.S. Thesis, EE Dept., UTA, Dec. 2005.

(4) Extend these theses to H.264 encoders.

Topic for MS thesis/research

"Transcoding from H264 Main/High profile to H264 Baseline Profile"

(without dropping B frames, i.e., converting B frames to P frames efficiently)

The Apple iPhone and many other phones support Baseline profile decoding of H.264 video, while some companies broadcast in Main profile, hence the need for this transcoding. This is just one example.

J. But, A novel MPEG-1 partial encryption scheme for the purposes of streaming video, Ph.D. Thesis, ECSE Dept., Monash University, Clayton, Victoria, Australia, 2004. (A copy of the thesis is in our lab.) Also papers by J. But (under review):

Implementing Encrypted Streaming Video in a Distributed Server Environment - Submitted to IEEE Multimedia

An Evaluation of Current MPEG-1 Ciphers and their Applicability to Streaming Video - Submitted to ICON 2004

KATIA - A Partial MPEG-1 Video Stream Cipher for the purposes of Streaming Video - Submitted to ACM Transactions on Multimedia Computing, Communications and Applications

REVIEW

In view of the improvements in networks, internet services, DSL, cable modems, satellite dishes, set-top boxes, hand-held mobile devices etc., video streaming has lots of potential and promise. One direct and extensive application is in the entertainment industry, where a client can browse, select and access movies, video clips, sports events, and historical/political/tourist/medical/geographical/scientific encoded video using a video on demand (VoD) service. A major problem for content providers and distributors is providing this service to bona fide/authorized clients and collecting the regulated revenues without unauthorized persons duplicating/distributing the protected material. While encryption schemes have been developed for storage media (DVD, Video CD etc.), there is an urgent need to extend and implement this approach to video streaming over public networks such as the internet, satellite links, and terrestrial and cable channels. The thesis by But addresses this highly relevant and beneficial subject. Encryption techniques are applied to MPEG-1 coded bit streams such that, with the proper (authorized) decryption key, clients can access and watch the video of their choice without duplication/distribution. This approach requires a thorough understanding of MPEG-1 encoder/decoder algorithms together with video/audio systems, as well as the encryption details.

The author has proposed a range of modifications to the distributed server design that will lead to lower implementation costs and also increase the customer base. The development of an MPEG-1 partial selection scheme for encryption of streaming video is significant since the current encryption algorithms are mainly designed for encryption and protection of stored video. Extension of the MPEG-1 cipher to MPEG-2 bit streams is discussed in general terms, with details left for further research. Additional research areas are developing encryption schemes for other encoded bit streams based on MPEG-4 Visual, H.263 and the emerging H.264/MPEG-4 Part 10.

Further research: Apply/extend/implement these techniques to video streaming based on MPEG-2, MPEG-4 Visual, H.263 and H.264/MPEG-4 Part 10 (encryption, authentication, authorization, robustness, copyright protection etc.). Please see chapter 7 (Conclusion) of the thesis for a summary and further research.

(5) H.264/MPEG-4 Part 10, the latest video coding standard, specifies only video coding, unlike MPEG-1, 2, 4, H.263 etc. (see IEEE Trans. CSVT, vol. 13, July 2003, Special issue on H.264/MPEG-4 Part 10; also several papers on H.264). For all video applications, audio is essential.

Investigate multiplexing of H.264/MPEG-4 Part 10 (encoded video) with encoded audio based on the MPEG-2,4 Systems compatibility at the transmitter side followed by inverse operations (demultiplexing into video and audio bit streams and decoding these two media along with the lip-sync and other aspects) at the receiver side. There are several standards/non standards based algorithms for encoding/decoding audio ( M. Bosi and R.E. Goldberg, Introduction to digital audio coding standards, Norwell, MA: Kluwer, 2002).

H.264/MPEG-4 Part 10 video can be in various profiles/levels, and likewise the audio (mono, stereo, surround sound, lossless etc.), aimed at various quality levels/applications and at various bit rates. This research can lead to several M.S. theses. This research also has practical/industrial applications.
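The timing side of multiplexing and lip-sync can be illustrated with presentation time stamps (PTS), which MPEG-2 Systems expresses in units of a 90 kHz clock. The sketch below is a simplified illustration only: it assumes a constant integer video frame rate, AAC frames of 1024 samples, and a plain merge by time stamp instead of a real buffer-model-driven multiplexer. All function names are hypothetical.

```python
import heapq

PTS_CLOCK_HZ = 90_000  # MPEG-2 Systems time stamps count 90 kHz ticks


def video_pts(frame_index, fps):
    """PTS of a video frame, assuming a constant integer frame rate."""
    return frame_index * PTS_CLOCK_HZ // fps


def audio_pts(frame_index, sample_rate, samples_per_frame=1024):
    """PTS of an audio frame; 1024 samples per frame is the usual AAC size."""
    return frame_index * samples_per_frame * PTS_CLOCK_HZ // sample_rate


def interleave(fps, n_video, sample_rate, n_audio):
    """Merge video and audio access units into one list ordered by PTS."""
    video = [(video_pts(i, fps), "video", i) for i in range(n_video)]
    audio = [(audio_pts(i, sample_rate), "audio", i) for i in range(n_audio)]
    return list(heapq.merge(video, audio))


if __name__ == "__main__":
    for unit in interleave(25, 3, 48_000, 5):
        print(unit)
```

At the receiver, presenting each decoded unit when the system clock reaches its PTS is what keeps audio and video in lip-sync.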

Below are the comments by industry experts actively involved in the video/audio standards.

Just like MPEG-2 video, the audio standards used in broadcast applications are defined by application standards such as ATSC (US Terrestrial Broadcast), SCTE (US/Canada Cable), ARIB (Japan) and DVB (Europe). ATSC and SCTE specify AC-3 (Dolby) audio while DVB specifies both MPEG-1 audio as well as AC-3. ARIB specifies MPEG-2 AAC.

The story for audio to be used with H.264 is more complex. DVB is considering AAC with SBR (called AAC plus) while ATSC has selected AC-3 plus from Dolby. In addition, for compatibility all the application standards will continue to use the existing audio standards (AC-3, MPEG-1 and MPEG-2 AAC).

The glue to all of these is the MPEG-2 transport that provides the audio/video synchronization mechanism for all the video and audio standards.

I have the document ETSI TS 101 154 V1.6.1 (2005) DVB: Implementation guidelines for the use of video and audio coding in broadcasting applications based on the MPEG-2 transport stream (file: ETSI-DVB).

Consider MPEG-2 and MPEG-4 SYSTEMS for multiplexing the video/audio coded bit streams.

WEB SITES for multiplexing/de multiplexing audio/video bit streams.

http://gpac.sourceforge.net/auth_mp4box.php

http://mpeg4ip.sourceforge.net/docs/ (check the mp4creator tool).

Both of the above have their advantages and drawbacks, but can help you do what you want (multiplexing and splitting of any files including AVC coded bitstreams).

(6) MPEG-2 MULTIPLEXER FOR TS and PS (TS Transport stream, PS Program stream)

Multiplexing of H.264/AVC video codec with MPEG's AAC audio codec, Swetha Krishnamurthy

http://www.linuxtv.org/projects.php page may help you.

More specifically, http://www.scara.com/~schirmer/o/mplex13818/

implements "An ISO-13818 compliant multiplexer for generating MPEG2 transport and program streams."
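When working with transport streams it helps to be able to inspect packets. The sketch below parses the fixed 4-byte header of a 188-byte MPEG-2 transport stream packet. It is an illustration written for this note, not code from the multiplexer linked above, and it does not parse adaptation fields or PES headers.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet):
    """Decode the 4-byte header of one MPEG-2 transport stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport stream packet is 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        "transport_priority": bool(packet[1] & 0x20),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],  # 13-bit packet identifier
        "scrambling_control": (packet[3] >> 6) & 0x03,
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }


if __name__ == "__main__":
    # Hand-built example packet: PID 0x100, payload only, continuity counter 7.
    example = bytes([0x47, 0x41, 0x00, 0x17]) + bytes(184)
    print(parse_ts_header(example))
```

Grouping packets by PID is the first step of demultiplexing the video and audio elementary streams.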

Harishankar has implemented multiplexing/demultiplexing of H.264 video with AAC audio (including encoding/decoding) along with lip-synch as part of his M.S. Thesis. Extension to H.264 high profile video and AAC/SBR audio (also 5.1 audio channels surround sound) is a worthwhile M.S. research topic.

see file ISPACSKS3

6. H.264/MPEG-4 Part 10 (see item 5 above): Several new profiles/extensions have been developed (e.g., studio and/or digital cinema, 12 bpp intensity resolutions, 4:2:2 and 4:4:4 formats, file and optical disk storage/transport over IP networks etc.).

www.atsc.org advanced television systems committee

But since July 2008, ATSC supports the ITU-T H.264 video codec. The standard is split in two parts:

A/72 part 1: Video System Characteristics of AVC in the ATSC Digital Television System

A/72 part 2: AVC Video Transport Subsystem Characteristics

Thus the new ATSC standard A/72 for digital HDTV supports the H.264 video codec.

4) I am attaching here A/53, A/72 part 1, A/72 part 2 PDF.

5) Below is the link of press release.. which said that A/72 supports H.264...

http://www.atsc.org/communications/press/2008-09-15-A72-publication.php

See G.J. Sullivan, P. Topiwala and A. Luthra, The H.264/AVC advanced video coding standard: Overview and introduction to the fidelity range extensions, SPIE Conf. on applications of digital image processing XXVII, vol. 5558, pp. 53-74, Aug. 2004. This paper discusses the extensions to H.264 including various new profiles (high, High 10, High 4:2:2 and High 4:4:4) and compares the performance with previous standards. G.J. Sullivan, The H.264/MPEG-4 AVC video coding standard and its deployment status, SPIE/VCIP 2005, Vol. 5960, pp. 709-719, Beijing, China, July 2005.

See also Y. Su, Ming-Ting Sun and Kun-Wei Lin, Encoder optimization for H.264/AVC fidelity range extensions, SPIE - VCIP2005, vol. 5960, pp. 2067-2075, Beijing, China, July 2005.

These extensions can lead to several M.S. Theses and possibly Ph.D dissertations.

(7) See K. Yu et al, Practical real-time video codec for mobile devices, IEEE ICME 2003, Vol. III, pp. 509-512, 2003. They have developed a practical low-complexity real-time video codec for mobile devices based on H.263. Explore/develop similar codecs based on H.264 baseline profile. See also the paper S.K. Dai et al, Enhanced intra-prediction algorithm in AVS-M, Proc. ISIMP, pp. 298-301, Oct. 2004. (M is for mobile applications.) AVS is the audio video standard of China.

(8) Implement/evaluate scalability extensions of H.264 (see current JVT documents). JSVM (Joint scalable video model) and SVC (scalable video coding). At present lots of activity on SVC by the JVT. Software is called JSVM. SVC has been finalized.

Do you know of any real-time implementations of SVC decoders (HW/SW) for PCs, STBs, etc.? (July 15, 2009)

You can find open-source software here:

https://sourceforge.net/projects/opensvcdecoder/

The wiki is here for more information:

http://opensvcdecoder.sourceforge.net

The player is Mplayer with a dedicated library for SVC.

We can achieve 720p with 2 enhancements layers in real time.

See the following link for the sequences supported by the decoder. http://opensvcdecoder.sourceforge.net/JVT-AB023.xls.

(9) Design/evaluate/simulate rate control techniques for all profiles/levels in H.264.

(10) Residual color transform (RCT)

In the FRExtensions to H.264/MPEG-4 Part 10, a new addition is the residual color transform. In this technique, the input/output and stored reference pictures are in the RGB domain, while the forward and inverse color transformations are brought inside the encoder and decoder for processing of the residual data only. The color transformations are RGB to YCgCo (orange and green chroma) and the inverse. Residual data implies (I assume) intra or motion compensated prediction errors.

JVT-L025r2.doc

Please see the email from Woo-Shik Kim [email protected] (31-1-2006):

In RCT, the YCgCo transform is applied to the residual signal after intra/inter prediction and before integer transform/quantization at the encoder, and the inverse YCgCo transform is applied to the reconstructed residual signal after dequantization/inverse integer transform and before intra/inter prediction compensation at the decoder.

Since this is not a SVC subject, if you need further discussion you can use the JVT reflector or 4:4:4 AhG reflector.

The e-mail address is the same for both ([email protected]), and [4:4:4] is added to the subject for the 4:4:4 AhG (ad hoc group) reflector.

Best Regards,

Woo-Shik Kim
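The RGB to YCgCo step described in the email can be sketched with the lifting-based, exactly reversible variant often called YCgCo-R. Whether this matches the exact arithmetic specified in JVT-L025 is an assumption; the sketch only shows that an integer colour transform can be applied to residual samples (which may be negative) and inverted without loss. The function names are illustrative.

```python
def rgb_to_ycgco_r(r, g, b):
    """Forward lifting transform on integer samples (residuals may be negative)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co


def ycgco_r_to_rgb(y, cg, co):
    """Inverse lifting transform; undoes the forward steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


if __name__ == "__main__":
    residual = (-7, 12, 3)  # hypothetical R, G, B prediction residuals
    coded = rgb_to_ycgco_r(*residual)
    print(coded, ycgco_r_to_rgb(*coded))
```

Because each lifting step is undone exactly, the round trip is lossless for any integers, which matters when the transform sits inside the prediction loop.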

(11) Advanced 4:4:4 profile in H.264/MPEG-4 Part 10

Intra residual lossless DPCM coding is proposed in the Advanced 4:4:4 profile of H.264. Implement this and compare with the residual color transform. See JVT-Q035 (17-21 Oct. 2005), Complexity of the proposed lossless intra for 4:4:4.

Y.L. Lee [email protected] Sejong Univ., Korea.
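The general idea of DPCM on residual samples can be sketched as follows. This is a generic illustration of differencing along one row of residuals, not the specific scheme of JVT-Q035; the choice of the previous sample in the row as predictor and the function names are assumptions.

```python
def dpcm_encode(residuals):
    """Replace each sample by its difference from the previous sample in the row."""
    diffs = []
    previous = 0
    for value in residuals:
        diffs.append(value - previous)
        previous = value
    return diffs


def dpcm_decode(diffs):
    """Rebuild the original samples by accumulating the differences."""
    samples = []
    previous = 0
    for diff in diffs:
        previous += diff
        samples.append(previous)
    return samples


if __name__ == "__main__":
    row = [5, 7, 7, 4]
    coded = dpcm_encode(row)
    print(coded, dpcm_decode(coded))
```

Since no quantization is involved, decoding reproduces the input exactly, which is the property a lossless mode needs.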

The most important thing to know about the High 4:4:4 profile is that we have removed it (or are in the process of removing it) from the standard. We are working on a new Advanced 4:4:4 profile. So the prior High 4:4:4 profile should be considered only a historical curiosity for purposes of academic study now.

In answer to your specific question, the primary other difference in the High 4:4:4 profile, in addition to support of the 4:4:4 chroma sampling grid in a straightforward fashion similar to what was done to support 4:2:2 versus 4:2:0, was the support of a more efficient lossless coding mode, as controlled by a flag called qpprime_y_zero_transform_bypass_flag. This flag, when equal to 1, invokes a special lossless mode when the QP' value for the nominal "Y" component (which would be the G component for RGB video) is equal to 0. In the special lossless mode, the transform is bypassed, and the differences are coded directly in the spatial domain using the entropy coding processes that are otherwise ordinarily applied to transform coefficients.

Best Regards,

-Gary Sullivan

(12) residual color prediction (H.264)

JVT-Q308r2.doc


JVT-R046.doc

Please open these documents. These files can be sources for research projects.

WAVELETS

(1) P. Tsai, Y-C. Hu and C.C. Chang, A progressive secret reveal system based on SPIHT image transmission, SP: Image communication, vol. 19, pp. 285-297, March 2004.

The secret image is directly embedded in a SPIHT encoded cover image (monochrome). Sally has extended this to color (RGB to YCbCr). She has also investigated robustness to various attacks. Sally's thesis and software are in our lab.

PROPOSED RESEARCH

Tsai, Hu and Chang suggest adding encryption schemes (DES, RSA) to encrypt the secret image before embedding. The objective is to enhance the security by steganography and encryption. INVESTIGATE THIS.

Ramaswamy has completed his thesis using SHA, DES, RSA for encrypting H.264 video (verify integrity, identify sender/content creator etc). Both Puthussery and Ramaswamy have all the software (operational). This research topic is viable and relevant.

N. Ramaswamy, Digital signature in H.264/AVC MPEG4 Part 10, M.S. Thesis, UTA, Aug. 2004.

S. Puthussery, A progressive secret reveal system for color images, M.S. Thesis, UTA, Aug. 2004.

REFERENCES

1. I. Avcibas, N. Memon and B. Sankur, Steganalysis using image quality metrics, IEEE Trans. IP, vol. 12, pp. 221-229, Feb. 2003.

2. I. Avcibas, B. Sankur and K. Sayood, Statistical evaluation of image quality measures, J. of Electronic Imaging, vol. 11, pp. 206-223, April 2002.

3. A.M. Eskicioglu and P.S. Fisher, Image quality measures and their performance, IEEE Trans. Commun., vol.43, pp. 2959-2965, Dec. 1995.

4. A.M. Eskicioglu, Application of multidimensional quality measures to reconstructed medical images, Opt.Eng., vol. 35, pp. 778-785, March 1996.

5. B. Lambrecht, Ed., Special issue on image and video quality metrics, Signal Process., vol. 70, Oct. 1998.

US PATENTS

www.uspto.gov (US Patent and Trademark Office). While the claims made in these patents can be simulated, no products/devices based on these patents can be used for commercial purposes (proper licensing, patent release etc. must be obtained).

1. US Patent 4,999,705, dated March 12, 1991, A. Puri, Three dimensional motion compensated video coding, Assignee: AT&T Bell Labs, Murray Hill, NJ.

2. US Patent 4,958,226, dated Sept. 18, 1990, B.G. Haskell and A. Puri, Conditional motion compensated interpolation of digital motion video, Assignee: AT&T Bell Labs, Murray Hill, NJ.

Patent # 1 discusses adaptive 2D or 3D DCT of MxN or MxNxP blocks and a special zig-zag-zog scan for the 3D DCT case. Here MxN is the spatial block and P is in the temporal domain. It has a number of interesting features and claims improved compression. It follows the GOP concept (I, P, B pictures) as in MPEG-1, 2, 4, with a variable number of B pictures or even a variable GOP size. This patent can be the basis of a number of research topics, especially at the M.S. level.

3. US Patent 5,309,232, May 3, 1994, J. Hartung et al, Dynamic bit allocation for three-dimensional subband video coding, Assignee: AT&T Bell Labs, Murray Hill, NJ. (See also S-J Choi and J.W. Woods, Motion-compensated 3-D subband coding of video, IEEE Trans. IP, vol. 8, pp. 155-167, Jan. 1998.)

This patent has a number of interesting features and can lead to several research topics, especially at the M.S. level.

Steganography

FRACTALS

(1) Fractal lossless image coding. See Proc. of EC-VIP-MC 2003 in our lab. Extend this approach to color images, video etc.

(2) Explore Fractal/DWT (Similar to Fractal/DCT) in image/video coding.

(3) Explore fractal/SVD in image/video coding

(4) FRACTAL BASED IMAGE RECOGNITION

See A. Sloan, Through a glass, Darkly: Image recognition with poor quality imagery, Advanced Imaging, vol. 18, pp. 8-9 and 37, March 2003. ([email protected]) Some research areas are suggested.

AUDIO CODERS

(1) D. Yang, H. Ai, C. Kyriakakis and C.-C.J. Kuo, High-fidelity multi channel audio coding with Karhunen-Loeve transform, IEEE Transactions on Speech and Audio Processing, vol. 11, pp. 365-380, July 2003.

Review this paper and related ones cited in the references.

KLT is applied to advanced audio coding (AAC) adopted in MPEG-2. Can this technique be extended to other multi channel audio coding algorithms?

(2) Please search Google for HE-AAC or AAC plus (HE is high efficiency). Also go to www.codingtechnologies.com. This is an improved multi channel audio coder adopted in MPEG-4 and also by various companies. The same approach in MP3 is called MP3 pro. It is both backward and forward compatible with AAC. (See M. Wolters, K. Kjorling and H. Purnhagen, A closer look into MPEG-4 high efficiency AAC, 115th convention AES, New York, NY, 10-13 Oct. 2003, www.aes.org, and P. Ekstrand, Bandwidth extension of audio signals by spectral band replication, Proc. 1st IEEE Benelux Workshop on Model Based Processing and Coding of Audio (MPCA-2002), Leuven, Belgium, Nov. 2002.)

Can the KLT approach described in item (1) above be applied to the AAC part of HE-AAC (the other part is SBR, spectral band replication) to further improve the coding efficiency?


(3) Encode H.264 High profile (FRExtensions) video and HE-AAC audio, multiplex the two coded bit streams using MPEG-2 or MPEG-4 systems (or any other), followed by inverse operations at the receiver (demultiplex into video and audio coded bit streams and decode). ISMA (internet streaming media alliance) has adopted H.264 along with HE-AAC for streaming media over internet. Access www.isma.tv

(4) Implement VC-1 (based on WMV9) and WMA (Microsoft). Encode both video and audio, multiplex the two coded bit streams followed by demultiplexing, decoding and maintaining lip synch between video and audio.
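The multiplex/demultiplex/lip-synch chain in items (3) and (4) can be prototyped before any real codec is involved. The sketch below only interleaves packets by presentation time stamp (PTS); it is not MPEG-2 or MPEG-4 systems syntax. The `Packet` class, the frame rates and the payloads are hypothetical placeholders.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Packet:
    pts: float                                # presentation time in seconds
    stream: str = field(compare=False)        # "video" or "audio"
    payload: bytes = field(compare=False)     # stand-in for coded data


def multiplex(video, audio):
    """Interleave two PTS-sorted packet lists into one PTS-sorted stream."""
    return list(heapq.merge(video, audio))


def demultiplex(muxed):
    """Split the interleaved stream back into its video and audio packets."""
    video = [p for p in muxed if p.stream == "video"]
    audio = [p for p in muxed if p.stream == "audio"]
    return video, audio


def worst_av_offset(video, audio):
    """Largest gap between a video frame and the nearest audio frame start."""
    return max(min(abs(v.pts - a.pts) for a in audio) for v in video)


# Hypothetical rates: 25 frames/s video, 1024-sample audio frames at 48 kHz.
video = [Packet(i / 25, "video", b"V%d" % i) for i in range(50)]
audio = [Packet(j * 1024 / 48000, "audio", b"A%d" % j) for j in range(94)]

muxed = multiplex(video, audio)
video_out, audio_out = demultiplex(muxed)
print("worst offset (s):", worst_av_offset(video_out, audio_out))
```

With real bit streams the time stamps would come from the systems layer, and the decoder would delay whichever stream is ahead so that the offset stays within a lip-synch tolerance chosen for the project.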

I have a hard copy of the PP slides on New technologies in MPEG audio presented by Dr. Quackenbush ([email protected]), MPEG audio research group chair, Audio Research Labs, at the one-day workshop on MPEG international video and audio standards, HKUST, Hong Kong, on 22 Jan. 2005. Several research projects can be explored based on these slides, which also include some slides on spatial audio coding. Please see the papers below.

Spatial audio coding

Y.J. Lee et al, Design and development of T-DMB multichannel audio service system based on spatial audio coding, ETRI Journal, vol. 31, #4, pp. 365-375, Aug. 2009. ([email protected])

J. Breebaart et al, MPEG spatial audio coding/MPEG Surround: Overview and current status, Proc. 119th AES Convention, New York, NY, USA, Oct. 2005, Preprint 6447.

ISO/IEC JTC1/SC29/WG11 (MPEG), Procedures for the evaluation of spatial audio coding systems, Document N6691, Redmond, WA, July 2004.

(4) See the thesis from NTU, Development of AAC-Codec for streaming in wireless mobile applications (E. Kurniawati), 2004. I have this thesis. This research develops various techniques for reducing the implementation complexity while maintaining the quality desirable for mobile communications. One concept is using the odd DFT for both psychoacoustic analysis and the MDCT. Extend these techniques to HE-AAC audio (see item (2) above), MPEG-1 Audio and MPEG-2 Audio.

(5) See the paper A. Ehret et al, Audio coding technology of ExAC, Proc. ISIMP 2004, pp. 290-293, Hong Kong, Oct. 2004. This paper discusses a new low bit rate audio coding technique based on Enhanced Audio Coding (EAC) and SBR (spectral band replication). Multiplex this audio coder with the AVS video coder, demultiplex into audio/video coded bitstreams and decode them to reconstruct the video/audio, followed by lip synch. Consider several audio/video levels/profiles (bit rates, spatial/temporal resolutions, mono/stereo/5.1 audio channels etc). I have several files on SBR. (See Harishankar's (UTA-EE Dept.) thesis on H.264 video-AAC audio encode/multiplex/demultiplex/decode/lip synch.)

(See M. Bosi and R.E. Goldberg, Introduction to digital audio coding standards, Norwell, MA: Kluwer, 2002.)

(6) In HE-AAC or AAC-plus, reduce complexity (also for lossless audio) by using the lifting scheme for the MDCT/MDST. See Yoshi's dissertation (UTA-EE Dept.).

DTS (Digital Theater System) audio: multichannel surround sound audio used in theaters etc.

http://en.wikipedia.org/wiki/Digital_Theater_System

There is one paper on DTS from an Audio Engineering Society convention: http://www.aes.org/e-lib/browse.cfm?elib=7486

This paper can be requested through interlibrary loan at the UTA library.

AES E-Library: DTS Coherent Acoustics Delivering High-Quality Multichannel Sound to the Consumer by Smyth, S. M. F.; Smith, W. P.; Smyth, M. H. C.; Yan, M.; Jung, T.

With low bit-rate multichannel coders becoming more prominent in the home theater market, attention is beginning to focus on the applicability of the multichannel format for music reproduction in the home. What differences are there between discrete motion picture sound tracks and multichannel music? How does this affect the performance of low bit-rate coders? What are the limitations of the current CD and new DVD platforms as multitrack music players? What impact will these limitations have on the development of a new standard format for the release of multichannel music? This paper analyzes the potential of DVDs and CDs as high-quality multitrack audio media, focusing in particular on the use of DTS Coherent Acoustics as a means of extending the multichannel potential of these platforms. Designed to achieve audio transparency, DTS Coherent Acoustics operates at bit rates of up to 4.096 Mb/s and supports up to 8 discrete channels of audio, at up to 24-bit sample resolution and at sampling rates up to 96 kHz per channel. Within these constraints DVD can deliver a new multichannel music experience to the home. On CDs and laser discs, DTS has already demonstrated a 6-channel capability, at 20-bit resolution and at a sampling rate of 44.1 kHz.

Paper Number: 4293 AES Convention: 100 (May 1996)

Affiliation: DTS Technology LP, Westlake Village, CA

TRANSCODING

(1) Transcoding AVS China to H.264 and vice versa: Does this have any significance or relevance? This transcoding can be based on various profiles/levels at different bit rates/quality levels and spatial/temporal resolutions. Similarly, consider transcoding between AVS China and WMV-9 (Microsoft video coder) and between H.264 and WMV-9, at all levels and profiles.

See the paper H. Kalva, B. Petljanski and B. Furht, Complexity reduction tools for MPEG-2 to H.264 video transcoding, WSEAS Trans. on Info. Science & Applications, vol. 2, pp. 295-300, March 2005, (This has several interesting papers listed in references.).

See also Y. Su et al, Efficient MPEG-2 to H.264/AVC intra transcoding in transform-domain, IEEE ISCAS 2005. (CD in our lab) Also from IEEE XPLORE

Z. Wang et al, A fast intra mode decision algorithm for AVS to H.264 transcoding, IEEE ICME, pp.61-64, July 2006.

J-B. Lee and H. Kalva, An efficient algorithm for VC-1 to H.264 video transcoding in progressive compression, IEEE ICME, pp. 53-56, 2006 (see several other papers on transcoding in the references).

Jing Lei Shi, Li-Wei Guo, Hui Xu, Fu-Rong Zhang, Jian Lou and Lu Yu, An AVS-to-MPEG2 transcoding system, Proc. 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing (ISIMP 2004), Hong Kong, Oct. 22-24, 2004 (oral presentation; CD in our lab).

See also the papers

I. Ahmad et al, Video transcoding: An overview of various techniques and research issues, IEEE Trans. on Multimedia, vol. 7, pp. 793-804, Oct. 2005.

J. Xin, C-W. Lin and M-T. Sun, Digital video transcoding, Proc. IEEE, vol. 93, pp.84-97, Jan. 2005.

T.D. Nguyen et al, Efficient MPEG-4 to H.264/AVC transcoding with spatial downscaling, ETRI Journal, vol. 29, pp. 826-828, Dec. 2007. http://etrij.etri.re.kr

S. Moiron et al, Video transcoding from H.264/AVC to MPEG-2 with reduced computational complexity, Signal Processing: Image Communication, vol. 24, issue 8, pp. 637-650, Sept. 2009 (several valuable references).

This paper and the references therein can be the basis for various transcoding schemes among different standards MPEG-2, H.264/AVC, AVS China, SMPTE VC-1, DIRAC (BBC) etc.

A freeware MPEG-2 to AVC transcoder, pspvideo9, is available for the Sony PlayStation player. Access

http://mpac.ee.ntu.edu.tw/Introduction/transcoding.php

(2) Nvidia (www.nvidia.com) has developed a software decoder to transcode MPEG-2 content for a WMV 9 player. Develop a software decoder to transcode WMV 9 content for an MPEG-2 player. This may require a release from Microsoft. Please see J. Xin, C-W. Lin and M-T. Sun, Digital video transcoding, Proc. IEEE, vol. 93, pp. 84-97, Jan. 2005.

Vidhya Vijayakumar worked on H.264 to VC-1 transcoding as her M.S. thesis (August 2010). I have her thesis proposal, PP slides, thesis, journal paper etc.

Low complexity H.264 to VC-1 transcoder

Vidhya Vijayakumar has implemented this as her M.S. thesis (summer 2010). This deals with the baseline profile of H.264 and the simple profile of VC-1. Extend this to the other profiles of H.264 and VC-1. I have her thesis.

WMV9/VC1 (MICROSOFT/SMPTE)

(1) See S. Srinivasan et al, Windows media video 9: Overview and applications, Signal Processing: Image Communication, vol. 19 , pp. 851-875, Oct.2004.

S. Srinivasan and S.L. Regunathan, An overview of VC-1, SPIE/VCIP 2005, vol. 5960, pp. 720-728, Beijing, China, July 2005. (I have several files related to WMV-9/VC-1.) These papers describe the state-of-the-art video coding developed by Microsoft. It has been standardized by SMPTE (named VC-1) and is being adopted/considered for Blu-ray Disc. Compare its performance with H.264/AVC as described in G. J. Sullivan, P. Topiwala and A. Luthra, The H.264/AVC advanced video coding standard: Overview and introduction to the fidelity range extensions, SPIE Conf. on Applications of Digital Image Processing XXVII, vol. 5558, pp. 55-74, Aug. 2004 (carry out an analysis similar to the one in this paper). Carry out a comparative performance analysis of VC-1 (Microsoft WMV9), H.264 FRExt and AVS China (see items 19-22 below). See also

A.E. Bell and C.J. Cookson, Next generation DVD: application requirements and technology, Signal Processing: Image Communication, vol. 19, pp.909-920, Oct. 2004.

http://blu-raydisc-founders.com

SMPTE VC-2 VIDEO COMPRESSION (standardized)

VC-2 is a wavelet based intra frame video compression system aimed at professional applications that provide efficient coding at many resolutions including various flavors of CIF, SDTV and HDTV. It is based on intra frame version of DIRAC called DIRAC PRO. Implement this and compare its performance with intra only of H.264 and other image coding standards such as JPEG-2000. I have the documents on SMPTE VC-2. Also several files on DIRAC.

(2) Compare the performance of WMV9 (VC-1), H.264 with FRExt and AVS of China. Consider complexity, profiles/levels, error resilience, bit rates, PSNR/subjective quality and other parameters.

(3) WMV9 (VC-1), H.264 with FRExtensions and AVS of China use different (although similar) 8x8 integer DCTs. Compare their coding gains, complexity (fixed point), ringing artifacts and related issues.
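One way to start the coding-gain comparison is the sketch below. It computes the transform coding gain for a first-order Markov (AR(1)) source with correlation 0.95, first for the floating-point 8-point DCT-II and then for a row-normalized 8x8 integer matrix. The integer matrix is the one commonly quoted for H.264 FRExt and should be checked against the standard; the AR(1) model and the function names are assumptions. The WMV9 and AVS matrices can be plugged in the same way.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c


def normalize_rows(t):
    """Scale each row to unit norm (the scaling a codec folds into quantization)."""
    t = np.asarray(t, dtype=np.float64)
    return t / np.linalg.norm(t, axis=1, keepdims=True)


def ar1_covariance(n, rho):
    """Covariance matrix of a unit-variance AR(1) source."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def coding_gain_db(t, rho=0.95):
    """Coding gain (arithmetic over geometric mean of coefficient variances)."""
    var = np.diag(t @ ar1_covariance(t.shape[0], rho) @ t.T)
    return 10.0 * np.log10(var.mean() / np.exp(np.log(var).mean()))


# 8x8 integer matrix as commonly quoted for H.264 FRExt (assumption: verify
# against the standard before relying on it).
H264_8X8 = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [12, 10, 6, 3, -3, -6, -10, -12],
    [8, 4, -4, -8, -8, -4, 4, 8],
    [10, -3, -12, -6, 6, 12, 3, -10],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -12, 3, 10, -10, -3, 12, -6],
    [4, -8, 8, -4, -4, 8, -8, 4],
    [3, -6, 10, -12, 12, -10, 6, -3],
]

dct_gain = coding_gain_db(dct_matrix(8))
int_gain = coding_gain_db(normalize_rows(H264_8X8))
print(f"DCT-II: {dct_gain:.4f} dB, integer transform: {int_gain:.4f} dB")
```

The gain formula used here holds only for orthonormal transforms, which is why the integer rows are normalized first. Fixed-point complexity and ringing artifacts need separate experiments on real video.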

WMV9 OF MICROSOFT and VC1 OF SMPTE

Microsoft's Windows Media Video 9 was a proprietary codec until March 2004.

Microsoft has presented a document called Proposed SMPTE Standard for Television: VC-9 compressed video Bit-stream format and decoding Process to the SMPTE for the standardization of Windows Media Video 9 technology. The name of the standard was later changed to VC-1 and WMV9 is now just a software implementation of VC-1. I have SMPTE documents on VC-1 and on conformance bit stream.

The right place is an engineering committee of SMPTE, specifically C24-VC1, whose email reflector is:

[email protected]

(4) Repeat item 25 for 4x4 integer DCTs used in H.264 and WMV9.

See Y-J. Chung, Y-C. Huang and J-L. Wu, An efficient algorithm for splitting an 8x8 DCT into four 4x4 modified DCTs used in AVC/H.264, 5th EURASIP Conf., EC-SIP-M 2005, pp. 311-316, Smolenice, Slovakia, June-July 2005.

Can these four 4x4 modified DCTs used in AVC/H.264 be combined to get 8x8 DCT?

(5) Consider using 4x8 and 8x4 integer DCTs in H.264FRExtensions besides 4x4 and 8x8 integer DCTs. (WMV9 uses all these four transforms). Develop encoder/decoder based on these four transforms and evaluate any gains in coding.
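A first experiment with mixed block sizes can be set up as in the sketch below, which applies separable transforms of size 4x4, 4x8, 8x4 and 8x8 to the same 8x8 block and counts coefficients above a threshold. The floating-point DCT-II is used as a stand-in for the integer transforms of H.264 and WMV9, and the test block, threshold and function names are assumptions. Block shapes are written as (rows, columns).

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix, used here as a stand-in for integer DCTs."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c


T = {4: dct_matrix(4), 8: dct_matrix(8)}


def forward(block, t_vert, t_horz):
    """Separable 2-D transform: t_vert acts on columns, t_horz on rows."""
    return t_vert @ block @ t_horz.T


def inverse(coeffs, t_vert, t_horz):
    """Inverse of forward() for orthonormal t_vert and t_horz."""
    return t_vert.T @ coeffs @ t_horz


def tile_coefficients(block8, rows, cols):
    """Transform an 8x8 block tile by tile with (rows x cols) transforms."""
    out = np.empty_like(block8)
    for r in range(0, 8, rows):
        for c in range(0, 8, cols):
            tile = block8[r:r + rows, c:c + cols]
            out[r:r + rows, c:c + cols] = forward(tile, T[rows], T[cols])
    return out


# Hypothetical test block: a vertical edge plus a little noise.
rng = np.random.default_rng(1)
block8 = np.where(np.arange(8)[None, :] < 4, 20.0, 200.0) * np.ones((8, 1))
block8 = block8 + rng.standard_normal((8, 8))

threshold = 4.0  # arbitrary stand-in for a quantizer dead zone
shapes = [(8, 8), (8, 4), (4, 8), (4, 4)]
significant = {
    s: int(np.sum(np.abs(tile_coefficients(block8, *s)) > threshold))
    for s in shapes
}
print("significant coefficients per transform shape:", significant)
```

In an encoder the shape would be chosen per block by a rate-distortion criterion rather than by a coefficient count; the count is only a quick proxy for the bits needed.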

LARGE SIZE TRANSFORMS

W.K. Cham, Simple order-16 integer transform for video coding, IEEE ICIP 2010, Hong Kong, Sept. 2010.

R. Joshi, Y.A. Reznik and M. Karczewicz, Efficient large size transforms for high-performance video coding, SPIE Optics + Photonics, vol. 7798, paper 7798-31, San Diego, CA, Aug. 2010.

A.T. Hinds, Design of high-performance fixed-point transforms using the common factor method, SPIE Optics + Photonics, vol. 7798, paper 7798-29, San Diego, CA, Aug. 2010.

G.J. Sullivan, Standardization of IDCT approximation behavior for video compression: the history and the new MPEG-C parts 1 and 2 standards, SPIE vol. 6696, paper 35, Aug.2007.

Evaluate the performance of these transforms. Extend to 32x32 and 64x64. Explore their applications.

VARIABLE BLOCK-SIZE SPATIALLY VARYING TRANSFORMS

Papers by C. Zhang, K. Ugur, J. Lainema and M. Gabbouj

These papers compare the spatially varying transform with H.264 (fixed-size and fixed-location transforms). Many research projects can be based on these two papers. Also access

Cixun Zhang, Kemal Ugur, Jani Lainema, Antti Hallapuro and Moncef Gabbouj, Video coding using spatially varying transform, IEEE Trans. CSVT, vol. 21, no. 2, pp. 237-242, Feb. 2011.

Ho June Leu, Seong-Dae Kim and Wook-Joong Kim, Statistical modeling of inter-frame prediction error and its adaptive transform, IEEE Trans. CSVT, vol. 21, no. 4, pp. 519-523, April 2011.

Review this for possible research topics.

DIRECTIONAL DISCRETE COSINE TRANSFORM

B. Zeng and J. Fu, Directional discrete cosine transforms - A new framework for image coding, IEEE Trans. CSVT, vol. 18, pp. 305-313, March 2008.

B. Zeng and J. Fu, A comparative study of compensation techniques in directional DCTs, IEEE ISCAS 2007, pp. 521-524, 2007.

B. Zeng and J. Fu, Directional discrete cosine transforms: A theoretical analysis, IEEE ICASSP 2007, vol. 1, pp. I-1105 to I-1108, 2007.

S. Zhu, S. K. A. Yeung, and B. Zeng, R-D Performance Upper Bound of Transform Coding for 2-D Directional Sources, IEEE SPL, vol.16, Issue-10, pp. 861-864, Oct. 2009.

S. K. A. Yeung, S. Zhu, and B. Zeng, Partial Video Encryption Based on Alternating Transforms, IEEE SPL, vol.16, Issue-10, pp. 893-896, Oct.2009.

Go through these papers in detail. Further research is suggested in Section VIII, Conclusions and future works. Explore this.

Design/implement/simulate digital rights management (DRM) for H.264 codecs (video streaming/VOD/DVD etc). See C.-C. Jay Kuo's tutorial on DRM, ISIMP 2004, Oct. 2004. Also review the paper on WMV9 by Microsoft.

AVS China

(1) See the paper W. Gao et al, AVS - The Chinese next-generation video coding standard, NAB 2004, Las Vegas, NV, April 2004.

(2) See Special issue on AVS and its applications Signal Processing: Image Communication, vol. 24, pp. 245-344, April 2009.

(3) (I have several files related to AVS China.) This deals with the Audio Video coding Standard of China, which is similar to H.264. It also claims high coding efficiency compared to MPEG-2. There are also several papers in the special session on AVS in ISIMP 2004 held in Hong Kong (Oct. 2004). The MPL has the proceedings on CD. One paper deals with an AVS-to-MPEG2 transcoding system. It is designed for transcoding from an AVS coded bitstream to an MPEG-2 coded bitstream applicable to MPEG-2 decoders. Develop similar transcoding schemes between H.264 and MPEG-2.

I have also projects/papers by Sahana, Swaminathan and others.

latest version AVS China software

RM52k_r2 for AVS Part2 Jizhun profile at avs ftp://incoming/dropbox/video_software/P2_software/JiZhunProfile/rm52k_r2.zip.

T. Qian et al, Transform domain transcoding from MPEG-2 to H.264 with interpolation drift free compensation, IEEE Trans. CSVT, vol. 16, pp. 523-534, April 2006.

Research topic VC-1 to AVS China video transcoding with reduced complexity

Sandro Moiron, Sérgio Faria, António Navarro, Vitor Silva and Pedro A. Amado Assunção, Video transcoding from H.264/AVC to MPEG-2 with reduced computational complexity, Signal Processing: Image Communication, vol. 24, issue 8, pp. 637-650, Sept. 2009 (ScienceDirect).

J-B. Lee and H. Kalva, An efficient algorithm for VC-1 to H.264 video transcoding in progressive compression, IEEE ICME pp.53-56, 2006. (see the references cited in this paper)

Rochelle Pereira has completed her thesis, MPEG-2 Main Profile to H.264 Main Profile transcoder, EE Dept., UTA, Dec. 2005. Her research opens up a number of related thesis topics. I have her M.S. thesis, PP slides and software.

1. MPEG-2 various profiles to H.264 various profiles transcoders and the reverse

2. An immediate and relevant topic is H.264 Main Profile to MPEG-2 Main Profile transcoder.

H.264 and MPEG-2 (consider all levels and profiles). See also L. Yu et al, Overview of AVS-Video: Tools, performance and complexity, SPIE VCIP 2005, Beijing, China, July 2005. I have PP slides related to AVS China (files ispacstutorial2 and ispacsks1).

(2) See the paper Enhanced intra-prediction algorithm in AVS-M. There are also several papers in the special session on AVS in ISIMP 2004 held in Hong Kong (Oct. 2004). MPL has the proceedings on CD. Propose and evaluate similar techniques for H.264-M (here M stands for mobile; this is not a designation by ISO/IEC/ITU).

(3) See the paper Architecture of AVS hardware decoding system, develop similar architecture for H.264 decoder at several levels/profiles.

AVS CHINA WEB SITE

www.avs.org.cn/en

AVS_P1_Reference_software_RM1.0.rar

AVS China systems software.

Babu Ashwathappa, Rate-distortion optimization using structural information in H.264 strictly intra-frame encoder, EE5359, Fall 2009. Submitted to JEI 10117 (July 2010).

Extend this to H.264 inter frame encoders.

MULTIVIEW VIDEO

(1) 3D AV CODING / FREE VIEWPOINT VIDEO

Free-viewpoint video (FVV, almost free navigation), omnidirectional video (look-around views) and MPEG-4 2D/3D scene and object models are some of the research areas proposed by Jens-Rainer Ohm (file vicaohm), [email protected]. These can lead to innovative research topics. See also the MPEG exploration on wavelet video coding.

MISCELLANEOUS

wg1n3829.doc

Joint Video Team (JVT) of ISO/IEC MPEG & ITU -T VCEG

(ISO/IEC JTC1/SC29/WG11 and ITU -T SG16 Q.6)

17th Meeting: Nice, France, 17-21 October, 2005

Document: JVT-Q308r2

Filename: JVT-Q308r2.doc

Title: Core Experiment on Residual Color Transform (CE-8) for 4:4:4 Video

Status: Output Document of JVT

Purpose: Proposal

Author(s) or Contact(s): Hyun Mun Kim, Samsung AIT, P.O. Box 111, Suwon, Korea

Tel: +82-31-280-9204

Email: [email protected]

Source: Samsung AIT

_____________________________

1 Participants

Coordinator: Hyun Mun Kim, Samsung AIT.

Participant, Contact, Email

Samsung AIT, Hyun Mun Kim, [email protected]

Sharp Labs of America, Shijun Sun, [email protected]

Thomson Inc., Haoping Yu, [email protected]

Microsoft, Gary Sullivan, [email protected]

HHI, Thomas Wiegand, [email protected]

2 Functionality addressed

In this core experiment, the technique for improving the 4:4:4 coding performance using the in-loop color transform will be tested.

3 Description of approach

In JVT-Q059 [1], we proposed improved coding method using advanced residual color transform to increase the coding efficiency of the current FRExt 4:4:4 video coding technology. Here, we modified our previous method for further improvements. The updated methods are described in the following sections. The updated proposed method is integrated into the current Joint Working Draft of Amendment 2 and will be compared to the current Joint Working Draft of Amendment 2.

3.1 Residual color prediction

The following Figure 1 shows a block diagram of the proposed 4:4:4 encoding method using residual color prediction (RCP) method. After intra/inter prediction the residual samples of B and R components are further predicted from residual samples

(1) Y-C. Hu, Multiple images embedding scheme based on moment preserving block truncation coding, Real-Time Imaging (under review).

Embedding multiple secret images in a grayscale cover image using BTC (compression of the secret images) followed by DES encryption is proposed. An extensive literature survey is very helpful (read review papers).

Extend this to color images (RGB to YCbCr). Consider embedding in Y, Cb or Cr in different combinations. Follow this by DES encryption (robustness). Consider compression schemes other than BTC for the secret images (see Figs. 3 and 4). Consider schemes other than LSB substitution for embedding the secret images. Evaluate capacity, robustness, complexity etc. (Review the theses by Ramaswamy and Sally. Their software will be very helpful.) Consider embedding secret images in JPEG or JPEG2000 (see the conclusions of the above paper).
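The two building blocks named above, moment-preserving BTC and LSB substitution, can be tried out with the sketch below. It is an illustration under stated assumptions rather than the scheme of the cited paper: it codes a single 4x4 block, embeds only the BTC bitmap (not the two reconstruction levels), and omits the DES encryption step. All function names are hypothetical.

```python
import numpy as np


def btc_encode(block):
    """Moment-preserving BTC: a bitmap plus two levels keeping mean and variance."""
    block = block.astype(np.float64)
    n = block.size
    mean, std = block.mean(), block.std()
    bitmap = block >= mean
    q = int(bitmap.sum())                 # pixels at or above the mean
    if q == n:                            # flat block: a single level suffices
        return bitmap, mean, mean
    low = mean - std * np.sqrt(q / (n - q))
    high = mean + std * np.sqrt((n - q) / q)
    return bitmap, low, high


def btc_decode(bitmap, low, high):
    """Rebuild the block from the bitmap and the two levels."""
    return np.where(bitmap, high, low)


def lsb_embed(cover, bits):
    """Write bits into the least significant bits of the first cover pixels."""
    flat = cover.flatten()                # flatten() returns a copy
    flat[:bits.size] = (flat[:bits.size] & np.uint8(0xFE)) | bits
    return flat.reshape(cover.shape)


def lsb_extract(stego, count):
    """Read back the first count embedded bits."""
    return stego.flatten()[:count] & np.uint8(1)


# Hypothetical data: a random 4x4 secret block and an 8x8 8-bit cover image.
rng = np.random.default_rng(2)
secret_block = rng.integers(0, 256, size=(4, 4))
bitmap, low, high = btc_encode(secret_block)
decoded = btc_decode(bitmap, low, high)

cover = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
bits = bitmap.astype(np.uint8).ravel()
stego = lsb_embed(cover, bits)
recovered = lsb_extract(stego, bits.size)
print("bitmap recovered:", bool(np.array_equal(recovered, bits)))
```

For the color extension, the same functions could be applied to the Y, Cb or Cr plane after conversion, and capacity can be raised by using more than one LSB per pixel at the cost of visibility.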

(2) MPEG is considering the need for development of a new voluntary standard specifying fixed point approximation to ideal IDCT (also for DCT) 8x8 (see ISO/IEC/SC29/WG11/N6915, Hong Kong , Jan 2005)

This document provides all details including evaluation criteria. Develop this 8x8 INTDCT/INTIDCT that can meet the evaluation criteria and integrate with MPEG codecs.

(3) One-day workshop on MPEG international video and audio standards, 22 Jan. 2005, HKUST, Hong Kong (right after the 71st MPEG meeting at HKUST). Several thought-provoking R&D topics were suggested in this workshop (lecture notes in our lab). Some of these are

H.264 Scalable video coding

Multiview coding 3DAV

Scalable audio coding

http://mpeg.chiariglione.org

Spatial audio coding

http://digital-media-project.org

Joint speech and music coding

H.264 Scalable video coding (new project Jan.2005) temporal/SNR/spatial

Topics in this workshop are as follows:

1. The MPEG Story

2. Past, Present and Future of MPEG video

3. New Technologies in MPEG audio

4. Recent trends in multimedia storage: HD-DVD

5. Recent trends in multimedia IC and set-top box

6. Panel discussion: Where is MPEG going?

7. The China AVS story

8. AVS 1.0 and HDTV for 2008 Olympic Games AVS-M and 3G

9. Hong Kong ITC Consumer Electronics R&D Center under ASTRI

10. Recent trends of Digital video broadcast and HDTV in Greater China

11. Recent trends of IC industry in Greater China

12. Recent trends of mobile multimedia services in Greater China

13. Panel discussion: Challenges and opportunities of AVS and MPEG in the telecommunication and consumer electronics market in Greater China

For encoded/decoded audio quality evaluation refer to

ITU-R BS.1387-1, Method for objective measurements of perceived audio quality (I have the document).

FastVDO's H.264 High Profile decoder

http://www.fastvdo.com/fastvdo264.html

Due to demand for HD test data, FastVDO is pleased to provide a consolidated 10-bit HD data set (mostly 1080p) for the research community. Content includes a rich set of both film and non-film data. The data is from a variety of sources, which retain data rights; usage rights are limited to testing, research, standards development, and technical presentation.

Please check the site below, where some preliminary information on this is available.

www.fastvdo.com/hddata

Included are:

1. some brief descriptions, including scene selections, provided by Dolby and FastVDO when this data was first made available to this community (JVT-J039 and JVT-J042), and

2. some instructions for obtaining this data (www.fastvdo.com/hddata/GetHDData.html).

More information will be added shortly.

Dr. Pankaj Topiwala, President/CEO, FastVDO LLC

7150 Riverwood Dr., Columbia, MD 21046-1245 USA

Voice: 410-309-6066, Fax: 410-309-6554, Mobile: 443-538-3782

Email: [email protected]

The document reference is ETSI TS 101 154 V1.6.1 (2005-01), "DVB: Implementation guidelines for the use of Video and Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream".

DVB specifies H.264 Main profile level 3 for SDTV and High profile level 4 for HDTV.

(4) The following information about the Joint Video Team (JVT) and its work may be helpful to some of you.

The primary work of the JVT currently consists of:

1) scalable video coding (SVC) extension development, and

2) maintenance of the existing Advanced Video Coding (AVC) standard ITU-T Rec. H.264 & ISO/IEC 14496-10, e.g., including errata reporting and maintenance of reference software and conformance specifications.

The JVT currently has 3 active email reflectors.

You can subscribe to two of them (the general JVT reflector and the conformance/interop bitstream exchange activity reflector) through http://mailman.rwth-aachen.de/mailman/listinfo/jvt-experts and http://mailman.rwth-aachen.de/mailman/listinfo/jvt-bitstream.

To subscribe to the 3rd JVT reflector (which is devoted to SVC work), send email to "[email protected]" containing "subscribe svc" in the body of the message.

JVT and VCEG documents can be found at http://ftp3.itu.int/jvt-site. No password is required for access to nearly all documents. A select few documents (such as integrated-format standard drafts) require password access, using a password given only to formal JVT members.

JVT meetings: 23-29 July 2005 in Poznań, Poland; 16-21 October 2005 in Nice, France; 15-20 January 2006 in Bangkok, Thailand; 30 March - 4 April in Geneva, Switzerland; 16-21 July 2006 in Klagenfurt, Austria; and 22-27 Oct. 2006 in Hangzhou, China.

The JVT has two parent bodies, which are MPEG (ISO/IEC JTC 1/SC 29/WG 11) and VCEG (ITU-T SG 16 Q.6). Participation in the JVT is open to anyone who is qualified to participate either in MPEG or VCEG, and to those personally invited by the chairmen. We are liberal in granting invitation requests.

To progress the work of the JVT between meetings, the JVT has created the following ad-hoc groups, and has appointed the following listed chairpersons for that work. The discussions involved in the work of those ad-hoc groups will be conducted on the above-listed email reflectors.

1. JVT Project Management and Errata Reporting (Gary Sullivan, Jens Rainer Ohm, Ajay Luthra, and Thomas Wiegand)

2. JM Description and Reference Software (Thomas Wiegand, Karsten Sühring, Alexis Tourapis, and Keng Pang Lim)

3. Bitstream Exchange and Conformance (Teruhiko Suzuki and Lowell Winger)

4. SVC Core Experiments (Justin Ridge, Ulrich Benzler)

5. JSVM software improvement and new functionality integration (Greg Cook)

6. JSVM Text and WD Text Editing (Julien Reichel, Heiko Schwarz, Mathias Wien)

7. Spatial Scalability Resampling Filters (Gary Sullivan)

8. Test conditions and applications for error resilience (Ye Kui Wang)

9. Test conditions for coding efficiency work and JSVM performance evaluation (Mathias Wien, Heiko Schwarz)

10. Study of 4:4:4 video coding functionality (Teruhiko Suzuki)

In the work on scalable video coding (SVC), the JVT is conducting the following core experiments (CEs). A document describing each of these CEs is available on the JVT ftp site in the 2005_04_Busan directory as document number JVT-O3xx, where "xx" is the number of the core experiment as listed below. The appointed core experiment coordinator, some participating companies, and some relevant documents (prefix the numbers below by "JVT-O" for the complete document number) are also listed below.

CE1: MCTF memory management (009, 026, 027, 028) (Visiowave, Panasonic, Nokia) Julien Reichel

CE2: Improved de-blocking filter settings (non-normative?) (RWTH, FTRD) (067) Mathias Wien

CE3: Coding efficiency of entropy coding (SKKU, ETRI, Samsung) Woong Il Choi, (021, 063)

CE4: Inter-layer motion prediction (Samsung, LG) Kyohyuk Lee (058)

CE5: Quality Layers (FTRD, Nokia, ...) (044, 055) Isabelle Amonou

CE6: Improvement of update step (015, 030, 062) (Samsung, MSRA, Nokia, FhG-HHI) Woo-Jin Han

CE7: Enhancement-layer intra prediction (Thomson, FhG-HHI, Sharp, Huawei, Samsung) (010, 053, 065) Jill Boyce

CE8: Region of Interest (NCTU, ICU, ETRI, I2R) (020) Zhongkang Lu

CE9: Improvement of quantization (046, 060, 066, 069) (FTRD, Panasonic, Siemens, RWTH, FhG-HHI, Microsoft, Sharp) Stéphane Pateux

CE10: Extended spatial scalability (Thomson, FTRD, Sharp, LG) (008, 041, 042) Edouard Francois

CE11: Improvement of FGS (055) (Nokia, FhG-HHI, NCTU) Justin Ridge

CE12: Weighted prediction from FGS layers (054) (Nokia, Visiowave, FhG-HHI) Yiliang Bao

On the ISO/IEC side, our standards are published as part of the ISO/IEC 14496 suite of standards, which is available for purchase at:

http://tinyurl.com/2dgpx

Anyone can get copies of 3 ITU-T standards for free by using the following link:

http://ecs.itu.ch/cgi-bin/ebookshop

The links to the JVT's standards at ITU-T are as follows:

-------------------------------------------------------------

Title: H.264 (03/05) : Advanced video coding for generic audiovisual services

URL: http://tinyurl.com/62t46

-------------------------------------------------------------

Title: H.264.1 (03/05) : Conformance specification for H.264 advanced video coding

URL: http://tinyurl.com/5qp7g

-------------------------------------------------------------

Title: H.264.2 (03/05) : Reference software for H.264 advanced video coding

URL: http://tinyurl.com/6flcp

-------------------------------------------------------------

Best regards,

Gary Sullivan

FREE AVAILABILITY OF ITU-T STANDARDS (1/8/2007)

Many of you may not be aware that published ITU-T standards can now be obtained for free (at least on a trial basis). I believe the old limit of 3 texts has been removed, at least for PDF format texts.

For example, see

http://www.itu.int/rec/T-REC-H.264

where ITU-T Rec. H.264 can be obtained for free in PDF form in English, Spanish, and French.

Best Regards,

Gary Sullivan

_______________________________________________

jvt-experts mailing list

[email protected] http://mailman.rwth-aachen.de/mailman/listinfo/jvt-experts

(5) DIRAC CODEC (comparison/evaluation with H.264)

Dirac is a conventional hybrid motion-compensated video codec (overlapped block motion compensation is used). Dirac uses arithmetic coding.

Main difference from MPEG: Dirac uses a wavelet transform rather than the DCT or a DCT-like transform.

I am still reviewing/evaluating the codec from the point of view of D-Cinema (DCI spec, 2k-4k scalability, etc.). Software etc. can be accessed from the web site below.

http://dirac.sourceforge.net/index.html

--- Jean-Marc Glasser wrote:

> Dear JVT experts,
>
> Please find here the link to an alternative CODEC:
> http://dirac.sourceforge.net/specification.html
> I wonder if it fits within the JVT framework and how
> it compares to H.264.

DIRAC Pro (Adopted by SMPTE as VC-2) Intra frame coding only in DIRAC.

For more info on Dirac Pro, check out this link: http://www.bbc.co.uk/rd/projects/dirac/diracpro.shtml

Ravi Aruna Subramanian has implemented DIRAC and compared it with H.264 based on test sequences at different bit rates, video formats etc (CIF, QCIF, SDTV etc) as part of her thesis. She passed the thesis defense on 17 July 2009. I have soft copies of her thesis and ppt slides. A paper has also been sent to J. VCIR.

There are some conformance test sequences and software on http://dirac.kw.bbc.co.uk/download/

Emails from [email protected] July11, 2009

Dirac is only a video codec. When we started the project I also had some audio engineers on the team and we would have liked to have produced a royalty free audio codec too. Unfortunately we simply didn't have the resources to do both. However there are a number of audio codecs that can be used with Dirac such as AAC or MPEG Layer 2. I say layer 2 because I think it is royalty free and, for high quality audio, it has similar performance to mp3 (at lower bit rates mp3 is better of course). But people who are looking to use Dirac often want a royalty free codec and the Xiph organisation's Vorbis codec is quite good.

In order to use audio with Dirac it has to be wrapped in some container format. We have registered the use of Dirac in MPEG-2 transport streams with the SMPTE. This means that you can put Dirac into transport streams with any audio format supported by MPEG-TS. These include, uncompressed audio, MPEG layer 2, MPEG layer 3, AAC and probably others. I'm not sure whether Vorbis is supported in MPEG-TS. We have also written specifications for Dirac in mp4 files (.mov files) which allows associated audio. And we have written a spec for Dirac in Ogg wrappers, which allow it to be used with Vorbis. Ogg is a popular media container with those who wish to use open source, royalty free media. We are supporting Ogg by integrating Dirac into the libogg play library.

In case you did not know we have integrated Dirac decoding into VLC media player and encoding will be added in future releases (it is already in nightly builds I believe). So, soon with VLC media player (which is cross platform open source) it should be possible to transcode to Dirac and wrap in a variety of containers with a range of audio codecs. On the Linux only side Dirac has been integrated into gStreamer.

Most of our stuff is on the diracvideo.org web site. We're standardizing the intra frame version of Dirac, for professional use, with the SMPTE, so there are some conformance test streams for that on dirac.kw.bbc.co.uk.

This opens up a number of thesis topics: multiplexing/demultiplexing coded bitstreams (video: Dirac; audio: MP2, AAC, AC-3, Vorbis, etc.) and achieving lip sync.
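As a starting point for the multiplexing/demultiplexing topic, the following is a minimal sketch of interleaving audio and video packets by presentation timestamp and splitting them apart again. The `Packet` type, the millisecond timestamps, and the function names are hypothetical; a real container (MPEG-TS, Ogg, mp4) adds headers, clock references, and buffering rules that are not modeled here.

```python
import heapq
from typing import Dict, Iterable, Iterator, List, NamedTuple


class Packet(NamedTuple):
    pts_ms: int      # presentation timestamp in milliseconds (assumed unit)
    stream: str      # stream label, e.g. "video" or "audio"
    payload: bytes   # coded data (opaque here)


def multiplex(*streams: Iterable[Packet]) -> Iterator[Packet]:
    """Interleave packet sequences in timestamp order.

    Each input sequence must already be sorted by pts_ms.
    """
    return heapq.merge(*streams, key=lambda p: p.pts_ms)


def demultiplex(muxed: Iterable[Packet]) -> Dict[str, List[Packet]]:
    """Split an interleaved packet sequence back into per-stream lists."""
    out: Dict[str, List[Packet]] = {}
    for packet in muxed:
        out.setdefault(packet.stream, []).append(packet)
    return out


# Example: 25 frame/s video (40 ms per frame) and 24 ms audio frames.
video = [Packet(t, "video", b"") for t in (0, 40, 80)]
audio = [Packet(t, "audio", b"") for t in (0, 24, 48, 72)]
muxed = list(multiplex(video, audio))
```

Lip sync then depends on the player presenting each stream's packets at their timestamps against a common clock; measuring the audio/video skew after demultiplexing is one way to evaluate a multiplexer.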

Dirac Pro is another name for SMPTE VC-2, which is the intra frame version of Dirac, currently being standardized by the SMPTE to become SMPTE S2042. Latest draft specs attached (not for distribution). I am hoping that, after several years of work, the VC-2 specification will be accepted this year before end September (but, given the way standards committees work this can never be guaranteed).

To put this in context: as you know, VC-1 is the standardized version of Windows Media Video, VC-3 is Avid's DNxHD, and between them is VC-2, a.k.a. Dirac Pro. I have the documents on SMPTE VC-2.

Dirac Pro is aimed at professional applications, not end user distribution. Typically it would be used for compression ratios up to about 20:1. In this space it competes with Motion JPEG 2000. Compare Dirac Pro with other still frame image coding standards such as JPEG, JPEG-2000, JPEG-XR and JPEG-LS.

www.xiph.org/vorbis

Vorbis I specification, Xiph.org Foundation, June 2, 2009

Embedding Vorbis into an Ogg stream

This document describes using Ogg logical and physical transport streams to encapsulate Vorbis compressed audio packet data into file form.

Vorbis audio: MDCT, VQ

Xiph.org's Vorbis software CODEC implementation is distributed under a BSD-like license. BSD licenses represent a family of permissive free software licenses. The original was used for the Berkeley Software Distribution (BSD), a Unix-like operating system after which the license is named.

Latest audio codecs

Add the DTS, DTS-High Definition High Resolution Audio, and DTS-High Definition Master Audio coding standards to the list of audio coding standards on slide 13. These are the latest audio codecs developed by DTS and are used in Blu-ray discs.

(6) MODEL AIDED CODER

A MODEL AIDED CODER BASED ON MODEL-BASED CODING AND TRADITIONAL HYBRID CODING (MC-transform/prediction) is proposed by Thomas Wiegand (file: vicawiegand). Developing, designing, and implementing this coder involves considerable original research.

(7) AIC (ADVANCED IMAGE CODING) Convener Dr. Daniel Lee

[email protected]

This has been standardized as JPEG-XR based on HD-Photo by Microsoft.

I have several files on this. Radhika has implemented/compared several image compression standards including JPEG-XR.

ISO/IEC JTC 1/SC 29/WG 1 has called for proposals for developing a standard beyond JPEG 2000. (I have some files on this.)

2.1 Application requirements

Development software, test environment, browsers integration

Forward compatibility with previous standards (decoding previously encoded images)

Network protocols/effects integration

Image processing integration

Image analysis/understanding/documentation integration

Sensors parameters independence (geometry, response, sensitivity)

Content and context aware encoding/decoding

2.2 Technical requirements

Low/Scalable complexity and power consumption

High compression efficiency in high quality imaging

Flexibility of implementation (in terms of required resources)

Compressed domain manipulation

Efficient 3D region of interest decoding/access

Unique algorithm core for compression of multiple dimensions images (spatial, components, time, etc.)

Several research projects can be explored related to AIC. Pl access

http://www.bilsen.com/index.htm?http://www.bilsen.com/aic/

AIC Advanced Image Coding (Beyond JPEG & JPEG 2000) ISO/IEC standardization process

I have several files on AIC. One topic is a performance comparison of I-frame-only image coding in H.264 versus JPEG-2000.

F. Wu, C. Lan and G. Shi, Compress compound images in H.264/MPEG-4 AVC by fully exploiting spatial correlation, IEEE ISCAS 2009, Taipei, Taiwan, May 2009. Paper ID 1109, track 16.1.

JPEG2000:

D. T. Lee, JPEG 2000: Retrospective and new developments, Proc. IEEE vol. 93, pp.32-41, Jan. 2005. (MANY VALUABLE REFERENCES)

This paper describes four new parts designed for important new applications:

Part 8 JPSEC: Secure JPEG 2000

Part 9 JPIP: Interactivity tools, application programming interface and protocols

Part 10 JP3D: 3-D and floating point data

Part 11 JPWL: Wireless

These new parts can be the basis for many research projects.

W. De Neve et al, Improved BSDL based content adaptation for JPEG2000 and HD photo (JPEG XR), SP:IC, vol. 24, issue 6, pp. 452-467, July 2009. Good paper for research/projects.

DIGITAL VIDEO CONSORTIUM (DVC) has developed a vision-optimized MPEG-2 encoder/decoder, based on a visual discrimination model (VDM), which is superior to standard MPEG-2. (The file DVCencoder has the PowerPoint slides. See also Ch. 12, HVS based perceptual video coders, by A. Pica, M. Isnardi and J. Lubin, in the handbook edited by H.R. Wu and K.R. Rao, Digital video image quality and perceptual coding, Taylor and Francis, 2006.) I have the related file DVCencoder, which is developed by Sarnoff Corporation. EXTEND THIS CONCEPT TO H.264/MPEG-4 PART 10 ENCODERS, ALL LEVELS AND PROFILES. Evaluate and compare the performance of H.264 with and without the VDM. Implementing this requires a software licensing agreement with Sarnoff Corporation. Jennie Abraham is working on this (doctoral dissertation).

JVT-V204 "New profiles for professional applications" amendment to ITU-T Rec. H.264 & ISO/IEC 14496-10 (Amendment 2 to 2005 edition) 2/13/2007 research/projects based on these new profiles. Details are as follows: (file JVT-V204) www.itu.int/rec/T-REC-H.264 (4/10/07)

This document is a draft amendment to ITU-T Rec. H.264 & ISO/IEC 14496-10 creating a set of new profiles intended primarily for professional applications. It also defines two new types of supplemental enhancement information (SEI) messages.

One such new profile is the High 4:4:4 Predictive profile. The High 4:4:4 Predictive profile, as drafted herein, has two different 4:4:4 operation modes depending upon the value of a new syntax element, separate_color_plane_flag, that is present in the sequence parameter set. When separate_color_plane_flag is equal to 0, each macroblock contains both luma and chroma samples, and a decoding process similar to the luma decoding process that is used in the other profiles is used to decode the luma and chroma samples in each such macroblock. When separate_color_plane_flag is equal to 1, the decoding process for monochrome pictures is used to decode each color plane individually as a distinct picture. In addition, a new intra decoding process that can be used by encoders to enable relatively-efficient lossless coding is also added for use when the qpprime_y_zero_transform_bypass_flag syntax element is equal to 1 and QP'Y is equal to 0. In the new High 4:4:4 Predictive profile, the bit depth is also extended up to 14 bits per sample.

In addition to adding the definition of the High 4:4:4 Predictive profile, four other profiles are also specified in this amendment. These profiles, referred to as the High10Intra, High4:2:2Intra, High4:4:4Intra, and CAVLC4:4:4Intra profiles, serve to enable applications demanding simple random-access and editing applications with low delay capability. Each of these profiles contains coding capabilities that are similar to those of another corresponding profile, except for elimination of support for the decoding processes that involve inter-picture prediction and, in the case of the CAVLC4:4:4Intra profile, the additional elimination of support for the CABAC parsing process.

The two added SEI messages are the post-filter hint SEI message and the tone mapping information SEI message. The post-filter hint SEI message provides the coefficients of a post-filter or correlation information for the design of a post-filter for potential use in post-processing of the output decoded pictures to obtain improved displayed quality. The tone mapping information SEI message provides information to enable remapping of the color samples of the output decoded pictures for customization to particular display environments. See the paper below.

G.J. Sullivan et al, New standardized extensions of MPEG-4 AVC/H.264 for professional quality video applications, IEEE ICIP, I, pp. 13-16, 2007.

(8) Video Annotation in H.264 (3/21/07)

In the Marrakech meeting, JVT decided to create the AHG on Video Annotation SEI message. The main goal of Video Annotation AHG (ad hoc group) is to study some potential issues on making the compressed video bit stream have more functionality beyond compression, e.g., to support fast video search, value-added applications, or content management. Two reference proposals are JVT-U059 and JVT-V060.

Here we would like to initiate the discussion on this topic. All who are interested in this topic are welcome to make comments in this email thread or contact the chairs or me. Thanks!

AHG Mandates:

Identify applications for video annotation and their requirements

Work out suggestions for support needs in AVC

Find/create test material

Define experiments

Best regards,

Quqing Chen

TRANSCODER

Design, develop, implement and evaluate

H.264 to VP6 transcoder (Jay Padia implemented this as his M.S. thesis, May 2010).

(see C.Holder and H. Kalva, H.263 to VP6 transcoder, SPIE, vol. 6822 (VCIP), San Jose, CA, Jan. 2008. Access from http://spiedigitallibrary.aip.org) Pl access reference 1 from this paper (web site on flash8)

G.F.-Escribano et al, An MPEG-2 to H.264 video transcoder in the baseline profile, IEEE Trans. CSVT, vol. 20, pp. 763-768, May 2010.

Wyner-Ziv to H.264 transcoder (see E. Peixoto, R.L. de Queiroz and D. Mukherjee, Mobile video communications using Wyner-Ziv transcoder, SPIE, vol. 6822 (VCIP), San Jose, CA, Jan. 2008. Access from http://spiedigitallibrary.aip.org). There are several papers on Wyner-Ziv coding in this volume.

Subramanya has worked on a WZ codec as his M.S. thesis.

See the paper below (IEEE Trans. on CSVT, vol. 18, April 2008). Can the warped DCT (WDCT) be extended to video coding?

O. Urhan and S. Erturk, Parameter embedding mode and optimal post-process filtering for improved WDCT image compression, IEEE Trans. CSVT, vol. 18, pp. 528-532, April 2008. DOI: 10.1109/TCSVT.2008.918769

Enhanced AC-3 standard of ATSC (see the paper cited in ATSC DTV standard)

MPEG-1 Layer 3 (MP3), MPEG-2 AAC and HE-AAC. Discuss encoder/decoder block diagrams, advantages and disadvantages.

MPEG Surround audio coder, "ISO/IEC 23003-1, Information Technology - MPEG audio technologies - Part 1: MPEG Surround", Feb. 2007.


See: S.-T. Hsiang, "A new subband/wavelet framework for AVC/H.264 intraframe coding and performance comparison with motion-JPEG 2000", SPIE/VCIP, vol.6822, pp. 68220P-1 through 12, Jan. 2008. Implement this new intraframe scalable coding.

See R.G. de Oliveira and R.L. de Queiroz, "Intra prediction versus wavelets and lapped transforms in an H.264/AVC coder", IEEE ICIP 2008, pp. 137-140, San Antonio, TX, 2008. Implement lapped transforms/wavelets in H.264/AVC intra coding and compare the performance.

See D. Marpe et al, "Performance evaluation of Motion-JPEG 2000 in comparison with H.264/AVC operated in pure intra coding mode", Proc. SPIE, vol. 5266, pp. 129-137, Feb. 2004. Implement this performance comparison.

Email from Rahul_panchal@yahoo.com, 6/14/09

As yet, not much has been published about H.265. (It is officially called High Efficiency Video Coding.) Information about the tools that improve on H.264 and could become tools in H.265 can be obtained from the VCEG/MPEG FTP site. I do not have the document numbers handy, but I can give a list of interesting tools.

(1) Rate Distortion Optimized Quantization (RDOQ). This is still H.264 decoder compatible.

(2) Mode Dependent Directional Transform (MDDT).

(3) Bigger blocks (64x64) and bigger transforms (16x16, 16x8 and 8x16). These bigger transforms are integer approximations of the DCT designed in Matlab. A good thesis topic can be "A better design of bigger transform for H.265".

(4) Adaptive Interpolation Filter (AIF). Drawback: it needs at least 2 encoding passes to design the filters, so it is more computationally complex. Currently there are 6 different flavors of AIF in the KTA (key technical areas) software. The latest proposal was about single-pass AIF, but it was designed for P frames, which is easy to digest. As of now, there is no idea how to design it for B frames using a single pass. Potential PhD thesis: "Design of AIF using single pass encoding".

(5) Competing with AIF is HPF (High Precision Filter), which is computationally simpler than AIF.

(6) Quadtree Adaptive Loop Filter (QALF). Competing with QALF is "Post Filter", which unlike QALF is outside the encoding loop. Thesis topic: "Simplification of QALF without losing much of coding gain".

(7) Internal Bit Depth Increase (IBDI).

(8) Motion Vector Competition.

(9) New bi-predictive intra modes.

(10) 1/8 pel motion vectors (I do not see this topic as interesting among the latest H.265 developments).
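The email's item on bigger transforms refers to integer approximations of the DCT. A minimal sketch of one common way to obtain such a matrix, scaling the orthonormal DCT-II basis and rounding each entry, is given below. The function names, the scale factor of 64 for the 16x16 case, and the orthogonality measure are illustrative assumptions, not the KTA design. With size 4 and a scale of 2.5 the same procedure reproduces the well-known H.264 4x4 core transform matrix.

```python
import math
from typing import List

Matrix = List[List[int]]


def dct_matrix(n: int) -> List[List[float]]:
    """Orthonormal DCT-II basis: row k, column i."""
    rows = []
    for k in range(n):
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([norm * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def integer_dct(n: int, scale: float) -> Matrix:
    """Scale the DCT basis and round each entry to the nearest integer."""
    return [[int(round(scale * v)) for v in row] for row in dct_matrix(n)]


def orthogonality_error(t: Matrix) -> float:
    """Largest inner product between two different rows, relative to the
    smallest row energy. Zero means the integer rows are exactly orthogonal."""
    n = len(t)
    energies = [sum(v * v for v in row) for row in t]
    worst = max(abs(sum(x * y for x, y in zip(t[a], t[b])))
                for a in range(n) for b in range(a + 1, n))
    return worst / min(energies)


t4 = integer_dct(4, 2.5)     # H.264 4x4 core transform
t16 = integer_dct(16, 64.0)  # hypothetical 16x16 candidate
```

Plain rounding generally loses exact orthogonality at larger sizes, which is why the design of the bigger transform is a research question in its own right; `orthogonality_error` is one simple figure of merit for comparing candidate matrices.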

G. Sullivan, Recent developments toward standardization of next generation video coding, SPIE Optics+Photonics, vol. 7798 (paper 30), SanDiego, CA, Aug. 2010. This can lead to several thesis topics/projects.

H.264 open source SVC decoder library 6/18/09 SCALABLE VIDEO CODING

I would like to announce an open source SVC decoder library we have developed that has been later on integrated in 2 different open source players (TCPMP and Mplayer).

This decoder was initially designed inside a French national project called scalimages.

The source code of this decoding library is available here: https://sourceforge.net/projects/opensvcdecoder/

You can find further information at https://sourceforge.net/apps/mediawiki/opensvcdecoder/index.php?title=Main_Page on the installation, features and additional tools related to our SVC decoder library.

The library is up to 50 times faster than the JSVM and supports up to 2 enhancement layers (however we can change it easily).

The SVC decoder is conformant to the following sequences http://opensvcdecoder.sourceforge.net/JVT-AB023.xls (see IETR conformance entry in this table).

This decoder has been also ported over several platforms such as PDAs and DSP from TI. It will serve as a basis for future development in MPEG RVC (Reconfigurable Video Coding).

Anyone who wishes to contribute to this library is welcome.

Best regards,

Mickaël RAULET

Ingénieur de Recherche/Research Engineer

Institut d'Electronique et de Télécommunications de Rennes (IETR), UMR CNRS 6164

Tél : +33 (2) 23 23 82 83, Fax : +33 (2) 23 23 82 62, Port : +33 (6) 81 08 35 66

IETR/Groupe Image, INSA de Rennes, 20, avenue des Buttes de Coësmes, 35043 Rennes Cedex

Dear jvt-experts, 6/19/2009

A new win32 release and source code of mplayer integrated our open svc decoder has been released (version 1.1 http://sourceforge.net/projects/opensvcdecoder/ ). This version integrates layer switching on mplayer and will keep the higher available resolution when displayed. For the layer with finer resolution it will be upsampled. You can also find test sequences to download here.

http://sourceforge.net/project/showfiles.php?group_id=263634&package_id=324062

Mplayer can be used as follows to switch from one layer to another (hotkeys: b to go up, c to go down).

http://sourceforge.net/apps/mediawiki/opensvcdecoder/index.php?title=Players_Overview#Mplayer

Another suitable alternative is to use the following command line on the raw stream to select the desired layer (following this DQ_Id = dependency_id>

[Equation (2): definition of the residual color predictor F(·) in terms of abs(·) and Sign(·); the formula itself is not recoverable from the source text.]

where abs(x) means the absolute value of x, and Sign(x) is defined as follows:

\[
\mathrm{Sign}(x) =
\begin{cases}
1, & x \ge 0 \\
-1, & x < 0
\end{cases}
\]

The spatially neighboring residual pixels outside a given block will be mirrored using the nearest residual pixels inside the block as shown in Figure 3. The size of the residual block is 4x4 when the coding mode is intra 4x4 (I_4x4) mode or intra 16x16 (I_16x16) mode, and 8x8 when the coding mode is intra 8x8 (I_8x8) or 8x8 transform is used in inter modes.

Figure 3 Residual pixel mirroring on the block boundary.

In Figure 3, the shaded rectangles are the residual pixels inside a current coded block and the white rectangles are the residual pixels outside the current coded block. The values of white pixels are taken from the horizontally or vertically nearest shaded pixels as indicated in the arrows in Figure 3.
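The boundary handling described for Figure 3 can be sketched as nearest-pixel replication: a position outside the block takes the value of the closest residual inside it. This is a minimal sketch under stated assumptions. The function names are hypothetical, and the treatment of diagonal (corner) positions, which clamps both coordinates, is an assumption, since the text only describes horizontal and vertical neighbors.

```python
from typing import List

Block = List[List[int]]


def residual_at(block: Block, i: int, j: int) -> int:
    """Residual at row i, column j. Positions outside the block return the
    nearest residual inside the block (index clamping)."""
    rows, cols = len(block), len(block[0])
    ci = min(max(i, 0), rows - 1)
    cj = min(max(j, 0), cols - 1)
    return block[ci][cj]


def pad_block(block: Block, border: int = 1) -> Block:
    """Return the block extended by `border` replicated pixels on every side."""
    rows, cols = len(block), len(block[0])
    return [[residual_at(block, i, j)
             for j in range(-border, cols + border)]
            for i in range(-border, rows + border)]


# Example: a 4x4 residual block (I_4x4 or I_16x16 case) padded to 6x6.
block = [[4 * i + j for j in range(4)] for i in range(4)]
padded = pad_block(block)
```

The same functions apply unchanged to the 8x8 case, since the block size is taken from the input.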

Figure 4 shows the block diagram of the proposed 4:4:4 decoding method using the RCP method.

Figure 4 Block diagram of the proposed 4:4:4 decoding method using residual color prediction. (Block labels recoverable from the figure: entropy decoding of the bitstream, Q^-1, T^-1, F(·), RCP.)

In Figure 4, the meaning of each symbol is summarized as follows:

RG : reconstructed residual data (G component)

rG : residual data predicted from residual color predictor F(·) (G component)

rB and rR : reconstructed residual data before residual color prediction (B and R components)

RB and RR : reconstructed residual samples where rB and rR are compensated using residual data rG from residual color prediction (B and R components)

PG, PB, and PR : predicted reference samples from intra or inter prediction (G, B, and R components)

SG, SB, and SR : output sample of each color component (G, B, and R components)

The usage of the residual colour prediction is decided based on the characteristic of a given image sequence. The signal indicating the usage of residual colour prediction is handled by the residual_colour_prediction_flag in the sequence parameter set (see Appendix).
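The symbol definitions above suggest the following reconstruction flow for one block, given here as a rough sketch only. It assumes that "compensated" means sample-wise addition of rG, that output samples are prediction plus residual, and it omits clipping to the sample range. The predictor F(·) is not defined in this text, so `identity_predictor` is a labeled stand-in and not the proposal's filter; all function and parameter names are hypothetical.

```python
from typing import Callable, List, Tuple

Block = List[List[int]]


def add_blocks(a: Block, b: Block) -> Block:
    """Sample-wise sum of two equally sized blocks."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity_predictor(res_g: Block) -> Block:
    """Stand-in for F(.): returns R_G unchanged. NOT the proposal's filter."""
    return [row[:] for row in res_g]


def reconstruct_444(R_G: Block, r_B: Block, r_R: Block,
                    P_G: Block, P_B: Block, P_R: Block,
                    predictor: Callable[[Block], Block] = identity_predictor,
                    use_rcp: bool = True) -> Tuple[Block, Block, Block]:
    """Return (S_G, S_B, S_R) for one block.

    use_rcp plays the role of residual_colour_prediction_flag.
    """
    if use_rcp:
        r_G = predictor(R_G)         # r_G = F(R_G)
        R_B = add_blocks(r_B, r_G)   # assumed: R_B = r_B + r_G
        R_R = add_blocks(r_R, r_G)   # assumed: R_R = r_R + r_G
    else:
        R_B, R_R = r_B, r_R
    S_G = add_blocks(P_G, R_G)       # output = prediction + residual
    S_B = add_blocks(P_B, R_B)
    S_R = add_blocks(P_R, R_R)
    return S_G, S_B, S_R


R_G = [[1, -2], [0, 3]]
r_B = [[0, 1], [1, 0]]
r_R = [[2, 2], [-1, 0]]
P_G = [[100, 100], [100, 100]]
P_B = [[50, 50], [50, 50]]
P_R = [[200, 200], [200, 200]]
S_G, S_B, S_R = reconstruct_444(R_G, r_B, r_R, P_G, P_B, P_R)
```

Passing a different `predictor` is the intended way to experiment with candidate definitions of F(·).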

4 Testing conditions

Software:

The reference software (JFVM-1) will be used.

Measurements:

There are three methods for objective measurements. The following must all be reported:

1. RGB PSNR

R, G, and B picture PSNRs, each averaged over the whole sequence separately

2. Average RGB PSNR

1) average of MSE over RGB for the whole sequence and take PSNR

2) average of R, G, and B picture PSNR values for the whole sequence

3. PSNR in YCgCo domain

Convert the original and the reconstructed RGB images into YCgCo domain and calculate Y, Co, and Cg PSNRs separately.
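The three measurements can be sketched as follows. This is a minimal sketch, not the reference software's implementation: it assumes 8-bit samples (peak value 255), frames stored as dictionaries of flat sample lists, equally sized planes, and the standard floating-point YCgCo conversion; all function names are hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List, Sequence, Tuple

Frame = Dict[str, List[float]]  # e.g. {"R": [...], "G": [...], "B": [...]}


def mse(orig: Sequence[float], recon: Sequence[float]) -> float:
    return sum((o - r) ** 2 for o, r in zip(orig, recon)) / len(orig)


def psnr_from_mse(m: float, peak: float = 255.0) -> float:
    return math.inf if m == 0 else 10.0 * math.log10(peak * peak / m)


def component_psnrs(orig: List[Frame], recon: List[Frame],
                    keys: Sequence[str] = ("R", "G", "B")) -> Dict[str, float]:
    """Method 1: per-picture PSNR of each component, averaged over the sequence."""
    return {k: mean(psnr_from_mse(mse(o[k], r[k])) for o, r in zip(orig, recon))
            for k in keys}


def average_rgb_psnr(orig: List[Frame], recon: List[Frame]) -> Tuple[float, float]:
    """Method 2: (1) PSNR of the MSE averaged over R, G, B and all pictures;
    (2) average of the R, G, and B PSNR values from method 1."""
    pooled = mean(mse(o[k], r[k]) for o, r in zip(orig, recon) for k in "RGB")
    return psnr_from_mse(pooled), mean(component_psnrs(orig, recon).values())


def to_ycgco(frame: Frame) -> Frame:
    """RGB to YCgCo (floating-point form of the transform)."""
    y, cg, co = [], [], []
    for r, g, b in zip(frame["R"], frame["G"], frame["B"]):
        y.append(0.25 * r + 0.5 * g + 0.25 * b)
        cg.append(-0.25 * r + 0.5 * g - 0.25 * b)
        co.append(0.5 * r - 0.5 * b)
    return {"Y": y, "Cg": cg, "Co": co}


def ycgco_psnrs(orig: List[Frame], recon: List[Frame]) -> Dict[str, float]:
    """Method 3: Y, Co, and Cg PSNRs after converting both sequences."""
    return component_psnrs([to_ycgco(f) for f in orig],
                           [to_ycgco(f) for f in recon],
                           keys=("Y", "Co", "Cg"))


orig_seq = [{"R": [10, 20], "G": [30, 40], "B": [50, 60]}]
recon_seq = [{"R": [11, 19], "G": [30, 42], "B": [47, 60]}]
per_component = component_psnrs(orig_seq, recon_seq)
pooled_psnr, mean_psnr = average_rgb_psnr(orig_seq, recon_seq)
```

The two variants of method 2 generally differ: averaging PSNR values gives a result at least as large as the PSNR of the averaged MSE, which is one reason the test conditions ask for both.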

Test Sequences:

Use the JVT film sequences (all of them) and the Viper sequences, including the new ones from FastVDO while excluding the previous ones that have problems.

Other test conditions:

Preliminary test conditions agreed by email exchange are as follows.

Intra-Only, IBBrBP (Br means recorded B pictures)

QP=12, 18, 24, 30, optional: QP=6, 36, 42.

SymbolMode = 1 (CABAC)

RDOptimization = 1

ScalingMatrixPresentFlag = 0

OffsetMatrixPresentFlag = 1

QoffsetMatrixFile = "q_offset.cfg"

AdaptiveRounding = 1

AdaptRndPeriod = 1

AdaptRndChroma = 0

AdaptRndWFactorX = 8

SearchRange = 64

UseFME = 1

PyramidCoding = 1

BiPredMotionEstimation = 1

BiPredMERefinements = 5

BiPredMESearchRange = 64

BiPredMESubPel = 2

5 Evaluation criteria

Significant
