Computer Vision – Coding Standards
Hanyang University
Jong-Il Park
Acknowledgement: Many of the materials are adapted from Prof. Jechang Jeong’s excellent presentation on international coding standards.
Department of Computer Science and Engineering, Hanyang University
Topics to be covered
International coding standards
- Background and brief history
- Key techniques in JPEG and MPEG-1, 2, 4
* Only image/video coding techniques will be covered
Towards Multimedia :
[Figure: Multimedia, linked with Computer, Consumer Electronics, Tele-Communication, and Broadcasting]
Multimedia Everywhere
1980 : ITU-T T.4 : G3 FAX for PSTN
- Modified Huffman and Modified READ
1984 : ITU-T T.6 : G4 FAX for ISDN
- Modified MR
1992 : JPEG (ISO 10918, ITU-T T.81) : Color Still Pictures
- Used for Color Fax, Electronic Still Camera, Color Printer, Computer Applications, etc.
- Lossless/Lossy Modes, Baseline/Extended Modes, Progressive/Sequential Modes
- DPCM + DCT + Q + RLE + Huffman/Arithmetic Codes
- Motion JPEG can be used for Moving Pictures.
1993 : JBIG (ISO 11544, ITU-T T.82) : Bi-level Pictures
- Improvement on T.4 and T.6
Recently: JPEG-LS, JBIG2, etc.
Still Picture Compression Standards
1982 : ITU-R BT.601 : Studio Quality PCM Component Video
- Common to 525/60 and 625/50 Systems
- 13.5 MHz Sampling, 8 bit/sample, 4:2:2 Format
1990 : ITU-T H.261 : Video Phone/Conference Application via ISDN
- Bitrate = p x 64 kbps, p = 1-30
- MC DPCM + DCT + Q + RLE + Huffman Codes
- Reference Model 1 - 8
1992 : MPEG-1 Video : DSM Applications (e.g. Video CD)
- Bitrate = 1.5 Mbps
- MC DPCM + DCT + Q + RLE + Huffman Codes
- GOP Structure for Random Access and Error Recovery (I, P, B Frames)
- Simulation Model 1 - 3
Moving Picture Compression Standards
1994 : MPEG-2 Video (ISO 13818-2, ITU-T H.262) : Generic Algorithm for Various Applications (Broadcasting, Communication, Network, DSM, etc.)
- 5 Profiles of Functionality (Simple, Main, Spatial Scalable, SNR Scalable, High)
- 4 Levels of Resolution (Low, Main, High-1440, High)
- Deals with Interlaced Scan as well as Progressive Scan
- Field/Frame ME & DCT, Dual Prime ME, Intra VLC, Alternate Scan, Nonuniform Q, etc.
1993 : ITU-R CMTT.721 : 140 Mbps Contribution Quality Video
- Adaptive DPCM, Componentwise
1993 : ITU-R CMTT.723 : 34-45 Mbps Contribution Quality Video
- MC DPCM + DCT + Q + RLE + Huffman Codes
Moving Picture Compression Standards (Cont.)
1995 : ITU-T H.263 : Videophone via PSTN
- Bitrate < 64 kbps (V.34 modem = 33.6 kbps, Recent modem = 56 kbps)
- Improved version of H.261
1998 : MPEG-4
- Bitrates < 2 Mbps
- Targets: Multimedia data base access, Wireless multimedia communication
- Components of H.263 are incorporated
- Content-based compression
- Synthetic and natural video/audio
- Multiple tools/algorithms/profiles => Flexibility
1999 : MPEG-4 Version 2, MPEG-7
Moving Picture Compression Standards (Cont.)
JPEG (Joint Photographic Experts Group)
Applications : color FAX, digital still camera, multimedia computer, internet
JPEG Standard consists of
- a lossy baseline coding system
- an extended coding system for greater compression, higher precision, or progressive reconstruction applications
- a lossless independent coding system for reversible compression
References
- ITU-T Recommendation T.81, “Information Technology - Digital Compression and Coding of Continuous-Tone Still Images - Requirements and Guidelines”, 92. 2
- K. R. Rao, J. J. Hwang, “Techniques & Standards for Image, Video & Audio Coding”, Prentice Hall PTR, 1996
Continuous-tone still image
Baseline system : most widely used among JPEG standards
Data precision
- 8 bits for input and output
- 11 bits for quantized DCT coefficients
Algorithm
- DCT + quantization + variable length coding
Compression Guideline
- 0.25 ~ 0.5 bits/pixel : moderate to good quality, some applications
- 0.5 ~ 0.75 bits/pixel : good to very good quality, many applications
- 0.75 ~ 1.5 bits/pixel : excellent quality, most applications
- 1.5 ~ 2.0 bits/pixel : indistinguishable (visually lossless) quality, most demanding applications
Baseline system
[Figure: Baseline system encoder]
[Figure: Baseline system decoder]
Block diagram of baseline system
Quantization table
- No default values for quantization tables
- Application may specify the tables
- Q(u, v) : quantization table, integer value from 1 to 255
Quantization : $F_Q(u,v) = \mathrm{round}\left( \dfrac{F(u,v)}{Q(u,v)} \right)$
Dequantization : $R(u,v) = F_Q(u,v) \cdot Q(u,v)$
Quantization and inverse quant.
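The quantization and dequantization rules on this slide can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the 2x2 values of `F` and `Q` are made up, and `round_nearest` is a helper introduced here because Python's built-in `round()` rounds halves to the nearest even integer.

```python
import math

def round_nearest(x):
    """Round to the nearest integer, halves away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

def quantize(F, Q):
    """F_Q(u,v) = round(F(u,v) / Q(u,v)), element by element."""
    return [[round_nearest(f / q) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]

def dequantize(FQ, Q):
    """R(u,v) = F_Q(u,v) * Q(u,v), element by element."""
    return [[fq * q for fq, q in zip(fq_row, q_row)]
            for fq_row, q_row in zip(FQ, Q)]

# Hypothetical 2x2 corner of a DCT coefficient block and of a quantization table.
F = [[-415.0, 30.0], [5.0, -21.0]]
Q = [[16, 11], [12, 12]]

FQ = quantize(F, Q)     # small integers that are cheap to entropy-code
R = dequantize(FQ, Q)   # approximation of F recovered by the decoder

print(FQ)  # [[-26, 3], [0, -2]]
print(R)   # [[-416, 33], [0, -24]]
```

The reconstruction error of each coefficient is at most half of its table entry, which is why larger Q(u, v) values give more compression and more loss.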
[Figure: worked example with panels f(x,y), F(u,v) after FDCT, F_Q(u,v) after quantization, r(x,y) after inverse quantization and IDCT, and e(x,y)]
Example
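The chain of signals in this example, f(x,y) through FDCT, quantization, inverse quantization and IDCT, can be reproduced on one 8x8 block. The sketch below is an illustration under stated assumptions: the input block is a made-up gradient, the quantization table is a flat Q = 16 (the slides give no table), the 128 level shift follows the usual JPEG practice for 8-bit samples, and e(x,y) is taken to be f(x,y) - r(x,y).

```python
import math

N = 8

def c(k):
    """Normalization factor C(k) of the 8-point DCT."""
    return math.sqrt(0.5) if k == 0 else 1.0

# COS[x][u] = cos((2x + 1) * u * pi / 16), computed once.
COS = [[math.cos((2 * x + 1) * u * math.pi / (2 * N)) for u in range(N)]
       for x in range(N)]

def fdct(f):
    """Forward 8x8 DCT: F(u,v) from f(x,y)."""
    return [[0.25 * c(u) * c(v) * sum(f[x][y] * COS[x][u] * COS[y][v]
                                      for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct(F):
    """Inverse 8x8 DCT: f(x,y) from F(u,v)."""
    return [[0.25 * sum(c(u) * c(v) * F[u][v] * COS[x][u] * COS[y][v]
                        for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def round_nearest(x):
    """Round to the nearest integer, halves away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

# Hypothetical input block (smooth gradient) and flat quantization table.
f = [[99 + 4 * x + 2 * y for y in range(N)] for x in range(N)]
Q = [[16] * N for _ in range(N)]

shifted = [[p - 128 for p in row] for row in f]            # level shift
F = fdct(shifted)                                          # F(u,v)
FQ = [[round_nearest(F[u][v] / Q[u][v]) for v in range(N)] for u in range(N)]
R = [[FQ[u][v] * Q[u][v] for v in range(N)] for u in range(N)]
r = [[p + 128 for p in row] for row in idct(R)]            # r(x,y), left as floats
e = [[f[x][y] - r[x][y] for y in range(N)] for x in range(N)]

nonzero = sum(1 for row in FQ for q in row if q != 0)
print("nonzero quantized coefficients:", nonzero, "of", N * N)
print("max |e(x,y)|:", max(abs(v) for row in e for v in row))
```

For a smooth block like this one, most quantized coefficients are zero, which is what the later zigzag scan and run-length coding exploit.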
DC Coefficient Coding
- Differential Coding : DC coefficients of adjacent blocks are strongly correlated.
- VLC (Huffman Coding)
Entropy coding
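Differential coding of the DC coefficient can be sketched as follows. The DC values are made up for illustration. The predictor starting at 0 and the "size category" (the number of bits needed for the magnitude of the difference, which is the symbol that gets a Huffman code) follow the usual description of baseline JPEG; the function names are introduced here.

```python
def dc_differences(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    pred = 0                      # predictor starts at 0 for the first block
    diffs = []
    for dc in dc_values:
        diffs.append(dc - pred)
        pred = dc
    return diffs

def size_category(diff):
    """Number of bits needed to represent |diff| (0 for diff == 0)."""
    return abs(diff).bit_length()

def reconstruct_dc(diffs):
    """Decoder side: a running sum of the differences recovers the DC values."""
    dc_values, pred = [], 0
    for d in diffs:
        pred += d
        dc_values.append(pred)
    return dc_values

# Hypothetical quantized DC values of five neighbouring blocks.
dc = [-26, -24, -25, -20, -20]
diffs = dc_differences(dc)
print(diffs)                               # [-26, 2, -1, 5, 0]
print([size_category(d) for d in diffs])   # [5, 2, 1, 3, 0]
```

Because neighbouring DC values are close, most differences fall in small size categories, which receive short codes.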
AC coefficients Coding
- Zigzag Scanning
- VLC (Variable Length Coding, Huffman Coding)
Entropy coding (Cont.)
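Zigzag scanning followed by run-length symbol generation can be sketched as below. This is a simplified illustration: the block contents are made up, an end-of-block marker is always appended, and the special handling that real JPEG applies to zero runs longer than 15 is left out.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    def key(rc):
        r, c = rc
        d = r + c                          # index of the anti-diagonal
        return (d, r if d % 2 else -r)     # odd diagonals run downwards, even ones upwards
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def run_level_pairs(block):
    """(run of zeros, nonzero level) pairs for the AC coefficients, then 'EOB'."""
    order = zigzag_order(len(block))
    pairs, run = [], 0
    for r, c in order[1:]:                 # skip the DC coefficient at (0, 0)
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")                    # trailing zeros collapse into one marker
    return pairs

# Hypothetical quantized block: a few low-frequency values, zeros elsewhere.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0], block[1][1] = -26, 3, -2, 1

print(zigzag_order()[:6])      # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(run_level_pairs(block))  # [(0, 3), (1, -2), (0, 1), 'EOB']
```

The 63 AC coefficients of this block shrink to three pairs and one marker, and those symbols are what the Huffman tables encode.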
[Figure: Original image (24 bpp)]
[Figure: JPEG compressed image (8:1, 3 bpp)]
[Figure: JPEG compressed image (32:1, 0.75 bpp)]
[Figure: JPEG compressed image (128:1, 0.1875 bpp)]
E.g. JPEG Compression
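The bits-per-pixel figures quoted for these images follow directly from the compression ratio. A small sketch of the arithmetic is given below; the 512 x 512 image size is an assumption used only to show a file size, and the function names are introduced here.

```python
def bits_per_pixel(original_bpp, ratio):
    """Average bits per pixel after compressing by the given ratio."""
    return original_bpp / ratio

def compressed_bytes(width, height, bpp):
    """Approximate compressed size in bytes for an image of the given size."""
    return width * height * bpp / 8

for ratio in (8, 32, 128):
    bpp = bits_per_pixel(24, ratio)
    # 512 x 512 is a hypothetical image size, not taken from the slides.
    print(f"{ratio}:1 -> {bpp} bpp, {compressed_bytes(512, 512, bpp):.0f} bytes")
```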
MPEG Digital Video Technology
MPEG-1 (ISO/IEC 11172) and MPEG-2 (ISO/IEC 13818)
Applications :
MPEG-1 : Digital Storage Media (CD-ROM…)
MPEG-2 : Higher bit rates and broader generic applications (Consumer electronics, Telecommunications, Digital Broadcasting, HDTV, DVD, VOD, etc.)
Coding scheme :
Spatial redundancy : DCT + Quantization
Temporal redundancy : Motion estimation and compensation
Statistical redundancy : VLC
References :
- ISO/IEC 11172-2 (MPEG-1), ISO/IEC 13818-2 (MPEG-2)
- K. R. Rao, J. J. Hwang, “Techniques & Standards for Image, Video & Audio Coding”, Prentice Hall PTR, 1996
MPEG :
- Moving Picture Experts Group
- Specifies a standard compression, transmission, and decompression scheme for video and audio.
- ISO/IEC 11172 : MPEG-1
- ISO/IEC 13818 : MPEG-2
- Consists of 3 parts.
Part 1 : Systems
Part 2 : Video
Part 3 : Audio
MPEG Overview
How to remove spectral, spatial, temporal, and statistical redundancy?
MPEG compression of video
[Figure: intra-frame encoder. Video -> DCT -> Q -> Entropy Coding (RLE, VLC) -> MUX -> Buffer -> Compressed Data, with Rate Control setting the quantization step size]
- DCT : no information loss, no data reduction
- Q (Quantizing) : information loss, data reduction. Reduces the number of bits for each coefficient and gives preference to certain coefficients; the reduction can differ for each coefficient.
- Coefficient processing order : chosen to encourage runs of 0s
- RLE (Run Length Coding) : data reduction. Generates (Run, Level) symbols.
- VLC (Variable Length Coding) : data reduction. Uses short words for the most frequent symbols (like Morse code).
- Rate Control : adjusts the quantization step size
[Figure: input value mapped to the codes 111, 110, 101, 100, 011, 010, 001, 000 (labeled "8-bit quantization") and to the codes 11, 10, 01, 00 (labeled "2-bit quantization")]
Intra-frame compression
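The idea behind variable length coding, short words for the most frequent symbols, can be illustrated with a generic Huffman code construction. This is a sketch only: the symbol frequencies are made up, and JPEG and MPEG use code tables defined or signalled by the standard rather than codes built on the fly like this.

```python
import heapq

def huffman_code(freqs):
    """Build a prefix code {symbol: bit string} from {symbol: frequency}."""
    # Heap entries are (weight, tiebreak, partial code table); the integer
    # tiebreak keeps the dictionaries from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if counter == 1:                       # a single symbol still needs one bit
        return {sym: "0" for sym in freqs}
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)    # two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical symbol frequencies (counts out of 100).
freqs = {"A": 45, "B": 25, "C": 15, "D": 10, "E": 5}
code = huffman_code(freqs)
average = sum(freqs[s] * len(code[s]) for s in freqs) / sum(freqs.values())

print(code)
print("average bits/symbol:", average, "versus 3 for a fixed-length code")
```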
Pixel Coding using the DCT
• Because the human eye is insensitive to high-frequency color changes, the R, G, B signal is converted into a luminance signal (Y) and two color-difference signals (U, V). More redundancy can be removed from U and V than from Y.
• The top-left DCT coefficient is taken as the DC value for the block.
• DCT coefficients to the right represent increasingly higher horizontal spatial frequencies. DCT coefficients below represent increasingly higher vertical spatial frequencies.
Removing spatial redundancy
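The color conversion step can be sketched as follows. The slides do not give the conversion matrix, so this example assumes the full-range YCbCr equations commonly used with JPEG files (Cb and Cr are scaled and offset versions of the color differences B - Y and R - Y), and it models the extra redundancy removal on the color channels as a simple 2x2 average. The function names are introduced here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to luminance and two color-difference values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_2x2(plane):
    """Halve a color-difference plane in both directions by averaging 2x2 pixels."""
    h, w = len(plane), len(plane[0])
    return [[(plane[i][j] + plane[i][j + 1] +
              plane[i + 1][j] + plane[i + 1][j + 1]) / 4
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

# A gray pixel carries no color information: both differences sit at the 128 midpoint.
print(rgb_to_ycbcr(128, 128, 128))

# Hypothetical 4x4 Cb plane reduced to 2x2: a quarter of the original samples.
cb_plane = [[100, 102, 120, 122],
            [104, 106, 124, 126],
            [140, 142, 160, 162],
            [144, 146, 164, 166]]
print(subsample_2x2(cb_plane))  # [[103.0, 123.0], [143.0, 163.0]]
```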
[Figure: inter-frame encoder block diagram. Blocks: Source input, Frame reordering, Field/Frame memory, Activity calculator, Motion estimator 1, Field/Frame DCT selector, DCT, Q, VLC, MUX, Buffer, Rate control (MQ), De-Q, IDCT, Field/Frame memory, Adaptive predictor, Motion estimator 2, side information. Output: coded bitstream]
Inter-frame compression
Inter-frame prediction & motion estimation
• Predicting each frame from its neighbors, and coding only what the prediction misses, greatly reduces the overall bit rate.
Temporal redundancy
Motion estimation
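One common way to perform motion estimation is full-search block matching with the sum of absolute differences (SAD) as the cost. The slides do not say which search method is meant, so the sketch below is an assumption; the frame contents, block size and search range are made up, and the motion vector is defined here as the displacement from the current block to its best match in the reference frame.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
               for i in range(n) for j in range(n))

def full_search(cur, ref, bx, by, n=4, search=2):
    """Try every displacement within +/- search pixels; return (motion vector, cost)."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = None, float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0 or by + dy + n > h or bx + dx + n > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost

# Hypothetical 8x8 reference frame (a ramp) and a current frame whose content
# has moved by one pixel to the left and one pixel down.
ref = [[8 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y - 1][x + 1] if 0 <= y - 1 < 8 and 0 <= x + 1 < 8 else 0
        for x in range(8)] for y in range(8)]

mv, cost = full_search(cur, ref, bx=2, by=2)
print(mv, cost)  # (1, -1) 0
```

With a perfect match the prediction error is zero, so only the motion vector needs to be transmitted for this block.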
I, P, B Frames
• Intra (I) frames contain full picture information.
• Predicted (P) frames are predicted from past I or P frames.
• Bi-directionally predicted (B) frames offer the greatest compression and use past and future I and P frames for motion compensation.
Putting it all together
• This slide shows how the actual blocks, slices, frames etc. are all put together to form the elementary stream
• Along with the actual picture data, header information is required to reconstruct the I, B, P frames. This header structure is shown.
• The next stage is to take this ES and convert it into something that can be transmitted and decoded at the other end.
Building the elementary stream
Frame Reordering
Ordering frames
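Frame reordering is needed because a B frame depends on a future I or P frame, so that reference frame must reach the decoder first. A minimal sketch of the reordering is shown below; the frame labels and the example sequence are made up for illustration.

```python
def coding_order(display_order):
    """Reorder frames so every B frame follows both of its reference frames."""
    out, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)   # hold B frames until the next I/P frame is sent
        else:
            out.append(frame)         # the I or P frame goes first ...
            out.extend(pending_b)     # ... followed by the B frames that needed it
            pending_b = []
    out.extend(pending_b)             # B frames left without a later reference here
    return out

display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
print(coding_order(display))
# ['I1', 'P4', 'B2', 'B3', 'P7', 'B5', 'B6']
```

The decoder applies the inverse reordering before display, which costs some frame memory and delay.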
MPEG-4 (ISO/IEC 14496)
Applications :
Internet Multimedia
Wireless Multimedia Communication
Multimedia Contents for Computers and Consumer Electronics
Interactive Digital TV
Coding scheme :
Spatial redundancy : DCT + Quantization, Wavelet Transform
Temporal redundancy : Motion estimation and compensation
Statistical redundancy : VLC (Huffman Coding, Arithmetic Coding)
Shape Coding : Context-based Arithmetic Coding
References :
- ISO/IEC 14496
MPEG-4
Interactive television
Scene composition
MPEG-4: Background
MPEG-4: Scene composition
MPEG-4 Video: Summary