Computer Vision – Coding Standards

Post on 11-Feb-2016

Page 1: Computer Vision – Coding Standards

Computer Vision – Coding Standards

Hanyang University

Jong-Il Park

Acknowledgement: Many of the materials are adapted from Prof. Jechang Jeong’s excellent presentation on international coding standards.

Page 2: Computer Vision – Coding Standards

Department of Computer Science and Engineering, Hanyang University

Topics to be covered

International coding standards
- Background and brief history
- Key techniques in JPEG and MPEG-1, 2, 4

* Only image/video coding techniques will be covered

Page 3: Computer Vision – Coding Standards

Multimedia Everywhere

Towards Multimedia :
- Computer
- Consumer Electronics
- Tele-Communication
- Broadcasting

Page 4: Computer Vision – Coding Standards

Still Picture Compression Standards

1980 : ITU-T T.4 : G3 FAX for PSTN
- Modified Huffman and Modified READ

1984 : ITU-T T.6 : G4 FAX for ISDN
- Modified MR

1992 : JPEG (ISO 10918, ITU-T T.81) : Color Still Pictures
- Used for Color Fax, Electronic Still Camera, Color Printer, Computer Applications etc.
- Lossless/Lossy Modes, Baseline/Extended Modes, Progressive/Sequential Modes
- DPCM + DCT + Q + RLE + Huffman/Arithmetic Codes
- Motion JPEG can be used for Moving Pictures.

1993 : JBIG (ISO 11544, ITU-T T.82) : Bi-level Pictures
- Improvement on T.4 and T.6

Recently: JPEG-LS, JBIG2, etc.

Page 5: Computer Vision – Coding Standards

Moving Picture Compression Standards

1982 : ITU-R BT.601 : Studio Quality PCM Component Video
- Common to 525/60 and 625/50 Systems
- 13.5 MHz Sampling, 8 bit/sample, 4:2:2 Format

1990 : ITU-T H.261 : Video Phone/Conference Application via ISDN
- Bitrate = p x 64 kbps, p = 1-30
- MC DPCM + DCT + Q + RLE + Huffman Codes
- Reference Model 1 - 8

1992 : MPEG-1 Video : DSM Applications (e.g. Video CD)
- Bitrate = 1.5 Mbps
- MC DPCM + DCT + Q + RLE + Huffman Codes
- GOP Structure for Random Access and Error Recovery (I, P, B Frames)
- Simulation Model 1 - 3
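The figures on this slide explain why compression is needed at all. The sketch below works out the raw bit rate of BT.601 4:2:2 video from the 13.5 MHz luminance sampling rate and 8 bit/sample, and compares it with the H.261 and MPEG-1 target rates. It assumes that in 4:2:2 each of the two colour-difference signals is sampled at half the luminance rate; the function names are illustrative only.

```python
# Raw bit rate of BT.601 4:2:2 video versus the coded bit rates on the slide.
LUMA_RATE_HZ = 13.5e6      # luminance sampling frequency from the slide
BITS_PER_SAMPLE = 8        # sample precision from the slide


def bt601_422_bitrate(luma_rate_hz=LUMA_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bits per second for one luma and two half-rate colour-difference signals."""
    chroma_rate_hz = luma_rate_hz / 2          # assumption: 4:2:2 halves the chroma rate
    samples_per_second = luma_rate_hz + 2 * chroma_rate_hz
    return samples_per_second * bits


def h261_bitrate(p):
    """H.261 channel rate in bits per second: p x 64 kbps with p = 1..30."""
    if not 1 <= p <= 30:
        raise ValueError("p must be in the range 1..30")
    return p * 64_000


raw = bt601_422_bitrate()
print(raw / 1e6)                 # 216.0 (Mbps)
print(raw / h261_bitrate(30))    # 112.5, ratio to the highest H.261 rate
print(raw / 1.5e6)               # 144.0, ratio to the MPEG-1 rate of 1.5 Mbps
```

In practice H.261 and MPEG-1 operate on lower-resolution pictures than BT.601, so these ratios only indicate the order of magnitude of the reduction.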

Page 6: Computer Vision – Coding Standards

Moving Picture Compression Standards (Cont.)

1994 : MPEG-2 Video (ISO 13818-2, ITU-T H.262) : Generic Algorithm for Various Applications (Broadcasting, Communication, Network, DSM etc.)
- 5 Profiles of Functionality (Simple, Main, Spatial Scalable, SNR Scalable, High)
- 4 Levels of Resolution (Low, Main, High-1440, High)
- Deals with Interlaced Scan as well as Progressive Scan
- Field/Frame ME & DCT, Dual Prime ME, Intra VLC, Alternate Scan, Nonuniform Q, etc.

1993 : ITU-R CMTT.721 : 140 Mbps Contribution Quality Video
- Adaptive DPCM, Componentwise

1993 : ITU-R CMTT.723 : 34-45 Mbps Contribution Quality Video
- MC DPCM + DCT + Q + RLE + Huffman Codes

Page 7: Computer Vision – Coding Standards

Moving Picture Compression Standards (Cont.)

1995 : ITU-T H.263 : Videophone via PSTN
- Bitrate < 64 kbps (V.34 modem = 33.6 kbps, Recent modem = 56 kbps)
- Improved version of H.261

1998 : MPEG-4
- Bitrates < 2 Mbps
- Targets: Multimedia data base access, Wireless multimedia communication
- Components of H.263 are incorporated
- Content-based compression
- Synthetic and natural video/audio
- Multiple tools/algorithms/profiles => Flexibility

1999 : MPEG-4 Version 2, MPEG-7

Page 8: Computer Vision – Coding Standards

Continuous-tone still image

JPEG (Joint Photographic Experts Group)

Applications : color FAX, digital still camera, multimedia computer, internet

JPEG Standard consists of
- a lossy baseline coding system
- an extended coding system for greater compression, higher precision or progressive reconstruction applications
- a lossless independent coding system for reversible compression

References
- ITU-T recommendation T.81, “Information Technology - Digital Compression and Coding of Continuous-Tone Still Images - Requirements and Guidelines”, 92. 2
- K. R. Rao, J. J. Hwang, “Techniques & Standards for Image, Video & Audio Coding”, Prentice Hall PTR, 1996

Page 9: Computer Vision – Coding Standards

Baseline system

Baseline system : most widely used among JPEG standards

Data precision
- 8 bits for input and output
- 11 bits for quantized DCT coefficients

Algorithm
- DCT + quantization + variable length coding

Compression Guideline
- 0.25 ~ 0.5 bits/pixel : moderate to good quality, some applications
- 0.5 ~ 0.75 bits/pixel : good to very good quality, many applications
- 0.75 ~ 1.5 bits/pixel : excellent quality, most applications
- 1.5 ~ 2.0 bits/pixel : indistinguishable (visually lossless) quality, most demanding applications

Page 10: Computer Vision – Coding Standards

Block diagram of baseline system

[Figure: baseline system encoder]

[Figure: baseline system decoder]

Page 11: Computer Vision – Coding Standards

Quantization and inverse quant.

Quantization table
- No default values for quantization tables
- Application may specify the tables
- Q(u, v) : quantization table, integer value from 1 to 255

Quantization : $F^Q(u,v) = \mathrm{round}\left( \frac{F(u,v)}{Q(u,v)} \right)$

Dequantization : $R(u,v) = F^Q(u,v) \, Q(u,v)$
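The quantization and dequantization rules on this slide can be sketched in a few lines of NumPy. The 2 x 2 coefficient block and table below are made-up illustration values, not a table from the standard (the slide notes that JPEG defines no default tables). One assumption to flag: `np.rint` rounds exact halves to the nearest even integer, which may differ from the rounding rule of a real codec.

```python
import numpy as np


def quantize(F, Q):
    """F^Q(u,v) = round(F(u,v) / Q(u,v)), element by element."""
    return np.rint(np.asarray(F, dtype=np.float64) / Q).astype(np.int32)


def dequantize(FQ, Q):
    """R(u,v) = F^Q(u,v) * Q(u,v), element by element."""
    return np.asarray(FQ) * Q


# Hypothetical DCT coefficients and quantization steps (each step in 1..255).
F = np.array([[-415.4, 30.2],
              [-61.0, 4.6]])
Q = np.array([[16, 11],
              [12, 12]])

FQ = quantize(F, Q)
R = dequantize(FQ, Q)
print(FQ.tolist())   # [[-26, 3], [-5, 0]]
print(R.tolist())    # [[-416, 33], [-60, 0]]
```

The difference between F and R is the quantization error; it is the only lossy step of the baseline algorithm, and a larger Q(u, v) gives a coarser reconstruction of that coefficient.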

Page 12: Computer Vision – Coding Standards

Example

f (x,y) → FDCT → F (u,v) → Quant. → FQ (u,v) → Inverse Q & IDCT → r (x,y)

[Figure panels: f (x,y), F (u,v), FQ (u,v), r (x,y), e (x,y); numeric values not preserved]
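Since the numeric values of this example did not survive, the chain f → FDCT → F → Quant. → FQ → Inverse Q & IDCT → r can be reproduced with a small sketch. It assumes an orthonormal 8 x 8 DCT-II, a level shift of 128 for 8-bit samples, a flat quantization table chosen only for illustration, and that e(x,y) denotes the reconstruction error f − r.

```python
import numpy as np

N = 8


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C, so that F = C @ f @ C.T."""
    k = np.arange(n).reshape(-1, 1)    # frequency index (rows)
    x = np.arange(n).reshape(1, -1)    # sample index (columns)
    C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)         # DC row uses the smaller scale factor
    return C


C = dct_matrix()


def fdct(f):
    """Forward 2-D DCT of a level-shifted 8x8 block."""
    return C @ (np.asarray(f, dtype=np.float64) - 128.0) @ C.T


def idct(F):
    """Inverse 2-D DCT, undoing the level shift."""
    return C.T @ F @ C + 128.0


def roundtrip(f, Q):
    """Return F, FQ, the reconstruction r and the error e = f - r."""
    F = fdct(f)
    FQ = np.rint(F / Q)
    r = np.clip(np.rint(idct(FQ * Q)), 0, 255)
    return F, FQ, r, f - r


# Hypothetical input: a horizontal ramp, quantized with a flat step of 16.
f = np.tile(np.arange(N) * 8.0 + 100.0, (N, 1))
Q = np.full((N, N), 16.0)
F, FQ, r, e = roundtrip(f, Q)
print(int(np.count_nonzero(FQ)), "non-zero quantized coefficients out of", N * N)
print("largest absolute pixel error:", float(np.abs(e).max()))
```

For smooth blocks like this one, most quantized coefficients are zero, which is what the zigzag scan and run-length coding exploit afterwards.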

Page 13: Computer Vision – Coding Standards

Entropy coding

DC Coefficient Coding
- Differential Coding : DC coefficients of adjacent blocks are strongly correlated.
- VLC (Huffman Coding)
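Differential coding of the DC coefficients can be sketched as follows. The sketch assumes the predictor starts at 0 for the first block and leaves out the Huffman stage that would then code each difference; the DC values are made up for illustration.

```python
def dc_differences(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    previous = 0                       # assumption: predictor starts at 0
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_reconstruct(diffs):
    """Decoder side: accumulate the differences to recover the DC values."""
    previous = 0
    values = []
    for d in diffs:
        previous += d
        values.append(previous)
    return values


dcs = [-26, -24, -25, -20]             # hypothetical quantized DC values
diffs = dc_differences(dcs)
print(diffs)                           # [-26, 2, -1, 5]
print(dc_reconstruct(diffs) == dcs)    # True
```

Because neighbouring blocks have similar DC values, the differences cluster around zero and can be given short variable-length codes.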

Page 14: Computer Vision – Coding Standards

Entropy coding (Cont.)

AC Coefficient Coding
- Zigzag Scanning
- VLC (Variable Length Coding, Huffman Coding)
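A minimal sketch of the zigzag scan followed by run-length coding of the AC coefficients is given below. It is simplified: the `"EOB"` end-of-block marker is a stand-in name, and the size categories and long-zero-run symbols of the actual JPEG bitstream are omitted. The example block is made up.

```python
def zigzag_indices(n=8):
    """(row, col) pairs of an n x n block in zigzag order."""
    def key(rc):
        r, c = rc
        s = r + c                          # anti-diagonal number
        return (s, r if s % 2 else -r)     # odd diagonals run down, even ones run up
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_level_pairs(block):
    """(run of zeros, non-zero level) pairs for the AC coefficients of one block."""
    scan = [block[r][c] for r, c in zigzag_indices(len(block))]
    pairs, run = [], 0
    for coeff in scan[1:]:                 # scan[0] is the DC coefficient, coded separately
        if coeff == 0:
            run += 1
        else:
            pairs.append((run, coeff))
            run = 0
    if run:
        pairs.append("EOB")                # only zeros remain up to the end of the block
    return pairs


block = [[0] * 8 for _ in range(8)]        # hypothetical quantized block
block[0][0], block[0][1], block[2][0] = -26, -3, 2
print(zigzag_indices()[:6])    # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(run_level_pairs(block))  # [(0, -3), (1, 2), 'EOB']
```

The zigzag order visits low frequencies first, so the zeros produced by quantization at high frequencies end up in long runs at the tail of the scan.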

Page 15: Computer Vision – Coding Standards

E.g. JPEG Compression

[Figure panels:]
- Original image (24 bpp)
- JPEG compressed image (8:1 -- 3 bpp)
- JPEG compressed image (32:1 -- 0.75 bpp)
- JPEG compressed image (128:1 -- 0.1875 bpp)
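The bits-per-pixel figures in these captions follow directly from the compression ratio and the 24 bpp original, as this short check shows. The 512 x 512 image size in the last line is only an assumed example.

```python
def bits_per_pixel(ratio, original_bpp=24):
    """Average bits per pixel after compressing by the given ratio."""
    return original_bpp / ratio


def compressed_bytes(width, height, ratio, original_bpp=24):
    """Approximate compressed size in bytes, ignoring file headers."""
    return width * height * bits_per_pixel(ratio, original_bpp) / 8


for ratio in (8, 32, 128):
    print(f"{ratio}:1 -> {bits_per_pixel(ratio)} bpp")
# 8:1 -> 3.0 bpp
# 32:1 -> 0.75 bpp
# 128:1 -> 0.1875 bpp

print(compressed_bytes(512, 512, 32))   # 24576.0 bytes for a hypothetical 512 x 512 image
```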

Page 16: Computer Vision – Coding Standards

MPEG Digital Video Technology

MPEG-1 ( ISO/IEC 11172 ) and MPEG-2 ( ISO/IEC 13818 )

Applications :
- MPEG-1 : Digital Storage Media (CD-ROM…)
- MPEG-2 : Higher bit rates and broader generic applications (Consumer electronics, Telecommunications, Digital Broadcasting, HDTV, DVD, VOD, etc.)

Coding scheme :
- Spatial redundancy : DCT + Quantization
- Temporal redundancy : Motion estimation and compensation
- Statistical redundancy : VLC

References :
- ISO/IEC 11172-2 (MPEG-1), ISO/IEC 13818-2 (MPEG-2)
- K. R. Rao and J. J. Hwang, “Techniques & Standards for Image, Video & Audio Coding,” Prentice Hall, 1996.

Page 17: Computer Vision – Coding Standards

MPEG Overview

MPEG :
- Moving Picture Experts Group
- Specifies a standard compression, transmission, and decompression scheme for video and audio.
- ISO/IEC 11172 : MPEG-1
- ISO/IEC 13818 : MPEG-2
- Consists of 3 parts.
  Part 1 : System
  Part 2 : Video
  Part 3 : Audio

Page 18: Computer Vision – Coding Standards

MPEG compression of video

How to remove spectral, spatial, temporal, and statistical redundancy?

Page 19: Computer Vision – Coding Standards

Intra-frame compression

Video → DCT → Q → Entropy Coding → MUX → Buffer → Compressed Data

Rate Control : Quantization step size

- DCT : No information loss, no data reduction
- Q : Information loss, data reduction
- Entropy Coding : RLE (data reduction), VLC (data reduction)

Quantizing
- Reduce the number of bits for each coefficient.
- Give preference to certain coefficients.
- Reduction can differ for each coefficient.

Coefficients processing order to encourage runs of 0s

Run Length Coding
- Generates (Run, Level) symbols

Variable Length Coding
- Use short words for most frequent symbols (like Morse code)

[Figure: quantizer levels against Input Value. The panel labelled "8-bit quantization" lists the codes 111, 110, 101, 100, 011, 010, 001, 000; the panel labelled "2-bit quantization" lists the codes 11, 10, 01, 00.]
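The idea of variable length coding, short words for the most frequent symbols, can be illustrated with a small Huffman code builder. This is a generic textbook construction and not the fixed VLC tables that MPEG specifies; the (Run, Level) symbols and their counts are made up.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, {symbol: code so far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c0, _, codes0 = heapq.heappop(heap)  # the two least frequent subtrees
        c1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (c0 + c1, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical stream of (Run, Level) symbols with skewed frequencies.
stream = [(0, 1)] * 8 + [(1, 1)] * 4 + [(0, 2)] * 2 + [(3, 1)]
codes = huffman_code(stream)
coded_bits = sum(len(codes[s]) for s in stream)
print({s: len(c) for s, c in codes.items()})   # {(0, 1): 1, (1, 1): 2, (0, 2): 3, (3, 1): 3}
print(coded_bits, "bits, versus", 2 * len(stream), "with fixed 2-bit codes")  # 25 versus 30
```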

Page 20: Computer Vision – Coding Standards

Removing spatial redundancy

Pixel Coding using the DCT

• As human eyes are insensitive to high-frequency (HF) color changes, the R, G, B signal is converted into a luminance signal (Y) and two color difference signals (U, V). More redundancy can be removed from U and V than from Y.

• The top left DCT component is taken as the DC datum for the block.

• DCT coefficients to the right represent increasingly higher horizontal spatial frequencies. DCT coefficients below represent increasingly higher vertical spatial frequencies.
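The colour conversion described here can be sketched as follows. The sketch assumes the BT.601 luminance weights, scaled colour-difference signals offset by 128 (commonly written Cb and Cr rather than U and V), and a simple 2 x 2 averaging for 4:2:0 chroma subsampling; a real encoder may use different filters and sample positions.

```python
import numpy as np


def rgb_to_ycbcr(rgb):
    """Convert an H x W x 3 RGB array (0..255) to Y, Cb, Cr planes."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b      # luminance
    cb = 0.564 * (b - y) + 128.0               # blue colour difference
    cr = 0.713 * (r - y) + 128.0               # red colour difference
    return y, cb, cr


def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2 x 2 neighbourhoods."""
    h, w = plane.shape
    even = plane[:h - h % 2, :w - w % 2]       # drop an odd last row or column
    return even.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


rgb = np.full((4, 4, 3), 100)                  # hypothetical flat grey image
y, cb, cr = rgb_to_ycbcr(rgb)
cb_sub, cr_sub = subsample_420(cb), subsample_420(cr)
kept = y.size + cb_sub.size + cr_sub.size
print(kept, "samples kept out of", rgb.size)   # 24 samples kept out of 48
```

Subsampling alone halves the number of samples before any transform coding, which is one way of removing more redundancy from the colour-difference signals than from Y.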

Page 21: Computer Vision – Coding Standards

Inter-frame compression

[Block diagram of the encoder. Labelled blocks: Source input; Frame reordering; Field/Frame memory; Motion estimator 1; Activity calculator; Field/Frame DCT selector; DCT; Q; VLC; MUX; Buffer; Coded bitstream; De-Q; IDCT; Field/Frame memory; Adaptive predictor; Motion estimator 2; Rate control; MQ; Side information]

Page 22: Computer Vision – Coding Standards

Temporal redundancy

Inter-frame prediction & motion estimation

• Coding only the motion-compensated prediction error, rather than each full frame, greatly reduces the overall bit rate.

Page 23: Computer Vision – Coding Standards

Motion estimation
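As the figure for this slide is missing, here is a minimal sketch of one common way to estimate motion: exhaustive block matching with the sum of absolute differences (SAD). The MPEG standards do not prescribe a search method, so the block size of 16, the ±7 search range and the synthetic test frames are all assumptions made for illustration.

```python
import numpy as np


def full_search(cur, ref, top, left, block=16, search=7):
    """Find the displacement (dy, dx) into ref that best matches a block of cur."""
    target = cur[top:top + block, left:left + block].astype(np.int64)
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cand = ref[y:y + block, x:x + block].astype(np.int64)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad


# Synthetic test: the current frame is the reference shifted down 2 and left 3.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64))
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))
mv, sad = full_search(cur, ref, top=16, left=16)
print(mv, sad)   # (-2, 3) 0
```

The encoder then transmits the motion vector and the (ideally small) difference between the block and its prediction instead of the block itself.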

Page 24: Computer Vision – Coding Standards

Putting it all together

I, P, B Frames

• Intra (I) frames contain full picture information.

• Predicted (P) frames are predicted from past I or P frames.

• Bi-directionally predicted (B) frames offer the greatest compression and use past and future I and P frames for motion compensation.

Page 25: Computer Vision – Coding Standards

Building the elementary stream

• This slide shows how the actual blocks, slices, frames etc. are all put together to form the elementary stream (ES).

• Along with the actual picture data, header information is required to reconstruct the I, B, P frames. This header structure is shown.

• The next stage is to take this ES and convert it into something that can be transmitted and decoded at the other end.

Page 26: Computer Vision – Coding Standards

Ordering frames

Frame Reordering
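Frame reordering is needed because a B frame cannot be decoded until the later I or P frame it refers to has arrived. The sketch below converts display order to coding order under a simplifying assumption: every B frame depends on the nearest preceding and following I or P frame. The frame labels are illustrative.

```python
def display_to_coding_order(frames):
    """Move each I or P frame ahead of the B frames that precede it in display order."""
    coded, pending_b = [], []
    for frame in frames:
        if frame.startswith("B"):
            pending_b.append(frame)     # hold back until the next anchor frame is sent
        else:
            coded.append(frame)         # I or P frame goes out first
            coded.extend(pending_b)     # then the B frames that needed it
            pending_b = []
    coded.extend(pending_b)             # trailing B frames with no later anchor in this list
    return coded


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(display_to_coding_order(display))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The decoder reverses this step, holding each decoded I or P frame until the B frames displayed before it have been output.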

Page 27: Computer Vision – Coding Standards

MPEG-4

MPEG-4 ( ISO/IEC 14496 )

Applications :
- Internet Multimedia
- Wireless Multimedia Communication
- Multimedia Contents for Computers and Consumer Electronics
- Interactive Digital TV

Coding scheme :
- Spatial redundancy : DCT + Quantization, Wavelet Transform
- Temporal redundancy : Motion estimation and compensation
- Statistical redundancy : VLC (Huffman Coding, Arithmetic Coding)
- Shape Coding : Context-based Arithmetic Coding

References :
- ISO/IEC 14496

Page 28: Computer Vision – Coding Standards

Interactive television

Page 29: Computer Vision – Coding Standards

Scene composition

Page 30: Computer Vision – Coding Standards

MPEG-4

Page 31: Computer Vision – Coding Standards

MPEG-4: Background

Page 32: Computer Vision – Coding Standards

MPEG-4: Concept

Page 33: Computer Vision – Coding Standards

MPEG-4: Scene composition

Page 34: Computer Vision – Coding Standards

MPEG-4 Video: Summary

Page 35: Computer Vision – Coding Standards

MPEG-4 Decoder

Page 36: Computer Vision – Coding Standards

Sprite in MPEG-4

Page 37: Computer Vision – Coding Standards

SNR Scalability

Page 38: Computer Vision – Coding Standards

Spatial Scalability