x-media audio, image and video coding · contents 3 •introduction to x-media coding generic...

1

X-Media: Audio, Image and Video Coding

Laurent Duval (IFP Energies nouvelles)
Eric Debes (Thalès)
Philippe Morosini (Supélec)

Course materials: Moodle/NTNoe

http://www.laurent-duval.eu/lcd-lecture-supelec-xmedia.html

2

General information

3Contents

• Introduction to X-Media coding: generic principles

• Audio: data recording/sampling, physiology

• Data, audio coding: LZ*, MPEG Layer 3

• Image coding: JPEG vs. JPEG 2000

• Video coding: MPEG formats, H.264

• Bonuses, exercises

4Initial motivation

5Initial motivation

• Coding/compression as a DSP engineer discipline
  - information reduction, standards & adaptation, complexity, integrity & security issues, interaction with the SP pipeline

• Composite domain, yet ubiquitous
  - sampling (Nyquist-Shannon, filter banks), statistical SP (KLT, decorrelation), transforms (Fourier, wavelets), classification (quantization, K-means), functional spaces (bases & frames), information theory (entropy), error measurements, modelling

6Initial motivation

Digital Image Processing

• Digital Image Characteristics
  - Spatial: gray-level histogram
  - Spectral: DFT, DCT
• Pre-Processing
  - Enhancement: point processing, masking, filtering
  - Restoration: degradation models, inverse filtering, Wiener filtering
• Compression
  - Information theory
  - Lossless: LZW (gif)
  - Lossy: transform-based (jpeg)
• Segmentation
  - Edge detection
• Description
  - Shape descriptors, texture, morphology

7Initial motivation

• Exemplar
  - underlines the importance of specific steps (related to other lectures) to make stuff work
  - which algorithm for which task?
  - process text? sound? images? video?
  - "the toolbox quote" (Juran)

• Evolving
  - steady evolution of standards and tools, overview of future directions (and tools)

• Central to SP tasks
  - what is really important in my data?
  - (stored) info. overflow + degradations (decon.)

8

Principles

9Principles

• What is data (image) compression?
  - Data compression is the art and science of representing information in a compact form.
  - Data is a sequence of symbols taken from a discrete alphabet.
  - Text: sequence of characters/bytes (0.5 D)
  - Sound: collection of arrays of values representing intensities (1 D to 1.5 D)
  - Still image data: collection of arrays (one for each color plane) of values representing the intensity (color) of the point at the corresponding spatial location (pixel) (2 D or 2.5 D)
  - Video: sequence of still images (3 D)
  - Next: 3D-TV (audio, video, ambiance, smell?)

10

Flurry of formats ?

11Examples of data compression extensions

• Some (standard) extensions: GIF, RAR, ZIP, BZ2, MP3, MPEG4, AVC, PNG, J2K, BH, AVI, R(A)M, LHA, OGG, AAC, HE-AAC, MPC, OGM, APE, TIFF, JP2, JPEG-XR (WMP/HD Photo), WAV, PAK, FLV, FLAC, PDF, MAT, BMP, Z, GZ, MJPEG, 7z, TTA, DjVu, WebP

http://en.wikipedia.org/wiki/List_of_archive_formats

• Targeted for what kind of data?

12

Why?

13

Still Image
• One page of A4 format at 600 dpi is > 100 MB.
• One color image in a digital camera generates 10-30 MB.
• Scanned 3”×7” photograph at 300 dpi is 30 MB.

HDTV Video
• 720×1280 pixels/frame × 60 frames/s × 24 bits/pixel = 1.3 Gb/s
• HDTV bandwidth: 20 Mb/s
• Objective: 70× reduction
• Equivalent: 0.35 bits/pixel

• Infotrends (2008/01): total # of digital pictures since the beginning ~ 180 billion. Should grow to 347 billion in 2012.
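The HDTV figures above can be checked with a few lines of arithmetic. This is only a sketch: the 24 bits per pixel (three 8-bit colour planes) is an assumption, since the slide does not state the pixel depth. Under that assumption the exact values come out at about 66× and 0.36 bits/pixel, which the slide rounds to 70× and 0.35.

```python
# Raw HDTV bit rate versus the 20 Mb/s channel quoted on the slide.
# Assumption: 24 bits per pixel (3 colour planes x 8 bits).
WIDTH, HEIGHT, FPS = 1280, 720, 60
BITS_PER_PIXEL = 24
CHANNEL_BPS = 20e6

pixels_per_second = WIDTH * HEIGHT * FPS
raw_bps = pixels_per_second * BITS_PER_PIXEL          # uncompressed rate
reduction = raw_bps / CHANNEL_BPS                     # required compression factor
coded_bits_per_pixel = CHANNEL_BPS / pixels_per_second

print(f"raw rate: {raw_bps / 1e9:.2f} Gb/s")
print(f"required reduction: {reduction:.1f}x")
print(f"budget: {coded_bits_per_pixel:.2f} bits/pixel")
```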

Why do we need Image Compression ?

14

1) Storage
2) Transmission (cable, satellite, wifi)
3) Data access

1990-2000: disk capacities 100 MB → 20 GB (200 times!) (3-4 TB), but seek time 15 ms → 10 ms (3-4 ms) and transfer rate 1 MB/s → 2 MB/s.

(much better for SSD)

Compression improves overall response time in some applications

Why do we need Image Compression ?

15

•Image scanner

•Digital camera

•Video camera,

•Ultrasound (US), computed tomography (CT), magnetic resonance imaging (MRI), digital X-ray (XR), infrared.

•Remote sensing, Seismics, Satellite, Radar, SAR

Source of images

16

Why do we need specific algorithms?

Data types
• Universal compression: textual data
• Image compression: binary images, colour palette images, gray-scale images, true colour images, video images

17Binary image: 1 bit/pel

18Grayscale image: 8 bits/pel

Intensity = 0-255

19

Parameters of digital images

6 bits (64 gray levels)

384×256, 192×128, 96×64, 48×32

20True color image: 3*8 bits/pel

21Goals of compression

• Balance redundancy and irrelevancy
  - sources of redundancy: temporal, spatial, color, other?
  - sources of irrelevancy: perceptually unimportant information
  - issues: redundancy/irrelevancy examples? lossless/lossy choices

22

Lossy vs. lossless compression

Lossless compression: reversible, information preserving (text compression algorithms, binary images, palette images)

Lossy compression: irreversible (grayscale, color, video)

Near-lossless compression: medical imaging, remote sensing.

1) Why do we need lossy compression?
2) When can we use lossy compression?

23

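Rate measures

Bitrate: $\dfrac{C}{N} = \dfrac{\text{size of the compressed file}}{\text{pixels in the image}}$ bits/pel

Compression ratio: $\dfrac{N \cdot k}{C} = \dfrac{\text{size of the original file}}{\text{size of the compressed file}}$

Here C is the size of the compressed file in bits, N is the number of pixels in the image, and k is the number of bits per pixel of the original image.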
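As a small illustration of the two rate measures, the sketch below applies them to the 512 × 512 × 8-bit image and the 15159-byte JPEG file quoted later in these slides. The function names are illustrative, and file sizes are taken in bytes and converted to bits.

```python
def bitrate(compressed_bytes, n_pixels):
    """Bits per pixel: C / N, with C expressed in bits."""
    return 8 * compressed_bytes / n_pixels

def compression_ratio(n_pixels, bits_per_pixel, compressed_bytes):
    """Original size over compressed size: N*k / C."""
    return n_pixels * bits_per_pixel / (8 * compressed_bytes)

N = 512 * 512        # pixels in the image
k = 8                # bits per pixel of the original
C_bytes = 15159      # compressed file size quoted in the JPEG example

bpp = bitrate(C_bytes, N)
ratio = compression_ratio(N, k, C_bytes)
print(f"{bpp:.3f} bits/pel, {ratio:.1f}x compression")
```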

24

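Distortion measures

Mean absolute error (MAE): $\mathrm{MAE} = \dfrac{1}{N} \sum_{i=1}^{N} |y_i - x_i|$

Mean square error (MSE): $\mathrm{MSE} = \dfrac{1}{N} \sum_{i=1}^{N} (y_i - x_i)^2$

Signal-to-noise ratio (SNR): $\mathrm{SNR} = 10 \cdot \log_{10}\left[\sigma^2 / \mathrm{MSE}\right]$ (decibels)

Peak signal-to-noise ratio (PSNR): $\mathrm{PSNR} = 10 \cdot \log_{10}\left[A^2 / \mathrm{MSE}\right]$ (decibels)

$\sigma^2$ is the variance of the signal. A is the peak amplitude of the signal: $A = 2^8 - 1 = 255$ for an 8-bit signal.

Other measures: l_p norms, SSIM, MOS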
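The distortion measures translate directly into code. The following is a minimal sketch on flat lists of samples (x the original, y the reconstruction); the function names and the tiny test signal are illustrative only.

```python
import math

def mae(x, y):
    return sum(abs(b - a) for a, b in zip(x, y)) / len(x)

def mse(x, y):
    return sum((b - a) ** 2 for a, b in zip(x, y)) / len(x)

def snr_db(x, y):
    """10*log10(signal variance / MSE); infinite if the signals are identical."""
    mean = sum(x) / len(x)
    variance = sum((a - mean) ** 2 for a in x) / len(x)
    err = mse(x, y)
    return math.inf if err == 0 else 10 * math.log10(variance / err)

def psnr_db(x, y, peak=255):
    """10*log10(A^2 / MSE) with A = 255 for 8-bit data."""
    err = mse(x, y)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)

original = [10, 20, 30, 40]
decoded = [11, 19, 31, 39]          # every sample is off by one
print(mae(original, decoded), mse(original, decoded))
print(round(snr_db(original, decoded), 2), round(psnr_db(original, decoded), 2))
```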

25Other issues

• Coder and decoder computation complexity
• Memory requirements
• Fixed rate or variable rate
• Error resilience
• Symmetric or asymmetric algorithms
• Decompress at multiple resolutions
• Decompress at various bit rates
• Standard or proprietary

Reduce redundancy/irrelevancy at "each" step, considering performance and quality

26What is an image?

27What is an image?

28What is an image?

29Ultimate storage

• Image compression: why?
  - storage, transmission, database indexing
  - processing (denoising, scaling, rotation)

• How come?
  - 512 x 512 pix 8-bit image → 2,097,152 bits
  - Far less than 10^100 atoms in the entire Universe
  - 10^15 directions, magn., depth and exposure params
  - Tautavel Man takes 1000 pix/s (since 450,000 BC)
  - A collection of 1,42.10176 pix.

• Typical compressed image size?
  - # of bits needed: 17, 586, 10.253, 12.087.300, more?

30What is a compression system?

Compression Model

f(x,y) → Transform → Quantize → Encode

• Source
• Channel

31What is a compression system?

[Block diagram of an adaptive compression system: Data → Pre-Processing → Blocking → Transform → Processing → Reduction → Ordering → Entropy Coding → Bit Stream, steered by a Classifier, a Rate Allocator and a Quality measure, with Embedded Coding.]

32Compression scheme (1)

model error

parameters

+

=

modeling

33Compression scheme (2)

reduction

xO = 1 3 7 2 -5 -1 0 0 0

xR = 0 2 6 2 -6 0 0 0 0

xC = 0 1 3 1 -3 4

coding

transform

34Preprocessing



35Preprocessing

• Image Analysis

• Filtering/enhancement

• RGB to YUV

• Image Extension

36

Red Green Blue

RGB color space

37

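RGB → YUV

R, G, B: red, green, blue
Y: the luminance
U, V: the chrominance components

Most of the information is concentrated in the Y component, while the information content of U and V is lower.

$$Y = 0.3 \cdot R + 0.6 \cdot G + 0.1 \cdot B, \qquad U = B - Y, \qquad V = R - Y$$

$$\begin{pmatrix} Y \\ C_b \\ C_r \end{pmatrix} = \begin{pmatrix} 0.299 & 0.587 & 0.114 \\ -0.16875 & -0.33126 & 0.5 \\ 0.5 & -0.41869 & -0.08131 \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix}$$

$$\begin{pmatrix} R \\ G \\ B \end{pmatrix} = \begin{pmatrix} 1.0 & 0 & 1.402 \\ 1.0 & -0.34413 & -0.71414 \\ 1.0 & 1.772 & 0 \end{pmatrix} \begin{pmatrix} Y \\ C_b \\ C_r \end{pmatrix}$$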
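A minimal sketch of the colour conversion, using the Y, Cb, Cr matrix coefficients shown on this slide. The function names are illustrative, and no level offset or clipping is applied (an assumption: practical codecs usually add 128 to the chrominance of 8-bit data).

```python
FORWARD = [
    [0.299, 0.587, 0.114],          # Y
    [-0.16875, -0.33126, 0.5],      # Cb
    [0.5, -0.41869, -0.08131],      # Cr
]
INVERSE = [
    [1.0, 0.0, 1.402],              # R
    [1.0, -0.34413, -0.71414],      # G
    [1.0, 1.772, 0.0],              # B
]

def apply(matrix, vector):
    """3x3 matrix times 3-vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)

def rgb_to_ycbcr(r, g, b):
    return apply(FORWARD, (r, g, b))

def ycbcr_to_rgb(y, cb, cr):
    return apply(INVERSE, (y, cb, cr))

# A gray pixel has (almost) no chrominance: all the information sits in Y.
print(rgb_to_ycbcr(128, 128, 128))
# Round trip on an arbitrary colour, exact up to the rounding of the coefficients.
print(ycbcr_to_rgb(*rgb_to_ycbcr(200, 100, 50)))
```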

38

Y U V

YUV color space

39Blocking

JPEG blocking, irregular tiling, segmentation


40Blocking

• Goals:

• To exploit image non-stationarities

• To reduce the computational cost

• To exploit inter-block dependencies (2D-3D)

• To select objects of interest (moving)

411D Signal Blocking

• How do we perform decomposition?
  - block by block
  - with overlap

422D Image Blocking

JPEG blocking SegmentationIrregular tiling

43Transform Coding

Karhunen-Loève; Fourier, DCT; Walsh, Hartley; wavelet (packets)


44

Transform Coding

• Goals
  - Efficient representation I(u,v) of an image i(x,y)
  - Data decorrelation (KLT optimality?)

• Properties
  - Linear transforms (matrix op.)
  - Orthogonality
  - Fast algorithms for compression/decompression
  - Laplace-Gauss distribution (var. length coding)

45Karhunen -Loeve (Hotelling ) Transform

$x$: $N \times 1$ vector

$m_x = E\{x\}$: mean vector

$C_x = E\{(x - m_x)(x - m_x)^T\}$: covariance matrix

$\lambda_i,\ i = 1, 2, \ldots, N$: eigenvalues of $C_x$

$e_i$: eigenvectors corresponding to $\lambda_i$

$A$: matrix whose rows are the eigenvectors $e_i$

$$y = A(x - m_x)$$

Hotelling transform of x
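To make the definitions concrete, here is a sketch of the Hotelling transform for 2-D data only. For a 2 × 2 covariance matrix the eigenvectors can be written in closed form as a rotation by the principal-axis angle, which is swapped in here for a general eigen-decomposition. The function name and the sample points are illustrative.

```python
import math

def klt_2d(points):
    """Return mean vector, matrix A (rows = eigenvectors) and y = A(x - m)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Principal-axis angle of the 2x2 covariance matrix [[cxx, cxy], [cxy, cyy]].
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    A = [[c, s], [-s, c]]
    y = [(c * (px - mx) + s * (py - my), -s * (px - mx) + c * (py - my))
         for px, py in points]
    return (mx, my), A, y

# Strongly correlated samples: after the transform the components are decorrelated.
samples = [(2, 1), (3, 5), (5, 4), (7, 8), (9, 6)]
mean, A, transformed = klt_2d(samples)
cross = sum(a * b for a, b in transformed) / len(transformed)
print(mean, A, cross)   # cross-covariance of the outputs is numerically zero
```

The transformed samples already have zero mean, so the last line is directly the off-diagonal term of their covariance matrix.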

46Singular Value Decomposition

47

Transform Coding

• An optimality result:
  - Simplest stationary source model AR(1): DCT ~ KLT

• Troubles:
  - KLT calculations + overhead
  - After a transformation, correlation can be made very small, but coefficients are far from being independent!
  - Transform affects coding

48Transform Coding

• Choices
  - Good decorrelation properties
  - Low complexity, HW implementation (DCT, Fourier)
  - Side effects (extension, zero or linear-phase)

• A great deal of nice transforms
  - Walsh-Hadamard system (with fast transforms)
  - Wavelet (Haar 1910, Mallat, Daubechies 1988)
  - Lapped transforms (Malvar, Meyer)

49Transform Coding

50Transform Coding: performance

51Performance measure : coding gain

52DCT and size

53Transform coding : optimization

542D Hadamard -16

552D – DCT 16

56Standard transforms

• Limitations
  - fixed vector size
  - constrained shape (orthogonality)
  - data-driven adaptation
  - shifts and rotations robustness
  - higher dimension generalizations
  - analogy with vision aspects
  - needed for compression: inverses/redundancy/sparsity
  - pre- and post-processing
  - coder complexity

57Transform coding

• A common waveform

• A common representation

58Transform coding

• A less common waveform

• A less common representation

59Transform coding

• A not so common waveform

• A not so common representation

60Novel transforms

FB-II

low frequencies

high frequencies

Wavelet

61Transform Coding

62Transform Coding

63Processing

Time/frequency filtering, image analysis, texture analysis

64Classifier

Spectrum allocation, texture extraction, segmentation

65Texture Synthesis

66Bit Reduction

Thresholding, subsampling, scalar quantization, vector quantization, iterations (fractals)

67Bit Reduction

• The lossy stage
  - Human eyes see a limited range of tones/frequencies
  - Real-world pictures are already imperfect

• Methods
  - Scalar quantization: uniform, logarithmic, optimal (Lloyd-Max algorithm)
  - Vector quantization: Voronoï diagrams, nearest neighbour

• Adaptivity
  - Pre-stored tables
  - Training-set-based tables
  - On-the-fly quantization estimation

68Quantization

8-bits last 5 bits last 4 bits

69Quantization

last 3 bits last 2 bits last bit

70Quantization

8 3

1

4

2

71Quantization

72Quantization (non -uniform )

73Quantization (adaptive)

74Quantization (vector)

• Outline for images

75

xk

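Image → image vectors $X_j$ → closest matching code vector in the codebook

Codebook: code vectors $V_i$, $1 \le i \le L$: $V_1, V_2, \ldots, V_k, \ldots, V_L$

Mean Square Error (MSE) (Euclidean distance):

$$d(X_j, V_k) = \frac{1}{n} \sum_{i=1}^{n} \left(X_j(i) - V_k(i)\right)^2$$

Weighted MSE:

$$d(X_j, V_k) = \frac{1}{n} \sum_{i=1}^{n} W_i \left(X_j(i) - V_k(i)\right)^2$$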
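The closest-match search can be sketched as follows: each image vector is replaced by the index of the code vector that minimises the MSE distance, and decoding is a table lookup. The tiny codebook and the function names are illustrative; codebook design (e.g. by the LBG algorithm) is not shown.

```python
def mse_distance(x, v):
    """d(X, V) = (1/n) * sum_i (X(i) - V(i))^2"""
    return sum((a - b) ** 2 for a, b in zip(x, v)) / len(x)

def vq_encode(vectors, codebook):
    """Index of the closest code vector for every input vector."""
    return [min(range(len(codebook)), key=lambda k: mse_distance(x, codebook[k]))
            for x in vectors]

def vq_decode(indices, codebook):
    return [codebook[k] for k in indices]

codebook = [(0, 0), (10, 10), (0, 10)]       # L = 3 code vectors of dimension n = 2
image_vectors = [(1, 1), (9, 8), (2, 9)]
indices = vq_encode(image_vectors, codebook)
print(indices, vq_decode(indices, codebook))
```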

Quantization (vector)


• LBG algorithm


• Simple codebooks (parrots)

• Codebook for a specific feature, e.g. edges, smooth areas, etc.

• Codebooks could be of different sizes

78Ordering

Raster scan; zig-zag, Hilbert scan; zerotrees; object-based coding

79Ordering

80Tree-coding ordering

81Ordering

82Wavelet/Blocking equivalence

83Entropy Coding

RLE, Huffman; arithmetic, LZ*; form dictionaries

84Entropy Coding

After reduction:

• Lower coefficient variance
• A lot zeroed out

85Arithmetic Coding

Symbol  #  Huff.  Low  Up
A       1  4      0    0.1
C       1  5      0.1  0.2
D       1  5      0.2  0.3
E       3  1      0.3  0.6
N       2  2      0.6  0.8
T       2  3      0.8  1

(#: number of occurrences in the message; Huff.: Huffman codeword length in bits)

Final interval width 4.32·10^-8 ≈ 2^-24.46: 24.46 bits against 27 bits

AAAAAAAAA END: 7 against 11 bits

ANTECEDENT

Incoming  Low           Up
Start     0             1
A         0             0.1
N         0.06          0.08
T         0.076         0.08
E         0.0772        0.0784
C         0.07732       0.07744
E         0.077356      0.077392
D         0.0773632     0.0773668
E         0.07736428    0.07736536
N         0.077364928   0.077365144
T         0.0773651008  0.077365144

Codeword  0.07736511
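The interval narrowing for ANTECEDENT can be reproduced with exact rational arithmetic. This sketch only computes the successive [Low, Up) intervals from the symbol table above; it is not a complete arithmetic coder (no bit output, no finite-precision renormalisation), and the names are illustrative.

```python
from fractions import Fraction
import math

# [Low, Up) sub-interval assigned to each symbol, as in the table.
INTERVALS = {
    "A": ("0", "0.1"), "C": ("0.1", "0.2"), "D": ("0.2", "0.3"),
    "E": ("0.3", "0.6"), "N": ("0.6", "0.8"), "T": ("0.8", "1"),
}

def narrow(message):
    """Yield (symbol, low, up) after each symbol of the message."""
    low, up = Fraction(0), Fraction(1)
    for symbol in message:
        s_low, s_up = (Fraction(v) for v in INTERVALS[symbol])
        width = up - low
        low, up = low + width * s_low, low + width * s_up
        yield symbol, low, up

steps = list(narrow("ANTECEDENT"))
for symbol, low, up in steps:
    print(symbol, float(low), float(up))

_, final_low, final_up = steps[-1]
bits = -math.log2(float(final_up - final_low))   # information content of the message
print(f"width = {float(final_up - final_low):.3g}, about {bits:.2f} bits")
```

Any number inside the final interval, such as 0.07736511, identifies the message.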

86Rate allocation

Exact bit-rate, progressive coding, distortion matching

87Quality measure


Objective measures (SNR), subjective measures, HVS model

88Embedded coding


89Embedded quantization

Sign   s s s s s s s s
Msb 4  1 1 1 0 0 0 0
    3  x x x 1 1 0 0
    2  x x x x x 0 0
    1  x x x x x 1 1
Lsb 0  x x x x x x x

90Ordering

92

JPEG

93

Encoder: 8×8 DCT → Quantizer → Coefficients-to-Symbols Map → Entropy Coder

JPEG Principles

94

Input Image, Size=512 x 512 x 8 bits

JPEG Principles

95JPEG Principles

96

69 71 74 76 89 106 111 122

59 70 61 61 68 76 88 94

82 70 77 67 65 63 57 70

97 99 87 83 72 72 68 63

91 105 90 95 85 84 79 75

92 110 101 106 100 94 87 93

89 113 115 124 113 105 100 110

104 110 124 125 107 95 117 116

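$$X[u,v] = \frac{C[u]\,C[v]}{4} \sum_{m=0}^{7} \sum_{n=0}^{7} x[m,n] \cos\frac{(2m+1)u\pi}{16} \cos\frac{(2n+1)v\pi}{16}$$

$$C[u] = \begin{cases} 1/\sqrt{2}, & u = 0 \\ 1, & 1 \le u \le 7 \end{cases}$$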

717.6 0.2 0.4 -19.8 -2.1 -6.2 -5.7 -7.6

-99.0 -35.8 27.4 19.4 -2.6 -3.8 9.0 2.7

51.8 -60.8 3.9 -11.8 1.9 4.1 1.0 6.4

30.0 -25.1 -6.7 6.2 -4.4 -10.7 -4.2 -8.0

22.6 2.7 4.9 3.4 -3.6 8.7 -2.7 0.9

15.6 4.9 -7.0 1.1 2.3 -2.2 6.6 -1.7

0.0 5.9 2.3 0.5 5.8 3.1 8.0 4.8

-0.7 -2.3 -5.2 -1.0 3.6 -0.5 5.1 -0.1

Step 1: DCT
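Step 1 can be reproduced with a direct (slow) evaluation of the 8 × 8 DCT formula on the pixel block of this slide. This is a sketch for checking values, not a fast implementation; m indexes rows and n indexes columns, and the names are illustrative.

```python
import math

BLOCK = [
    [69, 71, 74, 76, 89, 106, 111, 122],
    [59, 70, 61, 61, 68, 76, 88, 94],
    [82, 70, 77, 67, 65, 63, 57, 70],
    [97, 99, 87, 83, 72, 72, 68, 63],
    [91, 105, 90, 95, 85, 84, 79, 75],
    [92, 110, 101, 106, 100, 94, 87, 93],
    [89, 113, 115, 124, 113, 105, 100, 110],
    [104, 110, 124, 125, 107, 95, 117, 116],
]

def dct8x8(x):
    """X[u][v] = C[u]C[v]/4 * sum_m sum_n x[m][n] cos((2m+1)u*pi/16) cos((2n+1)v*pi/16)."""
    def c(u):
        return 1 / math.sqrt(2) if u == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            acc = 0.0
            for m in range(8):
                for n in range(8):
                    acc += (x[m][n]
                            * math.cos((2 * m + 1) * u * math.pi / 16)
                            * math.cos((2 * n + 1) * v * math.pi / 16))
            out[u][v] = c(u) * c(v) / 4 * acc
    return out

COEFFS = dct8x8(BLOCK)
for row in COEFFS:
    print(" ".join(f"{value:7.1f}" for value in row))
```

The DC coefficient X[0][0] is the sum of the 64 pixels divided by 8, which gives the 717.6 shown in the coefficient table.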

JPEG Principles

97JPEG Principles (Transform : DCT8)

98JPEG Principles

717.6 0.2 0.4 -19.8 -2.1 -6.2 -5.7 -7.6

-99.0 -35.8 27.4 19.4 -2.6 -3.8 9.0 2.7

51.8 -60.8 3.9 -11.8 1.9 4.1 1.0 6.4

30.0 -25.1 -6.7 6.2 -4.4 -10.7 -4.2 -8.0

22.6 2.7 4.9 3.4 -3.6 8.7 -2.7 0.9

15.6 4.9 -7.0 1.1 2.3 -2.2 6.6 -1.7

0.0 5.9 2.3 0.5 5.8 3.1 8.0 4.8

-0.7 -2.3 -5.2 -1.0 3.6 -0.5 5.1 -0.1

45 0 0 -1 0 0 0 0

-8 -3 2 1 0 0 0 0

4 -5 0 0 0 0 0 0

2 -1 0 0 0 0 0 0

1 0 0 0 0 0 0 0

1 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

16 11 10 16 24 40 51 61

12 12 14 19 26 58 60 55

14 13 16 24 40 57 69 56

14 17 22 29 51 87 80 62

18 22 37 56 68 109 103 77

24 35 55 64 81 104 113 92

49 64 78 87 103 121 120 101

72 92 95 98 112 100 103 99

÷ Q

Step 2: Quantization
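Step 2 amounts to dividing each DCT coefficient by the matching entry of the table Q and rounding to the nearest integer. The sketch below uses the rounded coefficient values printed on the slide; the names are illustrative.

```python
DCT = [
    [717.6, 0.2, 0.4, -19.8, -2.1, -6.2, -5.7, -7.6],
    [-99.0, -35.8, 27.4, 19.4, -2.6, -3.8, 9.0, 2.7],
    [51.8, -60.8, 3.9, -11.8, 1.9, 4.1, 1.0, 6.4],
    [30.0, -25.1, -6.7, 6.2, -4.4, -10.7, -4.2, -8.0],
    [22.6, 2.7, 4.9, 3.4, -3.6, 8.7, -2.7, 0.9],
    [15.6, 4.9, -7.0, 1.1, 2.3, -2.2, 6.6, -1.7],
    [0.0, 5.9, 2.3, 0.5, 5.8, 3.1, 8.0, 4.8],
    [-0.7, -2.3, -5.2, -1.0, 3.6, -0.5, 5.1, -0.1],
]
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, table):
    """Element-wise round(coefficient / step)."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply back by the step (the rounding error is lost)."""
    return [[l * q for l, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

QUANTIZED = quantize(DCT, Q)
for row in QUANTIZED:
    print(row)
```

Scaling Q up or down is how the quality factor trades file size against distortion.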

99

45 0 0 -1 0 0 0 0

-8 -3 2 1 0 0 0 0

4 -5 0 0 0 0 0 0

2 -1 0 0 0 0 0 0

1 0 0 0 0 0 0 0

1 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

Result = 45, 0, -8, 4, -3, 0, -1, 2, -5, 2, 1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 1, followed by 43 zeros (64 coefficients in total)

Step 3: Coefficient-to-Symbol Mapping

Input Zigzag scan procedure

Symbols defined as [run of zeros, nonzero terminating value]
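The zigzag scan and the [run of zeros, nonzero value] mapping can be sketched as below, starting from the quantized block of Step 2. Two simplifications are assumptions of this sketch: the DC coefficient is treated like any other value (JPEG codes it differentially), and the trailing zeros are replaced by a single "EOB" (end of block) marker.

```python
QUANTIZED = [
    [45, 0, 0, -1, 0, 0, 0, 0],
    [-8, -3, 2, 1, 0, 0, 0, 0],
    [4, -5, 0, 0, 0, 0, 0, 0],
    [2, -1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

def zigzag_indices(n=8):
    """(row, col) pairs along anti-diagonals, alternating direction."""
    order = []
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:              # even diagonals run from bottom-left to top-right
            rows = reversed(rows)
        order.extend((i, s - i) for i in rows)
    return order

def zigzag_scan(block):
    return [block[i][j] for i, j in zigzag_indices(len(block))]

def run_length_symbols(sequence):
    """Pairs (run of zeros, nonzero terminating value), then 'EOB' if zeros remain."""
    symbols, run = [], 0
    for value in sequence:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    if run:
        symbols.append("EOB")
    return symbols

scanned = zigzag_scan(QUANTIZED)
symbols = run_length_symbols(scanned)
print(scanned[:21], "+", scanned[21:].count(0), "zeros")
print(symbols)
```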

Step 4: Entropy Coding
• Symbols are encoded using mostly Huffman coding.
• Huffman coding is a method of variable-length coding in which shorter codewords are assigned to the more frequently occurring symbols.
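The principle of Huffman coding can be sketched with a priority queue: the two least frequent entries are merged repeatedly, and each merge adds one bit to the codewords of the symbols involved. The symbol frequencies below are hypothetical, and this generic construction is not the table-driven procedure of the JPEG standard.

```python
import heapq

def huffman_code_lengths(frequencies):
    """Map each symbol to its Huffman codeword length in bits."""
    if len(frequencies) == 1:
        return {symbol: 1 for symbol in frequencies}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, first = heapq.heappop(heap)
        w2, _, second = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**first, **second}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

freqs = {"a": 5, "b": 2, "c": 1, "d": 1}      # hypothetical symbol counts
lengths = huffman_code_lengths(freqs)
total_bits = sum(freqs[s] * lengths[s] for s in freqs)
print(lengths, total_bits)
```

With a fixed 2-bit code the same nine symbols would need 18 bits; the variable-length code needs 15.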

JPEG Principles

100

$[f]$: an $N \times N$ image

$$[F] = V\,[f]\,H^{*}, \qquad [f] = V^{*}\,[F]\,H$$

$$V^{-1} = V^{*}, \qquad H^{-1} = H^{*}$$

$$\sum_{m=0}^{N-1} \sum_{n=0}^{N-1} |f(m,n)|^2 = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} |F(u,v)|^2$$

Classical orthogonal transforms

101One JPEG block

102JPEG compression on several blocks

103

Blocking artifact (DCT)

Blocking Artifacts

104Examples using different quality factors to scale matrix Q

Original image, size: 263224 bytes

Size: 15159 bytes, 17x compression

Size: 11956 bytes, 22x compression

105

Another DCT Application: Video Compression

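[Block diagram of a motion-compensated video encoder: the prediction is subtracted from the input frame to give the prediction error, which goes through 8×8 DCT → Quantizer → Entropy Coder. A feedback loop (Inverse Quantizer → 8×8 IDCT → Delayed Frame Memory → Motion Compensation) builds the prediction. Motion Estimation produces the motion vectors, which go to a second Entropy Coder. Switch: intraframe = open, interframe = closed.]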

JPEG and DCT: motion

106JPEG and DCT: motion

107Classical orthogonal transforms

108

32 x 32 superblock

Principles of overlapping

8 x 8 coefficients8 x 8 pixels

Original image Transformed image

109Lapped transforms

110

Ringing artifact (Wavelet, GenLOT)

Blocking and Ringing Artifacts (Nagai 2001)

111

How to Reduce the Ringing?

[Figure: input, quantization error and output for the basis of GenLOT and for the basis of GULLOT, with the high-frequency bands marked.]

112

GULLOT

1-D base example

113Results

GULB (26.29 dB), GenLOT (26.06 dB)

Coded Yogi image at 0.3 bps

115

Principles of wavelet image codingSPIHT/JPEG-2000

116

1-Level Wavelet Decomposition (2D DWT)

[Filter-bank diagram: the input image is filtered row-wise by H1 and H2 (low pass and high pass), each followed by a decimation by 2. Each of the two outputs is then filtered column-wise by H1 and H2 and decimated by 2, which gives the LL, HL, LH and HH components.]

Filter $H_i$: $y[n] = \sum_{k=0}^{L} x[n-k]\, h_i[k]$

Decimator (↓2): keep one out of two pixels

SPIHT/JPEG 2000 principles
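One level of the 2-D decomposition can be sketched with the filter-then-decimate structure described above. The assumptions of this sketch: Haar filters (H1 = low pass, H2 = high pass) instead of the longer filters used by SPIHT or JPEG 2000, zero extension at the borders, and a subband naming in which the first letter refers to the row filter.

```python
H1 = [0.5, 0.5]     # low pass (Haar average), assumed for illustration
H2 = [0.5, -0.5]    # high pass (Haar difference), assumed for illustration

def analyze(signal, h):
    """y[n] = sum_k x[n-k] h[k], then keep one sample out of two."""
    filtered = []
    for n in range(len(signal)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:                  # samples before the start count as zero
                acc += signal[n - k] * hk
        filtered.append(acc)
    return filtered[1::2]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def dwt2_one_level(image):
    """Row-wise filtering, then column-wise filtering of each result."""
    low_rows = [analyze(row, H1) for row in image]
    high_rows = [analyze(row, H2) for row in image]

    def filter_columns(band, h):
        return transpose([analyze(column, h) for column in transpose(band)])

    return {
        "LL": filter_columns(low_rows, H1),
        "LH": filter_columns(low_rows, H2),
        "HL": filter_columns(high_rows, H1),
        "HH": filter_columns(high_rows, H2),
    }

bands = dwt2_one_level([[1, 3], [5, 7]])
print(bands)
```

Applying `dwt2_one_level` again to the LL band gives the multi-level decomposition.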

117

Multi-Level Wavelet Decomposition

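[Figure: a first 2D-DWT splits the image into LL, HL1, LH1 and HH1. A second 2D-DWT applied to LL splits it into LL, HL2, LH2 and HH2, while HL1, LH1 and HH1 are kept.]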


118

Bitplanes and Self-Similarity Across Scales


119

Spatial Orientation Trees

Some Definitions:

• O (i,j): set of coordinates of all offspring of node (i,j).

• D (i,j): set of coordinates of all descendants of node (i,j).

• H : set of coordinates of all spatial orientation tree roots.

• L (i,j): D (i,j) - O (i,j).


120

Coding Algorithm (SPIHT)

• Three lists are defined:

1. LIS: List of Insignificant Sets
   - Type A: entries are elements D (i,j)
   - Type B: entries are elements L (i,j)
2. LIP: List of Insignificant Pixels
3. LSP: List of Significant Pixels

• Significance Test:

$$S_n(T) = \begin{cases} 1, & \max_{(i,j) \in T} |c_{i,j}| \ge 2^n \\ 0, & \text{otherwise} \end{cases}$$

• Key Ideas:
  - Ordered bit plane transmission.
  - Multi-pass zero-tree coding.
  - Exploitation of self-similarity across scales of the 2-D DWT.
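The significance test is easy to state in code. The sketch below applies it to the 8 × 8 coefficient array of the SPIHT example in these slides, for the first bit plane. Coordinates are taken as (column, row), an assumption inferred from the example, where coefficient -34 sits at (1,0). The list handling of the full algorithm is not shown.

```python
import math

COEFFS = [
    [63, -34, 49, 10, 7, 13, -12, 7],
    [-31, 23, 14, -13, 3, 4, 6, -1],
    [15, 14, 3, -12, 5, -7, 3, 9],
    [-9, -7, -14, 8, 4, -2, 3, 2],
    [-5, 9, -1, 47, 4, 6, -2, 2],
    [3, 0, -3, 2, 3, -2, 0, 4],
    [2, -3, 6, -4, 3, 6, 3, 6],
    [5, 11, 5, 6, 0, 3, -4, 4],
]

def significance(coords, n):
    """S_n(T) = 1 if max |c| over the set T reaches 2^n, else 0."""
    return 1 if max(abs(COEFFS[y][x]) for x, y in coords) >= 2 ** n else 0

# Initial bit plane: n = floor(log2(max |c|)).
n0 = math.floor(math.log2(max(abs(c) for row in COEFFS for c in row)))

print("n =", n0, "threshold =", 2 ** n0)
print(significance([(0, 0)], n0))                              # 63: significant
print(significance([(0, 1)], n0))                              # -31: insignificant
print(significance([(0, 2), (1, 2), (0, 3), (1, 3)], n0))      # offspring of (0,1)
```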


121

SPIHT – Example

Wavelet coefficients (8 × 8):

 63 -34  49  10   7  13 -12   7
-31  23  14 -13   3   4   6  -1
 15  14   3 -12   5  -7   3   9
 -9  -7 -14   8   4  -2   3   2
 -5   9  -1  47   4   6  -2   2
  3   0  -3   2   3  -2   0   4
  2  -3   6  -4   3   6   3   6
  5  11   5   6   0   3  -4   4

Coefficient   Coefficient   Binary    Reconstruction
Coordinates   Value         Symbols   Value
(0,0)          63           1 0        48
(1,0)         -34           1 1       -48
(0,1)         -31           0           0
(1,1)          23           0           0
(1,0)         -34           1
(2,0)          49           1 0        48
(3,0)          10           0           0
(2,1)          14           0           0
(3,1)         -13           0           0
(0,1)         -31           1
(0,2)          15           0           0
(1,2)          14           0           0
(0,3)          -9           0           0
(1,3)          -7           0           0
(1,1)          23           0
(1,0)         -34           0
(0,1)         -31           1
(0,2)          15           0
(1,2)          14           1
(2,4)          -1           0           0
(3,4)          47           1 0        48
(2,5)          -3           0           0
(3,5)           2           0           0
(0,3)          -9           0
(1,3)          -7           0

122

Some Examples

Original image; 96x compression; 48x compression; 22x compression; 16x compression

123

JPEG-2000

124Introduction

• Image compression has to:
  - Reduce storage and bandwidth requirements
  - Allow different extraction modes

• JPEG-2000 provides:
  - Low bit-rate compression performance
  - Progressive transmission by quality, resolution, component, or spatial locality
  - Lossy and lossless compression
  - Random access to the bitstream
  - Region of Interest coding
  - Robustness to bit errors

125Codec Structure

126Source Image Model

• One or several components in the image
• Components can be at different resolutions (different sizes)

127Intercomponent Transform

• Reduces the correlation between components
• Maps image data from RGB to YCrCb
• Advantages:
  - Improves coding efficiency
  - Allows visually relevant quantization
• Two transforms:
  - Irreversible color transform (ICT)
  - Reversible color transform (RCT)

128Reversible Component Transformation

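• Used for lossless or lossy coding

• Advantages:
  - Reasonable color space
  - Makes lossless compression possible

Forward transform:

$$\begin{pmatrix} Y_r \\ V_r \\ U_r \end{pmatrix} = \begin{pmatrix} \left\lfloor \dfrac{R + 2G + B}{4} \right\rfloor \\ R - G \\ B - G \end{pmatrix}$$

Inverse transform:

$$\begin{pmatrix} G \\ R \\ B \end{pmatrix} = \begin{pmatrix} Y_r - \left\lfloor \dfrac{U_r + V_r}{4} \right\rfloor \\ V_r + G \\ U_r + G \end{pmatrix}$$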
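The reversibility of this transform rests on integer arithmetic with a floor division, which can be checked with a short sketch. The function names are illustrative. Python's `//` operator floors towards minus infinity, which is what the inverse needs when U_r + V_r is negative.

```python
def rct_forward(r, g, b):
    """Integer RGB -> (Y_r, U_r, V_r)."""
    y = (r + 2 * g + b) // 4
    u = b - g
    v = r - g
    return y, u, v

def rct_inverse(y, u, v):
    """Exact inverse: recover G first, then R and B."""
    g = y - (u + v) // 4
    r = v + g
    b = u + g
    return r, g, b

print(rct_forward(200, 100, 50))
print(rct_inverse(*rct_forward(200, 100, 50)))

# Exhaustive check on a coarse grid of 8-bit colours: no information is lost.
lossless = all(
    rct_inverse(*rct_forward(r, g, b)) == (r, g, b)
    for r in range(0, 256, 15)
    for g in range(0, 256, 15)
    for b in range(0, 256, 15)
)
print("lossless:", lossless)
```

The inverse works because R + 2G + B = 4G + U_r + V_r, so the floor in the forward step is undone exactly by the floor in the inverse step.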

129Wavelet Transform (6)

An example:

130Wavelet Transform (7)

• 5/3 Transform: reversible
  - Integer-to-integer transform
  - Can be used for both lossless and lossy coding

• 9/7 Transform: nonreversible
  - Real-to-real transform
  - Can only be used for lossy coding
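The integer-to-integer property of the 5/3 transform is usually obtained with lifting steps: a prediction of the odd samples followed by an update of the even samples, both with integer rounding. The following 1-D sketch assumes an even-length signal and symmetric extension at the borders. It illustrates the principle and is not taken from the slides.

```python
def forward_53(x):
    """One level of reversible 5/3 lifting: returns (low-pass s, high-pass d)."""
    half = len(x) // 2
    d = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < len(x) else x[2 * i]   # mirror at the end
        d.append(x[2 * i + 1] - (left + right) // 2)                # predict step
    s = []
    for i in range(half):
        previous = d[i - 1] if i > 0 else d[0]                      # mirror at the start
        s.append(x[2 * i] + (previous + d[i] + 2) // 4)             # update step
    return s, d

def inverse_53(s, d):
    """Undo the lifting steps in reverse order."""
    half = len(s)
    x = [0] * (2 * half)
    for i in range(half):
        previous = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (previous + d[i] + 2) // 4
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < 2 * half else x[2 * i]
        x[2 * i + 1] = d[i] + (left + right) // 2
    return x

signal = [10, 12, 14, 13, 9, 8, 7, 30]
low, high = forward_53(signal)
print(low, high)
print(inverse_53(low, high) == signal)
```

Each lifting step only adds to a sample a rounded function of the other samples, so it can always be subtracted back, whatever the rounding.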

131Quantization

• A uniform scalar quantization with dead-zone about the origin

A zero output may be produced for larger values on the input, to avoid recording noise
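A dead-zone uniform scalar quantizer can be sketched in a few lines: magnitudes are divided by the step and truncated, so every input in (-step, step) maps to zero, an interval twice as wide as the other bins. The reconstruction offset r = 0.5 (mid-point of the bin) is an assumption of this sketch.

```python
def deadzone_quantize(x, step):
    """q = sign(x) * floor(|x| / step)"""
    sign = -1 if x < 0 else 1
    return sign * int(abs(x) // step)

def deadzone_dequantize(q, step, r=0.5):
    """Reconstruct inside the bin; zero stays zero."""
    if q == 0:
        return 0.0
    sign = -1 if q < 0 else 1
    return sign * (abs(q) + r) * step

step = 10
for value in (-25, -9.9, 0, 9.9, 10, 37):
    q = deadzone_quantize(value, step)
    print(value, "->", q, "->", deadzone_dequantize(q, step))
```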

132Progression

• Different orderings of the packets in the code stream
• 4 types of progression:
  - Resolution
  - Quality
  - Spatial location
  - Component

• Progression Type can be changed during coding

133Progression (2)

• Progression by Resolution

134Progression (3)

• Progression by Quality

135Region of Interest

• Coding different regions of the image with different quality

• Used when certain parts of the image are of higher importance

• ROI coding:
  - General scaling-based method
  - MAXSHIFT method

136General Scaling -Based

• Idea: scale (shift) coefficients such that the bits associated with the ROI are in higher bit-planes

• Some bits of ROI might be encoded together with nonROI bits

137

General Scaling-Based (2)

• Steps:
  - Wavelet transform
  - ROI mask is derived
  - Quantization
  - nonROI coefficients are downscaled
  - Entropy coding

• Scaling value and ROI coordinates are included

138MAXSHIFT Method

• Scaling value S is chosen such that the minimum ROI coefficient is larger than the maximum nonROI coefficient

• Advantages:
  - Allows arbitrarily shaped ROIs
  - No ROI mask is needed
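The MAXSHIFT idea can be sketched on a flat list of integer quantizer indices. The encoder picks the smallest shift s such that 2^s exceeds every background magnitude and shifts the ROI coefficients up by s bits. The decoder then recognises ROI coefficients by magnitude alone, without any mask. The function names and data are illustrative. ROI coefficients equal to zero cannot be told apart from background, but they decode to zero either way.

```python
def maxshift_encode(coeffs, roi_mask):
    """Shift ROI coefficients above all background coefficients; return (coeffs, s)."""
    background_max = max(
        (abs(c) for c, in_roi in zip(coeffs, roi_mask) if not in_roi), default=0)
    s = background_max.bit_length()          # smallest s with 2**s > background_max
    shifted = []
    for c, in_roi in zip(coeffs, roi_mask):
        if in_roi:
            magnitude = abs(c) << s
            shifted.append(magnitude if c >= 0 else -magnitude)
        else:
            shifted.append(c)
    return shifted, s

def maxshift_decode(shifted, s):
    """Anything with magnitude >= 2**s must belong to the ROI: shift it back."""
    coeffs, mask = [], []
    for c in shifted:
        if abs(c) >= (1 << s):
            magnitude = abs(c) >> s
            coeffs.append(magnitude if c > 0 else -magnitude)
            mask.append(True)
        else:
            coeffs.append(c)
            mask.append(False)
    return coeffs, mask

coefficients = [5, -3, 12, 7, -1, 2]
roi = [False, False, True, True, True, False]
shifted, s = maxshift_encode(coefficients, roi)
print(shifted, s)
print(maxshift_decode(shifted, s))
```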

139Scalability

• The ability to code more than one quality and/or resolution simultaneously

• Two important types:
  - SNR scalability
  - Spatial or resolution scalability

• Advantages:
  - No need to know the target bit rate/resolution
  - No need for multiple compressions
  - Resilience to transmission errors

140SNR Scalability

• The bit stream can be decompressed at different quality levels (SNR)

Decompressed image “bike” at (a) 0.125 b/p, (b) 0.25 b/p, (c) 0.5 b/p

141Spatial Scalability

• The bit stream can be decompressed at different resolution levels

142Scalability (2)

• Combination of Spatial and SNR• Changing the progression type

143JPEG-2000 V.S. JPEG

(a) (b)

Compression at 0.25 b/p by means of (a) JPEG (b) JPEG-2000

144JPEG-2000 V.S. JPEG

Compression at 0.2 b/p by means of (a) JPEG (b) JPEG-2000

145Comparison of JPEG 2000 with JPEG

• Much smaller files
• Much better quality

Figure: 0.08bpp J2K Image (8KB); 0.1563bpp JPEG Image (16KB);

146Illustration

• Region of Interest (ROI) Encoding

Figure: Raw Image; 0.07bpp J2K Image with ROI; 0.07bpp J2K Image without ROI


148JPEG-2000 Parts
