-
1
X-Media: Audio, Image and Video Coding
Laurent Duval (IFP Energies nouvelles)
Eric Debes (Thalès)
Philippe Morosini (Supélec)
Course materials: Moodle/NTNoe
http://www.laurent-duval.eu/lcd-lecture-supelec-xmedia.html
-
2
General information
-
3Contents
• Introduction to X-Media coding: generic principles
• Audio: data recording/sampling, physiology
• Data, audio coding: LZ*, MPEG Layer 3
• Image coding: JPEG vs. JPEG 2000
• Video coding: MPEG formats, H.264
• Bonuses, exercises
-
4Initial motivation
-
5Initial motivation
• Coding/compression as a DSP engineering discipline
  - information reduction, standards & adaptation, complexity, integrity & security issues, interaction with the SP pipeline
• Composite domain, yet ubiquitous
  - sampling (Nyquist-Shannon, filter banks), statistical SP (KLT, decorrelation), transforms (Fourier, wavelets), classification (quantization, K-means), functional spaces (bases & frames), information theory (entropy), error measurements, modelling
-
6Initial motivation
[Figure: taxonomy of digital image processing]
Digital Image Processing
• Pre-Processing
  - Digital image characteristics: spatial (gray-level histogram), spectral (DFT, DCT)
  - Enhancement: point processing, masking, filtering
  - Restoration: degradation models, inverse filtering, Wiener filtering
• Compression
  - Information theory
  - Lossless: LZW (gif)
  - Lossy: transform-based (jpeg)
• Segmentation
  - Edge detection
• Description
  - Shape descriptors, texture, morphology
-
7Initial motivation
• Exemplar
  - underlines the importance of specific steps (related to other lectures) to make things work
  - which algorithm for which task?
  - process text? sound? images? video?
  - "the toolbox quote" (Juran)
• Evolving
  - steady evolution of standards and tools, overview of future directions (and tools)
• Central to the SP task
  - what is really important in my data?
  - (stored) info. overflow + degradations (decon.)
-
8
Principles
-
9Principles
• What is data (image) compression?
  - Data compression is the art and science of representing information in a compact form.
  - Data is a sequence of symbols taken from a discrete alphabet.
  - Text: sequence of characters/bytes (0.5 D)
  - Sound: collection of arrays of values representing intensities (1 D to 1.5 D)
  - Still image data: collection of arrays (one for each color plane) of values representing the intensity (color) of the point at the corresponding spatial location (pixel) (2 D or 2.5 D)
  - Video: sequence of still images (3 D)
  - Next: 3D-TV (audio, video, ambiance, smell?)
-
10
Flurry of formats ?
-
11Examples of data compression extensions
• Some (standard) extensions:
GIF, RAR, ZIP, BZ2, MP3, MPEG4, AVC, PNG, J2K, BH, AVI, R(A)M, LHA, OGG, ACC, HE-AAC, MPC, OGM, APE, TIFF, JP2, JPEG-XR (WMP/HD Photo), WAV, PAK, FLV, FLAC, MPC, PDF, MAT, BPM, Z, GZ, LHA, MJEG, 7z, TTA, DjVu, WebP
http://en.wikipedia.org/wiki/List_of_archive_formats
• Targeted for what kind of data?
-
12
Why?
-
13
Why do we need Image Compression?
Still image
• One A4 page at 600 dpi is > 100 MB.
• One color image from a digital camera generates 10-30 MB.
• A scanned 3”×7” photograph at 300 dpi is 30 MB.
HDTV video
• 720×1280 pixels/frame × 60 frames/s = 1.3 Gb/s (at 24 bits/pixel)
• HDTV bandwidth: 20 Mb/s
• Objective: 70× reduction
• Equivalent: 0.35 bits/pixel
• Infotrends (2008/01): total number of digital pictures taken since the beginning: ~180 billion. Should grow to 347 billion in 2012.
-
14
Why do we need Image Compression?
1) Storage
2) Transmission (cable, satellite, wifi)
3) Data access
1990-2000: disk capacities 100 MB -> 20 GB (200 times!) (3-4 TB), but seek time 15 ms -> 10 ms (3-4 ms) and transfer rate 1 MB/s -> 2 MB/s (much better for SSD).
Compression improves overall response time in some applications.
-
15
•Image scanner
•Digital camera
•Video camera,
•Ultra-sound (US), Computer Tomography (CT),
Magnetic resonance image (MRI), digital X-ray (XR),
Infrared.
•Remote sensing, Seismics, Satellite, Radar, SAR
Source of images
-
16
Data types
Why do we need specific algorithms?
[Figure: data types covered by image compression vs. universal compression: video images, gray-scale images, true colour images, binary images, colour palette images, textual data]
-
17Binary image: 1 bit/pel
-
18Grayscale image: 8 bits/pel
Intensity = 0-255
-
19
Parameters of digital images
Gray-level resolution: 6 bits (64 gray levels), 4 bits (16 gray levels), 2 bits (4 gray levels)
Spatial resolution: 384×256, 192×128, 96×64, 48×32
-
20True color image: 3*8 bits/pel
-
21Goals of compression
• Balance redundancy and irrelevancy
  - sources of redundancy: temporal, spatial, color, other?
  - sources of irrelevancy: perceptually unimportant information
  - issues: redundancy/irrelevancy, examples?, lossless/lossy choices
-
22
Lossy vs. lossless compression
• Lossless compression: reversible, information preserving (text compression algorithms, binary images, palette images)
• Lossy compression: irreversible (grayscale, color, video)
• Near-lossless compression: medical imaging, remote sensing
1) Why do we need lossy compression?
2) When can we use lossy compression?
-
23
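Rate measures
Bitrate: $\dfrac{C}{N} = \dfrac{\text{size of the compressed file}}{\text{pixels in the image}}$ (bits/pel)
Compression ratio: $\dfrac{N \cdot k}{C} = \dfrac{\text{size of the original file}}{\text{size of the compressed file}}$
Here $C$ is the compressed size in bits, $N$ the number of pixels and $k$ the number of bits per pixel of the original image.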
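The two rate measures can be checked on a small numeric case. This is only a sketch: the function names are made up for illustration, and the figures (a 512 × 512, 8 bits/pel image compressed to 15159 bytes) are assumed example values.

```python
def bitrate(compressed_bits, num_pixels):
    """Bits per pixel of the compressed file: C / N."""
    return compressed_bits / num_pixels


def compression_ratio(compressed_bits, num_pixels, bits_per_pixel):
    """Original size over compressed size: (N * k) / C."""
    return num_pixels * bits_per_pixel / compressed_bits


# Assumed example: 512 x 512 pixels, 8 bits/pel, 15159-byte compressed file.
N, k = 512 * 512, 8
C = 15159 * 8  # compressed size in bits

print(f"bitrate: {bitrate(C, N):.3f} bits/pel")
print(f"compression ratio: {compression_ratio(C, N, k):.1f}:1")
```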
-
24
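Distortion measures
Mean absolute error (MAE): $\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |y_i - x_i|$
Mean square error (MSE): $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} (y_i - x_i)^2$
Signal-to-noise ratio (SNR): $\mathrm{SNR} = 10 \cdot \log_{10}\left[\sigma^2 / \mathrm{MSE}\right]$ (decibels)
Peak signal-to-noise ratio (PSNR): $\mathrm{PSNR} = 10 \cdot \log_{10}\left[A^2 / \mathrm{MSE}\right]$ (decibels)
$\sigma^2$ is the variance of the original signal. $A$ is the peak amplitude of the signal: $A = 2^8 - 1 = 255$ for an 8-bit signal.
Other measures: $\ell_p$ norms, SSIM, MOS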
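As a minimal sketch of these distortion measures, the following plain-Python functions follow the definitions directly. The sample values in `original` and `decoded` are invented for illustration, with x the original and y the decoded signal.

```python
from math import log10


def mae(x, y):
    """Mean absolute error between original x and decoded y."""
    return sum(abs(b - a) for a, b in zip(x, y)) / len(x)


def mse(x, y):
    """Mean square error between original x and decoded y."""
    return sum((b - a) ** 2 for a, b in zip(x, y)) / len(x)


def snr_db(x, y):
    """SNR in dB: variance of the original over the MSE."""
    mean = sum(x) / len(x)
    variance = sum((a - mean) ** 2 for a in x) / len(x)
    return 10 * log10(variance / mse(x, y))


def psnr_db(x, y, bits=8):
    """PSNR in dB with peak amplitude A = 2**bits - 1."""
    peak = 2 ** bits - 1
    return 10 * log10(peak ** 2 / mse(x, y))


original = [10, 20, 30, 40]   # made-up 8-bit samples
decoded = [12, 18, 33, 40]

print(mae(original, decoded), mse(original, decoded))
print(round(snr_db(original, decoded), 2), round(psnr_db(original, decoded), 2))
```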
-
25Other issues
• Coder and decoder computation complexity
• Memory requirements
• Fixed rate or variable rate
• Error resilience
• Symmetric or asymmetric algorithms
• Decompress at multiple resolutions
• Decompress at various bit rates
• Standard or proprietary
Reduce redundancy/irrelevancy at "each" step, considering performance and quality
-
26What is an image?
-
27What is an image?
-
28What is an image?
-
29Ultimate storage
• Image compression: why?
  - storage, transmission, database indexing
  - processing (denoising, scaling, rotation)
• How come?
  - 512 x 512 pix 8-bit image → 2,097,152 bits
  - Far less than 10^100 atoms in the entire Universe
  - 10^15 directions, magn., depth and exposure params
• Tautavel Man takes 1000 pix/s (since 450,000 BC)• A collection of 1,42.10176 pix.
• typical compressed image size?• # of bits needed: 17, 586, 10.253, 12.087.300, more?
-
30What is a compression system?
Compression model: f(x,y) → Transform → Quantize → Encode
• Source
• Channel
-
31What is a compression system?
[Figure: adaptive compression pipeline, from Data to BitStream. Blocks: Pre-Processing, Blocking, Transform, Processing, Classifier, Rate Allocator, Reduction, Ordering, Entropy Coding, Quality, Embedded Coding]
-
32Compression scheme (1)
model error
parameters
+
=
modeling
-
33Compression scheme (2)
reduction
xO = 1 3 7 2 -5 -1 0 0 0
xR = 0 2 6 2 -6 0 0 0 0
xC = 0 1 3 1 -3 4
coding
transform
-
34Preprocessing
[Figure: adaptive compression pipeline, Image to BitStream, Pre-Processing stage]
Pre-processing: image analysis, filtering/enhancement, RGB to YUV, extension
-
35Preprocessing
• Image Analysis
• Filtering/enhancement
• RGB to YUV
• Image Extension
-
36
Red Green Blue
RGB color space
-
37
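RGB → YUV
R, G, B: red, green, blue
Y: the luminance
U, V: the chrominance components
Most of the information is concentrated in the Y component; the U and V components carry less.
$Y = 0.3\,R + 0.6\,G + 0.1\,B, \quad U = B - Y, \quad V = R - Y$
$\begin{pmatrix} Y \\ C_b \\ C_r \end{pmatrix} = \begin{pmatrix} 0.299 & 0.587 & 0.114 \\ -0.16875 & -0.33126 & 0.5 \\ 0.5 & -0.41869 & -0.08131 \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix}$
$\begin{pmatrix} R \\ G \\ B \end{pmatrix} = \begin{pmatrix} 1.0 & 0 & 1.402 \\ 1.0 & -0.34413 & -0.71414 \\ 1.0 & 1.772 & 0 \end{pmatrix} \begin{pmatrix} Y \\ C_b \\ C_r \end{pmatrix}$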
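A short sketch of the RGB to YCbCr conversion and its inverse, using the matrix coefficients from this slide. The helper names are hypothetical. Because the coefficients are rounded, the round trip is only approximately exact.

```python
# Forward and inverse matrices, coefficients as given on the slide.
FORWARD = [
    [0.299, 0.587, 0.114],          # Y
    [-0.16875, -0.33126, 0.5],      # Cb
    [0.5, -0.41869, -0.08131],      # Cr
]
INVERSE = [
    [1.0, 0.0, 1.402],              # R
    [1.0, -0.34413, -0.71414],      # G
    [1.0, 1.772, 0.0],              # B
]


def apply(matrix, vector):
    """Matrix-vector product for a 3x3 matrix and a 3-vector."""
    return [sum(c * v for c, v in zip(row, vector)) for row in matrix]


def rgb_to_ycbcr(rgb):
    return apply(FORWARD, rgb)


def ycbcr_to_rgb(ycc):
    return apply(INVERSE, ycc)


pixel = [200, 50, 100]
ycc = rgb_to_ycbcr(pixel)
print(ycc, ycbcr_to_rgb(ycc))
```

A gray pixel (R = G = B) maps to Cb ≈ Cr ≈ 0, which illustrates why most of the information ends up in Y.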
-
38
Y U V
YUV color space
-
39Blocking
[Figure: adaptive compression pipeline, Image to BitStream, Blocking stage]
Blocking: JPEG blocking, irregular tiling, segmentation
-
40Blocking
• Goals:
• To exploit image non-stationarities
• To reduce the computational cost
• To exploit inter-block dependencies (2D-3D)
• To select objects of interest (moving)
-
411D Signal Blocking
• How do we perform decomposition?• block by block
• with overlap
-
422D Image Blocking
JPEG blocking SegmentationIrregular tiling
-
43Transform Coding
[Figure: adaptive compression pipeline, Image to BitStream, Transform stage]
Transforms: Karhunen-Loève; Fourier, DCT; Walsh, Hartley; wavelet (packets)
-
44
Transform Coding
• Goals
  - Efficient representation I(u,v) of an image i(x,y)
  - Data decorrelation (KLT optimality?)
• Properties
  - Linear transforms (matrix op.)
  - Orthogonality
  - Fast algorithms for compression/decompression
  - Laplace-Gauss distribution (var. length coding)
-
45Karhunen -Loeve (Hotelling ) Transform
$x$: $N \times 1$ vector
$m_x = E\{x\}$: mean vector
$C_x = E\{(x - m_x)(x - m_x)^T\}$: covariance matrix
$\lambda_i,\ i = 1, 2, \ldots, N$: eigenvalues of $C_x$
$e_i$: eigenvectors corresponding to $\lambda_i$
$A$: matrix whose rows are the eigenvectors $e_i$
Hotelling transform of $x$: $y = A(x - m_x)$
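The Hotelling transform can be sketched with NumPy. Two things here are assumptions, not part of the slide: the covariance is estimated from samples, and the test data is synthetic (two strongly correlated components). After the transform, the covariance of y is diagonal, with the eigenvalues on the diagonal.

```python
import numpy as np


def hotelling(samples):
    """samples has shape (num_samples, N); returns (y, A, m_x, eigenvalues)."""
    m_x = samples.mean(axis=0)                    # mean vector
    centered = samples - m_x
    c_x = centered.T @ centered / len(samples)    # covariance estimate
    eigvals, eigvecs = np.linalg.eigh(c_x)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # sort by decreasing eigenvalue
    A = eigvecs[:, order].T                       # rows of A are eigenvectors
    y = centered @ A.T                            # y = A (x - m_x), one row per sample
    return y, A, m_x, eigvals[order]


# Synthetic, strongly correlated 2-D data (illustrative only).
rng = np.random.default_rng(0)
t = rng.normal(size=500)
samples = np.column_stack([t, 0.9 * t + 0.1 * rng.normal(size=500)])

y, A, m_x, lam = hotelling(samples)
c_y = y.T @ y / len(y)                            # covariance after the transform
print(np.round(c_y, 6))
```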
-
46Singular Value Decomposition
-
47
Transform Coding
• An optimality result:
  - Simplest stationary source model AR(1): DCT ~ KLT
• Troubles:
  - KLT calculations + overhead
  - After a transformation, correlation can be made very small, but coefficients are far from being independent!
  - Transform affects coding
-
48Transform Coding
• Choices
  - Good decorrelation properties
  - Low complexity, HW implementation (DCT, Fourier)
  - Side effects (extension, zero or linear phase)
• A great deal of nice transforms
  - Walsh-Hadamard system (with fast transforms)
  - Wavelet (Haar 1910, Mallat, Daubechies 1988)
  - Lapped transforms (Malvar, Meyer)
-
49Transform Coding
-
50Transform Coding: performance
-
51Performance measure : coding gain
-
52DCT and size
-
53Transform coding : optimization
-
542D Hadamard -16
-
552D – DCT 16
-
56Standard transforms
• Limitations
  - fixed vector size
  - constrained shape (orthogonality)
  - data-driven adaptation
  - robustness to shifts and rotations
  - higher dimension generalizations
  - analogy with vision aspects
  - needed for compression: inverses/redundancy/sparsity
  - pre- and post-processing
  - coder complexity
-
57Transform coding
• A common waveform
• A common representation
-
58Transform coding
• A less common waveform
• A less common representation
-
59Transform coding
• A not so common waveform
• A not so common representation
-
60Novel transforms
FB-II
low frequencies
high frequencies
Wavelet
-
61Transform Coding
-
62Transform Coding
-
63Processing
[Figure: adaptive compression pipeline, Image to BitStream, Processing stage]
Processing: time/freq. filtering, image analysis, texture analysis
-
64Classifier
[Figure: adaptive compression pipeline, Image to BitStream, Classifier stage]
Classifier: spectrum allocation, texture extraction, segmentation
-
65Texture Synthesis
-
66Bit Reduction
[Figure: adaptive compression pipeline, Image to BitStream, Reduction stage]
Reduction: thresholding, subsampling, scalar quantization, vector quantization, iterations (fractals)
-
67Bit Reduction
• The lossy stage
  - Human eyes see a limited range of tones/freq
  - Real-world pictures are already imperfect
• Methods
  - Scalar quant.: uniform, log., optimal (Lloyd-Max algo)
  - Vector quant.: Voronoï diagrams, nearest neighbour
• Adaptivity
  - Pre-stored tables
  - Training set based tables
  - On-the-fly quantization estimation
-
68Quantization
8-bits last 5 bits last 4 bits
-
69Quantization
last 3 bits last 2 bits last bit
-
70Quantization
8 3
1
4
2
-
71Quantization
-
72Quantization (non -uniform )
-
73Quantization (adaptive)
-
74Quantization (vector)
• Outline for images
-
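75
Quantization (vector)
[Figure: each image vector is matched to the closest code vector of the codebook]
Image vectors: $X_j$
Codebook code vectors: $V_i$, $1 \le i \le L$ ($V_1, V_2, \ldots, V_k, \ldots, V_L$)
Mean Square Error (MSE) (Euclidean distance): $d(X_j, V_k) = \frac{1}{n}\sum_{i=1}^{n} \left(X_j(i) - V_k(i)\right)^2$
Weighted MSE: $d(X_j, V_k) = \frac{1}{n}\sum_{i=1}^{n} W_i \left(X_j(i) - V_k(i)\right)^2$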
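The nearest-codevector search can be sketched as below, using the MSE distance. The codebook and image vectors are made-up 4-sample examples, and indices are 0-based here whereas the slide numbers code vectors from 1 to L.

```python
def mse_distance(x, v):
    """d(X, V) = (1/n) * sum_i (X(i) - V(i))**2."""
    return sum((a - b) ** 2 for a, b in zip(x, v)) / len(x)


def nearest_codevector(x, codebook):
    """Index k of the code vector closest to x."""
    return min(range(len(codebook)), key=lambda k: mse_distance(x, codebook[k]))


def vq_encode(vectors, codebook):
    return [nearest_codevector(x, codebook) for x in vectors]


def vq_decode(indices, codebook):
    return [codebook[k] for k in indices]


# Hypothetical codebook and image vectors (e.g. flattened 2x2 blocks).
codebook = [[0, 0, 0, 0], [128, 128, 128, 128],
            [255, 255, 255, 255], [0, 0, 255, 255]]
blocks = [[10, 5, 0, 3], [120, 130, 125, 140], [5, 20, 250, 240]]

indices = vq_encode(blocks, codebook)
print(indices, vq_decode(indices, codebook))
```

Only the indices are transmitted; the decoder holds the same codebook and looks each one up.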
-
76Quantization (vector)
• LBG algorithm
-
77Quantization (vector)
• Simple codebooks
  - (parrots)
• Codebook for a specific feature
  - e.g. edges, smooth areas, etc.
• Codebooks could be of different sizes
-
78Ordering
[Figure: adaptive compression pipeline, Image to BitStream, Ordering stage]
Ordering: raster scan; zig-zag, Hilbert scan; zerotrees; object-based coding
-
79Ordering
-
80Tree-coding ordering
-
81Ordering
-
82Wavelet/Blocking equivalence
-
83Entropy Coding
[Figure: adaptive compression pipeline, Image to BitStream, Entropy Coding stage]
Entropy coding: RLE, Huffman; arithmetic, LZ*; form dictionaries
-
84Entropy Coding
After reduction:
• Lower coefficient variance • A lot zeroed out
-
85Arithmetic Coding
Symbol  #  Huff.  Low  Up
A       1  4      0    0.1
C       1  5      0.1  0.2
D       1  5      0.2  0.3
E       3  1      0.3  0.6
N       2  2      0.6  0.8
T       2  3      0.8  1

Final interval width 4.32·10^-8, i.e. ≈ 24.46 bits, against 27 bits with Huffman
AAAAAAAAA END: 7 against 11 bits

Encoding of ANTECEDENT:
Incoming  Low           Up
Start     0             1
A         0             0.1
N         0.06          0.08
T         0.076         0.08
E         0.0772        0.0784
C         0.07732       0.07744
E         0.077356      0.077392
D         0.0773632     0.0773668
E         0.07736428    0.07736536
N         0.077364928   0.077365144
T         0.0773651008  0.077365144
Codeword  0.07736511
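The interval narrowing in the ANTECEDENT example can be reproduced in a few lines. This sketch only computes the final interval [Low, Up) and its width, not an actual bitstream. `Decimal` is used so the printed bounds stay exact, and the function name is hypothetical.

```python
from decimal import Decimal
from math import log2

# Cumulative probability intervals [low, up) from the symbol table.
INTERVALS = {
    "A": ("0", "0.1"), "C": ("0.1", "0.2"), "D": ("0.2", "0.3"),
    "E": ("0.3", "0.6"), "N": ("0.6", "0.8"), "T": ("0.8", "1"),
}


def encode_interval(message):
    """Narrow [0, 1) symbol by symbol and return the final (low, up)."""
    low, up = Decimal(0), Decimal(1)
    for symbol in message:
        s_low, s_up = (Decimal(v) for v in INTERVALS[symbol])
        width = up - low
        low, up = low + width * s_low, low + width * s_up
    return low, up


low, up = encode_interval("ANTECEDENT")
bits = -log2(float(up - low))   # information content of the final interval
print(low, up, round(bits, 2))
```

Any number inside the final interval, such as the codeword 0.07736511, identifies the message once the decoder knows the same table and the message length (or an END symbol).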
-
86Rate allocation
[Figure: adaptive compression pipeline, Image to BitStream, Rate Allocator stage]
Rate allocation: exact bit-rate, progressive coding, distortion matching
-
87Quality measure
[Figure: adaptive compression pipeline, Image to BitStream, Quality stage]
Quality: objective measures (SNR), subjective measures, HVS model
-
88Embedded coding
[Figure: adaptive compression pipeline, Image to BitStream, Embedded Coding stage]
-
89Embedded quantization
Sign   s s s s s s s s
Msb 4  1 1 1 0 0 0 0
    3  x x x 1 1 0 0
    2  x x x x x 0 0
    1  x x x x x 1 1
Lsb 0  x x x x x x x
-
90Ordering
-
91Wavelet/Blocking equivalence
-
92
JPEG
-
93
JPEG Principles
Encoder: 8×8 DCT → Quantizer → Coefficients-to-Symbols Map → Entropy Coder
-
94
Input Image, Size=512 x 512 x 8 bits
JPEG Principles
-
95JPEG Principles
-
96
JPEG Principles
Step 1: DCT
Input 8×8 block x[m,n]:
69 71 74 76 89 106 111 122
59 70 61 61 68 76 88 94
82 70 77 67 65 63 57 70
97 99 87 83 72 72 68 63
91 105 90 95 85 84 79 75
92 110 101 106 100 94 87 93
89 113 115 124 113 105 100 110
104 110 124 125 107 95 117 116
$X[u,v] = \dfrac{C[u]\,C[v]}{4} \sum_{m=0}^{7} \sum_{n=0}^{7} x[m,n] \cos\dfrac{(2m+1)u\pi}{16} \cos\dfrac{(2n+1)v\pi}{16}$, with $C[u] = 1/\sqrt{2}$ for $u = 0$ and $C[u] = 1$ for $1 \le u \le 7$
DCT coefficients X[u,v]:
717.6 0.2 0.4 -19.8 -2.1 -6.2 -5.7 -7.6
-99.0 -35.8 27.4 19.4 -2.6 -3.8 9.0 2.7
51.8 -60.8 3.9 -11.8 1.9 4.1 1.0 6.4
30.0 -25.1 -6.7 6.2 -4.4 -10.7 -4.2 -8.0
22.6 2.7 4.9 3.4 -3.6 8.7 -2.7 0.9
15.6 4.9 -7.0 1.1 2.3 -2.2 6.6 -1.7
0.0 5.9 2.3 0.5 5.8 3.1 8.0 4.8
-0.7 -2.3 -5.2 -1.0 3.6 -0.5 5.1 -0.1
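A direct (slow) evaluation of the 8×8 DCT formula is sketched below, applied to the pixel block of this slide. It is meant to mirror the formula term by term, not to be a fast implementation. It assumes no level shift is applied, which is consistent with the DC value 717.6 being one eighth of the pixel sum.

```python
from math import cos, pi, sqrt

BLOCK = [
    [69, 71, 74, 76, 89, 106, 111, 122],
    [59, 70, 61, 61, 68, 76, 88, 94],
    [82, 70, 77, 67, 65, 63, 57, 70],
    [97, 99, 87, 83, 72, 72, 68, 63],
    [91, 105, 90, 95, 85, 84, 79, 75],
    [92, 110, 101, 106, 100, 94, 87, 93],
    [89, 113, 115, 124, 113, 105, 100, 110],
    [104, 110, 124, 125, 107, 95, 117, 116],
]


def c(u):
    """Normalization factor C[u]."""
    return 1 / sqrt(2) if u == 0 else 1.0


def dct8x8(x):
    """X[u][v] computed straight from the double sum (u pairs with rows m)."""
    X = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = sum(
                x[m][n] * cos((2 * m + 1) * u * pi / 16) * cos((2 * n + 1) * v * pi / 16)
                for m in range(8) for n in range(8)
            )
            X[u][v] = c(u) * c(v) / 4 * total
    return X


coeffs = dct8x8(BLOCK)
for row in coeffs:
    print(" ".join(f"{value:7.1f}" for value in row))
```

With this normalization the transform is orthonormal, so the sum of squared coefficients equals the sum of squared pixels.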
-
97JPEG Principles (Transform : DCT8)
-
98JPEG Principles
Step 2: Quantization
DCT coefficients X (output of Step 1):
717.6 0.2 0.4 -19.8 -2.1 -6.2 -5.7 -7.6
-99.0 -35.8 27.4 19.4 -2.6 -3.8 9.0 2.7
51.8 -60.8 3.9 -11.8 1.9 4.1 1.0 6.4
30.0 -25.1 -6.7 6.2 -4.4 -10.7 -4.2 -8.0
22.6 2.7 4.9 3.4 -3.6 8.7 -2.7 0.9
15.6 4.9 -7.0 1.1 2.3 -2.2 6.6 -1.7
0.0 5.9 2.3 0.5 5.8 3.1 8.0 4.8
-0.7 -2.3 -5.2 -1.0 3.6 -0.5 5.1 -0.1
Quantization table Q:
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
Quantized coefficients, round(X ÷ Q) element by element:
45 0 0 -1 0 0 0 0
-8 -3 2 1 0 0 0 0
4 -5 0 0 0 0 0 0
2 -1 0 0 0 0 0 0
1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
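The quantization step amounts to an element-wise division by Q followed by rounding to the nearest integer. A minimal sketch with the slide's coefficients and table; the names `X`, `Q` and `quantize` are chosen here for illustration.

```python
X = [
    [717.6, 0.2, 0.4, -19.8, -2.1, -6.2, -5.7, -7.6],
    [-99.0, -35.8, 27.4, 19.4, -2.6, -3.8, 9.0, 2.7],
    [51.8, -60.8, 3.9, -11.8, 1.9, 4.1, 1.0, 6.4],
    [30.0, -25.1, -6.7, 6.2, -4.4, -10.7, -4.2, -8.0],
    [22.6, 2.7, 4.9, 3.4, -3.6, 8.7, -2.7, 0.9],
    [15.6, 4.9, -7.0, 1.1, 2.3, -2.2, 6.6, -1.7],
    [0.0, 5.9, 2.3, 0.5, 5.8, 3.1, 8.0, 4.8],
    [-0.7, -2.3, -5.2, -1.0, 3.6, -0.5, 5.1, -0.1],
]
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table):
    """Element-wise round(X / Q)."""
    return [[round(x / q) for x, q in zip(x_row, q_row)]
            for x_row, q_row in zip(coeffs, table)]


quantized = quantize(X, Q)
for row in quantized:
    print(row)
```

Most high-frequency entries become zero because the steps in Q grow towards the bottom-right corner.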
-
99
JPEG Principles
Step 3: Coefficient-to-Symbol Mapping
Input (quantized coefficients):
45 0 0 -1 0 0 0 0
-8 -3 2 1 0 0 0 0
4 -5 0 0 0 0 0 0
2 -1 0 0 0 0 0 0
1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Zigzag scan procedure
Result = 45, 0, -8, 4, -3, 0, -1, 2, -5, 2, 1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 1, then 0 repeated 43 times
Symbols defined as [run of zeros, nonzero terminating value]
Step 4: Entropy Coding
• Symbols are encoded using mostly Huffman coding.
• Huffman coding is a variable-length coding method in which shorter codewords are assigned to the more frequently occurring symbols.
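The zigzag scan and the [run of zeros, value] mapping can be sketched as follows. Two simplifications are assumed: the DC coefficient is simply left out of the run-length pass, and the trailing zeros are represented by a single "EOB" marker. Real JPEG adds further details, such as differential DC coding and a limit on run lengths, which are not shown here.

```python
QUANTIZED = [
    [45, 0, 0, -1, 0, 0, 0, 0],
    [-8, -3, 2, 1, 0, 0, 0, 0],
    [4, -5, 0, 0, 0, 0, 0, 0],
    [2, -1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]


def zigzag_order(n=8):
    """Visit anti-diagonals in turn, alternating the direction of travel."""
    return sorted(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )


def zigzag(block):
    return [block[i][j] for i, j in zigzag_order(len(block))]


def run_length_symbols(ac):
    """Map AC coefficients to (run of zeros, nonzero value) pairs."""
    symbols, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    if run:                       # trailing zeros collapse to one marker
        symbols.append("EOB")
    return symbols


scan = zigzag(QUANTIZED)
symbols = run_length_symbols(scan[1:])   # scan[0] is the DC coefficient
print(scan[:21])
print(symbols)
```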
-
100
Classical orthogonal transforms
$[f]$: an $N \times N$ image
$[F] = V\,[f]\,H^{*}$, with $V^{-1} = V^{*}$ and $H^{-1} = H^{*}$
$[f] = V^{*}\,[F]\,H$
$\sum_{m=0}^{N-1} \sum_{n=0}^{N-1} |f(m,n)|^2 = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} |F(u,v)|^2$
-
101One JPEG block
-
102JPEG compression on several blocks
-
103
Blocking Artifacts
Blocking artifact (DCT)
-
104Examples using different quality factors to scale matrix Q
Original image: 263224 bytes
• 20922 bytes, 13x compression
• 18032 bytes, 15x compression
• 15159 bytes, 17x compression
• 11956 bytes, 22x compression
• 5728 bytes, 46x compression
-
105
JPEG and DCT: motion
Another DCT Application: Video Compression
[Figure: video encoder block diagram. Input Frame, Prediction, Prediction Error, 8×8 DCT, Quantizer, Entropy Coder, Inverse Quantizer, 8×8 IDCT, Delayed Frame Memory, Motion Estimation, Motion Compensation, Motion Vectors (entropy coded); switch: Intraframe = open, Interframe = closed]
-
106JPEG and DCT: motion
-
107Classical orthogonal transforms
-
108
32 x 32 superblock
Principles of overlapping
8 x 8 coefficients8 x 8 pixels
Original image Transformed image
-
109Lapped transforms
-
110
Blocking and Ringing Artifacts (Nagai 2001)
Ringing artifact (Wavelet, GenLOT)
-
111
How to Reduce the Ringing?
[Figure: input, quantization error and output, shown for the basis of GenLOT and for the basis of GULLOT; labelled region: high frequency]
-
112
GULLOT
1-D base example
-
113Results
GULB (26.29 dB), GenLOT (26.06 dB)
Coded Yogi image at 0.3bps
-
114Wavelet/Blocking equivalence
-
115
Principles of wavelet image codingSPIHT/JPEG-2000
-
116
SPIHT/JPEG 2000 principles
1-Level Wavelet Decomposition (2D DWT)
[Figure: the input image is filtered row-wise by H1 and H2 (low pass and high pass), each followed by decimation by 2; each result is then filtered column-wise by H1 and H2 and decimated by 2, giving the LL, HL, LH and HH components]
Filter $H_i$: $y[n] = \sum_{k=0}^{L} h_i[k]\, x[n-k]$
Decimator (2): keep one out of two pixels
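The row-wise then column-wise structure can be sketched with the simplest possible filter pair. The unnormalized Haar filters used here (pairwise average and pairwise half-difference) are an assumption chosen for brevity, not the filters of SPIHT or JPEG 2000. The sub-band naming follows one common convention (first letter for the row filter), which also varies between texts.

```python
def analyze(signal):
    """Filter and decimate by 2 with Haar filters: returns (low, high)."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def split(rows):
    """Apply analyze() to every row; return the low and high half-width images."""
    bands = [analyze(row) for row in rows]
    return [lo for lo, _ in bands], [hi for _, hi in bands]


def dwt2_one_level(image):
    """One decomposition level: row-wise operations, then column-wise."""
    low_rows, high_rows = split(image)
    ll, lh = (transpose(band) for band in split(transpose(low_rows)))
    hl, hh = (transpose(band) for band in split(transpose(high_rows)))
    return ll, hl, lh, hh


# 4x4 test image with purely horizontal variation (illustrative values).
image = [[0, 4, 0, 4] for _ in range(4)]
ll, hl, lh, hh = dwt2_one_level(image)
print(ll, hl, lh, hh)
```

Applying `dwt2_one_level` again to the LL output gives the next decomposition level.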
-
117
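SPIHT/JPEG 2000 principles
Multi-Level Wavelet Decomposition
[Figure: a first 2D-DWT splits the image into LL, HL1, LH1 and HH1; a second 2D-DWT applied to LL gives LL, HL2, LH2 and HH2, next to the unchanged HL1, LH1 and HH1]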
-
118
Bitplanes and Self-Similarity Across Scales
SPIHT/JPEG 2000 principles
-
119
Spatial Orientation Trees
Some Definitions: •O (i,j): set of coordinates of all offspring of node (i,j).
• D (i,j): set of coordinates of all descendants of node (i,j).
• H : set of coordinates of all spatial orientation tree roots.
• L (i,j): D (i,j) - O (i,j).
SPIHT/JPEG 2000 principles
-
120
SPIHT/JPEG 2000 principles
Coding Algorithm (SPIHT)
• Three lists are defined:
  1. LIS: List of Insignificant Sets
     - Type A: entries are elements D(i,j)
     - Type B: entries are elements L(i,j)
  2. LIP: List of Insignificant Pixels
  3. LSP: List of Significant Pixels
• Significance test: $S_n(T) = 1$ if $\max_{(i,j) \in T} |c_{i,j}| \ge 2^n$, and $S_n(T) = 0$ otherwise
• Key ideas:
  - Ordered bit plane transmission.
  - Multi-pass zero-tree coding.
  - Exploitation of self-similarity across scales of the 2-D DWT.
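The significance test is small enough to sketch directly. The coefficient block below is assumed to be the 8×8 example used on the SPIHT example slide (values as read from that slide). Coordinates are written (row, column) here, which may differ from the slide's convention.

```python
from math import floor, log2

COEFFS = [
    [63, -34, 49, 10, 7, 13, -12, 7],
    [-31, 23, 14, -13, 3, 4, 6, -1],
    [15, 14, 3, -12, 5, -7, 3, 9],
    [-9, -7, -14, 8, 4, -2, 3, 2],
    [-5, 9, -1, 47, 4, 6, -2, 2],
    [3, 0, -3, 2, 3, -2, 0, 4],
    [2, -3, 6, -4, 3, 6, 3, 6],
    [5, 11, 5, 6, 0, 3, -4, 4],
]


def significance(coeffs, coords, n):
    """S_n(T): 1 if some coefficient in the set T reaches 2**n in magnitude."""
    return 1 if max(abs(coeffs[i][j]) for i, j in coords) >= 2 ** n else 0


all_coords = [(i, j) for i in range(8) for j in range(8)]

# The first bit plane to transmit is set by the largest magnitude.
n_max = floor(log2(max(abs(COEFFS[i][j]) for i, j in all_coords)))

print(n_max)
print(significance(COEFFS, [(0, 0)], n_max))   # 63 against 2**5
print(significance(COEFFS, [(1, 0)], n_max))   # -31 against 2**5
```

The coder repeats this test for decreasing n, one bit plane per pass.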
-
121
Wavelet coefficients (8×8):
63 -34 49 10 7 13 -12 7
-31 23 14 -13 3 4 6 -1
15 14 3 -12 5 -7 3 9
-9 -7 -14 8 4 -2 3 2
-5 9 -1 47 4 6 -2 2
3 0 -3 2 3 -2 0 4
2 -3 6 -4 3 6 3 6
5 11 5 6 0 3 -4 4
480
0-7(1,3)
0-9(0,3)
002(3,5)
00-3(2,5)
480
147(3,4)
00-1(2,4)
114(1,2)
015(0,2)
1-31(0,1)
0-34(1,0)
023(1,1)
00-7(1,3)
00-9(0,3)
0014(1,2)
0015(0,2)
1-31(0,1)
00-13(3,1)
0014(2,1)
0010(3,0)
480
149(2,0)
1-34(1,0)
0023(1,1)
00-31(0,1)
-481
1-34(1,0)
163(0,0)
Reconstruction Value
BinarySymbols
Coefficient Value
CoefficientCoordinatesSPIHT – Example
-
122Some ExamplesOriginal Image
96x Compression 48x Compression
22x Compression 16x Compression
-
123
JPEG-2000
-
124Introduction
• Image compression has to:
  - Reduce storage and bandwidth requirements
  - Allow different extraction modes
• JPEG-2000 provides:
  - Low bit-rate compression performance
  - Progressive transmission by quality, resolution, component, or spatial locality
  - Lossy and lossless compression
  - Random access to the bitstream
  - Region of Interest coding
  - Robustness to bit errors
-
125Codec Structure
-
126Source Image Model
• One or Several Components in the image• Components can be at different resolutions
(different sizes)
-
127Intercomponent Transform
• Reduces the correlation between components
• Maps image data from RGB to YCrCb
• Advantages:
  - Improves coding efficiency
  - Allows visually relevant quantization
• Two transforms:
  - Irreversible color transform (ICT)
  - Reversible color transform (RCT)
-
128Reversible Component Transformation
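• Used for lossless or lossy coding
• Advantages:
  - Reasonable color space
  - Makes lossless compression possible
Forward: $Y_r = \left\lfloor \dfrac{R + 2G + B}{4} \right\rfloor, \quad V_r = R - G, \quad U_r = B - G$
Inverse: $G = Y_r - \left\lfloor \dfrac{U_r + V_r}{4} \right\rfloor, \quad R = V_r + G, \quad B = U_r + G$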
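The reversibility of this component transform is easy to check in integer arithmetic. A minimal sketch, with function names chosen here for illustration; Python's `//` operator is a floor division, which is what the transform needs for negative values too.

```python
def rct_forward(r, g, b):
    """Integer RGB to (Y_r, U_r, V_r)."""
    y = (r + 2 * g + b) // 4
    u = b - g
    v = r - g
    return y, u, v


def rct_inverse(y, u, v):
    """Exact inverse: recover G first, then R and B."""
    g = y - (u + v) // 4
    r = v + g
    b = u + g
    return r, g, b


pixel = (200, 50, 100)
print(rct_forward(*pixel), rct_inverse(*rct_forward(*pixel)))
```

The inverse is exact because U_r + V_r = R + B - 2G, so the floor term in Y_r can be recomputed from U_r and V_r alone.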
-
129Wavelet Transform (6)
An example:
-
130Wavelet Transform (7)
• 5/3 transform: reversible
  - Integer-to-integer transform
  - Can be used for both lossless and lossy coding
• 9/7 transform: nonreversible
  - Real-to-real transform
  - Can only be used for lossy coding
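To illustrate why an integer-to-integer transform allows lossless coding, here is a sketch of a one-level 1-D 5/3 transform written with lifting steps. Several choices are assumptions made for this example: the signal has even length, borders are handled by mirroring the nearest neighbour, and the function names are made up.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform on an even-length list."""
    n = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: detail = odd sample minus the floor of its neighbours' mean.
    d = [odd[i] - (even[i] + even[min(i + 1, n - 1)]) // 2 for i in range(n)]
    # Update step: smooth = even sample plus a rounded quarter of adjacent details.
    s = [even[i] + (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(n)]
    return s, d


def inverse_53(s, d):
    """Undo the lifting steps in reverse order; exact in integer arithmetic."""
    n = len(s)
    even = [s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(n)]
    odd = [d[i] + (even[i] + even[min(i + 1, n - 1)]) // 2 for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x


signal = [3, 7, -1, 4, 10, 2, 8, 8]
low, high = forward_53(signal)
print(low, high, inverse_53(low, high))
```

Each lifting step adds an integer computed from the other half of the samples, so it can be subtracted again exactly. That is not possible with the real-valued 9/7 filters once their outputs are rounded.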
-
131Quantization
• A uniform scalar quantization with dead-zone about the origin
A zero output may be produced for larger values on the input, to avoid recording noise
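A dead-zone uniform scalar quantizer can be sketched as sign(x)·floor(|x|/Δ): every input with magnitude below the step Δ maps to zero, so the zero bin is twice as wide as the others. The reconstruction offset `r = 0.5` (mid-point of the bin) is an assumption for this example; decoders may choose another value.

```python
def deadzone_quantize(x, step):
    """Quantizer index: sign(x) * floor(|x| / step)."""
    sign = -1 if x < 0 else 1
    return sign * int(abs(x) // step)


def dequantize(q, step, r=0.5):
    """Reconstruct inside the bin; r = 0.5 picks the mid-point (assumed)."""
    if q == 0:
        return 0.0
    sign = -1 if q < 0 else 1
    return sign * (abs(q) + r) * step


for value in (-2.5, -0.9, 0.3, 0.9, 2.5):
    q = deadzone_quantize(value, 1.0)
    print(value, q, dequantize(q, 1.0))
```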
-
132Progression
• Different orderings of the packets in the code stream
• 4 types of progression:
  - Resolution
  - Quality
  - Spatial location
  - Component
• Progression type can be changed during coding
• Progression Type can be changed during coding
-
133Progression (2)
• Progression by Resolution
-
134Progression (3)
• Progression by Quality
-
135Region of Interest
• Coding different regions of the image with different quality
• Used when certain parts of the image are of higher importance
• ROI coding:
  - General Scaling-Based method
  - MAXSHIFT method
-
136General Scaling -Based
• Idea: scale (shift) coefficients such that the bits associated with the ROI are in higher bit-planes
• Some bits of the ROI might be encoded together with non-ROI bits
-
137
General Scaling-Based (2)
• Steps:
  - Wavelet transform
  - ROI mask is derived
  - Quantization
  - Non-ROI coefficients are downscaled
  - Entropy coding
• Scaling value and ROI coordinates are included in the bitstream
-
138MAXSHIFT Method
• Scaling value S is chosen such that the minimum ROI coefficient is larger than the maximum non-ROI coefficient
• Advantages:
  - Allows arbitrarily shaped ROIs
  - No ROI mask is needed
-
139Scalability
• The ability to code more than one quality and/or resolution simultaneously
• Two important types:
  - SNR scalability
  - Spatial or resolution scalability
• Advantages:
  - No need to know the target bit rate/resolution
  - No need for multiple compressions
  - Resilience to transmission errors
-
140SNR Scalability
• The bit stream can be decompressed at different quality levels (SNR)
Decompressed image “bike” at (a) 0.125 b/p, (b) 0.25 b/p, (c) 0.5 b/p
-
141Spatial Scalability
• The bit stream can be decompressed at different resolution level
-
142Scalability (2)
• Combination of Spatial and SNR• Changing the progression type
-
143JPEG-2000 V.S. JPEG
(a) (b)
Compression at 0.25 b/p by means of (a) JPEG (b) JPEG-2000
-
144JPEG-2000 V.S. JPEG
Compression at 0.2 b/p by means of (a) JPEG (b) JPEG-2000(a) (b)
-
145Comparison of JPEG 2000 with JPEG
• Much smaller files• Much better quality
Figure: 0.08bpp J2K Image (8KB); 0.1563bpp JPEG Image (16KB);
-
146Illustration
• Region of Interest (ROI) Encoding
Figure: Raw Image; 0.07bpp J2K Image with ROI; 0.07bpp J2K Image without ROI
-
148JPEG-2000 Parts