Lie Gonzalez Ch8 ppt
TRANSCRIPT
8/10/2019
CCU, Taiwan
Wen-Nung Lie
Chapter 8: Image Compression
-
Applications of image compression
Televideo conferencing, Remote sensing (satellite imagery),
Document and medical imaging,
Facsimile transmission (FAX),
Control of remotely piloted vehicles in military,
space, and hazardous waste management
-
Fundamentals
What is data and what is information? Data are the means by which information is conveyed.
Various amounts of data may be used to represent the same amount of information → data redundancy:
Coding redundancy
Interpixel redundancy
Psychovisual redundancy
-
Coding redundancy
The graylevel histogram of an image can provide a great
deal of insight into the construction of codes
The average number of bits to represent each pixel is the average number of bits to represent each graylevel value:

    L_avg = \sum_{k=0}^{L-1} l(r_k) \, p_r(r_k)

where l(r_k) is the length in bits of the code word for graylevel r_k.
Variable-length coding (VLC): higher probability → shorter bit length
L_avg = 3.0 bits (code 1) vs. 2.7 bits (code 2)
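The L_avg computation above is a one-line sum. A minimal sketch, assuming the eight-level histogram and code-2 word lengths of the textbook example (any histogram works the same way):

```python
# Average code length L_avg = sum_k l(r_k) * p_r(r_k).
probs   = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]  # p_r(r_k), assumed
lengths = [2, 2, 2, 3, 4, 5, 6, 6]                          # l(r_k), variable-length code 2
l_avg = sum(l * p for l, p in zip(lengths, probs))

fixed = sum(3 * p for p in probs)  # 3-bit fixed-length code 1
print(l_avg, fixed)                # ≈ 2.7 vs 3.0 bits/pixel
```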
-
-
Run-length coding to remove spatial redundancy

    C_R = 2.63,   R_D = 1 - \frac{1}{C_R} = 1 - \frac{1}{2.63} = 0.62
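Run-length coding itself can be sketched in a few lines (a generic sketch, not the slide's specific binary-image coder):

```python
def run_length_encode(symbols):
    # Collapse each run of identical symbols into a (symbol, count) pair.
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [(s, n) for s, n in runs]

def run_length_decode(runs):
    # Expand each (symbol, count) pair back into the original run.
    return [s for s, n in runs for _ in range(n)]

line = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1]
runs = run_length_encode(line)   # [(0, 4), (1, 2), (0, 3), (1, 1)]
assert run_length_decode(runs) == line
```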
-
Psychovisual redundancy
Certain information simply has less relative importance than other information in normal visual processing → psychovisual redundancy
The elimination of psychovisually redundant data results in a loss of quantitative information, called quantization → lossy data compression
E.g., quantization in graylevels
E.g., line interlacing in TV (reduced video scanning rate)
(b) False contouring on quantization
(c) Improved graylevel quantization via dithering
-
Fidelity criteria
Objective fidelity criterion
Root-mean-square error
Mean-square signal-to-noise ratio
Subjective fidelity criterion
Much worse, worse, slightly worse, the same, slightly better, better, much better
    e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat{f}(x,y) - f(x,y)]^2 \right]^{1/2}

    SNR_{ms} = \frac{ \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x,y)^2 }{ \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat{f}(x,y) - f(x,y)]^2 }
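The two objective criteria are direct pixel sums. A minimal sketch (the 2×2 arrays are arbitrary illustration values):

```python
def rms_error(f, fhat):
    # e_rms over an M x N image: sqrt( (1/MN) * sum (fhat - f)^2 ).
    mn = len(f) * len(f[0])
    sq = sum((fh - fo) ** 2 for rf, rh in zip(f, fhat) for fo, fh in zip(rf, rh))
    return (sq / mn) ** 0.5

def snr_ms(f, fhat):
    # Mean-square SNR: sum fhat^2 / sum (fhat - f)^2.
    num = sum(fh ** 2 for row in fhat for fh in row)
    den = sum((fh - fo) ** 2 for rf, rh in zip(f, fhat) for fo, fh in zip(rf, rh))
    return num / den

f    = [[10, 20], [30, 40]]
fhat = [[11, 19], [30, 42]]
```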
-
Image compression model
Source encoder Remove input redundancies
Channel encoder
Increase the noise immunity of the source encoder's output
-
Source encoder model
Mapper
Transforms the input data into a format designed to reduce interpixel redundancies in the image (reversible)
Quantizer
Reduces the accuracy of the mapper's output (reduces the psychovisual redundancies); irreversible, and must be omitted in error-free compression
Symbol encoder
Fixed- or variable-length coder (assigns the shortest code words to the most frequently occurring outputs to reduce coding redundancies) -- reversible
-
Channel encoder and decoder
Designed to reduce the impact of channel noise by
inserting a controlled form of redundancy into the source
encoded data
Hamming code as a channel code
7-bit Hamming (7,4): the minimum distance is 3, so it can correct any single-bit error
A single-bit error is indicated by a nonzero parity word c4 c2 c1

    h1 = b3 ⊕ b2 ⊕ b0
    h2 = b3 ⊕ b1 ⊕ b0
    h4 = b2 ⊕ b1 ⊕ b0
    h3 = b3,  h5 = b2,  h6 = b1,  h7 = b0

(h1, h2, h4 are even-parity bits)

    c1 = h1 ⊕ h3 ⊕ h5 ⊕ h7
    c2 = h2 ⊕ h3 ⊕ h6 ⊕ h7
    c4 = h4 ⊕ h5 ⊕ h6 ⊕ h7
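The parity equations above translate directly into an encoder and a single-error corrector. A minimal sketch:

```python
def hamming74_encode(b3, b2, b1, b0):
    # h1, h2, h4 are even-parity bits; data bits occupy h3, h5, h6, h7.
    h1 = b3 ^ b2 ^ b0
    h2 = b3 ^ b1 ^ b0
    h4 = b2 ^ b1 ^ b0
    return [h1, h2, b3, h4, b2, b1, b0]          # code word h1..h7

def hamming74_correct(h):
    # Nonzero parity word c4 c2 c1 gives the position of the flipped bit.
    h1, h2, h3, h4, h5, h6, h7 = h
    c1 = h1 ^ h3 ^ h5 ^ h7
    c2 = h2 ^ h3 ^ h6 ^ h7
    c4 = h4 ^ h5 ^ h6 ^ h7
    pos = c4 * 4 + c2 * 2 + c1
    if pos:
        h = h[:]
        h[pos - 1] ^= 1                          # flip the erroneous bit back
    return h

word = hamming74_encode(1, 0, 1, 1)
bad = word[:]
bad[2] ^= 1                                      # inject a single-bit error
assert hamming74_correct(bad) == word
```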
-
Information theory
Explores the minimum amount of data that is sufficient to completely describe an image without loss of information
Self-information of an event E with probability P(E):

    I(E) = \log \frac{1}{P(E)} = -\log P(E)

P(E) = 1 → I(E) = 0
P(E) = 1/2 → I(E) = 1 bit
The base of the logarithm determines the information units
-
Information channel
Source alphabet A = {a1, a2, ..., aJ}
Symbol probabilities z = [P(a1), P(a2), ..., P(aJ)]^T, with \sum_{j=1}^{J} P(a_j) = 1
Ensemble (A, z) describes the information source completely
Average information per source output:

    H(z) = -\sum_{j=1}^{J} P(a_j) \log P(a_j)

H(z) is the uncertainty or entropy of the source, or the average amount of information (in m-ary units) obtained by observing a single source output
For equally probable source symbols, entropy is maximized
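H(z) can be sketched directly from the definition; the asserts check the two limiting cases stated above:

```python
import math

def entropy(probs):
    # H(z) = -sum_j P(a_j) log2 P(a_j), in bits per source symbol.
    return -sum(p * math.log2(p) for p in probs if p > 0)

assert abs(entropy([0.25] * 4) - 2.0) < 1e-12  # uniform: maximum, log2(J)
assert entropy([1.0]) == 0.0                   # a certain symbol carries no information
```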
-
Channel alphabet B = {b1, b2, ..., bK}
Channel description (B, v), with v = [P(b1), P(b2), ..., P(bK)]^T

    P(b_k) = \sum_{j=1}^{J} P(b_k | a_j) P(a_j)

    Q = \begin{bmatrix} P(b_1|a_1) & P(b_1|a_2) & \cdots & P(b_1|a_J) \\ P(b_2|a_1) & & & \vdots \\ \vdots & & & \\ P(b_K|a_1) & P(b_K|a_2) & \cdots & P(b_K|a_J) \end{bmatrix}

Q: forward channel transition matrix (channel matrix)

    v = Q z
-
Conditional entropy function

    H(z | b_k) = -\sum_{j=1}^{J} P(a_j | b_k) \log P(a_j | b_k)

    H(z | v) = \sum_{k=1}^{K} H(z | b_k) P(b_k) = -\sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log P(a_j | b_k)

H(z | v) is called the equivocation of z with respect to v.
The difference between H(z) and H(z | v) is the average information received upon observing a single output symbol -- the mutual information:

    I(z, v) = H(z) - H(z | v)
-
Mutual information

    I(z, v) = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log \frac{P(a_j, b_k)}{P(a_j) P(b_k)} = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j) q_{kj} \log \frac{q_{kj}}{\sum_{i=1}^{J} P(a_i) q_{ki}},   where Q = \{q_{kj}\}

The minimum possible value of I(z, v) is zero and occurs when the input and output symbols are statistically independent (i.e., P(a_j, b_k) = P(a_j) P(b_k))
The maximum value of I(z, v) over all possible z is called the capacity C of the channel described by channel matrix Q:

    C = \max_{z} [ I(z, v) ]
-
Channel capacity
Capacity: the maximum rate at which information can be transmitted reliably through the channel
Channel capacity does not depend on the input probabilities of the source (i.e., on how the channel is used) but is a function of the conditional probabilities defining the channel alone (i.e., Q)
-
Binary entropy function
For a binary source: z = [p_bs, 1 - p_bs]^T

    H_{bs}(p_{bs}) = -p_{bs} \log_2 p_{bs} - (1 - p_{bs}) \log_2 (1 - p_{bs})

Max H_{bs}(p_{bs}) = 1 bit when p_{bs} = 0.5
Binary symmetric channel (BSC) with error probability p_e:
channel matrix:

    Q = \begin{bmatrix} 1 - p_e & p_e \\ p_e & 1 - p_e \end{bmatrix}

probability of received symbols:

    v = Q z = [\, p_{bs}(1 - p_e) + (1 - p_{bs}) p_e, \; p_{bs} p_e + (1 - p_{bs})(1 - p_e) \,]^T

mutual information:

    I(z, v) = H_{bs}( p_{bs}(1 - p_e) + (1 - p_{bs}) p_e ) - H_{bs}(p_e)

I(z, v) = 0 when p_{bs} = 0 or 1
I(z, v) = 1 - H_{bs}(p_e) = C when p_{bs} = 0.5, for any p_e
H_{bs} is used to estimate the source entropy; I is used to estimate the channel capacity
-
When p_e = 0 or 1: C = 1 bit
When p_e = 0.5: C = 0 bits (no information transmitted)
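The BSC formulas above can be checked numerically; the asserts confirm that p_bs = 0.5 attains C = 1 - H_bs(p_e) and that p_e = 0.5 gives zero capacity:

```python
import math

def h_bs(p):
    # Binary entropy function in bits.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_info(p_bs, p_e):
    # I(z, v) = H_bs(p_bs(1-p_e) + (1-p_bs)p_e) - H_bs(p_e) for the BSC.
    q = p_bs * (1 - p_e) + (1 - p_bs) * p_e
    return h_bs(q) - h_bs(p_e)

p_e = 0.1
capacity = 1 - h_bs(p_e)                        # achieved at p_bs = 0.5
assert abs(bsc_mutual_info(0.5, p_e) - capacity) < 1e-12
assert bsc_mutual_info(0.0, p_e) == 0.0         # degenerate source: no information
```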
-
Shannon's first theorem for noiseless coding
Defines the minimum average code length per source symbol that can be achieved
For a zero-memory source (with finite ensemble (A, z) and statistically independent source symbols):
Single symbol: A = {a1, a2, ..., aJ}
Block symbol (n-tuple of symbols): A' = {α1, α2, ..., α_{J^n}}, with P(α_i) = P(a_{j1}) P(a_{j2}) \cdots P(a_{jn})
Entropy of the n-th extension (A', z'):

    H(z') = -\sum_{i=1}^{J^n} P(\alpha_i) \log P(\alpha_i) = n H(z)

(entropy of a zero-memory source with block symbols)
-
n-th extended source: use n symbols as one block symbol
Average word length of the n-th extension code:

    L'_{avg} = \sum_{i=1}^{J^n} P(\alpha_i) \, l(\alpha_i)

Shannon's bound:

    H(z') \le L'_{avg} < H(z') + 1
-
Coding efficiency
Coding efficiency:

    \eta = \frac{n H(z)}{L_{avg}}

Example:
Original source: P(a1) = 2/3, P(a2) = 1/3
L_avg = 1 bit/symbol, H(z) = 0.918 bits/symbol → η = 0.918 / 1 = 0.918
Second extension source: z' = {4/9, 2/9, 2/9, 1/9}
L_avg = 17/9 = 1.89 bits (per block of two symbols), H(z') = 1.83 bits → η = 1.83 / 1.89 = 0.97
-
Shannon's second theorem for noisy coding
For any R < C (channel capacity of the zero-memory channel with Q), there exist codes of block length r and rate R such that the block decoding error can be made arbitrarily small
Rate distortion function
Given a distortion D, the minimum rate at which information about the source can be conveyed to the user:

    R(D) = \min_{Q \in Q_D} I(z, v),   Q_D = \{ Q : d(Q) \le D \}

Q: the transition probabilities from source symbols to output symbols induced by compression
d(Q): average distortion
D: distortion threshold
-
Example of rate-distortion function
Consider a zero-memory binary source with equally probable source symbols {0, 1}:

    R(D) = 1 - H_{bs}(D)

R(D) is monotonically decreasing and convex in the interval (0, Dmax)
The shape of the R-D function represents the coding efficiency of a coder
-
Estimate the information content (entropy) of an image
Construct a source model based on the relative frequency of occurrence of the graylevels
First-order estimate (considers the histogram of single pixels): 1.81 bits/pixel
Second-order estimate (considers the histogram of pairs of adjacent pixels): 1.25 bits/pixel
Third-order, fourth-order: approaching the source entropy
Example image (4 lines of 8 pixels):
21 21 21 95 169 243 243 243
21 21 21 95 169 243 243 243
21 21 21 95 169 243 243 243
21 21 21 95 169 243 243 243
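The 1.81 bits/pixel first-order estimate follows from the histogram of this example image; a minimal sketch:

```python
import math

# First-order entropy estimate from the graylevel histogram of the
# 4 x 8 example image above (levels 21, 95, 169, 243).
row = [21, 21, 21, 95, 169, 243, 243, 243]
image = [row] * 4
counts = {}
for r in image:
    for px in r:
        counts[px] = counts.get(px, 0) + 1
n = sum(counts.values())
h1 = -sum((c / n) * math.log2(c / n) for c in counts.values())
print(round(h1, 2))  # 1.81 bits/pixel, matching the slide
```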
-
The difference between the higher-order estimates of entropy and the first-order estimate indicates the presence of interpixel redundancy
If the pixels are statistically independent, the higher-order estimates are equivalent to the first-order estimate, and variable-length coding (VLC) provides optimal compression (i.e., no further mapping is necessary)
First-order estimate of the difference-mapped source: 1.41 bits/pixel
1.41 > 1.25 → there is an even better mapping
-
Error-free compression -- variable-length coding (VLC)
Huffman coding
The most popular method; yields the smallest possible number of code symbols per source symbol
Construct the Huffman tree according to the source symbol probabilities, then code the Huffman tree
Compute the source entropy, average code length, and coding efficiency
-
Huffman code is:
a block code (each symbol is mapped to a fixed sequence of bits)
instantaneous (decoded without referencing succeeding symbols)
uniquely decodable (no code word is a prefix of another)
For example: 010100111100 → a3 a1 a2 a2 a6
L_avg = 2.2 bits/symbol, H(z) = 2.14 bits/symbol, η = 2.14 / 2.2 = 0.973
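The Huffman construction can be sketched with a heap; the six probabilities below are assumed from the slide's example (a2 = 0.4, a6 = 0.3, a1 = a4 = 0.1, a3 = 0.06, a5 = 0.04), for which any optimal code gives L_avg = 2.2:

```python
import heapq

def huffman_lengths(probs):
    # Repeatedly merge the two least-probable nodes; every merge adds one
    # bit to the code length of each symbol inside the merged subtrees.
    heap = [(p, i, [i]) for i, p in enumerate(probs)]  # (prob, tiebreak, symbols)
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    count = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, count, s1 + s2))
        count += 1
    return lengths

probs = [0.1, 0.4, 0.06, 0.1, 0.04, 0.3]  # a1..a6, assumed from the slide
lengths = huffman_lengths(probs)
l_avg = sum(p * l for p, l in zip(probs, lengths))
```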
-
Variations of VLC
-
Variations of VLC (cont.)
Considered when a large number of source symbols must be coded
Truncated Huffman coding
Only some symbols (the most probable; here, the top 12) are encoded with a Huffman code
A prefix code (whose probability is the sum of the remaining symbols' probabilities; here, 10), followed by a fixed-length code, is used to represent all the other symbols
B-code
Each code word is made up of continuation bits (C) and information bits (natural binary numbers)
C can be 1 or 0, but must alternate between code words
E.g., a11 a2 a7 → 001010101000010 or 101110001100110
-
Variations of VLC (cont.)
Binary shift code
Divide the symbols into blocks of equal size
Code the individual elements within all blocks identically
Add shift-up or shift-down symbols to identify each block (here, 111)
Huffman shift code
Select one reference block
Sum the probabilities of all other blocks and use this sum to determine the shift symbol by the Huffman method (here, 00)
-
Arithmetic coding
AC generates non-block codes (no look-up table as in Huffman coding)
An entire sequence of source symbols is assigned a single code word
The code word itself defines an interval of real numbers between 0 and 1. As the number of symbols in the message increases, the interval becomes smaller (more information bits are needed to represent this real number)
Multiple symbols are coded together with one set of bits → the number of bits per symbol is effectively fractional
-
Arithmetic coding (cont.)
Example: coding of a1 a2 a3 a3 a4
Any real number in [0.06752, 0.0688) can represent the source symbols (e.g., 0.068359375 = 0.000100011 in binary)
Theoretically, 0.068 is enough to represent the source sequence → 3 decimal digits → 0.6 decimal digits per symbol
Actually, 0.068359375 → 9 bits for 5 symbols → 1.8 bits per symbol
As the length of the source symbol sequence increases, the resulting arithmetic code approaches Shannon's bound
Two factors limit the coding performance:
An end-of-message indicator is necessary to separate one message sequence from another
The finite precision of the arithmetic
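The interval narrowing can be sketched directly; the source model below is assumed from the textbook example (P(a1) = P(a2) = P(a4) = 0.2, P(a3) = 0.4), which reproduces the slide's final interval:

```python
def arithmetic_interval(message, model):
    # Narrow [low, high) once per symbol; the final interval identifies
    # the whole message. `model` maps symbol -> (cum_low, cum_high).
    low, high = 0.0, 1.0
    for sym in message:
        c_lo, c_hi = model[sym]
        span = high - low
        low, high = low + span * c_lo, low + span * c_hi
    return low, high

model = {'a1': (0.0, 0.2), 'a2': (0.2, 0.4),
         'a3': (0.4, 0.8), 'a4': (0.8, 1.0)}   # assumed probabilities
low, high = arithmetic_interval(['a1', 'a2', 'a3', 'a3', 'a4'], model)
print(low, high)  # ≈ [0.06752, 0.0688), as on the slide
```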
-
LZW (Lempel-Ziv-Welch) coding
Assigns fixed-length code words to variable-length sequences of source symbols, but requires no a priori knowledge of the probabilities of occurrence of symbols
LZW compression has been integrated into the GIF, TIFF, and PDF formats
For a 9-bit (512-word) dictionary, the latter half of the entries are constructed during the encoding process. An entry of two or more pixels is possible (i.e., assigned a 9-bit code)
If the size of the dictionary is too small, the detection of matching gray-level sequences will be less likely
If too large, the size of the code words will adversely affect compression performance
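A minimal LZW encoder sketch for 8-bit pixels with a 9-bit dictionary; growth simply stops on overflow (one of several overflow policies):

```python
def lzw_encode(data, dict_bits=9):
    # Dictionary is seeded with all single 8-bit symbols (codes 0..255);
    # codes 256.. are built as the input is scanned.
    table = {bytes([i]): i for i in range(256)}
    limit = 2 ** dict_bits
    w = b""
    out = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc                         # keep extending the current match
        else:
            out.append(table[w])           # emit code for the longest match
            if len(table) < limit:         # stop growing on overflow
                table[wc] = len(table)
            w = bytes([byte])
    if w:
        out.append(table[w])
    return out

print(lzw_encode(bytes([39, 39, 126, 126, 39, 39, 126, 126])))
# [39, 39, 126, 126, 256, 258]
```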
-
LZW coding (cont.)
The LZW decoder builds an identical decompression dictionary as it simultaneously decodes the bit stream
Care must be taken to handle dictionary overflow
-
Bit-plane coding
Bit-plane decomposition:

    a_{m-1} 2^{m-1} + a_{m-2} 2^{m-2} + \cdots + a_1 2^1 + a_0 2^0

Binary: the bit change between two adjacent codes may be significant (e.g., 127 and 128)
Gray code: only 1 bit changes between any two adjacent codes

    g_i = a_i \oplus a_{i+1},   0 \le i \le m-2
    g_{m-1} = a_{m-1}

Gray-coded bit planes are less complex than the corresponding binary bit planes
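The Gray mapping above is a single shift-and-XOR; the 127/128 case shows why the Gray planes are simpler:

```python
def to_gray(n):
    # g_i = a_i XOR a_{i+1} for all bit positions at once.
    return n ^ (n >> 1)

# Adjacent levels 127 and 128 differ in all 8 bits in natural binary,
# but in exactly one bit in the Gray code.
assert bin(127 ^ 128).count('1') == 8
assert bin(to_gray(127) ^ to_gray(128)).count('1') == 1
```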
-
(Figure: binary-coded vs. gray-coded bit planes)
-
Lossless compression for binary images
1-D run-length coding
RLC + VLC according to run-length statistics
2-D run-length coding
used for FAX image compression
Relative address coding (RAC)
based on the principle of tracking the binary transitions that begin and end each black and white run, combined with VLC
-
Other methods for lossless compression of binary images
White block skipping
code solid white lines as 0s and all other lines with a 1 followed by the original bit patterns
Direct contour tracing
represent each contour by a single boundary point and a set of directionals
Predictive differential quantizing (PDQ)
a scan-line-oriented contour tracing procedure
Comparisons of various algorithms
only entropies after pixel mapping are computed, instead of real encoding (see next slide)
-
Comparison of various methods
Run-length coding proved to be the best coding method for bit-plane coded images
2-D techniques (PDQ, DDC, RAC) perform better when compressing binary images or higher bit planes
Gray-coded images proved to gain an additional 1 bit/pixel of compression efficiency relative to binary-coded images
(For binary images)
-
Lossless predictive coding
Eliminates the interpixel redundancies of closely spaced pixels by extracting and coding only the new information (the difference between the actual and predicted value) in each pixel:

    e_n = f_n - \hat{f}_n

System architecture:
the same predictor on the encoder and decoder sides; the predicted value is rounded
followed by a symbol encoder
-
Methods of prediction
Use a local, global, or adaptive predictor to generate \hat{f}_n
Linear prediction: a linear combination of the m previous pixels

    \hat{f}_n = round\left[ \sum_{i=1}^{m} \alpha_i f_{n-i} \right]

m: order of prediction; \alpha_i: prediction coefficients
Use local neighborhoods (e.g., pixels 1, 2, 3, 4) for prediction of pixel X in 2-D images:

    2 3 4
    1 X

Special case: previous-pixel predictor
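The previous-pixel special case makes the encode/decode symmetry easy to see; a minimal sketch over one scan line (the row values are taken from the JPEG example block later in these slides):

```python
def predictive_encode(row):
    # Previous-pixel predictor: e_n = f_n - f_{n-1}; the first pixel is sent as-is.
    errors = [row[0]]
    for n in range(1, len(row)):
        errors.append(row[n] - row[n - 1])
    return errors

def predictive_decode(errors):
    # Invert exactly: f_n = e_n + f_{n-1}.
    row = [errors[0]]
    for e in errors[1:]:
        row.append(row[-1] + e)
    return row

row = [52, 55, 61, 66, 70, 61, 64, 73]
assert predictive_decode(predictive_encode(row)) == row
```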
-
Entropy reduction by difference mapping
Due to the removal of interpixel redundancies by prediction, the first-order entropy of the difference mapping is lower than that of the original image (3.96 bits/pixel vs. 6.81 bits/pixel)
The probability density function of the prediction errors is highly peaked at zero and characterized by a relatively small variance (modeled by the zero-mean uncorrelated Laplacian pdf):

    p(e) = \frac{1}{\sqrt{2}\,\sigma_e} e^{-\sqrt{2}\,|e| / \sigma_e}
2)(
-
8/10/2019 Lie Gonzalez Ch8 ppt
44/106
8-43CCU, Taiwan
Wen-Nung Lie
Lossy predictive coding Error-free encoding of images seldom results in more than
3:1 reduction in data
A lossy predictive coding model
the prediction in the encoder and decoder must be equivalent (same)
-- placing encoders predictor within a feedback loop
nnn fef+= &&
This closed loop
configuration preventserror buildup atdecoders output
-
Delta modulation (DM)
DM:

    \hat{f}_n = \alpha \dot{f}_{n-1}

    \dot{e}_n = +\zeta for e_n > 0, -\zeta otherwise

The resulting bit rate is 1 bit/pixel
Slope overload effect
ζ is too small to represent the input's large changes, leading to blurred object edges
Granular noise effect
ζ is too large to represent the input's small changes, leading to grainy or noisy surfaces
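A minimal DM sketch; the step-edge input (an illustration, not from the slide) shows slope overload, since the coder can climb by only ζ per sample:

```python
def delta_modulate(signal, zeta, alpha=1.0):
    # DM coder: predict alpha * previous reconstructed sample,
    # quantize the prediction error to +/- zeta, then reconstruct.
    recon = []
    f_prev = 0.0
    for f in signal:
        pred = alpha * f_prev
        e_dot = zeta if f - pred > 0 else -zeta
        f_prev = pred + e_dot
        recon.append(f_prev)
    return recon

# With zeta = 5, the reconstruction lags the 0 -> 100 edge badly.
recon = delta_modulate([0, 0, 100, 100, 100, 100], zeta=5)
print(recon)  # [-5.0, 0.0, 5.0, 10.0, 15.0, 20.0]
```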
-
Delta modulation (cont)
-
Optimal predictors
Minimize the encoder's mean-square prediction error

    E\{e_n^2\} = E\{ [f_n - \hat{f}_n]^2 \}

with

    \dot{f}_n = \dot{e}_n + \hat{f}_n \approx e_n + \hat{f}_n = f_n   and   \hat{f}_n = \sum_{i=1}^{m} \alpha_i f_{n-i}

(DPCM)
Select the m prediction coefficients to minimize

    E\{e_n^2\} = E\left\{ \left[ f_n - \sum_{i=1}^{m} \alpha_i f_{n-i} \right]^2 \right\}

The solution:

    \alpha = R^{-1} r

where R is the m×m autocorrelation matrix with elements E\{f_{n-i} f_{n-j}\},
r = [E\{f_n f_{n-1}\}, E\{f_n f_{n-2}\}, ..., E\{f_n f_{n-m}\}]^T, and \alpha = [\alpha_1, \alpha_2, ..., \alpha_m]^T
-
Optimal predictors (cont.)
Variance of the prediction error:

    \sigma_e^2 = E\{e_n^2\} = \sigma^2 - \alpha^T r = \sigma^2 - \sum_{i=1}^{m} \alpha_i E\{f_n f_{n-i}\}

For a 2-D Markov source with separable autocorrelation function

    E\{ f(x,y) f(x-i, y-j) \} = \sigma^2 \rho_v^i \rho_h^j

and a 4th-order linear predictor

    \hat{f}(x,y) = \alpha_1 f(x, y-1) + \alpha_2 f(x-1, y-1) + \alpha_3 f(x-1, y) + \alpha_4 f(x-1, y+1)

the optimal coefficients are \alpha_1 = \rho_h, \alpha_2 = -\rho_v \rho_h, \alpha_3 = \rho_v, \alpha_4 = 0

    (neighborhood:  2 3 4
                    1 X)

We generally have

    \sum_{i=1}^{m} \alpha_i \le 1.0

to reduce the DPCM decoder's susceptibility to transmission noise (this confines the impact of an input error to a small number of outputs)
-
Examples of predictors
Four examples:

    \hat{f}(x,y) = 0.97 f(x, y-1)
    \hat{f}(x,y) = 0.5 f(x, y-1) + 0.5 f(x-1, y)
    \hat{f}(x,y) = 0.75 f(x, y-1) + 0.75 f(x-1, y) - 0.5 f(x-1, y-1)
    \hat{f}(x,y) = 0.97 f(x, y-1) if \Delta h \le \Delta v, otherwise 0.97 f(x-1, y)
    (adaptive, with a local measure of directional property)

The visually perceptible error decreases as the order of the predictor increases
-
Optimal quantization
The staircase quantization function is an odd function that can be described by the decision levels (s_i) and reconstruction levels (t_i)
Quantizer design is to select the best s_i and t_i for a particular optimization criterion and input probability density function p(s)
-
L-level Lloyd-Max quantizer
Optimal in the mean-square error sense:

    \int_{s_{i-1}}^{s_i} (s - t_i) \, p(s) \, ds = 0,   i = 1, 2, ..., L/2

    s_i = 0 for i = 0;   s_i = \frac{t_i + t_{i+1}}{2} for i = 1, 2, ..., L/2 - 1;   s_i = \infty for i = L/2

    s_{-i} = -s_i,   t_{-i} = -t_i

t_i is the centroid of each decision interval
Decision levels are halfway between the reconstruction levels
→ a non-uniform quantizer
How about an optimal uniform quantizer?
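The two Lloyd-Max conditions (centroid reconstruction levels, midpoint decision levels) suggest the classic alternating iteration. A sketch over empirical samples rather than an analytic p(s) (an assumption; the analytic version integrates p(s) instead of averaging samples):

```python
def lloyd_max(samples, levels, iters=100):
    # Alternate the two optimality conditions: reconstruction levels t_i are
    # centroids of their decision intervals; decision levels s_i are midpoints.
    lo, hi = min(samples), max(samples)
    t = [lo + (hi - lo) * (i + 0.5) / levels for i in range(levels)]
    for _ in range(iters):
        s = [(t[i] + t[i + 1]) / 2 for i in range(levels - 1)]  # decision levels
        buckets = [[] for _ in range(levels)]
        for x in samples:
            i = sum(x > b for b in s)          # interval containing x
            buckets[i].append(x)
        t = [sum(b) / len(b) if b else t[i] for i, b in enumerate(buckets)]
    return s, t

s, t = lloyd_max([-3, -1, 1, 3], levels=2)
print(s, t)  # [0.0] [-2.0, 2.0]
```
For this symmetric toy input the quantizer converges to reconstruction levels at the interval centroids, with the single decision level halfway between them, as the conditions require.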
-
Transform coding
Subimage decomposition
Transformation
Decorrelate the pixels of each subimage
Energy packing
Quantization
Eliminate the coefficients that carry the least information
Coding
-
Transform selection
Requirements
orthonormal or unitary forward and inverse transformation kernels
basis functions or basis images
separable and symmetric kernels
Types: DFT, DCT, WHT (Walsh-Hadamard transform)
Compression is achieved during the quantization of the transformed coefficients (not during the transformation step)
-
WHT

    T(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \, g(x,y,u,v)

    f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} T(u,v) \, h(x,y,u,v)

    g(x,y,u,v) = h(x,y,u,v) = \frac{1}{N} (-1)^{\sum_{i=0}^{m-1} [ b_i(x) p_i(u) + b_i(y) p_i(v) ]},   N = 2^m

The summation in the exponent is performed in modulo-2 arithmetic; b_k(z) is the kth bit (from right to left) in the binary representation of z
-
DCT

    g(x,y,u,v) = h(x,y,u,v) = \alpha(u) \alpha(v) \cos\left[ \frac{(2x+1) u \pi}{2N} \right] \cos\left[ \frac{(2y+1) v \pi}{2N} \right]

    \alpha(u) = \sqrt{1/N} for u = 0;   \alpha(u) = \sqrt{2/N} for u = 1, 2, ..., N-1
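The separable DCT kernel above can be applied naively for small blocks (a sketch for clarity, not the fast factorized DCT):

```python
import math

def dct2(f):
    # Naive 2-D DCT of an N x N block using the cosine kernel above.
    N = len(f)
    a = [math.sqrt(1.0 / N)] + [math.sqrt(2.0 / N)] * (N - 1)
    T = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (f[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            T[u][v] = a[u] * a[v] * s
    return T

# A constant block packs all its energy into the single DC coefficient.
T = dct2([[1.0] * 4 for _ in range(4)])
```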
-
Approximations using DFT, WHT, and DCT
Truncating 50% of the transformed coefficients based on maximum magnitudes
The RMS error values are 1.28 (DFT), 0.86 (WHT), and 0.68 (DCT)
(Figure panels: DFT, WHT, DCT)
-
Basis images
A linear combination of basis images:

    F = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u,v) \, H_{uv}

where each H_{uv} is the n×n basis image

    H_{uv} = \begin{bmatrix} h(0,0,u,v) & h(0,1,u,v) & \cdots & h(0,n-1,u,v) \\ h(1,0,u,v) & & & \vdots \\ \vdots & & & \\ h(n-1,0,u,v) & \cdots & & h(n-1,n-1,u,v) \end{bmatrix}
-
Subimage size selection
The most popular subimage sizes are 8×8 and 16×16
A comparison of n×n subimages for n = 2, 4, 8, 16, truncating 75% of the resulting coefficients
The curves for WHT and DCT flatten as the subimage size grows beyond 8×8, while the FT curve does not
Blocking artifacts decrease as the subimage size increases
-
(Figure: original image and reconstructions from 2×2, 4×4, and 8×8 subimages)
-
Zonal vs. threshold coding
The coefficients are retained based on:
Maximum variance: zonal coding
Maximum magnitude: threshold coding
Zonal coding
Each DCT coefficient is considered as a random variable whose distribution can be computed over the ensemble of all transformed subimages
The variances can also be based on an assumed image model (e.g., a Markov autocorrelation function)
Coefficients of maximum variance are usually located around the origin
Threshold coding
Select the coefficients that have the largest magnitudes
Causes far less error than the zonal coding result
-
Bit allocation for zonal coding
The retained coefficients are allocated bits according to:
The DC coefficient is modeled by a Rayleigh density function
The AC coefficients are often modeled by a Laplacian or Gaussian density
The number of bits is made proportional to \log_2 \sigma^2_{T(u,v)}
(The information content of a Gaussian random variable is proportional to \frac{1}{2} \log_2 (\sigma^2 / D))
(Figures: zonal coding; threshold coding)
-
Bit allocation in threshold coding
The transform coefficients of largest magnitude make the most significant contribution to reconstructed subimage quality
Inherently adaptive in the sense that the locations of the retained transform coefficients vary from one subimage to another
Retained coefficients are recorded in a 1-D zigzag ordering pattern
The mask pattern is run-length coded
-
(Figures: zonal mask; bit allocation; thresholded and retained coefficients; zigzag ordering pattern)
-
Bit allocation in threshold coding (cont.)
Thresholding the transform coefficients -- the threshold can be varied as a function of the location of each coefficient within the subimage
This results in a variable code rate, but offers the advantage that thresholding and quantization can be combined, replacing the masking \gamma(u,v) T(u,v) with

    \hat{T}(u,v) = round\left[ \frac{T(u,v)}{Z(u,v)} \right]

where Z(u,v) is the transform normalization array. The recovered (denormalized) value is

    \dot{T}(u,v) = \hat{T}(u,v) \, Z(u,v)

The element Z(u,v) can be scaled to achieve a variety of compression levels
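The normalize/denormalize pair is elementwise. A sketch using the top-left 2×2 corner of the JPEG coding example block and, as an assumption, the matching corner of the JPEG luminance quantization matrix:

```python
def quantize(T, Z):
    # Combined threshold + quantization: T_hat(u,v) = round(T(u,v) / Z(u,v)).
    return [[round(t / z) for t, z in zip(tr, zr)] for tr, zr in zip(T, Z)]

def denormalize(T_hat, Z):
    # Decoder side: T_dot(u,v) = T_hat(u,v) * Z(u,v).
    return [[t * z for t, z in zip(tr, zr)] for tr, zr in zip(T_hat, Z)]

T = [[-415, -29], [7, -21]]        # DCT coefficients (example block corner)
Z = [[16, 11], [12, 12]]           # normalization entries (assumed)
print(quantize(T, Z))  # [[-26, -3], [1, -2]]
```
Note that denormalize(quantize(T, Z), Z) recovers only multiples of Z(u,v); the rounding loss is exactly where the compression happens.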
-
Normalization (quantization) matrix
Used in the JPEG standard
-
Wavelet coding
Requirements in encoding an image:
An analyzing wavelet
Number of decomposition levels
The discrete wavelet transform (DWT) converts a large portion of the original image to horizontal, vertical, and diagonal decomposition coefficients with zero mean and Laplacian-like distributions
-
Wavelet coding (cont.)
Many computed coefficients can be quantized and coded to minimize intercoefficient and coding redundancy
The quantization can be adapted to exploit any positional correlation across different decomposition levels
Subdivision of the original image is unnecessary
This eliminates the blocking artifact that characterizes DCT-based approximations at high compression ratios
-
1-D subband decomposition
Analysis filters: h0(n) (low-pass: approximation), h1(n) (high-pass: detail)
Synthesis filters: g0(n) (low-pass), g1(n) (high-pass)
-
2-D subband image coding: 4-filter bank
-
An example of Daubechies orthonormal filters
-
Comparison between DCT-based and DWT-based coding
A noticeable decrease of error in the wavelet coding results (based on compression ratios of 34:1 and 67:1)
DWT-based: rms error 2.29, 2.96
DCT-based: rms error 3.42, 6.33
-
DWT-based: rms error 3.72, 4.73 (at compression ratios of 108:1 and 167:1)
Even the 167:1 DWT result (rms = 4.73) is better than the 67:1 DCT result (rms = 6.33)
At more than twice the compression ratio, the wavelet-based reconstruction has only 75% of the error of the DCT-based result
-
Wavelet selection
The wavelet selection affects all aspects of wavelet coding system design and performance:
Computational complexity of the transform
The system's ability to compress and reconstruct images with acceptable error
Includes the decomposition filters and the reconstruction filters
Useful analysis property: number of zero moments
Important synthesis property: smoothness of reconstruction
The most widely used expansion functions: Daubechies wavelets
Bi-orthogonal wavelets allow filters with binary coefficients (numbers of the form k/2^a)
-
Comparison between different wavelets
(1) Haar wavelets: the simplest
(2) Daubechies wavelets: the most popular imaging wavelets
(3) Symlets: an extension of Daubechies with increased symmetry
(4) Cohen-Daubechies-Feauveau wavelets: biorthogonal wavelets
-
Comparison between different wavelets (cont.)
Comparisons in the number of operations required (for the analysis and reconstruction filters, respectively)
As the computational complexity increases, the information-packing ability increases as well
-
Other issues
The number of decomposition levels required
The initial decompositions are responsible for the majority of the data compression (3 levels is enough)
Quantizer design
Introduce an enlarged quantization interval around zero (i.e., the dead zone)
Adapt the size of the quantization interval from scale to scale
Quantization of coefficients at deeper decomposition levels impacts larger areas of the reconstructed image
-
Binary image compression standards
Most of the standards are issued by:
International Standardization Organization (ISO)
Consultative Committee of the International Telephone and Telegraph (CCITT)
CCITT Group 3 and 4 are for binary image compression
Originally designed as facsimile (FAX) coding methods
G3: nonadaptive, 1-D run-length coding
G4: a simplified or streamlined version of G3, 2-D coding only
The coding approach is quite similar to the RAC method
Joint Bilevel Imaging Group (JBIG)
A joint committee of CCITT and ISO
Proposed JBIG1: adaptive arithmetic compression technique (the best average- and worst-case performance available)
Proposed JBIG2: achieves compressions 2 to 4 times greater than JBIG1
-
Continuous-tone image compression -- JPEG
JPEG, wavelet-based JPEG 2000, JPEG-LS
JPEG defines three coding systems:
Lossy baseline coding system
Extended coding system for greater compression, higher precision, or progressive reconstruction applications
Lossless independent coding system
A product or system must include support for the baseline system
Baseline system
Input and output: 8 bits; quantized DCT values: 11 bits
Subdivided into blocks of 8×8 pixels for encoding
Pixels are level shifted by subtracting 128 graylevels
-
JPEG (cont.)
DCT-transformed
Quantized by using the quantization or normalization matrix
The DC DCT coefficient is difference-coded relative to the DC coefficient of the previous block
The non-zero AC coefficients are coded by using a VLC that encodes each coefficient's value and the number of preceding zeros
The user is free to construct custom tables and/or arrays, which may be adapted to the characteristics of the image being compressed
A special EOB (end-of-block) code is added after the last non-zero AC coefficient
-
Huffman coding for the DC and AC coefficients
VLC = base code (category code) + value code
An example: encoding DC = -9
Category 4: -15,...,-8, 8,...,15 (base code = 101)
Total length = 7; value code length = 4
For a category K, an additional K bits are needed as the value code, computed as either the K LSBs of the value (positive) or the K LSBs of the 1's complement of its absolute value (negative): K = 4, value code = 0110
Complete code: 101-0110
An example: encoding AC = (0,-3)
Length of the zero run: 0; AC coefficient value: -3
AC = -3 → category = 2 → 0/2 (Table 8.19) → base code = 01
Value code = 00
Complete code = 0100
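The category and value-code rule above is a small bit manipulation. A sketch (the base-code lookup from the Huffman tables is omitted):

```python
def jpeg_value_code(v):
    # Category K = number of bits of |v|; value code = K LSBs of v if v > 0,
    # else the K LSBs of the one's complement of |v| (the rule on this slide).
    k = abs(v).bit_length()
    bits = v if v > 0 else (~abs(v)) & ((1 << k) - 1)
    return k, format(bits, '0{}b'.format(k))

print(jpeg_value_code(-9))  # (4, '0110')
print(jpeg_value_code(-3))  # (2, '00')
```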
-
A JPEG coding example
Original 8×8 block:
52 55 61 66 70 61 64 73
63 59 66 90 109 85 69 72
62 59 68 113 144 104 66 73
63 58 71 122 154 106 70 69
67 61 68 104 126 88 68 70
79 65 60 70 77 68 58 75
85 71 64 59 55 61 65 83
87 79 69 68 65 76 78 94
Level-shifted (each pixel minus 128):
-76 -73 -67 -62 -58 -67 -64 -55
-65 -69 -62 -38 -19 -43 -59 -56
-66 -69 -60 -15 16 -24 -62 -55
-65 -70 -57 -6 26 -22 -58 -59
-61 -67 -60 -24 -2 -40 -60 -58
-49 -63 -68 -58 -51 -65 -70 -53
-43 -57 -64 -69 -73 -67 -63 -45
-41 -49 -59 -60 -63 -52 -50 -34
DCT-transformed:
-415 -29 -62 25 55 -20 -1 3
7 -21 -62 9 11 -7 -6 6
-46 8 77 -25 -30 10 7 -5
-50 13 35 -15 -9 6 0 3
11 -8 -13 -2 -1 1 -4 1
-10 1 3 -3 -1 0 2 -1
-4 -1 2 -1 2 -3 1 -2
-1 1 -1 -2 -1 -1 0 -1
-
Quantized :

-26 -3 -6 2 2 0 0 0
1 -2 -4 0 0 0 0 0
-3 1 5 -1 -1 0 0 0
-4 1 2 -1 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
1-D zigzag scanning : [-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB]
Symbols to be coded : (DPCM = -9, assumed), (0,-3), (0,1), (0,-3), (0,-2), (0,-6), (0,2), (0,-4), (0,1), (0,-4), (0,1), (0,1), (0,5), (1,2), (2,-1), (0,2), (5,-1), (0,-1), EOB
Final codes : 1010110, 0100, 001, 0100, 0101, 100001, 0110, 100011, 001, 100011, 001, 001, 100101, 11100110, 110110, 0110, 11110100, 000, 1010
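The zigzag scan and (run, value) symbol formation for this example can be reproduced directly from the quantized block (a sketch; the ZRL code for zero runs longer than 15 is omitted, since the longest run here is 5):

```python
QUANTIZED = [
    [-26, -3, -6,  2,  2, 0, 0, 0],
    [  1, -2, -4,  0,  0, 0, 0, 0],
    [ -3,  1,  5, -1, -1, 0, 0, 0],
    [ -4,  1,  2, -1,  0, 0, 0, 0],
    [  1,  0,  0,  0,  0, 0, 0, 0],
    [  0,  0,  0,  0,  0, 0, 0, 0],
    [  0,  0,  0,  0,  0, 0, 0, 0],
    [  0,  0,  0,  0,  0, 0, 0, 0],
]

def zigzag(block):
    """Read an 8x8 block in JPEG zigzag order (anti-diagonals with
    alternating direction)."""
    idx = sorted(((i, j) for i in range(8) for j in range(8)),
                 key=lambda p: (p[0] + p[1],
                                p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [block[i][j] for i, j in idx]

def ac_symbols(ac):
    """(zero-run, value) pairs for the AC coefficients, EOB-terminated."""
    last = max((k for k, v in enumerate(ac) if v), default=-1)
    out, run = [], 0
    for v in ac[:last + 1]:
        if v == 0:
            run += 1
        else:
            out.append((run, v))
            run = 0
    return out + ["EOB"]

zz = zigzag(QUANTIZED)
print(ac_symbols(zz[1:]))   # reproduces the symbol list above, after the DC term
```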
JPEG 2000
JPEG-2000
Increased flexibility in the access of compressed data
Portions of a JPEG 2000 compressed image can be extracted for re-transmission, storage, display, and editing
Procedures
DC level shift by subtracting 128
Optionally decorrelated by using a reversible or non-reversible linear combination of components (e.g., the RGB to YCrCb component transformation), i.e., both RGB-based and YCrCb-based JPEG-2000 coding are allowed
Y0(x,y) =  0.299 I0(x,y)   + 0.587 I1(x,y)   + 0.114 I2(x,y)
Y1(x,y) = -0.16875 I0(x,y) - 0.33126 I1(x,y) + 0.5 I2(x,y)
Y2(x,y) =  0.5 I0(x,y)     - 0.41869 I1(x,y) - 0.08131 I2(x,y)

Y1 and Y2 are difference images, highly peaked around zero
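The component transform is a plain 3x3 linear map per pixel; a minimal sketch of the lossy (non-reversible) transform given above (the reversible variant in JPEG 2000 uses integer arithmetic instead):

```python
def rgb_to_ycbcr(i0, i1, i2):
    """Non-reversible RGB -> YCrCb component transform (per pixel).
    Y1 and Y2 are color-difference components, peaked around zero."""
    y0 =  0.299   * i0 + 0.587   * i1 + 0.114   * i2
    y1 = -0.16875 * i0 - 0.33126 * i1 + 0.5     * i2
    y2 =  0.5     * i0 - 0.41869 * i1 - 0.08131 * i2
    return y0, y1, y2

# a neutral gray pixel has (nearly) zero chrominance
print(rgb_to_ycbcr(128, 128, 128))
```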
JPEG-2000 (cont)
Optionally divided into tiles (rectangular arrays of pixels which can be extracted and reconstructed independently), providing a mechanism for accessing a limited region of a coded image
2-D DWT by using a biorthogonal 5-3 coefficient filter (lossless) or a 9-7 coefficient filter (lossy) produces a low-resolution approximation and horizontal, vertical, and diagonal frequency components (LL, HL, LH, HH bands)
9-7 filter for lossy DWT : [coefficient table not recoverable from the transcript]

5-3 filter for lossless DWT :
i      H0 (lowpass)    H1 (highpass)
0          6/8               1
+/-1       2/8             -1/2
+/-2      -1/8
JPEG-2000 (cont)
A fast DWT algorithm or a lifting-based approach is often used
Iterative decomposition of the approximation part to obtain :
Important visual information is concentrated in a few coefficients
Difference from subband decomposition :
The standard does not specify the number of scales (i.e., the number of decomposition levels) to be computed
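The lifting realization of the reversible 5-3 analysis/synthesis can be sketched in 1-D as follows (a minimal illustration with one decomposition level, an even-length input, and a simplified symmetric right-edge extension rather than the standard's full boundary handling):

```python
def fwd_53(x):
    """One level of the reversible (integer) 5-3 DWT via lifting."""
    n = len(x)
    assert n % 2 == 0
    # predict step: d[k] = x[2k+1] - floor((x[2k] + x[2k+2]) / 2)
    d = [x[2*k+1] - (x[2*k] + x[2*k+2 if 2*k+2 < n else n-2]) // 2
         for k in range(n // 2)]
    # update step: s[k] = x[2k] + floor((d[k-1] + d[k] + 2) / 4)
    s = [x[2*k] + (d[max(k-1, 0)] + d[k] + 2) // 4 for k in range(n // 2)]
    return s, d

def inv_53(s, d):
    """Inverse lifting: undo the update step, then the predict step."""
    n = 2 * len(s)
    x = [0] * n
    for k in range(len(s)):
        x[2*k] = s[k] - (d[max(k-1, 0)] + d[k] + 2) // 4
    for k in range(len(s)):
        right = x[2*k+2] if 2*k+2 < n else x[n-2]
        x[2*k+1] = d[k] + (x[2*k] + right) // 2
    return x

# losslessness check: the lifting steps invert exactly in integer arithmetic
s, d = fwd_53([10, 12, 14, 200, 7, 9, 250, 0])
assert inv_53(s, d) == [10, 12, 14, 200, 7, 9, 250, 0]
```

Iterating `fwd_53` on the approximation part `s` (and, for images, applying it along rows then columns) yields the multi-scale decomposition described above.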
JPEG-2000 (cont)
Coefficient quantization adapted to individual scales and subbands
The number of exponent and mantissa bits should be provided to the decoder on a subband basis, called explicit quantization, or for the LL subband only, called implicit quantization (the remaining subbands are quantized using extrapolated LL subband parameters)

q_b(u,v) = sign[a_b(u,v)] * floor( |a_b(u,v)| / Delta_b )

quantization step size : Delta_b = 2^(R_b - eps_b) * (1 + mu_b / 2^11)

R_b is the nominal dynamic range (in bits) of subband b; eps_b and mu_b are the numbers of bits allocated to the exponent and mantissa of the subband's coefficients
For error-free compression, mu_b = 0, R_b = eps_b, and Delta_b = 1
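The two formulas above transcribe directly into code (a sketch; the symbol names `Rb`, `eps_b`, `mu_b` follow the notation above):

```python
import math

def step_size(Rb, eps_b, mu_b):
    """Delta_b = 2**(Rb - eps_b) * (1 + mu_b / 2**11)."""
    return 2.0 ** (Rb - eps_b) * (1.0 + mu_b / 2.0 ** 11)

def quantize(a, delta):
    """q = sign(a) * floor(|a| / delta): a deadzone quantizer."""
    sign = (a > 0) - (a < 0)
    return sign * math.floor(abs(a) / delta)

print(step_size(8, 8, 0))   # 1.0  (error-free case: mu_b = 0, Rb = eps_b)
print(quantize(-9.7, 2.0))  # -4
```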
JPEG-2000 (cont)
Quantized coefficients are arithmetically coded on a bit-plane basis
The coefficients are arranged into rectangular blocks called code blocks, which are individually coded a bit plane at a time
Starting from the MSB bit plane with nonzero elements down to the LSB bit plane
Encoding by using the context-based arithmetic coding method
The coding outputs from each code block are grouped to form layers
The resulting layers are finally partitioned into packets
The encoder can encode M_b bit planes for a particular subband while the decoder decodes only N_b of them, due to the embedded nature of the code stream
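Bit-plane decomposition itself is simple; a sketch of splitting magnitudes into MSB-first planes (the context modeling and arithmetic coding that operate on these planes are beyond this illustration):

```python
def bit_planes(coeffs):
    """Magnitude bit planes of quantized coefficients, MSB plane first
    (signs are coded separately in the real coder)."""
    mags = [abs(v) for v in coeffs]
    n = max(mags).bit_length()
    return [[(m >> p) & 1 for m in mags] for p in range(n - 1, -1, -1)]

print(bit_planes([5, -3, 0, 1]))  # [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
```

Truncating the stream after the first few planes simply coarsens every magnitude at once, which is what makes the resulting code stream embedded.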
[Figure: each code block's embedded bit-stream (code-blocks 1-7) is cut into contributions to quality layers 1-5, some contributions empty; packets group these contributions by decomposition level (N-1, N, N+1), giving resolution scalability along one axis and R-D/quality scalability along the other]
Video compression -- motion-compensated transform coding
Standards
Video teleconferencing standard (H.261, H.263, H.263+, H.320, H.264)
Multimedia standard (MPEG-1/2/4)
A combination of predictive coding (in temporal domain) and
transform coding (in spatial domain)
Estimate (predict) the image by simply propagating pixels in the previous frame along their motion trajectories -- motion compensation
The success depends on the accuracy, speed, and robustness of the displacement estimator -- motion estimation
Motion estimation is based on each image block (16x16 or 8x8 pixels)
Motion-compensated transform coding -- encoder
Motion-compensated transform coding -- decoder
[Decoder block diagram: compressed bit stream -> BUF -> VLD -> IT -> adder -> reconstructed frame; an MC predictor driven by the decoded motion vectors feeds the adder's second input]
IT : inverse DCT transform
VLD : variable length decoding
2-D motion estimation techniques
Assumptions :
Rigid translational movements only
Uniform illumination in time and space
Methods : Optical flow approach
Pel-recursive approach
Block-matching approach
Different from motion estimation in target tracking
Causal operations (forward prediction, no future frames at decoder)
Goal : reproduce pictures with minimum distortion, not necessarily estimate accurate motion parameters
Block-Matching (BM) motion estimation
Find the best match between current image block and
candidates in the previous frame by minimizing the following cost
One motion vector (MV) for each block between successive frames
[Figure: the block in frame k is matched against candidate blocks inside a search window in frame k-1]

The motion vector (mv_x, mv_y) minimizes the cost

D(mv_x, mv_y) = sum over (x,y) in block of | x_n(x,y) - x_(n-1)(x + mv_x, y + mv_y) |
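A direct full-search implementation of this matching criterion (SAD cost; frames are plain 2-D lists and the names are illustrative):

```python
def sad(cur, ref, bx, by, dx, dy, B=8):
    """Sum of absolute differences between the BxB block at (bx, by) in
    the current frame and the (dx, dy)-displaced block in the reference."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(B) for x in range(B))

def full_search(cur, ref, bx, by, r=7, B=8):
    """Exhaustively test every displacement in a +/- r window; return the
    motion vector with minimum SAD (ties: first found)."""
    h, w = len(ref), len(ref[0])
    best, mv = None, (0, 0)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if 0 <= bx + dx and bx + dx + B <= w and \
               0 <= by + dy and by + dy + B <= h:
                c = sad(cur, ref, bx, by, dx, dy, B)
                if best is None or c < best:
                    best, mv = c, (dx, dy)
    return mv, best

# synthetic check: cur is ref shifted by (3, 2), so the MV should be (3, 2)
ref = [[(7 * x + 13 * y) % 256 for x in range(24)] for y in range(24)]
cur = [[ref[y + 2][x + 3] for x in range(16)] for y in range(16)]
print(full_search(cur, ref, 4, 4))  # ((3, 2), 0)
```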
BM motion estimation techniques
Full search method
Fast Search (Reduced Search)
Usually a multistage search procedure that calculates fewer search points
Local-(or Sub-)optima in general
Evaluation : number of search points, noise immunity
Existing algorithms :
Three-step method
Four-step method
Log-search method
Hierarchical matching method
Prediction search method
Kalman prediction method
Three-step search
The full search examines 15x15 = 225 positions; the three-step search examines only 9+8+8 = 25 positions (assuming an MV range of -7 to +7)
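A sketch of the three-step search for a +/- 7 range (the 3x3 pattern's spacing starts at 4 and halves each stage; `cost` is any block-matching cost, e.g. a SAD; the result is only locally optimal in general):

```python
def three_step_search(cost, r=7):
    """Three-step search: a 3x3 pattern whose spacing halves each stage
    (4, 2, 1 for r = 7), re-centered on the best point found so far.
    Returns the motion vector and the number of distinct cost evaluations
    (9 + 8 + 8 = 25, versus 15 * 15 = 225 for a full search)."""
    cache = {}
    def c(p):                      # memoize so shared points count once
        if p not in cache:
            cache[p] = cost(*p)
        return cache[p]
    center, step = (0, 0), (r + 1) // 2
    while step >= 1:
        pts = [(center[0] + dx, center[1] + dy)
               for dy in (-step, 0, step) for dx in (-step, 0, step)]
        center = min(pts, key=c)
        step //= 2
    return center, len(cache)

# unimodal toy cost with its minimum at (5, -3)
mv, evals = three_step_search(lambda dx, dy: (dx - 5) ** 2 + (dy + 3) ** 2)
print(mv, evals)  # (5, -3) 25
```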
Predictive search
Explore the similarity of MVs between adjacent macroblocks
Obtain initially predicted MV from neighboring macroblocks by
averaging
Prediction followed by a small-area detailed search
Significantly reduces the CPU time (by 4~10X)
[Figure: the current block's small search area centered on the predicted MV, with neighboring MVs MV1, MV2, MV3]

MV_predict = (MV1 + MV2 + MV3) / 3
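The prediction-plus-refinement idea can be sketched as follows (names are illustrative; real coders often use the median rather than the mean of the neighbor MVs):

```python
def predict_mv(mv1, mv2, mv3):
    """Initial MV: integer average of the three neighboring blocks' MVs."""
    return (round((mv1[0] + mv2[0] + mv3[0]) / 3),
            round((mv1[1] + mv2[1] + mv3[1]) / 3))

def predictive_search(cost, mv1, mv2, mv3, r=1):
    """Small-area detailed search (+/- r) around the predicted MV."""
    px, py = predict_mv(mv1, mv2, mv3)
    cands = [(px + dx, py + dy)
             for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
    return min(cands, key=lambda p: cost(*p))

# toy cost with its minimum at (3, 1); neighbors predict exactly that
toy = lambda dx, dy: (dx - 3) ** 2 + (dy - 1) ** 2
print(predictive_search(toy, (4, 0), (2, 2), (3, 1)))  # (3, 1)
```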
Fractional-pixel motion estimation
Question : what happens if motion vectors are real numbers instead of integers ?
Answer : find the integer motion vector first, then interpolate from surrounding pixels for a refined search. The most popular case is half-pixel accuracy (i.e., 1 out of 8 positions for refinement). 1/3-pixel accuracy will be proposed in the H.263L standard.
Half-pixel accuracy often leads to a match with less residue (higher reconstruction quality), hence decreases the bit rate required in the following encoding.
[Figure: the integer MV position (mv_x, mv_y) surrounded by its 8 candidate half-pixel MV positions]
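Half-pel refinement needs interpolated reference samples; a bilinear sketch in half-pel coordinates (hx and hy are twice the pixel coordinates; names are illustrative):

```python
def half_pel(ref, hx, hy):
    """Bilinearly sample the reference frame at position (hx / 2, hy / 2)."""
    x, y = hx // 2, hy // 2
    fx, fy = (hx % 2) / 2.0, (hy % 2) / 2.0
    x1, y1 = min(x + 1, len(ref[0]) - 1), min(y + 1, len(ref) - 1)
    return ((1 - fx) * (1 - fy) * ref[y][x] + fx * (1 - fy) * ref[y][x1] +
            (1 - fx) * fy * ref[y1][x] + fx * fy * ref[y1][x1])

def refine_half_pel(cost_half, mvx, mvy):
    """Test the 8 half-pel positions around the integer MV (arguments to
    cost_half are in half-pel units) and keep the best."""
    cands = [(2 * mvx + dx, 2 * mvy + dy)
             for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return min(cands, key=lambda p: cost_half(*p))

grid = [[0, 2], [4, 6]]
print(half_pel(grid, 1, 1))  # 3.0  (average of the 4 integer neighbors)
```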
Evaluation of BM motion estimation
Advantages :
Straightforward, regular operation, and promising for VLSI chip design
Robustness (no differential operations as in optical flow approach)
Disadvantages :
Cannot process several moving objects in a block (e.g., around
occlusion boundaries)
Improvements :
Post-processing of images to eliminate blocking effect
Interpolation of images or averaging of motion vectors to subpixel
accuracy
Image Prediction structure
I/P frames only (H.261, H.263)
I/P/B frames (MPEG-x)

[Figure: I P P P P ... chain with forward prediction from each frame to the next;
I B B B P B B B P B B B P B ... with forward prediction between the I/P anchors and bidirectional prediction for the B frames]

I : intra-frame, P : predictive frame, B : bi-directional frame