SVD and digital image processing – umu.se
TRANSCRIPT
Image Compression
Image compression addresses the problem
of reducing the amount of data required to
represent a digital image. The underlying
basis of the reduction process is the removal
of redundant data:
transforming a 2-D pixel array into a
statistically uncorrelated data set
Redundancy
Spatial redundancy
Similarities between adjacent pixels
250 252 249 → 250 2 -3 (first value followed by differences)
Temporal redundancy
Similarities between pixels in adjacent frames
250 252 249 → 250 2 -1
Elements of Information theory
A random event E that occurs with probability
P(E) is said to contain

I(E) = log(1/P(E)) = -log P(E)

units of information. The quantity I(E) is often
called self-information. The amount of self-
information is inversely related to the
probability of E. With base-2 logarithms,
log2 x = log10 x / log10 2, and I(E) is measured in bits.
Example
Assume that the grading levels are
A, B, C, D, E, F
and that these are equally distributed.
How much information do you have if you know that you don't have grade F?
(5/6) · (-log2(5/6)) ≈ 0.22 bits
How much information is needed to know your exact grade?
log2(6) - (1/6) · (-log2(1/6)) ≈ 2.15 bits
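The two quantities above can be checked numerically; this is a small sketch reproducing the slide's arithmetic (base-2 logarithms, grades assumed equally likely):

```python
import math

p_not_f = 5 / 6    # probability of "not grade F" (six equally likely grades)
p_f = 1 / 6

# First quantity: probability-weighted self-information of "not F"
info_not_f = p_not_f * -math.log2(p_not_f)
print(round(info_not_f, 2))    # 0.22 bits

# Second quantity: total uncertainty minus the "F" term
info_exact = math.log2(6) - p_f * -math.log2(p_f)
print(round(info_exact, 2))    # 2.15 bits
```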
Entropy
Entropy = average information per source
output:

H(z) = -sum_{j=1}^{J} P(a_j) · log2 P(a_j)

The source
The set of source symbols {a1, a2, …, aJ}
with probabilities z = [P(a1), P(a2), …, P(aJ)],
where sum_{j=1}^{J} P(a_j) = 1
The finite ensemble (A, z) describes the
source completely
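A minimal sketch of the entropy formula; the probability vector below is the five-symbol example used later in the Huffman slides:

```python
import math

def entropy(probs):
    """H(z) = -sum_j P(a_j) * log2 P(a_j), in bits per source symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

z = [0.2, 0.1, 0.2, 0.3, 0.2]    # probability vector of the source
print(round(entropy(z), 2))       # 2.25 bits
```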
The information channel
When self-information is transferred between
an information source and a user, the source
is said to be connected to the user by an
information channel
The noiseless coding theorem
It is possible to make L_avg/n arbitrarily close to
H(z) by coding infinitely long extensions of
the source:

lim_{n→∞} L_avg,n / n = H(z)
Data vs Information
Data are the means by which information is
conveyed. Various amounts of data may be
used to represent the same amount of
information
Data redundancy
Compression ratio: C_R = n1 / n2
Relative data redundancy: R_D = 1 - 1/C_R
where n1 and n2 are the number of information-carrying units in the two data sets
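The two definitions can be sketched directly; the 256 × 256 8-bit image and the 8:1 compressed size below are made-up illustration values, not from the slides:

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2."""
    return n1 / n2

def relative_redundancy(c_r):
    """R_D = 1 - 1/C_R."""
    return 1 - 1 / c_r

n1 = 256 * 256 * 8        # bits in the original 8-bit 256x256 image (assumed)
n2 = n1 // 8              # bits after compression (assumed)
cr = compression_ratio(n1, n2)
print(cr, relative_redundancy(cr))   # 8.0 0.875
```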
Redundancy
Three basic data redundancies can be
identified
Coding redundancy
Interpixel redundancy
Psychovisual redundancy
Coding Redundancy
The average length of the code words
assigned to various gray-level values is found
by summing the product of the number of bits
used to represent each gray level and the
probability that the gray level occurs
L_avg = sum_{k=0}^{L-1} l(r_k) · p_r(r_k)
Variable–length coding
Assigning fewer bits to the more probable gray levels than to the less probable ones achieves data compression
Code 1: L_avg = 3 bits
Code 2: L_avg = 2.7 bits
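As a sketch, L_avg can be computed for a fixed 3-bit code and a variable-length code; the probabilities and code lengths below are assumed from the classic 8-level textbook example that yields exactly 3 and 2.7 bits:

```python
def avg_code_length(probs, lengths):
    """L_avg = sum_k l(r_k) * p_r(r_k)."""
    return sum(l * p for l, p in zip(lengths, probs))

# Gray-level probabilities (assumed; the classic 8-level example)
probs = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
print(round(avg_code_length(probs, [3] * 8), 2))                   # fixed 3-bit code: 3.0
print(round(avg_code_length(probs, [2, 2, 2, 3, 4, 5, 6, 6]), 2))  # variable-length: 2.7
```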
Interpixel Redundancy
The value of any given pixel can be reasonably predicted from the value of its neighbors – the information carried by individual pixels is relatively small
The differences between adjacent pixels can be used to represent an image
An image can be efficiently represented by the value and length of its constant gray level runs (gi,wi)
Fidelity Criteria
Objective fidelity criteria, e.g. the root-mean-square error

e_rms = sqrt( (1/MN) · sum_{x,y} [f̂(x,y) - f(x,y)]^2 )

Subjective fidelity criteria, e.g. the rating scale:
Excellent, Fine, Passable, Marginal, Inferior, Unusable
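The objective criterion can be sketched as follows; the two 2 × 2 "images" are made-up values for illustration:

```python
import math

def rms_error(f, f_hat):
    """e_rms = sqrt( (1/MN) * sum over x,y of [f_hat(x,y) - f(x,y)]^2 )."""
    m, n = len(f), len(f[0])
    total = sum((f_hat[x][y] - f[x][y]) ** 2
                for x in range(m) for y in range(n))
    return math.sqrt(total / (m * n))

original      = [[250, 252], [249, 251]]   # assumed example values
reconstructed = [[250, 250], [250, 251]]
print(round(rms_error(original, reconstructed), 3))   # 1.118
```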
Image Compression Model
Error-free Compression
Provides compression ratios of 2 to 10
Equally applicable to binary and gray-scale
images
Composed of two independent operations
Alternative representation of image which reduces
interpixel redundancy
Coding that reduces coding redundancy
Huffman
Optimal for fixed n
Smallest possible number of code symbols
per source symbol
Instantaneous code
Uniquely decodable
Binary Huffman codes
Assume that you have 5 symbols s1 … s5
The probabilities are
pi = 0.2 0.1 0.2 0.3 0.2
Sort them in order based on their probability
0.3 0.2 0.2 0.2 0.1
Join the two smallest probabilities (0.2 + 0.1 = 0.3) and sort
again
0.3 0.3 0.2 0.2
Binary Huffman codes
Repeat until only one value remains
0.3 0.3 0.2 0.2
0.4 0.3 0.3
0.6 0.4
1.0
Binary Huffman codes
Go backwards and assign codes
0.6 0.4 → 1 0
0.4 0.3 0.3 → 0 10 11
0.3 0.3 0.2 0.2 → 10 11 00 01
0.3 0.2 0.2 0.2 0.1 → 10 00 01 110 111
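The merge-and-sort procedure above is what a heap-based Huffman construction does; below is a minimal sketch (the 0/1 assignment at each merge is arbitrary, so the code words may differ from the slide while the lengths and the average stay the same):

```python
import heapq
import itertools

def huffman_code(probs):
    """Binary Huffman code: repeatedly merge the two least probable
    groups, prefixing 0/1; returns {symbol index: code word}."""
    tie = itertools.count()                       # keeps heap tuples comparable
    heap = [(p, next(tie), {i: ""}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)           # two smallest probabilities
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

probs = [0.2, 0.1, 0.2, 0.3, 0.2]                 # s1 ... s5 from the slide
codes = huffman_code(probs)
avg = sum(probs[s] * len(w) for s, w in codes.items())
print(round(avg, 2))                              # 2.3 bits per symbol
```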
Average word length

N̄ = sum_{i=1}^{M} N_i P_i = 2·0.3 + 2·0.2 + 2·0.2 + 3·0.2 + 3·0.1 = 2.3 bits
Binary Huffman codes
For a source with probabilities 0.4, 0.3, 0.1, 0.1, 0.06 and 0.04, the Huffman code word lengths become 1, 2, 3, 4, 5 and 5:

N̄ = sum_{i=1}^{M} N_i P_i = 1·0.4 + 2·0.3 + 3·0.1 + 4·0.1 + 5·0.06 + 5·0.04 = 2.2 bits
Entropy and efficiency
Entropy:

H(X) = sum_{i=1}^{M} P_i · log2(1/P_i) = 2.11 bits

Coding efficiency:

η = 100 · H(X) / N̄ = 100 · 2.11 / 2.21 ≈ 95.5 %
LZW-Coding Lempel-Ziv-Welch
Error-free compression technique that also
makes use of interpixel redundancy
Assigns fixed-length codes to variable-length
sequences of source symbols
Requires no a priori knowledge of the
probability of occurrence of the symbols to be
encoded
LZW-Coding Lempel-Ziv-Welch
Integrated into mainstream imaging file formats
Graphics interchange format – GIF
Tagged image file format – TIFF
Portable document format – PDF
Example: the pixel sequence
39 39 126 126 39 39 126 126 39 39 126 126 39 39 126 126
is encoded as
39 39 126 126 256 258 260 259 257 126
LZW
Coding dictionary (code book) is created
while data are being encoded
LZW decoder builds an identical
decompression dictionary as it decodes the
data stream
Flush the code book
When the codebook is full
When coding is inefficient
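A minimal sketch of the LZW encoder described above (8-bit source symbols, new dictionary entries from code 256 upward), reproducing the slide's example sequence:

```python
def lzw_encode(pixels):
    """LZW: the dictionary is built while the data are being encoded;
    fixed-length codes stand for variable-length pixel sequences."""
    dictionary = {(i,): i for i in range(256)}   # all single 8-bit values
    next_code = 256
    result, current = [], ()
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate                  # keep extending the match
        else:
            result.append(dictionary[current])   # emit code for longest match
            dictionary[candidate] = next_code    # add the new sequence
            next_code += 1
            current = (p,)
    if current:
        result.append(dictionary[current])
    return result

row = [39, 39, 126, 126] * 4    # the 4x4 example image from the slide
print(lzw_encode(row))          # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
```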
Bit Plane coding
Based on the concept of decomposing a
multilevel image into a series of binary
images and compressing each binary image
via one of several well-known binary
compression methods
An alternative decomposition approach is to
start from a Gray code representation, in which
successive gray levels differ in only one bit
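The binary-to-Gray conversion can be sketched in one line; the 127 → 128 transition illustrates why Gray-coded bit planes compress better (one bit flips instead of eight):

```python
def to_gray(g):
    """Convert a binary-coded gray level to its Gray code:
    successive levels differ in exactly one bit."""
    return g ^ (g >> 1)

# 127 -> 128 flips all 8 bits in natural binary, but only 1 in Gray code
print(format(to_gray(127), "08b"))   # 01000000
print(format(to_gray(128), "08b"))   # 11000000
```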
Constant area coding CAC
The image is divided into areas of size p × q
that are classified as all white, all black, or mixed
Example: white = 0, black = 10, mixed = 11 + pixel values
If dominantly white:
Example: white = 0, black or mixed = 1 + pixel values
Run-length coding
Developed in the 1950s – a standard compression approach in facsimile coding
Example – white = 1, black = 0 – start with the color of the first pixel: 0 1 3 2 1 3
Example – always start with a white run: 0 3 2 1 3
Additional compression – variable-length coding of the run lengths
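A sketch of run-length coding for a binary row, using the fax convention that the sequence starts with a (possibly zero-length) white run; the example row is chosen so that it reproduces the slide's sequence 0 1 3 2 1 3:

```python
def run_lengths(row, first_symbol=1):
    """Run lengths of a binary row; the sequence starts with the length
    of a `first_symbol` run (white = 1), so a row that begins in black
    starts with a zero-length white run."""
    runs, current, count = [], first_symbol, 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)          # close the current run
            current, count = pixel, 1
    runs.append(count)
    return runs

row = [0, 1, 1, 1, 0, 0, 1, 0, 0, 0]    # black pixel first
print(run_lengths(row))                  # [0, 1, 3, 2, 1, 3]
```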
Lossless Predictive Coding
Does not require decomposition of an image into a collection of bitplanes
Based on eliminating the interpixel redundancies of closely spaced pixels by extracting and coding only the new information in each pixel
Contains encoder, decoder and predictor
The output of the predictor is rounded to the nearest integer
Lossless Predictive Coding
Prediction error

e_n = f_n - f̂_n

is coded using a variable-length code. The decoder reconstructs

f_n = e_n + f̂_n

The predictor uses m previous pixels:

f̂(x, y) = round[ sum_{i=1}^{m} α_i · f(x, y - i) ]
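The encoder/decoder pair above can be sketched with a first-order previous-pixel predictor (m = 1, α1 = 1); the row values are taken from the earlier redundancy example:

```python
def predictive_encode(row, alpha=(1.0,)):
    """Linear predictor f̂(n) = round(sum_i alpha_i * f(n - i));
    only the prediction error e(n) = f(n) - f̂(n) is coded."""
    m = len(alpha)
    errors = list(row[:m])               # first m pixels sent as-is
    for n in range(m, len(row)):
        pred = round(sum(a * row[n - 1 - i] for i, a in enumerate(alpha)))
        errors.append(row[n] - pred)
    return errors

def predictive_decode(errors, alpha=(1.0,)):
    """Invert the encoder: f(n) = e(n) + f̂(n)."""
    m = len(alpha)
    row = list(errors[:m])
    for n in range(m, len(errors)):
        pred = round(sum(a * row[n - 1 - i] for i, a in enumerate(alpha)))
        row.append(errors[n] + pred)
    return row

row = [250, 252, 249, 250, 250]
e = predictive_encode(row)
print(e)                                  # [250, 2, -3, 1, 0]
print(predictive_decode(e) == row)        # True
```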
Lossy Predictive Compression
Transform Coding
A compression technique that is based on
modifying the transform of an image
For most natural images a significant number
of the coefficients have small magnitudes and
can be coarsely quantized or discarded with
little image distortion
Transform Coding
Sub image decomposition
Transformation
Quantization
Coding
Adaptive transform coding or nonadaptive transform coding
g(x,y,u,v) = forward transformation kernel
h(x,y,u,v) = inverse transformation kernel

T(u,v) = sum_{x=0}^{N-1} sum_{y=0}^{N-1} f(x,y) · g(x,y,u,v)

f(x,y) = sum_{u=0}^{N-1} sum_{v=0}^{N-1} T(u,v) · h(x,y,u,v)
Fourier Transform
g(x,y,u,v) = e^{-j2π(ux+vy)/N}

h(x,y,u,v) = (1/N^2) · e^{j2π(ux+vy)/N}
Walsh-Hadamard Transform - WHT

g(x,y,u,v) = h(x,y,u,v) = (1/N) · (-1)^{ sum_{i=0}^{m-1} [ b_i(x)·p_i(u) + b_i(y)·p_i(v) ] }
Discrete Cosine Transform - DCT

g(x,y,u,v) = h(x,y,u,v) = α(u)·α(v) · cos[ (2x+1)uπ / 2N ] · cos[ (2y+1)vπ / 2N ]

α(u) = sqrt(1/N) for u = 0
α(u) = sqrt(2/N) for u = 1, 2, …, N-1
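A 1-D sketch of the DCT kernel above (the 2-D kernel is separable, so rows and columns can be transformed independently); the sample row is a made-up block row:

```python
import math

def dct_alpha(u, n):
    """alpha(u): sqrt(1/N) for u = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct_kernel(x, u, n):
    return dct_alpha(u, n) * math.cos((2 * x + 1) * u * math.pi / (2 * n))

def dct_1d(f):
    n = len(f)
    return [sum(f[x] * dct_kernel(x, u, n) for x in range(n)) for u in range(n)]

def idct_1d(t):
    # the kernel is orthonormal, so the inverse reuses it unchanged
    n = len(t)
    return [sum(t[u] * dct_kernel(x, u, n) for u in range(n)) for x in range(n)]

f = [52, 55, 61, 66, 70, 61, 64, 73]    # one row of an 8x8 block (assumed values)
T = dct_1d(f)                            # energy packs into low-frequency terms
recon = idct_1d(T)
print(all(abs(a - b) < 1e-9 for a, b in zip(f, recon)))   # True
```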
JPEG - Sequential baseline system
Limited to 8-bit words
DCT values restricted to 11 bits
DCT computation, quantization, variable-length coding
Subimages 8 × 8, processed left to right, top to bottom
Image size selection
Wavelet Coding
The principal difference between wavelet
coding and transform coding is the omission
of the subimage processing stage
JPEG2000
Video Compression Standards
Teleconference H.261 H.262 H.263 H.230
Multimedia video MPEG-1 MPEG-2 MPEG-4
MPEG
MPEG's main components are:
Block (8×8 pixels)
Macro block (2×2 blocks = 16×16 pixels)
Slice (one row of macro blocks)
Picture (an entire video frame)
Group of pictures (GOP)
Video sequence (one or more GOPs)
MPEG
[Figure: an 8×8 block, a 16×16 macro block and a slice of macro blocks]
MPEG – YUV compression
4:2:0 or 4:1:1 common in consumer products (DV)
4:2:2 common in professional products (DVCPro)
4:4:4 is rarely used – gives no visible improvement compared with 4:2:2
[Figure: Y/U/V sampling patterns for 4:2:0, 4:2:2, 4:4:4 and 4:1:1]
MPEG – Block compression
The same as JPEG: each 8×8 block is transformed from the spatial domain to the frequency domain with the DCT and then quantized. After quantization most high-frequency coefficients are zero, so a zig-zag scan of the coefficients produces long runs of zeros, which are run-length encoded (RLE) and then Huffman coded.
[Figure: example 8×8 blocks before the DCT and after quantization]
MPEG – Temporal compression
Adjacent frames share large similarities
Temporal compression can be achieved in
two ways:
Discarding images (reduce the frame rate)
Through motion estimation and motion vectors
MPEG – Motion Estimation
Calculate the position of the macro block in
the new image
Store the motion vector and the difference in
appearance
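A minimal sketch of exhaustive block matching: for each candidate displacement the sum of absolute differences (SAD) is computed, and the best displacement becomes the motion vector. The tiny 2×2 block and 4×4 frame are made-up illustration values:

```python
def sad(block, frame, x, y):
    """Sum of absolute differences between a block and a frame region."""
    return sum(abs(block[i][j] - frame[x + i][y + j])
               for i in range(len(block)) for j in range(len(block[0])))

def motion_search(block, frame, x0, y0, search=2):
    """Exhaustive block matching: find the displacement (dx, dy) within
    +/- search pixels of (x0, y0) that minimizes the SAD."""
    best = None
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            x, y = x0 + dx, y0 + dy
            if (0 <= x <= len(frame) - len(block)
                    and 0 <= y <= len(frame[0]) - len(block[0])):
                cost = sad(block, frame, x, y)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]             # the motion vector

# A 2x2 macro block that moved one pixel to the right between frames
block = [[10, 20], [30, 40]]
frame = [[0, 10, 20, 0],
         [0, 30, 40, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
print(motion_search(block, frame, 0, 0))   # (0, 1)
```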
MPEG – Group Of Pictures (GOP)
MPEG uses three types of frames,
grouped in a Group Of Pictures (GOP):
I-pictures (Intracoded)
P-pictures (Predictive Coded)
B-pictures (Bidirectionally interpolated)
[Figure: a GOP I B B P B B P, with forward prediction into P-pictures and bidirectional prediction into B-pictures]
MPEG – Data stream
Display order: I B B P B B P → 1 2 3 4 5 6 7
Order in data stream: I P B B P B B → 1 4 2 3 7 5 6
How to represent a face image
DCT Approach (block-based)
[Figure: the image as a weighted sum of basis images, = a·… + b·… + f·… + g·… + x·…]
PCA Approach (full-frame)