Image Processing:Image Compression
Introduction
Compression & Decompression Steps
Information Theory: Entropy
Information Coding
Binary Image Compression
Lossless Gray Level Image Compression
Lossy Compression of Gray Level Images
© 2004 Rolf Ingold, University of Fribourg
Definition of image compression
Image compression is data compression applied to digital images
Its objective is to reduce the quantity of information (number of bytes) used to represent an image:
needs less storage
allows faster transmission
The principle consists of reducing redundancies:
due to data correlation (data redundancy)
due to the use of non-optimal codes (coding redundancy)
Compression: the original image is transformed and encoded into a compressed file
Decompression: the compressed file is decoded and the original image is reconstructed
Two groups of compression methods
Lossless compression methods: all information is preserved
the transformation is reversible
suitable for binary and indexed color images
Lossy compression methods: some information is lost
the transformation is not reversible
compression artifacts are introduced
suitable for natural gray-level and color images such as photos
the main goal is the best image quality at a given compression rate
Compression steps
Transformation performs data decorrelation to reduce data redundancy
Quantization performs approximation by restricting the set of possible values
Encoding assigns optimal codes to eliminate coding redundancy
[Diagram: original image → transformation → quantization → encoding → compressed image; transformation and encoding preserve information, quantization causes the loss of information]
Decompression steps
Decoding restores the values representing the data in the transformed space
Reconstruction recomputes the data of the original image space
For some applications, special requirements may be requested:
progressive display, to preview a low quality image while downloading (stepwise) a better quality version
[Diagram: compressed image → decoding → reconstruction → original image]
Encoding
Goal: encode a sequence of symbols (or values) while minimizing the length of the data
assign a binary code to each symbol
the encoding must be reversible (can be decoded unambiguously)
Types of encoding schemes used:
entropy encoding
Huffman coding
range coding
arithmetic coding (encumbered by patents!)
dictionary based encoding
Information entropy
Entropy is a basic concept of information theory
Formal definition (by Shannon): the entropy of a discrete random event x, with possible states 1..n, is defined as

H(x) = -\sum_{i=1}^{n} p(i) \log_2 p(i)

Entropy is a lower bound on the average number of bits needed to represent one event
entropy is maximal and equal to log2 n if all states have the same probability
entropy decreases as the states' probabilities become more unequal
entropy is minimal and equal to 0 if only one state can occur
example: Bernoulli trial with two states
Entropy example
We consider an image with 8 gray levels Z in the range 0..7 with known distribution PZ; the table below computes its entropy

Z | PZ   | -log2(PZ) | -PZ log2(PZ)
0 | 0.19 | 2.396     | 0.455
1 | 0.25 | 2.000     | 0.500
2 | 0.21 | 2.252     | 0.473
3 | 0.16 | 2.644     | 0.423
4 | 0.08 | 3.644     | 0.292
5 | 0.06 | 4.059     | 0.244
6 | 0.03 | 5.059     | 0.152
7 | 0.02 | 5.644     | 0.113

Entropy: 2.652
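The entropy computation in the table can be checked with a few lines of Python (a sketch; the distribution values are those of the table above):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum of p(i) * log2 p(i), over nonzero p."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Distribution PZ for Z = 0..7, as in the table
pz = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
print(round(entropy(pz), 2))   # 2.65 bits per pixel
```

Summing the unrounded terms gives ≈ 2.651; the table's 2.652 comes from adding the per-row values after rounding them to three decimals.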
Information coding principles
An encoder translates a sequence of symbols (describing the states of an event sequence) into a string of bits
The decoder interprets this string of bits to reconstruct the sequence of symbols
Compression is achieved by using codes of variable lengths:
short codes are used for frequent symbols
long codes are used for rare symbols
Two principles should be observed:
decoding must be unambiguous: this is the case for prefix-free codes (no code is a prefix of another one)
each bit string must be usable: this is achieved if each bit string either has a code as prefix, or is the prefix of a code
Horn addresses of binary trees meet both requirements
Information coding example
Evaluation of a fixed length code (code 1) and a variable length code (code 2) used to represent the gray levels of the previous example

Z | Freq | code 1 | l1 | Freq·l1 | code 2 | l2 | Freq·l2
0 | 19   | 000    | 3  | 57      | 11     | 2  | 38
1 | 25   | 001    | 3  | 75      | 01     | 2  | 50
2 | 21   | 010    | 3  | 63      | 10     | 2  | 42
3 | 16   | 011    | 3  | 48      | 001    | 3  | 48
4 | 8    | 100    | 3  | 24      | 0001   | 4  | 32
5 | 6    | 101    | 3  | 18      | 00001  | 5  | 30
6 | 3    | 110    | 3  | 9       | 000001 | 6  | 18
7 | 2    | 111    | 3  | 6       | 000000 | 6  | 12

size with code 1: 300 bits | size with code 2: 270 bits
Huffman codes
Huffman codes are optimal prefix-free codes associated with symbols
Huffman codes can be built by the following algorithm using a forest of binary trees and a heap

heap = new Heap();
for each Symbol s do
    heap.add(new Node(s, freq(s), null, null));
while (heap.size() > 1) {
    node1 = heap.extract(); node2 = heap.extract();
    f1 = node1.getFreq(); f2 = node2.getFreq();
    heap.add(new Node(null, f1 + f2, node1, node2));
}
result = heap.extract();

the code of each symbol corresponds to its Horn address in the result tree
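The same algorithm can be sketched in runnable Python with the standard heapq module (frequencies are those of the entropy example; unique tie-breakers keep the heap from ever comparing trees):

```python
import heapq

def huffman_codes(freqs):
    """Build optimal prefix-free codes from a dict {symbol: frequency}.

    Heap entries are (frequency, tie_breaker, tree); a tree is either a
    symbol (leaf) or a (left, right) pair (internal node).
    """
    heap = [(f, i, s) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # two trees of lowest frequency
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1
    codes = {}
    def walk(tree, prefix):               # the code = the path in the tree
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # degenerate one-symbol alphabet
    walk(heap[0][2], "")
    return codes

freqs = {"z0": 19, "z1": 25, "z2": 21, "z3": 16,
         "z4": 8, "z5": 6, "z6": 3, "z7": 2}
codes = huffman_codes(freqs)
total = sum(freqs[s] * len(c) for s, c in codes.items())
print(total)   # 270 bits for the 100 pixels of the example
```

The exact bit assignment may differ between runs of the algorithm (ties can be broken either way), but the total cost is always the optimum, 270 bits here.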
Huffman codes example
The Huffman codes associated with the previous example are built with the binary tree below
z0 → 11   z1 → 01   z2 → 10   z3 → 001
z4 → 0001   z5 → 00001   z6 → 000001   z7 → 000000
[Figure: Huffman tree built from the leaf probabilities p0=0.19, p1=0.25, p2=0.21, p3=0.16, p4=0.08, p5=0.06, p6=0.03, p7=0.02; internal node weights 0.05, 0.11, 0.19, 0.35, 0.40, 0.60, 1.00]
Arithmetic coding
Arithmetic coding is an optimal entropy encoding technique in which the entire message is encoded as a single number between 0 and 1 (with arbitrary precision)
Example:
consider 4 symbols with the following probabilities: A (50%), B (20%), C (20%), D (10%)
each symbol corresponds to an interval: A (0.0-0.5), B (0.5-0.7), C (0.7-0.9), D (0.9-1)
the message ACBADA is encoded as
0.0+0.5*(0.7+0.2*(0.5+0.2*(0.0+0.5*(0.9+0.1*(0.0+0.5*(0)))))) = 0.409
Many companies own patents in the US and other countries on algorithms used for implementing an arithmetic encoder and decoder!
Range coding is an equivalent technique believed to not be covered by any company's patents!
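The nested expression above is just interval narrowing written out; a minimal Python sketch (interval bounds as in the example):

```python
def arith_encode(message, intervals):
    """Successively narrow [low, low + width) by each symbol's sub-interval."""
    low, width = 0.0, 1.0
    for s in message:
        a, b = intervals[s]
        low += width * a
        width *= b - a
    return low   # any number in [low, low + width) identifies the message

intervals = {"A": (0.0, 0.5), "B": (0.5, 0.7), "C": (0.7, 0.9), "D": (0.9, 1.0)}
x = arith_encode("ACBADA", intervals)   # ~ 0.409, as in the example
```

A practical codec also needs the message length (or an end symbol) and fixed-precision integer arithmetic instead of floats, which this sketch omits.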
Dictionary based coders
Dictionary based coders have initially been developed to compress text
Performance is similar to entropy based coders
Principles:
sequences of symbols are encoded according to entries in a dictionary
dictionaries may be static or dynamic
to avoid transmitting the dictionary, it is possible to automatically build a dictionary of previously seen strings
The following coders belong to this family:
LZW: used by the Unix compress utility and GIF files, but protected by several patents
LZ77, LZ78: ancestors of LZW, free of patents
DEFLATE: combining LZ77 and Huffman codes, used for the ZIP and gzip file formats
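DEFLATE is directly available in Python's standard zlib module; a quick round trip on a made-up redundant string shows the lossless compression:

```python
import zlib

data = b"ABABABABABAB" * 100          # highly redundant input
packed = zlib.compress(data, 9)       # DEFLATE: LZ77 matching + Huffman coding
assert zlib.decompress(packed) == data   # lossless: exact round trip
print(len(data), "->", len(packed))
```

The repeated "AB" pattern is turned into short back-references by the LZ77 stage, so the output is a small fraction of the input size.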
Limitation of entropy encoding
Entropy encoding is suitable for removing coding redundancy
Images generally don't have much coding redundancy:
entropy encoding alone does not compress image data significantly!
The role of preliminary transformations is to convert data redundancy into coding redundancy, which can later be eliminated by entropy coding
Binary Image Compression
Types of transformations used for binary images:
run length encoding
scan-line differences
quadtrees
Run-Length Encoding (RLE)
RLE is a very simple form of lossless data compression
Lengths of horizontal runs (sequences of identical pixel values) are stored as single numbers
Example:
Original image using 120 bits
RLE representation: 5, 3, 16, 4, 4, 9, 3, 4, 4, 4, 9, 3, 4, 5, 4, 8, 3, 4, 6, 14, 4, using 21x5 = 105 bits (assuming five bits are used for each number)
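Computing the run lengths of a scan line is a one-liner with itertools.groupby; assuming 24-pixel rows (the run lengths above sum to 120 = 5 rows of 24), the first scan line would be 5 white, 3 black and 16 white pixels:

```python
from itertools import groupby

def run_lengths(row):
    """Lengths of the maximal runs of equal pixel values, left to right."""
    return [len(list(group)) for _, group in groupby(row)]

row = [0] * 5 + [1] * 3 + [0] * 16   # hypothetical first scan line
print(run_lengths(row))              # [5, 3, 16]
```

To decode, the run values themselves must also be recoverable, e.g. by convention that each line starts with a (possibly empty) white run.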
2D-Run-Length Encoding
Principle: apply RLE on scan-line differences
Example:
size of the compressed encoding: 13 + 18 + 11 + 13 + 17 = 72 bits

code table:
ABS → 0, REL → 1, FL → 0000
differences: 0 → 0, -1 → 10, +1 → 110
run lengths: 1 → 001, 2 → 010, 3 → 011, 4 → 100, 5 → 1010, 6 → 1011, 6+... → 11...

encoded scan lines:
ABS, 5, 3, ABS, FL → 0 1010 011 0 0000
REL, -1, 0, ABS, 9, 3, ABS, FL → 1 10 0 0 11011 011 0 0000
REL, 0, 0, REL, 0, 0, ABS, FL → 1 0 0 1 0 0 0 0000
REL, +1, +1, REL, 0, 0, ABS, FL → 1 10 10 1 0 0 0 0000
ABS, 6, 14, ABS, FL → 0 1011 1111010 0 0000
Quadtree Data Structure
A quadtree is a data structure used to recursively partition an image by subdividing it into four quadrants
the process is stopped when a node corresponds to a homogeneous square
[Figure: successive quadtree subdivisions of an image; homogeneous squares are labeled A..S, the subdivision steps are numbered 1..5]
Quadtree compression
Quadtrees can be used for lossless data compression
the quadtree is represented by the list of its nodes in pre-order, which is sufficient to rebuild the tree unambiguously
Example: Original image using 64 bits
quadtree representation: XXWWXWWWBBXWBBWBXWWXBBWWB
using 25x2 = 50 bits (assuming two bits are used for each symbol)
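A pre-order quadtree string can be produced by a short recursive function. This is a sketch; the quadrant visiting order (NW, NE, SW, SE) is an assumption, so the exact string for a given image depends on that convention:

```python
def quadtree(img):
    """Pre-order quadtree string: 'X' = subdivided node, 'W'/'B' = uniform leaf.

    img is a square list of lists whose side is a power of two;
    0 = white, 1 = black. Quadrant order (an assumption): NW, NE, SW, SE.
    """
    flat = [px for row in img for px in row]
    if all(px == flat[0] for px in flat):
        return "WB"[flat[0]]          # homogeneous square: stop subdividing
    n = len(img) // 2
    quads = ([row[:n] for row in img[:n]], [row[n:] for row in img[:n]],
             [row[:n] for row in img[n:]], [row[n:] for row in img[n:]])
    return "X" + "".join(quadtree(q) for q in quads)

print(quadtree([[0, 0], [0, 1]]))   # XWWWB
```

Because the string is a pre-order traversal and every 'X' has exactly four children, the decoder can rebuild the tree without any extra delimiters.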
Lossless Gray Level Image Compression
Lossless gray-level image compression can be performed
by using binary compression on a bit-plane decomposition
by decorrelating adjacent pixels and using entropy coding
Performance is limited!
Bit-plane Decomposition
An image with 256 gray levels can be represented by 8 bit-planes a0, a1, ..., a7
the higher planes (a7, ...) are more compressible than the lower ones
A better compression is achieved with the planes of the Gray code, defined as g_i = a_i ⊕ a_{i+1} (with g_7 = a_7)
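Per pixel, the plane-wise definition g_i = a_i ⊕ a_{i+1} amounts to v ^ (v >> 1) on the 8-bit value; a quick check shows why the Gray planes compress better: crossing from gray level 127 to 128 flips all 8 bit-planes, but only one Gray plane:

```python
def gray_code(v):
    """Gray code of an 8-bit value: g_i = a_i XOR a_(i+1), i.e. v ^ (v >> 1)."""
    return v ^ (v >> 1)

# 127 = 01111111 and 128 = 10000000 differ in every bit-plane...
plain_flips = bin(127 ^ 128).count("1")                        # 8
# ...but their Gray codes differ in a single plane
gray_flips = bin(gray_code(127) ^ gray_code(128)).count("1")   # 1
print(plain_flips, gray_flips)
```

Fewer plane transitions along smooth gradients means longer runs inside each plane, hence better binary compression.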
Entropy Encoding for Gray-Level Images
Usually, gray-level images don't have much coding redundancy
Example: the 256 gray levels of the 65'536 pixels of the Lena picture have an entropy of 7.394 bits/pixel (corresponding to 60'572 bytes)
using Huffman codes, the picture can be compressed into 60'825 bytes (plus 481 bytes for the codebook)
[Figure: gray-level histogram of the Lena picture]
Gray-Level Image Transformations
Transformations of gray-level images use
pixel differences rather than absolute values
based on scan lines
based on quadtrees
global transformations
Karhunen-Loève or Hotelling transform (also PCA)
Fourier transform, Walsh-Hadamard transform, discrete cosine transform (DCT), etc.
Haar and other wavelets
Data Correlation for Adjacent Pixels
In gray level images, values of adjacent pixels are highly correlated
Example (from the Lena picture):
first plot: distribution of (z1, z2) with z1 = f(2x, y), z2 = f(2x+1, y)
second plot: distribution of (z1, z2-z1)
the distribution of z2-z1 is much narrower than the distribution of z2!
[Figure: scatter plots of (z1, z2) and (z1, z2-z1)]
Decorrelation by Adjacent Pixel Differences
Replacing pixel values by pixel differences
narrows the distribution
reduces the entropy
Example: the pixel differences of the Lena picture have an entropy of 5.277 bits/pixel (needing globally 43'321 bytes)
[Figure: histogram of the pixel differences of the Lena picture]
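The effect is easy to reproduce on a synthetic example (a sketch, not the Lena measurement): a smooth ramp uses every gray level once, so its empirical entropy is maximal, while its horizontal differences are all equal and carry zero entropy:

```python
import math
from collections import Counter

def entropy(values):
    """Empirical entropy in bits of a sequence of values."""
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())

row = list(range(256))                         # smooth gray ramp
diffs = [b - a for a, b in zip(row, row[1:])]  # all differences equal 1
print(entropy(row), entropy(diffs))            # 8.0 0.0
```

Real images fall between these extremes, but the same narrowing of the distribution is what drops Lena from 7.394 to 5.277 bits/pixel.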
Decorrelation by Quadtrees
Pixel differences can be generalized to 2 dimensions by considering quadtrees
each node is characterized by the mean value of all its leaves (representing pixels)
and stores the gray-level difference to its father
Example: for the Lena picture, the quadtree of differences has an entropy of 4.710
Lossy Compression of Gray-Level Images
Lossy compression consists of
transforming spatial data in order to reduce data correlation
quantizing the values in the transformed space
encoding the data by eliminating coding redundancy
Types of transformations used:
Karhunen-Loève Transform (KLT)
Fourier Transform, Walsh-Hadamard Transform and Discrete Cosine Transform (DCT)
Wavelets, such as the Haar Transform
...
Karhunen-Loève Transform
The Karhunen-Loève transform (KLT) is the linear transform that optimally decorrelates the data
The KLT is calculated as follows:
estimate the mean vector and the covariance matrix
find the eigenvectors (i.e. an orthonormal matrix A) such that
the transformation y = A^t (x - m) is the Karhunen-Loève transform
The reverse transform is obtained by x = A y + m
m = E\{x\} = \frac{1}{n} \sum_{k=1}^{n} x_k

C = E\{(x - m)(x - m)^t\} = \frac{1}{n} \sum_{k=1}^{n} x_k x_k^t - m m^t

A^t C A = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_d \end{pmatrix}
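These steps can be sketched numerically with NumPy on synthetic correlated data (np.linalg.eigh returns the orthonormal eigenvectors of the symmetric covariance matrix as the columns of A):

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
# Two strongly correlated components; columns are the n samples x_k
x = np.stack([x1, 0.9 * x1 + 0.1 * rng.normal(size=1000)])

m = x.mean(axis=1, keepdims=True)          # mean vector m = E{x}
c = (x - m) @ (x - m).T / x.shape[1]       # covariance matrix C
lam, a = np.linalg.eigh(c)                 # eigenvalues and orthonormal A
y = a.T @ (x - m)                          # forward KLT: y = A^t (x - m)

cy = y @ y.T / y.shape[1]                  # covariance of y: diagonal
print(np.round(cy, 6))
```

The off-diagonal entries of cy vanish (the components of y are decorrelated), and x = A y + m recovers the data exactly.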
Karhunen-Loève Transform (cont.)
The Karhunen-Loève transform can be applied on blocks of adjacent pixels
The Karhunen-Loève transform can be understood as a linear transform on a set of data dependent basis functions
these functions are ordered according to decreasing eigenvalues
Example:
the 64 8x8 basis functions of the Lena picture are shown below with their indices
[Figure: KLT basis functions 1, 2, 3, 4, 5, ..., 15, 16, 17, ..., 62, 63, 64]
Quantization of the KLT data
The KLT is a reversible transform: the original image can be rebuilt from the KLT coefficients
For lossy compression, KLT coefficients are quantized, using two complementary principles:
the set of possible values is restricted to multiples of a quantization interval q:
\hat{y} = q \cdot \mathrm{round}(y / q)
only a subset of the KLT coefficients is used (the others are set to 0)
Entropy coding is used for
the KLT basis encoding
the KLT coefficients
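Both quantization principles fit in a small NumPy sketch (the coefficient values are made up; the `keep` parameter assumes coefficients ordered by decreasing eigenvalue):

```python
import numpy as np

def quantize(y, q=16, keep=None):
    """Round coefficients to multiples of q; optionally zero all but the
    first `keep` of them (assumed ordered by decreasing eigenvalue)."""
    yq = q * np.round(y / q)
    if keep is not None:
        yq[keep:] = 0
    return yq

y = np.array([37.0, -9.0, 130.0, 5.0])   # hypothetical KLT coefficients
print(quantize(y, q=16, keep=2))         # multiples of 16, last two dropped
```

Coarser q and smaller keep both reduce the entropy of the coefficient stream, at the cost of reconstruction error.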
Example of KLT compression
Original image compared to
1/4 of the KLT coefficients quantized with multiples of 16 (size = 3'834 bytes, compression rate ≈ 1:17)
1/8 of the KLT coefficients quantized with multiples of 16 (size = 2'152 bytes, compression rate ≈ 1:30)
Distortion
Two measures are used to characterize distortion:
the mean square error (MSE) is defined as
MSE = E[(x - \hat{x})^2]
the peak signal-to-noise ratio (PSNR) is defined as
PSNR = 10 \log_{10} \frac{m^2}{E[(x - \hat{x})^2]}
where m represents the peak value (255 for an 8-bit gray level image)
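Both measures in a few lines of Python (pure-Python lists for simplicity; a uniform error of 16 gray levels gives MSE = 256 and a PSNR of about 24 dB):

```python
import math

def mse(x, xhat):
    """Mean square error between two equally sized pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, xhat)) / len(x)

def psnr(x, xhat, peak=255):
    """Peak signal-to-noise ratio in dB, with peak value m = 255 by default."""
    return 10 * math.log10(peak ** 2 / mse(x, xhat))

x = [0, 50, 100]
xhat = [16, 66, 116]          # every pixel off by 16 gray levels
print(mse(x, xhat), round(psnr(x, xhat), 1))   # 256.0 24.0
```

Note that PSNR is undefined for identical images (MSE = 0), so it is only reported for lossy reconstructions.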