SVD and digital image processing – umu.se
TRANSCRIPT
Image Compression
Image compression addresses the problem
of reducing the amount of data required to
represent a digital image. The underlying
basis of the reduction process is the removal
of redundant data:
transforming a 2-D pixel array into a
statistically uncorrelated data set
Redundancy
Spatial redundancy
Similarities between adjacent pixels
250 252 249 → 250 2 -3 (first value followed by differences)
Temporal redundancy
Similarities between pixels in adjacent frames
250 252 249 → 250 2 -1
Elements of Information theory
A random event E that occurs with probability
P(E) is said to contain

I(E) = log(1/P(E)) = -log P(E)

units of information. The quantity I(E) is often
called self-information. The amount of self-
information is inversely related to the
probability of E. With base-2 logarithms,
log2 x = log10 x / log10 2, and I(E) is measured in bits.
Example
Assume that the grading levels are
A, B, C, D, E, F
and that these are equally distributed.
How much information do you have if you know that you don't have grade F?
(5/6) · (-log2(5/6)) ≈ 0.22 bits
How much information is needed to know your exact grade?
log2(6) - (1/6) · (-log2(1/6)) ≈ 2.15 bits
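The two quantities above can be checked numerically; this is a small sketch reproducing the slide's arithmetic (base-2 logarithms, grades assumed equally likely):

```python
import math

p_not_f = 5 / 6    # probability of "not grade F" (six equally likely grades)
p_f = 1 / 6

# First quantity: probability-weighted self-information of "not F"
info_not_f = p_not_f * -math.log2(p_not_f)
print(round(info_not_f, 2))    # 0.22 bits

# Second quantity: total uncertainty minus the "F" term
info_exact = math.log2(6) - p_f * -math.log2(p_f)
print(round(info_exact, 2))    # 2.15 bits
```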
Entropy
Entropy = average information per source
output:

H(z) = -sum_{j=1}^{J} P(a_j) · log2 P(a_j)

The source
The set of source symbols {a1, a2, …, aJ}
with probabilities z = [P(a1), P(a2), …, P(aJ)],
where sum_{j=1}^{J} P(a_j) = 1
The finite ensemble (A, z) describes the
source completely
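A minimal sketch of the entropy formula; the probability vector below is the five-symbol example used later in the Huffman slides:

```python
import math

def entropy(probs):
    """H(z) = -sum_j P(a_j) * log2 P(a_j), in bits per source symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

z = [0.2, 0.1, 0.2, 0.3, 0.2]    # probability vector of the source
print(round(entropy(z), 2))       # 2.25 bits
```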
The information channel
When self-information is transferred between
an information source and a user, the source
is said to be connected to the user by an
information channel
The noiseless coding theorem
It is possible to make L_avg/n arbitrarily close to
H(z) by coding infinitely long extensions of
the source:

lim_{n→∞} L_avg,n / n = H(z)
Data vs Information
Data are the means by which information is
conveyed. Various amounts of data may be
used to represent the same amount of
information
Data redundancy
Compression ratio: C_R = n1 / n2
Relative data redundancy: R_D = 1 - 1/C_R
where n1 and n2 are the number of information-carrying units in the two data sets
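The two definitions can be sketched directly; the 256 × 256 8-bit image and the 8:1 compressed size below are made-up illustration values, not from the slides:

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2."""
    return n1 / n2

def relative_redundancy(c_r):
    """R_D = 1 - 1/C_R."""
    return 1 - 1 / c_r

n1 = 256 * 256 * 8        # bits in the original 8-bit 256x256 image (assumed)
n2 = n1 // 8              # bits after compression (assumed)
cr = compression_ratio(n1, n2)
print(cr, relative_redundancy(cr))   # 8.0 0.875
```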
Redundancy
Three basic data redundancies can be
identified
Coding redundancy
Interpixel redundancy
Psychovisual redundancy
Coding Redundancy
The average length of the code words
assigned to various gray-level values is found
by summing the product of the number of bits
used to represent each gray level and the
probability that the gray level occurs
L_avg = sum_{k=0}^{L-1} l(r_k) · p_r(r_k)
Variable–length coding
Assigning fewer bits to the more probable gray levels than to the less probable ones achieves data compression
Code 1: L_avg = 3 bits
Code 2: L_avg = 2.7 bits
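As a sketch, L_avg can be computed for a fixed 3-bit code and a variable-length code; the probabilities and code lengths below are assumed from the classic 8-level textbook example that yields exactly 3 and 2.7 bits:

```python
def avg_code_length(probs, lengths):
    """L_avg = sum_k l(r_k) * p_r(r_k)."""
    return sum(l * p for l, p in zip(lengths, probs))

# Gray-level probabilities (assumed; the classic 8-level example)
probs = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
print(round(avg_code_length(probs, [3] * 8), 2))                   # fixed 3-bit code: 3.0
print(round(avg_code_length(probs, [2, 2, 2, 3, 4, 5, 6, 6]), 2))  # variable-length: 2.7
```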
Interpixel Redundancy
The value of any given pixel can be reasonably predicted from the value of its neighbors – the information carried by individual pixels is relatively small
The differences between adjacent pixels can be used to represent an image
An image can be efficiently represented by the value and length of its constant gray level runs (gi,wi)
Fidelity Criteria
Objective fidelity criteria, e.g. the root-mean-square error

e_rms = sqrt( (1/MN) · sum_{x,y} [f̂(x,y) - f(x,y)]^2 )

Subjective fidelity criteria, e.g. the rating scale:
Excellent, Fine, Passable, Marginal, Inferior, Unusable
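The objective criterion can be sketched as follows; the two 2 × 2 "images" are made-up values for illustration:

```python
import math

def rms_error(f, f_hat):
    """e_rms = sqrt( (1/MN) * sum over x,y of [f_hat(x,y) - f(x,y)]^2 )."""
    m, n = len(f), len(f[0])
    total = sum((f_hat[x][y] - f[x][y]) ** 2
                for x in range(m) for y in range(n))
    return math.sqrt(total / (m * n))

original      = [[250, 252], [249, 251]]   # assumed example values
reconstructed = [[250, 250], [250, 251]]
print(round(rms_error(original, reconstructed), 3))   # 1.118
```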
Image Compression Model
Error-free Compression
Provides compression ratios of 2 to 10
Equally applicable to binary and gray-scale
images
Composed of two independent operations
Alternative representation of image which reduces
interpixel redundancy
Coding that reduces coding redundancy
Huffman
Optimal for fixed n
Smallest possible number of code symbols
per source symbol
Instantaneous code
Uniquely decodable
Binary Huffman codes
Assume that you have 5 symbols s1 … s5
The probabilities are
pi = 0.2 0.1 0.2 0.3 0.2
Sort them in order based on their probability
0.3 0.2 0.2 0.2 0.1
Join the two smallest probabilities (0.2 + 0.1 = 0.3) and sort
again
0.3 0.3 0.2 0.2
Binary Huffman codes
Repeat until only one value remains
0.3 0.3 0.2 0.2
0.4 0.3 0.3
0.6 0.4
1.0
Binary Huffman codes
Go backwards and assign codes
0.6 0.4 → 1 0
0.4 0.3 0.3 → 0 10 11
0.3 0.3 0.2 0.2 → 10 11 00 01
0.3 0.2 0.2 0.2 0.1 → 10 00 01 110 111
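The merge-and-sort procedure above is what a heap-based Huffman construction does; below is a minimal sketch (the 0/1 assignment at each merge is arbitrary, so the code words may differ from the slide while the lengths and the average stay the same):

```python
import heapq
import itertools

def huffman_code(probs):
    """Binary Huffman code: repeatedly merge the two least probable
    groups, prefixing 0/1; returns {symbol index: code word}."""
    tie = itertools.count()                       # keeps heap tuples comparable
    heap = [(p, next(tie), {i: ""}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)           # two smallest probabilities
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

probs = [0.2, 0.1, 0.2, 0.3, 0.2]                 # s1 ... s5 from the slide
codes = huffman_code(probs)
avg = sum(probs[s] * len(w) for s, w in codes.items())
print(round(avg, 2))                              # 2.3 bits per symbol
```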
Average word length

N̄ = sum_{i=1}^{M} N_i P_i = 2·0.3 + 2·0.2 + 2·0.2 + 3·0.2 + 3·0.1 = 2.3 bits
Binary Huffman codes
For a source with probabilities 0.4, 0.3, 0.1, 0.1, 0.06 and 0.04, the Huffman code word lengths become 1, 2, 3, 4, 5 and 5:

N̄ = sum_{i=1}^{M} N_i P_i = 1·0.4 + 2·0.3 + 3·0.1 + 4·0.1 + 5·0.06 + 5·0.04 = 2.2 bits
Entropy and efficiency
Entropy:

H(X) = sum_{i=1}^{M} P_i · log2(1/P_i) = 2.11 bits

Coding efficiency:

η = 100 · H(X) / N̄ = 100 · 2.11 / 2.21 ≈ 95.5 %
LZW-Coding Lempel-Ziv-Welch
Error-free compression technique that also
makes use of interpixel redundancy
Assigns fixed-length codes to variable-length
sequences of source symbols
Requires no a priori knowledge of the
probability of occurrence of the symbols to be
encoded
LZW-Coding Lempel-Ziv-Welch
Integrated into mainstream imaging file formats
Graphics interchange format – GIF
Tagged image file format – TIFF
Portable document format – PDF
Example: the pixel sequence
39 39 126 126 39 39 126 126 39 39 126 126 39 39 126 126
is encoded as
39 39 126 126 256 258 260 259 257 126
LZW
Coding dictionary (code book) is created
while data are being encoded
LZW decoder builds an identical
decompression dictionary as it decodes the
data stream
Flush the code book
When the codebook is full
When coding is inefficient
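A minimal sketch of the LZW encoder described above (8-bit source symbols, new dictionary entries from code 256 upward), reproducing the slide's example sequence:

```python
def lzw_encode(pixels):
    """LZW: the dictionary is built while the data are being encoded;
    fixed-length codes stand for variable-length pixel sequences."""
    dictionary = {(i,): i for i in range(256)}   # all single 8-bit values
    next_code = 256
    result, current = [], ()
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate                  # keep extending the match
        else:
            result.append(dictionary[current])   # emit code for longest match
            dictionary[candidate] = next_code    # add the new sequence
            next_code += 1
            current = (p,)
    if current:
        result.append(dictionary[current])
    return result

row = [39, 39, 126, 126] * 4    # the 4x4 example image from the slide
print(lzw_encode(row))          # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
```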
Bit Plane coding
Based on the concept of decomposing a
multilevel image into a series of binary
images and compressing each binary image
via one of several well-known binary
compression methods
An alternative decomposition approach is to
start from a Gray code representation, in which
successive gray levels differ in only one bit
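The binary-to-Gray conversion can be sketched in one line; the 127 → 128 transition illustrates why Gray-coded bit planes compress better (one bit flips instead of eight):

```python
def to_gray(g):
    """Convert a binary-coded gray level to its Gray code:
    successive levels differ in exactly one bit."""
    return g ^ (g >> 1)

# 127 -> 128 flips all 8 bits in natural binary, but only 1 in Gray code
print(format(to_gray(127), "08b"))   # 01000000
print(format(to_gray(128), "08b"))   # 11000000
```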
Constant area coding CAC
The image is divided into areas of size p × q
that are classified as all white, all black, or mixed
Example: white = 0, black = 10, mixed = 11 + pixel values
If dominantly white:
Example: white = 0, black or mixed = 1 + pixel values
Run-length coding
Developed in the 1950s – a standard compression approach in facsimile coding
Example – white = 1, black = 0 – start with the color of the first pixel: 0 1 3 2 1 3
Example – always start with a white run: 0 3 2 1 3
Additional compression – variable-length coding of the run lengths
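A sketch of run-length coding for a binary row, using the fax convention that the sequence starts with a (possibly zero-length) white run; the example row is chosen so that it reproduces the slide's sequence 0 1 3 2 1 3:

```python
def run_lengths(row, first_symbol=1):
    """Run lengths of a binary row; the sequence starts with the length
    of a `first_symbol` run (white = 1), so a row that begins in black
    starts with a zero-length white run."""
    runs, current, count = [], first_symbol, 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)          # close the current run
            current, count = pixel, 1
    runs.append(count)
    return runs

row = [0, 1, 1, 1, 0, 0, 1, 0, 0, 0]    # black pixel first
print(run_lengths(row))                  # [0, 1, 3, 2, 1, 3]
```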
Lossless Predictive Coding
Does not require decomposition of an image into a collection of bitplanes
Based on eliminating the interpixel redundancies of closely spaced pixels by extracting and coding only the new information in each pixel
Contains encoder, decoder and predictor
The output of the predictor is rounded to the nearest integer
Lossless Predictive Coding
Prediction error

e_n = f_n - f̂_n

is coded using a variable-length code. The decoder reconstructs

f_n = e_n + f̂_n

The predictor uses m previous pixels:

f̂(x, y) = round[ sum_{i=1}^{m} α_i · f(x, y - i) ]
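The encoder/decoder pair above can be sketched with a first-order previous-pixel predictor (m = 1, α1 = 1); the row values are taken from the earlier redundancy example:

```python
def predictive_encode(row, alpha=(1.0,)):
    """Linear predictor f̂(n) = round(sum_i alpha_i * f(n - i));
    only the prediction error e(n) = f(n) - f̂(n) is coded."""
    m = len(alpha)
    errors = list(row[:m])               # first m pixels sent as-is
    for n in range(m, len(row)):
        pred = round(sum(a * row[n - 1 - i] for i, a in enumerate(alpha)))
        errors.append(row[n] - pred)
    return errors

def predictive_decode(errors, alpha=(1.0,)):
    """Invert the encoder: f(n) = e(n) + f̂(n)."""
    m = len(alpha)
    row = list(errors[:m])
    for n in range(m, len(errors)):
        pred = round(sum(a * row[n - 1 - i] for i, a in enumerate(alpha)))
        row.append(errors[n] + pred)
    return row

row = [250, 252, 249, 250, 250]
e = predictive_encode(row)
print(e)                                  # [250, 2, -3, 1, 0]
print(predictive_decode(e) == row)        # True
```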
Lossy Predictive Compression
Transform Coding
A compression technique that is based on
modifying the transform of an image
For most natural images a significant number
of the coefficients have small magnitudes and
can be coarsely quantized or discarded with
little image distortion
Transform Coding
Sub image decomposition
Transformation
Quantization
Coding
Adaptive transform coding or nonadaptive transform coding
g(x,y,u,v) = forward transformation kernel
h(x,y,u,v) = inverse transformation kernel

T(u,v) = sum_{x=0}^{N-1} sum_{y=0}^{N-1} f(x,y) · g(x,y,u,v)

f(x,y) = sum_{u=0}^{N-1} sum_{v=0}^{N-1} T(u,v) · h(x,y,u,v)
Fourier Transform
g(x,y,u,v) = e^{-j2π(ux+vy)/N}

h(x,y,u,v) = (1/N^2) · e^{j2π(ux+vy)/N}
Walsh-Hadamard Transform - WHT

g(x,y,u,v) = h(x,y,u,v) = (1/N) · (-1)^{ sum_{i=0}^{m-1} [ b_i(x)·p_i(u) + b_i(y)·p_i(v) ] }
Discrete Cosine Transform - DCT

g(x,y,u,v) = h(x,y,u,v) = α(u)·α(v) · cos[ (2x+1)uπ / 2N ] · cos[ (2y+1)vπ / 2N ]

α(u) = sqrt(1/N) for u = 0
α(u) = sqrt(2/N) for u = 1, 2, …, N-1
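A 1-D sketch of the DCT kernel above (the 2-D kernel is separable, so rows and columns can be transformed independently); the sample row is a made-up block row:

```python
import math

def dct_alpha(u, n):
    """alpha(u): sqrt(1/N) for u = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct_kernel(x, u, n):
    return dct_alpha(u, n) * math.cos((2 * x + 1) * u * math.pi / (2 * n))

def dct_1d(f):
    n = len(f)
    return [sum(f[x] * dct_kernel(x, u, n) for x in range(n)) for u in range(n)]

def idct_1d(t):
    # the kernel is orthonormal, so the inverse reuses it unchanged
    n = len(t)
    return [sum(t[u] * dct_kernel(x, u, n) for u in range(n)) for x in range(n)]

f = [52, 55, 61, 66, 70, 61, 64, 73]    # one row of an 8x8 block (assumed values)
T = dct_1d(f)                            # energy packs into low-frequency terms
recon = idct_1d(T)
print(all(abs(a - b) < 1e-9 for a, b in zip(f, recon)))   # True
```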
JPEG - Sequential baseline system
Limited to 8-bit words
DCT values restricted to 11 bits
DCT computation, quantization, variable-length coding
Subimages 8 × 8, processed left to right, top to bottom
Image size selection
Wavelet Coding
The principal difference between wavelet
coding and transform coding is the omission
of the subimage processing stage
JPEG2000
Video Compression Standards
Teleconference H.261 H.262 H.263 H.230
Multimedia video MPEG-1 MPEG-2 MPEG-4
MPEG
MPEG's main components are:
Block (8×8 pixels)
Macro block (2×2 blocks = 16×16 pixels)
Slice (one row of macro blocks)
Picture (an entire video frame)
Group of pictures (GOP)
Video sequence (one or more GOPs)
MPEG
[Figure: an 8×8 block, a 16×16 macro block and a slice of macro blocks]
MPEG – YUV compression
4:2:0 or 4:1:1 common in consumer products (DV)
4:2:2 common in professional products (DVCPro)
4:4:4 is rarely used – gives no visible improvement compared with 4:2:2
[Figure: Y/U/V sampling patterns for 4:2:0, 4:2:2, 4:4:4 and 4:1:1]
MPEG – Block compression
The same as JPEG: each 8×8 block is transformed from the spatial domain to the frequency domain with the DCT and then quantized. After quantization most high-frequency coefficients are zero, so a zig-zag scan of the coefficients produces long runs of zeros, which are run-length encoded (RLE) and then Huffman coded.
[Figure: example 8×8 blocks before the DCT and after quantization]
MPEG – Temporal compression
Adjacent frames share large similarities
Temporal compression can be achieved in
two ways:
Discarding images (reduce the frame rate)
Through motion estimation and motion vectors
MPEG – Motion Estimation
Calculate the position of the macro block in
the new image
Store the motion vector and the difference in
appearance
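A minimal sketch of exhaustive block matching: for each candidate displacement the sum of absolute differences (SAD) is computed, and the best displacement becomes the motion vector. The tiny 2×2 block and 4×4 frame are made-up illustration values:

```python
def sad(block, frame, x, y):
    """Sum of absolute differences between a block and a frame region."""
    return sum(abs(block[i][j] - frame[x + i][y + j])
               for i in range(len(block)) for j in range(len(block[0])))

def motion_search(block, frame, x0, y0, search=2):
    """Exhaustive block matching: find the displacement (dx, dy) within
    +/- search pixels of (x0, y0) that minimizes the SAD."""
    best = None
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            x, y = x0 + dx, y0 + dy
            if (0 <= x <= len(frame) - len(block)
                    and 0 <= y <= len(frame[0]) - len(block[0])):
                cost = sad(block, frame, x, y)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]             # the motion vector

# A 2x2 macro block that moved one pixel to the right between frames
block = [[10, 20], [30, 40]]
frame = [[0, 10, 20, 0],
         [0, 30, 40, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
print(motion_search(block, frame, 0, 0))   # (0, 1)
```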
MPEG – Group Of Pictures (GOP)
MPEG uses three types of frames,
grouped in a Group Of Pictures (GOP):
I-pictures (Intracoded)
P-pictures (Predictive Coded)
B-pictures (Bidirectionally interpolated)
[Figure: a GOP I B B P B B P, with forward prediction into P-pictures and bidirectional prediction into B-pictures]
MPEG – Data stream
Display order: I B B P B B P → 1 2 3 4 5 6 7
Order in data stream: I P B B P B B → 1 4 2 3 7 5 6
How to represent a face image
DCT Approach (block-based)
[Figure: the image as a weighted sum of basis images, = a·… + b·… + f·… + g·… + x·…]
PCA Approach (full-frame)