CS 556 – Computer Vision Image Basics & Review

Upload: hugo-miles

Post on 20-Jan-2016


TRANSCRIPT

Page 1: CS 556 – Computer Vision Image Basics & Review. What is an Image? Image: a representation, resemblance, or likeness An image is a signal: a function carrying

CS 556 – Computer Vision

Image Basics & Review

What is an Image?

• Image: a representation, resemblance, or likeness

• An image is a signal: a function carrying information

• Thus, an image is a function, f, from $\mathbb{R}^2$ to $\mathbb{R}$: f(x, y) gives the amount of some value at position (x, y)

In practice, an image is only defined over a finite rectangular domain and with a finite range:

$$f : [a, b] \times [c, d] \to [v_{min}, v_{max}]$$


Images as Functions

Images as Functions

• An image is a signal: a function carrying information

• Functions have domains and ranges: f(·) = ν
  • Domain: (t), (x, y), (x, y, t), (x, y, z), (x, y, z, t)
  • Range: sound (air pressure), graylevel (light intensity), color (RGB, HSL), LANDSAT (7 bands)

Image as Functions

• A color image is a “vector-valued” function created by pasting three functions together:

$$f(x, y) = \begin{bmatrix} r(x, y) \\ g(x, y) \\ b(x, y) \end{bmatrix}$$

• Spatial image: function of two or three spatial dimensions
  • f(x, y): images (grayscale, color, multi-spectral)
  • f(x, y, z): medical scans or image volumes (CT, MRI)

• Spatio-temporal image: 2/3-D space, 1-D time
  • f(x, y, t): videos, movies, animations

What do the Range Values Mean?

• May be visible light:
  • Intensity (gray-level): what intensity does the value “213” represent?
  • Color (RGB)

• May be quantities we cannot sense:
  • Radio waves (e.g., doppler radar)
  • Magnetic resonance
  • Range images
  • Ultrasound
  • X-rays (e.g., CT)


Digital Images: Domains & Ranges

• The real world is analog: continuous domain and range

• Computers operate on digital (discrete) data

• Converting from continuous to discrete:
  • Domains: selection of discrete points is called sampling
  • Ranges: selection of discrete values is called quantization

[Figure: a signal with its domain axis labeled “Domain Sampling” and its range axis labeled “Range Quantization”]

Digital Image Formation

• To create a digital image:
  • Sample the 2-D space on a regular grid
  • Quantize each sample (round to nearest integer)

• If the samples are Δ apart, we can write this as:

$$f[i, j] = \mathrm{Quantize}\{ f(i\Delta, j\Delta) \}$$

• The image can now be represented as a matrix of integer values

62 79 23 119 120 105 4 0

10 10 9 62 12 78 34 0

10 58 197 46 46 0 0 48

176 135 5 188 191 68 0 49

2 1 1 39 26 37 0 77

0 89 144 147 187 102 62 208

255 252 0 166 123 62 0 208

166 63 127 17 1 0 99 30

[The two axes of the matrix are labeled i and j]
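The sampling and quantization steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_and_quantize`, the test function `ramp`, the grid spacing `delta`, and the clamping range 0..255 are all hypothetical choices, not part of the slides.

```python
def sample_and_quantize(f, rows, cols, delta, vmin=0, vmax=255):
    """Sample a continuous function f on a regular grid and quantize it.

    f      : callable taking two real coordinates (assumed continuous image)
    delta  : spacing between neighboring samples
    vmin, vmax : assumed limits of the quantized range
    """
    image = []
    for i in range(rows):
        row = []
        for j in range(cols):
            value = f(i * delta, j * delta)   # sampling: evaluate at grid point
            q = int(round(value))             # quantization: nearest integer
            row.append(max(vmin, min(vmax, q)))  # keep inside the finite range
        image.append(row)
    return image


def ramp(x, y):
    # Hypothetical continuous "image": a smooth intensity ramp.
    return 20.2 * x + 3.3 * y


digital = sample_and_quantize(ramp, rows=2, cols=4, delta=0.5)
for row in digital:
    print(row)
```

The result is a list of rows of integers, i.e. the matrix form described on the slide.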

Resolution

• Ability to discern detail, in both domain & range

• Not simply the number of samples/pixels

• Determined by the averaging or spreading of information when sampled or reconstructed

Apertures

• Point measurements are impossible

• Have to make measurements using a (weighted) average over some aperture:
  • Time window
  • Spatial area
  • Etc.

• Size determines resolution:
  • Smaller → better resolution
  • Larger → worse resolution

Apertures

Lenses allow a physically larger aperture with an effectively smaller one

[Figure: a sensor behind a lens, with the effective aperture and the physical aperture labeled]

Image Transformations

An image processing operation typically defines a new image g in terms of an existing image f

• We can transform either the domain or the range of f

• Range transformation (a.k.a. level operations):

g(x, y) = t( f (x, y))

What kinds of operations can this perform?
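As one possible illustration of a range transformation g(x, y) = t(f(x, y)), the sketch below applies two level operations, a negative and a threshold, to a small image. The helper names, the cutoff of 128, and the maximum value of 255 are assumptions made for this example.

```python
def apply_level_operation(image, t):
    """Apply t to every pixel value; pixel positions are untouched."""
    return [[t(v) for v in row] for row in image]


def negative(v, vmax=255):
    # Invert the gray level (assumes values lie in 0..vmax).
    return vmax - v


def threshold(v, cutoff=128, vmax=255):
    # Map each value to black or white depending on an assumed cutoff.
    return vmax if v >= cutoff else 0


f = [[62, 79, 23],
     [10, 58, 197],
     [255, 252, 0]]

g_negative = apply_level_operation(f, negative)
g_binary = apply_level_operation(f, threshold)
print(g_negative)
print(g_binary)
```

Because t only sees the value at a single position, any point-wise remapping (brightness, contrast, gamma, thresholding) fits this pattern.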

Image Transformations

Some operations preserve the range but change the domain of f (a.k.a. geometric operations):

$$g(x, y) = f(t_x(x, y), t_y(x, y))$$

What kinds of operations can this perform?

• Many image transforms operate both on the domain and the range
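A domain (geometric) transformation can be sketched the same way: each output position looks up the input at a remapped position. The sketch assumes the image is stored as a list of rows, so `f[y][x]` plays the role of f(x, y), and that positions falling outside the image read as a fill value of 0; both are choices made for this example.

```python
def apply_geometric_operation(f, tx, ty, fill=0):
    """Compute g(x, y) = f(tx(x, y), ty(x, y)) on a discrete image."""
    rows, cols = len(f), len(f[0])
    g = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            sx, sy = tx(x, y), ty(x, y)      # source position in f
            if 0 <= sx < cols and 0 <= sy < rows:
                g[y][x] = f[sy][sx]          # values are copied, not changed
    return g


f = [[1, 2, 3],
     [4, 5, 6]]
cols = len(f[0])

# Translate one pixel to the right: g(x, y) = f(x - 1, y).
shifted = apply_geometric_operation(f, lambda x, y: x - 1, lambda x, y: y)

# Mirror left-to-right: g(x, y) = f(cols - 1 - x, y).
flipped = apply_geometric_operation(f, lambda x, y: cols - 1 - x, lambda x, y: y)

print(shifted)
print(flipped)
```

Rotation and scaling follow the same pattern, although they generally need interpolation because the remapped coordinates are no longer integers.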

Linear Transforms

A general and very useful class of transforms is the class of linear transforms

Properties of linear transforms:

• Multiplying the input f(x) by a constant value multiplies the output by the same constant:

t(a f(x)) = a t( f(x))

• Adding two inputs causes the corresponding outputs to add:

t( f(x) + h(x)) = t( f(x)) + t(h(x))

Linearity: the transform t is linear iff

t(a f(x) + b h(x)) = a t( f(x)) + b t(h(x))

Linear Transforms

A linear transform of a discrete signal/image f can be defined by a matrix M using matrix multiplication:

$$g[i] = \sum_{j=0}^{n-1} M[i, j]\, f[j]$$

[Figure: the matrix M[i, j] multiplying the column vector f to produce the column vector g[i]]

Note that matrix and vector indices start at 0 instead of 1

Does M(a f + b h) = a M f + b M h?
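The question above can be checked numerically. The following sketch implements the sum g[i] = Σ M[i, j] f[j] directly and compares both sides of the linearity condition for one arbitrary choice of M, f, h, a, and b (all values here are made up for the check; a single example illustrates the property but does not prove it).

```python
def apply_transform(M, f):
    """Return g with g[i] = sum over j of M[i][j] * f[j] (0-based indices)."""
    return [sum(M[i][j] * f[j] for j in range(len(f))) for i in range(len(M))]


# Arbitrary example values, chosen only for illustration.
M = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
f = [1, 2, 3]
h = [0, -1, 5]
a, b = 2, -3

# Left side: transform the combined input a*f + b*h.
combined = [a * fv + b * hv for fv, hv in zip(f, h)]
left = apply_transform(M, combined)

# Right side: combine the individually transformed inputs.
Mf = apply_transform(M, f)
Mh = apply_transform(M, h)
right = [a * u + b * v for u, v in zip(Mf, Mh)]

print(left, right)
```

Both sides agree because each output entry is a weighted sum of the inputs, and weighted sums distribute over addition and scaling.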

Linear Transforms: Examples

Let’s start with a discrete 1-D image (a “signal”): f [x]

[Plot of f [x] versus x, with sample values 4, 4, 0, 0, 2, 4, 6, 6]

Linear Transforms: Examples

Identity transform:

$$
\underbrace{\begin{bmatrix}
1&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0\\
0&0&0&0&1&0&0&0\\
0&0&0&0&0&1&0&0\\
0&0&0&0&0&0&1&0\\
0&0&0&0&0&0&0&1
\end{bmatrix}}_{M}
\underbrace{\begin{bmatrix}4\\4\\0\\0\\2\\4\\6\\6\end{bmatrix}}_{f}
=
\underbrace{\begin{bmatrix}4\\4\\0\\0\\2\\4\\6\\6\end{bmatrix}}_{g}
$$

M = I, so M f = I f = f

Linear Transforms: Examples

Scale:

$$
\underbrace{\begin{bmatrix}
a&0&0&0&0&0&0&0\\
0&a&0&0&0&0&0&0\\
0&0&a&0&0&0&0&0\\
0&0&0&a&0&0&0&0\\
0&0&0&0&a&0&0&0\\
0&0&0&0&0&a&0&0\\
0&0&0&0&0&0&a&0\\
0&0&0&0&0&0&0&a
\end{bmatrix}}_{M}
\underbrace{\begin{bmatrix}4\\4\\0\\0\\2\\4\\6\\6\end{bmatrix}}_{f}
=
\underbrace{\begin{bmatrix}4a\\4a\\0\\0\\2a\\4a\\6a\\6a\end{bmatrix}}_{g}
$$

M = a I, so M f = a I f = a f

Linear Transforms: Examples

Shift (translate):

$$
\underbrace{\begin{bmatrix}
0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0\\
1&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0\\
0&0&0&0&1&0&0&0\\
0&0&0&0&0&1&0&0
\end{bmatrix}}_{M}
\underbrace{\begin{bmatrix}4\\4\\0\\0\\2\\4\\6\\6\end{bmatrix}}_{f}
=
\underbrace{\begin{bmatrix}0\\0\\4\\4\\0\\0\\2\\4\end{bmatrix}}_{g}
$$

Here g[x] = f [x − 2]: the signal moves two samples to the right.

[Plots of f [x] and g[x]]

Linear Transforms: Examples

Derivative (finite difference):

$$
\underbrace{\begin{bmatrix}
-1&1&0&0&0&0&0&0\\
0&-1&1&0&0&0&0&0\\
0&0&-1&1&0&0&0&0\\
0&0&0&-1&1&0&0&0\\
0&0&0&0&-1&1&0&0\\
0&0&0&0&0&-1&1&0\\
0&0&0&0&0&0&-1&1\\
0&0&0&0&0&0&0&0
\end{bmatrix}}_{M}
\underbrace{\begin{bmatrix}4\\4\\0\\0\\2\\4\\6\\6\end{bmatrix}}_{f}
=
\underbrace{\begin{bmatrix}0\\-4\\0\\2\\2\\2\\0\\0\end{bmatrix}}_{g}
$$

Here g[x] = f [x + 1] − f [x], with the last row of M set to zero.

[Plots of f [x] and g[x]]

Linear Transforms: Examples

The transformation matrix doesn’t have to be square

$$
\underbrace{\frac{1}{2}\begin{bmatrix}
1&1&0&0&0&0&0&0\\
0&0&1&1&0&0&0&0\\
0&0&0&0&1&1&0&0\\
0&0&0&0&0&0&1&1
\end{bmatrix}}_{M}
\underbrace{\begin{bmatrix}4\\4\\0\\0\\2\\4\\6\\6\end{bmatrix}}_{f}
=
\underbrace{\begin{bmatrix}4\\0\\3\\6\end{bmatrix}}_{g}
$$

[Plots of f [x] and g[x]]
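The example transforms in this group of slides can be reproduced with a short script. This is a sketch that builds the shift, finite-difference, and pair-averaging matrices for the 8-sample signal 4, 4, 0, 0, 2, 4, 6, 6 and applies them by plain matrix multiplication; the variable names are chosen for this example only.

```python
def apply_transform(M, f):
    """Return g = M f for a matrix M (list of rows) and a vector f."""
    return [sum(m * v for m, v in zip(row, f)) for row in M]


f = [4, 4, 0, 0, 2, 4, 6, 6]
n = len(f)

# Shift right by two samples: row i picks f[i - 2].
shift = [[1 if j == i - 2 else 0 for j in range(n)] for i in range(n)]

# Forward difference f[i + 1] - f[i]; the last row is left as zeros.
difference = [[0] * n for _ in range(n)]
for i in range(n - 1):
    difference[i][i] = -1
    difference[i][i + 1] = 1

# Non-square (4 x 8) matrix: average each pair of neighboring samples.
average = [[0.5 if j in (2 * i, 2 * i + 1) else 0 for j in range(n)]
           for i in range(n // 2)]

g_shift = apply_transform(shift, f)
g_difference = apply_transform(difference, f)
g_average = apply_transform(average, f)
print(g_shift)
print(g_difference)
print(g_average)
```

The non-square case shows that a linear transform may change the number of samples: here 8 inputs become 4 outputs.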

Fourier Transform

One important linear transform is the Fourier transform

• Basic idea: any function can be written as the sum of (complex-valued) sinusoids of different frequencies

• Euler’s equation: $e^{i 2\pi s x} = \cos(2\pi s x) + i \sin(2\pi s x)$

Note: i is the imaginary number $\sqrt{-1}$

• To get the weights (amount of each frequency):

$$F[s] = \frac{1}{n} \sum_{x=0}^{n-1} f[x]\, e^{-i 2\pi s x / n}$$

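Fourier Transform

In matrix form:

$$
\begin{bmatrix} F[0] \\ F[1] \\ F[2] \\ \vdots \\ F[s] \\ \vdots \\ F[n-1] \end{bmatrix}
= \frac{1}{n}
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 & \cdots & 1 \\
1 & W_n & W_n^{2} & \cdots & W_n^{x} & \cdots & W_n^{n-1} \\
1 & W_n^{2} & W_n^{4} & \cdots & W_n^{2x} & \cdots & W_n^{2(n-1)} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
1 & W_n^{s} & W_n^{2s} & \cdots & W_n^{sx} & \cdots & W_n^{s(n-1)} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
1 & W_n^{n-1} & W_n^{2(n-1)} & \cdots & W_n^{(n-1)x} & \cdots & W_n^{(n-1)(n-1)}
\end{bmatrix}
\begin{bmatrix} f[0] \\ f[1] \\ f[2] \\ \vdots \\ f[x] \\ \vdots \\ f[n-1] \end{bmatrix}
$$

where

$$W_n = e^{-i 2\pi / n}$$

The frequency increases with the row number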
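A direct, unoptimized version of this matrix form can be written with Python's `cmath` module. The sketch assumes the forward-transform convention with a negative exponent and a 1/n scale factor, i.e. entries W^(s·x)/n with W = e^(−i2π/n); other texts place the sign or the scale factor differently. The function names are hypothetical.

```python
import cmath


def dft_matrix(n):
    """Build the n x n matrix with entries W**(s*x) / n, W = exp(-2*pi*i/n)."""
    W = cmath.exp(-2j * cmath.pi / n)
    return [[W ** (s * x) / n for x in range(n)] for s in range(n)]


def dft(f):
    """Multiply the DFT matrix by the signal f (O(n^2), for illustration)."""
    n = len(f)
    M = dft_matrix(n)
    return [sum(M[s][x] * f[x] for x in range(n)) for s in range(n)]


f = [4, 4, 0, 0, 2, 4, 6, 6]
F = dft(f)
for s, value in enumerate(F):
    print(s, abs(value))
```

With this scaling, F[0] is the mean of the signal, and for a real-valued input the coefficients at s and n − s are complex conjugates of each other. Practical code would use a fast Fourier transform rather than the full matrix product.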

Linear Shift-Invariant Transform

A special class of linear transforms is shift invariant

• Shift invariance: an operation is invariant to translation

• Implication: shifting the input produces the same output with an equal shift

if g(x) = t( f(x))

then t( f(x + x0)) = g(x + x0)

Filters

Filter: linear, shift-invariant transform
  • The term is often applied to operations that are not technically filters (e.g., median “filter”)

Transformation matrix M:
  • Shifted copy of some pattern applied to each row
  • Pattern is (usually) centered on (or near) the diagonal

Pattern is called a filter, kernel, or mask and is represented by a vector h

$$
M = \begin{bmatrix}
* & * & 0 & 0 & 0 & 0 & 0 & 0 \\
a & b & c & 0 & 0 & 0 & 0 & 0 \\
0 & a & b & c & 0 & 0 & 0 & 0 \\
0 & 0 & a & b & c & 0 & 0 & 0 \\
0 & 0 & 0 & a & b & c & 0 & 0 \\
0 & 0 & 0 & 0 & a & b & c & 0 \\
0 & 0 & 0 & 0 & 0 & a & b & c \\
0 & 0 & 0 & 0 & 0 & 0 & * & *
\end{bmatrix}
\qquad h[x] = [a \;\; b \;\; c]
$$

Filters

Filter operations can be written (for a kernel size of 2k + 1) as:

$$g[x] = \sum_{j=-k}^{k} h[j]\, f[x + j]$$

This assumes negative kernel indices; an actual implementation may need to use h[j + k] instead of h[j]

• Can think of it as a dot (or inner) product of h with a portion of f

• Since 2k + 1 is often much less than n, this computation is more efficient than the full matrix multiplication (it skips the terms that are multiplied by 0)
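The index adjustment mentioned above (h[j + k] instead of h[j]) looks like this in practice. The sketch is a minimal 1-D filter that treats samples outside the signal as 0, which is one of several possible boundary conventions and is an assumption of this example; the kernel [1, 2, 1] is also just an illustrative choice.

```python
def filter_1d(f, h):
    """Compute g[x] = sum_{j=-k..k} h[j] * f[x + j] with a 0-based kernel list."""
    k = len(h) // 2                   # kernel has 2k + 1 entries
    n = len(f)
    g = []
    for x in range(n):
        total = 0
        for j in range(-k, k + 1):
            if 0 <= x + j < n:        # assumption: outside samples count as 0
                total += h[j + k] * f[x + j]   # h[j + k] stores the weight h[j]
        g.append(total)
    return g


f = [4, 4, 0, 0, 2, 4, 6, 6]
h = [1, 2, 1]                         # unnormalized smoothing kernel
g = filter_1d(f, h)
print(g)
```

Each output needs only 2k + 1 multiplications, compared with n for a full row of the transformation matrix.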

Cross-Correlation & Convolution

Filtering operations come in two (very similar) types:

Cross-correlation (already seen):

$$g[x] = h[x] \otimes f[x] = \sum_{j=-k}^{k} h[j]\, f[x + j]$$

Convolution:

$$g[x] = h[x] * f[x] = \sum_{j=-k}^{k} h[j]\, f[x - j]$$

Convolution is cross-correlation where either the kernel or signal is flipped first

How do the results differ for cross-correlation and convolution if the kernel is symmetric? Anti-symmetric?
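The relationship between the two operations can be explored with a small experiment. The sketch below implements both sums with zero padding outside the signal (an assumed boundary rule) and compares them for a symmetric kernel, an anti-symmetric kernel, and a flipped kernel; the specific kernels are made up for the comparison.

```python
def cross_correlate(f, h):
    """g[x] = sum_j h[j] * f[x + j], kernel stored with offset k."""
    k, n = len(h) // 2, len(f)
    return [sum(h[j + k] * f[x + j]
                for j in range(-k, k + 1) if 0 <= x + j < n)
            for x in range(n)]


def convolve(f, h):
    """g[x] = sum_j h[j] * f[x - j], kernel stored with offset k."""
    k, n = len(h) // 2, len(f)
    return [sum(h[j + k] * f[x - j]
                for j in range(-k, k + 1) if 0 <= x - j < n)
            for x in range(n)]


f = [4, 4, 0, 0, 2, 4, 6, 6]
symmetric = [1, 2, 1]
antisymmetric = [-1, 0, 1]
general = [1, 2, 3]

same_for_symmetric = convolve(f, symmetric) == cross_correlate(f, symmetric)
negated_for_antisymmetric = (
    convolve(f, antisymmetric)
    == [-v for v in cross_correlate(f, antisymmetric)]
)
flip_equivalence = convolve(f, general) == cross_correlate(f, general[::-1])
print(same_for_symmetric, negated_for_antisymmetric, flip_equivalence)
```

All three comparisons follow from substituting −j for j in the convolution sum, which replaces h[j] with h[−j].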

2-D Linear Transforms

A 2-D discrete image (in matrix form) can form a 1-D vector by concatenating the rows into one long vector:

$$\hat{f}[x] = f[\lfloor x / n \rfloor,\; x \,\%\, n]$$

[Figure: the matrix M multiplying the flattened vector f to produce the vector g]

However, it is usually easier to think about it in terms of the computation for an individual value of g[u, v]:

$$g[u, v] = \sum_{x=0}^{n-1} \sum_{y=0}^{m-1} w_{u,v}[x, y]\, f[x, y]$$

2-D Transforms: Fourier Transform

The 2-D discrete Fourier Transform is given by:

$$F[u, v] = \frac{1}{nm} \sum_{x=0}^{n-1} \sum_{y=0}^{m-1} f[x, y]\, e^{-i 2\pi (ux/n + vy/m)}$$

where the weight values $w_{u,v}$ have been replaced with

$$w_{u,v}[x, y] = \frac{1}{nm}\, e^{-i 2\pi (ux/n + vy/m)}$$


Fourier Transform: Examples

2-D Filtering

A 2-D image f[x, y] can be filtered by convolving (or cross-correlating) it with a 2-D kernel h[x, y] to produce an output image g[x, y]:

$$g[x, y] = \sum_{j=-k}^{k} \sum_{i=-k}^{k} h[i, j]\, f[x + i, y + j]$$

As with the 1-D case, actual implementation may need to use h[i + k, j + k] instead of h[i, j] to adjust for negative indices

• Filtering is useful for many reasons such as noise reduction and edge detection

Noise

• Unavoidable/undesirable fluctuation from the “correct” value: the nemesis of signal/image processing and computer vision

• Usually random: modeled as a statistical distribution
  • Mean (μ) at the “correct” value
  • Measured sample varies from μ according to distribution (σ)

• Signal-to-Noise Ratio (SNR) = μ/σ:
  • Measures how “noise free” the acquired signal is
  • “Signal” can refer to absolute or relative value
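As a small worked example of the ratio μ/σ, the sketch below estimates the signal-to-noise ratio from repeated measurements of one pixel. The sample values are invented for illustration, and using the population standard deviation (`statistics.pstdev`) is an assumption of this example.

```python
import statistics

# Hypothetical repeated measurements of a pixel whose "correct" value is 100.
samples = [98, 102, 101, 99, 100, 103, 97, 100]

mu = statistics.mean(samples)        # estimate of the "correct" value
sigma = statistics.pstdev(samples)   # spread of the noise around the mean
snr = mu / sigma

print(mu, sigma, snr)
```

A larger ratio means the fluctuations are small relative to the value being measured.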

Noise

Filtering is useful for noise reduction. . .

• Common types of noise:
  • Salt and pepper: random occurrences of black and white pixels
  • Impulse: random occurrences of white pixels
  • Gaussian: intensity variations drawn from a normal distribution

What kind of filter (i.e., kernel) reduces noise? Why?

Noise Reduction: Mean Filter

What does a 3×3 mean (i.e., averaging) kernel look like?

What does it do to the salt and pepper noise?

What does it do to the edges of the white box?

f[x, y]:

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

g[x, y]: (to be computed)

Noise Reduction: Mean Filter

f[x, y]:

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

$$h[x, y] = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

g[x, y]:

0 10 20 30 30 30 20 10
0 20 40 60 60 60 40 20
0 30 60 90 90 90 60 30
0 30 50 80 80 90 60 30
0 30 50 80 80 90 60 30
0 20 30 50 50 60 40 20
10 20 30 30 30 30 20 10
10 10 10 0 0 0 0 0
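The output values on this slide can be reproduced with a direct 2-D filter. The sketch below sums each 3×3 neighborhood and divides by 9, computing outputs only where the kernel fits entirely inside the image (which is why a 10×10 input gives an 8×8 result). Storing the image as a list of rows and the function name `mean_filter_valid` are choices made for this example.

```python
def mean_filter_valid(f, k=1):
    """Average each (2k+1) x (2k+1) neighborhood that lies fully inside f."""
    rows, cols = len(f), len(f[0])
    size = (2 * k + 1) ** 2
    g = []
    for y in range(k, rows - k):
        out_row = []
        for x in range(k, cols - k):
            total = 0
            for j in range(-k, k + 1):
                for i in range(-k, k + 1):
                    total += f[y + j][x + i]   # all kernel weights are equal
            out_row.append(total / size)
        g.append(out_row)
    return g


f = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 0, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 90, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

g = mean_filter_valid(f)
for row in g:
    print([int(v) for v in row])
```

The isolated 90 in the lower left is spread into a 3×3 patch of 10s rather than removed, and the sharp edges of the box become ramps (30, 60, 90), which is the blurring cost of mean filtering.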


Mean Filter: Effect

[Figure: images with Gaussian noise (top row) and salt and pepper noise (bottom row), mean filtered with 3×3, 5×5, and 7×7 kernels]

Noise Reduction: Gaussian Filter

A Gaussian kernel gives less weight to pixels further from the center of the window

$$h[x, y] = \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$$

This kernel is an approximation of a Gaussian function:

$$h(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
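One way to see the connection between the 3×3 kernel and the Gaussian function is to sample the Gaussian at integer offsets and normalize the samples so they sum to 1. In the sketch below the normalization replaces the constant 1/(2πσ²), and the value of σ is picked (as an assumption of this example) so that a neighbor one step from the center gets exactly half the center weight.

```python
import math


def gaussian_kernel(k, sigma):
    """Sample exp(-(x^2 + y^2) / (2 sigma^2)) on a (2k+1)^2 grid, then normalize."""
    raw = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            for x in range(-k, k + 1)]
           for y in range(-k, k + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


# Assumed sigma: chosen so that exp(-1 / (2 sigma^2)) equals 1/2.
sigma = math.sqrt(1 / (2 * math.log(2)))
kernel = gaussian_kernel(1, sigma)

slide_kernel = [[1 / 16, 2 / 16, 1 / 16],
                [2 / 16, 4 / 16, 2 / 16],
                [1 / 16, 2 / 16, 1 / 16]]

max_difference = max(abs(kernel[r][c] - slide_kernel[r][c])
                     for r in range(3) for c in range(3))
print(max_difference)
```

For this particular σ (about 0.85) the sampled and normalized Gaussian matches the 1-2-1 kernel up to floating-point error; other values of σ give kernels that are flatter or more peaked.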


Mean vs. Gaussian Filtering

Non-Linear Operations

• They are often mistakenly called “filters”
  • Strictly speaking, non-linear operators are not filters
  • They can be useful, though

• Examples:
  • Order statistics (e.g., median filter)
  • Iterative algorithms (e.g., CLEAN)
  • Anisotropic diffusion
  • Non-uniform convolution-like operations

Median “Filter”

Instead of a local neighborhood weighted average, compute the median of the neighborhood

• Advantages:
  • Removes noise like low-pass filtering does
  • Value is from actual image values
  • Removes outliers: doesn’t average (blur) them into the result (“despeckling”)
  • Edge preserving

• Disadvantages:
  • Not linear
  • Not shift invariant
  • Slower to compute
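A 1-D version makes the outlier-removal, edge-preservation, and non-linearity points concrete. The sketch below uses a window of three samples and repeats the end values at the borders (an assumed boundary rule); the test signals are invented for illustration.

```python
import statistics


def median_filter_1d(f, k=1):
    """Replace each sample by the median of its (2k+1)-sample neighborhood."""
    n = len(f)
    g = []
    for x in range(n):
        # Assumption: indices past either end are clamped to the nearest sample.
        window = [f[min(max(x + j, 0), n - 1)] for j in range(-k, k + 1)]
        g.append(statistics.median(window))
    return g


# A step edge at index 6 with one "salt" outlier at index 3.
noisy = [0, 0, 0, 90, 0, 0, 90, 90, 90, 90]
cleaned = median_filter_1d(noisy)
print(cleaned)

# Non-linearity: filtering a sum differs from summing the filtered parts.
a = [1, 0, 0]
b = [0, 0, 1]
sum_then_filter = median_filter_1d([u + v for u, v in zip(a, b)])
filter_then_sum = [u + v for u, v in
                   zip(median_filter_1d(a), median_filter_1d(b))]
print(sum_then_filter, filter_then_sum)
```

The outlier disappears completely and the step stays sharp, whereas an averaging kernel would smear both. The second comparison shows that the additivity property of linear transforms fails for the median.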

Comparison: Salt & Pepper Noise

[Figure: results of mean, Gaussian, and median filtering with 3×3 and 7×7 windows]

Comparison: Gaussian Noise

[Figure: results of mean, Gaussian, and median filtering with 3×3 and 7×7 windows]