2. Filtering Basics
TRANSCRIPT
• Role of Artificial Intelligence and Image Processing in Computer Vision
• Industrial Machine Vision Applications
• System Architecture
• State of the Art
CHAPTER 1
Introduction
• What is computer vision?
The analysis of digital images by a computer
The Three Stages of Computer Vision
Low-Level
blurring
sharpening
Low-Level
Canny
ORT
Mid-Level
original image → edge image
edge image → circular arcs and line segments
data structure
Mid-level
K-means clustering
original color image regions of homogeneous color
(followed by connected component analysis)
data structure
edge image
consistent line clusters
low-level
mid-level
high-level
Low- to High-Level
Building Recognition
System Architecture
Typical Computer Vision System. The figure shows the basic components of a computer vision system. The leftmost component is the scene or object of study. The next component is the sensing device used to collect data from the scene. The third component is the computational device, which computes information such as visual cues and reasons on this information to generate interpretations of the scene, such as the objects present or the actions being performed.
System Architecture
A typical computer vision system involves three components:
1. The scene under study.
2. A sensing device used to collect data from the scene.
3. A computational device that performs the analysis of the scene based on the data from the sensor.
The computational device generates two possible forms of data: information such as visual cues, and interpretations of that information, such as actions being performed or the presence of objects. The two forms of data can each be used to refine the other until the output of the vision system is computed with a predefined amount of certainty. The result can be the 3D location of every pixel within an image, or the certainty that a person is performing jumping jacks. With proper selection of the information and the interpretation algorithm, computer vision systems can be applied in a large number of applications.
Digital Image Terminology:
0  0  0  0  1  0  0
0  0  1  1  1  0  0
0  1 95 96 94 93 92
0  0 92 93 93 92 92
0  0 93 93 94 92 93
0  1 92 93 93 93 93
0  0 94 95 95 96 95
pixel (with value 94)
its 3x3 neighborhood
• binary image
• gray-scale (or gray-tone) image
• color image
• multi-spectral image
• range image
• labeled image
region of medium intensity
resolution (7x7)
Spatial Filtering
Image filtering
Hybrid Images, Oliva et al., http://cvcl.mit.edu/hybridimage.htm
Slides from Steve Seitz and Rick Szeliski
Image filtering
Hybrid Images, Oliva et al., http://cvcl.mit.edu/hybridimage.htm
What is an image?
Images as functions
• We can think of an image as a function, f, from R² to R:
– f(x, y) gives the intensity at position (x, y)
– Realistically, we expect the image only to be defined over a rectangle, with a finite range:
• f: [a, b] × [c, d] → [0, 1]
A color image is just three functions pasted together. We can write this as a “vector-valued” function:

f(x, y) = ( r(x, y), g(x, y), b(x, y) )
Images as functions
What is a digital image?
• We usually work with digital (discrete) images:
– Sample the 2D space on a regular grid
– Quantize each sample (round to nearest integer)
• If our samples are Δ apart, we can write this as:
f[i, j] = Quantize{ f(i Δ, j Δ) }
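The sample-then-quantize recipe above can be sketched in code. This is a rough illustration, not from the slides: the continuous image f and the grid spacing are made-up examples.

```python
import numpy as np

# Hypothetical continuous image: intensity as a function f(x, y) in [0, 1].
def f(x, y):
    return 0.5 * (np.sin(x) * np.cos(y) + 1.0)

def digitize(f, height, width, delta, levels=256):
    """Sample f on a regular grid with spacing delta, then quantize
    each sample to one of `levels` integer gray values:
    f[i, j] = Quantize{ f(i*delta, j*delta) }."""
    ii, jj = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    samples = f(ii * delta, jj * delta)                       # sample on the grid
    return np.round(samples * (levels - 1)).astype(np.uint8)  # quantize

img = digitize(f, height=8, width=8, delta=0.5)
print(img.shape, img.dtype)
```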
Spatial Filtering (cont’d)
• The word “filtering” has been borrowed from the frequency domain.
• Filters are classified as:
– Low-pass (i.e., preserve low frequencies)
– High-pass (i.e., preserve high frequencies)
– Band-pass (i.e., preserve frequencies within a band)
– Band-reject (i.e., reject frequencies within a band)
Spatial Filtering – Neighborhood (or Mask)
• Typically, the neighborhood is rectangular and its size is much smaller than that of f(x,y)
- e.g., 3x3 or 5x5
Filtering noise
•How can we “smooth” away noise in an image?
  0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0
  0   0   0 100 130 110 120 110   0   0
  0   0   0 110  90 100  90 100   0   0
  0   0   0 130 100  90 130 110   0   0
  0   0   0 120 100 130 110 120   0   0
  0   0   0  90 110  80 120 100   0   0
  0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0
Mean filtering
 0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90  0 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0  0  0  0  0  0  0  0
 0  0 90  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0
 0 10 20 30 30 30 20 10
 0 20 40 60 60 60 40 20
 0 30 60 90 90 90 60 30
 0 30 50 80 80 90 60 30
 0 30 50 80 80 90 60 30
 0 20 30 50 50 60 40 20
10 20 30 30 30 30 20 10
10 10 10  0  0  0  0  0
Cross-correlation filtering
• Let’s write this down as an equation. Assume the averaging window is (2k+1)x(2k+1):

G[i, j] = 1/(2k+1)² Σ_{u=−k..k} Σ_{v=−k..k} F[i + u, j + v]

We can generalize this idea by allowing different weights for different neighboring pixels:

G[i, j] = Σ_{u=−k..k} Σ_{v=−k..k} H[u, v] F[i + u, j + v]

This is called a cross-correlation operation and written G = H ⊗ F.
H is called the “filter,” “kernel,” or “mask.”
The above allows negative filter indices. When you implement this, you need to use H[u+k, v+k] instead of H[u, v].
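A sketch of the cross-correlation operation in code (assuming numpy). The shifted indexing H[u+k, v+k] mentioned above shows up here as ordinary array slicing; border handling (leaving edge pixels at 0) is an arbitrary choice for illustration.

```python
import numpy as np

def cross_correlate(F, H):
    """G[i, j] = sum over u, v in [-k, k] of H[u, v] * F[i+u, j+v].
    In code the negative filter indices are shifted, so H is read with
    non-negative indices via slicing. Border pixels are left at 0."""
    k = H.shape[0] // 2
    G = np.zeros(F.shape, dtype=float)
    for i in range(k, F.shape[0] - k):
        for j in range(k, F.shape[1] - k):
            window = F[i - k:i + k + 1, j - k:j + k + 1]
            G[i, j] = np.sum(H * window)
    return G

F = np.zeros((5, 5)); F[2, 2] = 9.0   # a single bright spike
H = np.ones((3, 3)) / 9.0             # (2k+1)x(2k+1) averaging window, k=1
G = cross_correlate(F, H)
print(G[2, 2])                        # the spike is averaged over the window
```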
Mean kernel
• What’s the kernel for a 3x3 mean filter? A 3x3 mask with every weight equal to 1/9.
 0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90  0 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0  0  0  0  0  0  0  0
 0  0 90  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0
Mean of neighborhood
Effect of filter size
Mean vs. Gaussian filtering
Gaussian filtering
• A Gaussian kernel gives less weight to pixels further from the center of the window
Linear vs Non-Linear Spatial Filtering Methods
• A filtering method is linear when the output is a weighted sum of the input pixels.
• Methods that do not satisfy the above property are called non-linear.
– e.g., median filtering
Linear Spatial Filtering Methods
• Main linear spatial filtering methods:– Correlation– Convolution
Correlation (cont’d)
Often used in applications where we need to measure the similarity between images or parts of images (e.g., template matching).
Convolution
• Similar to correlation except that the mask is first flipped both horizontally and vertically.
Note: if w(i, j) is symmetric, that is w(i, j)=w(-i,-j), then convolution is equivalent to correlation!
g(i, j) = w(i, j) ★ f(i, j) = Σ_{s=−K/2..K/2} Σ_{t=−K/2..K/2} w(s, t) f(i − s, j − t)
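A minimal sketch (assuming numpy) of the flip-then-correlate relationship: convolution equals correlation once the mask is flipped both horizontally and vertically, so the two agree exactly for symmetric masks and differ otherwise.

```python
import numpy as np

def correlate(f, w):
    """Valid-region cross-correlation of image f with mask w."""
    kh, kw = w.shape
    out = np.zeros((f.shape[0] - kh + 1, f.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(w * f[i:i + kh, j:j + kw])
    return out

def convolve(f, w):
    """Convolution = correlation with the mask flipped in both axes."""
    return correlate(f, w[::-1, ::-1])

f = np.arange(25.0).reshape(5, 5)
w_sym = np.ones((3, 3)) / 9.0                   # symmetric: w(i,j) = w(-i,-j)
print(np.allclose(convolve(f, w_sym), correlate(f, w_sym)))    # True

w_asym = np.array([[-1.0, 0.0, 1.0]])           # asymmetric: results differ
print(np.allclose(convolve(f, w_asym), correlate(f, w_asym)))  # False
```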
Image gradient
• How can we differentiate a digital image F[x, y]?
– Option 1: reconstruct a continuous image, f, then take gradient
– Option 2: take discrete derivative (finite difference)
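Option 2 can be sketched as follows (assuming numpy; the forward-difference and cropping conventions are arbitrary choices for illustration):

```python
import numpy as np

def finite_difference_gradient(F):
    """Discrete derivatives via forward finite differences.
    Fx differences neighboring columns (x direction), Fy neighboring
    rows (y direction); both are cropped to a common shape."""
    Fx = F[:, 1:] - F[:, :-1]
    Fy = F[1:, :] - F[:-1, :]
    return Fx[:-1, :], Fy[:, :-1]

# A vertical step edge: the gradient magnitude is large only at the edge.
F = np.zeros((4, 6)); F[:, 3:] = 100.0
Fx, Fy = finite_difference_gradient(F)
mag = np.hypot(Fx, Fy)
print(mag[0, 2], mag[0, 0])   # 100.0 at the edge, 0.0 elsewhere
```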
Image gradient
It points in the direction of most rapid change in intensity
smoothed – original (scaled by 4, offset +128)
original smoothed (5x5 Gaussian)
Why does this work?
Laplacian of Gaussian (2nd derivative operator)
Gaussian delta function
More on filters…
• Cross-correlation/convolution is useful for, e.g.,
– Blurring
– Sharpening
– Edge Detection
– Interpolation
• Convolution has a number of nice properties
– Commutative, associative
– Convolution corresponds to product in the Fourier domain
• More sophisticated filtering techniques can often yield superior results for these and other tasks:
– Polynomial (e.g., bicubic) filters
– Steerable filters
– Median filters
– Bilateral filters
Filters
• We will mainly focus on two types of filters:– Smoothing (low-pass)– Sharpening (high-pass)
Smoothing Filters (low-pass)
• Useful for reducing noise and eliminating small details.
• The elements of the mask must be positive.
• Sum of mask elements is 1 (after normalization).
Gaussian
Smoothing filters – Example
smoothed image
input image
• Useful for reducing noise and eliminating small details.
Sharpening Filters (high-pass)
• Useful for highlighting fine details.
• The elements of the mask contain both positive and negative weights.
• Sum of mask elements is 0.
1st derivativeof Gaussian
2nd derivativeof Gaussian
Sharpening Filters (cont’d)
• Useful for highlighting fine details. – e.g., emphasize edges
Sharpening Filters - Example
• Note that the response of sharpening might be negative.
• Values must be re-mapped to [0, 255].
sharpened image
input image
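The re-mapping to [0, 255] mentioned above might be implemented as a simple linear rescaling (one common convention; clipping is another):

```python
import numpy as np

def remap_to_uint8(g):
    """Linearly rescale a filter response (which may contain negative
    values) into the displayable range [0, 255]."""
    g = g.astype(float)
    lo, hi = g.min(), g.max()
    if hi == lo:                      # constant response: avoid divide-by-zero
        return np.zeros(g.shape, dtype=np.uint8)
    return np.round((g - lo) / (hi - lo) * 255).astype(np.uint8)

resp = np.array([[-30.0, 0.0], [60.0, 90.0]])   # made-up sharpening response
print(remap_to_uint8(resp))
```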
Smoothing Filters
• Averaging• Gaussian• Median filtering (non-linear)
Smoothing Filters: Averaging
Smoothing Filters: Averaging (cont’d)
• Mask size determines the degree of smoothing (loss of detail).
3x3 5x5 7x7
15x15 25x25
original
Smoothing Filters: Averaging (cont’d)
15 x 15 averaging image thresholding
Example: extract largest, brightest objects
Smoothing filters: Gaussian
• The weights are Gaussian samples:
mask size is a function of σ:
σ = 1.4
Smoothing filters: Gaussian (cont’d)
• σ controls the amount of smoothing • As σ increases, more samples must be obtained to represent the Gaussian function accurately.
σ = 3
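A sketch of building such a kernel (assuming numpy; the half-width = 3σ rule used here is one common convention for choosing the mask size as a function of σ, not necessarily the one in the slides):

```python
import numpy as np

def gaussian_kernel(sigma):
    """Sample a 2D Gaussian into a normalized mask. As sigma grows,
    more samples are needed to represent the Gaussian accurately,
    so the mask size grows with sigma (half-width = 3*sigma here)."""
    k = int(np.ceil(3 * sigma))
    x = np.arange(-k, k + 1)
    g1 = np.exp(-x**2 / (2.0 * sigma**2))   # 1D Gaussian samples
    G = np.outer(g1, g1)                    # separable: outer product
    return G / G.sum()                      # normalize so weights sum to 1

G = gaussian_kernel(sigma=1.4)
print(G.shape)   # (11, 11): k = ceil(3 * 1.4) = 5
```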
Smoothing filters: Gaussian (cont’d)
Averaging vs Gaussian Smoothing
Averaging
Gaussian
Smoothing Filters: Median Filtering (non-linear)
• Very effective for removing “salt and pepper” noise (i.e., random occurrences of black and white pixels).
averaging
median filtering
Smoothing Filters: Median Filtering (cont’d)
• Replace each pixel by the median in a neighborhood around the pixel.
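A straightforward, unoptimized sketch of this (assuming numpy; leaving the borders unchanged is an arbitrary simplification):

```python
import numpy as np

def median_filter(F, size=3):
    """Replace each pixel by the median of its size x size neighborhood
    (border pixels are simply left unchanged in this sketch)."""
    k = size // 2
    out = F.astype(float).copy()
    for i in range(k, F.shape[0] - k):
        for j in range(k, F.shape[1] - k):
            out[i, j] = np.median(F[i - k:i + k + 1, j - k:j + k + 1])
    return out

# Salt-and-pepper: an isolated outlier is removed entirely, not smeared.
F = np.full((5, 5), 90.0); F[2, 2] = 0.0
out = median_filter(F)
print(out[2, 2])   # 90.0: the median ignores the single outlier
```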
Sharpening Filters
• Unsharp masking• High Boost filter• Gradient (1st derivative)• Laplacian (2nd derivative)
Sharpening Filters: Unsharp Masking
• Obtain a sharp image by subtracting a lowpass filtered (i.e., smoothed) image from the original image:
(original) − (lowpass-filtered) = (sharpened detail, after contrast enhancement)
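A minimal sketch of unsharp masking (assuming numpy; a 3x3 box blur stands in here for whatever lowpass filter is used):

```python
import numpy as np

def box_blur(F, size=3):
    """Mean (lowpass) filter standing in for the smoothing stage;
    borders keep their original values in this sketch."""
    k = size // 2
    out = F.astype(float).copy()
    for i in range(k, F.shape[0] - k):
        for j in range(k, F.shape[1] - k):
            out[i, j] = F[i - k:i + k + 1, j - k:j + k + 1].mean()
    return out

def unsharp_mask(F):
    """Original minus its lowpass-filtered version: what remains is
    mostly the fine detail (high frequencies)."""
    return F.astype(float) - box_blur(F)

F = np.zeros((5, 7)); F[:, 3:] = 90.0   # a vertical step edge
detail = unsharp_mask(F)
print(detail[2, 2], detail[2, 3])       # -30.0 30.0: detail peaks at the edge
```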
Sharpening Filters: High Boost
• Image sharpening emphasizes edges but low frequency components are lost.
• High boost filter: amplify input image, then subtract a lowpass image.
(A−1)(original) + (highpass) = (high-boost result)
Sharpening Filters: High Boost (cont’d)
• If A=1, we get unsharp masking.
• If A>1, part of the original image is added back to the highpass filtered image.
• One way to implement high boost filtering is using the masks below:
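The A·(input) − lowpass formulation can be sketched directly (assuming numpy; a box blur stands in for the lowpass filter, and the A values are arbitrary examples):

```python
import numpy as np

def box_blur(F, size=3):
    """Mean (lowpass) filter; borders keep their original values."""
    k = size // 2
    out = F.astype(float).copy()
    for i in range(k, F.shape[0] - k):
        for j in range(k, F.shape[1] - k):
            out[i, j] = F[i - k:i + k + 1, j - k:j + k + 1].mean()
    return out

def high_boost(F, A=1.0):
    """High boost = A * original - lowpass. A = 1 reduces to unsharp
    masking; A > 1 adds (A - 1) of the original back onto the detail."""
    return A * F.astype(float) - box_blur(F)

F = np.full((5, 5), 100.0)             # a flat (edge-free) image
hb1 = high_boost(F, A=1.0)             # pure unsharp mask: zero response
hb2 = high_boost(F, A=1.9)             # about 0.9 * original is retained
print(hb1[2, 2], hb2[2, 2])
```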
Sharpening Filters: High Boost (cont’d)
A=1.4 A=1.9
Sharpening Filters: Derivatives
• Taking the derivative of an image results in sharpening the image.
• The derivative of an image (i.e., 2D signal) can be computed using the gradient.
Laplacian
The Laplacian (2nd derivative) is defined as:

∇²f = ∂²f/∂x² + ∂²f/∂y²  (the dot product ∇ · ∇f)

Approximate 2nd derivatives:

∂²f/∂x² ≈ f(x + 1, y) + f(x − 1, y) − 2f(x, y)
∂²f/∂y² ≈ f(x, y + 1) + f(x, y − 1) − 2f(x, y)
Laplacian (cont’d)
Laplacian Mask
Edges can be found by detecting the zero-crossings.
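A sketch of the standard 4-neighbor Laplacian approximation (assuming numpy; the slides' own mask is not reproduced here) and the sign change, i.e., zero-crossing, it produces at a step edge:

```python
import numpy as np

def laplacian(F):
    """Discrete Laplacian with the 4-neighbor approximation:
    f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4*f(x,y).
    Border pixels are left at 0."""
    L = np.zeros(F.shape, dtype=float)
    L[1:-1, 1:-1] = (F[2:, 1:-1] + F[:-2, 1:-1] +
                     F[1:-1, 2:] + F[1:-1, :-2] - 4.0 * F[1:-1, 1:-1])
    return L

# A step edge yields a positive/negative response pair; the
# zero-crossing between the two marks the edge location.
F = np.zeros((5, 6)); F[:, 3:] = 90.0
L = laplacian(F)
print(L[2, 2], L[2, 3])   # 90.0 -90.0: sign change across the edge
```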
Example: Laplacian vs Gradient
Laplacian Sobel
FOV (Field of View)