2. Filtering Basics
TRANSCRIPT
• Role of Artificial Intelligence and Image Processing in Computer Vision
• Industrial Machine Vision Applications
• System Architecture
• State of the Art
CHAPTER 1
Introduction
• What is computer vision?
The analysis of digital images by a computer
The Three Stages of Computer Vision
Low-Level
blurring
sharpening
Low-Level
Canny
ORT
Mid-Level
original image → edge image
edge image → circular arcs and line segments
data structure
Mid-level
K-means clustering
original color image regions of homogeneous color
(followed by connected component analysis)
data structure
edge image
consistent line clusters
low-level
mid-level
high-level
Low- to High-Level
Building Recognition
System Architecture
Typical Computer Vision System. The figure shows the basic components of a computer vision system. The leftmost component is the scene or object of study. The next component is the sensing device used to collect data from the scene. The third component is the computational device, which computes information such as visual cues and reasons on this information to generate interpretations of the scene, such as the objects present or the actions being performed.
System Architecture
A typical computer vision system involves three components:
1. The scene under study.
2. A sensing device used to collect data from the scene.
3. A computational device that performs the analysis of the scene based on the data from the sensor.
The computational device generates two possible forms of data: information such as visual cues, and interpretations of that information, such as actions being performed or the presence of objects. The two forms of data can each be used to refine the other until the output of the vision system is computed with a predefined amount of certainty. The result can be the 3D location of every pixel within an image, or the certainty that a person is performing jumping jacks. With proper selection of the information and the interpretation algorithm, computer vision systems can be applied in a large number of applications.
Digital Image Terminology:
0  0  0  0  1  0  0
0  0  1  1  1  0  0
0  1 95 96 94 93 92
0  0 92 93 93 92 92
0  0 93 93 94 92 93
0  1 92 93 93 93 93
0  0 94 95 95 96 95
pixel (with value 94)
its 3x3 neighborhood
• binary image
• gray-scale (or gray-tone) image
• color image
• multi-spectral image
• range image
• labeled image
region of medium intensity
resolution (7x7)
Spatial Filtering
Image filtering
Hybrid Images, Oliva et al., http://cvcl.mit.edu/hybridimage.htm
Slides from Steve Seitz and Rick Szeliski
Image filtering
Hybrid Images, Oliva et al., http://cvcl.mit.edu/hybridimage.htm
What is an image?
Images as functions
• We can think of an image as a function, f, from R² to R:
– f(x, y) gives the intensity at position (x, y)
– Realistically, we expect the image only to be defined over a rectangle, with a finite range:
• f: [a, b] × [c, d] → [0, 1]
A color image is just three functions pasted together. We can write this as a “vector-valued” function:

f(x, y) = ( r(x, y), g(x, y), b(x, y) )
Images as functions
What is a digital image?
• We usually work with digital (discrete) images:
– Sample the 2D space on a regular grid
– Quantize each sample (round to nearest integer)
• If our samples are Δ apart, we can write this as:
f[i, j] = Quantize{ f(i Δ, j Δ) }
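The sample-then-quantize recipe above can be sketched in code. This is a rough illustration, not from the slides: the continuous image f and the grid spacing are made-up examples.

```python
import numpy as np

# Hypothetical continuous image: intensity as a function f(x, y) in [0, 1].
def f(x, y):
    return 0.5 * (np.sin(x) * np.cos(y) + 1.0)

def digitize(f, height, width, delta, levels=256):
    """Sample f on a regular grid with spacing delta, then quantize
    each sample to one of `levels` integer gray values:
    f[i, j] = Quantize{ f(i*delta, j*delta) }."""
    ii, jj = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    samples = f(ii * delta, jj * delta)                       # sample on the grid
    return np.round(samples * (levels - 1)).astype(np.uint8)  # quantize

img = digitize(f, height=8, width=8, delta=0.5)
print(img.shape, img.dtype)
```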
Spatial Filtering (cont’d)
• The word “filtering” has been borrowed from the frequency domain.
• Filters are classified as:
– Low-pass (i.e., preserve low frequencies)
– High-pass (i.e., preserve high frequencies)
– Band-pass (i.e., preserve frequencies within a band)
– Band-reject (i.e., reject frequencies within a band)
Spatial Filtering – Neighborhood (or Mask)
• Typically, the neighborhood is rectangular and its size is much smaller than that of f(x,y)
- e.g., 3x3 or 5x5
Filtering noise
•How can we “smooth” away noise in an image?
  0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0
  0   0   0 100 130 110 120 110   0   0
  0   0   0 110  90 100  90 100   0   0
  0   0   0 130 100  90 130 110   0   0
  0   0   0 120 100 130 110 120   0   0
  0   0   0  90 110  80 120 100   0   0
  0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0
Mean filtering
 0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90  0 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0  0  0  0  0  0  0  0
 0  0 90  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0
 0 10 20 30 30 30 20 10
 0 20 40 60 60 60 40 20
 0 30 60 90 90 90 60 30
 0 30 50 80 80 90 60 30
 0 30 50 80 80 90 60 30
 0 20 30 50 50 60 40 20
10 20 30 30 30 30 20 10
10 10 10  0  0  0  0  0
Cross-correlation filtering
• Let’s write this down as an equation. Assume the averaging window is (2k+1)x(2k+1):

G[i, j] = 1/(2k+1)² Σ_{u=−k..k} Σ_{v=−k..k} F[i + u, j + v]

We can generalize this idea by allowing different weights for different neighboring pixels:

G[i, j] = Σ_{u=−k..k} Σ_{v=−k..k} H[u, v] F[i + u, j + v]

This is called a cross-correlation operation and written G = H ⊗ F.
H is called the “filter,” “kernel,” or “mask.”
The above allows negative filter indices. When you implement this, you need to use H[u+k, v+k] instead of H[u, v].
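A sketch of the cross-correlation operation in code (assuming numpy). The shifted indexing H[u+k, v+k] mentioned above shows up here as ordinary array slicing; border handling (leaving edge pixels at 0) is an arbitrary choice for illustration.

```python
import numpy as np

def cross_correlate(F, H):
    """G[i, j] = sum over u, v in [-k, k] of H[u, v] * F[i+u, j+v].
    In code the negative filter indices are shifted, so H is read with
    non-negative indices via slicing. Border pixels are left at 0."""
    k = H.shape[0] // 2
    G = np.zeros(F.shape, dtype=float)
    for i in range(k, F.shape[0] - k):
        for j in range(k, F.shape[1] - k):
            window = F[i - k:i + k + 1, j - k:j + k + 1]
            G[i, j] = np.sum(H * window)
    return G

F = np.zeros((5, 5)); F[2, 2] = 9.0   # a single bright spike
H = np.ones((3, 3)) / 9.0             # (2k+1)x(2k+1) averaging window, k=1
G = cross_correlate(F, H)
print(G[2, 2])                        # the spike is averaged over the window
```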
Mean kernel
• What’s the kernel for a 3x3 mean filter? A 3x3 mask with every weight equal to 1/9.
 0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0 90  0 90 90 90  0  0
 0  0  0 90 90 90 90 90  0  0
 0  0  0  0  0  0  0  0  0  0
 0  0 90  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0
Mean of neighborhood
Effect of filter size
Mean vs. Gaussian filtering
Gaussian filtering
• A Gaussian kernel gives less weight to pixels further from the center of the window
Linear vs Non-Linear Spatial Filtering Methods
• A filtering method is linear when the output is a weighted sum of the input pixels.
• Methods that do not satisfy the above property are called non-linear.
– e.g., median filtering
Linear Spatial Filtering Methods
• Main linear spatial filtering methods:– Correlation– Convolution
Correlation (cont’d)
Often used in applications where we need to measure the similarity between images or parts of images (e.g., template matching).
Convolution
• Similar to correlation except that the mask is first flipped both horizontally and vertically.
Note: if w(i, j) is symmetric, that is w(i, j)=w(-i,-j), then convolution is equivalent to correlation!
g(i, j) = w(i, j) ★ f(i, j) = Σ_{s=−K/2..K/2} Σ_{t=−K/2..K/2} w(s, t) f(i − s, j − t)
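A minimal sketch (assuming numpy) of the flip-then-correlate relationship: convolution equals correlation once the mask is flipped both horizontally and vertically, so the two agree exactly for symmetric masks and differ otherwise.

```python
import numpy as np

def correlate(f, w):
    """Valid-region cross-correlation of image f with mask w."""
    kh, kw = w.shape
    out = np.zeros((f.shape[0] - kh + 1, f.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(w * f[i:i + kh, j:j + kw])
    return out

def convolve(f, w):
    """Convolution = correlation with the mask flipped in both axes."""
    return correlate(f, w[::-1, ::-1])

f = np.arange(25.0).reshape(5, 5)
w_sym = np.ones((3, 3)) / 9.0                   # symmetric: w(i,j) = w(-i,-j)
print(np.allclose(convolve(f, w_sym), correlate(f, w_sym)))    # True

w_asym = np.array([[-1.0, 0.0, 1.0]])           # asymmetric: results differ
print(np.allclose(convolve(f, w_asym), correlate(f, w_asym)))  # False
```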
Image gradient
• How can we differentiate a digital image F[x, y]?
– Option 1: reconstruct a continuous image, f, then take gradient
– Option 2: take discrete derivative (finite difference)
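Option 2 can be sketched as follows (assuming numpy; the forward-difference and cropping conventions are arbitrary choices for illustration):

```python
import numpy as np

def finite_difference_gradient(F):
    """Discrete derivatives via forward finite differences.
    Fx differences neighboring columns (x direction), Fy neighboring
    rows (y direction); both are cropped to a common shape."""
    Fx = F[:, 1:] - F[:, :-1]
    Fy = F[1:, :] - F[:-1, :]
    return Fx[:-1, :], Fy[:, :-1]

# A vertical step edge: the gradient magnitude is large only at the edge.
F = np.zeros((4, 6)); F[:, 3:] = 100.0
Fx, Fy = finite_difference_gradient(F)
mag = np.hypot(Fx, Fy)
print(mag[0, 2], mag[0, 0])   # 100.0 at the edge, 0.0 elsewhere
```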
Image gradient
It points in the direction of most rapid change in intensity
smoothed – original (scaled by 4, offset +128)
original smoothed (5x5 Gaussian)
Why does this work?
Laplacian of Gaussian (2nd derivative operator)
Gaussian delta function
More on filters…
• Cross-correlation/convolution is useful for, e.g.,
– Blurring
– Sharpening
– Edge Detection
– Interpolation
• Convolution has a number of nice properties
– Commutative, associative
– Convolution corresponds to product in the Fourier domain
• More sophisticated filtering techniques can often yield superior results for these and other tasks:
– Polynomial (e.g., bicubic) filters
– Steerable filters
– Median filters
– Bilateral filters
Filters
• We will mainly focus on two types of filters:– Smoothing (low-pass)– Sharpening (high-pass)
Smoothing Filters (low-pass)
• Useful for reducing noise and eliminating small details.
• The elements of the mask must be positive.
• Sum of mask elements is 1 (after normalization).
Gaussian
Smoothing filters – Example
smoothed image
input image
• Useful for reducing noise and eliminating small details.
Sharpening Filters (high-pass)
• Useful for highlighting fine details.
• The elements of the mask contain both positive and negative weights.
• Sum of mask elements is 0.
1st derivativeof Gaussian
2nd derivativeof Gaussian
Sharpening Filters (cont’d)
• Useful for highlighting fine details. – e.g., emphasize edges
Sharpening Filters - Example
• Note that the response of sharpening might be negative.
• Values must be re-mapped to [0, 255].
sharpened image
input image
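The re-mapping to [0, 255] mentioned above might be implemented as a simple linear rescaling (one common convention; clipping is another):

```python
import numpy as np

def remap_to_uint8(g):
    """Linearly rescale a filter response (which may contain negative
    values) into the displayable range [0, 255]."""
    g = g.astype(float)
    lo, hi = g.min(), g.max()
    if hi == lo:                      # constant response: avoid divide-by-zero
        return np.zeros(g.shape, dtype=np.uint8)
    return np.round((g - lo) / (hi - lo) * 255).astype(np.uint8)

resp = np.array([[-30.0, 0.0], [60.0, 90.0]])   # made-up sharpening response
print(remap_to_uint8(resp))
```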
Smoothing Filters
• Averaging• Gaussian• Median filtering (non-linear)
Smoothing Filters: Averaging
Smoothing Filters: Averaging (cont’d)
• Mask size determines the degree of smoothing (loss of detail).
3x3 5x5 7x7
15x15 25x25
original
Smoothing Filters: Averaging (cont’d)
15 x 15 averaging image thresholding
Example: extract largest, brightest objects
Smoothing filters: Gaussian
• The weights are Gaussian samples:
mask size is a function of σ:
σ = 1.4
Smoothing filters: Gaussian (cont’d)
• σ controls the amount of smoothing • As σ increases, more samples must be obtained to represent the Gaussian function accurately.
σ = 3
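A sketch of building such a kernel (assuming numpy; the half-width = 3σ rule used here is one common convention for choosing the mask size as a function of σ, not necessarily the one in the slides):

```python
import numpy as np

def gaussian_kernel(sigma):
    """Sample a 2D Gaussian into a normalized mask. As sigma grows,
    more samples are needed to represent the Gaussian accurately,
    so the mask size grows with sigma (half-width = 3*sigma here)."""
    k = int(np.ceil(3 * sigma))
    x = np.arange(-k, k + 1)
    g1 = np.exp(-x**2 / (2.0 * sigma**2))   # 1D Gaussian samples
    G = np.outer(g1, g1)                    # separable: outer product
    return G / G.sum()                      # normalize so weights sum to 1

G = gaussian_kernel(sigma=1.4)
print(G.shape)   # (11, 11): k = ceil(3 * 1.4) = 5
```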
Smoothing filters: Gaussian (cont’d)
Averaging vs Gaussian Smoothing
Averaging
Gaussian
Smoothing Filters: Median Filtering (non-linear)
• Very effective for removing “salt and pepper” noise (i.e., random occurrences of black and white pixels).
averaging
median filtering
Smoothing Filters: Median Filtering (cont’d)
• Replace each pixel by the median in a neighborhood around the pixel.
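A straightforward, unoptimized sketch of this (assuming numpy; leaving the borders unchanged is an arbitrary simplification):

```python
import numpy as np

def median_filter(F, size=3):
    """Replace each pixel by the median of its size x size neighborhood
    (border pixels are simply left unchanged in this sketch)."""
    k = size // 2
    out = F.astype(float).copy()
    for i in range(k, F.shape[0] - k):
        for j in range(k, F.shape[1] - k):
            out[i, j] = np.median(F[i - k:i + k + 1, j - k:j + k + 1])
    return out

# Salt-and-pepper: an isolated outlier is removed entirely, not smeared.
F = np.full((5, 5), 90.0); F[2, 2] = 0.0
out = median_filter(F)
print(out[2, 2])   # 90.0: the median ignores the single outlier
```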
Sharpening Filters
• Unsharp masking• High Boost filter• Gradient (1st derivative)• Laplacian (2nd derivative)
Sharpening Filters: Unsharp Masking
• Obtain a sharp image by subtracting a lowpass filtered (i.e., smoothed) image from the original image:
(original) − (lowpass-filtered) = (sharpened detail, after contrast enhancement)
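A minimal sketch of unsharp masking (assuming numpy; a 3x3 box blur stands in here for whatever lowpass filter is used):

```python
import numpy as np

def box_blur(F, size=3):
    """Mean (lowpass) filter standing in for the smoothing stage;
    borders keep their original values in this sketch."""
    k = size // 2
    out = F.astype(float).copy()
    for i in range(k, F.shape[0] - k):
        for j in range(k, F.shape[1] - k):
            out[i, j] = F[i - k:i + k + 1, j - k:j + k + 1].mean()
    return out

def unsharp_mask(F):
    """Original minus its lowpass-filtered version: what remains is
    mostly the fine detail (high frequencies)."""
    return F.astype(float) - box_blur(F)

F = np.zeros((5, 7)); F[:, 3:] = 90.0   # a vertical step edge
detail = unsharp_mask(F)
print(detail[2, 2], detail[2, 3])       # -30.0 30.0: detail peaks at the edge
```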
Sharpening Filters: High Boost
• Image sharpening emphasizes edges but low frequency components are lost.
• High boost filter: amplify input image, then subtract a lowpass image.
(A−1)(original) + (highpass) = (high-boost result)
Sharpening Filters: High Boost (cont’d)
• If A=1, we get unsharp masking.
• If A>1, part of the original image is added back to the highpass filtered image.
• One way to implement high boost filtering is using the masks below:
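The A·(input) − lowpass formulation can be sketched directly (assuming numpy; a box blur stands in for the lowpass filter, and the A values are arbitrary examples):

```python
import numpy as np

def box_blur(F, size=3):
    """Mean (lowpass) filter; borders keep their original values."""
    k = size // 2
    out = F.astype(float).copy()
    for i in range(k, F.shape[0] - k):
        for j in range(k, F.shape[1] - k):
            out[i, j] = F[i - k:i + k + 1, j - k:j + k + 1].mean()
    return out

def high_boost(F, A=1.0):
    """High boost = A * original - lowpass. A = 1 reduces to unsharp
    masking; A > 1 adds (A - 1) of the original back onto the detail."""
    return A * F.astype(float) - box_blur(F)

F = np.full((5, 5), 100.0)             # a flat (edge-free) image
hb1 = high_boost(F, A=1.0)             # pure unsharp mask: zero response
hb2 = high_boost(F, A=1.9)             # about 0.9 * original is retained
print(hb1[2, 2], hb2[2, 2])
```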
Sharpening Filters: High Boost (cont’d)
A=1.4 A=1.9
Sharpening Filters: Derivatives
• Taking the derivative of an image results in sharpening the image.
• The derivative of an image (i.e., 2D signal) can be computed using the gradient.
Laplacian
The Laplacian (2nd derivative) is defined as:

∇²f = ∂²f/∂x² + ∂²f/∂y²  (the dot product ∇ · ∇f)

Approximate 2nd derivatives:

∂²f/∂x² ≈ f(x + 1, y) + f(x − 1, y) − 2f(x, y)
∂²f/∂y² ≈ f(x, y + 1) + f(x, y − 1) − 2f(x, y)
Laplacian (cont’d)
Laplacian Mask
Edges can be found by detecting the zero-crossings.
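A sketch of the standard 4-neighbor Laplacian approximation (assuming numpy; the slides' own mask is not reproduced here) and the sign change, i.e., zero-crossing, it produces at a step edge:

```python
import numpy as np

def laplacian(F):
    """Discrete Laplacian with the 4-neighbor approximation:
    f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4*f(x,y).
    Border pixels are left at 0."""
    L = np.zeros(F.shape, dtype=float)
    L[1:-1, 1:-1] = (F[2:, 1:-1] + F[:-2, 1:-1] +
                     F[1:-1, 2:] + F[1:-1, :-2] - 4.0 * F[1:-1, 1:-1])
    return L

# A step edge yields a positive/negative response pair; the
# zero-crossing between the two marks the edge location.
F = np.zeros((5, 6)); F[:, 3:] = 90.0
L = laplacian(F)
print(L[2, 2], L[2, 3])   # 90.0 -90.0: sign change across the edge
```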
Example: Laplacian vs Gradient
Laplacian Sobel
FOV (Field of View)