image formation • camera models - uvic.caaalbu/computer vision 2008/week 2... · {basic...
1
Image formation
Computer Vision, CENG 421/ELEC 536, Spring 2008
2
Course outline
• Image formation
• Camera Models
  • Pinhole Perspective Projection
  • Affine Projection
Reading: Cipolla and Gee, Lecture Notes on Projection, 1999.
3
Digital images
A digital image is a 2D array (matrix) of numbers. Depending on the nature of the image, these numbers may represent:
- light intensities
- colour (wavelength)
- distance
- other physical quantities related to the image acquisition process.
Several types of images are typically used in computer vision:
- Intensity images
  - encode light intensities and colour (R, G, B) channels
  - acquired by digital cameras
- Range images
  - encode shape and distance (3D vision)
  - acquired by special sensors such as sonars, radars and laser scanners
  - used in oceanography, intelligent vehicles, etc.
- Medical images
  - 2D or 3D depending on the acquisition technique
  - ultrasound (2D/3D), CT (3D), MRI (3D), digital X-ray (2D), etc.
- Others
4
Digital images - discussion
- The exact relationship between a digital image and the physical world is determined by the acquisition process, which depends on the sensor used.
- Any information contained in images (e.g. shape-related measurements, relative position of various objects, or object identity) must ultimately be extracted from the 2D numerical arrays in which it is encoded.
- Computer vision algorithms extract the information relevant for the task at hand from the 2D images.
- This chapter investigates the process of image formation in intensity images; the rest of the course is dedicated to computational techniques for the extraction of information from images.
5
Image formation (geometric)
The geometric process is projection. A 3D scene is projected onto a 2D image.
Basic assumptions:
- the 3D scene consists of opaque and reflective objects in a transparent medium (air) with one or more light sources.
Additional assumptions (for more complex scenes and/or tasks):
- for instance, controlled scene lighting is helpful for the removal of cast shadows.
For a sharp image (in focus), all rays coming from a single scene point P must converge to a single point P’ in the image. P’ will be referred to as the image of P.
6
Cameras
- Basic abstraction: the pinhole camera
- Abstract camera model: a box with a small hole in it
- Pinhole cameras work in practice
- The pinhole perspective projection equations were discovered by Brunelleschi in the 15th century
- First pinhole camera: 16th century
- The pinhole model is still the most widely used theoretical camera model
7
Brunelleschi’s experiments
The Baptistry in Florence
http://www.kap.pdx.edu/trow/winter01/perspective/
A peephole in the mirror
A two-mirror system for comparing the painting and the real scene
Painted panel of the baptistry
Final result: a mathematical theory of projection by which 3D space could be rendered on any 2D surface.
8
First pinhole camera
The first published picture of a pinhole camera obscura: a drawing in Gemma Frisius' De Radio Astronomica et Geometrica (1545).
Gemma Frisius (an astronomer) had used the pinhole in a darkened room to study the solar eclipse of 1544.
The term camera obscura ("dark room") was coined by Johannes Kepler (1571–1630).
9
In a pinhole camera, images are formed by the projection of 3D objects.
Figure from US Navy Manual of Basic Optics and Optical Instruments, prepared by Bureau of Naval Personnel. Reprinted by Dover Publications, Inc., 1969.
10
Camera obscura : each point on the image plane sees light from only one direction, the one that passes through the pinhole. The pinhole is the center of projection through which all light passes.
! Perspective projection creates inverted images.
! It is sometimes convenient to consider a virtual image in a plane lying in front of a pinhole and symmetric to the image plane with respect to the pinhole point.
11
Pinhole optics
Using ray-tracing, we see that only a narrow light beam passes through a pinhole
A) In a wide pinhole, light from the source spreads across the image, making it blurry.
B) In a narrow pinhole, only a small amount of light is let in. The image is sharper.
Small apertures require longer exposure times.
The sharpness is limited by diffraction.
12
Pinhole too big: many directions are averaged, blurring the image.
Pinhole too small: diffraction effects blur the image.
Generally, images from pinhole cameras are dark, because a very small set of rays from a particular point hits the screen.
13
Pinhole optics
Pinhole optics focuses images:
- without lenses
- with an infinite depth of field
- Depth of field = the distance between the nearest and farthest objects that appear in acceptably sharp focus in the image.
Small pinhole:
- Better focus
- Less light energy available from any scene point
- The sharpness is limited by diffraction
14
Perspective effect: Far objects appear smaller than close ones
! The image plane is behind the pinhole (inverted images).
15
Pinhole optics: Horizon and vanishing points
The film plane is usually placed in front of the pinhole O (virtual image plane).
Moving the film plane → image scaling.
H is the horizon line. Considering all the possible sets of parallel lines in plane Π, their intersections (vanishing points) lie on the horizon line.
What is the image of a point located on line L?
16
Parallel lines meet at vanishing points
17
Parallel lines meet (cont’d)
• each set of parallel lines in the real 3D world will have a different vanishing point in the image located on the horizon line.
• Also, planes parallel to the ground plane meet in the horizon line.
18
Is this a perspective image of four identical buildings?
Spotting ‘fake’ images
19
The Pinhole Perspective Equation
We associate a coordinate system to the pinhole camera (also known as the camera frame).
The pinhole camera is defined by Oc and by the image plane.
The ZC axis is called the optical axis of the camera.
Point C’ is the image center.
3D scene point P = (Xc, Yc, Zc)^T projects to image point P’ = (x, y, f), where f is the focal distance
Equation of perspective projection is found by analyzing similar triangles
[Figure labels: C’ (image center), P (scene point), P’ (image of P)]
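The similar-triangles argument gives x = f Xc / Zc and y = f Yc / Zc, the form stated later in these slides. The following is a minimal sketch of that equation, assuming NumPy, a virtual image plane in front of the pinhole (so the image is not inverted), and points with Zc > 0. The function name `project_pinhole` and the sample points are illustrative, not part of the course material.

```python
import numpy as np

def project_pinhole(points_cam, f):
    """Project camera-frame points of shape (N, 3) to image coordinates (N, 2).

    Applies x = f * Xc / Zc and y = f * Yc / Zc, assuming the virtual image
    plane lies in front of the pinhole (no image inversion).
    """
    points_cam = np.asarray(points_cam, dtype=float)
    Z = points_cam[:, 2]
    if np.any(Z <= 0):
        raise ValueError("all points must lie in front of the camera (Zc > 0)")
    return f * points_cam[:, :2] / Z[:, None]

# The same lateral offset seen at two depths: the farther point is imaged
# closer to the image center (perspective effect).
pts = np.array([[1.0, 0.5, 2.0],
                [1.0, 0.5, 4.0]])
img = project_pinhole(pts, f=1.0)
print(img)
```

Doubling the depth halves both image coordinates, which is the "far objects appear smaller" effect described earlier.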
20
Properties of perspective projection
Size of the image of an object changes as it translates along the z axis (scaling effect)
Perspective projection is line-preserving
[Figure: a planar scene z = z0 and its image]
$$ m = \frac{P'Q'}{PQ} = \frac{f'}{z_0} $$
- Ratios of lengths are not preserved, except for planar scenes parallel to the image plane
- The focal distance f is an essential parameter of the pinhole camera.
- f small: more world points project onto the finite image. This is called a wide-angle image
- f large: telescopic image
21
Homogeneous coordinates
Add an extra coordinate and use an equivalence relation (for any k ≠ 0):
- for 2D: k*(X,Y,Z) is the same as (X,Y,Z)
- for 3D: k*(X,Y,Z,T) is the same as (X,Y,Z,T)
Basic notion
Possible to represent points “at infinity”:
- where parallel lines intersect
- where parallel planes intersect
Possible to describe the perspective projection as a matrix transform.
22
Rationale for using homogeneous coordinates
Every point in an image corresponds to one incoming light ray: any 3D point along the ray projects to the same image point, so only the direction of the ray is relevant, not the distance of the point along it.
One way to represent incoming ray directions is by their corresponding pixel location: 2 image coordinates (x,y).
Another way is by arbitrarily choosing some 3D point along each ray to represent the ray's direction.
- In this case we need 3 homogeneous coordinates instead of 2 ‘inhomogeneous’ ones to represent each ray. This seems inefficient, but it has the significant advantage of making the image projection process much easier to model.
23
Rationale for homogeneous coordinates (cont’d)
A. Suppose that the camera is at the origin (0,0,0). The ray represented by homogeneous coordinates (X,Y,T) passes through the 3D point (X,Y,T). The 3D point
$$ \lambda (X, Y, T) = (\lambda X, \lambda Y, \lambda T) $$
also lies on (represents) the same ray. Thus, rescaling homogeneous coordinates makes no difference:
$$ (X, Y, T) \approx \lambda (X, Y, T) $$
24
Relationship between homogeneous and inhomogeneous coordinates
Suppose that the image plane of the camera is T=1.
The ray through pixel (x,y) can be represented homogeneously by the vector (x,y,1) ≈ (xT, yT, T) for any depth T>0.
The homogeneous point vector (X,Y,T) with T≠0 corresponds to the inhomogeneous coordinates $(X/T,\; Y/T)$ on the plane T=1.
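The conversion between the two representations can be sketched in a few lines. This is an illustrative sketch assuming NumPy; the helper names `to_homogeneous` and `from_homogeneous` are hypothetical.

```python
import numpy as np

def to_homogeneous(p):
    """Append a final coordinate of 1: (x, y) -> (x, y, 1)."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def from_homogeneous(h):
    """Divide by the last coordinate: (X, Y, T) -> (X/T, Y/T)."""
    h = np.asarray(h, dtype=float)
    if np.isclose(h[-1], 0.0):
        raise ValueError("last coordinate is 0: point at infinity")
    return h[:-1] / h[-1]

h = to_homogeneous([3.0, 4.0])   # (3, 4, 1)
rescaled = 2.5 * h               # (7.5, 10, 2.5): same ray, different scale
print(from_homogeneous(h), from_homogeneous(rescaled))
```

Both calls return the same inhomogeneous point, which is the scale invariance stated above.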
25
Perspective projection revisited
We want to express the projection equation in homogeneous coordinates (HCs).
HCs for a 3D point are [λX λY λZ λ]^T
HCs for its image are [sx sy s]^T
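One way to write this down, as a sketch, takes the perspective equations x = f Xc / Zc and y = f Yc / Zc used elsewhere in these slides and the 3 x 4 matrix that appears in the overall camera model. The subscript c (camera frame) is added here for consistency with the rest of the deck.

```latex
\begin{bmatrix} sx \\ sy \\ s \end{bmatrix}
=
\begin{bmatrix}
f & 0 & 0 & 0 \\
0 & f & 0 & 0 \\
0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} \lambda X_c \\ \lambda Y_c \\ \lambda Z_c \\ \lambda \end{bmatrix}
\quad\Longrightarrow\quad
s = \lambda Z_c,\qquad
x = \frac{sx}{s} = f\,\frac{X_c}{Z_c},\qquad
y = \frac{sy}{s} = f\,\frac{Y_c}{Z_c}
```

The scale factor λ cancels in the division, so the nonlinear projection becomes a linear map between homogeneous vectors.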
26
Homogeneous coordinates and vanishing points
What happens when T=0?
- (X,Y,0) is a valid 3D point defining an optical ray parallel to the plane T=1.
- It has no finite intersection with it!
- Such rays (homogeneous vectors) can no longer be interpreted as finite points of the standard 2D plane.
- They may be considered as ‘ideal’ points, or limits:
  - Points at infinity (vanishing points)
  - Lines at infinity (intersection between two parallel planes) (horizon line)
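The same idea applies to 3D points whose 4th homogeneous coordinate is 0: they encode a direction, and their image is the vanishing point of all lines with that direction. The sketch below assumes NumPy and the standard 3 x 4 perspective matrix with focal distance f. The two lines and the direction vector are made-up values for illustration.

```python
import numpy as np

f = 1.0
# Perspective projection matrix in homogeneous coordinates (camera frame).
Pp = np.array([[f, 0, 0, 0],
               [0, f, 0, 0],
               [0, 0, 1, 0]], dtype=float)

def project(X_h):
    """Project a homogeneous 3D point and return inhomogeneous image coords."""
    s = Pp @ np.asarray(X_h, dtype=float)
    return s[:2] / s[2]

direction = np.array([1.0, 0.0, 2.0])  # common direction of two parallel lines
a0 = np.array([0.0, -1.0, 1.0])        # a point on the first line
b0 = np.array([3.0, -1.0, 1.0])        # a point on the second line

# The point at infinity shared by both lines has 4th coordinate 0.
vanishing = project(np.append(direction, 0.0))

# Far along either line, the projected point approaches the vanishing point.
t = 1e6
far_a = project(np.append(a0 + t * direction, 1.0))
far_b = project(np.append(b0 + t * direction, 1.0))
print(vanishing, far_a, far_b)
```

The vanishing point depends only on the direction of the lines, not on where they start, which is why every set of parallel lines shares one.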
27
Application
Given two parallel planes:
nx Xc + ny Yc + nz Zc = d1
nx Xc + ny Yc + nz Zc = d2, d1 ≠ d2,
in inhomogeneous coordinates, prove that the 4th homogeneous coordinate of every point lying on the horizon line is 0.
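One possible argument, sketched here, writes the homogeneous coordinates of a 3D point as (X, Y, Z, T) with Xc = X/T, Yc = Y/T, Zc = Z/T. That notation is an assumption consistent with the earlier slides.

```latex
% Multiply each plane equation by T to obtain its homogeneous form:
n_x X + n_y Y + n_z Z = d_1 T, \qquad n_x X + n_y Y + n_z Z = d_2 T .
% A point common to both planes satisfies both equations; subtracting gives
(d_1 - d_2)\, T = 0 .
% Since d_1 \neq d_2, this forces
T = 0 .
```

The two planes therefore meet only in points at infinity, and the image of that line at infinity is the horizon line.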
28
Affine projection models: Weak perspective projection
$$ x' = -m x, \qquad y' = -m y, \qquad \text{where } m = -\frac{f'}{z_0} $$
is the magnification.
The weak perspective projection (m = constant for all points in the scene) works when the scene depth is small relative to the average distance from the camera.
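The condition "scene depth small relative to the average distance" can be checked numerically. The sketch below assumes NumPy and positive depths (virtual image plane convention), so both models reduce to a positive scale factor: f/z for perspective and f/z0 for weak perspective, with z0 taken as the mean depth. The function names and the two test scenes are illustrative.

```python
import numpy as np

def perspective(points, f):
    """Full perspective: each point is scaled by f / (its own depth)."""
    points = np.asarray(points, dtype=float)
    return f * points[:, :2] / points[:, 2:3]

def weak_perspective(points, f):
    """Weak perspective: every point is scaled by f / (mean scene depth)."""
    points = np.asarray(points, dtype=float)
    z0 = points[:, 2].mean()
    return (f / z0) * points[:, :2]

def max_relative_error(points, f):
    """Largest coordinate error of weak perspective, relative to image size."""
    exact = perspective(points, f)
    approx = weak_perspective(points, f)
    return np.abs(exact - approx).max() / np.abs(exact).max()

# Depth range 2 at distance 100 (shallow) versus depth range 10 at distance 10.
shallow = np.array([[1.0, 1.0, 99.0], [1.0, -1.0, 101.0]])
deep = np.array([[1.0, 1.0, 5.0], [1.0, -1.0, 15.0]])
print(max_relative_error(shallow, 1.0), max_relative_error(deep, 1.0))
```

For the shallow scene the approximation error is about 1% of the image size; for the deep scene it is about 50%.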
29
Affine projection models: Orthographic projection
$$ x' = x, \qquad y' = y $$
When the camera is at a (roughly constant) distance from the scene, the magnification can be normalized to unit magnitude (|m| = 1).
Unlike other geometric models of image formation, orthographic projection does not involve a reversal of image features.
What is the main difference between orthographic projection on the one hand, and weak perspective and general perspective projection on the other?
30
The projection matrix for orthographic projection
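In homogeneous coordinates (HC):
$$ \begin{bmatrix} U \\ V \\ W \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ T \end{bmatrix} $$
which is equivalent to
$$ x' = x, \qquad y' = y $$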
The focal distance does not influence the image formation process under the assumption of orthographic projection.
31
Image formation with orthographic projection
• Parallel lines in the scene appear as parallel lines in the image
• Lengths of segments parallel to the image plane are preserved by the projection transform; more generally, ratios of lengths of parallel segments are preserved
32
Geometric camera models
Camera models describe the mapping from world to pixel coordinates.
- useful in the process of camera calibration
- can be expressed in either homogeneous or inhomogeneous coordinates
- must account for the following transformations:
  1) Rigid body motion between the camera and the scene
  2) Perspective projection of the 3D real world onto the image plane
  3) CCD imaging: the geometry of the sensor array
33
Rigid body transformation
34
Euclidean, right-handed coordinate systems and vectors of coordinates
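$$ x = \overrightarrow{OP} \cdot \mathbf{i}, \quad y = \overrightarrow{OP} \cdot \mathbf{j}, \quad z = \overrightarrow{OP} \cdot \mathbf{k} \;\Leftrightarrow\; \overrightarrow{OP} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k} \;\Leftrightarrow\; P = \begin{bmatrix} x \\ y \\ z \end{bmatrix} $$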
35
Coordinate changes: pure translation
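$$ \overrightarrow{O_B P} = \overrightarrow{O_B O_A} + \overrightarrow{O_A P} $$
$$ {}^B P = {}^A P + {}^B O_A $$
- basis vectors are parallel to each other
- Origins $O_A \neq O_B$
Convention: ${}^F P$ is the coordinate vector of point P in frame F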
36
Coordinate Changes: Rotation
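The rotation matrix describing the frame (A) in the coordinate system (B)
Rotation about the z axis (sign convention: θ is the counterclockwise angle about the common z axis that takes the axes of (A) onto the axes of (B)):
$$ {}^B_A R = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
$$ {}^B P = {}^B_A R \; {}^A P $$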
37
Coordinate Changes: Pure Rotation
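$$ {}^B_A R = \begin{bmatrix} \mathbf{i}_A \cdot \mathbf{i}_B & \mathbf{j}_A \cdot \mathbf{i}_B & \mathbf{k}_A \cdot \mathbf{i}_B \\ \mathbf{i}_A \cdot \mathbf{j}_B & \mathbf{j}_A \cdot \mathbf{j}_B & \mathbf{k}_A \cdot \mathbf{j}_B \\ \mathbf{i}_A \cdot \mathbf{k}_B & \mathbf{j}_A \cdot \mathbf{k}_B & \mathbf{k}_A \cdot \mathbf{k}_B \end{bmatrix} = \begin{bmatrix} {}^A \mathbf{i}_B^T \\ {}^A \mathbf{j}_B^T \\ {}^A \mathbf{k}_B^T \end{bmatrix} $$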
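The dot-product definition can be sketched directly in code. The example assumes NumPy and assumes that both frames are given by their unit axis vectors expressed in one common reference frame. The function name `rotation_B_from_A` and the 30 degree example are illustrative.

```python
import numpy as np

def rotation_B_from_A(basis_A, basis_B):
    """Return R with R[r, c] = (axis c of frame A) . (axis r of frame B).

    basis_A and basis_B are 3x3 arrays whose ROWS are the unit vectors
    (i, j, k) of each frame, written in a common reference frame.
    """
    basis_A = np.asarray(basis_A, dtype=float)
    basis_B = np.asarray(basis_B, dtype=float)
    return basis_B @ basis_A.T

theta = np.radians(30.0)
c, s = np.cos(theta), np.sin(theta)
frame_A = np.eye(3)                      # A coincides with the reference frame
frame_B = np.array([[c, s, 0.0],         # i_B: A's x axis turned by theta about z
                    [-s, c, 0.0],        # j_B
                    [0.0, 0.0, 1.0]])    # k_B

R = rotation_B_from_A(frame_A, frame_B)
P_in_A = np.array([1.0, 0.0, 0.0])       # the tip of i_A
P_in_B = R @ P_in_A                      # same point, coordinates in frame B
print(R)
print(P_in_B)
```

With frame B turned counterclockwise by θ about z relative to frame A, the result has cos θ on the diagonal, sin θ in the first row and -sin θ in the second, and R is orthonormal.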
39
Perspective projection in inhomogeneous coordinates
$$ x = f \frac{X_c}{Z_c}, \qquad y = f \frac{Y_c}{Z_c} $$
Non-linear
41
CCD imaging
Same scene, same camera viewpoint (external parameters), two different images
CCD imaging refers to the geometry of the CCD array (size and shape of pixels) and its position with respect to the optical axis.
42
Camera model – CCD imaging (inhomogeneous coordinates)
$$ \mathbf{w} = \begin{bmatrix} u \\ v \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix} $$
• The imaging process involves digitization (discrete images)
• Thus, we define a vector w of pixel coordinates, related to the vector x of image coordinates.
• The relationship between x and w depends on the pixel size (aspect ratio).
The pixel size is $\frac{1}{k_u} \times \frac{1}{k_v}$.
Pixel coordinates in terms of image coordinates:
$$ u = u_0 + k_u x, \qquad v = v_0 + k_v y $$
43
Rigid camera motion in homogeneous coordinates
Pr is the rigid body transformation matrix
It is composed of extrinsic parameters and has 6 degrees of freedom (DOF).
44
Perspective projection in homogeneous coordinates
Pp is the projection matrix
45
CCD imaging in homogeneous coordinates
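Equivalently, w = Pc x
Pc is the CCD calibration matrix
$$ \begin{bmatrix} su \\ sv \\ s \end{bmatrix} = \begin{bmatrix} k_u & 0 & u_0 \\ 0 & k_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} sx \\ sy \\ s \end{bmatrix} $$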
46
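Overall mapping from world coordinates to pixel coordinates in homogeneous coordinates
w = Pps X
where Pps = Pc Pp Pr is the camera projection matrix for a perspective camera
Intrinsic camera parameters: Pc Pp; extrinsic camera parameters: Pr
Pps has 10 degrees of freedom, because f, ku and kv are not independent (together they contribute 2 DOF instead of 3).
$$ P_{ps} = P_c P_p P_r = \begin{bmatrix} k_u & 0 & u_0 \\ 0 & k_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R & T \\ \mathbf{0}^T & 1 \end{bmatrix} $$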
47
The camera projection matrix
Is not a general 3 x 4 matrix, but has a special structure composed of Pr, Pp, and Pc.
It can be conveniently written as a product of two matrices
• αu =f ku and αv =f kv are image scaling factors
• ratio αu / αv is known as the aspect ratio
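The sketch below composes the three matrices and checks that their product equals K [R | T], where K holds αu = f ku, αv = f kv, u0 and v0. It assumes NumPy; the numeric values (focal distance, pixel densities, principal point, pose) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def camera_matrix(f, ku, kv, u0, v0, R, T):
    """Compose Pps = Pc @ Pp @ Pr for a perspective camera."""
    Pc = np.array([[ku, 0, u0],
                   [0, kv, v0],
                   [0, 0, 1]], dtype=float)   # CCD calibration
    Pp = np.array([[f, 0, 0, 0],
                   [0, f, 0, 0],
                   [0, 0, 1, 0]], dtype=float)  # perspective projection
    Pr = np.eye(4)                               # rigid body motion
    Pr[:3, :3] = R
    Pr[:3, 3] = T
    return Pc @ Pp @ Pr

def project(P, X_world):
    """Map a 3D world point to pixel coordinates (u, v)."""
    w = P @ np.append(np.asarray(X_world, dtype=float), 1.0)
    return w[:2] / w[2]

# Hypothetical values: f in metres, ku and kv in pixels per metre.
f, ku, kv, u0, v0 = 0.008, 100000.0, 100000.0, 320.0, 240.0
R = np.eye(3)
T = np.array([0.0, 0.0, 2.0])

P = camera_matrix(f, ku, kv, u0, v0, R, T)
K = np.array([[f * ku, 0, u0],
              [0, f * kv, v0],
              [0, 0, 1]])
K_Rt = K @ np.hstack([R, T[:, None]])  # the two-matrix form K [R | T]
pixel = project(P, [0.1, -0.05, 0.0])
print(pixel)
```

Only the products f ku and f kv appear in the result, which is why f, ku and kv contribute two degrees of freedom rather than three.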
48
The projective camera
The perspective camera is a special case of the projective camera, which is described by a general 3 x 4 matrix (11 degrees of freedom).
49
Projective camera versus perspective camera
It is more convenient to work with a projective camera model instead of a perspective one, since we do not have to worry about any nonlinear constraints on the elements of P.
Perspective=special case of projective. Thus, any results derived for the perspective camera will also work for the projective camera.
50
Camera calibration
Estimation of the projection matrix from an image of a controlled scene.
Good images for calibration are grids with patterns of known size.
51
A typical set-up for grid-based calibration
52
Camera calibration
If we use a projective camera model we need to estimate 11 parameters (we can set p34 to 1)
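With p34 fixed to 1, each 3D-to-pixel correspondence gives two linear equations in the 11 remaining unknowns, so at least 6 points in general position are needed. The sketch below solves the resulting system by linear least squares with NumPy. This is one common approach and not necessarily the method used in the course; the function name and the synthetic camera are assumptions.

```python
import numpy as np

def calibrate_projective(world_pts, pixel_pts):
    """Estimate a 3x4 projective camera matrix with p34 fixed to 1.

    From u = (p1 . X) / (p3 . X) and v = (p2 . X) / (p3 . X), each
    correspondence yields two equations that are linear in p11..p33.
    """
    world_pts = np.asarray(world_pts, dtype=float)
    pixel_pts = np.asarray(pixel_pts, dtype=float)
    n = world_pts.shape[0]
    if n < 6:
        raise ValueError("need at least 6 correspondences for 11 unknowns")
    A = np.zeros((2 * n, 11))
    b = np.zeros(2 * n)
    for i in range(n):
        X, Y, Z = world_pts[i]
        u, v = pixel_pts[i]
        A[2 * i] = [X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z]
        A[2 * i + 1] = [0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z]
        b[2 * i], b[2 * i + 1] = u, v
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(p, 1.0).reshape(3, 4)

# Synthetic check with a hypothetical camera and noise-free measurements.
rng = np.random.default_rng(0)
world = rng.uniform(-1.0, 1.0, size=(10, 3))      # non-coplanar points
K = np.array([[800.0, 0, 320.0], [0, 800.0, 240.0], [0, 0, 1.0]])
Rt = np.hstack([np.eye(3), [[0.0], [0.0], [5.0]]])
P_true = K @ Rt
P_true /= P_true[2, 3]                             # normalize so p34 = 1
w = (P_true @ np.hstack([world, np.ones((10, 1))]).T).T
pixels = w[:, :2] / w[:, 2:3]
P_est = calibrate_projective(world, pixels)
```

Fixing p34 = 1 fails for cameras where the true p34 is zero (world origin in the plane through the optical center parallel to the image plane). With real, noisy measurements more points and coordinate normalization would normally be used.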
53
Particular cases: how many parameters do we need to estimate?
2D→2D
1D→1D
54
Recovery of world position
With a calibrated camera, we can attempt to recover the world position of image features
- 1D case (line to line)
- 2D case (plane to plane)
55
Recovery of world position (cont’d)
3D case (3D world to image plane)
We need at least two cameras to determine the position of the world point: Stereo vision
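A single image fixes only the ray through the world point; a second calibrated camera supplies a second ray, and the point lies where they meet. The sketch below uses linear triangulation, a standard technique that the slides do not describe, with NumPy. The two camera matrices and the test point are hypothetical.

```python
import numpy as np

def project(P, X):
    """Pixel coordinates of the 3D point X under the 3x4 camera matrix P."""
    w = P @ np.append(np.asarray(X, dtype=float), 1.0)
    return w[:2] / w[2]

def triangulate(P1, w1, P2, w2):
    """Recover a 3D point from its pixel coordinates in two calibrated views.

    Each view gives two linear constraints on the homogeneous point X:
    u * (p3 . X) - (p1 . X) = 0 and v * (p3 . X) - (p2 . X) = 0.
    The solution is the null vector of the stacked 4x4 system.
    """
    A = np.vstack([
        w1[0] * P1[2] - P1[0],
        w1[1] * P1[2] - P1[1],
        w2[0] * P2[2] - P2[0],
        w2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # right singular vector of the smallest singular value
    return X[:3] / X[3]

K = np.array([[800.0, 0, 320.0], [0, 800.0, 240.0], [0, 0, 1.0]])
P1 = K @ np.hstack([np.eye(3), [[0.0], [0.0], [0.0]]])    # first camera
P2 = K @ np.hstack([np.eye(3), [[-0.5], [0.0], [0.0]]])   # shifted sideways

X_true = np.array([0.2, -0.1, 3.0])
X_rec = triangulate(P1, project(P1, X_true), P2, project(P2, X_true))
print(X_rec)
```

With noise-free pixel coordinates the recovered point matches the original up to floating-point error; with real measurements the two rays do not intersect exactly and the SVD returns a least-squares compromise.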
56
Particular projection cases simplify the calibration process
Orthographic (parallel) projection
- Depth of the objects in the scene is small compared to the distance of the camera to the scene
- f→∞, Zav→∞
Weak perspective
- Scaling according to the average depth of the scene
- Preserves ratios of segments and angles
- Thus preserves parallelism
57
58
Error introduced by the weak perspective approximation
59
Affine camera models
Describe weak perspective and orthographic projection
Error in assigned reading p. 40: parallel projection should be weak perspective
The form of the projection matrix is simplified when using the weak perspective assumption
60
Affine cameras – planar view
The 6 degrees of freedom for an affine planar camera
61
Sensing
Main differences between a modern camera and the camera obscura of the 17th century:
- ability to record the pictures formed in the backplane (photographic film, CCD technology)
- ability to focus the image with lenses
- The pinhole perspective is still considered a convenient mathematical model for camera sensing
62
Lenses
Useful for:
1. Gathering light. Under ideal pinhole projection, a single ray of light reaches each point in the image plane.
2. Sharpening the image
A trade-off between 1 and 2 is possible only when using lenses.
63
Lenses behave according to the laws of geometric optics
- Light travels in straight lines (light rays) in homogeneous media
- Reflection law
- Refraction law: the incident ray, the refracted ray and the normal at the refraction surface are coplanar. Angles obey Snell’s law: n1 sin α1 = n2 sin α2
64
Paraxial (or first-order) optics
Snell’s law:
n1 sin α1 = n2 sin α2
Small angles:
n1 α1 ≈ n2 α2
$$ \frac{n_1}{d_1} + \frac{n_2}{d_2} = \frac{n_2 - n_1}{R} $$
The paraxial refraction equation
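How small is "small" can be checked by comparing the exact refraction angle from Snell's law with its first-order approximation. The sketch below uses only the Python standard library; the refraction indexes (air to glass, n1 = 1.0 and n2 = 1.5) and function names are illustrative assumptions.

```python
import math

def refraction_angle_exact(alpha1, n1, n2):
    """Refracted angle from Snell's law: n1 sin(a1) = n2 sin(a2)."""
    return math.asin(n1 * math.sin(alpha1) / n2)

def refraction_angle_paraxial(alpha1, n1, n2):
    """First-order (paraxial) approximation: n1 a1 = n2 a2."""
    return n1 * alpha1 / n2

def relative_error(alpha1_deg, n1=1.0, n2=1.5):
    """Relative error of the paraxial angle for an incidence angle in degrees."""
    a1 = math.radians(alpha1_deg)
    exact = refraction_angle_exact(a1, n1, n2)
    return abs(refraction_angle_paraxial(a1, n1, n2) - exact) / exact

for deg in (1, 5, 20, 40):
    print(deg, relative_error(deg))
```

The error grows roughly with the square of the angle: it is negligible at 1 degree and reaches a few percent by 40 degrees.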
65
Thin lenses: basic properties
Any ray entering the lens parallel to the axis on one side goes through the focus on the other side.
Any ray entering the lens from the focus on one side emerges parallel to the axis on the other side.
66
Thin Lenses: a ray entering the lens and refracted at its right boundary is immediately refracted again at the left boundary.
$$ x' = z' \frac{x}{z}, \qquad y' = z' \frac{y}{z}, \qquad \text{where } \frac{1}{z'} - \frac{1}{z} = \frac{1}{f} \text{ and } f = \frac{R}{2(n-1)} $$
All rays passing through P are focused by the thin lens on point P’(x’, y’, z’) along PO.
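The thin lens equations can be evaluated directly. The sketch below uses plain Python and assumes a sign convention that the slides do not state explicitly: the lens sits at the origin, the object is at z < 0 and the image forms at z' > 0. The function names and numeric values are illustrative.

```python
def focal_length(R, n):
    """Focal length of a thin lens with radius R and refraction index n."""
    return R / (2.0 * (n - 1.0))

def thin_lens_image(x, y, z, f):
    """Image point (x', y', z') of the scene point (x, y, z).

    Solves 1/z' - 1/z = 1/f for z', then x' = z' x / z and y' = z' y / z.
    Assumed sign convention: z < 0 for objects in front of the lens.
    """
    if z == 0:
        raise ValueError("object lies in the lens plane")
    inv_z_img = 1.0 / f + 1.0 / z
    if inv_z_img == 0:
        raise ValueError("object in the focal plane: image at infinity")
    z_img = 1.0 / inv_z_img
    return z_img * x / z, z_img * y / z, z_img

# A point 1 m in front of a 50 mm lens (all lengths in metres).
x_img, y_img, z_img = thin_lens_image(0.1, 0.2, -1.0, 0.05)
print(x_img, y_img, z_img)
```

The image forms slightly beyond the focal distance (about 52.6 mm) and is inverted, since x' and y' have the opposite sign of x and y.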
67
Thin lenses: depth of field
When a lens focuses on an object at a given distance, all objects at the same distance are sharply focused.
Objects located at different distances are out of focus and theoretically not sharp.
Is it possible to reduce the size of the circle of confusion?
68
Spherical Aberration
Blue region : paraxial zone (small angles), where P corresponds to P’ (called paraxial image)
If the image plane is Π’, then the image of P is a circle of confusion of diameter d’.
The focus plane (dashed) leads to a circle of confusion of minimal diameter.
69
Barrel or pincushion?
70
How to correct (minimize) aberrations?
By aligning several simple lenses with well-chosen shapes (apertures) and refraction indexes: compound lenses
Vignetting: light beams emanating from objects located off-axis are partially blocked by the various apertures of individual lenses.
Brightness drops gradually in the peripheral zones of the image.
71
Vignetting in photography
72
CCD Camera has discrete elements
- Lens collects light rays
- CCD elements replace chemicals of film
- Number of elements less than with film (so far)
73
CCD sensors
- Incoming light is recorded on a small, rectangular piece of silicon, called a charge-coupled device (CCD).
- This silicon wafer is an array of individual light-sensitive cells called photosites. Each photosite corresponds to one picture element, or pixel.
- The CCD photosites sense incoming light through the photoelectric effect; an electron is released when the photosite is hit with a photon of light.
- Electrons emitted within the CCD are fenced within nonconductive boundaries → they remain within the area of the photon strike.
- As long as light is allowed to impinge on a photosite, electrons will accumulate in that pixel.
- When the shutter is closed, the CCD array is unloaded using charge coupling; electrons in each pixel are counted, and the resulting data is displayed and stored as an image.
74
Sensors in CCD cameras (cont’d)
$$ I(r,c) = T \int_{\lambda} \int_{p \in S(r,c)} E(p,\lambda)\, R(p)\, q(\lambda)\, dp\, d\lambda $$
T is the electron collection time.
The integral is computed over the spatial domain S(r,c) of the cell, and over its range of wavelengths.
E is the irradiance (more about it in following courses).
R is the spatial response of the site.
q is the quantum efficiency (how many electrons are generated per unit of incident light energy).
75
You know now
- Digital images can be acquired by a variety of acquisition processes
- Cameras
  - Geometric image formation obeys the laws of perspective projection geometry
  - Pinhole camera
  - Mapping the 3D world onto 2D image coordinates must consider:
    - Camera motion (rigid)
    - Perspective projection
    - CCD imaging
  - Camera calibration is necessary if we want an exact mapping
- Lenses: optical systems that help enhance the brightness and overall quality of the image
- Camera-based image acquisition features systematic distortions and aberrations.