Introduction to Robot Vision
Ziv Yaniv
Computer Aided Interventions and Medical Robotics,
Georgetown University
Vision
The special sense by which the qualities of an object (as color, luminosity, shape, and size) constituting its appearance are perceived through a process in which light rays entering the eye are transformed by the retina into electrical signals that are transmitted to the brain via the optic nerve.
[Merriam-Webster dictionary]
The Sensor
• C-arm X-ray
• endoscope
• webcam
• Single Lens Reflex (SLR) camera
The Sensor
Model: Pin-hole Camera, Perspective Projection
[Figure: pin-hole camera geometry - image plane, optical axis, focal point, and principal point, with camera (x, y, z) and image (x, y) coordinate systems]
Machine Vision
Goal: Obtain useful information about the 3D world from 2D images.
Model:
images → regions, textures, corners, lines, … → 3D geometry, object identification, activity detection, … → actions
Machine Vision
• Low level (image processing):
– image filtering (smoothing, histogram modification, …)
– feature extraction (corner detection, edge detection, …)
– stereo vision
– shape from X (shading, motion, …)
– …
• High level (machine learning/pattern recognition):
– object detection
– object recognition
– clustering
– …
Machine Vision
• How hard can it be?
Robot Vision
1. Simultaneous Localization and Mapping (SLAM)
2. Visual Servoing.
Robot Vision
1. Simultaneous Localization and Mapping (SLAM) – create a 3D map of the world and localize within this map.
NASA stereo vision image processing, as used by the MER Mars rovers
Robot Vision
“Simultaneous Localization and Mapping with Active Stereo Vision”, J. Diebel, K. Reuterswärd, S. Thrun, J. Davis, R. Gupta, IROS 2004.
Robot Vision
2. Visual Servoing – using visual feedback to control a robot:
a) Image-based systems: the desired motion is computed directly from the image.
“An image-based visual servoing scheme for following paths with nonholonomic mobile robots”, A. Cherubini, F. Chaumette, G. Oriolo, ICARCV 2008.
Robot Vision
2. Visual Servoing – using visual feedback to control a robot:
b) Position-based systems: the desired motion is computed from a 3D reconstruction estimated from the image.
System Configuration
• Difficulty of similar tasks in different settings varies widely:
– How many cameras?
– Are the cameras calibrated?
– What is the camera-robot configuration?
– Is the system calibrated (hand-eye calibration)?
Common configurations:
[Figure: coordinate systems of common camera-robot configurations]
System Characteristics
• The greater the control over the system configuration and environment the easier it is to execute a task.
• System accuracy is directly dependent upon model accuracy – what accuracy does the task require?
• All measurements and derived quantitative values have an associated error.
Stereo Reconstruction
• Compute the 3D location of a point in the stereo rig’s coordinate system, given that:
– The rigid transformation between the two cameras is known.
– The cameras are calibrated – given a point in the world coordinate system we know how to map it to the image.
– The same point is localized in the two images.
[Figure: stereo rig - camera 1 and camera 2 related by the transformation T21, observing a point in the world]
Commercial Stereo Vision
Polaris Vicra infra-red system (Northern Digital Inc.)
MicronTracker visible light system (Claron Technology Inc.)
Commercial Stereo Vision
Images acquired by the Polaris Vicra infra-red stereo system (left and right images).
Stereo Reconstruction
• Wide or short baseline – reconstruction accuracy vs. difficulty of point matching.
[Figure: stereo rigs with wide and short baselines between camera 1 and camera 2]
Camera Model
• Points P, p, and O, given in the camera coordinate system, are collinear.

[Figure: camera coordinate system with focal point O, 3D point P = [X, Y, Z], its image p = [x, y, f], and focal length f]

There is a number $\lambda$ for which $\lambda P = p$:

$$\lambda = \frac{f}{Z}, \text{ therefore } \quad x = \frac{fX}{Z}, \quad y = \frac{fY}{Z}$$

In matrix form, using homogeneous coordinates:

$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
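The collinearity derivation above can be sketched numerically; a minimal example, assuming NumPy, with an illustrative focal length and point:

```python
import numpy as np

# Perspective projection with the ideal pin-hole model (focal length f).
f = 2.0

# 3x4 projection matrix applied to a camera-frame homogeneous point.
P_matrix = np.array([[f, 0, 0, 0],
                     [0, f, 0, 0],
                     [0, 0, 1, 0]])

X = np.array([1.0, 2.0, 4.0, 1.0])   # homogeneous 3D point [X, Y, Z, 1]
u, v, w = P_matrix @ X               # homogeneous image point
x, y = u / w, v / w                  # dehomogenize

# Matches the direct formulas x = f*X/Z, y = f*Y/Z:
print(x, y)  # 0.5 1.0
```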
Camera Model
Transform the pixel coordinates from the camera coordinate system to the image coordinate system:
• Image origin (principal point) is at [x0, y0] relative to the camera coordinate system.
• Need to change from metric units to pixels, scaling factors kx, ky:

$$\begin{bmatrix} u' \\ v' \\ w' \end{bmatrix} = \begin{bmatrix} f k_x & 0 & x_0 & 0 \\ 0 & f k_y & y_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

[Figure: image coordinate system with principal point and pixel location [x', y']]

• Finally, the image coordinate system may be skewed, resulting in:

$$\begin{bmatrix} u' \\ v' \\ w' \end{bmatrix} = \begin{bmatrix} f k_x & s & x_0 & 0 \\ 0 & f k_y & y_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
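The metric-to-pixel conversion can be sketched as follows (assuming NumPy; all intrinsic values are hypothetical):

```python
import numpy as np

f, kx, ky = 0.01, 50000.0, 50000.0   # focal length [m], pixels per meter
x0, y0 = 320.0, 240.0                # principal point [pixels]
s = 0.0                              # skew (0 for non-skewed pixel axes)

# Intrinsic matrix combining focal length, scaling, and principal point.
K = np.array([[f * kx, s,      x0],
              [0,      f * ky, y0],
              [0,      0,      1.0]])

X = np.array([0.2, -0.1, 2.0])       # 3D point in the camera frame
u, v, w = K @ X                      # homogeneous pixel coordinates
px, py = u / w, v / w
print(px, py)  # 370.0 215.0
```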
Camera Model
$$K = \begin{bmatrix} f k_x & s & x_0 \\ 0 & f k_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}$$

• As our original assumption was that points are given in the camera coordinate system, a complete projection matrix is of the form:

$$M_{3\times4} = K\,[I_{3\times3} \mid 0_{3\times1}]\begin{bmatrix} R_{3\times3} & -RC \\ 0_{1\times3} & 1 \end{bmatrix} = K\,[R \mid -RC]$$

where C is the camera origin in the world coordinate system.

$$M_{3\times4} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix} = \begin{bmatrix} M_1^T \\ M_2^T \\ M_3^T \end{bmatrix}$$

• How many degrees of freedom does M have?
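A minimal sketch of assembling the full projection matrix and checking that the camera center lies in its null space (K, R, and C below are hypothetical values, assuming NumPy):

```python
import numpy as np

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])

# Hypothetical pose: rotation about the z axis, camera at C in world coords.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
C = np.array([1.0, 2.0, 3.0])

M = K @ np.hstack([R, (-R @ C).reshape(3, 1)])   # M = K[R | -RC]

# M has 12 entries but is defined only up to scale -> 11 degrees of freedom.
# Sanity check: the camera center is the right null space of M (MC = 0).
C_h = np.append(C, 1.0)
print(M @ C_h)   # ~ [0, 0, 0]
```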
Camera Calibration
• Given pairs of points, $p_i^T = [x, y, w]$, $P_i^T = [X, Y, Z, W]$, in homogeneous coordinates we have:

$$p \propto MP$$

• As the points are in homogeneous coordinates, the vectors p and MP are not necessarily equal; they have the same direction but may differ by a non-zero scale factor:

$$p \times MP = 0$$

Our goal is to estimate M.

[Figure: calibration object/world coordinate system, camera coordinate system, and image coordinate system with principal point]
Camera Calibration
• After a bit of algebra we have:

$$A m = \begin{bmatrix} 0^T & -w_i P_i^T & y_i P_i^T \\ w_i P_i^T & 0^T & -x_i P_i^T \\ -y_i P_i^T & x_i P_i^T & 0^T \end{bmatrix}\begin{bmatrix} M_1 \\ M_2 \\ M_3 \end{bmatrix} = 0$$

• The three equations are linearly dependent:

$$A_3 = -\frac{x_i}{w_i} A_1 - \frac{y_i}{w_i} A_2$$

• Each point pair contributes two equations.
• Exact solution: M has 11 degrees of freedom, requiring a minimum of n = 6 pairs.
• Least squares solution: For n > 6, minimize ||Am|| s.t. ||m|| = 1.
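The linear calibration above can be sketched as a short DLT implementation, assuming NumPy and $w_i = W_i = 1$; the synthetic camera and points are hypothetical:

```python
import numpy as np

def calibrate_dlt(image_pts, world_pts):
    """Estimate the 3x4 projection matrix M from n >= 6 point pairs.

    Stacks the two independent equations per pair into A and takes the
    least-squares solution of A m = 0 s.t. ||m|| = 1 via the SVD.
    """
    A = []
    for (x, y), P in zip(image_pts, world_pts):
        Ph = np.append(P, 1.0)                         # homogeneous world point
        A.append(np.concatenate([np.zeros(4), -Ph, y * Ph]))
        A.append(np.concatenate([Ph, np.zeros(4), -x * Ph]))
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)   # right singular vector of smallest sing. value

# Synthetic test: project points with a known M, then recover it.
M_true = np.array([[500.0, 0, 320, 10],
                   [0, 500.0, 240, 20],
                   [0, 0, 1.0, 2]])
world = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                  [1, 1, 0], [1, 0, 1], [0, 1, 1]], dtype=float)
proj = (M_true @ np.c_[world, np.ones(len(world))].T).T
image = proj[:, :2] / proj[:, 2:]

M_est = calibrate_dlt(image, world)
M_est *= M_true[2, 3] / M_est[2, 3]   # fix the arbitrary scale (and sign)
print(np.allclose(M_est, M_true, atol=1e-6))
```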
Obtaining the Rays
• The camera location in the calibration object’s coordinate system, C, is given by the one-dimensional right null space of the matrix M (MC = 0).
• A 3D homogeneous point P = M⁺p (M⁺ the pseudo-inverse of M) is on the ray defined by p and the camera center [it projects onto p: MM⁺p = Ip = p].
• These two points define our ray in the world coordinate system.
• As both cameras were calibrated with respect to the same coordinate system, the rays will be in that system too.
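Both computations can be sketched with NumPy (the projection matrix below is hypothetical):

```python
import numpy as np

M = np.array([[500.0, 0, 320, 10],
              [0, 500.0, 240, 20],
              [0, 0, 1.0, 2]])

# Camera center: the right null space of M (MC = 0), via the SVD.
_, _, Vt = np.linalg.svd(M)
C = Vt[-1]
C = C / C[3]                     # dehomogenize

# A point on the ray through pixel p: P = M^+ p (pseudo-inverse).
p = np.array([320.0, 240.0, 1.0])
P = np.linalg.pinv(M) @ p

# P projects back onto p (up to scale), so C and P define the ray.
back = M @ P
print(back / back[2])            # ~ [320, 240, 1]
```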
Intersecting the Rays
The two rays:

$$r_1(t_1) = a_1 + t_1 n_1, \qquad r_2(t_2) = a_2 + t_2 n_2$$

The parameters of the closest points on the two rays:

$$t_1 = \frac{((a_2 - a_1) \times n_2)^T (n_1 \times n_2)}{\|n_1 \times n_2\|^2}, \qquad t_2 = \frac{((a_2 - a_1) \times n_1)^T (n_1 \times n_2)}{\|n_1 \times n_2\|^2}$$

The reconstructed point is their midpoint:

$$\frac{r_1(t_1) + r_2(t_2)}{2}$$
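The midpoint construction can be sketched as follows (assuming NumPy; the ray origins and directions are hypothetical):

```python
import numpy as np

def intersect_rays(a1, n1, a2, n2):
    """Midpoint of the closest points on two (generally skew) 3D rays
    r_i(t_i) = a_i + t_i * n_i, using the closed-form parameters."""
    cross = np.cross(n1, n2)
    denom = cross @ cross                    # ||n1 x n2||^2
    d = a2 - a1
    t1 = np.cross(d, n2) @ cross / denom
    t2 = np.cross(d, n1) @ cross / denom
    return 0.5 * ((a1 + t1 * n1) + (a2 + t2 * n2))

# Two rays from hypothetical camera centers that meet at [1, 1, 5]:
a1, a2 = np.array([0.0, 0, 0]), np.array([2.0, 0, 0])
n1 = np.array([1.0, 1, 5]) - a1
n2 = np.array([1.0, 1, 5]) - a2
print(intersect_rays(a1, n1, a2, n2))   # ~ [1, 1, 5]
```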
World vs. Model
• Actual cameras most often don’t follow the ideal pin-hole model; they usually exhibit some form of distortion (barrel, pin-cushion, S).
• Sometimes the world changes to fit your model: improvements in camera/lens quality can improve model performance.
Old image-intensifier X-ray (pin-hole + distortion), replaced by flat-panel X-ray (pin-hole).
Additional Material
• Code:
– Camera Calibration Toolbox for Matlab (Jean-Yves Bouguet), http://www.vision.caltech.edu/bouguetj/calib_doc/
• Machine Vision:
– “Multiple View Geometry in Computer Vision”, Hartley and Zisserman, Cambridge University Press.
– “Machine Vision”, Jain, Kasturi, Schunck, McGraw-Hill.
• Robot Vision:
– “Simultaneous Localization and Mapping: Part I”, H. Durrant-Whyte, T. Bailey, IEEE Robotics and Automation Magazine, Vol. 13(2), pp. 99-110, 2006.
– “Simultaneous Localization and Mapping (SLAM): Part II”, T. Bailey, H. Durrant-Whyte, IEEE Robotics and Automation Magazine, Vol. 13(3), pp. 108-117, 2006.
– “Visual Servo Control Part I: Basic Approaches”, IEEE Robotics and Automation Magazine, Vol. 13(4), pp. 82-90, 2006.
– “Visual Servo Control Part II: Advanced Approaches”, IEEE Robotics and Automation Magazine, Vol. 14(1), pp. 109-118, 2007.