Open Source Computer Vision Tutorial for ICDSC’08
Gary Bradski
Senior Scientist, Willow Garage;
Consulting Prof, Stanford University
1Gary Bradski (c) 2008
Smart Cameras
Rights to original images herein explicitly retained, no redistribution without written permission
OpenCV Book Due out end of Sept. 2008
• This tutorial is just a faint introduction – the new book will teach computer vision and its use via OpenCV
• Learning OpenCV
– Computer Vision with the OpenCV Library, O’Reilly Press.
• Google “OpenCV Amazon”
– http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134
Outline
• Installing OpenCV
• Computer vision motivation
• OpenCV overview
• Programming with OpenCV, getting started
• Some computer vision background
– With program examples
• An object training set program
• Optic flow, segmentation, calibration, stereo, recognition
• Some tips on tricks of the trade
• Whither computer vision and OpenCV
INSTALLING OPENCV
Installing OpenCV
• First, install the files for this course (by key or by CD).
• Open this file.
• Hopefully you’ve preinstalled OpenCV; otherwise see the next two pages and/or installing_opencv_cvs.txt in the files for this course
• Also in the files, there’s FFMPEG_install_MacOSX.txt
OR:
Instructions can be found on the OpenCV wiki, http://opencvlibrary.sourceforge.net/ for Windows, Linux or MacOS.
The release versions of the library are at: http://sourceforge.net/projects/opencvlibrary/
I recommend you get the latest version of OpenCV from the SourceForge CVS repository. If you get the version 1 release from SourceForge, it's OK, but you won't be able to run the stereo demo.
Installing On Windows
For getting the code from CVS:
=============================
- - - WINDOWS - - -
For Windows users, you'll need a CVS program. I recommend TortoiseCVS (http://www.tortoisecvs.org/), which integrates nicely with Windows Explorer.
On Windows, if you want the latest OpenCV from the CVS repository then you'll need to access the CVSROOT directory:
:pserver:anonymous@opencvlibrary.cvs.sourceforge.net:2401/cvsroot/opencvlibrary
For TortoiseCVS, launch the program by finding the directory you want, right-clicking and selecting CVS checkout. When the program launches, put the above :pserver line in the CVSROOT box.
The protocol is Password server (:pserver:)
No protocol parameters
Server is opencvlibrary.cvs.sourceforge.net
Port is probably 2401
Repository folder is /cvsroot/opencvlibrary
Username is anonymous.
After you download the code, you must build it using the MSVC sln or dsw file in the opencv\_make directory.
The alternative for Windows is to go to http://sourceforge.net/project/showfiles.php?group_id=22870 and download the opencv-win release 1.0, which is a self-extracting exe file. You will be able to run all but the stereo-related demos.
Installing on Linux
On Linux, you can get the opencv CVS using the following two commands:
cvs -d:pserver:anonymous@opencvlibrary.cvs.sourceforge.net:/cvsroot/opencvlibrary login
When asked for password, hit return. Then use:
cvs -z3 -d:pserver:anonymous@opencvlibrary.cvs.sourceforge.net:/cvsroot/opencvlibrary co -P opencv
Linux needs other files to build OpenCV. You can use sudo synaptic or sudo apt-get to get the following packages:
* GTK+ 2.x or higher, including headers.
* pkgconfig, libpng, zlib, libjpeg, libtiff, and libjasper with development files.
* Python 2.3, 2.4, or 2.5 with headers installed (developer package).
* libavcodec and the other libav* libraries (including headers) from ffmpeg 0.4.9-pre1 or later.
FFMPEG instructions:
Download ffmpeg from http://ffmpeg.mplayerhq.hu/download.html. You have to build a shared library of the ffmpeg program to use it with other open source programs such as OpenCV. To build and use a shared ffmpeg library:
$> ./configure --enable-shared
$> make
$> sudo make install
You will end up with: /usr/local/lib/libavcodec.so.*, /usr/local/lib/libavformat.so.*, /usr/local/lib/libavutil.so.*, and include files under various /usr/local/include/libav*.
The latest FFMPEG might have changed where it puts its include files, so you might have to:
cd /usr/local/include
sudo mkdir ffmpeg
sudo cp libavcodec/* ffmpeg
sudo cp libavformat/* ffmpeg
sudo cp libavutil/* ffmpeg
FFMPEG is a complete pain due to their licensing and refusal to package
To build OpenCV once you've done all of the above:
$> autoreconf --force
$> ./configure
$> make
$> sudo make install
$> sudo ldconfig
[
Ubuntu modification (if you can't get ffmpeg to work):
================================
* sudo synaptic
o and get gstreamer and install gstreamer files, particularly libgstreamer0.10-dev
* Then configure opencv with ./configure --with-gstreamer as follows:
$> autoreconf --force
$> ./configure --with-gstreamer
$> make
$> sudo make install
$> sudo ldconfig
]
The alternative for Linux (though you'll still need the other files) is to go to http://sourceforge.net/project/showfiles.php?group_id=22870 and download the opencv-linux release 1.0, which is a gz file. Then:
$> tar xvzf OpenCV-1.0.0.tar.gz
$> cd opencv-1.0.0
$> ./configure --prefix=/opencv_library_install_path/opencv-1.0.0
$> make
$> sudo make install
- - - - - -
COMPUTER VISION MOTIVATION
My Motivation
• Bring competent vision (recognition and 3D segmentation) to Robotics
Willow Garage Robot Stanford STAIR Project under Prof. Andrew Ng
Perception Seems easy
• In 1966 Marvin Minsky told an undergraduate to solve perception over the summer…
– Perhaps he forgot to say which summer?
• Why is vision so hard?
Perception is hard
Maybe look for larger features like edges?
But, What’s an Edge?
• Depth discontinuity
• Surface orientation discontinuity
• Reflectance discontinuity (i.e., change in surface material properties)
• Illumination discontinuity (e.g., shadow)
Slide credit: Christopher Rasmussen
To Deal With the Confusion, Your Brain has Rules...
That can be wrong
Color Changes with Lighting, the Brain Compensates
Sometimes Wrong:
The Brain Compensates for Lighting too…
Which square is darker?
Lighting Determines Shape Perception
Don’t believe me?
Perception of surfaces depends on lighting assumptions
Object Perception Depends on Context
[slide credit: Kevin Murphy]
The Brain Assumes 3D Geometry
Perception is ambiguous … depending on your point of view!
The Brain Has a Point of View* and it’s a Tad Strange
* A Cartoon Epistemology: http://cns-alumni.bu.edu/~slehar/cartoonepist/cartoonepist.html
Same size things get smaller, we hardly notice…
Parallel lines meet at a point…
An Infinite World in a Finite Space
Perception must be mapped to a space variant grid
Logarithmic in nature
Steve Lehar
QUESTION: What is beyond the sky?
We live in our own bubbles
The sky outside your head.
(Robots Will Be the Same…)
Vision is hard. Now let’s make some progress
OPENCV OVERVIEW
What is OpenCV?
• Free AND Open Source Computer Vision Library
• Then: Started in 1999 as a project on my C: drive.
• Now: Over 500 functions implementing
– Computer vision, machine learning, image processing and numeric algorithms.
• Portable and very efficient (implemented in C/C++)
– Full Python interface
• Runs on Windows, Linux and Mac OS
• Has BSD license (that is, free for ANY use)
• Available at http://sourceforge.net/projects/opencvlibrary
Other Sites:
Open Source Computer Vision Library “OpenCV” (algorithms)
http://www.intel.com/technology/computing/opencv/index.htm
Intel® Integrated Performance Primitives (optimized routines)
http://www.intel.com/cd/software/products/asmo-na/eng/perflib/ipp/index.htm
OpenCV History
• Motivation was to lower the bar to entry to computer vision and to increase the incentive for higher MIPS out in the market for Intel.
• Timeline:
• Willow Garage (www.willowgarage.com) now supports OpenCV (and a robot OS called ROS) as a means of accelerating peaceful uses of robotics and AI
OpenCV Tends Towards Real Time
License
• Based on BSD license
• Free for commercial or research use
– In whole or in part
– Does not force your code to be open
– You need not contribute back
• We hope you will contribute back; a recent contribution is the C++ wrapper class used for Google Street Maps*
* Thanks to Daniel Filip
OpenCV Structure
• CV: image processing and vision algorithms
• HighGUI: GUI, image and video I/O
• MLL: statistical classifiers and clustering tools
• CXCORE: basic structures and algorithms, XML support, drawing functions
• IPP: fast architecture-specific low-level functions
OpenCV Big Picture:
• General image processing functions
• Machine learning: detection, recognition
• Transforms
• Segmentation
• Features
• Tracking
• Matrix math
• Geometric descriptors
• Utilities and data structures
• Fitting
• Camera calibration, stereo, 3D
• Image pyramids
Where is OpenCV Used?
• Google Maps, Google street view, Google Earth, Books
• Academic and Industry Research
• Safety monitoring (Dam sites, mines, swimming pools)
• Security systems
• Image retrieval
• Video search
• Structure from motion in movies
• Machine vision factory production inspection systems
• Robotics
• On Cell phones
2M downloads
Pictorial Tour of OpenCV
Canny Edge Detector
Hough Transform
Scale Space
void cvPyrDown(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);
void cvPyrUp(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);
Space Variant vision: Log-Polar Transform
Image textures
• Inpainting:
• Removes damage to images; in this case, it removes the text.
Optical Flow
// opencv/samples/c/lkdemo.c
int main(…){
    …
    CvCapture* capture = <…> ? cvCaptureFromCAM(camera_id) : cvCaptureFromFile(path);
    if( !capture ) return -1;
    for(;;) {
        IplImage* frame = cvQueryFrame(capture); // grab the next frame
        if( !frame ) break;
        // … copy and process image
        cvCalcOpticalFlowPyrLK( … );
        cvShowImage( "LkDemo", result );
        c = cvWaitKey(30); // wait up to 30 ms for a key; run at ~20-30fps speed
        if( c >= 0 ) {
            // process key
        }
    }
    cvReleaseCapture(&capture);
}
lkdemo.c, 190 lines (needs camera to run)
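$$ I(x+dx,\, y+dy,\, t+dt) = I(x, y, t); $$
$$ -\frac{\partial I}{\partial t} = \frac{\partial I}{\partial x}\frac{dx}{dt} + \frac{\partial I}{\partial y}\frac{dy}{dt}; $$
$$ G \cdot \partial X = b, \quad \partial X = (\partial x, \partial y), $$
$$ G = \sum \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \quad b = -\sum I_t \begin{bmatrix} I_x \\ I_y \end{bmatrix} $$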
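For a single window of pixels, the Lucas-Kanade step comes down to one 2x2 linear solve: accumulate G from the products of the spatial derivatives, accumulate b from the temporal derivative times the spatial derivatives (with a minus sign), and solve for the flow (u, v). The following is a minimal sketch of that one step only, not the pyramidal implementation in cvCalcOpticalFlowPyrLK. The function name lk_solve_window and the singularity threshold are assumptions made for illustration.

```c
#include <math.h>
#include <stddef.h>

/* Solve the 2x2 Lucas-Kanade system G * (u, v)^T = b for one window.
 * ix, iy, it hold the spatial and temporal image derivatives at each of
 * the n pixels in the window (hypothetical layout for this sketch).
 * Returns 0 on success, -1 if G is (nearly) singular, which happens in
 * textureless or edge-only windows (the aperture problem). */
int lk_solve_window(const float *ix, const float *iy, const float *it,
                    size_t n, float *u, float *v)
{
    double gxx = 0.0, gxy = 0.0, gyy = 0.0, bx = 0.0, by = 0.0;

    for (size_t i = 0; i < n; ++i) {
        gxx += (double)ix[i] * ix[i];
        gxy += (double)ix[i] * iy[i];
        gyy += (double)iy[i] * iy[i];
        bx  -= (double)it[i] * ix[i];   /* b = -sum(It * [Ix, Iy]) */
        by  -= (double)it[i] * iy[i];
    }

    double det = gxx * gyy - gxy * gxy;
    if (fabs(det) < 1e-9)               /* assumed threshold */
        return -1;

    /* Closed-form inverse of the symmetric 2x2 matrix G. */
    *u = (float)(( gyy * bx - gxy * by) / det);
    *v = (float)((-gxy * bx + gxx * by) / det);
    return 0;
}
```

The return code matters in practice: a tracker would normally skip or down-weight windows where the solve fails, which is why good features to track are corners rather than flat regions or straight edges.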
Morphological Operations Examples
• Morphology - applying Min-Max filters and their combinations
Image I
Erosion I⊖B
Dilation I⊕B
Opening I∘B = (I⊖B)⊕B
Closing I•B = (I⊕B)⊖B
Grad(I) = (I⊕B) - (I⊖B)
TopHat(I) = I - (I∘B)
BlackHat(I) = (I•B) - I
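To make the "Min-Max filter" reading concrete, here is a small sketch on a 1D signal: erosion takes the minimum over a sliding window, dilation takes the maximum, and opening is erosion followed by dilation. The function names and the flat window of half-width r are assumptions for illustration; in OpenCV itself the corresponding calls are cvErode, cvDilate and cvMorphologyEx on images.

```c
#include <stddef.h>

/* Erosion: each output sample is the minimum over a window of
 * half-width r centred on it (window clipped at the borders). */
void erode1d(const unsigned char *src, unsigned char *dst, size_t n, size_t r)
{
    for (size_t i = 0; i < n; ++i) {
        size_t lo = (i >= r) ? i - r : 0;
        size_t hi = (i + r < n) ? i + r : n - 1;
        unsigned char m = src[lo];
        for (size_t j = lo + 1; j <= hi; ++j)
            if (src[j] < m) m = src[j];
        dst[i] = m;
    }
}

/* Dilation: same window, but take the maximum. */
void dilate1d(const unsigned char *src, unsigned char *dst, size_t n, size_t r)
{
    for (size_t i = 0; i < n; ++i) {
        size_t lo = (i >= r) ? i - r : 0;
        size_t hi = (i + r < n) ? i + r : n - 1;
        unsigned char m = src[lo];
        for (size_t j = lo + 1; j <= hi; ++j)
            if (src[j] > m) m = src[j];
        dst[i] = m;
    }
}

/* Opening = erosion followed by dilation; tmp is scratch space of n bytes. */
void open1d(const unsigned char *src, unsigned char *tmp, unsigned char *dst,
            size_t n, size_t r)
{
    erode1d(src, tmp, n, r);
    dilate1d(tmp, dst, n, r);
}
```

With r = 1, opening removes bright features narrower than the 3-sample window (an isolated spike) while leaving a 3-sample plateau intact, which is the usual reason to use it for speckle cleanup.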
Distance Transform
• Distance field from edges of objects
Flood Filling
Histogram Equalization
Thresholds
Segmentation
• Pyramid, mean-shift, graph-cut
• Here: Watershed
Contours
Delaunay Triangulation, Voronoi Tessellation
Background Subtraction
Motion Templates (My work with James Davis)
• Object silhouette
• Motion history images
• Motion history gradients
• Motion segmentation algorithm
silhouette MHI MHG
Segmentation, Motion Tracking
• Pose recognition
• Motion segmentation
• Gesture recognition
Tracking with CAMSHIFT
• Control game with head
3D tracking
• Camera Calibration
• View Morphing
• POSIT
Projections
Single Camera Calibration
Now, camera calibration can be done by holding a checkerboard in front of the camera for a few seconds.
And after that you’ll get:
3D view of checkerboard; un-distorted image
Bird’s-Eye View
• Useful for navigation planning and security systems
Stereo Calibration
3D Stereo Vision
• Find Epipolar lines:
• Triangulate points:
• Align images:
• Depth:
ML for Recognition
Machine Learning Library (MLL)
CLASSIFICATION / REGRESSION
CART
Naïve Bayes
MLP (Back propagation)
Statistical Boosting
Random Forests
SVM
K-NN
Face Detector
(Histogram matching)
(Correlation)
Stochastic Discrimination
Logistic
CLUSTERING
K-Means
EM
(Mahalanobis distance)
Agglomerative
TUNING/VALIDATION
Cross validation
Bootstrapping
Variable importance
Sampling methods
K-Means, Mahalanobis
Patch Matching
Gesture Recognition
Gestures: Up, R, L, Stop, OK
Meanshift algorithm used to track; histogram intersection with gradient used to recognize.
Gesture via: gradient histogram* based gesture recognition with tracking.
*Bill Freeman
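Histogram intersection, the matching score mentioned on this slide, is simple enough to show in full: it sums the bin-wise minimum of two histograms. The sketch below assumes both histograms are already normalized to sum to 1 and that a gesture is recognized by picking the stored model with the highest score; the function name is hypothetical and this is not the slide's actual recognizer.

```c
#include <stddef.h>

/* Histogram intersection: sum of bin-wise minima.
 * For two histograms that each sum to 1 the score lies in [0, 1],
 * with 1 meaning the histograms are identical. */
double hist_intersection(const double *h1, const double *h2, size_t bins)
{
    double score = 0.0;
    for (size_t i = 0; i < bins; ++i)
        score += (h1[i] < h2[i]) ? h1[i] : h2[i];
    return score;
}
```

Because only the overlapping mass counts, extra clutter in one histogram lowers the score gracefully instead of dominating it, which is one reason this measure is popular for matching gradient or color histograms.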
Boosting: Face Detection with Viola-Jones Rejection Cascade
PROGRAMMING WITH OPENCV, GETTING STARTED
We’ll now delve into some programming, ending with an object recognition dataset collection algorithm
SOME COMPUTER VISION BACKGROUND
We’ll get the gist of some algorithms and go over the algorithms in the samples directory
Seeing the light … Cameras
• The Ideal Camera is a Pinhole Camera
– In theory it’s a perfect imager
Magnification factor (by similar triangles): $$ x = f \frac{X}{Z} $$
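The similar-triangles relation is a one-liner in code. This sketch projects a single coordinate X at depth Z through an ideal pinhole with focal length f; the function name and the choice to reject points at or behind the pinhole (Z <= 0) are assumptions for illustration.

```c
/* Ideal pinhole projection of one coordinate: x = f * X / Z.
 * f and x share one unit (e.g. pixels); X and Z share another (e.g. metres).
 * Returns 0 on success, -1 if the point is not in front of the camera. */
int pinhole_project(double f, double X, double Z, double *x)
{
    if (Z <= 0.0)
        return -1;
    *x = f * X / Z;
    return 0;
}
```

For example, with f = 500 pixels, a point 0.5 m off-axis at 2 m depth lands 125 pixels from the image centre; doubling the depth halves that offset, which is the "same size things get smaller" effect from the earlier slides.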
PROBLEM: Pinhole Gives Too little light:
SOLUTION: Use a lens
-- Brunelleschi, XVth Century
TRADEOFFS: More light, more problems
Geometrical aberrations
spherical aberration
astigmatism
distortion
coma
Vignetting
Marc Pollefeys
Astigmatism
Different focal length for inclined rays
Marc Pollefeys
Distortion
magnification/focal length different for different angles of inclination
pincushion (tele-photo)
barrel (wide-angle)
Marc Pollefeys
Camera Calibration
• We use a known calibration object (chessboard)
• Collect the image it actually makes
• Correct it to the projection it should make
Camera Calibration
• The projection equation
• Can be written
• Given that these come from images of a plane (it forms something called a Homography), we can stack the points together and
• Solve for the projection equation parameters.
• We then adjust for distortion:
• And solve all over again.
Camera Calibration
• Lens distortion can then be corrected using a fast undistortion lookup map
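A lookup map of this kind can be sketched as follows: for every pixel of the corrected image, apply the distortion model once and store where that pixel should be sampled from in the original (distorted) image. The slides do not state the distortion model, so the code assumes the common radial plus tangential form with coefficients k1, k2, p1, p2; the type and function names are hypothetical. In OpenCV the equivalent pair of calls is cvInitUndistortMap followed by cvRemap.

```c
#include <stddef.h>

/* Radial (k1, k2) and tangential (p1, p2) coefficients of the assumed model. */
typedef struct { double k1, k2, p1, p2; } DistCoeffs;

/* Apply the distortion model to an ideal point (x, y) given in normalised
 * image coordinates (pixel minus principal point, divided by focal length). */
void distort_point(const DistCoeffs *d, double x, double y,
                   double *xd, double *yd)
{
    double r2 = x * x + y * y;
    double radial = 1.0 + d->k1 * r2 + d->k2 * r2 * r2;
    *xd = x * radial + 2.0 * d->p1 * x * y + d->p2 * (r2 + 2.0 * x * x);
    *yd = y * radial + d->p1 * (r2 + 2.0 * y * y) + 2.0 * d->p2 * x * y;
}

/* Fill mapx/mapy (width*height floats each) so that the corrected image
 * at pixel (u, v) is sampled from the distorted image at (mapx, mapy). */
void build_undistort_map(const DistCoeffs *d, double fx, double fy,
                         double cx, double cy, int width, int height,
                         float *mapx, float *mapy)
{
    for (int v = 0; v < height; ++v) {
        for (int u = 0; u < width; ++u) {
            double xd, yd;
            distort_point(d, (u - cx) / fx, (v - cy) / fy, &xd, &yd);
            size_t idx = (size_t)v * (size_t)width + (size_t)u;
            mapx[idx] = (float)(xd * fx + cx);
            mapy[idx] = (float)(yd * fy + cy);
        }
    }
}
```

The map is "fast" because the polynomial is evaluated once per pixel at setup time; correcting each video frame afterwards is only a table lookup plus interpolation.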
Optical flow
• Assumptions
– Brightness constancy
– Temporal persistence (small movements)
– Spatial coherence
• I(x(t),t) = I(x(t+dt),t+dt)
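A short worked derivation shows how these assumptions turn into a velocity estimate in 1D. It is a sketch under the small-motion assumption, where a first-order Taylor expansion is taken to be accurate; the symbols I_x, I_t and v are introduced here for the spatial derivative, the temporal derivative and the velocity.

```latex
% Brightness constancy in 1D
I(x(t), t) = I(x(t+dt),\, t+dt)

% First-order Taylor expansion of the right-hand side (small motion)
I(x + dx,\, t + dt) \approx I(x, t) + I_x\, dx + I_t\, dt

% Substituting and dividing by dt, with v = dx/dt
I_x\, v + I_t = 0
\quad\Longrightarrow\quad
v = -\frac{I_t}{I_x}
```

The estimate is undefined where I_x = 0, that is, in flat image regions; spatial coherence is what allows neighbouring pixels to be pooled so the solve is well conditioned.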
Stereo
• Depth from correspondence is easy in a perfect stereo rig (frontal parallel, row aligned)
Stereo
• But stereo rigs are never perfectly aligned, so we use stereo calibration (chessboards again) to mathematically create the perfect rig.
Stereo
• To get perfect alignment, we align something called epipoles along rows of the left and right images.
Stereo
• We use calibration to find the rotation to align the epipoles
• This means finding the rotation and translation between the left and right cameras
• The math is involved, but similar to single camera calibration
Stereo
• Once the left and right cameras are calibrated internally (intrinsics) and externally (extrinsics), we need to rectify the images
Stereo
• This converts correspondence search to a 1D search along image rows
• We do the search using Sum of Abs. Diff. (SAD) block matching, with various constraints
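The 1D search can be sketched as a brute-force loop: for one pixel in the left image, slide a window along the same row of the right image and keep the shift with the lowest sum of absolute differences. This is a minimal illustration on rectified 8-bit images, without the extra constraints mentioned above and without the optimizations of OpenCV's block matcher; the function name and the convention that a left pixel at column x appears at column x - d in the right image are assumptions.

```c
#include <limits.h>
#include <stdlib.h>

/* Disparity of left pixel (x, y) by SAD block matching along row y.
 * left/right: rectified 8-bit images, row-major, w columns by h rows.
 * r: window half-width; max_disp: largest disparity tried.
 * Returns the best disparity in [0, max_disp], or -1 if the window
 * does not fit inside the image. */
int sad_disparity(const unsigned char *left, const unsigned char *right,
                  int w, int h, int x, int y, int r, int max_disp)
{
    if (x - r < 0 || x + r >= w || y - r < 0 || y + r >= h)
        return -1;

    int best_d = -1;
    long best_cost = LONG_MAX;

    /* Stop early when the shifted window would leave the right image. */
    for (int d = 0; d <= max_disp && x - d - r >= 0; ++d) {
        long cost = 0;
        for (int dy = -r; dy <= r; ++dy) {
            for (int dx = -r; dx <= r; ++dx) {
                int l = left[(y + dy) * w + (x + dx)];
                int rt = right[(y + dy) * w + (x + dx - d)];
                cost += abs(l - rt);
            }
        }
        if (cost < best_cost) {
            best_cost = cost;
            best_d = d;
        }
    }
    return best_d;
}
```

Running this for every pixel gives a dense disparity map; rectification is what makes the inner search a single row instead of a 2D region.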
Stereo
• There is a left-right feature ordering constraint, a texture strength constraint, a max disparity constraint (“Horopter”), and a match strength constraint, all of which add robustness.
Stereo
• Note that disparity is inversely proportional to distance, so we get the most depth resolution close by
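The inverse relation can be written down directly for a rectified rig: depth Z = f T / d, where f is the focal length in pixels, T the baseline between the cameras and d the disparity in pixels. The symbols f and T and the function name are assumptions for illustration, since the slide gives no formula.

```c
/* Depth from disparity for a rectified stereo pair: Z = f * T / d.
 * f_pixels: focal length in pixels; baseline: camera separation (e.g. metres);
 * disparity: pixels. Returns depth in the baseline's unit,
 * or -1.0 if the disparity is not positive. */
double depth_from_disparity(double f_pixels, double baseline, double disparity)
{
    if (disparity <= 0.0)
        return -1.0;
    return f_pixels * baseline / disparity;
}
```

With f = 500 pixels and T = 0.125 m, going from a disparity of 10 to 11 pixels changes the depth by roughly 0.57 m, while going from 50 to 51 pixels changes it by only about 0.025 m: one pixel of disparity buys much finer depth resolution for nearby objects.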
Stereo
• Result is a disparity map (if we don’t know the scale of the calibration object), or a depth map (if we do)
• This can all be done at frame rate.
Optical flow
• Going to 2D is by simple extension
Need for small motion assumption
Kalman Filter, Particle Filter for Tracking
Kalman; Condensation or Particle Filter
Mean-Shift for Tracking
SOME TIPS ON TRICKS OF THE TRADE
A few hints and some speculation
Machine learning
• Features beat algorithms
• Choose an operating point that trades off accuracy vs. cost
[Figure: confusion matrix (TP, FN / FP, TN) and a curve with both axes running to 100%]
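Choosing an operating point can be made concrete with the confusion-matrix counts: each candidate threshold yields one (TP, FN, FP, TN) table, from which the true and false positive rates and an expected cost follow. The sketch below assumes a simple linear cost with separate prices for a miss and a false alarm; the struct and function names, and the cost model itself, are illustrative assumptions.

```c
/* Confusion-matrix counts for one classifier threshold. */
typedef struct { unsigned tp, fn, fp, tn; } Confusion;

/* Fraction of actual positives that were detected. */
double true_positive_rate(const Confusion *c)
{
    unsigned pos = c->tp + c->fn;
    return pos ? (double)c->tp / pos : 0.0;
}

/* Fraction of actual negatives that were wrongly flagged. */
double false_positive_rate(const Confusion *c)
{
    unsigned neg = c->fp + c->tn;
    return neg ? (double)c->fp / neg : 0.0;
}

/* Average cost per sample, given a price for each kind of error. */
double expected_cost(const Confusion *c, double cost_fn, double cost_fp)
{
    unsigned total = c->tp + c->fn + c->fp + c->tn;
    if (total == 0)
        return 0.0;
    return (c->fn * cost_fn + c->fp * cost_fp) / total;
}
```

Sweeping the threshold and keeping the table with the lowest expected cost picks the operating point; when a miss is far more expensive than a false alarm, that point moves toward a higher detection rate at the price of more false positives.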
Debugging ML: Bias, Variance*
• If your model is too strong or too weak, you get characteristic problems, and each calls for a different fix.
* See Andrew Ng’s course notes, http://www.stanford.edu/class/cs229/materials/ML-advice.pdf
Object Recognition, Segmentation
• Want what (recognition) and where (segmentation)
• Quick features with 2D and 3D offsets
– Initial recognition and segmentation
– Rich feature sets
– Make use of lighting effects
• Model based verification 2D and/or 3D
• Better sensors help
– Depth
– Higher dynamic range and sensitivity
– Adaptivity to lighting
WHITHER COMPUTER VISION AND OPENCV
Vision
• Faster computation, 2D and 3D sensors, higher range and sensitivity
– Will make vision recognition and segmentation “work” faster than people think
• Will enable autonomous robotics
– Visual geometry is already fairly mature for image stitching
– Model capture for games, movies, VR
OpenCV
• Now has a full time team of five people developing it again (Willow Garage’s support)
• Will begin regular releases
• Plans
– Even better stereo and multicamera support
– Dealing with 2D+3D active and passive calibration
– Dealing with point clouds
– Much richer 2D and 3D feature set
– Better handling of motion, lighting, color, shadow
Questions?
OpenCV CODE:
http://sourceforge.net/projects/opencvlibrary/
OpenCV WIKI:
http://opencvlibrary.sourceforge.net/
BOOK O’Reilly Late September 2008:
Learning OpenCV: Computer Vision with OpenCV
Willow Garage: http://www.willowgarage.com