![Page 1: 3D Object Recognition Pipeline Kurt Konolige, Radu Rusu, Victor Eruhmov, Suat Gedikli Willow Garage Stefan Holzer, Stefan Hinterstoisser TUM Morgan Quigley,](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649ca45503460f94964217/html5/thumbnails/1.jpg)
3D Object Recognition Pipeline
Kurt Konolige, Radu Rusu, Victor Eruhimov, Suat Gedikli
Willow Garage
Stefan Holzer, Stefan Hinterstoisser
TUM
Morgan Quigley, Stephen Gould
Stanford
Marius Muja
UBC
3D and Object Recognition
•Provides more info than just visual texture
•Good for scale and segmentation
•Verification
Need a good device for 3D info
3D Cameras

| Technology | Examples | Pro/Con |
|---|---|---|
| Stereo | Newcombe, Davison CVPR 2010 | Not dense, smearing; real-time, good resolution. Registration + regularization |
| Stereo + texture | WG device | Pro: dense, real-time, good resolution. Con: short range |
| Laser line scan | STAIR Borg scanner | Pro: dense, most accurate. Con: short range, not real time |
| Structured light | PrimeSense | Pro: dense, real-time, good resolution. Con: short range, ambient light/scene texture |
| Phase shift | SR4, PMD, Canesta | Pro: dense, real-time, medium range. Con: low resolution, low accuracy, gross errors |
| Gated reflectance | 3DV | Pro: dense, real-time. Con: low resolution, low accuracy |

Tabletop manipulation:
• Short range
• High resolution
• High range accuracy
• Real-time
WG Projected Texture Stereo Device
• Paint the scene with texture from a projector
  • vs. single camera with structured light
• Advantages:
  • Simple projector
  • Standard algorithms
  • Full frame rates (640x480)
  • Dynamic scenes
WG Projected Texture Device
Projector
• Red LED
• Eye safe
• Synchronized to cameras
3D Fly-thru
Object Recognition Pipeline
•Textured objects via keypoints [Victor Eruhimov, Suat Gedikli]
•Untextured objects via DOT [Stefan Holzer, Stefan Hinterstoisser]
•Simple 3D model matching [Marius Muja]
•STAIR 2D/3D features [Stephen Gould]
Pre-filter → Detect → Verify
MOPED – Textured object recognition with pose
•Model: Stereo view of an object at a known pose
•Extract keypoints and features
•For a new scene, match keypoints to each model
•Run SfM geometric check to verify and recover pose
Torres, Romea, Srinivasa ICRA 2010
- Need texture
- Need high-res camera
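
The keypoint-matching step of the textured-object approach can be illustrated with a small sketch. The slides do not say how matches are selected; the nearest-neighbor ratio test used here is an assumption, as are the function name `match_keypoints`, the `ratio` parameter, and the toy 3-D descriptors. The geometric check and pose recovery are not shown.

```python
import math

def match_keypoints(scene_desc, model_desc, ratio=0.8):
    """Return (scene_index, model_index) pairs that pass a ratio test.

    Hypothetical helper: the descriptors are plain tuples of floats.
    """
    matches = []
    for i, d in enumerate(scene_desc):
        # Distance from this scene descriptor to every model descriptor.
        dists = sorted((math.dist(d, m), j) for j, m in enumerate(model_desc))
        if len(dists) < 2:
            continue
        (best, j), (second, _) = dists[0], dists[1]
        # Keep the match only if it is clearly better than the runner-up.
        if best < ratio * second:
            matches.append((i, j))
    return matches

# Toy descriptors (assumed data, not from the slides).
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
scene = [(0.9, 0.1, 0.0), (0.5, 0.5, 0.0), (0.0, 0.1, 0.9)]
matches = match_keypoints(scene, model)
print(matches)
```

The middle scene descriptor is equally close to two model descriptors, so it is rejected as ambiguous; only unambiguous matches would be passed on to a geometric check.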
Dominant Orientation Templates (DOT)
Stefan Hinterstoisser, Stefan Holzer (TUM; CVPR 2010, ECCV 2010)
● DOT is a template-matching-based approach
[Figure panels: template, current scene]
- The template is slid over the image to compute the response at each image position
- If the response is above a threshold, it is considered a detection of the template
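
The slide-and-threshold idea can be sketched in a few lines. This is a generic illustration, not the DOT implementation: the response used here (fraction of template cells that equal the image cell) and the names `match_template`, `image`, and `template` are assumptions, with integer cells standing in for whatever per-pixel label a real template stores.

```python
def match_template(image, template, threshold):
    """Slide the template over the image; return (row, col, score) detections."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    detections = []
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            # Response: fraction of template cells that equal the image cell.
            hits = sum(
                image[r + dr][c + dc] == template[dr][dc]
                for dr in range(th)
                for dc in range(tw)
            )
            score = hits / (th * tw)
            # A response above the threshold counts as a detection.
            if score > threshold:
                detections.append((r, c, score))
    return detections

# Toy data (assumed): the template appears once, at row 1, column 1.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]
detections = match_template(image, template, threshold=0.9)
print(detections)
```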
DOT – Basic Principle
● DOT uses gradients instead of color or gray values
[Figure panels: template, current scene]
- Gradients are less sensitive to illumination changes
- Gradients have orientation and magnitude
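
A small numeric sketch shows why gradient orientation is robust to illumination. It assumes central differences on a grayscale grid and a simple gain-plus-offset brightness change; the function name `gradients` and the toy image are illustrative only.

```python
import math

def gradients(image):
    """Central-difference gradient per interior pixel.

    Returns {(row, col): (magnitude, orientation_in_degrees)}.
    """
    out = {}
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            magnitude = math.hypot(gx, gy)
            orientation = math.degrees(math.atan2(gy, gx))
            out[(r, c)] = (magnitude, orientation)
    return out

image = [
    [10, 10, 10],
    [10, 20, 30],
    [10, 30, 50],
]
# Simulated illumination change: gain of 2 and offset of 5 (assumed model).
brighter = [[2 * v + 5 for v in row] for row in image]

mag, ori = gradients(image)[(1, 1)]
mag_b, ori_b = gradients(brighter)[(1, 1)]
print(ori, ori_b)   # orientation is unchanged
print(mag, mag_b)   # magnitude scales with the gain
```

Under this model the offset cancels in the differences and the gain scales both components equally, so the orientation stays the same while the magnitude doubles.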
Offline Learning
● Good learning is necessary to reduce the false-positive rate
● We try to use all available information to segment the object:
  ● The point cloud from narrow stereo is used to detect the table and segment the point cloud of the object
  ● The object point cloud is used to create an initial mask
  ● The mask is refined using GrabCut (see OpenCV)
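
The slide does not say how the table is detected. As a minimal sketch, assuming the table is the dominant plane in the cloud and fitting it with a basic RANSAC loop, the object points are whatever remains off that plane. All names (`plane_from_points`, `segment_object`, `tolerance`) and the synthetic cloud are hypothetical; mask creation and GrabCut refinement are not shown.

```python
import random

def plane_from_points(p, q, r):
    """Plane (unit normal n, offset d) with n.x + d = 0, or None if degenerate."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    n = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:
        return None  # the three points are (nearly) collinear
    n = tuple(c / norm for c in n)
    d = -sum(nc * pc for nc, pc in zip(n, p))
    return n, d

def signed_distance(plane, point):
    n, d = plane
    return sum(nc * pc for nc, pc in zip(n, point)) + d

def segment_object(points, iterations=200, tolerance=0.01, seed=0):
    """Fit the dominant plane with RANSAC; return (plane, off-plane points)."""
    rng = random.Random(seed)
    best_plane, best_count = None, -1
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        count = sum(abs(signed_distance(plane, p)) <= tolerance for p in points)
        if count > best_count:
            best_plane, best_count = plane, count
    # Everything farther than the tolerance from the plane is kept as object.
    objects = [p for p in points if abs(signed_distance(best_plane, p)) > tolerance]
    return best_plane, objects

# Synthetic scene (assumed): a flat table at z = 0 with a small box on it.
table = [(0.1 * i, 0.1 * j, 0.0) for i in range(10) for j in range(10)]
box = [(0.4 + 0.1 * i, 0.4 + 0.1 * j, 0.05 + 0.05 * k)
       for i in range(2) for j in range(2) for k in range(2)]
plane, object_points = segment_object(table + box)
print(len(object_points))
```

Projecting `object_points` into the image would give the kind of initial mask that a refinement step such as GrabCut could then tighten.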
False-Positive Rejection
● Two more precise templates for validation:
  ● a more precise, non-discretized gradient template
  ● a disparity template to compare expected with real disparities
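
One way to picture the disparity check is a per-pixel comparison between the disparities stored with the template and those measured at the detection. The slide does not give the error measure; the mean absolute difference, the `invalid` marker for missing stereo matches, and all values below are assumptions.

```python
def disparity_error(expected, observed, invalid=-1):
    """Mean absolute disparity difference over pixels valid in both maps."""
    diffs = [
        abs(e - o)
        for row_e, row_o in zip(expected, observed)
        for e, o in zip(row_e, row_o)
        if e != invalid and o != invalid
    ]
    # With no overlapping valid pixels there is no evidence; treat as a mismatch.
    return sum(diffs) / len(diffs) if diffs else float("inf")

# Toy disparity patches in pixels (assumed data); -1 marks a missing value.
expected = [[32, 32, 30], [32, 31, -1]]
true_hit = [[32, 33, 30], [31, 31, 29]]
false_hit = [[12, 12, 11], [-1, 12, 12]]

err_true = disparity_error(expected, true_hit)
err_false = disparity_error(expected, false_hit)
print(err_true, err_false)
```

A detection whose error exceeds some chosen limit would be rejected as a false positive; the limit itself would have to be tuned and is not given in the slides.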
False-Positive Rejection
● Compute the error between the reference point cloud and the point cloud at the detected position
● Optimize the initial 3D point cloud pose given by the detection
● Directly gives the object pose if the model is associated with learned point clouds
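
The point-cloud error can be sketched as a nearest-neighbor residual. The slide does not define the error; the RMS of nearest-neighbor distances used here is an assumption, the brute-force search is only suitable for small clouds, and the pose optimization step is not shown. The names `cloud_rms_error`, `reference`, `shifted`, and `far_away` are hypothetical.

```python
import math

def cloud_rms_error(reference, observed):
    """RMS distance from each observed point to its nearest reference point."""
    squared = [min(math.dist(p, q) for q in reference) ** 2 for p in observed]
    return math.sqrt(sum(squared) / len(squared))

# Toy clouds (assumed data): a reference model and two candidate detections.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
shifted = [(x + 0.01, y, z) for x, y, z in reference]   # slightly misaligned
far_away = [(x + 5.0, y, z) for x, y, z in reference]   # wrong location

err_good = cloud_rms_error(reference, shifted)
err_bad = cloud_rms_error(reference, far_away)
print(err_good, err_bad)
```

A pose refinement would repeatedly adjust the detected pose to lower this kind of residual, and a residual that stays large suggests a false positive.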
STAIR Vision Library (SVL)
Stanford STAIR project [Andrew Ng, Stephen Gould]
• Initially developed to support the Stanford AI Robot (STAIR) project
• Builds on top of the OpenCV computer vision library and the Eigen matrix library
• Provides a range of software infrastructure for
  • computer vision
  • machine learning
  • probabilistic graphical models
• Hosted on SourceForge
Object Detection in SVL
• Sliding-window object detector
• Features are extracted from a local window
• Learned boosted decision-tree classifier scores each window
• Image is scanned at multiple resolutions to detect objects at different scales
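
The multi-resolution scan can be sketched as a generator of candidate windows. This is not SVL code: the function name `sliding_windows` and the window size, stride, and scale step are assumed values chosen for illustration, and the classifier that would score each window is left out.

```python
def sliding_windows(width, height, window=32, stride=16, scale_step=1.5):
    """Yield (x, y, w, h) boxes in original-image coordinates.

    The image is conceptually shrunk by `scale` at each level while the
    window size stays fixed, so larger objects are found at coarser levels.
    """
    scale = 1.0
    while width / scale >= window and height / scale >= window:
        scaled_w, scaled_h = int(width / scale), int(height / scale)
        for y in range(0, scaled_h - window + 1, stride):
            for x in range(0, scaled_w - window + 1, stride):
                # Map the fixed-size window back to the full-resolution image.
                yield (round(x * scale), round(y * scale),
                       round(window * scale), round(window * scale))
        scale *= scale_step

boxes = list(sliding_windows(64, 48))
print(len(boxes), boxes[-1])
```

Each yielded box would be cropped, converted to a feature vector, and scored; boxes scoring above a threshold become detections at that scale.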
Image Channels
• Image decomposed into multiple channels
• Depth at each pixel, obtained from a laser scanner, can be thought of as an additional channel
[Figure panels: intensity image, edge map, depth map]
[Quigley et al., ICRA 2009]
Object Detection Features
• Learn a “patch” dictionary over intensity, edge and depth channels
• Patches encode localized templates for matching
• Depth patches capture shape; intensity and edge patches capture appearance
• Patch responses (over entire dictionary) are combined to form the feature vector
[Quigley et al., ICRA 2009]
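
How patch responses might be combined into a feature vector can be sketched as follows. The slides do not specify the matching score; the sum of squared differences used here is a stand-in, and the names `patch_response` and `feature_vector`, the channel names, and the toy values are all assumptions.

```python
def patch_response(channel, patch):
    """Smallest sum of squared differences of the patch over the window."""
    ph, pw = len(patch), len(patch[0])
    best = float("inf")
    for r in range(len(channel) - ph + 1):
        for c in range(len(channel[0]) - pw + 1):
            ssd = sum(
                (channel[r + i][c + j] - patch[i][j]) ** 2
                for i in range(ph)
                for j in range(pw)
            )
            best = min(best, ssd)
    return best

def feature_vector(window_channels, dictionary):
    """One response per dictionary entry; each entry is (channel_name, patch)."""
    return [patch_response(window_channels[name], patch)
            for name, patch in dictionary]

# Toy window with two channels and a three-entry dictionary (assumed data).
window = {
    "intensity": [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    "depth": [[5, 5, 5], [5, 9, 9], [5, 9, 9]],
}
dictionary = [
    ("intensity", [[5, 6], [8, 9]]),
    ("depth", [[9, 9], [9, 9]]),
    ("depth", [[0, 0], [0, 0]]),
]
features = feature_vector(window, dictionary)
print(features)
```

The resulting vector has one entry per dictionary patch, regardless of channel, which is the form a boosted classifier can consume.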
Results
• 150 images of cluttered indoor scenes
• 5-fold cross-validation
• Depth information provides significant improvement in area under precision-recall curve
[Quigley et al., ICRA 2009]
[Figure panels: 8% improvement, 3% improvement, 38% improvement]
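
The evaluation protocol can be illustrated with a short sketch. The slides do not state how the area under the precision-recall curve was computed or how folds were assigned; the step-wise sum (average precision) and the round-robin fold split below are assumptions, and the scores and labels are made-up values, not results from the cited work.

```python
def average_precision(scores, labels):
    """Area under the precision-recall curve as a step-wise sum."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    tp = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            tp += 1
            # Each true positive adds a recall step of 1/total_pos
            # at the precision reached at this rank.
            area += (tp / rank) / total_pos
    return area

def k_fold_indices(n, k=5):
    """Round-robin split of n item indices into k test folds."""
    return [list(range(i, n, k)) for i in range(k)]

# Made-up detector scores and ground-truth labels (1 = object present).
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 1, 0]
ap = average_precision(scores, labels)

folds = k_fold_indices(150, k=5)  # 150 images, 5 folds
print(round(ap, 4), [len(f) for f in folds])
```

With 150 images and 5 folds, each fold holds out 30 images for testing while the remaining 120 are used for training.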
Conclusions
•Real-time, accurate 3D devices are becoming available
•3D can help in object detection for untextured objects
- Combo of visual and 3D features best
•3D is useful for verification
•Check out the PR2 Grasping Demo!