
Bryan Willimon

Master’s Thesis Defense

Interactive Perception for Cluttered Environments

Visually-guided Manipulation (Traditional Approach): Sense → Plan → Act

Manipulation-guided Sensing (“Interactive Perception*”): Act → Sense → Plan

*D. Katz and O. Brock. Manipulating articulated objects with interactive perception. ICRA 2008

Previous Related Work on Interactive Perception

Segmentation through image differencing

P. Fitzpatrick. First Contact: an active vision approach to segmentation. IROS 2003

Learning about prismatic and revolute joints on planar rigid objects

D. Katz and O. Brock. Manipulating articulated objects with interactive perception. ICRA 2008

Goal of Interactive Perception

Pile of Stuff

Separate Object

Classify

Learn

Our Approach

Extraction: Graph-based Segmentation, Stereo Matching, Determining Grasp Point

Classification: Color Histogram Labeling, Skeletonization, Monitoring Object Interaction, Labeling Revolute Joints using Motion

Extraction Process

Graph-based Segmentation*

Separates the image into regions based on features of the pixels (e.g., color)

Breaks apart the foreground and background

Classifies as background any pixel that shares the same color label as a border pixel

Subtracts background to leave only foreground

*P. Felzenszwalb and D. Huttenlocher. Efficient graph-based image segmentation. IJCV 2004
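The border-label rule above can be sketched as follows. This is a minimal illustration, assuming the segmentation step has already produced an integer label image; the function name `foreground_mask` and the toy label image are hypothetical, not taken from the thesis.

```python
import numpy as np

def foreground_mask(labels):
    """Return True for pixels whose region label never appears on the image border."""
    labels = np.asarray(labels)
    # Collect every label that touches one of the four image borders.
    border = np.concatenate([labels[0, :], labels[-1, :], labels[:, 0], labels[:, -1]])
    background_labels = np.unique(border)
    # Any pixel sharing a label with a border pixel is treated as background.
    return ~np.isin(labels, background_labels)

# Toy label image: region 0 touches the border, region 1 does not.
labels = np.zeros((6, 6), dtype=int)
labels[2:4, 2:5] = 1
mask = foreground_mask(labels)
```

An object that itself touches the image border would be discarded by this rule, so the sketch assumes the pile sits fully inside the frame.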

Stereo Matching*

Uses two cameras with slightly different viewpoints to provide a sense of depth

Only depth information from the foreground is considered

Foreground image from previous step is used as a mask to erase any background information

Picking the object on top of the pile minimizes disturbance

*P. Fua. Combining stereo and monocular information to compute dense depth maps that preserve depth discontinuities. IJCAI 1991
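A minimal sketch of the masking step, assuming a dense disparity map is already available and that a larger disparity means a point is closer to the cameras (so, with cameras looking down on the pile, the top object). The function name `top_object_pixel` is hypothetical.

```python
import numpy as np

def top_object_pixel(disparity, foreground):
    """Erase background disparities and return (row, col) of the closest foreground pixel."""
    disparity = np.asarray(disparity, dtype=float)
    foreground = np.asarray(foreground, dtype=bool)
    if not foreground.any():
        raise ValueError("foreground mask is empty")
    # Background pixels are set to -inf so they can never be selected.
    masked = np.where(foreground, disparity, -np.inf)
    row, col = np.unravel_index(np.argmax(masked), masked.shape)
    return int(row), int(col)

disparity = np.array([[9.0, 1.0, 1.0],
                      [1.0, 3.0, 5.0],
                      [1.0, 2.0, 4.0]])
foreground = np.array([[False, False, False],
                       [False, True,  True],
                       [False, True,  True]])
top = top_object_pixel(disparity, foreground)
```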

Determining Grasp Point

Calculate the maximum chamfer* distance within the white area

Use the outline of the white area as the starting point for the chamfering process

Using chamfer distance instead of centroid handles concave objects

*G. Borgefors. Distance transformations in digital images. CVGIP 1986
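The grasp-point rule can be sketched with a two-pass chamfer distance transform. This is an illustrative sketch, assuming 3-4 chamfer weights and treating everything outside the image as background; the names `chamfer_distance` and `grasp_point` are hypothetical.

```python
import numpy as np

def chamfer_distance(mask, a=3, b=4):
    """Two-pass chamfer distance (weights a: orthogonal, b: diagonal) to the nearest background pixel."""
    m = np.pad(np.asarray(mask, dtype=bool), 1)   # padding acts as background
    dist = np.where(m, 4 * m.size, 0).astype(np.int64)
    h, w = dist.shape
    # Forward pass: propagate distances from the top-left.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if dist[y, x]:
                dist[y, x] = min(dist[y, x], dist[y - 1, x - 1] + b, dist[y - 1, x] + a,
                                 dist[y - 1, x + 1] + b, dist[y, x - 1] + a)
    # Backward pass: propagate distances from the bottom-right.
    for y in range(h - 2, 0, -1):
        for x in range(w - 2, 0, -1):
            if dist[y, x]:
                dist[y, x] = min(dist[y, x], dist[y + 1, x + 1] + b, dist[y + 1, x] + a,
                                 dist[y + 1, x - 1] + b, dist[y, x + 1] + a)
    return dist[1:-1, 1:-1]

def grasp_point(mask):
    """Return (row, col) of the foreground pixel farthest from the object outline."""
    dist = chamfer_distance(mask)
    row, col = np.unravel_index(np.argmax(dist), dist.shape)
    return int(row), int(col)

square = np.zeros((7, 7), dtype=bool)
square[1:6, 1:6] = True
point = grasp_point(square)
```

Because background pixels have distance zero, the selected pixel always lies inside the white area, which is not guaranteed for the centroid of a concave shape.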

Classification

Color Histogram Labeling*

Use color values (RGB) of the object to create a 3-D histogram

Each histogram is normalized by number of pixels in object to create a probability distribution

Each histogram is then compared to histograms of previous objects for a match using histogram intersection*

White area is found by using same technique as in graph-based segmentation and used as a binary mask to locate object in image

*M. Swain and D. Ballard. Color indexing. IJCV 1991
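A minimal sketch of the histogram step, assuming 8-bit RGB images and 8 bins per channel (the bin count is an assumption, not stated in the slides). The function names are hypothetical.

```python
import numpy as np

def color_histogram(image, mask, bins=8):
    """3-D RGB histogram of the masked pixels, normalized to sum to 1."""
    pixels = np.asarray(image, dtype=float)[np.asarray(mask, dtype=bool)].reshape(-1, 3)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins), range=((0, 256),) * 3)
    # Dividing by the pixel count turns the counts into a probability distribution.
    return hist / max(len(pixels), 1)

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return float(np.minimum(h1, h2).sum())

red = np.zeros((4, 4, 3), dtype=np.uint8)
red[..., 0] = 255
blue = np.zeros((4, 4, 3), dtype=np.uint8)
blue[..., 2] = 255
everything = np.ones((4, 4), dtype=bool)

h_red = color_histogram(red, everything)
h_blue = color_histogram(blue, everything)
same = histogram_intersection(h_red, h_red)
different = histogram_intersection(h_red, h_blue)
```

A new object would be matched to the stored object with the highest intersection score; how low a score must be before the object counts as unseen is left open here.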

Skeletonization*

Use binary mask from previous step to create a skeleton of the object

Skeleton is a single-pixel-wide thinned representation of the area

Prairie-fire analogy

*G. Bertrand. A parallel thinning algorithm for medial surfaces. PRL 1995
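Iterative thinning of a binary mask can be sketched as below. Note the swap: this uses the well-known Zhang-Suen 2-D thinning scheme as a stand-in, not the Bertrand algorithm cited above, and the function name is hypothetical.

```python
import numpy as np

def zhang_suen_thinning(mask):
    """Peel boundary pixels in two alternating sub-steps until nothing changes."""
    img = np.pad(np.asarray(mask, dtype=np.uint8), 1)
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for y, x in zip(*np.nonzero(img)):
                # Neighbors clockwise, starting from the pixel directly above.
                p2, p3, p4 = int(img[y - 1, x]), int(img[y - 1, x + 1]), int(img[y, x + 1])
                p5, p6, p7 = int(img[y + 1, x + 1]), int(img[y + 1, x]), int(img[y + 1, x - 1])
                p8, p9 = int(img[y, x - 1]), int(img[y - 1, x - 1])
                ring = [p2, p3, p4, p5, p6, p7, p8, p9]
                b = sum(ring)  # number of foreground neighbors
                a = sum(ring[i] == 0 and ring[(i + 1) % 8] == 1 for i in range(8))
                if step == 0:
                    ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                else:
                    ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                if 2 <= b <= 6 and a == 1 and ok:
                    to_delete.append((y, x))
            # Deletions are applied together after each sub-step.
            for y, x in to_delete:
                img[y, x] = 0
            changed = changed or bool(to_delete)
    return img[1:-1, 1:-1].astype(bool)

bar = np.zeros((9, 19), dtype=bool)
bar[2:7, 2:17] = True
skeleton = zhang_suen_thinning(bar)
```

Each pass of the outer loop corresponds to one "burn" of the prairie fire inward from the boundary.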

[Animation frames: skeletonization after iterations 1, 3, 5, 7, 9, 10, 11, 13, 15, 17, and 47]

Monitoring Object Interaction

Use KLT* feature points to track movement of the object as the robot interacts with it

Only feature points on the object are considered; all other points are disregarded

Calculate the distance between each pair of feature points every flength frames (flength = 5)

*C. Tomasi and T. Kanade. Detection and tracking of point features. CMU 1991

Monitoring Object Interaction (cont.)

Idea: Like features keep a constant intra-distance; features from different groups have variable intra-distance

Features were separated into groups by measuring the change in intra-distance after flength frames

If the intra-distance between two features changes by less than a threshold, then they are in the same group

Otherwise, they are in different groups

Separate groups relate to separate parts of an object
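The grouping rule can be sketched as follows, assuming tracked feature positions are available at two times flength frames apart. Merging pairs transitively with union-find is an assumption of this sketch, and the threshold value and function name are hypothetical.

```python
import numpy as np

def group_features(before, after, threshold=2.0):
    """Group features whose pairwise distance changes by less than `threshold`."""
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    d0 = np.linalg.norm(before[:, None] - before[None, :], axis=-1)
    d1 = np.linalg.norm(after[:, None] - after[None, :], axis=-1)
    same_part = np.abs(d1 - d0) < threshold

    n = len(before)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if same_part[i, j]:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]

# Three features on a part that slides right, two on a part that stays put.
before = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
after = [(5, 0), (6, 0), (5, 1), (10, 10), (11, 10)]
groups = group_features(before, after, threshold=0.5)
```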

Labeling Revolute Joints using Motion

For each feature group, create an ellipse that encapsulates all features

Calculate major axis of ellipse using PCA*

End points of major axis correspond to a revolute joint and the endpoint of the extremity

*I. Jolliffe. Principal Component Analysis. Springer 1986
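A minimal sketch of the PCA step for one feature group. As a simplifying assumption, the axis end points are taken as the extreme projections of the features onto the principal direction rather than from an explicit ellipse fit; the function name is hypothetical, and the group needs at least two features.

```python
import numpy as np

def major_axis_endpoints(points):
    """Return the two end points of the principal (major) axis of a 2-D point group."""
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)
    cov = np.cov((pts - center).T)
    # eigh returns eigenvalues in ascending order, so the last vector is the major axis.
    _, eigvecs = np.linalg.eigh(cov)
    axis = eigvecs[:, -1]
    proj = (pts - center) @ axis
    return center + proj.min() * axis, center + proj.max() * axis

end_a, end_b = major_axis_endpoints([(0, 0), (1, 1), (2, 2), (3, 3)])
```

Which of the two end points is the joint and which is the extremity is not decided by PCA alone.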

Labeling Revolute Joints using Motion (cont.)

Using the skeleton, locate intersection points and end points

Intersection points (Red) = Rigid or Non-rigid joints

End points (Green) = Interaction points

Interaction points are locations that the robot uses to “push” or “poke” the object
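One way to locate both kinds of points on a one-pixel-wide skeleton is the crossing number: the count of background-to-foreground transitions around a pixel. Using this criterion is an assumption of the sketch, and the names below are hypothetical.

```python
import numpy as np

# Eight neighbors in clockwise order, starting from the pixel directly above.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def skeleton_keypoints(skeleton):
    """Return (end_points, intersection_points) as lists of (row, col)."""
    sk = np.pad(np.asarray(skeleton, dtype=np.uint8), 1)
    end_points, intersections = [], []
    for y, x in zip(*np.nonzero(sk)):
        ring = [int(sk[y + dy, x + dx]) for dy, dx in OFFSETS]
        crossings = sum(ring[i] == 0 and ring[(i + 1) % 8] == 1 for i in range(8))
        point = (int(y) - 1, int(x) - 1)  # undo the padding offset
        if crossings == 1:
            end_points.append(point)      # a single branch leaves this pixel
        elif crossings >= 3:
            intersections.append(point)   # three or more branches meet here
    return end_points, intersections

plus = np.zeros((5, 5), dtype=bool)
plus[2, :] = True
plus[:, 2] = True
ends, joints = skeleton_keypoints(plus)
```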

Labeling Revolute Joints using Motion (cont.)

Map estimated revolute joint from major axis of ellipse to actual joint in skeleton

In the case of groups with size 1, the revolute joint is labeled to be the closest intersection point

After multiple interactions from the robot, a final skeleton is created with revolute joints labeled (red)
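Mapping an estimated joint onto the skeleton can be sketched as a nearest-point lookup. Euclidean distance in image coordinates is an assumption here, and the function name is hypothetical.

```python
import numpy as np

def snap_to_skeleton_joint(estimate, intersections):
    """Return the skeleton intersection point closest to the estimated joint."""
    pts = np.asarray(intersections, dtype=float)
    dists = np.linalg.norm(pts - np.asarray(estimate, dtype=float), axis=1)
    return tuple(intersections[int(np.argmin(dists))])

joint = snap_to_skeleton_joint((2.4, 2.2), [(2, 2), (10, 10)])
```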

Experiments

Items used for experiments

3 Logitech Quick-Cam Pro webcams (2 for stereo system and 1 for classifying)

PUMA 500 robotic arm (or EZ gripper)

2 areas were used and located near each other for easy use of the robotic arm

One was designated as the extraction table and the other as the classification table

Results

Toys on the floor – PUMA 500

Recycling bin – EZ gripper

Socks and shoes in a hamper – EZ gripper

Results (cont.): Toys on the floor

Final Skeleton used for Classification


Classification Experiment

[Figure: object images numbered 1 to 8]



*Rows = Query image, Columns = Database image



Without use of skeleton



With use of skeleton

Results (cont.): Recycling bin


Without use of skeleton


With use of skeleton

Results (cont.): Socks and Shoes

Only 1 image matched #5; the skeleton could not be used

Comparison of Related Work

Comparing objects of the same type to that of similar work*

Pliers from our results compared to shears in their results*

*D. Katz and O. Brock. Manipulating articulated objects with interactive perception. ICRA 2008

Our approach Their approach

How is our work different?

1. Our approach handles rigid and non-rigid objects

Most of the previous work only considers planar rigid objects

2. We gather more information through interaction, such as a skeleton of the object, color, and movable joints

Other works only look to segment the object or find revolute and prismatic joints

3. Our approach works with cluttered environments

Other works only handle a single object instead of working with multiple items piled together

Conclusion

This is a general approach that can be applied to various scenarios using manipulation-guided sensing

The results demonstrated that our approach provided a way to classify rigid and non-rigid objects and label them for sorting and/or pairing purposes

This approach builds on and exceeds previous work in the scope of “interactive perception”

This approach also provides a way to extract items out of a cluttered area one at a time with minimal disturbance

Applications for this project

Service robots handling household chores

Map-making robot learning about the environment while creating a map of the area

Future Work

Create a 3-D environment instead of a 2-D environment

Modify classification area to allow for interactions from more than 2 directions

Improve the gripper of the robot for more robust grasping

Enhance classification algorithm and learning strategy

Use more characteristics to properly label a wider range of objects

Questions?
