implicit 3d orientation learning for 6d object detection from rgb...

30
Implicit 3D Orientation Learning for 6D Object Detection from RGB Images Martin Sundermeyer, Zoltan-Csaba Marton, Maximilian Durner, Manuel Brucker, Rudolph Triebel

Upload: others

Post on 27-Jun-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Implicit 3D Orientation Learning for 6D Object Detection from RGB

Images

Martin Sundermeyer, Zoltan-Csaba Marton, Maximilian Durner, Manuel Brucker, Rudolph Triebel

Page 2: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

MOTIVATION

Robot Manipulation Tasks:

Page 3: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

MOTIVATION

Augmented Reality:

Page 4: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Objective :

❖ To learn 6D pose (R, t) of object from RGB images.

❖ To overcome common challenges in this field:

➢ Can you guess some of the common issues??

Page 5: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Problem with the Data:

● Pose annotation is complex and expensive.

Alternative:

● Train on Synthetic views of a 3D model.

CONS ??

Page 6: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Pose Ambiguity:

One-to-many mappings in a Supervised pipeline.

Page 7: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

RELATED WORK:

1. Depth Based Methods:Vidal et.al - Current State-of-the-art method using Point Pair features. CON: Computationally expensive evaluation of many pose hypotheses.

2. Learning Based Methods from images:Tekin et. al, “Real-Time Seamless Single Shot 6D Object Pose Prediction”CON: Restricted to pose-annotated datasets.

Page 8: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

CHALLENGES: Simulation to Reality

AIM: To bridge the domain gap and improve generalizability to real images.

1) Photo-Realistic Renderings: Works for simple environments, for viewpoint estimation.CON: Requires a lot of effort, end result still imperfect.

[34] Hao Su et. al

Page 9: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

CHALLENGES: Simulation to Reality

AIM: To bridge the domain gap and improve generalizability to real images.

2) Domain Adaptation: Use of GANs to generate training images, can generalize to unseen images. CON: Training is fragile, depends on good initialization.

[3] Bousmalis et. al

Page 10: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

CHALLENGES: Simulation to Reality

AIM: To bridge the domain gap and improve generalizability to real images.

3) Domain Randomization: Train on rendered views in a variety of semi-realistic settings.CON: Treats it as a Classification Task. Still a bit slow.

[36] Tobin et. al

Page 11: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

CHALLENGES: Learning 3D Orientation

AIM: To overcome the difficulties of training with fixed SO(3) parameterizations.

2) Classification: Requires a discretization of SO(3). Even rather coarse intervals of ∼ 5o lead to over 100 possible classes.CON: Sparse training data, cannot handle new viewpoints.

3) Descriptor Learning: Learn Orientations in a low-dimensional space.CON: Current methods require some supervised data, slow.

1) Regression: Directly regress a fixed SO(3) parameterizations like quaternions.CON: Representational constraints and pose ambiguities can introduce convergence issues.

Page 12: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

METHODOLOGY:

Overview:

Page 13: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

MAIN CONTRIBUTION: Augmented AE

Background: Autoencoders

● Consists of an Encoder Φ and a Decoder Ψ.● Reconstruct input x using a latent representation z.

● Uses a simple L2 loss objective function:

Page 14: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

MAIN CONTRIBUTION: Augmented AE

Augmented AutoEncoders (AAE):

● A kind of Denoising Autoencoder.● Difference: They apply random augmentations faugm(.) to the input images.

● Uses a simple L2 loss objective function:

Page 15: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Toy Example:

GOAL: Encode only the in plane rotations r ∈ [0, 2π] in a two dimensional latent space z ∈R2 independent of scale or translation.

Page 16: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Augmented Autoencoder:

Page 17: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Network Architecture:

● Convolutional AE with 128 dimensional latent space.● Bootstrapped pixel-wise L2 loss which is only computed on the pixels with the largest errors.● Adam Optimizer. Learning rate = 2×10−4. ● Batch size = 64.

Page 18: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

CodeBook creation and Test Procedure:

Page 19: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

CodeBook creation and Test Procedure:

Offline Codebook Generation:

1) Render clean, synthetic object views at equidistant viewpoints from a full view-sphere.2) Rotate each view in-plane at fixed intervals to cover the whole SO(3).3) Generate latent codes z ∈ R128 for all resulting images and assigning their corresponding rotation Rcam2obj.

Test time Procedure:

1) Object detection in RGB scene, crop to required dimension. 2) Compute the cosine similarity. 3) Make Prediction of 3D orientation using k-NN.

Page 20: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Projective Distance Estimate for full 6D Pose

GOAL: To estimate the 3D translation vector from camera to object center.

● Ratio of diagonal lengths of Bounding Boxes is used.● During training, for each synthetic object view in the codebook, the diagonal length lsyn,i of

its 2D bounding box is saved.

Page 21: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

INFERENCE:

Using SSD with VGG16 base and 31 classes plus the AAE with a codebook size of 92232 × 128:

Page 22: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Metrics Used:

The Visible Surface Discrepancy (errvsd) - a pose error function determined by the distance between the estimated and ground truth visible object depth surfaces.

Average Distance of Model Points(ADD) - average distance to the corresponding model point.

Page 23: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Datasets Used: T-LESS

TEST

TRAINING

Page 24: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Ablation Studies:

Page 25: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Ablation Studies:

Page 26: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Results:

Page 27: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

● Over 18,000 real images of 15 objects, like duck, bowl, cup, hole puncher, etc.

● RGB images● Depth images● Ground truth Pose● 3D model in Point cloud/mesh

formats.

Datasets Used: ACCV LineMOD

Page 28: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Results:

Page 29: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Conclusion:

This Paper has given these Key contributions:

1. A real-time, RGB-based pipeline for 6D object detection.

2. A self-supervised training strategy for Autoencoder for robust 3D object orientation estimation.

3. Learn from synthetic data, no pose annotation dataset required.

4. Handles pose ambiguities.

5. Generalizes to various test sensors.

Page 30: Implicit 3D Orientation Learning for 6D Object Detection from RGB …cseweb.ucsd.edu/.../Winter2019/Lectures/10_ObjectPose6D.pdf · 2019-03-08 · Implicit 3D Orientation Learning

Conclusion:

Possible Extensions to the Paper:

1. Ability to generalize to new objects.

2. Pose estimation of non-rigid objects.

3. Tracking of the objects?