

PanoRoom: From the Sphere to the 3D Layout

Clara Fernández-Labrador 1,2, José M. Fácil 1, Alejandro Pérez-Yus 1, Cédric Demonceaux 2, José J. Guerrero 1

Overview

Motivation

[Figure: method overview. Labels: Input Panorama; Encoder-Decoder FCN; Edge and Corners Maps from the proposed FCN; Lines and Vanishing Points under MW (Manhattan World) assumption; Main directions; Accurate structural lines and corners; Layout Recovery; Hypotheses Generation and Evaluation; Final Layout reprojected on the image; 3D Model of the Room.]

[Figure: encoder-decoder FCN architecture. Labels: ResNet-50; 512, 256, 128, 64; 128x256 Input Panorama; 64x128 Corner Map; Edge Map; Skip-connections; Preliminary predictions.]

Pixel-wise sigmoid cross-entropy loss

These maps have an extremely unbalanced distribution of edges and corners, so we introduce the weighting factors λ1 and λ0.

Perceptual loss

Apart from encouraging the output images to match the target images, we also encourage them to have the same feature representations.
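The two loss terms can be sketched in plain Python as follows. This is a minimal illustration and not the authors' implementation: the exact form of the loss, the inverse-frequency rule used to pick λ1 and λ0 (`balancing_weights`), and the `feature_fn` argument standing in for a feature extractor are all assumptions, since the poster does not give them.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def balancing_weights(targets):
    """Assumed rule (not from the poster): inverse class frequency."""
    n = len(targets)
    pos = sum(targets)
    neg = n - pos
    lam1 = n / (2.0 * max(pos, 1))  # weight for edge/corner pixels
    lam0 = n / (2.0 * max(neg, 1))  # weight for background pixels
    return lam1, lam0


def weighted_sigmoid_cross_entropy(logits, targets, lam1, lam0, eps=1e-12):
    """Mean pixel-wise cross-entropy with one weight per class."""
    total = 0.0
    for x, y in zip(logits, targets):
        p = sigmoid(x)
        total -= (lam1 * y * math.log(p + eps)
                  + lam0 * (1 - y) * math.log(1.0 - p + eps))
    return total / len(logits)


def perceptual_loss(output, target, feature_fn):
    """Mean squared distance between feature representations.

    feature_fn is a hypothetical stand-in for a fixed feature extractor.
    """
    f_out, f_tgt = feature_fn(output), feature_fn(target)
    return sum((a - b) ** 2 for a, b in zip(f_out, f_tgt)) / len(f_out)


# Flattened toy map: one positive (edge) pixel among four.
targets = [0, 0, 0, 1]
logits = [-2.0, -1.5, -3.0, 0.5]
lam1, lam0 = balancing_weights(targets)
loss = weighted_sigmoid_cross_entropy(logits, targets, lam1, lam0)
```

With this weighting rule, the single positive pixel contributes as much to the loss as the three background pixels together, which is the point of having separate factors for the two classes.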

Encoder-Decoder FCN Details

Loss Functions

Related Work

[1] C. Fernandez-Labrador, A. Perez-Yus, G. Lopez-Nicolas, and J. J. Guerrero. "Layouts from panoramic images with geometry and deep learning." RA-L/IROS 2018.

[2] C. Zou, A. Colburn, Q. Shan, and D. Hoiem. "LayoutNet: Reconstructing the 3D room layout from a single RGB image." CVPR 2018.

[3] Y. Zhang, S. Song, P. Tan, and J. Xiao. "PanoContext: A whole-room 3D context model for panoramic scene understanding." ECCV 2014.

Acknowledgements

This work was supported by Projects DPI2014-61792-EXP, DPI2015-67275 (MINECO/FEDER), DPI2015-65962-R (UE) and by the Regional Council of Burgundy (CRB).

Layout Recovery and Results

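[Figure: layout recovery results. Panel titles: Input Image with Reprojected 2D Layout; 3D Model.]

[Figure labels: Encoder; Decoder.]

[Figure: comparison of outputs. Column labels: RGB image; Ground Truth (GT); [1] Fernandez-Labrador et al.; [2] LayoutNet; PanoRoom (Ours).]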

FCN Evaluation

3D Layout Evaluation

- 96.2% accuracy on the SUN360 dataset.
- Able to generalize: 91.6% accuracy on the Stanford2D-3D dataset.

We outperform the state of the art in all metrics while being faster.

[Comparison against [1], [2], and [3]; the values are not preserved in this transcript.]

- General layouts: not limited to box-type.
- Cleaner edges around the boundaries.

We randomly sample corners to generate layout hypotheses that follow the Manhattan World assumption. We also allow new corners to be introduced for cases where they are not visible due to the scene convexity or occlusions.

Finally, we choose the solution that best fits the Edge and Corners Maps obtained through our FCN, using the following function:

[Equation not preserved in this transcript.]
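The sample-and-score idea can be illustrated with a small, self-contained sketch. The scoring function below is hypothetical, because the poster's own function did not survive extraction. It averages the corner-map response at the hypothesized corners and the edge-map response along the vertical wall edges. It relies on the fact that vertical 3D lines appear as image columns in an upright equirectangular panorama. It ignores ceiling and floor edges and does not enforce the Manhattan World constraint. All names are illustrative.

```python
import random


def vertical_edge_pixels(top, bottom):
    """Pixels of the wall-wall edge joining two corners in one column."""
    (r0, c), (r1, _) = top, bottom
    return [(r, c) for r in range(r0, r1 + 1)]


def score(hypothesis, edge_map, corner_map):
    """Hypothetical fit score: mean corner response + mean edge response.

    hypothesis: list of (top_corner, bottom_corner) pairs, each pair
    sharing a column; corners are (row, col) pixel coordinates.
    """
    corners = [p for pair in hypothesis for p in pair]
    edges = [p for top, bottom in hypothesis
             for p in vertical_edge_pixels(top, bottom)]
    corner_term = sum(corner_map[r][c] for r, c in corners) / len(corners)
    edge_term = sum(edge_map[r][c] for r, c in edges) / len(edges)
    return corner_term + edge_term


def sample_hypotheses(candidate_pairs, walls, n, rng):
    """Draw n random subsets of corner pairs, ordered by image column."""
    return [sorted(rng.sample(candidate_pairs, walls),
                   key=lambda pair: pair[0][1])
            for _ in range(n)]


def best_layout(candidate_pairs, edge_map, corner_map,
                walls=4, n=200, seed=0):
    """Return the sampled hypothesis with the highest score."""
    rng = random.Random(seed)
    hypotheses = sample_hypotheses(candidate_pairs, walls, n, rng)
    return max(hypotheses, key=lambda h: score(h, edge_map, corner_map))
```

Here `candidate_pairs` would come from peaks of the corner map, and `walls` can be varied to cover layouts that are not box-shaped. A fuller version would also add corners that are hidden by occlusion, as the text describes.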

We build our model on ResNet-50 pre-trained on the ImageNet dataset.
- Residual networks make it possible to increase depth without increasing the number of parameters with respect to their plain counterparts.
- Faster convergence due to the general low-level features learned from ImageNet.
- We propose a single branch for multi-tasking to reduce computation time and the number of parameters.
- We introduce skip-connections between encoder and decoder.
- We perform preliminary predictions at different resolutions, which are concatenated.
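The resolutions implied by this design can be checked with a short shape-bookkeeping sketch. It is an illustration under stated assumptions and not the authors' network definition. The stage strides and channel counts are those of the standard ResNet-50. The decoder widths 512, 256, 128, 64 are read from the architecture figure labels, and their order is assumed. Pairing each decoder stage with the encoder stage of equal resolution is also assumed. The stage names follow torchvision's naming. The preliminary predictions are not modelled.

```python
# (stage name, total stride, output channels) for the standard ResNet-50.
RESNET50_STAGES = [
    ("conv1", 2, 64),
    ("layer1", 4, 256),
    ("layer2", 8, 512),
    ("layer3", 16, 1024),
    ("layer4", 32, 2048),
]

# Decoder widths taken from the figure labels; the order is an assumption.
DECODER_WIDTHS = [512, 256, 128, 64]


def plan_decoder(height=128, width=256):
    """List each decoder stage with its resolution and assumed skip source."""
    by_stride = {stride: (name, ch) for name, stride, ch in RESNET50_STAGES}
    stride = RESNET50_STAGES[-1][1]  # the encoder bottleneck is at stride 32
    plan = []
    for out_channels in DECODER_WIDTHS:
        stride //= 2  # each decoder stage upsamples by a factor of 2
        skip_name, skip_channels = by_stride[stride]
        plan.append({
            "resolution": (height // stride, width // stride),
            "skip_from": skip_name,
            "skip_channels": skip_channels,
            "out_channels": out_channels,
        })
    return plan


for stage in plan_decoder():
    print(stage)
```

Under these assumptions, a 128x256 panorama is reduced to 4x8 at the bottleneck and restored in four steps to 64x128, which matches the map size given on the poster.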

Layout Hypotheses Generation and Evaluation
