![Page 1: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/1.jpg)
Reconstructing PASCAL VOC
Sara Vicente*, Anthropics Technology
João Carreira*, UC Berkeley / ISR
Lourdes Agapito, University College London
Jorge Batista, ISR - University of Coimbra
* First two authors contributed equally
![Page 2: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/2.jpg)
Data Matters

| | 1960 | 1990 | 2010 |
| --- | --- | --- | --- |
| Goal | Everything | Image classification | Object localization |
| Test data | Toy images | Cropped images | Simple images |
| Training data | 3D models | Hundreds of images, class labels | 10K-1M images, class labels, segmentations and keypoints |

[Figure: example objects labelled Person and Motorbike]
![Page 3: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/3.jpg)
Present
Renewed interest in joint object reconstruction and recognition
Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models, M. Aubry, D. Maturana, A. Efros, B. Russell and J. Sivic
Estimating Image Depth Using Shape Collections, H. Su, Q. Huang, N. Mitra, Y. Li and L. Guibas
Beyond PASCAL: A Benchmark for 3D Object Detection in the Wild, Y. Xiang, R. Mottaghi and S. Savarese
Detailed 3D Representations for Object Recognition and Modeling, Z. Zia, M. Stark, B. Schiele and K. Schindler
Image-based Synthesis and Re-Synthesis of Viewpoints Guided by 3D Models, K. Rematas, T. Ritschel, M. Fritz and T. Tuytelaars
Parsing IKEA Objects: Fine Pose Estimation, J. Lim, H. Pirsiavash and A. Torralba
![Page 4: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/4.jpg)
Present
Renewed interest in joint object reconstruction and recognition
But the major recognition datasets (PASCAL VOC, ImageNet), which took years to collect and which everyone uses, have only 2D annotations
PASCAL VOC annotations (example objects: Person, Motorbike)
Available: class labels, segmentations, keypoints (not shown)
Unavailable: aligned 3D shapes
![Page 5: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/5.jpg)
Proposed Solution
Bootstrap reconstructions for all objects in detection datasets from existing 2D annotations
Facilitate a new attack on joint recognition and reconstruction
[Figure: columns labelled Available and Reconstructed]
![Page 6: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/6.jpg)
Class-based Reconstruction – Prior Work
Morphable models built from (each row uses less information than the previous one):
Multiple 3D scans: A Morphable Model for the Synthesis of 3D Faces, Volker Blanz and Thomas Vetter, SIGGRAPH 1999
Single 3D mesh + 2D data: What shape are dolphins? Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013
2D data (non-rigid SFM): Model Evolution: An Incremental Approach to Non-Rigid Structure from Motion, Shengqi Zhu, Li Zhang, Brandon M. Smith, CVPR 2010
![Page 7: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/7.jpg)
But… how? PASCAL VOC - Birds
![Page 8: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/8.jpg)
But… how? PASCAL VOC - Chairs
![Page 9: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/9.jpg)
But… how? PASCAL VOC - Aeroplanes
![Page 10: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/10.jpg)
But… how? PASCAL VOC - Boats
![Page 11: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/11.jpg)
Key Idea
Assume for each object in a class there are a small number of similar ones seen from different viewpoints (shape surrogates)
[Figure: target object and other objects in the same category]
![Page 12: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/12.jpg)
Key Idea
Reconstruct an object using standard rigid multiview techniques with the images of surrogates as additional views
Hard to identify surrogates: perform viewpoint-biased sampling
![Page 13: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/13.jpg)
Proposed Approach
Jointly over all objects in a class:
1. Viewpoint Estimation (Rigid Structure from Motion)
For each object:
2. 3D Reconstruction (Visual Hull Sampling)
3. Reconstruction Ranking
[Figure: output reconstructions, with an arrow labelled Better]
![Page 14: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/14.jpg)
Step 1 of 3: Class-based Viewpoint Estimation
![Page 15: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/15.jpg)
Step 1 of 3: Class-based Viewpoint Estimation
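Factorization-based rigid SFM:

$$
\underbrace{\begin{bmatrix}
x_1^1 & \cdots & x_1^k \\
y_1^1 & \cdots & y_1^k \\
x_2^1 & \cdots & x_2^k \\
y_2^1 & \cdots & y_2^k \\
\vdots & & \vdots \\
x_N^1 & \cdots & x_N^k \\
y_N^1 & \cdots & y_N^k
\end{bmatrix}}_{\text{Measurement matrix (known)}}
=
\underbrace{\begin{bmatrix}
M_1 \\ M_2 \\ \vdots \\ M_N
\end{bmatrix}}_{\text{Motion matrices (unknown)}}
\times
\underbrace{\begin{bmatrix}
x_1 & x_2 & \cdots & x_k \\
y_1 & y_2 & \cdots & y_k \\
z_1 & z_2 & \cdots & z_k
\end{bmatrix}}_{\text{Shape (unknown)}}
$$

Estimating 3D shape from degenerate sequences with missing data, Manuel Marques, João Paulo Costeira, CVIU 2009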
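To make the factorization concrete, here is a minimal NumPy sketch of the textbook rank-3 factorization by SVD. It is not the method cited on the slide: it assumes a complete measurement matrix (every keypoint visible in every image), whereas the cited approach is designed for missing data, and it recovers motion and shape only up to an affine ambiguity. The function name `factorize_rigid_sfm` and the returned per-row offset `t` are illustrative choices.

```python
import numpy as np

def factorize_rigid_sfm(W):
    """Factor a complete 2N x k measurement matrix W into motion and shape.

    Assumption: no missing entries. Returns (M, S, t) with M of shape (2N, 3),
    S of shape (3, k) and t of shape (2N, 1), such that W is approximated by
    M @ S + t. The result is defined only up to an invertible 3x3 transform.
    """
    # Remove the per-row mean, which absorbs the 2D translation of each view.
    t = W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W - t, full_matrices=False)
    # Keep the three dominant components and split the singular values evenly.
    root = np.sqrt(s[:3])
    M = U[:, :3] * root          # stacked 2x3 motion matrices
    S = root[:, None] * Vt[:3]   # 3D keypoint coordinates
    return M, S, t
```

With noise-free data from a rigid shape, `M @ S + t` reproduces `W` exactly; a metric upgrade (enforcing orthonormal camera rows) would be a separate step that this sketch leaves out.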
![Page 16: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/16.jpg)
Step 1 of 3: Class-based Viewpoint Estimation
Idea: exploit segmentation information, since occluded keypoints should project inside the silhouette
[Figure: original view and side view. Legend: estimated keypoints (occluded); estimated keypoints (visible); ground truth keypoints (only visible ones are available)]
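One way to picture the silhouette constraint is as a penalty that is zero when a projected keypoint lands on the object mask and grows with its distance from the mask otherwise. The slides do not give the exact form of the term, so the squared-distance penalty below, and the name `silhouette_penalty`, are assumptions made purely for illustration.

```python
import numpy as np

def silhouette_penalty(points_xy, mask):
    """Sum of squared distances from projected keypoints to the silhouette.

    points_xy: (K, 2) array of projected keypoints in (x, y) pixel coordinates.
    mask: (H, W) boolean array, True inside the object silhouette.
    Distances are measured to the nearest silhouette pixel centre, so a
    keypoint lying exactly on a silhouette pixel contributes zero.
    Hypothetical penalty form, not taken from the slides.
    """
    ys, xs = np.nonzero(mask)
    inside = np.stack([xs, ys], axis=1).astype(float)        # (P, 2)
    diffs = np.asarray(points_xy, dtype=float)[:, None, :] - inside[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)                    # (K, P)
    return float((dists.min(axis=1) ** 2).sum())
```

The brute-force nearest-pixel search keeps the sketch dependency-free; a distance transform of the mask would be the usual choice for larger images.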
![Page 17: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/17.jpg)
Step 1 of 3: Class-based Viewpoint Estimation
Estimated elevation for airplanes:
![Page 18: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/18.jpg)
Step 2 of 3: 3D Reconstruction (Visual Hull)
Well-known multiview reconstruction algorithm
Efficient
Easy to implement
Multiple views of same aeroplane model
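As a rough illustration of why the visual hull is efficient and easy to implement, the sketch below carves a set of candidate 3D points: a point survives only if it projects inside every silhouette. It assumes orthographic cameras given as a 2x3 matrix `M` and a 2-vector `t` (in the spirit of the motion matrices from Step 1) and a point-sampled grid; the function name and data layout are hypothetical.

```python
import numpy as np

def visual_hull(masks, cameras, grid):
    """Keep the grid points that project inside every silhouette.

    masks: list of (H, W) boolean silhouettes.
    cameras: list of (M, t) pairs, M of shape (2, 3) and t of shape (2,),
             mapping a 3D point p to pixel coordinates (x, y) = M @ p + t.
             Orthographic projection is an assumption of this sketch.
    grid: (V, 3) array of candidate 3D points.
    Returns a boolean array of length V, True for points inside the hull.
    """
    occupied = np.ones(len(grid), dtype=bool)
    for mask, (M, t) in zip(masks, cameras):
        uv = np.rint(grid @ M.T + t).astype(int)
        h, w = mask.shape
        in_image = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        hit = np.zeros(len(grid), dtype=bool)
        # Points projecting outside the image count as outside the silhouette.
        hit[in_image] = mask[uv[in_image, 1], uv[in_image, 0]]
        occupied &= hit
    return occupied
```

Each view costs one projection and one mask lookup per candidate point, so the run time is linear in both the number of views and the grid size.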
![Page 19: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/19.jpg)
Step 2 of 3: 3D Reconstruction (Visual Hull)
Making the multiview reconstruction assumptions hold
Sampling approach
• Randomly select multiple pairs of silhouettes, hoping that one pair arises from shape surrogates
• Bias sampling to the most informative viewpoints
[Figure: examples for Cars and Aeroplanes]
![Page 20: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/20.jpg)
Step 2 of 3: 3D Reconstruction (Visual Hull)
Informative viewpoints are typically:
• Left/Right
• Top/Bottom
• Front/Back
![Page 21: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/21.jpg)
Step 2 of 3: 3D Reconstruction (Visual Hull)
Principal Component Analysis on 3D points from SFM returns an intuitive set of 3 informative viewpoints
Cluster together objects up to 15° away from these viewpoints
[Figure: informative viewpoints = PCA(SFM 3D points), shown for Cars and Aeroplanes]
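The PCA step and the 15° clustering can be sketched as follows. The representation of each object's viewpoint as a 3D viewing direction in the SFM coordinate frame, the function names, and the decision to merge opposite directions (for example left and right) into one cluster are assumptions of this sketch, not details given on the slide.

```python
import numpy as np

def informative_axes(points):
    """Principal directions of the SFM 3D points, one unit vector per row."""
    centered = points - points.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt

def cluster_by_viewpoint(view_dirs, axes, max_angle_deg=15.0):
    """Assign each viewing direction to the closest principal axis.

    Returns one label per direction: the index of the closest axis, or -1
    when the direction is more than max_angle_deg away from all of them.
    Opposite directions share a label (assumption: their silhouettes are
    treated as interchangeable).
    """
    v = view_dirs / np.linalg.norm(view_dirs, axis=1, keepdims=True)
    cosines = np.abs(v @ axes.T)
    best = cosines.argmax(axis=1)
    close = cosines.max(axis=1) >= np.cos(np.radians(max_angle_deg))
    return np.where(close, best, -1)
```

Objects labelled -1 are simply left out of the sampling pools, which is one plausible reading of "up to 15° away".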
![Page 22: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/22.jpg)
Step 2 of 3: Visual Hull Reconstruction
Randomly sample silhouettes from 2 out of the 3 clusters multiple times and reconstruct from each combination with target image (in gray)
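The sampling loop described above might look like the sketch below: each draw picks two of the three viewpoint clusters, takes one silhouette from each, and pairs them with the target. Whether the cluster containing the target's own viewpoint should be excluded is not stated on the slide, so this version samples from all three; the function name and the use of opaque silhouette identifiers are illustrative.

```python
import random

def sample_view_sets(target, clusters, num_samples, seed=0):
    """Draw candidate view sets for visual hull reconstruction.

    target: identifier of the target silhouette.
    clusters: list of three non-empty lists of silhouette identifiers,
              one list per informative viewpoint.
    Returns num_samples tuples (target, silhouette_a, silhouette_b), where the
    two extra silhouettes always come from two different clusters.
    """
    if len(clusters) != 3 or any(len(c) == 0 for c in clusters):
        raise ValueError("expected three non-empty viewpoint clusters")
    rng = random.Random(seed)
    view_sets = []
    for _ in range(num_samples):
        first, second = rng.sample(clusters, 2)
        view_sets.append((target, rng.choice(first), rng.choice(second)))
    return view_sets
```

Each returned tuple would then be passed to the visual hull step, giving one candidate reconstruction per draw.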
![Page 23: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/23.jpg)
Step 2 of 3: Imprinted Visual Hull Reconstruction
Optimize each reconstruction to conform exactly to the reference silhouette
[Figure: non-imprinted and imprinted reconstructions, with the reference silhouette]
![Page 24: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/24.jpg)
Step 3 of 3: Reconstruction Ranking
Select mesh whose projected boundaries best match average masks
[Figure: car average masks and SFM model; target object; reconstruction ranking with an arrow labelled Better; selected reconstruction]
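A simplified version of the ranking step is sketched below. It assumes each candidate mesh has already been rendered to a binary mask from the same viewpoints as the class average masks, and it scores filled masks by mean absolute difference rather than comparing boundaries as the slide describes. The function name, array layout, and error measure are assumptions.

```python
import numpy as np

def rank_reconstructions(projections, average_masks):
    """Order candidate reconstructions by agreement with the average masks.

    projections: (R, V, H, W) array, binary mask of each of R candidates
                 rendered from V viewpoints.
    average_masks: (V, H, W) array with values in [0, 1].
    Returns (order, errors): candidate indices sorted best first, and the
    mean absolute mask difference of each candidate (smaller is better).
    """
    proj = np.asarray(projections, dtype=float)
    avg = np.asarray(average_masks, dtype=float)
    errors = np.abs(proj - avg[None]).mean(axis=(1, 2, 3))
    return np.argsort(errors), errors
```

The first index in `order` plays the role of the selected reconstruction.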
![Page 25: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/25.jpg)
Experiments
Reconstructed 9,087 annotated and unoccluded objects across the 20 PASCAL VOC categories
Also reconstructed 1,000 renderings from a synthetic extension of PASCAL VOC to obtain quantitative results
![Page 26: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/26.jpg)
Synthetic Dataset: Reconstruction Error
Smaller is better
[Figure: panels labelled Shape Inflation, Our results, and SFM convex hull]
![Page 27: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/27.jpg)
Synthetic Dataset: Reconstruction Error
Smaller is better
[Figure: panels labelled Shape Inflation, Our results, and SFM convex hull]
Playing with Puffball: Simple Scale-Invariant Inflation for Use in Vision and Graphics, N. Twarog, M. Tappen and E. Adelson, ACM Symposium on Applied Perception, 2012
![Page 28: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/28.jpg)
Synthetic Dataset: Reconstruction Error
Smaller is better
[Figure: panels labelled Shape Inflation, This method, and SFM Convex Hull]
![Page 29: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/29.jpg)
Synthetic Dataset: Reconstruction Error
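Inflation: shape inflation baseline
SFMCvxHull: convex hull of SFM points

| Class | Ours | Inflation | SFMCvxHull |
| --- | --- | --- | --- |
| aeroplane | 3.58 | 9.64 | 5.79 |
| bicycle | 4.30 | 10.51 | 6.56 |
| bird | 9.98 | 8.76 | 12.01 |
| boat | 5.91 | 8.81 | 6.52 |
| bottle | 8.09 | 6.25 | 12.13 |
| bus | 6.45 | 11.02 | 7.34 |
| car | 3.04 | 11.07 | 3.22 |
| cat | 6.98 | 11.39 | 9.61 |
| chair | 5.36 | 8.13 | 7.37 |
| cow | 5.44 | 9.17 | 7.50 |
| dining table | 8.97 | 8.67 | 9.52 |
| dog | 7.08 | 11.61 | 9.91 |
| horse | 6.05 | 6.90 | 7.41 |
| motorbike | 4.12 | 9.24 | 5.32 |
| person | 7.35 | 9.14 | 19.46 |
| potted plant | 7.72 | 7.58 | 17.86 |
| sheep | 7.18 | 8.77 | 7.16 |
| sofa | 6.11 | 8.06 | 5.75 |
| train | 15.73 | 17.01 | 17.47 |
| tv/monitor | 9.73 | 9.67 | 10.08 |
| mean | 6.96 | 9.57 | 9.40 |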
![Page 30: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/30.jpg)
Conclusions
Rigid SFM can be made robust to challenging intra-category variation
Class-based reconstruction by sampling visual hulls with different putative surrogate shapes
Bootstrapped coarse 3D viewpoint and shape information from existing 2D annotations on PASCAL VOC
Future work:
• Learn more powerful recognition models from the new 3D data
• Relax the need for annotations
Code available online: http://www2.isr.uc.pt/~joaoluis/carvi/index.html
![Page 31: Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:](https://reader033.vdocuments.net/reader033/viewer/2022042409/5f2577355289122abd00d7a4/html5/thumbnails/31.jpg)
Thanks!