
Page 1: Figure-Ground Image Segmentation Helps Weakly-Supervised Learning of Objects (katef/posters/poster_FG_semisupervised.pdf)

Figure-Ground Image Segmentation Helps Weakly-Supervised Learning of Objects
Katerina Fragkiadaki, Jianbo Shi

Input: An image collection containing a common object.
Output: Models for segmenting the common object and its background.

Main Challenge: Large variations of the common object. Features do not repeat consistently!

Previous Work
Generative models:
• Topic models
• Hierarchical representations: suspicious coincidence
• Co-segmentation
Discriminative models:
• MILboosting / MILSVM on segments or patches (use of a negative image collection)
• Recently: discriminative clustering

Our approach

Co-occurrences alone are not sufficient; single-image saliency is also needed.

Figure-ground saliency of an image set from figure-ground saliency of single image

Loop: model update → figure-ground update → sample figure-ground labels FG

input image collection

Saliency values sal are computed given the segmentation of each map. Flexible representation of image figure-ground! Each map captures a different object! Co-occurrence will choose the right one.
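One plausible way to compute a per-map saliency value sal (a sketch under assumed data structures; the poster does not specify the exact computation) is to average a pixel-level saliency map over the map's foreground region:

```python
import numpy as np

def map_saliency(pixel_saliency, fg_mask):
    """Score one figure-ground map: mean pixel saliency over its
    foreground region (a hypothetical but common choice)."""
    fg = fg_mask.astype(bool)
    if not fg.any():
        return 0.0
    return float(pixel_saliency[fg].mean())

# Toy image: a salient 2x2 blob in a 4x4 image.
pix = np.zeros((4, 4)); pix[1:3, 1:3] = 1.0
mask_on_blob  = np.zeros((4, 4), int); mask_on_blob[1:3, 1:3] = 1
mask_off_blob = np.zeros((4, 4), int); mask_off_blob[0, :] = 1

print(map_saliency(pix, mask_on_blob))   # 1.0: this map covers the salient blob
print(map_saliency(pix, mask_off_blob))  # 0.0: this map misses it
```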

Segment figure-ground labels constrain the co-occurrence model.

Multiple segmentations per image.

Model parameters:
• φ1: foreground bag-of-words model
• φ2 … φT: background bag-of-words models
• M: foreground shape model
Observed variables:
• w: visual words
• sal: single-image figure-ground saliency
Latent variables:
• ρ ∈ [0,1]: figure-ground map score given the image set
• FG ∈ {0,1}: segment figure-ground label given the image set
• z: segment topic

Figure-ground saliency of an image set from figure-ground saliency of single image

• Use both figure-ground saliency of a single image and feature co-occurrence across the image set to discover the common object.
• Encode figure-ground given the image set as multiple figure-ground maps and a probability distribution ρ over them. ρ depends on both single-image figure-ground saliency and the co-occurrence model.
• Segment figure-ground labels FG are sampled from FGsoft, the map with the highest score ρ.

The irrelevant foreground objects have been suppressed by the co-occurrence model
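The map-selection and label-sampling steps can be sketched as follows (the variable layout and the per-segment soft probabilities are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_fg_labels(maps, rho):
    """maps: list of per-segment soft figure-ground maps, each an array of
    P(FG=1) per segment; rho: score of each map given the image set.
    Pick FGsoft, the map with the highest score rho, then sample binary
    segment labels FG from it."""
    fg_soft = maps[int(np.argmax(rho))]
    return (rng.random(fg_soft.shape) < fg_soft).astype(int)

maps = [np.array([0.9, 0.1, 0.0]),   # map 1: segment 1 is figure
        np.array([0.0, 1.0, 1.0])]   # map 2: segments 2, 3 are figure
rho = np.array([0.3, 0.7])           # co-occurrence favours map 2
fg = sample_fg_labels(maps, rho)
print(fg)  # [0 1 1]: labels drawn from map 2
```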

Figure-ground shape aware model

Image representation: bag of segments.
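A bag-of-segments representation of this kind might look as follows (vocabulary size and segment contents are made up for illustration):

```python
import numpy as np

def segment_histograms(segments, vocab_size):
    """Represent an image as a bag of segments, each segment a
    bag-of-visual-words histogram."""
    return np.array([np.bincount(words, minlength=vocab_size)
                     for words in segments])

# An image over-segmented into 3 segments; entries are visual-word ids.
segments = [np.array([0, 0, 2]), np.array([1, 1]), np.array([2, 3, 3, 3])]
H = segment_histograms(segments, vocab_size=4)
print(H)
# [[2 0 1 0]
#  [0 2 0 0]
#  [0 0 1 3]]
```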

Image figure-ground changes through the update of the scores ρ of figure-ground maps! [Figure labels: FG=1 (figure), FG=0 (ground).]

[Graphical models. Left, a topic model: words w with segment topic z, mixing weights θ, λ, Dirichlet prior β, and word distributions φ1 … φT, over plates NI (images), NS (segments), NW (words). Right, the figure-ground aware model: adds the latent map scores ρ, segment labels FG drawn from FGsoft, the foreground shape model M, and the observed single-image saliency sal; background word models φ2 … φT with word distributions P(words). Figure-ground given a single image is observed; figure-ground given the image set is latent.]

Maximizing a conditional likelihood! Discrimination without a negative image set
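Schematically (a generic conditional-likelihood form; the exact conditioning is an assumption, as the poster states only that a conditional likelihood is maximized), learning solves:

```latex
\max_{\varphi_1,\ldots,\varphi_T,\,M}\;\sum_{i=1}^{N_I}\log p\!\left(w_i,\,FG_i \mid sal_i;\ \varphi_1,\ldots,\varphi_T,\,M\right)
```

so discrimination between figure and ground comes from conditioning on single-image saliency rather than from a negative image collection.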

[Foreground model: word model φ1 and shape model M, with word distribution P(words); figure-ground given the image set ρ.]

Figure-ground given single image (ρ initialized from saliency):
  image 1: map 1 sal=0.8, ρ=0.8; map 2 sal=0.2, ρ=0.2
  image 2: map 1 sal=0.6, ρ=0.6; map 2 sal=0.4, ρ=0.4
After the co-occurrence model rescores the figure-ground maps:
  image 1: map 1 ρ=0.8; map 2 ρ=0.2
  image 2: map 1 ρ=0.3; map 2 ρ=0.7
Image 2's maps switched score order!
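One way to realize this rescoring (the multiplicative combination, the normalization, and the co-occurrence scores below are assumptions for illustration; the poster specifies only that ρ depends on both saliency and the co-occurrence model):

```python
import numpy as np

def rescore(sal, cooc):
    """Combine single-image saliency with a co-occurrence score for each
    figure-ground map of one image, renormalizing to obtain rho."""
    rho = sal * cooc
    return rho / rho.sum()

# Image 2 from the example: map 1 is more salient (0.6 vs 0.4), but the
# co-occurrence model strongly prefers map 2's foreground.
sal  = np.array([0.6, 0.4])
cooc = np.array([0.25, 0.875])  # hypothetical co-occurrence scores
print(rescore(sal, cooc))       # [0.3 0.7]: the maps switched score order
```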

Datasets used:
1) Caltech 101: 81 images of Airplanes
2) MSRC: 70 images of Cars
3) MSRC: 84 images of Cows
4) ETH: 48 images of Bottles
5) ETH: 29 images of Swans
6) ETH: 85 images of Giraffes
7) Weizmann Horses: 80 images
In each dataset, 2/3 of the images are used for training and 1/3 for testing. Two representations tested:
• sFgmodel: shape + bag-of-words model (full model)
• bagFGmodel: bag-of-words model (no shape at test time, only during learning)
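The 2/3 to 1/3 split translates into these per-dataset counts (a quick check; rounding to the nearest integer is an assumption, as the poster does not state the convention):

```python
# Per-dataset train/test counts implied by a 2/3 - 1/3 split.
counts = {"Airplanes": 81, "Cars": 70, "Cows": 84, "Bottles": 48,
          "Swans": 29, "Giraffes": 85, "Horses": 80}
for name, n in counts.items():
    train = round(2 * n / 3)  # assumed rounding
    print(f"{name}: {train} train / {n - train} test")
```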

Use both single-image figure-ground saliency and feature co-occurrence across the image set to discover the common object.


Problem
Assumption: Often, salient image regions capture the common object.
Algorithm: Gibbs sampling; sample segment topic z ∈ {t2 … tT}.
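A Gibbs-style update for the segment topics could be sketched like this (a generic topic-sampling step under made-up word distributions, not the poster's exact conditional):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_topic(seg_hist, phi):
    """Sample a segment's topic z from T per-topic word distributions phi,
    proportional to each topic's likelihood of the segment's words."""
    log_lik = seg_hist @ np.log(phi.T)        # (T,) log-likelihoods
    p = np.exp(log_lik - log_lik.max())
    return int(rng.choice(len(phi), p=p / p.sum()))

phi = np.array([[0.7, 0.1, 0.1, 0.1],   # topic 0 favours word 0
                [0.1, 0.1, 0.1, 0.7]])  # topic 1 favours word 3
seg = np.array([3, 0, 0, 1])            # word histogram: mostly word 0
print(sample_topic(seg, phi))           # 0 with high probability
```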

Learning and test-time results:
[Result figures: qualitative segmentations, ours vs. baseline.]