
Image Segmentation and Visualization

Xiaolong Wang

This Class

• Naïve FCN model for Image Segmentation

• Transpose Convolution

• Skip Connection, Hypercolumn

• Visualizing Deep Networks

Slides partially from: http://cs231n.stanford.edu/ https://slazebni.cs.illinois.edu/fall20/

Segmentation Problem and FCN

The problem

[Figure: semantic segmentation, instance segmentation, image classification, object detection]

The problem

Semantic Segmentation

Staring at a single pixel in isolation, classification is impossible.

Let’s put in some context!

Semantic Segmentation

Semantic Segmentation

Time Consuming!

Can we process the whole image at one time?

AlexNet input: 227 × 227 × 3

AlexNet Conv5: 13 × 13 × 128

Output? 13 × 13 × 21

Output is too small!

Can we process the whole image at one time?

AlexNet input: 227 × 227 × 3

AlexNet Conv5: 13 × 13 × 128

Output? 227 × 227 × 21

FC + reshape

Huge number of parameters for the FC layer: 13 × 13 × 128 × 227 × 227 × 21 ≈ 2.3 × 10^10

Fully Convolutional Network

Convolution at the original image resolution is computationally expensive.

Fully Convolutional Network

[Figure: feature maps of size D₁ × H/4 × W/4 → D₂ × H/8 × W/8 → D₁ × H/4 × W/4]

Making the feature map smaller increases the receptive field.

Making the feature map larger again increases the resolution.
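The downsample-then-upsample design can be sketched as a tiny fully convolutional network in PyTorch; all layer and channel sizes below are made up for illustration, not the ones from the slides:

```python
import torch
import torch.nn as nn

num_classes = 21
fcn = nn.Sequential(
    nn.Conv2d(3, 64, 3, stride=2, padding=1),    # -> H/2 x W/2, bigger receptive field
    nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1),  # -> H/4 x W/4
    nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),          # -> H/2 x W/2
    nn.ReLU(),
    nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),  # -> H x W
)

x = torch.randn(1, 3, 64, 64)
out = fcn(x)
# One score map per class, at the original spatial resolution.
assert out.shape == (1, num_classes, 64, 64)
```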

Transpose Convolution

The FCN paper

Long et al. 2015

1 x 1 convolutions

The FCN paper

The FCN paper

The upsampling

• Upsampling Layer

• Deconvolution Layer

• Transpose Convolution Layer

Transpose Convolution

Recall Convolution: 3 × 3 convolution with stride 1 and padding 1

Recall Convolution: 3 × 3 convolution with stride 2 and padding 1

Transpose Convolution: 3 × 3 transpose convolution with stride 2 and padding 1. Sum over the overlapping regions.

1D Transpose Convolution Example
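The 1D example can be reproduced with PyTorch's `ConvTranspose1d` (filter values chosen by hand here): with stride 2, each input element stamps a scaled copy of the filter into the output, and overlapping stamps are summed.

```python
import torch

# 1-channel 1D transpose convolution, kernel [1, 2, 3], stride 2, no bias.
tconv = torch.nn.ConvTranspose1d(1, 1, kernel_size=3, stride=2, bias=False)
with torch.no_grad():
    tconv.weight.copy_(torch.tensor([[[1., 2., 3.]]]))

z = torch.tensor([[[1., 1.]]])   # input of length 2
out = tconv(z).squeeze()

# Each input element writes [1, 2, 3] starting at (index * stride);
# output position 2 receives 3 (from z[0]) + 1 (from z[1]) = 4.
print(out.tolist())              # [1.0, 2.0, 4.0, 2.0, 3.0]
```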

2D Convolution

Regular convolution (stride 1, pad 0): a 3 × 3 filter $w$ slides over a 4 × 4 input $x$ to produce a 2 × 2 output $z$.

Matrix-vector form: flatten $x$ into a 16-vector; the convolution becomes $z = Wx$, where $W$ is a 4 × 16 matrix whose rows are zero-padded, shifted copies of the flattened filter.

4 × 4 input, 2 × 2 output.

Transpose convolution in matrix-vector form: multiplying by the transpose, $x = W^T z$, maps a 2 × 2 input $z$ to a 4 × 4 output $x$.

2 × 2 input, 4 × 4 output.

Not an inverse of the original convolution operation; it simply reverses the dimension change!
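This matrix view is easy to check numerically; a small PyTorch sketch with arbitrary filter and input values:

```python
import torch
import torch.nn.functional as F

w = torch.arange(1., 10.).reshape(1, 1, 3, 3)   # 3x3 filter
x = torch.arange(1., 17.).reshape(1, 1, 4, 4)   # 4x4 input

# Build the 4x16 matrix W: one row per output pixel, each row a
# zero-padded, shifted copy of the flattened filter.
W = torch.zeros(4, 16)
for oi in range(2):
    for oj in range(2):
        row = oi * 2 + oj
        for ki in range(3):
            for kj in range(3):
                W[row, (oi + ki) * 4 + (oj + kj)] = w[0, 0, ki, kj]

# Convolution = matrix-vector product z = W x.
z_mat = (W @ x.reshape(16)).reshape(1, 1, 2, 2)
z_conv = F.conv2d(x, w)                  # stride 1, no padding
assert torch.allclose(z_mat, z_conv)

# Transpose convolution = multiplying by W^T: 4 numbers in, 16 out.
z = torch.arange(1., 5.).reshape(1, 1, 2, 2)
x_up_mat = (W.t() @ z.reshape(4)).reshape(1, 1, 4, 4)
x_up = F.conv_transpose2d(z, w)
assert torch.allclose(x_up_mat, x_up)
```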

Transpose Convolution

[Slide repeats the $x = W^T z$ matrix layout, highlighting one output entry]

$x_{11} = w_{11} z_{11}$

Transpose Convolution

[Slide repeats the $x = W^T z$ matrix layout]

$x_{12} = w_{12} z_{11} + w_{11} z_{12}$

Convolve input with flipped filter.

Transpose Convolution

[Slide repeats the $x = W^T z$ matrix layout]

$x_{13} = w_{13} z_{11} + w_{12} z_{12}$

Convolve input with flipped filter.

Transpose Convolution

[Slide repeats the $x = W^T z$ matrix layout]

$x_{14} = w_{13} z_{12}$

Convolve input with flipped filter.

Transpose Convolution

[Slide repeats the $x = W^T z$ matrix layout]

$x_{21} = w_{21} z_{11} + w_{11} z_{21}$

Convolve input with flipped filter.

Transpose Convolution

[Slide repeats the $x = W^T z$ matrix layout]

$x_{22} = w_{22} z_{11} + w_{21} z_{12} + w_{12} z_{21} + w_{11} z_{22}$

Convolve input with flipped filter.

Transpose Convolution

[Slide repeats the $x = W^T z$ matrix layout]

$x_{23} = w_{23} z_{11} + w_{22} z_{12} + w_{13} z_{21} + w_{12} z_{22}$

Convolve input with flipped filter.

Transpose Convolution

input

output

[Figure: feature maps of size D₁ × H/4 × W/4 → D₂ × H/8 × W/8 → D₁ × H/4 × W/4]

Advanced Techniques in Segmentation

U-Net

Ronneberger et al. 2015

U-Net

https://phillipi.github.io/pix2pix/

U-Net and FPN

Lin et al. 2017

Hypercolumn and Skip Connection

Hariharan et al. 2015

Hypercolumn and Skip Connection

Bansal et al. 2017

Dilated convolutions

• Idea: instead of reducing spatial resolution of feature maps, use a large sparse filter

[Figure: 3 × 3 filters with dilation factor 1, 2, and 3]

Dilated convolutions

• Increase the receptive field

input

output
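A minimal PyTorch check of the receptive-field claim (shapes arbitrary): a 3 × 3 filter with dilation $d$ covers a $(2d+1) \times (2d+1)$ window, so without padding the output shrinks accordingly.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 9, 9)
w = torch.randn(1, 1, 3, 3)

for d in (1, 2, 3):
    out = F.conv2d(x, w, dilation=d)
    span = d * (3 - 1) + 1          # effective filter extent: 3, 5, 7
    # Same 9 weights, ever larger window seen by each output pixel.
    assert out.shape[-1] == 9 - span + 1
```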

Visualizing Deep Networks using "Saliency Maps"

CAM: Class Activation Maps

Zhou et al. 2016


https://www.youtube.com/watch?v=fZvOy0VXWAI&ab_channel=BoleiZhou
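CAM only needs the last conv feature maps and the classifier weights learned on top of global average pooling; a sketch with random stand-in tensors (all shapes here are hypothetical). The map for class $c$ is the classifier-weighted sum of the feature maps.

```python
import torch

# Hypothetical shapes: conv features (K channels, H x W) and classifier
# weights (num_classes x K) applied after global average pooling.
K, H, W, num_classes = 8, 13, 13, 21
feats = torch.randn(K, H, W)
fc_w = torch.randn(num_classes, K)

c = 3                                                # class of interest
cam = (fc_w[c][:, None, None] * feats).sum(dim=0)    # weighted sum -> H x W
assert cam.shape == (H, W)

# Sanity check: averaging the CAM gives the class score (up to the bias),
# since global average pooling and the weighted sum commute.
score = fc_w[c] @ feats.mean(dim=(1, 2))
assert torch.allclose(cam.mean(), score, atol=1e-5)
```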

Visualizing Deep Networks by maximizing activation

Visualization by optimization

• We can synthesize images that maximize the activation of a given neuron.

• Find the image $x$ maximizing the target activation $f(x)$, subject to a natural-image regularization penalty $R(x)$:

$x^* = \arg\max_x f(x) - \lambda R(x)$

Visualization by optimization

• Maximize $f(x) - \lambda R(x)$
• $f(x)$ is the score for a category before the softmax
• $R(x)$ is an L2 regularizer
• Perform gradient ascent starting from a zero image, then add the dataset mean to the result

Simonyan et al. 2014
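The procedure can be sketched as follows; a frozen linear layer stands in for a real network's pre-softmax class score, and the class index, step size, and $\lambda$ are made-up values:

```python
import torch

torch.manual_seed(0)

# Toy stand-in for a CNN: a linear map from a 3x8x8 "image" to 10 scores.
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 10))
for p in net.parameters():
    p.requires_grad_(False)              # the network stays frozen

x = torch.zeros(1, 3, 8, 8, requires_grad=True)  # start from a zero image
target, lam = 5, 0.01                    # class index and L2 weight

def objective(img):
    # f(x) - lambda * R(x), with R(x) the L2 penalty
    return net(img)[0, target] - lam * (img ** 2).sum()

start = objective(x).item()
for _ in range(100):
    obj = objective(x)
    obj.backward()
    with torch.no_grad():
        x += 0.1 * x.grad                # gradient ascent on the image only
        x.grad.zero_()

assert objective(x).item() > start       # the regularized score went up
```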

Visualization by optimization

Compute the loss.

Backpropagate the gradients, but do not update the network weights.

Keep accumulating the gradient updates to the image under the regularizer $R(x)$.

Visualization by optimization

Google DeepDream

Amplify a whole layer instead of just one neuron.

Choose an image and a layer in a CNN; repeat:
1. Forward: compute activations at the chosen layer
2. Set the gradient of the chosen layer equal to its activation (equivalent to maximizing $\sum_i f_i^2(x)$)
3. Backward: compute the gradient w.r.t. the image
4. Update the image (with some tricks)
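The loop can be sketched in PyTorch with a single random conv layer standing in for a pretrained network; the step size and gradient normalization are illustrative stand-ins for the "tricks":

```python
import torch

torch.manual_seed(0)

# Stand-in for "a layer in a CNN": one frozen conv layer, random weights.
layer = torch.nn.Conv2d(3, 4, kernel_size=3, padding=1)
for p in layer.parameters():
    p.requires_grad_(False)

x = torch.randn(1, 3, 16, 16, requires_grad=True)
x0 = x.detach().clone()

for _ in range(20):
    act = layer(x)                         # 1. forward to the chosen layer
    # 2./3. use the activation itself as its gradient and backprop to the
    # image; this ascends sum_i f_i(x)^2 / 2.
    act.backward(gradient=act.detach())
    with torch.no_grad():
        g = x.grad
        x += 0.05 * g / (g.abs().mean() + 1e-8)   # 4. normalized ascent step
        x.grad.zero_()

# The layer's total squared activation has been amplified.
assert ((layer(x) ** 2).sum() > (layer(x0) ** 2).sum()).item()
```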

Next Class

Visualizing Deep Networks
