Institute of Electrical Measurement and Measurement Signal Processing
1
Axel Pinz WS 2017/18 Image and Video Understanding 4
2D Scene Representation and Description
You can get very far in 2D !
2D “image object”
“token”
“tokenset”
2D scene description
[Figure: processing chain from image via segmentation to image description, and via 2D grouping to 2D scene description]
Segmentation: From Images to Tokens
• “Formal” definition of segmentation: image I is segmented → segmentation S = {regions R_i | rules 1-4}
  1. $\bigcup_i R_i = I$
  2. $\forall i \neq j: R_i \cap R_j = \emptyset$
  3. some homogeneity criterion H holds: $\forall i: H(R_i) = \text{true}$
  4. disjunct neighbors: $\forall\, \text{neighbor}(R_i, R_j): H(R_i \cup R_j) = \text{false}$
• Fundamental concepts of segmentation
  • Region-based segmentation (e.g. threshold, split+merge, region growing)
  • Contour-based segmentation (e.g. edge detection → closing of gaps)
• State-of-the-art segmentation algorithms
  • Graph-based segmentation
  • Level set
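
The formal definition can be checked mechanically on a label map. The following is a minimal sketch, not part of the slides: a label map satisfies rules 1 and 2 by construction, so only rules 3 and 4 are tested. The homogeneity criterion used here (intensity range at most `tol`) and all function names are assumptions chosen for illustration.

```python
def regions_from_labels(labels):
    """Group pixel coordinates by label (rules 1 and 2 hold by construction)."""
    regions = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            regions.setdefault(lab, set()).add((r, c))
    return regions

def homogeneous(image, pixels, tol):
    """Assumed criterion H: the intensity range inside the pixel set is at most tol."""
    values = [image[r][c] for r, c in pixels]
    return max(values) - min(values) <= tol

def are_neighbors(a, b):
    """True if some pixel of a is 4-adjacent to some pixel of b."""
    return any((r + dr, c + dc) in b
               for r, c in a
               for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))

def is_valid_segmentation(image, labels, tol):
    regions = regions_from_labels(labels)
    # Rule 3: every region is homogeneous.
    if not all(homogeneous(image, px, tol) for px in regions.values()):
        return False
    # Rule 4: the union of any two neighboring regions is not homogeneous.
    keys = list(regions)
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            if (are_neighbors(regions[a], regions[b])
                    and homogeneous(image, regions[a] | regions[b], tol)):
                return False
    return True

image = [[10, 11, 50],
         [10, 12, 52],
         [11, 11, 51]]
good = [[0, 0, 1],
        [0, 0, 1],
        [0, 0, 1]]
over = [[0, 2, 1],
        [0, 2, 1],
        [0, 2, 1]]  # splits a homogeneous area, so rule 4 fails

print(is_valid_segmentation(image, good, tol=5))  # True
print(is_valid_segmentation(image, over, tol=5))  # False
```

The over-segmented map is rejected only because of rule 4, which is exactly the rule that superpixels and tokensets later relax.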
Segmenting 2D “Shapes”
Region-Based Segmentation (1)
e.g.: Global threshold
Region-Based Segmentation (2)
e.g.: Region growing from seed cells
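
Region growing can be sketched in a few lines. This is an illustrative version under stated assumptions, not the method shown on the slide: the region is 4-connected, and a pixel is accepted if its value differs from the seed value by at most `tol` (both the criterion and the name `region_grow` are assumptions).

```python
from collections import deque

def region_grow(image, seed, tol):
    """Grow a 4-connected region from seed; accept pixels within tol of the seed value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

image = [[10, 11, 50],
         [10, 12, 52],
         [90, 11, 51]]
region = region_grow(image, seed=(0, 0), tol=5)
print(sorted(region))  # [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
```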
Region-Based Segmentation (3)
e.g.: Split and Merge
Edge-Based Segmentation (1)
General approach:
1) Smoothing
2) Edge detection
3) Filtering
   a) Eliminate short edges
   b) Close small gaps
4) Obtain closed contours
[Figure: image | edge image]
Edge-Based Segmentation (2)
[Figure: real edge profile | real line profile]
Edge: inflection point
Line: local extremum
Locate it:
Zero crossing of the 1st derivative (line) / 2nd derivative (edge) !!
Noise → smoothing needed → convolution with a Gaussian!
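
A 1D sketch of this idea, under assumptions that are not on the slide (a sampled Gaussian with sigma 1.5, border replication, central differences): an ideal step edge is smoothed, and the edge is located at the zero crossing of the second derivative.

```python
import math

def gaussian_kernel(sigma, radius):
    """Sampled, normalized 1D Gaussian."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, kernel):
    """Convolve with border replication so the output keeps the input length."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [sum(w * signal[min(max(i + k - radius, 0), last)]
                for k, w in enumerate(kernel))
            for i in range(len(signal))]

def second_derivative(signal):
    """Central second difference; the two border samples are set to 0."""
    d2 = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        d2[i] = signal[i - 1] - 2.0 * signal[i] + signal[i + 1]
    return d2

def zero_crossings(values):
    """Sub-sample positions where the sign flips between neighboring samples."""
    crossings = []
    for i in range(len(values) - 1):
        a, b = values[i], values[i + 1]
        if a * b < 0.0:
            crossings.append(i + a / (a - b))  # linear interpolation
    return crossings

profile = [0.0] * 10 + [1.0] * 10  # ideal step edge between samples 9 and 10
smoothed = smooth(profile, gaussian_kernel(sigma=1.5, radius=4))
edges = zero_crossings(second_derivative(smoothed))
print(edges)  # one crossing, close to 9.5
```

For a line profile, the same `zero_crossings` helper would be applied to the first derivative instead.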
Edge-Based Segmentation (3)
D. Marr + E. Hildreth: LoG / DoG zero crossings [1978]
“Mexican hat” / “Sombrero” Operator
http://laurent-duval.blogspot.co.at/2014/09/cours-radial-basis-functions.html
$\Delta (G * I) = (\Delta G) * I$
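
For reference, the "Mexican hat" kernel can be written out explicitly. This assumes an isotropic 2D Gaussian with standard deviation σ, a symbol that does not appear on the slide:

```latex
G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right)

\Delta G_\sigma(x, y)
  = \frac{\partial^2 G_\sigma}{\partial x^2} + \frac{\partial^2 G_\sigma}{\partial y^2}
  = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4}\, G_\sigma(x, y)
```

The kernel changes sign on the circle $x^2 + y^2 = 2\sigma^2$, so σ sets the scale of the edges that produce zero crossings.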
Edge-Based Segmentation (4)
LoG / DoG zero crossings
always produce closed contours!
[Figure: original images | zero-crossing results for different thresholds T | Canny edge detector]
Relaxing the formal definition: Superpixels
Compactness and/or size parameters control the segmentation
Homogeneity may be violated:
disjunct neighbors: $\forall\, \text{neighbor}(R_i, R_j): H(R_i \cup R_j) = \text{false}$ … may be true!
Example: [Achanta et al., SLIC superpixels, PAMI 34(11):2274-2282, 2012]
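
How the compactness parameter trades color similarity against spatial proximity can be seen from the SLIC distance measure. The sketch below follows the distance as given in the cited SLIC paper, D = sqrt(d_c² + (d_s/S)² m²); the function name, the tuple layout and the example values are assumptions for illustration.

```python
import math

def slic_distance(pixel, center, S, m):
    """pixel, center: (l, a, b, x, y); S: grid interval; m: compactness weight."""
    d_c = math.sqrt(sum((p - q) ** 2 for p, q in zip(pixel[:3], center[:3])))
    d_s = math.sqrt(sum((p - q) ** 2 for p, q in zip(pixel[3:], center[3:])))
    return math.sqrt(d_c ** 2 + (d_s / S) ** 2 * m ** 2)

pixel = (50, 0, 0, 10, 10)
same_color_far = (50, 0, 0, 30, 10)     # identical color, 20 pixels away
other_color_near = (60, 0, 0, 12, 10)   # color differs by 10, 2 pixels away

for m in (1, 40):
    d_far = slic_distance(pixel, same_color_far, S=20, m=m)
    d_near = slic_distance(pixel, other_color_near, S=20, m=m)
    winner = "same_color_far" if d_far < d_near else "other_color_near"
    print(m, round(d_far, 3), round(d_near, 3), winner)
```

With a small m the pixel joins the cluster of matching color; with a large m it joins the spatially closer cluster, which gives more compact superpixels at the cost of homogeneity.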
SLIC Examples [Achanta et al., PAMI 2012]
Spatial segmentations into “superpixels”
SLIC Examples [Achanta et al., PAMI 2012]
Spatio-temporal segmentations into “supervoxels”
Temporal Superpixels (TSPs) [Chang et al., CVPR’13]
High-Level Segmentation – The “Correct” one?
[Figure: 3 regions | 24 regions | 41 regions]
http://www.cs.berkeley.edu/projects/vision/grouping/segbench/
State-of-the-Art Segmentation (1)
Graph Cut [Shi+Malik, PAMI 2000]
J. Shi, J. Malik, “Normalized Cuts and Image Segmentation”, PAMI 22(8):888-905, 2000
Fully connected graph (all pixels of the image !)
Weights w_pq measure the similarity between pixels p and q
Graph cut (2)
J. Shi, J. Malik, “Normalized Cuts and Image Segmentation”, PAMI 22(8):888-905, 2000
Partition the graph into subgraphs
Goal: high similarity within a subgraph, low similarity between subgraphs
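
The normalized cut criterion from the cited paper makes this goal measurable: Ncut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V). The sketch below only evaluates the criterion for a given two-way partition of a tiny graph; it does not perform the eigenvector-based optimization of the paper, and the weight matrix is a made-up example.

```python
def ncut(weights, part_a):
    """Normalized cut value of the partition (part_a, rest) for a symmetric weight matrix."""
    nodes = range(len(weights))
    a = set(part_a)
    b = set(nodes) - a
    cut = sum(weights[p][q] for p in a for q in b)
    assoc_a = sum(weights[p][q] for p in a for q in nodes)
    assoc_b = sum(weights[p][q] for p in b for q in nodes)
    return cut / assoc_a + cut / assoc_b

# Four "pixels": 0 and 1 are similar, 2 and 3 are similar.
W = [[0.0, 0.9, 0.1, 0.1],
     [0.9, 0.0, 0.1, 0.1],
     [0.1, 0.1, 0.0, 0.9],
     [0.1, 0.1, 0.9, 0.0]]

natural = ncut(W, {0, 1})   # cuts only weak links
mixed = ncut(W, {0, 2})     # cuts the strong links
print(round(natural, 4), round(mixed, 4))
```

The partition that separates the two similar pairs yields the smaller value, which is what the minimization looks for.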
N-Cut Results (1)
N-Cut Results (2)
State-of-the-Art Segmentation (2)
Graph Based [Felzenszwalb+Huttenlocher, IJCV 2004]
P. Felzenszwalb, D. Huttenlocher, “Efficient Graph-Based Image Segmentation”, IJCV 59(2), 2004
Pairwise comparison of regions
[Figure: original image | segmentation using “grid graph” | segmentation using “nearest neighbor graph”]
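
The pairwise region comparison can be sketched with a union-find structure. This is a simplified illustration of the merging predicate from the cited paper on a 4-connected grid graph, assuming absolute intensity difference as edge weight and omitting the smoothing and minimum-size post-processing; the names `segment_graph` and `grid_edges` and the value of `k` are assumptions.

```python
def segment_graph(num_nodes, edges, k):
    """edges: list of (weight, u, v). Returns a list mapping node -> component root."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # largest edge weight merged into each component so far

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Merge only if the connecting edge is small relative to both internal differences.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = w  # edges arrive in ascending order, so w is the new maximum
    return [find(x) for x in range(num_nodes)]

def grid_edges(image):
    """4-connected grid graph with absolute intensity difference as weight."""
    rows, cols = len(image), len(image[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                edges.append((abs(image[r][c] - image[r][c + 1]), r * cols + c, r * cols + c + 1))
            if r + 1 < rows:
                edges.append((abs(image[r][c] - image[r + 1][c]), r * cols + c, (r + 1) * cols + c))
    return edges

image = [[10, 11, 50],
         [10, 12, 52],
         [11, 11, 51]]
labels = segment_graph(9, grid_edges(image), k=10)
print(len(set(labels)))  # 2
```

A “nearest neighbor graph” variant would only change how the edge list is built, not the merging loop.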
Graph Based (2) [Felzenszwalb+Huttenlocher, IJCV 2004]
State-of-the-Art Segmentation (3)
Level-Set Based Segmentation
James Sethian (1999): Level Set & Fast Marching Methods, Cambridge.
Stan Osher & Ronald Fedkiw (2002): Level Set Methods and Dynamic Implicit Surfaces, Springer.
Stan Osher & Nikos Paragios (2003): Geometric Level Set in Imaging, Vision+Graphics, Springer.
Split the image into 2 regions, such that:
• Similarity between pixels of a region is maximized
• Contour length is minimized
e.g.: [Chan&Vese, 2001]
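$$E(\Phi, u_+, u_-) = \int_\Omega \left[ (f - u_+)^2\, H(\Phi) + (f - u_-)^2 \bigl(1 - H(\Phi)\bigr) \right] d\Omega \;+\; \nu \int_\Omega |\nabla H(\Phi)|\, d\Omega$$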
Level-Set Based Segmentation (2)
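$$E(\Phi, u_+, u_-) = \int_\Omega \left[ (f - u_+)^2\, H(\Phi) + (f - u_-)^2 \bigl(1 - H(\Phi)\bigr) \right] d\Omega \;+\; \nu \int_\Omega |\nabla H(\Phi)|\, d\Omega$$
u_+, u_-   average intensities of foreground and background
Φ   level-set function
Ω   image domain
f   image point (pixel)
H   Heaviside step function
ν   weight of the regularisation term (contour)
$\int_\Omega |\nabla H(\Phi)|\, d\Omega$   contour length
Minimize E w.r.t. Φ, u_+, u_-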
https://en.wikipedia.org/wiki/Heaviside_step_function
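
A discrete sketch of the two-region energy may help to see what is being minimized. It is an illustration under assumptions, not the lecture's implementation: the foreground is where Φ > 0, the averages u+ and u- are set to the region means (their optimal values for fixed Φ), and the contour length is approximated by counting 4-neighbor pixel pairs across which H(Φ) changes.

```python
def chan_vese_energy(f, phi, nu):
    """Discrete two-region energy for image f and level-set function phi (2D lists)."""
    rows, cols = len(f), len(f[0])
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    inside = [f[r][c] for r, c in coords if phi[r][c] > 0]
    outside = [f[r][c] for r, c in coords if phi[r][c] <= 0]
    u_plus = sum(inside) / len(inside) if inside else 0.0
    u_minus = sum(outside) / len(outside) if outside else 0.0

    # Data term: squared deviation from the mean of the region each pixel belongs to.
    data = sum((f[r][c] - (u_plus if phi[r][c] > 0 else u_minus)) ** 2 for r, c in coords)

    # Regularisation term: number of neighboring pixel pairs separated by the contour.
    length = 0
    for r, c in coords:
        if c + 1 < cols and (phi[r][c] > 0) != (phi[r][c + 1] > 0):
            length += 1
        if r + 1 < rows and (phi[r][c] > 0) != (phi[r + 1][c] > 0):
            length += 1
    return data + nu * length

f = [[0, 0, 1, 1]] * 4                        # dark left half, bright right half
phi_good = [[-1, -1, 1, 1]] * 4               # contour on the intensity edge
phi_bad = [[1, 1, 1, 1]] * 2 + [[-1, -1, -1, -1]] * 2  # contour across the regions

e_good = chan_vese_energy(f, phi_good, nu=0.1)
e_bad = chan_vese_energy(f, phi_bad, nu=0.1)
print(e_good, e_bad)
```

Both candidate contours have the same length, so the difference in energy comes entirely from the data term.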
Level-Set Based Segmentation (3)
[Figure: level-set examples with regions labeled Φ>0 and Φ<0]
Many adaptations, various formulations
e.g. [Paragios&Deriche], [Brox&Weickert], [Fussenegger]
Level-Set Based Segmentation (4)
Multi-region Level-Set
Tokenset: Relaxing the formal definition
1. $\bigcup_i R_i = I$
2. $\forall i \neq j: R_i \cap R_j = \emptyset$
3. some homogeneity criterion H holds: $\forall i: H(R_i) = \text{true}$
4. disjunct neighbors: $\forall\, \text{neighbor}(R_i, R_j): H(R_i \cup R_j) = \text{false}$
• Tokens may be post-processed (e.g. opening/closing)
  • # of holes, # of parts → several regions!
• Tokens may be overlapping (at least: their bounding boxes)
• No need to cover the whole image!
Token – Image Object
• Points (x,y)
• Lines ((x1,y1),(x2,y2))
• Polylines/Chains ((x1,y1),(x2,y2), …,(xn,yn))
• Polygons ((x1,y1),(x2,y2), …,(xn,yn),(x1,y1))
  Squares, rectangles, circles, ellipses, … (parametrized closed contours)
• Constellations/Bitmaps
“Feret” box: bounding box, aligned with (x,y)
[Figure legend: foreground | background]
Constellation Tokens
Tokenset
A file, i.e., a list of tokens → database, various indexing
[Figure labels: Token Type, Lexicon, Data]
Tokenset → 2D Image/Scene Description
[Figure: “houses” [Matsuyama’90] | “face” [Brunelli’92] | “pedestrians” [Suzuki’90]]
Example: PhD Pinz (1988)
Finding trees in aerial images …
Original image → smoothed (convolution with a Gaussian) → local brightness maxima → circles
circles → trees
Spruce (Fichte) + Pine (Kiefer)
[Figure: the tree example mapped onto the processing chain: image, image proc., image, segmentation, image description, 2D grouping, 2D scene description]
Some very fundamental questions:
• What characterizes an object ?
“objectness”
- compact, (self-)similar, distinct (color, texture), …
• Given an object, what characterizes its shape ?
• Maybe easier to answer: What is not shape?
- color
- texture
- size ?
“Objectness”
[Alexe et al., PAMI 2012]
cf. “Region Proposal Network (RPN)” as one component in ConvNets for object detection (e.g., Faster R-CNN, NIPS’15, see https://arxiv.org/abs/1506.01497)
Object Shape
Humans (again!) are very good in characterizing shape!
Proportions !
Pablo Picasso, rites of spring,
from D.Marr, VISION, fig. 3-56 (a)
2D 3D
Ideas open to further development
• Spatio-temporal shape (2D space + time)
• Motion patterns, trajectory space, …
• Spatio-temporal shape (3D space + time)
Token Features for 2D Grouping
Shape
• Minimum bounding rectangle (MBR)
• Best ellipse fit
• Aspect ratio (AR): |log(height/width)| = |log(width/height)|
• BR fill: % of foreground pixels in Feret box or in MBR
• Circumference
• Compactness = area / circumference²
• Elongatedness = 1 − (minor axis / major axis) of the best ellipse fit
Appearance
• Color
• Texture
• Histograms of … local descriptors
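
Several of the shape features can be computed directly from a binary token mask. The sketch below is an illustration with assumed names; it uses the axis-aligned Feret box (not the MBR), approximates the circumference by the number of pixel sides facing background, and leaves out the ellipse-based features.

```python
import math

def token_features(mask):
    """mask: 2D list of 0/1 values with at least one foreground pixel."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    foreground = set(pixels)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1   # Feret box height
    width = max(cols) - min(cols) + 1    # Feret box width
    area = len(pixels)
    # Circumference approximated by counting pixel sides that border background.
    circumference = sum((r + dr, c + dc) not in foreground
                        for r, c in pixels
                        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    return {
        "aspect_ratio": abs(math.log(height / width)),
        "br_fill": area / (height * width),
        "circumference": circumference,
        "compactness": area / circumference ** 2,
    }

mask = [[0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 0, 0, 0, 0, 0]]
features = token_features(mask)
print(features)
```

For this 2×4 rectangle the Feret box is completely filled, and the aspect ratio equals log 2 regardless of whether height/width or width/height is used.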
Homographies, Collineations, Perspective Transformations
vs.
Token shape, appearance, etc.
I borrow from „Bildgestützte Messverfahren“
(image-based measurement 2VO, 1LU) …
A hierarchy of transformations – hierarchy of geometries:
• Euclidean
• Similarity transformation
• Affine transformation
• Perspective transformation
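$$x' = H x: \quad \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$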
A Hierarchy of Transformations / Geometries
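Projective (P), 8 dof: $\begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}$
Affine (A), 6 dof: $\begin{pmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$
Similarity (S), 4 dof: $\begin{pmatrix} s r_{11} & s r_{12} & t_x \\ s r_{21} & s r_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$
Euclidean (E), 3 dof: $\begin{pmatrix} r_{11} & r_{12} & t_x \\ r_{21} & r_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$
In 2D, a square transforms to: [Figure: the square under P, A, S and E]
$$x' = H x: \quad \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$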
Invariance of token features?
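
The effect of each level of the hierarchy on a square can be tried out numerically. This is a small sketch with made-up matrices and assumed helper names; it maps the corners of the unit square through x' = Hx in homogeneous coordinates and compares the resulting side lengths.

```python
import math

def apply_h(H, point):
    """Map a 2D point through the 3x3 matrix H using homogeneous coordinates."""
    x, y = point
    xh = [row[0] * x + row[1] * y + row[2] for row in H]
    return (xh[0] / xh[2], xh[1] / xh[2])

def similarity(s, angle_deg, tx, ty):
    """Similarity transform; s = 1 gives a Euclidean transform."""
    a = math.radians(angle_deg)
    c, n = math.cos(a), math.sin(a)
    return [[s * c, -s * n, tx], [s * n, s * c, ty], [0.0, 0.0, 1.0]]

def side_lengths(corners):
    """Lengths of the sides of the closed polygon given by its corners."""
    return [math.hypot(corners[i][0] - corners[(i + 1) % len(corners)][0],
                       corners[i][1] - corners[(i + 1) % len(corners)][1])
            for i in range(len(corners))]

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
transforms = {
    "E": similarity(1.0, 30, 2, 3),
    "S": similarity(2.0, 30, 2, 3),
    "A": [[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],   # shear
    "P": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]],   # perspective row
}
sides = {name: side_lengths([apply_h(H, p) for p in square])
         for name, H in transforms.items()}
for name, lengths in sides.items():
    print(name, [round(v, 3) for v in lengths])
```

Lengths survive only E, ratios of lengths survive S, and under A and P the square becomes a parallelogram and a general quadrilateral, which is why features such as aspect ratio or compactness are not invariant at every level.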
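Invariance of Token Features
[Table: rows are token features, columns are transformations; the cell entries are not preserved in the transcript]
Columns: Radiometric transformation | E (Euclidean) | S (Similarity) | A (Affine) | P (Projective)
Rows:
Feret box: AR
Feret: BR fill
MBR: AR
MBR fill
best ellipse
elongatedness
circumference
compactness
size (# pix.)
color
texture
# holes
# parts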
Examples 2D Grouping
Large tokens:
[Figure: original image | 434 constellation tokens | 9 tokens > 2000 pixels]
Parallelism
[Figure: original image | 26002 straight lines | 94 lines with orientation -4]
Grouping – Example (KU/09)
edges → circles (Hough) → coins
Perceptual Grouping (Lowe’87)
[Figure: input image | projection of 3D wireframe model | successful matches by perceptual grouping]
The UMass VISIONS System
Interpretations of Massachusetts “road” scenes [Draper et al., IJCV 1989]
[Figure: original image | interpretation result | interpretation key]
The UMass VISIONS System
Interpretations of Massachusetts “house” scenes [Draper et al., IJCV 1989]
[Figure: original image | interpretation result | interpretation key]
Bottom-Up vs. Top-Down Grouping
2D Models
• Chains of “edgels”, “ridgels”
• Hough transform:
  complex patterns → local maxima
  image space → “Hough” space, bins
• Active contour models
• Shape priors
• Active shape models
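
The Hough idea from the list above (patterns in image space become local maxima in a binned parameter space) can be sketched for straight lines. This is a minimal illustration assuming the normal form ρ = x cos θ + y sin θ with 1° and 1 pixel bins; the function name and the point set are made up.

```python
import math
from collections import Counter

def hough_lines(points, theta_step_deg=1):
    """Accumulate votes in (rho, theta) bins for every point."""
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, theta_deg)] += 1
    return votes

# Ten points on the horizontal line y = 3, plus two outliers.
points = [(x, 3) for x in range(0, 100, 10)] + [(7, 41), (55, 20)]
votes = hough_lines(points)
(peak_rho, peak_theta), count = votes.most_common(1)[0]
print(peak_rho, peak_theta, count)  # 3 90 10
```

The collinear points all vote for the same bin, while the outliers spread their votes elsewhere, so the line shows up as a single maximum in “Hough” space.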
Bottom-Up vs. Top-Down Grouping
2D Models
Edges → chains
Lines → ridges
Lines → valleys
“Snakes” – Active Contour Models
Kass, Witkin, Terzopoulos, 1st ICCV, London, 1987
Image energies (greylevel, gradient, …), mechanical model (spring)
Active contours: adaptation to subjective contours
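
For orientation, the energy functional of the cited snake paper can be written as follows, with the contour parametrized as v(s) = (x(s), y(s)); the symbols α and β (weights of the spring-like and bending terms) follow the paper and do not appear on the slide:

```latex
E_{\text{snake}} = \int_0^1 \Bigl[ E_{\text{int}}\bigl(v(s)\bigr) + E_{\text{image}}\bigl(v(s)\bigr) + E_{\text{con}}\bigl(v(s)\bigr) \Bigr]\, ds

E_{\text{int}} = \tfrac{1}{2} \Bigl( \alpha(s)\, \lvert v_s(s) \rvert^2 + \beta(s)\, \lvert v_{ss}(s) \rvert^2 \Bigr)
```

The internal term is the mechanical model, the image term collects the greylevel and gradient energies, and the constraint term allows external forces such as user interaction.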
“Snakes” – Active Contour Models
Kass, Witkin, Terzopoulos, 1st ICCV, London, 1987
Tracking of moving contours
Extension to 3D: “Balloons”
“Shape Priors” in Level Set Segmentation [Fussenegger]
… global deformation (scale, rotation, translation)
“Active Shape Models” in Level Set Segmentation [Fussenegger]
• Shape changes dependent on viewpoint → train ASM
• Videos: level_set_hide+seek, level_set_teapot
“Active Shape Models” in Level Set Segmentation [Fussenegger]
3 ASMs learnt:
- elephant
- octopus
- African man
a: original image
b: segmentation
c, d: varying the order of the 3 ASMs
Summary IVU_1 – IVU_4
• Vision
• Neurophysiology
• Cognitive psychology
• Computational theory (Marr paradigm, representations, algorithms)
• Linear Filtering, Convolution
• Definition of terms, system model of image understanding
• Visual recognition → the “holy grail” of computer vision
• Segmentation and grouping → 2D image/scene description
… recap
Definition: Visual Recognition [Perona’09]
“The holy grail of Computer Vision”
Five tasks of “visual recognition”:
– Verification (is a “car” in the image?)
– Detection and localization (what is there? where?)
– Classification (n “beach” images, m “city” images)
– Naming (name and locate all objects in an image)
– Description: objects, actions, relations, etc.
(example “kissing” → “scene understanding”)
Increasing complexity from top to bottom
Image and Video Understanding: mostly 2D (+time) recognition
Image-based Measurement: 3D (+time) reconstruction
My Model of Image Understanding
What next?
Course Schedule 2016/17
• Vision
• Neurophysiology
• Cognitive psychology
• Computational theory (Marr paradigm, representations, algorithms)
• Linear Filtering, Convolution
• Definition of terms, system model of image understanding
• Visual recognition → the “holy grail” of computer vision
• Segmentation and grouping → 2D image/scene description
• Object categorization
• Terms, goals, issues, …
• Signal processing: Fourier, Gabor
• Scale
• Object models
• CNNs for image and video understanding