the learning and use of graphical models for image interpretation

THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION Thesis for the degree of Master of Science By Leonid Karlinsky Under the supervision of Professor Shimon Ullman


Post on 21-Jan-2016


Page 1: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Thesis for the degree of Master of Science

By Leonid Karlinsky

Under the supervision of Professor Shimon Ullman

Page 2: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

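Introduction

Graphical Models

Bayesian Network (BN)

[Diagram: A is the parent of B and C; B is the parent of D and E. Edge labels: P(B|A), P(C|A), P(D|B), P(E|B)]

$P(A, B, C, D, E) = \prod_{X_i} P(X_i \mid X_i^-)$, where $X_i^-$ denotes the parent of $X_i$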
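The factorization on this slide can be illustrated with a minimal sketch. It assumes binary variables and uses made-up conditional probability tables (none of the numbers come from the thesis); the helper name `cpt` is hypothetical. It builds the joint as P(A) P(B|A) P(C|A) P(D|B) P(E|B) and checks that it sums to one.

```python
from itertools import product

# Hypothetical CPTs for binary variables; the numbers are illustrative only.
p_a = {0: 0.6, 1: 0.4}


def cpt(p1_given_0, p1_given_1):
    """Return P(child | parent) as {parent_value: {child_value: probability}}."""
    return {0: {1: p1_given_0, 0: 1 - p1_given_0},
            1: {1: p1_given_1, 0: 1 - p1_given_1}}


p_b_a = cpt(0.2, 0.7)   # P(B | A)
p_c_a = cpt(0.5, 0.1)   # P(C | A)
p_d_b = cpt(0.3, 0.9)   # P(D | B)
p_e_b = cpt(0.4, 0.6)   # P(E | B)


def joint(a, b, c, d, e):
    """Joint probability as the product of each node given its parent."""
    return p_a[a] * p_b_a[a][b] * p_c_a[a][c] * p_d_b[b][d] * p_e_b[b][e]


# A valid factorization must sum to 1 over all 2^5 assignments.
total = sum(joint(*v) for v in product((0, 1), repeat=5))
print(f"sum over all assignments = {total:.6f}")
```

Storing five small tables instead of one table with 32 entries is the practical point of the factorization; the saving grows quickly with the number of nodes.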

Page 3: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Introduction

Graphical Models

Learning: Loop Free, Loopy

Using: Loop Free, Loopy

Tasks

Scenarios

Page 4: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Part I: MaxMI Training

Graphical Models

Learning: Loop Free, Loopy

Using: Loop Free, Loopy

Tasks

Scenarios

Page 5: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Classification

[Diagram: class node C connected to feature nodes F1 to F7; model P(C, F; θ)]

Goal: Classify C on new examples with minimum error, using a subset F of “trained” features

Training tasks:
• Best F
• Best θ
• Efficient model P(C, F; θ)

Best = Maximal MI

More…

Page 6: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training - The Past

• Model: simple “Flat” structure, NCC thresholds

• Training: Features and thresholds selected one by one

$(F_i, \theta_i) = \arg\max_{F_i, \theta_i} \min_{j<i} \left[ MI(C; F_i, F_j; \theta_i, \theta_j) - MI(C; F_j; \theta_j) \right]$

[Diagram: flat model over features 1 to 6]

Conditional independence in C

Increased MI upper bound

More…

Page 7: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – Our Approach

[Diagram: feature tree with root 1, children 2 and 3, and leaves 4, 5, 6, 7]

Learn the model and all $\theta_i$ together, maximizing: $MI(C; F; \theta)$

Page 8: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

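MaxMI Training – Learning

MaxMI: Decompose MI

[Diagram: feature tree with root 1, children 2 and 3, and leaves 4, 5, 6, 7]

$MI(C; F; \theta) = \sum_i MI(F_i; C \mid F_i^-; \theta_i, \theta_i^-)$

Maximize $MI(C; F; \theta)$ for all $\theta_i$ together

Efficiently learn parameters using GDL

More…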

Page 9: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – Assumptions

1. TAN model structure – Tree Augmented Naïve Bayes [Friedman, 97]

$P(C, F_1, \ldots, F_n) = P(C) \prod_i P(F_i \mid F_i^-, C)$

2. Feature Tree (FT) – can remove C preserving the feature tree.

$P(F_1, \ldots, F_n) = \prod_i P(F_i \mid F_i^-)$

$MI(C; F; \theta) = \sum_i MI(F_i; C \mid F_i^-; \theta_i, \theta_i^-)$

Page 10: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – TAN and θ

1. TAN structure is unknown

2. Learn θ and TAN s.t.: $MI(C; F; \theta)$ is maximized.

[Diagram: feature tree with root 1, children 2 and 3, and leaves 4, 5, 6, 7]

Asymptotic correctness

FT holds

Efficiency

Page 11: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

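MaxMI Training – MaxMI hybrid

$\{TAN, \theta\} = \arg\max_{Legal\ TAN, \theta} MI(C; F; \theta)$

$MI = \sum w_{MM}(F_i, F_i^-) = \sum MI(F_i; C \mid F_i^-)$

Legal TAN, θ:

$P(C, F_1, \ldots, F_n) = P(C) \prod_i P(F_i \mid F_i^-, C)$

$P(F_1, \ldots, F_n) = \prod_i P(F_i \mid F_i^-)$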

Page 12: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

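MaxMI Training – MaxMI hybrid

Legal TAN, θ:

$P(C, F_1, \ldots, F_n) = P(C) \prod_i P(F_i \mid F_i^-, C)$

$P(F_1, \ldots, F_n) = \prod_i P(F_i \mid F_i^-)$

[Chow & Liu, 68]: $\arg\max_{TAN, \theta} \sum w_{FT}(F_i, F_i^-) \Rightarrow P(F_1, \ldots, F_n) = \prod_i P(F_i \mid F_i^-)$

[Friedman, 97]: $\arg\max_{TAN, \theta} \sum w_{TAN}(F_i, F_i^-) \Rightarrow P(C, F_1, \ldots, F_n) = P(C) \prod_i P(F_i \mid F_i^-, C)$

MaxMI: $\arg\max_{Legal\ TAN, \theta} \sum w_{MM}(F_i, F_i^-) = \arg\max_{Legal\ TAN, \theta} MI(C; F; \theta)$

$\arg\max_{TAN, \theta} \sum \left[ w_{MM}(F_i, F_i^-) + w_{FT}(F_i, F_i^-) \right] \overset{?}{\Rightarrow}$ TAN, θ are Legal, MI is maximal

$w_{MM}(F_i, F_i^-) + w_{FT}(F_i, F_i^-) = w_{TAN}(F_i, F_i^-) + MI(F_i; C)$

More…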

Page 13: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

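MaxMI Training – MaxMI hybrid

Convergent algorithm:

[Diagram: iteration between the weights $w_{MM}(F_i, F_i^-) + w_{FT}(F_i, F_i^-)$, the weights $w_{TAN}(F_i, F_i^-)$, and the TAN structure]

$w_{MM}(F_i, F_i^-) + w_{FT}(F_i, F_i^-) = w_{TAN}(F_i, F_i^-) + MI(F_i; C)$

More…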

Page 14: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – empirical results

[Bar chart: Cow Parts Model - Feature Centered - Test DB (2256). Methods: LMI, MaxMI. Series: Miss, FA, Total Errors]

[Bar chart: Cow Parts Model - Parent Centered - Test DB (2256). Methods: LMI, MaxMI. Series: Miss, FA, Total Errors]

More…

Page 15: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – empirical results

[Bar chart: Cow Parts Model - Classification Errors - Test DB (2256). Methods: LMI, MaxMI, MaxMI+TAN. Series: Miss, FA, Total Errors]

More…

Page 16: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

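MaxMI Training – Generalizations

Train any parameters $\theta_i$

Any low-TREEWIDTH structure

Even without assumptions:

$MI(C; F; \theta) = \sum_i MI(F_i; C \mid F_i^-; \theta_i, \theta_i^-)$

Legal TAN, θ:

$P(C, F_1, \ldots, F_n) = P(C) \prod_i P(F_i \mid F_i^-, C)$

$P(F_1, \ldots, F_n) = \prod_i P(F_i \mid F_i^-)$

$MI(C; F; \theta) = \sum_i \left[ H(F_i \mid \hat{F}_i^-; \theta_i, \hat{\theta}_i^-) - H(F_i \mid C, F_i^-; \theta_i, \theta_i^-) \right]$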

Page 17: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Back to the Goals

Graphical Models

Learning: Loop Free, Loopy

Using: Loop Free, Loopy

Tasks

Scenarios

Page 18: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Part II: Loopy MAP approximation

Graphical Models

Learning: Loop Free, Loopy

Using: Loop Free, Loopy

Tasks

Scenarios

Page 19: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

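Loopy network example

[Diagram: nodes $x_1, \ldots, x_5$ with pairwise functions $f_{12}(x_1, x_2)$, $f_{13}(x_1, x_3)$, $f_{14}(x_1, x_4)$, $f_{15}(x_1, x_5)$, $f_{24}(x_2, x_4)$, $f_{25}(x_2, x_5)$, $f_{35}(x_3, x_5)$]

Want to solve MAP:

$\{x_1, \ldots, x_n\} = \arg\max_{\{x_1, \ldots, x_n\}} \prod_{1 \le i < j \le n} f_{ij}(x_i, x_j)$

NP-hard in general! [Cooper 90, Shimony 94]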
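For a network as small as the five-node example, the MAP problem can still be solved by exhaustive search, which makes the exponential cost concrete. The sketch below is illustrative only: the potentials are random placeholders, the three-valued domain is an assumption, and the objective is taken to be a product of the pairwise functions.

```python
import random
from itertools import product

random.seed(1)
# Edge set of the five-node example on this slide.
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 4), (2, 5), (3, 5)]
VALUES = (0, 1, 2)  # assumed domain size, not taken from the slides

# Hypothetical positive potentials f_ij(x_i, x_j).
f = {e: {(a, b): random.uniform(0.1, 1.0) for a in VALUES for b in VALUES}
     for e in EDGES}


def score(x):
    """x maps node -> value; returns the product of f_ij(x_i, x_j) over all edges."""
    s = 1.0
    for (i, j) in EDGES:
        s *= f[(i, j)][(x[i], x[j])]
    return s


def brute_force_map():
    """Enumerate all |VALUES|^5 assignments and keep the best one."""
    best = max(product(VALUES, repeat=5),
               key=lambda v: score(dict(zip(range(1, 6), v))))
    return dict(zip(range(1, 6), best))


x_map = brute_force_map()
print(x_map, score(x_map))
```

The search visits 3^5 = 243 assignments here; with n nodes and v values it visits v^n, which is why approximation methods are needed for larger loopy networks.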

Page 20: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

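Our approach – opening loops

[Diagram: the example network with the tree edges $f_{12}$, $f_{13}$, $f_{14}$, $f_{15}$ kept and the loop-closing edges $f_{24}$, $f_{35}$, $f_{25}$ opened through copy variables $z_2$, $z_3$, $z_4$]

Now, we can maximize:

$\{x_1, \ldots, x_n, z_{k_1}, \ldots, z_{k_m}\} = \arg\max \prod_{i,j} f_{ij}(x_i, x_j) \prod_{l} f_{k_l j}(z_{k_l}, x_j)$

The assignment is legal for the loopy problem if: $z_{k_l} = x_{k_l}$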

Page 21: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Our approach – opening loops

Legally maximize $g(y, x, z)$:

$\{y^l, x^l, z^l\} = \arg\max_{y, x, z = x} g(y, x, z)$

Can maximize unrestricted:

$\{y^m, x^m, z^m\} = \arg\max_{y, x, z} g(y, x, z)$

Usually $x^m \ne z^m$

Our solution – slow connections

Page 22: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Our approach – slow connections

Fix z = Z

Maximize (loop-free, use GDL): $\{y^Z, x^Z\} = \arg\max_{y, x} g(y, x, Z)$

Now legalize, $Z \leftarrow x^Z$, and return to step one.

Iterate until convergence. This is the Maximize-and-Legalize algorithm.

[Diagram: $y_1(Z)$, $y_5(Z)$ and $x_2(Z)$, $x_3(Z)$, $x_4(Z)$, with the copies $z_2$, $z_3$, $z_4$ fixed to $Z_2$, $Z_3$, $Z_4$]
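The Maximize-and-Legalize loop can be sketched as follows. This is a rough illustration under several assumptions that are not taken from the thesis: the potentials are random, the domain has three values, the choice of which endpoint of each opened edge is replaced by a copy (`OPENED`) is a guess, brute-force search stands in for the loop-free GDL maximization, and a `max_iters` guard is added because convergence is not guaranteed for arbitrary potentials.

```python
import random
from itertools import product

random.seed(1)
VALUES = (0, 1, 2)                              # assumed domain size
TREE_EDGES = [(1, 2), (1, 3), (1, 4), (1, 5)]   # edges kept as they are
# (copied node k, other node j): the opened edge becomes f_kj(z_k, x_j).
OPENED = [(4, 2), (2, 5), (3, 5)]               # hypothetical choice of copies

pot = {e: {(a, b): random.uniform(0.1, 1.0) for a in VALUES for b in VALUES}
       for e in TREE_EDGES + OPENED}


def g(x, z):
    """Opened objective: x maps nodes 1..5 to values, z maps copied nodes to values."""
    s = 1.0
    for (i, j) in TREE_EDGES:
        s *= pot[(i, j)][(x[i], x[j])]
    for (k, j) in OPENED:
        s *= pot[(k, j)][(z[k], x[j])]
    return s


def maximize(z):
    """Maximize over x with z fixed (brute force stands in for GDL on the tree)."""
    best = max(product(VALUES, repeat=5),
               key=lambda v: g(dict(zip(range(1, 6), v)), z))
    return dict(zip(range(1, 6), best))


def maximize_and_legalize(z0, max_iters=50):
    z = dict(z0)
    best_x, best_score = None, -1.0
    for _ in range(max_iters):
        x = maximize(z)
        legal_z = {k: x[k] for k in z}
        legal_score = g(x, legal_z)   # value of x in the original loopy problem
        if legal_score > best_score:
            best_x, best_score = x, legal_score
        if legal_z == z:              # z_k = x_k for all copies: assignment is legal
            break
        z = legal_z                   # legalize and return to the maximization step
    return best_x, best_score


x_hat, s_hat = maximize_and_legalize({2: 0, 3: 0, 4: 0})
print(x_hat, s_hat)
```

Returning the best legal assignment seen so far, rather than the last one, is a design choice of this sketch; it keeps the result meaningful even if the iteration stops on the guard instead of on a fixed point.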

Page 23: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Our approach – slow connections

When will this work?

The intuition: z-minor

Strong z-minor: global maximum – single step

[Formula: strong z-minor condition on g, quantified over all y, x, Z]

Weak z-minor: local optimum – several steps

[Formula: weak z-minor condition on g, quantified over all (y, x) other than $(y^Z, x^Z)$]

[Diagram: values of g at $(y^Z, x^Z, Z)$, $(y^Z, x^Z, z)$, $(y, x, Z)$, $(y^Z, x^Z, x^Z)$ and $(y^Z, Z, Z)$]

Page 24: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Making the assumptions true

Selecting z-variables

The intuition: recursive z-selection

Recursive strong z-minor: single step, global maximum!

Recursive weak z-minor: iterations, local maximum.

Different / Same speed

Remove – Contract – Split algorithm

More…

[Diagrams: three stages of z-selection on nodes $x_1$, $x_2$, $x_3$, $x_4$ with copy variables $z_3$ and $z_1$, each marked with a slow connection]

Page 25: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Making the assumptions true

Approximating the function

The intuition: recursively “chip away” small parts of the function

More…

[Diagrams: three stages on nodes $x_1$, $x_2$, $x_3$ with the function $f_2(x_2, x_3)$ and copy variable $z_1$, each marked with a slow connection]

Page 26: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Existing approximation algorithms

Clustering: triangulation [Pearl, 88]

Loopy Belief Revision [McEliece, 98]

Bethe-Kikuchi Free-Energy: CCCP [Yuille, 02]

Tree Re-Parametrization (TRP) [Wainwright, 03]

Page 27: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Experimental Results

[Bar chart: 1000 samples, 31 nodes, 4 values. Vertical axis: 0% to 100%. Methods: Weak z-minor, different speed; Weak z-minor, same speed; Random z-variables selection; LBR (50 messages); LBR (10 messages); Ignore Siblings. Series: Average Approximation, Average Match]

More…

Page 28: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Experimental Results

[Bar chart: 1000 samples, 31 nodes, 2 values. Vertical axis: 0% to 100%. Methods: Weak z-minor, different speed; Weak z-minor, same speed; Random z-variables selection; LBR (50 messages); LBR (10 messages); Ignore Siblings. Series: Average Approximation, Average Match]

More…

Page 29: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

More…

Maximum MI vs. Minimum $P_E$

More…

Page 30: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION
Page 31: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION
Page 32: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Classification Specifics

• How do we classify a new example?

MAP: $\hat{C} = \arg\max_C P(C \mid F; \theta)$

• What are “the best” features and parameters?

Maximize MI: $\{F, \theta\} = \arg\max_{F, \theta} MI(C; F; \theta)$

• Why maximize MI?

Tightly related to $P_E$

More reasons – if time permits

Back…

$MI(C; F) = H(C) - H(C \mid F)$
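The identity MI(C; F) = H(C) - H(C | F) can be evaluated directly from a joint table. The following sketch uses two made-up joint distributions over a binary class and a binary feature (the numbers and the function names are illustrative, not from the thesis) to contrast an informative feature with a useless one.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """joint[c][f] = P(C=c, F=f). Returns (MI, H(C), H(C|F)) in bits."""
    p_c = [sum(row) for row in joint]
    p_f = [sum(col) for col in zip(*joint)]
    h_c = entropy(p_c)
    # H(C|F) is the average over F=f of the entropy of the posterior P(C | F=f).
    h_c_given_f = sum(
        pf * entropy([joint[c][f] / pf for c in range(len(joint))])
        for f, pf in enumerate(p_f) if pf > 0)
    return h_c - h_c_given_f, h_c, h_c_given_f


informative = [[0.45, 0.05], [0.05, 0.45]]   # feature agrees with the class 90% of the time
useless = [[0.25, 0.25], [0.25, 0.25]]       # feature independent of the class

mi_inf, _, _ = mutual_information(informative)
mi_useless, _, _ = mutual_information(useless)
print(f"informative: {mi_inf:.3f} bits, useless: {mi_useless:.3f} bits")
```

A feature that leaves little uncertainty about C after it is observed has a small H(C | F) and therefore a large MI, which is the sense in which "best" features are the maximal-MI ones.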

Page 33: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training - The Past - Reasons

• Why did it work? Conditional independence in C

[Formula: relates $MI(C; F_i, F_j; \theta_i, \theta_j) - MI(C; F_j; \theta_j)$ to its minimum over $j < i$]

Increased MI upper bound

[Diagram: flat model over features 1 to 6]

• What was missing?

Conditional independence in C was assumed!

Maximizing the “whole” MI.

Learning model structure.

Back…

Page 34: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – JT

JT structure = TAN structure

GDL - exponential in TREEWIDTH

$MI(F_i; C \mid F_i^-; \theta_i, \theta_i^-)$

TREEWIDTH = 2

Back…

Page 35: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – EM

Why not EM?

[Redner, Walker, 84] EM algorithm:

$\hat{\theta} = \arg\max_{\theta} \sum_{y \in Y} \log p(y; \theta)$

Training CPTs with EM

EM assumes static training data!

Not true in our scenario!

Back…

Page 36: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

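MaxMI Training – MaxMI hybrid solution

[Chow, Liu 68] “Best” Feature Tree

$w_{FT}(F_i, F_i^-) = MI(F_i; F_i^-; \theta_i, \theta_i^-)$

[Friedman, et al. 97] “Best” TAN

$w_{TAN}(F_i, F_i^-) = MI(F_i; F_i^- \mid C; \theta_i, \theta_i^-)$

Back…

[We, 2004] Maximal MI

$w_{MM}(F_i, F_i^-) = MI(F_i; C \mid F_i^-; \theta_i, \theta_i^-)$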
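As an illustration of the first of these weightings, the Chow and Liu construction picks the feature tree as a maximum-weight spanning tree under pairwise mutual information. The sketch below estimates the weights from a tiny made-up sample set and grows the tree with a simple Prim-style loop; the data, function names, and the choice of node 0 as root are all assumptions for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pairwise_mi(samples, i, j):
    """Empirical MI (bits) between columns i and j of the sample tuples."""
    n = len(samples)
    pij = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    return sum((c / n) * math.log2(c * n / (pi[a] * pj[b]))
               for (a, b), c in pij.items())


def max_spanning_tree(n_nodes, weight):
    """Prim-style growth from node 0; returns (parent, child) edges."""
    in_tree = {0}
    edges = []
    while len(in_tree) < n_nodes:
        u, v = max(((u, v) for u in in_tree
                    for v in range(n_nodes) if v not in in_tree),
                   key=lambda e: weight[frozenset(e)])
        edges.append((u, v))   # u becomes the parent of v
        in_tree.add(v)
    return edges


# Made-up binary samples over three features (columns 0, 1, 2).
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1),
           (0, 0, 0), (1, 1, 1), (0, 1, 0), (1, 0, 1)]
weight = {frozenset((i, j)): pairwise_mi(samples, i, j)
          for i, j in combinations(range(3), 2)}
tree = max_spanning_tree(3, weight)
print(tree)
```

Swapping the weight function for a class-conditional MI estimate would give the TAN weighting in the same framework; only the edge weights change, not the spanning-tree step.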

Page 37: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – MaxMI hybrid solution

$w_{MM}(F_i; F_j) + w_{FT}(F_i; F_j) = w_{TAN}(F_i; F_j) + MI(F_i; C; \theta_i)$

$\arg\max \left( Score_{MM} + Score_{FT} \right)$

Increase: $Score_{MM}$ ? ICR

$TAN = \arg\max Score_{TAN}$

Non-decrease: TAN

Asymptotic correctness

$Score_{MM} + Score_{FT} = Score_{TAN} + \sum_i MI(F_i; C; \theta_i)$

$Score_{MM} + Score_{FT}$

Back…
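The relation between the three edge weights is the chain rule for mutual information: MI(F_i; C | F_j) + MI(F_i; F_j) = MI(F_i; F_j | C) + MI(F_i; C). A quick numerical check on a random joint distribution, written as a sketch with hypothetical helper names and binary variables, is:

```python
import math
import random
from itertools import product

random.seed(0)
# Random joint distribution over (c, fi, fj), all binary; purely illustrative.
raw = {k: random.random() for k in product((0, 1), repeat=3)}
z = sum(raw.values())
p = {k: v / z for k, v in raw.items()}


def marginal(keep):
    """Marginalize the joint onto the variable positions listed in keep."""
    out = {}
    for k, v in p.items():
        key = tuple(k[i] for i in keep)
        out[key] = out.get(key, 0.0) + v
    return out


def H(keep):
    """Joint entropy (bits) of the variables at the given positions."""
    return -sum(v * math.log2(v) for v in marginal(keep).values() if v > 0)


C, FI, FJ = 0, 1, 2
mi_fi_c = H([FI]) + H([C]) - H([FI, C])                           # MI(F_i; C)
w_ft = H([FI]) + H([FJ]) - H([FI, FJ])                            # MI(F_i; F_j)
w_tan = H([FI, C]) + H([FJ, C]) - H([FI, FJ, C]) - H([C])         # MI(F_i; F_j | C)
w_mm = H([FI, FJ]) + H([C, FJ]) - H([FI, C, FJ]) - H([FJ])        # MI(F_i; C | F_j)

lhs, rhs = w_mm + w_ft, w_tan + mi_fi_c
print(f"w_MM + w_FT = {lhs:.6f}, w_TAN + MI(F_i; C) = {rhs:.6f}")
```

Because the identity holds per edge, summing it over the edges of any tree gives the corresponding relation between the three scores.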

Page 38: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

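MaxMI Training – MaxMI hybrid

Back…

[Formula: weighted form of the relation between $w_{MM}(F_i, F_i^-)$, $w_{FT}(F_i, F_i^-)$, $w_{TAN}(F_i, F_i^-)$ and $MI(F_i; C)$, with a coefficient on $w_{FT}$]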

Page 39: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – empirical results

Before training:

After training:

Back…

Page 40: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – empirical results

[Bar chart: Face Parts Model - Classification Errors - Test DB (2257). Methods: MaxMI; Original Training; MaxMI+TAN constrained; MaxMI+TAN greedy; MaxMI+TAN hybrid; MaxMI (threshold only); MaxMI+TAN O&U soft EM. Series: Miss, FA, Total Errors]

Back…

Page 41: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – empirical results

Face Parts Model | Test DB Size | Training DB Size | Class entropy on training DB | MI model to class on training DB | Error rate on test DB | Error rate on training DB
MaxMI Training | 2257 | 767 | 0.792690834 | 0.758242464 | 135 | 25
Original Training | 2257 | 767 | 0.792690834 | 0.722429352 | 136 | 35
MaxMI Training with constrained TAN restructure | 2257 | 767 | 0.792690834 | 0.756855168 | Miss=62, FA=36 | Miss=15, FA=3
MaxMI Training with greedy TAN restructure | 2257 | 767 | 0.792690834 | 0.746516913 | Miss=30, FA=44 | Miss=16, FA=3
Alternative MaxMI Training with TAN restructure | 2257 | 767 | 0.792690834 | 0.74711484 | Miss=33, FA=109 | N/A
Threshold only training (without restructure) | 2257 | 767 | 0.792690834 | 0.738676981 | Miss=84, FA=46 | Miss=30, FA=5
Observed & Un-observed model training constructed from the all-observed model and soft EM | 2257 | 767 | 0.792690834 | N/A | 67 | N/A

Back…

Page 42: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – empirical results

Cow Parts Model | Test DB Size | Training DB Size | Class entropy on training DB | MI model to class on training DB | Error rate on test DB | Error rate on training DB
Original Training | 2256 | 961 | 0.46535663 | N/A | Miss=84, FA=64 | Miss=36, FA=16
MaxMI Training | 2256 | 961 | 0.46535663 | N/A | Miss=53, FA=42 | Miss=25, FA=17
MaxMI Training with constrained TAN restructure | 2256 | 961 | 0.46535663 | N/A | Miss=32, FA=48 | Miss=17, FA=12
MaxMI Training with greedy TAN restructure | 2256 | 961 | 0.46535663 | N/A | Miss=59, FA=30 | Miss=23, FA=16
Observed & Un-observed model training constructed from the all-observed model and trained using soft EM | 2256 | 961 | 0.46535663 | N/A | 89 | N/A

Back…

Page 43: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Remove – Contract – Split

Back…

Page 44: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Making the assumptions true

Approximating the function

Strong z-minor

Challenge: selecting proper Z constants

Benefit: single step convergence

Weak z-minor

Drawback: exponential in number of “chips”

Benefit: less restrictive

Back…

[Formulas: g(y, x) and its successive approximations in terms of f(y, x), g(y, x), g(y, z) and g(y, Z)]

Page 45: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

The clique tree

Back…

[Diagram: clique tree with cliques C1, C2, C3, C4, ..., Ck; nodes $v_i$, $v_j$ with functions $f_i(v_i)$, $f_j(v_j)$ and edge weight $\log w_{ij}(v_i, v_j)$]

Page 46: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Experimental Results

A2 (same "slow" speed)

Model Size | Node Count | Value Count | Sample Count | Average Approximation | Average Mismatch | Average Match (%)
Depth=3, Branching=5 | 31 | 4 | 1000 | 94.11% | 15-16 | 50.31%
Depth=3, Branching=5 | 31 | 3 | 1000 | 94.55% | 11-12 | 63.70%
Depth=3, Branching=5 | 31 | 2 | 1000 | 97.16% | 4-5 | 84.60%
Based on Natural feature trees, 4 cliques of size 7 | 25 | 2 | ~2000 | 98.34% | 1-2 | 93.62%

A2 (different "slow" speed)

Model Size | Node Count | Value Count | Sample Count | Average Approximation | Average Mismatch | Average Match (%)
Depth=3, Branching=5 | 31 | 4 | 1000 | 98.26% | 10-11 | 65.22%
Depth=3, Branching=5 | 31 | 3 | 1000 | 98.08% | 7-8 | 74.51%
Depth=3, Branching=5 | 31 | 2 | 1000 | 98.55% | 3-4 | 88.62%
Based on Natural feature trees, 4 cliques of size 7 | 25 | 2 | ~2000 | 97.85% | 3-4 | 86.14%

Page 47: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Experimental Results

Random Slow Connections

Model Size | Node Count | Value Count | Sample Count | Average Approximation | Average Mismatch | Average Match (%)
Depth=3, Branching=5 | 31 | 4 | 1000 | 82.70% | 20-21 | 34.58%
Depth=3, Branching=5 | 31 | 3 | 1000 | 81.52% | 16-17 | 45.48%
Depth=3, Branching=5 | 31 | 2 | 1000 | 79.37% | 11-12 | 62.23%
Based on Natural feature trees, 4 cliques of size 7 | 25 | 2 | ~2000 | N/A | N/A | N/A

Loopy Belief Revision (50 messages per node)

Model Size | Node Count | Value Count | Sample Count | Average Approximation | Average Mismatch | Average Match (%)
Depth=3, Branching=5 | 31 | 4 | 1000 | N/A | N/A | N/A
Depth=3, Branching=5 | 31 | 3 | 1000 | 89.17% | 13-14 | 55.31%
Depth=3, Branching=5 | 31 | 2 | 1000 | 88.73% | 8-9 | 72.80%
Based on Natural feature trees, 4 cliques of size 7 | 25 | 2 | ~2000 | 93.34% | 3-4 | 87.73%

Page 48: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

Experimental Results

Loopy Belief Revision (10 messages per node)

Model Size | Node Count | Value Count | Sample Count | Average Approximation | Average Mismatch | Average Match (%)
Depth=3, Branching=5 | 31 | 4 | 1000 | 87.65% | 17-18 | 41.95%
Depth=3, Branching=5 | 31 | 3 | 1000 | 86.74% | 14-15 | 54.02%
Depth=3, Branching=5 | 31 | 2 | 1000 | 85.78% | 8-9 | 71.80%
Based on Natural feature trees, 4 cliques of size 7 | 25 | 2 | ~2000 | N/A | N/A | N/A

Ignore Sibling Loopy Links

Model Size | Node Count | Value Count | Sample Count | Average Approximation | Average Mismatch | Average Match (%)
Depth=3, Branching=5 | 31 | 4 | 1000 | 74.04% | 21-22 | 29.25%
Depth=3, Branching=5 | 31 | 3 | 1000 | 71.89% | 19-20 | 38.56%
Depth=3, Branching=5 | 31 | 2 | 1000 | 69.38% | 13-14 | 56.09%
Based on Natural feature trees, 4 cliques of size 7 | 25 | 2 | ~2000 | 73.45% | 9-10 | 63.88%

Back…

Page 49: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION
Page 50: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI Training – extensions

• Observed and unobserved model.
• MaxMI augmented to support O&U
• Training observed only + EM heuristic.
• Complete training
• Constrained and greedy TAN restructure.
• MaxMI vs. MinPE in ideal scenario – characterization and comparison.
• Future research directions

Page 51: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

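MaxMI vs. MinPE

MinPE:

$\{F, \theta_F, \text{model structure}\} = \arg\min_{F, \theta_F, \text{model structure}} P\left(C \ne \arg\max_C P(C \mid F; \theta_F)\right)$

MaxMI:

$\{F, \theta_F, \text{model structure}\} = \arg\max_{F, \theta_F, \text{model structure}} MI(C; F; \theta_F)$

Fano & inverse Fano (binary C):

$H(C \mid F; \theta_F) \le H(P_E)$

$P_E \le \frac{1}{2} H(C \mid F; \theta_F)$

Back…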
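The two bounds for binary C can be checked numerically. The sketch below draws random joint distributions over a binary class and a four-valued feature (an arbitrary choice for illustration), computes the MAP error P_E and the conditional entropy H(C | F) in bits, and collects the pairs so both inequalities can be verified; the function names are hypothetical.

```python
import math
import random


def hb(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def check(joint):
    """joint[c][f] = P(C=c, F=f) with binary C. Returns (P_E, H(C|F))."""
    n_f = len(joint[0])
    p_f = [joint[0][f] + joint[1][f] for f in range(n_f)]
    # The MAP classifier errs with the smaller of the two joint masses for each f.
    p_e = sum(min(joint[0][f], joint[1][f]) for f in range(n_f))
    h = sum(p_f[f] * hb(joint[0][f] / p_f[f]) for f in range(n_f) if p_f[f] > 0)
    return p_e, h


random.seed(0)
results = []
for _ in range(1000):
    raw = [[random.random() for _ in range(4)] for _ in range(2)]
    z = sum(map(sum, raw))
    results.append(check([[v / z for v in row] for row in raw]))

ok = all(h <= hb(pe) + 1e-12 and pe <= 0.5 * h + 1e-12 for pe, h in results)
print("both bounds hold on all samples:", ok)
```

Together the bounds sandwich P_E between two functions of H(C | F), so lowering H(C | F), which is what maximizing MI does for a fixed H(C), tightens the range in which the error can lie.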

Page 52: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

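MaxMI vs. MinPE – ideal scenario

Setting: n-valued C, k-valued F.

MinPE:

Arrange: $P(C = i - 1) \ge P(C = i)$ for $2 \le i \le n$

Select F: $j = \arg\max_i P(C = i \mid F = j)$ for $1 \le j \le k$

MaxMI:

Divide: $\{A_1, \ldots, A_k\} = \arg\max_{A_1 \cup \cdots \cup A_k = \{1, \ldots, n\}} H(A)$, where $P(A_j) = P(C \in A_j)$

Select F: for $1 \le j \le k$, $P(C = i \mid F = j) = P(C = i) / P(A_j)$ if $i \in A_j$, and $P(C = i \mid F = j) = 0$ if $i \notin A_j$

Back…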

Page 53: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION

MaxMI vs. MinPE – ideal scenario

In general MaxMI MinPE

In special cases MaxMI MinPE

With increase in number of guesses:

Implications:

MaxMI MinPE

Back…