Hierarchical Shape Classification Using Bayesian Aggregation

Zafer Barutcuoglu, Christopher DeCoro
Princeton University


Post on 21-Jan-2016



Page 1: Hierarchical Shape Classification Using Bayesian Aggregation

Hierarchical Shape Classification Using Bayesian Aggregation

Zafer Barutcuoglu, Christopher DeCoro
Princeton University

Page 2: Hierarchical Shape Classification Using Bayesian Aggregation

Shape Matching

• Given two shapes, quantify the difference between them
  – Useful for search and retrieval, image processing, etc.

• A common approach is to use shape descriptors
  – Map an arbitrary definition of shape into a representative vector
  – Define a distance measure (e.g. Euclidean) to quantify similarity
  – Examples include GEDT, SHD, REXT, etc.

• A common application is classification
  – Given an example and a set of classes, which class is most appropriate for that example?
  – Applicable to a wide range of problems
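As a minimal sketch of the descriptor idea, the snippet below compares shapes by the Euclidean distance between their descriptor vectors and ranks a small database by similarity to a query. The function names, the 3-element vectors, and the shape names are made up for illustration; real descriptors such as SHD are much longer.

```python
import math

def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_similarity(query, database):
    """Return shape names sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: descriptor_distance(query, database[name]))

# Toy descriptors (hypothetical values, not taken from the presentation).
database = {
    "chair": [1.0, 0.0, 0.0],
    "table": [0.9, 0.2, 0.0],
    "plane": [0.0, 1.0, 1.0],
}
query = [1.0, 0.1, 0.0]
ranking = rank_by_similarity(query, database)
print(ranking)
```

A smaller distance means more similar shapes, so the first entry in the ranking is the best match for search and retrieval.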

Page 3: Hierarchical Shape Classification Using Bayesian Aggregation

Hierarchical Classification

• Given a hierarchical set of classes,
• and a set of labeled examples for those classes,
• predict the hierarchically consistent classification of a novel example, using the hierarchy to improve performance.

Example courtesy of “The Princeton Shape Benchmark”, P. Shilane et al. (2004)

Page 4: Hierarchical Shape Classification Using Bayesian Aggregation

Motivation

• Given these, how can we predict classes for novel shapes?
• Conventional algorithms don’t apply directly to hierarchies
  – Binary classification
  – Multi-class (one-of-M) classification
• Using binary classification for each class can produce predictions that contradict the hierarchy
• Using multi-class classification over the leaf nodes loses information by ignoring the hierarchy

Page 5: Hierarchical Shape Classification Using Bayesian Aggregation

Other hierarchical classification methods, other domains

• TO ZAFER: I need something here about background information, other methods, your method, etc.
• Also, Szymon suggested a slide about conditional probabilities and Bayes nets in general. Could you come up with something very simplified and direct that would fit with the rest of the presentation?

Page 6: Hierarchical Shape Classification Using Bayesian Aggregation

Motivation (Example)

• Independent classifiers give an inconsistent prediction
  – Classified as bird, but not classified as flying creature
• They also cause incorrect results
  – Not classified as flying bird
  – Incorrectly classified as dragon
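The kind of inconsistency described here can be detected mechanically. The sketch below flags every class that is predicted positive while its parent is predicted negative. The parent map is an assumption made for illustration (the slide only implies that bird sits below flying creature), and the prediction values are invented to mirror the example.

```python
# Hypothetical parent map, assumed for illustration only.
PARENT = {
    "flying creature": "animal",
    "bird": "flying creature",
    "flying bird": "bird",
    "dragon": "flying creature",
}

def find_inconsistencies(predictions):
    """Return (child, parent) pairs where the child is predicted
    positive but its parent is predicted negative."""
    return [
        (child, parent)
        for child, parent in PARENT.items()
        if predictions.get(child, False) and not predictions.get(parent, False)
    ]

# Independent per-class predictions (made-up values).
predictions = {
    "animal": True,
    "flying creature": False,
    "bird": True,
    "flying bird": False,
    "dragon": True,
}
violations = find_inconsistencies(predictions)
print(violations)
```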

Page 7: Hierarchical Shape Classification Using Bayesian Aggregation

Motivation (Example)

• We can correct this using our Bayesian Aggregation method
  – Removes the inconsistency at flying creature
• It also improves the classification results
  – Stronger prediction of flying bird
  – No longer classified as dragon

Page 8: Hierarchical Shape Classification Using Bayesian Aggregation

Naïve Hierarchical Consistency

[Figure: three panels labeled INDEPENDENT, TOP-DOWN, and BOTTOM-UP, each annotated with the predictions animal: YES, biped: NO, human: YES.]

Unfair distribution of responsibility and correction
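The slide names the two naive corrections without defining them. One plausible reading, sketched below, is that top-down correction lets a negative parent override its descendants, while bottom-up correction lets a positive child override its ancestors. Both that interpretation and the chain animal → biped → human are assumptions made for illustration.

```python
# Assumed chain for illustration: animal -> biped -> human.
PARENT = {"biped": "animal", "human": "biped"}
ORDER = ["animal", "biped", "human"]  # parents listed before children

def top_down(pred):
    """A negative parent forces all of its descendants to negative."""
    fixed = dict(pred)
    for cls in ORDER:
        parent = PARENT.get(cls)
        if parent is not None and not fixed[parent]:
            fixed[cls] = False
    return fixed

def bottom_up(pred):
    """A positive child forces all of its ancestors to positive."""
    fixed = dict(pred)
    for cls in reversed(ORDER):
        parent = PARENT.get(cls)
        if parent is not None and fixed[cls]:
            fixed[parent] = True
    return fixed

independent = {"animal": True, "biped": False, "human": True}
print(top_down(independent))   # human is switched off
print(bottom_up(independent))  # biped is switched on
```

Under this reading, each strategy trusts one end of the hierarchy unconditionally, which is one way to understand the "unfair distribution of responsibility" noted on the slide.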

Page 9: Hierarchical Shape Classification Using Bayesian Aggregation

Our Method – Bayesian Aggregation

• Evaluate individual classifiers for each class
  – Inconsistent predictions allowed
  – Any classification algorithm can be used (e.g. kNN)
  – Parallel evaluation
• Bayesian aggregation of predictions
  – Inconsistencies resolved globally

Page 10: Hierarchical Shape Classification Using Bayesian Aggregation

Our Method - Implementation

• Shape descriptor: Spherical Harmonic Descriptor*
  – Converts a shape into a 512-element vector
  – Compared using Euclidean distance
• Binary classifier: k-Nearest Neighbors
  – Finds the k nearest labeled training examples
  – Novel example is assigned to the most common class among them
• Simple to implement, yet flexible

* “Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors”, M. Kazhdan et al. (2003)
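A minimal sketch of a binary k-nearest-neighbors classifier over descriptor vectors, assuming Euclidean distance and a simple majority vote. The function name `knn_predict`, the 2-element descriptors, and the value of k are illustrative choices, not details taken from the presentation.

```python
import math

def knn_predict(query, training_set, k=3):
    """Binary kNN: training_set is a list of (descriptor, label) pairs with
    label 0 or 1. Returns the majority label among the k nearest examples."""
    if k <= 0 or k > len(training_set):
        raise ValueError("k must be between 1 and the training set size")
    neighbors = sorted(training_set, key=lambda item: math.dist(query, item[0]))[:k]
    positives = sum(label for _, label in neighbors)
    # Strict majority; an odd k avoids ties.
    return 1 if 2 * positives > k else 0

# Toy 2-element descriptors (hypothetical; SHD vectors have 512 elements).
training_set = [
    ([0.0, 0.0], 0),
    ([0.1, 0.0], 0),
    ([1.0, 1.0], 1),
    ([0.9, 1.1], 1),
    ([1.1, 0.9], 1),
]
print(knn_predict([1.0, 0.9], training_set, k=3))
print(knn_predict([0.05, 0.0], training_set, k=3))
```

One such classifier would be trained per class in the hierarchy, treating members of that class as positives and all other shapes as negatives.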

Page 11: Hierarchical Shape Classification Using Bayesian Aggregation

A Bayesian Framework

[Diagram: Bayesian network over the example classes animal, biped, flying creature, and superman, with a true-label node y1...y4 for each class and a classifier-output node g1...g4 attached to each label.]

Given predictions g1...gN from kNN, find the most likely true labels y1...yN

Page 12: Hierarchical Shape Classification Using Bayesian Aggregation

Classifier Output Likelihoods

$P(y_1 \ldots y_N \mid g_1 \ldots g_N) = \alpha \, P(g_1 \ldots g_N \mid y_1 \ldots y_N) \, P(y_1 \ldots y_N)$

• Conditional independence assumption
  – Classifier outputs depend only on their true labels
  – Given its true label, an output is conditionally independent of all other labels and outputs

$P(g_1 \ldots g_N \mid y_1 \ldots y_N) = \prod_i P(g_i \mid y_i)$
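Under the conditional independence assumption, the joint likelihood of the classifier outputs is just a product of per-class terms. The sketch below shows that product for binary labels and outputs. The table layout `cond[i][y][g]` and the probability values are assumptions chosen for illustration.

```python
def joint_likelihood(outputs, labels, cond):
    """P(g_1..g_N | y_1..y_N) as the product of P(g_i | y_i).

    cond[i][y][g] holds P(g_i = g | y_i = y) for class i."""
    result = 1.0
    for i, (g, y) in enumerate(zip(outputs, labels)):
        result *= cond[i][y][g]
    return result

# Made-up conditional tables for two classes.
cond = [
    [[0.9, 0.1], [0.2, 0.8]],  # class 0: rows are y=0, y=1; columns are g=0, g=1
    [[0.7, 0.3], [0.4, 0.6]],  # class 1
]
# Outputs g = (1, 0) given labels y = (1, 1): 0.8 * 0.4
value = joint_likelihood([1, 0], [1, 1], cond)
print(value)
```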

Page 13: Hierarchical Shape Classification Using Bayesian Aggregation

Estimating $P(g_i \mid y_i)$

                      Predicted negative   Predicted positive
Negative examples     #(g=0,y=0)           #(g=1,y=0)
Positive examples     #(g=0,y=1)           #(g=1,y=1)

The confusion matrix is obtained using cross-validation

e.g. P(g=0 | y=0) ≈ #(g=0,y=0) / [ #(g=0,y=0) + #(g=1,y=0) ]
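The estimate above amounts to normalizing each row of the confusion matrix. A minimal sketch, assuming binary outputs and labels collected from held-out predictions (the function name and the sample data are illustrative):

```python
def estimate_conditionals(predicted, actual):
    """Estimate P(g | y) from paired held-out outputs g and true labels y.

    Returns cond with cond[y][g] = #(g, y) / [#(g=0, y) + #(g=1, y)]."""
    counts = [[0, 0], [0, 0]]  # counts[y][g]
    for g, y in zip(predicted, actual):
        counts[y][g] += 1
    cond = []
    for y in (0, 1):
        total = counts[y][0] + counts[y][1]
        if total == 0:
            raise ValueError(f"no examples with y={y}")
        cond.append([counts[y][0] / total, counts[y][1] / total])
    return cond

# Made-up held-out outputs and true labels for one class.
predicted = [0, 0, 1, 0, 1, 1, 0, 1]
actual    = [0, 0, 0, 0, 1, 1, 1, 1]
cond = estimate_conditionals(predicted, actual)
print(cond)
```

With very few examples per class, a row can contain zero counts; some form of smoothing may be needed in practice, though the slides do not say whether any was used.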

Page 14: Hierarchical Shape Classification Using Bayesian Aggregation

Hierarchical Class Priors

$P(y_1 \ldots y_N \mid g_1 \ldots g_N) = \alpha \, P(g_1 \ldots g_N \mid y_1 \ldots y_N) \, P(y_1 \ldots y_N)$

• Hierarchical dependency model
  – Class prior depends only on children

$P(y_1 \ldots y_N) = \prod_i P(y_i \mid y_{\mathrm{children}(i)})$

• Enforces hierarchical consistency
  – The probability of an inconsistent assignment is 0
  – Bayesian inference will not allow inconsistency
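One way to realize such a prior is sketched below: a class with any positive child must itself be positive, and otherwise it is positive with some class-specific probability. The parameter name `p_alone`, the three-class chain, and all numeric values are assumptions made for illustration, not values from the presentation.

```python
from itertools import product

def hierarchical_prior(labels, children, p_alone):
    """P(y_1..y_N) as the product of P(y_i | y_children(i)).

    labels:   dict class -> 0/1
    children: dict class -> list of child classes
    p_alone:  dict class -> P(y_i = 1 | all children of i are 0)"""
    prob = 1.0
    for cls, y in labels.items():
        if any(labels[child] for child in children.get(cls, [])):
            # A positive child makes a negative parent impossible.
            prob *= 1.0 if y == 1 else 0.0
        else:
            prob *= p_alone[cls] if y == 1 else 1.0 - p_alone[cls]
    return prob

# Assumed toy hierarchy: animal -> biped -> human.
children = {"animal": ["biped"], "biped": ["human"], "human": []}
p_alone = {"animal": 0.3, "biped": 0.2, "human": 0.1}

inconsistent = hierarchical_prior({"animal": 1, "biped": 0, "human": 1}, children, p_alone)
consistent = hierarchical_prior({"animal": 1, "biped": 1, "human": 1}, children, p_alone)
print(inconsistent, consistent)
```

Because every inconsistent assignment gets prior probability 0, it also gets posterior probability 0 regardless of what the classifiers output.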

Page 15: Hierarchical Shape Classification Using Bayesian Aggregation

Conditional Probabilities

• $P(y_i \mid y_{\mathrm{children}(i)})$
  – Inferred from known labeled examples
• $P(g_i \mid y_i)$
  – Inferred by validation on held-out data

[Diagram: the Bayesian network from the earlier slide, with label nodes y1...y4 and classifier-output nodes g1...g4.]

• We can now apply Bayesian inference algorithms
  – The particular algorithm is independent of our method
  – Results in globally consistent predictions
  – Uses information present in the hierarchy to improve predictions
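Since the method does not depend on a particular inference algorithm, the sketch below uses the simplest one available: exhaustive enumeration of all label assignments, which is only feasible for tiny hierarchies. It is not necessarily what the authors used. The hierarchy, the prior parameters, and the conditional tables are all made-up values for illustration.

```python
from itertools import product

# Assumed toy hierarchy and parameters (illustrative values only).
CLASSES = ["animal", "biped", "human"]
CHILDREN = {"animal": ["biped"], "biped": ["human"], "human": []}
P_ALONE = {"animal": 0.3, "biped": 0.2, "human": 0.1}   # P(y=1 | all children 0)
COND = {c: [[0.8, 0.2], [0.3, 0.7]] for c in CLASSES}    # COND[c][y][g] = P(g | y)

def prior(labels):
    """P(y) as a product of P(y_i | children); 0 for inconsistent assignments."""
    prob = 1.0
    for cls, y in labels.items():
        if any(labels[child] for child in CHILDREN[cls]):
            prob *= 1.0 if y == 1 else 0.0
        else:
            prob *= P_ALONE[cls] if y == 1 else 1.0 - P_ALONE[cls]
    return prob

def likelihood(outputs, labels):
    """P(g | y) as a product of per-class terms P(g_i | y_i)."""
    prob = 1.0
    for cls in CLASSES:
        prob *= COND[cls][labels[cls]][outputs[cls]]
    return prob

def posterior_marginals(outputs):
    """P(y_i = 1 | g) for every class, by summing over all assignments."""
    weights = {}
    for values in product((0, 1), repeat=len(CLASSES)):
        labels = dict(zip(CLASSES, values))
        weights[values] = prior(labels) * likelihood(outputs, labels)
    total = sum(weights.values())  # plays the role of 1 / alpha
    return {
        cls: sum(w for v, w in weights.items() if v[i] == 1) / total
        for i, cls in enumerate(CLASSES)
    }

# Inconsistent classifier outputs: animal YES, biped NO, human YES.
marginals = posterior_marginals({"animal": 1, "biped": 0, "human": 1})
print(marginals)
```

Even though the raw outputs contradict the hierarchy, the resulting marginals are consistent: a parent is never less probable than its child.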

Page 16: Hierarchical Shape Classification Using Bayesian Aggregation

Applying Bayesian Aggregation

• Training phase produces a Bayes network
  – From the hierarchy and training set, train classifiers
  – Use cross-validation to generate conditional probabilities
  – Use the probabilities to create the Bayes net
• Test phase gives probabilities for novel examples
  – For a novel example, apply the classifiers
  – Use the classifier outputs and the existing Bayes net to infer the probability of membership in each class

[Diagram. Training: Hierarchy + Training Set → Classifiers → Cross-validation → Bayes Net. Testing: Test Example → Classifiers → Bayes Net → Class Probabilities.]
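The cross-validation step of the training phase can be sketched as follows: each training example is classified by a model that never saw it, giving held-out outputs that can then be compared with the true labels. The fold assignment by index, the 1-nearest-neighbor stand-in classifier, and the toy data are assumptions made for illustration.

```python
import math

def nearest_label(query, train):
    """Stand-in classifier: label of the single nearest training example."""
    return min(train, key=lambda item: math.dist(query, item[0]))[1]

def cross_validated_outputs(examples, folds=2):
    """Return one held-out prediction per example.

    examples is a list of (descriptor, label) pairs; example i belongs to
    fold i % folds and is classified using only the other folds."""
    outputs = [None] * len(examples)
    for fold in range(folds):
        train = [ex for i, ex in enumerate(examples) if i % folds != fold]
        for i, (descriptor, _) in enumerate(examples):
            if i % folds == fold:
                outputs[i] = nearest_label(descriptor, train)
    return outputs

# Toy 1-element descriptors for a single class (hypothetical values).
examples = [
    ([0.0], 0), ([0.1], 0), ([0.2], 0), ([0.3], 0),
    ([1.0], 1), ([1.1], 1), ([1.2], 1), ([1.3], 1),
]
outputs = cross_validated_outputs(examples, folds=2)
print(outputs)
```

The paired outputs and true labels are exactly what is needed to fill in a confusion matrix for each class.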

Page 17: Hierarchical Shape Classification Using Bayesian Aggregation

Experimental Results

• 2-fold cross-validation on each class using kNN
• Area Under the ROC Curve (AUC) for evaluation
  – Real-valued predictor can be thresholded arbitrarily
  – Probability that a positive example is ranked above a negative example
• 169 of 170 classes were improved by our method
  – Average AUC change = +0.137 (+19% of old AUC)
  – Average old AUC = 0.7004 (27 classes had an AUC of 0.5, i.e. random guessing)
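The probabilistic reading of AUC given above can be computed directly by comparing every positive example with every negative one. A minimal sketch, counting ties as half a win (the function name and the scores are illustrative):

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive example has the higher score."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up scores: 5 of the 6 positive/negative pairs are ordered correctly.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
print(auc(scores, labels))
```

A predictor that gives every example the same score ties on every pair and gets an AUC of 0.5, which matches the "random guessing" value quoted for 27 of the classes.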

Page 18: Hierarchical Shape Classification Using Bayesian Aggregation

AUC Scatter Plot

[Figure: scatter plot of per-class AUC. x-axis: AUC for kNN (0.5 to 1); y-axis: AUC for kNN+Bayes (0.5 to 1).]

Scatterplot of AUC scores after vs. before Bayesian correction

Page 19: Hierarchical Shape Classification Using Bayesian Aggregation

AUC Changes

• 169 of 170 classes were improved by our method
  – Average AUC change = +0.137 (+19% of old AUC)
  – Average old AUC = 0.7004 (27 classes had an AUC of 0.5, i.e. random guessing)

Page 20: Hierarchical Shape Classification Using Bayesian Aggregation

Questions