
Robust Statistics: Why do we use the norms we do?

Henrik Aanæs

IMM,DTU

haa@imm.dtu.dk

A good general reference is: Robust Statistics: Theory and Methods, by Maronna, Martin and Yohai. Wiley Series in Probability and Statistics.

How Tall are You?

Idea of Robust Statistics

To fit or describe the bulk of a data set well without being perturbed (influenced too much) by a small portion of outliers. This should be done without a pre-processing segmentation of the data.

We thus now model our data set as consisting of inliers, which follow some distribution, and outliers, which do not.

Outliers can be interesting too!

Line Example


Robust Statistics in Computer Vision: Image Smoothing

Image by Frederico D'Almeida

Robust Statistics in Computer Vision: Optical Flow

Sequence from the MIT BCS Perceptual Science Group. Demo by John Y. A. Wang.

Robust Statistics in Computer Vision: Tracking via View Geometry

Image 1 Image 2

Gaussian / Normal Distribution: The Distribution We Usually Use

Nice properties:
• Central Limit Theorem.
• Induces the 2-norm.
• Leads to linear computations.

But:
• Is fiercely influenced by outliers.
• Empirical distributions often have ‘fatter’ tails.
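The outlier sensitivity of the mean (the 2-norm estimate) versus the median is easy to see numerically. A minimal sketch with made-up height data, echoing the "How tall are you?" slide:

```python
# Hypothetical height data in metres; 18.0 is a gross outlier
# (e.g. a unit or typing error), not taken from the slides.
heights = [1.80, 1.75, 1.82, 1.79, 1.81, 18.0]

mean = sum(heights) / len(heights)            # pulled far from the bulk
median = sorted(heights)[len(heights) // 2]   # barely affected

print(mean, median)   # mean ~ 4.5, median 1.81
```

A single bad observation drags the mean well outside the range of the bulk, while the median stays put.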

Gaussians Are Just Models Too

Alternative title of this talk

Error or ρ-functions: Converting from Model-Data Deviation to Objective Function

ρ-functions and ML: A typical way of forming ρ-functions

ρ-functions and ML II: A typical way of forming ρ-functions

Typical ρ-functions: Where the Robustness in Practice Comes From

• 2-norm
• 1-norm
• Huber norm
• Truncated quadratic
• Bi-squared

General idea: down-weight outliers, i.e. ρ(x) should be ‘smaller’ for large |x|.
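These ρ-functions can be written down directly; a minimal sketch, where the tuning constants (1.345 for Huber, 4.685 for bi-squared) are common defaults from the robustness literature, not values given on the slides:

```python
# Sketch of the five rho-functions listed above.
def rho_2(x):                       # 2-norm (least squares)
    return x * x

def rho_1(x):                       # 1-norm
    return abs(x)

def rho_huber(x, k=1.345):          # quadratic near 0, linear in the tails
    a = abs(x)
    return 0.5 * x * x if a <= k else k * (a - 0.5 * k)

def rho_trunc(x, k=2.0):            # truncated quadratic: caps the penalty
    return min(x * x, k * k)

def rho_bisquare(x, k=4.685):       # Tukey bi-squared: smooth and bounded
    a = abs(x)
    if a >= k:
        return k * k / 6.0
    return (k * k / 6.0) * (1.0 - (1.0 - (x / k) ** 2) ** 3)
```

Evaluated at a large residual, the 2-norm explodes quadratically, the 1-norm and Huber norm grow only linearly, and the truncated quadratic and bi-squared are bounded; that ordering is exactly where the robustness in practice comes from.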

Typical ρ-functions: the 2-norm

• Induced by the Gaussian.
• Very non-robust.
• The ‘standard’ choice.

Typical ρ-functions: the 1-norm

• Quite robust.
• Convex.
• Corresponds to the median.

The Median and the 1-Norm

The Median and the 1-Norm: Example with 2 observations

The Median and the 1-Norm: Example with more observations
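That the 1-norm estimate of location is the median can be checked by brute force; a small sketch with an illustrative data set:

```python
# Illustrative data with one outlier; the 1-norm cost is
# sum(|x - m|), minimized over a grid of candidate locations m.
data = [1, 2, 3, 4, 100]

def l1_cost(m, xs):
    return sum(abs(x - m) for x in xs)

candidates = [c / 10 for c in range(0, 1001)]   # 0.0, 0.1, ..., 100.0
best = min(candidates, key=lambda m: l1_cost(m, data))

print(best)   # 3.0, the median; the outlier at 100 does not move it
```

By contrast, the 2-norm cost sum((x − m)²) is minimized by the mean, here 22, which sits nowhere near the bulk of the data.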


Typical ρ-functions: the Huber norm

• A mixture of the 1-norm and the 2-norm.
• Convex.
• Has nice theoretical properties.

Typical ρ-functions: the truncated quadratic

• Discards outliers.
• For inliers, works as the Gaussian.
• Has a discontinuous derivative.

Typical ρ-functions: the bi-squared norm

• Discards outliers.
• Smooth.

Quantifying Robustness: A peek at tools for analysis

Bias vs. Variance

Quantifying Robustness: A peek at tools for analysis

Efficiency of an estimator θ̂:

    e(θ̂) = (1 / I(θ)) / var(θ̂),

where I(θ) is the Fisher information of the sample. In general e(θ̂) ≤ 1, so:

    var(θ̂) ≥ 1 / I(θ).

Related to the variance on the previous slide.
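As a concrete (hypothetical) illustration of efficiency: for N(0,1) samples the mean attains the Fisher bound var = 1/n, while the median has asymptotic variance π/(2n), i.e. efficiency 2/π ≈ 0.64. A Monte Carlo sketch:

```python
import random
import statistics

# Repeatedly estimate the location of N(0,1) samples with both the
# mean and the median, and compare their sampling variances.
random.seed(1)
n, reps = 101, 2000
means, medians = [], []
for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(sum(xs) / n)
    medians.append(sorted(xs)[n // 2])

# Efficiency of the median relative to the (efficient) mean.
eff = statistics.pvariance(means) / statistics.pvariance(medians)
# eff comes out close to 2/pi ~ 0.64
```

So the median buys its robustness at a price of roughly one third of the efficiency under the Gaussian, the kind of trade-off the efficiency measure quantifies.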

Quantifying Robustness: You want to be robust over a range of models

For the model F = (1 − ε)N(0,1) + εN(0,10), the asymptotic variance of the Huber-norm location estimate, as a function of the threshold k, is:

    k     ε = 0    ε = 0.05   ε = 0.10
    0     1.571    1.722      1.897
    0.7   1.187    1.332      1.501
    1.0   1.107    1.263      1.443
    1.4   1.047    1.227      1.439
    1.7   1.010    1.233      1.479
    2.0   1.010    1.259      1.550
    ∞     1.000    5.950      10.900

(k = 0 corresponds to the median, k = ∞ to the mean.)

From "Robust Statistics: Theory and Methods" by Maronna et al.
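Entries of such a table can be reproduced from the classical formula for the asymptotic variance of an M-estimate, V = E[ψ²] / E[ψ′]², with ψ(x) = clip(x, −k, k) for the Huber norm. A numerical sketch, assuming the contamination component N(0,10) has standard deviation 10 (which matches the k = ∞ column):

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def huber_asym_var(k, eps, sigma_c=10.0):
    """Asymptotic variance E[psi^2] / E[psi']^2 of the Huber location
    estimate under F = (1 - eps) N(0,1) + eps N(0, sigma_c^2),
    with psi(x) = clip(x, -k, k)."""
    num = den = 0.0
    for w, s in [(1.0 - eps, 1.0), (eps, sigma_c)]:
        a = k / s
        p_in = math.erf(a / math.sqrt(2.0))        # P(|X| <= k)
        ex2 = s * s * (p_in - 2.0 * a * phi(a))    # E[X^2; |X| <= k]
        num += w * (ex2 + k * k * (1.0 - p_in))    # E[psi^2]
        den += w * p_in                            # E[psi'] = P(|X| <= k)
    return num / (den * den)
```

For example, huber_asym_var(1.0, 0.05) comes out ≈ 1.263, matching the table entry for k = 1.0, ε = 0.05.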

Quantifying Robustness: A peek at tools for analysis

Other measures (similar):

• Breakdown point: how many outliers can an estimator handle and still give ‘reasonable’ results.

• Asymptotic bias: what bias does an outlier impose.

Back to Images: here we have multiple ‘models’

To fit or describe the bulk of a data set well without being perturbed (influenced too much) by a small portion of outliers. This should be done without a pre-processing segmentation of the data.

Optimization Methods

Typical Approach:

1. Find initial estimate.

2. Use Non-linear optimization and/or EM-algorithm.

NB: In this course we have seen, and will see, other methods, e.g. with guaranteed convergence.

Hough Transform: One of the oldest robust methods in ‘vision’

Often used for initial estimate.

Example from MatLab help

Curse of dimensionality = PROBLEM
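A minimal Hough-transform sketch for a single line (toy data, not the MATLAB example from the slide): each point votes for all (θ, ρ) bins consistent with it, and the fullest bin reveals the line.

```python
import math
from collections import Counter

# Toy data: ten points lying exactly on the line y = x + 1.
points = [(float(x), x + 1.0) for x in range(10)]

# Lines are parameterized as rho = x*cos(theta) + y*sin(theta);
# theta is sampled at 1-degree steps, rho is binned at 0.1 resolution.
votes = Counter()
for theta_deg in range(180):
    t = math.radians(theta_deg)
    for x, y in points:
        rho = x * math.cos(t) + y * math.sin(t)
        votes[(theta_deg, round(rho / 0.1))] += 1

(theta_best, rho_bin), count = votes.most_common(1)[0]
# Only the true line collects all 10 votes in a single bin:
# theta_best = 135 degrees (the normal direction), rho ~ 0.707.
```

The accumulator here has 180 × (number of ρ bins) cells for a 2-parameter model; for models with more parameters the number of bins grows exponentially, which is the curse of dimensionality noted above.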

RanSaC: Sampling in Hough space, better for higher dimensions

In a Hough setting, steps 1 and 2 correspond to finding a ‘good’ bin in Hough space, and step 3 corresponds to calculating its value.

RANdom SAmple Consensus, RANSAC

Iterate:

1. Draw minimal sample.

2. Fit model.

3. Evaluate model by Consensus.

Run RanDemo.m
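The three steps above can be sketched for line fitting as follows (illustrative data and thresholds, not the RanDemo.m demo):

```python
import random

random.seed(0)
# Ten exact inliers on y = 2x + 1, plus five scattered outliers.
inliers = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(random.uniform(0, 10), random.uniform(-20, 40))
            for _ in range(5)]
points = inliers + outliers

best_model, best_consensus = None, -1
for _ in range(50):
    # 1. Draw a minimal sample (2 points determine a line).
    (x1, y1), (x2, y2) = random.sample(points, 2)
    if x1 == x2:
        continue  # degenerate sample
    # 2. Fit the model to the sample.
    a = (y2 - y1) / (x2 - x1)
    b = y1 - a * x1
    # 3. Evaluate the model by consensus: points within a threshold.
    consensus = sum(1 for x, y in points if abs(y - (a * x + b)) < 0.5)
    if consensus > best_consensus:
        best_model, best_consensus = (a, b), consensus

a, b = best_model   # expect roughly a = 2, b = 1, consensus >= 10
```

Note that the fit uses only the minimal sample; a common refinement is to re-fit the model by (robust) least squares on the final consensus set.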

RanSaC: How many iterations?


Need to sample only inliers to ‘succeed’. A naïve scheme tries all combinations, i.e. all

    N_Obs! / (N_Obs − N_Sample)!

of them. E.g. for 100 points and a sample size of 7, this is 8.0678e+13 trials.

Preferred stopping scheme:
• Stop when there is, e.g., a 99% chance of having drawn at least one all-inlier sample.
• The chance of drawing an inlier is N_In / (N_In + N_Out).
• Use the consensus of the best fit so far as an estimate of N_In.

See e.g. Hartley and Zisserman: "Multiple View Geometry".
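This stopping rule is easy to compute. With inlier ratio w, sample size s and desired confidence p, the iteration count N must satisfy (1 − wˢ)ᴺ ≤ 1 − p; a sketch:

```python
import math

def ransac_iterations(w, s, p=0.99):
    """Iterations needed for probability p of drawing at least one
    all-inlier minimal sample (inlier ratio w, sample size s)."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w ** s))

# 50% inliers, sample size 7: a few hundred iterations suffice at 99%
# confidence, versus ~8.07e13 trials for the naive exhaustive scheme.
needed = ransac_iterations(0.5, 7)   # 588
naive = math.perm(100, 7)            # 100!/93! = 80678106432000
```

In practice w is unknown, so it is re-estimated from the best consensus set found so far, as in the bullet list above.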

Iteratively Reweighted Least Squares (IRLS)

An EM-type or ‘chicken and egg’ optimization.
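A minimal IRLS sketch for a robust location estimate, with hypothetical data and an illustrative Huber-type weight: given the current estimate, an E-like step computes weights from the residuals, and an M-like step solves a weighted least-squares problem, here just a weighted mean.

```python
data = [1.80, 1.75, 1.82, 1.79, 1.81, 18.0]   # one gross outlier
k = 0.5                                        # illustrative threshold

m = sum(data) / len(data)                      # start from the plain mean
for _ in range(50):
    # Weights: 1 inside the threshold, k/|r| outside (Huber-type).
    w = [1.0 if abs(x - m) <= k else k / abs(x - m) for x in data]
    # Weighted least squares for a location model is a weighted mean.
    m = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)

# m converges near the bulk (~1.9), unlike the plain mean (~4.5)
```

Each sweep down-weights points that look like outliers under the current fit and then re-fits, which is exactly the ‘chicken and egg’ structure: good weights need a good fit, and a good fit needs good weights.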
