GAUSSIAN PROCESS REGRESSION WITHIN AN ACTIVE LEARNING SCHEME

IGARSS 2011

University of Trento, Dept. of Information Engineering and Computer Science, Italy

Edoardo Pasolli

[email protected]

Farid [email protected]


July 28, 2011

Introduction

Supervised regression approach

[Diagram: processing chain with the blocks Image/Signal, Pre-processing, Feature extraction, Regression, and Prediction; the Regression block is fed by Training sample collection, performed by a Human expert.]

Training sample quality/quantity: impact on prediction errors

Introduction

Active learning approach for classification problems

[Diagram: active learning loop. Blocks: Training (labeled) set, Training of classifier, Model of classifier, Active learning method, Learning (unlabeled) set, Selected samples from learning (unlabeled) set, Labeling of selected samples by Human expert, Selected samples after labeling, Insertion in training set. Scatter plots in the feature space (f1, f2) illustrate Class 1, Class 2, and Class 3.]

Objective

Propose GP-based active learning strategies for biophysical parameter estimation problems


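Gaussian Processes (GPs)

Observation model

$$\mathbf{y} = \mathbf{f} + \boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(0, \sigma_n^2 I)$$

$$\mathbf{f} \sim GP(0, K(X, X)), \qquad \mathbf{y} \sim GP(0, K(X, X) + \sigma_n^2 I)$$

Predictive distribution

$$f_* \mid X, \mathbf{y}, \mathbf{x}_* \sim \mathcal{N}(\mu_*, \sigma_*^2)$$

$$\mu_* = \mathbf{k}_*^\top [K(X, X) + \sigma_n^2 I]^{-1} \mathbf{y}$$

$$\sigma_*^2 = k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^\top [K(X, X) + \sigma_n^2 I]^{-1} \mathbf{k}_*$$

$\sigma_n^2$: noise variance

$K(X, X)$: covariance matrix defined by covariance function $k(\mathbf{x}, \mathbf{x}')$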
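As a minimal sketch of the predictive equations above, the following NumPy code computes the predictive mean and variance for a set of test points. The function names (`sq_exp_kernel`, `gp_predict`), the default hyperparameter values, and the Cholesky-based solve are illustrative assumptions; the slides do not say how the equations were implemented.

```python
import numpy as np

def sq_exp_kernel(A, B, sigma_f=1.0, length=1.0):
    """Squared exponential covariance between rows of A (m, d) and B (p, d)."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from round-off
    return sigma_f**2 * np.exp(-0.5 * d2 / length**2)

def gp_predict(X, y, X_star, sigma_f=1.0, length=1.0, sigma_n=0.1):
    """Predictive mean and variance of f_* at each row of X_star.

    Hyperparameter defaults are placeholders, not values from the slides.
    """
    K = sq_exp_kernel(X, X, sigma_f, length) + sigma_n**2 * np.eye(len(X))
    K_star = sq_exp_kernel(X, X_star, sigma_f, length)  # column j is k_* for test point j
    L = np.linalg.cholesky(K)                           # K + sigma_n^2 I = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y)) # [K + sigma_n^2 I]^{-1} y
    mu = K_star.T @ alpha
    v = np.linalg.solve(L, K_star)
    # k(x_*, x_*) equals sigma_f^2 for the squared exponential kernel
    var = sigma_f**2 - np.sum(v**2, axis=0)
    return mu, var
```

Far from every training sample the variance returned by `gp_predict` approaches the signal variance, while close to a training sample it shrinks toward the noise level.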

Gaussian Processes (GPs)

Example of predicted function

[Figure: example of predicted function. Legend: $\mu_*$: predicted value; $\sigma_*$: standard deviation of predicted value; markers: training samples.]

Proposed Strategies

[Diagram: active learning loop for GP regression. Blocks: L: Training set, GP Regression, Selection, U: Learning set, U's: Selected unlabeled samples, Labeling by Human expert, L's: Labeled samples, Insertion in training set.]
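The loop in the diagram can be sketched in a few lines of NumPy. Everything named here is hypothetical: `active_learning`, the `oracle` callable standing in for the human expert, the `score_fn` callable that would train the GP on L and score the samples of U, and the iteration and batch sizes. For brevity the usage example plugs in a random scorer, which presumably corresponds to the "Ran" baseline in the results tables.

```python
import numpy as np

def active_learning(X_L, y_L, X_U, oracle, score_fn, n_iter=5, batch=2):
    """Grow the training set L by moving the top-scoring samples out of U."""
    X_L, y_L, X_U = X_L.copy(), y_L.copy(), X_U.copy()
    for _ in range(n_iter):
        if len(X_U) == 0:
            break
        scores = score_fn(X_L, y_L, X_U)      # one score per unlabeled sample
        picked = np.argsort(scores)[-batch:]  # indices of the highest scores
        X_new = X_U[picked]
        y_new = oracle(X_new)                 # labeling step (human expert)
        X_L = np.vstack([X_L, X_new])         # insertion in training set
        y_L = np.concatenate([y_L, y_new])
        X_U = np.delete(X_U, picked, axis=0)  # remove selected samples from U
    return X_L, y_L, X_U

# Usage with synthetic data (all values below are made up for illustration)
rng = np.random.default_rng(0)
X_L0 = rng.uniform(0, 5, size=(3, 1))
X_U0 = rng.uniform(0, 5, size=(20, 1))
oracle = lambda X: np.sin(X[:, 0])                          # stand-in for the expert
random_score = lambda X_L, y_L, X_U: rng.random(len(X_U))   # random selection
X_L, y_L, X_U = active_learning(X_L0, oracle(X_L0), X_U0, oracle, random_score)
```

A strategy that selects minimum values, such as Cov, fits the same loop by returning the negated measure from `score_fn`.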

Proposed Strategies

Minimize covariance measure in feature space (Cov)

$$k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\left(-\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{2 l^2}\right)$$

$k(\mathbf{x}, \mathbf{x}')$: squared exponential covariance function

$\sigma_f^2$: signal variance

$l$: length-scale

[Figure legend: $\mathbf{x}'$: training sample; $k(\mathbf{x}, \mathbf{x}')$: covariance function with respect to training sample.]

$$f_{cov}(\mathbf{x}) = \sum_{i=1}^{n} k(\mathbf{x}, \mathbf{x}_i)$$

$f_{cov}$: covariance measure with respect to all training samples

$k(\mathbf{x}, \mathbf{x}_i)$: covariance function with respect to training sample $\mathbf{x}_i$

$$\mathbf{x}^* = \arg\min_{\mathbf{x} \in U} f_{cov}(\mathbf{x})$$

Selection of samples with minimum values of $f_{cov}$
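A minimal sketch of the Cov criterion, assuming the squared exponential covariance function defined on this slide. The names `sq_exp_kernel` and `select_min_cov`, the default hyperparameters, and the choice to pick the `n_select` lowest-scoring samples in a single pass are assumptions made for illustration.

```python
import numpy as np

def sq_exp_kernel(A, B, sigma_f=1.0, length=1.0):
    """k(x, x') = sigma_f^2 * exp(-||x - x'||^2 / (2 l^2)) for all row pairs."""
    diff = A[:, None, :] - B[None, :, :]
    return sigma_f**2 * np.exp(-0.5 * np.sum(diff**2, axis=2) / length**2)

def select_min_cov(X_train, X_unlabeled, n_select=1, **kernel_params):
    """Return indices of the unlabeled samples with the smallest f_cov."""
    # f_cov(x) = sum over training samples x_i of k(x, x_i)
    f_cov = sq_exp_kernel(X_unlabeled, X_train, **kernel_params).sum(axis=1)
    return np.argsort(f_cov)[:n_select], f_cov

# Usage with made-up 1-D data: the sample at 5.0 lies far from both training
# samples, so it has the smallest covariance measure and is selected first.
X_train = np.array([[0.0], [1.0]])
X_unlabeled = np.array([[0.5], [5.0], [1.2]])
idx, f_cov = select_min_cov(X_train, X_unlabeled, n_select=2)
```

Because the measure depends only on distances in feature space, it favors samples far from the current training set and requires no trained model.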

Proposed Strategies

Maximize variance of predicted value (Var)

$$\sigma^2(\mathbf{x}) = k(\mathbf{x}, \mathbf{x}) - \mathbf{k}^\top [K + \sigma_n^2 I]^{-1} \mathbf{k}$$

[Figure legend: $\mu(\mathbf{x})$: predicted value; $\sigma(\mathbf{x})$: standard deviation of predicted value; markers: training samples.]

$$f_{var}(\mathbf{x}) = \sigma^2(\mathbf{x})$$

$f_{var}$: variance

$$\mathbf{x}^* = \arg\max_{\mathbf{x} \in U} f_{var}(\mathbf{x})$$

Selection of samples with maximum values of $f_{var}$
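The Var criterion can be sketched in the same way, assuming a squared exponential covariance function. The helper names `predictive_variance` and `select_max_var` and the hyperparameter defaults are hypothetical, and a direct linear solve is used here for compactness rather than any particular method from the slides.

```python
import numpy as np

def predictive_variance(X_train, X_cand, sigma_f=1.0, length=1.0, sigma_n=0.1):
    """sigma^2(x) = k(x, x) - k^T [K + sigma_n^2 I]^{-1} k for each candidate x."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return sigma_f**2 * np.exp(-0.5 * d2 / length**2)

    K = k(X_train, X_train) + sigma_n**2 * np.eye(len(X_train))
    K_c = k(X_train, X_cand)                      # shape (n_train, n_cand)
    reduction = np.sum(K_c * np.linalg.solve(K, K_c), axis=0)
    return sigma_f**2 - reduction                 # k(x, x) = sigma_f^2 for this kernel

def select_max_var(X_train, X_cand, n_select=1, **params):
    """Return indices of the candidates with the largest predictive variance."""
    f_var = predictive_variance(X_train, X_cand, **params)
    return np.argsort(f_var)[::-1][:n_select], f_var

# Usage with made-up 1-D data: the candidate at 5.0 is far from the training
# samples, so its predictive variance stays close to sigma_f^2.
X_train = np.array([[0.0], [1.0]])
X_cand = np.array([[0.5], [5.0], [1.2]])
idx, f_var = select_max_var(X_train, X_cand)
```

The predictive variance does not depend on the target values, so candidates can be scored before any of them is labeled.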

Experimental Results

Data set description (MERIS)

Simulated acquisitions

Objective: estimation of chlorophyll concentration in subsurface case I + case II (open and coastal) waters

Sensor: MEdium Resolution Imaging Spectrometer (MERIS)

# channels: 8 (412-618 nm)

Range of chlorophyll concentration: 0.02-54 mg/m³


Data set description (SeaBAM)

Real acquisitions

Objective: estimation of chlorophyll concentration mostly in subsurface case I (open) waters

Sensor: Sea-viewing Wide Field-of-view Sensor (SeaWiFS)

# channels: 5 (412-555 nm)

Range of chlorophyll concentration: 0.02-32.79 mg/m³


Mean Squared Error

[Figure: Mean Squared Error; panels: MERIS, SeaBAM.]


Standard Deviation of Mean Squared Error

[Figure: Standard Deviation of Mean Squared Error; panels: MERIS, SeaBAM.]


Detailed results

MERIS: accuracies on 4000 test samples

Method    # training samples    MSE      σMSE     R2       σR2
Full      1000                  0.086    -        0.991    -
Initial   50                    1.638    0.869    0.849    0.070
Ran       150                   0.585    0.406    0.938    0.045
Cov       150                   0.378    0.105    0.961    0.010
Var       150                   0.184    0.054    0.980    0.005
Ran       300                   0.237    0.084    0.975    0.008
Cov       300                   0.212    0.177    0.977    0.018
Var       300                   0.095    0.005    0.990    0.000




Detailed results

SeaBAM

Method    # training samples    MSE      σMSE     R2       σR2
Ran       160                   2.972    1.038    0.682    0.069
Cov       160                   2.210    0.074    0.745    0.007
Var       160                   1.818    0.029    0.784    0.003
Ran       310                   2.062    0.687    0.753    0.066
Cov       310                   1.601    0.010    0.800    0.001
Var       310                   1.573    0.003    0.803    0.000
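For readers who want to reproduce columns of this kind, here is a minimal sketch of how MSE and R2 could be computed on a test set and then summarized over repeated trials. The slides do not define R2, so the coefficient of determination is assumed here; the function names and the toy numbers are illustrative only and are not taken from the experiments.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error over the test samples."""
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """Coefficient of determination (assumed definition of R2)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def summarize(values):
    """Mean and standard deviation of a metric over repeated trials."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std())

# Toy example with made-up targets and predictions
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
toy_mse = mse(y_true, y_pred)
toy_r2 = r2(y_true, y_pred)
```

In this reading, the σMSE and σR2 columns would correspond to the second value returned by `summarize` when it is applied to the per-trial metrics.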



Conclusions

In this work, GP-based active learning strategies for regression problems are proposed

Encouraging performances in terms of convergence speed and stability

Future developments: extension to other regression approaches
