Modern Machine Learning Regression Methods



DESCRIPTION

AACIMP 2009 Summer School lecture by Igor Tetko, "Environmental Chemoinformatics" course.

TRANSCRIPT

Page 1: Modern Machine Learning Regression Methods

Modern Machine Learning Regression Methods*

Igor & Igor**

*Based on our lecture at the Strasbourg Chemoinformatics School
**Igor Baskin, MSU


Page 2: Modern Machine Learning Regression Methods

The Goal of Learning

The goal of statistical learning (in chemoinformatics) consists in finding a function f relating the value of some property y (which can be a physicochemical property, biological activity, etc.) to the values of descriptors x_1, ..., x_M (which can describe chemical compounds, reactions, etc.):

y = f(x_1, ..., x_M)

Continuous y are handled by regression analysis, function approximation, etc.

Discrete y are handled by discriminant analysis, classification, pattern recognition, etc.


Page 3: Modern Machine Learning Regression Methods

The Goal of Learning

So, the goal of statistical learning is to find such a function class F and such parameters c_1, ..., c_P that would minimize the risk function and therefore provide a model with the highest predictive ability:

R(c_1, ..., c_P) → min

In classical statistics it is assumed that minimizing the empirical risk also minimizes the expected risk:

R_emp(c_1, ..., c_P) → min  ⇒  R(c_1, ..., c_P) → min

Is this correct? It is almost correct for big data sets and may not be correct for small data sets.


Page 4: Modern Machine Learning Regression Methods

Incorrectness of Empirical Risk Minimization

Data approximation with polynomials of different order:

Pol(x, p) = c_0 + Σ_{i=1..p} c_i x^i

Minimization of the empirical risk function does not guarantee the best predictive performance of the model.
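As an illustration, here is a minimal NumPy sketch (the toy data set and polynomial orders are illustrative assumptions, not taken from the slide): the training error keeps falling as the order grows, while the test error eventually rises.

```python
import numpy as np

rng = np.random.default_rng(0)                        # hypothetical toy data set
x_train = np.linspace(-1, 1, 10)
y_train = np.sin(np.pi * x_train) + 0.1 * rng.normal(size=x_train.size)
x_test = np.linspace(-1, 1, 200)
y_test = np.sin(np.pi * x_test)

for p in (1, 3, 9):                                   # polynomial order p
    c = np.polyfit(x_train, y_train, deg=p)           # least squares = empirical risk minimization
    rmse_train = np.sqrt(np.mean((np.polyval(c, x_train) - y_train) ** 2))
    rmse_test = np.sqrt(np.mean((np.polyval(c, x_test) - y_test) ** 2))
    print(f"order {p}: train RMSE {rmse_train:.3f}, test RMSE {rmse_test:.3f}")
```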


Page 5: Modern Machine Learning Regression Methods

Occam’s Razor

Entia non sunt multiplicanda praeter necessitatem

Entities should not be multiplied beyond necessity.

All things being equal, the simplest solution tends to be the best one.

William of Ockham (c. 1285–1349)


Page 6: Modern Machine Learning Regression Methods

Machine Learning Regression Methods

• Multiple Linear Regression (MLR)
• Kernel version of this method
• K Nearest Neighbours (kNN)
• Back-Propagation Neural Network (BPNN)
• Associative Neural Network (ASNN)


Page 7: Modern Machine Learning Regression Methods

Multiple Linear Regression

y = c_0 + c_1 x_1 + ... + c_M x_M

In matrix form Y = XC, with the least-squares solution

C = (X^T X)^(-1) X^T Y

where C = (c_0, c_1, ..., c_M)^T are the regression coefficients, Y = (y_1, ..., y_N)^T are the experimental property values, and X is the N × (M+1) matrix of descriptor values (with a column of ones for the intercept).

M < N !!!

Topliss: M < N/5 for good models
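A minimal NumPy sketch of the normal-equations solution above (the helper name mlr_fit and the data shapes are illustrative assumptions, not part of the slide):

```python
import numpy as np

def mlr_fit(X, y):
    """Multiple linear regression: C = (X^T X)^-1 X^T Y.
    X is the N x (M+1) descriptor matrix with a leading column of ones."""
    return np.linalg.solve(X.T @ X, X.T @ y)   # solving the system avoids an explicit inverse

# hypothetical example: N = 20 compounds, M = 3 descriptors
rng = np.random.default_rng(1)
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 3))])
y = X @ np.array([0.5, 1.0, -2.0, 0.3]) + 0.05 * rng.normal(size=20)
C = mlr_fit(X, y)                              # regression coefficients c0..c3
y_pred = X @ C                                 # fitted property values
```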


Page 8: Modern Machine Learning Regression Methods

Mystery of the “Rule of 5”

John G. Topliss

J.G. Topliss & R.J. Costello, J. Med. Chem., 1972, Vol. 15, No. 10, pp. 1066-1068.


Page 9: Modern Machine Learning Regression Methods

Mystery of the “Rule of 5”

John G. Topliss

J.G. Topliss & R.J. Costello, J. Med. Chem., 1972, Vol. 15, No. 10, pp. 1066-1068.

For example, if R² = 0.40 is regarded as the maximum acceptable level of chance correlation, then the minimum number of observations required to test five variables is about 30, for 10 variables 50 observations, for 20 variables 65 observations, and for 30 variables 85 observations.


Page 10: Modern Machine Learning Regression Methods

Mystery of the “Rule of 5”

C. Hansch, K.W. Kim, R.H. Sarma, J. Am. Chem. Soc., 1973, Vol. 95, No. 19, pp. 6447-6449.

Topliss and Costello have pointed out the danger of finding meaningless chance correlations with three or four data points per variable.

The correlation coefficient is good and there are almost five data points per variable.

Topliss & Hansch: M < N/5 for good models
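A trivial helper encoding this rule of thumb (the function name and the fixed ratio of 5 simply restate the slide's heuristic; it is not a statistical guarantee):

```python
def max_descriptors(n_observations, ratio=5):
    """Topliss/Hansch rule of thumb from the slide: keep M < N/5."""
    return n_observations // ratio

print(max_descriptors(100))   # at most 20 descriptors for 100 compounds
```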


Page 11: Modern Machine Learning Regression Methods

Kernel ridge regression

• Regression + regularization, in feature space:

  min_w Σ_i (y_i − ⟨w, x_i⟩)² + λ‖w‖²,  and with a feature map φ:  min_w Σ_i (y_i − ⟨w, φ(x_i)⟩)² + λ‖w‖²

• Avoids large, canceling coefficients
• Bias unnecessary if samples and labels are centered (in feature space)
• Solution in the convex hull of samples: α = (K + λI_n)^(-1) y, where K is the kernel matrix and I_n the identity matrix
• Predict new samples x as f(x) = Σ_i α_i k(x_i, x)
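A compact NumPy sketch of kernel ridge regression along these lines (the Gaussian kernel and the parameter values are illustrative choices; the slide does not fix a kernel):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def krr_fit(X, y, lam=1e-2, gamma=1.0):
    """Dual coefficients: alpha = (K + lam * I_n)^-1 y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_new, X_train, alpha, gamma=1.0):
    """Prediction f(x) = sum_i alpha_i k(x_i, x)."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```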


Page 12: Modern Machine Learning Regression Methods

K Nearest Neighbors

Distances:

  Euclidean:  D_ij = sqrt( Σ_{k=1..M} (x_ik − x_jk)² )
  Manhattan:  D_ij = Σ_{k=1..M} |x_ik − x_jk|

Prediction from the k nearest neighbours:

  Non-weighted:  y_i^pred = (1/k) Σ_{j ∈ neighbours} y_j
  Weighted:      y_i^pred = Σ_{j ∈ neighbours} (1/D_ij) y_j / Σ_{j ∈ neighbours} (1/D_ij)
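Both variants fit in a few lines of NumPy (a sketch; the small constant guarding against zero distances is an assumption):

```python
import numpy as np

def knn_predict(x, X_train, y_train, k=3, weighted=False):
    """kNN regression with Euclidean distances, as defined above."""
    d = np.sqrt(((X_train - x) ** 2).sum(axis=1))   # D_ij for all training cases
    idx = np.argsort(d)[:k]                         # indices of the k nearest neighbours
    if not weighted:
        return y_train[idx].mean()                  # non-weighted average
    w = 1.0 / (d[idx] + 1e-12)                      # inverse-distance weights
    return (w * y_train[idx]).sum() / w.sum()       # weighted average
```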


Page 13: Modern Machine Learning Regression Methods

Overfitting by Variable Selection in kNN

Golbraikh A., Tropsha A. Beware of q²! J. Mol. Graph. Model., 2002, 20, 269-276.


Page 14: Modern Machine Learning Regression Methods

Artificial Neuron

Inputs o_1, ..., o_5 arrive through weights w_i1, ..., w_i5.

  net_i = Σ_j w_ji o_j − t_i
  o_i = f(net_i)

Transfer function f, e.g. the sigmoid

  f(x) = 1 / (1 + e^(−x))

or the threshold (step) function

  f(x) = 1 if x ≥ 0,  f(x) = 0 if x < 0
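A one-neuron sketch of these formulas (NumPy; the example weights and inputs are arbitrary):

```python
import numpy as np

def sigmoid(x):
    """Logistic transfer function f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + np.exp(-x))

def neuron_output(o, w, t):
    """o: outputs of the preceding neurons, w: weights w_ji, t: threshold t_i."""
    net = np.dot(w, o) - t        # net_i = sum_j w_ji * o_j - t_i
    return sigmoid(net)           # o_i = f(net_i)

print(neuron_output(np.array([0.2, 0.7, 1.0]), np.array([0.5, -0.3, 0.8]), 0.1))
```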


Page 15: Modern Machine Learning Regression Methods

Multilayer Neural Network

Input Layer → Hidden Layer → Output Layer

Neurons in the input layer correspond to descriptors, neurons in the output layer to the properties being predicted, and neurons in the hidden layer to nonlinear latent variables.


Page 16: Modern Machine Learning Regression Methods

Generalized Delta Rule

This is the application of the steepest-descent method to training backpropagation neural networks:

  Δw_ij = −η ∂R_emp/∂w_ij = η δ_j y_i,  with δ_j = f'(net_j) e_j

η – learning rate constant

1974: Paul Werbos
1986: David Rumelhart, James McClelland, Geoffrey Hinton
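A minimal sketch of one steepest-descent update for a single linear output neuron (a simplification; the full backpropagation of δ through hidden layers is omitted here):

```python
import numpy as np

def delta_rule_step(w, o, y_true, eta=0.1):
    """One gradient step w_ij <- w_ij - eta * dR_emp/dw_ij
    for a linear output neuron and squared error."""
    y_pred = np.dot(w, o)          # neuron output for this case
    e = y_pred - y_true            # error signal
    grad = e * o                   # dR_emp/dw_ij = e * o_i (up to a constant factor)
    return w - eta * grad
```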


Page 17: Modern Machine Learning Regression Methods

Multilayer Neural Network

Input Layer → Hidden Layer → Output Layer

The number of weights corresponds to the number of adjustable parameters of the method.


Page 18: Modern Machine Learning Regression Methods

Origin of the “Rule of 2”

The number of weights (adjustable parameters) for the case of one hidden layer:

  W = (I+1)H + (H+1)O

Parameter ρ = N/W; the recommended range was 1.8 ≤ ρ ≤ 2.2.
T.A. Andrea and H. Kalayeh, J. Med. Chem., 1991, 34, 2824-2836.

End of the “Rule of 2”

I.V. Tetko, D.J. Livingstone, A.I. Luik, J. Chem. Inf. Comput. Sci., 1995, 35, 826-833.
Baskin I.I. et al., Foundations Comput. Decision Sci., 1997, Vol. 22, No. 2, pp. 107-116.
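The weight count is easy to verify in code (the 10-5-1 network and the training-set size of 120 are hypothetical examples):

```python
def n_weights(n_inputs, n_hidden, n_outputs):
    """W = (I+1)*H + (H+1)*O for one hidden layer (bias terms counted as weights)."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs

W = n_weights(10, 5, 1)   # 61 adjustable parameters
rho = 120 / W             # rho = N/W for a hypothetical training set of 120 compounds
print(W, round(rho, 2))
```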


Page 19: Modern Machine Learning Regression Methods

Overtraining and Early Stopping

[Plot: error curves for the training set, test set 1, and test set 2 during training; the minimum of the test-set error marks the point for early stopping.]
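In code, early stopping is just a loop that monitors the test-set error (a generic sketch; train_step and validate are assumed user-supplied callables, not a fixed API):

```python
import copy

def train_with_early_stopping(model, train_step, validate, max_epochs=1000, patience=20):
    """Stop when the monitored error has not improved for `patience` epochs
    and return the best model seen so far (the point for early stopping)."""
    best_err, best_epoch, best_model = float("inf"), 0, copy.deepcopy(model)
    for epoch in range(max_epochs):
        train_step(model)               # one pass over the training set
        err = validate(model)           # error on test set 1
        if err < best_err:
            best_err, best_epoch, best_model = err, epoch, copy.deepcopy(model)
        elif epoch - best_epoch >= patience:
            break
    return best_model, best_err
```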


Page 20: Modern Machine Learning Regression Methods

Associative Neural Network (ASNN)

A prediction for case i: x_i → ANN ensemble of M networks → z_i = (z_1i, ..., z_ki, ..., z_Mi)

Ensemble approach (simple average):

  z̄_i = (1/M) Σ_{k=1..M} z_ki

ASNN bias correction:

  z'_i = z̄_i + (1/k) Σ_{j ∈ N_k(x_i)} (y_j − z̄_j)

Pearson's (or Spearman's) correlation coefficient r_ij = R(z_i, z_j) > 0 in the space of residuals is used to detect the neighbours.

The correction of the neural network ensemble value is performed using the errors (biases) calculated for the neighbour cases of the analyzed case x_i detected in the space of neural network models.
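A rough sketch of this correction step (assumptions: Z holds the M ensemble predictions for each training case, z_query the M predictions for the new case, and neighbours are picked by Spearman correlation in model space, as the slide describes; the helper name asnn_correct is hypothetical):

```python
import numpy as np
from scipy.stats import spearmanr

def asnn_correct(Z, y_train, z_query, k=5):
    """Correct the ensemble average for a query case by the mean residual
    of its k nearest neighbours in the space of ensemble models."""
    z_bar = Z.mean(axis=1)                                   # simple ensemble average, training cases
    z_q = z_query.mean()                                     # simple ensemble average, query case
    corr = np.array([spearmanr(z_query, Z[i])[0] for i in range(len(Z))])
    nn = np.argsort(-corr)[:k]                               # k most correlated training cases
    bias = np.mean(y_train[nn] - z_bar[nn])                  # their average error (bias)
    return z_q + bias                                        # bias-corrected prediction
```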


Page 21: Modern Machine Learning Regression Methods

Correction of a model by the nearest neighbors


Page 22: Modern Machine Learning Regression Methods

[Figure: panels A, B (y vs. x, x ∈ [−3, 3], y ∈ [0, 1]) and C (x1 vs. x2) illustrating nearest neighbours for the Gauss function.]

Detection of nearest neighbors in the space of models uses invariants in "structure–property" space.


Page 23: Modern Machine Learning Regression Methods

ASNN local correction strongly improves prediction of extreme data

Tetko, Methods Mol. Biol., 2008, 458, 826-833.


Page 24: Modern Machine Learning Regression Methods

Prediction of similar properties

[Plot: training "LogP" data and the "LogD" curve, x ∈ [0, 3], y ∈ [0, 1].]


Page 25: Modern Machine Learning Regression Methods

10 new points are measured

[Plot: training "LogP" data and the "LogD" curve with the newly measured points, x ∈ [0, 3], y ∈ [0, 1].]


Page 26: Modern Machine Learning Regression Methods

Library mode (no retraining)

[Plot: training "LogP" data, the "LogD" target, and the library-mode "LogD" prediction, x ∈ [0, 3], y ∈ [0, 1].]


Page 27: Modern Machine Learning Regression Methods

Library mode vs. development of a new model using 10 cases

[Plot: the target function "LogD", the library-mode "LogD" prediction, and retraining from scratch, x ∈ [0, 3], y ∈ [0, 1].]


Page 28: Modern Machine Learning Regression Methods

How to validate a model?

• Training/validation set partition
  – How to select the validation set?
  – How many molecules should be in it?
• N-fold cross-validation


Page 29: Modern Machine Learning Regression Methods

What is n-fold cross-validation?

White & blue rectangles are training & validation sets, respectively.

n=5
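A minimal NumPy generator for such a split (the shuffling seed is an assumption; n = 5 as in the figure):

```python
import numpy as np

def n_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (training, validation) index arrays; each fold serves as the
    validation set ('blue') exactly once, the rest is the training set ('white')."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, n_folds)
    for i in range(n_folds):
        valid = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train, valid

for train_idx, valid_idx in n_fold_splits(20, n_folds=5):
    pass  # fit the model on train_idx, evaluate it on valid_idx
```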


Page 30: Modern Machine Learning Regression Methods


Page 31: Modern Machine Learning Regression Methods

Conclusions

• There are many machine learning methods
• Different problems may require different methods
• All methods can be prone to overfitting
• But all of them have facilities to tackle this problem
• To correctly evaluate the performance of your model, use a validation set or, better, n-fold cross-validation (a built-in option)


Page 32: Modern Machine Learning Regression Methods

Exam. Question 1

What is it?

1. Support Vector Regression
2. Backpropagation Neural Network
3. Partial Least Squares Regression


Page 33: Modern Machine Learning Regression Methods

Exam. Question 2

Which method is not prone to overfitting?

1. Multiple Linear Regression
2. Partial Least Squares
3. Support Vector Regression
4. Backpropagation Neural Networks
5. K Nearest Neighbours

None of them!


Page 34: Modern Machine Learning Regression Methods

Thank you for your attention!
