Post on 12-Jan-2016

Integration II

Prediction

Kernel-based data integration

• SVMs and the kernel “trick”
• Multiple-kernel learning
• Applications
– Protein function prediction
– Clinical prognosis

SVMs

These are expression measurements from two genes for two populations (cancer types)

The goal is to define a cancer type classifier...

[Noble, Nat. Biotechnology, 2006]

One type of classifier is a “hyper-plane” that separates measurements from two cancer types

E.g.: a one-dimensional hyper-plane

E.g.: a two-dimensional hyper-plane

[Noble, Nat. Biotechnology, 2006]
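As a minimal sketch of what such a classifier computes: a hyperplane can be written as the set of points where w·x + b = 0, and a sample is assigned to one cancer type or the other according to the sign of w·x + b. The names `w`, `b`, and `hyperplane_predict`, and the numbers below, are illustrative assumptions and do not come from the slides.

```python
def hyperplane_predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane w.x + b = 0 the sample x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical two-gene example: the separating line is x1 + x2 = 10.
w, b = [1.0, 1.0], -10.0
samples = [[2.0, 3.0], [8.0, 9.0]]  # made-up expression values for two samples
labels = [hyperplane_predict(w, b, x) for x in samples]
print(labels)
```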

SVMs

Suppose that measurements are separable: there exists a hyperplane that separates two types

Then there are an infinite number of separating hyperplanes

Which to use?

The maximum-margin hyperplane

Equivalently: minimizer of

[Noble, Nat. Biotechnology, 2006]
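The objective that followed "minimizer of" is not preserved in this transcript. For reference, the standard hard-margin formulation is sketched below. The notation is an assumption rather than the slide's own: $w$ and $b$ define the hyperplane, $x_i$ is the measurement vector of sample $i$, and $y_i \in \{-1, +1\}$ is its class label.

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_i \,(w^\top x_i + b) \ge 1, \qquad i = 1, \dots, n
```

Under these constraints the margin width is $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert^2$ maximizes the margin.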

SVMs

Which hyper-plane to use?

In reality: the minimizer of a trade-off between
1. classification error (the loss term), and
2. margin size (the penalty term)
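The trade-off can be made concrete with a small sketch that evaluates a soft-margin objective for a given hyperplane. It assumes the usual hinge loss for the classification-error term and 0.5·||w||² for the margin term. The trade-off parameter `C`, the function names, and the data are illustrative assumptions, since the slide's formula is not preserved.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def soft_margin_objective(w, b, X, y, C=1.0):
    """C * (sum of hinge losses) + 0.5 * ||w||^2 for the hyperplane (w, b)."""
    # Loss: zero for samples beyond the margin, growing linearly for violations.
    loss = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    # Penalty: a small ||w|| corresponds to a wide margin.
    penalty = 0.5 * dot(w, w)
    return C * loss + penalty

# Made-up data: the third sample lies on the wrong side of the hyperplane.
X = [[0.0, 0.0], [2.0, 2.0], [1.0, 1.5]]
y = [-1, 1, -1]
value = soft_margin_objective([1.0, 1.0], -2.0, X, y, C=1.0)
print(value)
```

A larger `C` puts more weight on classification error; a smaller `C` favors a wider margin.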

SVMs

This is the primal problem

This is the dual problem
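The two formulas these slides point to are missing from the transcript. The textbook soft-margin pair is sketched below as a reference. The notation is assumed: $\xi_i$ are slack variables, $C$ is the trade-off parameter, $\alpha_i$ are the dual variables, and $K_{ij} = x_i^\top x_j$.

```latex
\text{Primal:}\quad
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0

\text{Dual:}\quad
\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
 - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, K_{ij}
\quad \text{s.t.} \quad
0 \le \alpha_i \le C,\ \ \sum_{i=1}^{n} \alpha_i y_i = 0
```

In the dual, the samples appear only through the entries $K_{ij}$.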

SVMs

What is K?

The kernel matrix:
• each entry is a sample inner product
• one interpretation: sample similarity
• the measurements are completely described by K
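A minimal sketch of building such a matrix from measurement vectors, using plain inner products. The helper names and the data are illustrative assumptions; plain Python is used to keep the example dependency-free.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def kernel_matrix(X, kernel=dot):
    """Entry (i, j) is the kernel value (here an inner product) between samples i and j."""
    return [[kernel(xi, xj) for xj in X] for xi in X]

# Three made-up samples, two measurements each.
X = [[1.0, 2.0], [3.0, 0.0], [0.0, 1.0]]
K = kernel_matrix(X)
for row in K:
    print(row)
```

The result is symmetric, and its diagonal holds each sample's squared norm.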

SVMs

Implication: non-linearity is obtained by appropriately defining the kernel matrix K

E.g. quadratic kernel:
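The slide's formula is not preserved. One common quadratic kernel is the homogeneous form k(x, z) = (x·z)², and the sketch below assumes that form. It checks numerically that the kernel value equals an ordinary inner product after an explicit quadratic feature map, which is the essence of the kernel "trick". The function names and vectors are illustrative.

```python
import math

def quadratic_kernel(x, z):
    """Homogeneous quadratic kernel: the squared inner product."""
    return sum(a * b for a, b in zip(x, z)) ** 2

def quadratic_features(x):
    """Explicit feature map for 2-D inputs whose inner product reproduces the kernel."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]

x, z = [1.0, 2.0], [3.0, 4.0]
k_implicit = quadratic_kernel(x, z)
k_explicit = sum(a * b for a, b in zip(quadratic_features(x), quadratic_features(z)))
print(k_implicit, k_explicit)
```

The kernel is evaluated without ever constructing the quadratic features, yet a linear classifier in that feature space is non-linear in the original measurements.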

SVMs

Another implication: no need for measurement vectors; all that is required is similarity between samples

E.g. string kernels
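As one simple instance of a string kernel, the sketch below implements a k-mer (spectrum-style) kernel: two sequences are similar when they share many length-k substrings. The slides do not say which string kernel they use, so this choice, the function names, and the toy sequences are assumptions.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every length-k substring of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(s, t, k=2):
    """Inner product of the k-mer count vectors of s and t."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    return sum(count * ct[kmer] for kmer, count in cs.items() if kmer in ct)

# Toy amino-acid strings (not real proteins).
s, t = "MKVLA", "KVLAG"
print(spectrum_kernel(s, t, k=2))
```

No fixed-length measurement vector is ever built for a sequence; the classifier only needs these pairwise similarity values.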

Protein Structure Prediction

Protein structure

Protein sequence

Sequence similarity

Kernel-based data fusion

Core idea: use different kernels for different genomic data sources

A linear combination of kernel matrices is a kernel (under certain conditions, e.g. when each matrix is itself a valid kernel and the weights are non-negative)
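A small sketch of the combination step, assuming the usual sufficient condition that every input matrix is positive semi-definite and every weight is non-negative. The matrices, weights, and function names are hypothetical, and the random quadratic-form test is only a spot check, not a proof of positive semi-definiteness.

```python
import random

def combine_kernels(kernels, weights):
    """Weighted sum of equally sized kernel matrices; rejects negative weights."""
    if len(kernels) != len(weights):
        raise ValueError("need one weight per kernel matrix")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep the sum a valid kernel")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]

def quadratic_form(K, v):
    """v^T K v, which is never negative for a positive semi-definite K."""
    n = len(v)
    return sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))

# Hypothetical kernel matrices for two data sources measured on the same two samples.
K_expr = [[2.0, 1.0], [1.0, 2.0]]
K_seq = [[1.0, 0.5], [0.5, 1.0]]
K = combine_kernels([K_expr, K_seq], [0.7, 0.3])

random.seed(0)
checks = [quadratic_form(K, [random.uniform(-1, 1) for _ in range(2)]) for _ in range(100)]
print(min(checks) >= 0.0)
```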

Kernel-based data fusion

Kernel to use in prediction:
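The formula on this slide is not preserved. In kernel-based data fusion the combined kernel typically takes the following form. The symbols are assumed notation: $K_i$ is the kernel matrix of data source $i$, $\mu_i$ its weight, and $m$ the number of sources.

```latex
K = \sum_{i=1}^{m} \mu_i K_i, \qquad \mu_i \ge 0
```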

Kernel-based data fusion

In general, the task is to estimate the SVM function along with the coefficients of the kernel matrix combination

This is a well-studied type of optimization problem (a semi-definite program)
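The semi-definite program itself is beyond a short example. The sketch below is a deliberately simplified stand-in, not the method described on the slide: it grid-searches a single mixing weight `mu` between two kernel matrices, trains a kernel perceptron (swapped in for the SVM to stay dependency-free), and keeps the weight with the best held-out accuracy. All names and data are hypothetical.

```python
def combine(K1, K2, mu):
    """Convex combination mu * K1 + (1 - mu) * K2."""
    n = len(K1)
    return [[mu * K1[i][j] + (1.0 - mu) * K2[i][j] for j in range(n)] for i in range(n)]

def train_kernel_perceptron(K, y, train_idx, epochs=20):
    """Return per-sample mistake counts (alpha) learned on the training indices."""
    alpha = {i: 0.0 for i in train_idx}
    for _ in range(epochs):
        mistakes = 0
        for i in train_idx:
            score = sum(alpha[j] * y[j] * K[j][i] for j in train_idx)
            if y[i] * score <= 0:
                alpha[i] += 1.0
                mistakes += 1
        if mistakes == 0:
            break
    return alpha

def predict(K, y, alpha, i):
    score = sum(a * y[j] * K[j][i] for j, a in alpha.items())
    return 1 if score >= 0 else -1

def select_weight(K1, K2, y, train_idx, val_idx, grid):
    """Pick the mixing weight with the highest held-out accuracy (first one wins ties)."""
    best_mu, best_acc = None, -1.0
    for mu in grid:
        K = combine(K1, K2, mu)
        alpha = train_kernel_perceptron(K, y, train_idx)
        acc = sum(predict(K, y, alpha, i) == y[i] for i in val_idx) / len(val_idx)
        if acc > best_acc:
            best_mu, best_acc = mu, acc
    return best_mu, best_acc

# Made-up data: source "a" carries the class signal, source "b" does not.
y = [1, 1, 1, -1, -1, -1]
a = [2.0, 1.5, 1.0, -1.0, -1.5, -2.0]
b = [1.0, -1.0, 1.0, 1.0, -1.0, 1.0]
K_a = [[ai * aj for aj in a] for ai in a]
K_b = [[bi * bj for bj in b] for bi in b]

best_mu, best_acc = select_weight(K_a, K_b, y, [0, 1, 3, 4], [2, 5], [0.0, 0.5, 1.0])
print(best_mu, best_acc)
```

A real multiple-kernel learner optimizes the weights and the classifier jointly rather than by grid search, and handles more than two kernels.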

Kernel-based data fusion

Same idea applied to cancer classification from expression and proteomic data

Kernel-based data fusion

• Prostate cancer dataset
– 55 samples
– Expression from microarray
– Copy number variants

• Outcomes predicted:
– Grade, stage, metastasis, recurrence
