Integration II

Prediction

Kernel-based data integration

• SVMs and the kernel “trick”
• Multiple-kernel learning
• Applications
  – Protein function prediction
  – Clinical prognosis

These are expression measurements from two genes for two populations (cancer types)

The goal is to define a cancer type classifier...

[Noble, Nat. Biotechnology, 2006]

One type of classifier is a “hyper-plane” that separates measurements from two cancer types

E.g.: a one-dimensional hyper-plane

E.g.: a two-dimensional hyper-plane

Suppose that measurements are separable: there exists a hyperplane that separates the two types

Then there are an infinite number of separating hyperplanes

Which to use?

The maximum-margin hyperplane
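
To make the margin criterion concrete, here is a minimal sketch that compares the geometric margin of two candidate separating lines. The points, labels, and both hyperplanes are made-up toy values, not data from the Noble figure.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    A positive value means every point lies on the correct side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    distances = [
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    ]
    return min(distances)

# Toy, linearly separable data (hypothetical): two genes, two classes.
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (5.0, 4.0)]
labels = [-1, -1, +1, +1]

# Two hypothetical separating lines; both classify every point correctly.
margin_a = geometric_margin((1.0, 1.0), -5.5, points, labels)
margin_b = geometric_margin((1.0, 0.0), -2.5, points, labels)
```

Both lines separate the classes, but `margin_a` is larger than `margin_b`, so under the maximum-margin criterion the first line is preferred.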

Equivalently: minimizer of
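
In the standard hard-margin formulation, the maximum-margin hyperplane solves the problem below. The notation is assumed here rather than taken from the slides: samples $x_i$, labels $y_i \in \{-1, +1\}$, weight vector $w$, offset $b$.

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_i\,(w^\top x_i + b) \ge 1, \qquad i = 1, \dots, n
```

Under these constraints the margin width is $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes the margin.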

Which hyper-plane to use?

In reality: minimizer of a trade-off between
1. classification error (the loss), and
2. margin size (the penalty)
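
A minimal sketch of this trade-off, assuming the hinge loss as the classification-error term and the squared norm of `w` as the margin penalty (the usual soft-margin SVM choice). The constant `C`, the function names, and the toy data are assumptions for illustration.

```python
def hinge_loss(w, b, x, y):
    """Zero when x is on the correct side with margin >= 1, linear otherwise."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

def svm_objective(w, b, points, labels, C=1.0):
    """Soft-margin objective: C * total hinge loss + 0.5 * ||w||^2."""
    loss = sum(hinge_loss(w, b, x, y) for x, y in zip(points, labels))
    penalty = 0.5 * sum(wi * wi for wi in w)
    return C * loss + penalty

# Toy data (hypothetical); the last point sits on the wrong side of the line.
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (3.0, 1.5)]
labels = [-1, -1, +1, +1]
w, b = (1.0, 1.0), -5.5
objective = svm_objective(w, b, points, labels, C=1.0)
```

A larger `C` puts more weight on classification error; a smaller `C` favors a wider margin.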

This is the primal problem

This is the dual problem
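
For reference, one common way to write the two problems is the soft-margin form below. The notation is assumed and may differ from the slides: $\xi_i$ are slack variables, $C$ is the trade-off constant, $\alpha_i$ are dual variables, and $K_{ij} = x_i^\top x_j$.

```latex
% Primal
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i\,(w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0

% Dual
\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, K_{ij}
\quad \text{s.t.} \quad
0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0
```

In the dual, the samples enter only through the entries $K_{ij}$.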

What is K?

The kernel matrix:
• each entry is a sample inner product
• one interpretation: sample similarity
• measurements are completely described by K
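
As a small illustration, the sketch below builds a linear kernel matrix from a few measurement vectors. The sample values and the function name are hypothetical.

```python
def linear_kernel_matrix(samples):
    """K[i][j] is the inner product of sample i and sample j."""
    return [
        [sum(a * b for a, b in zip(xi, xj)) for xj in samples]
        for xi in samples
    ]

# Hypothetical expression values: three samples, two genes each.
samples = [(1.0, 2.0), (2.0, 0.0), (0.0, 3.0)]
K = linear_kernel_matrix(samples)
```

The matrix is symmetric, and its size depends only on the number of samples, not on the number of measurements per sample.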

Implication:
Non-linearity is obtained by appropriately defining the kernel matrix K

E.g. quadratic kernel:
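
One common definition is the homogeneous quadratic kernel, the squared inner product; the slides may use a different variant (for instance, with an added constant). The sketch below checks, for 2-D inputs, that this kernel equals an ordinary inner product in an explicit quadratic feature space. The function names are illustrative.

```python
import math

def quadratic_kernel(x, z):
    """Homogeneous quadratic kernel: the squared inner product."""
    return sum(a * b for a, b in zip(x, z)) ** 2

def quadratic_features(x):
    """Explicit feature map for 2-D input whose inner product equals the kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, -1.0)
via_kernel = quadratic_kernel(x, z)
via_features = sum(
    a * b for a, b in zip(quadratic_features(x), quadratic_features(z))
)
```

The two values agree up to floating-point rounding, which is the kernel “trick”: the non-linear feature space is never constructed when the kernel is used directly.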

Another implication:
No need for measurement vectors; all that is required is similarity between samples

E.g. string kernels
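
A minimal sketch of one simple string kernel, the k-mer spectrum kernel, which scores two sequences by their shared substrings of length k. The slides do not say which string kernel is meant, so this choice, the sequences, and the function names are assumptions.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count every length-k substring of the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def spectrum_kernel(s, t, k=3):
    """Inner product of the two k-mer count vectors."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    # Counter returns 0 for k-mers absent from t, so only shared k-mers contribute.
    return sum(count * ct[kmer] for kmer, count in cs.items())

# Hypothetical short amino-acid strings.
similarity = spectrum_kernel("MKVLAAG", "KVLAAGM", k=3)
```

No fixed-length measurement vector is ever built for either sequence; the kernel value alone is what the SVM needs.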

Protein Structure Prediction

Protein structure

Protein sequence

Sequence similarity

Kernel-based data fusion

Core idea: use different kernels for different genomic data sources. A linear combination of kernel matrices is itself a kernel under certain conditions (for example, when all weights are non-negative).

Kernel to use in prediction:
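
A minimal sketch of such a combination, assuming a weighted sum of per-source kernel matrices with non-negative weights. The matrices, the weights, and the function name are hypothetical; in multiple-kernel learning the weights are estimated rather than fixed by hand.

```python
def combine_kernels(kernels, weights):
    """Weighted sum of equally sized kernel matrices; weights must be non-negative."""
    if any(wt < 0 for wt in weights):
        raise ValueError("negative weights can break positive semi-definiteness")
    n = len(kernels[0])
    return [
        [sum(wt * K[i][j] for K, wt in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical 2x2 kernel matrices from two data sources over the same samples.
K_expression = [[1.0, 0.5], [0.5, 1.0]]
K_sequence = [[2.0, 0.0], [0.0, 2.0]]
K_combined = combine_kernels([K_expression, K_sequence], weights=[0.75, 0.25])
```

All kernel matrices must index the same samples in the same order, since entries are added position by position.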

In general, the task is to estimate the SVM function along with the coefficients of the kernel matrix combination

This is a type of well-studied optimization problem (semi-definite program)

Same idea applied to cancer classification from expression and proteomic data

• Prostate cancer dataset
  – 55 samples
  – Expression from microarray
  – Copy number variants

• Outcomes predicted:
  – Grade, stage, metastasis, recurrence
