
Advanced Introduction to Machine Learning

10715, Fall 2014

Structured Sparsity, with application in Computational Genomics

Eric Xing

Lecture 3, September 15, 2014

Reading:

© Eric Xing @ CMU, 2014


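Structured Sparsity

[Figure: response y regressed on inputs x_1, x_2, x_3, ..., x_{n-1}, x_n with coefficients β_1, β_2, β_3, ..., β_{n-1}, β_n]

Sparsity: $\Omega(\beta) = \sum_i |\beta_i|$

Group sparsity: $\Omega(\beta) = \sum_c \|\beta_{G_c}\|_2 = \sum_c \sqrt{\sum_{i \in G_c} \beta_i^2}$

Total variation sparsity: $\Omega(\beta) = \sum_c \|\beta_{G_c}\|_{TV} = \sum_c \sum_{i \in G_c} |\beta_i - \beta_{i-1}|$

Estimator: $\beta^* = \arg\min_\beta L(X, Y, \beta) + \Omega(\beta)$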
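The three regularizers on this slide can be evaluated directly. The following is a minimal Python sketch, not part of the original slides. The coefficient vector, the index groups, and the choice to take total-variation differences only between consecutive members of the same group are illustrative assumptions.

```python
import math

def lasso_penalty(beta):
    """Sparsity: sum_i |beta_i|."""
    return sum(abs(b) for b in beta)

def group_penalty(beta, groups):
    """Group sparsity: sum over groups of the L2 norm of that group's coefficients."""
    return sum(math.sqrt(sum(beta[i] ** 2 for i in g)) for g in groups)

def tv_penalty(beta, groups):
    """Total variation: sum over groups of |beta_i - beta_{i-1}|.

    Assumption: differences are taken between consecutive members of a group.
    """
    return sum(abs(beta[g[t]] - beta[g[t - 1]])
               for g in groups for t in range(1, len(g)))

# Hypothetical coefficients and contiguous groups, for illustration only.
beta = [3.0, 4.0, 0.0, 0.0, 1.0, 1.0]
groups = [[0, 1], [2, 3], [4, 5]]

print(lasso_penalty(beta))          # 9.0
print(group_penalty(beta, groups))  # 5 + sqrt(2), about 6.414
print(tv_penalty(beta, groups))     # 1.0
```

The group penalty charges the all-zero group nothing, and the total-variation penalty charges the constant group (1.0, 1.0) nothing, which is how each penalty encodes its preferred structure.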


Genetic Basis of Diseases

[Figure: aligned DNA sequences from Healthy and Sick individuals]

ACTCGTACGTAGACCTAGCATTACGCAATAATGCGA

ACTCGAACCTAGACCTAGCATTACGCAATAATGCGA

TCTCGTACGTAGACGTAGCATTACGCAATTATCCGA

ACTCGAACCTAGACCTAGCATTACGCAATTATCCGA

ACTCGTACGTAGACGTAGCATAACGCAATAATGCGA

TCTCGTACCTAGACGTAGCATAACGCAATAATCCGA

ACTCGAACCTAGACCTAGCATAACGCAATTATCCGA

Single nucleotide polymorphism (SNP)

Causal (or "associated") SNP


Genetic Association Mapping

[Figure: genotype data showing only the SNP positions of each individual's sequence]

Data: Genotype, Phenotype

Standard Approach

causal SNP

a univariate phenotype: e.g., disease/control

• Cancer: Dunning et al. 2009.
• Diabetes: Dupuis et al. 2010.
• Atopic dermatitis: Esparza-Gordillo et al. 2009.
• Arthritis: Suzuki et al. 2008.


Genetic Basis of Complex Diseases

[Figure: aligned DNA sequences of Healthy and Cancer individuals, with Causal SNPs marked]


Genetic Basis of Complex Diseases

[Figure: aligned DNA sequences of Healthy and Cancer individuals, with labels: Causal SNPs, Biological mechanism]


Genetic Basis of Complex Diseases

Intermediate Phenotype

[Figure: aligned DNA sequences of Healthy and Cancer individuals, with labels: Causal SNPs, Clinical records, Gene expression]

Association to intermediate phenotypes


Genetic Association for Asthma Clinical Traits

[Figure: trait network with subnetworks for lung physiology and a subnetwork for quality of life, linked to the DNA sequence TCGACGTTTTACTGTACAATT]

Statistical challenge: How to find associations to a multivariate entity in a graph?


Gene Expression Trait Analysis

[Figure: gene expression data (axes: Genes, Samples) linked to the DNA sequence TCGACGTTTTACTGTACAATT]

Statistical challenge: How to find associations to a multivariate entity in a tree?


Association with Phenome

Traditional Approach: causal SNP, a univariate phenotype (gene expression level)

Structured Association: multivariate complex syndrome (e.g., asthma; age at onset, history of eczema), genome-wide expression profile

[Figure: DNA sequence ACGTTTTACTGTACAATT shown under each approach]


Sparse Associations

Pleiotropic effects

Epistatic effects


Structured Sparse Association: a New Paradigm

Standard Approach (Phenotypes): consider one phenotype at a time

vs.

New Approach (Phenome): consider multiple correlated phenotypes (phenome) jointly


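Sparse Learning

Linear Model: $y = X\beta + \epsilon$

Lasso (Sparse Linear Regression) [R. Tibshirani 96]: $\hat{\beta} = \arg\min_\beta \|y - X\beta\|_2^2 + \lambda \|\beta\|_1$

Why sparse solution? Penalizing $\|\beta\|_1$ in the objective corresponds to constraining $\|\beta\|_1 \le t$ for a matching $t$, and the corners of the L1 ball sit on the coordinate axes, where some coefficients are exactly zero.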
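To make the sparsity effect concrete, here is a minimal sketch of the lasso solved by cyclic coordinate descent with soft-thresholding. This is one standard solver, not necessarily the one used in the lecture. The toy design matrix, the response, and the value of `lam` are made up for illustration, and the objective uses a 1/2 factor on the squared loss, which only rescales the regularization weight.

```python
def soft_threshold(z, t):
    """Proximal operator of t*|.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta (beta starts at zero)
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm_sq = sum(c * c for c in col)
            if norm_sq == 0.0:
                continue
            # Correlation of column j with the residual that excludes beta_j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid))
            new = soft_threshold(rho, lam) / norm_sq
            delta = new - beta[j]
            resid = [r - c * delta for c, r in zip(col, resid)]
            beta[j] = new
    return beta

# Hypothetical toy data: orthogonal columns, y = 2*col0 + 0.1*col1.
X = [[1.0, 1.0, 1.0],
     [1.0, -1.0, 1.0],
     [1.0, 1.0, -1.0],
     [1.0, -1.0, -1.0]]
y = [2.1, 1.9, 2.1, 1.9]

beta_hat = lasso_cd(X, y, lam=1.0)
print(beta_hat)  # the strong effect is shrunk to about 1.75, the others are exactly 0
```

The weak effect on the second column is set exactly to zero rather than merely shrunk, which is the behavior that makes the lasso useful for variable selection.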


Multi-Task Extension

Multi-Task Linear Model:

Input:

Output:

Coefficients for k-th task:

Coefficient Matrix:

Coefficients for a variable (2nd)

Coefficients for a task (2nd)
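The slide's equations are not available in this transcript, so the following sketch uses assumed notation: an N x J input matrix `X`, a J x K coefficient matrix `B` whose k-th column holds the coefficients for the k-th task, and an N x K output `Y = X B` (noise omitted). Under that assumption, the coefficients for a variable form a row of `B` and the coefficients for a task form a column.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical sizes: N = 2 samples, J = 3 variables, K = 2 tasks.
X = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0]]
B = [[0.5, 0.0],    # coefficients of variable 1 across both tasks
     [0.0, 0.0],    # coefficients of variable 2 (irrelevant to every task)
     [1.0, -1.0]]   # coefficients of variable 3

Y = matmul(X, B)                          # N x K outputs
coef_for_variable_2 = B[1]                # 2nd row of B
coef_for_task_2 = [row[1] for row in B]   # 2nd column of B

print(Y)
print(coef_for_variable_2, coef_for_task_2)
```

A row of zeros, as for the second variable here, is the pattern that joint (row-wise) sparsity penalties try to recover.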


Outline

• Background: Sparse multivariate regression for disease association studies
• Structured association: a new paradigm
• Association to a graph-structured phenome: Graph-guided fused lasso (Kim & Xing, PLoS Genetics, 2009)
• Association to a tree-structured phenome: Tree-guided group lasso (Kim & Xing, ICML 2010)


Multivariate Regression for Single-Trait Association Analysis

[Figure: Trait y (e.g., 2.1) = Genotype X (allele pairs per SNP: T G, A A, C C, A T, G A, A G, T A) × Association Strength β (unknown, shown as "?")]

$y = X\beta$


Multivariate Regression for Single-Trait Association Analysis

[Figure: Trait (2.1) = Genotype (allele pairs per SNP) × Association Strength]

Many non-zero associations: Which SNPs are truly significant?


Lasso for Reducing False Positives (Tibshirani, 1996)

[Figure: Trait (2.1) = Genotype (allele pairs per SNP) × Association Strength]

Lasso penalty for sparsity: $+ \sum_{j=1}^{J} |\beta_j|$

Many zero associations (sparse results), but what if there are multiple related traits?


Multivariate Regression for Multiple-Trait Association Analysis

[Figure: Traits (3.4, 1.5, 2.1, 0.9, 1.8), grouped as Allergy and Lung physiology, = Genotype × Association Strength; genotype annotations: LD, Synthetic lethal]

Association strength between SNP j and Trait i: $\beta_{j,i}$

Penalty: $+ \sum_{j,i} |\beta_{j,i}|$

How to combine information across multiple traits to increase the power?


Multivariate Regression for Multiple-Trait Association Analysis

[Figure: Traits (3.4, 1.5, 2.1, 0.9, 1.8), grouped as Allergy and Lung physiology, = Genotype × Association Strength]

Association strength between SNP j and Trait i: $\beta_{j,i}$

Penalty: $+ \sum_{j,i} |\beta_{j,i}|$ + graph-guided fusion penalty

We introduce a graph-guided fusion penalty.


Multiple-trait Association: Graph-Constrained Fused Lasso

Step 1: Thresholded correlation graph of phenotypes

Step 2: Graph-constrained fused lasso

[Figure: DNA sequence ACGTTTTACTGTACAATT linked to the phenotype graph; objective terms labeled Lasso Penalty and Graph-constrained fusion penalty; label: Fusion]
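The two steps can be sketched in plain Python. This is an illustration under stated assumptions rather than the lecture's code: the threshold `rho`, the weights `lam` and `gamma`, and the toy data are hypothetical, and the `sign(r)` factor in the fusion term (so that negatively correlated traits are fused toward opposite signs) follows my reading of Kim & Xing (2009) and is not visible in this transcript.

```python
import math

def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def thresholded_graph(Y, rho):
    """Step 1: edges (m, l, r_ml) between traits with |correlation| > rho."""
    traits = list(zip(*Y))  # columns of Y, one per trait
    edges = []
    for m in range(len(traits)):
        for l in range(m + 1, len(traits)):
            r = pearson(traits[m], traits[l])
            if abs(r) > rho:
                edges.append((m, l, r))
    return edges

def gflasso_penalty(B, edges, lam, gamma):
    """Step 2 penalty: lasso term plus fusion term over the graph edges.

    B[j][k] is the association strength between SNP j and trait k.
    """
    lasso = sum(abs(b) for row in B for b in row)
    fusion = sum(abs(row[m] - math.copysign(1.0, r) * row[l])
                 for (m, l, r) in edges for row in B)
    return lam * lasso + gamma * fusion

# Hypothetical data: 4 samples x 3 traits; traits 0 and 1 are strongly correlated.
Y = [[1.0, 2.0, 1.0],
     [2.0, 4.0, -1.0],
     [3.0, 6.0, -1.0],
     [4.0, 8.5, 1.0]]
edges = thresholded_graph(Y, rho=0.5)

# Hypothetical coefficients: 2 SNPs x 3 traits.
B = [[1.0, 1.0, 0.0],
     [0.5, 0.0, 0.0]]
penalty = gflasso_penalty(B, edges, lam=1.0, gamma=2.0)
print(edges, penalty)
```

The first SNP has equal coefficients on the two connected traits and pays no fusion cost, while the second SNP, which affects only one of them, is penalized for the difference.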


Fusion Penalty

Fusion Penalty: $|\beta_{jk} - \beta_{jm}|$

For two correlated traits (connected in the network), the association strengths may have similar values.

Association strength between SNP j and Trait k: $\beta_{jk}$

Association strength between SNP j and Trait m: $\beta_{jm}$

[Figure: SNP j in the sequence ACGTTTTACTGTACAATT linked to Trait k and Trait m]


Graph-Constrained Fused Lasso

Overall effect:

• Fusion effect propagates to the entire network
• Association between SNPs and subnetworks of traits


Multiple-trait Association: Graph-Weighted Fused Lasso

Overall effect:

• Subnetwork structure is embedded as densely connected nodes with large edge weights
• Edges with small weights are effectively ignored
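A weighted fusion term can be sketched as follows. The edge weight is taken to be the absolute correlation |r|, which is an assumption chosen for illustration (other monotone functions of the correlation would serve the same purpose), and the edge list and coefficients are made up.

```python
import math

def weighted_fusion_penalty(B, edges, weight=abs):
    """Sum over edges (m, l, r) and SNPs j of weight(r) * |B[j][m] - sign(r) * B[j][l]|."""
    return sum(weight(r) * abs(row[m] - math.copysign(1.0, r) * row[l])
               for (m, l, r) in edges for row in B)

# Hypothetical graph: one strong edge and one weak, negatively correlated edge.
edges = [(0, 1, 0.9), (1, 2, -0.05)]
B = [[1.0, 0.0, 1.0]]  # one SNP, three traits

total = weighted_fusion_penalty(B, edges)
print(total)  # 0.9 from the strong edge plus 0.05 from the weak edge
```

Both edges see a coefficient mismatch of size 1, but the weak edge contributes almost nothing, which is the sense in which small-weight edges are effectively ignored.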


Estimating Parameters

Quadratic programming formulation:

• Graph-constrained fused lasso
• Graph-weighted fused lasso

Many publicly available software packages for solving convex optimization problems can be used.


Simulation Results

• 50 SNPs taken from HapMap chromosome 7, CEU population
• 10 traits

[Figure panels: Trait Correlation Matrix; Thresholded Trait Correlation Network; True Regression Coefficients; Single SNP-Single Trait Test (significant at α = 0.01); Lasso; Graph-guided Fused Lasso. Axes: Phenotypes, SNPs. Color scale: No association to High association.]


Simulation Results


Association to a Tree-structured Phenome

TCGACGTTTTACTGTACAATT


Tree-Guided Group Lasso

In a simple case of two genes:

• Low height h: tight correlation, joint selection
• Large height h: weak correlation, separate selection

[Figure: two-leaf trees of height h over the two genes; axis: Genotypes]


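Tree-Guided Group Lasso

In a simple case of two genes: select the child nodes jointly or separately?

• L1 penalty: lasso penalty, separate selection
• L2 penalty: group lasso, joint selection

[Figure: two-leaf tree of height h; the tree-guided group lasso penalty with its L1 and L2 terms marked, labeled "Elastic net"]

Tree-guided group lasso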
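For the two-gene case, the trade-off between separate and joint selection can be written as a convex combination of the L1 and L2 penalties. The sketch below assumes that the tree height h, normalized to [0, 1], is used directly as the mixing weight; that choice is an assumption based on my reading of Kim & Xing (ICML 2010), not something the transcript states.

```python
import math

def two_gene_penalty(b1, b2, h):
    """h * (|b1| + |b2|) + (1 - h) * sqrt(b1^2 + b2^2), with h in [0, 1]."""
    separate = abs(b1) + abs(b2)   # L1 term: lasso, separate selection
    joint = math.hypot(b1, b2)     # L2 term: group lasso, joint selection
    return h * separate + (1.0 - h) * joint

print(two_gene_penalty(3.0, 4.0, 1.0))  # 7.0, pure lasso (large height)
print(two_gene_penalty(3.0, 4.0, 0.0))  # 5.0, pure group lasso (low height)
print(two_gene_penalty(3.0, 4.0, 0.5))  # 6.0, halfway between the two
```

A low height (tightly correlated genes) puts most of the weight on the joint term, so the two coefficients tend to enter or leave the model together.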


Tree-Guided Group Lasso

For a general tree: select the child nodes jointly or separately?

[Figure: tree with internal-node heights h1 and h2; tree-guided group lasso penalty with terms labeled Joint selection and Separate selection]




Balanced Shrinkage

Previously, in Jenatton, Audibert & Bach, 2009
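One way to see what balanced shrinkage means is to compute the group weights for a small tree and check that every leaf receives the same total weight. The sketch below assumes the recursive weighting I recall from Kim & Xing (ICML 2010): each internal node v has a separate-selection weight s_v (taken here to be its normalized height) and a joint-selection weight g_v = 1 - s_v, and a node's group weight is g_v times the product of s over its ancestors (a leaf's weight is just that product). The tree and heights are hypothetical.

```python
def node_weights(children, height, root):
    """Return {node: group weight} for a tree given as {internal node: [children]}."""
    weights = {}

    def visit(node, carried):
        # carried = product of s over all ancestors of this node
        if node not in children:       # leaf: single-coefficient group
            weights[node] = carried
            return
        s = height[node]               # assumed: s_v = normalized height
        weights[node] = (1.0 - s) * carried
        for child in children[node]:
            visit(child, carried * s)

    visit(root, 1.0)
    return weights

def total_weight_per_leaf(children, weights, root):
    """Sum the weights of all groups (root-to-leaf path) that contain each leaf."""
    totals = {}

    def visit(node, acc):
        acc += weights[node]
        if node not in children:
            totals[node] = acc
        else:
            for child in children[node]:
                visit(child, acc)

    visit(root, 0.0)
    return totals

# Hypothetical tree: leaves a and b merge at height 0.2, then join c at height 0.8.
children = {"root": ["v", "c"], "v": ["a", "b"]}
height = {"root": 0.8, "v": 0.2}

weights = node_weights(children, height, "root")
totals = total_weight_per_leaf(children, weights, "root")
print(weights)
print(totals)  # every leaf's total is 1 up to rounding
```

Because the weights along every root-to-leaf path telescope to 1, no coefficient is penalized more heavily just because it sits deeper in the tree.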


Estimating Parameters

Second-order cone program

Many publicly available software packages for solving convex optimization problems can be used


Illustration with Simulated Data

[Figure panels: True association strengths; Lasso; Tree-guided group lasso. Axes: SNPs, Phenotypes. Color scale: No association to High association.]


Yeast eQTL Analysis

[Figure panels: Hierarchical clustering tree; Single-Marker Single-Trait Test; Tree-guided group lasso. Axes: SNPs, Phenotypes. Color scale: No association to High association.]


Ultimately …

Pleiotropic effects

Epistatic effects


Structured Input/Output-Lasso

[Equation: the IO-lasso estimate, an arg min of a squared-error loss summed over K traits and N samples, with outputs Y, inputs X (p SNPs), and pairwise interaction terms Z over SNP pairs (r, s) in U, plus four weighted penalty terms]

U: genetic interaction networks

S_m: m-th cluster in the SNP network

• Lasso penalty: within-group sparsity
• Input structure: group selection of correlated epistatic SNPs
• Output structure: group selection of SNPs across multiple correlated traits

This full model incorporates input/output structure of the dataset as well as epistatic effects guided by genetic interaction networks.

[Lee, Zhu and Xing, submitted 2010]


Sensitivity and Specificity varying the number of SNPs

For each number of SNPs, we show the average performance over 5 different simulated datasets.

• Marginal SNP: Methods taking advantage of output structures outperform others.
• Epistatic SNP: Methods taking advantage of input structures outperform others.
• IO-Lasso outperforms other methods for detecting both marginal and epistatic eQTLs.
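For reference, sensitivity and specificity of a recovered association pattern can be computed as below. This is a generic sketch, not the lecture's evaluation code; the convention that an estimate counts as "selected" when its magnitude exceeds `tol` and the example vectors are assumptions.

```python
def sensitivity_specificity(true_beta, est_beta, tol=0.0):
    """Return (sensitivity, specificity) of the estimated support.

    A coefficient is truly associated if its true value is non-zero, and is
    selected if the estimate's magnitude exceeds tol.
    """
    tp = fn = tn = fp = 0
    for t, e in zip(true_beta, est_beta):
        is_true, is_selected = (t != 0), (abs(e) > tol)
        if is_true and is_selected:
            tp += 1
        elif is_true:
            fn += 1
        elif is_selected:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

# Hypothetical true and estimated association strengths for 5 SNPs.
true_beta = [1.0, 0.0, 0.0, 2.0, 0.0]
est_beta = [0.5, 0.0, 0.1, 0.0, 0.0]
sens, spec = sensitivity_specificity(true_beta, est_beta)
print(sens, spec)  # 0.5 and 2/3
```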


Computation Time


Proximal Gradient Descent

Original Problem:

Approximation Problem:

Gradient of the Approximation:


Geometric Interpretation

Smooth approximation

[Figure panels: Uppermost Line, Nonsmooth; Uppermost Line, Smooth]
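The slide's formulas are not available in this transcript, so the following is a one-dimensional illustration of one common smoothing construction (Nesterov-style smoothing), offered as an assumption about the idea rather than the lecture's exact formulation. The absolute value is the upper envelope of the lines a*x for |a| <= 1; subtracting (mu/2)*a^2 inside the maximum rounds off the kink and gives a differentiable function whose gradient is the maximizing a.

```python
def smoothed_abs(x, mu):
    """Return (value, gradient) of max over |a| <= 1 of (a*x - 0.5*mu*a*a).

    The maximizer is x/mu clipped to [-1, 1]; it is also the gradient.
    """
    a = max(-1.0, min(1.0, x / mu))
    return a * x - 0.5 * mu * a * a, a

mu = 0.5  # hypothetical smoothing parameter
for x in (-2.0, -0.1, 0.0, 0.1, 2.0):
    value, grad = smoothed_abs(x, mu)
    # The smooth function lies below |x| by at most mu/2.
    print(x, abs(x), value, grad)
```

Away from zero the approximation is |x| - mu/2, and near zero it is the quadratic x^2 / (2*mu), so a smaller mu gives a tighter but less smooth approximation.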


Convergence Rate

Theorem: If we require [...] and set [...], the number of iterations is upper bounded by: [...]

Remarks: the state-of-the-art interior-point method (IPM) for SOCP converges at a rate [...]


Multi-Task Time Complexity

Pre-compute:

Per-iteration Complexity (computing gradient):

• Tree: IPM for SOCP vs. Proximal-Gradient
• Graph: IPM for SOCP vs. Proximal-Gradient

Proximal-Gradient: independent of sample size, linear in the number of tasks


Experiments

Multi-task Graph Structured Sparse Learning (GFlasso)


Experiments

Multi-task Tree-Structured Sparse Learning (TreeLasso)


Conclusions

Novel statistical methods for joint association analysis to correlated phenotypes:

• Graph-structured phenome: graph-guided fused lasso
• Tree-structured phenome: tree-guided group lasso

Advantages:

• Greater power to detect weak association signals
• Fewer false positives
• Joint association to multiple correlated phenotypes
