advanced introduction to machine...
TRANSCRIPT
Advanced Introduction to Machine Learning
10715, Fall 2014
Structured Sparsity, with application in Computational
Genomics
Eric XingLecture 3, September 15, 2014
Reading:© Eric Xing @ CMU, 2014 1
Structured Sparsity
Sparsity
Group sparsity
Total variation sparsity
y
x1
x2
x3
xn-1
xn
1
2
3
n-1
n
Ð(¯) =P
i j¯ij
Ð(¯) =P
c j¯Gc j2 =P
c
qPi2Gc
¯2i
Ð(¯) =P
c j¯GcjTV =P
c
Pi2Gc
j¯i ¡ ¯i¡1j
¯¤ = arg minL(X;Y;¯) + Ð(¯)
© Eric Xing @ CMU, 2014 2
Healthy
Sick
ACTCGTACGTAGACCTAGCATTACGCAATAATGCGA
ACTCGAACCTAGACCTAGCATTACGCAATAATGCGA
TCTCGTACGTAGACGTAGCATTACGCAATTATCCGA
ACTCGAACCTAGACCTAGCATTACGCAATTATCCGA
ACTCGTACGTAGACGTAGCATAACGCAATAATGCGA
TCTCGTACCTAGACGTAGCATAACGCAATAATCCGA
ACTCGAACCTAGACCTAGCATAACGCAATTATCCGA
Single nucleotide polymorphism (SNP)
Causal (or "associated") SNP
Genetic Basis of Diseases
© Eric Xing @ CMU, 2014 3
A . . . . T . . G . . . . . C . . . . . . T . . . . . A . . G .
A . . . . A . . C . . . . . C . . . . . . T . . . . . A . . G .
T . . . . T . . G . . . . . G . . . . . . T . . . . . T . . C .
A . . . . A . . C . . . . . C . . . . . . T . . . . . T . . C .
A . . . . T . . G . . . . . G . . . . . . A . . . . . A . . G .
T . . . . T . . C . . . . . G . . . . . . A . . . . . A . . C .
A . . . . A . . C . . . . . C . . . . . . A . . . . . T . . C .
Data
Genotype Phenotype
• Cancer: Dunning et al. 2009.• Diabetes: Dupuis et al. 2010.• Atopic dermatitis: Esparza-Gordillo et al. 2009.• Arthritis: Suzuki et al. 2008
Standard Approach
causal SNP
a univariate phenotype:e.g., disease/control
Genetic Association Mapping
© Eric Xing @ CMU, 2014 4
Healthy
Cancer
ACTCGTACGTAGACCTAGCATTACGCAATAATGCGA
ACTCGAACCTAGACCTAGCATTACGCAATAATGCGA
TCTCGTACGTAGACGTAGCATTACGCAATTATCCGA
ACTCGAACCTAGACCTAGCATTACGCAATTATCCGA
ACTCGTACGTAGACGTAGCATAACGCAATAATGCGA
TCTCGTACCTAGACGTAGCATAACGCAATAATCCGA
ACTCGAACCTAGACCTAGCATAACGCAATTATCCGA
Causal SNPs
Genetic Basis of Complex Diseases
© Eric Xing @ CMU, 2014 5
Genetic Basis of Complex Diseases
Healthy
Cancer
ACTCGTACGTAGACCTAGCATTACGCAATAATGCGA
ACTCGAACCTAGACCTAGCATTACGCAATAATGCGA
TCTCGTACGTAGACGTAGCATTACGCAATTATCCGA
ACTCGAACCTAGACCTAGCATTACGCAATTATCCGA
ACTCGTACGTAGACGTAGCATAACGCAATAATGCGA
TCTCGTACCTAGACGTAGCATAACGCAATAATCCGA
ACTCGAACCTAGACCTAGCATAACGCAATTATCCGA
Causal SNPs
Biological mechanism
© Eric Xing @ CMU, 2014 6
Intermediate Phenotype
Genetic Basis of Complex Diseases
Healthy
Cancer
ACTCGTACGTAGACCTAGCATTACGCAATAATGCGA
ACTCGAACCTAGACCTAGCATTACGCAATAATGCGA
TCTCGTACGTAGACGTAGCATTACGCAATTATCCGA
ACTCGAACCTAGACCTAGCATTACGCAATTATCCGA
ACTCGTACGTAGACGTAGCATAACGCAATAATGCGA
TCTCGTACCTAGACGTAGCATAACGCAATAATCCGA
ACTCGAACCTAGACCTAGCATAACGCAATTATCCGA
Causal SNPs
Clinical records
Gene expressionAssociation to intermediate phenotypes
© Eric Xing @ CMU, 2014 7
Subnetworks for lung physiology Subnetwork for
quality of life
Genetic Association for Asthma Clinical Traits
TCGACGTTTTACTGTACAATT
Statistical challengeHow to find
associations to a multivariate entity in a graph?
© Eric Xing @ CMU, 2014 8
Gene Expression Trait Analysis
TCGACGTTTTACTGTACAATT
GenesSamples
Statistical challengeHow to find
associations to a multivariate entity in a tree?
© Eric Xing @ CMU, 2014 9
Association with Phenome
ACGTTTTACTGTACAATT
Multivariate complex syndrome (e.g., asthma)age at onset, history of eczema
genome‐wide expression profile
ACGTTTTACTGTACAATT
Traditional Approachcausal SNP
a univariate phenotype:gene expression level
Structured Association
© Eric Xing @ CMU, 2014 10
Sparse Associations
Pleotropic effects
Epistatic effects
© Eric Xing @ CMU, 2014 11
Considerone phenotype at a time
Considermultiple correlated phenotypes (phenome) jointly
vs.
Standard Approach New Approach
Phenotypes Phenome
Structured Sparse Association : a New Paradigm
© Eric Xing @ CMU, 2014 12
Sparse Learning
Linear Model:
Lasso (Sparse Linear Regression)
Why sparse solution? penalizing
constraining
[R.Tibshirani 96]
© Eric Xing @ CMU, 2014 13
Multi-Task Extension Multi-Task Linear Model:
Coefficients for k-th task:Coefficient Matrix:
Input:Output:
Coefficients for a variable (2nd)
Coefficients for a task (2nd)© Eric Xing @ CMU, 2014 14
Background: Sparse multivariate regression for disease association studies
Structured association – a new paradigm Association to a graph-structured phenome
Graph-guided fused lasso (Kim & Xing, PLoS Genetics, 2009)
Association to a tree-structured phenome Tree-guided group lasso (Kim & Xing, ICML 2010)
Outline
© Eric Xing @ CMU, 2014 15
T G
A A
C C
A T
G A
A G
T A
GenotypeTrait
Multivariate Regression for Single-Trait Association Analysis
Xy
2.1 x=
x= β
Association Strength
?
© Eric Xing @ CMU, 2014 16
T G
A A
C C
A T
G A
A G
T A
GenotypeTrait
Multivariate Regression for Single-Trait Association Analysis
Many non-zero associations: Which SNPs are truly significant?
2.1 x=
Association Strength
© Eric Xing @ CMU, 2014 17
Genotype
x=2.1
Trait
Lasso for Reducing False Positives (Tibshirani, 1996)
Many zero associations (sparse results), but what if there are multiple related traits?
+
J
j 1 |βj |
T G
A A
C C
A T
G A
A G
T A
Lasso Penalty for sparsity
Association Strength
© Eric Xing @ CMU, 2014 18
Genotype
(3.4, 1.5, 2.1, 0.9, 1.8)
Multivariate Regression for Multiple-Trait Association Analysis
T G
A A
C C
A T
G A
A G
T A
How to combine information across multiple traits to increase the power?
Association Strength
x=
Association strength between SNP j and Trait i: βj,i
Allergy Lung physiology
LDSynthetic
lethal
+ ji,
|βj,i|
© Eric Xing @ CMU, 2014 19
Genotype
(3.4, 1.5, 2.1, 0.9, 1.8)
Trait
Multivariate Regression for Multiple-Trait Association Analysis
T G
A A
C C
A T
G A
A G
T A
Allergy Lung physiology
Association Strength
x=
+We introducegraph-guided fusion penalty
Association strength between SNP j and Trait i: βj,i
+ ji,
|βj,i|
© Eric Xing @ CMU, 2014 20
Multiple-trait Association:Graph-Constrained Fused Lasso
Step 1: Thresholded correlation graph of phenotypes
ACGTTTTACTGTACAATT
Step 2: Graph‐constrained fused lasso
Lasso Penalty
Graph-constrained fusion penalty
Fusion
© Eric Xing @ CMU, 2014 21
Fusion Penalty
Fusion Penalty: | βjk - βjm | For two correlated traits (connected in the network), the
association strengths may have similar values.
ACGTTTTACTGTACAATT
SNP j
Trait m
Trait k
Association strength between SNP j and Trait k:βjk
Association strength between SNP j and Trait m:βjm
© Eric Xing @ CMU, 2014 22
ACGTTTTACTGTACAATT
Overall effect
Graph-Constrained Fused Lasso
Fusion effect propagates to the entire network Association between SNPs and subnetworks of traits
© Eric Xing @ CMU, 2014 23
ACGTTTTACTGTACAATT
Multiple-trait Association: Graph-Weighted Fused Lasso
Subnetwork structure is embedded as a densely connected nodes with large edge weights
Edges with small weights are effectively ignored
Overall effect
© Eric Xing @ CMU, 2014 24
Estimating Parameters
Quadratic programming formulation Graph-constrained fused lasso
Graph-weighted fused lasso
Many publicly available software packages for solving convex optimization problems can be used
© Eric Xing @ CMU, 2014 25
50 SNPs taken from HapMap chromosome 7, CEU population
10 traits
Trait Correlation Matrix
True Regression Coefficients
Single SNP-Single Trait Test
Significant at α = 0.01
Lasso Graph-guided Fused Lasso
Thresholded Trait Correlation Network
Simulation Results
PhenotypesSN
Ps
No association
High association
© Eric Xing @ CMU, 2014 26
Simulation Results
© Eric Xing @ CMU, 2014 27
Association to a Tree-structured Phenome
TCGACGTTTTACTGTACAATT
© Eric Xing @ CMU, 2014 28
Tree-Guided Group Lasso In a simple case of two genes
• Low height• Tight correlation• Joint selection
• Large height• Weak correlation• Separate selection
h
h
Gen
otyp
es
Gen
otyp
es
© Eric Xing @ CMU, 2014 29
L1 penalty• Lasso penalty• Separateselection
L2 penalty • Group lasso• Joint selection
h
Elastic net
Select the child nodes jointly orseparately?
Tree-Guided Group Lasso In a simple case of two genes
Tree-guided group lasso
© Eric Xing @ CMU, 2014 30
h2
h1
Select the child nodes jointly or separately?
Joint selection
Separate selection
Tree-Guided Group Lasso For a general tree
Tree-guided group lasso
© Eric Xing @ CMU, 2014 31
h2
h1
Select the child nodes jointly or separately?
Tree-Guided Group Lasso For a general tree
Tree-guided group lasso
Joint selection
Separate selection
© Eric Xing @ CMU, 2014 32
Previously, in Jenatton, Audibert & Bach, 2009
Balanced Shrinkage
© Eric Xing @ CMU, 2014 33
Estimating Parameters
Second-order cone program
Many publicly available software packages for solving convex optimization problems can be used
© Eric Xing @ CMU, 2014 34
Illustration with Simulated Data
True association strengths Lasso Tree‐guided
group lasso
SNPs
Phenotypes
No association
High association
© Eric Xing @ CMU, 2014 35
Yeast eQTL Analysis
Tree‐guided group lasso
Single‐Marker Single‐Trait Test
SNPs
Phenotypes
No association
High association
Hierarchical clustering tree
© Eric Xing @ CMU, 2014 36
Ultimately …
Pleotropic effects
Epistatic effects
© Eric Xing @ CMU, 2014 37
38
Structured Input/Output-Lasso
k Usr
krs
j k
kj
k m Ssr
krs
j k
kj
K
k
N
i Usrrsi
krs
p
jij
kj
kilassoio
m
ZXY
).(4
23
),(
22
1 11
1 1 ),(,
1
minarg
This full model incorporates input/output structure of the dataset as well as epistatic effects guided by genetic interaction networks
network SNPin cluster m :Snetworksn interactio genetic:
thm
U
Lasso penalty: within group sparsity Input structure: group selection of correlated epistatic SNPs
Output structure: group selection of SNPs across multiple correlated traits
[Lee, Zhu and Xing, submitted 2010]
© Eric Xing @ CMU, 2014 38
Sensitivity and Specificity varying the number of SNPs
9/15/2014 39 For each number of SNPs, we show the average of the performance with 5 different simulated data
Marginal SNP: Methods taking advantage of output structures outperforms others.
Epistatic SNP: Methods taking advantage of input structures outperforms others.
IO-Lasso outperforms other methods for detecting both marginal & epsitatic eQTLs
© Eric Xing @ CMU, 2014 39
Computation Time
© Eric Xing @ CMU, 2014 40
Proximal Gradient DescentOriginal Problem:
Approximation Problem:
Gradient of theApproximation:
© Eric Xing @ CMU, 2014 41
Geometric Interpretation Smooth approximation
Uppermost LineNonsmooth
Uppermost LineSmooth
© Eric Xing @ CMU, 2014 42
Convergence RateTheorem: If we require and set , the number of iterations is upper bounded by:
Remarks: state of the art IPM method for for SOCP converges at a rate
© Eric Xing @ CMU, 2014 43
Multi-Task Time Complexity Pre-compute:
Per-iteration Complexity (computing gradient)
Tree:
Graph:
IPM for SOCP
Proximal-Gradient
IPM for SOCP
Proximal-Gradient
Proximal-Gradient: Independent of Sample Size
Linear in #.of Tasks© Eric Xing @ CMU, 2014 44
Experiments Multi-task Graph Structured Sparse Learning
(GFlasso)
© Eric Xing @ CMU, 2014 45
Experiments
Multi-task Tree-Structured Sparse Learning (TreeLasso)
© Eric Xing @ CMU, 2014 46
Conclusions
Novel statistical methods for joint association analysis to correlated phenotypes Graph-structured phenome : graph-guided fused lasso Tree-structured phenome : tree-guided group lasso
Advantages Greater power to detect weak association signals Fewer false positives Joint association to multiple correlated phenotypes
© Eric Xing @ CMU, 2014 47