bioRxiv preprint first posted online Dec. 18, 2017; doi: http://dx.doi.org/10.1101/235580. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
\[ r'_E = A\, t_E \]
[Figure, panels A-G. Raman spectra: intensity (a.u.) vs. Raman shift (cm-1), 1000 to 3000, with the fingerprint region marked. Schematic: original high-dimensional Raman spectra (spectra from env. 1 and env. 2) reduced by principal component-linear discriminant analysis (axes LDA1, LDA2). Scatter plots: LDA1 vs. LDA2, LDA2 vs. LDA3, LDA3 vs. LDA4, LDA3 vs. LDA5. Conditions: YE, EMM 2% Glc., EMM-N, EMM-C, EMM 0.1% Glc., Sorbitol 1 M, CdSO4 1 mM, H2O2 2 mM, Heat 39°C, EtOH 10%.]
\[ \mathrm{PRESS}_r = \sum_{i=1}^{N} \left\lVert r'_{E_i} - \hat{r}'_{E_i} \right\rVert^2, \]
where $\hat{r}'_{E_i}$ is the estimate of $r'_{E_i}$ for condition $E_i$ and $N = 10$. The original data give $\mathrm{PRESS}_r = 5.45$.
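The PRESS statistic above is a sum of squared Euclidean distances between measured and estimated vectors, one term per condition. A minimal sketch of that sum follows; the function name `press` and the toy arrays are hypothetical, and the estimates are taken as given rather than produced by the leave-one-out procedure.

```python
import numpy as np


def press(measured, estimated):
    """Sum over conditions of the squared distance between measured and estimated vectors.

    Both inputs have shape (N, d): one row per condition, d values per vector.
    """
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    if measured.shape != estimated.shape:
        raise ValueError("measured and estimated must have the same shape")
    return float(np.sum((measured - estimated) ** 2))


# Toy data (illustrative only): N = 3 conditions, 2-dimensional vectors.
r_measured = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
r_estimated = np.array([[0.5, 0.0], [0.0, 1.0], [1.0, 2.0]])
toy_press = press(r_measured, r_estimated)
print(toy_press)
```

The same function applies unchanged to transcriptome vectors, since only the row-wise squared distance is used.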
[Figure, panels A-F. Schematic: training data, test data, estimate; randomly permuted datasets. Scatter plot: LDA1 vs. LDA2 for the conditions YE, EMM 2% Glc., EMM-N, EMM-C, EMM 0.1% Glc., Sorbitol 1 M, CdSO4 1 mM, H2O2 2 mM, Heat 39°C, EtOH 10%. Histogram: frequency vs. PRESS_r value for permuted datasets (original PRESS_r = 5.45). p-value vs. number of environments used (5 to 10). p-value vs. number of transcripts with highest VIP scores included (0 to 120).]
With $N = 10$ conditions $E_i$: $\hat{r}'_{E_i} = A_{-i}\, t_{E_i}$ for $i = 1, \ldots, N$.
[Figure, panels A-F. Schematic: training data, test data, estimate. Histogram: frequency vs. PRESS_t value (× 10^12) for permuted datasets (original PRESS_t = 1.52 × 10^12). RNA abundance + 1 (FPKM) vs. sorted RNA index for YE: measured and estimated RNA-seq data. Explained variance (%) vs. number of dimensions used (1 to 8). Measured EMM-N transcriptome (FPKM) vs. measured YE transcriptome (FPKM). Measured YE transcriptome (FPKM) vs. estimated YE transcriptome (FPKM).]
$\hat{t}_{E_i} = A^{\dagger}_{-i}\, r'_{E_i}$ for $i = 1, \ldots, N$; $\mathrm{PRESS}_t = 1.52 \times 10^{12}$.
3.95 × 10^5
3.86 × 10^5
1.81 × 10^5
1.46 × 10^5
1.32 × 10^5
7.27 × 10^4
1.29 × 10^5
7.54 × 10^4
6.44 × 10^4
1.29 × 10^5
9.17 × 10^4
5.38 × 10^4
2.88 × 10^4
6.96 × 10^4
7.79 × 10^4
4.02 × 10^4
3.10 × 10^4
2.46 × 10^4
3.10 × 10^4
5.77 × 10^4
5.33 × 10^4
2.61 × 10^4
2.39 × 10^4
2.12 × 10^4
1.53 × 10^4
6.80 × 10^4
5.22 × 10^4
5.78 × 10^4
1.89 × 10^4
1.20 × 10^4
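\[ \mathrm{PRESS}_t = \sum_{i=1}^{N} \left\lVert t_{E_i} - \hat{t}_{E_i} \right\rVert^2, \]
where $\hat{t}_{E_i}$ is the estimate of $t_{E_i}$.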
With $N = 10$, the $p$-value of $\mathrm{PRESS}_t$ is $p = 0.0004$.
PRESSr
N = 5
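$\mathrm{PRESS}_r$ with $5!\,(= 120)$; $p$-values of $\mathrm{PRESS}_r$, $\mathrm{PRESS}_t$, $\mathrm{PRESS}_r^{\mathrm{mRNA}}$ and $\mathrm{PRESS}_r^{\mathrm{ncRNA}}$.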
[Figure, panels A-F. Raman spectra: intensity (a.u.) vs. Raman shift (cm-1), 1000 to 3000, with the fingerprint region marked. Scatter plot: LDA1 vs. LDA2 for WT, ΔcyaA 0 mM, ΔcyaA 0.1 mM, ΔcyaA 0.5 mM, ΔcyaA 1 mM. Histogram: frequency vs. PRESS_r value (original PRESS_r = 1.06). Histogram: frequency vs. PRESS_t value (× 10^10) (original PRESS_t = 1.83 × 10^10). RNA abundance + 1 (FPKM) vs. sorted RNA index: measured and estimated RNA-seq data.]
$\mathrm{PRESS}_r = 1.06$; with $5! = 120$, $p = 3/120 = 0.0250$.
$\hat{t}_{E_i} = A^{\dagger}_{-i}\, r'_{E_i}$.
$\mathrm{PRESS}_t = 1.83 \times 10^{10}$; with $5! = 120$, $p = 2/120 = 0.0167$.
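\[ \mathrm{CV}_{\mathrm{PRESS}_t} = \frac{\sqrt{\mathrm{PRESS}_t}}{\dim t \cdot \operatorname{mean} t}, \]
where $\dim t$ and $\operatorname{mean} t$ are the dimension and the mean of $t$, and $t$ is either $t_{\mathrm{mRNA}}$ or $t_{\mathrm{ncRNA}}$: $\mathrm{CV}_{\mathrm{PRESS}_{t_{\mathrm{mRNA}}}} = 0.0909$ and $\mathrm{CV}_{\mathrm{PRESS}_{t_{\mathrm{ncRNA}}}} = 0.383$.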
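The normalisation of PRESS_t by the size and mean of the transcriptome vector can be sketched as below. This assumes the reading CV = sqrt(PRESS_t) / (dim t · mean t) of the extracted formula; the function name `cv_press` and the toy numbers are hypothetical and are not values from the paper.

```python
import math


def cv_press(press_t, dim_t, mean_t):
    """Return sqrt(PRESS_t) / (dim t * mean t).

    Assumed reading of the formula: the square root of PRESS_t is divided by
    the product of the vector dimension and the mean abundance.
    """
    if press_t < 0:
        raise ValueError("PRESS_t must be non-negative")
    if dim_t <= 0 or mean_t <= 0:
        raise ValueError("dim_t and mean_t must be positive")
    return math.sqrt(press_t) / (dim_t * mean_t)


# Toy numbers (illustrative only).
toy_cv = cv_press(press_t=4.0, dim_t=2, mean_t=0.5)
print(toy_cv)
```

Because the denominator is the total abundance (dimension times mean), the ratio is dimensionless and lets PRESS_t values of vectors with different sizes be compared.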
[Figure, panels A-E. Relative frequency vs. CV (0 to 3) for mRNA and ncRNA. Histogram: frequency vs. PRESS_r value for mRNA and ncRNA (original PRESS_r^mRNA = 6.71, original PRESS_r^ncRNA = 5.52). Histogram: frequency vs. PRESS_r value for mRNA and ncRNA (original PRESS_r^ncRNA = 1.61, original PRESS_r^mRNA = 0.81). p-value vs. number of mRNAs selected (0 to 6000). p-value vs. number of ncRNAs selected (0 to 1500).]
2.57 × 10^5
8.98 × 10^4
2.94 × 10^4
9.88 × 10^3
1.73 × 10^4
2.90 × 10^3
7.85 × 10^3
1.04 × 10^4
9.30 × 10^3
4.27 × 10^3
1.48 × 10^4
8.53 × 10^3
3.24 × 10^3
4.45 × 10^3
5.94 × 10^3
2.01 × 10^3
1.35 × 10^4
5.82 × 10^3
3.88 × 10^3
7.45 × 10^3
5.68 × 10^3
5.08 × 10^3
1.32 × 10^3
6.91 × 10^3
1.65 × 10^3
5.12 × 10^3
7.31 × 10^3
7.17 × 10^3
6.02 × 10^3
3.12 × 10^3
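$10{,}000$ frames of $2048 \times 2048$ pixels:
\[ o_i = \frac{1}{M} \sum_{m=1}^{M} s_i^m, \qquad \mathrm{var}_i = \frac{1}{M} \sum_{m=1}^{M} \left(s_i^m\right)^2 - o_i^2, \]
where $M = 10{,}000$ and $s_i^m$ is the value at pixel $i$ in frame $m$.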
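The per-pixel offset $o_i$ and variance $\mathrm{var}_i$ are a mean and a mean of squares minus the squared mean, taken over the $M$ frames. A minimal NumPy sketch follows, assuming $s_i^m$ denotes the value of pixel $i$ in frame $m$; the function name `pixel_offset_and_variance` is hypothetical, and the synthetic frames are far smaller than the 10,000 frames of 2048 × 2048 values mentioned in the text.

```python
import numpy as np


def pixel_offset_and_variance(frames):
    """Per-pixel offset o_i and variance var_i over a stack of frames.

    frames has shape (M, H, W). The variance is mean(s^2) - mean(s)^2,
    i.e. the population variance over the M frames.
    """
    frames = np.asarray(frames, dtype=float)
    offset = frames.mean(axis=0)                         # o_i
    variance = (frames ** 2).mean(axis=0) - offset ** 2  # var_i
    return offset, variance


# Synthetic stack (illustrative only): M = 1000 frames of 4 x 4 pixels.
rng = np.random.default_rng(0)
toy_frames = rng.normal(loc=100.0, scale=2.0, size=(1000, 4, 4))
toy_offset, toy_variance = pixel_offset_and_variance(toy_frames)
print(toy_offset.shape, toy_variance.shape)
```

For very large stacks the two sums can be accumulated frame by frame instead of holding all frames in memory; the formula is unchanged.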
unif(Di
, n) =
Pi2Cn⇥n
(Di�oi)
i
�
Pi2Cn⇥n
�1i
,
D
i
n
C
n⇥n
n⇥ n i
n = 3
For the condition $E_i$ left out, $A_{-i}$ satisfies
\[ R_{-i} = A_{-i} T_{-i} + E_{-i}, \]
with
\[ R_{-i} = [\, r'_{E_1,-i}, \cdots, r'_{E_{i-1},-i}, r'_{E_{i+1},-i}, \cdots, r'_{E_N,-i} \,], \qquad T_{-i} = [\, t_{E_1}, \cdots, t_{E_{i-1}}, t_{E_{i+1}}, \cdots, t_{E_N} \,]. \]
Here $N - 1 < \dim t_E$: $N - 1 = 9 < \dim t_E = 6560$ and $N - 1 = 4 < \dim t_E = 4349$.
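$N - 3 = 10 - 3 = 7$.
\[ \hat{r}'_{E_i,-i} = A_{-i}\, t_{E_i}, \]
\[ \hat{r}'_{E_i} = C_{-i}\, \hat{r}'_{E_i,-i}, \]
\[ C_{-i} = [\, r'_{E_1}, \cdots, r'_{E_{i-1}}, r'_{E_{i+1}}, \cdots, r'_{E_N} \,]\, R^{\dagger}_{-i}, \]
where $R^{\dagger}_{-i}$ is the pseudo-inverse of $R_{-i}$. Repeating this for $i = 1, \ldots, N$ gives $\mathrm{PRESS}_r$.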
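For the condition $E_i$ left out, the pseudo-inverse $A^{\dagger}_{-i}$ of $A_{-i}$ gives the estimate
\[ \hat{t}_{E_i} = A^{\dagger}_{-i}\, r'_{E_i}. \]
Any $t_{E_i}$ satisfying $r'_{E_i} = A_{-i}\, t_{E_i}$ can be written as
\[ t_{E_i} = A^{\dagger}_{-i}\, r'_{E_i} + (I - A^{\dagger}_{-i} A_{-i})\, v \]
for some vector $v$. The two terms $A^{\dagger}_{-i} r'_{E_i}$ and $(I - A^{\dagger}_{-i} A_{-i}) v$ are orthogonal,
\[ \langle A^{\dagger}_{-i}\, r'_{E_i},\; (I - A^{\dagger}_{-i} A_{-i})\, v \rangle = 0, \]
so adding $(I - A^{\dagger}_{-i} A_{-i}) v$ can only increase the norm, and $A^{\dagger}_{-i} r'_{E_i}$ is the minimum-norm choice of $t_{E_i}$. Computing $\hat{t}_{E_i} = A^{\dagger}_{-i}\, r'_{E_i}$ for $i = 1, \ldots, N$ gives $\mathrm{PRESS}_t$.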
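The orthogonality argument above can be checked numerically. The sketch below uses a random wide matrix as a stand-in for $A_{-i}$ (fewer rows than columns, as when the Raman dimension is smaller than $\dim t_E$); all sizes and variable names are hypothetical, and `numpy.linalg.pinv` computes the Moore-Penrose pseudo-inverse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for A_{-i}: 3 rows (reduced Raman dimension) by 8 columns (transcripts).
A = rng.normal(size=(3, 8))
t_true = rng.normal(size=8)
r = A @ t_true                    # consistent system r' = A t

A_pinv = np.linalg.pinv(A)        # Moore-Penrose pseudo-inverse
t_min = A_pinv @ r                # pseudo-inverse estimate of t

# Any other solution differs from t_min by a null-space component.
v = rng.normal(size=8)
null_part = (np.eye(8) - A_pinv @ A) @ v
t_other = t_min + null_part

print("residual of t_min:  ", np.linalg.norm(A @ t_min - r))
print("residual of t_other:", np.linalg.norm(A @ t_other - r))
print("inner product:      ", float(t_min @ null_part))
print("norms:              ", np.linalg.norm(t_min), np.linalg.norm(t_other))
```

Both vectors reproduce $r'$, the inner product is zero up to rounding, and the pseudo-inverse solution has the smaller norm.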
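$p$-values of $\mathrm{PRESS}_r$ and $\mathrm{PRESS}_t$ depend on $N$ ($8! = 40{,}320$): the cases $N \ge 8$ and $N < 8$ are distinguished, and $p = (b+1)/(m+1)$, where $b$ and $m$ are defined from the PRESS values of the permuted datasets.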
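The estimator $p = (b+1)/(m+1)$ can be sketched as follows. The definitions used here are assumptions, since the surrounding text did not survive extraction: $m$ is taken as the number of randomly permuted datasets and $b$ as the number of those whose PRESS is less than or equal to the original PRESS. The function name `permutation_p_value` is hypothetical.

```python
def permutation_p_value(original_press, permuted_press):
    """Permutation p-value (b + 1) / (m + 1).

    Assumed definitions: m is the number of permuted datasets, and b is the
    number of permuted datasets whose PRESS does not exceed the original one.
    """
    permuted = list(permuted_press)
    m = len(permuted)
    b = sum(1 for value in permuted if value <= original_press)
    return (b + 1) / (m + 1)


# Toy values (illustrative only): one of three permuted PRESS values is <= 1.0.
toy_p = permutation_p_value(1.0, [0.5, 2.0, 3.0])
print(toy_p)
```

Adding one to both counts keeps the estimate strictly positive, so a finite number of random permutations never reports $p = 0$.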