JetPoint meeting @JetBrains on bioinformatics
JetPoint Meeting
JetBrains BioLabs #jetmeet
6.03.2013
JetBrains
At JetBrains, we have a passion for making people more productive through smart software solutions that help them focus more on what they really want to accomplish, and less on mundane, repetitive "computer busy work".
BS-seq
ChIP-seq
Illumina27/450K
ChIP-seq
+
ChIP-BS-Seq
Open Data
ENCODE, Atlas, GEO
Wet lab problems
Academic software
A Farewell to Bioinformatics
http://madhadron.com/a-farewell-to-bioinformatics
Fuck you, bioinformatics. Eat shit and die.
JetBrains BioLabs
RNA-directed DNA methylation in Arabidopsis
SVM + AdaBoost ML. n-
Tradeoff:
~ 80%
Proof of concept AdaBoost
~ 99%
ML !
ML smRNA
BS-Seq
smRNA, piRNA, lncRNA, etc.
Illumina450K
Infinium Methylation 450K is a hybrid of two different assays, Infinium I and II.
Due to its design, Infinium Methylation 450K technology generates a dataset that should be viewed as two distinct datasets. Infinium II data are less accurate and reproducible than Infinium I data.
Peak-based correction makes it possible to treat Infinium I and Infinium II data as a single dataset.
Infinium Methylation 450K is one of the most attractive, powerful, and cost-effective tools currently available for generating quantitative DNA methylomes for health and disease, notably in the framework of large biomarker discovery studies.
Microarray Illumina450k.
Illumina450K
Beta = methylated / (methylated + unmethylated)
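As a small illustration of the Beta formula above, here is a minimal Python sketch. The function name `beta_value`, the probe names, and the intensity values are hypothetical and chosen only for the example; real pipelines often add a small offset to the denominator, which is not shown here.

```python
import math


def beta_value(methylated, unmethylated):
    """Beta = methylated / (methylated + unmethylated).

    Returns NaN when both intensities are zero, because the ratio
    is undefined in that case.
    """
    total = methylated + unmethylated
    if total == 0:
        return float("nan")
    return methylated / total


# Hypothetical (methylated, unmethylated) intensities for two probes.
probes = {
    "probe_a": (900.0, 100.0),  # mostly methylated
    "probe_b": (50.0, 950.0),   # mostly unmethylated
}

betas = {name: beta_value(m, u) for name, (m, u) in probes.items()}
for name, beta in betas.items():
    print(name, round(beta, 3))
```

Beta always lies between 0 (fully unmethylated) and 1 (fully methylated), which makes values comparable across probes.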
Illumina450K
+ subset quantile normalization
Beta, peak-based correction, subset quantile normalization
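To give a feel for the normalization step named here, the sketch below implements ordinary quantile normalization in plain Python. This is a simplification and an assumption on my part: the subset variant mentioned on the slide derives its reference quantiles from a chosen subset of probes, which is not reproduced here. The function name and the toy values are hypothetical.

```python
def quantile_normalize(samples):
    """Plain quantile normalization.

    samples: list of equal-length lists, one list of values per sample.
    Every sample is mapped onto the same reference distribution: the mean
    of the k-th smallest values across samples. Ties within a sample are
    broken by position (a simplification).
    """
    if not samples:
        return []
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of values")

    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)
    ]

    normalized = []
    for s in samples:
        # Indices of the sample's values in ascending order of value.
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy input: two samples measured on three probes.
print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
```

After normalization every sample has the same set of values, and only the ordering of probes within each sample is preserved.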
Illumina450K
SNP-
Subset Quantile Normalization
Batch effects
(genes, gene regions, etc) Mann-Whitney U-test
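The last item of the list above, a Mann-Whitney U-test per region, can be sketched as follows. This is a minimal pure-Python version under stated assumptions: it uses midranks for ties and a normal approximation without tie or continuity correction, so it is only indicative for small groups. The function name and the beta values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for x, approximate two-sided p-value).

    Uses midranks for tied values and a normal approximation without
    tie or continuity correction.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")

    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return u_x, p_value


# Hypothetical mean beta values of one gene region in two sample groups.
group_a = [0.81, 0.77, 0.85, 0.79, 0.90]
group_b = [0.42, 0.55, 0.47, 0.61, 0.50]
u, p = mann_whitney_u(group_a, group_b)
print("U =", u, "p ~", round(p, 4))
```

A rank-based test is a natural fit here because beta values are bounded between 0 and 1 and are usually far from normally distributed.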
NDA
Laboratory of Stem Cell Biology
ChIP-seq
Poisson Mixture
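A Poisson mixture for ChIP-seq read counts could look roughly like the sketch below: a two-component mixture (background and enriched bins) fitted with EM. The number of components, the starting values, the function names, and the toy counts are all assumptions for illustration and are not taken from the talk.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability of count k with rate lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def fit_poisson_mixture(counts, lam_low=1.0, lam_high=10.0,
                        weight_high=0.5, iterations=100):
    """Fit a two-component Poisson mixture with EM.

    Returns (lam_low, lam_high, weight_high).
    """
    eps = 1e-9
    for _ in range(iterations):
        # E-step: posterior probability that each bin is "enriched".
        resp = []
        for k in counts:
            a = math.log(1 - weight_high) + poisson_logpmf(k, lam_low)
            b = math.log(weight_high) + poisson_logpmf(k, lam_high)
            m = max(a, b)
            pa, pb = math.exp(a - m), math.exp(b - m)
            resp.append(pb / (pa + pb))

        # M-step: re-estimate rates and the mixing weight.
        total_high = sum(resp)
        total_low = len(counts) - total_high
        if total_high < eps or total_low < eps:
            break  # one component has collapsed; keep last estimates
        lam_high = max(sum(r * k for r, k in zip(resp, counts)) / total_high, eps)
        lam_low = max(sum((1 - r) * k for r, k in zip(resp, counts)) / total_low, eps)
        weight_high = min(max(total_high / len(counts), eps), 1 - eps)
    return lam_low, lam_high, weight_high


# Toy read counts per genomic bin: mostly background, a few enriched bins.
counts = [0, 1, 2, 1, 0, 2, 1, 1, 20, 22, 19, 21]
lam_low, lam_high, weight_high = fit_poisson_mixture(counts)
print(round(lam_low, 2), round(lam_high, 2), round(weight_high, 2))
```

The mixture treats every bin independently, so it ignores the fact that enriched bins tend to be adjacent along the genome.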
Poisson Mixture + HMM
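Adding an HMM on top of the Poisson emissions lets neighboring bins influence each other. The sketch below decodes the most likely state path with the Viterbi algorithm, assuming the rates and transition probabilities are already known. All parameter values and names are hypothetical, and training the HMM (for example with Baum-Welch) is not shown.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability of count k with rate lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_poisson(counts, rates, log_start, log_trans):
    """Most likely state sequence for an HMM with Poisson emissions.

    rates[s]         emission rate of state s
    log_start[s]     log probability of starting in state s
    log_trans[p][s]  log probability of moving from state p to state s
    """
    if not counts:
        return []
    n_states = len(rates)
    score = [log_start[s] + poisson_logpmf(counts[0], rates[s])
             for s in range(n_states)]
    backpointers = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + poisson_logpmf(k, rates[s]))
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical parameters: state 0 = background, state 1 = enriched.
rates = [1.0, 12.0]
log_start = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
counts = [0, 1, 0, 12, 15, 11, 1, 0]
path = viterbi_poisson(counts, rates, log_start, log_trans)
print(path)
```

The "sticky" transition matrix penalizes state changes, so isolated noisy bins are less likely to be called as separate peaks.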
AIC = 2*freedom_degrees - 2*log(likelihood)
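In its standard form the Akaike information criterion is 2k - 2 ln(L), where k is the number of free parameters and L is the maximized likelihood. A lower AIC is better. The sketch below compares two candidate models. The model names, parameter counts, and log-likelihoods are made-up numbers for illustration only.

```python
def aic(num_parameters, log_likelihood):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_parameters - 2 * log_likelihood


# Hypothetical candidates: (number of free parameters, maximized log-likelihood).
candidates = {
    "model_a": (5, -1250.0),
    "model_b": (19, -1228.0),
}

scores = {name: aic(k, ll) for name, (k, ll) in candidates.items()}
best = min(scores, key=scores.get)
for name, score in scores.items():
    print(name, score)
print("preferred:", best)
```

The 2k term penalizes extra parameters, so a larger model is preferred only if its likelihood improves enough to pay for them.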
Akaike information criterion (AIC), introduced by Hirotugu Akaike in 1974.
Chromasig
ChIP-Seq - Chromasig
Chromasig
Tools
Java
R
Big server computations (Linux)
Confluence, Bamboo, Crucible
Continuous integration, tests
JetBrains
JetBrains BioLabs
LabBook - . . Excel. .
Genome query
Genestack Platform - universal collaborative ecosystem for bioinformatics research and development. http://genestack.com
JetBrains BioLabs
[email protected]: oleg_s