Processing and analysis methods for DNA methylation array data
Giovanni Fiorito
PhD Student in Complex Systems for Life Sciences
Department of Medical Sciences
University of Turin, Italy
Outline
Brief introduction to epigenetics, DNA methylation, and genome-wide association studies (GWAS).
Statistical analysis of DNA methylation array data using R
- single marker test,
- multiple marker test.
Applications and interesting results.
GENETIC BACKGROUND
ENVIRONMENTAL EXPOSURE (pollution, occupational exposure, …)
LIFESTYLE (smoking habits, diet, physical activity, obesity, …)
SUSCEPTIBILITY TO DISEASES
AGEING
EPIGENETICS (DNA methylation, gene expression, histone modifications)
“Epigenetics is the study of heritable changes in gene activity that are not caused by changes in the DNA sequence”.
Epigenetics, DNA methylation and environmental exposure
S. Dalì, Ritratto di mio padre, 1921
Bisulfite conversion
Measure Genome-Wide DNA Methylation
A single experiment measures the DNA methylation percentage of ~450k markers simultaneously.
Workflow
Wilhelm-Benartzi, Br. J. Cancer, 2013
Fluorescence intensities are converted into numerical values.
$$\beta = \frac{\mathrm{Meth} + \mathrm{bg}}{\mathrm{Meth} + \mathrm{Unmeth} + \mathrm{bg}}$$
Data pre-processing and analysis: R packages
Wilhelm-Benartzi, Br. J. Cancer, 2013
β vs. M values
$$M = \log_2\left(\frac{\beta}{1 - \beta}\right)$$
- β values are not normally distributed and have severely heteroscedastic variance, but they are easy to interpret biologically.
- M values are approximately normally distributed and have approximately homoscedastic variance.
- It is recommended to use M values for statistical comparisons and β values in final reports.
Pan et al., BMC Bioinformatics, 2013
S. Dalì, Battaglia più di un dente di leone, 1938
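As a minimal sketch of the β to M transform above, the following Python snippet converts in both directions. The function names and the clamping constant `eps` are illustrative assumptions, not part of the slides; in R the same transform is a one-line `log2(beta / (1 - beta))`.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Convert a beta value in [0, 1] to an M value: log2(beta / (1 - beta))."""
    # Clamp away from 0 and 1 so the logit stays finite (eps is an assumption).
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

def m_to_beta(m):
    """Inverse transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)

# Hypothetical beta values for four CpG sites.
betas = [0.05, 0.5, 0.8, 0.95]
m_values = [beta_to_m(b) for b in betas]
for b, m in zip(betas, m_values):
    print(f"beta={b:.2f} -> M={m:.3f}")
```

A β of 0.5 maps to M = 0, and the M scale stretches values near 0 and 1, which is what makes the variance more uniform across the methylation range.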
GWAS (Genome-Wide Association Study)
Maurano, Science, 2012
Single marker test
- About 450k statistical comparisons (t-test, multivariate linear/logistic regression, …).
- R package CpGassoc.
Example: DNA methylation and Multiple Sclerosis susceptibility
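To illustrate what one of the ~450k single-marker comparisons looks like, here is a hedged Python sketch that computes Welch's t statistic per CpG site on M values. The marker names and data are hypothetical, and this is not the CpGassoc implementation; it only shows the per-marker loop.

```python
import math
from statistics import mean, variance

def welch_t(cases, controls):
    """Welch's t statistic and approximate degrees of freedom for two groups."""
    n1, n2 = len(cases), len(controls)
    v1, v2 = variance(cases) / n1, variance(controls) / n2
    t = (mean(cases) - mean(controls)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical M values for two CpG sites, split by disease status.
m_cases = {"cg_a": [1.9, 2.1, 2.0, 2.2], "cg_b": [0.1, -0.2, 0.0, 0.1]}
m_controls = {"cg_a": [1.0, 1.2, 0.9, 1.1], "cg_b": [0.0, 0.1, -0.1, 0.0]}

# One test per marker; a real array repeats this about 450k times.
results = {cpg: welch_t(m_cases[cpg], m_controls[cpg]) for cpg in m_cases}
for cpg, (t, df) in results.items():
    print(f"{cpg}: t={t:.2f}, df={df:.1f}")
```

A regression with covariates would replace the t statistic in practice; the structure (one model per marker, one p-value per marker) stays the same.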
Problem: Multiple testing correction
- More than 20k false positives are expected at α = 0.05 if no correction is applied (450k × 0.05).
- Bonferroni correction: $\alpha_B = 0.05/N \approx 1 \times 10^{-7}$. Too conservative.
- False Discovery Rate (FDR), Holm correction (R function p.adjust, R package fdrtool).
- Permutation is the better way to reduce type I and type II error (R function permutest, R package CpGassoc), but it is computationally expensive.
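The arithmetic behind these bullets can be sketched in a few lines of Python. The functions below are illustrative re-implementations of the Bonferroni threshold and of the Holm and Benjamini-Hochberg (FDR) adjustments; in R, `p.adjust` with `method = "holm"` or `method = "BH"` is the usual route. The p-values are made up.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def holm(pvals):
    """Holm step-down adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply by the number of hypotheses not yet rejected, keep monotone.
        running_max = max(running_max, min(1.0, (n - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for k, i in enumerate(order):
        rank = n - k  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

n_markers = 450_000
print(expected_false_positives(n_markers))  # about 22,500 at alpha = 0.05
print(bonferroni_threshold(n_markers))      # about 1.1e-07

pvals = [0.001, 0.008, 0.039, 0.041, 0.2]   # hypothetical p-values
print(holm(pvals))
print(benjamini_hochberg(pvals))
```

Holm controls the family-wise error rate like Bonferroni but is uniformly less conservative; Benjamini-Hochberg controls the expected proportion of false discoveries instead, which is why it keeps more markers.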
GWAS (Genome-Wide Association Study)
Maurano, Science, 2012
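The permutation approach mentioned for multiple testing can be sketched for a single marker as follows. This is a generic label-shuffling test on the difference in group means, written as an illustration under assumed data; it is not the procedure implemented in CpGassoc, and the number of permutations and the seed are arbitrary choices.

```python
import random
from statistics import mean

def permutation_pvalue(cases, controls, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(cases) - mean(controls))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the pooled values, which is equivalent to shuffling labels.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_cases]) - mean(pooled[n_cases:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical M values at one CpG site.
cases = [1.9, 2.1, 2.0, 2.2]
controls = [1.0, 1.2, 0.9, 1.1]
print(permutation_pvalue(cases, controls))
```

The cost noted on the slide is visible here: every marker needs thousands of reshuffles, so 450k markers multiply the run time accordingly.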
Differential DNA Methylation in Purified Human Blood Cells
Whole blood is a mixture of several cell types.
Each cell type has a specific DNA methylation profile.
Differences in the proportions of whole-blood cell types can introduce bias into the statistical analysis.
In DNA methylation analyses using whole blood, the cell-type-specific pattern should be evaluated beforehand.
Reinius, Epigenetics, 2013
Blood-based profiles of DNA methylation predict the underlying distribution of cell types
Houseman, Epigenetics, 2013
Blood cell type proportions can be predicted from ~500 selected DNA methylation markers.
The R script wbcInference.R is available online.
Cell type proportions should be used as covariates in multivariate linear/logistic regression analysis.
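The idea of inferring cell proportions from marker methylation can be shown with a deliberately simplified two-cell-type toy model in Python. It assumes a whole-blood β value is a weighted average of two reference profiles and solves for the weight by least squares, clipped to [0, 1]. The reference values are invented, and the published method handles several cell types with a constrained projection; this sketch is not wbcInference.R.

```python
def estimate_proportion(mixture, profile_a, profile_b):
    """Least-squares fraction of cell type A in a two-type mixture.

    Model: mixture[j] = p * profile_a[j] + (1 - p) * profile_b[j].
    """
    num = sum((m - b) * (a - b) for m, a, b in zip(mixture, profile_a, profile_b))
    den = sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))
    if den == 0:
        raise ValueError("reference profiles are identical")
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, num / den))

# Hypothetical reference beta values at four cell-type-specific CpGs.
lymphocytes = [0.90, 0.10, 0.80, 0.20]
granulocytes = [0.20, 0.85, 0.30, 0.70]

# Simulated whole-blood sample: 30% lymphocytes, 70% granulocytes.
blood = [0.3 * a + 0.7 * b for a, b in zip(lymphocytes, granulocytes)]

p = estimate_proportion(blood, lymphocytes, granulocytes)
print(f"estimated lymphocyte fraction: {p:.3f}")
```

The estimated fraction per sample is what would then enter the regression model as a covariate.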
Ageing and DNA methylation
Johansson, PLoS One, 2013
DNA methylation and blood cell proportions are strongly age dependent.
Adjusting for blood cell proportions avoids several false-positive associations.
S. Dalì, Uomo anziano al crepuscolo, 1917-1918
Multiple markers test
R package RPMM (Recursively Partitioned Mixture Model)
- Hierarchical clustering of samples based on DNA methylation similarity
- Testing association between different clusters and outcome of interest
Example: Susceptibility to mental disorders
P. Hasmann, Rappresentazione di S. Dalì
DNA methylation influences survival in bladder cancer patients
Tajuddin, Br. J. Cancer, 2014
Different DNA methylation profiles -> different response to therapy.
S. Dalì, La persistenza della memoria, 1931
DNA methylation modulates cardiovascular disease (CVD) risk conferred by intake of B vitamins
Causal mediation analysis can be performed using the R packages mediation and RMediation.
[Diagram: Low B vitamins intake, DNA methylation of One Carbon Metabolism (OCM) genes, Myocardial Infarction risk; paths labelled DE, TE, ME]
Fiorito, Nutr. Metabolism and CVD, 2013
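As a rough illustration of how a mediation analysis decomposes an effect, the Python sketch below uses the classic product-of-coefficients approach with ordinary least squares. It assumes the diagram labels DE, TE and ME stand for direct, total and mediated effect, which the transcript does not spell out. It also assumes continuous variables and linear models, whereas myocardial infarction is a binary outcome, and it does not reproduce the mediation or RMediation packages. All data are invented.

```python
from statistics import mean

def centered_cross(u, v):
    """Sum of cross-products of two variables after centering."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def mediation_effects(x, m, y):
    """Total, direct and mediated effect of x on y through mediator m."""
    sxx, smm = centered_cross(x, x), centered_cross(m, m)
    sxm, sxy, smy = centered_cross(x, m), centered_cross(x, y), centered_cross(m, y)
    a = sxm / sxx                      # slope of m ~ x
    total = sxy / sxx                  # slope of y ~ x
    det = sxx * smm - sxm ** 2         # must be non-zero (x and m not collinear)
    direct = (sxy * smm - smy * sxm) / det   # x coefficient in y ~ x + m
    b = (smy * sxx - sxy * sxm) / det        # m coefficient in y ~ x + m
    return total, direct, a * b

# Hypothetical data: exposure score, methylation level (M value), risk score.
exposure = [0, 1, 2, 3, 4, 5]
methylation = [0.1, 0.4, 0.5, 0.9, 1.0, 1.3]
risk = [1.0, 1.6, 1.9, 2.9, 3.0, 3.8]

te, de, me = mediation_effects(exposure, methylation, risk)
print(f"TE={te:.3f}  DE={de:.3f}  ME={me:.3f}")
```

For linear least-squares models the total effect equals the direct effect plus the mediated effect; with a logistic outcome model this identity no longer holds exactly, which is one reason dedicated packages are used.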
Gene Set Enrichment Analysis (GSEA)
Gene List
Molecular Signatures Database
Pathways Enrichment
The R package GSEA finds biological pathway enrichment based on the hypergeometric distribution test. The online version is easy to use, but its analysis parameters cannot be changed.
S. Dalì, L’enigma senza fine, 1938
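The hypergeometric test named on this slide can be written out directly. The Python sketch below computes the probability of seeing at least the observed overlap between a gene list and a pathway by chance; the function name and the example counts are hypothetical, and this is not the code of the GSEA package.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) for X ~ Hypergeometric.

    n_universe: number of genes that could have been selected
    n_pathway:  genes in the universe that belong to the pathway
    n_selected: size of the gene list being tested
    n_overlap:  genes in the list that belong to the pathway
    """
    total = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 200 significant genes out of 20,000 tested,
# 8 of which fall in a pathway of 100 genes (1 would be expected by chance).
p = hypergeom_enrichment_p(20_000, 100, 200, 8)
print(f"enrichment p-value: {p:.2e}")
```

Because many pathways are tested at once, these p-values need the same multiple-testing correction as the single-marker tests.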
Gene Set Enrichment Analysis (GSEA)
R package graphite
- Converts pathway topology to a gene network.
- An online version is available (graphite web).
Romualdi, BMC Bioinformatics, 2014
Conclusions
DNA methylation is strongly regulated by external exposure and is associated with several complex diseases.
Several R packages have been developed for processing and analysis of genome-wide epigenetics data.
GWAS are a powerful method for identifying novel candidate genes altered in complex diseases.
Acknowledgements
Genomic variation in human population and complex diseases unit. Human Genetics Foundation (HuGeF).
Prof. Giuseppe Matullo
Alessandra Allione
Alessia Russo
Barbara Pardini
Cornelia di Gaetano
Elisabetta Casalone
Fabio Rosa
Federica Modica
Giovanni Fiorito
Simonetta Guarrera