the omics revolution - uh bristol nhs ft · integrating ‘omics’ technologies lifecourse...
THE OMICS REVOLUTION
OMIC: “All constituents considered collectively”
Integrating ‘Omics’ technologies
Lifecourse: development to ageing
Bioinformatics & statistics: data integration, data mining, analysis methods
• SNPs
• CNVs
• mtDNA variants
• Rare variants
• Whole genome sequencing
• DNA methylation
• miRNA expression
• Chromatin analysis
• LCL gene expression
• Tissue specific expression profiles
• NMR metabolites
• One carbon intermediates
• Questionnaires
• Clinical assessment
• Health records
• Recall
• Imaging
• Educational attainment
• Data capture
• Questionnaires
• Environmental monitoring
• Biological sample analysis
• Orbitrap-MS intact protein analysis
TARGETED METABOLOMICS (AND PROTEOMICS)
METABOLOMICS – MEASURING SMALL MOLECULES
Kettunen et al. Genome-wide association study identifies multiple loci influencing human serum metabolite levels. Nature Genetics 2012;44(3):269-276.
Varbo et al. Remnant cholesterol as a causal risk factor for ischemic heart disease. J Am Coll Cardiol 2013;61:427-36.
PLEIOTROPIC EFFECTS OF LIPID-RELATED SNPS UTILISED IN MR STUDIES
Wurtz P, et al. Lipoprotein Subclass Profiling Reveals Pleiotropy in the Genetic Variants of Lipid Risk Factors for Coronary Heart Disease – A Note on Mendelian Randomization Studies, submitted.
GENETIC EFFECTS VS. CROSS-SECTIONAL OBSERVATIONS
LONGITUDINAL CHANGES VS. CROSS-SECTIONAL OBSERVATIONS
UNTARGETED METABONOMICS
“ANONYMOUS MENDELIAN RANDOMIZATION”
Select participants by extremes of genetic risk prediction scores
Samples from ages before disease develops
Untargeted metabolomic analysis
Anonymous metabolite peaks identified that differ between groups categorized by allele scores
GWAS of these peaks
Relate (in e.g. GWAS consortia) peak-associated SNPs with disease
Follow up by characterising the peaks which triangulate through MR analysis
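The first step above (selecting participants at the extremes of an allele score) can be sketched as follows. This is a minimal illustration, not the method used in the studies cited: the SNP names, weights, genotypes and the `fraction` cut-off are all hypothetical, and the score is assumed to be a simple weighted sum of risk-allele counts.

```python
from typing import Dict, List, Tuple


def allele_score(genotype: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele counts (0, 1 or 2) over the scored SNPs."""
    return sum(weights[snp] * genotype[snp] for snp in weights)


def select_extremes(
    scores: Dict[str, float], fraction: float = 0.1
) -> Tuple[List[str], List[str]]:
    """Return (low_group, high_group): bottom and top `fraction` of participants."""
    ranked = sorted(scores, key=lambda pid: scores[pid])
    k = max(1, int(len(ranked) * fraction))  # always keep at least one per group
    return ranked[:k], ranked[-k:]


# Hypothetical weights and genotypes, for illustration only.
weights = {"rs_a": 0.2, "rs_b": 0.5}
genotypes = {
    "p1": {"rs_a": 0, "rs_b": 0},
    "p2": {"rs_a": 1, "rs_b": 0},
    "p3": {"rs_a": 2, "rs_b": 1},
    "p4": {"rs_a": 2, "rs_b": 2},
    "p5": {"rs_a": 0, "rs_b": 1},
}

scores = {pid: allele_score(g, weights) for pid, g in genotypes.items()}
low, high = select_extremes(scores, fraction=0.2)
print("low-score group:", low)
print("high-score group:", high)
```

The two groups would then be the ones compared in the untargeted metabolomic analysis.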
MICROBIOME
Qin J et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 2010;464:59-67
THE DIET-MICROBE MORBID UNION
Rak K et al. Nature 2011; 472:40
EPIGENOME
Epigenetics: the confused epidemiologist's friend.
Davey Smith G Int. J. Epidemiol. 2012;41:303-308
ACCESSIBLE RESOURCE FOR INTEGRATED EPIGENOMIC STUDIES (ARIES)
Genome-wide DNA methylation analysis in a longitudinal cohort study (ALSPAC): 1000 mother-child pairs at 5 time points across the life course (HM450)
Linked to extensive associated data on genetics, exposures and phenotypes
Methylome sequencing (BS-seq*): 10 mother-child pairs at 5 time points across the life course(s)
Tissue-specific (and paired blood) genome-wide DNA methylation analysis: muscle, adipose, liver, cartilage, skin, brain, fetal tissues
HM450 analysis and BS-seq*
Data integration
Data visualisation and browsing
Reference resource for ALSPAC (and other?) case-control studies. Additional nested case-control studies in progress and planned (conduct problems, rheumatoid arthritis, asthma, breast cancer, assisted conception etc.)
* Agilent SureSelect targeted BS-seq
SERIAL SAMPLES FROM THE SAME INDIVIDUALS
http://www.ariesepigenomics.org.uk/
TISSUE SPECIFIC ANALYSIS
Adult tissues: Adrenal gland; Adipose; Bone; Brain (frontal cortex, temporal cortex, hippocampus, substantia nigra, dorsal raphe nucleus, putamen, hypothalamus, amygdala, cerebellum); Cartilage; Eye; Gonad (ovary, testis); Heart; Intestine (colon); Kidney; Liver; Lung; Muscle; Palate; Pancreas; Skin; Spleen; Stomach; Thyroid
Fetal tissues: Adrenal gland; Bone (rib cage); Brain (fore, mid, cerebellum); Eye; Gonad (ovary, testis); Heart; Intestine (large, small); Kidney; Liver; Lung; Muscle (leg); Pancreas; Skin; Spleen; Stomach; Thymus; Umbilical cord; Chorionic villi; Yolk sac
BIOINFORMATICS INFRASTRUCTURE
BIOINFORMATICS WORKFLOW
1) Windows scheduled task, 00:00: push data to epi-linus
2) Cron job, 03:00:
• Checks for new files
• Copies IDATs to repository
3) Epi-linus invokes PHP script to generate manifest files using LIMS data
4) R: run lumi script
• p-value plot
• QC probe plots
• Dump out control probe betas
5) R: run minfi script
• PDF QC report
6) Epi-linus pushes QC plots to LIMS interface
7) Perl: parses QC probe data; connects to LIMS database
8) LIMS MySQL: receives QC data; website updated
9) Archive:
• Manifest files
• Logs
Hosts: Lab/Scanner PC, EPI-LINUS, LIMS VM
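Step 2 of the workflow (check for new files, copy IDATs to the repository) could look roughly like the sketch below. It is an assumption-laden illustration rather than the actual ARIES script: the directory layout, the file names and the function `copy_new_idats` are hypothetical, and "new" is taken to mean "not yet present in the repository under the same name".

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def copy_new_idats(incoming: Path, repository: Path) -> List[str]:
    """Copy .idat files that are in `incoming` but not yet in `repository`.

    Returns the names of the files copied on this run.
    """
    repository.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(incoming.glob("*.idat")):
        dest = repository / src.name
        if not dest.exists():  # skip files archived by an earlier run
            shutil.copy2(src, dest)
            copied.append(src.name)
    return copied


# Demonstration in throwaway directories (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    incoming = Path(tmp) / "incoming"
    repository = Path(tmp) / "repository"
    incoming.mkdir()
    (incoming / "chipA_Grn.idat").write_bytes(b"demo")
    (incoming / "chipA_Red.idat").write_bytes(b"demo")
    (incoming / "notes.txt").write_text("not an IDAT")

    first_run = copy_new_idats(incoming, repository)
    second_run = copy_new_idats(incoming, repository)

print("first run copied:", first_run)
print("second run copied:", second_run)
```

Because files already in the repository are skipped, the function can be rerun nightly without duplicating work.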
VISUALISATION AND COMPARISON OF EPIGENETIC AND RELATED DATA
• Genome browser visualisation
• Comparison with other data sources
• Comparison of tissues, mother-child pairs, time points, age groups, genotype groups, etc
• Alignment with gene expression and SNP data
INTEGRATION WITH DATA FROM OTHER SOURCES
• ENCODE: http://genome.ucsc.edu/ENCODE/ - especially methylation, open chromatin etc.
• Published GWAS results: http://www.genome.gov/gwastudies/ (also UCSC track)
• OMIM: http://www.ncbi.nlm.nih.gov/omim/ (also UCSC track)
• Other UCSC tracks: http://genome.ucsc.edu/cgi-bin/hgGateway
– CpG Islands (and possibly other "Regulation" tracks), %GC
• RoadMap Epigenomics tracks: http://www.ncbi.nlm.nih.gov/epigenomics/
• MethylomeDB: http://epigenomics.columbia.edu/methylomedb/index.html
• PubMeth: http://www.pubmeth.org/ (and de novo lit mining in non-cancer areas?)
• Imprinted Genes database: http://igc.otago.ac.nz/home.html
• MethDB: http://www.methdb.de/
• DiseaseMeth: http://bioinfo.hrbmu.edu.cn/diseasemeth
• MicroRNA target database: http://www.microrna.org/microrna/home.do or http://www.targetscan.org/index.html
• ChromDB: http://www.chromdb.org/
• GEO: http://www.ncbi.nlm.nih.gov/geo/
• KEGG: http://www.genome.jp/kegg/
• STRING: http://string-db.org/
ARIES GENOME BROWSER (PROTOTYPE)
A TWO-STEP APPROACH
[Figure: two panels, each with nodes Exposure, CpG and Phenotype. (A) Step 1 adds SNP 1; (B) Step 2 adds SNP 2.]
Relton CL, Davey Smith G, Two step epigenetic Mendelian randomization: a strategy for establishing the causal role of epigenetic processes in pathways to disease. Int J Epidemiol 2012;41:161-176
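As a worked illustration of the two-step idea, the sketch below applies a Wald ratio (SNP-outcome association divided by SNP-exposure association) at each step: in step 1 a SNP acts as a proxy for the exposure and the outcome is CpG methylation; in step 2 a second SNP acts as a proxy for CpG methylation and the outcome is the phenotype. The Wald ratio is one standard Mendelian randomization estimator and is used here as an assumption; the slides do not state which estimator was applied, and all numbers are invented.

```python
def wald_ratio(beta_snp_outcome: float, beta_snp_exposure: float) -> float:
    """Wald ratio estimate of the effect of the exposure on the outcome.

    beta_snp_outcome:  association of the SNP with the outcome
    beta_snp_exposure: association of the SNP with the exposure (must be non-zero)
    """
    if beta_snp_exposure == 0:
        raise ValueError("SNP must be associated with the exposure")
    return beta_snp_outcome / beta_snp_exposure


# Step 1 (invented numbers): SNP 1 -> exposure, outcome = CpG methylation.
exposure_on_cpg = wald_ratio(beta_snp_outcome=-0.02, beta_snp_exposure=0.5)

# Step 2 (invented numbers): SNP 2 -> CpG methylation, outcome = phenotype.
cpg_on_phenotype = wald_ratio(beta_snp_outcome=0.03, beta_snp_exposure=-0.1)

print("step 1, exposure -> CpG estimate:", exposure_on_cpg)
print("step 2, CpG -> phenotype estimate:", cpg_on_phenotype)
```

A real analysis would also need standard errors and checks of the instrument assumptions, which this sketch leaves out.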
EPIGENETIC MEDIATION OF SMOKING AND CARDIOVASCULAR DISEASE
Identified through EWAS of smoking on DNA methylation
Smoking associated with methylation of the coagulation factor II receptor-like 3 gene (F2RL3)
F2RL3 methylation linked to expression of the protease-activated receptor-4 (PAR4)
PAR4 induces platelet activation (aggregation)
Plausible mechanism of smoking-induced CVD
↑ smoking → ↓ methylation F2RL3 → ↑ expression PAR4 → ↑ platelet aggregation → ↑ risk CVD
Breitling LP et al. Am J Hum Genet 2011 (N=177 discovery, N=316 replication)
SMOKING, F2RL3 METHYLATION AND PROGNOSIS IN STABLE CORONARY HEART DISEASE
Breitling LP et al. Eur Heart J Apr 2012
Methylation level   Secondary CVD event   CVD mortality      Non-CVD mortality   All-cause mortality
>0.74 (Q4)          1 (ref)               1 (ref)            1 (ref)             1 (ref)
0.74 (Q3)           0.92 (0.57-1.50)      1.06 (0.46-2.46)   1.55 (0.59-4.12)    1.26 (0.67-2.37)
0.67 (Q2)           1.14 (0.71-1.84)      1.89 (0.85-4.16)   2.33 (0.88-6.19)    2.07 (1.12-3.83)
0.54 (Q1)           1.40 (0.83-2.36)      3.49 (1.51-8.04)   5.36 (1.93-14.8)    4.19 (2.20-8.00)
Cox’s regression model of F2RL3 (CpG_4) methylation and prognosis in stable coronary heart disease
SMOKING AND CARDIOVASCULAR DISEASE
[Figure: two-step Mendelian randomization diagrams with nodes G, E, X, Y and U. (A) Step 1: CHRN3/5 SNP, Smoking, F2RL3 CpG, CVD, U. (B) Step 2: F2RL3 SNP, Smoking, F2RL3 CpG, CVD, U.]
INDUCED PLURIPOTENT STEM CELLS
IPSC IN POPULATION HEALTH SCIENCES
RESOURCE FOR INDUCED PLURIPOTENT STEM CELL STUDIES
Establish an accessible and sustainable biobank of iPSC from ALSPAC participants
Establish collaborative links for the differentiation of iPSC into desired cell lineages using standardised protocols
Fuel research into:
The biology of pluripotency
Disease modelling
Drug screening
Cell-based therapies
Functional characterisation of GWAS hits (select individuals by genetic load/allele score for in-depth molecular phenotyping)
RE-PROGRAMMING OF EBV TRANSFORMED LYMPHOBLASTOID CELL LINES
HARVARD: Episomal plasmid method
BRISTOL: Lentiviral method
LONDON: Heterokaryon method
NEWCASTLE: Sendai virus method
Reprogramme → Assess pluripotency → Assess removal of EBV genome → Differentiate into desired lineage
DATA INTEGRATION
MAKING THE MOST OF MULTI-DIMENSIONAL DATA - INTEGRATION AND DATA MINING
MAJOR NEW DEVELOPMENTS IN OMICS IN BRISTOL
MRC Integrative Epidemiology Unit (IEU)
Epigenomics lab
Metabolomics facility
ALSPAC Strategic Award
Epigenomics programme
Expansion of biobanking provision
Building on ARIES
Bioinformatics growth
MRC High-throughput sequencing ‘omics’ funding proposal
iPSC
Project grants
MRC IEU
MRC Integrative Epidemiology Unit (IEU)
Scientific Structure
EPIGENOMICS LAB
To support a wide range of activities in epigenomics research
To follow up on observational associations through functional studies; to validate differentially methylated regions seen in studies using the Illumina HM450K array
To assess functional consequences of differential methylation
Manipulation of DNA methylation levels
Gene expression
METABOLOMICS FACILITY
To establish capacity for high throughput analysis of multiple metabolites from serum samples using NMR spectroscopy
Led by Prof Mika Ala-Korpela
Housed in School of Chemistry
Other metabolomic initiatives are being piloted (e.g. mass spectrometry of serum, and of histone proteins to measure epigenomic profiles)
ALSPAC STRATEGIC AWARD
CORE PROGRAMME FUNDING RENEWAL
To enhance ALSPAC to incorporate detailed annotation of epigenomic features and additional epigenomic profiling.
To exploit the unique scientific opportunities afforded by ALSPAC to identify epigenetic signatures of exposure, track their persistence over time and across generations, and evaluate their relationship with development and disease.
To assess the genetic contribution to DNA methylation variation through leadership of a Genome-Wide Association Study Consortium.
LC-MS/MS method development for high throughput histone modification analysis. Analysis of 100 samples.
miRNA profiling of 100 samples.
EXPANSION OF BIOBANKING PROVISION
MRC NSHD 1946 cohort
Millennium Cohort Study
Southall And Brent REvisited (SABRE)
Born in Bradford Study
Cleft Collective
Head & Neck 5000
Sub-samples of other established cohort studies (e.g. MOBA, Copenhagen City Heart Study)
Other funding applications (e.g. Bio.ME)
...
BUILDING ON ARIES
Bioinformatics growth
ARIES-Explorer (BBSRC £1M)
Links with EBI for more rapid and widespread omics data release, in line with other large scale international omics projects
MRC High-Throughput Sequencing and ‘Omics’ profiling funding bid
Large scale profiling of ALSPAC LCLs (£2.2M)
Project grants
Lots, commonly requesting additional HM450K profiling of selected samples plus validation using pyrosequencing
Non-ALSPAC studies requesting HM450K analysis
Lymphoblastoid cell lines, n=500. Existing data: GWAS; HM450k peripheral blood; WGS (UK10K)
Sample preparation:
• Epigenomics: histone preparation, 20µg
• Transcriptomics: RNA extraction, 10µg
• Epigenomics: DNA extraction, 6µg
Assays:
• Methylation: HM450k, 1µg/sample
• Methylation: BS-seq, 5µg/sample
• Expression: RNA-seq, 10µg/sample
• Histone PTM: ChIP-seq, 4xIP, 4µg/sample
• Histone PTM: LC-MS/MS, 7.5µg/sample
• Future phenotyping: LCL storage
Providers:
• In house: ALSPAC Laboratory
• Outsource: Babraham Institute
• Outsource: UoB Chemistry
• Outsource: MRC HTS Hub GenePool, Edinburgh
ENHANCEMENT OF ALSPAC THROUGH LARGE SCALE OMIC PROFILING OF LYMPHOBLASTOID CELL LINES
MRC High Throughput Sequencing and Omics Profiling.
Sept 2013-March 2014
PUMP-PRIMING FUNDING FOR IPSC WORK
1. Initial pilot and feasibility studies on iPSC derivation from LCLs. In progress.
2. Optimization of the necessary work flow and quality control procedures within the ALSPAC labs.
3. Establishment of differentiation of iPSC into various lineages and comparison with fibroblast derived iPSC.
4. Development of links with the European Bioinformatics Institute (EBI) for omics data management.
5. Exemplar ‘case studies’ to illustrate the value and utility of an ALSPAC iPSC resource. Samples selected based on genetic load, LCLs reprogrammed to iPSC, relevant lineages derived and functional studies conducted:
Schizophrenia
Glucose trafficking
Psoriasis
PROGRAMMES, PROJECTS AND OTHER ACTIVITIES
Metabolomic profiling of 35,000 ALSPAC serum samples at various ages
UK10K: Whole Genome Sequence data on 2,000 ALSPAC individuals
Is DNA methylation influenced by...
Alcohol, smoking, heavy metals, folate and related metabolites, lipids, glucose, insulin, assisted reproductive technologies, social deprivation, ... etc
Does DNA methylation play a role in...
Cardiovascular disease, respiratory function, diabetes, eating disorders, depression, neurodevelopment, autism, intellectual disability, ... etc
ACKNOWLEDGEMENTS
Caroline Relton
Peter Wurtz