Big Data Opportunities and Challenges in Human Disease Genetics & Genomics. Manolis Kellis, MIT Computer Science & Artificial Intelligence Laboratory, Broad Institute of MIT and Harvard

Upload: jerome-goodwin

Post on 24-Dec-2015

Page 1: Big Data Opportunities and Challenges in Human Disease Genetics & Genomics Manolis Kellis MIT Computer Science & Artificial Intelligence Laboratory Broad

Big Data Opportunities and Challenges in Human Disease Genetics & Genomics

Manolis Kellis

MIT Computer Science & Artificial Intelligence Laboratory

Broad Institute of MIT and Harvard

Page 2

Big data opportunities & challenges in human disease genetics & genomics

• The goal: Mechanistic basis of human disease
– Epigenomics: Enhancers, networks, regulators, motifs
– Genetics: GWAS, QTLs, molecular epidemiology

• The challenges / opportunities:
– Effects are very small, huge number of hypotheses
– Much larger cohorts are needed, consent limitations
– Technologies for privacy vs. excuse for data hoarding

• Overcoming the challenges:
– Case study: Schizophrenia, Alzheimer’s
– Collaboration & sharing: personal & technological

Page 3

Bridging the knowledge gap from genetics to disease

[Figure: diagram linking a genetic variant (sequence CATGACTGCATGCCTG) and environment to disease. Labels: Chromatin states / control regions (Promoter, Enhancer, Insulator, Silencer); Circuitry; Tissue / cell type (Retina, Heart, Cortex, Lung, Blood, Skin, Nerve); Factors; Protein; miRNA; ncRNA; Target genes; TIMP3; Intermediate effects (Lipids, Tension, Eye drusen, Metabolism, Drug response).]

• Requires: systematic understanding of genome function

Page 4

The most complete map of human gene regulation

• 2.3M regulatory elements across 127 tissue/cell types
• High-resolution map of individual regulatory motifs
• Circuitry: regulators → regions → motifs → target genes

Page 5

Non-coding variants lie in tissue-specific regulatory regions

• Yield new insights on relevant tissues and pathways
• Enable linking non-coding elements to relevant target genes
• Provide a mechanistic basis for developing therapeutics

Page 6

Control regions harbor 1000s of weak-effect disease SNPs

• GWAS top hits explain only a small fraction of trait heritability
• Functional enrichments persist well past the genome-wide significance threshold

Page 7

Bayesian integration of weak effects → disease modules

• MAZ: no direct association, but clusters with many T1D hits
• MAZ is indeed a known regulator of insulin expression

[Figure legend: Disease gene; Genetic association; Disease SNP; Highly ranked SNP nearby; Poorly ranked SNP nearby]

Page 8

Brain methylation changes in Alzheimer’s patients

• Variation in methylation patterns largely genotype driven
• Global signature of repression in 1000s of regulatory regions: hypermethylation, enhancer states, brain regulator targets

[Figure: data sets. Genotype (1M SNPs x 700 individuals); Methylation (450k probes x 700 individuals); Reference chromatin states; Tissue: dorsolateral PFC; Cohorts: MAP (Memory and Aging Project) + ROS (Religious Orders Study)]
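
The data dimensions on this slide give a feel for the "huge number of hypotheses" challenge. The following is a minimal sketch, not the analysis behind the slide: it assumes (hypothetically) that every SNP is tested against every methylation probe and that a Bonferroni correction at a family-wise error rate of 0.05 is applied. Real meQTL analyses often restrict testing to nearby (cis) SNP-probe pairs, which reduces the burden considerably. The function name `bonferroni_threshold` is illustrative.

```python
# Dimensions taken from the slide; the testing scheme below is an assumption.
N_SNPS = 1_000_000    # genotyped SNPs
N_PROBES = 450_000    # methylation probes


def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


# Hypothetical worst case: test every SNP against every probe.
all_pairs = N_SNPS * N_PROBES
print(f"SNP-probe pairs: {all_pairs:.2e}")
print(f"per-test threshold: {bonferroni_threshold(all_pairs):.2e}")
```

Under these assumptions the per-test threshold is several orders of magnitude stricter than the conventional GWAS threshold of 5e-08, which is one reason weak effects require large cohorts to detect.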

Page 9

Big data opportunities & challenges in human disease genetics & genomics

• The goal: Mechanistic basis of human disease
– Epigenomics: Enhancers, networks, regulators, motifs
– Genetics: GWAS, QTLs, molecular epidemiology

• The challenges / opportunities:
– Effects are very small, huge number of hypotheses
– Much larger cohorts are needed, consent limitations
– Technologies for privacy vs. excuse for data hoarding

• Overcoming the challenges:
– Case study: Schizophrenia, Alzheimer’s
– Collaboration & sharing: personal & technological

Page 11

Scaling of QTL discovery power with sample size

• Number of meQTLs continues to increase linearly
• Weak-effect meQTLs: median R² < 0.1 after 400 individuals
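
A back-of-the-envelope power calculation shows why weaker QTLs keep becoming detectable as the cohort grows. The sketch below is not the analysis behind the slide. It assumes a single SNP-probe pair tested by correlation, uses the Fisher z approximation, and takes a hypothetical per-test threshold of 5e-08. The function name `qtl_power` and the example R² of 0.05 are illustrative choices.

```python
from math import atanh, sqrt
from statistics import NormalDist


def qtl_power(r2: float, n: int, alpha: float = 5e-8) -> float:
    """Approximate two-sided power to detect a correlation explaining r2 of the variance.

    Uses the Fisher z-transform: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)        # critical value for the chosen threshold
    shift = atanh(sqrt(r2)) * sqrt(n - 3)       # expected test statistic under the effect
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Power for a weak effect (R^2 = 0.05, an assumed value) at increasing cohort sizes.
for n in (100, 200, 400, 700):
    print(f"n={n}: power={qtl_power(0.05, n):.3f}")
```

With these assumed values, power is negligible at 100 individuals and substantial at 700, so each increase in sample size brings a new tier of weak-effect QTLs within reach.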

Page 12

Inflection point in complex trait GWAS

[Figure: plot with years 2006 to 2015 on the x-axis and a y-axis running from 0 to 120. Annotations: WCPG Hamburg 2012 (~65K); Freeze Jan. 2013 (~70K); Incl. SWE + CLOZUK (~60K); Freeze May 2013 (~80K); Incl. replication (~100K)]

Page 13

Schizophrenia GWAS: Number of significant loci

• 3,500 cases: 0 loci
• 10,000 cases: 5 loci
• 35,000 cases: 62 loci!

Page 14

Similar inflection point found in every complex trait!

Significantly associated regions (p < 5e-08)

Sample size   Adult height       Crohn’s            Schizophrenia
              (per 5000/5000)    (per 1000/1000)    (per 3000/3000)
1x            0                  2                  1
2x            2                  4                  2
3x            7                  5                  6
9x            68                 51                 62
18x           180                -                  -

Same story in:
• Type 1 diabetes
• Type 2 diabetes
• Serum cholesterol level
• Every common chronic disease
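
The table can be read as evidence that discoveries grow faster than sample size once a trait passes its inflection point. The following small sketch uses only the counts in the table. The helper name `fold_changes` and the choice to compare the 3x and 9x rows are illustrative assumptions.

```python
# Significant loci (p < 5e-08) by sample-size multiple, copied from the slide's table.
# None marks cells shown as "-" on the slide.
LOCI = {
    "Adult height":  {1: 0, 2: 2, 3: 7, 9: 68, 18: 180},
    "Crohn's":       {1: 2, 2: 4, 3: 5, 9: 51, 18: None},
    "Schizophrenia": {1: 1, 2: 2, 3: 6, 9: 62, 18: None},
}


def fold_changes(counts: dict, start: int, end: int) -> tuple:
    """Return (sample fold change, loci fold change) between two sample-size multiples."""
    return end / start, counts[end] / counts[start]


# Compare the 3x and 9x rows: a 3-fold increase in samples for every trait.
for trait, counts in LOCI.items():
    sample_fold, loci_fold = fold_changes(counts, 3, 9)
    print(f"{trait}: {sample_fold:.0f}x samples -> {loci_fold:.1f}x loci")
```

For all three traits, tripling the sample size between these rows yields roughly a ten-fold increase in significant loci.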

Larger samples lead to new biological insights

• Proof that schizophrenia is a heritable, medical disorder
• Genetic architecture similar to non-brain diseases and traits
• Many genes → recognition of key pathways and processes
• Voltage-gated calcium channels (CACNA1C, CACNA1D, CACNA1I, CACNB2)
• Proteins interacting with FMRP, fragile X gene
• Neuron organization: Postsynaptic density, dendritic spine heads
• Enhancers: brain (angular gyrus, inferior temporal lobe), immune

Page 15

Big data opportunities & challenges in human disease genetics & genomics

• The goal: Mechanistic basis of human disease
– Epigenomics: Enhancers, networks, regulators, motifs
– Genetics: GWAS, QTLs, molecular epidemiology

• The challenges / opportunities:
– Effects are very small, huge number of hypotheses
– Much larger cohorts are needed, consent limitations
– Technologies for privacy vs. excuse for data hoarding

• Overcoming the challenges:
– Collaboration, consortia, sharing of datasets
– Case study: Schizophrenia, Alzheimer’s