1
The Ultimate Genotyping Experiment:
Determination of Human DNA Sequences
Dept. of MCD Biology
Institute for Behavioral Genetics
Center for Adolescent Drug Dependence
University of Colorado
2
Overview
Why DNA sequencing will become the tool of
choice to study genotype/phenotype
relationships for heritable traits
What is the current technology that makes it work
How to make whole genome sequencing affordable
for large-scale genetic studies
3
State of the art: Association studies via
Genome Wide Association
Strategy: survey 10^5 - 10^6 well-known single
nucleotide polymorphisms (SNPs) in large
populations - score for co-variation with trait
– “Skims” genetic variation and can allow
correlation with trait of interest
– Only 5-10% of the ~10 million common SNPs
– Based on “common allele, common disease”
model
4
When it is good, it is very, very good
• Hundreds of successful GWAS for
several phenotypes (e.g. diabetes,
hypertension, asthma, height, obesity)
• Depends a lot on “power”, which
increases with the number of people
studied
• Also depends critically on phenotype
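How power grows with sample size can be sketched with a normal approximation for a single-SNP test on a quantitative trait. The genome-wide threshold of 5e-8, the 0.1% of variance explained, and the function name are illustrative assumptions, not values from the talk.

```python
from statistics import NormalDist

def gwas_power(n, var_explained, alpha=5e-8):
    """Approximate power of a single-SNP test on a quantitative trait.

    Uses a two-sided z-test whose mean shift is sqrt(n * q / (1 - q)),
    where q is the fraction of trait variance explained by the SNP.
    alpha is an assumed genome-wide significance threshold.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (n * var_explained / (1 - var_explained)) ** 0.5
    # Probability that the test statistic lands beyond either threshold.
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower

for n in (1_000, 10_000, 100_000):
    print(n, round(gwas_power(n, var_explained=0.001), 3))
```

Under these assumptions a SNP explaining 0.1% of the variance is essentially undetectable in 1,000 people and almost always detected in 100,000.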
5
Example: Blood lipids
• In an analysis by a group led by Gonçalo
Abecasis, U. of Michigan
– combined 41 samples, >100,000 genotypes
– Phenotype: Fasting lipids
(LDL, HDL, triglycerides)
– No medicated people studied
– 2.5 x 10^6 SNPs (typed + imputed); MAF >= 1%
• Identified 95 loci that associate with lipid levels
6
How good is this? Very, very good.
• OMIM had reported 18 genes affecting lipids
– 15 of them within 100kb of GWAS hit
– 8 within 10kb
• Computer simulations with randomized alleles
averaged <1 gene within 100kb, and not 1 in 10^6
simulations had more than 8.
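The flavor of such a null comparison can be sketched with a much simpler stand-in: scatter 95 loci and 18 genes at random along a single 3 Gb line and count how many genes land within 100 kb of a locus. The linear genome, the uniform placement, and all names below are assumptions made for illustration; this is not the randomization the study used.

```python
import random

GENOME_LENGTH = 3_000_000_000   # assumed: one linear 3 Gb genome
N_LOCI = 95                     # GWAS loci reported on the slide
N_GENES = 18                    # OMIM lipid genes reported on the slide
WINDOW = 100_000                # 100 kb

def genes_near_random_loci(rng):
    """Count genes within WINDOW of at least one randomly placed locus."""
    loci = [rng.randrange(GENOME_LENGTH) for _ in range(N_LOCI)]
    genes = [rng.randrange(GENOME_LENGTH) for _ in range(N_GENES)]
    return sum(any(abs(g - l) <= WINDOW for l in loci) for g in genes)

def simulate(n_sims=1000, seed=0):
    """Return (mean count, largest count) over n_sims random placements."""
    rng = random.Random(seed)
    counts = [genes_near_random_loci(rng) for _ in range(n_sims)]
    return sum(counts) / n_sims, max(counts)

mean_hits, max_hits = simulate()
print(f"mean genes within 100 kb: {mean_hits:.3f}, max in any run: {max_hits}")
```

Even this crude null keeps the average well below one gene, which is the scale against which the observed 15 of 18 should be read.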
7
Does it mean anything?
• One GWAS allele (40% frequency) was
found to be in GALNT2 (a glycosylation gene)
• Allele changes HDL-C by only +/- 1 mg/dl
Teslovich et al., in press
8
Does it mean anything?
• One GWAS allele (40% MAF) was found
to be in GALNT2 (a glycosylation gene)
• Allele changes HDL-C by only +/- 1 mg/dl
• In mouse:
– Overexpression decreases HDL-C ~20%
– Knockdown increases HDL-C ~30%
• So clearly this gene, which had no known
role in lipid metabolism, CAN be important
9
But when it is bad, it is horrid
Successful studies can account for only a
fraction of the genetic influence on
phenotypic variance for most behavioral
traits despite high heritability – why?
• Genes with high influence may be lacking
• Phenotypes inappropriately defined
• Insufficient N
• Inability to study rare variants
meaningfully
10
Sequencing can do genotyping
A LOT better
• Whole genome sequencing types ALL
polymorphisms - rare and common
• GWAS done with sequencing has no
“missing” data, unlike chip-based methods
• Linkage (and LD) are not required for
association. More “straightforward”
analysis
• May eventually be cheaper per marker
11
Sharpening tools
Efforts to begin large scale DNA sequencing
to increase power to detect genes are now
being piloted
1000 Genomes Project is pointing the way
– Moderate frequency alleles – low pass
sequencing (low cost/person)
– Rare variants – deep sequencing (high cost)
12
Digression on Sequencing
• Rates of acquisition of DNA sequence
have gone through the roof
• Accuracy has improved and costs have
plummeted
• Most of the progress is due to determination
of the reference human sequence combined
with technological advances in short-read
technologies
13
Sequencing is affordable!
14
Sequencing is affordable!
15
Sequencing is affordable!
[Figure: Publications by year using Illumina sequencing methods, 2007-2010]
16
Three prominent technologies
• Illumina - based on solid phase PCR
approach (similar to SOLiD system)
• 454 - similarly based on solid phase DNA
synthesis using a highly processive process
• PacBio - based on true single-molecule
detection approach - MOST processive
17
Illumina
18
Illumina
19
Illumina
20
Roche - 454
21
PacBio
• Single molecule of DNA at a time
• 75,000 reads simultaneously
• 1000-10000 bases/read
22
What is good for what?
| | 454 | SOLiD | Illumina | PacBio |
|---|---|---|---|---|
| Method | DNA Pol synthesis | Ligase | PCR with fluorescent dNTPs | DNA Pol, special NTPs |
| Medium | Beads | Beads | Glass surface | Optical well |
| Error types | Indels at homopolymers | End errors | End errors | Random indels |
| Bases/read | 400-1000 | 50 | 100+100 | >1000 |
| Most common use | Metagenomics | Resequencing/de novo | Resequencing/de novo | De novo microbial genomes |
23
What can these babies do?
• Illumina, SOLiD, 454 can deal with most of these methods:
– Tag profiling
– Small RNA discovery
– mRNA-Seq
– Methylation
– Targeted resequencing
– CNV
– DNase I hypersensitivity
– Metagenomics
– ChIP-Seq
– ChIA-PET
– Bacterial sequencing
– Human genome resequencing
– Nucleosome mapping
– Molecular cytogenetics
– De novo sequencing
24
What can these babies do?
• Illumina HiSeq 2000 - popular for genotyping
| | HiSeq 2000 |
|---|---|
| Read length | 2 x 100 |
| Yield / run | 250 Gb |
| Runs / genome | 1/2 |
| Depth | 54.9x |
| SNPs | 4,232,886 |
| 550k GT coverage | 99.8% |
| Genotype concordance | 99.3% |
25
How are bases called given finite errors?
• One can sequence many times (e.g. 5x coverage)
5’-ACTGGTCGATGCTAGCTGATAGCTAGCTAGCTGATGAGCCCGATCGCTGCTAGCTCGACG-3’
Reference Genome
             GCTAGCTGATAGCTAGCTAGCTGATGAGCCCGA
                AGCTGATAGCTAGCTAGCTGATGAGCCCGATCGCTG
           ATGCTAGCTGATAGCTAGCTAGCTGATGAGCC
                     ATAGCTAGATAGCTGATGAGCCCGATCGCTGCTAGCTC
               TAGCTGATAGCTAGATAGCTGATGAGCCCGAT
Sequence Reads
Predicted Genotype ?
26
How are bases called given finite errors?
5’-ACTGGTCGATGCTAGCTGATAGCTAGCTAGCTGATGAGCCCGATCGCTGCTAGCTCGACG-3’
Reference Genome
             GCTAGCTGATAGCTAGCTAGCTGATGAGCCCGA
                AGCTGATAGCTAGCTAGCTGATGAGCCCGATCGCTG
           ATGCTAGCTGATAGCTAGCTAGCTGATGAGCC
                     ATAGCTAGATAGCTGATGAGCCCGATCGCTGCTAGCTC
               TAGCTGATAGCTAGATAGCTGATGAGCCCGAT
Sequence Reads
P(reads|A/A , read mapped)= 0.00000098
P(reads|A/C , read mapped)= 0.03125
P(reads|C/C , read mapped)= 0.000097
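These three likelihoods are consistent with a simple binomial error model: at the one site where the reads disagree, three of the five reads show C and two show A, and each base call is wrong with probability 0.01. The 1% error rate, the assumption that a miscall always yields the other allele, and the function name are inferred from the numbers rather than stated on the slide. A minimal sketch:

```python
def genotype_likelihood(n_a, n_c, a_alleles, error=0.01):
    """P(reads | genotype) at a site where reads show either A or C.

    n_a, n_c  : number of reads showing A and C
    a_alleles : copies of A in the genotype: 2 (A/A), 1 (A/C) or 0 (C/C)
    error     : assumed per-base miscall probability (A read as C or C as A)
    """
    if a_alleles == 2:
        p_read_a = 1 - error   # an A read is a correct call
    elif a_alleles == 1:
        p_read_a = 0.5         # either allele is sampled with equal chance
    else:
        p_read_a = error       # an A read must be a miscall
    return (p_read_a ** n_a) * ((1 - p_read_a) ** n_c)

for label, copies in (("A/A", 2), ("A/C", 1), ("C/C", 0)):
    print(label, genotype_likelihood(n_a=2, n_c=3, a_alleles=copies))
```

With two A reads and three C reads this gives about 9.8e-7, 0.03125 and 9.7e-5, matching the values shown.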
27
How are bases called
5’-ACTGGTCGATGCTAGCTGATAGCTAGCTAGCTGATGAGCCCGATCGCTGCTAGCTCGACG-3’
Reference Genome
             GCTAGCTGATAGCTAGCTAGCTGATGAGCCCGA
                AGCTGATAGCTAGCTAGCTGATGAGCCCGATCGCTG
           ATGCTAGCTGATAGCTAGCTAGCTGATGAGCC
                     ATAGCTAGATAGCTGATGAGCCCGATCGCTGCTAGCTC
               TAGCTGATAGCTAGATAGCTGATGAGCCCGAT
Sequence Reads
Individual Based Prior: Every site has 1/1000 probability of varying.
P(reads|A/A)= 0.00000098 Prior(A/A) = 0.00034 Posterior(A/A) = <.001
P(reads|A/C)= 0.03125 Prior(A/C) = 0.00066 Posterior(A/C) = 0.175
P(reads|C/C)= 0.000097 Prior(C/C) = 0.99900 Posterior(C/C) = 0.825
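The posterior column follows from Bayes' rule: multiply each likelihood by its prior and divide by the sum over the three genotypes. The sketch below reuses the numbers from the slide; the function and variable names are illustrative.

```python
def posteriors(likelihoods, priors):
    """Bayes' rule over a fixed set of genotypes (dicts keyed by genotype)."""
    joint = {g: likelihoods[g] * priors[g] for g in likelihoods}
    total = sum(joint.values())
    return {g: joint[g] / total for g in joint}

likelihoods = {"A/A": 0.00000098, "A/C": 0.03125, "C/C": 0.000097}
individual_prior = {"A/A": 0.00034, "A/C": 0.00066, "C/C": 0.99900}

post = posteriors(likelihoods, individual_prior)
for genotype, p in post.items():
    print(genotype, round(p, 3))
```

Although the heterozygote has by far the highest likelihood, the strong prior toward the reference genotype leaves C/C as the most probable call.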
28
How are bases called
• Individual Based Prior
• Assumes all sites have an equal probability of showing polymorphism
• Specifically, assumption is that about 1/1000 bases differ from reference
• If reads were error free and sampling Poisson …
• … 14x coverage would allow for 99.8% genotype accuracy
• … 30x coverage of the genome needed to allow for errors and clustering
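One reading of the 14x figure (an interpretation, not spelled out on the slide) is that a heterozygote is called correctly only if each of its two alleles is seen at least once. With error-free reads and Poisson depth, each allele receives an independent Poisson number of reads with mean equal to half the coverage:

```python
import math

def het_call_accuracy(mean_coverage):
    """Chance that both alleles of a heterozygote are seen at least once.

    Assumes error-free reads and Poisson-distributed depth, so each allele
    is covered by an independent Poisson(mean_coverage / 2) number of reads.
    """
    p_allele_missed = math.exp(-mean_coverage / 2)
    return (1 - p_allele_missed) ** 2

for coverage in (4, 14, 30):
    print(coverage, round(het_call_accuracy(coverage), 4))
```

At 14x this comes to about 99.8%, in line with the figure quoted; real data need more because of sequencing errors and uneven coverage.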
29
What if....
5’-ACTGGTCGATGCTAGCTGATAGCTAGCTAGCTGATGAGCCCGATCGCTGCTAGCTCGACG-3’
Reference Genome
             GCTAGCTGATAGCTAGCTAGCTGATGAGCCCGA
                AGCTGATAGCTAGCTAGCTGATGAGCCCGATCGCTG
           ATGCTAGCTGATAGCTAGCTAGCTGATGAGCC
                     ATAGCTAGATAGCTGATGAGCCCGATCGCTGCTAGCTC
               TAGCTGATAGCTAGATAGCTGATGAGCCCGAT
Sequence Reads
Population Based Prior: Use frequency information from examining others at the same site. In the example above, we estimated P(A) = 0.20
P(reads|A/A)= 0.00000098 Prior(A/A) = 0.04 Posterior(A/A) = <.001
P(reads|A/C)= 0.03125 Prior(A/C) = 0.32 Posterior(A/C) = 0.999
P(reads|C/C)= 0.000097 Prior(C/C) = 0.64 Posterior(C/C) = <.001
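The population-based priors are the Hardy-Weinberg genotype frequencies for P(A) = 0.20. The sketch below derives them and applies Bayes' rule to the likelihoods shown; the function names are illustrative. Plugging in these rounded likelihoods gives a posterior for A/C of roughly 0.994, slightly below the 0.999 shown, so the slide's posterior column may come from a slightly different calculation. The call is the same either way.

```python
def hardy_weinberg_prior(freq_a):
    """Genotype priors from the population frequency of allele A."""
    freq_c = 1 - freq_a
    return {
        "A/A": freq_a ** 2,
        "A/C": 2 * freq_a * freq_c,
        "C/C": freq_c ** 2,
    }

def posteriors(likelihoods, priors):
    """Bayes' rule over a fixed set of genotypes (dicts keyed by genotype)."""
    joint = {g: likelihoods[g] * priors[g] for g in likelihoods}
    total = sum(joint.values())
    return {g: joint[g] / total for g in joint}

likelihoods = {"A/A": 0.00000098, "A/C": 0.03125, "C/C": 0.000097}
population_prior = hardy_weinberg_prior(0.20)
post = posteriors(likelihoods, population_prior)

print({g: round(p, 2) for g, p in population_prior.items()})
print(max(post, key=post.get))
```

Knowing that the A allele is common in the population is enough to turn the same five reads into a confident heterozygous call.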
31
How are bases called
Population Based Prior
• Uses frequency information obtained from examining other individuals
• Calling very rare polymorphisms still requires 20-30x coverage of the genome
• Calling common polymorphisms requires much less data
Haplotype Based Prior or Imputation Based Analysis
• Compares individuals with similar flanking haplotypes
• Calling very rare polymorphisms still requires 20-30x coverage of the genome
• Can make accurate genotype calls with 2-4x coverage of the genome
• Accuracy improves as more individuals are sequenced
32
What good is using statistics?
400 Deep Genomes (30x - current cost ~$4,000,000 - 2nd quarter ~$2,000,000)

| | 0.5-1% | 1-2% | 2-5% |
|---|---|---|---|
| Discovery Rate | 100% | 100% | 100% |
| Het. Accuracy | 100% | 100% | 100% |
| Effective N | 400 | 400 | 400 |

3000 Shallow Genomes (4x - current cost ~$4,000,000 -> $2,000,000)

| | 0.5-1% | 1-2% | 2-5% |
|---|---|---|---|
| Discovery Rate | 100% | 100% | 100% |
| Het. Accuracy | 90.4% | 97.3% | 98.8% |
| Effective N | 2406 | 2758 | 2873 |

This would cover essentially ALL 10 x 10^6 common SNPs
in 2800 individuals. Affy cost now - $1,000,000 for 1/10
the number of common SNPs.
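Both designs represent the same total amount of sequence (400 x 30 = 3000 x 4 = 12,000 genome-equivalents), which is why their costs match; the shallow design simply spreads it over 7.5 times as many people. A quick sketch of the arithmetic, using the slide's figures in an illustrative layout:

```python
# Figures taken from the slide; the dictionary layout is illustrative.
designs = {
    "deep (30x)": {
        "genomes": 400, "coverage": 30,
        "cost_now": 4_000_000, "cost_later": 2_000_000,
    },
    "shallow (4x)": {
        "genomes": 3000, "coverage": 4,
        "cost_now": 4_000_000, "cost_later": 2_000_000,
    },
}

def summarize(design):
    """Total sequencing effort and per-person cost for one study design."""
    return {
        "genome_equivalents": design["genomes"] * design["coverage"],
        "per_person_now": design["cost_now"] / design["genomes"],
        "per_person_later": design["cost_later"] / design["genomes"],
    }

for name, design in designs.items():
    print(name, summarize(design))
```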
33
Genotyping by sequencing
• Even with current technology, sequencing
can be an alternative to chip genotyping
• Costs ~$5K per person for deep coverage
• Costs ~$800 per person for 4X coverage
• Using HapMap + knowledge about alleles,
can study all SNPs with MAF > 1-2%
34
Does sequencing help?
• Too soon to be sure - but probably so
• Best work is in the 1000 Genomes Project
– 2 deeply sequenced trios
– 179 whole genomes sequenced at low coverage
– 8,820 exons deeply sequenced in 697 individuals
15M SNPs, 1M indels, 20,000 structural variants
35
Some highlights
[Figure: Reduced diversity extending ~120kb around genes]
36
Allele frequency spectrum
37
Does sequencing improve association:
Expression QTL example TIMM22
38
Does sequencing improve association:
Expression QTL example TIMM22
39
Does sequencing improve association:
Expression QTL example TIMM22
40
Imputation
• Given detailed allele distribution data, it is
possible to “guess” genotypes based on
HapMap/neighboring markers
• Improves with better understanding of
allele distribution
• Allows conversion of Affy/Illumina chip
data into more detailed SNP information
electronically
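The idea of "guessing" untyped markers from neighboring ones can be shown with a deliberately tiny haploid toy: pick the reference haplotype that agrees best at the typed positions and copy its alleles into the gaps. The panel, the `?` convention for untyped sites, and the function name are made up for illustration; real imputation tools use probabilistic haplotype models rather than a single best match.

```python
def impute(observed, reference_haplotypes):
    """Fill '?' positions in `observed` from the closest reference haplotype.

    Closeness is the number of typed positions that agree.
    """
    def matches(hap):
        return sum(o == h for o, h in zip(observed, hap) if o != "?")

    best = max(reference_haplotypes, key=matches)
    return "".join(h if o == "?" else o for o, h in zip(observed, best))

# Hypothetical reference panel of three haplotypes over five markers.
panel = ["ACGTA", "ATGCC", "GCGTT"]

# A chip typed markers 1, 3 and 5; markers 2 and 4 are untyped.
print(impute("A?G?C", panel))
```

A richer reference panel makes it more likely that some haplotype matches the flanking markers well, which is why accuracy improves as more individuals are sequenced.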
41
Imputation
Imputation accuracy (r2) by reference panel

| Reference Panel | Release Date | MAF 1-3% | MAF 3-5% | MAF >5% |
|---|---|---|---|---|
| 1000G Pilot (final) | June 2010 | ~0.69 | ~0.77 | ~0.91 |
| 280 EUR (draft) | November 2010 | ~0.73 | ~0.78 | ~0.92 |
• As more samples are sequenced, ability to impute individual SNPs improves
• As more samples are sequenced, it becomes possible to impute additional markers
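The r2 figures are squared correlations between imputed and true genotypes. A minimal sketch of that calculation, assuming genotypes are coded as 0, 1 or 2 copies of an allele and imputed values are fractional dosages (the example data are made up):

```python
def r_squared(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    return cov * cov / (var_t * var_d)

# Hypothetical data for one SNP in six people.
truth = [0, 1, 2, 1, 0, 2]
dosage = [0.1, 0.9, 1.8, 1.2, 0.0, 1.9]
print(round(r_squared(truth, dosage), 3))
```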
42
Status of 1000 Genomes
• 25,487,060 variant sites called on 629 samples
– 7,922,125 sites in dbSNP 129
– 17,564,935 sites not in dbSNP 129
– 98.8% of HapMap III sites rediscovered
– Transition/transversion ratio of 2.21 vs 2.04 in Pilot
• As of November 2010:
– 1103 sequenced samples
– 22.6 Tb of raw sequence data
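The transition/transversion ratio is a common sanity check on a set of SNP calls. A minimal sketch of how it can be computed from (reference, alternate) base pairs; the example SNP list is made up, and the final check only confirms that the two dbSNP counts above add up to the total.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES

def ti_tv_ratio(snps):
    """Transitions divided by transversions for (ref, alt) pairs, ref != alt."""
    transitions = sum(is_transition(ref, alt) for ref, alt in snps)
    transversions = len(snps) - transitions
    return transitions / transversions

# Hypothetical calls: four transitions and two transversions.
snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "C"), ("G", "T")]
print(ti_tv_ratio(snps))

# The known and novel site counts on the slide sum to the reported total.
assert 7_922_125 + 17_564_935 == 25_487_060
```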