population genetics and molecular evolution · 2019-07-22 · bachelor’s degree in bioinformatics...
TRANSCRIPT
Prof. Antonio Barbadilla
Bachelor’s Degree in Bioinformatics
Population Genetics and Molecular Evolution - Session 5. Evolutionary Inference
Course 2017-18
Session 5.
Evolutionary Inference
Methods for the detection of selection at the DNA level
Test of neutrality
Antonio Barbadilla
Group Genomics, Bioinformatics & Evolution
Institut Biotecnologia I Biomedicina
Departament de Genètica i Microbiologia
UAB
Population Genetics and Molecular Evolution
Casillas, S. and A. Barbadilla. 2017. Molecular Population Genetics. Genetics 205: 1003–1035
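Tajima’s D test (1989)
If segregating variants are neutral, the two estimators of θ have the same expectation: π (Tajima) = θ_W (Watterson)
Deviation statistic: Tajima’s D
In the absence of recombination, D follows a distribution with mean 0 and variance 1
• D ≠ 0: non-neutral pattern
• D < 0: purifying (negative) selection or recent selective sweep
• D > 0: balancing selection
In the absence of recombination, |D| > 1.8 rejects the null hypothesis at α = 0.05
$\theta_W = \dfrac{S/m}{\sum_{i=1}^{n-1} 1/i}$
$\pi = \dfrac{1}{\binom{n}{2}\, m} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} k_{ij}$
$\text{Tajima's } D = \dfrac{\pi - \theta_W}{\sqrt{\mathrm{Var}(\pi - \theta_W)}}$
where n = number of sequences, m = number of sites, S = number of segregating sites, and k_ij = number of differences between sequences i and j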
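The slide gives D as a standardized difference between the two estimators but does not spell out the variance term. As a sketch, the estimator from Tajima (1989) can be written with per-sequence quantities, where $\hat{k}$ is the mean number of pairwise differences ($\pi = \hat{k}/m$) and $S/a_1$ is the per-sequence Watterson estimate ($\theta_W = S/(a_1 m)$). The constant names $a_1, a_2, b_1, b_2, c_1, c_2, e_1, e_2$ follow the usual convention and do not appear on the slide.

```latex
D = \frac{\hat{k} - S/a_1}{\sqrt{e_1 S + e_2 S (S - 1)}}

a_1 = \sum_{i=1}^{n-1} \frac{1}{i}, \qquad
a_2 = \sum_{i=1}^{n-1} \frac{1}{i^2}

b_1 = \frac{n + 1}{3 (n - 1)}, \qquad
b_2 = \frac{2 (n^2 + n + 3)}{9 n (n - 1)}

c_1 = b_1 - \frac{1}{a_1}, \qquad
c_2 = b_2 - \frac{n + 2}{a_1 n} + \frac{a_2}{a_1^2}

e_1 = \frac{c_1}{a_1}, \qquad
e_2 = \frac{c_2}{a_1^2 + a_2}
```

Because the sequence length m scales the numerator and the standard deviation equally, D takes the same value whether it is computed per site or per sequence.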
Sequence FOXP2
Inheritance pattern: autosomal dominant
Chromosome 7, 7q31
Gene: FOXP2
Lai et al. (2000) Am. J. Hum. Genet. 67, 357–367
Protein evolution?
Orangutan  Gorilla  Chimpanzee  Humans
0          0        0           2
Lai et al. (2001) A forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413(6855): 519–523
Enard, W. et al. (2002) Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418, 869–872
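Tajima’s D test in FOXP2 gene
In the absence of recombination, |D| > 1.8 rejects the null hypothesis at α = 0.05
$\text{Tajima's } D = \dfrac{\pi - \theta_W}{\sqrt{\mathrm{Var}(\pi - \theta_W)}} = -2.20$
• 20 individuals
• Sequence: 14,000 bp around the gene
• 47 SNPs
$\pi = 7.9 \times 10^{-4}$
$\theta_W = 3.0 \times 10^{-3}$
$\theta_W > \pi$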
Population differentiation: FST test (1989)
Elevation in FST at multiple SNPs in a 3.2-Mb region around the LCT gene. Samples: Europeans, Africans and Asians. The percentile is computed over all SNPs in the genome.
Bersaglieri, T. et al. 2004. Genetic Signatures of Strong Recent Positive Selection at the Lactase Gene. The American Journal of Human Genetics 74(6): 1111–1120
LCT gene -> enzyme lactase
lactose -> glucose + galactose
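The slide reports FST values without defining the statistic. A minimal sketch, assuming Wright's definition FST = (H_T − H_S) / H_T for a single biallelic SNP with equally weighted populations. The function name and the allele frequencies are hypothetical, and this is not necessarily the estimator used by Bersaglieri et al. (2004).

```python
def fst_biallelic(freqs):
    """FST for one biallelic SNP.

    freqs: frequency of the same allele in each population
    (populations are weighted equally, an assumption of this sketch).
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Expected heterozygosity in the pooled (total) population.
    h_t = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / k
    if h_t == 0:
        return 0.0  # monomorphic site: no differentiation to measure
    return (h_t - h_s) / h_t


# Hypothetical frequencies of one allele in three populations.
print(fst_biallelic([0.8, 0.1, 0.1]))  # strong differentiation
print(fst_biallelic([0.5, 0.5, 0.5]))  # identical frequencies give 0
```

A locally adapted allele that is common in one population and rare in the others produces a high FST at that SNP, which is the pattern the figure ranks against the rest of the genome.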
Population diversity (polymorphism)
Selective sweep (footprint) on neutral variation linked to a beneficial mutation
Vitti JJ, Grossman SR, Sabeti PC. 2013. Detecting natural selection in genomic data. Annual Review of Genetics 47: 97–120
Selective tests based on linkage disequilibrium (LD)
Test* | Compares | References
Based on linkage disequilibrium:
EHH | Measurement of the decay of the association between alleles at various distances from a locus | Sabeti et al. (2002)
LHR | Test to search for alleles of high frequency with long-range linkage disequilibrium | Sabeti et al. (2002)
iHS | Test to search for alleles under positive selection between shared haplotypes | Voight et al. (2006)
* EHH, Extended Haplotype Homozygosity; LHR, Long Haplotype Range; iHS, Integrated Haplotype Score; XP, Cross Population; CLR, Composite Likelihood Ratio.
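The table describes EHH only as the decay of association with distance from a locus. A minimal sketch, assuming the usual definition: the probability that two randomly chosen chromosomes carrying the core allele are identical from the core out to a given site. The function name and the 0/1 haplotype strings are hypothetical and are not taken from Sabeti et al. (2002).

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, core_allele, end):
    """Extended haplotype homozygosity from site `core` out to site `end`.

    haplotypes: equal-length strings, one character per SNP.
    Only haplotypes carrying `core_allele` at index `core` are used.
    """
    carriers = [h for h in haplotypes if h[core] == core_allele]
    n = len(carriers)
    if n < 2:
        return float("nan")  # homozygosity is undefined for fewer than 2 carriers
    lo, hi = min(core, end), max(core, end)
    # Group carriers by their haplotype over the interval [lo, hi].
    counts = Counter(h[lo:hi + 1] for h in carriers)
    # Fraction of carrier pairs that are identical over the interval.
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


# Toy haplotypes: the core SNP is at index 0 and "1" is the allele of interest.
haps = ["1100", "1100", "1101", "1110", "0011", "0001"]
for site in range(4):
    print(site, ehh(haps, core=0, core_allele="1", end=site))
```

EHH starts at 1 at the core and falls as recombination and mutation break up the haplotypes. An allele that rose in frequency quickly keeps a high EHH over a longer distance than expected for its frequency.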
Test using LD and haplotype structure
Estimates of elevation of ρexcess at multiple SNPs in a 3.2-Mb region around the LCT gene. Samples: Europeans, Africans and Asians. ρexcess is a measure of the increase in linkage disequilibrium. The percentile is computed over all SNPs in the genome.
Bersaglieri, T. et al. 2004. Genetic Signatures of Strong Recent Positive Selection at the Lactase Gene. The American Journal of Human Genetics 74(6): 1111–1120
LCT gene -> enzyme lactase
lactose -> glucose + galactose
Recent human adaptation
Fan S, Hansen MEB, Lo Y, Tishkoff SA. 2016. Going global by adapting local: A review of recent human adaptation. Science 354(6308): 54–59.
Exercise: Estimate Tajima’s D for each data set. Suggest potential selective scenarios for the estimated values.
Sequence data set 1
1. A G C G T T C T G C T C G
2. A G A G T T C T G C T C G
3. A G C T T T A T G C T C G
4. A G A G T T C T G C T C G
5. A G A G T T A T G C T C G
6. A G C G T T C T G C T C G
7. A G C T T T A T G C T C G
8. A G C G T T A T G C T C G
9. A G A T T T A T G C T C G
10. A G A G T T A T G C T C G
11. A G A G T T C T G C T C G
12. A G C T T T A T G C T C G
13. A G A G T T C T G C T C G
14. A G C T T T A T G C T C G
Sequence data set 2
1. A G C G T T C T G C T C G
2. A G C G T T C T G C T C G
3. A G C G T T C T G C T C G
4. T G C G T T C T G C T C G
5. A G C G T T C T G C T C G
6. A G C G T T C T G C T A G
7. A G G G T T C T G C T C G
8. A G C G T T C T G C T C G
9. T G C G T T C T G C T C G
10. A G C G T T A T G C T C G
11. A G C G T T C T G C T C G
12. A G C G T T C T G C T C G
13. A G C G G T C T G C C C G
14. A G C G T T C T G C T C G
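One way to check a hand calculation for this exercise is sketched below. It assumes each data set is read as 14 aligned sequences of 13 sites, and it uses the variance constants from Tajima (1989), which the slides do not spell out. The function name `tajimas_d` and the returned dictionary keys are illustrative choices.

```python
from itertools import combinations
from math import sqrt

DATA_SET_1 = [
    "AGCGTTCTGCTCG", "AGAGTTCTGCTCG", "AGCTTTATGCTCG", "AGAGTTCTGCTCG",
    "AGAGTTATGCTCG", "AGCGTTCTGCTCG", "AGCTTTATGCTCG", "AGCGTTATGCTCG",
    "AGATTTATGCTCG", "AGAGTTATGCTCG", "AGAGTTCTGCTCG", "AGCTTTATGCTCG",
    "AGAGTTCTGCTCG", "AGCTTTATGCTCG",
]
DATA_SET_2 = [
    "AGCGTTCTGCTCG", "AGCGTTCTGCTCG", "AGCGTTCTGCTCG", "TGCGTTCTGCTCG",
    "AGCGTTCTGCTCG", "AGCGTTCTGCTAG", "AGGGTTCTGCTCG", "AGCGTTCTGCTCG",
    "TGCGTTCTGCTCG", "AGCGTTATGCTCG", "AGCGTTCTGCTCG", "AGCGTTCTGCTCG",
    "AGCGGTCTGCCCG", "AGCGTTCTGCTCG",
]


def tajimas_d(seqs):
    """Return S, per-site pi, per-site Watterson theta and Tajima's D."""
    n, m = len(seqs), len(seqs[0])
    # S: number of alignment columns with more than one nucleotide.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    # k: mean number of differences over all n(n-1)/2 sequence pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    result = {"S": s, "pi": k / m, "theta_w": s / a1 / m, "D": None}
    if s == 0:
        return result  # D is undefined without segregating sites
    # Variance constants from Tajima (1989).
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    result["D"] = (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
    return result


for name, data in (("data set 1", DATA_SET_1), ("data set 2", DATA_SET_2)):
    print(name, tajimas_d(data))
```

Under these assumptions, data set 1 has 3 segregating sites, all at intermediate frequency, and D comes out at roughly +2.0. Data set 2 has 6 segregating sites, mostly singletons, and D comes out at roughly −1.7.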
Exercises
Readings & Videos
Readings
• Casillas, S. and A. Barbadilla. 2017. Molecular Population Genetics. Genetics 205: 1003–1035 (read pages 1012–1018)
• Nielsen, R. and M. Slatkin. 2013. Chapter 9
• Kliman, R. 2008. The EvolGenius Population Genetics Computer Simulation: How it Works. Nature Education 1(3): 7
• Vitti JJ, Grossman SR, Sabeti PC. 2013. Detecting natural selection in genomic data. Annual Review of Genetics 47: 97–120
• Fan S, Hansen MEB, Lo Y, Tishkoff SA. 2016. Going global by adapting local: A review of recent human adaptation. Science 354(6308): 54–59
Videos / Online resources
• Introduction to Genetics and Evolution - Coursera - Prof. Mohamed Noor