biol335: genetic selection
DESCRIPTION
Course material for: http://www.canterbury.ac.nz/courseinfo/GetCourseDetails.aspx?course=BIOL335
Measuring genetic selection
Paul Gardner
September 19, 2014
Paul Gardner Measuring genetic selection
Main questions
- Any questions from the last lecture?
- How can we measure selection?
- How much of the human genome is under selection?
One of the biggest surprises in the human genome sequence...
- The draft human genome sequence was published in 2001
- The haploid human genome contains approximately 20,000 protein-coding genes
- Before that, the textbooks predicted 100,000 protein-coding genes
- C. elegans has ≈ 20,470 genes, Drosophila has ≈ 15,682 genes, baker’s yeast has ≈ 6,607 genes and E. coli has ≈ 5,000 genes
- Just ≈ 2% of the human genome is protein coding
A conclusion from comparing the human and chimp genomes
- “One of Clark and colleagues’ findings is that human enzymes for amino-acid breakdown (catabolism) have been under positive selection. This is concordant with the generally high proportion of meat (and thus protein) in the human diet, at least in comparison with the more herbivorous chimpanzee and gorilla. The increased capacity to break down amino acids is not surprising in another respect. For example, failure to catabolize phenylalanine has several adverse effects, including brain damage. Overall, the finding lends support to theories that an increased proportion of meat in the diet of early humans was important for an increase in brain size. Regardless of that, there could also be ethical implications. If early humans ate meat ’naturally’, then for example being vegetarian could be considered a personal choice rather than a universal ethical decision.”
Penny (2004) Evolutionary biology: Our relative genetics. Nature
How much is under “selection”...
- Genetic selection favours some genetic variation within a population over others.
  - Negative selection
  - Positive selection
- 3% to 8% of the human genome appears to be under negative selection (8.2% according to Rands et al. (2014))
Siepel et al. (2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Research
How can we measure “selection”?
Idea 1: compare rates of synonymous and non-synonymous mutations
- Compare the number of non-synonymous and synonymous mutations: dN/dS (also called the Ka/Ks ratio or ω)
# STOCKHOLM 1.0
#33 unique RNA sequences, 1 peptide sequence
#=GR PR1      G..A..D..V..T..H..P..P..A..G..D..
#=GR PR3      GlyAlaAspValThrHisProProAlaGlyAsp
platypus      GGAGCAGACGTCACTCACCCCCCAGCCGGAGAT
opossum       GGAGCAGATGTTACTCACCCTCCTGCTGGAGAT
sloth         GGAGCAGACGTCACACACCCTCCCGCGGGGGAT
armadillo     GGAGCAGACGTCACGCACCCTCCGGCAGGGGAT
tenrec        GGGGCCGACGTCACGCACCCCCCTGCGGGCGAT
elephant      GGAGCGGATGTCACACACCCGCCTGCGGGGGAT
shrew         GGCGCAGATGTCACGCATCCTCCAGCAGGGGAC
hedgehog      GGAGCAGATGTCACACACCCCCCAGCAGGAGAT
megabat       GGAGCAGATGTCACACACCCTCCTGCAGGAGAT
microbat      GGAGCAGATGTCACCCACCCCCCTGCAGGGGAC
dog           GGAGCGGATGTCACACACCCCCCAGCCGGGGAC
cat           GGAGCCGATGTCACGCACCCCCCAGCAGGGGAT
horse         GGAGCGGATGTCACACACCCTCCGGCAGGGGAT
pika          GGAGCAGATGTCACTCACCCTCCAGCTGGGGAT
rabbit        GGTGCAGATGTCACACACCCCCCAGCTGGAGAT
squirrel      GGAGCAGATGTCACTCACCCTCCAGCGGGAGAT
guinea_pig    GGAGCAGATGTCACACACCCACCAGCGGGAGAT
mouse         GGAGCAGATGTCACTCATCCGCCTGCTGGGGAC
rat           GGAGCAGATGTCACTCATCCACCTGCTGGGGAT
kangaroo_rat  GGAGCAGATGTTACACACCCTCCAGCAGGGGAT
tree_shrew    GGCGCAGACGTCACGCACCCCCCGGCCGGGGAT
human         GGAGCGGATGTCACACACCCCCCAGCAGGGGAT
tarsier       GGTGCTGATGTCACACACCCCCCTGCAGGGGAT
marmoset      GGAGCAGATGTCACACACCCACCAGCAGGGGAT
zebrafinch    GGAGCAGATGTCACTCACCCTCCCGCCGGGGAT
green_anole   GGGGCAGACGTCACTCACCCGCCAGCCGGGGAC
xenopus       GGAGCAGATGTTACACACCCACCTGCTGGTGAT
pufferfish    GGTGCGGATGTTACTCATCCTCCTGCTGGTGAT
fugu          GGGGCTGATGTTACTCACCCTCCAGCTGGTGAT
stickleback   GGTGCAGACGTCACACATCCTCCAGCGGGTGAT
medaka        GGTGCCGATGTCACTCATCCTCCTGCCGGGGAC
zebrafish     GGGGCAGATGTTACACACCCGCCGGCTGGTGAT
lamprey       GGTGCCGATGTGACACACCCTCCAGCGGGAGAC
//
https://en.wikipedia.org/wiki/Genetic_code
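The synonymous versus non-synonymous distinction can be sketched in a few lines of Python. This is only a rough illustration, not a proper dN/dS estimator: it counts codons that differ between two aligned, gap-free coding sequences and does not normalise by the number of synonymous and non-synonymous sites or correct for multiple substitutions, as real methods do. The function names are hypothetical; the two sequences are the human and lamprey rows of the alignment, and the standard genetic code is assumed.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codons(seq):
    """Split a coding sequence into complete codons (trailing bases are dropped)."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def translate(seq):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in codons(seq))


def count_codon_changes(seq_a, seq_b):
    """Count differing codons that keep (synonymous) or change
    (non-synonymous) the encoded amino acid."""
    synonymous = 0
    non_synonymous = 0
    for codon_a, codon_b in zip(codons(seq_a), codons(seq_b)):
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


human = "GGAGCGGATGTCACACACCCCCCAGCAGGGGAT"
lamprey = "GGTGCCGATGTGACACACCCTCCAGCGGGAGAC"

print(translate(human))                     # GADVTHPPAGD
print(count_codon_changes(human, lamprey))  # (7, 0)
```

Seven of the eleven codons differ between human and lamprey, yet every change is synonymous, which is the pattern expected for a peptide under strong negative selection.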
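Interpreting dN/dS
- dN and dS near zero is a bugger! The ratio becomes unstable, and is undefined when dS = 0.
- log10((dN+δ)/(dS+δ)), where δ is a small constant added to both values
- log10(dN/dS + δ), restricted to cases where dS > 0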
[Figure: histogram of log10((dN+δ)/(dS+δ)); x-axis: log10((dN+δ)/(dS+δ)), from −6 to 6; y-axis: frequency, from 0 to 200]
[Figure: histogram of log10(dN/dS+δ); x-axis: log10(dN/dS+δ), from −3 to 2; y-axis: frequency, from 0 to 200]
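The two log-ratio transformations on this slide can be written as a short Python sketch. The function names and the default δ = 0.001 are assumptions made for illustration; the slides do not state which value of δ was used.

```python
from math import log10

DELTA = 1e-3  # assumed value of the small constant δ; not given in the slides


def log_ratio_delta_both(dn, ds, delta=DELTA):
    """log10((dN + δ) / (dS + δ)): defined even when dN or dS is zero."""
    return log10((dn + delta) / (ds + delta))


def log_ratio_delta_after(dn, ds, delta=DELTA):
    """log10(dN/dS + δ): only defined for dS > 0, otherwise returns None."""
    if ds <= 0:
        return None
    return log10(dn / ds + delta)


# Hypothetical (dN, dS) pairs, chosen to include the awkward near-zero cases.
for dn, ds in [(0.0, 0.0), (0.0, 0.5), (0.02, 0.4), (0.3, 0.0)]:
    print(dn, ds, log_ratio_delta_both(dn, ds), log_ratio_delta_after(dn, ds))
```

Adding δ to both terms keeps every gene in the analysis but pulls pairs with dN and dS both near zero towards a log ratio of 0. Adding δ after the division keeps the ratio itself untouched but discards every case with dS = 0.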
Problems with dN/dS
Another idea, look for fast vs slow evolving regions
Siepel et al. (2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Research
Finding extreme levels of conservation
- ELAVL4 (HuD) gene, an RNA-binding gene associated with paraneoplastic encephalomyelitis sensory neuropathy and homologous to Drosophila genes with established roles in neurogenesis and sex determination
Siepel et al. (2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Research
- See also: ultraconserved elements
Finding “accelerated evolution”
Pollard et al. (2006) An RNA gene expressed during cortical development evolved rapidly in humans. Nature
Hubisz & Pollard (2014) Exploring the genesis and functions of Human Accelerated Regions sheds light on their role in human evolution. Current Opinion in Genetics & Development
Idea 2: look at population variation
- Allele/SNP frequencies
  - Alleles associated with harmful traits decrease in frequency while those associated with beneficial traits become more common.
- Genome-wide association studies (GWAS)
- Selective sweeps
- Tajima’s D (roughly, the difference between observed numbers of SNPs and the expected)
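As a rough illustration of Tajima’s D, the Python sketch below computes the statistic from a set of aligned sequences sampled from one population, using the standard formula: the mean number of pairwise differences (π) minus the number of segregating sites divided by a1, scaled by an estimate of the standard deviation. The function name and the toy sequences are hypothetical, and the sketch assumes equal-length sequences with no gaps or missing data.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences.

    Returns None when D is undefined (fewer than 3 sequences,
    or no segregating sites).
    """
    n = len(seqs)
    if n < 3:
        return None
    n_pairs = n * (n - 1) / 2

    seg_sites = 0  # S: number of variable alignment columns
    pi = 0.0       # mean number of pairwise differences
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) < 2:
            continue
        seg_sites += 1
        differing_pairs = (n * n - sum(c * c for c in counts.values())) / 2
        pi += differing_pairs / n_pairs
    if seg_sites == 0:
        return None

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return None
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical toy samples: every variant rare vs. every variant common.
rare_variants = ["AAA", "TAA", "ATA", "AAT"]
common_variants = ["AAA", "AAA", "TTT", "TTT"]
print(tajimas_d(rare_variants))    # negative: excess of rare alleles
print(tajimas_d(common_variants))  # positive: excess of intermediate-frequency alleles
```

A negative D indicates more rare variants than expected under neutrality, which is consistent with a recent selective sweep or population expansion. A positive D indicates an excess of intermediate-frequency variants, which is consistent with balancing selection or population contraction.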
Idea 3:
- Transposon-free regions
Simons et al. (2006) Transposon-free regions in mammalian genomes. Genome Res.
The End