phylogenomics “the intersection of phylogenetics and genomics” the reconstruction of...
Post on 13-Jan-2016
Phylogenomics
• “The intersection of phylogenetics and genomics”
• The reconstruction of evolutionary relationships by comparing sequences of whole genomes or portions of genomes
• Several potential methods/strategies to discuss
• We will focus on:
– Ultraconserved element phylogenetics
– Transposable element phylogenetics
– RADSeq
– PhylomeDB
Phylogenomics
• UltraConserved Elements
• UCEs
• Bejerano et al. Science 304:1321-1325
• “481 segments longer than 200 base pairs (bp) that are absolutely conserved (100% identity with no insertions or deletions) between orthologous regions of the human, rat, and mouse genomes”
• “Nearly all of these segments are also conserved in the chicken and dog genomes, with an average of 95 and 99% identity, respectively. Many are also significantly conserved in fish”
• “more than 5000 sequences of over 100 bp that are absolutely conserved among the three sequenced mammals”
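The definition quoted above (100% identity, no insertions or deletions, over a minimum length) can be sketched as a scan over a multiple alignment. This is only an illustration, not the pipeline used by Bejerano et al.; the function name `find_ultraconserved`, the toy sequences, and the reduced `min_len` are all assumptions made for readability.

```python
def find_ultraconserved(aligned, min_len=200):
    """Return (start, end) column ranges, end exclusive, in which every
    aligned sequence is identical and gap-free for at least min_len columns."""
    if not aligned:
        return []
    n_cols = len(aligned[0])
    if any(len(s) != n_cols for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    runs, start = [], None
    # Iterate one column past the end so a run touching the last column is closed.
    for col in range(n_cols + 1):
        conserved = False
        if col < n_cols:
            column = {s[col].upper() for s in aligned}
            conserved = len(column) == 1 and "-" not in column
        if conserved:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


# Hypothetical toy alignment of three "genomes"; min_len is lowered from 200
# so the example fits on a slide.
toy = [
    "ACGTACGTTTGACA",
    "ACGTACGTTAGACA",
    "ACGTACGT-TGACA",
]
print(find_ultraconserved(toy, min_len=5))  # [(0, 8)]
```

Lowering the threshold (for example from 200 bp to 100 bp, as in the third quote) only changes `min_len`, which is why the number of qualifying segments grows so quickly.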
Phylogenomics
• UltraConserved Elements
• UCEs have been associated with gene regulation and development
• It is generally assumed that UCEs must be functionally important because they are almost perfectly conserved across extremely divergent taxa.
• However, gene knockouts of UCE loci in mice resulted in viable, fertile offspring, suggesting that their role in the biology of the genome may be cryptic.
Phylogenomics
• By definition, UCEs would be of minimal use in phylogenetics because of the low variability
• Linkage predicts that neighboring sequence that isn’t as highly conserved would be under less constraint
• UCEs serve as the anchors to access the neighboring sequence
• UCE workflow
• http://ultraconserved.org/
• Target enrichment of ultraconserved elements from arthropods provides a genomic perspective on relationships among Hymenoptera. Mol Ecol Res 2014
• The evolution of peafowl and other taxa with ocelli (eyespots): A phylogenomic approach. Proc R Soc Lond B Biol Sci 281: 20140823. 2014.
• Target Capture and Massively Parallel Sequencing of Ultraconserved Elements (UCEs) for Comparative Studies at Shallow Evolutionary Time Scales. Syst Biol 63:83-95. 2014.
• A Phylogenomic Perspective on the Radiation of Ray-Finned Fishes Based upon Targeted Sequencing of Ultraconserved Elements (UCEs). PLoS ONE 8: e65923. 2013.
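The anchoring idea on this slide (an invariant UCE core used to reach more variable neighboring sequence) can be illustrated by counting variable alignment columns in the core versus its flanks. This is a minimal sketch with made-up sequences; the helper names `variable_sites` and `core_vs_flanks` and the flank width are assumptions, not part of any published UCE workflow.

```python
def variable_sites(aligned, start, end):
    """Count alignment columns in [start, end) that contain more than one
    base; gaps and N are ignored when comparing."""
    count = 0
    for col in range(start, end):
        bases = {s[col].upper() for s in aligned} - {"-", "N"}
        if len(bases) > 1:
            count += 1
    return count


def core_vs_flanks(aligned, core_start, core_end, flank=4):
    """Compare variability inside a conserved core with its two flanks."""
    n_cols = len(aligned[0])
    left = (max(0, core_start - flank), core_start)
    right = (core_end, min(n_cols, core_end + flank))
    return {
        "core": variable_sites(aligned, core_start, core_end),
        "flanks": variable_sites(aligned, *left) + variable_sites(aligned, *right),
    }


# Hypothetical locus: 4 bp flank + 6 bp conserved core + 4 bp flank.
locus = [
    "ACGTGGCCTTACGT",
    "ATGTGGCCTTACGA",
    "ACGAGGCCTTTCGT",
]
print(core_vs_flanks(locus, core_start=4, core_end=10))  # {'core': 0, 'flanks': 4}
```

In this toy case all phylogenetic signal sits in the flanks, which is the reason capture probes target the core while the analysis uses the surrounding sequence.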
Because of the way they accumulate in a genome, TEs, especially retrotransposons, make excellent markers for phylogenetic analysis.
Figure: SINE accumulation in genomes over time (Subfamily 1, Subfamily 2, Subfamily 3)
Phylogenomics
SINEs as phylogenetic markers
But…Which SINE families do you target and how do you identify them?
• Transposable element phylogenetics
1. Identical by descent
2. Known ancestral state
3. Simple evolutionary model
4. Neutral
5. “Low-tech”
6. Bi-allelic markers
Phylogenomics
Consistency index = 1.00
Homoplasy index = 0.00
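To show where a consistency index of 1.00 comes from, here is a small sketch that scores presence/absence (bi-allelic) insertion loci on a fixed tree. It uses Fitch parsimony for simplicity, which is an assumption on my part; the slides do not say which parsimony model was used, and the tree, taxon names, and loci below are invented. The consistency index is the minimum possible number of steps divided by the observed number, and the homoplasy index is one minus that.

```python
def fitch_steps(tree, states):
    """Return (possible_state_set, steps) for one character on a rooted
    binary tree written as nested 2-tuples with taxon names as leaves."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # No shared state: a change is required on one of the two branches.
    return left_set | right_set, left_steps + right_steps + 1


def consistency_index(tree, matrix):
    """matrix maps locus -> {taxon: 0 or 1} (insertion absent or present)."""
    min_steps = observed = 0
    for states in matrix.values():
        n_states = len(set(states.values()))
        if n_states < 2:
            continue  # invariant loci need no steps and are skipped
        min_steps += n_states - 1
        observed += fitch_steps(tree, states)[1]
    if observed == 0:
        raise ValueError("no variable loci in the matrix")
    return min_steps / observed


# Hypothetical four-taxon tree and insertion loci.
tree = ((("A", "B"), "C"), "D")
clean = {
    "locus1": {"A": 1, "B": 1, "C": 0, "D": 0},
    "locus2": {"A": 1, "B": 1, "C": 1, "D": 0},
}
ci = consistency_index(tree, clean)
print(ci, 1 - ci)  # 1.0 0.0

# One conflicting locus (present in A and C only) lowers the index.
noisy = dict(clean, locus3={"A": 1, "B": 0, "C": 1, "D": 0})
print(consistency_index(tree, noisy))  # 0.75
```

Every locus in `clean` changes state exactly once on the tree, so no homoplasy is needed to explain the data, which is the situation the reported values describe.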
• ME-Scan
Phylogenomics
• ME-Scan validation
Phylogenomics
• RAD-Seq
• Restriction Site Associated DNA Sequencing
• Cresko and colleagues (PLoS ONE 2008;3:e3376, PLoS Genet 2010;6:e1000862, PNAS 2010;107:16196–200.)
• Akin to RFLP and AFLP except that you sequence the fragments
• Rapidly identify genome-wide suites of SNPs and other polymorphisms
Phylogenomics
• (A) Genomic DNA is digested with a restriction enzyme.
• (B) P1 adapter is ligated to cut fragments.
• (C) Samples from multiple individuals are pooled together and randomly sheared. Only a subset of the resulting fragments contains restriction sites and P1 adapters.
• (D) P2 adapter is ligated to all fragments. The P2 adapter has a divergent end.
• (E) PCR amplification with P1 and P2 primers. The P2 adapter will be completed only in the fragments ligated with P1 adapter, and so only these fragments will be fully amplified.
• (F) Pooled samples with different MIDs are separated bioinformatically and SNPs called (C/G SNP underlined).
• (G) As fragments are sheared randomly, paired end sequences from each sequenced fragment will cover a 300–400 bp region downstream of the restriction site.
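Step (F) above, separating pooled reads by MID and then comparing them, can be sketched as follows. This is a toy illustration rather than a real RAD-Seq pipeline: the MID sequences, sample names, and reads are invented, and the `site_remnant` default assumes an SbfI-style cut site remnant, which the slide does not specify.

```python
from collections import defaultdict


def demultiplex(reads, mids, site_remnant="TGCAGG"):
    """Assign each read to a sample by its leading MID followed by the
    restriction-site remnant, and strip both from the kept sequence."""
    by_sample = defaultdict(list)
    for read in reads:
        for mid, sample in mids.items():
            prefix = mid + site_remnant
            if read.startswith(prefix):
                by_sample[sample].append(read[len(prefix):])
                break
        else:
            # No MID matched: keep the untouched read for inspection.
            by_sample["unassigned"].append(read)
    return dict(by_sample)


def snp_positions(seq_a, seq_b):
    """List (position, base_a, base_b) where two equal-length tags differ."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Hypothetical MIDs and reads.
mids = {"CGATA": "ind1", "ATCGT": "ind2"}
reads = [
    "CGATATGCAGGACGTC",
    "ATCGTTGCAGGACGTG",
    "CGATATGCAGGACGTC",
    "GGGGGTGCAGGACGTC",
]
samples = demultiplex(reads, mids)
print(samples["ind1"], samples["ind2"])  # ['ACGTC', 'ACGTC'] ['ACGTG']
print(snp_positions(samples["ind1"][0], samples["ind2"][0]))  # [(4, 'C', 'G')]
```

Real data would also need tolerance for sequencing errors in the MID and proper genotype calling across many reads per locus; both are left out here.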
Phylogenomics
• RAD-Seq and phylogenetics
• There is potential but there are problems
– “the most substantial obstacle to using RAD sequences for phylogenetics is determining orthology”
– “Deep divergences are problematic for two reasons: first, restriction sites change over time, with losses favored over gains, leading to a reduction in the number of orthologs retained across divergent taxa; second, evolutionary divergence of orthologous RAD sequences compromises the ability to infer their orthology based on sequence similarity. Consequently, taxa that are phylogenetically isolated on long branches are less likely to retain orthologous restriction sites, and the RAD sequences they do retain will be more divergent, diminishing their representation in clusters.”
– “While correct nodes are more likely in general to be strongly supported, incorrect nodes can also have high bootstrap values, although this is not unique to RAD phylogenetics.”
• Probably still really good for phylogeography within species and among closely related species
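The first problem quoted above (restriction sites are lost as taxa diverge) can be made concrete with a deliberately crude model. The assumptions are mine, not the quoted authors': every base in the recognition site mutates independently, any change destroys the site, and site gains are ignored. Under those assumptions the chance that a site of length k is still shared at per-base divergence d is (1 - d)^k.

```python
def site_retention(divergence, site_length=8):
    """Probability that a recognition site of site_length bases is unchanged
    between two taxa, assuming independent per-base divergence (toy model)."""
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be between 0 and 1")
    return (1.0 - divergence) ** site_length


# Hypothetical divergence levels, from within-species to deep comparisons.
for d in (0.01, 0.05, 0.10, 0.20):
    print(f"divergence {d:.2f}: shared-site probability {site_retention(d):.3f}")
```

Even this simple model shows the shared-locus fraction dropping steeply with divergence, which fits the suggestion that RAD-Seq is best suited to phylogeography and closely related species.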
• PhylomeDB
• Remember that gene tree/species tree problem?
• “given the plurality of evolutionary histories among genes encoded in a given genome, there is a need for the combined analysis of genome-wide collections of phylogenetic trees (phylomes).”
• Phylome – the complete collection of evolutionary histories of all genes in a genome
• Huerta-Cepas et al. 2007
• Latest version of PhylomeDB is v4, Nucleic Acids Research 2013
• phylomedb.org
Phylogenomics
• Phylome for gene family TP53 (screenshot from Huerta-Cepas et al. 2013)
Figure legend: gene duplication events; speciation events
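One common way to tell duplication nodes from speciation nodes in a gene tree is a species-overlap rule: if the two child lineages of a node share any species, the node is called a duplication; otherwise it is a speciation. The sketch below applies that rule to a toy tree. Whether the events in the TP53 screenshot were assigned exactly this way is an assumption here, and the leaf names (species code, underscore, gene name) are invented for illustration.

```python
def species_of(leaf):
    """Leaf labels are assumed to look like 'Hsa_TP53' (species code first)."""
    return leaf.split("_", 1)[0]


def label_events(tree):
    """Return (species_set, leaves, events) for a gene tree written as nested
    2-tuples. events lists (leaves_below_node, kind) in post-order."""
    if isinstance(tree, str):
        return {species_of(tree)}, [tree], []
    l_species, l_leaves, l_events = label_events(tree[0])
    r_species, r_leaves, r_events = label_events(tree[1])
    # Shared species on both sides imply the gene was duplicated before
    # those species diverged.
    kind = "duplication" if l_species & r_species else "speciation"
    leaves = l_leaves + r_leaves
    events = l_events + r_events + [(tuple(leaves), kind)]
    return l_species | r_species, leaves, events


# Hypothetical gene family with two paralogs in human (Hsa) and mouse (Mmu)
# and a single copy in zebrafish (Dre).
gene_tree = (
    (("Hsa_TP53", "Mmu_Tp53"), ("Hsa_TP63", "Mmu_Tp63")),
    "Dre_tp53",
)
_, _, events = label_events(gene_tree)
for leaves, kind in events:
    print(kind, leaves)
```

Run over every gene tree in a genome, this kind of labeling is what lets a phylome separate orthologs (related through speciation) from paralogs (related through duplication).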
Phylogenomics
• Alternative topology resolution using phylomes
Table columns: # trees (%) supporting the given phylogeny; # trees (%) with PP > 0.9 supporting the given phylogeny; # gene families (%) supporting the given phylogeny
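The tree-counting columns above can be sketched as a tally over a collection of gene trees. Each gene tree is reduced here to a mapping from clade (a set of species) to posterior probability (PP), and each competing topology is represented by the one clade that distinguishes it. The data, the names, and the choice to express both percentages relative to all trees are assumptions for illustration; this is not the PhylomeDB implementation.

```python
def support_table(gene_trees, hypotheses, pp_cutoff=0.9):
    """gene_trees: list of {frozenset_of_species: posterior_probability}.
    hypotheses: {name: diagnostic_clade_as_frozenset}."""
    total = len(gene_trees)
    if total == 0:
        raise ValueError("need at least one gene tree")
    table = {}
    for name, clade in hypotheses.items():
        hits = [tree[clade] for tree in gene_trees if clade in tree]
        strong = [pp for pp in hits if pp > pp_cutoff]
        table[name] = {
            "trees": len(hits),
            "trees_pct": 100.0 * len(hits) / total,
            "strong": len(strong),
            "strong_pct": 100.0 * len(strong) / total,
        }
    return table


# Hypothetical competing groupings and four gene trees.
hc = frozenset({"human", "chimp"})
hg = frozenset({"human", "gorilla"})
gene_trees = [{hc: 0.99}, {hc: 0.80}, {hg: 0.95}, {hc: 0.95}]
table = support_table(gene_trees, {"human+chimp": hc, "human+gorilla": hg})
for name, row in table.items():
    print(name, row)
```

The third column (gene families) would need an extra step that groups trees by family before counting, so that large families do not dominate the tally; that step is not shown.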