Role of Genome Advancement in Evolution Studies
Name: Sarla Yadav
Class: M.Sc. Microbial Biotechnology, 3rd sem
Roll no.: 1873
Contents
• What is evolution?
• History of evolution
• How evolutionary changes occur?
• Three domains of life
• What evolutionary trees depict?
• Basis of evolutionary phylogenetic tree
• Evolutionary analysis based on macromolecular sequences
• Contradiction of using 16S rRNA or other individual gene families to make phylogenetic tree
• Horizontal gene transfer
• Reconstruct the tree of life
• Role of advancement of genomics technology in evolution studies
• Changing of the guard: from genomes to pangenomes
• Era of pangenomics
• References
References
• Doolittle, W. F. (1999) Science 284, 2124–2128.
• Aguinaldo, A. M. A., Turbeville, J. M., Linford, L. S., Rivera, M. C., Garey, J. R., Raff, R. A. & Lake, J. A. (1997) Nature 387, 489–493.
• Halanych, K. M., Bacheller, J. D., Aguinaldo, A. M. A., Liva, S. M., Hillis, D. M. & Lake, J. A. (1995) Science 267, 1641–1643.
• Copeland, H. F. (1938) Q. Rev. Biol. 13, 383–420.
• Whittaker, R. H. (1959) Q. Rev. Biol. 34, 210–226.
• Whittaker, R. H. & Margulis, L. (1978) Biosystems 10, 3–18.
• Knoll, A. H. (1990) in Origins and Early Evolutionary History of the Metazoa, eds. Lipps, J. H. & Signor, P. W. (Plenum, New York), in press.
• Simonson, A. B., Servin, J. A., Skophammer, R. G., Herbold, C. W., Rivera, M. C. & Lake, J. A. Decoding the genomic tree of life.
What is evolution?
• Evolution simply means “change”.
• Evolutionary biology is the study of the history of life forms on Earth.
• Biological (or organic) evolution: change in the properties of groups of organisms over the course of generations.
• The development of an individual organism is not considered evolution.
• Changes in a population through the passing of genetic material from one generation to the next are considered evolutionary.
History of Evolution
• From Classical times until long after the Renaissance, species were considered to be special creations, fixed for all time.
• Chevalier de Lamarck proposed the concept of spontaneous generation and argued that species differ from one another because they have different needs.
  • Famous example: giraffes originally had short necks, but stretched their necks to reach foliage above them.
• Darwin’s evolutionary theory, published in The Origin of Species in 1859, consisted of two major hypotheses:
  • All organisms have descended from common ancestral forms of life.
  • Modification among species is due to natural selection.
How evolutionary changes occur?
• The principles that explain evolutionary changes are as follows:
  • Genetic variation in phenotypic characters arises by random mutation and recombination.
  • Changes in the proportions of alleles and genotypes within a population may result in the replacement of genotypes over generations. This occurs either by:
    • Random fluctuation (genetic drift), or
    • Nonrandom change (natural selection)
• Because of their different histories of genetic drift and natural selection, populations of a species may diverge and become reproductively isolated species.
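The difference between random fluctuation and nonrandom change can be illustrated with a minimal sketch. It assumes a simple haploid, Wright-Fisher-style population, which is not part of the original slides; the function names, population size, and fitness values are all hypothetical.

```python
import random


def select(p, w_a=1.1, w_b=1.0):
    """Deterministic change in the frequency p of allele A under selection.

    w_a and w_b are hypothetical relative fitnesses of alleles A and B.
    """
    mean_fitness = p * w_a + (1 - p) * w_b
    return p * w_a / mean_fitness


def drift(p, pop_size, rng):
    """Random change in p: draw pop_size offspring from the current gene pool."""
    copies_of_a = sum(1 for _ in range(pop_size) if rng.random() < p)
    return copies_of_a / pop_size


def simulate(p0, generations, pop_size, w_a, rng):
    """Apply selection, then drift, each generation; return the trajectory of p."""
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        p = select(p, w_a=w_a)
        p = drift(p, pop_size, rng)
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    neutral = simulate(0.5, 50, 100, w_a=1.0, rng=rng)   # drift only
    favoured = simulate(0.5, 50, 100, w_a=1.1, rng=rng)  # drift plus selection
    print("neutral allele, final frequency:", neutral[-1])
    print("favoured allele, final frequency:", favoured[-1])
```

With equal fitnesses the allele frequency only wanders by chance; with a fitness advantage it is pushed upward each generation, although drift can still override selection in small populations.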
Historical path to the three domains of life: Bacteria, Archaea and Eucarya
1866
• Haeckel recognized the single-celled forms, the protists, challenged the aboriginal plant/animal division of the living world, and added one more group to the tree of life.
• Copeland later split out a fourth main branch, the Monera, to accommodate the bacteria.
• Whittaker then created a fifth for the fungi.
1950
• As the molecular and cytological understanding of cells deepened at a very rapid pace, it led to the discovery of the archaebacteria. On the cytological level, archaebacteria are indeed prokaryotes, but on the molecular level they more closely resemble eukaryotes.
1970
• By comparing ribosomal RNA (rRNA) sequences, Carl Woese demonstrated that there are two different groups of organisms with prokaryotic cell architecture and proposed the three domains of life.
What evolutionary trees depict?
• A phylogenetic tree, also known as a phylogeny, is a diagram that depicts the lines of evolutionary descent of different species, organisms or genes from a common ancestor.
• Phylogenies are useful for:
  • Organizing knowledge of biological diversity
  • Structuring classification
  • Providing insight into events that occurred during evolution
Basis of evolutionary phylogenetic tree
• Molecular structures and sequences are generally more revealing of evolutionary relationships than are classical phenotypes.
• Definition of taxa: progressively shifted from the organismal to the cellular to the molecular level.
• Molecules used to relate microorganisms in phylogenetic trees:
  • Sequence similarity of small subunit ribosomal RNA (SSU rRNA)
  • Sequence similarity of individual protein families:
    • Cytochromes
    • ATPase
    • Elongation factor
    • Aminoacyl-tRNA synthetase
    • RNA polymerase
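As a rough illustration of what "sequence similarity" means in practice, the sketch below computes percent identity between sequences that are assumed to be already aligned (gaps written as "-"), and turns it into a simple distance. The toy sequences and function names are hypothetical; real SSU rRNA analyses use proper alignment and substitution models.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions that carry the same residue.

    Positions where either sequence has a gap ('-') are not compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def distance_matrix(aligned):
    """Pairwise distances (1 - fractional identity) for a dict of aligned sequences."""
    names = sorted(aligned)
    return {
        (x, y): 1.0 - percent_identity(aligned[x], aligned[y]) / 100.0
        for x in names
        for y in names
    }


if __name__ == "__main__":
    # Hypothetical, hand-made aligned fragments; not real rRNA data.
    toy_alignment = {
        "taxon_1": "ACGUACGUAC",
        "taxon_2": "ACGUACGAAC",
        "taxon_3": "ACG-UCGAUC",
    }
    for pair, dist in sorted(distance_matrix(toy_alignment).items()):
        print(pair, round(dist, 3))
```

A matrix like this is the usual input to distance-based tree-building methods; smaller distances place taxa closer together on the tree.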
Comparing genome sequences provides clues to evolution and development
• Genome sequencing and data collection have advanced rapidly in the last 25 years.
• Comparative studies of genomes:
  • Advance our understanding of the evolutionary history of life
  • Help explain how the evolution of development leads to morphological diversity
• Genome comparisons of closely related species help us understand recent evolutionary events.
• Genome comparisons of distantly related species help us understand ancient evolutionary events.
• Relationships among species can be represented by a tree-shaped diagram.
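[Figure: tree showing the most recent common ancestor of all living things and the divergence of Bacteria, Eukarya and Archaea (axis: billions of years ago, 4 to 1), with an inset showing the divergence of chimpanzee, human and mouse (axis: millions of years ago, 70 to 0).]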
Evolutionary analysis based on macromolecular sequences
• Zuckerkandl and Pauling initiated evolutionary analyses based on macromolecular sequences with hemoglobin, followed by Fitch and Margoliash with cytochrome c.
• However, rRNA has since replaced these molecules as a universal indicator of relationships among organisms.
• Several thousand RNA sequences have been determined in whole or in part and used to create a “universal tree of life”.
• Highly conserved genes have changed very little over time.
  • These help clarify relationships among species that diverged from each other long ago.
  • Bacteria, archaea, and eukaryotes diverged from each other between 2 and 4 billion years ago.
  • Highly conserved genes can be studied in one model organism, and the results applied to other organisms.
Phylogenetic tree based on the 16S rRNA sequence
Contradiction of using 16S rRNA or other individual gene families to make phylogenetic tree
• These genes correspond to a tiny fraction of the genomic material in most microorganisms, so trees built from them ignore the bulk of the genetic information.
• The 16S rRNA tree does not reflect the evolution of all of the genes in a genome and does not supply the evidence that early eukaryotes were a chimera of eubacterial and archaebacterial genes.
  • This was revealed by the complete genome sequence of the methanogen Methanococcus jannaschii.
  • Certain groups of genes (informational genes, responsible for translation and transcription) are more similar to eukaryotic genes, whereas other groups of genes are more closely related to their bacterial homologs.
  • The operational genes (involved in the biosynthesis of amino acids and numerous other operational activities) of eukaryotes were most closely related to those found in eubacteria.
• According to the latest Bergey’s Manual, the tree of prokaryotic life is fuzzy and unresolved.
  • It is not possible to determine how the phyla are related to each other.
Horizontal gene transfer (HGT): key in the evolution of prokaryotes
• Genome sequences have demonstrated that horizontal transfer of genes (between different types of organisms) is widespread and may occur between phylogenetically diverse organisms.
• It has the potential to significantly alter gene trees.
• HGT plays an important role in prokaryotic evolution.
• It is now generally recognized to be rampant among genomes (rampant at least on a geological timescale).
• Not all genes are equally likely to be horizontally transferred.
  • Informational genes are rarely transferred, whereas operational genes are readily transferred.
• Biological and physical factors appear to have altered HGT.
• Molecular mechanisms of HGT:
  • Transformation
  • Conjugation
  • Transduction
• HGT preferentially occurs among organisms that have environmental and genomic factors in common.
Reconstruct the tree of life
• Because there are many contradictory findings regarding the 16S rRNA gene tree in prokaryotes, scientists prefer whole-genome trees to depict evolutionary relationships.
• The whole-genome tree is based on information from the entire genome, using amino acid and dinucleotide composition.
• These trees represent an unbiased consideration of all the information in the genome.
• Everything is condensed into a simple composition vector.
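One way to picture a composition vector is sketched below for the dinucleotide case: every overlapping pair of bases in a sequence is counted and the counts are normalized to frequencies. The choice of Euclidean distance to compare two vectors, and all names and toy sequences, are assumptions for illustration; the slides do not specify the metric used by the whole-genome tree methods.

```python
from itertools import product
from math import sqrt

# The 16 possible dinucleotides, in a fixed order: AA, AC, AG, ..., TT.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def composition_vector(sequence):
    """Return the 16 dinucleotide frequencies of a DNA sequence."""
    sequence = sequence.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def euclidean_distance(u, v):
    """Plain Euclidean distance between two composition vectors (assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for whole genomes.
    genome_x = "ATGCGCGATATCGCGCTAGC"
    genome_y = "ATATATTTAAATATTAATAT"
    vx = composition_vector(genome_x)
    vy = composition_vector(genome_y)
    print("distance:", round(euclidean_distance(vx, vy), 3))
```

Each genome, whatever its length, is reduced to the same 16 numbers, which is what makes comparisons across many genomes cheap.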
Reconstruct the tree of life in the presence of HGT
• With the availability of complete genomes, useful methods have been developed for whole-genome analyses.
• When genomes are analyzed using parsimony and simple distance-based methods, HGT significantly influences the results.
• Recovering the tree of life in the presence of HGT has improved with the development of a new mathematical algorithm, conditioned reconstruction (CR), for whole-genome-based phylogenetic reconstructions.
• Its analyses use the absences and presences of genes as character states, but through the use of a reference genome.
• It also provides additional information that is not available in other types of analyses.
  • For example, by restricting the analyses to only the genes present in the reference genome R, one can also estimate the number of gene pairs that are missing in both genomes A and B.
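The counting step in that example can be sketched as follows. This is only a tabulation of presence/absence patterns conditioned on a reference genome, not an implementation of the CR algorithm itself; the gene identifiers and the function name are hypothetical.

```python
def conditioned_patterns(reference, genome_a, genome_b):
    """Count presence/absence patterns for genes found in the reference genome.

    Returns counts for four patterns, written as (state in A)(state in B):
    PP = present in both, PA = present in A only,
    AP = present in B only, AA = absent from both.
    Genes that are not in the reference are ignored entirely.
    """
    counts = {"PP": 0, "PA": 0, "AP": 0, "AA": 0}
    for gene in reference:
        state_a = "P" if gene in genome_a else "A"
        state_b = "P" if gene in genome_b else "A"
        counts[state_a + state_b] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical gene-family identifiers for three genomes.
    reference_r = {"g1", "g2", "g3", "g4", "g5"}
    genome_a = {"g1", "g2", "g6"}
    genome_b = {"g1", "g3", "g7"}
    print(conditioned_patterns(reference_r, genome_a, genome_b))
```

Without a reference genome the "AA" class cannot be counted, because there is no list of genes against which joint absence could be observed; conditioning on R is what makes that number available.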
Role of advancement of genomics technology in evolution studies
• This was best said by Carl Woese: “Genome sequencing has come of age and genomics will become central to microbiology’s future”.
• The increase in the speed and efficiency of sequencing technology over the last decade has been accompanied by a more than 90% reduction in cost.
• It would now take only a small fraction of the time to repeat all the work done to date.
• The next decade will bring about 10,000 more complete genomes, thus providing us with hundreds of millions of new genes. This poses a totally new challenge for the development of data-handling procedures.
Next generation sequencing
Technique           Ion Torrent   Roche’s 454   Illumina   ABI’s SOLiD
Data (Mb per run)   100           100           600        700
Time per run        1.5 h         7 h           9 days     9 days
Read length         200 bp        400 bp        150 bp     75 bp
Cost per Mb         $5            $84.39        $0.03      $0.04
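The figures in the table above can be combined into two derived quantities, throughput per hour and cost per run, to make the platforms easier to compare. The sketch below only does arithmetic on the table's own numbers; the dictionary layout and names are hypothetical, and 9 days is taken as 216 hours.

```python
# Values copied from the table above; times converted to hours.
PLATFORMS = {
    "Ion Torrent": {"mb_per_run": 100, "hours_per_run": 1.5, "usd_per_mb": 5.00},
    "Roche 454": {"mb_per_run": 100, "hours_per_run": 7.0, "usd_per_mb": 84.39},
    "Illumina": {"mb_per_run": 600, "hours_per_run": 9 * 24, "usd_per_mb": 0.03},
    "ABI SOLiD": {"mb_per_run": 700, "hours_per_run": 9 * 24, "usd_per_mb": 0.04},
}


def derived_metrics(spec):
    """Return throughput (Mb per hour) and total cost (USD) of a single run."""
    return {
        "mb_per_hour": spec["mb_per_run"] / spec["hours_per_run"],
        "usd_per_run": spec["mb_per_run"] * spec["usd_per_mb"],
    }


if __name__ == "__main__":
    for name, spec in PLATFORMS.items():
        m = derived_metrics(spec)
        print(f"{name}: {m['mb_per_hour']:.1f} Mb/h, ${m['usd_per_run']:.2f} per run")
```

On these numbers the short-read platforms are far cheaper per megabase, while Ion Torrent delivers the most data per hour of run time.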
Analysis becomes difficult
• Analyzing these huge data sets will require computational methods for large-scale comparative analysis.
• Advances in sequencing technology are rapidly approaching the point of producing more data than can be analyzed.
• The most widely used tool for identifying homologous gene families is BLAST, but with a linear rate of increase in data this approach will soon become unusable.
• High-performance computing and parallel implementations such as ScalaBLAST may help in the near future.
• Hence, rapid technological advances make sequencing easily affordable for the average university or research institute, but the ability to analyze the data will become increasingly expensive, soaring out of reach of most institutions.
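A short calculation suggests why linear growth in data can still overwhelm an all-against-all search. The sketch assumes the simplest possible model, in which every gene is compared once with every other gene; BLAST's heuristics make each comparison cheaper but do not change how the number of pairs grows. The function name and gene counts are hypothetical.

```python
def pairwise_comparisons(n_genes):
    """Number of distinct gene pairs in an all-against-all comparison."""
    return n_genes * (n_genes - 1) // 2


if __name__ == "__main__":
    genes = 1000  # hypothetical starting data set
    for _ in range(4):
        print(f"{genes:>6} genes -> {pairwise_comparisons(genes):>12} comparisons")
        genes *= 2  # the data set doubles
```

Each doubling of the data roughly quadruples the number of pairs, so analysis cost rises much faster than sequencing output.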
Changing of the guard: from genomes to pangenomes
• A promising approach for alleviating the data analysis problem involves a conceptual change:
  • Comparative analysis need not compare all genes with all other genes, as not all genes have sequence similarity to all others.
  • Methods for limiting BLAST analyses considerably reduce the computational demand of comparative analysis.
  • The data reduction methodology would be based on the concept of the ‘pangenome’, defined as all of the different genes present in a set of genomes.
• The pangenome of a species consists of:
  • Core genome (found in all isolates)
  • Flexible genome (present in some but not all)
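These definitions map directly onto set operations, as in the minimal sketch below. It assumes each genome has already been reduced to a set of gene-family identifiers, which in practice is the hard part; the strain names and gene labels are hypothetical.

```python
def pangenome(genomes):
    """Split a collection of genomes into pangenome, core and flexible gene sets.

    genomes maps a strain name to the set of gene families it contains.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set(), set(), set()
    pan = set().union(*gene_sets)            # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes present in all strains
    flexible = pan - core                    # genes present in some but not all
    return pan, core, flexible


if __name__ == "__main__":
    # Hypothetical strains and gene families.
    strains = {
        "strain_1": {"a", "b", "c"},
        "strain_2": {"a", "b", "d"},
        "strain_3": {"a", "b", "c", "e"},
    }
    pan, core, flexible = pangenome(strains)
    print("pangenome:", sorted(pan))
    print("core:", sorted(core))
    print("flexible:", sorted(flexible))
    print("core fraction:", len(core) / len(pan))
```

The ratio of core genes to the whole pangenome gives a first, crude indication of how much gene content varies among the strains of a species.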
Era of Pangenomics
• The pangenome may lead to a new understanding of our microbial planet, fulfilling microbiology’s dream of systematic study.
• 1960-1990 was the era of ribosomal RNA, when we were building the tree of life and establishing the framework for the genomics revolution of 1990-2020, when we were growing the tree of life.
• The next decade (2010-2020) will be marked as the era of pangenomics, defined as finally understanding the tree of life.
• Several case studies have revealed that the pangenomes of different species differ with respect to the relative proportion of core and flexible genes.
• Those with a high percentage of core genes are called ‘closed’ pangenomes; those with a high percentage of flexible genes are termed ‘open’.
• The degree of ‘openness’ of the pangenome generated from those strains can reveal the evolutionary dynamics of that species and indicate how many additional strains may need to be sequenced to adequately
Thank you