Page 1: Eukaryotic Genomes: Fungi Wednesday, October 22, 2003 Introduction to Bioinformatics ME:440.714 J. Pevsner pevsner@jhmi.edu

Eukaryotic Genomes: Fungi

Wednesday, October 22, 2003

Introduction to Bioinformatics
ME:440.714
J. Pevsner

pevsner@jhmi.edu

Copyright notice

Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics by J. Pevsner (ISBN 0-471-21004-8). Copyright © 2003 by Wiley.

These images and materials may not be used without permission from the publisher.

Visit http://www.bioinfbook.org

Announcements

We are in the last third of the course:

Today: Fungi. Exam #2 is due at the start of class.

Next Monday: Functional genomics (Jef Boeke)
Next Wednesday: Pathways (Joel Bader)

Monday Nov. 3: Eukaryotic genomes
Wednesday Nov. 5: Human genome

Monday Nov. 10: Human disease
Wednesday Nov. 12: Final exam (in class)

Outline of today’s lecture

Description and classification of fungi

The Saccharomyces cerevisiae genome

Duplication of the yeast genome

Functional genomics in yeast

Comparative genomics of fungi

Introduction to fungi: phylogeny

Fungi are eukaryotic organisms that can be filamentous (e.g. molds) or unicellular (e.g. the yeast Saccharomyces cerevisiae).

Most fungi are aerobic (but S. cerevisiae can grow anaerobically). Fungi have major roles in the ecosystem in degrading organic waste. They have important roles in fermentation, including the manufacture of steroids and penicillin.

Several hundred fungal species are known to cause disease in humans.

Eukaryotes (Baldauf et al., 2000)

Fungi and metazoa are sister groups

Fig. 15.1, Page 504 (Baldauf et al., 2000)

Classification of fungi

About 70,000 fungal species have been described (as of 1995), but 1.5 million species may exist.

Four phyla:

Ascomycota: yeasts, truffles, lichens
    Hemiascomycetae: Génolevures project
    Euascomycetae: Neurospora
    Loculoascomycetae
    Laboulbeniomycetae: parasites of insects
Basidiomycota: rusts, smuts, mushrooms
Chytridiomycota: Allomyces
Zygomycota: feed on decaying vegetation

Box 15-1, Page 505

Introduction to Saccharomyces cerevisiae

First species domesticated by humans

Called baker’s yeast (or brewer’s yeast)

Ferments glucose to ethanol and carbon dioxide

Model organism for studies of biochemistry, genetics, molecular and cell biology

… rapid growth rate
… easy to modify genetically
… features typical of eukaryotes
… relatively simple (unicellular)
… relatively small genome

Page 505

Sequencing the S. cerevisiae genome

The genome was sequenced by a highly cooperative consortium in the early 1990s, chromosome by chromosome (the whole genome shotgun approach was not used).

This involved 600 researchers in > 100 laboratories.

-- Physical map created for all XVI chromosomes
-- Library of 10 kb inserts constructed in phage
-- The inserts were assembled into contigs

The sequence was released in 1996 and published in 1997 (Goffeau et al., 1996; Mewes et al., 1997).

Page 505

Features of the S. cerevisiae genome

Sequenced length:            12,068 kb (= 12,068,000 base pairs)
Length of repeats:            1,321 kb
Total length:                13,389 kb (~13 Mb)

Open reading frames (ORFs):   6,275
Questionable ORFs (qORFs):      390
Hypothetical proteins:        5,885

Introns in ORFs:                220
Introns in UTRs:                 15
Intact Ty elements:              52
tRNA genes:                     275
snRNA genes:                     40

Page 506

Features of the S. cerevisiae genome

A notable feature of the genome is its high gene density (about one gene every 2 kilobases). Most bacteria have about one gene per kb, but most eukaryotes have a much sparser gene density.
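The density figure can be checked against the genome statistics quoted earlier in the lecture. The following is a minimal Python sketch; the constant and function names are illustrative and are not part of the original slides.

```python
# Figures quoted in the lecture for S. cerevisiae.
SEQUENCED_KB = 12_068      # sequenced length, in kb
REPEATS_KB = 1_321         # length of repeats, in kb
TOTAL_ORFS = 6_275         # open reading frames
QUESTIONABLE_ORFS = 390    # ORFs flagged as questionable


def kb_per_gene(genome_kb: float, n_genes: int) -> float:
    """Average spacing between genes, in kilobases per gene."""
    return genome_kb / n_genes


total_kb = SEQUENCED_KB + REPEATS_KB              # total length in the table
remaining_orfs = TOTAL_ORFS - QUESTIONABLE_ORFS   # ORFs left after removing qORFs

print(f"total length: {total_kb} kb")
print(f"ORFs minus questionable ORFs: {remaining_orfs}")
print(f"density: {kb_per_gene(SEQUENCED_KB, TOTAL_ORFS):.2f} kb per ORF")
```

Dividing the sequenced length by the ORF count gives a little under 2 kb per ORF, which is consistent with the "one gene every 2 kilobases" statement.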

Also, only 4% of S. cerevisiae genes are interrupted by introns. By contrast, 40% of Schizosaccharomyces pombe genes have introns.

What are the most common protein families and protein domains? You can see the answer at EBI’s website: http://www.ebi.ac.uk/proteome/

Page 506

Fig. 15.2, Page 508

Fig. 15.3, Page 509
http://www.ebi.ac.uk/proteome/

The EBI website offers a variety of proteome analysis tools, such as this summary of protein length distribution in S. cerevisiae.

ORFs in the S. cerevisiae genome

How are ORFs defined? In the initial genome analysis, an ORF was defined as >100 codons (thus specifying a protein of ~11 kilodaltons).

390 ORFs were listed as “questionable”, because they were considered unlikely to be authentic genes. For example, they were short, or exhibited unlikely preferences for codon usage.

How many ORFs are there in the yeast genome? There are 40,000 ORFs > 20 amino acids; how many of these are authentic?
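The effect of the length cutoff can be illustrated with a small ORF scanner. This is a simplified sketch, not the procedure used in the original genome analysis: it assumes ORFs start at ATG, scans all six reading frames of a plain A/C/G/T string, and ignores introns. The function names and the average residue mass of about 110 Da used for the size estimate are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100):
    """Return (strand, frame, start, end, n_codons) for each ATG...stop ORF
    with more than min_codons codons (the stop codon is not counted).
    Coordinates are 0-based on the strand being scanned."""
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons > min_codons:
                        orfs.append((strand, frame, start, i + 3, n_codons))
                    start = None                   # look for the next ORF
    return orfs


def approx_mass_kda(n_residues: int, avg_residue_da: float = 110.0) -> float:
    """Rough protein mass, assuming an average residue mass of ~110 Da."""
    return n_residues * avg_residue_da / 1000


toy = "ATG" + "GCT" * 5 + "TAA"          # a 6-codon toy ORF
print(find_orfs(toy, min_codons=5))      # kept with a permissive cutoff
print(find_orfs(toy, min_codons=100))    # rejected by the >100-codon rule
print(approx_mass_kda(100))              # size estimate for 100 residues
```

Lowering min_codons toward 20 on a real chromosome sequence would be expected to return many more candidates, which is the point of the question on the slide: a permissive cutoff finds far more ORFs than are likely to be authentic genes.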

Page 506-507

ORFs in the S. cerevisiae genome

Several criteria may be applied to decide if ORFs are authentic protein-coding genes:
[1] evidence of conservation in other organisms
[2] experimental evidence of gene expression (microarrays, SAGE, functional genomics)

The groups of Elizabeth Winzeler and Michael Snyder each recently described hundreds of previously unannotated genes that are transcribed and translated.

Page 507

ORFs in the S. cerevisiae genome

The MIPS Comprehensive Yeast Genome Database lists criteria for assigning ORFs, based on FASTA search scores:

Category                                          Number of proteins
Known protein                                     3400
Strong similarity to known protein                 230
Similarity or weak similarity to known protein     825
Similarity to unknown protein                     1007
No similarity                                      516
Questionable ORF                                   472

Total                                             6450

Page 507, 510

Exploring a typical S. cerevisiae chromosome

We will next familiarize ourselves with the S. cerevisiae genome by exploring a typical chromosome, XII.

This chromosome features
• 38% GC content
• very little repetitive DNA
• few introns
• six Ty elements (transposable elements)
• a high ORF density: 534 ORFs > 100 aa, and 72% of the chromosome has protein-coding genes

Page 508-511

Key S. cerevisiae databases

Web resources include:

NCBI (Entrez Genome Eukaryotic genome projects)

EBI
http://www.ebi.ac.uk/proteome/

SGD: Saccharomyces Genome Database
http://genome-www.stanford.edu/Saccharomyces/

MIPS Comprehensive Yeast Genome Database
(MIPS = Munich Information Center for Protein Sequences)
http://mips.gsf.de/proj/yeast/CYGD/db/

Page 508

NCBI: Entrez genomes for yeast resources

Fig. 15.4, Page 510

NCBI: Entrez genomes for yeast resources

~Fig. 15.5, Page 511

Fig. 15.6, Page 512

MIPS offers a Comprehensive Yeast Genome Database

http://mips.gsf.de/genre/proj/yeast/index.jsp

Fig. 15.7, Page 513
http://www.yeastgenome.org/

Saccharomyces Genome Database (SGD)

S. cerevisiae gene nomenclature

YKL159c

Y = yeast
K = 11th chromosome
L = left (or R = right) arm
159 = 159th ORF
c = Crick (bottom) strand, or w = Watson (top) strand

RCN1 = wild-type gene
Rcn1p = protein
rcn1 = mutant allele

Box 15-2, Page 514
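The systematic naming scheme can be expressed as a small parser. This is an illustrative sketch: the function name parse_orf_name is hypothetical, and the pattern covers only names of the form shown on the slide. It assumes chromosomes I to XVI are lettered A to P, and it does not handle names that carry an additional suffix.

```python
import re

# Y + chromosome letter (A..P) + arm (L/R) + three-digit ORF number + strand (C/W).
ORF_NAME = re.compile(r"^Y([A-P])([LR])(\d{3})([CW])$", re.IGNORECASE)


def parse_orf_name(name: str) -> dict:
    """Split a systematic ORF name such as YKL159c into its components."""
    m = ORF_NAME.match(name.strip())
    if m is None:
        raise ValueError(f"not a systematic ORF name of the expected form: {name!r}")
    chrom, arm, number, strand = m.groups()
    return {
        "chromosome": ord(chrom.upper()) - ord("A") + 1,   # K -> 11
        "arm": "left" if arm.upper() == "L" else "right",
        "orf_number": int(number),
        "strand": "Crick" if strand.upper() == "C" else "Watson",
    }


print(parse_orf_name("YKL159c"))
```

For YKL159c this yields chromosome 11, left arm, ORF number 159, Crick strand, matching the breakdown on the slide.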

Duplication of the S. cerevisiae genome

Analysis of the S. cerevisiae genome revealed that many regions are duplicated, both intrachromosomally and interchromosomally (within and between chromosomes). These duplicated regions include both genes and nongenic regions.

Such duplications reflect a fundamental aspect of genome evolution.

What are the mechanisms by which regions of the genome duplicate?

Page 511

Duplication of the S. cerevisiae genome

Mechanisms of gene duplication
-- Tandem repeat slippage during recombination
-- Gene conversion
-- Lateral gene transfer
-- Segmental duplication
-- Polyploidy (e.g. genome tetraploidy)

Fig. 15.8, Page 514

Duplication of the S. cerevisiae genome

Fate of duplicated genes
-- Both copies persist
-- One copy is deleted
-- One copy becomes a pseudogene
-- One copy functionally diverges

Fig. 15.8, Page 514

Duplication of the S. cerevisiae genome

In 1970, Susumu Ohno published the book Evolution by Gene Duplication.

He hypothesized that vertebrate genomes evolved by two rounds of whole genome duplication. This provided genomes with the “raw materials” (new genes) with which to introduce various innovations.

Page 512

Duplication of the S. cerevisiae genome

Ohno (1970):

“Had evolution been entirely dependent upon natural selection, from a bacterium only numerous forms of bacteria would have emerged. The creation of metazoans, vertebrates, and finally mammals from unicellular organisms would have been quite impossible, for such big leaps in evolution required the creation of new gene loci with previously nonexistent function. Only the cistron that became redundant was able to escape from the relentless pressure of natural selection. By escaping, it accumulated formerly forbidden mutations to emerge as a new gene locus.”

Page 512

Duplication of the S. cerevisiae genome

Wolfe and Shields (1997, Nature) provided support for Ohno’s paradigm. They hypothesized that the yeast genome duplicated about 100 million years ago. There was a diploid yeast genome with about 5,000 genes. It doubled to a tetraploid number of 10,000 genes. Then there was massive gene loss and chromosomal rearrangement to yield the present day 6,000 genes.

Page 515

Fig. 15.9, Page 515

[Dot plot: distance along chromosome X (kb) versus distance along chromosome XI (kb)]

Wolfe and Shields (1997) performed blastp and found 55 blocks of duplicated regions. They proposed that the entire S. cerevisiae genome underwent a duplication.

Matches with scores >200 are shown. These are arranged in blocks of genes.

Duplication of the S. cerevisiae genome

Evidence of genome duplication in yeast
-- Systematic BLAST searches show 55 blocks of duplicated sequences.
-- There are 376 pairs of homologous genes.
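Grouping high-scoring matches into blocks can be sketched as follows. This is a simplified illustration, not the procedure of Wolfe and Shields: only the score cutoff of 200 comes from the lecture, while the maximum gap, the minimum number of gene pairs per block, the toy coordinates, and the function name find_blocks are assumptions chosen for the example.

```python
def find_blocks(hits, min_score=200, max_gap_kb=50.0, min_pairs=3):
    """Group protein matches between two chromosomes into blocks.

    hits: iterable of (pos_a_kb, pos_b_kb, score), one entry per matching gene pair.
    A block is a run of matches, ordered along chromosome A, in which
    consecutive matches lie within max_gap_kb of each other on both chromosomes.
    Only blocks with at least min_pairs matches are returned.
    """
    kept = sorted((a, b) for a, b, score in hits if score > min_score)
    blocks, current = [], []
    for a, b in kept:
        if current and (a - current[-1][0] > max_gap_kb
                        or abs(b - current[-1][1]) > max_gap_kb):
            if len(current) >= min_pairs:
                blocks.append(current)
            current = []                 # gap too large: start a new block
        current.append((a, b))
    if len(current) >= min_pairs:
        blocks.append(current)
    return blocks


# Toy data (hypothetical positions in kb and match scores).
toy_hits = [
    (10, 300, 250), (20, 310, 400), (35, 330, 220),   # three nearby matches
    (40, 900, 500),                                   # isolated on chromosome B
    (200, 50, 300),                                   # isolated on chromosome A
    (210, 60, 150),                                   # below the score cutoff
]
print(find_blocks(toy_hits))
```

With the toy data, only the first three matches form a block; the isolated and low-scoring matches are discarded. A fuller treatment would also check gene orientation and allow interleaved blocks.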

You can see the results of chromosomal comparisons on Ken Wolfe’s web site and at the SGD web site.

Page 515

Fig. 15.10, Page 516

The SGD website includes a pairwise chromosome similarity viewer.

Kenneth Wolfe offers a website that permits analysis of yeast duplications:
http://oscar.gen.tcd.ie/~khwolfe/yeast/

Page 516

As an example, note the SSO1 gene on XVI

SSO1 (XVI) & SSO2 (XIII) are part of a block

Duplication of the S. cerevisiae genome

Two models for the presence of duplication blocks

[1] Whole genome duplication (tetraploidy) followed by gene loss and rearrangements

[2] Successive, independent duplication events

Page 516

Duplication of the S. cerevisiae genome

Model [1] is favored for several reasons:

-- For 50 of 55 duplicated regions, the orientation of the entire block is preserved with respect to the centromere. The orientation is not random.

-- For model [2] we would expect 7 triplicated regions. We observe only 0 or 1.

-- Gene order is maintained in 14 hemiascomycetes (the Génolevures project)
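The claim that the orientation is not random can be given a rough number. As a sketch, assume that under independent duplication events each block would keep or flip its orientation relative to the centromere with equal probability (this null model is an assumption for illustration, not something stated in the lecture). The chance of seeing 50 or more of 55 blocks preserved is then a binomial tail:

```python
from math import comb


def binomial_tail(n: int, k: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# 50 of 55 duplicated blocks keep their orientation (figures from the lecture).
p_value = binomial_tail(55, 50)
print(f"P(at least 50 of 55 preserved by chance) = {p_value:.3g}")
```

Under this assumed null model the probability is on the order of 1e-10, which is why the preserved orientation is taken as evidence for a single whole genome duplication rather than many independent events.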

Page 516

Duplication of the S. cerevisiae genome

The Génolevures project:

-- Partial sequencing of 13 hemiascomycetes
-- Gene order can be compared in 14 fungi
-- 70% of the S. cerevisiae genome maps to sister regions with only minimal overlap
-- Proposal that the 16 centromeres form 8 pairs

Phylogenetic analyses place the divergence of S. cerevisiae and Kluyveromyces lactis prior to the whole genome duplication (~100 million years ago). Perhaps the genome duplication enabled S. cerevisiae to acquire new properties such as the capacity for anaerobic growth.

Page 517

Duplication of the S. cerevisiae genome

What is the fate of duplicated genes?

A duplicated gene (overall in eukaryotes) has a half-life of just several million years (Lynch and Conery, 2000).

50% to 92% of duplicated genes are lost (Wagner, 2001)

Consider four possible fates of a duplicated gene:
[1] Both copies persist (gene dosage effect)
[2] One copy is deleted (a common fate)
[3] One copy accumulates mutations and becomes a pseudogene (no functional protein product)
[4] One copy (or both) diverges functionally. The organism can perform a novel function.
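The idea of a half-life for duplicated genes can be made concrete with a simple exponential-decay model. This is only a sketch: the lecture says "several million years" without giving a number, so the 4-million-year value below is a placeholder assumption, and real gene loss need not follow a single constant rate.

```python
def surviving_fraction(elapsed_my: float, half_life_my: float) -> float:
    """Fraction of duplicate genes still present after elapsed_my million years,
    assuming simple exponential loss with the given half-life."""
    return 0.5 ** (elapsed_my / half_life_my)


ASSUMED_HALF_LIFE_MY = 4.0   # placeholder value, not taken from the lecture

for elapsed in (0, 4, 8, 20):
    frac = surviving_fraction(elapsed, ASSUMED_HALF_LIFE_MY)
    print(f"after {elapsed:>2} My: {frac:.3f} of duplicates remain")
```

Each elapsed half-life halves the surviving fraction, so under this simplified model most duplicates disappear within a few tens of millions of years unless something preserves them.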

Page 517

Duplication of the S. cerevisiae genome

Why are duplicated genes commonly lost? It might seem highly advantageous to have a second copy of a gene, thus permitting functional divergence.

Ohno suggested two reasons:

[1] After duplication, a deleterious mutation in one of the two genes might now persist. Without duplication, the individual would have been selected against by such a mutation.

[2] The presence of a new paralogous sequence could lead to unequal crossing over of homologous chromosomes during meiosis.

Page 518

Duplication of the S. cerevisiae genome

To consider the fate of duplicated genes, consider the example of genes involved in vesicle transport.

Vesicles carry cargo from one destination to another. Proteins on vesicles (e.g. vesicle-associated membrane protein, VAMP; Snc1p in yeast) bind to proteins on target membranes (e.g. syntaxin in mammalian and other eukaryotic systems, or Sso1p in yeast).

In S. cerevisiae, genome duplication appears to be responsible for the presence of two syntaxins (SSO1 and SSO2) and two VAMPs (SNC1 and SNC2).

Page 518

Duplication of the S. cerevisiae genome

Sso1p Sso2p

Snc1p Snc2p

Fig. 15.11, Page 518

Search for information on SSO1 (or any yeast gene) at the SGD website

Fig. 15.12, Page 519

The SGD record for SSO1 provides information on function

Duplication of the S. cerevisiae genome

The SGD website reveals that the SSO1 gene is nonessential (i.e. the null mutant is viable), but the double knockout of SSO1 and SSO2 is lethal. Thus, these paralogs may offer functional redundancy to the organism.

Also, these proteins could participate in distinct (but complementary) intracellular trafficking steps.

Page 519

Duplication of the S. cerevisiae genome

Andreas Wagner (2000) considered two ways an organism can compensate for mutations: via genes with overlapping functions (e.g. paralogs), or via genes with unrelated functions that participate in regulatory networks.

He reported that overall, gene duplications did not provide robustness. Instead, interactions among unrelated genes provide robustness against mutations.

Page 519

Functional genomics in yeast

Functional genomics refers to the assignment of function to genes based on genome-wide screens and analyses.

Next week, Jef Boeke will describe functional genomics (Monday). Joel Bader will describe proteomics in yeast (Wednesday).

Page 520

Fig. 15.13, Page 520

We can consider functional genomics in yeast in terms of high throughput approaches at the levels of genes, transcripts, and proteins

Functional genomics in yeast (next week)

Protein level
  Two-hybrid screens
  Affinity purification and mass spectrometry
  Pathways

RNA level
  Microarrays
  SAGE
  Transposon tagging

Gene level
  Genetic footprinting
  Transposon insertion: random mutagenesis
  Gene deletion: targeted deletion of all ORFs!!!

Today’s final topic: comparative analysis of fungal genomes

The fungi offer unprecedented opportunities for comparative genomic analyses

-- relatively small genome sizes
-- they are eukaryotes
-- they exhibit significant differences in biology
-- opportunities to apply functional genomics approaches in a comprehensive, genome-wide manner

Page 528

Fungal and metazoan phylogeny

Baldauf et al., 2000
Page 528

A variety of fungal genome sequencing projects

Species                        Size     Chromosomes
Aspergillus fumigatus          30 Mb    8
Aspergillus nigrans            29 Mb    8
Aspergillus parasiticus
Candida albicans               16 Mb    8
Cryptococcus neoformans        21 Mb
Fusarium sporotrichioides
Magnaporthe grisea             40 Mb    7
Neurospora crassa              43 Mb    7
Phanerochaete chrysosporium    30 Mb    10
Saccharomyces cerevisiae       13 Mb    16
Schizosaccharomyces pombe      14 Mb    3
Ustilago maydis                20 Mb

An atypical fungus: Encephalitozoon cuniculi

Microsporidia are single-celled eukaryotes that lack mitochondria and peroxisomes. Consistent with their roles as parasites, the E. cuniculi genome is severely reduced in size (2000 proteins, only 2.9 Mb). Microsporidia were thought to represent deep-branching protozoans, but recent phylogenetic studies place them as an outgroup to fungi.

Page 529

Fig. 15.22, Page 529

Encephalitozoon cuniculi as a fungal outgroup

Orange bread mold: Neurospora crassa

Beadle and Tatum chose N. crassa as a model organism to study gene-protein relationships. The genome sequence was reported: 39 Mb, 7 chromosomes, 10,082 ORFs (Galagan et al., 2003).

N. crassa has only 10% repetitive DNA, and incredibly, only 8 pairs of duplicated genes that encode proteins >100 amino acids. This is because Neurospora uses “repeat-induced point mutation” (RIP), a mechanism by which the genome is scanned for duplicated (repeated) sequences. This appears to serve as a genomic defense system, inactivating potentially harmful transposons.

Page 530

Schizosaccharomyces pombe

The S. pombe genome is 13.8 Mb and encodes ~4900 predicted proteins. Some bacterial genomes encode more proteins (e.g. Mesorhizobium loti with 6752, and Streptomyces coelicolor with 7825 genes).

Chromosome   Size      Genes   Coding
1            5.6 Mb    2,255   59%
2            4.4 Mb    1,790   58%
3            2.5 Mb      884   55%

Total        12.5 Mb   4,929   58%

See:
TIGR www.tigr.org
EBI www.sanger.ac.uk/Projects/S_pombe

Page 530
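The totals in the chromosome table can be reproduced from the per-chromosome rows. The sketch below is only an arithmetic check; the dictionary layout and variable names are illustrative.

```python
# Per-chromosome figures from the S. pombe table in the lecture.
chromosomes = {
    1: {"size_mb": 5.6, "genes": 2255, "coding_pct": 59},
    2: {"size_mb": 4.4, "genes": 1790, "coding_pct": 58},
    3: {"size_mb": 2.5, "genes": 884, "coding_pct": 55},
}

total_mb = sum(c["size_mb"] for c in chromosomes.values())
total_genes = sum(c["genes"] for c in chromosomes.values())

# Coding percentage for the whole genome, weighting each chromosome by its size.
weighted_coding_pct = sum(
    c["size_mb"] * c["coding_pct"] for c in chromosomes.values()
) / total_mb

kb_per_gene = total_mb * 1000 / total_genes

print(f"total size: {total_mb:.1f} Mb")
print(f"total genes: {total_genes}")
print(f"size-weighted coding fraction: {weighted_coding_pct:.0f}%")
print(f"average spacing: {kb_per_gene:.2f} kb per gene")
```

The sums match the Total row of the table, and the average spacing of roughly 2.5 kb per gene is a little sparser than the value of about 2 kb quoted for S. cerevisiae.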

Schizosaccharomyces pombe

S. pombe diverged from S. cerevisiae about 330 to 420 million years ago.

Many genes are as divergent between these two fungi as they are from humans.

To see this, try TaxPlot at NCBI.

Page 530

Perspective and pitfalls

The budding yeast S. cerevisiae is one of the most significant organisms in biology:
• Its genome was the first eukaryotic genome to be sequenced
• Its biology is simple relative to metazoans
• Through yeast genetics, powerful functional genomics approaches have been applied to study all yeast genes

It is important to note that even for yeast, our knowledge of basic biological questions is highly incomplete. We still understand little about how the genotype of an organism leads to its characteristic phenotype.

Page 531