


De novo genome sequencing of Skeletonema marinoi and Surirella brebissonii

Mats Töpel(1)*, Magnus Alm Rosenblad(1), Ulrika Lind(1), Susanna Gross(2), Sandra Karlsten(1), Jens Persson(1), Mattias Backman(1), Anna Godhe(2), Anders Blomberg(1)

1. Department of Chemistry and Molecular Biology, University of Gothenburg
2. Department of Biology and Environmental Sciences, University of Gothenburg

*mats@topel.se

Introduction

De novo whole genome sequencing of the two diatom species Surirella brebissonii (CCMP2919) and Skeletonema marinoi (GUMACC St54) is currently being conducted as part of the Linnaeus Centre for Marine Evolutionary Biology (CeMEB) initiative at the University of Gothenburg. This work is part of the Infrastructure for Marine Genetic model Organisms (IMAGO) project, which aims to develop new marine model systems and to provide genomic and genetic tools for studying vital phenomena and components of coastal marine ecosystems.

Protein translocation in diatoms

A chloroplast genome encodes only ~100 proteins, but the organelle requires many more proteins to perform its functions in the cell. The Translocons at the Outer and Inner Chloroplast envelope membranes (TOC and TIC, respectively) are the two multi-protein complexes in plants and in red and green algae that enable chloroplasts to import these essential nuclear-encoded proteins.

Diatom plastids, on the other hand, are surrounded by four membranes, the outermost of which is continuous with the endoplasmic reticulum (ER) [8]. The second membrane, known as the periplastid membrane (PPM), is the remnant of the secondary endosymbiont's plasma membrane (proposed to be of red algal origin) [9]. The two innermost membranes are homologous to the outer and inner envelope membranes of plant plastids and are derived from the membranes surrounding the cyanobiont of primary plastids [10].

The identity of the TOC and TIC translocons in diatoms and most other chromalveolate groups (e.g. brown algae, dinoflagellates and apicomplexan parasites) is largely unknown. However, bioinformatic analyses of whole genome sequences have shown that these systems are also present in diatoms. Bullmann et al. [11] reported the discovery of an Omp85 protein that is localized in the third outermost plastid membrane (homologous to the outer envelope membrane in plants) of the diatom Phaeodactylum tricornutum.

To date, this Omp85 protein is the only reported putative member of the TOC complex in diatoms. Our phylogenetic analyses (including bacterial, plant and diatom sequences) reveal that the diatom sequences are of red algal origin and, more specifically, belong to the Toc75 gene family. Unexpectedly long branches in the diatom part of the tree indicate a rapid, albeit even, evolutionary rate. This phenomenon has been reported previously, but its significance has not yet been thoroughly investigated.

Assembly statistics

DNA libraries (insert sizes 150 and 3000 bp), one of which was generated from an axenic culture, and one RNA library (300 bp) from Surirella brebissonii have been sequenced. One axenic 300 bp library from Skeletonema marinoi has been generated. Both genomes have been assembled using the CLC de novo assembler software package. Sequence reads were preprocessed using cutadapt [1] and the fastx toolkit [2] (for details see http://matstopel.se/notebook).

                                Skeletonema   Surirella
Total nt sequenced (Gb)                  33          46
Total input to assembly (Gb)             26          33
Assembly size (Mb)                       49         136
Number of contigs (K)                    53         244
Average coverage (x/contig)             443         207
N50 (bp)                               1673         694
Average contig length (bp)              929         557
Longest contig (Kb)                    506*          76

* Putative bacterial symbiont.
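For readers unfamiliar with the N50 and average contig length figures above, the following is a minimal Python sketch of how such statistics are commonly computed from a list of contig lengths. It is an illustration only: the function names and the toy contig lengths are hypothetical, and the values reported on this poster came from the assembly software, not from this code.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def mean_length(contig_lengths):
    """Return the arithmetic mean contig length."""
    lengths = list(contig_lengths)
    if not lengths:
        raise ValueError("no contigs given")
    return sum(lengths) / len(lengths)


# Hypothetical toy assembly (lengths in bp), not data from this project.
toy_contigs = [100, 200, 300, 400, 500]
print("N50:", n50(toy_contigs))
print("Mean length:", mean_length(toy_contigs))
```

As a rough consistency check on the table, assembly size divided by the number of contigs gives about 925 bp for Skeletonema (49 Mb / 53 K) and about 557 bp for Surirella (136 Mb / 244 K), close to the reported average contig lengths given that the inputs are rounded.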

Preliminary findings

Organelles: The plastid of Surirella brebissonii contains a group II intron with an ORF, the first group II intron to be identified in diatoms. Interestingly, the mitochondrial genome of S. brebissonii has lost the group II intron present in both Thalassiosira pseudonana and Phaeodactylum tricornutum mtDNA.

Three putative components of the Translocon at the Outer Chloroplast envelope membrane (TOC) have been identified in S. brebissonii and one in S. marinoi.

Cell wall: Six silicon transporter (SIT) genes have been predicted in the S. brebissonii genome, and two in S. marinoi. Bioinformatic analyses have also identified centric-specific and pennate-specific motifs in these sequences. Frustulin and silaffin/cingulin proteins, which are also involved in diatom cell wall biogenesis, have been identified in both genomes, and preliminary analyses have found novel motifs in these sequences.

Phylogenetic analysis of the Omp85 superfamily, which includes the Toc75 gene family of channel proteins. Preliminary analyses using data from Töpel et al. [4] as query sequences have identified at least one protein from the Omp85 superfamily in each of the genomes of Surirella brebissonii and Skeletonema marinoi. Identified contigs were translated in all six reading frames using the program getorf [5] and aligned to the query dataset using MAFFT [6]. Correct reading frames identified in this way were then used in BLAST searches of the publicly available diatom gene predictions, and the resulting sequences were subsequently analysed together using MrBayes 3.2 [7].
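The six-frame translation step mentioned above can be illustrated with a short, self-contained Python sketch. This is not getorf and does not reproduce its behaviour (getorf reports open reading frames and has its own options); it only shows the underlying idea of translating a contig in three forward and three reverse-complement frames using the standard genetic code. The function names, frame labels and the example sequence are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Hypothetical example sequence, not taken from either genome.
for label, protein in six_frame_translations("ATGGCCTAA").items():
    print(label, protein)
```

In a workflow like the one described in the caption, the frame whose translation aligns well to the query proteins would be taken as the correct reading frame.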

Surirella brebissonii is an asymmetric, pennate, benthic diatom that is approximately 45 µm long and mostly found in brackish water. It was selected for sequencing because of its rather large size and asymmetric form. It has long been used for studies of chromosome separation. Photo: Per Johander.

Skeletonema marinoi is a major primary producer during spring blooms in the North Atlantic and a valuable food source for zooplankton. Its generation time is 24 hours, which makes it ideal for studies of phenotypic response. Benthic cells act as resting stages, with up to 50 000 per gram of sediment, and can survive for at least a hundred years, thereby providing short-term evolutionary archives in sediments. Photo: Anna Godhe.

Evolutionary relationships between genera for which whole genome sequence (WGS) data are available. Although sparse (the number of diatom species has been estimated at ~200 000 [3]), these seven species constitute a broad phylogenetic sample from the diatom tree of life, covering many large morphological groups. Access to WGS data from either of the two groups Coscinodiscophycidae or Rhizosoleniophycidae would, however, significantly improve our understanding of diatom evolution by including the crown node of the group in the analyses. Tree modified from [12].

[Figure: diatom phylogeny. Tip labels: Coscinodiscophycidae, Fragilariopsis, Phaeodactylum, Pseudo-nitzschia, Rhizosoleniophycidae, Thalassiosira, Surirella, Skeletonema. Group labels: Radial Centrics, Bi(multi)polar Centrics, Raphid Pennates.]

The chloroplast protein translocation machinery in plants. The preprotein (black line) is first recognised by one of the TOC receptors (green) and subsequently transported through the Toc75 channel and the TIC complex to the chloroplast stroma. The identity of the TOC and TIC translocons in diatoms and most other chromalveolates is largely unknown. Numbers indicate the names of the proteins. Graphics: Paula Töpel.

[Figure: TOC and TIC complexes. Labels: TOC, TIC, OEM, IMS, IEM, Cytosol, Stroma, Hsp70, Hsp60, Hsp93, SPP, and numbered protein subunits.]

References

1. https://code.google.com/p/cutadapt/
2. http://hannonlab.cshl.edu/fastx_toolkit/
3. Bowler C., Vardi A., Allen A.E. (2010) Oceanographic and Biogeochemical Insights from Diatom Genomes. Annu. Rev. Mar. Sci. 2, 333–65.
4. Töpel M., Ling Q., Jarvis P. (2012) Neofunctionalization within the Omp85 protein superfamily during chloroplast evolution. Plant Signaling and Behaviour. 7:2.
5. http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
6. Katoh, Standley (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. & Evol. 30, 772-780.
7. Huelsenbeck J.P., Ronquist F. (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17(8), 754-755.
8. Gibbs S.P. (1981) The chloroplast endoplasmic reticulum: structure, function, and evolutionary significance. Int. Rev. Cytol. 72, 49–99.
9. Cavalier-Smith T. (2003) Genomic reduction and evolution of novel genetic membranes and protein-targeting machinery in eukaryote-eukaryote chimaeras (meta-algae). Philos. Trans. R. Soc. Lond. B. Biol. Sci. 358, 109–134.
10. Palmer J.D. (2003) The symbiotic birth and spread of plastids: how many times and whodunit? J. Phycol. 39, 1–9.
11. Bullmann L., Haarmann R., Mirus O., Bredemeier R., Hempel F., Maier U.G., Schleiff E. (2010) Filling the Gap, Evolutionarily Conserved Omp85 in Plastids of Chromalveolates. J. Biol. Chem. 285, 6848-6856.
12. Sorhannus U. (2004) Diatom phylogenetics inferred based on direct optimization of nuclear-encoded SSU rRNA sequences. Cladistics 20, 487–497.

[Figure: phylogenetic tree of the Omp85 superfamily. Scale bar: 0.4. Clade labels: Cyanobacteria, Plant OEP80, Diatom Toc75, Plant Toc75. Event markers: Primary endosymbiosis, Secondary endosymbiosis, Gene duplication. Tip labels include cyanobacterial, plant and green algal sequences, the red algae Cyanidioschyzon_merolae and Galdieria_sulphuraria, and the diatom sequences Surirella_toc75, Skeletonema, Thalassiosira_pseudonana, Thalassiosira_oceanica, Phaeodactylum, Phaeodactylum_2, Pseudo-nitzschia, Pseudo-nitzschia_2 and Fragilariopsis.]
