EU MARIE CURIE: LYNGBYA KENYA - PROJECT PIIF-GA-2011-299550
CYP-450 BIOSYNTHESIS OF LYNGBYA MAJUSCULA NATURAL PRODUCTS

FINAL REPORT

SCIENTIFIC COORDINATOR
Professor J GRANT BURGESS
School of Marine Science and Technology
Newcastle University
Armstrong Building, Queen Victoria Road
NE1 7RU
United Kingdom
Tel. +44 (0) 191 2226717
Fax. +44 (0) 191 2225491
Email. [email protected]

THE FELLOW
Dr THOMAS M DZEHA
Department of Chemistry and Biochemistry
Pwani University
P.O. Box 195-80108
Kilifi
Kenya
Tel. +254 (0) 788986063
Email. [email protected]

EXECUTIVE SUMMARY

A sustainable supply of marine natural products for potential therapeutics is one of the greatest challenges facing drug discovery efforts today, especially during clinical trials. Nearly 300 compounds with therapeutic potential have been isolated from the tropical marine cyanobacterium Lyngbya majuscula. However, there are considerable concerns regarding the real source of the large number of natural products attributed to L. majuscula. This project focused on the cytochrome P450 biosynthesis of L. majuscula natural products, namely the modular cyclodepsipeptides homodolastatin 16 (HMDS 16) and antanapeptin A (ANTAP A). L. majuscula was collected from Shimoni, Kenya, in April 2012. The aim was to establish whether these compounds originate from the cyanobacterium, from the bacteria cohabiting with it, or from both. Bacteria representing γ-proteobacteria, Firmicutes and α-proteobacteria were isolated from the cyanobacterium. Non-ribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) screens, carried out with bioinformatics tools on bacteria with known complete genome sequences, showed that the modular assembly lines of these bacteria are inconsistent with those of HMDS 16 and ANTAP A. However, Bacillus licheniformis and Marinobacterium stanieri putatively synthesise the β-amino acid dolamethleuline, and Klebsiella oxytoca is putatively involved in the biosynthesis of the unusual amino acid dolaphenvaline found in HMDS 16. Profiling for the cyclodepsipeptides by liquid chromatography-mass spectrometry (LCMS) confirmed the presence of HMDS 16, its analogue dolastatin 16 and ANTAP A in L. majuscula but not in the bacterial supernatants. These leads improved the prospects for obtaining the complete genome sequence of L. majuscula from its metagenome and for identifying the gene clusters encoding HMDS 16 and ANTAP A. Cloning and heterologous expression of the gene clusters for these especially important anticancer agents was the goal we aimed to attain by the end of the program.

TABLE OF CONTENTS

Summary
Table of contents
Key words and abbreviations
A SCIENTIFIC REPORT
A 1 Project context and objectives
A 2 Modular assembly of homodolastatin 16
A 3 Investigations of growth conditions for the Kenyan Lyngbya majuscula
A 3.1 Growth of L. majuscula in autoclaved seawater
A 3.2 Growth of Lyngbya in autoclaved seawater and antibiotics
A 3.3 Treatment of Lyngbya majuscula with antibiotics prior to culturing
A 3.4 Growth of Lyngbya in autoclaved seawater, BG11 and KNO3 (15 g L-1)
A 3.5 Growth of Lyngbya in autoclaved seawater, BG11, KNO3 and antibiotics
A 3.6 Summary of the results of bacteria isolates on Lyngbya majuscula
A 4 Isolation of Lyngbya majuscula filaments
A 4.1 The ecology of the Kenyan L. majuscula
A 4.2 Do dead L. majuscula filaments contain homodolastatin 16?
A 5 Bioinformatics strategies for identifying homodolastatin 16 gene clusters
A 6 Significant outcomes of the scientific project
A 6.1 Isolation of culturable bacteria co-habiting with L. majuscula
A 6.2 Phylogeny of bacterial isolates and related taxa
A 6.3 16S rDNA isolation and identification of L. majuscula and A. colombiense
A 6.4 Antibiotic resistant bacteria associated with L. majuscula filament
A 6.5 LCMS profiling of homodolastatin 16 in L. majuscula and epibiotic bacteria
A 6.6 Bioinformatics prediction for NRPS modular compounds
A 6.7 Putative biosynthesis of dolamethleuline and dolaphenvaline fragments
A 6.8 Molecular identification of the Kenyan "Lyngbya majuscula"
A 6.9 Analysis of sequence data and phylogeny of the Kenya L. majuscula
A 7 Manuscripts
B IMPACT
B 1 Seminars and guest lectures
B 2 International conferences and symposia
B 2.1 Federation of European Biochemical Societies (FEBS) 2013 conference at St. Petersburg, Russia from 6-12 July 2013
B 2.2 International Advanced Single Cell Biotechnology at Sheffield University, UK on 12 February 2014
B 3 Outstanding outcomes of the project
B 3.1 Grant Applications
B 3.2 Outreach program - Mentoring of Marine Biology students
C THE UNITED KINGDOM AND MY EU MARIE CURIE FELLOWSHIP
D ACKNOWLEDGEMENT
E REFERENCES
F ANNEX
Annex 1: Table 1 of Kenyan Lyngbya majuscula epibiotic bacteria (EB) isolates
Annex 2: Phylogeny of EB isolates
Annex 3: Adenylation domain for Moorea producens
Annex 4: MtaD-M1-Cys Biosynthesis
Annex 5: TycA-M1-D-/L-Phe Biosynthesis
Annex 6: List of PARSE HMM modular domains for M. producens

KEY WORDS: Lyngbya majuscula, Homodolastatin 16, Dolastatin 16, Antanapeptin A, Epibiotic bacteria,

16S rDNA, NRPS/PKS, gene cluster, modular assembly, LCMS, putative biosynthesis,

phylogeny, copper sulfate, molecular identification and differential DNA isolation

ABBREVIATIONS

A – Adenylation
ANTAP A – Antanapeptin A
AT – Acyl transferase
Cy – Cyclisation
DH – Dehydratase
DML – Dolamethleuline
DPV – Dolaphenvaline
ER – Enoyl reductase
HIV – Hydroxyisovalerate
HMDS 16 – Homodolastatin 16
KR – Ketoreductase
KS – Ketosynthase
LCMS – Liquid chromatography-mass spectrometry
M – Methylation
NMe-Ile – N-methylisoleucine
NRPS – Non-ribosomal peptide synthetase
Phe – Phenylalanine
PKS – Polyketide synthase
T – Thiolation
TE – Thioesterase

A SCIENTIFIC REPORT

A 1 Project context and objectives

The marine biotope has been identified as a large and rich area for exploration of biologically

active pharmaceuticals of use in medicine and in biotechnology. The expansive diversity in

form and function of the marine environment coupled with the unique adaptations of the

marine organisms therein and varied biosynthesis pathways compared with the terrestrial

world suggest that it is as yet an untapped resource. Nearly 15 marine natural products are

in various phases of clinical development, mainly in oncology, with several products already

on the market and with more on the way.1

Research over the last four decades has shown the filamentous marine cyanobacterium Lyngbya majuscula, of the order Oscillatoriales, to be a prolific source of a diverse range of modular natural products.2-4 Of the nearly 800 compounds isolated from marine cyanobacteria, nearly 300 come from L. majuscula.5 The natural products isolated from L. majuscula at pantropical locations worldwide include compounds with antimicrobial, anti-proliferative and anti-HIV activity, as well as compounds that have shown potential as anticancer agents.5,6 Useful L. majuscula natural products include the anticancer agent curacin A, the neurotoxic jamaicamides and the UV-sunscreen pigment scytonemin.7

Investigations into the biosynthesis of L. majuscula natural products have revealed gene clusters encoding modular, mixed polyketide synthase (PKS)/non-ribosomal peptide synthetase (NRPS) assembly lines that incorporate other functional groups through highly unusual mechanisms.3,4 Homodolastatin 16 (1) and antanapeptin A (2), both isolated from the Kenyan L. majuscula, exemplify such assemblies.2 The former is a mixed NRPS/PKS modular cyclodepsipeptide with moderate activity towards the oesophageal cancer cell lines WHCO1 and WHCO6 (IC50 values of 4.3 and 10.1 μg/mL respectively).2 Its analogue dolastatin 16 (3) shows very strong activity against lung (NCI-H460, GI50 0.00096 μg/mL), colon (KM20L2, GI50 0.0012 μg/mL), brain (SF-295, GI50 0.0052 μg/mL) and melanoma (SK-MEL-5, GI50 0.0033 μg/mL) cancer cell lines.8,9

[Structures: homodolastatin 16 (1, R = N-Me-Ile), antanapeptin A (2) and dolastatin 16 (3, R = N-Me-Val)]

The biological activities exhibited by homodolastatin 16 (1) and its analogue dolastatin 16 (3) towards cancer cell lines suggest that there is a need to obtain sustainable amounts of these modular compounds for further investigations, including structure-activity relationships and clinical testing. A sustainable supply of these cyclodepsipeptides could only be achieved through aquaculture of the organism, chemical synthesis, or recombinant biosynthesis in a source organism.10 A complex circadian rhythm associated with cyanobacteria rules out aquaculture as a possible alternative.11 Low enantiomeric excess (e.e.) yields and other refractory problems associated with chemical synthesis suggest that recombinant biosynthesis of these natural products is the ideal route to sustainability. Gene shuffling, domain deletions and mutations are recombinant biosynthesis engineering tools currently used for the identification and cloning of gene clusters for polyketides, NRPS and hybrid polyketide-NRP metabolites.12

In 2011, the genome of Lyngbya majuscula 3L, a Caribbean strain that produces the tubulin polymerization inhibitor curacin A and the molluscicide barbamide, was sequenced to near completion using Sanger and 454 sequencing approaches.3 Whereas the draft genome sequence revealed gene clusters for curacin A and barbamide, a mere 3% of the genes were dedicated to secondary metabolite production, biosynthesis, transport and catabolism.3 Questions therefore arise as to whether or not the natural products are strain specific, especially as taxonomic classifications have been mostly morphological. It is also clear from the draft genome sequence of L. majuscula 3L that the cyanobacterium not only lacks the nifH gene necessary for nitrogen fixation but also encodes a complex gene regulatory system for microbial association and environmental adaptation. Notably, there is a paucity of information on whether the compounds isolated from L. majuscula originate from the cyanobacterium, from the bacteria co-habiting with it, or from both, especially as efforts to obtain axenic cultures of L. majuscula have been mostly fruitless. Consequently, there are considerable concerns regarding the real source of the large number of natural products attributed to L. majuscula.

Given the biological activities of homodolastatin 16 (1) and its analogue dolastatin 16 (3) towards cancer cell lines, it was necessary to establish whether the bacteria or the non-axenic filamentous cyanobacterium produce the cyclodepsipeptides. We therefore report the 16S rDNA isolation and identification of bacteria cohabiting with the Kenyan L. majuscula, including bacteria found on the filament, and the LCMS profiling for homodolastatin 16 (1), antanapeptin A (2) and dolastatin 16 (3) in organic extracts of L. majuscula and in the supernatants of the bacterial isolates. In addition, the database genomes of bacteria closely associated with the Kenyan L. majuscula were probed for putative gene clusters of these modular compounds. Further, we report a new method for identifying non-axenic cyanobacteria. These findings have important implications for the understanding of symbiotic pathways in L. majuscula and for the recombinant biosynthesis of homodolastatin 16 (1) and its potent anticancer analogue dolastatin 16 (3).

SCIENTIFIC AND TECHNOLOGICAL RESULTS

A 2 Modular Assembly of homodolastatin 16

A thorough understanding of the modular assembly of homodolastatin 16 was essential for gaining insight into the biosynthetic pathway of the cyclodepsipeptide. The assembly could be inferred by examining the structure of the cyclodepsipeptide. The structure comprises three proline moieties, an N-methyl leucine, a hydroxyisovalerate (HIV) and two unusual beta-hydroxy amino acids, namely dolaphenvaline (DPV) and dolamethleuline (DML). We proposed the following putative modular assembly for homodolastatin 16:

Homodolastatin 16 is likely to use the following building blocks. The beta-methyl homo-Phe (DPV) probably arises from methylation of the alpha-keto acid, then transamination to beta-methyl Phe, followed by chain extension as in Ile biosynthesis to yield the homo skeleton. The lactate and HIV hydroxy acids may be activated by keto acid-recognizing A domains with an embedded downstream NADH-dependent dehydrogenase in the module. The DML residue almost certainly uses a hybrid NRPS-PKS module condensing Val and methylmalonyl-CoA. This PKS module should have KR/DH/ER domains to take the initially tethered, beta-keto, extended Val-Me-mal scaffold to the fully saturated one seen here. This has precedent in statine assembly. The gamma-amino group that results is the key indicator.
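The reasoning about the KR/DH/ER domains can be sketched in a few lines of Python. This is only an illustration of the general reductive loop of a PKS module (ketoreductase, then dehydratase, then enoyl reductase), not a model of the actual homodolastatin 16 gene cluster, which had not been identified; the function name and the state labels are assumptions made for this sketch.

```python
# Reductive PKS domains in the order in which they act on the beta-carbon.
# Each entry is (domain, state of the beta-carbon after that domain has acted).
REDUCTIVE_STEPS = [
    ("KR", "beta-hydroxy"),      # ketoreductase: beta-keto -> beta-hydroxy
    ("DH", "alpha,beta-enoyl"),  # dehydratase: loss of water gives a double bond
    ("ER", "saturated"),         # enoyl reductase: double bond -> saturated chain
]


def beta_carbon_state(domains):
    """Return the beta-carbon state produced by a module with these domains.

    Each reductive step acts on the product of the previous one, so the
    processing stops at the first domain that is missing from the module.
    """
    state = "beta-keto"  # state directly after the condensation step
    present = set(domains)
    for domain, product in REDUCTIVE_STEPS:
        if domain not in present:
            break
        state = product
    return state


print(beta_carbon_state(["KR", "DH", "ER"]))  # full reductive loop, as proposed for DML
print(beta_carbon_state(["KR"]))              # a KR-only module stops at the hydroxy stage
```

Under this simplified model, only a module carrying all three domains yields the fully saturated scaffold proposed for the DML residue.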

In our quest to further understand the modular assembly and to identify the gene clusters for homodolastatin 16, we looked for compounds with similar unusual beta-amino acid fragments and found the following reports: the isolation of kulokekahilide from the cephalaspidean mollusc Philinopsis speciosa and its structure elucidation;13 and the isolation of pitiprolamide from the cyanobacterium Lyngbya majuscula and its structure elucidation.14

A 3 Investigations of growth conditions for the Kenyan Lyngbya majuscula

Growth conditions for Lyngbya majuscula were investigated to establish how to render it cultivable. It should be noted that there has been controversy over whether or not the cyanobacterium possesses the nifH gene for nitrogen fixation. Lundgren et al. in 2003 reported the nifH gene for nitrogen fixation in L. majuscula on the basis of the acetylene reduction assay. Studies by Gerwick and co-workers on the near-complete genome of Moorea producens (L. majuscula) assert that the cyanobacterium does not contain nifH genes in its genome but is endowed with a substantial number of genes for microbial association. In our study we hypothesized that any nitrogen fixation observed in the acetylene reduction studies may have been due to nitrogen-fixing bacteria associated with L. majuscula. The following experiments were therefore designed to address the question of why it is rather difficult to grow L. majuscula under ordinary laboratory conditions.

A 3.1 Growth of L. majuscula in autoclaved seawater

Lyngbya majuscula was grown in autoclaved seawater, without any change of medium, from 5th March to 17th April 2013 (1 month 2 weeks) in an orbital shaker incubator (27 °C, mild constant light). The cyanobacterium appeared unhealthy and lacked the green pigmentation for photosynthesis. Isolation of bacteria from the Lyngbya mat grown under these conditions led to the following observations:

Two main bacterial species were isolated from the medium, namely the bright orange Shewanella algae sp. and the crusty, creamy Klebsiella oxytoca. The same species were recovered when bacteria were isolated from the L. majuscula mat. Re-culturing the cyanobacterium for a further two weeks (17th to 30th April) under the same conditions but in fresh autoclaved seawater resulted in further degeneration: it showed no growth prospects and no sign of the green pigment for photosynthesis. When the culture was transferred to the refrigerator (4 °C, 2 weeks) regeneration was observed, but when it was transferred to an open, well-lit environment severe deterioration followed. This confirmed the necessity of a light and dark regime for the effective growth of the cyanobacterium.

A 3.2 Growth of Lyngbya in autoclaved seawater and antibiotics

The aim was to make the cyanobacterium as axenic as possible and to investigate whether the absence of bacteria affected the growth of L. majuscula. A cocktail of antibiotics targeting both Gram-positive and Gram-negative bacteria was prepared, comprising penicillin, streptomycin and chloramphenicol, each at a concentration of 4 mg L-1. The antibiotic-treated contents were shaken vigorously to ensure uniform distribution and the cyanobacterium was grown (5th March to 17th April 2013) without any change of medium. Growth of L. majuscula improved slightly compared with autoclaved seawater alone, with some areas showing the green pigmentation for photosynthesis. On this basis we speculated that the bacteria that affect the cyanobacterium negatively were absent. Isolation of bacteria from the medium and from the cyanobacterium led to the following observations:

The bright orange Shewanella algae sp. and the bright yellow Pseudomonas stutzeri were present in the medium along with the glassy Pseudomonas putida, and the same species were recovered from the Lyngbya mat. However, the abundance of bacteria on the plate from the mat was much lower than on the plate from the medium.

A 3.3 Treatment of Lyngbya majuscula with antibiotics prior to culturing

Based on the above observations, the Lyngbya was next treated with the antibiotic cocktail outside the medium. The intent was to identify antibiotic-resistant bacteria and to establish whether the surviving bacteria carry non-ribosomal peptide synthetase (NRPS) genes for antibiotics. Treatment of the Lyngbya led to the following observations:

The yellow Pseudomonas stutzeri, the creamy Klebsiella oxytoca and the greenish Shewanella algae sp. consistently dominated the isolation. Mining of the KEGG genome database revealed that K. oxytoca possesses NRPS genes for rifamycin and related antibiotics.

A 3.4 Growth of Lyngbya in autoclaved seawater, BG11 and KNO3 (15 g L-1)

The North Sea water at Newcastle upon Tyne is poor in nitrogen and it was therefore considered necessary to supplement the autoclaved seawater with nitrate-nitrogen from KNO3 (15 g L-1). In common with other cyanobacteria, Lyngbya is generally acknowledged to grow under low phosphate-phosphorus conditions and it was therefore not grown with added phosphate. Growth from 17th to 30th April was better than in autoclaved seawater alone and comparable to that of the antibiotic-treated Lyngbya. Isolation of bacteria from the medium and from the Lyngbya mat demonstrated that P. stutzeri, S. algae sp. and K. oxytoca were the key bacteria closely associating with L. majuscula. The incorporation of KNO3 into the medium did not lead to the recruitment of any new bacteria.

A 3.5 Growth of Lyngbya in autoclaved seawater, BG11, KNO3 and antibiotics

The effect of antibiotics on the growth of Lyngbya in the above medium was determined. Growth of the cyanobacterium was comparable to that in autoclaved seawater, BG11 and KNO3 without antibiotics. Whereas Shewanella algae sp. was absent from the bacterial isolates from both the medium and the Lyngbya mat, P. stutzeri and K. oxytoca flourished.

Regardless of the regeneration due to KNO3 and/or the incorporation of the antibiotic cocktail, the cyanobacterium could not survive under constant light.

A 3.6 Summary of the results of bacteria isolates on Lyngbya majuscula

Treatment                                        Bacteria isolated                         Condition of Lyngbya
Autoclaved seawater                              Shewanella algae sp. (bright orange)      Poor
Autoclaved seawater and antibiotics              S. algae sp., P. stutzeri and K. oxytoca  Fair growth
Autoclaved seawater, BG11 and KNO3               P. stutzeri, S. algae sp. and K. oxytoca  Good growth
Autoclaved seawater, BG11, KNO3 and antibiotics  P. stutzeri and K. oxytoca                Good growth
Lyngbya and antibiotics only (no medium)         P. stutzeri and K. oxytoca                Good growth
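
The summary above can also be queried programmatically. The following minimal sketch encodes the isolates exactly as listed in the summary and uses a set intersection to find the species recovered from every antibiotic-treated condition; the dictionary keys and the helper name `common_isolates` are labels chosen for this illustration only.

```python
# Isolates per treatment, transcribed from the summary in section A 3.6.
isolates = {
    "autoclaved seawater": {"S. algae"},
    "autoclaved seawater + antibiotics": {"S. algae", "P. stutzeri", "K. oxytoca"},
    "autoclaved seawater + BG11 + KNO3": {"P. stutzeri", "S. algae", "K. oxytoca"},
    "autoclaved seawater + BG11 + KNO3 + antibiotics": {"P. stutzeri", "K. oxytoca"},
    "antibiotics only (no medium)": {"P. stutzeri", "K. oxytoca"},
}


def common_isolates(conditions):
    """Return the species isolated under every one of the given conditions."""
    return set.intersection(*(isolates[c] for c in conditions))


# Conditions in which the antibiotic cocktail was applied.
antibiotic_conditions = [c for c in isolates if "antibiotics" in c]

print(sorted(common_isolates(antibiotic_conditions)))  # ['K. oxytoca', 'P. stutzeri']
```

On the tabulated data, P. stutzeri and K. oxytoca are the only species present in all antibiotic-treated conditions, whereas no single species appears in all five rows.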

A 4 Isolation of Lyngbya majuscula filaments

Lyngbya majuscula was treated with cycloheximide (5 mg L-1) overnight to rid it of eukaryotic cells, protozoa and fungi, rinsed several times with filtered sterile seawater (12 times, 45 μm) and thereafter left overnight in phosphate-poor autoclaved seawater. The cyanobacterium was then submerged in phosphate buffered saline to detach the filaments. The filaments were thoroughly rinsed, pooled and weighed to afford 0.25 mg of biomass for DNA extraction and subsequent genome sequencing. In another experiment, bacteria were isolated from the filaments and plated on marine agar to determine the assemblage of culturable bacteria associated with the filaments outside the sheath of bacteria. These experiments were repeated for a near-dead Lyngbya mat. Bacteria isolated from the live and dead filaments were designated LFB and DFB respectively. Additional filaments from the live and near-dead Lyngbya were thoroughly washed and stained with acridine and nigrosin to establish the association of the filaments with heterotrophic bacteria.

The following observations were made:

A bacterial species with a red tinge at the centre, which could not be re-isolated as a single strain, predominated among the isolates from the filaments of both the live and the near-dead Lyngbya. A creamy bacterial species characteristic of K. oxytoca was also observed. PCR amplification of the red bacterium did not generate the expected 16S rDNA sequence.

A 4.1 The ecology of the Kenyan L. majuscula

In the quest to establish whether L. majuscula filaments could be entirely rid of bacteria, filaments prepared as described previously were stained with acridine to reveal the cyanobacteria-bacteria association on the filament (Fig. 1) and with nigrosin to study cell wall and DNA degradation of the filament by bacteria (Fig. 2).

Fig. 1. Bacteria on the surface of a live L. majuscula filament

Fig. 2. DNA (at the centre) and cell wall material (outside)

Despite treatment of the cyanobacterium with cycloheximide and several rinses of the filaments with phosphate buffered saline (PBS), microscopy (Fig. 1) and the isolation of the LFB and DFB bacteria from the filament surface revealed that bacteria are always associated with Lyngbya. This held whether the cyanobacterium was actively engaged in phototropism or on the verge of dying. It is widely acknowledged that oxygen is poisonous towards cyanobacteria, including L. majuscula. Bacteria utilise oxygen for respiration and cyanobacteria capture carbon dioxide for photosynthesis, creating an energy balance at the cyanobacteria-bacteria interface. For this reason it was not surprising that certain species of bacteria were found inside the cyanobacterial sheaths and on the filaments.

Fig. 3. A non-uniform near-dead filament. The green and the brown are the live and dead parts respectively of the filament.

Fig. 4. Bacteria embedded onto the surface of a dying filament. The congregation of live bacteria on the dead filament is shown by the red spots.

We further aimed to monitor the behavior of bacteria on the surface of an untreated filament that was left overnight to die. The bacteria did not enter the core of the filament to target the DNA material, but there was considerable loss of cell wall material (Fig. 3), leading to the speculation that some of the bacteria on the surface survive on organic carbon from the cyanobacterium. These findings were corroborated by the broken cell wall observed on a nigrosin-stained filament similarly left to die overnight.

The bacteria isolated from the sheath and from the filaments of L. majuscula are close relatives of human pathogenic bacteria, suggesting that they may not necessarily be pathogenic to it. These bacteria, especially K. oxytoca, have been shown through genome mining to possess the niki B gene for nikimycin and genes for rifamycin. This suggests that, within the consortium, some bacterial species supply chemical defense arsenals to the host substrate in addition to their other roles. However, it is unclear how the live filament retains its cylindrical shape despite the presence of a myriad of bacterial species, some of which are cellulose degraders, as exhibited by the bacteria on a dying filament (Fig. 4).

A 4.2 Do dead L. majuscula filaments contain homodolastatin 16?

We considered that comparing the homodolastatin 16 content of dead and live L. majuscula filaments would provide a clue to the role of surface bacteria in the biosynthesis of homodolastatin 16. The absence of the natural product from a dead filament would suggest that bacteria have a significant role in its biosynthesis. Cycloheximide-treated filaments, thoroughly cleaned with PBS, were left to die naturally in a sterile container (2 days, ambient temperature) and extracted with dichloromethane:methanol (2:1) according to the method of Gerwick and co-workers, with modifications. The LCMS profiles of the C-18 eluates of the extracts of these dead filaments compared favorably with those of live filaments.

A 5 Bioinformatics strategies for identifying homodolastatin 16 gene clusters

Our overall objective for the project was to identify the gene clusters encoding homodolastatin 16 for expression in a Saccharomyces cerevisiae host, in order to provide a sustainable supply of the anticancer cyclodepsipeptides homodolastatin 16 and antanapeptin A. We were aware that a near-complete genome of L. majuscula, since renamed Moorea producens 3L because of issues of identity, had been obtained in 2011 by Gerwick and co-workers at the Scripps Institution of Oceanography, La Jolla, San Diego, USA. We therefore sought to identify non-ribosomal peptide synthetase (NRPS) domains in the genome of M. producens as a starting point for understanding the homodolastatin 16 biosynthesis pathway.

A search of the near-complete genome of M. producens against the NRPase HMM from Pfam (see http://pfam.sanger.ac.uk/family/PF08415) found three hits, two of them above the cut-offs. These were identified as Pseudomonas pseudoalcaligenes CECT 5344, Klebsiella oxytoca KCTC 1686 and Stenotrophomonas maltophilia K279a. Incidentally, all three species of bacteria were isolated from the Kenyan L. majuscula in the current project. A BLAST search of the peptide sequence for M. producens (see the FASTA sequence in Annex 3) showed that the protein of 10623 amino acids comprises a total of 38 domains, of which four are adenylation domains. Two of these encode the MtaD-M1-Cys biosynthesis characteristic of myxothiazol synthetase, epothilone synthetase, bacitracin synthetase and yersiniabactin synthetase (Annex 4). Each of these NRPS natural products is synthesised from the DLYNLSLI modular assembly. The third domain has a DAWTVAAV modular assembly, predicting TycA-M1-D-/L-Phe biosynthesis (Annex 5). This binding pocket putatively expresses tyrocidine synthetase, gramicidin synthetase and bacitracin synthetase (Annex 5). The fourth adenylation domain was thought to comprise a hypothetical protein. An examination of the PARSE HMM hits for M. producens (Annex 6) revealed adenylation (A), thiolation (T), methylation (M), thioesterase (TE), enoyl reductase (ER), ketosynthase (KS), acyl transferase (AT), dehydratase (DH) and cyclisation (Cy) domains.
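
The way an adenylation-domain binding-pocket signature is matched to a predicted substrate can be illustrated with a short sketch. It uses only the two eight-residue signatures quoted above (DLYNLSLI and DAWTVAAV) and picks the closest one by counting mismatched positions. The mismatch cutoff of 2 and the function names are assumptions made for this example; the prediction tool actually used in the project may score signatures differently.

```python
# The two binding-pocket signatures quoted in the text and their predicted substrates.
KNOWN_SIGNATURES = {
    "DLYNLSLI": "Cys (MtaD-M1-like)",
    "DAWTVAAV": "D-/L-Phe (TycA-M1-like)",
}


def hamming(a, b):
    """Count the positions at which two equal-length signatures differ."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return sum(x != y for x, y in zip(a, b))


def predict_substrate(signature, max_mismatches=2):
    """Return (substrate, mismatches) for the closest known signature.

    Returns None when even the closest signature differs at more than
    max_mismatches positions (a hypothetical cutoff chosen for this sketch).
    """
    best = min(KNOWN_SIGNATURES, key=lambda known: hamming(signature, known))
    distance = hamming(signature, best)
    if distance > max_mismatches:
        return None
    return KNOWN_SIGNATURES[best], distance


print(predict_substrate("DLYNLSLI"))  # exact match to the Cys-type signature
print(predict_substrate("DAWTVAAA"))  # one mismatch from the Phe-type signature
print(predict_substrate("GGGGGGGG"))  # no close match, so None
```

A lookup of this kind only ranks a query against signatures that are already known, which is why an adenylation domain with no close match, such as the fourth domain described above, remains unassigned.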

A 6 Significant outcomes of the scientific project

A 6.1 Isolation of culturable bacteria co-habiting with L. majuscula

Direct streaking of bacterial isolates from the Kenyan L. majuscula biomass onto marine agar 2216 (10% w/v) consistently led to the isolation of colored colonies that were sub-cultured to obtain pure strains. The concentration of marine agar was varied between 1% and 10% to differentiate between bacteria growing in poor and rich nutrient media respectively. Pseudomonas stutzeri (yellow) was isolated by sub-culturing colonies embedded in Enterobacter cloacae (creamy), a mixture that was characterized by a green pigment. The bacterial isolates showed diverse morphologies after overnight incubation, including tiny colonies of Bacillus subtilis, glassy colonies of Pseudomonas putida and large soft colonies of Shewanella algae (purple) and Pseudomonas stutzeri (yellow). Hard colonies were observed for Bacillus licheniformis (red). The bacteria identified by 16S rDNA are listed in Table 1. Nearly 70% of all isolates were γ-proteobacteria. Firmicutes were isolated in reasonable numbers (17%), whereas α-proteobacteria and Actinobacteria were minimal.

A 6.2 Phylogeny of bacterial isolates and related taxa

The evolutionary history of the bacterial isolates from L. majuscula, investigated alongside other taxa of pathogenic bacteria and cyanobacteria using the Maximum Parsimony method, shows the α-proteobacterium Ochrobactrum anthropi, Aminobacterium colombiense and the actinobacterium Cellulosimicrobium cellulans to be close relatives of L. majuscula (Annex 2). The mycobacterium A. colombiense was isolated from L. majuscula gDNA as a consequence of cross-reaction with the cyanobacterial primers. Surprisingly, Pseudoalteromonas carrageenovora, isolated from the L. majuscula filament, appears to have no close relationship to the cyanobacterium. This finding is corroborated by Klebsiella oxytoca, Shewanella algae species and Pseudomonas stutzeri, which associate closely with L. majuscula. Pseudomonas pseudoalcaligenes was closely related to the known pathogen Pseudomonas tolaasii, and Enterobacter cloacae to the pathogen Pseudomonas fluorescens. Klebsiella oxytoca was closely related to Enterobacter cancerogenus and Yokenella regensburgei. The Firmicutes were all fairly closely related to each other.
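
The Maximum Parsimony method mentioned above prefers the tree that requires the fewest character changes. A minimal sketch of that scoring, using Fitch's small-parsimony algorithm, is given below. The four taxa, their four-base sequences and the two candidate trees are invented toy data and are not taken from the project or from Annex 2; the software actually used for the phylogeny is not stated here.

```python
def fitch(tree, site):
    """Return (possible ancestral states, number of changes) for one column.

    tree is either a leaf name (str) or a (left, right) tuple of subtrees;
    site maps each leaf name to its character in this alignment column.
    """
    if isinstance(tree, str):
        return {site[tree]}, 0
    left_states, left_cost = fitch(tree[0], site)
    right_states, right_cost = fitch(tree[1], site)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_states | right_states, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch change counts over all columns of the alignment."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


# Hypothetical toy alignment: four taxa, four aligned positions.
alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCCA"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_score(tree_1, alignment))  # 3
print(parsimony_score(tree_2, alignment))  # 4
```

In this toy case the first tree needs fewer changes and would be the one preferred under maximum parsimony.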

A 6.3 16S rDNA isolation and identification of L. majuscula and A. colombiense

The appearance and morphology of the L. majuscula was consistent with the earlier

identification of the homodolastatin 16 producing strain.2 16S rDNA was used to confirm the

identity of the cyanobacterium. In order to achieve quality genomic DNA for the identification,

L. majuscula in filtered (22 μm) autoclaved seawater poor in phosphate phosphorus was

treated with cycloheximide (4 mg mL-1, 12 hrs) to rid it of eukaryotic organisms and

thereafter left submerged in phosphate buffered saline (PBS, pH 7.4, overnight) to detach

the filaments and to remove extracellular polysaccharides.15 The filament remained

associated with bacteria even after several attempts to wash it with PBS and milliQ water

(Fig. 1). Species identification under the microscope was not possible.

Bacteria on the surface of L. majuscula were killed by exposure of the cyanobacterium to

copper sulfate pentahydrate (5 min, 10 min, 30 min and 60 min) prior to weighing aliquots

(0.5g) for DNA extraction. Most bacteria were dead within minutes whilst a few embedded

themselves into the L. majuscula filament tissue (Fig. 3). Conventional genomic DNA

extraction kits proved inadequate for the cyanobacteria. The addition of lysozyme (50

mg/mL), SDS, RNase and proteinase K in a power bead tube prior to homogenization

lyophilized the cyanobacterium and aided lysis of the L. majuscula cells. Specifically, the SDS

removed lipopolysaccharide.15 High molecular weight DNA (8 kb) of L. majuscula was

subsequently extracted with phenol:chloroform:isoamyl alcohol (25:24:1) to afford DNA with

260/280 and 260/230 ratios of between 1.8 and 2.0 and between 1.7 and 2.0 respectively, as

measured by spectrophotometry on the NanoDrop, allowing for 16S rDNA identification and

complete genome sequencing. The 16S rDNA of the copper sulfate exposed L. majuscula

isolates matched that of A. colombiense at 89% identity.
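
The purity criterion quoted above can be expressed as a simple check. The following is a minimal sketch in Python, assuming the ratios meant are the standard A260/A280 and A260/A230 absorbance ratios and treating the quoted ranges as acceptance windows. The function names and the example readings are hypothetical and are not taken from the reported protocol.

```python
def purity_ratios(a230, a260, a280):
    """Return the (A260/A280, A260/A230) ratios from raw absorbance readings."""
    return a260 / a280, a260 / a230


def passes_purity(a230, a260, a280,
                  range_260_280=(1.8, 2.0),
                  range_260_230=(1.7, 2.0)):
    """True if both ratios fall inside the windows quoted in the text."""
    r280, r230 = purity_ratios(a230, a260, a280)
    return (range_260_280[0] <= r280 <= range_260_280[1]
            and range_260_230[0] <= r230 <= range_260_230[1])


# Hypothetical readings, for illustration only (not measured values from this study).
clean = passes_purity(a230=0.50, a260=0.95, a280=0.50)
protein_carryover = passes_purity(a230=0.50, a260=0.95, a280=0.70)
print(clean, protein_carryover)
```

A low A260/A280 value, as in the second hypothetical reading, is commonly taken to indicate protein or phenol carry-over, which is why both ratios are checked together.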

A 6.4 Antibiotic resistant bacteria associated with the L. majuscula and the filament

In other experiments, L. majuscula was treated with a cocktail of the antibiotics ampicillin,

chloramphenicol and streptomycin (4 mg mL-1 each) in growth culture to identify drug resistant

bacteria in the consortia likely to offer protection to the cyanobacteria against bacterial

infection. It was established that P. stutzeri (yellow), S. algae (pink to orange) and K. oxytoca

(cream) resisted the antibiotic cocktail treatment. The resistance towards antibiotics was

corroborated by the presence of the multidrug efflux system protein EmrA

[tr:G8WAU0_KLEOK], the K03543 multidrug resistance protein A and the beta lactamase

peptidoglycan glycosyltransferase gene clusters in the Klebsiella oxytoca KCTC 1686

genome.16,17 Similarly, the multidrug resistance (MDR) efflux pump F2N102_PSEU6,

encoded by the TbtABM operon, was observed in the genome of the nitrogen fixing P. stutzeri.

Replating of detached L. majuscula filaments onto marine agar 2216 (10% w/v) identified

Pseudoalteromonas carrageenovora and Ochrobactrum anthropi as the bacteria found on

the filament. However, replating of L. majuscula specimens treated with copper sulfate

pentahydrate onto the agar did not result in any observed cultures of bacteria but instead

showed tiny specks of fragmented cells. It was also reasoned that cyanobacteria filaments left

to die would be susceptible to cell wall destruction by bacteria living on the surface. To

investigate this, the filament was left to die on a microscope coverslip (48 hr) and thereafter

observed under the microscope with nigrosin stain. Nigrosin stains the DNA and cell wall

material of bacteria and cyanobacteria blue. Whereas bacteria were present in a disfigured

(non-cylindrical) filament, there were no indications of the cell wall material having been

broken down by the bacteria during the 48 hours of decay.

A 6.5 LCMS profiling of homodolastatin 16 in L. majuscula and epibiotic bacteria

Organic extracts of freeze-dried supernatants of epibiotic bacteria (EB) were investigated for

the presence of homodolastatin 16 (1) following glass fibre (GFF 44μ) filtration and C18

purification. The 2:1 dichloromethane/methanol eluates from the C18 column were evaporated on

a rotary evaporator (23 °C) and dried with nitrogen under vacuum. Extracts were yellow in

color. L. majuscula extracts were treated similarly. TD-Lyng chl was the first

Lyngbya extract fraction to elute from the C-18 column and exhibited an intense

pigmentation of chlorophyll. TD-Lyngbia eluted immediately after TD-Lyng chl. Both TD-

Lyngbia and TD-Lyng chl fractions were observed to have similar chromatograms on the

gradient elution.

Fig. 5. Low resolution LCMS chromatograms. The times shown here differ from the

high resolution values reported in the text (not shown here).

The molecular ion for dolastatin 15 (m/z 837.9050) consistent with the molecular formula

C45H69O9N6 was used as a standard. Dolastatin 15 eluted after 8.56 minutes.

Homodolastatin 16 (1), antanapeptin A (2) and dolastatin 16 (3) eluted at 12.06, 12.53

and 11.39 minutes respectively. The peak at 10.42 minutes is a contaminant from the

column unrelated to the extracts. The molecular ion m/z 915.5178 [M + Na]+ was consistent

with the molecular formula C48H72O10N6 for homodolastatin 16 (theoretical mass m/z

915.5202) whereas m/z 759.4290 [M + Na]+ corresponded with the molecular formula

C41H60N4O8 for antanapeptin A (theoretical mass m/z 759.4314) that was previously isolated

from L. majuscula along with homodolastatin 16 (1).2 The minor metabolite with the molecular

ion m/z 901.5041 [M + Na]+ in the chromatogram is consistent with the molecular formula

C47H70O10N6 for the potent anticancer agent dolastatin 16 (3) (theoretical mass m/z 901.5046)

and differs from homodolastatin 16 (1) by a methylene group.
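
The theoretical sodium adduct masses quoted above follow from summing monoisotopic atomic masses. The following is a minimal sketch in Python of that arithmetic, not the software used in the study. It assumes singly charged [M + Na]+ ions, uses rounded monoisotopic masses, and assumes dolastatin 16 contains six nitrogen atoms like its homologue (C47H70N6O10). The function and variable names are illustrative.

```python
# Monoisotopic masses (u) of the most abundant isotopes, rounded.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "Na": 22.9897693}
ELECTRON = 0.00054858  # mass of the electron removed on ionisation


def sodium_adduct_mz(formula):
    """m/z of the singly charged [M + Na]+ ion for an elemental composition."""
    neutral = sum(MONO[element] * count for element, count in formula.items())
    return neutral + MONO["Na"] - ELECTRON


def ppm_error(observed, theoretical):
    """Mass error of an observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


homodolastatin_16 = {"C": 48, "H": 72, "N": 6, "O": 10}
dolastatin_16 = {"C": 47, "H": 70, "N": 6, "O": 10}  # N6 is an assumption

mz_homo = sodium_adduct_mz(homodolastatin_16)
mz_dol = sodium_adduct_mz(dolastatin_16)
print(round(mz_homo, 4), round(ppm_error(915.5178, mz_homo), 1))
print(round(mz_dol, 4), round(ppm_error(901.5041, mz_dol), 1))
```

The difference between the two computed values is the mass of one CH2 unit, which is the methylene difference noted in the text.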

Examination of the chromatograms and TOF MS ESI+ spectra for the epibiotic bacteria

isolates did not show any matches for homodolastatin 16 (1) or its analogue dolastatin 16

(3). Nor was any signal observed matching that of antanapeptin A (2).

Representative spectra for the γ-proteobacteria (Enterobacter cancerogenus, Pseudoalteromonas

carrageenovora, Pseudomonas pseudoalcaligenes, Yokenella regensburgei, Klebsiella

oxytoca), Firmicutes (Staphylococcus saprophyticus), and α-proteobacteria (Ochrobactrum

anthropi) are presented here (Fig. 5). These spectra also accounted for the bacteria closely

associated with L. majuscula and those found on the filament of the cyanobacteria.

A 6.6 Bioinformatics prediction for NRPS modular compounds

The absence of homodolastatin 16 (1), antanapeptin A (2) and dolastatin 16 (3) in the

bacteria isolates prompted an investigation of whether this outcome was consistent with

bioinformatics driven prediction. To ascertain if the cyclodepsipeptides found in the Kenyan

L. majuscula had templates in M. producens, NRPS adenylation scaffolds of 11 prokaryotic

microorganisms encoding the AMP-C family developed from orphan-proline genes were

searched by BLAST against the M. producens genome.16 Long chain fatty acid-CoA ligases (5853 bp,

1951 aa, 462.e-130) for PKS were found in the M. producens genome. Similar BLAST searches on

A. colombiense returned a long chain fatty acid-CoA ligase, but comprising fewer base pairs and amino

acids (1512 bp, 504 aa, 157.e-38) compared with the cyanobacteria. NRPS gene clusters

were found in P. putida encoding pyoverdin siderophore biosynthesis (10413 bp, 3471

aa, 410.e-114), and in K. oxytoca for the siderophore enterobactin synthetase (3882 bp, 1294 aa,

840.e0) and yersiniabactin (6099 bp, 2033 aa, 233.e-61).
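
The hit lengths reported above can be cross-checked with codon arithmetic, since each amino acid is encoded by three nucleotides. The following is a minimal sketch in Python using only the base pair and amino acid counts given in the text. The dictionary labels and the function name are illustrative, and the check assumes the reported lengths exclude the stop codon, which the report does not state.

```python
# (base pairs, amino acids) as reported in the text for each BLAST hit.
hits = {
    "M. producens long chain fatty acid-CoA ligase": (5853, 1951),
    "A. colombiense long chain fatty acid hit": (1512, 504),
    "P. putida pyoverdin biosynthesis": (10413, 3471),
    "K. oxytoca enterobactin synthetase": (3882, 1294),
    "K. oxytoca yersiniabactin": (6099, 2033),
}


def lengths_consistent(base_pairs, amino_acids):
    """True if the nucleotide length equals three times the protein length."""
    return base_pairs == 3 * amino_acids


for name, (bp, aa) in hits.items():
    print(f"{name}: {bp} bp, {aa} aa, consistent={lengths_consistent(bp, aa)}")
```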

A 6.7 Putative biosynthesis of dolamethylleuine and dolaphenvaline fragments

Dolamethylleuine (Dml) is a fragment in homodolastatin 16 (1) and dolastatin 16 (3) whereas

dolaphenvaline (Dpv) has been observed in both 1 and 3, in kulokekahilide,13 and in

pitiprolamide.14 Valine in step i (Scheme 1) undergoes degradation into fatty acid

biosynthesis via isobutyryl-CoA utilising the enzymes valine dehydrogenase (EC: 1.4.1.23)

and the branched chain amino-acid aminotransferase (EC: 2.6.1.42).

Scheme 1

The dehydrogenation in step ii involves 2-oxoisovalerate dehydrogenase E1 component,

alpha subunit (EC: 1.2.4.4) and 2-oxoisovalerate dehydrogenase E2 component

(dihydrolipoyl transacylase) (EC: 2.3.1.168). These are accompanied by dihydrolipoamide

dehydrogenase (EC: 1.8.1.4) for co-factor recycling and 2-oxoisovalerate ferredoxin

oxidoreductase alpha subunit (EC: 1.2.7.7). The dolamethylleuine β-amino acid is afforded

through a polyketide/fatty acid extension with methylmalonyl-CoA and β-transaminase in

step iv. Bacillus licheniformis, Marinobacterium stanieri, Shewanella sp. and Pseudomonas

putida isolated from the Kenyan L. majuscula can putatively synthesize the 2-oxoisovalerate

dehydrogenase E1 component, alpha subunit. The accession numbers of the respective

proteins of these bacteria are WP_016885941.1, WP_010322356.1, WP_011622791.1 and

WP_010955110.1.

Dpv is synthesized via a benzoyl-CoA biosynthesis into the phenol intermediate of a final

transaminase component by an enzyme with aldolase functionalities similar to those of

Nikkomycin B (NikB, Scheme 2). The nikB gene has been observed in

Streptomyces. In this study it was found in the genomes of Klebsiella oxytoca

(WP_004134764.1) and Pseudomonas putida (NP_745483.1). The gene cluster was absent

from a nikB BLAST search of the genome of the cyanobacterium M. producens.

Scheme 2

The putative presence of the genes for the Dml and Dpv fragments in bacteria and their

absence in M. producens suggested that there could be a symbiotic relationship in bacteria-

L. majuscula consortia. This led to the isolation and identification of culturable bacteria from

L. majuscula.

A 6.8 Molecular identification of the Kenyan “Lyngbya majuscula”

The failure to identify homodolastatin 16 and antanapeptin A genes in the Moorea

producens genome raised concerns about the identity of the Kenyan marine cyanobacterium.

The Kenyan L. majuscula had only been identified morphologically, consistent with other

species collected worldwide in pantropical geographical locations. Morphological identification

is limited and unreliable because of the immense diversity of the Oscillatoriales. Molecular

identification is highly accurate and specific as it is based on the genomic content of the

organism. Whereas the technique works efficiently for axenic species, molecular

identification of non-axenic cyanobacteria is especially difficult due to the presence of

bacteria and other microorganisms that complicate genomic DNA isolation. Presently the

identification of non-axenic cyanobacteria mostly utilises the multiple displacement

amplification (MDA) method which has only a limited total genomic coverage.

Various approaches for obtaining axenic cultures of cyanobacteria are well documented

including treatment of cyanobacteria cultures with toxic chemicals and mechanical

separations.17 However these methods do not elaborate on how to isolate genomic DNA

from non-axenic strains. In this study the Kenyan L. majuscula was treated with toxic copper

sulfate (CuSO4.5H2O) at different time intervals (0, 5 min, 15 min, 30 min, 60 min) with

intermittent mechanical separation prior to DNA extraction. Controls in which the toxic

chemical was not applied were used. Freeze drying of the samples in liquid nitrogen followed

by periodical thawing and sonication (30% pulsar, 10 min maximal amplitude) removed

residual bacteria.18 For obtaining genomic DNA of the cyanobacterium, homogenized L.

majuscula pellets were exhaustively extracted for bacteria genomic DNA. The resulting

bacteria DNA was of mixed species and did not therefore generate a 16S rDNA sequence.

Surprisingly, gDNA isolation of the residue, largely comprising cyanobacteria, provided

quality 16S rDNA sequences with 260/280 and 260/230 ratios of between 1.90 and 2.29 in

the Qubit assay. Whereas both controls and copper sulfate treated samples

generated 16S rDNA sequences, sequences of the latter had too few nucleotide bases to

afford sufficient coverage for a complete and/or draft genome sequence. These observations

were corroborated by degraded DNA for the copper treated samples on an electrophoresis

gel (Fig. 6 below) and deformed morphologies of the L. majuscula filament (Fig. 7 below).

Fig. 6. Lane 1: Gene ruler DNA ladder mix; Lane 2: TD01 (control); Lane 3: TD Conv (Supernatant treated with CuSO4.5H2O); Lane 4: TD Res (Residue

treated with CuSO4.5H2O); Lane 5: Lambda DNA/Hind III marker.

Fig. 7. Lyngbya filament treated with CuSO4.5H2O for 15 minutes (left) and for

60 minutes (right)

A 6.9 Analysis of sequence data and phylogeny of the Kenyan L. majuscula

Differential DNA isolation of the homogenised Kenyan L. majuscula commenced with a

sample treated with toxic copper sulfate, generating 16S rDNA sequences with matches for

Aminobacterium colombiense at 89% identity and 94% coverage. The control in which the

cyanobacterium was not treated with copper sulfate did not generate any sequence. We

made the assumption that there may have been cross reaction with the cyanobacteria

primers CYA 106F (CGG ACG GGT GAG TAA CGC GTG A) and CYA 781R (GAC TAC

TGG GGT ATC TAA TCC CAT T) during amplification. Mismatches arising from the primer

CYA 106F are not unusual.19 Arguably the low % identity was questionable. Furthermore,

only small concentrations of the DNA were obtained on isolation. Treatment with copper

sulfate of a cyanobacterium residue rather than the usual supernatant of a TE buffered

solution resulted in 16S rDNA sequences matching that of Cylindrospermum stagnale

at 85% identity with 100% coverage and 88% identity with 95% coverage for different

aliquots.
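
The primer cross-reaction discussed above comes down to how many positions of a primer fail to pair with a candidate binding site. The following is a minimal sketch in Python of basic primer checks, using the two primer sequences quoted in the text. The helper names are illustrative, and the mismatched binding site is hypothetical, constructed only to show the counting. It is not a sequence from this study.

```python
# Primer sequences as quoted in the text, with the spacing removed.
CYA106F = "CGG ACG GGT GAG TAA CGC GTG A".replace(" ", "")
CYA781R = "GAC TAC TGG GGT ATC TAA TCC CAT T".replace(" ", "")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def count_mismatches(primer, site):
    """Number of positions at which a primer differs from an equal-length site."""
    if len(primer) != len(site):
        raise ValueError("primer and site must be the same length")
    return sum(1 for p, s in zip(primer, site) if p != s)


# Hypothetical binding site: the forward primer with its final base changed (A to G).
hypothetical_site = CYA106F[:-1] + "G"

print(len(CYA106F), round(gc_fraction(CYA106F), 2))
print(len(CYA781R), round(gc_fraction(CYA781R), 2))
print(count_mismatches(CYA106F, hypothetical_site))
```

In practice the reverse primer would be compared against the reverse complement of the template strand, which is what `reverse_complement` is provided for.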

With this uncertainty the method was tested against the known L. majuscula CCAP 1446/4

strain from the Culture Collection at Oban, Scotland. Both the supernatant and the residue

confirmed the identity of the strain with 100% identity at 100% coverage in 6 replicates. Still,

a major difficulty was obtaining non-degraded DNA of a quality suited for draft and/or

complete genome sequencing. The isolation of soil bacteria DNA assumes exhaustive

extraction of bacteria DNA with the residual soil and humic substances as substrate for the

bacteria. By analogy, in this study the cyanobacteria residue was the substrate comprising

the bulk of the cyanobacteria genomic DNA material. Exhaustive extraction of bacteria

genomic DNA from the Kenyan L. majuscula followed by genomic DNA isolation of the

residue with copper sulfate and controls respectively generated 16S rDNA sequences with

sufficient nucleotides for a blast. Whereas the copper sulfate treated DNA was degraded as

observed in an electrophoresis gel (Fig. 6 Lanes 3 and 4 respectively), the control (Fig. 6

Lane 2) was not and was of good quality to generate an assembly library for a draft genome.

An NCBI BLAST search of the generated sequence of the Kenyan marine cyanobacterium, without

restricting the organism, matched the sequence at 99% identity with an uncultured

Aminanaerobia bacterium. 16S rDNA fragments of this organism had up to 100% identity

with the aforesaid uncultured Aminanaerobia bacterium. These observations were

consistent across all the 16S rDNA sequences obtained from the CuSO4.5H2O extractions at 0,

5 min, 10 min and 30 min. CuSO4.5H2O was found to fragment the cyanobacteria genomic

DNA. With the BLAST search restricted to cyanobacteria, all the sequences matched an

uncultured cyanobacterium. 16S rDNA sequences obtained from the axenic

Lyngbya majuscula strain CCAP 1446/4 by the aforesaid method did not show any matches

to Aminanaerobia bacteria but instead consistently matched L. majuscula at 100% identity. A

phylogeny carried out for the Kenyan “Lyngbya majuscula” showed quite distant relations

with L. majuscula CCAP 1446/4 and its clones.

The goal of the project was to identify gene clusters encoding the anticancer

homodolastatin 16, antanapeptin A and the potent anticancer dolastatin 16 originally isolated

from a Papua New Guinea sea hare; and to carry out the expression and recombinant

biosynthesis of the anticancer compounds in a heterologous Saccharomyces cerevisiae

system. Work on the draft genome of the Kenyan marine cyanobacterium is ongoing at

Aberystwyth University in collaboration with Dr Justin Pachebat. The expected draft

genome of the Kenyan cyanobacterium should reveal the true identity and nature of the

organism.

A 7 Manuscripts

On manuscript preparation, we are shortly due to submit the manuscript ‘Bacteria living on

marine cyanobacteria utilise biofilm exopolysaccharides desiccation and avoidance to resist

UV irradiance’ to Photochemistry and photobiology C journal of Japan. Currently the

manuscript is at the proofreading stage. I have also drafted a manuscript “Differential DNA

isolation as a novel method for identifying non-axenic cyanobacteria” based on a novel

technique that is likely to replace the multiple displacement amplification currently used to

obtain the draft genome of non-axenic cyanobacteria. A key feature of the publication shall

be the observation that molecular identification of the Kenyan L. majuscula is not consistent

with the morphological identification previously done by Mirjam Girt of the Oregon State

University in 2003. The manuscript is for submission to the Proceedings of the National

Academy of Sciences (PNAS) journal of the USA for publication.

Separately we shall soon publish the draft genome of the Kenyan “Lyngbya majuscula” in

the Journal of Microbiology. Additionally, our confirmatory LC/MS results and draft genome

data shall strengthen our resolve to publish our findings on the source of the anticancer

homodolastatin 16, dolastatin 16 and antanapeptin A in Nature Biotechnology.

B IMPACT

B 1 Seminars and guest lectures

With regard to seminars and guest lectures, I gave the ‘Cyanobacteria-bacteria interactions’

lecture for the MST3011 Marine Microbiology Mini-Module at Newcastle University in 2012

and “The discovery of novel pharmaceutically relevant natural products from marine

cyanobacteria”; and “The future prospects for biodegradable resins from marine

cyanobacteria” for the 2013 Marine Biology group. Feedback from the Marine Biology

Research students was quite good at an average 9/10. I also presented a talk entitled

‘Biosynthesis of the anticancer cyclohexadepsipeptide homodolastatin 16’ to Professor Ian

Head’s research group at Newcastle University and externally I was invited to give a talk on

the isolation and biosynthesis of the modular anticancer cyclohexadepsipeptide to the

Research Group of Professor Rebecca Goss at University of St Andrews, Scotland, UK in

April 2013. On outreach I was a guest speaker at the EU Marie Curie conference held at

Durham University in May 2013, a seminar that was organized for North East England.

B 2 International conferences and symposia

B 2.1 Federation of European Biochemical Societies (FEBS) 2013 conference at St

Petersburg, Russia from 6-12 July 2013.

With regard to international conferences, I was invited to present my talk “Biosynthesis of the

modular anticancer cyclohexadepsipeptide homodolastatin 16” at the Federation of

European Biochemical Societies (FEBS) 2013 conference at St Petersburg, Russia from 6-12

July 2013. This was an especially prestigious ‘Mechanisms in Biology’ conference which

11 Nobel laureates, comprising 7 in Chemistry and 4 in Medicine or Physiology, attended.

They included:

1. Sidney Altman (USA) who won the Nobel Prize in Chemistry in 1989 “for the discovery

of catalytic properties of RNA” together with Nobel laureate Thomas Cech.

2. Nobel laureate Aaron Ciechanover (Israel) who was awarded the Nobel Prize in

Chemistry in 2004 “for the discovery of ubiquitin-mediated protein degradation”

together with Nobel laureates Avram Hershko and Irwin Rose.

3. Nobel laureate Jules Hoffmann (France) who won the Nobel Prize in Medicine or

Physiology together with Nobel laureates Bruce Beutler and Ralph Steinman in 2011

“for their discoveries concerning the activation of innate immunity”.

4. Nobel laureate Robert Huber (Germany) who together with Johann Deisenhofer and

Hartmut Michel was awarded the Nobel Prize in Chemistry in 1988 “for the

determination of the three-dimensional structure of a photosynthetic reaction centre”.

5. Nobel laureate Roger Kornberg (USA) who was awarded the Nobel Prize in

Chemistry in 2006 “for his studies of the molecular basis of eukaryotic transcription”.

6. Nobel laureate Jean-Marie Lehn (France) who together with Donald Cram and

Charles Pedersen won the Nobel Prize in Chemistry in 1987 “for their development

and use of molecules with structure-specific interactions of high selectivity”.

7. Nobel laureate Richard Roberts (UK) who together with Phillip Sharp was awarded

the Nobel Prize in Medicine or Physiology in 1993 “for their discovery of split genes”.

8. Nobel laureate Jack Szostak (USA) who together with Elizabeth Blackburn and Carol

Greider was awarded the Nobel Prize in Medicine or Physiology in 2009 “for the

discovery of how chromosomes are protected by telomeres and the enzyme

telomerase”.

9. Nobel laureate Susumu Tonegawa (Japan) who won the Nobel Prize in Medicine or

Physiology “for his discovery of the genetic principle for generation of antibody

diversity”.

10. Nobel laureate Kurt Wüthrich (Switzerland, USA) who was awarded the Nobel Prize

in Chemistry together with John Fenn and Koichi Tanaka in 2002 “for his

development of nuclear magnetic resonance spectroscopy for determining the three-

dimensional structure of biological macromolecules in solution”.

11. Ada Yonath (Israel) who in 2009 was awarded the Nobel Prize in Chemistry together

with Venkatraman Ramakrishnan and Thomas Steitz “for studies of the structure and

function of the ribosome”.

I attended nearly all the plenary sessions by the Nobel laureates and experienced their

humility in servitude to science for humanity’s sake. In all their sessions it became clear that

their approach is towards focusing on a problem “the goal” rather than allegiance to a

discipline of science. This tremendously influenced my research during the last half of my

Marie Curie Fellowship at Newcastle University. There was also much furor regarding taking

photographs with Nobel laureates at the conference especially from our British

conservatives. However, I argued that not a single African was honored with the award of a

Nobel Prize in Chemistry and Medicine or Physiology and was therefore allowed the

privilege. Subsequently I felt most humbled and yet honored to freely interact with Nobel

laureates Jack Szostak, Jules Hoffmann and Susumu Tonegawa.

B 2.2 International Advanced Single Cell Biotechnology at Sheffield University, UK

on 12 February 2014

I also attended the “International Advanced Single Cell Biotechnology” at Sheffield

University, UK on 12 February 2014. The one day symposium was hosted by Dr Wei Huang

of the Kroto Research Institute. The keynote speaker was Professor Michael Wagner of the

University of Vienna. There were presentations from all over the UK including Imperial

College, London; Sanger Institute; Manchester University and there were also presentations

from the USA. I was mostly interested in the symposium because of the difficulties I had

encountered in isolating quality DNA from the Kenyan L. majuscula for complete/draft

genome sequencing. The meeting was especially useful because I learnt some techniques

that were most helpful towards my research. There were also a number of questions

remaining unanswered in my project. I did find colleagues to partner with for my future

research aspirations. In this regard I single out a project which aims “to investigate the role

of cyanobacteria toxins on bacteria cell division and cell modulation and the relevance of

cyclodepsipeptides in cancer therapy”. It was a happy and exciting moment to realize that Dr

Huabing Yin of Glasgow University had, albeit unknowingly, interests very similar to mine,

and I could not help hugging her after the end of her seminar. The symposium aided my

networking and, besides my on-going work on single cell technology at Aberystwyth

University, Wales, UK, prospects for working with Dr Huang on Raman Tweezer

spectroscopy and with Dr Yin are high. It also afforded me the opportunity to realize that the

UK is a giant in science, something I always took for granted.

B 3 Outstanding outcomes of the project

B 3.1 Grant Applications

Regarding achievements and outstanding milestones, the isolation of bacterial pathogens

from a marine cyanobacterium opened a window for re-examining the biogenesis of human

pathogenic toxins in potable water from Kenya, Tanzania and South Africa, and in this

regard we submitted a £1.24M proposal ‘Understanding the source of microbial

contamination in African coastal borehole waters’ to the Royal Society DFID Capacity

Building Initiative in April 2014. The proposed project shall dissect the chemistry and biology

of bacterial and protozoan pathogens in coastal borehole water, aiming to find solutions to

waterborne diseases in Africa and worldwide. The project shall link scientists in the UK from

Newcastle University, Aberystwyth University and St. Andrews University with those from Pwani

University, Kenya; University of Dar es Salaam, Tanzania and the University of Cape Town,

South Africa. We are awaiting the outcome of this application due in October 2014.

Earlier on a scholarship was awarded to a summer student at the School of Chemistry by

Newcastle University to investigate the synthesis of some fragments in homodolastatin 16

with the aim of tracing its origin and whether or not the fragments originate from the

cyanobacteria or EB or both. This project was in collaboration with Dr Michael Hall of the

School of Chemistry who advised me on Chemistry related issues of the EU Marie Curie

project.

B 3.2 Outreach program - Mentoring of Marine Biology students

The EU Marie Curie Fellowship generated two projects for Marine Biology Honours students

at Newcastle University namely; “The role of secondary metabolites in Bacillus licheniformis

UV-resistance” and “Exploring Marine Bacteria Polysaccharides from a Desiccated

Environment and Evaluating their Hygroscopic Abilities in Application to the Cosmetic

Industry”. I supervised both projects and have co-authored a manuscript for publication in a

peer reviewed journal along with my own research findings on UV-resistance with the

students. Subsequently, these students are considering pursuing PhD studies in the UK.

Additionally, nine undergraduate students undertook their projects with our research group

during my fellowship as a result of the lectures I gave to them on cyanobacteria-bacteria

interactions and on marine biotechnology and drug discovery.

C THE UNITED KINGDOM AND MY EU FELLOWSHIP

I very much enjoyed the experience of being an EU Marie Curie IIF Research Fellow in the

United Kingdom. The independence of thought and the resolve to make a contribution

towards research in the EU was a strong motivation for my work. It was evident that my stay

had impacted on me positively; I made a lot of friends in addition to embracing the dry

humour of the British people. Unfortunately non-EU Marie Curie fellows are taxed heavily on

money which does not originate in the UK when in reality they do not enjoy the same

privileges as locals. Nevertheless, I would still recommend the UK as a destination for early

career scientists to develop their expertise.

D ACKNOWLEDGEMENT

I wish to thank Professor J Grant Burgess for hosting me at the School of Marine Science

and Technology and for his enduring support; Dr Michael Hall of the School of Chemistry for

helpful discussions and mentorship during my project. Jill Cowans at the Dove Marine

Laboratory arranged my purchases for consumables. I wish to most sincerely thank Ms Lisa

Inganni and Anthony Gibson for handling my finances. Lastly, I acknowledge the EU for

according me the opportunity to work as a Marie Curie IIF Fellow in the UK through their funding.

REFERENCES

1. European Science Foundation, Position Paper 15 2010 Marine Biotechnology: A

New Vision and Strategy for Europe. www.esf.org/marineboard

2. Davies-Coleman, M. T., Dzeha, T. M., Gray, C. A., et al., 2003 J. Nat. Prod. 66,

5, 712-715.

3. Jones, A.C., Monroe, E.A., Podell, S., et al. 2011

www.pnas.org/cgi/doi/10.1073/pnas.1101137108

4. Ramaswamy, A.V., Sorrels, V.M., Gerwick, W.H. 2007 J. Nat. Prod. XXXX, xxx, 000,

A-J. np 0704250 CCC.

5. Gerwick, W.H., Coates, R.C., Engene, N., et al. 2008 Microbe, 3, 6, 277- 284.

6. Engene, N., 2012 Int. Journ. Syst. and Evol. Microbiol, 62, 1171–1178.

7. Gu, L., Wang, B., Kulkarni, A., Geders, T.W., et al. 2009 Nature, 459, 731-735

8. Pettit, G. R., Smith, T. H., Xu, J., Herald, D. 2011 New crystal dolastatin 16 having

specified unit cell dimensions, useful as an anti-cancer agent. WO2012148943-A1.

9. Pettit, G. R., Smith, T. H., Xu, J.-P., et al., 2011 J. Nat. Prod., 74 (5), 1003-1008.

10. Sudek, S., Lopanik, N. B., Waggoner, L. E., et al., 2007 J. Nat. Prod., 70 (1), 67-74.

11. Kondo, T.; Ishiura, M. 2000 Bioessays 22 (1), 10-15.

12. Cragg, G. M.; Newman, D. J. 2013 Biochimica Et Biophysica Acta-General Subjects

1830, 6, 3670-3695.

13. Kimura, J., Takada, Y., Inayoshi, T., et al. 2002 J. Org. Chem. 67, 1760-1767

14. Montaser, R., Abboud, A. K., Paul, V.J., Luesch, H. 2011 J. Nat. Prod. 74, 109-112

15. Wu, X., Zarka, A., Boussiba, S. 2000 Plant Mol. Biol. Rep. 18, 385–392.

16. Aziz, R. K., Bartels, D., et al., 2008 The RAST server: Rapid annotations using

subsystems technology. Bmc Genomics 9.

17. Vaara, T., Vaara, M., Niemela, S. 1979 Appl. Env. Microbiol., 38, 5, 1011-1014

18. Morin, N., Vallaeys, T., Hendrickx, L., Leys, N., Wilmotte, A. 2010 J.

Microbiol. Methods, 80, 148-154

19. Nubel, U., Garcia-Pichel, F., Muyzer, G. 1997 App. Env. Microbiol, 63, 8, 3327-3332

Annex 1. Table 1 of Kenyan Lyngbya majuscula epibiotic bacteria (EB) isolates

Species                           Accession  Strain    Identity (%)  Taxon

Shewanella algae                  KC660130   SHALG-01  99            γ-proteobacteria
Shewanella algae                  KC660131   SHALG-02  99            γ-proteobacteria
Marinobacterium stanieri          KC660132   MARIS-01  99            γ-proteobacteria
Acinetobacter johnsonii           KC660133   ACJ-01    99            γ-proteobacteria
Marinobacterium stanieri          KC660134   MARIS-02  99            γ-proteobacteria
Staphylococcus saprophyticus      KC660135   STAPRO    99            Firmicutes
Pseudomonas stutzeri              KC660136   PST-01    99            γ-proteobacteria
Enterobacter cloacae              KC660137   ENTCLO    99            γ-proteobacteria
Cellulosimicrobium cellulans      KC660138   CCL-01    99            Actinobacteria
Cellulosimicrobium cellulans      KC660139   CCL-02    99            Actinobacteria
Pseudomonas pseudoalcaligenes     KC660140   PPS       99            γ-proteobacteria
Pseudomonas putida                KC660141   PPT       99            γ-proteobacteria
Bacillus aereus                   ND         ND        99            Firmicutes
Bacillus licheniformis            KC660142   BLC-01    99            Firmicutes
Bacillus licheniformis            KC660143   BLC-02    99            Firmicutes
Bacillus subtilis                 KC660144   BS-00     99            Firmicutes
Pseudomonas stutzeri              KC660145   PST-02    99            γ-proteobacteria
Enterobacter cancerogenus         ND         ND        99.24         γ-proteobacteria
Klebsiella oxytoca                ND         ND        99.23         γ-proteobacteria
Yokenella regensburgei            ND         ND        99.02         γ-proteobacteria
Ochrobactrum anthropi             ND         ND        99.88         α-proteobacteria
Pseudomonas stutzeri              ND         ND        99.87         γ-proteobacteria
Pseudoalteromonas carrageenovora  ND         ND        99.22         γ-proteobacteria

ND: strain sequences not deposited with GenBank; identities were inferred from BLAST.

Annex 2. Phylogeny of EB isolates

Annex 3: Adenylation domain for Moorea producens

>gi|332705439|ref|ZP_08425517.1|/1-2887 amino acid adenylation domain protein [Moorea producens 3L]-----------------MNLSEFLQELVISGWQFWA----EEGQVCFQAPDADSTDQVLAQL-KQHKRDILTILQEHPE---VLQVYPLGYGQ-------------------------------------------------QGIWFLWQLFPDNPNYNVSFATRIY--------SQVNVTTW---------------QQTFEALRKRHPLLCS---TFPKCGETPIRQHSEQLD--------FVQIDASTWDENELQTQVVAAHRHPFDLQTDPVMRVRWFTRSEQE-------HILLLTIHHIAWDGSSANI------IVKELS----ELYQAHCAGVAVDLPSLQHT----------YQDYVKWQ--------QQLVEGSKG-------ESLWTYWQQQLAGELPVLNLPTDRPHPPIQTNNGAVYRFQLPEHLVTQVKALSQAEGATLYMTLLAAF-------------QVLLHRYTG-------QEDILVGSPTSGRT--RPEFTSVVGYFVDSMVMRAKVSGSLSFREFLTQVRQ--------TVIDALAHQDYPFSLLVEKLQP---------------ERDLSRSPIFQVF-FGLHN-FLQSETQQLFLGETKTLVHWGGMEVETFLFDQYESLEDLVL------------EIIEINSQLSGFFKYNTDLFDEQTIAQMASHLQTLLAGIVT-----------HPEQRLESLP------LLTQAEQHQLLVEWNQ--------------TTTHYPTDKCIHQLFEEQVEQTPDAI--------AVVFKEEKLSYQELNIRANQLARYLQSLGVSPEV-LVGVC---------------VERSLEMIVGLLGILKAGGVYVPLDPKYPQ-------------ERLDYMFRD--SQMSVLLTQQQLLTLLPQYEAK------------------VVCLDRDWQKIVTEN-----------------------------PKNVTSEVTAENLAYVIYTSGSTGKPKGVMVAHIGLHNLLKVQIQAFKVSSNSRVLQFASLSFDASIWEIVMALGSGASLY------LESRENLL-------------------PGASLSKWLNEKKITHLTLPPSALAVM-------QKEELPSLQTIVVAGEACPAEVISQWSQGR---QFVNAYGPTESTV-----CATMAE------CSPEYSVLP---IGHPIANTQI----YLLDNNLQP--VPIGIPAEMYIGGIGLARGYLN----------------------RPDLTTQKFIPNP-----FSNKAEQRL----------------YKTGDLARYLPDGNIEFLGRI--------DHQVKIR----GFRIETAEIEAVLNQNPTVKQTVVVA-REDKPGDKHL-------------------CAYIVAQMETATNSNPE--LSETHLNSW-QEIFNQQIYSQ--LSEVTDPLFNTTGYLSNYDKQP--IPEAQMRDWAEDIVTQV--------LANKPNSVWEVGCGTGMLLFKIAPHTRAY---------------------------YGTDISEVSLKYIQTQIAQQPDKYAHVTLAQKAAEEMADIADNSFDVV-------LLSS-----IVQYFPSVEYLLQVI------SNSIRVVKPGGMIFLGDIRSL--PLMRAFHTSVQLHKAPPSLSVQQLKQGIY---RLMQQETELLVSPEL--FVALKDTYP--EITHVQI-RLQRG----SEHNELNKYRYSV-LLHIQAKPTSVIVAPVENGVGMSMED-IEVYLGQQQPESICFSSL-------TNGRVATDMAAVELLSQVESKLNVQQLRQQLRQKLVNGIEPEQLH---QLSASLGYELELC------------WSHKTEGCFDAVFVR
SSLAPEAM--VLTPLTQQSVVGGNWHRYGNNPLASVTGKQLIPQ-WR------------KYLEERLPEY-----MVPSRYVILP-QLPLTPNGKV-----------NRKAL----------------------------------------PAPDNTSSRSTEFVAPETSTEKAL--AAIWAEVLSI------QQVGIHDNFFESGGHSLLATQVVSRIRQALGKELTLQRLLESPTIAELDSALVQLPRVEDSPKQKPDGLLPTIVPAPSQRYQPFPLTEIQQAYWLGRNSHFDLGNITTHGYLELDCENLALDRLSQAWQQVIDHHDMLRMVILPNGEQQVLEQVYPYQIEVLDLRGQPEQIVSTELETIRYRLSHEMFPAGEWPLFKIRVTRLADQRYRL

HWSFDALIADAWSMIIVWQQWLQLYQNPDSFLPKLDLTFRDYVLAELSLKDTPQYRRSQQYWWNRLETLPPAPELPLVKQTATLEQPEFNCYRAELSAPDWQQLQARAKQASLTPSGVLLAAFADFLAYWSKSPKFTINLTLFNRLPLHPQVNDLVGDFTSLTLLEVNQKNAAPFAQRAQRLQGQLWQDLDHRYVGGVEVQREL-RRQRGSYQPMGVVFTSTLALNTSAEKGLPSNEWHAWPFDQLGETVYMVSKTPQVWLDNSVAEQNGALLLIWNVVEDLFPEGFLNDMFTSYYHWLQQLATSDVAWAQTCPQLLPLSQLTQRLQVNETYAPVSEETLHNLFVKQVQQRPEAIALITPQRTLTYHELYTEAQALGQQVQQLGATPNTLVAVLMEKGWEQIVAVLGILMAGAAYLPIDAALPQERQWSLLEQGEVKLVVTQAALNASLGLPDHLHCLVVASQPQEIID-TPLEANVSSSDLAYVIFTSGSTGTPKGVMIDHRGAVNTIQDINQRFDVQPTDRMLAVSALNFDLSVYDIFGLLAAGGTLVMPTPEAAKDPVHWVELMTTHQVTLWNTVPALMQMLVEYLSEHPDQVTEDLRLALLSGDWIPLNLPTQIQSLWPQGQVVSLGGATEASIWSVYYPITTVEPEWKSIPYGKPLVNQSLHVLNHNLDPCPNWVPGQLYIGGIGLAQGYWRDEQKTNASFILHPQTGERLYKTGDLARYLPDGNSEFLGREDFQVKISGYRIELGEIEATLLGHATVKETVVAAVGE-LQSKQLVAYVVFHSESS-SDSATEDVHD--------------DMRIDELRHYLQQQLPEYMVPPSYMVLDALPLTANGKVDRKRLPTP--ELISDHYSPDTYIPPRNHQELQLVKLWEEILEVQPIGVGSHFFDLGGHSLLAVRLMNRIEQDFGRSLPLATLFQAPTIEQLAVILQQEQGVPTLSPLVPIQTQGNQPPIFCVHPAGGTVFCYLELSQLLGANQPFYGLQSLGQQEGQAPLTTVEEMANVYLAAIREVQPQGPYLLMGWSFGGMVALQMAHDLLSQGEQVAFLGLLDTYAPAHMPDEQVLSEDVEVLLELFGGPLSLDWEVLRDLPSEQQSALIWEQAHQANLVPPDLGAAQIERLLQLMKLNHKAMRSYSPPDYPDVITLLHAEAGSVAVSSTEVTTDPTLGWQAISPSKVEVHTIPGYHEYMVYQPTVVIVAETIKADIEKGLNTDVETSSK

>gi|332712440|ref|ZP_08432366.1|/1-3195 amino acid adenylation domain protein [Moorea producens 3L]-----------------MAELNLNRDLGTSNSEVVQLTELGNGVVQITMKDESSRNGFSPSI-VEGLRHCFSVVAQNQQ---YKVVILTGYGNYFSSGASKEYLIRKTRGEVEVLDLSGLILDCEIPIIAAMQGHSFGGGLLLGLYADFVVFSQESVYATNFMKYGF--------TPVGATSLILREKLGSELAQ---EMIYTGENYRGKELAERGIPFPVVSRQDVLNYAQQLGQKIAKSPRLSLVALKQHLSADIKAKFPEAIKKELEIHQVTFNQPEIASRIQQEFGETVIPNLIQSTVEQKIPNPQPVQLRIPSYGLLKNLTWMPQERRKPKSTEVEVQIKAVPVNFREVLNVLGIFQEYIKKRYRSGIISAENLTFGVEGVGTVVAVGSDVSQWK---VGDEVILAYP---------GNAFSSFVICSPDDLLAKPSDLSMVEAATIFMSFFTAYYGLHNLAKVQPGERVLIHAASGGAGQAAVQLAQFFGSEVFATT--SPHKISVLREQGIKHVMNSRTT------EFASEVRELTQGNGVDVIFNSLTHGEY-IPKNIDILAPGGRYI

EIGRLNIWSHEQVSQRRPDVKYFPFDMSDEFVRDKQFHAKLWDDLALLFESG-SLKPLPYKVFPS-EDVVEAFRHLQHSKHIGKIVVTMPELYNGVKNSSQQANQESMSHQEELLHQLQSGDISLENAEQLLLGLTDQQILATVPNNGQNKLINTDKTEQILSLLSSGEISLENAQNLLETVDLNSPTKKNLPTAVPNQGQSNQDEAILNQLQSGEVSLEDAEQLLLEIQQKESVTTKSIPDQRITDDIAIIGISCRYPGAKNWKEFWENLKHGVDSVTEPPPGRWEGRSWYHSDPEHPGTACSKYAAFLDDIDKFDPLFFQISPGEAELIEPQQRIFLEEAYHAIEDAGYAPDSLKGKHCGVFVGAASSDYIKFLSNSGFGHHRLVLSGTMLSVLPARIAYFLDLKGPVVAVEAACSSSLVAVHQACESIKRGESEIAIAGGISTMLTPDFQVLSSQFQMVSPEGRCKSFDAEASGIVWGEGCGAILLKRYEQAVQDQDHIYGIIKGTGTNYDGSTNGISAPSSKSQARLAENIYQQFGINPETISYL-------EAHGTATPLGDPIEVEAF-TEAFSKWTAQK---QFC-AIGSVKTNIGNAATAAGMSSLIKTILCLKNQKLVPSLHFNQPNPNIDFANSPFYVNTEFKAWEVPTGIPRRAAVNSFGLNGTNAHVVVEEAPIEDNRQTSPVSPQGGKATGNSEDYLENSVHLLTLSAKTETALGEVISSYQNYLKTNPNLRLGDVCYTASTGRTHFTHRLAVVAPNQQELVEKLRQHQEGKKLAGITSGELLNNTTVAKIAFLFT-GQ---GSQYINMGKQLYQQAPTFRQAINQCEEILSSVETFQETSLRNILYPTDKNSSGSSLLGQTAYTQPALFAIEYALFK---LWQSWGIEPDVVMGHSVGEYVAATVAGVFSLEDGLKLIAARGSLMQKLPGDGKMLWAMAPESKVLETLKAKDLSEKVAIAAINGPQSIVISGEGKAVEAIATNLESAGITTKPLKVSHAFHSPLMEPMLAEFEAAAKEITYEQPRIPLISNVTGKQVTEQITTAEYWVNHVRQPVQFAQSMKTLYQEGYELFLEIGPK--PVLLSMGRQCLPEKI--GVWLPSLRPGVEECQQMLSSLGKLYVEGAKVDWIAFEQNYARQKVALPTY-PFQRERYWVSSQNGYEQKSY-----WLKGKEQHPLLGEKINLAGIEDQHRFQSYIGAESPGYLNHHQVFGKVLFPSTGYLEIAASAGKSLFTSQEQVVVSDV-DILQSLVIPETEIKTVQTVVSFAENNSYKFEIFSPSEGENQQTPQWVLHAQGKIYTEPTRNSQAKIDLEKYQAECSQAIEIEEHYREYRSKGIDYGSSFQGIKQLWKGQGKALGEMAFPEELTAQLADYQLHPALLDAAFQIVSYAIPHTETDKIYLPVGVEKFKLYRQTISQVWAIAEIRQTNLTANIFLVDNQGTVLVELEGLRVKVTEPVLTQKSAFKEQLKSASVSERQELLTTQISSAIVNILGLRDGQQIERHQPLFDLGLDSLMAVELKNQLESNLGTSFSSTLLFDYPTVESLVEYLANNVIPIDSFSE-----LPTLIPHPEQRYQPFPLNDIQQAYWIGRNQIFDLGNIATHIYIEVDCENLNLESLHQAWRRLIDHHDMLRMVVLADGNQQILEQVPPYEIEILNLSEESPETIASELEQIRNQMSHEVLPTNQWPLFHLRATRLNEQCFRLHASIDMLIFDAWSTYVLFKQWSELYNNPQSSLPATEISFRDYVLAELELKDSPQYLSSQQYWFNRLDNLPPAPEIPQAKVTSAITDPQFNTHTAQLSQSDWQQLKNKASKANLTPSGVLLSAFASVLNYWSKSSKFTLNLTLFNRLPLHPQVNELIGDFTSVILLEVDNSQAVPFISRAQKLQRQLWEDLEHRYISGVEVQRELYRR--GRSQPMG

VVFTSTLGLKSLADEEVG----RGFGLEHFGEVVYSAAQTPQVLLDHIVTEEKGALAFSWHTVEGLFPEGLIEQMFEAYCDLLQQLATSDEPWMETYHQLLPTAQLALQAQVNQTTQSWSEDILHSLFVKQVQVQSEATAVISPQKSLTYGELYQRSHQLGHGLRKLGVKPNQLVAVVMEKGWEQVVAVLGILMSGGAYLPIDPGLPQERQWYLLEQAQVTQVLTQTHLKQSLGWPEGIKCWSVDTEELAEYDPNPLEPVQTSEDLAYVIYTSGSTGLPKGVMIDHRGAINTILDINQRFKVTPSDRVLALAALNFDLSVYDIFGVLGAGGAIVMPPPKAAKDPACWRELIIAHEVTLWNSVPALMQMLVEHLLGTSATAVGDLRVVMLSGDWLPVDLPSKIQSLWSNVQVMSLGGATEASIWSIGYPIEKVGSDWKSIPYGKPLLNQSFYVLNELMEPRPVWVPGQLYIGGVGLAKGYWKNEHKTQASFITHPVTQEPLYKTGDLGRYLPDGNIEFLGREDFQVKINGYRVELGEIEVALKQFPGIKEAIVTAIGESQQSKRLVAYAVFKEKSVISDSSLTDIHQTEDKNEVGQPDQEINCTSEQLRKYLWQKLPEYMVPDDYVILEALPLTANGKVDRKRLPKPQRQTIADT---NQNILPQTKTEQQIAAVWTEVLELEEVGIHDNFFAIGGNSLLVIRVHNKLQELLGIELKVVDLFANPTVHFLSQHLTQ-----------------------------------------IGSKELF----------------------------METSKTRG-------------------------------------------DERV-----------------------------------------------KKGTTRKER-----------------------------------------------------------------------------------RNIRKSLR-----GKK

>gi|375260917|ref|YP_005020087.1|/1-2032 yersiniabactin synthetase, HMWP2 component [Klebsiella oxytoca KCTC 1686]MISGAPSQDPLLSDNGEAADYQQLRELLIQELNVAPQQLQEESNLIQAGLDSIRLMRWLHWFRKKGYRLTLRELYAAPTLAAWRQLMRSRSGEKPDDASSPAE-------------------------------------------AAWPVMSEGTPFPLTPVQHAYLTGRMPGQTLGGVGCHLYQEFAGHYLTAPKLEQAITILLQRHPMLHI---AFRADGQQVWLPQPYWNG--------VTVHDLRQTDEASRQA-YLETLRQRLSHRLLRVEMGETFDFQLTLLPDNC--HRLHVNIDLLIMDASSFTL------FFDELN--------ALLAGESLPPGDPRYD----------FRAYLLHQQKIN----QPLLDKARA------------YWLAKASMLPPAPVLPLACEPATLREVRNTRRRMIVPTTRWNAFSQRAGENGVTPTMALATCF-------------AAVLGRWGG-------LTRLLLNITLFDRQPLHPAVDEMLADFTNILLLDTACDG-----DTVSNLARKNQL----TFTEDWEHRHWSGVELLRELK----------------RQQSHPHGAPVVFTSNLGR-SLYSSRPESPLGEPE----W-GISQTPQVWIDHLAFEHRGEVWLQWDSNDALFPPALVETLFNAYCQLINQLCDDESA---------------------------WKKPFADRMP----------QSQREIRQRVNA--------------TDAPVP-QGLLHEGIFRIALRQPQAL--------AVTDAHYQWNYRELTENARRCAGRLIACGVQPGD-NVAIT---------------MSKGAGQLVAVLAVLLSGAVYVPVSLDQPA-------------ARRGKIYAD----------ANVRLVLTCQHDASAWSDDIP---------------HLTWQQAIEAE-----------------------------PLADQAAHAPTQPAYIIYTSGSTGTPKGVVISHRAALNTCCDINSRYQVGPGDRVLALSALHFDLSVYDIFGVLSAGGSLV----IVMENQRR---------------------DPRAWCELIQRHQVTLWNSVPALFDMLLTWCEGFADAAPEKLRAVMLSGDWIGL

DLPARYHAFRPQGQFIAMGGATEASI--WSNACEINR------VPDHWRAIP---YGFPLANQR-----YRVVDELGR-DCPDWVPGELWIGGIGVAEGYFN----------------------DPVRSEQQFVTQS----------NARW----------------YRTGDLGCYWPDGTLEFLGRR--------DKQVKVG----GYRIELGEIESALSQLAGVKQSTVVAIGE---KEKTL-------------------AAWVVPQGSAFCVTHHR---DPALPQAW-RGLAGTLPCC----------------------VCPPEISAGQVADFLQHRLLKL-----------KPGQTPGADPLPLMNALAIQPRWRA--------------------------------VVERWLAFLVTQQRLQPAAEGYQVCAGE-APENDPPSFSGHDLT----------------LTQILRGARHELSLLNDARWSPESLAFDHPASALYIEELATICQQLSRRLQRPVRLLEV-----------GVRTARAAECLLTRL--SADEIEYVGLEHSQELLLSARQRLAPWSDARLALWSADTLTAHAHSADIIWLNNALHRLL---------------------PEEPGLL--------------AALQQLAVPGALLYVLEFRQLTPSA--LLSTLLLTDGQPEAL-------------------------------LHNSADWGAIFTAAAF----------NCQHGDEVEGLQRFLVQCPVSQVRRDPRQLQ---------------SALAERLPGW-----MVPKRIFLLD-ALPLTANGKI-----------DYQTL---------------------------------------KRCHTPEAENRTEADLPLGDIEKQV--AVIWQPLLSM------GAVSRETDFFQHGGDSLLATRLIGQLHQA-GYEARLSDLFNHPRLADFAATLRKTDLPVEQP----------FVHSPEERYRPFALTDVQQAYLVGRQPGFALGGVGSHFFVEFEIADLDIHRLEKVWNRLIARHDMLRAVV-RDGQQRVLEQTPPWVIPA-HILHSPEEAL-----QVRDRLAHQVLNPEVWPVFDLQVGFVDGMPARLWLCLDNLLLDGLSMQILLSELEHGYRYPQQLPPPLPVTFRDYLQQPALRTPNPDSLA---WWQTQLDDIPPAPALPLRCLPQDVETPRFARLYGAMDSARWRRLKQRAADAHLTPSAVLLSVWSTVLAAWSAQPDFTLNLTLFDRRPLHPQINQILGDFTSLMLLSWHPGES--WLQSARLLQQRLSESLNHRDVSAIRVMRQLARRQNVPAVPMPVVFTSALGFEQD----------NFLARRNLLKPVWGISQTPQVWLDHQVYESEGELRFNWDFVAALFPDGQVERQFAQYCALLNRMAEDDSSWQ------LPLADLVPPLKVTER--------------RARRLRPERA-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------QPRIAAD---------------------------------------------------------------------------------------------------------------------------------------------KSSVSLIC--------------------------------------------------------DTFREVVGE---------------------------------------------------------------------------
--------------------------------------------------PVAPAENFFEAGATSLNLVQLHVLLQRHEFATLTLLDLFTHPSPVALANYLAG-----------------------------------------VALKEK--------------------------------------------------------------------------------TKRV-------------------------------------------------------------------------------------------------------------------------------------------RPVRRRQR------RI

Annex 4: MtaD-M1-Cys and TycA-M1-D-/L-Phe Biosynthesis

MODULAR ASSEMBLY

DOMAIN                NRPS              EXPRESSED                                                            E-BIT VALUE
DLYNLSLI              MtaD-M1-Cy        gi|6635397|gb|AAF19812.1|MtaD-M1-Cys|Myxothiazol synthetase          18e -0.044
                                        gi|6724259|gb|AAF26925.1|EpoP-M1-Cys|Epothilone synthetase           18e -0.044
                                        gi|2982194|gb|AAC06346.1|BacA-M2-Cys|Bacitracin synthetase           18e -0.044
                                        gi|408802|gb|AAA27636.1|Irp2-M1-Cys|Yersiniabactin synthetase        17e -0.077
                                        gi|48323|emb|CAA78044.1|AngR-M1-Cys|Anguibactin synthetase           17e -0.077
DAWTVAAV              TycA-M1-D-/L-Phe  gi|2623771|gb|AAC45928.1|TycA-M1-D-/L-Phe|tyrocidine synthetase 1    18e -0.065
                                        gi|39369|emb|CAA33603.1|GrsA-M1-D-/L-Phe|Gramicidin synthetase A     18e -0.065
                                        gi|2623772|gb|AAC45929.1|TycB-M3-D-/L-Phe/Trp|tyrocidine synthet...  17e 0.10
                                        gi|2982196|gb|AAC06348.1|BacC-M2-Phe|bacitracin synthetase 3         16e -0.16
                                        gi|440169|emb|CAA82227.1|CssA-M9-Val|cyclosporine synthetase         15e -0.38
HYPOTHETICAL PROTEIN  None              N/A                                                                  N/A

A total of 10,623 amino acids comprising 38 domains were found in the BLAST search for M. producens. Four of these were adenylation domains: two encoding MtaD-M1-Cy, one encoding TycA-M1-D-/L-Phe and one a hypothetical protein. The E-BIT values are expressed as negative exponents (exp(-ve)).

Annex 5: List of PARSE HMM modular domains for M. producens

LIST OF PARSE HMMs HITs for gi|332705439|ref|ZP_08425517.1|/1-2887 amino acid adenylation domain protein [Moorea producens 3L]

DOMAIN  AMINO ACID REGION  SIZE  NUMBER  E-BIT SIZE
A       997-1253           228   1       5.6e-52
M       1427-1938          457   1       6.1e-124
T       2079-2146          68    4       1.5e-18
Cy      2171-2610          450   1       7e-159
A       2786-3000          228   1       1.7e-85
T       3189-3253          68    1       3.9e-23
TE      3275-3528          267   1       1.2e-43
ER      320-663            313   1       7.6e-64
KS      823-1261           438   1       4.9e-172
AT      1383-1697          327   1       6.1e-121
DH      1768-1943          192   1       1.4e-21
T       2079-2146          68    1       2.3e-21
Cy      2171-2610          450   1       1.8e-172
A       2786-3000          228   1       8.7e-81
T       3189-3253          68    1       8.9e-18
T       22-87              68    1       3.6e-10
Cy      150-682            450   1       4.4e-87
A       997-1253           228   1       7e-51
T       2081-2146          68    6       1.3e-10
Cy      2171-2610          450   1       3.7e-228
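The hits listed above can be summarised per domain type with a short script. The following is a minimal sketch, not part of the original analysis. It assumes each entry reads as domain label, region start, region end, size, number and E-value, in that order; the function names (tally_domains, region_span, best_hit_per_domain) are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hits transcribed from the Annex 5 listing. Assumed column order:
# (domain, region start, region end, size, number, E-value).
HITS = [
    ("A", 997, 1253, 228, 1, 5.6e-52),
    ("M", 1427, 1938, 457, 1, 6.1e-124),
    ("T", 2079, 2146, 68, 4, 1.5e-18),
    ("Cy", 2171, 2610, 450, 1, 7e-159),
    ("A", 2786, 3000, 228, 1, 1.7e-85),
    ("T", 3189, 3253, 68, 1, 3.9e-23),
    ("TE", 3275, 3528, 267, 1, 1.2e-43),
    ("ER", 320, 663, 313, 1, 7.6e-64),
    ("KS", 823, 1261, 438, 1, 4.9e-172),
    ("AT", 1383, 1697, 327, 1, 6.1e-121),
    ("DH", 1768, 1943, 192, 1, 1.4e-21),
    ("T", 2079, 2146, 68, 1, 2.3e-21),
    ("Cy", 2171, 2610, 450, 1, 1.8e-172),
    ("A", 2786, 3000, 228, 1, 8.7e-81),
    ("T", 3189, 3253, 68, 1, 8.9e-18),
    ("T", 22, 87, 68, 1, 3.6e-10),
    ("Cy", 150, 682, 450, 1, 4.4e-87),
    ("A", 997, 1253, 228, 1, 7e-51),
    ("T", 2081, 2146, 68, 6, 1.3e-10),
    ("Cy", 2171, 2610, 450, 1, 3.7e-228),
]


def tally_domains(hits):
    """Count how many hits carry each domain label."""
    return Counter(hit[0] for hit in hits)


def region_span(start, end):
    """Inclusive length of an amino acid region."""
    return end - start + 1


def best_hit_per_domain(hits):
    """For each domain label, keep the hit with the smallest E-value."""
    best = {}
    for hit in hits:
        domain, e_value = hit[0], hit[5]
        if domain not in best or e_value < best[domain][5]:
            best[domain] = hit
    return best


if __name__ == "__main__":
    counts = tally_domains(HITS)
    best = best_hit_per_domain(HITS)
    for domain in sorted(counts):
        _, start, end, _, _, e_value = best[domain]
        print(f"{domain}: {counts[domain]} hit(s); best region {start}-{end} "
              f"({region_span(start, end)} residues), E-value {e_value:g}")
```

Ranking by the smallest E-value is one simple way to pick a representative hit where the same region is reported more than once; other criteria could be equally reasonable.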
