discovering novel natural products from bioactive … · 2015-05-31 · the data shown in annual...
Post on 06-Aug-2020
2 Views
Preview:
TRANSCRIPT
DISCOVERING NOVEL NATURAL PRODUCTS FROM BIOACTIVE PLANTS AND BACTERIA
BY
YUNZI LUO
THESIS
Submitted in partial fulfillment of the requirements for the degree of Master of Science in Chemical Engineering
in the Graduate College of the University of Illinois at Urbana-Champaign, 2011
Urbana, Illinois
Adviser: Professor Huimin Zhao
ii
Abstract
Microorganisms and plants have evolved to produce a myriad array of natural
products that are of biomedical importance. One group of valuable natural products is
polyketides, which are known to be antiviral, antibacterial, and anticancer. In my first
project, I attempted to develop an efficient strategy for cloning and characterizing
Type III PKS genes that are likely responsible for the synthesis of novel natural
products from Eucalyptus species. Nested degenerate primers were designed and a
total of 94 clones and 27 clones were sequenced from the cDNA libraries created
from the leaves of E. camaldulensis and E. robusta, respectively. Five unique putative
Type III PKSs genes were identified. Two full-length genes were cloned and the
biochemical characterization of these genes indicated their functions.
In my second project, I attempted to develop a synthetic biology approach for
natural product discovery. Investigation into the unknown compounds synthesized by
actinomycetes may lead to the discovery of novel chemotherapeutic agents. As proof
of concept, one cryptic pathway from Streptomyces griseus, containing a PKS/NRPS
hybrid gene, was chosen as a model system. To decipher this cluster, I have been
attempting to develop a new strategy based on our recently developed “DNA
assembler” method. By using this approach, the gene cluster was modified by
adding different constitutive promoters in front of each gene. RT-qPCR was
employed to track expression on the transcription level. HPLC and LC-MS were
used to detect the products. A potential product was detected and further
characterization is in progress.
iii
To my family
iv
Acknowledgments
First of all, I would like to thank my graduate advisor, Professor Huimin Zhao for his
guidance in my research. He has been patient to me when I struggled in the bottleneck
of my projects. At the meantime, he truly educated me into an independent researcher.
I am also very thankful for all the members in the Zhao research group for their
valuable discussions and encouragement. In particular, I would like to thank Sheryl
Rubin Pitel and Zengyi Shao as they were my mentors when I first joined the lab.
They helped me to truly understand and master all the experimental techniques and
skills from fundamental to a higher level. For the discovery of plant type III PKS, I
also would like to thank my undergraduate student Braden Christine for helping me
with the cloning of the genes. On the other hand, and most important of all, I would
like to thank my family, especially my mom and dad for their unconditional love and
supports. Even though they do not understand what I am doing, they are always on
my side and encourage me to move on all the time. Finally, I would like to thank all
my friends for helping me out during all the hard times for no credits and making all
the impossible into possible. Also, I truly appreciate that they keep in company with
me when I need to stay till 3 AM in the lab.
v
Table of Contents
Chapter 1 Overview of Natural Products Discovery ............................................................ 1
1.1 Importance of natural products biosynthesis ......................................................... 1
1.1.1 Significant source for medicinal compounds .............................................. 1
1.2 Common sources of natural products .......................................................................... 2
1.2.1 Plants as a source of natural products ............................................................... 2
1.2.2 Fungi as a source of natural products ............................................................... 4
1.2.3 Bacteria as a source of natural products ........................................................... 5
1.2.4 Cryptic pathways as a source of natural products ............................................ 6
1.3 Technologies for natural products discovery and identification ................................. 7
1.3.1 Genome mining ................................................................................................ 7
1.3.2 Metabolic engineering and pathway engineering ............................................. 8
1.3.3 Heterologous expression .................................................................................. 9
1.3.4 Techniques used to activate cryptic pathways ................................................ 11
1.3.5 Synthetic biology approaches ......................................................................... 12
1.4 Polyketides as natural products ................................................................................ 14
1.4.1 Polyketides ..................................................................................................... 14
1.4.2 Polyketide synthases ....................................................................................... 15
1.5 Project overview ........................................................................................................ 17
1.6 Conclusions and outlook ........................................................................................... 20
1.7 References ................................................................................................................. 21
Chapter 2 Discovery of Novel Type III PKS from Bioactive Plants ................................. 26
2.1 Introduction .......................................................................................................... 26
2.2 Results and discussion .......................................................................................... 28
2.2.1 Design of the degenerate primers .............................................................. 28
vi
2.2.2 Validation of degenerate primers .............................................................. 30
2.2.3 Identification of target plants .................................................................... 32
2.2.4 Screening of PKS genes in target plants ................................................... 33
2.2.5 Cloning of full length PKS genes using RACE-PCR ............................... 35
2.2.6 Enzyme characterization and product identification ................................. 36
2.3 Conclusion ............................................................................................................ 37
2.4 Materials and methods ......................................................................................... 38
2.4.1 Strains and reagents .................................................................................. 38
2.4.2 Degenerate primer design ......................................................................... 38
2.4.3 Seed sterilization and germination ............................................................ 40
2.4.4 Collection of plant tissue........................................................................... 40
2.4.5 Total RNA isolation and concentration and cDNA synthesis ................... 41
2.4.6 Screen validation using Hypericum perforatum var. Topas ...................... 41
2.4.7 Screening Eucalyptus species ................................................................... 43
2.4.8 RACE PCR ............................................................................................... 44
2.4.9 Protein purification ................................................................................... 44
2.4.10 Polyacrylamide Gel Electrophoresis (SDS-PAGE) .................................. 46
2.4.11 Determination of activity .......................................................................... 46
2.4.12 Product HPLC analysis method ................................................................ 47
2.5 References ............................................................................................................ 47
Chapter 3 Deciphering a Cryptic Biosynthetic Pathway from Streptomyces griseus by a
Synthetic Biology Approach ................................................................................................. 59
3.1 Introduction .......................................................................................................... 59
3.2 Results and discussion .......................................................................................... 63
3.2.1 Selection of a target gene cluster .............................................................. 63
vii
3.2.2 Applying the DNA assembler method to pathway assembly .................... 64
3.2.3 Product detection from the original cluster in a heterologous host ........... 66
3.2.4 Quantitative real-time PCR (qPCR) .......................................................... 67
3.2.5 Inserting the ErmE constitutive promoter ................................................. 68
3.2.6 Promoter identification and characterization ............................................ 69
3.2.7 Creating gene expression cassettes ........................................................... 70
3.2.8 Identification of final products .................................................................. 71
3.3 Conclusions .......................................................................................................... 71
3.4 Materials and methods ......................................................................................... 74
3.4.1 Strains and reagents .................................................................................. 74
3.4.2 DNA manipulation .................................................................................... 75
3.4.3 Yeast transformation ................................................................................. 76
3.4.4 Verification of the assembled gene clusters .............................................. 77
3.4.5 Integration of the verified gene clusters onto the S. lividan 66 chromosome 78
3.4.6 Total RNA isolation and cDNA synthesis ................................................ 78
3.4.7 RT-PCR..................................................................................................... 79
3.4.8 HPLC analysis .......................................................................................... 79
3.5 References ............................................................................................................ 80
1
Chapter 1 Overview of Natural Products Discovery
1.1 Importance of natural products biosynthesis
1.1.1 Significant source for medicinal compounds
Over the past 10 years, there has been a resurgence of interest in the investigation
of natural products as a source of new drugs. Around 78% of antibacterial or
anticancer drugs are natural products or have been derived from natural products (1).
The data shown in Annual Reports of Medicinal Chemistry highlight the invaluable
role that natural products have played in the drug discovery process related to all
disease types (2). Although many effective drugs have been discovered, the need for
new therapeutic agents remains due to the increasing antibiotic resistance to existing
drugs and the recent threat of bioterrorism. Natural products are the most significant
source for modern medicine discovery (3). Up to date, only a small portion of the
natural products have been discovered and well-characterized. The Streptomyces
species alone is predicted to produce as many as 150,000 natural products, while only
3% have been characterized (4). With the development of novel biotechnologies and
expanding understanding of the biology world, the field of natural products discovery
has become an active field of investigation. For example, many bioactive natural
products are synthesized by non-ribosomal peptide synthetases (NRPS) and
polyketide synthases (PKS) (5,6). 20 out of 7000 polyketides known to exist have
been commercialized as medicines, which has a “hit rate” that is already two orders
higher than other classes of chemicals (7). Therefore, investigation into the unknown
2
compounds synthesized by these two classes of enzymes may lead to the discovery of
a variety of novel chemotherapeutic agents.
Erythromycin Aflatoxin B1 Aloesone
S. erythrea A. flavus R.palmatum
Taxol L-DOPA
Taxus brevifolia Mucuna sp.
Figure 1.1 Natural products discovered from bacteria, fungi and plants.
1.2 Common sources of natural products
1.2.1 Plants as a source of natural products
Medicinal plants have been used for thousands of years in the treatment of
numerous ailments (8). Modern analytical techniques identified the bioactive
compounds in plant extracts and approximately 25% of today’s medicines are derived
3
from plant natural products (9), including many highly successful drugs such as the
best breast cancer therapy Taxol (paclitaxel) (10) and anti-Parkinson’s treatment
L-DOPA. Plant-derived polyketides such as curcuminoids (curcumin) and flavonoids
(genistein, quercetin, resveratrol) are well-known complementary therapies, with a
growing body of evidence vis-à-vis their usefulness as anti-inflammatory and
anti-cancer agents (11,12). For example, the phloroglucinol components of bioactive
plants were generally discovered by analyzing organic solvent fractions of plant
materials. Searching databases of scientific literature uncovered a number of target
plants across a wide range of species, and in nearly all cases, chemical
characterization of the phloroglucinol natural products was not followed by
subsequent characterization of the producing enzymes.
Eucalypts produce a wide variety of acylphloroglucinol compounds which may
be formylated or acylphloroglucinol-terpene adducts (e.g. euglobals and macrocarpals
(13)). Eucalypt polyketide natural products are known to be antiviral, antibacterial,
anticancer, and cariostatic (inhibit the growth of bacteria that cause tooth decay), and
Eucalypts have been used by the Aboriginal people of Australia for the treatment of
colds, influenza, toothache, snakebites, fevers, and other ailments (13).
4
Figure 1.2 Polyketides and polyketide-terpene adducts from E. robusta.
1.2.2 Fungi as a source of natural products
Fungi have existed on earth for at least one thousand million years, and they have
exploited and evolved secondary metabolism for the production of bioactive
compounds (14). These compounds include toxins such as α-amanitin, the toxic
principal component of Amanita phalloides (death cap), vomitoxin (deoxynivalenol)
produced by the Fusaria family of plant pathogens and aflatoxin B1 produced by
Aspergillus species. Fungi also produce potent psychoactive compounds such as
muscimol, psilocybin and xenovulene A, pharmaceuticals such as the β-lactams and
the statin lovastatin. These compounds illustrate the utility and diversity of chemical
structures produced by fungi, and the diverse biosynthetic potential of these
organisms. Among all the compounds produced by fungi, a wide variety of
polyketides with diverse structures, such as orsellinic acid, lovastatin, and aflatoxin
can be found as well (14). Most of these compounds are bioactive and usually
5
pharmacologically useful. For example, lovastatin which is produced by Aspergillus
terreus, can lower the cholesterol by inhibiting the rate-limiting enzyme in cholesterol
synthesis. Fungi can inhabit almost all known environments on earth — from
temperate soils and forests to deserts and from oceans to tropical rain forests. Thus, it
is not surprising that they can produce numerous medicinal natural products (15).
1.2.3 Bacteria as a source of natural products
Bacteria represent a rich source of natural products as well. Type II PKS are
found in bacteria exclusively, which can synthesize a variety of polyketides, such as
oxytetracycline from Streptomyces rimosus, and daunorubicin from Streptomyces
peucetius (16). Bacteria produce many different tetracyclines, one of the most
important classes of antibiotics, and anthracyclines, a potent class of anticancer
compounds which includes daunorubicin. Actinomycetes are a group of bacteria
capable of producing numerous natural products. The Streptomyces species alone is
predicted to produce as many as 150,000 natural products. Most of them are
secondary metabolites. For example, 20 secondary metabolite biosynthesis pathways
have been revealed from actinomycete Streptomyces coelicolor A3(2) [18] and the
industrial actinomycete Streptomyces avermitilis [19]. Another example is
Streptomyces griseus, from which genomic sequence, 34 secondary metabolite
pathways have been found (17). These gene clusters reveal the high possibility of
discovery of new natural products.
6
1.2.4 Cryptic pathways as a source of natural products
The recent increase and availability of whole genome sequences have revised our
view of the metabolic capabilities of microorganisms. A large number of orphan
biosynthesis pathways have been identified by bio-informatics study of these data.
Cryptic biosynthetic pathways are gene clusters for which the encoded natural product
is unknown or the pathway is silent and the product cannot be detected under general
growth conditions. It is worthy to note that the number of cryptic pathways coding for
putative natural products far outnumbers the number of currently known metabolites
for a given organism. Thus, cryptic pathways serve as an enormous source for natural
products discovery.
Since the genomic sequencing projects are mostly focused on bacteria and fungi,
the cryptic pathways related to valuable secondary metabolites are often revealed
from the bacteria and fungi genome database. For example, Streptomyces coelicolor
was known to produce only 4 secondary metabolites (18), the genome analysis
revealed 18 additional cryptic biosynthetic pathways. It is intriguing to note that this
is not a particular case because analysis of other microbial genomes originating from
Myxobacteria, cyanobacteria and filamentous fungi showed the presence of a
comparable or even larger number of cryptic pathways (19,20). The discovery of
these numerous pathways represents a treasure trove, which is likely to grow
exponentially in the future, uncovering many novel and possibly bio-active
compounds. The few natural products that have been correlated with their orphan
7
pathway are merely the tip of the iceberg, whilst plenty of metabolites await discovery
(21).
1.3 Technologies for natural products discovery and identification
1.3.1 Genome mining
Genome mining has been used in various fields to describe the exploitation of
genomic information for the discovery of new processes, targets, and products.
Genomics has resulted in the deposition of a huge quantity of DNA sequence data
from a wide variety of organisms in publicly accessible databases. This information
can be exploited to generate new knowledge in several areas relevant to medicinal
chemistry including the identification and validation of new drug targets in human
pathogens, and the discovery of new chemical entities (19). The concept of exploiting
genomic sequence data for the discovery of new natural products has emerged from
the rapid expansion in knowledge of the genetic and biochemical basis for secondary
metabolite biosynthesis, particularly in microorganisms, in the 1980s and 1990s (5).
Large quantities of genomic sequence data have accumulated in public databases.
Those of plants and microorganisms, containing numerous genes encoding proteins
are likely to participate in the assembly of structurally complex bioactive natural
products but not associated with the production of known metabolites. Therefore, they
become invaluable as they shed light on the discovery of novel drugs.
Analysis of the complete genome sequence of the model actinomycete S.
coelicolor A3(2) (18) and the industrial actinomycete S. avermitilis (22) have
8
facilitated the prediction of structural features of natural products. For example, the
genome mining of these two bacteria revealed 20 secondary metabolite biosynthetic
gene clusters, many of which have not been characterized yet. Another example is the
genome sequence of S. griseus, which revealed 34 secondary metabolites pathways
and only six of them have been characterized (17). All these information offers the
researchers a big chance to discover novel high value compounds.
1.3.2 Metabolic engineering and pathway engineering
Metabolic engineering is an emerging field of academic research and industrial
practice with the goal of engineering pathways to either produce value-added
molecules or degrade harmful compounds in a simple, effective and advantageous
process (23). It is broadly defined as using recombinant DNA technology to improve
cellular activities. Specifically, metabolic engineering includes identification of
metabolic pathways, elucidation of regulatory mechanisms, metabolic flux analysis,
metabolic control analysis, identification of inter- or intra- cellular transport
mechanisms, and discovery and manipulation of biosynthetic pathways. This area is
particularly important to natural products biosynthesis because it offers ways to
optimize the native producer to increase production, enable the heterologous
expression of a pathway, and balance metabolic flux to maximize production of
desired compound, as well as produce novel chemicals and pharmaceuticals.
Applying metabolic engineering to establish metabolic pathways that are able to
channel carbon flow to a desired product at a high yield requires careful
considerations. Four key elements are normally included: (i) direct and optimize the
9
primary metabolic pathway flow to the target product, including removal of rate
limiting steps, transcriptional and allosteric regulation; (ii) genetically block
competing branch pathways; (iii) modify secondary metabolic pathways to enhance
energy metabolism and availability of required enzymatic cofactors; and (iv) remove
detrimental side-products. For example, hydrocortisone, an important starting material
for steroidal drug synthesis, can be synthesized from glucose in yeast by metabolic
engineering (24). Another example is to enhance the production of
6-deoxyerythronolide B (the polyketide precursor of erythromycin) in the
heterologous host E. coli to the level of the native producer through metabolic
engineering (27). As for polyketides, the availability of acyl-CoA is a major obstacle
in heterologous host as in E. coli or S. cerevisiae, the amounts of propionyl-CoA and
(2S)-methylmalonyl-CoA are low. Pathways that can use propionate and
methylmalonate to produce these CoAs were introduced to the heterologous host,
which resulted in an increase in the final products (25,26).
1.3.3 Heterologous expression
Natural products show great promise as medicinal agents. Most of them typically
act as secondary metabolites of microbial, and some are synthesized by an
evolutionarily related but architecturally diverse family of multifunctional enzymes. A
principal limitation for fundamental biochemical studies of these modular
megasynthases, as well as for their applications in biotechnology, is the challenge
associated with manipulating the natural microorganism that produces a polyketide of
interest. For example, they may have strict environmental requirements for growth or
10
cannot be cultured readily (23). Host organisms may also have a complicated or
multi-stage growth profile, which may result in little production of a target compound.
To ameliorate this limitation, several genetically tractable microbes, such as E. coli, B.
subtilis, S. coelicolor, S. lividans and S. cerevisiae have been developed as
heterologous hosts for natural product biosynthesis (27). The genes responsible for
natural product synthesis can be over-expressed heterologously.
Over the past decade, the study of bioactive natural product biosynthesis has
benefited significantly from the incorporation of heterologous tools. Compared to
native producers, heterologous hosts have several advantages such as ease to operate,
simple growth conditions, excellent fermentation protocols, and abundant genetic
manipulation tools (23).
Heterologous expression of polyketides is particularly important as many are
produced in actinomycetes or plants that often suffer from slow growth rates.
Reconstitution of polyketide biosynthesis in heterologous hosts demands that large
multienzyme assemblies be functionally expressed, their posttranslational
modification be adequately met, their substrates be available in vivo in reasonable
quantities, and the producer cell be protected against the toxicity of the biosynthetic
products. Already, heterologous expression of Type I and Type II PKSs has been
achieved in E. coli to produce the antibiotic erythromycin (28) and plant flavonoids
(29). In addition, unnatural substrates have been fed into a heterologous host for Type
III PKS, which resulted in combinatorial biosynthesis of new medicinal natural
products (30).
11
1.3.4 Techniques used to activate cryptic pathways
Analyses of the >500 microbial genome sequences currently in the
publicly-accessible databases have revealed numerous examples of gene clusters
encoding enzymes similar to those known to be involved in the biosynthesis of many
important natural products (31). Examples of such enzymes include nonribosomal
peptide synthetases (NRPSs), polyketide synthases (PKSs) and terpene synthases
(TSs), as well as enzymes belonging to less thoroughly investigated families. Many of
these gene clusters are hypothesized to assemble novel natural products. Numerous
novel metabolites have been discovered as the products of such so-called ‘‘cryptic’’
or ‘‘orphan’’ biosynthetic gene clusters and several recent reviews have described
these discoveries (19,21). The various approaches for discovery of the metabolic
products of cryptic biosynthetic gene clusters are summarized in Figure 1.3.
Figure 1.3 Techniques used to activate cryptic pathways
12
At the beginning of every discovery of the natural product of an orphan pathway
and the capability to predict the corresponding structure, significant progress needs to
be made in the understanding of biosynthetic pathways (21). With this accumulated
knowledge, combined with the advanced state of bio-informatics, it is nowadays
possible to reasonably predict the product of an orphan biosynthetic pathway.
Techniques usually used to active the cryptic pathways include: i) prediction of
physicochemical properties (32); ii) genomisotopic approach (33); iii) in vitro
reconstitution approach (34); iv) biosynthetic gene inactivation/comparative
metabolic profiling approach (35); v) heterologous gene expression/comparative
metabolic profiling approach (36); vi) activation of silent clusters by manipulation of
regulatory genes (20).
1.3.5 Synthetic biology approaches
The term synthetic biology did not obtain many popularity when it first appeared
in the literature in late 1970’s (37), which was used to describe bacteria that had been
genetically engineered using recombinant DNA technology. At the beginning of last
century, synthetic biology was again introduced, and quickly became popular, to refer
to the design and construction of new biological components, such as enzymes,
genetic circuits, and cells, or the redesign of existing biological systems (38-40).
The field of synthetic biology lies at the interface of many different biological
research areas, such as functional genomics, protein engineering, chemical biology,
metabolic engineering, systems biology, and bioinformatics. Not surprisingly,
synthetic biology means different things to different people, even to leading
13
practitioners in the field (41). The fundamental idea behind synthetic biology is that
any biological system can be regarded as a combination of individual functional
elements, a limited number of which can be recombined in novel configurations to
modify existing properties or to create new ones (42). The element that
distinguishes synthetic biology from traditional molecular biology and cellular
biology is the focus on the design and construction of core components that can be
modeled, understood, and tuned to meet specific performance criteria, and the
assembly of these smaller components into larger integrated systems that solve
specific problems (43). As a broad field, the successful applications of synthetic
biology make it an invaluable tool for natural products discovery, such as synthesis of
proteins that do not exist in nature, design and engineering of signaling and metabolic
pathways, and rebuilding of new organisms.
In a further level of complexity, the modular nature makes synthetic biology very
useful for engineering signaling and metabolic pathways and prompts the discovery of
natural products. Although de novo synthesis of a biochemical pathway is possible
using the DNA synthesis tools, most efforts focus on construction and optimization of
an existing biochemical pathway either in a native or a heterologous host. Pathway
engineering can be designed rationally by mixing and matching well-known, modular
parts and modulating gene expression through various control mechanisms. However,
the design at the pathway level is not only concerned with including the necessary
biological parts such as promoters, genes, and proteins, but also with optimizing the
expressed functionality of those parts. Failure to balance the flux in the synthetic
14
pathway will result in a bottleneck and the accumulation of intermediates (44). One
way to balance the flux is through transcription optimization of the various genes in
the pathway. By fusing a library of promoters to the various enzymes in the
isoprenoid production pathway, Pitera and coworkers have managed to engineer a
flux-balanced pathway with improved yield and reduced metabolic burden (44).
These well-characterized families of transcription regulators have emerged as
powerful tools in metabolic engineering as they allow rational coordination and
control of multigene expression, thereby decoupling pathway design from
construction (45).
The concept of synthetic biology is relatively broad, and the development and
characterization of individual components continue to increase. With our increasing
understanding in genomics and proteomics and with the development of more
powerful techniques for system design, synthesis and optimization, a rapid growth in
the capabilities of synthetic systems with a wide-range of applications will be seen in
the near future.
1.4 Polyketides as natural products
1.4.1 Polyketides
Polyketides are a diverse group of natural products with important applications in
medicine and industry. It can be found in a variety of organisms including plants,
fungi, and bacteria, in which they are produced by polyketides synthases (PKS) which
polymerize acyl-Coenzyme As (CoA) in a fashion similar to fatty acid synthesis (46).
15
Usually, they are not strictly required for the growth, development, or reproduction of
their host organisms, but they perform an important role in both industrial chemical
synthesis and drug discovery. Polyketides are structurally varied and in turn have a
broad range of different biological activities. The diversity of natural polyketides
comes both from the priming acyl-CoAs and elongating acyl-CoAs used, and the
chemical transformations occurring after polyketide synthesis. Polyketide natural
products are known to have antiviral (47), antibacterial (48), and anticancer (49)
activities. Therefore, they are widely used in clinical trials and industry (7). For
example, a number of plant species produce a wide variety of phloroglucinol
derivatives. Phloroglucinol natural products may be monomeric, dimeric, trimeric, or
modified by prenylation or O- or C-glycosylation. These varied phloroglucinols have
many known bioactivities including anti-feedant, antifungal, antibiotic, antitumor,
anti-malarial, antidepressant, anti-dementia, antiviral, antioxidant, and
anti-inflammatory or anti-allergy activities.
1.4.2 Polyketide synthases
Polyketide natural products have diverse structures, many of which are clinically
valuable drugs. They are derived from acetate and other short carboxylic acids by
sequential decarboxylative condensations in a fashion similar to fatty acid
biosynthesis, and this process is catalyzed by polyketide synthases (PKSs). There are
three types of PKSs known to date. Type I and type II PKSs are microbial enzymes,
whereas type III PKSs are mainly found in plants (50).
16
Type I PKSs, a group which is further sub-divided into iterative (Figure 1.2A)
and modular type I PKSs (Figure 1.2B), are responsible for the biosynthesis of
complex, largely reduced polyketides, such as macrolides (erythromycin and
avermectin), polyethers (monensin and tetronasin), polyenes (candicidin and nystatin),
and hybrid peptide-polyketide natural products, such as bleomycin, epothilone and
rapamycin. Type I PKSs are made up of a single, very large multi-domain protein that
is structurally organized into modules. Type II PKSs are multi-enzyme systems as
well, but each domain contained in Type II PKSs acts as a free-standing enzyme
(Figure 1.2C). Type III PKSs are small, homodimeric enzymes and represent a single
domain (Figure 1.2D). One to six functional modules can be assembled into one
individual PKS, each consisting of several distinct active sites, for each enzymatic
step. A module is in charge of one cycle of polyketide elongation and the associated
modifications. The order of modules in the PKS enzyme indicates the sequence of
biosynthetic events. Variations of domains within the modules afford the structural
diversity observed in the resultant polyketide products. The growing polyketide
intermediates are tethered to the PKSs via a thioester linkage to the
4'-phosphopantetheine prosthetic group of the acyl carrier protein (ACP) domain
during the entire elongation process. Thus, the three core domains: β-ketoacyl
synthase (KS), acyltransferase (AT), and ACP, catalyze the C-C elongation, and the
accessory domains, such as ketoreductase, dehydratase, enoyl reductase and
methyltransferase, control the modifications of the growing polyketide intermediates.
Once the growing polyketides chain reaches its full length, it is released from the PKS
17
enzyme by a thioesterase (TE) domain or a discrete amidase to yield the final products
(51).
Figure 1.4 Representative structures of PKS enzymes. (A) Iterative Type I PKS, (B)
Non-iterative, modular Type I PKS, (C) minimal Type II PKS, and (D) Type III PKS.
Domain abbreviations: KS, ketosynthase; AT, acyltransferase; ACP, acyl-carrier protein; KR,
ketoreductase; DH, dehydratase; TE, termination (52).
1.5 Project overview
Microorganisms and plants have evolved to produce a myriad array of natural
products that are of biomedical importance (1). This thesis focuses on discovery and
characterization of novel natural products. As polyketides represent a huge class of
structurally diverse natural products, I chose PKSs as my targets. Throughout my
thesis, I have applied the above-mentioned laboratory techniques to natural products
biosynthesis and discovery, including: i) use of genome-mining to identify a target
cryptic gene cluster, SGR810-815, from the sequenced genome of S. griseus; ii) use
of synthetic biology to activate the cryptic SGR810-815 gene cluster from S. griseus
18
in S. lividans, iii) discovery of novel Type III PKS enzymes by mining biodiversity of
medicinal plants;4) heterologous expression of Type III PKS genes in E. coli.
Chapter 2 describes the characterization of the Type II PKS enzyme EC2 from
Eucalyptus camaldulensis. The overall goal is to identify new Type III PKS enzymes
from bioactive plants. A PCR-based screening method for Type III PKS genes
discovery was created and validated. Two Eucalyptus species were chosen as initial
targets for Type III PKS discovery. The leaves of Eucalyptus robusta were used to
produce an anti-malarial medicine (13). Eucalyptus camaldulensis was chosen as
the second target species. Nested degenerate primers were designed based on 14
different plant Type III PKS and were validated by amplifying PKSs from Hypericum
perforatum. A total of 94 clones and 27 clones were sequenced from the cDNA
libraries created from the leaves of E. camaldulensis and E. robusta, respectively.
Five unique putative Type III PKSs genes were identified based on the screening
method. To further validate the cloned putative Type III PKS fragments, RACE-PCR
(rapid amplification of cDNA ends-polymerase chain reaction) was used to clone the
full-length gene. One full-length gene was successfully cloned. The biochemical
characterization of this gene led to the identification of one novel Type III PKS.
However this gene shares a high sequence identity (75%) with chaclone synthase
(CHS) from Medicago sativa. The generation of naringenin from 4-coumaroyl-CoA
and malonyl-CoA was confirmed. Also, different CoAs as start units were tested to
explore the novelty of this enzyme. However, it turned out to be a CHS and no new
reaction was detected.
19
Chapter 3 reports my progress on the development of synthetic biology
approaches to activate a cryptic pathway SGR810-815 from S. griseus. As many
bioactive natural products are synthesized by non-ribosomal peptide synthetases
(NRPS) and polyketide synthases (PKS) (5,6), investigation into the unknown
compounds synthesized by these two classes of enzymes may lead to the discovery of
novel chemotherapeutic agents. Our motivation is to develop an efficient approach to
decipher the cryptic natural product biosynthetic pathways. As proof of concept, one
cryptic pathway from S. griseus (17), SGR810-815, containing a PKS/NRPS hybrid
gene, was chosen as a model system. BLAST search indicates that the PKS/NRPS
gene shares high sequence identity (63%) with the one from the heat-stable antifungal
factor (HSAF) biosynthetic gene cluster (53) and 70-80% sequence homology with
the recently characterized frontalamide gene cluster (54). However, the nature of the
natural product encoded by the target pathway is still not clear. To decipher this
cluster, I have been attempting to develop a new strategy based on our developed
“DNA assembler” method (55). In this method, all of the resulting DNA products
were co-transformed into S. cerevisiae for assembly into a single DNA molecule in
one step based on the yeast homologous recombination mechanism. Verified
constructs were integrated into the genome of S. lividans for heterologous expression.
By using this approach, gene clusters can be easily assembled and modified. Thus,
three modified constructs were assembled and used as controls for the
characterization of the cluster. RT-PCR and SDS-PAGE were employed to track
expression both on the transcription and translation levels. HPLC and LC-MS were
20
used to detect the products. Preliminary data showed the formation of new
compounds. In parallel, two more plasmids were constructed, which contained one or
six artificial operons to regulate the production of target product. These two
constructs will be evaluated in the near future. Consequently, construction of artificial
operons should be able to improve gene expression, which in turn increases the
product yield.
1.6 Conclusions and outlook
As a valuable class of compounds for drug discovery and generation of industrial
chemicals, natural products have drawn increasing attentions with the development of
new biotechnology tools. Polyketides produced by microorganisms show an
extraordinary structural diversity and represent a huge collection of biologically active
compounds. Polyketides are synthesized by polyketide synthases. As a class of
multifunctional enzymes, they create remarkable structural complexity and diversity
by varying the building blocks of different elongation stages.
Type III PKSs are comprised of mono-functional enzymes. Because of its simple
structure and relative short sequence, it is easier to study. However, due to the lack of
genomic information from plants, it is hard to discover novel Type III PKS from
bioactive plants. Insert a few sentences to summarize your key findings.
The natural products from actinomycetes represent another type of attractive
compounds. With the rapid development of DNA sequencing techniques and publicly
assessable genomic sequence data, the deciphering of cryptic biosynthetic pathways
should lead to the discovery of novel high value compounds. By studying one cryptic
21
pathway from Streptomyces griseus, I have been able to produce the potential product
from the pathway by applying synthetic biology approach, which contains pathway
engineering and assembly that would be talked in detail in Chapter 3.
In general, the new synthetic biology approach I plan to develop will enable rapid
introduction of gene deletions and insertions into the pathway, which will help to
study the mechanism of a biosynthetic pathway. In addition, this approach can be
used to balance pathway gene expression, optimize metabolic flux, and generate new
compounds. Overall, this genomics-driven, synthetic biology enabled method offers
the ultimate versatility and flexibility in characterizing and engineering natural
product biosynthetic pathways, which will greatly facilitate the exploration of the
genomic information and the discovery of valuable natural products.
1.7 References
1. Newman, D.J. and Cragg, G.M. (2007) Natural products as sources of new drugs over the last 25 years. J Nat Prod, 70, 461-477.
2. Cragg, G.M., Newman, D.J. and Snader, K.M. (1997) Natural products in drug discovery and development. J Nat Prod, 60, 52-60.
3. Corre, C. and Challis, G.L. (2009) New natural product biosynthetic chemistry discovered by genome mining. Nat Prod Rep, 26, 977-986.
4. Watve, M.G., Tickoo, R., Jog, M.M. and Bhole, B.D. (2001) How many antibiotics are produced by the genus Streptomyces? Arch Microbiol, 176, 386-390.
5. Fischbach, M.A. and Walsh, C.T. (2006) Assembly-line enzymology for polyketide and nonribosomal Peptide antibiotics: logic, machinery, and mechanisms. Chem Rev, 106, 3468-3496.
6. McConnell, O.J., Longley, R.E. and Koehn, F.E. (1994) The discovery of marine natural products with therapeutic potential. Biotechnology, 26, 109-174.
7. Weissman, K.J. and Leadlay, P.F. (2005) Combinatorial biosynthesis of reduced polyketides. Nat Rev Microbiol, 3, 925-936.
8. Halberstein, R.A. (2005) Medicinal plants: historical and cross-cultural usage patterns. Ann Epidemiol, 15, 686-699.
22
9. Zhou, L.G. and Wu, J.Y. (2006) Development and application of medicinal plant tissue cultures for production of drugs and herbal medicinals in China. Nat Prod Rep, 23, 789-810.
10. Wall, M.E. and Wani, M.C. (1995) Camptothecin and taxol: discovery to clinic--thirteenth Bruce F. Cain Memorial Award Lecture. Cancer Res, 55, 753-760.
11. Bisht, K., Wagner, K.H. and Bulmer, A.C. (2009) Curcumin, resveratrol and flavonoids as anti-inflammatory, cyto- and DNA-protective dietary compounds. Toxicology, 278(1), 88-100.
12. Khan, N., Afaq, F. and Mukhtar, H. (2008) Cancer chemoprevention through dietary antioxidants: progress and promise. Antioxid Redox Signal, 10, 475-510.
13. Ghisalberti, E.L. (1996) Bioactive acylphloroglucinol derivatives from Eucalyptus species. Phytochemistry, 41, 7-22.
14. Cox, R.J. (2007) Polyketides, proteins and genes in fungi: programmed nano-machines begin to reveal their secrets. Org Biomol Chem, 5, 2010-2026.
15. Cutler, J.E., Deepe, G.S. and Klein, B.S. (2007) Advances in combating fungal diseases: vaccines on the threshold. Nature Reviews Microbiology, 5, 13-28.
16. Hertweck, C., Luzhetskyy, A., Rebets, Y. and Bechthold, A. (2007) Type II polyketide synthases: gaining a deeper insight into enzymatic teamwork. Nat Prod Rep, 24, 162-190.
17. Ohnishi, Y., Ishikawa, J., Hara, H., Suzuki, H., Ikenoya, M., Ikeda, H., Yamashita, A., Hattori, M. and Horinouchi, S. (2008) Genome sequence of the streptomycin-producing microorganism Streptomyces griseus IFO 13350. J Bacteriol, 190, 4050-4060.
18. Bentley, S.D., Chater, K.F., Cerdeno-Tarraga, A.M., Challis, G.L., Thomson, N.R., James, K.D., Harris, D.E., Quail, M.A., Kieser, H., Harper, D. et al. (2002) Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature, 417, 141-147.
19. Challis, G.L. (2008) Genome mining for novel natural product discovery. J Med Chem, 51, 2618-2628.
20. Bergmann, S., Schumann, J., Scherlach, K., Lange, C., Brakhage, A.A. and Hertweck, C. (2007) Genomics-driven discovery of PKS-NRPS hybrid metabolites from Aspergillus nidulans. Nat Chem Biol, 3, 213-217.
21. Gross, H. (2007) Strategies to unravel the function of orphan biosynthesis pathways: recent examples and future prospects. Appl Microbiol Biotechnol, 75, 267-277.
22. Ikeda, H., Ishikawa, J., Hanamoto, A., Shinose, M., Kikuchi, H., Shiba, T., Sakaki, Y., Hattori, M. and Omura, S. (2003) Complete genome sequence and comparative analysis of the industrial microorganism Streptomyces avermitilis. Nat Biotechnol, 21, 526-531.
23
23. Boghigian, B.A. and Pfeifer, B.A. (2008) Current status, strategies, and potential for the metabolic engineering of heterologous polyketides in Escherichia coli. Biotechnol Lett, 30, 1323-1330.
24. Dumas, B., Brocard-Masson, C., Assemat-Lebrun, K. and Achstetter, T. (2006) Hydrocortisone made in yeast: metabolic engineering turns a unicellular microorganism into a drug-synthesizing factory. Biotechnol J, 1, 299-307.
25. Mutka, S.C., Bondi, S.M., Carney, J.R., Da Silva, N.A. and Kealey, J.T. (2006) Metabolic pathway engineering for complex polyketide biosynthesis in Saccharomyces cerevisiae. FEMS Yeast Res, 6, 40-47.
26. Pfeifer, B.A., Admiraal, S.J., Gramajo, H., Cane, D.E. and Khosla, C. (2001) Biosynthesis of complex polyketides in a metabolically engineered strain of E. coli. Science, 291, 1790-1792.
27. Pfeifer, B.A. and Khosla, C. (2001) Biosynthesis of polyketides in heterologous hosts. Microbiol Mol Biol Rev, 65, 106-118.
28. Fujii, I. (2009) Heterologous expression systems for polyketide synthases. Nat Prod Rep, 26, 155-169.
29. Katsuyama, Y., Matsuzawa, M., Funa, N. and Horinouchi, S. (2008) Production of curcuminoids by Escherichia coli carrying an artificial biosynthesis pathway. Microbiology, 154, 2620-2628.
30. Horinouchi, S. (2009) Combinatorial biosynthesis of plant medicinal polyketides by microorganisms. Curr Opin Chem Biol, 13, 197-204.
31. Donadio, S., Monciardini, P. and Sosio, M. (2007) Polyketide synthases and nonribosomal peptide synthetases: the emerging view from bacterial genomics. Nat Prod Rep, 24, 1073-1109.
32. Udwary, D.W., Zeigler, L., Asolkar, R.N., Singan, V., Lapidus, A., Fenical, W., Jensen, P.R. and Moore, B.S. (2007) Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica. Proc Natl Acad Sci U S A, 104, 10376-10381.
33. Gross, H., Stockwell, V.O., Henkels, M.D., Nowak-Thompson, B., Loper, J.E. and Gerwick, W.H. (2007) The genomisotopic approach: a systematic method to isolate products of orphan biosynthetic gene clusters. Chem Biol, 14, 53-63.
34. Lin, X., Hopson, R. and Cane, D.E. (2006) Genome mining in Streptomyces coelicolor: molecular cloning and characterization of a new sesquiterpene synthase. J Am Chem Soc, 128, 6022-6023.
35. Song, L., Barona-Gomez, F., Corre, C., Xiang, L., Udwary, D.W., Austin, M.B., Noel, J.P., Moore, B.S. and Challis, G.L. (2006) Type III polyketide synthase beta-ketoacyl-ACP starter unit and ethylmalonyl-CoA extender unit selectivity discovered by Streptomyces coelicolor genome mining. J Am Chem Soc, 128, 14754-14755.
36. Hornung, A., Bertazzo, M., Dziarnowski, A., Schneider, K., Welzel, K., Wohlert, S.E., Holzenkampfer, M., Nicholson, G.J., Bechthold, A., Sussmuth, R.D. et al. (2007) A genomic screening approach to the structure-guided identification of drug candidates from natural sources. Chembiochem, 8, 757-766.
24
37. Hobom, B. (1980) Surgery of Genes - at the Doorstep of Synthetic Biology. Med Klin, 75, 14-21.
38. Alper, H., Cirino, P., Nevoigt, E. and Sriram, G. (2010) Applications of Synthetic Biology in Microbial Biotechnology. J Biomed Biotechnol, -.
39. Fagot-Largeault, A., Galperin, C., Gros, F. and Livage, J. (2011) From synthetic chemistry to synthetic biology. Cr Chim, 14, 343-347.
40. Keasling, J.D. (2008) Synthetic biology for synthetic chemistry. Acs Chem Biol, 3, 64-76.
41. (2009) What's in a name? Nat Biotechnol, 27, 1071-1073. 42. de Lorenzo, V. and Danchin, A. (2008) Synthetic biology: discovering new
worlds and new words - The new and not so new aspects of this emerging research field. Embo Rep, 9, 822-827.
43. Leonard, E., Nielsen, D., Solomon, K. and Prather, K.J. (2008) Engineering microbes with synthetic biology frameworks. Trends Biotechnol, 26, 674-681.
44. Pitera, D.J., Paddon, C.J., Newman, J.D. and Keasling, J.D. (2007) Balancing a heterologous mevalonate pathway for improved isoprenoid production in Escherichia coli. Metab Eng, 9, 193-207.
45. Bennett, M.R. and Hasty, J. (2009) Overpowering the component problem. Nat Biotechnol, 27, 450-451.
46. Burkart, M.D. and Meier, J.L. (2009) The chemical biology of modular biosynthetic enzymes. Chemical Society Reviews, 38, 2012-2045.
47. Takasaki, M., Konoshima, T., Shingu, T., Tokuda, H., Nishino, H., Iwashima, A. and Kozuka, M. (1990) Structures of euglobal-G1, -G2, and -G3 from Eucalyptus grandis, three new inhibitors of Epstein-Barr virus activation. Chem Pharm Bull (Tokyo), 38, 1444-1446.
48. Yamakoshi, Y., Murata, M., Shimizu, A. and Homma, S. (1992) Isolation and characterization of macrocarpals B--G antibacterial compounds from Eucalyptus macrocarpa. Biosci Biotechnol Biochem, 56, 1570-1576.
49. Yin, S., Xue, J.J., Fan, C.Q., Miao, Z.H., Ding, J. and Yue, J.M. (2007) Eucalyptals A-C with a new skeleton isolated from Eucalyptus globulus. Org Lett, 9, 5549-5552.
50. Hopwood, D.A. (1997) Genetic Contributions to Understanding Polyketide Synthases. Chem Rev, 97, 2465-2498.
51. Du, L. and Shen, B. (2001) Biosynthesis of hybrid peptide-polyketide natural products. Curr Opin Drug Discov Devel, 4, 215-228.
52. Shen, B. (2003) Polyketide biosynthesis beyond the type I, II and III polyketide synthase paradigms. Curr Opin Chem Biol, 7, 285-295.
53. Yu, F., Zaleta-Rivera, K., Zhu, X., Huffman, J., Millet, J.C., Harris, S.D., Yuen, G., Li, X.C. and Du, L. (2007) Structure and biosynthesis of heat-stable antifungal factor (HSAF), a broad-spectrum antimycotic with a novel mode of action. Antimicrob Agents Chemother, 51, 64-72.
54. Blodgett, J.A., Oh, D.C., Cao, S., Currie, C.R., Kolter, R. and Clardy, J. (2010) Common biosynthetic origins for polycyclic tetramate macrolactams from
25
phylogenetically diverse bacteria. Proc Natl Acad Sci U S A, 107, 11692-11697.
55. Shao, Z. and Zhao, H. (2009) DNA assembler, an in vivo genetic method for rapid construction of biochemical pathways. Nucleic Acids Res, 37, e16.
26
Chapter 2 Discovery of Novel Type III PKS from Bioactive
Plants
2.1 Introduction
Polyketides found in plants include chalcones and stilbenes. The enzymes
responsible for the biosynthesis of these polyketides are classified as polyketide
synthases (PKSs). In most cases, the production of polyketide metabolites in plant
species has been traced to the activity of Type III PKSs. The Type III PKSs are the
smallest PKS and essentially represent a single domain of the multi-domain Type I or
Type II PKS enzymes, mostly the KS domain. They usually form homodimers and the
two independent active sites of the homodimer act iteratively, catalyzing the
sequential condensation of two-carbon acetate units derived from a malonate thioester
into a growing polyketide chain (1). Although they are simple in structure, they can
produce a variety of bioactive aromatic compounds with single or multi-ring
structures. Type III PKSs were traditionally associated with plants but recently
discovered in a number of bacteria and filamentous fungi (2). Type III PKS were
thought to be diverged from bacterial β-ketoacyl-acyl-carrier-protein synthase III
(KAS III) enzymes (3). Alkylresorinol synthases of bacterial and fungi are thought to
be the ancestors of Type III PKS enzymes, while chalcone synthase evolved from
plants after the Type III PKS was transferred about 450 million years ago from
primitive cyanobacteria (3).
Medicinal plants have been used for thousands of years in the treatment of
numerous diseases and drug discovery (4). Approximately 25% of today’s medicines
27
are derived from plant natural products such as the Taxol used in breast cancer
therapy (5) (Figure 2.1). Overall, natural products from plants include terpenes,
alkaloids, lignans, and polyketides. Many well-known complementary therapies, such
as curcuminoids (curcumin) and flavonoids (genistein, quercetin, resveratrol) were
occupied a large portion of the natural products discovered from bioactive plants.
They were usually extracted from plant extracts using newly-developed analytical
techniques (6,7). For example, the phloroglucinol components of bioactive plants
were generally discovered by analyzing organic solvent fractions of plant materials
(8). Phloroglucinols itself and its derivatives are smooth muscle relaxants and serve as
antispasmodic in clinical use. It is sold in Europe under the names Spasfon and
Spasfon Lyoc. The one used in industry and medicine is synthesized chemically;
however, it has received little attention for bio-production. Searching databases of
scientific literature uncovered a number of target plants across a wide range of species,
we found phloroglucinols and its derivatives were isolated from Acacia arabica and
Euclyptus kino and others (8). In nearly all cases, chemical characterization of the
phloroglucinol natural products was not followed by subsequent characterization of
the producing enzymes. So far, only two acylpholoroglucinol-producing enzymes
from plants have been studied, isovalerophenone synthase from Humulus lupulus (9)
and PKS1 from Hypericum perforatum (10). Thus, the genetic material of known
phloroglucinol or its derivatives producing plants provide rich sources of
uncharacterized phloroglucinol synthase enzymes.
28
The well-studied plant Type III PKSs are the chalcone synthases (CHSs), which
are involved in the synthesis of precursors of certain pigments. A large number of
CHS family enzymes have been discovered and characterized (2). Based on these
CHSs, it may be possible to isolate new Type III PKS from bioactive plants, such as
enzymes that are capable of synthesizing phloroglucinol or acylphloroglucinol.
The goal of this project is to create a PCR-based screening method using
degenerate primers to identify novel Type III PKS, especially phloroglucinol synthase,
from bioactive plants. If not, an enzyme with previously uncharacterized substrate
specificity or new intermediates or products will be a valuable discovery as well. In
this method, degenerate primers were used as probes to mine the plant genetic
materials. Therefore, a validation of the degeneracy of the primers needs to be tested
first, to confirm the efficiency of our degenerate primers. Once this method is
validated, future discovery of new Type III PKS can be processed and
characterization of interesting enzymes can be performed to determine their substrate
and product profiles.
2.2 Results and discussion
2.2.1 Design of the degenerate primers
Detection of interested genes by PCR is a commonly used strategies in molecular
biology (11). Degenerate primers are used for amplifying a family of unknown genes
by incorporating deoxyinosines in the primer sequence to create the genetic diversity.
Inosine is a natural nucleoside commonly present in transfer RNAs and is able to base
29
pair with thymidine, cytosine, and, to a less extent, adenine. This strategy has already
been employed for discovery of Type III PKS, such as acridone synthase from
Huperzia serrate (12), and octaketide synthase from Aloe arborescens (13).
Fourteen different enzymes from plant Type III PKS family were chosen as the
template for design of the degenerate primers. The phylogenetic relationship of these
fourteen enzymes and their main polyketide products are shown in Figure 2.2. The
mRNA sequences coding these enzymes were aligned using ClustalW. From the
alignment, regions shared high homologies were found and six of them were chosen
to create the degenerate primers. The locations of the three sets of conserved regions,
F1, F2, F3, R1, R2 and R3, mapped on the M. sativa chalcone synthase 2 (CHS2)
gene and protein sequence are shown in Figure 2.3. Three primer sets annealing at
these regions (shown in yellow, green, and pink) amplify three gradually shorter
fragments to perform a nest PCR, with fragment lengths of approximately 800bp
(yellow, region F1 and R3), 770bp (green, region F2 and R2), and 400bp (pink, region
F3 and R1). Nest PCR would help to eliminate the false positive and also enrich the
genetic materials of interest. The core fragments that were ~ 400bp in length were
specially designed in the region that contained the important residues, such as
Thr-197 and Gly-256. Those residues play an important role in substrate recognition
and elongation of a Type III PKS. In this case, alignment of the resultant core
fragments would give insights into the potential classification of the Type III PKSs.
This would help avoiding cloning and characterizing less desirable enzymes.
30
2.2.2 Validation of degenerate primers
As the degenerate primers is the first key element to insure the success of
amplification of the desired Type III PKS, Type III PKSs from Hypericum perforatum
are used to validate the degenerate primers. Hypericum perforatum, known as St.
John’s Worts, is a medical plant and its extracts are commonly used for the treatment
of mild to moderate depression (14). Four polyketides, hyperforin, adhyperforin,
hyperfricin, and pseudohypericin are found in this plant and believed to be bioactive
(15). Thus, Hypericum perforatum is ideal to be used as the model system to test the
degenerate primers as it is a medical plant and has a number of PKSs whose
sequences are known. Four type III PKSs have been cloned from H. perforatum:
chalcone synthase, benzophenone synthase, PKS1, a hyperforin synthase and PKS2,
which synthesizes the octaketide precursor to hyperforin, emodin anthrone (16). The
primary sequences of these four enzymes are divergent, so the ~400bp fragment may
be enough to differentiate the four genes. The validation experiment design is shown
in Figure 2.4.
In this study, the genetic materials were collected from the improved strain
Hypericum perforatum var. Topas. Seeds were germinated on Murashige-Skoog
media and grow for four weeks. Total RNA was extracted from immature tissue,
purified and concentrated, and then was used to create the cDNA library. One positive
control gene, which synthesizes the RuBisCo (Ribulose-1,5-bisphosphate carboxylase
oxygenase), was used to ensure the successful preparation of mRNA and cDNA
library. RuBisco is the most abundant enzyme in green plants, so it was chosen as the
31
positive control. Specific primers based on the published mRNA sequence of the four
known PKSs (BPS, CHS, PKS1 and PKS2) were used to amplify the corresponding
genes from the cDNA library (Figure 2.4). The amplified fragments were cloned into
pET 26 vectors and confirmed by sequencing. The identities were greater than 90%
versus the templates (BPS. 797/855 nt, 93%; CHS, 1004/1067 nt, 94%; PKS1,
960/981 nt, 97%; PKS2, 1057/1078 nt, 98%). Then, degenerate primers were used to
amplify the Type III PKSs based on nest PCR protocol. Basically, PKS-F1/PKS-R3
were used to amplify the ~800 bp fragment and then , the product was used as
template for the second round of amplification using PKS-F2/PKS-R2, and the
product of second round served as the template for the last round of amplification by
PKS-F3/PKS-R1. Product from the third round of nest PCR can be visible on agarose
gel as shown on Figure 2.5.
The fragments were cloned into a vector and 26 colonies were sequenced. The
results showed that out of the 26, some were belonging to CHS and BPS from H.
perforatum, and an additional PKS was found. The sequence identity of various
clones to CHS, BPS, PKS1 and PKS2 was shown in Table 2.1.
A BLAST search based on the ~400 nt sequence of the unknown gene fragments
(clones #5 and #8) indicates it has significant homology to plant CHS and maybe root
specific. Our inability to identify PKS1 and PKS2 may due to the low expression
level of these two genes, also may due to the availability of these two genes in the
immature plant tissue.
32
Nonetheless, the validation data confirmed that the degenerate primers are able
to amplify the Type III gene from a plant cDNA library. During the degenerate primer
design process, no Type III PKS from H. perforatum was included, which proved that
the degenerate primers designed were able to amplify Type III PKS from general
plant tissue.
2.2.3 Identification of target plants
The target plants were selected based on two criteria: firstly, the plant needs to be
bioactive, evidenced by use in traditional medicine and/or findings of high-valued
compounds reported by scientific journals; secondly, the plant has been reported to
produce phloroglucinol or derivative compounds. Searching scientific database,
several target plants have been picked out over a wide range of species. Trees of
Eucalyptus and ferns of Dryopteris are two groups of plants in which
acylphloroglucinols are particularly abundant and no cloning of relevant genes.
Eucalypts produce a wide variety of acylphloroglucinol compounds which may be
formylated or acylphloroglucinol-terpene adducts (e.g. euglobals and macrocarpals)
(17). For example, the compound grandinol, produced by Eucalyptus grandis, is the
most-studied acylphloroglucinols. Gradinol serves as a growth regulator in the plants
and can inhibit germination (8). Eucalypt polyketide natural products are also known
to be antiviral, antibacterial, anticancer, and cariostatic (inhibit the growth of bacteria
that cause tooth decay), and Eucalypts have been used by the Aboriginal people of
Australia for the treatment of colds, influenza, toothache, snakebites, fevers, and other
33
ailments (17). The plant E. camaldulensis, from which the compound phloroglucinol
has been isolated in the past, is an important species to study.
2.2.4 Screening of PKS genes in target plants
Based on the two criteria and literature study, two plants, Eucalyptus
camadulensis var. Silverton Province and Eucalyptus robusta were chosen for study.
E. robusta produces the dimeric and monomeric acylphloroglucinols Robustaol A and
B, respectively, as well as the acylphloroglucinol-terpene adducts Robustadial A and
B. The leaves of E. robusta are used to produce an anti-malarial medicine, “Da Ye
An,” in China, and also to treat bacterial illnesses and dysentery. E. camaldulensis, or
red gum, was chosen as the second target species. It is an abundant tree also known to
produce acylphloroglucinol-terpene adducts.
Specimens of the target plants E. camaldulensis var. Silverton Province and E.
robusta were obtained for study. Briefly, the collected leaf tissue was flash-frozen in
liquid nitrogen and crushed, and total RNA was extracted from the plant tissue and
used to create cDNA libraries. This cDNA was used as the template for nested PCR.
At each stage, PCR products were analyzed by agarose gel electrophoresis, and in the
initial step, an amplification of a fragment of the abundant mRNA for RuBisCo
(Ribulose-1,5-bisphosphate carboxylase oxygenase) was also used as a control. The
first and second amplifications did not result in bands of the expected size, with the
exception of the RuBisCo control, but the final amplification gave a fragment of the
appropriate size (~400 nt). The fragments were then digested and ligated into a vector
34
for sequencing. The absence of product bands until the third nested PCR indicates the
need for this strategy due to low abundance of the mRNAs of interest in Eucalyptus.
A total of 94 clones and 27 clones were sequenced from the cDNA libraries
created from the leaves of E. camaldulensis and E. robusta, respectively. Depending
on the threshold (i.e. the number of different amino acids in the cloned DNA
fragments), the number of different genes could range from 5 to 97. If the threshold in
the validation experiment was used as a reference (i.e. genes with 80% sequence
identity are considered the same), the identified DNA fragments could be classified
into 11 groups, representing 11 unique putative Type III PKSs genes. To make it more
stringent, these 11 groups can be further consolidated into 5 groups (PKS1-5)
according to the characteristic residues. Representation of each group ranged from 1
to approximately 50 clones and two of the three conserved catalytic triad residues in
all known Type III PKSs (Cys and His) were present.
Based on the translated sequence, specifically on the Gly-256 (M. sativa
chalcone synthase numbering) equivalent and nearby residues, the activity of a new
Type III PKS could be inferred. Enzymes utilizing coumaroyl-CoA as the priming
acyl-CoA (chalcone synthase and stilbene synthase enzymes) typically possess a
glycine at the equivalent of CHS Gly-256, and a DGH (Asp-255, Gly-256, His-257)
pattern, indicating that PKS4 likely represents CHS or STS enzymes. Benzophenone
synthases possess a TAH pattern, which is shared by PKS2. Two enzymes possessed
unusual residues at those locations: AGH was seen in PKS1, and AAH for PKS5.
35
The translated sequence of PKS3 shows that it possesses the residues EGQ at the CHS
Asp-255, Gly-256, His-257 equivalent positions. ClustalW alignment of
representative translated sequences from each group is shown in Figure 2.6, and
highlights the diversity present among these five putative Type III PKS enzymes.
2.2.5 Cloning of full length PKS genes using RACE-PCR
To further investigate the cloned putative Type III PKS fragments, RACE-PCR
(rapid amplification of cDNA ends-polymerase chain reaction) was used to clone the
full-length genes encoding the three Eucalyptus Type III PKSs. However, for Pks5, as
the amount of this gene in the cDNA library was so low, the RACE-PCR failed to
obtain the 3’- and 5’-end of this gene. Thus we decided to focus on Pks1 and Pks3.
The 3’- and 5’-ends of these two genes were obtained from RACE-PCR, and specific
primers were designed to obtain the full length gene. Full length genes were
successfully amplified from the cDNA library (Figure 2.7). The resulting full-length
gene for Pks 1, consists of 1170 nucleotides, which corresponds to 389 amino acid
residues, and for Pks 3, consists of 1182 nucleotides, which corresponds to 408 amino
acid residues. The two full length genes were sequenced. BLAST search indicates
both Pks1 and Pks3 share 75% sequence identity with M. sativa chalcone synthase.
Bioinformatic analysis indicates that the catalytic triad residues (Cys, His, Asn) of
Type III PKSs are all present in this protein. By comparing all the conserved residues
in CHSs, including the catalytic triad: Cys164, His303, Asn336; the active sites:
Thr197, Ile254, Gly256, Ser338; the catalytic site: Met137, Gly211, Gly216, Pro375,
and the gate keepers: Phe215, Phe265, it was found that all these residues are
36
conserved in Pks3, while Pks1 has Ile265 instead of Phe265. Therefore, we selected
Pks3 for further characterization.
2.2.6 Enzyme characterization and product identification
In order to produce the large quantities of enzymes required for characterization
studies, a heterologous host was needed for over-expression. Escherichia coli was
chosen as the heterologous host. It is widely used for heterologous expression due to
its fast growth, ease of transformation and mature and numerous availability of DNA
cloning tools (18). The pks1 gene with C-terminal His6 was cloned into the
IPTG-inducible pET26b(+) vector, which possesses a PT7 promoter, and expressed as
a soluble protein in E. coli. A PKS1 protein band is clearly visible by SDS-PAGE
analysis of purified soluble lysate from E. coli PKS1 expression. However, the purity
was poor, so gel filtration was used to enhance the purity (Figure 2.8).
At the same time, M. sativa chalcone synthase was cloned and purified to serve
as the positive control. Enzymatic activity assay was carried out as described
elsewhere (19). 14 different CoAs were tested as the starter unit whereas malonyl
CoA was used as the extension unit. Several HPLC programs were tested as listed in
Table 2.3 and 2.4, and the highlighted one was chosen for CoA separation and
product discovery.
As for M. sativa chalcone synthase, 4-coumaroyl CoA was used as the starter
unit and malonyl CoA as the extension unit. 4-coumaroyl-NAC was chemically
synthesized and used as the starter unit. The chemical structure of
4-coumaroyl-NAC was confirmed by both LC-MS and NMR (data not shown). Both
37
CHS and Pks1 were incubated with these two substrates and the expected product
peak for naringenin was detected (Figure 2.9), which indicated that the enzymes were
active. However, after testing all possible starter CoAs, no novel product was
identified, which indicates that Pks1 is a member of the CHS family of Type III PKSs
and shares the same function as M. sativa chalcone synthase.
2.3 Conclusion
In this study we developed a PCR-based screening method able to identify novel
Type III PKSs in medicinal plants. The screening system was successfully validated
by identifying two known Type III PKSs and one unknown Type III PKS from
Hypericum perforatum. Initial results from screening Eucalyptus species have been
promising, and led to the identification of gene fragments from at least five putative
Type III PKS enzymes between E. robusta and E. camaldulensis, including two
enzymes which appear to be conserved in both species. Bioinformatic studies allowed
us to identify one putative benzophenone synthase and three putative chalcone or
stilbene synthases. RACE-PCR was used to obtain the full-length genes encoding two
Eucalyptus Type III PKSs, however, further biochemical characterization showed that
Pks1 Type III PKS belongs to the CHS family and has a same function as M. sativa
chalcone synthase.
38
2.4 Materials and methods
2.4.1 Strains and reagents
E. coli DH5α was obtained from the University of Illinois at Urbana-Champaign
Biochemistry Department's Media Preparation Facility (Urbana, IL). Hypericum
perforatum var. Topas seeds were purchased from Sand Mountain Herbs (Fyffe, AL).
Eucalyptus robusta and Eucalyptus camaldulensis var. Silverton Province plants were
obtained from Windmill Outback Nursery (Louisa, VA). Plasmid pET26b(+) and
pET28a(+) were obtained from Novagen (Madison, WI). All restriction
endonucleases, as well as T4 DNA ligase and antarctic phosphatase, were purchased
from New England Biolabs (Beverly, MA). The QIAprep Spin Plasmid Mini-prep Kit,
QIAquick PCR Purification Kit, QIAquick Gel Extraction Kit, and RNeasy MinElute
Cleanup Kit were purchased from Qiagen (Valencia, CA). The Total RNA Isolation
Mini Kit was obtained from Agilent Technologies (Santa Clara, CA). All primers
were synthesized by Integrated DNA Technologies (Coralville, IA). 5 PRIME
MasterMix was purchased from 5 PRIME Inc. (Gaithersburg, MD). Unless otherwise
noted, all other reagents were purchased from Sigma-Aldrich (St. Louis, MO).
2.4.2 Degenerate primer design
The mRNA sequences of fourteen enzymes in the plant Type III PKS family were
used as the templates to design degenerate primers: Hypericum androsoemum
benzophenone synthase (BPS), Hordeum vulgare homoeriodictyol/eriodictyol
synthase (HEDS), Medicago sativa chalcone synthase 2 (CHS2), Arachis hypogea
39
stilbene synthase (STS), Pinus sylvestris stilbene synthase (STS), Gerbera hybrida
2-pyrone synthase (2-PS), Aloe arborescens octaketide synthase (OKS), Aloe
arborescens pentaketide chromone synthase (PCS), Rheum palmatum aloesone
synthase (ALS), Rheum palmatum benzalacetone synthase (BAS), Humulus lupulus
isovalerophenone synthase (VPS), Hydrangea macrophylla stilbenecarboxylate
synthase 1 (STCS1), Hydrangea macrophylla coumaroyl triacetic acid lactone
synthase (CTAS), and Huperzia serrata acridone synthase (ACS).
The mRNA sequences of the fourteen above enzymes were aligned using the
ClustalW tool in the Biology Workbench suite (http://workbench.sdsc.edu) (20).
Highly similar regions were identified from the alignment, and the six regions chosen
to create degenerate primers are mapped onto the M. sativa CHS2 mRNA and protein
sequences in Figure 2.3. The conserved regions were denoted F1, F2, F3, R1, R2, and
R3, and used to design primers of the same name. To keep degeneracy as low as
possible, less conserved bases were replaced with deoxyinosine, with a maximum of
three deoxyinosine nucleotides per primer. The primer sequences, their levels of
degeneracy, and definitions of mixed bases are given in Table 2.5. Three sets of
primers are designed to perform nest PCR, which is essential as repeating of F1-R3
cannot amplify desired fragments and using F3-R1 initially cannot generate enough
amount of the desired fragment either. Restriction sites were added to primers
PKS-F3 (NheI) and PKS-R1 (XhoI) to facilitate cloning the smallest fragments into a
vector for sequencing; those primers are PKS-F3-F and PKS-R1-F. Primers both with
and without flanking bases were generated because it was unclear whether
40
non-hybridizing, flanking nucleotides would inhibit the amplification reaction. Based
on the M. sativa CHS2 mRNA (Figure 2.3), the fragment amplified by PKS-F1 and
PKS-R3 is 803 nt in length; the fragment amplified by PKS-F2 and PKS-R2 is 776 nt;
and the fragment amplified by PKS-F3(-F) and PKS-R1(-F) is 407 nt.
2.4.3 Seed sterilization and germination
Before germinating, H. perforatum var. Topas seeds were sterilized by a four-step
process: (1) seeds were washed in a solution of 70% ethanol and 0.1% Tween-20 with
vigorous mixing for 1 minute; (2) seeds were washed in a solution of 5% NaOCl
(bleach) with vigorous mixing for 10 minutes; (3) seeds were then rinsed with three
washes of distilled, deionized water which was sterilized through a 0.22 µm syringe
filter; (4) seeds were soaked in a solution of Agribrom antifungal dissolved in sterile
distilled, deionized water and incubated in the dark at 4 °C for 48 hours. Subsequently
the sterilized seeds were rinsed in sterile distilled, deionized water, plated on
Murashige-Skoog (no sucrose) plates (Murashige-Skoog basal salts, MES, 0.7% agar,
pH 5.7) sealed with gas-permeable tape (3M, St. Paul, MN) and incubated in a lighted
room to induce germination. After approximately four weeks, seedlings were removed
from agar plates and plant tissue was collected and cDNA prepared as described
below.
2.4.4 Collection of plant tissue
Plant tissue (leaf) was cut from live plants and immediately frozen on liquid
nitrogen to prevent degradation of RNA. Frozen tissue was crushed to fine powder
41
using a mortar and pestle cooled to freezing using liquid nitrogen. Crushed plant
tissue was stored at -80 °C until RNA extraction.
2.4.5 Total RNA isolation and concentration and cDNA synthesis
Total RNA was purified from crushed plant tissue using the Agilent Total RNA
Isolation Mini Kit according to the manufacturer’s instructions. Approximately 100
mg of plant tissue was used in each RNA purification. To purify and concentrate
RNA, the Qiagen RNeasy MinElute Cleanup Kit was used according to the
manufacturer’s instructions. Total RNA was visualized by agarose gel electrophoresis
and concentration and purity were determined using a NanoDrop (Thermo Fisher
Scientific, Waltham, MA). Purified, concentrated total RNA was used as the template
for cDNA synthesis using Transcriptor II reverse transcriptase enzyme (Invitrogen,
Carlsbad, CA) and anchored Oligo-dT20 primer (Invitrogen, Carlsbad, CA) according
to the manufacturer’s instructions.
2.4.6 Screen validation using Hypericum perforatum var. Topas
To verify their expression, full-length genes for H. perforatum CHS, BPS, PKS1,
and PKS2 were amplified from H. perforatum cDNA using the specific primers
shown in Table 2.6 and 5 PRIME MasterMix according to the manufacturer’s
instructions (PCR program: 95 °C, 3 min; [94 °C, 1 min; 50 °C 1 min; 68 °C 1 min 30
sec] x 50; 68 °C, 7 min). PCR products were analyzed by agarose gel electrophoresis
and then bands of the appropriate size were purified by gel purification and extraction.
DNA sequencing reactions were performed using the same primers initially used to
42
amplify the genes (Table 2.6) and the BigDye Terminator v3.1 Cycle Sequencing Kit
(Applied Biosystems, Foster City, CA) according to the manufacturer’s instructions.
DNA was sequenced by the Biotechnology Core DNA Sequencing Laboratory at the
University of Illinois at Urbana-Champaign (Urbana, IL). Comparing the sequencing
results versus published sequences confirmed that all four PKSs were expressed in the
H. perforatum seedlings.
Amplifications with the degenerate primers were performed using the following
PCR program: 95 °C, 3 min; [94 °C, 1 min; 50 °C 1 min; 68 °C 1 min 30 sec] x 50;
68 °C, 7 min. DNA fragments were amplified from cDNA using the primers shown in
Table 2.5 and 5 PRIME MasterMix according to the manufacturer’s instructions.
Amplification of the RuBisCo fragment was performed using specific primers for the
RuBisCo gene in conjunction with the first amplification using PKS-F1 and PKS-R3
primers, to verify the amplification was successful and cDNA was present. For
amplifications using cDNA as the template, 1 µl of cDNA was used; alternatively, 5
µl of PCR product was used as the template for subsequent nested PCR amplifications
using primers PKS-F2 and PKS-R2 or PKS-F3-F and PKS-R1-F.
PCR-amplified ~400 nt DNA fragments were cloned into the pET26b(+) vector
to facilitate DNA sequencing. The ~400 nt fragment was digested with NheI and XhoI
(10 units enzyme per µg DNA), and the vector was digested with XbaI and XhoI (10
units enzyme per µg DNA). Note that NheI and XbaI have compatible cohesive ends.
After ligation using T4 DNA ligase the plasmid was transformed into E. coli DH5α
and single colonies were picked. Presence of an insert of the correct size was verified
43
by colony PCR. Briefly, cells were collected from 200 µl overnight culture and
resuspended in 200 µl sterile distilled, deionized water. The samples were boiled for
ten minutes, cooled to room temperature, and then spun at max speed in a bench top
microcentrifuge for three minutes to pellet cell debris. A PCR amplification using Taq
DNA polymerase (Invitrogen, Carlsbad, CA) was set up according to the
manufacturer’s instructions with 5 µl of the soluble cell lysate as template and using
primers specific to the T7 promoter and the T7 terminator. The PCR product was
analyzed by agarose gel electrophoresis. A plasmid with resulting PCR product of
size ~500 nt was purified and submitted for DNA sequencing. Sequencing reactions
were performed using primers specific to the T7 terminator and the BigDye
Terminator v3.1 Cycle Sequencing Kit. DNA was sequenced by the Biotechnology
Core DNA Sequencing Laboratory at the University of Illinois at Urbana-Champaign
(Urbana, IL).
DNA sequencing of 26 fragments in pET26b(+) identified the ~400 bp
fragments as belonging to the genes for H. perforatum CHS, H. perforatum BPS, and
an additional, previously unknown gene, based on sequence homology (Table 2.1).
The other two enzymes, PKS1 and PKS2, were not identified, probably due to lower
relative expression (and thus abundance of mRNA) in the young plants.
2.4.7 Screening Eucalyptus species
Leaf tissue was taken from E. robusta and E. camaldulensis var. Silverton
Province trees (~1.5 m height) for RNA isolation. Collection of plant tissue, RNA
isolation, and cDNA synthesis were completed as described in the sections above.
44
Degenerate primer nested PCR, cloning of ~400 nt fragments, and DNA sequencing
were performed as described above for H. perforatum. A total of 27 individual clones
were submitted for sequencing from E. robusta and 94 from E. camaldulensis.
ClustalW and BLAST analysis using the Biology Workbench Suite (20) were used to
analyze the resulting DNA and translated protein sequences. Depending on the cut-off
(i.e. the number of different amino acids in the cloned DNA fragments), the number
of different Type III PKS genes ranges from 5 to 97 (Table 2.2). Phylogenetic
analysis of all the Type III PKS fragments is given in Figure 2.10.
2.4.8 RACE PCR
The core fragment of EC2 was subjected to RACE PCR using gene specific
primers (Figure 2.7) and the protocol of 3’/5’ RACE system for Rapid Amplification
of cDNA Ends Kits Version 2.0 (Invitrogen, Carlsbad, CA). The PCR program
consists of 94 °C (2 min), 35 cycles at 94 °C (1 min), 55 °C (30s) and 72 °C (2 min),
and a final 10 min extension at 72 °C. A plasmid with resulting PCR product of size
~1000 nt was purified and submitted for DNA sequencing. Sequencing reactions were
performed using primers specific to the T7 terminator and the BigDye® Terminator
v3.1 Cycle Sequencing Kit.
2.4.9 Protein purification
Plasmids pET26b(+)-EC2, pET26(+)-ER4 and pET26(+)-CHS were transformed
into E.coli BL21 (DE3) and single colonies were picked and inoculated 5 mL LB
containing 50 μg kanamycin mL-1 . Following overnight growth at 37 °C, and this
45
culture was used to inoculate 500 mL TB containing the appropriate antibiotic, and
grown at 37 °C until the optical density at 600 nm reached 0.8. Protein expression was
induced with the addition of 0.25mM isopropyl-β-D-thiogalactopyranoside (IPTG),
and growth was continued for an additional 18 hours at 20 °C. The flasks were chilled
for 30 minutes, and then the cells were harvested by centrifugation (6000 g, 4 °C, 10
minutes). The resulting pellet was resuspended in 2 mL lysis buffer (10 mM
imidazole, 50 mM sodium phosphate buffer, 300 mM NaCl, 10 mM imidazole, 15%
glycerol, pH 8.0, 1 mg/mL lysozyme) per gram cell weight and frozen at -80 °C.
Subsequently, cells were lysed using sonication and cellular debris was removed by
centrifugation (40,000 g, 4 °C, 10 minutes). Recombinant proteins were purified by
chromatography on Ni-NTA resin by amino-terminal hexahistidine affinity tag.
Captured proteins was washed with wash buffer (20 and 50 mM imidazole, 50 mM
sodium phosphate buffer, 300 mM NaCl, 20 or 50 mM imidazole, 15% glycerol, pH
8.0) and finally eluted in elution buffer (200 mM imidazole, 50 mM sodium
phosphate buffer, 300 mM NaCl, 200 mM imidazole, 15% glycerol, pH 8.0). The
protein content of individual elution fractions was estimated using Quick Start
Bradford Protein Assay (Bio-Rad, Hercules, CA) and the appropriate fractions were
pooled and buffer exchanged into 10 mM phosphate buffer pH 7.5 containing 15%
glycerol at 4°C. The protein was concentrated using an Amicon Ultra centrifugal filter
device (Millipore, Billerca, MA). Representative SDS-PAGE analysis of recombinant
protein was shown in Figure 2.8.
46
2.4.10 Polyacrylamide Gel Electrophoresis (SDS-PAGE)
Protein standards, Powerpac 300 power supply, Mini-PROTEAN 3 assembly, and
precast acrylamide gels were purchased from Bio-Rad (Hercules, CA). Varying
amounts of sample, dependent upon protein concentration, were mixed with 2 μl
SDS-PAGE loading buffer (30% glycerol, 1% SDS, 0.3% bromophenol blue, 0.5 M
Tris-HCl, pH 6.8) and water to a final volume of 20 μl, Samples were denatured at
98°C for 10 minutes, cooled to 4°C, and then loaded onto a gel in a Bio-Rad
Mini-PROTEAN 3 assembly. Voltage was supplied by a Powerpac 300 supply at a
voltage of 100 V for 100 minutes. Gels were stained with SimplyBlueTM Safestain
from Invitrogen (Carlsbad, CA) followed by destaining overnight in distilled water.
Destained SDS-PAGE gels were scanned subsequently discarded.
2.4.11 Determination of activity
Experiments were conducted according to the method of Mckhann, et al. (19)
with some modifications. The experiments were carried out in triplicates using a
range of different starting CoAs. For each reaction, substrate was incubated with EC2
enzyme (0.5 mg/ml) in PBS buffer (pH 7.5) at 37°C for 15 minutes. Usually, 0.5 μM
enzyme is used and 100 μM malonyl CoA in a total reaction volume of 400 μl. The
concentration of priming acyl-CoA was 10 μl. The reaction mixture was incubated at
37°C for 5 minutes before addition of malonyl-CoA. The reaction was quenched by
the addition of 20 μl 20% HCl and kept room temperature until HPLC analysis. The
reactions were analyzed using an Agilent 1100 Series HPLC and ZORBAX SB-C18
47
reverse-phase column (3.0 x 150 mm, 3.5 μm) (Aglient Technologies, Palo Alto, CA)
monitoring at 254 nm.
2.4.12 Product HPLC analysis method
Product analysis method was conducted by comparing a series of different HPLC
programs. An Agilent 1100 Series HPLC and ZORBAX SB-C18 reverse-phase
column (3.0 x 150 mm, 3.5 μm) (Aglient Technologies, Palo Alto, CA) was used to
monitor at 280 nm and at a flow rate of 0.5 mL/min at 25°C. Solvent A is water with
1% acetic acid and solvent B is methanol.
2.5 References
1. Watanabe, K., Praseuth, A.P. and Wang, C.C. (2007) A comprehensive and engaging overview of the type III family of polyketide synthases. Curr Opin Chem Biol, 11, 279-286.
2. Austin, M.B. and Noel, J.P. (2003) The chalcone synthase superfamily of type III polyketide synthases. Nat Prod Rep, 20, 79-110.
3. Jiang, C., Kim, S.Y. and Suh, D.Y. (2008) Divergent evolution of the thiolase superfamily and chalcone synthase family. Mol Phylogenet Evol, 49, 691-701.
4. Halberstein, R.A. (2005) Medicinal plants: historical and cross-cultural usage patterns. Ann Epidemiol, 15, 686-699.
5. Zhou, L.G. and Wu, J.Y. (2006) Development and application of medicinal plant tissue cultures for production of drugs and herbal medicinals in China. Nat Prod Rep, 23, 789-810.
6. Bisht, K., Wagner, K.H. and Bulmer, A.C. (2009) Curcumin, resveratrol and flavonoids as anti-inflammatory, cyto- and DNA-protective dietary compounds. Toxicology, 278(1), 88-100.
7. Khan, N., Afaq, F. and Mukhtar, H. (2008) Cancer chemoprevention through dietary antioxidants: progress and promise. Antioxid Redox Signal, 10, 475-510.
8. Pal Singh, I. and Bharate, S.B. (2006) Phloroglucinol compounds of natural origin. Nat Prod Rep, 23, 558-591.
9. Okada, Y. and Ito, K. (2001) Cloning and analysis of valerophenone synthase gene expressed specifically in lupulin gland of hop (Humulus lupulus L.). Biosci Biotechnol Biochem, 65, 150-155.
48
10. Bais, H.P., Vepachedu, R., Lawrence, C.B., Stermitz, F.R. and Vivanco, J.M. (2003) Molecular and biochemical characterization of an enzyme responsible for the formation of hypericin in St. John's wort (Hypericum perforatum L.). J Biol Chem, 278, 32413-32422.
11. Jabado, O.J., Palacios, G., Kapoor, V., Hui, J., Renwick, N., Zhai, J., Briese, T. and Lipkin, W.I. (2006) Greene SCPrimer: a rapid comprehensive tool for designing degenerate primers from multiple sequence alignments. Nucleic Acids Res, 34, 6605-6611.
12. Wanibuchi, K., Zhang, P., Abe, T., Morita, H., Kohno, T., Chen, G., Noguchi, H. and Abe, I. (2007) An acridone-producing novel multifunctional type III polyketide synthase from Huperzia serrata. FEBS J, 274, 1073-1082.
13. Abe, I., Oguro, S., Utsumi, Y., Sano, Y. and Noguchi, H. (2005) Engineered biosynthesis of plant polyketides: chain length control in an octaketide-producing plant type III polyketide synthase. J Am Chem Soc, 127, 12709-12716.
14. Klingauf, P., Beuerle, T., Mellenthin, A., El-Moghazy, S.A., Boubakir, Z. and Beerhues, L. (2005) Biosynthesis of the hyperforin skeleton in Hypericum calycinum cell cultures. Phytochemistry, 66, 139-145.
15. Butterweck, V. and Schmidt, M. (2007) St. John's wort: role of active compounds for its mechanism of action and efficacy. Wien Med Wochenschr, 157, 356-361.
16. Karppinen, K., Hokkanen, J., Mattila, S., Neubauer, P. and Hohtola, A. (2008) Octaketide-producing type III polyketide synthase from Hypericum perforatum is expressed in dark glands accumulating hypericins. FEBS J, 275, 4329-4342.
17. Ghisalberti, E.L. (1996) Bioactive acylphloroglucinol derivatives from Eucalyptus species. Phytochemistry, 41, 7-22.
18. Makrides, S.C. (1996) Strategies for achieving high-level expression of genes in Escherichia coli. Microbiol Rev, 60, 512-538.
19. McKhann, H.I. and Hirsch, A.M. (1994) Isolation of chalcone synthase and chalcone isomerase cDNAs from alfalfa (Medicago sativa L.): highest transcript levels occur in young roots and root tips. Plant Mol Biol, 24, 767-777.
20. Subramaniam, S. (1998) The Biology Workbench--a seamless database and analysis environment for the biologist. Proteins, 32, 1-2.
49
Figure 2.1 Examples of natural products derived from plants.
Artemisinin Quinine Taxol
Grandinol 13
Kosotoxin 14
Dryopteroside
Euglobal III
OHO
OH O
Aspidinol B
50
Figure 2.2 Phylogenetic analysis of the fourteen Type III PKS enzymes used to create
degenerate primers in this study. The primary products of the enzymes are shown. The
enzymes are, in order: HEDS, homoeriodictyol/eriodictyol synthase; ALS, aloesone synthase;
BPS, benzophenone synthase; OKS, octaketide synthase; PCS, pentaketide chromone
synthaes; STS, stilbene synthase; ACS, acridone synthase; CTAS, coumaroyl triacetic acid
synthase; STCS1, stilbenecarboxylate synthase 1; 2-PS, 2-pyrone synthase; VPS,
isovalerophenone synthase; CHS2, chalcone synthase 2; BAS, benzalacetone synthase.
51
Figure 2.3 Locations of the three sets of conserved regions used to create degenerate primers,
mapped onto the M. sativa CHS2 gene and protein, are shown in yellow, green, and pink,
corresponding to F1, F2, F3, R1, R2, and R3, respectively. Active site residues Cys-152,
His-305, and Asn-338 are shown in red and marked with asterisks. Residues which are known
to affect substrate preference and elongation number, Thr-197, Gly-256, and Ser-338, are
shown in blue and marked with a triangle.
M V S V S E I R K A Q R A E G P A T I L 1 atggtgagtgtgtctgaaattcgtaaagctcaaagggcagaaggccctgcaactatcttg 60 A I G T A N P A N C V E Q S T Y P D F Y 61 gccattggcactgcaaatccagcaaattgtgttgaacaaagcacttatcctgatttttac 120 F K I T N S E H K T E L K E K F Q R M C 121 tttaaaattacaaatagtgagcacaaaactgaacttaaagagaaatttcagcgcatgtgt 180 D K S M I K R R Y M Y L T E E I L K E N 181 gataaatctatgatcaagaggagatacatgtatctaacagaggagattttgaaagaaaat 240 P N V C E Y M A P S L D A R Q D M V V V 241 ccaaatgtttgtgaatacatggcaccttcattggatgcaaggcaagacatggtggtggta 300 E V P R L G K E A A V K A I K E W G Q P 301 gaggtacctagactagggaaggaggctgcagtgaaggctataaaagaatggggtcaacca 360 K S K I T H L I V C T T S G V D M P G A 361 aagtcaaagattactcacttaatcgtttgcaccacaagtggtgtcgacatgcctggagcc 420 D Y Q L T K L L G L R P Y V K R Y M M Y 421 gattatcaactcaccaaactcttaggtcttcgcccatatgtgaaaaggtacatgatgtac 480 * Q Q G C F A G G T V L R L A K D L A E N 481 caacaagggtgttttgcaggtggcacggtgcttcgtttggctaaagatttggctgagaac 540 ▼ N K G A R V L V V C S E V T A V T F R G 541 aacaaaggtgctcgtgtgttggttgtttgttctgaggtcaccgctgtcacatttcgtggg 600 P S D T H L D S L V G Q A L F G D G A A 601 cctagtgatactcacttggacagccttgttggacaagcactatttggagatggagctgct 660 A L I V G S D P V P E I E K P I F E M V 661 gcactcattgttggttctgatccagtaccagaaattgagaaacctatatttgagatggtt 720 ▼ W T A Q T I A P D S E G A I D G H L R E 721 tggactgcacaaacaattgctcctgatagtgaaggagccattgatggtcaccttcgtgaa 780 A G L T F H L L K D V P G I V S K N I T 781 gctggactaacatttcaccttcttaaagatgttcctgggattgtttcaaagaacatcact 840 K A L V E A F E P L G I S D Y N S I F W 841 aaagcattggttgaggctttcgagccattgggaatttctgattacaactcaatcttttgg 900 * I A H P G G P A I L D Q V E Q K L A L K 901 attgcacatcctggtggacctgcaattctagatcaagtagagcaaaagttagccttaaag 960 * ▼ P E K M N A T R E V L S E Y G N M S S A 961 cctgaaaagatgaatgcaactagagaagtgctaagtgaatatggaaatatgtcaagtgca 1020 C V L F I L D E M R K K S T Q N G L K T 1021 tgtgttttgtttatcttagatgaaatgagaaagaagtcaactcaaaacggattgaagaca 1080 T G E G L E W G V L F G F G P G L T I E 1081 acgggagaaggacttgaatggggtgtattgtttggttttggaccaggacttaccattgaa 1140 T V V L R S V A I * 1141 acagttgttttgcgtagcgtggctatatga 1170
52
Figure 2.4 Overview of the PCR-based method for discovering new Type III PKS genes.
A: L Rub CHS PKS1 PKS2 BPS B: L D
Figure 2.5 (A) Gene fragments (~1000 nt) for H. perforatum RuBisCo (Rub), CHS, PKS1,
PKS2 and Benzophenone synthase(BPS), amplified by specific primers and analyzed by
agarose gel electrophoresis. (B) The nest PCR product from the third round of PCR (~400 nt)
from H. perforatum.
53
Figure 2.6 Alignment of Type III PKS fragments from five distinct enzymes discovered in
this work. Catalytic histidine residue is marked in red. The CHS Thr-197 and CHS Gly-256
equivalent residues are marked in blue.
Figure 2.7 Amplification of the full length gene. The first two lanes used Taq for PCR, and
the first one had the stop codon while the second one contained the His-tag. The third and
fourth lane used phusion for PCR and same order as the first two.
L Pks3 Pks3 after gel filtration
Figure 2.8 Pks3 protein before and after gel filtration
PKS1 AKDVAENNKGARVLVVCAEITVCTFRGPTETYLDNLVGQALFGDGASSVIVGSDPVPGV-ERPLYEIVSPKS2 AKDVAENNANARVLVVCAENTAMTFHAPNESHLDVIVGQAMFSDGAAALIIGANPDTAAGERAVFNILSPKS3 AKDVAENNAGARVLVVCSEITVTTFRGPSAENLANLVGQAIFGDGAAALIIGSDPEIPA-ERPIFQLVWPKS4 -KDVAENNKGARVLVVCSEITAVTFRGPSDTHLDSLVGQALFGDGAAALIVGSDPIPEI-ERPLFEMVWPKS5 ----AENNKGARVLVVCSEITAVTFRGPSDTRLDNLVGQALFGDGAAALIVGSDPIPEI-EKPSFEMVW PKS1 TSQTLLP-NSEGAIAGHVREIGLTIHLLKDVPGLISKNVEKNMTEAFHPLGITDWNSLFFIVHPGGTGIPKS2 ASQTIVP-GSDGAITAHFYGMGMSYFLKEDVIPLFRDNIADVMKEAFSPLDVSDWNSLFYSIHPGGPAIPKS3 TAETVLP-NSEGAIEGQLRDVGLTFGLSDGVPHLIANHIEKRLKEAFDSLGISDWNSIFWVPHPGGPAIPKS4 TAQTILP-DSEGAIDGHLREVGLTFHLLKDVPGLISKNIEKALVEAFQPLGISDYNSIFWIAHPGGTAIPKS5 TAQTIFPHDGEDIIAAHFREIGLTIHLHRDIPRLVSKNIEKAMIEAFQPLGISDYNSIFWIAHPGGRGI
50kD
30kD
54
Figure 2.9 Detection of naringenin. The peak appeared at 29.786 min showed the production
of naringenin.
55
Figure 2.10 Phylogenetic analysis of all the cloned Type III PKS fragments.
56
Table 2.1 Sequence identity (%) of ~400 nt fragments to known H. perforatum PKSs.
# BPS CHS PKS1 PKS2 # BPS CHS PKS1 PKS2
1 94 50 41 38 14 95 48 37 38
2 93 50 43 34 15 54 87 45 43
3 95 52 42 39 16 54 88 46 44
4 55 88 47 43 17 55 87 45 43
5* 55 54 35 30 18 58 80 46 46
6 53 84 44 41 19 70 48 34 43
7 55 81 46 45 20 53 84 45 42
8* 58 49 36 43 21 54 88 46 44
9 54 88 45 44 22 93 52 42 40
10 90 49 35 40 23 94 48 36 38
11 94 51 40 42 24 93 48 32 33
12** - - - - 25 93 49 38 40
13 90 47 32 37 26** 69 45 37 41
*Previously uncharacterized Type III PKS
**Low quality sequencing data.
57
Table 2.2 The number of expected different Type III PKS genes.
Number of different amino
acids (out of ~136)
Number of expected
different genes
1 97
5 31
10 16
15 12
20 11
30 9
50 5
Table 2.3 HPLC programs tested for CoA separation.
Table 2.4 HPLC programs for product discovery.
58
Table 2.5 DNA sequence and degeneracy of primers used in this study.
Primer Sequence, 5’ 3’ Degeneracy
PKS-F1 GCIATIVMDSARTGGGGNC 288 PKS-F2 TCIVRRATIACICAYBTNRT 576 PKS-F3 GYIAARGAYITNGGNGARAACAA 256 PKS-F3-F* GCGTTCGCTAGCGYIAARGAYITNGGNGARAACA 256 PKS-R1 ATNSCISKICCICCRGGRTG 128 PKS-R1-F* GCGTTCCTCGAGATNSCISKICCICCRGGRTG 128 PKS-R2 CCSMWITCIWDNCCITCNCC 768 PKS-R3 TCIAYNGTNADICCIGGNCC 384 ±Mixed bases B: C, G, T; D: A, G, T; H: A, C, T; K: G, T; M: A, C; N: A, C, G, T; S: C, G; V: A, C, G; W: A, T ; Y: C, T.*Additional flanking bases are in italics; the NheI restriction site is underlined. **Additional flanking bases are in italics; the XhoI restriction site is underlined.
Table 2.6 Gene-specific primers for H. perforatum Type III PKSs.
Primer Sequence, 5’ 3’
CHS-F ATGGTGACCGTGGAAGAAGTCAGCHS-R TTAATATGCGACACTGTGAAGGACCACBPS-F ATGGCCCCGGCGATGBPS-R CTGGAGAATTGGGACACTCTGGAGPKS1-F ATGTCTAACTTGGAGACCAATGGCTCPKS1-R TCATAGGCATAGGCTTCGAAGAAGGPKS2-F ATGGGTTCCCTTGACAATGGTTCPKS2-R TTAGAGAGGCACACTTCGGAGC
Table 2.7 Gene-specific primers for RACE PCR.
Primer Sequence, 5’ 3’
EC2-F1 CGACACCGAGACCTACCTGGACAACCEC2-F2 CGAGTGGGCCAGGCTCTGTTCGEC2-F3 CGATTCGCTAGCGATGGCGCCTCCTCCACCEC2-R1 TTCGCTAGCTGTCCAGGTAGGTCTCGEC2-R2 ACAGAGCCTGGCCCACEC2-R3 GGTGGAGGAGGCGCCATC
59
Chapter 3 Deciphering a Cryptic Biosynthetic Pathway from
Streptomyces griseus by a Synthetic Biology Approach
3.1 Introduction
Over the past 10 years, there has been a resurgence of interest in the investigation
of natural products as a source of new drugs. Around 78% of antibacterial or
anticancer drugs are natural products or have been derived from natural products (1).
Although many effective drugs have been discovered, the need for new therapeutic
agents remains due to the increasing antibiotic resistance to existing drugs and the
recent threat of bioterrorism. In the meantime, computational analysis of sequenced
genomes and metagenomes has revealed numerous gene clusters that are potentially
involved in the biosynthesis of natural products. Over the last decade, the genome
sequences of over 1900 organisms have been released; with many more soon to be
completed (http://www.genomesonline.org/cgi-bin/GOLD/bin/gold.cgi). There is no
doubt that a great deal of valuable information hidden within these sequence data that
are related to production of high-valued compounds, biosynthesis of promising drugs,
and elucidation of mechanisms of various diseases.
However, only a small fraction of this rich genome sequencing data has been
exploited. Our understanding of the specific roles of genes and the functions of those
pathways in the whole cell is still limited. For example, actinomycetes are
well-known of producing bioactive natural products, many of which have exhibited
important medical properties (2). Recently revealed genome sequencing data clearly
showed that the number of pathways coding for putative valuable natural products
60
largely outnumbers the number of currently known secondary metabolites for a given
organism (3-6). A total of 18 additional gene clusters involved in secondary
metabolite production were discovered from the complete genome sequence of
Streptomyces coelicolor, while the strain was previously known to produce only four
compounds before the S. coelicolor genome project (7). In addition, even a larger
number of such pathways were present in the genome mining results of myxobacteria
(8,9) and filamentous fungi (10-12). These gene clusters, for which the
corresponding natural metabolite is unknown, are named as “cryptic pathways” or
“orphan pathways” (3,4). Among the cryptic pathways, non-ribosomal peptide
synthetases (NRPS) and polyketide synthases (PKS) occurred frequently, which were
found in the biosynthesis of numerous bioactive natural products (13,14).
Investigation into the unknown compounds synthesized by these two classes of
enzymes may lead to the discovery of novel chemotherapeutic agents. However, since
the widely used DNA cloning methods for studying biosynthetic pathways are
time-consuming and labor-intensive, our motivation is to develop an efficient
approach to decipher the cryptic natural product biosynthetic pathways.
Up to date, two major approaches have been applied to biochemistry pathway
characterization and analysis. The first group mainly focuses on activating native host
genes while the second one takes the advantages of heterologous host gene expression
(15). The first group of methods including screening of different culture conditions,
which somehow helps to activate the cryptic pathway by introducing activators,
environmental factors or remove repressors (4). However, this approach is usually
61
time-consuming and labor-intensive. Another main drawback is that even if a pathway
is turned on, sometimes, it is still unclear which factor triggers the activation. Also,
most of native producers grow slowly and are not commonly cultivable in the
laboratory. It has been reported that only 1% of bacteria and 5% of fungi have been
cultivated in the laboratory (16-18). On the other hand, the second group of
heterologous host-based method can avoid these drawbacks since a well-characterized
organism grows fast in the laboratory and has all the genetic tools available for
engineering. This offers the possibility for metabolic engineering, which could
enhance the production of target compounds and combinatorial biosynthesis which
may generate new derivatives of a target compound. Most important of all, when
using a well-characterized heterologous host which has a clear background profile, it
would be much easier for product identification. There are many examples of
heterologous expression of natural products, such as polyketides (19,20),
methylenomycin furans (21), and phosphonic acids (22,23). However, there are
limitations for this method as well. It is difficult to transfer large biosynthetic
pathways and it is also not easy to introduce single/multiple mutations or perform
gene deletions. Other problems exist as well, such as low efficiency of promoter
recognition, poor gene expression, or low yield (4). Therefore, the need for novel
methods which are suitable for efficient transfer, expression and modification of target
biosynthetic pathways are obvious.
Synthetic biology has emerged as a versatile tool for manipulating genes,
pathways, networks, and whole cells (24). The fundamental idea behind synthetic
62
biology is that any biological system can be regarded as a combination of individual
functional elements, a limited number of which can be recombined in novel
configurations to modify existing properties or to create new ones (25). As a broad
field, the successful applications of synthetic biology such as synthesis of proteins
that do not exist in nature, design and engineering of signaling and metabolic
pathways, and rebuilding of new organisms make it an invaluable tool for natural
product discovery.
Recently, our laboratory developed a highly efficient method, “DNA assembler”
to rapidly assemble a multiple-gene biochemical pathway in a single-step fashion by
use of the in vivo homologous recombination mechanism in Saccharomyces
cerevisiae (26). Because it is easy and efficient to perform homologous recombination
in S. cerevisaie, S. cerevisiae was chosen as a host for DNA assembly (27-32). In this
chapter, we attempted to further exploit the DNA assembler method to decipher
cryptic pathways via synthetic biology for natural product discovery. Using one
cryptic gene cluster in Streptomyces griseus as a test case (33), we demonstrated that
a shuttle system can be very useful for rapid cloning of a gene cluster and its
heterologous expression in a desired organism. This would solve the problem of
introducing large gene clusters into a heterologous host. On the other hand, as it is
easy to perform genetic manipulations by using this method, we can add different
strong constitutive promoters in front of each gene in the gene cluster to control the
gene expression. This would eliminate the problem of promoter recognitions and gene
expression compared to heterologous expression alone. Future work will be focused
63
on product characterization, study of biosynthetic mechanisms by performing targeted
gene deletions.
3.2 Results and discussion
3.2.1 Selection of a target gene cluster
As we have mentioned above, the recent increasing availability of whole genome
sequences have revised our view of the metabolic capabilities of microorganisms. A
few orphan biosynthesis pathways have been identified by bio-informatics study
(4,6,14). The discovery of these numerous pathways represents a treasure trove,
which is likely to grow exponentially in the future, which will lead to the uncovering
of many novel and possibly bio-active compounds. The few natural products that have
been correlated with their orphan pathway are merely the tip of the iceberg, whilst
plenty of metabolites await discovery (4).
The complete genome sequence of Streptomyces griseus IFO 13350 was
determined in 2008 (33). Streptomyces griseus is a soil bacterium producing an
antituberculosis agent, streptomycin, which was discovered more than 60 years ago
and was the first aminoglycoside antibiotic. There are 8,545,929 base pairs (bp) in the
linear chromosome and the average G+C content is 72.2%. It was predicted that there
are 7,138 open reading frames, six rRNA operons (16S-23S-5S), and 66 tRNA genes.
It contains extremely long terminal inverted repeats (TIRs) of 132,910 bp each. After
comparisons with the genomes of two related species, Streptomyces coelicolor A3(2)
and Streptomyces avermitilis, not only the characteristics of the S. griseus genome but
64
also the existence of 24 Streptomyces-specific proteins was clarified. 34 gene clusters
or genes for the biosynthesis of known or unknown secondary metabolites were found
in the S. griseus genome.
As proof of concept, one cryptic pathway from Streptomyces griseus,
SGR810-815, containing a PKS/NRPS hybrid gene, was chosen as a model system.
BLAST search indicates that the PKS/NRPS gene shares high homology (63%) with
the one from the heat-stable antifungal factor (HSAF) biosynthetic gene cluster (34,35)
and 70-80% sequence homology with the recently characterized frontalamide gene
cluster (36). Those clusters which share high homology seem to produce structurally
similar compounds, which is called macrolide (Figure 3.1). However, the nature of
the natural product encoded by the target pathway is still not clear. To decipher this
cluster, I have been attempting to develop a new strategy based on our developed
“DNA assembler” method.
3.2.2 Applying the DNA assembler method to pathway assembly
In principle, DNA assembler can be used to assemble custom-designed DNA
fragments in any desired order by designing PCR primers to generate appropriate
overlaps between the adjacent ones. Usually, a S. cerevisiae helper fragment carrying
a centromere sequence (CEN), an automatic replication sequence (ARS) and a Ura+
selection marker, an E. coli helper fragment carrying a pBR322 ori and a Kan+
selection marker, and a helper fragment carrying an ori and a selection marker
required by the desired organism, are individually prepared. These helper fragments
are mixed with the pathway fragments, and subsequently co-transformed into S.
65
cerevisaie to be assembled into a single molecule. The transformants are selected on
SC-Ura plates and the assembled circular DNAs are subsequently purified from S.
cerevisiae. The isolated yeast plasmids need to be transformed to E. coli for
enrichment and further verification. Finally, the verified plasmids can be transformed
to the desired heterologous host for expression of the target pathway.
First, we sought to reconstruct the original SGR810-815 biosynthetic pathway
from S. griseus in Streptomyces lividans. To construct a shuttle system carrying the
complete SGR810-815 gene cluster and particularly in order to minimize the possible
mutations introduced by PCR, the biosynthetic gene cluster was split into seven
segments with the size of ~4-5 kb each (Table 3.1). 7 pairs of primers were designed
to cover the whole cluster according to the reported genomic sequence of S griseus
and three pairs of primers were designed for amplification of the three helper
fragments and generating overlap regions with the beginning and ending of the target
cluster. The S. cerevisiae helper fragment carrying CEN6, ARS H4 and Ura3
sequences was amplified from a yeast vector. The E. coli helper fragment carrying
oriR6K and accIV sequences was amplified from an E. coli vector, and accIV encodes
for an AprR gene. The S. lividans helper fragment carrying oriT, int, ϕ31 attP and
tL3 sequences was amplified from the vector used for recombination with the possible
cluster-containing fosmids in the traditional top-down approach as reported in
literature (22). OriT is the origin of conjugal transfer; int encodes for an integrase;
tL3 encodes for a terminator and ϕC31 attP is the junction site for integration into the
ϕC31 site of the chromosome. The construct is shown in Figure 3.2A.
66
Each of the fragments was individually amplified, gel purified and mixed with
others at the end. After precipitation in ethanol, the mixture was re-dissolved in
ddH2O and co-transformed into S. cerevisiae through electroporation and spread onto
SC-Ura plates. Typically, 30-50 colonies appeared after 2-3 days of incubation at
30 °C, which were then randomly picked for plasmid isolation. 20 colonies were
inoculated in SC-Ura liquid culture and the purified yeast plasmids were individually
transformed to E. coli WM1788 and selected on LB-Apr+ plates. E. coli plasmids
were subsequently purified and checked by gel electrophoresis. As a result, 18 out of
the 20 plasmids have a relatively large size (the size of the correctly assembled
plasmid is approximately 36 kb) while the other 2 are too small to be the desired
plasmid. The 18 plasmids were subjected to restriction digestion by NcoI, ApaLI and
XmnI. 13 of them exhibited an expected digestion pattern, indicating an assembly
efficiency of 65% (Figure 3.3).
The first three verified clones were named as pYL-#1, pYL-#7, and pYL-#18,
respectively, and subsequently introduced into S. lividans via conjugal transfer and
integrated into the ϕC31 site of the genome. The exconjugant integrants were verified
by PCR and checked for potential production of a novel natural product.
3.2.3 Product detection from the original cluster in a heterologous host
Traditional extraction methods were applied for product detection according to
the work flow shown in Figure 3.4. However, no product was detected from HPLC.
Later, ammonium sulfate precipitation was performed based on the previously
reported HSAF extraction method (34), as well as the SPE process and YMS plates
67
extraction from the frontalamide extraction method (36), but still no product was
identified.
3.2.4 Quantitative real-time PCR (qPCR)
As we have confirmed that the whole cluster was correctly assembled into the
heterologous host, the first concern was the gene expression issue. Thus,
Revers-Transcription PCR was performed to track the gene expression. Total RNA
was isolated based on the protocol of Total RNA Isolation Mini Kit and cDNA was
synthesized afterwards. In parallel, to eliminate genomic DNA contamination, cDNA
synthesis reaction without reverse transcriptase was carried out to serve as the
negative control. Primers were designed to amplify a ~300 bp fragment for each gene
from the cDNA library. As shown in Figure 3.5, all genes are expressed in the growth
condition.
In order to quantify the gene expression level, quantitative real-time PCR (qPCR)
was employed to track the gene expression level. The hrdB gene (37), an essential
gene of Streptomyces coelicolor A3(2), which indicates the presence of an open
reading frame encoding a putative polypeptide of 442 amino acid (aa) residues with a
molecular weight (Mr) of 48,412, was chosen as the internal control for all following
qPCR experiments and its transcription signal remained comparable in all samples.
The principal sigma-like transcriptional factor of S. coelicolor (HrdB) protein showed
an extensive aa sequence homology with the known principal sigma factors of
Escherichia coli, Bacillus subtilis, Pseudomonas aeruginosa and Myxococcus xanthus
(37). The HrdB protein is the house-keeping gene in Streptomyces strains. The hrdB
68
gene in S. griseus shows high similarity to the hrdB products of Streptomyces
coelicolor A3(2) (89.9%) and Streptomyces aureofaciens (88.1%) (38). Due to its
stable and consistent expression level in Streptomyces strains, we chose this gene to
serve as the internal control. Gene specific primers for genes contained in the target
cluster were designed by Real-Time PCR tools in IDT-DNA
(https://www.idtdna.com/scitools/Applications/RealTimePCR/). Samples were run on
the ABI Prism 7900 HT Real-Time Quantitative PCR machine and the results were
analyzed by SDS software. A standard curve was made first for all pair of primers
using genomic DNA of pYL-#18 as the template. One representative example was
shown in Figure 3.6. Real-Time PCR was performed according to a protocol
described elsewhere (39).
In order to obtain stable and consistent data, cells were grown for 4 days and
samples for RNA isolation were collected at different time points, 48 h, 72 h, 96 h and
120 h. The gene expression level from different time points were compared (Figure
3.7). Overall, all genes were expressed after 72 h and the expression reaches its stable
level. Therefore, 72 h was chosen as the best time point for collecting cells for total
RNA isolation. However, we found that in the pYL-#18 construct, the expression
levels for all genes were low and not comparable to those of the native producer- S
griseus (SG) (Figure 3.8).
3.2.5 Inserting the ErmE constitutive promoter
After observing the poor gene expression levels from the qPCR results, several
additional constructs were assembled in a similar fashion as mentioned above to help
69
facilitating the product discovery. Five different constructs were made for different
purposes. The first one was called ErmE-1 (E1 or Ep), which contains an ErmE
promoter in front of the whole gene cluster (Figure 3.2B).
While in the ErmE promoter construct (E1), all genes were confirmed to be
expressed and the first gene immediately downstream of the ErmE promoter showed a
great increase in the gene expression level, which demonstrated that by adding
constitutive promoters, gene expression level can be enhanced (Figure 3.8). It is
obvious to see that by adding constitutive promoter, the expression levels of all genes
have been improved by different extent.
3.2.6 Promoter identification and characterization
Based on the above results, we decided to add different constitutive promoters in
front of each gene in the target cluster. In order to do so, a promoter library needs to
be established. With Zengyi’s help, our lab is able to identify 10 strong constitutive
promoters for Streptomyces strains. To search those constitutive promoters, several
groups of housekeeping genes were targeted, such as RNA polymerase subunit α and
β, DNA gyrase, elongation factors, RNA polymerase sigma factors, amino-acyl tRNA
synthetase and enzymes in the essential metabolic pathways.
From EGFP confocal microscope analysis, two promoters showed strong
activities, gapDHp and rspLp (Figure 3.9). Based on this observation, we started to
search similar promoters from other bacteria species. As shown on Figure 3.10, a total
of 10 strong constitutive promoters were identified, which were used in the following
experiments.
70
3.2.7 Creating gene expression cassettes
As we have already confirmed that by adding constitutive promoters in front of a
target gene would highly increase the gene expression level, we decided to add six
different promoters in front of each gene in the cluster and this is called the fabricate
cassettes (FC) (Figure 3.2). The first design was named AO-V1, which contains six
different promoters in front of each gene in the cluster; the second one was named
AO-V2, which contains three different promoters in the gap regions of the gene
cluster; the fourth one contains 6 different promoters as well but deleted all the
upstream and downstream genes; and the last one shares the same design as the fourth
one, except it contains a xylE gene linked after the SGR814 gene to help track the
protein expression level (Figure 3.2).
We tested the gene expression levels for all those constructs by qPCR and found
that for #7-AO-V1 in minimal medium (MM), the first gene under the GAPDH
promoter showed a ~150 fold increase at the mRNA level. Therefore, the expression
levels for the three AO constructs were compared in both the YMS and MM media
(Figure 3.11). It is clear that the constructs with 6 different promoters have a higher
mRNA levels compared to the one that only contains 3 promoters in both the YMS
and MM media.
The effect of medium was considered as well, since all the promoters were
tested in the minimal medium first and the ones picked for making the AO constructs
are the strong promoters in the minimal medium. Because the medium we would
prefer for production is the YMS medium, we tested the promoters’ strength in both
71
media (data not shown) and also the gene expression levels in both media (Figure
3.12). For the two constructs chosen, AO-V1-#7 and AO-V3-#2, it is not hard to
observe that they both expressed better in the MM than in the YMS medium, which
indicated that the promoter picked worked better in the MM medium.
3.2.8 Identification of final products
We followed the product extraction method described in literature (36) to isolate
the potential product from the target cluster. By comparing the HPLC profiles of the
negative control (S. lividans without the target gene cluster) and the sample, a
potential product was observed at around 24.1 min and the corresponding mass was
(M+H): 511 (Figure 3.13 and 3.14). Further structural characterization is underway.
3.3 Conclusions
The strategy we developed to decipher cryptic pathways by using synthetic
biology offers great applications in the natural product discovery area. The DNA
assembler method we employed can be used to rapidly assemble a target gene cluster
and heterologously express it in a desired organism. Such a shuttle system has great
flexibility as it is not limited by the choice of hosts as long as the corresponding
essential elements required by the host are assembled along with the cluster. In
addition, such system can be used to perform targeted gene deletions, insertions, point
mutations and replacements to facilitate the cluster mechanism studies. For example,
if we want to understand the function of an Orf, the Orf can be removed from the
cluster easily and the rest of the pathway will be assembled as usual.
72
In the pathway engineering experiments to help increase the production of the
cluster, preliminary data obtained from RT-PCR indicates that all the genes are
expressed under certain conditions. By adding different promoters in the gene cluster,
we can detect the increase of gene expression at different levels, which proved that
our method can be used to speed product detection.
In order to further demonstrate the versatility of this approach and address the
remaining questions, we plan to carry out the following experiments:
(i) Product detection. To determine the productivity of different constructs, all
constructs will be grown in a series of media and potential products will be extracted.
The most surprising feature of this cluster is the lack of additional PKS-type genes
associated with it. This absence is particularly intriguing because if the
PKS/NRPS-type enzymes (SGR814) were to function modularly as postulated by Yu
et al. (34), then the enzyme’s polyketide synthase domains KS-AT-DH-KR-ACP
would be responsible to take the malonyl group, carry out its subsequent
decarboxylation, and incorporate into a growing polyketide backbone. Likewise, the
C-A-T-TE NRPS domains of the PKS/NRPS enzymes should activate an ornithine
residue and ligate it to an ACP domain-tethered polyketide before hydrolysis of the
polyketide-peptide product from the enzyme by the TE domain. The net function of
the PKS/NRPS enzyme would then be to elaborate a relatively small portion of the
backbone of final compounds. This analysis led to a prediction that the complete
HSAF cluster and frontalamide cluster would contain the additional PKS enzymes
needed to synthesize the remainder of the polyketide chains (34,36). However, if the
73
PKS/NRPS-type enzymes function in a fashion similar to certain fungal hybrid
PKS-NRPS enzymes, such as those involved in tenellin (40,41) and cyclopiazonic
acid (42,43) biosynthesis, then it is not hard to understand the lack of additional PKS
enzymes in the target cluster to produce the remainder of the polyketide backbone
(43-45). Once we can purify the product from the heterologous expression system, we
can prove that this cluster is complete and no additional PKS type enzyme is needed.
(ii) Mechanism study. New constructs with deletion of genes which encodes
modification proteins will be constructed to illustrate the cluster biosynthetic
mechanism. If the predictions regarding the SGR814 catalysis are correct, then this
enzyme would have a unique ability to catalyze a second round of iterative polyketide
synthesis before condensation with the δ-amino group originating from ornithine. A
Dieckmann cyclization would render the backbone complete, with concomitant
product release. However it remains unclear how the 5,5,6 polycycle of the final
compound is formed. This biosynthetic feature will be solved once we obtain the
SGR814’s polyketide intermediates. Therefore, we could assemble constructs
eliminating the modification enzymes and illustrate its mechanism more clearly. (3) In
vitro reconstitution of the SGR814 enzyme. To better understand the function of this
enzyme, in vitro assay could be carried out to study its mechanism and kinetics.
74
3.4 Materials and methods
3.4.1 Strains and reagents
S. cerevisiae YSG50 (MATα, ade2-1, ade3Δ22, ura3-1, his3-11, 15, trp1-1,
leu2-3, 112 and can1-100) was used as the host for DNA assembly and integration.
S. lividans 66 and S. griseus were obtained from the Agricultural Research Service
Culture Collection (Peoria, IL) and E. coli BL21 (DE3) was obtained from Novagen
(Madison, WI). Plasmids pAE4, E. coli strains WM6026 and WM1788 were
developed by the Metcalf group (University of Illinois, Urbana-Champaign). The
pRS426 and pRS416 plasmid (New England Biolabs, Beverly, MA) were modified by
incorporating the hisG and partial δ sequence (δ2) that flank the multiple cloning site
and serve as the vectors for assembly of various pathways. The resulting pRS426m
and pRS416m were linearized by BamHI. The δ1-hisG-ura3-hisG fragment was cut
from pdδUB33 with BamHI and XhoI. Plasmid pET26b(+) and pET28a(+) were
obtained from Novagen (Madison, WI). Nalidixic acid and IPTG were obtained from
Sigma-Aldrich (St. Louis, MO). ISP2, agar, beef extract, yeast extract, malt extract,
and other reagents required for cell culture were obtained from Difco (Franklin Lakes,
NJ). All restriction endonucleases, as well as T4 DNA ligase and antarctic
phosphatase, were purchased from New England Biolabs (Beverly, MA). Phusion
DNA polymerase was from Finnzymes (Espoo, Finland). FailsafeTM 2x premix
buffer G was purchased from EPICENTRE Biotechnologies (Madison, WI). SYBR
Green PCR master mix was purchased from Applied Biosystems (San Francisco, CA).
The QIAprep Spin Plasmid Mini-prep Kit, QIAquick PCR Purification Kit, QIAquick
75
Gel Extraction Kit, and RNeasy MinElute Cleanup Kit were purchased from Qiagen
(Valencia, CA). The Total RNA Isolation Mini Kit was obtained from Agilent
Technologies (Santa Clara, CA). All primers were synthesized by Integrated DNA
Technologies (Coralville, IA). 5 PRIME MasterMix was purchased from 5 PRIME
Inc. (Gaithersburg, MD). Yeast YPAD medium-containing 1% yeast extract, 2%
peptone and 2% dextrose supplied with 0.01% adenine hemisulphate was used to
grow S. cerevisiae YSG50 strain. Synthetic complete drop-out medium lacking
uracil (SC-Ura) was used to select transformants containing the assembled
biochemical pathways of interest.
3.4.2 DNA manipulation
S. griseus was grown in liquid MYG medium (4 g/L yeast extract, 10 g/L malt
extract and 4 g/L glucose) at 30 °C with constant shaking (250 rpm) for 2 days. The
genomic DNA was isolated from S. griseus using the Wizard Genomic DNA isolation
kit from Promega (Madison, WI). Ten pairs of primers were designed with the
sequences shown in Table 3.2. Pathway fragments 1-SG, 2-SG, 3-SG, 4-SG, 5-SG,
6-SG, and 7-SG were amplified from the genomic DNA of S. griseus. Generally, the
PCRs were carried out in 100 µL reaction mixture consisting of 50 µL FailSafeTM
PCR 2xPreMix G from Epicentre Biotechnologies (Madison, WI), 2.5 pmol of each
primer, 1 µL yeast plasmid or genomic DNA and 2.0 unit Phusion DNA polymerase
for 35 cycles on a PTC-200 Thermal Cycler (MJ Research, Watertown, MA). Each
cycle consisted of 10 s at 98 °C, 30 s at 58 °C and 3 min at 72 °C, with a final
extension of 10 min. The S. cerevisaie helper fragment was amplified from the
76
commercial vector pRS416, while the E. coli helper fragment and the S. lividans
helper fragment were amplified from the Streptomyces-E. coli shuttle vector pAE4.
In order to obtain good assembly efficiency, the overlaps between the pathway
fragments were designed to be 400 bp. We encountered some difficulties in
amplifying the S. cerevisaie helper fragment, and in the end, a sufficient amount of
amplification product was only obtained by using a short pair of primers (Table 3.2).
As a result, the S. cerevisaie helper fragment only overlaps with 7-SG and the E. coli
helper fragment with 40 bp. The S. lividans helper fragment has overlaps of 80 bp
with the adjacent E. coli helper fragment and 1-SG. Following electrophoresis, the
PCR products were individually gel-purified from 1.0% agarose gels using Qiagen
Gel Purification kit. 100 ng of each individual was mixed and precipitated with
ethanol. The resulting DNA pellet was air-dried and resuspended in 4 µL Milli-Q
double deionized water and subsequently electroporated into S. cerevisiae using the
protocol reported elsewhere (26).
3.4.3 Yeast transformation
Yeast transformation was performed by electroporation. YPAD medium (50
mL) was inoculated with a 0.5 mL overnight S. cerevisiae YSG50 culture and shaken
at 30 °C and 250 rpm for 4-5 h until OD600 reached 0.8-1.0. Yeast cells were
harvested by centrifugation at 4 °C and 4000 rpm for 5 min. The supernatant was
discarded and the cell pellet was washed with 50 mL ice-cold Milli-Q
double-deionized water, followed by another wash with cold 1 M sorbitol and finally
resuspended in 250 µL cold sorbitol. An aliquot of 50 µL of yeast cells together
77
with 4 µL DNA mixture was electroporated in a 0.2 cm cuvette at 1.5 kV. The
transformed cells were immediately mixed with 1 mL room temperature YPAD
medium and shaken at 30 °C for 1 h. Following that, the cells were harvested by
centrifugation, washed with room temperature 1 M sorbitol several times to remove
the remaining medium and finally resuspended in 1 mL sorbitol. Aliquots of 30-50
mL were spread on SC-Ura plates for the selection of the transformants, and the plates
were incubated at 30 °C for 2-4 days until colonies appeared. Colonies were
randomly picked to SC-Ura liquid medium and grown for 1 day, after which yeast
plasmid was isolated using Zymoprep II Yeast plasmid Miniprep kit (Zymo Research,
CA), and genomic DNA was prepared using Wizard Genomic DNA isolation kit from
Promega (Madison, WI).
3.4.4 Verification of the assembled gene clusters
Colonies were randomly picked to SC-Ura liquid media and grown for 2 day,
after which yeast plasmid was isolated using Zymoprep II Yeast plasmid Miniprep kit
(Zymo Research, CA). Yeast plasmids were transformed to E. coli strain WM1788
and selected on Luria Broth (LB) agar plates supplemented with 50 µg/ml apramycin.
Colonies were inoculated into 5 ml of LB media supplemented with 50 µg/ml
apramycin, and plasmids were isolated from the liquid culture using the plasmid
miniprep kit from Qiagen. Plasmids isolated from E. coli were then subjected to
restriction digestion by different enzymes, after which the reaction mixtures were
loaded to 0.7% agarose gels to check for the correct restriction digestion pattern by
DNA electrophoresis.
78
3.4.5 Integration of the verified gene clusters onto the S. lividan 66 chromosome
The verified constructs were transformed to E. coli WM602623 and selected on
LB agar plates supplemented with 19 µg/mL 2,6-diaminopimelic acid and 50 µg/mL
apramycin. These transformants were then used as the donors for conjugal transfer
of the assembled plasmids to S. lividans 66 following the ‘‘high-throughput’’ protocol
described previously with the exception that the entire E. coli/S. lividans mixture was
spotted on R2 no-sucrose media plates in 2.5 µL aliquots (46). After incubation at
30 °C for 16-20 hr, plates were flooded with 2 mL of a mixture of 1 mg/mL each
nalidixic acid and Apr and incubated at 30 °C for additional 3-5 days, at which point S.
lividans exconjugants were picked and restreaked on ISP2 plates supplemented with
50 µg/mL nalidixic acid and 50 µg/mL Apr and allowed to grow for several days.
These exconjugants were inoculated into the YMS medium, grown at 30 °C for 3 days
before being spread onto YMS plates.
3.4.6 Total RNA isolation and cDNA synthesis
Total RNA was purified from liquid culture grown in YMS or MM for 3 days
using the Agilent Total RNA Isolation Mini Kit according to the manufacturer’s
instructions (City, state). Approximately 600 μL of cell lysis supernatant was used in
each RNA purification. Total RNA was visualized by agarose gel electrophoresis and
concentration and purity were determined using a NanoDrop (Thermo Fisher
Scientific, Waltham, MA). Purified, concentrated total RNA was used as the template
for cDNA synthesis using Transcriptor II reverse transcriptase enzyme (Invitrogen,
79
Carlsbad, CA) and random hexamer primer (Invitrogen, Carlsbad, CA) according to
the manufacturer’s instructions.
3.4.7 RT-PCR
Experimental procedure for real-time RT-PCR using SYBR Green I was carried
out as described elsewhere (39). Reverse Transcription was carried out with the
SuperScript First-Strand Synthesis System for RT-PCR. Gene specific primers for
genes contained in my cluster were designed by Real-Time PCR tools in IDT-DNA
(https://www.idtdna.com/scitools/Applications/RealTimePCR/). I normalized the
primer concentrations and mixed gene-specific forward and reverse primer pair. The
concentration of each primer in the mixture is 5 pmol/μL. The real-time PCR reaction
mixture was set up in each optical tube. 10 μL SYBR Green Mix (2X), 1 μL cDNA, 2
μL primer mix and 7 μl H2O were mixed gently in the optical well. I seted up the
experiment and the following PCR program on ABI Prism 7900 HT Real-Time
Quantitative PCR machine; 2 min at 50 °C, 10 min at 95 °C for one cycle and then 15
s at 95 °C, 30 s at 60 °C and 30 s at 72 °C for 40 cycles with a final cycle 10 min at
72 °C. After PCR was finished, removed the tubes from the machine and analyzed the
real-time PCR result with the SDS software.
3.4.8 HPLC analysis
Streptomyces lividans contained different constructs was grown for 3 days in 15
mL YMS liquid medium in a 125 mL shake-flask (equipped with 3 mm glass beads to
improve liquid mixing and aeration). Aliquots (200 μL) of this seed culture were
80
spread for confluence over Petri plates containing ~25-30 mL YMS solid agar
medium and allowed to incubate for 3-4 days at 30 °C. To isolate potential compound,
agar and adhering cells were scraped from the plates, pooled, and covered with ethyl
acetate to extract the products. The extract was dried under vacuum, and subsequently
fractionated through a C18 Sepak (1 gram) SPE (solid phase extraction) column by
eluting with a step gradient of 20%, 40%, 60%, 80%, 100% MeOH in water. The 80%
MeOH/water fraction was subjected to further chromatography to purify potential
products after testing individual fractions by LC/MS. Product analysis was conducted
by comparing a series of different HPLC programs. An Agilent 1100 Series HPLC
and ZORBAX SB-C18 reverse-phase column (3.0 x 150 mm, 3.5 μm) (Agilent
Technologies, Palo Alto, CA) was used to monitor at 280 nm and at a flow rate of 0.5
mL/min at 25 °C. A gradient of 10-100% acetonitrile in water with 0.1% formic acid
over 20 min was used and data were collected using a diode array to monitor
absorbance at 280 nm. Mass spectral data were acquired using ESI-MS on an Agilent
6130 Series mass spectrometer, where both positive and negative ions were
monitored.
3.5 References
1. Newman, D.J. and Cragg, G.M. (2007) Natural products as sources of new drugs over the last 25 years. J Nat Prod, 70, 461-477.
2. Newman, D.J., Cragg, G.M. and Snader, K.M. (2000) The influence of natural products upon drug discovery. Natural Product Reports, 17, 215-234.
3. Challis, G.L. (2008) Mining microbial genomes for new natural products and biosynthetic pathways. Microbiology, 154, 1555-1569.
4. Gross, H. (2007) Strategies to unravel the function of orphan biosynthesis pathways: recent examples and future prospects. Appl Microbiol Biotechnol, 75, 267-277.
81
5. Palmu, K., Ishida, K., Mantsala, P., Hertweck, C. and Metsa-Ketela, M. (2007) Artificial reconstruction of two cryptic angucycline antibiotic biosynthetic pathways. Chembiochem, 8, 1577-1584.
6. Zazopoulos, E., Huang, K., Staffa, A., Liu, W., Bachmann, B.O., Nonaka, K., Ahlert, J., Thorson, J.S., Shen, B. and Farnet, C.M. (2003) A genomics-guided approach for discovering and expressing cryptic metabolic pathways. Nat Biotechnol, 21, 187-190.
7. Bentley, S.D., Chater, K.F., Cerdeno-Tarraga, A.M., Challis, G.L., Thomson, N.R., James, K.D., Harris, D.E., Quail, M.A., Kieser, H., Harper, D. et al. (2002) Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature, 417, 141-147.
8. Bode, H.B. and Muller, R. (2006) Analysis of myxobacterial secondary metabolism goes molecular. J Ind Microbiol Biotechnol, 33, 577-588.
9. Goldman, B.S., Nierman, W.C., Kaiser, D., Slater, S.C., Durkin, A.S., Eisen, J.A., Ronning, C.M., Barbazuk, W.B., Blanchard, M., Field, C. et al. (2006) Evolution of sensory complexity recorded in a myxobacterial genome. Proc Natl Acad Sci U S A, 103, 15200-15205.
10. Galagan, J.E., Calvo, S.E., Borkovich, K.A., Selker, E.U., Read, N.D., Jaffe, D., FitzHugh, W., Ma, L.J., Smirnov, S., Purcell, S. et al. (2003) The genome sequence of the filamentous fungus Neurospora crassa. Nature, 422, 859-868.
11. Jones, M.G. (2007) The first filamentous fungal genome sequences: Aspergillus leads the way for essential everyday resources or dusty museum specimens? Microbiology, 153, 1-6.
12. Pel, H.J., de Winde, J.H., Archer, D.B., Dyer, P.S., Hofmann, G., Schaap, P.J., Turner, G., de Vries, R.P., Albang, R., Albermann, K. et al. (2007) Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. Nat Biotechnol, 25, 221-231.
13. Fischbach, M.A. and Walsh, C.T. (2006) Assembly-line enzymology for polyketide and nonribosomal Peptide antibiotics: logic, machinery, and mechanisms. Chem Rev, 106, 3468-3496.
14. McConnell, O.J., Longley, R.E. and Koehn, F.E. (1994) The discovery of marine natural products with therapeutic potential. Biotechnology, 26, 109-174.
15. Baltz, R.H. (2010) Streptomyces and Saccharopolyspora hosts for heterologous expression of secondary metabolite gene clusters. J Ind Microbiol Biotechnol, 37, 759-772.
16. Bull, A.T., Goodfellow, M. and Slater, J.H. (1992) Biodiversity as a Source of Innovation in Biotechnology. Annu Rev Microbiol, 46, 219-252.
17. Dernain, A.L. (2006) From natural products discovery to commercialization: a success story. J Ind Microbiol Biot, 33, 486-495.
18. Leadbetter, J.R. (2003) Cultivation of recalcitrant microbes: cells are alive, well and revealing their secrets in the 21st century laboratory. Curr Opin Microbiol, 6, 274-281.
82
19. Li, C., Roege, K.E. and Kelly, W.L. (2009) Analysis of the indanomycin biosynthetic gene cluster from Streptomyces antibioticus NRRL 8167. Chembiochem, 10, 1064-1072.
20. Wenzel, S.C., Bode, H.B., Kochems, I. and Muller, R. (2008) A type I/type III polyketide synthase hybrid biosynthetic pathway for the structurally unique ansa compound kendomycin. Chembiochem, 9, 2711-2721.
21. Corre, C., Song, L., O'Rourke, S., Chater, K.F. and Challis, G.L. (2008) 2-Alkyl-4-hydroxymethylfuran-3-carboxylic acids, antibiotic production inducers discovered by Streptomyces coelicolor genome mining. Proc Natl Acad Sci U S A, 105, 17510-17515.
22. Woodyer, R.D., Shao, Z., Thomas, P.M., Kelleher, N.L., Blodgett, J.A., Metcalf, W.W., van der Donk, W.A. and Zhao, H. (2006) Heterologous production of fosfomycin and identification of the minimal biosynthetic gene cluster. Chem Biol, 13, 1171-1182.
23. Eliot, A.C., Griffin, B.M., Thomas, P.M., Johannes, T.W., Kelleher, N.L., Zhao, H. and Metcalf, W.W. (2008) Cloning, expression, and biochemical characterization of Streptomyces rubellomurinus genes required for biosynthesis of antimalarial compound FR900098. Chem Biol, 15, 765-770.
24. French, C.E. (2009) Synthetic biology and biomass conversion: a match made in heaven? J R Soc Interface, 6, 547-558.
25. Keasling, J.D. (2008) Synthetic biology for synthetic chemistry. Acs Chem Biol, 3, 64-76.
26. Shao, Z. and Zhao, H. (2009) DNA assembler, an in vivo genetic method for rapid construction of biochemical pathways. Nucleic Acids Res, 37, e16.
27. Gunyuzlu, P.L., Hollis, G.F. and Toyn, J.H. (2001) Plasmid construction by linker-assisted homologous recombination in yeast. Biotechniques, 31, 1246, 1248, 1250.
28. Ma, H., Kunes, S., Schatz, P.J. and Botstein, D. (1987) Plasmid construction by homologous recombination in yeast. Gene, 58, 201-216.
29. Oldenburg, K.R., Vo, K.T., Michaelis, S. and Paddon, C. (1997) Recombination-mediated PCR-directed plasmid construction in vivo in yeast. Nucleic Acids Res, 25, 451-452.
30. Raymond, C.K., Pownder, T.A. and Sexson, S.L. (1999) General method for plasmid construction using homologous recombination. Biotechniques, 26, 134-138.
31. Raymond, C.K., Sims, E.H. and Olson, M.V. (2002) Linker-mediated recombinational subcloning of large DNA fragments using yeast. Genome Res, 12, 190-197.
32. Schaerer-Brodbeck, C. and Barberis, A. (2004) Coupling homologous recombination with growth selection in yeast: a tool for construction of random DNA sequence libraries. Biotechniques, 37, 202-206.
33. Ohnishi, Y., Ishikawa, J., Hara, H., Suzuki, H., Ikenoya, M., Ikeda, H., Yamashita, A., Hattori, M. and Horinouchi, S. (2008) Genome sequence of the
83
streptomycin-producing microorganism Streptomyces griseus IFO 13350. J Bacteriol, 190, 4050-4060.
34. Yu, F., Zaleta-Rivera, K., Zhu, X., Huffman, J., Millet, J.C., Harris, S.D., Yuen, G., Li, X.C. and Du, L. (2007) Structure and biosynthesis of heat-stable antifungal factor (HSAF), a broad-spectrum antimycotic with a novel mode of action. Antimicrob Agents Chemother, 51, 64-72.
35. Lou, L., Qian, G., Xie, Y., Hang, J., Chen, H., Zaleta-Rivera, K., Li, Y., Shen, Y., Dussault, P.H., Liu, F. et al. (2011) Biosynthesis of HSAF, a tetramic acid-containing macrolactam from Lysobacter enzymogenes. J Am Chem Soc, 133, 643-645.
36. Blodgett, J.A., Oh, D.C., Cao, S., Currie, C.R., Kolter, R. and Clardy, J. (2010) Common biosynthetic origins for polycyclic tetramate macrolactams from phylogenetically diverse bacteria. Proc Natl Acad Sci U S A, 107, 11692-11697.
37. Shiina, T., Tanaka, K. and Takahashi, H. (1991) Sequence of hrdB, an essential gene encoding sigma-like transcription factor of Streptomyces coelicolor A3(2): homology to principal sigma factors. Gene, 107, 145-148.
38. Shinkawa, H., Hatada, Y., Okada, M., Kinashi, H. and Nimi, O. (1995) Nucleotide sequence of a principal sigma factor gene (hrdB) of Streptomyces griseus. J Biochem, 118, 494-499.
39. Wang, X. and Seed, B. (2003) A PCR primer bank for quantitative gene expression analysis. Nucleic Acids Res, 31, e154.
40. Halo, L.M., Heneghan, M.N., Yakasai, A.A., Song, Z., Williams, K., Bailey, A.M., Cox, R.J., Lazarus, C.M. and Simpson, T.J. (2008) Late stage oxidations during the biosynthesis of the 2-pyridone tenellin in the entomopathogenic fungus Beauveria bassiana. J Am Chem Soc, 130, 17988-17996.
41. Eley, K.L., Halo, L.M., Song, Z., Powles, H., Cox, R.J., Bailey, A.M., Lazarus, C.M. and Simpson, T.J. (2007) Biosynthesis of the 2-pyridone tenellin in the insect pathogenic fungus Beauveria bassiana. Chembiochem, 8, 289-297.
42. Shinohara, Y., Tokuoka, M. and Koyama, Y. (2011) Functional Analysis of the Cyclopiazonic Acid Biosynthesis Gene Cluster in Aspergillus oryzae RIB 40. Biosci Biotechnol Biochem, 75, 2249-2252.
43. Liu, X. and Walsh, C.T. (2009) Cyclopiazonic acid biosynthesis in Aspergillus sp.: characterization of a reductase-like R* domain in cyclopiazonate synthetase that forms and releases cyclo-acetoacetyl-L-tryptophan. Biochemistry, 48, 8746-8757.
44. Halo, L.M., Marshall, J.W., Yakasai, A.A., Song, Z., Butts, C.P., Crump, M.P., Heneghan, M., Bailey, A.M., Simpson, T.J., Lazarus, C.M. et al. (2008) Authentic heterologous expression of the tenellin iterative polyketide synthase nonribosomal peptide synthetase requires coexpression with an enoyl reductase. Chembiochem, 9, 585-594.
45. Tokuoka, M., Seshime, Y., Fujii, I., Kitamoto, K., Takahashi, T. and Koyama, Y. (2008) Identification of a novel polyketide synthase-nonribosomal peptide
84
synthetase (PKS-NRPS) gene required for the biosynthesis of cyclopiazonic acid in Aspergillus oryzae. Fungal Genet Biol, 45, 1608-1615.
46. Martinez, A., Kolvek, S.J., Yip, C.L., Hopke, J., Brown, K.A., MacNeil, I.A. and Osburne, M.S. (2004) Genetically modified bacterial strains and novel bacterial artificial chromosome shuttle vectors for constructing environmental libraries and detecting heterologous natural products in multiple expression hosts. Appl Environ Microbiol, 70, 2452-2463.
85
O
HN
OH
NH
O
HO
O
H
H
O
HO
Frontalamide AChemical Formula: C29H36N2O7
Exact Mass: 524.25Molecular Weight: 524.61
O
HN
OH
NH
O
HO
O
H
H
O
H
Frontalamide BChemical Formula: C29H36N2O6
Exact Mass: 508.26Molecular Weight: 508.61
O
HN
NH
O
HO
O
H
H
O
HO
H H
FI-1Chemical Formula: C29H36N2O6
Exact Mass: 508.26Molecular Weight: 508.61
O
HN
NH
O
HO
O
H
H
O
HO
FI-2Chemical Formula: C29H38N2O6
Exact Mass: 510.27Molecular Weight: 510.62
O
HN
NH
O
HO
O
H
H
O
H H
H
FI-3Chemical Formula: C29H36N2O5
Exact Mass: 492.26Molecular Weight: 492.61
Figure 3.1 Macrolide compounds as natural products.
86
Figure 3.2 Six different constructs assembled by DNA assembler. A: the original construct
from S griseus and heterologous constructed in S lividans, several upstream and downstream
genes are included to maintain the potential regulation of the gene cluster. B: The ErmE
promoter was inserted in front of the whole gene cluster SGR910-815. C: Three different
promoters in the gap of the gene cluster. D: Gene cluster contains six different promoters in
front of each gene. E: Gene cluster contains six different stronger promoters in front of each
pYL-SGR805-822-136663 bp
SGR805
SGR806
SGR807
SGR808
SGR809
SGR810
SGR811
SGR812
SGR813
SGR814
SGR815
SGR816
SGR817
SGR818
SGR819
SGR820
SGR821
SGR822
PhiC31 attP
int
tL3CEN6
ARS H4
URA2
oriR6K
accIV
oriT nicking site
oriT
pYZ101 805-822 ErmEp SCP42319 bp
SGR805
SGR806
SGR807
SGR808
SGR809
SGR810
SGR811
SGR812
SGR813
SGR814
SGR815
SGR816
SGR817
SGR818
SGR819
SGR820
SGR821
SGR822
CEN6
ErmE
ARS
URA3
EF
YR
oriR6K
accIV
oriT
SCP2a F
SR(oriT)
SCP2b F
SCP2a R
SCP2
PYL-SGR810-815-A.O.v224269 bp
SGR810
SGR811
SGR812
SGR813
SGR814
SGR815
PhiC31 attP
int
tL3
CEN6
ARS H4
URA2
oriR6K
accIV
oriT
oriT nicking site
P6
KRp
rpsLp
pYL-SGR805-822 AO39671 bp
SGR805
SGR806
SGR807
SGR808
SGR809
SGR810
SGR811
SGR812
SGR813SGR814
SGR816
SGR817
SGR818
SGR819
SGR820
SGR821
SGR822
PhiC31 attP
int
tL3
CEN6
ARS H4
URA2
oriR6K
accIV
oriT
oriT nicking site
SGR815
P6
ErmE*p
AspSP
MetSp
rpoBp
ActIp
RBS
RBS
PYL-SGR810-815-A.O.v326343 bp
SGR810
SGR811
SGR812
SGR813
SGR814
SGR815
PhiC31 attP
int
tL3
CEN6
ARS H4
URA2
oriR6K
accIV
oriT
oriT nicking site
GAPDHp
ErmE*p
KRp
rpsLp
CFp
ActIp
RBS
RBSPYL-SGR810-815-A.O.v3 XylE
27312 bp
SGR810
SGR811
SGR812
SGR813
SGR814
SGR815
Linker Protein
PhiC31 attP
int
tL3
CEN6
ARS H4
URA2
oriR6K
accIV
oriT
oriT nicking site
XylE
GAPDHp
ErmE*p
KRp
rpsLp
CFp
ActIp
RBS
RBS
A
C D
E F
B
87
gene, and deleted all the upstream and downstream genes. F: Gene cluster contains six
different stronger promoters in front of each gene, and deleted all the upstream and
downstream genes. Also, a XylE gene was linked to the end of the SGR814 gene.
Figure 3.3 Physical characterization of the recombinant clones through restriction digestion.
Plasmids were digested by NcoI, XmnI and ApalI. The correct clones were marked as the
red stars.
Figure 3.4 Sample preparation work flow
88
#18 Construct L 810 811 812 813 815 L 814(-) 814(+)
Figure 3.5 Reverse-Transcription PCR results. The left figure: on the top of the figure, the
results showed that the genes in #18 constructs were all expressed except 814. The bottom
part is the negative controls. The right figure showed the gene 814 was expressed when
compared with the negative control.
Figure 3.6 One example of the standard curve made for Real-Time-PCR. pYL-#18 genomic
DNA was used as the PCR template, which was diluted to 100,1000, 10000 and 100000
times.
y = ‐3.7985x + 30.828R² = 0.9961
0
5
10
15
20
25
30
0 2 4 6
89
Figure 3.7 Different time points real-time PCR results. 1: #18 120h, 2: #18 72h, 3:#18 96h,
4:E1 120h, 5: E1 72h, 6: E1 96h, 7: SG: 120h, 8:SG 48h, 9:SG 72h, 10: SG 96h
Figure 3.8 Comparison of the gene expression level of different constructs. HrdB was used as
internal control, whose expression level was set as 1. OC: original construct; SG: S. griseus
(native producer); Ep: ErmE promoter added in front of the whole gene cluster; FC:
Expression cassette (different promoters in front of each gene in the cluster).
0
2
4
6
8
10
12
1 2 3 4 5 6 7 8 9 10
90
Figure 3.9 EGFP confocal microscope results identified two strong promoters: gaDHp and
rpsLp from S. griseus.
Figure 3.10 EGFP confocal microscope results for promoters: gaDHp and rpsLp from other
species.
Figure 3.11 Comparison of the gene expression levels of AO v1(#7), AO v2(2-6), and AO
v3 (3-2 and 3-4in both YMS medium and minimal medium.
0
5
10
15
20
25
30
hrdB
810
811
812
813
814
815
2‐6 YMS
3‐4 YMS
7 YMS
0
50
100
150
200
7 MM
3‐2 MM
91
Figure 3.12 Comparison of the same construct’s gene expression levels in both YMS medium
and minimal medium.
Figure 3.13 HPLC analysis of the negative control and the sample.
0
2
4
6
8
10
12
hrdB
810
811
812
813
814
815
3‐2 YMS
3‐2 MM
020406080
100120140160180
#7 AO‐v1YMS
#7 AO‐v1MM
min0 5 10 15 20 25 30
mAU
0
20
40
60
80
100
MWD1 B, Sig=280,16 Ref=360,100 (Z:\YUNZI\11012011000003.D)
3.1
73
3.3
86
3.5
66
3.9
27
5.5
01
5.7
06
12.
854
13.
361
13.
656
13
.88
8 1
4.1
31
14
.653 15.
081
15
.47
5 1
5.6
91 1
5.88
8
16
.541
16.
828
17.
178
17.
359
17.
635
18
.24
4 1
8.4
59 1
8.72
2
19.
272
19
.837
20
.628
21
.07
3
21.
806
22.
119
22.
498
23.
728
24
.01
3
26.
523
29.
828
30.
030
30
.61
4
32
.58
5 33
.21
3
33
.83
9 3
4.1
28
min0 5 10 15 20 25 30
mAU
0
500
1000
1500
2000
2500
MWD1 B, Sig=280,16 Ref=360,100 (Z:\YUNZI\11012011000005.D)
3.1
66 3.3
84 3
.570
3.8
36 3
.973
4.2
30
4.9
22 5
.170 5
.457
5.6
80
6.5
34
7.3
61
7.8
87
8.4
97 8
.837
9.6
09
10.
635
11.
114
12.
926
13.
280
13.
586
13.
802
14.
102
14.
291
14.
531
14
.66
9 1
4.9
14 1
5.04
3
15
.483
15.
702
15.
877
16.
036
16.
550
16.
796
16.
906
17.
270
17
.621
17
.87
2 1
8.05
7
18.
454
18.
752
19.
078
19
.21
9 1
9.60
0 1
9.84
8 2
0.1
24 20
.33
7
20.
755
21.
087
21.
874
22.
320
22.
478
22.
731
22
.94
7 2
3.2
58 2
3.4
55
23.
631
24.
108
24
.684
25
.16
9 2
5.32
3 2
5.64
6 2
5.9
75
26.
500
27
.01
2
27.
606
28.
167
29.
002
29
.78
9
30
.27
9 3
0.58
4
32.
541
33
.17
4
33.
799
34.
087
92
Figure 3.14 MS1 and MS2 data of the identified new potential product peak.
169.0197.1
221.1
240.1 290.2
363.4 394.5 479.3
511.3511.3+MS, 24.2m in #2046
0.0
0.5
1.0
1.5
2.0
2.5
8x10Intens.
100 200 300 400 500 600 m/z
402.4
420.3438.3
458.3
475.4
493.3
+MS2(511.7), 24.2min #2043
0
2
4
6
8
7x10Intens.
100 200 300 400 500 600 m/z
93
Table 3.1 Fragments size and region located on the genomic DNA of S griseus.
Fragments Primers Size Region 1 1-for 5635 2501-8135 1-rev 2 2-for 4479 7716-12194 2-rev 3 3-for 5226 11775-17000 3-rev 4 4-for 5220 16581-21800 4-rev 5 5-for 4810 21381-26190 5-rev 6 6-for 3048 25771-28818 6-rev 7 7-for 4141 28381-32521 7-rev
Yeast fragment Yeast-for 1859 32522-34380 Yeast-rev
E. coli fragment E.coli-for 1430 34381-35810 E. coli-rev
Streptomyces fragment Strep-for 3353 35811-2500 Strep-rev
94
Table 3.2 Primers designed to amplify each gene from the genomic DNA of S griseus.
Amplified fragment Primer sequence (5’->3’)
1-SG gtagaaacagacgaagaagctagctttgcactggattgcg gtgggtcctttcgggtcgac
gaaccgtccacgactggtga
2-SG ttcaacacggtgtacgacta
cctgtcctgtcgccgctcac
3-SG cttccgtacgaggcgggcgt
Tgcagcaggaggtagtacgc
4-SG gaagcggccctcggcctcgt
gcgcgcgaggtgtaggcgac
5-SG cgccgcgtggaaggcggcga
actggctgcgacaggggtga
6-SG gggacccgcgccgcggtgga
gttaccgatcgggtattcat
7-SG ggactgctggtccgacgatc
ccagggcccagatgacgaagcgaaaagtgccacctgggtccttttcatcacgtgctataa
S. cerevisaie helper fragment cgaaaagtgccacctgggtc
gtgagtttagtatacatgca
E. coli helper fragment aaaactgtattataagtaaatgcatgtatactaaactcactgtcatcacgatactgtgat
tttcagcgtgacatcattctgtgggccgtacgctggtactgcaaatacggcatcagttac
S. lividans helper fragment acccattcaaaggccggcattttcagcgtgacatcattctgtgggccgtacgctggtact
tagctttgcactggattgcg gtgggtcctttcgggtcgacggctcgtcgcagtatcgcgg
top related