

Breakthrough Technologies

Codon Optimization to Enhance Expression Yields Insights into Chloroplast Translation 1[OPEN]

Kwang-Chul Kwon, Hui-Ting Chan, Ileana R. León, Rosalind Williams-Carrier, Alice Barkan, and Henry Daniell*

Department of Biochemistry, School of Dental Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6030 (K.-C.K., H.-T.C., H.D.); Global Research, Novo Nordisk, Malov DK-2760, Denmark (I.R.L.); and Institute of Molecular Biology, University of Oregon, Eugene, Oregon 97403-1229 (R.W.-C., A.B.)

ORCID IDs: 0000-0002-4037-1776 (K.-C.K.); 0000-0001-7319-2080 (I.R.L.); 0000-0003-4485-1176 (H.D.).

Codon optimization based on psbA genes from 133 plant species eliminated 105 (human clotting factor VIII heavy chain [FVIII HC]) and 59 (polio VIRAL CAPSID PROTEIN1 [VP1]) rare codons; replacement with only the most highly preferred codons decreased transgene expression (77- to 111-fold) when compared with the codon usage hierarchy of the psbA genes. Targeted proteomic quantification by parallel reaction monitoring analysis showed a 4.9- to 7.1-fold (FVIII) or 22.5- to 28.1-fold (VP1) increase for the codon-optimized genes when normalized with stable isotope-labeled standard peptides (or housekeeping protein peptides), but quantitation using western blots showed a 6.3- to 8-fold or 91- to 125-fold increase of transgene expression from the same batch of materials, due to limitations in quantitative protein transfer, denaturation, solubility, or stability. Parallel reaction monitoring, to our knowledge validated here for the first time for in planta quantitation of biopharmaceuticals, is especially useful for insoluble or multimeric proteins required for oral drug delivery. Northern blots confirmed that the increase of codon-optimized protein synthesis is at the translational level rather than any impact on transcript abundance. Ribosome footprints did not increase proportionately with VP1 translation and even decreased after FVIII codon optimization, but ribosome profiling is useful in diagnosing additional rate-limiting steps. A major ribosome pause at CTC leucine codons in the native gene of FVIII HC was eliminated upon codon optimization. Ribosome stalls observed at clusters of serine codons in the codon-optimized VP1 gene provide an opportunity for further optimization. In addition to increasing our understanding of chloroplast translation, these new tools should help to advance this concept toward human clinical studies.

Heterologous gene expression has facilitated our understanding of DNA replication, recombination, transcription, translation, and protein import in chloroplasts. The expression of precursor proteins via the chloroplast genome demonstrated that cleavage of transit peptides takes place in the stroma and not in the chloroplast envelope (Daniell et al., 1998). Most importantly, the role of nucleus-encoded cytosolic proteins that bind to regulatory sequences and their species specificity were demonstrated using transgenes expressed in chloroplasts (Ruhlman et al., 2010). When the lettuce (Lactuca sativa) psbA regulatory sequence was used to drive transgene expression in tobacco (Nicotiana tabacum) chloroplasts, there was greater than 90% reduction in the accumulation of foreign proteins. This underscores the importance of the species specificity of chloroplast regulatory sequences. Likewise, details of the homologous recombination process and the deletion of mismatched nucleotides were evident using heterologous flanking sequences (Ruhlman et al., 2010). The translation of native polycistrons without the need for processing to monocistrons has been demonstrated (Barkan, 1988; Zoschke and Barkan, 2015), but the similarity of this process using heterologous polycistrons engineered via the chloroplast genome offered even more direct evidence for this process (De Cosa et al., 2001; Quesada-Vargas et al., 2005). The insertion of replication origins into chloroplast vectors offered further insight into minimal sequences required to study this process (Daniell et al., 1990). Therefore, in this study, we use transgenes, chloroplast genome sequences, and cutting-edge tools to understand the process of translation in chloroplasts.

1 This work was supported by the National Institutes of Health (grant nos. R01 HL107904, R01 HL109442, and R01 EY 024564), the Bill and Melinda Gates Foundation (grant no. OPP1031406 to H.D.), and the National Science Foundation (grant no. IOS–1339130 to A.B.).

* Address correspondence to [email protected].

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Henry Daniell ([email protected]).

K.-C.K. organized codon tables, created and characterized transplastomic plants, and interpreted and wrote sections of the article; H.-T.C. created and characterized transplastomic plants and contributed data; I.R.L. performed MS and PRM analyses, interpreted data, and wrote this section of the article; R.W.-C. contributed ribosome profiling data analyses; A.B. interpreted ribosome profiling data and wrote this section of the article; H.D. conceived and designed the project, analyzed and interpreted data, and wrote and revised several sections and versions of the article.

[OPEN] Articles can be viewed without a subscription.
www.plantphysiol.org/cgi/doi/10.1104/pp.16.00981

Plant Physiology®, September 2016, Vol. 172, pp. 62–77, www.plantphysiol.org © 2016 American Society of Plant Biologists. All rights reserved. Downloaded from www.plantphysiol.org on July 14, 2018.

Copyright © 2016 American Society of Plant Biologists. All rights reserved.

Each plant cell contains up to 10,000 copies of the chloroplast genome. Therefore, transgenes inserted into chloroplast genomes are expressed at high levels, up to 70% of total leaf protein (De Cosa et al., 2001; Ruhlman et al., 2010). A wide range of proteins, from very small antimicrobial peptides (Lee et al., 2011) or hormones (Boyhan and Daniell, 2011; Kwon et al., 2013) to very large proteins encoded by bacterial, viral, fungal, animal, and human genes, have been expressed successfully in plant chloroplasts (DeGray et al., 2001; Daniell et al., 2009; Verma et al., 2010; Shenoy et al., 2014; Sherman et al., 2014; Shil et al., 2014). Most importantly, expressed proteins are highly stable when lyophilized plant cells are stored at ambient temperature (Kwon et al., 2013; Lakshmi et al., 2013; Kohli et al., 2014; Jin and Daniell, 2015). For example, orally delivered proinsulin or exendin-4 reduced blood sugar levels to a degree similar to injected proteins (Boyhan and Daniell, 2011; Kwon et al., 2013). Oral delivery of angiotensin and ANGIOTENSIN-CONVERTING ENZYME2 expressed in chloroplasts reversed or prevented pulmonary hypertension by shifting the renin-angiotensin system to its protective axis, resulting in a decrease in fibrosis, improvement in cardiopulmonary structure and function, and restoration of right heart function (Shenoy et al., 2014). Furthermore, ocular inflammation caused by decreased activity of the protective axis of the renin-angiotensin system was improved significantly (Shil et al., 2014). Likewise, oral delivery of myelin basic protein reduced Aβ plaques in advanced mouse and human Alzheimer's brains (Kohli et al., 2014). Delivery of coagulation factors to hemophilic mice induced oral tolerance and suppressed inhibitor formation and anaphylaxis (Verma et al., 2010; Sherman et al., 2014; Wang et al., 2015a). The aforementioned examples illustrate the significance of this novel, cost-effective protein drug-delivery concept.

However, a major limitation in the clinical translation of human therapeutic proteins in chloroplasts is their low-level expression. Prokaryotic or shorter human genes are highly expressed in chloroplasts (De Cosa et al., 2001; Arlen et al., 2007; Daniell et al., 2009; Ruhlman et al., 2010). However, expression of larger human proteins is a major challenge. For example, cholera nontoxic B subunit (CNTB)-fused native human blood-clotting factor VIII heavy chain (FVIII HC; 86.4 kD) or ANGIOTENSIN-CONVERTING ENZYME2 (92.5 kD) were expressed at very low levels (Shenoy et al., 2014; Sherman et al., 2014). Likewise, the expression of viral vaccine antigens is quite unpredictable, with high, moderate, or extremely low expression levels (Birch-Machin et al., 2004; Lenzi et al., 2008; Waheed et al., 2011a, 2011b; Inka Borchers et al., 2012; Hassan et al., 2014). Furthermore, viral antigens are highly unstable, with expression observed in youngest leaves but not in mature leaves (McCabe et al., 2008). It is well known that high doses of vaccine antigens stimulate high-level immunity and confer greater protection against pathogens; therefore, higher level expression in chloroplasts is a key requirement for vaccine development (Chan and Daniell, 2015; Chan et al., 2016).

Such challenges in transgene expression have been addressed by the use of optimal regulatory sequences (promoters and 5ʹ and 3ʹ untranslated regions [UTRs]), especially species-specific endogenous elements (Ruhlman et al., 2010). In vitro assays of inserted genes with several synonymous codons show that translation efficiency does not always correlate with codon usage in plastid mRNAs (Nakamura and Sugiura, 2007), but codon usage frequencies have nevertheless been used in several codon optimization studies (Lutz et al., 2001; Ye et al., 2001; Franklin et al., 2002; Lenzi et al., 2008; Jabeen et al., 2010; Madesis et al., 2010; Gisby et al., 2011; Wang et al., 2015b; Boehm et al., 2016; Nakamura et al., 2016). While some studies achieved significant increases in expression (75- to 80-fold) after codon optimization (Franklin et al., 2002; Gisby et al., 2011), other studies observed negligible enhancement (Ye et al., 2001; Lenzi et al., 2008; Daniell et al., 2009; Wang et al., 2015b; Nakamura et al., 2016). However, translation initiation and the elongation efficiency of codon-optimized sequences were enhanced when chloroplast gene N-terminal sequences were inserted downstream of 5ʹ UTRs (Ye et al., 2001; Lenzi et al., 2008). In a recent study (Nakamura et al., 2016), the importance of compatibility between the psbA 5ʹ UTR and its 5ʹ coding sequence was shown using codon-optimized heterologous genes. The aforementioned codon optimization studies used only smaller eukaryotic coding sequences (less than 30 kD), but there is a great need to express larger human genes (e.g. FVIII; greater than 200 kD) that would require not only the optimization of codons but also compatibility with regulatory sequences for optimal translation initiation and elongation, and a greater understanding of tRNAs encoded by the chloroplast genome or imported from the cytosol. However, no systematic study has been done to utilize the extensive knowledge gathered by sequencing several hundred chloroplast genomes (Daniell et al., 2016a) to understand codon usage and codon frequency in highly expressed chloroplast genes.

Another major challenge is the lack of reliable methods to quantify insoluble proteins; the standard method (ELISA) cannot be used because of the aggregation or formation of multimeric structures that are required for oral drug delivery. Although the FDA accepts ELISA for the quantitation of purified protein drugs, it is not suitable for quantifying protein drugs in impure extracts, because of cross-reacting proteins or autoantibodies (Kim and You, 2013), or for quantifying insoluble, multimeric, or membrane proteins. Similarly, immunoblots used for quantitation also have several limitations (i.e. aggregation of proteins at high protein concentrations trapped in wells, alteration of mobility by incomplete solubilization or secondary structures, saturation of antibody-binding sites, inefficient transfer of large proteins to membranes, and variable quantitation due to short or long exposure to films). However, peptide-centric quantitation strategies (e.g. targeted mass spectrometry quantitation by parallel reaction monitoring [PRM]) can overcome most of the limitations mentioned above. In the preparation of protein samples for PRM, strong denaturing and reducing conditions are used (e.g. higher concentrations of SDS and DTT) in combination with optimal enzymatic proteolysis conditions (e.g. sodium deoxycholate; León et al., 2013), which is especially suitable for insoluble, multimeric, or membrane proteins (Savas et al., 2011). Moreover, PRM can be used for relative and absolute protein quantitation of target proteins present in highly complex protein backgrounds based on its high specificity and sensitivity (Domon and Aebersold, 2010; Gallien et al., 2012; Picotti and Aebersold, 2012). In addition, PRM offers high specificity and multiplexing characteristics, which allow for specific monitoring of up to several hundred peptides in a single analysis (Gallien et al., 2012). Determination of protein drug dose in planta, especially of insoluble proteins without purification, is an unexplored area of research, and to our knowledge, we investigate this concept for the first time to quantify recombinant protein drugs made in chloroplasts.
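Absolute quantitation by targeted mass spectrometry generally rests on isotope dilution: a known amount of a stable isotope-labeled (heavy) peptide is spiked into the digest, and the endogenous (light) peptide is quantified from the light-to-heavy peak-area ratio. The Python sketch below illustrates only that arithmetic. The function names, peak areas, and spike amount are hypothetical and are not taken from this study.

```python
from statistics import mean


def light_to_heavy_ratio(light_areas, heavy_areas):
    """Sum fragment-ion peak areas per channel and return light/heavy."""
    heavy_total = sum(heavy_areas)
    if heavy_total <= 0:
        raise ValueError("heavy standard peptide was not detected")
    return sum(light_areas) / heavy_total


def peptide_amount_fmol(light_areas, heavy_areas, spiked_heavy_fmol):
    """Endogenous peptide amount = (light / heavy) * spiked heavy amount."""
    return light_to_heavy_ratio(light_areas, heavy_areas) * spiked_heavy_fmol


def fold_change(optimized_fmol, native_fmol):
    """Ratio of mean peptide amounts across replicate injections."""
    return mean(optimized_fmol) / mean(native_fmol)


if __name__ == "__main__":
    # Hypothetical fragment-ion areas for one peptide in two samples,
    # each spiked with 50 fmol of the heavy standard.
    native = peptide_amount_fmol([1.0e5, 0.8e5], [4.0e5, 3.2e5], 50.0)
    optimized = peptide_amount_fmol([6.0e5, 4.8e5], [4.0e5, 3.2e5], 50.0)
    print(native, optimized, fold_change([optimized], [native]))
```

Because the heavy and light forms of a peptide co-elute and fragment alike, losses during digestion and injection affect both channels equally, which is why the ratio is less sensitive to solubility and transfer artifacts than a blot signal.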

This study explores heterologous gene expression utilizing chloroplast genome sequences, ribosome profiling, and targeted mass spectrometry (PRM) to enhance our understanding of the translation of foreign genes in chloroplasts. We developed a codon optimizer program based on the analysis of psbA genes from 133 plant species to compare the translational efficiencies of native and codon-optimized genes driven by identical regulatory sequences. PRM using peptides selected from the N or C terminus was used to study the complete or incomplete synthesis of proteins and to validate this approach for quantifying the dosage of protein drugs made in plant cells when compared with current methods. The codon optimizer program was evaluated in chloroplasts from two different species to identify any species specificity. Ribosome profiling was evaluated for its suitability to diagnose limiting steps in transgene expression. These observations provide new insight into limitations in the translation of heterologous genes and approaches to address them in future studies.

RESULTS

Codon Optimization of Human/Viral Transgenes

Differences in codon usage by chloroplasts frequently decrease translation. We observed that plants expressing native sequences of FVIII HC or VIRAL CAPSID PROTEIN1 (VP1) from polio virus showed very low levels of expression, less than 0.05% for FVIII and approximately 0.1% for VP1 (see below). The psbA gene is among the most highly expressed genes in chloroplasts, and the translation efficiency of the psbA gene is greater than 200 times higher than that of the rbcL gene (Eibl et al., 1999). The 5ʹ UTR of psbA also showed the highest translation activity in vitro among 11 5ʹ UTRs investigated (Yukawa et al., 2007). Accordingly, of 140 transgenes expressed in chloroplasts, more than 75% use the psbA regulatory sequences (Jin and Daniell, 2015; Daniell et al., 2016a, 2016b). Moreover, compatibility between the 5ʹ UTR of psbA and its coding region is important for efficient translation initiation (Nakamura et al., 2016). For these reasons, a new codon optimization program was developed using codon usage of the psbA genes from 133 sequenced chloroplast genomes (Fig. 1A). We first investigated the expression of synthetic genes using only the most highly preferred codon for each amino acid, which is referred to as the old algorithm in this study. When this resulted in even lower levels of expression than the native gene (see below), a new codon optimizer algorithm was developed using the codon usage hierarchy observed among sequenced psbA genes. Therefore, most of the rare codons in heterologous genes were modified based on codons with greater than 5% frequency of use in the psbA genes. Synonymous codons for each amino acid were ranked according to their frequency of use (Fig. 1B).
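The two strategies described above can be sketched in Python (the authors' optimizer was written in Java; this is not their code). The sketch tallies codon usage from a set of reference coding sequences, expresses each codon as a percentage of use among its synonyms, and then either replaces every codon with the single most preferred one (the old strategy) or replaces only codons below a 5% usage cutoff. How the published algorithm chooses among the acceptable synonyms is not described in this section, so picking the most frequent synonym is an assumption, as are all function names and the toy inputs. The extra rule for amino acids with six synonymous codons is omitted.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete codons (trailing bases dropped)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def usage_percent(reference_seqs):
    """Percentage of use of each codon among its synonymous codons."""
    counts = Counter(c for s in reference_seqs for c in codons(s))
    by_aa = defaultdict(dict)
    for codon, aa in GENETIC_CODE.items():
        by_aa[aa][codon] = counts[codon]
    table = {}
    for synonyms in by_aa.values():
        total = sum(synonyms.values())
        for codon, n in synonyms.items():
            table[codon] = 100.0 * n / total if total else 0.0
    return table


def best_codon(aa, table):
    """Most frequently used codon for an amino acid (assumed replacement)."""
    return max((c for c, a in GENETIC_CODE.items() if a == aa),
               key=lambda c: table[c])


def optimize(seq, table, cutoff=5.0, replace_all=False):
    """Replace rare codons (or all codons if replace_all) with the best synonym."""
    out = []
    for codon in codons(seq):
        aa = GENETIC_CODE.get(codon)
        if aa is None:  # ambiguous base: leave untouched
            out.append(codon)
            continue
        best = best_codon(aa, table)
        if table[best] > 0.0 and (replace_all or table[codon] < cutoff):
            out.append(best)
        else:
            out.append(codon)
    return "".join(out)
```

With a reference set in which CTC accounts for 2% of leucine codons, `optimize("CTCCTTTTA", table)` changes only the CTC codon, whereas `replace_all=True` collapses all three leucine codons onto the single most preferred one.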

In this study, native sequences for FVIII HC (2,262 bp) and VP1 (906 bp) were codon optimized using the old or new algorithm and synthesized. After codon optimization, the AT content of FVIII HC increased slightly, from 56% to 62%, and 406 codons out of 754 amino acids were optimized. For the VP1 sequence from the Sabin 1 polio virus strain, the 906-bp-long native sequence was codon optimized, which slightly increased the AT content from 52% to 59%, and 187 codons out of 302 amino acids were optimized. However, the CNTB coding sequence was not codon optimized because of its prokaryotic origin and high AT content (65.4%). Most importantly, the expression level of CNTB (native sequence) fused with proinsulin reached up to 72% of total leaf protein in tobacco chloroplasts (Ruhlman et al., 2010) and 53% of total leaf protein in lettuce chloroplasts (Boyhan and Daniell, 2011), indicating that there is no limitation on translation of the CNTB coding sequence in chloroplasts. All sequences, including native and codon-optimized synthetic genes (new and old algorithms), are shown in Supplemental Figure S1; rare codons in native genes are shown in red and modified codons are highlighted in yellow in Supplemental Figure S2.
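The two summary statistics reported here, AT content and the number of codons changed, are simple to compute for any pair of native and optimized sequences. A minimal Python sketch follows, with hypothetical function names and toy inputs rather than the actual FVIII HC or VP1 sequences.

```python
def at_content(seq):
    """Percentage of A and T bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def changed_codons(native, optimized):
    """Count codon positions that differ between two in-frame sequences."""
    if len(native) != len(optimized) or len(native) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    native, optimized = native.upper(), optimized.upper()
    return sum(native[i:i + 3] != optimized[i:i + 3]
               for i in range(0, len(native), 3))
```

The codon totals quoted in the text are consistent with the sequence lengths: 2,262 bp / 3 = 754 codons for FVIII HC and 906 bp / 3 = 302 codons for VP1.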

When the psbA-based codon table is compared with total chloroplast codon usage tables, which are generated from all chloroplast genes of lettuce (57,528 codons from 189 coding sequences) or tobacco (34,756 codons from 137 coding sequences; Nakamura et al., 2000), there is no significant difference in the AT content of coding sequences: it varies between 59.59% and 61.76%. However, there are striking differences between psbA-based and total chloroplast gene-based codon tables when individual codons are compared. Native FVIII HC used the CTC Leu codon 11 times, but codon-optimized (new algorithm) HC eliminated all CTC codons. However, if the total chloroplast codon table is used, codon-optimized HC would still use five CTC codons. As seen in ribosome profiles, discussed below, the tandem repeat CTC-CTC in the native FVIII HC sequence resulted in major stalling sites that were completely eliminated by psbA-based codon optimization (new algorithm). Likewise, another rare codon, TCA (Ser), is used 16 times in the FVIII HC and seven times in the VP1 coding sequence. The TCA rare codon was eliminated completely in both genes after codon optimization using the new algorithm, whereas, if the total codon table is used for codon optimization, FVIII HC and VP1 would still contain 12 and five TCA codons, respectively. Collectively, the new codon optimization algorithm eliminated 105 and 59 rare codons from FVIII HC and VP1, respectively, resulting in enhanced expression of both genes. However, if the total codon table is used, there would be 75 and 35 rare codons in codon-optimized FVIII HC and VP1 coding sequences, respectively. All 13 codons (GCG [Ala], GGG [Gly], CTG [Leu], CTC [Leu], CCG [Pro], CCC [Pro], AGG [Arg], CGG [Arg], TCA [Ser], TCG [Ser], ACG [Thr], GTC [Val], and GTG [Val]) rarely used in the psbA gene were eliminated using our codon-optimized table (new algorithm). More detailed information on the codon distribution between different codon tables is included in Supplemental Figure S3.

Figure 1. Development of a codon optimization algorithm for the expression of heterologous genes in plant chloroplasts. A, Process to develop the codon optimization algorithm. Sequence data of psbA genes from 133 plant species collected from the National Center for Biotechnology Information, and their codon preferences, were analyzed. Finally, the codon optimizer was developed using Java. B, Codon preference table. Codon preference is indicated by the percentage of use for each amino acid. Black and underlined codons indicate codons that were not used when optimizing sequences due to their low usage frequency among synonymous codons (less than 5% use or, for amino acids with six synonymous codons [Leu, Ser, and Arg], the two codons used least frequently).

Synthetic gene cassettes were inserted into the chloroplast transformation vector, pLSLF for lettuce or pLD-utr for tobacco (Fig. 2A). Native and synthetic genes were fused to the native CNTB sequence, which is used for efficient transmucosal delivery of fused proteins via monosialotetrahexosylganglioside receptors present on intestinal epithelial cells. To eliminate possible steric hindrance caused by the fusion of two proteins and facilitate the release of tethered proteins into the circulation after internalization, nucleotide sequences for a hinge (Gly-Pro-Gly-Pro) and a furin cleavage site (Arg-Arg-Lys-Arg) were engineered between CNTB and fused proteins. Fusion genes were placed under the control of identical psbA promoter, 5ʹ UTR, and 3ʹ UTR regulatory sequences for specific evaluation of codon optimization (Fig. 2A). To select transformants, the aminoglycoside-3ʺ-adenylyltransferase gene was driven by the rRNA promoter to confer resistance to spectinomycin in transformed cells. Expression cassettes were flanked by sequences for isoleucyl-tRNA synthetase and alanyl-tRNA synthetase, which are identical to endogenous chloroplast genome sequences, leading to efficient double homologous recombination and optimal processing of introns with flanking sequences (Fig. 2A).

Transformation vectors containing the native or synthetic sequences for FVIII HC and VP1 were used to create transplastomic lettuce or tobacco plants. To confirm homoplasmy, Southern-blot analysis was performed on four independent lettuce and tobacco lines expressing native or codon-optimized FVIII HC and VP1. For lettuce plants expressing either native or codon-optimized CNTB-FVIII HC, chloroplast genomic DNA was digested with HindIII and probed with a digoxigenin (DIG)-labeled probe spanning the flanking region. All selected lines showed the expected distinct hybridizing fragments and no untransformed fragment (Fig. 2, B and C). The homoplasmic tobacco lines expressing native or codon-optimized CNTB-VP1 sequences were already confirmed in a previous study (Chan et al., 2016). These data confirm the homoplasmy of all transplastomic lines; therefore, transgene expression levels should be attributed to translation efficiency and not transgene copy number.

Translation Efficiency of Native and Codon-Optimized Genes in Lettuce and Tobacco Chloroplasts

Figure 2. Construction of chloroplast vectors using native or codon-optimized genes, and evaluation of homoplasmy and transgene expression. A, Lettuce or tobacco chloroplast vector maps. aadA, Aminoglycoside 3ʹ-adenylyltransferase gene; CNTB, coding sequence of cholera nontoxic B subunit; FVIII HC, factor 8 heavy chain native (N) or codon optimized (CN) using the new algorithm; PpsbA, promoter and 5ʹ UTR of the psbA gene; Prrn, rRNA operon promoter; SB-P, BamHI fragment; TpsbA, 3ʹ UTR of the psbA gene; trnA, alanyl-tRNA; trnI, isoleucyl-tRNA. B and C, Southern-blot analysis of homoplasmic lines. Total genomic DNA (3 µg) from untransformed (UT), native (N), or codon-optimized CNTB-FVIII HC (new algorithm; CN) was digested with HindIII and separated on a 0.8% agarose gel, blotted onto a Nytran membrane, and probed with a BamHI fragment. Lanes 1 to 4 show four independent transplastomic lines. L.s., Lactuca sativa. D, Comparison of the expression level of CNTB-VP1 between transplastomic lines expressing the native (N) or codon-optimized genes using the old (CO) or new (CN) algorithm. Total extracted proteins were loaded at the indicated protein concentrations and were probed with anti-CNTB antibody. CNTB, Standard protein of cholera nontoxic B subunit; IDV, integrated density values; N.t., Nicotiana tabacum.

Expression levels of native and codon-optimized genes in chloroplasts were compared using immunoblot and densitometry assays. Early studies in this project compared the translation efficiency of the old algorithm (using only the most preferred codons) with that of the new algorithm (using the psbA codon hierarchy), quantified by integrated density values of western blots (Fig. 2D). The CNTB-VP1 expression level in transplastomic plants using the old algorithm for codon optimization was 2.7- to 3.1-fold lower than that of the native VP1 viral gene sequence, whereas VP1 expression was 77- to 111-fold higher using the new algorithm (Fig. 2D). Therefore, the new algorithm of the codon optimizer program was used in all subsequent studies. In order to correct for overexposure or underexposure of western blots to x-ray film, data on variable exposures were collected. In order to account for extreme variation in the expression levels of native and codon-optimized genes, serial dilutions of extracted proteins were loaded on each blot (Fig. 3, A and B; Supplemental Fig. S4). In a densitometry assay of lettuce expressing native and codon-optimized CNTB-FVIII HC, which also was used for PRM, the concentration of FVIII HC from the codon-optimized gene (108.8–137.5 µg g⁻¹ dry weight) was 6.3- to 8-fold higher than that of the native FVIII HC gene (16.9–17.4 µg g⁻¹ dry weight; Supplemental Fig. S4). For tobacco plants expressing CNTB-VP1, the batch used for PRM mass spectrometry showed a 91- to 125-fold difference between codon-optimized (11.3–18.1 µg mg⁻¹) and native sequence (0.12–0.15 µg mg⁻¹; Fig. 3C; Supplemental Fig. S4). Based on these data, codon-optimized sequences obtained from our newly developed codon optimizer program improved the translation of transgenes to different levels, depending on the coding sequence.

Figure 3. Quantitation of native or codon-optimized CNTB-FVIII HC or CNTB-VP1 gene expression using western blots. Extracted leaf proteins were resolved on gradient (4%–20%) SDS-PAGE and probed with anti-CNTB antibody (1:10,000). For a loading control, the same membranes were stripped and reprobed with anti-RbcL antibody (1:5,000). A, Lettuce leaf protein extracts (5 or 10 µg) expressing CNTB-FVIII HC or untransformed. For loading controls, Ponceau S staining of the membrane prior to western blotting or the blot reprobed for the large subunit of Rubisco (RbcL) is provided. B, Serial dilution of the native (5–20 µg) or codon-optimized (1–4 µg) CNTB-FVIII HC lettuce leaf extracts. C, Serial dilution of the native (2–8 µg) or codon-optimized (0.1–0.4 µg) CNTB-VP1 tobacco leaf extracts. CO or CN, Codon optimized with old algorithm (CO) or new algorithm (CN); L.s., Lactuca sativa; N, native sequence; N.t., Nicotiana tabacum; UT, untransformed wild type.

To investigate the impact of codon optimization on transcript stability, northern blotting was performed using a probe for the psbA 5ʹ sequence (Fig. 4). Although loading controls show equal amounts of total RNA in each lane based on ethidium bromide staining, higher or lower levels of the endogenous psbA transcript are observed among samples, suggesting subtle changes in RNA loading. The mRNA levels of codon-optimized or native sequences for CNTB-FVIII HC and CNTB-VP1 were normalized to endogenous psbA transcripts using densitometry, and the normalized ratios in each sample were compared. Northern blots indicated that the increase of codon-optimized CNTB-FVIII HC and CNTB-VP1 accumulation is at the translational level rather than at the level of RNA transcript accumulation. Several previous studies on the expression of foreign genes have shown a lack of variation or modest increases in transcript abundance but significant variation in translation efficiency (Franklin et al., 2002; Gisby et al., 2011; Nakamura et al., 2016). Franklin et al. (2002) reported a lack of variation in transcript abundance for GFP expression in Chlamydomonas reinhardtii chloroplasts despite an 80-fold increase in GFP protein accumulation of the codon-optimized sequence. Even though there was a 3-fold increase in mRNA levels of codon-optimized TGF-β3 when compared with the native sequence (Gisby et al., 2011), the greater part of the 75-fold increase in synthetic TGF-β sequence was attributed to enhanced translation. A recent study also showed that the compatibility of the 5ʹ UTR and its coding sequence increased the translation efficiency of codon-optimized sequences rather than mRNA abundance (Nakamura et al., 2016).

Absolute Quantitation by PRM Analysis

Expression levels of codon-optimized and native gene sequences also were quantified using PRM mass spectrometry (Fig. 5). To select the optimal proteotypic peptides for PRM analysis of the CNTB and FVIII HC sequences, we first performed a standard tandem mass spectrometry analysis (data not shown) of a tryptic digest of lettuce plants expressing CNTB-FVIII HC to choose specific peptides. The expression of codon-optimized FVIII HC was 5.4- or 5.8-fold higher than that of the native sequence when the fold changes were normalized based on the housekeeping protein peptides or stable isotope-labeled standard (SIS) peptides (Figs. 5 and 6A). Peptides chosen from CNTB showed minor variations in fold change depending on peptide location and on normalization with SIS or housekeeping protein peptides from Rubisco (small or large subunits) or ATP synthase subunit β: 4.9 (or 4.5; IAYLTEAK), 5.2 (or 4.8; IFSYTESLAGK), or 6.6 (or 6.1; LCVWNNK). Peptides chosen from FVIII HC also showed minor variations: 5.4 (or 5; FDDDNSPSFIQIR), 5.7 (or 5.2; YYSSFVNMER), or 7.1 (or 6.6; WTVTVEDGPTK; Fig. 6A). The locations of these selected peptides within CNTB-FVIII HC are shown in Supplemental Figure S5. For more details, see the raw data included in Supplemental Data Set S1.

The expression of codon-optimized CNTB-VP1 was 25.9- or 26.1-fold higher than that of the native sequence when their fold changes were normalized based on the SIS peptides or housekeeping protein peptides (Figs. 5 and 6B). Peptides chosen from CNTB showed minimal variations in fold changes based on their locations: 22.5 (or 22.5; LCVWNNK) to 26.1 (or 26; IAYLTEAK) to 28.1 (or 28; IFSYTESLAGK; Fig. 6B). The linearity of the quantification range also was investigated by spiking SIS peptides into a constant amount of plant digest (1:1:1:1 mix of all four types of plant materials) in a dynamic range covering 220 amol to 170 fmol (values equivalent on column per injection). These results are reported in detail in Supplemental Figure S7. For all six peptides, we observed an r² value over 0.98.

Figure 4. Northern analysis of transplastomic lines. Transgene transcripts of CNTB-FVIII HC (A) or CNTB-VP1 (B) were probed with 200 bp of lettuce psbA 5ʹ UTR (for FVIII HC) or tobacco psbA 5ʹ UTR (for VP1) regulatory sequences. Bottom and top arrowheads represent the endogenous psbA gene and CNTB-FVIII or CNTB-VP1 transgene, respectively. Ethidium bromide (EtBr)-stained gels are included for the evaluation of equal loading. CN, Codon-optimized sequence using the new algorithm; N, native sequence; UT, untransformed wild type.

Absolute quantitation can be achieved by spiking a known amount of the counterpart SIS peptide into samples. Each counterpart SIS peptide (34 fmol) was injected on column mixed with protein digest (equivalent to protein extracted from 33.3 mg of lyophilized leaf powder). By calculating ratios of area under the curve of SIS and endogenous peptides, we estimated the endogenous peptide molarity, expressed as femtomoles on column (Fig. 6). The mean of all calculated ratios of femtomoles on column (six and three peptides for CNTB-FVIII HC and CNTB-VP1, respectively) for codon-optimized and native sequences is reported as the fold increase of protein expression in codon-optimized constructs. The high reproducibility of the sample preparation and PRM analysis is shown in Figure 5. All peptide measurements were the result of four technical replicates: two sample preparation replicates (from leaf powder to extraction to protein digestion), each analyzed in two mass spectrometry technical replicates. Coefficients of variation among the four measurements per peptide ranged from 0.5% to 10% in all but two cases, where they were 17% and 22%.
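The SIS-based calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the function names and the peak areas are hypothetical, and only the 34 fmol spike amount and the use of a mean of per-peptide ratios come from the text.

```python
from statistics import mean, stdev

SIS_FMOL_ON_COLUMN = 34.0  # spiked amount per SIS peptide, as stated in the text


def endogenous_fmol(area_endogenous, area_sis, sis_fmol=SIS_FMOL_ON_COLUMN):
    """Estimate endogenous peptide amount (fmol on column) from the
    endogenous/SIS peak-area ratio and the known SIS spike."""
    return area_endogenous / area_sis * sis_fmol


def cv_percent(values):
    """Coefficient of variation (sample SD / mean) in percent."""
    return stdev(values) / mean(values) * 100.0


def fold_increase(optimized_fmol, native_fmol):
    """Mean of per-peptide optimized/native ratios (dicts keyed by peptide)."""
    return mean(optimized_fmol[p] / native_fmol[p] for p in native_fmol)


# Hypothetical (endogenous area, SIS area) pairs for one peptide, four measurements.
native_areas = [(1.00e5, 2.0e6), (1.05e5, 2.0e6), (0.95e5, 2.0e6), (1.00e5, 2.0e6)]
native = [endogenous_fmol(e, s) for e, s in native_areas]
print(f"native: {mean(native):.2f} fmol on column, CV {cv_percent(native):.1f}%")
```

Because the SIS peptide is spiked at the same amount in every sample, the area ratio carries the whole calibration, which is why the fold increase can be taken directly from the per-peptide femtomole estimates.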

Ribosome Profiling Studies

Ribosome profiling uses deep sequencing to map ribosome footprints, mRNA fragments that are protected by ribosomes from exogenous nuclease attack. The method provides a genome-wide, high-resolution, and quantitative snapshot of mRNA segments occupied by ribosomes in vivo (Ingolia et al., 2009). Total ribosome footprint abundance within an open reading frame can provide an estimate of translational output, and positions at which ribosomes slow or stall are marked by regions of particularly high ribosome occupancy.

Figure 5. PRM mass spectrometry analysis of CNTB-FVIII and CNTB-VP1 proteins at N- to C-terminal protein sequences. The y axis shows molarity (fmol on column) of peptides from CNTB-FVIII HC or CNTB-VP1 in codon-optimized or native genes. CNTB: peptide 1, IFSYTESLAGK; peptide 2, IAYLTEAK; peptide 3, LCVWNNK. FVIII: peptide 4, FDDDNSPSFIQIR; peptide 5, WTVTVEDGPTK; peptide 6, YYSSFVNMER. The median of four technical replicates is presented for each sample. Circles represent native sequences, and squares represent codon-optimized (c.o.) sequences using the new algorithm. CV, Coefficient of variation.

To examine how codon optimization influenced ribosome behavior, we profiled ribosomes from plants expressing the native and codon-optimized CNTB-FVIII HC and CNTB-VP1 transgenes. Figure 7 shows the abundance of ribosome footprints as a function of position in each transgene; footprint coverage on the endogenous chloroplast psbA and rbcL genes is shown as a means to normalize the transgene data between the optimized and native constructs. Ribosome footprint coverage was much higher in the codon-optimized VP1 sample than in the native VP1 sample (Fig. 7A). However, the magnitude of this increase varies depending upon how the data are normalized (Fig. 7C): the increase is 5-, 16-, or 1.5-fold when normalized to total chloroplast ribosome footprints, psbA ribosome footprints, or rbcL ribosome footprints, respectively. These numbers are considerably lower than the 22.5- to 28.1-fold increase in VP1 protein abundance inferred from the quantitative mass spectrometry data. The topography of ribosome profiles is generally highly reproducible among biological replicates (see rbcL and psbA in Fig. 7B), which are at the same developmental stage and grown under the same conditions. In that context, it is noteworthy that the peaks and valleys in the endogenous psbA and rbcL genes are quite different in the native and optimized tobacco VP1 lines. It could be envisaged that competition with the endogenous psbA 5ʹ UTR could, in principle, reduce the translation of the endogenous psbA open reading frame. However, no such competition was observed for the lettuce construct. In addition, the degree of competition would depend on the abundance of the transgene mRNA. The abundance of the transgene mRNA was similar in the native and codon-optimized constructs, so competition via the psbA 5ʹ UTR is unlikely to contribute to differences in psbA ribosome occupancy in these lines. Many of the large peaks (presumed ribosome pauses) observed in these endogenous genes, specifically in the native VP1 line, map to paired Ala codons (asterisks in Fig. 7A). This suggests a limitation of Ala tRNA specifically in the native VP1 line. Although the basis for this is unclear, it is conceivable that it has to do with minor differences in the age of the plants used for the analyses (2.5 versus 2 months). It is also conceivable that introduction of the transgene had an unanticipated effect on the expression of the nearby gene encoding Ala tRNA. In the same vein, ribosome pause sites in the CNTB region would be expected, but the sites of the native and optimized VP1 constructs were not similar. This global difference in ribosome behavior at Ala codons may well contribute to differential transgene expression in the native and codon-optimized lines.
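The sensitivity of the footprint fold change to the choice of normalizer can be illustrated with a short calculation. The sketch below is only an illustration: the function name and all footprint counts are hypothetical and are not taken from Figure 7C. It shows how the same transgene counts give different fold changes when the reference itself differs between the two lines.

```python
def normalized_fold_change(transgene_opt, transgene_nat, ref_opt, ref_nat):
    """Fold change of transgene footprints after dividing each sample's
    transgene count by that sample's reference count."""
    return (transgene_opt / ref_opt) / (transgene_nat / ref_nat)


# Hypothetical footprint counts per sample (not data from the study).
counts = {
    "native":    {"transgene": 200,  "total": 1_000_000, "psbA": 300_000, "rbcL": 50_000},
    "optimized": {"transgene": 1500, "total": 1_200_000, "psbA": 150_000, "rbcL": 200_000},
}

for ref in ("total", "psbA", "rbcL"):
    fc = normalized_fold_change(
        counts["optimized"]["transgene"], counts["native"]["transgene"],
        counts["optimized"][ref], counts["native"][ref],
    )
    print(f"normalized to {ref}: {fc:.2f}-fold")
```

When the reference gene's own occupancy shifts between lines, as noted above for psbA and rbcL in the tobacco VP1 lines, the normalized value reflects that shift as much as any change on the transgene.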

The total number of ribosome footprints in the FVIII gene decreased approximately 2-fold in the codon-optimized line, whereas protein accumulation increased 4.5- to 6.6-fold. However, a major ribosome pause can be observed near the 3ʹ end of the native transgene, followed by a region of very low ribosome occupancy (see bracketed region in Fig. 7B). This ribosome pause maps to a pair of CTC Leu codons, a codon that is almost never used in native psbA genes (Fig. 1). These results strongly suggest that the stalling of ribosomes at these Leu codons limits the translation of the downstream sequences and overall protein output while also causing a buildup of ribosomes on the upstream sequences. Thus, overall ribosome occupancy does not reflect translational output in this case. Modification of those Leu codons in the codon-optimized variant eliminated this ribosome stall and resulted in a much more even ribosome distribution over the transgene (Fig. 7B, right). Taken together, the ribosome-profiling data revealed dramatic differences in ribosome dynamics between codon-optimized and native transgenes. Although total ribosome occupancy did not reliably predict protein output from transgenes expressed in chloroplasts, the detection of strong ribosome pauses at specific sites can provide insight into rate-limiting steps that could be mitigated through sequence modifications.
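One simple way to flag candidate pause sites of this kind is to look for codons whose footprint count far exceeds the average over the open reading frame, and then compare occupancy upstream and downstream of the flagged position. The sketch below assumes a per-codon count vector and a 5-fold threshold; both the threshold and the example profile are hypothetical choices for illustration, not the criteria used in this study.

```python
from statistics import mean


def find_pauses(per_codon_counts, fold_threshold=5.0):
    """Return codon indices whose footprint count is at least
    fold_threshold times the mean count over the whole ORF."""
    orf_mean = mean(per_codon_counts)
    if orf_mean == 0:
        return []
    return [i for i, c in enumerate(per_codon_counts) if c >= fold_threshold * orf_mean]


def flank_occupancy(per_codon_counts, pause_index):
    """Mean footprint count upstream and downstream of a pause site."""
    upstream = per_codon_counts[:pause_index]
    downstream = per_codon_counts[pause_index + 1:]
    return (mean(upstream) if upstream else 0.0,
            mean(downstream) if downstream else 0.0)


# Hypothetical profile: steady coverage, one strong peak, then a depleted tail.
profile = [10] * 20 + [400] + [2] * 20
for idx in find_pauses(profile):
    up, down = flank_occupancy(profile, idx)
    print(f"pause at codon {idx}: upstream mean {up}, downstream mean {down}")
```

A strong peak followed by depleted downstream coverage is the pattern described above for the native FVIII HC transgene, where total footprint counts overstate translational output.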

DISCUSSION

Past studies on transgene expression in chloroplasts reported abundant transcripts but variable levels of translation based on the origin of the coding sequence. Prokaryotic genes were translated more efficiently than eukaryotic genes. Transcript abundance is attributed to the high copy number of transgenes and the strength of the psbA promoter. Among more than 150 transgenes expressed in chloroplasts, more than 75% utilized psbA regulatory sequences (Jin and Daniell, 2015; Daniell et al., 2016a, 2016b). In addition, three ribosome-binding regions in the 5ʹ UTR of psbA recruit ribosomes and efficiently form the translational initiation complex (Zou et al., 2003). Therefore, it is expected that improvement of translation elongation in heterologous genes should increase transgene expression. A codon table based on all chloroplast genes has a drawback: it assumes that all tRNA species are equally abundant. However, such translational selection is not possible (Surzycki et al., 2009). Therefore, in this study, we developed a codon optimizer program based on the codon usage of psbA genes across 133 plant species to increase the expression of heterologous genes in chloroplasts.

Figure 6. Fold change (increase) of CNTB-FVIII HC or CNTB-VP1 proteins based on targeted mass spectrometry analysis of CNTB and HC peptides. The reported data represent medians of the results from six and three peptides from CNTB-FVIII HC (A) and CNTB-VP1 (B), respectively. The y axis represents the fold change increase (based on measured fmol on column) of peptides from plant materials expressing genes codon optimized using the new algorithm (CO) with respect to plant materials expressing native sequence (N). CNTB: peptide 1, IFSYTESLAGK; peptide 2, IAYLTEAK; peptide 3, LCVWNNK. FVIII HC: peptide 4, FDDDNSPSFIQIR; peptide 5, WTVTVEDGPTK; peptide 6, YYSSFVNMER. SIS-normalized values represent fold change as a ratio to each spiked SIS peptide. Housekeeping (HK) protein normalization values represent fold change as a normalized ratio to Rubisco large or small subunit and ATP synthase CF1 β-subunit protein peptides. For peptide ratio results for CNTB-FVIII and CNTB-VP1, see Supplemental Data Set S1.

Figure 7. Ribosome profiling data from transplastomic plants expressing native and codon-optimized VP1 or FVIII HC. Read coverage for native transgenes (N), codon-optimized transgenes with new algorithm (CN), and the endogenous psbA and rbcL genes are displayed with the Integrated Genome Viewer. A, Data from tobacco leaves expressing native and codon-optimized VP1 transgenes. Asterisks mark each pair of consecutive Ala codons in the data from the native line. The + symbol marks three consecutive Ala codons. Many strong ribosome pause sites in the plants expressing native VP1 map to paired Ala codons, whereas

Codon Optimization Significantly Enhances Translation in Chloroplasts

The psbA promoter and 5ʹ UTR are the most widely used regulatory sequences for transgene expression in chloroplasts. Among more than 115 transgenes expressed via the chloroplast genome, 84 use the psbA regulatory sequence (Jin and Daniell, 2015; Daniell et al., 2016a, 2016b). A recent study (Nakamura et al., 2016) shows the absence of any detectable translation when codons for the tat coding sequence of HIV-1 were optimized using all 79 tobacco chloroplast mRNAs and regulated by the psbA 5ʹ UTR, but the same sequence was expressed well using the phage T7 GENE10 5ʹ UTR. However, when the 5ʹ psbA coding sequence was inserted between the psbA 5ʹ UTR and the tat sequence, translation was initiated. Therefore, compatibility between the psbA regulatory element and codons is vital for initiation and elongation during the translation of heterologous genes (Nakamura et al., 2016). Consequently, when heterologous genes are regulated by psbA, codon optimization based on psbA codon usage should facilitate the movement of ribosomes more efficiently from the translational initiation complex than codon-optimized sequences based on any other chloroplast genes.

In this study, we developed and tested two new codon optimizer programs based on the codon preference of psbA genes to improve the expression of heterologous genes in chloroplasts in concert with the psbA regulatory elements. The first (old) algorithm of the codon optimizer was programmed to use only the most highly used codons, resulting in lower expression than the native gene. The increase in expression of VP1 in chloroplasts between the old and new algorithm is 77- to 111-fold. Therefore, removal of rare codons and replacement with only highly preferred codons did not help in enhancing translation when tRNA pools were limited. Thus, the new algorithm of the codon optimizer program was used in all subsequent studies. The new algorithm of the codon optimizer used the codon distribution hierarchy observed among psbA genes. As a result, 105 rare codons out of 754 codons in the FVIII HC gene and 59 rare codons out of 302 codons in the VP1 gene were replaced with psbA preferentially used codons. However, the replaced codons are not identified as rare codons in codon tables using all chloroplast genes. Therefore, the total chloroplast codon table would have retained 75 rare codons in FVIII HC and 35 rare codons in the VP1 coding sequence.
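The contrast between the two strategies can be sketched as follows. This is a hedged illustration, not the authors' Java program: the function names, the Leu usage fractions in `PSBA_USAGE_HYPOTHETICAL`, and the weighted random choice used to mimic a usage hierarchy are all assumptions made for the example. The 5% rare-codon cutoff is the value mentioned later in the Discussion.

```python
import random

# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical psbA-style usage fractions for Leu only (illustrative values).
PSBA_USAGE_HYPOTHETICAL = {
    "L": {"TTA": 0.28, "CTT": 0.27, "TTG": 0.22, "CTA": 0.18, "CTG": 0.04, "CTC": 0.01},
}


def split_codons(seq):
    """Split a coding sequence into complete codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def translate(seq):
    return "".join(CODON_TO_AA[c] for c in split_codons(seq))


def optimize_top_only(seq, usage):
    """'Old'-style strategy: every codon becomes the single most used synonym."""
    out = []
    for codon in split_codons(seq):
        table = usage.get(CODON_TO_AA[codon])
        out.append(max(table, key=table.get) if table else codon)
    return "".join(out)


def optimize_by_hierarchy(seq, usage, cutoff=0.05, seed=0):
    """'New'-style strategy (assumed): only codons below the cutoff are replaced,
    with the replacement drawn in proportion to the usage of the allowed codons."""
    rng = random.Random(seed)
    out = []
    for codon in split_codons(seq):
        table = usage.get(CODON_TO_AA[codon])
        if table and table.get(codon, 0.0) < cutoff:
            allowed = [c for c in table if table[c] >= cutoff]
            codon = rng.choices(allowed, weights=[table[c] for c in allowed], k=1)[0]
        out.append(codon)
    return "".join(out)


native = "ATGCTCCTCTTACTG"  # Met-Leu-Leu-Leu-Leu, hypothetical fragment
print(optimize_top_only(native, PSBA_USAGE_HYPOTHETICAL))
print(optimize_by_hierarchy(native, PSBA_USAGE_HYPOTHETICAL))
```

The top-only strategy funnels every Leu codon onto one tRNA, whereas the hierarchy-based variant spreads demand across the commonly used synonyms, which is consistent with the explanation given above in terms of limited tRNA pools.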

Although we used a psbA-based codon optimization program to improve translation in chloroplasts, many other factors, including the size and origin of heterologous genes and the compatibility of the 5ʹ UTR and its 5ʹ coding region, are important. The CNTB-fused native sequence of human proinsulin (approximately 22 kD) was expressed up to 72% of total leaf protein (Ruhlman et al., 2010), and the expression of ZZTEV-IGF-1 (Staphylococcus aureus Z domains and TEV cleavage site fused to native human insulin-like growth factor 1 gene; approximately 26 kD) was up to 32.4% of TSP (Daniell et al., 2009). However, human TGF-β3 (13 kD, 56% GC) was expressed in up to 12% of leaf protein only after codon optimization (Gisby et al., 2011). Also, the expression of GFP (approximately 26 kD) increased approximately 80-fold after codon optimization (Franklin et al., 2002). Therefore, proteins with shorter coding sequences are not ideal to evaluate codon optimization concepts and other limitations in translation. Consequently, a better understanding of codon usage and other rate-limiting steps (compatibility with regulatory sequences, efficiency of translation initiation, elongation, and availability of tRNAs) in translation is essential for the successful expression of human or other eukaryotic coding sequences.

Figure 7. (Continued.) this is not observed in the codon-optimized line. Triangles mark each pair of consecutive Ser codons in the codon-optimized line. A major ribosome stall maps to a region harboring five closely spaced Ser codons in the codon-optimized VP1 gene. nt, Nucleotides. B, Data from lettuce plants expressing the native and codon-optimized FVIII HC transgenes. A major ribosome stall in the native FVIII HC gene maps to a pair of adjacent CTC Leu codons, a codon that is not used in the native psbA gene. Ribosome footprint coverage is much more uniform on the codon-optimized transgene. C, Absolute and relative ribosome footprint counts.

Codon usage in psbA (our program) is different for preferred Arg, Asn, Gly, His, Leu, and Phe codons than those reported for 79 tobacco chloroplast mRNAs based on in vitro studies (Nakamura and Sugiura, 2007). Preferred codons are decoded more rapidly than nonpreferred codons, presumably due to higher concentrations of corresponding tRNAs that recognize preferred codons, which speeds up the elongation rate of protein synthesis (Yu et al., 2015). Higher plant chloroplast genomes code for a conserved set of 30 tRNAs. This set is believed to be sufficient to support the translation machinery in chloroplasts (Lung et al., 2006). In the ribosome profiling data for codon-optimized VP1, two major ribosome stalling sites correlated with an unusually high concentration of Ser codons (Fig. 7A). Five Ser codons were clustered at codons 71, 73, 75, 76, and 79, and three other Ser codons were found at codons 178, 179, and 182. Two adjacent Ser residues in each cluster, codons 75 and 76 (UCU-AGU) and codons 178 and 179 (UCC-UCU; see triangles in Fig. 7A), show a high level of ribosome stalling. Thus, it may be possible to further increase the expression of the codon-optimized VP1 transgene by replacing these codons with codons for a different but similar amino acid.

As seen in this study, the AT content of codon-optimized VP1 was increased marginally, but the protein level of the optimized CNTB-VP1 increased significantly, up to 22.5- to 28.1-fold (by PRM) and 91- to 125-fold (by western blot), over the native sequence when expressed in chloroplasts. Therefore, several other factors play key roles in regulating the efficiency of translation. As observed in ribosome profiling studies of CNTB-VP1, the availability and density of specific codons could severely impact translation. Similarly, FVIII HC ribosome footprint results showed that ribosome pauses mapped to CTC Leu codons, which are almost never used in psbA genes. This codon also is rarely used in the lettuce rbcL gene (2.44%) and is never used in tobacco rbcL. Native FVIII HC uses the CTC codon as much as 15.28%, but the CTC codon was eliminated from the codon-optimized sequence based on psbA codon usage. More detailed analysis of the codon frequency of the native FVIII HC and the psbA gene reveals further insight into rare codons: GGG for Gly is used 2.3% in psbA but 11.63% in native HC; CTG for Leu is 3.7% in psbA but 26.39% in native HC; CCC for Pro is 1.9% versus 11.9%; CGG for Arg is 0.5% versus 10.81%; and GTG for Val is 1.7% versus 25.49%. So, similar to the CTC codon, several other rare codons in native human genes should have reduced translational efficiency in chloroplasts. In the process of developing the codon optimizer, the cutoff value used for the determination of codons was set at 5% to eliminate rare codons. So, there is room to further modify the codon optimizer program.

New Solution for the Quantitation of Insoluble Multimeric Proteins

A major challenge is the lack of reliable methods to quantify insoluble proteins, because the only reliable method (ELISA) cannot be used due to the aggregation or formation of multimeric structures. CNTB fusion proteins expressed in chloroplasts form pentameric structures that are highly resistant to detergents, and this hampers solubilization due to tight interactions between CNTB monomers, mediated by 30 hydrogen bonds, seven salt bridges, and hydrophobic interactions (Miyata et al., 2012). In our previous studies (Boyhan and Daniell, 2011; Kwon et al., 2013; Kohli et al., 2014; Shil et al., 2014), multimeric forms exist even after treatment with DTT, detergents (SDS), and boiling. Also, acid (pH 2) could not completely dissociate CNTB pentamers due to the reformation of multimeric structures. Although such stability of pentamers is ideal for the oral drug delivery of CNTB fusion proteins, quantitation of the dose continues to be a major challenge.

Delivering accurate doses of protein drugs is a fundamental requirement for their clinical use. Therefore, in this study, we carried out PRM analysis for the absolute quantitation of CNTB-FVIII HC and CNTB-VP1 in plants carrying codon-optimized and native sequences. Limitations in quantitation using western blots, including protein aggregation, inefficient transfer of large proteins to membranes, inadequate solubilization, and differential exposure to films, were quite evident, resulting in unreliable quantification of drug dosage in planta. Use of strong denaturing and reducing conditions in combination with optimal enzymatic proteolysis conditions maximized the solubilization of multimeric CNTB proteins. PRM analysis has been broadly adopted in quantitative proteomics studies (e.g. biomarker discovery in plasma), due to its high sensitivity, specificity, and precise quantitation of specific protein targets within complex protein matrices (Gallien et al., 2012). These qualities clearly show the advantage of using PRM in the quantification of specific protein targets, independent of the protein matrix source (e.g. plant extracts from tobacco or lettuce) or complexity. Moreover, the development of a PRM assay for a handful of proteins can be achieved in a relatively short time and at low cost (not considering the mass spectrometry instrumentation). As a peptide-centric quantitation methodology, it also offers robustness and versatility of protein extraction methods, and keeping the protein of interest in a native conformation is not required. However, it is intrinsically biased by the accessibility of cleavage sites to the enzymes used for digestion. In order to overcome this bias, we used strong denaturing conditions (i.e. 2% SDS) and buffers that favor the activity of the proteolytic enzymes (i.e. sodium deoxycholate-based buffers; León et al., 2013). For FVIII HC (Figs. 5 and 6), there were no significant variations in the values for fold increases of codon-optimized over native sequences, which were determined by the peptides chosen for quantification. In addition, the fold increases were very similar between two different normalization approaches. Three peptides selected from the CNTB region (N terminus of the fusion protein) showed that the range of the fold increase was from 4.5 to 6.6, while the range was 5 to 7.1 for the peptides chosen from FVIII regions (C terminus of the fusion protein). Therefore, quantification results obtained from PRM analysis are consistent, irrespective of the selected region of the fusion protein (N or C terminus) or the component protein (CNTB or FVIII HC).


By using absolute quantified SIS peptides at identical concentrations in all samples and by examining the entire length from N to C terminus, one could accurately quantify the absolute amount of the target protein (Streng et al., 2016). Furthermore, the accuracy of PRM assays in this study was consolidated by using two different normalization methods: SIS peptides and peptides for housekeeping proteins (large or small subunit proteins of Rubisco and ATP synthase β-subunit). Incomplete/cleaved proteins can be detected using targeted peptides located closer to the C or N terminus or in the midregions. Quantification results obtained from PRM analysis of both CNTB fusion proteins in our study are consistent, irrespective of the selected region of the fusion protein (N or C terminus or elsewhere), and offer data for reliable quantitation. Also, the same three CNTB peptides for CNTB-VP1 showed consistent fold increases, ranging from 22.5 to 28.1. PRM analysis is better than western blotting because it eliminates variation introduced by mobility and the transfer of different-sized proteins and the saturation of antibody probes. Overall, the PRM workflow included selection of the proteotypic peptides from CNTB and FVIII HC sequences and synthesis of the counterpart SIS peptides (Supplemental Fig. S6). Six peptides were selected and scheduled for PRM analysis on the Q Exactive mass spectrometer, based on observed retention time on the chromatograph with a window of ±5 min and mass-to-charge ratio (m/z) of the double and/or triple charge state of these peptides. This dual targeting of precursor ion selection, in addition to the high resolution of the Q Exactive mass spectrometer, contributes to the high specificity of the assay. The PRM data analysis, postacquisition, also offers a high specificity to the assay. The five most intense fragment ions, with no clear contaminant contribution from the matrix, are then selected for the quantification of the peptide. The confidence of the fragment ion assignment by the bioinformatics tool used (i.e. Skyline; MacLean et al., 2010) is finally achieved by the comparison of the reference tandem mass spectrometry spectra and the retention time profiles, generated with each of the counterpart SIS peptides. The high sensitivity, specificity, versatility, and robustness of PRM offer a new opportunity for characterizing translational systems in plants.
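The two scheduling criteria mentioned above, precursor m/z for the doubly or triply charged peptide and a retention-time window, can be illustrated with a short calculation. The sketch below assumes unmodified peptides, standard monoisotopic residue masses, and a symmetric window of 5 min on either side of the scheduled retention time; the function names and the retention times are hypothetical and are not taken from the study's acquisition method.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da, added once per peptide
PROTON = 1.007276   # Da, added once per charge


def peptide_mass(sequence):
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def precursor_mz(mass, charge):
    """m/z of the [M + zH]z+ precursor ion."""
    return (mass + charge * PROTON) / charge


def in_rt_window(observed_rt, scheduled_rt, half_width=5.0):
    """True if a retention time (min) falls inside the scheduled window."""
    return abs(observed_rt - scheduled_rt) <= half_width


mass = peptide_mass("IAYLTEAK")  # one of the CNTB peptides named in the text
for z in (2, 3):
    print(f"IAYLTEAK {z}+ precursor m/z: {precursor_mz(mass, z):.4f}")
print(in_rt_window(observed_rt=22.0, scheduled_rt=25.0))  # hypothetical times
```

A peptide containing Cys, such as LCVWNNK, would normally carry a fixed alkylation mass on that residue after standard sample preparation; that adjustment is left out here because the text does not specify the modification.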

CONCLUSION

This study explored heterologous gene expression utilizing chloroplast genome sequences, ribosome profiling, and targeted mass spectrometry to enhance our understanding of the synthesis of valuable biopharmaceuticals in chloroplasts. Targeted proteomic quantification by mass spectrometry showed that codon optimization increases translation efficiency 4.5- to 28.1-fold based on the coding sequence, validating this approach, to our knowledge, for the first time for the quantitation of protein drug dosage in plant cells. The lack of reliable methods to quantify insoluble proteins due to the aggregation or formation of multimeric structures is a major challenge. Both biopharmaceuticals used in this study are CNTB fusion proteins that form pentamers, which is a requirement for their binding to intestinal epithelial monosialotetrahexosylganglioside receptors. Such a multimeric structure excludes the commonly used ELISA for the quantitation of dosage. However, delivering accurate doses of protein drugs is a fundamental requirement for their clinical use, and this important goal was accomplished in this study. Indeed, plant biomass generated in this study has resulted in the development of a polio booster vaccine that has been validated by the Centers for Disease Control and Prevention, a timely invention to meet the World Health Organization requirement to withdraw the current oral polio vaccine, which causes severe polio in outbreak areas, in April 2016 (Chan et al., 2016).

Such an increase of codon-optimized protein accumulation is at the translational level rather than any impact on transcript abundance. The codon optimizer program increases transgene expression in chloroplasts in both tobacco and lettuce with no species specificity. In contrast to previous in vitro studies, these in-depth in vivo studies of heterologous gene expression using a wealth of newly sequenced chloroplast genomes helped us to understand the codon optimization process. While the removal of rare codons is very important, replacing those with only the most highly used psbA codons decreased translation efficiency. Therefore, the key factor in enhancing translation is the replacement of rare codons following the hierarchy of a highly expressed gene. Ribosome footprints obtained using profiling studies did not increase proportionately with VP1 translation or even decreased after FVIII codon optimization, but ribosome profiling is a valuable tool for diagnosing rate-limiting steps in translation. A major ribosome pause at CTC Leu codons, a rarely used codon in chloroplasts, was eliminated from the native gene after codon optimization. Ribosome stalls observed at clusters of other codons in codon-optimized genes provide opportunities for further optimization. These observations provide further insight into limitations in chloroplast translation and approaches to address these in future studies.

MATERIALS AND METHODS

Codon Optimization

To maximize the expression of heterologous genes in chloroplasts, a chloroplast codon optimizer program was developed based on the codon preference of psbA genes across 133 seed plant species. All sequences were downloaded from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=2759&opt=plastid). The usage preference among synonymous codons for each amino acid was determined by analyzing a total of 46,500 codons from 133 psbA genes. The optimization algorithm (Chloroplast Optimizer version 2.1) was implemented in Java to facilitate changes from rare codons to codons that are frequently used in chloroplasts.
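The tabulation of synonymous codon preference described above can be illustrated with a minimal Python sketch. It is not the Chloroplast Optimizer itself (which was written in Java and is not reproduced here); the function names and the toy input sequence are hypothetical, and the sketch only shows one way to pool codon counts and rank synonymous codons by relative usage.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G); plastids use the
# same codon-to-amino-acid assignments.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TO_AA = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_usage(coding_sequences):
    """Pool codon counts over all sequences and return, for each amino acid,
    the relative frequency of each synonymous codon that was observed."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TO_AA:  # skip codons with ambiguous bases
                counts[codon] += 1
    per_aa = defaultdict(dict)
    for codon, n in counts.items():
        per_aa[CODON_TO_AA[codon]][codon] = n
    return {aa: {codon: n / sum(table.values()) for codon, n in table.items()}
            for aa, table in per_aa.items()}


def codon_hierarchy(usage, amino_acid):
    """Synonymous codons for one amino acid, most used first."""
    table = usage[amino_acid]
    return sorted(table, key=lambda codon: (-table[codon], codon))


# Toy input (hypothetical, not a real psbA sequence): Met plus three Leu codons.
usage = codon_usage(["ATGCTTCTTCTC"])
print(codon_hierarchy(usage, "L"))  # ['CTT', 'CTC']
```

In practice the input would be the full set of psbA coding sequences, and the resulting hierarchy for each amino acid would guide which codons are treated as rare.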

Creation of Transplastomic Lines

The native sequence of FVIII HC was amplified using the pAAV-TTR-hF8mini plasmid (Sherman et al., 2014) as the PCR template. The codon-optimized HC sequence obtained using Codon Optimizer version 2.1 was synthesized by GenScript. The native VP1 gene (906 bp) of Sabin 1 (provided by Dr. Konstantin Chumakov, Food and Drug Administration) was used as the template for PCR amplification. The codon-optimized VP1 sequence also was synthesized by GenScript. Amplified and synthetic gene sequences were cloned into chloroplast transformation vectors pLSLF and pLD-utr for lettuce (Lactuca sativa) and tobacco (Nicotiana tabacum ‘Petite Havana’), respectively. Sequence-confirmed plasmids were used for bombardment to create transplastomic plants as described previously (Verma et al., 2008). Transplastomic lines were confirmed using Southern-blot analysis as described previously (Verma et al., 2008), except for probe labeling and detection, for which the DIG High Prime DNA Labeling and Detection Starter Kit II (Roche; catalog no. 11585624910) was used.

Evaluation of Translation

To compare the level of protein expression between native and codon-optimized sequences, immunoblot and densitometric assays were performed using anti-CNTB antibody. For total plant protein, powdered lyophilized plant cells were suspended in extraction buffer (100 mM NaCl, 10 mM EDTA, 200 mM Tris-Cl, pH 8, 0.05% [v/v] Tween 20, 0.1% SDS, 14 mM β-mercaptoethanol, 400 mM Suc, 2 mM phenylmethylsulfonyl fluoride, and proteinase inhibitor cocktail) in a ratio of 10 mg per 500 µL and incubated on ice for 1 h for rehydration. Suspended cells were sonicated (pulse on for 5 s and pulse off for 10 s; sonicator 3000; Misonix) after vortexing (approximately 30 s). After Bradford assay, equal amounts of homogenized protein were loaded and separated on SDS-polyacrylamide gels with known amounts of CNTB protein standard. To detect CNTB fusion proteins, anti-CNTB polyclonal antibody (GenWay Biotech) was diluted 1:10,000 in 1× phosphate-buffered saline + 0.1% Tween 20, and then membranes were probed with goat anti-rabbit IgG-horseradish peroxidase secondary antibody (Southern Biotechnology; 4030-05) diluted 1:4,000 in 1× phosphate-buffered saline + 0.1% Tween 20. For loading controls, protein-blotted membrane was stained with Ponceau S (Sigma; P-3504) prior to immunoprobing with anti-CNTB antibody, and anti-RbcL antibody (Agrisera; AS03 037; 1:5,000) was used on the same blots after stripping anti-CNTB antibody. Chemiluminescent signals were developed on x-ray films, which were used for quantitative analysis with ImageJ software (IJ 1.46r; National Institutes of Health).
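Densitometric quantitation against known amounts of CNTB standard amounts to fitting a standard curve and interpolating the unknown bands. The Python sketch below illustrates this with an ordinary least-squares line; the function names are hypothetical and every number is made up for illustration, not taken from the blots in this study.

```python
def fit_standard_curve(amounts_ng, densities):
    """Ordinary least-squares line: density = slope * amount + intercept."""
    n = len(amounts_ng)
    mean_x = sum(amounts_ng) / n
    mean_y = sum(densities) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts_ng)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(amounts_ng, densities))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_density(density, slope, intercept):
    """Invert the standard curve to estimate the amount in a sample band."""
    return (density - intercept) / slope


# Hypothetical CNTB standards (ng) and their integrated densities.
standards_ng = [25.0, 50.0, 100.0]
standard_densities = [1000.0, 2000.0, 4000.0]
slope, intercept = fit_standard_curve(standards_ng, standard_densities)

# Hypothetical sample bands from equal protein loads.
native_ng = amount_from_density(500.0, slope, intercept)
optimized_ng = amount_from_density(3000.0, slope, intercept)
fold_increase = optimized_ng / native_ng
print(round(native_ng, 2), round(optimized_ng, 2), round(fold_increase, 2))
```

Interpolation is only meaningful for bands whose density falls within the range spanned by the standards, so samples may need to be diluted before loading.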

Evaluation of Transcripts

Total RNA was extracted from leaves of plants grown in agar medium in a tissue culture room using the easy-BLUE Total RNA Extraction Kit (iNtRON; catalog no. 17061). For the RNA gel blot, equal amounts of total RNA were separated on a 0.8% agarose gel (containing 1.85% formaldehyde and 1× MOPS) and blotted onto a nylon membrane (Nytran SPC; Whatman). For northern blot, the PCR-amplified product from the psbA 5′ UTR of the chloroplast transformation plasmid was used as the probe. Hybridization signals on membranes were detected using a DIG labeling and detection kit as described above.

Lyophilization

Confirmed homoplasmic lines were transferred to a temperature- and light-controlled greenhouse. Mature leaves from fully grown transplastomic plants were harvested and stored at −80°C before lyophilization. To freeze dry plant leaf materials, frozen, crumbled small leaf pieces were sublimated under 400-mTorr vacuum while increasing the chamber temperature from −40°C to 25°C for 3 d (Genesis 35XL; VirTis SP Scientific). Dehydrated leaves were powdered using a coffee grinder (Hamilton Beach) at maximum speed; tobacco was ground three times for 10 s each, and lettuce was ground three times for 5 s. Powdered leaves were stored in containers under air-tight and moisture-free conditions at room temperature with silica gel.

Protein Extraction and Sample Preparation for Mass Spectrometry Analysis

Total protein was extracted from 10 mg of lyophilized leaf powder by adding 1 mL of extraction buffer (2% SDS, 100 mM DTT, and 20 mM TEAB). Lyophilized leaf powder was incubated for 30 min at room temperature with sporadic vortexing to allow rehydration of plant cells. Homogenates were then incubated for 1 h at 70°C, followed by overnight incubation at room temperature under constant rotation. Cell wall/membrane debris was pelleted by centrifugation at 14,000 rpm (approximately 20,800 rcf). The procedure was performed in duplicate.
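The relation between rotor speed and relative centrifugal force follows the standard formula RCF = 1.118 × 10⁻⁵ × r × rpm², with r the rotor radius in centimeters. The short Python sketch below applies it; the 9.5-cm radius is an assumption chosen because it reproduces the stated value, since the rotor is not specified in the text.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) from rotor speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


# Assumed rotor radius (not given in the text): 9.5 cm.
print(round(rcf_from_rpm(14_000, 9.5)))  # about 20,800 x g
```

Reporting rcf alongside rpm, as done here, lets the spin be reproduced on a rotor of a different radius.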

All protein extracts (100 µL) were enzymatically digested with 10 µg of trypsin/Lys-C (Promega) on a centrifugal device with a filter cutoff of 10 kD (Vivacon) in the presence of 0.5% sodium deoxycholate, as described previously (León et al., 2013). After digestion, sodium deoxycholate was removed by acid precipitation with 1% (final concentration) trifluoroacetic acid. SIS peptides (greater than 97% purity, C-terminal Lys and Arg as Lys U-¹³C₆;U-¹⁵N₂ and Arg U-¹³C₆;U-¹⁵N₄; JPT Peptide Technologies) were spiked into the samples prior to desalting. Samples were desalted prior to mass spectrometry analysis with OligoR3 stage tips (Applied Biosystems). The initial protein extract (10 µL) was desalted on an OligoR3 stage tip column. Desalted material was then dried on a speed vacuum device and suspended in 6 µL of 0.1% formic acid in water. Mass spectrometry analysis was performed in duplicate by injecting 2 µL of desalted material into the column.

PRM Mass Spectrometry Analysis and Data Analysis

Liquid chromatography-coupled targeted mass spectrometry analysis was performed by injecting the column with 2 µL of peptide, corresponding to the amount of total protein extracted and digested from 33.3 µg of lyophilized leaf powder, with 34 fmol of each SIS peptide spiked in. Peptides were separated using the Easy-nLC 1000 (Thermo Scientific) on a home-made 30-cm × 75-µm i.d. C18 column (1.9 µm particle size; ReproSil; Dr. Maisch HPLC). Mobile phases consisted of an aqueous solution of 0.1% formic acid (A) and 90% acetonitrile and 0.1% formic acid (B), both HPLC grade (Fluka). Peptides were loaded on the column at 250 nL min⁻¹ with an aqueous solution of 4% solvent B. Peptides were eluted by applying a nonlinear gradient of 4%-7%-27%-36%-65%-80% B in 2-50-10-10-5 min, respectively.
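The multistep gradient can be read as six solvent B set points joined by five segments. As a hedged illustration, the Python sketch below tabulates the breakpoints and interpolates the percentage of solvent B at any time, assuming a linear ramp within each segment (the text gives only the set points and segment durations); the names are hypothetical.

```python
from itertools import accumulate

PERCENT_B = [4, 7, 27, 36, 65, 80]   # %B at each breakpoint (from the text)
SEGMENT_MIN = [2, 50, 10, 10, 5]     # duration of each segment (from the text)
BREAKPOINTS_MIN = [0] + list(accumulate(SEGMENT_MIN))  # 0, 2, 52, 62, 72, 77


def percent_b(t_min):
    """%B at time t_min, assuming a linear ramp within each segment."""
    if not 0 <= t_min <= BREAKPOINTS_MIN[-1]:
        raise ValueError("time is outside the gradient program")
    for i in range(len(SEGMENT_MIN)):
        t0, t1 = BREAKPOINTS_MIN[i], BREAKPOINTS_MIN[i + 1]
        if t_min <= t1:
            b0, b1 = PERCENT_B[i], PERCENT_B[i + 1]
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


print(BREAKPOINTS_MIN[-1])  # total gradient length in minutes: 77
print(percent_b(27))        # halfway through the 50-min segment: 17.0
```

Most of the run (50 of 77 min) is spent on the shallow 7% to 27% B ramp, where the majority of tryptic peptides would be expected to elute.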

Mass spectrometry analysis was performed using the PRM mode on a Q Exactive mass spectrometer (Thermo Scientific) equipped with a nanospray Flex ion source (Gallien et al., 2012). Isolation of targets from the inclusion list involved a 2-m/z window, a resolution of 35,000 (at m/z 200), a target AGC value of 1 × 10⁶, and a maximum filling time of 120 ms. Normalized collision energy was set at 29. Retention time schedules were determined by the analysis of SIS peptides under identical nano-liquid chromatography conditions. A list of target precursor ions and a retention time schedule are reported in Supplemental Data Set S1. PRM data analysis was performed using Skyline software (MacLean et al., 2010).
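Quantitation with SIS peptides rests on the ratio of the endogenous (light) peak area to the spiked-in labeled (heavy) peak area. The Python sketch below shows this single-point calculation using the spike amount (34 fmol) and powder equivalent per injection (33.3 µg) given in this section. The peak areas are hypothetical, the function names are illustrative, and equal response of light and heavy peptides is assumed; it is not a reproduction of the Skyline workflow.

```python
SIS_SPIKE_FMOL = 34.0            # each SIS peptide per injection (from the text)
POWDER_PER_INJECTION_UG = 33.3   # lyophilized powder equivalent per injection


def endogenous_fmol(light_area, heavy_area, spike_fmol=SIS_SPIKE_FMOL):
    """Endogenous peptide amount, assuming light and heavy forms respond equally."""
    if heavy_area <= 0:
        raise ValueError("heavy (SIS) peak area must be positive")
    return light_area / heavy_area * spike_fmol


def fmol_per_ug_powder(light_area, heavy_area):
    """Normalize the peptide amount to the powder equivalent injected."""
    return endogenous_fmol(light_area, heavy_area) / POWDER_PER_INJECTION_UG


# Hypothetical integrated peak areas for one peptide in two samples.
native = fmol_per_ug_powder(light_area=1.0e6, heavy_area=2.0e6)
optimized = fmol_per_ug_powder(light_area=4.0e6, heavy_area=2.0e6)
print(round(optimized / native, 2))  # fold increase for these made-up areas
```

Because the same amount of SIS peptide is spiked into every sample, the fold change between two samples reduces to the ratio of their light-to-heavy area ratios.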

Ribosome Profiling

Second and third leaves from the top of the plant were harvested for ribosome profiling. Lettuce plants were approximately 2 months old. Tobacco plants were 2.5 or 2 months old, for native and codon-optimized VP1 constructs, respectively. Leaves were harvested at noon and flash frozen in liquid nitrogen. Ribosome footprints were prepared as described by Zoschke et al. (2013), except that RNase I was substituted for micrococcal nuclease. Ribosome footprints were converted to a sequencing library with the NEXTflex Illumina Small RNA Sequencing Kit version 2 (BIOO Scientific; 5132-03). rRNA contaminants were depleted by subtractive hybridization after first-strand cDNA synthesis using biotinylated oligonucleotides corresponding to abundant rRNA contaminants observed in pilot experiments. Samples were sequenced at the University of Oregon Genomics Core Facility. Sequence reads were processed with cutadapt to remove adapter sequences and bowtie2 with default parameters to align reads to the engineered chloroplast genome sequence.
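Once footprints are aligned, candidate ribosome pause sites can be flagged by comparing the footprint count at each codon with the mean count across the coding region. The Python sketch below illustrates one such pause score; it is an assumption about how pauses could be scored, not the analysis used in this study, and the codons, counts, and threshold are hypothetical.

```python
def pause_scores(codon_counts):
    """Footprint count at each codon divided by the mean count over the gene."""
    mean = sum(codon_counts) / len(codon_counts)
    if mean == 0:
        return [0.0] * len(codon_counts)
    return [count / mean for count in codon_counts]


def pause_sites(codons, codon_counts, min_score=3.0):
    """Return (index, codon, score) for codons at or above the threshold."""
    scores = pause_scores(codon_counts)
    return [(i, codon, score)
            for i, (codon, score) in enumerate(zip(codons, scores))
            if score >= min_score]


# Hypothetical per-codon footprint counts along a short reading frame.
codons = ["ATG", "CTC", "GCT", "GGT"]
counts = [2, 20, 1, 1]
for index, codon, score in pause_sites(codons, counts):
    print(index, codon, round(score, 2))
```

Normalizing to the gene-wide mean makes scores comparable between transgenes with different overall footprint densities, such as native and codon-optimized versions of the same gene.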

Accession Numbers

Sequence data from this article can be found in the GenBank/EMBL data libraries under accession numbers NM_000132.3 for FVIII HC and AY184219 for VP1.

Supplemental Data

The following supplemental materials are available.

Supplemental Figure S1. Sequences of native and codon-optimized FVIII HC and VP1 genes.

Supplemental Figure S2. Comparison of native and codon-optimized (new and old) sequences.

Supplemental Figure S3. Three different codon tables for the expression of heterologous genes in chloroplasts.

Supplemental Figure S4. Plot of integrated density values for the quantification of CNTB-FVIII HC and CNTB-VP1 based on standard curves.

Supplemental Figure S5. Peptide sequences used for targeted mass spectrometry.

Supplemental Figure S6. Comparison of CNTB-FVIII HC and VP1 by PRM analysis.

Supplemental Figure S7. Evaluation of PRM assay linearity.

Supplemental Data Set S1. Codon usage table and mass spectrometry data.

ACKNOWLEDGMENTS

We thank Mark Yarmarkovich for help with developing the codon optimization algorithms, Nick Stiffler for help with the bioinformatic analysis of ribosome profiling data, and Non Chotewutmontri for helpful discussions.

Received June 20, 2016; accepted July 25, 2016; published July 27, 2016.

LITERATURE CITED

Arlen PA, Falconer R, Cherukumilli S, Cole A, Cole AM, Oishi KK, Daniell H (2007) Field production and functional evaluation of chloroplast-derived interferon-alpha2b. Plant Biotechnol J 5: 511–525

Barkan A (1988) Proteins encoded by a complex chloroplast transcription unit are each translated from both monocistronic and polycistronic mRNAs. EMBO J 7: 2637–2644

Birch-Machin I, Newell CA, Hibberd JM, Gray JC (2004) Accumulation of rotavirus VP6 protein in chloroplasts of transplastomic tobacco is limited by protein stability. Plant Biotechnol J 2: 261–270

Boehm CR, Ueda M, Nishimura Y, Shikanai T, Haseloff J (2016) A cyan fluorescent reporter expressed from the chloroplast genome of Marchantia polymorpha. Plant Cell Physiol 57: 291–299

Boyhan D, Daniell H (2011) Low-cost production of proinsulin in tobacco and lettuce chloroplasts for injectable or oral delivery of functional insulin and C-peptide. Plant Biotechnol J 9: 585–598

Chan HT, Daniell H (2015) Plant-made oral vaccines against human infectious diseases: are we there yet? Plant Biotechnol J 13: 1056–1070

Chan HT, Xiao Y, Weldon WC, Oberste SM, Chumakov K, Daniell H (2016) Cold chain and virus free chloroplast-made booster vaccine to confer immunity against different polio virus serotypes. Plant Biotechnol J (in press) doi: 10.1111/pbi.12575

Daniell H, Chan HT, Pasoreck EK (2016b) Vaccination through chloroplast genetics: affordable protein drugs for the prevention and treatment of inherited or infectious diseases. Annu Rev Genet 50: (in press)

Daniell H, Datta R, Varma S, Gray S, Lee SB (1998) Containment of herbicide resistance through genetic engineering of the chloroplast genome. Nat Biotechnol 16: 345–348

Daniell H, Lin CS, Yu M, Chang WJ (2016a) Chloroplast genomes: diversity, evolution, and applications in genetic engineering. Genome Biol 17: 134

Daniell H, Ruiz G, Denes B, Sandberg L, Langridge W (2009) Optimization of codon composition and regulatory elements for expression of human insulin like growth factor-1 in transgenic chloroplasts and evaluation of structural identity and function. BMC Biotechnol 9: 33

Daniell H, Vivekananda J, Nielsen BL, Ye GN, Tewari KK, Sanford JC (1990) Transient foreign gene expression in chloroplasts of cultured tobacco cells after biolistic delivery of chloroplast vectors. Proc Natl Acad Sci USA 87: 88–92

De Cosa B, Moar W, Lee SB, Miller M, Daniell H (2001) Overexpression of the Bt cry2Aa2 operon in chloroplasts leads to formation of insecticidal crystals. Nat Biotechnol 19: 71–74

DeGray G, Rajasekaran K, Smith F, Sanford J, Daniell H (2001) Expression of an antimicrobial peptide via the chloroplast genome to control phytopathogenic bacteria and fungi. Plant Physiol 127: 852–862

Domon B, Aebersold R (2010) Options and considerations when selecting a quantitative proteomics strategy. Nat Biotechnol 28: 710–721

Eibl C, Zou Z, Beck A, Kim M, Mullet J, Koop HU (1999) In vivo analysis of plastid psbA, rbcL and rpl32 UTR elements by chloroplast transformation: tobacco plastid gene expression is controlled by modulation of transcript levels and translation efficiency. Plant J 19: 333–345

Franklin S, Ngo B, Efuet E, Mayfield SP (2002) Development of a GFP reporter gene for Chlamydomonas reinhardtii chloroplast. Plant J 30: 733–744

Gallien S, Duriez E, Crone C, Kellmann M, Moehring T, Domon B (2012) Targeted proteomic quantification on quadrupole-Orbitrap mass spectrometer. Mol Cell Proteomics 11: 1709–1723

Gisby MF, Mellors P, Madesis P, Ellin M, Laverty H, O’Kane S, Ferguson MW, Day A (2011) A synthetic gene increases TGFβ3 accumulation by 75-fold in tobacco chloroplasts enabling rapid purification and folding into a biologically active molecule. Plant Biotechnol J 9: 618–628

Hassan SW, Waheed MT, Müller M, Clarke JL, Shinwari ZK, Lössl AG (2014) Expression of HPV-16 L1 capsomeres with glutathione-S-transferase as a fusion protein in tobacco plastids: an approach for a capsomere-based HPV vaccine. Hum Vaccin Immunother 10: 2975–2982

Ingolia NT, Ghaemmaghami S, Newman JR, Weissman JS (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324: 218–223

Inka Borchers AM, Gonzalez-Rabade N, Gray JC (2012) Increased accumulation and stability of rotavirus VP6 protein in tobacco chloroplasts following changes to the 5′ untranslated region and the 5′ end of the coding region. Plant Biotechnol J 10: 422–434

Jabeen R, Khan MS, Zafar Y, Anjum T (2010) Codon optimization of cry1Ab gene for hyper expression in plant organelles. Mol Biol Rep 37: 1011–1017

Jin S, Daniell H (2015) The engineered chloroplast genome just got smarter. Trends Plant Sci 20: 622–640

Kim JW, You J (2013) Protein target quantification decision tree. Int J Proteomics 2013: 701247

Kohli N, Westerveld DR, Ayache AC, Verma A, Shil P, Prasad T, Zhu P, Chan SL, Li Q, Daniell H (2014) Oral delivery of bioencapsulated proteins across blood-brain and blood-retinal barriers. Mol Ther 22: 535–546

Kwon KC, Nityanandam R, New JS, Daniell H (2013) Oral delivery of bioencapsulated exendin-4 expressed in chloroplasts lowers blood glucose level in mice and stimulates insulin secretion in beta-TC6 cells. Plant Biotechnol J 11: 77–86

Lakshmi PS, Verma D, Yang X, Lloyd B, Daniell H (2013) Low cost tuberculosis vaccine antigens in capsules: expression in chloroplasts, bio-encapsulation, stability and functional evaluation in vitro. PLoS ONE 8: e54708

Lee SB, Li B, Jin S, Daniell H (2011) Expression and characterization of antimicrobial peptides Retrocyclin-101 and Protegrin-1 in chloroplasts to control viral and bacterial infections. Plant Biotechnol J 9: 100–115

Lenzi P, Scotti N, Alagna F, Tornesello ML, Pompa A, Vitale A, De Stradis A, Monti L, Grillo S, Buonaguro FM, et al (2008) Translational fusion of chloroplast-expressed human papillomavirus type 16 L1 capsid protein enhances antigen accumulation in transplastomic tobacco. Transgenic Res 17: 1091–1102

León IR, Schwämmle V, Jensen ON, Sprenger RR (2013) Quantitative assessment of in-solution digestion efficiency identifies optimal protocols for unbiased protein analysis. Mol Cell Proteomics 12: 2992–3005

Lung B, Zemann A, Madej MJ, Schuelke M, Techritz S, Ruf S, Bock R, Hüttenhofer A (2006) Identification of small non-coding RNAs from mitochondria and chloroplasts. Nucleic Acids Res 34: 3842–3852

Lutz KA, Knapp JE, Maliga P (2001) Expression of bar in the plastid genome confers herbicide resistance. Plant Physiol 125: 1585–1590

MacLean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B, Kern R, Tabb DL, Liebler DC, MacCoss MJ (2010) Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26: 966–968

Madesis P, Osathanunkul M, Georgopoulou U, Gisby MF, Mudd EA, Nianiou I, Tsitoura P, Mavromara P, Tsaftaris A, Day A (2010) A hepatitis C virus core polypeptide expressed in chloroplasts detects anti-core antibodies in infected human sera. J Biotechnol 145: 377–386

McCabe MS, Klaas M, Gonzalez-Rabade N, Poage M, Badillo-Corona JA, Zhou F, Karcher D, Bock R, Gray JC, Dix PJ (2008) Plastid transformation of high-biomass tobacco variety Maryland Mammoth for production of human immunodeficiency virus type 1 (HIV-1) p24 antigen. Plant Biotechnol J 6: 914–929

Miyata T, Oshiro S, Harakuni T, Taira T, Matsuzaki G, Arakawa T (2012) Physicochemically stable cholera toxin B subunit pentamer created by peripheral molecular constraints imposed by de novo-introduced intersubunit disulfide crosslinks. Vaccine 30: 4225–4232

Nakamura M, Hibi Y, Okamoto T, Sugiura M (2016) Cooperation between the chloroplast psbA 5′-untranslated region and coding region is important for translational initiation: the chloroplast translation machinery cannot read a human viral gene coding region. Plant J 85: 772–780

Nakamura M, Sugiura M (2007) Translation efficiencies of synonymous codons are not always correlated with codon usage in tobacco chloroplasts. Plant J 49: 128–134

Nakamura Y, Gojobori T, Ikemura T (2000) Codon usage tabulated from international DNA sequence databases: status for the year 2000. Nucleic Acids Res 28: 292

Picotti P, Aebersold R (2012) Selected reaction monitoring-based proteomics: workflows, potential, pitfalls and future directions. Nat Methods 9: 555–566

Quesada-Vargas T, Ruiz ON, Daniell H (2005) Characterization of heterologous multigene operons in transgenic chloroplasts: transcription, processing, and translation. Plant Physiol 138: 1746–1762

Ruhlman T, Verma D, Samson N, Daniell H (2010) The role of heterologous chloroplast sequence elements in transgene integration and expression. Plant Physiol 152: 2088–2104

Savas JN, Stein BD, Wu CC, Yates JR III (2011) Mass spectrometry accelerates membrane protein analysis. Trends Biochem Sci 36: 388–396

Shenoy V, Kwon KC, Rathinasabapathy A, Lin S, Jin G, Song C, Shil P, Nair A, Qi Y, Li Q, et al (2014) Oral delivery of Angiotensin-converting enzyme 2 and Angiotensin-(1-7) bioencapsulated in plant cells attenuates pulmonary hypertension. Hypertension 64: 1248–1259

Sherman A, Su J, Lin S, Wang X, Herzog RW, Daniell H (2014) Suppression of inhibitor formation against FVIII in a murine model of hemophilia A by oral delivery of antigens bioencapsulated in plant cells. Blood 124: 1659–1668

Shil PK, Kwon KC, Zhu P, Verma A, Daniell H, Li Q (2014) Oral delivery of ACE2/Ang-(1-7) bioencapsulated in plant cells protects against experimental uveitis and autoimmune uveoretinitis. Mol Ther 22: 2069–2082

Streng AS, de Boer D, Bouwman FG, Mariman EC, Scholten A, van Dieijen-Visser MP, Wodzig WK (2016) Development of a targeted selected ion monitoring assay for the elucidation of protease induced structural changes in cardiac troponin T. J Proteomics 136: 123–132

Surzycki R, Greenham K, Kitayama K, Dibal F, Wagner R, Rochaix JD, Ajam T, Surzycki S (2009) Factors effecting expression of vaccines in microalgae. Biologicals 37: 133–138

Verma D, Moghimi B, LoDuca PA, Singh HD, Hoffman BE, Herzog RW, Daniell H (2010) Oral delivery of bioencapsulated coagulation factor IX prevents inhibitor formation and fatal anaphylaxis in hemophilia B mice. Proc Natl Acad Sci USA 107: 7101–7106

Verma D, Samson NP, Koya V, Daniell H (2008) A protocol for expression of foreign genes in chloroplasts. Nat Protoc 3: 739–758

Waheed MT, Thönes N, Müller M, Hassan SW, Gottschamel J, Lössl E, Kaul HP, Lössl AG (2011a) Plastid expression of a double-pentameric vaccine candidate containing human papillomavirus-16 L1 antigen fused with LTB as adjuvant: transplastomic plants show pleiotropic phenotypes. Plant Biotechnol J 9: 651–660

Waheed MT, Thönes N, Müller M, Hassan SW, Razavi NM, Lössl E, Kaul HP, Lössl AG (2011b) Transplastomic expression of a modified human papillomavirus L1 protein leading to the assembly of capsomeres in tobacco: a step towards cost-effective second-generation vaccines. Transgenic Res 20: 271–282

Wang X, Su J, Sherman A, Rogers GL, Liao G, Hoffman BE, Leong KW, Terhorst C, Daniell H, Herzog RW (2015a) Plant-based oral tolerance to hemophilia therapy employs a complex immune regulatory response including LAP+CD4+ T cells. Blood 125: 2418–2427

Wang YP, Wei ZY, Zhong XF, Lin CJ, Cai YH, Ma J, Zhang YY, Liu YZ, Xing SC (2015b) Stable expression of basic fibroblast growth factor in chloroplasts of tobacco. Int J Mol Sci 17: E19

Ye GN, Hajdukiewicz PTJ, Broyles D, Rodriguez D, Xu CW, Nehra N, Staub JM (2001) Plastid-expressed 5-enolpyruvylshikimate-3-phosphate synthase genes provide high level glyphosate tolerance in tobacco. Plant J 25: 261–270

Yu CH, Dang Y, Zhou Z, Wu C, Zhao F, Sachs MS, Liu Y (2015) Codon usage influences the local rate of translation elongation to regulate co-translational protein folding. Mol Cell 59: 744–754

Yukawa M, Kuroda H, Sugiura M (2007) A new in vitro translation system for non-radioactive assay from tobacco chloroplasts: effect of pre-mRNA processing on translation in vitro. Plant J 49: 367–376

Zoschke R, Barkan A (2015) Genome-wide analysis of thylakoid-bound ribosomes in maize reveals principles of cotranslational targeting to the thylakoid membrane. Proc Natl Acad Sci USA 112: E1678–E1687

Zoschke R, Watkins KP, Barkan A (2013) A rapid ribosome profiling method elucidates chloroplast ribosome behavior in vivo. Plant Cell 25: 2265–2275

Zou Z, Eibl C, Koop HU (2003) The stem-loop region of the tobacco psbA 5′ UTR is an important determinant of mRNA stability and translation efficiency. Mol Genet Genomics 269: 340–349
