bmc systems biology biomed central - springer...structured (bigg) 'databases'. we employ...
TRANSCRIPT
BioMed CentralBMC Systems Biology
ss
Open AcceResearch articleInvestigating the metabolic capabilities of Mycobacterium tuberculosis H37Rv using the in silico strain iNJ661 and proposing alternative drug targetsNeema Jamshidi and Bernhard Ø Palsson*Address: Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093-0412, USA.
Email: Neema Jamshidi - [email protected]; Bernhard Ø Palsson* - [email protected]
* Corresponding author
AbstractBackground: Mycobacterium tuberculosis continues to be a major pathogen in the third world,killing almost 2 million people a year by the most recent estimates. Even in industrialized countries,the emergence of multi-drug resistant (MDR) strains of tuberculosis hails the need to developadditional medications for treatment. Many of the drugs used for treatment of tuberculosis targetmetabolic enzymes. Genome-scale models can be used for analysis, discovery, and as hypothesisgenerating tools, which will hopefully assist the rational drug development process. These modelsneed to be able to assimilate data from large datasets and analyze them.
Results: We completed a bottom up reconstruction of the metabolic network of Mycobacteriumtuberculosis H37Rv. This functional in silico bacterium, iNJ661, contains 661 genes and 939 reactionsand can produce many of the complex compounds characteristic to tuberculosis, such as mycolicacids and mycocerosates. We grew this bacterium in silico on various media, analyzed the model inthe context of multiple high-throughput data sets, and finally we analyzed the network in an'unbiased' manner by calculating the Hard Coupled Reaction (HCR) sets, groups of reactions thatare forced to operate in unison due to mass conservation and connectivity constraints.
Conclusion: Although we observed growth rates comparable to experimental observations(doubling times ranging from about 12 to 24 hours) in different media, comparisons of geneessentiality with experimental data were less encouraging (generally about 55%). The reasons forthe often conflicting results were multi-fold, including gene expression variability under differentconditions and lack of complete biological knowledge. Some of the inconsistencies between in vitroand in silico or in vivo and in silico results highlight specific loci that are worth further experimentalinvestigations. Finally, by considering the HCR sets in the context of known drug targets fortuberculosis treatment we proposed new alternative, but equivalent drug targets.
BackgroundTuberculosis continues to be a devastating pathogenthroughout the world, particularly in developing nations.In 2001, the World Health Organization (WHO) esti-
mated 8.5 million new cases of tuberculosis (based on 3.8million new reported cases) and an estimated 1.8 milliondeaths from tuberculosis in 2000 [1]. Within the UnitedStates, the number of reported cases of tuberculosis has
Published: 8 June 2007
BMC Systems Biology 2007, 1:26 doi:10.1186/1752-0509-1-26
Received: 6 March 2007Accepted: 8 June 2007
This article is available from: http://www.biomedcentral.com/1752-0509/1/26
© 2007 Jamshidi and Palsson; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Page 1 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
been decreasing with the exception of a period when thetrend reversed in 1986 and peaked in 1992 [2,3]. Thisreversal has been attributed principally to HIV/AIDs,immigration from countries with high prevalence oftuberculosis, poverty, homelessness, and multi-drugresistant (MDR) tuberculosis [1,2]. MDR tuberculosis isgenerally defined as strains that are resistant to treatmentwith isoniazid and rifampin [4], two of the key first lineantituberculosis drugs [1]. MDR strains of tuberculosisemerged in the early 1990s and have now been found allover the world [4].
Many of the unique properties of tuberculosis are attribut-able to its metabolism, particularly the complex fatty acidscharacteristics of the organism. These mycolic acids, phe-nolic glycolipids, and mycoceric acids confer many of theproperties such as its acid-fastness and are believed to con-tribute to the resilience of the organism. Mycobacteriumtuberculosis can survive in a wide range of environments(many different tissues) and fairly extreme pHs [5]. Oneof the most confounding factors with these bacteria istheir ability to survive for long periods of time in a dor-mant stage. The slow doubling time of tuberculosis hasfurther limited the amount of experimental data that canbe generated. Many of the first and second line drugs usedto treat tuberculosis have metabolic targets, so developingsystems level models of metabolism are anticipated to beof great use in the future.
DNA sequencing of the ~4.4 Mbp genome of Mycobacte-rium tuberculosis H37Rv (M. tb) in 1998 [6] enabled theability the pursuit of genome-scale analyses of this micro-organism. The remarkable relevance to world health anddisease control and the need to understand the metabolicfunction of the organism all evoke the need of a genome-scale metabolic model. Long-term anticipated goals andapplications of such models are to understand the growthof mycobacteria under different conditions, identifyingstrategies to improve growth in vitro (for experimental anddiagnostic purposes), and identifying new drug targets fortreatment.
In order to gain understanding about the unique charac-teristics of this important pathogen, we manually recon-structed the metabolic network of M. tb in silico (iNJ661),from which we developed a model to compute performcomputational analyses and interpret experimental data.These bottom-up reconstructions have been described inthe past as, biochemically, genetically and genomicallystructured (BiGG) 'databases'. We employ constraint-based reconstruction and analysis (COBRA) of this BiGGreconstruction to learn about its normal metabolic func-tion and to infer new potential targets for drugs.
Results and discussionThe reconstruction process has been described previously[7,8], Figure 1 summarizes this process in brief. The net-work statistics for iNJ661, which has 661 genes and 939intra-system reactions, are summarized in Table 1. A bio-mass objective function was defined using available meas-urements of M. tb H37Rv and other mycobacteria strainsif information was lacking. The biomass objective func-tion was defined using the literature for chemical compo-sition studies of M. tb [9-13]. When such information wasnot found for the specific strain, Mycobacterium bovis wasused (for example to approximate the biomass composi-tion of nucleotides and peptidoglycans [14]). The simplefatty acids compositions were based on studies of themycobacterial cell wall [15]. The biomass functionincludes 90 metabolites in addition growth associatedmaintenance ATP costs (see Table 2). [See Additional file1] for more detailed information.
M. tb growth in silicoFlux Balance Analysis (FBA) was used to grow iNJ661 insilico maximizing the biomass function defined in Table 2(see Materials and Methods). iNJ661 was grown in silicoon three different types of media, Middlebrook 7H9 (sup-plemented with glucose and glycerol) [16], Youmans[17], and the chemically defined rich culture media(CAMR) [18]. Uptake fluxes which define the differentmedia conditions are shown in Table 3 and the computedresults under the different conditions are summarized inTable 4. The doubling times are within the rangedescribed in the literature [18,19], and as expected, thegrowth rates are higher with the richer media. The mini-mum doubling time of 14.7/hr described by [18] arewithin the capabilities of maximum growth of iNJ661.
As noted by Primm et al [20], M. tb growth on Middle-brook 7H9 media requires supplementation with glucoseand glycerol. Glycerol is a component of the biomassfunction (Table 2) according to the chemical compositionmeasurements performed by [14] on Mycobacteriumbovis. So in order to grow in silico, glycerol will eitherneed to be in the medium or the cell must have some wayof producing it from other precursors. iNJ661 was able togrow when glucose uptake was abrogated, however, italways required glycerol supplementation in the mediawhich could be due to the biomass requirement or due tothe need for glycerol for the production of a differentessential metabolite. The growth dependence on glyceroland glucose uptake can be seen in the phase plane dia-gram depicted in Figure 2a. In this figure, the biomassobjective function is optimized while the glucose andglycerol uptake fluxes are varied in different combina-tions. The resulting plot shows how the growth capabilityof iNJ661 varies as a function of glucose and glyceroluptake. The solid black lines depict isoclines, along which
Page 2 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
growth is constant [21]. Further investigations were car-ried out in order to understand the essentiality of glyceroland non-essentiality of glucose in iNJ661. Since flux dis-tributions calculated by FBA are non-unique, flux variabil-ity analysis (see Methods) can be used to calculate thespan of each flux while still achieving the maximum valuefor the objective function. Reactions with non-zero fluxes,in which the maximum flux equals the minimum flux(and is non-zero), are essential in achieving the particularobjective (no alternative pathways exist). Flux variability[22] at maximum growth, with Middlebrook media withglucose and glycerol supplementation (Table 2) was per-formed in order to identify the constraining flux(es) thatprevents growth on glucose but not glycerol. The onlynon-zero fluxes with no flexibility, directly involving themetabolism of glycerol, were the glycerol transporter(GLYCt) and glycerol kinase (Rv3696c, GlpK, GLYK). Theproduction of glycerol-3-phosphate via glycerol kinase isessential for membrane and fatty acid metabolism involv-ing fatty-acyl glycerol phosphates and interconversionsbetween CDP-diacyl-glycerol and phosphatidyl-glycerolphosphates. These results are consistent and expectedgiven the importance of membrane metabolism in therole of biomass production.
The non-zero fluxes with no flexibility directly involvingglucose were the glucose transporter (GLCabc, SugABC,Rv1236+Rv1237+Rv1238) and glucokinase (PPGKr,PpgK, Rv2702). Glucose-6-phosphate is subsequentlyessential for a number of other reactions involving variousaspects of membrane metabolism, including the synthesisof trehalose phosophate (TRE6PS, OtsA, Rv3490) andinositol-phosphate (MI1PS, Ino1, Rv0046c), in additionto phosphoglucomutase (PGMT, PgmA, Rv3068c). Thesethree reactions are required not only for optimal growth,but they are in fact essential for any growth at all in silico.Only one of these, however, has been suggested to beessential experimentally (Rv3490) [23]. Growing iNJ661in Middlebrook media (with glycerol but without glucose
supplementation) severely retards the growth rate, how-ever it is still viable in silico in contrast to M. tb growth invitro. The discrepancy between in vitro essentiality of glu-cose, but non-essentiality of reactions requiring glucosederivatives and in silico non-essentiality of glucose suggestthat there may be non-metabolic functions related to theneed for glucose, in addition to alternative metabolicpathways for the production and/or use of glucose-6-phosphate.
Figure 2b characterizes the growth dependence on theuptake rates of inorganic phosphate and sulfate. Limitinguptake of either of these molecules will inhibit growth insilico. Although the uptake fluxes are small compared tothe uptake of the carbon sources and oxygen, both arerequired for growth of M. tb. Consequently, the sulfateand phosphate transporters may serve as bacteriostaticand possibly bacteriocidal drug targets. A robustness anal-ysis (see Methods) was carried for biomass production byiNJ661 for phosphate and sulfate [see Additional file 7].This was compared to a robustness analysis performed forthe in silico E. coli strain iJR904 [24] on aerobic glucoseconditions [see Additional file 7]. The sensitivity of therespective biomass production rates to the uptake fluxesare given by the slopes of the plots for each organism. Thesensitivity to the phosphate uptake rate for iNJ661 andiJR904 are both approximately 1. However, the slope forsulfate uptake in iNJ661 is approximately four times thatof iJR904 (about 16 for iNJ661 compared to about 4 foriJR904). This difference reflects the relatively largeramount of sulfur as a percent of total biomass in M. tbcompared to E. coli, which is consistent with knowledge ofM. tb's composition [25]. Collectively, the phase planesand robustness plots suggest that M. tb may be much moresensitive to sulfate depletion than E. coli.
A remarkable property of M. tb is its ability to grow in cul-ture with carbon monoxide (CO) serving as the sole car-bon source [26,27]. Unlike other mycobacteria that cangrow with CO as the sole carbon source, M. tb lacks ribu-lose-bisphosphate carboxylase (RuBisCO), an enzymefound in plants enabling fixation of carbon dioxide.Although it was argued that a reductive citric acid cyclemay enable the fixation of carbon dioxide into by M. tb[27,28], experimentally it was observed that citrate didnot affect M. tb growth on CO [27]. We did not expectiNJ661 to grow on CO, since the only experimentallycharacterized CO associated metabolism in M. tb was COdehydrogenase (CODH). The uptake of glycerol versusthe uptake of CO while optimizing for growth can be seenin Figure 3. CO uptake affects the biomass optimizationonly during minimal uptake of glycerol, over a slightrange. The contribution being made in this region is thedonation of protons (through the oxidation of CO toCO2). Looking further into this issue, we tested the
Table 1: The confidence level for each reaction is based on a scale from 0 to 4, 4 being the highest level of confidence (experimental biochemical evidence supporting the inclusion of a reaction) and 1 being the lowest level of confidence (inclusion of a reaction solely on modeling functionality). Sequence based annotations have a confidence level of 2.
Network properties of iNJ661
Genes 661Proteins 543Reactions (intra-system) 939Reactions (exchange) 88Gene Associated Reactions 77%Metabolites 828Average Confidence level 2.31
Page 3 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
hypothesis that RuBisCO would enable growth on CO(even though M. tb is not known to have it). We addedRuBisCO in addition ribose 1-phosphokinase and ribose1,5-bisphosphate isomerase (to complete the pathway)and optimized for growth on a modified Middlebrook7H9 media (glycerol and glucose were removed and COwas added). iNJ661 was unable to grow under these con-ditions, despite conferring the ability to fix CO2 withribose sugars. This observation in conjunction with Figure2a suggests that one possibility for how M. tb is able togrow using CO as the carbon source is that it somehow isable to fix CO into a glycolytic intermediate that does not
appear to occur through a RuBisCO pathway or a reduc-tive citrate cycle.
A context for contentThis BiGG reconstruction can be used as a 'context forcontent' in analyzing large biological datasets. Multipleinvestigators have employed high-throughput data analy-sis methods to characterize M. tb in different environmen-tal conditions, Sassetti et al [23] used transposon sitehybridization (TraSH) to identify the genes needed foroptimal growth of M. tb in vitro (this will be referred to asthe OptGro dataset throughout the rest of the text), Gao et
The Gene Index for Mycobacterium tuberculosis H37Rv was downloaded from The Institute for Genomic Research (TIGR) [45]Figure 1The Gene Index for Mycobacterium tuberculosis H37Rv was downloaded from The Institute for Genomic Research (TIGR) [45]. Reconstruction content was defined based on the sequence annotation, legacy data, the Tuberculist database [31], ancil-lary sources such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), and SEED [47]. Reactions were defined and according to the Tuberculist web and KEGG. Legacy data was also used in the process of building the model from the recon-struction. The manual curation aspects of the reconstruction are outlined above and discussed in further detail in [42]. Debug-ging was started once the first draft of the reconstruction was completed and functional testing (i.e. flux balance analysis calculations, etc.) were begun.
Reconstruction and Model Building of iNJ661 from M. tb H37Rv
iNJ661 Reconstruction
iNJ661 Model
Manual Curation Steps
Evidence for gene
Identify catalytic protein complex/subunits
Reaction definition: primary metabolite conversion
Reaction definition: secondary metabolites/cofactors
Metabolites: formula and charge determination
Reaction definition: catalytic mechanism
Reaction definition: compartmentation
Confirm reaction mass conservation
Confirm reaction charge conservation
Manual Curation Resources
Primary Literature
TextbooksInternet Resource Databases: Tuberculist, KEGG, SEED
Debugging
Identify gaps and carry out directed manual curationEliminate `free energy’ loops
Convert to ModelDefine system boundaries and uptake constraintsTest anabolic and catabolic capabilities
TuberculistKEGG
The SEED
Genome Annotation (TIGR)
Page 4 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
Table 2: The list of biomass components for the initial biomass function. [See Additional file 1] for the additional components added for the extended biomass function and [see Additional file 5] for the explicit names and molecular formulas.
Biomass composition for growth of iNJ661
mmol/gDW Abbreviation
0.40596 ala-L
0.02261 cys-L
0.12031 asp-L
0.09001 glu-L
0.04810 phe-L
0.33581 gly
0.04062 his-L
0.08773 ile-L
0.03909 lys-L
0.20471 leu-L amino acids
0.03489 met-L
0.04770 asn-L
0.13803 pro-L
0.05812 gln-L
0.12042 arg-L
0.14340 ser-L
0.13571 thr-L
0.20583 val-L
0.02012 trp-L
0.03176 tyr-L
0.13903 amp
0.25095 cmp
0.24365 gmp
0.13160 ump nucleic acids
0.00349 damp
0.00666 dcmp
0.00666 dgmp
0.00365 dtmp
0.00797 tre
0.00118 tat
0.00125 pat
0.00114 sl1
0.00649 tre6p
0.00647 tres
0.16315 glc-D
0.09507 man sugars
0.02172 rib-D carbohydrates
0.05547 gal peptidoglycans
0.23018 acgam1p
0.00279 uamr
0.00101 uaaAgla
0.00101 uaaGgla
0.00099 uaagmda
0.00098 ugagmda
0.02536 glyc
Page 5 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
0.01885 peptido_TB1
0.01885 peptido_TB2
0.00073 arabinanagalfragund
0.00168 Ac1PIM1
0.00149 Ac1PIM2
0.00134 Ac1PIM3
0.00121 Ac1PIM4
0.00127 Ac2PIM2
0.00208 PIM1
0.00179 PIM2
0.00157 PIM3
0.00140 PIM4
0.00127 PIM5
0.00115 PIM6
0.00406 pe160
0.00586 clpn160190 fatty acids
0.01219 ttdca glycolipids
0.23515 hdca phospholipids
0.01094 hdcea
0.03484 ocdca
0.00985 ocdcea
0.09580 mocdca
0.01568 arach
0.00784 mbhn
0.05835 hexc
0.00724 pa160
0.00680 pa160190
0.00641 pa190190
0.00649 pg160
0.00613 pg160190
0.00581 pg190
0.01100 mycolate
0.00334 kmycolate
0.00329 mkmycolate
0.00337 mmmycolate
0.00111 tmha1 mycolic acids
0.00110 tmha2 phenolic glycolipids
0.00110 tmha3 mycocerosates
0.00110 tmha4 mycobactin
0.00645 phdca
0.00118 mfrrppdima
0.00140 mcbts
0.00150 fcmcbtt
0.00027 pdima
0.00026 ppdima
60.00000 atp
60.00000 adp (product) growth associated maintenance
60.00000 pi (product)
Table 2: The list of biomass components for the initial biomass function. [See Additional file 1] for the additional components added for the extended biomass function and [see Additional file 5] for the explicit names and molecular formulas. (Continued)
Page 6 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
al [29] identified a set of genes that are consistentlyexpressed under different growth conditions in liquid cul-ture (this will be referred to as the ConstExp datasetthroughout the rest of the text), and Sassetti in 2003 inves-tigated the set of genes needed for M. tb survival in vivoduring infection in mice [30] (will be referred to as theInfect dataset). The degree of overlap between the threedatasets can be seen in Figure 4A (only 2 genes are sharedby all three sets) highlight the variability of gene expres-sion across the different experimental datasets. Althoughsignificant differences between in vivo and in vitro patternsof gene expression may be expected, the OptGro and Con-stExp datasets were both done in vitro. The heterogeneityamong the datasets (all three experimental datasets shareonly two loci) suggests that defining an objective functionbased on in vitro data, may not necessarily translate to use-ful results in vivo (it may constrain the solution space fromthe wrong directions).
We evaluated these three datasets for M. tb and comparedthem to gene deletion analyses for iNJ661, in which eachgene was individually deleted and growth was optimized
on Middlebrook media. Any growth rate that fell short ofthe maximum wild type growth was considered 'sub-opti-mal'. Table 5a summarizes the 114 false negative (iNJ661gene non-essentiality, but in vitro gene essentiality; 50%)results between iNJ661 and the OptGro dataset (for thefull results [see Additional file 4]). The inconsistencieswere considered individually and classified into four cate-gories, Not in Objective Function (NOF), Alternate Route(AR), Alternative Locus (AL), and Not Essential (NE);these classifications are not mutually exclusive and weredefined in order to navigate and interpret with greaterease. A gene was placed in the NOF category if it was iden-tified to be in a biosynthetic pathway (based on networkconnectivity) for a particular metabolite, such as a vitaminor co-factor. Genes would be categorized as AR or AL ifthere was an alternative biochemical route or an alterna-tive gene annotation that could carry out that reaction iniNJ661, respectively, but not in M. tb. Potential AR or ALclassifications reflect one of the potential errors due tosequence based annotations, so further biochemical char-acterization of the gene products would help resolve thesecases. Genes that did not fall in these categories were clas-
Table 4: Summary of iNJ661 biomass production rates (in mmol/hr/g dwt) and doubling times (1/hr) in silico on different media. * from [19], ^ from [18].
iNJ661 Growth
Biomass Accumulation iNJ661 Doubling Time M. tb Doubling Time
Middlebrook 7H9 0.052 13.33Youman's 0.029 23.90 24*
CAMR 0.058 11.95 14.7^
Table 3: *Glucose and glycerol supplementation were added as described in [20].
in silico media composition
Youmans Middlebrook 7H9 CAMR
oxygen ammonium L-alanineL-asparagine calcium L-arginine
citrate chloride L-asparagineglycerol citrate L-aspartic acidwater copper L-glutamic acid
octadecanoate (Tween) ferric iron L-glycinephosphate L-glutamate L-isoleucine
sulfate magnesium L-leucinemagnesium oxygen L-serinepotassium phosphate L-phenylalanine
bicarbonate sodium pyruvate(ferric iron: minimal amount) sulfate glycerol
water octadecanoate (Tween)D-glucose*glycerol*
Page 7 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
sified as NE. Many of these loci were associated with fluxesin which the flux variability was uniformly zero (maxi-mum and minimum value).
Since quantitative data was not available for vitamins andcofactors, they were not included in the original defini-tion of the objective function (Table 2). However, after thedeletion analysis, the majority of the false negative predic-tions in the NE category by iNJ661 were reactions associ-ated with the biosynthesis of vitamins and cofactors, suchas dihydrofolate and tetrahydrofolate. 16 such metabo-lites were added [see Additional file 1] to the biomassfunction with coefficients of 0.000001 (small enough tohave negligible effects on quantitative growth, but stillrequired for growth). The results for the genes required foroptimal growth using the initial biomass function and theexpanded compared to OptGrow are displayed in the firstVenn diagram of Figure 4B. Although only 16 metaboliteswere added, the deletion study resulted in 31 more geneloci that matched the OptGro data. This only slightlyincreased the deletion prediction from 54% to 56%. Thereason for the modest increase is that there were also 18additional false positive (iNJ661 gene essentiality, but invitro gene non-essentiality) results (from 46% to 44% forthe original and expanded objective functions, respec-tively).
Genes categorized as AL or AR suggest that there was anerror in the sequence based annotation, there are regula-tory control loops that we are not aware of, or there was afalse positive result from the experimental data (TraSH isoften used as a screening tool). Sassetti et al [23] dicussedone of these cases involving the Rv0505c and Rv3042cloci (serB and serB2 respectively). Both have beenassigned the same function based on annotation, howeverserB2 was found to be essential while serB was not. Theremay be many possibilities why serB cannot rescue muta-tions/deletions of serB2, however the detection of suchcases imply that having duplicate genes many not conferthe degree or robustness that is often assumed to accom-pany an organism. The cases with alternative loci high-light areas where further experimental investigation mayprovide insight into the growth capabilities of M. tb. TheAL cases are enumerated in Table 5b. These loci are worthfurther experimental investigation to confirm the annota-tion and to determine the different conditions which mayinduce or inhibit expression. The false negative predictionfor the sulfate transporter (Rv2397c, Rv2398c, Rv2399c,and Rv2400c) is an interesting case to consider, in partbecause it validates the observations in the phase planediscussed in Figure 2b. Additionally, it identifies an areain which a partially characterized annotation causes erro-neous results. The four aforementioned loci form a pro-tein complex in the model in order to catalyze the
a. Phase plane diagram of glycerol versus glucose uptake while optimizing for growth on Middlebrook mediaFigure 2a. Phase plane diagram of glycerol versus glucose uptake while optimizing for growth on Middlebrook media. iNJ661 can grow with glycerol serving as the sole carbon source, but not glucose. The open dots are the calculated phase points and the solid black lines indicate isoclines. b. Phase plane diagram for phosphate versus sulfate uptake on Middlebrook media. Although the transport fluxes are very small, they are both necessary for the organism to be able to grow.
Page 8 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
transport reaction. However, since there was an additionalannotation identifying Rv1739c as a sulfate transporter[31], an association was made between Rv1739c in addi-tion to the existing protein complex (Rv2397c + Rv2398c+ Rv2399c + Rv2400c). So, although the sulfate trans-porter is essential for in silico growth of iNJ661, the lack ofa biochemically characterized gene protein relationshipcan result in false predictions.
The 103 false positive cases likely reflect incompleteknowledge about the network (alternative pathways exist)or an alternative objective function [see Additional file 4].Approximately 20 cases involve amino acid pathways,since amino acids would presumably be required for anykind of growth in lab conditions, these cases likely suggestalternative pathways or enzymes are present in M. tb thatare not in iNJ661. Almost half of these cases (46) involvefatty acid, membrane, or peptidoglycan metabolism, sincethe fatty acid composition in M. tb can change signifi-cantly over time in vitro [9], many of these false predic-
tions are likely to be due to changes in the biomasscomposition (in silico simulations have fixed biomasscompositions). These cases highlight areas in whichexperimental studies documenting the changes in compo-sition and growth rates may yield interesting results. Fiveof the cases involve loci associated with the transport ofphosphate, ammonium, and a sugar transporter. Thesecases likely involve as of yet unidentified alternative trans-porters in M. tb. Another case involves glycerol kinase(Rv3696c, GlpK, GLYK) which was one of the fluxes withno flexibility (see M. tb growth in silico), since M. tb invitro requires glycerol supplementation for growth [20].However, since glycerol kinase does not appear to berequired even for sub-optimal growth as implied by theOptGro dataset, there must be other reactions involvingthe direct metabolism of glycerol that have not yet beenidentified for M. tb. The ability to grow using CO as thesole carbon source and maintain optimal growth evenwithout glycerol kinase activity imply unique and uniden-tified aspects to central metabolism in the M. tb network.
Biomass optimization on Middlebrook media without glucoseFigure 3Biomass optimization on Middlebrook media without glucose. Carbon monoxide uptake can affect the growth at very low glyc-erol uptake rates via Carbon Monoxide Dehydrogenase, which generate protons through the oxidation of carbon dioxide.
Page 9 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
Further experimental investigations may provide cluestowards controlling the growth of M. tb (increasing thegrowth rate in culture and inhibiting growth in vivo).
Out of the 194 genes identified in the Infect dataset, only17 were shared with the OptGro dataset and 37 were
shared with the ConstExp dataset (Figure 4A). From thesethree sets of experimental datasets, OptGro and ConstExphad the greatest number of genes in common, 162. How-ever this is still only about 27% and 29% of the genes ineach dataset, respectively. The large variability between allof these datasets highlights the point that further develop-
Summary of experimental datasets and their overlap with iNJ661 and the iNJ661 gene deletion studiesFigure 4Summary of experimental datasets and their overlap with iNJ661 and the iNJ661 gene deletion studies. OptGro refers to the dataset described in [23], ConstExp refers to the dataset described in [29], and Infect describes the dataset in [30]. ^ denotes the subset of genes from the experimental dataset contained in iNJ661. iNJ661 represents the 661 genes in the reconstruction. iNJ661 optimal growth represents the subset of genes required for optimal growth with the original objective function. iNJ661 optimal growth* represents the subset of genes required for optimal growth with the objective function expanded to include vitamins and cofactors. iNJ661 alternative objective represents the set of genes required for optimal growth using the objective function constructed based on the ConstExp dataset. Panel A: The Venn diagram shows the overlap between the different experimental gene expression datasets. The accompanying chart summarizes the total number of genes in each dataset and how many of those genes are found in the iNJ661 reconstruction. Panel B: Series of Venn diagrams summarizing the results for the gene deletion studies carried out of iNJ661 compared to the experimental gene expression datasets.
iNJ661optimalgrowth
OptGro^
31031
31
18
83
123
iNJ661optimalgrowth*
iNJ661optimalgrowth 01901
10
39
55
36
iNJ661optimalgrowth*
ConstExp^
iNJ661optimalgrowth 42142
3
46
20
12
iNJ661optimalgrowth*
Infect^
OptGro ConstExp
Infect
553515
437 363
142
2
160
A
B
OptGro ConstExp InfectGenes in Data Set 614 560 194Genes Shared with iNJ661 237 101 103
Table 5a: Classification of False Negative results by subdivision into four overlapping categories for the gene deletion study in iNJ661 compared to the OptGro dataset. Each row and column lists the number of those genes in the respectively classes. NOF: Not in Objective Function, AR: Alternate Route, AL: Alternative Locus, NE: Not essential.
Classification of False Negative results
AL AR NE NOF
AL 26 0 5 2AR 0 7 6 1NE 5 6 24 0
NOF 2 1 0 43
Page 10 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
ment of Flux Balance Analysis based approaches willrequire the definition alternative objective functions tobiomass optimization.
Functional assessment independent of objective functionsBiomass functions can help improve predictions underwell-defined conditions by constraining the steady statesolution space. However, different growth conditions canappreciably alter the composition and the 'objectives' ofbacteria, as reflect by Figure 4A and the discussion above.Significant changes in fatty acid composition have beenobserved even while growing M. tb in culture over thespan of weeks [12]. These changes would be reflected aschanges in the composition and the coefficients of the
biomass function, consequently altering the predictionsmade by the model. Not all COBRA methods require thedefinition of objective functions [32] and identifyinggroups of reactions that operate together can help simplifya network and provide insight into its functionality [33].Correlated reaction sets have been calculated using sam-pling [34] and flux coupling [35] and implications forclassifying diseases and identifying pathway specific drugtargets have been discussed [36]. Applying the same sys-tems view of metabolic networks to pathogens such as M.tb causes one to adopt the view that single enzyme drugtargets actually knock out complete pathways. As a result,terminating the activity of any other enzyme in that path-way should have the same effect. Similar to the correlated
Table 5b: Detailed listing of the AL results for the False Negative results for iNJ661. The first column lists a locus needed for optimal growth according to the OptGro dataset. The second column lists the reactions involved. The third column lists the possible alternative loci in iNJ661 (commas indicate isozymes, addition symbols indicate formation of protein complexes).
Alternative Loci in False Negative results
OptGro Reaction Other loci in iNJ661
Rv0337c ASPTA Rv2565Rv0462 PDHc, PDH, PDHa Rv0843, Rv2497c+Rv2496cRv0808 GLUPRT Rv1602Rv0824c DESAT18 Rv1094Rv1122 GND Rv1844cRv1600 HSTPT Rv3772, Rv2231cRv1602 GLUPRT Rv0808Rv1609 ANS Rv2859cRv2182c AGPAT160190, AGPAT160, AGPAT190 Rv2483cRv2215 PDH, AKGDb Rv2241, Rv0462, Rv0843Rv2231c HSTPT Rv1600, Rv3772Rv2397c SULabc, TSULabc Rv1739cRv2398c SULabc, TSULabc Rv1739cRv2399c SULabc, TSULabc Rv1739cRv2400c SULabc, TSULabc Rv1739cRv2682c DXPS Rv3379cRv2746c PGSA190, PGSA160, PGSA160190 Rv1822Rv2996c PGCD Rv0728cRv3003c ACLS Rv3509c+Rv3002c, Rv1820+Rv3002c,
Rv3509c+Rv3002c, Rv3002c+Rv3470cRv3042c PSP_L Rv0505cRv3257c PMANM, AIRCr Rv3275c+Rv3276cRv3275c AIRCr Rv3257c+Rv3276cRv3281 ACCOACr Rv3280+Rv0904cRv3441c PMANM, ACGAMPM Rv3308, Rv3257cRv3634c UDPG4E Rv0536, Rv0501Rv0112 GMAND Rv1511Rv2247 PPCOAC Rv2502c+Rv0973c, Rv3285+Rv0974c,
Rv0973c+Rv0974c,Rv0973c PPCOAC Rv2502c+Rv0973c, Rv3285+Rv0974c,
Rv0973c+Rv0974c, Rv2501c+Rv2247Rv3285 PPCOAC Rv2502c+Rv0973c, Rv0973c+Rv0974c,
Rv2501c+Rv2247Rv2967c PC Rv2976cRv3464 TDPGDH Rv3468c, Rv3784Rv2754c TMDS Rv2764cRv3713 ADCYRS Rv2848c
Page 11 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
reaction sets calculated by sampling or flux coupling, HCRsets are defined by mass balance constraints. The defini-tion of HCRs is stricter in the sense that it is based onmetabolites with one-to-one connectivity. However incontrast to the other approaches, it is independent of theexchange reactions with the environment and any require-ments for demand functions. Consequently, the sets canbe calculated directly from the stoichiometric matrix.There is a significant degree of overlap between HCRs andthe Enzyme Subsets (ES) described by Pfeiffer et al [37].Since ESs are not constrained by 1:1 connectivity, in prin-ciple they may include additional reactions. On the sametoken, when the same reaction is carried out by multipleco-factors and the reactions are reversible, which is not
uncommon in metabolic networks, ESs may add reactionswhich form loops (similar to Type III Extreme Pathways[38]). Additionally, intermediate steps in the calculationof HCRs allow the consideration of inclusion or removalof potentially reversible reactions.
147 HCR sets (1:1 only, additional reversible reactions,0:2/2:0, were not included, see Methods) were calculatedfor the network, the average size of each set was 2.93, withmedian and modes both equal to 2. The largest set, HCR40, consisted of 13 reactions involved in Cofactor Metab-olism and Porphyrin Metabolism. The summary of howthe HCRs are split among the 35 subsystems in the net-work are outlined in Table 6. Using the Gene-Protein-
Table 6: HCR sets mapped to the subsystems in the network along with the size of each subsystem.
HCR mapping to subsystems
Metabolic Subsystem HCR Sets in each Subsystem Number of HCRs
Alanine and Aspartate Metabolism 47,81,99,126 4Arginine and Proline Metabolism 126 1Biotin Metabolism 33 1Citric Acid Cycle 98 1Cofactor Metabolism 7,24,25,26,30,40,56,77,118,123,129 11Cysteine Metabolism 38 1Fatty Acid Metabolism 1,43,57,78,86,90,93,94,95,96,97,103,108,110,11
4,115,116,128,13219
Folate Metabolism 9,22,28,31,44 5Glutamate Metabolism 21,27,29,67,76 5Glycine Serine Threonine Metabolism 19,24,45,111,124 5Glycolysis 41,42 2Glyoxylate Metabolism 104 1Histidine Metabolism 10,70 2Lysine Metabolism 2,8 2Membrane Metabolism 23,29,34,35,36,43,46,61,64,65,66,74,79,97,103,
106,107,108,109,110,112,113,116,117,130,131,138
27
Methionine Metabolism 27,28,45,91,92 5Nucleotide Sugar Metabolism 55,59,63,69,75,83,140 7Other Amino Acid Metabolism 29,48,82,91,92,125,127 7Pantothenate and CoA Metabolism 12,67,68,81 4Pentose Phosphate Pathway 32,121 2Peptidoglycan Metabolism 43,133,134,136,137,100 6Phenylalanine Tyrosine Tryptophan Metabolism 10,11,20,85,105 5Polyprenyl Metabolism 16,60,62,77 4Porphyrin Metabolism 25,30,40,51,52,53 6Purine Metabolism 5,17,27,49,58,63,73,140 8Pyrimidine Metabolism 55,59,83,98 4Pyruvate Metabolism 82 1Redox Metabolism 39,50,71,72,76,80,100,102 8Riboflavin Metabolism 6,7,26,30,56 5Sugar Metabolism 15,42,74,75,87,89,99,119,120,135 10Thiamine Metabolism 15 1Transport 39,48,54,71,82,84,87,88,89,101,102,106,110,12
4,127,139,141,142,143,144,145,146,14723
Ubiquinone Metabolism 13,122,125 3Urea Cycle 27,37 2Valine Leucine Isoleucine Metabolism 3,4,14,18,84,88,139 7
Page 12 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
Reaction relationships which form the foundation of thereconstruction, we mapped the HCRs back to the gene loci(Figure 5). The complete HCR set mapped to 124 loci and61 HCRs in the OptGro dataset, to 46 loci and 34 HCRsin the ConstExp dataset, and 21 loci and 18 HCRs in theInfect dataset [see Additional file 2]. The OptGro andConstExp shared 8 HCR sets (HCR 2, 3, 11, 12, 43, 57, 69,and 122). The intersection of these datasets identifies asubset of reactions that are consistently expressed underdifferent conditions and also required for optimal growthin vitro. These HCR sets may reflect underlying core sets ofreactions vital to growth and survival in vitro.
Many of the tuberculosis treatment drugs have metabolictargets [1] and a use of this metabolic based reconstruc-tion can be to assist in identifying new and alternativechemotherapy targets for tuberculosis. Mdluli and Spigel-man recently discussed the drug targets in M. tb reasona-bly comprehensively [39]. We used the list (absent theDNA synthesis and Regulatory Protein targets) andmapped them to the 147 HCR sets as seen in Table 7 [seeAdditional file 3 for the full set]. The final column lists thenumber of different reactions in the HCR. Other reactionsthat are shared within a particular HCR which contains aknown drug target, have the potential to be alternative but
The stoichiometric matrix is created from the metabolic networkFigure 5The stoichiometric matrix is created from the metabolic network. The HCR (Hard Coupled Reaction) sets are calculated directly from the stoichiometric matrix and mapped back to the gene loci. Gene-Protein-Reaction relationships: top box is the gene, the next box is the peptide, the oval represents the functional protein, and the bottom boxes are the reactions catalyzed by the protein.
Calculating Hard Coupling Reaction Sets
and Mapping to Genes
Stoichiometric Matrix
Metabolic Network
Rv0211
pckA
PckA
PEPCK_re PPCKr
Gene-Protein-Reaction
Association
Transcript
Catalyzed Reaction
ProteinComplex
Hard Coupled
Reaction Sets
GeneLocus
Page 13 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
Page 14 of 20(page number not for citation purposes)
Table 7: Mapping the HCR sets to the drug targets described by Mdluli and Spigelman [39]. The first column includes the protein names (adopted from Mdluli and Spigelman), the second column lists the corresponding gene locus, the third is the HCR set number and the fourth lists how many reactions are in the HCR set. Only those drug targets mapped to an HCR are shown. [See Additional file 3] for the full set of HCR sets and the individual reactions they are comprised of; the HCR sets mapped to the above drug targets are highlighted in these data.
HCRs mapped to Mtb drug targets
Drug Target Gene Locus HCR HCR size
EmbA Rv3794 43 10EmbB Rv3795 43 10EmbC Rv3793 43 10AftA Rv3792 43 10decaprenyl phosphoryltransferase Rv3806c 62 3udp galatofuraosyltransferase Rv3808c 43 10dTDP-deoxy-hexulose reductase Rv3809c 43 10RmlA Rv0334 69 4RmlB Rv3464 69 4RmlC Rv3465 69 4RmbD Rv3266c 69 4InhA Rv1484 57 3MabA Rv1483 90 5KasA Rv2245 90 5KasB Rv2246 90 5MmaA4 Rv0642c 93 4Pks13 Rv3800c 86 4FadD32 Rv3801c 86 4AccD5 Rv3280 78 2LeuD Rv2987c 14 2TrpD Rv2192c 10 4DapB Rv2773c 2 3AroA Rv3227 20 3AroC Rv2540c 20 3AroE Rv2537c 11 4AroG Rv2178c 11 4AroQ Rv2537c 11 4ilvG (acetolactate synthase) Rv1820 3 3ilvX (acetolactate synthase) Rv3509c 3 3ilvN (acetolactate synthase) Rv3002c 3 3ilvB (acetolactate synthase) Rv3003c 3 3ilvB2 (acetolactate synthase) Rv3470c 3 3branchd chain aminotransferase Rv2210c 84 2dihyropteroate synthase Rv3608c 9 3dihyropteroate synthase Rv1207 9 3PanC Rv3602c 12 4riboflavin synthase Rv1416 56 3riboflavin synthase Rv1412 56 3sulfotransferase Rv1373 131 2MshA Rv0486 80 2MshB Rv1170 80 2MshC Rv2130c 50 2MshD Rv0819 50 2IspD Rv3582c 16 5IspE Rv1011 16 5IspF Rv3581c 16 5MenA Rv0534c 13 2MenB Rv0548c 122 3MenD Rv0555 125 2MenE Rv0542c 122 3
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
metabolically equivalent targets. The drug targets mappedto 25 of the HCR sets. These sets are depicted on a reducedmetabolic map of iNJ661 (Figure 6). An image of the HCRsets mapped to the entire metabolic network will be avail-able on our website [40]. These 25 HCR sets contain all ofthe 8 HCR sets identified by the overlap between the Opt-Gro and ConstExp data.
Last year Raman et al [41] constructed and analyzed amodel of the mycolic acid synthesis pathways and pro-posed additional targets in fatty acid metabolism.Accounting for the entire metabolic network furtherbuilds upon this by enabling a more rigorous evaluationof the global growth capabilities of the network and theidentification of potential drug targets. The genome-scalemodel presented here includes the extensive fatty acid
metabolism pathways of M. tb, in addition to rest of themetabolic network. With this full network, we thensought to extend the identification of potential drug tar-gets by taking advantage of the underlying principles ofreconstructing metabolic networks to calculate HCR sets.
Interpreting large, complex networks in a functionallymeaningful manner is enabled by adopting a hierarchicalview of the network, where one identifies groups of reac-tions that respond in unison to changes in media condi-tions. Groups of reactions strictly bound together basedon mass conservation and stoichiometry constraints canlead to the definition of hierarchical sets of reactions[33,34]. This principle was applied to causal SNP associ-ated diseases in the human mitochondria, and it wasfound that reactions in the same co-set had similar disease
A partial metabolic map of iNJ661 with the 25 drug target HCR setsFigure 6A partial metabolic map of iNJ661 with the 25 drug target HCR sets. Each reaction is numbered and color coded according to the HCR which it belongs to. The number of the HCR sets matches those in the Additional files. Only pathways or parts of pathways which included members of the 25 HCRs are depicted.
Folate MetabolismRiboflavin Metabolism
Cofactor Metabolism
Lysine Metabolism
Phenylalanine, Tyrosine, Tryptophan
Metabolism
Glutamate Metabolism
Redox Metabolism
Biotin Metabolism
Thiamine Metabolism
Pantothenate and CoA Metabolism
Fatty Acid Oxidation
Fatty Acid Synthesis
Ubiquinone Metabolism
Valine, Leucine, and Isoleucine
Metabolism
Alanine and Aspartate Metabolism
Mycolic Acid Synthesis
PDIM Synthesis
Trehalose Dimycolate Metabolism
Sulfolipid-1 Synthesis
From Peptidoglycan Metabolism
Peptidoglycan Metabolism
Arabinogalactan and Peptidoglycan Metabolism
Polyprenyl Metabolism
Fatty Acid Synthesis
Mycobactin Synthesis
Membrane Metabolism
h2o
f6p
co2 h2o
h
h
h
acgam1p
nad
coa
uacgam
nad
h2o
pmhexc
nadph
h2o2
gdp
h2o
mmcoa-S
h2o
uacgam
atp
h
h
h2o
nadph
atp
nadp
coa
prpp
accoa
for
pi
argsuc
pgp160190
h
h
h2o
ump
coa
h
ala-L
fadh2
h
adp
adp
amp
h2o
tarab-D
h
sl26da
amp
amp
ump
nadp
galfragund
ala-D
26dap-M
amet
adp
ser-L
tat
atp
pi
ctp
atp
nadh
nh4
h
nh4
h
amp
peptido_EC
arab-D
arachACP
h
nadp
mkmeroacidcyc1AM P
gdp
uacgam
2p4c2me
h
h
nh4
nadph
ugmd
gdp
for
h
nadph
nad
h
thmmp
gln-L
nadp
paps
h
h2o
nadp
pa190190
adp
nadph
indpyr
ctp
glucys
ppi
ppi
coa
h2o
h
nh4
uaAgla
nadp
pi
uacgam
uaaGgla
co2
meroacidcyc2AMP
octa
co2
glu-D
ppi
acser
decda_tb
amp
udcpp
stcoa
ahcys
o2
h
uaGgla
h
pat
4adcho
uama
mmmeroacidACP
h
co2
ACP
h2o
co2
nad
h
h2o
mshfaldox
atp
ppi
h2o
forcoa
2shchc
peptido_TB2
coa
4mhetz
nadph
co2
thmpp
mmcoa-S
coa
udp
h
tmha1
amp
glu-Lpi
ahcys
nad
nadh
arpp
adn
gthox
h
co2
atp
h
nadp
tdcoa
gdp
pi
atp
ppi
h
mmcoa-S
4abz
ppi
h2o
2ahhmd
h
amp
co2
4ahmmp
uAgla
h
thr-L
clpn160190
ttc
ppi
adp
harab-D
mkmycolate
apoACP
atp
h
fad
bmn
mshfald
Ac1PIM4
cys-L
ac
ahcys
udp
pi
pi
uacgam
mlthf
nadph
ACP
mycolate
uaaGgla
h
nadp
h2o
h
pi
h2o
o2
h2o
glu-L
ump
h2o
pant-R
3c2hmp
adp
prephthcoa
2ippm
odecoa
h
nadp
h2o
coa
atp
h2o
amp
h2o
gdp
Ac2PIM2
cdpc16c19g
h
h2o
pi
achms
coa
nh4
cdp
h
2ahbut
mmeroacidcyc2AM P
glu-L
hpmtria
h2o
h
pi
co2
h
decda_tb
2cpr5p
h
h
ad
ACP
h2o
coah2o
atp
h
co2
26dap-M
amp
cdpdhdecg
uacgam
peptido_TB1
h
h
uaAgla
h
mmcoa-S
nadp
ipdp
h
h
h2o
chor
coa
mcbts
co2
nadphcoa
mmmycolate
utp
h
mssg
dhlam
nadp
atp
nadh
no
25drapp
sl1
pi
25dhpp
h
nad
h2o
pi
h2o
h2o
h2o
ocdca
ctp
pg160190
ahcys
nadp
coa
uAgl
gdpfuc
ppi
tdm2
udcpp
h2o
ump
nadp
co2
nadp
malcoa
amp
fmettrna
udcpp
coa
occoa
coa
h
mi1p-D
h2o
h
coa
co2
Ac1PIM3
atp
h2o
uamag
nad
uamr
23dhdp
PIM3
mmcoa-S
adp
h
h
ahcys
h2o
h
h
3ig3p
fad
amp
ppi
mmcoa-S
ragund
arg-L
h2o
h
atp
decda_tb
cmp
h
atp
h
nad
h
co2
unaga
ssaltpp
pi
h
tamhexc
h2o
h
h
nadph
cyst-L
mharab-D
h
h
mstrACP
alaala
succ
grdp
coa
h
hexc
gam1p
akg
pep
akg
gtp
h
h
malACP
adp
h
h
adp
ala-L
h2o
co2
pi
mkmeroacidACP
Ac1PIM1
succ
succ
atp
mi1p-D
tdm3
nh4
h
h
ctp
uGgl
h
ipdp
h
h
co2
cys-L
mmmeroacidcyc1AMP
h
2ahhmp
nadph
decdman1p_tb
galfragund
nh4
h
h2o
cys-L
cmp
h
h
phe-L
ACP
atp
amet
pyr
coa
accoa
h
nh4
h2o
co2
2obut
chor
pi
ala-B
nh4
nadp
glu-L
hexccoa
o2
accoa
adp
uaaAgla
gdpmann
for
ahcys
atp
gdpmann
nadp
nadph
h2o
h2o
mmcoa-S
arabinanagalfragund
h2o
h
gdpmann
pi
h2
pmtcoa
nadph
coa
nadp
h
amp
thf
adp
co2
co2
ctp
3mob
co2
h2o
uacgam
nh4
udpgalfur
coa
lys-L
h
nadph
mfrrppdima
4abut
ipdp
4mop
gln-L
cmp
nadp
ragund
nadph
nadph
h2o
h
mmcoa-S
tdm2
gdpmann
coa
h
fad
2dhp
h
pa160
h2o
nadp
ppi
mmcoa-S
uamag
co2
h
co2
co2
pdima
coa
adp
2mecdp
gcald
h
pep
PIM2
co2
mmeroacidcyc1AC P
nadh
h
prephthACP
co2
atp
fadh2
h
adp
h2o
gthrd
h
ppi
nadph
atp
succoa
pyr
mmcoa-S
udcpdp
atp
h
pi
co2
hdca
coa
amet
h[e]
decdp
amet
succoa
nadph
mmcoa-S
co2
ahcys
atp
coa
dpcoa
atp
coa
nadph
h
malcoa
h
h
cdpc19c19g
gam1p
atp
atp
h2o
ppi
ppi
h2o
5apru
coa
h
o2
ppi
gluala
nadp
2shchc
grdp
co2
pi
nadph
adp
prepphth
nadph
h
h
dxyl5p
nadph
adp
h
asn-L
air
citr-L
malcoa
h2o
h2o
nadph amet
uacgam
cys-L
ipdp
pa190190
ser-L
co2
h
uagmda
h
amet
h
coa
atp
nadp
gdp
mmycolate
nadp
thf
malACP
h2o
nadph
tdm1
h
h
nadph
pi
nadp
h
nadh
nadh
malACP
h
suchms
akg
ACP
h
h2o
mql8
h
nadph
amp
pap
chexccoa
h
h2o
gdp
ser-L
h2o
ACP
ump
cyst-L
pyr
nadp
octdp
atp
nad
malACP
pe160
o2
atp
h
ugma
ACP
mmcoa-S
tamocta
h
h[e]
nadph
mkmeroacidcyc1ACP
amet
h
ACP
tre
26dap-M
kmeroacidcyc1ACP
pep
adp
nadph
h2o
nadph
coa
h
ppi
adp
nadph
coa
coa
nadph
atp
h
atp
gcald
atp
h2o
udcpp
h2o
h2o
co2
co2
alac-S
coa
ACP
h
h
uaagmda
co2
glu-L
h2o
ACP
h
succ
co2
h2o
h
udcpp
h2o
uacgam
hh
coa
akg
gdp
uGgl
nadph
ugmda
kmeroacidACP
26dap-LL
3c4mop
uama
udp
nadh
pi
h2o
h2o
h
glyc
5mthf
h2o
rmyc
gdpmann
q8h2
nad
adp
udcpp
nadph
h
coa
h
ser-D
thm
nadp
hpmdtria
fadh2
nadp
pnto-R
prpp
h
meroacidACP
h
co2
atp
ala-D
coa
atp
h2o
akg
o2
decd_tb
ACP
glucys
nadp
nadp
h2o
h
h2o
coah2o
h
adp
uaaGgtla
malcoa
atp
h
Ac1PIM3
h2o
nadp
g3p
cmp
23dhmp
adp
omtta
sucsal
h2o
h
h2o
uagmdaugggmda
nadp
glu-L
adp
glyc3p
atp
h
pi
dtdp
atp
h2o
4r5au
palmACP
amp
mkmycolate
5aprbu
thr-L
lys-L
adp
amp
h
nadph
udpgalfur
amet
ppi
nadph
malcoa
nadh
ipdp
o2
iasp
h
adp
nadp
h
pi
meroacidcyc2ACP
atp
hdcoa
udp
nad
co2
h
pan4p
amp
uGgla
dtdp
h
glu-L
nadp
triat
ala-L
dmarach
gln-L
pi
h2s
adp
h
pi
nadph
ppi
inost
meoh
pi
hdca
h
h2o
uAgl
pad
mmcoa-S
PIM6
ugmag
h
h
glu-L
pi
h2o
bhb
amp
frdp
mcbtt
co2
akg
adp
pyr
h2o
mmcoa-S
uaccg
coa
db4p
nadph
h
mmcoa-S
3mob
omtta
h2o nad
ACP
pi
tmlgnc
mycolate
g6p
nadp
coa
bmnmsh
cys-L
coa
coa
pyr
arabinan
chexccoa
co2
atp
h
pi
co2
pi
ppi
ocdcea
nadh
ser-L
3psme
ppi
kmycolate
udp
amp
2mahmp
h
amp
h
h
tdm4
tdm3
frdp
nad
ppi
hmtria
hom-L
ppi
co2
ppi
coa
h
h2o2
h2o
ppi
h2o
g3p
h2o
pi
coa
3mop
gdpmann
amp
h
fad
h2o
coa
h
amet
nadph
ACP
ahcys
kmycolate
omdtria
co2
pi
h
amp
h
gdp
atp
fmn
nadph
h2o
ppi
succ
hmocta
for
accoa
h
atp
h2o
6hmhpt
h
hom-L
ahcys
h2o
pmtcoa
h
atp
agalfragund
h2o
h2o
2dmmq8
coa
3dhsk
nadh
occoa
ACP
tres
glu-L
amet
co2
Ac1PIM2
tmhexc
co2
mmcoa-S
h
h
nadp
akg
h
malcoa
ahcys
coa
h2o
h
h
h2o
nadph
glu-D
lys-L
msh
h
iacgam
ptd1ino160
dhnpt
sucsal
nadh
tdm2
h
atp
co2
amet
dca
arabinan
mmmycolate
atp
ppi
h2o
h2o
4ahmmp
uamag
h
ribflv
ugagmda
coa
h2o
h
coa
phpyr
thfglu
atp
coa
nad
nadh
decd_tb
igam
2dda7p
h
dhpmp
gly
amp
nad
ptd1ino160190
tmha2
co2
10fthf
cys-L
gluala
arach
atp
tdm1gdpmann
cmp
pran
h
h2o
pi
inost
gly
nh4
dmbhn
pi
h2o
nadp
adp
g3pe
malcoa
ile-L
ssaltpp
4c2me
2dmmq6
ACP
nadp
ppi
pa160190
arachcoa
cmp
glyc3p
atp
pi
h2o
amp
glu-L
h
g6p
nadh
ppi
nad atp
34hpp
pi
fadh2
unaga
h2o
nad
ppi
h2o
h
h
h2o
kmeroacidcyc2ACP
h
accoa
h
h2o
nadph
4hbz
3c3hmp
h
h2o
nh4
tyr-L
h
h
decdpa_tb
gdpmann
h2o
ppi
co2
ppi
coa
phdcaACP
h
23dhmb
decdp_tb
ACP
nad
kmeroacidcyc2AMP
h
co2
mycolate
mi4p-D
amet
nadph
coa
tmbhn
nh4
nadh
h2o
h2o
dmlgnc
tmha6
ipdp
ctp
pi
coa
h
h2o
h2o
ahcys
nadp
fol
atp
uaGgla
anth
h2o
amet
atp
gtp
h
nadph
h
coa
atp
o2
4pasp
2dmmql8
mettrna
phthdl
atp
h
nadp
adp
cmp
h
o2coa
2agpe160
salc
atp
nh4
h2o
amp
udcpp
tdm1
atp
coa
pi
ppi
26dap-M
asp-L
h
nadph
PIM2
rppdima
chor
nadph
atp
hdca
pi
co2
h
udp
nad
pi
agalfragund
succ
h2o
mycolate
h
co2
h
h2o
pmocta
decda_tb
phthcl
adp
h
mmmeroacidcyc1AC P
h
h2o
h
adp
o2
h2o
o2
alpam
coa
co2
co2
udcpp
adp
nadph
h2o
pyr
ppi
atp
h
6hmhpt
cmcbtt
q8
ppi
pi
udcpdp
amet
nadp
nadp
h2o
mmcoa-S
ttdca
coa
h2o
dtdp
pi
udp
nadp
h
4hbzACP
gthrd
decdr_tb
2me4p
adp
h
adp
adp
udp
dmpp
ppi
fum
tarab-D
h2o
h2o
sl2a6o
nadh
gdp
nadh
h
h2o
ugmd
rmyc
h
akg
h2o
udcpp
nadp
nadph
h
h
atp
ump
h
ACP
glu-L
h2o
pyr
h2o
amet
glyc3p
gdpmann
cgly
coa
h2o
h
udp
nadph
nadp
prepphthACP
udpg
dtdp
6hmhptpp
ppi
mmcoa-S
h
nadh
h2o
ppi
pphthdl
ppi
mqn8
udcpdp
h
PIM4
ile-L[e]
lys-L
coa
h
adp
h
h
coa
gcald
uGgla
for
aspsa
h2mb4p
uggmda
uamr
nadh
h2o
amp
h
acac
h2o
coa
nadph
malcoa
h
co2
malcoa
nadh
coa
pac
h
h
mlthf
pphn
4ppan
ru5p-D
h
nadhpi
g3p
h2o
atp
h
co2
h2o
h
nad
pyr
nad
amp
val-L
co2
h2o
amp
ppi
h
nad
h
mlthf
PIM5
ppi
nadh
mharab-D
uamag
nadp
h
bhb
adp
glu-D
ahcys
pi
h
h2o
co2
utp
m1ocdca
PIM1
h2o
nadp
atp
co2
nadp
ile-L
akg
nad
h
amp
h
co2
akg
ppcoa
cdp
h
octdp
amp
glu-L
ichor
coa
h2o
atp
mmcoa-S
tmlgnc
atp
h
pg190
thmpp
nadph
ahcys
nadph
nadp
coa
for
nadh
ppi
ppi
adp
co2
h
malcoa
ppi
udp
h
glu-L
ocdca
pi
homtta
h
h
mmcoa-S
atp
malcoa
thf
h
o2s
h2o
ACP
h
ACP
adp
nadp
h
decdpr_tb
chexccoa
4mpetz
nadph
coa
dmpp
cys-L
coa
nadh
ppdima
methf
4ampm
atp
malcoa
uggmd
octdp
h
dtdprmn
pgp190
h2o
nadp
2mecdp
no3
mppp9
nadp
atp
h
glu-L
atp
nadp
h2o
leu-L
h
hdcea
ppi
coa
pi
nadp
mmcoa-R
coa ACP
h2o
h
glu-L
h
h
h
h2o
accoa
h
h2o
h
udcpp
mmcoa-S
glu-L
tmha5
h2o
mmcoa-S
h2o
h
glu-L
h2o
mmycolate
fad
h
peamn
decd_tb
pi
hexc
udp
pmtcoa
o2
amp
nadp
tamlgnc
h
coa
coa
nadp
malcoa
nadph
arab-D
coa
nadphh
dmlz
gdp
mbhn
ppi
h2o
mmeroacidACP
mmeroacidcyc2AC P
adp
h2o
h2o
lys-L
coa
h
h2o2
5fthf
fald
rrppdima
ahcys
uaagmda
5oxpro
h2o
uaaAgtla
h2o
nadph
nadph
tmha4
akg
co2
h2o
h
atp
pi
h
ppi
h
malcoa
coa
phthclh2
hdca
co2
nadp
phthiocerol
cmp
dtdprmn
dhf
co2
coa
h2mb4p
amp
h
h2o
adp
amp
h
coa
odecoa
lys-L
h
nadph
acgam
mqn6
udcpdp
pacald
nh4oh
alaala
co2
1msg3p
h2o
pi
indole
uAgla
amp
h2o
mi3p-D
marach
udpgal
h
o2
chexccoa
nadph
pi
co2
amp
co2
nadph
pi
pi
adp
h2o
nadp
hdca
adp
4ppcys
gam6p
akg
ump
nh4
for
pi
h
co2
mycolate
ugmr
gly
nadp
amp
malcoa
ipdp
nadph
skm5p
ichor
atp
h
frrppdima
h[e]
mndn
h2o
trp-L
mmcoa-S
h
ACP
coa
h
ppi
pap
gthox
gln-L
tmha3
h2o
uaagtmda
sucbz
nadph
nadph
h
sbzcoa
h2o
thdp
h2o
h2o
atp
h
ahdt
h
bhb
nadph
2obut
amp
h
ppi
nad
h2o
h2o
amet
pi
h2o
co2
harab-D
h
h
pi
nadph
nadp
ppi
nadp
h
h
meroacidcyc1ACP
o2
dtdprmn
h
h[e]
hpmtria
coa
nadh
atp
h
h
h
atp
atp
skm
aacoa
cmp
3dhq
h2o
nadp
h
ppi
pphthiocerol
adp
h2o
h
hdcoa
atp
dhna
mocdca
odecoa
phom
h
ac
ala-L
uaaAgla
ugmda
h
h2o
ACP
pi
nadp
h
h2o
tre6p
nadph
gdpmann
dtdprmn
co2
nadp
akg
pi
accoa
ddca
h
prephth
nh4
nh4
ump
co2
h
h
h
h
nh4
h
nadp
cigam
fadh2
nadp
glu-L
hexc
h
gdp
fad
succoa
ahcys
decdp
coa
asp-L
ump
ppi
pmtcoa
ahcys
ile-L[e]
e4p
dhpt
FASm2201
PMPK
TRESULT131
PPND
DHFS
AMID2
ALAD_L
DHAD2
FASm280
TMHAS4
FASm2801
PNTK12
GTMLT
GLNSP2
OXGDC2125
FACOAL80
TRIATS
UGLDDS1
SHSL4r
DHFR
HMPK2
GLUDx
DCPDP
MEPCT16
FAS14057
FOLR2
FASm300
MNDNS2
HPPK9
FAS161
KARA1i3
G16MTM4
DPR12
MYCFLUX5
FASm2602
IPPS
SSALy
FACOAE180
DASYN190190
PPNCL2
G16MTM7
PPNCL
MMCD
CDAPPA160190
PAPPT1
HSERTA
OCTDPS
TRE6PS
PHTHCLR1
GGLUCT2
SERAT
MYC1CYC4
GTHOr
LEUTA
NH4OHDs
AACPS3
ILEt2r84
MYCON4
CDPPT160190
FCOAH2
MTHFR2
ALDD19x
MYCON5
UAMRH
CYTBD
UAAGDS
RBFSb56
AFTA43
FASm2001
PREPPACPHr
MYCSacp5090
FTHFCL
UAGCVT
DESAT18
PHTHSOCOAT1r
DHPS2
PREPHACPH
PREPPACPH
DB4PS56
PRDX
HSDx
2AGPEAT160
FASm1601
FASm220
NADH5
PREPTHS
PPTGS_TB2
ALDD1
GLNSP3
DHPPDA2
UDPDPS
UGLDDS2
TDMS4
MTHFD
MYC2CYC386
SHKK20
PHTHDLS
PHETA1
FASm3001
DNMPPA
MECDPS16
PLIPA1E160
FAS80_L
DHQS11
G16MTM1
PRAIi10
FAH4
TRE6PP
PHYCBOXL
SDPTA
G3PAT190
UAGPT1
MME
ABTA
PHTHS2
BMNMSHS
GTPCI
DHAD13
CDPMEK16
ICHORSi
FASm2601
MYCSacp56
ACGAMT43
ARACHTA
PPNDH
DHQD11
UAGPT2
FAS240_L
HPPK2
DCPDPP62
FASPHDCA
MSHOXH
ANPRT10
MYCFLUX2
CDPPT160
UGGPT3
FMNAT
FAO1
GMT243
DXPRIi16
GLYCL
DXPS
UDCPDPS
SHK3Dr11
FAMPL4
CIGAMS50
FAS160
DALAOX
FASm260
MYCSacp58
SDPDS
MYC2CYC190
SUCBZL122
G16MTM10
PPDIMAS
MSHAMID
G16MTM2
ACLS3
DMPPS
PREPHACPHr
FACOAL161
TDMS1
PPTGS
PAPPT3
G16MTM8
HSK
MYC1CYC593
THMDP
GTHS
AMMQT8_2
PPTGS_TB1
TATSO131
DMATT
FAS181
FAMPL2
UGMDDS
FTHFD
RRPPDIMAS
DDPA11
PGSA190
OMCDC
PTPATi
UGLDDS2
AMMQT8
UDPGALM43
NADH9
SHCHCS2125
ARGSL
UAGPT3
UAMAS
PAPPT2
UAGPT3
ANS10
TMN
FMETTRS
PAPPT1
FAO3
FASm2202
PDIMAS
MANAT3
RBFSa56
NPHS122
MYCON2
IACGAMS80
ILETA84
G1PACT
TMHAS3
G16MTM6
ACCC78
RBFK
MYCFLUX4
DPPS
DAPE
AGPAT190
SHCHCS
UGMDDS
MI3PP
DCPDPP2
GF6PTAr
MYC2CYC2
TMPKr
SERD_L
TMHAS1
ALAALAD
LPLIPAL2E160
FHL
SERD_D
CHORS20
SHSL1r
CDAPPA160
MYC1M293
UAMAGS
ARABF43
TAS43
ARABF43
PPNCL3
TMHAS6
ACP1(FMN)
PATS
PHTHDLS2
FACOAL200
FACOAL180
HSDy
SUCBZS122
FACOAE140
UGMDDS2
MDFDH
FASm1801
GTPCII
HMPK4
DCPT62
MANAT2
ASNS1
FAS180
AMMQT613
FRRPPDIMAS
FAO2
FASm2401
PGPP190
AMPTASECG
MCBTS2
PSCVT20
GCC2
DHPPDA
MYC1CYC386
PGPP160190
DHNPA2
G16MTM9
FACOAL160
ARI
VALTA
MSHS50
PAPPT3
KARA2i
CHRPL
IGPS10
DHDPS2
MANAT1
TMPPP
FAMPL593
DAPDC
ACHBS
IPDDI
CYSS
IGAMD80
TYRTA
FAO10
MOHMT12
PHTO
RPPDIMAS
MI4PP
UAMAGS
SSALx
AACPS11
PREPTHS2
MTHFC
MYC1CYC190
GALFT43
UAGPT1
FOMETR
SALCS
ASPK
MYCFLUX3
DHNPA9
OXGDC2125
FTHFL
IPMD
GLNS
FAMPL190
PGPPT3
FASm2002
DESAT16
APRAUR
PEAMNO
PMDPHT
GLUCYS
MCBTS3
UGMAGS
MNDNS1
GLNSP1
FAS12057
MI1PS
ASPO5
TRPS2
TDMS2
MMM2r
IPPMIa14
MYCON3
ALATA_L
DPCOAK
DASYN160190
TRPS1
DCPE62
THDPS2
UDPDPS2
TATS
CDAPPA190
ARGSS
AMID4
UAAGDS
TRPTA
UAAGLS1
IPDPS
GTPCII2
AFTA43
FAMPL386
FAO11
FOLD39
FACOAL140
CHORM
MECDPDH16
FASm180
UAMAS
O16RHAT43
TDMS3
NADH10
HMPK1
DCPDPS
FASm340
UDCPDP
ASNN
HMPK3
MCBTS
PGAMT
AHMMPS
TMHAS2
PGLS
IPPMIb14
MCOATA
ACPS1
MYC1CYC2
PAPPT2
MYCTR
GLUCYS
UAGDP
THRD_Lr
GMT243
NO
GALFT43
DCPT2
DHDPRy2
FACOALPREPH
UAPGR
PANTS12
MYC1M1
UGMAS
CYTBD2
GLUDC
MCD
ILEt2r84
ASAD
TMHAS5
UGAGDS
CYSTGL
ADCL
CLPNS160190
DNTPPA
FACOAL181
FASm320
UAGPT2
AACPS10
HAS43
PREPTTA
HSST
FASm2402
DHNAOT213
PHTHCLS
MYCTR2
G16MTM5
MYCFLUX1
GRTT
UAAGLS2
FORMCOAL
PPCDC
DHNAOT
MYCON190
FASm2802
ACChex78
THRS
TRPS3
ACGAMT43
FASm240
GLUR
GLUSy
MTHFD2
PGSA160190
THFGLUS
UGLDDS1
O16RHAT43
FAS10057
FAS260
ADCS
AFE43
PHTHCLR2
GLUR
GMT1
Page 15 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
phenotypes [36]. We applied this same concept to iNJ661,but rather than looking for disease phenotypes, we soughtto identify alternative drug targets.
ConclusionOver time and with iterative improvements, metabolicnetwork reconstructions have achieved the stage ofhypothesis generation and model-driven biological dis-covery in systems biology [42]. We present an in silicostrain of M. tb, iNJ661, with the anticipation that it will bereceived by the community in a similar vein and to assistin the discovery process of this remarkable yet devastatingpathogen.
A large amount of material has been presented in thismanuscript, beginning with the presentation of a genome-scale reconstruction and model of M. tb, iNJ661, followedby experimental validation and use of the model for inte-gration and analysis of multiple experimental datasets.We summarize some of the main points that highlightareas in which further experimental research may be fruit-ful.
Growth studies• iNJ661 can grow at rates consistent with experimentaldata in varying media conditions. Further measurementsof the biomass composition of M. tb in well-defined situ-ations are likely to improve the in silico predictions underthose conditions.
• Analysis of the capabilities under maximal growth con-ditions can help identify critical nutrient targets for killingM. tb (such as the sulfate transporter) or for optimizinggrowth (in the lab).
• The pathway enabling growth of M. tb using CO as thecarbon source is still unclear. The computational studiessuggest that there must be an alternative pathway (toreductive citrate cycle and RuBisCO) which enables entryof CO2 into central metabolism.
• The seemingly paradoxical need for glucose supplemen-tation in Middlebrook media but non-essentiality ofdirectly related reactions involving the metabolism of glu-cose-6-phosphate for in vitro growth and the almost dia-metrically opposed results for in silico growth, suggest thatthere may be additional pathways involving glucosemetabolism and potentially non-metabolic functionsrelated to the glucose requirement.
Comparison with and analysis of large datasets• The definition of the biomass function will largely dic-tate the results of optimal growth/gene deletion experi-ments, consequently, positive results (in agreement withthe experiments) are not as interesting as the false negative
(or false positive results). The false negative results may bedue to an error in the annotation of the gene or an under-lying physiological mechanism that we are not aware of.In either case, investigating the particular genes experi-mentally may help clarify the issue.
• The model can be used as a Context for Biological Con-tent and provides a consistent, coherent framework toanalyze various datasets focusing on metabolism.
'Unbiased' analysisUnbiased analysis methods, methods that calculate prop-erties of the network independent of objective functions,can provide interesting and valuable insights into networkcapabilities [32]. The definition of HCR sets, in a similarvein to correlated reaction sets from sampling and flux-coupling, can confer a hierarchical structure to metabolicnetworks, and may assist in the identification of alterna-tive drug targets. Using these models in conjunction withexperimental data can assist in screening and identifyingnew and alternative drug targets.
Taken together, this study adds another high resolutionreconstruction of microbial metabolism. There now exista growing number of such reconstructions that have beenparticularly useful for basic studies of network propertiesand for bioprocessing applications [42-44]. As thenumber of reconstructions of human pathogens grows,we should be able to expand their uses towards improvedunderstanding pathogenic mechamisms, and design ofinterventions and treatment. Large-scale rigorous experi-mental validation studies should now be performed tofurther these goals.
MethodsReconstruction and model development of the metabolicnetwork has been described previously [7,8]. The recon-struction contents will be made available on our website[40] in Excel [see Additional file 5] and SBML formats.
ReconstructionFigure 1 provides a brief summary of the reconstructionand modeling building process. The sequence basedgenome annotation of Mycobacterium tuberculosisH37Rv was downloaded from TIGR [45] and served as theframework of the model. Charge and elementally bal-anced reactions were added individually based on thisannotation, legacy data when available, and updatesmade in the Tuberculist database [31]. KEGG [46] and theSEED [47] were used as ancillary tools on occasion. Fol-lowing the initial reconstruction, the gaps were evaluatedindividually by searching for direct evidence in the litera-ture for their metabolism. Due to the relative sparsity ofliterature on aspects of central and cofactor metabolism inM. tuberculosis, many of these gaps could not be filled. Leg-
Page 16 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
acy data in the form of primary articles, review articles,and textbooks were employed in addition to the databaseresources [see Additional file 6] during the reconstructionand model building phases. A comprehensive map ofiNJ661's metabolic network was visualized by creating amap of the network organized by lumped subsystems ofmetabolism [see Additional files 8, 9, 10, 11, 12, 13, 14,15]. After the debugging process, when the cell could grow(i.e. produce biomass) on different media, the remainingintracellular gaps were evaluated by searching the litera-ture for evidence of metabolic reactions involving thatparticular metabolite. If no evidence of transport or bio-chemical transformations of the metabolite in M. tb wasfound, no additional reactions or transporters wereadded.
Model formulationThe stoichiometric matrix, S, was constructed based onthe reactions described in the reconstruction. The biomassfunction was defined using available legacy data [seeAdditional file 1]. Flux Balance Analysis (FBA) simula-tions were employed during the debugging processtowards developing a functional model.
Flux Balance Analysis (FBA)The metabolic network is represented mathematically bythe stoichiometric matrix, m rows by n columns, wherethere are m metabolites and n reactions in the network.Mass conservation dictates
which simplifies to
S·v = 0
at the steady state. Further constraints can be placed onthe system based on thermodynamics (reaction direction-ality) and limits on substrate uptake rates
α ≤ v ≤ β
The oxygen uptake rate measured from the MycobacteriaBovis BCG strain of 25 μL/hr/mg dry weight (0.98 mmolO2/hr/g dry weight) [48] was used. The small moleculecarbon sources (glucose, glycerol, etc.) were limited to anuptake rate of 1 mmol/hr/g dry weight and all otherallowable substrates were unconstrained. The different insilico culture media are listed in Table 3. Deviations fromthe standard media include: the addition of minuteamounts of ferric iron in Youmans media. The oxygenuptake rate was the growth constraining flux in all of thedifferent media.
Flux Variability Analysis (FVA)Constraining the biomass objective function, c, to theoptimal value calculated in FBA,
max(cT·v)
every flux in the network is then minimized and maxi-mized. The different between the maximum and mini-mum for each flux defines the flux variability for thatreaction.
Robustness AnalysisRobustness plots are performed by varying a particularflux through a pre-defined range and recalculating theobjective function. The slope of the curve describes thesensitivity of the objective function on that particular flux(over the specified range of values).
Hard-Coupled Reaction (HCR) Sets
Denoting the binary form of the stoichiometric matrix, ,the input/reactant part of the stoichiometric matrix, S-,
and the output/product part of the stoichiometric matrix,S+, then it is transparent that,
The input and output connectivities of each metabolite
can be calculated for - and +, respectively. The
input:output ratio for each metabolite can be determinedand all metabolites with 1:1 connectivity will be hard-coupled by the network. Metabolites with 0:2 or 2:0 con-nectivity reflect metabolites that are hard-coupled in theevent that the corresponding reaction is reversible. Metab-olites with 0:1 or 1:0 connectivity are either source/sinktransporters or blocked reactions.
A general algorithm for calculating hard coupled reactionsets.
1. Identify 1:1, 0:2, and 2:0 metabolites
2. Define HCRs
a. Identify the sets of reactions corresponding to themetabolites, denote the entire set as HCR0
b. Join any HCR0 subset that share 1 or more reactions,denote the entire set as HCR1
c. i ← 1
3. while (HCRi-1 ≠ HCRi)
dx
dtS v= •
S
S S S= +− +ˆ ˆ
S S
Page 17 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
a. Join any HCRi subset that share 1 or more reactions
b. i ← i+1
Since 0:2 or 2:0 reactions may be irreversible reactions, anintermediate processing step may be needed to verify thatthere is biological evidence supporting the catalysis of thereaction in the reverse direction. If one does not wish toconsider these potential effects and assume that all of thereactions are reversible then step 1 can be simplified by
finding all of the 2 metabolites for . Biomass functionswere removed from the stoichiometric matrix before cal-culating the HCR sets. The HCRs calculated for iNJ661had 94 0:2/2:0 reactions. Of these, 24 overlapped with thecore set of 147 1:1 HCRs. Since the set of 0:2/2:0 reactionsdid not have additional gene associations (that weren'talready in the 1:1 set) and since they had many irreversi-ble reactions throughout, the 1:1 set and the 0:2/2:0 setwere not combined (step 1 in the above algorithm outlinewould involve only identifying the 1:1 metabolites).
The reconstruction process, the metabolic maps, and themajority of the calculations were performed using Sim-pheny™ v1.11 software (Genomatica, Inc.). HCR Sets werecalculated with Mathematica© v4.2 (Wolfram Research,Inc.).
Authors' contributionsNJ carried out the reconstruction, performed the analyses,and drafted the manuscript. BØP conceived the study andhelped revise the manuscript. Both authors approved thecontent of the final manuscript.
Additional material
Additional File 1Molecular composition of biomass functions. A detailed list of the biomass constituents and their respective coefficients in the biomass function in addition to the references from which this information was found.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S1.xls]
Additional File 2HCRs mapped to experimental data sets. The list of HCRs that mapped to the three experimental data, based on gene loci.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S2.xls]
S
Additional File 3HCRs mapped to reaction and metabolic subsystems. The complete list of HCRs, highlighting those known to be drug targets in M. tb.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S3.xls]
Additional File 4Comprehensive list of false negatives between OptGro data set and iNJ661 lethal gene deletions. A catalog of the false negative results pro-duced by iNJ661 when compared to the OptGro data set. Each of false negative results is categorized into at least one of four different categories. Specific loci are listed in cases with suggested alternative loci or alternative paths.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S4.xls]
Additional File 5Model content for iNJ661. A complete list of the reactions (abbreviations and full names), the associated mass and charge balanced reactions, con-fidence scores, reaction subsystems, and the list of the Gene-Protein-Reac-tion relationships, including logical AND/OR relationships, for all of the reactions in the model. A complete list of metabolites appearing in iNJ661, including abbreviation, full name, formula, and charge at pH 7.2 can be found in the second tab in the file.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S5.xls]
Additional File 6Explicit list of references used in the reconstruction process. The list of ref-erences used to define particular reactions and GPR relationships.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S6.pdf]
Additional File 7Robustness plots for phosphate and sulfate for iNJ661 and iJR904. Plot-ting the biomass objective function versus phosphate or sulfate uptake rates for in silico strains of M. tb and E. coli.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S7.pdf]
Additional File 8Amino acid metabolism. A map of the amino acid pathways in iNJ661.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S8.svg]
Additional File 9Central metabolism. A map of the central metabolic pathways in iNJ661.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S9.svg]
Additional File 10Fatty acid metabolism. A map of the fatty acid pathways in iNJ661.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S10.svg]
Page 18 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
AcknowledgementsThis work was supported in part by an NIH Training Grant. We also thank Mr. Scott Becker and Mr. Adam Feist for helpful comments and feedback. The authors do not declare any competing interests.
References1. Kasper D, Braunwald E, Fauci A, Hauser S, Longo D, Jameson J: Har-
rison's Principles of Internal Medicine. 16th edition. McGraw-Hill Professional; 2004.
2. Schneider E, Moore M, Castro KG: Epidemiology of tuberculosisin the United States. Clinics in chest medicine 2005, 26:183-195. v.
3. Small PM, Fujiwara PI: Management of tuberculosis in theUnited States. The New England journal of medicine 2001,345(3):189-200.
4. Sharma SK, Mohan A: Multidrug-resistant tuberculosis: a men-ace that threatens to destabilize tuberculosis control. Chest2006, 130(1):261-272.
5. Wayne LG, Sohaskey CD: Nonreplicating persistence of myco-bacterium tuberculosis. Annual review of microbiology 2001,55:139-163.
6. Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, Gor-don SV, Eiglmeier K, Gas S, Barry CE 3rd, et al.: Deciphering thebiology of Mycobacterium tuberculosis from the completegenome sequence. Nature 1998, 393(6685):537-544.
7. Becker SA, Palsson BO: Genome-scale reconstruction of themetabolic network in Staphylococcus aureus N315: an initialdraft to the two-dimensional annotation. BMC microbiology[electronic resource] 2005, 5(1):8.
8. Feist AM, Scholten JC, Palsson BO, Brockman FJ, Ideker T: Modelingmethanogenesis with a genome-scale metabolic reconstruc-tion of Methanosarcina barkeri. Molecular systems biology [elec-tronic resource] 2006, 2(2006):0004. 2006 0004
9. Khuller GK, Taneja R, Kaur S, Verma JN: Lipid composition andvirulence of Mycobacterium tuberculosis H37Rv. The Austral-ian journal of experimental biology and medical science 1982, 60(Pt5):541-547.
10. Nandedkar AK: Comparative study of the lipid composition ofparticular pathogenic and nonpathogenic species of Myco-bacterium. Journal of the National Medical Association 1983,75(1):69-74.
11. Watanabe M, Aoyagi Y, Ridell M, Minnikin DE: Separation andcharacterization of individual mycolic acids in representativemycobacteria. Microbiology (Reading, England) 2001, 147(Pt7):1825-1837.
12. Youmans AS, Youmans GP: Ribonucleic acid, deoxyribonucleicacid, and protein content of cells of different ages of Myco-bacterium tuberculosis and the ralationship to immuno-genicity. J Bacteriol 1968, 95(2):272-279.
13. Garcia-Vallve S, Guzman E, Montero MA, Romeu A: HGT-DB: adatabase of putative horizontally transferred genes inprokaryotic complete genomes. Nucleic acids research 2003,31(1):187-189.
14. Beste DJ, Peters J, Hooper T, Avignone-Rossa C, Bushell ME, McFad-den J: Compiling a molecular inventory for Mycobacteriumbovis BCG at two growth rates: evidence for growth rate-mediated regulation of ribosome biosynthesis and lipidmetabolism. J Bacteriol 2005, 187(5):1677-1684.
15. Acharya PV, Goldman DS: Chemical composition of the cell wallof the H37Ra strain of Mycobacterium tuberculosis. J Bacteriol1970, 102(3):733-739.
16. Middlebrook G, Cohn ML: Bacteriology of tuberculosis: labora-tory methods. American journal of public health 1958,48(7):844-853.
17. Youmans , Karlson : 1947.18. James BW, Williams A, Marsh PD: The physiology and patho-
genicity of Mycobacterium tuberculosis grown under con-trolled conditions in a defined medium. Journal of appliedmicrobiology 2000, 88(4):669-677.
19. Cox RA: Quantitative relationships for specific growth ratesand macromolecular compositions of Mycobacterium tuber-culosis, Streptomyces coelicolor A3(2) and Escherichia coli B/r: an integrative theoretical approach. Microbiology (Reading,England) 2004, 150(Pt 5):1413-1426.
20. Primm TP, Andersen SJ, Mizrahi V, Avarbock D, Rubin H, Barry CE3rd: The stringent response of Mycobacterium tuberculosis isrequired for long-term survival. J Bacteriol 2000,182(17):4889-4898.
21. Edwards J, Ramakrishna R, Palsson B: Characterizing the meta-bolic phenotype: a phenotype phase plane analysis. BiotechnolBioeng 2002, 77(1):27-36.
22. Reed J, Palsson B: Genome-scale in silico models of E. coli havemultiple equivalent phenotypic states: assessment of corre-lated reaction subsets that comprise network states. GenomeRes 2004, 14(9):1797-1805.
23. Sassetti CM, Boyd DH, Rubin EJ: Genes required for mycobacte-rial growth defined by high density mutagenesis. Mol Microbiol2003, 48(1):77-84.
24. Reed JL, Vo TD, Schilling CH, Palsson BO: An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR).Genome biology 2003, 4(9):R54.
25. Cole ST: Tuberculosis and the tubercle bacillus. Washington,DC: ASM Press; 2005.
26. King GM: Uptake of carbon monoxide and hydrogen at envi-ronmentally relevant concentrations by mycobacteria. ApplEnviron Microbiol 2003, 69(12):7266-7272.
27. Park SW, Hwang EH, Park H, Kim JA, Heo J, Lee KH, Song T, Kim E,Ro YT, Kim SW, et al.: Growth of mycobacteria on carbonmon-oxide and methanol. J Bacteriol 2003, 185(1):142-147.
28. Srinivasan V, Morowitz HJ: Ancient genes in contemporary per-sistent microbial pathogens. The Biological bulletin 2006,210(1):1-9.
29. Gao Q, Kripke KE, Saldanha AJ, Yan W, Holmes S, Small PM: Geneexpression diversity among Mycobacterium tuberculosis
Additional File 11Membrane metabolism. A map of the membrane metabolism pathways in iNJ661.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S11.svg]
Additional File 12Nucleotide metabolism. A map of the nucleotide synthesis and degrada-tion pathways in iNJ661.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S12.svg]
Additional File 13Peptidoglycan metabolism. A map of the peptidoglycan pathways in iNJ661.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S13.svg]
Additional File 14Transporters. A map of the transport reactions found in iNJ661.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S14.svg]
Additional File 15Vitamin and cofactor metabolism. A map of the vitamin and cofactor pathways in iNJ661.Click here for file[http://www.biomedcentral.com/content/supplementary/1752-0509-1-26-S15.svg]
Page 19 of 20(page number not for citation purposes)
BMC Systems Biology 2007, 1:26 http://www.biomedcentral.com/1752-0509/1/26
Publish with BioMed Central and every scientist can read your work free of charge
"BioMed Central will be the most significant development for disseminating the results of biomedical research in our lifetime."
Sir Paul Nurse, Cancer Research UK
Your research papers will be:
available free of charge to the entire biomedical community
peer reviewed and published immediately upon acceptance
cited in PubMed and archived on PubMed Central
yours — you keep the copyright
Submit your manuscript here:http://www.biomedcentral.com/info/publishing_adv.asp
BioMedcentral
clinical isolates. Microbiology (Reading, England) 2005, 151(Pt1):5-14.
30. Sassetti CM, Rubin EJ: Genetic requirements for mycobacterialsurvival during infection. Proc Natl Acad Sci USA 2003,100(22):12989-12994.
31. Tuberculist Web Server, Pasteur Institute [http://genolist.pasteur.fr/TubercuList/]
32. Price ND, Reed JL, Palsson BO: Genome-scale models of micro-bial cells: evaluating the consequences of constraints. Nat RevMicrobiol 2004, 2(11):886-897.
33. Papin JA, Reed JL, Palsson BO: Hierarchical thinking in networkbiology: the unbiased modularization of biochemical net-works. Trends Biochem Sci 2004, 29(12):641-647.
34. Thiele I, Price ND, Vo TD, Palsson BO: Candidate metabolic net-work states in human mitochondria. Impact of diabetes,ischemia, and diet. J Biol Chem 2005, 280(12):11683-11695.
35. Burgard AP, Nikolaev EV, Schilling CH, Maranas CD: Flux couplinganalysis of genome-scale metabolic network reconstruc-tions. Genome Res 2004, 14(2):301-312.
36. Jamshidi N, Palsson BO: Systems biology of SNPs. Molecular sys-tems biology [electronic resource] 2006, 2:38.
37. Pfeiffer T, Sanchez-Valdenebro I, Nuno JC, Montero F, Schuster S:METATOOL: for studying metabolic networks. Bioinformatics1999, 15(3):251-257.
38. Palsson BO: Systems Biology: Determining the Capabilities ofReconstructed Networks. Cambridge Univ Pr; 2006.
39. Mdluli K, Spigelman M: Novel targets for tuberculosis drug dis-covery. Current opinion in pharmacology 2006, 6(5):459-467.
40. Systems Biology Research Group, UCSD [http://systemsbiology.ucsd.edu/]
41. Raman K, Rajagopalan P, Chandra N: Flux balance analysis ofmycolic Acid pathway: targets for anti-tubercular drugs.PLoS computational biology 2005, 1(5):e46.
42. Reed JL, Famili I, Thiele I, Palsson BO: Towards multidimensionalgenome annotation. Nat Rev Genet 2006, 7(2):130-141.
43. Hjersted JL, Henson MA, Mahadevan R: Genome-scale analysis ofSaccharomyces cerevisiae metabolism and ethanol produc-tion in fed-batch culture. Biotechnol Bioeng 2007.
44. Van Dien SJ, Lidstrom ME: Stoichiometric model for evaluatingthe metabolic capabilities of the facultative methylotrophMethylobacterium extorquens AM1, with application toreconstruction of C(3) and C(4) metabolism. Biotechnol Bioeng2002, 78(3):296-312.
45. The Institute for Genomic Research [http://www.tigr.org/]46. Kyoto Encyclopedia of Genes and Genomes [http://
www.genome.jp/kegg/]47. The SEED [http://theseed.uchicago.edu/FIG/index.cgi]48. van Hemert PA, Tiesjema RH: Possible use of the oxygen uptake
rate in the evaluation of BCG vaccines. Journal of biological stand-ardization 1977, 5(2):121-129.
Page 20 of 20(page number not for citation purposes)