metagenome analysis and draft genome reconstruction of … · 2016. 10. 3. · importantly, a draft...


Research & Innovation Center

Science & Engineering To Power Our Future

Metagenome Analysis and Draft Genome Reconstruction of Produced Water Samples from Coalbed Methane Environments

Daniel E. Ross,1,2* Djuna Gulliver1

1National Energy Technology Laboratory, U.S. Department of Energy, 2AECOM, (*Daniel.Ross@netl.doe.gov)

Abstract. Biogasification is a process that uses the microbial community native to coalbeds to naturally convert currently unusable coal into readily available methane. One methodology involves injection of nutrients into the coal seams to stimulate biogenic coal degradation and methanogenesis. Identification of major functional pathways of biogenic coal degradation and subsequent methane production will lead to a better understanding of the coal-to-methane conversion, the microorganisms responsible for this conversion, and the nutrients required to bolster this conversion in situ. This study examines the metagenomes of four produced water samples from the Central Appalachian Basin (Pocahontas 3 coal seam) to determine the composition (who’s there) and the potential functional pathways (what can they do) of the resident microbial community. Nucleic acid was recovered from produced water samples using DNA isolation techniques, and the quality and quantity of DNA were assessed. Illumina MiSeq next-generation sequencing was employed, and the resultant nucleic acid sequence data were processed using a suite of bioinformatics software. Four metagenomes, named K34, K35, BB137, and L32A, were obtained from produced water samples from depths of 1704 ft, 1912 ft, 1980 ft, and 2578 ft, respectively. Methanogens were present in all samples, suggesting methanogenesis can occur. Furthermore, hydrocarbon degradation pathways were found, suggesting a route for biodegradation of coal. Importantly, a draft genome most closely related to Pseudomonas stutzeri CCUG was extracted from the K35 metagenome. Initial analysis of the draft genome revealed a complete nitrogen fixation pathway and a naphthalene degradation pathway.

Methods: Metagenome Analysis Pipeline (MAP)

Flowchart headings: Metagenome Analysis; Genome Analysis

1. Sample Collection and Preservation

2. Nucleic Acid extraction

3. Nucleic Acid Analysis

3.1 16S rRNA gene analysis

3.1.1 16S amplification and sequencing

3.1.3 16S clustering and annotation

3.2 Metagenome analysis

3.2.1 DNA sequencing

3.2.2 Binning and Assembly

3.2.3 Annotation

Flowchart side labels: Quality assessment | DNA Library preparation; metadata; Store aliquots for later use; PAL analysis; Enrichment cultures

Sequencing to be done at NETL-PIT in Gulliver Lab (84-223) and samples sent to Earth Microbiome Project (EMP)

Samples to be collected by drillers, mailed to NETL-MGN.

Bioinformatics on Linux in Gaia Lab at NETL-PIT

K35: 1912 ft

BB137: 1980 ft

K34: 1704 ft

L32A: 2578 ft

DISCLAIMER: This project was funded by the Department of Energy, National Energy Technology Laboratory, an agency of the United States Government, through a support contract with AECOM. Neither the United States Government nor any agency thereof, nor any of their employees, nor AECOM, nor any of their employees, makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

REFERENCES:

1. Milici, R.C., and Polyak, D.E., 2014, Coalbed-methane production in the Appalachian basin, chap. G.2 of Ruppert, L.F., and Ryder, R.T., eds., Coal and petroleum resources in the Appalachian basin; Distribution, geologic framework, and geochemical character: U.S. Geological Survey Professional Paper 1708, 25 p., http://dx.doi.org/10.3133/pp1708G.2. (Chapter G.2 supersedes USGS Open-File Report 02–105.)

2. Jones EJP, Voytek MA, Corum MD, and Orem WH (2010). Stimulation of Methane Generation from Nonproductive Coal by Addition of Nutrients or a Microbial Consortium. Applied and Environmental Microbiology 76(21):7013-7022.

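[Figure: taxonomic composition of the four metagenomes. Panels: K35, K34, L32A, BB137. Dominant-group labels: Pseudomonas (one panel), Methanogens (three panels).]

[Figure: pan-genome tree. Labels: Marine isolates; Soil/sludge isolates; Pseudomonas stutzeri K35.]

[Figure: metabolic pathway diagram. Nitrogen metabolism labels: Nitrate, Nitrite, Dinitrogen oxide, Dinitrogen; Nitrite, Ammonia; Fd. Carbon metabolism labels: Sugars, Glucose-6-P, 2-keto-3-deoxy-6-P-gluconate (ED Pathway), Glyceraldehyde-3-phosphate, Pyruvate, Acetyl-CoA, TCA, Fructose-6-P, Fructose-1,6-2P, D-Xylulose-5P, DHAP. Biosynthesis labels: pro, arg; ser, gly, cys; purines, pyrimidines, his, FAD, folate, riboflavin.]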

Goals:

• Investigate the microbial community in potential biogasification sites

• Characterize relevant functional pathways found in coal systems required for coal-to-methane bioconversion

• Construct draft genomes of abundant microorganisms in coal systems to complete a detailed characterization of prevalent functional potential

Metagenome Assembly

Taxonomic assessment of metagenome

Metagenome Results. Four metagenomes were analyzed and classified taxonomically. Generally, all samples contained Bacteria and Archaea, composed mostly of Proteobacteria and Euryarchaeota. With the exception of K35, all metagenomes were dominated by methanogens from the phylum Euryarchaeota. Short DNA sequencing reads (250 bp) were assembled into longer contigs. The contigs were binned according to genomic signature. Each bin was individually analyzed for genome completeness by comparing contigs to a reference marker gene set. Based upon the presence or absence of these marker genes, the % genome completeness was estimated.
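The marker-gene completeness estimate described above can be illustrated with a minimal sketch. It assumes the simplest possible definition (the fraction of a reference marker set detected in a bin); the function name and the gene names in the usage note are hypothetical, and the poster does not name the software actually used, which may weight markers differently or also report contamination.

```python
def estimate_completeness(marker_genes, genes_found_in_bin):
    """Return (percent_complete, missing_markers) for one genome bin.

    Assumption: completeness is the plain fraction of reference marker
    genes that were detected in the bin. This is an illustration only,
    not the scoring scheme of any specific tool.
    """
    markers = set(marker_genes)
    if not markers:
        raise ValueError("marker gene set is empty")
    # Genes found in the bin that are not markers are ignored.
    present = markers & set(genes_found_in_bin)
    percent = 100.0 * len(present) / len(markers)
    return percent, sorted(markers - present)
```

For example, calling `estimate_completeness(["rpoB", "recA", "gyrB", "rpsC"], ["recA", "rpoB", "nifH"])` reports half of the (hypothetical) marker set as present and lists `gyrB` and `rpsC` as missing.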

[Figure: workflow diagram. Steps: DNA isolation and sequencing; Metagenome Assembly; Metagenome Binning; Mapping binned contigs to reference genomes. Intermediate products: Metagenome; Metagenome reads; Metagenome contigs; Binned contigs; Genome bins.]

Jones et al., 2010

ACKNOWLEDGEMENT: This technical effort was performed in support of the National Energy Technology Laboratory’s ongoing research under the RES contract DE-FE0004000.

Genome Results. The K35 metagenome was estimated to be ~50% Pseudomonas. After careful contig binning and genome mapping, the Pseudomonas genome bin was 99.2% complete, estimated by the presence/absence of 833 marker genes. The pan-genome was determined and the core genome (genes common across all strains tested) was estimated. The draft genome encodes a complete nitrification pathway as well as the upper and lower naphthalene degradation pathways. The work presented here represents an initial metagenomic/genomic approach to functional characterization of coalbed methane microbial communities.

The first step in the metagenome analysis pipeline (MAP) involves careful and calculated sample collection. Samples are collected by drillers or, when possible, on site by NETL researchers. Importantly, to preserve sample integrity and prevent nucleic acid degradation, samples are immediately aliquoted and frozen. After transport from the field, samples are thawed and processed for DNA extraction. The quality and quantity of DNA are assessed before preparing samples for sequencing.

The second step involves processing DNA samples to generate a sequence library to be loaded into the sequencer (Illumina MiSeq). Processing involves cleaning, barcoding, and pooling DNA samples.

The third MAP step is the most time-consuming and computationally intensive. Here, data retrieved from the sequencer are processed and analyzed. Processing involves removing barcodes and trimming reads based upon the quality score (a measure of the confidence of each base call). Analysis involves metagenome assembly, binning, and annotation.
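As an illustration of quality-based read trimming, the sketch below decodes a FASTQ quality string and removes low-confidence bases from the 3' end of a read. It assumes Phred+33 encoding (standard for current Illumina output) and a threshold of Q20; the function names and the threshold are illustrative choices, not the settings used in this work.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumption: Phred+33 ASCII encoding.
    """
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def trim_3prime(sequence, quality_string, min_quality=20):
    """Drop bases from the 3' end until one meets the quality threshold.

    min_quality=20 is a hypothetical cutoff (1% error per base call).
    """
    scores = phred_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]
```

Production pipelines usually apply a sliding-window variant of this idea and discard reads that become too short after trimming; the simple end-trimming loop here is only meant to show how the quality score drives the decision.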

Pan-genome Analysis

Contig mapping to reference genome sequence

Whole genome comparisons

Overview of functional potential

Future work. Results presented here will provide a framework for tailoring nutrient amendments for microbially enhanced coalbed methane and provide a baseline for monitoring changes in the microbial community during amendments.

Clinical isolates

Sequence reads were assembled and contig length was plotted against contig number. A steeper slope represents a better assembly.
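The data behind a contig length versus contig number plot can be prepared as in the sketch below. It assumes contigs are ranked from longest to shortest and also tracks cumulative length; the N50 helper is an addition for illustration (a widely used assembly summary statistic) and is not reported on the poster.

```python
def length_rank_curve(contig_lengths):
    """Return (rank, length, cumulative_length) tuples, longest contig first.

    Assumption: "contig number" means the rank of a contig after sorting
    by descending length.
    """
    ordered = sorted(contig_lengths, reverse=True)
    curve, running = [], 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        curve.append((rank, length, running))
    return curve


def n50(contig_lengths):
    """Length of the contig at which half of the total assembly is reached."""
    curve = length_rank_curve(contig_lengths)
    if not curve:
        return 0
    half_total = curve[-1][2] / 2
    for _rank, length, cumulative in curve:
        if cumulative >= half_total:
            return length
```

An assembly in which a few long contigs carry most of the sequence reaches half of its total length at a low rank, which is the property the slope comparison in the caption is getting at.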

Metagenome Binning

Contigs from the K35 metagenome assembly were binned. Each dot represents a contig and each cluster represents a potential genome.
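One common form of genomic signature is the tetranucleotide (4-mer) frequency profile of a contig; contigs from the same genome tend to have similar profiles and therefore cluster together. The sketch below computes such a profile and a distance between two profiles. That the binning here used tetranucleotide frequencies is an assumption, and all names are illustrative.

```python
from itertools import product

# All 256 possible 4-mers over the DNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_signature(sequence):
    """Return the 256 relative 4-mer frequencies of one contig.

    Simplification: reverse complements are not merged, and windows
    containing ambiguous bases (e.g. N) are skipped.
    """
    sequence = sequence.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(sequence) - 3):
        kmer = sequence[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def signature_distance(a, b):
    """Euclidean distance between two signatures (smaller = more similar)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

Binning tools typically combine a signature like this with per-contig read coverage before clustering; projecting the 256-dimensional vectors to two dimensions gives a scatter plot of the kind described in the caption.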

Contigs from the Pseudomonas genome bin from the K35 metagenome (bottom) were aligned to the complete genome sequence of Pseudomonas stutzeri CCUG 29243 (top). Red lines demarcate contig boundaries.

Pan-genome tree comparing the draft genome of P. stutzeri K35 to 31 complete and draft Pseudomonas genomes. The tree represents the degree of similarity between the predicted proteins encoded by each genome.

Pan-genome of 32 Pseudomonas stutzeri strains. Each bar represents the number of gene clusters found in 1 to 32 strains examined.
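The bar chart described above is a frequency spectrum of gene clusters, and the core genome is its last bar (clusters present in every strain). A minimal sketch, assuming the gene clusters have already been computed and are available as a mapping from cluster ID to the set of strains carrying it; the data layout and names are hypothetical.

```python
from collections import Counter


def cluster_frequency_spectrum(presence):
    """Count gene clusters by the number of strains that carry them.

    presence: dict mapping cluster ID -> set of strain names
    (hypothetical input format). Returns {n_strains: n_clusters}.
    """
    return Counter(len(strains) for strains in presence.values())


def core_clusters(presence, n_strains):
    """Clusters found in all n_strains strains, i.e. the core genome."""
    return sorted(c for c, strains in presence.items() if len(strains) == n_strains)
```

With 32 strains, `cluster_frequency_spectrum(presence)[32]` would give the size of the core genome and `cluster_frequency_spectrum(presence)[1]` the number of strain-specific clusters.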

Milici and Polyak, 2014
