Advancing Science with DNA Sequence
Metatranscriptomics: Challenges and Progress
Shaomei He and Edward Kirton
DOE Joint Genome Institute
Metatranscriptomics
Metatranscriptome
The complete collection of transcribed sequences in a microbial community:
• Protein-coding RNA (mRNA)
• Non-coding RNA (rRNA, tRNA, regulatory RNA, etc.)
Metatranscriptomics studies:
• Community functions
• Response to different environments
• Regulation of gene expression
Evolution of Metatranscriptomics
cDNA clone libraries + Sanger sequencing
Microarrays
RNA-seq enabled by next-generation sequencing technologies.
Sorek & Cossart, NRG (2010) 11, 9-16
RNA-seq is superior to microarrays in many ways in microbial community gene expression analysis.
Challenges in Metatranscriptomics
Wet lab
• Low RNA yield from environmental samples
• Instability of RNA (half-lives on the order of minutes)
• High rRNA content in total RNA (mRNA accounts for 1-5% of total RNA)
Bioinformatics
• General challenges with short reads and large data size
• Small overlap between metagenome and metatranscriptome, or complete lack of a metagenome reference
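To make the rRNA problem concrete, here is a rough sketch of how few reads would come from mRNA if nothing is removed. It assumes reads are drawn in proportion to RNA mass, which is a simplification, and the run size of 10 million reads is a hypothetical figure, not one from the slides.

```python
def expected_mrna_reads(total_reads: int, mrna_fraction: float) -> int:
    """Expected number of mRNA-derived reads when no rRNA is removed.

    Assumes reads are sampled in proportion to RNA mass (a simplification).
    """
    if total_reads < 0:
        raise ValueError("total_reads must be non-negative")
    if not 0.0 <= mrna_fraction <= 1.0:
        raise ValueError("mrna_fraction must be between 0 and 1")
    return round(total_reads * mrna_fraction)


# Hypothetical run of 10 million reads, at the 1% and 5% bounds quoted above.
low = expected_mrna_reads(10_000_000, 0.01)
high = expected_mrna_reads(10_000_000, 0.05)
print(f"mRNA reads expected: {low:,} to {high:,}")
```

Under these assumptions, at least 95% of the sequencing effort goes to rRNA, which is why the removal methods on the following slides matter.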
rRNA Removal Methods
Method | rRNA feature used | Input RNA | Manipulate raw RNA
Before cDNA synthesis
Subtractive hybridization | Conserved sequence | High | Yes
RNase H digestion | | |
Exonuclease digestion | 5’ monophosphate | |
Gel extraction | Size | |
Biased poly(A) tailing | 2° structure | Low |
During cDNA synthesis
Not-so-random primers | Sequence feature | Low | No
After cDNA synthesis
Library normalization w/ DSN | High abundance | Low | No
Validation of Two Ribosomal RNA Removal Methods for Microbial Metatranscriptomics
Shaomei He, Omri Wurtzel, Kanwar Singh, Jeff L. Froula, Suzan Yilmaz, Susannah G. Tringe, Zhong Wang, Feng Chen, Erika A. Lindquist, Rotem Sorek and Philip Hugenholtz
Subtractive Hybridization & Exonuclease Digestion
[Figure: schematic of the two methods]
Hyb: Subtractive Hybridization, MICROBExpress Bacterial mRNA Enrichment (Ambion). Labels: capture oligo, magnetic bead, rRNA, mRNA.
Exo: Exonuclease Digestion, mRNA-ONLY Prokaryotic mRNA Isolation (Epicentre). Labels: 5’ monophosphate-dependent exonuclease; rRNA (5’ P); mRNA (5’ PPP).
Objectives
Validate the performance of Hyb and Exo kits on synthetic five-member microbial communities, using Illumina sequencing to evaluate:
• Efficiency of rRNA removal
• Fidelity of mRNA relative transcript abundance
Treatments: Hyb, 2 x Hyb, Exo, Hyb + Exo, Exo + Hyb
Microbial Isolates in the Two Synthetic Communities
Organism | Genome size (Mbp) | %GC | Phylum | Match Hyb target sites
Desulfovibrio vulgaris | 3.7 | 63 | Proteobacteria | Yes
Streptomyces sp. | 8-10 | 71 | Actinobacteria | Yes
Lactococcus lactis | 2.53 | 35 | Firmicutes | Yes
Spirochaeta aurantia | 4.3 | 65 | Spirochaetes | Yes
Lactobacillus brevis | 2.3 | 46 | Firmicutes | Yes
Kangiella koreensis | 2.9 | 43 | Proteobacteria | Yes
Catenulispora acidiphila | 10.5 | 70 | Actinobacteria | Yes
Halorhabdus utahensis | 3.1 | 63 | Euryarchaeota | No
Row-group labels on the slide: Community 1, Community 2
Technical Reproducibility
All treatments exhibited good technical reproducibility.
[Figure: two panels, Hyb and Exo, each comparing rep1 against rep2.]
rRNA Removal Efficiency
Read Distribution
[Figure: read distribution, panels Community 1 and Community 2.]
Observed and Actual rRNA Removal
 | rRNA | mRNA
Before removal | 97 | 3
Removed | -80 | -0
After removal | 17 | 3
rRNA fraction before removal = 97/(97 + 3) = 97%
rRNA fraction after removal = 17/(17 + 3) = 85%
Observed rRNA reduction = 97% - 85% = 12%
Actual percent removal = 80/97 = 82.5%
Actual removal is much higher than it appears, because the original rRNA content is so high.
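The arithmetic on this slide can be reproduced with a small helper. This is only a sketch: the function name and dictionary keys are made up for illustration, and the inputs are the slide's worked example (97 rRNA and 3 mRNA units before removal, 17 and 3 after).

```python
def rrna_removal_summary(rrna_before, mrna_before, rrna_after, mrna_after):
    """Compare the observed drop in rRNA fraction with the actual rRNA removed."""
    if rrna_before <= 0 or mrna_before <= 0:
        raise ValueError("starting amounts must be positive")
    frac_before = rrna_before / (rrna_before + mrna_before)
    frac_after = rrna_after / (rrna_after + mrna_after)
    mrna_frac_before = mrna_before / (rrna_before + mrna_before)
    mrna_frac_after = mrna_after / (rrna_after + mrna_after)
    return {
        "rrna_fraction_before": frac_before,
        "rrna_fraction_after": frac_after,
        # Difference in percentage points of the library that is rRNA.
        "observed_reduction": frac_before - frac_after,
        # Share of the original rRNA that was physically removed.
        "actual_removal": (rrna_before - rrna_after) / rrna_before,
        # How much the mRNA share of the library increased.
        "mrna_fold_enrichment": mrna_frac_after / mrna_frac_before,
    }


summary = rrna_removal_summary(97, 3, 17, 3)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With the slide's numbers, the observed reduction is 12 percentage points while about 82.5% of the rRNA was actually removed; the mRNA share rises from 3% to 15%, a five-fold enrichment.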
Community rRNA Removal
Community 1: Hyb + Exo > Hyb > Exo
Community 2: Hyb + Exo > Exo + Hyb > Exo > 2 x Hyb ≈ Hyb
[Figure: rRNA removal (%) by treatment for each community.]
rRNA Removal and RNA Integrity
RIN: RNA integrity number
More intact RNA → higher rRNA removal efficiency
[Figure: rRNA removal (%) plotted against RNA integrity number (RIN), one panel per treatment: Hyb, 2 x Hyb, Exo, Hyb + Exo, Exo + Hyb. Correlation coefficients shown: r = 0.946, r = 0.958, r = 0.874, r = 0.945.]
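The r values reported for these panels are correlation coefficients between RIN and rRNA removal. A minimal sketch of how a Pearson r could be computed is shown here; the slides do not say which coefficient was used, and the RIN and removal values below are hypothetical placeholders, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not data from the slides).
rin = [6.0, 7.0, 8.0, 9.0, 10.0]
removal_percent = [70.0, 78.0, 85.0, 90.0, 97.0]
r = pearson_r(rin, removal_percent)
print(f"r = {r:.3f}")
```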
Enrichment of mRNA & Increase of Detection Sensitivity
Fidelity of mRNA Relative Abundance
Community 1: Hyb > Exo > Hyb + Exo
Community 2: Hyb ≈ 2 x Hyb > Exo > Hyb + Exo ≈ Exo + Hyb
Conclusions
rRNA removal efficiency depended on community composition and RNA integrity.
Exo degraded some mRNA, introducing larger variation than Hyb.
Combining Hyb and Exo removed more rRNA than either method alone, but significantly compromised the fidelity of mRNA relative abundance.
Customized subtractive hybridization
Stewart et al, ISME J (2010) 4, 896–907
Customized probes specific to communities of interest
Probes cover near-full-length rRNA, and should also capture partially degraded (fragmented) rRNA
It has been applied to marine metatranscriptome samples, where it substantially reduced rRNA.
Duplex-specific nuclease (DSN)
• Efficient on E. coli (final rRNA% = 26 ± 11%)
• Preserved mRNA relative abundance
• Little reduction of the very abundant mRNA
Workflow: Total RNA → RNA-seq library construction → Library normalization using DSN
Library normalization using DSN:
• Denature ds-DNA at high temperature
• Re-anneal to ds-DNA at lower temperature
• DSN degrades the DNA duplexes, which presumably come from abundant transcripts
Yi et al, Nucleic Acids Res (2011) doi: 10.1093/nar/gkr617
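Why re-annealing favors abundant sequences can be illustrated with idealized second-order kinetics, in which the fraction of a sequence still single-stranded after time t is 1/(1 + k·C0·t) for starting concentration C0. The sketch below is a toy model under that assumption, not the method of Yi et al.; the library composition and the combined constant `kt` are hypothetical.

```python
def dsn_normalize(abundances, kt):
    """Toy model of DSN normalization.

    abundances: dict mapping sequence name -> starting relative abundance.
    kt: rate constant multiplied by re-annealing time (arbitrary units).
    Material that re-anneals into duplex is treated as fully degraded by DSN;
    the single-stranded remainder is renormalized to sum to 1.
    """
    if kt < 0:
        raise ValueError("kt must be non-negative")
    surviving = {name: a / (1.0 + kt * a) for name, a in abundances.items()}
    total = sum(surviving.values())
    return {name: amount / total for name, amount in surviving.items()}


# Hypothetical library composition, for illustration only.
library = {"rRNA": 0.95, "abundant_mRNA": 0.04, "rare_mRNA": 0.01}
normalized = dsn_normalize(library, kt=50.0)
for name, fraction in normalized.items():
    print(f"{name}: {library[name]:.2f} -> {fraction:.2f}")
```

In this idealized model the dominant species is depleted most, but differences among the remaining transcripts are also compressed to some degree. The slide reports that mRNA relative abundance was preserved in E. coli, so real behavior depends on conditions the toy model does not capture.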
Still efficient and “faithful” for microbial communities?
Environmental microbial communities are very diverse, with a long tail of minor community members.
[Figure: Typical species rank abundance. x-axis: rank of OTU (1 to 1001); y-axis: relative abundance of OTU (%), 0 to 3.]
Termite Hindgut Metatranscriptomics
- A case study
(Preliminary results)
Summary
Metatranscriptomics is being advanced by next-generation sequencing technologies.
High rRNA content remains a major bottleneck in metatranscriptomics projects.
Bioinformatically removing rRNA reads should increase computational speed in de novo assembly, and improve the assembly of low-abundance mRNAs. An algorithm that is both sensitive and computationally efficient for large datasets still needs to be investigated.
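As one illustration of what bioinformatic rRNA removal could look like, here is a minimal k-mer matching sketch. It is not the approach used by the authors, who leave the algorithm as an open question; the function names, the threshold, and the short sequences are all hypothetical, and a real dataset would need a proper rRNA reference database and a far more memory-efficient index.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all k-mers in seq; empty if seq is shorter than k."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_rrna_index(references, k):
    """Collect k-mers from both strands of each rRNA reference sequence."""
    index = set()
    for ref in references:
        index |= kmers(ref, k)
        index |= kmers(reverse_complement(ref), k)
    return index


def is_rrna(read, index, k, min_fraction=0.5):
    """Flag a read as rRNA if enough of its k-mers occur in the index."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return False
    hits = sum(1 for kmer in read_kmers if kmer in index)
    return hits / len(read_kmers) >= min_fraction


# Made-up sequences for illustration; not real rRNA.
toy_reference = "ACGTTGCAAGGCTTAACCGGATCGATCGGATTACGCA"
index = build_rrna_index([toy_reference], k=8)
reads = [toy_reference[5:30], reverse_complement(toy_reference[5:30]), "T" * 25]
kept = [read for read in reads if not is_rrna(read, index, k=8)]
print(f"kept {len(kept)} of {len(reads)} reads")
```

The `min_fraction` threshold trades sensitivity against false positives: lowering it catches more divergent rRNA reads but risks discarding mRNA reads that share short k-mers with the reference.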
Acknowledgement
• Phil Hugenholtz
• Susannah Tringe
• Edward Kirton
• Kanwar Singh
• Erika Lindquist
• Feng Chen
• Falk Warnecke
• Natalia Ivanova
• Martin Allgaier
• Steve Lowry
• Jeff Froula
• Zhong Wang
• R&D group
• Production group
• Many others!
• Hans Peter Klenk
• Omri Wurtzel
• Rotem Sorek
• Jose Escovar-Kousen
• Rudolph Scheffrahn