Microbiome studies using 16S ribosomal DNA PCR: some cautionary tales

Post on 27-May-2015

Presentation at a workshop conducted by the UC Davis Bioinformatics Core Facility: Using the Linux Command Line for Analysis of High Throughput Sequence Data, September 15-19, 2014

“Scientists often have a naïve faith that if only they could discover enough facts about a problem, these facts would somehow arrange themselves in a compelling and true solution.”

Theodosius Dobzhansky

Jenna Morgan Lang

postdoc, Jonathan Eisen’s Lab

UC Davis

email: jennomics@gmail.com

Twitter: @jennomics

websites: jennomics.com seagrassmicrobiome.org phylogenomics.wordpress.com

Microbiome studies using 16S ribosomal DNA PCR: some cautionary tales.

16S ribosomal RNA PCR surveys

Metagenomics

Typical laboratory workflow

• Extract DNA with MoBio PowerSoil Kit

• Amplify 16S rDNA with barcoded primers

• Pool samples and sequence on the MiSeq
  – 15 million reads, 250bp PE
  – 50-200(?) samples
  – Sample drop out

Typical bioinformatic workflow

• Demultiplex and QC sequence data

• Process using QIIME

• Stare at graphs and wait for a revelation
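As a rough illustration of the demultiplexing step in the workflow above, the sketch below assigns reads to samples by an exact match on a barcode at the start of each read. The barcode length, the sample names, and the function name `demultiplex` are assumptions made up for this example; real pipelines also handle quality filtering and barcode error correction, which this sketch ignores.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len):
    """Group reads by sample using an exact match on the leading barcode.

    reads: iterable of (read_id, sequence) tuples
    barcode_to_sample: dict mapping barcode string -> sample name (hypothetical)
    Returns (per-sample dict of trimmed reads, list of unassigned read ids).
    """
    by_sample = defaultdict(list)
    unassigned = []
    for read_id, seq in reads:
        barcode = seq[:barcode_len]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read_id)
        else:
            # Strip the barcode so downstream steps see only the amplicon.
            by_sample[sample].append((read_id, seq[barcode_len:]))
    return dict(by_sample), unassigned


# Toy data: two samples, 4-base barcodes (all values are made up).
barcodes = {"ACGT": "soil_1", "TGCA": "soil_2"}
reads = [
    ("r1", "ACGTGGGATTAGATACCC"),
    ("r2", "TGCAGGGATTAGATACCC"),
    ("r3", "NNNNGGGATTAGATACCC"),
]
assigned, unassigned = demultiplex(reads, barcodes, barcode_len=4)
```

Reads whose barcode is not recognized are kept in a separate list rather than silently discarded, so the number of lost reads per run can be checked.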

[Figure: QIIME workflow diagram. Column headings: inputs, pre-processing, under the hood, analysis. Boxes: sequence data, metadata, sequence pre-processing, cluster sequences, build OTU table, build phylogenetic tree, assign taxonomy, alpha diversity, beta diversity, hypothesis testing, data visualization.]
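To make the alpha and beta diversity steps of the workflow more concrete, here is a minimal sketch that computes the Shannon index for each sample and the Bray-Curtis dissimilarity between two samples. The OTU counts are invented, and the choice of these two particular metrics is an assumption for illustration; QIIME offers many others.

```python
import math


def shannon(counts):
    """Shannon diversity (natural log) of one sample's OTU counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples' OTU count vectors."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))


# Toy OTU table: one list of counts per sample, same OTU order in each
# (made-up counts).
otu_table = {
    "sample_A": [10, 10, 10, 10],  # four evenly abundant OTUs
    "sample_B": [40, 0, 0, 0],     # one OTU dominates completely
}
alpha = {name: shannon(counts) for name, counts in otu_table.items()}
beta = bray_curtis(otu_table["sample_A"], otu_table["sample_B"])
```

The even sample has the higher Shannon index, and the two samples are quite dissimilar because they share only one OTU.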

You can do lots of things with a .biom table produced by QIIME

• METAGENassist
  – interactive web tool that will do lots of stats and make pretty pictures

• PICRUSt (google: picrust metagenomes)
  – infers functional potential based on your 16S data

• STAMP (google: stamp bioinformatics)
  – flexible python tool (with a GUI) that will do statistical analysis of taxonomic and functional profiles on the fly

• R (phyloseq package)
  – If you are familiar with R, this will bridge the gap between QIIME and Rstats

• Phinch
  – Interactive web-based visualization tool

METAGENassist

• Input is .biom table and “mapping file”

• can input matrix of taxonomy or functional assignments

• many options for statistical analysis

• easily generate nice plots

Some examples of METAGENassist output:

PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States)

• .biom table input from QIIME

• normalize by copy number

• predict metagenome

• .biom table output (with functional categories)
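The "normalize by copy number" step can be pictured with a small sketch: each OTU's read count is divided by the number of 16S gene copies its genome is predicted to carry, so that organisms with many copies are not over-counted. This is not PICRUSt's own code; the function name and the copy numbers below are hypothetical.

```python
def normalize_by_copy_number(otu_counts, copy_numbers):
    """Divide each OTU's read count by its 16S gene copy number.

    otu_counts: dict OTU id -> read count in one sample
    copy_numbers: dict OTU id -> predicted 16S copies per genome (assumed known)
    """
    return {otu: count / copy_numbers[otu] for otu, count in otu_counts.items()}


# Made-up example: OTU_1 carries 4 copies of the 16S gene, OTU_2 only 1.
counts = {"OTU_1": 400, "OTU_2": 100}
copies = {"OTU_1": 4, "OTU_2": 1}
normalized = normalize_by_copy_number(counts, copies)
```

In this toy case the raw reads suggest OTU_1 is four times as abundant as OTU_2, but after normalization the two are estimated to be equally abundant.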

Zaneveld, J.R., Lozupone, C., Gordon, J.I. & Knight, R. Ribosomal RNA diversity predicts genome diversity in gut bacteria and their relatives. Nucleic Acids Res. 38, 3869–3879 (2010)

Martiny, A.C., Treseder, K. & Pusch, G. Phylogenetic conservatism of functional traits in microorganisms. ISME J. 7, 830–838 (2013)

PICRUSt accuracy across various environmental microbiomes

PICRUSt can produce results that make sense!

Tributary contaminated by old sulfur mine

Sulfur Metabolism

STAMP

• Input is .biom table and “mapping file”

• Can input matrix of taxonomy or functional assignments

• powerful statistical options

• Can subsample data on the fly

• Generates OK plots

Using STAMP to identify SEED subsystems which are differentially abundant between Candidatus Accumulibacter phosphatis sequences obtained from a pair of enhanced biological phosphorus removal (EBPR) sludge metagenomes (data originally described in Parks and Beiko, 2010).

phyloseq R package

• Create a phyloseq object
  – .biom table
  – “mapping file”
  – phylogenetic tree

• google: phyloseq demo

• do stats and make plots that you can prettify with ggplot2

phinch.org

• Add metadata to biom table

• Upload to phinch

Phinch allows you to manipulate and explore your data

Lots of data cannot compensate for a poorly designed experiment

Bioinformatics cannot save a poorly designed experiment

Design your experiment.

replication

controls

biases

233 sequences

Read number distribution for 60 samples on one MiSeq run

Read number distribution for 95 samples on one MiSeq run

318 sequences
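Uneven read depth like this is commonly handled by subsampling every sample to the same number of reads, which means samples below that depth drop out. The sketch below shows one way to do that with random draws without replacement; the depth of 500, the OTU counts, and the function name `rarefy` are assumptions for illustration only.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample one sample's OTU counts to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads (it drops out).
    """
    if sum(counts.values()) < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` of them.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    picked = rng.sample(pool, depth)
    return {otu: picked.count(otu) for otu in counts}


rng = random.Random(0)  # fixed seed so the example is repeatable
samples = {
    "deep": {"OTU_1": 600, "OTU_2": 300, "OTU_3": 100},
    "shallow": {"OTU_1": 150, "OTU_2": 83},  # 233 reads in total (made-up split)
}
rarefied = {name: rarefy(c, 500, rng) for name, c in samples.items()}
```

The choice of depth is a trade-off: a higher depth keeps more reads per sample but loses more samples.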

Figure 3. Predicted and observed frequencies of sequence reads from each organism.

Morgan JL, Darling AE, Eisen JA (2010) Metagenomic Sequencing of an In Vitro-Simulated Microbial Community. PLoS ONE 5(4): e10209. doi:10.1371/journal.pone.0010209 http://www.plosone.org/article/info:doi/10.1371/journal.pone.0010209

Standardize collection, storage, and laboratory procedures

Beware the chimera

The How:

The Why:

• too many cycles

• extension time too short

• close relatives in the mix

• less abundant taxa
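A chimera is a PCR product stitched together from two different templates. As a toy picture of what a chimera checker looks for, the sketch below tests whether a query sequence can be explained as the start of one parent joined to the end of another. It assumes equal-length, pre-aligned sequences and exact matches, which real chimera detection tools do not require; the sequences and the function name are made up.

```python
def find_chimera_breakpoint(query, parent_a, parent_b):
    """Return the first i such that query == parent_a[:i] + parent_b[i:].

    Assumes all three sequences are already aligned and of equal length.
    Returns None if no such breakpoint exists or the query is simply
    identical to one parent.
    """
    if query in (parent_a, parent_b):
        return None
    for i in range(1, len(query)):
        if query[:i] == parent_a[:i] and query[i:] == parent_b[i:]:
            return i
    return None


# Made-up aligned sequences from two close relatives.
parent_a = "AAAACCCCGGGG"
parent_b = "AAAATTCCGGTT"
chimera = parent_a[:6] + parent_b[6:]  # start of A joined to end of B
breakpoint_pos = find_chimera_breakpoint(chimera, parent_a, parent_b)
```

Because the two parents are identical over part of their length, several breakpoints would explain this query equally well; the function simply reports the first one. That ambiguity is part of why chimeras between close relatives are hard to spot.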

Include kit / negative controls

16S rRNA gene sequencing of a pure Salmonella bongori culture

Child nasopharyngeal samples from Thailand, appears to show age-related clustering

Child nasopharyngeal samples from Thailand, extraction kit lot # explains the pattern better

Child nasopharyngeal samples from Thailand, loss of clustering after excluding contaminant OTUs
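Excluding contaminant OTUs can be sketched as follows, assuming contaminants are defined simply as any OTU that shows up in a kit or negative control. That definition, the threshold parameter, and all names and counts below are assumptions for illustration; blanket removal like this can also discard real taxa, so treat it as a starting point rather than a recommended procedure.

```python
def remove_control_otus(otu_table, control_samples, min_control_reads=1):
    """Drop OTUs seen in negative controls from every non-control sample.

    otu_table: dict sample -> dict OTU id -> read count
    control_samples: names of the kit / negative control samples
    Returns (cleaned table without the controls, set of flagged OTU ids).
    """
    contaminants = {
        otu
        for name in control_samples
        for otu, n in otu_table[name].items()
        if n >= min_control_reads
    }
    cleaned = {
        name: {otu: n for otu, n in counts.items() if otu not in contaminants}
        for name, counts in otu_table.items()
        if name not in control_samples
    }
    return cleaned, contaminants


# Made-up counts: OTU_9 appears in the blank extraction control.
table = {
    "child_1": {"OTU_1": 50, "OTU_9": 30},
    "child_2": {"OTU_1": 10, "OTU_2": 5, "OTU_9": 80},
    "kit_blank": {"OTU_9": 120, "OTU_2": 0},
}
cleaned, flagged = remove_control_otus(table, control_samples={"kit_blank"})
```

Raising `min_control_reads` makes the filter less aggressive, which matters when a genuine sample taxon bleeds into a control at very low counts.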

Schloss: reducing artifacts

Last Bit of Ugly Data

• mock community consisting of 21 taxa

• 3 different regions amplified

• 4 different sequencing centers

Fecal sample

“Perfection is the enemy of progress”

WORDS OF WISDOM

Consult an expert.

Include replicates and controls.

Design your experiment!

Have a specific question.

Seek to answer THAT question.

(no pilots!)

Do microbes differ between your treatments?

Yes.

Know the answer to the question:

So now what?

(follow-up experiments)

Avoid metagenomics.
