Scalable metabolic reconstruction for metagenomic data and the human microbiome

Post on 23-Feb-2016


TRANSCRIPT

Page 1: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Scalable metabolic reconstruction for metagenomic data and the human microbiome

Sahar Abubucker, Nicola Segata, Johannes Goll, Alyxandria M. Schubert, Jacques Izard, Brandi L. Cantarel, Beltran Rodriguez-Mueller, Jeremy Zucker, Mathangi Thiagarajan, Bernard Henrissat, Owen White, Scott T. Kelley, Barbara Methé, Patrick D. Schloss, Dirk Gevers, Makedonka Mitreva, Curtis Huttenhower

07-16-11
Harvard School of Public Health
Department of Biostatistics

Page 2: Scalable metabolic reconstruction for metagenomic data and the human microbiome

What’s metagenomics?

• Total collection of microorganisms within a community (also microbial community or microbiota)
• Total genomic potential of a microbial community
• Total biomolecular repertoire of a microbial community
• Study of uncultured microorganisms from the environment, which can include humans or other living hosts

Page 3: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Valm et al, PNAS 2011

Page 4: Scalable metabolic reconstruction for metagenomic data and the human microbiome

What to do with your metagenome?

• Diagnostic or prognostic biomarker for host disease
• Public health tool monitoring population health and interactions
• Comprehensive snapshot of microbial ecology and evolution
• Reservoir of gene and protein functional information

Who’s there?

What are they doing?

Who’s there varies: your microbiota is plastic and personalized.

What they’re doing is adapting to their environment: you, your body, and your environment.

Page 5: Scalable metabolic reconstruction for metagenomic data and the human microbiome

The Human Microbiome Project for a normal population

• 300 people, 15 (18) body sites
• Multifaceted analyses
• 2 clinical centers, 4 sequencing centers, data generation, technology development, computational tools, ethics…
• Multifaceted data
• Human population
• Microbial population
• Novel organisms
• Biotypes
• Viruses
• Metabolism
• >6,000 samples
• >50M 16S sequences
• 4 Tbp unique metagenomic sequence
• >1,900 reference genomes
• Full clinical metadata

Page 6: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Metabolic/Functional Reconstruction: The Goal

Gene expression, SNP genotypes

Healthy/IBD, BMI, Diet

Taxon abundances, enzyme family abundances, pathway abundances

LEfSe: LDA Effect Size

Metagenomic biomarker discovery

Nicola Segata

http://huttenhower.sph.harvard.edu/lefse

Page 7: Scalable metabolic reconstruction for metagenomic data and the human microbiome

HMP: Metabolic reconstruction

HUMAnN: HMP Unified Metabolic Analysis Network
http://huttenhower.sph.harvard.edu/humann

WGS reads → Genes (KOs) → Pathways/modules (KEGG)

• Input: 100 subjects, 1-3 visits/subject, ~7 body sites/visit; 10-200M reads/sample, 100bp reads
• Functional sequence databases: KEGG + MetaCyc; CAZy, TCDB, VFDB, MEROPS…
• BLAST → Genes: [equation garbled in the transcript; the surviving symbols are a gene abundance c(g), per-alignment p-values p_a entering as (1 − p_a), and a 1/|g| normalization]
• Genes → Pathways: MinPath (Ye 2009)
• Taxonomic limitation: remove pathways in taxa < average
• Smoothing: Witten-Bell

$$c(g) = \begin{cases} \dfrac{T}{V-T}\cdot\dfrac{N}{N+T} & \text{if } c(g) = 0 \\ c(g)\cdot\dfrac{N}{N+T} & \text{otherwise} \end{cases}$$

• Gap filling: c(g) = max( c(g), median )
• Xipe: distinguish zero/low (Rodriguez-Mueller in review)
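
The smoothing and gap-filling steps named on this slide can be illustrated with a short Python sketch. This is not HUMAnN's own code. It assumes the standard count form of Witten-Bell discounting, where N is the total abundance over all gene families, T is the number of families with a nonzero abundance, and V is the number of families considered. It also assumes that the median in gap filling is taken over the genes of one pathway. The function names and the toy gene labels are hypothetical.

```python
from statistics import median


def witten_bell_smooth(counts):
    """Witten-Bell discounting in count form (assumed reading of the slide).

    counts maps gene family -> abundance, with 0 for unobserved families.
    Observed abundances are scaled by N / (N + T); the mass removed is
    shared equally among the V - T unobserved families.
    """
    n_total = sum(counts.values())                      # N
    t_seen = sum(1 for c in counts.values() if c > 0)   # T
    v_all = len(counts)                                 # V
    if t_seen == 0 or t_seen == v_all:
        # Nothing observed, or nothing unobserved: no mass to redistribute.
        return dict(counts)
    scale = n_total / (n_total + t_seen)
    unseen = (t_seen / (v_all - t_seen)) * scale
    return {g: (c * scale if c > 0 else unseen) for g, c in counts.items()}


def gap_fill(counts, pathway_genes):
    """Raise each gene in a pathway to at least the pathway median.

    Implements c(g) = max(c(g), median), assuming the median is computed
    over the genes listed in pathway_genes.
    """
    med = median(counts[g] for g in pathway_genes)
    filled = dict(counts)
    for g in pathway_genes:
        filled[g] = max(counts[g], med)
    return filled


# Toy abundances, made up for illustration only.
toy = {"g1": 6.0, "g2": 2.0, "g3": 0.0, "g4": 0.0}
smoothed = witten_bell_smooth(toy)
filled = gap_fill(smoothed, ["g1", "g2", "g3"])
```

Under these assumptions the smoothing step conserves total abundance: the observed families give up T·N/(N+T) in total, which is exactly what the unobserved families receive.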

Page 8: Scalable metabolic reconstruction for metagenomic data and the human microbiome

HUMAnN: Metabolic reconstruction

[Figure: two panels, pathway coverage and pathway abundance; pathways on one axis, samples on the other, grouped by body site: Vaginal, Skin, Nares, Gut, Oral (SupP), Oral (BM), Oral (TD)]

Page 9: Scalable metabolic reconstruction for metagenomic data and the human microbiome

HUMAnN: Validating gene and pathway abundances on synthetic data

Validated on individual gene families, module coverage, and abundance

• 4 synthetic communities: low (20 org.) and high (100 org.) complexity; even and lognormal abundances
• Best-BLAST-hit overshoots false positives, and undershoots real pathways as a result
• HUMAnN FNs: short genes (<100bp), taxonomically rare pathways
• HUMAnN FPs: large and multicopy (not many in bacteria)

[Figure: individual gene families, ρ=0.91]
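
A validation score such as the ρ reported here can be computed by correlating estimated abundances against the known abundances of a synthetic community. The slide does not say which correlation ρ denotes; the sketch below assumes a Spearman rank correlation, and the numbers are made-up toy values rather than HMP or HUMAnN results. All function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example: known vs. estimated abundances for five gene families.
true_abundance = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman(true_abundance, estimated)
```

A rank-based measure is a natural choice when abundances span several orders of magnitude, since it is insensitive to the scale of individual values.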

Page 10: Scalable metabolic reconstruction for metagenomic data and the human microbiome

A portrait of the healthy human microbiome: Who’s there vs. what they’re doing

[Figure: panels of relative abundance over pathways and over phylotypes, by body site: Vaginal, Skin, Nares, Gut, Oral (SupP), Oral (BM), Oral (TD)]

Page 11: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Niche specialization in human microbiome function

Metabolic modules in the KEGG functional catalog enriched at one or more body habitats

http://huttenhower.sph.harvard.edu/lefse
Nicola Segata

Page 12: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Proteoglycan degradation by the gut microbiota

[Figure labels: AA core; Glycosaminoglycans (polysaccharide chains)]

Page 13: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Proteoglycan degradation: From pathways to enzymes

[Figure: enzyme relative abundance, scale marks 10^-3 and 10^-8]

• Heparan sulfate degradation missing due to the absence of heparanase, a eukaryotic enzyme
• Other pathways not bottlenecked by individual genes
• HUMAnN links microbiome-wide pathway reconstructions → site-specific pathways → individual gene families

Page 14: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Patterns of variation in human microbiome function by niche

Page 15: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Patterns of variation in human microbiome function by niche

• Three main axes of variation
• Eukaryotic exterior
• Low-diversity vaginal
• Gut metabolism
• Oral vs. tooth hard surface
• Only broad patterns: every human-associated habitat is functionally distinct!

Page 16: Scalable metabolic reconstruction for metagenomic data and the human microbiome

How do microbes and function vary within each body site across the population?

Page 17: Scalable metabolic reconstruction for metagenomic data and the human microbiome

How do body sites compare between individuals across the population?

Page 18: Scalable metabolic reconstruction for metagenomic data and the human microbiome

HMP: Prevalence of species (OTUs) across the population

[Figure axis: cumulative prevalence]

Page 19: Scalable metabolic reconstruction for metagenomic data and the human microbiome

HMP: Prevalence of pathways across the population

• 16 (of 251) modules strongly “core” at 90%+ coverage in 90%+ individuals at 7 body sites
• 24 modules at 33%+ coverage
• 71 modules (28%) weakly “core” at 33%+ coverage in 66%+ individuals at 6+ body sites
• Contrast zero phylotypes or OTUs meeting this threshold!
• Only 24 modules (<10%) differentially covered by body site
• Compare with 168 modules (>66%) differentially abundant by body site

[Figure axis: cumulative prevalence]
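
The “core” thresholds on this slide can be read as a simple counting rule, sketched below in Python. The sketch assumes one interpretation: a module is core if its coverage reaches the coverage threshold in at least the stated fraction of individuals, at each of at least the stated number of body sites. The nested-dictionary layout, the function name, and the toy values are hypothetical and are not HMP data.

```python
def core_modules(coverage, min_coverage=0.9, min_fraction=0.9, min_sites=7):
    """List modules that pass the coverage rule at enough body sites.

    coverage[site][subject][module] is a coverage value in [0, 1];
    a module missing for a subject is treated as coverage 0.
    """
    sites_passed = {}
    for site, subjects in coverage.items():
        if not subjects:
            continue
        modules = set()
        for module_cov in subjects.values():
            modules.update(module_cov)
        for module in modules:
            n_ok = sum(
                1
                for module_cov in subjects.values()
                if module_cov.get(module, 0.0) >= min_coverage
            )
            if n_ok / len(subjects) >= min_fraction:
                sites_passed[module] = sites_passed.get(module, 0) + 1
    return sorted(m for m, k in sites_passed.items() if k >= min_sites)


# Toy data: two body sites, two subjects, two modules.
toy_coverage = {
    "gut": {
        "s1": {"M1": 1.0, "M2": 0.5},
        "s2": {"M1": 0.95, "M2": 0.4},
    },
    "skin": {
        "s1": {"M1": 0.92, "M2": 0.4},
        "s2": {"M1": 1.0},
    },
}
strong = core_modules(toy_coverage, min_sites=2)
weak_one_site = core_modules(
    toy_coverage, min_coverage=0.33, min_fraction=0.66, min_sites=1
)
```

With the default thresholds and `min_sites=7` the function mirrors the strongly core definition as read here; passing `min_coverage=0.33, min_fraction=0.66, min_sites=6` mirrors the weakly core one.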

Page 20: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Linking function to community composition

[Figure: taxa and correlated metabolic pathways across 52 posterior fornix microbiomes]

F-type ATPase, THF

Sugar transport

Phosphate and peptide transport

AA and small molecule biosynthesis

Embden-Meyerhof glycolysis, phosphotransferases

Eukaryotic pathways

Plus ubiquitous pathways: transcription, translation, cell wall, portions of central carbon metabolism…

Lactobacillus crispatus

Lactobacillus jensenii

Lactobacillus gasseri

Lactobacillus iners

Gardnerella/Atopobium

Candida/Bifidobacterium

Page 21: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Linking communities to host phenotype

[Figure panels: normalized relative abundance against vaginal pH (posterior fornix) and against Body Mass Index; top correlates with BMI in stool]

Vaginal pH, community metabolism, and community composition represent a strong, direct link between phenotype and function in these data.

Page 22: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Microbial biomolecular function and metabolism in the human microbiome: the story so far?

• HUMAnN
– Accurate metagenomic metabolic reconstruction
– Sequences → genes → pathways → phenotypes
– Validated on 4x synthetic communities

• Who’s there varies even in health
– What they’re doing doesn’t (as much)

• There are patterns in this variation
– Communities in related environments adapt using related functions
– Function correlates with membership and phenotype

• ~1/3 to 2/3 of human metagenome characterized
– Job security!

Page 23: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Ask both what you can do for your microbiome and what your microbiome can do for you

Page 24: Scalable metabolic reconstruction for metagenomic data and the human microbiome

Thanks!

Jacques Izard, Alyx Schubert, Pat Schloss

Nicola Segata, Levi Waldron

Fah Sathira

Ben Ganzfried

Sahar Abubucker

Larisa Miropolsky

Vagheesh Narasimhan

Interested? We’re recruiting students and postdocs!

Human Microbiome Project

HMP Metabolic Reconstruction

George Weinstock, Owen White, Rob Knight, Makedonka Mitreva, Erica Sodergren, Mihai Pop, Vivien Bonazzi, Jane Peterson, Lita Proctor

Johannes Goll, Yuzhen Ye

Beltran Rodriguez-Mueller, Jeremy Zucker

Mathangi Thiagarajan, Brandi Cantarel, Qiandong Zeng

Maria Rivera, Barbara Methé

Bill Klimke, Daniel Haft

Dirk Gevers

Bruce Birren, Ramnik Xavier, Doyle Ward, Eric Alm, Ashlee Earl, Lisa Cosimi

http://huttenhower.sph.harvard.edu
http://huttenhower.sph.harvard.edu/humann
http://huttenhower.sph.harvard.edu/lefse

Page 25: Scalable metabolic reconstruction for metagenomic data and the human microbiome
Page 26: Scalable metabolic reconstruction for metagenomic data and the human microbiome

HMP: Prevalence of genera (phylotypes) across the population

[Figure axis: cumulative prevalence]