Designing a high quality metabolomics experiment

Grier P Page, Ph.D.
Senior Statistical Geneticist
RTI International, Atlanta Office
[email protected]
770-407-4907

RTI International is a trade name of Research Triangle Institute. www.rti.org

Post on 01-Jan-2016

Metabolomics is Powerful and Central

Designing a good study

Errors Errors Everywhere

UMSA Analysis

[Figure: UMSA analysis. Labels: Insulin Resistant, Insulin Sensitive. Panels: Day 1, Day 2]

Primary consideration of good experimental design

Understand the strengths and weaknesses of each step of the experiment.

Take these strengths and weaknesses into account in your design.

From Drug Discov Today. 2005 Sep 1;10(17):1175-82.

State the Question and Articulate the Goals

The Myth That Metabolomics Does Not Need a Hypothesis

There always needs to be a biological question in the experiment. If there is not even a question, don’t bother.

The question can be nebulous: What happens to the metabolome of this tissue when I apply Drug A?

The purpose of the question is to drive the experimental design.

Make sure the samples answer the question: cause vs. effect.

Design Issues

Known sources of non-biological error (not exhaustive) that must be addressed:
– Technician / post-doc
– Reagent lot
– Temperature
– Protocol
– Date
– Location
– Cage / field positions

Experimental Design

Biological replication is essential.

Two types of replication:
– Biological replication: samples from different individuals are analyzed
– Technical replication: the same sample is measured repeatedly

Technical replicates allow only the effects of measurement variability to be estimated and reduced, whereas biological replicates allow this to be done for both measurement variability and biological differences between cases.

Almost all experiments that use statistical inference require biological replication.

How many replicates?

Controlled experiments (cell lines, mice, rats): 8-12 per group.

Human studies, discovery: 20+ per group.

For predictive models: 100+ per group; need model building and validation sets.

The more the better, always.

Experimental Conduct

All experiments are subject to non-biological variability that can confound any study.

Control Everything!

Know what you are doing.

Practice! Practice!

What if you can’t control or make all things uniform?

Randomize

Orthogonalize

What are Orthogonalization and Randomization?

Orthogonalization: spreading the biological sources of error evenly across the non-biological sources of error.
– Maximally powerful for known sources of error.

Randomization: spreading the biological sources of error at random across the non-biological sources of error.
– Useful for controlling for unknown sources of error.

Examples of Orthogonalization and Randomization

The experiment:

Sample #  Treatment  Variety
1         1          1
2         1          2
3         1          1
4         1          2
5         2          1
6         2          2
7         2          1
8         2          2

Orthogonalize:

Order  Sample
1      1
2      2
3      5
4      6
5      8
6      7
7      4
8      3

Randomize:

Order  Sample
1      7
2      6
3      4
4      1
5      2
6      8
7      5
8      3
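The two run orders in the example can be generated programmatically. The following is a minimal Python sketch, not the procedure used to build the slide: the function names, the `n_blocks` parameter (standing in for a known non-biological factor such as run day), and the seed handling are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# The eight samples from the example: sample number -> (treatment, variety).
SAMPLES = {
    1: (1, 1), 2: (1, 2), 3: (1, 1), 4: (1, 2),
    5: (2, 1), 6: (2, 2), 7: (2, 1), 8: (2, 2),
}


def randomized_order(samples, seed=None):
    """Return a run order drawn completely at random."""
    rng = random.Random(seed)
    order = list(samples)
    rng.shuffle(order)
    return order


def orthogonalized_order(samples, n_blocks=2, seed=None):
    """Return a run order in which each block (hypothetical: a run day)
    receives every treatment x variety combination equally often."""
    rng = random.Random(seed)
    # Group the samples by their treatment x variety combination.
    by_cell = defaultdict(list)
    for sample, cell in samples.items():
        by_cell[cell].append(sample)
    # Deal the members of each combination out across the blocks in turn.
    blocks = [[] for _ in range(n_blocks)]
    for members in by_cell.values():
        rng.shuffle(members)
        for i, sample in enumerate(members):
            blocks[i % n_blocks].append(sample)
    # Randomize the order within each block, then concatenate the blocks.
    for block in blocks:
        rng.shuffle(block)
    return [sample for block in blocks for sample in block]
```

With the eight samples and two blocks, each half of the orthogonalized order contains exactly one sample from every treatment-by-variety combination, which is the same pattern as the Orthogonalize order in the example (samples 1, 2, 5, 6 followed by 8, 7, 4, 3).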

Statistical analyses have assumptions too

Statistical analyses

Supervised analyses (linear models, etc.)
– Assume IID (independent and identically distributed) observations
– Normality
– Sometimes can rely on the central limit theorem
– ‘Weird’ variances
– Using fold change alone as a statistic is not valid.
– ‘Shrinkage’ and/or use of Bayes can be a good thing.

False-discovery rate is a good alternative to conventional multiple-testing approaches.

Pathway testing is desirable.
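The slide recommends false-discovery rate control without naming a procedure. As one possible illustration, the sketch below implements the Benjamini-Hochberg adjustment; the choice of that procedure and the function name are assumptions, not something stated in the presentation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A metabolite would then be called significant at a chosen FDR level (for example 0.05) when its adjusted value falls at or below that level.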

Classification

Supervised classification
– Supervised-classification procedures require independent cross-validation.
– See MAQC-II recommendations: Nat Biotechnol. 2010 August; 28(8): 827–838. doi:10.1038/nbt.1665.

Wholly separate model building and validation stages.

Can be 3 stage with multiple models tested.

Unsupervised classification
– Unsupervised classification should be validated using resampling-based procedures.
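To make the "wholly separate model building and validation stages" point concrete, here is a small Python sketch that splits sample indices into two disjoint sets while keeping class proportions similar. The function name, the 30% default, and the stratification are illustrative assumptions; the MAQC-II paper should be consulted for the actual recommendations.

```python
import random


def split_build_validate(labels, validation_fraction=0.3, seed=0):
    """Split sample indices into disjoint model-building and validation
    sets, stratified by class label."""
    rng = random.Random(seed)
    # Collect the sample indices belonging to each class.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    build, validate = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out at least one sample per class for validation.
        n_val = max(1, round(len(indices) * validation_fraction))
        validate.extend(indices[:n_val])
        build.extend(indices[n_val:])
    return sorted(build), sorted(validate)
```

Any feature selection, scaling, or parameter tuning would be done on the build set only, and the validation set would be touched once, at the end, to estimate performance.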

Unsupervised classification - continued

Unsupervised analysis methods
– Cluster analysis
– Principal components
– Separability analysis

All have assumptions and input parameters, and changing them results in very different answers.

Sample size estimation for metabolomics studies

There is strength in numbers: power and sample size.

Unsupervised analyses
– Principal components, clustering, heat maps and variants
– These are actually data transformations or data display rather than hypothesis testing, thus it is unclear if sample size estimation is appropriate or even possible.
– Stability of clustering may be appropriate to think about. Garge et al 2005 suggested 50+ samples for any stability.

Sample size in supervised experiments

Supervised analyses
– Linear models and variants
– Methods are still evolving, but we suggest the approach we developed for microarrays may be appropriate for metabolomics (being evaluated)
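The microarray-derived approach mentioned above is not described in the slides, so it is not reproduced here. As a rough point of reference only, the sketch below uses the textbook normal-approximation formula for a two-group comparison of a single metabolite, with an optional Bonferroni correction for the number of metabolites tested. The function name, parameters, and the Bonferroni choice are assumptions for illustration.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8, n_tests=1):
    """Approximate samples per group for a two-sided, two-group comparison.

    effect_size is the standardized difference (difference in means divided
    by the common standard deviation). n_tests applies a Bonferroni
    correction to alpha.
    """
    standard_normal = NormalDist()
    alpha_per_test = alpha / n_tests
    z_alpha = standard_normal.inv_cdf(1 - alpha_per_test / 2)
    z_beta = standard_normal.inv_cdf(power)
    # n = 2 * ((z_alpha + z_beta) / d)^2, rounded up to a whole sample.
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)
```

For a standardized difference of 1.0 at alpha 0.05 and 80% power this gives 16 per group for a single test, and the number grows as `n_tests` increases, which is in line with the "more the better" advice given earlier in the talk.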

Metabolomics does not reveal everything and different technologies show different things

Technology and detection evolve over time.

Technologies are not in perfect agreement

The human urine metabolome

Sample, Image and Data Quality Checking

Metabolite quality

Still evolving field

RTI is one of the Metabolomics Reference Standards Synthesis Centers

Know your data: what should it look like?

These are OK

These are not OK

One bad sample can contaminate an experiment

Histogram of p-values

Potentially Bad Data

Histogram of p-values with bad data removed
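The p-value histograms referred to above are easy to tabulate before and after removing a suspect sample. When no metabolite differs between groups, p-values are uniformly distributed, so a healthy analysis typically shows a roughly flat histogram, possibly with a peak near zero. The sketch below is a minimal illustration; the function name and the ten-bin default are assumptions.

```python
def p_value_histogram(p_values, n_bins=10):
    """Count p-values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # A p-value of exactly 1.0 is placed in the last bin.
        bin_index = min(int(p * n_bins), n_bins - 1)
        counts[bin_index] += 1
    return counts


def format_histogram(counts):
    """Render the bin counts as a simple text bar chart."""
    n_bins = len(counts)
    lines = []
    for i, count in enumerate(counts):
        low, high = i / n_bins, (i + 1) / n_bins
        lines.append(f"{low:.2f}-{high:.2f} {'#' * count} ({count})")
    return "\n".join(lines)
```

Comparing the output of `format_histogram` for the full data set against the output with a questionable sample excluded gives a quick check of whether that sample is distorting the test statistics.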

Quality of Database, Bioinformatics and Interpretative tools

Just because a database says something does not mean it is right. Read the evidence.

Databases are biased.

Databases are incomplete.

Databases have lots of data.

Understand data before you use it.

Databases are useful!

Understand what databases include, what they don’t include, and their assumptions.

Issues in the Annotation of Genes, proteins, metabolites

Gene Symbol | p-value | fc 50/21 | Gene Ontology Biological Process | Gene Ontology Cellular Component | Pathway
Aco2 | 0.746656 | 0.955755 | --- | --- | Krebs-TCA_Cycle // GenMAPP
Pdk2 | 0.967577 | 1.005459 | 6086 // acetyl-CoA biosynthesis from pyruvate | 5739 // mitochondrion // | Krebs-TCA_Cycle // GenMAPP
Pdk2 | 0.823635 | 1.02781 | 6086 // acetyl-CoA biosynthesis from pyruvate | 5739 // mitochondrion // | Krebs-TCA_Cycle // GenMAPP
Pdha2 | 0.368075 | 1.403263 | 6096 // glycolysis | 5739 // mitochondrion | Krebs-TCA_Cycle // GenMAPP
Idh1 | 0.710704 | 0.994378 | 6099 // tricarboxylic acid cycle | 5829 // cytosol | ---
Acly | 0.367315 | 0.982691 | 6099 // tricarboxylic acid cycle | 5622 // intracellular | Fatty_Acid_Synthesis // GenMAPP
Aco2 | 1.22E-06 | 0.561041 | --- | --- | Krebs-TCA_Cycle // GenMAPP
Fh1 | 6.76E-06 | 0.690515 | 6099 // tricarboxylic acid cycle // | 5739 // mitochondrion | Krebs-TCA_Cycle // GenMAPP
Atp5g3 | 1.53E-06 | 0.754735 | 6099 // tricarboxylic acid cycle // | 5739 // mitochondrion | ---
Suclg1 | 8.87E-07 | 0.694384 | 6099 // tricarboxylic acid cycle // | 5739 // mitochondrion | Krebs-TCA_Cycle // GenMAPP
Mdh1 | 5.92E-09 | 0.519311 | 6099 // tricarboxylic acid cycle // | --- | Krebs-TCA_Cycle // GenMAPP
Mor1 | 4.24E-07 | 0.617645 | 6099 // tricarboxylic acid cycle // | 5739 // mitochondrion | Krebs-TCA_Cycle // GenMAPP
Idh1 | 2.36E-06 | 0.677013 | 6099 // tricarboxylic acid cycle // | 5829 // cytosol // | ---
Idh3g | 2.19E-06 | 0.709971 | 6099 // tricarboxylic acid cycle // | 5739 // mitochondrion | Krebs-TCA_Cycle // GenMAPP
Dlst | 2.49E-07 | 0.688339 | --- | --- | ---
Sdhd | 5.13E-07 | 0.583485 | 6121 // mitochondrial electron transport, succinate to ubiquinone | 5749 // respiratory chain complex II (sensu Eukaryota) | Krebs-TCA_Cycle // GenMAPP
Sdhc | 1.82E-06 | 0.64108 | --- | --- | ---
RGD:735073 | 2.13E-07 | 0.570307 | --- | 9352 // dihydrolipoyl dehydrogenase complex | ---
Cs | 1.56E-07 | 0.560436 | --- | 5739 // mitochondrion | Krebs-TCA_Cycle // GenMAPP
RGD:621624 | 1E-06 | 0.486736 | 6099 // tricarboxylic acid cycle // | 5829 // cytosol | ---
Idh3B | 2.57E-07 | 0.694389 | --- | --- | Krebs-TCA_Cycle // GenMAPP
Mdh1 | 1.08E-05 | 0.496911 | 6099 // tricarboxylic acid cycle // | --- | Krebs-TCA_Cycle // GenMAPP
Pc | 1.91E-05 | 0.468765 | 6094 // gluconeogenesis // | 5739 // mitochondrion | Krebs-TCA_Cycle // GenMAPP
RGD:708561 | 0.004002 | 0.76777 | --- | 5913 // cell-cell adherens junction | Krebs-TCA_Cycle // GenMAPP
RGD:708561 | 0.03978 | 0.686511 | --- | 5913 // cell-cell adherens junction | Krebs-TCA_Cycle // GenMAPP
Dlat | 4.76E-06 | 0.435534 | 6086 // acetyl-CoA biosynthesis from pyruvate // inferred from electronic annotation /// 6096 // glycolysis // inferred from electronic annotation /// 8152 // metabolism // inferred from electronic annotation | 5739 // mitochondrion // | Krebs-TCA_Cycle // GenMAPP
Sdhd | 1.3E-06 | 0.64335 | 6121 // mitochondrial electron transport, succinate to ubiquinone // inferred from sequence or structural similarity | 5749 // respiratory chain complex II (sensu Eukaryota) // inferred from sequence or structural similarity | Krebs-TCA_Cycle // GenMAPP
Sdha | 7.85E-06 | 0.730667 | 6099 // tricarboxylic acid cycle // | 5739 // mitochondrion // | Krebs-TCA_Cycle // GenMAPP
Idh3a | 0.000449 | 0.690147 | 6099 // tricarboxylic acid cycle // | 5739 // mitochondrion // | Krebs-TCA_Cycle // GenMAPP
Pdk4 | 0.044616 | 1.700116 | 6086 // acetyl-CoA biosynthesis from pyruvate | 5739 // mitochondrion // | Krebs-TCA_Cycle // GenMAPP
Cs | 1.36E-06 | 0.592128 | --- | 5739 // mitochondrion // | Krebs-TCA_Cycle // GenMAPP
Acly | 0.000227 | 0.554459 | 6085 // acetyl-CoA biosynthesis | 5622 // intracellular // | Fatty_Acid_Synthesis // GenMAPP

Annotation is inconsistent across sources

Issues with pathway data

TCA cycle from Ingenuity

TCA cycle from GenMAPP

TCA cycle from Ingenuity

Share Your Data

Use shared data!

Metabolomics WorkBench http://www.metabolomicsworkbench.org/

MetaboLights

Practice compendium research – to allow others to replicate your work

Many high profile omic studies are not even technically reproducible

Overshare your data and show work

Limited in the literature so far. Some work on tissue and species metabolomes.

Use metabolomics databases

Summary

Design your experiment well.

Conduct your experiment well.

Control for non-biological sources of error.

Know what is good and bad quality data at each stage, including metabolite, image, data, and annotation.

If you are aware of these issues and control for them, highly powerful and reproducible metabolite experimentation is possible. Else you get garbage.

Share your data and use shared data.

References

The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray based predictive models. Nat Biotechnol. 2010 Aug;28(8):827-838.

Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 2006 Jan;7(1):55-65.

Baggerly K. Disclose all data in publications. Nature. 2010 Sep 23;467(7314):401. PMID: 20864982.

Repeatability of published microarray gene expression analyses. Nat Genet. 2009 Feb;41(2):149-55.

A design and statistical perspective on microarray gene expression studies in nutrition: the need for playful creativity and scientific hard-mindedness. Nutrition. 2003 Nov-Dec;19(11-12):997-1000.

39 Steps. Drug Discov Today. 2005 Sep 1;10(17):1175-82.

If time allows

RTI Regional Comprehensive Metabolomics Resource Core (RTI RCMRC)

Susan Sumner, PhD
Director, RTI RCMRC

Discovery Sciences
Proteomics and Metabolomics Programs

Contact Information for the RTI RCMRC

Susan C.J. Sumner, PhD

Director RTI RCMRC

Senior Scientist nanoSafety

RTI International

Discovery Sciences

3040 Cornwallis Drive

Research Triangle Park

North Carolina 27709

[email protected]

919-541-7479 (office)

919-622-4456 (cell)

Jason P. Burgess, PhD

Program Coordinator, RTI RCMRC

Associate Director, Discovery Sciences

RTI International

3040 Cornwallis Drive

Research Triangle Park

North Carolina 27709

[email protected]

919-541-6700 (office)

MS and NMR Instruments at RTI and DHMRI

Instrument                 RTI  DHMRI
Mass Spectrometers (38)
  LC-MS                    13   6
  GC-MS                    4    3
  GC x GC-TOF-MS           1    1
  ICP-MS                   6    1
  MALDI ToF/ToF            2    1
NMR (6)                    2    4

Some RTI Metabolomics Applications and Pilots

Experience with adolescent and adult human subject research, animal model and cell based research, e.g.:
– Apoptosis: cells
– Drug induced liver injury: animal models
– In utero exposure to chemicals and fetal imprinting: animal models
– Dietary exposure and imprinting: animal models
– NAFLD: pediatric obesity; microbiome
– Weight loss: pediatric obesity
– Preterm delivery: human subjects
– Response to vaccine: human subjects
– Nicotine withdrawal: human subjects
– Colon cancer: human subjects

Pilot and Feasibility Studies

The aim of the pilot and feasibility program is to foster collaborations and promote the use of metabolomics.

Studies will be selected through an application process.
– Application involves abstract, description of samples available (matrix type, volume, type and duration of storage, sample processing, freeze thaws, etc.), description of phenotypes, and plan for subsequent grant/contract submissions for metabolomics analysis beyond initial pilot study.

Applications may also include technology development.

Applicants must agree to deposit data in DRCC, coauthor publications, and submit joint grant/contract proposals.

Deadlines being defined