December 2006 MAGE and the Biospecimen Research Database Experiment Design and other issues Ian Fore, D.Phil U.S. National Cancer Institute - Center for Bioinformatics Incorporating material from NCI Office of Biorepositories and Biospecimen Research January 24, 2008


Page 1: caBIG: Ian Fore

December 2006

MAGE and the Biospecimen Research Database

Experiment Design and other issues

Ian Fore, D.Phil

U.S. National Cancer Institute - Center for Bioinformatics

Incorporating material from NCI Office of Biorepositories and Biospecimen Research

January 24, 2008

Page 2: caBIG: Ian Fore

Why?

• View from the outside
  • Discussion of experimental factor
  • Do not want to lose a key feature

• Identify some issues
  • Expect many are already in hand

• State the use case
  • So we don’t lose the feature

• Describe how we are using experimental factor in the Biospecimen Research Database
  • Inspired by MAGE model!
  • Vision for how it should interact with MAGE

Page 3: caBIG: Ian Fore

Pre- and Post- Acquisition Variables Impact Clinical and Research Outcomes

• Effects on Clinical Outcomes

• Potential for incorrect diagnosis

• Morphological/immunostaining artifact

• Skewed clinical chemistry results

• Potential for incorrect treatment

• Therapy linked to a diagnostic test on a biospecimen (e.g., HER2 in breast cancer)

• Effects on Research Outcomes

• Irreproducible results

• Variations in gene expression data

• Variations in post-translational modification data

• Misinterpretation of artifacts as biomarkers

Page 4: caBIG: Ian Fore

Variables of Biospecimen Acquisition

Post-acquisition variables:

Time at room temperature

Temperature of room

Type of fixative

Time in fixative

Rate of freezing

Size of aliquots

Type of collection container

Biomolecule extraction method

Storage temperature

Storage duration

Storage in vacuum

Pre-acquisition variables:

Antibiotics

Other drugs

Type of anesthesia

Duration of anesthesia

Arterial clamp time

Blood pressure variations

Intra-op blood loss

Intra-op blood administration

Intra-op fluid administration

Pre-existing medical conditions

Patient gender

Page 5: caBIG: Ian Fore

Use case 1

• Actor - An investigator who does not work in a microarray lab and who does not understand the details of how microarray experiments are performed

• Description:
  • The investigator wishes to understand what happens to gene expression under particular experimental conditions. They are happy to assume that the experiments were done correctly and that appropriate QA and/or peer review took place. They want to work with the “high level” output from microarray experiments - experimental factors vs gene expression.

Page 6: caBIG: Ian Fore

Experiment design - example

• Drug
  • Control
  • WY27127
  • WY26382

• Dose
  • 10, 30, 100, 300 mg/kg

• Time
  • -10, -5, 0, 5, 10, 20, 30, 60, 120 mins

• Rat
  • 4 rats per drug treatment
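The factor levels above can be expanded into the list of experimental conditions they imply. The following is a minimal sketch, assuming a fully crossed design (the slide does not say whether the control group receives every dose); the dictionary layout and key names are illustrative, not part of MAGE.

```python
from itertools import product

# Factor levels copied from the slide; the key names are assumptions.
factors = {
    "drug": ["Control", "WY27127", "WY26382"],
    "dose_mg_per_kg": [10, 30, 100, 300],
    "time_min": [-10, -5, 0, 5, 10, 20, 30, 60, 120],
}

# Every combination of factor levels, assuming a fully crossed design.
conditions = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

print(len(conditions))  # 3 drugs x 4 doses x 9 time points = 108
print(conditions[0])
```

Rat is left out of the crossing because it is the replicated unit (4 rats per drug treatment) rather than a level the investigator chooses.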

Page 7: caBIG: Ian Fore
Page 8: caBIG: Ian Fore

Dose in protocol text - 200 mg/kg/day

<Protocol text="Rat treated q.d. by oral gavage with 200 mg/kg/day clofibrate
(formulated in 0.5 % methylcellulose (w/v) plus 0.1 % polysorbate 80 (v/v) in
distilled water). Animal sacrificed after five days, at which time a portion of
liver was removed and flash frozen in liquid N2, then stored at -80 C. All
experimental procedures were approved by the Pharmacia Institutional Animal
Care and Use Committee and were performed in compliance with laws regarding
humane treatment of laboratory animals."
identifier="P-MEXP-357"
name="SAMPLETREATPRTCL357">

Page 9: caBIG: Ian Fore

Dose in protocol text - 400 mg/kg/day

<Protocol text="Rat treated q.d. by oral gavage with 400 mg/kg/day clofibrate
(formulated in 0.5 % methylcellulose (w/v) plus 0.1 % polysorbate 80 (v/v) in
distilled water). Animal sacrificed after five days, at which time a portion of
liver was removed and flash frozen in liquid N2, then stored at -80 C. All
experimental procedures were approved by the Pharmacia Institutional Animal
Care and Use Committee and were performed in compliance with laws regarding
humane treatment of laboratory animals."
identifier="P-MEXP-357"
name="SAMPLETREATPRTCL357">
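In the two protocol entries, the dose appears only inside the free-text `text` attribute, so software that wants dose as a factor has to mine the text for it. The following is a rough sketch of that workaround, using abbreviated copies of the two texts; the regular expression and the function name `extract_dose` are hypothetical and would not generalize to protocols worded differently.

```python
import re

# Abbreviated protocol texts from the two slides.
protocols = [
    "Rat treated q.d. by oral gavage with 200 mg/kg/day clofibrate",
    "Rat treated q.d. by oral gavage with 400 mg/kg/day clofibrate",
]

# Hypothetical pattern: a number, the unit mg/kg/day, then the compound name.
DOSE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*mg/kg/day\s+(\w+)")


def extract_dose(text):
    """Return (dose, compound) if the text matches the pattern, else None."""
    match = DOSE_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


doses = [extract_dose(text) for text in protocols]
print(doses)
```

A protocol written as "0.2 g/kg daily" would return `None` here, which is why a structured factor value is preferable to text mining.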

Page 10: caBIG: Ian Fore

Use case 2

• Actor - A scientist who wants to analyze microarray data but has only a moderate understanding of statistical analysis techniques.

• (a different person from the one in the previous use case)

• Description - The scientist uses a statistical analysis tool with a wizard-like approach that helps them understand how the data might be analyzed. The microarray experiment data contains sufficient description for the tool to extract the design of the experiment and suggest appropriate ways to analyze the data.

• Implementation notes:
  • Need the factors
  • And their nature in the design - i.e. whether “intended” or not

Page 11: caBIG: Ian Fore

Experiment design - example

• Drug - “intended”
  • Control
  • WY27127
  • WY26382

• Dose - “intended”
  • 10, 30, 100, 300 mg/kg

• Time - “intended”
  • -10, -5, 0, 5, 10, 20, 30, 60, 120 mins

• Rat
  • 4 rats per drug treatment

Page 12: caBIG: Ian Fore

Experiment design - example

• Drug - “fixed effect”
  • Control
  • WY27127
  • WY26382

• Dose - “fixed effect”
  • 10, 30, 100, 300 mg/kg

• Time - “fixed effect”
  • -10, -5, 0, 5, 10, 20, 30, 60, 120 mins

• Rat - “random effect”
  • 4 rats per drug treatment
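Once each factor carries a fixed or random label, an analysis tool of the kind described in use case 2 could propose a model automatically. The following is a minimal sketch, assuming the tool emits a formula string in lme4-style notation, crosses all fixed effects, and gives each random effect a random intercept. The `Factor` class and `suggest_formula` function are hypothetical, and these modelling choices are illustrative rather than recommended.

```python
from dataclasses import dataclass


@dataclass
class Factor:
    name: str
    effect: str  # "fixed" or "random", as labelled on the slide


factors = [
    Factor("Drug", "fixed"),
    Factor("Dose", "fixed"),
    Factor("Time", "fixed"),
    Factor("Rat", "random"),
]


def suggest_formula(response, factors):
    """Build a mixed-model formula string in lme4-style notation."""
    fixed = [f.name for f in factors if f.effect == "fixed"]
    random = [f.name for f in factors if f.effect == "random"]
    # Fixed effects are fully crossed; an intercept-only model if none exist.
    terms = [" * ".join(fixed)] if fixed else ["1"]
    # Each random effect contributes a random intercept term.
    terms += [f"(1 | {name})" for name in random]
    return f"{response} ~ " + " + ".join(terms)


formula = suggest_formula("expression", factors)
print(formula)  # expression ~ Drug * Dose * Time + (1 | Rat)
```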

Page 13: caBIG: Ian Fore

Use case 3

• Actor - A database or system that wishes to index and publish microarray data
  • This is not the same thing as a microarray data repository - it is a more topic-based database or data mart.

• Description: The database wants to be able to automatically extract the experiment design from the experimental data file

• Example:
  • Biospecimen Research Database
  • http://brd.nci.nih.gov/BRN
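Automatic extraction of this kind depends on the factors being present as structured elements rather than free text. The following sketch shows the idea on a deliberately simplified, hypothetical XML fragment. The element and attribute names are illustrative and do not reproduce the actual MAGE-ML schema, which is considerably richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified document; real MAGE-ML uses a different schema.
SAMPLE = """
<Experiment identifier="E-EXAMPLE-1">
  <ExperimentalFactor name="Drug">
    <FactorValue value="Control"/>
    <FactorValue value="WY27127"/>
    <FactorValue value="WY26382"/>
  </ExperimentalFactor>
  <ExperimentalFactor name="Dose">
    <FactorValue value="10"/>
    <FactorValue value="30"/>
  </ExperimentalFactor>
</Experiment>
"""


def extract_design(xml_text):
    """Map each factor name to the list of its values, in document order."""
    root = ET.fromstring(xml_text.strip())
    return {
        factor.get("name"): [
            value.get("value") for value in factor.findall("FactorValue")
        ]
        for factor in root.findall("ExperimentalFactor")
    }


design = extract_design(SAMPLE)
print(design)
```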

Page 14: caBIG: Ian Fore

Use case 4

• Actor - A scientist investigating cancer or another disease

• Description:
  • The scientist understands that certain pre-analytical factors influence gene expression and that specimens must be collected in a way that removes those factors as variables from the experiment. They know that they sometimes do not follow the protocol exactly and wish to annotate their experiments with those variations from protocol.
  • They visit a site such as the BRD, which provides them with structured tissue processing protocols.
  • They are able to download electronic versions of these protocols in MAGE or MAGE-like format.
  • The structured information therein can be used by the programs they use to do their experiments, and to complete the “protocol execution” data specific to their experiment when they run them.

Page 15: caBIG: Ian Fore

Use case 5

• Actor - A scientist investigating cancer or another disease

• Description: The scientist knows that they were not able to control all factors in the experiment - such as procedures performed by the surgeon or anesthetist during surgery. However, the BRD provides information about which genes are affected by these uncontrolled factors. They download gene lists from the BRD and remove these genes from consideration as relevant to cancer.
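The filtering step in this use case amounts to removing a downloaded gene list from the scientist's candidates. The following is a minimal sketch with placeholder identifiers: the gene names are invented for illustration, and the set `handling_sensitive` stands in for a list downloaded from the BRD.

```python
# Hypothetical gene identifiers, used only to illustrate the filtering step.
candidate_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]

# Stands in for a list of genes affected by uncontrolled handling factors.
handling_sensitive = {"GENE_B", "GENE_D"}

# Keep candidates in their original order, dropping the flagged genes.
filtered = [gene for gene in candidate_genes if gene not in handling_sensitive]
print(filtered)  # ['GENE_A', 'GENE_C']
```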

Page 16: caBIG: Ian Fore

MAGE-ML overhead