Microarray Data Analysis Day 2

Upload: donoma

Post on 28-Jan-2016



TRANSCRIPT

Page 1: Microarray Data Analysis  Day 2

Microarray Data Analysis

Day 2

Page 2: Microarray Data Analysis  Day 2

Microarray Data Process/Outline

1. Experimental Design

2. Image Analysis – scan to intensity measures (raw data)

3. Normalization – “clean” data

4. More “low level” analysis – fold change, ANOVA, (Z-score) – data filtering

5. Data mining – how to interpret > 6000 measures
– Databases
– Software
– Techniques – clustering, pattern recognition, etc.
– Comparing to prior studies, across platforms?

6. Validation
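The "low level" step in the outline (fold change, Z-score, data filtering) can be sketched in a few lines of Python. This is only an illustration, assuming one normalized control and one treatment intensity per gene; the gene names, intensity values, and the 2-fold cutoff are made up for the example and are not course data. ANOVA is not shown.

```python
import math
import statistics

# Hypothetical normalized intensities: gene -> (control, treatment). Not real data.
intensities = {
    "geneA": (100.0, 400.0),
    "geneB": (250.0, 240.0),
    "geneC": (800.0, 200.0),
    "geneD": (120.0, 130.0),
    "geneE": (500.0, 510.0),
}

def log2_ratios(data):
    """Return gene -> log2(treatment / control)."""
    return {gene: math.log2(t / c) for gene, (c, t) in data.items()}

def z_scores(values):
    """Standardize values: (x - mean) / sample standard deviation."""
    mean = statistics.mean(values.values())
    sd = statistics.stdev(values.values())
    return {gene: (v - mean) / sd for gene, v in values.items()}

def filter_by_fold(ratios, min_fold=2.0):
    """Keep genes that change at least min_fold in either direction."""
    cutoff = math.log2(min_fold)
    return sorted(gene for gene, r in ratios.items() if abs(r) >= cutoff)

ratios = log2_ratios(intensities)
z = z_scores(ratios)
changed = filter_by_fold(ratios, min_fold=2.0)
print(changed)  # ['geneA', 'geneC']
```

Working on the log2 scale treats a 4-fold increase and a 4-fold decrease symmetrically (+2 and -2), which is why the filter compares the absolute log ratio to the cutoff.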

Page 3: Microarray Data Analysis  Day 2

Today we will be using Spotfire software to filter and search your data.

10928 records in Spotfire

- 5999 S. pombe specific

- 166 Affy controls

5763 S. cerevisiae specific

Page 4: Microarray Data Analysis  Day 2

The Affy detection oligonucleotide sequences are frozen at the time of synthesis. How does this impact downstream data analysis?


Page 5: Microarray Data Analysis  Day 2

Biology and Data Mining

Page 6: Microarray Data Analysis  Day 2

Subcellular localization provides a simple goal for genome-scale functional prediction

Determine how many of the ~6000 yeast proteins go into each compartment

Page 7: Microarray Data Analysis  Day 2

Subcellular Localization, a standardized aspect of function

Nucleus

Membrane

Extra-cellular[secreted]

ER

Cytoplasm

Mitochondria

Golgi

Page 8: Microarray Data Analysis  Day 2

"Traditionally" subcellular localization is "predicted" by sequence patterns

NLS

TM-helix

Sig. Seq.

HDEL

Nucleus

Membrane

Extra-cellular[secreted]

ER

Cytoplasm

Mitochondria

Golgi

Import Sig.
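Pattern-based prediction of this kind can be sketched as a handful of string tests. The sketch below is only an illustration under simplifying assumptions: the NLS regular expression (K, then K or R, then any residue, then K or R), the 19-residue hydrophobic window used as a stand-in for a TM helix, the hydrophobic alphabet, and all example sequences are assumptions for the example, not the rules used in the lecture.

```python
import re

HYDROPHOBIC = set("AILMFVWC")            # crude hydrophobic alphabet (assumption)
NLS_PATTERN = re.compile(r"K[KR].[KR]")  # simplified NLS consensus (assumption)

def has_hdel(seq):
    """ER retention signal: the protein ends with H-D-E-L."""
    return seq.endswith("HDEL")

def has_nls(seq):
    """True if the simplified nuclear localization pattern occurs anywhere."""
    return NLS_PATTERN.search(seq) is not None

def has_tm_helix(seq, window=19, min_hydrophobic=15):
    """True if any window of residues is mostly hydrophobic."""
    for i in range(len(seq) - window + 1):
        count = sum(1 for aa in seq[i:i + window] if aa in HYDROPHOBIC)
        if count >= min_hydrophobic:
            return True
    return False

def predict_compartments(seq):
    """Return every compartment suggested by a matching pattern."""
    hits = []
    if has_nls(seq):
        hits.append("Nucleus")
    if has_tm_helix(seq):
        hits.append("Membrane")
    if has_hdel(seq):
        hits.append("ER")
    return hits

# Made-up sequences, one per pattern.
nuclear_like = "MSTPKKRKVEDG"
er_like = "MSDEQGTKHDEL"
membrane_like = "MS" + "LAVILLAVFLLIAVLLAIV" + "DE"

for seq in (nuclear_like, er_like, membrane_like):
    print(seq, predict_compartments(seq))
```

A sequence that matches no pattern returns an empty list, which is one reason sequence patterns alone leave many proteins unassigned.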

Page 9: Microarray Data Analysis  Day 2

Subcellular localization is associated with the level of gene expression

Nucleus

Membrane

Extra-cellular[secreted]

ER

Cytoplasm

Mitochondria

Golgi

[Expression Level in Copies/Cell]

Page 10: Microarray Data Analysis  Day 2

Combine Expression Information & Sequence Patterns to Predict Localization

NLS

TM-helix

Sig. Seq.

HDEL

Nucleus

Membrane

Extra-cellular[secreted]

ER

Cytoplasm

Mitochondria

Golgi

Import Sig.

[Expression Level in Copies/Cell]
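One simple way to combine the two kinds of evidence is a naive Bayes score, sketched below. The slide does not say which method was used, so the choice of naive Bayes is an assumption, and every prior and likelihood value here is invented purely to make the example run; none of them are measured frequencies.

```python
# Invented numbers for illustration only.
PRIOR = {"Nucleus": 0.3, "Cytoplasm": 0.5, "ER": 0.2}

# P(feature is present | compartment), also invented.
LIKELIHOOD = {
    "has_nls":         {"Nucleus": 0.6,  "Cytoplasm": 0.1,  "ER": 0.05},
    "has_hdel":        {"Nucleus": 0.01, "Cytoplasm": 0.01, "ER": 0.5},
    "high_expression": {"Nucleus": 0.2,  "Cytoplasm": 0.6,  "ER": 0.3},
}

def posterior(features):
    """Combine independent features into P(compartment | features)."""
    scores = {}
    for compartment, prior in PRIOR.items():
        score = prior
        for feature, present in features.items():
            p = LIKELIHOOD[feature][compartment]
            score *= p if present else (1.0 - p)
        scores[compartment] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

def most_likely(features):
    """Return the compartment with the highest posterior probability."""
    post = posterior(features)
    return max(post, key=post.get)

low_copy_with_nls = {"has_nls": True, "has_hdel": False, "high_expression": False}
abundant_no_motif = {"has_nls": False, "has_hdel": False, "high_expression": True}

print(most_likely(low_copy_with_nls))   # Nucleus
print(most_likely(abundant_no_motif))   # Cytoplasm
```

The second example shows the point of the slide: a protein with no recognizable motif can still be assigned a likely compartment from its expression level alone.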

Page 11: Microarray Data Analysis  Day 2

Major Objective: Discover a comprehensive theory of life’s organization at the molecular level

– The major actors of molecular biology: the nucleic acids, DeoxyriboNucleic Acid (DNA) and RiboNucleic Acids (RNA)

– The central dogma of molecular biology???

Proteins are very complicated molecules built from 20 different amino acids.

Epigenetics

RNA editing

Post-translational modification

Translational regulation

Page 12: Microarray Data Analysis  Day 2

Data Mining

Microarray Experiment

Image Analysis

Biology Application Domain

Experiment Design and Hypothesis

Data Analysis

Artificial Intelligence (AI)

Knowledge discovery in databases (KDD)

Data Warehouse

Validation

Statistics

Page 13: Microarray Data Analysis  Day 2

Higher Level Microarray Data Analysis

• Clustering and pattern detection

• Data mining and visualization

• Linkage between gene expression data and gene sequence/function/metabolic pathways databases

• Discovery of common sequences in co-regulated genes

• Meta-studies using data from multiple experiments

Page 14: Microarray Data Analysis  Day 2

Scatter plot of all genes in a simple comparison of two controls (A) and two treatments (B: high vs. low glucose), showing changes in expression greater than 2.2-fold and 3-fold.
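The thresholds drawn on a plot like this can be reproduced numerically. The sketch below assumes replicates are averaged before comparison and uses a signed fold change (negative for decreases); both choices, and all intensity values, are assumptions for illustration rather than the settings used to make the plot.

```python
def mean(values):
    return sum(values) / len(values)

def fold_change(controls, treatments):
    """Signed fold change: positive = up in treatment, negative = down."""
    c, t = mean(controls), mean(treatments)
    return t / c if t >= c else -(c / t)

def classify(fc, low=2.2, high=3.0):
    """Bin a signed fold change using the two thresholds from the plot."""
    size = abs(fc)
    if size > high:
        return "> 3-fold"
    if size > low:
        return "> 2.2-fold"
    return "unchanged"

# Hypothetical intensities: gene -> (two controls, two treatments).
genes = {
    "gene1": ((100.0, 120.0), (400.0, 480.0)),
    "gene2": ((200.0, 220.0), (500.0, 550.0)),
    "gene3": ((300.0, 300.0), (310.0, 290.0)),
    "gene4": ((900.0, 900.0), (200.0, 250.0)),
}

results = {name: classify(fold_change(c, t)) for name, (c, t) in genes.items()}
for name, label in results.items():
    print(name, label)
```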

Page 15: Microarray Data Analysis  Day 2

Types of Clustering

• Hierarchical
– Link similar genes, build up to a tree of all

• Self Organizing Maps (SOM)
– Split all genes into similar sub-groups
– Finds its own groups (machine learning)
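The hierarchical approach ("link similar genes, build up to a tree of all") can be sketched in plain Python. This version assumes average linkage and a distance of 1 minus the Pearson correlation, which are common choices but are not stated in the slides; the four expression profiles are invented.

```python
import math

def pearson_distance(a, b):
    """1 - Pearson correlation; near 0 means the profiles have the same shape."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (sd_a * sd_b)

def hierarchical_cluster(profiles):
    """Repeatedly join the two closest clusters; return the tree as nested tuples."""
    # Each cluster is (tree, [names of member genes]).
    clusters = [(name, [name]) for name in profiles]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                dists = [pearson_distance(profiles[a], profiles[b])
                         for a in clusters[i][1] for b in clusters[j][1]]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical log ratios over four time points.
profiles = {
    "up1":   [0.1, 0.8, 1.5, 2.0],
    "up2":   [0.0, 0.9, 1.4, 2.2],
    "down1": [2.0, 1.2, 0.5, 0.1],
    "down2": [1.9, 1.3, 0.4, 0.0],
}

tree = hierarchical_cluster(profiles)
print(tree)
```

The two rising genes are joined first with each other, as are the two falling genes, and the two pairs are linked only at the root of the tree.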

Page 16: Microarray Data Analysis  Day 2

Cluster by color/expression difference

Page 17: Microarray Data Analysis  Day 2

Self Organizing Maps
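A self-organizing map can be sketched as a row of nodes that compete for each gene: the closest node (the winner) moves toward the gene's profile, and its neighbors move a little as well. The version below is a deliberately tiny 1-D map with two nodes. The learning rate, the decay schedule, the deterministic initialization from input profiles, and the data are all assumptions for illustration; real SOM software typically uses a 2-D grid and random initialization.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_node(nodes, profile):
    """Index of the node whose weights are closest to the profile."""
    return min(range(len(nodes)),
               key=lambda k: squared_distance(nodes[k], profile))

def train_som(profiles, n_nodes=2, epochs=20, start_rate=0.5):
    """Train a 1-D SOM and return the final node weight vectors."""
    names = list(profiles)
    # Deterministic start: copy evenly spaced input profiles as initial weights.
    nodes = [list(profiles[names[(k * (len(names) - 1)) // max(1, n_nodes - 1)]])
             for k in range(n_nodes)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # shrinks from 1 toward 0
        for name in names:
            x = profiles[name]
            winner = best_node(nodes, x)
            for k, weights in enumerate(nodes):
                if k == winner:
                    pull = start_rate * decay
                elif abs(k - winner) == 1:
                    # Neighbors move less, and their share shrinks over time.
                    pull = 0.25 * start_rate * decay * decay
                else:
                    pull = 0.0
                nodes[k] = [w + pull * (xi - w) for w, xi in zip(weights, x)]
    return nodes

# Hypothetical log ratios over four time points.
profiles = {
    "up1":   [0.1, 0.8, 1.5, 2.0],
    "up2":   [0.0, 0.9, 1.4, 2.2],
    "down1": [2.0, 1.2, 0.5, 0.1],
    "down2": [1.9, 1.3, 0.4, 0.0],
}

nodes = train_som(profiles)
assignment = {name: best_node(nodes, p) for name, p in profiles.items()}
print(assignment)
```

Unlike the hierarchical tree, the number of groups is fixed in advance by the number of nodes, and each gene ends up in exactly one group.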

Page 18: Microarray Data Analysis  Day 2

Public Databases

• Gene expression data is an essential aspect of annotating the genome

• Publication and data exchange for microarray experiments

• Data mining/Meta-studies

• Common data format - XML

• MIAME (Minimum Information About a Microarray Experiment)

Page 19: Microarray Data Analysis  Day 2

The 3 Gene Ontologies

• Molecular Function = elemental activity/task
– the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity

• Biological Process = biological goal or objective
– broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions

• Cellular Component = location or complex
– subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme
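When mining expression data, annotations from the three ontologies are usually kept separate so that a gene list can be queried by function, process, or location. A minimal sketch follows; the term names are the examples from the slide, while the gene names and the gene-to-term assignments are placeholders invented for the example.

```python
from collections import defaultdict

# (gene, ontology, term) triples. Gene names and pairings are placeholders.
annotations = [
    ("geneX", "molecular_function", "ATPase activity"),
    ("geneX", "biological_process", "mitosis"),
    ("geneX", "cellular_component", "nucleus"),
    ("geneY", "molecular_function", "carbohydrate binding"),
    ("geneY", "biological_process", "purine metabolism"),
    ("geneZ", "cellular_component", "telomere"),
]

def group_by_ontology(records):
    """Build gene -> ontology -> list of terms."""
    grouped = defaultdict(lambda: defaultdict(list))
    for gene, ontology, term in records:
        grouped[gene][ontology].append(term)
    return grouped

def genes_with_term(records, ontology, term):
    """Return the sorted gene names annotated with a term in one ontology."""
    return sorted({g for g, o, t in records if o == ontology and t == term})

grouped = group_by_ontology(annotations)
print(dict(grouped["geneX"]))
print(genes_with_term(annotations, "cellular_component", "nucleus"))
```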

Page 20: Microarray Data Analysis  Day 2

One Last Note

• Microarrays are “cutting edge” technology

• You now have experience doing a technique that most Ph.D.s have never done

• Looks great on a resume…