microarray and proteomics data analysis - cbs › dtucourse › cookbooks › carsten › 27611 ›...
TRANSCRIPT
![Page 1: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/1.jpg)
Microarray and Proteomics data analysis—27611 Introduction to Bioinformatics 2006
“It’s not just the genes we have—It’s how we use them”
Carsten Friis
Media
GlnA
glnA
TnrA
GlnR
C2
tnrA
glnR C3 C5 C6
C1 C4 C7
K
![Page 2: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/2.jpg)
Microarrays – Test Questions
1. Microarrays measure the expression levels of genes. But how?
![Page 3: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/3.jpg)
The Central Dogma
![Page 4: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/4.jpg)
Microarrays measure mRNA concentrations
gene specific DNA probeslabeled target
gene
mRNA
![Page 5: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/5.jpg)
Microarrays – Test Questions
2. Fine, probes bind mRNA, But what’s this process called?
![Page 6: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/6.jpg)
Hybridization
A
A
AT
TG
GC
C
T
AT
GA
TGC
C
T
AT
GA
TGC
C
A
A
AT
TG
GC
C
![Page 7: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/7.jpg)
Microarrays – Test Questions
3. Ok, hybridization. But how many genes can then hybridize to one array slide?
![Page 8: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/8.jpg)
Microarrays are a high-throughput method
Measure the level of transcript from a
a complete genome in one go
CELL
RNA
![Page 9: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/9.jpg)
Microarrays – Test Questions
4. Gee, thousands then? Neat, but what is this sample and control stuff?
![Page 10: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/10.jpg)
Experiment setup – Sample preparation
1. Design experiment
2. Perform experiment
3. Precipitate RNA
4. Label RNA/cDNA
Eukaryote/prokaryote?
Amplification?Direct or indirect?Label?
wild typemutant
![Page 11: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/11.jpg)
Microarrays – Test Questions
5. Ok, so we search for changes in expression; fine, but which technologies are most popular for this?
![Page 12: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/12.jpg)
Microarrays - The Technologies
Stanford Microarrays
Affymetrix
![Page 13: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/13.jpg)
Microarrays – Test Questions
6. Stan & Affy it is; Now, what characterizes the Stanford technology?
![Page 14: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/14.jpg)
The Stanford cDNA Microarrays
![Page 15: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/15.jpg)
Science. 1995 Oct 20;270(5235):467-70.
Quantitative monitoring of gene expression patterns with a complementary DNA microarray.
Schena M, Shalon D, Davis RW, Brown PO.
Department of Biochemistry, Beckman Center, Stanford University Medical Center, CA 94305, USA.
A high-capacity system was developed to monitor the expression of many genes in parallel. Microarrays prepared by high-speed robotic printing of complementary DNAs on glass were used for quantitative expression measurements of the corresponding genes. Because of the small format and high density of the arrays, hybridization volumes of 2 microliters could be used that enabled detection of rare transcripts in probe mixtures derived from 2 micrograms of total cellular messenger RNA. Differential expression measurements of 45 Arabidopsis genes were made by means of simultaneous, two-color fluorescence hybridization.
PMID: 7569999 [PubMed - indexed for MEDLINE]
![Page 16: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/16.jpg)
Making Microarrays
1. Produce probes• oligos• cDNA library• PCR products
2. Print (spotting) by the use of a robot
![Page 17: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/17.jpg)
Spotting – Mechanical deposition of probes
![Page 18: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/18.jpg)
16-pin microarray spotter
![Page 19: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/19.jpg)
mRNAmRNA
cDNAcDNA
Cy3-cDNACy5-cDNA
SAMPLE CONTROL
Stanford microarrays
DESIGN
and ORDER
PROBES
![Page 20: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/20.jpg)
Microarrays – Test Questions
7. So, I guess that was Stan. What then characterizes the Affy technology?
![Page 21: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/21.jpg)
AffymetrixTM GeneChipsTM
![Page 22: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/22.jpg)
Affymetrix GeneChip® oligonucleotide array
Pre-fabricated arrays– On-chip synthesis of 25’mers– 11 to 20 oligonucleotide
probes for each gene– >50.000 probe sets pr. chip
Automation of routine procedures
– better reproducibility– lighter workload– faster scans
![Page 23: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/23.jpg)
Examples of Catalog Arrays
HumanMouseRat ArabidopsisC. elegansCanineDrosophila
E. coliP. aeruginosaPlasmodium/AnophelesVitis vinifera (Grape) Xenopus laevisYeastZebrafish
(+ many more...)
NimbleExpress™ Array Program
![Page 24: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/24.jpg)
TTT
T
T
TT
T
T
T
A
A
AA
A
A
A
AAAMask #1Mask #2
Photolithographyin situ synthesis
Spacers bound to surface with photolabile protection groups
![Page 25: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/25.jpg)
Photolithography - Micromirrors
NimbleExpress™ Array Program
![Page 26: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/26.jpg)
The Technologies – Cost pr. 2004
Facility setup:Stanford Microarrays < 100,000 USDAffymetrix < 250,000 USD
Cost pr. array Stanford Microarrays 30-50 USDAffymetrix 300-400 USD
NimbleExpress™ Array Program - a bit more expensive
![Page 27: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/27.jpg)
The Technologies - Data Quality
Reproducibility of data:(Pearson’s correlation coefficient)
– Stanford microarrays: 0.80 - 0.95
– Affymetrix: ≈ 0.95
![Page 28: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/28.jpg)
Microarrays – Test Questions
8. And that’s Affy folks; Well, except, what was that about several probes pr. gene? How does that work?
![Page 29: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/29.jpg)
How probe sets bind
5’ 3’Probes bind to different positions on the same gene
Regions not suitable for probeseg. BLAST hits >75% & longer than 15bp25 bp
![Page 30: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/30.jpg)
Microarrays – Test Questions
9. Ok, then that must be the end for Affy, right? Or, what was that again about PM & MM probes?
![Page 31: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/31.jpg)
Affymetrix uses PM & MM probes
- Perfect Match (PM)- MisMatch (MM)
PM: CGATCAATTGCACTATGTCATTTCT MM: CGATCAATTGCAGTATGTCATTTCT
![Page 32: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/32.jpg)
Microarrays – Test Questions
Great, and the MM’s don’t work, so Affy have wasted half of the chip. Cool going, dudes.
10.And so we come to the final question, what to do about all that noise (or, why are microarrays such a bother to analyze)?
![Page 33: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/33.jpg)
Sources of variation
Array-specific variation:
Similar effect on many measurements
Corrections can be estimated from data
Gene-specific variation:
Systematic Stochastic
Too random to be explicitly accounted for
“Noise”
Statistical testing and/or
Error modellingNormalization
![Page 34: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/34.jpg)
Facts on your project
We have three data sets for you to choose between– Bladder Cancer, HIV, Leukemia
Your report should as a minimum demonstrate that you have understood the basic principles of the microarray technology and data analysis– That is, after all, the core of the course
You should preferably also demonstrate some understanding of the biological problems behind the data set you choose– Because data are more than just numbers
To get the very highest grades you must demonstrate ability to formulate your data analysis in biological terms– i.e. don’t just talk statistics – what does the numbers mean to the cell?
![Page 35: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/35.jpg)
Study of Bladder Cancer
Identify differences between different stages/types of bladder cancer based on DNA chips run on a biopsy.
From the biopsy RNA is extracted and run on a GeneChip. The biopsy is also given to histopathologist, who use a microscope to evaluate and stage the suspicious growth into: – Superficial Ta– Intermediate T1– Invasive T2-T4
The purpose here is to identify differences in gene expression between these stages.– To learn more about the disease and its progression– To classify tumors based on a biopsy
(This data has been gathered by Skejby Sygehus and it cannot be used without their permission)
![Page 36: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/36.jpg)
Study of HIV
The purpose of this study is to measure the effect of HIV-1 on the transcription of genes in the infected host cell.
The human cell line MT4 was infected in vitro with HIV-1 and compared to control cultures grown without HIV-1 infection. – Thus, we have two classes, sick and healthy
After 7 days of growth of both cultures, cells were harvested and RNA was extracted and run on Affymetrix chips.– The purpose being to identify genes relevant to the HIV disease
Replicates were performed to assure reproducibility and allow measurement of experimental variation.
![Page 37: Microarray and Proteomics data analysis - CBS › dtucourse › cookbooks › carsten › 27611 › Day1...Microarray and Proteomics data analysis —27611 Introduction to Bioinformatics](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0ddecf7e708231d43c7ddf/html5/thumbnails/37.jpg)
Study of Childhood Leukemia
Diagnostic bone marrow samples from leukemia patientsPlatform: Affymetrix Focus Array
– 8793 human genes Immunophenotype
– 18 patients with precursor B immunophenotype– 17 patients with T immunophenotype
Outcome 5 years from diagnosis– 11 patients with relapse– 18 patients in complete remission
Paper out in Leukemia:“Prediction of immunophenotype, treatment response, and
relapse in childhood acute lymphoblastic leukemia using DNA microarrays”