introduction to microarray technology and analysis

Introduction to microarray technology and analysis Carol Bult Associate Professor The Jackson Laboratory [email protected]


Post on 13-Jan-2016


Page 1: Introduction to microarray technology and analysis

Introduction to microarray technology and analysis

Carol Bult, Associate Professor

The Jackson Laboratory, [email protected]

Page 2: Introduction to microarray technology and analysis

Measuring Gene Expression

Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein might be more direct, but is currently harder.

Page 3: Introduction to microarray technology and analysis

Central Assumption of Gene Expression Microarrays

The level of a given mRNA is positively correlated with the expression of the associated protein. Higher mRNA levels mean higher protein expression; lower mRNA levels mean lower protein expression.

Other factors that affect this relationship: protein degradation, mRNA degradation, polyadenylation, codon preference, translation rates, alternative splicing, translation lag…

Page 4: Introduction to microarray technology and analysis

Principal Uses of Microarrays

Genome-scale gene expression analysis
• Differential gene expression between two (or more) sample types
• Responses to environmental factors
• Disease processes (e.g. cancer)
• Effects of drugs
• Identification of genes associated with clinical outcomes (e.g. survival)

Page 5: Introduction to microarray technology and analysis

Microarray example: Biomarker identification - lung cancer

[Figure: gene expression heat map; axes: Samples, Genes]

Garber, Troyanskaya et al. Diversity of gene expression in adenocarcinoma of the lung. PNAS 2001, 98(24):13784-9.

Page 6: Introduction to microarray technology and analysis

[Figure: cumulative survival (0 to 1) versus time (0 to 60 months) for Group 1, Group 2 and Group 3; p = 0.002 for Gr. 1 vs. Gr. 3]

Data partitioning clinically important: Patient survival for lung cancer subgroups

Garber, Troyanskaya et al. Diversity of gene expression in adenocarcinoma of the lung. PNAS 2001, 98(24):13784-9.

Page 7: Introduction to microarray technology and analysis

[Workflow overview]
Biological question (differentially expressed genes, sample class prediction, etc.)
→ Experimental design
→ Microarray experiment
→ Image analysis
→ Normalization
→ Estimation, Testing, Clustering, Discrimination
→ Biological verification and interpretation

Page 8: Introduction to microarray technology and analysis

Technology basics

• Microarrays are composed of short, specific DNA sequences attached to a glass or silicon slide at high density
• A microarray works by exploiting the ability of an mRNA molecule to bind specifically to, or hybridize with, the DNA template from which it originated
• RNA or DNA from the sample of interest is fluorescently labeled so that relative or absolute abundances can be measured quantitatively

Page 9: Introduction to microarray technology and analysis

Two color vs single color

Bakel and Holstege. 2007. http://www.cell-press.com/misc/page?page=ETBR

Page 10: Introduction to microarray technology and analysis

Other applications of microarray technology (besides measuring gene expression)

• DNA copy number analysis
• SNP analysis
• ChIP-chip (interaction data)
• Competitive growth assays
• …

Page 11: Introduction to microarray technology and analysis

Major technologies

• cDNA probes (> 200 nt), usually produced by PCR, attached to either nylon or glass supports
• Oligonucleotides (25-80 nt) attached to glass support
• Oligonucleotides (25-30 nt) synthesized in situ on silica wafers (Affymetrix)
• Probes attached to tagged beads

Page 12: Introduction to microarray technology and analysis

cDNA Microarray Design

Probe selection
• Non-redundant set of probes
• Includes genes of interest to project
• Corresponds to physically available clones

Chip layout
• Grouping of probes by function
• Correspondence between wells in microtiter plates and spots on the chip

Page 13: Introduction to microarray technology and analysis

Building the chip

Ngai Lab arrayer, UC Berkeley

Print-tip head

Page 14: Introduction to microarray technology and analysis

http://transcriptome.ens.fr/sgdb/presentation/principle.php

Page 15: Introduction to microarray technology and analysis

Example dual channel cDNA array results

Page 16: Introduction to microarray technology and analysis

Affymetrix GeneChips

Probes are oligos synthesized in situ using a photolithographic approach

There are at least 5 oligos per cDNA, plus an equal number of negative controls

The apparatus requires a fluidics station for hybridization and a special scanner

Only a single fluorochrome is used per hybridization

Page 17: Introduction to microarray technology and analysis

http://genome.ucsc.edu/cgi-bin/hgTracks

Page 18: Introduction to microarray technology and analysis

Affymetrix: there may be 5,000-100,000 probe sets per chip

A probe set = 11-20 PM/MM pairs

Page 19: Introduction to microarray technology and analysis

http://www.weizmann.ac.il/home/ligivol/pictures/system.jpg

Page 20: Introduction to microarray technology and analysis

Interpreting Affymetrix Output: Perfect Match/Mismatch Strategy

For each probe designed to be perfectly complementary to a target sequence, a partner probe is generated that is identical except for a single base mismatch in its center.

These probe pairs, called the Perfect Match probe (PM) and the Mismatch probe (MM), allow the quantitation and subtraction of signals caused by non-specific cross-hybridization.

The difference in hybridization signals between the partners serves as an indicator of specific target abundance.
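The PM/MM idea above can be illustrated with a small sketch. It assumes the simplest possible summary, the mean of the PM minus MM differences over one probe set; the function name `average_difference` and the intensity values are made up for illustration, and production algorithms such as MAS 5.0 use more robust estimators than this.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of (PM - MM) over the probe pairs of one probe set.

    Hypothetical helper for illustration only: it subtracts the
    mismatch signal (non-specific hybridization) from the perfect
    match signal and averages over the pairs.
    """
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    return mean(p - m for p, m in zip(pm, mm))


# Made-up intensities for a probe set with four PM/MM pairs.
pm_signal = [1200.0, 950.0, 1430.0, 1100.0]
mm_signal = [300.0, 410.0, 280.0, 350.0]
probe_set_signal = average_difference(pm_signal, mm_signal)
print(probe_set_signal)
```

A plain mean is sensitive to a single bad probe pair, which is one reason real pipelines prefer robust summaries.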

Page 21: Introduction to microarray technology and analysis

[Workflow overview slide, repeated: biological question → experimental design → microarray experiment → image analysis → normalization → estimation / testing / clustering / discrimination → biological verification and interpretation]

Page 22: Introduction to microarray technology and analysis

Experimental Design

Bakel and Holstege. 2007. http://www.cell-press.com/misc/page?page=ETBR

Page 23: Introduction to microarray technology and analysis

Microarray Analysis: Controlling for the Known Knowns and Unknown Unknowns

- Donald Rumsfeld, former Secretary of Defense

Page 24: Introduction to microarray technology and analysis

http://www.bioconductor.org/workshops/2003/NGFN03/experimental-design.pdf

Page 25: Introduction to microarray technology and analysis
Page 26: Introduction to microarray technology and analysis
Page 27: Introduction to microarray technology and analysis
Page 28: Introduction to microarray technology and analysis
Page 29: Introduction to microarray technology and analysis

Selected references

http://discover.nci.nih.gov/microarrayAnalysis/Experimental.Design.jsp

Best advice? Consult a statistician before you start!

Page 30: Introduction to microarray technology and analysis

Statistical Power

The probability that a test will reject a null hypothesis if it is false

Type I and Type II errors

• Type I: rejecting the null hypothesis when it is actually true (false positive). We say there is a difference in gene expression between gene A and gene B when there really isn’t.

• Type II: failing to reject the null hypothesis when it is actually false (false negative). We say there is no difference in gene expression between gene A and gene B when there actually is!

Page 31: Introduction to microarray technology and analysis

Power in Perspective

What are the 4 main components that determine what conclusions are drawn from a study?

• Sample size: number of units
• Effect size: signal to noise
• Alpha level: significance level
• Power: likelihood of detecting a treatment effect if it is there
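To see how the four components interact, here is a minimal sketch that computes power from the other three. It assumes a two-sided, two-sample z-test with known, equal variances (a normal approximation, not the t-test a real analysis would likely use); `two_sample_power` and its parameters are illustrative names, not part of any package mentioned in these slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the difference in means divided by the common
    standard deviation (signal to noise); n_per_group is the number
    of units in each group; alpha is the significance level.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


power_small = two_sample_power(effect_size=1.0, n_per_group=16)
power_large = two_sample_power(effect_size=1.0, n_per_group=32)
print(round(power_small, 3), round(power_large, 3))
```

Fixing any three of sample size, effect size, alpha and power determines the fourth, which is why the sample size question has to be settled before the experiment starts.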

Page 32: Introduction to microarray technology and analysis

Check out this pithy description of Statistical Power and Hypothesis Testing: http://www.socialresearchmethods.net/kb/power.php

Page 33: Introduction to microarray technology and analysis

MicroArray Image Analysis

Based on slides from Robin Liechti ([email protected])

Page 34: Introduction to microarray technology and analysis

Microarray analysis

Array construction, hybridisation, scanning

Quantitation of fluorescence signals

Data visualisation

Meta-analysis (clustering)

More visualisation

Page 35: Introduction to microarray technology and analysis

Technical

[Figure: probe (on chip), sample (labelled), pseudo-colour image; image from Jeremy Buhler]

Page 36: Introduction to microarray technology and analysis

Experimental design

• Track what’s on the chip: which spot corresponds to which gene
• Duplicate experimental spots: reproducibility
• Controls
  DNAs spotted on glass: positive probe (induced or repressed); negative probe (bacterial genes on human chip)
  Oligos on glass or synthesised on chip (Affymetrix): point mutants (hybridisation plus/minus)

Page 37: Introduction to microarray technology and analysis

Images from scanner

Resolution
• Standard 10 µm [currently, max 5 µm]
• 100 µm spot on chip = 10 pixels in diameter

Image format
• TIFF (tagged image file format), 16 bit (65’536 levels of grey)
• 1 cm x 1 cm image at 16 bit = 2 MB (uncompressed)
• Other formats exist, e.g. SCN (used at Stanford University)

Separate image for each fluorescent sample
• Channel 1, channel 2, etc.
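The figures on this slide follow from simple arithmetic, sketched below. The helper name `scan_dimensions` is made up for illustration, and lengths are assumed to be given in micrometres.

```python
def scan_dimensions(width_um, height_um, pixel_um=10, bits_per_pixel=16):
    """Return (rows, cols, size_in_bytes) of an uncompressed scan.

    Assumes square pixels of side pixel_um micrometres and one
    channel per image.
    """
    cols = round(width_um / pixel_um)
    rows = round(height_um / pixel_um)
    size_bytes = rows * cols * bits_per_pixel // 8
    return rows, cols, size_bytes


# A 1 cm x 1 cm area (10,000 um per side) at 10 um resolution, 16 bit.
rows, cols, size_bytes = scan_dimensions(10_000, 10_000)
print(rows, cols, size_bytes)

# A 100 um spot at 10 um per pixel spans this many pixels across.
spot_diameter_pixels = 100 // 10
print(spot_diameter_pixels)
```

The result is 1000 x 1000 pixels at 2 bytes each, i.e. 2,000,000 bytes per channel, and a spot diameter of 10 pixels.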

Page 38: Introduction to microarray technology and analysis

Images in analysis software

The two 16-bit images (cy3, cy5) are compressed into 8-bit images

Goal: display fluorescence intensities for both wavelengths using a 24-bit RGB overlay image

RGB image:
• Blue values (B) are set to 0
• Red values (R) are used for cy5 intensities
• Green values (G) are used for cy3 intensities

Qualitative representation of results
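A minimal sketch of the overlay construction described above. The slide does not say how the 16-bit values are compressed to 8 bits, so this example assumes the simplest option, dropping the low 8 bits; analysis software may use a different (for example non-linear) mapping. The function names are illustrative.

```python
def to_8bit(value16):
    """Compress a 16-bit intensity (0..65535) to 8 bits (0..255).

    Assumption for illustration: a plain linear mapping that keeps
    only the high byte.
    """
    return value16 >> 8


def overlay_pixel(cy3, cy5):
    """Return an (R, G, B) triple: R from cy5, G from cy3, B set to 0."""
    return (to_8bit(cy5), to_8bit(cy3), 0)


print(overlay_pixel(cy3=65535, cy5=65535))  # equal signal in both channels
print(overlay_pixel(cy3=0, cy5=65535))      # cy5 only
print(overlay_pixel(cy3=65535, cy5=0))      # cy3 only
```

Equal signal in both channels gives equal red and green, which displays as yellow; this is why the overlay is only a qualitative view of the data.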

Page 39: Introduction to microarray technology and analysis

Images: examples

[Figure: cy3 image, cy5 image, and pseudo-color overlay]

Spot color   Signal strength       Gene expression
yellow       Control = perturbed   unchanged
red          Control < perturbed   induced
green        Control > perturbed   repressed

Page 40: Introduction to microarray technology and analysis

Processing of images

• Addressing or gridding: assigning coordinates to each of the spots
• Segmentation: classification of pixels either as foreground or as background
• Intensity extraction (for each spot): foreground fluorescence intensity pairs (R, G), background intensities, quality measures

Page 41: Introduction to microarray technology and analysis
Page 42: Introduction to microarray technology and analysis


Page 43: Introduction to microarray technology and analysis

Addressing (I)

The basic structure of the images is known (determined by the arrayer)

Parameters to address the spot positions:
• Separation between rows and columns of grids
• Individual translation of grids
• Separation between rows and columns of spots within each grid
• Small individual translation of spots
• Overall position of the array in the image

[Screenshot: ScanAlyze]
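As a rough illustration of how these parameters combine, the sketch below computes nominal spot centers for a regular layout. It assumes equal pitch in both directions and ignores the individual translations of grids and spots, which real software estimates from the image; all names and numbers are hypothetical.

```python
def nominal_spot_centers(origin_x, origin_y, grid_rows, grid_cols, grid_pitch,
                         spot_rows, spot_cols, spot_pitch):
    """Map (grid_row, grid_col, spot_row, spot_col) to a nominal (x, y).

    origin_x/origin_y: overall position of the array in the image.
    grid_pitch: separation between neighbouring grids.
    spot_pitch: separation between neighbouring spots within a grid.
    """
    centers = {}
    for gr in range(grid_rows):
        for gc in range(grid_cols):
            for sr in range(spot_rows):
                for sc in range(spot_cols):
                    x = origin_x + gc * grid_pitch + sc * spot_pitch
                    y = origin_y + gr * grid_pitch + sr * spot_pitch
                    centers[(gr, gc, sr, sc)] = (x, y)
    return centers


# Hypothetical layout: 4 x 4 grids, each with 19 rows x 21 columns of spots.
centers = nominal_spot_centers(50, 50, 4, 4, 4500, 19, 21, 200)
print(len(centers))
print(centers[(1, 2, 3, 4)])
```

The nominal centers are only a starting point; the per-grid and per-spot translations listed above are what the addressing step then adjusts.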

Page 44: Introduction to microarray technology and analysis

Addressing (II)

The measurement process depends on the addressing procedure

Addressing efficiency can be enhanced by allowing user intervention (slow!)

Most software systems now provide for both manual and automatic gridding procedures

Page 45: Introduction to microarray technology and analysis

Segmentation (I)

Classification of pixels as foreground or background -> fluorescence intensities are calculated for each spot as a measure of transcript abundance

Production of a spot mask : set of foreground pixels for each spot

Page 46: Introduction to microarray technology and analysis

Segmentation (II)

Segmentation methods:

Method            Software
Fixed circle      ScanAlyze, GenePix, QuantArray
Adaptive circle   GenePix, Dapple
Adaptive shape    Spot, region growing and watershed
Histogram         ImaGene, QuantArray, DeArray and adaptive thresholding

Page 47: Introduction to microarray technology and analysis

Fixed circle segmentation

• Fits a circle with a constant diameter to all spots in the image
• Easy to implement
• The spots need to be of the same shape and size

[Figure: bad example!]
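A minimal sketch of fixed circle segmentation, assuming the image is a list of rows of pixel values and that the spot center is already known from addressing. The helper names are made up for illustration and do not reflect how ScanAlyze, GenePix or QuantArray implement it.

```python
def fixed_circle_mask(center_row, center_col, radius, n_rows, n_cols):
    """Return the (row, col) pixels inside a circle of constant radius."""
    mask = []
    for r in range(n_rows):
        for c in range(n_cols):
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2:
                mask.append((r, c))
    return mask


def spot_mean(image, mask):
    """Average pixel value over the spot mask (the foreground estimate)."""
    return sum(image[r][c] for r, c in mask) / len(mask)


# Toy 5 x 5 image: background 10, a small bright spot in the middle.
mask = fixed_circle_mask(2, 2, 1, 5, 5)
image = [[10] * 5 for _ in range(5)]
for r, c in mask:
    image[r][c] = 100
print(len(mask), spot_mean(image, mask))
```

Because the radius is the same for every spot, a spot that is smaller than the circle pulls background pixels into the foreground, which is the failure the "bad example" illustrates.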

Page 48: Introduction to microarray technology and analysis

Adaptive circle segmentation

• The circle diameter is estimated separately for each spot
• Dapple finds spots by detecting edges of spots (second derivative)
• Problematic if spots are oval in shape

Page 49: Introduction to microarray technology and analysis

Adaptive shape segmentation

• Specification of starting points or seeds
• Regions grow outwards from the seed points preferentially according to the difference between a pixel’s value and the running mean of values in an adjoining region.

Page 50: Introduction to microarray technology and analysis

Histogram segmentation

• Uses a target mask chosen to be larger than any spot
• Foreground and background intensity are determined from the histogram of pixel values for pixels within the masked area
• Example: QuantArray
  Background: mean between 5th and 20th percentile
  Foreground: mean between 80th and 95th percentile
• Unstable when a large target mask is set to compensate for variation in spot size

[Figure: histogram of pixel values with background (Bkgd) and foreground regions marked]
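The percentile rule quoted for QuantArray can be sketched as follows. This is an illustration of the idea only: the exact percentile convention used by the software is not given in the slides, so the index arithmetic below is an assumption, and the function names are made up.

```python
def percentile_band_mean(pixels, low_pct, high_pct):
    """Mean of the sorted pixel values between two percentiles.

    Assumed convention: the band covers sorted positions
    int(n * low_pct / 100) up to (not including) int(n * high_pct / 100).
    """
    ordered = sorted(pixels)
    n = len(ordered)
    band = ordered[int(n * low_pct / 100):int(n * high_pct / 100)]
    if not band:
        raise ValueError("too few pixels for the requested percentile band")
    return sum(band) / len(band)


def histogram_segment(pixels):
    """Return (foreground, background) for the pixels in a target mask."""
    background = percentile_band_mean(pixels, 5, 20)
    foreground = percentile_band_mean(pixels, 80, 95)
    return foreground, background


# Toy target mask holding 100 pixels with values 0..99.
foreground, background = histogram_segment(list(range(100)))
print(foreground, background)
```

If the target mask is much larger than the spot, most pixels are background and the 80th to 95th percentile band may no longer consist of spot pixels, which is the instability noted above.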

Page 51: Introduction to microarray technology and analysis

Information extraction

Page 52: Introduction to microarray technology and analysis

Spot intensity

The total amount of hybridization for a spot is proportional to the total fluorescence at the spot

Spot intensity = sum of pixel intensities within the spot mask

Since later calculations are based on ratios between cy5 and cy3, we compute the average* pixel value over the spot mask

*alternative : use ratios of medians instead of means

Page 53: Introduction to microarray technology and analysis

Background intensity

Motivation: a spot’s measured intensity includes a contribution from non-specific hybridization and other chemicals on the glass

Fluorescence from regions not occupied by DNA should be different from regions occupied by DNA -> could be interesting to use local negative controls (spotted DNA that should not hybridize)

Different background methods: local background, morphological opening, constant background, no adjustment

Page 54: Introduction to microarray technology and analysis

Local background

• Focuses on small regions surrounding the spot mask; the background is the median of pixel values in this region
• Most software packages implement such an approach (examples shown: ScanAlyze; ImaGene; Spot, GenePix)
• By not considering the pixels immediately surrounding the spots, the background estimate is less sensitive to the performance of the segmentation procedure

Page 55: Introduction to microarray technology and analysis

Morphological opening (Spot)

• Applied to the original images R and G
• Uses a square structuring element with side length at least twice as large as the spot separation distance
• Removes all the spots and generates an image that is an estimate of the background for the entire slide
• For individual spots, the background is estimated by sampling this background image at the nominal center of the spot
• Gives lower and less variable background estimates
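Morphological opening is an erosion (local minimum) followed by a dilation (local maximum) with the same structuring element. The sketch below is a plain, unoptimized version for a small image stored as a list of rows; the window is simply clipped at the image border, which is an assumption made for brevity, and the function names are illustrative rather than taken from the Spot software.

```python
def _window_filter(image, half, pick):
    """Apply pick (min or max) over a square window of side 2*half + 1."""
    n_rows, n_cols = len(image), len(image[0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - half), min(n_rows, r + half + 1))
                      for cc in range(max(0, c - half), min(n_cols, c + half + 1))]
            row.append(pick(window))
        out.append(row)
    return out


def morphological_opening(image, half):
    """Erosion followed by dilation with a square structuring element."""
    eroded = _window_filter(image, half, min)
    return _window_filter(eroded, half, max)


# Toy 7 x 7 image: background 10 with a bright 3 x 3 spot in the middle.
image = [[10] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        image[r][c] = 200
background_image = morphological_opening(image, half=2)
print(background_image[3][3])  # background estimate at the spot center
```

Because the 5 x 5 element is larger than the spot, the erosion removes the spot entirely, and what remains after dilation is the background estimate that is sampled at the spot center.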

Page 56: Introduction to microarray technology and analysis

Constant background

• Global method which subtracts a constant background for all spots
• Some findings suggest that the binding of fluorescent dyes to ‘negative control spots’ is lower than the binding to the glass slide
  -> More meaningful to estimate background based on a set of negative control spots
• If there are no negative control spots: approximation of the average background = third percentile of all the spot foreground values

Page 57: Introduction to microarray technology and analysis

No adjustment

Do not consider the background

Page 58: Introduction to microarray technology and analysis

Quality measures (-> Flag)

How good are foreground and background measurements?
• Variability measures in pixel values within each spot mask
• Spot size
• Circularity measure
• Relative signal to background intensity
• b-value: fraction of background intensities less than the median foreground intensity
• p-score: extent to which the position of a spot deviates from a rigid rectangular grid

Based on these measurements, one can flag a spot

Page 59: Introduction to microarray technology and analysis

Summary

• The choice of background correction method has a larger impact on the log-intensity ratios than the segmentation method used
• The morphological opening method provides a better estimate of background than other methods: low within- and between-slide variability of the log2 R/G
• Background adjustment has a larger impact on low intensity spots

[Figure panels: Spot, GenePix; ScanAlyze]

M = log2 R/G

A = log2 √(R•G)
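As a small worked example of the two definitions above, assuming R and G are background-corrected, strictly positive intensities (the function names are illustrative):

```python
from math import log2, sqrt


def m_value(r, g):
    """Log ratio: M = log2(R/G)."""
    return log2(r / g)


def a_value(r, g):
    """Average log intensity: A = log2(sqrt(R*G))."""
    return log2(sqrt(r * g))


# Made-up spot with a four-fold higher red signal.
m = m_value(1024.0, 256.0)
a = a_value(1024.0, 256.0)
print(m, a)
```

Here M is 2 (a four-fold difference) and A is 9, the mean of log2(1024) = 10 and log2(256) = 8. Swapping the channels flips the sign of M and leaves A unchanged.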

Page 60: Introduction to microarray technology and analysis

Selected references

Yang, Y. H., Buckley, M. J., Dudoit, S. and Speed, T. P. (2001), ‘Comparisons of methods for image analysis on cDNA microarray data’. Technical report #584, Department of Statistics, University of California, Berkeley. http://www.stat.berkeley.edu/users/terry/zarray/Html/papersindex.html

Yang, Y. H., Buckley, M. J. and Speed, T. P. (2001), ‘Analysis of cDNA microarray images’. Briefings in Bioinformatics, 2 (4), 341-349. Excellent review in concise format!

Page 61: Introduction to microarray technology and analysis

http://pbil.univ-lyon1.fr/library/limma/doc/usersguide.html

Download the limma package and work through the Swirl zebrafish example.

Page 62: Introduction to microarray technology and analysis

[Workflow overview slide, repeated: biological question → experimental design → microarray experiment → image analysis → normalization → estimation / testing / clustering / discrimination → biological verification and interpretation]

Page 63: Introduction to microarray technology and analysis
Page 64: Introduction to microarray technology and analysis

Normalization - two problems

I. How do we detect biases? Which genes should we use for estimating biases among chips/channels?

II. How do we remove the biases?

Page 65: Introduction to microarray technology and analysis

Why normalize?

• Microarray data have significant systematic variation both within arrays and between arrays that is not true biological variation
• Accurate comparison of genes’ relative expression within and across conditions requires normalization of effects
• Sources of variation: spatial location on the array; dye biases which vary with spot intensity; plate origin; printing/spotting quality; experimenter

Page 66: Introduction to microarray technology and analysis

Why is normalization important?

Experiment: comparison of gene expression in mouse heart and kidney in response to drug

Source: http://www.partek.com

Most biological effects are swamped by systematic effects!

Page 67: Introduction to microarray technology and analysis

Other Sources of Systematic Bias

Individual factors
• Print (20% - 30%)
• Experimenter (20% - 30%)
• Organism (3% - 10%)
• Date (5%)
• Software (2%)
• Number of tips (3%)

Interactions
• Print - Experimenter (40%)
• Print - Date (40%)
• Experimenter - Date (40%)

(slide from Catherine Ball)

(based on ~4,600 experiments in Stanford Microarray Database analyzed by ANOVA)

Page 68: Introduction to microarray technology and analysis

KO #8

Probes: ~6,000 cDNAs, including 200 related to lipid metabolism. Arranged in a 4x4 array of 19x21 sub-arrays.

Clearly visible plate effects

Page 69: Introduction to microarray technology and analysis

Spatial Biases

Solution: spatial background estimation/subtraction

Page 70: Introduction to microarray technology and analysis

Spatial plots: background from two slides

Page 71: Introduction to microarray technology and analysis

Highlighting extreme log ratios

Top (black) and bottom (green) 5% of log ratios

Page 72: Introduction to microarray technology and analysis

Pin group (sub-array) effects

Boxplots of log ratios by pin group

Lowess lines through points from pin groups

Page 73: Introduction to microarray technology and analysis

Boxplots and highlighting pin group effects

Clear example of spatial bias

[Figure: boxplots; x axis: print-tip groups; y axis: log-ratios]

Page 74: Introduction to microarray technology and analysis

Time of printing effects

Green channel intensities (log2G). Printing over 4.5 days. The previous slide depicts a slide from this print run.

[Figure: x axis: spot number]

Page 75: Introduction to microarray technology and analysis

Normalization in a nutshell

• Goal is to measure the ratios of gene expression levels, (ratio)i = Ri/Gi, where Ri and Gi are, respectively, the measured red and green intensities for the ith spot
• In a self-hybridization, we would expect all ratios to be equal to one: Ri/Gi = 1 for all i. But they probably won’t be…
• Why? Noise (systematic bias) and signal (true differences)
• Normalization brings appropriate ratios closer to 1

Page 76: Introduction to microarray technology and analysis
Page 77: Introduction to microarray technology and analysis

The Starting Point: The Ratio (2-color arrays)

[Figure: ratio histogram; x axis: ratio (0 to 2); y axis: frequency (0 to 5000)]

Page 78: Introduction to microarray technology and analysis

[Figure: log(ratio) histogram; x axis: log(ratio) (-2 to 2); y axis: frequency (0 to 3000)]

Log ratios treat up- and down-regulated genes equally (two-color arrays)

log2(1) = 0, log2(2) = 1, log2(1/2) = -1

Page 79: Introduction to microarray technology and analysis

A note about Affymetrix (1-color) pre-processing

Log transform

Typical Affymetrix probe intensity distribution

After log-transform

Page 80: Introduction to microarray technology and analysis

Normalization methods

Page 81: Introduction to microarray technology and analysis

Which genes to use for bias detection?

1. All genes on the chip

• Assumption: most of the genes are equally expressed in the compared samples; the proportion of differentially expressed genes is low (<20%).
• Limits:
  Not appropriate when comparing highly heterogeneous samples (different tissues)
  Not appropriate for analysis of ‘dedicated chips’ (apoptosis chips, inflammation chips, etc.)

Page 82: Introduction to microarray technology and analysis

Which genes to use for bias detection?

2. Housekeeping genes

• Assumption: based on prior knowledge a set of genes can be regarded as equally expressed in the compared samples

• Affy novel chips: ‘normalization set’ of 100 genes

• NHGRI’s cDNA microarrays: 70 "house-keeping" genes set

• Limits:
  The validity of the assumption is questionable
  Housekeeping genes are usually expressed at high levels, so they are not informative for the low-intensity range

Page 83: Introduction to microarray technology and analysis

Which genes to use for bias detection?

3. Spiked-in controls from another organism, over a range of concentrations

• Limits:
  Low number of controls: less robust
  Can’t detect biases due to differences in RNA extraction protocols

4. “Invariant set”

• Trying to identify genes that are expressed at similar levels in the compared samples without relying on any prior knowledge:
  Rank the genes in each chip according to their expression level
  Find genes with small change in ranks
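The two ranking steps above can be sketched as follows. This is a single-pass illustration with a fixed rank-change threshold, which is an assumption; published invariant-set methods (such as the one in dChip) are more elaborate. The function names and expression values are made up.

```python
def ranks(values):
    """Rank of each value within its chip (0 = lowest expression).

    Ties are broken by position, which is good enough for a sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def invariant_set(chip_a, chip_b, max_rank_change):
    """Indices of genes whose rank changes by at most max_rank_change."""
    ranks_a, ranks_b = ranks(chip_a), ranks(chip_b)
    return [i for i in range(len(chip_a))
            if abs(ranks_a[i] - ranks_b[i]) <= max_rank_change]


# Made-up expression levels for five genes on two chips.
chip_a = [100, 200, 300, 400, 500]
chip_b = [110, 520, 310, 390, 210]
stable_genes = invariant_set(chip_a, chip_b, max_rank_change=1)
print(stable_genes)
```

Genes 1 and 4 swap between low and high ranks and are excluded; the remaining genes would then be used to estimate the bias between the two chips.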

Page 84: Introduction to microarray technology and analysis

1. Global normalization (scaling)

• A single normalization factor (k) is computed for balancing chips/channels:
  Xi_norm = k*Xi, or log2 R/G → log2 R/G - c (2-color)
• Multiplying intensities by this factor equalizes the mean (median) intensity among compared chips
• Assumption: total RNA (mass) used is the same for both samples. So, averaged across thousands of genes, total hybridization should be the same for both samples.

Page 85: Introduction to microarray technology and analysis

Global Normalization (1-color, e.g. Affymetrix)

[Figure: intensity distributions before and after normalization]

Xi_norm = k*Xi

Page 86: Introduction to microarray technology and analysis

Global Normalization (2-color)

[Figure: histograms of log-ratios (frequency vs. log-ratio), un-normalized and normalized]

log2 R/G → log2 R/G - c, where c = log2 (∑Ri / ∑Gi)
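A minimal sketch of two-color global normalization, using the constant c = log2(∑Ri / ∑Gi) from this slide. It assumes strictly positive, background-corrected intensities; the function name and data are illustrative.

```python
from math import log2


def global_normalize(red, green):
    """Return (normalized log ratios, c) with c = log2(sum(R) / sum(G))."""
    c = log2(sum(red) / sum(green))
    normalized = [log2(r / g) - c for r, g in zip(red, green)]
    return normalized, c


# Made-up self-hybridization where the red channel reads twice as bright.
red = [200.0, 400.0, 800.0]
green = [100.0, 200.0, 400.0]
normalized, c = global_normalize(red, green)
print(c, normalized)
```

Every raw log ratio here is 1, so subtracting c = 1 centers them at 0. Because the same constant is subtracted from every spot, this cannot correct a bias that changes with intensity.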

Page 87: Introduction to microarray technology and analysis

2. Intensity-dependent normalization (Yang, Speed)

(Lowess – local linear fit)

Compensate for intensity-dependent biases

Page 88: Introduction to microarray technology and analysis

Detect Intensity-dependent Biases: M vs A plots (also called R-I plot)

X axis: A (average intensity), A = 0.5*log(Cy3*Cy5)

Y axis: M (log ratio), M = log(Cy3/Cy5)

Page 89: Introduction to microarray technology and analysis

Intensity-dependent bias

[Figure: M vs. A plot, with M = log(Cy3/Cy5) on the y axis and A on the x axis]

• Low intensities: M<0, Cy3<Cy5
• High intensities: M>0, Cy3>Cy5

* Global normalization cannot remove intensity-dependent biases

Page 90: Introduction to microarray technology and analysis

We expect the M vs A plot to look like:

[Figure: M = log(Cy3/Cy5) versus A]

Page 91: Introduction to microarray technology and analysis

LOWESS (Locally Weighted Scatterplot Smoothing)

• Local linear regression model

• Tri-cube weight function

• Least Squares

Estimated values of log2(Cy5/Cy3) as a function of log10(Cy3*Cy5)
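The three ingredients listed above (local linear model, tri-cube weights, least squares) can be combined into a short sketch. This is a simplified illustration, not a replacement for a library implementation: it has no robustness iterations, uses a nearest-neighbour bandwidth chosen by `frac`, and runs in quadratic time. The function names are made up, and the code works the same whichever channel is in the numerator of M.

```python
def tricube(u):
    """Tri-cube weight: (1 - |u|^3)^3 inside the window, 0 outside."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def lowess_fit(a, m, frac=0.5):
    """Locally weighted linear fit of M on A, evaluated at each A value."""
    n = len(a)
    k = max(2, int(round(frac * n)))  # number of neighbours per window
    fitted = []
    for x0 in a:
        distances = sorted(abs(x - x0) for x in a)
        h = distances[k - 1] or 1e-12  # bandwidth: k-th nearest distance
        w = [tricube((x - x0) / h) for x in a]
        sw = sum(w)
        mean_x = sum(wi * x for wi, x in zip(w, a)) / sw
        mean_y = sum(wi * y for wi, y in zip(w, m)) / sw
        sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, a))
        sxy = sum(wi * (x - mean_x) * (y - mean_y)
                  for wi, x, y in zip(w, a, m))
        slope = sxy / sxx if sxx > 0 else 0.0  # weighted least squares
        fitted.append(mean_y + slope * (x0 - mean_x))
    return fitted


def lowess_normalize(a, m, frac=0.5):
    """Subtract the fitted intensity-dependent trend from each M value."""
    return [mi - fi for mi, fi in zip(m, lowess_fit(a, m, frac))]


# Made-up data with a purely intensity-dependent bias: M = 0.5*A - 3.
a_values = [float(x) for x in range(6, 16)]
m_values = [0.5 * x - 3.0 for x in a_values]
m_normalized = lowess_normalize(a_values, m_values)
print([round(v, 6) for v in m_normalized])
```

In this toy case the bias is exactly linear in A, so the local fits recover it and the normalized M values are all zero. With real data the fit follows the bulk of the spots, so genuinely differentially expressed genes remain as deviations from the curve.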

Page 92: Introduction to microarray technology and analysis

A note about Affymetrix (1-color) pre-processing

Two “standard” methods
• MAS 5.0 (now GCOS/GDAS) by Affymetrix (compares PM and MM probes)
• RMA by Speed group (UC Berkeley) (ignores MM probes)

[Diagram labels: within-chip; cross-chip; sequence specific; background correction; within-probe set aggregation of intensity values]

Page 93: Introduction to microarray technology and analysis

Normalization – Thoughts

There are many different ways to normalize data
• Global median, LOWESS, LOESS, etc.
• By print tip, spatial, etc.

Choose one wisely

BUT: don’t expect it to fix bad data!
• Won’t make up for lack of replicates
• Won’t make up for horrible slides

Page 94: Introduction to microarray technology and analysis

For next time...
• Read Quackenbush paper on normalization
• Look up the paper on Robust Multichip Averaging (RMA) out of Terry Speed’s lab
• What is meant by least squares?
• Visit the Gene Expression Omnibus (GEO) resource at NCBI and explore what is there
• If you aren’t familiar with the statistical computing environment, R, look it up on the web
• Look up MeV (multi-experiment viewer) on the web.

Page 95: Introduction to microarray technology and analysis


Page 96: Introduction to microarray technology and analysis

[Workflow overview slide, repeated: biological question → experimental design → microarray experiment → image analysis → normalization → estimation / testing / clustering / discrimination → biological verification and interpretation]

Page 97: Introduction to microarray technology and analysis

Analysis

Page 98: Introduction to microarray technology and analysis

Microarray Data Flow

[Flow diagram labels: Microarray experiment; Image analysis; Database; Data selection & missing value estimation; Normalization & centering; Data matrix; Unsupervised analysis (clustering); Supervised analysis; Decomposition techniques; Networks & data integration]

Page 99: Introduction to microarray technology and analysis

[Workflow overview slide, repeated: biological question → experimental design → microarray experiment → image analysis → normalization → estimation / testing / clustering / discrimination → biological verification and interpretation]

Page 100: Introduction to microarray technology and analysis

Interpretation

Page 101: Introduction to microarray technology and analysis

Microarray data on the Web

Several initiatives to create “unified” databases
• EBI: ArrayExpress
• NCBI: Gene Expression Omnibus

Page 102: Introduction to microarray technology and analysis

Normalization - tools

Bioconductor (both Affymetrix and cDNA): Packages in R language

dChip (Affymetrix): Quantile, Invariant set

MAANOVA: microarray ANOVA analysis

Normalization is typically provided in microarray vendor’s software/core facilities, but you should always understand the data you’re working with:
• How has your data been processed?
• Are there any lingering effects?