Lecture 9 Introduction to transcriptional networks Microarray experiments MA plots Normalization of microarray data Tests for differential expression of genes Multiple testing and FDR

Upload: elmer-shepherd

Post on 17-Jan-2016


Lecture 9

Introduction to transcriptional networks

Microarray experiments

MA plots

Normalization of microarray data

Tests for differential expression of genes

Multiple testing and FDR


Transcriptional networks

By the term transcriptional networks we generally mean gene regulatory networks.

Unlike protein-protein interaction networks, transcriptional networks are directed networks.


transcriptional networks: Basic mechanism of gene regulation


Most genes are regulated at the transcription level, and it is assumed that 5-10% of protein-coding genes encode regulatory proteins.

Some regulatory proteins play a targeted role, i.e. they take part in the regulation of only a few genes.

Other regulatory proteins play a more general role in initiating transcription (for example, the eukaryotic transcription factors of type II, or the RNA polymerase itself, which is essential for the transcription of all genes).

Dedicated regulatory proteins are usually considered to be those that affect up to 5% of the genes of a genome.

However, the boundary between generalist and dedicated regulatory proteins is blurred.


Experiments and methods used to determine regulatory relations

1. Complementary DNA microarrays

2. Oligonucleotide chips

3. Reverse transcription polymerase chain reaction

4. Serial analysis of gene expression

5. Chromatin Immunoprecipitation

6. Bioinformatics, e.g. by identifying binding sites


Transcriptional Networks: Case study 1

An extended transcriptional regulatory network of Escherichia coli and analysis of its hierarchical structure and network motifs

Hong-Wu Ma, Bharani Kumar, Uta Ditges, Florian Gunzer, Jan Buer and An-Ping Zeng

Nucleic Acids Research, 2004, Vol. 32, No. 22, 6643–6649

This work combined data sets from 3 different sources:

1. RegulonDB (version 4.0, http://www.cifn.unam.mx/Computational_Genomics/regulondb/)

2. Ecocyc (version 8.0, www.ecocyc.org)

3. Shen-Orr,S.S., Milo,R., Mangan,S. and Alon,U. (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genet., 31, 64–68.


Comparison of the transcriptional regulatory network (TRN) of E. coli from the three data sources: (A) based on the number of genes; (B) based on the number of regulatory interactions.


A combined network that includes all 2624 interactions from the three data sets was produced.

In addition, this work extended the network by adding 23 genes and around 100 regulatory relationships found through a literature survey.

Altogether, the final TRN includes 1278 genes and 2724 interactions.


This work discovered a hierarchical structure in the TRN.

The hierarchical structure was identified as follows:

(1) genes that do not code for transcription factors (TFs), or that code for a TF which only regulates its own expression (auto-regulatory loop), were assigned to layer 1 (the lowest layer);

(2) all the genes in layer 1 were then removed, and in the remaining network the TFs that do not regulate other genes were identified and the corresponding genes assigned to layer 2;

(3) step 2 was repeated, removing the nodes already assigned to a layer and identifying a new layer, until all the genes were assigned to layers. As a result, a nine-layer hierarchical structure was uncovered.


Source: BMC Bioinformatics 2004, 5:199, by related authors.
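The layer-peeling procedure in steps (1) to (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_layers` and the toy gene names are hypothetical, the network is given as (regulator, target) pairs, and auto-regulatory loops are ignored. If only mutually regulating genes remain, the sketch places them together in one final layer; the source does not describe that case, so this choice is an assumption.

```python
def assign_layers(edges):
    """Assign each gene to a layer; edges are (regulator, target) pairs."""
    targets = {}
    nodes = set()
    for regulator, target in edges:
        nodes.update((regulator, target))
        if regulator != target:  # ignore auto-regulatory loops
            targets.setdefault(regulator, set()).add(target)

    layer = {}
    remaining = set(nodes)
    level = 1
    while remaining:
        # Genes that regulate no other gene among those still unassigned.
        bottom = {g for g in remaining if not (targets.get(g, set()) & remaining)}
        if not bottom:
            # Assumption: only regulatory loops are left; put them in one layer.
            bottom = set(remaining)
        for gene in bottom:
            layer[gene] = level
        remaining -= bottom
        level += 1
    return layer


# Toy network with hypothetical gene names: A regulates B and C, B regulates C,
# and C regulates itself.
toy_layers = assign_layers([("A", "B"), ("B", "C"), ("A", "C"), ("C", "C")])
```

In the toy network, the self-regulating gene lands in the lowest layer and the top regulator in the highest one.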


The hierarchical structure implies the absence of cycles, i.e. feedback loops, in the network (though auto-regulatory and inter-regulatory loops exist).

As the network is not complete, we cannot rule out that feedback loops will be found in the future; however, it seems that there would not be many.

A possible biological explanation for the existence of this hierarchical structure is that the interactions in this particular TRN are between proteins and genes without involving metabolites.

Only after a regulating gene has been transcribed, translated and eventually further modified by cofactors or other proteins can it regulate the target gene.

A feedback from the regulated gene at the transcriptional level may delay the process by which the target gene reaches a desired expression level in a new environment.


Feedback control may be mainly through other interactions (e.g. metabolite and protein interaction) at post-transcriptional level rather than through transcriptional interactions between proteins and genes. For example, a gene at the bottom layer may code for a metabolic enzyme, the product of which can bind to a regulator which in turn regulates its expression. In this case, the feedback is through metabolite–protein interaction to change the activity of the transcription factor and then to affect the expression of the regulated gene.

Therefore, to fully understand the regulation of gene expression, an integrated network that includes the different types of interaction is needed.


To calculate network motifs in the E. coli TRN, this work first removed all the loops in the network (including the auto-regulatory loops and the two-gene regulatory loops) and then used the program Mfinder, developed by Kashtan et al., to generate the motif profiles.

Feed-forward loops (FFLs) fall into eight types. The first four types are the so-called coherent FFLs, in which the direct effect of the up regulator is consistent with its indirect effect through the mid regulator. In contrast, the last four types are incoherent, because the direct effect of the up regulator contradicts its indirect effect.
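The coherent/incoherent distinction can be made concrete with a small sketch. It is not Mfinder: it is a brute-force enumeration, assuming a hypothetical input of signed edges (regulator, target, sign) with sign +1 for activation and -1 for repression. An FFL is called coherent when the sign of the direct edge equals the product of the signs along the indirect path.

```python
from itertools import permutations


def classify_ffls(signed_edges):
    """Find feed-forward loops X->Y, Y->Z, X->Z and label them by coherence."""
    # Drop self-loops, mirroring the loop removal described in the text.
    sign = {(u, v): s for u, v, s in signed_edges if u != v}
    nodes = {n for edge in sign for n in edge}
    found = []
    for x, y, z in permutations(nodes, 3):
        if (x, y) in sign and (y, z) in sign and (x, z) in sign:
            indirect = sign[(x, y)] * sign[(y, z)]  # effect via the mid regulator
            kind = "coherent" if indirect == sign[(x, z)] else "incoherent"
            found.append((x, y, z, kind))
    return sorted(found)


# Hypothetical example: X activates Y, Y activates Z, X represses Z.
example_ffls = classify_ffls([("X", "Y", +1), ("Y", "Z", +1), ("X", "Z", -1)])
```

The cubic enumeration is fine for toy inputs; dedicated motif finders use sampling and compare counts against randomized networks, which this sketch does not attempt.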


(A) Gene gadA is regulated by six FFLs. (B) Gene lpd is regulated by five FFLs. (C) Gene slp is regulated by 17 regulators.


DNA Microarray


•Though most cells in an organism contain the same genes, not all of the genes are used in each cell.

•Some genes are turned on, or "expressed" when needed in particular types of cells.

•Microarray technology allows us to look at many genes at once and determine which are expressed in a particular cell type.


•DNA molecules representing many genes, called probes, are placed in discrete spots on a microscope slide.

•Messenger RNA (the working copies of genes within cells) is purified from cells of a particular type.

•The RNA molecules are then "labeled" by attaching a fluorescent dye that allows us to see them under a microscope, and added to the DNA spots on the microarray.

•Due to a phenomenon termed base-pairing, RNA will stick to the probe corresponding to the gene it came from.


Source: PhD thesis by Benjamin Milo Bolstad, 2004, University of California, Berkeley

Usually a gene is interrogated by 11 to 20 probes, and usually each probe is a 25-mer sequence.

The probes are typically spaced widely along the sequence.

Sometimes probes are chosen closer to the 3’ end of the sequence.

A probe that is exactly complementary to the sequence is called a perfect match (PM).

A mismatch (MM) probe differs from the complementary sequence only at the central position.

In theory, MM probes can be used to quantify and remove non-specific hybridization.


Sample preparation and hybridization

Source: PhD thesis by Benjamin Milo Bolstad, 2004, University of California, Berkeley


During the hybridization process cRNA binds to the array.

Earlier chips had all the probes of a probeset located contiguously on the array. This may fall prey to spatial defects.

Newer chips have all the probes spread out across the array.

A PM and MM probe pair are always adjacent on the array.


Growth curve of bacteria

•Samples can be taken at different stages of the growth curve

•One of them is considered as control and others are considered as targets

•Samples can be taken before and after application of drugs

•Samples can be taken under different experimental conditions, e.g. starvation of some metabolite

•Which types of samples should be used depends on the aim of the experiment at hand.


•After washing away all of the unbound RNA, the microarray can be observed under a microscope, and it can be determined which RNA remains bound to the DNA spots.

•Microarray technology can be used to learn which genes are expressed differently in a target sample compared to a control sample (e.g. diseased versus healthy tissue).

However, background correction and normalization are necessary before drawing useful conclusions.


MA plots

MA plots are typically used to compare two color channels, two arrays or two groups of arrays.

The vertical axis is the difference between the logarithms of the signals (the log ratio) and the horizontal axis is the average of the logarithms of the signals.

The M stands for minus and the A stands for add. MA is also a mnemonic for microarray.

For gene i measured on arrays j and k:

$$M_i = \log X_{ij} - \log X_{ik} = \log(X_{ij}/X_{ik}) \quad \text{(log ratio)}$$

$$A_i = [\log X_{ij} + \log X_{ik}]/2 \quad \text{(average log intensity)}$$
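As a small illustration of the two definitions, the following sketch computes M and A for paired intensities. The function name `ma_values` is hypothetical, and the base-2 logarithm is an assumption (the slides write only "log"); any base gives the same picture up to a constant factor.

```python
import math


def ma_values(x_j, x_k):
    """Return (M, A) lists for paired positive intensities from two arrays."""
    m_values = []
    a_values = []
    for xj, xk in zip(x_j, x_k):
        log_j, log_k = math.log2(xj), math.log2(xk)  # assumption: base 2
        m_values.append(log_j - log_k)        # M: log ratio
        a_values.append((log_j + log_k) / 2)  # A: average log intensity
    return m_values, a_values


m_example, a_example = ma_values([8.0, 4.0], [2.0, 4.0])
```

Plotting `m_example` against `a_example` for all genes gives the MA plot; a gene with equal intensity on both arrays has M = 0.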


A typical MA plot

From the first plot we can see differences between the two arrays, but the non-linear trend is not apparent. This is because there are many more points at low intensities than at high intensities.

The MA plot allows us to assess the behavior across all intensities.


Normalization of microarray data

Normalization is the process of removing unwanted non-biological variation that might exist between chips in microarray experiments.

By removing the non-biological variation, normalization makes the biological variation more apparent.


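Typical microarray data

|        | Array 1 | Array 2 | ... | Array j | ... | Array m |
|--------|---------|---------|-----|---------|-----|---------|
| Gene 1 | $X_{11}$ | $X_{12}$ | ... | $X_{1j}$ | ... | $X_{1m}$ |
| Gene 2 | $X_{21}$ | $X_{22}$ | ... | $X_{2j}$ | ... | $X_{2m}$ |
| ...    |         |         |     |         |     |         |
| Gene i | $X_{i1}$ | $X_{i2}$ | ... | $X_{ij}$ | ... | $X_{im}$ |
| ...    |         |         |     |         |     |         |
| Gene n | $X_{n1}$ | $X_{n2}$ | ... | $X_{nj}$ | ... | $X_{nm}$ |
| Mean   | $\bar{X}_1$ | $\bar{X}_2$ | ... | $\bar{X}_j$ | ... | $\bar{X}_m$ |
| SD     | $\sigma_1$ | $\sigma_2$ | ... | $\sigma_j$ | ... | $\sigma_m$ |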


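Normalization within individual arrays

With $\bar{X}_j$ the mean and $\sigma_j$ the standard deviation of array j:

Scaling: $S_{ij} = X_{ij} - \bar{X}_j$

Centering: $C_{ij} = (X_{ij} - \bar{X}_j) / \sigma_j$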
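The two within-array transformations can be sketched as follows, using the slide's own naming ("scaling" subtracts the array mean; "centering" additionally divides by the standard deviation). The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the slides do not say which estimator is meant.

```python
from statistics import mean, stdev


def scale_array(column):
    """Slide's 'scaling': subtract the array mean, so the result has mean 0."""
    mu = mean(column)
    return [x - mu for x in column]


def center_array(column):
    """Slide's 'centering': subtract the mean and divide by the SD."""
    mu = mean(column)
    sd = stdev(column)  # assumption: sample standard deviation
    return [(x - mu) / sd for x in column]


scaled_example = scale_array([1.0, 2.0, 3.0])
centered_example = center_array([2.0, 4.0, 6.0])
```

Each function takes one array (one column of the data matrix) and is applied to every array separately.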


(Figure: effect of scaling and centering normalization. Panels: original data; after scaling, mean = 0; after centering, mean = 0 and standard deviation = 1.)


Normalization between a pair of arrays: Loess (Lowess) normalization

Lowess normalization is applied separately to each two-dye experiment.

This method can be used to normalize the Cy5 and Cy3 channel intensities (usually one of them is the control and the other is the target) using MA plots.


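Two-channel data: each gene i has a control intensity $C_i$ and a target intensity $T_i$.

| Gene | Control | Target |
|------|---------|--------|
| Gene i-1 | $C_{i-1}$ | $T_{i-1}$ |
| Gene i | $C_i$ | $T_i$ |
| Gene i+1 | $C_{i+1}$ | $T_{i+1}$ |

$$M_i = \log(T_i/C_i) \quad \text{(log ratio)}$$

$$A_i = [\log T_i + \log C_i]/2 \quad \text{(average log intensity)}$$

(Figure: MA plot with $M_i$ on the vertical axis and $A_i$ on the horizontal axis. Each point corresponds to a single gene.)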

The MA plot shows some bias.

(Figure: MA plot with a typical regression line.)

Usually several regression lines or polynomials are fitted to different sections of the MA plot.

The final result is a smooth curve providing a model for the data. This model is then used to remove the bias of the data points.
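The idea of fitting local regression lines and subtracting the resulting smooth curve from M can be sketched as follows. This is a simplified illustration, not a full lowess implementation: it uses one weighted straight-line fit per point with tricube weights and omits the robustness iterations of standard lowess. The function names and the `frac` parameter (fraction of points used in each local fit) are assumptions.

```python
def loess_fit(a, m, frac=0.5):
    """Fitted M at each A, from a tricube-weighted local linear regression."""
    n = len(a)
    k = min(n, max(2, int(round(frac * n))))  # points in each neighbourhood
    fitted = []
    for a0 in a:
        # The window width is the distance to the k-th nearest point.
        h = sorted(abs(ai - a0) for ai in a)[k - 1] or 1e-12
        w = [(1 - min(abs(ai - a0) / h, 1.0) ** 3) ** 3 for ai in a]
        sw = sum(w)
        mean_a = sum(wi * ai for wi, ai in zip(w, a)) / sw
        mean_m = sum(wi * mi for wi, mi in zip(w, m)) / sw
        sxx = sum(wi * (ai - mean_a) ** 2 for wi, ai in zip(w, a))
        sxy = sum(wi * (ai - mean_a) * (mi - mean_m)
                  for wi, ai, mi in zip(w, a, m))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(mean_m + slope * (a0 - mean_a))
    return fitted


def loess_normalize(a, m, frac=0.5):
    """Remove the intensity-dependent bias: normalized M = M - fitted curve."""
    return [mi - fi for mi, fi in zip(m, loess_fit(a, m, frac))]


# A purely intensity-dependent (linear) bias is removed completely.
a_vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
m_biased = [0.5 * x - 1.0 for x in a_vals]
m_normalized = loess_normalize(a_vals, m_biased)
```

After normalization the A values are unchanged and only M is corrected, so the cloud of points in the MA plot is re-centred around M = 0.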

(Figure: bias reduction by lowess normalization.)

(Figure: unnormalized fold changes and fold changes after loess normalization.)

Normalization across arrays

Here we discuss two normalization procedures applicable to a set of arrays:

1. Quantile normalization

2. Baseline scaling normalization


Quantile normalization

The goal of quantile normalization is to give the same empirical distribution to the intensities of each array.

If two data sets have the same distribution, then their quantile-quantile plot is a straight diagonal line with slope 1 and intercept 0. Conversely, projecting the data points of the quantile-quantile plot onto the 45-degree line gives the transformation that makes the distributions the same.

The quantile-quantile plot thus motivates the quantile normalization algorithm.


Quantile normalization Algorithm

Source: PhD thesis by Benjamin Milo Bolstad, 2004, University of California, Berkeley


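Quantile normalization: worked example

Original data X:

| No. | Exp. 1 | Exp. 2 |
|-----|--------|--------|
| 1 | 1.6 | 1.2 |
| 2 | 0.6 | 2.8 |
| 3 | 1.8 | 1.8 |
| 4 | 0.8 | 3.8 |
| 5 | 0.4 | 0.8 |

1. Sort each column of X (values) to get X sort. 2. Take the means across the rows of X sort:

| No. | Exp. 1 | No. | Exp. 2 | Mean |
|-----|--------|-----|--------|------|
| 5 | 0.4 | 5 | 0.8 | 0.6 = (0.4+0.8)/2 |
| 2 | 0.6 | 1 | 1.2 | 0.9 |
| 4 | 0.8 | 3 | 1.8 | 1.3 |
| 1 | 1.6 | 2 | 2.8 | 2.2 |
| 3 | 1.8 | 4 | 3.8 | 2.8 |

3. Assign this mean to each element in the row to get X' sort:

| No. | Exp. 1 | No. | Exp. 2 |
|-----|--------|-----|--------|
| 5 | 0.6 | 5 | 0.6 |
| 2 | 0.9 | 1 | 0.9 |
| 4 | 1.3 | 3 | 1.3 |
| 1 | 2.2 | 2 | 2.2 |
| 3 | 2.8 | 4 | 2.8 |

4. Get X normalized by rearranging each column of X' sort to have the same ordering as the original X:

| No. | Exp. 1 | Exp. 2 |
|-----|--------|--------|
| 1 | 2.2 | 0.9 |
| 2 | 0.9 | 2.2 |
| 3 | 2.8 | 1.3 |
| 4 | 1.3 | 2.8 |
| 5 | 0.6 | 0.6 |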
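The four steps of the worked example can be written as a short Python sketch. The function name `quantile_normalize` is hypothetical, each array is passed as one list (a column of X), and tied values are not treated specially, which is a simplifying assumption.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equally long columns (one per array)."""
    n = len(columns[0])
    # Step 1: sort each column, remembering which row each value came from.
    orders = [sorted(range(n), key=lambda i, col=col: col[i]) for col in columns]
    # Step 2: mean across arrays at each rank of the sorted columns.
    rank_means = [
        sum(col[order[r]] for col, order in zip(columns, orders)) / len(columns)
        for r in range(n)
    ]
    # Steps 3 and 4: give every value the mean of its rank, in original order.
    normalized = []
    for order in orders:
        out = [0.0] * n
        for rank, row in enumerate(order):
            out[row] = rank_means[rank]
        normalized.append(out)
    return normalized


exp1 = [1.6, 0.6, 1.8, 0.8, 0.4]
exp2 = [1.2, 2.8, 1.8, 3.8, 0.8]
norm1, norm2 = quantile_normalize([exp1, exp2])
```

Up to floating-point rounding, `norm1` and `norm2` hold the values of the final table in the example, and after normalization both arrays contain exactly the same set of values.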

(Figure: raw data and data after quantile normalization.)

Baseline scaling method

In this method a baseline array is chosen and all the arrays are scaled to have the same mean intensity as this chosen array.

This is equivalent to selecting a baseline array and then fitting a linear regression line without intercept between the chosen array and every other array.
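A minimal sketch of baseline scaling, assuming each array is a list of intensities and that the first array is taken as the baseline by default (the function name and the `baseline` parameter are hypothetical): every array is multiplied by the ratio of the baseline mean to its own mean.

```python
from statistics import mean


def baseline_scale(arrays, baseline=0):
    """Scale every array so its mean intensity equals that of the baseline."""
    target = mean(arrays[baseline])
    return [[x * target / mean(arr) for x in arr] for arr in arrays]


scaled_arrays = baseline_scale([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
```

Because the correction is a single factor per array, it can align the overall level of the arrays but cannot remove an intensity-dependent (non-linear) bias.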

(Figure: raw data and data after baseline scaling normalization.)

Tests for differential expression of genes

Let $x_1, \dots, x_n$ and $y_1, \dots, y_m$ be the independent measurements of the same probe/gene across two conditions.

Whether the gene is differentially expressed between the two conditions can be determined using statistical tests.


Important issues of a test procedure are:

(a) whether the distributional assumptions are valid;

(b) whether the replicates are independent of each other;

(c) whether the number of replicates is sufficient;

(d) whether outliers are removed from the sample.

Replicates from different experiments should not be mixed, since they have different characteristics and cannot be treated as independent replicates.


The most commonly used statistical tests are as follows:

(a) Student’s t-test

(b) Welch’s test

(c) Wilcoxon’s rank sum test

(d) Permutation tests

The first two tests assume that the samples are taken from Gaussian-distributed data, and the p-values are calculated from a probability distribution function. The latter two are nonparametric, and the p-values are calculated using combinatorial arguments.


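Student’s t-test

Assumptions: both samples are taken from Gaussian distributions that have equal variances.

For samples $x_1, \dots, x_n$ and $y_1, \dots, y_m$ with sample variances $s_x^2$ and $s_y^2$, the statistic has the standard form

$$t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{1/n + 1/m}}, \qquad s_p^2 = \frac{(n-1)s_x^2 + (m-1)s_y^2}{m+n-2}$$

Degrees of freedom: m+n-2

Welch’s test is a variant of the t-test in which t is calculated as follows:

$$t = \frac{\bar{x} - \bar{y}}{\sqrt{s_x^2/n + s_y^2/m}}$$

Welch’s test does not assume equal population variances.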


The value of t is supposed to follow a t-distribution. After calculating the value of t, we can determine the p-value from the t-distribution with the corresponding degrees of freedom.
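The two t statistics can be computed with the Python standard library alone, as in the sketch below. The function names are hypothetical. For Welch's test the degrees of freedom use the Welch-Satterthwaite approximation, which the slides do not state, so treat it as an added assumption. The p-value lookup is left out because it needs the t-distribution, which the standard library does not provide.

```python
from statistics import mean, variance


def student_t(x, y):
    """Student's t statistic (pooled variance) and its degrees of freedom."""
    n, m = len(x), len(y)
    pooled = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    t = (mean(x) - mean(y)) / (pooled * (1 / n + 1 / m)) ** 0.5
    return t, n + m - 2


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n, m = len(x), len(y)
    vx, vy = variance(x) / n, variance(y) / m
    t = (mean(x) - mean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t, df


t_student, df_student = student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
t_welch, df_welch = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

With equal sample sizes and equal sample variances, as in this toy input, the two statistics coincide; they differ when the variances or sample sizes are unequal.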


Wilcoxon’s rank sum test

Let $x_1, \dots, x_n$ and $y_1, \dots, y_m$ be the independent measurements of the same probe/gene across two conditions.

Consider the combined set $x_1, \dots, x_n, y_1, \dots, y_m$. The test statistic of the Wilcoxon test is

$$T = \sum_{i=1}^{n} R(x_i)$$

where $R(x_i)$ is the rank of $x_i$ in the combined series, with rank 1 given to the largest value.

The minimum possible value of T is $n(n+1)/2$.

The maximum possible value of T is $n(2m+n+1)/2$.

The minimum and maximum values of T occur if all X data are greater or smaller than the Y data, respectively, i.e. if they are sampled from quite different distributions.


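The expected value and variance of T under the null hypothesis are as follows:

$$E_{H_0}(T) = \frac{n(m+n+1)}{2}, \qquad \mathrm{Var}_{H_0}(T) = \frac{mn(m+n+1)}{12}$$

Unusually low or high values of T compared to the expected value indicate that the null hypothesis should be rejected, i.e. that the samples are not from the same population.

For larger samples, i.e. m+n > 25, we have the following approximation:

$$\frac{T - E_{H_0}(T)}{\sqrt{\mathrm{Var}_{H_0}(T)}} \sim N(0, 1)$$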


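Wilcoxon’s rank sum test (Example)

X data: x1 = 7, x2 = 8, x3 = 5, x4 = 9, x5 = 7 (n = 5)

Y data: y1 = 5, y2 = 6, y3 = 8, y4 = 4 (m = 4)

Combined X and Y data with ranks:

| Item | Value | Rank |
|------|-------|------|
| x4 | 9 | 1 |
| x2 | 8 | 2 |
| y3 | 8 | 3 |
| x5 | 7 | 4 |
| x1 | 7 | 5 |
| y2 | 6 | 6 |
| y1 | 5 | 7 |
| x3 | 5 | 8 |
| y4 | 4 | 9 |

T = R(x1)+R(x2)+R(x3)+R(x4)+R(x5) = 5+2+8+1+4 = 20

E_H0(T) = n(m+n+1)/2 = 5(4+5+1)/2 = 25

Var_H0(T) = mn(m+n+1)/12 = 5*4(4+5+1)/12 = 50/3 ≈ 16.67

P-value = 0.1112 (from chart)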
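The example can be checked with a short sketch. The function name `rank_sum` is hypothetical. Ranks are assigned in decreasing order of value, as in the table; tied values receive the average of their ranks, which is an assumption that differs from the table's tie-breaking but gives the same T for these data. The variance formula is used without a tie correction.

```python
def rank_sum(x, y):
    """Rank sum T of the x sample with its null mean, variance and z score."""
    combined = sorted(x + y, reverse=True)  # rank 1 = largest value
    ranks = {}
    for value in set(combined):
        first = combined.index(value) + 1   # first rank held by this value
        count = combined.count(value)
        ranks[value] = first + (count - 1) / 2  # assumption: midrank for ties
    n, m = len(x), len(y)
    t = sum(ranks[value] for value in x)
    expected = n * (m + n + 1) / 2
    var = m * n * (m + n + 1) / 12
    z = (t - expected) / var ** 0.5  # normal approximation for larger samples
    return t, expected, var, z


t_stat, t_expected, t_var, z_score = rank_sum([7, 8, 5, 9, 7], [5, 6, 8, 4])
```

For such small samples the p-value should come from exact tables, as in the example; the z score is shown only to illustrate the large-sample approximation.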


Multiple testing and FDR

The single-gene analysis using statistical tests has a drawback. It arises from the fact that, when analyzing microarray data, we conduct thousands of tests in parallel.

Suppose we test 10000 genes at a significance level α = 0.05, i.e. a false positive rate of 5%.

This means that we expect about 500 of the individual tests to be false positives even if no gene is differentially expressed, which is clearly unacceptable.

Therefore corrections for multiple testing are applied when analyzing microarray data.


Let $\alpha_g$ be the global significance level and $\alpha_s$ the significance level at the single-gene level.

For a single gene, the probability of making a correct decision is

$$1 - \alpha_s$$

Therefore, assuming independent tests, the probability of making the correct decision for all n genes (i.e. at the global level) is

$$(1 - \alpha_s)^n$$

and the probability of drawing the wrong conclusion in at least one of the n tests is

$$\alpha_g = 1 - (1 - \alpha_s)^n$$

For example, with 100 different genes and $\alpha_s = 0.05$, the probability of making at least one error is 0.994. This is very high. This probability is called the family-wise error rate (FWER).
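The numbers quoted for the multiple-testing problem can be reproduced directly. The sketch below assumes independent tests in which the null hypothesis is true for every gene; the function names are hypothetical.

```python
def family_wise_error_rate(alpha_s, n_tests):
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - alpha_s) ** n_tests


def expected_false_positives(alpha_s, n_tests):
    """Expected number of false positives when every null hypothesis is true."""
    return alpha_s * n_tests


fwer_100 = family_wise_error_rate(0.05, 100)          # about 0.994
false_10000 = expected_false_positives(0.05, 10000)   # about 500
```

The FWER approaches 1 very quickly as the number of tests grows, which is why a per-gene level of 0.05 is unusable for thousands of genes.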


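Using the binomial expansion we can write

$$(1 - \alpha_s)^n = 1 - n\alpha_s + \binom{n}{2}\alpha_s^2 - \dots \approx 1 - n\alpha_s$$

Thus

$$\alpha_g \approx n\alpha_s, \qquad \alpha_s \approx \alpha_g / n$$

Therefore the Bonferroni correction of the single-gene level is the global level divided by the number of tests.

For an FWER of 0.01 with n = 10000 genes, the p-value threshold at the single-gene level must therefore be $10^{-6}$.

Usually very few genes can meet this requirement.

Therefore we need to adjust the threshold p-value for the single-gene case.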
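A small sketch of the Bonferroni correction, with hypothetical function names, shows the single-gene threshold for the example above and checks that the resulting family-wise error rate stays below the global level (again assuming independent tests).

```python
def bonferroni_threshold(alpha_g, n_tests):
    """Single-test significance level for a desired global level alpha_g."""
    return alpha_g / n_tests


def family_wise_error_rate(alpha_s, n_tests):
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - alpha_s) ** n_tests


alpha_single = bonferroni_threshold(0.01, 10000)            # 1e-6
fwer_achieved = family_wise_error_rate(alpha_single, 10000)  # just below 0.01
```

The achieved FWER is slightly below the requested 0.01, which reflects the first-order approximation used in the derivation.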


A method for adjusting p-values is given in the following book:

Westfall P. H. and Young S. S. (1993) Resampling-based multiple testing: examples and methods for p-value adjustment. Wiley, New York.


An alternative to controlling the FWER is to compute the false discovery rate (FDR), the expected proportion of false positives among the genes called significant.

The following papers discuss the FDR:

Storey J. D. and Tibshirani R. (2003) Statistical significance for genomewide studies. PNAS 100, 9440-9445.

Benjamini Y. and Hochberg Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Statist Soc B 57, 289-300.

Still, the practical use of multiple testing is not entirely clear.

However, it is clear that we need to adjust the p-value at the single-gene level when testing many genes together.
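As one concrete way of adjusting p-values for FDR control, the step-up procedure from the cited Benjamini and Hochberg paper can be sketched as below. The slides do not spell the procedure out, so this is an illustration from general knowledge rather than the lecture's own method; the function name `bh_adjust` is hypothetical. Each sorted p-value p(i) is multiplied by n/i, and the results are made monotone from the largest p-value downwards.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping a running minimum
    # so that the adjusted values never decrease with increasing p.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


q_values = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

Genes whose adjusted value is at most the chosen FDR level (for example 0.05) are then called differentially expressed.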
