
Journal of Statistical Planning and Inference 137 (2007) 3352–3360
www.elsevier.com/locate/jspi

A smooth version of the step-up procedure for multiple tests of hypotheses

Arthur Cohen∗,1, John Kolassa2, Harold B. Sackrowitz1

Department of Statistics, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854, USA

Available online 31 March 2007

Abstract

Cohen and Sackrowitz [2005. Characterization of Bayes procedures for multiple endpoint problems and inadmissibility of the step-up procedure. Ann. Statist. 33, 145–158; 2007. More on the inadmissibility of step-up. J. Multivariate Anal. 97, 481–492] have demonstrated that the popular step-up (SU) multiple testing procedure is inadmissible under a wide variety of conditions. All conditions, however, did assume a permutation invariant (symmetric) model. In this paper we find a necessary condition for admissibility of multiple testing procedures in the asymmetric case. Once again SU does not satisfy the condition and is inadmissible. Since SU has a somewhat less favorable practical property and a less favorable theoretical property, we offer a smooth version of SU which retains the favorable practical properties and avoids some of the less favorable ones. In terms of performance the smooth version and nonsmooth version seem to be comparable, at least in low dimensions.
© 2007 Elsevier B.V. All rights reserved.

MSC: 62F03; 62C15

Keywords: Admissibility; Permutation invariant procedures; Nonsymmetric procedures; Exponential family; Classification risk function

1. Introduction

Multiple testing procedures are enjoying a resurgence of interest as a result of new applications to microarrays, stock market mutual funds, educational testing, clinical trials, and psychological experiments. One of the most popular approaches is the step-up (SU) procedure. The version put forward by Benjamini and Hochberg (1995) is designed to control the false discovery rate (FDR). This version has received considerable attention. See, for example, Efron (2003), Genovese and Wasserman (2002), Sarkar (2002) and Dudoit et al. (2003). The latter reference surveys other methods and lists 18 step-wise procedures, six of which are SU. Step-wise procedures have an advantage over Bonferroni-type single-step procedures in that they have better power in some sense. They also have more flexibility than a single-step procedure in the sense that all the data are used when testing each individual hypothesis.

Nevertheless, Cohen and Sackrowitz (CS) (2005, 2007) have shown that SU procedures are inadmissible under a wide variety of conditions. These conditions include different distributions, many dependent situations, two-sided and one-sided alternatives, and different risk functions. All the cases treated by CS thus far involve a permutation invariant model. In other words, the distributional model is symmetric in variables and parameters, the hypotheses and loss functions are symmetric, and only permutation invariant procedures are studied.

∗ Corresponding author. Tel.: +1 7324455305; fax: +1 7324453428. E-mail address: [email protected] (A. Cohen).
1 Research supported by NSF Grant DMS-0457248 and NSA Grant H98230-06-1-007.
2 Research supported by NSF Grant DMS-0505499.

0378-3758/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2007.03.016

We also note that SU procedures have an undesirable practical property, as illustrated in the following examples. Let $z = (z_1, \ldots, z_k)'$ be a vector of independent identically distributed random variables with mean vector $\mu = (\mu_1, \ldots, \mu_k)'$. Test $H_i: \mu_i = 0$ vs. $K_i: \mu_i > 0$. Let $C_1 < C_2 < \cdots < C_k$ be constants and consider a sample point $z = (C_1 + \varepsilon, C_1 + \varepsilon, \ldots, C_1 + \varepsilon)'$ where $\varepsilon > 0$, $\varepsilon < (C_2 - C_1)/2$. Then SU could reject all $k$ hypotheses for this sample point and then accept all $k$ hypotheses for the sample point $(C_1, C_2, \ldots, C_k)'$. That is, a small drop in the first coordinate and an increase in all other coordinates leads SU to go from all reject to all accept. Furthermore, if $k$ is large some of the $C$'s can get very large.

Perhaps a more dramatic illustration concerns the FDR controlling SU procedure for the symmetric case. Here the procedure is expressed in terms of P-values. Suppose $k = 5$ and the P-values are (.4999, .4999, .4999, .4999, .4999). An $\alpha = .05$ FDR controlling SU procedure rejects all five hypotheses while accepting all five hypotheses when the P-values are (.05, .04, .03, .02, .01)$'$.

In this paper we address two issues. For the nonpermutation invariant problem we find a necessary condition for admissibility of procedures for a vector risk function, where the vector has the following two components: the expected number of type I errors and the expected number of type II errors. Such a risk function was used in CS (2005). We remark here that a procedure inadmissible for such a risk function would also be inadmissible for any positive linear combination of the two components of the vector. This allows for different weights for the two types of errors. In particular, if the weights in the linear combination are equal we would have inadmissibility for the classification risk function. The classification risk function corresponds to a loss function that counts the number of errors. It was first used by Lehmann (1957) and subsequently by Genovese and Wasserman (2002), Ishwaran and Rao (2003), and Müller et al. (2004). The SU procedure is again shown to be inadmissible for many exponential family models in this asymmetric setting. See Theorem 3.2.

The other issue discussed in this paper concerns an alternative to the SU procedure that should improve somewhat on some of its less desirable properties. SU was shown to be inadmissible because of a lack of smoothness in crucial parts of the sample space when the sample space is partitioned into sets for various possible actions. In this paper we offer a smooth (in some sense) multiple testing procedure that retains the flavor of SU and some of its desirable properties, while avoiding some of its less desirable properties. The smoothing of SU entails expressing the critical value of any individual test as a convex combination of the critical constants dictated by the procedure. Each coefficient in the convex combination is a sum of products of indicator functions of intervals of the form $(-\infty, C_i)$, where the $C_i$'s are the given critical constants dictated by the procedure. By smoothing these indicator functions, a smooth multiple testing procedure that is close to SU ensues.

In the next section we state a model and give some preliminaries. In Section 3 we give the necessary condition for admissibility and prove that SU is inadmissible. In Section 4 we describe a smooth multiple testing procedure that is akin to SU. Some simulations comparing the performance of the smooth version with the SU procedure are presented. Section 5 contains some discussion.

2. Model and preliminaries

Let $z$ be a random $k \times 1$ vector with distribution $f(z; \theta)$. Consider the multiple testing problem $H_i: \theta_i = \theta_{i0}$ vs. $K_i: \theta_i > \theta_{i0}$, $i = 1, \ldots, k$. Without loss of generality for our applications we assume $\theta_{i0} = 0$. This then is a $2^k$ finite action problem where an action $a = (a_1, \ldots, a_k)'$ is such that $a_i = 1$ means reject $H_i$ and $a_i = 0$ means accept $H_i$. Let $v_i = 0$ if $H_i$ is true and let $v_i = 1$ if $K_i$ is true. In general a test function for the $i$th hypothesis testing problem is denoted by $\phi_i(z)$, where $\phi_i(z)$ is the probability of rejecting $H_i$ when $z$ is observed. A multiple testing procedure is determined uniquely by $\phi = (\phi_1, \ldots, \phi_k)'$. Note $\phi$ can be randomized or nonrandomized. Nonrandomized means each $\phi_i(z)$ takes on the value 0 or 1 for each $z$. The loss function is the vector

$$(a'(1 - v),\ (1 - a)'v). \tag{2.1}$$

The vector risk function is

$$\left( \sum_{i=1}^{k} (1 - v_i) E_\theta \phi_i(z),\ \sum_{i=1}^{k} v_i E_\theta (1 - \phi_i(z)) \right). \tag{2.2}$$


Clearly any procedure which is inadmissible for the risk function (2.2) would be inadmissible for the risk function

$$\sum_{i=1}^{k} (1 - v_i) E_\theta \phi_i(z) + b \sum_{i=1}^{k} v_i E_\theta (1 - \phi_i(z)) \tag{2.3}$$

for any fixed $b > 0$. Furthermore, if a procedure $\phi^{(1)}(z)$ can be beaten by a procedure $\phi^{(2)}(z)$ for every $b$ when the risk function is (2.3), and $\phi^{(2)}(z)$ does not depend on $b$, then $\phi^{(1)}(z)$ is inadmissible for the risk function (2.2).
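As a concrete reading of the loss (2.1) and its weighted scalar form underlying (2.3), the following minimal Python sketch evaluates both for a single action. The function names `loss_vector` and `weighted_loss` are illustrative choices and do not come from the paper.

```python
def loss_vector(a, v):
    """Loss (2.1): (number of type I errors, number of type II errors).

    a[i] = 1 means reject H_i, a[i] = 0 means accept H_i.
    v[i] = 0 means H_i is true, v[i] = 1 means K_i is true.
    """
    type1 = sum(ai * (1 - vi) for ai, vi in zip(a, v))   # a'(1 - v)
    type2 = sum((1 - ai) * vi for ai, vi in zip(a, v))   # (1 - a)'v
    return type1, type2


def weighted_loss(a, v, b):
    """Scalar loss whose expectation is the risk (2.3), for a fixed b > 0."""
    type1, type2 = loss_vector(a, v)
    return type1 + b * type2


# One false rejection (H_1) and one missed alternative (K_4).
print(loss_vector([1, 0, 1, 0], [0, 0, 1, 1]))
print(weighted_loss([1, 0, 1, 0], [0, 0, 1, 1], b=1))  # classification loss
```

With $b = 1$ the weighted loss simply counts errors, which is the classification loss mentioned in the Introduction.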

The SU procedure for the permutation invariant model given in Hochberg and Tamhane (1987) is as follows: let $z_{(1)} \le z_{(2)} \le \cdots \le z_{(k)}$ be the order statistics derived from $z$ and let $C_i$ be a nondecreasing set of positive critical constants with $C_1 < C_k$.

(i) If $z_{(1)} \le C_1$, accept $H_{(1)}$, where $H_{(1)}$ is the hypothesis corresponding to $z_{(1)}$. Otherwise reject all $H_{(i)}$.
(ii) If $H_{(1)}$ is accepted, accept $H_{(2)}$ if $z_{(2)} \le C_2$. Otherwise reject $H_{(2)}, \ldots, H_{(k)}$.
(iii) In general, at stage $j$, if $z_{(j)} \le C_j$, accept $H_{(j)}$. Otherwise reject $H_{(j)}, \ldots, H_{(k)}$.
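Steps (i)–(iii) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `step_up` and the constants `C = [1, 2, 3, 4]` are hypothetical and are not the critical constants used in the paper. The two calls at the end reproduce the first illustration from the Introduction, where a point just above $C_1$ in every coordinate is rejected everywhere while the point $(C_1, \ldots, C_k)'$ is accepted everywhere.

```python
def step_up(z, C):
    """Symmetric SU rule: return 0/1 decisions (1 = reject) in the order of z.

    z: observed statistics; C: nondecreasing critical constants C_1..C_k.
    """
    k = len(z)
    order = sorted(range(k), key=lambda i: z[i])  # indices of z_(1) <= ... <= z_(k)
    decisions = [0] * k
    for stage, idx in enumerate(order):
        if z[idx] > C[stage]:
            # z_(j) exceeds C_j: reject H_(j), ..., H_(k) and stop.
            for j in order[stage:]:
                decisions[j] = 1
            break
        # Otherwise H_(j) is accepted and the next order statistic is examined.
    return decisions


C = [1.0, 2.0, 3.0, 4.0]          # hypothetical constants
eps = 0.25                        # satisfies 0 < eps < (C_2 - C_1) / 2
print(step_up([C[0] + eps] * 4, C))   # every coordinate just above C_1
print(step_up(C, C))                  # the sample point (C_1, ..., C_k)
```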

For the asymmetric case the SU procedure is as follows: let $C_{ij}$, $i = 1, \ldots, k$; $j = 1, \ldots, k$, be critical constants such that for each $i$, $C_{i1} \le C_{i2} \le \cdots \le C_{i(j-1)} < C_{ij} \le C_{i(j+1)} \le \cdots \le C_{ik}$ for some $j = 2, \ldots, k$.

(i) If $z_i > C_{i1}$ for all $i$, reject all $H_i$.
(ii) If $z_i \le C_{i1}$ for exactly one $i$, and $z_j > C_{j2}$ for all $j = 1, \ldots, k$, $j \ne i$, accept $H_i$ and reject all other hypotheses.
(iii) In general, if at least one $z_i \le C_{i1}$, at least two $z_i$'s are less than or equal to $C_{i2}$, ..., at least $(m - 1)$ $z_i$'s are less than or equal to $C_{i(m-1)}$, and the remaining $(k - m + 1)$ $z_i$'s exceed $C_{im}$, then reject all $H_i$ corresponding to these remaining $(k - m + 1)$ $z_i$'s and accept all other $H_i$'s.

3. A necessary condition for admissibility

In this section we will assume that $z$ has an exponential family distribution with $\theta$ a vector of natural parameters. As examples, the components of $z$ can be independent normals with different known variances or sample means based on different sample sizes, sums of independent Poisson variables based on different sample sizes, or independent binomials based on different sample sizes. We wish to test $H_i: \theta_i = 0$ vs. $K_i: \theta_i > 0$, $i = 1, \ldots, k$.

In order to give a necessary condition for admissibility of a multiple testing procedure for the risk function in (2.2), we first talk about finite cylinders of radius $r$ with center line segment $L$. By this we mean a neighborhood of line segments, equal in length, and parallel to $L$. Formally, suppose we have the line segment $L = \{z : z = z^* + \lambda d,\ 0 \le \lambda \le \lambda^*\}$. Let $B = \{z : (z - z^*)'(z - z^*) \le r^2,\ d'z = d'z^*\}$ be the set of points within $r$ units of $z^*$ in the hyperplane orthogonal to $L$ and through $z^*$. Then the cylinder of radius $r$ with center $L$ is made up of line segments and can be written as

$$W(r, d, z^*, \lambda^*) = \{z : z = z^{**} + \lambda d,\ 0 \le \lambda \le \lambda^*,\ z^{**} \in B\}. \tag{3.1}$$

$B$ can be thought of as an end of the cylinder. If $r = 0$ then the cylinder consists only of the line segment $L$.

We will study the behavior of procedures within cylinders in the sample space. In particular we focus on cylinders centered about line segments of the form $z = z^* + \lambda d$ where $d$ has, at most, two nonzero elements. Of special interest is the behavior of procedures between hyperplanes orthogonal to the cylinder (i.e., orthogonal to $d$).
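Membership in the cylinder (3.1) can be checked by splitting $z - z^*$ into its component along $d$ and the remainder orthogonal to $d$. The sketch below is one way to do this; the function name `in_cylinder`, the tolerance argument, and the numerical inputs are illustrative assumptions, not part of the paper.

```python
import math


def in_cylinder(z, r, d, z_star, lam_star, tol=1e-12):
    """Return True if z lies in W(r, d, z_star, lam_star) as in (3.1)."""
    diff = [zi - si for zi, si in zip(z, z_star)]
    dd = sum(di * di for di in d)
    # Coefficient lambda of the projection of z - z_star onto d.
    lam = sum(di * xi for di, xi in zip(d, diff)) / dd
    if lam < -tol or lam > lam_star + tol:
        return False
    # The remainder is orthogonal to d; it must lie within r of the center line.
    resid = [xi - lam * di for xi, di in zip(diff, d)]
    return math.sqrt(sum(x * x for x in resid)) <= r + tol


d = [0.0, 1.0, -1.0]            # at most two nonzero elements, as in the text
z_star = [0.0, 0.0, 0.0]
print(in_cylinder([0.05, 0.5, -0.5], 0.1, d, z_star, 1.0))  # inside
print(in_cylinder([0.20, 0.5, -0.5], 0.1, d, z_star, 1.0))  # too far from L
print(in_cylinder([0.00, 1.5, -1.5], 0.1, d, z_star, 1.0))  # beyond the end of L
```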

Scenario A: We will say that a procedure $\phi(z)$ contains Scenario A if we can find a cylinder, $W(r, d, z^*, \lambda^*)$, and three (equally spaced and equal width) regions

$$S_i = \{z : d'(z^* + \lambda_i d) \le d'z \le d'(z^* + \lambda_{i+1} d)\}, \quad i = 1, 3, 5,$$

with the following properties:

(1) $d_j = 0$ if $j \ne j_1$ and $j \ne j_2$ for some fixed $j_1$ and $j_2$. Also $d_{j_1} \ge 0$, $d_{j_2} \le 0$ with at least one strict inequality.
(2) $0 = \lambda_1 \le \lambda_2 < \lambda_3 \le \lambda_4 < \lambda_5 \le \lambda_6$.
(3) $\lambda_2 - \lambda_1 = \lambda_4 - \lambda_3 = \lambda_6 - \lambda_5$ and $\lambda_3 - \lambda_2 = \lambda_5 - \lambda_4$.
(4) $h^* = \inf_{z \in W \cap (S_1 \cup S_3 \cup S_5)} h(z) > 0$, where $h(z)$ is given in (3.2) below.
(5) $(\phi_{j_1}(z), \phi_{j_2}(z)) = (0, 0)$ when $z \in S_i \cap W$, $i = 1, 5$, but $(\phi_{j_1}(z), \phi_{j_2}(z)) = (1, 1)$ when $z \in S_3 \cap W$.

In words, Scenario A describes a situation in which a procedure, when focused on two hypotheses, takes actions (accept, accept), then (reject, reject), and then again (accept, accept) as one moves along the cylinder. By the following theorem such behavior violates admissibility.

Theorem 3.1. Let $Z$ have density

$$dF_\theta(z) = \beta(\theta) h(z) e^{z'\theta}\, d\nu(z), \tag{3.2}$$

where $\nu$ is either counting measure on the integers or Lebesgue measure. If $\phi^A(z)$ contains a Scenario A (as defined above) with a cylinder having positive measure, then $\phi^A$ is inadmissible when the risk function is (2.2).

Proof. We will modify $\phi^A(z)$ to arrive at a new procedure $\phi^*(z)$. The new procedure will be better than $\phi^A(z)$ for the risk function (2.3) for every $b > 0$, and $\phi^*(z)$ will not depend on $b$. This will imply $\phi^A(z)$ is inadmissible for risk function (2.2).

To begin with, if $j \ne j_1$ or $j_2$ then $\phi^*_j(z) = \phi^A_j(z)$. Furthermore, if $z \notin W \cap (S_1 \cup S_3 \cup S_5)$ we take $\phi^*_{j_1}(z) = \phi^A_{j_1}(z)$ and $\phi^*_{j_2}(z) = \phi^A_{j_2}(z)$. It remains to define $\phi^*_{j_1}(z)$ and $\phi^*_{j_2}(z)$ when $z \in W \cap (S_1 \cup S_3 \cup S_5)$. Let $\gamma(z) = \tfrac{1}{2} h^*/h(z)$. Next define

$$(\phi^*_{j_1}(z), \phi^*_{j_2}(z)) = \begin{cases}
(0, 1) & \text{with probability } \gamma(z) \text{ if } z \in W \cap S_1, \\
(0, 0) & \text{with probability } 1 - \gamma(z) \text{ if } z \in W \cap S_1, \\
(0, 1) & \text{with probability } \gamma(z) \text{ if } z \in W \cap S_3, \\
(1, 0) & \text{with probability } \gamma(z) \text{ if } z \in W \cap S_3, \\
(1, 1) & \text{with probability } 1 - 2\gamma(z) \text{ if } z \in W \cap S_3, \\
(1, 0) & \text{with probability } \gamma(z) \text{ if } z \in W \cap S_5, \\
(0, 0) & \text{with probability } 1 - \gamma(z) \text{ if } z \in W \cap S_5.
\end{cases} \tag{3.3}$$

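To compare $\phi^A$ and $\phi^*$, we define the difference in risks $DR(\theta)$ as the risk of $\phi^A$ minus the risk of $\phi^*$. We will now show $DR(\theta) \ge 0$ for all $\theta$ with strict inequality for some $\theta$. Suppose $(v_{j_1}, v_{j_2}) = (0, 0)$. Then

$$\begin{aligned}
DR(\theta) &= \int_{W \cap S_1} (-\gamma(z))\, dF_\theta(z) + \int_{W \cap S_3} 2\gamma(z)\, dF_\theta(z) + \int_{W \cap S_5} (-\gamma(z))\, dF_\theta(z) \\
&= -\frac{1}{2} \int_{W \cap S_1} h^* \beta(\theta) e^{z'\theta}\, d\nu(z) + \int_{W \cap S_3} h^* \beta(\theta) e^{z'\theta}\, d\nu(z) - \frac{1}{2} \int_{W \cap S_5} h^* \beta(\theta) e^{z'\theta}\, d\nu(z),
\end{aligned} \tag{3.4}$$

which, by translating $S_5$ and $S_3$ onto $S_1$, can be written as

$$\begin{aligned}
DR(\theta) &= -\frac{1}{2} h^* \beta(\theta) \int_{W \cap S_1} \left[ e^{z'\theta} + e^{(z + (\lambda_5 - \lambda_1) d)'\theta} - 2 e^{(z + (\lambda_3 - \lambda_1) d)'\theta} \right] d\nu(z) \\
&= -\frac{1}{2} h^* \beta(\theta) \int_{W \cap S_1} e^{z'\theta} \left[ 1 - 2 e^{(\lambda_3 - \lambda_1) d'\theta} + e^{2(\lambda_3 - \lambda_1) d'\theta} \right] d\nu(z) \\
&= -\frac{1}{2} h^* \beta(\theta) \int_{W \cap S_1} e^{z'\theta} \left[ 1 - e^{(\lambda_3 - \lambda_1) d'\theta} \right]^2 d\nu(z).
\end{aligned} \tag{3.5}$$

When $\theta_{j_1} = \theta_{j_2} = 0$ we have $d'\theta = 0$, as $d_j = 0$ for $j \ne j_1$ or $j_2$. Thus, for $(v_{j_1}, v_{j_2}) = (0, 0)$ we have $DR(\theta) = 0$. Now suppose $(v_{j_1}, v_{j_2}) = (1, 1)$. In this case $DR(\theta)$ is equal to $(-b)$ times equation (3.5). Thus, $DR(\theta) \ge 0$ for all such $\theta$, with strict inequality whenever $d'\theta \ne 0$.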

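Next suppose $(v_{j_1}, v_{j_2}) = (0, 1)$. Then the difference in risks is

$$DR(\theta) = \int_{W \cap S_1} b\gamma(z)\, dF_\theta(z) + \int_{W \cap S_3} (1 - b)\gamma(z)\, dF_\theta(z) + \int_{W \cap S_5} (-\gamma(z))\, dF_\theta(z).$$

Continuing as in (3.4) and (3.5) gives

$$\begin{aligned}
DR(\theta) &= \frac{1}{2} h^* \beta(\theta) \int_{W \cap S_1} e^{z'\theta} \left[ b - b e^{(\lambda_3 - \lambda_1) d'\theta} + e^{(\lambda_3 - \lambda_1) d'\theta} - e^{2(\lambda_3 - \lambda_1) d'\theta} \right] d\nu(z) \\
&= \frac{1}{2} h^* \beta(\theta) \int_{W \cap S_1} e^{z'\theta} \left[ b + e^{(\lambda_3 - \lambda_1) d'\theta} \right] \left[ 1 - e^{(\lambda_3 - \lambda_1) d'\theta} \right] d\nu(z),
\end{aligned}$$

which is $\ge 0$ as $d'\theta \le 0$ when $d_{j_2} \le 0$ and $\theta_{j_1} = 0$. Lastly suppose $(v_{j_1}, v_{j_2}) = (1, 0)$. Then arguing as above we get

$$DR(\theta) = \frac{1}{2} h^* \beta(\theta) \int_{W \cap S_1} e^{z'\theta} \left[ 1 + b e^{(\lambda_3 - \lambda_1) d'\theta} \right] \left[ e^{(\lambda_3 - \lambda_1) d'\theta} - 1 \right] d\nu(z),$$

which is $\ge 0$ as $d'\theta \ge 0$ when $d_{j_1} \ge 0$ and $\theta_{j_2} = 0$. This completes the proof of the theorem. □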
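The step "continuing as in (3.4) and (3.5)" in the proof can be unpacked region by region. The following is a sketch for the case $(v_{j_1}, v_{j_2}) = (0, 1)$, writing $\gamma(z)$ for the randomization probability $\tfrac{1}{2} h^*/h(z)$ used in (3.3) and $x = (\lambda_3 - \lambda_1) d'\theta$; the shorthand $x$ is introduced here for illustration only. Under (2.3), the loss contributed by coordinates $j_1$ and $j_2$ in this case is 1 if $H_{j_1}$ is rejected plus $b$ if $H_{j_2}$ is accepted.

```latex
\begin{array}{llll}
\text{region} & \text{loss of } \phi^A & \text{expected loss of } \phi^* & \text{difference} \\
W \cap S_1 & b & (1-\gamma)\,b & b\,\gamma \\
W \cap S_3 & 1 & \gamma \cdot 0 + \gamma\,(1+b) + (1-2\gamma)\cdot 1 = 1 + (b-1)\,\gamma & (1-b)\,\gamma \\
W \cap S_5 & b & \gamma\,(1+b) + (1-\gamma)\,b = b + \gamma & -\gamma
\end{array}

% Translating S_3 and S_5 onto S_1 multiplies the integrand by e^{x} and e^{2x},
% because condition (3) gives \lambda_5 - \lambda_1 = 2(\lambda_3 - \lambda_1). Hence
b + (1-b)\,e^{x} - e^{2x} = (b + e^{x})(1 - e^{x}).
```

The case $(v_{j_1}, v_{j_2}) = (1, 0)$ follows the same pattern with the roles of the two coordinates exchanged, giving the bracket $-1 + (1-b)e^{x} + b e^{2x} = (1 + b e^{x})(e^{x} - 1)$.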

At this point our goal is to demonstrate that, for a large subclass of the exponential family distributions given in (3.2), the SU procedure, denoted by $\phi^{SU}(z)$, contains a Scenario A. This would imply that SU is inadmissible in these cases. Toward this end we start by assuming without loss of generality that the critical constants $C_{ij}$ determining $\phi^{SU}(z)$ are such that $C_{i(k-1)} < C_{ik}$ for every $i$. Also without loss of generality assume $\Delta = C_{kk} - C_{k(k-1)} < C_{(k-1)k} - C_{(k-1)(k-1)}$, and for the case where $\nu$ is counting measure on the integers assume the $C_{ij}$ are integers such that $C_{kk} > C_{k(k-1)} + 1$ and $C_{(k-1)k} > C_{(k-1)(k-1)} + 1$.

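We proceed with the case where $\nu$ is Lebesgue measure. In this case let

$$Z_1 = \{z : z_i \le C_{i1},\ i = 1, 2, \ldots, k - 2\}.$$

Note for any $z \in Z_1$, $\phi^{SU}(z) = (0, \ldots, 0, \phi^{SU}_{k-1}(z), \phi^{SU}_k(z))'$, where $\phi^{SU}_{k-1}(z)$ and $\phi^{SU}_k(z)$ depend only on $z_{k-1}$ and $z_k$.

Now take $d = (0, 0, \ldots, 0, 1, -1)'$, $z^* = (C_{11} - r, C_{21} - r, \ldots, C_{(k-2)1} - r, C_{(k-1)(k-1)} - \Delta/4, C_{k(k-1)} + 3\Delta/4)'$ so that $z^* \in Z_1$, $\lambda_1 = 0$, $\lambda_2 = \Delta/8$, $\lambda_3 = 7\Delta/16$, $\lambda_4 = 9\Delta/16$, $\lambda_5 = 14\Delta/16$, $\lambda_6 = \lambda^* = \Delta$, $r = \Delta/8$. The above $r$, $d$, $z^*$, $\lambda^*$, $\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$, $\lambda_5$, $\lambda_6$ determine $W$, $S_1$, $S_3$, $S_5$. Furthermore, for $z \in S_i \cap W$, $i = 1, 5$, $(\phi^{SU}_{k-1}(z), \phi^{SU}_k(z)) = (0, 0)$, while for $z \in S_3 \cap W$, $(\phi^{SU}_{k-1}(z), \phi^{SU}_k(z)) = (1, 1)$.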

For the case where $\nu$ is counting measure on the integers, note that $\phi^{SU}(z) = (0, \ldots, 0, 0, 0)'$ if $z = (C_{11}, \ldots, C_{(k-2)1}, C_{(k-1)(k-1)}, C_{k(k-1)} + 2)'$, $\phi^{SU}(z) = (0, \ldots, 0, 1, 1)'$ if $z = (C_{11}, \ldots, C_{(k-2)1}, C_{(k-1)(k-1)} + 1, C_{k(k-1)} + 1)'$, and $\phi^{SU}(z) = (0, \ldots, 0, 0, 0)'$ if $z = (C_{11}, \ldots, C_{(k-2)1}, C_{(k-1)(k-1)} + 2, C_{k(k-1)})'$. Then take $d = (0, \ldots, 0, 1, -1)'$, $z^* = (C_{11}, C_{21}, \ldots, C_{(k-2)1}, C_{(k-1)(k-1)}, C_{k(k-1)} + 2)'$, $r = 0$, $\lambda_1 = 0$, $\lambda_2 = 0$, $\lambda_3 = \sqrt{2}$, $\lambda_4 = \sqrt{2}$, $\lambda_5 = 2\sqrt{2}$, $\lambda_6 = 2\sqrt{2}$.

This determines $W$, $S_1$, $S_3$, $S_5$ satisfying (1), (2), (3) and (5) of Scenario A.

Thus, we can now state

Theorem 3.2. Let $Z$ have density (3.2). Assume $h(z)$ in (3.2) is such that condition (4) of Scenario A is satisfied for the above choices of $W$, $S_1$, $S_3$ and $S_5$. Then $\phi^{SU}(z)$ is inadmissible.

Proof. This is an immediate consequence of Theorem 3.1. □

4. A smooth version of SU

In this section we consider the permutation invariant version of SU and assume the model of Section 2, also assuming the random variables are exchangeable. For this model, to describe any permutation invariant procedure we need only express the critical region for one hypothesis, say $H_1$, in the form $z_1 > Q_k(z^{(1)})$, where $z^{(1)} = (z_2, \ldots, z_k)'$.

We note that for SU, $Q_k(z^{(1)})$ is a degenerate convex combination of the critical constants. That is,

$$Q_k(z^{(1)}) = \sum_{j=1}^{k} C_j A_{jk},$$

and $A_{jk}$ is a linear combination of products of indicator functions. All $A_{jk} = 0$ except for one $j$. The smooth version of SU will be obtained by replacing the indicator functions by continuous functions.

To find $A_{jk}$ we use the reasoning of balls in boxes used to determine multinomial probabilities. Define

$$G_i(z) = \begin{cases} 1 & \text{if } z \le C_i, \\ 0 & \text{if } z > C_i, \end{cases} \tag{4.1}$$

$i = 1, 2, \ldots, k$. Note $Q_k(z^{(1)}) = C_1$ if all $z_i$, $i = 2, \ldots, k$, satisfy $z_i > C_1$. Therefore, take

$$A_{1k} = \prod_{i=2}^{k} (1 - G_1(z_i)). \tag{4.2}$$

Next note $Q_k(z^{(1)}) = C_2$ if exactly one $z_i \le C_1$ and all other $z_i > C_2$. Hence

$$A_{2k} = \sum_{i=2}^{k} G_1(z_i) \prod_{\substack{j=2 \\ j \ne i}}^{k} (1 - G_2(z_j)). \tag{4.3}$$

$Q_k(z^{(1)}) = C_m$ if at least one $z_i \le C_1$, at least two $z_i$'s are less than or equal to $C_2$, ..., at least $(m - 2)$ $z_i$'s are less than or equal to $C_{m-2}$, exactly $(m - 1)$ $z_i$'s are less than or equal to $C_{m-1}$, and the remaining $(k - 1) - (m - 1) = k - m$ $z_i$'s exceed $C_m$. Hence

$$A_{mk} = \sum_{\substack{\alpha_1 \ge 1,\ \alpha_1 + \alpha_2 \ge 2,\ \ldots, \\ \alpha_1 + \alpha_2 + \cdots + \alpha_{m-1} = m-1}} \ \sum_{B} \ \prod_{\alpha_1 \text{ terms}} G_1(z_j) \prod_{\alpha_2 \text{ terms}} (G_2(z_j) - G_1(z_j)) \cdots \prod_{\alpha_{m-1} \text{ terms}} (G_{m-1}(z_j) - G_{m-2}(z_j)) \prod_{k-m \text{ terms}} (1 - G_m(z_j)), \tag{4.4}$$

where the above products are over disjoint sets of $j$'s. Here $B$ means the following: for each fixed $\alpha_1, \alpha_2, \ldots, \alpha_{m-1}$, each unique permutation of $z_2, \ldots, z_k$ is made to comprise the inner sum. Hence the sum over $B$, for fixed $\alpha_1, \alpha_2, \ldots, \alpha_{m-1}, k - m$, has $(k - 1)!/(\alpha_1! \cdots \alpha_{m-1}! (k - m)!)$ terms. The terms are generated by permuting the arguments of the $G$ functions or differences of $G$ functions.

For example, let $k = 4$. Then from (4.2)

$$A_{1k} = (1 - G_1(z_2))(1 - G_1(z_3))(1 - G_1(z_4)).$$

From (4.3)

$$\begin{aligned}
A_{2k} ={}& G_1(z_2)(1 - G_2(z_3))(1 - G_2(z_4)) + G_1(z_3)(1 - G_2(z_2))(1 - G_2(z_4)) \\
&+ G_1(z_4)(1 - G_2(z_2))(1 - G_2(z_3)).
\end{aligned}$$

From (4.4)

$$\begin{aligned}
A_{3k} ={}& G_1(z_2)(G_2(z_3) - G_1(z_3))(1 - G_3(z_4)) + G_1(z_2)(G_2(z_4) - G_1(z_4))(1 - G_3(z_3)) \\
&+ G_1(z_3)(G_2(z_2) - G_1(z_2))(1 - G_3(z_4)) + G_1(z_3)(G_2(z_4) - G_1(z_4))(1 - G_3(z_2)) \\
&+ G_1(z_4)(G_2(z_2) - G_1(z_2))(1 - G_3(z_3)) + G_1(z_4)(G_2(z_3) - G_1(z_3))(1 - G_3(z_2)) \\
&+ G_1(z_2)G_1(z_3)(1 - G_3(z_4)) + G_1(z_2)G_1(z_4)(1 - G_3(z_3)) + G_1(z_3)G_1(z_4)(1 - G_3(z_2)).
\end{aligned}$$


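To determine $A_{4k}$ note first that there are five sets of $(\alpha_1, \alpha_2, \alpha_3)$ in the sum of (4.4). These are (1, 1, 1) yielding six terms, (1, 2, 0) yielding three terms, (2, 1, 0) yielding three terms, (2, 0, 1) yielding three terms and (3, 0, 0) yielding one term. Thus

$$\begin{aligned}
A_{4k} ={}& G_1(z_2)(G_2(z_3) - G_1(z_3))(G_3(z_4) - G_2(z_4)) \\
&+ G_1(z_3)(G_2(z_2) - G_1(z_2))(G_3(z_4) - G_2(z_4)) \\
&+ G_1(z_2)(G_2(z_4) - G_1(z_4))(G_3(z_3) - G_2(z_3)) \\
&+ G_1(z_3)(G_2(z_4) - G_1(z_4))(G_3(z_2) - G_2(z_2)) \\
&+ G_1(z_4)(G_2(z_2) - G_1(z_2))(G_3(z_3) - G_2(z_3)) \\
&+ G_1(z_4)(G_2(z_3) - G_1(z_3))(G_3(z_2) - G_2(z_2)) \\
&+ G_1(z_2)(G_2(z_3) - G_1(z_3))(G_2(z_4) - G_1(z_4)) \\
&+ G_1(z_3)(G_2(z_2) - G_1(z_2))(G_2(z_4) - G_1(z_4)) \\
&+ G_1(z_4)(G_2(z_2) - G_1(z_2))(G_2(z_3) - G_1(z_3)) \\
&+ G_1(z_2)G_1(z_3)(G_2(z_4) - G_1(z_4)) \\
&+ G_1(z_2)G_1(z_4)(G_2(z_3) - G_1(z_3)) \\
&+ G_1(z_3)G_1(z_4)(G_2(z_2) - G_1(z_2)) \\
&+ G_1(z_2)G_1(z_3)(G_3(z_4) - G_2(z_4)) \\
&+ G_1(z_2)G_1(z_4)(G_3(z_3) - G_2(z_3)) \\
&+ G_1(z_3)G_1(z_4)(G_3(z_2) - G_2(z_2)) \\
&+ G_1(z_2)G_1(z_3)G_1(z_4).
\end{aligned}$$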

By replacing the step functions $G_i$ by a properly chosen set of continuous functions $G^*_i$, one would expect to obtain a smooth version of the SU procedure. One choice of $G^*_i$ functions that yields a procedure with performance characteristics very similar to those of the SU procedure is

$$G^*_i(z) = \begin{cases}
1 & \text{if } z < C_i + 0.2i - 1, \\
(z - C_i - 0.2i)^2 & \text{if } C_i + 0.2i - 1 \le z \le C_i + 0.2i, \\
0 & \text{if } C_i + 0.2i \le z.
\end{cases}$$
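The construction of $Q_k$ from (4.1)–(4.4) can be sketched by brute-force enumeration: each of the $k - 1$ remaining coordinates is assigned either to one of the bins $1, \ldots, m - 1$ or to the set "exceeds $C_m$", and only assignments whose bin counts satisfy the constraints under the outer sum of (4.4) are kept. The code below is a minimal illustration, not the authors' implementation; the names `step_G`, `smooth_G`, `A`, `Q` and the constants `C = [1, 2, 3, 4]` are hypothetical. Passing `step_G` gives the SU critical value, and passing `smooth_G` gives the smoothed one.

```python
from itertools import product


def step_G(i, z, C):
    """Indicator G_i of (4.1): 1 if z <= C_i, else 0 (i is 1-based)."""
    return 1.0 if z <= C[i - 1] else 0.0


def smooth_G(i, z, C):
    """Continuous replacement G*_i with the quadratic ramp given in the text."""
    upper = C[i - 1] + 0.2 * i
    if z < upper - 1.0:
        return 1.0
    if z <= upper:
        return (z - upper) ** 2
    return 0.0


def A(m, others, C, G):
    """Coefficient A_{mk} of (4.4) for the k-1 remaining coordinates `others`."""
    k = len(others) + 1
    total = 0.0
    # Label b in 1..m-1 places a coordinate in bin b; label m means "exceeds C_m".
    for labels in product(range(1, m + 1), repeat=k - 1):
        if labels.count(m) != k - m:
            continue
        cumulative, feasible = 0, True
        for b in range(1, m):
            cumulative += labels.count(b)
            if cumulative < b:      # need at least b coordinates in bins 1..b
                feasible = False
                break
        if not feasible:
            continue
        term = 1.0
        for label, z in zip(labels, others):
            if label == m:
                term *= 1.0 - G(m, z, C)
            elif label == 1:
                term *= G(1, z, C)
            else:
                term *= G(label, z, C) - G(label - 1, z, C)
        total += term
    return total


def Q(others, C, G):
    """Critical value Q_k = sum_m C_m * A_{mk} for testing H_1."""
    k = len(others) + 1
    return sum(C[m - 1] * A(m, others, C, G) for m in range(1, k + 1))


C = [1.0, 2.0, 3.0, 4.0]                     # hypothetical constants, k = 4
print(Q((5.0, 5.0, 5.0), C, step_G))         # all others exceed C_1
print(Q((0.5, 0.7, 2.5), C, step_G))         # three others at or below C_3
print(Q((0.8, 2.1, 3.3), C, smooth_G))       # smoothed critical value
```

The enumeration visits on the order of $m^{k-1}$ labelings for each $m$, so the cost grows very quickly with $k$; this sketch makes no attempt to reduce that cost.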

Table 1 gives the probabilities of Type I and Type II errors, the FDR and the false nondiscovery rate (FNR) for both this new procedure and the SU procedure. Monte Carlo calculations for each line of Tables 1–3 were performed by drawing 10,000 independent random vectors, each of which is a vector of independent normal random variables with variance one and means given by the first columns of the tables. Type I and Type II error rates were computed as the proportion of times true null hypotheses are rejected, and the proportion of times false null hypotheses are not rejected, respectively. When all null hypotheses are false, the Type I error rate is defined to be 0, and when all null hypotheses are true, the Type II error rate is defined to be 0. The FDRs and FNRs are computed as the proportion of times rejected hypotheses are true, and the proportion of times unrejected hypotheses are false, respectively.

Table 1
Performance characteristics of the step-up (SU) procedure and a smoothed version procedure for k = 4

True means           Type I error     Type II error    FDR              FNR
                     Smooth  SU       Smooth  SU       Smooth  SU       Smooth  SU
0.0 0.0 0.0 0.0      0.013   0.013    0.000   0.000    0.050   0.049    0.000   0.000
1.5 0.0 0.0 0.0      0.019   0.018    0.757   0.760    0.040   0.038    0.192   0.192
1.5 1.5 0.0 0.0      0.025   0.023    0.724   0.731    0.028   0.026    0.396   0.398
1.5 1.5 1.5 0.0      0.033   0.028    0.690   0.706    0.016   0.013    0.635   0.638
1.5 1.5 1.5 1.5      0.000   0.000    0.654   0.679    0.000   0.000    0.964   0.964
3.0 0.0 0.0 0.0      0.025   0.024    0.211   0.214    0.040   0.038    0.053   0.054
3.0 1.5 0.0 0.0      0.031   0.029    0.436   0.442    0.027   0.025    0.271   0.273
3.0 1.5 1.5 0.0      0.037   0.032    0.483   0.495    0.014   0.011    0.531   0.535
3.0 1.5 1.5 1.5      0.000   0.000    0.492   0.512    0.000   0.000    0.917   0.917
3.0 3.0 0.0 0.0      0.039   0.036    0.153   0.157    0.028   0.026    0.099   0.100
3.0 3.0 1.5 0.0      0.040   0.036    0.294   0.304    0.013   0.011    0.385   0.389
3.0 3.0 1.5 1.5      0.000   0.000    0.348   0.363    0.000   0.000    0.843   0.843
3.0 3.0 3.0 0.0      0.043   0.042    0.116   0.122    0.012   0.011    0.162   0.166
3.0 3.0 3.0 1.5      0.000   0.000    0.216   0.224    0.000   0.000    0.664   0.665
3.0 3.0 3.0 3.0      0.000   0.000    0.092   0.097    0.000   0.000    0.312   0.311

Table 2
Performance characteristics of the step-up (SU) procedure for k = 8

True means                           Type I    Type II   FDR       FNR
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0      0.00653   NA        0.05197   0.00000
1.5 1.5 0.0 0.0 0.0 0.0 0.0 0.0      0.00955   0.81575   0.05239   0.21540
3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0      0.01738   0.23635   0.06095   0.07423
3.0 3.0 1.5 1.5 1.5 0.0 0.0 0.0      0.02370   0.50100   0.02725   0.46099
3.0 3.0 3.0 3.0 3.0 0.0 0.0 0.0      0.03550   0.13768   0.02410   0.19218
3.0 3.0 3.0 3.0 3.0 1.5 1.5 1.5      NA        0.30194   0.00000   0.97742
3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0      NA        0.09569   0.00000   0.61829

Table 3
Performance characteristics of the smoothed procedure for k = 8

True means                           Type I    Type II   FDR       FNR
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0      0.00604   NA        0.04821   0.00000
1.5 1.5 0.0 0.0 0.0 0.0 0.0 0.0      0.00708   0.83200   0.04033   0.21858
3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0      0.01383   0.26320   0.05042   0.08170
3.0 3.0 1.5 1.5 1.5 0.0 0.0 0.0      0.01993   0.53576   0.02455   0.47673
3.0 3.0 3.0 3.0 3.0 0.0 0.0 0.0      0.03510   0.14664   0.02408   0.20209
3.0 3.0 3.0 3.0 3.0 1.5 1.5 1.5      NA        0.29671   0.00000   0.97711
3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0      NA        0.09195   0.00000   0.60884
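The four summaries described above can be computed from simulated decisions along the following lines. This is a sketch of one reading of those definitions, not the authors' code: it assumes the Type I and Type II rates are averaged per hypothesis, and that FDR and FNR are averages over replications of the proportion of rejections that are false and the proportion of acceptances that are false, with 0/0 taken as 0. The function name `error_summary` and the toy inputs are hypothetical.

```python
def error_summary(decisions, truth):
    """Summarize simulated multiple-testing decisions.

    decisions: list of 0/1 vectors, one per replication (1 = reject H_i).
    truth: 0/1 vector v with v[i] = 0 if H_i is true and 1 if K_i is true.
    """
    n, k = len(decisions), len(truth)
    k0 = sum(1 for v in truth if v == 0)   # number of true nulls
    k1 = k - k0                            # number of false nulls
    false_rejections = missed = 0
    fdp_sum = fnp_sum = 0.0
    for a in decisions:
        V = sum(ai for ai, vi in zip(a, truth) if vi == 0)       # type I errors
        T = sum(1 - ai for ai, vi in zip(a, truth) if vi == 1)   # type II errors
        R = sum(a)                                               # rejections
        false_rejections += V
        missed += T
        fdp_sum += V / R if R > 0 else 0.0
        fnp_sum += T / (k - R) if R < k else 0.0
    return {
        "type1": false_rejections / (n * k0) if k0 else 0.0,
        "type2": missed / (n * k1) if k1 else 0.0,
        "fdr": fdp_sum / n,
        "fnr": fnp_sum / n,
    }


# Two toy replications with k = 4, the last two alternatives being true.
print(error_summary([[1, 0, 0, 0], [0, 0, 0, 0]], [0, 0, 1, 1]))
```

To mimic one row of the tables, `decisions` would hold the reject/accept vectors produced by a procedure on 10,000 simulated normal vectors with the stated means.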

A comparison of the error rates in the tables indicates that the performances of the two procedures are similar. In terms of computer time, the new procedure could be carried out for $k = 8$ in 15 s, for $k = 9$ in 1 min, for $k = 10$ in 27 min, for $k = 11$ in 13 h, and for $k = 12$ in 17 days. Thus, for $k \ge 12$ we regard the smooth procedure as computationally impractical.

5. Discussion

In this section we review the properties of the SU procedure and contrast these with the properties of the smooth version. On the positive side, SU procedures reject more (and therefore have greater power in some sense) than their single-step counterparts. That is, a single-step procedure that would control the familywise error rate (FWE) would essentially use the Bonferroni inequality and would be highly conservative in the sense that the number of rejections would be relatively small. The SU procedure is not only capable of controlling FWE but also can control FDR while rejecting more often than the Bonferroni-based procedure. SU procedures are more flexible than procedures which base each test on a single variable drawn from the population whose parameter is being tested. That is, an SU procedure uses the data sampled from all populations in making an individual test decision.

On the negative side, SU procedures have been shown to be inadmissible under a wide variety of conditions including different distributions, some one-sided hypotheses, some two-sided hypotheses and a variety of loss functions. As noted earlier, SU procedures have a disturbing practical shortcoming. That is, they reject all hypotheses for some sample points, while accepting all hypotheses for other sample points in which rejection of some hypotheses seems more compelling.

Both shortcomings of SU seem to be a consequence of a pointed or an unsmoothed partition of the sample space used to determine different multiple actions. This suggested that smoothing the pointed edges of the partition would lead to procedures that would retain the positive properties of SU while avoiding the less desirable properties. Whereas we cannot demonstrate that the smooth version of SU in this paper is admissible, one cannot readily assert inadmissibility. It also appears that the smooth version does not have the practical shortcoming of SU.

In comparing the positive properties of the procedures, both are flexible. In terms of error rates the best that we can do is to rely on the simulations of Section 4, where for the most part for $2 \le k \le 8$ the performances are comparable. SU has a decided advantage for $k$ large, since at this point the smooth version becomes computationally infeasible.

For additional discussion concerning SU and competing procedures we refer to the discussion section of CS (2006). With additional research we are hopeful of deriving a Bayes (admissible and smooth) competitor to SU which will be computationally feasible for large $k$ and be applicable for a variety of distributions.

Acknowledgment

The authors are grateful to the referee who studied this paper with care, thoroughness, and insight. His report led to the current result in Theorem 3.1 which is stronger than the previous Theorem 3.1.

References

Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57, 289–300.

Cohen, A., Sackrowitz, H.B., 2005. Characterization of Bayes procedures for multiple endpoint problems and inadmissibility of the step-up procedure. Ann. Statist. 33, 145–158.

Cohen, A., Sackrowitz, H.B., 2007. More on the inadmissibility of step-up. J. Multivariate Anal. 97, 481–492.

Dudoit, S., Shaffer, J.P., Boldrick, J.C., 2003. Multiple hypothesis testing in microarray experiments. Statist. Sci. 18, 71–103.

Efron, B., 2003. Robbins, empirical Bayes and microarrays. Ann. Statist. 31, 366–378.

Genovese, C., Wasserman, L., 2002. Operating characteristics and extension of the FDR procedure. J. Roy. Statist. Soc. Ser. B 64, 499–518.

Hochberg, Y., Tamhane, A.C., 1987. Multiple Comparison Procedures. Wiley, New York.

Ishwaran, H., Rao, J.S., 2003. Detecting differentially expressed genes in microarrays using Bayesian model selection. J. Amer. Statist. Assoc. 98, 438–455.

Lehmann, E.L., 1957. A theory of some multiple decision problems I. Ann. Math. Statist. 28, 1–25.

Müller, P., Parmigiani, G., Robert, C., Rousseau, J., 2004. Optimal sample size for multiple testing: the case of gene expression microarrays. J. Amer. Statist. Assoc. 99, 990–1001.

Sarkar, S., 2002. Some results on false discovery rate in stepwise multiple testing procedures. Ann. Statist. 30, 239–257.