Rank minimization with a two-step analysis should not replace randomization in clinical trials



Letters to the Editor / Journal of Clinical Epidemiology 65 (2012) 808–812

Table 1. Minimization on freckles and gender vs. minimization on gender only

Patient characteristic     Previous assignments
                           A        B
Freckles                   4        6
No freckles                6        4
Female                     5        4
Male                       5        6
Totals
  Female only              5        4 (a)
  Female and freckles      9 (a)    10

(a) denotes preferred assignment given previous assignments.

Rank minimization with a two-step analysis should not replace randomization in clinical trials

To the Editor:

Taves’ [1] article promoting the wider use of minimization contains a number of concerning points, one or two of which need some comment. Ignoring my concerns about selection bias with rank minimization, this letter is primarily about the section titled “A new statistical analysis convention.”

The idea of balancing on all important and unimportant characteristics and adjusting for only some of these is illogical for three reasons:

1. Only well-defined and measurable patient characteristics can be incorporated in any balancing algorithm. This means that unknown or known-but-unmeasurable characteristics that have an impact on outcome cannot be balanced by rank minimization or any other scheme. The advantage of randomization is that unknown characteristics will be balanced in probability, which is not guaranteed by minimization.

2. As Taves writes, the true error rate is less than the nominal error rate if balancing variables are related to outcome [2]. One of the primary aims of balancing should be to increase precision and power. For minimization this increase is small [3,4] and is achieved if the balancing variables are adjusted for. Balancing and not adjusting is likely to reduce power and precision and so seems to defeat one of rank minimization’s important aims.

3. By balancing on unimportant factors, the balance of important factors might be compromised. Consider the following example: a trial might minimize on gender (deemed to be important) or on gender and whether or not a patient has freckles (freckles are deemed to be unimportant). A female with freckles is recruited and is to be assigned to the preferred treatment with probability >50%. Table 1 shows the 20 previous assignments and at the bottom gives, for each treatment, the total number of assignments for females, followed by the combined total of assignments for the female and freckles categories.

Minimizing on both factors here causes worse balance of gender across treatments, even though freckles are regarded as unimportant! Doing this is thus suboptimal for achieving balance in important characteristics. It is sensible to adjust for all factors that are balanced. Because sample size will limit the number of factors an analyst is able to adjust for, it is important to treat this number of factors as the very maximum in any balancing scheme. Regardless of views on the appropriateness of “ransacking” the data, it can be done whether or not unimportant variables were involved in the balancing scheme.
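The arithmetic behind the freckles example can be sketched in a few lines of Python. This is a minimal illustration, not the rank-minimization algorithm itself: it assumes the preferred arm is simply the one with the smaller unweighted sum of marginal counts, which is what the Totals rows of Table 1 compute. The function names `preferred_arm` and `female_imbalance_after` are hypothetical.

```python
# Marginal counts of the 20 previous assignments, copied from Table 1.
counts = {
    "freckles": {"A": 4, "B": 6},
    "no freckles": {"A": 6, "B": 4},
    "female": {"A": 5, "B": 4},
    "male": {"A": 5, "B": 6},
}


def preferred_arm(patient_levels, counts):
    """Return (arm, totals): the arm with the smaller summed count.

    Assumption: an unweighted sum over the new patient's factor levels,
    as in the Totals rows of Table 1. Returns None for the arm on a tie.
    """
    totals = {
        arm: sum(counts[level][arm] for level in patient_levels)
        for arm in ("A", "B")
    }
    if totals["A"] == totals["B"]:
        return None, totals
    return min(totals, key=totals.get), totals


def female_imbalance_after(arm, counts):
    """Absolute difference in female counts if the new female joins `arm`."""
    female = dict(counts["female"])
    female[arm] += 1
    return abs(female["A"] - female["B"])


# The new patient is a female with freckles.
gender_only_arm, gender_only_totals = preferred_arm(["female"], counts)
both_arm, both_totals = preferred_arm(["female", "freckles"], counts)

print("gender only:", gender_only_arm, gender_only_totals)
print("gender and freckles:", both_arm, both_totals)
print("female imbalance, gender only:", female_imbalance_after(gender_only_arm, counts))
print("female imbalance, both factors:", female_imbalance_after(both_arm, counts))
```

Under these assumptions the two schemes prefer different arms for the same patient, and following the two-factor preference leaves the female counts further apart than following the gender-only preference.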

Related to the advice to balance on all factors and adjust for only some, Taves claims that post hoc variable selection is not a problem because it makes P-values limiting rather than precise. Procedures that yield tests with the correct error rate α% and confidence intervals (CIs) with nominal 1 − α% coverage are ideal if available (randomization valid [5]). Those with an actual error rate below their stated significance level and CIs with greater than nominal coverage may also be of scientific value (confidence valid [5]), particularly if no randomization valid procedure is available. However, it is very difficult to see any scientific value in a greater-than sign for a P-value and a CI with lower than the stated coverage.
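The three cases distinguished in this paragraph can be written compactly. The notation is an assumption introduced here and is not in the letter: p is the reported P-value, C the reported CI for a parameter θ, H_0 the null hypothesis, and α the nominal level written as a proportion rather than a percentage. The third line is one reading of “a greater-than sign for a P-value,” taken to mean that the true error rate exceeds the stated one.

```latex
\begin{align*}
\text{Randomization valid:}\quad & \Pr(p \le \alpha \mid H_0) = \alpha, && \Pr(\theta \in C) = 1 - \alpha \\
\text{Confidence valid:}\quad & \Pr(p \le \alpha \mid H_0) \le \alpha, && \Pr(\theta \in C) \ge 1 - \alpha \\
\text{Neither:}\quad & \Pr(p \le \alpha \mid H_0) > \alpha, && \Pr(\theta \in C) < 1 - \alpha
\end{align*}
```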

As with all other treatment allocation schemes, rank minimization is not immune to problems if the mantra “analyze according to your design” is forgotten.

Tim Morris
MRC Clinical Trials Unit
Aviation House, 125 Kingsway
London WC2B 6NH, United Kingdom
E-mail address: [email protected]

References

[1] Taves DR. Rank-Minimization with a two-step analysis should replace randomization in clinical trials. J Clin Epidemiol 2011;65:3–6.


[2] Kahan BC, Morris TP. Improper analysis of trials randomised using stratified blocks or minimisation. Stat Med 2012;31:328–40.

[3] Atkinson AC. Optimum biased-coin designs for sequential treatment allocation with covariate information. Stat Med 1999;18:1741–52.

[4] Senn S, Anisimov VV, Fedorov VV. Comparisons of minimization and Atkinson’s algorithm. Stat Med 2010;29:721–30.

[5] Rubin DB. Multiple imputation after 18+ years. J Am Stat Assoc 1996;91:473–89.

doi: 10.1016/j.jclinepi.2012.02.005

Rebuttal of anon’s critique: in fact randomization is the limiting probability distribution for imbalance with minimization

In reply:

The author takes issue [1] with the proposed new statistical analysis convention that should be used with rank minimization [2] and all other algorithms that fit the definition of minimization [3]. He thinks balancing on all characteristics regardless of their importance is illogical for three reasons. First, he says that randomization, not minimization, guarantees that unknown characteristics will be balanced “in probability.” “In probability” means that if a study could be repeated an infinite number of times using randomization, the effect of the unknown characteristic on the outcome will cancel out. This is also true with minimization if the unknown characteristic is not correlated with any minimized characteristic. In this case it will distribute independently, that is, randomly [4,5]. However, if there is any correlation between the unknown characteristic and those used in minimization, the distribution of the outcome will tend to be more balanced than with randomization, that is, there will be fewer instances where the outcome will be statistically significantly different by chance alone. A demonstration of this synergism between correlated characteristics is shown in the original description of minimization [4]. Using 15 characteristics, it was shown that those that were correlated acted as though they were weighted more than those that were not. This interaction of correlated characteristics is also the basis for doing analyses of covariance. Both considerations lead to an axiom: balancing any characteristic that is correlated with an unknown characteristic that is correlated with the outcome will decrease the probability of obtaining differences in the outcome by chance alone. In summary, minimization “in probability” cannot make the balance worse than randomization, which is the limiting probability distribution. Therefore, his first point is in error; minimization guarantees better balance of unknown characteristics than does randomization “in probability.”
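The claim about correlated unknown characteristics can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not the method of either letter. It uses a single binary minimized factor X, a hidden binary characteristic U that agrees with X with probability `agreement`, and a simple biased-coin minimization (probability 0.8 for the preferred arm) rather than rank minimization. All names and parameter values are hypothetical, and a single simplified setting of this kind cannot settle the general argument.

```python
import random


def simple_randomization(x, x_counts, rng):
    """Allocate to arm 0 or 1 with equal probability, ignoring covariates."""
    return rng.randint(0, 1)


def minimization(x, x_counts, rng, p_preferred=0.8):
    """Favor the arm with fewer previous patients at this level of X."""
    a, b = x_counts[x]
    if a == b:
        return rng.randint(0, 1)
    preferred = 0 if a < b else 1
    return preferred if rng.random() < p_preferred else 1 - preferred


def mean_hidden_imbalance(allocate, agreement, n_patients=100, n_trials=2000, seed=1):
    """Average absolute between-arm difference in the proportion with U = 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        x_counts = {0: [0, 0], 1: [0, 0]}  # level of X -> patients per arm
        arm_size = [0, 0]
        u_ones = [0, 0]  # patients with U = 1 in each arm
        for _ in range(n_patients):
            x = rng.randint(0, 1)
            # U equals X with probability `agreement`; 0.5 means uncorrelated.
            u = x if rng.random() < agreement else 1 - x
            arm = allocate(x, x_counts, rng)
            x_counts[x][arm] += 1
            arm_size[arm] += 1
            u_ones[arm] += u
        # max(..., 1) guards against an empty arm, which is possible in principle.
        p0 = u_ones[0] / max(arm_size[0], 1)
        p1 = u_ones[1] / max(arm_size[1], 1)
        total += abs(p0 - p1)
    return total / n_trials


correlated_random = mean_hidden_imbalance(simple_randomization, agreement=0.9)
correlated_minimized = mean_hidden_imbalance(minimization, agreement=0.9)
uncorrelated_random = mean_hidden_imbalance(simple_randomization, agreement=0.5)
uncorrelated_minimized = mean_hidden_imbalance(minimization, agreement=0.5)

print("U correlated with X:  ", correlated_random, correlated_minimized)
print("U uncorrelated with X:", uncorrelated_random, uncorrelated_minimized)
```

The quantity compared is the average imbalance of the hidden characteristic over repeated trials. It says nothing about the validity of the subsequent analysis, which is the subject of the other points in this exchange.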

Second, he says, “Balancing and not adjusting is likely to reduce power and precision and so seems to defeat one of rank minimization’s important aims.” I agree. That is why all the characteristics used in minimization are studied, but not all simultaneously as the author wants. A primary analysis is done with only those characteristics that are thought to be important when setting up the trial and are weighted heavily enough to make sure that they will be well balanced. After the primary analysis is completed, all other characteristics used in minimization are tested singly and in combination for any possible influence on the outcome, that is, they are ransacked. Therefore his second point is based on a false premise.

The author’s third point is that if the wrong characteristics are weighted heavily, minimization can allow trials with important imbalances. He resorts to an artificial example to support that. Real examples of randomization leading to imbalanced trials are not hard to find. The failed clinical trial that led to the development of minimization is an example [4]. Sicker subjects were placed randomly in the treatment group. Then the primary analysis rejected the null hypothesis, implying that the treatment had adversely affected those subjects. Thus the study could not be repeated even once, to say nothing of the many times that would have been necessary using randomization, to be sure that an imbalanced variable rather than the treatment had been responsible for the first results. With rank minimization, sicker patients would have been placed more evenly between the treatment groups and exceptions would be identified in the secondary analysis.

The author also questions the value of a greater-than sign for the P-value after the secondary analysis. This sign generally sets a limit on how much less the P-value might have been if the experiment and primary analysis had been done differently. If the secondary P-value, that is, the smallest found in the ransacking, is significantly smaller than the P-value calculated in the primary analysis, it would signal a need to repeat the study with the improvements suggested by the secondary analysis. If the secondary P-value is larger than the primary value (>0.05 and <0.01, respectively), as it was in the failed clinical trial described above, it is incongruent, that is, it does not define limits for the expected true P-value. It signals that an important variable was unbalanced. Thus the greater-than sign can provide important information.

The mantra he invokes, “analyzing according to your design,” is supported by the two-step analysis. In practice this rule has been generally ignored, as noted in the author’s references [6]. Living by that rule may turn out to be the most important contribution of rank minimization.

Donald R. Taves MD, MPH, PhD
University of Washington
Oral Health Sciences
1959 NE Pacific Street, Box 357475
Seattle, WA 98195
United States
E-mail address: [email protected]