An Early “Atkins’ Diet”: RA Fisher Analyses a Medical “Experiment”

Stephen Senn*

Department of Statistics, University of Glasgow, Glasgow, G12 8QQ, UK

Received 26 March 2005, accepted 31 May 2005

Summary

A study on vitamin absorption which RA Fisher analysed for WRG Atkins and co-authored with him is critically examined. The historical background as well as correspondence between Atkins and Fisher is presented.

Key words: History of statistics; Randomisation; Vitamin C; Smoking; Lung cancer.

“Although replication is essential in this way, it is not sufficient without the added precaution of randomization . . .” RA Fisher (Fisher, 1958, p. 153).

1 Introduction

Fisher pioneered and championed the practice of randomisation in agricultural experimentation (Fisher, 1926) and saw it as closely allied to analysis of variance, a technique for analysing designed experiments that he developed. When it came to clinical trials, however, it was Fisher’s immediate predecessor as President of the Royal Statistical Society, Bradford Hill, who was to prove the most influential figure. Considerably influenced by Fisher’s work, Hill made randomisation a key feature of the Medical Research Council (MRC) trial of streptomycin in tuberculosis that was to prove so influential (Medical Research Council Streptomycin in Tuberculosis Trials Committee, 1948). Yet, curiously, very few years were to pass before these two giants of 20th century statistics were to disagree, and the cause of that disagreement was to be presented by Fisher, at least, as being to do with the difficulty of making causal inferences in the absence of randomisation.

Fisher was hardly ever involved in medical matters, although he too was involved with tuberculosis, albeit in the context of laboratory work, at about the same time as the MRC trial (Fisher, 1949). On the whole, however, he was more involved with agriculture and with genetics in a generally scientific as opposed to immediately medical context. A notable exception was his controversy with Hill over smoking and lung-cancer. Hill, together with his collaborator Richard Doll, had carried out important case-control (Doll and Hill, 1950) and cohort (Doll and Hill, 1954) studies into possible causes of lung-cancer. In a number of papers Fisher argued that there were many plausible explanations for the association these studies found between smoking and lung-cancer. The most important of these articles, “Cigarettes, cancer, and statistics,” is the one that provides our opening quotation. In it he argues that, as regards coming to sound conclusions, if randomised trials could be run to investigate the effect of smoking, the impossibility of which, “. . . is not the fault of the medical investigators . . . there would be no difficulty . . .” (Fisher, 1958, p. 155).

No doubt, this is true, and it is tempting to see this dispute as one between a scientific realist, Hill, and a purist, Fisher. However, as explained in this paper, Fisher was being misleading if he meant to imply that he considered that in the absence of such randomisation no conclusion of interest was possible. Although, as has already been stated, Fisher was rarely involved in medical matters, there was one issue in which he had briefly been involved: the absorption of vitamin C (Armitage, 2003; Senn, 2003). In 1943 Fisher collaborated with Atkins in a paper (published in 1944) that described a study that not only involved no randomisation but also involved no planning as regards the claimed effect (Atkins and Fisher, 1944). Yet he analysed this as if it were a randomised study and in the paper he co-authored, having found a highly significant effect, was prepared to claim as a proven fact what had begun as a chance observation, namely, “it appears therefore that vitamin C is much better utilized after other food” (Atkins and Fisher, 1944, p. 252). In this note, I describe the Atkins and Fisher study, as well as its background, beginning with biographical notes of the two authors.

* e-mail: [email protected], Phone: +44 (0)141 330 5141, Fax: +44 (0)141 330 4814

Biometrical Journal 48 (2006) 2, 193–204 DOI: 10.1002/bimj.200510206

© 2006 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

First, however, I offer a brief apology. That critically examining a great scientist’s work implies no disrespect is exemplified by Fisher himself. Fisher’s great paper, ‘Has Mendel’s work been rediscovered?’ (Fisher, 1936), considers, amongst other matters, the possibility that Mendel’s data were fabricated in an attempt to make a theory more palatable to possibly sceptical scientists. Fisher writes, “although no explanation can be expected to be satisfactory, it remains a possibility among others that Mendel was deceived by some assistant who knew too well what was expected” (Fisher, 1936, p. 132). This is still, however, a serious charge, as Mendel was sole author of the publication of these results (Mendel, 1866) and must be regarded as responsible for them, but that Fisher regarded Mendel’s work as of the first importance is not in doubt. Similarly, although I shall claim in this note that in the collaboration with Atkins, Fisher was not practising what he preached, this does not change one bit my belief that Fisher was one of the greatest scientists of the 20th century, nor do I wish to encourage the reader in any other view. In fact, I believe that Fisher’s reputation, which is not yet what it should be, will continue to grow.

2 R. A. Fisher 1890–1962

Sir Ronald Aylmer Fisher presents a paradox of fame. He is virtually unknown to the general public but is regarded by many who are qualified to judge as both the greatest evolutionary biologist of the 20th century and the greatest statistician ever. Nevertheless, despite his lack of public fame, for those who are curious to know more of his life, he has been well served by his daughter’s brilliant biography (Box, 1978), by the obituary notice for the Royal Society by Yates and Mather (1963) and by various articles describing his scientific achievements (Edwards, 2003; Fienberg and Hinkley, 1980; Grafen, 2003; Healy, 2003). Therefore, only the briefest of details of matters relevant to his collaboration with Atkins will be rehearsed here.

Fisher was appointed a Fellow of the Royal Society in 1929, whilst still working as a statistician at Rothamsted Experimental Station in Harpenden, which he had joined in October 1919 (Box, 1978). Fisher was 39. Atkins had been elected FRS four years earlier. Presumably, the fact that they were both FRS played a role in their collaboration and may have been the source of their coming into contact. Certainly they had met before the War, at least as early as 1936, when Fisher and Frank Yates visited Plymouth, where Atkins worked (Council of the Marine Biological Association, 1937). At this stage of Fisher’s scientific life he had published about 75 of the 294 scientific papers eventually numbered amongst his collected works. By 1929 he had developed his notions of likelihood, his ideas of randomisation in experiments, the technique of analysis of variance, and much of his theory of estimation, including the concepts of consistency, efficiency, sufficiency and ancillarity (Fisher, 1925b). His book, Statistical Methods for Research Workers (Fisher, 1925a), was already in its second edition. Shortly to appear were his controversial theory of fiducial inference (Fisher, 1930b) and his great book, The Genetical Theory of Natural Selection (Fisher, 1930a).

In 1934 Fisher was appointed Galton Professor of Eugenics at University College London (UCL), a successor of sorts to Karl Pearson. In fact, Pearson’s legacy was divided in two and a separate statistics department was created and was headed by Pearson’s son, Egon. The appointment at UCL was the realisation of an ambition of Fisher’s but his time there cannot be said to have been happy. In 1935 there was a disagreement with Egon Pearson’s collaborator Jerzy Neyman over the analysis of Latin Squares (Fisher, 1935a) and, whether because of the collaboration with Neyman, or because Egon had objected to Fisher’s suggestion that Fisher might give a lecture-course in statistics in the Eugenics Department, or simply because Egon was his father’s son, Fisher never had friendly relations with Pearson fils. In 1939, Fisher fell out with the Provost of UCL, Allen Mawer, as a result of his resistance to the dissolution of his department as part of the evacuation of UCL to Wales. The correspondence reveals Fisher at his most exasperating and he shows few concessions in his dealings with the Provost to the fact that the country was facing a grave emergency. Subsequent events proved that the Provost’s concerns for staff at UCL were not unfounded. The Gower Street site suffered considerable bomb-damage in both September 1940 and April 1941. In fact, UCL suffered more destruction of its fabric than any other British University or college (Harte and North, 1991). In his defence, Fisher was not the only scientist to try to resist evacuation (JBS Haldane, for example, behaved similarly (Harte and North, 1991)), and he was clearly frustrated by the Provost’s inability to see the importance of his work or unit and in this respect Fisher has also been proved right. The otherwise excellent official history of UCL by Harte and North (1991) does not mention Fisher at all, so that Fisher is still, apparently, undervalued by that institution today.

Fisher eventually succeeded in finding offices for himself and a few of his staff at his old address of Rothamsted, which, since he knew the institution well and still lived in Harpenden, was, although not an ideal solution, at least a solution of sorts to the need to evacuate UCL.

In 1943, Fisher was appointed Arthur Balfour Professor of Genetics at Cambridge and he left Milton Lodge, where he lived in Harpenden, in October of that year (Box, 1978). The work with Atkins thus covers the period of his just having started at Cambridge, having moved from his official appointment at UCL, albeit based at Rothamsted.

3 William Ringrose Gelston Atkins 1884–1959

(This section is based on Horace Poole’s (Poole, 1960) obituary for the Royal Society.) WRG “Billie” Atkins was born in Cork on 4 September 1884 and died on 4 April 1959. He studied Experimental Science (that is to say physics and chemistry) and Natural Science (botany, zoology and geology) at Trinity College Dublin and subsequently had a varied scientific career that befitted this broad education, doing work in organic chemistry, physiology (both human and plant), marine biology and ecology, as well as material science and optics. He was a good athlete and became middleweight champion of Ireland in boxing. In the First World War he was in the Royal Flying Corps as an equipment officer in charge of a laboratory at Aboukir in Egypt, eventually reaching the rank of major. He was torpedoed in one of his voyages. After the war he returned to Ireland but after a short spell in India he was appointed Head of the Department of General Physiology in the Marine Biological Association’s Laboratory at Plymouth in 1921. Apart from a brief period of war work during the Second World War he held this appointment until his retirement in 1955.

Atkins was elected FRS in 1925, just four years before Fisher. Fisher, of course, was a man of widely varied scientific interests but if Atkins does not match him for depth, power and originality of work, he certainly exceeds him for breadth. Thus, amongst his papers are ones on the freezing point of milk, osmotic pressure in eggs, dopes and varnishes for aeroplane wings, the chemistry of honey, the packing and care of motor parts, hydrogen ion concentration in soils, the chemistry of the sea, water snails and liver flukes, the preservation of fishing nets, the penetration of light in sea-water, radiation from mercury vapour lamps, photo-electric cells, daylight in relation to climate and health, annual patterns of phytoplankton growth in the sea, size versus colour in air-sea rescue and many more.

A mystery remains as to why Fisher was never given any scientific war work of any importance in the Second World War (poor eyesight had prevented active service in the First). Age alone cannot be the explanation, since Atkins, who was six years his senior, in addition to work in the Home Guard, was given the rank of captain in the Royal Army Medical Corps (RAMC) and undertook routine physiological testing at various medical centres. This work resulted in several papers on vitamin C reserves, one of which, a brief note of less than two full pages (Atkins and Fisher, 1944), was the collaboration with Fisher. Nor can one be entirely convinced by Fisher’s daughter’s suggestion, “Unfortunately he appeared as a senior biologist; in consequence, perhaps, his special abilities in application of statistical and scientific reasoning to problem solving were not widely appreciated” (Box, 1978, p. 375). Atkins, an employee of the Marine Biological Association, had more reason to be considered a biologist than Fisher.

Atkins married Ingaborg Jackson in 1922. They set up house in Cornwall, in Antony and later in Downderry, which is where he was living at the time of the paper with Fisher. They had one son, George Mignon Gelston, who went to Pembroke College Cambridge and was later in the Royal Artillery. In 1958 Atkins was asked to act as President of the Botanical Section of the British Association meeting in York. Unfortunately he was never able to give his presidential address. In December 1958 he caught pneumonia, complications led to an operation and on 4 April 1959 he died.

4 The Atkins and Fisher Paper

There are two errors in the paper as recorded in Fisher’s collected works (Bennett, 1971–1974) that are not present in the original. The first is that the date is given as 1943, where 1944 is correct, and the second is that it lists the first author as “W.R.B. Atkins”, not “W.R.G. Atkins”. (The letter B is diagonally adjacent to G on the QWERTY keyboard.)

The paper claims that, “Observations made upon two sections of R.A.M.C. men dosed with vitamin C gave differences far beyond those that could be attributed to chance” (Atkins and Fisher, 1944, p. 251). However, whether or not chance could explain these observations, they were certainly chance findings. Atkins had been given the task of carrying out a study to determine the vitamin C reserves of troops. In each of the stations he examined, 100 men were tested by a method described as that of Harris and Abbasy. No reference is given but in a letter to Fisher (letter 1 below) Atkins refers to a paper of his in Nature on 2 January 1943. This letter to the editor is entitled ‘Vitamin C Saturation Test of Harris and Abbasy’ (Atkins, 1943) and refers to a paper by Harris and Abbasy published in The Lancet on 18 December 1937 (Harris and Abbasy, 1937). In a letter to the editor in Nature printed immediately after Atkins’s, Harris describes his method in more detail (Harris, 1943). The main features of the method as applied by Atkins are:

1. Daily dosing with gm of vitamin C.
2. Sampling urine four or five hours after dosing.
3. Making up the volume of urine to a standard amount of 1/2 a litre or one litre.
4. Use of 2:6 dichloro-phenol-indophenol as a reagent to assay the urine.
5. An observed excretion of 35 mgm or more of vitamin C in a two hour period taken as being proof of saturation having been reached.

Returning to the work described in the Atkins and Fisher paper, it seems that vitamin C tablets were given daily and the doses until saturation, as measured in the urine, were noted. At one of the stations about half of the men were in the RAMC and these were in two sections, A and B, the other half being in the infantry. It so happened that the three groups, section A, infantry, and section B were dosed in that order starting with A at 07:30 in the morning and finishing with B at 08:00. As the paper puts it, “there were most unexpected differences between the two R.A.M.C. sections” (p. 251).

The data from the study are given in Table 1. (These are as presented in the original Atkins and Fisher paper and not as in Fisher’s collected papers in which three columns only are used.)

Table 1 The Atkins and Fisher data. Doses to saturation in two companies of men.

Doses to saturate | Men A | Men B | Doses | Men A | Men B
1 | 0 | 0 | 5 | 9 | 1
2 | 0 | 1 | 5a | 1 | 0
3 | 4 | 14 | 5b | 2 | 0
4 | 6 | 5 | | |

The men were given four doses but if they failed to reach saturation by the fourth dose were recorded as “five” if near saturation, “five(a)” if it appeared they would be saturated by six doses and “five(b)” if no prediction could be made. A more modern representation of the data, a Kaplan–Meier plot, is given in Figure 1. In the plot, values for men who did not reach saturation by day four are treated as censored at day four. (The fact that the data are also interval censored, being measured in increments of one day, is ignored.)
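The Kaplan–Meier estimate just described can be reproduced from the counts in Table 1. What follows is only a minimal sketch in Python, not the software used for the published figure; the names `TABLE_1` and `kaplan_meier` are illustrative assumptions. As in the plot, men coded 5, 5a or 5b are treated as censored at day four.

```python
from fractions import Fraction

# Counts from Table 1. The codes "5", "5a" and "5b" all mean
# "not saturated by the fourth dose" and are treated as censored at day four.
TABLE_1 = {
    "A": {1: 0, 2: 0, 3: 4, 4: 6, "5": 9, "5a": 1, "5b": 2},
    "B": {1: 0, 2: 1, 3: 14, 4: 5, "5": 1, "5a": 0, "5b": 0},
}


def kaplan_meier(counts, last_day=4):
    """Return {day: estimated proportion still unsaturated after that day}."""
    at_risk = sum(counts.values())
    unsaturated = Fraction(1)
    curve = {0: unsaturated}
    for day in range(1, last_day + 1):
        events = counts.get(day, 0)  # men reaching saturation on this day
        if at_risk:
            unsaturated *= Fraction(at_risk - events, at_risk)
        curve[day] = unsaturated
        at_risk -= events  # censored men leave the risk set only after last_day
    return curve


curves = {section: kaplan_meier(counts) for section, counts in TABLE_1.items()}
for section, curve in curves.items():
    print(section, {day: round(float(p), 3) for day, p in curve.items()})
```

Because all the censoring occurs after the last observed saturation time, the estimate at each day is simply the observed proportion of each section not yet saturated.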

The paper makes a curious statement about the implications of these results. “As, however, it was felt that it would be unwise to draw a conclusion from a single experiment, although fifty men were in it, the tabulated results were submitted to statistical analysis to ascertain the probability of the distribution being due to chance.” This statement is strange on several counts. First, this was not an experiment; second, the data relate to 43 men, not 50; and third, it is hard to see in what sense statistical analysis could rectify the defect of its being a single observation. These matters are partly explained in the Atkins–Fisher correspondence, discussed below. Finally, to make a point that some will consider pedantic and others crucial, the probability that is eventually calculated is not that the distribution was due to chance, which would be a Bayesian quantity, but that of observing a split as extreme or more extreme if chance were the only explanation. Some of these points will be returned to later.

The data were then dichotomised as follows, “. . . one draws a line between the 19 first saturated and the 24 others” (Atkins and Fisher, 1944). This is, effectively, a median split and results in the following table, which Atkins and Fisher describe but do not present.

Figure 1 Atkins and Fisher data. Kaplan–Meier plot of days to saturation. (Horizontal axis: days on treatment, 0 to 4; vertical axis: proportion unsaturated, 0.0 to 1.0; one curve for each of sections A and B.)

Table 2 Dichotomised version of the saturation data as analysed by RA Fisher.

Doses to saturate | Men A | Men B | Total
Three or fewer | 4 | 15 | 19
Four or more | 18 | 6 | 24
Total | 22 | 21 | 43

Table 3 Summary of the Atkins Fisher correspondence.

Number | Author | Date | Origin | Pages | Notes
1 | Atkins | 2 November 1943 | Meteorological Office, Stonehouse, Gloucestershire | 4 | Requests Fisher’s help
2 | Fisher | 4 November 1943 | Cambridge? | 1 | Agrees to look at data
3 | Atkins | 3 December 1943 | Derry House, Downderry, Cornwall | 4 | Provides the data including five further bodies of men
4 | Fisher | 6 December 1943 | Cambridge? | 2 | Provides results of analysis.
5 | Atkins | 26 December 1943 | Meteorological Office, Stonehouse, Gloucestershire | 2 | Encloses draft of paper. Proposes joint authorship
6 | Fisher | 3 January 1944 | Cambridge? | 1 | Accepts co-authorship
7 | Capon | 25 October 1944 | The War Office, A.M.D.5., London, S.W.1 | 1 | Lieut-Colonel, Asst. Editor of the Journal of the R.A.M.C. Explains paper will appear November 1944.
8 | Atkins | 26 October 1944 | Not known | 1 | Presumably enclosing letter of Capon. Explains that paper will appear under both names.
9 | Fisher | 4 November 1944 | Cambridge? | 1 | Mentions need for reprints.
10 | Atkins | 27 December 1945 | Plymouth | 1 | Postcard saying 50 typed copies are being sent to Fisher as no reprints available.
11 | Fisher | 29 December 1945 | Cambridge? | 1 | Acknowledges card and 50 copies.

As Atkins and Fisher put it, “the chance of getting so large a discrepancy, if the numbers in A and B were really proportional, is about 1 : 1,867, so there is no doubt at all of the statistical significance of the result” (p. 252).

As explanation of the difference between the two groups of men, Atkins and Fisher offer the following: “. . . further enquiries elicited the fact that section A men of the R.A.M.C had paraded at 07:30 hrs., before breakfast, having been busy attending to patients. Section B men, finding they were not required immediately, had their food and received the dose of vitamin afterwards . . . It appears therefore that vitamin C is much better utilized after other food which is in keeping with the custom of having dessert after meals.” (p. 252)

5 The Atkins and Fisher Correspondence

This is a series of letters held at the Barr Smith Library in Adelaide, in which Atkins approaches Fisher for help with analysis, receives it, writes up the results, proposes joint authorship and receives Fisher’s agreement to include his name on the paper if he so wishes. Also included is a note of acceptance of the paper from Lieutenant-Colonel PJL Capon, RAMC, who was assistant editor of the Journal of the Royal Army Medical Corps 1943–1945. All of Atkins’s letters are handwritten; these are, of course, the originals. Fisher’s are all typewritten copies and do not include address details. Perhaps the originals were typed on headed note-paper. Atkins refers to Fisher being in Cambridge and presumably, from the dates of the correspondence, Fisher had already taken up the Balfour Chair there. The correspondence starts on 2 November 1943 and finishes on 29 December 1945. Table 3 gives brief details of all the letters.

Some relevant and revealing extracts from this correspondence are reproduced below.

Letter 1
Dear Fisher,
. . . I was to have gone out i/c Mobile Hygiene Lab, but then I was considered too old & sent round testing urines – my job Nov. ’41–May ’42, when I resigned . . . . I found one queer thing. At a military hosp. I tested 50 R.A.M.C. & 50 infantry. Former were in sections say A&B, merely for A.R.P duties – they messed together. When I worked out results I found that A saturated about two days before B – I mean the peak of the curve was that, there is always a good scatter. I could see no reason for it. Actually on account of various casualties there were 21 in A and 22 in B or vice versa. I just had to confess I was unable to account for it. But four months later I returned . . . I found that the Sgt dispenser had dosed (3/4 g. ascorbic acid) say B. Then infantry arrived & being visitors he dosed them. A slipped away for breakfast & then were dosed. I had impressed on him that the test should be repeated exactly each day, so he kept up this order – so I was told – actually he had left before second visit . . . It pointed therefore to the fact that the ascorbic acid (vitamin c) was better utilized – or less destroyed – after food . . . It has occurred to me that you could apply your statistics to say whether my conclusion had any validity from the one trial.

Letter 2
Dear Atkins,
. . . They do not sound too extensive for me to get the test done in spare time . . . let me have a copy of the protocols, or at least the earliest stage you want me to deal with, and I will see what can be done, with pleasure.

Letter 3
Dear Fisher,
. . . As a comparison I give the results for two places. The first had J & K, from infantry units and L, R.A.M.C – all in different messes, but in the same country district. The second had M,N, infantry from two units (hence two messes) and P,Q, R.A.M.C. from the same unit and the same mess, merely P&Q for A.R.P duties etc. It so happened that when P had been dosed – before breakfast, they had been busy giving patients breakfast – M&N (infantry) came in and were dosed. Meanwhile Q, R.A.M.C., slipped out & had breakfast & were dosed AFTER FOOD, like all the rest of the men everywhere, save the R.A.M.C. section P. Same routine was observed on subsequent days . . . The question is, what is the probability that dosing after food is better than dosing before food . . . You see Q showed up considerably better than P – at first sight. But is it a valid conclusion? Were there enough men in the test? That you can say.

The further data referred to by Atkins are reproduced in Table 4 below.

Letter 4 (Quoted in entirety)
Dear Atkins,
On the short question you sent to me, there is only one answer. Of the 43 cases in the two groups to be compared, if one draws a line between the 19 first saturated and the 24 others (that is, between those saturated in the first three days and those taking longer) one has in the first group 4P and 15Q, in the second group 18P and 6Q. The chance of getting so large a discrepancy, if the numbers in P and Q were really proportional, is about 1 in 1867, so there is no doubt at all of the statistical significance of the difference you have observed. It can be taken as certainly not due to chance; whether it was due to breakfast or some other circumstance is of course beyond me to judge.

6 Comment

Fisher’s reply to Atkins’s comment in letter 3, ‘But is it a valid conclusion? Were there enough men in the test? That you can say’, can have come as no surprise to Atkins. In fact, Atkins had already published his conclusions in the article in Nature previously referenced, although Fisher quite possibly had not read this paper, despite Atkins having referred to it in the correspondence. In Nature, Atkins wrote:

Two sections of men belonging to the same unit and feeding from the same cook-house responded very differently to the doses, for one showed its peak of saturation about two days earlier than the other. The numbers, 22 and 21 respectively, appeared to rule out a chance aggregation. This remained a puzzle until on returning to the station four months later it was ascertained that a visiting unit was dosed after the first section and that the second section had breakfast before being dosed. Apparently the vitamin suffers less destruction when taken after food. A direct experiment should be made to test this accidental finding. (Atkins, 1943)

The statement in the paper on the significance of the results, “the chance of getting so large a discrepancy, if the numbers in A and B were really proportional, is about 1 : 1,867, so there is no doubt at all of the statistical significance of the result,” is lifted straight from Fisher’s letter 4. All Atkins has done is amend P&Q in Fisher’s letter to A&B, thus reverting from the designation of the units of men he himself had used in letter 3 to that which he had used in letter 1, and change “difference” to “result”. In fact in letter 4, the words “at all” are inserted above the line by Fisher, presumably as an afterthought, perhaps because he felt that more emphasis was needed.

Table 4 Data on time to vitamin C saturation for 7 groups of men reproduced from letter 3, Atkins to Fisher, 3 December 1943. The double line between columns L and M is taken from Atkins’s letter and indicates a difference between two locations, J, K, L being in one location and M, N, P, Q in another. P and Q are the groups A and B from Atkins and Fisher (Atkins & Fisher, 1944) and M & N together form the infantry unit otherwise referred to.

Days | J | K | L || M | N | P | Q
1 | 0 | 0 | 0 || 0 | 0 | 0 | 0
2 | 7 | 2 | 5 || 0 | 1 | 0 | 1
3 | 11 | 10 | 22 || 6 | 4 | 4 | 14
4 | 4 | 8 | 12 || 12 | 4 | 6 | 5
5 | 3 | 2 | 4 || 8 | 2 | 9 | 1
>5a | 0 | 1 | 5 || 2 | 1 | 1 | 0
>5b | 0 | 0 | 1 || 1 | 7 | 2 | 0
Casualties | 0 | 2 | 2 || 1 | 1 | 3 | 4
Total | 25 | 25 | 51 || 30 | 20 | 25 | 25

The figure of 1 in 1867 is curious. The method used is not referenced but appears to be Fisher’s own exact test. StatXact® gives a value of P = 0.00053680 (which I confirm by obtaining 0.000536796, programming the problem in Mathcad® 11.0), the reciprocal of which is 1863. Clearly this small discrepancy is unimportant.
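The one-sided exact calculation can be checked directly from the margins of Table 2. The following is a minimal sketch in Python, offered as an independent check rather than a reproduction of the StatXact or Mathcad computations; the function name `lower_tail` and its arguments are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def lower_tail(a, row_total, col_total, n):
    """P(X <= a) for one cell of a 2x2 table with all margins fixed.

    X is the count in the cell whose row total is row_total and whose
    column total is col_total, out of n observations (hypergeometric).
    """
    lowest = max(0, row_total + col_total - n)
    favourable = sum(
        comb(row_total, k) * comb(n - row_total, col_total - k)
        for k in range(lowest, a + 1)
    )
    return Fraction(favourable, comb(n, col_total))


# Table 2: 4 of the 22 men in section A were among the 19 saturated
# within three doses, out of 43 men in all.
p_one_sided = lower_tail(a=4, row_total=19, col_total=22, n=43)

print(float(p_one_sided))      # one-sided P-value
print(round(1 / p_one_sided))  # the "1 in ..." form used in letter 4
```

Under these assumptions the sketch agrees with the value of about 0.000537, or 1 in 1863, reported above.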

What is curious, however, is that this is a one-sided value. Fisher was not generally in the habit of calculating one-sided P-values. Indeed, the tables he included in Statistical Methods for Research Workers (Fisher, 1925a), for Student’s t-distribution, which are entered, as was his habit, at the levels of significance, give two-sided values. Furthermore, the chi-square test as an approximation to Fisher’s exact test, being a measure of squared discrepancy, yields two-sided values. Of course, how to calculate two-sided P-values for the exact test is a matter of controversy (Cox and Hinkley, 1974) and one about which Yates, for example, changed his mind (Yates, 1984). Nevertheless, in view of the lack of an expectation in advance of the data that vitamin C taken after breakfast would have the greater absorption, it seems hard to justify a one-sided test. Perhaps Fisher simply considered that in view of the small P-value it was not worth the trouble.

According to Fienberg (1980), Fisher first introduced the exact test in the famous tea-tasting experiment in The Design of Experiments (Fisher, 1935b) in 1935 and incorporated it in editions of Statistical Methods for Research Workers from then on. The evidence from the latter work of Fisher’s practice as regards his exact test appears inconclusive (Fisher, 1925a). He illustrates the test using data from Lange on ‘Frequency of criminality among the twin brothers or sisters of criminals’ (Lange, 1931). Thirteen monozygotic and seventeen dizygotic twins split 10/3 and 2/15 as regards “convicted”/“not convicted”. The question posed by Fisher is, “Do Lange’s data show that criminality is significantly more frequent among the monozygotic twins of criminals than among the dizygotic twins of criminals?” (p. 94). The chi-square value calculated by Fisher is 13.032, of which he writes, ‘a very significant value, equivalent to a normal deviate 3.61 times its standard error. The probability of exceeding such a deviation in the right direction is about 1 in 6,500’ (p. 95, my italics). (Fisher has calculated the Normal deviate as the square root of the chi-square and then referred this to tables of the Normal distribution.) Not surprisingly, the alternative calculation that follows using the exact test is also one-sided. Fisher obtains a P-value of 1 in 2150 (confirmed correct by StatXact®).
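Both of Fisher's calculations on Lange's data can be retraced with a short Python sketch. It is an illustration under stated assumptions, not a reconstruction of Fisher's own arithmetic: the chi-square is taken to be the uncorrected Pearson statistic for the 2 × 2 table, and the Normal tail area is computed from `math.erfc` instead of being read from tables. The variable names are hypothetical.

```python
from math import comb, erfc, sqrt

# Lange's twins: monozygotic 10 convicted / 3 not, dizygotic 2 convicted / 15 not.
a, b, c, d = 10, 3, 2, 15
n = a + b + c + d

# Uncorrected Pearson chi-square for a 2x2 table.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Normal deviate as the square root of chi-square, then the one-sided tail.
z = sqrt(chi2)
p_normal = 0.5 * erfc(z / sqrt(2))

# One-sided exact test: probability of 10 or more convicted among the 13
# monozygotic twins, given all margins (hypergeometric upper tail).
convicted = a + c
p_exact = sum(comb(a + b, k) * comb(c + d, convicted - k)
              for k in range(a, min(a + b, convicted) + 1)) / comb(n, convicted)

print(f"chi-square = {chi2:.3f}, normal deviate = {z:.2f}")
print(f"Normal approximation: about 1 in {1 / p_normal:.0f}")
print(f"Exact test: about 1 in {1 / p_exact:.0f}")
```

The chi-square and exact results reproduce the 13.032 and 1 in 2150 quoted above. The Normal tail comes out close to, though not exactly at, Fisher's rounded "about 1 in 6,500".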

What is also surprising, perhaps, is the fact that Fisher dichotomised the Atkins data. Of course, this enabled him to use a method he had already developed, as rank tests and, of course, survival analysis, lay in the future. However, Fisher was never shy of developing techniques where none existed and in any case, Pitman had already developed permutation tests (Pitman, 1937), although the application of these would have presented severe difficulties.

Of course, the weakest point of the analysis is the assumption of independence within companies of men on which it relies. Fisher’s claim in letter 4, ‘so there is no doubt at all of the statistical significance of the difference,’ could be defended on the grounds that it simply states that the data cannot be regarded as coming from one single distribution in which all observations are independent. Nevertheless, if that is all that is being claimed, then as much could be said of any agricultural trial: whether or not one has randomised, a highly significant difference between plots treated differently implies that the errors cannot form a single set of independently, identically, Normally distributed random variables. Yet this is a defence of conclusions drawn from non-randomised designs that Fisher would have rejected. It seems, therefore, that Fisher, quite apart from any other potential biasing explanations, the possibility of which he regards as Atkins’s responsibility to discount, accepts it as obvious that the observations made on the men within a company are independent. Note that this assumption of independence is required not only as regards the actual vitamin C reserves but also as regards the means by which Atkins assayed them.

The wider data set that Atkins provided Fisher in letter 3 would also provide the means of examining whether more generally, not just for the two companies of men compared, there are differences between companies. These data tend to detract from the conclusions of Atkins and Fisher since, on the basis of the test that Fisher provides, there are significant differences between pairs of groups of men, even if one excludes the group Atkins regards as suspicious, namely group P, which is the only group he claims were dosed before breakfast. For example, the 2 × 2 table given as table 5 can be constructed for data from groups M and Q if dichotomised as previously. The one-sided P-value according to StatXact® is 0.000421, or 1 in 2376, and thus more impressive than for the comparison of P and Q. Of course, the groups being compared here are not at all chosen at random and it might be argued that this is unfair. However, the point is that Atkins himself has chosen the groups after the fact. In any case, a single chi-square test comparing all groups is significant whether or not groups P & Q are included. (χ² = 31.5 on 6 DF or 19.1 on 4 DF.)
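These between-group comparisons can be checked directly from Table 4 with a short Python sketch. It rests on assumptions that are not spelled out in the text: the casualties row is left out, each group is dichotomised at "three or fewer" versus "four or more" doses, the overall test is an uncorrected Pearson chi-square on the resulting 2 × k table, and the 4 DF version drops groups P and Q and recomputes the margins. The names `TABLE4`, `dichotomise`, `fisher_lower_tail` and `chi_square` are hypothetical.

```python
from math import comb

# Table 4: men reaching saturation on day 1, 2, 3, 4, 5, >5a, >5b.
# The "Casualties" row is omitted (these men did not reach saturation).
TABLE4 = {
    "J": [0, 7, 11, 4, 3, 0, 0],
    "K": [0, 2, 10, 8, 2, 1, 0],
    "L": [0, 5, 22, 12, 4, 5, 1],
    "M": [0, 0, 6, 12, 8, 2, 1],
    "N": [0, 1, 4, 4, 2, 1, 7],
    "P": [0, 0, 4, 6, 9, 1, 2],
    "Q": [0, 1, 14, 5, 1, 0, 0],
}

def dichotomise(days):
    """Return (three or fewer doses, four or more doses) for one group."""
    return sum(days[:3]), sum(days[3:])

def fisher_lower_tail(a, b, c, d):
    """One-sided exact P(X <= a) for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest = max(0, row1 + col1 - n)
    favourable = sum(comb(col1, k) * comb(n - col1, row1 - k)
                     for k in range(lowest, a + 1))
    return favourable / comb(n, row1)

def chi_square(groups):
    """Pearson chi-square and degrees of freedom for the dichotomised groups."""
    table = [dichotomise(TABLE4[g]) for g in groups]
    n = sum(early + late for early, late in table)
    col_totals = (sum(e for e, _ in table), sum(l for _, l in table))
    stat = 0.0
    for row in table:
        size = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = size * col_total / n
            stat += (observed - expected) ** 2 / expected
    return stat, len(table) - 1

p_mq = fisher_lower_tail(*dichotomise(TABLE4["M"]), *dichotomise(TABLE4["Q"]))
chi_all, df_all = chi_square("JKLMNPQ")
chi_sub, df_sub = chi_square("JKLMN")

print(f"M vs Q: P = {p_mq:.6f}, about 1 in {1 / p_mq:.0f}")
print(f"All groups: chi-square = {chi_all:.1f} on {df_all} DF")
print(f"Without P and Q: chi-square = {chi_sub:.1f} on {df_sub} DF")
```

With these assumptions the sketch gives the values quoted in the text: 1 in 2376 for M against Q, and 31.5 on 6 DF or 19.1 on 4 DF for the overall comparisons.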

The letters also explain the discrepancy in the numbers of men in the two units. Letter 3 includes a row labelled ‘casualties’ (see table 4) and this has values of three and four for units P & Q (or A & B). Atkins explains, “The casualties include men who for some reason (sent on leave, sent away etc etc) did not complete the series to saturation.” One could regard these as censored data with an unknown censoring point. Nowadays there would be much discussion of the “missing completely at random” assumption but it seems reasonably made.

Unlike James Lind’s famous investigation of treatments for scurvy (Lind, 1753), which in a sense was also a study of the effects of vitamin C, albeit before its discovery, the Atkins and Fisher study is not an experiment. It is not even a study that was designed to examine what it reports. Nowadays we would perhaps demand much more tentative conclusions than Atkins and Fisher offer. Of course, we can see that Fisher’s role in this is small and a kind interpretation is that he is merely obliging a fellow FRS. He warns, ‘whether it was due to breakfast or some other circumstance is of course beyond me to judge’. Nevertheless, he did agree to put his name to the paper and subsequently, when it came to an association between tobacco and lung cancer, an issue he was no better qualified to judge than the association between the timing of breakfast and vitamin C absorption, he did not hesitate to find ingenious explanations for the association, nor to criticise those who looked for the obvious one.

7 Conclusion

One can argue that the claims of Atkins and Fisher are not unreasonable as perhaps all they were meant to do was advance a theory. Such work would attract further studies, and in fact Atkins mentions in letter 1, in which he comments on the advantage of having dessert after meals rather than, say, eating grapefruit before, that he proposed in his original report, presumably the letter in Nature (Atkins and Fisher, 1944), that this should be tested by direct experiment. The exigencies of war made findings such as this potentially extremely important. Both in the paper and in correspondence Atkins refers to work of Harris regarding the role of vitamin C in wound healing.

The war, of course, is not absent from the correspondence. Quite apart from references to minor irritations such as paper rationing and its effects on reprints, there are also references to human tragedies. In letter 1, Atkins writes, “I remember that you had two sons who stayed on the Salpa . . .”. The Salpa was the Plymouth laboratory’s collecting ship (Tansey, 1997), originally a Lowestoft drifter built in 1918, purchased in 1921 and sold in 1946 (Emma Woodason, personal communication). George and Harry Fisher spent a week there in 1936 (Harry Fisher, personal communication) when Fisher and Frank Yates visited the Marine Laboratory (Council of the Marine Biological Association, 1937). Atkins continues, “. . . It is a bad time for sons so I hope both are well. Mine is still at school. My brother (col. R.A.M.C.) lost his only boy Capt. R.A. in Tunisia – he got M.C. posthumously.” It is a thread of correspondence that Fisher does not pick up. On December 23 1943 Fisher received a telegram to inform him that George, his eldest son, a pilot in the Royal Air Force, had been killed on active service when his plane crashed into a mountainside. Although Fisher’s calculations for Atkins were already completed, the period in which he could reflect on how they would appear in the joint publication was thus one of grief, anguish and turmoil. Letter 5, in which Atkins (who clearly had no knowledge of this) proposed joint authorship, was posted only three days later.

Table 5 Comparison of groups M and Q from Letter 3 and based on Table 4 using the method of Atkins and Fisher (Atkins & Fisher, 1944).

                              Group
Doses to saturation     Men M   Men Q   Total
Three or fewer              6      15      21
Four or more               23       6      29
Total                      29      21      50

How would we classify the Atkins–Fisher study today? Is it a cohort study? The men were followed over time but it is really only after looking at the results that putative causes were examined. The saturation pattern was collected and it was only months later that the exposure information was stumbled on. Thus, in a sense, the study is retrospective. Is it, therefore, a case-control study? But from one point of view it is no more than a case-series. Whatever the classification, I cannot accept that this study matches or even approaches in quality the ones that were initiated by Doll and Hill to study smoking and lung-cancer, nor can I accept that the greater confidence with which the conclusions of Atkins and Fisher were asserted is anywhere near as justified as the more tentatively asserted views of Doll and Hill (Doll and Hill, 1954). Fisher cannot be absolved of a degree of subsequent hypocrisy on this score.

And what of the finding itself? Is the conclusion perhaps right even if the argument is weak and is vitamin C better absorbed after breakfast than before? I am grateful to my former colleague David Bender, author of Nutritional Biochemistry of the Vitamins (Bender, 2003), for his opinion. Apparently, it is not generally considered today that absorption of vitamin C differs in this way.

Does any of this matter? Perhaps not. Many readers will no doubt justifiably feel that picking over the minute details of the lives of great scientists and their work is a task fit for pedants. Yet RA Fisher was an extraordinary genius and of all the extraordinary things he did, his advocacy of randomisation was one of the most controversial. To some it provides an elegant solution to otherwise insuperable inferential difficulties and to others it is an irrelevancy. Fragments from Fisher’s own life and work illustrate his views on randomisation and bring with them some of the fascination of this complex genius.

Acknowledgements I am extremely grateful to the Barr Smith Library of the University of Adelaide for having provided me with copies of the correspondence between Fisher and Mawer and Fisher and Atkins, to the University of Adelaide for permission to publish extracts from Fisher’s letters and in particular to Elise Bennetto, Janine Tan and Susan Woodburn for their help. The website at Adelaide, http://www.library.adelaide.edu.au/digitised/fisher/index.html is an invaluable resource for Fisher research and the James Lind Library http://www.jameslindlibrary.org/ for the history of fair tests of the efficacy of medicines. I am also extremely grateful to Harry Fisher for providing me with information about his childhood, to Emma Woodason of the National Marine Biological Library for information about the Salpa and Fisher’s and Yates’s visits to Plymouth, to Tilli Tansey for help regarding the history of the Marine Biological Association, to David Bender and Shula Spain for information on vitamin C and to Peter Armitage, Henry Bennett, David Cox, Rex Galbraith and Stephen Stigler for helpful comments on earlier drafts.

References

Armitage, P. (2003). Fisher, Bradford Hill, and randomization. International Journal of Epidemiology 32, 925–928.

Atkins, W. R. G. (1943). Vitamin C saturation test of Harris and Abbasy. Nature 151, 21–21.

Atkins, W. R. G. and Fisher, R. A. (1944). The therapeutic use of vitamin C. Journal of the Royal Army Medical Corps 83, 251–252.

Bender, D. A. (2003). Nutritional Biochemistry of the Vitamins. (Second ed.). Cambridge University Press, Cambridge.

Bennett, J. H. (1971–1974). In University of Adelaide, Adelaide.

Box, J. F. (1978). R.A. Fisher, The Life of a Scientist. Wiley, New York.

Council of the Marine Biological Association (1937). Report of the Council. Journal of the Marine Biological Association 22, 371–378.

Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall, London.

Doll, R. and Hill, A. B. (1950). Smoking and Carcinoma of the Lung. British Medical Journal 2, 739–748.

Doll, R. and Hill, A. B. (1954). The mortality of doctors in relation to their smoking habits: a preliminary report. British Medical Journal 1, 1451–1455.

Edwards, A. W. F. (2003). R. A. Fisher – twice Professor of Genetics: London and Cambridge, or ‘A fairly well-known geneticist’. Journal of the Royal Statistical Society Series D – The Statistician 52, 311–318.

Fienberg, S. E. (1980), in R. A. Fisher: An Appreciation, eds. S. E. Fienberg, & D. V. Hinkley (New York: Springer), 75

Fienberg, S. E. and Hinkley, D. V. (1980). in Lecture Notes in Statistics, eds. J. O. Berger, S. E. Fienberg, J. Gani, K. Krickerberg, I. Olkin, & B. Singer

Fisher, R. A. (1925a). Statistical Methods for Research Workers. Statistical Methods, Experimental Design and Scientific Inference. Ed. J. H. Bennett. Oxford University Press, Oxford.

Fisher, R. A. (1925b). Theory of statistical estimation. Proceedings of the Cambridge Philosophical Society 22, 700–725.

Fisher, R. A. (1926). The arrangement of field experiments. Journal of the Ministry of Agriculture of Great Britain 33, 503–513.

Fisher, R. A. (1930a). The Genetical Theory of Natural Selection. Oxford University Press, Oxford.

Fisher, R. A. (1930b). Inverse probability. Proceedings of the Cambridge Philosophical Society 26, 528–535.

Fisher, R. A. (1935a). Contribution to a discussion of J. Neyman’s paper on statistical problems in agricultural experimentation. Journal of the Royal Statistical Society, Supplement 2, 154–157.

Fisher, R. A. (1935b). Experimental Design and Scientific Inference. Statistical Methods, Experimental Design and Scientific Inference. Ed. J. H. Bennett. Oxford University Press, Oxford.

Fisher, R. A. (1936). Has Mendel’s work been rediscovered? Annals of Science 1, 115–137.

Fisher, R. A. (1949). A biological assay of tuberculins. Biometrics 5, 300–316.

Fisher, R. A. (1958). Cigarettes, cancer, and statistics. Centennial Review 2, 151–166.

Grafen, A. (2003). Fisher the evolutionary biologist. Journal of the Royal Statistical Society Series D – The Statistician 52, 319–329.

Harris, D. (1943). Vitamin C saturation test: standardization measurements at graded levels of intake. Nature 151, 21–22.

Harris, D. and Abbasy, M. A. (1937). A simplified procedure for the vitamin-C urine test. The Lancet 2, 1429–1429.

Harte, N. and North, J. (1991). The World of UCL 1828–1890. Revised ed. UCL, London.

Healy, M. J. R. (2003). R. A. Fisher the statistician. Journal of the Royal Statistical Society Series D – The Statistician 52, 303–310.

Lange, J. (1931). Crime and destiny. Allen and Unwin, London.

Lind, J. (1753). A treatise of the scurvy. In three parts. Containing an inquiry into the nature, causes and cure, of that disease. Together with a critical and chronological view of what has been published on the subject. Kincaid and Donaldson, Edinburgh.

Medical Research Council Streptomycin in Tuberculosis Trials Committee (1948). Streptomycin treatment for pulmonary tuberculosis. British Medical Journal ii, 769–782.

Mendel, G. (1866). Versuche über Pflanzenhybriden. Verhandlungen Naturforschender Vereines in Brünn 10,

Pitman, E. J. G. (1937). Significance tests which may be applied to samples from any population. Journal of the Royal Statistical Society, Supplement 4, 119–130.

Poole, H. H. (1960), in Biographical Memoirs of Fellows of the Royal Society 1959, 1, The Royal Society, London.

Senn, S. J. (2003). Dicing with Death. Cambridge University Press, Cambridge.

Tansey, T. (1997). It wasn’t all work – Plymouth in the 1930s. Physiological Society Magazine, 27, 10–12.

Yates, F. (1984). Tests of significance for 2 × 2 contingency tables. Journal of the Royal Statistical Society A 147, 426–463.

Yates, F. and Mather, K. (1963). Ronald Aylmer Fisher. Biographical Memoirs of Fellows of the Royal Society of London 9, 91–129.
