The need for careful evaluation of epidemiological evidence in product liability cases



Law, Probability and Risk (2003) 2, 151–189

The need for careful evaluation of epidemiological evidence in product liability cases: a reexamination of Wells v. Ortho and Key Pharmaceuticals

JOSEPH L. GASTWIRTH

Department of Statistics, George Washington University, Washington DC 2005, USA

[Received on 8 July 2002; revised on 15 April 2003; accepted on 16 July 2003]

Epidemiological studies indicating whether or not a product or chemical is related to a specific harm often are submitted as evidence in product liability and toxic tort litigation. Due to the inherent uncertainty in the estimated risks as well as the possibility that other unmeasured exposures may affect the likelihood of harm, either a significant or non-significant finding ultimately may not be confirmed. This gradual accumulation of scientific knowledge creates a problem for courts that need to resolve a case based on the existing evidence. The problem often arises when courts assess whether a manufacturer failed to warn of a risk, as a firm can only rely on the science available at the time the product was sold. Moreover, commentators may sharply criticize judicial decisions using results of scientific studies published after the trial. This article reviews and contrasts the epidemiologic evidence submitted in two cases. It emphasizes the need for a careful examination of the available studies and assessment of the potential impact of unmeasured variables on the interpretation of the results. Our reanalysis is more favourable to the judiciary’s decision in Wells than previous discussions of the case. The decision deemed the epidemiological evidence inconclusive and relied on other medical testimony.

Keywords: causation; failure to warn; omitted variables; statistical evidence; tort law.

1. Introduction

The legal system is a process for resolving disputes in a reasonably prompt, fair and publicly acceptable manner. In contrast, science is an ongoing search for the truth, so that scientific knowledge can be regarded as contingent in the sense that any theory is subject to falsification if predictions logically derived from it are not borne out.1 Feynman observed that ‘all scientific knowledge is uncertain’ and it is useful for scientists to have a lingering doubt about currently accepted theories.2 Unlike the scientific paradigm, the legal system must decide a case when it arises and often requires cases to be filed within a pre-set time period after the relevant events occurred. It does not have the luxury of keeping an open mind or gathering more data. In Section 2 we briefly review the inherent conflict between the scientific search for the truth, which may leave the final resolution of a problem open for an indefinite amount of time, and the legal process. Suffice it to say that a decision should be assessed on the basis of the evidence submitted at the trial. Scientific studies appearing after a court has taken evidence in a particular case may be contrary to a particular legal decision. This does not necessarily imply that the original decision was flawed. In this paper we discuss two cases, Key Pharmaceuticals, Inc.3 and Wells v. Ortho Pharmaceutical Corp.,4 where scientific knowledge changed subsequently. The first case, concerning the need to properly monitor the blood level of a drug used to treat asthma, has not received much attention in the legal literature.5 The second case, however, in which the plaintiffs were awarded damages for a child’s limb defects attributed to her mother’s use of spermicide after her last menstrual period, has been subject to substantial criticism.6 In fairness to the judiciary, the epidemiologic evidence and its role in both the ‘failure to warn’ and causation aspects of Wells should be reviewed and compared with the evidence in Key.

1 KARL POPPER, CONJECTURES AND REFUTATIONS (5th ed 1989).

2 See RICHARD FEYNMAN, THE MEANING OF IT ALL (1998). This book, published posthumously, is based on lectures given by the author in 1963.

© Oxford University Press 2003, all rights reserved

Most scientists accept a common approach in deciding whether a theory explains a phenomenon; however, the law uses different criteria for the strength of the evidence needed to support a decision in different types of cases. In civil cases, where prevailing plaintiffs are awarded monetary damages, courts use the preponderance of the evidence standard of proof rather than the beyond a reasonable doubt standard used in criminal cases.7 Thus, when assessing a court’s treatment of scientific evidence it is important to be aware of both the specific issue and the type of case. The different purposes of science and law and the consequent differences in methods and procedures are briefly treated in Section 2.

Section 3 describes the Key Pharmaceuticals case. First the decision and the evidence the court relied on are presented. Then a review of the available literature and an analysis of a study submitted into evidence are given. They indicate that the scientific evidence available prior to the administration of the asthma drug to the child, much of which was not submitted to the court, was quite strong, especially with regard to the failure to warn aspect of the case. Section 4 is devoted to the Wells case. After summarizing the discussion of the evidence given in the opinions, the main criticisms of the decision are outlined. Then they are evaluated in light of an examination of two of the studies relied on in the case and a small study the critics refer to. The two cases illustrate how legal decisions can change over time in response to new scientific findings. While some authors have strongly criticized the Wells decision, this review suggests that both the trial and appellate judges involved did the best they could with the information available to them.

3 922 P. 2d 158 (Wash. 1996).

4 615 F. Supp. 262 (N.D. Ga. 1985) affirmed 788 F. 2d 741 (11th Cir. 1986).

5 The fact that a pharmacist was not allowed to testify against a physician about the potential harmful characteristics of a drug used for asthma in Key was noted by David E. Bernstein, Improving the Qualifications of Experts in Medical Malpractice Cases, 1 LAW PROB. & RISK, 9, 11 (2002).

6 The following footnote from DAVID H. KAYE and HANS ZEISEL, PROVE IT WITH FIGURES 271 (1997) is typical. The most notorious case is Wells v. Ortho Pharmaceutical Corp., 788 F. 2d 741 (11th Cir. 1986) (reducing the trial court’s judgment against the manufacturer of a contraceptive jelly from $5.1 million to $4.7 million), cert. denied, 479 U.S. 950 (1986). The findings of the district court, 615 F. Supp. 262 (N.D. Ga. 1985), and the affirmation on appeal became a lightning rod for criticism of the legal system’s ability to handle expert evidence. See e.g. Federal Judges v. Science, N.Y. Times, December 27, 1986, at A22 (unsigned editorial); Samuel R. Gross, Expert Evidence, (1991) WISC. L. REV. 1113, 1121–24; James L. Mills & Duane Alexander, Teratogens and ‘Litogens’, 315 NEW ENG. J. MED. 1234 (1986). Several years later, in another suit against the same manufacturer, the district court distinguished Wells on the basis of newer studies and granted summary judgment for the manufacturer. Smith v. Ortho Pharmaceutical Corp., 770 F. Supp. 1561 (N.D. Ga. 1991). See also David E. Bernstein, Junk Science in the Courtroom, WALL ST. J. Mar. 24, 1993 at A 15 (citing Wells as the ‘most prominent’ of a ‘series of embarrassing decisions in cases involving scientific evidence’).

7 See Lee Loevinger, Standards of Proof in Science and Law, 32 JURIMETRICS J. 323 (1992) (noting that the standard of proof informs the fact-finder about the degree of confidence society believes appropriate to the type of adjudication, thereby allocating the risks of error between the litigants). For further discussion and references to the literature on translating standards of proof into probabilities see Richard A. Posner, An Economic Approach to the Law of Evidence, 51 STANFORD L. REV. 1477 (1999).

The issue of whether a manufacturer failed to warn of a risk rather than whether the product caused the harm arose in both cases. It is not logically obvious that the same degree of scientific evidence needed to support a finding of causation should be required to support a finding that a producer had a duty to warn of a potential risk at the time the product was sold. While many jurisdictions require plaintiffs in warning cases to demonstrate legal causation, others use the informed consent rationale.8 Furthermore, an injured person may lose their opportunity for legal redress due to the expiration of the statute of limitations.9 Recent cases indicate that neither a formal medical diagnosis nor certainty as to causation nor the availability of the proof needed to sustain the legal action is required in order for the limitations period to begin.10 Since plaintiffs have the burden of proving causation, other commentators have noted that the effect of the current system may be to discourage producers from testing and monitoring their products.11

The Wells and Key Pharmaceuticals cases vividly demonstrate the need for the scientific and legal communities to understand the constraints each works under and the inherent uncertainty in epidemiological and statistical evidence. Some suggestions for enhancing the utility of scientific evidence in product liability cases are presented in Section 5. Their purpose is to encourage producers to obtain and transmit knowledge about their products while enabling scientists to obtain a reasonable amount of information before the legal process must resolve a dispute.

8 Judge Wisdom’s opinion in Borel v. Fibreboard Paper Prods. Corp., 493 F. 2d 1076, 1089 (1973) described the reasoning underlying this approach as follows: ‘The rationale for this rule is that the user is entitled to make his own choice as to whether the product’s utility or benefits justify exposing himself to the risk of harm. Thus, a true choice situation arises and a duty to warn attaches, whenever a reasonable man would want to be informed of the risk in order to decide whether to expose himself to it.’ See also MICHAEL D. GREEN, BENDECTIN AND BIRTH DEFECTS 12 (1996) and Dix W. Noel, Products Defective Because of Inadequate Directions or Warnings, 23 SW. L. J. 256 (1969) for a discussion of the obligation to warn.

9 For example, in an early case, Mathis v. Eli Lilly Co., 719 F. 2d 134 (6th Cir. 1983), involving cancer caused in post-menopausal young women exposed to DES in utero, the court upheld Tennessee’s commencement of its 10 year limitations period at the time the drug was purchased. Due to the long latency period of the cancer, this effectively prevented any plaintiff from prevailing. See Michael D. Green, The Paradox of Statutes of Limitations in Toxic Substances Litigation, 76 CAL. L. REV. 965 (1988).

10 See Sixth Circuit Finds Claim Untimely by Woman Who Suspected Link to Disease, 13 TOXIC TORTS REP. 145 (1998) (summarizing the decision in Lynch v. Johnson & Son, 6th Cir. No. 97-5413). The decision cited an unpublished case, Wansley v. Refined Metals Corp., No. 02101-9503-CV-00065 (Tenn. Ct. App. 1996). The court stated ‘. . . The fact that plaintiff may not have had the proof necessary to sustain his cause of action until within a year prior to filing suit is immaterial in determining when his cause of action accrued.’ See also, Alaska Supreme Court Says Worker Failed to Bring Timely Exposure Claim, 16 TOXIC LAW REP. 553 (2001), summarizing Sopko v. Dowell Schlumberger Inc., Alaska, No. S-9534 (the statute of limitations began to run when the worker was diagnosed with exposure to toxic fumes, even though the full extent of injury did not become apparent until years later).

11 See James A. Henderson, Jr., Coping with the Time Dimension in Products Liability, 69 CAL. L. REV. 919, 940–41 (1981) (observing that strict liability may discourage manufacturers from safety testing) and Wendy E. Wagner, Choosing Ignorance in the Manufacture of Toxic Products, 82 CORNELL L. REV. 773, 810 (1997) (noting that the existing liability system makes ignorance of potential problems a rational choice for chemical producers). See also Margaret A. Berger, Eliminating General Causation: Notes Towards a New Theory of Justice and Toxic Torts, 97 COLUM. L. REV. 2117, 2135 (1997) (citing a variety of products where the producer did not test the product adequately, failed to impart information when potential problems emerged and did not undertake further response to adverse information).

2. Some inherent conflicts between the goals of law and science

It is well known that science and law are distinct fields with different roles. The goal of science is to understand the mechanisms underlying our observations about the world. Scientists attempt to develop theories explaining observed phenomena and examine them by deriving predictions that are tested against several sets of new data.12 The law is a dispute resolution process and must reach a decision on the basis of the information available to it. Thus, its procedures for establishing facts are different from those of science, although both fields utilize the same basic principles of inductive inference13 and both are concerned with finding the truth.14 In particular, the legal system cannot wait for the results of a study to be replicated;15 sometimes it makes replication virtually impossible. For example, the filing of a charge of discrimination often changes the employment process under scrutiny.16 Similarly, a manufacturer may withdraw a product once evidence it is harmful appears;17 however, there is a legitimate concern that producers remove otherwise useful products to avoid litigation in the absence of sound scientific evidence.18

In product liability or toxic tort cases epidemiological studies and experiments concerning the toxicity of chemicals or drugs are submitted as evidence to support or refute a claim that a plaintiff’s exposure to or use of a product caused harm. Typically, such evidence is introduced at trial after the illness or injury occurred. Virtually all states and nations have statutes of limitations requiring plaintiffs to file a claim within a pre-set time period, e.g. 3, 5 or 10 years after the harmful event happened or after the relationship between the harm and exposure was discovered or should have been discovered by them. These fixed deadlines, imposed for other reasons, conflict with the open-ended nature of scientific investigation. Regardless of the state of scientific knowledge at the end of the limitations period, if plaintiffs have not filed suit, they are legally barred from receiving compensation no matter how strong a causal relationship between the harm and exposure to the product is established subsequently.19 This may also cause plaintiffs to file a suit before the scientific issues can be resolved.

12 See Joseph L. Gastwirth, Statistical Reasoning in the Legal Setting, 46 AMER. STATIST. 55–69 (1992), and Karl Popper, supra note 1. For a general discussion of the different methodologies of the two fields see STEPHEN E. FIENBERG (ed.) THE EVOLVING ROLE OF STATISTICAL ASSESSMENTS AS EVIDENCE IN THE COURTS (1989), especially Chapter 4.

13 David H. Kaye, Proof in Law and Science, 32 JURIMETRICS J. 313–322 (1992).

14 In Tehan v. U.S. ex rel. Shott, 382 U.S. 416, 465 (1966) the Court noted that ‘the basic purpose of a trial is the determination of the truth’ (finding that denying a poor defendant the assistance of a lawyer increases the danger of convicting the innocent). Some of the rules of evidence do allow a party to withhold evidence because allowing it might endanger other societal values. The marital privilege is an example.

15 Kaye, supra note 13 at 317. See Ronald J. Allen, Expertise and the Daubert Decision, 84 J. CRIM. LAW AND CRIMINOLOGY, 1157 (1994) for a discussion of the role of scientific experts in trials and noting, at 1166, that science is not constrained by statutes of limitations.

16 A substantial increase in the proportion of minority employees hired subsequent to a charge in equal employment cases is evidence that they were available prior to the charge. See Boris Freidlin and Joseph L. Gastwirth, Changepoint Tests Designed for the Analysis of Hiring Data Arising in Employment Discrimination Cases, 18 J. BUS. & ECON. STAT. 315 (2000) for a discussion of statistical methods used in detecting a change and for references to several legal cases.

17 The data showing the sharp decrease in toxic shock syndrome cases after the manufacturer of the tampon responsible for them took it off the market and the related studies and legal cases are presented in JOSEPH L. GASTWIRTH, STATISTICAL REASONING IN LAW AND PUBLIC POLICY 840–49 (1988). Such a manufacturer is acting responsibly and post-event modifications or removal of a product are generally inadmissible as evidence of causation.

18 This apparently happened when Merrell Dow withdrew Bendectin from the market because of suits claiming that it caused birth defects. See MICHAEL D. GREEN, supra note 8 at 180–188.

Manufacturers also have a duty to warn users of any potential harm their product may create. As noted by Phillips,20 warnings are designed to ensure safe use and they must adequately describe the nature and extent of the danger involved. Many locales require plaintiffs to prove causation in addition to failure to warn. Some states, however, use the informed consent rationale to evaluate the adequacy of warnings. Here the plaintiff asserts they should have been warned of a danger or possible risk in order to decide whether to use the product.21 Regardless of the criteria adopted in warning cases, a manufacturer’s obligation depends on the knowledge available at the time the product is sold to the user.

The majority of jurisdictions accept the view stated in the Restatement (Third) of Torts:

A defendant will not be liable under an implied warranty of merchantability for failure to warn about risks that were not reasonably foreseeable at the time of sale or could not have been discovered by way of reasonable testing prior to marketing the product. A manufacturer will be held to the standard of knowledge of an expert in the appropriate field, and will remain subject to a continuing duty to warn of risks discovered following the sale of the product at issue.

The above legal criteria stating a producer’s duty to warn do not define clearly the responsibilities producers have to ascertain risks arising from the use of their products. Nor do they indicate the magnitude of risk or danger, either in terms of probability or severity of harm, that requires warning potential users. Several commentators have noted that this vagueness and the requirement that plaintiffs establish causation, i.e. that the product in question harmed them, discourage manufacturers from conducting safety studies.22

Some courts have interpreted the expert status of producers as a duty for them to test their products appropriately and to follow developments in the relevant scientific literature.23 Indeed, some courts have invoked a post sale duty to warn.24 This duty to warn may encourage manufacturers to question studies and case reports indicating a potential risk of their products in order to keep those products on the market. In jurisdictions with strict statutes of limitations, delaying public awareness may also prevent individuals who were harmed from filing a claim.

19 See Wansley v. Refined Metals Corp., supra note 10. But see In Re Asarco/Vashon-Maury Island Litig., W.D. Wash., No. C00-695Z (2001), where the court denied Asarco’s motion for summary judgment in a case concerning soil contamination, finding that the statutes of limitations do not begin to run on nuisance and trespass claims until the substance is removed.

20 JERRY J. PHILLIPS, PRODUCTS LIABILITY, 205 (1988).

21 Id. at 209.

22 See Wendy E. Wagner, supra note 11 at 796 (observing that, given plaintiff’s need to prove causation, a manufacturer might view testing efforts that could produce adverse results as encouraging litigation); Heidi L. Feldman, Science and Uncertainty in Mass Exposure Litigation, 74 TEX. L. REV. 1, 41 (1995) (noting that plaintiff’s burden of proof in toxic tort cases may create incentives for defendants not to clarify scientific uncertainties relating to causation).

23 See Borel v. Fibreboard Paper Prods. Corp., 493 F. 2d 1076, 1089–90 (5th Cir. 1973) and Nicklaus v. Hughes Tool Co., 417 F. 2d 983, 986–87 (8th Cir. 1969) (noting that a manufacturer has an affirmative duty to make tests during and after the product is made that are appropriate to dangers involved in its use).

24 See M. STUART MADDEN, TOXIC TORTS DESKBOOK, 82, (1992), citing Rastelli v. Goodyear Tire & Rubber Co., 565 N.Y.S. 2d 889 (App. Div. 1991). Wagner, supra note 11 at 809 cites other statutes that allow regulatory agencies to impose a continuing duty on manufacturers to conduct tests and to inform the public of dangers that come to their attention after the product is sold.

While both disciplines utilize the concept of causality, its meaning differs. Before science makes a causal attribution, the biochemical mechanism, i.e. how the agent in question affects the body ultimately leading to the disease, should be known. Alternatively, a substantial number of studies should demonstrate a consistently larger risk of the disease in exposed populations relative to an otherwise comparable group of unexposed individuals. A substantial amount of time may pass before the cumulative amount of evidence enables the scientific community to come to a consensus about the existence of a causal relationship between an exposure and an illness. The law relies on a more practical ‘more likely than not’ test for causality to assign legal responsibility.25 While this may appear surprising to scientists, even in his oft-quoted paper enumerating the criteria to be used in establishing causality in medicine, Sir Bradford Hill26 noted that different standards are applicable to public policy. He clearly realized that waiting for another confirmatory study imposes a potential risk to the public that remains exposed while the new study is carried out. This risk is illustrated by Reye’s syndrome cases. In the fall of 1982, after four studies, the Food and Drug Administration proposed warning the public about the increased risk of Reye’s syndrome in children following the use of aspirin to treat colds, flu or chicken pox. The industry persuaded the government to sponsor a further study. In early 1985, the new data confirmed the increased risk and the industry began to notify the public. This delay was a substantial cost to the public.27

25 For a discussion of the variation in how courts rely on statistical evidence, see Susan R. Poulter, Science and Toxic Torts: Is There a Rational Solution to the Problem of Causation? 7 HIGH TECH. L. J. 189, 211 (1992). See David Rosenberg, The Causal Connection in Mass Exposure Cases: A ‘Public Law’ Version of the Tort System, 97 HARV. L. REV. 851–929 (1984) for comprehensive treatment of causality in tort law.

26 Austin B. Hill, The Environment and Disease: Association or Causation? 58 PROC. ROYAL SOC. MED. 295–300 (1965). The main criteria considered in determining whether an association is causal are: temporal relationship, strength of the association, dose response, replication, consistency with other knowledge (biological plausibility) and specificity of the association (most agents cause only one or a related set of health problems). Hill emphasized that the criteria should be used as an aid for inference and that no single criterion should be considered a sine qua non. See Michael D. Green, D. Michal Freedman and Leon Gordis, Reference Guide on Epidemiology in REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 375–79 (Federal Judicial Center, 2000) for further discussion of these criteria. Ethical issues arising in the use of epidemiological studies in public health decisions are discussed in Douglas L. Weed, Epistemology and Ethics in Epidemiology, in ETHICS AND EPIDEMIOLOGY 76 (S. S. Coughlin and T. L. Beauchamp eds., 1996).

27 See Joseph L. Gastwirth, Suggestions for Reconciling the Values of Statistical Science with the Goals and Needs of the Legal and Regulatory Processes. PROC. OF THE EPIDEM. SEC. 1998 MEETING OF THE AM. STAT. ASS’N. and Ermias D. Belay et al., Reye’s Syndrome in the United States from 1981 Through 1997. 340 NEW ENG. J. MED. 1377–1382 (1999). There were about 200 cases in each of the years 1981 and 1982 but only about half as many in 1985 and 1986. The reader should be told that the author was a consultant to the statistical unit in the Office of Regulatory Analysis of the Office of Management and Budget during this period.

Before describing the cases and evidence that will be examined, a brief summary of the use of epidemiological evidence in tort cases is useful. Studies comparing individuals exposed to a product with otherwise similar unexposed individuals yield estimates of the relative risk, R, the ratio of the probability an exposed individual suffers the harm in question to that of the unexposed individual.28 Courts often consider the fraction of cases that are attributable to exposure, which mathematically equals (R − 1)/R, as the probability that the exposure caused the harm at issue. Notice that an R > 2 implies that at least one-half of the cases are due to the exposure and is sometimes equated to meeting the preponderance of the evidence standard. This topic has been discussed extensively in the literature, to which we simply refer.29 Scientific studies estimate R for a general population so that the results may not be directly applicable to a case if the plaintiff is healthier or sicker than average. Usually there are several studies that yield varying estimates of the relative risk, R. It is important to present all of these results to the court so that an overall estimate of R can be made. This is not a routine task as different populations may have been studied and information on other risk factors may have been collected in some studies but not in others.
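The arithmetic behind the attributable fraction can be illustrated with a short Python sketch. The function name `attributable_fraction` and the sample values of R are illustrative assumptions, not estimates from any study discussed in this article; only the formula (R − 1)/R comes from the text.

```python
def attributable_fraction(relative_risk: float) -> float:
    """Fraction of cases among the exposed attributable to exposure: (R - 1) / R.

    Hypothetical helper for illustration; it assumes R >= 1, i.e. that
    exposure does not lower the risk.
    """
    if relative_risk < 1:
        raise ValueError("this sketch assumes a relative risk of at least 1")
    return (relative_risk - 1) / relative_risk


# Illustrative relative risks (assumed values, not taken from any study).
for r in (1.5, 2.0, 3.0):
    af = attributable_fraction(r)
    verdict = "exceeds" if af > 0.5 else "does not exceed"
    print(f"R = {r}: attributable fraction = {af:.3f} ({verdict} one-half)")
```

A relative risk of exactly 2 gives a fraction of exactly one-half, so under this simplified reading only values of R above 2 push the fraction past the ‘more likely than not’ threshold. The caveats in the surrounding text about other risk factors and about plaintiffs who differ from the studied population still apply.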

3. Young v. Key Pharmaceuticals, Inc.30

3.1 The background

This case concerned whether the treating physicians and hospital were negligent in prescribing and monitoring the blood level of theophylline, a drug used to treat asthma. A particularly important issue was whether the maker of the drug had properly warned the medical profession of the effect viral infections had on slowing the body’s elimination of the drug. It was known at the time that the dose had to keep the blood level in the narrow range of 10 to 20 mcg/ml as higher levels, especially those exceeding 40 mcg/ml, increased the risk of seizure.31 In September 1978 the plaintiff (a young boy at the time) was put on a time-release version of the drug, and received 200 mg twice a day. In January 1979 the dose was increased by fifty percent to 300 mg and on January 26th his blood level was measured as 11.8 mcg/ml, which was in the therapeutic range. On February 1, 1979 the child had seizures. The measured blood level of the drug was 68 mcg/ml, well above the safe level.

The physicians and hospital were dropped from the case when the court decided thata pharmacist did not have sufficient qualifications to be admitted as an expert.32 The

28 This is a simplified definition as the time dimension and the potential affect of other risk factors is ignored.The cases examined here concern responses that occur relatively soon after exposure.See Sander Greenland &James M. Robins,Epidemiology, Justice and the Probability of Causation, 40 JURIMETRICS J. 321 (2000) forfurther discussion and references.

29 See Greenet al. supra note 26 at 333–400 for a comprehensive discussion of the role of epidemiologicalevidence and the ‘rule’ thatR should exceed 2.See also Stephen E. Fienberget al., Understanding and EvaluatingStatistical Evidence in Litigation, 36 JURIMETRICS J. 1, 9 (1995) (noting that the requirement of a relative riskof 2.0 rests on a misinterpretation of statistical methodology). Russellyn S. Carruth and Bernard D. Goldstein,Relative Risk Greater Than Two in Proof of Causation in Toxic Tort Litigation, 41 JURIMETRICS J. 195 (2001)review of the various ways courts have used estimates of relative risk in conjunction with other evidence to assesscausation. JUDEA PEARL, CAUSALITY at 283–304 (2000) describes the stringent conditions needed for thevalidity of the ‘R should exceed 2 rule’ in the context of responses that occur shortly after exposure. Otheruseful discussions are given by Sana Loue,Epidemiological Causation in the Legal Context: Substance andProcedures, in STATISTICAL SCIENCE IN THE COURTROOM 263, 274–277 (J. L. GASTWIRTH ed., 2000),Louis A. Cox Jr.,Probability of Causation and Attributable Risk. 4 RISK ANALYSIS 221 (1984) (emphasizingthe potential effect of other risk factors on the interpretation of attributable risk).

30 922 P. 2d 158 (Wash. 1996).

31 Miles Weinberger, Theophylline for Treatment of Asthma, 92 J. PEDIAT. 1–7 (1978).

32 770 P. 2d 182, 188 (1989).


TABLE 1 Half-life of theophylline in six asthmatic children during and after a febrile infection

Subject    During    After    Difference
1          279.6     125.5    154.1
2          352.8      82.1    270.7
3          204.8     167.1     37.7
4          558.5     381.4    177.1
5          686.1     303.2    382.9
6          436.8     439.8     −3.0

AVG        419.8     249.9    169.9

Note: The first four pairs had influenza (A), the sixth had influenza (B) while the fifth had a different viral infection.

pharmacist’s affidavit containing their intended testimony on behalf of the plaintiff stated that the recommended dosage for children was 200 mg and that more frequent monitoring of blood levels was required. A dissenting judge believed that pharmacists were experts in these topics and the affidavit was relevant33 but the majority felt a physician was required.

Regarding the claim of failure to warn, the plaintiff submitted an article34 that reported a significant increase in the half-life of theophylline during acute respiratory illness. It was a small study of six patients who had upper respiratory viral infections. Other patients who had gastrointestinal symptoms did not show an effect. The data, reporting the half-life of theophylline during and after a fever-inducing infection, are reproduced in Table 1. In their statistical analysis the investigators used a one-tailed two-sample t-test, obtaining a p-value of 0.05. This result would be considered of borderline significance. Of the six patients only the first experienced an acute toxic reaction. The child had a blood level of 43.3 mcg/ml, which exceeded the safe range. The blood levels for the remaining patients were not reported.

While the study indicated that further research into the effect of acute viral infections on the metabolism of the drug was needed, one of the authors testified that the study by itself did not justify a warning at that time.35 The opinion notes that the defendant was aware of reports, studies and articles prior to February 1, 1979 suggesting that prolonged fever and certain viral illnesses could result in accumulation of toxic levels of theophylline in the body, even when administered in normal doses.36 The opinion observes that by the time of the trial there was no dispute about the validity of these studies and the question was whether the firm should have warned physicians of this ‘possible risk’ based on the information available before February 1, 1979.37

33 Id. at 191–92.

34 K. C. Chang, T. D. Bell, B. A. Lauer & H. Chai, Altered Theophylline Pharmacokinetics During Acute Respiratory Viral Illness, LANCET 1132–34 (May 27, 1978).

35 922 P. 2d 59, 65 (Wash. 1996).

36 Id. at 62. Citations to those articles, however, are not given in the opinion.

37 Id. at 62. See also G. D. Sweeny and MacLeod, Anti-Allergy and Anti-Asthma Drugs: Disposition in Infancy and Early Childhood, 17 Clin. Pharmacokin. (Supp. 1) 156, 163 (1989) (noting that theophylline is accepted as


The plaintiff also wanted to submit an advertisement for an alternative drug, Sustaire, published in a medical journal in February 1979 that noted the need for careful monitoring. The trial judge did not admit the ‘ad’ as it appeared very close to the date of the injury. The Supreme Court of the state held that it was properly excluded.38 Two of the reasons given are important. First, the ‘ad’ appeared after the injury. Second, even if another company felt a warning was needed, the plaintiffs needed to show that the warning was the industry standard in order to prove that the defendant was negligent.39

A crucial aspect of the case was the jury instructions that set forth the manufacturer’s duty to warn in negligence cases. The instruction, approved by the Supreme Court of Washington was:40

A pharmaceutical manufacturer is under a duty to use ordinary care to test, analyse and inspect the drug products it sells, and is charged with knowing what such tests would have revealed.

The pharmaceutical manufacturer has a duty to use ordinary care to keep abreast of scientific knowledge, discoveries, advances and research in the field, and is presumed to know what is imparted thereby.

When a pharmaceutical manufacturer becomes aware or should have become aware of dangerous aspects of one of its drug products, it has a continuing duty to warn of such dangerous aspects. In such a case, the pharmaceutical manufacturer is under a duty to act with regard to issuing warnings or instructions concerning any such danger in the manner that a reasonably prudent pharmaceutical manufacturer would act in the same or similar circumstances. This duty is satisfied if the manufacturer exercises reasonable care to inform the ordinary physician who prescribes the drug product.

The question of whether defendant, Key Pharmaceuticals, Inc., exercised reasonable care is to be determined by what defendant Key knew or reasonably should have known prior to the time of plaintiff’s injury on February 1, 1979.

The plaintiff objected to this instruction and asked for the following two instructions that appear consistent with the informed consent rationale:41

This duty to warn attaches, not when scientific certainty of harm is established, but whenever a reasonable physician using the product would want to be informed of the risk of harm in order to decide how to safely use the drug in treating his or her patients.

The manufacturer of a pharmaceutical product has a duty to a physician who prescribes it to provide all necessary instructions and warnings to fully apprise the physician of the proper procedures for its safe use and the dangers involved.

Clearly, more or stronger evidence is required to satisfy the standard adopted in the case than in the plaintiff’s alternative. The court’s criteria will be used in the following critique of the decision. Before proceeding, it should be noted that the final opinion only cited one study and neither it nor the other published opinion42 cites the other reports and studies the plaintiff submitted to the court.

a difficult drug to use with a narrow therapeutic window (plasma concentrations of 10 to 20 mcg/ml) between effective dose and toxicity).

38 Id. at 66.

39 Id. at 66.

40 Id. at 67.

41 Id. at 67.

42 770 P. 2d 182 (Wash. 1989).


3.2 Critique of the evidence and the decision

In a reasonable but not exhaustive search of the literature the author found about twenty articles related to theophylline or its sister drug, aminophylline, published in 1978 or earlier, which will be summarized briefly. If even a substantial fraction of them had been submitted into evidence, one could seriously question the court’s finding that the defendant did not have a duty to warn the medical community by the end of 1978.

At least four case reports published in the 1950s indicated that over-dosage could cause toxicity in children.43 By the early 1970s dosage regimens for achieving a safe plasma theophylline concentration of about 10 mcg/ml in most patients were established.44 In 1973, a case of aminophylline intoxication in a four-year-old occurred due to a mistake by hospital staff who administered that drug instead of ampicillin.45 Another case report noting that blood levels must be measured appeared in 1975.46 In 1974, the value of measuring serum theophylline was demonstrated when a 72-year-old man had convulsions as a result of high blood levels of the drug.47 Theophylline-induced seizures in adults were described in three reports at about that time.48 The usefulness of plasma concentrations in guiding theophylline therapy, as well as the possibility of measuring that concentration from saliva, was known in 1975.49 The large variation in the time required by children to eliminate the drug and the fact that fever might slow this process were known by

43 See Vincent J. Rounds, Aminophylline Poisoning, 14 PEDIATRICS 528–531 (1954) (reporting six new cases from apparently reasonable dosage administrations), A. C. Nolke, Severe Toxic Effects from Aminophylline and Theophylline Suppositories in Children, 161 JAMA 693–697 (1956) (reporting four deaths in a series of 13 cases), Ben. H. White and C. Wm. Daeschner, Aminophylline Poisoning in Children, 49 J. PEDIATR. 262–271 (1956) (reporting four new cases of toxic reaction in children indicating that the possibility of intoxication should be considered when the drug is used and that care should be taken to prevent accumulation) and H. Soifer, Aminophylline Toxicity, 50 J. PEDIATR. 657–669 (1957) (summarizes a total of 37 reported aminophylline toxicity cases resulting in 11 deaths). See also H. L. Bacal, K. Linegar, R. L. Denton, et al., Aminophylline Poisoning in Children, 80 CAN. MED. ASSOC. J. 6–9 (1959) (reporting 10 cases of aminophylline poisoning in children from the Montreal Children’s Hospital and observing, at 8, that in all cases reported the dosage was too high).

44 See Paul A. Mitenko and Richard I. Ogilvie, Rational Intravenous Doses of Theophylline, 289 N. ENG. J. MED. 600–03 (1973) (developed a dose schedule designed to result in concentrations of 10 plus or minus 5 mcg/ml and noted that at higher doses care must be taken to avoid systemic toxicity).

45 See Bruce Barter & Robert J. Roberts, Unusual Case of Aminophylline Intoxication, 52 PEDIATRICS 608–09 (1973) (also noting that blood levels are difficult to interpret due to erratic and unpredictable absorption of oral or rectal administration of the drug).

46 See Badiollah Badiei and R. Michael Sly, Theophylline Toxicity in an Infant, 35 ANN. ALLERGY 309–11 (1975) (citing two other recent reports confirming the association of the drug’s toxicity with excessive doses).

47 Myron H. Jacobs and Robert M. Senior, Theophylline Toxicity Due to Impaired Theophylline Degradation, 110 AM. REV. RESPIR. DIS. 342–345 (1974) (reporting the problem at the standard dosage).

48 M. S. Schwartz and D. F. Scott, Aminophylline Induced Seizures, 15 EPILEPSIA 510–514 (1974) (reporting four cases of seizures in adults and noting that convulsions had been well recognized as a possible toxic effect in children in 1953), Clifford W. Zwillich, Frank D. Sutton and Thomas A. Neff, Theophylline-Induced Seizures in Adults, Correlation with Serum Concentrations, 82 ANN. INTERN. MED. 784–787 (1975) and Philip R. Yarnell and Nai-Shin Chu, Focal Seizures and Aminophylline, 25 NEUROLOGY 819–822 (1975) (stating that careful individual monitoring of aminophylline dose and administration is essential).

49 Gerhard Levy and Renu Koysooko, Pharmacokinetic Analysis of the Effect of Theophylline on Pulmonary Function in Asthmatic Children, 86 J. PEDIATRICS 789–93 (1975) (noting the suitability of monitoring theophylline blood levels in children using saliva).


1976.50 Further reports on toxicity in adults and children appeared in 1977.51 The medical profession was reminded of the need for careful monitoring of serum concentrations in four articles appearing the following year.52 One of those studies53 increased doses in increments of 25%, rather than the 50% increase in dose administered in the case. At the beginning of 1978, a review article noted that some sustained-release preparations, including TheoDur tablets made by Key Pharmaceuticals, warranted consideration for use, and also stated that individual dosages should be guided by careful measurement of serum theophylline concentration.54

Collectively these articles and reports appear to provide sufficient evidence that by the end of 1978 a warning covering the risk of seizure, the need to increase the dose gradually and the need to monitor the serum level of the drug regularly should have been given. As manufacturers have a duty to keep up with the relevant scientific literature, it is unfortunate that the opinion did not list the studies and case reports submitted at trial. Had it done so, one could review the information the court believed the defendant was or should have been aware of, rather than the information that truly was available at the time.55

It should also be noted that the statistical method used to analyse the data in Table 1 from the small study was not the appropriate one. As the measurements are on the same child, they are matched or paired. Using the paired t-test56 yields a statistically significant

50 See Elliot Ginchansky and Miles Weinberger, Dose Dependent Kinetics of Theophylline Disposition in Asthmatic Children, 91 J. PEDIATRICS 820–824 (1977) (noting the need for a reliable sustained-release form of the drug as well as noting the effect of fever on decreasing clearance) and P. M. Loughnan et al., Pharmacokinetic Analysis of the Disposition of Intravenous Theophylline in Young Children, 88 J. PEDIATRICS 874–879 (1976) (while these authors suggest the potential effectiveness of higher dose levels they comment that the drug exhibits relatively small differences between therapeutic and toxic plasma concentrations and that clinical monitoring for signs of toxicity is imperative).

51 See Leslie Hendeles, Lyle Bighley, Robert H. Richardson, et al., Frequent Toxicity from IV Aminophylline Infusion, 11 DRUG INTELL. CLIN. PHARM. 12–18 (1977) (observing that reduced clearance of the drug was related to increased toxicity) and Yvonne Vaucher, Elmer S. Lightner, and Philip D. Walson, Theophylline Poisoning, 90 J. PEDIATRICS 827–830 (1977) (reporting three cases of theophylline poisoning in children, two of whom had fever).

52 See Richard Wyatt, Miles Weinberger and Leslie Hendeles, Oral Theophylline Dosage for the Management of Chronic Asthma, 92 J. PEDIATRICS 125–130 (1978) (noting that patients will be placed at risk for seizures if doses are increased without measurement of serum theophylline concentrations), John Turk, Jay M. McDonald and Jack H. Ladenson, Theophylline Toxicity, 24 CLIN. CHEM. 1603–1608 (1978) (recommending that the serum level of the drug be monitored regularly and dosages adjusted accordingly), Frederick M. Vincent, Case Report: Fatal Theophylline-induced Seizures, 63 POSTGRAD. MED. 76–77 (1978) (noting that aminophylline and related drugs should be listed among those that can cause seizures at levels and stating that serum drug levels should be monitored) and Gregory J. Kadlec et al., Acute Theophylline Intoxication Biphasic First Order Elimination Kinetics in a Child, 41 ANN. ALLERGY 337–339 (1978) (stating ‘it is of the utmost importance to have regular theophylline levels determined in order to accurately predict the course of intoxication’).

53 See Wyatt et al., supra note 52 at 125. The article also notes that high fever can slow metabolism of the drug.

54 Miles Weinberger, Theophylline for Treatment of Asthma, 92 J. PEDIATR. 1–7 (1978).

55 Of course, it is the duty of the parties to present this evidence to the court.

56 In a matched study one takes one sample of individuals and then finds controls that are matched to them on the basis of other factors, e.g. prior health history or lung capacity, that influence the outcome under investigation. If a difference between the groups is observed, it cannot be attributed to these matched factors but is due to the difference in exposure to the agent under study. As the article submitted in the case reported the reaction of the same child at two different times, virtually all other possible factors are controlled for. Statistical methods appropriate for matched studies are readily available. See GASTWIRTH, supra note 17, 587–674, and MICHAEL O. FINKELSTEIN and BRUCE LEVIN, STATISTICS FOR LAWYERS, 223 and 226–228 (2000) for an introduction to them and their application in the legal setting. The analysis used assumes the differences


result at the usual 0.05 level (two-tailed p-value of 0.034) for the difference in half-life between the periods when the patients were well and when they had a respiratory illness. This is noticeably stronger evidence as the two-tailed p-value of the test used by the authors of the article57 was only 0.10. While this one small study, even properly analysed, would not justify a warning, when added to the reports cited previously, the pre-1979 information appears to satisfy the instruction accepted by the court.
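The contrast between the paired and two-sample analyses of the Table 1 data can be checked with a short calculation. The sketch below is an illustration only and is not taken from the article or the original study: it uses the Python standard library, the function names are hypothetical, the two-sample test is assumed to be the equal-variance (pooled) version, and the p-values are obtained by numerically integrating the t density rather than from a statistics package.

```python
import math
from statistics import mean, stdev, variance

# Half-life values transcribed from Table 1 (during and after the infection).
during = [279.6, 352.8, 204.8, 558.5, 686.1, 436.8]
after = [125.5, 82.1, 167.1, 381.4, 303.2, 439.8]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the t density over [0, |t|]."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (h / 3) * total  # probability that T lies in (-|t|, |t|)
    return 1 - central


def paired_t(x, y):
    """Paired t statistic: each subject serves as his or her own control."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return mean(d) / (stdev(d) / math.sqrt(n)), n - 1


def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic (ignores the pairing)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny)), nx + ny - 2


t_paired, df_paired = paired_t(during, after)
t_pooled, df_pooled = two_sample_t(during, after)
p_paired = two_tailed_p(t_paired, df_paired)
p_pooled = two_tailed_p(t_pooled, df_pooled)

print(f"paired:     t = {t_paired:.2f}, df = {df_paired}, two-tailed p = {p_paired:.3f}")
print(f"two-sample: t = {t_pooled:.2f}, df = {df_pooled}, two-tailed p = {p_pooled:.3f}")
```

Both tests use the same mean difference; the paired version divides it by a smaller standard error because between-child variation cancels when each child is compared with himself or herself, which is why the p-value drops.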

Questions can also be raised about the second reason the court gave for excluding the competitor’s advertisement.58 It appears to conflict with a classic opinion written by Judge L. Hand who observed that an entire industry could be negligent.59 As firms in fairly concentrated industries have been charged with anti-competitive acts60 it is quite possible that an entire industry would agree to limit the cost of a product.

Postscript. In a recent case in Oregon, Ricci v. Key Pharmaceuticals, Inc.,61 the firm was found liable for the injuries resulting from a child taking TheoDur, the time-released version of theophylline. The first appellate opinion noted that before the 1970s the drug was a difficult one to use because substantial monitoring and adjustment of dosage were required.62 After the time-released version was marketed, in 1987 and in 1989 the FDA informed Key that it was making false claims of superiority of its drug compared with competitors. The court found that between 1987, when the company learned of an interaction between theophylline and cipro, and the plaintiff’s injury in October 1990, ‘Key took no steps to inform physicians or patients of this toxicity problem’.63 The second appellate opinion, upholding the jury’s award of punitive damages, described these facts as ‘clear and convincing’ evidence that Key withheld or misrepresented information concerning toxicity problems with its product.

follow a normal distribution as the original authors used the two-sample t-test that makes a similar assumption about the measurements at both times.

57 See supra note 34 and accompanying text.

58 See supra note 39 and accompanying text.

59 The T. J. Hooper, 60 F. 2d 737, 740 (1932) affirming 53 F. 2d 107 (1931). See WILLIAM M. LANDES & RICHARD A. POSNER, THE ECONOMIC STRUCTURE OF TORT LAW 133–136 (1987) for a careful discussion of the case and related ones. In particular, the lower court’s decision indicated that it was the industry custom for tugboats to have radios. They observe, at 132, that when consumers have adequate information about the safety characteristics of products, an optimal allocation of resources to safety versus other activities should be achieved by negotiations so industry custom should be optimal. For the cases we discuss, information about drugs and their side effects is much more costly for customers to obtain than for producers.

60 In Todd v. Exxon Corp., 275 F. 3d 191 (2d Cir. 2001) plaintiffs alleged that 14 oil and petrochemical firms accounting for 80 to 90 percent of the industry exchanged salary data concerning managerial, professional and technical positions in order to limit the employees’ pay. The court said that a trial to determine whether this exchange of information violated the anti-trust provisions of the Sherman Act should be held.

61 974 P. 2d 758 (1999). The appellate opinion is A86556 (Nov. 14, 2001).

62 Id. at 761.

63 Id. at 761.


4. Two cases concerning birth defects and spermicide use

The two cases, Wells v. Ortho Pharmaceutical Corp.64 and the subsequent case Smith v. Ortho,65 illustrate how early studies can indicate a legal need for a warning but subsequent studies may indicate it is not needed. As noted previously, the first case has been severely criticized in the literature.66 First, the evidence presented and legal decisions in both Wells and Smith will be summarized. Then the major criticisms of the Wells decision will be described. Finally, the scientific evidence and the criticisms will be re-examined. In discussing the various studies, the conventional 0.05 level will be adopted to determine statistical significance; however, many studies are small and do not have sufficient power (probability of detecting a relative risk of 1.5 or more). Thus, the fact that a study is not significant means only that it does not provide strong evidence of an effect, not that it demonstrates the absence of one. The ultimate inference should be based on a summary estimate of the relative risk utilizing the risks found in all relevant studies.
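One common way to form a summary estimate of the relative risk from several studies is fixed-effect, inverse-variance pooling on the logarithmic scale. The sketch below illustrates that general technique under stated assumptions; it is not necessarily the method used elsewhere in this article. The function name is hypothetical, each study is assumed to report a relative risk with a 95% confidence interval, and the three input triples are invented solely to show the call. They are not taken from any study discussed here.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% interval


def pooled_relative_risk(estimates):
    """Fixed-effect (inverse-variance) summary of relative risks.

    estimates: iterable of (relative_risk, lower95, upper95) tuples.
    Returns (pooled_rr, lower95, upper95).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for rr, lo, hi in estimates:
        # Recover the standard error of log(RR) from the width of the interval.
        se = (math.log(hi) - math.log(lo)) / (2 * Z95)
        w = 1.0 / se ** 2  # precise (large) studies receive more weight
        weighted_sum += w * math.log(rr)
        weight_total += w
    log_pooled = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    return (
        math.exp(log_pooled),
        math.exp(log_pooled - Z95 * se_pooled),
        math.exp(log_pooled + Z95 * se_pooled),
    )


# Hypothetical inputs for illustration only (NOT data from the cited studies).
hypothetical_studies = [(2.0, 0.8, 5.0), (1.0, 0.7, 1.43), (1.2, 0.6, 2.4)]
summary = pooled_relative_risk(hypothetical_studies)
print("pooled RR = %.2f (95%% CI %.2f to %.2f)" % summary)
```

Because the weights are inversely proportional to the variance of each log relative risk, a small study with a wide interval moves the summary far less than a large one, which is the sense in which a single small significant or non-significant result should not settle the question.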

4.1 Wells v. Ortho Pharmaceutical Corp.

The mother of ‘baby Wells’ began using a spermicide in July 1980 and the baby was born on July 1, 1981 with several birth defects including several involving her limbs. She later developed an optic nerve defect.67 The court observed that plaintiff could not recover if there was only a ‘bare possibility’ that the spermicide caused the birth defects or other theories of causation were equally plausible. The opinion states that the plaintiff’s burden was not to produce an unassailable scientific study on the issue but to show from all the evidence that to a ‘reasonable degree of medical certainty’ the spermicide caused some of the baby’s defects.

4.1.1 Plaintiff’s evidence. First, the baby’s parents testified that they used the spermicide and that the product was used for one or two weeks after the mother’s last menstrual period. Then various drugs used by either parent were explored.

Plaintiff’s first expert, Dr Buehler, was a specialist in biochemical teratology and the director of the Center of Human Genetics at the University of Nebraska. After examining the baby, he concluded that the limb defect was caused by a vascular disruption that interrupted the blood supply to the developing limb.68 He testified that the defect in the right hand was also likely to be due to a vascular disruption but it was inconsistent with

64 615 F. Supp. 262 (N.D. Ga. 1985) affirmed 788 F. 2d 741 (11th Cir. 1986).

65 770 F. Supp. 1561 (N.D. Ga. 1991).

66 See supra note 6 and accompanying text. See also SHEILA JASANOFF, SCIENCE AT THE BAR 114 (1995) (stating that cases ‘like Wells add compelling specificity to charges that a know-nothing legal system lets plaintiffs walk away with multimillion-dollar awards although their causal arguments are grossly inadequate by scientific standards’) and Bert Black, A Unified Theory of Scientific Evidence, 56 FORDHAM L. REV. 595, 672 (1988) (stating that by ignoring science, valid evidence favourable to the defendant was rejected). Other critical comments appear in: PETER W. HUBER, GALILEO’S REVENGE 174 (1991), NEIL VIDMAR, MEDICAL MALPRACTICE AND THE AMERICAN JURY 173 (1995), James L. Mills, Spermicides and Birth Defects, in PHANTOM RISK 87–99 (Kenneth R. Foster, David E. Bernstein and Peter W. Huber eds., 1993) and Marc S. Klein, After Daubert: Going Forward With Lessons From the Past, 15 CARDOZO L. REV. 2219 (1994).

67 Wells, 615 F. Supp. 262 (the facts about the plaintiff are given on 266–269).

68 Id. at 270–273 (for Dr Buehler’s testimony).


amniotic banding. He then ruled out other possible causes such as diabetes or other drugs used by the parents.

Then he discussed the studies. In 1976 a study supported by the NIH indicated an increased risk of serious birth anomalies for whites using a spermicide after their last menstrual period (LMP) but this pattern was not seen in the data for blacks.69 In 1977 a case-control study in Canada indicated that use of spermicide or tranquillizers as well as a threatened abortion, e.g. vaginal bleeding and pain, during pregnancy increased the risk of limb defects.70 The data showed a threefold increased risk of limb defects in offspring of spermicide users.71 Dr Buehler noted that while this article did not draw a strong conclusion its data suggested an association between spermicides and limb deficiencies.

When he saw the 1981 article by Jick,72 he started to recommend that pregnant women be screened for fetal limb defects. He also cited a 1982 animal study73 that showed the chemical in the spermicide caused an increased number of fetuses to be reabsorbed in rats and mice indicating that the spermicide could reach the fetus and harm it. The expert also referred to a study of miscarriages74 but a major fact he relied on was the mother’s use of the drug after conception through the time the limb buds are formed.75

Plaintiff’s second expert, Dr Sutherland, described how the type of spermicide used worked and stated that it can cause vascular disruption even though he could not identify the precise mechanism by which this results in a birth defect.76 He cited animal studies indicating that the mother’s body can absorb the chemical and noted that some clinical reports had suggested an association. Addressing the failure to warn issue, Dr Sutherland stated that after the 1976 Oechsli report and 1977 Smith et al. study, the defendant should have warned users about the risk of birth defects.

Two other experts testified that after the two studies indicating a risk of birth defects appeared in the late 1970s, the defendant should have had a warning in place by July 1980.77 One of them emphasized the importance of making sure that the patient is properly informed. A third doctor, a Director of drug information at a hospital, also testified on the failure to warn issue. He testified that the defendant did not inform him about the two studies indicating a risk. During his cross-examination, he was asked whether a 1983 FDA panel report, which concluded that the product did not require an

69 Frank W. Oechsli, Studies of the Consequences of Contraceptive Failure (Apr. 8, 1976) (unpublished study). It is cited in Wells, 615 F. Supp. at 271 n. 9. The report concludes, on page 10, that the study’s findings raise a serious suspicion that there may be deleterious effects from jelly-foam-suppository use and recommends that further research be undertaken.

70 E. S. O. Smith, Charlotte S. Dafoe, James R. Miller and Philip Banister, An Epidemiological Study of Congenital Reduction Deformities of the Limbs, 31 BRITISH JOURNAL OF PREVENTIVE AND SOCIAL MEDICINE 39–41 (1977).

71 Id. at 41. The data are also presented in GASTWIRTH, supra note 17 at 864 and are discussed in Section 4.4.

72 Hershel Jick et al., Vaginal Spermicides and Congenital Disorders, 245 JAMA 1329–32 (1981).

73 Buttar, Assessment of the Embryotoxic and Teratogenic Potential of Nonoxynol-9 in Rats Upon Vaginal Administration, 2 THE TOXICOLOGIST 39 (1982).

74 S. Harlap, P. H. Shiono and S. Ramcharan, Spontaneous Fetal Losses in Women Using Different Contraceptives Around the Time of Conception, 9 INTERNATIONAL JOURNAL OF EPIDEMIOLOGY 49–56 (1980) (finding, at 54, a non-significant increased risk (R = 1.16) of fetal loss for spermicide users past their LMP). The study did not examine birth defects.

75 615 F. Supp. at 293.

76 Id. at 273–74.

77 Id. at 276–78 (for the testimony of these experts).


additional warning, would affect his opinion. It did not, because other interests, including the pharmaceutical industry, can influence such panels.

4.1.2 Defendant’s evidence. The firm’s Vice President for Technical Affairs served on an FDA advisory panel in the mid-1970s, and told the court how it worked. He stated that the Committee sent its final report, concluding that the spermicide in question was safe, effective and properly labelled, to the FDA in 1978.78 He suggested that the Oechsli and Smith studies were ‘substantially flawed’ and a warning was not justified. A second witness was a pharmacist who worked with the panel when he was employed at the FDA. He testified that there was little information available about the safety of the ingredients in vaginal products but based on the lack of ‘significant’ adverse reports, the panel reached the conclusion that the products were safe.

The defendant presented three experts who testified on the epidemiological evidence. The first was Dr Stolley, then co-director of the Clinical Epidemiology unit at the University of Pennsylvania. He stated that while the Oechsli study raised a small suspicion about the possibility of an association, its failure to specify the active ingredient in the spermicide used by the patients and the fact that the relative risk was not large made reliance on it questionable.79 Professor Stolley referred to an unpublished study of Harlap that failed to replicate Oechsli’s findings. He also asserted that the association between limb defects and spermicide use found in the Smith study was ‘not striking’ and could have occurred by chance. He also noted that when the study was conducted some spermicides could have contained a different chemical and that his conclusion from Smith was ‘that spermicides are not related to limb reduction defects’.

Professor Stolley believed that the first published study to show an association between spermicide use and limb defects was the 1981 Jick study. He questioned the way the study determined a woman’s exposure as it used a rather long time frame (having a prescription for the drug filled within 600 days of the birth of the baby). Dr Stolley noted that the Jick study had found increased risks in several different types of birth defects, which was ‘biologically implausible’. He observed that if two of the three ‘users’ whose children had limb defects had not truly been exposed the study would have shown no association. Stating that there was no need to warn of any risk of birth defects in 1980, he referred to studies that did not find an association. In particular, he relied on the 1982 Shapiro80 study to support his statement that the spermicides clearly did not cause birth defects.

In his comments on the Jick study, Professor Stolley referred to the testimony of Dr Watkins, a co-author of the article who testified for the defendant. Dr Watkins explained that a woman was considered exposed if she had received a prescription for spermicide within 11 months of conception and that this was too broad.81 Thus, some mothers, classified as spermicide users, might not have been exposed. He testified that the study population was too small and that too many different kinds of birth defects were combined. He indicated that he had expressed some of his concerns and subsequently was asked not

78 Id. at 279–82. He opined that some patients in one of the studies might have been exposed to a different spermicide, which did have an unsafe ingredient.

79 Id. at 284–86 (for the testimony of Dr Stolley).

80 Samuel Shapiro et al., Birth Defects and Vaginal Spermicides, 247 JAMA 2381–2384 (1982).

81 615 F. Supp. at 281–82 (for the testimony of Dr Watkins).


to participate further in the study. After the study was published, at the request of one of the defendant’s employees, he reviewed some of the records. He concluded that some mothers classified as spermicide users were not. In the opinion no information is given about whether the babies of these women were normal or had birth defects. The records he reviewed, however, were not made available to the court. Prior to this review Dr Watkins indicated that he knew of only one mother classified as a spermicide user who actually planned her pregnancy.

The defendant also had several of its employees take the stand. They questioned the animal studies indicating that the spermicide’s ingredients could be absorbed by the mother and expose the fetus. They cited their own studies showing the non-ionic surfactants are not absorbed.82

The primary witness for the defence was Dr Robert Brent,83 a well-known Professorof Medicine who has published extensively in teratology. He stated that while most of thecauses of birth defects were unknown, about 20–25% of them are genetic in origin whileabout 10% are due to environmental factors. He described the development of limbs andhands and said that defects in them are not caused by exposure prior to the 24th day whenlimb development begins. He also said that damage to a developing hand would not occurbefore the 35th day. He also said that if on the 26th or 27th day a mother stopped using anagent capable of producing limb defects it would be ‘unlikely’ that the agent would haveproduced the type of defect ‘baby Wells’ had.

He took issue with Dr Buehler’s conclusion that the cause of the defects was a vascularproblem. Dr Brent thought they were caused by amniotic band syndrome. He observedthat the causes of this syndrome were not known. He stated that teratologists examine theconsistency of epidemiologic studies, animal studies and basic science concepts to assesswhether a chemical can cause birth defects. He said that the only study he knew of thatshowed a statistically significant association between spermicide use and birth defects wasthat of Jick. He noted that it was not surprising for an occasional study to suggest anassociation and that one should consider the ‘mass of data’, i.e. all studies.

He reviewed medical records and photographs of the child and testified that none of her malformations were caused by the spermicide. He questioned the suggestion made by plaintiff’s experts that a sperm damaged by spermicide might fertilize an egg. In his view the animal studies showed that very little, if any, of the product could be absorbed and reach the fetus. Moreover, he suggested that even if some of the chemical was absorbed, the concentration would be so low that no teratogenic or embryotoxic effects would occur.

In his direct testimony Dr Brent mentioned that he was first contacted by the defence counsel in June 1983. After he learned that the FDA panel he was serving on as a voting member would discuss the issue of spermicides in December 1983, the defence counsel advised him that he would not participate in the case until after the hearing. In March 1984 he was formally retained by the defendant. Dr Brent testified that at the 1983 meeting, the panel heard presentations by Drs Jick, Harlap and Shapiro and discussed the Oechsli study. They concluded that the data did not demonstrate an increased risk and that a warning label was not needed.

82 615 F. Supp. at 283 (for the testimony of Mr. Kirsch and at 2887–8 for that of Dr Malyk).

83 615 F. Supp. at 288–291 (for the testimony of Dr Brent).

4.1.3 The trial court opinion. The court found that the plaintiffs had not shown that some of the baby’s defects were caused by the spermicide but did find that the defects to her left arm and shoulder and right hand were caused by spermicide.84 It also found that the defendant had a duty to warn of a risk of birth defects from use of the drug.

Judge Shoob found that the two doctors who had examined the baby themselves were more credible than the defendant’s experts. In particular, he noted that Dr Buehler relied on the fact that the mother had continued to use the spermicide after conception and through the time the limb buds develop.85 He contrasted the testimony of Dr Brent, who had ruled out the possibility the defects could have occurred in the ‘all or none’ period of the first 17 days after conception and who said that exposure before the 24th day would not produce limb defects, with that of Dr Buehler. In particular, the mother had used the product for more than 17 days after conception and likely for more than 24 days.86 Thus, the timing was consistent with Dr Buehler’s testimony. The opinion noted that Dr Buehler had given a detailed explanation of how he ruled out other possible causes, indicating that his opinion was a result of careful medical reasoning.87

The two main experts also disagreed on the cause of the limb defects. Dr Brent had suggested the amniotic band syndrome, where a band of material wraps around and eventually amputates a body part. Dr Buehler disagreed because the baby’s fingers did not show the amputation that would be expected. Apparently Dr Brent did not address this criticism of his proposed cause.88 The court noted that while the defendant is not obligated to prove the actual cause, as their main witness had proposed a theory the judge could consider criticisms of it in assessing the weight to be given to his testimony.

The opinion noted that the experts disagreed as to the animal studies on absorption of the product. The judge found that the plaintiff’s experts were more competent, credible and free of bias than those presented by the defendant.

In discussing the epidemiologic studies, Judge Shoob deemed them inconclusive on the issue of whether the spermicide caused the baby’s limb defects. He noted that the studies themselves stated that they did not rule out the possibility of a relationship between spermicide use and a particular birth defect. In describing the cross-examination of Dr Stolley who relied on the 1982 Shapiro study, the opinion refers to Dr Shapiro’s statement to the December 1983 committee.89 There he said that the studies done so far lack the power to evaluate whether there was an appreciable risk of specific defects and that his study did not rule out an increase in limb defects.

The trial judge’s negative assessment of the credibility of defendant’s experts was due in part to their overstating the implications and meaning of studies that did not show an increased risk90 or their lack of candour. Dr Brent did not inform the government panel that met in December 1983 that he had been contacted by Ortho. More importantly, he first said he was retained in March 1984 but changed it to February 10, 1984. The defendant,

84 Id. at 292.

85 Id. at 293.

86 Id. at 293.

87 Id. at 273. The judge also noted that the opinion the expert gave was the same as given in his deposition, taken one hour after he examined the baby.

88 Id. at 293.

89 Id. at 285.

90 Id. at 286 (indicating that Dr Stolley placed greater confidence in the Shapiro study than the author did).

however, had filed interrogatory responses on February 9th indicating that he would be a witness.91 Clearly, he had been retained before February 10th. Finally, the court thought that Dr Brent’s involvement in many cases concerning defects and his publishing an article alleging that over 50% of plaintiff’s experts distort the truth in contrast to only 10% of defence experts, indicated further bias. The judge also questioned the criticisms raised by Dr Watkins, the co-author of the 1981 Jick study. Although the article was published four years before the trial, this was the first time he repudiated its findings and the judge felt his testimony was not credible.92 In 1986, after the trial, he did publish a letter to the journal saying that the study was flawed and should not have been published.93

The court also found that at the time the mother obtained the product in 1980, the defendant was negligent for failing to warn of an increased risk of birth defects. The opinion cited the 1975 report,94 the 1977 Smith article and the 1976 Oechsli study. The defence suggested that the Oechsli study was not readily available. Two of plaintiffs’ experts indicated that they would have had access to it, and the 1980 study by Harlap95 cited it.

4.1.4 The appellate decision. The appeals court reviewed the case under the ‘clearly erroneous’ standard, i.e. if the district court’s account of the evidence is plausible in light of the entire record, the decision should be upheld even if the appeals court believes it would have reached a different decision.

Ortho complained that due to the inconclusive nature of the scientific evidence, the plaintiffs had not carried their burden of proof.96 Specifically, Ortho asserted that epidemiological studies should be the essential data to determine causation.97 The firm also challenged the credibility determinations of the trial judge.

The decision noted that the district court was right to focus on the birth defects of the baby and that the plaintiff did not need to produce studies showing a statistically significant association between spermicide use and malformations in a large population.98

It recalled the distinction between legal sufficiency and scientific certainty and observed

91 Id. at 290.

92 Id. at 282.

93 Richard N. Watkins, Vaginal Spermicides and Congenital Disorders: The validity of a Study, 256 JAMA 3095 (December 12, 1986). The letter said that during discovery in a lawsuit he examined the exposure status of eight women who had children with defects. He indicated that four of the eight actually planned pregnancies. See also the accompanying letter of Lewis N. Holmes (stating that they did not have enough information to publish the article and that no subsequent study has shown an unequivocal link between spermicides and birth defects) and the reply by Drs Hershel Jick and Kenneth Rothman (claiming that as of 1986 the relationship between spermicide use and birth defects is still open but the balance of the literature supports an effect on chromosomal alterations). In addition to questioning the method Dr Watkins used in his reclassification those authors also noted that he only examined the exposure status of mothers of children with defects but did not examine any records of mothers of normal children.

94 Vaginal Contraceptives: A Time for Reappraisal? H3 POPULATION REPORTS (January 1975). Noting, at H-42, that a spermicide might damage the genetic material in the sperm without killing it and lead to abnormalities after fertilization or could be absorbed by the mother and damage the fetus. Neither of these hypotheses had been investigated for nonoxynol-9, the chemical used in the spermicide.

95 Supra note 74.

96 Wells v. Ortho Pharmaceutical Corp. 788 F. 2d 741 (11th Cir. 1986).

97 Id. at 744.

98 Id. at 745.

that if the factfinder is convinced that plaintiffs have proven to a reasonable degree of medical certainty that the limb defects of the baby were due to the spermicide, it does not matter that the medical community might require further research before resolving the question.

Concerning the failure to warn issue, the court interpreted Georgia law as being consistent with the ‘informed consent’ rationale. It cited cases indicating that if a manufacturer has actual or constructive knowledge of potential dangers of a product, the manufacturer must warn purchasers at the time of sale.99 The opinion noted that three articles prior to 1980 indicated a risk100 and stated that the court had considered the FDA reports but was not bound to follow them. Prior cases stated that while compliance with regulatory standards may be admissible on the issue of care, they do not require a jury to find the defendant’s conduct reasonable.101

Before turning to the criticisms of the decisions in Wells, it will be useful to review a subsequent case, concerning a different birth defect, where the court decided that the defect was not caused by spermicide use.

4.2 Smith v. Ortho Pharmaceutical Corp.102

Although this case refers to the use of the spermicide during the same time period, it did not go to trial at the same time as Wells. In brief, the mother used a spermicide from July 1979 until shortly before she knew she was pregnant. The baby was born in January 1981 with serious birth defects resulting from trisomy-18, a chromosomal abnormality. Note that this birth defect differs from the defects of the limbs (left arm and shoulder and right hand) suffered by the child in Wells.103 Therefore, Judge Ward reviewed the studies focusing on whether spermicide use is related to an increased risk of trisomy-18, not limb defects. In addition, the mother was 38 years old at the time the baby was conceived while the Wells opinion noted that the mother was 32 years of age. As the risk of a birth defect increases with maternal age,104 this six-year differential in age also distinguishes the cases.

Judge Ward’s opinion noted that the plaintiff’s experts relied on four early studies and observed that none of them, including Oechsli’s (1976) study, showed an increased risk of trisomy-18, although one (Rothman, 1982) indicated an increased risk of trisomy-21 (Down’s syndrome).105 His discussion of the Jick (1981) study is illuminating. After reviewing some of its problems, especially the long (used spermicide within 600 days of delivery) window used to assess exposure,106 the opinion observes that the only case of trisomy-18 in that study occurred in the control group.107 Hence, assuming the study were

99 Id. at 746.

100 Id. at 746. These were the 1975 report, supra note 94, Smith et al., supra note 70 and Oechsli, supra note 69.

101 Id. at 746.

102 770 F. Supp. 1561 (N.D.Ga. 1991).

103 This was mentioned by a defence expert in the Wells case, 615 F. Supp. at 287, when he commented on the abstract summarizing the study of Warburton et al. that appeared in 1987, see D. Warburton et al., Lack of Association Between Spermicide Use and Trisomy. 317 NEW ENGL. J. MED. 478 (1987).

104 See Ernest B. Hook, Rates of Chromosomal Abnormalities at Different Maternal Ages, 58 OBSTET. & GYN. 282–85 (1981). While many studies did adjust for maternal age, this fact does distinguish the cases.

105 Kenneth J. Rothman, Spermicide Use and Down’s Syndrome, 72 JAMA 399–401 (1982).

106 770 F. Supp. at 1571.

107 Id. at 1571 and n. 54.

sound, it does not support the plaintiff’s case in Smith. Moreover, the studies after 1982 did not show any association between spermicide and trisomy-18. For example, Harlap et al.108 reported no cases of trisomy-18 among 1569 spermicide users, and the studies of Warburton et al.109 and Louik et al.110 also showed a lack of association between spermicide use and trisomy.

As the Smith opinion observed, there is no conflict between the two decisions. In the years between their resolution, the scientific community arrived at a consensus that spermicide does not cause the birth defect, trisomy-18, the baby had.

COMMENT 1 The two cases concerned different types of birth defects. Also, the number and quality of experts differed. In particular, one of the two plaintiff’s experts in Smith was not found credible in Wells. Six experts testified on behalf of the plaintiff in Wells, four on causation and two on the failure to warn issue.

COMMENT 2 A footnote111 in the Smith opinion reports several studies that did not find an association between spermicide and birth defects. Only two, Smith et al. (1977) and Harlap (1980), were considered in Wells. Recall that the second study focused on fetal loss not limb defects, while the first showed an elevated risk (R = 2.98) of limb reduction defects in spermicide users.

COMMENT 3 The Smith opinion credits an affidavit by Dr Mills citing his reanalysis of the Jick (1981) data that reclassified women who had planned pregnancies as non-users.112 He concluded that malformation rates in spermicide users were not statistically significantly higher than in non-exposed mothers. It is unclear whether the reanalysis focused only on all defects combined or included specific ones. Unfortunately, that reanalysis does not appear to have been published and almost surely was not submitted to the court in Wells.

4.3 The major criticisms of the Wells decision

The first criticism of the case, by Mills and Alexander, appeared in a medical journal and claimed that the overwhelming body of evidence indicates that spermicides are not teratogenic.113 They cited the two studies Mills co-authored, the studies of Cordero and Layde that showed no increased risk of limb defects, and the Shapiro et al. study of 462 users of a spermicide with the same active chemical as the one at issue.114 In addition to stating that ‘numerous other studies’ confirmed these findings, they observed that in spite of the defence experts mentioning the FDA decision that there was insufficient evidence to warrant a warning label, the trial judge was not convinced.

108 S. Harlap et al., Chromosomal Abnormalities in the Kaiser-Permanente Birth Defects Study, with Special Reference to Contraceptive Use Around the Time of Conception. 31 TERATOLOGY 381 (1985).

109 Supra note 103.

110 Carol Louik et al., Maternal Exposure to Spermicide in Relation to Certain Birth Defects, 317 NEW ENG. J. MED. 474–478 (1987).

111 770 F. Supp. at 1576 n. 52.

112 Id. at 1577 n. 55.

113 Mills and Alexander supra note 6.

114 See James L. Mills et al., Are Spermicides Teratogenic? 248 JAMA 2148–51 (1982), James L. Mills et al., Are There Adverse Effects of Periconceptional Spermicide Use? 43 FERTILITY AND STERILITY 442–446 (1985) and Jose Cordero and Peter M. Layde, Vaginal Spermicides, Chromosomal Abnormalities and Spermicide Use, 15 FAMILY PLANN. PERSP. 16–18 (1983). Shapiro’s study is at supra note 80.

The commentary quoted an editorial by Dr Brent calling spermicides a ‘litogen’, i.e. something that causes lawsuits but not malformations. The authors criticized the appellate decision where it stated that the plaintiffs did not need to produce studies showing a statistically significant association between the use of the product and malformations in a large population. Furthermore, they disagreed with the court’s view that a doctor can testify that in his opinion a cause and effect relationship exists without clear epidemiologic evidence.

These critics claimed that courts will allow uncontrolled ‘case series’ to be used to demonstrate causation and that post hoc ergo propter hoc reasoning is acceptable as ‘patient examination’. They describe the opinion as stating that legal decisions may be based on evidence unacceptable by today’s scientific standards, and assert that such opinions have driven safe products off the market.115 Finally, they urged the medical community to respond to the situation by providing expert witness testimony and teaching the public to distinguish between ‘litogens’ and teratogens.

In an interesting article on the problems with the common law’s use of expert testimony, Professor Gross uses the Wells case as an example of an ‘absolutely wrong’ decision, even though in many respects the opinion was well crafted from a judicial perspective.116 After noting how the expert testimony was focused on the contested issues in the case, Professor Gross emphasizes that Judge Shoob evaluated the testimony not just on its rationality and internal consistency but also on his assessment of the expert’s motives, biases and interests.117 Then he points to the portions of Judge Shoob’s opinion that indicated that the plaintiff’s experts appeared less biased and more credible.

Professor Gross cites the previous criticism’s assertion that the overwhelming body of evidence indicates that spermicides are not teratogenic as well as the decision of the FDA panel not to require a warning. Unlike other critics, Professor Gross does mention that the trial judge discounted the FDA decision in part because an expert for the defence had been a consultant to it.118

The article observes that none of the experts who persuaded Judge Shoob was an epidemiologist, the specialty dealing with the occurrence and causes of disease. Professor Gross describes their testimony as being based on ‘their own inexpert reading of the epidemiologic literature’ and their physical examination of the patient.119 In two footnotes, Professor Gross gives other arguments.120 First, the trial judge’s observation that the epidemiological data could not rule out a ‘small increase’ in risk cannot justify the opinion for the following reasons: (1) the fact that a risk can’t be ruled out does not meet the preponderance of the evidence criterion; (2) even if it did cause a ‘small increase’ in risk most of the birth defects in children of users would have occurred anyway, and (3) even if this small risk did in fact cause Katie Wells’s injury, the defendant shouldn’t be faulted for not knowing of this ‘unknown, unobserved and improbable danger’. The second footnote

115 They specifically refer to the drug Bendectin once used for morning sickness. See Green, supra note 8 for a comprehensive treatment of the cases and science involving that drug.

116 Gross, supra note 6 at 1122.

117 Id. at 1121.

118 Id. at 1124. The article does not describe the time sequence of the expert’s involvement with the case and the FDA panel. See supra note 91 and accompanying text.

119 Id. at 1124.

120 Id. at 1123, note 36 and at 1124, note 39.

asserts ‘there simply is no credible scientific evidence that Ortho-Gynol is a teratogen, and there is certainly no credible evidence of a danger or at the time it was used’.

The Wells case is also criticized in the influential book by Huber that coined the term ‘junk science’.121 He said that the plaintiff prevailed on the basis of the 1981 Jick study that ‘very tentatively suggested that spermicides might cause birth defects’. He then cites the 1986 letters of Dr Watkins and another co-author of the study that said the study should not have been published.122 The only other article mentioned is the one by Mills and Alexander.

The case is discussed briefly by several other commentators. Two well-known scholars questioned the appropriateness of Judge Shoob’s considering the demeanour of the experts and his effort ‘to ascertain the motives, biases, and interests that might have influenced each expert’s opinion’.123 In an article emphasizing the importance of ensuring that scientific evidence is valid and reliable, Black says that the decisions ignored valid scientific evidence favourable to the defendant. While he mentions that preliminary studies did suggest an association, he only cites the Mills and Alexander commentary, the post-trial letters of Watkins and a co-author as well as the FDA panel’s decision that a warning was not needed as of December 1983.

Mills wrote another review of the relationship between spermicides and birth defects where he updates the 1986 critique of the decision.124 In addition to citing the two 1986 letters of Drs Watkins and Holmes casting doubt on the 1981 Jick study, he cites the Warburton and Louik studies.125 The first did not find a relationship between spermicide use and trisomy-18 and he summarizes the second as concluding that the risks for the birth defects studied were not increased by exposure to spermicide. The editors also criticize the Wells decisions citing the Smith v. Ortho decision, the letters of the co-authors and the Mills and Alexander 1986 commentary.126

In 1993, the Supreme Court issued its opinion in Daubert,127 which gave trial judges the responsibility for assessing expert scientific testimony for its reliability. The opinion provided several criteria lower courts may use in this determination. The major ones are whether the methodology used has been subject to peer review and publication, whether its application to the facts in the case is appropriate and considers the known or potential error rates of the technique and whether the approach adopted is accepted in the field. The opinion indicated that these were guidelines rather than a checklist. Other courts have expressed concern about opinions developed expressly for litigation and may examine whether an expert’s testimony is as intellectually sound as their normal professional

121 See supra note 66 at 174.

122 See supra note 93.

123 See supra note 66. Vidmar, at 173, quotes part of the opinion cited by Gross, supra note 6 at 1121, while Jasanoff, at 54, states that the opinion rested largely on the trial judge’s assessments of the believability of the expert testimony. Black’s observations are at 673 of his article.

124 See supra note 66 at 87–99.

125 See Warburton, supra note 103 and Louik, supra note 110.

126 See supra note 66 at 28–29 and 137–138.

127 Daubert v. Merrell-Dow Pharmaceuticals Inc., 509 U.S. 579 (1993).

research.128 The next critic of Wells utilizes the Daubert criteria for the evaluation of expert testimony.

Klein, a lawyer for the defendant on appeal, describes the case as an example of ‘junk science’.129 He asserts that the judge disregarded the clear consensus of the scientific community because the studies could not rule out all possibility that spermicides cause birth defects.130 He then notes that the FDA rejected the Wells finding in a ‘Talk Paper’ and cites the Mills and Alexander commentary and a report about the letters of Watkins and another co-author about their disagreement with the 1981 Jick study. He advocates strict criteria for admitting an expert’s testimony, requiring a ‘tight fit’ between the expert’s background and the issue in controversy.131 In particular, the expert should be prepared to address all of the scientific literature relevant to the case. Then he asserts that the plaintiff’s ‘primary expert’ relied heavily on epidemiological data and proceeds to point to parts of the transcript where the expert did not have command of all the studies and failed to use statistical significance in his interpretations.132

Klein asserts that the opinions ignored accepted scientific methodology and cites the defence experts who said the later studies failed to replicate the earlier ones.133 He notes that the Jick study itself stated that the results should be considered tentative until confirmed by other data and that it was never confirmed.134 Thus, he argues plaintiff’s expert should not have placed much confidence in its findings. He observes that courts usually require evidence of ‘general causation’, typically studies showing that a substance is capable of causing harm in a large population before finding specific causation.135

In sum, the criticisms of the Wells decision can be classified into two categories. First, the judge over-emphasized the demeanour, presentation and conflicts of interest of the experts rather than focusing on the scientific evidence. Secondly, the critics would have required more scientific evidence, e.g. other studies comparing exposed mothers to unexposed ones, before finding that the drug caused the baby’s limb defects.

4.4 Re-examining the evidence and the criticisms

It is interesting that most of the studies cited by the critics appeared after the mother took the drug, were not mentioned in the opinion and presumably were not submitted to the court. The 1977 Canadian peer-reviewed study that is cited in the opinion, which yielded an estimated relative risk of about 3.0 for limb defects in spermicide users, was neither cited nor carefully discussed by most critics. The first part of this section will review this study and its implications for the interpretation of the subsequent ones. Then the appropriateness of the various criticisms in light of the evidence available to the judge will be examined.

128 See Judge Kozinski’s opinion in the remand of Daubert v. Merrell-Dow Pharm. 43 F. 3d 1311 (9th Cir. 1995) and Judge Posner’s opinion in Braun v. Ciba-Geigy Corp. 78 F. 3d 316 (7th Cir. 1996).

129 See supra note 66 at 2221.

130 Id. at 2223, citing 615 F. Supp. at 292.

131 Id. at 2225.

132 Id. at 2225–26.

133 Id. at 2231 and note 68 (citing Dr Stolley’s testimony that the later studies were more powerful and yet could not replicate the results of Oechsli and Jick).

134 Id. at 2226 note 37.

135 Id. at 2228, citing Richardson v. Richardson-Merrell, 649 F. Supp. 799, 803 (D.D.C. 1986) aff’d, 857 F. 2d (D.C. Cir. 1988).

It is important to distinguish the issue of whether the opinion was reasonable given the information available at trial from a scientific review of a larger body of evidence.

4.4.1 The Canadian study and its implications. The Canadian study compared 93 children with limb defects with 93 normal controls and 93 controls who had other defects. The rationale for using two control groups is that mothers of children with problems often recall more exposures.136 This ‘recall bias’ can create an apparent association even when the proportions of spermicide users in both groups are the same. Eleven of the 93 cases used spermicides while 4 of the 93 controls in each of the groups were users.137 Notice that comparing the cases to either control group gives the same odds-ratio of 2.98.
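The arithmetic behind the odds-ratio of 2.98 can be checked with a short sketch. It uses only the counts quoted above (11 users among 93 cases, 4 users among each set of 93 controls); the function name `odds_ratio` is illustrative and does not come from the study.

```python
def odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    a = exposed_cases                      # cases whose mothers used spermicide
    b = total_cases - exposed_cases        # cases whose mothers did not
    c = exposed_controls                   # controls whose mothers used spermicide
    d = total_controls - exposed_controls  # controls whose mothers did not
    return (a * d) / (b * c)


# Each control group taken separately (the counts are the same in both groups).
or_normal_controls = odds_ratio(11, 93, 4, 93)
or_malformed_controls = odds_ratio(11, 93, 4, 93)

# Both control groups pooled: 8 users among 186 controls.
or_pooled = odds_ratio(11, 93, 8, 186)

print(round(or_normal_controls, 2), round(or_malformed_controls, 2), round(or_pooled, 2))
```

Because the two control groups contain the same number of users, pooling them leaves the estimated odds-ratio unchanged; only the sample size, and hence the precision, changes.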

The fact that the odds-ratios were identical in both comparisons suggests that recall bias did not occur in the 1977 study. Indeed, one of the critics, who co-authored a study that did not find an association of spermicide use with adverse effects, raised the issue of recall bias in questioning a previous finding of an association of Down’s syndrome with spermicide use.138

To perform a test of significance one should combine both control groups,139 which yields a statistically significant difference (p-value 0.024). The confidence interval for the estimated odds-ratio is (1.16, 7.70).140 While the estimated odds-ratio of nearly 3.0 appears to indicate a fairly strong association, the lower end of the confidence interval is only 1.16. Thus, if spermicide users had a higher prevalence of one or more other risk factors, those factors could have caused the observed association between the drug and limb defects. A sensitivity analysis can illuminate this as it tells us what difference in the prevalence of another risk factor between the two groups is needed to explain the observed association between spermicide use and limb defects. The increased risk of limb defects in children of tranquillizer users was confirmed in a 1981 study.141 It found that taking tranquillizers in

136 See Martha M. Werler et al., Reporting Accuracy Among Mothers of Malformed and Non Malformed Infants, 128 AM. J. EPIDEM. 415 (1989).

137 See supra note 70 at 41 and GASTWIRTH, supra note 17, at 864.

138 See James L. Mills et al., Are There Adverse Effects of Periconceptional Spermicide Use? 43 FERTILITY AND STERILITY 442, 445 (1985) discussing Kenneth J. Rothman, supra note 105. In that study spermicide use was statistically significantly higher among cases when the exposure rate in mothers of cases was compared to that of normal controls but not when compared to that in mothers of malformed infants.

139 In statistical terms, the odds-ratios of both comparisons are homogeneous. Thus, the control groups are sufficiently similar to be considered as one. If the odds-ratios were different, then the control groups would differ in some way that is related to birth defects and would not appear to be from a common population.

140 This calculation used the large sample approximation to the distribution of the difference of two proportions as this was the standard statistical procedure readily available in the early 1980s. Using the exact sampling distribution of Fisher’s exact test in STATXACT-4 one obtains a p-value of 0.04, still significant. The original article, however, also reported the non-significant result that is obtained when one compares the exposure rate of the cases to each control group separately. This occurs because of the smaller sample size.

141 See Michael B. Bracken and Theodore R. Holford, Exposure to Prescribed Drugs in Pregnancy and Association with Congenital Malformations, 58 OBSTET. & GYN. 336–344 (1981). A non-significant but potentially increased risk (2.0) of limb reduction defects in users of tranquillizers was reported by Lisa Hill, M. Murphy, M. McDowall and A. H. Paul, Maternal Drug Histories and Congenital malformations: Limb Reduction Defects and Oral Clefts, 42 J. EPIDEM. and COMM. HEALTH 1–7 (1988). Since statistical significance depends on the sample size as well as the magnitude of the risk, the estimated relative risk of 2.0 for users in the first trimester of pregnancy is consistent with the estimated risk of 2.3 found by Bracken and Holford. These estimates are lower than the estimate 2.82 from the Canadian study but all three are consistent with ‘sampling variation’ about an overall figure in the neighbourhood of 2.0 to 3.0.


TABLE 2 Required prevalence of tranquillizer use among spermicide users to reduce the Canadian study to non-significance

Prevalence     Required prevalence in users    Required prevalence in users
in controls    if relative risk of             if relative risk of
               tranquillizers is 2.3           tranquillizers is 2.8
0.10           0.239                           0.205
0.20           0.355                           0.321
0.30           0.471                           0.437
0.40           0.587                           0.553
0.50           0.703                           0.669

the first trimester increased the risk of a birth defect by a factor of 2.3, with a 95% confidence interval (1.8, 7.2), while smokers who used tranquillizers had a relative risk of 3.7. These results are within the expected sampling error of the 1977 study. In the sensitivity analysis, tranquillizers will be assumed to increase the odds of a limb defect by a factor of 2.8, the estimate in the 1977 study, or 2.3, the estimate found in the 1981 one.142 Table 2 presents the prevalence of tranquillizer use amongst spermicide users that would be needed to reduce the observed association in the Canadian study to a non-statistically significant one. It is a function of the prevalence of tranquillizer use by the control mothers and the increased risk of limb reduction defects tranquillizers cause. Notice that if 10% of the controls used tranquillizers and the relative risk of tranquillizer use is 2.8, then 20.5% of spermicide users would have had to use tranquillizers to explain the association. If the odds-ratio associated with tranquillizer use equals 2.3, then nearly 24% of the spermicide users would need to have taken tranquillizers during their pregnancy. Similarly, if 30% of the controls used tranquillizers then 43.7 or 47.1% of spermicide users would need to have used them. While it is plausible that differences of this magnitude in the use of tranquillizers between users and non-users could exist, there was no evidence of such differential use of tranquillizers among spermicide users and non-users in the studies cited.
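The entries of Table 2 can be checked with a short calculation. The following is only a sketch: it applies the condition stated in note 142 and assumes that the association to be explained is the lower confidence limit of 1.16, an assumption that appears to reproduce the tabulated values. The function and variable names are hypothetical.

```python
def required_prevalence(f0, r_obs, r_u):
    """Prevalence f1 of an omitted factor U among the exposed that satisfies
    f1 = r_obs * f0 + (r_obs - 1) / (r_u - 1), the condition in note 142.

    f0    -- prevalence of U in the unexposed (control) group
    r_obs -- relative risk that U is asked to explain
    r_u   -- relative risk of the outcome associated with U itself
    """
    if r_u <= 1.0:
        raise ValueError("the omitted factor must raise the risk (r_u > 1)")
    return r_obs * f0 + (r_obs - 1.0) / (r_u - 1.0)


# Assumption: the part of the association to be explained is the lower
# end of the confidence interval for the Canadian study, 1.16.
R_TO_EXPLAIN = 1.16

for f0 in (0.10, 0.20, 0.30, 0.40, 0.50):
    row = [required_prevalence(f0, R_TO_EXPLAIN, r_u) for r_u in (2.3, 2.8)]
    print(f"{f0:.2f}  {row[0]:.3f}  {row[1]:.3f}")
```

Under these assumptions the first row comes out as 0.239 and 0.205, in agreement with the first row of Table 2.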

Given the findings of the 1977 study, it is surprising that none of the subsequent studies examined the effect of a mother’s joint use of a spermicide and tranquillizers. Therefore, a sensitivity analysis of the 1982 study by Mills et al.143 that did not find an increased risk of birth defects in spermicide users should be conducted. The data most relevant to the Wells case compared the rate of limb defects in children of mothers who used spermicide past their LMP to non-users. They found a non-significant relative risk of

142 The method, an extension of an earlier technique of Cornfield, is described by Joseph L. Gastwirth, Methods for Assessing the Sensitivity of Statistical Comparisons Used in Title VII Cases to Omitted Variables, 33 JURIMETRICS J. 19, 20–25 (1992). See also PAUL R. ROSENBAUM, OBSERVATIONAL STUDIES, 87–91 (1995) for related procedures and their application to real data. The basic result is that in order to explain a relative risk $R_O = p_1/p_0$, where $p_1$ and $p_0$ are the probabilities of an event (birth defect) in the exposed and unexposed groups, an unobserved variable, $U$, must cause an increased risk of the event $R_U$ of at least $R_O$. Moreover, the prevalences $f_1$ and $f_0$ of $U$ in the exposed and unexposed groups must satisfy $f_1 = R_O f_0 + (R_O - 1)/(R_U - 1)$. Thus, if one has some knowledge of the effect $R_U$ of the variable that was omitted and its prevalence in the control population one can assess whether its prevalence $f_1$ in the exposed group is likely to satisfy this condition.
143 Supra note 114. In fairness to these investigators their data were collected between 1974 and 1977, Id. at 2148, so they may have been unaware of the results of the Canadian study when their research was planned.


1.24, with a confidence interval of (0.89 to 1.72).144 Their analysis used logistic regression to adjust for the possible effects of maternal age, education, race, smoking, alcohol use and previous malformed infants.145 The authors also report that their study had high power (99%) to detect a doubling of the risk. The lower end, 0.89, of the confidence interval, however, is not far from 1.0. A standard calculation shows that if the estimated relative risk had been 1.39, rather than 1.24, the lower end of a 95% confidence interval would just exceed 1.0, implying a significantly increased risk, at the usual 0.05 level. Thus, it is worth ascertaining whether an increased prevalence of an observed risk factor in the controls relative to the spermicide users could have led to an underestimate of the relative risk by 0.15.
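The ‘standard calculation’ mentioned above can be sketched as follows. The sketch assumes that the reported interval is symmetric on the logarithmic scale and that the standard error would be unchanged if the point estimate were larger; the function name is hypothetical.

```python
import math


def rr_needed_for_significance(lower, upper):
    """Smallest relative risk whose confidence interval would just exclude 1.0.

    Assumes the reported interval (lower, upper) is symmetric on the log
    scale, so half its log-width equals z * SE, and that this half-width
    would stay the same for a larger point estimate.
    """
    half_width = 0.5 * (math.log(upper) - math.log(lower))
    # The lower limit exp(log(rr) - half_width) equals 1.0 when
    # log(rr) = half_width.
    return math.exp(half_width)


threshold = rr_needed_for_significance(0.89, 1.72)
print(f"relative risk needed for significance: about {threshold:.2f}")
```

With the interval (0.89, 1.72) this gives a threshold of about 1.39, the figure quoted in the text.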

The study reported that spermicide users were older, more educated, smoked less and drank less than women using other contraceptive methods.146 Smoking is another potential risk factor, with an estimated relative risk of about 1.3, which is still under investigation.147

Another study148 of the same era reported that 31.4% of spermicide users past their LMP smoked in contrast to only 21.4% of users who stopped before their LMP. The prevalence of smokers among non-users, however, was 39.6%. It is reasonable to assume that about 30% of the users past their LMP in the 1982 study smoked while 40% of the non-users did. Since users have a lower prevalence of smoking and if tranquillizer use is the same in both groups one would expect a lower fraction of spermicide users to smoke and use tranquillizers than non-users. This implies that an age-adjusted estimate of the relative risk of spermicide users to non-users would underestimate the risk of limb defects as the controls have a higher prevalence of another risk factor. The methodology used to assess whether an omitted factor could reduce a significant finding to a non-significant one is adaptable to the situation where the control group has a higher prevalence of the omitted risk factor.149 Table 3 presents the expected effect of an omitted variable U on the estimated relative risk in the 1982 study if its prevalence amongst users is less than that in the controls. The relative risk of U ranges from 1 to 4.

First, notice that even if smoking150 itself had a relative risk of 1.5, and even if 40% of the control mothers smoked in contrast to 30% of the users, the relative risk of spermicide exposure to limb defects would only increase to 1.294. A risk factor with a relative risk of 3.0, however, would raise the observed relative risk of 1.24 to 1.395, which would

144 Id. note 114 at 2150.
145 Id. at 2148.
146 Id. at 2149. All comparisons were highly statistically significant with p-values less than 0.0001.
147 See Andrew E. Czeizel, Imre Kodaj and Widukind Lenz, Smoking During Pregnancy and Congenital Limb Deficiency, 308 BR. MED. J. 1473 (1994) (finding a statistically significant odds ratio of 1.48) and Karen Kallen, Maternal Smoking During Pregnancy and Limb Reduction Malformations in Sweden, 87 AM. J. PUB. HEALTH, 29–32 (1997) (finding an odds ratio of 1.26 (95% confidence interval 1.06, 1.50) for smoking and limb defects). Both studies noted that further work is needed before a causal inference could be made. For a view questioning whether smoking is related to limb defects see Robert L. Brent, Book Review of CONGENITAL LIMB DEFICIENCIES IN HUNGARY by ANDREW E. CZEIZEL et al. 15 Genetic Epidem. 647–650 (1998). Dr Brent was an expert for the defence in Wells, see 615 F. Supp. at 289.
148 Anthony P. Polednak, Dwight T. Janerich and Donna M. Glebatis, Birth Weight and Birth Defects in Relation to Maternal Spermicide Use. 26 TERATOLOGY 27, 30 (1982).
149 This methodology is given in Binbing Yu and Joseph L. Gastwirth, The Use of the ‘Reverse Cornfield Inequality’ to Assess the Sensitivity of a Non-significant Association, 22 STAT. IN MED. 3383–3401 (2003).
150 This is done to illustrate the use of the table as the study did adjust for smoking.


TABLE 3 The true relative risk of spermicide use as a function of the strength of an omitted variable, U, and its prevalence in the control and exposed groups

                 Relative risk R of the omitted variable U
f0      f1       1.0     1.5     2.0     2.5     3.0     3.5     4.0
0.40    0.30     1.24    1.294   1.335   1.368   1.395   1.417   1.436
0.40    0.20     1.24    1.353   1.447   1.526   1.594   1.653   1.705
0.20    0.15     1.24    1.269   1.294   1.316   1.335   1.353   1.368
0.10    0.075    1.24    1.255   1.269   1.282   1.294   1.305   1.316

Notes: f0 and f1 are the prevalence of U in the control and user groups, and R, the relative risk of U, ranges from 1.0 (no effect) to 4.0.

have been statistically significant. Recall that smokers who used tranquillizers had an estimated relative risk of birth defects of 3.7. Thus, considering both risk factors, smoking and tranquillizers, if the smokers in the Mills et al. study had the same high incidence of tranquillizer use, it is plausible that the omission of tranquillizer use could lead to an underestimate by an amount sufficient to reduce a significant result to non-significance.
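The adjustment discussed above can be illustrated with a short sketch. The article does not print the formula behind Table 3 (the method is given in the paper cited in note 149), so the code assumes the usual external adjustment for a single binary omitted factor that acts multiplicatively and does not interact with spermicide use. This assumed formula appears to agree with the values quoted in the text; the function name is hypothetical.

```python
def adjusted_relative_risk(rr_obs, f0, f1, r_u):
    """Relative risk of the exposure after allowing for an omitted factor U.

    Assumed model (not stated in the article): U multiplies the risk by r_u
    in both groups, so the observed relative risk is the true one times
    (1 + f1 * (r_u - 1)) / (1 + f0 * (r_u - 1)).

    rr_obs -- observed relative risk (1.24 in the 1982 study)
    f0, f1 -- prevalence of U in the control and exposed groups
    r_u    -- relative risk associated with U
    """
    return rr_obs * (1.0 + f0 * (r_u - 1.0)) / (1.0 + f1 * (r_u - 1.0))


# Smoking example from the text: 40% of controls and 30% of users.
for r_u in (1.5, 3.0):
    print(r_u, round(adjusted_relative_risk(1.24, 0.40, 0.30, r_u), 3))
```

For relative risks of 1.5 and 3.0 the sketch returns 1.294 and 1.395, the two values discussed in the text. When f1 is smaller than f0 the adjusted figure exceeds the observed one, which is the direction of bias the paragraph describes.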

As other studies reported that spermicide users tended to drink less as well as smoke less than non-users, they probably were more health conscious than non-users. If this is correct, users probably had less exposure to other risk factors than the control group. Thus, relative risks derived from studies that did not control for these risk factors would underestimate the actual risk of spermicide use and birth defects.151 It is reasonable, therefore, to assume that a smaller fraction of spermicide users who smoked used tranquillizers too. In Table 4 we assume that between one-fourth and one-half of the smokers used tranquillizers in the control group but the probability that a spermicide user took tranquillizers was less than that of the control group. In most cases the imbalance in the prevalence of smoking and tranquillizer use could have created a sufficient underestimate of the risk of spermicide to mask a statistically significant result using the estimate of 3.7 for their joint risk.

The above analyses cannot demonstrate that there was a sufficient imbalance in tranquillizer use that would raise the relative risk to statistical significance. They do show that this 1982 study relied on by the defence is sensitive to a potential omitted factor, smoking and using tranquillizers, just as the 1977 study relied on by plaintiffs is sensitive to an omitted factor that is more frequent in the users. The only difference is that the control group in the study was known to have a significantly higher fraction of smokers,152 so it

151 This type of underestimate is analogous to the one arising in occupational epidemiology when the disease rate of employees exposed to the agent under study is compared to that of the age-adjusted general population. As employed individuals are healthier than non-employed, their disease rate is usually less than that of the general population. See JUDITH S. MAUSNER & SHIRA KRAMER, EPIDEMIOLOGY: AN INTRODUCTORY TEXT 319–20 (2d ed. 1985) (noting that mortality data for the general population include individuals too sick to work). See also Russellyn S. Carruth and Bernard D. Goldstein, Relative Risk greater than Two in Proof of Causation in Toxic Tort Litigation, 41 JURIMETRICS 195, 207–8 (2001) (noting that workers tend to have relative risks from mortality between 0.7 and 0.9 when compared to the general population). In the Wells case, the fact that the mother apparently did not have some risk factors would make her comparatively healthier than the average pregnant woman of the same age.
152 Supra note 114 Mills (1982) at 2149 and Mills (1985) at 443.


TABLE 4 The true relative risk of spermicide use as a function of the strength of an omitted variable and its prevalence in the two groups when a smaller percentage of exposed individuals have the omitted risk factor

                 Relative risk R of the omitted variable
f0      f1       1.0     1.5     2.0     2.5     3.0     3.5     4.0
0.20    0.125    1.24    1.284   1.323   1.357   1.389   1.417   1.443
0.20    0.10     1.24    1.299   1.353   1.402   1.447   1.488   1.526
0.15    0.10     1.24    1.270   1.296   1.321   1.343   1.364   1.383
0.10    0.05     1.24    1.270   1.299   1.327   1.353   1.378   1.402

Notes: f0 and f1 are the prevalence of the omitted variable in the control and user groups. The entries in the columns report the true risk of spermicide use for various values of the relative risk, R, of the omitted variable.

is likely that it contained a higher fraction of smokers and tranquillizer users than did the exposed group. Of course, an imbalance between users and controls with regard to another risk factor would also affect the estimated relative risk.

The sensitivity analyses presented in this section would have been more informative had more data on the joint prevalence of the various risk factors been available. Then one could have utilized their joint distribution to construct a probabilistic model to estimate the probability that an imbalance sufficient to change the ultimate inference existed.

There is another important statistical consequence in situations when another risk factor is more prevalent in the controls. Both the level of significance and the power to detect a pre-specified relative risk are changed. For example, assume the prevalence of an omitted risk factor in the controls is 0.20 but only 0.15 in the exposed, and the true relative risk of an agent is 1.0, i.e. it is not related to birth defects. If this omitted risk factor has a relative risk of 3.0, then a test with nominal level 0.05 actually has a Type I error of about 0.02. Thus, one is requiring a greater degree of statistical significance than the usual one adopted by most scientific journals. The corresponding power to detect relative risks of 2.0 (1.5) declines from 0.99 (0.71) to 0.946 (0.43). The power calculations assumed a 40% (30%) prevalence of the omitted variable in the users (non-users).153

A number of studies investigating the relationship between spermicide use and birth defects have been published since the trial. Table 5 summarizes the findings of studies in the trial record (*) and the newer ones with respect to limb defects, reporting the data for users during the first trimester and past the LMP when available.154 Whether or not the study incorporated data on tranquillizer use is also given. Most reviews have not found an increased risk for all defects, although the data for specific risks are less definitive.155

153 Full details of the effect of an omitted risk factor on both the significance level and power of a test appear in the article cited supra note 149.
154 The studies in Table 5 that have not previously been cited are: George Huggins et al., Vaginal Spermicides and Outcome of Pregnancy: Findings in a Large Cohort Study, 25 CONTRACEPTION 219–230 (1982), Michael Bracken and Kathy Vita, Frequency of Non-hormonal Contraception Around Conception and Association with Congenital Malformations in Offspring, 117 AM. J. EPIDEMIOLOGY 281–291 (1983) and Shai Linn et al., Lack of Association Between Contraceptive Usage and Congenital Malformations in Offspring, 147 AM. J. OBSTET. GYNECOL. 923–928 (1983).
155 Janice E. Manujuck, Relationship of Vaginal Spermicides and Birth Defects, 76 J. FLORIDA MED. ASSOC. 316–321 (1989) (concluding that current studies indicate use is not related to an increase in all defects but there


TABLE 5 A chronological summary of the epidemiological studies of the association of spermicide use and limb reduction defects

Author               Study type   Year   RR      95% C.I.          Controlled for tranquillizers in the analysis
Smith et al. *       MCC          1977   2.99    (1.16, 7.69)      No
Jick et al. *        CO           1981   15.3    (1.16, 148.2)     No
Polednak et al. *    MCC          1982   2.00    (0.5, 7.8)        No
Huggins et al.       CC           1982   4.47    (0.405, 49.37)    No
Shapiro et al. *+    CO           1982   0       NA                No
Mills et al. *+      CO           1982   1.24    (0.89, 1.72)      No
Cordero & Leyde+     CC           1983   1.00    (0.3, 3.3)        No
Bracken & Vita       CC           1983   2.18    (0.29, 16.57)     No
Linn et al.          CC           1983   0.85    (0.66, 1.09)      No
Louick et al.        CC           1987   1.1     (0.5, 2.1)        No
                                         1.7     (0.8, 3.6)

Notes: The type of study, cohort or case control, is indicated by CO or CC. Matching is indicated by M. The column 95% C.I. reports the 95% confidence interval for the relative risk. The estimate from Mills et al. (1982) refers to women using spermicides after the last menstrual period as did the mother of ‘baby Wells’. The data for the Linn et al. (1983) study refers to several major malformations in the study of Jick et al. (1981). No limb defects occurred in either the spermicide users or the controls. For all defects, Bracken and Vita (1983) did not find a significant incremental risk, OR = 1.26, with 95% CI (0.85, 1.85). The two entries from the Louick (1987) study refer to the odds-ratios for all limb reduction deficits and the subset of them due to unknown origin. The data for women exposed during the first trimester are given.

Looking at the odds-ratios in the fourth column of Table 5, it appears that the estimate from the 1981 study of Jick is unusually large, so summary measures should down-weight it. On the other hand, seven of the ten studies did estimate an odds-ratio greater than 1.0. Again, none of them specifically present an analysis incorporating data on tranquillizer use.156 Given the sensitivity analyses, Judge Shoob’s characterization of the five studies in evidence in

is concern it could be associated with specific malformations and future studies should control for confounding variables). This review did not specifically consider limb defects and the 1977 Smith study was not included. Michael B. Bracken, Spermicidal Contraceptives and Poor Reproductive Outcomes: The Epidemiologic Evidence Against an Association. 151 AM. J. OBSTET. GYNECOL. 552 (1985) concluded that further regulation of spermicides was not needed. He does cite the 1977 Smith et al. study but regarded it as not showing an increased risk. The meta-analysis conducted by Einarson et al., Maternal Spermicide Use and Adverse Reproductive Outcome: A Meta-Analysis, 162 AM. J. OBSTET GYNECOL. 655 (1990) only examined all teratogenic outcomes and did not consider individual defects. For a variety of reasons these authors excluded four of the studies in Table 5 so studies of limb defects may be under-weighted in their analysis. Also, they excluded studies that used mothers of children born with other defects as controls, e.g. the 1977 Smith et al. study supra note 70.
156 The study by Bracken and Vita, supra note 145 at 284 mentions that data on prescription drugs were obtained but does not describe an analysis that incorporates those drugs, Id. at 287. Similarly, Shapiro et al., supra note 80 at 2381 state that data on maternal illnesses and complications of pregnancy were collected but do not list the factors actually used in the risk estimates for the different defects, Id. at 2382.


Wells as equivocal may well be applicable to the entire set of studies concerned with limb reduction defects.

4.4.2 A review of the criticisms. Given the near uniformity and strength of the published criticism of the Wells decisions157 it is worthwhile looking at them in light of our examination of the studies focusing on those reported in the opinion. Many of the legal and public policy commentators appear to have relied on the Note by Mills and Alexander158 in a prestigious medical journal. They did not mention either the Oechsli or Smith et al. studies cited in the opinion159 but refer to studies that favoured the defendant, including two that apparently were not submitted to the court, the 1983 study by Cordero and Layde160 and the 1985 review by Bracken.161 They also cite the 1986 letters questioning the Jick study.162 Another study that did indicate a possible risk of limb defects, Huggins et al.,163 which was published in 1982, was also not cited by Mills and Alexander.

It is inappropriate to speculate on why studies in the record and in the literature that found spermicides were related to limb defects were not discussed by Mills and Alexander or most other critics. Since all the studies omitted tranquillizers, identified as a risk factor in the 1977 study and confirmed in 1981,164 it is questionable that they are scientifically more reliable than the early ones. Suffice it to say that the sensitivity analysis indicates that this omission could have affected the risk estimates in some of them by an amount that might alter their implications for the legal decision.

The omission of a substantial risk factor is important. The Supreme Court’s decision in the discrimination case Bazemore v. Friday165 noted that plaintiffs need not incorporate all measurable variables but should include the ‘major’ ones. Indeed, since Daubert, courts down-weight studies that omit a major variable and sometimes do not admit them.166 As an agent having a statistically increased risk that doubles the probability of an exposed individual suffering harm is a ‘major’ variable, its omission from the studies diminishes their evidentiary value. When the prevalence of such an omitted factor is higher in the non-exposed group, the statistical power of the study is diminished. In these circumstances,

157 Supra notes 6, 66 and 93.
158 Supra note 6.
159 615 F. Supp. 271 n. 9 and 272 n. 11.
160 Supra note 114.
161 Supra note 155.
162 Supra note 92.
163 Supra note 154 and Table 5.
164 Supra note 141 and accompanying text.
165 478 U.S. 385, 400 (1986).
166 See Smith v. Xerox 196 F. 3d 358 (2d Cir. 1999) (not admitting a study of terminations in a discrimination case for not considering performance evaluations). The decision noted that the plaintiff would not need to include them in their regression if they had seriously questioned the propriety of those evaluations, e.g. by showing they were inconsistent with previous ones. In Murphy v. General Electric 90 FEP Cases 1418, 1423 (S.D.N.Y. 2002) the court admitted a study of terminations that concluded that age was a factor in one year as the expert controlled for the education of the employee and their division. It did not admit a study of the next year that did not control for these variables. Similarly, the court rejected an economic expert’s market analysis used to estimate damages in Blue Dane Simmental, 178 F. 3d 1035, 1041 (8th Cir. 1999) because the expert neglected several variables that relate to the value of the breed.


studies claiming that they had high power (e.g. 90 or 95%) of detecting a risk of 1.5 or 2.0 are overstating their true power.

Gross and Klein claim that the Wells decisions had virtually no scientific support for their reasoning. First, consider the ‘failure to warn’ issue focusing on the available information prior to 1980. The appellate opinion cited one report and two studies that suggested that spermicides might cause birth defects.167 None of the subsequent population-based studies that did not find an increased risk were available nor were any animal studies published.168 As animal studies are still used today to demonstrate the safety of drugs in pre-clinical studies, they are considered scientific evidence.169 Since pharmaceutical companies only need two studies to show a new drug is effective before marketing it, the criterion the plaintiff’s expert and the court adopted to decide the ‘failure to warn’ issue appears reasonable.170 This might not be the case in jurisdictions that require evidence of causation but it is consistent with the ‘informed consent’ rationale used in Georgia.

As many of the critics cite the 1986 letter of Dr Watkins repudiating the Jick study, his review should be examined. Recall that he examined the records of only eight mothers, all of them having a baby born with a defect. Most exposure validation studies utilize a larger sample, are not limited to individuals with just one of the possible outcomes171 and the classifiers are not told the purpose of the study or the outcome of the individual whose record is being classified.172 Keeping the classifiers ‘blinded’ as to the case status is considered essential to determining the accuracy of the original determinations. Furthermore, the study was conducted especially for a lawsuit and not even submitted to the court, much less peer-reviewed. These are similar to the shortcomings that led to the non-admittance of plaintiff’s epidemiologic evidence in the remand of Daubert.173 It is surprising that the critics accept it without any reservations.

In determining whether a particular plaintiff was harmed by a chemical or drug, courts first consider whether it can cause the same harm in a population. This is referred to as

167 Supra note 100.
168 Both the opinion, 615 F. Supp. at 273 n. 15 and Huggins et al., supra note 154 at 229 cite Chapvil et al. Studies on Nonoxynol-9 II. Intravaginal Absorption, Distribution, Metabolism and Excretion in Rats and Rabbits 22 CONTRACEPTION 325 (1980) as demonstrating that the drug is likely to be absorbed into the mother’s bloodstream and might produce malformations. The time of the publication is too near the time the mother used the drug to expect the defendant to have been aware of it and its implications.
169 For an example, see http://www.fda.gov/cder/foi/nda/2000/2-941_Abreva.htm.
170 Recently, after two studies indicated that flu vaccinations appear to reduce the risk of stroke or other heart problems an influential medical letter cited them in urging its readers to get a flu shot. See Get the flu vaccine - your heart will thank you. 13 Harvard Heart Letter 1 (October 2002).
171 This is especially important in studies using members of health plans, see Alan A. Mitchell, Linda B. Cottler and Samuel Shapiro, Effect of Questionnaire Design on Recall of Drug Exposure in Pregnancy, 123 AM. J. EPIDEMIOLOGY 670–676 (1986) (noting at 675 that about 20% of patients receive drugs such as Valium from other sources than the plan or their primary care doctor and that studies relying on prescription records may include false negative as well as false positive exposures).
172 See Dallas R. English, Bruce K. Armstrong, and Anne Kricker, Reproducibility of Reported Measurements of Sun Exposure in a Case Control Study, 7 CANCER EPIDEMIOLOGY BIOMARKERS & PREV. 857–858 (1998) (the second evaluation was conducted on 62 cases from a total of 201 and 162 out of 700 controls and the reviewers in the second phase did not know the results of the first one).
173 See supra note 128 at 1313–19 (noting that establishing that an expert’s testimony grew out of pre-litigation research or has been peer reviewed are ways a party can show it is reliable). Judges are also skeptical about research just performed for litigation.


‘general causation’.174 The critics appear to translate this into requiring that epidemiologic studies demonstrate an increased risk in the general population. Courts, however, have shied away from making any particular type of evidence an absolute requirement. In Rider v. Sandoz,175 where the court did not accept an expert who intended to testify that a drug that caused one type of stroke (ischemic) could cause the other type (hemorrhagic), the opinion noted that absent epidemiologic evidence plaintiffs may prove medical causation through other evidence. The judge in Wells assessed the epidemiological studies as equivocal and relied on other evidence.176 The critics use studies that are not in the court record and omit the 1977 study to say that the judge should have decided that the epidemiological evidence conclusively demonstrates that spermicide does not cause birth defects. Readers can review the studies on limb defects reported in Table 5 as well as the sensitivity analysis to decide whether the judge’s summary of the submitted studies is reasonable.

The opinion noted that plaintiff’s expert eliminated other causes, a method referred to as differential diagnosis. This method is often accepted by courts,177 especially when there is other supporting evidence. In Hollander v. Sandoz, where the case reports and differential diagnosis were not accepted as evidence, the opinion noted that when differential diagnosis is done properly and there is other evidence, it is acceptable.178 In Wells the plaintiff’s expert also examined the baby and checked that the mother used the spermicide after the LMP during the time period when the limbs are formed. He incorporated human and animal studies in concluding that to a ‘reasonable medical certainty’ the limb defects but not other defects were due to the spermicide.179

The importance of the time sequence in assessing causality is stressed by Aro et al. as exposures beyond the sensitive period can typically be excluded.180 The integration of studies with the facts of the case as done by Dr Buehler is consistent with current medical advice.181

Requiring two or more epidemiologic studies to show a statistically significant increased risk of harm in a population raises additional problems. First, which population is appropriate? Today, it is recognized that birth defects and cancer are often multi-factorial

174 See Green et al. supra note 26 at 374.
175 295 F. 3d 1194 (11th Cir. 2002). The case dealt with the propriety of the trial judge’s decision not to admit an expert’s testimony. Appellate courts review them under an ‘abuse of discretion’ standard.
176 In Rider, Id at 1198, the court noted that the parties agreed that four studies, three of which suggested no relationship or a negative one and one suggesting a positive relationship, none of which were statistically significant, were inconclusive. In Wells two out of five peer-reviewed studies indicated a statistically significant relationship while none of the ‘negative’ ones indicated a risk significantly less than 1.0. Thus, the Wells studies seem slightly more favourable to the plaintiff than those in Rider so it is reasonable for them to be deemed inconclusive.
177 See Globetti v. Sandoz 111 F. Supp. 2d 1291 (N.D. Ala. 2001) (accepting differential diagnosis and noting that epidemiologic studies are not required). The decision is cited in the Rider case, supra note 175.
178 289 F. 3d 1193 (10th Cir. 2002).
179 615 F. Supp. at 293–94. See Jeff L. Lewin, The Genesis and Evolution of Legal Uncertainty About ‘Reasonable Medical Certainty’, 57 MD. L. REV. 380 (1998) for a comprehensive examination of the interpretation of this criterion in tort cases.
180 Timo Aro, Jaason Haapakoski & Olli Heinonen, A Multivariate Analysis of the Risk Indicators of Reduction Limb Defects, 13 INT’L. J. EPIDEMIOL. 459, 462 (1984).
181 Harvey B. Simon in 5 HARVARD MEN’S HEALTH WATCH 8 (July 2001) (emphasizing the need to interpret medical studies carefully to see how they may apply to the reader).


in nature, i.e. humans vary widely in their response to toxic chemicals, in part due to their genetic make-up.182 If several genes contribute to the susceptibility of a fetus to exposure to a particular chemical, it will be difficult to find a sufficiently large sample of individuals with the genotype creating the risk. Furthermore, if only a modest fraction, say 10%, of the general population have this genotype, a study of the overall population will have little chance of finding an agent that doubles or triples the risk of birth defects or disease in the relevant subpopulation.183
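The dilution described here follows from simple arithmetic, which note 183 carries out for a tripled risk. The sketch below generalizes that calculation under the assumption that susceptible and non-susceptible individuals share the same baseline risk and that the agent has no effect outside the susceptible subgroup; the function name is hypothetical.

```python
def population_relative_risk(fraction_susceptible, rr_subgroup):
    """Relative risk seen in the whole population when only a susceptible
    fraction is affected by the agent.

    Assumes the same baseline risk in both subgroups and a relative risk
    of 1.0 for individuals without the susceptible genotype.
    """
    return fraction_susceptible * rr_subgroup + (1.0 - fraction_susceptible)


# A 10% susceptible subgroup whose risk is doubled or tripled by the agent.
for rr in (2.0, 3.0):
    print(rr, round(population_relative_risk(0.10, rr), 2))
```

For a tripled risk in 10% of the population the overall relative risk is 0.1 × 3 + 0.9 = 1.2, and for a doubled risk it is 1.1. Both lie well below the range of 1.5 to 2.0 that most studies are designed to detect.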

The commentators who criticize the judge for considering the potential biases of the experts did not describe the unusual circumstances in this case. The first time Dr Watkins, a co-author of the 1981 Jick study, questioned the accuracy of the exposure classifications was at the trial.184 As the study was published four years before he testified, Judge Shoob is justified in doubting his testimony.185 Apparently, only after the trial or after being hired by the defendant did the expert write a letter to the Journal criticizing the study.186 These letters are cited by critics187 of the decision without mentioning that they were published after the trial and after the co-author was contacted by the company.

Similarly, shouldn’t a judge question the impartiality of an expert who did not inform the Food and Drug Administration that he was involved in the case and initially did not recall when he was retained by the defendant?188 Judge Weinstein189 also states that some scepticism of experts is appropriate because there is a tendency for scientists to exaggerate support for hypotheses. The Wells opinion noted that several of the defendant’s experts overstated the implications of some studies.190 Indeed, the Cordero and Layde study,191 cited by Mills and Alexander,192 concludes

182 See THOMAS D. GELEHRTER, FRANCIS COLLINS & DAVID GINSBURGH, PRINCIPLES OF MEDICAL GENETICS, 53–59 (2d. ed. 1998) (noting that environmental triggers of disease are most likely to have a major impact on genetically predisposed individuals so in searching for them one should focus on those at highest genetic risk). The text also cites research dating to 1970. See also Harr Vainio, Molecular Approaches in Toxicology: Change in Perspective, 37 MOLEC. APPROACHES IN TOXICOLOGY, 14–18 (1995) (noting that combining molecular biology with epidemiological associations should help pinpoint agents causing harm and noting that carcinogenesis and teratogenesis may have common mechanisms) and Peter Soderkvist and Olav Axelson, On the Use of Molecular Biology Data in Occupational and Environmental Epidemiology, 37 MOLEC. APPROACHES IN TOXICOLOGY, 84–90 (1995) (noting that genetically determined susceptibility plays a role in understanding disease etiology). Both articles cite earlier work starting from the late 1970s and early 1980s.
183 For a tripled risk in the sub-group the expected relative risk in the total population would only become 0.1 × 3 + 0.9 = 1.2. Most studies are designed to have about 80% power to detect risks of the order of 1.5 to 2.0 and would have less than 50% power to detect a relative risk of 1.2.
184 615 F. Supp. at 281–82.
185 Id. at 282.
186 Watkins, supra note 93.
187 See Huber, supra note 66 at 174, Black, supra note 66 at 673, Mills and Alexander, supra n. 6 and, indirectly, Klein, supra note 66, at 2223 fn. 28.
188 Id. at 290. A similar problem with experts in the UK is described by Daniel Bachtold, Conflict of Interest Allegations Derail Inquiry into Anti-depressant’s ‘Dark Side’, 300 SCIENCE, 33 (2003) (describing that two out of four panel members owned stock in the company making the drug).
189 JACK B. WEINSTEIN, INDIVIDUAL JUSTICE IN MASS TORT LITIGATION 116 (1995) (citing a study indicating that 27% of scientists surveyed had encountered falsified or fabricated research).
190 615 F. Supp. at 286 (the authors of the study said that an ‘appreciable increase’ in risk could not be ruled out, but the expert claimed that only a small increase could not be ruled out).
191 Supra n. 114 at 18.
192 Supra note 6 at 1235.


In view of the public health importance of this issue, however, further studies are needed: they should include detailed information on exposure to vaginal spermicides and should have sufficient statistical power to address precisely specific birth defects.

Similarly, the large study by Louik et al.,193 cited by Mills,194 concludes that use of spermicide does not increase the risk of any of the defects studied, ‘except possibly’ the subgroup of limb defects with unknown origin. While this study did control for smoking, it did not obtain data on tranquillizer use.

Several critics also mentioned that the FDA panels decided that a warning label was not needed; however, the participation of a defence expert in the panel is hardly mentioned. While compliance by a firm with federal regulations is evidence supporting the safety of a product, Schwartz explains that this does not automatically satisfy the legal standards that the product is safe.195 Plaintiff’s experts in Wells observed that producers try to influence regulatory bodies;196 another reason is that regulatory institutions are often slow to accept new theories. Gibbons197 reports that a General Accounting Office review found that regulatory bodies have not applied scientific knowledge, including findings that the fetus is more sensitive to toxins than children or adults. A survey of animal studies indicating this appeared in 1979.198

There is a valid point concerning causation raised by Professor Stolley199 in his testimony and by Professor Bracken,200 namely that most harmful agents cause one health problem or a biologically related set. While this specificity criterion is not an absolute requirement (smoking causes lung cancer and heart problems), it is an important factor used in causality assessment. Because the Jick study found that spermicide was associated with an increase in four different defects, this requirement is violated, suggesting that something unusual happened, e.g. the controls might have been much healthier than average. If one accepts the importance of this requirement, then the critics who use the different decision in the Smith v. Ortho case also are violating it since that child had a trisomy-18 defect, not a limb reduction.

The epidemiological evidence in the Wells case surely was not very convincing and it is understandable that scientists and legal scholars might desire stronger scientific evidence before holding a defendant liable. What seems unfortunate is the absence from the published criticisms of any discussion of the part of the record most supportive of the legal decisions.201

While it is reasonable to ask how much evidence should be required before a warning is

193 Supra note 114.
194 Supra note 66 at 98.
195 See Theresa M. Schwartz, The Role of Federal Safety Regulations in Products Liability Actions, 41 VANDERBILT L. REV. 1121–1169 (1986).
196 See MARION NESTLE, SAFE FOOD 194–219 (2003) for a discussion of the politics of government oversight and conflicts of interest.
197 See Ann Gibbons, Reproductive Toxicity: Regs Slow to Change, 254 SCIENCE 25 (Oct. 1991).
198 See Jerry M. Rice, Perinatal Period and Pregnancy: Intervals of High Risk for Chemical Carcinogens, 29 Environ. Health Persp. 23–29 (1979).
199 Supra note 79 and accompanying text.
200 Supra note 155 at 554 (noting the defects did not have a common etiology).
201 None of the commentaries, supra notes 5, 66, 93, discuss the 1977 Smith et al. study, supra note 70, and the other risk factors it identified that were not considered in the later studies. After the Bazemore opinion, supra


required, is it fair to rely on studies that were published after the time of exposure to assess the producer’s duty at the time?

It should be stressed that our examination of the studies does not conclude that exposure to spermicides causes limb reduction defects.202 Indeed, the studies published in the 1980s are sensitive to the potential effect of unmeasured known potential risk factors that were not controlled for in the analysis. Many studies observed that it is difficult to rule out small but meaningful risks given their sample size, or recommended that more studies including potential confounding variables be carried out before a scientific conclusion can be reached.203

The adversarial system places the burden of producing evidence on the parties, not the judiciary. In Wells both opinions cited virtually all the studies in the record and explained why they relied on the ones they did. They decided the ‘failure to warn’ issue, using the studies existing prior to the mother using the drug and the testimony of two experts that these studies indicated a need for a warning.204 Although later studies diminished the strength of the evidence for an association, the Key case demonstrates that the judiciary is faced with a difficult problem when deciding how much evidence should be required before the public is warned. Delaying a warning exposes the public to small but life-threatening risks, but prematurely warning about a safe product also harms the public. Unlike the critics, the author believes that courts that present their findings and reasoning as carefully as the Wells and Smith opinions did should be praised, while opinions such as Key that do not provide that information should be questioned.

5. Conclusion and discussion

The reanalysis of the studies in Key and Wells illustrates some of the problems in determining when scientific studies establish a causal relationship or when sufficient evidence has accumulated to justify a need to warn users of potential harmful effects associated with a product. It also demonstrates the importance of courts receiving a thorough review of the existing studies and reports as it is arguable that there existed stronger evidence substantiating the need for a warning in Key than in Wells. Three recent Supreme Court decisions, Daubert, Joiner and Kumho Tire,205 give the trial judge the task of assessing prospective expert testimony to determine whether it is based on sufficiently

note 165, however, the fact that legal commentators did not mention the potential effect of the omission of those possible risk factors on subsequent studies is surprising.
202 This caveat needs to be emphasized as, recently, Shaw et al., Maternal Periconceptional Use of Multivitamins and Reduced Risk for Conotruncal Heart Defects and Limb Deficiencies Among Offspring, 59 AM. J. MED. GENETICS 536–545 (1995) showed that women who took vitamins had a 30–35% lower risk of offspring with limb defects and that consumption of folic acid was also associated with a reduced risk. If these results are confirmed, further studies will need to incorporate nutritional factors as an imbalance in the nutrient levels between users and non-users would affect the estimated relative risk of other agents. The discovery of a possible new causal factor 15 years after the mother of ‘baby’ Wells used the drug illustrates the difference between science and law, see supra notes 12 to 15 and accompanying text.
203 See Shapiro et al., supra note 80, Polednak et al., supra note 148, and Cordero and Layde, supra note 114. See also Fienberg et al., supra note 29 at 31 (specifically suggesting that experts be asked about how they accounted for confounders or adjusted for other possible explanations as well as whether they considered exposure to multiple risks).
204 788 F. 2d at 746 notes 9 and 10.
205 Daubert v. Merrell-Dow Pharmaceuticals Inc., 509 U.S. 579 (1993), General Electric Co. v. Joiner, 522 U.S. 136 (1997) and Kumho Tire Co. v. Carmichael, 526 U.S. 137 (1999).


reliable methodology, which should include a careful examination of the relevant literature, to be admissible. These opinions have had a significant impact on the courts.206 Since no study is perfect, sensitivity analysis can be used as in Section 4 to assess whether the flaws are sufficiently severe to affect the ultimate inference.

Our review of the studies relied on by the critics of the Wells decision indicates that the same level of scrutiny courts are giving to experts who rely on peer-reviewed studies in their testimony should be applied to experts who criticize those studies. The flaws should be shown to be sufficiently severe to cast substantial doubt on the main conclusions of the study. In particular, experts relying on later studies that do not obtain information on potential risk factors identified previously should be asked to justify why those factors could not alter their conclusions. Thus, the proponent of the later study should have the burden of showing that the omission will not affect the primary inference.207 The original and reverse versions of Cornfield’s inequality and related methods of sensitivity analysis can assist experts and judges in assessing whether a relevant omitted variable is likely to have a substantial impact on the analysis.
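To make the idea of such a sensitivity analysis concrete, the following Python sketch applies a standard textbook adjustment for a single unmeasured binary risk factor. It is not a reproduction of the calculations in Section 4: the function names and the numerical inputs are hypothetical, and the sketch assumes that the omitted factor multiplies the risk of harm by the same amount (`gamma`) in exposed and unexposed groups. The check of Cornfield’s conditions reflects the classical result that an omitted binary factor can fully explain an observed relative risk only if both its prevalence ratio between exposed and unexposed groups and its own relative risk are at least as large as the observed relative risk.

```python
def confounding_bias_factor(gamma, p_exposed, p_unexposed):
    """Factor by which an omitted binary risk factor inflates (or deflates)
    the observed relative risk.

    gamma       -- relative risk of harm associated with the omitted factor
    p_exposed   -- prevalence of the omitted factor among the exposed
    p_unexposed -- prevalence of the omitted factor among the unexposed
    """
    return (1.0 + (gamma - 1.0) * p_exposed) / (1.0 + (gamma - 1.0) * p_unexposed)


def adjusted_relative_risk(observed_rr, gamma, p_exposed, p_unexposed):
    """Observed relative risk with the effect of the omitted factor removed."""
    return observed_rr / confounding_bias_factor(gamma, p_exposed, p_unexposed)


def cornfield_conditions_met(observed_rr, gamma, p_exposed, p_unexposed):
    """Necessary (not sufficient) conditions for the omitted factor to account
    for the whole observed association: prevalence ratio >= observed RR and
    gamma >= observed RR."""
    prevalence_ok = p_exposed >= observed_rr * p_unexposed
    strength_ok = gamma >= observed_rr
    return prevalence_ok and strength_ok


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    observed_rr, gamma, p1, p0 = 2.0, 3.0, 0.4, 0.2
    print("bias factor:", round(confounding_bias_factor(gamma, p1, p0), 3))
    print("adjusted RR:", round(adjusted_relative_risk(observed_rr, gamma, p1, p0), 3))
    print("Cornfield conditions met:", cornfield_conditions_met(observed_rr, gamma, p1, p0))
```

With these hypothetical inputs the omitted factor satisfies Cornfield’s necessary conditions yet removes only part of the association, which illustrates why an expert should be asked to quantify the likely impact of an omitted variable rather than merely note its existence.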

The decision in Key Pharmaceuticals raises several important issues. First, it is important for the parties to obtain all the available and relevant studies and bring them to the attention of the court. Secondly, the determination that a pharmacist was not sufficiently knowledgeable to be admitted as an expert may have had a major role in the decision, although we do not know how many of the reports and studies we found would have been cited. While it might be preferable to have an expert be a medical doctor or possess a doctorate in pharmacology, courts may need to avoid being overly restrictive in defining the credentials of potential experts to assure that both parties have a reasonable pool to draw from. Presumably, the basis of an expert’s testimony, including their knowledge of the scientific literature, will be subjected to careful examination.

As scientific evidence will continue to play a major role in a wide variety of cases, it is important for members of both professions to understand the different purposes of and procedures used in the other field. This is especially true when there are major public policy issues involved so that the public is fully informed of all the relevant issues. In particular, statutes of limitations in the legal system restrict the scientific information that will be available to courts. Several suggestions for improving the usefulness of scientific findings in product liability issues follow:

Both scientists and lawyers need to appreciate the different purposes of their professions. Scientists need to understand that time limits imposed by the legal system alter the amount of scientific information that will be available when a case must be filed and a decision rendered. When commenting on a decision it is incumbent on scientists

206 See Margaret A. Berger, The Supreme Court’s Trilogy on the Admissibility of Expert Testimony, in REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 2d ed. (Federal Judicial Center, 2000) and Marc Rosenblum, On the Evolution of Analytical Proof, Statistics, and the Use of Experts in EEO Litigation, in STATISTICAL SCIENCE IN THE COURTROOM (Joseph L. Gastwirth ed., 2000) for further discussion of the influence the trilogy is having on lower courts. See also the Symposium: At the Daubert Gate: Managing and Measuring Expertise in an Age of Science, Specialization and Speculation, 57 WASH. AND LEE L. REV. (2000).
207 In general the proponent of a study should have the burden of showing that known omitted variables that are related to the harm under investigation would not affect the ultimate conclusion.


to read the decision carefully, learn the applicable standard of proof208 and examine all the studies the court had available to it. In fairness to the judiciary, studies not submitted into evidence or cited in testimony or studies made available after the close of the trial (or relevant discovery period) should not be used to criticize the decision. Scientists should also clearly identify any prior contacts with the parties in the case or related cases that might be construed as a potential conflict of interest.209 Legal scholars should also inform readers of their role as a consultant in related litigation and administrative processes.

The legal world should uniformly adopt a more realistic discovery rule and clarify its relationship to statutes of limitations and repose. In particular, the limitations period should only begin to run when the underlying science supports a finding of causation and the plaintiff should be aware of it (e.g. through public knowledge via the media or by normal discussion with their doctor or pharmacist in case of drugs). This would also allow scientists the time needed to collect and analyse several studies to enable a review suitable for public policy determinations.210 In England211 plaintiffs in personal injury cases have three years from the time they had or ought to have had knowledge that the harm was serious, attributable in part to the actions of the defendant and the identity of the defendant.

It might be preferable to separate the failure to warn issue from a determination of causation as the amount of evidence that should suffice for a warning of potential toxicity would appear to be less than that required for causation. This is consistent with the ‘informed consent rationale’ used in some states but not in those that only require a warning label when causation could have been demonstrated. As the burden of proof is on plaintiffs, this discourages manufacturers, who are in the best position to carry out studies, from doing so.212 By preserving the requirement that plaintiffs need to show that they would have heeded a warning, consumers would also be encouraged to be more responsible. Of course, producers should only be held liable for failing to warn if they had or should have had adequate knowledge.213

Science should progress over time so courts should understand that the ‘balance’ of

208 Jasanoff, supra n. 66 at 68 also recommends that experts familiarize themselves with the legal process and standards legal fact-finders use in assessing truthfulness.
209 The British Journal of Medicine now requires authors to describe such connections.
210 See Michael B. Bracken, supra n. 155 at 555 (observing that it is quite common for the first studies to suggest a positive association and that it takes time to design large-scale studies to specifically test for an hypothesized association). This statement may only refer to those studies that are published as firms may not report studies that indicate risks. See the column Agency Watch, EPA’s Voluntary Data, NAT. L. J. A 10 (Nov. 4, 1996) (reporting that the chemical industry reported only about one-fourth of the studies they were supposed to). See also Blaza Toman and Joseph L. Gastwirth, Statistical Issues in the U.S. v. Marine Shale Case, 8 ENVIROMETRICS 53 (1997) (discussing the low power of a published study, submitted by the defendant, that did not find a significantly increased risk).
211 See PAMELA R. FERGUSON, DRUG INJURIES AND THE PURSUIT OF COMPENSATION 56–59 (1996).
212 See supra notes 11 and 22. See also Mary L. Lyndon, Information, Economics and Chemical Toxicity: Designing Laws to Produce and Use Data, 87 MICH. L. REV. 1795, 1814 (1989) (ignorance of toxicity may be an advantage as new or untested chemicals will do better than those that have indicated some level of toxicity) and Wendy E. Wagner, The Science Charade in Toxic Risk Regulation, 95 COLUM. L. REV. 1613, 1687 (1995) (observing that rational manufacturers with a fiduciary responsibility to shareholders are unlikely to undertake research on a potential carcinogen they produce if such a finding will likely lead to further regulation).
213 See Colgain v. Oy-Partek Ab. Del. No. 359 (2002) (holding that knowledge of the dangers from asbestos was not generally available to the Finnish company or scientific community in the 1930s when the plaintiff was exposed).


the relevant scientific information also changes over time. When interpreting a series of studies, an association uncovered in an early study should be one of the main focuses of a subsequent one so that the issue of multiple comparisons is minimized.214 Data on variables shown, in prior studies, to be related to the response under investigation should be collected in future studies.215 Judges should question experts who rely on studies that do not include these related variables about the effect they could have on the expert’s conclusions.
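The multiple comparisons problem mentioned above can be quantified with a short calculation. The Python sketch below is an illustration only: it assumes the tests are statistically independent, which tests run on the same data set generally are not, so the figures should be read as rough guides. The function names are hypothetical, and the Bonferroni threshold is shown as one common correction, not as a method used in any of the studies discussed.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one 'false positive' among n_tests tests,
    each run at significance level alpha, assuming independent tests and
    no true associations."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the overall false positive
    probability at or below alpha (Bonferroni correction)."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(f"{n:2d} tests: chance of a false positive "
              f"{familywise_error_rate(n):.2f}, "
              f"Bonferroni per-test level {bonferroni_threshold(n):.4f}")
```

Under the independence assumption, twenty tests at the 0.05 level give a probability of about 0.64 that at least one appears significant by chance alone, which is why a later study focused on a single previously identified association carries more weight than one finding among many.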

Courts can assist scientists and legal commentators to develop relevant information and methods of assessing bodies of evidence by formally citing the studies submitted to them by the parties or relied on by their experts and explaining how the evidence was evaluated. This would stimulate the development of methods for combining evidence from various sources and encourage scientists to reanalyse data from these studies as well as plan future studies.

Producers should have a greater incentive to monitor their products for side effects.216 Once several case reports are published or a small but significant result as in Chang et al.217 is published, a responsible producer should arrange for two or three independent medical teams to carry out confirmatory studies simultaneously. This would minimize the time period the public is exposed to a potential risk, if there truly is one, and would also lead to the reassurance of the medical community and public if the original study turned out to be a false positive one. Such responsible policies of a manufacturer should be considered strong evidence in their favour if punitive damages are requested in addition to compensatory ones.

Finally, the role of animal studies in assessing causation deserves more attention. While courts and scientists prefer evidence garnered from human studies, ethical considerations may preclude some experiments. In the Wells case the court accepted the expert’s reliance on animal studies showing that the chemical could pass through the placenta and reach the fetus. Some might argue that this does not prove the chemical also passes through a human placenta but would it be proper to expose a pregnant woman to the spermicide? As there would be no health benefit to either the mother or the baby and one would only carry out the study after a possible health risk was suspected, would such a study be ethical?218

214 The mathematics underlying typical calculations of statistical significance assumes that one comparison or test is being conducted. When many tests are carried out on the same data set, then the probability of a ‘false positive’ finding of significance is increased. See ZEISEL & KAYE, supra note 6 at 92–94 and GASTWIRTH, supra note 17 at 776 for examples of this problem. In Kadas v. MCI Systemhouse, Inc. 85 FEP CASES 1720, 1723 (2001) Judge Posner observes that if an expert has run 20 regressions and only reports the one favourable to the party’s position it has no evidentiary value. He also notes that the usual 0.05 level used to determine significance should not be used to determine whether a study or analysis is admissible.
215 This policy would reduce the chance that data on tranquillizers would not be collected in other studies of the relationship of birth defects and spermicide use. Courts should understand that large-scale epidemiologic studies require several years to plan. Thus, they should not expect that studies appearing within a year or two after an earlier one will have had the opportunity to incorporate the results of the previous study during the planning stage.
216 I believe this suggestion is consistent with the economic analyses of tort law as the producer is in the best position, i.e. has the least cost in obtaining that information. See LANDES & POSNER, supra note 59 for a discussion of the role of information costs.
217 Supra note 34.
218 See Mills et al., supra note 138 at 446 (stating that no one would recommend that a woman continue to use spermicides after conception although their results indicate that women should not be concerned about such accidental exposure).


Almost surely today’s Institutional Review Boards would question such a study and would, at a minimum, require complete disclosure in the ‘informed consent’ form given potential participants.

Acknowledgements

It is a pleasure to thank Joseph Cecil, Michael D. Green, David H. Kaye and Wendy Wagner for many conversations, spanning several years, on the role of scientific evidence in tort and environmental law. I also wish to thank Drs Barry Graubard, Weiwen Miao and Binbing Yu for their careful review of the manuscript, Professor H. W. LaRue for helpful comments and Dr George Reed for a most helpful discussion of the power calculations done in the 1982 study.