

Page 1: STATISTICAL TREATMENT OF EXPERIMENTAL DATA (homepages.wmich.edu/~dschreib/Courses/Chem436/Statistical.pdf)

S– 1

STATISTICAL TREATMENT OF EXPERIMENTAL DATA

S-1. INTRODUCTION
S-1.1. TYPES OF EXPERIMENTAL ERRORS
S-1.2. RULES FOR TREATMENT OF DATA

S-2. PROPERTIES OF THE SAMPLING UNIVERSE
S-2.1. CENTRAL VALUES
S-2.2. DISPERSION

S-3. REPEATED MEASUREMENTS OF SINGLE QUANTITY
S-3.1. ESTIMATE OF CENTRAL VALUES
S-3.2. ESTIMATE OF DISPERSION VALUES

S-4. MEASUREMENTS OF LINEAR RELATIONSHIPS
S-4.1. LEAST-SQUARES FIT OF Y = MX + B
S-4.2. PROPER APPLICATION OF LEAST-SQUARES FITS

S-5. QUALITY OF RESULTS
S-5.1. REJECTION OF DATA
S-5.2. CONFIDENCE INTERVALS
S-5.3. SIGNIFICANT FIGURES AND ROUNDING ERRORS

S-6. RESULTS DERIVED FROM MEASURED QUANTITIES
S-6.1. ERROR PROPAGATION
S-6.2. ESTIMATES OF PRECISION
S-6.3. ESTIMATES OF ACCURACY

S-7. TABLES FOR STATISTICAL TREATMENT OF DATA
S-7.1. VALUES OF t FOR 95% CONFIDENCE INTERVALS
S-7.2. VALUES OF Q FOR DATA REJECTION
S-7.3. VALUES OF Tc FOR DATA REJECTION -- CHAUVENET'S CRITERION
S-7.4. PRECISION AND ACCURACY OF VOLUMETRIC GLASSWARE
S-7.5. MEASURED PRECISION OF LABORATORY BALANCES
S-7.6. TABLE OF ATOMIC WEIGHTS WITH UNCERTAINTIES
S-7.7. TABLE OF CONSTANTS AND CONVERSION FACTORS WITH UNCERTAINTIES
S-7.8. SUMMARY OF COMPUTATIONAL FORMULAS

S-1. Introduction

It is common experience that repeated laboratory measurements of any quantity yield numerical results that vary from one time to another. Similarly, values of physical properties that are derived computationally from directly measured quantities usually vary from one determination to another. Finally, values of properties obtained by one observer commonly vary from those of other observers or from accepted (literature) values where the latter exist. All of these different types of variations of physical values are known collectively as "experimental errors." Experimental errors can never be totally eliminated, but their effect can be minimized by proper application of statistics. While it is desirable for any scientist or engineer to understand the mathematics of probability and statistics, it is possible to use statistical analysis correctly even without detailed understanding. Thus, statistics can be applied just like any other tool.

S-1.1. Types of Experimental Errors

There are basically three types of experimental errors: 1) blunders; 2) systematic, or determinate, errors; and 3) random, or indeterminate, errors. The first two types of errors can in principle be nearly eliminated. The third type can never be eliminated, but its influence can be quantitatively evaluated to yield the greatest possible information about the value that is sought. Systematic errors cause inaccurate results, even when the precision of the measurement is excellent. Random errors reduce the precision of the results, but the accuracy may still be perfect within the confidence limits of precision. Undetected blunders may contribute to both inaccuracy and imprecision.

S-1.1.1. Blunders:

Under the heading of blunders we can include such physical errors as spilling or splashing a portion of the sample being measured, contaminating the sample, using the wrong sample (because of labelling error, carelessness in reading labels, or losing the label), misreading an instrument or other apparatus, etc. Also included in this type of error would be arithmetic or algebraic errors involved in calculations of derived values (including entering the wrong value in a calculator or computer by mistakenly pressing a wrong key). Whenever any of these types of blunders are noticed, they must be corrected if possible, or the particular sample must be eliminated from consideration if the error cannot be rectified. Sometimes a blunder goes unnoticed but its effect becomes evident during statistical analysis of the data. At that time, the particular value(s) so affected can be eliminated from further consideration.

S-1.1.2. Systematic Errors:

Systematic, or determinate, errors are most commonly thought of as involving inaccurate calibrations of equipment. When such errors exist, the most precise measurements imaginable will result in incorrect results which are not detectable or correctable by statistical methods. For example, if a 100 ml volumetric flask actually has a volume of 100.1 ml, then the use of this flask will always give erroneous results unless either the flask is accurately calibrated to learn its true volume, or some other error compensates for the erroneous volume. All volumetric glassware, balances, electrical components, etc. are calibrated by their manufacturers, but only to within a certain "tolerance" range of a truly accurate value. For very exact work, it is important that the experimenter calibrate his or her own apparatus.

Assumptions regarding the purity of chemical reagents introduce another type of systematic error which can be minimized only by assaying the purity of the compound or by rigorously purifying it, or both. Sometimes the manufacturer supplies an assay of the material; in such cases the actual value of the concentration should be used.

The use of obsolete values of physical constants, conversion factors, atomic masses, etc. introduces additional systematic errors into derived values, just as does the use of improperly calibrated standards. Tables of constants and conversion factors are given in Section S-7.7. Numerical values of many of these factors are periodically reviewed and revised using more and more refined measurement techniques as they become available. Sometimes definitions (e.g., atomic mass scale, conversion factor from liters to cubic centimeters, etc.) are changed by international agreement, requiring revision of values of some other physical constants or conversion factors. Only by the use of the most current values can the contribution of this type of systematic error be minimized.
Finally, rounding off of numerical values during computation introduces an additional systematic error which in its effect is equivalent to reducing the precision of calibration of an instrument. Though the influence of round-off error can in some situations be estimated statistically, it actually is a systematic error: it can be minimized, or effectively eliminated, while the computation is carried out, but usually its effect cannot be assessed or even discovered after the fact. This problem is discussed more fully in Sec. S-1.2.3.

S-1.1.3. Random Errors:

The third category of experimental error, i.e., random or indeterminate error, includes a great number of types of phenomena or conditions normally external to the experimental system, each of which can have (usually) small influences on the results of the experiment. Generally it is considered impossible to determine even what all the influences are and certainly impossible to determine their individual or collective effects. Just a few of the types of possible influences here might include minor fluctuations in atmospheric pressure or temperature, color and intensity of incident light, voltage of electric supply, electromagnetic fields, gravitational field (as influenced by the phase of the moon, for instance), sunspots, physical and emotional conditions of the observer, etc. Fortunately, if there are enough different contributors to random experimental error (as there almost certainly are), statistical theory tells us a great deal about the overall effect of such errors, and this effect is easily quantified.


S-1.2. Rules for Treatment of Data

Unless you are specifically instructed to do otherwise, you are to follow the data treatment rules given below for every experiment you perform in the physical chemistry laboratory. The application of some of these rules is self-evident and requires no knowledge of statistics. For others, the required knowledge of statistical computations and analysis can be obtained from the following pages. Even if you have had some prior involvement with statistical treatment of data, you should at least peruse the following pages. It is your responsibility to use the formulas and terms as they are described here. Failure to follow any of these rules faithfully will be penalized (in terms of your grade) just as if you had used an incorrect thermodynamic function or had made a serious error in arithmetic.

S-1.2.1. Data Entry.

Always enter data directly into your laboratory notebook (never on a separate piece of paper first), preferably with ballpoint pen so a good carbon copy will result. If you enter data first on a separate piece of paper, to be transferred into your notebook later, your valuable data may be confiscated and destroyed by your instructor. If a value includes a decimal fraction, be sure to make the decimal point very distinct on both the original and the carbon copy. If the value is less than unity, either place a zero before the decimal point or use scientific notation with a non-zero integer preceding the decimal point. (Follow these rules for intermediate and final values in your lab report as well as for original data values.)

If you make a "blunder" type of error and you realize it at the time, cross out the erroneous value with a single line (don't obliterate). It is a good idea to append a notation indicating why the value was crossed out. If, while working in the laboratory, you suspect you might have made some kind of blunder in obtaining or recording a particular data value but you aren't certain, proceed as follows. Indicate the questionable value, but don't cross it out. Indicate, by a written notation, your reason for questioning the value. When you perform the appropriate calculations with the data, if the result from the questionable value does not appear "out of line" with other values, retain it; otherwise eliminate it from further consideration.

S-1.2.2. Statistical Rejection of Data.

When you have completed all, or an appropriate portion, of the calculations with your data, determine by the following means whether any individual values are outlying points. If no more than ten individual data values are involved, reject any that fail the Q-test. If more than ten values are involved, eliminate those data which are indicated to be outlying points by Chauvenet's criterion. Detailed instructions for the use of both of these rejection decision methods are provided in Sec. 5.1.
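The Q-test decision can be sketched in a few lines of code. This is only an illustrative sketch, not a substitute for the procedure in Sec. 5.1: the critical values below are the commonly tabulated 90%-confidence values, and the data in the usage example are hypothetical; for actual coursework use the Q table in Sec. S-7.2.

```python
# Sketch of the Q-test for a single suspect value in a small sample (3 <= n <= 10).
# Q = (gap between suspect value and its nearest neighbor) / (total range);
# the suspect value is rejected if Q exceeds the tabulated critical value.
# These critical values are illustrative 90%-confidence values (an assumption;
# the course's authoritative table is Sec. S-7.2).
Q_CRIT_90 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560,
             7: 0.507, 8: 0.468, 9: 0.437, 10: 0.412}

def q_test(values):
    """Test the most extreme value once; return (kept values, rejected value or None)."""
    data = sorted(values)
    n = len(data)
    spread = data[-1] - data[0]
    if n < 3 or n > 10 or spread == 0:
        return list(values), None
    q_low = (data[1] - data[0]) / spread       # gap at the low end
    q_high = (data[-1] - data[-2]) / spread    # gap at the high end
    if q_high >= q_low and q_high > Q_CRIT_90[n]:
        return data[:-1], data[-1]             # reject the largest value
    if q_low > q_high and q_low > Q_CRIT_90[n]:
        return data[1:], data[0]               # reject the smallest value
    return data, None

# Hypothetical data: four consistent readings and one obvious outlier.
keep, rejected = q_test([5.31, 5.28, 5.34, 5.30, 5.98])
```

Note that the Q-test is applied to at most one value per pass; if a value is rejected, the test may be repeated on the remaining data only as the course rules permit.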

S-1.2.3. Rounding Off.

During computations, always retain at least two more significant figures than those which you would have retained according to the rules you learned (reviewed in Sec. 5.3) in your introductory chemistry course. When you have completed your computations, you are to round off the final results to two significant figures in the confidence interval limits.

S-1.2.4. Use of Averages.

Values which represent replicate measurements of the same property of the same sample may be averaged, and the average value is then used in further computations. The standard error of the average is used in computing confidence limits in the final results.

S-1.2.5. Linear Functions.

When either raw data or intermediate computed values are to be fitted to a straight line, the values of the slope and intercept are to be obtained by a least-squares fit. In the process of fitting you will also obtain the standard error of the y-values (actually the estimated standard deviation of residuals) as well as the slope and intercept. Apply the appropriate criterion to the residuals to decide whether to reject any data points as outliers. If you do reject any, recompute the least-squares fit with the remaining data. Use the standard errors of the slope and/or intercept in computing confidence limits of any final results derived therefrom.


S-1.2.6. Confidence Limits of Results.

Final results of measurements are to include confidence limits for, and physical units of, the values obtained. This procedure is illustrated in more detail in Sec. 6.3. If your final answer contains more than two "significant figures" in the confidence limits, it will be graded as a computational error. Along with the final results, report a "literature value" (a table of results is an appropriate way to report these) if one can be found. Give a complete reference to the source of your literature value, even if it was obtained from a textbook or handbook. If the literature value falls outside the confidence limit range of the value you obtained, a systematic error is implied. In your discussion, consider possible systematic errors that could result in the observed discrepancy. This is discussed further in Sec. 6.3.

S-1.2.7. Expected Errors.

For one data point (of each kind, if more than one kind) you are to estimate the precision involved in each measurement or observation. Estimates of errors involved in measuring volumes and masses are tabulated in Secs. 7.4 and 7.5. Errors in other types of measurements must be estimated. Counting numbers (e.g., the charge on an ion, the exponents in an equilibrium constant expression, etc.) are considered to contain absolutely no error. When all expected errors have been assessed, the expected errors in derived quantities are to be computed. The expected error (precision) is to be compared with the standard error determined from experimental results. If the standard error is larger than the expected error by a factor of two or more, include an analysis of why your experimental results show poorer precision than anticipated. An example is given in Sec. 6.2.
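The rules for computing expected errors in derived quantities are given in Sec. S-6.1. As a preview, a minimal sketch of the usual quadrature rule for independent random errors is shown below; the density example and all of its numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
# Minimal sketch of expected-error propagation for independent errors:
#   sigma_f^2 = sum over inputs of (df/dx_i)^2 * sigma_i^2
# (the standard quadrature rule; the formulas actually required for the
# course are those in Sec. S-6.1).
# Hypothetical example: density rho = m / V from a mass and a volume.

def propagate_density(m, sig_m, V, sig_V):
    rho = m / V
    # Partial derivatives: d(rho)/dm = 1/V,  d(rho)/dV = -m/V**2
    sig_rho = ((sig_m / V) ** 2 + (m * sig_V / V ** 2) ** 2) ** 0.5
    return rho, sig_rho

# Hypothetical numbers: mass from a balance, volume from a pipet.
rho, sig = propagate_density(m=12.3456, sig_m=0.0002,   # g
                             V=10.00, sig_V=0.02)       # mL
```

Here the volume term dominates the expected error, which is typical when a balance (Sec. 7.5) is far more precise than the glassware (Sec. 7.4).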

S-1.2.8. Validity of Least-Squares Fit.

If your computation involves a linear fit of data, prepare a graph of the residuals in the least-squares fit. The appearance of the plot of residuals is to be discussed briefly in terms of the applicability of linear least-squares fit of the experimental data. This type of analysis is described in Sec. 4.2.

S-2. Properties of the Sampling Universe

If an infinite number of measurements of a quantity were made, they would be "distributed normally," i.e., the most common value would be the true value (assuming no systematic error). Values increasingly far from the "central" (true, or most common) value would be less frequently observed. If a graph were plotted with measured value as the abscissa and the frequency of observation of that value as the ordinate, the result would be the bell-shaped Gaussian, or normal, distribution curve. This distribution of infinitely many (all possible) measurable values is known as the "sampling universe," or frequently simply as the "universe." It is from this universe that our sample of a few measurements is drawn. We assume that our sample is "representative" of the universe and use it to estimate properties of the sampling universe itself.

S-2.1. Central Values

We shall be especially concerned with two particular features of the normal distribution. The value having the maximum frequency (corresponding to the peak of the curve) is the "mode," or "most probable value," and for the normal distribution it is identical to the "mean," or "average," value of the distribution. The mean of a sampling universe will be denoted by µ, and the average of a measured sample will be denoted by a bar over the symbol for the quantity being measured. Thus, µx is the mean of the sampling universe for the quantity x, and x̄ is the average of an actual group of measurements of x.

Ideally, x̄ = µx, but this equality is not normally encountered, especially if the sample of measurements is small. However, statistical treatment of the data allows us to estimate the probability, or likelihood, that the value of µx deviates from the value of x̄ by a given amount.

S-2.2. Dispersion

The second quantity of the distribution that is significant to us is a measure of the dispersion, or breadth, of the distribution curve. As an example, we might have two sets of measurements of some property, each with an average of 100. If one set had all values between 99 and 101, and the other had all values between 90 and 110, we would say the first was a narrower distribution. We would probably have more confidence in the value of x̄ as a representation of µ in the first case, even though x̄ is the same in both cases. Conceivably we could use the total range of values of a distribution as a measure of its breadth, but a normal distribution (of an infinite sampling universe) extends infinitely in both directions from the central value, even though finite samples have finite ranges. It turns out that a quantity σ, the "standard deviation" of the sampling universe, is easily estimated and has much utility. We can compute a standard deviation of a finite sample (denoted s) and from this we can estimate the value of σ. Much useful information can be obtained from a knowledge of the estimated values of µ and σ.

S-3. Repeated Measurements of Single Quantity

S-3.1. Estimate of Central Values

Suppose we measure the height of a column of mercury five times and obtain values of 5.31 cm, 5.28 cm, 5.34 cm, 5.30 cm, and 5.27 cm, and we wish to obtain an estimate of the "true" height of the mercury column together with a measure of the amount of confidence we can place in the value. It seems natural to use x̄ as a measure of µ. We can calculate its value the same way we have since grade school, i.e., add up all the values and divide by the number of values. However, we will soon find such word-based definitions too cumbersome to be useful, and we desire a more compact, efficient notation. The "sigma" notation is universally used in statistics, and it is mandatory that you master its use. In this notation each value is assigned an index number which is carried as a subscript. The index numbers used in any given set of measurements are the consecutive counting numbers beginning either with 0 or with 1 (we shall always begin with 1 in this work). These might be applied to our set of data for the height of the mercury column in the following way:

Index number i    Measured value xi
1                 x1 = 5.31 cm
2                 x2 = 5.28 cm
3                 x3 = 5.34 cm
4                 x4 = 5.30 cm
5                 x5 = 5.27 cm

Translated into sigma notation, our formula for computing the mean of the sample becomes:

\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right)

The summation symbol, \sum_{i=1}^{n}, means "add up the quantities that follow, with i running sequentially from 1 to n." The value of n is simply the total number of sample points. Frequently, the limits of a summation are simply understood to include all the sample points and the index is not actually indicated.

\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\sum x_i = \frac{1}{n}\sum x
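In a program, the summation sign maps directly onto a loop (or a built-in sum). A minimal Python sketch using the mercury-column readings above:

```python
# The summation symbol is just a loop: add up x_i for i = 1..n, then divide by n.
x = [5.31, 5.28, 5.34, 5.30, 5.27]   # cm, the mercury-column readings

total = 0.0
for xi in x:            # "i running sequentially from 1 to n"
    total += xi
mean = total / len(x)   # x-bar = (1/n) * sum of x_i
```

The same result is obtained with `sum(x) / len(x)`; the explicit loop is shown only to make the correspondence with the sigma notation visible.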

S-3.2. Estimate of Dispersion Values

We say that x̄ is the best estimate we can get of µ without making additional measurements, but we can say something about how good the estimate is if we know the value of the standard deviation, σ. The actual use of the standard deviation for this purpose is treated in Sec. S-5.2.


The definition of the standard deviation of the sampling universe is stated mathematically as:

\sigma = \lim_{n \to \infty} \left[\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2\right]^{1/2}

It seems natural to define the standard deviation of a finite sample in an analogous way as:

s = \left[\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2\right]^{1/2}

Although we can compute s directly, we would really like to have a value for σ. Since σ cannot be computed directly, we need a best estimate, \hat{\sigma}, of the value that is desired. (In statistics, a caret is placed above a symbol to indicate that the value to which it refers is an estimate rather than a true value. For example, \hat{\sigma} is an estimate of the standard deviation, and \hat{\mu} = \bar{x} is an estimate of the mean.) The relationship we seek is:

\hat{\sigma} = \left[\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2\right]^{1/2} = s\left(\frac{n}{n-1}\right)^{1/2}

Note that on many handheld calculators the symbol s often appears as part of the statistical functions; however, the value that is calculated is \hat{\sigma}. Now let us apply these definitions and formulas to our measurements of the height of a column of mercury.

\bar{x} = \frac{1}{n}\sum x_i = \frac{1}{5}\left(5.31 + 5.28 + 5.34 + 5.30 + 5.27\right)\ \mathrm{cm} = \frac{26.50\ \mathrm{cm}}{5} = 5.30\ \mathrm{cm}

s = \left[\frac{1}{n}\sum\left(x_i - \bar{x}\right)^2\right]^{1/2}

s = \left\{\frac{1}{5}\left[(5.31-5.30)^2 + (5.28-5.30)^2 + (5.34-5.30)^2 + (5.30-5.30)^2 + (5.27-5.30)^2\right]\right\}^{1/2}

s = \left\{\frac{1}{5}\left[0.01^2 + 0.02^2 + 0.04^2 + 0.00^2 + 0.03^2\right]\right\}^{1/2} = 0.0245\ \mathrm{cm}

\hat{\sigma} = s\left(\frac{n}{n-1}\right)^{1/2} = s\left(\frac{5}{4}\right)^{1/2} = 0.0245\ \mathrm{cm}\left(\frac{5}{4}\right)^{1/2} = 0.0274\ \mathrm{cm}

We feel "intuitively" (i.e., on the basis of experience) that by taking the average of the measurements, the result probably will be closer to the mean than would any single measurement chosen at random. Indeed, mathematical statistics tells us that the standard deviation of the means, known to statisticians as the standard error of the mean, of all samples of size n is:

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

Though we lack absolute knowledge of the value of σ, we have our best estimate, \hat{\sigma}. Therefore, to a good approximation, we have:


\hat{\sigma}_{\bar{x}} \approx \frac{\hat{\sigma}}{\sqrt{n}} = \frac{0.0274\ \mathrm{cm}}{\sqrt{5}} = 0.0123\ \mathrm{cm}

\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n-1}} = \frac{0.0245\ \mathrm{cm}}{\sqrt{4}} = 0.0122\ \mathrm{cm}

Note that even though these two sequences of computation are algebraically equivalent, the results differ by 1 in the third significant figure. The reason for this is that intermediate results were rounded off, introducing round-off error. This type of error in statistical computations is discussed further in Sec. S-5.3.

Computation of statistical parameters such as \bar{x}, \hat{\sigma}, and \hat{\sigma}_{\bar{x}} can be tedious, and the results are subject to errors simply because of the length of the procedure. Careful use of the statistical functions available on most modern scientific calculators significantly reduces this problem.
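As a check on the hand computation above, the same quantities can be obtained in a few lines of Python. Note that `statistics.stdev` returns the n − 1 form, i.e. \hat{\sigma}, exactly as the calculator note above warns; carrying full precision through the computation gives 0.0122 cm by both routes, illustrating the round-off point made above.

```python
import statistics

# Mercury-column data from Sec. S-3.1 (cm).
x = [5.31, 5.28, 5.34, 5.30, 5.27]
n = len(x)

xbar = sum(x) / n                                    # sample average, x-bar
s = (sum((xi - xbar) ** 2 for xi in x) / n) ** 0.5   # 1/n form: s
sigma_hat = statistics.stdev(x)                      # 1/(n-1) form: sigma-hat
sem = sigma_hat / n ** 0.5                           # standard error of the mean

print(f"x-bar = {xbar:.2f} cm, s = {s:.4f} cm, "
      f"sigma-hat = {sigma_hat:.4f} cm, sem = {sem:.4f} cm")
```

Only the final results are rounded here; the intermediate values retain full machine precision, in keeping with the rounding rule of Sec. S-1.2.3.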

S-4. Measurements of Linear Relationships

Suppose a functional relationship between x and y exists such that y = f(x, a, b, c, ...), where values of y are measured for fixed, precisely known values of x, and the best estimates of the constants a, b, c, ... are to be determined for the particular relationship. Then if the measured values of y are normally distributed about the values calculated from the mathematical relation, it can be shown that the best estimates of the constants are those for which the sum of the squares of the differences between measured and calculated values of y is a minimum. Note that this criterion is valid only if the random errors are assumed to exist solely in the values of y, and not in those of x. This is known as the "least-squares criterion of fit." It is also possible to obtain standard deviations of the various estimated parameters. Results of application of the method to the equation of a straight line are as follows.

S-4.1. Least-Squares Fit of y = mx + b

This is probably the best known of all least-squares applications. The computational formulas for obtaining the best estimates of the slope and intercept are as follows:

m̂ = (n Σxy − Σx Σy)/D

b̂ = (Σx² Σy − Σx Σxy)/D = ȳ − m̂ x̄,

where D = n Σx² − (Σx)² is the denominator common to both expressions.

The standard error of estimate of the y values relative to the least-squares fitted line is given by:

σ̂_ŷ = [Σ(y − m̂x − b̂)²/(n − 2)]^(1/2)

and the standard errors of estimate of the slope and intercept are given by:

σ̂_m̂ = σ̂_ŷ (n/D)^(1/2)

σ̂_b̂ = σ̂_ŷ (Σx²/D)^(1/2)
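The computational formulas above can be sketched directly in Python. The data points here are invented for illustration (y was generated near 2x + 1 with small deviations); only the standard library is assumed.

```python
# Least-squares fit of y = mx + b using the sum formulas from Sec. S-4.1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]      # invented, precisely known x values
ys = [3.1, 4.9, 7.2, 8.8, 11.0]     # invented measured y values
n = len(xs)

Sx = sum(xs)
Sy = sum(ys)
Sxy = sum(x * y for x, y in zip(xs, ys))
Sx2 = sum(x * x for x in xs)

D = n * Sx2 - Sx ** 2               # the denominator
m_hat = (n * Sxy - Sx * Sy) / D     # best estimate of the slope
b_hat = (Sx2 * Sy - Sx * Sxy) / D   # best estimate of the intercept

# standard error of estimate of the y values about the fitted line
s_y = (sum((y - m_hat * x - b_hat) ** 2
           for x, y in zip(xs, ys)) / (n - 2)) ** 0.5
s_m = s_y * (n / D) ** 0.5          # standard error of the slope
s_b = s_y * (Sx2 / D) ** 0.5        # standard error of the intercept
```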

As with the mean and standard deviation, many scientific calculators and most computer data analysis programs can perform linear (and often non-linear) least-squares fitting.


S-4.2. Proper Application of Least-Squares Fits

The first assumption in the derivation and use of a least-squares fit is simply that the equation of the line has exactly the same mathematical form as the relation between the variables in the actual physical system or process, and that the purpose of using the fitting technique is to obtain the best estimate of the numerical values of the parameters involved.

The second assumption is that the errors in the measured values of the dependent variable (y) are normally distributed, with zero mean, and that these errors are randomly distributed. That is, if a sufficiently large number of data points were available, and if they were divided into groups (clusters) of adjacent points, the means and standard deviations of the errors theoretically should be the same for all groups.

The third assumption is that errors in the measured values of the independent variable (x) are non-existent or, in practical terms, negligible relative to the errors in the measured values of the dependent variable.

If any of the above assumptions is invalid for a particular set of data, then a least-squares fit does not provide the best estimate of the values of the desired parameters. It may give better estimates than any other conveniently available method, but it certainly cannot be considered to have the same reliability as if the above assumptions were valid. There is a sensitive test, to be described below, which should be applied whenever fitting data to a line under conditions such that the validity of the assumptions is not known for certain.

When some or all of the measurements involve replicate determinations of y at each of several values of x, the following considerations should be observed carefully. Unless the standard error of estimate of the y values is exactly the same for each value of x, do not apply the above formulas to the xi's and the corresponding average values of yi.
To do so would result in estimates of the desired parameters that may be considerably poorer than the desired best estimates. (It would violate the second assumption stated above.) The proper procedure is to apply the appropriate formulas to all the measured values individually. Though this may seem to be an excessive amount of computation, it actually involves about the same number of data entry operations on a programmed computer or calculator as it does if the data are first averaged and then subjected to a least-squares fit.

In addition to considering whether the equations are properly applied to the data, it is appropriate to consider the question of whether the least-squares technique should be used at all. Once the fitted values have been obtained for the parameters of the model equation, it is a simple matter to compute the fitted value of y corresponding to each experimental value of x. From this, the deviation of each experimental value of y from its corresponding fitted value, known as the "residual" of y, is obtained by subtracting the latter from the former. If a computer data analysis program is used for the least-squares computations, it is generally a simple matter to calculate the residual values in addition to the fitted values of y. Then a graph is prepared using the fitted value of y as the abscissa and the corresponding residual value (with its algebraic sign) as the ordinate. From a simple examination of such a scatter diagram of residuals, qualitative information about the appropriateness of the fit is obtained. If the field of residuals seems to represent a straight line at an angle to the horizontal axis, as in Fig. 1a, it almost certainly indicates an error either in the least-squares computations or in entering the data into the computations. This is a more sensitive test than a simple plot of the computed line through the field of actual data points, as in Fig. 1b.
Also, if the mean value of the y-residuals (with algebraic sign) is significantly different from zero, it certainly indicates some error in fitting the data. This case is illustrated in Fig. 2.
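These residual checks are simple to carry out numerically. The sketch below uses invented data and deliberately compares a correct fit with a wrong slope (such as would result from a data-entry error), to show how the residual pattern exposes the error.

```python
# Residual diagnostics: for a correct least-squares fit the residuals average
# to ~0 with no trend; for a wrong slope they climb steadily (cf. Fig. 1a)
# and their mean departs from zero (cf. Fig. 2). Data are invented.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

def residuals(m, b):
    """y-residuals (measured minus fitted) for the line y = m*x + b."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]

good = residuals(1.97, 1.09)   # the least-squares values for these data
bad = residuals(1.50, 1.09)    # a wrong slope, e.g. a data-entry error

mean_good = sum(good) / len(good)   # essentially zero
mean_bad = sum(bad) / len(bad)      # clearly nonzero
```

For the correct fit the residuals sum to zero (a property of the least-squares line), while for the wrong slope they increase monotonically with x, which is exactly the tilted pattern of Fig. 1a.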


[Figure 1a: residuals vs. the fitted variable, showing a tilted straight-line trend]
[Figure 2a: residuals vs. the fitted variable, with mean offset from zero]
[Figure 1b: data and fitted line vs. the independent variable]
[Figure 2b: data and fitted line vs. the independent variable]

If the field of points is wider at one side of the diagram and narrower at the other, or if it is wider or narrower in the center than at the sides, the second assumption behind the least-squares method is not valid. This occurs frequently when a nonlinear functional relation between x and y is "linearized" by algebraic manipulation. A common example of this is the case of an exponential function. For example, the vapor pressure of a liquid is an exponential function of the reciprocal of temperature. However, we do not have a convenient least-squares equation for an exponential relation, so we linearize it by taking the logarithm of both sides of the equation. The result is a function of the type

log p = m (1/T) + b

If we let log pi = yi and 1/Ti = xi, we have a general linear equation, and it is a simple matter to apply a least-squares analysis to evaluate m and b. However, it is common experience that the errors in the measured values of pi form a single normal distribution, not the errors in log pi. Stated another way, the least-squares fitting method in this case assumes a single distribution of the relative errors in p, whereas the usual physical situation results in a single distribution of the absolute errors in p. For this reason, the scatter diagram of y-residuals obtained from such an analysis almost invariably is wider for small values of log p than for large values. Thus, the line so obtained is unlikely to be the best possible fit of the data, but it may not be greatly in error, so the linearizing technique is commonly used.

This type of fitting problem is illustrated by Fig. 3, which involves a least-squares fit of calibration data for a thermistor, whose resistance is an exponential function of the reciprocal of Kelvin temperature. In Fig. 3b, ln R is the ordinate and 1/T is the abscissa. Errors in resistance are approximately normally distributed.
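The point that equal absolute errors in p correspond to unequal errors in log p can be illustrated numerically; the pressure values and the ±0.05 error below are invented for illustration.

```python
import math

def log_spread(p, dp):
    """Half-width of the interval in log10(p) produced by an error of +/-dp in p."""
    return (math.log10(p + dp) - math.log10(p - dp)) / 2

# The same absolute error +/-0.05 in p gives a log-p error roughly 100 times
# larger at p = 1 than at p = 100, so the residual band widens at small log p.
small_p_spread = log_spread(1.0, 0.05)     # about 0.022
large_p_spread = log_spread(100.0, 0.05)   # about 0.0002
```

This is why the residual field in such a linearized fit fans out toward small values of log p, as described above.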


[Figure 3a: residuals vs. the fitted variable for the thermistor calibration fit]
[Figure 3b: ln R vs. 1/T data with fitted line]

If the field of points in the graph of residuals seems to represent a curve rather than a horizontal straight line of zero y-deviation, it indicates that the mathematical model does not adequately represent the experimental data. This is almost always the case when the logarithm of vapor pressure is fitted to a linear function of the reciprocal of absolute temperature in careful experimental work. The reason is found by referring to the basis for the log P vs. 1/T relation, i.e., the Clausius-Clapeyron equation. The derivation of the latter equation involves several assumptions that are not adequately correct over a temperature range of more than a few degrees. Thus, if high-quality experimental data are plotted in the appropriate form for the Clausius-Clapeyron equation, they lie on a curve rather than a straight line. The curvature may not be great, however, and probably would not be noticed, but the scatter diagram of y-residuals again provides a very sensitive test for the appropriateness of the model.

As a specific example, if a least-squares analysis is applied to vapor pressure data for water from a handbook, as in Fig. 4, this curvature is very noticeable. Furthermore, the normal boiling point will not be calculated (from the fitting equation) to be 100.0 °C, nor will the calculated heat of vaporization agree with the accepted value. In addition, the actual calculated values of the boiling point and of the heat of vaporization will depend on how many points, and which ones, are used in the computation.

[Figure 4a: residuals vs. the fitted variable for the vapor-pressure fit, showing curvature]
[Figure 4b: log p vs. 1/T data with fitted line]

Finally, it should be reiterated that the test described here is very sensitive, and should not be taken to negate the reasonableness of using approximate models for physical systems. While the graph of residuals serves to identify many situations in which the residuals do not belong to a single distribution (at least approximately), it does not in any way tell us whether the distribution is reasonably normal. For example, the residuals involved in Fig. 5 are normally distributed, but this fact cannot be discerned by simple visual inspection. There are rather sophisticated tests available to answer this question, but they will not be considered here because the question is not usually of serious concern when dealing with analysis of experimental data.


[Figure 5a: residuals vs. the fitted variable (normally distributed residuals)]
[Figure 5b: corresponding data with fitted line]

Whether errors associated with the measurement of x values are non-existent or negligible, as required for a least-squares fit to be the best possible, is usually judged by consideration of the physical system. Thus, the effectiveness of control of the independent variable and the precision of the method used for measuring its value (time, temperature, composition, etc.) will determine the relative precision involved. If this is not already known for a given system or apparatus, it is subject to independent experimental evaluation. Then the magnitude of σ̂_x can be compared with the magnitude of σ̂_y obtained in the least-squares fit of the data. If σ̂_x/x is not less than about one percent of σ̂_y/y, the question of whether the least-squares estimate of the parameters is truly the best estimate becomes pertinent. Though it is possible to obtain an even better fit using the experimentally determined value of σ̂_x, such refinements are somewhat complicated and will not be treated here.

S-5. Quality of Results

By combining estimates of the dispersion of the errors in experimental data with estimates of mean values or of least-squares fitted values, it is possible to deduce additional information about the quality of results. In fact, when dealing with experimental data there probably is no other reasonable justification for performing the calculations to obtain estimates of the standard deviation. In most of the cases presented here, the recommended procedures can be justified by mathematical deduction that is rigorous and indisputably correct, provided the basic assumption of a normal distribution of errors is correct. This basic assumption is justified by the Central Limit Theorem, which is mathematically provable, but absolute proof of the assumption is unavailable because there is no way of deciding how nearly the limiting conditions inherent in the theorem are met by the actual experimental situation. In a few of the cases discussed, no treatment is available which could be proven rigorously. Such cases will be identified, and a justification will be presented for recommending them. The user is generally free to make a choice of how to deal with such situations without fear of being proved wrong (or right).

S-5.1. Rejection of Data

One question that frequently arises in the analysis of sets of measurements regards criteria for rejection of data that seem "out of line" with the remaining data. This must always involve a subjective judgment; there is no absolute, rigorously provable basis for rejection. In fact, an experimental purist might say that a data point must never be rejected unless it is known to have been the result of something faulty. However, if the third measurement in our mercury column example had been 15.34 cm rather than 5.34 cm, few if any persons would object to its rejection. The question then becomes: for practical purposes, at which point do we reject? Many people use one or another of the many rejection criteria that have been proposed, and we shall comment on a few of them.


S-5.1.1. Probability Distributions for Small Sample Size.

All statistical, "objective" criteria for rejection are based on a single concept. A decision is made, a priori, to reject any sample points that have less than some chosen probability of belonging to the same distribution as the rest of the data points. Once the probability level for rejection has been chosen (a subjective matter), statistical theory can be applied objectively to determine which data points meet the established criterion for retention; all other data points are rejected. After such a technique has been applied to the original set of data, the statistical properties of the remaining values are computed, and data rejection is never again applied to the remaining values.

From the central limit theorem it can be deduced that the measurements of a single physical quantity can be described in terms of a statistical distribution which becomes asymptotically (in the limit of infinitely many measurements) indistinguishable from a normal (Gaussian) distribution with x̄ = µ and s = σ. From this, it is easily demonstrated that the statistic Z = (xi − µ)/σ can be described in terms of a standard normal distribution, i.e., a normal distribution with mean 0 and standard deviation 1. As the total area under the standard normal distribution curve is unity, the area under the curve between −Z and +Z is equal to the probability of obtaining an individual value xi within that range. The probability of a data point lying between Z = −1 and Z = +1 is 0.68268, and the probability of it lying between Z = −2 and Z = +2 is 0.95450.

However, you should note carefully one feature of this discussion that is frequently overlooked in the establishment of rejection criteria and in the reporting of confidence intervals for measured data. The statistic Z is stated in terms of σ and µ, for which we do not know the actual values. It is true that we have best estimates, σ̂ for σ and x̄ for µ. However, these are but estimates, and we know only that they are the best we can obtain in the absence of actual knowledge of σ and µ.
We don't even know how good the estimates are. This fact shakes our confidence somewhat in the use of Z, as we have only estimates for its value. A statistician by the name of W.S. Gosset, who published under the pen name of "Student", came to our rescue with the t-distribution. It can be shown that the statistic

t = (xi − x̄)/σ̂

is not distributed according to a standard normal distribution, but rather according to Student's t-distribution, for which extensive tables exist. Unlike the normal distribution, the t-distribution is a function of the sample size.

S-5.1.2. "2σ" Rejection Criterion.

In this approach, it is proposed that any measurement that has less than a 5% chance of belonging to the true sampling universe (i.e., with high probability it is a mistake) be rejected. If x̄ is a reasonable estimate of µ, then any individual measurement lying outside the range x̄ ± 2σ̂ approximately meets this criterion and should be rejected. However, this basis has two serious problems, one for small samples and one for large samples.

For small samples, the range of ±2σ̂ encompasses considerably less than the specified 95% probability range. This shortcoming can be overcome by stating the criterion in terms of the range x̄ ± t0.95 σ̂, where t0.95 is a function of the sample size. The problem with this criterion for large samples is as follows. Suppose we have a sample of 100 measurements, and that it is a true sample of the universe (i.e., no values represent systematic errors, and it has a distribution of the same form as the distribution of the universe). Under these conditions we might expect 5%, or 5, of the measurements to lie outside x̄ ± t0.95 σ̂, and hence they would be rejected mistakenly. It may be just as serious an error to reject a good point as to accept a bad one, so some other basis for decision is needed.

S-5.1.3. Chauvenet's Rejection Criterion.

Chauvenet's criterion of rejection, which seems more reasonable for large samples than does the 2σ or 95% criterion, is that a measurement should be rejected if the probability of obtaining it in a single measurement taken from the sampling universe is less than 1/(2n). Thus, for our sample of size 5, we should reject any value whose probability of occurring is 1/10 or less. This turns out to be any point outside the range x̄ ± 2.13 σ̂. For a sample of 100, the probability required is 0.005 or less, which corresponds to the range x̄ ± 2.82 σ̂.


Values to be used in application of Chauvenet's criterion are given in Sec. S-7.3. Note that the values in the first column are those to be used with an average value or with a fitted function that involves only one estimated parameter. Values in the second column are to be used with a linear least-squares fit (two parameters). Values in the third column are to be used with (least-squares) fitted functions that involve three fitted parameters, such as a, b, and c in y = a + bx + cx². Tables usually published for this purpose are incorrect in terms of the basic statement of Chauvenet's criterion: they are based on the standard normal distribution rather than on Student's t-distribution. The table in Sec. S-7.3 is correct in this respect. In application of Chauvenet's criterion, values of tc from the table in Sec. S-7.3 are used in conjunction with the estimate of the universe standard deviation (for single-parameter cases) or with the standard error of estimate of the y values (for two- and three-parameter cases). Any values whose residuals lie outside the range x̄ ± tc σ̂ or ± tc σ̂_ŷ are rejected.

S-5.1.4. Q-Test Rejection Criterion.

Even Chauvenet's criterion as it is usually applied has a difficulty that is overcome in part by the Q-test method. This method is widely promulgated in textbooks of analytical chemistry, but it, too, has a minor flaw (which is corrected here).

In the application of Chauvenet's criterion, the estimates of the mean and the standard deviation are computed for the entire sample. The critical t-value calculated from these results is then based on the assumption that all the data points belong to the same normal distribution. Those whose probabilities of belonging to that distribution are too low are rejected as not belonging to the very distribution the computations assumed they belonged to. The internal inconsistency should be evident here, even though it has been ignored in textbook treatments of the subject. For large enough samples, the errors introduced by including points in the computations which perhaps don't belong there are small enough not to be important. Also, the logical flaw results in a conservative criterion: points are less likely to be rejected falsely than even the criterion implies. For small sample sizes, though, this defect becomes serious. For example, for samples of fewer than seven values, application of Chauvenet's criterion as described above will never result in the rejection of any outlying data points, even if they are infinitely far removed from the mean value.

This problem could be overcome in the following way. First, tentatively eliminate the outlying data point(s), compute the statistics of the remaining sample, and then apply Chauvenet's criterion to determine whether the outliers should indeed be rejected. If a sample is large, there may be more than one possible outlier, and the computations should be applied to all combinations of possible outliers. The computational work quickly becomes prohibitive in such a technique, so it is not used for large samples. For small samples, the Q-test has been widely used.
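The two-step remedy just described can be sketched as follows, using the mercury readings with the hypothetical bad value 15.34 cm from the earlier example. The multiplier tc = 2.13 is the value the text quotes for n = 5 with one parameter; strictly, the reduced sample calls for its own entry from the table in Sec. S-7.3, so this is illustrative only.

```python
# Chauvenet's criterion: naive single-pass application vs. the two-step remedy.
data = [5.31, 5.28, 15.34, 5.30, 5.27]   # mercury example with a bad point (cm)
n = len(data)
t_c = 2.13   # illustrative value quoted in the text for n = 5, one parameter

# Naive application: statistics computed with the suspect point included.
# For n < 7 no point can ever lie outside mean +/- t_c*sigma, so nothing
# is rejected, exactly the small-sample defect described above.
mean_all = sum(data) / n
sig_all = (sum((x - mean_all) ** 2 for x in data) / (n - 1)) ** 0.5
naive_reject = abs(15.34 - mean_all) > t_c * sig_all   # False

# Remedy: tentatively drop the suspect, recompute, then test the suspect.
rest = [x for x in data if x != 15.34]
mean = sum(rest) / len(rest)
sigma_hat = (sum((x - mean) ** 2 for x in rest) / (len(rest) - 1)) ** 0.5
reject = abs(15.34 - mean) > t_c * sigma_hat           # True: 15.34 is rejected
```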
Its basis is as follows. First, assume that no more than one data point is likely to be an outlier in any given sample. Then reject an outlier if its probability (using the t-distribution) of being a member of the same distribution as the remaining values is less than 10%. To simplify computations, the usual Q-test tables use the range of the values of the data (other than the outliers) as a means of estimating the universe standard deviation, rather than using the sample standard deviation for this purpose. This approach has been well justified by extensive studies of the relationship between the range and the universe standard deviation. The Q statistic is computed in the following way. First, arrange all the data points (or residuals, in the case of a 2- or 3-parameter fitted function) in order of increasing value. Assign serially ordered indices (from 1 to n, for n values) to these ordered values. We then compute

Q1 = (x2 − x1)/(xn − x1)   and   Qn = (xn − x(n−1))/(xn − x1)

If either of these values exceeds the critical value of Q for n points (as given in Table S-7.2) then point number 1, point number n, or possibly both, may be rejected as being an outlier. The 90% retention level of the Q-test suffers from the same problems as does the 2σ criterion. Thus, it is expected that for every sample of ten values, one value is likely to be rejected by the Q-test even though that value may be a legitimate member of the sample universe. The test is, according to Chauvenet's criterion, overly conservative for samples of fewer than five points (rejection especially unlikely). However, our intuitive confidence in statistics, even in the t-distribution, is not good for very small samples, so this probably is not a serious criticism.
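A sketch of the Q-test on the mercury example with the hypothetical bad value 15.34 cm. The critical value 0.64 for n = 5 at the 90% level is the commonly tabulated Dixon figure and is an assumption here; in practice the value should come from the document's own Table S-7.2.

```python
# Q-test for a single suspected outlier (data from the text's example,
# with the hypothetical bad reading 15.34 cm).
data = sorted([5.31, 5.28, 15.34, 5.30, 5.27])
spread = data[-1] - data[0]

Q1 = (data[1] - data[0]) / spread     # test for a low outlier
Qn = (data[-1] - data[-2]) / spread   # test for a high outlier

Q_crit = 0.64   # assumption: common 90%-level value for n = 5 (Dixon)
low_rejected = Q1 > Q_crit            # False
high_rejected = Qn > Q_crit           # True: 15.34 is rejected
```

Here Qn = (15.34 − 5.31)/(15.34 − 5.27) ≈ 0.996, far above the critical value, so the bad point is rejected; the low extreme is retained.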


S-5.1.5. Summary of Rejection Criteria.

For your work, use the Q-test in conjunction with Sec. S-7.2 to decide whether to reject values when there are not more than ten values in all. In all other cases, use Chauvenet's method in conjunction with Sec. S-7.3. If a blank appears in the table of Sec. S-7.2 for your situation, never reject any data. Regardless of which criterion is used for rejection of data, the remaining values should be used to recalculate the desired parameters, but a rejection criterion should never be applied a second time to any given set of measurements.

S-5.2. Confidence Intervals

The publication of confidence limits with experimental values has become a common procedure and is familiar to most chemists. In the following paragraphs the statistical basis for evaluation of confidence intervals will be presented. In the process it will be seen that many (probably most) published confidence limits are either incorrect or misleading. It requires almost no additional effort to evaluate the limits correctly, so the procedure will be described.

When experimental data are reported, frequently the mean value (or fitted value) is given together with the value of the "standard deviation". Unfortunately, the meaning of "standard deviation" is not always made clear, so there is no way to evaluate it. Sometimes, though probably seldom, it refers to s, the standard deviation of the sample. More commonly it refers either to σ̂, the best estimate of the universe standard deviation, or to σ̂_x̄, the standard error of the mean. Sometimes the results are reported as x̄ ± σ or as x̄ ± 2σ and are identified as 68% or 95% confidence limits, respectively. Now let us examine the basis for such an identification and establish the mathematically correct way of computing confidence limits.

It can be shown rigorously that the average values of groups of measurements of a single quantity can be described in terms of a statistical distribution which becomes asymptotically indistinguishable from a normal distribution with mean µ and standard deviation σ_x̄. From this it is demonstrable that the statistic Zm = (x̄ − µ)/σ_x̄ can be described in terms of a standard normal distribution. We then define Zω as the value of Z such that the area under the normal curve between −Zω and +Zω has the value ω. From this it can be proven that

µ = x̄ ± Zω σ̂_x̄

with probability ω, or with percent confidence of 100ω. If Zω = 1, the confidence is 68%; if Zω = 2, the confidence is 95%; and if Zω = 3, the confidence is about 99%.

As with the rejection of data, we note that x̄ and σ̂_x̄ are only estimates of the true mean and standard deviation of the means (thus σ̂_x̄ is known as the standard error of estimate). This problem, as before, is handled by using t values rather than Z values in the computation, i.e.,

µ = x̄ ± tω σ̂_x̄

This is the only correct form for description of confidence intervals for mean values, as it accounts for the uncertainty in the value of σ̂_x̄. The value of tω depends on the number of data points obtained; the more points, the more confidence we have in the estimate σ̂. Values of tω are given in the table in Sec. S-7.1 for ω = 0.95, i.e., for determining 95% confidence limits. The value ω = 0.95 seems to be evolving as a "standard" basis for reporting data. Thus, to determine the value of tω specifying 95% confidence limits for a sample of 15 measurements, we enter the table under the column for 1 parameter and read, opposite the value 15 in the data-point column, tω = 2.145.

To illustrate the application of confidence limits we turn to our earlier example of the height of a column of mercury. In that case we found x̄ = 5.30 cm and σ̂_x̄ = 0.0123 cm.
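The confidence-limit computation for this example can be sketched as follows, with t0.95 = 2.776 taken from the text for five data points. (Working from the unrounded standard error gives 0.0340 cm rather than the 0.0341 cm obtained below from the rounded value 0.0123 cm, an instance of the round-off effects discussed in Sec. S-5.3.)

```python
# 95% confidence interval for the mean of the mercury-column readings.
x = [5.31, 5.28, 5.34, 5.30, 5.27]   # heights in cm
n = len(x)

mean = sum(x) / n
sigma_hat = (sum((xi - mean) ** 2 for xi in x) / (n - 1)) ** 0.5
sem = sigma_hat / n ** 0.5   # standard error of the mean, ~0.0122 cm
t95 = 2.776                  # from Sec. S-7.1 for 5 data points, 1 parameter

half_width = t95 * sem       # ~0.034 cm; report 5.300 +/- 0.034 cm
```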

As there were five data points in the example, from the table we find that t0.95 = 2.776, so that the confidence range on each side of the mean value is 2.776 × 0.0123 cm = 0.0341 cm. Thus, we state the 95% confidence limit for the measurement as 5.300 ± 0.034 cm. (The use of significant figures in reporting confidence intervals will be considered later.) Note that if we had used the Z-distribution rather than the t-distribution, the interval would have been given as 5.300 ± 0.024 cm. The difference between the two values is marked, and the latter value is in error. As tables of the t-distribution appear in most statistics books as well as here, it seems pointless not to use them in reporting experimental results.

Note also that both σ̂_x̄ and tω decrease as the number of data points increases, giving narrower confidence intervals, as is intuitively expected. This is why results are more reliable for larger samples (other things being equal). However, since the work required in obtaining more sample points increases approximately as the square of the improvement in σ̂_x̄, a condition of diminishing returns is involved.

Confidence intervals can also be constructed for the parameters arising from a least-squares fit. The procedure for developing them is rather complicated, so only the results will be given here, and those only for the general linear relationship. In obtaining tω for use in this case, remember that two parameters are obtained in the fit. Then it can be shown that

m = m̂ ± tω σ̂_m̂ = m̂ ± tω σ̂_ŷ (n/D)^(1/2)

b = b̂ ± tω σ̂_b̂ = b̂ ± tω σ̂_ŷ (Σx²/D)^(1/2)

Note that in both these expressions the confidence interval becomes smaller not only with increasing number of data points, as would be expected, but also with increasing range of x values (as indicated by σ̂_x), as also seems reasonable.

S-5.3. Significant Figures and Rounding Errors

It is critically important that in all statistical computations, no values be rounded off any more than dictated by the limitations of the computing equipment until the computations are completed. In line with this, computations done in FORTRAN, Pascal, C, or another programming language should be performed with double-precision arithmetic (16 significant figures rather than the usual eight). Statistical computations frequently involve small differences between very large numbers, and round-off errors can affect the results seriously, whether numbers are rounded off by limitations of computing devices or by scientists who apply rules of significant figures that should not be applied in statistical computations.

Conceptually, it may be said that a statistical analysis is performed to determine the properties of a hypothetical set of numbers of which the numerical values of the data constitute a subset, presumably a representative subset. At this point, the fact that the numbers correspond to physical measurements of limited precision is of no consequence whatsoever. Thus, for purposes of the statistical analysis all data are considered to be known to an infinite number of significant figures, i.e., an endless string of zeros is assumed to follow the last non-zero digit recorded. (Thus, if we compute the mean of the numbers 5 and 6, it is 5.5, not 5 or 6.) Once the computations have been completed, the results may be rounded off in any way desired.

An illustration of the effect of round-off errors in statistical computations was seen in Sec. S-3.2. That example showed that rounding of intermediate results can change the final value, and since the actual effect of rounding off in any particular case can be determined only by doing the computation both ways, the computation should always be done with a minimum of rounding off.
At this point it is appropriate to discuss the question of significant figures in reporting final results. It is sometimes stated that confidence-interval values should be reported to only one significant figure, and that the mean or fitted values should be reported in a way that is consistent with the interval values. In the case of the mercury column example, the 95% confidence limits rounded to one, two, and three significant figures are

x = 5.30 ± 0.03 cm

x = 5.300 ± 0.034 cm

x = 5.3000 ± 0.0341 cm

As the three-significant-figure form is the most nearly correct of these three, it is seen that rounding off to one significant figure understates the breadth of the confidence interval by 12%, giving the impression that the results are somewhat better than is actually the case. In fact, rounding off to one significant figure can introduce errors ranging up to 50%, so it seems appropriate to use at least two significant figures. Rounding off to two significant figures introduces errors ranging only up to about 5%, and for most purposes this is likely adequate. In any case, the last significant figure retained in the estimated value itself must be in the same decimal position as the last significant figure retained in the confidence limits. This rule applies whether the last digit is zero or non-zero.

For rounding off values, observe the following rules. If the leftmost digit of those to be eliminated by rounding is less than 5, the last retained digit is left unchanged. If the leftmost digit to be eliminated is 5 or greater, the last retained digit is increased by one. To illustrate, we round each of four different values to three significant figures.

1.5550000 becomes 1.56
1.5650000 becomes 1.57
1.5650001 becomes 1.57
1.5549999 becomes 1.55
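These rules amount to "round half up" at the cut point. As a sketch (the helper name is my own), Python's decimal module can reproduce them exactly; note that the built-in round() uses round-half-to-even on binary floats and can disagree on values like 1.5650000.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig_figs(value, figs):
    """Round to `figs` significant figures with halves rounded up,
    per the rule stated above."""
    d = Decimal(str(value))
    if d == 0:
        return 0.0
    # position of the last retained digit relative to the decimal point
    places = figs - d.adjusted() - 1
    quantum = Decimal(1).scaleb(-places)
    return float(d.quantize(quantum, rounding=ROUND_HALF_UP))

print(round_sig_figs(1.5550000, 3))  # 1.56
print(round_sig_figs(1.5650000, 3))  # 1.57
print(round_sig_figs(1.5650001, 3))  # 1.57
print(round_sig_figs(1.5549999, 3))  # 1.55
```

Going through Decimal(str(value)) is deliberate: it rounds at the decimal representation the experimenter wrote down, not at the nearest binary float.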

S-6. Results Derived from Measured Quantities

S-6.1. Error Propagation

It is common in scientific work to compute the value of a function, e.g. f(x, y), from independently measured values, e.g. x and y, each of which has a certain degree of uncertainty attached. The uncertainties, which we shall denote εx, εy, etc., may be standard errors of estimate, or they may be stated or estimated uncertainties or tolerance limits. There will clearly also be some degree of uncertainty in the computed value of f, and it is desirable to estimate this uncertainty, which results from "propagating" the errors through the computation of f, with the maximum possible degree of precision. If εf is estimated too small, a higher degree of precision is implied than is justified. If εf is estimated too large, the precision of f is understated, and its value may not be accorded the confidence it deserves. Without offering any proof, or even a plausibility argument, we state that the desired computation of the uncertainty in f(x1, x2, …, xn) is according to the following equation:

εf = [ Σi (∂f/∂xi)² εxi² ]^(1/2)
In elementary science courses, students are commonly instructed to use the relationships that the "error" in a sum or difference is the sum of the "errors", and that the relative "error" in a product or quotient is the sum of the relative "errors". The usual rules for significant figures in computations are derived from this. That is, when adding or subtracting numbers, discard figures to the right of the last significant figure retained in the least precise value involved in the calculation. This assumes the error to be ±1 in the last retained figure, or ±5 in the first omitted figure, or something similar. Similarly, in multiplying or dividing, retain a number of significant figures equal to that of the least precise value used in the computation.

Though these rules are but crude approximations to the propagated-error rule stated at the beginning of this section, they are satisfactory for computations in beginning science classes. However, from the more correct relationship for computing propagated errors, we find that the error in a sum or difference is the square root of the sum of squares of the errors of the individual quantities, and the relative error of a product or quotient is the square root of the sum of squares of the relative errors of the individual quantities. If all the errors (or relative errors, as the case may be) in a given case are equal, the simpler estimates are in error by about 40%; they always overestimate the magnitude of the true "error", and hence they should never be applied to results of high-quality experimental work. The purpose of statistical analysis of experimental data is to obtain the maximum information from the data, and use of error estimates that are too large negates much of the purpose of the calculations. To perform meaningless computations is little better than to perform no computations.
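The 40% figure is easy to verify for the case of two quantities with equal errors ε: the elementary rule gives 2ε, the square-root rule gives √2·ε, and 2/√2 = √2 ≈ 1.41.

```python
import math

eps = 0.01                        # equal error in each of two quantities
naive = eps + eps                 # elementary rule: add the errors
rss = math.sqrt(eps**2 + eps**2)  # propagated-error rule: root sum of squares
print(naive / rss - 1)            # about 0.414, i.e. a 41% overestimate
```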

S-6.2. Estimates of Precision

Whenever you determine a property of a system by a method that involves several different measurements, whether under the same or different conditions, you will obtain a numerical evaluation of the precision of your work in the form of the 95% confidence interval of the result. In addition, however, you are to estimate the precision to be expected of the method. If the experimentally observed precision is very much poorer than the expected precision, try to find the reason for the discrepancy.


In the calculation of expected precision, apply the appropriate propagated-error formulas to the uncertainties in reading the various types of quantities involved. For volumes and masses, you may use the uncertainties given in Secs. S-7.4 and S-7.5. For other types of measurements, you will have to estimate the uncertainties. For example, in timing the efflux period in a viscometer, your uncertainty would include your own reflex time in operating the switch as well as the uncertainty in reading the timer. In using electric meters of any type, the uncertainty would include an estimate of the fraction of a scale division that you are confident you can always read to the same value, or the range of observed needle fluctuations, whichever is larger. The same applies to thermometer readings, in which the mercury height may be seen to fluctuate if viewed with a magnifier.

As an example of the calculation of expected precision, consider the titration of potassium hydrogen phthalate (KHP) with a dilute solution of NaOH to standardize the latter. Assume the following values are obtained:

Volume of solution = approx. 40.00 mL
Mass of KHP = approx. 0.816 g
M.W. of KHP = 204.2 g/mol

Though the value of the molecular weight has an uncertainty arising from uncertainties in the atomic weights, this contributes only to the accuracy of the results, not to their precision. From Sec. S-7.4, the estimated precision of reading a 50 mL buret is 0.025 mL, but a volume measurement involves two readings, so the uncertainties combine as √2 × 0.025 mL ≈ 0.035 mL and the volume is stated as

V = (40.000 ± 0.035) mL

From Sec. S-7.5, the measured precision of a weighing on the analytical balance is 0.00057 g, but a mass measurement likewise involves two readings (√2 × 0.00057 g ≈ 0.00081 g), so the mass of KHP is stated as

m = (0.81600 ± 0.00081) g

The molar concentration is then found to be

C = 1000m/(MV)
  = [1000 × (0.81600 ± 0.00081)] / [204.2 × (40.000 ± 0.035)]
  = (0.09990 ± 0.00013) mol/L

If, as a result of several titrations, you obtain a 95% confidence interval for the concentration of the base that is much larger than ±0.00013 mol/L, you should examine your titration technique. Perhaps you are unable to determine the endpoint reproducibly enough, but at least the computed precision tells you what you should be able to accomplish with good laboratory technique.
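The arithmetic of this expected-precision estimate is simply the relative-error rule for a product and quotient; restated as a sketch (variable names are my own):

```python
import math

m, dm = 0.81600, 0.00081   # mass of KHP and its uncertainty, g
V, dV = 40.000, 0.035      # volume of NaOH and its uncertainty, mL
M = 204.2                  # molecular weight, g/mol (exact for precision purposes)

# C = 1000 m / (M V); relative errors of m and V combine in quadrature
C = 1000 * m / (M * V)
dC = C * math.sqrt((dm / m) ** 2 + (dV / V) ** 2)
print(f"C = ({C:.5f} ± {dC:.5f}) mol/L")  # C = (0.09990 ± 0.00013) mol/L
```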

S-6.3. Estimates of Accuracy

Estimation of the accuracy of a result is done in a manner similar to that of precision, except that now we include uncertainties in molecular weights, constants, and conversion factors, as well as tolerance limits (accuracy) rather than reproducibility (precision) of quantitative apparatus. Using the titration example again, suppose the actual measured quantities were

V = 39.97 mL

m = 0.8154 g

From Sec. S-7.4, the stated tolerance of a 50 mL buret is 0.05 mL, so the value would be stated as V = (39.970 ± 0.050) mL.

We do not have a table of tolerances for the masses obtained from the balance; such a table would be cumbersome. However, assume the accuracy of weighing to be about the same as the precision in this case. Thus,

m = (0.81540 ± 0.00081) g


Referring to Sec. S-7.6, we obtain the molecular weight of KHP (KC8H5O4):

M = 1 × (39.098 ± 0.003) + 8 × (12.011 ± 0.001) + 5 × (1.0079 ± 0.0001) + 4 × (15.9994 ± 0.0003)
  = (39.098 ± 0.003) + (96.088 ± 0.008) + (5.0395 ± 0.0005) + (63.9976 ± 0.0012)
  = (204.2231 ± 0.0086) g/mol

Further, the KHP bottle carries an assay value of 99.99%, so the effective molecular weight is

M = (204.2231 ± 0.0086)/0.9999 = (204.2435 ± 0.0086) g/mol

Finally,

C = 1000m/(MV)
  = [1000 × (0.81540 ± 0.00081)] / [(204.2435 ± 0.0086) × (39.970 ± 0.050)]
  = (0.09988 ± 0.00016) mol/L

In the process of calculating the final concentration, 95% confidence limits would be obtained, as titrations are always done at least in duplicate or triplicate. The reported value should indicate an uncertainty that is the greater of the estimated uncertainty (as above) or the 95% confidence limits. Thus, if your precision is better than some of the uncertainties in calibration, the systematic error is likely larger than the random error, and the final value can be no better than the certainty with which the calibration values, molecular weights, constants, conversion factors, etc., are known. Alternatively, if your precision is poorer than the computed uncertainties, there is no reason to assume that the true value lies outside your 95% confidence range with probability greater than 0.05.

Finally, if your result is a property of a system for which a "literature value" can be located for comparison, you must make that comparison. Not to do so is very unscientific; it borders on dishonesty. If the literature value does not lie within your final range of uncertainty, you must examine possible causes of, and means of correcting, the discrepancy.
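The full accuracy chain, from atomic weights through the assay correction to the final concentration, can be sketched as follows (variable names are my own; the numbers are those used above):

```python
import math

# KHP is KC8H5O4; atomic weights and uncertainties from Sec. S-7.6
atoms = [(1, 39.098, 0.003),     # K
         (8, 12.011, 0.001),     # C
         (5, 1.0079, 0.0001),    # H
         (4, 15.9994, 0.0003)]   # O

Mw = sum(k * w for k, w, u in atoms)                    # 204.2231 g/mol
dM = math.sqrt(sum((k * u) ** 2 for k, w, u in atoms))  # about 0.0086 g/mol
Meff = Mw / 0.9999                                      # 99.99% assay correction

m, dm = 0.8154, 0.00081    # mass and its accuracy estimate, g
V, dV = 39.970, 0.050      # volume and its tolerance, mL

C = 1000 * m / (Meff * V)
dC = C * math.sqrt((dm / m) ** 2 + (dM / Meff) ** 2 + (dV / V) ** 2)
print(f"C = ({C:.5f} ± {dC:.5f}) mol/L")  # C = (0.09988 ± 0.00016) mol/L
```

Notice that the molecular-weight uncertainty contributes almost nothing here; the buret tolerance and the weighing dominate the accuracy estimate.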


S-7. Tables for Statistical Treatment of Data

S-7.1. Values of t for 95% Confidence Intervals

Data      Number of Parameters
Points       1        2        3
   2      12.706      —        —
   3       4.303   12.706      —
   4       3.182    4.303   12.706
   5       2.776    3.182    4.303
   6       2.571    2.776    3.182
   7       2.447    2.571    2.776
   8       2.365    2.447    2.571
   9       2.306    2.365    2.447
  10       2.262    2.306    2.365
  11       2.228    2.262    2.306
  12       2.201    2.228    2.262
  13       2.179    2.201    2.228
  14       2.160    2.179    2.201
  15       2.145    2.160    2.179
  16       2.131    2.145    2.160
  17       2.120    2.131    2.145
  18       2.110    2.120    2.131
  19       2.101    2.110    2.120
  20       2.093    2.101    2.110
  21       2.086    2.093    2.101
  22       2.080    2.086    2.093
  23       2.074    2.080    2.086
  24       2.069    2.074    2.080
  25       2.064    2.069    2.074
  26       2.060    2.064    2.069
  27       2.056    2.060    2.064
  28       2.052    2.056    2.060
  29       2.048    2.052    2.056
  30       2.045    2.048    2.052
  31       2.042    2.045    2.048
  32       2.040    2.042    2.045
  33       2.037    2.040    2.042
  34       2.035    2.037    2.040
  35       2.032    2.035    2.037
  36       2.030    2.032    2.035
  37       2.028    2.030    2.032
  38       2.026    2.028    2.030
  39       2.024    2.026    2.028
  40       2.023    2.024    2.026
  41       2.021    2.023    2.024
  42       2.020    2.021    2.023
  43       2.018    2.020    2.021
  44       2.017    2.018    2.020
  45       2.015    2.017    2.018
  46       2.014    2.015    2.017
  47       2.013    2.014    2.015
  48       2.012    2.013    2.014
  49       2.011    2.012    2.013
  50       2.010    2.011    2.012
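As a sketch of how the table is used for repeated measurements of a single quantity (one fitted parameter, the mean), with the standard-error formula of Sec. S-7.8.4 (the function name and the small subset of table values are my own selection):

```python
import math

# t values for 1 parameter, copied from the table above (n data points: t)
T95 = {2: 12.706, 3: 4.303, 4: 3.182, 5: 2.776, 6: 2.571}

def confidence_interval_95(data):
    """Return (mean, half-width) of the 95% CI: mean ± t·s/√(n−1),
    with s the (1/n)-definition sample standard deviation."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return mean, T95[n] * s / math.sqrt(n - 1)

mean, half = confidence_interval_95([5.1, 5.3, 5.2, 5.4, 5.0])
print(f"{mean:.2f} ± {half:.2f}")  # 5.20 ± 0.20
```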


S-7.2. Values of Q for Data Rejection

1. Arrange the values to be tested in order of increasing value.
2. Assign ordinal indices to the values, i.e., x1, x2, ..., xn.
3. Compute

   Q1 = (x2 − x1)/(xn − x1)
   Qn = (xn − xn−1)/(xn − x1)

4. If either Q1 or Qn exceeds the value of Q in the table below, reject x1 or xn, respectively.

Data      Number of Parameters
Points       1        2        3
   3       0.94      —        —
   4       0.76     0.94      —
   5       0.64     0.76     0.94
   6       0.56     0.64     0.76
   7       0.51     0.56     0.64
   8       0.47     0.51     0.56
   9       0.44     0.47     0.51
  10       0.41     0.44     0.47
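The steps above translate directly into code. A minimal sketch using the 1-parameter column of the table (the function name is my own), rejecting at most one suspect endpoint:

```python
# Q values for data rejection, 1-parameter column of the table above
Q_CRIT = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56,
          7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}

def q_test(data):
    """Steps 1-4 above: sort, compute Q1 and Qn, and reject the
    low or high endpoint if its Q exceeds the critical value."""
    x = sorted(data)
    n = len(x)
    spread = x[-1] - x[0]
    q1 = (x[1] - x[0]) / spread
    qn = (x[-1] - x[-2]) / spread
    if qn >= q1 and qn > Q_CRIT[n]:
        return x[:-1]          # reject the high value xn
    if q1 > Q_CRIT[n]:
        return x[1:]           # reject the low value x1
    return x

print(q_test([5.1, 5.2, 5.3, 6.1]))  # [5.1, 5.2, 5.3]: 6.1 is rejected
print(q_test([5.1, 5.2, 5.3, 5.4]))  # nothing rejected
```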


S-7.3. Values of tc for Data Rejection -- Chauvenet's Criterion

Data      Number of Parameters
Points       1        2        3
  11       2.284    2.320    2.367
  12       2.305    2.335    2.374
  13       2.324    2.350    2.382
  14       2.343    2.365    2.392
  15       2.360    2.380    2.403
  16       2.376    2.394    2.414
  17       2.392    2.407    2.425
  18       2.406    2.420    2.436
  19       2.420    2.433    2.447
  20       2.433    2.445    2.458
  21       2.446    2.457    2.469
  22       2.458    2.468    2.479
  23       2.470    2.479    2.489
  24       2.481    2.490    2.499
  25       2.492    2.500    2.508
  26       2.503    2.510    2.518
  27       2.513    2.519    2.527
  28       2.522    2.529    2.535
  29       2.532    2.538    2.544
  30       2.546    2.546    2.552
  31       2.550    2.555    2.561
  32       2.558    2.563    2.569
  33       2.567    2.571    2.576
  34       2.575    2.579    2.584
  35       2.583    2.587    2.591
  36       2.590    2.594    2.598
  37       2.598    2.601    2.606
  38       2.605    2.609    2.612
  39       2.612    2.615    2.619
  40       2.619    2.622    2.626
  41       2.626    2.629    2.632
  42       2.632    2.635    2.638
  43       2.639    2.642    2.645
  44       2.645    2.648    2.651
  45       2.651    2.654    2.657
  46       2.657    2.660    2.662
  47       2.663    2.665    2.668
  48       2.669    2.671    2.674
  49       2.674    2.677    2.679
  50       2.680    2.682    2.685


S-7.4. Precision and Accuracy of Volumetric Glassware

Item                      Accuracy,        Accuracy,    Precision
(Total Capacity)          Class A (NBS)a   Other        (Estimated)
Burets
  100 mL                  0.10 mL          0.20 mL      0.05 mL
   50 mL                  0.05 mL          0.10 mL      0.025 mL
   25 mL                  0.03 mL          0.06 mL      0.015 mL
   10 mL                  0.02 mL          0.04 mL      0.01 mL
Measuring Pipets
   10 mL                  0.03 mL          0.06 mL      0.06 mL
    5 mL                  0.02 mL          0.04 mL      0.04 mL
    2 mL                  0.01 mL          0.02 mL      0.02 mL
Transfer Pipets
  100 mL                  0.08 mL          0.16 mL      0.16 mL
   50 mL                  0.05 mL          0.1 mL       0.1 mL
   25 mL                  0.025 mL         0.05 mL      0.05 mL
   10 mL                  0.02 mL          0.04 mL      0.04 mL
    5 mL                  0.01 mL          0.02 mL      0.02 mL
    2 mL                  0.006 mL         0.012 mL     0.012 mL
Volumetric Flasks, TC
 2000 mL                  0.5 mL           1.0 mL       1.0 mL
 1000 mL                  0.3 mL           0.6 mL       0.6 mL
  500 mL                  0.2 mL           0.4 mL       0.4 mL
  250 mL                  0.1 mL           0.2 mL       0.2 mL
  100 mL                  0.08 mL          0.16 mL      0.16 mL
   50 mL                  0.05 mL          0.1 mL       0.1 mL
   25 mL                  0.03 mL          0.06 mL      0.06 mL
   10 mL                  0.02 mL          0.04 mL      0.04 mL
    5 mL                  0.02 mL          0.04 mL      0.04 mL
Volumetric Flasks, TD
 2000 mL                  1.0 mL           2.0 mL       2.0 mL
 1000 mL                  0.6 mL           1.2 mL       1.2 mL
  500 mL                  0.4 mL           0.8 mL       0.8 mL
  250 mL                  0.2 mL           0.4 mL       0.4 mL
  100 mL                  0.16 mL          0.32 mL      0.32 mL
   50 mL                  0.1 mL           0.2 mL       0.2 mL
   25 mL                  0.06 mL          0.12 mL      0.12 mL
   10 mL                  0.04 mL          0.08 mL      0.08 mL
    5 mL                  0.04 mL          0.08 mL      0.08 mL

a NBS Tolerance. SOURCE: NBS Circular 602, 1957.

S-7.5. Measured Precision of Laboratory Balances

Triple-beam Platform Balance    ±0.28 g
Chainomatic Balance             ±0.0021 g
Single-pan Analytical Balance   ±0.00057 g


S-7.6. Table of Atomic Weights with Uncertainties

Name           Symbol   Atomic #   Atomic Weight   Uncertainty
Aluminum       Al          13        26.98154        0.00001
Antimony       Sb          51       121.75           0.03
Argon          Ar          18        39.948          0.001
Arsenic        As          33        74.9216         0.0001
Barium         Ba          56       137.34           0.03
Beryllium      Be           4         9.01218        0.00001
Bismuth        Bi          83       208.9808         0.0001
Boron          B            5        10.81           0.01
Bromine        Br          35        79.904          0.001
Cadmium        Cd          48       112.40           0.01
Calcium        Ca          20        40.08           0.01
Carbon         C            6        12.011          0.001
Cerium         Ce          58       140.12           0.01
Cesium         Cs          55       132.9054         0.0001
Chlorine       Cl          17        35.453          0.001
Chromium       Cr          24        51.996          0.001
Cobalt         Co          27        58.9332         0.0001
Copper         Cu          29        63.546          0.001
Dysprosium     Dy          66       162.50           0.01
Erbium         Er          68       167.26           0.01
Europium       Eu          63       151.96           0.01
Fluorine       F            9        18.99840        0.00001
Gadolinium     Gd          64       157.25           0.03
Gallium        Ga          31        69.72           0.01
Germanium      Ge          32        72.59           0.03
Gold           Au          79       196.9665         0.0001
Hafnium        Hf          72       178.49           0.03
Helium         He           2         4.00260        0.00001
Holmium        Ho          67       164.9304         0.0001
Hydrogen       H            1         1.0079         0.0001
Indium         In          49       114.82           0.01
Iodine         I           53       126.9045         0.001
Iridium        Ir          77       192.22           0.03
Iron           Fe          26        55.847          0.003
Krypton        Kr          36        83.80           0.01
Lanthanum      La          57       138.9055         0.0003
Lead           Pb          82       207.2            0.1
Lithium        Li           3         6.941          0.001
Lutetium       Lu          71       174.97           0.01
Magnesium      Mg          12        24.305          0.001
Manganese      Mn          25        54.9380         0.0001
Mercury        Hg          80       200.59           0.03
Molybdenum     Mo          42        95.94           0.03
Neodymium      Nd          60       144.24           0.01
Neon           Ne          10        20.170          0.003
Neptunium      Np          93       237.0482         0.0001
Nickel         Ni          28        58.71           0.03
Niobium        Nb          41        92.9064         0.0001
Nitrogen       N            7        14.0067         0.0001
Osmium         Os          76       190.2            0.1
Oxygen         O            8        15.9994         0.0003
Palladium      Pd          46       106.4            0.1
Phosphorus     P           15        30.97376        0.00001
Platinum       Pt          78       195.09           0.03
Potassium      K           19        39.098          0.003
Praseodymium   Pr          59       140.9077         0.0003
Protactinium   Pa          91       231.0359         0.0001
Radium         Ra          88       226.0254         0.0001
Rhenium        Re          75       186.2            0.1
Rhodium        Rh          45       102.9055         0.0001
Rubidium       Rb          37        85.4678         0.0003
Ruthenium      Ru          44       101.07           0.03
Samarium       Sm          62       150.4            0.1
Scandium       Sc          21        44.9559         0.0001
Selenium       Se          34        78.96           0.03
Silicon        Si          14        28.086          0.003
Silver         Ag          47       107.868          0.001
Sodium         Na          11        22.9898         0.0001
Strontium      Sr          38        87.62           0.01
Sulfur         S           16        32.06           0.01
Tantalum       Ta          73       180.9479         0.0003
Technetium     Tc          43        98.9062         0.0001
Tellurium      Te          52       127.60           0.03
Terbium        Tb          65       158.9254         0.0001
Thallium       Tl          81       204.37           0.03
Thorium        Th          90       232.0381         0.0001
Thulium        Tm          69       168.9342         0.0001
Tin            Sn          50       118.69           0.03
Titanium       Ti          22        47.90           0.03
Tungsten       W           74       183.85           0.03
Uranium        U           92       238.029          0.001
Vanadium       V           23        50.9414         0.0003
Xenon          Xe          54       131.30           0.01
Ytterbium      Yb          70       173.04           0.03
Yttrium        Y           39        88.9059         0.0001
Zinc           Zn          30        65.38           0.01
Zirconium      Zr          40        91.22           0.01

SOURCE: R.C. Weast, Ed., "Handbook of Chemistry and Physics," 56th Ed., CRC Press, Cleveland, Ohio, 1975, inside back cover.

NOTE: This table includes all the known elements for which a reasonable atomic weight value is assignable. All other elements are man-made, or their natural abundance is such that a reasonable assessment is unavailable.


S-7.7. Table of Constants and Conversion Factors with Uncertainties

Fundamental and Derived Constants:
Avogadro Number            NA     (6.022 045 ± 0.000 031) × 10^23 mol^-1
Gas Constant               R      (8.314 41 ± 0.000 26) J K^-1 mol^-1
                           R      (1.987 192 ± 0.000 062) cal K^-1 mol^-1
                           R      (8.205 68 ± 0.000 26) × 10^-2 L atm K^-1 mol^-1
Boltzmann Constant         kB     (1.380 662 ± 0.000 044) × 10^-23 J K^-1
Faraday Constant           F      (9.648 456 ± 0.000 027) × 10^4 C mol^-1
Electronic Charge          e      (1.602 189 ± 0.000 005) × 10^-19 C
Planck Constant            h      (6.626 176 ± 0.000 036) × 10^-34 J s
                           h/2π   (1.054 5887 ± 0.000 0057) × 10^-34 J s
Speed of Light in Vacuum   c      (2.997 924 58 ± 0.000 000 12) × 10^8 m s^-1

Conversion Factors:
T (K) = t (°C) + (273.1500 ± 0.0002)
1 atm = (7.60 ± 0.00) × 10^2 torr = (7.60 ± 0.00) × 10^2 mm Hg
1 cal = (4.184 ± 0) J
1 J = (1.0 ± 0) × 10^7 erg
1 erg = (1.0 ± 0) dyne cm
1 L = (1.0 ± 0) × 10^3 cm^3

SOURCE: J.A. Dean, Ed., "Lange's Handbook of Chemistry," 13th Ed., McGraw-Hill, New York, NY, 1985, pp. 2-3ff.


S-7.8. Summary of Computational Formulas

S-7.8.1. Mean

x̄ = (1/n) Σi xi = (1/n)(x1 + x2 + … + xn),   i = 1, …, n

S-7.8.2. Sample Standard Deviation

s = [ (1/n) Σi (xi − x̄)² ]^(1/2)

S-7.8.3. Estimated Standard Deviation of the Universe

σ = lim(n→∞) [ (1/n) Σi (xi − μ)² ]^(1/2)

S-7.8.4. Standard Error of Estimate of the Mean

σ̂x̄ ≡ σ̂/√n = s/√(n − 1)

S-7.8.5. Least-Squares Fitting

If y = mx + b,

m̂ = (n Σxy − Σx Σy) / D
b̂ = (Σx² Σy − Σx Σxy) / D = ȳ − m̂x̄

where D = n Σx² − (Σx)².

Standard error of estimate of the y values:

σ̂ŷ = [ Σ(y − m̂x − b̂)² / (n − 2) ]^(1/2)

Standard errors of estimate of the slope and intercept:

σ̂m̂ = σ̂ŷ (n/D)^(1/2)
σ̂b̂ = σ̂ŷ (Σx²/D)^(1/2)

S-7.8.6. Confidence Limits

μ = x̄ ± σ̂x̄ t
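The least-squares formulas of Sec. S-7.8.5 translate directly into code. A minimal sketch (the function name is my own):

```python
import math

def least_squares(xs, ys):
    """Fit y = mx + b per Sec. S-7.8.5; return m, b and the standard
    errors of estimate of the slope and intercept."""
    n = len(xs)
    Sx, Sy = sum(xs), sum(ys)
    Sxx = sum(x * x for x in xs)
    Sxy = sum(x * y for x, y in zip(xs, ys))
    D = n * Sxx - Sx ** 2
    m = (n * Sxy - Sx * Sy) / D
    b = (Sxx * Sy - Sx * Sxy) / D
    # standard error of estimate of the y values
    s_y = math.sqrt(sum((y - m * x - b) ** 2
                        for x, y in zip(xs, ys)) / (n - 2))
    return m, b, s_y * math.sqrt(n / D), s_y * math.sqrt(Sxx / D)

# A small worked example: m = 1.96, b = 1.06 by hand calculation
m, b, sm, sb = least_squares([0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.1, 6.9])
```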


S-7.8.7. Propagation of Errors

For f(x1, x2, …, xn):

εf = [ Σi (∂f/∂xi)² εxi² ]^(1/2)