The Kolmogorov-Smirnov Test for Normality


The purpose of this paper is to explain the Kolmogorov-Smirnov test for normality and to suggest more powerful tests as substitutes for the conventional K-S test.


Queen Mary, University of London
MTH731U Computational Statistics
Idlir Shkurti, Student ID: 120192308

    1.0 Introduction

Suppose we have n observations $x_1, \dots, x_n$ coming from independent and identically distributed random variables $X_1, \dots, X_n$ with a common cumulative distribution function (cdf) $F$. If we wish to test the hypothesis $H_0: F = F_0$ against $H_1: F \neq F_0$, where $F_0$ is the cdf of a known continuous distribution, the Kolmogorov-Smirnov statistic is an appropriate and valid way of doing so. The Kolmogorov-Smirnov statistic is defined in the following way:

$$D_n = \max_{i \in \{1, \dots, n\}} \left| F_n(x_i) - F_0(x_i) \right| \qquad (1)$$

where $F_n(x)$ is the ecdf obtained from the observations $x_1, \dots, x_n$. In other words, the KS statistic is the maximum absolute distance between the graph of the ecdf and the cdf of the known distribution from which we are testing whether our data come.
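For reference, base R's ks.test implements this conventional test against a fully specified distribution; the sample below is illustrative:

set.seed(1)
x <- rnorm(50)                          # data to test
ks.test(x, "pnorm", mean = 0, sd = 1)   # KS test against a fully specified N(0, 1)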

Another way of defining the test statistic when we are testing for normality of the data is the following. Suppose we have a sample which consists of n observations ordered such that $x_1 \le x_2 \le \dots \le x_n$. The ecdf of the sample is the step function whose value at each $x_i$ jumps from $\frac{i-1}{n}$ to $\frac{i}{n}$. If $F_0$ is the cdf of a normal distribution with mean $\mu$ and standard deviation $\sigma$, then the KS statistic is given by:

$$D_n = \max_{1 \le i \le n} \left\{ \frac{i}{n} - \Phi\!\left(\frac{x_i - \mu}{\sigma}\right),\; \Phi\!\left(\frac{x_i - \mu}{\sigma}\right) - \frac{i-1}{n} \right\} \qquad (2)$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution.
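As an illustration, a minimal R sketch of equation (2) follows; the function name ks_stat is ours, and the mean and standard deviation are assumed known here:

# Minimal sketch of equation (2): the KS statistic for a sample x against a
# normal distribution with known mean mu and standard deviation sigma.
ks_stat <- function(x, mu, sigma) {
  n <- length(x)
  z <- pnorm(sort(x), mean = mu, sd = sigma)   # Phi((x_(i) - mu) / sigma)
  i <- seq_len(n)
  max(i / n - z, z - (i - 1) / n)              # two-sided maximum over all i
}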

A more conventional way of testing a similar hypothesis is the chi-square test; however, the Kolmogorov-Smirnov statistic has clear advantages, since it can be used with small samples and is overall more powerful.

In his 1967 paper, Lilliefors provided a means of using a modified version of the Kolmogorov-Smirnov statistic to test whether a set of observations come from a continuous distribution $F_0(x)$ when certain parameters of the distribution must be estimated, so that the distribution is not completely specified. If the conventional Kolmogorov-Smirnov test is used in this case, the results will be conservative, hence compromising the power of the test. In his paper Lilliefors presents a method of testing whether a set of observations come from a normal population with unknown mean and variance. In order to apply the test we must first fix what the continuous distribution $F_0(x)$ is. Since the mean and the variance are unknown, we use the estimators $\bar{x}$ and $s^2$, the sample mean and the sample variance of the given observations, in their place. Hence we assume that $F_0(x)$ is the cdf of a normal distribution with mean $\bar{x}$ and variance $s^2$. Once these values are calculated, the Kolmogorov-Smirnov statistic is computed exactly as above with $F_0(x) = \Phi\!\left(\frac{x - \bar{x}}{s}\right)$.
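Under the same assumptions as the sketch above, the Lilliefors version simply plugs in the sample estimates (the function name is again ours):

# Lilliefors statistic: equation (2) with the sample mean and standard
# deviation substituted for the unknown parameters.
lilliefors_stat <- function(x) ks_stat(x, mu = mean(x), sigma = sd(x))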


    2.0 Adjusted Critical Values

Once the test statistic is obtained, we cannot use the critical values from the standard Kolmogorov-Smirnov tables to draw a conclusion about the test because, as mentioned before, the results would be conservative. In his paper, Lilliefors calculates new critical values for this specific test using a Monte Carlo calculation. This was done by drawing 1000 random normally distributed samples for each of several values of n and thus estimating the distribution of $D_n$. The first R code in appendix 1 is used to obtain a table similar to Table 1 of Lilliefors' paper. The output table is given below.

Table 1: Monte Carlo Critical Values of Dn

Critical values estimated from a Monte Carlo calculation using 1000 samples for each sample size N. Any value of the Kolmogorov-Smirnov test statistic greater than the corresponding critical value at a given significance level leads to a rejection of the null hypothesis of normality at that level.

Sample size N    Level of significance for $D_n = \max_{i} |F_n(x_i) - F_0(x_i)|$
                 0.20      0.15      0.10      0.05      0.01
4                0.300     0.318     0.342     0.378     0.410
5                0.290     0.303     0.323     0.350     0.403
6                0.272     0.282     0.297     0.321     0.373
7                0.251     0.263     0.277     0.307     0.351
8                0.234     0.246     0.259     0.280     0.332
9                0.227     0.237     0.254     0.274     0.313
10               0.215     0.225     0.243     0.265     0.307
11               0.209     0.217     0.228     0.248     0.290
12               0.204     0.212     0.224     0.245     0.279
13               0.196     0.204     0.216     0.231     0.263
14               0.185     0.196     0.205     0.222     0.256
15               0.178     0.185     0.197     0.215     0.257
16               0.176     0.183     0.195     0.210     0.251
17               0.170     0.179     0.190     0.206     0.234
18               0.166     0.173     0.184     0.200     0.236
19               0.161     0.170     0.178     0.195     0.220
20               0.158     0.165     0.174     0.189     0.220
25               0.143     0.150     0.160     0.171     0.195
30               0.131     0.136     0.143     0.155     0.185
Over 30          0.723/√N  0.760/√N  0.814/√N  0.880/√N  1.022/√N

The output in Table 1 gives the estimated critical values at the 0.20, 0.15, 0.10, 0.05 and 0.01 significance levels for sample sizes from n = 4 to n = 30. For any sample of size N greater than 30, the critical value is obtained by taking the critical value at n = 40, multiplying it by √40, and dividing the product by √N; this gives the constants in the "Over 30" row. For example, at the 0.05 level a sample of size N = 100 uses 0.880/√100 = 0.088.

Comparing this table to the standard Kolmogorov-Smirnov tables, we can see that the critical values at the 0.01 significance level here are slightly smaller than the standard-table critical values at the 0.20 significance level. Hence if we used the standard Kolmogorov-Smirnov critical values we would obtain very conservative results: the actual significance level would be much lower than the one claimed by the test. An important question which needs to be addressed at this stage is: is the modified KS test still more powerful than the chi-squared test?

3.0 Power of the Test

One of the merits of this specialised Kolmogorov-Smirnov test for normality is that it can still be used for small sample sizes, unlike the chi-squared test. Kac, Kiefer and Wolfowitz (1955) showed that it is asymptotically more powerful than the chi-squared test. However, we want to know whether the test is also suitable for relatively small sample sizes. In the Lilliefors paper a small investigation was made to compare the powers of the two tests: 500 samples of size 20 were drawn from distributions such as the normal, chi-square with 3 d.f., Student's t with 3 d.f., exponential and uniform. The probabilities of rejecting the null hypothesis of normality using the Kolmogorov-Smirnov statistic and the chi-square statistic were found and compared. Note that in this comparison Lilliefors also used Monte Carlo critical values for the chi-squared test, rather than the standard chi-squared points, to avoid a high probability of type I error.

From Table 2 in the Lilliefors paper we can see that the probabilities of type I error are satisfactory and relatively similar for both tests. However, the Kolmogorov-Smirnov test is much more powerful than the chi-squared test: the probabilities of correctly rejecting the hypothesis of normality are significantly greater when using the $D$ statistic than when using the chi-squared statistic, for every non-normal underlying distribution considered, at both the 5% and 10% significance levels. This is another advantage of using the specialised test for normality.

However, this only tells us that the specialised Kolmogorov-Smirnov test for normality is superior to the chi-squared test. The power of this test is still far from ideal, particularly when the observations come from a uniform distribution, as the same table shows. The probability of correctly rejecting the null hypothesis of normality when the 20 observations come from a uniform distribution is 12% at a 5% significance level and 22% at a 10% significance level. Hence this test is not ideal for certain distributions.

Table 3 from Lilliefors' paper gives a calculation similar to that of Table 2. It gives the probabilities of rejecting the hypothesis of normality for the same underlying distributions as Table 2, but now using 500 samples of size 10 rather than 20. The test used in this table is the Kolmogorov-Smirnov test with the adjusted critical values from Table 1 at α = 0.05 and α = 0.10.

We can see from this table that the probabilities of type I error are still satisfactory; however, the power of the test decreases further now that the sample size has dropped. At a 5% significance level, when the observations come from a uniform distribution, the test correctly rejects the null hypothesis of normality only 7% of the time, and only 13% of the time when α = 0.10.

Hence another important factor affecting the power of the test is the size of the sample: the greater the sample size, the more powerful the test. The code in appendix 2 generates the proportion of samples which correctly reject the null hypothesis of normality at a 10% level of significance, out of 500 samples, each of size N = 100, coming from a uniform distribution. The output obtained is 0.774, which means that the null hypothesis of normality is correctly rejected for 77.4% of the 500 samples. This is much higher than the corresponding figures in Tables 2 and 3 of Lilliefors' paper. The following table, obtained using R, is equivalent to Table 3 of Lilliefors' paper, the only difference being that the samples are now of size 100 rather than 10.

Table 2: Probability of rejecting the hypothesis of normality when the sample size is 100

Kolmogorov-Smirnov test using critical values from Table 1.

Underlying distribution    α = 0.05   α = 0.10
Normal                     0.050      0.098
Chi-square, 3 d.f.         0.990      0.998
Student's t, 3 d.f.        0.730      0.842
Exponential                1.000      1.000
Uniform                    0.578      0.774

Comparing this table to Table 3, or even Table 2, of Lilliefors' paper, we can clearly observe dramatic increases in the probabilities of correctly rejecting the null hypothesis of normality for the bottom four distributions, particularly for the exponential distribution. The power of the test has increased; hence the sample size is a very important factor in the power of the test. The probabilities of type I error remain as predicted.
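A minimal sketch of the simulation behind Table 2 is given below, reusing the lilliefors_stat function sketched earlier; the uniform case at the 0.10 level is shown, with the large-sample critical value 0.814/√N taken from Table 1:

# Power sketch: proportion of 500 uniform samples of size 100 rejected at the
# 0.10 level, using the "Over 30" critical value 0.814 / sqrt(N) from Table 1.
N <- 100
crit <- 0.814 / sqrt(N)
mean(replicate(500, lilliefors_stat(runif(N)) > crit))   # roughly 0.774 in the text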

    4.0 Outliers

One problem with the use of the sample mean and sample variance as estimates of the mean and variance of the null distribution is that they are sensitive to outliers in the data sample. Since the test is directly affected by the choice of mean and variance, this can lead to errors. Type I error is a bigger problem when outliers are present, particularly when the sample size is small: the smaller the sample size, the greater the effect of an outlier on the sample estimates. The third R code (appendix 3) uses a code similar to the one used to obtain the values for Table 2 (appendix 2). 500 samples of size 10 are drawn; 9 observations in each sample come from a standard normal distribution, whilst the remaining observation is an outlier. If we include the outlier as a correct observation when estimating the sample mean and the sample variance, then the null hypothesis of normality is rejected for a large proportion of the 500 samples (30.2% in one run). This means that the probability of type I error is much higher than the level of significance. However, if we install and load the outliers package in R and use the command rm.outlier to locate the outlier and replace it by the mean of the remaining observations, the proportion of samples rejecting the hypothesis of normality returns to the nominal level. A sketch of this experiment is given below.
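The following is a hedged sketch of the appendix 3 experiment; the outlier value (5) and the fixed critical value 0.265 (Table 1, n = 10, α = 0.05) are our assumptions, and lilliefors_stat is the function sketched earlier:

library(outliers)                           # provides rm.outlier()

reject_rate <- function(clean_outlier = FALSE) {
  rejected <- replicate(500, {
    x <- c(rnorm(9), 5)                     # nine clean observations plus one outlier
    if (clean_outlier)
      x <- rm.outlier(x, fill = TRUE)       # replace outlier by mean of the rest
    lilliefors_stat(x) > 0.265              # critical value: Table 1, n = 10, 0.05 level
  })
  mean(rejected)
}

reject_rate(FALSE)   # inflated type I error (around 0.30 in the text's run)
reject_rate(TRUE)    # back near the nominal 5% level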

In appendix 4, a code was used to observe the effect of a single outlier on the value of the test statistic $D_n$ for samples of size 10 to 101. Just as expected, the effect of the outlier decreases as the sample size increases. The code in the appendix draws 500 samples for each sample size from 10 to 101, with one outlier in each sample, and finds the average value of the KS test statistic for each sample size. The graph below shows the plotted values of the average KS statistic against the corresponding sample sizes.

    Graph 1: The average KS statistic for different sample sizes
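A sketch of the appendix 4 computation, with the outlier value again assumed to be 5, is:

# Average Lilliefors statistic over 500 samples for each sample size 10..101,
# each sample containing a single assumed outlier at 5.
sizes <- 10:101
avgD <- sapply(sizes, function(n)
  mean(replicate(500, lilliefors_stat(c(rnorm(n - 1), 5)))))
plot(sizes, avgD, type = "l",
     xlab = "Sample size", ylab = "Average KS statistic")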

4.1 Modified Lilliefors Test

One problem with the Lilliefors test is that the mean and the variance are obtained from fixed sample estimates. These sample estimates, particularly for relatively small sample sizes, are sensitive to outliers. In their 2008 paper, Drezner, Turel and Zerom introduced a modified version of the Lilliefors test which they believed to be superior to the conventional procedure. They also use equation (2) to find the test statistic; however, they do not regard fixed sample estimators such as the sample mean and the sample variance as appropriate estimates of the mean and variance of the random sample. In contrast to the traditional KS test introduced by Lilliefors, in which the data are compared against a normal distribution with fixed parameters, this method tries to find a normal distribution which is more appropriate for the data sample than the fixed-parameter distribution. The traditional KS statistic is obtained by using the sample mean $\bar{x}$ and the sample variance $s^2$ as the estimates of the mean and variance in (2). The modified test introduced in their paper instead uses an algorithm to obtain values $\mu^*$ and $\sigma^*$ which minimize the value of the test statistic $D_n$. The critical values needed to reach a decision about the outcome of the test were also calculated differently from Lilliefors', in order to match the purpose of this test.

4.1.1 Algorithm

When using the Lilliefors test, equation (2) depends on the choice of $\mu$ and $\sigma^2$; hence we denote the test statistic by $D_n(\mu, \sigma^2)$. For the conventional Lilliefors test this statistic is simply $D_n(\bar{x}, s^2)$. The test statistic for the modified test is $D_n(\mu^*, \sigma^*)$, where $(\mu^*, \sigma^*)$ is the vector solution of the minimization problem

$$\min_{\mu, \sigma^2} \left\{ D_n(\mu, \sigma^2) \right\} \qquad (3)$$

We can write equation (2) as the following inequalities, which must hold for every $i$:

$$D_n(\mu, \sigma^2) \ge \frac{i}{n} - \Phi\!\left(\frac{x_i - \mu}{\sigma}\right) \qquad (4)$$

$$D_n(\mu, \sigma^2) \ge \Phi\!\left(\frac{x_i - \mu}{\sigma}\right) - \frac{i-1}{n} \qquad (5)$$

The solution to problem (3) is the smallest possible value of $D_n(\mu, \sigma^2)$ which satisfies (4) for $nD_n < i$ and (5) for $i < n(1 - D_n) + 1$; the values of $\mu$ and $\sigma^2$ for which this is achieved form the vector $(\mu^*, \sigma^*)$.
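The authors' own algorithm exploits the structure of (4) and (5); as a rough illustration only, the minimization in (3) can be sketched with a general-purpose optimiser starting from the sample estimates. This is our substitute for the authors' procedure, not their method:

# Rough sketch of the modified statistic: minimise D_n(mu, sigma^2) of
# equation (2) over the normal parameters with a generic optimiser.
# A log-sd parameterisation keeps sigma positive.
modified_stat <- function(x) {
  obj <- function(p) ks_stat(x, mu = p[1], sigma = exp(p[2]))
  optim(c(mean(x), log(sd(x))), obj)$value   # Nelder-Mead by default
}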

The benefit of using this modified version of the Lilliefors test is that it accounts for possible outliers in the data. When using the standard KS test, outliers in the data can significantly affect the values of the sample estimates, which leads to errors in the test. The modified method takes possible outliers into account. When compared to the standard KS test, the modified version was much more powerful for most distributions, particularly when the data came from a uniform distribution.

    5.0 Conclusion

The Kolmogorov-Smirnov test provides a good method of testing whether a sample comes from a completely known distribution with cdf $F_0$. When $F_0$ is not completely known we can test for normality using Lilliefors' test, which is simply a modified version of the KS test in which the sample mean and the sample variance are used as the mean and the variance of the unknown distribution. Lilliefors introduced new critical values in his paper, obtained from a Monte Carlo calculation with 1000 samples for each sample size. Once the Monte Carlo critical values were obtained, it was noticeable that for each sample size the standard critical values from the Kolmogorov-Smirnov tables were much higher than the ones obtained in his paper. Hence using the standard critical values would result in a very conservative test and hence a loss of power.

This version of the KS test was still more powerful than the chi-square test, as shown in Table 2 of reference 1, where the probabilities of correctly rejecting the null hypothesis were significantly higher for the KS test than for the chi-square test. The Lilliefors test can also be used for small sample sizes, just like the standard KS test, which is another advantage over the chi-square test.

Outliers strongly affect the sample estimates, particularly for small sample sizes, which can lead to errors. This was shown in Graph 1, where the average values of the KS statistic from 500 samples of each sample size were calculated and plotted. The values of the KS statistic clearly decrease as the sample size increases, which means that an outlier is much more likely to lead to a rejection of the null hypothesis when the sample size is small. This is also shown in appendix 3, where the inclusion of an outlier in the data sample increases the proportion of type I errors.

Drezner, Turel and Zerom, in their 2008 paper, introduced a modified version of the Lilliefors test which chooses estimates of the mean and variance so that the test statistic is minimized. This method proved to be more powerful than the Lilliefors test for many underlying distributions, though not for the t distribution: when the data came from a t distribution, the modified method was less powerful than Lilliefors' test.


    6.0 Appendices

6.1 Appendix 1

The original console listing is truncated in this transcript; the code below is a reconstructed sketch of the Monte Carlo procedure described in section 2.0, reusing the variable names that survive in the fragments (N, N1, Dn, CT.2, ..., CT.01, S, Ecdf).

# Monte Carlo estimation of the critical values in Table 1.
N <- 30                                          # largest tabulated sample size
N1 <- 1000                                       # Monte Carlo samples per sample size
CT.2 <- CT.15 <- CT.1 <- CT.05 <- CT.01 <- c()   # empty vectors for the critical values
for (i in 4:N) {
  Dn <- c()
  for (j in 1:N1) {
    S <- sort(rnorm(i))                          # standard normal sample of size i
    Ecdf <- (1:i) / i                            # ecdf at the ordered observations
    m <- pnorm(S, mean(S), sd(S))                # F0 with estimated mean and sd
    Dn[j] <- max(Ecdf - m, m - (Ecdf - 1/i))     # KS statistic of equation (2)
  }
  CT.2[i]  <- quantile(Dn, 0.80)                 # 0.20 significance level
  CT.15[i] <- quantile(Dn, 0.85)                 # 0.15
  CT.1[i]  <- quantile(Dn, 0.90)                 # 0.10
  CT.05[i] <- quantile(Dn, 0.95)                 # 0.05
  CT.01[i] <- quantile(Dn, 0.99)                 # 0.01
}
round(cbind(CT.2, CT.15, CT.1, CT.05, CT.01)[4:N, ], 3)   # body of Table 1