normality test is to see whether the residual values

NORMALITY TEST

Definition of Normality Test

A normality test checks whether the residuals are normally distributed. A good regression model has normally distributed residuals, so the normality test is performed on the residuals, not on each variable. A common mistake is to run the normality test on every variable. This is not prohibited, but what a regression model requires is normality of the residuals, not of each individual study variable.

The idea of "normal" can be illustrated with a school class. In a class, the very weak and the very strong students are few, and most students fall in the moderate or average category. If every student in the class is weak, the class is not normal (it may be a special school). Conversely, if strong students far outnumber weak ones, the class is not normal either (it is an excellent class). Normally distributed observations have few extremely low and extremely high values, with most values piled up in the middle. Likewise, the mode, mean and median are relatively close to one another.

Normality can be tested with a histogram, a normal P-P plot, the chi-square test, skewness and kurtosis, or the Kolmogorov-Smirnov test. No single method is the best or the most appropriate. Graphical methods often lead to different readings among observers, whereas statistical tests are free from that doubt, although there is no guarantee that a statistical test is better than a graphical one.

If the residuals are not normal but are close to the critical value (e.g. a Kolmogorov-Smirnov significance of 0.049), other methods can be tried, and these may justify treating the residuals as normal. If the residuals are far from normal, several steps can be taken: transforming the data, trimming outlying observations, or adding observations. The transformation can be a natural logarithm, a square root, an inverse, or another form, depending on the shape of the curve: skewed to the left, skewed to the right, concentrated in the middle, or spread out to both sides.
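The effect of the transformations named above can be checked numerically. The following is a minimal sketch, not a prescribed procedure: the sample `raw` is hypothetical (it is not taken from this text), and the moment-based `skewness` helper is an assumption chosen for brevity. It prints the skewness of the raw data and of each transformed version so they can be compared.

```python
import math


def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed, strictly positive sample (not from the text).
raw = [1, 2, 2, 3, 3, 3, 4, 5, 8, 20]

# The three transformations mentioned in the paragraph above.
transformed = {
    "natural log": [math.log(v) for v in raw],
    "square root": [math.sqrt(v) for v in raw],
    "inverse": [1 / v for v in raw],
}

print(f"raw skewness: {skewness(raw):.3f}")
for name, values in transformed.items():
    print(f"{name} skewness: {skewness(values):.3f}")
```

A skewness closer to zero after a transformation suggests that the transformed data are more symmetric; a normality test should still be run on the transformed values.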

[Figure: distribution curve marking the H0 acceptance area and the H0 rejection area]

Procedure of Normality Test:

1. Formulate the hypotheses

H0 : the data are normally distributed

Ha : the data are not normally distributed

2. Determine the significance level (α)

To obtain the chi-square table value, use

dk = k – 3

dk = degrees of freedom

k = number of class intervals

3. Determine the value of the test statistic

$$\chi^2_{\text{count}} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Where:

Oi = observed frequency in the i-th class

Ei = expected frequency in the i-th class

4. Determine the criteria for testing the hypotheses

H0 is rejected if $\chi^2_{\text{count}} \ge \chi^2_{\text{table}}$

H0 is accepted if $\chi^2_{\text{count}} < \chi^2_{\text{table}}$

5. Provide a conclusion

http://3.bp.blogspot.com/_ritcELBjkt4/R4Rv30ltr4I/AAAAAAAAADU/_aXtysWo9u8/s1600-h/rumus1.JPG

Example:

The following table shows the sample data (50 values):

60 70 80 80 90

65 75 80 85 90

65 75 80 85 90

65 75 80 85 90

65 75 80 85 90

70 75 80 85 95

70 75 80 85 95

70 75 80 85 95

70 75 80 85 95

70 80 80 90 100

Normality is tested first. Because the sample size is greater than 30, the chi-square test is used. The hypotheses are as follows.

H0 : the sample is drawn from a normally distributed population.

HA : the sample is drawn from a population that is not normally distributed.

The normality test is shown in the table below.

No   Interval   Class boundary     z                Area     Oi   Ei       (Oi−Ei)²/Ei
                Lower    Upper     Lower   Upper
1    60-66      59.5     66.5      -2.21   -1.46    0.0585    5    2.925   1.472009
2    67-73      66.5     73.5      -1.46   -0.70    0.1699    6    8.495   0.732787
3    74-80      73.5     80.5      -0.70    0.05    0.2779   20   13.895   2.682334
4    81-87      80.5     87.5       0.05    0.81    0.2711    8   13.555   2.276505
5    88-94      87.5     94.5       0.81    1.57    0.1508    6    7.54    0.314536
6    95-101     94.5    101.5       1.57    2.32    0.048     5    2.4     2.816667
Total                                                                      10.29

From the normality test, with df = k – 3 = 6 – 3 = 3, we obtain

$\chi^2_{\text{count}} = 10.29$

From the chi-square table,

$\chi^2_{\text{table}} = 11.345$ at the 1% significance level

Because $\chi^2_{\text{count}} < \chi^2_{\text{table}}$, H0 is accepted: the sample is drawn from a normally distributed population.
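The hand calculation above can be repeated in code. The following is a minimal sketch under stated assumptions: the function names `normal_cdf` and `chi_square_normality` are illustrative, the normal areas come from `math.erf` rather than from a z table, and the z values are not rounded to two decimals. For that reason the statistic it prints is expected to differ slightly from the hand-calculated 10.29. The table value must still be looked up by the reader; 11.345 is the chi-square table value for df = 3 at the 1% level.

```python
import math
from statistics import mean, stdev


def normal_cdf(z):
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chi_square_normality(data, boundaries):
    """Return (chi_square, observed, expected) for the given class boundaries."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    observed, expected = [], []
    for lower, upper in zip(boundaries[:-1], boundaries[1:]):
        # Observed frequency Oi: values falling inside this class.
        observed.append(sum(1 for x in data if lower < x <= upper))
        # Expected frequency Ei: n times the normal area between the boundaries.
        area = normal_cdf((upper - x_bar) / s) - normal_cdf((lower - x_bar) / s)
        expected.append(n * area)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, observed, expected


# The 50 sample values of the example, written as value: frequency.
frequencies = {60: 1, 65: 4, 70: 6, 75: 8, 80: 12, 85: 8, 90: 6, 95: 4, 100: 1}
data = [value for value, count in frequencies.items() for _ in range(count)]

boundaries = [59.5, 66.5, 73.5, 80.5, 87.5, 94.5, 101.5]
chi_square, observed, expected = chi_square_normality(data, boundaries)
df = len(observed) - 3     # dk = k - 3
chi_square_table = 11.345  # chi-square table value for df = 3 at the 1% level

print("observed:", observed)
print("chi-square count:", round(chi_square, 2), "df:", df)
print("H0 accepted" if chi_square < chi_square_table else "H0 rejected")
```

The decision depends on the significance level chosen in step 2, so `chi_square_table` has to be replaced if a different level is used.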

Testing with SPSS

Case Processing Summary

                        Cases
         Valid            Missing          Total
         N    Percent     N    Percent     N    Percent
Value    50   100.0%      0    .0%         50   100.0%

Test criteria: accept H0 if sig > 0.05. The significance value reported in the SPSS results is greater than 0.05, so H0 is accepted. Conclusion: the population is normally distributed.

HOMOGENEITY TEST

Test Equality of Two Variances

1. Definition

The test of equality of two variances is used to check whether two sets of data are homogeneous by comparing their variances. If the two variances are equal, no further homogeneity test is needed, because the data can already be considered homogeneous. If the variances are not equal, homogeneity must be tested with the test of equality of two variances.

The test of homogeneity can be carried out only if both sets of data have been shown to be normally distributed. There are several ways to perform the test.

2. How To Test Homogeneity

There are three ways to test homogeneity:

1. Largest variance compared to the smallest variance

The steps are as follows:

a. Write Ha and H0 in sentence form.

b. Write Ha and H0 in statistical form.

c. Find F_count using the formula:

$$F_{\text{count}} = \frac{\text{largest variance}}{\text{smallest variance}}$$

d. Set the significance level (α).

e. Find F_table using the formula:

$$F_{\text{table}} = F_{\alpha}\,(n_{\text{largest}} - 1,\ n_{\text{smallest}} - 1)$$

where n_largest and n_smallest are the sizes of the samples with the largest and the smallest variance.

f. Specify the criterion for testing H0, namely:

If F_count ≤ F_table, then H0 is accepted (homogeneous).

g. Compare F_count with F_table.

h. Draw the conclusion.
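Steps c to g of the first method can be sketched in a few lines. This is an illustration under stated assumptions, not a definitive implementation: the function name `variance_ratio_test` and both samples are hypothetical, and the F table value is passed in by the caller because the sketch does not compute it.

```python
from statistics import variance


def variance_ratio_test(sample_a, sample_b, f_table):
    """Largest-over-smallest variance ratio, compared with a table value.

    f_table must be looked up by the caller for the chosen alpha and the
    returned degrees of freedom; this sketch does not compute it.
    """
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        larger, smaller = (var_a, len(sample_a)), (var_b, len(sample_b))
    else:
        larger, smaller = (var_b, len(sample_b)), (var_a, len(sample_a))
    f_count = larger[0] / smaller[0]
    df = (larger[1] - 1, smaller[1] - 1)
    # H0 (homogeneous) is accepted when F_count does not exceed F_table.
    return f_count, df, f_count <= f_table


# Hypothetical samples, chosen only to keep the arithmetic easy to follow.
group_1 = [4, 6, 8, 10, 12]  # sample variance 10
group_2 = [5, 6, 7, 8, 9]    # sample variance 2.5

# 6.39 is the F table value for alpha = 0.05 with (4, 4) degrees of freedom.
f_count, df, homogeneous = variance_ratio_test(group_1, group_2, f_table=6.39)
print(f_count, df, homogeneous)
```

Because the larger variance is always placed in the numerator, F_count is never below 1 and only the upper table value is needed.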

2. Smallest variance compared to the largest variance

The steps are as follows:

a. Write Ha and H0 in sentence form.

b. Write Ha and H0 in statistical form.

c. Find F_count using the formula:

$$F_{\text{count}} = \frac{\text{smallest variance}}{\text{largest variance}}$$

d. Set the significance level (α).

e. Find F_original table using the formula:

$$F_{\text{original table}} = F_{\frac{1}{2}\alpha}\,(n_{\text{largest}} - 1,\ n_{\text{smallest}} - 1)$$

f. Find F_table right using the formula:

$$F_{\text{table right}} = F_{\frac{1}{2}\alpha}\,(n_{\text{smallest}} - 1,\ n_{\text{largest}} - 1)$$

The value read from the F table is F_table right. This value serves as the maximum value.

g. Find F_table left using the formula:

$$F_{\text{table left}} = F_{(1 - \frac{1}{2}\alpha)}\,(n_{\text{smallest}} - 1,\ n_{\text{largest}} - 1)$$

Or,

$$F_{\text{table left}} = \frac{1}{F_{\text{original table}}}$$

h. Determine the test criterion:

If F_table left ≤ F_count ≤ F_table right, then H0 is accepted (homogeneous).

i. Compare the values of F_table left, F_count and F_table right.

j. Draw the conclusion.

Here n_largest and n_smallest are the sizes of the samples with the largest and the smallest variance.

3. Bartlett Test

• Suppose the populations have homogeneous variances, namely:

$$\sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$$

• The hypotheses to be tested are

H0 : the populations have the same variance

H1 : the populations have unequal variances

To simplify the calculation, the quantities needed for the Bartlett test are best arranged in a list.

Sample   df          1/df             si²    log si²    (df) log si²
1        n1 – 1      1/(n1 – 1)       s1²    log s1²    (n1 – 1) log s1²
2        n2 – 1      1/(n2 – 1)       s2²    log s2²    (n2 – 1) log s2²
...      ...         ...              ...    ...        ...
k        nk – 1      1/(nk – 1)       sk²    log sk²    (nk – 1) log sk²
Total    Σ(ni – 1)   Σ(1/(ni – 1))    --     --         Σ((ni – 1) log si²)

From the list above the following can be calculated:

• Pooled variance of all samples:

$$s^2 = \frac{\sum (n_i - 1)\, s_i^2}{\sum (n_i - 1)}$$

• Unit value B, by the formula:

$$B = (\log s^2) \sum (n_i - 1)$$

• The Bartlett test uses the chi-square statistic

$$\chi^2 = (\ln 10) \left\{ B - \sum (n_i - 1) \log s_i^2 \right\}$$

with ln 10 = 2.3026, the natural logarithm of 10.

With significance level α, we reject H0 if:

$$\chi^2 \ge \chi^2_{(1-\alpha)(k-1)}$$

where $\chi^2_{(1-\alpha)(k-1)}$ is obtained from the chi-square table with probability (1 – α) and df = (k – 1).

• To correct the statistic, use the correction factor K:

$$K = 1 + \frac{1}{3(k-1)} \left\{ \sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{\sum (n_i - 1)} \right\}$$

• With the correction factor, the statistic used becomes

$$\chi^2_K = \frac{1}{K}\, \chi^2$$

• Here χ² on the right-hand side is the uncorrected statistic above. In this case the hypothesis is rejected if:

$$\chi^2_K \ge \chi^2_{(1-\alpha)(k-1)}$$

Example:

Continuing the normality test example, we can test the homogeneity of two populations: the female students in the selection and the male students in the selection. The first sample consists of 15 female students and the second of 14 male students.

The value of selected new students

The population for the selection of female students (deviations are taken from the rounded mean x̄ = 88)

Value (x)   x − x̄   (x − x̄)²
90            2         4
95            7        49
80           -8        64
80           -8        64
85           -3         9
85           -3         9
85           -3         9
95            7        49
95            7        49
70          -18       324
95            7        49
90            2         4
85           -3         9
90            2         4
95            7        49
Total                 745

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{745}{14} = 53.2$$

The value of selected new students

The population for the selection of male students (deviations are taken from the rounded mean x̄ = 87)

Value (x)   x − x̄   (x − x̄)²
90            3         9
90            3         9
70          -17       289
85           -2         4
90            3         9
80           -7        49
90            3         9
85           -2         4
85           -2         4
95            8        64
90            3         9
95            8        64
85           -2         4
85           -2         4
Total                 531

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{531}{13} = 40.8$$

The hypothesis to be tested is

$$H_0 : \sigma_1^2 = \sigma_2^2$$

The values needed for the Bartlett test

Sample   df   1/df      si²    log si²   (df) log si²
Female   14   0.07143   53.2   1.725     24.15
Male     13   0.07692   40.8   1.610     20.93
Total    27   0.14835   --     --        45.08

Pooled variance of the two samples:

$$s^2 = \frac{\sum (n_i - 1)\, s_i^2}{\sum (n_i - 1)} = \frac{14(53.2) + 13(40.8)}{27} = 47.22$$

So log s² = log 47.22 = 1.67412.

Value of B:

$$B = (\log s^2) \sum (n_i - 1) = (1.67412)(27) = 45.201$$

Value of χ² count:

$$\chi^2 = (\ln 10) \left\{ B - \sum (n_i - 1) \log s_i^2 \right\} = (2.3026)(45.201 - 45.08) = 0.279$$

If α = 0.05, the chi-square distribution table with df = 1 gives:

$$\chi^2_{0.95(1)} = 3.841$$

Since 0.279 < 3.841, H0 is accepted at the 0.05 significance level, so both samples can be said to come from populations with homogeneous variance.
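The Bartlett calculation for the two groups can also be sketched in code. This is a minimal sketch under stated assumptions: the function name `bartlett_chi_square` is illustrative, and the variances are computed from the unrounded group means, so the intermediate values are expected to differ somewhat from a hand calculation that rounds the means and logarithms. The table value 3.841 is the chi-square table value for df = 1 at α = 0.05.

```python
import math
from statistics import variance


def bartlett_chi_square(samples):
    """Return (chi_square, corrected_chi_square, pooled_variance)."""
    dfs = [len(s) - 1 for s in samples]
    variances = [variance(s) for s in samples]
    total_df = sum(dfs)
    # Pooled variance: s^2 = sum((ni - 1) * si^2) / sum(ni - 1)
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df
    # Unit value B = (log s^2) * sum(ni - 1)
    b = math.log10(pooled) * total_df
    # chi^2 = (ln 10) * {B - sum((ni - 1) * log si^2)}
    chi_square = math.log(10) * (
        b - sum(df * math.log10(v) for df, v in zip(dfs, variances))
    )
    # Correction factor K = 1 + {sum(1/(ni - 1)) - 1/sum(ni - 1)} / (3(k - 1))
    k = len(samples)
    correction = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (k - 1))
    return chi_square, chi_square / correction, pooled


female = [90, 95, 80, 80, 85, 85, 85, 95, 95, 70, 95, 90, 85, 90, 95]
male = [90, 90, 70, 85, 90, 80, 90, 85, 85, 95, 90, 95, 85, 85]

chi_square, corrected, pooled = bartlett_chi_square([female, male])
chi_square_table = 3.841  # chi-square table value for df = k - 1 = 1, alpha = 0.05
print(round(pooled, 2), round(chi_square, 3), round(corrected, 3))
print("H0 accepted" if chi_square < chi_square_table else "H0 rejected")
```

The function accepts any number of groups, so the same sketch applies when more than two samples are compared; only the table value changes with df = k – 1.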

Steps for testing with SPSS 16.

Results of testing with SPSS

Case Processing Summary

                                 Cases
                  Valid            Missing          Total
        Gender    N    Percent     N    Percent     N    Percent
Value   Female    15   100.0%      0    .0%         15   100.0%
        Male      14   100.0%      0    .0%         14   100.0%

Descriptives

Value, Gender = Female                            Statistic   Std. Error
Mean                                              87.6667     1.88140
95% Confidence Interval for Mean   Lower Bound    83.6315
                                   Upper Bound    91.7019
5% Trimmed Mean                                   88.2407
Median                                            90.0000
Variance                                          53.095
Std. Deviation                                    7.28665
Minimum                                           70.00
Maximum                                           95.00
Range                                             25.00
Interquartile Range                               10.00
Skewness                                          -.955       .580
Kurtosis                                          .872        1.121

Value, Gender = Male                              Statistic   Std. Error
Mean                                              86.7857     1.70706
95% Confidence Interval for Mean   Lower Bound    83.0978
                                   Upper Bound    90.4736
5% Trimmed Mean                                   87.2619
Median                                            87.5000
Variance                                          40.797
Std. Deviation                                    6.38723
Minimum                                           70.00
Maximum                                           95.00
Range                                             25.00
Interquartile Range                               5.00
Skewness                                          -1.307      .597
Kurtosis                                          2.865       1.154

Tests of Normality

                  Kolmogorov-Smirnov(a)         Shapiro-Wilk
        Gender    Statistic   df   Sig.         Statistic   df   Sig.
Value   Female    .176        15   .200*        .872        15   .037
        Male      .247        14   .020         .859        14   .029

a. Lilliefors Significance Correction

*. This is a lower bound of the true significance.

Test of Homogeneity of Variance

                                               Levene Statistic   df1   df2      Sig.
Value   Based on Mean                          .587               1     27       .450
        Based on Median                        .354               1     27       .557
        Based on Median and with adjusted df   .354               1     26.414   .557
        Based on trimmed mean                  .532               1     27       .472

Test criteria: accept H0 if sig > 0.05. The SPSS results table shows that sig exceeds 0.05, so H0 is accepted. In other words, it can be concluded that the populations of selection values for female and male students have the same variance.
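The Levene statistic reported by SPSS in the "Based on Mean" row can be recomputed from the raw values. The following is a minimal sketch, assuming the usual Levene formula (a one-way ANOVA F ratio on the absolute deviations from each group mean); the function name `levene_statistic` is illustrative and the sketch does not compute the significance value, which would still come from the F distribution or from SPSS.

```python
from statistics import mean


def levene_statistic(groups):
    """Levene statistic based on absolute deviations from each group mean."""
    abs_dev = []
    for group in groups:
        centre = mean(group)
        abs_dev.append([abs(x - centre) for x in group])

    n_total = sum(len(z) for z in abs_dev)
    k = len(abs_dev)
    group_means = [mean(z) for z in abs_dev]
    grand_mean = sum(sum(z) for z in abs_dev) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(
        len(z) * (m - grand_mean) ** 2 for z, m in zip(abs_dev, group_means)
    )
    within = sum(
        (v - m) ** 2 for z, m in zip(abs_dev, group_means) for v in z
    )

    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2


female = [90, 95, 80, 80, 85, 85, 85, 95, 95, 70, 95, 90, 85, 90, 95]
male = [90, 90, 70, 85, 90, 80, 90, 85, 85, 95, 90, 95, 85, 85]

w, df1, df2 = levene_statistic([female, male])
print(round(w, 3), df1, df2)
```

Replacing the group mean with the group median in `centre` would give the variant that SPSS labels "Based on Median".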
