NORMALITY TEST
Definition of Normality Test
A normality test checks whether the residuals are normally distributed. A good regression
model has normally distributed residuals, so the normality test is performed on the residuals
rather than on each variable. A common mistake is to run the normality test on every variable.
This is not prohibited, but what the regression model requires is normality of the residuals,
not of each individual variable in the study.
What "normal" means can be illustrated with a school class. In a typical class, only a few
students are very weak or very strong, and most fall in the moderate or average category. If
every student in the class is weak, the class is not normal (it may be a special school).
Conversely, if strong students outnumber weak ones, the class is not normal either (it is an
excellent class). Observations from normal data likewise contain only a few extremely low and
extremely high values, with most values piling up in the middle. As a result, the mode, mean
and median are relatively close to one another.
Normality can be tested with a histogram, a normal P-P plot, the chi-square test, skewness
and kurtosis, or the Kolmogorov-Smirnov test. No single method is the best or the most
appropriate. Graphical methods often lead different observers to different conclusions, whereas
a statistical test gives an unambiguous decision, although there is no guarantee that a
statistical test performs better than a graphical one.
If the residuals are not normal but are close to the critical value (e.g. a Kolmogorov-Smirnov
significance of 0.049), other methods can be tried, which may support normality. If the result
is far from normal, several steps can be taken: transforming the data, trimming outlying
observations, or adding data. The transformation can be the natural logarithm, square root,
inverse, or another form, depending on the shape of the distribution: skewed to the left,
skewed to the right, concentrated in the middle, or spread out to both sides.
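The effect of the transformations named above can be checked numerically. The following is a minimal sketch, assuming a small made-up sample of positive, right-skewed values (the data and the helper name `skewness` are illustrative, not from the text). It applies each transformation and prints a moment-based skewness, which is near zero for a symmetric sample.

```python
import math
import statistics


def skewness(values):
    """Moment-based skewness: mean of cubed standardized deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical right-skewed sample (all values positive), for illustration only.
raw = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 10, 18, 40]

transforms = {
    "raw": lambda v: v,
    "natural log": math.log,
    "square root": math.sqrt,
    "inverse": lambda v: 1.0 / v,
}

# Skewness of the sample after each transformation.
results = {name: skewness([f(v) for v in raw]) for name, f in transforms.items()}
for name, value in results.items():
    print(f"{name:12s} skewness = {value:.3f}")
```

Which transformation works best depends on the data. The logarithm and the inverse require strictly positive values, so they cannot be applied directly to residuals that take negative values.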
[Figure: chi-square distribution curve showing the H0 acceptance area and the H0 rejection area]
Procedure of Normality Test:
1. Formulate the hypotheses
H0 : the data are normally distributed
Ha : the data are not normally distributed
2. Determine the significance level (α)
The chi-square table value is read with
dk = k - 3
where dk = degrees of freedom and k = number of class intervals.
3. Determine the value of the test statistic
$$\chi^2_{count} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where:
O_i = observed frequency in the i-th class
E_i = expected frequency in the i-th class
4. Determine the criteria for testing the hypotheses
H0 is rejected if $\chi^2_{count} \ge \chi^2_{table}$
H0 is accepted if $\chi^2_{count} < \chi^2_{table}$
5. Draw a conclusion
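Steps 3 and 4 of the procedure can be sketched in a few lines of Python. The frequencies below are hypothetical and the function names `chi_square_count` and `decide` are chosen for this illustration only. The table value is passed in by hand, as it would be read from a chi-square table.

```python
def chi_square_count(observed, expected):
    """Step 3: sum of (O_i - E_i)^2 / E_i over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def decide(chi_count, chi_table):
    """Step 4: reject H0 when the computed value reaches the table value."""
    return "reject H0" if chi_count >= chi_table else "accept H0"


# Hypothetical frequencies for k = 4 classes (not taken from the text).
observed = [8, 22, 14, 6]
expected = [10.0, 20.0, 15.0, 5.0]

dk = len(observed) - 3      # degrees of freedom, dk = k - 3
chi_count = chi_square_count(observed, expected)
chi_table = 3.841           # chi-square table value for dk = 1 at alpha = 0.05

print(dk, round(chi_count, 3), decide(chi_count, chi_table))
```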
Example:
The following table shows the sample data (50 values).
60 70 80 80 90
65 75 80 85 90
65 75 80 85 90
65 75 80 85 90
65 75 80 85 90
70 75 80 85 95
70 75 80 85 95
70 75 80 85 95
70 75 80 85 95
70 80 80 90 100
These data are first tested for normality. Because the sample size is greater than 30, the
chi-square test is used. The hypotheses are as follows.
H0 : the sample is drawn from a normally distributed population.
Ha : the sample is drawn from a population that is not normally distributed.
The normality test is shown in the table below.

No  Interval  Class boundary    z                 Area     Oi   Ei       (Oi-Ei)²/Ei
              Lower    Upper    Lower    Upper
1   60-66     59.5     66.5     -2.21    -1.46    0.0585    5    2.925   1.472009
2   67-73     66.5     73.5     -1.46    -0.70    0.1699    6    8.495   0.732787
3   74-80     73.5     80.5     -0.70     0.05    0.2779   20   13.895   2.682334
4   81-87     80.5     87.5      0.05     0.81    0.2711    8   13.555   2.276505
5   88-94     87.5     94.5      0.81     1.57    0.1508    6    7.54    0.314536
6   95-101    94.5    101.5      1.57     2.32    0.048     5    2.4     2.816667
Total                                                                    10.29
From the normality test, with df = k - 3 = 6 - 3 = 3, we obtain
χ² = 10.29
From the chi-square table,
χ² table = 11.345 at the 1% significance level
Because χ² count < χ² table, H0 is accepted: the sample is taken from a normally distributed population.
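The worked example can be re-run as a check. The sketch below rebuilds the 50 sample values from their frequencies, uses the same six class boundaries, and computes the areas from the normal distribution function instead of a z table. Because the z values are not rounded to two decimals here, the statistic comes out close to, but not exactly equal to, the hand-calculated value. The helper `normal_cdf` and the variable names are choices made for this illustration, and the table value 11.345 (df = 3, 1% level) is entered by hand.

```python
import math
import statistics

# The 50 sample values from the data table, written as value * frequency.
scores = (
    [60] * 1 + [65] * 4 + [70] * 6 + [75] * 8 + [80] * 12
    + [85] * 8 + [90] * 6 + [95] * 4 + [100] * 1
)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean = statistics.mean(scores)
sd = statistics.stdev(scores)          # sample standard deviation (n - 1)

# Class boundaries of the six intervals 60-66, 67-73, ..., 95-101.
boundaries = [59.5, 66.5, 73.5, 80.5, 87.5, 94.5, 101.5]

observed, expected = [], []
for lower, upper in zip(boundaries, boundaries[1:]):
    observed.append(sum(lower < x < upper for x in scores))
    area = normal_cdf((upper - mean) / sd) - normal_cdf((lower - mean) / sd)
    expected.append(len(scores) * area)

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
chi_table = 11.345                     # chi-square table, df = 6 - 3 = 3, alpha = 0.01

print("observed:", observed)
print("chi-square:", round(chi_square, 2))
print("accept H0" if chi_square < chi_table else "reject H0")
```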
Testing with SPSS
Case Processing Summary

                          Cases
         Valid            Missing          Total
         N     Percent    N     Percent    N     Percent
Value    50    100.0%     0     .0%        50    100.0%
Test criterion: accept H0 if sig > 0.05. The significance value reported in the SPSS results
is greater than 0.05, so H0 is accepted. Conclusion: the population is normally distributed.
HOMOGENEITY TEST
Test Equality of Two Variances
1. Definition
The test of equality of two variances is used to check whether two sets of data are
homogeneous, by comparing their variances. If the two variances are equal, a homogeneity test
is not necessary because the data can already be considered homogeneous. If the variances are
not equal, homogeneity must be tested with the two-variance equality test.
The homogeneity test requires that both sets of data have already been shown to be normally
distributed. There are several ways to carry out the test.
2. How To Test Homogeneity
There are 3 ways to test homogeneity:
1. Greatest variance compared to the smallest variance
The steps are as follows:
a. Write Ha and H0 in sentence form.
b. Write Ha and H0 in statistical form.
c. Find F count using the formula:
$$F_{count} = \frac{\text{largest variance}}{\text{smallest variance}}$$
d. Set the level of significance (α).
e. Find F table using:
$$F_{table} = F_{\frac{1}{2}\alpha}(df_{\text{largest variance}},\ df_{\text{smallest variance}})$$
where df = n - 1 for each sample.
f. Specify the criterion for testing H0: if $F_{count} \le F_{table}$, then H0 is accepted (homogeneous).
g. Compare F count with F table.
h. Draw the conclusion.
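A minimal sketch of steps c to g, assuming two small made-up samples (the data and the function name `variance_ratio` are illustrative). The F table value is entered by hand and should be checked against your own table for the degrees of freedom the function reports.

```python
import statistics


def variance_ratio(sample_a, sample_b):
    """Return F_count = largest variance / smallest variance and its (df1, df2)."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, (len(sample_a) - 1, len(sample_b) - 1)
    return var_b / var_a, (len(sample_b) - 1, len(sample_a) - 1)


# Hypothetical samples, for illustration only.
group_1 = [12, 15, 11, 14, 18, 13]
group_2 = [13, 14, 13, 15, 14]

f_count, (df1, df2) = variance_ratio(group_1, group_2)

# Value read from an F table for (df1, df2) = (5, 4) with upper-tail area
# alpha / 2 = 0.025; verify it against your own table.
f_table = 9.36

print(f"F count = {f_count:.3f} with df = ({df1}, {df2})")
print("H0 accepted (homogeneous)" if f_count <= f_table else "H0 rejected")
```

Because the larger variance is always placed in the numerator, F count is never below 1, which is why only the upper table value is needed in this variant.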
2. Smallest variance compared to the greatest variance
The steps are as follows:
a. Write Ha and H0 in sentence form.
b. Write Ha and H0 in statistical form.
c. Find F count using the formula:
$$F_{count} = \frac{\text{smallest variance}}{\text{largest variance}}$$
d. Set the level of significance (α).
e. Find F original table using:
$$F_{original\ table} = F_{\frac{1}{2}\alpha}(df_{\text{largest variance}},\ df_{\text{smallest variance}})$$
where df = n - 1 for each sample.
f. Find F table right using:
$$F_{table\ right} = F_{\frac{1}{2}\alpha}(df_{\text{smallest variance}},\ df_{\text{largest variance}})$$
This value, read from the F table, serves as the maximum value.
g. Find F table left using:
$$F_{table\ left} = F_{(1-\frac{1}{2}\alpha)}(df_{\text{smallest variance}},\ df_{\text{largest variance}})$$
or, equivalently,
$$F_{table\ left} = \frac{1}{F_{original\ table}}$$
h. Determine the test criterion: if $F_{table\ left} \le F_{count} \le F_{table\ right}$, then H0 is accepted (homogeneous).
i. Compare F count with F table left and F table right.
j. Draw a conclusion.
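The two-sided variant can be sketched as follows. It assumes the ratio is formed as smallest variance over largest variance and that the left table value is obtained as the reciprocal of the table value with the degrees of freedom swapped, which is a standard property of the F distribution. The variances, the table values and the function name `two_sided_f_decision` are illustrative assumptions, not taken from the text.

```python
def two_sided_f_decision(var_small, var_large, f_original_table, f_table_right):
    """Compare F_count = smallest / largest variance with both table values.

    f_original_table: upper alpha/2 value with df (largest, smallest)
    f_table_right:    upper alpha/2 value with df (smallest, largest)
    """
    f_count = var_small / var_large
    f_table_left = 1.0 / f_original_table   # reciprocal rule for the lower value
    homogeneous = f_table_left <= f_count <= f_table_right
    return f_count, f_table_left, homogeneous


# Hypothetical variances: the smaller one from n = 5 values, the larger from n = 6.
var_small, var_large = 0.7, 6.1667

# F table values for upper-tail area 0.025 (alpha = 0.05), entered by hand:
# df (5, 4) for the original table and df (4, 5) for the right-hand value.
f_count, f_left, homogeneous = two_sided_f_decision(
    var_small, var_large, f_original_table=9.36, f_table_right=7.39
)

print(f"F left = {f_left:.4f}, F count = {f_count:.4f}, F right = 7.39")
print("H0 accepted (homogeneous)" if homogeneous else "H0 rejected")
```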
3. Bartlett Test
• Suppose the k populations have homogeneous variances, namely:
$$\sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$$
• The hypotheses to be tested:
H0 : the populations have the same variance
H1 : the populations have unequal variances
To simplify the calculation, the quantities needed for the Bartlett test are best arranged in a list.

Sample   df          1/df           si²    log si²    (df) log si²
1        n1 - 1      1/(n1 - 1)     s1²    log s1²    (n1 - 1) log s1²
2        n2 - 1      1/(n2 - 1)     s2²    log s2²    (n2 - 1) log s2²
...      ...         ...            ...    ...        ...
k        nk - 1      1/(nk - 1)     sk²    log sk²    (nk - 1) log sk²
Total    Σ(ni - 1)   Σ(1/(ni - 1))  --     --         Σ((ni - 1) log si²)

From the list above we can calculate:
• The pooled variance of all samples:
$$s^2 = \frac{\sum (n_i - 1)\, s_i^2}{\sum (n_i - 1)}$$
• The quantity B, by the formula:
$$B = (\log s^2) \sum (n_i - 1)$$
• The chi-square statistic used for the Bartlett test:
$$\chi^2 = (\ln 10) \left\{ B - \sum (n_i - 1) \log s_i^2 \right\}$$
with ln 10 = 2.3026, the natural logarithm of 10.
With significance level α, we reject H0 if:
$$\chi^2 \ge \chi^2_{(1-\alpha)(k-1)}$$
where $\chi^2_{(1-\alpha)(k-1)}$ is obtained from the chi-square table with probability (1 - α) and df = (k - 1).
• To correct the statistic, use the correction factor K:
$$K = 1 + \frac{1}{3(k-1)} \left\{ \sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{\sum (n_i - 1)} \right\}$$
• The corrected statistic that is then used is:
$$\chi_K^2 = \frac{1}{K}\, \chi^2$$
• Here χ² on the right side is the uncorrected statistic above. In this case the hypothesis is rejected if:
$$\chi_K^2 \ge \chi^2_{(1-\alpha)(k-1)}$$
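The Bartlett quantities can be collected in one short function. This is a sketch under the assumption that the sample variances and sample sizes are already known. The function name `bartlett` and the input values are hypothetical, and base-10 logarithms are used so that the factor ln 10 appears as in the hand calculation.

```python
import math


def bartlett(variances, sizes):
    """Return (chi-square, corrected chi-square) for k sample variances."""
    k = len(variances)
    dfs = [n - 1 for n in sizes]
    total_df = sum(dfs)

    # Pooled variance of all samples.
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df
    # B = (log s^2) * sum(n_i - 1), with base-10 logarithm.
    b = math.log10(pooled) * total_df
    # chi^2 = ln(10) * {B - sum((n_i - 1) * log s_i^2)}
    chi2 = math.log(10) * (b - sum(df * math.log10(v) for df, v in zip(dfs, variances)))
    # Correction factor K and corrected statistic chi^2 / K.
    correction = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (k - 1))
    return chi2, chi2 / correction


# Hypothetical variances and sample sizes for k = 3 samples.
chi2, chi2_corrected = bartlett([4.0, 9.0, 6.5], [10, 12, 11])
print(round(chi2, 4), round(chi2_corrected, 4))
```

Either value is then compared with the chi-square table value for df = k - 1. Since K is greater than 1, the corrected statistic is always slightly smaller than the uncorrected one.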
Example:
Continuing the normality test example, we test the homogeneity of two populations: the
population for the selection of female students and the population for the selection of male
students. The first selection had 15 female students and the second selection had 14 male students.

The value of selected new students
The population for the selection of female students (mean rounded to 88)

Value (x)   x - x̄    (x - x̄)²
90            2        4
95            7       49
80           -8       64
80           -8       64
85           -3        9
85           -3        9
85           -3        9
95            7       49
95            7       49
70          -18      324
95            7       49
90            2        4
85           -3        9
90            2        4
95            7       49
Total                745

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{745}{14} = 53.2$$

The value of selected new students
The population for the selection of male students (mean rounded to 87)

Value (x)   x - x̄    (x - x̄)²
90            3        9
90            3        9
70          -17      289
85           -2        4
90            3        9
80           -7       49
90            3        9
85           -2        4
85           -2        4
95            8       64
90            3        9
95            8       64
85           -2        4
85           -2        4
Total                531

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{531}{13} = 40.8$$

The quantities needed for the Bartlett test

Sample   df   1/df      si²    log si²   (df) log si²
Female   14   0.07142   53.2   1.725     24.15
Male     13   0.07692   40.8   1.610     20.93
Total    27   0.14834   --     --        45.08

The hypothesis tested is
$$H_0 : \sigma_1^2 = \sigma_2^2$$
Pooled variance of the two samples:
$$s^2 = \frac{\sum (n_i - 1)\, s_i^2}{\sum (n_i - 1)} = \frac{14(53.2) + 13(40.8)}{27} = 47.22$$
So log s² = log 47.22 = 1.67412.
Value of B:
$$B = (\log s^2) \sum (n_i - 1) = (1.67412)(27) = 45.201$$
Value of χ² count:
$$\chi^2 = (\ln 10) \left\{ B - \sum (n_i - 1) \log s_i^2 \right\} = (2.3026)\{45.201 - 45.08\} = 0.279$$
If α = 0.05, from the chi-square distribution table with df = 1 we get:
$$\chi^2_{0.95(1)} = 3.841$$
Since 0.279 < 3.841, the hypothesis is accepted at the 0.05 significance level, so it can be said that both sets of data come from populations with homogeneous variance.
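The example can be re-run directly from the 15 female and 14 male scores. This sketch uses the unrounded sample means, so the variances match the SPSS Descriptives output (53.095 and 40.797) rather than the hand values 53.2 and 40.8, which were obtained with means rounded to 88 and 87. The variable names are chosen for this illustration. Whatever the rounding, the Bartlett statistic stays far below the table value 3.841 for df = 1.

```python
import math
import statistics

# Scores of the selected students, as listed in the example.
female = [90, 95, 80, 80, 85, 85, 85, 95, 95, 70, 95, 90, 85, 90, 95]
male = [90, 90, 70, 85, 90, 80, 90, 85, 85, 95, 90, 95, 85, 85]

var_f = statistics.variance(female)    # sample variance with n - 1
var_m = statistics.variance(male)
df_f, df_m = len(female) - 1, len(male) - 1

# Pooled variance, B, and the Bartlett chi-square for k = 2 samples.
pooled = (df_f * var_f + df_m * var_m) / (df_f + df_m)
b = math.log10(pooled) * (df_f + df_m)
chi2 = math.log(10) * (b - (df_f * math.log10(var_f) + df_m * math.log10(var_m)))

chi_table = 3.841                      # chi-square table, df = k - 1 = 1, alpha = 0.05

print(f"variances: {var_f:.3f} (female), {var_m:.3f} (male)")
print(f"chi-square count = {chi2:.3f}")
print("H0 accepted (homogeneous)" if chi2 < chi_table else "H0 rejected")
```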
Steps for testing with SPSS 16.
The results of testing with SPSS
Case Processing Summary

                                    Cases
        Gender    Valid            Missing          Total
                  N     Percent    N     Percent    N     Percent
Value   Female    15    100.0%     0     .0%        15    100.0%
        Male      14    100.0%     0     .0%        14    100.0%
Descriptives

        Gender                                                  Statistic   Std. Error
Value   Female   Mean                                           87.6667     1.88140
                 95% Confidence Interval for Mean, Lower Bound  83.6315
                 95% Confidence Interval for Mean, Upper Bound  91.7019
                 5% Trimmed Mean                                88.2407
                 Median                                         90.0000
                 Variance                                       53.095
                 Std. Deviation                                 7.28665
                 Minimum                                        70.00
                 Maximum                                        95.00
                 Range                                          25.00
                 Interquartile Range                            10.00
                 Skewness                                       -.955       .580
                 Kurtosis                                       .872        1.121
        Male     Mean                                           86.7857     1.70706
                 95% Confidence Interval for Mean, Lower Bound  83.0978
                 95% Confidence Interval for Mean, Upper Bound  90.4736
                 5% Trimmed Mean                                87.2619
                 Median                                         87.5000
                 Variance                                       40.797
                 Std. Deviation                                 6.38723
                 Minimum                                        70.00
                 Maximum                                        95.00
                 Range                                          25.00
                 Interquartile Range                            5.00
                 Skewness                                       -1.307      .597
                 Kurtosis                                       2.865       1.154
Tests of Normality

                  Kolmogorov-Smirnov(a)        Shapiro-Wilk
        Gender    Statistic   df   Sig.        Statistic   df   Sig.
Value   Female    .176        15   .200*       .872        15   .037
        Male      .247        14   .020        .859        14   .029
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.
Test of Homogeneity of Variance

                                               Levene Statistic   df1   df2      Sig.
Value   Based on Mean                          .587               1     27       .450
        Based on Median                        .354               1     27       .557
        Based on Median and with adjusted df   .354               1     26.414   .557
        Based on trimmed mean                  .532               1     27       .472
Test criterion: accept H0 if sig > 0.05. In the SPSS results table, sig exceeds 0.05, so H0 is accepted. In other words, the populations of selection values for female and male students have the same variance.
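The "Based on Mean" Levene statistic in the SPSS table can be reproduced by hand. The text does not give the formula, so the sketch below assumes the usual definition: a one-way analysis of variance on the absolute deviations of each value from its own group mean. The function name `levene_statistic` is chosen for this illustration, and the significance value is not computed because it needs the F distribution.

```python
import statistics


def levene_statistic(*groups):
    """Levene's W based on absolute deviations from each group mean."""
    z = [[abs(x - statistics.fmean(g)) for x in g] for g in groups]
    n_total = sum(len(g) for g in groups)
    k = len(groups)

    z_means = [statistics.fmean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)

    w = (n_total - k) / (k - 1) * between / within
    return w, (k - 1, n_total - k)


# Scores of the selected students, as listed in the example.
female = [90, 95, 80, 80, 85, 85, 85, 95, 95, 70, 95, 90, 85, 90, 95]
male = [90, 90, 70, 85, 90, 80, 90, 85, 85, 95, 90, 95, 85, 85]

w, (df1, df2) = levene_statistic(female, male)
print(f"Levene statistic = {w:.3f}, df1 = {df1}, df2 = {df2}")
```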