Summary Table of Inference Procedures for a Single Sample (I)
Horng-Chyi Horng, Statistics II
Summary Table of Inference Procedures for a Single Sample (I) &4-8 (&8-6)
Summary Table of Inference Procedures for a Single Sample (II)
Testing for Goodness of Fit
In general, we do not know the underlying distribution of the population, and we wish to test the hypothesis that a particular distribution will be satisfactory as a population model.
Probability plotting can only be used to examine whether a population is normally distributed.
Histograms and similar plots can only be used to guess the possible type of the underlying distribution.
&4-9 (&8-7)
Goodness-of-Fit Test (I)
A random sample of size n is taken from a population whose probability distribution is unknown.
These n observations are arranged in a frequency histogram having k bins or class intervals.
Let $O_i$ be the observed frequency in the ith class interval, and $E_i$ be the expected frequency in the ith class interval from the hypothesized probability distribution. The test statistic is
$$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
Goodness-of-Fit Test (II)
If the population follows the hypothesized distribution, $\chi_0^2$ has approximately a chi-square distribution with k-p-1 d.f., where p represents the number of parameters of the hypothesized distribution estimated by sample statistics.
That is,
$$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{k-p-1}$$
Reject the hypothesis if
$$\chi_0^2 > \chi^2_{\alpha,\,k-p-1}$$
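The statistic and its degrees of freedom can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_gof`, its parameter names, and the die-roll counts are assumptions made for the example and do not come from the slides.

```python
def chi_square_gof(observed, expected, n_estimated_params=0):
    """Return (statistic, degrees_of_freedom) for a chi-square goodness-of-fit test.

    observed, expected: per-cell frequencies O_i and E_i (same length k).
    n_estimated_params: p, the number of parameters estimated from the sample.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    k = len(observed)
    # Sum of (O_i - E_i)^2 / E_i over the k cells.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = k - n_estimated_params - 1
    return statistic, dof


# Hypothetical data (not from the slides): 60 rolls of a die hypothesized to be fair.
# No parameter is estimated from the sample, so p = 0 and d.f. = 6 - 0 - 1 = 5.
observed = [8, 12, 9, 11, 10, 10]
expected = [10.0] * 6
statistic, dof = chi_square_gof(observed, expected)
print(f"chi-square = {statistic:.2f} with {dof} d.f.")
```

The hypothesis is rejected when the returned statistic exceeds the tabulated critical value for the chosen significance level and the returned degrees of freedom.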
Goodness-of-Fit Test (III)
Class intervals are not required to be of equal width.
The minimum expected frequency must not be too small. Values of 3, 4, and 5 are typically used as the minimum.
When the expected frequency of a class interval is too small, we can combine that interval with a neighboring class interval. Each such merge reduces k by one.
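One possible way to automate the merging rule is sketched below. The helper `merge_small_cells`, the choice to merge into the left neighbor, the default threshold of 3, and the sample frequencies are all assumptions for illustration; the slides only say that a small cell is combined with a neighboring one.

```python
def merge_small_cells(observed, expected, min_expected=3.0):
    """Merge cells until every expected frequency is at least min_expected.

    A cell that is too small is merged into its left neighbor
    (or into its right neighbor if it is the first cell).
    """
    obs, exp = list(observed), list(expected)
    while len(exp) > 1:
        # Find the first cell whose expected frequency is too small.
        small = next((i for i, e in enumerate(exp) if e < min_expected), None)
        if small is None:
            break
        target = small - 1 if small > 0 else 1
        obs[target] += obs[small]
        exp[target] += exp[small]
        del obs[small], exp[small]
    return obs, exp


# Hypothetical frequencies (not from the slides): the last cell has E = 1.5 < 3.
merged_obs, merged_exp = merge_small_cells(
    [20, 14, 9, 5, 2], [18.0, 16.0, 10.0, 4.5, 1.5]
)
print(merged_obs, merged_exp)  # k drops from 5 to 4
```

The number of cells k used for the degrees of freedom is the length of the merged lists, not of the original ones.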
Example 8-18
The number of defects in printed circuit boards is hypothesized to follow a Poisson distribution. A random sample of 60 printed boards has been collected, with 32, 15, 9, and 4 boards showing 0, 1, 2, and 3 defects, respectively.
The only parameter of the Poisson distribution is $\lambda$, which can be estimated by the sample mean: {0(32) + 1(15) + 2(9) + 3(4)}/60 = 0.75. Therefore, the expected frequency of the first cell is:
$$p_1 = P(X = 0) = \frac{e^{-0.75}(0.75)^0}{0!} = 0.472$$
$$E_1 = 0.472 \times 60 = 28.32$$
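The arithmetic for the first cell can be checked with a short Python sketch. It takes the observed counts to be 32, 15, 9, and 4 boards with 0, 1, 2, and 3 defects, which are the values consistent with n = 60 and the stated mean of 0.75. The names `poisson_pmf`, `counts`, and `lam_hat` are illustrative, not from the slides.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Observed number of boards for each defect count (assumed as described above).
counts = {0: 32, 1: 15, 2: 9, 3: 4}
n = sum(counts.values())

# Estimate lambda by the sample mean.
lam_hat = sum(x * f for x, f in counts.items()) / n

# Probability and expected frequency of the first cell (0 defects).
p1 = poisson_pmf(0, lam_hat)
e1_rounded = n * round(p1, 3)  # the slide rounds p1 to 0.472 before multiplying
e1_exact = n * p1              # slightly larger when p1 is not rounded first

print(f"lambda = {lam_hat}, p1 = {p1:.3f}, E1 = {e1_rounded:.2f}")
```

The small difference between `e1_rounded` and `e1_exact` comes only from rounding the probability to three decimals before multiplying by n.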
Example 8-18 (Cont.)
Since the expected frequency in the last cell is less than 3, we combine the last two cells. The three resulting cells (0, 1, and 2 or more defects) have observed frequencies 32, 15, and 13 and expected frequencies 28.32, 21.24, and 10.44.
Example 8-18 (Cont.)
1. The variable of interest is the form of the distribution of defects in printed circuit boards.
2. $H_0$: The form of the distribution of defects is Poisson.
   $H_1$: The form of the distribution of defects is not Poisson.
3. k = 3, p = 1, k-p-1 = 1 d.f.
4. At $\alpha$ = 0.05, we reject $H_0$ if $\chi_0^2 > \chi^2_{0.05,1} = 3.84$.
5. The test statistic is:
$$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = \frac{(32 - 28.32)^2}{28.32} + \frac{(15 - 21.24)^2}{21.24} + \frac{(13 - 10.44)^2}{10.44} = 2.94$$
6. Since $\chi_0^2 = 2.94 < \chi^2_{0.05,1} = 3.84$, we are unable to reject the null hypothesis that the distribution of defects in printed circuit boards is Poisson.
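The six steps can be reproduced numerically with the following sketch, which uses the observed and expected frequencies and the critical value 3.84 quoted in the example. The P-value line is an addition that is not on the slide; it relies on the fact that a chi-square variable with 1 d.f. is the square of a standard normal, so its upper tail equals erfc(sqrt(x/2)). That shortcut holds only for 1 d.f.

```python
import math

# Cells after combining the last two (0, 1, and 2 or more defects).
observed = [32, 15, 13]
expected = [28.32, 21.24, 10.44]
p = 1  # one estimated parameter (lambda)

chi2_0 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - p - 1

critical = 3.84  # chi-square critical value for alpha = 0.05, 1 d.f. (from the slide)
reject = chi2_0 > critical

# Upper-tail probability; this closed form is valid for 1 d.f. only.
p_value = math.erfc(math.sqrt(chi2_0 / 2))

print(f"chi-square = {chi2_0:.2f}, d.f. = {dof}, reject H0: {reject}")
print(f"approximate P-value = {p_value:.4f}")
```

A P-value above 0.05 agrees with the conclusion in step 6 that the Poisson hypothesis cannot be rejected at the 5% level.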
Contingency Table Tests, Example 8-20
A company has to choose among three pension plans. Management wishes to know whether the preference for plans is independent of job classification and wants to use $\alpha$ = 0.05. The opinions of a random sample of 500 employees are shown in Table 8-4.
(&8-8)
Contingency Table Test - The Problem Formulation (I)
There are two classifications; one has r levels and the other has c levels (here, 3 pension plans and 2 types of workers).
We want to know whether the two methods of classification are statistically independent (here, whether the preference for pension plans is independent of job classification).
The table: an r x c array of observed frequencies $O_{ij}$, one for each combination of row level i and column level j.
Contingency Table Test - The Problem Formulation (II)
Let $p_{ij}$ be the probability that a randomly selected element falls in the ijth cell, given that the two classifications are independent. Then $p_{ij} = u_i v_j$, where the estimators for $u_i$ and $v_j$ are
$$\hat{u}_i = \frac{1}{n}\sum_{j=1}^{c} O_{ij}, \qquad \hat{v}_j = \frac{1}{n}\sum_{i=1}^{r} O_{ij}$$
Therefore, the expected frequency of each cell is
$$E_{ij} = n\hat{u}_i\hat{v}_j = \frac{1}{n}\sum_{j=1}^{c} O_{ij}\sum_{i=1}^{r} O_{ij}$$
Then, for large n, the statistic
$$\chi_0^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
has an approximate chi-square distribution with (r-1)(c-1) d.f.
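The expected frequencies and the statistic for an r x c table can be computed as in the sketch below. The function name `contingency_chi_square` and the 2 x 3 table are hypothetical and are used only to show the mechanics; they are not the data of Table 8-4.

```python
def contingency_chi_square(table):
    """Return (expected, statistic, dof) for an r x c table of observed counts.

    table: list of r rows, each a list of c observed frequencies O_ij.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]        # n * u_i estimates
    col_totals = [sum(col) for col in zip(*table)]  # n * v_j estimates
    # E_ij = (row total i) * (column total j) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, dof


# Hypothetical 2 x 3 table of observed counts (not Table 8-4).
table = [
    [30, 20, 10],
    [10, 20, 10],
]
expected, statistic, dof = contingency_chi_square(table)
print(expected)
print(f"chi-square = {statistic:.2f} with {dof} d.f.")
```

Independence is rejected when the statistic exceeds the chi-square critical value with (r-1)(c-1) d.f. at the chosen significance level.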
Example 8-20