powerpoint presentation · 2017-09-22 · 4-3 hypothesis testing 4-3.1 statistical hypotheses test...
TRANSCRIPT
-
4-1 Statistical Inference
• The field of statistical inference consists of those
methods used to make decisions or draw
conclusions about a population.
•These methods utilize the information contained in
a random sample from the population in drawing
conclusions.
-
4-1 Statistical Inference
-
4-2 Point Estimation
-
4-2 Point Estimation
-
4-2 Point Estimation
-
4-2 Point Estimation
-
4-3 Hypothesis Testing
We like to think of statistical hypothesis testing as the
data analysis stage of a comparative experiment, in which the engineer is interested, for example, in
comparing the mean of a population to a specified
value (e.g. mean pull strength).
4-3.1 Statistical Hypotheses
-
4-3 Hypothesis Testing
4-3.1 Statistical Hypotheses
Two-sided Alternative Hypothesis
One-sided Alternative Hypotheses
-
4-3 Hypothesis Testing
4-3.1 Statistical Hypotheses
Test of a Hypothesis
• A procedure leading to a decision about a particular
hypothesis
• Hypothesis-testing procedures rely on using the information
in a random sample from the population of interest.
• If this information is consistent with the hypothesis, then we
will conclude that the hypothesis is true; if this information is
inconsistent with the hypothesis, we will conclude that the
hypothesis is false.
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
Sometimes the type I error probability is called the
significance level, or the -error, or the size of the test.
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
Reduce α
1. push critical regions further
2. Increasing sample size
increasing Z value
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
error when μ =52 and n=16.
Increasing the sample size results
in a decrease in β.
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
error when μ =50.5 and n=10.
β increases as the true value of μ
approaches the hypothesized value.
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
-
4-3 Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
• The power is computed as 1 - b, and power can be interpreted as
the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power
properties.
• For example, consider the propellant burning rate problem when
we are testing H 0 : m = 50 centimeters per second against H 1 : m not equal 50 centimeters per second . Suppose that the true value of the
mean is m = 52. When n = 10, we found that b = 0.2643, so the power of this test is 1 - b = 1 - 0.2643 = 0.7357 when m = 52.
-
4-3 Hypothesis Testing
4-3.3 P-Values in Hypothesis Testing
P-value carries info about the weight of evidence against H0.
The smaller the p-value, the greater the evidence against H0.
-
4-3 Hypothesis Testing
4-3.4 One-Sided and Two-Sided Hypotheses
Two-Sided Test:
One-Sided Tests:
-
4-3 Hypothesis Testing
4-3.5 General Procedure for Hypothesis Testing
-
OPTIONS NODATE NONUMBER;
DATA EX415;
mu_h0=12; mu_h1=11.25; sigma=0.5; n=4; alpha=0.05;
xbar=11.7; sem=sigma/sqrt(n); /* sem = mean standard error */
Z=(xbar-mu_h0)/sem; z_beta= probit(alpha);
xbar_beta=mu_h0+z_beta*sem;
z_beta=(xbar_beta-mu_h1)/sem;
PVAL=PROBNORM(Z);
beta=1-probnorm(z_beta);
power=1-beta;
PROC PRINT;
VAR PVAL beta power;
TITLE '4-15 (page 168)';
RUN; QUIT;
4-15 (page 168)
OBS PVAL beta power
1 0.11507 0.087685 0.91231
4-3 Hypothesis Testing
-
4-4 Inference on the Mean of a Population,
Variance Known
Assumptions
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.1 Hypothesis Testing on the Mean
We wish to test:
The test statistic is:
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.1 Hypothesis Testing on the Mean
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.1 Hypothesis Testing on the Mean
Reject H0 if the observed value of the test statistic z0 is
either:
or
Fail to reject H0 if
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.1 Hypothesis Testing on the Mean
-
4-4 Inference on the Mean of a Population,
Variance KnownEx. 4-3
X – burning rate
μ = 50, s = 2 a = 0.05, n = 25, ҧ𝑥 = 51.3
i) Hypothesis
H0: μ = 50
H1: μ ≠ 50
ii) Test Statistic
z0 =ഥ𝑋 − 𝜇0
ൗ𝜎
𝑛
=51.3 −50
ൗ2 25= 1.3/0.4 = 3.25
iii) p-value = 2[1 – Φ(3.25)] = 0.0012
Critical Region (CR) 또는 Region of Rejection (ROR)z0 > z0.025= 1.96 or z0< - z0.025= -1.96
iv) Conclusion
p-value=0.0012 값이유의수준 a = 0.05보다작기때문에(또는, 통계량 z0=3.25 값이기각역 (z0 = z0.025> 1.96) 에포함됨으로) H0를기각함. 즉,평균 burning rate이 50m/sec 이아니다라고결론내림.
-
4-4 Inference on the Mean of a Population,
Variance Known
OPTIONS NODATE NONUMBER;
DATA EX43;
MU=50; SD=2; N=25; ALPHA=0.05; XBAR=51.3;
SEM=SD/SQRT(N); /*MEAN STANDARD ERROR*/
Z=(XBAR-MU)/SEM;
PVAL=2*(1-PROBNORM(Z));
PROC PRINT;
VAR Z PVAL ALPHA;
TITLE 'TEST OF MU=50 VS NOT=50';
RUN; QUIT;
TEST OF MU=50 VS NOT=50
OBS Z PVAL ALPHA
1 3.25 .001154050 0.05
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.2 Type II Error and Choice of Sample Size
Finding The Probability of Type II Error b
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.2 Type II Error and Choice of Sample Size
Finding The Probability of Type II Error b
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.2 Type II Error and Choice of Sample Size
Sample Size Formulas
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.2 Type II Error and Choice of Sample Size
Sample Size Formulas
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.2 Type II Error and Choice of Sample Size
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.2 Type II Error and Choice of Sample Size
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.3 Large Sample Test
In general, if n 30, the sample variance s2
will be close to σ2 for most samples, and so s
can be substituted for σ in the test procedures
with little harmful effect.
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.4 Some Practical Comments on Hypothesis
Testing
The Seven-Step Procedure
Only three steps are really required:
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.4 Some Practical Comments on Hypothesis
Testing
Statistical versus Practical Significance
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.4 Some Practical Comments on Hypothesis
Testing
Statistical versus Practical Significance
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
Two-sided confidence interval:
One-sided confidence intervals:
Confidence coefficient:
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.6 Confidence Interval on the Mean
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
Relationship between Tests of Hypotheses and
Confidence Intervals
If [l,u] is a 100(1 - ) percent confidence interval for the
parameter, then the test of significance level of the
hypothesis
will lead to rejection of H0 if and only if the hypothesized
value is not in the 100(1 - ) percent confidence interval
[l, u].
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
Confidence Level and Precision of Estimation
The length of the two-sided 95% confidence interval is
whereas the length of the two-sided 99% confidence
interval is
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
Choice of Sample Size
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
Choice of Sample Size
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
Choice of Sample Size
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Confidence Interval on the Mean
One-Sided Confidence Bounds
-
4-4 Inference on the Mean of a Population,
Variance Known
4-4.6 General Method for Deriving a Confidence
Interval
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.1 Hypothesis Testing on the Mean
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.1 Hypothesis Testing on the Mean
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.1 Hypothesis Testing on the Mean
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.1 Hypothesis Testing on the Mean
Calculating the P-value
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.1 Hypothesis Testing on the Mean
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.1 Hypothesis Testing on the Mean
Fixed Significance Level Approach
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.1 Hypothesis Testing on the Mean
-
4-5 Inference on the Mean of a Population,
Variance Unknown
Ex. 4-7ത𝑋 – mean coefficient of restitution n = 15
i) Hypothesis
H0: μ = 0.82
H1: μ > 0.82
ii) Normality Check!
iii) Test Statistic
t0 =ഥ𝑋 − 𝜇0
ൗ𝑠
𝑛
=0.83725 −0.82
ൗ0.02456 15= 2.72
iv) Fixed significance level approach
Critical Region (CR) 또는 Region of Rejection (ROR): t0 > t0.05= 1.761P-value approach
t0= 2.72는 t0.01;14= 2.624 값과 t0.005;14= 2.977 사이에있으므로0.005
-
4-5 Inference on the Mean of a Population,
Variance Unknown
OPTIONS NODATE NONUMBER;
DATA ex47;
xbar=0.83725; sd=0.02456;
mu=0.82; n=15;alpha=0.05; df=n-1;
t0=(xbar-mu)/(sd/sqrt(n));
pval = 1- PROBT(t0, df); /* PROBT function returns the probability from a t distribution
less than or equal to x */
cv=tinv(1-alpha, df); /* critical value */
/* TINV function returns a quantile from the t distribution */
PROC PRINT;
var t0 pval cv;
title 'Test of mu=0.82 vs mu>0.82';
title2 'Ex. 4-7 in pp193';
RUN; QUIT;
-
4-5 Inference on the Mean of a Population,
Variance Unknown
OPTIONS NOOVP NODATE NONUMBER;
DATA ex47;
input coeffs @@;
cards;
0.8411 0.8191 0.8182 0.8125 0.8750
0.8580 0.8532 0.8483 0.8276 0.7983
0.8042 0.8730 0.8282 0.8359 0.8660
proc univariate data=ex47 mu0=0.82 plot normal;
var coeffs;
title 'using proc uinivariate for t-test for one population mean';
title2 “Example 4-7 in page 191”;
RUN; QUIT;
-
4-5 Inference on the Mean of a Population,
Variance Unknown
-
4-5 Inference on the Mean of a Population,
Variance Unknown
-
4-5 Inference on the Mean of a Population,
Variance Unknown
-
4-5 Inference on the Mean of a Population,
Variance Unknown
OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA ex47;
input coeffs @@;
cards;
0.8411 0.8191 0.8182 0.8125 0.8750
0.8580 0.8532 0.8483 0.8276 0.7983
0.8042 0.8730 0.8282 0.8359 0.8660
ods graphics on;
proc ttest data=ex47 h0=0.82 sides=u;
/* h0는귀무가설, sides는양측일경우 2, lower 단측검정일경우 L, upper 단측검정일경우 u*/var coeffs;
title 'using t-test for one population mean';
title “Example 4-7 in page 191”;
RUN; ods graphics off;
QUIT;
-
4-5 Inference on the Mean of a Population,
Variance Unknown
-
4-5 Inference on the Mean of a Population,
Variance Unknown
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.2 Type II Error and Choice of Sample Size
Fortunately, this unpleasant task has already been done,
and the results are summarized in a series of graphs in
Appendix A Charts Va, Vb, Vc, and Vd that plot for the
t-test against a parameter d for various sample sizes n.
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.2 Type II Error and Choice of Sample Size
These graphics are called operating characteristic
(or OC) curves. Curves are provided for two-sided
alternatives on Charts Va and Vb. The abscissa scale
factor d on these charts is defined as
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.2 Type II Error and Choice of Sample Size
Chart V (c) in pp. 496
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.3 Confidence Interval on the Mean
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.3 Confidence Interval on the Mean
-
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.4 Confidence Interval on the Mean
-
4-5 Inference on the Mean of a Population,
Variance Unknown
OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA ex60;
input diam @@;
cards;
8.23 8.31 8.42 8.29 8.19 8.24
8.19 8.29 8.30 8.14 8.32 8.40
ods graphics on;
proc univariate plot normal mu0=8.2;
var diam;
title 'univariate for t-test with H0: mu=8.2';
proc ttest data=ex60 h0=8.2;
var diam;
title 't-test for one population mean';
title “Example 4-60 in page 198”;
RUN; ods graphics off;
QUIT;
-
4-5 Inference on the Mean of a Population,
Variance Unknown
-
4-5 Inference on the Mean of a Population,
Variance Unknown
-
4-6 Inference on the Variance of a
Normal Population
4-6.1 Hypothesis Testing on the Variance of a
Normal Population
-
4-6 Inference on the Variance of a
Normal Population
4-6.1 Hypothesis Testing on the Variance of a
Normal Population
-
4-6 Inference on the Variance of a
Normal Population
4-6.1 Hypothesis Testing on the Variance of a
Normal Population
-
4-6 Inference on the Variance of a
Normal Population
4-6.1 Hypothesis Testing on the Variance of a
Normal Population
-
4-6 Inference on the Variance of a
Normal Population
4-6.1 Hypothesis Testing on the Variance of a
Normal Population
-
4-6 Inference on the Variance of a
Normal Population
4-6.1 Hypothesis Testing on the Variance of a
Normal Population
-
4-6 Inference on the Variance of a
Normal Population
4-6.2 Confidence Interval on the Variance of a
Normal Population
-
4-6 Inference on the Variance of a
Normal Population
OPTIONS NODATE NONUMBER;
DATA ex410;
p_var=0.01; s_var=0.0153;
n=20;alpha=0.05; df=n-1;
chisq=(n-1)*s_var/p_var;
p_value=1-probchi(chisq, df);
cv1=cinv(1-alpha, df); /* critical value */
cv2=cinv(alpha, df);
lcl=sqrt((n-1)*s_var/cv1); /* ucl= upper confidence limit */
PROC PRINT;
var chisq p_value cv1 cv2 lcl;
title 'Test of H0: var=0.01 vs H0: var>0.01';
title2 'Ex. 4-10 in pp202';
RUN; QUIT;
-
4-7 Inference on Population Proportion
4-7.1 Hypothesis Testing on a Binomial Proportion
We will consider testing:
-
4-7 Inference on Population Proportion
4-7.1 Hypothesis Testing on a Binomial Proportion
-
4-7 Inference on Population Proportion
4-7.1 Hypothesis Testing on a Binomial Proportion
-
4-7 Inference on Population Proportion
4-7.1 Hypothesis Testing on a Binomial Proportion
-
4-7 Inference on Population Proportion
4-7.2 Type II Error and Choice of Sample Size
-
4-7 Inference on Population Proportion
4-7.2 Type II Error and Choice of Sample Size
-
4-7 Inference on Population Proportion
4-7.3 Confidence Interval on a Binomial Proportion
-
4-7 Inference on Population Proportion
4-7.3 Confidence Interval on a Binomial Proportion
-
4-7 Inference on Population Proportion
4-7.3 Confidence Interval on a Binomial Proportion
Choice of Sample Size
-
4-8 Other Interval Estimates for a
Single Sample
4-8.1 Prediction Interval
-
4-8 Other Interval Estimates for a
Single Sample
4-8.2 Tolerance Intervals for a Normal Distribution
-
4-10 Testing for Goodness of Fit
• So far, we have assumed the population or probability
distribution for a particular problem is known.
• There are many instances where the underlying
distribution is not known, and we wish to test a particular
distribution.
• Use a goodness-of-fit test procedure based on the chi-
square distribution.