Chapters 8 - 9
Estimation (Icelandic: Mat og metlar)
Estimator and Estimate (Icelandic: Metill og mat)
An estimator of a population parameter is a random variable that depends on the sample information and whose value provides approximations to this unknown parameter. A specific value of that random variable is called an estimate.
Point Estimator and Point Estimate (Icelandic: Punktmetill og punktmat)
Let $\theta$ represent a population parameter (Icelandic: þýðisstiki), such as the population mean $\mu$ or the population proportion $\pi$. A point estimator, $\hat{\theta}$, of a population parameter, $\theta$, is a function of the sample information that yields a single number called a point estimate. For example, the sample mean, $\bar{X}$, is a point estimator of the population mean $\mu$, and the value that $\bar{X}$ assumes for a given set of data is called the point estimate.
Unbiasedness (Icelandic: Óhneigður (óbjagaður))
The point estimator $\hat{\theta}$ is said to be an unbiased estimator of the parameter $\theta$ if the expected value, or mean, of the sampling distribution of $\hat{\theta}$ is $\theta$; that is,
$$E(\hat{\theta}) = \theta$$
Probability Density Functions for Unbiased and Biased Estimators (Icelandic: Þéttifall fyrir hneigðan og óhneigðan metil)
(Figure 8.1: sampling densities of two estimators, $\hat{\theta}_1$ and $\hat{\theta}_2$.)
Bias (Icelandic: Bjögun (skekkja))
Let $\hat{\theta}$ be an estimator of $\theta$. The bias in $\hat{\theta}$ is defined as the difference between its mean and $\theta$; that is,
$$\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$$
It follows that the bias of an unbiased estimator is 0.
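The definition of bias can be checked numerically. The sketch below is an illustration only: it assumes a normal population with made-up values ($\sigma = 2$, $n = 5$) and compares two estimators of the variance, one dividing the sum of squares by $n$ and one dividing by $n - 1$. The function name and all numbers are hypothetical.

```python
import random

def simulate_bias(n=5, sigma=2.0, reps=20000, seed=1):
    """Approximate the bias of two variance estimators by simulation."""
    rng = random.Random(seed)
    true_var = sigma ** 2
    total_div_n = 0.0    # running sum of the divisor-n estimates
    total_div_n1 = 0.0   # running sum of the divisor-(n - 1) estimates
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        m = sum(sample) / n
        ss = sum((x - m) ** 2 for x in sample)   # sum of squared deviations
        total_div_n += ss / n
        total_div_n1 += ss / (n - 1)
    # Bias = (average of the estimates) - (true parameter value)
    return total_div_n / reps - true_var, total_div_n1 / reps - true_var

bias_n, bias_n1 = simulate_bias()
print(f"approximate bias, divisor n:     {bias_n:+.3f}")
print(f"approximate bias, divisor n - 1: {bias_n1:+.3f}")
```

With these settings the divisor-$n$ estimator should show a bias near $-\sigma^2/n = -0.8$, while the divisor-$(n-1)$ estimator should show a bias near 0, up to simulation noise.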
Most Efficient Estimator and Relative Efficiency (Icelandic: Skilvirkasti metillinn og hlutfallsleg skilvirkni)
Suppose there are several unbiased estimators of $\theta$. Then the unbiased estimator with the smallest variance is said to be the most efficient estimator, or the minimum variance unbiased estimator, of $\theta$. Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of $\theta$, based on the same number of sample observations. Then,
a) $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if $\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$.
b) The relative efficiency of $\hat{\theta}_1$ with respect to $\hat{\theta}_2$ is the ratio of their variances; that is,
$$\text{Relative Efficiency} = \frac{\mathrm{Var}(\hat{\theta}_2)}{\mathrm{Var}(\hat{\theta}_1)}$$
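Relative efficiency can be approximated by simulation. The sketch below is an illustration under stated assumptions: the population is standard normal, the two estimators of the mean are the sample mean and the sample median (both unbiased under normality), and the sample size and number of replications are arbitrary choices.

```python
import random
import statistics

def relative_efficiency(n=25, reps=4000, seed=7):
    """Approximate Var(median) / Var(mean) over simulated normal samples."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(sum(sample) / n)
        medians.append(statistics.median(sample))
    # Relative efficiency of the mean with respect to the median
    return statistics.variance(medians) / statistics.variance(means)

ratio = relative_efficiency()
print(f"Var(median) / Var(mean) is roughly {ratio:.2f}")
```

A ratio above 1 indicates that the sample mean has the smaller variance, so it is the more efficient of the two estimators for a normal population.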
Point Estimators of Selected Population Parameters (Table 8.1)
Population Parameter | Point Estimator | Properties
Mean, $\mu$ | $\bar{X}$ | Unbiased, Most Efficient (assuming normality)
Mean, $\mu$ | $X_m$ | Unbiased (assuming normality), but not most efficient
Proportion, $\pi$ | $p$ | Unbiased, Most Efficient
Variance, $\sigma^2$ | $s^2$ | Unbiased, Most Efficient (assuming normality)
Confidence Interval Estimator (Icelandic: Metill fyrir öryggismörk)
A confidence interval estimator for a population parameter is a rule for determining (based on sample information) a range, or interval, that is likely to include the parameter. The corresponding estimate is called a confidence interval estimate.
Confidence Interval and Confidence Level
Let $\theta$ be an unknown parameter. Suppose that on the basis of sample information, random variables A and B are found such that $P(A < \theta < B) = 1 - \alpha$, where $\alpha$ is any number between 0 and 1. If specific sample values of A and B are a and b, then the interval from a to b is called a $100(1 - \alpha)\%$ confidence interval of $\theta$. The quantity $(1 - \alpha)$ is called the confidence level of the interval.
If the population were repeatedly sampled a very large number of times, the true value of the parameter $\theta$ would be contained in $100(1 - \alpha)\%$ of intervals calculated this way. The confidence interval calculated in this manner is written as $a < \theta < b$ with $100(1 - \alpha)\%$ confidence.
P(-1.96 < Z < 1.96) = 0.95, where Z is a Standard Normal Variable (Figure 8.3)
(Figure 8.3: standard normal density with area 0.95 between -1.96 and 1.96 and area 0.025 in each tail.)
Notation (Icelandic: Táknmálsnotkun)
Let $Z_{\alpha/2}$ be the number for which
$$P(Z > Z_{\alpha/2}) = \frac{\alpha}{2}$$
where the random variable Z follows a standard normal distribution.
Selected Values $Z_{\alpha/2}$ from the Standard Normal Distribution Table (Table 8.2)
$\alpha$ | 0.01 | 0.02 | 0.05 | 0.10
$Z_{\alpha/2}$ | 2.58 | 2.33 | 1.96 | 1.645
Confidence Level | 99% | 98% | 95% | 90%
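The tabulated values can be reproduced without a printed table. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the helper name `z_upper` is a hypothetical choice for this illustration.

```python
from statistics import NormalDist

def z_upper(alpha):
    """Return Z_{alpha/2}, the point with upper-tail probability alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.01, 0.02, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}  Z = {z_upper(alpha):.3f}  level = {1 - alpha:.0%}")
```

Rounded as in the table, the computed values agree with 2.58, 2.33, 1.96 and 1.645.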
Confidence Intervals for the Mean of a Population that is Normally Distributed: Population Variance Known (Icelandic: Öryggismörk fyrir meðaltal þýðis sem er normaldreift og með þekkta dreifni)
Consider a random sample of n observations from a normal distribution with mean $\mu$ and variance $\sigma^2$. If the sample mean is $\bar{X}$, then a $100(1 - \alpha)\%$ confidence interval for the population mean with known variance is given by
$$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
or equivalently,
$$\bar{X} \pm B$$
where the margin of error (also called the sampling error, the bound, or the interval half width) is given by
$$B = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
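As a worked illustration of the known-variance interval $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$, the sketch below computes the limits and the margin of error. The function name and the input numbers (sample mean 50, $\sigma = 10$, $n = 25$) are made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_variance(xbar, sigma, n, alpha=0.05):
    """Return (LCL, UCL, B) for the population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # reliability factor Z_{alpha/2}
    b = z * sigma / sqrt(n)                   # margin of error B
    return xbar - b, xbar + b, b

# Hypothetical inputs: sample mean 50, known sigma 10, n = 25
lcl, ucl, b = mean_ci_known_variance(50.0, 10.0, 25)
print(f"B = {b:.2f}, interval = ({lcl:.2f}, {ucl:.2f}), width = {ucl - lcl:.2f}")
```

Here $B = 1.96 \times 10 / 5 \approx 3.92$, and the width is twice the margin of error.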
Basic Terminology for Confidence Interval for a Population Mean with Known Population Variance (Icelandic: Orðnotkun fyrir öryggismörk þýðismeðaltals með þekktri dreifni) (Table 8.3)
Terms | Symbol | To Obtain
Standard Error of the Mean | $\sigma_{\bar{X}}$ | $\sigma/\sqrt{n}$
Z Value (also called the Reliability Factor) | $Z_{\alpha/2}$ | Use Standard Normal Distribution Table
Margin of Error (Icelandic: skekkjumörk) | $B$ | $B = Z_{\alpha/2}\,\sigma/\sqrt{n}$
Lower Confidence Limit (Icelandic: neðri mörk) | $LCL$ | $LCL = \bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n}$
Upper Confidence Limit (Icelandic: efri mörk) | $UCL$ | $UCL = \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}$
Width (width is twice the bound) (Icelandic: breidd) | $w$ | $w = 2B = 2Z_{\alpha/2}\,\sigma/\sqrt{n}$
Student’s t Distribution
Given a random sample of n observations, with mean $\bar{X}$ and standard deviation s, from a normally distributed population with mean $\mu$, the variable t follows the Student’s t distribution with (n - 1) degrees of freedom and is given by
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
Notation (Icelandic: Táknmálsnotkun)
A random variable having the Student’s t distribution with v degrees of freedom will be denoted $t_v$. The value $t_{v,\alpha/2}$ is defined as the number for which
$$P(t_v > t_{v,\alpha/2}) = \alpha/2$$
Confidence Intervals for the Mean of a Normal Population: Population Variance Unknown (Icelandic: Öryggismörk fyrir vongildi í normaldreifðu þýði með óþekktri dreifni)
Suppose there is a random sample of n observations from a normal distribution with mean $\mu$ and unknown variance. If the sample mean and standard deviation are, respectively, $\bar{X}$ and s, then a $100(1 - \alpha)\%$ confidence interval for the population mean, variance unknown, is given by
$$\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$
or equivalently,
$$\bar{X} \pm B$$
where the margin of error, the sampling error, or bound, B, is given by
$$B = t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$
and $t_{n-1,\alpha/2}$ is the number for which
$$P(t_{n-1} > t_{n-1,\alpha/2}) = \alpha/2$$
The random variable $t_{n-1}$ has a Student’s t distribution with v = (n - 1) degrees of freedom.
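The unknown-variance interval $\bar{X} \pm t_{n-1,\alpha/2}\,s/\sqrt{n}$ can be sketched as follows. The Python standard library has no Student’s t quantile function, so this sketch assumes the critical value is read from a t table and passed in. The data and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def mean_ci_unknown_variance(data, t_crit):
    """Return (LCL, UCL, B); t_crit is t_{n-1, alpha/2} taken from a t table."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)              # sample standard deviation (divisor n - 1)
    b = t_crit * s / sqrt(n)     # margin of error B
    return xbar - b, xbar + b, b

# Hypothetical data with n = 6, so v = 5; t_{5, 0.025} = 2.571 from a t table
data = [18.2, 19.1, 20.4, 17.8, 21.0, 19.5]
lcl, ucl, b = mean_ci_unknown_variance(data, t_crit=2.571)
print(f"95% interval for the mean: ({lcl:.2f}, {ucl:.2f}), B = {b:.2f}")
```

Passing the critical value explicitly keeps the sketch dependency-free; a statistics package with a t quantile function could supply it instead.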
Confidence Intervals for Population Proportion (Large Samples) (Icelandic: Öryggismörk fyrir þýðishlutfall (stór úrtök))
Let p denote the observed proportion of “successes” in a random sample of n observations from a population with a proportion $\pi$ of successes. Then, if n is large enough that $n\pi(1 - \pi) > 9$, a $100(1 - \alpha)\%$ confidence interval for the population proportion is given by
$$p - Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} < \pi < p + Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$
or equivalently,
$$p \pm B$$
where the margin of error, the sampling error, or bound, B, is given by
$$B = Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$
and $Z_{\alpha/2}$ is the number for which a standard normal variable Z satisfies
$$P(Z > Z_{\alpha/2}) = \alpha/2$$
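A minimal sketch of the large-sample proportion interval follows. Because $\pi$ is unknown, the sketch assumes the sample proportion p can stand in for it when checking the large-sample condition; that substitution, the function name, and the counts are assumptions made for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, alpha=0.05):
    """Return (LCL, UCL, B) for a population proportion (large samples)."""
    p = successes / n
    # Large-sample check, with p standing in for the unknown proportion
    if n * p * (1 - p) <= 9:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    b = z * sqrt(p * (1 - p) / n)     # margin of error B
    return p - b, p + b, b

# Hypothetical sample: 120 successes in 300 observations (p = 0.40)
lcl, ucl, b = proportion_ci(successes=120, n=300)
print(f"95% interval for the proportion: ({lcl:.3f}, {ucl:.3f}), B = {b:.4f}")
```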
Notation (Icelandic: Táknmálsnotkun)
A random variable having the chi-square distribution with v = n - 1 degrees of freedom will be denoted by $\chi^2_v$ or simply $\chi^2_{n-1}$. Define as $\chi^2_{n-1,\alpha}$ the number for which
$$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha$$
The Chi-Square Distribution (Figure 8.17)
(Figure 8.17: chi-square density on an axis starting at 0, with the point $\chi^2_{n-1,\alpha}$ marked and area $1 - \alpha$ to its left.)
The Chi-Square Distribution for n – 1 and $(1 - \alpha)\%$ Confidence Level (Figure 8.18)
(Figure 8.18: chi-square density with area $1 - \alpha$ between the points $\chi^2_{n-1,1-\alpha/2}$ and $\chi^2_{n-1,\alpha/2}$, and area $\alpha/2$ in each tail.)
Confidence Intervals for the Variance of a Normal Population (Icelandic: Öryggismörk fyrir dreifni í normaldreifðu þýði)
Suppose there is a random sample of n observations from a normally distributed population with variance $\sigma^2$. If the observed variance is $s^2$, then a $100(1 - \alpha)\%$ confidence interval for the population variance is given by
$$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$
where $\chi^2_{n-1,\alpha/2}$ is the number for which
$$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha/2}) = \frac{\alpha}{2}$$
and $\chi^2_{n-1,1-\alpha/2}$ is the number for which
$$P(\chi^2_{n-1} < \chi^2_{n-1,1-\alpha/2}) = \frac{\alpha}{2}$$
and the random variable $\chi^2_{n-1}$ follows a chi-square distribution with (n – 1) degrees of freedom.
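The variance interval can be sketched as below. The Python standard library has no chi-square quantile function, so the two critical values are assumed to come from a chi-square table and are passed in. The function name and the inputs ($n = 10$, $s^2 = 4.0$) are hypothetical.

```python
def variance_ci(s2, n, chi2_upper, chi2_lower):
    """Return (LCL, UCL) for the population variance.

    chi2_upper is chi^2_{n-1, alpha/2} and chi2_lower is
    chi^2_{n-1, 1-alpha/2}, both read from a chi-square table.
    """
    # The larger critical value gives the lower limit, and vice versa
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Hypothetical inputs: n = 10, s^2 = 4.0; table values for v = 9, alpha = 0.05
lcl, ucl = variance_ci(4.0, 10, chi2_upper=19.023, chi2_lower=2.700)
print(f"95% interval for the variance: ({lcl:.3f}, {ucl:.3f})")
```

Unlike the intervals for means, this interval is not symmetric around $s^2$, because the chi-square distribution is skewed.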
Confidence Intervals for Two Means: Matched Pairs (Icelandic: Öryggismörk fyrir tvö vongildi: Pör)
Suppose that there is a random sample of n matched pairs of observations from distributions with means $\mu_X$ and $\mu_Y$. That is, $x_1, x_2, \ldots, x_n$ denotes the values of the observations from the population with mean $\mu_X$, and $y_1, y_2, \ldots, y_n$ the matched sampled values from the population with mean $\mu_Y$. Let $\bar{d}$ and $s_d$ denote the observed sample mean and standard deviation for the n differences $d_i = x_i - y_i$. If the population distribution of the differences is assumed to be normal, then a $100(1 - \alpha)\%$ confidence interval for the difference between means ($\mu_d = \mu_X - \mu_Y$) is given by
$$\bar{d} - t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}}$$
or equivalently,
$$\bar{d} \pm B$$
Confidence Intervals for Two Means: Matched Pairs (continued)
where the margin of error, the sampling error, or the bound, B, is given by
$$B = t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}}$$
and $t_{n-1,\alpha/2}$ is the number for which
$$P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2}$$
The random variable $t_{n-1}$ has a Student’s t distribution with (n – 1) degrees of freedom.
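The matched-pairs interval reduces to a one-sample t interval on the differences $d_i = x_i - y_i$. The sketch below assumes the critical value $t_{n-1,\alpha/2}$ is read from a t table. The paired data and the function name are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def matched_pairs_ci(x, y, t_crit):
    """Return (LCL, UCL, B) for mu_X - mu_Y from matched pairs."""
    if len(x) != len(y):
        raise ValueError("matched pairs need samples of equal length")
    d = [xi - yi for xi, yi in zip(x, y)]   # differences d_i = x_i - y_i
    n = len(d)
    d_bar, s_d = mean(d), stdev(d)
    b = t_crit * s_d / sqrt(n)              # margin of error B
    return d_bar - b, d_bar + b, b

# Hypothetical pairs, n = 5, so v = 4; t_{4, 0.025} = 2.776 from a t table
x = [12.0, 15.0, 11.0, 14.0, 13.0]
y = [10.0, 14.0, 11.0, 11.0, 12.0]
lcl, ucl, b = matched_pairs_ci(x, y, t_crit=2.776)
print(f"95% interval for the mean difference: ({lcl:.3f}, {ucl:.3f})")
```

For these numbers the differences are 2, 1, 0, 3, 1, so $\bar{d} = 1.4$ and $s_d \approx 1.14$.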
Confidence Intervals for Difference Between Means: Independent Samples (Normal Distributions and Known Population Variances) (Icelandic: Öryggismörk fyrir mismun vongilda: Óháð úrtök)
Suppose that there are two independent random samples of $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$ and variances $\sigma^2_X$ and $\sigma^2_Y$. If the observed sample means are $\bar{X}$ and $\bar{Y}$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by
$$(\bar{X} - \bar{Y}) - Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}}$$
or equivalently,
$$(\bar{X} - \bar{Y}) \pm B$$
where the margin of error is given by
$$B = Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}}$$
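For two independent samples with known population variances, the interval can be sketched as follows. The function name and every input value are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci_known(xbar, ybar, var_x, var_y, nx, ny, alpha=0.05):
    """Return (LCL, UCL, B) for mu_X - mu_Y with known population variances."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    b = z * sqrt(var_x / nx + var_y / ny)   # margin of error B
    diff = xbar - ybar
    return diff - b, diff + b, b

# Hypothetical inputs: means 52 and 48, variances 16 and 9, n_x = 40, n_y = 36
lcl, ucl, b = diff_means_ci_known(52.0, 48.0, 16.0, 9.0, 40, 36)
print(f"95% interval for the difference: ({lcl:.2f}, {ucl:.2f}), B = {b:.3f}")
```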
Confidence Intervals for Two Means: Unknown Population Variances that are Assumed to be Equal (Icelandic: Öryggismörk fyrir mismun vongilda: Óþekkt dreifni en dreifnin er eins skv. forsendu)
Suppose that there are two independent random samples with $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$ and a common, but unknown, population variance. If the observed sample means are $\bar{X}$ and $\bar{Y}$, and the observed sample variances are $s^2_X$ and $s^2_Y$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by
$$(\bar{X} - \bar{Y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}}$$
or equivalently,
$$(\bar{X} - \bar{Y}) \pm B$$
where the margin of error is given by
$$B = t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}}$$
Confidence Intervals for Two Means: Unknown Population Variances that are Assumed to be Equal (continued)
The pooled sample variance, $s^2_p$, is given by
$$s^2_p = \frac{(n_x - 1)s^2_X + (n_y - 1)s^2_Y}{n_x + n_y - 2}$$
and $t_{n_x+n_y-2,\alpha/2}$ is the number for which
$$P(t_{n_x+n_y-2} > t_{n_x+n_y-2,\alpha/2}) = \frac{\alpha}{2}$$
The random variable, T, approximately follows a Student’s t distribution with $n_x + n_y - 2$ degrees of freedom, and T is given by
$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{s_p\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$$
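The pooled-variance calculation and the resulting interval can be sketched as below. The critical value $t_{n_x+n_y-2,\alpha/2}$ is assumed to come from a t table. The function name and the summary statistics are made up.

```python
from math import sqrt

def pooled_ci(xbar, ybar, s2x, s2y, nx, ny, t_crit):
    """Return (LCL, UCL, s2p) for mu_X - mu_Y assuming equal variances."""
    # Pooled sample variance: weighted by degrees of freedom
    s2p = ((nx - 1) * s2x + (ny - 1) * s2y) / (nx + ny - 2)
    b = t_crit * sqrt(s2p / nx + s2p / ny)   # margin of error B
    diff = xbar - ybar
    return diff - b, diff + b, s2p

# Hypothetical inputs: n_x = 10, n_y = 12, so v = 20; t_{20, 0.025} = 2.086
lcl, ucl, s2p = pooled_ci(25.0, 22.0, 9.0, 11.0, 10, 12, t_crit=2.086)
print(f"pooled variance = {s2p:.2f}, interval = ({lcl:.2f}, {ucl:.2f})")
```

With these inputs the pooled variance is $(9 \times 9 + 11 \times 11)/20 = 10.1$, which lies between the two sample variances as expected.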
Confidence Intervals for Two Means: Unknown Population Variances, Assumed Not Equal
Suppose that there are two independent random samples of $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$, and it is assumed that the population variances are not equal. If the observed sample means and variances are $\bar{X}$, $\bar{Y}$, and $s^2_X$, $s^2_Y$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by
$$(\bar{X} - \bar{Y}) - t_{(v,\alpha/2)}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + t_{(v,\alpha/2)}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}}$$
where the margin of error is given by
$$B = t_{(v,\alpha/2)}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}}$$
Confidence Intervals for Two Means: Unknown Population Variances, Assumed Not Equal (continued)
The degrees of freedom, v, is given by
$$v = \frac{\left[\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}\right]^2}{\left(\frac{s^2_X}{n_x}\right)^2/(n_x - 1) + \left(\frac{s^2_Y}{n_y}\right)^2/(n_y - 1)}$$
If the sample sizes are equal, then the degrees of freedom reduces to
$$v = \left(1 + \frac{2}{\frac{s^2_X}{s^2_Y} + \frac{s^2_Y}{s^2_X}}\right)(n - 1)$$
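The two degrees-of-freedom expressions can be checked against each other numerically. The sketch below implements both. The function names and the sample variances are hypothetical.

```python
def welch_df(s2x, s2y, nx, ny):
    """Degrees of freedom v when the variances are not assumed equal."""
    a, b = s2x / nx, s2y / ny
    return (a + b) ** 2 / (a ** 2 / (nx - 1) + b ** 2 / (ny - 1))

def welch_df_equal_n(s2x, s2y, n):
    """Simplified form of v when both samples have the same size n."""
    return (1 + 2 / (s2x / s2y + s2y / s2x)) * (n - 1)

# Hypothetical sample variances 9 and 16 with n = 12 in each sample
v_general = welch_df(9.0, 16.0, 12, 12)
v_equal_n = welch_df_equal_n(9.0, 16.0, 12)
print(f"general formula: {v_general:.3f}, equal-n formula: {v_equal_n:.3f}")
```

When the two sample variances are also equal, both forms give $v = 2(n - 1)$, the same degrees of freedom as the pooled-variance case. In practice v is usually not an integer, and how it is rounded for table lookup is a convention the source does not specify.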
Confidence Intervals for the Difference Between Two Population Proportions (Large Samples) (Icelandic: Öryggismörk fyrir mismun þýðishlutfalla (stór úrtök))
Let $p_X$ denote the observed proportion of successes in a random sample of $n_X$ observations from a population with proportion $\pi_X$ of successes, and let $p_Y$ denote the proportion of successes observed in an independent random sample of $n_Y$ observations from a population with proportion $\pi_Y$ of successes. Then, if the sample sizes are large (generally at least forty observations in each sample), a $100(1 - \alpha)\%$ confidence interval for the difference between population proportions $(\pi_X - \pi_Y)$ is given by
$$(p_X - p_Y) \pm B$$
where the margin of error is
$$B = Z_{\alpha/2}\sqrt{\frac{p_X(1 - p_X)}{n_X} + \frac{p_Y(1 - p_Y)}{n_Y}}$$
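A minimal sketch of the interval for a difference between two proportions follows. The sample-size check uses the "at least forty observations" guideline from the text. The function name and the counts are made up.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_ci(succ_x, nx, succ_y, ny, alpha=0.05):
    """Return (LCL, UCL, B) for pi_X - pi_Y from two large samples."""
    if min(nx, ny) < 40:
        raise ValueError("guideline: at least forty observations per sample")
    px, py = succ_x / nx, succ_y / ny
    z = NormalDist().inv_cdf(1 - alpha / 2)
    b = z * sqrt(px * (1 - px) / nx + py * (1 - py) / ny)   # margin of error B
    diff = px - py
    return diff - b, diff + b, b

# Hypothetical samples: 60 of 100 successes versus 45 of 100 successes
lcl, ucl, b = diff_proportions_ci(60, 100, 45, 100)
print(f"95% interval for the difference: ({lcl:.3f}, {ucl:.3f}), B = {b:.4f}")
```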
Sample Size for the Mean of a Normally Distributed Population with Known Population Variance (Icelandic: Gagnasafn fyrir vongildi normaldreifðs þýðis með þekktri þýðisdreifni)
Suppose that a random sample from a normally distributed population with known variance $\sigma^2$ is selected. Then a $100(1 - \alpha)\%$ confidence interval for the population mean extends a distance B (sometimes called the bound, sampling error, or the margin of error) on each side of the sample mean, if the sample size, n, is
$$n = \frac{Z^2_{\alpha/2}\,\sigma^2}{B^2}$$
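The sample-size formula for a mean, $n = Z^2_{\alpha/2}\sigma^2/B^2$, rarely yields a whole number. The sketch below assumes the result is rounded up so that the margin of error does not exceed B. The function name and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, b, alpha=0.05):
    """Smallest integer n with margin of error at most b (sigma known)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z ** 2 * sigma ** 2 / b ** 2)

# Hypothetical inputs: sigma = 10, desired margin of error B = 2, 95% level
n_mean = sample_size_for_mean(sigma=10.0, b=2.0)
print(f"required sample size: {n_mean}")
```

Here the formula gives about 96.04, so 97 observations are needed.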
Sample Size for Population Proportion (Icelandic: Stærð gagnasafns fyrir þýðishlutfall)
Suppose that a random sample is selected from a population. Then a $100(1 - \alpha)\%$ confidence interval for the population proportion, extending a distance of at most B on each side of the sample proportion, can be guaranteed if the sample size, n, is
$$n = \frac{0.25\,(Z_{\alpha/2})^2}{B^2}$$
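The factor 0.25 in the proportion formula is the largest possible value of $p(1 - p)$, reached at $p = 0.5$, which is why the bound holds whatever the sample proportion turns out to be. A minimal sketch, with a hypothetical function name and inputs, and rounding up as an assumption:

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(b, alpha=0.05):
    """Smallest integer n guaranteeing a margin of error of at most b."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # 0.25 is the maximum of p * (1 - p), attained at p = 0.5
    return ceil(0.25 * z ** 2 / b ** 2)

# Hypothetical target: margin of error of 3 percentage points at the 95% level
n_prop = sample_size_for_proportion(b=0.03)
print(f"required sample size: {n_prop}")
```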
Key Words
Bias
Bound
Confidence interval:
  For mean, known variance
  For mean, unknown variance
  For proportion
  For two means, matched
  For two means, variances equal
  For two means, variances not equal
  For variance
Confidence Level
Estimate
Estimator
Interval Half Width
Lower Confidence Limit (LCL)
Margin of Error
Minimum Variance Unbiased Estimator
Most Efficient Estimator
Point Estimate
Point Estimator
Relative Efficiency
Reliability Factor
Sample Size for Mean, Known Variance
Sample Size for Proportion
Sampling Error
Student’s t
Unbiased Estimator
Upper Confidence Limit (UCL)
Width