Chapter 5: Sampling and Statistics
Math 6203, Fall 2009
Instructor: Ayona Chatterjee
5.1 Sampling and Statistics
• Typical statistical problem: we have a random variable X whose pdf f(x) or pmf p(x) is unknown.
– Either f(x) or p(x) is completely unknown,
– or the form of f(x) or p(x) is known down to a parameter θ, where θ may be a vector.
• Here we will consider the second option.
• Example: X has an exponential distribution with θ unknown.
• Since θ is unknown, we want to estimate it. Estimation is based on a sample.
• We will formalize the sampling plan:
– Sampling with replacement: each draw is independent and the X's have the same distribution.
– Sampling without replacement: the draws are not independent, but the X's still have the same distribution.
Random Sample
• The random variables X1, X2, …, Xn constitute a random sample on the random variable X if they are independent and each has the same distribution as X. We will abbreviate this by saying that X1, X2, …, Xn are iid; i.e., independent and identically distributed.
– The joint pdf can be given as

f_{X_1,\dots,X_n}(x_1,\dots,x_n) = \prod_{i=1}^{n} f(x_i)
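As a quick numerical illustration of the product formula (a sketch only; the exponential pdf and the particular values of theta and the sample are illustrative assumptions, not from the lecture):

```python
import math

# Joint pdf of an iid sample as the product of the marginal pdfs.
# Illustration: exponential pdf f(x) = (1/theta) * exp(-x/theta),
# with theta as a scale parameter (an assumed example, not from the text).

def exp_pdf(x, theta):
    """Exponential pdf with scale parameter theta."""
    return math.exp(-x / theta) / theta

def joint_pdf(xs, theta):
    """Joint pdf of an iid sample x1,...,xn: the product of f(x_i)."""
    prod = 1.0
    for x in xs:
        prod *= exp_pdf(x, theta)
    return prod

sample = [0.5, 1.2, 2.0]
print(joint_pdf(sample, theta=1.0))  # exp(-(0.5 + 1.2 + 2.0)) = exp(-3.7)
```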
Statistic
• Suppose the n random variables X1, X2, …., Xn constitute a sample from the distribution of a random variable X. Then any function T=T(X1, X2, …., Xn ) of the sample is called a statistic.
• A statistic T = T(X1, X2, …, Xn) may convey information about the unknown parameter θ. We call such a statistic a point estimator of θ.
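For instance, the sample mean is a statistic that serves as a point estimator of the mean of an exponential distribution. A minimal sketch (the value theta = 2.0, the seed, and the sample size are illustrative assumptions):

```python
import random

# A statistic is any function of the sample. Here the sample mean
# T(X1,...,Xn) is used as a point estimator of the exponential mean theta.

random.seed(0)
theta = 2.0  # assumed "true" parameter for the simulation
sample = [random.expovariate(1.0 / theta) for _ in range(10_000)]

def sample_mean(xs):
    """The statistic T = (X1 + ... + Xn) / n."""
    return sum(xs) / len(xs)

estimate = sample_mean(sample)
print(round(estimate, 2))  # should be close to theta = 2.0
```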
5.2 Order Statistics
Notation
• Let X1, X2, …, Xn denote a random sample from a distribution of the continuous type having a pdf f(x) with support S = (a, b), where -∞ ≤ a < x < b ≤ ∞. Let Y1 be the smallest of these Xi, Y2 the next Xi in order of magnitude, …, and Yn the largest of the Xi. That is, Y1 < Y2 < … < Yn represent X1, X2, …, Xn when the latter are arranged in ascending order of magnitude. We call Yi the ith order statistic of the random sample X1, X2, …, Xn.
Theorem 5.2.1
• Let Y1 < Y2 < … < Yn denote the n order statistics based on the random sample X1, X2, …, Xn from a continuous distribution with pdf f(x) and support (a, b). Then the joint pdf of Y1, Y2, …, Yn is given by

g(y_1, y_2, \dots, y_n) =
\begin{cases}
n!\, f(y_1) f(y_2) \cdots f(y_n), & a < y_1 < y_2 < \cdots < y_n < b \\
0, & \text{elsewhere}
\end{cases}

• The marginal pdf of any order statistic Y_k can be given as

g_k(y_k) =
\begin{cases}
\dfrac{n!}{(k-1)!\,(n-k)!}\, [F(y_k)]^{k-1}\, [1 - F(y_k)]^{n-k}\, f(y_k), & a < y_k < b \\
0, & \text{elsewhere}
\end{cases}
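The marginal formula can be checked by simulation. For Y1 (the minimum, k = 1) of a Uniform(0, 1) sample, the formula gives pdf n(1 - y)^(n-1), hence P(Y1 ≤ t) = 1 - (1 - t)^n. A Monte Carlo sketch (n = 5, t = 0.2, the seed, and the trial count are illustrative choices):

```python
import random

# Monte Carlo check of the marginal distribution of the smallest order
# statistic of a Uniform(0,1) sample: P(Y1 <= t) = 1 - (1 - t)^n.

random.seed(1)
n, t, trials = 5, 0.2, 200_000

hits = 0
for _ in range(trials):
    y1 = min(random.random() for _ in range(n))  # smallest order statistic
    if y1 <= t:
        hits += 1

empirical = hits / trials
theoretical = 1 - (1 - t) ** n
print(round(empirical, 3), round(theoretical, 3))
```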
Note
• The joint pdf of any two order statistics Y_i < Y_j (i < j) can be written as

g_{ij}(y_i, y_j) =
\begin{cases}
\dfrac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\, [F(y_i)]^{i-1}\, [F(y_j) - F(y_i)]^{j-i-1}\, [1 - F(y_j)]^{n-j}\, f(y_i) f(y_j), & a < y_i < y_j < b \\
0, & \text{elsewhere}
\end{cases}
Note
• Yn − Y1 is called the range of the random sample.
• (Y1 + Yn)/2 is called the mid-range.
• If n is odd, then Y_{(n+1)/2} is called the median of the random sample.
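These quantities are straightforward to compute once the sample is sorted. A small sketch (the data values are illustrative; the even-n median convention shown is the usual average of the two middle order statistics, which goes beyond the odd-n case stated above):

```python
# Order statistics of a sample and the derived quantities described above.

data = [3.1, 0.7, 2.4, 5.9, 1.8]
y = sorted(data)                 # y[0] = Y1, ..., y[-1] = Yn

rng = y[-1] - y[0]               # range: Yn - Y1
mid_range = (y[0] + y[-1]) / 2   # mid-range: (Y1 + Yn) / 2
n = len(y)
# median: Y_{(n+1)/2} for odd n; average of middle two for even n (a
# common convention, assumed here)
median = y[(n + 1) // 2 - 1] if n % 2 == 1 else (y[n // 2 - 1] + y[n // 2]) / 2

print(rng, mid_range, median)
```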
5.4 More on Confidence Intervals
The Statistical Problem
• We have a random variable X with density f(x,θ), where θ is unknown and belongs to the family of parameters Ω.
• We estimate θ with some statistic T, where T is a function of the random sample X1, X2, …, Xn.
• It is unlikely that the value of T gives the true value of θ.
– If T has a continuous distribution, then P(T = θ) = 0.
• What is needed is an estimate of the error of estimation: by how much did we miss θ?
Central Limit Theorem
• Let θ0 denote the true, unknown value of the parameter θ. Suppose T is an estimator of θ such that

\sqrt{n}\,(T - \theta_0) \xrightarrow{D} N(0, \sigma_T^2)

• Assume that σ_T^2 is known.
• Let Z = \sqrt{n}\,(T - \theta_0)/\sigma_T. Then Z is asymptotically N(0, 1).
• Hence P(−1.96 ≤ Z ≤ 1.96) ≈ 0.95.
• We can show algebraically that

P\left(T - 1.96\,\frac{\sigma_T}{\sqrt{n}} < \theta_0 < T + 1.96\,\frac{\sigma_T}{\sqrt{n}}\right) \approx 0.95
Note
• When σ_T is unknown, we use the sample standard deviation s to estimate it.
• We obtain an interval similar to the one above, with σ_T replaced by s.
• Note that t is the observed value of the statistic T.
Confidence Interval for Mean μ
• Let X1 , X2 , ….Xn be a random sample from the distribution with unknown mean μ and unknown standard deviation σ.
• Let X̄ and S² denote the sample mean and sample variance respectively.
• By the Central Limit Theorem, \sqrt{n}\,(\bar{X} - \mu)/S is approximately N(0, 1).
• An approximate 95% confidence interval for μ is

(\bar{x} - 1.96\, s/\sqrt{n},\ \bar{x} + 1.96\, s/\sqrt{n})
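The interval is easy to compute directly from the formula. A sketch (the data values are illustrative, not from the lecture):

```python
import math

# Approximate 95% confidence interval for the mean, as above:
# (xbar - 1.96*s/sqrt(n), xbar + 1.96*s/sqrt(n)).

def mean_ci_95(xs):
    n = len(xs)
    xbar = sum(xs) / n
    # sample standard deviation (divisor n - 1)
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    half = 1.96 * s / math.sqrt(n)
    return xbar - half, xbar + half

data = [4.2, 5.1, 3.8, 4.9, 5.6, 4.4, 5.0, 4.7]
lo, hi = mean_ci_95(data)
print(round(lo, 3), round(hi, 3))
```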
Note
• We can find confidence intervals for any confidence level.
• Let z_{α/2} denote the upper α/2 quantile of a standard normal random variable.
• Then the approximate (1 − α)100% confidence interval for θ0 is

(t - z_{\alpha/2}\, s/\sqrt{n},\ t + z_{\alpha/2}\, s/\sqrt{n})
Confidence Interval for Proportions
• Let X be a Bernoulli random variable with probability of success p.
• Let X1 , X2 , ….Xn be a random sample from the distribution of X.
• Then the approximate (1 − α)100% confidence interval for p is

\left(\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\right)

Note
• \hat{p} = (number of successes)/(sample size) = \frac{1}{n}\sum_i x_i
• \hat{p} is approximately N(p,\ p(1-p)/n).
• \sqrt{\hat{p}(1-\hat{p})/n} is called the standard error.
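A short sketch of the proportion interval (the counts 60 successes out of n = 100 are illustrative):

```python
import math

# Approximate (1 - alpha)100% confidence interval for a proportion,
# using the standard-error formula above; z = 1.96 gives a 95% interval.

def prop_ci(successes, n, z=1.96):
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error
    return p_hat - z * se, p_hat + z * se

lo, hi = prop_ci(successes=60, n=100)
print(round(lo, 3), round(hi, 3))  # roughly (0.504, 0.696)
```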
5.5 Introduction to Hypothesis Testing
Introduction
• Our interest centers on a random variable X which has density function f(x,θ), where θ belongs to Ω.
• Due to theory or a preliminary experiment, suppose we believe that θ ∈ ω0 or θ ∈ ω1, where ω0 and ω1 are subsets of Ω and ω0 ∪ ω1 = Ω.
• We label these hypotheses as

H_0: \theta \in \omega_0 \quad \text{versus} \quad H_1: \theta \in \omega_1
• The hypothesis H0 is referred to as the null hypothesis while H1 is referred to as the alternative hypothesis.
• The null hypothesis represents 'no change'.
• The alternative hypothesis is referred to as the research worker's hypothesis.
Error in Hypothesis Testing
• The decision rule to take H0 or H1 is based on a sample X1 , X2 , ….Xn from the distribution of X and hence the decision could be wrong.
True State of Nature
Decision | H0 is true | H1 is true
Reject H0 | Type I Error | Correct Decision
Accept H0 | Correct Decision | Type II Error
• The goal is to select a critical region from all possible critical regions which minimizes the probabilities of these errors.
• In general this is not possible; the probabilities of these errors have a see-saw effect.
– For example, if the critical region is the empty set, we would never reject the null, so the probability of a Type I error would be zero, but then the probability of a Type II error would be 1.
• The Type I error is considered the worse of the two.
Critical Region
• We fix the probability of type I error and we try and select a critical region that minimizes type II error.
• We say a critical region C is of size α if

\alpha = \max_{\theta \in \omega_0} P_\theta[(X_1, X_2, \dots, X_n) \in C]

• Over all critical regions of size α, we want to consider critical regions that have lower probabilities of Type II error.
• We want to maximize

1 - P_\theta[\text{Type II Error}] = P_\theta[(X_1, X_2, \dots, X_n) \in C], \quad \text{for } \theta \in \omega_1

• The probability on the right-hand side is called the power of the test at θ.
• It is the probability that the test detects the alternative when θ ∈ ω1 is the true parameter.
• So maximizing power is the same as minimizing the Type II error.
Power of a test
• We define the power function of a critical region C to be

\gamma_C(\theta) = P_\theta[(X_1, X_2, \dots, X_n) \in C], \quad \theta \in \omega_1

• Hence, given two critical regions C1 and C2 which are both of size α, C1 is better than C2 if

\gamma_{C_1}(\theta) \ge \gamma_{C_2}(\theta) \quad \text{for all } \theta \in \omega_1
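The power function can be estimated by simulation. A sketch for the one-sided z-test of H0: μ = 0 versus H1: μ > 0 with σ = 1 known (the sample size, seed, trial count, and the alternative μ = 0.5 are all illustrative assumptions):

```python
import math
import random

# Monte Carlo estimate of the power function gamma_C(theta) for the
# critical region C = { sqrt(n) * xbar >= z_alpha }, alpha = 0.05.

random.seed(2)
n, z_alpha, trials = 25, 1.645, 5_000

def power(mu):
    """Fraction of samples falling in the critical region when mu is true."""
    rejections = 0
    for _ in range(trials):
        xbar = sum(random.gauss(mu, 1.0) for _ in range(n)) / n
        if math.sqrt(n) * xbar >= z_alpha:   # reject H0
            rejections += 1
    return rejections / trials

p_null = power(0.0)   # power at the null: should be near alpha = 0.05
p_alt = power(0.5)    # power under the alternative: much larger
print(round(p_null, 2), round(p_alt, 2))
```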
Note
• A hypothesis of the form H0: p = p0 is called a simple hypothesis.
• A hypothesis of the form H1: p < p0 is called a composite hypothesis.
• Also remember that α is called the significance level of the test associated with that critical region.
Test Statistics for Mean
• To test H_0: \mu = \mu_0 versus H_1: \mu > \mu_0 at level α:
• Reject H0 in favor of H1 if

Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge z_\alpha \quad (\sigma \text{ known})

• Reject H0 in favor of H1 if

T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \ge t_{\alpha,\, n-1} \quad (\sigma \text{ unknown})
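A minimal sketch of the σ-known decision rule (the numbers xbar = 10.4, μ0 = 10, σ = 1, n = 25 are illustrative; 1.645 is the standard z quantile for α = 0.05):

```python
import math

# One-sided z-test for the mean with sigma known, per the rule above:
# reject H0: mu = mu0 in favor of H1: mu > mu0 when
# (xbar - mu0) / (sigma / sqrt(n)) >= z_alpha.

def z_test(xbar, mu0, sigma, n, z_alpha=1.645):
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return z, z >= z_alpha

z, reject = z_test(xbar=10.4, mu0=10.0, sigma=1.0, n=25)
print(round(z, 2), reject)  # 2.0 True
```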
5.7 Chi-Square Tests
Introduction
• Originally proposed by Karl Pearson in 1900.
• Used to check for goodness of fit and independence.
Goodness of fit test
• Consider the simple hypothesis
– H_0: p_1 = p_{10},\ p_2 = p_{20},\ \dots,\ p_{k-1} = p_{k-1,0}
• If the hypothesis H0 is true, the random variable

Q_{k-1} = \sum_{i=1}^{k} \frac{(X_i - n\, p_{i0})^2}{n\, p_{i0}}

has an approximate chi-square distribution with k − 1 degrees of freedom.
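A small sketch of the goodness-of-fit statistic (the counts and hypothesized probabilities are illustrative; 7.815 is the standard chi-square table value with 3 degrees of freedom at α = 0.05):

```python
# Chi-square goodness-of-fit statistic Q = sum (X_i - n*p_i0)^2 / (n*p_i0),
# compared with the chi-square critical value with k - 1 = 3 df.

counts = [30, 20, 25, 25]          # observed X_i; n = 100
p0 = [0.25, 0.25, 0.25, 0.25]      # hypothesized p_i0 under H0
n = sum(counts)

q = sum((x - n * p) ** 2 / (n * p) for x, p in zip(counts, p0))
print(q)            # (25 + 25 + 0 + 0) / 25 = 2.0
print(q > 7.815)    # False: do not reject H0 at alpha = 0.05
```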
Test for Independence
• Let the result of a random experiment be classified by two attributes.
• Let Ai denote the outcomes of the first kind and Bj denote the outcomes for the second kind.
• Let p_ij = P(A_i ∩ B_j).
• The random experiment is repeated n independent times, and X_ij denotes the frequency of the event A_i ∩ B_j.
• The hypothesis of independence is

H_0: P(A_i \cap B_j) = P(A_i)\, P(B_j), \quad \text{i.e. } p_{ij} = p_{i\cdot}\, p_{\cdot j}, \quad i = 1, 2, \dots, a;\ j = 1, 2, \dots, b

• The marginal probabilities are estimated by

\hat{p}_{i\cdot} = \frac{X_{i\cdot}}{n} = \frac{1}{n} \sum_{j=1}^{b} X_{ij}, \quad \text{for } i = 1, 2, \dots, a

\hat{p}_{\cdot j} = \frac{X_{\cdot j}}{n} = \frac{1}{n} \sum_{i=1}^{a} X_{ij}, \quad \text{for } j = 1, 2, \dots, b
• The random variable

Q_{(a-1)(b-1)} = \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{\left(X_{ij} - n\,(X_{i\cdot}/n)(X_{\cdot j}/n)\right)^2}{n\,(X_{i\cdot}/n)(X_{\cdot j}/n)}

has an approximate chi-square distribution with (a − 1)(b − 1) degrees of freedom, provided n is large.
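A sketch of the independence statistic for a 2 × 2 table (the counts are illustrative; 3.841 is the standard chi-square table value with (a−1)(b−1) = 1 df at α = 0.05):

```python
# Chi-square statistic for independence in an a x b contingency table,
# using the estimated expected counts n * (X_i./n) * (X_.j/n) as above.

table = [[20, 30],   # X_ij counts
         [30, 20]]

a, b = len(table), len(table[0])
n = sum(sum(row) for row in table)
row_tot = [sum(row) for row in table]                             # X_i.
col_tot = [sum(table[i][j] for i in range(a)) for j in range(b)]  # X_.j

q = 0.0
for i in range(a):
    for j in range(b):
        expected = row_tot[i] * col_tot[j] / n
        q += (table[i][j] - expected) ** 2 / expected

print(q)          # 4.0
print(q > 3.841)  # True: reject independence at alpha = 0.05
```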