1 point and interval estimates examples with z and t distributions single sample; two samples...

Point and Interval Estimates
• Examples with z and t distributions
• Single sample; two samples
• Result: Sums (and differences) of normally distributed RVs are normally distributed.
• Determining the variance of the difference between means for two independent samples
• Pooled estimates of the variance (when two independent estimates are available)
• Degrees of freedom for the variance of the difference between the means of two independent samples (equal/not equal variances)
• Estimating the variance for use with proportions, and CI with proportions
• Bayesian Credibility Intervals
  – Prior Distribution
  – Joint Distribution of prior and data
  – Posterior Distribution

Introduction to Biostatistics (PUBHLTH 540)

Examples of Point and Interval Estimates + Credibility Intervals

Examples from Seasons Study
• Assumptions: Subjects are a SRS from the population.
• Assume different groups are independent SRSs from different strata (i.e., gender).

Details:
• Use the t-distribution for interval estimates when sample sizes are small (unless the estimate is of a proportion)
  – requires an assumption that the underlying random variable is normally distributed
• When the response is binary (yes/no), we estimate the population mean by the sample mean (equal to the sample proportion $\hat{p}$), and the sample variance by $\hat{\sigma}^2 = \hat{p}(1-\hat{p})$.

Examples: Point and Interval Estimate of Wt
Examples from Seasons Study (see ejs09b540p34.sas).
What is a 95% Confidence Interval for Weight?
(see: http://dostat.stat.sc.edu/prototype/calculators/index.php3?dist=T to get t-percentiles)

Figure 1. Histogram of weight in kg for n=291 (x-axis: Wt (kg) (formerly cc5a), ticks from 48 to 156)
Source: ejs09b540p34.sas 10/20/2009 by ejs

Weight
              Estimate   Lower 95   Upper 95
n             291
Mean          77.62      75.6       79.7
Std           17.79
df            290
t statistic   1.968

$$\bar{Y} \pm t_{df=n-1,\,0.975}\,\frac{S}{\sqrt{n}}, \qquad t_{df=n-1,\,0.975} = t_{290,\,0.975} = 1.968$$

$$77.6 \pm 1.968\,\frac{17.79}{\sqrt{291}}$$

The mean weight is estimated as 77.6 kg, with a 95% CI of (75.6, 79.7).

Use applets to get the t value.
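The interval above can be reproduced from the summary statistics alone. The following is a minimal Python sketch, not the course's SAS program: the function name `t_interval` is hypothetical, and the t percentile (1.968 for df = 290) is passed in as read from the applet rather than computed.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Two-sided CI for a mean: mean +/- t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Summary statistics for weight from the Seasons Study table.
lower, upper = t_interval(mean=77.62, sd=17.79, n=291, t_crit=1.968)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```

Passing the percentile as an argument keeps the sketch free of third-party packages; a statistics library could supply it instead.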

Examples: Point and Interval Estimate of Wt

• Suppose we assume the Seasons study subjects were a SRS from people in the US. What is a point and interval estimate of weight for the US population?

Answer: Same as before. The mean weight is estimated as 77.6 kg, with a 95% CI of (75.6, 79.7).

Examples: Point and Interval Estimate of Wt, separately for men and women
Examples from Seasons Study (ejs09b540p34.sas)
(see: http://dostat.stat.sc.edu/prototype/calculators/index.php3?dist=T to get t-percentiles)

For men, the mean weight is estimated as 85.9 kg (95% CI (83.3, 88.5)), while for women, the mean weight is 69.7 kg (95% CI (67.2, 72.3)).

Table 3. Description of weight by gender
Analysis Variable: wt, Wt (kg) (formerly cc5a)

Gender       N     Mean    Std Dev   Variance   Std Error
Male (0)     142   85.90   15.82     250.32     1.33
Female (1)   149   69.73   15.92     253.42     1.30
Source: ejs09b540p34.sas 10/20/2009 by ejs

$$\bar{Y} \pm t_{df=n-1,\,0.975}\,S_{\bar{Y}}$$

Use applet, men: $t_{141,\,0.975} = 1.977$
Use applet, women: $t_{148,\,0.975} = 1.976$

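Examples: Point and Interval Estimate of Wt, adjusting for gender in US population

• Suppose we assume the Seasons study male subjects were a SRS from males in the US and, similarly, that female subjects were an independent SRS from females in the US. In 2000, there were 138.05 million males and 143.37 million females in the U.S. Using the Seasons study estimates, what is a point and interval estimate of weight for the US population?

$$c_M = \frac{138.05}{138.05 + 143.37} = 0.49, \qquad c_F = \frac{143.37}{138.05 + 143.37} = 0.51$$

$$\hat{Z} = c_M \bar{Y} + c_F \bar{X} \qquad (\bar{Y}\text{: males},\ \bar{X}\text{: females})$$

Example: Linear Combinations of Random Variables

$$\hat{Z} = \begin{pmatrix} c_M & c_F \end{pmatrix} \begin{pmatrix} \bar{Y} \\ \bar{X} \end{pmatrix}, \qquad \operatorname{var}\begin{pmatrix} \bar{Y} \\ \bar{X} \end{pmatrix} = \begin{pmatrix} \operatorname{var}(\bar{Y}) & 0 \\ 0 & \operatorname{var}(\bar{X}) \end{pmatrix}$$

Estimate:

$$\begin{pmatrix} \frac{250.32}{142} & 0 \\ 0 & \frac{253.42}{149} \end{pmatrix}, \qquad c_M = 0.49, \quad c_F = 0.51$$

$$\operatorname{var}(\hat{Z}) = \begin{pmatrix} c_M & c_F \end{pmatrix} \operatorname{var}\begin{pmatrix} \bar{Y} \\ \bar{X} \end{pmatrix} \begin{pmatrix} c_M \\ c_F \end{pmatrix} = c_M^2 \operatorname{var}(\bar{Y}) + c_F^2 \operatorname{var}(\bar{X})$$

$$\hat{Z} = 0.49\,(85.9) + 0.51\,(69.73)$$

$$\hat{\sigma}^2_{\hat{Z}} = 0.49^2\,\frac{250.32}{142} + 0.51^2\,\frac{253.42}{149}$$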
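The weights, the combined estimate, and its variance can be checked numerically. This Python sketch is an illustration only: the variable names are hypothetical, the weights are rounded to two decimals as on the slide, and the t percentile 1.968 is assumed from the summary table that follows in these notes.

```python
import math

# Population counts (millions, year 2000) and sample summaries from Table 3.
pop_m, pop_f = 138.05, 143.37
mean_m, var_m, n_m = 85.90, 250.32, 142
mean_f, var_f, n_f = 69.73, 253.42, 149

# Weights proportional to population size, rounded as on the slide.
c_m = round(pop_m / (pop_m + pop_f), 2)
c_f = round(pop_f / (pop_m + pop_f), 2)

# Linear combination of two independent sample means.
z_hat = c_m * mean_m + c_f * mean_f
var_z = c_m**2 * var_m / n_m + c_f**2 * var_f / n_f

t_crit = 1.968  # assumed t percentile (0.975) for roughly 289 df
half_width = t_crit * math.sqrt(var_z)
ci = (z_hat - half_width, z_hat + half_width)
print(c_m, c_f, round(z_hat, 2), round(var_z, 4), ci)
```

Because the two samples are independent, the covariance terms are zero and the variance reduces to the two weighted terms computed here.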

What are the DF for the t-dist?

If variances are equal, use df=n1+n2-2, and replace individual variance estimates by a pooled variance.

If variances are not equal, see p270-271 in text for df approximation.

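Note: Common estimate of a variance (Pooled Estimate)

If we assume the population variance in weight is equal for males and females, we can estimate a pooled (common) variance (see p267 in text):

$$S_p^2 = \frac{(n_1 - 1)\,S_1^2 + (n_2 - 1)\,S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

More generally, each sample variance is weighted by its degrees of freedom.

For Wt:

$$S_p^2 = \frac{(142 - 1)\,250.32 + (149 - 1)\,253.42}{(142 - 1) + (149 - 1)} = 251.9$$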
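The pooled estimate is a degrees-of-freedom weighted average of the sample variances. A short Python sketch under that reading follows; the helper name `pooled_variance` is hypothetical, and it accepts any number of samples.

```python
def pooled_variance(sizes, variances):
    """Pool sample variances, weighting each by its degrees of freedom (n - 1)."""
    numerator = sum((n - 1) * s2 for n, s2 in zip(sizes, variances))
    denominator = sum(n - 1 for n in sizes)
    return numerator / denominator

# Males (n=142, S^2=250.32) and females (n=149, S^2=253.42).
s2_pooled = pooled_variance([142, 149], [250.32, 253.42])
print(round(s2_pooled, 1))
```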

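Using the pooled variance:

$$\hat{\sigma}^2_{\hat{Z}} = 0.49^2\,\frac{251.9}{142} + 0.51^2\,\frac{251.9}{149}$$

Using the separate variance estimates:

$$\hat{Z} = 0.49\,(85.9) + 0.51\,(69.73)$$

$$\hat{\sigma}^2_{\hat{Z}} = 0.49^2\,\frac{15.82^2}{142} + 0.51^2\,\frac{15.92^2}{149}$$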

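Assuming not equal: from p270-271 in text,

$$df \approx \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(S_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\frac{15.82^2}{142} + \frac{15.92^2}{149}\right)^2}{\frac{\left(15.82^2/142\right)^2}{141} + \frac{\left(15.92^2/149\right)^2}{148}}$$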
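The approximate degrees of freedom can be evaluated directly. The sketch below assumes the formula on the cited pages is the one commonly called the Welch-Satterthwaite approximation; the function name is hypothetical.

```python
def satterthwaite_df(s1, n1, s2, n2):
    """Approximate df for two independent samples with unequal variances."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

df = satterthwaite_df(15.82, 142, 15.92, 149)
print(round(df, 1))
```

With nearly equal sample variances and similar sample sizes, the result sits just below the equal-variance value of n1 + n2 - 2 = 289.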

Wt           Constant   Mean Estimate   Var(Mean)
Male         0.49       85.9            1.76
Female       0.51       69.7            1.70
Wt Average              77.6            0.865

df          289
Approx df   288.5
t-0.975     1.968
95% CI      (75.8, 79.5)

Wt is estimated as 77.6 kg with a 95% CI of (75.8, 79.5).

Examples: Proportion of Subjects who are obese (BMI>30) (see p327 text)

• Estimate the proportion of subjects obese, and a 95% CI
• Create 0/1 variable: 1=obese, 0=normal wt
• Use Z-dist for CI (since np>5)
• Variance estimate: $\hat{\sigma}^2 = \hat{p}(1-\hat{p})$

Obese
No           221
Yes          70
Total        291
P_hat        0.240549828
var (phat)   0.000627786
Lower 95     0.1914
Upper 95     0.2897
See: ejs09b540p34.sas

$$\hat{p} \pm z\,\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
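The table values follow from the counts. Here is a small Python sketch of the normal-approximation interval; the function name `proportion_interval` is hypothetical, and z = 1.96 is assumed for a 95% interval.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Normal-approximation CI: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    var_p_hat = p_hat * (1 - p_hat) / n
    half_width = z * math.sqrt(var_p_hat)
    return p_hat, var_p_hat, (p_hat - half_width, p_hat + half_width)

# 70 obese subjects out of 291.
p_hat, var_p_hat, (lower, upper) = proportion_interval(70, 291)
print(round(p_hat, 4), round(var_p_hat, 9), round(lower, 4), round(upper, 4))
```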

Examples: Proportion of Subjects who are obese (BMI>30) (see p327 text)

• A single random variable (0/1) is called a Bernoulli random variable.
• Variance is estimated using the maximum likelihood estimator (biased): $\hat{\sigma}^2 = \hat{p}(1-\hat{p})$
• Usual estimate of the variance (used in other settings) is: $S^2 = \frac{n}{n-1}\,\hat{p}(1-\hat{p})$
• Normal approximation is commonly used when nP>5 and n(1-P)>5 (NOT t-dist)
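The two variance estimates differ only in the divisor. The following Python sketch builds the 0/1 variable from the obesity counts and compares both formulas against the standard library's `statistics.pvariance` (divides by n) and `statistics.variance` (divides by n - 1); the variable names are illustrative.

```python
import statistics

# 0/1 indicator for obesity: 70 obese, 221 not obese (counts from the table).
obese = [1] * 70 + [0] * 221
n = len(obese)
p_hat = sum(obese) / n

var_mle = p_hat * (1 - p_hat)      # maximum likelihood estimate (divides by n)
var_usual = n / (n - 1) * var_mle  # usual sample variance (divides by n - 1)

print(var_mle, statistics.pvariance(obese))
print(var_usual, statistics.variance(obese))
```

For n = 291 the two estimates agree to about two significant figures, so the choice matters little here.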

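Example: Sample finds 4 of 10 subjects obese

$$\hat{p} = \frac{4}{10} = 0.4$$

95% CI:

$$\hat{p} \pm z\,\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.4 \pm 1.96\,\sqrt{\frac{0.4\,(0.6)}{10}} = (0.10,\ 0.70)$$

Note: nP is not large enough here for the normal approximation to be “good”.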

Examples: Credibility Intervals
Bayesian Approach

Recall that we could estimate the mean using Maximum Likelihood.

Example: We select a SRS with replacement of n=10 and observe x=4. What is p?

Solution 1: Use the sample mean: $\hat{p} = \bar{x} = 0.4$

Solution 2: Use the value of the parameter p that maximizes the likelihood, given the data.

$$P(X = 4 \mid p,\ n = 10) = 210\,p^4 (1-p)^6$$

Likelihood: $L(p) = 210\,p^4 (1-p)^6$

The likelihood is a function of p. We can think of a set of possible values of p, e.g. 0, 0.1, 0.2, …, 0.8, 0.9, 1. The maximum likelihood estimate is the value of p where the likelihood is largest.

Binomial Distribution: Likelihood

We select a SRS with replacement of n=10 and observe x=4. What is p?

Parameter p   L(p)     Parameter p   L(p)
0.05          0.001    0.55          0.1596
0.10          0.0112   0.60          0.1115
0.15          0.0401   0.65          0.0689
0.20          0.0881   0.70          0.0368
0.25          0.1460   0.75          0.0162
0.30          0.2001   0.80          0.0055
0.35          0.2377   0.85          0.0012
0.40          0.2508   0.90          0.0001
0.45          0.2384   0.95          0.0000
0.50          0.2051   1.00          0.0000
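The likelihood table can be regenerated by evaluating the binomial likelihood on the same grid and picking the largest value. This is a minimal Python sketch; the names `likelihood`, `grid`, and `p_mle` are illustrative.

```python
import math

def likelihood(p, n=10, x=4):
    """Binomial likelihood L(p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Candidate values 0.05, 0.10, ..., 1.00, as in the table.
grid = [i / 20 for i in range(1, 21)]
table = {p: likelihood(p) for p in grid}

# The maximum likelihood estimate is the grid value with the largest L(p).
p_mle = max(table, key=table.get)
print(p_mle, round(table[p_mle], 4))
```

A grid search only finds the best value among the candidates listed; here the grid happens to contain the sample proportion 0.4.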

Binomial Distribution: Maximum Likelihood

Likelihood: $L(p) = 210\,p^4 (1-p)^6$

[Figure: likelihood L(p) plotted against p, using the same values as the likelihood table. The maximum, L(p) = 0.2508, is marked at $\hat{p} = \bar{x} = 0.4$.]

Examples: Credibility Intervals
Bayesian Approach: Prior

Suppose we assume each value of the parameter is equally likely. This is called a uniform prior distribution.

Prior distribution

Parameter p   Prior Prob.   Parameter p   Prior Prob.
0.05          0.05          0.55          0.05
0.10          0.05          0.60          0.05
0.15          0.05          0.65          0.05
0.20          0.05          0.70          0.05
0.25          0.05          0.75          0.05
0.30          0.05          0.80          0.05
0.35          0.05          0.85          0.05
0.40          0.05          0.90          0.05
0.45          0.05          0.95          0.05
0.50          0.05          1.00          0.05

Examples: Credibility Intervals
Bayesian Approach: Data|p

We select a SRS with replacement of n=10 and observe x=4. The likelihood is Pr(Data|p):

$$L(p \mid x) = P(x \mid p) \propto p^4 (1-p)^6$$

Parameter p   L(p|x)   Parameter p   L(p|x)
0.05          0.001    0.55          0.1596
0.10          0.0112   0.60          0.1115
0.15          0.0401   0.65          0.0689
0.20          0.0881   0.70          0.0368
0.25          0.1460   0.75          0.0162
0.30          0.2001   0.80          0.0055
0.35          0.2377   0.85          0.0012
0.40          0.2508   0.90          0.0001
0.45          0.2384   0.95          0.0000
0.50          0.2051   1.00          0.0000

Examples: Credibility Intervals
Bayesian Approach: Posterior

Combining the likelihood and the prior, we have the joint probabilities

$$P(p, x) = \pi(p)\,P(x \mid p)$$

We sum these probabilities over all possible values of p, and divide by this sum to form posterior probabilities:

$$P(p \mid x) = \frac{\pi(p)\,P(x \mid p)}{\sum_{p} \pi(p)\,P(x \mid p)}$$

Credibility Intervals are like Confidence Intervals for parameters in the Posterior Distribution (Uniform Prior)

n = 10, x = 4 successes

Prior Prob   P(Success)   Likelihood   Joint          Normalized Joint   Cumulative
pi(p)        p            L(x|p)       pi(p)*L(x|p)   Posterior          Posterior
0.05         0.05         0.00000      0.00000        0.00053            0.00053
0.05         0.10         0.00005      0.00000        0.00614            0.00667
0.05         0.15         0.00019      0.00001        0.02205            0.02872
0.05         0.20         0.00042      0.00002        0.04844            0.07717
0.05         0.25         0.00070      0.00003        0.08030            0.15746
0.05         0.30         0.00095      0.00005        0.11007            0.26753
0.05         0.35         0.00113      0.00006        0.13072            0.39825
0.05         0.40         0.00119      0.00006        0.13795            0.53620
0.05         0.45         0.00114      0.00006        0.13110            0.66730
0.05         0.50         0.00098      0.00005        0.11279            0.78010
0.05         0.55         0.00076      0.00004        0.08776            0.86786
0.05         0.60         0.00053      0.00003        0.06131            0.92917
0.05         0.65         0.00033      0.00002        0.03790            0.96707
0.05         0.70         0.00018      0.00001        0.02022            0.98729
0.05         0.75         0.00008      0.00000        0.00892            0.99621
0.05         0.80         0.00003      0.00000        0.00303            0.99924
0.05         0.85         0.00001      0.00000        0.00069            0.99992
0.05         0.90         0.00000      0.00000        0.00008            1.00000
0.05         0.95         0.00000      0.00000        0.00000            1.00000
0.05         1.00         0.00000      0.00000        0.00000            1.00000
Totals 1                               0.00043        1.00000

Credible Interval: (0.15, 0.70), with posterior probability 0.96
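The posterior column is the joint column divided by its total. A minimal Python sketch of that normalization for the uniform prior follows; the function name `posterior` is hypothetical, and the likelihood is entered without the constant 210, which cancels in the division.

```python
def posterior(prior, likelihood):
    """Normalize prior * likelihood so the posterior probabilities sum to 1."""
    joint = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]

grid = [i / 20 for i in range(1, 21)]       # p = 0.05, 0.10, ..., 1.00
like = [p**4 * (1 - p) ** 6 for p in grid]  # kernel of the binomial likelihood
uniform = [0.05] * len(grid)

post = posterior(uniform, like)
for p, value in zip(grid, post):
    print(f"{p:.2f}  {value:.5f}")
```

With a uniform prior the posterior is proportional to the likelihood, so its largest value falls at the maximum likelihood estimate p = 0.4.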

Credibility Intervals are like Confidence Intervals for parameters in the Posterior Distribution (Symmetric Prior)

Prior Prob   P(Success)   Likelihood   Joint          Normalized Joint   Cumulative
pi(p)        p            L(x|p)       pi(p)*L(x|p)   Posterior          Posterior
0.05         0.05         0.000005     0.000000       0.000499           0.000499
0.10         0.10         0.000053     0.000005       0.011541           0.012040
0.20         0.15         0.000191     0.000038       0.082926           0.094965
0.30         0.20         0.000419     0.000126       0.273251           0.368217
0.20         0.25         0.000695     0.000139       0.301952           0.670169
0.10         0.30         0.000953     0.000095       0.206945           0.877114
0.05         0.35         0.001132     0.000057       0.122886           1.000000
0.00         0.40         0.001194     0.000000       0.000000           1.000000
0.00         0.45         0.001135     0.000000       0.000000           1.000000
0.00         0.50         0.000977     0.000000       0.000000           1.000000
0.00         0.55         0.000760     0.000000       0.000000           1.000000
0.00         0.60         0.000531     0.000000       0.000000           1.000000
0.00         0.65         0.000328     0.000000       0.000000           1.000000
0.00         0.70         0.000175     0.000000       0.000000           1.000000
0.00         0.75         0.000077     0.000000       0.000000           1.000000
0.00         0.80         0.000026     0.000000       0.000000           1.000000
0.00         0.85         0.000006     0.000000       0.000000           1.000000
0.00         0.90         0.000001     0.000000       0.000000           1.000000
0.00         0.95         0.000000     0.000000       0.000000           1.000000
0.00         1.00         0.000000     0.000000       0.000000           1.000000
Totals 1.00                            0.000460       1.000000

Credible Interval: (0.15, 0.35), with posterior probability 0.91

Credibility Intervals are like Confidence Intervals for parameters in the Posterior Distribution (Tiered Prior)

Prior Prob   P(Success)   Likelihood   Joint          Normalized Joint   Cumulative
pi(p)        p            L(x|p)       pi(p)*L(x|p)   Posterior          Posterior
0.01         0.05         0.00000      0.00000        0.00010            0.00010
0.10         0.10         0.00005      0.00001        0.01105            0.01115
0.20         0.15         0.00019      0.00004        0.07941            0.09056
0.20         0.20         0.00042      0.00008        0.17444            0.26500
0.20         0.25         0.00070      0.00014        0.28914            0.55414
0.10         0.30         0.00095      0.00010        0.19817            0.75231
0.03         0.35         0.00113      0.00003        0.07060            0.82291
0.02         0.40         0.00119      0.00002        0.04967            0.87258
0.02         0.45         0.00114      0.00002        0.04721            0.91979
0.02         0.50         0.00098      0.00002        0.04062            0.96041
0.01         0.55         0.00076      0.00001        0.01580            0.97621
0.01         0.60         0.00053      0.00001        0.01104            0.98725
0.01         0.65         0.00033      0.00000        0.00682            0.99407
0.01         0.70         0.00018      0.00000        0.00364            0.99771
0.01         0.75         0.00008      0.00000        0.00161            0.99932
0.01         0.80         0.00003      0.00000        0.00055            0.99986
0.01         0.85         0.00001      0.00000        0.00012            0.99999
0.01         0.90         0.00000      0.00000        0.00001            1.00000
0.01         0.95         0.00000      0.00000        0.00000            1.00000
0.01         1.00         0.00000      0.00000        0.00000            1.00000
Totals 1.00                            0.00048        1.00000

Credible Interval: (0.15, 0.55), with posterior probability 0.89
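The interval probabilities for the three priors can be recomputed from the prior columns. In the Python sketch below, the function name and dictionaries are hypothetical. It also assumes each reported probability is a difference of cumulative posterior values, so the mass at the lower limit itself is excluded. That reading is inferred from the tables, not stated in them.

```python
def interval_probability(prior, lower, upper, n=10, x=4):
    """Posterior probability that lower < p <= upper on the grid 0.05, ..., 1.00."""
    grid = [i / 20 for i in range(1, 21)]
    joint = [pr * p**x * (1 - p) ** (n - x) for pr, p in zip(prior, grid)]
    total = sum(joint)
    return sum(j for j, p in zip(joint, grid) if lower < p <= upper) / total

priors = {
    "Uniform": [0.05] * 20,
    "Symmetric": [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05] + [0.0] * 13,
    "Tiered": [0.01, 0.10, 0.20, 0.20, 0.20, 0.10, 0.03, 0.02, 0.02, 0.02]
    + [0.01] * 10,
}
limits = {"Uniform": (0.15, 0.70), "Symmetric": (0.15, 0.35), "Tiered": (0.15, 0.55)}

results = {name: interval_probability(priors[name], *limits[name]) for name in priors}
for name, prob in results.items():
    print(f"{name:10s} {limits[name]}  {prob:.2f}")
```

The data (n = 10, x = 4) are identical in all three runs; only the prior changes, which is what moves the interval and its probability.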

Examples: Credibility Intervals
Bayesian Approach: Conclusions

Credibility Intervals (for the same data) depend on the Prior Distribution

Prior       Credibility Interval   Confidence
Uniform     (0.15, 0.70)           0.96
Symmetric   (0.15, 0.35)           0.91
Tiered      (0.15, 0.55)           0.89

Frequentist 95% Confidence Interval based on Normal Approximation: (0.10, 0.70)

$$\hat{p} \pm z\,\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Credibility Interval: intuitive interpretation. The probability that the parameter is in the interval is the confidence.

Frequentist Confidence Interval: awkward interpretation. The interval includes the parameter for 95% of samples, if sampling is repeated.
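The repeated-sampling interpretation can be made concrete by enumerating every possible sample outcome. The Python sketch below is an illustration under an assumed true value p = 0.4, which is hypothetical since the true p is unknown in practice. It adds up the binomial probabilities of the outcomes x whose normal-approximation interval contains that value. The function name `wald_coverage` is also hypothetical.

```python
import math

def wald_coverage(p_true, n, z=1.96):
    """Exact probability that p_hat +/- z*sqrt(p_hat(1-p_hat)/n) contains p_true."""
    coverage = 0.0
    for x in range(n + 1):
        p_hat = x / n
        half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= p_true <= p_hat + half_width:
            # Weight each possible sample outcome by its binomial probability.
            coverage += math.comb(n, x) * p_true**x * (1 - p_true) ** (n - x)
    return coverage

coverage = wald_coverage(p_true=0.4, n=10)
print(round(coverage, 3))
```

For n = 10 the coverage under this assumption comes out near 0.90 rather than 0.95, in line with the earlier note that nP is too small for the normal approximation to be good.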
