sampling distributions

40
Sampling Distributions Martina Litschmannová m artina.litschmannova @vsb.cz K210

Upload: hilde

Post on 23-Feb-2016

22 views

Category:

Documents


0 download

DESCRIPTION

Sampling Distributions. Martina Litschmannová m artina.litschmannova @vsb.cz K210. Populations vs. Sample. A population includes each element from the set of observations that can be made. A sample consists only of observations drawn from the population. Exploratory Data Analysis. - PowerPoint PPT Presentation

TRANSCRIPT

Page 2: Sampling Distributions

Populations vs. Sample

A population includes each element from the set of observations that can be made.

A sample consists only of observations drawn from the population.

populationsample

Inferential Statistics

Exploratory Data Analysissampling

Page 3: Sampling Distributions

Characteristic of a population vs.

characteristic of a sample

A a measurable characteristic of a population, such as a mean or standard deviation, is called a parameter, but a measurable characteristic of a sample is called a statistic.

PopulationExpectation

(mean), resp.

Medianx0,5

Variance (dispersion)

, resp. Std. deviation

σProbability

π

SampleSample mean

(average)Sample median Sample variance

S2

Sample std. deviation

SRelative frequency

p

Page 4: Sampling Distributions

Sampling Distributions

A sampling distribution is created by, as the name suggests, sampling.

The method we will employ on the rules of probability and the laws of expected value and variance to derive the sampling distribution.

For example, consider the roll of one and two dices…

Page 5: Sampling Distributions

A fair die is thrown infinitely many times, with the random variable X = # of spots on any throw.

The probability distribution of X is:

The mean, variance and standard deviation are calculated as:

,

1 2 3 4 5 6

1/6 1/6 1/6 1/6 1/6 1/6

The roll of one die

Page 6: Sampling Distributions

A sampling distribution is created by looking at all samples of size n=2 (i.e. two dice) and their means.

The roll of Two DicesThe Sampling Distribution of Mean

Sample Mean Sample Mean Sample Mean{1, 1} 1,0 {3, 1} 2,0 {3, 1} 3,0{1, 2} 1,5 {3, 2} 2,5 {3, 2} 3,5{1, 3} 2,0 {3, 3} 3,0 {3, 3} 4,0{1, 4} 2,5 {3, 4} 3,5 {3, 4} 4,5{1, 5} 3,0 {3, 5} 4,0 {3, 5} 5,0{1, 6} 3,5 {3, 6} 4,5 {3, 6} 5,5{2, 1} 1,5 {4, 1} 2,5 {4, 1} 3,5{2, 2} 2,0 {4, 2} 3,0 {4, 2} 4,0{2, 3} 2,5 {4, 3} 3,5 {4, 3} 4,5{2, 4} 3,0 {4, 4} 4,0 {4, 4} 5,0{2, 5} 3,5 {4, 5} 4,5 {4, 5} 5,5{2, 6} 4,0 {4, 6} 5,0 {4, 6} 6,0

Page 7: Sampling Distributions

A sampling distribution is created by looking at all samples of size n=2 (i.e. two dice) and their means.

While there are 36 possible samples of size 2, there are only 11 values for , and some (e.g. ) occur more frequently than others (e.g. ).

The roll of Two DicesThe Sampling Distribution of Mean

1,0 1/361,5 2/362,0 3/362,5 4/363,0 5/363,5 6/364,0 5/364,5 4/365,0 3/365,5 2/366,0 1/36 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

0

1/36

2/36

3/36

4/36

5/36

6/36

Mean

Prob

abili

ty

Page 8: Sampling Distributions

A sampling distribution is created by looking at all samples of size n=2 (i.e. two dice) and their means.

,

The roll of Two DicesThe Sampling Distribution of Mean

1,0 1/361,5 2/362,0 3/362,5 4/363,0 5/363,5 6/364,0 5/364,5 4/365,0 3/365,5 2/366,0 1/36 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

0

1/36

2/36

3/36

4/36

5/36

6/36

Mean

Prob

abili

ty

Page 9: Sampling Distributions

A sampling distribution is created by looking at all samples of size n=2 (i.e. two dice) and their means.

= # of spots on i-th dice, ,

The roll of Two DicesThe Sampling Distribution of Mean

1,0 1/361,5 2/362,0 3/362,5 4/363,0 5/363,5 6/364,0 5/364,5 4/365,0 3/365,5 2/366,0 1/36 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

0

1/36

2/36

3/36

4/36

5/36

6/36

Mean

Prob

abili

ty

Page 10: Sampling Distributions

Compare

= # of spots on i-th dice, ,

Note that: ,

1 2 3 4 5 60

1/6

x

P(x)

1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.00

1/36

2/36

3/36

4/36

5/36

6/36

MeanPr

obab

ility

Distribution of X Sampling Distribution of

Page 11: Sampling Distributions

Generalize - Central Limit Theorem

The sampling distribution of the mean of a random sample drawn from any population is approximately normal for a sufficiently large sample size.

, ,

The larger the sample size, the more closely the sampling distribution of X will resemble a normal distribution.

n=1n=5n=10n=30

x

f(x)

0 200

0,2

0,4

0,6

0,8

1

1,2

Page 12: Sampling Distributions

Central Limit Theorem

… random variable, ,

Note that: , , … standard error

Same Distribution of all Sampling Distribution of 𝜇

𝜎√𝑛

Page 13: Sampling Distributions

Generalize - Central Limit Theorem

The sampling distribution of drawn from any population is approximately normal for a sufficiently large sample size.

In many practical situations, a sample size of 30 may be sufficiently large to allow us to use the normal distribution as an approximation for the sampling distribution of .

Note: If X is normal, is normal. We don’t need Central Limit Theorem in this case.

Page 14: Sampling Distributions

1. The foreman of a bottling plant has observed that the amount of soda in each “32-ounce” bottle is actually a normally distributed random variable, with a mean of 32,2 ounces and a standard deviation of 0,3 ounce.

A) If a customer buys one bottle, what is the probability that the bottle will contain more than 32 ounces?

Page 15: Sampling Distributions

1. The foreman of a bottling plant has observed that the amount of soda in each “32-ounce” bottle is actually a normally distributed random variable, with a mean of 32,2 ounces and a standard deviation of 0,3 ounce.

B) If a customer buys a carton of four bottles, what is the probability that the mean amount of the four bottles will be greater than 32 ounces?

Page 16: Sampling Distributions

Graphically Speaking

What is the probability that one bottle will contain more than 32 ounces?

What is the probability that the mean of four bottles

will exceed 32 oz?

𝑋→𝑁 (32,2 ;0,09 ) 𝑋→𝑁 (32,2 ; 0,094 )

Page 17: Sampling Distributions

2. The probability distribution of 6-month incomes of account executives has mean $20,000 and standard deviation $5,000.

A) A single executive’s income is $20,000. Can it be said that this executive’s income exceeds 50% of all account executive incomes?

Answer: No information given about shape of distribution of X; we do not know the median of 6-month incomes.

Page 18: Sampling Distributions

2. The probability distribution of 6-month incomes of account executives has mean $20,000 and standard deviation $5,000.

B) n=64 account executives are randomly selected. What is the probability that the sample mean exceeds $20,500?

Page 19: Sampling Distributions

3. A sample of size n=16 is drawn from a normally distributed population with and . Find (Do we need the Central Limit Theorem to solve ?)

Page 20: Sampling Distributions

4. Battery life . Guarantee: avg. battery life in a case of 24 exceeds 16 hrs. Find the probability that a randomly selected case meets the guarantee.

Page 21: Sampling Distributions

5. Cans of salmon are supposed to have a net weight of 6 oz. The canner says that the net weight is a random variable with mean =6,05 oz. and stand. dev. =0,18 oz. Suppose you take a random sample of 36 cans and calculate the sample mean weight to be 5.97 oz. Find the probability that the mean weight of the sample is less than or equal to 5.97 oz.

Since , either you observed a “rare” event (recall: 5,97 oz is 2,67 stand. dev. below the mean) and the mean fill is in fact 6,05 oz. (the value claimed by the canner), the true mean fill is less than 6,05 oz., (the canner is lying ).

Page 22: Sampling Distributions

Sampling Distribution of a Proportion

The estimator of a population proportion of successes is the sample proportion . That is, we count the number of successes in a sample and compute:

X is the number of successes, n is the sample size.

Page 23: Sampling Distributions

Normal Approximation to Binomial

Binomial distribution with n=20 and with a normal approximation superimposed ( and ).

Page 24: Sampling Distributions

Normal Approximation to Binomial

Normal approximation to the binomial works best when the number of experiments n (sample size) is large, and the probability of success is close to 0,5.

For the approximation to provide good results one condition should be met:

.

Page 25: Sampling Distributions

Sampling Distribution of a Sample Proportion

Using the laws of expected value and variance, we can determine the mean, variance, and standard deviation of .

, ,

Sample proportions can be standardized to a standard normal distribution using this formulation: .

standard error of the proportion

Page 26: Sampling Distributions

6. Find the probability that of the next 120 births, no more than 40% will be boys. Assume equal probabilities for the births of boys and girls.

Page 27: Sampling Distributions

7. 12% of students at NCSU are left-handed. What is the probability that in a sample of 50 students, the sample proportion that are left-handed is less than 11%?

Page 28: Sampling Distributions

Sampling Distribution: Difference of two means

Assumption:Independent random samples be drawn from each of two normal populations.

If this condition is met, then the sampling distribution of the difference between the two sample means will be normally distributed if the populations are both normal.

Note: If the two populations are not both normally distributed, but the sample sizes are “large” (>30), the distribution of is approximately normal – Central Limit Theorem.

Page 29: Sampling Distributions

Sampling Distribution: Difference of two means

,

standard error of the difference between two means

Page 30: Sampling Distributions

Sampling Distribution: Difference of two proportions

Assumption:

Central Limit Theorem: ,

Page 31: Sampling Distributions

Sampling Distribution: Difference of two means

,

standard error of the difference between two proportions

Page 32: Sampling Distributions

Special Continous Distribution

Page 33: Sampling Distributions

, pak

Distribution

Degrees of Freedom

Using of Distribution

Page 34: Sampling Distributions

8. The Acme Battery Company has developed a new cell phone battery. On average, the battery lasts 60 minutes on a single charge. The standard deviation is 4 minutes.Suppose the manufacturing department runs a quality control test. They randomly select 7 batteries. What is probability, that the standard deviation of the selected batteries is greather than 6 minutes?

Page 35: Sampling Distributions

Student's t Distribution

, and are independent variables If ,

then has Student‘s t Distribution with degrees of freedom, .

Using of Student‘s tDistribution

The t distribution should be used with small samples from populations that are not approximately normal.

Page 36: Sampling Distributions

9. Acme Corporation manufactures light bulbs. The CEO claims that an average Acme light bulb lasts 300 days. A researcher randomly selects 15 bulbs for testing. The sampled bulbs last an average of 290 days, with a standard deviation of 50 days. If the CEO's claim were true, what is the probability that 15 randomly selected bulbs would have an average life of no more than 290 days?

Page 37: Sampling Distributions

F Distribution

The f Statistic The f statistic, also known as an f value, is a random

variable that has an F distribution.

Here are the steps required to compute an f statistic: Select a random sample of size n1 from a normal population,

having a standard deviation equal to σ1. Select an independent random sample of size n2 from a

normal population, having a standard deviation equal to σ2. The f statistic is the ratio of s1

2/σ12 and s2

2/σ22.

Page 38: Sampling Distributions

F Distribution

Here are the steps required to compute an f statistic: Select a random sample of size n1 from a normal population,

having a standard deviation equal to σ1. Select an independent random sample of size n2 from a

normal population, having a standard deviation equal to σ2. The f statistic is the ratio of s1

2/σ12 and s2

2/σ22.

Degrees of freedom

Page 39: Sampling Distributions

10. Suppose you randomly select 7 women from a population of women, and 12 men from a population of men. The table below shows the standard deviation in each sample and in each population.

Find probability, that sample standard deviation of men is greather than twice sample standard deviation of women.

Population Population standard deviation

Sample standard deviation

Women 30 35Men 50 45

Page 40: Sampling Distributions

Study materials :

http://homel.vsb.cz/~bri10/Teaching/Bris%20Prob%20&%20Stat.pdf (p. 93 - p.104)

http://stattrek.com/tutorials/statistics-tutorial.aspx?Tutorial=Stat (Distributions – Continous (Students, F Distribution) + Estimation