Chapter 5. Normal Probability Distributions

5-2 The Standard Normal Distribution

Using a Continuous Uniform Distribution. In Exercises 1-4, refer to the continuous uniform distribution depicted in Figure 5-2, assume that a class length between 50.0 min and 52.0 min is randomly selected, and find the probability that the given time is selected.

1. P(class less than 50.3 minutes) = 0.5 × (50.3 - 50) = 0.5 × 0.3 = 0.15

2. P(class greater than 51.0 minutes) = 0.5 × (52 - 51) = 0.5 × 1 = 0.5

3. P(class between 50.5 minutes and 50.8 minutes) = 0.5 × (50.8 - 50.5) = 0.5 × 0.3 = 0.15

4. P(class between 50.5 min and 51.8 min) = 0.5 × (51.8 - 50.5) = 0.5 × 1.3 = 0.65

Using a Continuous Uniform Distribution. In Exercises 5-8, assume that voltages in a circuit vary between 6 volts and 12 volts, and voltages are spread evenly over the range of possibilities, so that there is a uniform distribution. Find the probability of the given range of voltage levels.

5. For any probability distribution the total probability is 1; for a continuous distribution this means the total area under the density curve is 1. The values on the x axis range from 6 to 12, a width of 6.0. For the rectangle over that range to enclose an area of 1, its height has to be 1/6 = 0.167.

P(voltage greater than 10 volts) = (12 - 10)(1/6) = 2/6 = 1/3 = 0.333

6. P(voltage less than 11 volts) = (11 - 6)(1/6) = 5/6 = 0.833

7. P(voltage between 7 and 10 volts) = (10 - 7)(1/6) = 3/6 = 1/2 = 0.500

8. P(voltage between 6.5 and 8.0 volts) = (8 - 6.5)(1/6) = 1.5/6 = 1/4 = 0.250
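The rectangle-area rule used in Exercises 1-8 can be checked with a few lines of Python. This is only a sketch; the helper name `uniform_prob` is an illustrative assumption, not something taken from the textbook.

```python
def uniform_prob(a, b, lo, hi):
    """P(lo < X < hi) for X uniform on [a, b]: width of the overlap times the height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)      # clip the requested interval to [a, b]
    return max(hi - lo, 0.0) / (b - a)

# Exercises 5-8: voltages spread evenly between 6 and 12 volts
print(round(uniform_prob(6, 12, 10, 12), 3))   # greater than 10 volts
print(round(uniform_prob(6, 12, 6, 11), 3))    # less than 11 volts
print(round(uniform_prob(6, 12, 7, 10), 3))    # between 7 and 10 volts
print(round(uniform_prob(6, 12, 6.5, 8), 3))   # between 6.5 and 8.0 volts
```

Clipping the interval to [a, b] keeps the result sensible when a requested bound falls outside the range of the distribution.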


Using the Standard Normal Distribution. In Exercises 9-28, assume that the readings on scientific thermometers are normally distributed with a mean of 0°C and a standard deviation of 1.00°C. A thermometer is randomly selected and tested. In each case, draw a sketch, and find the probability of each reading in degrees Celsius.

9. Less than -0.25. The probability distribution of readings is a standard normal distribution because the readings are normally distributed with a mean of 0 and a standard deviation of 1. We need to find the area below z = -0.25. From Table A-2, this is 0.4013.

So, P(x < -0.25) = 0.4013.

10. Probability of a thermometer reading less than -2.75°C, z = -2.75. Area below z = -2.75 is 0.0030, so P(x < -2.75) = 0.0030.

[Sketches: standard normal curves with the area to the left of z = -0.25 shaded (0.4013 from Table A-2) and the area to the left of z = -2.75 shaded.]


11. Probability of a thermometer reading less than 0.25°C, z = +0.25. Area below z = +0.25 is 0.5987, so P(x < +0.25) = 0.5987.

12. Probability of a thermometer reading less than 2.75°C, z = +2.75. Area below z = +2.75 is 0.9970, so P(x < +2.75) = 0.9970.
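The Table A-2 lookups in Exercises 9-12 can be reproduced with the standard normal cumulative distribution function. The sketch below computes it from the error function in Python's standard library; the function name `phi` is an assumption made here for illustration.

```python
from math import erf, sqrt

def phi(z):
    """Cumulative area to the left of z under the standard normal curve."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# z scores from Exercises 9-12
for z in (-0.25, -2.75, 0.25, 2.75):
    print(f"z = {z:+.2f}   area below = {phi(z):.4f}")
```

Rounded to four decimal places, these agree with the table values 0.4013, 0.0030, 0.5987, and 0.9970.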

[Sketches: standard normal curves with the area to the left of z = 0.25 shaded and the area to the left of z = 2.75 shaded.]


13. Probability of a thermometer reading greater than 2.33°C, z = +2.33. Area below z = +2.33 is 0.9901, so P(x > +2.33) = 1 - 0.9901 = 0.0099.

14. Probability of a thermometer reading greater than 1.96°C, z = +1.96. Area below z = +1.96 is 0.9750, so P(x > +1.96) = 1 - 0.9750 = 0.0250.

[Sketches: area to the right of z = 2.33 = 1 - 0.9901 = 0.0099; area to the right of z = 1.96 = 1 - 0.9750 = 0.0250.]


15. Probability of a thermometer reading greater than -2.33°C, z = -2.33. Area below z = -2.33 is 0.0099, so P(x > -2.33) = 1 - 0.0099 = 0.9901.

16. Probability of a thermometer reading greater than -1.96°C, z = -1.96. Area below z = -1.96 is 0.0250, so P(x > -1.96) = 1 - 0.0250 = 0.9750.

[Sketches: area to the right of z = -2.33 = 1 - 0.0099 = 0.9901; area to the right of z = -1.96 = 1 - 0.0250 = 0.9750.]


17. Probability of a thermometer reading between 0.5°C and 1.5°C, that is, between z = +0.50 and z = +1.50. Area below z = +1.50 is 0.9332 and area below z = +0.50 is 0.6915, so P(+0.50 < x < +1.50) = 0.9332 - 0.6915 = 0.2417.

18. Probability of a thermometer reading between 1.5°C and 2.5°C, that is, between z = +1.50 and z = +2.50. Area below z = +2.50 is 0.9938 and area below z = +1.50 is 0.9332, so P(+1.50 < x < +2.50) = 0.9938 - 0.9332 = 0.0606.

[Sketches: area between z = 0.50 and z = 1.50 = 0.9332 - 0.6915 = 0.2417; area between z = 1.50 and z = 2.50 = 0.9938 - 0.9332 = 0.0606.]


19. Probability of a thermometer reading between -2.00°C and -1.00°C, z = -2.00 and z = -1.00. Area below z = -1.00 is 0.1587 and area below z = -2.00 is 0.0228, so P(-2.00 < x < -1.00) = 0.1587 - 0.0228 = 0.1359.

20. Probability of a thermometer reading between 2.00°C and 2.34°C, z = +2.00 and z = +2.34. Area below z = +2.34 is 0.9904 and area below z = +2.00 is 0.9772, so P(+2.00 < x < +2.34) = 0.9904 - 0.9772 = 0.0132.

[Sketches: area between z = -2.0 and z = -1.0 = 0.1587 - 0.0228 = 0.1359; area between z = 2.0 and z = 2.34 = 0.9904 - 0.9772 = 0.0132.]


21. Probability of a thermometer reading between -2.67°C and 1.28°C, z = -2.67 and z = +1.28. Area below z = +1.28 is 0.8997 and area below z = -2.67 is 0.0038, so P(-2.67 < x < +1.28) = 0.8997 - 0.0038 = 0.8959.

22. Probability of a thermometer reading between -1.18°C and 2.15°C, z = -1.18 and z = +2.15. Area below z = +2.15 is 0.9842 and area below z = -1.18 is 0.1190, so P(-1.18 < x < +2.15) = 0.9842 - 0.1190 = 0.8652.
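The "area between two z scores" pattern of Exercises 17-22 is always a difference of two cumulative areas. A minimal sketch using `statistics.NormalDist` from the Python standard library (3.8 or later); the helper name `area_between` is an assumption made for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_between(z_low, z_high):
    """Area under the standard normal curve between z_low and z_high."""
    return Z.cdf(z_high) - Z.cdf(z_low)

# (z_low, z_high) pairs from Exercises 17, 19, 21, and 22
for z_low, z_high in [(0.50, 1.50), (-2.00, -1.00), (-2.67, 1.28), (-1.18, 2.15)]:
    print(f"{z_low:+.2f} to {z_high:+.2f}: {area_between(z_low, z_high):.4f}")
```

Because Table A-2 rounds each area to four decimals before subtracting, a computed difference can differ from the worked answer by about 0.0001.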

[Sketches: area between z = -2.67 and z = 1.28 = 0.8997 - 0.0038 = 0.8959; area between z = -1.18 and z = 2.15 = 0.9842 - 0.1190 = 0.8652.]

[Sketch: area between z = -0.52 and z = 3.75 = 0.9999 - 0.3015 = 0.6984.]

[Sketch: area between z = -3.88 and z = 1.07 = 0.8577 - 0.0001 = 0.8576.]



25. Probability of a thermometer reading greater than 3.57°C, z = +3.57. Area below z = +3.57 is 0.9999, so P(x > +3.57) = 1 - 0.9999 = 0.0001.

26. Probability of a thermometer reading less than -3.61°C, z = -3.61. Area below z = -3.61 is 0.0001, so P(x < -3.61) = 0.0001.

[Sketches: area to the right of z = 3.57 = 1 - 0.9999 = 0.0001; area to the left of z = -3.61 = 0.0001 from Table A-2.]


27. Probability of a thermometer reading greater than 0°C, z = 0.00. Area below z = 0.00 is 0.5000, so P(x > 0.00) = 1 - 0.5000 = 0.5000.

28. Probability of a thermometer reading less than 0°C, z = 0.00. Area below z = 0.00 is 0.5000, so P(x < 0.00) = 0.5000.

[Sketches: area to the right of z = 0 = 1 - 0.5000 = 0.5000; area to the left of z = 0 = 0.5000 from Table A-2.]



Basis for Empirical Rule. In Exercises 29-32, find the indicated area under the curve of the standard normal distribution, then convert it to a percentage and fill in the blank. The results form the basis for the empirical rule introduced in Section 2-5.

29. About 68.26% of the area is between z = -1 and z = +1 (or within one standard deviation of the mean). Since the area below z = -1.00 is 0.1587, the area between z = -1.00 and the mean is 0.5000 - 0.1587 = 0.3413, so the total area between z = -1.00 and z = +1.00 is 2 × 0.3413 = 0.6826. Converted to a percentage, this is 0.6826 × 100% = 68.26%.

30. About 95.44% of the area is between z = -2 and z = +2 (or within two standard deviations of the mean). Since the area below z = -2.00 is 0.0228, the area between z = -2.00 and the mean is 0.5000 - 0.0228 = 0.4772, so the total area between z = -2.00 and z = +2.00 is 2 × 0.4772 = 0.9544. Converted to a percentage, this is 0.9544 × 100% = 95.44%.

31. About 99.74% of the area is between z = -3 and z = +3 (or within three standard deviations of the mean). Since the area below z = -3.00 is 0.0013, the area between z = -3.00 and the mean is 0.5000 - 0.0013 = 0.4987, so the total area between z = -3.00 and z = +3.00 is 2 × 0.4987 = 0.9974. Converted to a percentage, this is 0.9974 × 100% = 99.74%.

32. About 99.98% of the area is between z = -3.5 and z = +3.5 (or within 3.5 standard deviations of the mean). Since the area below z = -3.50 is 0.0001, the area between z = -3.50 and the mean is 0.5000 - 0.0001 = 0.4999, so the total area between z = -3.50 and z = +3.50 is 2 × 0.4999 = 0.9998. Converted to a percentage, this is 0.9998 × 100% = 99.98%.
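As a cross-check on Exercises 29-32, the area within k standard deviations of the mean can be computed directly. This is a sketch using Python's `statistics.NormalDist`; the function name `within` is an assumption. The values are computed without table rounding, so they differ slightly from the table-based answers (roughly 68.27%, 95.45%, 99.73%, and 99.95%).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def within(k):
    """Area under the standard normal curve between z = -k and z = +k."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3, 3.5):
    print(f"within {k} standard deviations: {100 * within(k):.2f}%")
```

The largest gap is at k = 3.5, where the table's four-decimal precision cannot resolve the small tail area.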

Finding Probability. In Exercises 33-36, assume that the readings on the thermometers are normally distributed with a mean of 0°C and a standard deviation of 1.00°C. Find the indicated probability, where z is the reading in degrees.

33. P(1.96 < z ... -2.575) = 1 - (Area below z = -2.575) = 1 - 0.0050 = 0.9950

36. P(1.96 < z < 2.33) = (Area below z = +2.33) - (Area below z = +1.96) = 0.9901 - 0.9750 = 0.0151

Finding Temperature Values. In Exercises 37-40, assume that the readings on the thermometers are normally distributed with a mean of 0°C and a standard deviation of 1.00°C. A thermometer is randomly selected and tested. In each case, draw a sketch, and find the temperature reading corresponding to the given information.

37. 0.90 in the body of the table corresponds to a z score of +1.28. So, the 90th percentile is the temperature reading of μ + (1.28 × σ) = 0 + (1.28 × 1.00) = 1.28°C.

38. 0.20 in the body of the table corresponds to a z score of -0.84. So, the 20th percentile is the temperature reading of μ + (-0.84 × σ) = 0 + (-0.84 × 1.00) = -0.84°C.

[Sketches: area of 0.9000 (from Table A-2) to the left of z = 1.28; area to the left of z = -0.84.]



39. 0.05 in the body of the table corresponds to a z score of -1.645. So, the 5th percentile is the temperature reading of μ + (-1.645 × σ) = 0 + (-1.645 × 1.00) = -1.645°C.

40. 0.03 in the body of the table corresponds to a z score of -1.88. This is the lower cutoff point. 1 - 0.03 = 0.97 in the body of the table corresponds to a z score of +1.88. This is the upper cutoff point. Thus, thermometers with readings lower than -1.88°C or higher than +1.88°C would be rejected, and thermometers with readings between -1.88°C and +1.88°C would not be rejected. In practice, readings of exactly -1.88 or +1.88 would probably also be rejected, since the requirement is that the lowest 3% and the highest 3% be rejected.
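Exercises 37-40 read Table A-2 in reverse: start from a cumulative area and find the z score. One way to check these lookups, assuming Python 3.8 or later, is the `inv_cdf` method of `statistics.NormalDist`. Table values are rounded to two or three decimals, so small differences are expected.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.00)  # thermometer readings in degrees Celsius

# cumulative areas used in Exercises 37-40
for area in (0.90, 0.20, 0.05, 0.03, 0.97):
    print(f"area below = {area:.2f}   reading = {Z.inv_cdf(area):+.3f}")
```

Because the mean is 0 and the standard deviation is 1, the reading and the z score are the same number here.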

[Sketches: standard normal curves marking the cutoffs z = -1.88 and z = +1.88, and the cutoff z = -1.645.]



41. a. The percentage of data within one standard deviation of the mean corresponds to the area between z = -1.00 and z = +1.00. This area is 68.26%.

b. The percentage of data within 1.96 standard deviations of the mean corresponds to the area between z = -1.96 and z = +1.96. This area is 95.00%.

c. The percentage of data within three standard deviations of the mean corresponds to the area between z = -3.00 and z = +3.00. This area is 99.74%.

d. The percentage of data between one standard deviation below the mean and two standard deviations above the mean corresponds to the area between z = -1.00 and z = +2.00. This is 0.9772 - 0.1587 = 0.8185, so the area is 81.85%.

e. The percentage of data more than two standard deviations away from the mean corresponds to 1 - (area between z = -2.00 and z = +2.00) = 1 - 0.9544 = 0.0456, or 4.56%.

5-3 Applications of Normal Distributions

IQ Scores. In Exercises 1-8, assume that adults have IQ scores that are normally distributed with a mean of 100 and a standard deviation of 15 (as on the Wechsler test). (Hint: Draw a graph in each case.)

1. The IQ of 115 is converted to a z score as follows:

z = (x - μ)/σ = (115 - 100)/15 = 15/15 = +1.00

Referring to Table A-2, z = +1.00 corresponds to an area of 0.8413, so P(IQ < 115) = 0.8413.

2. The IQ of 131.5 is converted to a z score as follows:

z = (x - μ)/σ = (131.5 - 100)/15 = 31.5/15 = +2.10

Referring to Table A-2, z = +2.10 corresponds to an area of 0.9821, so P(IQ > 131.5) = 1 - 0.9821 = 0.0179.

[Sketch: IQ curve with mean 100 (z = 0); area of 0.8413 to the left of x = 115 (z = 1).]


3. The IQs of 90 and 110 are converted to z scores as follows:

z = (x - μ)/σ = (90 - 100)/15 = -10/15 = -0.67
z = (x - μ)/σ = (110 - 100)/15 = 10/15 = +0.67

Referring to Table A-2, z = -0.67 corresponds to an area of 0.2514 and z = +0.67 corresponds to an area of 0.7486, so P(90 < IQ < 110) = 0.7486 - 0.2514 = 0.4972.
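The conversion z = (x - μ)/σ followed by a table lookup, as in Exercises 1-3, can be sketched in Python as follows. The helper name `z_score` is an assumption. Note that the worked solutions round z to two decimals before using Table A-2, so an exact computation can differ in the third decimal place (for Exercise 3 it gives about 0.495 rather than 0.4972).

```python
from statistics import NormalDist

MU, SIGMA = 100, 15               # IQ scores, as stated for Exercises 1-8
iq = NormalDist(mu=MU, sigma=SIGMA)

def z_score(x, mu=MU, sigma=SIGMA):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

print(z_score(115), round(iq.cdf(115), 4))                   # Exercise 1: P(IQ < 115)
print(z_score(131.5), round(1 - iq.cdf(131.5), 4))           # Exercise 2: P(IQ > 131.5)
print(round(z_score(90), 2), round(z_score(110), 2),
      round(iq.cdf(110) - iq.cdf(90), 4))                    # Exercise 3: P(90 < IQ < 110)
```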

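[Sketch: IQ curve with the area to the right of x = 131.5 (z = 2.1) shaded; area below z = 2.10 is 0.9821, so the shaded area is 1 - 0.9821 = 0.0179.]

[Sketch: IQ curve with the area between x = 90 (z = -0.67) and x = 110 (z = 0.67) shaded; total area up to z = 0.67 is 0.7486, area below z = -0.67 is 0.2514, so the shaded area is 0.7486 - 0.2514 = 0.4972.]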


4. The IQs of 110 and 120 are converted to z scores as follows:

z = (x - μ)/σ = (110 - 100)/15 = 10/15 = +0.67
z = (x - μ)/σ = (120 - 100)/15 = 20/15 = +1.33

Referring to Table A-2, z = +0.67 corresponds to an area of 0.7486 and z = +1.33 corresponds to an area of 0.9082, so P(110 < IQ


7. The IQ score separating the top 15% from the others is the same score that separates the bottom (100 - 15)% from the others: 100 - 15 = 85. We find 0.85 in the body of the table and read off the corresponding z score. The z score for a cumulative area of 0.85 is 1.04.

x = μ + (z × σ) = 100 + (1.04 × 15) = 100 + 15.6 = 115.6

The IQ score separating the top 15% from the others is 115.6.

8. The IQ score separating the top 55% from the others is the same score that separates the bottom (100 - 55)% from the others: 100 - 55 = 45. We find 0.45 in the body of the table and read off the corresponding z score. The z score for a cumulative area of 0.45 is -0.13.

x = μ + (z × σ) = 100 + (-0.13 × 15) = 100 + (-1.95) = 98.05

The IQ score separating the top 55% from the others is 98.05.

[Sketches: IQ curve with area 0.80 below z = 0.84 (x = 112.6); IQ curve with area 0.85 below z = 1.04 (x = 115.6) and area 0.15 above.]


9. Body Temperature
a. μ = 98.20, σ = 0.62, x = 100.6

z = (x - μ)/σ = (100.6 - 98.2)/0.62 = 2.4/0.62 = +3.87

From Table A-2, P(Temperature < 100.6) = P(z < +3.87) = 0.9999, so P(Temperature > 100.6) = P(z > +3.87) = 1 - 0.9999 = 0.0001. This corresponds to 0.01%. Yes, this percentage suggests that the cutoff of 100.6°F is appropriate.
b. Since we want 5% of the people to exceed the cutoff temperature, we use (100 - 5)% to find the area to the left of the cutoff first. This corresponds to an area of 0.95. From Table A-2, this corresponds to a z score of +1.645.

x = μ + (z × σ) = 98.2 + (1.645 × 0.62) = 98.2 + 1.02 = 99.22

Thus, 5% of the people will exceed 99.2°F.

10. Lengths of Pregnancies, μ = 268, σ = 15
a. x = 308. We are to find P(pregnancy > 308 days). We find P(pregnancy < 308) and subtract it from 1.

z = (x - μ)/σ = (308 - 268)/15 = 40/15 = +2.67

From the table, P(z < 2.67) = P(pregnancy < 308) = 0.9962, so P(pregnancy > 308) = 1 - 0.9962 = 0.0038. This result shows that it is highly unlikely for a pregnancy to last 308 days or more. Therefore it is more likely that her husband is not responsible for her pregnancy, but there is no proof one way or the other.
b. If premature babies are in the lower 4%, we find the cutoff time for the area 0.04, which corresponds to z = -1.75.

x = μ + (z × σ) = 268 + (-1.75 × 15) = 268 - 26.25 = 241.75

So, the length that separates premature babies from the others is 242 days.

11. Designing Helmets, μ = 6, σ = 1
To find the cutoff points for the smallest 2.5% and the largest 2.5%, we find the z scores for the areas 0.025 and (1 - 0.025) or 0.975. From the table, these are -1.96 and +1.96 respectively.

x = μ + (z × σ) = 6 + (-1.96 × 1) = 6 - 1.96 = 4.04 ≈ 4
x = μ + (z × σ) = 6 + (1.96 × 1) = 6 + 1.96 = 7.96 ≈ 8

The minimum and maximum head breadths are 4 inches and 8 inches respectively.
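The cutoff formula x = μ + (z × σ) used in Exercises 7-11 can be sketched as a small function. The name `cutoff` is an assumption made for illustration, and the z score is taken from `NormalDist().inv_cdf` rather than from Table A-2, so results can differ slightly from the table-based answers.

```python
from statistics import NormalDist

def cutoff(mu, sigma, area_below):
    """Value x with the given cumulative area to its left: x = mu + z * sigma."""
    z = NormalDist().inv_cdf(area_below)   # z score for the cumulative area
    return mu + z * sigma

print(round(cutoff(100, 15, 0.85), 1))     # Exercise 7: IQ separating the top 15%
print(round(cutoff(98.2, 0.62, 0.95), 2))  # Exercise 9b: temperature exceeded by 5%
print(round(cutoff(6, 1, 0.025), 2),       # Exercise 11: smallest 2.5% of head breadths
      round(cutoff(6, 1, 0.975), 2))       #              largest 2.5% of head breadths
```

The same function handles lower and upper cutoffs; only the cumulative area passed in changes.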

[Sketch: IQ curve with area 0.45 below z = -0.13 (x = 98.05) and area 0.55 above.]


12. CD Player Warranty, μ = 7.1, σ = 1.4

a. x = 8.0

z = (x - μ)/σ = (8 - 7.1)/1.4 = 0.9/1.4 = +0.64

The area for this z score is 0.7389. So the probability that a CD player will have a replacement time less than 8 years is 0.7389.

b. We need to find the cutoff point for the upper 2%. So, we find the z score for an area of (1 - 0.02) or 0.98. This corresponds to z = +2.05.

x = μ + (z × σ) = 7.1 + (2.05 × 1.4) = 7.1 + 2.87 = 9.97

Therefore, the time length of the warranty should be 10 years.

Heights of Women. In Exercises 13-16, assume that heights of women are normally distributed with a mean given by μ = 63.6 in. and a standard deviation given by σ = 2.5 in. (based on data from the National Health Survey). In each case, draw a graph.

13. Beanstalk Club Height Requirement, μ = 63.6, σ = 2.5

z = (x - μ)/σ = (70 - 63.6)/2.5 = 6.4/2.5 = +2.56

This corresponds to a probability of 0.9948. So, 99.48% of the women have a height below 70 in. Therefore (100 - 99.48)% or 0.52% of the women meet the requirement of being at least 70 in. in height.

14. Height Requirement for Women Soldiers
We need to find the z scores and areas for 58 in. and 80 in.

z = (x - μ)/σ = (58 - 63.6)/2.5 = -5.6/2.5 = -2.24
z = (x - μ)/σ = (80 - 63.6)/2.5 = 16.4/2.5 = +6.56

The areas for these z scores are 0.0125 and 0.9999 respectively. The probability of being between these heights is 0.9999 - 0.0125 = 0.9874. So, 98.74% of women meet this requirement. Not many women are being denied entry into the army due to height.

[Sketch: height curve with mean 63.6 in.; area 0.9948 below x = 70 in. (z = 2.56) and area 0.0052 above.]


15. Height Requirement for Rockettes
We need to find the z scores and areas for 66.5 in. and 71.5 in.

z = (x - μ)/σ = (66.5 - 63.6)/2.5 = 2.9/2.5 = +1.16
z = (x - μ)/σ = (71.5 - 63.6)/2.5 = 7.9/2.5 = +3.16

The areas for these z scores are 0.8770 and 0.9992 respectively. The probability of being between these heights is 0.9992 - 0.8770 = 0.1222. Only 12.22% of women meet this requirement. Yes, it seems that the height of the Rockettes is well above the mean.

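[Sketch: height curve with the area between 58 in. (z = -2.24) and 80 in. (z = 6.56) shaded; area below z = -2.24 is 0.0125, total area up to z = 6.56 is 0.9999, shaded area is 0.9874.]

[Sketch: height curve with the area between 66.5 in. (z = 1.16) and 71.5 in. (z = 3.16) shaded; area below z = 1.16 is 0.8770, total area up to z = 3.16 is 0.9992, shaded area is 0.1222.]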


16. Height Requirement for Rockettes
To find the cutoffs for the shortest 20% and the tallest 20%, we need to find the z scores corresponding to the areas 0.20 and (1 - 0.20) or 0.80. From the table, these z values are -0.84 and +0.84. We then use the formula:

x = μ + (z × σ) = 63.6 + (-0.84 × 2.5) = 63.6 - 2.1 = 61.5
x = μ + (z × σ) = 63.6 + (0.84 × 2.5) = 63.6 + 2.1 = 65.7

So, the new minimum and maximum allowable heights are 61.5 in. and 65.7 in. respectively.

17. Birth Weights, μ = 3420, σ = 495
To find the cutoff weight for the lightest 2% we need to find the z score corresponding to the area 0.02. From the table, the z score is -2.05. We then use the formula:

x = μ + (z × σ) = 3420 + (-2.05 × 495) = 3420 - 1014.75 = 2405.25

Therefore, the weight of 2405 g separates the lightest 2% of American babies from the others.

[Sketch: height curve with cutoffs at 61.5 in. (z = -0.84) and 65.7 in. (z = 0.84).]

[Sketch: birth weight curve with mean 3420 g; area 0.02 below 2405 g (z = -2.05).]


18. Birth Weights, μ = 3570, σ = 500
To find the cutoff weight for the lightest 2% we need to find the z score corresponding to the area 0.02. From the table, this z is -2.05. We then use the formula:

x = μ + (z × σ) = 3570 + (-2.05 × 500) = 3570 - 1025 = 2545

Therefore, the weight of 2545 g separates the lightest 2% of Norwegian babies from the others. This result is not very different from the result in Exercise 17. It's a difference of 140 g.

19. Units of Measurement, μ = 143, σ = 29
a. z scores are measured in numbers of standard deviations from the mean; they do not carry the units of the original variable.
b. The mean will be 0, the standard deviation will be 1, and the distribution will be normal since the original distribution is normal. z scores have the same shape of distribution as the original variable; converting to z scores does not produce a normal distribution if the original distribution was not normal.
c. After converting to kg, the distribution will be normal since the original distribution is normal. With 1 lb = 0.4536 kg:

mean = 143 lb = (0.4536 × 143) kg = 64.86 kg
standard deviation = 29 lb = (0.4536 × 29) kg = 13.15 kg
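A quick numerical illustration of parts (a) and (c): rescaling from pounds to kilograms changes the mean and standard deviation but leaves z scores unchanged. This is a sketch only; the weight `x_lb = 172` is a hypothetical value chosen for the example and does not come from the exercise.

```python
LB_TO_KG = 0.4536                 # conversion factor used in part (c)

mu_lb, sigma_lb = 143, 29
mu_kg = mu_lb * LB_TO_KG          # mean in kilograms
sigma_kg = sigma_lb * LB_TO_KG    # standard deviation in kilograms

x_lb = 172                        # hypothetical weight, chosen only for illustration
x_kg = x_lb * LB_TO_KG

z_from_lb = (x_lb - mu_lb) / sigma_lb
z_from_kg = (x_kg - mu_kg) / sigma_kg

print(round(mu_kg, 2), round(sigma_kg, 2))
print(z_from_lb, round(z_from_kg, 10))    # same z score in either unit
```

The conversion factor cancels in the ratio (x - μ)/σ, which is why z scores are unit-free.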

20. Using Continuity Correction
a. μ = 100, σ = 15, x = 105

z = (x - μ)/σ = (105 - 100)/15 = 5/15 = +0.33

So, P(IQ < 105) = 0.6293. Therefore, P(IQ > 105) = 1 - 0.6293 = 0.3707.
b. We replace 105 with the interval from 104.5 to 105.5. Because we want the probability of a score greater than 105, we want the area bounded by the interval together with the area to its right. We convert 104.5 to a z score:

z = (x - μ)/σ = (104.5 - 100)/15 = 4.5/15 = +0.30

So P(IQ < 104.5) = 0.6179 and P(IQ > 104.5) = 1 - 0.6179 = 0.3821. P(IQ > 105, adjusted for continuity) = 0.3821.
c. The results from (a) and (b) are nearly the same. There is very little difference.
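The comparison in parts (a) and (b) can be reproduced as follows. This is a sketch using `statistics.NormalDist`; the variable names are assumptions. The uncorrected value comes out near 0.369 rather than 0.3707 because the worked solution rounds z to 0.33 before the table lookup.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

p_plain = 1 - iq.cdf(105)        # part (a): no continuity correction
p_corrected = 1 - iq.cdf(104.5)  # part (b): area to the right of the lower edge 104.5

print(round(p_plain, 4), round(p_corrected, 4))
print(round(p_corrected - p_plain, 4))   # size of the adjustment
```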

[Sketch: birth weight curve with mean 3570 g; area 0.02 below 2545 g (z = -2.05).]


5-4 Sampling Distributions and Estimators

1. Survey of Voters
No, we cannot assume that the survey was done incorrectly, because the value of a statistic varies from sample to sample due to sampling variability. In this example, the values for the sample proportion are different because of sampling variability. A variation of 49% and 51% would seem to happen by chance relatively often.

2. Sampling Distribution of Cholesterol Levels
The sampling distribution is a distribution of all possible means of the cholesterol levels of any 40 randomly selected women.

3. Sampling Distribution of Body Temperatures
No, the histogram will not show the shape of a sampling distribution of sample means. It will show the distribution of individual values within one sample. A sampling distribution will show a distribution of all possible means of similar samples with the same sample size.

4. Sampling Distribution of Survey Results
a. The 52% is a statistic because it gives the value for one sample.
b. The sampling distribution suggested by the data is the distribution of the proportions of all possible samples of 1038 randomly selected people.
c. I would feel more confident if the sample size were 2000, because larger samples tend to represent the population better and tend to have lower error.

5. Phone Center
Selecting samples with replacement, there will be 3^2 = 9 equally likely samples.

a.
Sample Number   Sample    Sample Mean, x̄   Probability
1               10, 10    10.0              1/9
2               10, 6     8.0               1/9
3               10, 5     7.5               1/9
4               6, 10     8.0               1/9
5               6, 6      6.0               1/9
6               6, 5      5.5               1/9
7               5, 10     7.5               1/9
8               5, 6      5.5               1/9
9               5, 5      5.0               1/9

Sum of sample means: Σx̄ = 63.0
Mean of statistic values: Σx̄/9 = 7.0
Population parameter: μ = (10 + 6 + 5)/3 = 21/3 = 7.0

Sampling Distribution
Sample Mean, x̄   Probability
10.0              1/9
8.0               2/9
7.5               2/9
6.0               1/9
5.5               2/9
5.0               1/9

b. The probability of each sample is 1/9. The distribution of sample means has more than one mode and is somewhat flat.

c. Mean of sample statistics = Σx̄/9 = 63.0/9 = 7.0

d. Yes, the mean of the sampling distribution is equal to the mean of the population of the three values. Yes, these means are always equal, but only if every possible sample is included.
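The enumeration in Exercise 5 can be automated. The sketch below lists every ordered sample drawn with replacement, tallies the sample means, and confirms that the mean of the sampling distribution equals the population mean. The function name `sampling_distribution_of_mean` is an assumption, and exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sampling_distribution_of_mean(population, n=2):
    """Map each possible sample mean to its probability (sampling with replacement)."""
    samples = list(product(population, repeat=n))        # all ordered samples of size n
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {mean: Fraction(c, len(samples)) for mean, c in counts.items()}

dist = sampling_distribution_of_mean([10, 6, 5])          # Exercise 5 population
for mean, prob in sorted(dist.items(), reverse=True):
    print(float(mean), prob)

mean_of_means = sum(mean * prob for mean, prob in dist.items())
print(mean_of_means)                                      # compare with the population mean
```

Passing `[1, 11, 9, 3]` or `[85, 79, 82, 73, 78]` instead applies the same enumeration to the populations of Exercises 6 and 7.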

6. Telemarketing
Selecting samples with replacement, there will be 4^2 = 16 equally likely samples.

a.
Sample Number   Sample    Sample Mean, x̄   Probability
1               1, 1      1                 1/16
2               1, 11     6                 1/16
3               1, 9      5                 1/16
4               1, 3      2                 1/16
5               11, 11    11                1/16
6               11, 1     6                 1/16
7               11, 9     10                1/16
8               11, 3     7                 1/16
9               9, 9      9                 1/16
10              9, 1      5                 1/16
11              9, 11     10                1/16
12              9, 3      6                 1/16
13              3, 3      3                 1/16
14              3, 1      2                 1/16
15              3, 11     7                 1/16
16              3, 9      6                 1/16

Sum of sample means: Σx̄ = 96.0
Mean of statistic values: Σx̄/16 = 6.0
Population parameter: μ = (1 + 11 + 9 + 3)/4 = 24/4 = 6.0

Sampling Distribution
Sample Mean, x̄   Probability
11                1/16
10                2/16
9                 1/16
7                 2/16
6                 4/16
5                 2/16
3                 1/16
2                 2/16
1                 1/16

b. The sampling distribution consists of the 16 sample means, each of which has a probability of 1/16 of occurring. It has one mode and it is symmetrical.

c. Mean of sample statistics = Σx̄/16 = 96/16 = 6.0

d. Yes, the mean of the sampling distribution is equal to the mean of the population of the four values. Yes, these means are always equal, but only if every possible sample is included.

7. Heights of L.A. Lakers
Selecting samples with replacement, there will be 5^2 = 25 equally likely samples.

a.
Sample Number   Sample    Sample Mean, x̄   Probability
1               85, 85    85.0              1/25
2               85, 79    82.0              1/25
3               85, 82    83.5              1/25
4               85, 73    79.0              1/25
5               85, 78    81.5              1/25
6               79, 79    79.0              1/25
7               79, 85    82.0              1/25
8               79, 82    80.5              1/25
9               79, 73    76.0              1/25
10              79, 78    78.5              1/25
11              82, 82    82.0              1/25
12              82, 85    83.5              1/25
13              82, 79    80.5              1/25
14              82, 73    77.5              1/25
15              82, 78    80.0              1/25
16              73, 73    73.0              1/25
17              73, 85    79.0              1/25
18              73, 79    76.0              1/25
19              73, 82    77.5              1/25
20              73, 78    75.5              1/25
21              78, 78    78.0              1/25
22              78, 85    81.5              1/25
23              78, 79    78.5              1/25
24              78, 82    80.0              1/25
25              78, 73    75.5              1/25

Sum of sample means: Σx̄ = 1985
Mean of statistic values: Σx̄/25 = 79.4
Population parameter: μ = (85 + 79 + 82 + 73 + 78)/5 = 397/5 = 79.4

Sampling Distribution
Sample Mean, x̄   Probability
85.0              1/25
83.5              2/25
82.0              3/25
81.5              2/25
80.5              2/25
80.0              2/25
79.0              3/25
78.5              2/25
78.0              1/25
77.5              2/25
76.0              2/25
75.5              2/25
73.0              1/25

b. The probability of each sample occurring is 1/25. The sampling distribution of means consists of the 25 sample means with their corresponding probabilities. It has more than one mode and it is not symmetrical.

c. The mean of the sampling distribution is Σx̄/n = 1985/25 = 79.4

d. Yes, the mean of the sampling distribution is equal to the mean of the population of the five heights listed above. Yes, these means are always equal as long as every possible sample is included.


8. Genetics, p(F) = 3/4 = 0.75, q = 0.25
Selecting samples with replacement, there will be 4^2 = 16 equally likely samples.
M = Mike (male) = 0, A = Anna (female) = 1, B = Barbara (female) = 1, C = Chris (female) = 1

a.
Sample Number   Sample         Proportion of Females (Sample Mean)   Probability
1               M, M = 0, 0    0.0                                   1/16
2               M, A = 0, 1    0.5                                   1/16
3               M, B = 0, 1    0.5                                   1/16
4               M, C = 0, 1    0.5                                   1/16
5               A, A = 1, 1    1.0                                   1/16
6               A, M = 1, 0    0.5                                   1/16
7               A, B = 1, 1    1.0                                   1/16
8               A, C = 1, 1    1.0                                   1/16
9               B, B = 1, 1    1.0                                   1/16
10              B, M = 1, 0    0.5                                   1/16
11              B, A = 1, 1    1.0                                   1/16
12              B, C = 1, 1    1.0                                   1/16
13              C, C = 1, 1    1.0                                   1/16
14              C, M = 1, 0    0.5                                   1/16
15              C, A = 1, 1    1.0                                   1/16
16              C, B = 1, 1    1.0                                   1/16

Mean of statistic values: Σx̄/16 = 12/16 = 0.75
Population parameter: (0 + 1 + 1 + 1)/4 = 3/4 = 0.75

Sampling Distribution
Sample Mean, x̄   Probability
0.0               1/16
0.5               6/16
1.0               9/16

b. The probability of each sample is 1/16. The sampling distribution of proportions consists of the 16 sample proportions, each with a probability of 1/16. The distribution has one mode and is clearly not symmetrical.

c. The mean of the sampling distribution is Σx̄/n = 12/16 = 0.75

d. The mean of the sampling distribution is equal to the population proportion of females. Yes, the mean of the sampling distribution of proportions always equals the population proportion as long as every possible sample is included.


9. Quality Control
Selecting samples with replacement, there will be 5^2 = 25 equally likely samples.
D1 = 1, D2 = 1, A1 = 0, A2 = 0, A3 = 0

a.
Sample Number   Sample           Sample Mean, x̄   Probability
1               D1, D1 = 1, 1    1.0               1/25
2               D1, D2 = 1, 1    1.0               1/25
3               D1, A1 = 1, 0    0.5               1/25
4               D1, A2 = 1, 0    0.5               1/25
5               D1, A3 = 1, 0    0.5               1/25
6               D2, D2 = 1, 1    1.0               1/25
7               D2, D1 = 1, 1    1.0               1/25
8               D2, A1 = 1, 0    0.5               1/25
9               D2, A2 = 1, 0    0.5               1/25
10              D2, A3 = 1, 0    0.5               1/25
11              A1, A1 = 0, 0    0.0               1/25
12              A1, A2 = 0, 0    0.0               1/25
13              A1, A3 = 0, 0    0.0               1/25
14              A1, D1 = 0, 1    0.5               1/25
15              A1, D2 = 0, 1    0.5               1/25
16              A2, A2 = 0, 0    0.0               1/25
17              A2, A3 = 0, 0    0.0               1/25
18              A2, D1 = 0, 1    0.5               1/25
19              A2, D2 = 0, 1    0.5               1/25
20              A2, A1 = 0, 0    0.0               1/25
21              A3, A3 = 0, 0    0.0               1/25
22              A3, D1 = 0, 1    0.5               1/25
23              A3, D2 = 0, 1    0.5               1/25
24              A3, A1 = 0, 0    0.0               1/25
25              A3, A2 = 0, 0    0.0               1/25

Mean of statistic values: Σx̄/25 = 10/25 = 0.40
Population parameter: (1 + 1 + 0 + 0 + 0)/5 = 2/5 = 0.40

Sampling Distribution
Sample Mean, x̄   Probability
0.0               9/25
0.5               12/25
1.0               4/25

b. The sampling distribution consists of the 25 proportions and their corresponding probabilities of 1/25 each. The sampling distribution has one mode, but it is not symmetrical.

c. The mean of the sampling distribution is Σx̄/n = 10/25 = 0.40

d. Yes, the mean of the sampling distribution is equal to the population proportion of defects. Yes, the mean of the sampling distribution of proportions always equals the population proportion as long as every possible sample is included.

10. Women Senators
a. From a random sample, these results were obtained: D, R, D, D, D.
b. The proportion of Democrats is 4/5 = 0.80.
c. The proportion from part (b) is a statistic because it is the proportion in a particular sample.
d. No, the sample proportion (4/5 = 0.8) does not equal the population proportion (10/13 = 0.77). No random sample of size 5 can equal the population proportion, because the proportions in the samples must be multiples of 0.2. The possibilities are 0.0, 0.2, 0.4, 0.6, 0.8, and 1.0. The population proportion (0.77) is not equal to any of these.
e. If all possible samples of size 5 are listed, then the mean of all the sample proportions will be equal to the population proportion.

11. Mean Absolute Deviation
From Table 5-2, x = 1, 2, 5 and μ = 2.67. The population mean absolute deviation (see the formula in Section 2-5) is

Σ|x - μ|/n = (|1 - 2.67| + |2 - 2.67| + |5 - 2.67|)/3 = (1.67 + 0.67 + 2.33)/3 = 4.67/3 = 1.56

Sample Number   Sample   Sample Mean, x̄   Absolute Deviation, d = |x1 - x2|/2
1               1, 1     1.0               0.0
2               1, 2     1.5               0.5
3               1, 5     3.0               2.0
4               2, 1     1.5               0.5
5               2, 2     2.0               0.0
6               2, 5     3.5               1.5
7               5, 1     3.0               2.0
8               5, 2     3.5               1.5
9               5, 5     5.0               0.0

Mean of the sample MADs = Σd/n = 8/9 = 0.89

Since the mean of the sample MADs (0.89) does not equal 1.56 (the population mean absolute deviation), the sample mean absolute deviation is not a good estimator of the population mean absolute deviation.
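The comparison in Exercise 11 can be checked by brute force. The sketch below computes the population mean absolute deviation and the average of the sample values over all nine samples of size 2; the helper name `mad` is an assumption.

```python
from itertools import product
from statistics import mean

def mad(values):
    """Mean absolute deviation of the values about their own mean."""
    m = mean(values)
    return mean([abs(v - m) for v in values])

population = [1, 2, 5]
population_mad = mad(population)

samples = list(product(population, repeat=2))       # 9 ordered samples, with replacement
mean_sample_mad = mean([mad(s) for s in samples])

print(round(population_mad, 2), round(mean_sample_mad, 2))
```

The sample values average well below the population value, which is what it means for the statistic to be a biased estimator.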


12. Median as an Estimator

Sample Number   Sample    Sample Mean, x̄   Median   Probability
1               1, 1, 1   1.00              1        1/27
2               1, 1, 2   1.33              1        1/27
3               1, 1, 5   2.33              1        1/27
4               1, 2, 1   1.33              1        1/27
5               1, 5, 1   2.33              1        1/27
6               1, 2, 5   2.67              2        1/27
7               1, 5, 2   2.67              2        1/27
8               1, 2, 2   1.67              2        1/27
9               1, 5, 5   3.67              5        1/27
10              2, 2, 2   2.00              2        1/27
11              2, 2, 1   1.67              2        1/27
12              2, 2, 5   3.00              2        1/27
13              2, 1, 2   1.67              2        1/27
14              2, 5, 2   3.00              2        1/27
15              2, 1, 5   2.67              2        1/27
16              2, 5, 1   2.67              2        1/27
17              2, 1, 1   1.33              1        1/27
18              2, 5, 5   4.00              5        1/27
19              5, 5, 5   5.00              5        1/27
20              5, 5, 1   3.67              5        1/27
21              5, 5, 2   4.00              5        1/27
22              5, 1, 5   3.67              5        1/27
23              5, 2, 5   4.00              5        1/27
24              5, 1, 2   2.67              2        1/27
25              5, 2, 1   2.67              2        1/27
26              5, 1, 1   2.33              1        1/27
27              5, 2, 2   3.00              2        1/27

Mean of the sample means = Σx̄/n = 72/27 = 2.7
Mean of the sample medians = ΣMdn/n = 68/27 = 2.5

In this case, the mean of the sample means (2.7) equals the population mean (2.67, up to rounding), but the mean of the sample medians (2.5) does not; it is too low, so the sample median is a negatively biased estimator here. We conclude that the sample mean is a better estimator of the population mean than the sample median.
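The 27-row table in Exercise 12 can be regenerated and summarized in a few lines. This is a sketch; the variable names are assumptions.

```python
from itertools import product
from statistics import mean, median

population = [1, 2, 5]
samples = list(product(population, repeat=3))    # 27 ordered samples, with replacement

mean_of_means = mean([mean(s) for s in samples])
mean_of_medians = mean([median(s) for s in samples])

print(len(samples))
print(round(mean(population), 2), round(mean_of_means, 2), round(mean_of_medians, 2))
```

The first two printed values on the last line agree, while the third is smaller, matching the conclusion that the sample median underestimates the population mean for this population.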


5-5 The Central Limit Theorem

Using the Central Limit Theorem. In Exercises 1-6, assume that men's weights are normally distributed with a mean given by μ = 172 lb and a standard deviation given by σ = 29 lb (based on data from the National Health Survey).

1. a. P(x < 167)

z = (x - μ)/σ = (167 - 172)/29 = -5/29 = -0.17

From Table A-2, P(z < -0.17) = 0.4325. There is a 0.4325 probability that an individual man will weigh less than 167 lb.

b. P(x̄ < 167)

z = (x̄ - μ)/(σ/√n) = (167 - 172)/(29/√36) = -5/(29/6) = -5/4.833 = -1.03

From Table A-2, P(z < -1.03) = 0.1515. There is a 0.1515 probability that a group of 36 men will have a mean weight less than 167 lb.
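The only difference between parts (a) and (b) is the standard deviation in the denominator: σ for an individual, σ/√n for a sample mean. A sketch of that calculation follows; the function name `prob_mean_below` is an assumption. Because z is not rounded to two decimals here, the probabilities differ from the table-based answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_below(x_bar, mu, sigma, n=1):
    """Return (z, P(sample mean < x_bar)) for samples of size n; n=1 is a single individual."""
    z = (x_bar - mu) / (sigma / sqrt(n))    # standard error is sigma / sqrt(n)
    return z, NormalDist().cdf(z)

z1, p1 = prob_mean_below(167, 172, 29)          # part (a): one man
z36, p36 = prob_mean_below(167, 172, 29, n=36)  # part (b): mean of 36 men

print(round(z1, 2), round(p1, 4))
print(round(z36, 2), round(p36, 4))
```

Larger samples shrink the standard error, so the same 5 lb gap from the mean corresponds to a more extreme z score.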

2. a. P(x > 180)

z = (x - μ)/σ = (180 - 172)/29 = 8/29 = +0.28

From Table A-2, P(z < +0.28) = 0.6103. Therefore, P(z > +0.28) = 1 - 0.6103 = 0.3897. There is a 0.3897 probability that an individual man will weigh more than 180 lb.

b. P(x̄ > 180)

z = (x̄ - μ)/(σ/√n) = (180 - 172)/(29/√100) = 8/(29/10) = 8/2.9 = +2.76

From Table A-2, P(z < +2.76) = 0.9971. Therefore, P(z > +2.76) = 1 - 0.9971 = 0.0029. There is a 0.0029 probability that a group of 100 men will have a mean weight more than 180 lb.

3. a. P(170 < x < 175)

z = (x - μ)/σ = (170 - 172)/29 = -2/29 = -0.07
z = (x - μ)/σ = (175 - 172)/29 = 3/29 = +0.10

From Table A-2, P(z < -0.07) = 0.4721 and P(z < +0.10) = 0.5398. The difference is 0.5398 - 0.4721 = 0.0677. There is a 0.0677 probability that an individual man will weigh between 170 lb and 175 lb.

b. P(170 < x̄ < 175)

z = (x̄ - μ)/(σ/√n) = (170 - 172)/(29/√64) = -2/(29/8) = -2/3.625 = -0.55
z = (x̄ - μ)/(σ/√n) = (175 - 172)/(29/√64) = 3/(29/8) = 3/3.625 = +0.83

From Table A-2, P(z < -0.55) = 0.2912 and P(z < +0.83) = 0.7967. The difference is 0.7967 - 0.2912 = 0.5055. There is a 0.5055 probability that a group of 64 men will have a mean weight between 170 lb and 175 lb.

4. a. P(100 < x < 165)

z = (x - μ)/σ = (100 - 172)/29 = -72/29 = -2.48
z = (x - μ)/σ = (165 - 172)/29 = -7/29 = -0.24

From Table A-2, P(z < -2.48) = 0.0066 and P(z < -0.24) = 0.4052. The difference is 0.4052 - 0.0066 = 0.3986. There is a 0.3986 probability that an individual man will weigh between 100 lb and 165 lb.

b. P(100 < x̄ < 165)

z = (x̄ - μ)/(σ/√n) = (100 - 172)/(29/√81) = -72/(29/9) = -72/3.222 = -22.34
z = (x̄ - μ)/(σ/√n) = (165 - 172)/(29/√81) = -7/(29/9) = -7/3.222 = -2.17

From Table A-2, P(z < -22.34) ≈ 0.0001 and P(z < -2.17) = 0.0150. The difference is 0.0150 - 0.0001 = 0.0149. There is a 0.0149 probability that a group of 81 men will have a mean weight between 100 lb and 165 lb.

5. a. P($\bar{x}$ > 160)

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{160 - 172}{29/\sqrt{25}} = \frac{-12}{29/5} = \frac{-12}{5.80} = -2.07$

From Table A-2, P(z < -2.07) = 0.0192. Therefore, P(z > -2.07) = 1 - 0.0192 = 0.9808. There is a 0.9808 probability that a group of 25 men will have a mean weight greater than 160 lb.

b. The central limit theorem can be used in part (a) because the original distribution is a normal distribution, so the sampling distribution of the mean is normal even though the sample size is less than 30.

6. a. P(160 < $\bar{x}$ < 180)

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{160 - 172}{29/\sqrt{4}} = \frac{-12}{29/2} = \frac{-12}{14.50} = -0.83$

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{180 - 172}{29/\sqrt{4}} = \frac{8}{29/2} = \frac{8}{14.5} = +0.55$

From Table A-2, P(z < -0.83) = 0.2033 and P(z < +0.55) = 0.7088. The difference is 0.7088 - 0.2033 = 0.5055. There is a 0.5055 probability that a group of 4 men will have a mean weight between 160 lb and 180 lb.

b. The central limit theorem can be used in part (a) because the original distribution is a normal distribution, so the sampling distribution of the mean is normal even though the sample size is less than 30.

7. Redesign of Ejection Seats, $\mu = 143$, $\sigma = 29$

a. P(140 < x < 211)

$z = \frac{x - \mu}{\sigma} = \frac{140 - 143}{29} = \frac{-3}{29} = -0.10$, $z = \frac{x - \mu}{\sigma} = \frac{211 - 143}{29} = \frac{68}{29} = +2.34$

From Table A-2, P(z < -0.10) = 0.4602 and P(z < +2.34) = 0.9904. The difference is 0.9904 - 0.4602 = 0.5302. There is a 0.5302 probability that an individual woman will weigh between 140 lb and 211 lb.

b. P(140 < $\bar{x}$ < 211)

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{140 - 143}{29/\sqrt{36}} = \frac{-3}{29/6} = \frac{-3}{4.833} = -0.62$

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{211 - 143}{29/\sqrt{36}} = \frac{68}{29/6} = \frac{68}{4.833} = +14.07$

From Table A-2, P(z < -0.62) = 0.2676 and P(z < +14.07) ≈ 0.9999. The difference is 0.9999 - 0.2676 = 0.7323. There is a 0.7323 probability that a group of 36 women will have a mean weight between 140 lb and 211 lb.

c. The results from part (a) are more important because the seats will be occupied by individual women, and not by groups of women.

8. Designing Motorcycle Helmets, $\mu = 6$, $\sigma = 1$

a. P(x < 6.2)

$z = \frac{x - \mu}{\sigma} = \frac{6.2 - 6}{1} = \frac{0.2}{1} = +0.2$. From Table A-2, P(z < +0.2) = 0.5793.

There is a 0.5793 probability that an individual man will have a head breadth less than 6.2 in.

b. P($\bar{x}$ < 6.2)

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{6.2 - 6}{1/\sqrt{100}} = \frac{0.2}{1/10} = \frac{0.2}{0.1} = +2.0$

From Table A-2, P(z < +2.0) = 0.9772. There is a 0.9772 probability that a group of 100 men will have a mean head breadth less than 6.2 in.

c. The results from (b) above are for a group of men. Since each helmet is used by one man at a time, the results of (a) are more appropriate for the production manager to use.

9. Designing a Roller Coaster, $\mu = 14.4$, $\sigma = 1$

a. P($\bar{x}$ > 16.0)

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{16 - 14.4}{1/\sqrt{2}} = \frac{1.6}{1/1.414} = \frac{1.6}{0.707} = +2.26$

From Table A-2, P(z < +2.26) = 0.9881. Therefore, P(z > +2.26) = 1 - 0.9881 = 0.0119. The probability that the mean of the 2 men is greater than 16 in. is 0.0119.

b. No, most riders will be able to fit, since the probability that two riders have a mean hip breadth greater than 16 in. is very low (0.0119). Yes, this design appears to be acceptable.

10. Uniform Random-Number Generator, $\mu = 0.5$, $\sigma = 0.289$

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{0.57 - 0.50}{0.289/\sqrt{100}} = \frac{0.07}{0.289/10} = \frac{0.07}{0.0289} = +2.42$

From Table A-2, P(z < +2.42) = 0.9922. Therefore, P(z > +2.42) = 1 - 0.9922 = 0.0078. The probability of getting 100 numbers with a mean greater than 0.57 is 0.0078. It would be unusual to generate 100 such numbers and get a mean greater than 0.57, because the probability of this occurring is very low (0.0078).

11. Blood Pressure, $\mu = 114.8$, $\sigma = 13.1$

a. $z = \frac{x - \mu}{\sigma} = \frac{140 - 114.8}{13.1} = \frac{25.2}{13.1} = +1.92$

From Table A-2, P(z < +1.92) = 0.9726. Therefore, P(z > +1.92) = 1 - 0.9726 = 0.0274. There is a 0.0274 probability that an individual woman will have a systolic blood pressure greater than 140.

b. $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{140 - 114.8}{13.1/\sqrt{4}} = \frac{25.2}{13.1/2} = \frac{25.2}{6.55} = +3.85$

From Table A-2, P(z < +3.85) = 0.9999. Therefore, P(z > +3.85) = 1 - 0.9999 = 0.0001. There is a 0.0001 probability that a group of 4 women will have a mean systolic blood pressure greater than 140.

c. The central limit theorem can be used in part (b) because the original distribution is a normal distribution, even though the sample size is less than 30.

d. No. Although the mean result for the 4 women is less than 140, the individual values could be above or below 140 due to sampling variability.

12. Reduced Nicotine in Cigarettes, $\mu = 0.941$, $\sigma = 0.313$

a. $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{0.882 - 0.941}{0.313/\sqrt{40}} = -1.19$

From Table A-2, P(z < -1.19) = 0.1170.

There is a 0.1170 probability of randomly selecting 40 cigarettes with a mean of 0.882 g or less.

b. Based on the results, the amount of nicotine seems to be lower. This is because it is very unlikely to select a group of 40 cigarettes with a mean nicotine level of less than 0.882 if the mean and standard deviation have not changed. Therefore, it is likely that these values have changed as the company claims.

13. Elevator Design, $\mu = 172$, $\sigma = 29$, n = 16, P = 0.975

We first find the z score for the area P = 0.975 from the body of Table A-2. This corresponds to z = +1.96. We then use the formula:

$\bar{x} = \mu + z \cdot \frac{\sigma}{\sqrt{n}} = 172 + 1.96 \cdot \frac{29}{\sqrt{16}} = 172 + 1.96 \cdot \frac{29}{4.0} = 172 + 1.96(7.25) = 172 + 14.21 = 186.21$

To get the total value for 16 men, 186.21 × 16 = 2979.4 lb. This is the maximum total allowable weight if we want a 0.975 probability of this weight not being exceeded with 16 men.
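The reverse lookup used here (from a cumulative area back to a cutoff for the sample mean) can be sketched in Python. The function name `mean_cutoff` is a hypothetical helper introduced for illustration, and `inv_cdf` returns an unrounded z, so the result agrees with the table-based 186.21 only to about two decimals.

```python
from math import sqrt
from statistics import NormalDist


def mean_cutoff(mu, sigma, n, prob):
    """Sample-mean value with cumulative area `prob` to its left: mu + z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(prob)   # z score for the given area (about 1.96 for 0.975)
    return mu + z * sigma / sqrt(n)


x_bar = mean_cutoff(mu=172, sigma=29, n=16, prob=0.975)
total = x_bar * 16                   # total weight for 16 men
print(round(x_bar, 2), round(total, 1))
```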

14. Seating Design, $\mu = 14.4$, $\sigma = 1$, n = 18, P = 0.975

a. We first find the z score for the area P = 0.975 from the body of Table A-2. This corresponds to z = +1.96. We then use the formula:

$\bar{x} = \mu + z \cdot \frac{\sigma}{\sqrt{n}} = 14.4 + 1.96 \cdot \frac{1}{\sqrt{18}} = 14.4 + 1.96 \cdot \frac{1}{4.243} = 14.4 + 1.96(0.236) = 14.4 + 0.46 = 14.86$

To get the total value for 18 men, 14.86 × 18 = 267.48 in. This is the minimum length of the bench if we want a 0.975 probability that it will fit the combined hips of 18 men.

b. Using the result in (a) would be wrong because the bench is actually for 18 male college football players, who are most probably bigger than men in general.

15. Correcting for a Finite Population, $\mu = 143$, $\sigma = 29$, N = 120, n = 8

a. If we do not want to exceed this limit, we need to find the probability of the 8 of them having a total weight less than 1300 lb. A total capacity of 1300 lb for the 8 women means 1300/8 = 162.5 lb per woman on average.

$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}} = \frac{162.5 - 143}{\frac{29}{\sqrt{8}} \sqrt{\frac{120 - 8}{120 - 1}}} = \frac{19.5}{\frac{29}{2.828} \sqrt{\frac{112}{119}}} = \frac{19.5}{10.25 \sqrt{0.941}} = \frac{19.5}{10.25(0.970)} = \frac{19.5}{9.94} = +1.96$

From Table A-2, P(z < +1.96) = 0.975.

The probability of their total weight not exceeding 1300 lb is 0.9750.

b. We first find the z score for the area P = 0.9900 from the body of Table A-2. This corresponds to z = +2.33. We then use the formula:

$\bar{x} = \mu + z \cdot \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = 143 + 2.33 \cdot \frac{29}{\sqrt{8}} \sqrt{\frac{120 - 8}{120 - 1}} = 143 + 2.33 \cdot \frac{29}{2.828} \sqrt{0.941} = 143 + 2.33(10.25)(0.970) = 143 + 23.18 = 166.18$

To get the total value for 8 women, 166.18 × 8 = 1329 lb. This is the maximum allowable weight of passengers in the elevator if we want a 0.99 probability that the elevator will not be overloaded.
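The finite population correction in part (a) can be sketched as a small helper. The name `finite_population_se` is an assumption made for this illustration; the numbers are those of the exercise, and the result is unrounded, so it matches the hand calculation only to about two decimals.

```python
from math import sqrt
from statistics import NormalDist


def finite_population_se(sigma, n, N):
    """Standard error of the mean with the correction factor sqrt((N - n) / (N - 1))."""
    return sigma / sqrt(n) * sqrt((N - n) / (N - 1))


se = finite_population_se(sigma=29, n=8, N=120)
z = (162.5 - 143) / se               # z score for a mean of 162.5 lb
p_not_overloaded = NormalDist().cdf(z)
print(round(se, 2), round(z, 2), round(p_not_overloaded, 4))
```

The correction matters here because the sample (8) is more than 5% of the population (120); for a much larger N the factor approaches 1.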

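16. Population Parameters, 2, 3, 6, 8, 11, 18

a. $\mu = \frac{\sum x}{N} = \frac{2 + 3 + 6 + 8 + 11 + 18}{6} = \frac{48}{6} = 8.0$

x               2    3    6    8   11   18    $\sum x = 48$
$x - \mu$      -6   -5   -2    0    3   10    $\sum (x - \mu) = 0$
$(x - \mu)^2$  36   25    4    0    9  100    $\sum (x - \mu)^2 = 174$

$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{\frac{174}{6}} = 5.385$

b.

Sample   Samples (without   Sample mean,   $\bar{x} - \mu_{\bar{x}}$   $(\bar{x} - \mu_{\bar{x}})^2$
Number   replacement)       $\bar{x}$
 1       2, 3                2.5           -5.5                        30.25
 2       2, 6                4.0           -4.0                        16.00
 3       2, 8                5.0           -3.0                         9.00
 4       2, 11               6.5           -1.5                         2.25
 5       2, 18              10.0            2.0                         4.00
 6       3, 6                4.5           -3.5                        12.25
 7       3, 8                5.5           -2.5                         6.25
 8       3, 11               7.0           -1.0                         1.00
 9       3, 18              10.5            2.5                         6.25
10       6, 8                7.0           -1.0                         1.00
11       6, 11               8.5            0.5                         0.25
12       6, 18              12.0            4.0                        16.00
13       8, 11               9.5            1.5                         2.25
14       8, 18              13.0            5.0                        25.00
15       11, 18             14.5            6.5                        42.25
Sum                        120.00           0.0                       174.00

c. Mean of sample means, $\mu_{\bar{x}} = \frac{\sum \bar{x}}{n_{\bar{x}}} = \frac{120}{15} = 8.0$

d. Mean and standard deviation. See part (c) for the mean of 8.0.

Standard deviation of sample means, $\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n_{\bar{x}}}} = \sqrt{\frac{174.00}{15}} = \sqrt{11.60} = 3.406$

e. By comparing the result in part (a) with the result in part (c), we see that $\mu = \mu_{\bar{x}} = 8$.

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{5.385}{\sqrt{2}} \sqrt{\frac{6 - 2}{6 - 1}} = \frac{5.385}{1.414} \sqrt{\frac{4}{5}} = 3.808 \sqrt{0.8} = 3.808(0.8944) = 3.406$

Value is the same as the result found in part (d).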
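The enumeration in parts (b) through (e) can be reproduced with a short script. This is a sketch using only the standard library; the variable names are chosen for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, pstdev

population = [2, 3, 6, 8, 11, 18]
n = 2                                # sample size, sampling without replacement
N = len(population)

# All 15 possible samples of size 2 and their means
sample_means = [fmean(sample) for sample in combinations(population, n)]

mu = fmean(population)               # population mean
sigma = pstdev(population)           # population standard deviation (divides by N)
mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)   # standard deviation of the 15 sample means

# Formula with the finite population correction factor
sd_from_formula = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

print(len(sample_means), mu, mean_of_means)
print(round(sd_of_means, 3), round(sd_from_formula, 3))
```

Note that `pstdev` (population form) is used rather than `stdev`, since all 15 possible sample means are treated as a complete population.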

5-6 Normal as Approximation to Binomial

Using Normal Approximation. In Exercises 1-8, the given values are discrete. Use the continuity correction and describe the region of the normal distribution that corresponds to the indicated probability. For example, the probability of more than 20 girls corresponds to the area of the normal curve described with this answer: the area to the right of 20.5.

1. The probability of more than 15 males with blue eyes corresponds to the area of the normal curve to the right of 15.5, P(x > 15)= Pc(x > 15.5)

2. The probability of at least 24 students understanding continuity correction corresponds to the area of the normal curve to the right of 23.5, P(x > 23)= Pc(x > 23.5)

3. The probability of fewer than 100 bald eagles sighted in a week corresponds to the area of the normal curve to the left of 99.5, P(x < 100)= Pc(x < 99.5)

4. The probability that the number of working vending machines in the United States is exactly 27 corresponds to the area of the normal curve between 26.5 and 27.5, P(x = 27) = Pc(26.5 < x < 27.5)

5. The probability of no more than 4 students absent in a biostatistics class corresponds to the area of the normal curve to the left of 4.5, P(x ≤ 4) = Pc(x < 4.5)

6. The probability that the number of Canada geese residing in one pond is between 15 and 20 inclusive corresponds to the area of the normal curve between 14.5 and 20.5, P(15 ≤ x ≤ 20) = Pc(14.5 < x < 20.5)

7. The probability that the number of rabbit offspring is between 8 and 10 inclusive corresponds to the area of the normal curve between 7.5 and 10.5, P(8 ≤ x ≤ 10) = Pc(7.5 < x < 10.5)

8. The probability of exactly 3 American elm trees with Dutch elm disease corresponds to the area of the normal curve between 2.5 and 3.5, P(x= 3)= Pc(2.5 < x < 3.5)
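The continuity-correction rules applied in Exercises 1-8 can be collected into one function. The following is only a sketch: the function name and the `kind` labels are hypothetical and chosen for readability.

```python
from math import inf


def continuity_interval(kind, a, b=None):
    """Return the (low, high) region of the normal curve for a discrete event."""
    if kind == "more_than":            # x > a
        return (a + 0.5, inf)
    if kind == "at_least":             # x >= a
        return (a - 0.5, inf)
    if kind == "fewer_than":           # x < a
        return (-inf, a - 0.5)
    if kind == "at_most":              # x <= a
        return (-inf, a + 0.5)
    if kind == "exactly":              # x == a
        return (a - 0.5, a + 0.5)
    if kind == "between_inclusive":    # a <= x <= b
        if b is None:
            raise ValueError("between_inclusive needs an upper value b")
        return (a - 0.5, b + 0.5)
    raise ValueError(f"unknown kind: {kind}")


print(continuity_interval("more_than", 15))              # Exercise 1
print(continuity_interval("between_inclusive", 15, 20))  # Exercise 6
```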

Using Normal Approximation. In Exercises 9-12, do the following. (a) Find the indicated binomial probability by using Table A-1. (b) If np ≥ 5 and nq ≥ 5, also estimate the indicated probability by using the normal distribution as an approximation to the binomial distribution; if np < 5 or nq < 5, then state that the normal approximation is not suitable.

9. a. n = 14, p = 0.5. From Table A-1, P(9) = 0.122.

b. Normal approximation: np = nq = 14 × 0.5 = 7 (both ≥ 5, normal approximation is justified)

$\mu = np = 14 \times 0.5 = 7$
$\sigma = \sqrt{npq} = \sqrt{14 \times 0.5 \times 0.5} = \sqrt{3.5} = 1.871$
$z = \frac{x - \mu}{\sigma} = \frac{8.5 - 7}{1.871} = \frac{1.5}{1.871} = 0.80$
$z = \frac{x - \mu}{\sigma} = \frac{9.5 - 7}{1.871} = \frac{2.5}{1.871} = 1.34$

z = 0.80 corresponds to a probability of 0.7881.
z = 1.34 corresponds to a probability of 0.9099.
P(9) from the normal approximation = 0.9099 - 0.7881 = 0.1218 (very good approximation to 0.122)
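The comparison in Exercise 9 between the exact binomial probability and its normal approximation can be sketched as follows. The helper names are illustrative assumptions; the approximation uses unrounded z scores, so it lands near, not exactly on, the table-based 0.1218.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Exact binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_approx_pmf(k, n, p):
    """Normal approximation with continuity correction: area between k - 0.5 and k + 0.5."""
    dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return dist.cdf(k + 0.5) - dist.cdf(k - 0.5)


n, p, k = 14, 0.5, 9
suitable = n * p >= 5 and n * (1 - p) >= 5   # requirement np >= 5 and nq >= 5
exact = binom_pmf(k, n, p)
approx = normal_approx_pmf(k, n, p)
print(suitable, round(exact, 3), round(approx, 3))
```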

10. a. n = 12, p = 0.8. From Table A-1, P(7) = 0.053.

b. np = 12 × 0.8 = 9.6 and nq = 12 × 0.2 = 2.4. Since nq = 2.4 is less than 5, the normal approximation is not suitable.

[...] 55.5)
P(girls > 55) = P(z > +1.1) = 1 - 0.8643 = 0.1357
No, since P(girls > 55) is greater than 0.05, it is not unusual to get more than 55 girls out of 100 births.

14. Probability of at Least 65 Girls, np = 50, nq = 50 (both ≥ 5, normal distribution justified)

n = 100, p = 0.50
$\mu = np = 100 \times 0.50 = 50$
$\sigma = \sqrt{npq} = \sqrt{100 \times 0.5 \times 0.5} = \sqrt{25} = 5$
$z = \frac{x - \mu}{\sigma} = \frac{64.5 - 50}{5} = \frac{14.5}{5} = +2.9$

P(x ≥ 65), finding Pc(x > 64.5)
P(girls ≥ 65) = P(z > +2.9) = 1 - 0.9981 = 0.0019
Yes, since P(girls ≥ 65) is less than 0.05, it is unusual that there would be 65 or more girls out of 100 births.


15. Probability of at Least Passing, np = 50, nq = 50 (both ≥ 5, normal distribution justified)

n = 100, p = 0.50 (true or false answer)
$\mu = np = 100 \times 0.50 = 50$
$\sigma = \sqrt{npq} = \sqrt{100 \times 0.50 \times 0.50} = \sqrt{25} = 5$
$z = \frac{x - \mu}{\sigma} = \frac{59.5 - 50}{5} = \frac{9.5}{5} = +1.9$

P(x ≥ 60), finding Pc(x > 59.5)
P(score ≥ 60) = P(z > +1.9) = 1 - 0.9713 = 0.0287
No, since P(score ≥ 60) is less than 0.05, it is unusual to get a score of at least 60 by guessing.

16. Multiple-Choice Test, np = 5, nq = 20 (both ≥ 5, normal distribution justified)

n = 25, p = 0.20 (one out of 5 options is correct)
$\mu = np = 25 \times 0.20 = 5$
$\sigma = \sqrt{npq} = \sqrt{25 \times 0.2 \times 0.8} = \sqrt{4} = 2$
$z = \frac{x - \mu}{\sigma} = \frac{2.5 - 5}{2} = \frac{-2.5}{2} = -1.25$
$z = \frac{x - \mu}{\sigma} = \frac{10.5 - 5}{2} = \frac{5.5}{2} = +2.75$

P(3 ≤ x ≤ 10), finding Pc(2.5 < x < 10.5)
P(z < -1.25) = 0.1056, P(z < +2.75) = 0.9970
P(-1.25 < z < +2.75) = 0.9970 - 0.1056 = 0.8914

17. Mendel's Hybridization Experiment, np = 145, nq = 435 (both ≥ 5, normal distribution justified)

n = 580, p = 0.25
$\mu = np = 580 \times 0.25 = 145$
$\sigma = \sqrt{npq} = \sqrt{580 \times 0.25 \times 0.75} = \sqrt{108.75} = 10.43$
$z = \frac{x - \mu}{\sigma} = \frac{151.5 - 145}{10.43} = \frac{6.5}{10.43} = +0.62$

P(x ≥ 152), finding Pc(x > 151.5)
P(z at least 0.62) = P(z > +0.62) = 1 - 0.7324 = 0.2676
No, there is no evidence that the Mendelian rate of 25% is wrong, because it is not unusual to get at least 152 yellow pods out of 580 seedlings (P = 0.2676).

18. Cholesterol-Reducing Drug, np = 16.4, nq = 846.6 (both ≥ 5, normal distribution justified)

n = 863, p = 0.019
$\mu = np = 863 \times 0.019 = 16.397$
$\sigma = \sqrt{npq} = \sqrt{863 \times 0.019 \times 0.981} = \sqrt{16.09} = 4.01$
$z = \frac{x - \mu}{\sigma} = \frac{18.5 - 16.397}{4.01} = \frac{2.103}{4.01} = +0.52$

P(x ≥ 19), finding Pc(x > 18.5)
P(z at least 0.52) = P(z > +0.52) = 1 - 0.6985 = 0.3015
It is not unusual to have 19 people with flu symptoms (P = 0.3015). Therefore, the flu symptoms are probably not due to taking the drug.


19. Probability of at Least 50 Color-Blind Men, np = 54, nq = 546 (both ≥ 5, normal distribution justified)

n = 600, p = 0.09
$\mu = np = 600 \times 0.09 = 54$
$\sigma = \sqrt{npq} = \sqrt{600 \times 0.09 \times 0.91} = \sqrt{49.14} = 7.01$
$z = \frac{x - \mu}{\sigma} = \frac{49.5 - 54}{7.01} = \frac{-4.5}{7.01} = -0.64$

P(x ≥ 50), finding Pc(x > 49.5)
P(z at least -0.64) = P(z > -0.64) = 1 - 0.2611 = 0.7389

It is quite likely to have 50 color-blind men among this group of 600 men (P = 0.7389). However, the researchers cannot be very confident, since there is still a substantial chance of getting fewer than 50 such men.

20. Cell Phones and Brain Cancer, np = 143, nq = 419,952 (both ≥ 5, normal distribution justified)

n = 420,095, p = 0.000340
$\mu = np = 420{,}095 \times 0.000340 = 142.83$
$\sigma = \sqrt{npq} = \sqrt{420{,}095 \times 0.000340 \times 0.999660} = \sqrt{142.78} = 11.95$
$z = \frac{x - \mu}{\sigma} = \frac{135.5 - 142.83}{11.95} = \frac{-7.33}{11.95} = -0.61$

P(x ≤ 135), finding Pc(x < 135.5)
P(z < -0.61) = 0.2709

It is not unusual to have 135 or fewer cases of brain cancer in the population (P = 0.2709). Therefore, the media reports that cell phones cause brain cancer are not supported by the evidence.

21. Identifying Gender Discrimination, np = 31, nq = 31 (both ≥ 5, normal distribution justified)

n = 62, p = 0.50
$\mu = np = 62 \times 0.50 = 31$
$\sigma = \sqrt{npq} = \sqrt{62 \times 0.50 \times 0.50} = \sqrt{15.5} = 3.937$
$z = \frac{x - \mu}{\sigma} = \frac{21.5 - 31}{3.937} = \frac{-9.5}{3.937} = -2.41$

P(x ≤ 21), finding Pc(x < 21.5)
P(z < -2.41) = 0.0080

It is unusual to have 21 or fewer female employees among 62 new employees hired, assuming no gender discrimination (P = 0.0080). These results support the charge of gender discrimination taking place.

22. Blood Group, np = 180, nq = 220 (both ≥ 5, normal distribution justified)

n = 400, p = 0.45
$\mu = np = 400 \times 0.45 = 180$
$\sigma = \sqrt{npq} = \sqrt{400 \times 0.45 \times 0.55} = \sqrt{99} = 9.95$
$z = \frac{x - \mu}{\sigma} = \frac{176.5 - 180}{9.95} = \frac{-3.5}{9.95} = -0.35$

P(x ≥ 177), finding Pc(x > 176.5)
P(z < -0.35) = 0.3632, P(z > -0.35) = 1 - 0.3632 = 0.6368

It is not unusual to have at least 177 Group O donors in this group of 400 people. The pool is likely to be sufficient, but this is not assured because the probability is not high (P = 0.6368).


23. Acceptance Sampling, np = 5, nq = 45 (both ≥ 5, normal distribution justified)

n = 50, p = 0.10
$\mu = np = 50 \times 0.10 = 5$
$\sigma = \sqrt{npq} = \sqrt{50 \times 0.10 \times 0.90} = \sqrt{4.5} = 2.12$
$z = \frac{x - \mu}{\sigma} = \frac{1.5 - 5}{2.12} = \frac{-3.5}{2.12} = -1.65$

P(x ≥ 2), finding Pc(x > 1.5)
P(z < -1.65) = 0.0495, P(z > -1.65) = 1 - 0.0495 = 0.9505
Yes, this plan would detect defects at the 10% level about 95% of the time.

24. Car Crashes, np = 170, nq = 330 (both ≥ 5, normal distribution justified)

n = 500, p = 0.34
$\mu = np = 500 \times 0.34 = 170$
$\sigma = \sqrt{npq} = \sqrt{500 \times 0.34 \times 0.66} = \sqrt{112.2} = 10.59$
$z = \frac{x - \mu}{\sigma} = \frac{199.5 - 170}{10.59} = \frac{29.5}{10.59} = +2.79$

P(x ≥ 200), finding Pc(x > 199.5)
P(z < +2.79) = 0.9974, P(z > +2.79) = 1 - 0.9974 = 0.0026
The probability of 40% (200) of 500 men having accidents is very low (P < 0.05) when the true probability is 0.34. Therefore, the claim that the accident rate in New York City is higher than 34% is supported by the evidence in this result.
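The "at least k" pattern used in this and the preceding exercises can be sketched in code, along with the exact binomial tail for comparison. Both helper names are assumptions made for this illustration, and the exact tail is shown only as a cross-check, not as part of the textbook method.

```python
from math import comb, sqrt
from statistics import NormalDist


def normal_tail_at_least(k, n, p):
    """P(x >= k) by normal approximation: area to the right of k - 0.5."""
    dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return 1.0 - dist.cdf(k - 0.5)


def binom_tail_at_least(k, n, p):
    """Exact P(x >= k) by summing binomial probabilities from k through n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


approx = normal_tail_at_least(200, 500, 0.34)
exact = binom_tail_at_least(200, 500, 0.34)
unusual = approx < 0.05              # the 0.05 criterion used throughout this section
print(round(approx, 4), round(exact, 4), unusual)
```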

25. Cloning Survey, np = 506, nq = 506 (both ≥ 5, normal distribution justified)

n = 1012, p = 0.50
$\mu = np = 1012 \times 0.50 = 506$
$\sigma = \sqrt{npq} = \sqrt{1012 \times 0.50 \times 0.50} = \sqrt{253} = 15.91$
$z = \frac{x - \mu}{\sigma} = \frac{900.5 - 506}{15.91} = \frac{394.5}{15.91} = +24.80$

P(x > 900), finding Pc(x > 900.5)
P(z < +24.80) ≈ 0.9999, P(z > +24.80) = 1 - 0.9999 = 0.0001
The probability of 89% (900) of the 1012 people in a sample being opposed, assuming a general probability of 0.5, is very low. Yes, this evidence supports the claim that a majority of people are opposed to cloning.

5-7 Assessing Normality (Note: In Section 5-7 all graphics were generated using SPSS)

Interpreting Normal Quantile Plots. In Exercises 1-4, examine the normal quantile plot and determine whether it depicts data that have a normal distribution.

1. The data are not normally distributed since the data plot dots depart from being a straight line that follows the normal quantile plot that is expected if the data are normally distributed.

2. The data are not normally distributed since the data plot dots depart from being a straight line that follows the normal quantile plot that is expected if the data are normally distributed.

3. The data are normally distributed since the data plot dots are very close to a straight line that follows the normal quantile plot that is expected if the data are normally distributed.


4. The data are normally distributed since the data plot dots are very close to a straight line that follows the normal quantile plot that is expected if the data are normally distributed.

Determining Normality. In Exercises 5-8, refer to the indicated data set and determine whether the requirement of a normal distribution is satisfied. Assume that this requirement is loose in the sense that the population distribution need not be exactly normal, but it must be a distribution that is basically symmetric with only one mode.

5. BMI, Data Set 1 in Appendix B

The histogram above shows a distribution with one mode, relatively symmetrical, and bell-shaped. It can be said to approximate a normal distribution.

[Figure: histogram of BMIMales; horizontal axis BMIMales, vertical axis Frequency]

6. Head Circumferences, Data Set 4 in Appendix B

The histogram above shows a distribution with one mode, relatively symmetrical, and bell-shaped, except for two values in the lower part of the range. While this distribution is not perfectly symmetrical it could be considered to be approximately normal.

[Figure: histogram of HeadCircMales; horizontal axis HeadCircMales, vertical axis Frequency]

7. Water Conductivity

The histogram above shows a distribution with one mode. However, the distribution is not symmetrical and bell-shaped so it would not be considered to be approximately normal.

[Figure: histogram of WaterConductivity; horizontal axis WaterConductivity, vertical axis Frequency]

8. Heights of Poplar Trees

The histogram above shows a distribution with one mode. However, the distribution is not symmetrical or bell-shaped so it would not be considered to be approximately normal.

[Figure: histogram of PoplarTreeHgt; horizontal axis PoplarTreeHgt, vertical axis Frequency]

Generating Normal Quantile Plots. In Exercises 9-12, use the data from the indicated exercise in this section. Use a TI-83/84 Plus calculator or software (such as SPSS, SAS, STATDISK, Minitab, or Excel) capable of generating normal quantile plots (or normal probability plots). Generate the graph, then determine whether the data appear to come from a normally distributed population. NOTE: The following normal quantile plots, except those in Exercises 15 and 16, were generated by SPSS. When using the SPSS option for standardized or z scores, both axes are put into z score units, not just the Y-axis.

9. From Exercise 5

[Figure: Normal Q-Q Plot of BMIMales; horizontal axis Standardized Observed Value, vertical axis Expected Normal Value]

The BMI data from Exercise 5 seems to come from a normal distribution. Most of the points are very close to the straight line.


10. From Exercise 6

[Figure: Normal Q-Q Plot of HeadCircMales; vertical axis Expected Normal Value]

The head circumference data from Exercise 6 seems to come from a normal distribution. Most of the points, except for two of them, are very close to the line.


11. From Exercise 7

[Figure: Normal Q-Q Plot of WaterConductivity; vertical axis Expected Normal Value]

The data on the conductivity variable are not normally distributed. The points depart quite a bit from the straight line.


12. From Exercise 8

[Figure: Normal Q-Q Plot of PoplarTreeHgt; vertical axis Expected Normal Value]

This tree height data distribution is not normal. The points are not close to the line. Also, there are some obvious outliers seen in the plot.

13. Comparing Data Sets

[Figure: Normal Q-Q Plot of HgtWomen; vertical axis Expected Normal Value]

[Figure: Normal Q-Q Plot of CholestWomen; vertical axis Expected Normal Value]

The distribution for height appears to be normal, but the distribution for cholesterol does not appear to be normal. This could be because cholesterol levels depend on diet and many other human behaviors in different ways that do not yield normally distributed results while height is a more natural variable less influenced by human behaviors.


14. Comparing Data Sets

[Figure: Normal Q-Q Plot of SystBPWomen; vertical axis Expected Normal Value]

[Figure: Normal Q-Q Plot of ElbowBrdthWomen; vertical axis Expected Normal Value]

Systolic blood pressure does not appear to have a distribution that approximates a normal distribution, but the distribution of elbow breadth could approximate a normal distribution. This could be because systolic blood pressure levels depend on diet and other human behaviors that do not yield normally distributed results while elbow breadth is a more natural variable less influenced by human behaviors.

Constructing Normal Quantile Plots. In Exercises 15 and 16, use the given data values and identify the corresponding z scores that are used for a normal quantile plot, then construct the normal quantile plot and determine whether the data appear to be from a population with a normal distribution.

15. Heights of L.A. Lakers
Sorting the data gives 73, 78, 79, 82, 85.
With n = 5, the cumulative areas 1/(2n), 3/(2n), 5/(2n), 7/(2n), 9/(2n) are 0.1, 0.3, 0.5, 0.7, 0.9.
The corresponding z scores, using Table A-2 for these areas, are -1.28, -0.52, 0.00, +0.52, and +1.28.
We now pair the sorted heights with their corresponding z scores: (73, -1.28), (78, -0.52), (79, 0), (82, +0.52), (85, +1.28).
We plot these (x, y) coordinates to get the normal quantile plot.


[Figure: Normal Q-Q Plot for Lakers' Height; horizontal axis Observed Score (Height), vertical axis Expected Normal Value]

This distribution looks like it approximates a normal distribution.
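The manual procedure in Exercise 15 (sort, assign cumulative areas (2i - 1)/(2n), look up z scores, pair) can be sketched as below. The function name is hypothetical, and `inv_cdf` replaces the Table A-2 lookup, so the z scores are rounded to two decimals for comparison with the hand-worked pairs.

```python
from statistics import NormalDist


def normal_quantile_pairs(data):
    """Pair each sorted value with the z score for cumulative area (2i - 1) / (2n)."""
    values = sorted(data)
    n = len(values)
    standard_normal = NormalDist()   # mean 0, standard deviation 1
    return [
        (x, standard_normal.inv_cdf((2 * i - 1) / (2 * n)))
        for i, x in enumerate(values, start=1)
    ]


heights = [73, 78, 79, 82, 85]       # sorted Lakers heights from Exercise 15
pairs = normal_quantile_pairs(heights)
for x, z in pairs:
    print(x, round(z, 2))
```

Plotting these pairs with x on the horizontal axis and z on the vertical axis gives the normal quantile plot; points close to a straight line suggest an approximately normal distribution.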

16. Monitoring Lead in Air
Sorting the data gives 0.42, 0.48, 0.73, 1.10, 1.10, 5.40.
With n = 6, the cumulative areas 1/(2n), 3/(2n), 5/(2n), 7/(2n), 9/(2n), 11/(2n) are 0.083, 0.250, 0.417, 0.583, 0.750, 0.917.
The corresponding z scores, using Table A-2 for these areas, are -1.38, -0.67, -0.21, +0.21, +0.67, and +1.39.
We now pair the sorted lead values with their corresponding z scores: (0.42, -1.38), (0.48, -0.67), (0.73, -0.21), (1.10, +0.21), (1.10, +0.67), (5.40, +1.39).
We plot these (x, y) coordinates to get the normal quantile plot.


[Figure: Normal Q-Q Plot for Lead in Air; horizontal axis Observed Value (Lead in Air), vertical axis Expected Standardized Value]

The distribution of the data clearly is not normal.

17. Using Standard Scores

No, the transformation to z scores involves subtracting a constant and dividing by a constant, so the plot of the (x, z) points will always be a straight line, regardless of the nature of the distribution.


18. Lognormal Distribution

[Figure: Normal Q-Q Plot of PhoneTime; vertical axis Expected Normal Value]

[Figure: Normal Q-Q Plot of LogPhoneTime; vertical axis Expected Normal Value]

The above distribution on the left is clearly not normal. However, the distribution on the right, after the log (x + 1) transformation is much closer to being a normal distribution. This illustrates that at times a transformation can provide a distribution much closer to a normal distribution than the original distribution has.

Review Exercises

1. High Cholesterol Levels, $\mu = 178.1$, $\sigma = 40.7$

a. P(x > 260)

$z = \frac{x - \mu}{\sigma} = \frac{260 - 178.1}{40.7} = \frac{81.9}{40.7} = +2.01$

P(x > 260) = P(z > +2.01). Using Table A-2, P(z < +2.01) = 0.9778.
P(z > +2.01) = 1 - P(z < +2.01) = 1 - 0.9778 = 0.0222

b. P(170 < x < 200)

$z = \frac{x - \mu}{\sigma} = \frac{170 - 178.1}{40.7} = \frac{-8.1}{40.7} = -0.20$
$z = \frac{x - \mu}{\sigma} = \frac{200 - 178.1}{40.7} = \frac{21.9}{40.7} = +0.54$

P(z < +0.54) = 0.7054, P(z < -0.20) = 0.4207
P(170 < x < 200) = P(-0.20 < z < +0.54) = 0.7054 - 0.4207 = 0.2847

c. P(170 < $\bar{x}$ < 200), with n = 9

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{170 - 178.1}{40.7/\sqrt{9}} = \frac{-8.1}{40.7/3} = \frac{-8.1}{13.57} = -0.60$
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{200 - 178.1}{40.7/\sqrt{9}} = \frac{21.9}{40.7/3} = \frac{21.9}{13.57} = +1.61$

From Table A-2, P(z < -0.60) = 0.2743 and P(z < +1.61) = 0.9463. The difference is 0.9463 - 0.2743 = 0.6720. There is a 0.6720 probability that a group of 9 men will have a mean cholesterol level between 170 mg/dL and 200 mg/dL.

d. The top 3% is equivalent to the bottom 97%. From Table A-2, the area 0.97 corresponds to a z score of +1.88.
$x = \mu + (z \cdot \sigma) = 178.1 + (1.88 \times 40.7) = 254.6$
Therefore, the cutoff for men should be a cholesterol level of 254.6.


2. Babies at Risk, $\mu = 3420$, $\sigma = 495$

a. $z = \frac{x - \mu}{\sigma} = \frac{2200 - 3420}{495} = \frac{-1220}{495} = -2.46$

P(z < -2.46) = 0.0069. Therefore, 0.69% of babies are in the at-risk category. If the Chicago hospital has 900 births, we expect 0.69% of the 900, or about 6.21 babies, to be at risk.

b. Lowest 2%. From Table A-2, the area 0.02 corresponds to a z score of -2.05.
$x = \mu + (z \cdot \sigma) = 3420 + (-2.05 \times 495) = 2405$
The cutoff weight for the lowest 2% is 2405 g.

c. P($\bar{x}$ > 3700)

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{3700 - 3420}{495/\sqrt{16}} = \frac{280}{495/4} = \frac{280}{123.75} = +2.26$

From Table A-2, P(z < +2.26) = 0.9881. Therefore, P(z > +2.26) = 1 - 0.9881 = 0.0119. The probability that 16 newborn babies will have a mean weight greater than 3700 g is 0.0119.

d. P(3300 < $\bar{x}$ < 3700) with n = 49

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{3300 - 3420}{495/\sqrt{49}} = \frac{-120}{495/7} = \frac{-120}{70.71} = -1.70$
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{3700 - 3420}{495/\sqrt{49}} = \frac{280}{495/7} = \frac{280}{70.71} = +3.96$

From Table A-2, P(z < -1.70) = 0.0446 and P(z < +3.96) = 0.9999. P(3300 < $\bar{x}$ < 3700) = P(z < +3.96) - P(z < -1.70) = 0.9999 - 0.0446 = 0.9553. There is a 0.9553 probability that a group of 49 babies will have a mean birth weight between 3300 g and 3700 g.

3. Blue Genes. Since np = 25 and nq = 75 are both > 5, use of the normal approximation to a binomial distribution, with continuity correction, is justified.

P(x ≤ 19), find Pc(x < 19.5)

n = 100, p = 0.25
$\mu = np = 100 \times 0.25 = 25$
$\sigma = \sqrt{npq} = \sqrt{100 \times 0.25 \times 0.75} = \sqrt{18.75} = 4.33$
$z = \frac{x - \mu}{\sigma} = \frac{19.5 - 25}{4.33} = \frac{-5.5}{4.33} = -1.27$

From Table A-2, the area below a z score of -1.27 is 0.1020. Since P = 0.1020 > 0.05, it would not be considered unusual to have 19 or fewer offspring with blue eyes out of 100 births.

4. Marine Corps Height Requirements for Men, $\mu = 69$, $\sigma = 2.8$

a. P(64 < x < 78)

$z = \frac{x - \mu}{\sigma} = \frac{64 - 69}{2.8} = \frac{-5}{2.8} = -1.79$, $z = \frac{x - \mu}{\sigma} = \frac{78 - 69}{2.8} = \frac{9}{2.8} = +3.21$

From Table A-2, the area below a z score of -1.79 is 0.0367 and the area below a z score of +3.21 is 0.9993.
P(64 < x < 78) = P(z < +3.21) - P(z < -1.79) = 0.9993 - 0.0367 = 0.9626
Therefore, about 96% of men meet this requirement, so not many men (only about 3.7%) are denied entry into the Marines because of their height.

b. The shortest 2% corresponds to an area of 0.02, which corresponds to a z score of -2.05. The tallest 2% corresponds to an area of 0.98, which corresponds to a z score of +2.05.

$x = \mu + (z \cdot \sigma) = 69 + (-2.05 \times 2.8) = 69 - 5.74 = 63.26$
$x = \mu + (z \cdot \sigma) = 69 + (2.05 \times 2.8) = 69 + 5.74 = 74.74$

The new minimum and maximum heights would be 63.3 in. and 74.7 in.

c. P($\bar{x}$ > 68) with n = 64

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{68 - 69}{2.8/\sqrt{64}} = \frac{-1}{2.8/8} = \frac{-1}{0.35} = -2.86$

The area below a z score of -2.86 is 0.0021. P(z > -2.86) = 1 - 0.0021 = 0.9979. The probability of randomly drawing a sample of 64 with a mean height greater than 68 in. is 0.9979.
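Parts (a) and (b) of the height exercise use the two directions of the same lookup: from values to an area, and from an area back to a value. A minimal sketch with the standard library follows; the variable names are illustrative, and because z is not rounded to two decimals the cutoffs differ from the hand-worked 63.26 and 74.74 by about 0.01.

```python
from statistics import NormalDist

heights = NormalDist(mu=69, sigma=2.8)   # men's heights in inches, from the exercise

# Part (a): proportion of men between 64 in. and 78 in.
p_eligible = heights.cdf(78) - heights.cdf(64)

# Part (b): heights that cut off the shortest 2% and the tallest 2%
low_cutoff = heights.inv_cdf(0.02)
high_cutoff = heights.inv_cdf(0.98)

print(round(p_eligible, 4), round(low_cutoff, 1), round(high_cutoff, 1))
```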

5. Sampling Distributions

a. With a sample size of 100, which is considered a large sample size, we would expect the distribution of sample means to be normally distributed regardless of the shape of the distribution from which the samples are drawn. The basis for making this claim is the Central Limit Theorem.

b. The standard deviation of the sample means is referred to as the standard error of the mean. If $\sigma = 512$ and samples are of size n = 100, it is found as:

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{512}{\sqrt{100}} = \frac{512}{10} = 51.2$

c. With a sample size of 1200, which is considered a very large sample size, we would expect the distribution of sample proportions x/n to be normally distributed even though the original distribution is a binomial distribution. The basis for making this claim is the Central Limit Theorem.

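6. Gender Discrimination, n = 20, p = 0.30, q = 0.70
np = 6, nq = 14 (since both ≥ 5, a normal distribution approximation is justified)

$\mu = np = 20 \times 0.30 = 6$
$\sigma = \sqrt{npq} = \sqrt{20 \times 0.30 \times 0.70} = \sqrt{4.2} = 2.049$
P(x ≤ 2), finding Pc(x < 2.5)
$z = \frac{x - \mu}{\sigma} = \frac{2.5 - 6}{2.049} = \frac{-3.5}{2.049} = -1.71$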


7. Testing for Normality, From Data Set 6 in Appendix B, Bear Neck Size
From the graphs below, the distribution is approximately normal. The histogram, with a normal distribution superimposed on it, has one mode and is roughly bell-shaped, and the normal quantile plot has most of the points on the straight line.

[Figure: histogram of BearNeckSize; horizontal axis BearNeckSize, vertical axis Frequency]

[Figure: Normal Q-Q Plot of BearNeckSize; vertical axis Expected Normal Value]

8. Testing for Normality, From Data Set 12 in Appendix B, Pre-Exercise with No Stress
From the graphs below, the distribution is approximately normal. The histogram, with a normal distribution superimposed on it, has one mode and is roughly bell-shaped, and the normal quantile plot has most of the points on the straight line.

[Figure: histogram of PreExrcsSystBP; horizontal axis PreExrcsSystBP, vertical axis Frequency]

[Figure: Normal Q-Q Plot of PreExrcsSystBP; horizontal axis Standardized Observed Value, vertical axis Expected Normal Value]

Cumulative Review Exercises

1. Eye Measurement Statistics
Ordered scores: 55 59 62 63 66 66 66 67 (in mm)

a. Sample mean

$\bar{x} = \frac{\sum x}{n} = \frac{67 + 66 + 59 + 62 + 63 + 66 + 66 + 55}{8} = \frac{504}{8} = 63.0$

b. Since there is an even number of scores, the median is the midpoint between the two middle scores. Median, $\tilde{x}$ = (63 + 66)/2 = 64.5

c. The mode is the score that occurs most frequently: 66 (occurs 3 times)

d. Standard deviation, with $\sum x = 504$ and $\sum x^2 = 31876$

$s^2 = \frac{n \sum x^2 - (\sum x)^2}{n(n - 1)} = \frac{8(31876) - (504)^2}{8(8 - 1)} = \frac{255008 - 254016}{8 \times 7} = \frac{992}{56} = 17.714$

$s = \sqrt{s^2} = \sqrt{17.714} = 4.21$

e. $z = \frac{x - \bar{x}}{s} = \frac{59 - 63}{4.21} = \frac{-4}{4.21} = -0.95$

f. 6 of the 8 numbers are greater than 59, 6/8 = 0.75 or 75%

g. Assuming a normal distribution, the area below a z score of -0.95 is P(z < -0.95) = 0.1711, so P(z > -0.95) = 1 - 0.1711 = 0.8289. This corresponds to 82.89%.

h. This data set is at the ratio level of measurement since there are equal intervals of measurement and there is a natural starting point at zero.

i. The exact unrounded distances are continuous data that can be any value on the continuum.
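The descriptive statistics in this exercise can be checked with Python's `statistics` module. This is a sketch; the variable names are chosen for illustration, and `stdev` is the sample form (dividing by n - 1), which matches the formula used in part (d).

```python
from statistics import mean, median, mode, stdev

scores = [67, 66, 59, 62, 63, 66, 66, 55]   # eye measurements in mm

x_bar = mean(scores)        # part (a)
x_median = median(scores)   # part (b): average of the two middle scores
x_mode = mode(scores)       # part (c): most frequent score
s = stdev(scores)           # part (d): sample standard deviation
z_59 = (59 - x_bar) / s     # part (e): z score for 59 mm

# part (f): proportion of scores greater than 59
share_above_59 = sum(x > 59 for x in scores) / len(scores)

print(x_bar, x_median, x_mode, round(s, 2), round(z_59, 2), share_above_59)
```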

2. Left-Handedness, p = 0.10

a. This is a binomial distribution with p = 0.1. Probability of 3 out of a sample of 3 being left-handed:
P(L1) = 0.1, P(L2) = 0.1, P(L3) = 0.1
P(all three are L) = P(L1) × P(L2) × P(L3) = $0.1^3$ = 0.001

b. P(at least 1 person left-handed) = 1 - P(no left-handers) = 1 - P(N1) × P(N2) × P(N3) = 1 - (0.9 × 0.9 × 0.9) = 1 - 0.729 = 0.271

c. The sample size of 3 is too small: np = 0.3 < 5, so np ≥ 5 is not satisfied.

d. In a group of 50 people, the mean number of left-handed people would be $\mu = np = 50 \times 0.1 = 5.0$

e. Standard deviation, $\sigma = \sqrt{npq} = \sqrt{50 \times 0.1 \times 0.9} = \sqrt{4.5} = 2.121$

f. P(x > 8)

$z = \frac{x - \mu}{\sigma} = \frac{8 - 5}{2.121} = \frac{3}{2.121} = 1.41$

The area below a z score of 1.41 is 0.9207. Therefore, P(x > 8) = 1 - 0.9207 = 0.0793. Since P = 0.0793 > 0.05, it would not be considered an unusual result to get 8 left-handers out of 50 subjects.
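The arithmetic in this left-handedness exercise is short enough to verify directly. The following sketch uses illustrative variable names and mirrors the multiplication rule, the complement rule, and the binomial mean and standard deviation; it follows the exercise's part (f) in computing z without a continuity correction.

```python
from math import sqrt

p = 0.10                                 # probability that a person is left-handed
q = 1 - p

p_all_three = p**3                       # part (a): all 3 of 3 are left-handed
p_at_least_one = 1 - q**3                # part (b): complement of "no left-handers"

n = 50
mu = n * p                               # part (d): mean number of left-handers
sigma = sqrt(n * p * q)                  # part (e): standard deviation
z = (8 - mu) / sigma                     # part (f): z score for x = 8

print(round(p_all_three, 3), round(p_at_least_one, 3), mu, round(sigma, 3), round(z, 2))
```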
