
STA 201: Statistics for Biological Sciences Lecture Note

By EBERU K.U. Friday

Week Four to Six: Measures of Central Tendency and Measures of Dispersion

Introduction

In this lesson we talk about two types of constants that we compute from data:

1. Measures of central tendency and

2. Measures of dispersion.

A measure of central tendency represents an "average value." Mean, median, mode (if you already know these) are measures of central tendency. A measure of dispersion is a measure of how widely the data is scattered around.

2.1 Measure of Central Tendency: Mean

The most common measure of central tendency is the mean, or arithmetic mean.

Definition. The mean or the arithmetic mean of a set of data is given by

$$\text{mean} = \frac{\text{sum of all the data values}}{\text{size of the data}}.$$

If we denote a data value (i.e., the variable) by x and if n is the size of the data, then the above formula is written as

$$\text{mean} = \bar{x} = \frac{\sum x}{n},$$

where ∑ denotes summation.

If the data is a sample, then the mean is called the sample mean. Again, if x denotes the variable, the data is sometimes denoted by x1, x2, ... , xn and then

$$\text{mean} = \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}.$$

If you have not seen the notation ∑ before, it simply means summation. For example,

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n.$$

Weighted Mean

Sometimes, different values in data carry different weight. Let us consider the following data and the corresponding frequency distribution that we computed earlier:

Example 2.1.1 To estimate the mean time taken to complete a three-mile drive by a race car, the race car did several time trials. The following are sample times taken (in seconds) to complete the laps:

50 48 49 46 54 53 52 51 47 56 52 51

51 53 50 49 48 54 53 51 52 54 54 53

55 48 51 50 52 49 51 53 55 54 50


Following is the frequency distribution of this data:

Time (in seconds)   46  47  48  49  50  51  52  53  54  55  56
Frequency            1   1   3   3   4   6   4   5   5   2   1

Now we want to compute the mean time. So, we add all the data values and divide by the data size, 35. We have already computed the frequency distribution, which tells us that, in the data, 46 was present 1 time, 47 was present 1 time, 48 was present 3 times, and so on. So, using the frequency distribution, we compute the mean as follows:

$$\bar{x} = \frac{46\cdot 1 + 47\cdot 1 + 48\cdot 3 + 49\cdot 3 + 50\cdot 4 + 51\cdot 6 + 52\cdot 4 + 53\cdot 5 + 54\cdot 5 + 55\cdot 2 + 56\cdot 1}{1+1+3+3+4+6+4+5+5+2+1} = \frac{1799}{35} = 51.4$$

The mean of the original data is the weighted mean of the data values 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 56 with the corresponding frequency as the weight. So, a new formula for the mean would be

$$\bar{x} = \text{mean} = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i},$$

where fi is the frequency of xi. The weighted mean is defined in more general context as follows:

Definition. If x1, x2, ... , xn in a data set have different weights and the value xi has weight wi, then the weighted mean is defined as

$$\text{weighted mean} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}.$$
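The weighted-mean definition can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `weighted_mean` is an illustrative choice. It reuses the lap-time frequency distribution of Example 2.1.1, with the frequencies acting as weights.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Frequency distribution of the lap times in Example 2.1.1.
times = [46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56]
frequencies = [1, 1, 3, 3, 4, 6, 4, 5, 5, 2, 1]

lap_mean = weighted_mean(times, frequencies)
print(round(lap_mean, 1))  # 1799 / 35, printed as 51.4
```

Any set of non-negative weights that does not sum to zero works the same way, for example credit hours when computing a grade point average.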

Properties of the Mean

1. Combining two means. Suppose we have two sets of data. The mean of the first set is x̄, and the size of the first set is m; the mean of the second set is ȳ, and the size of the second set is n. The mean of the combined data is

$$\text{combined mean} = \frac{m\bar{x} + n\bar{y}}{m+n}.$$

This is the weighted mean of x̄ and ȳ with weights m and n, respectively.

2. Effect of translation. Let x̄ be the mean of x1, x2, ... , xn. Then the mean of y1 = x1+d, y2 = x2+d, ... , yn = xn+d is given by

$$\bar{y} = \bar{x} + d.$$

3. Effect of multiplication by a constant. Let x̄ be the mean of x1, ... , xn. Then the mean of

z1 = cx1, z2 = cx2, ... , zn = cxn

is given by

$$\bar{z} = c\bar{x}.$$
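Properties 2 and 3 follow directly from the definition of the mean. A short derivation, using only the symbols already introduced (d is the shift, c is the constant multiplier):

```latex
\begin{aligned}
\bar{y} &= \frac{1}{n}\sum_{i=1}^{n} (x_i + d)
         = \frac{1}{n}\sum_{i=1}^{n} x_i + \frac{nd}{n}
         = \bar{x} + d, \\
\bar{z} &= \frac{1}{n}\sum_{i=1}^{n} c\,x_i
         = c \cdot \frac{1}{n}\sum_{i=1}^{n} x_i
         = c\,\bar{x}.
\end{aligned}
```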

Properties of Mean

Remark (effect of translation): Your teacher tells you that the mean score for the midterm in your class is 73. After you complained and requested a change, he agreed that all can add 7 points to their score. The new mean score is (old mean + 7) = 73 + 7 = 80. This is what we meant by "effect of translation."

Example (effect of multiplication by c): Suppose you have some data x1, x2, ..., xn on salaries in an industry in the United States and the mean is $37000. On a certain day, 1 U.S. dollar = 1.4729 Canadian dollars (say c = 1.4729). So, in Canadian dollars the mean is 37000c = 37000 × 1.4729 = 54497.30. Similarly, a change of units (inches to feet or cm) is a "multiplication by a constant c."

Example 2.1.2. A student took PHSX 115 (College Physics), PSYC 120 (Personality), FREN 110 (Elementary French), BUS 241 (Managerial Accounting), and MATH 365 (Elementary Statistics). The number of credit hours and the student's grade is given in the following table:

Course           PHSX 115      PSYC 120      FREN 110      BUS 241       MATH 365
Grade (Points)   B (3 points)  A (4 points)  B (3 points)  C (2 points)  B (3 points)
Credit Hours     4             3             5             3             3

What is the student's GPA?

Solution. The GPA is the weighted average of the points (corresponding to the grades), with the course credit hours as the weights. So, the GPA = (3×4 + 4×3 + 3×5 + 2×3 + 3×3)/(4+3+5+3+3) = 54/18 = 3.

2.2 Measure of Central Tendency: Median, and Mode

The Median

The median represents the middle value of the data. Half the data will be less than or equal to the median, and half the data will be greater than or equal to the median. You are above the median American income if half the American population is making less than you make.

Definition. Suppose the data is arranged in an increasing order (i.e., in an array). If the size of data is ODD then the median is the middle value. If it is EVEN, then the median is the mean of the middle two values.
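The odd/even rule in this definition can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original notes: the function name `median` is an assumption, and the four-value list is a made-up example of even-sized data. The 35 lap times are those of Example 2.1.1, and the result is cross-checked against the standard library's `statistics.median`.

```python
import statistics


def median(data):
    """Median by the definition above: sort, then take the middle value(s)."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd size: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even size: mean of middle two


# Lap times (in seconds) from Example 2.1.1; n = 35 is odd.
lap_times = [
    50, 48, 49, 46, 54, 53, 52, 51, 47, 56, 52, 51,
    51, 53, 50, 49, 48, 54, 53, 51, 52, 54, 54, 53,
    55, 48, 51, 50, 52, 49, 51, 53, 55, 54, 50,
]

print(median(lap_times))     # 51
print(median([1, 3, 5, 7]))  # 4.0 (hypothetical even-sized data)
```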

The Percentiles

Definition. For a number p between 0 and 100, the pth percentile xp of the data is a number such that at least p percent of the data members are less than or equal to xp and at least (100 - p) percent of the data members are greater than or equal to xp.

1. The 25th percentile is called the first quartile Q1.
2. The median is the 50th percentile, also called the second quartile Q2.
3. The 75th percentile is called the third quartile Q3.

The Mode

There is one other measure of central tendency that should be mentioned.

Definition. The MODE of the data is the value or values that have the highest frequency. For example, the mode of the set {1, 3, 5, 5, 7} is {5} because it has the highest frequency. The mode of {1, 1, 3, 5, 5, 7} is {1, 5} because 1 and 5 both have the highest frequency. Such a set is said to be bimodal.
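A minimal Python sketch of this definition follows; the function name `modes` is an illustrative choice, not from the notes. It counts how often each value occurs and returns every value that attains the highest frequency, so a bimodal set yields two values.

```python
from collections import Counter


def modes(data):
    """Return all values with the highest frequency, in increasing order."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 3, 5, 5, 7]))     # [5]
print(modes([1, 1, 3, 5, 5, 7]))  # [1, 5]  (bimodal)
```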

Use of Calculators (TI-83): Entering your data

1. Press the button stat.
2. Select "Edit" in the Edit menu and enter.
3. You will find six lists named L1, L2, L3, L4, L5, L6.
4. Let's say you want to enter your data in L1.
5. If L1 has some data, clear it by pressing the stat button and selecting ClrList in the Edit menu.
6. Once L1 is cleared, select Edit in the Edit menu and enter.
7. Now type in your data and enter one by one.

Sorting data and computing the median

1. Enter your data in a list, say L1.
2. Select SortA in the Edit menu and enter.
3. The calculator will ask for the list. Type in the list (L1), close the parentheses, and enter.
4. The calculator will say Done.
5. Press stat, select edit in the Edit menu, and enter.
6. You will see that your data in L1 has been sorted in an increasing order.
7. If the data size is odd, the median is the middle value. If the data size is even, the median is the average of the middle two values.

Computing the mean if only raw data is given

1. Enter your data in a list, say L1.
2. Select "1-Var Stats" in the CALC menu and enter.
3. The calculator will ask for the list. Type in the list L1 and enter.
4. The calculator will give a list of numbers; x-bar (x̄) is the mean.

Computing the mean if the frequency table is given

1. Enter the frequency table in the calculator, say, x-values in L1 and frequencies in L2.
2. Select "1-Var Stats" in the CALC menu and enter.
3. The calculator will ask for the lists. Type in the list L1, L2 and enter.
4. The calculator will give a list of numbers; x-bar (x̄) is the mean.

Problems on 2.2: Mean and Median

Exercise 2.2.1. The following is the price (in dollars) of a stock (say, CISCO SYSTEMS) checked by a trader several times on a particular day.

138 142 127 137 148 130 142 133

Find the median price and mean price observed by the trader.

Exercise 2.2.2. The following figures refer to the GPA of six students.

3.0 3.3 3.1 3.0 3.1 3.1

Find the median and mean GPA.

Exercise 2.2.3. The following data give the lifetime (in days) of light bulbs.

138 952 980 967 992 197 215 157

Find the mean and median lifetime of these bulbs.

Exercise 2.2.4. An athlete ran an event 32 times. The following frequency table gives the time taken (in seconds) by the athlete to complete the events.

Time (in seconds)   Frequency
26                  3
27                  6
28                  5
29                  6
30                  9
31                  3
Total               32

Compute the mean and median time taken by the athlete.

Exercise 2.2.5. Following is data on the weight (in ounces), at birth, of 96 babies born in Lawrence Memorial Hospital in May 2000.

 94 105 124 110 119 137  96 110 120 115 119
104 135 123 129  72 121 117  96 107  80  80
 96 123 124 124 134  78 138 106 130  97 134
111 133 128  96 126 124 125 127  62 127  96
116 118 126  94 127 121 117 124  93 135 112
120 125 120 147 138  72 119  89  81 113 100
109 127 138 122 110 113 100 115 110 135 120
 97 127 120 110 107 111 126 132 120 108 148
133 103  92 124 150  86 121  98

Compute the mean and median weight, at birth, of the babies.

Exercise 2.2.6. Following is data on the hourly wages (paid only in whole dollars) of 99 employees in an industry.

 7 11  7 11 10  9 10 10 12 13
 7  8 11 11 14  9  7  9 11  7
 9 13 12 14  7  8  7 14 15  9
 9  7 11  9 12  9 12 11 14  9
12 13  7  9 10 14 11 12 13  7
15 15 16 16 15 16 11  7 18 19
15 16 15 15 16 16 17 16 16 13
15 15 16 15 16 15 15 17 16 12
16 15 15 16 15 15 19  8 16 17
16 16 15 16 16 16 13 12  8

Compute the mean and median hourly wage.

Exercise 2.2.7. Following is the frequency table on the number of typos in a sample of 30 books published by a publisher.

No. of Typos   156  158  159  160  162
Frequency        6    4    5    6    9

Find the mean and median number of typos in a book.

Exercise 2.2.8. Following is data on the length (in inches), at birth, of 96 babies born in Lawrence Memorial Hospital in May 2000.

18     18.5   19     18.5   19     21     18     19     20     20.5
19     19     21.5   19.5   20     17     20     20     19     20.5
18     18.5   20     19.5   20.75  20     21     18     20.5   20
21     19     20.5   19     20     19.5   17.75  20     19.5   20
20.5   17     21     18.5   20     20     20     18.5   19.5   19
18     20.5   18     20     19     19     19.5   20     20.75  21
17.75  19     18     19     20     18.5   20     19     21     19
19.5   20     20     19     19.5   20     19.5   18.5   20.5   19.5
20.25  20     19.5   19.5   20     20     20     21     20     19
18.5   20.5   21.5   18     19.5   18

Compute the mean and median length, at birth, of these babies.

2.3 Measures of Dispersion

Range

Clearly, the measures of central tendency (mean, median, mode) cannot tell us the "whole story" about the data.

Example 2.3.1. Suppose two sections of the statistics class have the following percentage score distribution at the end of the semester:

Section A   81  84  83  80  82
Section B   72  93  92  82  71

Both sections have the same mean, 82. But in Section A, everybody will get a B grade. In Section B, we will have two C's, one B and two A's.

The measure of dispersion is a measure of how widely the data is scattered around. In section A, the data has a very small dispersion or variability, whereas section B has a large dispersion.

A very simple measure of dispersion is the range of the data as we have defined before:

range = largest value - smallest value.

Mean Deviation, Sample Variance, and Standard Deviation

We will discuss three more measures of dispersion.

Suppose we have a data set x1, x2, ... , xn of size n. We will denote the mean of the data by x̄. Three definitions follow:

Definition. The mean deviation of the data is defined as follows.

$$\text{mean deviation} = \frac{|x_1 - \bar{x}| + \cdots + |x_n - \bar{x}|}{n}$$

So, the mean deviation is the mean of the absolute deviations |xi - x̄| from the mean.

Definition. The sample variance s² of the data is defined as follows:

$$s^2 = \frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n-1}$$

Remark.

1. Note that we denote the sample variance as the square of a number s.
2. Also note that we divide by n-1, not by n. Dividing by n-1 makes the sample variance an unbiased estimate of the variance of the population from which the sample was drawn.
3. We would like our measure of dispersion to have the same units as our data, but our formula involves squares (xi - x̄)², which means the unit of dispersion, s², is the unit of the data squared. If the data is in feet, the variance is in square feet. To solve this problem we define another measure of dispersion, the standard deviation, denoted s.

Definition. The sample standard deviation s is defined as the square root of the sample variance s². So, to compute the sample standard deviation, we have to compute the sample variance first.

If we simplify the definition of sample variance we get the following formula:

$$s^2 = \frac{(x_1^2 + x_2^2 + \cdots + x_n^2) - n\bar{x}^2}{n-1}$$
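The simplification uses only the fact that the data values sum to n times the mean. Expanding the square in the numerator of the definition gives:

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\,(n\bar{x}) + n\bar{x}^2
     \qquad \Bigl(\text{since } \textstyle\sum_{i=1}^{n} x_i = n\bar{x}\Bigr) \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 .
\end{aligned}
```

Dividing both sides by n - 1 yields the simplified formula for the sample variance.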

Let us quickly do some computation with the above example 2.3.1.

The mean deviation for section A = (1+2+1+2+0)/5= 6/5 and the mean deviation for section B = (10+11+10+0+11)/5= 42/5. Since the variability of section B was much higher, the mean deviation was very high.

Let us compute the sample variances:

For section A the sample variance is

$$\frac{(81-82)^2+(84-82)^2+(83-82)^2+(80-82)^2+(82-82)^2}{5-1} = \frac{1+4+1+4+0}{4} = \frac{10}{4} = 2.5.$$

For section B the sample variance is

$$\frac{(72-82)^2+(93-82)^2+(92-82)^2+(82-82)^2+(71-82)^2}{5-1} = \frac{100+121+100+0+121}{4} = \frac{442}{4} = 110.5.$$
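The hand computations for Example 2.3.1 can be reproduced with a short Python sketch. The function names are illustrative choices, not part of the original notes; the formulas are the definitions of mean deviation and sample variance given above (dividing by n for the former and by n - 1 for the latter).

```python
import math


def mean(data):
    return sum(data) / len(data)


def mean_deviation(data):
    """Mean of the absolute deviations from the mean (divides by n)."""
    m = mean(data)
    return sum(abs(x - m) for x in data) / len(data)


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    m = mean(data)
    return sum((x - m) ** 2 for x in data) / (n - 1)


section_a = [81, 84, 83, 80, 82]
section_b = [72, 93, 92, 82, 71]

for name, scores in (("A", section_a), ("B", section_b)):
    variance = sample_variance(scores)
    std_dev = math.sqrt(variance)  # sample standard deviation s
    print(name, mean_deviation(scores), variance, round(std_dev, 2))
```

For these two sections the sample standard deviations come out to roughly 1.58 and 10.51, which matches the much larger spread of scores in Section B.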

Application of Standard deviation

The mean and the standard deviation tell us a lot about how the data is distributed.

Chebyshev's Rule. This rule applies for all kinds of data. Suppose x̄ is the mean and s is the standard deviation of the data. Then we have the following:

1. At least 0 percent of the observations will fall within 1 standard deviation of the mean, i.e., within (x̄ - s, x̄ + s). This is trivially true.

2. At least 75 percent of the observations will fall within 2 standard deviations of the mean, i.e., within (x̄ - 2s, x̄ + 2s).

3. At least 89 percent of the observations will fall within 3 standard deviations of the mean, i.e., within (x̄ - 3s, x̄ + 3s).

4. More generally, at least 100(1 - 1/k²) percent of the data will be within k standard deviations of the mean, i.e., within (x̄ - ks, x̄ + ks).
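Chebyshev's Rule can be checked against actual data. The sketch below is illustrative and not part of the original notes; the function names are assumptions. It counts the fraction of Section B scores from Example 2.3.1 that lie strictly inside the interval of k standard deviations around the mean and compares it with the guaranteed lower bound 1 - 1/k².

```python
import math


def fraction_within(data, k):
    """Fraction of observations strictly inside (mean - k*s, mean + k*s)."""
    n = len(data)
    m = sum(data) / n
    s = math.sqrt(sum((x - m) ** 2 for x in data) / (n - 1))  # sample s
    return sum(1 for x in data if abs(x - m) < k * s) / n


def chebyshev_bound(k):
    """Guaranteed minimum fraction within k standard deviations."""
    return max(0.0, 1 - 1 / k ** 2)


section_b = [72, 93, 92, 82, 71]

for k in (1, 2, 3):
    print(k, fraction_within(section_b, k), chebyshev_bound(k))
```

For this data set, 3 of the 5 scores fall within one standard deviation and all 5 fall within two, so the observed fractions are at or above the bound in every case, as the rule promises.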

Chebyshev's Rule makes no assumption about the data or the variable. If we make some assumptions about the data, then we can improve the above rule as follows.

The Empirical Rule: Suppose the histogram of the data is symmetric around the vertical line x = x̄. In other words, the histogram should fit into a bell-shaped curve. Then we have the following:

1. Approximately 68.3 percent of the observations will fall in the interval (x̄ - s, x̄ + s).
2. Approximately 95.4 percent of the observations will fall in the interval (x̄ - 2s, x̄ + 2s).
3. Approximately 99.7 percent of the observations will fall within the interval (x̄ - 3s, x̄ + 3s).

Question: What does it mean when the variance or mean deviation of some data is zero? The answer is that all the data members are EQUAL!

Practice Problem. Consider the exercises 2.2.1 through 2.2.8. For each problem, compute the mean and standard deviation of the data and find what percentage of the data are within one, two, or three standard deviations from the mean.

Use of the Frequency Table

When a frequency table is given, we can use new formulas to compute the mean and variance of the data.

Formulas. Suppose the data consisting of n observations are given in a frequency table (ungrouped). Let xi denote the values and fi be the frequency of xi. Then

1. the mean is

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{\sum f_i x_i}{n},$$

2. the variance is

$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n-1},$$

3. a simplified formula for the variance is

$$s^2 = \frac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right].$$

4. If the data is given in a frequency table of the grouped data, we use the same formulas, with xi as the class mark, which is the average of the class limits.

Example 2.3.2. The following table extends the frequency table of the time taken to complete a lap by a race car (example 2.1.1) to compute mean and variance using the above formulas.

Time x   Frequency f   fx     fx²
46       1             46     2116
47       1             47     2209
48       3             144    6912
49       3             147    7203
50       4             200    10000
51       6             306    15606
52       4             208    10816
53       5             265    14045
54       5             270    14580
55       2             110    6050
56       1             56     3136
Total    35            1799   92673

So, the mean x̄ = 1799/35 = 51.4 and variance s² = (92673 - 35 × 51.4²)/(35 - 1) = 6.0118.
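The column totals and the simplified variance formula of Example 2.3.2 can be reproduced in Python. This is a minimal sketch, not part of the original notes; the function name `frequency_table_stats` is an illustrative choice.

```python
def frequency_table_stats(values, frequencies):
    """Return (n, sum of f*x, sum of f*x^2, mean, sample variance)."""
    n = sum(frequencies)
    sum_fx = sum(f * x for x, f in zip(values, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(values, frequencies))
    mean = sum_fx / n
    # Simplified formula: s^2 = (sum f*x^2 - n * mean^2) / (n - 1)
    variance = (sum_fx2 - n * mean ** 2) / (n - 1)
    return n, sum_fx, sum_fx2, mean, variance


# Frequency table of lap times from Example 2.1.1.
times = [46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56]
frequencies = [1, 1, 3, 3, 4, 6, 4, 5, 5, 2, 1]

n, sum_fx, sum_fx2, mean, variance = frequency_table_stats(times, frequencies)
print(n, sum_fx, sum_fx2)                 # 35 1799 92673
print(round(mean, 1), round(variance, 4))  # 51.4 6.0118
```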

Example 2.3.3. Following is the class frequency distribution of the data on birth weight of some babies (exercise 1.2, Lesson 1):

Classes        Frequency f   Class Mark x   fx        fx²
60.5-80.5      9             70.5           634.5     44732.25
80.5-100.5     20            90.5           1810      163805
100.5-120.5    25            110.5          2762.5    305256.25
120.5-140.5    37            130.5          4828.5    630119.25
140.5-160.5    8             150.5          1204      181202
Total          99                           11239.5   1325114.75

We can use the above formula to compute the (approximate) variance and the standard deviation of the birth weight.

So, the mean x̄ = 11239.5/99 = 113.53 and variance

s² = (1325114.75 - 99 × 113.53²)/(99 - 1) = 500.997.
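For grouped data the only new step is computing the class marks from the class limits. The Python sketch below is illustrative (the variable names are assumptions, not from the notes). It also shows that the simplified formula is sensitive to rounding: using the mean rounded to two decimals, as in the hand computation above, gives about 500.997, while carrying the unrounded mean gives about 500.93.

```python
# Class limits and frequencies from Example 2.3.3.
classes = [(60.5, 80.5), (80.5, 100.5), (100.5, 120.5),
           (120.5, 140.5), (140.5, 160.5)]
frequencies = [9, 20, 25, 37, 8]

# The class mark is the average of the class limits.
class_marks = [(low + high) / 2 for low, high in classes]

n = sum(frequencies)
sum_fx = sum(f * x for x, f in zip(class_marks, frequencies))
sum_fx2 = sum(f * x * x for x, f in zip(class_marks, frequencies))

mean = sum_fx / n
variance = (sum_fx2 - n * mean ** 2) / (n - 1)

# Same formula, but with the mean rounded to 113.53 as in the hand computation.
variance_rounded_mean = (sum_fx2 - n * round(mean, 2) ** 2) / (n - 1)

print(class_marks)
print(round(mean, 2), round(variance, 2), round(variance_rounded_mean, 3))
```

Both figures are approximations to the variance of the original data, because every observation in a class is treated as if it were equal to the class mark.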

Remarks.

1. Note that we can only get an approximate mean and variance if we use the class marks with the above formula. If you also use the original data you may notice a difference.

2. Because of the availability of computers, the importance of such approximations has declined.

Comment: We have had detailed discussions of various formulas for defining the mean, variance, and other constants. It is important to understand these concepts and formulas.

It is equally important to appreciate the value and necessity of using calculators or other available software (like Excel). It is almost impossible (and unnecessary) to compute these constants manually and correctly, unless one is specially gifted with numerical computations.

Use of Calculators (TI-83): Computing the variance and standard deviation

1. Follow the same steps used for computing the mean (using either raw data or the frequency table).
2. The calculator will give a list of numbers; Sx is the standard deviation.
3. The variance is the square of the standard deviation.

Problems on 2.3: Variance, Standard Deviation, and Use of the Frequency Table

Exercise 2.3.1. The following is the price (in dollars) of a stock (say, CISCO SYSTEMS) checked by a trader several times on a particular day.

138 142 127 137 148 130 142 133

Find the variance and standard deviation of the price.

Exercise 2.3.2. The following figures refer to the GPA of six students.

3.0 3.3 3.1 3.0 3.1 3.1

Find the variance and standard deviation of GPA.

Exercise 2.3.3. The following data give the lifetime (in days) of certain light bulbs.

138 952 980 967 992 197 215 157

Find the variance and standard deviation of the lifetime of these bulbs.

Exercise 2.3.4. An athlete ran an event 32 times. The following frequency table gives the time taken (in seconds) by the athlete to complete the events.

Time (in seconds)   Frequency
15.6                3
15.7                6
15.8                5
15.9                6
16.0                9
16.1                3
Total               32

Compute the variance and standard deviation of time taken by the athlete.

Exercise 2.3.5. Following is data on the weight (in ounces), at birth, of 96 babies born in Lawrence Memorial Hospital in May 2000.

 94 105 124 110 119 137  96 110 120 115 119
104 135 123 129  72 121 117  96 107  80  80
 96 123 124 124 134  78 138 106 130  97 134
111 133 128  96 126 124 125 127  62 127  96
116 118 126  94 127 121 117 124  93 135 112
120 125 120 147 138  72 119  89  81 113 100
109 127 138 122 110 113 100 115 110 135 120
 97 127 120 110 107 111 126 132 120 108 148
133 103  92 124 150  86 121  98

Compute the variance and standard deviation of the weight, at birth, of these babies.

Exercise 2.3.6. Following is data on the hourly wages (paid only in whole dollars) of 99 employees in an industry.

 7 11  7 11 10  9 10 10 12 13
 7  8 11 11 14  9  7  9 11  7
 9 13 12 14  7  8  7 14 15  9
 9  7 11  9 12  9 12 11 14  9
12 13  7  9 10 14 11 12 13  7
15 15 16 16 15 16 11  7 18 19
15 16 15 15 16 16 17 16 16 13
15 15 16 15 16 15 15 17 16 12
16 15 15 16 15 15 19  8 16 17
16 16 15 16 16 16 13 12  8

Compute the variance and standard deviation of the hourly wages.

Exercise 2.3.7. Following is the frequency table on the number of typos in a sample of 30 books published by a publisher.

No. of Typos   156  158  159  160  162
Frequency        6    4    5    6    9

Find the mean number, variance, and standard deviation of typos in a book.

Exercise 2.3.8. Following is data on the length (in inches), at birth, of 96 babies born in Lawrence Memorial Hospital in May 2000.

18     18.5   19     18.5   19     21     18     19     20     20.5
19     19     21.5   19.5   20     17     20     20     19     20.5
18     18.5   20     19.5   20.75  20     21     18     20.5   20
21     19     20.5   19     20     19.5   17.75  20     19.5   20
20.5   17     21     18.5   20     20     20     18.5   19.5   19
18     20.5   18     20     19     19     19.5   20     20.75  21
17.75  19     18     19     20     18.5   20     19     21     19
19.5   20     20     19     19.5   20     19.5   18.5   20.5   19.5
20.25  20     19.5   19.5   20     20     20     21     20     19
18.5   20.5   21.5   18     19.5   18

Compute the variance and standard deviation of the length, at birth, of these babies.

Exercise 2.3.9. The following is the frequency table of weight (in pounds) of some salmon in a river.

Weight x      31  32  33  34  35  36  37
Frequency f    3   2   4   5   6   5   9

Find the variance and the standard deviation.

Exercise 2.3.10. The following data represents the time (in minutes) taken by students to drive to campus.

23 17 19 24 42 33 20 22 15  9
26 37 29 19 35 18 30 21 11 23
13 27 32 32 23 35 25 33 24 23

Find the mean, variance, and the standard deviation of the data.

Probability: Introduction

Historical Remarks

Games of chance, such as those involving dice, have been played for over 5,000 years. For almost that long, people have been trying to determine the odds or probability of winning at these games. Since the sixteenth century, mathematicians have been working steadily at this problem of calculating probabilities. Around 1620, Galileo wrote a paper on dice probabilities. However, the year 1654 is often considered the beginning of probability theory. At that time Blaise Pascal and Pierre Fermat began a correspondence on the subject.

Pierre Simon, the Marquis de Laplace, was an important contributor to probability theory. In 1812 he proved the central limit theorem, which explains why so many data sets follow a distribution that is bell-shaped, i.e., normally distributed. In Laplace’s book Analytical Theory of Probability, he writes:

We see that the theory of probability is at the bottom only common sense reduced to calculation; it makes us appreciate with exactitude what reasonable minds feel by a sort of instinct, often without being able to account for it … It is remarkable that this science, which originated in the consideration of games of chance, should become the most important object of human knowledge … The most important questions in life are, for the most part, really only problems of probability.

A Classic Example

Probability is essentially an extension of the idea of a proportion, or ratio of a part to the whole. Let's look at a classic problem of drawing a colored ball out of an urn. Here the kind of urn we have in mind is a pottery vase, large enough to hold a number of colored balls and deep enough so that we cannot see what ball we select to draw out.

1. Suppose there are 6 green balls and 4 red balls in an urn. You mix them well and then reach in without looking and pull one out. What fraction of the time, "on average," would you expect to get a green ball? A red ball?

We need to discuss what the phrase "on average" means. In this case, after noting the color of the chosen ball, we put it back, mix well and select again. If we let Ng denote the number of green balls that we have obtained after performing this experiment N times, then we expect the ratio Ng / N to approach 0.6 as N becomes large. Similarly, we expect the likelihood of getting a red ball to approach 0.4. Here we say that the probability of selecting a green ball is 0.6 and that of selecting a red ball is 0.4.
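The repeated experiment described above can be imitated with a small simulation. This is an illustrative Python sketch, not part of the original notes: the function name `green_fraction` and the fixed seed are assumptions made so that the run is reproducible. Each draw picks one ball at random and "puts it back," so every draw is made from the same 6 green and 4 red balls.

```python
import random


def green_fraction(num_draws, seed=0):
    """Draw with replacement num_draws times; return Ng / N."""
    rng = random.Random(seed)  # seeded generator, for reproducibility only
    urn = ["green"] * 6 + ["red"] * 4
    green_count = sum(1 for _ in range(num_draws) if rng.choice(urn) == "green")
    return green_count / num_draws


for num_draws in (10, 100, 10_000, 100_000):
    print(num_draws, green_fraction(num_draws))
```

For small N the ratio can be far from 0.6; as N grows it should settle close to 0.6, which is the sense in which the probability of selecting a green ball is 0.6.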


Elementary Probability

Part 2: Terminology

In order to discuss the notion of probability in more detail, we need to introduce some terminology. First, note that the word random is often used to describe an activity where the outcome is uncertain. In the urn example of Part 1, we would say that we are selecting a ball "at random" because we do not know the outcome of this activity.

We will discuss the probability of various outcomes in the context of a well-defined activity or procedure. We use the term experiment for such a well-defined procedure. An example is the procedure of selecting a single ball out of the urn. When an experiment is performed, the result is called an outcome. In our urn experiment, the possible outcomes are A green ball is selected and A red ball is selected. We also need a term for the set of all possible outcomes. We will call this set the sample space. So in our urn example, the sample space is a set consisting of the two outcomes.

Example: Rolling a die. Consider the experiment of rolling a single die. There are six possible outcomes: one of the numbers 1, 2, 3, 4, 5, and 6. So, in this case, the sample space has 6 elements. Assuming that we have a fair die (that is, each side is equally likely to turn up), we assign the probability of each outcome to be the ratio 1/6.

Example: Using a game spinner. Consider the experiment of spinning the pointer on a game spinner whose face is divided into red, blue, and green regions. There are three possible outcomes; that is, when the pointer stops it must point to one of the three colors. (We rule out the possibility of landing on the border between two colors.) Since the red region covers half the area of the spinner, we say that the probability of it pointing to the red area is 1/2. Similarly, the probabilities of it pointing to the blue and green areas are 1/3 and 1/6, respectively.

1. Describe the experiment and the sample space for both the die example and the game spinner example.

Events

Not only do we want to assign probabilities to individual outcomes, but also we want to assign probabilities to sets of outcomes, i.e., to subsets of the sample space. A subset of the sample space will be called an event. So, how should we assign probabilities to events?

2. Die Example (continued). Consider again the experiment of rolling a fair die. Here, an event may be identified with a subset of the set of 6 integers {1, 2, 3, 4, 5, 6}. For example, if A is the event that the die will show an even number, then A = {2, 4, 6}. What probability would you assign to this event?

3. Game Spinner Example (continued). In the experiment with the fair spinner, let A be the event that the outcome is either red or blue. What is the probability that you would assign to this event?

Definition: The probability of an event A, written P(A), is the sum of the probabilities assigned to the individual outcomes in A.
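The definition translates directly into a sum over the outcomes in the event. The following Python sketch is illustrative and not part of the original notes: the function name `event_probability` and the two events used are hypothetical choices, picked so that they differ from the events in the numbered questions. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def event_probability(outcome_probabilities, event):
    """P(A): the sum of the probabilities of the individual outcomes in A."""
    return sum((outcome_probabilities[outcome] for outcome in event), Fraction(0))


# Probability assignments from the die and game spinner examples.
die = {face: Fraction(1, 6) for face in range(1, 7)}
spinner = {"red": Fraction(1, 2), "blue": Fraction(1, 3), "green": Fraction(1, 6)}

# Hypothetical events: "the die shows more than 4" and "the spinner is not red".
print(event_probability(die, {5, 6}))                # 1/3
print(event_probability(spinner, {"blue", "green"}))  # 1/2
```

Taking the event to be the whole sample space gives probability 1 for both experiments, which is a quick consistency check on the assignments.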

4. Check to see that this definition agrees with your assignments in Questions 2 and 3.

5. Find the probability of each outcome in a sample space of size n, assuming that each outcome in the sample space is equally likely to occur.

6. Suppose we roll a fair die and A is the event that the outcome shows a number less than 3. Find P(A).

7. Now suppose we roll two fair dice. We can generate the elements of this sample space as ordered pairs. For example, the ordered pair (2, 5) indicates that the first die showed a 2 and the second a 5. Explain why the size of this sample space is 36.

8. In the experiment of rolling a pair of fair dice, let A be the event that the sum of the two faces showing is 5. Find P(A). Clearly indicate how you arrived at your answer. What other sum has the same probability of appearing as the sum 5?

9. If you simulate rolling a pair of dice many times, do your results agree with your answer to Question 8?

Elementary Probability

Part 3: Rules for Assigning and Calculating Probabilities

Here is the fundamental rule for assigning probabilities to outcomes in a sample space. It reflects the intuitive idea that every outcome has a non-negative probability (a value of 0 is allowed) and exactly one of the outcomes must occur.

Rule: The probability of each outcome must be a non-negative number, and the sum of the probabilities of the possible outcomes of an experiment must be 1.

Note that it follows from this fundamental principle that the probability of an event is always a number between zero and 1.

1. Let An be the event that the sum of the faces showing after the roll of two fair dice is n. Calculate P(An) for n = 2, ..., 12. (For example, P(A3) is the probability that the faces showing sum to 3.) Why should the sum of these 11 probabilities be 1? Check that this is the case.

2. Consider the experiment of flipping a coin three times. If we denote a head by H and a tail by T, we can list the 8 possible ordered outcomes as (H,H,H), (H,H,T), ..., each of which occurs with probability 1/8. Finish listing the remaining members of the sample space. Calculate the probability of the following events:
a. All three flips are heads.
b. Exactly two flips are heads.
c. The first flip is tails.
d. At least one flip is heads.
e. The heads and tails alternate.

Adding Probabilities

Let A and B be two events from a given sample space. What is the probability of either A or B happening? (When we say "A or B" we mean "A or B or both.") Is this probability the same as P(A) + P(B)?

3. In Ms. Nelson’s homeroom at Weaver High School, 18 of her 30 students are taking a history class and 14 are taking a biology class.
a. If Ms. Nelson chooses a homeroom student at random, what is the probability that the student is taking a history class?
b. If Ms. Nelson chooses a homeroom student at random, what is the probability that the student is taking a biology class?
c. Explain why the sum of your answers from parts a) and b) is obviously not the probability that Ms. Nelson chooses a student taking a history class or a biology class.

After you have answered all the parts of Question 3, read the following:

Discussion of Question 3

Generalizing the discussion above, we have the following rule:

Addition Rule

P(A or B) = P(A) + P(B) – P(A and B)

Thus, the probability of the event A or B is equal to the probability of A plus the probability of B minus the probability of A and B.

Note that in terms of the operations union and intersection the event A or B corresponds to A union B and the event A and B corresponds to A intersection B. Thus we can rewrite the addition rule as

P(A union B) = P(A) + P(B) – P(A intersection B)
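The addition rule can be verified by direct counting on a small sample space. The sketch below assumes one roll of a fair die with the events A = "the roll is even" and B = "the roll is greater than 3"; these events and the helper prob are chosen for illustration and do not come from the exercises.

```python
from fractions import Fraction

# Assumed example: one roll of a fair die, all six outcomes equally likely.
sample_space = set(range(1, 7))


def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3

# Left side: count the union directly.
p_union = prob(A | B)
# Right side: the addition rule. Outcomes 4 and 6 lie in both A and B,
# so P(A) + P(B) counts them twice and P(A and B) removes the double count.
p_rule = prob(A) + prob(B) - prob(A & B)

print(p_union, p_rule)
```

Here P(A) + P(B) alone would give 1, which overstates the answer because the outcomes 4 and 6 are counted in both events.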

4. A card is selected at random from a deck of 52 cards. Use the addition rule to show that the probability the card chosen is a queen or a diamond is 4/13.

5. If a fair die is rolled twice, what is the probability that either the sum of the faces is 8 or at least one roll is 5?

Part 4: Independence

Suppose A and B are events in a sample space. The knowledge that an outcome is in event A may change your estimate of the likelihood that the outcome is in event B. For example, suppose that the experiment is rolling 2 fair dice, that the event A is "the sum of the dice is greater than 10," and that the event B is "at least one of the dice is 5." Then knowing that the outcome is in event A increases the likelihood that the outcome is in event B. Once we know that the outcome is in A, we have a new experiment with the sample space {(5,6), (6,5), (6,6)}. Each of these outcomes is equally likely. So, the probability of one die being a 5 given that the sum is greater than 10 is 2/3, whereas without that knowledge the probability of B is only 11/36.

On the other hand, if the experiment is flipping a fair coin twice, knowing that the first flip is a head (event A) does not change the likelihood that the second flip is a tail (event B). Here the new sample space is {(H,H),(H,T)}. Again, both outcomes are equally likely, so the probability that the second flip is a tail is 1/2 -- the same probability we would assign without the knowledge about A.
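Both examples above amount to restricting the sample space to the outcomes in A and then counting. A minimal sketch of that procedure follows, assuming equally likely outcomes; the function name conditional_probability and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def conditional_probability(sample_space, event_b, given_a):
    """Restrict the sample space to outcomes in A, then count those also in B."""
    restricted = [o for o in sample_space if given_a(o)]
    favorable = [o for o in restricted if event_b(o)]
    return Fraction(len(favorable), len(restricted))


# Two fair dice: A = "sum greater than 10", B = "at least one die is 5".
dice = list(product(range(1, 7), repeat=2))
p_five_given_sum = conditional_probability(
    dice, lambda o: 5 in o, lambda o: sum(o) > 10
)

# Two fair coin flips: A = "first flip is a head", B = "second flip is a tail".
coins = list(product("HT", repeat=2))
p_tail_given_head = conditional_probability(
    coins, lambda o: o[1] == "T", lambda o: o[0] == "H"
)
p_tail = Fraction(sum(1 for o in coins if o[1] == "T"), len(coins))

print(p_five_given_sum, p_tail_given_head, p_tail)
```

In the dice case the restricted probability differs from the unrestricted one, while in the coin case the two agree.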


Informally, two events A and B are said to be independent if knowing that an outcome is in event A does not change the likelihood that the outcome is in event B and vice versa.

1. Suppose that you simultaneously flip a coin and roll a standard die. If you know that the die has come up 6, what is the probability that the coin shows heads? What does your answer tell you about the independence of the events of rolling a 6 and flipping a head?

2. Suppose that 50% of the people in a town are 5 ft. 7 in. or taller and that 50% of the people in this town are males. If a person from the town is chosen at random, are the events "the person is 5 ft. 7 in. or taller" and "the person is a male" likely to be independent? Explain.

The probability that a person chosen at random will be both 5 ft. 7 in. or taller and a male may at first appear to be 1/4. After all, half the population is 5 ft. 7 in. or taller and half of that group is male, so 1/2 of 1/2 is 1/4. But since males on average are taller than females, it is wrong to assume that 1/2 of the 5 ft. 7 in. or taller group is male. Knowing whether the person chosen is a male influences the likelihood (probability) that the person is 5 ft. 7 in. or taller.

Suppose that, for an experiment, the event A has probability 1/4 and event B has probability 1/3. If events A and B are independent, then in a large number of instances of this experiment, only approximately 1/4 will be in the event A. Of this 1/4, only approximately 1/3 will also be in event B. So, the probability of the event A and B is just the product of the two probabilities, 1/12.

This observation is taken as the definition of the independence of two events.

Definition: Suppose we are considering the outcomes of a particular experiment. If A and B are events (subsets of the sample space of outcomes), we say A and B are independent if

P(A and B) = P(A) P(B).
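The definition can be tested by enumeration whenever the outcomes are equally likely. The sketch below applies it to the two examples from the start of this part (two coin flips and two dice); the helpers prob and are_independent are illustrative names, not part of the lecture note.

```python
from fractions import Fraction
from itertools import product


def prob(sample_space, event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(sum(1 for o in sample_space if event(o)), len(sample_space))


def are_independent(sample_space, a, b):
    """Apply the definition: P(A and B) must equal P(A) P(B) exactly."""
    p_both = prob(sample_space, lambda o: a(o) and b(o))
    return p_both == prob(sample_space, a) * prob(sample_space, b)


# Two coin flips: A = "first flip is a head", B = "second flip is a tail".
coins = list(product("HT", repeat=2))
coins_independent = are_independent(
    coins, lambda o: o[0] == "H", lambda o: o[1] == "T"
)

# Two dice: A = "sum greater than 10", B = "at least one die is 5".
dice = list(product(range(1, 7), repeat=2))
dice_independent = are_independent(
    dice, lambda o: sum(o) > 10, lambda o: 5 in o
)

print(coins_independent, dice_independent)
```

Exact fractions are used rather than floating-point numbers so that the equality test in the definition is not affected by rounding.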

3. In Question 2 of Part 3 you determined the 8 members of the sample space for flipping a fair coin 3 times. We list the elements of this sample space here:

(H,H,H) (H,H,T) (H,T,H) (H,T,T)
(T,H,H) (T,H,T) (T,T,H) (T,T,T)

4. Let A be the event that a head appears on the first flip, and let B be the event that a head comes up on the third flip.
a. Without calculating any probabilities, explain why the events A and B should be independent.
b. Find P(A).
c. Find P(B).
d. Find P(A and B).
e. Show that P(A and B) = P(A) P(B), and thus the events are indeed independent.

5. Again, consider the sample space obtained by flipping a fair coin 3 times. Let A again denote the event that the first flip is a head. Let C denote the event that at least 2 of the 3 flips are heads.
a. Without calculating any probabilities, do you think that the events A and C are independent? Explain.
b. Find P(A).
c. Find P(C).
d. Find P(A and C).
e. Show that P(A and C) is not equal to P(A) P(C), and thus the events are not independent. Does this contradict your explanation in (a)?

6. A fair die is painted so that three sides are red, two sides are blue and one side is green. Thus, rolling the die has three possible outcomes R, B, and G.
a. If the painted die is rolled once, what is the probability that it will come up blue?
b. If the painted die is rolled twice, we can denote the nine possible outcomes by RR, RB, etc. Find the probability of each element in this sample space.
c. Consider the following events in the sample space obtained by rolling the painted die twice.
A: At least one roll will be red.
B: The two rolls will have different colors.
C: Both rolls are red or both are blue.
D: Either both rolls are red or one roll is blue.
E: At least one roll is red or at least one roll is blue.
Among the events A, B, C, D, and E, determine which pairs are independent.

7. Suppose a fair die is rolled twice.
a. Let A be the event that the first roll is greater than or equal to 2. Let B be the event that the second roll is greater than or equal to 4. Show that A and B are independent.
b. Let A be the event that the first roll is greater than or equal to 2. Let C be the event that the sum of the rolls is greater than or equal to 4. Find P(A and C) and determine whether A and C are independent.

Independence and the Gambler's Fallacy.

A lack of appreciation of the concept of independence lies at the heart of the mistaken belief, known as the Gambler's Fallacy, that after a run of bad luck a gambler's luck is due to change.
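The point can be illustrated by enumeration. The sketch below assumes four flips of a fair coin and asks whether a run of three tails makes a head on the fourth flip any more likely; the experiment and the variable names are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed experiment: four flips of a fair coin, 16 equally likely outcomes.
flips = list(product("HT", repeat=4))

# Restrict to the outcomes that begin with a run of three tails.
after_three_tails = [o for o in flips if o[:3] == ("T", "T", "T")]

# Probability of a head on the fourth flip, given the run of tails.
p_head_after_run = Fraction(
    sum(1 for o in after_three_tails if o[3] == "H"), len(after_three_tails)
)

# Probability of a head on the fourth flip with no knowledge of earlier flips.
p_head = Fraction(sum(1 for o in flips if o[3] == "H"), len(flips))

print(p_head_after_run, p_head)
```

The two probabilities agree, so the earlier flips carry no information about the next one: the flips are independent.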
