5.1 random variable and probability distribution · 5.1 random variable and probability ... of a...
Post on 03-Jul-2018
291 Views
Preview:
TRANSCRIPT
5.1 Random Variable and Probability Distribution
Definition 1 A random variable is a variable whose value is determined by the outcome of a
random experiment.
Definition 2 A random variable that assumes countable values is called a Discrete Random
Variable.
Some examples of discrete random variables are
1. The number of cars sold at a dealership during a given month
2. The number of people coming to a theater on a certain day
3. The number of complaints received at the office of an airline on a given day
4. The number of customers visiting a bank during any given hour
Definition 3 A random variable that can assume any value contained in one or more intervals is
called a Continuous Random Variable.
The following are a few examples of continuous random variables.
1. The height of a person
2. The time taken to complete an examination
3. The amount of milk in a gallon (note that we do not expect a gallon to contain exactly one
gallon of milk but either slightly more or slightly less than a gallon)
2
Example 1 Suppose the following gives the frequency and relative frequency distributions of the
vehicles owned by all 2000 families living in a small town.
Number of Vehicles Owned Frequency Relative Frequency
0 30 30/2000 = 0.015
1 470 470/2000 = 0.235
2 850 850/2000 = 0.425
3 490 490/2000 = 0.245
4 160 160/2000 = 0.080
N = 2000 Sum = 1.000Table 1
Suppose one family is randomly selected from this population. The act of randomly selecting a
family is called a random or chance experiment. Let X denote the number of vehicles owned
by the selected family. Then X can assume any of the five possible values (0, 1, 2, 3, and 4) listed
in the first column of Table 1. The value assumed by X depends on which family is selected. Thus,
this value depends on the outcome of a random experiment.
Definition 4 The probability function or probability distribution P (X) of a discrete
random variable can be represented by a formula, a table or a graph, which provides the proba-
bilities P (x) or P (X = x) or f (x) corresponding to each and every value of x.
Example 2 Recall the frequency and relative frequency distributions of the number of vehicles
owned by families given in Table 1. Let X be the number of vehicles owned by a randomly se-
lected family. Write the probability distribution of X.
Number of Relative
Vehicles Owned Frequency Frequency
0 30 0.015
1 470 0.235
2 850 0.425
3 490 0.245
4 160 0.080
N = 2000 Sum = 1.0Table 2
Number of Vehicles Owned Probability
x P (x) = P (X = x)
0 0.015
1 0.235
2 0.425
3 0.245
4 0.080∑
P (x) = 1.0Table 3
3
We have learned that the relative frequencies obtained from an experiment or a sample can
be used as approximate probabilities. However, when the relative frequencies are known for the
population, they give the actual (theoretical) probabilities of outcomes.
Using the relative frequencies of Table 2, we can write the probability distribution of the discrete
random variable X in Table 3.
From Table 3, the probability for any value of X can be read. For example, the probability that
a randomly selected family from this town owns two vehicles is 0.425. This probability is written
as
P (X = 2) = 0.425
The probability that the selected family owns more than two vehicles is given by the sum of the
probabilities of three and four vehicles, respectively. This probability is 0.245+0.080 = 0.325. This
probability is written as
P (X > 2) = P (X = 3) + P (X = 4) = 0.245 + 0.080 = 0.325
4
• Three Characteristics of a Probability Distribution
The probability distribution of a discrete random variable possesses the following there charac-
teristics.
1. 0 ≤ P (x) ≤ 1 for each value of x
2.∑
P (x) = 1
3. P (x) = P (X = x)
Example 3 The following table lists the probability distribution of the number of breakdowns per
week for a machine based on past data.
Breakdowns per week 0 1 2 3
Probability 0.15 0.20 0.35 0.30
Find the probability that the number of breakdowns for this machine during a given week is (a)
exactly two (b) zero to two (c) more than one (d) at most one.
Solution: Let X denote the number of breakdowns for this machine during a given week. We
can calculate the required probabilities as follows.
(a) The probability of exactly two breakdowns is
P (exactly 2 breakdowns) = P (X = 2) = 0.35
(b) The probability of zero to two breakdowns is given by the sum of the probabilities of 0, 1,
and 2 breakdowns.
P (0 to 2 breakdowns) = P (X = 0) + P (X = 1) + P (X = 2) = 0.15 + 0.20 + 0.35 = 0.70
(c) The probability of more than one breakdown is obtained by adding the probabilities of 2
and 3 breakdowns.
P (more than 1 breakdown) = P (X > 1) = P (X = 2) + P (X = 3) = 0.35 + 0.30 = 0.65
(d) The probability of at most one breakdown is given by the sum of the probabilities of 0 and
1 breakdown.
P (at most 1 breakdown) = P (X ≤ 1) = P (X = 0) + P (X = 1) = 0.15 + 0.20 = 0.35
5
5.2 Mean and Variance of a Random Variable
Definition 5 Let X be a discrete random variable with probability distribution P (X). The mean
or expected value is
µ = E (X) =∑
xP (x)
Example 4 Recall Example 3. Find the mean number of breakdowns per week for this machine.
Solution: To find the mean number of breakdowns per week for this machine, we multiply each
value of x by its probability and add these products. This sum gives the mean of the probability
distribution of X.
x P (x) xP (x)
0 0.15 0(0.15) = 0.00
1 0.20 1(0.20) = 0.20
2 0.35 2(0.35) = 0.70
3 0.30 3(0.30) = 0.90∑
xP (x) = 1.80
The mean is µ = E (X) =∑
xP (x) = 1.80. Thus, on average, this machine is expected to break
down 1.80 times per week over a period of time. In other words, if this machine is used for many
weeks, then for certain weeks we will observe no breakdowns; for some other weeks we will observe
1 breakdown per week; and for still other weeks we will observe 2 or 3 breakdowns per week. The
mean number of breakdowns is expected to be 1.80 per week for the entire period.
Example 5 A fair die is tossed. If 2, 3 or 5 occur the player wins that number of dollars, but if 1,
4, or 6 occurs, the player loses that number of dollars. The possible payoffs for the player and their
respective probabilities follow:
x 2 3 5 −1 −4 −6
P (x) 1
6
1
6
1
6
1
6
1
6
1
6
Solution: The negative numbers −1, −4, −6 refer to the fact that the player loses when 1, 4, or
6 occurs. Then the expected value of the game is as follows:
E (X) =∑
xP (x) = 2 ×
(
1
6
)
+ 3 ×
(
1
6
)
+ 5 ×
(
1
6
)
− 1 ×
(
1
6
)
− 4 ×
(
1
6
)
− 6 ×
(
1
6
)
= −1
6
Thus, the game is unfavorable to the player since the expected value is negative.
6
Definition 6 Let X be a discrete random variable with probability distribution P (X) and mean µ.
The variance of X is
σ2 = V ar (X) =∑
(x − µ)2 P (x)
• Alternative formula
σ2 = V ar (X) =∑
x2P (x) − µ2
Example 6 The following table gives the probability distribution of a random variable X
x 0 1 2 3
P (x) 0.2 0.1 0.3 0.4
Find the variance of the random variable X.
Solution: By definition,
x P (x) xP (x) (x − µ) (x − µ)2 (x − µ)2 P (x)
0 0.2 0(0.2) = 0 0 − 1.9 = −1.9 3.61 0.722
1 0.1 1(0.1) = 0.1 1 − 1.9 = −0.9 0.81 0.081
2 0.3 2(0.3) = 0.6 2 − 1.9 = 0.1 0.01 0.003
3 0.4 3(0.4) = 1.2 3 − 1.9 = 1.1 1.21 0.484
µ =∑
xP (x) = 1.90 σ2 =∑
(x − µ)2 P (x) = 1.29
OR by alternative formula
x P (x) xP (x) x x2P (x)
0 0.2 0(0.2) = 0 0 0
1 0.1 1(0.1) = 0.1 1 0.1
2 0.3 2(0.3) = 0.6 4 1.2
3 0.4 3(0.4) = 1.2 9 3.6
µ =∑
xP (x) = 1.90∑
x2P (x) = 4.9
The variance is
σ2 =∑
x2P (x) − µ2 = 4.9 − (1.9)2 = 1.29.
5.3 The Binomial Probability Distribution
The binomial probability distribution is one of the most widely used discrete probability distri-
butions. To apply the binomial probability distribution, the random variable x must be a discrete
7
random variable. In other words, the variable must be a discrete random variable and each repe-
tition of the experiment must result in one of two possible outcomes. The binomial distribution is
applied to experiments that satisfy the four conditions of a binomial experiment. Each repetition
of a binomial experiment is called a trial or a Bernoulli trial (after Jacob Bernoulli).
Definition 7 An experiment that satisfies the following four conditions is called a binomial ex-
periment.
1. There are n identical trials. In other words, the given experiment is repeated n times. All
these repetitions are performed under identical conditions.
2. Each trial has two and only two outcomes. These outcomes are usually called a success and
a failure.
3. The probability of success is denoted by p and that of failure by 1 − p. The probabilities p
and 1 − p remain constant for each trial.
4. The trials are independent. In other words, the outcome of one trial does not affect the
outcome of another trial.
Example 7 Five percent of all VCRs manufactured by a large electronics company are defective.
Three VCRs are randomly selected from the production line of this company. The selected VCRs are
inspected to determine if each of them is defective or good. Is this experiment a binomial experiment?
Solution:
1. This example consists of three identical trials. A trial represents the selection of a VCR.
2. Each trial has two outcomes: A VCR is defective or a VCR is good. Let a defective VCR be
called a success and a good VCR be called a failure.
3. Five percent of all VCRs are defective. So, the probability p that a VCR is defective is 0.05.
As a result, the probability 1− p that a VCR is good is 0.95. These two probabilities add up
to 1.
4. Each trial (VCR) is independent. In other words, if one VCR is defective it does not affect the
outcome of another VCR being defective or good. This is so because the size of the population
is very large as compared to the sample size.
8
Since all four conditions of a binomial experiment are satisfied, this is an example of a binomial
experiment.
Definition 8 The Binomial Probability Distribution – The random variable X that repre-
sents the number of successes in n trials for a binomial experiment is called a binomial random
variable. The probability distribution of X in such experiments is called the binomial probability
distribution or simply binomial distribution.
Theorem 1 (Binomial Formula) For a binomial experiment, the probability of exactly x suc-
cesses in n trials is given by the binomial formula
P (x) = P (X = x) =
(
n
x
)
px (1 − p)n−x
where x = 0, 1, 2, . . . , n and
n = total number of trials
p = probability of success
x = number of successes in n trials
n − x = number of failures in n trials(
n
x
)
=n!
x! (n − x)!and n! = n × (n − 1) × · · · × 2 × 1 and 0! = 1
• Notation – Usually, we use X ∼ B (n, p) to represent that X follows a binomial distribution
with parameters n and p.
9
Use scientific calculator to find
(
n
x
)
= n SHIFT
nCr
x =
Method 1:
(
6
4
)
= 6 SHIFT
nCr
4 = = 15
Method 2:
(
6
4
)
= 6 SHIFT
x!
÷ 4 SHIFT
x!
÷ 2 SHIFT
x!
= = 15
Method 3:
(
6
4
)
= 6 x! ÷ 4 x!
÷ 2 x! = = 15
Use scientific calculator to find pn = p SHIFT
xy
n =
Method 1: (0.3)5 = 0 · 3 SHIFT
xy
5 = = 0.00243
Method 2: (0.3)5 = 0 · 3 xy 5 = = 0.00243
Example 8 The answer to a true/false question is either correct or incorrect. Assume that (i) an
examination consists of four true/false questions and (ii) a student has no knowledge of the subject
matter. The chance (probability) that the student will guess the correct answer to the first question
is 0.50. Likewise, the probability of guessing each of the remaining questions correctly is 0.50. What
is the probability of:
(a) Getting exactly none of four correct?
10
(b) Getting exactly one of four correct?
(c) Getting exactly two of four correct?
Solution: Note that the binomial conditions are met:
(1) There is a fixed number of trials (i.e. n = 4)
(2) There are only two possible outcomes (i.e. correct/incorrect)
(3) There is a constant probability of success (i.e. p = 0.5)
(4) The trials are independent
(a) The probability of guessing none of the four correctly is 0.0625, found by applying Binomial
formula:
P (X = 0) =
(
4
0
)
(0.5)0 (1 − 0.5)4−0
=4!
0! (4 − 0)!× 1 × 0.54
= 0.0625
Q1 Q2 Q3 Q4 Probability
Case 1 × × × × 0.5 × 0.5 × 0.5 × 0.5
(
4
0
)
= 1
(
4
0
)
= 4 SHIFT
nCr
0 = = 1
(0.5)0 = 0.5 SHIFT
xy
0 = = 1
(0.5)4 = 0.5 SHIFT
xy
4 = = 0.0625
(b) The probability of getting exactly one of four correct is 0.2500, found by:
P (X = 1) =
(
4
1
)
(0.5)1 (1 − 0.5)4−1
=4!
1! (4 − 1)!× 0.5 × 0.53
= 0.25
Q1 Q2 Q3 Q4 Probability
Case 1 X × × × 0.5 × 0.5 × 0.5 × 0.5
Case 2 × X × × 0.5 × 0.5 × 0.5 × 0.5
Case 3 × × X × 0.5 × 0.5 × 0.5 × 0.5
Case 4 × × × X 0.5 × 0.5 × 0.5 × 0.5
(
4
1
)
= 4
11
(
4
1
)
= 4 SHIFT
nCr
1 = = 4
(0.5)1 = 0.5 SHIFT
xy
1 = = 0.5
(0.5)3 = 0.5 SHIFT
xy
3 = = 0.125
(c) The probability of getting exactly two of four correct is 0.3750, found by:
P (X = 2) =
(
4
2
)
(0.5)2 (1 − 0.5)4−2
=4!
2! (4 − 2)!× 0.52
× 0.52
= 0.375
Q1 Q2 Q3 Q4 Probability
Case 1 X X × × 0.5 × 0.5 × 0.5 × 0.5
Case 2 X × X × 0.5 × 0.5 × 0.5 × 0.5
Case 3 X × × X 0.5 × 0.5 × 0.5 × 0.5
Case 4 × X X × 0.5 × 0.5 × 0.5 × 0.5
Case 5 × X × X 0.5 × 0.5 × 0.5 × 0.5
Case 6 × × X X 0.5 × 0.5 × 0.5 × 0.5
(
4
2
)
= 6
(
4
2
)
= 4 SHIFT
nCr
2 = = 6
(0.5)2 = 0.5 SHIFT
xy
2 = = 0.25
(0.5)2 = 0.5 SHIFT
xy
2 = = 0.25
Example 9 At the Express House Delivery Service, providing high-quality to its customers is the
top priority of the management. The company guarantees a refund all charges if a package it is
delivering does not arrive at its destination by the specified time. It is known from past data that
despite all efforts, 2% of the packages mailed through company do not arrive at their destinations
within the specified time. A corporation mailed 10 packages through Express House Delivery Service
on Monday.
(a) Find the probability that exactly two of these 10 packages will not arrive at its destination
within the specified time.
12
(b) Find the probability that at most one of these 10 packages will not arrive at its destination
within the specified time.
(c) Find the probability that at least two of these 10 packages will not arrive at its destination
within the specified time.
Solution: Let us call it a success if a package does not arrive at its destination within the
specified time and a failure if it does arrive within the specified time. Then,
n = total number of packages mailed = 10,
p = P (success) = 0.02,
1 − p = P (failure) = 1 − 0.02 = 0.98
(a) For this part,
x = number of successes = 2, n − x = number of failures = 10 − 2 = 8
Substituting all values in the binomial formula, we obtain
P (X = 2) =
(
10
2
)
(0.02)2(1 − 0.02)10−2
=10!
2!8!(0.02)2(0.98)8
= (45)(0.0004)(0.8508) = 0.0153
Thus, there is a 0.0153 probability that exactly two of the 10 packages mailed will not arrive at its
destination within the specified time.
(
10
2
)
= 10 SHIFT
nCr
2 = = 45
(0.02)2 = 0.02 SHIFT
xy
2 = = 0.0004
(0.98)8 = 0.98 SHIFT
xy
8 = = 0.8508
13
(b) The probability that at most one of the 10 packages will not arrive at its destination within
the specified time is given by the sum of the probabilities of X = 0 and X = 1. Thus,
P (X ≤ 1) = P (X = 0) + P (X = 1)
=
(
10
0
)
(0.02)0(0.98)10 +
(
10
1
)
(0.02)1(0.98)9
= (1)(1)(0.8171) + (10)(0.02)(0.8337)
= 0.8171 + 0.1667 = 0.9838
Thus, the probability that at most one of the 10 packages will not arrive at its destination within
the specified time is 0.9838.
(c) The probability that at least two of the 10 packages will not arrive at its destination within
the specified time is given by the sum of the probabilities of X = 2, X = 3, X = 4, . . . , X = 9 and
X = 10. Thus,
P (X ≥ 2) = 1 − P (X = 0) − P (X = 1)
= 1 −
(
10
0
)
(0.02)0(0.98)10−
(
10
1
)
(0.02)1(0.98)9
= 1 − 0.8171 − 0.1667
= 0.0162
Thus, the probability that at least two of the 10 packages will not arrive at its destination within
the specified time is 0.0162.
Theorem 2 The mean and variance of the binomial distribution are
µ = E (X) = np and σ2 = V ar (X) = np (1 − p)
5.4 The Poisson Probability Distribution
The Poisson probability distribution is applied to experiments with random and independent
occurrences. The occurrences are random in the sense that they do not follow any pattern and,
hence, they are unpredictable. Independence of occurrences means that one occurrence (or nonoc-
currence) of an event does not influence the successive occurrences or nonoccurrence of that event.
The occurrences are always considered with respect to an interval. The interval may be a time
interval, a space interval, or a volume interval. The actual number of occurrences within an interval
14
is random and independent. If the average number of occurrences for a given interval is known, then
by using the Poisson probability distribution we can compute the probability of a certain number
of occurrences X in that interval.
• Conditions to apply Poisson probability distribution
The following three conditions must be satisfied to apply the Poisson probability distribution.
1. The experiment consists of counting the number, X, of times a particular event occurs during
a given unit of time, or a given area or volume (or weight, or distance, or any other unit of
measurement)
2. The probability that an event occurs in a given unit of time, area, or volume is the same for
all the units.
3. The number of events that occur in one unit of time, area, or volume is independent of the
number that occur in other units.
The following are a few examples of discrete random variables for which the occurrences are
random and independent. Hence, these are examples to which the Poisson probability distribution
can be applied.
1. Consider the number of patients arriving at the emergency ward of a hospital during a one-
hour interval. In this example, an occurrence is the arrival of a patient at the emergency
ward, the interval is one hour (an interval of time), and the occurrences are random.
The total number of patients who may arrive at this emergency ward during a one-hour
interval may be 0, 1, 2, 3, 4, .... The independence of occurrences in this example means that
patients arrive individually and the arrival of any two (or more) patients is not related.
2. Consider the number of defective items in the next 10,000 items manufactured on a machine.
In this case, the interval is a volume interval (10,000 items). The occurrences (number of
defective items) are random because there may be 0, 1, 2, 3, ..., 10000 defective items in 10,000
items. We can assume the occurrence of defective items to be independent of one another.
3. Consider the number of defects in a 5-foot-long iron rod. The interval, in this example, is
a space interval (5 feet). The occurrences (defects) are random because there may be any
15
number of defects in a 5-foot iron rod. We can assume that these defects are independent of
one another.
In the Poisson probability distribution terminology, the average number or occurrences in an
interval is denoted by µ. The actual number of occurrences in that interval is denoted by X.
Then, using the Poisson probability distribution, we find the probability of X occurrences during
an interval given the mean occurrences are µ during that interval.
Theorem 3 (Poisson Probability Distribution Formula) According to the Poisson probabil-
ity distribution, the probability of x occurrences in an interval is
P (x) = P (X = x) =µxe−µ
x!
where µ is the mean number of occurrences in that interval and x = 0, 1, 2, 3, . . ..
Example 10 A washing machine in a laundry breaks down an average of three times per month.
Using the Poisson probability distribution formula, find the probability that during the next month
this machine will have (a) exactly two breakdowns (b) at most one breakdown.
Solution: Let µ be the mean number of breakdowns per month and X be the actual number of
breakdowns observed during the next month for this machine. Then, µ = 3.
(a) The probability that exactly two breakdowns will be observed during the next month is
P (X = 2) =µxe−µ
x!=
(3)2 e−3
2!=
(9)(0.049787)
2= 0.224
(3)2 = 3 SHIFT
xy
2 = = 9
e−3 = 3 +/− SHIFT
ex
= 0.049787
2! = 2 SHIFT
x!
= 2
16
(b) The probability that at most one breakdown will be observed during the next month is given
by the sum of the probabilities of zero and one breakdown. Thus,
P (at most 1 breakdown) = P (X ≤ 1)
= P (X = 0) + P (X = 1)
=(3)0 e−3
0!+
(3)1 e−3
1!
=(1)(0.049787)
1+
(3)(0.049787)
1
= 0.0498 + 0.1494
= 0.1992
Example 11 A study of the lines at the checkout registers of Wellcome Supermarket revealed that,
during a certain period at the rush hour, the number of customers waiting averaged four. What is
the probability that during that period:
(a) No customers were waiting?
(b) Three customers were waiting?
(c) Three or fewer were waiting?
(d) Two or more were waiting?
Solution: Let X be the number customers who were waiting. Note that µ = 4.
(a)
P (X = 0) =40e−4
0!= 0.0183
(b)
P (X = 3) =43e−4
3!= 0.1954
(c)
P (X ≤ 3) = P (X = 0) + P (X = 1) + P (X = 2) + P (X = 3)
=40e−4
0!+
41e−4
1!+
42e−4
2!+
43e−4
3!
= 0.0183 + 0.0733 + 0.1465 + 0.1954
= 0.4335
17
(d)
P (X ≥ 2) = 1 − P (X < 2)
= 1 − P (X = 0) − P (X = 1)
= 1 −40e−4
0!−
41e−4
1!
= 1 − 0.0183 − 0.0733
= 0.9084
Theorem 4 The mean and variance of the Poisson distribution are
µ = E (X) = µ and σ2 = V ar (X) = µ
5.5 Poisson Approximation to the Binomial Distribution
The computation of probabilities for a binomial distribution could be tedious if the number of trials
n is large. In the sub-section, we note that the Poisson distribution can be used to approimate the
binomial probabilities when the number of trials n is large and at the same time the probability p
is small (generally such that µ = np ≤ 7).
Proposition 1 Let X be the number of success resulting from n independent trials, each with
probability of success p. The distribution of the number of success X is binomial, with mean np. If
the number of trials n is large and np is of only moderate size (perferably np ≤ 7), this distribution
can be approximated by the Poisson distribution with µ = np. The probability of the approximating
distribution is then:
P (X = x) =e−np(np)x
x!for x = 0, 1, 2, 3, . . . (1)
Example 12 An analyst predicted that 3.5% of small corporations would file for bankruptcy in the
coming year. For a random sample of 100 small corporations, estimate the probability that at least
three will file for bankrupty in the next year, assuming that the analyst’s prediction is correct.
The distribution of the number X of of filinfs for bankrupty is binomial with n = 100 and
p = 0.035, so that the mean of the distribution is µ = np = 3.5. Using the Poisson distribution to
18
approximate the probability of at least three bankruptcies, we find
P (X ≥ 3) = 1 − P (X ≤ 2)
= 1 − (e−3.5(3.5)0
0!+
e−3.5(3.5)1
1!+
e−3.5(3.5)2
2!)
= 1 − (0.03 + 0.106 + 0.185)
= 0.679.
5.6 Hypergeometric Probability Distribution
One of the applications of the binomial probability distribution is its use as the probability distribu-
tion for the number X of favorable (or unfavorable) responses in a public opinion or market survey.
Ideally, it is applicable when sampling is with replacement—that is, each item drawn from the
population is observed and returned to the population before the next item is drawn. Practically
speaking, the sampling in surveys is rarely conducted with replacement. Nevertheless, the binomial
probability distribution is still appropriate if the number N of elements in the population is large
and the sample size n is small relative to N .
When sampling is without replacement, and the number of elements N in the population is
small (or when the sample size n is large relative to N), the number of “successes” in a random
sample of n items has a hypergeometric probability distribution. The defining characteristics and
probability for a hypergeometric random variable are stated in the boxes.
Characteristics That Define a Hypergeometric Random Variable
1. The experiment consists of randomly drawing n elements without replacement from a set of
N elements, r of which are “successes” and N − r of which are “failures.”
2. The hypergeometric random variable X is the number of r’s in the draw of n elements.
Theorem 5 (Hypergeometric Probability Distribution Formula) For a Hypergeometric Prob-
ability Distribution, the probability of exactly x successes in n trials is given by the Hypergeometric
Probability distribution formula
P (x) = P (X = x) =
(
r
x
)(
N − r
n − x
)
(
N
n
)
19
where
N = Number of elements in the population
r = Number of “successes” in the population
n = Sample size
x = Number of “successes” in the sample
Assumptions:
1. The sample of n elements is randomly selected from the N elements of the population.
2. The value of X is restricted so that all factorials are nonnegative.
Example 13 Of six lecturers, four have been with the college five or more years. If three lecturers
are chosen randomly from the group of six, find the probability that exactly two will have five or
more years seniority.
Solution: For this problem, the number of elements in the population is N = 6, the number of
successes is r = 4, and the sample size is n = 3. The required probability is
P (X = 2) =
(
4
2
)(
6−4
3−2
)
(
6
3
)
=
(
4
2
)(
2
1
)
(
6
3
)
=(6) (2)
20
=12
20
=3
5
Example 14 Suppose you plan to select four of 10 new stock issues (stock issued by new companies)
and that, unknown to you, three of the 10 will result in substantial profits and seven will result in
losses.
(a) What is the probability that all three of the profitable issues will appear in your selection?
(b) What is the probability that at least two of the three profitable issues will appear in your
selection?
20
Solution: For this problem, the number of elements in the population is N = 10, the number of
successes is r = 3, and the sample size is n = 4.
(a) We use the formula for the hypergeometric probability distribution to compute the proba-
bility of observing exactly X = 3 successes in the sample of n = 4.
P (X = 3) =
(
3
3
)(
10−3
4−3
)
(
10
4
)
=
(
3
3
)(
7
1
)
(
10
4
)
=1 × 7
210
=1
30
(b) The probability of observing at least two successes in the sample of n = 4 is
P (X ≥ 2) = P (X = 2) + P (X = 3)
=
(
3
2
)(
10−3
4−2
)
(
10
4
) +
(
3
3
)(
10−3
4−3
)
(
10
4
)
=
(
3
2
)(
7
2
)
(
10
4
) +
(
3
3
)(
7
1
)
(
10
4
)
=3 × 21
210+
1 × 7
210
=9
30+
1
30
=10
30
=1
3
Theorem 6 The mean and variance of the Hypergeometric Distribution are
µ = E (X) =nr
Nand σ2 = V ar (X) = n
( r
N
)
(
N − r
N
) (
N − n
N − 1
)
The expression (N − n) / (N − 1) is known as the finite population correction factor.
5.7 Negative Binomial Distribution
Suppose that independent trials, each having probability p, 0 < p < 1, of being a success are
performed until a total of r successes is accumulated. If we let X be the number of trials required,
21
then
P (X = k) = Ck−1r−1 pr(1 − p)k−r k = r, r + 1, . . . (2)
The above equation follows because, in order for the r-th success to occur at the n-th trial, there
must be r-1 successes in the first n-1 trials, and the n-th trial must be a success. Any random
variable X whose probability mass function is given by (2) is said to be a negative binomial random
variable with parameters (r, p), written as X ∼ NegBin(r, p).
Theorem 7 The mean and variance of the Negative Binomial distribution with parameters r, p are
equal tor
pand
r(1 − p)
p2respectively.
Example 15 If independent trials, each resulting in a success with probability p, are performed,
what is the probability of r successes occurring before m failures?
The solution will be arrived at by noting that r successes will occur before m failures if and
only if the r-th success occurs no later than the r + m − 1 trial. This follows because if the r-th
success occurs before or at the r + m − 1 trial, then it must have occurred before the m-th failure,
and conversely. Hence, from equation (2), the desired probability is
r+m−1∑
k=r
Ck−1r−1 pr(1 − p)k−r.
Example 16 A company takes out an insurance policy to cover accidents that occur at its manu-
facturing plant. The probability that one or more accidents will occur during any given month is3
5.
The number of accidents that occur in any given month is independent of the number of accidents
that occur in all other months. Calculate the probability that there will be at least four months in
which no accidents occur before the fourth month in which at least one accident occurs.
If a month with one or more accidents is regarded as success and X be the number of failures be-
fore the fourth success, then X follows a negative binomial distribution of the requested probability
is
P (X ≥ 4) = 1 − P (X ≤ 3)
= 1 −
3∑
k=0
C3+kk
(
3
5
)4 (
2
5
)k
= 1 −
(
3
5
)4 (
1 +8
5+
8
5+
32
25
)
= 0.2898.
22
top related