The Hypergeometric Distribution The Poisson Distribution

Lecture 7: Special Probability Distributions - 2

Assist. Prof. Dr. Emel YAVUZ DUMAN

Introduction to Probability and Statistics
İstanbul Kültür University

Outline

1 The Hypergeometric Distribution

2 The Poisson Distribution

1 The Hypergeometric Distribution

Many times we used sampling with and without replacement to illustrate the multiplication rules for independent and dependent events. To obtain a formula analogous to the binomial distribution that applies to sampling without replacement, in which case the trials are not independent, let us consider a set of N elements of which M are looked upon as successes and the other N − M as failures. As with the binomial distribution, we are interested in the probability of getting x successes in n trials, but now we are choosing, without replacement, n of the N elements contained in the set.

There are $\binom{M}{x}$ ways of choosing x of the M successes and $\binom{N-M}{n-x}$ ways of choosing n − x of the N − M failures, and hence $\binom{M}{x}\binom{N-M}{n-x}$ ways of choosing x successes and n − x failures.

Since there are $\binom{N}{n}$ ways of choosing n of the N elements in the set, and we shall assume that they are all equally likely (which is what we mean when we say that the selection is random), the probability of x successes in n trials is $\binom{M}{x}\binom{N-M}{n-x} \big/ \binom{N}{n}$.

Definition 1

A random variable X has a hypergeometric distribution, and is referred to as a hypergeometric random variable, if and only if its probability distribution is given by

$$h(x; n, N, M) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} \quad \text{for } x = 0, 1, 2, \ldots, n,\ x \le M \text{ and } n - x \le N - M.$$

Thus, for sampling without replacement, the number of successes in n trials is a random variable having a hypergeometric distribution with parameters n, N, and M.
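The formula in Definition 1 can be evaluated directly with integer binomial coefficients. The following is a minimal sketch in Python. The function name `hypergeom_pmf` is an illustrative choice and not part of the lecture, and the code relies only on `math.comb` from the standard library.

```python
from math import comb


def hypergeom_pmf(x: int, n: int, N: int, M: int) -> float:
    """h(x; n, N, M): probability of x successes when n of N elements
    (M of which are successes) are drawn without replacement."""
    # Outside the support stated in Definition 1 the probability is zero.
    if x < 0 or x > n or x > M or n - x > N - M:
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


if __name__ == "__main__":
    # Illustrative call: 6 of 24 items sampled, 4 successes in the set,
    # probability that the sample contains none of them.
    print(hypergeom_pmf(0, 6, 24, 4))
```

Working with exact integers until the final division avoids the overflow and rounding problems that arise when the factorials are computed in floating point.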

Example 2

As part of an air-pollution survey, an inspector decides to examine the exhaust of six of a company's 24 trucks. If four of the company's trucks emit excessive amounts of pollutants, what is the probability that none of them will be included in the inspector's sample?

Solution. Substituting x = 0, n = 6, N = 24, and M = 4 into the formula for the hypergeometric distribution, we get

$$h(x; n, N, M) = h(0; 6, 24, 4) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{4}{0}\binom{24-4}{6-0}}{\binom{24}{6}} = 0.2880.$$

Example 3

Draw 6 cards from a standard 52-card deck without replacement. What is the probability of getting exactly two hearts?

Solution. Substituting x = 2, n = 6, N = 52, and M = 13 into the formula for the hypergeometric distribution, we get

$$h(x; n, N, M) = h(2; 6, 52, 13) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{13}{2}\binom{52-13}{6-2}}{\binom{52}{6}} = 0.31513.$$

Example 4

Forty-nine balls are numbered 1 to 49, and six of them are drawn at random. You select six numbers between 1 and 49 and write them on your lotto card. What is the probability that your numbers match (a) exactly four, (b) all six of the numbers drawn?

Solution. (a) Substituting x = 4, n = 6, N = 49, and M = 6 into the formula for the hypergeometric distribution, we get

$$h(x; n, N, M) = h(4; 6, 49, 6) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{6}{4}\binom{49-6}{6-4}}{\binom{49}{6}} = 9.6862 \times 10^{-4}.$$

(b) Substituting x = 6, n = 6, N = 49, and M = 6 into the formula for the hypergeometric distribution, we get

$$h(x; n, N, M) = h(6; 6, 49, 6) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{6}{6}\binom{49-6}{6-6}}{\binom{49}{6}} = 7.1511 \times 10^{-8}.$$
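Because these probabilities are small, it helps to write the binomial coefficients out explicitly. The arithmetic below is a check on both parts and uses only the values N = 49, M = 6, n = 6 from the example.

$$\binom{6}{4} = 15, \qquad \binom{43}{2} = \frac{43 \cdot 42}{2} = 903, \qquad \binom{49}{6} = 13{,}983{,}816,$$

$$h(4; 6, 49, 6) = \frac{15 \cdot 903}{13{,}983{,}816} = \frac{13{,}545}{13{,}983{,}816} \approx 9.6862 \times 10^{-4}, \qquad h(6; 6, 49, 6) = \frac{1 \cdot 1}{13{,}983{,}816} \approx 7.1511 \times 10^{-8}.$$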

Theorem 5

The mean and the variance of the hypergeometric distribution are

$$\mu = \frac{nM}{N} \quad \text{and} \quad \sigma^2 = \frac{nM(N-M)(N-n)}{N^2(N-1)}.$$

Example 6

Suppose that a researcher goes to a small college with 200 faculty members, 12 of whom have blood type O-negative. She obtains a simple random sample of 20 of the faculty. Determine the mean and standard deviation of the number of randomly selected faculty members who will have blood type O-negative.

Solution. Substituting n = 20, N = 200, and M = 12 into the formulas for the hypergeometric distribution's mean and variance, we obtain

$$\mu = \frac{nM}{N} = \frac{20 \cdot 12}{200} = 1.2$$

and

$$\sigma = \sqrt{\frac{nM(N-M)(N-n)}{N^2(N-1)}} = \sqrt{\frac{20 \cdot 12\,(200-12)(200-20)}{200^2 \cdot (200-1)}} = 1.0101.$$

We expect that, in a random sample of 20 faculty members, 1.2 on average will have blood type O-negative.
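The formulas of Theorem 5 translate into a few lines of code. This is a small sketch; the function name `hypergeom_mean_var` is hypothetical and chosen only for illustration.

```python
from math import sqrt


def hypergeom_mean_var(n: int, N: int, M: int) -> tuple[float, float]:
    """Return (mean, variance) of the hypergeometric distribution,
    using the closed forms of Theorem 5."""
    mean = n * M / N
    var = n * M * (N - M) * (N - n) / (N**2 * (N - 1))
    return mean, var


if __name__ == "__main__":
    # Sample of 20 from 200 faculty members, 12 of whom are O-negative.
    mean, var = hypergeom_mean_var(n=20, N=200, M=12)
    print(mean, sqrt(var))
```

The factor (N − n)/(N − 1) in the variance is what distinguishes sampling without replacement from the binomial case; it tends to 1 when n is small relative to N.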

Example 7

A case of wine has 12 bottles, 3 of which contain spoiled wine. A sample of 4 bottles is randomly selected from the case.

(a) Find the probability distribution for X, the number of bottles of spoiled wine in the sample.

(b) What are the mean and variance of X?

Solution. For this example n = 4, N = 12, and M = 3. Then

$$h(x; 4, 12, 3) = \frac{\binom{3}{x}\binom{9}{4-x}}{\binom{12}{4}}.$$

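(a) The possible values for X are 0, 1, 2 and 3, with probabilities

$$h(0; 4, 12, 3) = \frac{\binom{3}{0}\binom{9}{4}}{\binom{12}{4}} = 0.25, \qquad h(1; 4, 12, 3) = \frac{\binom{3}{1}\binom{9}{3}}{\binom{12}{4}} = 0.51,$$

$$h(2; 4, 12, 3) = \frac{\binom{3}{2}\binom{9}{2}}{\binom{12}{4}} = 0.22, \qquad h(3; 4, 12, 3) = \frac{\binom{3}{3}\binom{9}{1}}{\binom{12}{4}} = 0.02.$$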

(b) The mean is given by

$$\mu = \frac{nM}{N} = \frac{4 \cdot 3}{12} = 1$$

and the variance is

$$\sigma^2 = \frac{nM(N-M)(N-n)}{N^2(N-1)} = \frac{4 \cdot 3\,(12-3)(12-4)}{12^2\,(12-1)} = 0.5455.$$
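The closed-form mean and variance can be cross-checked against the definitions μ = Σ x·h(x) and σ² = Σ (x − μ)²·h(x). The sketch below does this with exact fractions, so no rounding enters the comparison. The helper names `hypergeom_table` and `moments` are illustrative assumptions, not notation from the lecture.

```python
from fractions import Fraction
from math import comb


def hypergeom_table(n: int, N: int, M: int) -> dict[int, Fraction]:
    """Exact probabilities h(x; n, N, M) for every x in the support."""
    lo = max(0, n - (N - M))  # cannot draw more failures than exist
    hi = min(n, M)            # cannot draw more successes than exist
    total = comb(N, n)
    return {x: Fraction(comb(M, x) * comb(N - M, n - x), total)
            for x in range(lo, hi + 1)}


def moments(table: dict[int, Fraction]) -> tuple[Fraction, Fraction]:
    """Mean and variance computed directly from a probability table."""
    mean = sum(x * p for x, p in table.items())
    var = sum((x - mean) ** 2 * p for x, p in table.items())
    return mean, var


if __name__ == "__main__":
    table = hypergeom_table(n=4, N=12, M=3)  # the wine-case parameters
    for x, p in table.items():
        print(x, p, float(p))
    print(moments(table))
```

For n = 4, N = 12, M = 3 the exact probabilities are 14/55, 28/55, 12/55 and 1/55, and the moments come out as 1 and 6/11, in agreement with the formulas of Theorem 5.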

Binomial Approximation to Hypergeometric Distribution

When N is large and n is relatively small compared to N (the usual rule of thumb is that n should not exceed 5 percent of N), there is not much difference between sampling with replacement and sampling without replacement, and the formula for the binomial distribution with the parameters n and θ = M/N may be used to approximate hypergeometric probabilities.

Example 8

Among the 120 applicants for a job, only 80 are actually qualified. If five of the applicants are randomly selected for an in-depth interview, find the probability that only two of the five will be qualified for the job by using

(a) the formula for the hypergeometric distribution;

(b) the formula for the binomial distribution with θ = 80/120 as an approximation.

Solution. (a) Substituting x = 2, n = 5, N = 120, and M = 80 into the formula for the hypergeometric distribution, we get

$$h(x; n, N, M) = h(2; 5, 120, 80) = \frac{\binom{80}{2}\binom{40}{3}}{\binom{120}{5}} = 0.164,$$

rounded to three decimals.

(b) Substituting x = 2, n = 5, and θ = 80/120 = 2/3 into the formula for the binomial distribution, we get

$$b\left(2; 5, \tfrac{2}{3}\right) = \binom{5}{2}\left(\tfrac{2}{3}\right)^{2}\left(1 - \tfrac{2}{3}\right)^{3} = 0.165,$$

rounded to three decimals. As can be seen from these results, the approximation is very close.

Example 9

A box contains 2000 items, of which 10% are defective. Find the probability that no more than 2 defectives will be obtained in a sample of size 10.

Solution. For this question x is equal to 0, 1 or 2, n = 10, N = 2000 and M = 2000 · 0.10 = 200. Since

n = 10 ≤ 100 = 2000 · 0.05 = N · 0.05,

n does not exceed 5 percent of N, so we may also use the binomial approximation to the hypergeometric distribution.

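(a) The hypergeometric distribution:

$$\begin{aligned}
P(X \le 2) &= P(X = 0) + P(X = 1) + P(X = 2) \\
&= \frac{\binom{200}{0}\binom{1800}{10}}{\binom{2000}{10}} + \frac{\binom{200}{1}\binom{1800}{9}}{\binom{2000}{10}} + \frac{\binom{200}{2}\binom{1800}{8}}{\binom{2000}{10}} \\
&= 0.3478 + 0.3884 + 0.1941 = 0.9303.
\end{aligned}$$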

(b) Binomial approximation to the hypergeometric distribution with θ = M/N = 200/2000 = 0.1:

$$\begin{aligned}
P(X \le 2) &= P(X = 0) + P(X = 1) + P(X = 2) \\
&= \binom{10}{0} 0.1^{0}\, 0.9^{10} + \binom{10}{1} 0.1^{1}\, 0.9^{9} + \binom{10}{2} 0.1^{2}\, 0.9^{8} \\
&= 0.3487 + 0.3874 + 0.1937 = 0.9298.
\end{aligned}$$
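The comparison between the exact hypergeometric probability and its binomial approximation can be reproduced numerically. The following sketch assumes nothing beyond the two probability formulas given in the lecture; the function names, including `compare_at_most`, are illustrative.

```python
from math import comb


def hypergeom_pmf(x: int, n: int, N: int, M: int) -> float:
    """Exact probability of x successes, sampling without replacement."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def binom_pmf(x: int, n: int, theta: float) -> float:
    """Binomial probability b(x; n, theta)."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def compare_at_most(k: int, n: int, N: int, M: int) -> tuple[float, float]:
    """Return (exact, approximate) values of P(X <= k), where the
    approximation uses the binomial distribution with theta = M / N."""
    exact = sum(hypergeom_pmf(x, n, N, M) for x in range(k + 1))
    approx = sum(binom_pmf(x, n, M / N) for x in range(k + 1))
    return exact, approx


if __name__ == "__main__":
    # At most 2 defectives in a sample of 10 from 2000 items, 200 defective.
    print(compare_at_most(k=2, n=10, N=2000, M=200))
```

With n at only 0.5 percent of N, the two values agree to three decimal places, which is consistent with the 5 percent rule of thumb.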

2 The Poisson Distribution

When n, the number of trials, is large, the calculation of binomial probabilities with the formula of the binomial distribution will usually involve a prohibitive amount of work. In this section we shall present a probability distribution that can be used to approximate binomial probabilities of this kind. Specifically, we shall investigate the limiting form of the binomial distribution when n → ∞, θ → 0, while nθ remains constant.
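A sketch of this limit, writing λ for the constant value of nθ (so that θ = λ/n) and b(x; n, θ) for the binomial probability:

$$b(x; n, \theta) = \binom{n}{x}\theta^{x}(1-\theta)^{n-x} = \frac{n(n-1)\cdots(n-x+1)}{x!}\left(\frac{\lambda}{n}\right)^{x}\left(1-\frac{\lambda}{n}\right)^{n-x}$$

$$= \frac{\lambda^{x}}{x!} \cdot \underbrace{\frac{n(n-1)\cdots(n-x+1)}{n^{x}}}_{\to\, 1} \cdot \underbrace{\left(1-\frac{\lambda}{n}\right)^{n}}_{\to\, e^{-\lambda}} \cdot \underbrace{\left(1-\frac{\lambda}{n}\right)^{-x}}_{\to\, 1} \;\longrightarrow\; \frac{\lambda^{x} e^{-\lambda}}{x!} \quad \text{as } n \to \infty.$$

Here x is held fixed, so the first and third factors each consist of a fixed number of terms tending to 1.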

Definition 10

A random variable X has a Poisson distribution, and is referred to as a Poisson random variable, if and only if its probability distribution is given by

$$p(x; \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!} \quad \text{for } x = 0, 1, 2, \ldots$$

where λ > 0 is the mean number of successes.

In general, the Poisson distribution will provide a good approximation to binomial probabilities when n ≥ 20 and θ ≤ 0.05. When n ≥ 100 and nθ < 10, the approximation will generally be excellent.
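To see how close the approximation is in practice, the two probability formulas can be evaluated side by side. The sketch below uses n = 400 and θ = 0.02 as an illustrative case that satisfies n ≥ 100 and nθ < 10. The function names are assumptions made for this example.

```python
from math import comb, exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """p(x; lambda) = lambda^x e^(-lambda) / x!"""
    return lam**x * exp(-lam) / factorial(x)


def binom_pmf(x: int, n: int, theta: float) -> float:
    """b(x; n, theta) = C(n, x) theta^x (1 - theta)^(n - x)"""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


if __name__ == "__main__":
    n, theta = 400, 0.02  # n >= 100 and n * theta = 8 < 10
    for x in range(0, 13, 2):
        exact = binom_pmf(x, n, theta)
        approx = poisson_pmf(x, n * theta)
        print(f"x={x:2d}  binomial={exact:.5f}  poisson={approx:.5f}")
```

The Poisson column needs only λ = nθ, which is why the approximation saves work when n is large.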

Example 11

If 2 percent of books bound at a certain bindery have defective bindings, use the Poisson approximation to the binomial distribution to determine the probability that five of 400 books bound by this bindery will have defective bindings.

Solution. Substituting x = 5, λ = nθ = 400 · 0.02 = 8 into the formula for the Poisson distribution, we get

$$p(5; 8) = \frac{8^{5} e^{-8}}{5!} = 0.09160.$$

Example 12

Records show that the probability is 0.00005 that a car will have a flat tire while crossing a certain bridge. Use the Poisson distribution to approximate the binomial probabilities that, among 10,000 cars crossing the bridge,

(a) exactly two will have a flat tire;

(b) at most two will have a flat tire.

Solution. (a) Substituting x = 2, λ = nθ = 10,000 · 0.00005 = 0.5 into the formula for the Poisson distribution, we get

$$p(2; 0.5) = \frac{0.5^{2} e^{-0.5}}{2!} = 0.07582.$$

(b)

$$p(2; 0.5) + p(1; 0.5) + p(0; 0.5) = \frac{0.5^{2} e^{-0.5}}{2!} + \frac{0.5^{1} e^{-0.5}}{1!} + \frac{0.5^{0} e^{-0.5}}{0!} = 0.07582 + 0.30327 + 0.60653 = 0.98562.$$

Having derived the Poisson distribution as a limiting form of the binomial distribution, we can obtain formulas for its mean and its variance by applying the same limiting conditions (n → ∞, θ → 0 and nθ = λ remains constant) to the mean and the variance of the binomial distribution. For the mean we get μ = nθ = λ, and for the variance we get σ² = nθ(1 − θ) = λ(1 − θ), which approaches λ when θ → 0.

Theorem 13

The mean and the variance of the Poisson distribution are given by

$$\mu = \lambda \quad \text{and} \quad \sigma^2 = \lambda.$$

Theorem 14

The moment generating function of the Poisson distribution is given by

$$M_X(t) = e^{\lambda(e^{t} - 1)}.$$
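Both theorems can be checked directly from the definition of the Poisson distribution. The following is a short derivation sketch using the series for the exponential function:

$$M_X(t) = E\left(e^{tX}\right) = \sum_{x=0}^{\infty} e^{tx}\,\frac{\lambda^{x} e^{-\lambda}}{x!} = e^{-\lambda}\sum_{x=0}^{\infty}\frac{(\lambda e^{t})^{x}}{x!} = e^{-\lambda}\, e^{\lambda e^{t}} = e^{\lambda(e^{t}-1)}.$$

Differentiating twice with respect to t gives

$$M_X'(t) = \lambda e^{t}\, e^{\lambda(e^{t}-1)}, \qquad M_X''(t) = \left(\lambda e^{t} + \lambda^{2} e^{2t}\right) e^{\lambda(e^{t}-1)},$$

so that

$$\mu = M_X'(0) = \lambda, \qquad E(X^{2}) = M_X''(0) = \lambda + \lambda^{2}, \qquad \sigma^{2} = E(X^{2}) - \mu^{2} = \lambda.$$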

Although the Poisson distribution has been derived as a limiting form of the binomial distribution, it has many applications that have no direct connection with the binomial distribution. In many practical situations we are interested in measuring how many times a certain event occurs in a specific time interval or in a specific length or area. For instance:

1 the number of phone calls received at an exchange or call center in an hour;

2 the number of customers arriving at a toll booth per day;

3 the number of flaws on a length of cable;

4 the number of cars using a stretch of road during a day.

The Poisson distribution plays a key role in modeling such problems.

Suppose we are given an interval (this could be time, length, area or volume) and we are interested in the number of successes in that interval. Assume that the interval can be divided into very small subintervals such that:

1 the probability of more than one success in any subinterval is zero;

2 the probability of one success in a subinterval is constant for all subintervals and is proportional to its length;

3 subintervals are independent of each other.

We assume the following.

1 The random variable X denotes the number of successes in the whole interval.

2 λ is the mean number of successes in the interval.

Then X has a Poisson distribution with parameter λ, and

$$P(X = x) = p(x; \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots.$$
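The subinterval assumptions tie this model back to the binomial limit: if the interval is cut into n subintervals, each holding one success with probability λ/n independently of the others, the total count is binomial with parameters n and λ/n. The numerical sketch below (with illustrative function names and an arbitrary choice of λ = 2) shows the largest gap between that binomial distribution and p(x; λ) shrinking as n grows.

```python
from math import comb, exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """p(x; lambda) = lambda^x e^(-lambda) / x!"""
    return lam**x * exp(-lam) / factorial(x)


def subdivided_pmf(x: int, lam: float, n: int) -> float:
    """P(x successes) when the interval is split into n independent
    subintervals, each holding one success with probability lam / n."""
    if x > n:
        return 0.0  # at most one success per subinterval
    theta = lam / n
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def max_gap(lam: float, n: int, x_max: int = 20) -> float:
    """Largest absolute difference between the two pmfs for x = 0..x_max."""
    return max(abs(subdivided_pmf(x, lam, n) - poisson_pmf(x, lam))
               for x in range(x_max + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, max_gap(lam=2.0, n=n))
```

This is a deterministic comparison rather than a simulation, so the output is reproducible from run to run.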

Example 15

The average number of trucks arriving on any one day at a truck depot in a certain city is known to be 12. What is the probability that on a given day fewer than nine trucks will arrive at this depot?

Solution. Let X be the number of trucks arriving on a given day. Then, using the Poisson distribution with λ = 12, we get

$$\begin{aligned}
P(X < 9) &= \sum_{x=0}^{8} p(x; 12) = \sum_{x=0}^{8} \frac{12^{x} e^{-12}}{x!} \\
&= e^{-12}\left(\frac{12^{0}}{0!} + \frac{12^{1}}{1!} + \frac{12^{2}}{2!} + \frac{12^{3}}{3!} + \frac{12^{4}}{4!} + \frac{12^{5}}{5!} + \frac{12^{6}}{6!} + \frac{12^{7}}{7!} + \frac{12^{8}}{8!}\right) = 0.1550.
\end{aligned}$$
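Sums of this kind are tedious by hand but short in code. A minimal sketch of a cumulative Poisson probability, with the hypothetical name `poisson_cdf`, is:

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for a Poisson random variable with mean lam,
    obtained by summing p(x; lam) for x = 0, 1, ..., k."""
    return sum(lam**x * exp(-lam) / factorial(x) for x in range(k + 1))


if __name__ == "__main__":
    # "Fewer than nine" arrivals means X <= 8.
    print(poisson_cdf(8, 12))
```

Probabilities of the form P(X ≥ k) follow from the complement, 1 − `poisson_cdf(k - 1, lam)`.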

Example 16

The number of flaws in a fiber optic cable follows a Poisson distribution. The average number of flaws in 50 m of cable is 1.2.

(a) What is the probability of exactly three flaws in 150 m of cable?

(b) What is the probability of at least two flaws in 100 m of cable?

(c) What is the probability of exactly one flaw in the first 50 m of cable and exactly one flaw in the second 50 m of cable?

Solution. (a) The mean number of flaws in 150 m of cable is 1.2 · 3 = 3.6. So the probability of exactly three flaws in 150 m of cable is

$$p(3; 3.6) = \frac{3.6^{3} e^{-3.6}}{3!} = 0.21247.$$

(b) The mean number of flaws in 100 m of cable is 1.2 · 2 = 2.4. Let X be the number of flaws in 100 m of cable. Then

$$\begin{aligned}
P(X \ge 2) &= 1 - P(X < 2) = 1 - \left(P(X = 0) + P(X = 1)\right) \\
&= 1 - p(0; 2.4) - p(1; 2.4) \\
&= 1 - \frac{2.4^{0} e^{-2.4}}{0!} - \frac{2.4^{1} e^{-2.4}}{1!} = 0.69156.
\end{aligned}$$

(c) Now let X denote the number of flaws in a 50 m section of cable. Then we know that

$$P(X = 1) = p(1; 1.2) = \frac{1.2^{1} e^{-1.2}}{1!} = 0.36143.$$

Because the flaws follow a Poisson process, the numbers of flaws in the first and second 50 m of cable (two non-overlapping sections) are independent. Thus the probability of exactly one flaw in the first 50 m and exactly one flaw in the second 50 m is

$$(0.36143)(0.36143) = 0.13063.$$
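The key step in all three parts is rescaling the mean to the length of cable in question. The sketch below makes that step explicit. The helper `scaled_mean` and its parameter names are illustrative assumptions, not notation from the lecture.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """p(x; lambda) = lambda^x e^(-lambda) / x!"""
    return lam**x * exp(-lam) / factorial(x)


def scaled_mean(rate: float, unit: float, size: float) -> float:
    """Mean count for an interval of length `size`, given an average of
    `rate` events per interval of length `unit`."""
    return rate * size / unit


if __name__ == "__main__":
    lam_150 = scaled_mean(1.2, 50, 150)  # mean flaws in 150 m
    lam_100 = scaled_mean(1.2, 50, 100)  # mean flaws in 100 m
    part_a = poisson_pmf(3, lam_150)
    part_b = 1 - poisson_pmf(0, lam_100) - poisson_pmf(1, lam_100)
    part_c = poisson_pmf(1, 1.2) ** 2    # two independent 50 m sections
    print(part_a, part_b, part_c)
```

The same rescaling applies to time or area, for example when converting a rate per 3 minutes into a mean per minute.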

Example 17

Births in a hospital occur randomly at an average rate of 1.8 births per hour. What is the probability of observing 4 births in a given hour at the hospital?

Solution. If we let X be the number of births in an hour, then X has a Poisson distribution with λ = 1.8:

$$P(X = 4) = p(4; 1.8) = \frac{1.8^{4} e^{-1.8}}{4!} = 0.072302.$$

Example 18

Consider a telephone operator who, on the average, handles five calls every 3 minutes. (a) What is the probability that there will be no calls in the next minute? (b) At least one call?

Solution. If we let X be the number of calls in a minute, then X has a Poisson distribution with λ = 5/3. So

(a) $$P(\text{no calls in the next minute}) = P(X = 0) = p(0; 5/3) = \frac{(5/3)^{0} e^{-5/3}}{0!} = 0.1889;$$

(b) $$P(\text{at least one call}) = P(X \ge 1) = 1 - P(X = 0) = 1 - p(0; 5/3) = 1 - 0.1889 = 0.8111.$$

Example 19

A certain kind of sheet metal has, on the average, five defects per 10 square feet. If we assume a Poisson distribution, what is the probability that a 15-square-foot sheet of the metal will have at least six defects?

Solution. Let X denote the number of defects in a 15-square-foot sheet of the metal. Then, since the unit area is 10 square feet, we have

λ = 5 · 1.5 = 7.5

and

$$\begin{aligned}
P(X \ge 6) &= 1 - P(X \le 5) \\
&= 1 - \left(P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)\right) \\
&= 1 - e^{-7.5}\left(\frac{7.5^{0}}{0!} + \frac{7.5^{1}}{1!} + \frac{7.5^{2}}{2!} + \frac{7.5^{3}}{3!} + \frac{7.5^{4}}{4!} + \frac{7.5^{5}}{5!}\right) \\
&= 1 - 0.2414 = 0.7586.
\end{aligned}$$
