Approximation and Nested Problem

Upload: kaili

Post on 15-Jan-2016


Page 1: Approximation and Nested Problem

Approximation and Nested Problem

Page 2: Approximation and Nested Problem

Hypergeometric Example

Four players are playing a poker game out of a deck of 52 cards. Each player has 13 cards. Let X be the number of Kings one player may have, and answer the following questions.

1. Is X a discrete or continuous random variable?

2. Find an appropriate probability distribution that can be used to describe X. Also, find the corresponding parameter(s).

Page 3: Approximation and Nested Problem

Hypergeometric Example

3. Find the sample space of X and the probability corresponding to each point in the sample space.
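One way to tabulate the answer to question 3 is sketched below. It assumes X ~ HG(52, 13, 4) in the HG(population, sample, successes) notation used later in the deck, so the sample space is {0, 1, 2, 3, 4}. The function name `hypergeom_pmf` and the dictionary `king_pmf` are illustrative names, not part of the slides.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n items are drawn without replacement
    from N items, K of which count as successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Assumed setup: 52 cards, 4 Kings, 13 cards in one player's hand.
king_pmf = {k: hypergeom_pmf(k, N=52, K=4, n=13) for k in range(5)}

for k, prob in king_pmf.items():
    print(f"P(X = {k}) = {prob:.4f}")
```

The five probabilities should sum to 1, which is a quick check that the sample space is complete.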

Page 4: Approximation and Nested Problem

Binomial vs. Poisson

Suppose a random variable X follows a binomial distribution with parameters n and p, X~BIN(n,p).

Sometimes, we can approximate the distribution of X with a Poisson distribution whose λ=np.

This is usually done when n is large and p is small.

Page 5: Approximation and Nested Problem

Example I

A computer chip contains 1000 transistors. Each transistor has probability 0.0025 of being defective. What is the probability that the chip contains at most 4 defective transistors?

This is basically X~BIN(1000, 0.0025), and

P(X<=4)=P(X=0)+P(X=1)+P(X=2)+P(X=3)+P(X=4), where P(X=k)=1000Ck*(0.0025)^k*(0.9975)^(1000-k)

Page 6: Approximation and Nested Problem

Example I

Alternatively, we can treat the number of defective transistors as following a Poisson distribution with λ=1000*0.0025=2.5.

P(X<=4)=P(X=0)+P(X=1)+P(X=2)+P(X=3)+P(X=4), where P(X=k)=e^(-2.5)*2.5^k / k!
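The two sums for Example I can be evaluated side by side. The following is a minimal sketch using only the standard library; the helper names `binomial_cdf` and `poisson_cdf` are illustrative.

```python
from math import comb, exp, factorial

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ BIN(n, p), summing the pmf term by term."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ POI(lam), summing the pmf term by term."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

n, p = 1000, 0.0025
exact = binomial_cdf(4, n, p)      # exact binomial answer
approx = poisson_cdf(4, n * p)     # Poisson approximation with lambda = np = 2.5

print(f"binomial: {exact:.6f}")
print(f"Poisson:  {approx:.6f}")
```

The Poisson sum needs only five simple terms, which is why it was attractive for hand calculation.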

Page 7: Approximation and Nested Problem

Another word on Poisson

A Poisson experiment has the following properties:

1. The probability of an occurrence is the same for any two intervals of equal length/area.

2. The occurrence or non-occurrence in any interval/area is independent of the occurrence or non-occurrence in any other interval/area.

Page 8: Approximation and Nested Problem

Another word on Poisson

In example I, there are 1000 transistors on our chip and there are an average of 2.5 defective transistors.

Given this, if we have some other chips with 10000 transistors, what is the average number of defective transistors? How about a chip with 500 transistors?
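Because the rate of occurrence is the same for regions of equal size, the Poisson mean scales in proportion to the number of transistors. A small sketch of that arithmetic follows, assuming the per-transistor defect rate from Example I carries over to the other chips; `expected_defects` is an illustrative name.

```python
# Rate implied by Example I: 2.5 defects per 1000 transistors.
defect_rate = 2.5 / 1000

def expected_defects(num_transistors, rate=defect_rate):
    """Poisson mean for a chip of the given size, assuming a constant rate."""
    return num_transistors * rate

for size in (500, 1000, 10000):
    print(f"{size:>6} transistors -> lambda = {expected_defects(size)}")
```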

Page 9: Approximation and Nested Problem

How good is the approximation?

Look at two examples:

1. Example Ia: chips with 1000 transistors, each with a 0.0025 chance of being defective.

2. Example Ib: chips with 1000 transistors, each with a 25% chance of being defective.

Page 10: Approximation and Nested Problem

How good is the approximation?

Let’s find the probability that at most 4 transistors on the chip are defective, using both the binomial and the Poisson.

                         Example Ia           Example Ib
                         BIN(1000, 0.0025)    BIN(1000, 0.25)
Binomial                 0.891429             6E-117
Poisson approximation    0.89118              4.4E-101

The approximation is close when p is small (Example Ia) but off by many orders of magnitude when p is not small (Example Ib).

Page 11: Approximation and Nested Problem

Another approximation

Let’s take a step back and consider two sampling schemes and think about the probability (p) of each unit being selected from the population.

1. Sampling with replacement.

2. Sampling without replacement.

Will this p be different if the population has a size of 10, 1000, 1000000000 or infinite?

Page 12: Approximation and Nested Problem

That reminds us of the binomial and hyper-geometric distribution…

Suppose we draw a sample of size n (fixed, say 100) from a population.

If you sample with replacement, p1=the probability that each unit is selected.

If you sample without replacement, p2=the probability that each unit is selected.

As the size of the population increases, the difference between p1 and p2 actually decreases!

In an extreme case, if the population is infinitely large, there is no difference between sampling with or without replacement.
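The shrinking gap can be made concrete with a small sketch. It assumes one particular reading of p1 and p2, which the slides do not spell out: the chance that a given not-yet-selected unit is picked on the last (100th) draw. All names here are illustrative.

```python
def last_draw_probabilities(population_size, sample_size=100):
    """Chance that a specific remaining unit is picked on the final draw.

    Assumed interpretation of the slide's p1 and p2:
    p1 = with replacement (all N units are always available),
    p2 = without replacement (sample_size - 1 units already removed).
    """
    p1 = 1 / population_size
    p2 = 1 / (population_size - sample_size + 1)
    return p1, p2

def relative_gap(population_size, sample_size=100):
    """(p2 - p1) / p1, which simplifies to (n - 1) / (N - n + 1)."""
    p1, p2 = last_draw_probabilities(population_size, sample_size)
    return (p2 - p1) / p1

for N in (1_000, 1_000_000, 1_000_000_000):
    print(f"N = {N:>13,}: relative gap = {relative_gap(N):.2e}")
```

Under this reading the relative gap is about 11% for a population of 1000 and is negligible for a population of a billion.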

Page 13: Approximation and Nested Problem

Binomial approximation to Hyper-geometric

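If the size of the population is large relative to the sample, we can use a binomial distribution to approximate the hyper-geometric distribution with n_bin = n_hg and p = m/N, where N is the population size and m is the number of units in the population that have the characteristic of interest.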

Page 14: Approximation and Nested Problem

Example II

In a population of size 10,000, suppose that 20% of the individuals favor Policy A. A sample of size 100 is taken without replacement, and let X denote the number of sampled individuals in favor of Policy A.

What is the exact distribution of X?

How could we approximate the exact distribution of X?

Page 15: Approximation and Nested Problem

How good is the approximation?

Let’s find the probability that fewer than 30 people in the sample are in favor of the policy.

If we use hyper-geometric, the probability is: 0.989107184.

If we use binomial, the probability is: 0.988751021.
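Both numbers can be reproduced with exact integer binomial coefficients. This is a sketch under the Example II setup (N = 10,000, m = 2,000 in favor, n = 100 sampled); the variable and function names are illustrative.

```python
from math import comb

N, m, n = 10_000, 2_000, 100   # population, number in favor, sample size

def hypergeom_pmf(k):
    """Exact P(X = k) when sampling without replacement."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

def binomial_pmf(k, p=m / N):
    """Approximate P(X = k), treating the draws as independent with p = m/N."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# "Fewer than 30" means X <= 29, so sum k = 0, ..., 29.
p_hg = sum(hypergeom_pmf(k) for k in range(30))
p_bin = sum(binomial_pmf(k) for k in range(30))

print(f"hyper-geometric: {p_hg:.9f}")
print(f"binomial:        {p_bin:.9f}")
```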

Page 16: Approximation and Nested Problem

Nested problems

Suppose we have a problem and we decide to use a binomial distribution to analyze it. Then we will need to figure out n and p.

The same thing will happen to problems where we decide to use other probability distributions.

*** We always need to find the parameters of the distribution to be able to work on it.

Page 17: Approximation and Nested Problem

Nested Problems

A problem is called a nested problem if it is one problem inside another.

For example, the problem first requires us to work on a binomial (n, p), where in order to find the parameter p, we have to work out another problem.

Page 18: Approximation and Nested Problem

Example III

There are 12 independent sections of stat225 and each section has 35 students. From previous experience, on average, 10 students in each section will get an A. If the number of students getting an A follows a Poisson distribution and the probability of getting an A is equal for all sections, answer the following questions.

Page 19: Approximation and Nested Problem

Example III

A. Find the probability that there are 100 A’s in this course.

This question is an application of the properties of Poisson r.v.’s.

Let Xi be the number of students getting an A in section i; then the total number of A’s in this course, X, is the sum of all the Xi’s.

Since Xi~POI(10) and the sections are independent, X~POI(12*10)=POI(120).

Finally, the probability of interest is: P(X=100)=exp(-120)*120^100/100!=0.0068
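The value in part A involves very large and very small factors (120^100 and exp(-120)), so one hedged way to check it is to work in log space. The sketch below uses `math.lgamma(k + 1)` for log(k!); the function name `poisson_pmf` is illustrative.

```python
from math import exp, lgamma, log

def poisson_pmf(k, lam):
    """P(X = k) for X ~ POI(lam), computed via logs to avoid huge intermediates."""
    return exp(-lam + k * log(lam) - lgamma(k + 1))

lam_total = 12 * 10            # sum of 12 independent POI(10) sections
p_100_As = poisson_pmf(100, lam_total)

print(f"P(X = 100) = {p_100_As:.4f}")
```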

Page 20: Approximation and Nested Problem

Example III

B. What is the probability that half of the sections have fewer than 10 A’s?

This is a nested problem.

Each section may or may not have fewer than 10 A’s, and that could be considered as a Bernoulli random variable with parameter p (the probability that a section has fewer than 10 A’s). This part can be solved using a Poisson random variable.

Then the number of sections with fewer than 10 A’s in this course can be considered as a binomial (n, p) where n=12 and p is what we found in the previous step.

This is a Poisson nested within a binomial.

Page 21: Approximation and Nested Problem

Example III

Step 1: find p.

Let Xi be the number of A’s in section i. Since it follows a POI(10), the probability that there are fewer than 10 A’s in one section is:

P(Xi<10)=P(Xi=0)+P(Xi=1)+P(Xi=2)+P(Xi=3)+P(Xi=4)+…+P(Xi=9), where P(Xi=k)=exp(-10)*10^k/k!

Finally, we have P(Xi<10)=0.46, and this is the p for the binomial random variable.

Page 22: Approximation and Nested Problem

Example III

Step 2: now we know that each section has a 46% probability of having fewer than 10 A’s. We have 12 independent sections, so the number of sections that have fewer than 10 A’s, denoted by Y, follows a binomial distribution with parameters (12, 0.46).

We are interested in the probability that exactly half of the sections (6 of the 12) have fewer than 10 A’s, so it is calculated as: P(Y=6)=12C6(0.46^6)(1-0.46)^6=21.7%
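The two nested steps can be chained in a few lines. This is a sketch with illustrative helper names; it computes the answer both with p rounded to 0.46, as on the slide, and with the unrounded Poisson value, which gives a slightly smaller result.

```python
from math import comb, exp, factorial

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ POI(lam)."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ BIN(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Step 1 (inner problem): p = P(Xi < 10) = P(Xi <= 9) with Xi ~ POI(10).
p_section = poisson_cdf(9, 10)

# Step 2 (outer problem): Y ~ BIN(12, p), and we want P(Y = 6).
p_half_rounded = binomial_pmf(6, 12, 0.46)      # p rounded as on the slide
p_half = binomial_pmf(6, 12, p_section)         # p kept at full precision

print(f"p = {p_section:.4f}")
print(f"P(Y = 6) with p = 0.46: {p_half_rounded:.4f}")
print(f"P(Y = 6) with unrounded p: {p_half:.4f}")
```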

Page 23: Approximation and Nested Problem

Example IV

Two players, A and B, are playing a card game. They start with a deck of 26 cards, obtained by removing all clubs and diamonds. A deals 13 cards at random to B and keeps the other 13. A wins if he gets more Aces than B. If both A and B get one Ace, the one with the spade Ace wins. They repeat the game 10 times. Find the probability that A wins 6 times.

Assuming the games are independent, the number of games A wins follows a binomial distribution with parameters (10, p).

Page 24: Approximation and Nested Problem

Example IV

Step 1: find p.

A wins when he has two Aces or only the spade Ace. Let’s find the probability of each event separately.

A. A has two Aces: let X1 be the number of Aces in A’s hand; then X1~HG(26, 13, 2), therefore P(X1=2)=2C2*24C11/26C13=0.24

B. A has only the spade Ace: we can calculate the probability directly, which is 24C12/26C13=0.26 (note that we don’t want to use 25C12 in the numerator, since that would also include the hands with 2 Aces)

Finally, P(A wins)=0.24+0.26=0.5

Page 25: Approximation and Nested Problem

Example IV

Step 2: find the probability that A wins 6 times.

Let Y be the number of times A wins; then Y~BIN(10, 0.5) and P(Y=6)=10C6*0.5^10=0.205

What if we want to calculate the probability that A wins 5 times?

Then P(Y=5)=10C5*0.5^10=0.246
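Both steps of Example IV can be checked numerically. The sketch below follows the counting argument on the slides; the variable names are illustrative.

```python
from math import comb

# Step 1 (inner, hyper-geometric): probability that A wins a single game.
hands = comb(26, 13)                                  # all possible 13-card hands for A
p_two_aces = comb(2, 2) * comb(24, 11) / hands        # both Aces plus 11 non-Aces
p_spade_only = comb(24, 12) / hands                   # spade Ace plus 12 non-Aces
p_win = p_two_aces + p_spade_only

# Step 2 (outer, binomial): number of wins in 10 independent games.
def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ BIN(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_six_wins = binomial_pmf(6, 10, p_win)
p_five_wins = binomial_pmf(5, 10, p_win)

print(f"P(A wins a game) = {p_win:.2f}")
print(f"P(Y = 6) = {p_six_wins:.3f}")
print(f"P(Y = 5) = {p_five_wins:.3f}")
```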

Page 26: Approximation and Nested Problem

Example IV

If they repeat the game 100 times, how many times do you expect B to win?
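Since P(A wins)=0.5 from Step 1, B also wins each game with probability 0.5, so the expected number of B’s wins in 100 games is 100*0.5=50. As a sanity check, the game can also be simulated. The sketch below is a Monte Carlo estimate, not part of the original slides; the card encoding, seed, and trial count are arbitrary assumptions.

```python
import random

# Assumed encoding: cards are numbered 0..25; 0 is the spade Ace,
# 1 is the heart Ace, and 2..25 are the 24 non-Ace cards.
SPADE_ACE, HEART_ACE = 0, 1

def a_wins_one_game(rng):
    """Deal 13 random cards to B (A keeps the rest) and apply the rules."""
    b_hand = set(rng.sample(range(26), 13))
    b_aces = (SPADE_ACE in b_hand) + (HEART_ACE in b_hand)
    a_aces = 2 - b_aces
    if a_aces != b_aces:
        return a_aces > b_aces
    # One Ace each: whoever holds the spade Ace wins.
    return SPADE_ACE not in b_hand

def estimate_b_win_rate(trials=20_000, seed=0):
    """Fraction of simulated games won by B."""
    rng = random.Random(seed)
    b_wins = sum(not a_wins_one_game(rng) for _ in range(trials))
    return b_wins / trials

b_rate = estimate_b_win_rate()
print(f"estimated expected B wins in 100 games: {100 * b_rate:.1f}")
```

The estimate should land close to 50, with the exact figure depending on the seed and the number of trials.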

Page 27: Approximation and Nested Problem

Example V

Someone wants to open a store in downtown Lafayette. He has decided to have his store open Monday through Saturday but has not decided on the hours yet. He is torn between opening at 8 or 9. He is willing to open the store at 8 if there are more than 10 customers visiting between 8 and 9 on at least four days of a week. Some quick research told him that on average, there are about 5 customers visiting a store in the neighborhood between 8 and 9. What is the probability that the storekeeper starts his business at 8?

Page 28: Approximation and Nested Problem

Example V

This is also an example of a nested problem.

The storekeeper’s decision is based on the number of days from Mon. to Sat. on which there are more than 10 customers visiting his store. This number follows a binomial distribution.

For a binomial r.v., we need to figure out the two parameters, n and p, to be able to work on it.

In this case, n=6, Monday through Saturday.

p depends on the number of customers visiting between 8 and 9, which is a Poisson r.v. with a mean of 5.
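The setup above can be carried through to a number. This is a sketch that assumes the six days are independent and that the store sees the same Poisson(5) traffic as the neighboring store; the helper and variable names are illustrative.

```python
from math import comb, exp, factorial

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ POI(lam)."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

# Inner problem: p = P(more than 10 customers between 8 and 9) with X ~ POI(5).
p_busy = 1 - poisson_cdf(10, 5)

# Outer problem: Y ~ BIN(6, p_busy); he opens at 8 if Y >= 4.
n_days = 6
p_open_at_8 = sum(
    comb(n_days, k) * p_busy**k * (1 - p_busy)**(n_days - k)
    for k in range(4, n_days + 1)
)

print(f"p = P(X > 10) = {p_busy:.4f}")
print(f"P(open at 8) = P(Y >= 4) = {p_open_at_8:.2e}")
```

With p only a little above 1%, four or more busy days in a week is very unlikely, so under these assumptions the storekeeper would almost surely open at 9.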