iiml 3 4 probability and statistics a review [compatibility mode]

Upload: nikhil-nangia

Post on 24-Oct-2014

Page 1: IIML 3 4 Probability and Statistics a Review [Compatibility Mode]

3/11/2012

1

Copyright Tapan P Bagchi 1

Sessions 3 and 4

Probability and Statistics in Six Sigma: A review

Why Study PROBABILITY?

Occurrence of defects in production is stochastic—such events cannot be exactly predicted. In decisions about such events we rely on the theory of probability. When our decisions require data analysis, the typical methods are obtained from statistics.

Applications of Probability Theory and Statistics in Business

• Forecasting, Inventory Management
• Quality Assurance
• Project Risk Management
• Investment Portfolio Design
• Business simulation, Market research
• Game Theory and Strategy Formulation
• Six Sigma

The Scope of Sessions 3 and 4

• Experiments, Outcomes, Events and Sample Spaces

• What is probability?

• Basic Rules of Probability

• Probabilities of Compound Events

• An introduction to Distributions

• Test of hypothesis and inference

Learning Objectives

• Understand the concepts of sample space and probability distribution, and construct sample spaces and distributions in simple cases

• Understand conditional probability and independent events, and how to compute the probability of a compound event

• Use simulations to construct empirical probability distributions and to make informal inferences about the theoretical probability distribution

What is Probability?

Probability: the study of chance associated with the occurrence of random or stochastic events

Types of Probability
• Classical (Theoretical)
• Relative Frequency (Experimental)

Classical Probability

Rolling dice and tossing a coin are activities associated with a classical approach to probability. In these cases, you can list (or enumerate) all the possible outcomes of an experiment and determine the actual probabilities of each outcome.

Sample Space, Events and RVs

The possible outcomes of a stochastic or random process are called events.

A deterministic process has only one possible outcome.

The probability of a particular event is the fraction of outcomes in which the event occurs. The probability of event A is denoted by P(A).

Random variables map events to numbers.

[Figure: an event A in the sample space S is mapped by the random variable X to a number, e.g. 4.3]

Events may be (a) mutually exclusive, or (b) independent

Probability of an event remains between 0 (the event never occurs) and 1 (the event always occurs).

Two events are mutually exclusive if occurrence of one precludes the occurrence of the other.

Events whose occurrence does not depend on the occurrence of any other events are called independent events.

Experiments, Outcomes, Events and Sample Spaces

Experiment: An experiment is any activity from which results are obtained. A random experiment is one in which the outcomes, or results, cannot be predicted with certainty.

Examples:
1. Flip a coin
2. Flip a coin 3 times
3. Roll a die
4. Draw a random sample of size 50 from a population

Trial: A physical action, the result of which cannot be predetermined

Basic Outcomes and Sample Spaces

Basic Outcome (o): A possible outcome of the experiment

Sample Space: The set of all possible outcomes of an experiment

Example: A company has offices in six cities: San Diego, Los Angeles, San Francisco, Denver, Paris, and London. A new employee will be randomly assigned to work in one of these offices.

What are the Outcomes?

What is the Sample Space?

Assigning Probabilities to Events

Probability of an event P(E): “Chance” that an event will occur

• Must lie between 0 and 1
• “0” implies that the event will not occur
• “1” implies that the event will occur

Types of Probability:

Objective
• Relative Frequency Approach
• Equally-likely Approach

Subjective – based on beliefs, judgment and past experience

Odds

If the odds that an event occurs are a:b, then

$P(A) = \frac{a}{a + b}$

Example: If the odds of the horse “Chetak” winning the Hong Kong Derby are 9:2, what is the subjective probability that he will win?

Probabilities of Events

Let A be the event A = {o1, o2, …, ok}, where o1, o2, …, ok are k different outcomes. Then

$P(A) = P(o_1) + P(o_2) + \cdots + P(o_k)$

Problem: The number on a license plate is any digit between 0 and 9. What is the probability that the first digit is a 3? What is the probability that the first digit is less than 4?

Probabilities of Compound Events

• Start with the Law of Complements: “If A is an event, then the complement of A, denoted by $\bar{A}$, represents the event composed of all basic outcomes in S (the sample space) that do not belong to A.”

• By Additive Law of Probability: $P(A) + P(\bar{A}) = 1$

[Figure: Venn diagram. S = set of all outcomes; A = set of outcomes that make event A; $\bar{A}$ = the remaining outcomes in S]

Law of Complements

“If A is an event, then the complement of A, denoted by $\bar{A}$, represents the event composed of all basic outcomes in S that do not belong to A.”

Law of Complements:

$P(\bar{A}) = 1 - P(A)$

Example: If the probability of getting a “working” computer is 0.7, what is the probability of getting a defective computer?

Unions and Intersections of Two Events

• Unions of Two Events: “If A and B are events, then the union of A and B, denoted by A∪B, represents the event composed of all basic outcomes in A or B.”

• Intersections of Two Events: “If A and B are events, then the intersection of A and B, denoted by A∩B, represents the event composed of all basic outcomes in A and B.”

[Figure: Venn diagram of two overlapping events A and B in the sample space S]

Additive Law of Probability

Let A and B be two events in a sample space S. The probability of the union of A and B is

$P(A \cup B) = P(A) + P(B) - P(A \cap B).$

[Figure: Venn diagram of two overlapping events A and B in the sample space S]

Using Additive Law of Probability

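[Figure: Venn diagram of two overlapping events C (fails chemistry) and M (fails math) in the sample space S]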

Example: At Cornell, all first-year students must take chemistry and math. Suppose 15% fail chemistry, 12% fail math, and 5% fail both. Suppose a first-year student is selected at random. What is the probability that student selected failed at least one of the courses?
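The additive law applies directly to this example. A minimal sketch in Python; the variable names are illustrative and not part of the original slides:

```python
# Additive law: P(C or M) = P(C) + P(M) - P(C and M)
p_fail_chem = 0.15   # P(C): fails chemistry
p_fail_math = 0.12   # P(M): fails math
p_fail_both = 0.05   # P(C and M): fails both

p_fail_at_least_one = p_fail_chem + p_fail_math - p_fail_both
print(round(p_fail_at_least_one, 2))  # 0.22
```

Subtracting the overlap once avoids counting students who failed both courses twice.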

Mutually Exclusive Events

Mutually Exclusive Events: Events that have no basic outcomes in common, or equivalently, their intersection is the empty set ∅.

[Figure: Venn diagram of two non-overlapping events A and B in the sample space S]

Let A and B be two events in a sample space S. The probability of the union of two mutually exclusive events A and B is

$P(A \cup B) = P(A) + P(B).$

Multiplication Rule and Independent Events

Two independent events can occur together!

Multiplication Rule for Independent Events: Let A and B be two independent events, then

$P(A \cap B) = P(A)\,P(B).$

Examples:

• Flip a coin twice. What is the probability of observing two heads?

• Flip a coin twice. What is the probability of getting a head and then a tail? A tail and then a head? One head?

• Three computers are ordered. If the probability of getting a “working” computer is 0.7, what is the probability that all three are “working”?

Definitions

Sample Space - the list of all possible outcomes from a probabilistic experiment. 3-Children Example:

S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}

Each individual item in the list is called a Simple Event or Single Event.

Probability of Single Events with Equally Likely Outcomes

If each outcome in the sample space is equally likely, then the probability of any one outcome is 1 divided by the total number of outcomes.

For equally likely outcomes,

$P(\text{simple event}) = \frac{1}{\text{total number of outcomes}}$

Three Children Example Continued

A couple wants 3 children. Assume the chance of a boy or girl is equally likely at each birth.

What is the probability that they will have exactly 3 girls?

What is the probability of having exactly 3 boys?

Probability of Combinations of Single Events

But in general, an Event can be a combination of Single Events.

The probability of such an event is the sum of the individual probabilities.

Three Children Example Continued

P(exactly 2 girls) = __
P(exactly 2 boys) = __
P(at least 2 boys) = __
P(at most 2 boys) = __
P(at least 1 girl) = __
P(at most 1 girl) = __

Sample space =
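One way to check answers to the three-children exercise is to enumerate the sample space and count. The sketch below assumes, as the slides do, that boys and girls are equally likely at each birth; the helper `prob` is a hypothetical name introduced here for illustration:

```python
from fractions import Fraction
from itertools import product

# Sample space for three births, each equally likely to be B or G.
sample_space = ["".join(kids) for kids in product("BG", repeat=3)]

def prob(event):
    """Probability of an event = favourable outcomes / total outcomes."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))

p_exactly_2_girls = prob(lambda o: o.count("G") == 2)   # 3/8
p_at_least_2_boys = prob(lambda o: o.count("B") >= 2)   # 1/2
p_at_most_2_boys = prob(lambda o: o.count("B") <= 2)    # 7/8
p_at_least_1_girl = prob(lambda o: o.count("G") >= 1)   # 7/8

print(sample_space)
print(p_exactly_2_girls, p_at_least_2_boys, p_at_most_2_boys, p_at_least_1_girl)
```

Each event probability is the sum of the probabilities of the simple events it contains, which is the rule stated for combinations of single events.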

Types of Probability

Classical (Theoretical)

Relative Frequency (Experimental, Empirical)

Relative Frequency Probability

Uses actual experience to determine the likelihood of an outcome.

What is the chance of scoring a B or better?

Grade Frequency

A 20

B 30

C 40

Below C 10

Empirical Probability

Given a frequency distribution, the probability of an event, E, being in a given group is

$P(E) = \frac{\text{frequency of the group}}{\text{total frequencies in the distribution}} = \frac{x}{n}$

A Problem: Two-way Tables and Probability

Find:
P(M)
P(A)
P(A and M)

Given:

          Got A   Got < A   Total
Male        30       45
Female      60       65
Total

To solve this problem we need some theory.

Probability Fundamentals

What is wrong with the statements?
• The probability of rain today is -10%.
• The probability of rain today is 120%.
• The probability of rain or no rain today is 90%.

REMEMBER!

$P(\text{event}) \ge 0, \quad P(\text{event}) \le 1, \quad P(\text{sample space}) = 1$

These are the axioms of Kolmogorov.

Probability Rules

Let A and B be events

Complement Rule: P(A) + P(not A) = 1

Set Theory Notation

Union: A or B (inclusive “or”): A ∪ B

Intersection: A and B: A ∩ B

Probability Rules

Union: P(A∪B) = P(A or B)

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Two-way Tables and Probability

Find:
P(M)
P(A)
P(A and M)
P(A if M)

          Got A   Got < A   Total
Male        30       45       75
Female      60       65      125
Total       90      110      200
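The four probabilities can be read off the table by dividing counts. A small sketch; the dictionary layout and variable names are assumptions made for illustration:

```python
from fractions import Fraction

# Counts from the two-way table: rows are gender, columns are grade.
counts = {
    ("Male", "A"): 30, ("Male", "<A"): 45,
    ("Female", "A"): 60, ("Female", "<A"): 65,
}
total = sum(counts.values())                                     # 200

n_male = sum(v for (g, _), v in counts.items() if g == "Male")   # 75
n_a = sum(v for (_, gr), v in counts.items() if gr == "A")       # 90

p_m = Fraction(n_male, total)                        # P(M)
p_a = Fraction(n_a, total)                           # P(A)
p_a_and_m = Fraction(counts[("Male", "A")], total)   # P(A and M)
p_a_given_m = p_a_and_m / p_m                        # P(A if M)

print(p_m, p_a, p_a_and_m, p_a_given_m)  # 3/8 9/20 3/20 2/5
```

Note that P(A if M) is also 30/75, the share of A grades within the Male row.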

Conditional Probability

P(A|B) = the conditional probability of event A happening given that event B has happened

“probability of A given B”

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

Independent events vs. mutually exclusive events

[Figure: example events sorted into the two groups. Labels that survive: Head, Tail; KGP student had bicycle stolen; It rained in Mumbai on July 15; Car accidents; Obama gets elected; Fire in lab]

Independence

Events A and B are “Independent” if and only if

$P(A \mid B) = P(A)$

Using the data in the two-way table, can you say that getting an “A” grade is independent of the student’s being male?

Two-way Tables and Probability

Question: Are grades and gender independent?

          Got A   Got < A   Total
Male        30       45       75
Female      60       65      125
Total       90      110      200
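An informal check of the question compares P(A | M) with P(A). The sketch below tests exact equality on the table counts; with real sample data exact equality rarely holds, so this is only an illustration of the definition, not a formal statistical test:

```python
from fractions import Fraction

# Counts taken from the two-way table.
male_a, male_total = 30, 75
all_a, grand_total = 90, 200

p_a = Fraction(all_a, grand_total)          # P(A)     = 9/20
p_a_given_m = Fraction(male_a, male_total)  # P(A | M) = 2/5

# Independence requires P(A | M) == P(A).
independent = (p_a_given_m == p_a)
print(p_a, p_a_given_m, independent)  # 9/20 2/5 False
```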

Terminology

The sum of the probabilities of all mutually exclusive and exhaustive events in a process is 1. For example, if there are n possible mutually exclusive outcomes $o_1, \ldots, o_n$, then

$\sum_{i=1}^{n} P(o_i) = 1$

Simple probabilities

If A and B are mutually exclusive events, then the probability that either A or B occurs is the probability of the union

$P(A \cup B) = P(A) + P(B)$

Simple probabilities

If A and B are independent events, then the probability that both events A and B occur is the probability of the intersection

$P(A \cap B) = P(A)\,P(B)$

Example: The probability that a US president is bearded is ~14%, and the probability that a US president died in office is ~19%. If the two events are independent, the probability that a president both had a beard and died in office is ~3%, so about 1.3 out of 43 presidents are expected to fulfill both conditions. In reality, 2 bearded presidents died in office. (A close enough result.)

Conditional probabilities

What is the probability that event A occurs given that event B did occur? The conditional probability of A given B is

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

Example: The probability that a US president dies in office given that he is bearded is 0.03/0.14 ≈ 21%. Thus, out of 6 bearded presidents, about 21% (or 1.3) are expected to die in office. In reality, 2 died. (Again, a close enough result.)

Joint Probability

For events A and B, joint probability Pr(AB) stands for the probability that both events happen.

Example: A={HH}, B={HT, TH}, what is the joint probability Pr(AB)?

Think—can any outcome lead to the occurrence of A and also B? Can A and B occur together?

Independence

Two events A and B are independent in case

$\Pr(AB) = \Pr(A)\Pr(B)$

Independence does not mean that the events A and B cannot occur together

A set of events {Ai} is independent in case

$\Pr\left(\bigcap_i A_i\right) = \prod_i \Pr(A_i)$

The birth of a son or a daughter are mutually exclusive events.

The events “birth of a daughter” and “birth of a child with AB+ blood type” are not mutually exclusive (they are independent events).

Independence

Two events A and B are independent in case

$\Pr(AB) = \Pr(A)\Pr(B)$

A set of events {Ai} is independent in case

$\Pr\left(\bigcap_i A_i\right) = \prod_i \Pr(A_i)$

Example: Drug test

            Women    Men
Success      200    1800
Failure     1800     200

A = {Patient is a Woman}

B = {Drug fails}

Will event A be independent of event B?

Independence

Consider the experiment of tossing a coin twice

Example I: A = {HT, HH}, B = {HT}. Will event A be independent from event B?

Example II: C = {HT}, D = {TH}. Will event C be independent from event D?

Disjoint ≠ Independence

If A is independent from B, B is independent from C, will A be independent from C?
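These questions can be explored by enumerating the four equally likely outcomes of two tosses and checking the product rule. The sketch below is illustrative: the helpers `pr` and `independent`, and the events `first_heads` and `second_heads`, are names introduced here and are not from the slides.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes of two tosses: HH, HT, TH, TT.
outcomes = {"".join(t) for t in product("HT", repeat=2)}

def pr(event):
    """Probability of an event (a set of outcomes) under equal likelihood."""
    return Fraction(len(event & outcomes), len(outcomes))

def independent(e1, e2):
    """True when Pr(E1 E2) == Pr(E1) Pr(E2)."""
    return pr(e1 & e2) == pr(e1) * pr(e2)

A, B = {"HT", "HH"}, {"HT"}
C, D = {"HT"}, {"TH"}
print(independent(A, B))  # False: Pr(AB) = 1/4, Pr(A)Pr(B) = 1/8
print(independent(C, D))  # False: Pr(CD) = 0,   Pr(C)Pr(D) = 1/16

# Independence is not transitive: with E1 = first toss heads and
# E2 = second toss heads, E1 is independent of E2 and E2 of E1,
# yet E1 is not independent of itself.
first_heads, second_heads = {"HH", "HT"}, {"HH", "TH"}
print(independent(first_heads, second_heads))  # True
print(independent(first_heads, first_heads))   # False
```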

Conditioning

If A and B are events with Pr(A) > 0, the conditional probability of B given A is

$\Pr(B \mid A) = \frac{\Pr(AB)}{\Pr(A)}$

Conditioning

If A and B are events with Pr(A) > 0, the conditional probability of B given A is

$\Pr(B \mid A) = \frac{\Pr(AB)}{\Pr(A)}$

Example: Drug test

            Women    Men
Success      200    1800
Failure     1800     200

A = {Patient is a Woman}

B = {Drug fails}

Pr(B|A) = ?

Pr(A|B) = ?
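A sketch of how the two conditional probabilities in the drug test follow from the table counts; the dictionary keys and variable names are assumptions made for illustration:

```python
from fractions import Fraction

# Drug test counts: outcome by patient group.
counts = {
    ("Success", "Women"): 200, ("Success", "Men"): 1800,
    ("Failure", "Women"): 1800, ("Failure", "Men"): 200,
}
total = sum(counts.values())  # 4000

n_women = counts[("Success", "Women")] + counts[("Failure", "Women")]
n_fail = counts[("Failure", "Women")] + counts[("Failure", "Men")]

pr_a = Fraction(n_women, total)                       # Pr(A): patient is a woman
pr_b = Fraction(n_fail, total)                        # Pr(B): drug fails
pr_ab = Fraction(counts[("Failure", "Women")], total)  # Pr(AB)

pr_b_given_a = pr_ab / pr_a
pr_a_given_b = pr_ab / pr_b
print(pr_b_given_a, pr_a_given_b)   # 9/10 9/10
print(pr_ab == pr_a * pr_b)         # False, so A and B are not independent
```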

Conditioning

If A and B are events with Pr(A) > 0, the conditional probability of B given A is

$\Pr(B \mid A) = \frac{\Pr(AB)}{\Pr(A)}$

Example: Drug test

Drug’s performance

            Women    Men
Success      200    1800
Failure     1800     200

A = {Patient is a Woman}

B = {Drug fails}

Pr(B|A) = ?

Pr(A|B) = ?

Given A is independent from B, what is the relationship between Pr(A|B) and Pr(A)?

Bayes’ Rule for computing Conditional Probabilities

Bayes’ Rule allows you to go from P(A|B) to P(B|A)

Cancer Test example

A medical test is used to check for cancer.

This test has a known reliability:
P(Test +ive | person has cancer) = 0.92
P(Test +ive | person healthy) = 0.04

We know that cancer is rare and in the general population P(cancer) = 0.001 (0.1%)

If a person is randomly selected and his test is +ive, what is the chance that he has cancer?

Data for the medical test

P(cancer) = P(c) = 0.001, P(healthy) = 0.999
P(test +ive | c) = 0.92, P(test +ive | healthy) = 0.04

Question: Will you rely on this test to start a treatment for cancer?

Bayes’ Rule for finding P(B|A)

Given two events A and B, suppose that Pr(A) > 0. Then

$\Pr(B \mid A) = \frac{\Pr(AB)}{\Pr(A)} = \frac{\Pr(A \mid B)\Pr(B)}{\Pr(A)}$

Bayes Definitions:

Pr(B) = a priori

Pr(B|A) = a posteriori

Bayes’ Rule

Events:

R: It rains

W: The grass is wet

[Diagram: R → W]

Pr(W|R)     R     ¬R
W          0.7    0.4
¬W         0.3    0.6

Information: Pr(W|R)

Inference: Pr(R|W)

Bayes’ Rule

$\Pr(H \mid E) = \frac{\Pr(E \mid H)\Pr(H)}{\Pr(E)}$

Hypothesis H, Evidence E

Information: Pr(E|H)

Inference: Pr(H|E)

Posterior: Pr(H|E); Likelihood: Pr(E|H); Prior: Pr(H)

Note: Pr(H) = Pr(H is true)

Solution to the medical test

P(cancer) = P(c) = 0.001, P(healthy) = 0.999
P(test +ive | c) = 0.92, P(test +ive | healthy) = 0.04

$P(c \mid \text{test +ive}) = \frac{P(\text{+ive} \mid c)\,P(c)}{P(\text{+ive} \mid c)\,P(c) + P(\text{+ive} \mid \text{healthy})\,P(\text{healthy})}$

Verify that the answer is 0.0225

Managerial question: Will you rely on this test to get a treatment?
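The verification asked for above can be done in a few lines. A minimal sketch using only the figures given for the medical test; the variable names are illustrative:

```python
# Bayes' rule for the medical test, using the figures given on the slide.
p_cancer = 0.001
p_healthy = 1 - p_cancer            # 0.999
p_pos_given_cancer = 0.92
p_pos_given_healthy = 0.04

# Total probability of a positive test.
p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * p_healthy

p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos
print(round(p_cancer_given_pos, 4))  # 0.0225
```

Because the disease is rare, most positive results come from the much larger healthy group, which is why the posterior probability is only about 2%.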

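Bayes’ Rule: More Complicated (contd.)

Suppose that B1, B2, …, Bk form a partition of S:

$B_i \cap B_j = \emptyset \ (i \ne j); \quad \bigcup_i B_i = S$

Suppose that Pr(Bi) > 0 and Pr(A) > 0. Then

$$
\begin{aligned}
\Pr(B_i \mid A) &= \frac{\Pr(A \mid B_i)\Pr(B_i)}{\Pr(A)} \\
&= \frac{\Pr(A \mid B_i)\Pr(B_i)}{\sum_{j=1}^{k} \Pr(AB_j)} \\
&= \frac{\Pr(A \mid B_i)\Pr(B_i)}{\sum_{j=1}^{k} \Pr(B_j)\Pr(A \mid B_j)}
\end{aligned}
$$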

A More Complicated Example

R: It rains

W: The grass is wet due to rain or sprinkler

U: People bring umbrella

[Diagram: R → W, R → U]

Pr(UW|R) = Pr(U|R) Pr(W|R)
Pr(UW|¬R) = Pr(U|¬R) Pr(W|¬R)

Pr(W|R)     R     ¬R
W          0.7    0.4
¬W         0.3    0.6

Pr(U|R)     R     ¬R
U          0.9    0.2
¬U         0.1    0.8

Pr(R) = 0.8

Pr(U|W) = ?

A More Complicated Example

R: It rains

W: The grass is wet

U: People bring umbrella

[Diagram: R → W, R → U]

Pr(UW|R) = Pr(U|R) Pr(W|R)
Pr(UW|¬R) = Pr(U|¬R) Pr(W|¬R)

Pr(W|R)     R     ¬R
W          0.7    0.4
¬W         0.3    0.6

Pr(U|R)     R     ¬R
U          0.9    0.2
¬U         0.1    0.8

Pr(R) = 0.8

Q. What is the probability that people will bring an umbrella when they see that the grass is wet, i.e. Pr(U|W) = ?
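One way to work out Pr(U|W) is to sum over the two cases R and ¬R, using the conditional independence stated on the slide, and then divide Pr(UW) by Pr(W). The sketch below is a hedged illustration; the dictionary names are not from the slides:

```python
# Conditional probability tables from the slide.
p_r = 0.8                                   # Pr(R)
p_w_given = {True: 0.7, False: 0.4}         # Pr(W | R), Pr(W | not R)
p_u_given = {True: 0.9, False: 0.2}         # Pr(U | R), Pr(U | not R)
p_rain = {True: p_r, False: 1 - p_r}

# Pr(W) and Pr(UW) by summing over R and not R, using the stated
# conditional independence Pr(UW | R) = Pr(U | R) Pr(W | R).
p_w = sum(p_w_given[r] * p_rain[r] for r in (True, False))
p_uw = sum(p_u_given[r] * p_w_given[r] * p_rain[r] for r in (True, False))

p_u_given_w = p_uw / p_w
print(round(p_u_given_w, 4))  # 0.8125
```

Here Pr(W) = 0.64 and Pr(UW) = 0.52, so Pr(U|W) = 0.52 / 0.64.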

Random variables and probability distributions

Probability Distribution

The probability distribution refers to the frequency with which all different possible outcomes occur. There are numerous types of probability distributions.

Small fish got caught first!

In CA if you fish in Fall, you are more likely to catch big fish

Random Variable and Distribution

A random variable X is a numerical outcome of a random experiment

The distribution of a random variable is the collection of possible outcomes along with their probabilities:

Discrete case: $\Pr(X = x) = p(x)$

Continuous case: $\Pr(a \le X \le b) = \int_a^b p(x)\,dx$

Random Variable: Example

Let S be the set of all sequences of three rolls of a die. Let X be the sum of the number of dots on the three rolls.

What are the possible values for X? Pr(X = 5) = ?, Pr(X = 10) = ?
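The distribution of X can be built by listing all 216 equally likely sequences and counting how often each sum occurs. A minimal sketch, assuming a fair six-sided die; `pr_sum` is a hypothetical helper name:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 6**3 = 216 equally likely sequences of three rolls.
rolls = list(product(range(1, 7), repeat=3))
sum_counts = Counter(sum(r) for r in rolls)

def pr_sum(x):
    """Pr(X = x) where X is the sum of the three rolls."""
    return Fraction(sum_counts[x], len(rolls))

print(sorted(sum_counts))        # possible values: 3 through 18
print(pr_sum(5), pr_sum(10))     # 1/36 1/8
```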

Expectation

A random variable X~Pr(X=x). Then, its expectation is

$E[X] = \sum_x x \Pr(X = x)$

In an empirical sample, x1, x2,…, xN,

$E[X] = \frac{1}{N}\sum_{i=1}^{N} x_i$

Continuous case:

$E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx$

Expectation of sum of random variables

$E[X_1 + X_2] = E[X_1] + E[X_2]$

Expectation: Example

Let S be the set of all sequences of three rolls of a die. Let X be the sum of the number of dots on the three rolls.

What is E(X)?

Let S be the set of all sequences of three rolls of a die. Let X be the product of the number of dots on the three rolls.

What is E(X)?
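Both expectations can be checked by brute force from the definition E[X] = Σ x Pr(X = x), summing over all 216 equally likely sequences. A sketch assuming a fair die:

```python
from fractions import Fraction
from itertools import product
from math import prod

rolls = list(product(range(1, 7), repeat=3))   # 216 equally likely sequences
p = Fraction(1, len(rolls))                    # probability of each sequence

# E[X] = sum over outcomes of x * Pr(outcome)
e_sum = sum(sum(r) * p for r in rolls)
e_product = sum(prod(r) * p for r in rolls)

print(e_sum, e_product)  # 21/2 343/8
```

The sum result agrees with linearity of expectation (3 × 3.5), and the product result equals 3.5³ because the three rolls are independent.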

Variance

The variance of a random variable X is the expectation of (X − E[X])²:

$$
\begin{aligned}
\mathrm{Var}(X) &= E\big((X - E[X])^2\big) \\
&= E\big(X^2 + E[X]^2 - 2XE[X]\big) \\
&= E\big(X^2 - E[X]^2\big) \\
&= E[X^2] - E[X]^2
\end{aligned}
$$

Range = Max(Xi) – Min(Xi)

Finding probabilities by Counting Events

Many times probabilities can be determined by enumerating (listing all possible) events or counting the ones of our interest.

Examples:
• Tossing a perfect coin
• Getting exactly 4 heads in 10 tosses

Permutations

The number of possible permutations (sequences) is the number of different orders in which particular events occur. The number of possible permutations is

$N_p = \frac{n!}{(n - r)!}$

where r is the number of events in the series, n is the number of possible events, and n! denotes the factorial of n = the product of all the positive integers from 1 to n.

Combinations

When the order or sequence in which the events occurred is of no interest, we are dealing with combinations. The number of possible combinations is

$N_c = \binom{n}{r} = \frac{n!}{r!\,(n - r)!}$

where r is the number of events in the series, n is the number of possible events, and n! denotes the factorial of n = the product of all the positive integers from 1 to n.
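The two counting formulas can be evaluated directly and compared with Python's standard-library helpers. The sketch below also applies the combination count to the earlier counting example of exactly 4 heads in 10 tosses; the choice n = 10, r = 4 is made only to match that example:

```python
from fractions import Fraction
from math import comb, factorial, perm

n, r = 10, 4

# Permutations: n! / (n - r)!
n_p = factorial(n) // factorial(n - r)
# Combinations: n! / (r! (n - r)!)
n_c = factorial(n) // (factorial(r) * factorial(n - r))

# The standard library offers the same counts directly.
assert n_p == perm(n, r) and n_c == comb(n, r)

# Exactly 4 heads in 10 tosses of a fair coin: favourable / total sequences.
p_4_heads = Fraction(n_c, 2 ** n)
print(n_p, n_c, p_4_heads)  # 5040 210 105/512
```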

The uniform distribution

A random variable is said to be uniformly distributed if the probabilities of all possible outcomes are equal to one another. Thus, the probability P(i), where i is one of n possible outcomes, is

$P(i) = \frac{1}{n}$

[Figure: bar chart of the probability 1/n for each value X = 1, 2, …, i, …, n; X is discrete here]

Bernoulli Distribution

The outcome of an experiment can be either success (i.e., 1) or failure (i.e., 0).

Pr(X=1) = p, Pr(X=0) = 1-p, or

$p(x) = p^x (1 - p)^{1 - x}$

E[X] = p, Var(X) = p(1-p)

The binomial distribution

The mean and variance of a binomially distributed variable are given by

Mean: $\mu = np$

Variance: $\sigma^2 = npq$

Example:

P(x) = Prob of finding x heads in n tosses of a coin

Application: Acceptance Sampling in Quality Control—d defectives in a sample of size n.

The binomial distribution

A process that has only two possible outcomes is called a binomial process. In statistics, the two outcomes are frequently denoted as success and failure. The probabilities of a success or a failure are denoted by p and q, respectively. Note that p + q = 1. The binomial distribution gives the probability of exactly k successes in n trials

$P(k) = \binom{n}{k} p^k (1 - p)^{n - k}$
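The formula translates directly into code. In the sketch below, `binomial_pmf` is a hypothetical helper, and the sample size and defect rate are made-up numbers for an acceptance-sampling illustration, not values from the slides:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success prob p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical acceptance-sampling numbers (not from the slides):
# sample size n = 20, true defect rate p = 0.05.
n, p = 20, 0.05
p_no_defectives = binomial_pmf(0, n, p)
p_at_most_one = binomial_pmf(0, n, p) + binomial_pmf(1, n, p)

# The probabilities over k = 0..n should add up to 1.
total_probability = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print(round(p_no_defectives, 4), round(p_at_most_one, 4))
```

A sampling plan that accepts a lot when at most one defective is found would accept with probability `p_at_most_one` under these assumed numbers.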

Binomial Distribution

n draws of a Bernoulli distribution: $X_i \sim \mathrm{Bernoulli}(p)$, $X = \sum_{i=1}^{n} X_i$, $X \sim \mathrm{Bin}(p, n)$

Random variable X stands for the number of times that experiments are successful.

$\Pr(X = x) = p(x) = \begin{cases} \binom{n}{x} p^x (1 - p)^{n - x} & x = 0, 1, \ldots, n \\ 0 & \text{otherwise} \end{cases}$

E[X] = np, Var(X) = np(1-p)

Plots of Binomial Distribution

The Poisson distribution

Siméon Denis Poisson, 1781-1840

The Poisson distribution

When the probability of “success” is very small, e.g., the probability of a gene mutation, then p^k and (1 – p)^(n – k) become too small to calculate exactly by the binomial distribution. In such cases, the Poisson distribution becomes useful. Let λ be the expected number of successes in a process consisting of n trials, i.e., λ = np. The probability of observing k successes is

$P(k) = \frac{\lambda^k e^{-\lambda}}{k!}$

Mean and variance are given by μ = λ and V = λ, respectively.
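To see how closely the Poisson formula tracks the binomial when p is small, the two can be tabulated side by side. The numbers below (n = 10,000 trials, p = 0.0003) are hypothetical and chosen only to give λ = 3; the helper names are likewise illustrative:

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(k) = lam**k * exp(-lam) / k!"""
    return lam**k * exp(-lam) / factorial(k)

# Hypothetical rare-event setting: n = 10_000 trials, p = 0.0003, so lam = 3.
n, p = 10_000, 0.0003
lam = n * p

for k in range(6):
    print(k, round(binomial_pmf(k, n, p), 5), round(poisson_pmf(k, lam), 5))
```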

Examples of Poisson Distribution

• Number of buses arriving at a bus stop/hour
• Number of road accidents/week on NH6
• Number of copying machine breakdowns/month
• Number of customers arriving/hour at BC Roy Hospital

Applications: Service facilities design (queuing theory, Communications networks), sampling plan design (quality control)

Poisson Distribution

Coming from Binomial distribution:
• Fix the expectation λ = np
• Let the number of trials n → ∞
• A Binomial distribution will become a Poisson distribution

$\Pr(X = x) = p(x) = \begin{cases} \frac{\lambda^x}{x!} e^{-\lambda} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$

E[X] = λ, Var(X) = λ

Plots of Poisson Distribution

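Normal (Gaussian) Distribution

X ~ N(μ, σ)

$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$

$\Pr(a \le X \le b) = \int_a^b p(x)\,dx = \int_a^b \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) dx$

E[X] = μ, Var(X) = σ²

If X1 ~ N(μ1, σ1) and X2 ~ N(μ2, σ2), X = X1 + X2 ?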
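The interval probability has no closed form, but it can be evaluated through the error function. A minimal sketch, assuming σ denotes the standard deviation; the process mean and spread are hypothetical values, and the helper names are introduced here for illustration:

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Pr(X <= x) for X ~ N(mu, sigma), with sigma the standard deviation."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def normal_interval(a, b, mu, sigma):
    """Pr(a <= X <= b), i.e. the integral of the density from a to b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Hypothetical process: mean 50, standard deviation 2.
mu, sigma = 50.0, 2.0
within_3_sigma = normal_interval(mu - 3 * sigma, mu + 3 * sigma, mu, sigma)
print(round(within_3_sigma, 4))  # 0.9973
```

The ±3σ result is the familiar 99.73% coverage used for control chart limits.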

Other important Distributions

• Normal distribution N(μ, σ²): Control charts in QC
• Weibull distribution: Reliability
• Chi Square distribution: Market research
• F distribution: Six Sigma
• Beta distribution and the Triangular distribution: Project Mgmt
• Poisson distribution: Traffic studies

Statistical Inference

Consider a coin. To determine if this is a fair coin, you flip it 10 times and observe 3 heads and 7 tails. Do you think this is a fair coin?

In other words, is P[head] = 0.5?

Statistical Inference: inferring parameters from observed data

Problem:

Likelihood function

This is really a joint probability of {Yi} calculated using

Approach: Maximum likelihood estimation (MLE), or maximize log-likelihood

Example: Flip Coins to find

Example: Flip Coins (cont’d)

Likelihood = joint probability of the observed data

Maximize ln(Likelihood)
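The transcript has lost the formulas for this example, so the following is only a sketch of the idea under stated assumptions: the data are the 3 heads and 7 tails from the fair-coin question, flips are independent, and `p` (a name chosen here) stands for P[head]. The log-likelihood is then maximized by a coarse grid search:

```python
from math import log

heads, tails = 3, 7   # the 10 flips from the fair-coin question

def log_likelihood(p):
    """ln of the joint probability of the observed flips when P[head] = p."""
    return heads * log(p) + tails * log(1 - p)

# Coarse grid search over p in (0, 1); calculus gives heads / (heads + tails).
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=log_likelihood)
print(p_hat)  # 0.3
```

Setting the derivative of the log-likelihood to zero gives the same estimate analytically: the observed proportion of heads.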

Recall Probability: Basic Ideas

Terminology:
• Trial: each time you repeat an experiment
• Outcome: result of an experiment
• Random experiment: one with random outcomes (cannot be predicted exactly)
• Relative frequency: how many times a specific outcome occurs within the entire experiment.

Statistics: Basic Ideas

Statistics is the area of science that deals with collection, organization, analysis, and interpretation of data.

It also deals with methods and techniques that can be used to draw conclusions about the characteristics of a large number of data points, commonly called a population, by using a smaller subset of the entire data.

For Example…

You work in a cell phone factory and are asked to remove cell phones at random from the assembly line and turn each one on and off.

Each time you remove a cell phone and turn it on and off, you are conducting a random experiment.

Each time you pick up a phone is a trial and the result is called an outcome.

If you check 200 phones, and you find 5 bad phones, then

relative frequency of failure = 5/200 = 0.025

Statistics in Quality Control

As engineers perform experiments, they collect data that can be used to explain relationships better and to reveal information about the quality of products and services they provide.

Frequency Distribution

Scores for a Six Sigma class are as follows: 58, 95, 80, 75, 68, 97, 60, 85, 75, 88, 90, 78, 62, 83, 73, 70, 70, 85, 65, 75, 53, 62, 56, 72, 79

To better assess the success of the class, we make a frequency chart:
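The chart itself did not survive in the transcript, so the following sketch rebuilds a frequency table from the listed scores. It assumes ranges of width 10 (50-59, 60-69, and so on). Note that the transcript lists 25 scores, so the relative frequencies here divide by the number of scores actually listed:

```python
from collections import Counter

scores = [58, 95, 80, 75, 68, 97, 60, 85, 75, 88, 90, 78, 62,
          83, 73, 70, 70, 85, 65, 75, 53, 62, 56, 72, 79]

# Group scores into ranges 50-59, 60-69, ..., 90-99.
freq = Counter((s // 10) * 10 for s in scores)

cumulative = 0
for low in sorted(freq):
    cumulative += freq[low]
    rel = freq[low] / len(scores)   # normalised (relative) frequency
    print(f"{low}-{low + 9}: {freq[low]:2d}  rel={rel:.2f}  cum={cumulative}")
```

The same loop yields the cumulative frequency, which can be normalised in the same way.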

Now the information can be better analyzed. For example, 3 students did poorly, and 3 did exceptionally well. We know that 9 students were in the average range of 70-79.

We can also show this data in a freq. histogram (PDF).

Divide each no. by 26

Cumulative Frequency

The data can be further organized by calculating the cumulative frequency (CDF). The cumulative frequency shows the cumulative number of students with scores up to and including those in the given range.

Usually we normalize the data: divide by 26.

Measures of Central Tendency & Variation

• Systematic errors, also called fixed errors, are errors associated with using an inaccurate instrument. These errors can be detected and avoided by properly calibrating instruments.

• Random errors are generated by a number of unpredictable variations in a given measurement situation. Mechanical vibrations of instruments or variations in line voltage, friction or humidity could lead to random fluctuations in observations.


When analyzing data, the mean alone cannot signal possible mistakes. There are a number of ways to define the dispersion or spread of data.

You can compute how much each number deviates from the mean, add up all the deviations, and then take their average as shown in the table below.


As exemplified in Table 19.4, the sum of deviations from the mean for any given sample is always zero. This can be verified by considering the following:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad d_i = x_i - \bar{x}$$

where $x_i$ represents the data points, $\bar{x}$ is the average, $n$ is the number of data points, and $d_i$ represents the deviation from the average.


Summing the deviations gives

$$\sum_{i=1}^{n} d_i = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \bar{x} = n\bar{x} - n\bar{x} = 0$$

Therefore the average of the deviations from the mean cannot be used to measure the spread of a given data set.

Instead we calculate the average of the absolute values of the deviations. (This is shown in the third column of Table 19.4 in your textbook.)

For Group A the mean absolute deviation is 290, and for Group B it is 820.

We can conclude that Group B is more scattered than Group A.
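As a small numerical check of the zero-sum property and of the mean absolute deviation, here is a Python sketch. Table 19.4 is not reproduced in these slides, so the data values below are hypothetical and do not correspond to Group A or Group B.

```python
# Hypothetical sample (not the values from Table 19.4).
group = [1050, 1100, 1200, 1250, 1400]

n = len(group)
mean = sum(group) / n

# Deviation of each data point from the mean: d_i = x_i - mean.
deviations = [x - mean for x in group]

# The deviations always sum to zero (up to floating-point rounding).
sum_dev = sum(deviations)

# Mean absolute deviation: average of |d_i|.
mean_abs_dev = sum(abs(d) for d in deviations) / n

print("sum of deviations:", sum_dev)
print("mean absolute deviation:", mean_abs_dev)
```

Because positive and negative deviations cancel exactly, only the absolute (or squared) deviations carry information about spread.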


Variance

Another way of measuring the spread of the data is by calculating the variance. Instead of taking the absolute value of each deviation, you square each deviation and average the squares:

$$v = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Dividing by (n - 1) rather than n makes the estimate unbiased.


Taking the square root of the variance gives the standard deviation:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

The standard deviation can also provide information about the relative spread of a data set.

The range (Xmax - Xmin) can also show spread.
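The variance, standard deviation, and range can be computed directly from their definitions. The sketch below uses a hypothetical sample (the slides give no raw data at this point) and cross-checks the hand-written formulas against Python's standard `statistics` module, whose `variance` and `stdev` functions also use the (n - 1) divisor.

```python
import math
import statistics

# Hypothetical sample, for illustration only.
data = [1050, 1100, 1200, 1250, 1400]

n = len(data)
mean = sum(data) / n

# Sample variance: squared deviations summed and divided by (n - 1).
v = sum((x - mean) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance.
s = math.sqrt(v)

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

print("variance:", v)
print("standard deviation:", round(s, 2))
print("range:", data_range)
print("matches statistics module:",
      math.isclose(v, statistics.variance(data)),
      math.isclose(s, statistics.stdev(data)))
```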


The mean for a grouped distribution is calculated from:

$$\bar{x} = \frac{\sum (x f)}{n}$$

where
x = midpoint of a given range
f = frequency of occurrence of data in the range
n = $\sum f$ = total number of data points


The standard deviation for a grouped distribution is calculated from:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$$
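The grouped formulas can be tried on the class scores from the frequency distribution slide. This is a sketch under two assumptions: the ranges have width 10 with midpoints 54.5, 64.5, and so on, and the frequencies are those counted from the 25 scores that survive in this transcript.

```python
import math

# Assumed midpoints of the ranges 50-59, 60-69, 70-79, 80-89, 90-99.
midpoints = [54.5, 64.5, 74.5, 84.5, 94.5]
# Frequencies counted from the transcribed score list (25 scores).
freqs = [3, 5, 9, 5, 3]

# n = sum of f = total number of data points.
n = sum(freqs)

# Grouped mean: sum of (midpoint * frequency) divided by n.
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n

# Grouped standard deviation: frequency-weighted squared deviations
# divided by (n - 1), then square-rooted.
s = math.sqrt(sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs)) / (n - 1))

print("grouped mean:", mean)
print("grouped standard deviation:", round(s, 2))
```

The grouped results only approximate the values computed from the raw scores, because every score in a range is treated as if it sat at the midpoint.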


Normal Distribution

We could use the probability distribution from the figures below to predict what might happen in the future (i.e., next year's students' performance).


Normal Distribution

A normal distribution is a probability distribution with a symmetric, bell-shaped curve.

The detailed shape of a normal distribution curve is determined by its mean and standard deviation values.


THE NORMAL CURVE

Using Table 19.11, approximately 68% of the data fall within one standard deviation of the mean, in the interval from -s to s.

About 95% of the data fall between -2s and 2s, and nearly all of the data points lie between -3s and 3s.

For a standard normal distribution, 68% of the data fall in the interval z = -1 to z = 1, where

$$z_i = \frac{x_i - \bar{x}}{s}$$


AREAS UNDER THE NORMAL CURVE

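The area between the mean and z = 2 is 0.4772 of the total area under the curve, and by symmetry the same holds between z = -2 and the mean. Together, about 95.4% of the data lie within two standard deviations of the mean.

About 99.7%, or almost all, of the data points lie between -3s and 3s.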
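These areas can be reproduced without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the helper names `z_score` and `area_within` are illustrative and not taken from the slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def z_score(x, mean, s):
    """Standardize a value: z = (x - mean) / s."""
    return (x - mean) / s

def area_within(k):
    """Area under the standard normal curve between z = -k and z = k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Area between the mean (z = 0) and z = 2; the table value is 0.4772.
area_mean_to_2 = std_normal.cdf(2.0) - 0.5

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
print(f"mean to z = 2: {area_mean_to_2:.4f}")
```

For example, `z_score(85, 75, 10)` gives 1.0, meaning a value of 85 lies one standard deviation above a mean of 75 when s = 10 (hypothetical numbers).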


Analysis of Two Histograms

Graph A is the class distribution of numbers 1-10.
Graph B is the class distribution of semester credits.

Data for A = 5.64 +/- 2.6 (much greater spread than B)
Data for B = 15.7 +/- 1.96 (smaller spread)
Skew of A = -0.16 and Skew of B = 0.146
CV of A = 0.461 and CV of B = 0.125 (CV = SD/Mean)

[Figure: two histograms, "Frequency A" (values 2 to 10) and "Frequency B" (values 12 to 20)]
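The coefficient of variation quoted for each graph follows directly from its mean and standard deviation. A minimal Python check, using only the summary values stated on the slide:

```python
# Summary values as stated on the slide (mean +/- standard deviation).
mean_a, sd_a = 5.64, 2.6
mean_b, sd_b = 15.7, 1.96

# Coefficient of variation: CV = SD / Mean (a unitless measure of spread).
cv_a = sd_a / mean_a
cv_b = sd_b / mean_b

print(f"CV of A = {cv_a:.3f}")
print(f"CV of B = {cv_b:.3f}")
```

Because the CV is relative to the mean, it allows the spread of A and B to be compared even though the two data sets are on different scales.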


References

1. Amir D Aczel and Jayavel Sounderpandian (2009). Complete Business Statistics, 6th ed, Tata McGraw-Hill

2. Bharat Jhunjhunwala (2008). Business Statistics: A Self Study Textbook, 1st ed, S Chand