Basic Probability With an Emphasis on Contingency Tables

Upload: lewis-reeves

Post on 24-Dec-2015


TRANSCRIPT

Page 1: Basic Probability With an Emphasis on Contingency Tables

Basic Probability

With an Emphasis on Contingency Tables

Page 2: Basic Probability With an Emphasis on Contingency Tables

Students in PSYC 2101

• Skip to Slide # 7.

Page 3: Basic Probability With an Emphasis on Contingency Tables

Random Variable

• A random variable is a real-valued function defined on a sample space.
– The sample space is the set of all distinct outcomes possible for an experiment.
– Function: the members of two sets (well-defined collections of objects) are paired so that each member of the one set (the domain) is paired with one and only one member of the other set (the range).

Page 4: Basic Probability With an Emphasis on Contingency Tables

• The domain is the sample space, the range is a set of real numbers.

• A random variable is the set of pairs created by pairing each possible experimental outcome with one and only one real number.

Page 5: Basic Probability With an Emphasis on Contingency Tables

Examples

The outcome of rolling a die: [one spot] = 1, [two spots] = 2, [three spots] = 3, etc. (Each outcome has only one number, and vice versa.)

Odd-even: [one spot] = 1, [two spots] = 2, [three spots] = 1, etc. (Each outcome has only one number, but not vice versa.)

The weight of each student in my statistics class.

Page 6: Basic Probability With an Emphasis on Contingency Tables

Probability Distribution

• Each value of the random variable is paired with one and only one probability.

• More on this later.

Page 7: Basic Probability With an Emphasis on Contingency Tables

Probability Experiments

• A probability experiment is a well-defined act or process that leads to a single well-defined outcome.
– Flip a coin, heads or tails.
– Roll a die, how many spots up.
– Stand on a digital scale, what number is displayed.

Page 8: Basic Probability With an Emphasis on Contingency Tables

Probability

• The probability of an event A, P(A), is the fraction of times that event will occur in an indefinitely long series of trials of the experiment.

• It cannot be known exactly, but it can be estimated.

Page 9: Basic Probability With an Emphasis on Contingency Tables

Estimating Probability

• Empirically – perform experiment many times, compute relative frequencies.

• Rationally – make assumptions and then apply logic.

• Subjectively – strength of individual’s belief regarding whether an event will or will not happen – often expressed in terms of odds.

Page 10: Basic Probability With an Emphasis on Contingency Tables

Odds of Occurrence of Event A

• If the odds of A are a to b and the experiment were performed (a + b) times, we would expect A to occur a times and not to occur b times.

• There are 20 students in a class, 14 of whom are women. If we randomly select one, what are the odds it will be a woman?

• 14 to 6 = 7 to 3.

Page 11: Basic Probability With an Emphasis on Contingency Tables

Convert Odds to Probability

• Probability = a/(a + b).
• 14 women, 6 men.
• Odds = 7 to 3.
• Probability = 7/(7 + 3) = 7 out of 10.

Page 12: Basic Probability With an Emphasis on Contingency Tables

Convert Probability to Odds

• Odds = P(A)/P(not A)
• Probability = .70
• Odds = .70/(1 - .70) = 7 to 3
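The two conversions above can be sketched in a few lines of Python. This is only an illustration using the classroom example (14 women, 6 men); the function names are made up here and do not come from the slides.

```python
from fractions import Fraction

def odds_to_probability(a, b):
    """Odds of 'a to b' in favor of A give P(A) = a / (a + b)."""
    return Fraction(a, a + b)

def probability_to_odds(p):
    """P(A) gives odds in favor of A as P(A) / (1 - P(A))."""
    p = Fraction(p)
    return p / (1 - p)

# Classroom example: 14 women and 6 men.
p_woman = odds_to_probability(14, 6)        # reduces to 7/10
odds_woman = probability_to_odds(p_woman)   # reduces to 7/3, i.e. 7 to 3

print(p_woman)     # 7/10
print(odds_woman)  # 7/3
```

Exact fractions are used rather than floats so that 14 to 6 visibly reduces to 7 to 3.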

Page 13: Basic Probability With an Emphasis on Contingency Tables

Independence

• Two events are independent iff (if and only if) the occurrence or non-occurrence of the one has no effect on the occurrence or non-occurrence of the other.
– I roll a die twice. The outcome on the first roll has no influence on the outcome on the second roll.

Page 14: Basic Probability With an Emphasis on Contingency Tables

Mutual Exclusion

• Two events are mutually exclusive iff the occurrence of the one precludes occurrence of the other (both cannot occur simultaneously on any one trial).
– You could earn a final grade of A in this class.
– You could earn a B.
– You can’t earn both.

Page 15: Basic Probability With an Emphasis on Contingency Tables

Mutual Exhaustion

• Two (or more) events are mutually exhaustive iff they include all possible outcomes.
– You could earn a final grade of A, B, C, D, or F.
– These are mutually exhaustive since there are no other possibilities.

Page 16: Basic Probability With an Emphasis on Contingency Tables

Marginal Probability

• The marginal probability of event A, P(A), is the probability of A ignoring whether or not any other event has also occurred.
– P(randomly selected student is female) = .70

Page 17: Basic Probability With an Emphasis on Contingency Tables

Conditional Probability of A

• The probability that A will occur given that B has occurred.

• P(A|B), the probability of A given B.
– Given that the selected student is wearing a skirt, the probability that the student is female is .9999.
– Unless you are in Scotland.
• If P(A|B) = P(A), then A and B are independent of each other.

Page 18: Basic Probability With an Emphasis on Contingency Tables

Joint Probability

• The probability that both A and B will occur.

• $P(A \cap B) = P(A) \times P(B|A) = P(B) \times P(A|B)$
• If A and B are independent, this simplifies to $P(A \cap B) = P(A) \times P(B)$
• This is known as the Multiplication Rule.

Page 19: Basic Probability With an Emphasis on Contingency Tables

The Addition Rule

• If A and B are mutually exclusive, the probability that one or the other will occur is the sum of their separate probabilities.

$$P(A \cup B) = P(A) + P(B) = .2 + .3 = .5$$

Grade          A     B     C     D     F
Probability   .2    .3    .3   .15   .05

Page 20: Basic Probability With an Emphasis on Contingency Tables

• If A and B are not mutually exclusive, things get a little more complicated.

• $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Page 21: Basic Probability With an Emphasis on Contingency Tables

Two-Way Contingency Table

• A matrix where rows represent values of one categorical variable and columns represent values of a second categorical variable.

• Can be used to illustrate the relationship between two categorical variables.

Page 22: Basic Probability With an Emphasis on Contingency Tables

Survey Questions

• We have asked each of 150 female college students two questions:

1. Do you smoke (yes/no)?

2. Do you have sleep disturbances (yes/no)?

• Suppose that we obtain the following data (these are totally contrived, not real):

Page 23: Basic Probability With an Emphasis on Contingency Tables

Marginal Probabilities

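            Sleep?
Smoke?      No    Yes   Total
No          20     30      50
Yes         40     60     100
Total       60     90     150

$$P(\text{Sleep}) = \frac{90}{150} = \frac{9}{15} = \frac{3}{5} = .60$$

$$P(\text{Smoke}) = \frac{100}{150} = \frac{10}{15} = \frac{2}{3} \approx .67$$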

Page 24: Basic Probability With an Emphasis on Contingency Tables

Conditional Probabilities Show Absolute Independence

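            Sleep?
Smoke?      No    Yes   Total
No          20     30      50
Yes         40     60     100
Total       60     90     150

$$P(\text{Sleep} \mid \text{Smoke}) = \frac{60}{100} = \frac{3}{5} = .60$$

$$P(\text{Sleep} \mid \text{No smoke}) = \frac{30}{50} = \frac{3}{5} = .60$$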

Page 25: Basic Probability With an Emphasis on Contingency Tables

Multiplication Rule Given Independence

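• Sixty of 150 have sleep disturbance and smoke, so $P(\text{Sleep} \cap \text{Smoke}) = 60/150 = .40$

• $P(A \cap B) = P(A) \times P(B)$

$$P(\text{Sleep} \cap \text{Smoke}) = P(\text{Sleep}) \times P(\text{Smoke}) = \frac{3}{5} \times \frac{2}{3} = \frac{6}{15} = .40$$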
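As a rough check of the numbers above, the marginal, conditional, and joint probabilities can be computed directly from the cell counts. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, while the counts are the contrived ones from the first table.

```python
from fractions import Fraction

# Cell counts from the first (contrived) table: counts[smoke][sleep].
counts = {
    "no":  {"no": 20, "yes": 30},
    "yes": {"no": 40, "yes": 60},
}
n = sum(sum(row.values()) for row in counts.values())  # 150 students

smokers = sum(counts["yes"].values())                   # 100
nonsmokers = sum(counts["no"].values())                 # 50
sleepers = counts["no"]["yes"] + counts["yes"]["yes"]   # 90

# Marginal probabilities (row and column totals over the grand total).
p_smoke = Fraction(smokers, n)
p_sleep = Fraction(sleepers, n)

# Conditional probabilities (a cell over its row total).
p_sleep_given_smoke = Fraction(counts["yes"]["yes"], smokers)
p_sleep_given_nosmoke = Fraction(counts["no"]["yes"], nonsmokers)

# Joint probability (a cell over the grand total).
p_sleep_and_smoke = Fraction(counts["yes"]["yes"], n)

# Independence: the conditional equals the marginal, and the joint
# probability equals the product of the marginals.
independent = (p_sleep_given_smoke == p_sleep
               and p_sleep_and_smoke == p_sleep * p_smoke)

print(p_sleep, p_smoke)                            # 3/5 2/3
print(p_sleep_given_smoke, p_sleep_given_nosmoke)  # 3/5 3/5
print(p_sleep_and_smoke, independent)              # 2/5 True
```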

Page 26: Basic Probability With an Emphasis on Contingency Tables

“Sleep” = Sexually Active

• Preacher claims those who smoke will go to Hell.

• And those who fornicate will go to Hell.
• What is the probability that a randomly selected coed from this sample will go to Hell?

Page 27: Basic Probability With an Emphasis on Contingency Tables

Addition Rule

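$$P(\text{Sleep}) = \frac{90}{150} = \frac{9}{15} = .60 \qquad P(\text{Smoke}) = \frac{100}{150} = \frac{10}{15} \approx .67$$

$$P(\text{Sleep}) + P(\text{Smoke}) = \frac{9}{15} + \frac{10}{15} = \frac{19}{15} \approx 1.27$$

A probability cannot exceed one. Something is wrong here!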

Page 28: Basic Probability With an Emphasis on Contingency Tables

Welcome to Hell

• The events (sleeping and smoking) are not mutually exclusive.

• We have counted the overlap between sleeping and smoking (the 60 women who do both) twice.

• 30 + 40 + 60 = 130 of the women sleep and/or smoke.

• The probability we seek = 130/150 = 13/15 = .87

Page 29: Basic Probability With an Emphasis on Contingency Tables

Addition Rule For Events That Are NOT Mutually Exclusive

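$$P(\text{Sleep} \cup \text{Smoke}) = P(\text{Sleep}) + P(\text{Smoke}) - P(\text{Sleep} \cap \text{Smoke}) = \frac{9}{15} + \frac{10}{15} - \frac{6}{15} = \frac{13}{15} \approx .87$$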
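The double-counting problem and its correction can be reproduced with a short Python sketch. The counts come from the first table; the variable names are illustrative.

```python
from fractions import Fraction

n = 150
p_sleep = Fraction(90, n)   # 90 of 150 "sleep"
p_smoke = Fraction(100, n)  # 100 of 150 smoke
p_both = Fraction(60, n)    # 60 of 150 do both

# Naive sum: valid only for mutually exclusive events.
naive_sum = p_sleep + p_smoke

# General addition rule: subtract the overlap that was counted twice.
p_either = p_sleep + p_smoke - p_both

# Direct count of the three cells that involve sleeping and/or smoking.
p_direct = Fraction(30 + 40 + 60, n)

print(naive_sum)           # 19/15, which exceeds 1
print(p_either, p_direct)  # 13/15 13/15
```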

Page 30: Basic Probability With an Emphasis on Contingency Tables

Sleep = Sexually Active, Smoke = Use Cannabis

            Sleep?
Smoke?      No    Yes   Total
No          30     20      50
Yes         40     60     100
Total       70     80     150

Page 31: Basic Probability With an Emphasis on Contingency Tables

Marginal Probabilities

            Sleep?
Smoke?      No    Yes   Total
No          30     20      50
Yes         40     60     100
Total       70     80     150

$$P(\text{Sleep}) = \frac{80}{150} = \frac{8}{15} \approx .53$$

$$P(\text{Smoke}) = \frac{100}{150} = \frac{2}{3} \approx .67$$

Page 32: Basic Probability With an Emphasis on Contingency Tables

Conditional Probabilities Indicate Nonindependence

            Sleep?
Smoke?      No    Yes   Total
No          30     20      50
Yes         40     60     100
Total       70     80     150

$$P(\text{Sleep} \mid \text{Smoke}) = \frac{60}{100} = .60$$

$$P(\text{Sleep} \mid \text{No smoke}) = \frac{20}{50} = .40$$

Page 33: Basic Probability With an Emphasis on Contingency Tables

Joint Probability

• What is the probability that a randomly selected coed is both sexually active and a cannabis user?

• There are 60 such coeds, so the probability is 60/150 = .40.

• Now let us see if the multiplication rule works with these data.

Page 34: Basic Probability With an Emphasis on Contingency Tables

Multiplication Rule

$$P(\text{Sleep} \cap \text{Smoke}) = P(\text{Sleep}) \times P(\text{Smoke}) = \frac{8}{15} \times \frac{2}{3} = \frac{16}{45} \approx .36$$

• Oops, this is wrong. The joint probability is .40. We need to use the more general form of the multiplication rule.

Page 35: Basic Probability With an Emphasis on Contingency Tables

Multiplication Rule NOT Assuming Independence

$$P(\text{Smoke} \cap \text{Sleep}) = P(\text{Smoke}) \times P(\text{Sleep} \mid \text{Smoke}) = \frac{2}{3} \times \frac{3}{5} = \frac{6}{15} = .40$$

• Now that looks much better.
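The contrast between the two forms of the multiplication rule can be checked numerically. This is a minimal sketch using the counts from the second table; the list-of-lists layout and the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Second table: rows are Smoke? (no, yes), columns are Sleep? (no, yes).
table = [[30, 20],
         [40, 60]]
n = sum(sum(row) for row in table)                  # 150

p_smoke = Fraction(sum(table[1]), n)                # 100/150
p_sleep = Fraction(table[0][1] + table[1][1], n)    # 80/150
p_sleep_given_smoke = Fraction(table[1][1], sum(table[1]))  # 60/100

observed_joint = Fraction(table[1][1], n)           # 60/150

# Shortcut that assumes independence: wrong for these data.
product_of_marginals = p_sleep * p_smoke

# General multiplication rule: P(Smoke) x P(Sleep | Smoke).
general_rule = p_smoke * p_sleep_given_smoke

print(product_of_marginals)          # 16/45
print(general_rule, observed_joint)  # 2/5 2/5
```

The product of the marginals (16/45) misses the observed joint probability (2/5) because the conditional probability of "sleep" differs between smokers and nonsmokers in this table.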

Page 36: Basic Probability With an Emphasis on Contingency Tables

Actual Data From Jury Research

• Castellow, Wuensch, and Moore (1990), Journal of Social Behavior and Personality, 5, 547-562.

• Male employer sued for sexual harassment by female employee.

• Experimentally manipulated physical attractiveness of both litigants

Page 37: Basic Probability With an Emphasis on Contingency Tables

Effect of Plaintiff Attractiveness

• P(Guilty | Attractive) = 56/73 = 77%.
• P(Guilty | Not Attractive) = 39/72 = 54%.
• Defendant found guilty more often if plaintiff was attractive.

                        Guilty?
Plaintiff Attractive?   No    Yes   Total
No                      33     39      72
Yes                     17     56      73
Total                   50     95     145

Page 38: Basic Probability With an Emphasis on Contingency Tables

Odds and Odds Ratios

• Odds(Guilty | Attractive) = 56/17
• Odds(Guilty | Not Attractive) = 39/33
• Odds Ratio = (56/17) ÷ (39/33) = 2.79.
• Odds of guilty verdict 2.79 times higher when plaintiff is attractive.

                        Guilty?
Plaintiff Attractive?   No    Yes   Total
No                      33     39      72
Yes                     17     56      73
Total                   50     95     145
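The conditional probabilities and the odds ratio for plaintiff attractiveness can be recomputed from the four cell counts. The sketch below is illustrative; the helper name `odds_ratio` and the variable names are not from the slides.

```python
def odds_ratio(event_a, nonevent_a, event_b, nonevent_b):
    """Odds of the event in group A divided by the odds in group B."""
    return (event_a / nonevent_a) / (event_b / nonevent_b)

# Plaintiff attractive: 56 guilty, 17 not guilty.
# Plaintiff not attractive: 39 guilty, 33 not guilty.
p_guilty_attractive = 56 / (56 + 17)
p_guilty_unattractive = 39 / (39 + 33)
or_plaintiff = odds_ratio(56, 17, 39, 33)

print(round(100 * p_guilty_attractive))    # 77
print(round(100 * p_guilty_unattractive))  # 54
print(round(or_plaintiff, 2))              # 2.79
```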

Page 39: Basic Probability With an Emphasis on Contingency Tables

Effect of Defendant Attractiveness

• P(Guilty | Not Attractive) = 53/70 = 76%.
• P(Guilty | Attractive) = 42/75 = 56%.
• The defendant was more likely to be found guilty when he was unattractive.

                        Guilty?
Defendant Attractive?   No    Yes   Total
No                      17     53      70
Yes                     33     42      75
Total                   50     95     145

Page 40: Basic Probability With an Emphasis on Contingency Tables

Odds and Odds Ratio

• Odds(Guilty | Not Attractive) = 53/17.
• Odds(Guilty | Attractive) = 42/33.
• Odds Ratio = (53/17) ÷ (42/33) = 2.45.
• Odds of guilty verdict 2.45 times higher when defendant is unattractive.

                        Guilty?
Defendant Attractive?   No    Yes   Total
No                      17     53      70
Yes                     33     42      75
Total                   50     95     145

Page 41: Basic Probability With an Emphasis on Contingency Tables

Combined Effects of Plaintiff and Defendant Attractiveness

• Plaintiff attractive, Defendant not = 83% guilty.

• Defendant attractive, Plaintiff not = 41% guilty.

• Odds ratio = (83/17) ÷ (41/59) = 7.03.
• When your attorney tells you to wear your Sunday best to trial, listen.

Page 42: Basic Probability With an Emphasis on Contingency Tables

Odds Ratios and Probability Ratios

• Odds of Success
– 90/10 = 9 for Antibiotic Group
– 40/60 = 2/3 for Homeopathy Group
– Odds Ratio = 9/(2/3) = 13.5

Page 43: Basic Probability With an Emphasis on Contingency Tables

Odds Ratios and Probability Ratios

• Odds of Failure
– 10/90 = 1/9 for Antibiotic Group
– 60/40 = 1.5 for Homeopathy Group
– Odds Ratio = 1.5/(1/9) = 13.5

Notice that the odds ratio comes out the same with both perspectives.

Page 44: Basic Probability With an Emphasis on Contingency Tables

Odds Ratios and Probability Ratios

• Probability of Success
– 90/100 = .9 for Antibiotic Group
– 40/100 = .4 for Homeopathy Group
– Probability Ratio = .9/(.4) = 2.25

Page 45: Basic Probability With an Emphasis on Contingency Tables

Odds Ratios and Probability Ratios

• Probability of Failure
– 10/100 = .1 for Antibiotic Group
– 60/100 = .6 for Homeopathy Group
– Probability Ratio = .6/(.1) = 6

Notice that the probability ratio differs across perspectives.
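The claim that the odds ratio is the same from both perspectives while the probability ratio is not can be verified with exact fractions. This is a small sketch with illustrative names, using the 90/100 and 40/100 success rates from the example.

```python
from fractions import Fraction

def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

p_success = {"antibiotic": Fraction(90, 100), "homeopathy": Fraction(40, 100)}
p_failure = {group: 1 - p for group, p in p_success.items()}

# Odds ratios: success (antibiotic vs. homeopathy) and
# failure (homeopathy vs. antibiotic).
or_success = odds(p_success["antibiotic"]) / odds(p_success["homeopathy"])
or_failure = odds(p_failure["homeopathy"]) / odds(p_failure["antibiotic"])

# Probability ratios from the same two perspectives.
pr_success = p_success["antibiotic"] / p_success["homeopathy"]
pr_failure = p_failure["homeopathy"] / p_failure["antibiotic"]

print(or_success, or_failure)  # 27/2 27/2  (13.5 both ways)
print(pr_success, pr_failure)  # 9/4 6      (2.25 versus 6)
```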

Page 46: Basic Probability With an Emphasis on Contingency Tables

Another Example

• According to Medscape, 0.5% of the general population has narcissistic personality disorder (NPD)

• The rate is 20% among members of the US Military.

Page 47: Basic Probability With an Emphasis on Contingency Tables

Odds Ratios

• Odds of NPD
– Military: .2/.8 = .25
– General: .005/.995 ≈ .005025
– Ratio: .25/.005025 ≈ 49.75

• Odds of NOT NPD
– Military: .8/.2 = 4
– General: .995/.005 = 199
– Ratio: 199/4 = 49.75

Page 48: Basic Probability With an Emphasis on Contingency Tables

Probability Ratios

• Probability of NPD
– Military: 20%
– General: 0.5%
– Ratio: 20/0.5 = 40.

• Probability of NOT NPD
– Military: 80%
– General: 99.5%
– Ratio: .995/.8 = 1.24

Page 49: Basic Probability With an Emphasis on Contingency Tables

Probability Distributions

• For a discrete variable, pair each value with the probability of obtaining that value.

• For example, I flip a fair coin five times. What is the probability for each of the six possible outcomes?

• May be a table, a chart, or a formula.

Page 50: Basic Probability With an Emphasis on Contingency Tables

Probability Table

Number of Heads   Percent
0                   3.1
1                  15.6
2                  31.2
3                  31.2
4                  15.6
5                   3.1

Page 51: Basic Probability With an Emphasis on Contingency Tables

Probability Chart

Page 52: Basic Probability With an Emphasis on Contingency Tables

Probability Formula

• y is number of heads, n is number of tosses, p is probability of heads, q is probability of tails

$$P(Y = y) = \frac{n!}{y!\,(n-y)!}\, p^{y} q^{n-y}$$
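The formula can be evaluated for the five-flip example with a few lines of Python. This is a sketch, not part of the original slides; `binomial_pmf` is an illustrative name, and `math.comb` (Python 3.8+) supplies the n!/(y!(n-y)!) term.

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) for n independent tosses with P(heads) = p."""
    q = 1 - p
    return comb(n, y) * p**y * q**(n - y)

# Five flips of a fair coin: probability of 0 through 5 heads.
distribution = {y: binomial_pmf(y, 5, 0.5) for y in range(6)}

for y, prob in distribution.items():
    print(y, f"{100 * prob:.3f}%")
```

The exact percentages are multiples of 1/32 (3.125%, 15.625%, 31.25%), which appear in the probability table to one decimal place.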

Page 53: Basic Probability With an Emphasis on Contingency Tables

Continuous Variable

• There is an infinite number of values, so a table relating each value to a probability would be infinitely large.

• The probability of any exact value is vanishingly small.

• We can find the probability that a randomly selected case has a value between a and b.

Page 54: Basic Probability With an Emphasis on Contingency Tables

Evolution of a Continuous Variable

• I’ll start with a histogram for a discrete variable.

• In each step I’ll double the number of values (and number of bars).

• All the way up to an infinite number of values with each bar infinitely narrow.

Pages 55-58: [figures only]
Page 59: Basic Probability With an Emphasis on Contingency Tables

• Now one final step, to an uncountably large number of bars, each infinitely narrow, yielding a continuous, uniform distribution ranging from A to B.

Page 60: [figure only]
Page 61: Basic Probability With an Emphasis on Contingency Tables

• Now I do the same but I start with a binomial distribution with p = .5 and three bars.

• Note that the bars are not all of equal height.

• Each time I split one, I lower the height of the tail-wards one more than the center-wards one.

Pages 62-67: [figures only]
Page 68: Basic Probability With an Emphasis on Contingency Tables

• Now one final leap to a continuous (normal) distribution with an uncountably large number of infinitely narrow bars.

Page 69: [figure only]
Page 70: Basic Probability With an Emphasis on Contingency Tables

Random Sampling

• Sampling N data points from a population is random if every possible different sample of size N was equally likely to be selected.

• Random samples most often will be representative of the population.

• Our stats assume random sampling.

Page 71: Basic Probability With an Emphasis on Contingency Tables

Y Random, X Not

            Probability
Sample      X       Y
AB         1/2     1/6
AC          0      1/6
AD          0      1/6
BC          0      1/6
BD          0      1/6
CD         1/2     1/6
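The table can be read as two sampling schemes for drawing N = 2 cases from a population of four (A, B, C, D). The sketch below enumerates the six possible samples and checks which scheme meets the definition of random sampling. The helper `is_random` and the dictionary names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

population = ["A", "B", "C", "D"]

# Every distinct sample of size 2: AB, AC, AD, BC, BD, CD.
samples = ["".join(pair) for pair in combinations(population, 2)]

# Scheme Y: every possible sample is equally likely.
prob_y = {s: Fraction(1, len(samples)) for s in samples}

# Scheme X: only AB or CD can ever be drawn.
prob_x = {s: Fraction(1, 2) if s in ("AB", "CD") else Fraction(0)
          for s in samples}

def is_random(probabilities):
    """Random sampling: all possible samples have the same probability."""
    return len(set(probabilities.values())) == 1

print(samples)            # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
print(is_random(prob_y))  # True
print(is_random(prob_x))  # False
```

Under scheme X each individual still has a 1/2 chance of being selected, yet the sampling is not random because the possible samples are not equally likely.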

Page 72: Basic Probability With an Emphasis on Contingency Tables

Counting Rules

• PSYC 2101 students can skip the material in the rest of this slide show.

Page 73: Basic Probability With an Emphasis on Contingency Tables

Arranging Y Things

• There are Y! ways to arrange Y different things.

• I am getting a four scoop ice cream cone.
• Chocolate, Vanilla, Coconut, and Mint.
• How many different ways can I arrange these four flavors?
• 4! = 4(3)(2)(1) = 24.

Page 74: Basic Probability With an Emphasis on Contingency Tables

Permutations

• If I have 10 different flavors, how many different ways can I select and arrange 4 different flavors from these 10?

$$\frac{N!}{(N-Y)!} = \frac{10!}{(10-4)!} = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6!}{6!} = 5040$$

Page 75: Basic Probability With an Emphasis on Contingency Tables

Combinations

• Same problem, but order of the flavors does not count.

• There are Y! ways to arrange Y things, so just divide the number of permutations by Y!.

$$\frac{N!}{(N-Y)!\,Y!} = \frac{10!}{6!\,4!} = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6!}{6! \cdot 4 \cdot 3 \cdot 2} = 210$$
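The three counting rules (arrangements, permutations, combinations) can be checked against Python's standard library. This is a sketch for the ice cream example; `math.perm` and `math.comb` require Python 3.8 or later, and the variable names are illustrative.

```python
from math import comb, factorial, perm

n_flavors, n_scoops = 10, 4

# Y! ways to arrange Y different things.
arrangements = factorial(n_scoops)

# N! / (N - Y)! ordered selections of Y flavors from N.
ordered_selections = perm(n_flavors, n_scoops)

# N! / ((N - Y)! Y!) selections when order does not count.
unordered_selections = comb(n_flavors, n_scoops)

print(arrangements)          # 24
print(ordered_selections)    # 5040
print(unordered_selections)  # 210
```

Dividing the permutation count by Y! (5040 / 24) gives the combination count, as the slide describes.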

Page 76: Basic Probability With an Emphasis on Contingency Tables

Number of Different Strings

• $C^L$ = number of different strings
• C is the number of different characters available
• L is the length of the string.
• Ten different characters (0 – 9) and two character strings
• $10^2$ = 100 different strings

Page 77: Basic Probability With an Emphasis on Contingency Tables

• Use letters instead (A through Z)
• $26^2$ = 676 different strings
• Use letters and numbers
• $36^2$ = 1,296 different strings
• Use strings of length 1 or 2.
• 36 + 1,296 = 1,332 different strings

Page 78: Basic Probability With an Emphasis on Contingency Tables

• Use strings of length up to 3.
• $36^3$ = 46,656 three character strings
• + 1,332 one and two character strings
• 47,988 different strings.
• Use lengths up to 4
• 1,679,616 + 47,988 = 1,727,604
• Use lengths up to 5
• 60,466,176 + 1,727,604 = 62,193,780

Page 79: Basic Probability With an Emphasis on Contingency Tables

• Use strings of length up to 6
• 2,176,782,336 + 62,193,780 = 2,238,976,116 different strings
• That is over 2 BILLION different strings.
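The running totals above are sums of C^L over each allowed length. A minimal Python sketch follows; the function name `strings_up_to` is made up for illustration.

```python
def strings_up_to(n_characters, max_length):
    """Count strings of length 1..max_length over n_characters symbols."""
    return sum(n_characters**length for length in range(1, max_length + 1))

# 36 characters: the letters A through Z plus the digits 0 through 9.
for max_length in range(1, 7):
    print(max_length, f"{strings_up_to(36, max_length):,}")
```

For a maximum length of 6 this gives 2,238,976,116, matching the total on the last slide.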