-115-123 32 37 -115-123 32 37 (a) (b) cs pi after: 2000-2008, m 5.0 before: 1932-1999, m 3.0...


Post on 21-Dec-2015


Page 1:

[Figure: two panels, (a) CS and (b) PI, each with axis ticks -115, -123 and 32, 37. Legend: After: 2000-2008, M ≥ 5.0; Before: 1932-1999, M ≥ 3.0; Before: 1984-1987, M ≥ 3.0.]

CS: 19/21 hits. PI: 17/21 hits.

Is this difference statistically significant?

Page 2:

[Figure: "Rates + Changes in Rates". Axes: % area, % hits. Curves: Cellular Seismology; Rundle et al. (PI). Legend: After: 2000-2008, M ≥ 5.0; Before: 1932-1999, M ≥ 3.0; Before: 1984-1987, M ≥ 3.0. Marked values: 17/21 and 19/21.]

Is this difference statistically significant?

Page 3:

The Binomial Random Variable

• The coin-tossing experiment is a simple example of a binomial random variable. Toss a fair coin n = 3 times and record x = number of heads.

x    p(x)
0    1/8
1    3/8
2    3/8
3    1/8
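
The table can be reproduced by counting: of the 2^3 = 8 equally likely toss sequences, comb(3, x) contain exactly x heads. The following is a minimal Python sketch; the function name `coin_toss_pmf` is illustrative and does not come from the slides.

```python
from fractions import Fraction
from math import comb


def coin_toss_pmf(n):
    """Return {x: P(x heads)} for n tosses of a fair coin."""
    # comb(n, x) of the 2**n equally likely sequences contain exactly x heads.
    return {x: Fraction(comb(n, x), 2 ** n) for x in range(n + 1)}


pmf = coin_toss_pmf(3)
for x, p in pmf.items():
    print(x, p)
```

Exact fractions are used so the printed values can be compared directly with the 1/8, 3/8, 3/8, 1/8 entries in the table.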

Page 4:

The Binomial Experiment

1. The experiment consists of n identical trials.
2. Each trial results in one of two outcomes, success (S) or failure (F).
3. The probability of success on a single trial is p and remains constant from trial to trial. The probability of failure is q = 1 - p.
4. The trials are independent.
5. We are interested in x, the number of successes in n trials.
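
The five conditions can be sketched as a small simulation. This is only an illustration: the helper name `run_binomial_experiment`, the seed, and the values n = 21 and p = 0.5 are assumptions chosen for the example, not taken from this slide.

```python
import random


def run_binomial_experiment(n, p, rng):
    """Run n independent trials with constant success probability p.

    Returns x, the number of successes.
    """
    # Each call to rng.random() is one trial: success (S) if below p, else failure (F).
    return sum(1 for _ in range(n) if rng.random() < p)


rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable
n, p = 21, 0.5          # illustrative values
counts = [run_binomial_experiment(n, p, rng) for _ in range(10_000)]
mean_x = sum(counts) / len(counts)
print(f"average number of successes: {mean_x:.2f} (n * p = {n * p})")
```

Over many repetitions the average of x should settle near n * p, the mean of a binomial random variable.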

Page 5:

“Fair Coin” Null Hypothesis

P(observing 14 or more heads)

n = 21 coins: 9% probability

Page 6:

“Fair Coin” Null Hypothesis

P(observing 15 or more heads)

n = 21 coins: 4% probability
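
The two fair-coin probabilities (14 or more heads and 15 or more heads in 21 tosses) are upper-tail sums of the binomial distribution. A minimal sketch follows; the helper `binomial_tail` is a hypothetical name introduced here, not something from the slides.

```python
from math import comb


def binomial_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k, n + 1))


for k in (14, 15):
    prob = binomial_tail(21, 0.5, k)
    print(f"P({k} or more heads in 21 tosses) = {prob:.3f}")
```

The results come out near 0.095 and 0.039, consistent with the rounded 9% and 4% shown on the slides.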

Page 7:

PI Null Hypothesis (expect “on average” 17 hits)

P(observing 19 or more hits)

n = 21 “after” earthquakes: 21% probability

Page 8:

PI Null Hypothesis (expect “on average” 17 hits)

P(observing 21 hits)

n = 21 “after” earthquakes: 1% probability
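
The PI null-hypothesis probabilities can be checked the same way as the coin-toss ones. The sketch below assumes that "expect on average 17 hits" means each of the 21 "after" earthquakes is a hit with probability p = 17/21; that reading, and the names `P_PI` and `binomial_tail`, are assumptions made for this example.

```python
from math import comb

N_QUAKES = 21
P_PI = 17 / 21  # assumption: "on average 17 hits" out of 21 read as a per-earthquake hit rate


def binomial_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k, n + 1))


p_19_or_more = binomial_tail(N_QUAKES, P_PI, 19)
p_all_21 = binomial_tail(N_QUAKES, P_PI, 21)
print(f"P(19 or more hits) = {p_19_or_more:.3f}")
print(f"P(21 hits)         = {p_all_21:.3f}")
```

Under this assumption the values are roughly 0.21 and 0.01, in line with the 21% and 1% quoted on the slides.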

Page 9:

Test of Statistical Significance

Question: Cellular Seismology (CS) - 19/21 hits, Pattern Informatics (PI) - 17/21 hits, but:

Does CS in general really perform better than PI, or are those two extra hits just a coincidence?

Need to decide between two possibilities:

The mean success rate of CS exceeds the mean success rate of PI.
or
The mean success rate of CS does not exceed the mean success rate of PI.

This is an example of a statistical hypothesis test.

Page 10:

Test of Statistical Significance

Analogous to a courtroom trial. In trying a person for a crime, the jury needs to decide between one of two possibilities.

• The person is guilty.
• The person is innocent.

Begin by assuming that the person is innocent.

The prosecutor presents evidence, trying to convince the jury to reject the original assumption of innocence, and conclude that the person is guilty.

Page 11:

Test of Statistical Significance

1. The null hypothesis, H0:

Assumed to be true until we can prove otherwise.

2. The alternative hypothesis, Ha:

Will be accepted as true if we can disprove H0.

        Court Trial    CS vs. PI
H0:     Innocent       CS success rate does not exceed PI success rate.
Ha:     Guilty         CS success rate exceeds PI success rate.

Page 12:

Reject the null hypothesis, H0, if (assuming that H0 is true) the probability of observing at least as many hits as you actually did observe (19/21) is so low that the result is unlikely to be due to chance alone.

If 19/21 hits is unusually high, we say that the difference between the expected 17/21 hits and what you actually observed is statistically significant.

How low does the probability have to be?

If the probability is less than 0.05 (5%), we say that the difference is statistically significant at the 95% level.

If the probability is less than 0.10 (10%), we say that the difference is statistically significant at the 90% level.
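
The two thresholds above amount to a simple decision rule. The sketch below is one way to write it down; the function name `significance_label` is hypothetical, and the example probabilities 0.21 and 0.01 are the rounded values quoted elsewhere in these slides.

```python
def significance_label(probability):
    """Map a tail probability (computed assuming H0 is true) to the slide's wording."""
    # Check the stricter threshold first, so the strongest level that applies is reported.
    if probability < 0.05:
        return "statistically significant at the 95% level"
    if probability < 0.10:
        return "statistically significant at the 90% level"
    return "not statistically significant"


for probability in (0.21, 0.01):
    print(f"{probability:.2f}: {significance_label(probability)}")
```

A probability below 0.05 also clears the 0.10 threshold; the function reports only the stricter of the two levels.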

Page 13:

PI Null Hypothesis (expect “on average” 17 hits)

P(observing 21 hits)

n = 21 “after” earthquakes: 1% probability

Statistically Significant
or
“No way!”

Page 14:

PI Null Hypothesis (expect “on average” 17 hits)

P(observing 19 or more hits)

n = 21 “after” earthquakes: 21% probability

Not Statistically Significant
or
“Way!”
