Agenda: 1. Information 2. Recap from last time a) Questionnaires b) Standard deviations 3. Probability a) Definitions b) Rules of calculation 4. Probability distributions 5. SPSS 6. Today's exercises

Upload: angelo-leakey

Post on 31-Mar-2015


Page 1

Agenda

1. Information
2. Recap from last time
a) Questionnaires
b) Standard deviations
3. Probability
a) Definitions
b) Rules of calculation
4. Probability distributions
5. SPSS
6. Today's exercises

Page 2

Standard Deviation (standardafvigelsen)

• Gives a measure of variation by summarizing the deviations of each observation from the mean and calculating an adjusted average of these deviations.

$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$

Site  Obs. 1  Obs. 2  Obs. 3  Sum  n  Mean  Std. dev.
A     5       5       5       15   3  5     0.0
B     4       5       6       15   3  5     1.0
C     3       5       7       15   3  5     2.0

[Figure: plot of the observations; axis ticks 0 to 4 and 3 to 7]
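The standard deviations in the table can be checked with a short sketch. The function name `sample_std` and the dictionary `sites` are illustrative names, not part of the slides; the formula is the sample standard deviation with n - 1 in the denominator, as on this slide.

```python
import math


def sample_std(values):
    """Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


# Observations per site, copied from the table on this slide.
sites = {"A": [5, 5, 5], "B": [4, 5, 6], "C": [3, 5, 7]}

std_by_site = {name: sample_std(obs) for name, obs in sites.items()}
for name, s in std_by_site.items():
    print(name, s)
```

All three sites have the same mean (5), so the standard deviation is what separates them: the wider the observations spread around 5, the larger s becomes.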

Page 3

B. Learning Objectives

1. Sample Space (udfaldsrum) for a Trial (forsøg)

2. Event (Hændelse). A, B, C, ...

3. Probabilities for a sample space

4. Probability of an event

5. Basic rules for finding probabilities about a pair of events

6. Probability of the union of two events

7. Probability of the intersection of two events

Page 4

Learning Objective 1: Sample Space (udfaldsrum) for a Trial (forsøg)

• The sample space (udfaldsrummet) is the set of all possible outcomes.

• Example: The sample space for a test consisting of 3 questions, each of which can be answered correctly, C (correct), or incorrectly, I (incorrect), is shown in the figure.
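Since the figure is not reproduced in this transcript, the sample space for the 3-question test can be listed with a minimal sketch. The names `ANSWERS`, `NUM_QUESTIONS` and `sample_space` are illustrative.

```python
from itertools import product

# Each of the 3 questions is answered correctly ("C") or incorrectly ("I").
ANSWERS = "CI"
NUM_QUESTIONS = 3

# The sample space is every possible sequence of 3 answers.
sample_space = ["".join(outcome) for outcome in product(ANSWERS, repeat=NUM_QUESTIONS)]

print(sample_space)       # the 2 * 2 * 2 = 8 possible outcomes
print(len(sample_space))
```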

Page 5

Learning Objective 2: Event (hændelse)

• An event (hændelse) is a subset of the sample space

• An event corresponds to a particular outcome or a group of possible outcomes.

• Example:
– Event A = student answers all 3 questions correctly = (CCC)
– Event B = student passes (at least 2 correct) = (CCI, CIC, ICC, CCC)
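The two example events can be written as subsets of the sample space. This is a sketch only; the variable names are illustrative.

```python
from itertools import product

# Sample space for 3 questions answered correctly (C) or incorrectly (I).
sample_space = {"".join(outcome) for outcome in product("CI", repeat=3)}

# Event A: all 3 questions answered correctly.
event_a = {o for o in sample_space if o.count("C") == 3}

# Event B: the student passes (at least 2 correct answers).
event_b = {o for o in sample_space if o.count("C") >= 2}

print(sorted(event_a))
print(sorted(event_b))
print(event_a <= sample_space, event_b <= sample_space)  # both are subsets
```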

Page 6

Learning Objective 3: Probabilities for a sample space

Each outcome, e.g. CCC, in a sample space has a probability

• The probability of each individual outcome is between 0 and 1. 0 ≤ P ≤ 1

• The total (the sum) of all the individual probabilities equals 1.

Page 7

Learning Objective 4: Probability of an Event

• The Probability of an event A is denoted by P(A).

• The Probability is obtained by adding the probabilities of the individual outcomes in the event.

• When all the possible outcomes are equally likely:

$$ P(A) = \frac{\text{number of outcomes in event } A}{\text{number of outcomes in the sample space}} $$
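As an illustration of the counting formula, the sketch below applies it to the 3-question test. It assumes, purely for the example, that all 8 outcomes are equally likely; the slides do not state this for the test. The helper name `probability` is illustrative.

```python
from fractions import Fraction
from itertools import product

sample_space = {"".join(outcome) for outcome in product("CI", repeat=3)}


def probability(event, space):
    """P(event) when every outcome in `space` is equally likely."""
    return Fraction(len(event & space), len(space))


event_a = {"CCC"}                       # all 3 correct
event_b = {"CCI", "CIC", "ICC", "CCC"}  # at least 2 correct

p_a = probability(event_a, sample_space)
p_b = probability(event_b, sample_space)
print(p_a, p_b)
```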

Page 8

Learning Objective 4: Example: Requests to a website

1. List 2 events in the sample space shown in the table.
2. What is the probability that a randomly chosen person ...
a) contacted a website with a mobile phone?
b) visited a .ORG website?

3. Which domain type is a mobile phone user most likely to visit?

4. Which domain type is most likely to be visited by a mobile phone user?

Domain  Mobile  PC      Total
.DK     90      14,010
.EDU    71      30,629
.COM    69      24,631
.ORG    80      10,620
Total

Page 9

Learning Objective 5: Basic rules for finding probabilities about a pair of events

• Some events are expressed as the outcomes (udfald) that

1. Are not in some other event (complement of the event)

2. Are in one event and in another event (intersection of two events)

3. Are in one event or in another event (union of two events)

Page 10

Learning Objective 5: Complement of an event

• The complement of an event A consists of all outcomes in the sample space that are not in A.

• The probabilities of A and of A’ add to 1

• P(A’) = 1 – P(A)
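A small sketch of the complement rule, reusing the 3-question test and assuming (for illustration only) that all 8 outcomes are equally likely:

```python
from fractions import Fraction
from itertools import product

sample_space = {"".join(outcome) for outcome in product("CI", repeat=3)}

# Event A: all 3 questions correct. Its complement A' is everything else.
event_a = {"CCC"}
complement_a = sample_space - event_a

# Equally likely outcomes are an assumption made for this example.
p_a = Fraction(len(event_a), len(sample_space))
p_not_a = Fraction(len(complement_a), len(sample_space))

print(p_a, p_not_a, p_a + p_not_a)
```

Counting the 7 outcomes in the complement gives the same result as 1 – P(A), which is usually the quicker route when A is small.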

Page 11

Learning Objective 5: Intersection of two events (fællesmængde)

• The intersection of A and B consists of outcomes that are in both A and B.

Page 12

Learning Objective 5: Union of two events (foreningsmængde)

• The union of A and B consists of outcomes that are in A or B or in both A and B.
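Intersection and union map directly onto set operations. The sketch below uses two hypothetical events on the 3-question test (first question correct, second question correct); these events are chosen for illustration and are not taken from the slides.

```python
from itertools import product

sample_space = {"".join(outcome) for outcome in product("CI", repeat=3)}

# Hypothetical events for illustration.
first_correct = {o for o in sample_space if o[0] == "C"}   # event A
second_correct = {o for o in sample_space if o[1] == "C"}  # event B

intersection = first_correct & second_correct  # in both A and B
union = first_correct | second_correct         # in A or B or both

print(sorted(intersection))
print(sorted(union))
```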

Page 13

Learning Objective 6: Probability of the Union of Two Events

Addition Rule: For the union of two events, P(A or B) = P(A) + P(B) – P(A and B)

If the events are disjoint, P(A and B) = 0, so P(A or B) = P(A) + P(B) – 0 = P(A) + P(B)
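The addition rule can be checked by counting outcomes directly. This sketch assumes equally likely outcomes on the 3-question test and uses hypothetical events (first question correct, second question correct, all incorrect) that do not come from the slides.

```python
from fractions import Fraction
from itertools import product

sample_space = {"".join(outcome) for outcome in product("CI", repeat=3)}


def probability(event):
    """P(event), assuming all outcomes are equally likely (an assumption)."""
    return Fraction(len(event), len(sample_space))


event_a = {o for o in sample_space if o[0] == "C"}  # first question correct
event_b = {o for o in sample_space if o[1] == "C"}  # second question correct
event_c = {"III"}                                   # all incorrect, disjoint from A

# General case: subtract the overlap so it is not counted twice.
p_a_or_b = probability(event_a) + probability(event_b) - probability(event_a & event_b)

# Disjoint case: the overlap is empty, so the probabilities simply add.
p_a_or_c = probability(event_a) + probability(event_c)

print(p_a_or_b, probability(event_a | event_b))
print(p_a_or_c, probability(event_a | event_c))
```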

Page 14

Learning Objective 6: Example

• Event A = Mobile
• Event B = .ORG domain
• How do we calculate that P(A and B) is 0.001?

Domain  Mobile  PC      Total
.DK     90      14,010  14,100
.EDU    71      30,629  30,700
.COM    69      24,631  24,700
.ORG    80      10,620  10,700
Total   310     79,890  80,200
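One way to arrive at 0.001 is to divide the Mobile/.ORG cell by the grand total. The sketch below does this with the counts from the table and also applies the addition rule; the dictionary layout and variable names are illustrative.

```python
from fractions import Fraction

# Visit counts copied from the table: domain -> (mobile, PC).
visits = {
    ".DK": (90, 14010),
    ".EDU": (71, 30629),
    ".COM": (69, 24631),
    ".ORG": (80, 10620),
}

total = sum(mobile + pc for mobile, pc in visits.values())
mobile_total = sum(mobile for mobile, _ in visits.values())
org_total = sum(visits[".ORG"])

# Event A = mobile visit, event B = visit to a .ORG domain.
p_a = Fraction(mobile_total, total)
p_b = Fraction(org_total, total)
p_a_and_b = Fraction(visits[".ORG"][0], total)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a_or_b = p_a + p_b - p_a_and_b

print(round(float(p_a_and_b), 3))
print(round(float(p_a_or_b), 4))
```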

Page 15

Learning Objective 7: Example

• Multiplication Rule:

For the intersection of two independent events, A and B, P(A and B) = P(A) × P(B)

A = correct. Probability of guessing correctly, P(A) = 0.2.

What is the probability that a student answers:

a) 3 questions correctly?

b) at least 2 questions correctly?
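A sketch of how the multiplication rule answers both questions. It assumes a test of 3 questions (as in the earlier example) and that the guesses are independent of each other; neither assumption is stated on this slide. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import product

p_correct = Fraction(1, 5)  # P(A) = 0.2 from the slide


def outcome_probability(outcome):
    """Multiply the per-question probabilities (assumes independent guesses)."""
    p = Fraction(1)
    for answer in outcome:
        p *= p_correct if answer == "C" else 1 - p_correct
    return p


outcomes = ["".join(o) for o in product("CI", repeat=3)]

# a) all 3 questions correct
p_all_three = outcome_probability("CCC")

# b) at least 2 correct: add the probabilities of the qualifying outcomes
p_at_least_two = sum(outcome_probability(o) for o in outcomes if o.count("C") >= 2)

print(float(p_all_three))
print(float(p_at_least_two))
```

Under these assumptions, a) is 0.2 × 0.2 × 0.2 = 0.008 and b) is 3 × (0.2 × 0.2 × 0.8) + 0.008 = 0.104.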

Page 16

Learning Objective 7: Events Often Are Not Independent

Don’t assume that events are independent unless you have given this assumption careful thought and it seems plausible.

             Pips on the die roll
Coin toss    1-2    3-4    5-6
Tails
Heads

Page 17

Learning Objective 7: Events Often Are Not Independent

• Define the events A and B as follows:
– A: {first question is answered correctly}
– B: {second question is answered correctly}

• P(A) = P{(CC), (CI)} = 0.58 + 0.05 = 0.63
• P(B) = P{(CC), (IC)} = 0.58 + 0.11 = 0.69
• P(A and B) = P{(CC)} = 0.58

• If A and B were independent, P(A and B) = P(A) × P(B) = 0.63 × 0.69 = 0.43
• Thus, in this case, A and B are not independent!

                     Question 2
Question 1   Correct  Incorrect  Total
Correct      0.58     0.05       0.63
Incorrect    0.11     0.26       0.37
Total        0.69     0.31       1.00
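The same check can be sketched in code from the joint probabilities in the table. The dictionary keys are (answer to question 1, answer to question 2); the names are illustrative.

```python
from fractions import Fraction

# Joint probabilities from the table: (question 1, question 2) -> probability.
joint = {
    ("C", "C"): Fraction("0.58"),
    ("C", "I"): Fraction("0.05"),
    ("I", "C"): Fraction("0.11"),
    ("I", "I"): Fraction("0.26"),
}

# Marginal probabilities: add the cells in the relevant row or column.
p_a = sum(p for (q1, _), p in joint.items() if q1 == "C")  # first correct
p_b = sum(p for (_, q2), p in joint.items() if q2 == "C")  # second correct
p_a_and_b = joint[("C", "C")]

independent = p_a_and_b == p_a * p_b

print(float(p_a), float(p_b), float(p_a * p_b))
print(independent)
```

The product P(A) × P(B) is about 0.43, well below the observed 0.58, so a student who answers the first question correctly is more likely than average to answer the second correctly too.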

Page 18

C. Learning Objectives

1. Conditional probability

2. Multiplication rule for finding P(A and B)

3. Independent events defined using conditional probability

Page 19

Learning Objective 1: Conditional Probability

• For events A and B, the conditional probability of event A, given that event B has occurred, is:

$$ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} $$

• P(A|B) is read as “the probability of event A, given event B.” The vertical bar represents the word “given”.

• Of the times that B occurs, P(A|B) is the proportion of times that A also occurs.
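As a minimal sketch, the definition can be wrapped in a small helper. The function name `conditional_probability` is illustrative, and the numbers are taken from the two-question table earlier in the slides (A = first question correct, B = second question correct).

```python
from fractions import Fraction


def conditional_probability(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) = 0."""
    if p_b == 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_a_and_b / p_b


# P(A and B) = 0.58 and P(B) = 0.69 in the two-question table.
p_a_given_b = conditional_probability(Fraction("0.58"), Fraction("0.69"))

print(round(float(p_a_given_b), 2))
```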

Page 20

Learning Objective 6: Example: 1) Converting counts to probabilities

Domain  Mobile  PC      Total
.DK     90      14,010  14,100
.EDU    71      30,629  30,700
.COM    69      24,631  24,700
.ORG    80      10,620  10,700
Total   310     79,890  80,200

Domain  Mobile  PC      Total
.DK     0.0011  0.1747  0.1758
.EDU    0.0009  0.3819  0.3828
.COM    0.0009  0.3071  0.3080
.ORG    0.0010  0.1324  0.1334
Total   0.0039  0.9961  1.0000
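The conversion divides every count by the grand total (80,200 visits). A short sketch that reproduces the second table from the first, rounded to 4 decimals; the data layout and names are illustrative.

```python
# Visit counts from the first table: domain -> (mobile, PC).
counts = {
    ".DK": (90, 14010),
    ".EDU": (71, 30629),
    ".COM": (69, 24631),
    ".ORG": (80, 10620),
}

grand_total = sum(mobile + pc for mobile, pc in counts.values())

# Each probability is count / grand total: (mobile, PC, row total).
probabilities = {
    domain: (
        round(mobile / grand_total, 4),
        round(pc / grand_total, 4),
        round((mobile + pc) / grand_total, 4),
    )
    for domain, (mobile, pc) in counts.items()
}

for domain, row in probabilities.items():
    print(domain, *row)
```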

Page 21

Learning Objective 1: Example 1

• What was the probability of a cell phone visit, given that the site is a .ORG domain?
– Event A: Cell phone is used
– Event B: Site is a .ORG domain

$$ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{0.0010}{0.1334} \approx 0.007 $$

Domain  Mobile  PC      Total
.DK     0.0011  0.1747  0.1758
.EDU    0.0009  0.3819  0.3828
.COM    0.0009  0.3071  0.3080
.ORG    0.0010  0.1324  0.1334
Total   0.0039  0.9961  1.0000
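The calculation reads two cells from the probability table: the Mobile/.ORG cell for P(A and B) and the .ORG row total for P(B). A sketch of that lookup, with an illustrative table layout:

```python
# Probability table from the slide: domain -> (mobile, PC, total).
table = {
    ".DK": (0.0011, 0.1747, 0.1758),
    ".EDU": (0.0009, 0.3819, 0.3828),
    ".COM": (0.0009, 0.3071, 0.3080),
    ".ORG": (0.0010, 0.1324, 0.1334),
}

MOBILE, PC, TOTAL = 0, 1, 2  # column positions in each row


def p_mobile_given_domain(domain):
    """P(mobile | domain) = P(mobile and domain) / P(domain)."""
    row = table[domain]
    return row[MOBILE] / row[TOTAL]


p_a_given_b = p_mobile_given_domain(".ORG")
print(round(p_a_given_b, 3))
```

Because the table entries are already rounded to 4 decimals, the result differs slightly from the value obtained from the raw counts (80 / 10,700), but both round to 0.007.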

Page 22

Learning Objective 1: Exercise

• What is the probability of a cell phone visit given that the site is a .DK domain?

• A = Cell phone is used
• B = Site is a .DK

• P(A and B) =
• P(B) =
• P(A|B) =

$$ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} $$

Domain  Mobile  PC      Total
.DK     0.0011  0.1747  0.1758
.EDU    0.0009  0.3819  0.3828
.COM    0.0009  0.3071  0.3080
.ORG    0.0010  0.1324  0.1334
Total   0.0039  0.9961  1.0000

Page 23

Learning Objective 3: Checking for Independence

• Two events A and B are independent if the probability that one occurs is not affected by whether or not the other event occurs.

• To determine whether events A and B are independent:
– Is P(A and B) = P(A) × P(B)?
– Is P(A|B) = P(A)?
– Is P(B|A) = P(B)?

• If any of these is true, the others are also true and the events A and B are independent.
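The three checks can be run side by side. This sketch uses the visit counts from the domain table earlier in the slides, with A = mobile visit and B = .ORG domain; the variable names are illustrative. Exact equality is a strict test: with observed counts the two sides will rarely match exactly, so in practice one asks whether they are close.

```python
from fractions import Fraction

# Counts from the domain table earlier in the slides.
total_visits = 80200
mobile_visits = 310      # event A
org_visits = 10700       # event B
mobile_org_visits = 80   # A and B

p_a = Fraction(mobile_visits, total_visits)
p_b = Fraction(org_visits, total_visits)
p_a_and_b = Fraction(mobile_org_visits, total_visits)

p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

checks = {
    "P(A and B) = P(A) x P(B)": p_a_and_b == p_a * p_b,
    "P(A|B) = P(A)": p_a_given_b == p_a,
    "P(B|A) = P(B)": p_b_given_a == p_b,
}

for name, holds in checks.items():
    print(name, holds)
print(round(float(p_a_given_b), 4), round(float(p_a), 4))
```

All three checks give the same answer, as the slide states. Here P(A|B) is roughly twice P(A), so mobile use and visiting a .ORG domain do not look independent in these data.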

Page 24

Learning Objectives

1. Probability distributions for discrete random variables

2. Mean of a probability distribution

3. Summarizing the spread of a probability distribution

Page 25

Learning Objective 1: Probability Distribution

• A random variable is a numerical measurement of the outcome of a random phenomenon.

• The probability distribution of a random variable specifies its possible values and their probabilities.

Page 26

Learning Objective 1: Random Variable

• Use letters near the end of the alphabet, such as x, to symbolize
– Variables
– A particular value of the random variable

• Use a capital letter, such as X, to refer to the random variable itself.

Example: Flip a coin three times
– X = number of heads in the 3 flips; defines the random variable
– x = 2; represents a possible value of the random variable

Page 27

Learning Objective 2: Probability Distribution of a Discrete Random Variable

• A discrete random variable X has separate values (such as 0, 1, 2, …) as its possible outcomes

• Its probability distribution assigns a probability P(x) to each possible value x:
– For each x, the probability P(x) falls between 0 and 1
– The sum of the probabilities for all the possible x values equals 1
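The coin example from the previous slide (X = number of heads in 3 flips) gives a small distribution on which both properties can be checked. The sketch assumes a fair coin, which the slides do not state explicitly; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 equally likely sequences of 3 flips (fair coin is an assumption).
flips = list(product("HT", repeat=3))

# X = number of heads; count how many sequences give each value x.
counts = Counter(outcome.count("H") for outcome in flips)

# P(x) = (number of sequences with x heads) / (number of sequences).
distribution = {x: Fraction(counts[x], len(flips)) for x in sorted(counts)}

for x, p in distribution.items():
    print(x, p)

# The two defining properties of a discrete probability distribution.
print(all(0 <= p <= 1 for p in distribution.values()))
print(sum(distribution.values()) == 1)
```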

Page 28

Learning Objective 2: Example

• What is the estimated probability of at least three home runs?

Page 29

Learning Objective 3: The Mean of a Discrete Probability Distribution

• The mean of a probability distribution for a discrete random variable is

$$ \mu = \sum x \, p(x) $$

where the sum is taken over all possible values of x.

• The mean of a probability distribution is denoted by the parameter, µ.

• The mean is a weighted average; values of x that are more likely receive greater weight P(x)

Page 30

Learning Objective 3: Example

• Find the mean of this probability distribution.

The mean:

$$ \mu = \sum x \, p(x) = 0(0.23) + 1(0.38) + 2(0.22) + 3(0.13) + 4(0.03) + 5(0.01) = 1.38 $$
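The weighted sum can be reproduced in a few lines. The distribution below is read off the terms of the calculation on this slide (the original table is not in the transcript); exact fractions avoid floating-point rounding. The names are illustrative.

```python
from fractions import Fraction

# x -> p(x), taken from the terms of the calculation on this slide.
distribution = {
    0: Fraction("0.23"),
    1: Fraction("0.38"),
    2: Fraction("0.22"),
    3: Fraction("0.13"),
    4: Fraction("0.03"),
    5: Fraction("0.01"),
}

# Mean of a discrete distribution: sum of x * p(x) over all values x.
mean = sum(x * p for x, p in distribution.items())

print(float(mean))
```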

Page 31

Learning Objective 4: The Standard Deviation of a Probability Distribution

The standard deviation of a probability distribution, denoted by the parameter, σ, measures its spread.

– Larger values of σ correspond to greater spread.

– Roughly, σ describes how far the random variable falls, on the average, from the mean of its distribution
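The slides do not give a formula for σ. The sketch below uses the standard definition for a discrete distribution, σ = sqrt(Σ (x – µ)² p(x)), which is an addition here, and applies it to the distribution from the mean example (µ = 1.38). The names are illustrative.

```python
import math
from fractions import Fraction

# x -> p(x), the distribution from the mean example.
distribution = {
    0: Fraction("0.23"),
    1: Fraction("0.38"),
    2: Fraction("0.22"),
    3: Fraction("0.13"),
    4: Fraction("0.03"),
    5: Fraction("0.01"),
}

mean = sum(x * p for x, p in distribution.items())

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance.
sigma = math.sqrt(variance)

print(float(variance))
print(round(sigma, 2))
```

With these numbers σ comes out at about 1.12, so values of the random variable typically fall a little more than one unit from the mean of 1.38.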