lecture 1. uncertain events and probability 2020

Post on 15-Feb-2022


Lecture 1. Uncertain events and probability

2020

1 Uncertain outcomes

Consider an experiment with an uncertain outcome.

Example: I toss a coin twice, each time noting whether it lands heads (H) or tails (T).

The four possible outcomes are: HH, HT, TH, TT.

2 Events

An event represents a set of outcomes. For example, in the coin-tossing experiment,

Event                        Outcomes
First throw gives a head     HH, HT
Same result on both throws   HH, TT
At least one head            HT, TH, HH
No heads                     TT

Mutually exclusive and exhaustive events:

Event      no heads   one head   two heads
Outcomes   TT         HT, TH     HH

3 Combining events

$A \vee B$ is the event ‘either A or B occurs or both’.

$A \wedge B$ is the event ‘both A and B occur’.

For example, in the coin-tossing experiment, take

A = (HH, HT) ‘first toss is H’,
B = (HT, TH) ‘one H, one T’.

Then

$A \vee B$ = (HH, HT, TH) ‘at most one T’,
$A \wedge B$ = (HT) ‘H then T’.
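The two ways of combining events correspond to set union and set intersection. The following is a minimal sketch, assuming each event is stored as a Python set of outcome strings; the variable names are illustrative and not part of the lecture.

```python
from itertools import product

# All outcomes of two tosses: HH, HT, TH, TT
outcomes = {"".join(t) for t in product("HT", repeat=2)}

A = {"HH", "HT"}   # first toss is H
B = {"HT", "TH"}   # one H, one T

A_or_B = A | B     # 'either A or B occurs or both'
A_and_B = A & B    # 'both A and B occur'

print(sorted(A_or_B))   # ['HH', 'HT', 'TH']
print(sorted(A_and_B))  # ['HT']
```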

4 A Venn diagram

[Figure: Venn diagram of the four outcomes HH, TH, HT, TT.]

5 Probability from randomisation

A box contains R red balls and B black balls. One ball is selected from the box at random. What is the probability that the ball is red?

The probability of an event is the sum of the probabilities of the outcomes associated with the event.

The probability that any one ball is selected is $1/(R + B)$.

The probability that the selected ball is red is the sum of R probabilities, each equal to $1/(R + B)$, so

$\Pr(\text{selected ball is red}) = \dfrac{R}{R + B}$

6 Probability from symmetry

If the coin is fair, symmetry between head and tail suggests that all four outcomes of the coin-tossing experiment are equally probable.

The probabilities of getting 0, 1, or 2 heads are then 1/4, 1/2, and 1/4.

Note that it would be wrong to argue that there are 3 possible values for the number of heads and they are all equally probable.
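The probabilities 1/4, 1/2, 1/4 can be checked by summing outcome probabilities, as a rough sketch. It assumes the four outcomes are equally probable (the fair-coin symmetry argument) and uses exact fractions; the names `outcomes` and `heads_dist` are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=2)]
p_outcome = Fraction(1, len(outcomes))  # 1/4 each, by symmetry

# Probability of an event = sum of the probabilities of its outcomes
heads_dist = {}
for o in outcomes:
    k = o.count("H")
    heads_dist[k] = heads_dist.get(k, 0) + p_outcome

for k in sorted(heads_dist):
    print(k, heads_dist[k])
# 0 1/4
# 1 1/2
# 2 1/4
```

The value 1 gets probability 1/2 because two outcomes (HT and TH) contribute to it, which is why the three values are not equally probable.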

7 Probability of $A \vee B$

For mutually exclusive events A and B,

$\Pr(A \vee B) = \Pr(A) + \Pr(B)$

In general,

$\Pr(A \vee B) = \Pr(A) + \Pr(B) - \Pr(AB)$

For three events,

$\Pr(A \vee B \vee C) = \Pr(A) + \Pr(B) + \Pr(C) - \Pr(AB) - \Pr(AC) - \Pr(BC) + \Pr(ABC)$

(writing AB for $A \wedge B$, etc.)
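Both formulas can be verified by direct enumeration on the two-toss experiment. This is a sketch under the assumption of a fair coin (equally probable outcomes); the third event C and the helper `pr` are chosen here for illustration only.

```python
from fractions import Fraction

outcomes = ["HH", "HT", "TH", "TT"]  # assumed equally probable

def pr(event):
    """Probability of an event (a set of outcomes) under equal weights."""
    return Fraction(len(event), len(outcomes))

A = {"HH", "HT"}   # first toss is H
B = {"HT", "TH"}   # one H, one T
C = {"HH", "TT"}   # same result on both throws (illustrative choice)

# Two events
two_lhs = pr(A | B)
two_rhs = pr(A) + pr(B) - pr(A & B)

# Three events
three_lhs = pr(A | B | C)
three_rhs = (pr(A) + pr(B) + pr(C)
             - pr(A & B) - pr(A & C) - pr(B & C)
             + pr(A & B & C))

print(two_lhs, two_rhs)      # 3/4 3/4
print(three_lhs, three_rhs)  # 1 1
```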

8 Sampling without replacement

A population consists of n objects. A sample of size s is selected without replacement. How many different samples can be drawn?

If we distinguish between samples which comprise the same objects but are selected in a different order, the answer is

$n(n-1) \cdots (n-s+1)$

For the case s = n, the answer is n! (n factorial, the number of ways of arranging n objects).

9 Sampling without replacement

Usually we regard two samples as identical if they differ only in the order in which the objects were selected.

What is the total number of samples in this case?

Samples previously regarded as different now form a number of groups, each of size s!, with samples in the same group regarded as identical.

The number of groups is

$\binom{n}{s} = \dfrac{n(n-1) \cdots (n-s+1)}{s!}$

This is ‘n choose s’, the number of ways of choosing s objects from n.

10 Example

A box contains 4 balls numbered 1 to 4. Select two balls without replacement.

 ···   (1,2)  (1,3)  (1,4)
(2,1)   ···   (2,3)  (2,4)
(3,1)  (3,2)   ···   (3,4)
(4,1)  (4,2)  (4,3)   ···

If we take account of ordering, there are 12 simple outcomes. If we disregard ordering, there are only six:

(1,2), (1,3), (1,4), (2,3), (2,4), (3,4),

where, e.g., (1,2) now represents either (1,2) or (2,1).
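The two counts can be reproduced with the standard library, as a small sketch. `itertools.permutations` lists ordered samples and `itertools.combinations` lists unordered ones; `math.perm` and `math.comb` (available from Python 3.8) give the counts $n(n-1) \cdots (n-s+1)$ and ‘n choose s’ directly.

```python
from itertools import combinations, permutations
from math import comb, perm

balls = [1, 2, 3, 4]
ordered = list(permutations(balls, 2))     # order of selection matters
unordered = list(combinations(balls, 2))   # order of selection ignored

print(len(ordered), perm(4, 2))    # 12 12
print(len(unordered), comb(4, 2))  # 6 6
print(unordered)
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```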

11 A property of ‘choose’ numbers

A box contains 5 balls numbered 1 to 5. Select two balls (without replacement).

How many samples of size 2? Answer: $(5 \times 4)/2 = 10$.

How many samples of size 3? Answer: $(5 \times 4 \times 3)/(3 \times 2) = 10$.

Why are these numbers the same?

Each selection of two balls to be included is also a selection of three balls to be excluded from the sample.
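The pairing between included and excluded balls can be made explicit, as a brief sketch: each sample of size 2 is matched with its complement of size 3, and the complements are all distinct. The names `included` and `excluded` are illustrative.

```python
from itertools import combinations
from math import comb

balls = {1, 2, 3, 4, 5}
included = [frozenset(c) for c in combinations(sorted(balls), 2)]
excluded = [frozenset(balls - s) for s in included]  # complement of each sample

print(comb(5, 2), comb(5, 3))   # 10 10
print(len(set(excluded)))       # 10 distinct samples of size 3
```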

12 Binomial expansion

$(p+q)^n = (p+q)(p+q) \cdots (p+q)$ (n factors)

The result is a sum of terms of the form $p^s q^{n-s}$. The multiplier for this term is the number of ways of choosing s of the n factors (those that contribute p), so that

$(p+q)^n = \sum_{s=0}^{n} \binom{n}{s} p^s q^{n-s}$
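A numerical check of the expansion, as a sketch: the function name `binomial_terms` and the values of p, q and n are chosen here for illustration. Exact fractions avoid rounding error.

```python
from fractions import Fraction
from math import comb

def binomial_terms(p, q, n):
    """Terms C(n, s) * p**s * q**(n - s) for s = 0..n."""
    return [comb(n, s) * p**s * q**(n - s) for s in range(n + 1)]

p, q, n = Fraction(1, 4), Fraction(3, 4), 5  # illustrative values
terms = binomial_terms(p, q, n)
print(sum(terms) == (p + q)**n)  # True
```

When p + q = 1, as here, the terms sum to 1, so they form a probability distribution.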

13 Pascal’s triangle

          1
        1   1
      1   2   1
    1   3   3   1
  1   4   6   4   1
1   5  10  10   5   1

. . . and so on . . .
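Each entry in the triangle is the sum of the two entries above it. A short sketch that builds the rows this way (the function name `pascal_rows` is illustrative, and it assumes at least one row is requested):

```python
def pascal_rows(n_rows):
    """Return the first n_rows rows of Pascal's triangle (n_rows >= 1)."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # Interior entries are sums of adjacent entries in the previous row
        rows.append([1] + [a + b for a, b in zip(prev, prev[1:])] + [1])
    return rows

for row in pascal_rows(6):
    print(*row)
```

Row n (counting from 0) lists the ‘choose’ numbers for samples of size 0, 1, ..., n from n objects.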

END OF LECTURE

Lecture 2. Random Variables

2020

14 Random variables

A random variable is a numerical summary of each outcome.

Example: a ‘fair’ coin is tossed twice. We assign equal probabilities of 1/4 to each outcome.

Consider the mutually exclusive and exhaustive events ‘no heads’, ‘one head’, and ‘two heads’.

No. of heads   0     1     2
Probability    1/4   1/2   1/4

The number of heads is a random variable. The probability distribution is given in the table above.

15 Scottish soldiers

Chest sizes (inches) for 5738 Scottish soldiers:

Chest size (inches)   33   34   35   36    ···   46   47   48   Total
Number of soldiers     3   18   81   185   ···   21    4    1   5738

One individual is chosen randomly from the population.

Let X be the chest size of the selected individual.

X is a random variable: its value is known only after selection.

16 Scottish soldiers

Before selection, all that can be said about X with certainty is that its value will be one of the values 33, 34, etc. However, probabilities can be assigned:

$\Pr(X = 33) = 3/5738, \quad \Pr(X = 34) = 18/5738, \quad \ldots$

The values 33, 34, . . . , 48 and corresponding probabilities define the probability distribution of X. Obviously, these probabilities add up to 1.

17 A probability distribution

[Figure: probability distribution of the number of heads obtained when a fair coin is tossed 16 times. Horizontal axis: 0 to 16; vertical axis: 0.00 to 0.20.]

18 Cumulative probability function

[Figure: cumulative probabilities for the number of heads obtained when a fair coin is tossed 16 times. Horizontal axis: 0 to 16; vertical axis: 0.0 to 1.0.]
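The distribution of the number of heads in 16 tosses of a fair coin, and its cumulative probabilities, can be computed from the ‘choose’ numbers. The sketch below assumes all $2^{16}$ toss sequences are equally probable, so the probability of k heads is ‘16 choose k’ divided by $2^{16}$; the names `pmf` and `cdf` are illustrative.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb

n = 16
# Pr(k heads) = (number of sequences with k heads) / (number of sequences)
pmf = [Fraction(comb(n, k), 2**n) for k in range(n + 1)]
# Cumulative probability: Pr(number of heads <= k)
cdf = list(accumulate(pmf))

print(float(pmf[8]))   # largest probability, a little under 0.20
print(cdf[-1])         # 1
```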

19 Expectation

X takes values $x_1, x_2, \ldots$ with probabilities $p_1, p_2, \ldots$ The ‘expectation’ of X is

$E(X) = p_1 x_1 + p_2 x_2 + \cdots$

You pay me 1 dollar (the stake). A fair coin is tossed. If it falls heads, I return the stake and pay you an extra dollar. If tails, I keep the stake and pay nothing. Is this a fair game?

Your return is a random variable, determined by the toss of the coin. Your ‘expected’ return is $2 \times 0.5 + 0 \times 0.5 = 1$ dollar, exactly equal to the stake. The game is fair.
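The expected return in the coin game can be computed directly from the definition of expectation. A minimal sketch, assuming the distribution is stored as a dictionary from values to probabilities (the helper `expectation` is illustrative):

```python
from fractions import Fraction

def expectation(dist):
    """E(X) for a distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

stake = 1
# Return is 2 dollars on heads (stake plus one dollar), 0 on tails
returns = {2: Fraction(1, 2), 0: Fraction(1, 2)}

expected_return = expectation(returns)
print(expected_return == stake)  # True: the game is fair
```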

20 Mean and variance of a random variable

The mean value is another name for E(X). It is a measure of location, somewhere near the ‘middle’ of the probability distribution. On this course it will usually be denoted by m.

The variance ($\sigma^2$) is $E[(X - m)^2]$, and the standard deviation ($\sigma$) is the square root of the variance. These are measures of dispersion, or spread of the distribution.

When X is derived by sampling from a population, m and $\sigma^2$ are the mean and variance of the values in the population.

An alternative formula for variance is $E(X^2) - m^2$.

21 Calculating population mean and variance

Variance depends on the spread of the x values, and also on the probabilities. Variance is 0.5 in example 1, and 0.2 in example 2.

            Example 1               Example 2
x        p      xp     x²p       p      xp     x²p
0       0.25   0.00   0.00      0.10   0.00   0.00
1       0.50   0.50   0.50      0.80   0.80   0.80
2       0.25   0.50   1.00      0.10   0.20   0.40
Total   1.00   1.00   1.50      1.00   1.00   1.20
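The table totals give the mean (total of xp) and E(X²) (total of x²p), and the variance follows from the alternative formula E(X²) minus m². A small sketch reproducing both examples with exact fractions (the function name `mean_and_variance` is illustrative):

```python
from fractions import Fraction

def mean_and_variance(xs, ps):
    """Return (m, variance) using variance = E(X^2) - m^2."""
    m = sum(p * x for x, p in zip(xs, ps))
    ex2 = sum(p * x**2 for x, p in zip(xs, ps))
    return m, ex2 - m**2

xs = [0, 1, 2]
example_1 = mean_and_variance(xs, [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)])
example_2 = mean_and_variance(xs, [Fraction(1, 10), Fraction(4, 5), Fraction(1, 10)])

print(example_1)  # mean 1, variance 1/2
print(example_2)  # mean 1, variance 1/5
```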

22 Two measures of shape

Skewness measures departure from symmetry. A positive value indicates an extended right tail.

Kurtosis measures the thickness of the tails of a distribution. A large value indicates a heavy-tailed distribution.

A distribution can be both skew and kurtotic.

23 A skew distribution

[Figure: distribution of the number of heads obtained when a biased coin (p = 1/4) is tossed 16 times. Horizontal axis: 0 to 16; vertical axis: 0.00 to 0.20.]
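The slides describe skewness only qualitatively. As a rough sketch, the code below assumes the common definition of skewness as the third central moment divided by the cube of the standard deviation (this formula is not given in the lecture) and applies it to the number of heads in 16 tosses, for a biased coin with p = 1/4 and for a fair coin. The function names are illustrative.

```python
from math import comb

def binomial_pmf(n, p):
    """Probabilities of 0..n heads in n independent tosses, Pr(head) = p."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def skewness(pmf):
    """Assumed definition: E[(X - m)^3] / sigma^3."""
    m = sum(k * pk for k, pk in enumerate(pmf))
    var = sum((k - m)**2 * pk for k, pk in enumerate(pmf))
    third = sum((k - m)**3 * pk for k, pk in enumerate(pmf))
    return third / var**1.5

skew_biased = skewness(binomial_pmf(16, 0.25))
skew_fair = skewness(binomial_pmf(16, 0.5))
print(skew_biased > 0)            # True: extended right tail
print(abs(skew_fair) < 1e-9)      # True: symmetric distribution
```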

END OF LECTURE

Lecture 3. Conditional probability and independence

2020

24 Conditional probability

$\Pr(A \mid B)$ denotes the conditional probability of the event A, given B. Note that

$\Pr(AB) = \Pr(A \mid B) \Pr(B)$

The conditioning event B can have a strong influence on the probability. For example, the probability of dying within a year is not the same for a 20-year-old and a 70-year-old.

Often the unconditional probability applies to an entire population, and the conditional probability applies to a sub-population.

25 Independent events

Events A and B are independent if $\Pr(A \mid B) = \Pr(A)$,

or equivalently, if $\Pr(AB) = \Pr(A) \Pr(B)$.

Example: the probabilities that a coin lands ‘heads’ or ‘tails’ are p and q (p + q = 1). The coin is tossed twice.

Independence between the outcomes of the first and second toss leads to the following probabilities:

Outcome       TT      HT     TH     HH
Probability   $q^2$   $pq$   $pq$   $p^2$

What is the probability distribution of the number of heads?
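Summing the outcome probabilities by number of heads gives $q^2$, $2pq$ and $p^2$ for 0, 1 and 2 heads. A brief sketch that carries out this sum, assuming independent tosses; the function name `heads_distribution` and the value p = 1/4 are illustrative.

```python
from fractions import Fraction
from itertools import product

def heads_distribution(p):
    """Distribution of the number of heads in two independent tosses."""
    q = 1 - p
    dist = {0: 0, 1: 0, 2: 0}
    for first, second in product("HT", repeat=2):
        # Independence: multiply the probabilities of the two tosses
        prob = (p if first == "H" else q) * (p if second == "H" else q)
        dist[(first + second).count("H")] += prob
    return dist

p = Fraction(1, 4)
dist = heads_distribution(p)
print(dist[0], dist[1], dist[2])  # 9/16 3/8 1/16
```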

26 Independent events

The assumption of independence is often based on knowledge of the mechanism generating the random events. For example, the result of a coin toss does not depend on what happened previously: the coin has no ‘memory’ of previous events.

Non-independence may be described as a statistical ‘association’ between the events.

Positive association: $\Pr(AB) - \Pr(A) \Pr(B) > 0$
Negative association: $\Pr(AB) - \Pr(A) \Pr(B) < 0$

27 Covariance

Random variables X and Y are independent if, for all possible values a and b,

$\Pr(X = a, Y = b) = \Pr(X = a) \Pr(Y = b).$

If X and Y are not independent, the degree of dependence is measured by the covariance

$\operatorname{cov}(X, Y) = E[(X - m_X)(Y - m_Y)]$

If X and Y are independent, $\operatorname{cov}(X, Y) = 0$.

Covariance affects the variance of the sum X + Y:

$\operatorname{var}(X + Y) = \operatorname{var}(X) + \operatorname{var}(Y) + 2 \operatorname{cov}(X, Y)$
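The variance formula for a sum can be checked on a small joint distribution. The sketch below uses a hypothetical joint distribution for (X, Y), invented here purely for illustration, in which X and Y are positively associated; the helper `E` computes an expectation over that joint distribution.

```python
from fractions import Fraction

# Hypothetical joint distribution {(x, y): probability}; not from the lecture
joint = {(0, 0): Fraction(3, 8), (0, 1): Fraction(1, 8),
         (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8)}

def E(f):
    """Expectation of f(X, Y) under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

mX, mY = E(lambda x, y: x), E(lambda x, y: y)
var_X = E(lambda x, y: (x - mX)**2)
var_Y = E(lambda x, y: (y - mY)**2)
cov_XY = E(lambda x, y: (x - mX) * (y - mY))
var_sum = E(lambda x, y: (x + y - mX - mY)**2)

print(cov_XY)                                   # 1/8
print(var_sum == var_X + var_Y + 2 * cov_XY)    # True
```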

28 Covariance

[Figure: scatter plot.]

29 Covariance

[Figure: plot with horizontal axis from −5 to 10 and vertical axis from 0 to 400.]

30 Correlation coefficient

The correlation between X and Y is cov(X, Y) divided by the product of the standard deviations.

A covariance can take any value, but the correlation coefficient never exceeds 1 in magnitude. It is unchanged by a change in the scale of X or Y.

31 Genetic covariance

A positive covariance is often the result of a shared random term.

For example, for measurements X, Y on two siblings,

$X = m + U + e_1, \quad Y = m + U + e_2,$

where $U$, $e_1$, and $e_2$ are independent r.v.s.

In this case $\operatorname{cov}(X, Y) = \operatorname{var}(U)$.

(U represents genetic inheritance from shared parents.)
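The result cov(X, Y) = var(U) can be derived in a few lines. The sketch below assumes two standard properties not stated on the slide: adding the constant m does not change a covariance, and covariance is linear in each argument. The cross terms vanish because $U$, $e_1$ and $e_2$ are independent.

```latex
\begin{align*}
\operatorname{cov}(X, Y)
  &= \operatorname{cov}(m + U + e_1,\; m + U + e_2) \\
  &= \operatorname{cov}(U, U) + \operatorname{cov}(U, e_2)
     + \operatorname{cov}(e_1, U) + \operatorname{cov}(e_1, e_2) \\
  &= \operatorname{var}(U) + 0 + 0 + 0 \\
  &= \operatorname{var}(U).
\end{align*}
```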

32 Law of total probability

$A_1, \ldots, A_k$ are mutually exclusive and exhaustive events. For any event B,

$\Pr(B) = \Pr(B \mid A_1) \Pr(A_1) + \cdots + \Pr(B \mid A_k) \Pr(A_k)$

A population is one-third male and two-thirds female. Half the males and three in eight of the females are left-handed. The proportion of the population which is left-handed is

$(1/2)(1/3) + (3/8)(2/3) = 1/6 + 1/4 = 5/12$
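The left-handedness calculation is a direct application of the law of total probability. A minimal sketch with exact fractions; the dictionary names are illustrative.

```python
from fractions import Fraction

# Pr(A_i): proportions of the population
pr_A = {"male": Fraction(1, 3), "female": Fraction(2, 3)}
# Pr(B | A_i): proportion left-handed within each group
pr_B_given_A = {"male": Fraction(1, 2), "female": Fraction(3, 8)}

# Pr(B) = sum over i of Pr(B | A_i) Pr(A_i)
pr_B = sum(pr_B_given_A[a] * pr_A[a] for a in pr_A)
print(pr_B)  # 5/12
```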

33 Bayes formula

Under the same conditions as on the previous slide

$\Pr(A_i \mid B) = \Pr(B \mid A_i) \Pr(A_i) / \Pr(B)$

where Pr(B) is calculated as on the previous slide:

$\Pr(B) = \Pr(B \mid A_1) \Pr(A_1) + \cdots + \Pr(B \mid A_k) \Pr(A_k)$

34 Bayes formula

The table below shows the proportions of the population for all four combinations of gender and ‘handedness’, with marginal totals.

          Left   Right   Total
Male      1/6    1/6     1/3
Female    1/4    5/12    2/3
Total     5/12   7/12    1

Among left-handed individuals, the proportion which are male is

$\dfrac{1/6}{1/6 + 1/4} = 2/5$
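Bayes formula applied to the same population, as a small sketch: the function name `posterior` and the dictionary names are illustrative. The denominator is the law of total probability.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Pr(A_i | B) for each i, from Pr(A_i) and Pr(B | A_i)."""
    joint = {a: likelihood[a] * prior[a] for a in prior}
    pr_B = sum(joint.values())  # law of total probability
    return {a: j / pr_B for a, j in joint.items()}

prior = {"male": Fraction(1, 3), "female": Fraction(2, 3)}
left_given_sex = {"male": Fraction(1, 2), "female": Fraction(3, 8)}

post = posterior(prior, left_given_sex)
print(post["male"], post["female"])  # 2/5 3/5
```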

END OF LECTURE
