BSc/HND IETM Week 9/10 - Some Probability Distributions

Upload: delta

Post on 18-Jan-2016


When we looked at the histogram a few weeks ago, we were looking at frequency distributions.

It is possible to convert such frequency distributions into probability distributions, such that the probability of encountering some particular value (or range of values) of x is plotted on the vertical axis, rather than the number of occurrences of that value of x.

There are a few standard forms of such distributions, which make analysis rather easy - so long as the data really do fit the chosen form.

We shall look at two of these standard forms, the normal and the negative exponential distributions.

Probability distributions from frequency distributions

Suppose that our previously-mentioned (and, sadly, hypothetical) optional unit for your course, ‘Flower Arranging for Engineers’, becomes extremely popular. In fact, it becomes so popular that it is studied by 208 students, from all the various BSc courses in the School.

In an effort to analyse the performance of the students, so as to determine if any improvements to the unit are required, we might decide to plot a histogram of the final marks obtained.

As we know, this is a frequency distribution, and might be obtained from the following summary of the students’ scores, as shown:

Mark Scored (%)    Frequency (No. of students)
0-9.9              1
10-19.9            4
20-29.9            8
30-39.9            17
40-49.9            47
50-59.9            53
60-69.9            39
70-79.9            25
80-89.9            11
90-100             3

[Figure: histogram of Frequency (No. of students) against Mark (per cent), with the ten bars labelled 1, 4, 8, 17, 47, 53, 39, 25, 11 and 3.]

Frequency polygons

The first step in the conversion is to change from the histogram to what is called a frequency polygon. This is simply a line graph, joining the centres of each of the chosen data intervals.

At the ends, our frequency polygon reaches the zero axis as shown, since no student can obtain less than zero or more than 100 per cent. In situations when this doesn’t apply, it is conventional to terminate the polygon on the zero axis, half way through the next interval.
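
The construction described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names are invented for the example, and the interval centres are taken as 5, 15, ..., 95 for the ten mark bands in the summary table.

```python
# Sketch: vertices of a frequency polygon for the marks summary.
# All names here are illustrative assumptions, not taken from the slides.
lower_edges = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
width = 10
frequencies = [1, 4, 8, 17, 47, 53, 39, 25, 11, 3]

# Centre of each interval, e.g. 5 for the 0-9.9 band.
centres = [edge + width / 2 for edge in lower_edges]

# Marks cannot fall outside 0 to 100, so the polygon meets the zero axis there.
polygon = [(0, 0)] + list(zip(centres, frequencies)) + [(100, 0)]

for mark, freq in polygon:
    print(f"{mark:5.1f}  {freq}")
```

Where the variable has no natural limits, the two end points would instead be placed half an interval beyond the first and last centres, following the convention mentioned above.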

[Figure: frequency polygon of Frequency (No. of students) against Mark (per cent), with the points labelled 1, 4, 8, 17, 47, 53, 39, 25, 11 and 3.]

It is very easy to obtain probability distributions from diagrams such as those above. All that is necessary is to divide each frequency by the total number of (in this case) students, to obtain the probability of any individual student, selected at random, obtaining a mark in a particular range.

For example, to convert the histogram or the frequency polygon above into probability distributions, simply divide every number on the vertical axis (and therefore also the numbers written on the plots) by 208.

Thus, the vertical axes would now be calibrated in probabilities from zero to 53/208 = 0.255.

The probability of any given student obtaining a mark in the range 40 to 49.9 per cent will be 47/208 = 0.226. The probability of a student scoring 90 per cent or more will be 3/208 = 0.0144, etc.
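
As a quick check of this arithmetic, the conversion can be written as a short Python sketch. The frequencies come from the marks summary; the variable names are assumptions made for the example.

```python
# Sketch: turn the frequency distribution into a probability distribution
# by dividing each frequency by the total number of students.
bands = ["0-9.9", "10-19.9", "20-29.9", "30-39.9", "40-49.9",
         "50-59.9", "60-69.9", "70-79.9", "80-89.9", "90-100"]
frequencies = [1, 4, 8, 17, 47, 53, 39, 25, 11, 3]

total = sum(frequencies)  # total number of students
probabilities = [f / total for f in frequencies]

for band, p in zip(bands, probabilities):
    print(f"{band:>8}  {p:.4f}")

# The probabilities of all the bands together should add up to 1.
print("sum of probabilities:", sum(probabilities))
```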

The normal distribution

It is not very surprising that the marks distribution (frequency or probability) looks like the diagrams above.

In a fair examination, taken by a large number of students, we would expect that only a few students would obtain either abysmally low marks or astronomically high marks.

We would expect the majority of marks to be ‘somewhere in the middle’, with a ‘tail’ at both the low and the high ends of the range. This is what we see above.

Several real-life situations fit this general form of distribution, where it is most likely that results will be clustered around the centre of some range, with outlying values tailing off towards the ends of the range.

Wisniewski, in his ‘Foundation’ text, uses an example based on the distributions of the weights of breakfast cereal packed by machines into boxes.

There should always ideally be the stated amount in a box but, inevitably, some boxes will be lighter, and some heavier. There will be the odd ‘rogue’ boxes a long way from the mean.

To make it easier to cope with such situations, they are often assumed to fit a standardised probability distribution, called the normal distribution.

By doing this, it is possible to use standard printed tables to make predictions such as (for example), how many students would be expected to score less than 40 per cent.

To allow standard tables to be used, we need to assume a certain fixed shape of probability distribution, and we also need to define it in terms of mean and standard deviation.

We cannot define it in terms of actual data values (e.g. examination marks, or weight of cereal in a box), otherwise we would need a different set of tables for every new problem.

The normal distribution curve is actually defined by a rather unpleasant formula (but we don’t need to use it, as we are going to use tables which have been derived from it by someone else).

If the variable in which we are interested is x (e.g. a mark in per cent, or the weight of cereal in a box in kg), the mean value of x is $\bar{x}$ and the standard deviation of the data set is $\sigma_x$, then the normal distribution curve is defined by the probability that x will take a particular value (P(x)) obeying the following relationship (I believe there is an error in Wisniewski’s version):

$$P(x) = \frac{1}{\sigma_x \sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \bar{x}}{\sigma_x}\right)^2}$$

The resulting plot of P(x) as x varies is a ‘bell-shaped’ curve, as shown in the next slide.

[Figure: the normal distribution curve. P(x), from 0 to 0.4, is plotted against z from -4 to 4, where z = no. of standard deviations of x from its mean value, and z = 0 for the mean of x.]

Notes
1. The “x axis” is in STANDARD DEVIATIONS.
2. The total area under the graph is 1 unit.
3. The area under the graph between two values of x gives the probability that the quantity will be between those values.
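
Notes 2 and 3 can be checked numerically. The sketch below is a rough illustration, not part of the original slides: it evaluates the bell curve with the horizontal axis in standard deviations and adds up the area with the trapezium rule. The function names and the number of steps are assumptions chosen for the example.

```python
import math

def standard_normal_pdf(z):
    """Height of the bell curve at z standard deviations from the mean."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def area_between(z_low, z_high, steps=10_000):
    """Trapezium-rule estimate of the area under the curve from z_low to z_high."""
    h = (z_high - z_low) / steps
    total = 0.5 * (standard_normal_pdf(z_low) + standard_normal_pdf(z_high))
    for i in range(1, steps):
        total += standard_normal_pdf(z_low + i * h)
    return total * h

print(standard_normal_pdf(0.0))  # peak height, just under 0.4
print(area_between(-8, 8))       # effectively the whole curve: very close to 1
print(area_between(-1, 1))       # probability of lying within 1 SD of the mean
```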

Example

Say that a large set of examination results has a mean of 55 per cent, and a standard deviation of 15 per cent.

How many students would we expect to fail the examination (if we define a failure as obtaining less than 40 per cent), and how many students would we expect to get a first-class result (defined as obtaining 70 per cent or more)?

[Figure: the normal distribution curve, P(x) against z from -4 to 4, marking z for x = 40 per cent, z for x = 55 per cent and z for x = 70 per cent. The area below z for x = 40 per cent is the probability that a student fails; the area above z for x = 70 per cent is the probability that a student gets a ‘first’.]

SD from mean   Area      SD from mean   Area
2.00           0.0227    0.95           0.1710
1.95           0.0256    0.90           0.1840
1.90           0.0287    0.85           0.1976
1.85           0.0321    0.80           0.2118
1.80           0.0359    0.75           0.2266
1.75           0.0400    0.70           0.2419
1.70           0.0445    0.65           0.2578
1.65           0.0495    0.60           0.2742
1.60           0.0548    0.55           0.2911
1.55           0.0606    0.50           0.3085
1.50           0.0668    0.45           0.3263
1.45           0.0735    0.40           0.3446
1.40           0.0807    0.35           0.3632
1.35           0.0885    0.30           0.3821
1.30           0.0968    0.25           0.4013
1.25           0.1056    0.20           0.4207
1.20           0.1150    0.15           0.4404
1.15           0.1250    0.10           0.4602
1.10           0.1356    0.05           0.4801
1.05           0.1468    0.00           0.5000
1.00           0.1586

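z = 1.0 (both 40 per cent and 70 per cent are 15 marks, i.e. 1 SD, from the mean of 55 per cent)

First: probability 0.1587

Fail: also 0.1587!

(The table entry for 1.00 SD is 0.1586; to four decimal places the area is 0.1587.) Multiplying each probability by the number of students gives the expected number of firsts and of failures.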
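
The same example can be worked without printed tables. The sketch below uses Python's `math.erfc` to obtain the tail area of the normal curve. The helper name `upper_tail` is invented for the example, and the class size of 208 is purely an assumption borrowed from the earlier marks summary, since the slides do not say how large this set of results is.

```python
import math

def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

mean, sd = 55.0, 15.0

# Convert the two marks of interest into standard deviations from the mean.
z_fail = (40.0 - mean) / sd    # -1.0
z_first = (70.0 - mean) / sd   # +1.0

# By symmetry, the area below z equals the area above -z.
p_fail = upper_tail(-z_fail)
p_first = upper_tail(z_first)

class_size = 208  # assumption for illustration only; not given in the example

print(f"P(fail)  = {p_fail:.4f}, expected students = {p_fail * class_size:.1f}")
print(f"P(first) = {p_first:.4f}, expected students = {p_first * class_size:.1f}")
```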

The negative exponential distribution

To cover a wider range of real-world situations, more ‘standardised’ probability distributions are required.

The other one we shall briefly look at is the negative exponential distribution. This is also sometimes called a ‘failure-rate’ curve, because it tends to describe how components fail with time.

If a certain number of components is manufactured and put into service, it is reasonable to assume that they will all eventually fail. The probability of any one of the components failing during a given time period might well depend on how many components are left in service.

Choose to measure time t in the best units for the problem (seconds, months, years, etc.). Technically, the unit chosen should be short compared with the expected lifetime of a component, so that any given component is expected to last for many time units.

Let $\lambda$ be the failure rate, that is, the proportion of components expected to fail in one time unit. This means that $\lambda$ must have ‘dimensions’ of (1/time). In the example above, we said that 1 per cent of components might fail in three years so, in that case, the failure rate $\lambda$ = 0.01/3 (proportion per year).

This can also be viewed as a probability - there is a probability of 0.01/3 that any given component will fail in a given period of one year.

Therefore, to find the proportion of components expected to fail over a time t (measured in our chosen units), we need the quantity $\lambda t$. This is now dimensionless - it is actually the probability that any given component will fail over the stated time period.
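
A small Python sketch may help to show how this quantity behaves. It uses the failure rate of 0.01/3 per year from the example. The second function is included only as a check: it uses the exponential decay law that these slides reach further on, and it suggests that the simple product of failure rate and time is a good estimate only while that product is much smaller than 1. The function names and the chosen time periods are assumptions for illustration.

```python
import math

# Failure rate from the example: 1 per cent of components failing in three years.
failure_rate = 0.01 / 3  # proportion per year

def fraction_failed_linear(rate, t):
    """Simple estimate: proportion expected to fail over time t is rate * t."""
    return rate * t

def fraction_failed_exponential(rate, t):
    """Proportion failed under exponential decay: 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-rate * t)

for years in (1, 3, 30, 300):
    linear = fraction_failed_linear(failure_rate, years)
    exact = fraction_failed_exponential(failure_rate, years)
    print(f"{years:4d} years: rate*t = {linear:.4f}, 1 - exp(-rate*t) = {exact:.4f}")
```

Over three years the two figures agree closely, but over 300 years the simple product reaches 1 while the exponential form stays well below it.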

We can now state the rate of change of the number of components as follows (it is negative, because the number decreases as time passes):

$$\frac{\text{change in the number of components}}{\text{time period}} = -\frac{\text{number of failures}}{\text{time period}} = -\frac{\lambda N t}{t} = -\lambda N$$

This is called a differential equation and would normally be written

$$\frac{dN}{dt} = -\lambda N$$

in which the quantity dN / dt is to be interpreted as the rate of change of N as the time progresses.

It turns out that:

$$n = N e^{-\lambda t}$$

where N now denotes the initial number of components and n the number remaining after time t.
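
One way to see that the exponential form is consistent with the differential equation is to step the equation forward in many small time increments and compare the result with the formula. The sketch below does this in Python. The initial number of components (1000), the failure rate (0.5 per time unit) and the function names are hypothetical values chosen only for illustration.

```python
import math

def remaining_closed_form(initial, rate, t):
    """Number of components remaining after time t: initial * exp(-rate * t)."""
    return initial * math.exp(-rate * t)

def remaining_by_stepping(initial, rate, t, steps=100_000):
    """Apply 'change in number = -rate * number * dt' over many small steps."""
    dt = t / steps
    n = float(initial)
    for _ in range(steps):
        n -= rate * n * dt
    return n

initial = 1000  # hypothetical initial number of components
rate = 0.5      # hypothetical failure rate per time unit

for t in range(0, 7):
    stepped = remaining_by_stepping(initial, rate, t)
    formula = remaining_closed_form(initial, rate, t)
    print(f"t = {t}: stepping gives {stepped:8.3f}, formula gives {formula:8.3f}")
```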

We can plot this negative exponential function as the following curve relating the remaining number of components n to time:

[Figure: $n = N e^{-\lambda t}$, plotted as a multiple of the initial number of components (N), from 0 to 1, against time units from 0 to 6.]


The End