Probability
Harry R. Erwin, PhD
School of Computing and Technology, University of Sunderland
Resources
• Rowntree, D. (1981) Statistics Without Tears. Harmondsworth: Penguin.
• Hinton, P.R. (1995) Statistics Explained. London: Routledge.
• Hatch, E.M. and Farhady, H. (1982) Research Design And Statistics For Applied Linguistics. Rowley Mass.: Newbury House.
• Crawley, M.J. (2005) Statistics: An Introduction Using R. Wiley.
• Gonick, L. and Smith, W. (1993) The Cartoon Guide to Statistics. HarperResource (for fun).
Module Outline
• Introduction
• Using R
• Data analysis (the gathering, display, and summary of data)
• Probability (the laws of chance)
• Statistical inference (the drawing of conclusions from specific data knowing probability)
• Experimental design and modeling (putting it all together)
Lecture Outline
• Introduction
• Basic Definitions
• Basic Operations
• Conditional Probability
• Independence
• Bayes Theorem
• Discrete Random Variables
• Continuous Random Variables
• Examples
Introduction
• Historically, probability has had one application: gambling
• Claudius (Roman emperor, 10 BCE-54 CE) wrote the first book on gambling: How to win at dice
• Blaise Pascal and Pierre de Fermat invented the modern theory of probability
Basic Definitions
• Random experiment: the process of observing a chance event
• Elementary outcomes: the possible results
• Sample space: the collection of all elementary outcomes, written {outcome1, outcome2,…}
• The probability of outcome1 is written P(outcome1)
In the play Rosencrantz and Guildenstern Are Dead
Outcome (note same sample space)    Probability
Heads                               1.0
Tails                               0.0
Rules of Elementary Probability
• The probability of any outcome in the sample space is between 0.0 and 1.0 (non-negative)
• The total probability of all outcomes in the sample space is 1.0
• Suppose the probability of outcome1 is p. The total probability of all the other outcomes is 1.0 - p.
Basic Operations
• An event is a set of elementary outcomes. The probability of the event is the sum of the probabilities of the outcomes in the set.
• An event is written {list of outcomes}
• Suppose you roll a die twice, and the sum is 3. The event corresponding to this is {(1,2),(2,1)}
• The probability of this event is 1/36 + 1/36 = 1/18
• What is the probability of getting a sum of 4?
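The dice example can be checked by listing the sample space and adding up outcome probabilities. The following is a minimal sketch, assuming a fair six-sided die; the names `sample_space` and `prob_of_sum` are illustrative and do not come from the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling one die twice: 36 equally likely ordered pairs.
# Assumption: the die is fair, so each pair has probability 1/36.
sample_space = list(product(range(1, 7), repeat=2))


def prob_of_sum(target):
    """Probability that the two rolls add up to `target`."""
    # The event is the set of elementary outcomes whose sum matches.
    event = [outcome for outcome in sample_space if sum(outcome) == target]
    # Add 1/36 once for every outcome in the event.
    return sum(Fraction(1, len(sample_space)) for _ in event)


print(prob_of_sum(3))  # 1/18, i.e. 1/36 + 1/36
```

Calling `prob_of_sum(4)` is one way to check your answer to the question about a sum of 4.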
Possible Events
• ‘E and F’, meaning both event E and event F occur.
• ‘E or F’, meaning event E occurs, event F occurs, or both occur.
• ‘not E’, meaning event E does not occur.
• P(E or F) = P(E) + P(F) - P(E and F)
• If the events are mutually exclusive, P(E or F) = P(E) + P(F)
• Finally, P(not E) = 1.0 - P(E)
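These rules map directly onto set operations: 'and' is intersection, 'or' is union, and 'not' is the complement within the sample space. The sketch below assumes two rolls of a fair die; the events `E` and `F` are arbitrary choices made for illustration, not examples from the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumption: two rolls of a fair die, all 36 ordered pairs equally likely.
omega = set(product(range(1, 7), repeat=2))


def P(event):
    """Probability of an event, given as a set of equally likely outcomes."""
    return Fraction(len(event), len(omega))


E = {o for o in omega if o[0] == 6}     # illustrative event: first roll is a 6
F = {o for o in omega if sum(o) >= 10}  # illustrative event: total is at least 10

p_or = P(E | F)                              # 'E or F' is the union
p_or_by_rule = P(E) + P(F) - P(E & F)        # the addition rule
p_not_e = P(omega - E)                       # 'not E' is the complement

print(p_or, p_or_by_rule, p_not_e)
```

Subtracting P(E and F) matters here because the outcomes (6,4), (6,5) and (6,6) belong to both events and would otherwise be counted twice.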
Conditional Probability
• Suppose you have two events, E and F, in the same sample space (if E comes from experiment A and F from experiment B, use the sample space of their combined outcomes).
• P(E|F) is the probability of E given that F happens.
• P(E|F) = P(E and F)/P(F), provided P(F) > 0
• P(E and F) = P(E|F)P(F) = P(F|E)P(E)
• Note P(E|E) = P(E and E)/P(E) = P(E)/P(E) = 1
• Also P(E|F) = 0 if E and F are mutually exclusive
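The identities above can be verified on a small sample space. This sketch assumes two rolls of a fair die; the helper `cond_prob` and the events `E`, `F` and `G` are hypothetical names chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumption: two rolls of a fair die, all 36 ordered pairs equally likely.
omega = set(product(range(1, 7), repeat=2))


def prob(event):
    return Fraction(len(event), len(omega))


def cond_prob(e, f):
    """P(E|F) = P(E and F) / P(F); undefined when P(F) is 0."""
    if not f:
        raise ValueError("P(F) is 0, so P(E|F) is undefined")
    return prob(e & f) / prob(f)


E = {o for o in omega if sum(o) == 8}  # illustrative event: total is 8
F = {o for o in omega if o[0] >= 4}    # illustrative event: first roll is 4, 5 or 6
G = {(1, 1)}                           # total is 2; shares no outcome with F

print(cond_prob(E, F))                                    # P(E|F)
print(cond_prob(E, F) * prob(F), cond_prob(F, E) * prob(E))  # both equal P(E and F)
print(cond_prob(E, E))                                    # an event given itself
print(cond_prob(G, F))                                    # mutually exclusive events
```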
Independence
• Two events are independent if the occurrence of one has no influence on the other.
• If two events, E and F, are independent,
  – P(E and F) = P(E)P(F)
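The product rule gives a practical test for independence: compute both sides and compare. The sketch below assumes two rolls of a fair die; `is_independent` and the three events are illustrative names, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumption: two rolls of a fair die, all 36 ordered pairs equally likely.
omega = set(product(range(1, 7), repeat=2))


def prob(event):
    return Fraction(len(event), len(omega))


def is_independent(e, f):
    """True when P(E and F) equals P(E)P(F) exactly."""
    return prob(e & f) == prob(e) * prob(f)


first_even = {o for o in omega if o[0] % 2 == 0}  # depends only on the first roll
second_six = {o for o in omega if o[1] == 6}      # depends only on the second roll
total_twelve = {o for o in omega if sum(o) == 12}  # depends on both rolls

print(is_independent(first_even, second_six))    # separate rolls
print(is_independent(second_six, total_twelve))  # the total depends on the second roll
```

Exact fractions are used so that the equality test is not disturbed by floating-point rounding.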
Bayes Theorem
• Suppose you know P(A|B) and you want to calculate P(B|A)
  – P(B|A)P(A) = P(A|B)P(B)
  – P(B|A) = P(A|B)P(B)/P(A)
• A rare disease has a prevalence of 1/1000
• There is a test that is 99.9% accurate when you have the disease.
• The test also reports a positive result 2% of the time when you don’t have the disease.
• You just had a positive test result. What are your chances of having the disease?
Solution
• Look at 1000000 people, of whom 1000 have the disease
  – 999000 don’t; hence 19980 false positives
  – 1000 do; hence 999 true positives
  – Your chances of having the disease given you had a positive test result are 999/(19980+999) = 1/21. Why?
• Let I be the event that you have the disease and X the event that you test positive.
• P(I and X) = P(I|X)P(X) = P(X|I)P(I)
• P(I|X) = P(X|I)P(I)/P(X) = 0.999*0.001/P(X)
• And P(X) = P(X|I)P(I)+P(X|not I)P(not I) = 0.999*0.001+0.02*0.999 = 0.021*0.999
• So P(I|X) = 1/21
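The same calculation can be written as a short function. This is a minimal sketch using the figures from the worked solution (prevalence 0.001, a 0.999 chance of a positive result when you have the disease, and a 0.02 chance when you do not); the function name `posterior` and its parameter names are hypothetical.

```python
from fractions import Fraction


def posterior(prior, p_pos_given_disease, p_pos_given_healthy):
    """P(I|X): probability of having the disease given a positive test."""
    # Total probability of a positive test:
    # P(X) = P(X|I)P(I) + P(X|not I)P(not I)
    p_positive = (p_pos_given_disease * prior
                  + p_pos_given_healthy * (1 - prior))
    # Bayes theorem: P(I|X) = P(X|I)P(I) / P(X)
    return p_pos_given_disease * prior / p_positive


chance = posterior(Fraction(1, 1000), Fraction(999, 1000), Fraction(2, 100))
print(chance)  # 1/21

# Cross-check by counting people, as in the solution above.
true_positives = 999
false_positives = 19980
print(Fraction(true_positives, true_positives + false_positives))  # 1/21
```

Even with a very accurate test, most positive results come from the much larger healthy group, which is why the answer is under 5%.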
Discrete Random Variables
• A random variable is the numerical outcome of a random experiment.
• Each possible outcome has a probability.
• Histograms can be used to graph these.
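As an illustration, the total of two dice is a discrete random variable whose probabilities can be tabulated and drawn as a rough text histogram. The sketch assumes fair dice; the names `rolls`, `counts` and `pmf` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: two rolls of a fair die, all 36 ordered pairs equally likely.
rolls = list(product(range(1, 7), repeat=2))

# The random variable is the total of the two rolls.
counts = Counter(a + b for a, b in rolls)

# Probability of each possible value of the random variable.
pmf = {value: Fraction(n, len(rolls)) for value, n in sorted(counts.items())}

# Text histogram: one '#' per outcome that produces the value.
for value, p in pmf.items():
    print(f"{value:2d} {'#' * counts[value]} {p}")
```

The bars rise to a peak at 7 and fall away symmetrically, and the probabilities add up to 1.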
Continuous Random Variables
• Random variables can be continuous
  – Your height
  – Your weight
  – Your age
You Can Also Discuss the Cumulative Probability Distribution
• This is the probability of a result between the smallest possible value and a given value.
• Mathematically, it is the area under the probability distribution up to the given value, calculated by summing.
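To make "area, calculated by summing" concrete, the sketch below adds up thin rectangles under a density curve. The standard normal density is an assumption chosen purely for illustration (the lecture names no particular distribution), and `normal_density`, `cumulative`, `lower` and `steps` are hypothetical names.

```python
import math


def normal_density(x):
    """Standard normal density, used here only as an example curve."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cumulative(x, lower=-8.0, steps=10_000):
    """Approximate P(result <= x) by summing thin rectangles under the curve.

    `lower` stands in for the smallest possible value; the normal density
    is negligible below -8, so the truncation error is tiny.
    """
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    # Midpoint rule: rectangle height is the density at the strip's centre.
    return sum(normal_density(lower + (i + 0.5) * width) * width
               for i in range(steps))


print(cumulative(0.0))   # close to 0.5, since the curve is symmetric about 0
print(cumulative(1.96))  # close to 0.975
```

For a discrete random variable the same idea applies without the rectangles: add the probabilities of every value up to the given one.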