Statistics for Journalism
Chapter 5: Probability models
1. Random variables:
a) Idea.
b) Discrete and continuous variables.
c) The probability function (density) and the distribution function.
d) Mean and variance of a random variable.
2. Probability models:
a) Bernoulli.
b) Geometric.
c) Binomial.
d) Normal.
e) Normal approximation to the binomial distribution.
Recommended reading:
Chapters 22 and 23 of Peña and Romo (1997)
5.1: Random variables
• A function which assigns a numerical value to each possible result of an experiment is called a random variable.
• We use capital letters, e.g. X, Y, Z, to represent random variables and lower case letters, x, y, z, to represent particular values of these variables.
Discrete random variables can only take a discrete set of possible values.
Continuous random variables can take an infinite number of values within some continuous range.
The probability function of a discrete r.v. is the function which associates the probability P(X=x) with each possible value x.
The possible values of a discrete r.v. X and their respective probabilities are often displayed in a probability distribution table:

| X | $x_1$ | $x_2$ | ... | $x_n$ |
|---|---|---|---|---|
| P(X=xi) | $p_1$ | $p_2$ | ... | $p_n$ |

Every probability function satisfies

$$p_1 + p_2 + p_3 + \dots + p_n = 1$$

The distribution function of a discrete r.v.: Let X be a r.v. The distribution function of X is the function which gives, for each x, the cumulative probability up to x, that is,

$$F(x) = P(X \le x)$$
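As a small illustration of how the distribution function is built from the probability function, the sketch below accumulates the probabilities of a fair six-sided die. The die is an assumption chosen for illustration and does not appear in these notes; the names `pmf`, `cdf_table` and `F` are likewise hypothetical.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical example (not from the notes): a fair die,
# P(X = x) = 1/6 for x = 1, ..., 6. Fractions keep the sums exact.
values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# Every probability function must sum to 1.
assert sum(pmf) == 1

# F(x_i) = P(X <= x_i) is the running total of the probabilities.
cdf_table = dict(zip(values, accumulate(pmf)))


def F(x):
    """Distribution function F(x) = P(X <= x) for any real x."""
    return sum((p for v, p in zip(values, pmf) if v <= x), Fraction(0))


print(cdf_table[3])  # 1/2
print(F(0), F(6))    # 0 1
```

Note that F is defined for every real x, not only for the values in the table: it stays flat between possible values and jumps at each of them.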
Mean, variance and standard deviation of a discrete r.v.
The mean or expectation of a discrete r.v. X, which takes values x1, x2, ... with probabilities p1, p2, ..., is given by the following expression:

$$\mu = E[X] = \sum_i x_i P(X = x_i) = \sum_i x_i p_i$$

The variance is defined by the formula

$$\sigma^2 = \sum_i (x_i - \mu)^2 p_i$$

which can be calculated using

$$\sigma^2 = \sum_i x_i^2 p_i - \mu^2$$

The standard deviation is the square root of the variance.
Example: The probability distribution of the r.v. X is given in the following table:

| xi | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| pi | 0.1 | 0.3 | ? | 0.2 | 0.3 |

What is P(X=3)?
Calculate the mean and variance.
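One way to work the example above is sketched below. It uses only the table from the example; the variable names are hypothetical. The missing probability follows from the requirement that the probabilities sum to 1, and the variance is computed with both formulas as a cross-check.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]
p_known = {1: 0.1, 2: 0.3, 4: 0.2, 5: 0.3}  # P(X=3) is the unknown entry

# The probabilities must sum to 1, which fixes the missing entry.
p3 = 1 - sum(p_known.values())
p = [p_known.get(xi, p3) for xi in x]

# Mean: sum of x_i * p_i.
mean = sum(xi * pi for xi, pi in zip(x, p))

# Variance by the definition and by the shortcut formula.
var_def = sum((xi - mean) ** 2 * pi for xi, pi in zip(x, p))
var_short = sum(xi ** 2 * pi for xi, pi in zip(x, p)) - mean ** 2

sd = sqrt(var_def)

print(round(p3, 4), round(mean, 4), round(var_def, 4), round(sd, 4))
# 0.1 3.3 2.01 1.4177
```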
5.2: Probability models
Discrete models: Bernoulli trials and the geometric and binomial distributions.
Continuous models: the normal distribution and related distributions.
Bernoulli trials and the geometric and binomial distributions
A Bernoulli model is an experiment with the following characteristics:
• In each trial, there are only two possible results, success and failure.
• The result obtained in each trial is independent of the previous
results.
• The probability of success is constant, P(success) = p, and does not
change from one trial to the next.
The geometric distribution
Suppose we have a Bernoulli model. What is the distribution of the
number of failures, F, before the first success?
• P(F=0) = P(0 failures before the 1st success) = p
• P(F=1) = P(failure, success) = (1-p)p
• P(F=2) = P(failure, failure, success) = (1-p)^2 p
• P(F=f) = P(f failures before the 1st success) = (1-p)^f p for f = 0, 1, 2, …
The distribution of F is called the geometric distribution with parameter p.
E[F] = (1-p)/p    V[F] = (1-p)/p^2
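The geometric probabilities and the two moment formulas can be checked numerically. The sketch below is only an illustration: the value p = 0.3 and the function name `geometric_pmf` are assumptions, and the infinite sums are truncated at 500 terms, where the remaining tail is negligible.

```python
def geometric_pmf(f, p):
    """P(F = f): probability of f failures before the first success."""
    return (1 - p) ** f * p


p = 0.3            # hypothetical success probability
terms = range(500)  # truncation of the infinite sums; the tail is negligible

total = sum(geometric_pmf(f, p) for f in terms)
mean = sum(f * geometric_pmf(f, p) for f in terms)
var = sum(f ** 2 * geometric_pmf(f, p) for f in terms) - mean ** 2

print(round(total, 6))                          # probabilities sum to 1
print(round(mean, 6), round((1 - p) / p, 6))    # mean vs (1-p)/p
print(round(var, 6), round((1 - p) / p ** 2, 6))  # variance vs (1-p)/p^2
```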
The binomial distribution
Suppose we have a Bernoulli model. What is the distribution of the number of successes, X, in n trials?

$$P(\text{obtain } r \text{ successes}) = P(X = r) = \binom{n}{r} p^r (1-p)^{n-r}$$

for r = 0, 1, 2, …, n.
The distribution of X is called the binomial distribution with parameters n and p.
E[X] = np    V[X] = np(1-p)
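The binomial formula translates directly into code. The following is a minimal sketch, assuming illustrative values n = 10 and p = 0.3 that are not taken from the notes; `binomial_pmf` is a hypothetical helper name. It also confirms numerically that the probabilities sum to 1 and that the mean and variance match np and np(1-p).

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): r successes in n Bernoulli trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


n, p = 10, 0.3  # hypothetical parameters for illustration
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

total = sum(pmf)
mean = sum(r * q for r, q in enumerate(pmf))
var = sum(r ** 2 * q for r, q in enumerate(pmf)) - mean ** 2

print(round(total, 6))
print(round(mean, 6), n * p)            # compare with np
print(round(var, 6), n * p * (1 - p))   # compare with np(1-p)
```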
EXAMPLE
Calculate the probability that in a family with 4 children, 3 of them are
boys.
EXAMPLE
The probability that a student has to repeat the year is 0.3.
• We pick students at random, one at a time. What is the probability that the first repeater is the 3rd student we pick?
• We choose 20 students at random. What is the chance that there
are exactly 4 repeaters?
EXAMPLE
On average, 4% of the votes in an election are null. Calculate the
expected number of null votes in a town with an electorate of 1000.
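A possible way to work these three examples is sketched below. Two assumptions are not stated in the examples and are made here only to obtain numbers: each child is a boy with probability 0.5, independently of the others, and all 1000 electors vote. The helper names are hypothetical.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


def geometric_pmf(f, p):
    """P(F = f): f failures before the first success."""
    return (1 - p) ** f * p


# Family with 4 children, exactly 3 boys.
# ASSUMPTION: P(boy) = 0.5 and births are independent.
p_three_boys = binomial_pmf(3, 4, 0.5)

# First repeater is the 3rd student: 2 non-repeaters, then a repeater.
p_first_repeater_third = geometric_pmf(2, 0.3)

# Exactly 4 repeaters among 20 students.
p_four_repeaters = binomial_pmf(4, 20, 0.3)

# Expected number of null votes, E[X] = np.
# ASSUMPTION: all 1000 electors vote, each vote null with probability 0.04.
expected_null_votes = 1000 * 0.04

print(p_three_boys)                      # 0.25
print(round(p_first_repeater_third, 4))  # 0.147
print(round(p_four_repeaters, 4))        # 0.1304
print(round(expected_null_votes, 4))     # 40.0
```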
DENSITY FUNCTION OF A CONTINUOUS R.V.:
The density function, f, of a continuous r.v. satisfies the following conditions:
• It only takes non-negative values, f(x) ≥ 0.
• The area under the curve is equal to 1.
Distribution function. As with discrete variables, the distribution function (c.d.f.) gives the cumulative probability up to x, that is:

$$F(x) = P(X \le x)$$

The c.d.f. satisfies the conditions:
• It takes the value 0 for any x below the minimum possible value of the variable.
• It takes the value 1 for any x above the maximum possible value of the variable.
The normal or Gaussian distribution
Many variables have a bell-shaped density.
Examples:
• Weights of a population of the same age and sex.
• Heights of the same population.
• The grades in a course (urban myth).
To say that a continuous variable X has a normal distribution with mean μ and standard deviation σ, we write:
X ~ N(μ, σ²)
The standard normal distribution
The normal distribution with mean 0 and standard deviation 1 is called the
standard normal distribution.
There are tables which allow us to calculate the probabilities for this
distribution, N(0,1).
If we have a normal r.v. X with mean μ and standard deviation σ, we can convert it to a standard N(0,1) r.v. using the transformation:

$$Z = \frac{X - \mu}{\sigma}$$
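The effect of standardizing can be checked with Python's standard-library `statistics.NormalDist`, used here in place of printed tables. This is a sketch under assumed values: the mean 100, standard deviation 15 and point x = 130 are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical normal variable (values not from the notes).
mu, sigma = 100.0, 15.0
x = 130.0

# Standardize: Z = (X - mu) / sigma.
z = (x - mu) / sigma

# P(X <= x) computed directly, and via the standard normal N(0,1).
p_direct = NormalDist(mu, sigma).cdf(x)
p_via_z = NormalDist().cdf(z)

print(z)
print(round(p_direct, 4), round(p_via_z, 4))
```

Both probabilities agree, which is why a single table for N(0,1) is enough for any normal variable.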
Examples
Let Z ~ N(0,1). Calculate the following probabilities:
• P(Z < -1)
• P(Z > 1)
• P(-1.5 < Z < 2)
Calculate the 90%, 95%, 97.5% and 99% percentiles of the standard normal distribution.
(These values are useful in the next chapter.)
Let X ~ N(2,4), that is, with mean 2 and variance 4. Calculate the following probabilities:
• P(X < 0)
• P(-1 < X < 1)
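These exercises are meant to be solved with tables, but the results can be cross-checked in code. The sketch below uses `statistics.NormalDist` from the Python standard library and reads N(2,4) as mean 2 and variance 4, so the standard deviation is 2; the variable names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()               # standard normal, N(0,1)
X = NormalDist(mu=2, sigma=2)  # N(2,4): variance 4, standard deviation 2

p_z_below = Z.cdf(-1)                 # P(Z < -1)
p_z_above = 1 - Z.cdf(1)              # P(Z > 1), equal by symmetry
p_z_between = Z.cdf(2) - Z.cdf(-1.5)  # P(-1.5 < Z < 2)

# Percentiles: the value z with P(Z <= z) = q.
percentiles = {q: Z.inv_cdf(q) for q in (0.90, 0.95, 0.975, 0.99)}

p_x_below = X.cdf(0)               # P(X < 0) = P(Z < -1)
p_x_between = X.cdf(1) - X.cdf(-1)  # P(-1 < X < 1) = P(-1.5 < Z < -0.5)

print(round(p_z_below, 4), round(p_z_above, 4), round(p_z_between, 4))
for q, z in percentiles.items():
    print(q, round(z, 4))
print(round(p_x_below, 4), round(p_x_between, 4))
```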
Approximation of the binomial distribution using a normal
When n is large enough, the binomial distribution, X ~ B(n, p), is approximately a normal distribution with mean np and variance np(1-p):

$$X \approx N\big(np,\; np(1-p)\big)$$

The approximation is usually considered to be good if np > 5 and n(1-p) > 5.
EXAMPLE
We throw a fair coin in the air 400 times. What is the probability of getting
between 180 and 210 heads?
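The coin example can be worked as in the sketch below, which compares the normal approximation with the exact binomial sum. Two choices here are assumptions rather than part of the notes: "between 180 and 210" is read as inclusive, 180 ≤ X ≤ 210, and a continuity correction (widening the interval by 0.5 on each side) is shown alongside the plain approximation.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 400, 0.5
lo, hi = 180, 210  # ASSUMPTION: interval read as 180 <= X <= 210

# Rule of thumb from the notes: np > 5 and n(1-p) > 5.
assert n * p > 5 and n * (1 - p) > 5

# Approximating normal: mean np, variance np(1-p).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Plain approximation, and the same with a continuity correction.
p_plain = approx.cdf(hi) - approx.cdf(lo)
p_corrected = approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)

# Exact binomial probability for comparison (p = 0.5, so each
# outcome sequence has probability 1 / 2**n).
p_exact = sum(comb(n, k) for k in range(lo, hi + 1)) / 2 ** n

print(round(p_plain, 4), round(p_corrected, 4), round(p_exact, 4))
```

With tables, the plain version amounts to P(-2 < Z < 1) after standardizing with mean 200 and standard deviation 10.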