continuous probability distribution.pdf

Upload: dipika-panda

Post on 06-Jul-2018

229 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/17/2019 Continuous Probability Distribution.pdf

    1/47

    Probability Distribution 

  • 8/17/2019 Continuous Probability Distribution.pdf

    2/47

  • 8/17/2019 Continuous Probability Distribution.pdf

    3/47

    Continuous Probability Distributions

    The probability of the random variable assuming a value

    within some given interval from x1 to x2 is defined to be the

    area under the graph of the probability density function

     between x1 and x2.

     f (x)

    x

    Uniform

    x

    1 x

    2

    x

     f

    (

    x

    ) Normal

    x

    1

    x

    2

    x

    1 x

    2

    Exponential

    x

     f (x)

    x

    1 x

    2

  • 8/17/2019 Continuous Probability Distribution.pdf

    4/47

    Normal Probability Distributions

    The normal probability distribution is the most important

    distribution for describing a continuous random variable.

    It is widely used in statistical inference.

  • 8/17/2019 Continuous Probability Distribution.pdf

    5/47

    Normal Probability Distributions

     Normal Probability Density Function

    2 2( ) /21( )2

    x f x e    

     

      = mean

      = standard deviation

      = 3.14159

    e = 2.71828

    where:

  • 8/17/2019 Continuous Probability Distribution.pdf

    6/47

    Normal Probability Distributions

    It’s a probability function, so no matter what the values of

    and , must integrate to 1!

    12

    12)(

    2

    1

    dxe

     x

     

     

      

    E(X)= =

    Var(X)=2 =

    Standard Deviation(X)=

    dxe x

     x

    2)(

    2

    1

    2

     

      

    2)(

    2

    1

    2 )2

    1(

    2

     

        

      

     

    dxe x

     x

  • 8/17/2019 Continuous Probability Distribution.pdf

    7/47

    The distribution is symmetric; its skewness

    measure is zero.

    Normal Probability Distributions

    Characteristics

    x

  • 8/17/2019 Continuous Probability Distribution.pdf

    8/47

    The entire family of normal probability distributions

    is defined by its mean   and its standard deviation    .

    Normal Probability Distributions

    Characteristics

    Standard Deviation  

    Mean  x

  • 8/17/2019 Continuous Probability Distribution.pdf

    9/47

    The highest point on the normal curve is at the mean,

    which is also the median and mode.

    Normal Probability Distributions

    Characteristics

    x

  • 8/17/2019 Continuous Probability Distribution.pdf

    10/47

    Normal Probability Distributions

    Characteristics

    -10 0 20

    The mean can be any numerical value: negative, zero,

    or positive.

    x

  • 8/17/2019 Continuous Probability Distribution.pdf

    11/47

    Normal Probability Distributions

    Characteristics

    = 15

    = 25

    The standard deviation determines the width of the

    curve: larger values result in wider, flatter curves.

    x

  • 8/17/2019 Continuous Probability Distribution.pdf

    12/47

    Probabilities for the normal random variable are given

     by areas under the curve. The total area under the curve

    is 1 (.5 to the left of the mean and 0.5 to the right).

    Normal Probability Distributions

    Characteristics

    .5 .5

    x

  • 8/17/2019 Continuous Probability Distribution.pdf

    13/47

    Normal Probability Distributions

    Since the area under the curve represents probability, the probability of a normal random variable at one specific value is

    zero . With a single value, one can’t find the area since the

    area must be bound by two values. Thus,

    P(x = 10) = 0 P(x = 3) = 0 P(x = 7.5) = 0

    However, one can find the following probabilities:

    P( 1 < x < 3) P(2.2 < x < 3.7) P(x > 3)

  • 8/17/2019 Continuous Probability Distribution.pdf

    14/47

    Normal Probability Distributions

    Characteristics

    of values of a normal random variable

    are within of its mean.

    68.26%

    +/- 1 standard deviation

    of values of a normal random variableare within of its mean.95.44%

    +/- 2 standard deviations

    of values of a normal random variable

    are within of its mean.

    99.72%

    +/- 3 standard deviations

  • 8/17/2019 Continuous Probability Distribution.pdf

    15/47

    Normal Probability Distributions

    Characteristics

     x 

      – 3 – 1

      – 2

      + 1

      + 2

      + 3 

    68.26%

    95.44%

    99.72%

  • 8/17/2019 Continuous Probability Distribution.pdf

    16/47

    Normal Probability Distributions

    There may be thousands of normal distribution curves,

    each with a different mean and a different standarddeviation. Since the shapes are different, the areas under the curves between any two points are also different.

    To make life easier, all normal distributions can be

    converted to a standard normal distribution. A standardnormal distribution has a mean of 0 and a standarddeviation of 1.

     No matter what   and   are, the area between  - and  +is about 68.26%; the area between  -2 and  +2 is about95.44%; and the area between   -3 and   +3 is about99.72%. Almost all values fall within 3 standarddeviations.

  • 8/17/2019 Continuous Probability Distribution.pdf

    17/47

    How good is rule for real data?

    Check some example data:

    The mean of the weight of the women = 127.8

    The standard deviation (SD) = 15.5

  • 8/17/2019 Continuous Probability Distribution.pdf

    18/47

     

    80   90   100   110   120   130   140   150   160  

    0  

    5  

    10  

    15  

    20  

    25  

    P  e  r  c  e  n  t  

    POUNDS  

    127.8 143.3112.3

    68% of 120 = .68x120 = ~ 82 runners

    In fact, 79 runners fall within 1-SD (15.5 lbs) of the mean.

  • 8/17/2019 Continuous Probability Distribution.pdf

    19/47

     

    80   90   100   110   120   130   140   150   160  

    0  

    5  

    10  

    15  

    20  

    25  

    P  e  r  c  e  n  t  

    POUNDS  

    127.896.8

    95% of 120 = .95 x 120 = ~ 114 runners

    In fact, 115 runners fall within 2-SD’s of the mean.

    158.8

  • 8/17/2019 Continuous Probability Distribution.pdf

    20/47

     

    80   90   100   110   120   130   140   150   160  

    0  

    5  

    10  

    15  

    20  

    25  

    P  e  r  c  e  n  t  

    POUNDS  

    127.881.3

    99.7% of 120 = .997 x 120 = 119.6 runners

    In fact, all 120 runners fall within 3-SD’s of the mean.

    174.3

  • 8/17/2019 Continuous Probability Distribution.pdf

    21/47

    Example

    Suppose SAT scores roughly follows a normal distribution in

    the U.S. population of college-bound students (with range

    restricted to 200-800), and the average math SAT is 500 with a

    standard deviation of 50, then:

    68% of students will have scores between 450 and 550

    95% will be between 400 and 600

    99.7% will be between 350 and 650

  • 8/17/2019 Continuous Probability Distribution.pdf

    22/47

    Standard Normal Probability Distributions

    The formula for the standardized normal probabilitydensity function is

    22)(

    2

    1)

    1

    0(

    2

    1

    2

    1

    2)1(

    1)(

     Z  Z 

    ee Z  p

      

  • 8/17/2019 Continuous Probability Distribution.pdf

    23/47

    The Standard Normal Distribution Z)

    All normal distributions can be converted into the standard

    normal curve by subtracting the mean and dividing by the

    standard deviation:

     

       X 

     Z 

    Somebody calculated all the integrals for the standard

    normal and put them in a table! So we never have tointegrate!

    Even better, computers now do all the integration.

  • 8/17/2019 Continuous Probability Distribution.pdf

    24/47

    0

    z

    The letter z is used to designate the standardnormal random variable.

    Standard Normal Probability Distributions

  • 8/17/2019 Continuous Probability Distribution.pdf

    25/47

    Applications of Standard Normal Distribution

    Problem:

    What’s the probability of getting a math SAT score of 575 or less,

    =500 and  =50?

    5.150

    500575

     Z 

    i.e., A score of 575 is 1.5 standard deviations above the mean

       

    5.1

    2

    1575

    200

    )

    50

    500(

    2

    1 22

    2

    1

    2)50(

    1)575(   dz edxe X  P 

     Z  x

      

    Yikes!

    But to look up Z= 1.5 in standard normal chart (or enterinto SAS) no problem! = .9332

  • 8/17/2019 Continuous Probability Distribution.pdf

    26/47

    Applications of Standard Normal Distribution

    Problem:

    Test scores of a special examination administered to all potential

    employees of a firm are normally distributed with a mean of 500

     points and a standard deviation of 100 points. What is the probability

    that a score selected at random will be higher than 700?

    P(x > 700) = ?

    If we convert this normal variable, x, to a standard normal variable, z,

    z = (x - µ) / σ = (700 – 500) / 100 = 2

    -------------500----------700 x-scale

    P(x > 700) = P(z > 2) ----------------0-----------2 z-scale

  • 8/17/2019 Continuous Probability Distribution.pdf

    27/47

    Problem

    If birth weights in a population are normally distributed with a

    mean of 109 oz and a standard deviation of 13 oz,

    a. What is the chance of obtaining a birth weight of 141 oz

    or heavier when sampling birth records at random?

     b. What is the chance of obtaining a birth weight of 120 orlighter ?

  • 8/17/2019 Continuous Probability Distribution.pdf

    28/47

    Solution

    a. What is the chance of obtaining a birth weight of 141 oz

    or heavier when sampling birth records at random?

    46.213

    109141

     Z 

    From the chart or SAS Z of 2.46 corresponds to a right tail (greater

    than) area of: P(Z≥2.46) = 1-(.9931)= .0069 or .69 %

  • 8/17/2019 Continuous Probability Distribution.pdf

    29/47

    Solution

     b. What is the chance of obtaining a birth weight of 120

    or lighter ?

    From the chart or SAS Z of .85 corresponds to a left tail area of:

    P(Z≤.85) = .8023= 80.23%

    85.13

    109120

     Z 

  • 8/17/2019 Continuous Probability Distribution.pdf

    30/47

    Looking up probabilities in the

    standard normal table

     Area is 93.45%

    Z=1.51

    Z=1.51

     What is the area to

    the left of Z=1.51 in

    a standard normal

    curve?

  • 8/17/2019 Continuous Probability Distribution.pdf

    31/47

  • 8/17/2019 Continuous Probability Distribution.pdf

    32/47

    Are my data normally distributed?

    1. Look at the histogram! Does it appear bell shaped?

    2. Compute descriptive summary measures — are mean,median, and mode similar?

    3. Do 2/3 of observations lie within 1 std dev of the mean? Do

    95% of observations lie within 2 std dev of the mean?4. Look at a normal probability plot — is it approximatelylinear?

    5. Run tests of normality (such as Kolmogorov-Smirnov). But,

     be cautious, highly influenced by sample size!

  • 8/17/2019 Continuous Probability Distribution.pdf

    33/47

    Normal approximation to the binomial

    When you have a binomial distribution where n is large and p

    is middle-of-the road (not too small, not too big, closer to .5),then the binomial starts to look like a normal distribution in

    fact, this doesn’t even take a particularly large n

    Recall: What is the probability of being a smoker among a

    group of cases with lung cancer is .6, what’s the

     probability that in a group of 8 cases you have less than 2

    smokers?

  • 8/17/2019 Continuous Probability Distribution.pdf

    34/47

    Normal approximation to the binomial

    When you have a binomial distribution where n is large and p

    isn’t too small (rule of thumb: mean>5), then the binomial starts

    to look like a normal distribution

    Recall: smoking example…

    1 4 52 3 6 7 80

    .27 Starting to have a normal shape

    even with fairly small n. You can

    imagine that if n got larger, the

     bars would get thinner and thinner

    and this would look more and

    more like a continuous function,with a bell curve shape. Here

    np=4.8.

  • 8/17/2019 Continuous Probability Distribution.pdf

    35/47

    Normal approximation to binomial

    1 4 52 3 6 7 80

    .27

    What is the probability of fewer than 2 smokers?

     Normal approximation probability:=4.8

    =1.39

    239.1

    8.2

    39.1

    )8.4(2

     Z 

    Exact binomial probability (from before) = .00065 + .008 = 0.00865

    P(Z

  • 8/17/2019 Continuous Probability Distribution.pdf

    36/47

    A little off, but in the right ballpark… we could also use the value

    to the left of 1.5 (as we really wanted to know less than but not

    including 2; called the “continuity correction”)…

    37.239.1

    3.3

    39.1

    )8.4(5.1

     Z 

    P(Z≤-2.37) =.0089A fairly good approximation of

    the exact probability, .00865.

  • 8/17/2019 Continuous Probability Distribution.pdf

    37/47

    Practice problem

    1. You are performing a cohort study. If the probability of

    developing disease in the exposed group is .25 for the study

    duration, then if you sample (randomly) 500 exposed

     people, What’s the probability that at most 120 people

    develop the disease?

  • 8/17/2019 Continuous Probability Distribution.pdf

    38/47

    Solution:

    P(Z

  • 8/17/2019 Continuous Probability Distribution.pdf

    39/47

    More Sample Problems

    a. P(z ≤ 1.5) =

     b. P(z ≤ 1.0) =

    c. P(1 ≤ z ≤ 1.5) =

    d. P(0 < z < 2.5) =

  • 8/17/2019 Continuous Probability Distribution.pdf

    40/47

    More Sample Problems

    a. P(z ≤ 1.5) = 0.9332

     b. P(z ≤ 1.0) = 0.8413

    c. P(1 ≤ z ≤ 1.5) = 0.9332 – 0.8413 = 0.09

    d. P(0 < z < 2.5) = 0.9938 – 0.5000 = 0.4938

  • 8/17/2019 Continuous Probability Distribution.pdf

    41/47

    More Sample Problems

    a. P(z ≤ - 1.0) =

     b. P(z ≥ - 1) =

    c. P(z ≥ - 1.5) =

    d. P(- 3 < z ≤ 0) =

  • 8/17/2019 Continuous Probability Distribution.pdf

    42/47

    More Sample Problems

    a. P(z ≤ - 1.0) = 0.1587

     b. P(z ≥ - 1) = 1 – P(z ≤ - 1) = 1 – 0.1587 = 0.8413

    c. P(z ≥ - 1.5) = 1 – P(z ≤ - 1.5) = 1 – 0.0668 = 0.9332

    d. P(- 3 < z ≤ 0) = 0.5 – 0.0014 = 0. 4986

  • 8/17/2019 Continuous Probability Distribution.pdf

    43/47

    More Sample Problems

    Given: µ = 77 σ = 20

    a. P(x < 50) = ?

    Convert to z: z = (x - µ) / σ = (50 – 77) / 20 = - 1.35

    P(x < 50) = P(z < - 1.35) = 0.0885

     b. P(x > 100) = ?

    z = (100 – 77) / 20 = 1.15P(x > 100) = P(z > 1.15) = 1 – P(z ≤ 1.15) = 1 – 0.8749 =

    0.1251 or 12.51 %

  • 8/17/2019 Continuous Probability Distribution.pdf

    44/47

    Continuation of Sample Problem

    c. x = ? to be considered a heavy user 

    Upper 20% of the area is in the right tail of the normal curve.

    80% of the area is to the left. Go to Table 1 and locate 0.8 (or

    80%) as the table entry. The closest entry is 0.7995. That

     point represents a z-value of 0.84. Use this value of z in the

    following equation:

    z = (x - µ) / σ0.84 = (x – 77)/ 20 x = 93.8 hours

  • 8/17/2019 Continuous Probability Distribution.pdf

    45/47

    More Sample Problems

    A statistics instructor grades on a curve. He does not want to

    give more than 15 percent A in his class. If test scores of  

    students in statistics are normally distributed with a mean of 

    75 and a standard deviation of 10, what should be the cut-off 

     point for an A?

    z = (x - µ) / σ

    1.04 = (x – 75) / 10

    x = 85.4 or 85

  • 8/17/2019 Continuous Probability Distribution.pdf

    46/47

    Sample Problems

    The service life of a certain brand of automobile battery is

    normally distributed with a mean of 1000 days and a standarddeviation of 100 days. The manufacturer of the battery wants

    to offer a guarantee, but does not know the length of the

    warranty. It does not want to replace more than 10 percent of 

    the batteries sold. What should be the length of the warranty?

    z = ( x - µ ) / σ

    - 1.28 = (x – 1000) / 100x = 872 days

  • 8/17/2019 Continuous Probability Distribution.pdf

    47/47

    Reference:

    Anderson Sweeney Williams