Section 1-A Refresher on Probability and Statistics

Upload: islam-ali

Post on 10-Apr-2018

  • 8/8/2019 Section 1-A Refresher on Probability and Statistics

    A Refresher on Probability and Statistics

    What We'll Do ...

    Review of probability and statistics necessary to do and understand simulation

    Outline

    Random variables

    Hypothesis testing

    Popular probability distributions

    Random Variables

    One way of quantifying and simplifying events and probabilities

    A random variable (RV) is a number whose value is determined by the outcome of an experiment

    Think: an RV is a number whose value we don't know for sure, but we'll usually know something about what it can be or is likely to be

    Usually denoted as capital letters: X, Y, W1, W2, etc.

    Probabilistic behavior described by distribution function

    Discrete vs. Continuous RVs

    Two basic flavors of RVs, used to represent or model different things

    Discrete: can take on only certain separated values

    Number of possible values could be finite or infinite

    Continuous: can take on any real value in some range

    Number of possible values is always infinite

    Range could be bounded on both sides, just one side, or neither

    Discrete Distributions

    Let X be a discrete RV with possible values (range) $x_1, x_2, \ldots$ (finite or infinite list)

    Probability mass function (PMF)

    $p(x_i) = P(X = x_i)$ for $i = 1, 2, \ldots$

    The statement $X = x_i$ is an event that may or may not happen, so it has a probability of happening, as measured by the PMF

    Can express PMF as numerical list, table, graph, or formula

    Since X must be equal to some $x_i$, and since the $x_i$'s are all distinct, the PMF values sum to 1:

    $\sum_{\text{all } i} p(x_i) = 1$
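A minimal sketch of a PMF expressed as a table, assuming a fair six-sided die as the experiment (the die and the names `pmf` and `prob` are illustrative, not from the slides). It checks that the probabilities add up to 1 and evaluates the event X = 3.

```python
from fractions import Fraction

# Hypothetical example: PMF of a fair six-sided die, stored as value -> probability.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def prob(pmf, event):
    """P(event): add up p(x_i) over the possible values x_i that satisfy the event."""
    return sum(p for x, p in pmf.items() if event(x))

total = sum(pmf.values())               # a valid PMF must sum to 1
p_three = prob(pmf, lambda x: x == 3)   # P(X = 3)

print(total, p_three)
```

Exact fractions are used here only so that the sum-to-one check is not blurred by floating-point rounding.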

    Discrete Distributions (cont'd.)

    Cumulative distribution function (CDF): probability that the RV will be less than or equal to ($\le$) a fixed value x:

    $F(x) = P(X \le x) = \sum_{\text{all } i \text{ with } x_i \le x} p(x_i)$

    Properties of discrete CDFs

    $0 \le F(x) \le 1$ for all x

    As $x \to -\infty$, $F(x) \to 0$

    As $x \to +\infty$, $F(x) \to 1$

    F(x) is non-decreasing in x

    (These four properties are also true of continuous CDFs)

    F(x) is a step function, continuous from the right, with jumps at the $x_i$'s of height equal to the PMF at that $x_i$
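The step-function behavior of a discrete CDF can be sketched as follows, again assuming a fair die as a stand-in example (the helper name `cdf` is hypothetical). The CDF stays flat between possible values and jumps by the PMF at each one.

```python
from fractions import Fraction

# Hypothetical example: fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(pmf, x):
    """F(x) = P(X <= x): sum the PMF over all possible values x_i <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

below = cdf(pmf, 0)                   # left of every possible value
flat = cdf(pmf, 3.5)                  # same as F(3): no mass strictly between 3 and 4
jump = cdf(pmf, 3) - cdf(pmf, 2.999)  # height of the jump at x = 3
top = cdf(pmf, 6)                     # at or beyond the largest possible value

print(below, flat, jump, top)
```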

    Discrete Distributions (cont'd.)

    Computing probabilities about a discrete RV: usually use the PMF

    Add up $p(x_i)$ for those $x_i$'s satisfying the condition for the event

    With discrete RVs, must be careful about weak vs. strong inequalities: endpoints matter!
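To illustrate why endpoints matter, here is a small sketch that adds up PMF values for a weak and a strong inequality. The fair die and the helper `prob` are assumptions made for the example.

```python
from fractions import Fraction

# Hypothetical example: fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def prob(pmf, event):
    """Add up p(x_i) for those x_i satisfying the condition for the event."""
    return sum(p for x, p in pmf.items() if event(x))

p_weak = prob(pmf, lambda x: x <= 4)    # P(X <= 4) includes the endpoint x = 4
p_strong = prob(pmf, lambda x: x < 4)   # P(X < 4) excludes it

print(p_weak, p_strong, p_weak - p_strong)
```

The two answers differ by exactly p(4), the probability mass sitting on the endpoint.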

    Discrete Expected Values

    Data set has a "center": the average (mean)

    RVs have a "center": expected value

    $E(X) = \sum_{\text{all } i} x_i \, p(x_i)$

    Also called the mean or expectation of the RV X

    Other common notation: $\mu$, $\mu_X$

    Weighted average of the possible values $x_i$, with weights being their probability (relative likelihood) of occurring

    What expectation is not: the value of X you "expect" to get

    E(X) might not even be among the possible values $x_1, x_2, \ldots$

    What expectation is: repeat the experiment many times, observe many $X_1, X_2, \ldots, X_n$; E(X) is what the sample mean $\bar{X} = (X_1 + \cdots + X_n)/n$ converges to (in a certain sense) as $n \to \infty$
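A rough sketch of both views of expectation, assuming a fair die: the weighted average over the PMF, and the sample mean of many simulated rolls. The seed and the sample sizes are arbitrary choices for illustration.

```python
import random
from fractions import Fraction

# Hypothetical example: fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Weighted average of the possible values, weights = probabilities.
expected = sum(x * p for x, p in pmf.items())

rng = random.Random(12345)  # arbitrary seed, only for repeatability

def sample_mean(n):
    """Average of n simulated rolls of the die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

print("E(X) =", expected)
for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))
```

Here E(X) is 7/2, which is not a value the die can ever show, yet the sample means should settle near it as n grows.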

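    Discrete Variances and Standard Deviations

    Data set has measures of "dispersion"

    Sample variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$

    Sample standard deviation: $s = \sqrt{s^2}$

    RVs have corresponding measures

    $\mathrm{Var}(X) = \sum_{\text{all } i} (x_i - \mu)^2 \, p(x_i)$, where $\mu = E(X)$

    Other common notation: $\sigma^2$, $\sigma_X^2$

    Weighted average of squared deviations of the possible values $x_i$ from the mean

    Standard deviation of X is $\sigma = \sqrt{\mathrm{Var}(X)}$

    Interpretation analogous to that for E(X)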
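The two kinds of dispersion measure can be put side by side in a short sketch: the variance of an RV computed from its PMF, and the sample variance of a data set. The fair die and the six data values are made up for the example; `statistics.variance` and `statistics.stdev` use the n - 1 denominator.

```python
import statistics
from fractions import Fraction

# RV side (hypothetical example: fair six-sided die).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # weighted squared deviations
sd = float(var) ** 0.5

# Data side (made-up observations).
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
s2 = statistics.variance(data)  # sample variance, divides by n - 1
s = statistics.stdev(data)      # sample standard deviation

print(var, sd, s2, s)
```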

    Continuous Distributions

    Now let X be a continuous RV

    Possibly limited to a range bounded on left or right or both

    No matter how small the range, the number of possible values for X is always (uncountably) infinite

    Not sensible to ask about P(X = x) even if x is in the possible range

    Technically, P(X = x) is always 0

    Instead, describe behavior of X in terms of its falling between two values

    Continuous Distributions (cont'd.)

    Probability density function (PDF) is a function f(x) with the following three properties:

    1. $f(x) \ge 0$ for all real values x

    2. The total area under f(x) is 1: $\int_{-\infty}^{\infty} f(x)\,dx = 1$

    3. For any fixed a and b with $a \le b$, the probability that X will fall between a and b is the area under f(x) between a and b: $P(a \le X \le b) = \int_a^b f(x)\,dx$

    Fun facts about PDFs

    Observed X's are denser in regions where f(x) is high

    The height of a density, f(x), is not the probability of anything; it can even be > 1

    With continuous RVs, you can be sloppy with weak vs. strong inequalities and endpoints
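A small numerical sketch of these PDF facts, assuming an RV that is uniform on [0, 0.5] so that the density has height 2. The density, the function names, and the midpoint-rule integrator are all illustrative choices.

```python
def f(x):
    """Density of a hypothetical RV uniform on [0, 0.5]: height 2 on that range."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

def area(f, a, b, n=100_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

total = area(f, -1.0, 1.0)  # total area under the density, approximately 1
p = area(f, 0.1, 0.3)       # P(0.1 <= X <= 0.3), approximately 0.4

print(f(0.25), total, p)
```

The height f(0.25) = 2 exceeds 1, yet every probability computed as an area stays between 0 and 1.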

    Continuous Distributions (cont'd.)

    Cumulative distribution function (CDF): probability that the RV will be less than or equal to ($\le$) a fixed value x:

    $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$

    Properties of continuous CDFs

    $0 \le F(x) \le 1$ for all x

    As $x \to -\infty$, $F(x) \to 0$

    As $x \to +\infty$, $F(x) \to 1$

    F(x) is non-decreasing in x

    (These four properties are also true of discrete CDFs)

    F(x) is a continuous function with slope equal to the PDF: $f(x) = F'(x)$

    F(x) may or may not have a closed-form formula
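As a sketch of the slope relationship, the exponential distribution is one case where the CDF does have a closed form. The mean `BETA = 2.0` is an arbitrary assumption; the code compares a numerical slope of F with the PDF and computes an interval probability as F(b) - F(a).

```python
import math

BETA = 2.0  # hypothetical mean of the exponential distribution

def pdf(x):
    """Exponential density f(x) = exp(-x / BETA) / BETA for x >= 0."""
    return math.exp(-x / BETA) / BETA if x >= 0 else 0.0

def cdf(x):
    """Exponential CDF F(x) = 1 - exp(-x / BETA) for x >= 0."""
    return 1.0 - math.exp(-x / BETA) if x >= 0 else 0.0

x, h = 1.5, 1e-6
slope = (cdf(x + h) - cdf(x - h)) / (2 * h)  # central-difference estimate of F'(x)
p = cdf(3.0) - cdf(1.0)                      # P(1 <= X <= 3)

print(slope, pdf(x), p)
```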

    Hypothesis Tests

    Test some assertion about the population or its parameters

    Can never determine truth or falsity for sure; only get evidence that points one way or another

    Null hypothesis ($H_0$): what is to be tested

    Alternate hypothesis ($H_1$ or $H_A$): denial of $H_0$

    $H_0: \mu = 6$ vs. $H_1: \mu \ne 6$

    $H_0: \sigma < 10$ vs. $H_1: \sigma \ge 10$

    $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \ne \mu_2$

    Develop a decision rule to decide on $H_0$ or $H_1$ based on sample data
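One possible decision rule for a hypothesis of the form H0: mean = 6 is sketched below. It is a large-sample z-test using the normal approximation, which is an assumption of this example rather than the method of the slides; for small samples a t-test would be the usual choice. The function name, the significance level 0.05, and the simulated data are all hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def z_test_mean(data, mu0, alpha=0.05):
    """Two-sided large-sample test of H0: mean = mu0 (normal approximation).

    Returns the z statistic, the p-value, and whether H0 is rejected at level alpha.
    """
    n = len(data)
    z = (mean(data) - mu0) / (stdev(data) / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

# Usage with made-up data drawn around 6 (seed chosen arbitrarily).
rng = random.Random(2024)
sample = [rng.gauss(6.0, 1.0) for _ in range(200)]
print(z_test_mean(sample, mu0=6.0))
```

A rejection is only evidence against H0, and a failure to reject is not proof that H0 is true.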

    Hypothesis Testing in Simulation

    Input side

    Specify input distributions to drive the simulation

    Collect real-world data on corresponding processes

    Fit a probability distribution to the observed real-world data

    Test $H_0$: the data are well represented by the fitted distribution

    Output side

    Have two or more competing designs modeled

    Test $H_0$: all designs perform the same on output, or test $H_0$: one design is better than another

    Selection of a best model scenario
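For the input side, a goodness-of-fit check might look like the sketch below. It assumes an exponential distribution fitted by the sample mean and a chi-square test with five equiprobable bins; none of these choices come from the slides. The critical value 7.815 is the 0.95 quantile of the chi-square distribution with 3 degrees of freedom (5 bins, minus 1, minus 1 estimated parameter).

```python
import math
import random

CRITICAL_VALUE = 7.815  # chi-square 0.95 quantile, 3 degrees of freedom (assumed setup)

def chi_square_exponential(data, k=5):
    """Chi-square statistic for H0: data follow an exponential with the fitted mean.

    Uses k bins that are equally likely under the fitted distribution.
    """
    n = len(data)
    beta = sum(data) / n  # fitted mean
    # Interior bin edges from the inverse CDF: F^{-1}(u) = -beta * ln(1 - u).
    edges = [-beta * math.log(1 - j / k) for j in range(1, k)]
    counts = [0] * k
    for x in data:
        counts[sum(x > e for e in edges)] += 1  # index of the bin containing x
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)

# Usage with made-up "observed" interarrival times (seed chosen arbitrarily).
rng = random.Random(7)
observed = [rng.expovariate(1 / 2.0) for _ in range(200)]
stat = chi_square_exponential(observed)
print(stat, "reject H0" if stat > CRITICAL_VALUE else "do not reject H0")
```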

    Popular Discrete Probability Distribution Functions

    Uniform

    Binomial

    Geometric

    Negative Binomial

    Poisson

    Discrete Uniform Distribution
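For reference, the standard form of the discrete uniform distribution, assuming the possible values are the integers from a to b (this notation is an assumption here):

```latex
p(x) = \frac{1}{b - a + 1}, \quad x = a, a+1, \ldots, b
\qquad
E(X) = \frac{a + b}{2}
\qquad
\mathrm{Var}(X) = \frac{(b - a + 1)^2 - 1}{12}
```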

    Binomial Distribution
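For reference, the standard binomial PMF for the number of successes in n independent trials, each with success probability p (symbols assumed here):

```latex
p(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n
\qquad
E(X) = np
\qquad
\mathrm{Var}(X) = np(1 - p)
```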

    Geometric Distribution

    Negative Binomial Distribution
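For reference, one common parameterization of the negative binomial counts the number of failures x before the r-th success, with success probability p per trial. Other texts count total trials instead, so the notation below is an assumption:

```latex
p(x) = \binom{x + r - 1}{x} p^{r} (1 - p)^{x}, \quad x = 0, 1, 2, \ldots
\qquad
E(X) = \frac{r(1 - p)}{p}
\qquad
\mathrm{Var}(X) = \frac{r(1 - p)}{p^{2}}
```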

    Poisson Distribution
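For reference, the standard Poisson PMF with rate parameter lambda (symbol assumed here); its mean and variance are equal:

```latex
p(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots
\qquad
E(X) = \mathrm{Var}(X) = \lambda
```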

    Popular Continuous Probability Distribution Functions

    Uniform

    Normal

    Triangular

    Exponential

    Erlang

    Weibull

    Continuous Uniform Distribution
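For reference, the standard continuous uniform density on the interval from a to b, with a < b (notation assumed here):

```latex
f(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}
\qquad
E(X) = \frac{a + b}{2}
\qquad
\mathrm{Var}(X) = \frac{(b - a)^{2}}{12}
```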

    Normal Distribution
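For reference, the standard normal density with mean mu and standard deviation sigma > 0. Its CDF has no closed-form formula and is evaluated numerically or from tables:

```latex
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\frac{(x - \mu)^{2}}{2\sigma^{2}} \right), \quad -\infty < x < \infty
```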

    Triangular Distribution
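For reference, the usual triangular density, assuming a minimum a, mode m, and maximum b with a < m < b (these symbols are an assumption here):

```latex
f(x) = \begin{cases}
\dfrac{2(x - a)}{(b - a)(m - a)} & a \le x \le m \\[2ex]
\dfrac{2(b - x)}{(b - a)(b - m)} & m < x \le b \\[2ex]
0 & \text{otherwise}
\end{cases}
\qquad
E(X) = \frac{a + m + b}{3}
```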

    Exponential Distribution

    Erlang Distribution
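For reference, one common parameterization of the Erlang density, as the sum of k independent exponentials each with mean beta. Some texts use a rate parameter instead, so the notation is an assumption:

```latex
f(x) = \frac{x^{k - 1} e^{-x/\beta}}{\beta^{k} (k - 1)!}, \quad x \ge 0
\qquad
E(X) = k\beta
\qquad
\mathrm{Var}(X) = k\beta^{2}
```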

    Weibull Distribution
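For reference, one common parameterization of the Weibull distribution, assuming a shape parameter alpha > 0 and a scale parameter beta > 0 (symbols assumed here). Its CDF has a closed form:

```latex
f(x) = \frac{\alpha}{\beta} \left( \frac{x}{\beta} \right)^{\alpha - 1} \exp\!\left( -\left( \frac{x}{\beta} \right)^{\alpha} \right), \quad x \ge 0
\qquad
F(x) = 1 - \exp\!\left( -\left( \frac{x}{\beta} \right)^{\alpha} \right)
```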