ec203i2ee lec3 kt

Upload: devurenemy

Post on 14-Apr-2018


  • 7/30/2019 Ec203I2EE Lec3 Kt

    1/21

    Lecture 3: Basic probability theory

    Lectures 1 and 2 (and Labs 1 and 2): basic work with economic data. Very valuable!

    However, economists often want to use more sophisticated statistical techniques to examine relationships between economic variables in more detail.

    The remainder of the class works towards basic single-variable regression analysis.

    First step: basic probability theory.

    Gujarati, Chapters 2 and 3: read all; the key points are covered here.

    EC203 Introduction to Empirical Economics. KT. 1


    Random variables

    A random (or statistical) experiment is a process leading to at least two possible outcomes. There will be uncertainty as to which outcome will occur.

    Example: rolling a fair die and observing the number shown uppermost. Possible outcomes: 1, 2, 3, 4, 5 or 6.

    For a random experiment, we know in advance all the possible outcomes. What we do NOT know in advance is which outcome will occur in any particular experiment.

    Sample space (population): the set of all possible outcomes of the experiment; here, {1, 2, 3, 4, 5, 6}.

    EC203 Introduction to Empirical Economics. KT. 2


    A random variable is a variable whose (numerical) value is determined by the outcome of a random experiment.

    Example: toss two fair coins. Let H denote a head and T a tail. There are four possible outcomes: {HH, HT, TH, TT}.

    Now consider a variable X, defined as the number of heads observed in the toss of two fair coins ("number of heads").

    The situation is as follows:

    Possible outcome    Number of heads
    TT                  0
    TH                  1
    HT                  1
    HH                  2

    EC203 Introduction to Empirical Economics. KT. 3


    The variable X, "number of heads", is a random or stochastic variable and has 3 possible values: 0, 1 and 2.

    Random variables (r.v.s) may be discrete or continuous:
    A discrete r.v. takes on only a finite number of particular values.
    A continuous r.v. can take on any value in some interval of values.

    Both the roll of the die and the toss of the coins we have looked at are discrete r.v.s. An example of a continuous r.v. is the annual rainfall in Glasgow.

    Focussing initially on discrete r.v.s makes the concepts easier to grasp.

    EC203 Introduction to Empirical Economics. KT. 4


    Probability

    Logical reasoning and/or empirical evidence may give us some feeling for how likely different outcomes are.

    E.g. throw of the die: the outcomes {1, 2, 3, 4, 5, 6} are all equally likely.
    Basic coin toss: H or T, both outcomes equally likely.

    For the two-coin toss example:
    we can expect the value 1 to occur with twice the likelihood of the value 0
    the values 0 and 2 are equally likely

    Let's use the notation Pr for the probability of a particular outcome. We can now deduce some probabilities for this example:

    Possible outcome    Pr of outcome
    0 heads             1/4
    1 head              2/4
    2 heads             1/4
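
    The same probabilities can be deduced by counting equally likely outcomes. The following is a minimal Python sketch of that reasoning; the variable names (`outcomes`, `counts`, `pr`) are illustrative and are not taken from the lecture or from Gujarati.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Four equally likely outcomes of tossing two fair coins: HH, HT, TH, TT.
outcomes = ["".join(pair) for pair in product("HT", repeat=2)]

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Classical (a priori) probability: favourable outcomes / total outcomes.
pr = {heads: Fraction(n, len(outcomes)) for heads, n in counts.items()}

for heads in sorted(pr):
    print(heads, pr[heads])
```

    Exact fractions are used rather than floats so that the probabilities can be seen to sum to exactly one.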

    EC203 Introduction to Empirical Economics. KT. 5


    Or, we can write Pr(1 head) = 2/4 (or 1/2), etc.

    Note that:
    the probabilities sum to one, as we are distributing a total probability of 1
    the value of 1 corresponds to a certainty: we know that one of the outcomes will occur
    the outcomes are mutually exclusive (i.e. they cannot occur at the same time)

    Note that this classical definition of probability is what we call an a priori definition: the probabilities are derived from purely deductive reasoning.

    However, what if the outcomes of an experiment are not finite and cannot be stated with certainty?

    EC203 Introduction to Empirical Economics. KT. 6


    E.g. what is the probability that GDP will rise by a certain amount?

    Relative frequency or empirical definition of probability

    Distinguish between absolute and relative frequency:
    the absolute frequency is the number of occurrences of a given event, e.g. 10 students in this class get an exam mark of 70% or more
    if there are 50 students in the class, the relative frequency of the event "achieving a first-class mark" is 10/50 = 1/5

    The frequency distribution of marks achieved by all 50 students in the class would show the different marking bands and how students are distributed across them, in both relative and absolute terms.
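
    As a rough illustration of the two kinds of frequency, the sketch below uses an invented set of 50 marks (they are not real class data) and assumes that a first-class mark means 70% or more.

```python
# Hypothetical marks for a class of 50 students (invented for illustration).
marks = [45] * 5 + [55] * 15 + [65] * 20 + [75] * 10

# Absolute frequency: number of students with a first-class mark (assumed 70%+).
absolute = sum(1 for m in marks if m >= 70)

# Relative frequency: absolute frequency divided by the number of students.
relative = absolute / len(marks)

print(absolute, relative)  # 10 0.2
```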

    EC203 Introduction to Empirical Economics. KT. 7


    Can we treat relative frequencies as probabilities?

    Yes, provided the number of observations that the relative frequencies are based on is reasonably large. This is the empirical, or relative frequency, definition of probability.
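
    Why the number of observations matters can be seen in a small simulation. This is only a sketch: it uses Python's pseudo-random generator as a stand-in for a fair die, and the function name `relative_frequency_of_six` is made up for this example.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def relative_frequency_of_six(n_rolls):
    """Roll a simulated fair die n_rolls times; return the share of sixes."""
    sixes = sum(1 for _ in range(n_rolls) if random.randint(1, 6) == 6)
    return sixes / n_rolls

# With few rolls the relative frequency can be far from 1/6;
# with many rolls it settles close to it.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_six(n))
```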

    See Gujarati on the properties of probabilities, but ignore Bayes' Theorem.

    Probability of random variables

    First, discrete r.v.s: these take only a finite number of values.

    If X is an r.v. with distinct values $x_1, x_2, \dots, x_n$, the function $f$ defined by

    $$f(x) = \begin{cases} P(X = x_i) & \text{if } x = x_i,\ i = 1, 2, \dots, n \\ 0 & \text{if } x \neq x_i \end{cases}$$

    is called the probability mass function (PMF) or probability function (PF).

    EC203 Introduction to Empirical Economics. KT. 8

    Note that

    $$0 \le f(x_i) \le 1$$

    i.e. the probability of X taking the value $x_i$ lies between 0 and 1, and

    $$\sum_{i=1}^{n} f(x_i) = 1$$

    From slide 5:

    Number of heads (X)    PF f(X)
    0 heads                1/4
    1 head                 1/2
    2 heads                1/4
    Sum                    1

    Geometrically? (Insert the PMF of the number of heads in a two-coin toss; see Gujarati Fig 2.2.)

    Expected value of a random variable, X

    EC203 Introduction to Empirical Economics. KT. 9


    $$E(X) = \sum_{i=1}^{n} x_i f(x_i)$$
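
    Applied to the two-coin toss, the expected value formula can be worked through as follows. This is a minimal sketch using the PMF from slide 5; the dictionary layout is just one convenient way to hold a PMF.

```python
from fractions import Fraction

# PMF of X = number of heads in two tosses of a fair coin.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# A valid PMF has every f(x_i) between 0 and 1, and the f(x_i) sum to 1.
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1

# E(X) = sum over i of x_i * f(x_i) = 0*(1/4) + 1*(1/2) + 2*(1/4)
expected_value = sum(x * p for x, p in pmf.items())
print(expected_value)  # 1
```

    So on average we expect one head per pair of tosses.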

    EC203 Introduction to Empirical Economics. KT. 10


    Probability distribution of a continuous r.v.

    Instead of a probability mass function we have a probability density function (PDF).

    Because a continuous r.v. can take an infinite number of values, the probability of it taking any one particular value is zero, so probability is always measured over an interval.

    Formally, this means using the integral rather than the summation operator (used for discrete r.v.s):

    $$P(x_1 < X < x_2) = \int_{x_1}^{x_2} f(x)\,dx \quad \text{for all } x_1 < x_2$$
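
    To make the integral concrete, the sketch below approximates the area under a PDF numerically. The density f(x) = 2x on [0, 1] is a hypothetical example chosen because its areas are easy to check by hand; it does not come from the lecture.

```python
def f(x):
    """Hypothetical PDF: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def probability(x1, x2, steps=10_000):
    """Approximate P(x1 < X < x2) as the area under f (midpoint rule)."""
    width = (x2 - x1) / steps
    return sum(f(x1 + (k + 0.5) * width) for k in range(steps)) * width

print(round(probability(0.0, 1.0), 6))  # total area under the PDF: 1.0
print(round(probability(0.2, 0.5), 6))  # exact value is 0.5**2 - 0.2**2 = 0.21
print(probability(0.3, 0.3))            # zero-width interval: 0.0
```

    The last line shows why the probability of any single value is zero: an interval of zero width has no area under the curve.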


    (Insert diagram and/or see Gujarati Fig 2.3)

    Note that $P(X = x_i) = 0$: the probability that a continuous r.v. takes any single value is zero.

    Properties of a PDF

    1. The total area under the curve f(x) is 1.

    2. P(x1 < X < x2) is the area under the curve between x1 and x2.


    Gujarati:
    Skip the material on cumulative distribution functions for now.
    Also skip the section on multivariate probability density functions (later class).

    Statistical independence will also be key in later courses.

    For now, move on to Chapter 3: Characteristics of probability distributions*

    * also referred to as moments of PDFs

    Next slide...

    EC203 Introduction to Empirical Economics. KT. 13


    Moments of PDFs

    The first moment of a PDF is the expected value of the random variable it represents:
    the weighted average of all possible values, where the probabilities of these values serve as weights
    also called the average or mean value, or the population mean value

    E.g. throwing a die:
    outcomes are {1, 2, 3, 4, 5, 6}
    each with a probability of 1/6

    So, E(X) = 1/6 + 2/6 + 3/6 + 4/6 + 5/6 + 6/6 = 21/6 = 3.5

    Odd, since this is a discrete r.v. and 3.5 is not a possible outcome? Think of it this way: if someone gave you £1 for each number on the die (i.e. £6 for the 6, £1 for the 1), then after a large number of rolls you would anticipate receiving £3.50 per roll on average.
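
    The "payoff per roll" interpretation can be checked with a short simulation. This is a sketch only: the die is simulated with Python's pseudo-random generator, and the number of rolls (100,000) is an arbitrary choice.

```python
import random
from fractions import Fraction

faces = range(1, 7)

# Exact expected value: each face weighted by its probability of 1/6.
expected_value = sum(face * Fraction(1, 6) for face in faces)
print(expected_value)  # 7/2, i.e. 3.5

# Long-run average payoff per roll in a simulation (seed fixed for repeatability).
random.seed(1)
n_rolls = 100_000
average = sum(random.choice(faces) for _ in range(n_rolls)) / n_rolls
print(average)  # close to 3.5, though no single roll ever pays 3.5
```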

    EC203 Introduction to Empirical Economics. KT. 14


    Geometrically: (insert or take Gujarati Fig 3.1)

    Gujarati: read the section on properties of the expected value.

    Key here is that the expected value is a measure of central tendency of the PDF.

    Ignore for now the section on the EV of multivariate PDFs.

    Our next focus is the second moment of the PDF: the variance, a measure of dispersion.

    EC203 Introduction to Empirical Economics. KT. 15


    Variance of a PDF

    In Lecture 2, we looked at the standard deviation:

    $$s = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}}$$

    i.e. we square the deviation of each observation of Y from the sample mean, sum these squared deviations, divide by the number of observations minus 1, and take the square root.

    In empirical economics we normally replace s with $\sigma_x$ as notation for the standard deviation of variable X.

    The variance is defined as

    $$\mathrm{var}(X) = \sigma_x^2$$

    that is, the square of the standard deviation.
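
    The link between the standard deviation formula from Lecture 2 and the variance can be sketched as below. The sample `y` is invented for illustration, and the check against Python's `statistics` module is included only as a sanity test.

```python
import math
import statistics

# Hypothetical sample of observations on Y (invented for illustration).
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(y)
y_bar = sum(y) / n

# Sum of squared deviations from the sample mean, divided by N - 1.
variance = sum((y_i - y_bar) ** 2 for y_i in y) / (n - 1)

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

# The standard library's sample formulas give the same answers.
assert math.isclose(variance, statistics.variance(y))
assert math.isclose(std_dev, statistics.stdev(y))

print(variance, std_dev)
```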

    EC203 Introduction to Empirical Economics. KT. 16


    We won't go into details on computing the variance here; see Gujarati. It is not very intuitive, so we'll move on.

    Several r.v.s may have the same expected value but different variances.

    Geometrically: (insert or see Gujarati Fig 3.2)

    See Gujarati on the properties of variance. The key one for this particular class is the first: the variance of a constant is zero (by definition, a constant has no variability).
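
    For readers who do want to see a computation, here is a small sketch with three hypothetical discrete r.v.s that share the same expected value. It assumes the standard definition var(X) = sum of (x - E(X))^2 f(x) for a discrete r.v.; the PMFs and names are invented for this example.

```python
from fractions import Fraction

def mean(pmf):
    """E(X) = sum of x * f(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """var(X) = sum of (x - E(X))**2 * f(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Three hypothetical discrete r.v.s, all with expected value 3.
narrow = {2: Fraction(1, 2), 4: Fraction(1, 2)}
wide = {0: Fraction(1, 2), 6: Fraction(1, 2)}
constant = {3: Fraction(1)}  # always takes the value 3

for name, pmf in [("narrow", narrow), ("wide", wide), ("constant", constant)]:
    print(name, mean(pmf), variance(pmf))
```

    All three have mean 3, but the variances are 1, 9 and 0 respectively; the last illustrates that a constant has zero variance.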

    The final 3 concepts today link to Lecture 4.

    EC203 Introduction to Empirical Economics. KT. 17


    In Gujarati, skip until we get to the coefficient of variation:
    this is a measure of relative variation: the ratio of the standard deviation to the mean of an r.v.
    because the mean and standard deviation are in the same units of measurement, the coefficient of variation is independent of units
    therefore it is useful for comparison across different r.v.s
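
    Unit independence can be demonstrated with a quick sketch. The heights below are invented, and the sample standard deviation and mean are used in place of their population counterparts, which is an assumption made only to keep the example short. (The ratio is often multiplied by 100 and quoted as a percentage.)

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical heights of five people, in metres and in centimetres.
metres = [1.60, 1.70, 1.75, 1.80, 1.90]
centimetres = [100 * m for m in metres]

cv_m = coefficient_of_variation(metres)
cv_cm = coefficient_of_variation(centimetres)

print(cv_m)
print(cv_cm)  # same value up to floating-point rounding
```

    Changing the units rescales the mean and the standard deviation by the same factor, so the ratio is unchanged.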

    Then there is the covariance:
    this is a special kind of EV that measures how two variables vary or move together
    it can be positive, negative or zero

    Again, the computation of the covariance is not too intuitive and we won't be focussing on it here, other than its role in calculating the correlation coefficient, but you will in future applied classes.

    EC203 Introduction to Empirical Economics. KT. 18


    The last but one thing we are interested in in Gujarati Ch. 3 is the (population) correlation coefficient:
    this is found by taking the covariance of 2 r.v.s and dividing by the product of their standard deviations

    Thus, it is a measure of linear association between two variables. This is the focus of our next lecture.
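
    The relationship between covariance and correlation can be sketched on a handful of numbers. The five (x, y) pairs are invented, and they are treated as if they were the whole population (dividing by N), which is an assumption made for simplicity.

```python
import statistics

# Hypothetical paired observations (invented for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

mean_x, mean_y = statistics.mean(x), statistics.mean(y)

# Covariance: average product of deviations from the two means
# (the five pairs are treated as the whole population, so divide by N).
covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)

# Correlation: covariance divided by the product of the standard deviations.
correlation = covariance / (statistics.pstdev(x) * statistics.pstdev(y))

print(covariance, correlation)
```

    Unlike the covariance, the correlation coefficient always lies between -1 and 1, which is what makes it comparable across pairs of variables.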

    Finally, skip forward to the section titled "From the population to the sample" and read about the sample mean, variance, covariance and correlation coefficient; this will be our focus in Lecture 6.

    EC203 Introduction to Empirical Economics. KT. 19
