empirical finance

Upload: jackta

Post on 02-Jun-2018


  • 8/10/2019 Empirical Finance

    1/562

    Empirical Finance

    Executive MSc in Investment and Risk Management Programme

    Prof. Robert L [email protected]

    +65 6631 8579

    EDHEC Business School

    24-27 Mar 2011

    22-24 Aug 2011

    Singapore Campus

    Kimmel (EDHEC Business School) Empirical Finance Singapore Mar/Aug 2011 1 / 563


    Introduction


    This course is about Empirical Finance.

    What do the available data tell us about financial markets, and do they support or contradict the various theories we have developed to explain the behaviour of financial markets?

    We will focus mainly on pricing, that is, how prices of financial assets are determined. It is possible to focus on other aspects of financial markets, e.g., trading volume.

    The course will discuss both econometric techniques and the actual empirical findings.


    Basic Principles


    Basic Principles Probability and Distributions

    Why is there even a subject matter called Empirical Finance?

    1. Astronomers can predict the positions of the planets, and phenomena such as eclipses, with extreme accuracy, centuries in advance.
    2. Meteorologists can predict the weather a few days in advance.
    3. Can stock market analysts predict stock prices ten minutes in advance?

    Humans have essentially no effect on the motion of the planets, and only (possibly) a very long-term effect on the weather. Prices of financial assets are set on a minute-to-minute basis by people.

    How do they decide what the prices of financial assets should be?


    The extent to which financial markets incorporate available information into asset prices (the degree of market efficiency) is very hotly debated, in both academic and industry circles.

    There is no question, though, that events nobody knows about yet can't be incorporated into asset prices.

    The evolution of the macroeconomy, technological progress, and societal evolution are all very hard to predict, even by people who spend their whole lives studying such things. They are best modelled as random processes.

    If the fundamental economic processes that affect asset prices are random, then the asset prices themselves are also random.


    The fact that security prices are random has profound implications for investors: much of financial theory involves the investor's problem of trading off risk and average return.

    However, it also has profound implications for those who study financial markets. Financial theories are generally about relations between average returns and various measures of risk. If we observe that the average returns of securities differ from what is predicted by a theory, what conclusion do we draw?

    1. The theory is wrong.
    2. The theory is right, but its predictions are not met exactly because of the random variation in asset prices.

    Which is it?

    Probability and statistics are absolutely fundamental to the study of financial markets.


    Example: suppose there are three assets, X, Y, and Z. We have developed an economic theory that tells us what (on average) the returns of the assets ought to be. We then get a sample of monthly returns (annualised) of the three assets, over the last 20-year period. The results are as follows.

    Asset                                       X      Y      Z
    Average return (predicted)                  8%     10%    12%
    Average return (observed)                   6%     16%    14%
    Standard deviation of return (observed)     25%    40%    60%

    How do the predictions of the theory hold up? Do you have enough information to tell?


    A probability distribution specifies the likelihood of each possible outcome of a random process. Distributions can be discrete or continuous.

    When a random variable has a discrete probability distribution, there are either finitely many outcomes, or countably many.

    Consider a six-sided die, each side labelled with a number from one to six. If each side is equally likely to come up when the die is rolled, then the probabilities $p_1, \ldots, p_6$ are all equal to 1/6.

    Probabilities (in a discrete probability distribution) must satisfy two properties:

    1. The probabilities must be zero or positive.
    2. The probabilities must add up to one.

    Do the probabilities specified above satisfy both of these constraints?
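
    The two constraints can be checked mechanically. The following is a minimal sketch using exact fractions; the helper name `is_valid_distribution` is an illustrative assumption, not something from the course material.

```python
from fractions import Fraction

def is_valid_distribution(probs):
    """Return True if all probabilities are non-negative and sum to one."""
    return all(p >= 0 for p in probs) and sum(probs) == 1

# Fair six-sided die: p_1 = ... = p_6 = 1/6
die_probs = [Fraction(1, 6)] * 6
die_is_valid = is_valid_distribution(die_probs)
```

    Exact fractions are used rather than floats so that the "adds up to one" test is not disturbed by rounding error.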

    [Figure: Probability Distribution of Six-sided Die Throw]


    A discrete probability distribution can have infinitely many outcomes, each with positive probability.

    Suppose we throw a coin with a "heads" and a "tails" side. The coin is "fair", meaning each side has a probability of 1/2. Suppose we throw this coin repeatedly, and call $X$ the number of throws until the first head. What is the probability distribution of $X$?

    There is a 1/2 probability that the first throw will be heads, so $p_1 = 1/2$. The probability that the second throw will be the first head is 1/4, so $p_2 = 1/4$. More generally, $p_i = (1/2)^i$. There is no limit to the value of $i$; it is possible (although not likely) that it will take a million, a billion, a trillion trillion trillion, etc. throws.

    Do these probabilities satisfy the two rules?


    Each of the probabilities is clearly greater than zero, so we have no problem with negative probabilities. Do they add up to one?

    $$\sum_{i=1}^{\infty} p_i = \sum_{i=1}^{\infty} \left(\frac{1}{2}\right)^i = 1$$

    (For justification of the last step, see any reference on geometric infinite series.)

    The probabilities are non-negative, and add up to one; they are valid probabilities. More generally, any distribution with

    $$p_i = (1 - p)^{i-1} p$$

    for some $p \in (0, 1]$ is called a geometric distribution.
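
    The infinite sum can be made concrete with partial sums. The sketch below assumes the geometric form $p_i = (1 - p)^{i-1} p$; the function name is hypothetical. After $n$ terms the partial sum falls short of one by exactly $(1 - p)^n$, which shrinks to zero as $n$ grows.

```python
from fractions import Fraction

def geometric_partial_sum(p, n):
    """Sum of p_i = (1 - p)**(i - 1) * p for i = 1..n, computed exactly."""
    return sum((1 - p) ** (i - 1) * p for i in range(1, n + 1))

half = Fraction(1, 2)
partial_10 = geometric_partial_sum(half, 10)   # first ten terms of the coin example
shortfall_10 = 1 - partial_10                  # what is still missing from one
```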

    [Figure: Probability Distribution of First Head in Coin Throw Example]


    Continuous probability distributions have uncountably infinitely many possible outcomes.

    Example: what is the amount of rainfall in the centre of Singapore on 22 June 2011, measured in millimetres?

    This quantity could take any non-negative value; it could be zero (no rainfall at all), or any positive number. (Since water consists of molecules, the amount of rainfall is actually a discrete quantity; however, it is very well approximated by a continuous distribution.)

    Continuous probability distributions are specified by a probability density function.


    Example: the random variable $X$ has a uniform probability distribution on the interval $[0, 1]$. Then $X$ has the probability density function $f_X(x) = 1$.

    The density function does not specify the probability of each outcome; each particular outcome is infinitely improbable (i.e., has probability of 0). But ranges of outcomes have positive probability; what is the probability that $X$ falls in the interval $[0.2, 0.3]$?

    $$P(0.2 \le X \le 0.3) = \int_{0.2}^{0.3} f_X(x)\,dx = \int_{0.2}^{0.3} (1)\,dx = x\Big|_{0.2}^{0.3} = 0.1$$

    Probability density functions must satisfy two rules:

    1. They must be non-negative.
    2. They must integrate to one.

    Does this uniform probability distribution satisfy these constraints?
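
    The interval probability and the "integrates to one" rule can also be checked numerically. This is a rough sketch using a midpoint rule; `integrate_midpoint` and `uniform_density` are illustrative names, and the step count is an arbitrary choice.

```python
def integrate_midpoint(f, lower, upper, steps=1000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(f(lower + (k + 0.5) * width) for k in range(steps)) * width

def uniform_density(x):
    """Density of the uniform distribution on [0, 1]; zero elsewhere."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

prob_02_03 = integrate_midpoint(uniform_density, 0.2, 0.3)   # P(0.2 <= X <= 0.3)
total_mass = integrate_midpoint(uniform_density, 0.0, 1.0)   # integral over [0, 1]
```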


    The uniform probability density on $[0, 1]$ is obviously positive on this range. It also integrates to one:

    $$\int_0^1 f_X(x)\,dx = \int_0^1 (1)\,dx = 1$$

    Note that this integral is only taken over the range of possible values $[0, 1]$. We can instead take the probability density to be defined as 0 outside this range:

    $$f_X(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & x < 0 \text{ or } x > 1 \end{cases}$$

    We can then just integrate over the entire real line $(-\infty, +\infty)$, and the value of the integral is still one.


    More generally, a uniform distribution can be defined on any range $[a, b]$, with $b > a$:

    $$f_X(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & x < a \text{ or } x > b \end{cases}$$

    Note that the probability density satisfies the two requirements; it is non-negative, and it integrates to one.

    [Figure: Uniform Distribution on [0, 1]]


    Another example: the exponential distribution, with probability density function defined on the interval $[0, +\infty)$:

    $$f_X(x) = \lambda e^{-\lambda x}, \quad \lambda > 0$$

    Note that this is not a single distribution, but a family of many distributions, indexed by the parameter $\lambda$.

    The exponential distribution has many applications; for example, it is used to model the time until a radioactive particle decays. It is sometimes used to model time to default in credit risk applications.

    Does the exponential distribution satisfy the two requirements for a valid probability distribution?
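
    One way to check the two requirements, writing the rate parameter as $\lambda > 0$ (the symbol is an assumption, since the extracted text lost it): the density $\lambda e^{-\lambda x}$ is a product of two positive quantities, so it is positive everywhere on $[0, +\infty)$. For the second requirement:

    $$\int_0^{\infty} \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^{\infty} = 0 - (-1) = 1$$

    The upper limit vanishes only because $\lambda > 0$; for $\lambda \le 0$ the integral would not converge.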

    [Figure: Exponential Distribution with $\lambda = 0.5$]


    Another example: the normal, or Gaussian, distribution. This distribution is defined for all real numbers (positive, zero, and negative), and has the density function:

    $$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad \sigma > 0$$

    Despite its somewhat odd appearance, the normal distribution arises in a very natural way in many, many applications, and is one of the most fundamental continuous distributions there is. It is often used to model returns of financial assets.

    Note that the Gaussian distribution is actually a family of distributions, indexed by $\mu$ and $\sigma$. More on these parameters later.

    Does the Gaussian distribution satisfy the two requirements for a valid probability distribution?

    [Figure: Gaussian Distribution with $\mu = 0.1$ and $\sigma = 0.25$]


    We will often use summary statistics, which capture some (but not all) of the information in the probability distribution of a random variable.

    One of the most important is the mean, or expected value. This is just the average outcome, weighted by probabilities.

    $$E[X] = \sum_{i=1}^{N} x_i p_i$$

    where $x_i$ is the value of a particular outcome, and $p_i$ is its probability. The sum must be taken across all possible outcomes (the number of outcomes being denoted by $N$ here).


    For a random variable with a continuous distribution, the mean is an integral over all possible outcomes (weighted by probability).

    $$E[X] = \int_{-\infty}^{+\infty} x f_X(x)\,dx$$

    The expected values of the die and coin throw examples are 3.5 and 2, respectively. The uniform distribution on $[a, b]$ has an expected value of $(a + b)/2$. The exponential distribution has a mean of $1/\lambda$. The normal (Gaussian) distribution has a mean of $\mu$.

    When there are infinitely many possible outcomes, the expected value may not even exist: what is the expected value of a random variable that has value 2 with probability 1/2, 4 with probability 1/4, etc.? The expected value also does not even have to be one of the possible outcomes; in the die throw example, the mean is 3.5, but no throw ever has this value.
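
    These discrete expected values can be explored with partial sums. The sketch below is illustrative only (function names are assumptions): the die mean is a finite sum, the first-head mean is a series whose partial sums approach 2, and the "doubling" variable has partial sums that grow without bound because every term equals 1.

```python
from fractions import Fraction

# Die throw: outcomes 1..6, each with probability 1/6
die_mean = sum(Fraction(x, 6) for x in range(1, 7))

def coin_mean_partial(n):
    """Partial sum of i * (1/2)**i for i = 1..n (first-head example)."""
    return sum(i * Fraction(1, 2) ** i for i in range(1, n + 1))

def doubling_mean_partial(n):
    """Partial sum of 2**i * (1/2)**i for i = 1..n; each term equals 1."""
    return sum(2 ** i * Fraction(1, 2) ** i for i in range(1, n + 1))
```

    Calling `doubling_mean_partial` with ever larger `n` returns `n` itself, which is one way to see that this expected value does not exist as a finite number.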


    For a random variable $X$, any function $g(X)$ of $X$ is also a random variable, and we can contemplate its expected value. For example, if $X$ is the value of a die throw (1 through 6, with equal probability), what is the expected value of the squared outcome?

    From the definition of an expected value:

    $$E[X^2] = \sum_{i=1}^{6} x_i^2 p_i = \frac{1}{6}(1)^2 + \ldots + \frac{1}{6}(6)^2 = \frac{91}{6}$$

    Similarly, $E[X^3] = 441/6$ and $E[X^4] = 2275/6$. (Try it.)

    When there are infinitely many possible outcomes, the expected value of $X$ or a particular function of $X$ may not exist. However, for the coin throwing example, $E[X^n]$ is well-defined for any integer $n \ge 0$. Can you find $E[X]$ and $E[X^2]$?
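
    The "try it" can be done by machine. The following sketch computes the die moments exactly and approximates the coin-example moments by truncating the infinite sum; the function names and the truncation at 200 terms are illustrative assumptions.

```python
from fractions import Fraction

def die_moment(n):
    """E[X**n] for a fair six-sided die, as an exact fraction."""
    return sum(Fraction(x ** n, 6) for x in range(1, 7))

second, third, fourth = die_moment(2), die_moment(3), die_moment(4)

def coin_moment_partial(n, terms=200):
    """Partial sum of i**n * (1/2)**i, approximating E[X**n] for the first-head example."""
    return float(sum(i ** n * Fraction(1, 2) ** i for i in range(1, terms + 1)))
```

    Because $(1/2)^i$ decays much faster than any power $i^n$ grows, the neglected tail is tiny, which is also why these moments exist.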


    We care not just about the expected value (or average outcome), but also how large deviations from the average tend to be. The variance of a random variable is one such measure. For discrete and continuous random variables, respectively, the variance is:

    $$\mathrm{Var}[X] = \sum_{i=1}^{N} p_i (x_i - E[X])^2$$

    $$\mathrm{Var}[X] = \int_{-\infty}^{+\infty} f_X(x)\,(x - E[X])^2\,dx$$

    In both cases, we can express the variance as an expected value:

    $$\mathrm{Var}[X] = E\left[(X - E[X])^2\right] = E[X^2] - (E[X])^2$$

    The last step follows from the definitions of expected value and variance, although the algebra is tedious.
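
    A sketch of the omitted algebra, assuming linearity of expectation (that $E[aX + b] = aE[X] + b$ for constants $a$ and $b$) and writing $m = E[X]$, which is a constant:

    $$E\left[(X - m)^2\right] = E\left[X^2 - 2mX + m^2\right] = E[X^2] - 2m\,E[X] + m^2 = E[X^2] - 2m^2 + m^2 = E[X^2] - (E[X])^2$$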


    What is the variance of $X$ in the die throw example? One method: go straight to the definition of variance:

    $$\mathrm{Var}[X] = \sum_{i=1}^{N} p_i (x_i - E[X])^2 = \frac{1}{6}(1 - 3.5)^2 + \ldots + \frac{1}{6}(6 - 3.5)^2 = \frac{35}{12}$$

    Another method: find the variance in terms of quantities we have already calculated:

    $$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = \frac{91}{6} - \left(\frac{7}{2}\right)^2 = \frac{35}{12}$$

    Both methods give the same answer, which is not a coincidence.

    What is the variance in the coin throwing example?
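
    Both routes to the die variance are easy to reproduce exactly. A minimal sketch (variable names are illustrative):

```python
from fractions import Fraction

outcomes = range(1, 7)
p = Fraction(1, 6)

mean = sum(p * x for x in outcomes)
# Method 1: probability-weighted squared deviations from the mean
var_by_definition = sum(p * (x - mean) ** 2 for x in outcomes)
# Method 2: E[X^2] minus the squared mean
var_by_moments = sum(p * x ** 2 for x in outcomes) - mean ** 2
```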


    When there are infinitely many outcomes, variance (like expected value) may not exist. For example, a Student's t distribution with 2 degrees of freedom has an expected value of 0, but its variance does not exist.

    For most distributions we deal with, both mean and variance are well-defined. For the exponential distribution, the variance is:

    $$\mathrm{Var}[X] = \frac{1}{\lambda^2}$$

    (Can you prove it?)

    For the normal (Gaussian) distribution, the variance is:

    $$\mathrm{Var}[X] = \sigma^2$$

    (Proof of this result is more difficult.)
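
    For the exponential case, one possible route is integration by parts, writing the rate parameter as $\lambda$ (an assumed symbol) and using the mean $E[X] = 1/\lambda$:

    $$E[X^2] = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\,dx = \left[-x^2 e^{-\lambda x}\right]_0^{\infty} + \int_0^{\infty} 2x e^{-\lambda x}\,dx = \frac{2}{\lambda} \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{2}{\lambda} E[X] = \frac{2}{\lambda^2}$$

    $$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$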


    Variance is, by construction, zero or positive. (It is only zero if the random variable is always equal to its mean.) It is never negative.

    The mean, or expected value, of a random variable can be expressed in the same units as the random variable itself; however, variance is not so convenient. For example, suppose the annual return of a security has a normal distribution, with $\mu = 0.1$ and $\sigma = 0.4$. Then the mean (or average) return is 0.1, or 10%, but its variance is 0.16; the units are percent squared per year squared. We therefore will often use standard deviation instead of variance:

    $$\mathrm{SD}[X] \equiv \sqrt{\mathrm{Var}[X]}$$

    Standard deviation, like variance, is always zero or positive, but is in the same units as the original random variable. In the example above, the standard deviation of the security's return is 40% per year.


    In financial and economic applications, mean and variance are used all the time. Less often, so-called higher-order moments are used, e.g., the third and fourth (centred) moments:

    $$E\left[(X - E[X])^3\right] = E[X^3] - 3\,E[X^2]\,E[X] + 2\,(E[X])^3$$

    $$E\left[(X - E[X])^4\right] = E[X^4] - 4\,E[X^3]\,E[X] + 6\,E[X^2]\,(E[X])^2 - 3\,(E[X])^4$$

    Like variance, these quantities are not in the most convenient units, so they are often converted to dimensionless quantities, skewness and kurtosis.


    Skewness and kurtosis are defined as:

    $$\mathrm{Skew} \equiv \frac{E\left[(X - E[X])^3\right]}{(\mathrm{Var}[X])^{3/2}}$$

    $$\mathrm{Kurt} \equiv \frac{E\left[(X - E[X])^4\right]}{(\mathrm{Var}[X])^2} - 3$$

    The kurtosis (sometimes called excess kurtosis) has 3 subtracted out to make a normal distribution have a kurtosis of 0; any distribution with positive kurtosis is therefore more kurtotic than a normal distribution.

    Skewness is related to the symmetry of a distribution, and kurtosis is related to the probability of extreme values.

    Skewness can take any value, positive or negative. Any symmetric distribution (e.g., the normal distribution, the uniform distribution, or the die throwing example) has skewness of zero.


    If a distribution has most of its probability near the mean, but also has a small probability of extremely high values, then the distribution will have positive skewness. If the extreme values are low instead of high, then the skewness will be negative.

    Income distributions in most countries have positive skewness: most people earn an amount around the median, but a very small number of people typically earn very high incomes.

    The skewness of the exponential distribution is 2; the skewness of the distribution in the coin throwing example is $3/\sqrt{2}$. (Can you derive these results?)


    Kurtosis has to do with the probability of extreme observations. If a random variable is almost always close to the mean but, with some small probability, can take on a very large value (above or below the mean), then the distribution has high kurtosis.

    The lowest possible value of kurtosis is $-2$; there is no maximum value of kurtosis. It is possible for the skewness and the kurtosis of a distribution not to exist.

    The exponential distribution has a kurtosis of 6; the uniform distribution has a kurtosis of $-1.2$. The Gaussian distribution has a kurtosis of zero. The coin throwing example has a kurtosis of 6.5, and the die throwing example has a kurtosis of $-222/175$. (Can you derive these results?)
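
    The discrete figures quoted above can be checked directly from the definitions. This is a sketch, not a derivation: the helper names are assumptions, and the coin example is truncated at 200 throws, which leaves out a negligible amount of probability.

```python
from fractions import Fraction
from math import sqrt

def central_moment(outcomes, probs, k):
    """k-th central moment E[(X - E[X])**k] of a discrete distribution."""
    mean = sum(p * x for x, p in zip(outcomes, probs))
    return sum(p * (x - mean) ** k for x, p in zip(outcomes, probs))

def skewness(outcomes, probs):
    """Third central moment divided by variance to the power 3/2."""
    var = central_moment(outcomes, probs, 2)
    return float(central_moment(outcomes, probs, 3)) / float(var) ** 1.5

def excess_kurtosis(outcomes, probs):
    """Fourth central moment divided by squared variance, minus 3."""
    var = central_moment(outcomes, probs, 2)
    return central_moment(outcomes, probs, 4) / var ** 2 - 3

die_x = list(range(1, 7))
die_p = [Fraction(1, 6)] * 6

# First-head example, truncated at 200 throws (an approximation)
coin_x = list(range(1, 201))
coin_p = [Fraction(1, 2) ** i for i in coin_x]
```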

    [Figure: Exponential vs. Gaussian Distribution]

    [Figure: Exponential vs. Gaussian Distribution, Right Tail]

    [Figure: Gaussian vs. Student's t Distribution]


    Basic Principles Estimation and Inference


    Problem: we do not know the distribution of random events.

    1. For the coin throwing example, it seems like the probability of heads is 0.5. Are you sure? Maybe it is a trick coin.
    2. For a security return, we know the future return is random (i.e., we cannot predict it in advance with perfect accuracy). But what is its probability distribution?

    If we have historical data (e.g., we have observed the coin being thrown repeatedly, or we have historical returns for a security), we can use this data to learn something about the probabilities of different outcomes. (Is there an implicit assumption here?)

    Estimation of the entire probability distribution of a random variable is a very difficult problem. (Easy for some special cases, like the coin throwing example.) We will focus on estimating quantities such as the mean and variance of a random variable.


    How do we estimate the mean (expected value) of a random variable, such as the outcome of a coin throw, or the future return of a security?

    An extremely general method: take the sample average of the available observations. Suppose we have observed $N$ realisations of the random variable $X$, denoted by $X_1, \ldots, X_N$. Then we can estimate the mean with:

    $$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$$

    Is this a good way to estimate the expected value of a random variable?


    Example: probability of heads with a coin throw.

    Call the value of a coin throw $X = 1$ if it comes up heads, and $X = 0$ otherwise. Call $p$ the probability of heads. Then:

    $$E[X] = \sum_{i=1}^{2} x_i p_i = p \cdot 1 + (1 - p) \cdot 0 = p$$

    So estimating the expected value of $X$ is the same thing as estimating the probability of heads. Compute the sample mean by throwing the coin $N$ times, counting each heads as 1, and each tails as 0. Count up the number of heads, and divide by $N$. This is $\bar{X}$, the sample mean.

    Will the sample average be equal to the true average (i.e., the expected value)?
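
    The procedure can be tried on simulated throws. In this sketch the true probability of heads, the seed, and the number of throws are all arbitrary assumptions made for illustration; with real data the true value of $p$ would of course be unknown.

```python
import random

def sample_mean(observations):
    """Sample average: sum of the observations divided by their number."""
    return sum(observations) / len(observations)

rng = random.Random(2011)          # fixed seed so the run is repeatable
true_p = 0.5                       # assumed probability of heads
throws = [1 if rng.random() < true_p else 0 for _ in range(10_000)]
p_hat = sample_mean(throws)        # estimate of p from the simulated throws
```

    The estimate typically lands close to, but not exactly on, the assumed value, which is the point of the question above.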


    Example: expected return of a security.

    Collect historical returns for the last $N$ months. Add them all up, and divide by $N$:

    $$\bar{R} = \frac{1}{N} \sum_{i=1}^{N} R_i$$

    This method is very commonly used to estimate expected returns of broadly diversified portfolios; it is used less often to try to estimate the expected returns of individual securities. (Any idea why?)

    Will the sample average return be equal to the true expected return?

    What are the statistical properties of the sample mean?


    First, we will need a few basic results. Let $X$ and $Y$ be random variables, and let $a$, $b$, and $c$ be constants. Then:

    $$E[X + Y] = E[X] + E[Y]$$

    $$E[aX] = a\,E[X]$$

    $$E[a + bX + cY] = a + b\,E[X] + c\,E[Y]$$

    These results are true for both discrete and continuous random variables, and follow directly from the definition of expected value. (The derivation is a little tedious though.)


    The first two results are just special cases of the third, which can be generalised; let $X_1, \ldots, X_N$ be random variables, and let $a_0, \ldots, a_N$ be constants. Then:

    $$E\left[a_0 + \sum_{i=1}^{N} a_i X_i\right] = a_0 + \sum_{i=1}^{N} a_i\,E[X_i]$$

    This last result will be extremely useful in analysing the statistical properties of the sample mean.


    Note that the sample mean is itself a random variable; sometimes it will be higher than the true mean, and sometimes it will be lower. We can find its expected value, just like we can with any other random variable:

    $$E[\bar{X}] = E\left[\frac{1}{N} \sum_{i=1}^{N} X_i\right] = E\left[\sum_{i=1}^{N} \frac{1}{N} X_i\right] = \sum_{i=1}^{N} \frac{1}{N} E[X_i] = \sum_{i=1}^{N} \frac{1}{N} E[X] = E[X]$$

    So the expected value of the sample average is equal to the true average: if you estimate the true mean with the sample mean, then on average, you will get it right!

    We would also like to examine how precise the estimate tends to be: how much can the sample average deviate from the true average? However, we need some additional tools first.
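
    The "right on average" property can be illustrated by repeating the sampling experiment many times. This simulation sketch uses a fair die (true mean 3.5); the seed, the sample size of 10, and the 5,000 repetitions are arbitrary assumptions.

```python
import random
from statistics import mean

rng = random.Random(42)

def die_sample_mean(n):
    """Sample mean of n simulated throws of a fair six-sided die."""
    return mean(rng.randint(1, 6) for _ in range(n))

# 5,000 independent samples of 10 throws each
sample_means = [die_sample_mean(10) for _ in range(5_000)]
average_of_sample_means = mean(sample_means)        # should sit near 3.5
spread = max(sample_means) - min(sample_means)      # individual estimates vary widely
```

    Individual sample means scatter noticeably around 3.5, while their average across repetitions sits very close to it; quantifying that scatter is what the additional tools are for.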


    Let $X$ and $Y$ be random variables. The joint distribution tells us the probabilities of different possible outcomes of $X$ and of $Y$ individually, but it also tells us how $X$ and $Y$ are related. Suppose there are $M$ possible values of $X$, and $N$ possible values of $Y$. Then the joint probability $p_{i,j}$ is the probability that $X$ will take the value $x_i$, and $Y$ will simultaneously take the value $y_j$.

    The joint probabilities of $X$ and $Y$ must satisfy the same two restrictions that all probabilities must satisfy: they must be non-negative, and they must add up to one.

    We can also consider the probabilities of either $X$ or $Y$, considered alone. For example, let $p^{(X)}_1, \ldots, p^{(X)}_M$ be the probabilities of the $M$ possible values of $X$, and let $p^{(Y)}_1, \ldots, p^{(Y)}_N$ be the probabilities of the $N$ possible values of $Y$. Then these two sets of probabilities are called the marginal probabilities of $X$ and $Y$.


    There is a relation between the marginal probabilities and the joint probabilities. Specifically:

    $$p^{(X)}_i = \sum_{j=1}^{N} p_{i,j} \qquad p^{(Y)}_j = \sum_{i=1}^{M} p_{i,j}$$

    Suppose $X$ and $Y$ can each take on the values $-1$, 0, or $+1$, and do so with the following probabilities:

                        X
                -1      0       +1
          -1    0.20    0.10    0.00
    Y      0    0.20    0.05    0.20
          +1    0.10    0.00    0.15

    What are the marginal probabilities of $X$ and $Y$?
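
    The marginal probabilities are row and column sums of the joint table. The sketch below assumes the layout in which columns correspond to values of $X$ and rows to values of $Y$; the dictionary representation and names are illustrative choices.

```python
from fractions import Fraction as F

values = [-1, 0, 1]

# joint[(x, y)] = P(X = x, Y = y), assuming columns are X and rows are Y
joint = {
    (-1, -1): F(20, 100), (0, -1): F(10, 100), (1, -1): F(0),
    (-1,  0): F(20, 100), (0,  0): F(5, 100),  (1,  0): F(20, 100),
    (-1,  1): F(10, 100), (0,  1): F(0),       (1,  1): F(15, 100),
}

# Sum over the other variable to obtain each marginal
marginal_x = {x: sum(joint[(x, y)] for y in values) for x in values}
marginal_y = {y: sum(joint[(x, y)] for x in values) for y in values}
```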

    We can also specify the joint probability density function $f_{X,Y}(x, y)$ for two random variables with a continuous distribution.

    The probability that $X \in [a, b]$ and $Y \in [c, d]$ is:

    $$P(a \le X \le b,\; c \le Y \le d) = \int_a^b \int_c^d f_{X,Y}(x, y)\,dy\,dx$$

    In either the discrete or the continuous case, expected values are defined analogously to the case of a single random variable:

    $$E[g(X, Y)] = \sum_{i=1}^{M} \sum_{j=1}^{N} p_{i,j}\, g(x_i, y_j)$$

    $$E[g(X, Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\, g(x, y)\,dy\,dx$$


    We say the discrete random variables $X$ and $Y$ are independent if, for every pair $(i, j)$:

    $$p_{i,j} = p^{(X)}_i\, p^{(Y)}_j$$

    If $X$ and $Y$ are continuous, then they are independent if:

    $$f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$$

    Intuitively, $X$ and $Y$ are independent if knowledge of $X$ tells you nothing about the probability of different outcomes of $Y$, and vice versa.

    We define the covariance between X and Y as:

    $$\mathrm{Cov}[X, Y] \equiv E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]\, E[Y]$$

    Covariance is a measure of how the two random variables are related; e.g., if it is positive, then when X is above its mean value, Y also tends to be above its mean value.

    If two random variables are independent, then their covariance is zero. (Proof?) However, it is possible for random variables to have a covariance of zero, but not be independent.

    Other useful properties of covariance are:

    $$\mathrm{Cov}[X, Y] = \mathrm{Cov}[Y, X] \qquad \mathrm{Cov}[X, X] = \mathrm{Var}[X]$$

    These follow immediately from the definition.
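One way to see that zero covariance does not imply independence is a small hypothetical example (not taken from the slides): let X take the values −1, 0, +1 with probability 1/3 each, and let Y = X². The sketch below evaluates the covariance exactly and compares a joint probability with the product of the marginals.

```python
from fractions import Fraction as F

# Hypothetical distribution: X is -1, 0 or +1 with probability 1/3 each, and Y = X**2.
outcomes = [(x, x * x, F(1, 3)) for x in (-1, 0, 1)]


def expect(g):
    """Expected value of g(X, Y) under the discrete distribution above."""
    return sum(prob * g(x, y) for x, y, prob in outcomes)


# Cov[X, Y] = E[XY] - E[X] E[Y]
cov_xy = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Independence would require P(X = 1, Y = 1) = P(X = 1) * P(Y = 1).
p_x1 = expect(lambda x, y: 1 if x == 1 else 0)
p_y1 = expect(lambda x, y: 1 if y == 1 else 0)
p_x1_y1 = expect(lambda x, y: 1 if (x == 1 and y == 1) else 0)

print("Cov[X, Y] =", cov_xy)
print("P(X=1, Y=1) =", p_x1_y1, "but P(X=1) P(Y=1) =", p_x1 * p_y1)
```

The covariance is exactly zero, yet knowing X determines Y completely, so the two are clearly not independent.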

    The units of covariance are not particularly useful, so one may prefer correlation:

    $$\mathrm{Corr}[X, Y] \equiv \frac{\mathrm{Cov}[X, Y]}{\mathrm{SD}[X]\, \mathrm{SD}[Y]}$$

    Correlation is not well-defined if either X or Y has a standard deviation of zero. But otherwise, correlation is dimensionless, and is bounded between its maximum value of +1 and its minimum value of −1. Correlation and covariance have the same sign; that is, they are both positive, both negative, or both zero.

    If two random variables have a correlation of zero, we say they are uncorrelated. This does not necessarily mean that they are independent!

    Example: X and Y have a bivariate normal distribution:

    $$f_{X,Y}(x, y) = \frac{1}{2\pi \sqrt{\sigma_X^2 \sigma_Y^2 (1 - \rho^2)}} \exp\left(-\frac{(x - \mu_X)^2 \sigma_Y^2 - 2 (x - \mu_X)(y - \mu_Y)\, \rho\, \sigma_X \sigma_Y + (y - \mu_Y)^2 \sigma_X^2}{2 \left[\sigma_X^2 \sigma_Y^2 (1 - \rho^2)\right]}\right)$$

    This distribution has the following properties:

    $$E[X] = \mu_X \qquad E[Y] = \mu_Y$$

    $$\mathrm{Var}[X] = \sigma_X^2 \qquad \mathrm{Corr}[X, Y] = \rho \qquad \mathrm{Var}[Y] = \sigma_Y^2$$

    Note that, if $\rho = 0$, then X and Y are independent. (Can you show it?) For this particular distribution, X and Y are independent if and only if they are uncorrelated.

    This result does not generalise to other distributions! It is not true even for normal distributions; X and Y can each have a marginal normal distribution and a correlation of zero, but not be independent. (Can you construct an example?)
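One possible construction (an assumption of this sketch, not necessarily the one intended in the lecture) is to draw X from a standard normal and set Y equal to X with its sign flipped by an independent fair coin. By symmetry Y is also standard normal and Corr[X, Y] = 0, yet |Y| = |X| always, so the two cannot be independent. A quick simulation:

```python
import math
import random

random.seed(0)
N_DRAWS = 100_000

xs = [random.gauss(0.0, 1.0) for _ in range(N_DRAWS)]
# Flip the sign of each X with probability 1/2, independently of X.
ys = [x if random.random() < 0.5 else -x for x in xs]


def correlation(a, b):
    """Sample correlation of two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)
    var_a = sum((u - mean_a) ** 2 for u in a) / (n - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (n - 1)
    return cov / math.sqrt(var_a * var_b)


corr_xy = correlation(xs, ys)
corr_squares = correlation([x * x for x in xs], [y * y for y in ys])

print("Corr[X, Y]     is approximately", round(corr_xy, 3))
print("Corr[X^2, Y^2] is approximately", round(corr_squares, 3))
```

The first correlation should be close to zero, while the second is essentially one; if X and Y were independent, functions of them such as X² and Y² would be uncorrelated as well.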

    [Figure: Two Standard Gaussian Distributions, Zero Correlation]

    [Figure: Two Standard Gaussian Distributions, Correlation of +0.5]

    [Figure: Two Standard Gaussian Distributions, Correlation of −0.5]

    The following properties of variance follow from the definition. (Can you derive them?) Let X and Y be random variables, and let a, b, and c be constants. Then:

    $$\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] + 2\, \mathrm{Cov}[X, Y]$$

    $$\mathrm{Var}[aX] = a^2\, \mathrm{Var}[X]$$

    $$\mathrm{Var}[a + bX + cY] = b^2\, \mathrm{Var}[X] + c^2\, \mathrm{Var}[Y] + 2bc\, \mathrm{Cov}[X, Y]$$

    The first two are special cases of the third.

    More generally, if $X_1, \ldots, X_N$ are random variables and $a_0, \ldots, a_N$ are constants:

    $$\mathrm{Var}\left[a_0 + \sum_{i=1}^{N} a_i X_i\right] = \sum_{i=1}^{N} a_i^2\, \mathrm{Var}[X_i] + 2 \sum_{i=1}^{N} \sum_{j=i+1}^{N} a_i a_j\, \mathrm{Cov}[X_i, X_j]$$

    The presence of the covariance terms has very profound implications for portfolio choice. What is the above result if the $X_1, \ldots, X_N$ are all uncorrelated with each other?
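To make the role of the covariance terms concrete, here is a minimal sketch that evaluates the formula above for a hypothetical two-asset portfolio. The weights, the 20% standard deviation, and the function names are assumptions made for illustration only.

```python
def linear_combination_variance(weights, cov):
    """Variance of sum_i a_i X_i given weights a_i and covariance matrix cov[i][j]."""
    n = len(weights)
    own_terms = sum(weights[i] ** 2 * cov[i][i] for i in range(n))
    cross_terms = 2 * sum(
        weights[i] * weights[j] * cov[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    return own_terms + cross_terms


def two_asset_cov(sd, rho):
    """Covariance matrix of two assets with equal standard deviation sd and correlation rho."""
    return [[sd * sd, rho * sd * sd], [rho * sd * sd, sd * sd]]


# Hypothetical portfolio: half in each of two assets, each with a 20% standard deviation.
weights = [0.5, 0.5]
for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
    variance = linear_combination_variance(weights, two_asset_cov(0.20, rho))
    print(f"rho = {rho:+.1f}: portfolio SD = {variance ** 0.5:.4f}")
```

When the assets are uncorrelated the cross terms vanish and only the weighted variances remain; with perfect positive correlation there is no reduction in risk at all.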

    At this point, it may be useful to specify some properties of covariances. Let X, Y, U, and V be random variables, and let a, b, c, d, f, and g be constants. Then:

    $$\mathrm{Cov}[a + bX + cY,\ d + fU + gV] = bf\, \mathrm{Cov}[X, U] + bg\, \mathrm{Cov}[X, V] + cf\, \mathrm{Cov}[Y, U] + cg\, \mathrm{Cov}[Y, V]$$

    For both variances and covariances, adding a constant to the arguments has no effect.

    The previous result may also provide some insight into why constants that appear multiplicatively inside a variance must be squared when they are taken outside:

    $$\mathrm{Var}[bX] = \mathrm{Cov}[bX, bX] = b^2\, \mathrm{Cov}[X, X] = b^2\, \mathrm{Var}[X]$$

    We will state and use a number of statistical results in this section and the next without proof; if you want to fill in the proofs, the above property of covariance will often be useful. This result generalises to arbitrary linear combinations of random variables in the obvious way.

    We can now further analyse the statistical properties of the sample mean. Specifically, we would like to find its variance. At this point, we assume the $X_1, \ldots, X_N$ are independent of each other. (Is this a reasonable assumption?)

    $$\mathrm{Var}[\bar{X}] = \mathrm{Var}\left[\frac{1}{N} \sum_{i=1}^{N} X_i\right] = \frac{1}{N^2} \sum_{i=1}^{N} \mathrm{Var}[X_i] = \frac{1}{N}\, \mathrm{Var}[X]$$

    The standard deviation of the sample mean is:

    $$\mathrm{SD}[\bar{X}] = \sqrt{\mathrm{Var}[\bar{X}]} = \frac{1}{\sqrt{N}}\, \mathrm{SD}[X]$$

    From the above results, we can reach the not very surprising conclusion that, the more observations we have, the better an estimate of the true mean $\bar{X}$ is. On average, it is right; furthermore, the more observations we have, the less likely $\bar{X}$ is to deviate widely from the true mean.
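The square-root rule can be checked with a small simulation. The sketch below is only an illustration under assumed settings: observations are independent draws from a uniform distribution on [0, 1] (whose standard deviation is √(1/12)), and the trial count and seed are arbitrary.

```python
import math
import random
import statistics

random.seed(1)

SD_UNIFORM = math.sqrt(1.0 / 12.0)  # standard deviation of a single uniform(0, 1) draw


def simulated_sd_of_mean(n_obs, n_trials=5_000):
    """Standard deviation, across many trials, of the mean of n_obs independent draws."""
    means = [
        statistics.fmean([random.random() for _ in range(n_obs)])
        for _ in range(n_trials)
    ]
    return statistics.pstdev(means)


# For each sample size: (simulated SD of the sample mean, theoretical SD[X] / sqrt(N)).
results = {n: (simulated_sd_of_mean(n), SD_UNIFORM / math.sqrt(n)) for n in (4, 16, 64)}

for n, (simulated, theoretical) in results.items():
    print(f"N = {n:2d}: simulated {simulated:.4f}, theoretical {theoretical:.4f}")
```

Quadrupling the number of observations should roughly halve the standard deviation of the sample mean.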

    Example: coin throwing.

    Recall our method of estimating the probability a coin comes up heads: throw the coin N times, count the number of heads, and divide by N. The resulting number (which is the sample mean) is an estimate of the probability of heads.

    On average, the sample mean is an accurate estimate of the true mean. But if you throw a coin 1,000 times, will it always come up heads 500 times, even if it is a fair coin? Suppose it comes up heads 550 times; is this evidence that it is a trick coin?

    Recall that heads receives a value of 1, and tails receives a value of 0. The average value is p, where p is the probability of heads.

    What is the variance of a single coin throw?

    $$E[X^2] = p(1)^2 + (1 - p)(0)^2 = p$$

    $$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = p - p^2 = p(1 - p)$$

    What is the variance of the sample average?

    $$\mathrm{Var}[\bar{X}] = \frac{1}{N}\, \mathrm{Var}[X] = \frac{p(1 - p)}{N}$$

    We don't know the value of p, so we don't know the variance of the sample mean.

    However, note that $p(1 - p)$ takes a maximum value of 1/4 at $p = 1/2$. So we know for sure that:

    $$\mathrm{Var}[\bar{X}] \le \frac{1}{4N} \qquad \mathrm{SD}[\bar{X}] \le \frac{1}{2\sqrt{N}}$$

    For $N = 1{,}000$, we have $E[\bar{X}] = 0.5$ and $\mathrm{SD}[\bar{X}] \approx 0.01581$.
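The bound can be verified numerically. This short sketch evaluates the standard deviation of the sample mean over a grid of candidate values of p (the grid spacing is an arbitrary choice) and confirms that the largest value occurs at p = 1/2.

```python
import math

N_THROWS = 1000


def sd_of_sample_mean(p, n):
    """SD of the fraction of heads in n independent throws with P(heads) = p."""
    return math.sqrt(p * (1.0 - p) / n)


grid = [k / 100 for k in range(101)]            # candidate values of p: 0.00, 0.01, ..., 1.00
worst_p = max(grid, key=lambda p: p * (1.0 - p))  # value of p with the largest variance
bound = 1.0 / (2.0 * math.sqrt(N_THROWS))       # the bound 1 / (2 sqrt(N))

print("p with the largest variance:", worst_p)
print("bound on SD of sample mean: ", round(bound, 5))
print("SD at p = 0.3 for comparison:", round(sd_of_sample_mean(0.3, N_THROWS), 5))
```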

    Suppose after 1,000 throws, we observe heads 550 times. Is the coin fair? The sample mean $\bar{X}$ is 0.55. If the coin is fair, then $p = 0.5$, and $E[\bar{X}] = 0.5$ and $\mathrm{SD}[\bar{X}] \approx 0.01581$. There are two possibilities:

    1. The coin is not fair, and comes up heads more often than tails.
    2. The coin is fair, but came up heads more often than tails just due to chance.

    Which is it?

    When data are generated by a random process, we can never know anything with absolute certainty. However, we may be able to come to a conclusion with high probability.

    We now construct a test statistic, of the form:

    $$Z = \frac{\bar{X} - \mu_0}{\sigma}$$

    where $\bar{X}$ is the sample mean (i.e., the mean estimated from the data), $\mu_0$ is the hypothesized mean (in this case, 0.5, since we are testing whether the coin is fair), and $\sigma$ is the standard deviation of the quantity being tested. Since 550 coins out of 1,000 came up heads, $\bar{X} = 0.55$, vs. the hypothesized value of $\mu_0 = 0.5$. We have calculated $\sigma = 0.01581$. So the test statistic is:

    $$Z = \frac{\bar{X} - \mu_0}{\sigma} = \frac{0.55 - 0.50}{0.01581} = 3.16$$

    Intuitively, the observed outcome (550 heads) is 3.16 standard deviations above the mean outcome, if the coin were fair. Could this have happened by chance?

    Certainly 550 heads could have happened by chance; 600 heads, 900 heads, or 999 heads, or even 1,000 heads could have happened by chance. But how likely is it? We can get some idea of how probable an outcome is, due to chance, even if the hypothesis being tested is true, using a result known as Chebyshev's inequality.

    This result states that a random variable takes values at least k standard deviations away from the mean with a probability that is at most $1/k^2$. For $k \le 1$, it tells us only that the probability is at most 1, but we knew that already, since nothing can happen with probability greater than one. But for two standard deviations, Chebyshev's inequality tells us that such outcomes can happen with probability of at most 1/4; depending on the actual distribution, the true probability might be smaller. Outcomes three standard deviations away from the mean happen with probability of at most 1/9, etc.
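To get a feel for how conservative Chebyshev's inequality can be, the sketch below compares the bound with the exact two-sided tail probability of one particular distribution, the normal. The comparison with the normal is an illustration chosen here; the inequality itself holds for any distribution with finite variance.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def chebyshev_bound(k):
    """Upper bound on P(|X - mean| >= k standard deviations), valid for any distribution."""
    return min(1.0, 1.0 / k ** 2)


def normal_two_sided_tail(k):
    """Exact P(|X - mean| >= k standard deviations) when X is normally distributed."""
    return 2.0 * (1.0 - STANDARD_NORMAL.cdf(k))


for k in (1.0, 2.0, 3.0, 3.16):
    print(f"k = {k:4.2f}: Chebyshev bound {chebyshev_bound(k):.4f}, "
          f"normal tail {normal_two_sided_tail(k):.5f}")
```

The bound is never violated, but for a normal variable the true tail probabilities are far smaller than the bound suggests.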

    In this case, the probability of getting a realised value of $\bar{X}$ that is at least $k = 3.16$ standard deviations away from the mean is at most $1/k^2 = 0.10$. So 550 heads could have occurred by chance, even if the coin is fair; but the probability that the outcome would be 50 or more coin throws away from the expected value of 500 is at most 0.10.

    Are you willing to conclude that the coin is not fair, based on this test? If not, how extreme would the outcome have to be in order to convince you that the coin is not fair?

    In fact, the actual probability of an outcome as extreme as 550 heads, assuming the coin is fair, is quite a bit smaller than 0.10. The exact distribution of the outcome is known in this case; it is called the binomial distribution. However, the binomial distribution is a bit unwieldy for large values of N, so we will resort to an approximation.

    Central Limit Theorem: when the number of observations is large, the distribution of the sample mean $\bar{X}$ is approximately normal, regardless of the distribution of X. (Requires existence of finite mean and variance.)

    If a random variable has a normal distribution, then any linear function of that random variable also has a normal distribution. (Can you prove it?) The sample mean, $\bar{X}$, has a normal distribution (approximately) by the central limit theorem. Recall the test statistic:

    $$Z = \frac{\bar{X} - \mu_0}{\sigma}$$

    The test statistic Z is a linear function of $\bar{X}$ (note the other quantities in the expression above are not random), and therefore also has approximately a normal distribution.

    What are the mean and standard deviation of the test statistic Z? (Assume the hypothesis, that $E[X] = 0.5$, is true.)

    $$E[Z] = E\left[\frac{\bar{X} - \mu_0}{\sigma}\right] = \frac{E[\bar{X}] - \mu_0}{\sigma} = \frac{\mu_0 - \mu_0}{\sigma} = 0$$

    $$\mathrm{Var}[Z] = \mathrm{Var}\left[\frac{\bar{X} - \mu_0}{\sigma}\right] = \frac{1}{\sigma^2}\, \mathrm{Var}[\bar{X} - \mu_0] = \frac{1}{\sigma^2}\, \mathrm{Var}[\bar{X}] = \frac{1}{\sigma^2}\, \sigma^2 = 1$$

    $$\mathrm{SD}[Z] = \sqrt{\mathrm{Var}[Z]} = \sqrt{1} = 1$$

    The test statistic Z thus has approximately a normal distribution, with mean of 0 and variance of 1. (This is not a coincidence; the test statistic was designed to have these properties.)

    We can now use the test statistic to determine how likely an outcome of 550 heads is, if the coin is fair.

    Basic properties of a normal distribution:

    1. The realised value is within one standard deviation of the mean with probability 0.682.
    2. The realised value is within two standard deviations of the mean with probability 0.954.
    3. The realised value is within three standard deviations of the mean with probability 0.997.

    These statistics are determined by integrating over the appropriate range of the density function for the normal distribution.

    For example, to find the second result, we can calculate:

    $$\mathrm{Prob}(\mu - 2\sigma \le X \le \mu + 2\sigma) = \int_{\mu - 2\sigma}^{\mu + 2\sigma} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\, dx$$

    The integral above cannot be found in closed form; however, it can be evaluated numerically. (A closed-form approximation that is known to be accurate to at least 15 decimal places does exist.)
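As a sketch of what evaluating the integral numerically might look like, the code below integrates the standard normal density with Simpson's rule and compares the result with the error function from Python's standard library. Simpson's rule and the number of sub-intervals are choices made for this illustration, not something prescribed by the course.

```python
import math


def normal_density(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)


def simpson(f, a, b, n=200):
    """Simpson's rule approximation to the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def prob_within(k):
    """P(|X - mu| <= k sigma) for a normal X, written in terms of the error function."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    numerical = simpson(normal_density, -k, k)
    print(f"within {k} SD: Simpson {numerical:.6f}, erf {prob_within(k):.6f}")
```

Both approaches should agree with the probabilities listed above to the precision shown.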

    Many books have tables of the value of integrals of the normal density function for different ranges, and many software packages can also calculate it. By any of these methods, we can determine that an observation at least 3.16 standard deviations from the mean occurs with probability of only 0.00159.

    In other words, if you were to throw a fair coin 1,000 times, the combined probability that you would get either

    1. 550 heads or more
    2. 450 heads or fewer

    is only 0.00159, and the probability that the number of heads will fall between 450 and 550 is 0.99841. (These probabilities are based on an approximation, that the sample mean has a normal distribution. The approximation is fairly accurate in this case.)
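Both the normal approximation and the exact binomial probability can be computed directly. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and the printed values should come out close to (though not necessarily identical in the last digit to) the figures quoted in the text.

```python
import math
from statistics import NormalDist

N_THROWS, HEADS, P_FAIR = 1000, 550, 0.5

# Normal approximation: standardise the sample mean and use the two-sided tail.
sd_mean = math.sqrt(P_FAIR * (1.0 - P_FAIR) / N_THROWS)
z = (HEADS / N_THROWS - P_FAIR) / sd_mean
p_normal = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Exact binomial: add up the probabilities of all head counts at least as far from 500.
deviation = abs(HEADS - N_THROWS * P_FAIR)
favourable = sum(
    math.comb(N_THROWS, k)
    for k in range(N_THROWS + 1)
    if abs(k - N_THROWS * P_FAIR) >= deviation
)
p_exact = favourable / 2 ** N_THROWS  # each sequence of fair throws has probability 2**-N

print(f"Z = {z:.2f}")
print(f"two-sided probability, normal approximation: {p_normal:.5f}")
print(f"two-sided probability, exact binomial:       {p_exact:.5f}")
```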

    [Figure: Coin Throw Example, 1,000,000 Trials, 1,000 Throws Each Trial]

    [Figure: Coin Throw Example, Standardised Distribution]

    Since the distribution of $\bar{X}$ is approximately normal for a large number of coin throws, the probability that the number of heads would differ from the mean value by at least 50 is approximately 0.00159. The true value (based on the exact distribution of $\bar{X}$, which in this example is binomial) is 0.00173; the assumption of normality leads to some inaccuracy, but not too much.

    So, if the coin were fair, the expected number of heads would be 500, and a realised value as far away as 550 would occur with probability of less than 0.002; the probability that the number of heads would be closer to 500 is more than 0.998.

    Does 550 heads seem very likely to occur just by chance? Are you willing to declare that the coin is not fair?

    Whether we use the approximate probability of 0.00159 (based on the normal approximation) or the exact probability of 0.00173 (based on the binomial distribution), this number has a name: it is often called the p-value. A p-value is simply the probability that, under the hypothesis being tested, data at least as extreme as what has been observed would occur just by chance. The p-value in this example is rather extreme: a result this extreme (50 or more heads away from the expected value of 500) should occur just by chance, if the coin were fair, fewer than two times out of a thousand. If the coin were fair, we have just observed quite a remarkable coincidence. It is possible the coin is fair; but it doesn't seem very likely.

    We will now try to formalise this idea.

    We have an hypothesis: the coin is fair, and the probability of heads is 0.5.

    We also have evidence: 550 heads out of 1,000 coin throws.

    There are two types of errors we can make here:

    1. Type I Error: we reject the hypothesis (that is, conclude that the coin is not fair) when it in fact is fair.
    2. Type II Error: we fail to reject the hypothesis (concluding the coin is fair) when it is in fact not fair.

    It is impossible to avoid both types of errors completely. All we can do is trade the probability of one off against the other.

    The nearly universal convention in finance and economics (which is completely arbitrary) is to set the probability of a Type I Error at 0.05.

    Hypothesis: the coin is fair (the probability of heads is 0.5).

    Evidence: 550 heads from 1,000 coin throws.

    If the hypothesis is true, the probability of getting a deviation from the mean this large is only 0.00159 (using the normal approximation; the exact p-value is 0.00173).

    Since this probability is less than 0.05, we reject the hypothesis, and conclude the coin is not fair.

    Could we have just made a Type I error?

    Yes, we could have just made a Type I error. The only way to avoid Type I errors (incorrect rejection of an hypothesis that is true) is never to reject any hypothesis. If one takes that approach, one is likely to commit quite a lot of Type II errors (failure to reject an hypothesis which is false).

    When the hypothesis is true, if we use a cut-off of 0.05 (as we did in this example), we are likely to reject the hypothesis (incorrectly) one time in every twenty. If this risk of Type I error is unacceptably large, we can lower our cut-off; for example, we could reject the hypothesis only if the p-value is less than 0.02. Then we will only commit a Type I error one time in every fifty, which is an improvement. However, this comes at a price: the probability of a Type II error goes up. We will fail to reject an hypothesis that is false more often, if we decrease our cut-off value. There is no way around this trade-off.
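The "one time in every twenty" claim can be illustrated by simulating many experiments with a genuinely fair coin and counting how often the test rejects. This is only a sketch under assumed settings (seed, number of trials, the normal approximation for the p-value); because the number of heads is a whole number, the simulated rejection rates will be close to, but not exactly, the nominal cut-offs.

```python
import math
import random
from statistics import NormalDist

random.seed(2)

N_THROWS = 1000
SD_MEAN = 0.5 / math.sqrt(N_THROWS)  # SD of the sample mean for a fair coin
STANDARD_NORMAL = NormalDist()


def p_value_fair_coin(heads):
    """Two-sided p-value (normal approximation) for the hypothesis P(heads) = 0.5."""
    z = (heads / N_THROWS - 0.5) / SD_MEAN
    return 2.0 * (1.0 - STANDARD_NORMAL.cdf(abs(z)))


def type_one_error_rate(cutoff, n_trials=20_000):
    """Fraction of fair-coin experiments in which the hypothesis is (wrongly) rejected."""
    rejections = 0
    for _ in range(n_trials):
        # Each bit of a random 1000-bit integer plays the role of one fair coin throw.
        heads = bin(random.getrandbits(N_THROWS)).count("1")
        if p_value_fair_coin(heads) < cutoff:
            rejections += 1
    return rejections / n_trials


rate_05 = type_one_error_rate(0.05)
rate_02 = type_one_error_rate(0.02)
print(f"cut-off 0.05: rejected a true hypothesis in {rate_05:.3f} of trials")
print(f"cut-off 0.02: rejected a true hypothesis in {rate_02:.3f} of trials")
```

Lowering the cut-off reduces the rate of false rejections, as described above; the cost in Type II errors would only show up in a simulation with an unfair coin.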

    One could take the approach of trying to assess how costly Type I and Type II errors are, and changing the cut-off value accordingly. For example, consider a medical test that is designed to detect the early stages of a curable disease. If our hypothesis is "the patient is healthy", then a Type I error is a "false positive": concluding that the patient is sick, when in fact the patient is healthy. A Type II error is a "false negative": failure to detect the disease, when the patient in fact has it.

    If the test is very sensitive, there will be very few false negatives (very few Type II errors), but there will also be a lot of false positives (lots of Type I errors). If the test is adjusted so that it is not so sensitive, then there will be fewer false positives, but more false negatives. So how sensitive should we make the test?

    If we conclude that the cost of a Type II error is very high (a sick patient fails to get treatment, wrongly believing s/he is healthy), whereas the Type I error is less costly (a healthy patient has some rather anxious moments, and undergoes some additional testing/treatment before it is realised that there was a false positive), then we should make the test very sensitive. If the costs are different (for example, maybe the disease is not so serious, and the treatment is expensive, painful, and largely ineffective), then we should make the test less sensitive.

    This type of analysis is used frequently in some disciplines, such as engineering. It has largely gone out of fashion in financial analysis, where arbitrary benchmarks (such as 0.05 probability of a Type I error) are commonplace.


    Basic Principles Testing Pricing Models

    Returning to the three securities mentioned earlier:

    Asset                                      X      Y      Z
    Average return (predicted)                 8%     10%    12%
    Average return (observed)                  6%     16%    14%
    Standard deviation of return (observed)    25%    40%    60%

    Recall that the observed quantities were estimated from 20 years of monthly returns data. Can we safely conclude that the securities do not conform to the predictions of the theory?

    This problem is much more difficult than the coin throwing example.

    Assume the predictions of the model are correct; then the deviations of the observed average returns from the predicted average returns are just due to the random variation of the data. We already know:

    $$E[\bar{X}] = 8\% \qquad E[\bar{Y}] = 10\% \qquad E[\bar{Z}] = 12\%$$

    But we need to know the standard deviations as well:

    $$\mathrm{SD}[\bar{X}] = {?} \qquad \mathrm{SD}[\bar{Y}] = {?} \qquad \mathrm{SD}[\bar{Z}] = {?}$$

    There were 20 years of monthly data, so $N = 240$, and $\sqrt{240} \approx 15.49$. Therefore:

    $$\mathrm{SD}[\bar{X}] = \frac{\mathrm{SD}[X]}{15.49} \qquad \mathrm{SD}[\bar{Y}] = \frac{\mathrm{SD}[Y]}{15.49} \qquad \mathrm{SD}[\bar{Z}] = \frac{\mathrm{SD}[Z]}{15.49}$$

    The problem is that we do not know the standard deviations of X, Y, and Z; we can only estimate them from the data. Estimates were included in the table, but how these were determined was not specified.

    The usual way of estimating the variance of a random variable (which can then be used to estimate the variance of the sample average) is as follows:

    $$s^2_{XX} = \frac{1}{N - 1} \sum_{i=1}^{N} (X_i - \bar{X})^2$$

    Note that, in order to calculate $s^2_{XX}$, we must first calculate $\bar{X}$. The presence of the $N - 1$ (instead of N) in the denominator may seem puzzling; this is a correction to account for the fact that the mean is not known exactly, but must be estimated with $\bar{X}$.

    The sample variance $s^2_{XX}$ is itself a random variable; what are its statistical properties?

    We have all the tools we need to find its mean and variance, although the algebra can be tedious.

    $$E[s^2_{XX}] = E\left[\frac{1}{N - 1} \sum_{i=1}^{N} (X_i - \bar{X})^2\right]$$

    $$= \frac{1}{N - 1} \sum_{i=1}^{N} \left(E[X_i^2] - 2\, E[X_i \bar{X}] + E[\bar{X}^2]\right)$$

    $$= \frac{1}{N - 1} \sum_{i=1}^{N} \left(\mathrm{Var}[X] + E[X]^2 - \frac{2}{N}\, \mathrm{Var}[X] - 2\, E[X]^2 + \frac{1}{N}\, \mathrm{Var}[X] + E[X]^2\right)$$

    $$= \mathrm{Var}[X]$$

    Can you fill in the missing steps?
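The unbiasedness result can be checked exactly for a small case by enumerating every possible sample. The distribution below (values 0, 1, 4 with probabilities 1/2, 1/3, 1/6) and the sample size of 3 are hypothetical choices made purely so that the enumeration stays small; exact fractions avoid any rounding.

```python
import math
from fractions import Fraction as F
from itertools import product

# Hypothetical discrete distribution for a single observation.
VALUES = [0, 1, 4]
PROBS = [F(1, 2), F(1, 3), F(1, 6)]

true_mean = sum(p * v for p, v in zip(PROBS, VALUES))
true_var = sum(p * (v - true_mean) ** 2 for p, v in zip(PROBS, VALUES))


def expected_variance_estimate(n, divisor):
    """Exact expectation of sum((X_i - Xbar)^2) / divisor over all samples of size n."""
    total = F(0)
    for indices in product(range(len(VALUES)), repeat=n):
        sample = [VALUES[i] for i in indices]
        prob = math.prod(PROBS[i] for i in indices)  # observations are independent
        xbar = F(sum(sample), n)
        total += prob * sum((x - xbar) ** 2 for x in sample) / divisor
    return total


N_OBS = 3
with_n_minus_1 = expected_variance_estimate(N_OBS, N_OBS - 1)
with_n = expected_variance_estimate(N_OBS, N_OBS)

print("true variance:                 ", true_var)
print("E[estimate] with divisor N - 1:", with_n_minus_1)
print("E[estimate] with divisor N:    ", with_n)
```

Dividing by N − 1 reproduces the true variance on average, while dividing by N gives an expectation that is too small by the factor (N − 1)/N.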

    The following results can also be derived, with considerable difficulty:

    $$\mathrm{Var}[s^2_{XX}] = (\mathrm{SD}[X])^4 \left(\frac{2}{N - 1} + \frac{\mathrm{Kurt}[X]}{N}\right)$$

    $$\mathrm{Cov}[\bar{X}, s^2_{XX}] = \frac{\mathrm{Skew}[X]\, (\mathrm{SD}[X])^3}{N}$$

    If X happens to have a normal distribution, then its skewness and kurtosis (here meaning excess kurtosis, i.e., kurtosis measured relative to the normal distribution) are each equal to zero, the sample mean and variance are uncorrelated with each other, and the variance of $s^2_{XX}$ has a very simple form.

    We will not prove these results, but if X has a normal distribution, then $\bar{X}$ also has a normal distribution, and $s^2_{XX}$ (suitably scaled) has a chi-square distribution.

    Returning to the example, consider security X. We have a theory that predicts its expected return is 8%, but when we estimate the mean with $\bar{X}$, it is 6%. The estimated standard deviation (we will use the notation $s_X$) is 25%.

    We would like to construct a test statistic:

    $$Z = \frac{\bar{X} - \mu_0}{\mathrm{SD}[\bar{X}]} = \sqrt{N}\, \frac{\bar{X} - \mu_0}{\mathrm{SD}[X]}$$

    If the hypothesis is correct, then the expected value of $\bar{X}$ is 8% and its standard deviation is $\mathrm{SD}[X] / \sqrt{240}$ (recall that there are 240 monthly observations). The test statistic then has a mean of zero, and a standard deviation of one. If X has a normal distribution, then Z also has a normal distribution; even if X isn't normal, then by the central limit theorem, Z is approximately normal for large N.

    [Figure: Z-statistic for Stock Return Example, 1,000,000 Trials]

    The test statistic Z is therefore ideal, except for one little problem: it is infeasible. We don't know $\mathrm{SD}[X]$, and can only estimate it. Note that this situation is different from the coin throwing example; there, under the hypothesis (that the coin is fair, and the probability of heads is 1/2), we knew the standard deviation of a coin throw. Here, we don't; the hypothesis tells us what the value of the mean ought to be, but is silent with respect to the variance and standard deviation.

    Instead, we must use the estimated standard deviation, rather than the actual, to form our test statistic:

    $$t = \sqrt{N}\, \frac{\bar{X} - \mu_0}{s_X}$$

    Because the standard deviation used in our test statistic is estimated, the distribution of the test statistic is not normal, even if X is. Under the assumption of normality for X, the test statistic t has a Student's t distribution with $N - 1$ degrees of freedom.

    The t-distribution approaches a standard normal distribution (i.e., a normal distribution with a mean of zero and a standard deviation of one) as the degrees of freedom become large. When there are many data observed, the uncertainty in the estimate of the mean remains much larger than the uncertainty in the estimate of the standard deviation, and the t statistic approaches the distribution it would have if the standard deviation were known with certainty: a standard normal. When the number of data observations is small, though, the deviation from normality can be very significant.

    [Figure: T Distribution with Various Degrees of Freedom]

    [Figure: T-statistic for Stock Return Example, 1,000,000 Trials]

    [Figure: T-statistic with Non-Gaussian Returns, 1,000,000 Trials, T = 240]

    [Figure: T-statistic with Non-Gaussian Returns, 1,000,000 Trials, T = 480]

    [Figure: T-statistic with Non-Gaussian Returns, 1,000,000 Trials, T = 960]

    The test statistic for security X is then:

    $$t = \sqrt{N}\, \frac{\bar{X} - \mu_0}{s_X} = \sqrt{240}\, \frac{6\% - 8\%}{25\%} \approx -1.24$$

    Since the number of degrees of freedom is quite large, we can simply treat the t-statistic as if it were normally distributed. A test statistic of −1.24 corresponds to a p-value of approximately 0.215; that is, if the hypothesis were true, there is still a probability of 0.215 that the sample average return of the security would differ from the hypothesized value by at least 2%.

    If we use the 0.05 cut-off for p-values, as is common practice in finance, we cannot reject the hypothesis that $E[X] = 8\%$. The risk that we are making a Type I error is too high.

    Do the other securities provide evidence against the model?
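The same calculation can be repeated for all three securities. The sketch below uses the predicted means, observed means, and observed standard deviations from the table, treats the t-statistic as approximately normal (as the text does for 239 degrees of freedom), and tests each security on its own; the dictionary layout and function names are illustrative only.

```python
import math
from statistics import NormalDist

N_OBS = 240  # 20 years of monthly returns

# (predicted mean, observed mean, observed standard deviation) for each security
SECURITIES = {
    "X": (0.08, 0.06, 0.25),
    "Y": (0.10, 0.16, 0.40),
    "Z": (0.12, 0.14, 0.60),
}


def t_statistic(predicted, observed, sd, n_obs):
    """t = sqrt(N) * (sample mean - hypothesized mean) / estimated standard deviation."""
    return math.sqrt(n_obs) * (observed - predicted) / sd


def two_sided_p_value(t):
    """Two-sided p-value, treating t as standard normal (reasonable for large N)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


results = {}
for name, (predicted, observed, sd) in SECURITIES.items():
    t = t_statistic(predicted, observed, sd, N_OBS)
    results[name] = (t, two_sided_p_value(t))
    print(f"{name}: t = {t:+.2f}, p-value = {results[name][1]:.3f}")
```

Taken one at a time at the 0.05 cut-off, only security Y would lead to a rejection; securities X and Z would not.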


    Basic Principles Multivariate Tests

    Is there anything wrong with what we are doing here?

    It doesn't make any sense to test the securities one at a time. Suppose the model we are testing is actually true; it correctly describes the expected returns of all securities. If we go out and test its predictions one security at a time, then for each test we conduct, there is a 0.05 probability (assuming 95% confidence) of a Type I error. If, for example, we test a model for Japanese stock returns, and decide to conduct a statistical test for each of the 225 stocks in the Nikkei 225 index, that is 225 chances to have a Type I error. How likely is it that at least some of the stocks will appear to violate the predictions of the model, just by chance, even though the model is true?
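A back-of-the-envelope answer is possible under a strong simplifying assumption that is made here only for illustration: that the 225 tests are independent of one another. Real stock returns are correlated, so the true figure would differ, but the sketch conveys the order of magnitude.

```python
def prob_at_least_one_false_rejection(n_tests, alpha=0.05):
    """P(at least one Type I error) across n_tests independent tests of true hypotheses."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_rejections(n_tests, alpha=0.05):
    """Expected number of Type I errors; this part does not require independence."""
    return n_tests * alpha


for n_tests in (1, 3, 10, 225):
    print(f"{n_tests:3d} tests: P(at least one false rejection) = "
          f"{prob_at_least_one_false_rejection(n_tests):.5f}, "
          f"expected number = {expected_false_rejections(n_tests):.2f}")
```

With 225 separate tests, at least one spurious rejection is a near certainty, and roughly eleven would be expected even if the model were exactly right.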

    What we really ought to do is perform a single statistical test of all the securities simultaneously. For example, we could consider a test statistic along the lines of the following:

    $$F = t_X^2 + t_Y^2 + t_Z^2 = \frac{(\bar{R}_X - \mu_{0,X})^2}{\sigma^2_{\bar{R}_X}} + \frac{(\bar{R}_Y - \mu_{0,Y})^2}{\sigma^2_{\bar{R}_Y}} + \frac{(\bar{R}_Z - \mu_{0,Z})^2}{\sigma^2_{\bar{R}_Z}}$$

    Intuitively, this statistic has some advantages: it is big when the t-statistics for the individual assets are big, it places more weight on violations of the theory's predictions for assets which have small standard deviations, etc. It also seems like it has a distribution that can be calculated: it is the sum of three squared t distributions. But are these t distributions independent?


    The test statistic just proposed doesn't work if we can't be sure that the returns of the three assets are independent (or at least uncorrelated). We can fix this defect, but first, we will need to be able to estimate covariances from historical data. The usual way of estimating the covariance between X and Y is:

    $$s^2_{XY} = \frac{1}{T-1} \sum_{t=1}^{T} (X_t - \bar{X})(Y_t - \bar{Y})$$

    This estimator is unbiased, i.e., $E[s^2_{XY}] = \mathrm{Cov}[X, Y]$. Derivation of its variance (and covariance with other statistics) is very difficult.

    The T - 1 divisor, instead of T, is often a point of confusion. T - 1 is used to make our estimate unbiased. Some just use T, but if you estimate covariance (or variance) this way, then your estimate is biased; it tends to be a little too small, on average. For large T, it doesn't matter very much.
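The two divisors can be compared directly. The sketch below is illustrative only: the function `sample_covariance`, its `unbiased` flag, and the return series are made up for this example.

```python
def sample_covariance(x, y, unbiased=True):
    """Sample covariance of two equal-length sequences.

    unbiased=True divides by T - 1; unbiased=False divides by T.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    t = len(x)
    if t < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(x) / t
    y_bar = sum(y) / t
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross / (t - 1 if unbiased else t)

# Made-up returns, for illustration only.
x = [0.02, -0.01, 0.03, 0.00, 0.01]
y = [0.01, -0.02, 0.04, 0.01, 0.00]

cov_unbiased = sample_covariance(x, y)                 # divisor T - 1
cov_biased = sample_covariance(x, y, unbiased=False)   # divisor T
print(cov_unbiased, cov_biased, cov_biased / cov_unbiased)
```

The ratio of the two estimates is always (T - 1)/T, which is 0.8 for five observations and approaches 1 as T grows.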


    Some software products are quite inconsistent about which divisor they use, T - 1 or T. For example, a spreadsheet product produced by a software company based in Redmond, Washington, USA, uses T - 1 in the VAR function, but T in the COVAR function. Therefore, even though Cov[X, X] = Var[X] by definition, this software package returns different values for VAR(A1:A10) and COVAR(A1:A10,A1:A10). When you have a piece of software do these sorts of calculations for you, make sure it is doing what you think it is doing.
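The same kind of check is easy to run in other tools. As one example outside the spreadsheet discussed above, Python's standard `statistics` module offers both conventions under different names: `variance` divides by T - 1 and `pvariance` divides by T. The data below are made up for illustration.

```python
import statistics

# Made-up observations, for illustration only.
data = [0.02, -0.01, 0.03, 0.00, 0.01]
t = len(data)

var_sample = statistics.variance(data)       # divisor T - 1
var_population = statistics.pvariance(data)  # divisor T

# Cov[X, X] computed by hand with each divisor, as a cross-check.
mean = sum(data) / t
sum_sq = sum((d - mean) ** 2 for d in data)
by_hand_t_minus_1 = sum_sq / (t - 1)
by_hand_t = sum_sq / t

print(var_sample, by_hand_t_minus_1)
print(var_population, by_hand_t)
```

Comparing a library result against a hand calculation on a tiny data set is a quick way to confirm which divisor a function uses.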

    When we need to estimate a correlation from historical data, we will do so as follows:

    $$\hat{\rho} = \frac{s^2_{XY}}{s_X s_Y}$$

    The little hat over the $\rho$ indicates that the quantity is the estimated, rather than true, correlation.
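A short sketch of the correlation estimate, using made-up data and a hypothetical helper named `sample_correlation`. Note that the divisor (T - 1 or T) cancels in the ratio, provided the same one is used for the covariance and both standard deviations.

```python
import math

def sample_correlation(x, y):
    """Estimated correlation: sample covariance over the product of sample SDs."""
    t = len(x)
    x_bar = sum(x) / t
    y_bar = sum(y) / t
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (t - 1)
    s_x = math.sqrt(sum((a - x_bar) ** 2 for a in x) / (t - 1))
    s_y = math.sqrt(sum((b - y_bar) ** 2 for b in y) / (t - 1))
    return s_xy / (s_x * s_y)

# Made-up returns, for illustration only.
x = [0.02, -0.01, 0.03, 0.00, 0.01]
y = [0.01, -0.02, 0.04, 0.01, 0.00]

rho_hat = sample_correlation(x, y)
print(round(rho_hat, 4))
```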


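    We now return to the problem of constructing a joint test statistic. For convenience, we will call the assets $X_1, \ldots, X_N$. It is convenient to arrange the means of the assets in a column vector, and the variances and covariances in a matrix:

    $$\mu = \begin{pmatrix} E[X_1] \\ \vdots \\ E[X_N] \end{pmatrix} \qquad \Sigma = \begin{pmatrix} \mathrm{Var}[X_1] & \cdots & \mathrm{Cov}[X_1, X_N] \\ \vdots & \ddots & \vdots \\ \mathrm{Cov}[X_N, X_1] & \cdots & \mathrm{Var}[X_N] \end{pmatrix}$$

    The sample equivalents are:

    $$\hat{\mu} = \begin{pmatrix} \bar{X}_1 \\ \vdots \\ \bar{X}_N \end{pmatrix} \qquad \hat{\Sigma} = \begin{pmatrix} s^2_{11} & \cdots & s^2_{1N} \\ \vdots & \ddots & \vdots \\ s^2_{N1} & \cdots & s^2_{NN} \end{pmatrix}$$

    where, through a slight abuse of previous notation, $s^2_{ij}$ is the sample covariance of $X_i$ and $X_j$.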


    We will need three linear algebra operations to construct a reasonable test statistic: matrix multiplication, matrix transposition, and matrix inversion.

    In case these operations are not familiar, we will start with multiplication of a row vector by a column vector. To perform this operation, we just multiply each element in one of the vectors by its corresponding element in the other vector, and add the products all up:

    $$\begin{pmatrix} x_1 & \cdots & x_N \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} = \sum_{i=1}^{N} x_i y_i$$

    The number of elements in the two vectors must be the same; otherwise the product is undefined.


    More generally, we can find the product of any two matrices, provided the number of columns in the first matrix is equal to the number of rows in the second matrix. The product of a K × M matrix and an M × N matrix is a K × N matrix. The element in row i and column j of the product is row i of the first matrix multiplied by column j of the second matrix:

    $$\begin{pmatrix} x_{11} & \cdots & x_{1M} \\ \vdots & \ddots & \vdots \\ x_{K1} & \cdots & x_{KM} \end{pmatrix} \begin{pmatrix} y_{11} & \cdots & y_{1N} \\ \vdots & \ddots & \vdots \\ y_{M1} & \cdots & y_{MN} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{M} x_{1i} y_{i1} & \cdots & \sum_{i=1}^{M} x_{1i} y_{iN} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{M} x_{Ki} y_{i1} & \cdots & \sum_{i=1}^{M} x_{Ki} y_{iN} \end{pmatrix}$$

    The inner dimensions of the two matrices must match, or the product is undefined.

    Many of the rules of ordinary multiplication do not apply to matrix multiplication; for example, matrix multiplication is not commutative.

    A numeric example of matrix multiplication:

    $$\begin{pmatrix} 3 & 5 & -2 \\ 4 & 1 & 0 \end{pmatrix} \begin{pmatrix} 6 & 1 \\ -8 & 4 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} -26 & 21 \\ 16 & 8 \end{pmatrix}$$

    Given the large number of operations involved, it is not a bad idea to have a computer available before multiplying even relatively modestly sized matrices together. For example, to multiply a 5 × 8 matrix by an 8 × 3 matrix requires 120 multiplications and 105 additions.
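The numeric example and the operation count can both be checked with a small routine. This is a plain-Python sketch with matrices stored as lists of rows; `mat_mul` is a hypothetical helper written for this illustration (in practice one would normally use a linear algebra library).

```python
def mat_mul(a, b):
    """Product of a (K x M) and b (M x N), each stored as a list of rows."""
    k, m, n = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions must match")
    return [[sum(a[i][p] * b[p][j] for p in range(m)) for j in range(n)]
            for i in range(k)]

a = [[3, 5, -2],
     [4, 1, 0]]
b = [[6, 1],
     [-8, 4],
     [2, 1]]

product = mat_mul(a, b)   # 2 x 2 result
reverse = mat_mul(b, a)   # 3 x 3 result: the order of the factors matters
print(product)
print(reverse)

# Operation count for a (K x M) times (M x N) product.
k, m, n = 5, 8, 3
multiplications = k * n * m      # M multiplications per output element
additions = k * n * (m - 1)      # M - 1 additions per output element
print(multiplications, additions)
```

Here the two products do not even have the same dimensions, which is one concrete way to see that matrix multiplication is not commutative.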


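    Transpose is a very simple operation, usually denoted by either a T or a prime superscript, i.e., $C^T$ or $C'$. The matrix is flipped around, so that the rows become columns and the columns become rows:

    $$\begin{pmatrix} x_{11} & \cdots & x_{1N} \\ \vdots & \ddots & \vdots \\ x_{M1} & \cdots & x_{MN} \end{pmatrix}^T = \begin{pmatrix} x_{11} & \cdots & x_{M1} \\ \vdots & \ddots & \vdots \\ x_{1N} & \cdots & x_{MN} \end{pmatrix}$$

    A numeric example:

    $$\begin{pmatrix} 1 & 3 & -2 \\ -8 & 0 & 4 \end{pmatrix}^T = \begin{pmatrix} 1 & -8 \\ 3 & 0 \\ -2 & 4 \end{pmatrix}$$

    It doesn't get much easier than matrix transposition.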
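In code, transposition is a one-liner. The sketch below stores a matrix as a list of rows; the name `transpose` is an illustrative helper, not a reference to any particular library.

```python
def transpose(matrix):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]

c = [[1, 3, -2],
     [-8, 0, 4]]

c_t = transpose(c)
print(c_t)
```

Transposing twice returns the original matrix.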


    Matrix operations can be used to avoid cumbersome algebraic expressions involving large numbers of assets. For example, consider N assets, with returns $R_1, \ldots, R_N$, and a portfolio with share $a_1$ invested in the first asset, $a_2$ invested in the second asset, and so on, up to $a_N$ invested in asset N. (The weights $a_i$ should add up to one.) What is the variance of the return of this portfolio?

    $$\mathrm{Var}[a_1 R_1 + \ldots + a_N R_N] = \sum_{i=1}^{N} \sum_{j=1}^{N} a_i a_j \mathrm{Cov}[R_i, R_j]$$

    Arranging the $a_1, \ldots, a_N$ in a column vector $a$, the returns $R_1, \ldots, R_N$ in a column vector $R$, and the variances and covariances of returns in a matrix $\Sigma$, we can express the above as:

    $$\mathrm{Var}[a^T R] = a^T \Sigma a$$

    (Try it!) This expression is valid for any number of assets.

    Numeric example: suppose the returns of three assets have the covariance matrix:

    $$\Sigma = \begin{pmatrix} 0.040 & 0.012 & 0.020 \\ 0.012 & 0.090 & 0.036 \\ 0.020 & 0.036 & 0.160 \end{pmatrix}$$

    What is the variance of the return of a portfolio that is 0.2 invested in the first asset, 0.6 in the second asset, and 0.1 invested in the third asset?

    $$\begin{pmatrix} 0.2 \\ 0.6 \\ 0.1 \end{pmatrix}^T \begin{pmatrix} 0.040 & 0.012 & 0.020 \\ 0.012 & 0.090 & 0.036 \\ 0.020 & 0.036 & 0.160 \end{pmatrix} \begin{pmatrix} 0.2 \\ 0.6 \\ 0.1 \end{pmatrix} = 0.0436$$
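The quadratic form can be evaluated as the double sum over assets given earlier. The following sketch uses the weights and covariance matrix from the example; `portfolio_variance` is a hypothetical helper name. Note that the weights as given sum to 0.9 rather than one; the formula applies to any weight vector.

```python
def portfolio_variance(weights, cov):
    """Compute a' * Sigma * a as a double sum over assets."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))

cov = [[0.040, 0.012, 0.020],
       [0.012, 0.090, 0.036],
       [0.020, 0.036, 0.160]]
weights = [0.2, 0.6, 0.1]

variance = portfolio_variance(weights, cov)
print(round(variance, 4))
```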


    Matrix inversion, usually denoted by a -1 superscript, as in $C^{-1}$, is a rather difficult operation. The inverse of a matrix satisfies the condition:

    $$C \, C^{-1} = C^{-1} \, C = I$$

    where I is the identity matrix, which has 1 for each element on the diagonal, and 0 everywhere else:

    $$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

    If a matrix is not square (i.e., same number of rows and columns), it does not have an inverse.

    Square matrices may or may not have inverses, although covariance matrices usually do. Specifically, every matrix that is the covariance matrix of some set of random variables R is automatically positive semidefinite:

    $$\mathrm{Var}[a^T R] = a^T \Sigma a \geq 0 \quad \forall a$$

    Such a matrix is also positive definite if it satisfies the stronger condition:

    $$\mathrm{Var}[a^T R] = a^T \Sigma a > 0 \quad \forall a \neq 0$$

    A covariance matrix has an inverse if and only if it is positive definite. That is, if the only portfolio of assets that is risk-free (i.e., has variance of zero) is the portfolio with weight zero on every asset, then the covariance matrix of the asset returns is positive definite.
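A small made-up example shows what failure of positive definiteness looks like. Assume two perfectly correlated assets with standard deviations 0.2 and 0.3 (hypothetical numbers, not from the slides). A non-zero combination of the two then has zero variance, and the covariance matrix has determinant zero, so it has no inverse.

```python
def quad_form(a, m):
    """Compute a' * m * a for a vector a and a square matrix m."""
    n = len(a)
    return sum(a[i] * a[j] * m[i][j] for i in range(n) for j in range(n))

# Correlation 1, so Cov = 0.2 * 0.3 = 0.06 (hypothetical numbers).
cov = [[0.04, 0.06],
       [0.06, 0.09]]

# Long 0.3 of the first asset, short 0.2 of the second: the risks offset.
a = [0.3, -0.2]
zero_variance = quad_form(a, cov)

# A 2 x 2 matrix is invertible only if its determinant is non-zero.
determinant = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]

print(zero_variance, determinant)
```

Both printed values are zero up to floating-point rounding: the matrix is positive semidefinite but not positive definite.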


    Numeric examples: matrix inversion is actually rather easy for diagonal matrices, i.e., those in which the off-diagonal elements are all zero:

    $$\begin{pmatrix} 5 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 0.2 & 0.0 & 0.0 \\ 0.0 & 0.5 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{pmatrix}$$

    Note that the inverse is also diagonal, and the elements are just the reciprocals of the elements in the original matrix.

    Things are a bit more complicated in general:

    $$\begin{pmatrix} 3 & 6 & 1 \\ 4 & 7 & -2 \\ 6 & 13 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 1.6250 & 0.8125 & -1.1875 \\ -0.7500 & -0.3750 & 0.6250 \\ 0.6250 & -0.1875 & -0.1875 \end{pmatrix}$$

    (Try verifying the inverses.)
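One way to verify an inverse is to multiply it by the original matrix and compare the result with the identity. The sketch below does this for both examples; `mat_mul` and `max_deviation_from_identity` are illustrative helper names.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    m = len(b)
    return [[sum(row[p] * b[p][j] for p in range(m)) for j in range(len(b[0]))]
            for row in a]

def max_deviation_from_identity(m):
    """Largest absolute difference between m and the identity matrix."""
    n = len(m)
    return max(abs(m[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

diag = [[5, 0, 0], [0, 2, 0], [0, 0, 1]]
diag_inv = [[0.2, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]

general = [[3, 6, 1], [4, 7, -2], [6, 13, 0]]
general_inv = [[1.6250, 0.8125, -1.1875],
               [-0.7500, -0.3750, 0.6250],
               [0.6250, -0.1875, -0.1875]]

err_diag = max_deviation_from_identity(mat_mul(diag, diag_inv))
err_general = max_deviation_from_identity(mat_mul(general, general_inv))
err_general_reversed = max_deviation_from_identity(mat_mul(general_inv, general))
print(err_diag, err_general, err_general_reversed)
```

All three deviations should be zero up to rounding, confirming that the product is the identity in either order.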


    Recall the example of the three securities, which were used to test a model of expected returns. We have no information on the covariances between the three asset returns; suppose these are all estimated at exactly zero (not very likely, but assume so for purposes of the discussion). We can arrange the sample mean returns in a vector, and the hypothesized mean returns in another vector:

    $$\hat{\mu} = \begin{pmatrix} 6\% \\ 16\% \\ 14\% \end{pmatrix} \qquad \mu_0 = \begin{pmatrix} 8\% \\ 10\% \\ 12\% \end{pmatrix}$$

    The estimated variances and covariances can be arranged in a matrix:

    $$\hat{\Sigma} = \begin{pmatrix} 0.0625 & 0 & 0 \\ 0 & 0.16 & 0 \\ 0 & 0 & 0.36 \end{pmatrix}$$


    The proposed joint test statistic can be expressed as:

    $$F = (\hat{\mu} - \mu_0)^T \hat{\Sigma}^{-1} (\hat{\mu} - \mu_0)$$

    At an intuitive level, this test statistic has some good properties. When any of the assets have an estimated expected return that is far from the hypothesized value, this tends to make the test statistic large. Furthermore, it gives more weight to assets whose mean is estimated more accurately. If an asset has a small (estimated) variance of return, when $\hat{\Sigma}$ is inverted, the corresponding element is large, giving more weight to the deviation of this asset's average return from the hypothesized value. Assets with large variance of return require larger differences between the observed and hypothesized returns to have the same effect on the test statistic.


    This test statistic works just as well when the asset returns are correlated; the only modification we will make is to add a scaling factor:

    $$F = \frac{T(T-N)}{N(T-1)} (\hat{\mu} - \mu_0)^T \hat{\Sigma}^{-1} (\hat{\mu} - \mu_0)$$

    where T is (as before) the number of observations, and N is the number of assets. Under an assumption of normality (the asset returns have the multivariate normal distribution), this test statistic has an F distribution.

    An F distribution has two degrees of freedom parameters; the first is N, and the second is T - N. This is sometimes written $F_{N,T-N}$. Tables of the F distribution are widely available in statistics books and other references; many software packages can calculate them.
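As a sketch of the scaled statistic, consider the earlier case in which all covariances are estimated at exactly zero. The inverse of a diagonal matrix is just the reciprocals of its diagonal, so no general matrix inversion is needed. The variable names are illustrative; T = 240 is the sample size from the single-asset example.

```python
T = 240   # number of observations
N = 3     # number of assets

mu_hat = [0.06, 0.16, 0.14]        # sample mean returns
mu_0 = [0.08, 0.10, 0.12]          # hypothesized mean returns
variances = [0.0625, 0.16, 0.36]   # diagonal of the estimated covariance matrix

# Diagonal case: the quadratic form is a sum of squared deviations,
# each divided by the corresponding variance.
quad = sum((m - m0) ** 2 / v for m, m0, v in zip(mu_hat, mu_0, variances))

f_stat = T * (T - N) / (N * (T - 1)) * quad
print(round(f_stat, 3))
```

In this special case the statistic is the sum of the three squared t-statistics multiplied by (T - N)/(N(T - 1)), which ties it back to the sum-of-squares statistic proposed earlier.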


    [Figure: F-statistic for Stock Return Example, 1,000,000 Trials]

    When T is very large, the assumption of multivariate normality is not particularly important. Recall that, for our application, the first degrees of freedom parameter is N, and the second is T - N. The $F_{d_1,d_2}$ distribution approaches a chi-square distribution with $d_1$ degrees of freedom (divided by $d_1$) as $d_2$ approaches $+\infty$; since $d_2 = T - N$ approaches $+\infty$ as T becomes very large, this is the limiting distribution of the test statistic for very large T. However, the test statistic approaches this distribution, for very large T, even if the data are not multivariate normally distributed.


    [Figure: Chi-square Distribution with Various Degrees of Freedom]

    [Figure: F Distribution and Limiting Chi-square Distribution]

    A test procedure is therefore:

    1. Estimate the sample means, sample variances, and sample covariances of the asset returns from historical data.
    2. Arrange the sample means into a vector, and the sample variances and covariances into a matrix.
    3. Also arrange the hypothesized values of the mean returns into a vector.
    4. Calculate the test statistic F.
    5. Determine the p-value of this statistic, using tables from a book, software, or some other source.
    6. If the p-value is small enough (e.g., smaller than 0.05 for a 95% confidence test), then reject the hypothesis that the model is correct.
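Steps 1 to 4 of the procedure above can be sketched in plain Python. Everything here is illustrative: the helper names `solve_linear_system` and `joint_f_statistic` are hypothetical, the data are simulated (independent normal returns drawn so that the hypothesis is true), and a real application would normally rely on a numerical library. Rather than inverting the covariance matrix explicitly, the sketch solves a linear system, which gives the same quadratic form.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]   # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular (or nearly so)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):                # back substitution
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def joint_f_statistic(returns, mu_0):
    """returns: list of T observations, each a list of N asset returns."""
    t, n = len(returns), len(mu_0)
    # Steps 1-2: sample means and covariance matrix (divisor T - 1).
    mu_hat = [sum(obs[i] for obs in returns) / t for i in range(n)]
    sigma_hat = [[sum((obs[i] - mu_hat[i]) * (obs[j] - mu_hat[j])
                      for obs in returns) / (t - 1)
                  for j in range(n)] for i in range(n)]
    # Steps 3-4: deviations from the hypothesized means, then scaled quadratic form.
    d = [mu_hat[i] - mu_0[i] for i in range(n)]
    w = solve_linear_system(sigma_hat, d)           # w = sigma_hat^{-1} d
    quad = sum(d[i] * w[i] for i in range(n))
    return t * (t - n) / (n * (t - 1)) * quad

random.seed(0)
mu_0 = [0.08, 0.10, 0.12]
sds = [0.25, 0.40, 0.60]
returns = [[random.gauss(m, s) for m, s in zip(mu_0, sds)] for _ in range(240)]

f_stat = joint_f_statistic(returns, mu_0)
print(f_stat)
```

Steps 5 and 6 then compare `f_stat` with the F distribution with N and T - N degrees of freedom, using a table or statistical software.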


    Numeric example: suppose the (estimated) covariance matrix for the three assets is:

    $$\hat{\Sigma} = \begin{pmatrix} 0.0625 & -0.0200 & 0.0300 \\ -0.0200 & 0.1600 & 0.0240 \\ 0.0300 & 0.0240 & 0.3600 \end{pmatrix}$$

    (Are these numbers consistent with the standard deviations reported earlier?)

    Can we reject, with 95% confidence, the predictions of the model?


    The test statistic is:

    $$F = \frac{240\,(240-3)}{3\,(240-1)} \begin{pmatrix} 6\% - 8\% \\ 16\% - 10\% \\ 14\% - 12\% \end{pmatrix}^T \begin{pmatrix} 0.0625 & -0.0200 & 0.0300 \\ -0.0200 & 0.1600 & 0.0240 \\ 0.0300 & 0.0240 & 0.3600 \end{pmatrix}^{-1} \begin{pmatrix} 6\% - 8\% \\ 16\% - 10\% \\ 14\% - 12\% \end{pmatrix} \approx 2.066$$

    This distribution has 3 and 237 degrees of freedom. Many tables for the F distribution do not actually show p-values for different values of the F statistic, but rather a single cut-off value for tests of different confidence levels. From a table for 95% confidence tests, we find that the cut-off value for an F distribution with 3 and 120 degrees of freedom is 2.6802, and for 3 and infinitely many degrees of freedom, it is 2.6049. For 3 and 237 degrees of freedom, it must be somewhere in between.

    If the F-statistic is above the cut-off value of approximately 2.6, then the p-value is below 0.05, and we can reject the hypothesis (correctness of the model) with 95% confidence. If the F-statistic is below the cut-off value of approximately 2.6, then the p-value is above 0.05, and we cannot reject the hypothesis. (Recall that this does not mean the hypothesis is true; it means we have not found sufficient evidence to conclude that the hypothesis is false.)

    The F-statistic is 2.066, which is well below the cut-off value, so we cannot reject the hypothesis with 95% confidence. (We cannot reject it with 90% confidence either; the p-value is 0.1053.)

    So despite the fact that a t-test rejects the hypothesis for one of the assets individually, a joint test based on an F-statistic fails to reject the hypothesis. We have not seen enough evidence to convince us, with 95% confidence, that the model is false.
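The value of the statistic can be reproduced with a short plain-Python sketch. The helper `solve_linear_system` is a hypothetical routine written for this illustration; it solves a linear system instead of forming the inverse explicitly. The two cut-offs are the table values quoted above for 120 and infinitely many denominator degrees of freedom.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

T, N = 240, 3
mu_hat = [0.06, 0.16, 0.14]
mu_0 = [0.08, 0.10, 0.12]
sigma_hat = [[0.0625, -0.0200, 0.0300],
             [-0.0200, 0.1600, 0.0240],
             [0.0300, 0.0240, 0.3600]]

d = [m - m0 for m, m0 in zip(mu_hat, mu_0)]
w = solve_linear_system(sigma_hat, d)            # w = sigma_hat^{-1} d
quad = sum(di * wi for di, wi in zip(d, w))
f_stat = T * (T - N) / (N * (T - 1)) * quad

# 95% cut-offs quoted in the text; the F(3, 237) cut-off lies between them.
cutoff_low, cutoff_high = 2.6049, 2.6802
cannot_reject = f_stat < cutoff_low
print(round(f_stat, 3), cannot_reject)
```

Because the statistic falls below even the smaller of the two cut-offs, the conclusion does not depend on interpolating the table. If SciPy is available, `scipy.stats.f.sf(f_stat, 3, 237)` should return the p-value directly.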


    It is worthwhile in a discussion of hypothesis testing to warn against the dangers of data mining.

    In some disciplines, data mining is considered a good thing; one can even take a course to learn how to do it. In finance and economics, if someone tells you that you are data mining, that person is not paying you a compliment.

    What is data mining? Recall that, even if a hypothesis is true, there is a certain probability of committing a Type I error (rejecting the hypothesis even when it is true). For example, suppose you believe that the level of the high tide has an effect on stock market returns. The reality is that your theory is wrong, and the tides have no effect on the stock market; however, you don't know this.


    So, you gather some data on the tides and the stock market, and perform a statistical test of your hypothesis. Following common practice, you reject the hypothesis "the tides have no effect on the stock market" if the p-value of your statistical test is 0.05 or less. There is then a one in twenty chance that you will reject the hypothesis, and conclude that the tides do have an effect on the stock market (even though they don't).

    Data mining refers to the practice of performing statistical test after statistical test, until finding one that rejects, and then reporting only the last test. This is a recipe for finding spurious results: chances are good that the result you report will be a Type I error, rather than a legitimate result.
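A small simulation illustrates how quickly this goes wrong. The sketch assumes (as a simplification not made in the slides) that the tests are independent and exact, so that each p-value is uniformly distributed between 0 and 1 when the hypothesis is true. Each simulated researcher tries up to 20 tests and reports a "finding" if any of them has a p-value below 0.05. The names and the choice of 20 tests are illustrative.

```python
import random

def mined_result_found(num_tests, alpha, rng):
    """True if any of num_tests tests of a TRUE hypothesis rejects.

    Sketch assumption: independent, exact tests, so each p-value is
    uniform on (0, 1) when the hypothesis is true.
    """
    return any(rng.random() < alpha for _ in range(num_tests))

rng = random.Random(12345)
num_researchers = 100_000
num_tests = 20     # tests tried before giving up
alpha = 0.05

found = sum(mined_result_found(num_tests, alpha, rng)
            for _ in range(num_researchers))
simulated = found / num_researchers
theoretical = 1 - (1 - alpha) ** num_tests

print(f"simulated: {simulated:.3f}, theoretical: {theoretical:.3f}")
```

Under these assumptions, roughly two out of three researchers end up with a reportable result even though there is nothing to find.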


    The pressure to find results is enormous, both in academic and industry circles. Failure to find a result may mean no publication in academics, and no clients in industry. The incentives to engage in data mining are huge, and many engage in it, either fully aware of what they are doing, or having successfully deluded themselves into believing that what they are doing is legitimate.

    A rule of thumb is the following: if you can't think of a reasonable economic story for the statistical result you have found, that should be a warning sign that the result is the product of data mining.


    Testing the CAPM


    Testing the CAPM Conditional Probabilities

    We need to look at the relation between the returns of multiple securities; the notion of conditional probabilities is absolutely central to the analysis.

    The probability of an event very likely depends on how much information one has. For example, it is much easier to forecast the value of a stock (or the weather, or an election) one day in advance than it is three years in advance. The reason is, over the past three years, a great deal has happened that affects the value of the stock (or the weather, or the outcome of the election). However, if you are making your forecast one day in advance, then you know almost everything that will affect the variable you are forecasting during the last three years; the only information you are missing pertains to the one remaining day. If you are making your forecast three years in advance, you are doing so with much less information.


    Probabilities therefore depend on an information set; people with different information have different probabilities for the same event. In some contexts, the idea of the information set is left implicit; however, we will sometimes need to make it explicit.

    We will often deal with the situation of two distinct information sets, with one being a strict subset of the other. Probabilities based on the more informative information set are then called conditional probabilities, and those based on the less informative information set are called unconditional or marginal probabili