
The Classical Linear Model and OLS Estimation

Walter Sosa-Escudero

Econ 507. Econometric Analysis. Spring 2009

January 19, 2009

The Classical Linear Model


Social sciences: non-exact relationships.

Starting point: a model for the non-exact relationship between y (the explained variable) and a set of variables x (the explanatory variables).

Assumption 1 (linearity):

$$ y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + u_i, \qquad i = 1, \ldots, n $$

$y_i$: explained variable for observation $i$. Its realizations are observed.

$x_{ki}$, $k = 1, \ldots, K$: the $K$ explanatory variables, with observed realizations.

$u_i$ is a random variable with unobserved realizations. It represents the non-exact nature of the relationship.

$\beta_k$, $k = 1, \ldots, K$, are the regression coefficients.

Assumption 1: the underlying relationship is linear for all observations.

The model in matrix notation

Define the following vectors and matrices:

$$
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}_{n \times 1}
\qquad
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix}_{K \times 1}
\qquad
u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}_{n \times 1}
\qquad
X = \begin{pmatrix} x_{11} & x_{21} & \cdots & x_{K1} \\ x_{12} & x_{22} & \cdots & x_{K2} \\ \vdots & & \ddots & \vdots \\ x_{1n} & x_{2n} & \cdots & x_{Kn} \end{pmatrix}_{n \times K}
$$

Then the linear model can be written as:

$$
\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix} x_{11} & x_{21} & \cdots & x_{K1} \\ \vdots & & & \vdots \\ x_{1n} & x_{2n} & \cdots & x_{Kn} \end{pmatrix}
\begin{pmatrix} \beta_1 \\ \vdots \\ \beta_K \end{pmatrix}
+
\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}
$$

$$ Y = X\beta + u $$

This is the linear model in matrix form.
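As a concrete illustration (a minimal sketch with made-up dimensions and simulated data, not part of the original slides), the matrix form can be set up numerically:

    import numpy as np

    rng = np.random.default_rng(0)
    n, K = 100, 3                      # sample size and number of regressors (illustrative)

    X = np.column_stack([np.ones(n),   # intercept column x_1 = 1
                         rng.normal(size=n),
                         rng.normal(size=n)])
    beta = np.array([1.0, 2.0, -0.5])  # hypothetical true coefficients
    u = rng.normal(size=n)             # error term

    Y = X @ beta + u                   # Y = X beta + u
    print(Y.shape, X.shape)            # (100,) (100, 3)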


Basic Results on Matrices and Random Vectors

Before we proceed, we need to establish some results involving matrices and vectors.

Let $A$ be an $m \times n$ matrix, viewed as $n$ column vectors or $m$ row vectors. The column rank of $A$ is defined as the maximum number of linearly independent columns; similarly, the row rank is the maximum number of linearly independent rows.

The row rank is equal to the column rank, so we will talk, in general, about the rank of a matrix $A$, denoted $\rho(A)$.

Let $A$ be a square ($m \times m$) matrix. $A$ is non-singular if $|A| \neq 0$. In that case there exists a unique non-singular matrix $A^{-1}$, called the inverse of $A$, such that $AA^{-1} = A^{-1}A = I_m$.

Let $A$ be a square $m \times m$ matrix.

If $\rho(A) = m$, then $|A| \neq 0$. If $\rho(A) < m$, then $|A| = 0$.

Let $X$ be an $n \times K$ matrix with $\rho(X) = K$ (full column rank). Then

$$ \rho(X) = \rho(X'X) = K $$

This result guarantees the existence of $(X'X)^{-1}$ based on the rank of $X$.
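A quick numerical check of this result (a sketch with simulated data; the dimensions are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    n, K = 50, 4
    X = rng.normal(size=(n, K))             # full column rank with probability one

    print(np.linalg.matrix_rank(X))         # 4  -> rho(X) = K
    print(np.linalg.matrix_rank(X.T @ X))   # 4  -> rho(X'X) = K
    XtX_inv = np.linalg.inv(X.T @ X)        # exists because X'X is non-singular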


Let $b$ and $a$ be two $K \times 1$ vectors. Then

$$ \frac{\partial (b'a)}{\partial b} = a $$

Let $b$ be a $K \times 1$ vector and $A$ a symmetric $K \times K$ matrix. Then

$$ \frac{\partial (b'Ab)}{\partial b} = 2Ab $$
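These two rules can be verified numerically with finite differences (a small sketch, not part of the slides; the symmetric matrix A and the vectors a, b below are arbitrary):

    import numpy as np

    rng = np.random.default_rng(2)
    K = 3
    A = rng.normal(size=(K, K)); A = (A + A.T) / 2   # make A symmetric
    a = rng.normal(size=K)
    b = rng.normal(size=K)
    eps = 1e-6

    def grad(f, b):
        # forward-difference approximation of the gradient of f at b
        g = np.zeros_like(b)
        for j in range(len(b)):
            e = np.zeros_like(b); e[j] = eps
            g[j] = (f(b + e) - f(b)) / eps
        return g

    print(grad(lambda v: v @ a, b), a)              # d(b'a)/db  = a
    print(grad(lambda v: v @ A @ v, b), 2 * A @ b)  # d(b'Ab)/db = 2Ab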


Let $Y$ be a vector of $K$ random variables:

$$
Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_K \end{pmatrix}
\qquad
E(Y) = \mu = \begin{pmatrix} E(Y_1) \\ E(Y_2) \\ \vdots \\ E(Y_K) \end{pmatrix}
$$

$$
V(Y) = E[(Y - \mu)(Y - \mu)'] =
\begin{pmatrix}
E(Y_1 - \mu_1)^2 & E(Y_1 - \mu_1)(Y_2 - \mu_2) & \cdots \\
 & E(Y_2 - \mu_2)^2 & \cdots \\
 & & \ddots \\
 & & & E(Y_K - \mu_K)^2
\end{pmatrix}
=
\begin{pmatrix}
V(Y_1) & \mathrm{Cov}(Y_1, Y_2) & \cdots & \mathrm{Cov}(Y_1, Y_K) \\
 & V(Y_2) & & \\
 & & \ddots & \\
 & & & V(Y_K)
\end{pmatrix}
$$

The variance of a vector is called its variance-covariance matrix, a $K \times K$ matrix.

If $V(Y) = \Sigma$ and $c$ is a $K \times 1$ vector, then $V(c'Y) = c'V(Y)c = c'\Sigma c$.
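The rule $V(c'Y) = c'\Sigma c$ can be illustrated with a short simulation (a sketch; the covariance matrix and the weight vector below are made up):

    import numpy as np

    rng = np.random.default_rng(3)
    Sigma = np.array([[2.0, 0.5, 0.0],
                      [0.5, 1.0, 0.3],
                      [0.0, 0.3, 1.5]])       # hypothetical variance-covariance matrix
    c = np.array([1.0, -2.0, 0.5])            # weights

    Y = rng.multivariate_normal(mean=np.zeros(3), cov=Sigma, size=200_000)
    print(np.var(Y @ c))                      # simulated variance of c'Y
    print(c @ Sigma @ c)                      # theoretical value c' Sigma c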


Conditional Expectations

$$ E(Y \mid X = x) = \int y \, f_{Y|X}(y \mid x) \, dy $$

Idea: how the expected value of Y changes when X changes. It is a function that depends on X. If X is a random variable, then $E(Y \mid X)$ is a random variable.

Properties

$E(g(X) \mid X) = g(X)$.

If $Y = a + bX + U$, then $E(Y \mid X) = a + bX + E(U \mid X)$.

$E(Y) = E[E(Y \mid X)]$ (Law of Iterated Expectations).
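The Law of Iterated Expectations can be checked by simulation (a sketch with an arbitrary joint distribution; none of the numbers come from the slides):

    import numpy as np

    rng = np.random.default_rng(4)
    n = 500_000
    X = rng.normal(size=n)
    Y = 1.0 + 2.0 * X + rng.normal(size=n)   # Y = a + bX + U with E(U|X) = 0

    # E[E(Y|X)] computed from the known conditional mean 1 + 2X ...
    print(np.mean(1.0 + 2.0 * X))
    # ... matches the unconditional mean E(Y)
    print(np.mean(Y))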


Assumption 2: Strict Exogeneity

$$ E(u_i \mid X) = 0, \qquad i = 1, 2, \ldots, n $$

In basic courses it is assumed that $E(u_i) = 0$. Which one is stronger?

Implications of strict exogeneity:

$E(u_i) = 0$, $i = 1, \ldots, n$.

Proof: by the law of iterated expectations and strict exogeneity,

$$ E(u) = E[E(u \mid X)] = E(0) = 0 $$

In words: on average, the model is exactly linear.

$E(x_{jk} u_i) = 0$, $j, i = 1, \ldots, n$; $k = 1, \ldots, K$.

In words: explanatory variables are uncorrelated with the error terms of all observations.

Proof: as exercise.

Assumption 3: No Multicollinearity

$$ \rho(X) = K, \quad \text{w.p.1} $$

Rank?

All columns of the realizations of X must be linearly independent.

Careful: this prohibits exact linear relations between columns of X.

The model admits non-exact relations and/or non-linear relations.

Example: see the sketch below.
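A numerical illustration of what the assumption rules out (a sketch with simulated regressors): an exact linear relation between columns makes X'X singular, while a non-linear transformation of a column is fine.

    import numpy as np

    rng = np.random.default_rng(5)
    n = 100
    x2 = rng.normal(size=n)

    # Exact linear relation: x3 = 2*x2 violates the assumption
    X_bad = np.column_stack([np.ones(n), x2, 2 * x2])
    print(np.linalg.matrix_rank(X_bad.T @ X_bad))   # 2 < K = 3 -> (X'X)^{-1} does not exist

    # Non-linear relation: x3 = x2**2 is allowed
    X_ok = np.column_stack([np.ones(n), x2, x2 ** 2])
    print(np.linalg.matrix_rank(X_ok.T @ X_ok))     # 3 = K -> full rank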


Assumption 4: spherical error variance

Homoskedasticity: $E(u_i^2 \mid X) = \sigma^2 > 0$, $i = 1, \ldots, n$.

No serial correlation: $E(u_i u_j \mid X) = 0$, $i, j = 1, \ldots, n$, $i \neq j$.

Homoskedasticity: by strict exogeneity,

$$ V(u_i \mid X) = E[u_i^2 \mid X] - E(u_i \mid X)^2 = E(u_i^2 \mid X) $$

so the assumption implies a constant conditional variance for the error term.

No serial correlation: also by strict exogeneity,

$$ \mathrm{Cov}(u_i, u_j \mid X) = E(u_i u_j \mid X) $$

so no serial correlation implies that, given X, the error terms of all observations are uncorrelated.

Assumption 4 in matrix terms: $V(u \mid X) = E(uu' \mid X) = \sigma^2 I_n$.

Recall that for any random vector Z of n elements,

$$ V(Z) \equiv E[(Z - E(Z))(Z - E(Z))'] $$

an $n \times n$ matrix with typical element $v_{ij} = \mathrm{Cov}(Z_i, Z_j)$.

Homoskedasticity ($V(u_i \mid X) = \sigma^2$) implies that all the diagonal elements of $V(u \mid X)$ are equal to $\sigma^2$.

No serial correlation implies that all the off-diagonal elements of $V(u \mid X)$ are zero.

Summary

The Classical Linear Model

1. Linearity: $Y = X\beta + u$.
2. Strict exogeneity: $E(u \mid X) = 0$.
3. No multicollinearity: $\rho(X) = K$, w.p.1.
4. No heteroskedasticity / serial correlation: $V(u \mid X) = \sigma^2 I_n$.

Details and Interpretations

Fixed Regressors

In basic treatments X is taken as a fixed, non-random matrix. This is more compatible with experimental sciences. It simplifies some computations.

The Intercept

Consider the case $x_{1i} = 1$, $i = 1, \ldots, n$:

$$ y_i = \beta_1 + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + u_i, \qquad i = 1, \ldots, n $$

Then $\beta_1$ is the intercept of the model. Careful with interpretations.

Interpretations

$$ E(y_i \mid X) = \beta_1 + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} $$

If $E(y_i \mid X)$ is differentiable with respect to $x_{ki}$, which is functionally unrelated to all other variables, then

$$ \frac{\partial E(y_i \mid X)}{\partial x_{ki}} = \beta_k $$

Careful: this is a partial derivative, and a constant marginal effect.

Dummy explanatory variables: suppose $x_{ki}$ is a binary variable taking two values, indicating that the i-th observation belongs (1) or does not belong (0) to a certain class (male/female, for example). We cannot use the previous result for an interpretation (why?).

Compute the following magnitudes:

$$ E(y_i \mid X, x_{ki} = 1) = \beta_1 + \beta_2 x_{2i} + \cdots + \beta_k \cdot 1 + \cdots + \beta_K x_{Ki} $$
$$ E(y_i \mid X, x_{ki} = 0) = \beta_1 + \beta_2 x_{2i} + \cdots + \beta_k \cdot 0 + \cdots + \beta_K x_{Ki} $$

Then $\beta_k = E(y_i \mid X, x_{ki} = 1) - E(y_i \mid X, x_{ki} = 0)$.

Example: gender differences. See the sketch below.
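A small simulated check of this interpretation (a sketch; the coefficient values and the group labels are invented): the coefficient on the dummy recovers the gap between the two conditional means.

    import numpy as np

    rng = np.random.default_rng(6)
    n = 10_000
    x2 = rng.normal(size=n)
    d = rng.integers(0, 2, size=n)           # binary regressor, e.g. female = 1, male = 0
    y = 1.0 + 0.5 * x2 + 3.0 * d + rng.normal(size=n)

    X = np.column_stack([np.ones(n), x2, d])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(beta_hat[2])                        # close to 3.0, the gap between the two groups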


The linear model is not that linear

$$ y_i = \beta_1 + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + u_i $$

Linear?

Linear in variables.

Linear in parameters.

For estimation purposes, what matters is linearity in parameters.

A small catalog of non-linear models that can be handled with the classical linear model:

Quadratic: $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{2i}^2 + u_i$
Inverse: $Y_i = \beta_1 + \beta_2 X_{2i}^{-1} + u_i$
Interactive: $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{2i} X_{3i} + u_i$
Logarithmic: $\ln Y_i = \beta_1 + \beta_2 \ln X_{2i} + u_i$
Semilogarithmic: $\ln Y_i = \beta_1 + \beta_2 X_{2i} + u_i$

We will explore interpretations and examples in the homework; a numerical sketch follows below.
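As an illustration of why these models still fit the framework (a sketch with simulated data, not from the slides): the logarithmic model is non-linear in the original variables but linear in the parameters after transforming, so OLS on the transformed data recovers the coefficients.

    import numpy as np

    rng = np.random.default_rng(7)
    n = 5_000
    X2 = np.exp(rng.normal(size=n))                  # positive regressor
    u = rng.normal(scale=0.1, size=n)
    Y = np.exp(0.5 + 1.5 * np.log(X2) + u)           # log Y = 0.5 + 1.5 log X2 + u

    # Transform, then apply ordinary least squares to the linear-in-parameters model
    Z = np.column_stack([np.ones(n), np.log(X2)])
    beta_hat, *_ = np.linalg.lstsq(Z, np.log(Y), rcond=None)
    print(beta_hat)                                  # approximately [0.5, 1.5]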


Least Squares Estimation

Goal: recover $\beta$ based on a sample $(y_i, x_i)$, $i = 1, \ldots, n$.

Let $\hat{\beta}$ be any estimator of $\beta$.

Define $\hat{Y} \equiv X\hat{\beta}$ (our prediction of $Y$). Define $e = Y - \hat{Y}$ (estimation errors).

Note that if $n > K$ we cannot produce an estimator by forcing $e = 0$. Why?

We need a criterion to derive a sensible and feasible estimator.

Consider the following penalty function:

$$ SSR(\hat{\beta}) \equiv \sum_{i=1}^{n} e_i^2 = e'e = (Y - X\hat{\beta})'(Y - X\hat{\beta}) $$

$SSR(\hat{\beta})$ is the aggregation of squared errors if we choose $\hat{\beta}$ as an estimator.

The least squares estimator will be:

$$ \hat{\beta} = \underset{\hat{\beta}}{\operatorname{argmin}} \; SSR(\hat{\beta}) $$

Result: $\hat{\beta} = (X'X)^{-1}X'Y$

$$
\begin{aligned}
SSR(\hat{\beta}) = e'e &= (Y - X\hat{\beta})'(Y - X\hat{\beta}) \\
&= Y'Y - \hat{\beta}'X'Y - Y'X\hat{\beta} + \hat{\beta}'X'X\hat{\beta} \\
&= Y'Y - 2\hat{\beta}'X'Y + \hat{\beta}'X'X\hat{\beta}
\end{aligned}
$$

In the second line, note that $\hat{\beta}'X'Y$ is a scalar, and hence trivially equal to its transpose, $Y'X\hat{\beta}$; that is how we obtain the third line.

SSR can easily be shown to be a strictly convex, differentiable function of $\hat{\beta}$, so the first order conditions for a stationary point are sufficient for a global minimum.

First order conditions are:

$$ \frac{\partial e'e}{\partial \hat{\beta}} = 0 $$

Using the derivation rules introduced before:

$$ \frac{\partial e'e}{\partial \hat{\beta}} = -2X'Y + 2X'X\hat{\beta} = 0 $$

which is a system of K linear equations with K unknowns ($\hat{\beta}$). Solving for $\hat{\beta}$ gives the desired solution:

$$ \hat{\beta} = (X'X)^{-1}X'Y $$
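A direct implementation of this formula (a sketch with simulated data; in practice a least-squares solver is numerically preferable to forming the inverse explicitly):

    import numpy as np

    rng = np.random.default_rng(8)
    n, K = 200, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
    beta = np.array([1.0, 2.0, -0.5])                 # hypothetical true coefficients
    Y = X @ beta + rng.normal(size=n)

    beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ Y)     # beta_hat = (X'X)^{-1} X'Y
    print(beta_hat)                                   # close to [1.0, 2.0, -0.5]

    # Equivalent, more stable computation:
    beta_hat2, *_ = np.linalg.lstsq(X, Y, rcond=None)
    print(np.allclose(beta_hat, beta_hat2))           # True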


Some comments and details

Existence and uniqueness: guaranteed by the rank assumption $\rho(X) = K$.

Second order conditions: $X'X$ is positive definite, also by the rank assumption.

The role of the assumptions: which of the assumptions have been used to derive the OLS estimator and guarantee its existence and uniqueness?

Notation: $\hat{Y} \equiv X\hat{\beta}$, $e = Y - \hat{Y}$ (the OLS residuals).

Algebraic Properties

Recall the FOCs from the least squares problem:

$$
\begin{aligned}
-X'Y + X'X\hat{\beta} &= 0 \\
X'(Y - X\hat{\beta}) &= 0 \\
X'e &= 0
\end{aligned}
$$

These are the normal equations.

The algebraic properties are those that can be derived from the normal equations.

Sum of errors: if the model has an intercept, one of the columns of X is a vector of ones, so $X'e = 0$ implies:

$$ \sum_{i=1}^{n} e_i = 0 $$

Orthogonality: $X'e = 0$, implying that the OLS residuals are uncorrelated with all explanatory variables.

Linearity: $\hat{\beta}$ is a linear function of $Y$; that is, there exists a $K \times n$ matrix $A$ that depends solely on $X$, with $\rho(A) = K$, such that $\hat{\beta} = AY$.

Proof: trivial. Set $A = (X'X)^{-1}X'$.
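These properties are easy to verify numerically (a sketch along the same lines as the earlier ones, again with simulated data; exact zeros show up as values at machine precision):

    import numpy as np

    rng = np.random.default_rng(9)
    n = 200
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # includes an intercept column
    Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    e = Y - X @ beta_hat                      # OLS residuals

    print(X.T @ e)        # ~ [0, 0, 0] : normal equations X'e = 0
    print(e.sum())        # ~ 0 : residuals sum to zero because of the intercept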


Goodness of Fit

First check some easy results when there is an intercept in the model:

$$ \bar{\hat{Y}} = \bar{Y} $$

Start with $Y_i = \hat{Y}_i + e_i$. Take averages on both sides; $\sum e_i = 0$ by the previous property.

$$ \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} e_i^2 $$

Start with $(Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + e_i$ and take squares. Then show that the cross term vanishes,

$$ \sum_i e_i (\hat{Y}_i - \bar{Y}) = e'X\hat{\beta} - \bar{Y} \sum_i e_i = 0 $$

by the previous properties.

$$ \sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i e_i^2 $$

The total variation of Y around its mean can be decomposed into two additive terms: one corresponding to the model and the other to the estimation errors.

If all errors are zero, then all the variation is due to the model: the fitted linear model explains all the variation.

This suggests the following measure of goodness of fit:

$$ R^2 \equiv \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} $$

This is the (centered) coefficient of determination: the proportion of the total variability explained by the fitted linear model.
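Both expressions for $R^2$ can be computed directly (a sketch with simulated data; the agreement of the two formulas relies on the model having an intercept):

    import numpy as np

    rng = np.random.default_rng(10)
    n = 200
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_hat = X @ beta_hat
    e = Y - Y_hat

    tss = np.sum((Y - Y.mean()) ** 2)        # total sum of squares
    ess = np.sum((Y_hat - Y.mean()) ** 2)    # explained sum of squares
    rss = np.sum(e ** 2)                     # residual sum of squares

    print(ess / tss, 1 - rss / tss)          # the two R^2 expressions coincide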


Comments and properties (as homework)

$0 \leq R^2 \leq 1$. $\hat{\beta}$ maximizes $R^2$.

$R^2$ is non-decreasing in the number of explanatory variables, K.

Use and abuse of $R^2$.

In some cases we will use the uncentered $R^2$:

$$ R^2_u = \frac{\sum \hat{Y}_i^2}{\sum Y_i^2} = 1 - \frac{\sum e_i^2}{\sum Y_i^2} $$

The last equality holds since:

$$
\begin{aligned}
\sum Y_i^2 = Y'Y &= (\hat{Y} + e)'(\hat{Y} + e) \\
&= \hat{Y}'\hat{Y} + e'e + 2\hat{Y}'e \\
&= \hat{Y}'\hat{Y} + e'e + 2\hat{\beta}'X'e \\
&= \hat{Y}'\hat{Y} + e'e
\end{aligned}
$$

by the orthogonality property.

Estimation of $\sigma^2$

We will need an estimator for $\sigma^2$. We will propose:

$$ S^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - K} = \frac{e'e}{n - K} $$

Later on we will establish its properties in more detail.
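A direct computation of this estimator (a sketch with simulated data; with a true error standard deviation of 1, $S^2$ should be close to 1):

    import numpy as np

    rng = np.random.default_rng(11)
    n, K = 200, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
    Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)   # true sigma^2 = 1

    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    e = Y - X @ beta_hat

    S2 = (e @ e) / (n - K)     # S^2 = e'e / (n - K)
    print(S2)                  # close to 1.0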
