Chapter 15

Panel Data Analysis

What is in this Chapter?

• This chapter discusses analysis of panel data.

• This is a situation where there are observations on individual cross-section units over a period of time.

• The chapter discusses several models for the analysis of panel data.

• 1. Fixed effects models.

• 2. Random effects models.

• 3. Seemingly unrelated regression (SUR) model.

• 4. Random coefficient model.

Introduction

• One of the early uses of panel data in economics was in the context of estimation of production functions.

• The model used is now referred to as the "fixed effects" model and is given by

\[ y_{it} = \alpha_i + \beta x_{it} + u_{it} \qquad (15.1) \]

• This model is also referred to as the "least squares with dummy variables" (LSDV) model.

• The αi are estimated as coefficients of dummy variables.

The LSDV or Fixed Effects Model


• Define

\[ \bar{x}_i = \frac{1}{T}\sum_{t} x_{it}, \qquad \bar{y}_i = \frac{1}{T}\sum_{t} y_{it} \]


• In the case of several explanatory variables, Wxx is a matrix and β and Wxy are vectors.
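
For reference, a sketch of the within-group quantities in the single-regressor case, assuming the standard definitions (x̄i and ȳi are the means of xit and yit over t for unit i). The symbol names follow the Wxx and Wxy used here; the exact layout is an assumption.

```latex
W_{xx} = \sum_{i}\sum_{t} (x_{it} - \bar{x}_i)^2, \qquad
W_{xy} = \sum_{i}\sum_{t} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i), \qquad
W_{yy} = \sum_{i}\sum_{t} (y_{it} - \bar{y}_i)^2

\hat{\beta} = \frac{W_{xy}}{W_{xx}}, \qquad
\hat{\alpha}_i = \bar{y}_i - \hat{\beta}\,\bar{x}_i
```

With several explanatory variables the ratio becomes the matrix expression $W_{xx}^{-1} W_{xy}$.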

The OLS model

• If we consider the hypothesis α1 = α2 = … = αN = α, then the model is

\[ y_{it} = \alpha + \beta x_{it} + u_{it} \]

Alternative method for the fixed effects model

\[ y_{it} = \alpha_i + \beta' x_{it} + u_{it} \]

• where αi (i = 1, 2, …, N) and β (a K×1 vector) are unknown parameters to be estimated.


• As part of this study’s focus on the dynamic relationships between yit and xit (i.e. the β parameters) we take the ‘group difference’ between variables and redefine the equation as follows:


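\[ y_{it}^{*} = \beta' x_{it}^{*} + u_{it}^{*} \]

• where * denotes variables deviated from the group mean, for example

\[ y_{it}^{*} = y_{it} - \bar{y}_i, \qquad x_{it}^{*} = x_{it} - \bar{x}_i, \qquad u_{it}^{*} = u_{it} - \bar{u}_i \]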
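
The group-mean deviation step can be sketched in a few lines of plain Python. This is a minimal illustration for a single regressor, not a general implementation: the data layout (`{unit: [(x, y), ...]}`), the function names and the toy numbers are all assumptions made for the example.

```python
# Within (group-mean deviation) estimator for one regressor.
# Assumed data layout for illustration: {unit: [(x, y), ...]}.

def group_means(obs):
    """Return (x_bar, y_bar) for one cross-section unit."""
    T = len(obs)
    return sum(x for x, _ in obs) / T, sum(y for _, y in obs) / T


def within_estimator(panel):
    """Return (beta_hat, {unit: alpha_hat}) from group-demeaned data."""
    w_xx = 0.0
    w_xy = 0.0
    means = {}
    for unit, obs in panel.items():
        x_bar, y_bar = group_means(obs)
        means[unit] = (x_bar, y_bar)
        for x, y in obs:
            x_star = x - x_bar  # x*_it = x_it - x_bar_i
            y_star = y - y_bar  # y*_it = y_it - y_bar_i
            w_xx += x_star * x_star
            w_xy += x_star * y_star
    beta_hat = w_xy / w_xx
    # Unit intercepts are recovered from the group means.
    alphas = {u: yb - beta_hat * xb for u, (xb, yb) in means.items()}
    return beta_hat, alphas


# Noise-free toy panel: y = alpha_i + 2 x, with alpha_A = 1 and alpha_B = 5.
panel = {
    "A": [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],
    "B": [(2.0, 9.0), (4.0, 13.0), (6.0, 17.0)],
}
beta_hat, alphas = within_estimator(panel)
print(beta_hat, alphas)
```

Demeaning removes αi without estimating N dummy coefficients directly, which is why this route is convenient when N is large.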

Industry and year dummies

• Industry dummies
  – Using the first one-digit (or two-digit) of the firm's SIC code.
  – Control for the potential variation across industries.

• Year dummies
  – Panel structure data.
  – Year effect refers to the aggregate effects of unobserved factors in a particular year that affect all the companies equally.

• Yi,t = β0 + β1 Xi,t + control variables + year dummies + industry dummies
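
As a rough sketch of how such dummies might be built, the snippet below derives an industry code from the leading digit(s) of a SIC code and creates 0/1 columns for industry and year. The helper names, the firm-year rows and the choice to drop the first category as the reference group are assumptions for illustration only.

```python
def industry_code(sic, digits=1):
    """First `digits` digits of a four-digit SIC code, kept as a string."""
    return str(sic).zfill(4)[:digits]


def dummy_columns(values, prefix):
    """One 0/1 column per category.

    The first (sorted) category is dropped as the reference group so the
    dummies are not collinear with the intercept.
    """
    categories = sorted(set(values))
    return {
        f"{prefix}_{c}": [1 if v == c else 0 for v in values]
        for c in categories[1:]
    }


# Hypothetical firm-year rows: (firm, SIC code, year).
rows = [
    ("f1", 2834, 2001), ("f1", 2834, 2002),
    ("f2", 3571, 2001), ("f2", 3571, 2002),
    ("f3", 6021, 2001), ("f3", 6021, 2002),
]
industry = [industry_code(sic) for _, sic, _ in rows]
year = [yr for _, _, yr in rows]

dummies = {**dummy_columns(industry, "ind"), **dummy_columns(year, "yr")}
for name, column in dummies.items():
    print(name, column)
```

Passing `digits=2` to `industry_code` gives the two-digit grouping instead.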

The Random Effects Model

• In the random effects model, the αi are treated as random variables rather than fixed constants.

• The αi are assumed to be independent of the errors uit and also mutually independent.

• This model is also known as the variance components model.

• It became popular in econometrics following the paper by Balestra and Nerlove on the demand for natural gas.


• For the sake of simplicity we shall use only one explanatory variable.

• The model is the same as equation (15.1) except that αi are random variables.

• Since αi are random, the errors now are vit = αi + uit. Because αi is common to every period for unit i, cov(vit, vis) = var(αi) for t ≠ s.


• Since the errors are correlated, we have to use generalized least squares (GLS) to get efficient estimates.

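• However, after algebraic simplification the GLS estimator can be written in the simple form

\[ \hat{\beta}_{GLS} = \frac{W_{xy} + \theta B_{xy}}{W_{xx} + \theta B_{xx}}, \qquad \theta = \frac{\sigma_u^2}{\sigma_u^2 + T\sigma_\alpha^2} \]

where σu² and σα² are the variances of uit and αi.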


• W refers to within-group

• B refers to between-group

• T refers to total


Thus the OLS and LSDV estimators are special cases of the GLS estimator with θ = 1 and θ = 0, respectively.
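
The two special cases can be checked numerically. The sketch below assumes the GLS estimator has the form (Wxy + θBxy)/(Wxx + θBxx) with Txx = Wxx + Bxx, and that θ = σu²/(σu² + Tσα²). The function names and the toy panel are illustrative assumptions.

```python
def sums(panel):
    """Within (W) and between (B) sums of squares and cross-products."""
    all_obs = [o for obs in panel.values() for o in obs]
    n = len(all_obs)
    gx = sum(x for x, _ in all_obs) / n  # grand mean of x
    gy = sum(y for _, y in all_obs) / n  # grand mean of y
    w_xx = w_xy = b_xx = b_xy = 0.0
    for obs in panel.values():
        t = len(obs)
        xb = sum(x for x, _ in obs) / t
        yb = sum(y for _, y in obs) / t
        w_xx += sum((x - xb) ** 2 for x, _ in obs)
        w_xy += sum((x - xb) * (y - yb) for x, y in obs)
        b_xx += t * (xb - gx) ** 2
        b_xy += t * (xb - gx) * (yb - gy)
    return w_xx, w_xy, b_xx, b_xy


def theta(sigma_u2, sigma_alpha2, T):
    """Assumed weight: sigma_u^2 / (sigma_u^2 + T * sigma_alpha^2)."""
    return sigma_u2 / (sigma_u2 + T * sigma_alpha2)


def beta_gls(panel, th):
    """GLS slope as a theta-weighted mix of within and between variation."""
    w_xx, w_xy, b_xx, b_xy = sums(panel)
    return (w_xy + th * b_xy) / (w_xx + th * b_xx)


def pooled_ols(panel):
    """Slope from OLS on the pooled data, using total sums only."""
    all_obs = [o for obs in panel.values() for o in obs]
    n = len(all_obs)
    gx = sum(x for x, _ in all_obs) / n
    gy = sum(y for _, y in all_obs) / n
    t_xx = sum((x - gx) ** 2 for x, _ in all_obs)
    t_xy = sum((x - gx) * (y - gy) for x, y in all_obs)
    return t_xy / t_xx


# Toy panel (made-up numbers): {unit: [(x, y), ...]}.
panel = {
    "A": [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)],
    "B": [(2.0, 6.0), (3.0, 6.0), (5.0, 9.0)],
}
print(beta_gls(panel, 1.0), pooled_ols(panel))  # theta = 1: pooled OLS
print(beta_gls(panel, 0.0))                     # theta = 0: LSDV (within)
```

Intermediate values of θ place the GLS estimate between the within and pooled OLS estimates.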

The SUR Model

• Zellner suggested an alternative method to analyze panel data, the seemingly unrelated regression (SUR) estimation.

• In this model a GLS method is applied to exploit the correlations in the errors across cross-section units.

• The random effects model results in a particular type of correlation among the errors. It is an equicorrelated model.

• In the SUR model the errors are independent over time but correlated across cross-section units:

\[ E(u_{it}u_{jt}) = \sigma_{ij}, \qquad E(u_{it}u_{js}) = 0 \ \text{for } t \neq s \]

• This type of correlation would arise if there are omitted variables that are common to all equations.

• The estimation of the SUR model proceeds as follows.

• We first estimate each of the N equations (for the cross-section units) by OLS.

• We get the residuals $\hat{u}_{it}$.

• Then we compute $\hat{\sigma}_{ij} = \frac{1}{T-k}\sum_{t} \hat{u}_{it}\hat{u}_{jt}$, where k is the number of regressors.

• After we get the estimates $\hat{\sigma}_{ij}$ we use GLS on all the N equations jointly.
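
The first two steps of this procedure (equation-by-equation OLS, then the residual cross-products) can be sketched as follows. The example assumes one regressor plus an intercept per equation, so k = 2 here. The function names and data are made up for illustration, and the final joint GLS step is not implemented.

```python
def ols_residuals(x, y):
    """Residuals from an OLS fit of y on an intercept and x."""
    T = len(x)
    xb, yb = sum(x) / T, sum(y) / T
    s_xy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    s_xx = sum((xi - xb) ** 2 for xi in x)
    b = s_xy / s_xx
    a = yb - b * xb
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def residual_covariance(residuals, k):
    """sigma_hat[i][j] = (1 / (T - k)) * sum_t u_it * u_jt."""
    N = len(residuals)
    T = len(residuals[0])
    return [
        [
            sum(residuals[i][t] * residuals[j][t] for t in range(T)) / (T - k)
            for j in range(N)
        ]
        for i in range(N)
    ]


# Two cross-section units observed over T = 5 periods (made-up numbers).
data = [
    ([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.9, 5.1]),
    ([2.0, 1.0, 4.0, 3.0, 5.0], [2.3, 0.8, 4.5, 2.9, 5.4]),
]
residuals = [ols_residuals(x, y) for x, y in data]
sigma_hat = residual_covariance(residuals, k=2)  # k counts the intercept
for row in sigma_hat:
    print(row)
```

The resulting N×N matrix is what the joint GLS step would use to weight the stacked equations. Because it has N(N+1)/2 distinct elements estimated from T observations each, it becomes unreliable when N is large relative to T.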

• If we have large N and small T this method is not feasible.

• Also, the method is appropriate only if the errors are generated by a true multivariate distribution.

• When the correlations are due to common omitted variables it is not clear whether the GLS method is superior to OLS.

• The argument is similar to the one mentioned in Section 6.9. See "autocorrelation caused by omitted variables."

The Random Coefficient Model


• If δ² is large compared with υi, then the weights in equation (15.8) are almost equal and the weighted average would be close to the simple average of the βi.
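
A small numerical sketch of this point, under the assumption that the weights in equation (15.8) have the form wi = 1/(δ² + υi), where υi is the sampling variance of the unit-level estimate of βi. The function name and all numbers below are hypothetical.

```python
def weighted_beta(beta_hats, upsilons, delta2):
    """Weighted average of unit-level slopes with w_i = 1 / (delta2 + v_i).

    The weight formula is an assumed form of equation (15.8).
    """
    weights = [1.0 / (delta2 + v) for v in upsilons]
    total = sum(weights)
    return sum(w * b for w, b in zip(weights, beta_hats)) / total


# Hypothetical unit-level estimates and their sampling variances.
beta_hats = [0.8, 1.0, 1.5]
upsilons = [0.01, 0.04, 0.25]

simple_average = sum(beta_hats) / len(beta_hats)
small_delta = weighted_beta(beta_hats, upsilons, delta2=0.001)
large_delta = weighted_beta(beta_hats, upsilons, delta2=100.0)

print(simple_average, small_delta, large_delta)
```

With a small δ² the precisely estimated units dominate the average. With a large δ² the υi barely affect the weights, so the result moves toward the simple average.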


• In practice the GLS estimator cannot be computed because the parameters in equation (15.8) are not known.

• To obtain these we estimate equations for the N cross-section units and get the residuals $\hat{u}_{it}$.

• Then the unknown variance parameters are estimated from these residuals.
