panel data regression - s ugauss.stat.su.se/gu/e/slides/f17.pdfchapter 16 panel data regression...
TRANSCRIPT
Panel Data
Up to now we have analyzed either
� Cross-sectional data (data collected on severalindividuals/units at one point in time)
or
� Time series data (data collected on one individual/unit overseveral time periods)
What if we have a combination of these two types of data?
Chapter 16 Panel Data Regression Models 2/22
Panel data are repeated cross-sections over time, in essence therewill be space as well as time dimensions.
Other names are pooled data, micropanel data, longitudinal data,event history analysis and cohort analysis
Chapter 16 Panel Data Regression Models 3/22
Panel Data Examples
The individuals/units can for example be workers, �rms, states orcountries
� Annual unemployment rates of each state over several years
� Quarterly sales of individual stores over several quarters
� Wages for the same worker, working at several di¤erent jobs
Chapter 16 Panel Data Regression Models 4/22
Panel Data Examples
Some american surveys:
� The National Longitudinal Survey of Youth (NLSY) trackslabor market outcomes for thousands of individuals, beginningin their teenage years
� The Panel Survey of Income Dynamics (PSID) since 1968collects data on 5000 families about various socioeconomicand demographic variables
� The Survey of Income and Program Participation (SIPP),conducts interviews four times a year about the respondentseconomic conditions
Chapter 16 Panel Data Regression Models 5/22
Panel Data
Potential gains
� take heterogenity into account, get individual-speci�cestimates
� especially suitable to study dynamics of change� study more sophisticated behavioral models� minimize bias due to aggregation
However, panel data also increases the complexity of the analysis.
Chapter 16 Panel Data Regression Models 6/22
Panel Data
� Balanced/unbalanced
� Short panel/long panel
Chapter 16 Panel Data Regression Models 7/22
Panel Data
Two kinds of models:
FIXED EFFECTS MODELS
RANDOM EFFECTS MODELS
The two types of analyses make conceptually contrastingassumptions about e¤ects as either random or �xed
Chapter 16 Panel Data Regression Models 8/22
Example with 2 explanatory variables:
Yit = β1 + β2X2it + β3X3it + uit
Notice the subscript index it
� i stands for the i : th cross-sectional unit, i = 1, ...,N
� t stands for the t : th time period, i = 1, ...,T
Chapter 16 Panel Data Regression Models 9/22
Pooled OLS Regression
Treats all observation as equivalent and OLS as usual
Yit = β1 + β2X2it + β3X3it + uit
In this case the error term captures "everything"
Naive, ignores time and space
Chapter 16 Panel Data Regression Models 10/22
Fixed E¤ects Models with DummyVariables
Several kinds of �xed e¤ects models, di¤ers in the asumptionsabout�The interceptThe slope coe¢ cients
Chapter 16 Panel Data Regression Models 11/22
Fixed E¤ects Models with DummyVariables
Varies over individuals Varies over timeThe intercept
p �The slope coe¢ cients � �
Di¤erent intercepts for di¤erent individuals β1i
Yit = β1i + β2X2it + β3X3it + uit
but each individuals intercept does not vary over time
If the number of individuals is N = 4
Yit = α1 + α2D2i + α3D3i + α4D4i + β2X2it + β3X3it + uit
Chapter 16 Panel Data Regression Models 12/22
Fixed E¤ects Models with DummyVariables
Varies over individuals Varies over timeThe intercept � p
The slope coe¢ cients � �
Di¤erent intercepts for di¤erent time periods instead β1t
Yit = β1t + β2X2it + β3X3it + uit
If the number of time periods is T = 20
Yit = λ1 + λ2D2t + λ3D3t + ...+ λ20D20t + β2X2it + β3X3it + uit
Chapter 16 Panel Data Regression Models 13/22
Fixed E¤ects Models with DummyVariables
Varies over individuals Varies over timeThe intercept
p p
The slope coe¢ cients � �
Di¤erent intercepts for di¤erent individuals AND time periods β1it
Yit = β1it + β2X2it + β3X3it + uit
For N = 4 and T = 20
Yit = α1 + α2D2i + α3D3i + α4D4i + λ1 + λ2D2t +
λ3D3t + ...+ λ20D20t + β2X2it + β3X3it + uit
Chapter 16 Panel Data Regression Models 14/22
Fixed E¤ects Models with DummyVariables
Varies over individuals Varies over timeThe intercept
p �The slope coe¢ cients
p �
Both intercepts and slopes varies over individuals, introduces a lotof dummy variables
Yit = α1 + α2D2i + α3D3i + α4D4i + β2X2it + β3X3it+γ1D2iX2it + γ2D2iX3it + γ3D3iX2it + γ4D3iX3it+γ5D4iX2it + γ6D4iX3it + uit
the number of interaction terms is number of dummy variables �number of explanatory variables
Chapter 16 Panel Data Regression Models 15/22
Fixed E¤ects Models with DummyVariables
Both intercepts and slopes varies over individuals and time
requires even more variables
Chapter 16 Panel Data Regression Models 16/22
Fixed E¤ects Models
Cautions
� a lot of dummy variables) less df) increased risk of multicollinearity
� have to re�ect on the assumptions about the error term uit- heteroscedasticity?- autocorrelation?
easily gets complicated when both time and space dimensions
Chapter 16 Panel Data Regression Models 17/22
Random E¤ects Models
Now, in the Random E¤ects Model
Yit = β1i + β2X2it + β3X3it + uit
the intercepts/e¤ects β1i are assumed to be random variables withmean value
E (β1i ) = β1
and the intercept value for individual i can be expressed as
β1i = β1 + εi i = 1, ...,N
where E (εi ) = 0 and Var (εi ) = σ2ε
Chapter 16 Panel Data Regression Models 18/22
Random E¤ects Models
each individual in the sample is considered to be a drawing from anin�nite (or "close to") population of individuals which share thecommon mean value β1
Yit = β1 + β2X2it + β3X3it + εi + uitYit = β1 + β2X2it + β3X3it + wit
The error term wit consists of two components, random e¤ectsmodels are sometimes called error components models
Chapter 16 Panel Data Regression Models 19/22
Random E¤ects Models
Assumptions about the error components
εi � N�0, σ2ε
�E (εi εj ) = 0 for i 6= j
uit � N�0, σ2u
�E (uituis ) = E (uituit ) = E (uitujs ) = 0 for i 6= j t 6= s
E (εiuit ) = 0
9>>>>>>>>>>>=>>>>>>>>>>>;
Chapter 16 Panel Data Regression Models 20/22
Random E¤ects Models
)
8>>>><>>>>:E (wit ) = 0
Var (wit ) = σ2ε + σ2u
Corr (wit ,wis ) =σ2ε
σ2ε+σ2u
Chapter 16 Panel Data Regression Models 21/22
Random E¤ects vs Fixed E¤ects
depends on
� whether or not the individuals can be viewed as a randomsample from a large population
�E (εiXi ) = 0?
If yes: random e¤ects, if no: �xed e¤ects
� the relation between T and N
- for large T and small N not a big di¤erence- for small T and large N random e¤ects estimators are moree¢ cient than �xed e¤ects (if the assumptions hold)
Chapter 16 Panel Data Regression Models 22/22