part 4: partial regression and correlation 4-1/24 econometrics i professor william greene stern...

31
Part 4: Partial Regression and Correlation -1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Upload: elfrieda-williamson

Post on 22-Dec-2015

220 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-1/24

Econometrics I Professor William GreeneStern School of Business

Department of Economics

Page 2: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-2/24

Econometrics I

Part 4 – Partial Regression and Correlation

Page 3: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-3/24

I have a simple question for you. Yesterday, I was estimating a regional production function with yearly dummies. The coefficients of the dummies are usually interpreted as a measure of technical change with respect to the base year (excluded dummy variable). However, I felt that it could be more interesting to redefine the dummy variables in such a way that the coefficient could measure  technical change from one year to the next. You could get the same result by subtracting two coefficients in the original regression but you would have to compute the standard error of the difference if you want to do inference. 

This is the code I used in LIMDEP. My question:Is this a well known procedure? 

Page 4: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-4/24

Frisch-Waugh (1933) Theorem

Context: Model contains two sets of variables: X = [ (1,time) : (other variables)]

= [X1 X2] Regression model: y = X11 + X22 + (population)

= X1b1 + X2b2 + e (sample)

Problem: Algebraic expression for the second set of least squares coefficients, b2

Page 5: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-5/24

Partitioned Solution

Method of solution (Why did F&W care? In 1933, matrix computation was not trivial!)

Direct manipulation of normal equations produces

, ] so and

( ) =

1 1 1 2 11 2

2 1 2 2 2

1 1 1 2 1 1

2 1 2 2 2 2

1 1 1 1 2 2 1

2 1 1 2 2 2 2 2 2 2 2 2 1 1

X X b X y

X X X X X yX = [X X X X = X y =

X X X X X y

X X X X b X y(X X)b = =

X X X X b X y

X X b X X b X y

X X b X X b X y ==> X X b X y - X X b

= 2 1 1X (y - X b )

Page 6: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-6/24

Probit = -1(%) + 5

Page 7: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-7/24

Partitioned Solution

Direct manipulation of normal equations produces b2 = (X2X2)-1X2(y - X1b1)

What is this? Regression of (y - X1b1) on X2

If we knew b1, this is the solution for b2.

Important result (perhaps not fundamental). Note the result if X2X1 = 0.

Useful in theory: Probably Likely in practice? Not at all.

Page 8: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-8/24

Partitioned Inverse Use of the partitioned inverse result

produces a fundamental result: What is the southeast element in the inverse of the moment matrix?

1 1 1 2

2 1 2 2

-1X 'X X 'X

X 'X X 'X

Page 9: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-9/24

Partitioned Inverse The algebraic result is: [ ]-1

(2,2) = {[X2’X2] - X2’X1(X1’X1)-1X1’X2}-1

= [X2’(I - X1(X1’X1)-1X1’)X2]-1

= [X2’M1X2]-1 Note the appearance of an “M” matrix.

How do we interpret this result? Note the implication for the case in which

X1 is a single variable. (Theorem, p. 34) Note the implication for the case in which

X1 is the constant term. (p. 35)

Page 10: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-10/24

Frisch-Waugh ResultContinuing the algebraic manipulation:

b2 = [X2’M1X2]-1[X2’M1y].

This is Frisch and Waugh’s famous result - the “double residual regression.”

How do we interpret this? A regression of residuals on residuals.

“We get the same result whether we (1) detrend the other variables by using the residuals from a regression of them on a constant and a time trend and use the detrended data in the regression or (2) just include a constant and a time trend in the regression and not detrend the data”

“Detrend the data” means compute the residuals from the regressions of the variables on a constant and a time trend.

Page 11: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-11/24

Important Implications

Isolating a single coefficient in a regression. (Corollary 3.2.1, p. 34). The double residual regression.

Regression of residuals on residuals – ‘partialling’ out the effect of the other variables.

It is not necessary to ‘partial’ the other Xs out of y because M1 is idempotent. (This is a very useful result.) (i.e., X2M1’M1y = X2M1y)

(Orthogonal regression) Suppose X1 and X2 are orthogonal; X1X2 = 0. What is M1X2?

Page 12: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-12/24

Applying Frisch-Waugh

Using gasoline data from Notes 3.X = [1, year, PG, Y], y = G as before.

Full least squares regression of y on X.

Page 13: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-13/24

Detrending the Variables - Pg

Page 14: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-14/24

Regression of Detrended G on Detrended Pg and Detrended Y

Page 15: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-15/24

Partial Regression

Important terms in this context: Partialing out the effect of X1. Netting out the effect …

“Partial regression coefficients.” To continue belaboring the point: Note the interpretation of partial

regression as “net of the effect of …”

Now, follow this through for the case in which X1 is just a constant term, column of ones.

What are the residuals in a regression on a constant. What is M1? Note that this produces the result that we can do linear regression on data

in mean deviation form.

'Partial regression coefficients' are the same as 'multiple regression coefficients.' It follows from the Frisch-Waugh theorem.

Page 16: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-16/24

Partial CorrelationWorking definition. Correlation between sets of residuals. Some results on computation: Based on the M matrices. Some important considerations: Partial correlations and coefficients can have signs and magnitudes that differ greatly from gross correlations and

simple regression coefficients.

Compare the simple (gross) correlation of G and PG with the partial correlation, net of the time effect. (Could you have predicted the negative partial correlation?)

CALC;list;Cor(g,pg)$ Result = .7696572

CALC;list;cor(gstar,pgstar)$ Result = -.6589938

G

1.00

1.50

2.00

2.50

3.00

3.50

4.00

4.50

.50

80 90 100 110 12070P

G

Page 17: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-17/24

Partial Correlation

Page 18: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-18/24

THE Application of Frisch-Waugh The Fixed Effects Model

A regression model with a dummy variable foreach individual in the sample, each observed Ti times.

yi = Xi + diαi + εi, for each individual

1 1

2 2

N

=

=

N

1

2

N

y X d 0 0 0

y X 0 d 0 0 βε

α

y X 0 0 0 d

β [X,D] ε

α

Zδ ε

N may be thousands. I.e., the regression has thousands of variables (coefficients).

N columns

Page 19: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-19/24

Estimating the Fixed Effects Model The FEM is a linear regression model but

with many independent variables

1

1

Using the Frisch-Waugh theorem

=[ ]

D D

b XX XD Xy

a DX DD Dy

b XM X XM y

Page 20: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-20/24

Application – Health and IncomeGerman Health Care Usage Data, 7,293 Individuals, Varying Numbers of PeriodsVariables in the file areData downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. There are altogether 27,326 observations.  The number of observations ranges from 1 to 7 per family.  (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987).  The dependent variable of interest is

DOCVIS = number of visits to the doctor in the observation period

HHNINC =  household nominal monthly net income in German marks / 10000. (4 observations with income=0 were dropped)HHKIDS = children under age 16 in the household = 1; otherwise = 0EDUC =  years of schooling AGE = age in years

We desire also to include a separate family effect (7293 of them) for each family. This requires 7293 dummy variables in addition to the four regressors.

Page 21: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-21/24

Fixed Effects Estimator (cont.)

i i i

i i i

i i i

1i

1 1 1T T T

1 1 1T T T

1 1 1T T T

(The dummy variables are orthogonal)

( ) (1/T)

1- - ... -

- 1- ... -

... ... ... ...

- - ... 1-

i i

1D

2D

D

ND

iD T i i i T i

M 0 0

0 M 0M

0 0 M

M I d dd d =I dd

=

Page 22: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-22/24

‘Within’ Transformations

i

i

TNi=1 i i i i t=1 it,k i.,k it,l i.,lk,l

TNi=1 i i i i t=1 it,k i.,k it i.k

, (x -x )(x -x )

, (x -x )(y -y )

i iD D D

i iD D D

XM X= XM X XM X

XM y= XM y XM y

Page 23: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-23/24

Least Squares Dummy Variable Estimator

b is obtained by ‘within’ groups least squares (group mean deviations)

Normal equations for a are D’Xb+D’Da=D’y a = (D’D)-1D’(y – Xb)

iTi i t=1 it it ia=(1/T)Σ (y - )=ex b

Page 24: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-24/24

Fixed Effects Regression

Page 25: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-25/24

Time Invariant Variable

iTN N 2

i=1 i i i=1 t=1 it i.

it i it i. it i.

i

= a variable that is the same in every period (FEMALE)

(f -f )

but f = f , so ff and (f -f ) 0

This is a simple case of multicollinearity.

= diag(f )*

iD D

f

fM f = fM f

f D

Page 26: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-26/24

Page 27: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-27/24

In the millennial edition of its World Health Report, in 2000, the World Health Organization published a study that compared the successes of the health care systems of 191 countries. The results notoriously ranked the United States a dismal 37th, between Costa Rica and Slovenia. The study was widely misrepresented , universally misunderstood and was, in fact, unhelpful in understanding the different outcomes across countries. Nonetheless, the result remains controversial a decade later, as policy

makers argue about why the world’s most expensive health care system isn’t the world’s best.

Page 28: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-28/24

1

22 3

log =

= α+β log

+β log +β (log ) -

i =1,...,191 countries

i i i

i

i i i

COMP Maximum Attainable - Inefficiency

HealthExp

Educ Educ u

The COMPOSITE Index Equation

Page 29: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-29/24

β1

β2

β3

α

Estimated Model

Page 30: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-30/24

Implications of results: Increases in Health Expenditure and increases in Education are both associated with increases in health outcomes.These are the policy levers in the analysis!

Page 31: Part 4: Partial Regression and Correlation 4-1/24 Econometrics I Professor William Greene Stern School of Business Department of Economics

Part 4: Partial Regression and Correlation4-31/24

WHO Data