
Development Economics, Part 2, First term 2012

Jean-Bernard CHATELAIN

Université Paris I Panthéon Sorbonne


Theoretical Insights: Growth and Poverty traps

Eggertsson: Why Iceland Starved.

Diamond: Collapse (Easter Island and other cases).

Fukuyama, Levy: Entry points for development and political changes.


Econometrics Explaining Growth: Growth and Finance, Growth and Aid

Beck: Survey on finance and development.

Arcand et al.: Too Much Finance?

Burnside and Dollar: Aid, Policies, and Growth.

Doucouliagos and Paldam: meta-analysis.

Roodman: methodology of applied econometrics of growth and aid.


University of Michigan Press (2005)

Iceland case / Institutional Economics (ISNIE, Ostrom).

Cf. to some extent, Acemoglu on political barriers to growth.


Two-sector model

L agriculture + L fishing = L(t).

Pre-industrial Malthusian Model of (cyclical) Growth (cf. Galor’s papers) determines L(t).

Decreasing returns to scale for both production functions F(L) and G(L): market arbitrage and equilibrium:

F’(L agriculture) = (1 – t) G’(L fishing)

Question: which multifactor explanations account for the distortion factor t over 900 years?
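
As a minimal sketch of the arbitrage condition F’(L agriculture) = (1 – t) G’(L fishing), the following Python snippet solves for the labor allocation by bisection. The functional forms F(L) = A*L^alpha and G(L) = B*L^beta and all parameter values are illustrative assumptions, not taken from the lecture.

```python
def allocate_labor(total_labor, t, A=1.0, alpha=0.5, B=2.0, beta=0.5):
    """Return (L_agriculture, L_fishing) solving F'(La) = (1 - t) * G'(Lf).

    Hypothetical technologies: F(L) = A * L**alpha and G(L) = B * L**beta,
    both with decreasing returns (0 < alpha, beta < 1).
    """
    def gap(la):
        lf = total_labor - la
        mp_agri = A * alpha * la ** (alpha - 1)   # marginal product in agriculture
        mp_fish = B * beta * lf ** (beta - 1)     # marginal product in fishing
        return mp_agri - (1 - t) * mp_fish

    # gap() decreases in la, so bisection finds the unique interior root.
    lo, hi = 1e-9 * total_labor, (1 - 1e-9) * total_labor
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:      # agriculture still pays more: move labor into it
            lo = mid
        else:
            hi = mid
    la = 0.5 * (lo + hi)
    return la, total_labor - la


for t in (0.0, 0.25, 0.5):
    la, lf = allocate_labor(100.0, t)
    print(f"t = {t:.2f}: agriculture = {la:.1f}, fishing = {lf:.1f}")
```

Under these assumed parameters, t = 0 places 20 of 100 workers in agriculture, while t = 0.5 raises that number to 50: a larger distortion keeps labor out of fishing.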


At least 4 actors

Internal: Landowners (Elite), Labor.

External: colonial power (Denmark), competitors of Denmark.

Why did the coalition among the elite remain relatively stable, despite substantial autonomy from Denmark, despite various shocks, and despite unequal coastal access to fish?


Shocks

Climate shocks

Population increases (starvation, migration)

Changes of colonial power / trade partners

Changes in Europe for Denmark, the colonial power (wars), and in the relative weight of Iceland policy among all its policy matters.

Technological change in fishing and boats (transportation costs)


Explain the distortion (t)

Price effects: tariffs on trade (what about the domestic price?), transportation costs.

Quantity effects: constraints on trade partners.

Regulatory constraints (quantity) on labor sectoral mobility.

Determined from a political equilibrium between the 4 actors.


Exhaustion of Key Necessary Input(s)

Y = F (K1, K2, L)

Y=F(0, K2, L) = 0

Exhaustible resources or

over-exploited renewable resources:

Consumption of K1(t) > quantity of K1(t) renewed

(Energy, Food).


Explanation WHY K1 tends towards zero

K1 turns out to be a public good (« commons »).

External (climate) shocks, technological change consuming more K1.

Knowledge: people do not understand, or do not believe. Groupthink versus dissent (dissent is valuable but increases coordination problems).

Action: coordination problems; different costs in the short run (acting makes enemies, dissenting loses friends). Bottom-up / top-down.


Fukuyama and Levy


Four entry points for reforms against poverty and political traps

1. Growth (just enough public governance)

2. State Building (tax system/ expenditures/ authority)

3. Political Institutions (rule of law, property rights, democracy)

4. Civil Society Development (bottom up, local and regional politics)


The Order of Reforms and Local Constraints

The order of reform matters.

The order is constrained by local, historical and current conditions for exiting poverty traps.

External shocks matter (windows of opportunity for reforms).

Not all reforms head in the direction of development.


Finance and Growth, Crisis and Political Economy

Cf. « The failure of financial macroeconomics and what to do about it »


EMBEDDED Macroeconomics: « Banking Fragile by Political Design »


Revising views

1. Allocation of capital between sectors (removing barriers to entry due to credit rationing, fostering creative destruction, risk sharing).

2. Emphasis on the fluctuations channel from finance to growth (rising risk, crises).

3. Control of capital flows and regulation fostering financial stability?

4. Excess resources going into finance in the developed world (excess trading, excess labour attracted by excess wages, too much risk-taking, excessive rents)?


Applied Econometrics Explaining Growth

Allegory of Truth: in Iconologia, by Cesare Ripa, wood engraving after Cesare d’Arpino, 1618.

A beautiful naked (simple) woman,

unveiled by Time,

who holds:

the sun (light) or a mirror,

an open book (where truth is found),

a palm (strength),

with the Earth under her feet (above earthly matters, in the sky).


Explaining growth

Many causal factors: up to 500 indicators for 50 effects explaining growth (some of the indicators intend to measure the same effect).

Reverse causality: regressors are endogenous, except for geography and variables dated far in the past.

Outliers.

Poverty traps: thresholds, non linear effects.

From the country monograph to general effects and policy?


Data: Cross sections; Historical time series; Panel Data.

1. Historical time series: Maddison’s data set.

2. Cross section, Between (averages of cross sections over time): look at time invariant variables (initial GDP per head).

3. Panel data, fixed effects or first differences: well suited to eliminating the endogeneity of regressors that are correlated with time-invariant unobservable country/individual characteristics a(i), i.e. cov(x(it),a(i)) non-zero. Drawback: not suited to estimating the effect of regressors that are time-invariant within the sample.


Dependent variable: Growth versus cycles

Data availability (from the 1960s) varies across regressors.

Averages over arbitrary periods of 5, 6, …, 10 years.

Trend versus cycles using filters (for example, Hodrick-Prescott).

Interaction between cycles of GDP per head and the growth trend: long-term effects of crises?


Less Wrong? The 12 labours of regression

1. Inference: statistical versus substantive significance.

2. Publication bias and multiple comparisons.

3. Multiple testing

4. Instrumental variables

5. Instrumental variables with GMM using panel data.

6. Power: minimal number N of observations

7. Maximal number k of regressors, contributions to R2.

8. Panel data: Within versus Between: time trends versus endogeneity

9. Time invariant variables in panel data.

10. Outliers detection, residuals graphs, robust estimates; overfitting.

11. Quadratic and interaction terms

12. Spurious regressions and near-multicollinearity.


1. Statistical significance criterion


Abs(x-mean) < sigma: about 68% of shocks (normal distribution)

Abs(x-mean) < 1.96 sigma: 95% of shocks
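
These coverage figures can be checked with a few lines of Python. This is only an illustrative sketch using the error function from the standard library; the helper name `normal_coverage` is hypothetical.

```python
import math


def normal_coverage(k):
    """P(|X - mean| < k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))


for k in (1.0, 1.96, 2.0, 5.0):
    print(f"within {k} sigma: {normal_coverage(k):.6f}")
```

One sigma covers roughly 68.3% of draws and 1.96 sigma covers 95%, which is where the usual 5% significance threshold comes from.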


Statistical significance versus substantive significance

Fisher: p<0.05 (1925-1940), type I error only.

When the null hypothesis is true, 1 test in 20 is expected to reject it wrongly.

Gosset (Student), Egon Pearson and Jerzy Neyman (1928, 1938): type I and type II errors.

Deirdre (formerly Donald) McCloskey and Stephen Ziliak (1980s to now).


Around 200 A.D.


Disagreement in Wikipedia: « Statistical Significance »

This article needs attention from an expert in statistics. Please add a reason or a talk parameter to this template to explain the issue with the article. WikiProject Statistics or the Statistics Portal may be able to help recruit an expert. (June 2012)


A response

Test for a minimal « significant » size of the effect, not for its existence:

Replace H0: ρ=0 with H0: ρ<ρ(min).

Select the threshold ρ(min) using a loss function over the two types of error.
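
A minimal sketch of such a test for a correlation coefficient is given below. It relies on the Fisher z-transformation, which is an assumption of this illustration (the slides do not specify a test statistic), and the sample values are hypothetical.

```python
import math
from statistics import NormalDist


def p_value_min_effect(r, n, rho_min=0.0):
    """One-sided p-value for H0: rho <= rho_min against H1: rho > rho_min.

    Uses the Fisher z approximation: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    z = (math.atanh(r) - math.atanh(rho_min)) * math.sqrt(n - 3)
    return 1.0 - NormalDist().cdf(z)


# Hypothetical sample: a small correlation estimated on many observations.
r, n = 0.10, 2000
print("H0: rho <= 0   ->", p_value_min_effect(r, n, 0.0))
print("H0: rho <= 0.2 ->", p_value_min_effect(r, n, 0.2))
```

With a large sample, the small correlation is "statistically significant" against zero, yet there is no evidence that it exceeds a substantively meaningful threshold such as 0.2.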

Example follows:


Binary case (logit, probit): the power curve plots the share of true positives (1-β) as a function of the share of false positives (α).


Decision making given the probability of default: loss function

The loss function is a weighted sum of two errors:

Type I error: lending to a firm that goes bankrupt next period: loss of the loan and interest.

Type II error: not lending to a firm that is profitable next period: loss of profit.

LOSS = LossGivenDefault * P(type I error) + (R-r0) * P(type II error).


Choice of the threshold s minimizing the lender’s loss function

Min LOSS = LGD*P(P0/1) + (R-r0)*P(P1/0)

Loss given default (LGD): LGD = (%lost)*(1+r)*Loan > (R-r0)*Loan.

With (x, y) a point on the power curve: LGD*(1-y) + (R-r0)*x = L1 (a given level of expected loss).

y = 1 - L1/LGD + ((R-r0)/LGD)*x

Iso-loss lines: the higher the line (larger intercept), the lower the loss level.


The straight lines are iso-loss (iso-profit) lines; the optimal cutoff s* lies at the point of tangency with the power curve.
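
The choice of cutoff can be sketched numerically. The snippet below evaluates the loss LGD*(1-y) + (R-r0)*x at every candidate cutoff and keeps the smallest one; the scores, default outcomes, LGD and margin values are all made-up illustrations, and the function name `best_cutoff` is hypothetical.

```python
def best_cutoff(scores, defaulted, lgd, margin):
    """Pick the score cutoff s minimizing lgd * (1 - y) + margin * x.

    A firm is refused credit when its default score is >= s.
    y = share of defaulting firms refused (true positives, power);
    x = share of healthy firms refused (false positives);
    margin stands for (R - r0).
    """
    n_bad = sum(defaulted)
    n_good = len(defaulted) - n_bad
    best = None
    for s in sorted(set(scores)) + [float("inf")]:
        y = sum(1 for sc, d in zip(scores, defaulted) if d and sc >= s) / n_bad
        x = sum(1 for sc, d in zip(scores, defaulted) if not d and sc >= s) / n_good
        loss = lgd * (1 - y) + margin * x
        if best is None or loss < best[0]:
            best = (loss, s, x, y)
    return best


# Hypothetical default scores and realized outcomes (1 = default).
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.90]
defaulted = [0, 0, 0, 1, 0, 1, 0, 1]
loss, s_star, x, y = best_cutoff(scores, defaulted, lgd=0.6, margin=0.05)
print(f"s* = {s_star}, x = {x:.2f}, y = {y:.2f}, loss = {loss:.3f}")
```

Because the assumed loss given default is large relative to the margin, the minimizing cutoff refuses every defaulting firm (y = 1) at the price of refusing 40% of healthy firms; reversing the two weights pushes the cutoff upward.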


2. Publication bias and meta-analysis


Veritas Filia Temporis


3. Multiple testing: testing m different proxies (corruption, governance indices) one after the other in m different regressions, i.e. running many (m) regressions.

Remark: this is not about the number k of t-tests in a given multiple regression including k regressors.


With m=5 trials, one should use the threshold p = 5%/m = 1%.
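
Dividing the threshold by m is the Bonferroni correction. The short sketch below shows why it is needed; the family-wise error formula assumes independent tests of true null hypotheses, which is a simplifying assumption of this illustration.

```python
def bonferroni_threshold(alpha, m):
    """Per-test threshold keeping the overall type I error near alpha."""
    return alpha / m


def familywise_error_rate(alpha, m):
    """P(at least one false positive) in m independent tests of true nulls."""
    return 1 - (1 - alpha) ** m


m = 5
print("uncorrected:", familywise_error_rate(0.05, m))
print("corrected:  ", familywise_error_rate(bonferroni_threshold(0.05, m), m))
```

With five trials at the 5% level, the chance of at least one spurious "significant" proxy is about 23%; at the 1% level it falls back just below 5%.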


Data fatigue

« In doing this paper of tremendous scope, he had a great struggle with the data. He won a few points, the data won a few points, and I gather they are both exhausted.»

Nordhaus (1975). Running regressions was burdensome in 1975.


The mining ratio in applied macroeconomics (Paldam (2012))

The mining ratio is the number m of regressions run for each published paper. The full m-set and m itself are known only to the researcher. Meta-analyses study the m’-set (m’<<m) of regressions reported in disclosed grey literature (working papers) and published articles. The cost of running regressions has fallen, and this has caused the mining ratio to increase.

Paldam suggests two consequences: (1) It causes publication biases to rise. (2) It contributes to the rapid rise in the number and sophistication of econometric tools, even when it appears that the marginal productivity of new tools is falling.


The Aid-Growth Regressions Rocket?

« It is only by repeating experiments that one manages to succeed…

In other words,…

the more you fail, the better your chances that it works…»


The positive and necessary side of multiple testing: exploratory data analysis (Tukey); « data mining ».

Along with serendipity and changes of hypothesis: « Chance favours only the prepared mind » (Pasteur)


Data mining involves six common classes of tasks:

Anomaly detection (outlier/change/deviation detection) – the identification of unusual data records that might be interesting, or of data errors that require further investigation.

Association rule learning (dependency modeling) – searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.

Clustering – the task of discovering groups and structures in the data that are in some way or another "similar", without using known structures in the data.

Classification – the task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".

Regression – attempts to find a function which models the data with the least error.

Summarization – providing a more compact representation of the data set, including visualization and report generation.


4. Instrumental Variables: « Imperfect IV »

e = disturbance

X1 = X1exo + X1endo (exogenous part + endogenous part)

But the relative share of variance between the two parts is unknown.

Perfect IV: never really available.

Cor ( Z, X1exo ) = 1 (strong > weak)

Cor ( Z, e ) = 0

Often, very exogenous instruments are weak, and strong instruments are not very exogenous: hence « imperfect IV ».


Near-multicollinearity may corrupt the Hausman test when another regressor is used as an instrument

Y = a.x1 + b. x2 + e

First step: x2hat = a’ x1 + b’.z

Second step: Y = a(IV). x1 + b(IV). (x2hat).

With near-collinearity, b(IV) >>>> b:

the Hausman test then confirms endogeneity and the relevance of b(IV).


Gonzalez (2005) Bank regulation and risk-taking incentives: An international comparison of bank risk

Non-performing loans (bank level), explained by:

REG(high): « freedom in the banking sector in a country », an index by the Heritage Foundation. 0 = lots of freedom (levels 1 and 2); 1 = government intervention (levels 3 and 4).


Among the first-stage regressors for Q, only « tangible assets » is absent from the second stage.

9 variables are common regressors in the first stage and the second stage.

In the second stage, parameters are multiplied by a factor of 2 to 15, sometimes with a change of sign, relative to the estimates without IV.


The Hausman test of b(IV)-b(nonIV) confirms IV; the researcher states that the best regression is the IV one.

Before IV: a shift to 1 in Reg(high) implies +2.54 in the non-performing loans ratio (min=0, median 0.9, mean 2.3, standard deviation 4.5, max=39).

After IV: a shift to 1 in Reg(high) implies, ceteris paribus, -25 (10 x 2.54 with a change of sign) in the non-performing loans ratio, i.e. more than 5 standard deviations (far rarer than 1% of the shocks of a normal distribution). This stretches credulity.


Now: before and after IV of Reg(high):

Parameter for Reg(high) shifts from +2.5 to +9.9 or +15.4 using IV. This time the sign is positive with IV.

Sign flips, very large parameters leading to impossible ceteris paribus effects on the dependent variable were downplayed: what mattered was that the researcher « dealt with endogeneity » and that « exogeneity is strongly rejected ».


Instrument selection is very often disguised « Multiple Testing » on x2hat

Researchers may try many, many instruments, until they obtain:

the desired sign and value of the parameter estimate, statistically significant and distinct from OLS.


5. IV with GMM on panel data: Stata xtabond2 by Roodman

Designed for a few periods, T<10.

Corrects the bias of the parameter of the autoregressive term y(i,t-1), but this bias is already quite small once T>10.

Too many instruments, and the J-test is not selective: this method holds the record for multiple testing!

Instrument: X(i,t-2) counts as one instrument for each estimation date (with T=10, 8 instruments for « one » lagged variable).
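
The instrument count can be sketched as follows. This is an illustration under stated assumptions (periods numbered 1..T, differenced equations estimated for t = 3..T, one separate instrument per lag and per period); the function name is hypothetical and is not part of xtabond2.

```python
def gmm_style_instrument_count(T, max_lags=None):
    """Count the instruments generated by one variable in difference GMM.

    Assumes periods 1..T, differenced equations for t = 3..T, and lags
    t-2 and deeper entering as separate instruments for each period.
    max_lags=1 keeps only X(i,t-2); None keeps every available lag.
    """
    count = 0
    for t in range(3, T + 1):
        available = t - 2                  # lags X(i,1) ... X(i,t-2)
        count += available if max_lags is None else min(available, max_lags)
    return count


print(gmm_style_instrument_count(10, max_lags=1))   # only X(i,t-2)
print(gmm_style_instrument_count(10))               # all available lags
```

With T=10, a single lag already yields 8 instruments, and using every available lag yields 36 for one variable: the count grows quadratically with T.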


GMM-panel estimators are very unusual IV estimators

Arellano and Bond (1991): First differences instrumented by lagged levels.

Arellano and Bover (1995), Blundell and Bond (1998): system GMM. First differences instrumented by lagged levels AND the level equation instrumented by first differences.

Read: Roodman (2009), “A Note on the Theme of Too Many Instruments”, Oxford Bulletin of Economics and Statistics.


GMM-System: a few limits

1. Multiplies by 2 the number of instruments, and by much more the number of combinations and of trials and errors (multiple testing).

2. Levels have more variance than first differences: the level equation performs better (in terms of « expected » coefficients). The debate on using only levels instrumented by first differences has never really occurred.

3. Levels often include trends: the method is not robust to unit roots and near-multicollinearity (explanatory variables with common trends).

4. Levels (including trends) are weakly correlated with first differences (which wipe out trends): both GMM panel estimators are weak-instrument methods.


Sargan: do not conclude the opposite (cf. Ph.D. candidate)

t-test, H0: b=0: the researcher is « happy » to reject the null, i.e. if p<0.05.

Sargan, J-test, over-identifying restrictions test:

H0: E(Z(it).e(it))=0: the researcher is « happy » NOT to reject the null, i.e. if p>0.05.


Selecting IV with GMM using the difference of Sargan (J.-B. Chatelain, Economics Letters, 2007)

Upwards testing.

Begin with a small set of the farthest lags as instruments (m=k+1):

X(i,t-4).

Then add X(i,t-3) or X(i,t-2) one by one, choosing whichever minimizes the Sargan statistic or maximizes its p-value.


Difference of Sargan: H0 (the m instruments are exogenous) is crucial for power.

J(m instruments): H0: the m instruments are exogenous.

J(m+1 instruments) – J(m instruments):

H0’: the (m+1)th instrument is exogenous; alternative HA’: it is not.

Both the null H0’ AND the alternative HA’ are conditional on H0: the m instruments are exogenous.

Upward procedures are correct; downward procedures may deliver inconsistent outcomes when H0 is not true.


In Xtabond2

The difference of Sargan test of group1 added to group2 is given,

but the difference of Sargan test of group2 added to group1 is also given.

One of the two tests is likely to violate H0, which should hold under both the null H0’ and the alternative HA’; hence that test has little power.


Why? The instruments grouped in the J-test are too heterogeneous in their degree of exogeneity.

The difference of J-test has the following tendency:

m=4, a very exogenous set of instruments:

it only accepts an additional instrument which is very exogenous (an elitist group selects a high-quality new member to remain at the top).

m=7, including 4 very exogenous and 3 poorly exogenous instruments: it accepts an 8th which is moderately exogenous (a mixed group improves with a medium-quality additional member).


Difference of Sargan procedure: it depends on the starting set

If the initial set of instruments has a p-value of 80%,

the final set ends with a p-value of at most 65%.

If the initial set of instruments has a p-value of 20%, the procedure takes all lags for all variables and ends with a p-value of 5%.

Some samples are more restrictive for the J-test.


6. Sample size determination (number of observations), statistical power


Determine sample size

State the expected magnitude of the effect (size of the partial effect) on the dependent variable.

Decide on power (tolerating up to 20% + 5% errors):

(1 - proba of type II error) > 80%

for proba of type I error < 5%.

Regression rule of thumb:

N=10 observations for each additional covariate

(N=100 for k=10).
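
As a rough sketch of a sample size calculation, the snippet below gives the number of observations needed to detect a (partial) correlation r with 80% power at the 5% two-sided level. Measuring the effect by a correlation and using the Fisher z approximation are assumptions of this illustration, not the method of the slides.

```python
import math
from statistics import NormalDist


def sample_size_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect a correlation r (Fisher z method)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)            # 1 - P(type II error)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


for r in (0.1, 0.3, 0.5):
    print(f"expected r = {r}: N = {sample_size_for_correlation(r)}")
```

The required N rises steeply as the expected effect shrinks, which is why the expected magnitude has to be stated before the power can be chosen.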


Adding heterogeneous samples to reach statistical significance?

Y = a x + b + e for N1=20 observations.

Y = 0.x + b + e for N2=1000 observations.

Statistical significance may be gained if the mean point of sample 1 differs from the mean point of sample 2.

To correct: add a dummy for sample 2.

To gain statistical significance: omit this dummy.
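
The mechanism can be illustrated with simulated data. Everything in this sketch is hypothetical (slope a = 0.5, b = 0, unit-variance noise, sample 1 centred at x = 10 and sample 2 at x = 0, fixed random seed); the regression with a dummy is computed by centring each sample separately, which is equivalent by the Frisch-Waugh theorem.

```python
import math
import random


def slope_t_stat(x, y, n_params):
    """t-statistic of the OLS slope of y on x, with n - n_params dof."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    rss = sum((yi - my - slope * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    return slope / math.sqrt(rss / (n - n_params) / sxx)


def demean(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


rng = random.Random(0)
x1 = [rng.gauss(10, 1) for _ in range(20)]
y1 = [0.5 * xi + rng.gauss(0, 1) for xi in x1]   # sample 1: true slope 0.5
x2 = [rng.gauss(0, 1) for _ in range(1000)]
y2 = [rng.gauss(0, 1) for _ in x2]               # sample 2: true slope 0

# Pooled regression without a dummy: intercept + slope = 2 parameters.
t_pooled = slope_t_stat(x1 + x2, y1 + y2, n_params=2)
# With a dummy for sample 2: intercept + dummy + slope = 3 parameters.
t_dummy = slope_t_stat(demean(x1) + demean(x2), demean(y1) + demean(y2), n_params=3)
print(f"t without dummy: {t_pooled:.1f}, t with dummy: {t_dummy:.1f}")
```

Without the dummy, the slope mostly picks up the distance between the two mean points, so its t-statistic is very large even though 1000 of the 1020 observations have no effect at all.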


Beware of spurious policy advice following spurious inference when pooling heterogeneous individuals/countries

From the slide above, if statistical significance is obtained for the 2 groups together (because you omitted a dummy for group 2), you may recommend that a costly policy/treatment which is required for group 1 be extended to group 2.

Researchers may believe that an effect applying to a larger population is a greater contribution.


7. Number of useful covariates: k = 5 to 6

An interesting indicator (not the only one): the ordered contributions to R2, i.e. the differences R2(k+1) – R2(k) (remark: taken into account downwards in t-test power analysis).

Wage equation, within-transformed (fixed effects): T=7, N = 595 individuals.

NT-N-k = 3561 degrees of freedom: statistical significance is too easy to get!


The relative importance of regressors

1. Ordered contributions are one indicator. But some contributions may be very close, so that the ranking may overweight small differences.

2. Standardized parameters allow parameters to be compared across variables; but when they exceed 1, it is a signal of near-multicollinearity.


The relative importance of regressors, taking into account the number of observations and the standard error

1. Order variables by t-statistic (N is the same) or p-value, with the usual stars (*, **, ***).

2. Compute the power, or the probability of type II error, of the t-test for each regressor. In this case, the last contribution to R2, given by R2 (final regression) – R2 (regression omitting this variable), is one of the components of the power of the t-test for this variable in the multiple regression.


8. Panel Data: Within versus Between

Orthogonal spaces:

70% Between: averages over time of the cross sections, X(i.), dimension N: good for inference on time-invariant variables.

30% Within: deviations from these averages, dimension NT-N. Regression on within-transformed variables X(it) – X(i.) = fixed-effects models.

cov( x(it)-x(i.) , x(i.) ) = 0
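
The orthogonality of the two spaces can be verified on a toy balanced panel. The data and helper names below are invented for illustration only.

```python
def within_between(panel):
    """Split a balanced panel {unit: [x_i1, ..., x_iT]} into its two parts."""
    between, within = [], []
    for series in panel.values():
        mean_i = sum(series) / len(series)
        for value in series:
            between.append(mean_i)          # x(i.)
            within.append(value - mean_i)   # x(it) - x(i.)
    return between, within


def covariance(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


# Hypothetical panel: N = 3 units observed over T = 3 periods.
panel = {"A": [1.0, 2.0, 3.0], "B": [10.0, 12.0, 14.0], "C": [5.0, 5.0, 8.0]}
between, within = within_between(panel)
total = [v for series in panel.values() for v in series]

print("cov(within, between):", covariance(within, between))
print("between share of variance:", covariance(between, between) / covariance(total, total))
```

Because the covariance between the two components is zero, the total variance splits exactly into a between share and a within share, which is what the 70% / 30% figures describe.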


Weakness of Within-Fixed effects

It eliminates cov (x(it) , a(i) ) BUT:

Common trends remain even with T<10:

spurious regressions, trend-driven near-multicollinearity.

Try also first differences (but smaller variance).

BUT ALSO: a large share of the variance (the between variance) is left unexplained.


Specification minimizing the gap between Within and Between (sub-correlation matrices W = B)

Minimize the panel Hausman test statistic

while selecting regressors, for the null:

H0: b(within)=b(between)

If not rejected:

1. The within regression with trends is not spurious (same results in between/cross section).

2. The between regression does not face endogeneity: E[x(it) a(i)] = 0.

3. The between variance (often 70%) is explained.


Baltagi and Griffin (1983), N=18, T=19: complete analysis of variance of y = log(gasoline/head)

Between y(i.), dof = N-k-1 = 15:

Regressor       Beta (t-stat)     Diff-R2 ordered (%)    x 83.3% of var(y)
Log(car/N)      0.63 (7.15)       79.9                   66.5
Log(pgas/p)     -0.29 (-2.01)     +4.2                   +3.5
Intercept       0.77 (0.92)
R2                                = 84.1                 = 70.0
1-R2                              +15.9                  +13.3

Within y(it) – y(i.), dof = NT-N-k-1 = 321:

Regressor       Beta (t-stat)     Diff-R2 ordered (%)    x 17.7% of var(y)
Log(car/N)      0.61 (55.79)      91.7                   16.2
Log(pgas/p)     -0.35 (-7.47)     +1.2                   +0.2
Intercept       None
R2                                = 92.9                 = 16.4
1-R2                              +7.1                   +1.3

Correlation of within-transformed variables with the year trend: for the dependent variable, r = 0.915; for log(car/N), r = 0.86.


9. Time-invariant variables using panel data

Orthogonal spaces:

Between: averages over time of the cross sections, dimension N << NT – N

Valid space for inference on time-invariant Z(i), via the cancelling out of individual disturbances.


Regression in each between or within subspace.


Time Invariant

Y(it) = b X(it) + c Z(i) + a(i) + e(it)

If a(i) is a random individual effect

and cov ( X(it) , a(i) ) is non-zero (endogeneity),

then use: within = fixed effects.

But Z(i) – Z(i.) = 0: the within transformation eliminates time-invariant variables.

Between: cov (Z(i), a(i) ) non-zero is possible.

Y(i.) = b X(i.) + c Z(i) + a(i) + e(i.)


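Time-invariant variables – Mundlak pretest (JBC)

Y(it) = b X(it) + (bb - bw) X(i.)

+ c Z(i) + a(i) + e(it)

(bw: within estimate of b, bb: between estimate of b; once X(i.) is included, the coefficient on X(it) is bw.)

If H0: bb - bw = 0 is not rejected, X(i.) is exogenous with respect to a(i).

It could then be a valid « internal » instrument in the Hausman-Taylor estimator with time-invariant variables (but possibly a weak one?).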
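A minimal sketch of the Mundlak regression on a simulated balanced panel, leaving Z(i) out for brevity. All names (`slope`, `ols_two_regressors`, the simulated design with X correlated with a(i)) are assumptions of this example. It checks the algebra only: the coefficient on X(it) equals the within estimate and the coefficient on X(i.) equals the between estimate minus the within estimate. The pretest itself would also need the standard error of that second coefficient, which is not computed here.

```python
import random

random.seed(1)
N, T, b = 50, 6, 1.0          # hypothetical panel size and true slope
x, y = [], []
for _ in range(N):
    a_i = random.gauss(0.0, 1.0)              # individual effect a(i)
    # X is correlated with a(i): the endogeneity case.
    x_i = [0.8 * a_i + random.gauss(0.0, 1.0) for _ in range(T)]
    y_i = [b * v + a_i + random.gauss(0.0, 0.5) for v in x_i]
    x.append(x_i)
    y.append(y_i)

def mean(v):
    return sum(v) / len(v)

def slope(u, v):
    """Simple OLS slope of v on u (with intercept)."""
    mu, mv = mean(u), mean(v)
    return (sum((p - mu) * (q - mv) for p, q in zip(u, v))
            / sum((p - mu) ** 2 for p in u))

def ols_two_regressors(dep, u, v):
    """OLS slopes of dep on u and v, with an intercept (2x2 normal equations)."""
    md, mu, mv = mean(dep), mean(u), mean(v)
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    sud = sum((p - mu) * (r - md) for p, r in zip(u, dep))
    svd = sum((q - mv) * (r - md) for q, r in zip(v, dep))
    det = suu * svv - suv ** 2
    return (sud * svv - svd * suv) / det, (svd * suu - sud * suv) / det

xbar = [mean(r) for r in x]                    # X(i.)
ybar = [mean(r) for r in y]                    # Y(i.)
x_flat = [v for r in x for v in r]
y_flat = [v for r in y for v in r]
xbar_flat = [m for m in xbar for _ in range(T)]
x_dev = [v - m for v, m in zip(x_flat, xbar_flat)]

bw = slope(x_dev, y_flat)                      # within (fixed effects) estimate
bb = slope(xbar, ybar)                         # between estimate
coef_x, coef_xbar = ols_two_regressors(y_flat, x_flat, xbar_flat)
print(coef_x, bw)                              # equal up to rounding
print(coef_xbar, bb - bw)                      # equal up to rounding
```

In this simulated design the between estimate is pulled away from the within estimate by the correlation between X and a(i), so the coefficient on X(i.) is far from zero.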


10. Outliers – Graphs for detecting residual patterns. Anscombe's quartet: all summary statistics are identical, including t-stats.

| Property | Value |
|---|---|
| Mean of x in each case | 9 (exact) |
| Variance of x in each case | 11 (exact) |
| Mean of y in each case | 7.50 (to 2 decimal places) |
| Variance of y in each case | 4.122 or 4.127 (to 3 decimal places) |
| Correlation between x and y in each case | 0.816 (to 3 decimal places) |
| Linear regression line in each case | y = 3.00 + 0.500x (to 2 and 3 decimal places, respectively) |
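The table can be reproduced from Anscombe's (1973) published data. The sketch below uses only the Python standard library; the helper name `summarize` is an assumption of the example.

```python
from statistics import mean, variance

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarize(x, y):
    """Summary statistics and simple OLS line of y on x."""
    mx, my = mean(x), mean(y)
    vx, vy = variance(x), variance(y)          # sample variances (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    slope = cxy / vx
    return {
        "mean_x": mx, "var_x": vx, "mean_y": my, "var_y": vy,
        "corr": cxy / (vx * vy) ** 0.5,
        "intercept": my - slope * mx, "slope": slope,
    }

summaries = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}
for name, s in summaries.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
```

The four printed rows are nearly identical, although the four scatter plots look completely different. This is the argument for plotting residuals before reading the regression table.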


INFLUENCE: studentized residuals over 1.96; DFBetas (for each observation i); both divided by a standard error for each observation (i).
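As an illustration, externally studentized residuals and the DFBeta of the slope can be computed by hand for a simple regression. The sketch uses Anscombe's third data set (one outlier) and the standard leave-one-out formulas. The variable names and the restriction to a single regressor are choices made for this example, not the procedure behind the Burnside and Dollar graph.

```python
import math

# Anscombe's third data set: one outlier, (13, 12.74), at index 2.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

def fit(xs, ys):
    """Simple OLS: returns (intercept, slope)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return my - slope * mx, slope

n = len(x)
mx = sum(x) / n
sxx = sum((a - mx) ** 2 for a in x)
intercept, slope = fit(x, y)
resid = [b - intercept - slope * a for a, b in zip(x, y)]
sse = sum(e * e for e in resid)
leverage = [1.0 / n + (a - mx) ** 2 / sxx for a in x]

studentized, dfbeta_slope, dfbetas_slope = [], [], []
for a, e, h in zip(x, resid, leverage):
    # Residual standard deviation re-estimated without observation i
    # (n - 3 degrees of freedom: n - 1 observations, 2 parameters).
    s_i = math.sqrt((sse - e * e / (1.0 - h)) / (n - 3))
    studentized.append(e / (s_i * math.sqrt(1.0 - h)))
    # Change in the slope when observation i is deleted ...
    change = (a - mx) * e / (sxx * (1.0 - h))
    dfbeta_slope.append(change)
    # ... and the same change divided by the slope's standard error.
    dfbetas_slope.append(change * math.sqrt(sxx) / s_i)

flagged = [i for i, t in enumerate(studentized) if abs(t) > 1.96]
print(flagged)
```

The closed-form DFBeta equals the difference between the full-sample slope and the slope re-estimated without observation i, so no re-estimation loop is needed.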


Burnside Dollar DFbeta


What to do with studentized residuals? Robust estimates


Overfitting = too many estimated parameters k in order to capture outliers: high R2 in the estimation (learning) sample, large prediction errors in a validation sample (« out-of-estimation-sample » prediction).


11. Quadratic term / interaction term: graphs from Arcand et al.

No statistically significant effect during crisis (an effect is significant only where the confidence interval lies strictly below or strictly above zero). Never interpret the coefficients ceteris paribus:

a X + b X*X = (a + b X) * X

a X + b Y * X + c Y = (a + b Y) * X + c Y
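For the interaction case, the effect of X is (a + b Y), and its confidence interval depends on the variances of both coefficients and on their covariance. The sketch below assumes these estimates are already available. The numbers are purely hypothetical, and the function names and the 1.96 normal critical value are choices made for the illustration.

```python
import math

def marginal_effect(a, b, var_a, var_b, cov_ab, y_level, z=1.96):
    """Effect (a + b*Y) of X at a given level of Y in a*X + b*X*Y + c*Y,
    with an approximate 95% confidence interval."""
    effect = a + b * y_level
    se = math.sqrt(var_a + y_level ** 2 * var_b + 2.0 * y_level * cov_ab)
    return effect, effect - z * se, effect + z * se

def is_significant(low, high):
    """Significant only if the interval lies strictly on one side of zero."""
    return low > 0.0 or high < 0.0

# Hypothetical estimates and (co)variances, for illustration only.
a_hat, b_hat = 0.5, -0.2
var_a, var_b, cov_ab = 0.04, 0.01, -0.015

results = {}
for y_level in (0.0, 2.5, 5.0):
    effect, low, high = marginal_effect(a_hat, b_hat, var_a, var_b,
                                        cov_ab, y_level)
    results[y_level] = (effect, low, high, is_significant(low, high))
    print(y_level, round(effect, 3), round(low, 3), round(high, 3),
          is_significant(low, high))
```

Plotting the effect and its interval over the whole range of Y gives the kind of graph referred to above: the coefficient a alone describes the effect only at Y = 0.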


12. Correlation matrix inspection: omitted variable bias is bad, except when adding a highly collinear covariate or a « classical suppressor »

Y = a1 . x1 + a2 . x2 + e

If corr(y, x1) is below 0.1 in absolute value (« classical suppressor »): if possible, omit x1 in the regression.

If corr(x1, x2) is higher than 0.85 in absolute value: if possible, omit x1 OR x2 in the regression.


Ordinary least squares estimators, Yule (1898)

With standardized variables (mean = 0, standard deviation = 1), we get:

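\[
\begin{pmatrix} \hat\beta_{12} \\ \hat\beta_{13} \end{pmatrix}
= \begin{pmatrix} 1 & r_{23} \\ r_{23} & 1 \end{pmatrix}^{-1}
\begin{pmatrix} r_{12} \\ r_{13} \end{pmatrix}
= \frac{1}{1 - r_{23}^{2}}
\begin{pmatrix} r_{12} - r_{13}\, r_{23} \\ r_{13} - r_{12}\, r_{23} \end{pmatrix}
\]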
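The closed-form coefficients of the regression of x1 on x2 and x3 can be checked numerically: with the coefficients computed from the three correlations alone, the residual must be orthogonal to both regressors (the OLS normal equations). The simulated data and the helper names `standardize`, `cross_moment` and `yule_betas` are assumptions of this sketch.

```python
import random

def standardize(v):
    """Rescale a series to mean 0 and standard deviation 1."""
    n = len(v)
    m = sum(v) / n
    sd = (sum((a - m) ** 2 for a in v) / n) ** 0.5
    return [(a - m) / sd for a in v]

def cross_moment(u, v):
    """Mean cross-product; a correlation when both series are standardized."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

def yule_betas(r12, r13, r23):
    """Standardized OLS coefficients of x1 on x2 and x3."""
    d = 1.0 - r23 ** 2
    return (r12 - r13 * r23) / d, (r13 - r12 * r23) / d

random.seed(2)
n = 200
# Hypothetical data: x2 and x3 are strongly correlated regressors.
x2_raw = [random.gauss(0.0, 1.0) for _ in range(n)]
x3_raw = [0.9 * a + random.gauss(0.0, 0.4) for a in x2_raw]
x1_raw = [0.5 * a - 0.3 * b + random.gauss(0.0, 1.0)
          for a, b in zip(x2_raw, x3_raw)]
x1, x2, x3 = standardize(x1_raw), standardize(x2_raw), standardize(x3_raw)

r12, r13, r23 = cross_moment(x1, x2), cross_moment(x1, x3), cross_moment(x2, x3)
b12, b13 = yule_betas(r12, r13, r23)
resid = [a - b12 * b - b13 * c for a, b, c in zip(x1, x2, x3)]

# Both cross-moments are zero up to rounding: the normal equations hold.
print(cross_moment(resid, x2), cross_moment(resid, x3))
```

The denominator 1 - r23^2 is what inflates both coefficients when x2 and x3 are nearly collinear.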


A sign reversal or sign flip is more frequent when including a highly correlated regressor (high r23). Example: from the bivariate regression x1/x2 to the trivariate regression x1/x2 and x3:

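\[
\hat\beta_{12} = r_{12} > 0 \ \text{(bivariate)}, \qquad
\hat\beta_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{1 - r_{23}^{2}} < 0 \ \text{(trivariate)}
\iff 0 < r_{12} < r_{13}\, r_{23}
\]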


Quadratic or interaction terms: how the statistical significance target leads to spurious effects (example: growth, aid*policy and aid^2*policy)

[Equations lost in extraction: regressions of x1 (growth) on x2 = aid*policy and x3 = aid^2*policy, including terms rescaled by r23 = 0.92, and the correlation matrix R of (x1, x2, x3) with entries r12, r13 and r23.]


* Statistically significant at the 5% level, N=275 observations; Burnside Dollar (2000)

PIF(1.2)= 0.20/0.095 = 2.13

PIF(1.3)= -0.019/0.0046 = -4.15 (opposite sign)

r12 = 0.13, test r12=0 not rejected

r13 = 0.06, test r13=0 not rejected

r23 = 0.92


The (un-)stability of conditional independence

| Regression includes x3 | Does not reject r12=0 | Reject r12=0 |
|---|---|---|
| Does not reject β12=0 | No effect | Type I discordance |
| Reject β12=0 | Type II discordance (spurious) | Effect |


[Graph: critical regions in the (r12, r13) plane, for r23 = 0.95 and N = 102.]

Statistical significance is easy to obtain when r12 and r13 are close to zero and r23 > 0.85.
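This claim can be illustrated directly from the correlations, since the t-statistics of the bivariate and trivariate regressions are functions of r12, r13, r23 and N only. The sketch below assumes the usual OLS degrees of freedom (N - 2 for the simple regression, N - 3 for the regression with two regressors and an intercept), which may differ from the convention behind the graph. The correlation values r12 = 0.1 and r13 = -0.1 are hypothetical, while r23 = 0.95 and N = 102 follow the text.

```python
import math

def bivariate_t(r, n):
    """t-statistic of the slope in a simple regression with correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def trivariate_t(r12, r13, r23, n):
    """t-statistics of both slopes in the regression of x1 on x2 and x3."""
    d = 1.0 - r23 ** 2
    b12 = (r12 - r13 * r23) / d
    b13 = (r13 - r12 * r23) / d
    r_squared = b12 * r12 + b13 * r13
    # Assumed degrees of freedom: n - 3 (intercept and two slopes).
    se = math.sqrt((1.0 - r_squared) / ((n - 3) * d))
    return b12 / se, b13 / se

n, r23 = 102, 0.95
r12, r13 = 0.1, -0.1          # hypothetical: both close to zero

t_simple_12 = bivariate_t(r12, n)
t_simple_13 = bivariate_t(r13, n)
t12, t13 = trivariate_t(r12, r13, r23, n)
print(round(t_simple_12, 2), round(t_simple_13, 2))
print(round(t12, 2), round(t13, 2))
```

Neither simple correlation is close to significance, yet both coefficients are strongly significant, with opposite signs, once the two nearly collinear regressors enter together: a Type II discordance.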


Critical regions in the graph

Critical region (reject the null) of the t-test in the trivariate regression: inside the blue ellipse (feasible values of the correlation coefficients) and outside the red one.

Critical regions of the t-tests in the bivariate (simple) regressions: outside the central horizontal and vertical strips bounded by the red lines.

In the central small square, statistical significance is reached with the trivariate regression, except on the diagonal. By contrast, it is not reached in either bivariate regression (x1 with x2 or x1 with x3).


Nowadays, textbooks claim

« Near-multicollinearity is only a problem when statistical significance is lost (estimated standard errors are too large). »

Frisch (1933), Tinbergen (1939) and Tobin (1950): even though it is statistically significant, one of the near-multicollinear variables should be omitted or its parameter constrained.


x1 = 0 x2 + ε1.2, R2 = 0 %

x2 = 0.99 x3 + ε2.3, R2 = 0.99^2 = 98 %

x1 = 0.14107 x3 + ε1.3, R2 = 2 %

x1 = -7.0181 x2 + 7.0889 x3 + ε1.23

x1 = 0 . x2 + 7.0889 . (x3 – 0.99 x2) + ε1.23

where (x3 – 0.99 x2) = ε3.2 is the residual of x3 regressed on x2, with var(ε3.2) = 1 – 0.99^2 = 0.02.

R2 = – 7.0181 . r12 + 7.0889 . r13

= – 7.0181 . 0 + 7.0889 . 0.14107 = 100 %
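The figures of this example follow from the three correlations r12 = 0, r13 = 0.14107 and r23 = 0.99 alone, using the formula for standardized OLS coefficients. A short check of the arithmetic (the variable names are introduced here for illustration):

```python
# Correlations of the example: x2 unrelated to x1, x3 barely related to x1,
# x2 and x3 nearly collinear.
r12, r13, r23 = 0.0, 0.14107, 0.99

d = 1.0 - r23 ** 2                      # variance of x3 not explained by x2
b12 = (r12 - r13 * r23) / d             # coefficient of x2 in x1 on (x2, x3)
b13 = (r13 - r12 * r23) / d             # coefficient of x3 in x1 on (x2, x3)

r2_x1_on_x2 = r12 ** 2                  # simple regression of x1 on x2
r2_x1_on_x3 = r13 ** 2                  # simple regression of x1 on x3
r2_x1_on_both = b12 * r12 + b13 * r13   # regression of x1 on x2 and x3

print(round(b12, 4), round(b13, 4))
print(round(r2_x1_on_x2, 2), round(r2_x1_on_x3, 2), round(r2_x1_on_both, 2))
```

Two regressors that explain 0 % and 2 % of the variance separately explain all of it jointly, only because r13 was chosen equal to the square root of 1 - r23^2.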


Estimators for the variance and the t-value

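[Formulas garbled in extraction: the estimated variances of the standardized coefficients of x2 and x3 and the corresponding t-values, written as functions of r12, r13, r23, the R2 of the trivariate regression and N.]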


Roodman graph: residuals = intermediate orthogonalization

The residuals

e = Aid - a Aid*Tropical + b

are almost identical to dummies for the Jordan, Egypt and Syria observations.


Residuals: orthogonal to the regressors subspace.


Three equivalent regressions: interaction term (1), with orthogonal regressors (2), which suggests outlier-driven results (3).

(1) Growth = b1*aid + b2*aid*tropical area, with statistically significant parameters.

(2) Growth = (b3 close to 0)*aid + b4*(aid - a*aid*tropical)

(3) Growth = b5*dummy(Egypt) + b6*dummy(Syria) + b7*dummy(Jordan)

Publication bias: journals publish (1), a general « interaction term » result, and never (3), the existence of a few outliers.


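\[
\hat\beta = \frac{\operatorname{cov}(x_1,\; x_3 - 0.99\, x_2)}{\operatorname{var}(x_3 - 0.99\, x_2)}
= \frac{r_{13} - 0.99\, r_{12}}{1 - r_{23}^{2}}
= \frac{0.14107}{1 - 0.99^{2}} = 7.0889
\]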

The orthogonal residual from a regression between highly correlated variables has indeed a small variance (RMSE). Used as a regressor explaining another dependent variable, its parameter (with its small standard deviation in the denominator) will mechanically be very high, with a high sensitivity to outliers.
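The mechanics can be sketched on simulated data: regressing y on (x2, x3) and regressing y on (x2, e), where e is the residual of x3 on x2, are the same regression. The coefficient on e equals the coefficient on x3, while e itself has a very small variance. The data and the helper names (`demean`, `dot`, `ols2`) are assumptions of this example. Here y is pure noise, unrelated to either regressor.

```python
import random

def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ols2(dep, u, v):
    """OLS coefficients of dep on two demeaned regressors u and v."""
    det = dot(u, u) * dot(v, v) - dot(u, v) ** 2
    cu = (dot(u, dep) * dot(v, v) - dot(v, dep) * dot(u, v)) / det
    cv = (dot(v, dep) * dot(u, u) - dot(u, dep) * dot(u, v)) / det
    return cu, cv

random.seed(3)
n = 200
x2 = demean([random.gauss(0.0, 1.0) for _ in range(n)])
x3 = demean([a + random.gauss(0.0, 0.1) for a in x2])   # nearly collinear
y = demean([random.gauss(0.0, 1.0) for _ in range(n)])  # unrelated noise

# Intermediate orthogonalization: residual of x3 regressed on x2.
a = dot(x2, x3) / dot(x2, x2)
e = [q - a * p for p, q in zip(x2, x3)]

c2, c3 = ols2(y, x2, x3)      # regression with the collinear pair
d2, d3 = ols2(y, x2, e)       # regression with orthogonal regressors

variance_ratio = dot(e, e) / dot(x3, x3)
print(c3, d3)                 # same coefficient on x3 and on its residual
print(d2, c2 + a * c3)        # coefficient on x2 absorbs the common part
print(variance_ratio)         # e keeps only a small share of var(x3)
```

Because e keeps so little of the variance of x3, a few observations with large values of e, such as the three countries mentioned earlier, can determine its coefficient.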


Spurious regression with near-multicollinearity: X2 has no effect on X1; X3 is highly correlated with X2.

Very interesting because:

X4 is an unobserved common cause of X3 and X2: X3 as a statistically significant « control variable ».

X3 = X2(t-1): dynamical model.

X3 = X2 squared, X3 = X2 cubed: non-linear model.

X3 = X2 * X4: interaction term, complementarity.


13. Conclusion: non-spurious robust statistics and science without authority support: Semmelweis, 1847. Veritas filia temporis: confirmed by Pasteur 40 years later.


http://www.deirdremccloskey.org/

http://sites.roosevelt.edu/sziliak/

http://www.mostlyharmlesseconometrics.com/

http://www.econ.vt.edu/faculty/2008vitas_research/Spanos/Spanos%20Research.html

Time invariant variables in panel data:

http://hal-paris1.archives-ouvertes.fr/docs/00/49/20/39/PDF/Chatelain_Ralf_Time_Invariant_Panel.pdf

Exogenous Instruments selection with Panel-GMM:

http://halshs.archives-ouvertes.fr/docs/00/11/72/94/PDF/el-inst3.pdf

Spurious regressions with near-multicollinearity:

http://mpra.ub.uni-muenchen.de/42533/1/MPRA_paper_42533.pdf

The good, the bad and the ugly: avoiding the pitfalls of IV estimation (Murray), 2006.