omitted variable bias - faculty of...

Omitted Variable Bias

OLS estimates the causal relationship from X to Y. It is possible that the direction of causality goes both ways: from X to Y and from Y to X.

(A) Simultaneity

E.g. impact of smoking on health: does smoking determine health outcomes, or do health outcomes determine smoking behaviour?

(B) Omitted variable bias

E.g. impact of schooling on earnings. The observed association between the outcome variable (Y) and the explanatory variable (X) can be misleading: it partly reflects omitted factors that are related to both variables.

If these factors could be measured and held constant in a regression, omitted variable bias would be eliminated; in practice this is difficult.

The innate ability of one's parents affects the earnings and schooling of children; we cannot perfectly control for ability, which is essentially unobservable.

Formally:

An assumption of OLS is that X and the error term u are not correlated. This assumption is violated if there are omitted variables which determine both X and Y and which we cannot control for.

In other words, β is not identified: we cannot deduce it from the joint distribution of X and Y.

Econ 495 - Econometric Review 52

• Here, we could add demographic characteristics such as marital status, geographic location, etc.

1.13.2 Omitted Variables Bias

• If we omit variables that do belong, then the OLS estimate will likely be biased: $E(\hat{\beta}_1) \neq \beta_1$

• For example, suppose that the true wage equation model was

$$wages_i = \beta_0 + \beta_1 educ_i + \beta_2 abil_i + u_i$$

but that, since we do not observe ability, we estimate

$$wages_i = \beta_0 + \beta_1 educ_i + v_i \qquad (13)$$

where $v_i = \beta_2 abil_i + u_i$

• Then, calling $\tilde{\beta}_1$ the estimate from equation (13), which omits ability, we can show that

$$E[\tilde{\beta}_1] = \beta_1 + \beta_2 \tilde{\delta}_1$$

where $\tilde{\delta}_1 = \dfrac{Cov(educ_i, abil_i)}{Var(educ_i)}$

• More generally, when $X_1$ and $X_2$ are correlated and $\beta_2 \neq 0$, the estimate $\tilde{\beta}_1$ will be biased.

• The sign of the bias depends on both the sign of $\beta_2$ and of $\tilde{\delta}_1$

              Corr(X1, X2) > 0    Corr(X1, X2) < 0
  β2 > 0      positive bias       negative bias
  β2 < 0      negative bias       positive bias

• In the case of the wage equation, because more ability leads to higher productivity and a higher wage, $\beta_2 > 0$. There is also reason to believe that educ and ability are positively correlated, so we would think that the OLS estimates from equation (13) are too large

• What to do about it? This is not an easy problem to correct if we do not have some measure of ability in our sample. (One has to take a quasi-experimental approach, using IV for example.)

• However, in terms of reporting results, one should be aware of the possibility of an “omitted variables” bias and qualify the results as likely “upward biased” or “downward biased”.
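The omitted variables bias formula can be checked numerically. The sketch below is only an illustration under assumed values: the coefficients, the data-generating process, and the helper `ols_slope` are hypothetical and are not taken from the course data. It simulates wages that depend on education and ability, runs the short regression that omits ability, and compares the resulting slope with β1 + β2·δ1.

```python
import random

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)
n = 20000
beta1, beta2 = 0.08, 0.5      # assumed "true" effects of educ and ability

abil = [rng.gauss(0, 1) for _ in range(n)]
# Education is positively correlated with ability
educ = [12 + 1.5 * a + rng.gauss(0, 2) for a in abil]
wage = [1.0 + beta1 * e + beta2 * a + rng.gauss(0, 1)
        for e, a in zip(educ, abil)]

beta1_short = ols_slope(educ, wage)    # regression that omits ability
delta1 = ols_slope(educ, abil)         # Cov(educ, abil) / Var(educ)
predicted = beta1 + beta2 * delta1     # value implied by the bias formula

print(round(beta1_short, 3), round(predicted, 3))
```

Because β2 > 0 and education and ability are positively correlated in this simulated sample, the short-regression slope lies above the assumed true β1, which matches the "positive bias" cell of the sign table.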

1.13.3 Heteroscedasticity

• When the variance of the error terms is not constant across observations, we have a problem of heteroscedasticity: $Var(u_i \mid educ_i) = \sigma_i^2 = \sigma^2(X_i)$

• The OLS estimates are still unbiased and consistent, but the usual standard errors of the estimates are biased if we have heteroscedasticity
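A small Monte Carlo experiment can illustrate both claims. This is a sketch under assumptions: the error variance is set to x² purely for illustration, and the function `one_sample` is a hypothetical helper. It compares the spread of the OLS slope across repeated samples with the average standard error reported by the constant-variance formula.

```python
import math
import random
import statistics

def one_sample(rng, n=200):
    """One sample with Var(u|x) = x^2; returns the OLS slope and its usual SE."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [1.0 + 2.0 * xi + xi * rng.gauss(0, 1) for xi in x]
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)   # formula that assumes constant variance
    return b1, se

rng = random.Random(1)
draws = [one_sample(rng) for _ in range(500)]
slopes = [b for b, _ in draws]

mean_slope = statistics.mean(slopes)                       # compare with the true slope 2.0
actual_sd = statistics.stdev(slopes)                       # sampling variability of the slope
mean_reported_se = statistics.mean([se for _, se in draws])

print(round(mean_slope, 2), round(actual_sd, 3), round(mean_reported_se, 3))
```

In this design the average slope stays close to the assumed true value, while the usual standard error understates how much the slope actually varies from sample to sample.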

Ways to address omitted variable bias:

(1) Experiments which randomly assign X so that it is no longer correlated with u

Example: a job training program run as a social experiment which randomly assigns training to a subset of individuals. Random assignment ensures that participation in the program is not correlated with omitted personal or social factors.

In practice randomization is often not feasible: it is not easy to run social experiments on a population (outside of a lab).
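The role of random assignment can be sketched with simulated data. Everything here is an assumption made for illustration: the "motivation" variable stands in for an omitted personal factor, and the effect size and functions `earnings` and `diff_in_means` are hypothetical. The comparison is between a self-selected program and a randomized one.

```python
import random
import statistics

rng = random.Random(2)
n = 20000
effect = 1.0    # assumed true effect of training on earnings

# Omitted personal factor that raises earnings and is never observed
motivation = [rng.gauss(0, 1) for _ in range(n)]

def earnings(trained, m):
    return 10.0 + effect * trained + 2.0 * m + rng.gauss(0, 1)

def diff_in_means(treat, y):
    treated = [yi for t, yi in zip(treat, y) if t == 1]
    control = [yi for t, yi in zip(treat, y) if t == 0]
    return statistics.mean(treated) - statistics.mean(control)

# Self-selection: more motivated individuals are more likely to enrol
self_selected = [1 if m + rng.gauss(0, 1) > 0 else 0 for m in motivation]
y_obs = [earnings(t, m) for t, m in zip(self_selected, motivation)]

# Experiment: a coin flip decides who is trained, independent of motivation
randomized = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
y_exp = [earnings(t, m) for t, m in zip(randomized, motivation)]

naive = diff_in_means(self_selected, y_obs)
experimental = diff_in_means(randomized, y_exp)
print(round(naive, 2), round(experimental, 2))
```

Under self-selection the difference in means mixes the training effect with the effect of motivation; under random assignment it is close to the assumed true effect.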

(2) Instrumental Variables

Suppose we have a third variable Z (“the instrument”) which is correlated with X but not with u. Hence Z is uncorrelated with the omitted variables and the regression error.

The instrumental variable technique allows us to estimate the coefficient of interest consistently (free of the bias caused by the omitted variables) without having data on the omitted variables.

Intuitively, instrumental variables uses only part of the variability in X (the part which is uncorrelated with u) to estimate the relationship between X and Y.

Classic example: estimation of demand and supply elasticities.

Observed data on quantities and prices reflect a set of equilibrium points on both the demand and supply curves. Consequently an OLS regression of quantities on prices fails to identify, that is trace out, either the supply or the demand relationship.

We can solve this problem by finding certain “curve shifters” (now called instrumental variables): additional factors which affect demand conditions without affecting supply conditions, and vice versa.

Example of linseed oil:
For the demand curve shifter we can use the price of substitute goods (cottonseed).
For the supply curve shifter we can use factors that affect costs (yield per acre), such as weather patterns.

Intuitively:
weather-related shifts (which shift the supply curve) are used to trace out the demand curve;
changes in the price of substitute goods are used to shift the demand curve so as to trace out the supply curve.

EXAMPLE ON SUPPLY/DEMAND [from: Stock and Watson, Introduction to Econometrics, chapter 12]

Simultaneous causality bias in the OLS regression of quantities on prices arises because price and quantity are determined by the interaction of demand and supply!

The interaction between demand and supply could reasonably produce a set of equilibrium points that is not useful for our purposes!

But what if only supply shifts?

TSLS estimates the demand curve by isolating shifts in price and quantity that arise from shifts in supply; Z is a variable that shifts supply but not demand.
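The supply-shifter idea can be sketched with a simulated market. The linear demand and supply curves, their slopes, and the shock distributions below are assumptions chosen for illustration, not values from Stock and Watson. Z shifts supply only, so the ratio Cov(Z, Q) / Cov(Z, P) should recover the demand slope, while OLS of Q on P should not.

```python
import random

def cov(a, b):
    """Sample covariance (divisor n) of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

rng = random.Random(3)
n = 20000
demand_slope = -1.0    # assumed true demand slope

prices, quantities, shifters = [], [], []
for _ in range(n):
    z = rng.gauss(0, 1)      # supply shifter (e.g. weather), excluded from demand
    u_d = rng.gauss(0, 1)    # unobserved demand shock
    u_s = rng.gauss(0, 1)    # unobserved supply shock
    # Demand: q = 10 + demand_slope * p + u_d
    # Supply: q = p + z + u_s
    # Setting demand equal to supply gives the equilibrium price:
    p = (10 + u_d - z - u_s) / (1 - demand_slope)
    q = 10 + demand_slope * p + u_d
    prices.append(p)
    quantities.append(q)
    shifters.append(z)

ols = cov(prices, quantities) / cov(prices, prices)
iv = cov(shifters, quantities) / cov(shifters, prices)
print(round(ols, 2), round(iv, 2))
```

In this setup the OLS slope is a blend of the two curves, whereas the estimate based on the supply shifter is close to the assumed demand slope.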

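Instrumental Variables Method

First stage estimation: regress X on the instrument Z,

$$X_i = \pi_0 + \pi_1 Z_i + v_i$$

Obtain predicted values:

$$\hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i$$

The predicted value, $\hat{X}_i$, depends only on $Z_i$ and hence is not correlated with $u_i$

Second stage estimation: regress Y on the predicted value,

$$Y_i = \beta_0 + \beta_1 \hat{X}_i + u_i$$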
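The two stages can be carried out by hand on simulated data. This is a minimal sketch, assuming a single instrument, a single endogenous regressor, and a made-up data-generating process; the helper `ols` and all coefficient values are hypothetical.

```python
import random

def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

rng = random.Random(4)
n = 20000
beta1 = 2.0    # assumed true coefficient on X

z = [rng.gauss(0, 1) for _ in range(n)]    # instrument
w = [rng.gauss(0, 1) for _ in range(n)]    # omitted variable, never observed
x = [zi + wi + rng.gauss(0, 1) for zi, wi in zip(z, w)]
y = [1.0 + beta1 * xi + 3.0 * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]

pi0, pi1 = ols(z, x)                     # first stage: X on Z
x_hat = [pi0 + pi1 * zi for zi in z]     # predicted values of X
_, beta1_2sls = ols(x_hat, y)            # second stage: Y on predicted X
_, beta1_ols = ols(x, y)                 # OLS of Y on X, for comparison

print(round(beta1_ols, 2), round(beta1_2sls, 2))
```

Running the second stage by hand reproduces the point estimate only; the standard errors printed by a manual second-stage regression are not the correct 2SLS standard errors, which is one reason to use a dedicated command such as `ivreg` in practice.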


4 Instrumental Variables

4.1 Single endogenous variable – One continuous instrument

• Instrumental Variables (IV) estimation is used when a model

$$Y = \beta_0 + \beta_1 X + u \qquad (1)$$

has an endogenous X, that is, whenever $Cov(X, u) \neq 0$

• In other words, IV can be used to address the problem of omitted variable bias

• But what is an instrumental variable?

• In order for a variable, Z, to serve as a valid instrument for X, the following must be true

• Assumption 1: Exclusion Restriction. The instrument must be exogenous, that is, uncorrelated with the error term: $Cov(Z, u) = 0$

• Assumption 2: Instrument Relevance. The instrument must be correlated with the endogenous variable X, that is, $Cov(Z, X) \neq 0$

• How do we know that Z is a valid instrument?

• The main problem is that we have to use common sense and economic theory to decide if it makes sense to assume $Cov(Z, u) = 0$

• In the case of multiple instruments, we can use the overid test below

• However, we can test whether $Cov(Z, X) \neq 0$

• We simply test $H_0: \pi_1 = 0$ in the regression

$$X = \pi_0 + \pi_1 Z + v \qquad (2)$$

• This regression is called the first-stage regression

• Card (1995) has used proximity to a four-year college, nearc4, as an instrument for education

. reg educ nearc4 exper expersq black south smsa smsa66 reg661-reg668 ;

      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F( 15,  2994) =  182.13
       Model |  10287.6179    15  685.841194           Prob > F      =  0.0000
    Residual |  11274.4622  2994  3.76568542           R-squared     =  0.4771
-------------+------------------------------           Adj R-squared =  0.4745
       Total |  21562.0801  3009  7.16586243           Root MSE      =  1.9405

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nearc4 |   .3198989   .0878638     3.64   0.000     .1476194    .4921785
       exper |  -.4125334   .0336996   -12.24   0.000    -.4786101   -.3464566
     expersq |   .0008686   .0016504     0.53   0.599    -.0023674    .0041046
       black |  -.9355287   .0937348    -9.98   0.000     -1.11932   -.7517377
       south |  -.0516126   .1354284    -0.38   0.703    -.3171548    .2139296
        smsa |   .4021825   .1048112     3.84   0.000     .1966732    .6076918
      smsa66 |   .0254805   .1057692     0.24   0.810    -.1819071    .2328682
      reg661 |   -.210271   .2024568    -1.04   0.299    -.6072395    .1866975
      reg662 |  -.2889073   .1473395    -1.96   0.050    -.5778042   -.0000105
      reg663 |  -.2382099   .1426357    -1.67   0.095    -.5178838    .0414639
      reg664 |   -.093089   .1859827    -0.50   0.617    -.4577559    .2715779
      reg665 |  -.4828875   .1881872    -2.57   0.010    -.8518767   -.1138982
      reg666 |  -.5130857   .2096352    -2.45   0.014    -.9241293   -.1020421
      reg667 |  -.4270887   .2056208    -2.08   0.038    -.8302611   -.0239163
      reg668 |   .3136204   .2416739     1.30   0.194    -.1602434    .7874841
       _cons |   16.84852   .2111222    79.80   0.000     16.43456    17.26248
------------------------------------------------------------------------------

. test nearc4;

 ( 1)  nearc4 = 0

       F(  1,  2994) =   13.26
            Prob > F =    0.0003
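As a quick arithmetic check on the first-stage output above: with a single instrument, the F-statistic for H0: π1 = 0 is the square of the instrument's t-statistic. The snippet below only reuses the coefficient and standard error printed for nearc4; the threshold of 10 is the common rule of thumb for weak instruments, and the variable names are illustrative.

```python
# Numbers copied from the nearc4 row of the first-stage regression
coef_nearc4 = 0.3198989
se_nearc4 = 0.0878638

t_stat = coef_nearc4 / se_nearc4
f_stat = t_stat ** 2       # with one restriction, F equals t squared
weak = f_stat < 10         # rule-of-thumb threshold for a weak instrument

print(round(t_stat, 2), round(f_stat, 2), weak)   # 3.64 13.26 False
```

The result matches the F(1, 2994) = 13.26 reported by the `test nearc4` command.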


• Rule-of-thumb: you need to worry about weak instruments if the first-stage F-statistic is less than 10

• Given equation (1) and our assumptions 1 and 2

$$Cov(Z, Y) = Cov[Z, (\beta_0 + \beta_1 X + u)] = \beta_1 Cov(Z, X) + Cov(Z, u),$$

so, using $Cov(Z, u) = 0$ (Assumption 1) and $Cov(Z, X) \neq 0$ (Assumption 2),

$$\beta_1^{IV} = \frac{Cov(Z, Y)}{Cov(Z, X)}$$
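The same covariance algebra shows why OLS fails when X is endogenous and gives a second reading of the IV ratio. The symbols follow equations (1) and (2); the "reduced form" label for the regression of Y on Z is terminology assumed for this sketch rather than something defined in these notes.

```latex
\begin{aligned}
\operatorname{plim}\,\hat{\beta}_1^{OLS}
  &= \frac{Cov(X, Y)}{Var(X)}
   = \beta_1 + \frac{Cov(X, u)}{Var(X)} \neq \beta_1
   \quad \text{when } Cov(X, u) \neq 0, \\[4pt]
\beta_1^{IV}
  &= \frac{Cov(Z, Y)}{Cov(Z, X)}
   = \frac{Cov(Z, Y)/Var(Z)}{Cov(Z, X)/Var(Z)}
   = \frac{\text{slope of } Y \text{ on } Z \ (\text{reduced form})}
          {\text{slope of } X \text{ on } Z \ (\text{first stage, } \pi_1)} .
\end{aligned}
```

Read this way, a first-stage slope close to zero puts a very small number in the denominator, which is one way to see why weak instruments are a concern.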


. ivreg lwage (educ=nearc4) exper expersq black south smsa smsa66 reg661-reg668 ;

Instrumental variables (2SLS) regression

       Model |  141.146813    15  9.40978752           Prob > F      =  0.0000
    Residual |  451.494832  2994  .150799877           R-squared     =  0.2382
-------------+------------------------------           Adj R-squared =  0.2343
       Total |  592.641645  3009  .196956346           Root MSE      =  .38833

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1315038   .0549637     2.39   0.017     .0237335    .2392742
       exper |   .1082711   .0236586     4.58   0.000     .0618824    .1546598
     expersq |  -.0023349   .0003335    -7.00   0.000    -.0029888    -.001681
       black |  -.1467757   .0538999    -2.72   0.007    -.2524603   -.0410912
       south |  -.1446715   .0272846    -5.30   0.000      -.19817    -.091173
        smsa |   .1118083    .031662     3.53   0.000     .0497269    .1738898
      smsa66 |   .0185311   .0216086     0.86   0.391    -.0238381    .0609003
      reg661 |  -.1078142   .0418137    -2.58   0.010    -.1898007   -.0258278
      reg662 |  -.0070465   .0329073    -0.21   0.830    -.0715696    .0574767
      reg663 |   .0404445   .0317806     1.27   0.203    -.0218694    .1027585
      reg664 |  -.0579172   .0376059    -1.54   0.124    -.1316532    .0158189
      reg665 |   .0384577   .0469387     0.82   0.413    -.0535777     .130493
      reg666 |   .0550887   .0526597     1.05   0.296    -.0481642    .1583416
      reg667 |    .026758   .0488287     0.55   0.584    -.0689832    .1224992
      reg668 |  -.1908912   .0507113    -3.76   0.000    -.2903238   -.0914586
       _cons |   3.773965    .934947     4.04   0.000     1.940762    5.607169
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq black south smsa smsa66 reg661 reg662 reg663
               reg664 reg665 reg666 reg667 reg668 nearc4
------------------------------------------------------------------------------

• Notice that $\hat{\beta}^{IV}_{educ} = 0.132 > \hat{\beta}^{OLS}_{educ} = 0.075$ (see LATE effect below)

• Which estimator should we prefer, IV or OLS?


4.2 Single endogenous variable – more than one continuous instrument

• Consider the following structural model

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u_1 \qquad (6)$$

where $X_1$ is an endogenous variable and $X_2$ is an exogenous variable

• Suppose now that we have two exogenous variables excluded from equation (6)

$$X_1 = \pi_0 + \pi_1 Z_1 + \pi_2 Z_2 + v \qquad (7)$$

where $Z_1$ and $Z_2$ are valid instruments in that they do not appear in the structural model and are uncorrelated with the structural error term $u_1$, but are correlated with $X_1$

• With more than one instrument, the IV estimator is also called the two-stage least squares (2SLS) estimator

• In our returns to education example, we can add proximity to a two-year college, nearc2

. reg educ nearc4 nearc2 exper expersq black south smsa smsa66 reg661-reg668 ;

       Model |  10297.1164    16  643.569774           Prob > F      =  0.0000
    Residual |  11264.9637  2993  3.76377002           R-squared     =  0.4776
-------------+------------------------------           Adj R-squared =  0.4748
       Total |  21562.0801  3009  7.16586243           Root MSE      =    1.94

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nearc4 |   .3205819   .0878425     3.65   0.000      .148344    .4928197
      nearc2 |   .1229986   .0774256     1.59   0.112    -.0288142    .2748114
       exper |  -.4122915   .0336914   -12.24   0.000    -.4783521   -.3462309
     expersq |   .0008479     .00165     0.51   0.607    -.0023874    .0040832
       black |  -.9451729   .0939073   -10.06   0.000    -1.129302   -.7610434
       south |  -.0419115   .1355316    -0.31   0.757    -.3076561    .2238331
        smsa |   .4013708   .1047858     3.83   0.000     .1959113    .6068303
      smsa66 |   .0000782   .1069445     0.00   0.999    -.2096139    .2097704
      reg661 |  -.1687829   .2040832    -0.83   0.408    -.5689405    .2313747
      reg662 |   -.269031   .1478324    -1.82   0.069    -.5588944    .0208325
      reg663 |  -.1902114   .1457652    -1.30   0.192    -.4760216    .0955987
      reg664 |   -.037715   .1891745    -0.20   0.842    -.4086403    .3332102
      reg665 |  -.4371387   .1903306    -2.30   0.022    -.8103307   -.0639467
      reg666 |  -.5022265   .2096933    -2.40   0.017    -.9133841   -.0910688
      reg667 |  -.3775317    .207922    -1.82   0.070    -.7852162    .0301529
      reg668 |   .3820043   .2454171     1.56   0.120    -.0991991    .8632076
       _cons |   16.77306   .2163481    77.53   0.000     16.34885    17.19727
------------------------------------------------------------------------------

. predict peduc;
(option xb assumed; fitted values)

. predict reseduc, res;

. test nearc4=nearc2=0;

 ( 1)  nearc4 - nearc2 = 0
 ( 2)  nearc4 = 0

       F(  2,  2993) =    7.89
            Prob > F =    0.0004

• Here nearc2 is a weak instrument (its first-stage t-statistic is only 1.59), and the joint first-stage F-statistic of 7.89 no longer passes the rule-of-thumb threshold of 10, so we would prefer the IV using only nearc4

• In a more general case, we could use either Z1 or Z2 as an instrument

INSTITUTIONS AND ECONOMIC DEVELOPMENT

Notes from: “Colonial origins of comparative development” (Acemoglu et al.)

What are the fundamental causes of the large differences in income per capita across countries?

Differences in institutions and property rights have received attention. This view receives support from cross-country correlations between measures of property rights and economic development.

At some level it is obvious that institutions matter: North and South Korea, East and West Germany. One part of the country stagnated under central planning and collective ownership while the other prospered with private property and a market economy.

To estimate the impact of institutions on economic performance we need a source of exogenous variation in institutions (an instrument).

The authors propose a theory of institutional differences among countries colonized by Europeans and exploit this theory to derive a possible source of exogenous variation. The theory rests on three premises.

(1) Different types of colonization policies created different sets of institutions

At one extreme: colonizers did not settle and set up extractive institutions. They did not introduce protection for private property and did not provide checks and balances against government expropriation. The main purpose was to transfer as much of the resources of the colony as possible to the colonizer. Examples: Latin America and the Belgian Congo.

At the other extreme: colonizers settled and replicated European institutions, with a strong emphasis on private property and checks against government power. Examples: Australia, New Zealand, Canada and the U.S.

(2) Colonization strategy was influenced by the feasibility of settlements

In places where the disease environment was not favorable to European settlement, the formation of the extractive state was more likely.

(3) The colonial state and institutions persisted even after independence.

Based on these three premises: use the mortality rates of the first European settlers as an instrument for current institutions in these countries

settler mortality → settlements → early institutions → current institutions → current economic performance

[Figure: scatter plot of log GDP per capita, PPP, 1995 (vertical axis) against log of settler mortality (horizontal axis); points labelled by country code.]

Colonies where Europeans faced higher mortality rates are today substantially poorer than colonies that were healthy for Europeans.

The theory implies that this relationship reflects the effect of settler mortality working through the institutions brought by Europeans. It assumes there is no direct effect of settler mortality on economic performance today.

Under these assumptions: regress current performance on current institutions and instrument the latter by settler mortality rates.

Focus on property rights and checks against government power: use an index of protection against risk of expropriation as a proxy for institutions.

Estimation Strategy

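The outcome is income per capita in country i.

The explanatory variable of interest is protection against expropriation (institutions).

The regression also includes a vector of other control variables (geography, legal origins).

Protection against expropriation is not an exogenous variable: many omitted variables determine both institutions and income, so an OLS regression would suffer from omitted variable bias.

First stage estimation: regress protection against expropriation on the settler mortality rate (and the controls).

Second stage estimation: regress income per capita on the predicted value of protection against expropriation from the first stage (and the controls).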
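The two stages named above can be sketched in equations. The notation is assumed for this sketch: $y_i$ for income per capita, $R_i$ for protection against expropriation, $M_i$ for settler mortality, $X_i$ for the vector of controls, and the Greek letters for coefficients. The slide may use different symbols.

```latex
\begin{aligned}
\text{First stage:}  \quad & R_i = \zeta + \beta \log M_i + X_i'\delta + v_i \\
\text{Second stage:} \quad & \log y_i = \mu + \alpha \hat{R}_i + X_i'\gamma + \varepsilon_i
\end{aligned}
```

Here $\hat{R}_i$ is the fitted value from the first stage, and $\alpha$ is the coefficient of interest. The exclusion restriction is that $\log M_i$ affects $\log y_i$ only through $R_i$, which corresponds to the assumption of no direct effect of settler mortality on economic performance today.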

[Figure: scatter plot of log GDP per capita, PPP, 1995 (vertical axis) against average expropriation risk 1985-95 (horizontal axis); points labelled by country code.]

[Figure: scatter plot with log GDP per capita, PPP, 1995 on the vertical axis; points labelled by country code.]

[Figure: scatter plot with average expropriation risk 1985-95 on the vertical axis; points labelled by country code.]

Notes from “The long-term effects of Africa’s slave trade” (Nunn)

Africa’s economic performance in the second half of the 20th century has been very poor.

One explanation for Africa’s underdevelopment is its history of extraction, characterized by two events: the slave trades and colonialism.

Estimation Equation

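The dependent variable is (log) per capita GDP in country i.

The regressor of interest is slave exports normalized by land area, ln(exports/area).

The controls are a vector of variables reflecting the origin of the colonizer prior to independence and a vector of variables reflecting geography and climate.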
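The estimating equation described by these variable definitions can be sketched as follows. The symbols are assumptions made for this sketch: $C_i$ for the colonizer variables, $X_i$ for the geography and climate variables, and the coefficient names. Only $\ln y$ and $\ln(\text{exports}/\text{area})$ appear by name in the table reproduced in these notes.

```latex
\ln y_i = \beta_0 + \beta_1 \ln\!\left(\frac{\text{exports}_i}{\text{area}_i}\right)
          + C_i'\delta + X_i'\gamma + \varepsilon_i
```

The coefficient of interest is $\beta_1$; a negative value means that countries which exported more slaves per unit of land area have lower income today, holding the controls fixed.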

THE LONG-TERM EFFECTS OF AFRICA’S SLAVE TRADES 155

TABLE III
RELATIONSHIP BETWEEN SLAVE EXPORTS AND INCOME

Dependent variable is log real per capita GDP in 2000, ln y

                             (1)         (2)         (3)         (4)         (5)         (6)

ln(exports/area)          −0.112***   −0.076***   −0.108***   −0.085**    −0.103***   −0.128***
                          (0.024)     (0.029)     (0.037)     (0.035)     (0.034)     (0.034)
Distance from equator                  0.016      −0.005       0.019       0.023       0.006
                                      (0.017)     (0.020)     (0.018)     (0.017)     (0.017)
Longitude                              0.001      −0.007      −0.004      −0.004      −0.009
                                      (0.005)     (0.006)     (0.006)     (0.005)     (0.006)
Lowest monthly rainfall               −0.001       0.008       0.0001     −0.001      −0.002
                                      (0.007)     (0.008)     (0.007)     (0.006)     (0.008)
Avg max humidity                       0.009       0.008       0.009       0.015       0.013
                                      (0.012)     (0.012)     (0.012)     (0.011)     (0.010)
Avg min temperature                   −0.019      −0.039      −0.005      −0.015      −0.037
                                      (0.028)     (0.028)     (0.027)     (0.026)     (0.025)
ln(coastline/area)                     0.085**     0.092**     0.095**     0.082**     0.083**
                                      (0.039)     (0.042)     (0.042)     (0.040)     (0.037)
Island indicator                                              −0.398      −0.150
                                                              (0.529)     (0.516)
Percent Islamic                                               −0.008***   −0.006*     −0.003
                                                              (0.003)     (0.003)     (0.003)
French legal origin                                            0.755       0.643      −0.141
                                                              (0.503)     (0.470)     (0.734)
North Africa indicator                                         0.382      −0.304
                                                              (0.484)     (0.517)
ln(gold prod/pop)                                                          0.011       0.014
                                                                          (0.017)     (0.015)
ln(oil prod/pop)                                                           0.078***    0.088***
                                                                          (0.027)     (0.025)
ln(diamond prod/pop)                                                      −0.039      −0.048
                                                                          (0.043)     (0.041)
Colonizer fixed effects      Yes         Yes         Yes         Yes         Yes         Yes
Number obs.                  52          52          42          52          52          42
R2                           .51         .60         .63         .71         .77         .80

Notes. OLS estimates of (1) are reported. The dependent variable is the natural log of real per capita GDP in 2000, ln y. The slave export variable ln(exports/area) is the natural log of the total number of slaves exported from each country between 1400 and 1900 in the four slave trades normalized by land area. The colonizer fixed effects are indicator variables for the identity of the colonizer at the time of independence. Coefficients are reported with standard errors in brackets. ***, **, and * indicate significance at the 1%, 5%, and 10% levels.

for slave exports remains negative and significant, and the magnitude of the estimated coefficient actually increases.9

9. One may also be concerned that the inclusion of the countries in southern Africa—namely South Africa, Swaziland, and Lesotho—may also be biasing the results. As I report in the Appendix, the results are robust to also omitting this

Direction of causality

OLS estimates show there is a relationship between slave exports and current economic performance. It is still unclear whether the slave trades have a causal impact on current income.

Alternative explanation for the relationship: societies that were initially underdeveloped selected into the slave trades, and these societies continue to be underdeveloped today.

In that case we would observe a negative relationship between slave exports and current income even though the slave trades did not have any effect on subsequent economic development.

Two strategies to evaluate whether there is a causal effect of the slave trades on income:

(1) Using historic data: can evaluate the importance and characteristics of selection into the slave trades

(2) Find instruments for slave exports

Historical Evidence on Selection during the Slave Trades

Using data on initial population densities, check whether more or less prosperous areas selected into the slave trades.

Population density is a reasonable indicator of economic prosperity.

158 QUARTERLY JOURNAL OF ECONOMICS

FIGURE IV
Relationship between Initial Population Density and Slave Exports

obtained if civil wars or conflicts could be instigated (Barry 1992; Inikori 2003). As well, societies that were the most violent and hostile, and therefore the least developed, were often best able to resist European efforts to purchase slaves. For example, the slave trade in Gabon was limited because of the defiance and violence of its inhabitants toward the Portuguese. This resistance continued for centuries, and as a result the Portuguese were forced to concentrate their efforts along the coast further south (Hall 2005, pp. 60–64).

Using data on initial population densities, I check whether it was the more prosperous or less prosperous areas that selected into the slave trades. Acemoglu, Johnson, and Robinson (2002) have shown that population density is a reasonable indicator of economic prosperity. Figure IV shows the relationship between the natural log of population density in 1400 and ln(exports/area). The data confirm the historical evidence on selection during the slave trades.12 The figure shows that the parts of Africa that were

12. The relationship is similar if one excludes island and North African countries, or if one normalizes slave exports by population rather than land area.

The figure shows that the parts of Africa that were the most prosperous in 1400 (measured by population density) tend also to be the areas that were most impacted by the slave trades.

The evidence suggests that it was the societies that were the most prosperous, not the most underdeveloped, that selected into the slave trades.

It is therefore unlikely that the strong relationship between slave exports and current income is driven by selection.

Instrumental Variables

Use instruments that are correlated with slave exports but are uncorrelated with other country characteristics.

As instruments for slave exports, use the distances from each African country to the locations where the slaves were demanded.

Validity of these instruments relies on a presumption: although the location of demand influenced the location of supply, the location of supply did not influence the location of demand.

If sugar plantations were established in the West Indies because the West Indies were close to the western coast of Africa, the instruments are not valid.

If instead slaves were taken from western Africa because it was relatively close to the plantation economies in the West Indies, the instruments are valid.

Historical evidence suggests the latter to be true: the location of demand for African slaves was determined by a number of factors all unrelated to the supply of slaves.

In the West Indies and southern U.S., slaves were imported because of climates suitable for growing commodities such as sugar and tobacco.
The existence of gold and silver mines was a determinant of the demand for slaves in Brazil.
In the northern Sahara, Arabia, and Persia, slaves were needed to work in salt mines.
In the Red Sea area, slaves were used as pearl divers.


FIGURE V
Example Showing the Distance Instruments for Burkina Faso

4. The overland distance from a country’s centroid to the closest port of export for the Red Sea slave trade. The ports are Massawa, Suakin, and Djibouti.14

The instruments are illustrated in Figure V, which shows the four distances for Burkina Faso. The ports in each of the four slave trades are represented by different colored symbols, and the shortest distances by colored lines. Details of the construction of the instruments are given in the Appendix.15

The IV estimates are reported in Table IV. The first column reports estimates without control variables, the second column includes colonizer fixed effects, and the third and fourth columns include colonizer fixed effects and geography controls. In column (4), the sample excludes islands and North African countries.

14. For island countries, one cannot reach the ports of the Saharan or Red Sea slave trades by traveling overland. For these countries I use the sum of the sailing distance and overland distance.

15. An alternative strategy is to also include the distance from the centroid to the coast (which is also shown in Figure V) as an additional instrument, since this distance is part of the total distance to the markets in the Indian Ocean and trans-Atlantic slave trades. The results are essentially identical if this distance is also included as an additional instrument.


TABLE IV
ESTIMATES OF THE RELATIONSHIP BETWEEN SLAVE EXPORTS AND INCOME

                                 (1)               (2)               (3)               (4)

Second Stage. Dependent variable is log income in 2000, ln y

ln(exports/area)              −0.208***         −0.201***         −0.286*           −0.248***
                              (0.053)           (0.047)           (0.153)           (0.071)
                           [−0.51, −0.14]    [−0.42, −0.13]      [−∞, +∞]        [−0.62, −0.12]
Colonizer fixed effects          No                Yes               Yes               Yes
Geography controls               No                No                Yes               Yes
Restricted sample                No                No                No                Yes
F-stat                           15.4              4.32              1.73              2.17
Number of obs.                   52                52                52                42

First Stage. Dependent variable is slave exports, ln(exports/area)

Atlantic distance             −1.31***          −1.74***          −1.32*            −1.69**
                              (0.357)           (0.425)           (0.761)           (0.680)
Indian distance               −1.10***          −1.43***          −1.08             −1.57*
                              (0.380)           (0.531)           (0.697)           (0.801)
Saharan distance              −2.43***          −3.00***          −1.14             −4.08**
                              (0.823)           (1.05)            (1.59)            (1.55)
Red Sea distance              −0.002            −0.152            −1.22              2.13
                              (0.710)           (0.813)           (1.82)            (2.40)
F-stat                           4.55              2.38              1.82              4.01
Colonizer fixed effects          No                Yes               Yes               Yes
Geography controls               No                No                Yes               Yes
Restricted sample                No                No                No                Yes
Hausman test (p-value)           .02               .01               .02               .04
Sargan test (p-value)            .18               .30               .65               .51

Notes. IV estimates of (1) are reported. Slave exports ln(exports/area) is the natural log of the total number of slaves exported from each country between 1400 and 1900 in the four slave trades normalized by land area. The colonizer fixed effects are indicator variables for the identity of the colonizer at the time of independence. Coefficients are reported, with standard errors in brackets. For the endogenous variable ln(exports/area), I also report 95% confidence regions based on Moreira’s (2003) conditional likelihood ratio (CLR) approach. These are reported in square brackets. The p-value of the Hausman test is for the Wu–Hausman chi-squared test. ***, **, and * indicate significance at the 1%, 5%, and 10% levels. The “restricted sample” excludes island and North African countries. The “geography controls” are distance from equator, longitude, lowest monthly rainfall, avg max humidity, avg min temperature, and ln(coastline/area).
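The weak-instrument rule of thumb from the Card example earlier in these notes (first-stage F-statistic below 10) can be applied to the F-statistics listed in the first-stage panel of Table IV. This is a sketch only: the values are copied from that panel, and the variable names and the choice to apply the threshold here are illustrative rather than taken from Nunn's paper.

```python
# F-stats from the first-stage panel of Table IV, columns (1) to (4)
first_stage_f = {1: 4.55, 2: 2.38, 3: 1.82, 4: 4.01}
THRESHOLD = 10.0    # rule-of-thumb threshold for weak instruments

# Columns whose first-stage F-statistic falls below the threshold
weak_columns = [col for col, f in first_stage_f.items() if f < THRESHOLD]
print(weak_columns)   # [1, 2, 3, 4]
```

Every column falls below the threshold. That fits with the table reporting CLR confidence regions alongside the usual standard errors, and with the unbounded region in column (3).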

The first-stage estimates are reported in the bottom panel of the table. The coefficients for the instruments are generally negative, suggesting that the further a country was from slave markets, the fewer slaves it exported.16 The exception is the distance

16. The specifications assume a linear first-stage relationship. The estimates are similar if one also allows for a nonlinear relationship between slave exports

Possible Channels of Causality

Channels through which the slave trades may affect current economic development:

(1) Slave trades weakened ties between villages, discouraging the formation of larger communities and broader ethnic identities

Evidence shows that ethnic fractionalization reduces the provision of public goods (education, health facilities, access to water, transportation, infrastructure) important for economic development.


FIGURE VI
Relationship between Slave Exports and Current Ethnic Fractionalization

preliminary and exploratory. With only 52 observations it is not possible to pin down the precise channels and mechanism underlying the relationships with any reasonable degree of certainty. My strategy here is to simply investigate whether the data are consistent with the historic events described in Section II.

An important consequence of the slave trades was that they tended to weaken ties between villages, thus discouraging the formation of larger communities and broader ethnic identities. I explore whether the data are consistent with this channel by examining the relationship between slave exports and a measure of current ethnic fractionalization from Alesina et al. (2003). As shown in Figure VI, there is a strong positive relationship between the two variables.18 This is consistent with the historic accounts of the slave trades impeding the formation of broader ethnic identities.

This consequence of the slave trades is important because of the increasing evidence showing that ethnic fractionalization is an

18. The results are also similar if other measures of ethnic fractionalization are used.

(2) Slave trades weakened states and left them underdeveloped

There is a negative relationship between slave exports and 19th century state centralization.

This is consistent with the slave trades causing long-term political instability: weakened and fragmented states, and underdevelopment of political structures (institutions).


FIGURE VII
Relationship between Slave Exports and Nineteenth-Century State Development

growth between 1960 and 1995. Looking within Africa, Gennaioli and Rainer (2006) find that countries with ethnicities that had centralized precolonial state institutions today provide more public goods, such as education, health, and infrastructure.

Herbst (1997, 2000) also focuses on the importance of state development for economic success, arguing that Africa’s poor economic performance is a result of postcolonial state failure, the roots of which lie in the underdevelopment and instability of precolonial polities. Herbst (2000, chaps. 2–4) argues that because of a lack of significant political development during colonial rule, the limited precolonial political structures continued to exist after independence.19 As a result, Africa’s postindependence leaders inherited nation states that did not have the infrastructure necessary to extend authority and control over the whole country. Many states were, and still are, unable to collect taxes from their citizens, and as a result they are also unable to provide a minimum level of public goods and services.

19. On the continuity between Africa’s precolonial and postcolonial political systems also see Hargreaves (1969, p. 200).
