sandwich package - r · 2 bread sandwich ... object-oriented computation of sandwich estimators....

38
Package ‘sandwich’ July 26, 2017 Version 2.4-0 Date 2017-07-26 Title Robust Covariance Matrix Estimators Description Model-robust standard error estimators for cross- sectional, time series, clustered, panel, and longitudinal data. Depends R (>= 2.10.0) Imports stats, utils, zoo Suggests AER, car, geepack, lattice, lmtest, MASS, multiwayvcov, pcse, plm, pscl, scatterplot3d, stats4, strucchange, survival License GPL-2 | GPL-3 NeedsCompilation no Author Achim Zeileis [aut, cre], Thomas Lumley [aut], Susanne Berger [ctb], Nathaniel Graham [ctb] Maintainer Achim Zeileis <[email protected]> Repository CRAN Date/Publication 2017-07-26 19:36:29 UTC R topics documented: bread ............................................ 2 estfun ............................................ 3 InstInnovation ........................................ 4 Investment .......................................... 6 isoacf ............................................ 8 kweights ........................................... 9 lrvar ............................................. 10 meat ............................................. 11 NeweyWest ......................................... 12 PetersenCL ......................................... 14 PublicSchools ........................................ 15 1

Upload: truongnhan

Post on 11-May-2018

255 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

Package ‘sandwich’July 26, 2017

Version 2.4-0

Date 2017-07-26

Title Robust Covariance Matrix Estimators

Description Model-robust standard error estimators for cross-sectional, time series, clustered, panel, and longitudinal data.

Depends R (>= 2.10.0)

Imports stats, utils, zoo

Suggests AER, car, geepack, lattice, lmtest, MASS, multiwayvcov, pcse,plm, pscl, scatterplot3d, stats4, strucchange, survival

License GPL-2 | GPL-3

NeedsCompilation no

Author Achim Zeileis [aut, cre],Thomas Lumley [aut],Susanne Berger [ctb],Nathaniel Graham [ctb]

Maintainer Achim Zeileis <[email protected]>

Repository CRAN

Date/Publication 2017-07-26 19:36:29 UTC

R topics documented:bread . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2estfun . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3InstInnovation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4Investment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6isoacf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8kweights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9lrvar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10meat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11NeweyWest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12PetersenCL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14PublicSchools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

1

Page 2: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

2 bread

sandwich . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17vcovBS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18vcovCL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19vcovHAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22vcovHC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25vcovOPG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27vcovPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28vcovPL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30weightsAndrews . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32weightsLumley . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

Index 37

bread Bread for Sandwiches

Description

Generic function for extracting an estimator for the bread of sandwiches.

Usage

bread(x, ...)

Arguments

x a fitted model object.

... arguments passed to methods.

Value

A matrix containing an estimator for the expectation of the negative derivative of the estimatingfunctions, usually the Hessian. Typically, this should be an k× k matrix corresponding to k param-eters. The rows and columns should be named as in coef or terms, respectively.

The default method tries to extract vcov and nobs and simply computes their product.

References

Zeileis A (2006), Object-Oriented Computation of Sandwich Estimators. Journal of StatisticalSoftware, 16(9), 1–16. URL http://www.jstatsoft.org/v16/i09/.

See Also

lm, glm

Page 3: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

estfun 3

Examples

## linear regressionx <- sin(1:10)y <- rnorm(10)fm <- lm(y ~ x)

## bread: n * (x'x)^-1bread(fm)solve(crossprod(cbind(1, x))) * 10

estfun Extract Empirical Estimating Functions

Description

Generic function for extracting the empirical estimating functions of a fitted model.

Usage

estfun(x, ...)

Arguments

x a fitted model object.

... arguments passed to methods.

Value

A matrix containing the empirical estimating functions. Typically, this should be an n × k matrixcorresponding to n observations and k parameters. The columns should be named as in coef orterms, respectively.

The estimating function (or score function) for a model is the derivative of the objective functionwith respect to the parameter vector. The empirical estimating functions is the evaluation of the es-timating function at the observed data (n observations) and the estimated parameters (of dimensionk).

References

Zeileis A (2006), Object-Oriented Computation of Sandwich Estimators. Journal of StatisticalSoftware, 16(9), 1–16. URL http://www.jstatsoft.org/v16/i09/.

See Also

lm, glm

Page 4: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

4 InstInnovation

Examples

## linear regressionx <- sin(1:10)y <- rnorm(10)fm <- lm(y ~ x)

## estimating function: (y - x'beta) * xestfun(fm)residuals(fm) * cbind(1, x)

InstInnovation Innovation and Institutional Ownership

Description

Firm-level panel data on innovation and institutional ownership from 1991 to 1999 over 803 firms.The observations refer to different firms over different years.

Usage

data("InstInnovation")

Format

A data frame containing 6208 observations on 25 variables.

company factor. company names.

sales numeric. sales (in millions of dollars).

acompetition numeric. constant inverse Lerner index.

competition numeric. varying inverse Lerner index.

capital numeric. net stock of property, plant, and equipment.

cites integer. future cite-weighted patents.

precites numeric. presample average of cite-weighted patents.

dprecites factor. indicates zero precites.

patents integer. granted patents.

drandd factor. indicates a zero R&D stock.

randd numeric. R&D stock (in millions of dollars).

employment numeric. employment (in 1000s).

sp500 factor. membership of firms in the S&P500 index.

tobinq numeric. Tobin’s q.

value numeric. stock market value.

institutions numeric. proportion of stock owned by institutions.

industry factor. four-digit industry code.

Page 5: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

InstInnovation 5

year factor. estimation period.

top1 numeric. share of the largest institution.

quasiindexed numeric. share of "quasi-indexed" institutional owners.

nonquasiindexed numeric. share of "non-quasi-indexed" institutional owners.

transient numeric. share of "transient" institutional owners.

dedicated numeric. share of "dedicated" institutional owners.

competition4 numeric. varying inverse Lerner index in the firm’s four-digit industry.

subsample factor. subsample for the replication of columns 1–5 from Table 4 in Aghion et al.(2013).

Details

Aghion et al. (2013) combine several firm level panel datasets (e.g., USPTO, SEC and Compustat)to examine the role of institutional investors in the governance of innovation. Their baseline tomodel innovation is the Poisson model, but they also consider negative binomial models. Berger etal. (2017) argue that nonlinearities in the innovation process emerge in case that the first innovationis especially hard to obtain in comparison to succeeding innovations. Then, hurdle models offer auseful way that allows for a distinction between these two processes. Berger et al. (2017) show thatan extended analysis with negative binomial hurdle models differs materially from the outcomes ofthe single-equation Poisson approach of Aghion et al. (2013).

Institutional ownership (institutions) is defined as the proportion of stock owney by institutions.According to Aghion et al. (2013), an institutional owner is defined as an institution that files aForm 13-F with the Securities and Exchange Commission (SEC).

Future cite-weighted patents (cites) are used as a proxy for innovation. They are calculated usingultimately granted patent, dated by year of application, and weight these by future citations through2002 (see Aghion et al. (2013)).

The presample average of cite-weighted patents (precites) is used by Aghion et al. (2013) as a proxyfor unobserved heterogeneity, employing the "presample mean scaling" method of Blundell et al.(1999).

The inverse Lerner index in the firm’s three-digit industry is used as a time-varying measure forproduct market competition (competition), where the Lerner is calculated as the median gross mar-gin from the entire Compustat database in the firm’s three-digit industry (see Aghion et al. (2013)).A time-invariant measure for competition (acompetition) is constructed by averaging the Lernerover the sample period.

The classification of institutions into "quasiindexed", "transient" and "dedicated" follows Bushee(1998) and distinguishes between institutional investors based on their type of investing. Quasi-indexed institutions are do not trade much and are widely diversified, dedicated institution do nottrade much and have more concentrated holdings, and transient institutions often trade and havediversified holdings (see Aghion et al. (2013) and Bushee (1998)).

Source

Data and online appendix of Aghion et al. (2013).

Page 6: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

6 Investment

References

Aghion P, Van Reenen J, Zingales L (2013). “Innovation and Institutional Ownership.” The Ameri-can Economic Review, 103(1), 277–304. doi: 10.1257/aer.103.1.277

Berger S, Stocker H, Zeileis A (2017). “Innovation and Institutional Ownership Revisited: An Em-pirical Investigation with Count Data Models.” Empirical Economics, 52(4), 1675–1688. doi: 10.1007/s0018101611180

Blundell R, Griffith R, Van Reenen J (1999). “Market Share, Market Value and Innovation in aPanel of British Manufacturing Firms.” Review of Economic Studies, 66(3), 529–554.

Bushee B (1998). “The Influence of Institutional Investors on Myopic R&D Investment Behavior.”Accounting Review, 73(3), 655–679.

Examples

## Poisson models from Table I in Aghion et al. (2013)

## load data setdata("InstInnovation", package = "sandwich")

## log-scale variableInstInnovation$lograndd <- log(InstInnovation$randd)InstInnovation$lograndd[InstInnovation$lograndd == -Inf] <- 0

## regression formulasf1 <- cites ~ institutions + log(capital/employment) + log(sales) + industry + yearf2 <- cites ~ institutions + log(capital/employment) + log(sales) +

industry + year + lograndd + dranddf3 <- cites ~ institutions + log(capital/employment) + log(sales) +

industry + year + lograndd + drandd + dprecites + log(precites)

## Poisson modelstab_I_3_pois <- glm(f1, data = InstInnovation, family = poisson)tab_I_4_pois <- glm(f2, data = InstInnovation, family = poisson)tab_I_5_pois <- glm(f3, data = InstInnovation, family = poisson)

## one-way clustered covariancesvCL_I_3 <- vcovCL(tab_I_3_pois, cluster = InstInnovation$company)vCL_I_4 <- vcovCL(tab_I_4_pois, cluster = InstInnovation$company)vCL_I_5 <- vcovCL(tab_I_5_pois, cluster = InstInnovation$company)

## replication of columns 3 to 5 from Table I in Aghion et al. (2013)cbind(coef(tab_I_3_pois), sqrt(diag(vCL_I_3)))[2:4, ]cbind(coef(tab_I_4_pois), sqrt(diag(vCL_I_4)))[c(2:4, 148), ]cbind(coef(tab_I_5_pois), sqrt(diag(vCL_I_5)))[c(2:4, 148), ]

Investment US Investment Data

Page 7: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

Investment 7

Description

US data for fitting an investment equation.

Usage

data(Investment)

Format

An annual time series from 1963 to 1982 with 7 variables.

GNP nominal gross national product (in billion USD),

Investment nominal gross private domestic investment (in billion USD),

Price price index, implicit price deflator for GNP,

Interest interest rate, average yearly discount rate charged by the New York Federal Reserve Bank,

RealGNP real GNP (= GNP/Price),

RealInv real investment (= Investment/Price),

RealInt approximation to the real interest rate (= Interest - 100 * diff(Price)/Price).

Source

Table 15.1 in Greene (1993)

References

Greene W.H. (1993), Econometric Analysis, 2nd edition. Macmillan Publishing Company, NewYork.

Executive Office of the President (1984), Economic Report of the President. US Government Print-ing Office, Washington, DC.

Examples

## Willam H. Greene, Econometric Analysis, 2nd Ed.## Chapter 15## load data set, p. 411, Table 15.1data(Investment)

## fit linear model, p. 412, Table 15.2fm <- lm(RealInv ~ RealGNP + RealInt, data = Investment)summary(fm)

## visualize residuals, p. 412, Figure 15.1plot(ts(residuals(fm), start = 1964),

type = "b", pch = 19, ylim = c(-35, 35), ylab = "Residuals")sigma <- sqrt(sum(residuals(fm)^2)/fm$df.residual) ## maybe used df = 26 instead of 16 ??abline(h = c(-2, 0, 2) * sigma, lty = 2)

if(require(lmtest)) ## Newey-West covariances, Example 15.3

Page 8: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

8 isoacf

coeftest(fm, vcov = NeweyWest(fm, lag = 4))## Note, that the following is equivalent:coeftest(fm, vcov = kernHAC(fm, kernel = "Bartlett", bw = 5, prewhite = FALSE, adjust = FALSE))

## Durbin-Watson test, p. 424, Example 15.4dwtest(fm)

## Breusch-Godfrey test, p. 427, Example 15.6bgtest(fm, order = 4)

## visualize fitted seriesplot(Investment[, "RealInv"], type = "b", pch = 19, ylab = "Real investment")lines(ts(fitted(fm), start = 1964), col = 4)

## 3-d visualization of fitted modelif(require(scatterplot3d)) s3d <- scatterplot3d(Investment[,c(5,7,6)],

type = "b", angle = 65, scale.y = 1, pch = 16)s3d$plane3d(fm, lty.box = "solid", col = 4)

isoacf Isotonic Autocorrelation Function

Description

Autocorrelation function (forced to be decreasing by isotonic regression).

Usage

isoacf(x, lagmax = NULL, weave1 = FALSE)

Arguments

x numeric vector.lagmax numeric. The maximal lag of the autocorrelations.weave1 logical. If set to TRUE isoacf uses the acf.R and pava.blocks function from

the original weave package, otherwise R’s own acf and isoreg functions areused.

Details

isoacf computes the autocorrelation function (ACF) of x enforcing the ACF to be decreasing byisotonic regression. See also Robertson et al. (1988).

Value

isoacf returns a numeric vector containing the ACF.

Page 9: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

kweights 9

References

Lumley T & Heagerty P (1999), Weighted Empirical Adaptive Variance Estimators for CorrelatedData Regression. Journal of the Royal Statistical Society B, 61, 459–477.

Robertson T, Wright FT, Dykstra RL (1988), Order Restricted Statistical Inference. New York.Wiley.

See Also

weave, weightsLumley

Examples

x <- filter(rnorm(100), 0.9, "recursive")isoacf(x)acf(x, plot = FALSE)$acf

kweights Kernel Weights

Description

Kernel weights for kernel-based heteroskedasticity and autocorrelation consistent (HAC) covariancematrix estimators as introduced by Andrews (1991).

Usage

kweights(x, kernel = c("Truncated", "Bartlett", "Parzen","Tukey-Hanning", "Quadratic Spectral"), normalize = FALSE)

Arguments

x numeric.

kernel a character specifying the kernel used. All kernels used are described in An-drews (1991).

normalize logical. If set to TRUE the kernels are normalized as described in Andrews(1991).

Value

Value of the kernel function at x.

References

Andrews DWK (1991), Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Esti-mation. Econometrica, 59, 817–858.

Page 10: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

10 lrvar

See Also

kernHAC, weightsAndrews

Examples

curve(kweights(x, kernel = "Quadratic", normalize = TRUE),from = 0, to = 3.2, xlab = "x", ylab = "k(x)")

curve(kweights(x, kernel = "Bartlett", normalize = TRUE),from = 0, to = 3.2, col = 2, add = TRUE)

curve(kweights(x, kernel = "Parzen", normalize = TRUE),from = 0, to = 3.2, col = 3, add = TRUE)

curve(kweights(x, kernel = "Tukey", normalize = TRUE),from = 0, to = 3.2, col = 4, add = TRUE)

curve(kweights(x, kernel = "Truncated", normalize = TRUE),from = 0, to = 3.2, col = 5, add = TRUE)

lrvar Long-Run Variance of the Mean

Description

Convenience function for computing the long-run variance (matrix) of a (possibly multivariate)series of observations.

Usage

lrvar(x, type = c("Andrews", "Newey-West"), prewhite = TRUE, adjust = TRUE, ...)

Arguments

x numeric vector, matrix, or time series.

type character specifying the type of estimator, i.e., whether kernHAC for the An-drews quadratic spectral kernel HAC estimator is used or NeweyWest for theNewey-West Bartlett HAC estimator.

prewhite logical or integer. Should the series be prewhitened? Passed to kernHAC orNeweyWest.

adjust logical. Should a finite sample adjustment be made? Passed to kernHAC orNeweyWest.

... further arguments passed on to kernHAC or NeweyWest.

Details

lrvar is a simple wrapper function for computing the long-run variance (matrix) of a (possiblymultivariate) series x. First, this simply fits a linear regression model x ~ 1 by lm. Second, thecorresponding variance of the mean(s) is estimated either by kernHAC (Andrews quadratic spectralkernel HAC estimator) or by NeweyWest (Newey-West Bartlett HAC estimator).

Page 11: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

meat 11

Value

For a univariate series x a scalar variance is computed. For a multivariate series x the covariancematrix is computed.

See Also

kernHAC, NeweyWest, vcovHAC

Examples

set.seed(1)## iid series (with variance of mean 1/n)## and Andrews kernel HAC (with prewhitening)x <- rnorm(100)lrvar(x)

## analogous multivariate case with Newey-West estimator (without prewhitening)y <- matrix(rnorm(200), ncol = 2)lrvar(y, type = "Newey-West", prewhite = FALSE)

## AR(1) series with autocorrelation 0.9z <- filter(rnorm(100), 0.9, method = "recursive")lrvar(z)

meat A Simple Meat Matrix Estimator

Description

Estimating the variance of the estimating functions of a regression model by cross products of theempirical estimating functions.

Usage

meat(x, adjust = FALSE, ...)

Arguments

x a fitted model object.adjust logical. Should a finite sample adjustment be made? This amounts to multipli-

cation withn/(n− k)

wheren

is the number of observations andk

the number of estimated parameters.... arguments passed to the estfun function.

Page 12: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

12 NeweyWest

Details

For some theoretical background along with implementation details see Zeileis (2006).

Value

Ak × k

matrix corresponding containing the scaled cross products of the empirical estimating functions.

References

Zeileis A (2006), Object-Oriented Computation of Sandwich Estimators. Journal of StatisticalSoftware, 16(9), 1–16. URL http://www.jstatsoft.org/v16/i09/.

See Also

sandwich, bread, estfun

Examples

x <- sin(1:10)y <- rnorm(10)fm <- lm(y ~ x)

meat(fm)meatHC(fm, type = "HC")meatHAC(fm)

NeweyWest Newey-West HAC Covariance Matrix Estimation

Description

A set of functions implementing the Newey & West (1987, 1994) heteroskedasticity and autocorre-lation consistent (HAC) covariance matrix estimators.

Usage

NeweyWest(x, lag = NULL, order.by = NULL, prewhite = TRUE, adjust = FALSE,diagnostics = FALSE, sandwich = TRUE, ar.method = "ols", data = list(),verbose = FALSE)

bwNeweyWest(x, order.by = NULL, kernel = c("Bartlett", "Parzen","Quadratic Spectral", "Truncated", "Tukey-Hanning"), weights = NULL,prewhite = 1, ar.method = "ols", data = list(), ...)

Page 13: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

NeweyWest 13

Arguments

x a fitted model object.lag integer specifying the maximum lag with positive weight for the Newey-West

estimator. If set to NULL floor(bwNeweyWest(x, ...)) is used.order.by Either a vector z or a formula with a single explanatory variable like ~ z. The

observations in the model are ordered by the size of z. If set to NULL (the default)the observations are assumed to be ordered (e.g., a time series).

prewhite logical or integer. Should the estimating functions be prewhitened? If TRUEor greater than 0 a VAR model of order as.integer(prewhite) is fitted viaar with method "ols" and demean = FALSE. The default is to use VAR(1)prewhitening.

kernel a character specifying the kernel used. All kernels used are described in An-drews (1991). bwNeweyWest can only compute bandwidths for "Bartlett","Parzen" and "Quadratic Spectral".

adjust logical. Should a finite sample adjustment be made? This amounts to multipli-cation with n/(n− k) where n is the number of observations and k the numberof estimated parameters.

diagnostics logical. Should additional model diagnostics be returned? See vcovHAC fordetails.

sandwich logical. Should the sandwich estimator be computed? If set to FALSE only themiddle matrix is returned.

ar.method character. The method argument passed to ar for prewhitening (only, not forbandwidth selection).

data an optional data frame containing the variables in the order.by model. Bydefault the variables are taken from the environment which the function is calledfrom.

verbose logical. Should the lag truncation parameter used be printed?weights numeric. A vector of weights used for weighting the estimated coefficients of

the approximation model (as specified by approx). By default all weights are 1except that for the intercept term (if there is more than one variable).

... currently not used.

Details

NeweyWest is a convenience interface to vcovHAC using Bartlett kernel weights as described inNewey & West (1987, 1994). The automatic bandwidth selection procedure described in Newey &West (1994) is used as the default and can also be supplied to kernHAC for the Parzen and quadraticspectral kernel. It is implemented in bwNeweyWest which does not truncate its results - if the resultsfor the Parzen and Bartlett kernels should be truncated, this has to be applied afterwards. For Bartlettweights this is implemented in NeweyWest.

To obtain the estimator described in Newey & West (1987), prewhitening has to be suppressed.

Value

NeweyWest returns the same type of object as vcovHAC which is typically just the covariance matrix.

bwNeweyWest returns the selected bandwidth parameter.

Page 14: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

14 PetersenCL

References

Andrews DWK (1991), Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Esti-mation. Econometrica, 59, 817–858.

Newey WK & West KD (1987), A Simple, Positive Semi-Definite, Heteroskedasticity and Auto-correlation Consistent Covariance Matrix. Econometrica, 55, 703–708.

Newey WK & West KD (1994), Automatic Lag Selection in Covariance Matrix Estimation. Reviewof Economic Studies, 61, 631–653.

Zeileis A (2004), Econometric Computing with HC and HAC Covariance Matrix Estimators. Jour-nal of Statistical Software, 11(10), 1–17. URL http://www.jstatsoft.org/v11/i10/.

See Also

vcovHAC, weightsAndrews, kernHAC

Examples

## fit investment equationdata(Investment)fm <- lm(RealInv ~ RealGNP + RealInt, data = Investment)

## Newey & West (1994) compute this type of estimatorNeweyWest(fm)

## The Newey & West (1987) estimator requires specification## of the lag and suppression of prewhiteningNeweyWest(fm, lag = 4, prewhite = FALSE)

## bwNeweyWest() can also be passed to kernHAC(), e.g.## for the quadratic spectral kernelkernHAC(fm, bw = bwNeweyWest)

PetersenCL Petersen’s Simulated Data for Assessing Clustered Standard Errors

Description

Artificial balanced panel data set from Petersen (2009) for illustrating and benchmarking clusteredstandard errors.

Usage

data("PetersenCL")

Page 15: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

PublicSchools 15

Format

A data frame containing 5000 observations on 4 variables.

firm integer. Firm identifier (500 firms).

year integer. Time variable (10 years per firm).

x numeric. Independent regressor variable.

y numeric. Dependent response variable.

Details

This simulated data set was created to illustrate and benchmark clustered standard errors. Theresidual and the regressor variable both contain a firm effect, but no year effect. Thus, standarderrors clustered by firm are different from the OLS standard errors and similarly double-clusteredstandard errors (by firm and year) are different from the standard errors clustered by year.

Source

http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.htm

References

Petersen MA (2009). “Estimating Standard Errors in Finance Panel Data Sets: Comparing Ap-proaches”, The Review of Financial Studies, 22(1), 435–480. doi: 10.1093/rfs/hhn053

PublicSchools US Expenditures for Public Schools

Description

Per capita expenditure on public schools and per capita income by state in 1979.

Usage

data(PublicSchools)

Format

A data frame containing 51 observations of 2 variables.

Expenditure per capita expenditure on public schools,

Income per capita income.

Source

Table 14.1 in Greene (1993)

Page 16: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

16 PublicSchools

References

Cribari-Neto F. (2004), Asymptotic Inference Under Heteroskedasticity of Unknown Form, Com-putational Statistics \& Data Analysis, 45, 215-233.

Greene W.H. (1993), Econometric Analysis, 2nd edition. Macmillan Publishing Company, NewYork.

US Department of Commerce (1979), Statistical Abstract of the United States. US GovernmentPrinting Office, Washington, DC.

Examples

## Willam H. Greene, Econometric Analysis, 2nd Ed.## Chapter 14## load data set, p. 385, Table 14.1data(PublicSchools)

## omit NA in Wisconsin and scale incomeps <- na.omit(PublicSchools)ps$Income <- ps$Income * 0.0001

## fit quadratic regression, p. 385, Table 14.2fmq <- lm(Expenditure ~ Income + I(Income^2), data = ps)summary(fmq)

## compare standard and HC0 standard errors## p. 391, Table 14.3library(sandwich)coef(fmq)sqrt(diag(vcovHC(fmq, type = "const")))sqrt(diag(vcovHC(fmq, type = "HC0")))

if(require(lmtest)) ## compare t ratiocoeftest(fmq, vcov = vcovHC(fmq, type = "HC0"))

## White test, p. 393, Example 14.5wt <- lm(residuals(fmq)^2 ~ poly(Income, 4), data = ps)wt.stat <- summary(wt)$r.squared * nrow(ps)c(wt.stat, pchisq(wt.stat, df = 3, lower = FALSE))

## Bresch-Pagan test, p. 395, Example 14.7bptest(fmq, studentize = FALSE)bptest(fmq)

## Francisco Cribari-Neto, Asymptotic Inference, CSDA 45## quasi z-tests, p. 229, Table 8## with Alaskacoeftest(fmq, df = Inf)[3,4]coeftest(fmq, df = Inf, vcov = vcovHC(fmq, type = "HC0"))[3,4]coeftest(fmq, df = Inf, vcov = vcovHC(fmq, type = "HC3"))[3,4]coeftest(fmq, df = Inf, vcov = vcovHC(fmq, type = "HC4"))[3,4]

Page 17: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

sandwich 17

## without Alaska (observation 2)fmq1 <- lm(Expenditure ~ Income + I(Income^2), data = ps[-2,])coeftest(fmq1, df = Inf)[3,4]coeftest(fmq1, df = Inf, vcov = vcovHC(fmq1, type = "HC0"))[3,4]coeftest(fmq1, df = Inf, vcov = vcovHC(fmq1, type = "HC3"))[3,4]coeftest(fmq1, df = Inf, vcov = vcovHC(fmq1, type = "HC4"))[3,4]

## visualization, p. 230, Figure 1plot(Expenditure ~ Income, data = ps,

xlab = "per capita income",ylab = "per capita spending on public schools")

inc <- seq(0.5, 1.2, by = 0.001)lines(inc, predict(fmq, data.frame(Income = inc)), col = 4)fml <- lm(Expenditure ~ Income, data = ps)abline(fml)text(ps[2,2], ps[2,1], rownames(ps)[2], pos = 2)

sandwich Making Sandwiches with Bread and Meat

Description

Constructing sandwich covariance matrix estimators by multiplying bread and meat matrices.

Usage

sandwich(x, bread. = bread, meat. = meat, ...)

Arguments

x a fitted model object.

bread. either a bread matrix or a function for computing this via bread.(x).

meat. either a bread matrix or a function for computing this via meat.(x, ...).

... arguments passed to the meat function.

Details

sandwich is a simple convenience function that takes a bread matrix (i.e., estimator of the expec-tation of the negative derivative of the estimating functions) and a meat matrix (i.e., estimator ofthe variance of the estimating functions) and multiplies them to a sandwich with meat between twoslices of bread. By default bread and meat are called.

Some theoretical background along with implementation details is given in Zeileis (2006).

Value

A matrix containing the sandwich covariance matrix estimate. Typically, this should be an k × kmatrix corresponding to k parameters.

Page 18: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

18 vcovBS

References

Zeileis A (2006), Object-Oriented Computation of Sandwich Estimators. Journal of StatisticalSoftware, 16(9), 1–16. URL http://www.jstatsoft.org/v16/i09/.

See Also

bread, meat, meatHC, meatHAC

Examples

x <- sin(1:10)y <- rnorm(10)fm <- lm(y ~ x)

sandwich(fm)vcovHC(fm, type = "HC")

vcovBS (Clustered) Bootstrap Covariance Matrix Estimation

Description

Estimation of basic bootstrap covariances, using a simple (clustered) case-based resampling. co-variance matrices using an object-oriented approach.

Usage

vcovBS(x, ...)

## Default S3 method:vcovBS(x, cluster = NULL, R = 250, start = FALSE, ...)

Arguments

x a fitted model object.

cluster a variable indicating the clustering of observations. By default each observationis its own cluster.

R integer. Number of bootstrap replications.

start logical. Should coef(x) be passed as start to the update(x, subset = ...)call? In case the model x is computed by some numeric iteration, this may speedup the bootstrapping.

... arguments passed to methods (to update in case of the default method).

Page 19: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

vcovCL 19

Details

Basic (clustered) bootstrap covariance matrix estimation is provided by vcovBS. The default methodsamples clusters (where each observation is its own cluster by default), i.e. case-based resampling.For obtaining a covariance matrix estimate it is assumed that an update of the model with theresampled subset can be obtained, the coef extracted, and finally the covariance computed withcov.

Note: The update model is evaluated in the environment(terms(x)).

Value

A matrix containing the covariance matrix estimate.

See Also

vcovCL

Examples

## Petersen's datadata("PetersenCL", package = "sandwich")m <- lm(y ~ x, data = PetersenCL)

## comparison of different standard errorsset.seed(1)cbind(

"classical" = sqrt(diag(vcov(m))),"HC-cluster" = sqrt(diag(vcovCL(m, cluster = PetersenCL$firm))),"BS-cluster" = sqrt(diag(vcovBS(m, cluster = PetersenCL$firm)))

)

vcovCL Clustered Covariance Matrix Estimation

Description

Estimation of one-way and multi-way clustered covariance matrices using an object-oriented ap-proach.

Usage

vcovCL(x, cluster = NULL, type = NULL, sandwich = TRUE, fix = FALSE, ...)meatCL(x, cluster = NULL, type = NULL, cadjust = TRUE, multi0 = FALSE, ...)

Page 20: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

20 vcovCL

Arguments

x a fitted model object.

cluster a variable indicating the clustering of observations or a list (or data.frame)thereof. By default, either attr(x, "cluster") is used. If that is also NULLeach observation is its own cluster.

type a character string specifying the estimation type (HC0–HC3). The default is touse "HC1" for "lm" objects and "HC0" otherwise.

sandwich logical. Should the sandwich estimator be computed? If set to FALSE only themeat matrix is returned.

fix logical. Should the covariance matrix be fixed to be positive semi-definite incase it is not?

cadjust logical. Should a cluster adjustment be applied?

multi0 logical. Should the HC0 estimate be used for the final adjustment in multi-wayclustered covariances?

... arguments passed to meatCL.

Details

Clustered sandwich estimators are used to adjust inference when errors are correlated within (butnot between) clusters. vcovCL allows for clustering in arbitrary many cluster dimensions (e.g., firm,time, industry), given all dimensions have enough clusters (for more details, see Cameron et al.2011). If each observation is its own cluster, the clustered sandwich collapses to the basic sandwichcovariance.

The function meatCL is the work horse for estimating the meat of clustered sandwich estimators.vcovCL is a wrapper calling sandwich and bread (Zeileis 2006). vcovCL is applicable beyond lmor glm class objects.

bread and meat matrices are multiplied to construct clustered sandwich estimators. The meat of aclustered sandwich estimator is the cross product of the clusterwise summed estimating functions.Instead of summing over all individuals, first sum over cluster.

A two-way clustered sandwich estimator M (e.g., for cluster dimensions "firm" and "industry") is alinear combination of one-way clustered sandwich estimators for both dimensions (Mfirm,Mtime)minus the clustered sandwich estimator, with clusters formed out of the intersection of both dimen-sions (Mid∩time):

M = Mid +Mtime −Mid∩time

Instead of substracting Mid∩time as the last substacted matrix, Ma (2014) suggests to substract thebasic HC0 covariance matrix when only a single observation is in each intersection of id and time.Set multi0 = TRUE to substract the basic HC0 covariance matrix as the last substracted matrix inmulti-way clustering. For details, see also Petersen (2009) and Thompson (2011).

With the type argument, HC0 to HC3 types of bias adjustment can be employed. HC2 and HC3types of bias adjustment are geared towards the linear model, but they are also applicable for GLMs(see Mc Caffrey and Bell (2002) and Kauermann and Carroll (2001) for details). A precondition forHC2 and HC3 types of bias adjustment is the existence of a hat matrix or a weighted version of thehat matrix for GLMs, respectively.

Page 21: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

vcovCL 21

The cadjust argument allows to switch the cluster bias adjustment factor G/(G − 1) on and off(whereG is the number of clusters in a cluster dimension g) See Cameron et al. (2008) and Cameronet al. (2011) for more details about small-sample modifications.

Cameron et al. (2011) observe that sometimes the covariance matrix is not positive-semidefinite.To force the covariance matrix to be positive-semidefinite, set fix = TRUE. Following Cameronet al. (2011), the eigendecomposition of the estimated covariance matrix is used and any negativeeigenvalue(s) are converted to zero.

Value

A matrix containing the covariance matrix estimate.

References

Cameron AC & Gelbach JB & Miller DL (2008). “Bootstrap-Based Improvements for Inferencewith Clustered Errors”, The Review of Economics and Statistics, 90(3), 414–427. doi: 10.3386/t0344

Cameron AC & Gelbach JB & Miller DL (2011). “Robust Inference With Multiway Clustering”,Journal of Business & Ecomomic Statistics, 29(2), 238–249. doi: 10.1198/jbes.2010.07136

Kauermann G & Carroll RJ (2001). “A Note on the Efficiency of Sandwich Covariance MatrixEstimation”, Journal of the American Statistical Association, 96(456), 1387–1396. doi: 10.1198/016214501753382309

Ma MS (2014). “Are We Really Doing What We Think We Are Doing? A Note on Finite-SampleEstimates of Two-Way Cluster-Robust Standard Errors”, Mimeo, Availlable at SSRN: URL http://ssrn.com/abstract=2420421.

McCaffrey DF & Bell RM (2002). “Bias Reduction in Standard Errors for Linear Regression withMulti-Stage Samples”, Survey Methodology, 28(2), 169–181.

Petersen MA (2009). “Estimating Standard Errors in Finance Panel Data Sets: Comparing Ap-proaches”, The Review of Financial Studies, 22(1), 435–480. doi: 10.1093/rfs/hhn053

Thompson SB (2011). “Simple Formulas for Standard Errors That Cluster by Both Firm and Time”,Journal of Financial Economics, 99(1), 1–10. doi: 10.1016/j.jfineco.2010.08.016

Zeileis A (2004). “Econometric Computing with HC and HAC Covariance Matrix Estimator”,Journal of Statistical Software, 11(10), 1–17. doi: 10.18637/jss.v011.i10

Zeileis A (2006). “Object-Oriented Computation of Sandwich Estimators”, Journal of StatisticalSoftware, 16(9), 1–16. doi: 10.18637/jss.v016.i09

See Also

vcovHC

Examples

## Petersen's datadata("PetersenCL", package = "sandwich")m <- lm(y ~ x, data = PetersenCL)

## clustered covariances

Page 22: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

22 vcovHAC

## one-wayvcovCL(m, cluster = PetersenCL$firm)## one-way with HC2vcovCL(m, cluster = PetersenCL$firm, type = "HC2")## two-wayvcovCL(m, cluster = PetersenCL[, c("firm", "year")])

## comparison with cross-section sandwiches## HC0all.equal(sandwich(m), vcovCL(m, type = "HC0", cadjust = FALSE))## HC2all.equal(vcovHC(m, type = "HC2"), vcovCL(m, type = "HC2"))## HC3all.equal(vcovHC(m, type = "HC3"), vcovCL(m, type = "HC3"))

## Innovation datadata("InstInnovation", package = "sandwich")

## replication of one-way clustered standard errors for model 3, Table I## and model 1, Table II in Berger et al. (2016)

## count regression formulaf1 <- cites ~ institutions + log(capital/employment) + log(sales) + industry + year

## model 3, Table I: Poisson model## one-way clustered standard errorstab_I_3_pois <- glm(f1, data = InstInnovation, family = poisson)vcov_pois <- vcovCL(tab_I_3_pois, InstInnovation$company)sqrt(diag(vcov_pois))[2:4]

## coefficient tablesif(require("lmtest")) coeftest(tab_I_3_pois, vcov = vcov_pois)[2:4, ]

## Not run:## model 1, Table II: negative binomial hurdle model## (requires "pscl" or alternatively "countreg" from R-Forge)library("pscl")library("lmtest")tab_II_3_hurdle <- hurdle(f1, data = InstInnovation, dist = "negbin")# dist = "negbin", zero.dist = "negbin", separate = FALSE)vcov_hurdle <- vcovCL(tab_II_3_hurdle, InstInnovation$company)sqrt(diag(vcov_hurdle))[c(2:4, 149:151)]coeftest(tab_II_3_hurdle, vcov = vcov_hurdle)[c(2:4, 149:151), ]

## End(Not run)

vcovHAC Heteroskedasticity and Autocorrelation Consistent (HAC) CovarianceMatrix Estimation

Page 23: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

vcovHAC 23

Description

Heteroskedasticity and autocorrelation consistent (HAC) estimation of the covariance matrix of thecoefficient estimates in a (generalized) linear regression model.

Usage

vcovHAC(x, ...)

## Default S3 method:vcovHAC(x, order.by = NULL, prewhite = FALSE, weights = weightsAndrews,adjust = TRUE, diagnostics = FALSE, sandwich = TRUE, ar.method = "ols",data = list(), ...)

meatHAC(x, order.by = NULL, prewhite = FALSE, weights = weightsAndrews,adjust = TRUE, diagnostics = FALSE, ar.method = "ols", data = list())

Arguments

x a fitted model object.

order.by Either a vector z or a formula with a single explanatory variable like ~ z. Theobservations in the model are ordered by the size of z. If set to NULL (the default)the observations are assumed to be ordered (e.g., a time series).

prewhite logical or integer. Should the estimating functions be prewhitened? If TRUE orgreater than 0 a VAR model of order as.integer(prewhite) is fitted via arwith method "ols" and demean = FALSE.

weights Either a vector of weights for the autocovariances or a function to compute theseweights based on x, order.by, prewhite, ar.method and data. If weights isa function it has to take these arguments. See also details.

adjust logical. Should a finite sample adjustment be made? This amounts to multipli-cation with n/(n− k) where n is the number of observations and k the numberof estimated parameters.

diagnostics logical. Should additional model diagnostics be returned? See below for details.

sandwich logical. Should the sandwich estimator be computed? If set to FALSE only themeat matrix is returned.

ar.method character. The method argument passed to ar for prewhitening.

data an optional data frame containing the variables in the order.by model. Bydefault the variables are taken from the environment which vcovHAC is calledfrom.

... arguments passed to sandwich.

Details

The function meatHAC is the real work horse for estimating the meat of HAC sandwich estimators –the default vcovHAC method is a wrapper calling sandwich and bread. See Zeileis (2006) for moreimplementation details. The theoretical background, exemplified for the linear regression model, isdescribed in Zeileis (2004).

Page 24: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

24 vcovHAC

Both functions construct weighted information sandwich variance estimators for parametric modelsfitted to time series data. These are basically constructed from weighted sums of autocovariances ofthe estimating functions (as extracted by estfun). The crucial step is the specification of weights:the user can either supply vcovHAC with some vector of weights or with a function that computesthese weights adaptively (based on the arguments x, order.by, prewhite and data). Two functionsfor adaptively choosing weights are implemented in weightsAndrews implementing the results ofAndrews (1991) and in weightsLumley implementing the results of Lumley (1999). The functionskernHAC and weave respectively are to more convenient interfaces for vcovHAC with these functions.

Prewhitening based on VAR approximations is described as suggested in Andrews & Monahan(1992).

The covariance matrix estimators have been improved by the addition of a bias correction and anapproximate denominator degrees of freedom for test and confidence interval construction. SeeLumley & Heagerty (1999) for details.

Value

A matrix containing the covariance matrix estimate. If diagnostics was set to TRUE this has anattribute "diagnostics" which is a list with

bias.correction

multiplicative bias correction

df Approximate denominator degrees of freedom

References

Andrews DWK (1991), Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Esti-mation. Econometrica, 59, 817–858.

Andrews DWK & Monahan JC (1992), An Improved Heteroskedasticity and Autocorrelation Con-sistent Covariance Matrix Estimator. Econometrica, 60, 953–966.

Lumley T & Heagerty P (1999), Weighted Empirical Adaptive Variance Estimators for CorrelatedData Regression. Journal of the Royal Statistical Society B, 61, 459–477.

Newey WK & West KD (1987), A Simple, Positive Semi-Definite, Heteroskedasticity and Auto-correlation Consistent Covariance Matrix. Econometrica, 55, 703–708.

Zeileis A (2004), Econometric Computing with HC and HAC Covariance Matrix Estimators. Jour-nal of Statistical Software, 11(10), 1–17. URL http://www.jstatsoft.org/v11/i10/.

Zeileis A (2006), Object-Oriented Computation of Sandwich Estimators. Journal of StatisticalSoftware, 16(9), 1–16. URL http://www.jstatsoft.org/v16/i09/.

See Also

weightsLumley, weightsAndrews, weave, kernHAC

Examples

x <- sin(1:100)y <- 1 + x + rnorm(100)fm <- lm(y ~ x)vcovHAC(fm)

Page 25: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

vcovHC 25

vcov(fm)

vcovHC Heteroskedasticity-Consistent Covariance Matrix Estimation

Description

Heteroskedasticity-consistent estimation of the covariance matrix of the coefficient estimates inregression models.

Usage

vcovHC(x, ...)

## Default S3 method:vcovHC(x,

type = c("HC3", "const", "HC", "HC0", "HC1", "HC2", "HC4", "HC4m", "HC5"),omega = NULL, sandwich = TRUE, ...)

meatHC(x, type = , omega = NULL)

Arguments

x a fitted model object.

type a character string specifying the estimation type. For details see below.

omega a vector or a function depending on the arguments residuals (the workingresiduals of the model), diaghat (the diagonal of the corresponding hat matrix)and df (the residual degrees of freedom). For details see below.

sandwich logical. Should the sandwich estimator be computed? If set to FALSE only themeat matrix is returned.

... arguments passed to sandwich.

Details

The function meatHC is the real work horse for estimating the meat of HC sandwich estimators –the default vcovHC method is a wrapper calling sandwich and bread. See Zeileis (2006) for moreimplementation details. The theoretical background, exemplified for the linear regression model, isdescribed below and in Zeileis (2004). Analogous formulas are employed for other types of models.

When type = "const" constant variances are assumed and and vcovHC gives the usual estimate ofthe covariance matrix of the coefficient estimates:

σ2(X>X)−1

All other methods do not assume constant variances and are suitable in case of heteroskedasticity."HC" (or equivalently "HC0") gives White’s estimator, the other estimators are refinements of this.They are all of form

Page 26: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

26 vcovHC

(X>X)−1X>ΩX(X>X)−1

and differ in the choice of Omega. This is in all cases a diagonal matrix whose elements can beeither supplied as a vector omega or as a a function omega of the residuals, the diagonal elements ofthe hat matrix and the residual degrees of freedom. For White’s estimator

omega <- function(residuals, diaghat, df) residuals^2

Instead of specifying the diagonal omega or a function for estimating it, the type argument can beused to specify the HC0 to HC5 estimators. If omega is used, type is ignored.

Long \& Ervin (2000) conduct a simulation study of HC estimators (HC0 to HC3) in the linearregression model, recommending to use HC3 which is thus the default in vcovHC. Cribari-Neto(2004), Cribari-Neto, Souza, \& Vasconcellos (2007), and Cribari-Neto \& Da Silva (2011), respec-tively, suggest the HC4, HC5, and modified HC4m type estimators. All of them are tailored to takeinto account the effect of leverage points in the design matrix. For more details see the references.

Value

A matrix containing the covariance matrix estimate.

References

Cribari-Neto F. (2004), Asymptotic Inference under Heteroskedasticity of Unknown Form. Com-putational Statistics \& Data Analysis 45, 215–233.

Cribari-Neto F., Da Silva W.B. (2011), A New Heteroskedasticity-Consistent Covariance MatrixEstimator for the Linear Regression Model. Advances in Statistical Analysis, 95(2), 129–146.

Cribari-Neto F., Souza T.C., Vasconcellos, K.L.P. (2007), Inference under Heteroskedasticity andLeveraged Data. Communications in Statistics – Theory and Methods, 36, 1877–1888. Errata: 37,3329–3330, 2008.

Long J. S., Ervin L. H. (2000), Using Heteroscedasticity Consistent Standard Errors in the LinearRegression Model. The American Statistician, 54, 217–224.

MacKinnon J. G., White H. (1985), Some Heteroskedasticity-Consistent Covariance Matrix Esti-mators with Improved Finite Sample Properties. Journal of Econometrics 29, 305–325.

White H. (1980), A Heteroskedasticity-Consistent Covariance Matrix and a Direct Test for Het-eroskedasticity. Econometrica 48, 817–838.

Zeileis A (2004), Econometric Computing with HC and HAC Covariance Matrix Estimators. Jour-nal of Statistical Software, 11(10), 1–17. URL http://www.jstatsoft.org/v11/i10/.

Zeileis A (2006), Object-Oriented Computation of Sandwich Estimators. Journal of StatisticalSoftware, 16(9), 1–16. URL http://www.jstatsoft.org/v16/i09/.

See Also

lm, hccm, bptest, ncv.test

Page 27: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

vcovOPG 27

Examples

## generate linear regression relationship## with homoskedastic variancesx <- sin(1:100)y <- 1 + x + rnorm(100)## model fit and HC3 covariancefm <- lm(y ~ x)vcovHC(fm)## usual covariance matrixvcovHC(fm, type = "const")vcov(fm)

sigma2 <- sum(residuals(lm(y ~ x))^2)/98sigma2 * solve(crossprod(cbind(1, x)))

vcovOPG Outer Product of Gradients Covariance Matrix Estimation

Description

Outer product of gradients estimation for the covariance matrix of the coefficient estimates in re-gression models.

Usage

vcovOPG(x, adjust = FALSE, ...)

Arguments

x a fitted model object.

adjust logical. Should a finite sample adjustment be made? This amounts to multipli-cation with

n/(n− k)

where

n

is the number of observations and

k

the number of estimated parameters.

... arguments passed to the estfun function.

Page 28: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

28 vcovPC

Details

In correctly specified models, the “meat” matrix (cross product of estimating functions, see meat)and the inverse of the “bread” matrix (inverse of the derivative of the estimating functions, seebread) are equal and correspond to the Fisher information matrix. Typically, an empirical versionof the bread is used for estimation of the information but alternatively it is also possible to use themeat. This method is also known as the outer product of gradients (OPG) estimator (Cameron &Trivedi 2005).

Using the sandwich infrastructure, the OPG estimator could easily be computed via solve(meat(obj))(modulo scaling). To employ numerically more stable implementation of the inversion, this simpleconvenience function can be used: vcovOPG(obj).

Note that this only works if the estfun() method computes the maximum likelihood scores (andnot a scaled version such as least squares scores for "lm" objects).

Value

A matrix containing the covariance matrix estimate.

References

Cameron AC and Trivedi PK (2005), Microeconometrics: Methods and Applications. CambridgeUniversity Press, Cambridge.

Zeileis A (2006), Object-Oriented Computation of Sandwich Estimators. Journal of StatisticalSoftware, 16(9), 1–16. URL http://www.jstatsoft.org/v16/i09/.

See Also

meat, bread, sandwich

Examples

## generate poisson regression relationshipx <- sin(1:100)y <- rpois(100, exp(1 + x))## compute usual covariance matrix of coefficient estimatesfm <- glm(y ~ x, family = poisson)vcov(fm)vcovOPG(fm)

vcovPC Panel-Corrected Covariance Matrix Estimation

Description

Estimation of sandwich covariances a la Beck and Katz (1995) for panel data.

Page 29: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

vcovPC 29

Usage

vcovPC(x, cluster = NULL, order.by = NULL,pairwise = FALSE, sandwich = TRUE, fix = FALSE, ...)

meatPC(x, cluster = NULL, order.by = NULL,pairwise = FALSE, kronecker = TRUE, ...)

Arguments

x a fitted model object.

cluster a variable indicating the clustering of observations or a list (or data.frame)thereof. By default, either attr(x, "cluster") is used.

order.by a variable indicating the aggregation within time periods.

pairwise logical. For unbalanced panels. Indicating whether the meat should be estimatedpair- or casewise.

sandwich logical. Should the sandwich estimator be computed? If set to FALSE only themeat matrix is returned.

fix logical. Should the covariance matrix be fixed to be positive semi-definite incase it is not?

kronecker logical. Calculate the meat via the Kronecker-product, shortening the computa-tion time for small matrices. For large matrices, set kronecker = FALSE.

... arguments passed to the meatPC or estfun function, respectively.

Details

vcovPC is a function for estimating Beck and Katz (1995) panel-corrected covariance matrix.

The function meatPC is the work horse for estimating the meat of Beck and Katz (1995) covariancematrix estimators. vcovPC is a wrapper calling sandwich and bread (Zeileis 2006).

Following Bailey and Katz (2011), there are two alternatives to estimate the meat for unbalancedpanels. For pairwise = FALSE, a balanced subset of the panel is used, whereas for pairwise = TRUE,a pairwise balanced sample is employed.

Value

A matrix containing the covariance matrix estimate.

References

Bailey D & Katz JN (2011). “Implementing Panel-Corrected Standard Errors in R: The pcse Pack-age”, Journal of Statistical Software, Code Snippets, 42(1), 1–11. URL http://www.jstatsoft.org/v42/c01/

Beck N & Katz JN (1995). “What To Do (and Not To Do) with Time-Series-Cross-Section Datain Comparative Politics”, American Political Science Review, 89(3), 634–647. URL http://www.jstor.org/stable/2082979

Zeileis A (2004). “Econometric Computing with HC and HAC Covariance Matrix Estimator”,Journal of Statistical Software, 11(10), 1–17. doi: 10.18637/jss.v011.i10

Page 30: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

30 vcovPL

Zeileis A (2006). “Object-Oriented Computation of Sandwich Estimators”, Journal of StatisticalSoftware, 16(9), 1–16. doi: 10.18637/jss.v016.i09

Examples

## Petersen's datadata("PetersenCL", package = "sandwich")m <- lm(y ~ x, data = PetersenCL)

## Beck and Katz (1995) standard errors## balanced panelsqrt(diag(vcovPC(m, cluster = PetersenCL$firm, order.by = PetersenCL$year)))

## unbalanced panelPU <- subset(PetersenCL, !(firm == 1 & year == 10))pu_lm <- lm(y ~ x, data = PU)sqrt(diag(vcovPC(pu_lm, cluster = PU$firm, order.by = PU$year, pairwise = TRUE)))sqrt(diag(vcovPC(pu_lm, cluster = PU$firm, order.by = PU$year, pairwise = FALSE)))

vcovPL Clustered Covariance Matrix Estimation for panel data

Description

Estimation of sandwich covariances a la Newey-West (1987) and Driscoll and Kraay (1998) forpanel data.

Usage

vcovPL(x, cluster = NULL, order.by = NULL,kernel = "Bartlett", sandwich = TRUE, fix = FALSE, ...)

meatPL(x, cluster = NULL, order.by = NULL,kernel = "Bartlett", lag = "NW1987", bw = NULL,adjust = TRUE, ...)

Arguments

x a fitted model object.

cluster a variable indicating the clustering of observations or a list (or data.frame)thereof. By default, either attr(x, "cluster") is used. If that is also NULLeach observation is its own cluster.

order.by a variable indicating the aggregation within time periods.

kernel a character specifying the kernel used. All kernels described in Andrews (1991)are supported, see kweights.

lag character or numeric, indicating the lag length used. Three rules of thumb("max" or equivalently "P2009", "NW1987", or "NW1994") can be specified, or anumeric number of lags can be specified directly. By default, "NW1987" is used.

Page 31: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

vcovPL 31

bw numeric. The bandwidth of the kernel which by default corresponds to lag + 1.Only one of lag and bw should be used.

sandwich logical. Should the sandwich estimator be computed? If set to FALSE only themeat matrix is returned.

fix logical. Should the covariance matrix be fixed to be positive semi-definite incase it is not?

adjust logical. Should a finite sample adjustment be made? This amounts to multi-plication with n/(n − k) where n is the number of observations and k is thenumber of estimated parameters.

... arguments passed to the metaPL or estfun function, respectively.

Details

vcovPL is a function for estimating the Newey-West (1987) and Driscoll and Kraay (1998) co-variance matrix. Driscoll and Kraay (1998) apply a Newey-West type correction to the sequenceof cross-sectional averages of the moment conditions (see Hoechle (2007)). For large T (and re-gardless of the length of the cross-sectional dimension), the Driscoll and Kraay (1998) standarderrors are robust to general forms of cross-sectional and serial correlation (Hoechle (2007)). TheNewey-West (1978) covariance matrix restricts the Driscoll and Kraay (1998) covariance matrix tono cross-sectional correlation.

The function meatPL is the work horse for estimating the meat of Newey-West (1978) and Driscolland Kraay (1998) covariance matrix estimators. vcovPL is a wrapper calling sandwich and bread(Zeileis 2006).

Default lag length is the "NW1987". For lag = "NW1987", the lag length is chosen from the heuristicfloor[T (1/4)]. More details on lag length selection in Hoechle (2007). For lag = "NW1994", thelag length is taken from the first step of Newey and West’s (1994) plug-in procedure.

Value

A matrix containing the covariance matrix estimate.

References

Andrews DWK (1991). “Heteroscedasticity and Autocorrelation Consistent Covariance Matrix Es-timation”, Econometrica, 817–858.

Driscoll JC & Kraay AC (1998). “Consistent Covariance Matrix Estimation with Spatially Depen-dent Panel Data”, The Review of Economics and Statistics, 80(4), 549–560.

Hoechle D (2007). “Robust Standard Errors for Panel Regressions with Cross-Sectional Depen-dence”, Stata Journal, 7(3), 281–312.

Newey WK & West KD (1987). “Hypothesis Testing with Efficient Method of Moments Estima-tion”, International Economic Review, 777-787.

Newey WK & West KD (1994). “Automatic Lag Selection in Covariance Matrix Estimation”, TheReview of Economic Studies, 61(4), 631–653.

White H (1980). “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Testfor Heteroskedasticity”, Econometrica, 817–838. doi: 10.2307/1912934

Page 32: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

32 weightsAndrews

Zeileis A (2004). “Econometric Computing with HC and HAC Covariance Matrix Estimator”,Journal of Statistical Software, 11(10), 1–17. doi: 10.18637/jss.v011.i10

Zeileis A (2006). “Object-Oriented Computation of Sandwich Estimators”, Journal of StatisticalSoftware, 16(9), 1–16. doi: 10.18637/jss.v016.i09

See Also

vcovCL

Examples

## Petersen's datadata("PetersenCL", package = "sandwich")m <- lm(y ~ x, data = PetersenCL)

## Driscoll and Kraay standard errors## lag length set to: T - 1 (maximum lag length)## as proposed by Petersen (2009)sqrt(diag(vcovPL(m, cluster = PetersenCL$firm, lag = "max", adjust = FALSE)))

## lag length set to: floor(4 * (T / 100)^(2/9))## rule of thumb proposed by Hoechle (2007) based on Newey & West (1994)sqrt(diag(vcovPL(m, cluster = PetersenCL$firm, lag = "NW1994")))

## lag length set to: floor(T^(1/4))## rule of thumb based on Newey & West (1987)sqrt(diag(vcovPL(m, cluster = PetersenCL$firm, lag = "NW1987")))

weightsAndrews Kernel-based HAC Covariance Matrix Estimation

Description

A set of functions implementing a class of kernel-based heteroskedasticity and autocorrelation con-sistent (HAC) covariance matrix estimators as introduced by Andrews (1991).

Usage

kernHAC(x, order.by = NULL, prewhite = 1, bw = bwAndrews,kernel = c("Quadratic Spectral", "Truncated", "Bartlett", "Parzen", "Tukey-Hanning"),approx = c("AR(1)", "ARMA(1,1)"), adjust = TRUE, diagnostics = FALSE,sandwich = TRUE, ar.method = "ols", tol = 1e-7, data = list(), verbose = FALSE, ...)

weightsAndrews(x, order.by = NULL, bw = bwAndrews,kernel = c("Quadratic Spectral", "Truncated", "Bartlett", "Parzen", "Tukey-Hanning"),prewhite = 1, ar.method = "ols", tol = 1e-7, data = list(), verbose = FALSE, ...)

bwAndrews(x, order.by = NULL, kernel = c("Quadratic Spectral", "Truncated","Bartlett", "Parzen", "Tukey-Hanning"), approx = c("AR(1)", "ARMA(1,1)"),weights = NULL, prewhite = 1, ar.method = "ols", data = list(), ...)

Page 33: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

weightsAndrews 33

Arguments

x a fitted model object.

order.by Either a vector z or a formula with a single explanatory variable like ~ z. Theobservations in the model are ordered by the size of z. If set to NULL (the default)the observations are assumed to be ordered (e.g., a time series).

prewhite logical or integer. Should the estimating functions be prewhitened? If TRUEor greater than 0 a VAR model of order as.integer(prewhite) is fitted viaar with method "ols" and demean = FALSE. The default is to use VAR(1)prewhitening.

bw numeric or a function. The bandwidth of the kernel (corresponds to the trun-cation lag). If set to to a function (the default is bwAndrews) it is adaptivelychosen.

kernel a character specifying the kernel used. All kernels used are described in An-drews (1991).

approx a character specifying the approximation method if the bandwidth bw has to bechosen by bwAndrews.

adjust logical. Should a finite sample adjustment be made? This amounts to multipli-cation with n/(n− k) where n is the number of observations and k the numberof estimated parameters.

diagnostics logical. Should additional model diagnostics be returned? See vcovHAC fordetails.

sandwich logical. Should the sandwich estimator be computed? If set to FALSE only themiddle matrix is returned.

ar.method character. The method argument passed to ar for prewhitening (only, not forbandwidth selection).

tol numeric. Weights that exceed tol are used for computing the covariance matrix,all other weights are treated as 0.

data an optional data frame containing the variables in the order.by model. Bydefault the variables are taken from the environment which the function is calledfrom.

verbose logical. Should the bandwidth parameter used be printed?

... further arguments passed to bwAndrews.

weights numeric. A vector of weights used for weighting the estimated coefficients ofthe approximation model (as specified by approx). By default all weights are 1except that for the intercept term (if there is more than one variable).

Details

kernHAC is a convenience interface to vcovHAC using weightsAndrews: first a weights function isdefined and then vcovHAC is called.

The kernel weights underlying weightsAndrews are directly accessible via the function kweightsand require the specification of the bandwidth parameter bw. If this is not specified it can be cho-sen adaptively by the function bwAndrews (except for the "Truncated" kernel). The automatic

Page 34: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

34 weightsAndrews

bandwidth selection is based on an approximation of the estimating functions by either AR(1)or ARMA(1,1) processes. To aggregate the estimated parameters from these approximations aweighted sum is used. The weights in this aggregation are by default all equal to 1 except thatcorresponding to the intercept term which is set to 0 (unless there is no other variable in the model)making the covariance matrix scale invariant.

Further details can be found in Andrews (1991).

The estimator of Newey & West (1987) is a special case of the class of estimators introduced byAndrews (1991). It can be obtained using the "Bartlett" kernel and setting bw to lag + 1. Aconvenience interface is provided in NeweyWest.

Value

kernHAC returns the same type of object as vcovHAC which is typically just the covariance matrix.

weightsAndrews returns a vector of weights.

bwAndrews returns the selected bandwidth parameter.

References

Andrews DWK (1991), Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Esti-mation. Econometrica, 59, 817–858.

Newey WK & West KD (1987), A Simple, Positive Semi-Definite, Heteroskedasticity and Auto-correlation Consistent Covariance Matrix. Econometrica, 55, 703–708.

See Also

vcovHAC, NeweyWest, weightsLumley, weave

Examples

curve(kweights(x, kernel = "Quadratic", normalize = TRUE),from = 0, to = 3.2, xlab = "x", ylab = "k(x)")

curve(kweights(x, kernel = "Bartlett", normalize = TRUE),from = 0, to = 3.2, col = 2, add = TRUE)

curve(kweights(x, kernel = "Parzen", normalize = TRUE),from = 0, to = 3.2, col = 3, add = TRUE)

curve(kweights(x, kernel = "Tukey", normalize = TRUE),from = 0, to = 3.2, col = 4, add = TRUE)

curve(kweights(x, kernel = "Truncated", normalize = TRUE),from = 0, to = 3.2, col = 5, add = TRUE)

## fit investment equationdata(Investment)fm <- lm(RealInv ~ RealGNP + RealInt, data = Investment)

## compute quadratic spectral kernel HAC estimatorkernHAC(fm)kernHAC(fm, verbose = TRUE)

## use Parzen kernel instead, VAR(2) prewhitening, no finite sample

Page 35: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

weightsLumley 35

## adjustment and Newey & West (1994) bandwidth selectionkernHAC(fm, kernel = "Parzen", prewhite = 2, adjust = FALSE,

bw = bwNeweyWest, verbose = TRUE)

## compare with estimate under assumption of spheric errorsvcov(fm)

weightsLumley Weighted Empirical Adaptive Variance Estimation

Description

A set of functions implementing a class of kernel-based heteroskedasticity and autocorrelation con-sistent (HAC) covariance matrix estimators as introduced by Andrews (1991).

Usage

weave(x, order.by = NULL, prewhite = FALSE, C = NULL,method = c("truncate", "smooth"), acf = isoacf, adjust = FALSE,diagnostics = FALSE, sandwich = TRUE, tol = 1e-7, data = list(), ...)

weightsLumley(x, order.by = NULL, C = NULL,method = c("truncate", "smooth"), acf = isoacf, tol = 1e-7, data = list(), ...)

Arguments

x a fitted model object.

order.by Either a vector z or a formula with a single explanatory variable like ~ z. Theobservations in the model are ordered by the size of z. If set to NULL (the default)the observations are assumed to be ordered (e.g., a time series).

prewhite logical or integer. Should the estimating functions be prewhitened? If TRUE orgreater than 0 a VAR model of order as.integer(prewhite) is fitted via arwith method "ols" and demean = FALSE.

C numeric. The cutoff constant C is by default 4 for method "truncate" and 1 formethod "smooth".

method a character specifying the method used, see details.

acf a function that computes the autocorrelation function of a vector, by defaultisoacf is used.

adjust logical. Should a finite sample adjustment be made? This amounts to multipli-cation with n/(n− k) where n is the number of observations and k the numberof estimated parameters.

diagnostics logical. Should additional model diagnostics be returned? See vcovHAC fordetails.

sandwich logical. Should the sandwich estimator be computed? If set to FALSE only themiddle matrix is returned.

Page 36: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

36 weightsLumley

tol numeric. Weights that exceed tol are used for computing the covariance matrix,all other weights are treated as 0.

data an optional data frame containing the variables in the order.by model. Bydefault the variables are taken from the environment which the function is calledfrom.

... currently not used.

Details

weave is a convenience interface to vcovHAC using weightsLumley: first a weights function isdefined and then vcovHAC is called.

Both weighting methods are based on some estimate of the autocorrelation function ρ (as computedby acf) of the residuals of the model x. The weights for the "truncate" method are

Inρ2 > C

and the weights for the "smooth" method are

min1, Cnρ2

where n is the number of observations in the model an C is the truncation constant C.

Further details can be found in Lumley & Heagerty (1999).

Value

weave returns the same type of object as vcovHAC which is typically just the covariance matrix.

weightsLumley returns a vector of weights.

References

Lumley T & Heagerty P (1999), Weighted Empirical Adaptive Variance Estimators for CorrelatedData Regression. Journal of the Royal Statistical Society B, 61, 459–477.

See Also

vcovHAC, weightsAndrews, kernHAC

Examples

x <- sin(1:100)y <- 1 + x + rnorm(100)fm <- lm(y ~ x)weave(fm)vcov(fm)

Page 37: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

Index

∗Topic datasetsInstInnovation, 4Investment, 6PetersenCL, 14PublicSchools, 15

∗Topic regressionbread, 2estfun, 3isoacf, 8kweights, 9lrvar, 10meat, 11NeweyWest, 12sandwich, 17vcovBS, 18vcovCL, 19vcovHAC, 22vcovHC, 25vcovOPG, 27vcovPC, 28vcovPL, 30weightsAndrews, 32weightsLumley, 35

∗Topic tsisoacf, 8kweights, 9lrvar, 10NeweyWest, 12vcovHAC, 22vcovHC, 25vcovOPG, 27weightsAndrews, 32weightsLumley, 35

.vcovBSenv (vcovBS), 18

ar, 13, 23, 33

bptest, 26bread, 2, 12, 17, 18, 20, 23, 25, 28, 29, 31bwAndrews (weightsAndrews), 32

bwNeweyWest (NeweyWest), 12

coef, 2, 3, 19cov, 19

estfun, 3, 11, 12, 24, 27

glm, 2, 3

hccm, 26

InstInnovation, 4Investment, 6isoacf, 8, 35

kernHAC, 10, 11, 14, 24, 36kernHAC (weightsAndrews), 32kweights, 9, 30, 33

lm, 2, 3, 10, 26lrvar, 10

meat, 11, 17, 18, 20, 28meatCL (vcovCL), 19meatHAC, 18meatHAC (vcovHAC), 22meatHC, 18meatHC (vcovHC), 25meatPC (vcovPC), 28meatPL (vcovPL), 30

ncv.test, 26NeweyWest, 10, 11, 12, 34nobs, 2

pava.blocks (isoacf), 8PetersenCL, 14PublicSchools, 15

sandwich, 12, 17, 20, 23, 25, 28, 29, 31

terms, 2, 3

37

Page 38: sandwich package - R · 2 bread sandwich ... Object-Oriented Computation of Sandwich Estimators. Journal of Statistical Software

38 INDEX

update, 19

vcov, 2vcovBS, 18vcovCL, 19, 19, 32vcovHAC, 11, 13, 14, 22, 33–36vcovHC, 21, 25vcovOPG, 27vcovPC, 28vcovPL, 30

weave, 9, 24, 34weave (weightsLumley), 35weightsAndrews, 10, 14, 24, 32, 36weightsLumley, 9, 24, 34, 35