Journal of Educational and Behavioral Statistics (http://jebs.aera.net)
Nicholas T. Longford, A Revision of School Effectiveness Analysis
Journal of Educational and Behavioral Statistics, 2012, 37: 157
DOI: 10.3102/1076998610396898; originally published online 12 August 2011
The online version of this article can be found at: http://jeb.sagepub.com/content/37/1/157
Published on behalf of the American Educational Research Association; publisher site: http://www.sagepublications.com
OnlineFirst Version of Record: Aug 12, 2011
Version of Record: Feb 1, 2012
A Revision of School Effectiveness Analysis
Nicholas T. Longford
SNTL and Departament d’Economia i Empresa, Universitat Pompeu Fabra, Barcelona, Spain
Statistical modeling of school effectiveness data was originally motivated by the
dissatisfaction with the analysis of (school-leaving) examination results that
took no account of the background of the students or regarded each school
as an isolated unit of analysis. The application of multilevel analysis was generally
regarded as a breakthrough, although more recent assessments of
how well it satisfies the goals of school effectiveness studies, to compare
the performances of schools, are much more guarded. This article shows that
the association of the school effects with randomness is not necessary, because
strength can be borrowed across the analyzed schools even when they
are associated with fixed effects. The methods are illustrated on a reanalysis of
the data from an early study of school effectiveness. It also addresses the
problem of excess zero outcomes by treating them as censored (truncated).
Keywords: borrowing strength; censoring; composite estimator; multilevel analysis;
multiple imputation; school effectiveness
Introduction
A school effectiveness study is concerned with comparing the outcomes,
usually the results of a final (school-leaving) examination, across schools, with
an appropriate adjustment for the background of their students. Multilevel anal-
ysis (Aitkin & Longford, 1986; Goldstein, 2003; Raudenbush & Bryk, 2002) is
the established method for such studies. It combines ordinary or generalized
linear regression with a model for the variation of some of its coefficients
across the schools. Modeling the school-level differences by random coeffi-
cients is generally regarded as essential, because the alternative, regarding
them as fixed and applying the analysis of covariance (ANCOVA), has long
been rightly perceived as grossly inefficient, especially for schools that have
small sample sizes in the study.
This article presents an alternative view in which associating schools with
random terms is inappropriate, and the inefficiency of ANCOVA, with fixed
effects, is addressed by a method that is efficient in small samples. Our perspec-
tive is centered on the replication scheme implied by the adopted model and how
it conforms with the context of the study. We argue in the next section that ran-
dom school effects imply a scheme that is unnatural and show that it results in a
biased assessment of the precision of the estimators. This source of bias is not
recognized in the established approaches. The technical details are given in the
section Evaluation With the Fixed-Effects Assumptions. An estimator that corre-
sponds to our perspective more closely, and incorporates borrowing of strength
(Carlin & Louis, 2000; Efron & Morris, 1972; Robbins, 1955), is presented in the
section A Composite Estimator. We derive its mean squared error (MSE); unlike
its established counterpart, it incorporates the uncertainty about the regression
parameters. An application is presented in the section Application, reanalyzing
the study of Aitkin and Longford (1986). The section Zero Outcomes deals with
the outcomes that are equal to zero, an obvious source of their nonnormality, by
regarding them as censored, or truncated, and applies multiple imputation to fit
the regression for the underlying outcomes.
Our main conclusion is that the designation of school effects as "fixed" or
"random" is not important for their estimation but can make a lot of difference
in the estimation of the corresponding MSEs. We show that the established
estimator, based on a random-effects model, is not uniformly more efficient than
the long-discarded estimator based on the (fixed-effects) ANCOVA and that the
MSE of the estimator of a school effect depends on the school effect itself.
Instead of estimating the MSE for each school by a single quantity, we draw the
plausible MSE as a function defined in a range of plausible values of the school’s
effect. In the reanalysis of Aitkin and Longford (1986), we show that these func-
tions are flat for most schools but attain a wide range of values for two schools
with the smallest sample sizes.
Fixed or Random?
We consider a school effectiveness study in which the outcomes $y_{ij}$ of a
(standardized) test or assessment are available for all students $i = 1, \ldots, n_j$ of
the final year of study (a cohort) in schools $j = 1, \ldots, J$ in a given academic year,
together with the values $x_{ij}$ of a vector of relevant covariates. The standard approach
is based on the two-level linear model

$$y_{ij} = \beta_{0j} + x_{ij}\beta + \varepsilon_{ij}, \tag{1}$$

in which $\beta$ is a vector of unknown parameters, the school-specific intercepts $\beta_{0j}$
are a random sample from a univariate normal distribution, $N(\beta_0, \sigma_B^2)$, and $\varepsilon_{ij}$
are a random sample from $N(0, \sigma^2)$. The two random samples are independent,
and their variances $\sigma^2$ and $\sigma_B^2$ are unknown. We denote by $\omega$ the variance ratio,
$\omega = \sigma_B^2/\sigma^2$.
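As an illustration only, the data-generating scheme implied by model (1) can be sketched in Python. The single standardized covariate, the function name `simulate_two_level`, and all numerical values are assumptions made for this sketch; they are not taken from the study.

```python
import random

def simulate_two_level(n_per_school, beta0, beta, sigma_b, sigma, seed=0):
    """Draw one dataset from model (1) with a single covariate.

    Returns a list of (school index j, x_ij, y_ij) tuples.
    """
    rng = random.Random(seed)
    data = []
    for j, n_j in enumerate(n_per_school):
        # School-specific intercept, drawn from N(beta0, sigma_b^2).
        beta0_j = rng.gauss(beta0, sigma_b)
        for _ in range(n_j):
            x = rng.gauss(0.0, 1.0)      # hypothetical standardized covariate
            eps = rng.gauss(0.0, sigma)  # student-level term from N(0, sigma^2)
            data.append((j, x, beta0_j + x * beta + eps))
    return data

sigma_b, sigma = 1.0, 3.0
omega = sigma_b ** 2 / sigma ** 2        # variance ratio omega
data = simulate_two_level([50, 30, 20], beta0=20.0, beta=2.0,
                          sigma_b=sigma_b, sigma=sigma)
```

Each call with a new seed corresponds to one replication in which both the school intercepts and the student-level terms are redrawn.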
The model in (1), with parallel within-school regressions, has a variety of
extensions which include more flexible patterns of school-level variation
(random slopes, that is, associating variation with some covariates in addition
to the intercept), school-specific or modeled residual variances $\sigma_j^2$, adaptations
for distributions other than the normal within the generalized linear framework,
and incorporating further sources of variation in three-level or, in general, multi-
level models, with a structure of clustering that is not necessarily hierarchical
(Rasbash & Goldstein, 1994; Raudenbush, 1993). The assumption of normality
of $\beta_{0j}$ can also be dispensed with. For example, their distribution may be a mixture
of unrelated normal distributions (McLachlan & Peel, 2000).
The random terms $\beta_{0j}$ in (1) can be separated from their expectation $\beta_0$ and
this parameter can be absorbed in the linear predictor $x_{ij}\beta$. We obtain the model

$$y_{ij} = x_{ij}\beta + \delta_j + \varepsilon_{ij}, \tag{2}$$

in which every $x_{ij}$ is supplemented by a term equal to unity ($x_{ij0} = 1$) and $\beta$ by $\beta_0$,
representing the average intercept.
Within the frequentist paradigm, inferences are made with reference to
hypothetical replications. For example, the bias of an estimator is defined as its
average deviation from the target (estimand) across replications, and its MSE as
the average of the squares of these deviations. A quantity, such as a parameter, is
declared as fixed when it has the same value in every replication of the study.
A quantity is declared as random when its value is generated in the replications
by a random mechanism. For example, for given indices $i$ and $j$, $\varepsilon_{ij}$ in (1) and (2)
is drawn at random from a normal distribution, independently across the replications.
Students are associated in these models with random terms $\varepsilon_{ij}$, because
different students may enroll in a school in a hypothetical replication of the study,
and even if the enrollment were the same, the students may perform (slightly)
differently in the (replicate) final exams. We want to assess how the school, with
all its staff, management, practices, ethos, and other attributes, that is, with its
educational process, would have performed with different sets of students.
To promote realism, we assume that such hypothetical sets of students would
have a similar profile (distribution of backgrounds) as the realized set.
The effect of a school’s educational process on its students is subject to
uncertainty, and this is accounted for by the residual variance $\sigma^2$. However, in a
replication, we would study the same schools, with the same educational processes,
because the assessment of the effectiveness of a school, by its position in a league
table or by some other means, refers to a particular school, in the context of a par-
ticular set of schools, well identified to an assessor or a funding or auditing agency,
even if it is made anonymous for an analyst and the research community.
When schools are associated with random effects, a fresh set of them appears
in every replication. Estimation of any quantity associated with a school is then
problematic in the frequentist perspective, because the school appears in replica-
tions only sporadically or not at all. If we make the concession that the same set
of schools appears in every replication, then $\beta_{0j}$ for any given school $j$ is like a
moving goalpost, changing from one replication to the next, defying any attempt
at its estimation that would not be tied to the realised replication, other than by
the unconditional expectation $\beta_0$. In general, the designation of an effect as fixed
or random is not innocuous, because the data generated in replications entail
more variation with a random effect than with a fixed (constant) effect.
The impact of such additional variation on the inferences made depends on the
details of the inferential task and the methods applied.
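The additional variation entailed by a random effect can be seen in a small simulation. The sketch below is illustrative only: the function name and the values of $n_j$, $\sigma$, and $\sigma_B$ are assumptions, and the regression part of the model is left out so that only the school effect and the student-level terms vary.

```python
import random
import statistics

def replicate_school_means(n_j, sigma, sigma_b, delta_j=None, reps=5000, seed=1):
    """Mean residual of one school across hypothetical replications.

    delta_j=None redraws the school effect from N(0, sigma_b^2) in every
    replication (random effect); otherwise delta_j is held constant (fixed).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        d = rng.gauss(0.0, sigma_b) if delta_j is None else delta_j
        noise = statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n_j))
        means.append(d + noise)
    return means

n_j, sigma, sigma_b = 25, 3.0, 1.0
var_fixed = statistics.variance(
    replicate_school_means(n_j, sigma, sigma_b, delta_j=1.5))
var_random = statistics.variance(
    replicate_school_means(n_j, sigma, sigma_b))
# Theory: sigma^2 / n_j = 0.36 with a fixed effect,
# sigma_b^2 + sigma^2 / n_j = 1.36 with a random effect.
```

With the effect held fixed, the replicate means vary only through the student-level terms; redrawing the effect adds $\sigma_B^2$ to their variance.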
In brief, we contend that for the inferences about a specific set of schools,
the schools have to be associated with fixed effects. This does not rule out
the application of random-effects models, because estimating some quantities
based on a model that is not valid is not in conflict with any statistical prin-
ciple, so long as the lack of validity is reflected in the inferential statements
we make. That is, the estimators used should be evaluated under the assumption
of a model that is, ideally, valid, or at least more credible than the model
used. Assuming in the evaluation that the school effects are fixed addresses
this point.
In the next section, we show that inferential statements based on the so-called
best linear unbiased predictors (BLUP), used as estimators, are not valid in two
important aspects: The MSEs are not correct, not even approximately or asymptotically,
as $J \to \infty$, and BLUP is not more efficient than the established
ANCOVA-based estimator for every school, although it is for a majority.
Notwithstanding these reservations, the established analysis, based on the model
in (1) or its extension, is useful, but the conclusions drawn (or implied) by the
established approach are somewhat optimistic and have to be carefully qualified.
In the analysis in the section Application, they relate to two schools (out of 18 in
the study) with the smallest enrolment.
Evaluation With the Fixed-Effects Assumptions
In this section, we study the properties of BLUP under the assumption that the
school-specific effects $\delta_j$ are fixed. To avoid distractions that are peripheral to
our argument, we assume in this section that the regression parameters in $\beta$ as
well as the variances $\sigma_B^2$ and $\sigma^2$ are known. The conditional expectation of the
deviation $\delta_j$ in (2) given the data defines the BLUP estimator of $\delta_j$:

$$\tilde\delta_j = \frac{\omega}{1 + n_j\omega}\, e_j^T 1,$$

where $e_j = (y_{1j} - x_{1j}\beta, \ldots, y_{n_j j} - x_{n_j j}\beta)^T$ is the vector of residuals for school $j$
and $1$ is the vector of unities of length ($n_j$) implied by the context, so that $e_j^T 1$ is
the within-school total of the residuals. In the standard approach, we claim that
the MSE of $\tilde\delta_j$ is equal to $\sigma_B^2/(1 + n_j\omega)$, the conditional variance of $\delta_j$.
Assuming that $\delta_j$ is a fixed quantity is, in effect, the same as conditioning on
its value. We have $E(e_j \mid \delta_j) = \delta_j 1$ and

$$E(\tilde\delta_j \mid \delta_j) = \frac{n_j\omega}{1 + n_j\omega}\,\delta_j,$$

so the bias of $\tilde\delta_j$ is $-\delta_j/(1 + n_j\omega)$. As a predictor, in the original setting of BLUP
(Henderson, 1975; Robinson, 1991), $\tilde\delta_j$ is unbiased. In contrast, in our setting,
involving estimation, $\tilde\delta_j$ is biased. Further, the sampling variance of $\tilde\delta_j$ is

$$\mathrm{var}\left(\frac{\omega}{1 + n_j\omega}\, e_j^T 1 \,\Big|\, \delta_j\right) = \frac{n_j\sigma^2\omega^2}{(1 + n_j\omega)^2}, \tag{3}$$

because, with $\delta_j$ fixed and $\beta$ known, $\mathrm{var}(e_j^T 1) = \mathrm{var}(n_j\delta_j + \varepsilon_{1j} + \cdots + \varepsilon_{n_j j}) = n_j\sigma^2$.
Hence the MSE of $\tilde\delta_j$ is

$$\mathrm{MSE}(\tilde\delta_j; \delta_j) = \frac{n_j\sigma^2\omega^2}{(1 + n_j\omega)^2} + \frac{\delta_j^2}{(1 + n_j\omega)^2} = \frac{n_j\sigma_B^2\omega + \delta_j^2}{(1 + n_j\omega)^2}. \tag{4}$$
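The expression in (4) can be checked numerically. The following sketch, with hypothetical function names and parameter values, compares the closed form with a Monte Carlo average over replications in which the school effect is held fixed and the regression parameters are treated as known, as assumed in this section.

```python
import random

def blup_mse_fixed(delta_j, n_j, sigma2, sigma2_b):
    """MSE of the BLUP of delta_j when delta_j is fixed, as in (4)."""
    omega = sigma2_b / sigma2
    return (n_j * sigma2_b * omega + delta_j ** 2) / (1 + n_j * omega) ** 2

def blup_mse_claimed(n_j, sigma2, sigma2_b):
    """Conditional variance sigma_B^2 / (1 + n_j * omega) of the standard approach."""
    omega = sigma2_b / sigma2
    return sigma2_b / (1 + n_j * omega)

def blup_mse_monte_carlo(delta_j, n_j, sigma2, sigma2_b, reps=20000, seed=2):
    """Empirical MSE over replications with delta_j held fixed and beta known."""
    rng = random.Random(seed)
    omega = sigma2_b / sigma2
    sd = sigma2 ** 0.5
    total = 0.0
    for _ in range(reps):
        # Within-school total of the residuals, e_j^T 1.
        resid_total = sum(delta_j + rng.gauss(0.0, sd) for _ in range(n_j))
        blup = omega / (1 + n_j * omega) * resid_total
        total += (blup - delta_j) ** 2
    return total / reps

# Hypothetical setting: n_j = 20, sigma^2 = 10, sigma_B^2 = 1.
mse_large = blup_mse_fixed(3.0, 20, 10.0, 1.0)
mse_mc = blup_mse_monte_carlo(3.0, 20, 10.0, 1.0)
```

In this setting the claimed value is 1/3 regardless of the school effect, whereas (4) gives 2/9 for $\delta_j = 0$, 1/3 for $\delta_j = \pm\sigma_B$, and 11/9 for $\delta_j = 3\sigma_B$.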
The prediction-sampling variance of BLUP, $\sigma_B^2/(1 + n_j\omega)$, is commonly
claimed to be the MSE of $\tilde\delta_j$. However, it coincides with the MSE in (4) only
when $\delta_j^2 = \sigma_B^2$, that is, for schools whose regressions have the typical deviation
$\delta_j = \pm\sigma_B$ from the average regression $x\beta$.

With the assumption of random $\delta_j$, we would claim that the estimator $\tilde\delta_j$ is
more efficient than the ANCOVA estimator, which, when $\beta$ in (1) is
assumed to be known, has the variance $\sigma^2/n_j$. This claim is based on the
inequality

$$\frac{\sigma_B^2}{1 + n_j\omega} < \frac{\sigma^2}{n_j}$$

for all $n_j$. The difference of the two sides diminishes as $\omega \to +\infty$, or as $n_j \to \infty$
while $\omega > 0$. The corresponding inequality for $\delta_j$ fixed,

$$\frac{n_j\sigma_B^2\omega + \delta_j^2}{(1 + n_j\omega)^2} < \frac{\sigma^2}{n_j},$$

is equivalent to

$$\delta_j^2 < 2\sigma_B^2 + \frac{\sigma^2}{n_j}.$$

Thus, the ANCOVA estimator is more efficient than the BLUP for the schools for
which $|\delta_j| > \sqrt{2\sigma_B^2 + \sigma^2/n_j} = \sigma\sqrt{2\omega + 1/n_j}$. In a typical congenial setting,
this is not a trivial proportion of schools. For illustration, suppose the values
of $\delta_j$, as a collection, are compatible with the normal distribution $N(0, \sigma_B^2)$ and
$\sigma^2 = 10\sigma_B^2$. Then for schools with $n_j = 50$, $\sqrt{2\sigma_B^2 + \sigma^2/n_j} = 1.48\sigma_B$. A value
$\delta$ drawn at random from $N(0, \sigma_B^2)$ is outside the range $(-1.48\sigma_B, 1.48\sigma_B)$ with
probability .14. So, when $\sigma_B^2 = \sigma^2/10$, BLUP is inferior to ANCOVA for about
one in seven of the schools with $n_j = 50$. The limits within which BLUP is more
efficient than ANCOVA get narrower with increasing $n_j$, but only slightly. For
example, for $n_j = 100$ (and $\sigma_B^2 = \sigma^2/10$), the limits are $\pm 1.45\sigma_B$ and they
converge to $\pm 1.41\sigma_B$ as $n_j \to +\infty$. For a given value of $\sigma^2$, the upper limit
$\sigma\sqrt{2\omega + 1/n_j}$ increases with $\omega$.
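The limits and the probability quoted in this illustration can be reproduced with a few lines of Python. This is a sketch using only the standard library; the function name is hypothetical, and the normal distribution of the deviations is the assumption made in the text.

```python
from math import sqrt
from statistics import NormalDist

def ancova_beats_blup(n_j, omega):
    """Limit on |delta_j| in units of sigma_B, and the N(0, sigma_B^2) mass beyond it."""
    # sqrt(2 * sigma_B^2 + sigma^2 / n_j) / sigma_B = sqrt(2 + 1 / (n_j * omega))
    limit = sqrt(2 + 1 / (n_j * omega))
    prob_outside = 2 * (1 - NormalDist().cdf(limit))
    return limit, prob_outside

# omega = 0.1 corresponds to sigma^2 = 10 * sigma_B^2.
limit_50, prob_50 = ancova_beats_blup(50, omega=0.1)
limit_100, _ = ancova_beats_blup(100, omega=0.1)
limit_large, _ = ancova_beats_blup(10 ** 6, omega=0.1)
```

Rounded to two decimal places, `limit_50` is 1.48 and `prob_50` is 0.14, `limit_100` is 1.45, and `limit_large` approaches $\sqrt{2} \approx 1.41$.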
We cannot identify the "exceptional" schools for which ANCOVA is
more efficient than BLUP. For schools with small $n_j$, assuming that
$\sigma_B^2 \ll \sigma^2$, the threshold of $2\sigma_B^2 + \sigma^2/n_j$ is large, so we need not be concerned
with the possible inefficiency of BLUP. For schools with large $n_j$, the deviation
$\delta_j$ is estimated with high precision by both BLUP and ANCOVA, so the possible
inefficiency is inconsequential. However, the issue is relevant for intermediate
sample sizes $n_j$.
A Composite Estimator
In this section, we derive an estimator based on the model with fixed school
effects. By construction, it is superior to BLUP, although it differs from it only
slightly. Its practical advantage is that an expression obtained for its MSE incor-
porates the uncertainty about the regression parameters.
The BLUP $\tilde\delta_j$ can be interpreted as a shrinkage estimator, pulling the unbiased
estimator $e_j^T 1$ toward zero, its unconditional expectation. The BLUP can also be
described as a composition (combination) of two estimators, $e_j^T 1$ and the identical
zero. The former is unbiased but has a relatively large sampling variance and
the latter has no sampling variance, but differs from $\delta_j$, and is therefore biased for
$\delta_j$. Composite estimators can be applied more generally, whenever there are several
contending estimators (Longford, 2008, Chapter 1). In many contexts, we
select one of the contenders. Composition has a greater potential than selection,
because by selection we can, at best, only match the performance of the most
efficient contender. In contrast, composition may yield an estimator more efficient
than any of the contending estimators. Indeed, with some qualifications, BLUP is
an important example.

Within the framework of ANCOVA, we consider two estimators: the standard
(ordinary least squares, OLS) estimator $\hat\beta_{0j}$ and the "averaged" estimator
$\hat\beta_0 = (\hat\beta_{01} + \cdots + \hat\beta_{0J})/J$. The former is unbiased but has a relatively large
sampling variance of at least $\sigma^2/n_j$. The latter has a much smaller sampling variance,
especially when the number of schools, $J$, is large, but has the bias $\beta_0 - \beta_{0j}$.
The composition of $\hat\beta_{0j}$ and $\hat\beta_0$ is an alternative to the standard ANCOVA in
which one of these estimators is selected on the basis of a hypothesis test.
The following proposition states how the optimal coefficients in the composition
are derived in a general setting.
Proposition
Let $\hat\theta_0$ and $\hat\theta_1$ be two distinct estimators of the same quantity $\theta$. Suppose $\hat\theta_0$ is
unbiased and the bias of $\hat\theta_1$ is equal to $B$. Let $V_0 = \mathrm{var}(\hat\theta_0)$ and $V_1 = \mathrm{var}(\hat\theta_1)$ and
$C = \mathrm{cov}(\hat\theta_0, \hat\theta_1)$. Then the composite estimator $\tilde\theta = (1 - b)\hat\theta_0 + b\hat\theta_1$ has the
smallest MSE for

$$b = \frac{V_0 - C}{V_0 + V_1 - 2C + B^2}. \tag{5}$$

The minimum attained is

$$\mathrm{MSE}\{\tilde\theta(b); \theta\} = V_0 - \frac{(V_0 - C)^2}{V_0 + V_1 - 2C + B^2}. \tag{6}$$

The proof is given in the Appendix. It is easy to check that the MSE in (6) is
smaller than both $V_0$ and $V_1 + B^2$, the respective MSEs of $\hat\theta_0$ and $\hat\theta_1$.
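The proposition can be verified numerically for any particular set of inputs. The sketch below uses hypothetical values of $V_0$, $V_1$, $C$, and $B$; the MSE of the composite for a general coefficient $b$ follows from expanding the variance and the squared bias of $(1 - b)\hat\theta_0 + b\hat\theta_1$.

```python
def composite_mse(b, v0, v1, c, bias):
    """MSE of (1 - b) * theta0_hat + b * theta1_hat for a given coefficient b."""
    return (1 - b) ** 2 * v0 + b ** 2 * (v1 + bias ** 2) + 2 * b * (1 - b) * c

def optimal_coefficient(v0, v1, c, bias):
    """Coefficient b from (5)."""
    return (v0 - c) / (v0 + v1 - 2 * c + bias ** 2)

def minimum_mse(v0, v1, c, bias):
    """Minimum MSE from (6)."""
    return v0 - (v0 - c) ** 2 / (v0 + v1 - 2 * c + bias ** 2)

# Hypothetical inputs, chosen only for illustration.
v0, v1, c, bias = 2.0, 0.2, 0.1, 1.0
b_opt = optimal_coefficient(v0, v1, c, bias)
mse_opt = minimum_mse(v0, v1, c, bias)

# A coarse grid search over b should not find a smaller MSE than (6).
grid_min = min(composite_mse(k / 1000, v0, v1, c, bias) for k in range(1001))
```

For these inputs the minimum is smaller than both $V_0 = 2$ and $V_1 + B^2 = 1.2$, in agreement with the remark following (6).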
To apply this proposition to estimating $\beta_{0j}$, we require expressions for the
variances and the covariance of $\hat\beta_{0j}$ and $\hat\beta_0$. The sampling variance matrix of
the regression parameter estimators in ANCOVA is derived from the matrix of
the totals of squares and crossproducts of the indicators of the schools and the
covariates. We partition this matrix as

$$\sigma^2 \begin{pmatrix} A & B \\ B^T & D \end{pmatrix}^{-1}, \tag{7}$$

where the $J \times J$ block $A = \mathrm{diag}(n_j)$ corresponds to the school-level intercepts $\beta_{0j}$
and $D = \sum_j \sum_i x_{ij}^T x_{ij}$ to the covariates. The $J$ rows of the off-diagonal block $B$
are the within-school totals $x_{+j} = x_{1j} + \cdots + x_{n_j j}$. For a given school $j$, we combine
the estimators $\hat\beta_{0j}$ with their average $\hat\beta_0$. In the Appendix, we show that the
matrix in (7) can be expressed as

$$\sigma^2 \begin{pmatrix} A^{-1} + A^{-1} B G B^T A^{-1} & -A^{-1} B G \\ -G B^T A^{-1} & G \end{pmatrix}, \tag{8}$$

where $G = (D - B^T A^{-1} B)^{-1}$. In our case,

$$G = \left\{ \sum_{j=1}^{J} \left( \sum_{i=1}^{n_j} x_{ij}^T x_{ij} - n_j \bar{x}_j^T \bar{x}_j \right) \right\}^{-1},$$

and $A^{-1}B$ comprises the vectors of within-school means $\bar{x}_j = n_j^{-1} x_{+j}$ as its
$J$ rows. Further, using the notation of the proposition for estimating $\beta_{0j}$,
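$$V_0 = \mathrm{var}(\hat\beta_{0j}) = \sigma^2 \left( \frac{1}{n_j} + \bar{x}_j G \bar{x}_j^T \right)$$

$$V_1 = \mathrm{var}(\hat\beta_0) = \sigma^2 \left( \frac{1}{J^2} \sum_{h=1}^{J} \frac{1}{n_h} + \bar{x}_0 G \bar{x}_0^T \right)$$

$$C = \mathrm{cov}(\hat\beta_{0j}, \hat\beta_0) = \sigma^2 \left( \frac{1}{J n_j} + \bar{x}_j G \bar{x}_0^T \right)$$

$$B = E(\hat\beta_0) - \beta_{0j} = \beta_0 - \beta_{0j},$$

where $\bar{x}_0 = J^{-1}(\bar{x}_1 + \ldots + \bar{x}_J)$ is the vector of the means of the within-school
means and $\beta_0 = (\beta_{01} + \ldots + \beta_{0J})/J$. Note that $V_0$, $C$, and $B$ depend on $j$, but
$V_1$ does not. The numerator of the coefficient $b$ in (5) is

$$V_0 - C = \sigma^2 \left\{ \frac{1}{n_j} \left( 1 - \frac{1}{J} \right) + \bar{x}_j G (\bar{x}_j - \bar{x}_0)^T \right\}$$

and the denominator is

$$V_0 + V_1 - 2C + B^2 = \sigma^2 \left\{ \frac{1}{n_j} - \frac{2}{J n_j} + \frac{1}{J^2} \sum_{h=1}^{J} \frac{1}{n_h} + \bar{x}_j G \bar{x}_j^T - 2 \bar{x}_j G \bar{x}_0^T + \bar{x}_0 G \bar{x}_0^T \right\} + (\beta_{0j} - \beta_0)^2$$

$$= \sigma^2 \left\{ \frac{1}{n_j} \left( 1 - \frac{1}{J} \right)^2 + \frac{1}{J^2} \sum_{h \neq j} \frac{1}{n_h} + (\bar{x}_j - \bar{x}_0) G (\bar{x}_j - \bar{x}_0)^T \right\} + (\beta_{0j} - \beta_0)^2.$$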
The (common) within-school variance $\sigma^2$ is usually estimated with
sufficient precision, so substituting its estimate $\hat\sigma^2$ in both expressions has
negligible consequences. However, the squared deviation $(\beta_{0j} - \beta_0)^2$, also
unknown, cannot be estimated directly with the precision required. We substitute
for it its school-level expectation, $E_{[j]}\{(\beta_{0j} - \beta_0)^2\} = \sigma_B^2$. The subscript $[j]$
indicates that the expectation (averaging) is over the schools. For $\sigma_B^2$ to be well
defined, we do not have to subscribe to the random-effects perspective, because
we can estimate

$$\sigma_B^2 = \frac{1}{J} \sum_{j=1}^{J} (\beta_{0j} - \beta_0)^2,$$

the version of the school-level variance which refers to the particular set of
schools in the study, not to any superpopulation.
Note that the uncertainty about the regression parameters is taken into account
throughout, unlike in the standard empirical Bayes (BLUP) analysis. The weakness
of our approach is the replacement of the squared deviation $(\beta_{0j} - \beta_0)^2$ by the
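$$\hat\sigma_B^2 = S - \frac{\hat\sigma^2}{J} \left\{ \left( 1 - \frac{1}{J} \right) \sum_{j=1}^{J} \frac{1}{n_j} + \sum_{j=1}^{J} (\bar{x}_j - \bar{x}_0) G (\bar{x}_j - \bar{x}_0)^T \right\}.$$

It is unbiased if an unbiased estimator $\hat\sigma^2$ is used. Note that the estimator
$\hat\omega = \hat\sigma_B^2/\hat\sigma^2$ is biased for $\omega$, but the bias is small when $\sigma^2$ is estimated with many
degrees of freedom.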
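In summary, we propose the composite ANCOVA estimator

$$\tilde\beta_{0j} = (1 - b_j)\hat\beta_{0j} + b_j \hat\beta_0, \tag{9}$$

with

$$b_j = \frac{\dfrac{1}{n_j}\left(1 - \dfrac{1}{J}\right) + \bar{x}_j G (\bar{x}_j - \bar{x}_0)^T}{\dfrac{1}{n_j}\left(1 - \dfrac{1}{J}\right)^2 + \dfrac{1}{J^2}\sum_{h \neq j} \dfrac{1}{n_h} + (\bar{x}_j - \bar{x}_0) G (\bar{x}_j - \bar{x}_0)^T + \hat\omega}, \tag{10}$$

and estimate its MSE by

$$\hat\sigma^2 \left[ \frac{1}{n_j} + \bar{x}_j G \bar{x}_j^T - b_j \left\{ \frac{1}{n_j}\left(1 - \frac{1}{J}\right) + \bar{x}_j G (\bar{x}_j - \bar{x}_0)^T \right\} \right],$$

the naive estimator of (6). The estimators $\hat\sigma^2$ and $\hat\omega$ are the only sources of
uncertainty in these expressions. A more precise but more complex alternative uses a
range of plausible values of the ratio $(\beta_{0j} - \beta_0)^2/\sigma^2$ in place of $\hat\omega$. We refer to the
established ANCOVA estimator as ANCOVA-OLS, to distinguish it from the
composite estimator given by (9).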
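To give a feel for how the coefficient in (10) behaves, the following sketch evaluates it in a deliberately simplified setting with no covariates, so that every term involving $G$ vanishes. This simplification, the function names, and the input values are assumptions made for illustration; they do not reproduce the analysis of the study.

```python
def composite_coefficients(sizes, omega_hat):
    """Coefficients b_j of (10) with all covariate (G) terms set to zero."""
    J = len(sizes)
    inv = [1.0 / n for n in sizes]
    total_inv = sum(inv)
    coefs = []
    for inv_j in inv:
        numer = inv_j * (1 - 1 / J)
        denom = (inv_j * (1 - 1 / J) ** 2
                 + (total_inv - inv_j) / J ** 2   # sum over h != j of 1/n_h, over J^2
                 + omega_hat)
        coefs.append(numer / denom)
    return coefs

def naive_mse(size_j, b_j, J, sigma2_hat):
    """Naive MSE estimate of the composite estimator, again without covariate terms."""
    return sigma2_hat * (1 / size_j - b_j * (1 / size_j) * (1 - 1 / J))

sizes = [20, 40, 80]          # hypothetical school sample sizes
omega_hat = 0.1               # hypothetical estimate of the variance ratio
b = composite_coefficients(sizes, omega_hat)
mse = [naive_mse(n, b_j, len(sizes), sigma2_hat=1.0) for n, b_j in zip(sizes, b)]
```

In this setting the smallest school receives the most shrinkage toward the average, and each naive MSE estimate is below the corresponding ANCOVA-OLS variance $\hat\sigma^2/n_j$.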
Application
The data analyzed originally by Aitkin and Longford (1986) comprise the
scores compiled on the O-level examinations of school-leavers in 18 secondary
schools in a Local Educational Authority in England (LEA scores) and the scores
on a general scholastic aptitude test, the Verbal Reasoning Quotient, VRQ, estab-
lished by a test soon after enrollment a few years earlier. The sex of each student
is recorded. There are two single-sex schools, with 21 and 22 students, respec-
tively. Their students have much higher LEA scores on average than the other
schools, but their average VRQ scores are also much higher. The other schools
have between 39% and 56% girls among 29 to 79 students. The dataset comprises
907 students in total, 477 boys and 430 girls. Students who do not take any
school-leaving examinations have zero LEA score. The scores are integers; the
highest score in the dataset is 68, attained by two students, and zero score is
attained by 138 students. The VRQ scores are also integers, in the range 70 to 140.
Table 1 lists the estimates and estimated standard errors of school-level deviations
$\delta_j$ based on the empirical Bayes model (BLUP) and ANCOVA, with
composition and without, that is, using OLS (marked by dagger). This analysis
ignores the obvious nonnormality due to the zero lower limit of the LEA scores.
We address this problem in the next section. The BLUP and ANCOVA composite
estimates are not pairwise comparable, because the former add up to zero,
whereas the latter do not. In fact, the OLS estimates, obtained prior to composition,
add up to zero, but after their (uneven) shrinkage the composite estimates do
not. No translation brings the two sets of estimates into a close agreement,
because the ANCOVA composite estimates are dispersed more than BLUP. With
BLUP, more shrinkage takes place; compare the corresponding values of $b_j$.
The estimated standard errors for BLUP are much smaller than for ANCOVA
with composition. This difference is largely due to the standards used in their
calculation. For BLUP, we ignore the uncertainty about $\beta$ and $\omega$. For ANCOVA, the
uncertainty about $\beta$ is accounted for, although the consequences of substituting
$\sigma_B^2$ (and then $\hat\sigma_B^2$) for the squared deviation $(\beta_{0j} - \beta_0)^2$ are ignored. This is
equivalent to substituting $\hat\omega$ for $(\beta_{0j} - \beta_0)^2/\sigma^2$. The standard errors quoted in the
column headed St.e. refer to the parametrization with $\sum_j \beta_{0j} = 0$. When shrinkage
is not applied, the estimates $\hat\beta^\dagger_{0j}$ satisfy this constraint. However, after the
differential shrinkage the estimated deviations have a nonzero total.

TABLE 1
Estimates of the School-level Deviations Based on Empirical Bayes Model (BLUP) and
ANCOVA, With Composition (Middle Section) and Without (by OLS); Models With
Parallel Regressions

| Sch. | BLUP $\hat\beta_{0j}$ | BLUP St.e. | BLUP $b_j$ | ANCOVA $\tilde\beta_{0j}$ | ANCOVA St.e. | ANCOVA St.e.0 | ANCOVA $b_j$ | OLS $\hat\beta^\dagger_{0j}$ | OLS St.e.† | $n_j$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.49 | 1.10 | 0.17 | 1.58 | 1.50 | 1.19 | 0.10 | 1.75 | 1.54 | 65 |
| 2 | 1.19 | 1.01 | 0.14 | 1.10 | 1.45 | 1.10 | 0.09 | 1.20 | 1.48 | 79 |
| 3 | 0.93 | 1.25 | 0.21 | 0.99 | 1.67 | 1.38 | 0.14 | 1.15 | 1.75 | 48 |
| 5 | -2.76 | 1.26 | 0.22 | -3.09 | 1.65 | 1.38 | 0.14 | -3.58 | 1.73 | 47 |
| 6 | 0.43 | 1.09 | 0.17 | 0.28 | 1.54 | 1.20 | 0.11 | 0.31 | 1.59 | 66 |
| 7 | -1.82 | 1.32 | 0.24 | -2.21 | 1.71 | 1.50 | 0.15 | -2.61 | 1.81 | 41 |
| 8 | -1.84 | 1.21 | 0.20 | -2.18 | 1.61 | 1.33 | 0.13 | -2.49 | 1.68 | 52 |
| 9 | -1.28 | 1.09 | 0.16 | -1.45 | 1.49 | 1.17 | 0.10 | -1.61 | 1.53 | 67 |
| 10 | -0.13 | 1.24 | 0.21 | -0.30 | 1.63 | 1.37 | 0.13 | -0.35 | 1.70 | 49 |
| 11 | 3.32 | 1.26 | 0.22 | 3.56 | 1.67 | 1.38 | 0.14 | 4.13 | 1.75 | 47 |
| 12 | -1.68 | 1.23 | 0.21 | -1.92 | 1.61 | 1.34 | 0.13 | -2.20 | 1.68 | 50 |
| 13 | 0.63 | 1.32 | 0.24 | 0.63 | 1.71 | 1.47 | 0.15 | 0.75 | 1.80 | 41 |
| 14 | -1.15 | 1.24 | 0.21 | -1.42 | 1.65 | 1.36 | 0.13 | -1.63 | 1.72 | 49 |
| 15 | -0.04 | 1.50 | 0.31 | -0.11 | 1.89 | 1.74 | 0.20 | -0.14 | 2.05 | 29 |
| 16 | -1.17 | 1.06 | 0.15 | -1.39 | 1.50 | 1.14 | 0.10 | -1.54 | 1.54 | 72 |
| 17 | -0.99 | 1.12 | 0.17 | -1.18 | 1.54 | 1.22 | 0.11 | -1.32 | 1.59 | 62 |
| 20 | 7.49 | 1.64 | 0.37 | 9.22 | 2.00 | 2.10 | 0.25 | 12.34 | 2.25 | 22 |
| 21 | -2.61 | 1.67 | 0.38 | -2.90 | 2.24 | 2.14 | 0.30 | -4.15 | 2.56 | 21 |

Note: St.e. = standard error.
We are interested only in the contrasts of the deviations, so it may be more
appropriate to estimate the standard errors of $\tilde\beta_{0j} - \tilde\beta_0$. They are given in the
column headed St.e.0. They are much smaller than the standard errors quoted for $\tilde\beta_{0j}$
(e.g., 1.10 vs. 1.45 for School 2), but they are still greater than the BLUP-related
standard errors, because they do not ignore the uncertainty about the regression
coefficients on VRQ and Sex. For completeness, the estimates and estimated
standard errors obtained by ANCOVA without shrinkage are given in the
right-most part of Table 1. Note that the standard error of any contrast
$\beta_{0j_1} - \beta_{0j_2}$ is estimated straightforwardly with BLUP because the estimators
$\hat\beta_{0j}$ are assumed to be independent. In ANCOVA, they are dependent, providing
another reason for dismissing any straightforward comparison of the pairs of
standard errors. Suffice it to say that the BLUP and ANCOVA estimators use similar
shrinkage coefficients, and are therefore functionally similar.
The standard errors can be studied in greater detail in Figure 1. In the panel at
the top, the estimated root-MSEs based on (4) are drawn as functions of the
deviations $\delta = \delta_j$ for $\delta$ in the range $(\tilde\delta_j - 2s_j, \tilde\delta_j + 2s_j)$, where $s_j$ are the commonly
reported (estimated) standard errors $\hat\sigma_B/\sqrt{1 + n_j\hat\omega}$, marked by circles. The black
discs mark the root-MSEs evaluated at the estimates $\tilde\delta_j$ and the horizontal
ticks are drawn at the ANCOVA-OLS standard errors $\hat\sigma/\sqrt{n_j}$. The three sets
of estimated standard errors are summarised in the left-hand part of the panel
by vertical segments drawn in the descending order of the reported standard
errors. For most schools, the ANCOVA root-MSE is uniformly greater than the
standard error for any plausible value of the deviation $\delta_j$. The two schools with
the fewest students are the exceptions, most notably School 20 ($n_{20} = 22$), for
which the ANCOVA-OLS standard error is uniformly smaller. For School 21, the
reported standard error and the standard error evaluated at the estimate nearly
coincide because $\tilde\delta_{21} \doteq -\hat\sigma_B$.

The bottom panel of Figure 1 contains the estimated root-MSEs of the
ANCOVA composite estimators as functions of the deviation $\delta_j = \beta_{0j} - \beta_0$.
The "reported" root-MSEs are the (naively) estimated minima given by the
square roots of (6), with $\hat\omega$ substituted for $(\beta_{0j} - \beta_0)^2/\sigma^2$. In general, the
root-MSEs are greater than their random-effects counterparts, because they account
for the uncertainty about the regression on VRQ and Sex, but their dependence
on the value of $\beta_{0j}$ (or $\delta_j = \beta_{0j} - \beta_0$) is much weaker. For most schools, the
reported root-MSE is the smallest and the ANCOVA-OLS root-MSE the largest, but
the two schools with the smallest sample sizes stand out as exceptions. Note that
the number of such exceptions is probably underrepresented in the diagram,
because the values of $\delta_j$ are dispersed more than their estimates $\tilde\delta_j$ in both panels,
and more than just one of them is likely to be outside the range $(-\hat\sigma_B, \hat\sigma_B)$.
FIGURE 1. Standard errors (root-MSEs) of the random-effects (BLUP) and fixed-effects
(ANCOVA) estimators of the deviations $\beta_{0j}$, as functions of the deviation $\delta_j$. The vertical
dashes are drawn at $\pm\hat\sigma_B$, the estimated school-level standard deviation.
[Figure not reproduced. Two panels, "Random effects" (top) and "Fixed effects" (bottom); horizontal axis: Deviation; vertical axis: Standard error; legend: Reported, Eval. at estimate, ANCOVA (top panel) or ANOVA (OLS) (bottom panel).]

Zero Outcomes

About 15% of the students (138 out of 907) have zero LEA scores; they did not
take part in the final examinations in any subject or obtained no qualifying points
in them. The presence of such observations undermines the validity of the
analysis, because the outcomes are distinctly nonnormally distributed, irrespective
of the conditioning (regression model) applied. We deal with this problem by
regarding such outcomes as truncated at zero, as if originally these outcomes were
negative, and fit the regression to the outcomes prior to truncation. There is extensive
literature on censoring (e.g., Klein & Moeschberger, 2004), but most of it is
related to survival analysis, with applications to medicine and engineering, involving
distributions other than normal. For an application to educational research, see
Braun and Zwick (1993).
We treat truncation as a cause of data incompleteness and apply multiple
imputation (Longford, 2005, Part 1; Rubin, 2002; Schafer, 1997) to fit models
that refer to the complete dataset, in which some outcomes have negative values
but these values are not known. Two schools, numbers 20 and 21, have no stu-
dents with zero outcomes. Since the students in these schools have substantially
greater mean scores, we base the imputation only on the remaining 16
schools (864 students).
We apply the following iterative procedure. For a set of provisional underly-
ing (pre-truncation) values of y, we generate provisional plausible ordinary
regressions of LEA scores on VRQ and Sex and school. We do not use the OLS
fit, but draw a plausible residual variance $\tilde\sigma^2$ from its estimated sampling
distribution (scaled $\chi^2$), and then a set of plausible regression parameters $\tilde\beta$ from the
sampling distribution of $\hat\beta$, in which $\sigma^2$ is replaced by $\tilde\sigma^2$. For each student with
zero recorded outcome, we generate replicate underlying outcome scores according
to this regression until a negative value is obtained. This value is a random
draw from the conditional distribution of the score given that it is negative. It
replaces the previous provisional imputed value. The iterations are started with
all the values for the truncated outcomes set to zero, and they should be
concluded when convergence in distribution is reached for the imputed values.
Such convergence is difficult to assess, because the 138 values are associated
with several distinct distributions. However, doing more iterations than is necessary
causes no harm.
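
The iterative procedure can be sketched in Python as follows. This is only an illustration under stated assumptions, not the author's code (which was written in R): the function name `impute_truncated`, the design matrix `X`, and the cap `max_draws` on the number of rejection draws are hypothetical choices made for the example.

```python
import numpy as np

def impute_truncated(X, y, n_iter=20, max_draws=200, rng=None):
    """Replace zero outcomes by plausible negative values (illustrative sketch)."""
    rng = np.random.default_rng(rng)
    y_work = np.asarray(y, dtype=float).copy()  # provisional values, zeros at the start
    zero = y_work == 0.0                        # the outcomes regarded as truncated
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    chol = np.linalg.cholesky(XtX_inv)          # chol @ chol.T equals XtX_inv
    for _ in range(n_iter):
        # OLS fit to the current provisional outcomes
        beta_hat = XtX_inv @ X.T @ y_work
        resid = y_work - X @ beta_hat
        # plausible residual variance: scaled chi-square draw
        sigma2 = resid @ resid / rng.chisquare(n - p)
        # plausible coefficients: normal draw with the plausible variance plugged in
        beta = beta_hat + np.sqrt(sigma2) * (chol @ rng.standard_normal(p))
        mu = X[zero] @ beta
        new = np.zeros(mu.shape)                # stays zero if no negative draw occurs
        todo = np.ones(mu.shape, dtype=bool)
        for _ in range(max_draws):
            idx = np.flatnonzero(todo)
            draw = mu[idx] + np.sqrt(sigma2) * rng.standard_normal(idx.size)
            ok = draw < 0                       # keep the first negative draw
            new[idx[ok]] = draw[ok]
            todo[idx[ok]] = False
            if not todo.any():
                break
        y_work[zero] = new                      # replace the previous provisional values
    return y_work
```

Each call returns one completed set of outcomes; the positive outcomes are left unchanged, and only the entries recorded as zero are replaced.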
After preliminary exploration, we adjusted this algorithm as follows. First, we
carry out 20 iterations of (provisional) imputation by conditioning only on VRQ.
Then, we continue with a further 20 iterations, separately for each of the 16 schools,
in which we condition also on Sex. For each provisional value (in every iteration),
we have to draw several values from a normal distribution, until we obtain
a negative value. We set the maximum number of such draws to 200; if none of
the 200 values is negative, then the provisional value is set to zero. We checked
in several settings that such instances are very rare. In a set of 10 replications,
generating $10 \times 138$ plausible values of the underlying LEA scores, one student
had two zeros imputed and five others had one zero each.

The procedure is computationally not demanding; with the code written in
R for the purpose, its execution requires about 0.7 sec. of CPU time. We replicate
the procedure $M = 10$ times, to generate 10 replicate completed datasets on
which we apply the estimation procedures described in the section Application.
The 10 sets of replicate (completed-data) estimates are averaged to obtain the
multiple-imputation (MI) estimates. Their sampling variances are also estimated
as the averages across the replications, with an inflation by the between-
imputation variance. For details and further background to MI, see Schafer
(1997) and Rubin (2002).
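
The combination of the completed-data results can be sketched as below. The sketch assumes the standard combining rule from the multiple-imputation literature, in which the between-imputation variance enters with the factor $1 + 1/M$; the text above states only that the average variance is inflated by the between-imputation variance, so this factor and the function name `combine_mi` are assumptions of the example.

```python
import numpy as np

def combine_mi(estimates, variances):
    """Combine M completed-data estimates and their estimated sampling variances."""
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(variances, dtype=float)
    M = est.shape[0]
    mi_estimate = est.mean(axis=0)      # average of the completed-data estimates
    within = var.mean(axis=0)           # average estimated sampling variance
    between = est.var(axis=0, ddof=1)   # between-imputation variance
    total = within + (1.0 + 1.0 / M) * between
    return mi_estimate, total
```

With `estimates` of shape (M, k) the function returns k MI estimates and k variances; the square roots of the latter are the MI standard errors.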
The estimates and estimated standard errors for the original (truncated) and the
underlying outcomes for the random- and fixed-effects models are listed in
Table 2. For fixed effects, we do not give an estimate of the standard error of the
intercept, because the intercept is confounded with the parameters $\beta_{0j}$. For the
underlying outcomes, the fitted regressions have higher slopes on VRQ and
greater estimated residual variance $\hat\sigma^2$ as well as overall variance $\hat\sigma^2(1 + \hat\omega)$.
The estimated standard errors for the regression parameters are inflated only
slightly, because the fraction of the missing information is small. It would be
equal to 15% (on the scale of variance and MSE) if we had no information about
the values of the outcomes when they are truncated. However, we know that they
are negative, and many of them are close to zero, so the fraction is much smaller
than 15%.
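
As a rough check of the comparison of the overall variances, $\hat\sigma^2(1 + \hat\omega)$ can be evaluated from the estimates in Table 2 (products rounded to two decimals):

$$
\begin{aligned}
\text{Random effects, truncated:}\quad & 111.54 \times (1 + 0.060) = 118.23, \\
\text{Random effects, underlying:}\quad & 113.12 \times (1 + 0.056) = 119.45, \\
\text{Fixed effects, truncated:}\quad & 94.37 \times (1 + 0.128) = 106.45, \\
\text{Fixed effects, underlying:}\quad & 113.89 \times (1 + 0.107) = 126.08.
\end{aligned}
$$

For both models the value is greater for the underlying outcomes, and the increase is more pronounced for the fixed-effects model.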
The estimates and estimated standard errors of the school-level deviations are
listed in Table 3. The standard errors for the estimates based on the underlying
regression are slightly inflated but, with a few exceptions, the estimates are
changed only slightly. After multiple imputation, the collections of estimates and
estimated standard errors largely retain the features observed on the estimates
based on the original (truncated) outcomes. Figure 2 presents them in a format
that is easier to digest. Each vertical segment is the interval
$(\hat\delta_j - 2\hat s_j,\ \hat\delta_j + 2\hat s_j)$, where $\hat\delta_j$ is an estimate and $\hat s_j$ the associated standard error. It shows
that shrinkage (with BLUP or ANCOVA) alters the estimates and reduces the
estimated standard errors but substantially so only for the two extreme schools,
20 and 21, at the price of the qualifications and deficiencies discussed earlier. And
these qualifications are essential for these two schools.

TABLE 2
Estimates and Estimated Standard Errors for the Original and Underlying Outcomes;
Models With Fixed and Random Effects

                          Random Effects                        Fixed Effects
              1         VRQ     Sex     $\sigma^2$   $\omega$      VRQ     Sex     $\sigma^2$   $\omega$
Truncated outcomes
  Estimate   –69.299   0.906   0.959   111.54   0.060      0.813   0.887    94.37   0.128
  St. error            0.028   0.718            0.027      0.027   0.669
Underlying outcomes
  Estimate   –69.803   0.912   0.848   113.12   0.056      0.898   1.244   113.89   0.107
  St. error            0.029   0.731            0.026      0.030   0.708
Varying Slopes
Bearing in mind that there are bound to be important covariates that were not
recorded, the results can hardly be used for any policy-related decisions or for
choice of alternative schools by a student or parent, especially when we realize
that the relevant coefficients may change from one year to the next (Leckie &
Goldstein, 2009). We have addressed one important model validity issue, related
to the truncation of the outcomes; the variation of the regression slopes (on VRQ)
is another. It corresponds to the model
$$y_{ij} = \mathbf{x}_{ij}\boldsymbol\beta + \mathbf{z}_{ij}\boldsymbol\delta_j + \varepsilon_{ij},$$
in which $\mathbf{x}_{ij}$ comprises the values of the covariates (including the intercept 1)
for student $i$ in school $j$, and $\mathbf{z}_{ij}$ is its subvector for the intercept and VRQ
or, in general, for the covariates associated with school-level variation; $\boldsymbol\delta_j$ and
$\varepsilon_{ij}$ are independent random samples from $N_2(\mathbf{0}, \Sigma_B)$ and $N(0, \sigma^2)$,
respectively. The $2 \times 2$ variance matrix $\Sigma_B$ describes the pattern of school-level
variation.

TABLE 3
Estimates and Estimated Standard Errors of the School-level Deviations (Effects) Based
on Models With Random and Fixed Effects and Parallel Regressions

               BLUP                          ANCOVA                        ANCOVA-OLS
Sch.   Truncated      Underlying     Truncated      Underlying     Truncated      Underlying
  1     1.37 (1.17)    1.62 (1.18)    1.58 (1.50)    1.81 (1.64)    1.75 (1.54)    2.05 (1.70)
  2     1.43 (1.08)    1.71 (1.11)    1.10 (1.45)    1.75 (1.60)    1.20 (1.48)    1.95 (1.64)
  3     1.03 (1.31)    1.05 (1.32)    0.99 (1.67)    1.19 (1.82)    1.15 (1.75)    1.42 (1.92)
  5    –2.71 (1.32)   –2.69 (1.35)   –3.09 (1.65)   –3.12 (1.82)   –3.58 (1.73)   –3.72 (1.92)
  6     0.24 (1.16)    0.38 (1.21)    0.28 (1.54)    0.27 (1.72)    0.31 (1.59)    0.31 (1.79)
  7    –1.43 (1.39)   –1.52 (1.47)   –2.21 (1.71)   –1.92 (1.93)   –2.61 (1.81)   –2.34 (2.08)
  8    –1.48 (1.27)   –1.60 (1.30)   –2.18 (1.61)   –1.95 (1.77)   –2.49 (1.68)   –2.29 (1.86)
  9    –0.91 (1.15)   –0.97 (1.17)   –1.45 (1.49)   –1.13 (1.63)   –1.61 (1.53)   –1.27 (1.69)
 10     0.20 (1.30)   –0.26 (1.33)   –0.30 (1.63)   –0.43 (1.79)   –0.35 (1.70)   –0.50 (1.89)
 11     3.49 (1.32)    3.45 (1.35)    3.56 (1.67)    3.89 (1.83)    4.13 (1.75)    4.65 (1.93)
 12    –1.82 (1.29)   –1.60 (1.34)   –1.92 (1.61)   –1.88 (1.79)   –2.20 (1.68)   –2.21 (1.88)
 13     0.52 (1.39)    0.16 (1.45)    0.63 (1.71)    0.14 (1.90)    0.75 (1.80)    0.17 (2.04)
 14    –1.90 (1.30)   –1.13 (1.33)   –1.42 (1.65)   –1.41 (1.81)   –1.63 (1.72)   –1.67 (1.91)
 15    –0.01 (1.56)   –0.18 (1.59)   –0.11 (1.89)   –0.26 (2.07)   –0.14 (2.05)   –0.33 (2.29)
 16    –0.92 (1.12)   –1.12 (1.15)   –1.39 (1.50)   –1.35 (1.65)   –1.54 (1.54)   –1.52 (1.71)
 17    –0.42 (1.19)   –0.54 (1.21)   –1.18 (1.54)   –0.69 (1.69)   –1.32 (1.59)   –0.80 (1.75)
 20     6.37 (1.70)    6.20 (1.72)    9.22 (2.00)    8.21 (2.15)   12.34 (2.25)   11.54 (2.47)
 21    –3.04 (1.72)   –3.00 (1.73)   –2.90 (2.24)   –3.57 (2.41)   –4.15 (2.56)   –5.43 (2.81)
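
To make the structure of the varying-slopes model concrete, the following sketch simulates data from it. Everything numerical in the example is hypothetical (the VRQ scale, the 0/1 coding of Sex, the school sizes and the parameter values passed to the function); only the form of the model, with school-level deviations of the intercept and of the slope on VRQ, is taken from the text.

```python
import numpy as np

def simulate_varying_slopes(n_per_school, beta, Sigma_B, sigma2, seed=None):
    """Simulate y_ij = x_ij beta + z_ij delta_j + eps_ij (illustrative only)."""
    rng = np.random.default_rng(seed)
    y_parts, x_parts, school_parts = [], [], []
    for j, n_j in enumerate(n_per_school):
        # school-level deviation of the intercept and of the slope on VRQ
        delta_j = rng.multivariate_normal(np.zeros(2), Sigma_B)
        vrq = rng.normal(100.0, 15.0, size=n_j)   # hypothetical VRQ scale
        sex = rng.integers(0, 2, size=n_j)        # hypothetical 0/1 coding
        x = np.column_stack([np.ones(n_j), vrq, sex])
        z = x[:, :2]                              # subvector: intercept and VRQ
        eps = rng.normal(0.0, np.sqrt(sigma2), size=n_j)
        y_parts.append(x @ beta + z @ delta_j + eps)
        x_parts.append(x)
        school_parts.append(np.full(n_j, j))
    return (np.concatenate(y_parts), np.vstack(x_parts),
            np.concatenate(school_parts))
```

Setting the second diagonal element of `Sigma_B` to zero (together with the off-diagonal elements) reduces the simulation to the model with parallel regressions.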
In the ANCOVA version of this model, $\boldsymbol\delta_j$, $j = 1, \ldots, J$, are unknown vectors
of constants representing the interaction of VRQ with the School as a categorical
variable. The ANCOVA composite estimator can be extended to this model by
minimising the MSE of the (multivariate) composition
$$(\mathbf{z} - \mathbf{b}_j)^\top \hat{\boldsymbol\beta}_j + \mathbf{b}_j^\top \hat{\boldsymbol\beta}_0,$$
where $\mathbf{z} = (1, z)^\top$ is the vector at which we want to estimate (predict) the average
outcome and $\mathbf{b}_j$ is a vector of coefficients, which can be interpreted as inducing
(bivariate) shrinkage. The vector $\mathbf{b}_j$ depends on $\Omega = \sigma^{-2}\Sigma_B$ and has to be
estimated by moment matching. The details for $\mathbf{b}_j$ and $\Omega$ are given in the Appendix.
We note that the optimal vector $\mathbf{b}_j$ has the form $Q_j^{-1}P_j\mathbf{z}$, where the matrices
$P_j$ and $Q_j$, the multivariate versions of the numerator $V_0 - C$ and the denominator
$V_0 + V_1 - 2C + B^2$ in (5), respectively, do not depend on $\mathbf{z}$. Thus, the solutions for a collection
of vectors $\mathbf{z}$ are characterized by the matrix $B_j = Q_j^{-1}P_j$, common to all of them.

FIGURE 2. Estimates and estimated standard errors (root-MSEs) of the school-level
deviations based on the random- and fixed-effects models with the truncated and underlying
outcomes.
[Figure not reproduced. Horizontal axis: School; vertical axis: Estimate. Order of
the estimates for each school: BLUP (Truncated), BLUP (Underlying), ANCOVA (Truncated),
ANCOVA (Underlying), ANCOVA-OLS (Truncated), ANCOVA-OLS (Underlying).]
Estimators based on more complex models usually have smaller biases than
their simpler submodels but have greater sampling variation. In our case, the
variation is the dominant contributor to the MSE even for the schools with the
largest $n_j$. This suggests that the model with varying slopes on VRQ is unlikely
to be useful for any purpose associated with comparing the schools. This is
indeed the case with both BLUP and ANCOVA. The problems with estimating
the standard errors of the school-specific coefficients are exacerbated somewhat.
The deviations of a within-school regression from the average regression depend
on VRQ, and therefore the associated sampling variances are (quadratic)
functions of VRQ, which depend on $\Omega$ and the deviations $\boldsymbol\delta_j$ themselves. We omit
the details, but present the random- and fixed-effects estimates in Table 4, and
the sets of fitted within-school deviations in Figure 3.
The estimated deviation lines $\hat\delta_{0j} + \hat\delta_{1j}\,\mathrm{VRQ}$ in Figure 3 are drawn between the
10th and 90th percentiles of the VRQ scores for each school’s sample. By sub-
tracting the average regression, the resolution of each panel is greatly improved.
The lines obtained by multiple imputation for the zero outcomes (the right-hand
panels) differ only slightly from their counterparts with the zero outcomes taken
at face value, but the BLUP (random effects) and ANCOVA (fixed effects)
results differ substantially, both in the extent and pattern of variation. However,
much of this difference is illusory because of the substantial sampling variation
associated with the intercepts and slopes of the lines, as well as their scaled var-
iance matrix Ω.
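
The end points of a deviation line of the kind drawn in Figure 3 can be obtained as in the following sketch. The function name `deviation_line` and its arguments are hypothetical; the percentiles are computed with NumPy's default (linear) interpolation, which need not coincide with the definition used for the figure.

```python
import numpy as np

def deviation_line(delta0, delta1, vrq):
    """End points of the line delta0 + delta1 * VRQ over a school's central VRQ range."""
    lo, hi = np.percentile(vrq, [10, 90])   # 10th and 90th percentiles of VRQ
    x = np.array([lo, hi])
    return x, delta0 + delta1 * x
```

Restricting the line to this range avoids extrapolating a school's fitted deviation to VRQ values that are rare in its sample.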
TABLE 4
Estimates and Estimated Standard Errors for the Original and Underlying Outcomes;
Models With (Random and Fixed) School-Specific Slopes on VRQ

                           BLUP                                  ANCOVA
               1         VRQ       Sex     $\sigma^2$        1 (a)      VRQ (a)   Sex     $\sigma^2$
Truncated outcomes
  Estimate    –71.054    0.921     1.040   109.21      –56.673    0.799     0.653    91.97
  St. error              0.044     0.716                                    0.684
  $\Omega$      1.608     0.0179                          5.674     0.0508
                0.0179    0.00020                         0.0508    0.00048
Underlying outcomes
  Estimate    –71.919    0.931     0.899   111.52      –65.465    0.878     0.949   111.09
  St. error              0.048     0.740                                    0.746
  $\Omega$      2.094     0.0226                          6.191     0.0553
                0.0226    0.00025                         0.0553    0.00055

(a) Average across the schools (for the intercept and VRQ in ANCOVA).
Discussion
We have shown that the assumption of randomness is not essential for borrowing
strength, that is, for exploiting the similarity of related schools (clusters). With the
assumption, maximum likelihood (ML) estimation is satisfactory. When the effects
(deviations) of the units are assumed to be fixed, ML coincides with OLS and is
unsatisfactory, but that is a problem of ML, not of the assumption. The problem
of estimating these deviations is quintessentially small-sample and distinctly not
asymptotic, and ML is not efficient. It can be improved upon, on average, by com-
position, without altering the assumption.
We have argued that the effects associated with schools should be treated
as fixed, because we wish to make inferences about the educational processes
of specific schools. The composite estimators based on this assumption are
only slightly more efficient than with random effects. Our analysis of the
MSEs shows that the established BLUP estimator is more efficient than the
ANCOVA estimator for a majority of schools but not for all of them.
The MSEs of the estimators depend on the school effects $\delta_j$, and so can
be studied (and presented) more completely through the ranges of their
plausible values.

FIGURE 3. Fitted school-level deviations from the average regression; models with
random (BLUP) and fixed (ANCOVA) coefficients with school-specific slopes on VRQ.
[Figure not reproduced. Panels: BLUP (Truncated outcomes), BLUP (Underlying outcomes),
ANCOVA (Truncated outcomes), ANCOVA (Underlying outcomes). Horizontal axis: VRQ;
vertical axis: ILEA score.]
Composition is a general principle that can be applied whenever there are
alternative estimators of the target quantity. Thus, we may even consider $\hat\sigma_B^2$ and
$(\hat\beta_{0j} - \hat\beta_0)^2$ as alternative estimators of $(\beta_{0j} - \beta_0)^2$, and seek their combination
that attains the smallest MSE. In this case, both contending estimators are biased
but their biases are identified. The problem is treated by Longford (2007).
More flexible patterns of school-level differences are introduced by
covariate-by-school interactions. In the random-effects perspective, they
correspond to varying slopes (on continuous variables) and varying differences
(among levels of categorical variables). The approach presented in the sec-
tion A Composite Estimator can be extended straightforwardly. For example,
the matrix A is replaced by the block-diagonal matrix of within-school
counts, means, and crossproducts of the variables involved in the
interactions.
Issues similar to those discussed here arise in small-area estimation, in which
the inferential targets are the population means of a variable in the districts of a
country. Multilevel models and BLUP are the adopted standard in such analyses
(see Rao, 2003), but Longford (2005, Part 2, 2007) pointed out the conflict
between the sampling-design and model-based perspectives and the correspond-
ing treatment of the terms associated with the districts as fixed or random. Ran-
dom effects are essential for borrowing strength (exploiting similarity) neither
across the districts in small-area estimation, nor across schools in school-
effectiveness studies.
Another approach to the assessment of schools, or competing institutions in
general, is outlined by Longford and Rubin (2006). The approach is based on the potential outcomes framework (Holland, 1986), in which every student is associ-
ated with a set of fixed outcomes, one for each school, and a comparison of two
schools is defined by the average difference of the outcomes for a school-
specific reference set of students. This approach requires no distributional assump-
tions and the differences of the potential outcomes of a student for two schools are
not assumed to follow any particular pattern. Estimation can be framed as a
missing-data problem (for each student only one outcome is observed and the oth-
ers are missing), and multiple imputation applied to address the uncertainty
involved. Explicit modeling of the (self-)selection process, that is, of the
assignment of students to schools, is a strength of this approach, but it can be fully
realized only when a rich set of covariates is collected.
Appendix
Proof of the Proposition
The MSE of the estimator $\tilde\theta = (1 - b)\hat\theta_1 + b\hat\theta_2$ is
$$
\begin{aligned}
m(b) &= (1 - b)^2 V_0 + 2b(1 - b)C + b^2(V_1 + B^2) \\
     &= b^2(V_0 - 2C + V_1 + B^2) - 2b(V_0 - C) + V_0,
\end{aligned}
$$
and this quadratic function, with a positive quadratic term, attains its minimum
for $b = (V_0 - C)/(V_0 + V_1 - 2C + B^2)$, as claimed in (5). The minimum attained,
given by (6), is obtained directly by substituting this solution in $m(b)$.
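
The result can be checked numerically. The sketch below evaluates the quadratic $m(b)$ and its minimiser for arbitrary illustrative inputs; the function names are hypothetical. Substituting the minimiser into $m(b)$ gives $V_0 - (V_0 - C)^2/(V_0 + V_1 - 2C + B^2)$, which the example also confirms.

```python
def composite_mse(b, V0, V1, C, B):
    """MSE m(b) of the composition with coefficient b."""
    return (1 - b) ** 2 * V0 + 2 * b * (1 - b) * C + b ** 2 * (V1 + B ** 2)

def optimal_coefficient(V0, V1, C, B):
    """Coefficient b that minimises m(b)."""
    return (V0 - C) / (V0 + V1 - 2 * C + B ** 2)

# Illustrative values only: V0 = 4, V1 = 1, C = 0.5, B = 1.5
b_star = optimal_coefficient(4.0, 1.0, 0.5, 1.5)       # 3.5 / 6.25 = 0.56
m_min = composite_mse(b_star, 4.0, 1.0, 0.5, 1.5)      # 4 - 3.5**2 / 6.25 = 2.04
```

In this example the composition reduces the MSE from $V_0 = 4$ (for $b = 0$) to 2.04, even though the second estimator is biased.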
The expression (8) for the inverse of a partitioned matrix can be derived by
sweeping the matrix in (7). For instance, by adding to the second block-row the
$-B^\top A^{-1}$ premultiple of the first block-row we obtain the zero matrix in the bottom
off-diagonal block. The expression is confirmed more simply by multiplication:
$$
\begin{pmatrix} A & B \\ B^\top & D \end{pmatrix}
\begin{pmatrix} A^{-1} + A^{-1}BGB^\top A^{-1} & -A^{-1}BG \\ -GB^\top A^{-1} & G \end{pmatrix}
= \begin{pmatrix} I_n & 0_{n,p} \\ 0_{p,n} & I_p \end{pmatrix},
$$
where $I$ is the identity matrix and $0$ the matrix of zeros, with their dimensions
indicated in the subscripts, and $G = (D - B^\top A^{-1}B)^{-1}$.
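
The identity is easy to confirm numerically. The following sketch assembles the partitioned inverse from the blocks $A$, $B$ and $D$; the function name is hypothetical, and the sketch assumes that $A$ and $D - B^\top A^{-1}B$ are nonsingular.

```python
import numpy as np

def partitioned_inverse(A, B, D):
    """Inverse of [[A, B], [B.T, D]] assembled from its blocks."""
    A_inv = np.linalg.inv(A)
    G = np.linalg.inv(D - B.T @ A_inv @ B)
    top_left = A_inv + A_inv @ B @ G @ B.T @ A_inv
    top_right = -A_inv @ B @ G
    bottom_left = -G @ B.T @ A_inv
    return np.block([[top_left, top_right], [bottom_left, G]])
```

Only the $n \times n$ block $A$ and the $p \times p$ matrix $D - B^\top A^{-1}B$ are inverted, which is convenient when $A$ has a simple (for instance, block-diagonal) form.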
Multivariate Shrinkage
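Suppose $\boldsymbol\theta$ is a $p \times 1$ vector of parameters and we wish to estimate the
combination $\mathbf{z}^\top\boldsymbol\theta$. Let $\hat{\boldsymbol\theta}_0$ be an unbiased and $\hat{\boldsymbol\theta}_1$ another estimator of $\boldsymbol\theta$. Denote
their respective variance matrices by $V_0$ and $V_1$ and their covariance matrix by
$C$. The bias of $\hat{\boldsymbol\theta}_1$ is denoted by $B$.
We seek the estimator
$$\tilde\theta = (\mathbf{z} - \mathbf{b})^\top \hat{\boldsymbol\theta}_0 + \mathbf{b}^\top \hat{\boldsymbol\theta}_1,$$
which has the minimum MSE. The optimal vector of coefficients $\mathbf{b}$ is found by
minimising the MSE of $\tilde\theta$:
$$
\begin{aligned}
m(\mathbf{b}; \mathbf{z}) &= (\mathbf{z} - \mathbf{b})^\top V_0 (\mathbf{z} - \mathbf{b}) + (\mathbf{z} - \mathbf{b})^\top C\,\mathbf{b} + \mathbf{b}^\top C (\mathbf{z} - \mathbf{b}) + \mathbf{b}^\top (V_1 + BB^\top)\,\mathbf{b} \\
&= \mathbf{b}^\top (V_0 - C - C^\top + V_1 + BB^\top)\,\mathbf{b} - \mathbf{b}^\top (2V_0 - C - C^\top)\,\mathbf{z} + \mathbf{z}^\top V_0\,\mathbf{z} \\
&= \mathbf{b}^\top Q\,\mathbf{b} - \mathbf{b}^\top P\,\mathbf{z} - \mathbf{z}^\top P^\top \mathbf{b} + \mathbf{z}^\top V_0\,\mathbf{z},
\end{aligned}
$$
where $Q = V_0 - C - C^\top + V_1 + BB^\top$ and $P = V_0 - C$. The minimum of this
quadratic function of $\mathbf{b}$ is found by matrix differentiation or by completing the
square. In either way, we find that the minimum is attained at $\mathbf{b}^* = Q^{-1}P\mathbf{z}$ and
the minimum MSE is $m(\mathbf{b}^*; \mathbf{z}) = \mathbf{z}^\top V_0\,\mathbf{z} - \mathbf{z}^\top P^\top Q^{-1}P\,\mathbf{z}$.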
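
A small numerical illustration of the multivariate solution follows. It is a sketch under the simplifying assumption that the covariance matrix $C$ is symmetric; the function names and all inputs in the usage note are hypothetical.

```python
import numpy as np

def composite_mse(b, V0, V1, C, B, z):
    """MSE m(b; z) of the composition (z - b)' theta0 + b' theta1."""
    d = z - b
    return d @ V0 @ d + d @ C @ b + b @ C @ d + b @ (V1 + np.outer(B, B)) @ b

def shrinkage_vector(V0, V1, C, B, z):
    """Optimal coefficients b* = Q^{-1} P z and the minimum MSE (C symmetric)."""
    Q = V0 - C - C.T + V1 + np.outer(B, B)
    P = V0 - C
    b_star = np.linalg.solve(Q, P @ z)       # avoids forming Q^{-1} explicitly
    min_mse = z @ V0 @ z - z @ P.T @ b_star  # z'V0z - z'P'Q^{-1}Pz
    return b_star, min_mse
```

For example, `shrinkage_vector(V0, V1, C, B, z)` can be called for several vectors `z` with the same matrices; the matrix $Q^{-1}P$ is common to all of them, so only the final multiplication by `z` changes.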
The ($r \times r$) scaled variance matrix $\Sigma_B$ is estimated by the multivariate version
of the method described in the section A Composite Estimator. Let $\hat{\boldsymbol\delta}_j$ be the
estimate of the vector of deviations for school $j$, and $\hat{\boldsymbol\delta}_0$ their average. We apply
moment matching to the statistic
$$S = \frac{1}{J}\sum_{j=1}^{J} (\hat{\boldsymbol\delta}_j - \hat{\boldsymbol\delta}_0)(\hat{\boldsymbol\delta}_j - \hat{\boldsymbol\delta}_0)^\top.$$
The expectation of this statistic is
$$
\begin{aligned}
E(S) &= \frac{1}{J}\sum_{j=1}^{J} \mathrm{var}(\hat{\boldsymbol\delta}_j - \hat{\boldsymbol\delta}_0) + \frac{1}{J}\sum_{j=1}^{J} (\boldsymbol\delta_j - \boldsymbol\delta_0)(\boldsymbol\delta_j - \boldsymbol\delta_0)^\top \\
&= \Sigma_B + \frac{\sigma^2}{J - 1}\sum_{j=1}^{J} H_j,
\end{aligned}
$$
where $H = A^{-1} + A^{-1}BGB^\top A^{-1}$ is the top diagonal block in (8) and $H_j$ is the
$r \times r$ diagonal submatrix of $H$ that corresponds to school $j$. The matrix $H$ does not
depend on any parameters other than $\sigma^2$. The school-level variance matrix of $\boldsymbol\beta_j$ is
estimated by
$$\hat\Sigma_B = S - \frac{\hat\sigma^2}{J - 1}\sum_{j=1}^{J} H_j.$$
Declaration of Conflicting Interests
The author declared no conflicts of interest with respect to the authorship and/
or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research
and/or authorship of this article: Research and preparation of this article were
supported by the Grant SEJ2006–13537 from the Spanish Ministry of Science
and Technology.
References
Aitkin, M., & Longford, N. T. (1986). Statistical modelling issues in school effectiveness
studies. Journal of the Royal Statistical Society, Series A, 149, 1–43.
Braun, H. I., & Zwick, R. (1993). Empirical Bayes analysis of families of survival curves:
Application to the analysis of degree attainment. Journal of Educational Statistics, 18,
285–303.
Carlin, B. P., & Louis, T. A. (2000). Empirical Bayes: Past, present and future. Journal of
the American Statistical Association, 95, 1286–1289.
Efron, B., & Morris, C. N. (1972). Limiting the risk of Bayes and empirical Bayes estima-
tors—Part II: The empirical Bayes case. Journal of the American Statistical Associa-
tion, 67 , 130–139.
Goldstein, H. (2003). Statistical analysis of multilevel data (3rd ed.). London: Edward
Arnold.
Henderson, C. R. (1975). Best linear unbiased estimation and prediction under a selection
model. Biometrics, 31, 423–447.
Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945–970.
Klein, J. P., & Moeschberger, M. L. (2004). Survival analysis. Techniques for censored
and truncated data. New York, NY: Springer-Verlag.
Leckie, G., & Goldstein, H. (2009). The limitations of using school league tables to inform
school choice. Journal of the Royal Statistical Society, Series A, 172, 835–851.
Longford, N. T. (2005). Missing data and small-area estimation. Modern analytical
equipment for the survey statistician. New York, NY: Springer-Verlag.
Longford, N. T. (2007). On standard errors of model-based small-area estimators. Survey
Methodology, 33, 69–79.
Longford, N. T. (2008). Studying human populations. An advanced course in statistics.
New York, NY: Springer-Verlag.
Longford, N. T., & Rubin, D. B. (2006). Performance assessment and league tables.
Comparing like with like. Working Paper 994. Barcelona: Department of Economics
and Business, University Pompeu Fabra.
McLachlan, G. J., & Peel, D. (2000). Finite mixture models. New York, NY: Wiley.
Rao, J. N. K. (2003). Small area estimation. New York, NY: Wiley.
Rasbash, J., & Goldstein, H. (1994). Efficient analysis of mixed hierarchical and cross-
classified random structures using a multilevel model. Journal of Educational and
Behavioral Statistics, 19, 337–350.
Raudenbush, S. W. (1993). A crossed random effects model for unbalanced data with
applications in cross-sectional and longitudinal research. Journal of Educational Statis-
tics, 18, 321–349.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and
data analysis methods. Thousand Oaks, CA: Sage.
Robbins, H. (1955). An empirical Bayes approach to statistics. Proceedings of the Third
Berkeley Symposium on Mathematical Statistics and Probability, 1, 157–164. Berkeley:
University of California Press.
Robinson, G. K. (1991). That BLUP is a good thing: The estimation of random effects.
Statistical Science, 6, 15–32.
Rubin, D. B. (2002). Multiple imputation for nonresponse in surveys (2nd ed.). New York,
NY: Wiley.
Schafer, J. L. (1997). Analysis of incomplete multivariate data. London: Chapman and Hall.
Author
NICHOLAS T. LONGFORD is Director, SNTL Statistics Research and Consulting, and
Academic Visitor, Universitat Pompeu Fabra, Ramon Trias Fargas 25-27, 08005
Barcelona, Spain; email: [email protected]. His research interests are multilevel
analysis, small-area estimation, dealing with missing values, composite estimation, and
statistical modelling and computing in general.
Manuscript received February 15, 2010
Revision received September 22, 2010
Accepted October 17, 2010