TESOL QUARTERLY Vol. 43, No. 3, September 2009 393
Current Issues in English Language Teacher-Based Assessment
CHRIS DAVISON
University of New South Wales
Sydney, Australia
CONSTANT LEUNG
King’s College London
London, England
Teacher-based assessment (TBA) is increasingly being promoted in educational policies internationally, with English language teachers being called on to plan and/or implement appropriate assessment procedures to monitor and evaluate student progress in their own classrooms. However, there has been a lack of theorization of TBA in the English language teaching field, with researchers pointing to much variability, a lack of systematic principles and procedures, and a reliance on traditional, but now outdated, psychometric assumptions. This article provides an overview of some of the current issues in TBA, including its definition and key characteristics, and the complex but significant questions which its implementation poses for our understandings of language, learning, and assessment.
Teacher-based assessment (TBA) is policy-supported practice in a number of educational systems internationally, including Australia, New Zealand,¹ Canada, and the United Kingdom (e.g., Cumming & Maxwell, 2004; Learning and Teaching Scotland, 2006²; Queensland Studies Authority, 2009b; Saskatchewan Learning, 1993; Spencer, 2005). It is increasingly being adopted as national educational policy in Asia (Butler, this issue; Curriculum Development Institute [Hong Kong], 2002; Ministry of Education [Singapore], 2008; Xu & Liu, this issue) as well as in some developing countries, including South Africa, Ghana, and Zambia (Pryor & Akwesi, 1998; Pryor & Lubisi, 2002). It is also actively promoted in the United States (e.g., Popham, 2008a, 2008b; Stiggins, 2008; Stiggins, Arter, Chappuis, & Chappuis, 2007), although always overshadowed by national testing programs. At the same time, English language teachers are increasingly being called on to plan and implement their own assessment instruments and procedures to monitor and evaluate student progress in their classrooms, and new curriculum documents and professional teaching standards increasingly demand English language teachers be knowledgeable and skilled in TBA (see, e.g., TESOL, 2005).
¹ In Queensland, where school-based assessment (SBA) was introduced in the 1970s (Sadler, 1989), teacher-based assessment is used for all assessment in the secondary school, even for high-stakes purposes (see Queensland Studies Authority, 2009a). The Australian Capital Territory (ACT) also uses only teacher-based assessment for senior secondary level (Department of Education and Training, Education Policy and Planning Section [Australia], n.d.). Other states such as New South Wales and Victoria have incorporated large-scale teacher-based assessment into their public examinations (see, e.g., New South Wales Government, n.d.). New Zealand also has a long history of school-based assessment in the senior secondary school (see New Zealand Qualifications Authority, n.d.), and has developed a wide variety of teacher support material and associated research studies (Ministry of Education [New Zealand], 2009).
² In Scotland, much interesting work in TBA is being conducted by the Scottish Assessment Is for Learning (AifL) group (Learning and Teaching Scotland, 2006), supported by the Ministry of Education in Scotland and involving many classrooms.
However, despite this widespread embrace of various forms of TBA in school and adult education, there has been comparatively little specific research into the TBA of English as a second or additional language. TBA has been neglected by researchers partly because of the uncertain status of TESOL as a discrete curriculum area in schools and tertiary institutions, partly because of the traditional dominance of the field by large-scale English language tests such as the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) and their research priorities and needs, and partly because of the ongoing critique of notions of standard English and models of correctness as well as debates over native versus nonnative speaking teachers and the implications for assessment.
What TBA research has been done in TESOL reveals much variability, a lack of systematic principles and procedures, and a dearth of information as to the impact of TBA on learning and teaching. In Australia, several studies of the use of large-scale criterion-referenced English as a second language (ESL) assessment frameworks in schools (Breen et al., 1997; Davison & Williams, 2002) have revealed a great diversity in teachers’ approaches to assessment, influenced by the teachers’ prior experiences and professional development, the assessment frameworks and scales they used, and the reporting requirements placed on them by schools and systems. Concerns have also been raised about, on the one hand, the ad hoc or impressionistic nature of many teacher judgments (Leung, 1999; Leung & Teasdale, 1997) and, on the other hand, mechanistic criterion-based approaches to TBA, which are often implemented in such a way that they undermine rather than support teachers’ classroom-embedded assessment processes (Arkoudis & O’Loughlin, 2004; Black & Wiliam, 1998; Carless, 2005; Davison, 2004; Leung, 2004a, 2004b).
Research into TBA in TESOL is further complicated by the considerable uncertainty and disagreement around the concept of TBA itself and by its intrinsically co-constructed and context-dependent nature (Black & Wiliam, 1998; Brookhart, 2003; McMillan, 2003; McNamara, 2001; Stiggins, 2001). When the principles and procedures underlying TBA are not clear, the basis for research and development is even muddier, hence the need for more public and mainstream discussion of the issues. This review article aims, first, to define more clearly the concept of TBA in English language teaching and, second, to explore some of the key conceptual issues and challenges for the field, as well as the implications for practice. The article concludes with a summary of some of the areas in which more research into TBA is needed.
DEFINING TBA
There is no widely accepted common definition of teacher-based assessment in the English language teaching field, with many terms used interchangeably to refer to the same practices and procedures, including terms such as alternative assessment, classroom and/or school-based assessment, formative assessment, and more recently, assessment for learning. Such terms highlight different aspects of the assessment process, but all tend to be used to signify a more teacher-mediated, context-based, classroom-embedded assessment practice, explicitly or implicitly defined in opposition to traditional externally set and assessed large-scale formal examinations used primarily for selection and/or accountability purposes. Thus, for the purposes of this article we take TBA to mean much more than just who is doing the assessing; TBA also has implications for the what, where, how, and most importantly, the why of assessment.
TBA has a number of important characteristics which distinguish it from other forms of assessment:
• It involves the teacher from the beginning to the end: from planning the assessment programme, through to identifying and/or developing appropriate assessment tasks right through to making the assessment judgments.
• It allows for the collection of a number of samples of student work over a period of time, using a variety of different tasks and activities.
• It can be adapted and modified by the teacher to match the teaching and learning goals of the particular class and students being assessed.
• It is carried out in ordinary classrooms, not in a specialist assessment centre or examination hall.
• It is conducted by the students’ own teacher, not a stranger.
• It involves students more actively in the assessment process, especially if self and peer assessment is used in conjunction with teacher assessment.
• It opens up the possibility for teachers to support learner-led enquiry.
• It allows the teacher to give immediate and constructive feedback to students.
• It stimulates continuous evaluation and adjustment of the teaching and learning programme.
• It complements other forms of assessment, including external examinations.
The key steps involved in such teacher-based assessment are captured in Figure 1.
FIGURE 1 A Framework for Teacher-Based Assessment (Davison, 2008)
Defined in this sense, TBA shares many of the characteristics of assessment for learning (AfL), a concept first used in the United Kingdom in the late 1980s, and widely promoted through the work of the Assessment Reform Group (Assessment Reform Group, 1999, 2001; Black & Wiliam, 1998). The term was introduced to ensure “a clear distinction be made between assessment of learning for the purposes of grading and reporting, which has its own well-established procedures, and assessment for learning, which calls for different priorities, new procedures and a new commitment” (Assessment Reform Group, 1999, p. 2). The Assessment Reform Group (1999, p. 7) has described AfL’s defining characteristics as follows:
• embedded in a view of teaching and learning of which it is an essential part
• learning goals are shared with pupils
• aims to help pupils know and recognise the standards they are aiming for
• pupils are involved in self-assessment
• provides feedback which helps pupils recognise their next steps and how to take them
• underpinned by confidence that every student can improve
• both teacher and pupils review and reflect on assessment data
In TBA, the term assessment for learning is often used synonymously with the term formative assessment, so comprehensively documented by Black and Wiliam (1998), but more recently many researchers have been calling for a sharper distinction between the two terms (Kennedy, Chan, Yu, & Fok, 2006; Roos & Hamilton, 2005; Stiggins, 2002; Taras, 2005). Traditionally, formative assessment is seen as informal and fairly frequent, involving the gathering of information about students and their language learning needs while they are still learning. Formative assessment is usually contrasted with summative assessment, generally defined as those more formal planned assessments at the end of a unit, term, or year which are used to evaluate student progress and/or grade students. In an assessment of learning culture, formative and summative assessment are seen as distinctly different in both form and function, with teacher and assessor roles clearly demarcated, but in an assessment for learning culture, it is argued that even summative assessments of the students’ language skills can and should also be used formatively to give constructive student feedback and improve learning (see, e.g., Biggs, 1998; Carless, 2008; Davison, 2007; Davison & Hamp-Lyons, 2009; Hamp-Lyons, 2007; Harlen, 2005; Kennedy et al., 2006; Taras, 2005). These researchers argue that provided summative assessment is undertaken while students are still learning (and teachers are still teaching), such assessments can and should also be used for formative purposes, that is, to improve learning and teaching, thus building a more coherent and stronger assessment for learning culture. Kennedy et al. (2006) propose that in this more inclusive model of assessment:
1. All assessment needs to be conceptualized as assessment for learning.
2. Feedback needs to be seen as a key function for all forms of assessment.
3. Teachers need to be seen as playing an important role not only in relation to formative assessment but in all forms of summative assessment as well, both internal and external.
4. Decisions about assessment need to be viewed in a social context because in the end they need to be acceptable to the community.
Kennedy concludes that “the continuing bifurcation between formative and summative assessment is no longer useful, despite the fact that such a distinction has resulted in some excellent research and development work on formative assessment” (p. 14). He joins Harlen (2005), Carless (2008), and others in calling for more research to be conducted into summative assessment and into, as Carless puts it, tests as “productive learning opportunities” (p. 8). However, Kennedy challenges Roos and Hamilton’s (2005) view that summative assessment as a procedure is too deeply entrenched, in Roos and Hamilton’s words, to become “a valid activating mechanism for goal-directed educational activities” (p. 7). Biggs (1996, 1998) also argues that an exclusive focus on formative assessment may leave many negative summative assessment practices uncontested. He points out that this is deeply problematic given summative assessment’s significant influence on student learning, with negative backwash often undermining any of the positive impacts of formative assessment. In fact, as has been well-documented in systems such as Hong Kong and Singapore (e.g., Cheah, 1998; Hamp-Lyons, 2007), it is extremely difficult to sustain any significant teacher-based formative assessment practices in most traditional examination-dominated cultures.³
³ In Hong Kong, studies of the implementation of teacher-based assessment innovations such as the Target-Oriented Curriculum in primary schools (e.g., Cheung & Ng, 2000; Carless, 2004; Adamson & Davison, 2003) and the Teacher Assessment Scheme in senior secondary science (Yung, 2006) found that any change in teacher assessment practice was difficult, severely constrained by traditional school culture and by teacher, parent, and student expectations.
The traditional concept of formative assessment also needs to be problematized. In AfL, formative assessment is seen as having two key functions: informing and forming. That is, formative assessment not only shapes the decisions about what to do next, by helping the teacher to select what to teach in the next lesson, or even in the next moment in the lesson; the student also has to understand what they have learned and what they need to learn next (Black, 2001; Black, Harrison, Lee, Marshall, & Wiliam, 2003a, 2003b; Black & Wiliam, 1998). The learner’s role is crucial because it is the learner who does the learning. This point seems obvious, even trite, but it is central to the AfL philosophy and, if treated seriously, clearly highlights where formative assessment can go wrong. As Torrance (1993) argued some years ago, many teachers are at risk of assuming formative assessment is at best “fairly mechanical and behaviouristic . . . in the graded test tradition”; at worst summative, “taking snapshots of where the children have ‘got to’, rather than where they might be going next” (p. 340).
Teachers coming from more traditional assessment cultures make two common misinterpretations of formative assessment. First, there is a widespread assumption that any continuous assessment is by definition formative, but this is not necessarily the case; a series of weekly tests is continuous, but the tests are not formative if they are not used by students to improve their learning:
The term ‘formative’ itself is open to a variety of interpretations and often means no more than that assessment is carried out frequently and is planned at the same time as teaching. Such assessment does not necessarily have all the characteristics just identified as helping learning. It may be formative in helping the teacher to identify areas where more explanation or practice is needed. But for the pupils, the marks or remarks on their work may tell them about their success or failure but not about how to make progress towards further learning. (Assessment Reform Group, 1999, p. 7)
Second, there is a common misconception that a so-called alternative form of the assessment automatically makes it formative; that is, assessments like portfolios and oral presentations are by definition formative. However, such assessments can be and sometimes are components of large-scale externally set and assessed examinations, for example, the ubiquitous external oral examinations of many Asian educational systems which are not used at all for formative purposes.
To summarize, then, in an AfL culture, TBA needs to be continuous and embedded naturally into every stage of the teaching–learning cycle, not just at the end (see Ministry of Education [Singapore], 2008, for an example of this in a K–12 curriculum). All assessments (even those for accountability purposes) need to be designed and implemented with the overriding aim of improving student learning, with AfL as the dominant educational ethos. In such a classroom or institution, teachers would be continually engaged in various forms of formative assessment, even at the end of the course. In this model of TBA, outlined in Table 1, assessment includes not only the formal planned moments when students undertake an assessment task but also the far more informal, even spontaneous moments when teachers are monitoring student group work and notice one student speaking more confidently or another failing to take an offered turn. Because the goal of TBA is to improve student learning, self and peer assessment are an integral component of all assessment activity. Feedback is also a defining element, with opportunities for constructive and specific feedback related to specific assessment criteria and curriculum goals and content regularly reviewed by students and teachers.

TABLE 1
Assessment for Learning in the Classroom: A Typology of Possibilities (Davison, 2008)

| | In-class contingent formative assessment-while-teaching | More planned integrated formative assessment | More formal mock or trial assessments modeled on summative assessments but used for formative purposes | Prescribed summative assessments, but results also used formatively to guide future teaching/learning |
| --- | --- | --- | --- | --- |
| Definition | An integral but very informal part of every teacher’s daily practice | An integral part of the learning and teaching cycle, i.e., part of effective teaching and planning for the future | A time for taking stock, assessing how individuals are performing compared with whole group | A distinctive stage at the end of a unit of learning and teaching |
| Degree of preplanning | Often spontaneous and contingent when the need arises | An informal planned process during the course of the year tailored to the needs of the individual students and class | Usually predesigned, sensitive to needs of students but also to the demands of external requirements | Predetermined, relatively formal and set at beginning of unit of learning and teaching |
| Focus | Learner referenced; focus on the learning process | Criterion referenced, but in relation to learner’s starting point; focus on the learning process and student progress | Criterion referenced, but in relation to system-level norms; focus on student progress and gap between what should be and is | Criterion referenced, but in relation to system-level norms; focus mainly on the product of learning, and what student needs to do next |
| Typical kinds of feedback | Indirect or implied feedback, or direct, co-constructed by students and teacher | Direct qualitative feedback, may involve multiple and varied sources, e.g., self, peers, teacher, etc. | Direct qualitative feedback, may indicate profiles or grades, but still extensive student involvement | Report in profiles, levels, and marks by teacher, but preceded and/or followed by formative self and peer evaluation and extensive teacher feedback |
| Types of assessments: Observe | Informal observation of learner behavior/language use | More structured self, peer and teacher observation using anecdotal records/observation checklists/self and peer evaluations | Systematic observation of language samples using scales/profiles/rubrics | Formal moderated mapping of students’ performance on system-wide published standards/scales |
| Inquiry | Focused open questions to elicit/check understanding; opportunity for student self-reflections | Peer conferencing; informal student self and peer reflections/learning logs | Teacher-student conferencing; structured student self and peer reflections/learning logs | NA |
| Analysis | Informal analysis of patterns in student language use | Analysis of drafts/video and audio samples of work | Portfolios/collections of student work/presentation | Formal portfolio/project/videotaped presentations |
| Test | NA | Informal quizzes, diagnostic tests, student-developed tests | More formal tests | Formal tests, exams |
Such an integrated approach to assessment underpinned the recent development of a school-based assessment (SBA) component in the Hong Kong Certificate of Education Examination (HKCEE) in English Language (Davison, 2007; Davison & Hamp-Lyons, 2009). The stated purpose of the SBA component was to provide a more comprehensive appraisal of Forms 4–5 (Grades 9–10) learners’ achievement by assessing learning objectives which could not be easily assessed in public examinations while at the same time enhancing teaching and learning. The initiative marked a shift from traditional norm-referenced externally set and assessed examinations toward a more student-centered TBA system that drew its philosophical basis from the assessment for learning movement discussed earlier. Teachers are involved at all stages of the assessment cycle, from planning the assessment programme to identifying and developing appropriate formative and summative assessment activities right through to making the final judgments. In-class formal and informal performance assessment of students’ authentic oral language skills using a range of tasks and guiding questions and the use of teacher judgments of student performance using common assessment criteria are innovative aspects of the new SBA, as is the insistence that students play an active role in the assessment process and the vigorous promotion of self and/or peer assessment and feedback (for a fuller discussion, see Davison, 2007; Davison & Hamp-Lyons, 2009).
Such TBA is assumed to have a number of advantages over external examinations, especially in assessing language, because effective language development requires not just knowledge but skill and application in a wide range of situations and modes of communication. Hence, like other performance-based subjects such as music, art, drama, and various vocational subjects, it is often argued that languages are better assessed through more authentic-like, performance-based assessments. Table 2 summarizes some of the common advantages attributed to TBA compared with external examinations.
TABLE 2
Advantages of TBA Compared With External Examinations for Oral Language Assessment (Adapted From SBA Consultancy Team, 2005)

| | Characteristics of classroom-based TBA | Characteristics of exams |
| --- | --- | --- |
| Scope | Extends the range and diversity of assessment collection opportunities, task types, and assessors | Much narrower range of assessment opportunities: less diverse assessment; one exam per year |
| Authenticity | Assesses work being done within the classroom; less possibility of cheating as teacher knows student capabilities; assessments more likely to be realistic | Removes assessment entirely from teaching and learning; stressful conditions may lead to students not demonstrating real capacities |
| Validity | Improves validity through assessing factors that cannot be included in public exam settings | Limits validity by limiting scope of assessment, e.g., difficult to assess interaction skills in exam environment |
| Reliability | Improves reliability by having more than one assessment by a teacher who is familiar with the student; allows for multiple opportunities for assessor reflection/standardization | Even with double marking, examiners’ judgments can be affected by various factors (task difficulty, topic, interest level, tiredness, etc.), but little opportunity for assessor reflection/review |
| Fairness | Fairness is achieved by following commonly agreed processes, outcomes and standards; teacher assumptions about students and their oral language levels are made explicit through collaborative sharing and discussion with other teachers | Fairness can only be achieved by treating everyone the same, i.e., setting the same task at the same time for all students |
| Feedback | Students can receive constructive feedback immediately after the assessment has finished, hence improving learning | The only feedback is usually a grade at the end of the exam; generally no opportunities for interaction with assessor; no chance to ask how to improve |
| Positive washback (beneficial influence on teaching and learning) | Ongoing assessment encourages students to work consistently; provides important data for evaluation of teaching and assessment practices in general | Examinations by their nature can only be purely summative, and do not serve any teaching-related purpose; effects on teaching and learning may even be negative; may encourage teaching to the test and a focus on exam technique, rather than outcomes |
| Teacher and student empowerment | Teachers and students become part of the assessment process; collaboration and sharing of expertise takes place within and across schools | Teachers play little to no role in assessment of their students and have no opportunity to share their expertise or knowledge of their students; students treated as numbers |
| Professional development | Builds teacher assessment skills, which can be transferred to other areas of the curriculum | Teachers have no opportunity to build their assessment skills; get little or no feedback on how to improve as teachers |
| Practicality and cost | Once teachers are trained, TBA is much cheaper as integrated into normal curriculum; undertaken by class teacher as part of everyday teaching; avoids wasting valuable teaching time on practice tests | Language assessment as currently practiced is very expensive in terms of task development (especially as multiple stimulus material is often needed to avoid cheating), assessor training and moderation and teaching/learning time |

However, a number of these claims made for the efficacy, or even superiority, of TBA over traditional assessments, especially those relating to validity and reliability, need to be explored further because they raise important theoretical issues that go to the heart of TBA when applied to the English language teaching field.
CONCEPTUAL ISSUES AND CHALLENGES IN TBA IN ENGLISH LANGUAGE TEACHING
There are obviously many issues and challenges confronting English language education in its movement toward greater use of high-quality TBA, ranging from the very practical concerns associated with any significant change in classroom practice, including the need to develop confidence to overcome the inevitable implementation dip, through to the more technical problems associated with learning how to construct explicit assessment criteria and tasks appropriate for a range of individual student needs (e.g., Fox, 2008), through to the significant challenge of changing deeply entrenched sociocultural attitudes and expectations (for a fuller discussion, see Brindley, 1995; Davison, 2007; Rea-Dickins, 2008). However, in this article we are more concerned with conceptual issues, in which questions of validity and reliability are central.
As others have pointed out (e.g., Rea-Dickins, 2007), in TBA there is much debate over evaluation criteria, with researchers such as Leung (2004a, 2004b) and Teasdale and Leung (2000), on the one hand, arguing that the evaluation criteria traditionally associated with psychometric testing such as reliability and validity need to be reinterpreted in TBA, particularly in relation to in-class contingent assessment in interaction, and testers such as Clapham (2000), on the other hand, insisting that traditional test criteria do apply to alternative TBA:
A problem with methods of alternative assessment, however, lies with their validity and reliability: Tasks are often not tried out to see whether they produce the desired linguistic information; marking criteria are not investigated to see whether they ‘work’; and raters are often not trained to give consistent marks. (p. 152)
However, these points represent two quite different levels of concern. Clapham’s arguments are primarily to do with the quality of assessment practice, whereas Leung’s problematize the underlying assessment theory.⁴ Even when TBA is “best” practice, many theoretical issues remain unresolved; in fact, we would argue they are even more obvious.
As an example, take the development of the SBA system in Hong Kong (SBA Consultancy Team, 2005), outlined earlier, involving more than 1,800 teachers in more than 650 schools and institutions. Conducted over 2 years of schooling, and contributing 15% toward each student’s final English score, it consists of the assessment of English oral language skills based on topics and texts drawn from a program of independent reading or viewing (“texts” encompass print, video, film, fiction, and non-fiction material). The spoken language tasks are of two broad kinds: group interaction and individual presentation. Students choose at least three texts to read or view over the course of 2 years, keeping a logbook or brief notes, and undertaking a number of activities in and out of class to develop their independent reading, speaking, and thinking skills. For assessment they participate in several interactions with classmates on a particular aspect of the text they have read or viewed, leading up to making an individual presentation or group interaction on a specific text and responding to questions from their audience (for a full description of the assessment requirements, see SBA Consultancy Team, 2005).
⁴ See Chapelle (1999) for a more detailed discussion of current debates over validity in language testing.
A range of assessment tasks has been provided that teachers can choose from and adapt, including teacher-made tasks adapted from those used by teachers who took part in the initial development of the assessment initiative. Assessment tasks can vary in length and complexity, enabling teachers to provide students with appropriate, multiple, and varied opportunities to demonstrate their oral language abilities individually tailored to their language levels and interests. At the same time, however, the teacher and the school need to be sure that the oral language produced is the student’s own work, not the result of memorization without understanding. Hence, there are some important requirements or conditions that teachers and students must follow, including the assessment being conducted by the usual English teacher, in the presence of one or more classmate(s). Students are assessed according to a set of assessment criteria consisting of a set of descriptors at each of six levels across four domains: (a) pronunciation and delivery, (b) communication strategies, (c) vocabulary and language patterns, and (d) ideas and organization. Teachers are encouraged to video or audio record a range of student assessments to assist with standardization and feedback, involving the students as much as possible. During the class assessments, which might span a number of weeks, individual teachers at the same level are encouraged to meet informally to compare their assessments and make adjustments to their own scores as necessary. Such informal interactions give teachers the opportunity to share opinions on how to score performances and interpret the assessment criteria.
Near the end of the school year, all the English teachers at each level hold a formal meeting, chaired by a coordinator in each school, to review performance samples and standardize scores. Such meetings are critical for developing consistency in and between teacher-assessors, for public accountability, and for professional collaboration and support. At the end of each year, a district-level meeting is held for professional sharing and further standardization. Each coordinator is encouraged to share a range of typical and atypical individual assessment records (along with the video or audio recordings) and the class records. Once any necessary changes are made, the performance samples are archived and the scores are submitted to the HKEAA for review. Maintaining notes of all standardization meetings and any follow-up action is also encouraged so schools can show parents and the public that they have applied the assessment procedures consistently and fairly. The HKEAA then undertakes a process of statistical moderation 5 to ensure the comparability of scores across the whole Hong Kong school system. This TBA system is supported by a comprehensive teacher training package (SBA Consultancy Team, 2005) which includes an introductory DVD and booklet, and two training CD-ROMs containing a range of student samples for benchmarking purposes. In addition, 39 district-level group coordinators, mostly serving teachers, were used to coordinate training and standardization sessions with school coordinators and with the teachers involved within each school. A 12-hour supplementary professional development program with comprehensive course and video notes on DVD (SBA Consultancy Team, 2007) was also developed, and all teachers are encouraged to complete the program in their first year of such assessment. Careful monitoring of the assessment process shows that teachers are able to reliably mark students’ work with high levels of interrater reliability and that a higher correlation exists between SBA and other components of the external exam than between the external oral exam and other components.
5 It is difficult to justify statistical moderation from a theoretical perspective, given that SBA and the external examination are measuring different things under very different conditions, but it is considered essential to ensure that public confidence in the examination system is maintained, while allowing the HKEAA to be more innovative in its assessment practices.
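The interrater reliability reported for the Hong Kong initiative can be made concrete with a small sketch. The example below is illustrative only: the two lists of scores are invented, the function names are hypothetical, and the choice of Pearson correlation together with exact and adjacent agreement rates is an assumption, since the article does not say which statistics the monitoring used. Scores are taken to lie on the six-level scale described above.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def agreement_rates(xs, ys):
    """Share of performances scored identically, and within one level."""
    n = len(xs)
    exact = sum(x == y for x, y in zip(xs, ys)) / n
    adjacent = sum(abs(x - y) <= 1 for x, y in zip(xs, ys)) / n
    return exact, adjacent


# Invented scores (levels 1 to 6) given by two teacher-assessors
# to the same eight recorded performances.
rater_a = [3, 4, 2, 5, 6, 3, 4, 1]
rater_b = [3, 4, 3, 5, 5, 3, 4, 2]

r = pearson(rater_a, rater_b)
exact, adjacent = agreement_rates(rater_a, rater_b)
print(f"correlation={r:.2f} exact={exact:.3f} adjacent={adjacent:.3f}")
```

For these invented scores the two raters agree exactly on five of the eight performances and are never more than one level apart. A standardization meeting of the kind described above would focus on the performances where the scores differ.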
This TBA system aims to ensure, to paraphrase Clapham (2000), that tasks are tried out, marking criteria do work, and raters are trained; however, theoretical problems to do with the nature of what is assessed and how it is assessed in English language education still arise and are, in fact, foregrounded by TBA in ways which challenge the English language teaching field. In the interests of brevity, we will look at three key sets of issues arising from TBA which problematize our theorization of language, language learning, and assessment.
Implications of TBA for the Theorization of Language
All language use implicates meaning making (and meaning taking). In English language classes, almost all language is taught (and learned) through carrier content, for example, the language use one might engage in when arriving at an airport, or reviewing and discussing a film. 6 Thus, TBA, whether we are talking about assessment embedded in in-class contingent interaction or more formal assessment at the end of the teaching and learning cycle, is inextricably connected to meaning making with reference to content meaning in context. If we look at what is assessed in the example from Hong Kong outlined earlier, we see language use being embedded into other forms of social and cognitive activity, then being “pulled out” for separate assessment in ways which raise real issues of validity: Are we assessing speaking? Are we assessing cognition? Are we assessing reading? Are we assessing cultural knowledge? Are we assessing interactive style or personality? Such concerns about validity are somewhat ironic given that the direct assessment and real-world (even if simulated) nature of TBA tasks is supposed to be one of the key advantages TBA has over external exams.
Thus, somewhat paradoxically, TBA raises two problems to do with the nature of language assessment that are relatively invisible in traditional testing. First, TBA, in its emphasis on language use in context, calls into action a multifaceted combination of linguistic, pragmatic, and cultural resources. Models of second language competence (e.g., Bachman, 1990; Canale & Swain, 1980) have set out components such as grammatical competence, sociocultural competence, and strategic competence. These components have been used to inform model building for proficiency levels and assessment (e.g., Council of Europe, 2001; Griffin & McKay, 1992). However, as Widdowson (2001) noted, there is as yet no clear understanding of how these different components relate to one another, particularly in specific contexts. One consequence of this lack of understanding is that assessment has a built-in arbitrariness; for example, should pragmatic competence be regarded as being more important than linguistic competence in classroom discussions? Do we need a model (or models) of language that would articulate the different component competences in different contexts?
6 It is rare in these communicative language teaching days for language use to be solely concerned with a display of linguistic knowledge (as in grammar drill exercises).
Second, insofar as language use almost always involves some content meaning (even in formal language learning activities), then assessing language means inevitably assessing content to some degree. This is particularly the case for the TBA of English that takes place in subject learning contexts, for example, English language learners in mainstream English-medium school and university classrooms (e.g., in North America, Australia, and the United Kingdom as well as in English-medium institutions around the world), and in content-language integrated learning programmes (increasingly popular in Europe, where the teaching and learning of English as a foreign language is carried out through a school subject such as science). But assessing English in subject learning contexts raises certain questions: Do teacher-assessors need a framework for language assessment as well as a separate framework for content? Or can they adopt a content-language integrated view, as argued by Mohan, Leung, and Slater (in press), that proceeds on the assumption that there is no separation between meaning and wording? These are critical questions to do with validity highlighted by TBA, but they have widespread significance in the English language teaching field.
Implications of TBA for the Theorization of Language Learning
The second key set of problems highlighted in TBA relates to the theorization of language learning. TBA, particularly for formative purposes, has generally put a high premium on teacher–student dialogue involving appropriate use of questioning and feedback, either as part of live contingent interaction or in the form of written comments. This practice generally, and mainly implicitly, assumes that language learning takes place through interaction between the teacher and the student with the guidance and advice given by the teacher advancing the student’s knowledge and skills. Until recently, such formative assessment appears to have been much more interested in practice than theory. Lantolf and Poehner (2004), Poehner and Lantolf (2005), and Poehner (2007, 2008) argue that formative TBA has been atheoretical and overly task oriented, without paying sufficient attention to a learner’s overall cognitive development; furthermore, such formative assessment seems to rely on teacher intuition rather than any systematic theory of learning. In contrast to most forms of teacher-based formative assessment, their preferred model of assessment, dynamic assessment, works explicitly within a Vygotskian sociocultural paradigm, using the twin constructs of teacher-mediated assistance and the zone of proximal development to theorize the process of learning through assessment. A similar approach grounded in sociocultural theory, called interactive assessment (see SBA Consultancy Team, 2008, for a fuller explanation), is also being promoted in the Hong Kong assessment initiative. Teachers are given a framework of guiding questions which make increasing cognitive and linguistic demands on the learner, and teacher-assessors are encouraged to interact individually with a student at any time, asking specific question(s) to clarify and encourage the student to extend ideas, help prompt and scaffold the student’s oral interaction, probe the range and depth of their oral language skills, and verify the student’s understanding of what he or she is saying. The questions are meant to be used flexibly to ensure that students have the opportunity to show the full range of their responses, hence achieving the most valid “true” judgment of students’ ability.
However, such approaches raise key questions not only about the nature of second language learning and its stages of development, but also about the role of assessment criteria and the teacher-assessor. Where learners of English need support to understand and express meaning, elements of teaching and scaffolding of the medium of communication may be built into formative guidance. How should this aspect of teacher–student interaction be considered in any theorizing of TBA? How does a teacher decide what to foreground in any set of assessment criteria and what to downplay or even ignore? Do we need to adopt an explicit theory of interaction and its relationship with learning? Is there something unique about TBA of language that requires special and additional attention?
In a discussion on the development of a theory of formative assessment in general, Black and Wiliam (2009) suggest that theory building “must bring into relationship . . . three spheres, the teacher’s agenda, the internal world of each student, and the inter-subjective” (p. 26). TBA in English language teaching highlights the complexity of these relationships and problematizes the teacher-assessor’s own beliefs and constructions of their discipline (for a further discussion, see Leung, 2007) in ways which challenge all in English language teaching.
Implications of TBA for the Theorization of Assessment
The series of questions raised in the discussion so far are implicated in the third, and final, set of issues to be addressed in this article, that is, the theorization of assessment. It is a fundamental paradox of TBA that its inherent strengths are viewed by many psychometricians as its greatest weaknesses. In many ways TBA is the opposite of traditional forms of examination and testing in which context is regarded as an extraneous variable that must be controlled and neutralized and the assessor as someone who must remain objective and uninvolved throughout the whole assessment process (Davison, 2007). TBA, in contrast, derives a major part of its validity and reliability from its location in the actual classroom where assessment activities are embedded in the regular curriculum and assessed by a teacher who is familiar with the student’s work and presumably has a stake in their improvement. To work effectively, however, TBA needs a theory of assessment which is aligned with and which exploits these inherent features. Thus, in the TBA initiative in Hong Kong, schools and teachers were granted a large degree of trust and autonomy in the design, implementation, and specific timing of assessment tasks. The criteria for evaluating reliability shifted from a focus on input to a focus on output; that is, no assessment tasks are the same across all schools; rather, a standard set of expectations of students’ language use (i.e., assessment standards or criteria) was developed based on the curriculum goals, past performances, and the teachers’ own judgments, and is now used by all teachers (and, more importantly, students) to generate tasks appropriate to the students’ language level, context, and needs (SBA Consultancy Team, 2005). All students are given sufficient time and support to demonstrate their best, to show what they can do, and for the assessor to be able to confidently assess their output and, even more importantly, validate their informal judgments of students’ language levels and achievements.
In other words, the more formal assessment tasks are designed to encourage the teacher to stand back and reflect on their implicit or explicit assumptions about individual students’ capacities, compare those assumptions with careful analysis of examples of students’ actual performance, and then subject their judgments to explicit scrutiny and challenge or confirmation by others. This TBA initiative does not assume that the class teacher is objective or has no preconceived ideas or assumptions about a student’s level. To the contrary, it seeks to make such assumptions explicit and open to discussion with fellow teachers. Thus, it is not necessary to have complete consensus; that is, teachers do not need to agree to give identical marks; some variation within the range is to be expected. As Davison (2004) argues, in TBA trustworthiness comes more from the process of expressing disagreements, justifying opinions, and so on than from absolute agreement.
This theorization of assessment is obviously very different from that associated with large-scale testing: it has as core criteria for evaluation not just learning outcomes, but the explicit enhancement of learning and teaching. As such, the traditional conceptions of validity and reliability associated with the still-dominant psychometric tradition of testing are themselves a potential threat to the development of the necessarily highly contextualized and dialogic practices of TBA (Rea-Dickins, 2007). Given that TBA spans from in-class contingent formative assessment as part of teaching to prescribed, relatively formal summative assessment, the following questions need to be asked: How can we develop a view of validity and reliability in terms of learning (not solely in terms of learning outcomes)? Is there a place for differentiated criteria of validity and reliability for different kinds of TBA? How can we further strengthen TBA and its nexus with learning and teaching while at the same time enhancing community confidence in our assessment systems? How can we better align traditional theorizations of assessment with those needed for TBA and vice versa? Is such alignment theoretically possible?
CONCLUSION
There are obviously areas of TBA other than those explored in this article in which further research and conceptualization are needed. In particular, more thinking is needed around ethics, trustworthiness, and fairness (e.g., see Lynch, 2001; Lynch & Shaw, 2005), and the relationship between assessment, feedback, and learning. More research is also needed into the effects of system-level change, including the impact on teachers and learners of the adoption, implementation, or evaluation of school-based TBA systems; the effect of importation of assessment approaches from other cultures; comparative perspectives on assessment policies and programs; and the impact of standards-based assessment on teachers and students. More research into teacher training and professional development in assessment is also necessary: what this kind of teacher development comprises and how it is perceived, the quality and progress indicators of TBA, and different approaches to teacher development in assessment.
However, TBA, in all its incarnations, has been around English language teaching long enough to demonstrate its powerful potential to improve learning and teaching in a range of different contexts. What it has lacked until recently has been sufficient engagement with theory and a sense of a research agenda. Perhaps more tellingly, the highly contextualized and variable nature of TBA has meant it lacks the capacity to be reduced to an off-the-shelf, for-profit product and thus has always been relegated to the status of the Other. However, as this special issue demonstrates, TBA appears to be gaining enough critical mass and common interest to generate a new level of discussion about core concepts. This is to be applauded because many of the key questions and issues raised by TBA are of central interest to the English language teaching world.
ACKNOWLEDGMENTS
The authors acknowledge the important contribution of discussions with their colleagues at King’s College London and the University of Hong Kong to the ideas expressed in this article.
THE AUTHORS
Chris Davison is Professor of Education and Head of the School of Education at the University of New South Wales, Sydney, Australia. Before going to Hong Kong, she worked as a teacher educator for 15 years. She is also actively involved in the research and development of English as a second language and languages other than English policy and programs in Australia and the Asia-Pacific area.
Constant Leung is Professor of Educational Linguistics in the Department of Education and Professional Studies at King’s College London. He has written and published widely on additional and second language education and language assessment issues. He is also the director of a master of arts program in English language teaching and applied linguistics.
REFERENCES
Adamson, B., & Davison, C. (2003). Innovation in English language teaching in Hong Kong primary schools: One step forwards, two steps sideways. Prospect, 18, 27–41.
Arkoudis, S., & O’Loughlin, K. (2004). Tensions between validity and outcomes: Teachers’ assessment of written work of recently arrived immigrant ESL students. Language Testing, 20, 284–304.
Assessment Reform Group. (1999). Assessment for learning: Beyond the black box. Cambridge: University of Cambridge School of Education. Retrieved 2 October 2007 from http://arg.educ.cam.ac.uk/AssessInsides.pdf.
Assessment Reform Group. (2001). Assessment for learning: 10 principles. Retrieved 2 July 2009 from http://www.assessment-reform-group.org.uk.
Bachman, L. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Biggs, J. (1996). Assessing learning quality: Reconciling institutional, staff, and educational demands. Assessment & Evaluation in Higher Education, 2, 3–25.
Biggs, J. (1998). Assessment and classroom learning: A role for formative assessment? Assessment in Education: Principles, Policy & Practice, 5, 103–110.
Black, P. (2001). Formative assessment and curriculum consequences. In D. Scott (Ed.), Curriculum and assessment. Westport, CT: Ablex.
Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003a, April). Formative and summative assessment: Can they serve learning together? Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL. Retrieved 1 October 2007 from http://www.kcl.ac.uk/education/papers/AERA%20ClassAsst.pdf.
Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003b, April). The nature and value of formative assessment for learning. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL. Retrieved 1 October 2007 from http://www.kcl.ac.uk/education/papers/AERA%20Pres.pdf.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5, 7–74.
Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21, 5–31.
Breen, M., Barratt-Pugh, C., Derewianka, B., House, H., Hudson, C., Lumley, T., et al. (1997). Profiling ESL children: How teachers interpret and use national and state assessment frameworks. Canberra, Australia: Department of Employment, Education, Training and Youth Affairs.
Brindley, G. (1995). Assessment and reporting in language learning programmes: Purposes, problems and pitfalls. In E. Li & G. James (Eds.), Testing and evaluation in second language education (pp. 133–162). Hong Kong SAR, China: Hong Kong University of Science and Technology.
Brookhart, S. (2003). Developing measurement theory for classroom assessment purposes and uses. Educational Measurement: Issues and Practice, 22(4), 5–12.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1, 1–47.
Carless, D. (2005). Prospects for the implementation of assessment for learning. Assessment in Education, 12, 39–54.
Carless, D. (2008). Developing productive synergies between formative and summative assessment processes. In M. F. Hui & D. Grossman (Eds.), Improving teacher education through action research (pp. 9–23). New York: Routledge.
Chapelle, C. (1999). Validity in language assessment. Annual Review of Applied Linguistics, 19, 254–272.
Cheah, Y. M. (1998). The examination culture and its impact on literacy innovations: The case of Singapore. Language and Education, 12, 192–209.
Cheung, D., & Ng, D. (2000). Teachers’ stages of concern about the target-oriented curriculum. Education Journal, 28, 109–122.
Clapham, C. (2000). Assessment and testing. Annual Review of Applied Linguistics, 20, 147–161.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge: Cambridge University Press.
Cumming, J. J., & Maxwell, G. S. (2004). Assessment in Australian schools: Current practice and trends. Assessment in Education, 11, 89–108.
Curriculum Development Institute. (2002). School policy on assessment: Changing assessment practices. Chapter 5 in Basic education curriculum guide: Building on strength. Hong Kong: Curriculum Development Institute.
Davison, C. (2004). The contradictory culture of classroom-based assessment: Teacher-based assessment practices in senior secondary English. Language Testing, 21, 304–333.
Davison, C. (2007). Views from the chalkface: School-based assessment in Hong Kong. Language Assessment Quarterly, 4, 37–68.
Davison, C. (2008, March). Assessment for learning: Building inquiry-oriented assessment communities. Paper presented at 42nd Annual TESOL Convention and Exhibit, New York, NY.
Davison, C., & Hamp-Lyons, L. (2009). The Hong Kong Certificate of Education: School-based assessment reform in Hong Kong English language education. In L.-Y. Cheng & A. Curtis (Eds.), English language assessment and the Chinese learner. New York: Routledge.
Davison, C., & Williams, A. (Eds.). (2002). Learning from each other: Literacy, labels and limitations. Studies of child English language and literacy development K-12, Volume 2. Melbourne: Language Australia.
Department of Education and Training, Education Policy and Planning Section (Australia). (n.d.). School excellence initiative. Teachers: The key to student success. A discussion paper for government schools. Tuggeranong, Australian Capital Territory: Author. Retrieved 15 September 2009 from http://www.det.act.gov.au/__data/assets/pdf_file/0009/17964/sei_TeachersKeyToStudentSuccess.pdf
Fox, J. (2008). Alternative assessment. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopedia of language and education. Vol. 7: Language testing and assessment (2nd ed., pp. 97–108). New York: Springer.
Griffin, P., & McKay, P. (1992). Assessing and reporting in the ESL language and literacy in schools project. In P. McKay (Ed.), ESL development: Language and literacy in schools project, Vol. 2 (pp. 9–16). Canberra, Australia: Department of Employment, Education and Training.
Hamp-Lyons, L. (2007). The impact of testing practices on teaching: Ideologies and alternatives. In J. Cummins & C. Davison (Eds.), The international handbook of English language teaching, Vol. 1 (pp. 487–504). Norwell, MA: Springer.
Harlen, W. (2005). Teachers’ summative assessment practices and assessment for learning: Tensions and synergies. Curriculum Journal, 16, 207–223.
Kennedy, K. J., Chan, K. S. J., Yu, W. M., & Fok, P. K. (2006, May). Assessment for productive learning: Forms of assessment and their potential for enhancing learning. Paper presented at the 32nd Annual Conference of the International Association for Educational Assessment, Singapore.
Lantolf, J. P., & Poehner, M. (2004). Dynamic assessment: Bringing the past into the future. Journal of Applied Linguistics, 1, 49–74.
Learning and Teaching Scotland. (2006). Assessment Is for Learning Programme. Retrieved 2 October 2007 from http://www.ltscotland.org.uk/assess/about/index.asp.
Leung, C. (1999). Teachers’ response to linguistic diversity. In A. Tosi & C. Leung (Eds.), Rethinking language education: From a monolingual to a multilingual perspective (pp. 225–240). London: Centre for Information on Language Teaching and Research, the National Centre for Languages.
Leung, C. (2004a). Classroom teacher-based assessment of second language development: Construct as practice. In E. Hinkel (Ed.), Handbook of research in second language learning and teaching. Mahwah, NJ: Erlbaum.
Leung, C. (2004b). Developing formative teacher-based assessment: Knowledge, practice, and change. Language Assessment Quarterly, 1, 19–41.
Leung, C. (2007). Dynamic assessment: Assessment as teaching? Language Assessment Quarterly, 4, 257–278.
Leung, C., & Teasdale, A. (1997). Raters’ understanding of rating scales as abstracted concept and as instruments for decision-making. Melbourne Papers in Language Testing, 6, 45–70.
Lynch, B. (2001). Rethinking assessment from a critical perspective. Language Testing, 18, 351–372.
Lynch, B., & Shaw, P. (2005). Portfolios, power, and ethics. TESOL Quarterly, 39, 263–297.
McMillan, J. (2003). Understanding and improving teachers’ classroom assessment decision-making: Implications for theory and practice. Educational Measurement: Issues and Practice, 22(4), 34–43.
McNamara, T. (2001). Language assessment as social practice: Challenges for research. Language Testing, 18, 333–349.
Mohan, B., Leung, C., & Slater, T. (in press). Assessing language and content: A functional perspective. In A. Paran & L. Sercu (Eds.), Testing the untestable in language and education. Clevedon, England: Multilingual Matters.
Ministry of Education (New Zealand). (2009). Assessment resources banks: English, mathematics, and science. Retrieved 15 September 2009 from http://arb.nzcer.org.nz/assessment/
Ministry of Education (Singapore). (2008). 2012 English language syllabus. Singapore: Government Printer.
New South Wales Government. (n.d.). Assessment for learning in the new years 7–10 syllabuses. Retrieved 10 September 2009 from http://arc.boardofstudies.nsw.edu.au/go/sc/afl/.
New Zealand Qualifications Authority. (n.d.). Assessment and examination rules and procedures for secondary schools—2009. Retrieved 10 September 2009 from http://www.nzqa.govt.nz/ncea/acrp/secondary/5/5.html
Poehner, M. (2007). Beyond the test: L2 dynamic assessment and the transcendence of mediated learning. Modern Language Journal, 91, 323–340.
Poehner, M. E. (2008). Dynamic assessment: A Vygotskian approach to understanding and promoting L2 development. New York: Springer.
Poehner, M. E., & Lantolf, J. (2005). Dynamic assessment in the language classroom. Language Teaching Research, 9, 233–265.
Popham, W. J. (2008a). Classroom assessment: What teachers need to know (5th ed.). Boston: Pearson Allyn & Bacon.
Popham, W. J. (2008b). Transformative assessment. Alexandria, VA: Association for Supervision and Curriculum Development.
Pryor, J., & Akwesi, A. (1998). Assessment in Ghana and England: Putting reform to the test of practice. Compare, 28, 263–275.
Pryor, J., & Lubisi, C. (2002). Reconceptualizing educational assessment in South Africa—Testing times for teachers. International Journal of Educational Development, 22, 673–686.
Queensland Studies Authority (Australia). (2009a). PD packages. Retrieved 15 September 2009 from http://www.qsa.qld.edu.au/learning/3166.html.
Queensland Studies Authority (Australia). (2009b). Senior assessment general information [Web site]. Retrieved 24 August 2009 from http://www.qsa.qld.edu.au/assessment/2130.html
Rea-Dickins, P. (2007). Classroom-based assessment: Possibilities and pitfalls. In J. Cummins & C. Davison (Eds.), The international handbook of English language teaching, Vol. 1 (pp. 505–520). Norwell, MA: Springer.
Rea-Dickins, P. (2008). Classroom-based language assessment. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopedia of language and education. Vol. 7: Language testing and assessment (2nd ed., pp. 257–271). New York: Springer Science + Business.
Roos, B., & Hamilton, D. (2005). Formative assessment: A cybernetic viewpoint. Assessment in Education, 12, 7–20.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144.
Saskatchewan Learning (Canada). (1993). Learning assessment program. Retrieved 24 August 2009 from http://www.education.gov.sk.ca/Assessment-for-Learning.
SBA Consultancy Team. (2005). 2007 HKCE English Language Examination: Introduction to the school-based assessment component (Training Package). Hong Kong SAR, China: Hong Kong Examination and Assessment Authority/Faculty of Education, The University of Hong Kong.
SBA Consultancy Team. (2007). Professional development for the school-based assessment component of the 2007 HKCE English Language Examination. Hong Kong SAR, China: Hong Kong Examination and Assessment Authority.
SBA Consultancy Team. (2008). Aligning assessment with curriculum reform in junior secondary English language teaching. Hong Kong SAR, China: Quality Education Fund.
Spencer, E. (2005, February). Assessment in Scotland: ‘Assessment for learning.’ Formative assessment in a coherent system. PowerPoint presentation at the International Conference on Improving Learning Through Formative Assessment, Organisation for Economic Co-operation and Development, Paris, France. Retrieved 9 October 2006 from http://www.oecd.org/dataoecd/40/54/34488069.ppt#7.
Stiggins, R. (2001). The unfulfilled promise of classroom assessment. Educational Measurement: Issues and Practice, 20(3), 5–15.
Stiggins, R. (2002). Assessment crisis: The absence of assessment for learning. Phi Delta Kappan, 83, 758–765.
Stiggins, R. (2008). An introduction to student-involved assessment for learning (5th ed.). Upper Saddle River, NJ: Pearson Merrill Prentice Hall.
Stiggins, R., Arter, J., Chappuis, J., & Chappuis, S. (2007). Classroom assessment for student learning. Upper Saddle River, NJ: Educational Testing Service & Pearson Merrill Prentice Hall.
Taras, M. (2005). Assessment—summative and formative: Some theoretical reflections. British Journal of Educational Studies, 53, 466–478.
Teasdale, A., & Leung, C. (2000). Teacher-based assessment and psychometric theory: A case of paradigm crossing? Language Testing, 17, 163–184.
TESOL. (2005). TESOL/NCATE program standards. Standards for the accreditation of initial programs in P–12 ESL teacher education. Alexandria, VA: Author.
Torrance, H. (1993). Formative assessment: Some theoretical problems and empirical questions. Cambridge Journal of Education, 23, 333–343.
Widdowson, H. G. (2001). Communicative language testing: The art of the possible. In C. Elder, N. Brown, E. Iwashita, E. Grove, K. Hill, T. Lumley, et al. (Eds.), Experimenting with uncertainty: Essays in honour of Alan Davies (pp. 12–21). Cambridge: Cambridge University Press.
Yung, B. (2006). Assessment reform in science: Fairness or fear. Norwell, MA: Springer.