
TESOL QUARTERLY Vol. 43, No. 3, September 2009 393

Current Issues in English Language Teacher-Based Assessment

CHRIS DAVISON
University of New South Wales
Sydney, Australia

CONSTANT LEUNG
King’s College London
London, England

Teacher-based assessment (TBA) is increasingly being promoted in educational policies internationally, with English language teachers being called on to plan and/or implement appropriate assessment procedures to monitor and evaluate student progress in their own classrooms. However, there has been a lack of theorization of TBA in the English language teaching field, with researchers pointing to much variability, a lack of systematic principles and procedures, and a reliance on traditional, but now outdated, psychometric assumptions. This article provides an overview of some of the current issues in TBA, including its definition and key characteristics, and the complex but significant questions which its implementation poses for our understandings of language, learning, and assessment.

Teacher-based assessment (TBA) is a policy-supported practice in a number of educational systems internationally, including Australia, New Zealand,¹ Canada, and the United Kingdom (e.g., Cumming & Maxwell, 2004; Learning and Teaching Scotland, 2006;² Queensland Studies Authority, 2009b; Saskatchewan Learning, 1993; Spencer, 2005). It is increasingly being adopted as national educational policy in Asia (Butler, this issue; Curriculum Development Institute [Hong Kong], 2002; Ministry of Education [Singapore], 2008; Xu & Liu, this issue) as well as in some developing countries, including South Africa, Ghana, and Zambia (Pryor & Akwesi, 1998; Pryor & Lubisi, 2002). It is also actively promoted in the United States (e.g., Popham, 2008a, 2008b; Stiggins, 2008; Stiggins, Arter, Chappuis, & Chappuis, 2007), although always overshadowed by national testing programs. At the same time, English language teachers are increasingly being called on to plan and implement their own assessment instruments and procedures to monitor and evaluate student progress in their classrooms, and new curriculum documents and professional teaching standards increasingly demand English language teachers be knowledgeable and skilled in TBA (see, e.g., TESOL, 2005).

¹ In Queensland, where school-based assessment (SBA) was introduced in the 1970s (Sadler, 1989), teacher-based assessment is used for all assessment in the secondary school, even for high-stakes purposes (see Queensland Studies Authority, 2009a). The Australian Capital Territory (ACT) also uses only teacher-based assessment at the senior secondary level (Department of Education and Training, Education Policy and Planning Section [Australia], n.d.). Other states such as New South Wales and Victoria have incorporated large-scale teacher-based assessment into their public examinations (see, e.g., New South Wales Government, n.d.). New Zealand also has a long history of school-based assessment in the senior secondary school (see New Zealand Qualifications Authority, n.d.) and has developed a wide variety of teacher support material and associated research studies (Ministry of Education [New Zealand], 2009).

² In Scotland, much interesting work in TBA is being conducted by the Scottish Assessment Is for Learning (AifL) group (Learning and Teaching Scotland, 2006), supported by the Ministry of Education in Scotland and involving many classrooms.

However, despite this widespread embrace of various forms of TBA in school and adult education, there has been comparatively little specific research into the TBA of English as a second or additional language. TBA has been neglected by researchers partly because of the uncertain status of TESOL as a discrete curriculum area in schools and tertiary institutions, partly because of the traditional dominance of the field by large-scale English language tests such as the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) and their research priorities and needs, and partly because of the ongoing critique of notions of standard English and models of correctness as well as debates over native versus nonnative speaking teachers and the implications for assessment.

What TBA research has been done in TESOL reveals much variability, a lack of systematic principles and procedures, and a dearth of information as to the impact of TBA on learning and teaching. In Australia, several studies of the use of large-scale criterion-referenced English as a second language (ESL) assessment frameworks in schools (Breen et al., 1997; Davison & Williams, 2002) have revealed a great diversity in teachers’ approaches to assessment, influenced by the teachers’ prior experiences and professional development, the assessment frameworks and scales they used, and the reporting requirements placed on them by schools and systems. Concerns have also been raised about, on the one hand, the ad hoc or impressionistic nature of many teacher judgments (Leung, 1999; Leung & Teasdale, 1997) and, on the other hand, mechanistic criterion-based approaches to TBA, which are often implemented in such a way that they undermine rather than support teachers’ classroom-embedded assessment processes (Arkoudis & O’Loughlin, 2004; Black & Wiliam, 1998; Carless, 2005; Davison, 2004; Leung, 2004a, 2004b).

Research into TBA in TESOL is further complicated by the considerable uncertainty and disagreement around the concept of TBA itself and by its intrinsically co-constructed and context-dependent nature (Black & Wiliam, 1998; Brookhart, 2003; McMillan, 2003; McNamara, 2001; Stiggins, 2001). When the principles and procedures underlying TBA are not clear, the basis for research and development is even muddier, hence the need for more public and mainstream discussion of the issues. This review article aims, first, to define more clearly the concept of TBA in English language teaching and, second, to explore some of the key conceptual issues and challenges for the field, as well as the implications for practice. The article concludes with a summary of some of the areas in which more research into TBA is needed.

DEFINING TBA

There is no widely accepted common definition of teacher-based assessment in the English language teaching field, with many terms used interchangeably to refer to the same practices and procedures, including terms such as alternative assessment, classroom and/or school-based assessment, formative assessment, and more recently, assessment for learning. Such terms highlight different aspects of the assessment process, but all tend to be used to signify a more teacher-mediated, context-based, classroom-embedded assessment practice, explicitly or implicitly defined in opposition to traditional externally set and assessed large-scale formal examinations used primarily for selection and/or accountability purposes. Thus, for the purposes of this article we take TBA to mean much more than just who is doing the assessing; TBA also has implications for the what, where, how, and most importantly, the why of assessment.

TBA has a number of important characteristics which distinguish it from other forms of assessment:

• It involves the teacher from the beginning to the end: from planning the assessment programme, through to identifying and/or developing appropriate assessment tasks, right through to making the assessment judgments.
• It allows for the collection of a number of samples of student work over a period of time, using a variety of different tasks and activities.
• It can be adapted and modified by the teacher to match the teaching and learning goals of the particular class and students being assessed.
• It is carried out in ordinary classrooms, not in a specialist assessment centre or examination hall.
• It is conducted by the students’ own teacher, not a stranger.
• It involves students more actively in the assessment process, especially if self and peer assessment is used in conjunction with teacher assessment.
• It opens up the possibility for teachers to support learner-led enquiry.
• It allows the teacher to give immediate and constructive feedback to students.
• It stimulates continuous evaluation and adjustment of the teaching and learning programme.
• It complements other forms of assessment, including external examinations.

The key steps involved in such teacher-based assessment are captured in Figure 1.

FIGURE 1 A Framework for Teacher-Based Assessment (Davison, 2008)


Defined in this sense, TBA shares many of the characteristics of assessment for learning (AfL), a concept first used in the United Kingdom in the late 1980s, and widely promoted through the work of the Assessment Reform Group (Assessment Reform Group, 1999, 2001; Black & Wiliam, 1998). The term was introduced to ensure “a clear distinction be made between assessment of learning for the purposes of grading and reporting, which has its own well-established procedures, and assessment for learning, which calls for different priorities, new procedures and a new commitment” (Assessment Reform Group, 1999, p. 2). The Assessment Reform Group (1999, p. 7) has described AfL’s defining characteristics as follows:

• embedded in a view of teaching and learning of which it is an essential part
• learning goals are shared with pupils
• aims to help pupils know and recognise the standards they are aiming for
• pupils are involved in self-assessment
• provides feedback which helps pupils recognise their next steps and how to take them
• underpinned by confidence that every student can improve
• both teacher and pupils review and reflect on assessment data

In TBA, the term assessment for learning is often used synonymously with the term formative assessment, so comprehensively documented by Black and Wiliam (1998), but more recently many researchers have been calling for a sharper distinction between the two terms (Kennedy, Chan, Yu, & Fok, 2006; Roos & Hamilton, 2005; Stiggins, 2002; Taras, 2005). Traditionally, formative assessment is seen as informal and fairly frequent, involving the gathering of information about students and their language learning needs while they are still learning. Formative assessment is usually contrasted with summative assessment, generally defined as those more formal planned assessments at the end of a unit, term, or year which are used to evaluate student progress and/or grade students. In an assessment of learning culture, formative and summative assessment are seen as distinctly different in both form and function, with teacher and assessor roles clearly demarcated, but in an assessment for learning culture, it is argued that even summative assessments of the students’ language skills can and should also be used formatively to give constructive student feedback and improve learning (see, e.g., Biggs, 1998; Carless, 2008; Davison, 2007; Davison & Hamp-Lyons, 2009; Hamp-Lyons, 2007; Harlen, 2005; Kennedy et al., 2006; Taras, 2005). These researchers argue that provided summative assessment is undertaken while students are still learning (and teachers are still teaching), such assessments can and should also be used for formative purposes, that is, to improve learning and teaching, thus building a more coherent and stronger assessment for learning culture. Kennedy et al. (2006) propose that in this more inclusive model of assessment:

1. All assessment needs to be conceptualized as assessment for learning.
2. Feedback needs to be seen as a key function for all forms of assessment.
3. Teachers need to be seen as playing an important role not only in relation to formative assessment but in all forms of summative assessment as well, both internal and external.
4. Decisions about assessment need to be viewed in a social context because in the end they need to be acceptable to the community.

Kennedy concludes that “the continuing bifurcation between formative and summative assessment is no longer useful, despite the fact that such a distinction has resulted in some excellent research and development work on formative assessment” (p. 14). He joins Harlen (2005), Carless (2008), and others in calling for more research to be conducted into summative assessment, and as Carless puts it, tests as “productive learning opportunities” (p. 8). However, Kennedy challenges Roos and Hamilton’s (2005) view that summative assessment as a procedure is too deeply entrenched, in Roos and Hamilton’s words, to become “a valid activating mechanism for goal-directed educational activities” (p. 7). Biggs (1996, 1998) also argues that an exclusive focus on formative assessment may leave many negative summative assessment practices uncontested. He points out that this is deeply problematic given summative assessment’s significant influence on student learning, often negative backwash undermining any of the positive impacts of formative assessment. In fact, as has been well-documented in systems such as Hong Kong and Singapore (e.g., Cheah, 1998; Hamp-Lyons, 2007), it is extremely difficult to sustain any significant teacher-based formative assessment practices in most traditional examination-dominated cultures.³

The traditional concept of formative assessment also needs to be problematized. In AfL, formative assessment is seen as having two key functions: informing and forming. That is, formative assessment not only shapes the decisions about what to do next, by helping the teacher to select what to teach in the next lesson, or even in the next moment in the lesson; the student also has to understand what they have learned and what they need to learn next (Black, 2001; Black, Harrison, Lee, Marshall, & Wiliam, 2003a, 2003b; Black & Wiliam, 1998). The learner’s role is crucial because it is the learner who does the learning. This point seems obvious, even trite, but it is central to the AfL philosophy and, if treated seriously, clearly highlights where formative assessment can go wrong. As Torrance (1993) argued some years ago, many teachers are at risk of assuming formative assessment is at best “fairly mechanical and behaviouristic . . . in the graded test tradition”; at worst summative, “taking snapshots of where the children have ‘got to’, rather than where they might be going next” (p. 340).

³ In Hong Kong, studies of the implementation of teacher-based assessment innovations such as the Target-Oriented Curriculum in primary schools (e.g., Cheung & Ng, 2000; Carless, 2004; Adamson & Davison, 2003) and the Teacher Assessment Scheme in senior secondary science (Yung, 2006) found that any change in teacher assessment practice was difficult, severely constrained by traditional school culture and by teacher, parent, and student expectations.

Teachers coming from more traditional assessment cultures make two common misinterpretations of formative assessment. First, there is a widespread assumption that any continuous assessment is by definition formative, but this is not necessarily the case: a series of weekly tests is continuous, but the tests are not formative if they are not used by students to improve their learning:

The term ‘formative’ itself is open to a variety of interpretations and often means no more than that assessment is carried out frequently and is planned at the same time as teaching. Such assessment does not necessarily have all the characteristics just identified as helping learning. It may be formative in helping the teacher to identify areas where more explanation or practice is needed. But for the pupils, the marks or remarks on their work may tell them about their success or failure but not about how to make progress towards further learning. (Assessment Reform Group, 1999, p. 7)

Second, there is a common misconception that a so-called alternative form of the assessment automatically makes it formative; that is, assessments like portfolios and oral presentations are by definition formative. However, such assessments can be and sometimes are components of large-scale externally set and assessed examinations, for example, the ubiquitous external oral examinations of many Asian educational systems which are not used at all for formative purposes.

To summarize, then, in an AfL culture, TBA needs to be continuous and embedded naturally into every stage of the teaching–learning cycle, not just at the end (see Ministry of Education [Singapore], 2008, for an example of this in a K–12 curriculum). All assessments (even those for accountability purposes) need to be designed and implemented with the overriding aim of improving student learning, with AfL as the dominant educational ethos. In such a classroom or institution, teachers would be continually engaged in various forms of formative assessment, even at the end of the course. In this model of TBA, outlined in Table 1, assessment includes not only the formal planned moments when students undertake an assessment task but also the far more informal, even spontaneous moments when teachers are monitoring student group work and notice one student speaking more confidently or another failing to take an offered turn. Because the goal of TBA is to improve student learning, self and peer assessment are an integral component of all assessment activity. Feedback is also a defining element, with opportunities for constructive and specific feedback related to specific assessment criteria and curriculum goals and content regularly reviewed by students and teachers.

TABLE 1
Assessment for Learning in the Classroom: A Typology of Possibilities (Davison, 2008)

| | In-class contingent formative assessment-while-teaching | More planned integrated formative assessment | More formal mock or trial assessments modeled on summative assessments but used for formative purposes | Prescribed summative assessments, but results also used formatively to guide future teaching/learning |
|---|---|---|---|---|
| Definition | An integral but very informal part of every teacher’s daily practice | An integral part of the learning and teaching cycle, i.e., part of effective teaching and planning for the future | A time for taking stock, assessing how individuals are performing compared with whole group | A distinctive stage at the end of a unit of learning and teaching |
| Degree of preplanning | Often spontaneous and contingent when the need arises | An informal planned process during the course of the year tailored to the needs of the individual students and class | Usually predesigned, sensitive to needs of students but also to the demands of external requirements | Predetermined, relatively formal and set at beginning of unit of learning and teaching |
| Focus | Learner referenced; focus on the learning process | Criterion referenced, but in relation to learner’s starting point; focus on the learning process and student progress | Criterion referenced, but in relation to system-level norms; focus on student progress and gap between what should be and is | Criterion referenced, but in relation to system-level norms; focus mainly on the product of learning, and what student needs to do next |
| Typical kinds of feedback | Indirect or implied feedback, or direct, coconstructed by students and teacher | Direct qualitative feedback, may involve multiple and varied sources, e.g., self, peers, teacher, etc. | Direct qualitative feedback, may indicate profiles or grades, but still extensive student involvement | Report in profiles, levels, and marks by teacher, but preceded and/or followed by formative self and peer evaluation and extensive teacher feedback |
| Types of assessments: Observe | Informal observation of learner behavior/language use | More structured self, peer and teacher observation using anecdotal records/observation checklists/self and peer evaluations | Systematic observation of language samples using scales/profiles/rubrics | Formal moderated mapping of students’ performance on system-wide published standards/scales |
| Types of assessments: Inquiry | Focused open questions to elicit/check understanding; opportunity for student self-reflections | Peer conferencing; informal student self and peer reflections/learning logs | Teacher-student conferencing; structured student self and peer reflections/learning logs | NA |
| Types of assessments: Analysis | Informal analysis of patterns in student language use | Analysis of drafts/video and audio samples of work | Portfolios/collections of student work/presentation | Formal portfolio/project/videotaped presentations |
| Types of assessments: Test | NA | Informal quizzes, diagnostic tests, student-developed tests | More formal tests | Formal tests, exams |

Such an integrated approach to assessment underpinned the recent development of a school-based assessment (SBA) component in the Hong Kong Certificate of Education Examination (HKCEE) in English Language (Davison, 2007; Davison & Hamp-Lyons, 2009). The stated purpose of the SBA component was to provide a more comprehensive appraisal of Forms 4–5 (Grades 9–10) learners’ achievement by assessing learning objectives which could not be easily assessed in public examinations while at the same time enhancing teaching and learning. The initiative marked a shift from traditional norm-referenced externally set and assessed examinations toward a more student-centered TBA system that drew its philosophical basis from the assessment for learning movement discussed earlier. Teachers are involved at all stages of the assessment cycle, from planning the assessment programme to identifying and developing appropriate formative and summative assessment activities right through to making the final judgments. In-class formal and informal performance assessment of students’ authentic oral language skills using a range of tasks and guiding questions and the use of teacher judgments of student performance using common assessment criteria are innovative aspects of the new SBA, as is the insistence that students play an active role in the assessment process and the vigorous promotion of self and/or peer assessment and feedback (for a fuller discussion, see Davison, 2007; Davison & Hamp-Lyons, 2009).

Such TBA is assumed to have a number of advantages over external examinations, especially in assessing language, because effective language development requires not just knowledge but skill and application in a wide range of situations and modes of communication. Hence, like other performance-based subjects such as music, art, drama, and various vocational subjects, it is often argued that languages are better assessed through more authentic-like, performance-based assessments. Table 2 summarizes some of the common advantages attributed to TBA compared with external examinations.

However, a number of these claims made for the efficacy, or even superiority, of TBA over traditional assessments, especially those relating to validity and reliability, need to be explored further because they raise important theoretical issues that go to the heart of TBA when applied to the English language teaching field.

TABLE 2
Advantages of TBA Compared With External Examinations for Oral Language Assessment (Adapted From SBA Consultancy Team, 2005)

| | Characteristics of classroom-based TBA | Characteristics of exams |
|---|---|---|
| Scope | Extends the range and diversity of assessment collection opportunities, task types, and assessors | Much narrower range of assessment opportunities: less diverse assessment; one exam per year |
| Authenticity | Assesses work being done within the classroom; less possibility of cheating as teacher knows student capabilities; assessments more likely to be realistic | Removes assessment entirely from teaching and learning; stressful conditions may lead to students not demonstrating real capacities |
| Validity | Improves validity through assessing factors that cannot be included in public exam settings | Limits validity by limiting scope of assessment, e.g., difficult to assess interaction skills in exam environment |
| Reliability | Improves reliability by having more than one assessment by a teacher who is familiar with the student; allows for multiple opportunities for assessor reflection/standardization | Even with double marking, examiners’ judgments can be affected by various factors (task difficulty, topic, interest level, tiredness, etc.), but little opportunity for assessor reflection/review |
| Fairness | Fairness is achieved by following commonly agreed processes, outcomes and standards; teacher assumptions about students and their oral language levels are made explicit through collaborative sharing and discussion with other teachers | Fairness can only be achieved by treating everyone the same, i.e., setting the same task at the same time for all students |
| Feedback | Students can receive constructive feedback immediately after the assessment has finished, hence improving learning | The only feedback is usually a grade at the end of the exam; generally no opportunities for interaction with assessor; no chance to ask how to improve |
| Positive washback (beneficial influence on teaching and learning) | Ongoing assessment encourages students to work consistently; provides important data for evaluation of teaching and assessment practices in general | Examinations by their nature can only be purely summative, and do not serve any teaching-related purpose; effects on teaching and learning may even be negative; may encourage teaching to the test and a focus on exam technique, rather than outcomes |
| Teacher and student empowerment | Teachers and students become part of the assessment process; collaboration and sharing of expertise takes place within and across schools | Teachers play little to no role in assessment of their students and have no opportunity to share their expertise or knowledge of their students; students treated as numbers |
| Professional development | Builds teacher assessment skills, which can be transferred to other areas of the curriculum | Teachers have no opportunity to build their assessment skills; get little or no feedback on how to improve as teachers |
| Practicality and cost | Once teachers are trained, TBA is much cheaper as integrated into normal curriculum; undertaken by class teacher as part of everyday teaching; avoids wasting valuable teaching time on practice tests | Language assessment as currently practiced is very expensive in terms of task development (especially as multiple stimulus material is often needed to avoid cheating), assessor training and moderation and teaching/learning time |

CONCEPTUAL ISSUES AND CHALLENGES IN TBA IN ENGLISH LANGUAGE TEACHING

There are obviously many issues and challenges confronting English language education in its movement toward greater use of high-quality TBA, ranging from the very practical concerns associated with any significant change in classroom practice, including the need to develop confidence to overcome the inevitable implementation dip, through to the more technical problems associated with learning how to construct explicit assessment criteria and tasks appropriate for a range of individual student needs (e.g., Fox, 2008), through to the significant challenge of changing deeply entrenched sociocultural attitudes and expectations (for a fuller discussion, see Brindley, 1995; Davison, 2007; Rea-Dickins, 2008). However, in this article we are more concerned with conceptual issues, in which questions of validity and reliability are central.

As others have pointed out (e.g., Rea-Dickins, 2007), in TBA there is much debate over evaluation criteria, with researchers such as Leung (2004a, 2004b) and Teasdale and Leung (2000), on the one hand, arguing that the evaluation criteria traditionally associated with psychometric testing, such as reliability and validity, need to be reinterpreted in TBA, particularly in relation to in-class contingent assessment in interaction, and testers such as Clapham (2000), on the other hand, insisting that traditional test criteria do apply to alternative TBA:


A problem with methods of alternative assessment, however, lies with their validity and reliability: Tasks are often not tried out to see whether they produce the desired linguistic information; marking criteria are not investigated to see whether they ‘work’; and raters are often not trained to give consistent marks. (p. 152)

However, these points represent two quite different levels of concern. Clapham’s arguments are primarily to do with the quality of assessment practice, whereas Leung’s problematize the underlying assessment theory.⁴ Even when TBA is “best” practice, many theoretical issues remain unresolved; in fact, we would argue they are even more obvious.

As an example, take the development of the SBA system in Hong Kong (SBA Consultancy Team, 2005), outlined earlier, involving more than 1,800 teachers in more than 650 schools and institutions. Conducted over 2 years of schooling, and contributing 15% toward each student’s final English score, it consists of the assessment of English oral language skills based on topics and texts drawn from a program of independent reading or viewing (“texts” encompass print, video, film, fiction, and nonfiction material). The spoken language tasks are of two broad kinds: group interaction and individual presentation. Students choose at least three texts to read or view over the course of 2 years, keeping a logbook or brief notes, and undertaking a number of activities in and out of class to develop their independent reading, speaking, and thinking skills. For assessment they participate in several interactions with classmates on a particular aspect of the text they have read or viewed, leading up to making an individual presentation or group interaction on a specific text and responding to questions from their audience (for a full description of the assessment requirements, see SBA Consultancy Team, 2005).

⁴ See Chapelle (1999) for a more detailed discussion of current debates over validity in language testing.

A range of assessment tasks has been provided that teachers can choose from and adapt, including teacher-made tasks adapted from those used by teachers who took part in the initial development of the assessment initiative. Assessment tasks can vary in length and complexity, enabling teachers to provide students with appropriate, multiple, and varied opportunities to demonstrate their oral language abilities individually tailored to their language levels and interests. At the same time, however, the teacher and the school need to be sure that the oral language produced is the student’s own work, not the result of memorization without understanding. Hence, there are some important requirements or conditions that teachers and students must follow, including the assessment being conducted by the usual English teacher, in the presence of one or more classmate(s). Students are assessed according to a set of assessment criteria consisting of a set of descriptors at each of six levels across four domains: (a) pronunciation and delivery, (b) communication strategies, (c) vocabulary and language patterns, and (d) ideas and organization. Teachers are encouraged to video or audio record a range of student assessments to assist with standardization and feedback, involving the students as much as possible. During the class assessments, which might span a number of weeks, individual teachers at the same level are encouraged to meet informally to compare their assessments and make adjustments to their own scores as necessary. Such informal interactions give teachers the opportunity to share opinions on how to score performances and interpret the assessment criteria.

Near the end of the school year, all the English teachers at each level hold a formal meeting, chaired by a coordinator in each school, to review performance samples and standardize scores. Such meetings are critical for developing consistency in and between teacher-assessors, for public accountability, and for professional collaboration and support. At the end of each year, a district-level meeting is held for professional sharing and further standardization. Each coordinator is encouraged to share a range of typical and atypical individual assessment records (along with the video or audio recordings) and the class records. Once any necessary changes are made, the performance samples are archived and the scores are submitted to the HKEAA for review. Maintaining notes of all standardization meetings and any follow-up action is also encouraged so schools can show parents and the public that they have applied the assessment procedures consistently and fairly. The HKEAA then undertakes a process of statistical moderation 5 to ensure the comparability of scores across the whole Hong Kong school system. This TBA system is supported by a comprehensive teacher training package (SBA Consultancy Team, 2005) which includes an introductory DVD and booklet, and two training CD-ROMs containing a range of student samples for benchmarking purposes. In addition, 39 district-level group coordinators, mostly serving teachers, were used to coordinate training and standardization sessions with school coordinators and with the teachers involved within each school. A 12-hour supplementary professional development program with comprehensive course and video notes on DVD (SBA Consultancy Team, 2007) was also developed, and all teachers are encouraged to complete the program in their first year of such assessment. Careful monitoring of the assessment process shows that teachers are able to reliably mark students’ work with high levels of interrater reliability and that a higher correlation exists between SBA and other components of the external exam than between the external oral exam and other components.

5 It is difficult to justify statistical moderation from a theoretical perspective, given SBA and the external examination are measuring different things under very different conditions, but it is considered essential to ensure public confidence in the examination system is maintained, while allowing the HKEAA to be more innovative in its assessment practices.

This TBA system aims to ensure, to paraphrase Clapham (2000), that tasks are tried out, marking criteria do work, and raters are trained; however, theoretical problems to do with the nature of what is assessed and how it is assessed in English language education still arise and are, in fact, foregrounded by TBA in ways which challenge the English language teaching field. In the interests of brevity, we will look at three key sets of issues arising from TBA which problematize our theorization of language, language learning, and assessment.

Implications of TBA for the Theorization of Language

All language use implicates meaning making (and meaning taking). In English language classes, almost all language is taught (and learned) through carrier content, for example, the language use one might engage in when arriving at an airport, or reviewing and discussing a film. 6 Thus, TBA, whether we are talking about assessment embedded in in-class contingent interaction or more formal assessment at the end of the teaching and learning cycle, is inextricably connected to meaning making with reference to content meaning in context. If we look at what is assessed in the example from Hong Kong outlined earlier, we see language use being embedded into other forms of social and cognitive activity, then being “pulled out” for separate assessment in ways which raise real issues of validity: Are we assessing speaking? Are we assessing cognition? Are we assessing reading? Are we assessing cultural knowledge? Are we assessing interactive style or personality? Such concerns about validity are somewhat ironic given that the direct assessment and real-world (even if simulated) nature of TBA tasks is supposed to be one of the key advantages TBA has over external exams.

Thus, somewhat paradoxically, TBA raises two problems to do with the nature of language assessment that are relatively invisible in traditional testing. First, TBA, in its emphasis on language use in context, calls into action a multifaceted combination of linguistic, pragmatic, and cultural resources. Models of second language competence (e.g., Bachman, 1990; Canale & Swain, 1980) have set out components such as grammatical competence, sociocultural competence, and strategic competence. These components have been used to inform model building for proficiency levels and assessment (e.g., Council of Europe, 2001; Griffin & McKay, 1992). However, as Widdowson (2001) noted, there is as yet no clear understanding of how these different components relate to one another, particularly in specific contexts. One consequence of this lack of understanding is that assessment has a built-in arbitrariness; for example, should pragmatic competence be regarded as being more important than linguistic competence in classroom discussions? Do we need a model(s) of language that would articulate the different component competences in different contexts?

6 It is rare in these communicative language teaching days for language use to be solely concerned with a display of linguistic knowledge (as in grammar drill exercises).

Second, insofar as language use almost always involves some content meaning (even in formal language learning activities), then assessing language means inevitably assessing content to some degree. This is particularly the case for the TBA of English that takes place in subject learning contexts, for example, English language learners in mainstream English-medium school and university classrooms (e.g., in North America, Australia, and the United Kingdom as well as in English-medium institutions around the world), and in content-language integrated learning programmes (increasingly popular in Europe where the teaching and learning of English as a foreign language is carried out through a school subject such as science). But assessing English in subject learning contexts raises certain questions: Do teacher-assessors need a framework for language assessment as well as a separate framework for content? Or can they adopt a content-language integrated view, as argued by Mohan, Leung, and Slater (in press), that proceeds on the assumption that there is no separation between meaning and wording? These are critical questions to do with validity highlighted by TBA, but they have widespread significance in the English language teaching field.

Implications of TBA for the Theorization of Language Learning

The second key set of problems highlighted in TBA relates to the theorization of language learning. TBA, particularly for formative purposes, has generally put a high premium on teacher–student dialogue involving appropriate use of questioning and feedback, either as part of live contingent interaction or in the form of written comments. This practice generally, and mainly implicitly, assumes that language learning takes place through interaction between the teacher and the student with the guidance and advice given by the teacher advancing the student’s knowledge and skills. Until recently, such formative assessment appears to have been much more interested in practice than theory. Lantolf and Poehner (2004), Poehner and Lantolf (2005), and Poehner (2007, 2008) argue that formative TBA has been atheoretical and overly task oriented, without paying sufficient attention to a learner’s overall cognitive development; furthermore, such formative assessment seems to rely on teacher intuition rather than any systematic theory of learning. In contrast to most forms of teacher-based formative assessment, their preferred model of assessment, dynamic assessment, works explicitly within a Vygotskyan sociocultural paradigm, using the twin constructs of teacher-mediated assistance and the zone of proximal development to theorize the process of learning through assessment. A similar approach grounded in sociocultural theory, called interactive assessment (see SBA Consultancy Team, 2008, for a fuller explanation), is also being promoted in the Hong Kong assessment initiative. Teachers are given a framework of guiding questions which make increasing cognitive and linguistic demands on the learner, and teacher-assessors are encouraged to interact individually with a student at any time, asking specific question(s) to clarify and encourage the student to extend ideas, help prompt and scaffold the students’ oral interaction, probe the range and depth of their oral language skills, and verify the student’s understanding of what he or she is saying. The questions are meant to be used flexibly to ensure that students have the opportunity to show the full range of their responses, hence achieving the most valid “true” judgment of students’ ability.

However, such approaches raise key questions not only about the nature of second language learning and its stages of development, but also about the role of assessment criteria and the teacher-assessor. Where learners of English need support to understand and express meaning, elements of teaching and scaffolding of the medium of communication may be built into formative guidance. How should this aspect of teacher–student interaction be considered in any theorizing of TBA? How does a teacher decide what to foreground in any set of assessment criteria and what to downplay or even ignore? Do we need to adopt an explicit theory of interaction and its relationship with learning? Is there something unique about TBA of language that requires special and additional attention?

In a discussion on the development of a theory of formative assessment in general, Black and Wiliam (2009) suggest that theory building “must bring into relationship . . . three spheres, the teacher’s agenda, the internal world of each student, and the inter-subjective” (p. 26). TBA in English language teaching highlights the complexity of these relationships and problematizes the teacher-assessor’s own beliefs and constructions of their discipline (for a further discussion, see Leung, 2007) in ways which challenge all in English language teaching.

Implications of TBA for the Theorization of Assessment

The series of questions raised in the discussion so far are implicated in the third and final set of issues to be addressed in this article, that is, the theorization of assessment. It is a fundamental paradox of TBA that its inherent strengths are viewed by many psychometricians as its greatest weaknesses. In many ways TBA is the opposite of traditional forms of examination and testing in which context is regarded as an extraneous variable that must be controlled and neutralized and the assessor as someone who must remain objective and uninvolved throughout the whole assessment process (Davison, 2007). TBA, in contrast, derives a major part of its validity and reliability from its location in the actual classroom where assessment activities are embedded in the regular curriculum and assessed by a teacher who is familiar with the student’s work and presumably has a stake in their improvement. To work effectively, however, TBA needs a theory of assessment which is aligned with and which exploits these inherent features. Thus, in the TBA initiative in Hong Kong, schools and teachers were granted a large degree of trust and autonomy in the design, implementation, and specific timing of assessment tasks. The criteria for evaluating reliability shifted from a focus on input to a focus on output; that is, no assessment tasks are the same across all schools; rather, a standard set of expectations of students’ language use (i.e., assessment standards or criteria) was developed based on the curriculum goals, past performances, and the teachers’ own judgments, and is now used by all teachers (and, more importantly, students) to generate tasks appropriate to the students’ language level, context, and needs (SBA Consultancy Team, 2005). All students are given sufficient time and support to demonstrate their best, to show what they can do, and for the assessor to be able to confidently assess their output, but even more importantly, validate their informal judgments of students’ language levels and achievements. In other words, the more formal assessment tasks are designed to encourage the teacher to stand back and reflect on their implicit or explicit assumptions about individual students’ capacities, compare those assumptions with careful analysis of examples of students’ actual performance, and then subject their judgments to explicit scrutiny and challenge or confirmation by others. This TBA initiative does not assume that the class teacher is objective or has no preconceived ideas or assumptions about a student’s level. To the contrary, it seeks to make such assumptions explicit and open to discussion with fellow teachers. Thus, it is not necessary to have complete consensus; that is, teachers do not need to agree to give identical marks; some variation within the range is to be expected. As Davison (2004) argues, in TBA trustworthiness comes more from the process of expressing disagreements, justifying opinions, and so on than from absolute agreement.

This theorization of assessment is obviously very different from that associated with large-scale testing, one that has as core criteria for evaluation not just learning outcomes, but the explicit enhancement of learning and teaching. As such, the traditional conceptions of validity and reliability associated with the still-dominant psychometric tradition of testing are themselves a potential threat to the development of the necessarily highly contextualized and dialogic practices of TBA (Rea-Dickins, 2007). Given that TBA ranges from in-class contingent formative assessment as part of teaching to prescribed, relatively formal summative assessment, the following questions need to be asked: How can we develop a view of validity and reliability in terms of learning (not solely in terms of learning outcomes)? Is there a place for differentiated criteria of validity and reliability for different kinds of TBA? How can we further strengthen TBA and its nexus with learning and teaching while at the same time enhancing community confidence in our assessment systems? How can we better align traditional theorizations of assessment with those needed for TBA and vice versa? Is such alignment theoretically possible?

CONCLUSION

There are obviously areas of TBA other than those explored in this article in which further research and conceptualization are needed. In particular, more thinking is needed around ethics, trustworthiness, and fairness (e.g., see Lynch, 2001; Lynch & Shaw, 2005), and the relationship between assessment, feedback, and learning. More research is also needed into the effects of system-level change, including the impact on teachers and learners of the adoption, implementation, or evaluation of school-based TBA systems; the effect of importation of assessment approaches from other cultures; comparative perspectives on assessment policies and programs; and the impact of standards-based assessment on teachers and students. More research into teacher training and professional development in assessment is also necessary: what this kind of teacher development comprises and how it is perceived, the quality and progress indicators of TBA, and different approaches to teacher development in assessment.

However, TBA, in all its incarnations, has been around English language teaching long enough to demonstrate its powerful potential to improve learning and teaching in a range of different contexts. What it has lacked until recently has been sufficient engagement with theory and a sense of a research agenda. Perhaps more tellingly, the highly contextualized and variable nature of TBA has meant it lacks the capacity to be reduced to an off-the-shelf, for-profit product and thus has always been relegated to the status of the Other. However, as this special issue demonstrates, TBA appears to be gaining enough critical mass and common interest to generate a new level of discussion about core concepts. This is to be applauded because many of the key questions and issues raised by TBA are of central interest to the English language teaching world.

ACKNOWLEDGMENTS

The authors acknowledge the important contribution of discussions with their colleagues at King’s College, London, and the University of Hong Kong to the ideas expressed in this article.

THE AUTHORS

Chris Davison is Professor of Education and Head of the School of Education at the University of New South Wales, Sydney, Australia. Before going to Hong Kong, she worked as a teacher educator for 15 years. She is also actively involved in the research and development of English as a second language and languages other than English policy and programs in Australia and the Asia-Pacific area.

Constant Leung is Professor of Educational Linguistics in the Department of Education and Professional Studies at King’s College London. He has written and published widely on additional and second language education and language assessment issues. He is also the director of a master of arts program in English language teaching and applied linguistics.

REFERENCES

Adamson, B., & Davison, C. (2003). Innovation in English language teaching in Hong Kong primary schools: One step forwards, two steps sideways. Prospect, 18, 27–41.

Arkoudis, S., & O’Loughlin, K. (2004). Tensions between validity and outcomes: Teachers’ assessment of written work of recently arrived immigrant ESL students. Language Testing, 20, 284–304.

Assessment Reform Group. (1999). Assessment for learning: Beyond the black box. Cambridge: University of Cambridge School of Education. Retrieved on 2 October 2007 from http://arg.educ.cam.ac.uk/AssessInsides.pdf.

Assessment Reform Group. (2001). Assessment for learning: 10 principles. Retrieved 2 July 2009 from http://www.assessment-reform-group.org.uk.

Bachman, L. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.

Biggs, J. (1996). Assessing learning quality: Reconciling institutional, staff, and educational demands. Assessment & Evaluation in Higher Education, 2, 3–25.

Biggs, J. (1998). Assessment and classroom learning: A role for formative assessment? Assessment in Education: Principles, Policy & Practice, 5, 103–110.

Black, P. (2001). Formative assessment and curriculum consequences. In D. Scott (Ed.), Curriculum and assessment. Westport, CT: Ablex.

Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003a, April). Formative and summative assessment: Can they serve learning together? Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL. Retrieved 1 October 2007 from http://www.kcl.ac.uk/education/papers/AERA%20ClassAsst.pdf.

Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003b, April). The nature and value of formative assessment for learning. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL. Retrieved on 1 October 2007 from http://www.kcl.ac.uk/education/papers/AERA%20Pres.pdf.

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5, 7–74.

Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21, 5–31.

Breen, M., Barratt-Pugh, C., Derewianka, B., House, H., Hudson, C., Lumley, T., et al. (1997). Profiling ESL children: How teachers interpret and use national and state assessment frameworks. Canberra, Australia: Department of Employment, Education, Training and Youth Affairs.

Brindley, G. (1995). Assessment and reporting in language learning programmes: Purposes, problems and pitfalls. In E. Li & G. James (Eds.), Testing and evaluation in second language education (pp. 133–162). Hong Kong SAR, China: Hong Kong University of Science and Technology.

Brookhart, S. (2003). Developing measurement theory for classroom assessment purposes and uses. Educational Measurement: Issues and Practice, 22(4), 5–12.

Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1, 1–47.

Carless, D. (2005). Prospects for the implementation of assessment for learning. Assessment in Education, 12, 39–54.

Carless, D. (2008). Developing productive synergies between formative and summative assessment processes. In M. F. Hui & D. Grossman (Eds.), Improving teacher education through action research (pp. 9–23). New York: Routledge.

Chapelle, C. (1999). Validity in language assessment. Annual Review of Applied Linguistics, 19, 254–272.

Cheah, Y. M. (1998). The examination culture and its impact on literacy innovations: The case of Singapore. Language and Education, 12, 192–209.

Cheung, D., & Ng, D. (2000). Teachers’ stages of concern about the target-oriented curriculum. Education Journal, 28, 109–122.

Clapham, C. (2000). Assessment and testing. Annual Review of Applied Linguistics, 20, 147–161.

Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge: Cambridge University Press.

Cumming, J. J., & Maxwell, G. S. (2004). Assessment in Australian schools: Current practice and trends. Assessment in Education, 11, 89–108.

Curriculum Development Institute. (2002). School policy on assessment: Changing assessment practices. Chapter 5 in Basic education curriculum guide: Building on strength. Hong Kong: Curriculum Development Institute.

Davison, C. (2004). The contradictory culture of classroom-based assessment: Teacher-based assessment practices in senior secondary English. Language Testing, 21, 304–333.

Davison, C. (2007). Views from the chalkface: School-based assessment in Hong Kong. Language Assessment Quarterly, 4, 37–68.

Davison, C. (2008, March). Assessment for learning: Building inquiry-oriented assessment communities. Paper presented at 42nd Annual TESOL Convention and Exhibit, New York, NY.

Davison, C., & Hamp-Lyons, L. (2009). The Hong Kong Certificate of Education: School-based assessment reform in Hong Kong English language education. In L.-Y. Cheng & A. Curtis (Eds.), English language assessment and the Chinese learner. New York: Routledge.


Davison, C., & Williams, A. (Eds.). (2002). Learning from each other: Literacy, labels and limitations. Studies of child English language and literacy development K-12, Volume 2. Melbourne: Language Australia.

Department of Education and Training, Education Policy and Planning Section (Australia). (n.d.). School excellence initiative. Teachers: The key to student success. A discussion paper for government schools. Tuggeranong, Australian Capital Territory: Author. Retrieved 15 September 2009 from http://www.det.act.gov.au/__data/assets/pdf_file/0009/17964/sei_TeachersKeyToStudentSuccess.pdf

Fox, J. (2008). Alternative assessment. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopedia of language and education. Vol. 7: Language testing and assessment (2nd ed., pp. 97–108). New York: Springer.

Griffin, P., & McKay, P. (1992). Assessing and reporting in the ESL language and literacy in schools project. In P. McKay (Ed.), ESL development: Language and literacy in schools project, Vol. 2 (pp. 9–16). Canberra, Australia: Department of Employment, Education and Training.

Hamp-Lyons, L. (2007). The impact of testing practices on teaching: Ideologies and alternatives. In J. Cummins & C. Davison (Eds.), The international handbook of English language teaching, Vol. 1 (pp. 487–504). Norwell, MA: Springer.

Harlen, W. (2005). Teachers’ summative assessment practices and assessment for learning: Tensions and synergies. Curriculum Journal, 16, 207–223.

Kennedy, K. J., Chan, K. S. J., Yu, W. M., & Fok, P. K. (2006, May). Assessment for productive learning: Forms of assessment and their potential for enhancing learning. Paper presented at the 32nd Annual Conference of the International Association for Educational Assessment, Singapore.

Lantolf, J. P., & Poehner, M. (2004). Dynamic assessment: Bringing the past into the future. Journal of Applied Linguistics, 1, 49–74.

Learning and Teaching Scotland. (2006). Assessment Is for Learning Programme. Retrieved 2 October 2007 from http://www.ltscotland.org.uk/assess/about/index.asp.

Leung, C. (1999). Teachers’ response to linguistic diversity. In A. Tosi & C. Leung (Eds.), Rethinking language education: From a monolingual to a multilingual perspective (pp. 225–240). London: Centre for Information on Language Teaching and Research, the National Centre for Languages.

Leung, C. (2004a). Classroom teacher-based assessment of second language development: Construct as practice. In E. Hinkel (Ed.), Handbook of research in second language learning and teaching. Mahwah, NJ: Erlbaum.

Leung, C. (2004b). Developing formative teacher-based assessment: Knowledge, practice, and change. Language Assessment Quarterly, 1, 19–41.

Leung, C. (2007). Dynamic assessment: Assessment as teaching? Language Assessment Quarterly, 4, 257–278.

Leung, C., & Teasdale, A. (1997). Raters’ understanding of rating scales as abstracted concept and as instruments for decision-making. Melbourne Papers in Language Testing, 6, 45–70.

Lynch, B. (2001). Rethinking assessment from a critical perspective. Language Testing, 18, 351–372.

Lynch, B., & Shaw, P. (2005). Portfolios, power, and ethics. TESOL Quarterly, 39, 263–297.

McMillan, J. (2003). Understanding and improving teachers’ classroom assessment decision-making: Implications for theory and practice. Educational Measurement: Issues and Practice, 22(4), 34–43.

McNamara, T. (2001). Language assessment as social practice: Challenges for research. Language Testing, 18, 333–349.

Mohan, B., Leung, C., & Slater, T. (in press). Assessing language and content: A functional perspective. In A. Paran & L. Sercu (Eds.), Testing the untestable in language and education. Clevedon, England: Multilingual Matters.

Ministry of Education (New Zealand). (2009). Assessment resources banks: English, mathematics, and science. Retrieved 15 September 2009 from http://arb.nzcer.org.nz/assessment/

Ministry of Education (Singapore). (2008). 2012 English language syllabus. Singapore: Government Printer.

New South Wales Government. (n.d.). Assessment for learning in the new years 7–10 syllabuses. Retrieved 10 September 2009 from http://arc.boardofstudies.nsw.edu.au/go/sc/afl/.

New Zealand Qualifications Authority. (n.d.). Assessment and examination rules and procedures for secondary schools—2009. Retrieved 10 September 2009 from http://www.nzqa.govt.nz/ncea/acrp/secondary/5/5.html

Poehner, M. (2007). Beyond the test: L2 dynamic assessment and the transcendence of mediated learning. Modern Language Journal, 91, 323–340.

Poehner, M. E. (2008). Dynamic assessment: A Vygotskian approach to understanding and promoting L2 development. New York: Springer.

Poehner, M. E., & Lantolf, J. (2005). Dynamic assessment in the language classroom. Language Teaching Research, 9, 233–265.

Popham, W. J. (2008a). Classroom assessment: What teachers need to know (5th ed.). Boston: Pearson Allyn & Bacon.

Popham, W. J. (2008b). Transformative assessment. Alexandria, VA: Association for Supervision and Curriculum Development.

Pryor, J., & Akwesi, A. (1998). Assessment in Ghana and England: Putting reform to the test of practice. Compare, 28, 263–275.

Pryor, J., & Lubisi, C. (2002). Reconceptualizing educational assessment in South Africa—Testing times for teachers. International Journal of Educational Development, 22, 673–686.

Queensland Studies Authority (Australia). (2009a). PD packages. Retrieved 15 September 2009 from http://www.qsa.qld.edu.au/learning/3166.html.

Queensland Studies Authority (Australia). (2009b). Senior assessment general information [Web site]. Retrieved 24 August 2009 from http://www.qsa.qld.edu.au/assessment/2130.html

Rea-Dickins, P. (2007). Classroom-based assessment: Possibilities and pitfalls. In J. Cummins & C. Davison (Eds.), The international handbook of English language teaching, Vol. 1 (pp. 505–520). Norwell, MA: Springer.

Rea-Dickins, P. (2008). Classroom-based language assessment. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopedia of language and education. Vol. 7: Language testing and assessment (2nd ed., pp. 257–271). New York: Springer Science + Business.

Roos, B., & Hamilton, D. (2005). Formative assessment: A cybernetic viewpoint. Assessment in Education, 12, 7–20.

Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144.

Saskatchewan Learning (Canada). (1993). Learning assessment program. Retrieved 24 August 2009 from http://www.education.gov.sk.ca/Assessment-for-Learning.

SBA Consultancy Team. (2005). 2007 HKCE English Language Examination: Introduction to the school-based assessment component (Training Package). Hong Kong SAR, China: Hong Kong Examination and Assessment Authority/Faculty of Education, The University of Hong Kong.

SBA Consultancy Team. (2007). Professional development for the school-based assessment component of the 2007 HKCE English Language Examination. Hong Kong SAR, China: Hong Kong Examination and Assessment Authority.


SBA Consultancy Team. (2008). Aligning assessment with curriculum reform in junior secondary English language teaching. Hong Kong SAR, China: Quality Education Fund.

Spencer, E. (2005, February). Assessment in Scotland: ‘Assessment for learning.’ Formative assessment in a coherent system. PowerPoint presentation at the International Conference on Improving Learning Through Formative Assessment, Organisation for Economic Co-operation and Development, Paris, France. Retrieved on 9 October 2006 from http://www.oecd.org/dataoecd/40/54/34488069.ppt#7.

Stiggins, R. (2001). The unfulfilled promise of classroom assessment. Educational Measurement: Issues and Practice, 20(3), 5–15.

Stiggins, R. (2002). Assessment crisis: The absence of assessment for learning. Phi Delta Kappan, 83, 758–765.

Stiggins, R. (2008). An introduction to student-involved assessment for learning (5th ed.). Upper Saddle River, NJ: Pearson Merrill Prentice Hall.

Stiggins, R., Arter, J., Chappuis, J., & Chappuis, S. (2007). Classroom assessment for student learning. Upper Saddle River, NJ: Educational Testing Service & Pearson Merrill Prentice Hall.

Taras, M. (2005). Assessment—summative and formative: Some theoretical reflections. British Journal of Educational Studies, 53, 466–478.

Teasdale, A., & Leung, C. (2000). Teacher-based assessment and psychometric theory: A case of paradigm crossing? Language Testing, 17, 163–184.

TESOL. (2005). TESOL/NCATE program standards. Standards for the accreditation of initial programs in P–12 ESL teacher education. Alexandria, VA: Author.

Torrance, H. (1993). Formative assessment: Some theoretical problems and empirical questions. Cambridge Journal of Education, 23, 333–343.

Widdowson, H. G. (2001). Communicative language testing: The art of the possible. In C. Elder, N. Brown, E. Iwashita, E. Grove, K. Hill, T. Lumley, et al. (Eds.), Experimenting with uncertainty: Essays in honour of Alan Davies (pp. 12–21). Cambridge: Cambridge University Press.

Yung, B. (2006). Assessment reform in science: Fairness or fear. Norwell, MA: Springer.
