Assessing the language of young learners

Angela Hasselgreen, The University of Bergen

This article focuses on the plight of children as young language learners (YLLs), in the context of today’s Europe. By reference to the Council of Europe’s recommendations and its material in the form of the Common European Framework of Reference (CEFR) and the European Language Portfolio (ELP), questions are posed here as to how far the special needs of YLLs are being catered for by assessment practices in European schools. Examples are given of how the CEFR and ELP are currently used in YLL assessment, with a focus on recent developments in this direction in Norway, and conclusions are drawn on the outlook for YLL assessment in present-day Europe.

I Introduction

Most things that involve children are ‘special’, and language assessment is no exception. In today’s Europe, the need for assessment and documentation of the language ability of its citizens is exploding in the wake of the growth in need for the ability itself, for work, study, pleasure or even survival. This assessment and documentation has been given a significant boost by the Common European Framework of Reference (CEFR) (Council of Europe, 2001) and its offshoots, such as the European Language Portfolio (ELP) and DIALANG (Alderson and Huhta, this issue), as well as a growing number of other instruments, many of which claim, rightly or wrongly, to measure language ability according to the yardstick of the CEFR. Moreover, the influence of the CEFR is not restricted to the world of the adults for which it was originally intended, but is gradually taking root in school systems across Europe. But how well are Europe’s children really being catered for? Are we attending to the special assessment needs of our young language learners? And do we really understand these needs?

Language Testing 2005 22 (3) 337–354. DOI: 10.1191/0265532205lt312oa. © 2005 Edward Arnold (Publishers) Ltd

Address for correspondence: Angela Hasselgreen, AKSIS, The University of Bergen, Allegaten 27, 5020 Bergen, Norway; email: [email protected]


This article takes a brief look at young language learners (YLLs) as a group, and considers which special demands are placed on the assessment of their language ability. It goes on to consider youngsters in today’s Europe, and attempts to describe their particular situation as language learners with reference to Council of Europe recommendations; it considers what the CEFR and ELP have to offer the YLL, looking at some ways in which they are being applied, and posing some fundamental questions regarding the needs of YLL assessment in Europe. It then focuses on the case of YLLs in Norway, describing how attempts have been made to find a way of assessment which both puts the characteristics of the YLL in focus and, at the same time, exploits what the CEFR and its associates have to offer. Finally, it considers how far the Norwegian cases have managed to shed light on the fundamental questions posed, and draws some conclusions on the outlook for the assessment of YLLs.

II Young language learners

Young language learners (YLLs) are defined here as being school pupils up to around 13 years old, thus incorporating the primary school population in most European countries, and impinging on the lower levels of secondary schooling in many. These pupils might be characterized as needing language input and tasks that take account of their age, and the fact that they generally need short-term motivation. Moreover, they have usually not met with the world of certifying examinations, although internal testing in the school, and even some external testing, may be familiar to them. The need for interesting and motivating activity, which ought of course to be present at all stages, is therefore particularly acute for YLLs, and will often, in fact, in the absence of exam pressure, be decisive in determining what goes on in the YLL classroom.

The characteristics of YLLs, and the implications of these for the assessment of their language ability, are discussed widely in the ‘young learner’ literature, such as in Halliwell (1992), Vale and Feunteun (1995) and Cameron (2001), as well as in the special issue of Language Testing (17, 2), e.g., in Rea-Dickins’s (2000) article. On the basis of this discussion, there appears to be consensus that assessment procedures for YLLs should satisfy the following demands:

• Tasks should be appealing to the age group, interesting and captivating, preferably with elements of game and fun.


• Many types of assessment should be used, with the pupil’s, the parents’ and the teacher’s perspectives involved.

• Both the tasks and the forms of feedback should be designed so that the pupil’s strengths (what he or she can do) are highlighted.

• The pupil should, at least under some circumstances, be given support in carrying out the tasks.

• The teacher should be given access to and support in understanding basic criteria and methods for assessing language ability.

• The activities used in assessment should be good learning activities in themselves.

III Young language learners in Europe

So far, YLLs have been considered with respect to their characteristics as young people, learning languages as part of their regular schooling, but without reference to any particular context. In this section, the focus is turned to the YLL in today’s Europe. Since 45 countries are members of the Council of Europe, it will be assumed here that documents produced by the Council of Europe relating to language education will have relevance for the typical YLL in Europe. Some of these documents will therefore be considered, beginning with the Council of Europe’s (1998) ‘Recommendation no. R (98) 6 of the Committee of Ministers to Member States concerning Modern Languages’.

In Section A, ‘General measures and principles’, of the Appendix to R (98) 6, countries are encouraged to ‘pursue educational policies which enable all Europeans to communicate with speakers of other mother tongues, thereby … facilitating free movement of people’. They are also encouraged to promote widespread plurilingualism, in a variety of ways, including ‘by facilitating lifelong language learning through the provision of appropriate resources’. Moreover, the first three recommendations in Section B on ‘Early language learning’ (up to age 11) are:

• Ensure that, from the very start of schooling, or as early as possible, every pupil is made aware of Europe’s linguistic and cultural diversity.

• For all children, encourage and promote the early learning of modern languages in ways appropriate to national and local situations and wherever circumstances permit.

• Ensure that pupils have systematic continuity of language learning from one educational cycle to another.


These recommendations seem to characterize the young language learner in today’s Europe as living in a mobile society, and being exposed to a number of languages, some of which he or she may be learning or already competent in, and which may not be ‘mainstream’ foreign languages. Moreover, any language learning embarked on will be likely to be followed through as he or she proceeds through schooling, and possibly beyond. It should probably be added, at least in the case of English as a target language, that today’s European youngster is subjected to media/internet exposure that will have an impact on the individual’s language which is quite independent of what is presented at school.

This situation has several implications for the teaching and assessment of the YLL. First, pupils within any language class may be at very different levels due to the mobility of society; some will have personal backgrounds where the target language is used extensively, while others may be disadvantaged by having recently arrived from an area where the language was little used or taught. Secondly, in order to promote early learning, teachers may increasingly have to teach and assess languages without any specialist training. Thirdly, the sheer diversity of languages taught may stretch resources so that it is not possible to provide material for teaching or assessment in individual languages, particularly where minority languages are involved. Fourthly, pupils and their teachers may be simultaneously coping with several foreign or ‘other’ languages. And, fifthly, some kind of cohesion can be expected between the learning/assessment processes used at this stage and those to be used in subsequent stages of education.

How, then, is assessment to adapt to this situation? Since it is no longer possible to assume that a YLL language class will be restricted to a particular, near-beginner level, provision must be made to describe language ability across the full range of levels, up to near-native speaker, taking age into account; these descriptions should also be, in principle, the same as those used at higher educational levels in order to ensure continuity in the assessment. And the fact that the level of each pupil cannot be predicted, or easily catered for by testing, highlights the need for ‘alternative’ means of assessment, such as portfolios, with self-assessment components. Since language-specific assessment material may be unavailable, a common set of materials, usable across languages, is needed. And given that demands are being placed on teachers which may not be matched by their training, both easily usable material and training in assessment are clearly necessary.


The Council of Europe has acknowledged these needs, and gone some way to meeting them, both in terms of its recommendations and the material it has produced. In Section G, ‘Specification of objectives and assessment’, the Appendix to R (98) 6 states:

• Encourage institutions to use the Council of Europe’s Common European Framework of Reference to plan or review language teaching in a coherent and transparent manner in the interests of better international co-ordination and more diversified language learning.

• Encourage the development and use by learners in all educational sectors of a personal document (European Language Portfolio) in which they can record their qualifications and other significant linguistic and cultural experiences in an internationally transparent manner, thus motivating learners and acknowledging their efforts to extend and diversify their language learning at all levels in a lifelong perspective.

Furthermore, in Section H on ‘Teacher training’, there is a recommendation to institutions responsible for initial and in-service training that ‘their courses take account of: the principles and practice of language testing and assessment, including learner self-assessment’.

The CEFR and ELP are designed to provide teachers and learners with a means of describing and documenting language ability across the full range of levels. The CEFR levels are intended to be applicable irrespective of which language is being targeted, what the status of the language is, and how it is learnt. The ELP builds on the CEFR levels to provide a means of documenting progress, largely through self-assessment, within a portfolio that is the possession of the learner, and which can be built on as the learner moves through his or her education and beyond.

Thus, it would appear that the needs of European YLLs as described above are, in principle, catered for in the recommendations of the Council of Europe. The levels scales of the CEFR are increasingly referred to in the language learning and assessment procedures throughout Europe, and are evident in school language curricula, such as those in Denmark and Finland. The ELP is being adapted widely in education systems across Europe. At the time of writing, versions have been validated for primary schools in nine countries. One of these is the portfolio for ESL learners in Irish primary schools described by Little (this issue). In addition, models have been proposed for another two countries, including one for learners as young as 3–7 years old, in Spain. For lower secondary learners, seven have been validated (Council of Europe, n.d.). Furthermore, there is a current focus on facilitating the linking of tests to CEFR levels, e.g., through the publication of the Manual for relating language examinations to the Common European Framework of Reference for Languages, with a preliminary pilot version currently available (Council of Europe, 2003; see also Figueras et al., this issue).

Leaving aside here issues around the theoretical basis for the CEFR levels (recently raised, e.g., by Weir, this issue), the Framework has unquestionably made it possible for people to describe what they can actually do with foreign languages – whatever the language, the level or how it has been acquired – in a way that is widely recognized and useful in giving a detailed profile across individual skills. The ELP gives a means of documenting this ability across time, in a variety of ways and from different perspectives, with self-assessment in the forefront.

Thus, it would seem that young European citizens should be well served by the CEFR and ELP. Originally intended for adults, these are now widely in use among younger learners. In the Council of Europe’s ELP: guide for developers, Schneider and Lenz (2001) admit the need for the adaptation of ‘can-do’ statements to the particular learner group, while stating requirements and suggesting methods for those who wish to do this. But neither the suitability of the CEFR/ELP as central instruments for the assessment of YLLs, nor the ease of adapting them, should be taken for granted. In the final part of this article, therefore, certain questions concerning implementing the CEFR/ELP in the YLL context are posed, and these questions are considered from the perspective of Norway, where projects have been established to specifically address them. First, however, in order to illustrate the diversity of approaches to implementing these instruments with YLLs, some cases are considered where this has been tackled in quite different ways.

The French ‘Mon premier portfolio des langues’ (Debyser and Tagliante, 2001) is intended for pupils from 8–11 years, and covers levels A1 and A2, with some advancement towards B1. The ‘I can do’ statements in the portfolio ‘biography’ are adapted from the adult ELP-prototype wording, such as in the original Swiss model, so that more or less the same notions of what can be read, spoken, etc. are maintained, but phrased in a way more suited to children. This is done mainly through language adjustments, e.g., by replacing words such as ‘interlocutor’ with ‘someone’, or removing ‘simple’ before ‘questions’, since children are not expected to be able to differentiate between simple and complex questions.

The Centre for Information on Language Teaching and Research (CILT) in the UK has taken an alternative approach. Instead of adhering to a visible CEFR progression, the CILT primary-school portfolio, My languages portfolio (CILT, 2001), focuses on pupils’ being able to do lots of different things with their foreign and second languages. At the beginning level, it gives children the chance to show, by colouring ‘speech bubbles’, that they can perform a wide range of easily understandable, simple functions, such as:

I can name the colours …

and much more, including a space for the pupil to add his or her own claims. In the section headed ‘getting better!’, pupils have another wide set of ‘Can-do’s to cross off, with concrete notions and clear wording, reflecting the world of children, such as:

I can talk about what has happened or is going to happen.

Again, there is space for pupils to write in ‘other’ ‘Can-do’s, and in this section statements are approximated to levels on the CEFR, mainly at A1 and A2, but with occasional B1 statements.

These two portfolios, when contrasted, seem to highlight the tension that exists, for those designing ELPs for YLLs, between fidelity to the CEFR levels (as described in detail in Council of Europe, 2001) and to the world of children. The French portfolio reflects the former approach, so that the CEFR levels are fairly rigorously represented, with the wording and concepts in the document being generally reminiscent of the adult world (although the annex to the portfolio, where languages mastered at A2+ are compared, offers a selection of more concrete, childlike notions, using selected themes such as ‘talking about my family’). The CILT portfolio, on the other hand, makes a more approximate attempt to define the learner’s level through its statements, but would probably make more sense to children.

Testing to the CEFR is also available for children, e.g., through the Cambridge ESOL tests for YLLs, designed to assess the English of learners between the ages of 7 and 12. Tests can be taken at three levels: Starters, Movers and, the highest level, Flyers, estimated at around level A2 (Cambridge ESOL, n.d.).

These three examples show widely differing ways in which the CEFR has been implemented in YLL assessment material. Yet they share a common feature in that they all virtually exclusively cater for the range A1–A2, with little provision made for learners beyond this level.

An attempt to assess a wider range of levels is exemplified by an innovative scheme of voluntary testing, currently under trialling in the UK, whereby learners, including YLLs, are able to take a test in a wide range of languages, linked to the CEFR. The ‘Languages Ladder’, introduced as part of the Department for Education and Skills’ National Languages Strategy, allows a learner to take an externally-rated test at the appropriate level in one or more of the skills of listening, speaking, reading and writing. Furthermore, each stage or level of the framework (A1 to C2) is further divided into a series of grades or ‘steps’ on the ladder, with the possibility for assessment to be carried out by teachers at these steps (Department for Education and Skills, 2004).

IV Assessing YLLs to the CEFR/ELP: the Norwegian experience

In the preceding sections, the assessment of YLLs, both generally and in the European context, has been considered. The CEFR and ELP have emerged as being of great significance to this process, but the ways in which these instruments have been implemented vary widely, and give rise to certain questions:

1) How can the levels and ‘Can-do’ statements in the CEFR/ELP be adapted for the assessment of YLLs in order to preserve the integrity of the CEFR levels, and yet take on board the particular characteristics of children and young teenagers?

2) How can we ensure that our assessment material covers learners at all the potential CEFR levels of YLL ability?

3) Are the descriptors of the type used in the CEFR/ELP sufficient for describing the ability of YLLs?

4) Is it feasible to make tests to the CEFR, and to have some confidence that the test results place the learner correctly?

5) Are the teachers of YLLs equipped to implement the CEFR/ELP in their assessment procedures?

In this section, two recent developments in assessment from the Norwegian school context are described which, between them, show how attempts were made to tackle all the questions raised here. They are: (1) portfolio assessment linked to the ELP and (2) testing of English (partially computer-adaptive) to be carried out nationally, linked to the CEFR.

1 Project 1: the Bergen ‘Can-do’ project: developing portfolioassessment material for lower secondary school pupils,compatible with the ELP

This project arose at the University of Bergen in the wake of an earlier one in which testing material had been developed for lower secondary and primary school English (Hasselgreen, 2000). For speaking and writing, pupils’ ability was described in ‘real terms’, according to understandable criteria, whereas in the receptive skills, scores were given that were only interpretable relative to the performance of the whole group.

In order to supplement the information from the testing, particularly in the case of the primary school, additional material was incorporated in the form of self-assessment forms, profiling forms and observation forms, intended for use both in the testing context and in ‘everyday activities’. This additional material was innovative in the Norwegian context, and was so well received by both teachers and pupils that it was decided to embark on a new project, focusing on developing material for ‘alternative’, ongoing assessment, catering initially for lower secondary school pupils, and specifically ensuring that the material should also be usable with languages other than English. Moreover, it was intended that pupils would be able to receive concrete feedback on their ability in the receptive skills as well as the productive.

It was with these aims in mind that the Bergen ‘Can-do’ (BCd) Project was launched. Beginning as a local, county-based project, administered from the University of Bergen, it soon developed into a Nordic/Baltic project, which was part of the ECML (European Centre for Modern Languages, Graz) medium-term programme of activities, 2000–03. Although the project was concerned with lower secondary school pupils, mainly in the 13–15-year age group, the approach followed is relevant to the discussion of YLL assessment as it concerns adapting and supplementing ELP material for learners younger than those the ELP was originally designed for. The ‘product’ developed in the project is briefly outlined and exemplified in Appendices 1–4. However, the focus in this section is the process that was followed, as this is considered most relevant in showing how the first three of the five questions outlined above were tackled.


It was clear that many of the aims listed were shared by those engaged in the development of the ELP. The ELP builds on the CEFR, specifically the CEFR scales whereby language ability in a range of subskills is described at six levels. Fundamental to the ELP are the sets of ‘I can do’ statements, which form the backbone of the working material, giving pupils concrete tasks to aim for and achieve as part of their progress from level to level. In order for the material to be meaningful to the user group, it is essential that these ‘Can-do’ statements match the reality of what pupils actually do with their language. A principal aim of the project was to research this actual language use, and to design ELP-compatible ‘Can-do’ statements that seemed to match the real world of young teenagers, while at the same time preserving the essence of the ability described in the CEFR levels. Moreover, it was the intention to supplement the ELP-type material with the types of self- and teacher-assessment material developed in the forerunning testing project, relating to the language ‘quality’ in any individual performance. While ‘Can-do’s and levels are able to show ‘milestones’ (large and small) in a pupil’s achievement, it was still considered important to have a means of describing and assessing everyday performance, in linguistic rather than functional terms, in any skill.

The project was thus faced with two distinct tasks: adapting the basic ELP to the context of language learning in the lower secondary school, and supplementing it to fully meet the aims of the project.

Adapting the ELP involved considering:

• levels on the CEFR scales: which of the six levels should be included/subdivided, and how might these be reworded?

• ‘Can-do’s: how can these be made appropriate to this age group, taking into account the conditions under which they might use foreign/second languages?

Supplementing the ELP involved deciding:

• how to ensure really continual assessment (on a day-to-day basis);

• how to include assessment of linguistic aspects (as well as the functional/communicative ones, which are in focus in the ELP).

The work in adapting the ELP scales and ‘Can-do’s to the lower secondary school contexts in the BCd project has followed a basic set of procedures for each of the ‘skills’: spoken interaction, reading and writing. The procedure for the skill of reading is outlined here, and was similar in principle to that of the other skills. First, the 16 teachers in the BCd project were asked a series of questions relating to the reading ability of their pupils. The aim was to establish which levels represented the range of ability of most pupils, which could conceivably be omitted, and how the descriptors might be worded. Preliminary level descriptors, covering the levels from A1 to C1, with wording suited to the age group, were produced on the basis of this procedure. Next, a survey was carried out among pupils to establish what they actually read in the foreign language in question; pupils were also asked to place themselves roughly on the draft levels, which here included mid-way levels. The data from these surveys was then incorporated into a revised set of draft levels and ‘Can-do’ statements representing the full range of levels A1 to C1, but not labelled as such. Pupils were next asked to assess which of the statements applied to them and, again, which level they felt they belonged to. The data from 259 pupils across four Nordic/Baltic countries was analysed, and each ‘Can-do’ statement was considered in terms of how well it apparently fitted the level it was intended for, and whether it should be adjusted or dropped. Here are two examples of the resulting ‘Can-do’ statements:

A2: I can understand the main ideas in short, simple texts about familiar things, if they use mainly common or easy-to-guess words, or words that are ‘typical’ for the theme (e.g., menus, short emails, easy-to-read stories).

C1: I can read and understand all the texts and books I need or want to, even if the language is rather ‘special’ and the theme is rather new to me.

Additionally, a procedure of drafting, trialling and adjusting was followed to produce a ‘non-ELP’ type of material, intended for regular classroom use; this consisted of self-assessment forms and a reading record form, as well as a variety of assessment material for the teacher to use, in the shape of profiling and observation forms. While the material was initially trialled almost entirely with pupils of English, the material was worked on further at a pan-European workshop in Graz in 2002 in order to ensure that it was compatible across a wide range of target languages.

As a result of these procedures, sets of material for the portfolio assessment of spoken interaction, reading and writing were developed, both in paper and electronic versions. The material has the following principal components for pupils’ portfolios:

• language-learner background;
• scales of levels;
• ‘Can-do’ checklists;
• self-assessment form;
• reading-record form.

For the entire process, with the complete set of material on CD, see Hasselgreen (2003).

2 Project 2: national testing of English (partially computer-adaptive) to the Common European Framework

The national testing described here is part of a wider testing programme for English, mother tongue reading and writing, and Mathematics, being put in place gradually from spring 2004 by the Norwegian Ministry of Education for pupils at 4th, 7th, 10th and 11th grades in the national school system. The 4th and 7th grades are in the middle and at the end of primary school respectively, with age groups of 9–10 and 12–13 years. The aim of the testing is primarily to report (to schools, parents, and local and central authorities) on the ability of pupils at these stages, although it has a pedagogical purpose insofar as pupils are to receive profiles, couched in concrete terms rather than single ‘scores’, in the subjects tested. Thus, awareness is raised regarding what individuals and groups are able to do, and what they need to acquire in order to progress. The National Testing project at the University of Bergen has been given responsibility for the designing and trialling of the tests of English, together with the University’s Intermedia section, who are responsible for computerizing and delivering the tests. The first phase of the project involves reading and writing skills, and it is the development of the reading test which has advanced furthest at the time of writing, and which is thus principally focused on in this section.

The testing of English has two features that combine to make it innovative among national testing programmes. First, the results are to be expressed as levels (or sublevels) on the CEFR scale. Secondly, the tests of receptive skills are to be computer-adaptive, at ‘branch’ (as opposed to item) level. In order to test to the CEFR, all test items are designed to correspond to ‘Can-do’ statements at the level they are intended to target. Thus the specifications for both the reading and the writing test are largely based on sets of ‘Can-do’ statements defined for the age level concerned, taken or adapted from the BCd material. In order to give results as levels, two very different approaches have been used.
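
To make the item-to-descriptor mapping concrete, here is a minimal sketch of what a specification entry of this kind might look like. It is illustrative only: the field names and the sample entry are assumptions, not the project’s actual specification format; the article states only that each item is tied to a ‘Can-do’ statement at an intended CEFR level.

```python
from dataclasses import dataclass

@dataclass
class ItemSpec:
    """One test-item specification: each item targets a single
    'Can-do' statement at the CEFR level it is intended to measure.
    Field names are hypothetical, for illustration only."""
    item_id: str
    skill: str          # "reading" or "writing"
    target_level: str   # intended CEFR level, e.g. "A2"
    can_do: str         # the 'Can-do' statement the item operationalizes

# Example entry, reusing the A2 reading descriptor quoted earlier in the article.
example = ItemSpec(
    item_id="R-0042",
    skill="reading",
    target_level="A2",
    can_do="I can understand the main ideas in short, simple texts about familiar things.",
)
```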


In the case of writing, assessment criteria are defined at each CEFR level, taking into account both the kind of functions (‘Can-do’s) pupils should be able to perform and the linguistic quality of the writing, based on various descriptions within the CEFR. Teachers themselves use these criteria to carry out the rating, after intensive training, with ‘spot checks’ being carried out by external raters. The success of this clearly depends on how consistently and accurately the rating is done, and is the subject of further investigation.

In the case of reading, the placing on levels is done ‘by the computer’. Pupils enter the test at a pre-test, made up largely of items around the most normal levels for the age group (based on teachers’ and other experts’ judgements), and are then placed as a result of this on one of three ‘main test’ branches at differing levels, between them covering a wide spread of levels. The final score from the whole test is given instantly to the pupil as a level, or in-between level, from the lowest (A1) to the highest level (with a ceiling at B2+ for primary and C1 for lower secondary pupils). Material is provided for teachers as well as for parents and pupils, to explain what this means in terms of what he or she ‘can (potentially) do’ in English reading. The same pupil, when tested three years later, will be given the result on the same scale, and will thus be able to see how he or she has progressed up the levels in the intervening period.

Two principal challenges have occupied the project in the year since it was launched. The first has involved coping with the restrictions and exploiting the advantages of computers. The second has involved the fundamental task of ensuring that the placing of pupils on CEFR levels was justified.

The challenges and opportunities associated with the computer had to be considered from the outset. Testing reading vocabulary through the dragging of labels to objects, and testing the ability to follow written instructions by actually getting pupils to do things with movable objects, readily suggested themselves as test methods. It soon became apparent that pupils were able to click on texts that matched pictures and vice versa; to fill gaps with link words through drag-and-drop so that they actually occupied the appropriate space; and to move pieces of text into position. However, the problems were apparent too. The amount of text that can be shown on screen is limited, and raises such questions as whether to scroll or ‘page’, or to stick to short texts. The possibility of asking completely open-ended questions is lost, as the computer cannot be expected to cope with all spelling combinations, as a competent rater could. Ways have had to be found to compensate for this, e.g., by getting pupils to highlight the part of the text that shows the answer. Another issue to be faced was to ensure that pupils were not being tested on their computer skills, but on language skills alone. This has involved bringing pupils into the computer labs for numerous rounds of trialling of techniques and formats, in order to ensure that they take to these easily, with the minimum of computer expertise. Fortunately, the degree to which the children are at ease with this aspect of the testing has exceeded all expectations. However, many questions regarding the effect of computerized testing will only be answered through research in the aftermath of the trialling.

The approach to testing to the CEFR levels involved many stages, and the services of a psychometrician as a consultant. First of all, a survey was carried out among local teachers to establish which CEFR levels were felt to be most representative of the age groups in question. Next, a test design was produced by the psychometrician, indicating how many items at each level, for each age group, needed to be made, as well as the design for their piloting. Once this was done, the item writing began. Each item was based on a specific ‘Can-do’ statement (taken from the original BCd project set of ‘Can-do’s, with some adjustments/additions for each age group), and so was intended to target a particular level. Once the 1050-plus items were ready, these were trialled on the basis of an incomplete linked test design in such a way that each item was answered by at least 220 pupils. The total sample size for piloting consisted of over 11 000 pupils.

The one-parameter logistic model (OPLM), which is an extension of the Rasch model, was used for the calibration. After discarding the misfitting items, we ended up with 696 items fitting the OPLM. Based on these items, 17 tests were produced in order to be able to apply two-stage adaptive testing at test level for each grade (4th, 7th and 10th). For each grade there were three parallel (in terms of difficulty) versions of tests for the first stage (for grade 7, only two versions were constructed), each consisting of about 20 items. The reliability of all first-stage tests was above 0.86, and each pupil was assigned to one of them randomly.
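
For orientation (the article itself does not reproduce the model), the OPLM’s item response function is standardly written as

\[
P(X_{pi} = 1 \mid \theta_p) = \frac{\exp\bigl(a_i(\theta_p - b_i)\bigr)}{1 + \exp\bigl(a_i(\theta_p - b_i)\bigr)},
\]

where \(\theta_p\) is pupil \(p\)’s ability, \(b_i\) is the difficulty of item \(i\) (estimated during calibration), and \(a_i\) is a discrimination index which, unlike in the two-parameter logistic model, is not estimated but fixed in advance as a known integer. Setting \(a_i = 1\) for every item recovers the Rasch model; ‘misfitting’ items are those whose observed response patterns depart significantly from this function.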

The number of versions of tests for the second stage of the adaptive testing was three per grade, and these differed in their difficulty level. Depending on the results of the first-stage testing, each pupil was directed to an easier or more difficult version of the second-stage test. The length of the second-stage versions varied between 35 and 38 items, and the reliability of each of them was at least 0.95.
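
The two-stage logic just described can be summarized in a short sketch. The cut points and test labels below are hypothetical (the article does not publish the actual routing rules), but the flow follows the description above: random assignment to a parallel first-stage test, then routing to an easier or harder second-stage version on the basis of the first-stage score.

```python
import random

# Parallel ~20-item first-stage tests (two for grade 7, three for other grades).
FIRST_STAGE_TESTS = ["first_stage_A", "first_stage_B", "first_stage_C"]

# Three second-stage versions per grade, differing in difficulty (35-38 items each).
SECOND_STAGE_TESTS = {"easy": "second_stage_easy",
                      "medium": "second_stage_medium",
                      "hard": "second_stage_hard"}

def assign_first_stage() -> str:
    """Each pupil is randomly assigned to one of the parallel first-stage tests."""
    return random.choice(FIRST_STAGE_TESTS)

def route_second_stage(score: int, n_items: int = 20) -> str:
    """Direct the pupil to an easier or harder second-stage version
    depending on the first-stage result. The cut points are hypothetical."""
    proportion = score / n_items
    if proportion < 0.4:
        return SECOND_STAGE_TESTS["easy"]
    if proportion < 0.75:
        return SECOND_STAGE_TESTS["medium"]
    return SECOND_STAGE_TESTS["hard"]

# Example: a pupil scoring 13/20 on the first stage sits the medium version.
print(assign_first_stage(), "->", route_second_stage(13))
```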

The standard-setting procedure which followed was largely in line with that outlined in the Council of Europe’s (2003) Manual and used in DIALANG (Alderson and Huhta, this issue), and involved both expert judgement and statistical analysis of the items. The expert judgement was carried out with 12 judges, who were familiar with the CEFR and/or experienced teachers. Unfortunately, this proved to be unhelpful, with little consistency between the experts’ judgements and the way items actually performed. This may have been due to too little training, but the cause may also have lain in the items themselves, where, for instance, a task was set at a level unlike that associated with the text itself. As a result, instead of a modified Angoff procedure, Hofstee’s standard-setting method was applied (based on the judges’ estimations of the frequency distribution of test scores). The comparison of these estimates with the results of actual testing reveals that, as a whole, the established cut-off scores are quite adequate. Still more work is, however, needed to validate the standard setting, as well as to provide appropriate training ensuring better estimation of item difficulty.
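
Since Hofstee’s method may be less familiar than the Angoff procedure, here is a brief sketch of how it is usually operationalized; the judge parameters and score data below are invented, and the article does not report the project’s actual values. Each judge supplies the lowest and highest acceptable cut-off scores (k_min, k_max) and the lowest and highest acceptable failure rates (f_min, f_max); the cut-off is taken where the straight line through (k_min, f_max) and (k_max, f_min) meets the observed fail-rate curve of the score distribution.

```python
import numpy as np

def hofstee_cutoff(scores, k_min, k_max, f_min, f_max):
    """Hofstee compromise method (sketch).
    scores: observed test scores from the trial population.
    k_min/k_max: judges' mean lowest/highest acceptable cut-off scores.
    f_min/f_max: judges' mean lowest/highest acceptable failure rates."""
    scores = np.asarray(scores)
    cuts = np.arange(k_min, k_max + 1)
    # Observed proportion of pupils failing at each candidate cut-off.
    observed = np.array([(scores < c).mean() for c in cuts])
    # Judges' line: failure rate falls linearly from f_max at k_min to f_min at k_max.
    judged = f_max + (cuts - k_min) * (f_min - f_max) / (k_max - k_min)
    # The cut-off is where the two curves cross (closest point on this grid).
    return cuts[np.argmin(np.abs(observed - judged))]

# Invented example: scores out of 38 items; judges tolerate cut-offs of 15-25
# and failure rates between 10% and 30%.
rng = np.random.default_rng(0)
scores = rng.binomial(n=38, p=0.6, size=1000)
print(hofstee_cutoff(scores, k_min=15, k_max=25, f_min=0.10, f_max=0.30))
```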

Trial testing is being carried out at the time of writing, with all 10th graders, but with only a small number (to be decided) of 4th and 7th graders. Early evidence suggests that the state of computers in the schools has been generally adequate to carry out the testing. A statistical analysis of the data from the tests will reveal more about the way they performed, and teething problems within the project procedures are being noted and worked on. The full testing round, with all grades, is scheduled for spring 2005.

V Conclusions

The accounts of the projects carried out in Norway have, hopefully, shed some light on the questions presented at the beginning of Section IV.

The account of the Bergen ‘Can-do’ project has addressed questions 1, 2 and 3. It has demonstrated one way in which the levels and ‘Can-do’ statements in the CEFR/ELP can be adapted for the assessment of YLLs in order to preserve the integrity of the CEFR levels, and yet take into account the particular characteristics of children and young teenagers. Moreover, learners at all the potential CEFR levels of ability in the learner group in question (13–15-year-olds) were catered for, with a ‘ceiling’ (on grounds of maturity) at C1, initially based on teachers’ estimates and subsequently borne out by pupils’ responses. To the question of whether descriptors of the type used in the CEFR/ELP are sufficient in themselves for all description of the ability of YLLs, the answer was ‘no’. Teachers have expressed a clear need for material that captures the everyday ‘here and now’ classroom performance, in terms of language quality, in addition to the more function-focused ‘Can-do’s, which show long-term rather than immediate progress. There was, moreover, a feeling among the project participants that ‘alternative’ assessment on its own was not sufficiently reliable, and that testing from time to time should occur as a back-up to judgements about a pupil’s level on the CEFR.

The work carried out so far in the National Testing project has given considerable insight into what is involved in the ambitious task of testing to the CEFR, involving basing items on ‘Can-do’ statements and finding means of reliably expressing results as CEFR levels. By catering for a wide spread of levels in each age group, pupils are given the opportunity to demonstrate their level, whether or not this is in line with the norm for the group. It has thus gone some way to addressing questions 2 and 4. The challenge of ensuring that assessment material covers learners at all the potential CEFR levels of YLL ability is particularly acute when it comes to testing receptive skills. In the case of the National Testing project, the challenge has been largely met through computer-adaptive testing, but this is only feasible for large-scale projects with considerable resources. Additionally, ensuring that the tests place the learner correctly on the CEFR is fraught with difficulty, again requiring considerable resources, not least regarding the number of items to be trialled, and in making sure that ‘experts’ really are expert.

There is surely a need, in the not too distant future, for a joint European project for computer-adaptive testing, at least of receptive skills – drawing on existing items such as those developed in Norway – to provide a good way of testing younger learners to the CEFR. However, this necessitates further research into the practicality and validity of a computer-adaptive test in the skill involved.

The remaining question on our list, question 5, involves the extent to which teachers of YLLs are equipped to implement the CEFR/ELP in their assessment procedures. This is perhaps one of the most important questions, but it is too often neglected. In the case of Norway, every teacher of English in the grades being tested is being given a one-day course in assessment to the CEFR framework. This is principally because the teachers themselves will be grading the writing tests, with a sample being double-rated. However, it gives the opportunity to introduce teachers to CEFR thinking, and ensures that every school in the country will have at least one person with some training in this area.

Still, it is undoubtedly the case that teachers lack training in language assessment. A survey recently carried out by the European Association for Language Testing and Assessment (EALTA) (Hasselgreen et al., 2004) has revealed that there is an acute need across the board among language teachers for training in assessment. This is coupled with the fact that primary school teachers tend to be the ‘poor relation’ when it comes to any specialist training in language teaching. In the case of Norway, a recent report showed that in the lower half of the primary school (6–9-year-olds), where English has been taught for less than a decade, 65% of those who teach English have no formal competence in it (in other words, they have not studied English from any perspective since leaving school themselves). This figure drops to just over 50% in the upper primary school (10–13-year-olds) and plunges to 20% in the lower secondary school (Norwegian Ministry of Education, 2003). It can be concluded that, in the case of YLLs in Europe, there is a long way to go before the Council of Europe’s recommendation to institutions responsible for initial and in-service training that ‘their courses take account of the principles and practice of language testing and assessment, including learner self-assessment’ can be implemented. This is one challenge that must not be ignored.

VI References

Cambridge ESOL n.d.: Cambridge young learners English tests. Cambridge: University of Cambridge ESOL Examinations. Available at: http://www.cambridgeesol.org/exams/yle.htm (March 2005).

Cameron, L. 2001: Teaching language to young learners. Cambridge: Cambridge University Press.

CILT 2001: My languages portfolio. London: Centre for Information on Language Teaching and Research.

Council of Europe 1998: Recommendation no. R (98) 6 of the Committee of Ministers to member states, concerning modern languages. Available at: http://cm.coe.int/ta/rec/1998/98r6.htm (March 2005).

—— 2001: Common European Framework of Reference for Languages: learning, teaching, assessment. Cambridge: Cambridge University Press.

—— 2003: Relating language examinations to the Common European Framework of Reference for Languages: learning, teaching, assessment (CEF). Manual: preliminary pilot version. DGIV/EDU/LANG 2003, 5. Strasbourg: Language Policy Division.

—— n.d.: European language portfolio. Strasbourg: Language Policy Division. Available at: http://www.coe.int/portfolio (March 2005).

Debyser, D. and Tagliante, C. 2001: Mon premier portfolio. Saint Amand: Didier.

Department for Education and Skills, UK 2004: The languages ladder: steps for success. Available at: http://www.dfes.gov.uk/languages/DSP_languagesladder.cfm (March 2005).

Halliwell, S. 1992: Teaching English in the primary classroom. London: Longman.

Hasselgreen, A. 2000: Assessment of English ability in Norwegian schools. Language Testing 17, 261–77.

—— 2003: Bergen ‘Can Do’ project. Strasbourg: Council of Europe. Available at: http://www.ecml.at/documents/pub221E2003_Hasselgreen.pdf (March 2005).

Hasselgreen, A., Carlsen, C. and Helness, H. 2004: European survey of language testing and assessment needs. Report: part one: general findings. Available at: www.ealta.eu.org/resources.htm (March 2005).

Norwegian Ministry of Education 2003: NOU 16: I første rekke. Oslo.

Rea-Dickins, P. 2000: Assessment in early years language learning contexts. Language Testing 17, 115–22.

Schneider, G. and Lenz, P. 2001: ELP: guide for developers. Strasbourg: Language Policy Division, Council of Europe.

Vale, D. and Feunteun, A. 1995: Teaching children English. Cambridge: Cambridge University Press.
