Assessment and Evaluation - Chapters from Baxter's Evaluating Your Students


Upload: yamith-j-fandino

Post on 18-Apr-2015


A set of articles from Baxter's 1997 book.

Evaluating your Students

Andy Baxter

FOR EDUCATIONAL PURPOSES ONLY
Richmond Publishing
19 Berghem Mews
Blythe Road
London W14 0HN

© Andy Baxter 1997
Published by Richmond Publishing ®
First published 1997

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form, electronic, mechanical, photocopying, recording or otherwise, without the prior permission in writing of the publishers.

However, the publisher grants permission for the photocopying of those pages marked 'photocopiable', for individual use or for use in classes taught by the purchaser only. Under no circumstances may any part of this book be photocopied for resale.

ISBN: 84-294-5067-X
Depósito legal: M-45897-2002
Printed in Spain by Palgraphic, S.A.

Design: Jonathan Barnard
Layout: Gecko Limited
Cover Design: Geoff Sida, Ship Design
Illustrations: Gecko Limited & John Plumb

Dedication
To my father - a great educator.

PART A Assessment, testing, evaluation

CHAPTER 1

Why do we assess students' learning?

There are many groups who have an interest in assessing a student's abilities: teachers, heads of departments, parents, governments and, of course, the students themselves. However, we all share the same four main reasons for assessment:

... to compare students with each other

... to see if students meet a particular standard

... to help the student's learning

... to check if the teaching programme is doing its job.

TASK

Write a list of the types of tests (not just foreign languages) given in your school. Why are they given? Which group is each one primarily aimed at? Who are the results for?

students    teachers    heads of departments    parents    governments    others

To compare students with each other

If your students want to enter a university to study a popular subject, the university has to select which students it takes. It decides on a comparative basis, e.g. it wants the top 20% of candidates. But there is a problem: consistency. A good year of candidates may be compared with a weak year: this year's top 20% may not be as good as last year's top 20%. However, it is still the top 20% that get through the exam. This approach has been called 'rationing the carrots': however well all the candidates perform, only the top 20% get through.

Although this system may appear unfair, it is still often used by governments and parents to judge the quality of a school.

To see if students meet a particular standard

Large organisations, like the state, or international examining boards, have certain standards of proficiency that students must meet. These standards do not necessarily reflect the teaching programme that the students have followed: different schools may use different books or SYLLABUSES. So these large organisations have to set their own standards or criteria, and see if the student can perform at this level.

Other smaller organisations, like individual schools, can also set a particular standard based on their own individually-agreed criteria.

More frequently, though, schools will base their assessment on their own teaching programme. They analyse what the students cover in class, and then assess whether the students have learned it, often by giving an ACHIEVEMENT TEST.
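The contrast between the two approaches above, comparing students with each other and checking them against a fixed standard, can be made concrete with a short sketch. It is only an illustration: the candidate names, the scores, the 20% quota and the pass mark of 60 are all invented for the example and do not come from any real examination.

```python
import math

def norm_referenced(scores, top_fraction=0.2):
    """Select the top fraction of candidates, however well they performed."""
    quota = math.ceil(len(scores) * top_fraction)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:quota])

def criterion_referenced(scores, pass_mark=60):
    """Select every candidate who reaches the fixed standard."""
    return {name for name, score in scores.items() if score >= pass_mark}

# Hypothetical results for a strong year and a weak year of candidates.
strong_year = {"Ana": 92, "Ben": 88, "Carla": 85, "Dev": 81, "Eva": 79}
weak_year = {"Finn": 58, "Gita": 52, "Hugo": 49, "Ines": 45, "Jon": 40}

for label, year in (("strong year", strong_year), ("weak year", weak_year)):
    print(label,
          "| top 20%:", sorted(norm_referenced(year)),
          "| meet standard:", sorted(criterion_referenced(year)))
```

With these invented figures the quota admits exactly one candidate in each year, whereas the fixed pass mark admits the whole strong year and nobody from the weak year.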

Baxter, A. (1997). Evaluating your students. London: Richmond Publishing.

Testers differ over what an ACHIEVEMENT TEST should actually cover. It could test either:

... the overall objectives of the SYLLABUS (e.g. in English, the ability to express past time, or the ability to write in a variety of styles), or

... the individual items on the SYLLABUS (e.g. in English, the past simple, or writing advertisements).

Another reason for assessment is initial placement. We can analyse the students' abilities in order to see where they fit into the system. For example, if the school has restrictions on space in classes, they may be placed according to what percentage they get (e.g. the top 10% go into the top class). Alternatively, there may be certain criteria the students are expected to meet. If one class concentrates on writing while another specialises in grammar revision, the students' class will be determined by their success according to these criteria.

To help the student's learning

Whether we assess proficiency or achievement, we can analyse the student's abilities in a diagnostic way. Instead of using the assessment to grade the student, we use it to see where the student needs more help. For example, the student gets an excellent grade in writing an advertisement, but makes many errors in the grammar section, especially in the present simple third person -s. We may then decide to give him/her additional help and teaching in this area.

To check if the teaching programme is doing its job

But suppose all the students get excellent grades in writing advertisements, but all make many errors in the present simple third person -s. We may then decide to alter the whole teaching programme to give all the students additional help and teaching in this area.

On a larger scale, if teachers and inspectors identify a common problem across all schools, a government may decide to alter the whole of its education programme.

Summary

There are, as we shall see in this book, many ways of assessing students. But probably the most common method of assessment is a test.

• PROFICIENCY TESTS examine a general standard in ability, regardless of the teaching programme.

• ACHIEVEMENT TESTS examine whether students can do what they have been taught, either by testing specific SYLLABUS items or general objectives.

• PLACEMENT TESTS are a mixture of the above two, depending on what criteria we use to place the student.

• DIAGNOSTIC TESTS use PROFICIENCY or ACHIEVEMENT TESTS to analyse strengths and weaknesses in the student or the teaching programme itself.

TASK

Think of two different tests that you know well: a language test or other test that is used in your school, and one of another subject or ability (like driving).

Is the test based on the teaching programme or not?

Who sets the test's standards/criteria?

How are the results used? To compare students? To assess the teaching programme? For other reasons?


CHAPTER 2

What's the difference between testing, teaching and evaluation?

Every time we ask students to answer a question to which we already know the answer, we are giving them a kind of test. Much of what we do in class is, in fact, testing students' knowledge. Here are some examples.

He goes to the cinema. They ...?
Find a word in the text that means 'angry'.
On the tape, where does John tell Susan he wants to visit?
What is the main idea of paragraph three?
Dictation: write down the following...
That's that part of the lesson finished. What do you think we're going to do next?

Testing and teaching

What is testing?

Testing has, traditionally, measured the results of student performance.

• We choose some representative samples of language.
• We measure whether a student can use these samples.
• We then try to quantify this by turning it into a mark or grade.
• We keep a record of these marks and use this to give an end assessment.

Over time, all testing theory (whether languages or shampoo development) has traditionally been based on a semi-scientific procedure, namely:

1 Measure the performance.

2 Do something to affect the performance.

3 Measure the performance again and compare the difference.

Turning performance into numbers

Applying this traditional testing procedure or model to language learners has meant that the language learner is treated as a kind of plant. We measure the plant, apply the new fertiliser, and then measure the plant again to see what effect the fertiliser has had. As language teachers, we apply a (PLACEMENT) test, teach, and then give an ACHIEVEMENT TEST to see how much better the students are.

In other words, testing is generally concerned with ENUMERATION, that is, turning performance into numbers.

           Plants                    Language learners

Stage 1    measure plant             test the present simple

Stage 2    add fertiliser            teach the present simple

Stage 3    measure plant again;      test the present simple again;
           compare the difference    compare the difference
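The three stages in the table amount to a simple piece of arithmetic, which can be sketched as follows. The student names and the marks out of 20 are invented for the illustration, and the helper name `gains` is an assumption of this sketch rather than a term from the book.

```python
def gains(before, after):
    """Return each student's change in mark between the two tests."""
    return {name: after[name] - before[name] for name in before}

# Stage 1: test the present simple (hypothetical marks out of 20).
pre_test = {"Miguel": 8, "Susan": 12, "John": 15}

# Stage 2 (teaching the present simple) happens in the classroom, not in the code.

# Stage 3: test the present simple again and compare the difference.
post_test = {"Miguel": 14, "Susan": 13, "John": 15}

for name, change in gains(pre_test, post_test).items():
    print(f"{name}: {change:+d}")
```

The figures record how far each mark moved between the two tests, but not why it moved.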

Testing activities and teaching activities

Teaching and testing go hand-in-hand. We often ask questions to check that the students have understood what we have said. Equally, we sometimes ask a question to find out whether we need to teach a point. We instinctively know why we ask a question: whether it is to teach or to test something.

Compare the following two exercises.

Exercise 1

Fill the gap with an appropriate form of the verb.

a John ________ France every year since 1993. (visit)

b John ________ France last year. (visit)

Exercise 2

In groups, discuss the differences between the two sentences.

a John has visited France every year since 1993.

b John visited France last year.

Exercise 1 assumes that the students have some knowledge and asks them to prove it. It is clearly a testing activity. Note that if the students get the right answer, we don't know why they wrote that answer. It may be a guess, or it might just sound right.

Exercise 2 asks the students a question about the language. In other words, it is asking them to formulate a rule they can use in other situations - a generalisable theory. It is also trying to increase their awareness of how the language works. It is trying to help them learn: it is a teaching activity. On the other hand, some teachers would say that people don't need to know why it is right, they just need to get it right.

Let's compare two more exercises.

Exercise 3

Composition: A Summer's Day at the Beach (150 words)

Exercise 4

Read the following two compositions entitled 'A Summer's Day at the Beach'.

Which do you prefer and why?

Underline all the words and ideas relating to summer. Underline all the words and ideas relating to the beach. Put a tick next to the parts you like in each essay. Put a cross next to the parts you don't like in each essay.

If all the paragraphs got accidentally jumbled up, could you put them back in the right order? What would help you do this? Discuss your ideas with another group.

Homework: write your own composition on the same theme (150 words).

Using the same ideas as we outlined above, Exercise 3 is clearly a test: it wants the student to show us what he/she can do. Exercise 4, on the other hand, clearly tries to make the student more aware of what he/she is trying to do: it tries to increase awareness before giving the task. It tries to help the student to learn.

Teaching or testing?

Sometimes, though, teachers can get confused about whether they are teaching or testing. We can think we are teaching when we are actually testing.

This is particularly true when we try to teach the four skills: reading, writing, speaking and listening. Here language teachers face a major problem. We don't really know enough; that is, there are no clear rules about good listening, reading and other skills. All we have are some rather generalised ideas such as skimming and scanning, and these are not detailed enough to help us work out an effective and progressive teaching programme.

In other words, when faced with a skill that is difficult to teach, such as good listening, we normally answer this problem in one of two ways. Either we give the students lots of opportunities to show what they know so we can see if they're improving. We ask them to read, write or listen to texts of increasing linguistic complexity and hope they keep the same general results or even improve; or we keep the same texts and increase the complexity of the questions.

This is a bit like a doctor saying 'I don't know what caused your illness or why you're getting better, but your temperature is going down'. All we can do to teach the four skills is expose students to language and take their temperature via testing to see if they're getting better.

Or we substitute the skill that is difficult to teach with one that is easy to teach. While the rules for skills are not very clear, we do have some very good rules for grammar and vocabulary, which makes them easier to teach (however, writing a grammar/vocabulary test can be complex, as we shall see later). So we sometimes believe we are teaching or testing a skill, when really we are practising or testing grammar or vocabulary. For example, many speaking tests are disguised grammar revision: they can become an oral test of grammar. They don't test real speaking skills such as interrupting without causing offence at all.

Why is this? Because the semi-scientific plant model of testing which we looked at earlier has some major problems. The next part covers these problems.

Problems with testing

Problem 1: Skills into numbers

On PAGE 9, we saw that testing is based on an idea from science: measure, make changes, measure again and compare.

One problem with the scientific model is that not everything can necessarily be measured in this way. There are some things we can easily test in this way, e.g. the present simple third person -s.

But other skills are more difficult to measure. How, for example, can we quantify a student's ability to make useful contributions to the class?

• First, we would have to define 'useful' and 'contribution' in a way that we could measure them.
• We could define 'useful' as 'successfully explaining something to another student'.
• We could define 'contribution' as 'answering a question put to the whole class by the teacher'.
• We could now count how many times a student successfully answered a teacher's question and the majority of the rest of the class understood.

The problem with this is that we are now measuring how many times a student 'successfully answered a teacher's question and the majority of the rest of the class understood'. This is not necessarily the same thing as making a useful contribution to the class.

So there are two dangers when assessing skills that are difficult to measure.

• We may take something we all understand and re-define it to make it measurable; but, in doing this, we may change the very thing we are trying to measure.
• If something is too difficult to measure, we leave it out of the test - even if the skill is very important.

In the end, we arrive at a position where we are only measuring the easily-measurable, rather than assessing the performance we are trying to improve.

TASK

Listen to your colleagues having (L1) conversations in the staff room. What percentage of their natural spoken language consists of full sentences? What percentage consists of sentence fragments linked by intonational devices and ums and ers?

How often do you teach students to speak in fragmented sentences?

Other problems with testing

Problem 2: Results versus processes, what versus why

Another problem with this semi-scientific system of QUANTITATIVE MEASUREMENT is that it does not record QUALITATIVE DATA. Measuring will tell us if the plant has grown, but not why (or why not). It gives us information about the results, but doesn't tell us anything about the process.

In the example essay (SEE PAGE 10), we would get a much better idea of the student's abilities from Exercise 4, because we could see some of the processes behind the work, e.g. we could look at where the student put the ticks and crosses in the essays, and then see if and how these were reflected in his/her own essay.

Problem 3: Standardisation and odd results

A third problem with the scientific model is that the fertiliser given to the plant must always be the same, or the results cannot be compared. We must remove the variables in order to assess the success of the programme. It is difficult to see how this can work in teaching. In schools, all the teaching would have to be the same, or we couldn't really compare the progress of individual students. This model of testing therefore leads to rather authoritarian teacher-proof methodologies.

The scientific model is also more interested in general trends, and strange individual results are often ignored. For example, imagine that in a listening test all your students get 90%, but your best student only gets 10%. For us as teachers, it is that one odd result that we would want to investigate.
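The listening-test example above can be sketched in a few lines to show the difference between reporting a general trend and picking out the one odd result. Everything here is an assumption made for illustration: the names, the 30-point tolerance and the choice of the class median as the point of comparison.

```python
from statistics import median

def odd_results(results, tolerance=30):
    """Return students whose mark is more than `tolerance` points from the class median."""
    middle = median(results.values())
    return {name: mark for name, mark in results.items()
            if abs(mark - middle) > tolerance}

# Hypothetical listening test: everyone gets 90% except one student.
listening_test = {"Ana": 90, "Ben": 90, "Carla": 90, "Dev": 90, "Eva": 10}

# The general trend: a single class average hides the individual result.
average = sum(listening_test.values()) / len(listening_test)
print(f"class average: {average:.0f}%")

# The odd result a teacher would want to investigate.
print("results to investigate:", odd_results(listening_test))
```

A flag of this kind only says which result is unusual; finding out why still means talking to the student.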

TASK

Choose a coursebook - perhaps the one used in your school - and select at random: three listening exercises, three reading exercises, three speaking exercises. What is the purpose of each exercise? Is it

... testing grammar or vocabulary? (e.g. [example illegible], etc.)

... testing the student's understanding? (e.g. via multiple-choice questions about information in the text; information gaps; etc.)

... teaching the student to read/listen/speak better? (e.g. Does it include advice about how to improve reading or listening, practising interrupting, etc.)

... teaching the student to study? (e.g. Does it teach classroom language? Does it help the student to find answers to their own questions?, etc.)

Testing and evaluation

The relationship between testing and evaluation is similar to the relationship between the CURRICULUM and the SYLLABUS.

The SYLLABUS is a set of items for the teacher to cover in a term. But the SYLLABUS is part of a bigger methodological scheme - the CURRICULUM. A language teaching programme is not only what you cover (the SYLLABUS), but also how you cover it (the classroom procedures), and also why you cover it (the educational approach or rationale behind your SYLLABUS and classroom procedures).

Evaluation

In this book, we will see evaluation as wider than testing. Evaluation sees testing as a useful tool, but also thinks there are other important criteria for assessing someone's performance. We want to assess students' ability to use the present simple, but we also want other information about their (language) learning, e.g.

• Can they use a dictionary?
• Do they actually use the target language in class (e.g. for chatting)?
• Are their notes well organised?
• Do they contribute to groupwork?
• Are they well behaved?

If we compare this to the SYLLABUS/CURRICULUM diagram, we can see a (simplified) similarity:

[Diagram: curriculum; evaluation]

Problems with testing: Can evaluation solve them?

Problem 1: Skills into numbers

Evaluation is not limited to numbers or just giving students marks. Instead of trying to count or measure a student's ability to make useful contributions to the class, we can simply judge whether he/she makes a contribution or not. In other words, you can be subjective as well as objective.

But when we make judgements, we must realise that other people, including teachers and students, may not agree with what we think. Evaluation means that sometimes we will have to justify, negotiate and possibly modify our opinions. We may need more than one judge - we may even need a jury.

Problem 2: Results versus processes, what versus why

In addition to ENUMERATION, evaluation looks for ILLUMINATION: How did you learn that? Why did you write that? We are doing something with the student, rather than doing something to the student. If we had to assess Miguel's performance over the year, would we rather have his essay from Exercise 3 (SEE PAGE 10), or his essay from Exercise 4 with his notes stapled to the back of it? Exercise 3 tells us what, but Exercise 4 tells us what, how, and why.

In addition, by asking these questions, we will learn a lot of extra information:

... what the student thinks he/she is learning

... what the student thinks is easy/difficult

... what the student enjoys/hates doing in class

... where the teaching programme and the student don't meet

... where the teaching programme needs to be re-designed.

In other words, we can use the assessment procedure to develop and improve not only the student, but also the teaching programme, and even the school. By evaluating procedures and attitudes, we gain more information - and more useful information - than by simply looking at test results.

Problem 3: Standardisation and odd results

Evaluation does not want to remove the variables in the assessment process. Evaluation is interested in odd results as it is exactly this kind of result that may illuminate something about the learning process. Equally, it does not want teacher-proof materials and methodologies. Instead, evaluation tries to include as many people as possible, because all information is seen as possibly useful for improving the design of a teaching programme.

Who evaluates?

As we will see in CHAPTER 4, writing a good test is an extremely complex task, and requires not only a lot of time and resources, but also some expertise in statistical analysis. For this reason, it tends to be large organisations such as governments and universities that write big tests, mainly because they need to keep the same NORM-REFERENCED standards year after year (SEE PAGE 31).

With evaluation, however, we are trying to help the student to learn. Evaluation is not just an assessment, but an aid to learning. This means that the more people who are involved in the process, the better the process is.

Summary

As we have seen in this chapter, to teach students skills which are difficult to teach, we either ask them lots of questions (i.e. effectively micro-tests) to see if they're improving or we substitute the skill that is difficult to teach with one that is easy to teach.

To assess students in a skill which is difficult to measure, we either re-define it to make it measurable, but possibly change what we are measuring, or we leave it out of the test and measure only the easily-measurable.

Testing also looks at the general, rather than the individual. Individuals, whether they are teachers or students, are variables that have to be removed from the assessment process. Individuals are turned into QUANTITATIVE DATA like results; and QUALITATIVE DATA, like processes or attitudes, are statistically removed.

TASK

Who can evaluate language learners?

The head of your school has decided to develop a new assessment system for the end of the next school year. He/She has asked you to provide a list of all the people who might have useful information about a student's language learning ability. Make an appropriate list, then consider the following questions.

What information could each group provide?

Given the system as it exists now, who would actually be consulted?

What information would you get using the present system?

What information would be missing, given the present situation?

Which parts of the missing information are the most important to include?

Can you think of any ways of incorporating these important areas into the present system without needing to re-design the whole assessment procedure?

CHAPTER 3

What do we assess?

Before we can assess a student's performance, we need to decide what we are going to assess.

At first sight, this looks like an easy question. As foreign language teachers we evaluate the student's ability in a foreign language. Earlier we gave the examples below as test questions (i.e. the teacher already knows the answers).

TASK

What do you think each of the following questions is actually testing? Think of your answers before you look at the key below.

1 He goes to the cinema. They ...?
2 Find a word in the text that means 'angry'.
3 On the tape, where does John tell Susan he wants to visit?
4 What is the main idea of paragraph three?
5 Dictation: write down the following ...
6 That's that part of the lesson finished. What do you think we're going to do next?

KEY

1 This is testing grammar (using the present simple third person plural).
2 This is testing vocabulary (recognising that furious is a synonym of angry).
3 This is testing the student's ability to listen for detail.
4 This is testing either listening for general meaning or inferring from a text.
5 This is testing general ability (writing, reading, listening, pronunciation, spelling, etc.).
6 This is testing their ability to infer lesson phasing from their previous learning experience.

So we already test the students on a wide range of skills and abilities.

Message and medium

However, questions can have more than one answer. For example:

Teacher: Miguel, where does the President of the United States live?
Miguel (1): He lives in London.
Miguel (2): He live in the White House.

Miguel gives the teacher a problem here. His first answer is grammatically correct but factually wrong. His second answer is grammatically wrong but factually correct. Which answer is better?

The answer to this question is 'It depends why you asked the question'. Language teaching is concerned with both message and medium. If we are testing the third person -s, Answer 1 must be correct. On the other hand, we are also trying to teach students to communicate in a different language. The grammatical mistake that Miguel makes in Answer 2 does not stop communication of the idea.

Language teachers have to balance two different 'correctnesses': the right idea, i.e. the message, and the right form of expression of that idea, i.e. the medium.

Which language abilities do we test?

Language components versus language use

Another common distinction is whether we assess the individual items that we put together to make a sentence, i.e. the components of a language (grammar, vocabulary, pronunciation); or whether we assess how the student puts these components together when they actually use the language (i.e. the four skills of speaking, listening, reading and writing).

Other skills of using language

We need to use language that is socially appropriate (e.g. formal versus informal vocabulary, etc.). We need DISCOURSE SKILLS: making what we say fit what has been said before (e.g. I saw John. He said he was going to the cinema, not I saw John. John said John was going...). We need STRATEGIC SKILLS, too, such as how to take turns in speaking, get information from a text, listen for gist, etc.

Language learning skills

• the ability to use a dictionary
• the ability to work out meanings of unknown words
• learning metalanguage such as asking the teacher What's the past tense of that verb? etc.

General learning skills

• contributing to, and working in, groups in class
• the ability to know what you know and what you still need to learn
• strategies for finding information you don't know
• following the instructions in tests, etc.

Other behavioural or social skills

Many teachers would say that one of the primary skills for any learner is the ability, for at least part of the lesson, to stay sitting in his/her chair working rather than wandering around and disrupting the class.

Which of these abilities should we include in our assessment?

How much should each skill be worth?

And, if they are included, how should we record our assessment?

Other criteria for inclusion: Easy/difficult to mark or record

This takes us on to how easy or difficult these scores are to mark or record.

As we have already seen, there is also a problem about how to mark or record answers.

Assessments that give results as numbers (gap-fills, multiple-choice, etc.) are very easy to record. We can simply write the results (or RAW SCORES) on a piece of paper, or we can convert this number into a percentage, a mark out of twenty or an A-E grade.
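The three conversions just mentioned can be sketched as follows. This is a hedged illustration only: the function names and, in particular, the grade boundaries (80, 65, 50 and 35 per cent) are invented for the example, since each school or examining board sets its own.

```python
def to_percentage(raw, total):
    """Convert a raw score into a percentage of the marks available."""
    return 100 * raw / total

def to_mark_out_of_twenty(raw, total):
    """Scale a raw score to a whole-number mark out of twenty."""
    # Note: Python's round() sends exact halves to the nearest even number.
    return round(20 * raw / total)

def to_letter_grade(raw, total,
                    boundaries=((80, "A"), (65, "B"), (50, "C"), (35, "D"))):
    """Convert a raw score into an A-E grade using hypothetical boundaries."""
    percentage = to_percentage(raw, total)
    for minimum, letter in boundaries:
        if percentage >= minimum:
            return letter
    return "E"

raw_score, marks_available = 34, 40  # e.g. 34 correct answers in a 40-item gap-fill
print(to_percentage(raw_score, marks_available))
print(to_mark_out_of_twenty(raw_score, marks_available))
print(to_letter_grade(raw_score, marks_available))
```

Whichever form is chosen, the same raw score lies behind it; the conversion changes only how the result is reported.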

Similarly, there are ways, as we shall see later (in CHAPTER 9), of marking writing, although these are much more complex as we are not counting correct results, but judging the quality of a piece of writing. We shall also see that the same systems can be used for other abilities, like speaking and behaviour.

However, when we want to assess, e.g. the student's contributions to the class, we have a bigger problem. This will almost certainly mean we will have to write notes. Notes are more difficult to record: different teachers will write different amounts about different things. If we want this information to be kept, we will have to have files for each student.

Summary

• language components (grammar, vocabulary, pronunciation)

• language use (reading, writing, listening, speaking)

• language competencies (socio-linguistic, discourse and strategic competencies)

• learning skills

• 'truth' or 'fact'

• language learning skills

• general behavioural and social skills

• things that are easy to mark or record

• things that are easy to test

• things that are easy to teach

• a general impression of the student as a language user

• a general impression of the student as a (language) learner

• a general impression of the student as a member of the class

Think about your current system of evaluating a student at the end of the year.

Which of the skills mentioned above are included in your current assessment system? Which of the skills are not included? Can you think why they are not included?

Is one type of skill more valuable than another? For example, does it get higher marks, or determine the student's assessment? How are these marks recorded?

Which skills are formally assessed (i.e. you record the information on the student's records)?

Which skills do you think about when assessing the student, but are not recorded officially?

Does your present system work? Do the good students get through and the bad ones fail?

So how does your system define a good learner? Finish the sentence below: In our school, a good learner is someone who can ...

ASSESSMENT

What we typically assess


CHAPTER 5

What forms of testing and evaluation should we use?

Direct and indirect testing

"What is direct testing?"

DIRECT TESTING means we ask the student to perform what we want to test.

"What is indirect testing?"

INDIRECT TESTING means we test things that give us an indication of the student's performance.

In DIRECT TESTING, we talk to students to see if they can communicate their ideas in interactive conversation. This sounds obvious, but it is not always practical to do this, e.g. a student may be away sick, or the class may be too big to speak to each member for enough time, especially if there is a strict SYLLABUS to adhere to.

In INDIRECT TESTING we would find things that give us an indication of how well the students can speak. For example, we know that good speakers use longer sentences or utterances than weak speakers. We could then invent a test where we measure skills associated with good speaking, e.g. the average length of each utterance: the longer the utterance, the higher the grade.

The same is true for writing. We could directly ask the students to write a number of texts. This would tell us about their writing skills. However, there may be reasons (e.g. restricted marking time) why we can't ask the students actually to perform this skill. Therefore we might give them a test on linker words (e.g. however, etc.). This may give us an indication of their ability to write well.

In the examples above, we are assuming a connection between length of utterance and speaking ability; and linkers and writing ability.

One problem is making sure that the INDIRECT TEST is an extremely good indicator or has a HIGH CORRELATION with the skill we are trying to test. If we find students who cannot write achieve high scores on our linkers test, this shows linkers are not a good indicator. This will make the test result invalid.
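One way to check how closely an indirect test follows the skill itself is to give a group of students both the indirect test and a direct one, and then calculate the correlation between the two sets of marks. The sketch below uses the standard Pearson correlation coefficient, which is an assumption about what "correlation" means here; the function name and the marks are invented for illustration and are not results from any real class.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of marks."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least two marks")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("all the marks in one list are identical")
    return covariance / sqrt(spread_x * spread_y)


# Invented marks for five students: a linkers test (indirect) and a
# marked piece of writing (direct), both out of twenty.
linkers_marks = [18, 17, 19, 16, 18]
writing_marks = [4, 12, 6, 15, 9]

r = pearson(linkers_marks, writing_marks)
# A value close to +1 would suggest the linkers test is a good indicator.
# A value near zero or below zero, as with these invented marks, would not.
```

A coefficient of this kind only summarises the group tested, so a small class gives a rough indication at best.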

INDIRECT TESTING also often produces a negative BACKWASH effect (SEE PAGE 28): some teachers will spend hours in the classroom teaching linkers rather than teaching writing, because that's what is in the test.

These problems cannot exist when we use DIRECT TESTING methods. Therefore, whenever possible, use them.

TASK

DIRECT TESTS are preferable to INDIRECT TESTS. Some teachers would argue that, as language teachers, we are teaching students to communicate - in other words, to use language.

We use language in talking (speaking and listening), reading and writing. Someone can be good at grammar but unable to communicate in speech or writing. In this case, grammar, vocabulary and pronunciation tests must really all be forms of indirect testing.

So we should place more emphasis on results from skills tests than grammar and vocabulary tests.

Do you agree?
Do you think your school agrees?
How much of your students' final assessment is based on their ability to communicate - to use and experiment with the language they have learned?
How much is based on their ability to manipulate grammar and vocabulary? Do you agree with this balance?

Norm-referenced and criteria-referenced testing

These terms refer to the way a test is made and the way the results of a test are presented.

"What is norm-referenced testing?"

This is when the results of the test compare a student with other students. The result does not give any information about individual performance, only a comparison with other students' performances - from that year and from other years.

For example, a university wants to restrict entry to its (science) courses to the applicants who have the best chance of successfully completing a course. In the past, it has found - perhaps by trial and error - that students who scored 80% or more in their final school year exams are the candidates most likely to succeed. Therefore they offer places only to students from your school who got 80% or 16/20 in their final science exams.

"What is criteria-referenced testing?"

This is when the result tells you about what the individual student can do, and does not compare him/her with other students. It describes certain criteria that the student has been able to meet.

For example, a student is applying for a job which requires the ability to use a word-processor. The employer does not want a computer expert, only someone who can do basic word-processing: typing; file-management; simple cut, copy and paste commands. The student takes a word-processing course and the final exam tests these skills.

The employer doesn't need to know if anyone else on the course was better or worse. He/she simply wants to know what the candidate can do.

"So which is better...?"

As we saw above, it is the state and large international examination boards who are most concerned with comparing people. The value of their qualifications depends upon year-on-year comparison and consistency. For example, many people feel that 'exams these days aren't as difficult as when I was at school'. Lack of year-on-year consistency devalues the state's awards, like university places and university degrees. However, this need for consistency means it is very difficult to improve or develop exams because results wouldn't be comparable.

There is a similarity here with DIRECT and INDIRECT TESTING (SEE PAGE 30).
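The two kinds of result defined above can be contrasted in a short sketch. It is an illustration only, written in Python: the function names, the marks and the way the comparison is expressed (the percentage of the comparison group scoring at or below the student) are assumptions, not a method described in the text. The criteria are the word-processing skills from the example above.

```python
def norm_referenced_result(score, cohort_scores):
    """Percentage of the comparison group scoring at or below this student."""
    if not cohort_scores:
        raise ValueError("a norm-referenced result needs a comparison group")
    at_or_below = sum(1 for other in cohort_scores if other <= score)
    return 100 * at_or_below / len(cohort_scores)


def criteria_referenced_result(performance, criteria):
    """Report, for each criterion, whether the student has met it."""
    return {criterion: bool(performance.get(criterion, False)) for criterion in criteria}


# Invented marks out of twenty for a comparison group of five students.
cohort = [10, 12, 16, 18, 20]
position = norm_referenced_result(16, cohort)

# The criteria come from the word-processing example; the performance is invented.
criteria = ["typing", "file management", "cut, copy and paste"]
report = criteria_referenced_result({"typing": True, "file management": True}, criteria)
```

The first result means nothing without the comparison group, whereas the second can be read by an employer directly as a list of what the candidate can and cannot yet do.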

CRITERIA-REFERENCED TESTS can be directly interpreted by the user: an employer can read a description to discover what the student can do.

NORM-REFERENCED TESTS are not directly interpretable: you can only use the result by comparing one student's result with your past experience of other people with the same score.

For teachers and students, it is obviously more useful for us to know what a student can and cannot do so we can work on the areas where there are problems. However, the state and other large organisations need to reduce complexity - they want simple and consistent measurement.

Forms of evaluation

First we must remember the difference between simple testing and evaluation. In this book, the term testing is used when we are asking the students questions to which we already know the answers. We use the term evaluation when we are asking the students questions to which we don't know the answers - genuine questions: Do the students feel they are getting better? Have they found the course useful?

We earlier identified several different types of test: PROFICIENCY TESTS, ACHIEVEMENT TESTS, PLACEMENT TESTS and DIAGNOSTIC TESTS (SEE PAGE 8). However, in all these tests, we are testing the students to see where they fit in to our system and our criteria. We are in control.

But evaluation is different because we are asking questions to learn about the student's learning process and attitudes, and about the teaching programme. There are three common terms used when describing evaluation: SUMMATIVE, FORMATIVE and CONGRUENT EVALUATION.

"What is summative evaluation?"

This is done at the end of (a stage of) a process. In teaching, this might be at the end of a term or a year. In this way, it is a kind of final assessment, summarising what has been achieved throughout that course. SUMMATIVE EVALUATION looks at general feedback to the teaching procedure used, so that next year's course can be changed according to what has been more or less successful.

"What is formative evaluation?"

This is done during a process so that the process can be changed to make it more effective. In teaching, this might be feedback that a teacher gets to check how successful the teaching programme is. The feedback from the students can also affect the teaching procedure while there is still a chance to change it for the better - to help this year's students rather than next year's.

"What is congruent evaluation?"

Less often referred to, this looks at the whole process before it starts, in order to make sure that the aims, methodology and evaluation of the course match the stated purpose and beliefs. For example, imagine your purpose is to increase the students' oral fluency. You ask teachers to design a course and a way to evaluate it. They return it to you and you notice that the tests include writing: this wouldn't match your original aims. In this way, CONGRUENT EVALUATION is very similar to CONTENT VALIDITY (SEE PAGE 18).


Putting the three together

You will notice that there is, in effect, little difference between these terms, because evaluation never ends. SUMMATIVE EVALUATION at the end of a course informs the teacher - and students - about how to change the course next time to make it more successful and/or more closely related to the beliefs behind the course. SUMMATIVE EVALUATION will also have implications for the next course: if there are certain problem areas the following course will have to be changed to allow more (or less) time on these; or focus on different areas.

In other words, the difference between SUMMATIVE, FORMATIVE and CONGRUENT EVALUATION is not one of how evaluation is done, but when and why evaluation is done.

• CONGRUENT EVALUATION tries to keep the process on the desired course.
• FORMATIVE EVALUATION tries to alter the process while it is still going on.
• SUMMATIVE EVALUATION tries to assess the success of the completed process.

Summary

In this chapter we have covered the following.

• INDIRECT TESTS test abilities related to the skill we are interested in.
• DIRECT TESTS test the skill itself.
• NORM-REFERENCED exams compare one person's performance with many others.
• CRITERIA-REFERENCED exams describe what one person can do without comparing them with others.
• CONGRUENT, FORMATIVE and SUMMATIVE EVALUATION describe when evaluation is done: before, during, or after. But it is important to remember that evaluation is not linear, but cyclical. Each part informs the other.

Think about when and why your school evaluates its students.

Does it ask the student genuine questions to improve the school's teaching:
a during the year or teaching programme?
b at the end of the year or teaching programme?

Does it ask you, the teacher, genuine questions to improve the school's teaching:
a during the year or teaching programme?
b at the end of the year or teaching programme?

Does your school examine new curricula, SYLLABUSES and assessment procedures before implementing them in order to check they match the school's aims?

Does your school's teaching programme have 'official' aims? Why? Why not?
