
8/3/2019 1Testing Scales of References


Scales of Reference for Testing of Proficiency

However, in the two decades since these ratings were first suggested, applied linguists' views about what constitutes a true 'zero' or 'perfect' point, or even a 'native speaker', or what an educated person's proficiency in a language means, have changed considerably (Chandee, 1997).


McNamara (1995)

We cannot assume that native speakers will perform better than non-native speakers in the tasks on our tests, as native and non-native speakers may not easily be distinguished in terms of the non-linguistic performance capacities that are involved in the tasks (p. 165).


2.4.5 Face and Content Validity

Proficiency scales have high face validity: they look as if they are testing what they claim to be testing. This is not validity in the technical sense (Anastasi, 1976, p. 139).

Although the use of proficiency scales can help to guide teachers and learners in setting realistic goals, they raise a number of difficult issues that are inherent in the nature of language proficiency and have important implications for how it is measured (Hyltenstam & Pienemann, 1985, p. 222).


Chandee 1997

What is important to note here is that language educators may lack relevant professional training (Chandee, 1997) and may:

(i) see language learning in terms of some, rather than all, aspects of language ability;

(ii) treat language ability and language proficiency as identical, believing that proficiency testing provides an accurate and reliable method of assessing communicative competence; and/or

(iii) perceive no essential difference between proficiency testing and a range of other assessment procedures (Chandee, 1997).

It is essential, therefore, to begin here by acknowledging the importance of teachers' awareness of empirical research in this area (Chandee, 1997).


2.4.6 The Problem of Validity in Language Testing

To be valid, a test must measure what it sets out to measure. For example, if listening and writing skills are to be tested, then the test items must involve listening and writing, which may take the form of, as Anastasi suggests, listening to lectures and writing reports, and both must contain authentic materials (Anastasi, 1961, p. 138).


Accordingly, Anastasi's definition of content validity is the systematic examination of the test content to determine whether it covers a representative sample of the behaviour domain to be measured. This representative sample of the behaviour domain must closely reflect that domain in performance terms (Anastasi, 1976, pp. 134-135).
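The idea of a "representative sample of the behaviour domain" can be pictured as a simple coverage check: compare the share of test items devoted to each skill area with the share that area is judged to have in the domain. The sketch below is only an illustration under stated assumptions. The skill areas, the domain weights, the tolerance and the function name `coverage_gaps` are all hypothetical and are not taken from Anastasi or any other source cited here.

```python
from collections import Counter


def coverage_gaps(domain_weights, item_areas, tolerance=0.10):
    """Return the skill areas whose share of test items differs from
    their (assumed) share of the behaviour domain by more than tolerance.

    domain_weights: dict mapping area name -> expected proportion (hypothetical)
    item_areas:     list with one area label per test item
    """
    counts = Counter(item_areas)
    total = len(item_areas)
    gaps = {}
    for area, weight in domain_weights.items():
        share = counts.get(area, 0) / total if total else 0.0
        if abs(share - weight) > tolerance:
            # Positive value: over-represented; negative: under-represented
            gaps[area] = round(share - weight, 2)
    return gaps


# Hypothetical domain in which listening and writing carry equal weight
domain = {"listening": 0.5, "writing": 0.5}
# Hypothetical test: eight listening items, two writing items
items = ["listening"] * 8 + ["writing"] * 2

print(coverage_gaps(domain, items))
```

A check of this kind speaks only to how test items are distributed across the domain. It says nothing about whether each item elicits the behaviour in performance terms, which still requires expert judgement.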


Many language test researchers have noted the inadequacy of face validity, content relevance, and predictive utility of language tests (Alderson, 1981; Bachman, 1988; Bachman & Savignon, 1986; Skehan, 1984; Stevenson, 1981, 1985; Upshur, 1979).


This poses problems for predictive validity: as Bachman (1990) notes, for example, an examination of predictive utility alone can largely ignore the question of what abilities are being measured (pp. 250-251).


The problem becomes evident with the use of, for example, multiple-choice grammar tests to measure an individual's writing ability or to place the individual in a writing course (Bachman, 1990, pp. 250-251).


Moreover, the conditions that determine the meanings of a speech act are complex and, for the test to be valid, test writers must take this into consideration (Chandee, 1997).

This is highlighted in Spolsky's (1986) comment that we can study the pragmatic value and sociolinguistic probability of choosing...structures in different environments...but the complexity is such that we cannot expect ever to come up with anything like a complete list from which sampling is possible (p. 150).


2.4.7 Authenticity of Communicative Language Tests

It is problematic to define the term authenticity in terms of samples of 'real-life' language use, since language use depends on different contexts, purposes, topics, participants, speech events, and so forth (Bachman, 1990, p. 690; Morrow, 1991, p. 114; Nunan, 1988, p. 99; Widdowson, 1990, pp. 44-47).


Chandee 1997

Any testing situation is, therefore, unnatural and thus not authentic.

Language use in real life varies according to speakers' linguistic and communicative competences, the contexts in which the language is used, speakers' and listeners' background knowledge, and the cultural aspects both speakers and listeners bring with them.

This makes it difficult to distinguish 'real-life' from 'non-real-life' language use.


To be authentic, a test must, inevitably, be one that reproduces a real-life situation in order to examine the student's ability to cope with it (Doy, 1991, p. 105) and must measure the interaction between the language user and the discourse (Widdowson, 197, p. 80).

Moreover, pragmatic criteria must be present. That is, language tests...must require the learner to understand the pragmatic interrelationship of linguistic context and extralinguistic contexts (Oller, 1979, p. 33).


This sort of authenticity is difficult to achieve in a test situation where both the tester and the test taker know that the only purpose of the interaction is to obtain an assessment of the test taker's language performance (Shohamy & Reves, 1985, p. 55).


Spolsky (1985) supports this view, maintaining that however hard the tester might try to disguise his purpose, it is not to engage in genuine conversation with the candidate. . . but rather to find out something about the candidate in order to classify, reward, or punish him/her (p. 36).


Authenticity is, therefore, almost unachievable since, according to Klein-Braley (1985), if authenticity means real-life behaviour, then any language testing procedure is non-authentic (p. 76).

We are forced, therefore, with Spolsky, to conclude that testing is not authentic language behaviour, that examination questions are not real, however much like real-life questions they seem (p. 36).

Furthermore, an examinee needs to learn the special rules of examinations before he or she can take part in them successfully (Spolsky, 1985, p. 36).


Though tests are, in general, inevitably not authentic in the full sense, it should be possible to establish criteria which will approximate authenticity (Chandee, 1997).

Testing methods need, for example, to be modified so that they do not impinge on the language use observed (Chandee, 1997), and, as both Spolsky (1985) and Shohamy and Reves (1985) observe, the unobtrusive observation of language use in 'natural situations' is one way of achieving at least a partial solution to the question of authenticity (Shohamy & Reves, 1985, p. 55; Spolsky, 1985, p. 39).


Chandee, 1997

Some theorists suggest that one authentic and direct testing situation is to observe an individual over a period of time (Jones, 1985, p. 81).

The main problem, of course, with extensive naturalistic observation of non-test language use is that it is impractical, time-consuming, cumbersome and expensive, and hence not feasible in most language testing situations.


Chandee, 1997

It is certainly impossible in a country which does not use the target language in everyday life situations.


A different, but perhaps equally important, problem pointed out by Spolsky (1989) is the serious ethical question raised by using information obtained surreptitiously, without individuals' knowledge, for making decisions about them.


Subjects who for various reasons do not test well (who become over-anxious, or who are unwilling to play the special game of testing, i.e. answering a question the answer to which is known better by the asker than the answerer) will not be accurately measured by any kind of formal test: there will be a large gap between their test and their real-life performance (Spolsky, p. 74).

This lack of authenticity in the material used in a test raises issues about the generalizability of results (Spolsky, 1985, p. 39).


To solve the dilemma of test authenticity, it might be possible to argue that language tests have an authenticity of their own (Chandee, 1997).

Alderson (1981a) suggests that "authentic tasks are in principle impossible in a language testing situation, and communicative language testing is in principle impossible" (p. 48).


The problem of authenticity might be resolved by accepting Widdowson's (1978) definition of authenticity as a characteristic of the relationship between the passage and the reader [that] has to do with appropriate response (p. 80).

This notion of authenticity is very similar to Oller's (1979) description of a 'pragmatic' test, that is, any procedure or task that causes the learner to process sequences of elements in a language that conform to the normal contextual constraints of that language, and which requires the learner to relate sequences of linguistic elements via pragmatic mapping to extralinguistic context (p. 38).


2.4.8 Constructing Language Proficiency Tests

Pimporn Chandee

1997

When all of the problems of test authenticity are taken into account, it is clear that it is very difficult to construct a test that will be authentic (Chandee, 1997).

Even if the focus is on only one or a few components of language ability in a given testing context, Bachman (1990) notes that there is a need to be aware of the full range of language abilities when designing, developing and interpreting language test scores (p. 682), and that design must be informed by a broader view of language ability (p. 682).


This view mirrors that of Spolsky (1989), who suggests that test authenticity may be achieved if all the distinguishing characteristics or features within a finite open set, consisting of a potentially infinite number of instances, are used in test construction (p. 74).

However, this may be impractical (Chandee, 1997).


Chandee, 1997

Problems in creating good tests of language ability are unavoidable, since language tests can be used only as an indirect way of making inferences about a test taker's language ability.


Since language use involves the integration of multiple components and processes, it is unlikely that there will ever be a language test that will measure all the components of language ability, or even a test (Chandee) that will, in Bachman's (1990) terms, elicit language test performance that is characteristic of language performance in non-test situations (p. 19).


To be similar to 'normal', 'real-life' or 'non-test' language use, test tasks essentially must include the following elements:

'pragmatic' (Oller, 1979, pp. 16-19, p. 27 and p. 33; 1991, p. 32; Spolsky, 1986, p. 150),

'functional' (Bachman, 1990, p. 301),

'communicative' (Bachman, 1990, p. 301; Canale & Swain, 1980, p. 31),

'performance' (Bachman, 1990, p. 301) and

'authenticity' (Bachman, 1990, p. 301; Morrow, 1991, p. 112, p. 114; Spolsky, 1989, p. 74).


Every instance of authentic language use involves several abilities.

For example, for taxi drivers to operate in the international airport in Bangkok, they need to know not only the conversational discourse, such as

a request by the customer to be taken to a particular place,

an agreement by the driver to take the customer, or

a request for directions followed by an agreement, and

finally a statement of the fare by the driver, and

a polite thank you upon receipt of the fare,

but also how to converse with the customer in the following situations:

the fare as a point of bargaining,

the fare depending on the weather, the time of day or night, the condition of the streets, traffic and so on

(Bachman, 1990, p. 312).


Chandee, 1997

Hence, Bachman points out, there is probably an infinite variety of conversational exchanges that might take place between the taxi drivers and the customers (p. 312).

Furthermore, the very nature of language use is such that discourse consists of interrelated illocutionary acts expressed in a variety of related forms.

If language test scores are to reflect several abilities, and if authentic test tasks are, by definition, interrelated, then measurement models must be appropriate for analysing and interpreting these abilities.
