1testing scales of references

Upload: pimporn-chandee

Post on 06-Apr-2018


  • 8/3/2019 1Testing Scales of References

    1/33


    Scales of Reference for Testing of Proficiency

    However, in the two decades since these ratings were first suggested, applied linguists' views about what constitutes a true 'zero' or 'perfect' point, or even a 'native speaker', or what an educated person's proficiency in a language means, have changed considerably (Chandee, 1997).


    McNamara (1995)

    We cannot assume that native speakers will perform better than non-native speakers in the tasks on our tests, as native and non-native speakers may not easily be distinguished in terms of the non-linguistic performance capacities that are involved in the tasks (p. 165).




    2.4.5 Face and Content Validity

    Proficiency scales have high face validity: they look as if they are testing what they claim to be testing. This is not validity in the technical sense (Anastasi, 1976, p. 139).

    Although the use of proficiency scales can help to guide teachers and learners in setting realistic goals, they raise a number of difficult issues inherent in the nature of language proficiency and with important implications for how it is measured (Hyltenstam & Pienemann, 1985, p. 222).


    Chandee 1997

    What is important to note here is that language educators may lack relevant professional training (Chandee, 1997) and may:

    (i) see language learning in terms of some, rather than all, aspects of language ability;

    (ii) treat language ability and language proficiency as identical, believing that proficiency testing provides an accurate and reliable method of assessing communicative competence; and/or

    (iii) perceive no essential difference between proficiency testing and a range of other assessment procedures (Chandee, 1997).

    It is essential, therefore, to begin here by acknowledging the importance of teachers' awareness of empirical research in this area (Chandee, 1997).


    2.4.6 The Problem of Validity in Language Testing

    To be valid, a test must measure what it sets out to measure. For example, if listening and writing skills are to be tested, then the test items must involve listening and writing, which may be in the form of, as Anastasi suggests, listening to lectures and writing reports, and both must contain authentic materials (Anastasi, 1961, p. 138).


    Accordingly, Anastasi's definition of content validity is the systematic examination of the test content to determine whether it covers a representative sample of the behaviour domain to be measured. This representative sample of the behaviour domain must closely reflect that domain in performance terms (Anastasi, 1976, pp. 134-135).


    Many language test researchers have noted the inadequacy of face validity, content relevance, and predictive utility of language tests (Alderson, 1981; Bachman, 1988; Bachman & Savignon, 1986; Skehan, 1984; Stevenson, 1981, 1985; Upshur, 1979).


    This poses problems for predictive validity, as, for example, Bachman (1990) notes, an examination of predictive utility alone can largely ignore the question of what abilities are being measured (pp. 250-251).


    The problem becomes evident with the use of, for example, multiple-choice grammar tests to measure an individual's writing ability or for placing the individual in a writing course (Bachman, 1990, pp. 250-251).


    Moreover, the conditions that determine the meanings of a speech act are complex and, for the test to be valid, test writers must take this into consideration (Chandee, 1997).

    This is highlighted in Spolsky's (1986) comment that we can study the pragmatic value and sociolinguistic probability of choosing...structures in different environments...but the complexity is such that we cannot expect ever to come up with anything like a complete list from which sampling is possible (p. 150).


    2.4.7 Authenticity of Communicative Language Tests

    It is problematic to define the term authenticity in terms of samples of 'real-life' language use since language use depends on different contexts, purposes, topics, participants, speech events, and so forth (Bachman, 1990, p. 690; Morrow, 1991, p. 114; Nunan, 1988, p. 99; Widdowson, 1990, pp. 44-47).


    Chandee 1997

    Any testing situation is, therefore, unnatural and thus not authentic.

    Language use in real life varies according to speakers' linguistic and communicative competences, the contexts the language is used in, speakers' and listeners' background knowledge, and the cultural aspects both speakers and listeners bring with them.

    This makes it difficult to distinguish 'real-life' from 'nonreal-life' language use.


    To make a test authentic, it must, inevitably, be one that reproduces a real-life situation in order to examine the student's ability to cope with it (Doy, 1991, p. 105) and must measure the interaction between the language user and the discourse (Widdowson, 197, p. 80).

    Moreover, pragmatic criteria must be present. That is, language tests...must require the learner to understand the pragmatic interrelationship of linguistic context and extralinguistic contexts (Oller, 1979, p. 33).


    This sort of authenticity is difficult to achieve in a test situation where both the tester and the test taker know that the only purpose of the interaction is to obtain an assessment of the test taker's language performance (Shohamy & Reves, 1985, p. 55).


    Spolsky (1985) supports this view, maintaining that however hard the tester might try to disguise his purpose, it is not to engage in genuine conversation with the candidate...but rather to find out something about the candidate in order to classify, reward, or punish him/her (p. 36).


    Authenticity is, therefore, almost unachievable since, according to Klein-Braley (1985), if authenticity means real-life behaviour, then any language testing procedure is non-authentic (p. 76).

    We are forced, therefore, with Spolsky, to conclude that testing is not authentic language behaviour, that examination questions are not real, however much like real-life questions they seem (p. 36).

    Furthermore, an examinee needs to learn the special rules of examinations before he or she can take part in them successfully (Spolsky, 1985, p. 36).


    Though tests are, in general, inevitably not authentic in the full sense, it should be possible to establish criteria which will approximate authenticity (Chandee, 1997).

    Testing methods need, for example, to be modified so that they do not impinge on the language use observed (Chandee, 1997) and, as both Spolsky (1985) and Shohamy and Reves (1985) observe, the unobtrusive observation of language use in 'natural situations' is one way of achieving at least a partial solution to the question of authenticity (Shohamy & Reves, 1985, p. 55; Spolsky, 1985, p. 39).


    Chandee, 1997

    Some theorists suggest that one authentic and direct testing situation is to observe an individual over a period of time (Jones, 1985, p. 81).

    The main problem, of course, with extensive naturalistic observation of non-test language use is that it is impractical, time-consuming, cumbersome and expensive, and hence not feasible in most language testing situations.


    Chandee, 1997

    It is certainly impossible in a country which does not use the target language in everyday life situations.


    A different, but perhaps equally important, problem pointed out by Spolsky (1989) is the serious ethical question raised by using information obtained surreptitiously, without individuals' knowledge, for making decisions about them.


    Subjects who for various reasons do not test well (who become over-anxious, or who are unwilling to play the special game of testing, i.e. answering a question the answer to which is known better by the asker than the answerer) will not be accurately measured by any kind of formal test: there will be a large gap between their test and their real-life performance (Spolsky, p. 74).

    This lack of authenticity in the material used in a test raises issues about the generalizability of results (Spolsky, 1985, p. 39).


    To solve the dilemma of test authenticity, it might be possible to argue that language tests have an authenticity of their own (Chandee, 1997).

    Alderson (1981a) suggests that "authentic tasks are in principle impossible in a language testing situation, and communicative language testing is in principle impossible" (p. 48).


    The problem of authenticity might be resolved by accepting Widdowson's (1978) definition of authenticity as a characteristic of the relationship between the passage and the reader [that] has to do with appropriate response (p. 80).

    This notion of authenticity is very similar to Oller's (1979) description of a 'pragmatic' test, that is, any procedure or task that causes the learner to process sequences of elements in a language that conform to the normal contextual constraints of that language, and which requires the learner to relate sequences of linguistic elements via pragmatic mapping to extralinguistic context (p. 38).


    2.4.8 Constructing Language Proficiency Tests

    Pimporn Chandee, 1997


    When all of the problems of test authenticity are taken into account, it is clear that it is very difficult to construct a test that will be authentic (Chandee, 1997).

    Even so, where the focus is on only one or a few components of language ability in a given testing context, Bachman (1990) notes that there is a need to be aware of the full range of language abilities when designing, developing and interpreting language test scores (p. 682), and that design must be informed by a broader view of language ability (p. 682).


    This view mirrors that of Spolsky (1989), who suggests that test authenticity may be achieved if all the distinguishing characteristics or features within a finite open set, consisting of a potentially infinite number of instances, are used in test construction (p. 74).

    However, this may be impractical (Chandee, 1997).


    Chandee, 1997

    Problems in creating good tests of language ability are unavoidable since language tests can be used only as an indirect way of making inferences about a test taker's language ability.


    Since language use involves the integration of multiple components and processes, it is unlikely that there will ever be a language test that will measure all the components of language ability or even a test (Chandee), in Bachman's (1990) terms, that will elicit language test performance that is characteristic of language performance in non-test situations (p. 19).


    To be similar to 'normal', or 'real-life' and 'nontest' language use, test tasks essentially must include the following elements:

    'pragmatic' (Oller, 1979, pp. 16-19, p. 27 and p. 33; 1991, p. 32; Spolsky, 1986, p. 150),

    'functional' (Bachman, 1990, p. 301),

    'communicative' (Bachman, 1990, p. 301; Canale & Swain, 1980, p. 31),

    'performance' (Bachman, 1990, p. 301) and

    'authenticity' (Bachman, 1990, p. 301; Morrow, 1991, p. 112, p. 114; Spolsky, 1989, p. 74).


    Every instance of authentic language use involves several abilities.

    For example, for taxi drivers to operate in the international airport in Bangkok, they need to know not only the conversational discourse such as

    a request by the customer to be taken to a particular place,

    an agreement by the driver to take the customer, or

    a request for directions followed by an agreement, and

    finally a statement of the fare by the driver, and

    a polite thank you upon receipt of the fare

    but also how to converse with the customer in the following situations:

    the fare as a point of bargaining,

    the fare depending on the weather, the time of day or night, the condition of the streets, traffic and so on

    (Bachman, 1990, p. 312).


    Chandee, 1997

    Hence, Bachman points out, there is probably an infinite variety of conversational exchanges that might take place between the taxi drivers and the customers (p. 312).

    Furthermore, the very nature of language use is such that discourse consists of interrelated illocutionary acts expressed in a variety of related forms.

    If language test scores are to reflect several abilities, and if authentic test tasks are, by definition, interrelated, then measurement models must be appropriate for analysing and interpreting these abilities.