special contribution implicit learning and sla - jst


Special contribution

Implicit learning and SLA

John N. WILLIAMS


University of Cambridge


1 Introduction

Implicit cognition pervades our everyday lives. When we ride a bike we are probably unaware of how we actually manage to maintain our balance or turn a corner. When we run to catch a ball we are unaware of how we manage to arrive at the exact place at the right time to intercept it (Reed, McLeod, & Dienes, 2010). Our reactions to unfamiliar faces are governed by subtle cues that we are not aware of (Hill, et al., 1990). Untrained musicians can detect 'bad notes' or chords in unfamiliar tunes in a familiar musical style even though they have no conscious knowledge of the rules of harmony that they violate (Koelsch, et al., 2000; Pearce, et al., 2010). And of course the native speaker of a language knows that certain constructions sound ungrammatical without being able to articulate the grammatical rules that they violate.

Implicit knowledge is knowledge that we are not aware of in the moment of use. It is deployed automatically, without conscious intention (Dienes & Perner, 1999). In contrast, explicit knowledge is knowledge that the learner is aware of. It only influences behaviour through controlled, and hence usually non-fluent, processing. Implicit learning refers to the process by which implicit knowledge is acquired. Learning is inductive (generalisations are formed on the basis of examples), and incidental (it occurs without intention). This contrasts with explicit learning where either the learner is provided with generalisations, or they intentionally work them out for themselves (intentional induction).

Implicit learning has been a topic at the heart of SLA research at least since Krashen (1981) drew a distinction between acquisition (of implicit knowledge) and learning (of explicit knowledge). He claimed essentially that the only route to acquisition and fluent language performance is through implicit learning; i.e. incidental inductive learning in communicative situations. This view has been refined over the years through input processing (VanPatten, 1996), task-based learning and the interactional approach (Long, 1996), but the principle is still the same: to eschew direct teaching of language rules as mere 'learning'. On the other hand, others still advocate a more direct approach, relying on practice to convert explicitly learned knowledge into the procedural format that is required to ultimately achieve fluency (DeKeyser, 2003).

At the theoretical level the debate between Emergentists (e.g., N. C. Ellis, 1998) and Nativists (e.g., Hawkins, 2001) essentially concerns different theories of the implicit learning mechanism. Emergentists tend to appeal to domain-general learning mechanisms and basic principles of associative learning. Language learning is just another manifestation of the human ability to absorb the complex statistical structure of the environment. On the other hand, for Nativists language learning requires access to domain-specific knowledge and the deployment of learning mechanisms that are specific to language.

Given these basic pedagogical and theoretical issues it is clearly important that we understand the conditions under which implicit learning is more or less likely to occur (if at all), and the kinds of linguistic regularity that can be learned implicitly (if any). This will help us establish the division of labour between the implicit and explicit pedagogical approaches. At the theoretical level such investigations will help clarify the extent to which we need to appeal to domain-specific knowledge and learning mechanisms to explain SLA. We would be inclined towards a Nativist view if simple associative learning cannot account for linguistic generalisations that are demonstrably implicitly learnable, or if there are generalisations that an associative view predicts ought to be learnable but which in practice are not (perhaps because they are linguistically unnatural).

Given that much SLA research is indeed research into implicit learning (most especially that conducted in the generative tradition), it is perhaps surprising that as recently as 2005 R. Ellis remarked that there is a "data problem" in SLA research: "Thus, SLA as a field of inquiry has been characterized by both theoretical controversy and by a data problem concerning how to obtain reliable and valid evidence of learners' linguistic knowledge" (R. Ellis, 2005, p. 142). Specifically, Ellis was referring to the problem of distinguishing contributions of implicit and explicit knowledge to task performance. Until we know how to do this we are not going to be able to make progress on the pedagogical and theoretical issues mentioned above.

In fact, there has been a similar debate within psychology over how to measure implicit knowledge reliably, and indeed, whether implicit learning can be shown to exist at all when rigorous criteria for assessing implicit knowledge are applied (Lovibond & Shanks, 2002; Shanks & St. John, 1994). In the following sections I shall illustrate various kinds of method both for making it likely that implicit (rather than explicit) learning mechanisms were operative during learning, and for distinguishing contributions of implicit and explicit knowledge to test task performance. This research is laboratory-based and involves relatively small amounts of exposure to semi-artificial language systems under very specific task conditions. My hope is, though, that such research, like the vast body of research on implicit learning in psychology more generally, gives some insight into the nature of implicit learning mechanisms as they operate "in the wild". After all, if one is able to obtain implicit learning effects even after very limited exposure in the lab it seems highly likely that such mechanisms will be operative outside of the lab as well. The only question is whether any limitations on implicit learning within the lab are simply a result of limited exposure, and how much of real life SLA the mechanisms exposed in the lab can explain. Here I shall explore these issues in two broad areas: learning grammatical form-meaning connections, and learning word order regularities.


2 Implicit learning of grammatical form-meaning connections

Theories of word learning often emphasise explicit, rather than implicit, memory and learning processes. For example, whilst amnesics show implicit memory for novel word forms, as shown by priming effects (Haist, Musen, & Squire, 1991), they are very poor at learning new words in the everyday sense of being able to understand and produce them (Gabrieli, Cohen, & Corkin, 1988). This suggests that explicit, declarative, memory systems are involved in word learning (N. C. Ellis, 1994). Vocabulary learning in young children appears to be highly dependent on mechanisms of joint attention, as established through acts of pointing for example (Bloom, 2000). A word will only be used as a label for an object if the child recognises that the adult is intentionally using the word to label that object, implying that attention to form and meaning is critical to learning. However, when it comes to learning the meanings associated with grammatical morphemes different mechanisms are surely at play since, as Bloom himself notes, "Nobody ever points and says 'The!' or 'Of'" (Bloom, 2000, p. 115). And native speakers of English, for example, find it very difficult to explain the exact conditions of usage of "the", or might be surprised to have their attention drawn to the fact that whereas "The roof of the car" sounds fine, "The hat of the man" does not (Rosenbach, 2008). Knowledge of the relevant generalisations appears to have been learned implicitly. But is it possible to replicate such phenomena in the adult second language learner?

Evidence for implicit learning of form-meaning connections has been provided by some studies by myself (Williams, 2003, 2004, 2005), and with Janny Leung (Leung & Williams, 2006, forthcoming). For example, in Williams (2005) participants only had to learn four novel determiner-like forms, and were told that two of them (gi and ro) were used to refer to near objects, and the other two (ul and ne) for far objects. So, for example, gi dog is like saying 'the near dog'. What they were not told was that determiner use also depended on the animacy of the noun (gi and ul were used with living things, ro and ne with non-living things). The novel forms were embedded in English carrier phrases or sentences (e.g., I was terrified when I turned around and saw gi lion right behind me) upon which the participants had to perform tasks that forced them to process the novel determiners in relation to the taught near-far dimension. In a surprise test phase participants' choice of determiner for novel nouns in novel contexts was sensitive to animacy, even for participants who claimed to be unaware that animacy was a relevant factor. This suggests that they had implicit knowledge of the correlation between noun animacy and determiner usage. In another study (Leung & Williams, forthcoming) the novel determiners were correlated with thematic roles (e.g. in "Gi Mary the boy is kissing", gi is used because the girl is the agent). Participants viewed a picture and had to indicate simply on which side the named individual was located (e.g. a girl on the left is kissing a boy on the right). They also had to translate the sentence, retaining the novel form (e.g. Gi Mary is kissing the boy), thus ensuring that the sentence (and the participant roles) were fully processed. If participants knew the underlying regularity they would know already on hearing "gi" that the agent of the action was going to be referred to, and their response time would be facilitated, or slowed down when that regularity was violated. Once again, this is what was found, even for participants who when questioned afterwards appeared to be unaware of the regularity.
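The determiner system described for Williams (2005) can be written out as a small lookup table. The following is only an illustrative sketch: the function names are hypothetical, and the code encodes nothing beyond the four forms and the two dimensions (taught distance, untaught animacy) stated in the text.

```python
# Determiner system as described in the text for Williams (2005).
# Distance was taught explicitly; animacy was the hidden regularity.
DETERMINERS = {
    ("near", True): "gi",    # near, living
    ("near", False): "ro",   # near, non-living
    ("far", True): "ul",     # far, living
    ("far", False): "ne",    # far, non-living
}

def choose_determiner(distance, animate):
    """Form predicted by the full system (hypothetical helper)."""
    return DETERMINERS[(distance, animate)]

def taught_options(distance):
    """Forms consistent with the explicit near/far instruction alone."""
    return sorted(d for (dist, _), d in DETERMINERS.items() if dist == distance)

# A learner who knows only the taught rule is left with two candidates per
# trial; sensitivity to animacy is what narrows the choice to one form.
print(taught_options("near"), choose_determiner("near", True))
```

On this sketch, above-chance selection between the two taught options for a novel noun is what would indicate knowledge of the animacy regularity.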


Do these results violate the assumption that attention and, specifically, 'noticing', are necessary for learning (Schmidt, 2001)? We would argue that they do not. Participants certainly notice the relevant forms (the novel determiners) and their linguistic context (the accompanying nouns, and the sentence context in which they are embedded). What they do not notice is the association between form and meaning. We assume that this association emerges through an abstraction process that operates over the memories of individual items, or which, in connectionist terms, results from the way in which those items are encoded in memory (McClelland & Rumelhart, 1985). This would be an example of what, in the word learning literature, has been referred to as the use of "cross situational statistics" to learn word-referent mappings (Smith & Yu, 2008). Assuming a close connection between attention and memory encoding (Logan & Etherton, 1994), attention is necessary for encoding the relevant events in memory, in this case the determiners and their context. But the learner need not be aware of the generalisations that are abstracted over those events. To use Schmidt's (2001) terminology, our participants have awareness at the level of noticing, but not at the level of understanding.
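One way to picture abstraction over stored items is as simple co-occurrence counting across exposure trials. The sketch below is an assumption-laden illustration, not the mechanism proposed in the studies cited: the trial list is hypothetical (only "lion" and "dog" come from the examples in the text), and conditional probability is used as a stand-in for whatever statistic the learner actually tracks.

```python
from collections import Counter

# Hypothetical exposure trials: (determiner, noun, noun_is_animate).
trials = [
    ("gi", "lion", True), ("ro", "table", False),
    ("ul", "bird", True), ("ne", "cup", False),
    ("gi", "dog", True), ("ro", "box", False),
]

# Each trial is encoded as an individual event; the counts are the
# "abstraction" that emerges over those stored events.
pair_counts = Counter((det, animate) for det, _, animate in trials)
det_counts = Counter(det for det, _, _ in trials)

def p_animate_given(det):
    """Proportion of stored trials in which `det` occurred with an animate noun."""
    return pair_counts[(det, True)] / det_counts[det]

for det in sorted(det_counts):
    print(det, p_animate_given(det))
```

Nothing in the counting requires the learner to represent the rule "gi goes with living things"; the regularity is implicit in the stored events, which is the sense of noticing without understanding.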

An additional critical factor in implicit learning of grammatical form-meaning connections may be the nature of the meanings involved. Are all meaning distinctions equally likely to associate with grammatical morphemes? A domain-general associative learning view would suggest that so long as the conditions on learning are met, any meaning distinction is as likely as any other to associate with a grammatical morpheme. We have begun to investigate this issue by using our reaction time methodology to examine learning of form-meaning correlations that are less likely to be found in the world's languages, for example correlations between novel determiners and relative size of the objects in the display (Leung & Williams, submitted) or relative distance (Leung & Williams, 2006). No implicit learning effects were obtained in these studies. Effects were only obtained for participants who were aware of the relevant regularities, demonstrating that the systems were learnable in principle. Although it is hard to base strong claims on such null results, especially given the variety of methodological factors that may determine whether learning is obtained, these studies do provide preliminary evidence that implicit learning of grammatical form-meaning connections may be constrained. Whether these constraints derive from the first, or foreign, languages known by the participants, or from knowledge of a universal set of "potentially encodable distinctions" (Bickerton, 2001) remains to be seen. I shall return to the issue of constraints on implicit learning in the Conclusion.

To what extent are such processes operative in naturalistic SLA? If we consider simply learning of agreement patterns involving abstract noun class distinctions, then there are reasons to believe that similar processes are at work. Learning grammatical gender classes is a persistent problem for adult second language learners, with errors of gender assignment persisting even after massive exposure and at very high levels of proficiency (Carroll, 1999). Part of the reason for this appears to be that second language learners are overly sensitive to the phonological and semantic correlates of gender classes. Such correlates are only probabilistic, with numerous exceptions, and hence can lead to errors on exceptional items (Holmes & Dejean de la Batie, 1999). These errors are indicative of associative learning, as learners unconsciously abstract patterns over individual items encoded in memory. Interestingly, in first language acquisition such errors are less prevalent, and there is evidence that children are more reliant on the notion of the gender of a word as an abstract grammatical category (Caselli, et al., 1993). This may point to a fundamental difference between first and second language acquisition, at least in this domain, with the latter more dependent upon associative learning.

3 Implicit learning of word order regularities

Within psychology the traditional way of studying implicit learning of sequential regularities has been to use the artificial grammar (AG) learning paradigm introduced by Reber (1967). Participants are presented with meaningless consonant strings such as VXXVS and TPPPTS in what appears to be a short-term memory task. Following this they are told that the sequences followed a complex rule system, and that they should now make intuitive judgments about the grammaticality of novel strings. In fact the sequences are generated by a finite state grammar, and even though participants are unable to describe the structure of that grammar their grammaticality judgments are above chance (Reber, 1967; Reber & Allen, 1978). Since these initial findings there has been considerable debate over what it is exactly that participants learn in such experiments, and whether their knowledge is truly unconscious. With regard to the first question, many researchers now believe that participants learn chunks of letters that recur in the training sequences (Endress, Nespor, & Mehler, 2009; Perruchet & Pacteau, 1990), and at a more abstract level they learn characteristic patterns of letter alternation and doubling (repetition structure). The latter appears to underlie transfer to test items with different surface characteristics from the training set (Tunney & Altmann, 2001). Needless to say, knowledge of chunks and repetition structure does not constitute knowledge of the entire grammar, which presumably is why performance in these experiments is usually in the range 65% to 70%, dropping lower when the surface characteristics of the stimuli are changed at test. Nor is it particularly relevant to learning natural language grammars (although see Endress, et al., 2009, for some suggestions as to how it might be). With regard to the question of implicitness of the knowledge that is acquired, whilst the reliance on verbal report might be criticised (Shanks & St. John, 1994), more recent work using other, more sensitive, subjective measures (described below) has found that participants in AG learning experiments acquire a mixture of implicit and explicit knowledge (Dienes & Scott, 2005).
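The difference between knowing the grammar and knowing chunks can be made concrete with a toy example. The finite state grammar below is hypothetical and is not Reber's (1967) grammar; it was simply chosen so that it happens to generate the two example strings quoted above. Chunk knowledge is approximated, as an assumption, by the set of letter bigrams seen in training.

```python
import random

# Hypothetical toy grammar (NOT Reber's): state -> [(letter, next_state)].
# A next_state of None marks the end of the string.
GRAMMAR = {
    0: [("T", 1), ("V", 2)],
    1: [("P", 1), ("T", 3)],
    2: [("X", 2), ("V", 3)],
    3: [("S", None)],
}

def generate(rng):
    """Random walk through the grammar from state 0 to the end state."""
    state, letters = 0, []
    while state is not None:
        letter, state = rng.choice(GRAMMAR[state])
        letters.append(letter)
    return "".join(letters)

def is_grammatical(string):
    """True if some path through the grammar spells out exactly `string`."""
    states = {0}
    for ch in string:
        states = {nxt for st in states if st is not None
                  for letter, nxt in GRAMMAR[st] if letter == ch}
        if not states:
            return False
    return None in states

def bigrams(string):
    return {string[i:i + 2] for i in range(len(string) - 1)}

def chunk_familiarity(string, known):
    """Proportion of the string's bigrams that occurred in training."""
    pairs = [string[i:i + 2] for i in range(len(string) - 1)]
    return sum(p in known for p in pairs) / len(pairs)

known = bigrams("TPPPTS") | bigrams("VXXVS")
for test in ["TPTS", "VXPTS"]:
    print(test, is_grammatical(test), chunk_familiarity(test, known))
```

In this toy case VXPTS is ungrammatical yet three of its four bigrams are familiar, so a learner relying on chunks alone would be inclined to endorse it. This is one way of seeing why chunk knowledge supports above-chance but far from perfect performance.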

The above studies have all used artificial material comprising nonsense words that are devoid of meaning. One may wonder, then, how these results compare with those obtained when known words are involved, since these have meaning, might be identified with known grammatical categories (noun, verb, adverb, etc.), and hence more grammatical knowledge might be engaged than for strings of nonsense syllables. When exposed to known words in novel word orders in an incidental learning situation do people just learn (i) the literal sequences of words that they have encountered, (ii) the underlying grammatical patterns (as a sequence of grammatical categories), or (iii) deeper syntactic regularities? Each of these possibilities makes predictions about the kind of generalisation that will be possible. In the first case any sentence that contains new words will cause a drop in accuracy of grammaticality judgments (just as when letters are changed between training and test in AG experiments). In the second, sentences with new lexis will be acceptable so long as they correspond to a pattern of underlying grammatical categories encountered in training (this would already be a deeper level of representation of the sequence regularities than is achievable in AG learning). In the third case even sentences with novel syntactic patterns might be acceptable if they conform to deeper syntactic principles inferred from training examples.

There is very good evidence at least for possibility (ii) - people do rapidly and incidentally acquire novel orderings of abstract grammatical categories, not just sequences of word forms (Cleary & Langley, 2007; Francis, et al., 2009; Hudson Kam, 2009; Kaschak & Glenberg, 2004; Robinson, 1996). For example, Francis et al. (2009) exposed participants to three-word sentences following NNV, VNN, and NVN word orders. If one of the non-English word orders (e.g. NNV) was more frequent than the others then, after training, it was read more quickly than them, even though new words were used in the test sentences. Thus, participants rapidly, and incidentally, acquired a sensitivity to the patterning of the underlying grammatical categories. However, what is not clear from these experiments is the extent to which the resulting knowledge was implicit. Nor is it clear whether people can go beyond learning linear strings of lexical categories to sequences of phrasal categories, or whether they can go even further and learn deeper syntactic regularities.

Consider the rules of verb placement in German. In simple sentences the main verb has to occur in second phrasal position, "V2". So if we were to use English words in a German order we would have, for example, Last June rejected Rose her employee's claim for compensation. When a main clause precedes a subordinate clause the verb is in second position in the main clause and final position in the subordinate clause ("V2-VF", e.g. Usually reasoned Mike that the company cash digested). When the subordinate clause precedes the main clause the verb is in final position in the subordinate clause but first position in the main clause ("VF-V1", e.g., Since his teacher criticism voiced, put Peter more effort into his homework). In Rebuschat & Williams (2006; Rebuschat & Williams, submitted) native speakers of English with no knowledge of German (or other verb-final languages like Japanese) read sentences comprising English words in German order and simply had to judge them for semantic plausibility. The sentences followed the three phrasal patterns illustrated above. They then performed a surprise grammaticality judgment task on sentences containing new lexis (apart from function words). Following Dienes & Scott (2005) they were also asked to indicate, for each decision, their confidence, and whether their decision was based on guess, intuition, memory, or rule. We found that they were more likely to endorse grammatical patterns encountered in training than ungrammatical patterns not encountered in training. This demonstrates incidental learning of phrase-level patterns. However, their overall performance on ungrammatical items was at chance (and endorsement for some items, e.g. *VF, was particularly high), suggesting a failure to learn the underlying verb placement rules. With regard to awareness, overall accuracy was above chance when participants said that their responses were based on moderate confidence and a rule that they had formulated, indicating use of explicit knowledge. However, they were also significantly above chance when basing their decisions on moderate confidence and intuition. This is similar to native speaker (non-linguist) grammaticality judgments where we have confidence in our decisions but no knowledge of the underlying rules. Dienes & Scott (2005) argue that moderately confident, but intuitive, judgments should be regarded as a reflection of implicit knowledge. Thus there was evidence here for learning of trained phrasal patterns, some evidence for use of implicit knowledge, but no evidence of learning the underlying rules.

Williams & Kuribara (2008) used a similar methodology, but this time combining English lexis with Japanese case marking and word order. Participants with no knowledge of Japanese (or other verb-final languages such as German) were first told the functions of the -ga (nominative), -o (accusative) and -ni (dative) case markers as they were to be used in the experiment (but not Japanese in general). In the exposure phase they performed plausibility judgments on sentences like That beggar-ga banker-ni money-o loaned (correct answer = implausible). The aim of the study was to see whether participants would incidentally acquire the grammatical patterns that they were exposed to, but also whether they would acquire two basic principles of Japanese syntax - head-final verb position and scrambling. Most of the exposure phase sentences followed the canonical Japanese S(I)OV or S[S(I)OV]V pattern, e.g., Fred-ga Bill-ga arm-o broke that said ('Fred said that Bill broke (his) arm'). A minority exhibited object or adjunct scrambling, both short distance, e.g., Plant-o girl-ga melted, and long distance, e.g. Carpet-o cat-ga Jane-ga scratched that thought. Crucially, these were the only kinds of scrambling that occurred during the exposure phase. Participants then performed a surprise grammaticality judgment task (GJT) on sentences containing new lexis. Some of the test sentences followed the same canonical and object scrambling patterns encountered in the exposure phase. But there were also new structures involving scrambling of the indirect object (e.g. ISOV), and object scrambling in an embedded clause (S[OSV]V). These tested whether participants had acquired a generalised notion of scrambling that extends to novel constituents and environments. Ungrammatical structures contained verbs in non clause-final positions (e.g. *SIVO, *S[SVO]V) and tested for learning of head-final verb position. From a generative perspective scrambling is an optional syntactic operation that moves a constituent in the opposite direction from the head direction (Saito & Fukui, 1998). Japanese, being right-headed, licenses scrambling to the left. Hence one might expect to find that acquisition of head direction and scrambling are correlated. Test phase performance was compared to that of a group of participants who did not undergo the exposure phase. They were first instructed in the meanings of the case markers and then had to indicate how likely they thought each sentence was to be grammatical in the world's languages. Like the exposure group, none of these no-exposure group participants knew Japanese or other verb-final languages. We reasoned that the no-exposure group provides an indication of the initial state of the learners, allowing us to see directly how this is modified by exposure to the novel language.

The first notable result from this study was that for grammatical structures the exposure group only showed a higher acceptance rate than the no-exposure control group on structures that they had actually received during training. There were no differences between the groups on novel grammatical structures. This suggests that they did not acquire a generalised notion of scrambling that could extend to new structures. In fact though, on closer examination we found that a subset of participants (44%) showed a very high acceptance rate for canonical structures and no reliable acceptance of scrambled structures. For example, the OS(I)V structure that they had encountered in training was actually endorsed at a significantly lower level than by the control group. This may be an instance of what has been referred to in the statistical learning literature as a failure to "probability match". When faced with unexplained variability in the input some learners will just acquire the most frequent alternative, imposing regularity on the input and filtering out variation (Hudson Kam & Newport, 2005). Interestingly, a study of adult learners of Japanese also found that some of them resist scrambling (Iwasaki, 2003).

The remaining 56% of the participants endorsed at least the short scrambles that they had received in training. However, their acceptance rate on test structures that scrambled a constituent that had not been scrambled in training (e.g. ISOV) was not consistently better than that of the control group. The only significant difference was for the S[OS(I)V]V structure, where the OS(I)V structure familiar from the exposure phase was embedded in a complex sentence. Thus, there is no evidence even amongst these participants for acquisition of a generalised notion of scrambling, only acquisition of specific structural patterns received during the exposure phase.

Turning to the ungrammatical structures, given that in the exposure phase every clause ended in a verb one would have expected the participants to learn this as a structural regularity in the language. However, this was not the case, with endorsement rates for English word orders (*SVO and *SVIO) not being significantly below chance, even amongst the sub-group who at least accepted the trained scrambles. Moreover, endorsement of complex structures with right movement of either O or S (i.e. *S[SVO]V and *S[OVS]V) was not significantly different from the control group, and actually significantly above the chance level for the *S[SVO]V structure. We concluded that there was no evidence of learning the head-final characteristic of the language.

If the Williams & Kuribara (2008) results cannot be explained in terms of learning grammatical generalisations then how can they be explained? It is not enough to argue that participants simply learned the patterns that they were exposed to because this would predict reliable rejection of all novel patterns, which was clearly not the case. We proposed that during the exposure phase the participants learned the contingencies between grammatical categories and that grammaticality judgments were driven by how well the sequential structure of a test item matched the sequential probabilities acquired during exposure. Connectionist networks provide a means of calculating contingencies between events in a psychologically (if not neurally) plausible way (Shanks, 1995). "Simple recurrent networks" (SRNs) are particularly suitable for sequence learning problems such as segmentation of words from continuous speech (Christiansen, Allen, & Seidenberg, 1998), and artificial grammar learning (Kinder & Lotz, 2009). We therefore applied this kind of modeling framework to our data by coding the exposure phase sentences in terms of grammatical categories (e.g. S, O, V) and presenting them to an SRN (see Williams & Kuribara, 2008, for details). The network was essentially trained to predict each category on the basis of the preceding categories in the sentence. After training on the set of exposure phase sentences the network was presented with the test sentences and the strength of its output calculated. The stronger the output the stronger the resonance between the sequential structure of the test item and the sequential structure of the training data. For each test structure type we then plotted the linear regression for network output against the mean acceptance rate for that structure by the human participants, separately for the "scrambler" and "non-scrambler" groups. In both cases the fit to the human data was extremely good. For example, excluding the three long distance scrambling structures (where human performance appeared to be suppressed by processing difficulties) the regression over the remaining 16 structures accounted for 91% of the variance in the scrambler data, and 70% in the non-scrambler data. These simulations therefore provide good evidence that participants were learning the sequential probabilities of grammatical categories rather than grammatical generalisations.
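For readers unfamiliar with simple recurrent networks, the following is a minimal Elman-style sketch of the general idea, written with NumPy. It is not the simulation reported in Williams & Kuribara (2008): the category coding, the boundary symbol "#", the network size, the learning rate, the exposure set, and the scoring measure (mean probability assigned to each actual next category) are all assumptions made for illustration.

```python
import numpy as np

CATS = ["S", "I", "O", "V", "#"]   # '#' is an assumed sentence-boundary symbol
IDX = {c: i for i, c in enumerate(CATS)}

def one_hot(cat):
    v = np.zeros(len(CATS))
    v[IDX[cat]] = 1.0
    return v

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class SRN:
    """Elman-style network: the previous hidden state is fed back as context."""

    def __init__(self, n_hidden=8, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        n = len(CATS)
        self.W_in = rng.normal(0.0, 0.5, (n_hidden, n))
        self.W_ctx = rng.normal(0.0, 0.5, (n_hidden, n_hidden))
        self.W_out = rng.normal(0.0, 0.5, (n, n_hidden))
        self.n_hidden, self.lr = n_hidden, lr

    def _step(self, cat, context):
        hidden = np.tanh(self.W_in @ one_hot(cat) + self.W_ctx @ context)
        return hidden, softmax(self.W_out @ hidden)

    def train(self, sentence):
        """One pass over a sentence, predicting each next category."""
        context = np.zeros(self.n_hidden)
        seq = ["#"] + sentence + ["#"]
        for cur, nxt in zip(seq, seq[1:]):
            hidden, pred = self._step(cur, context)
            d_out = pred - one_hot(nxt)              # softmax cross-entropy gradient
            d_hid = (self.W_out.T @ d_out) * (1.0 - hidden ** 2)
            self.W_out -= self.lr * np.outer(d_out, hidden)
            self.W_in -= self.lr * np.outer(d_hid, one_hot(cur))
            self.W_ctx -= self.lr * np.outer(d_hid, context)
            context = hidden                         # copy-back; no backprop through time

    def score(self, sentence):
        """Mean probability the network assigns to each actual next category."""
        context = np.zeros(self.n_hidden)
        seq = ["#"] + sentence + ["#"]
        probs = []
        for cur, nxt in zip(seq, seq[1:]):
            context, pred = self._step(cur, context)
            probs.append(pred[IDX[nxt]])
        return float(np.mean(probs))

# Hypothetical exposure set: mostly canonical orders, a minority scrambled.
exposure = [list("SOV")] * 3 + [list("SIOV")] * 2 + [list("OSV")]
net = SRN()
for _ in range(200):                                 # 200 passes, an arbitrary choice
    for sent in exposure:
        net.train(sent)

trained, untrained = net.score(list("SOV")), net.score(list("SVO"))
print(trained, untrained)
```

Under these assumptions the trained canonical order receives a higher score than the untrained verb-medial order. Scores of this kind, computed per structure type, are what would be regressed against human acceptance rates.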

Of course, one potential criticism of these laboratory-based studies is that the amount of

exposure was not sufficient to trigger grammar learning. Assuming that with increasing exposure

the nature of the learning process does not change, would it ever deliver knowledge of

grammatical generalisations? To answer this question we can see what happens when the

connectionist network is tested after different amounts of training, say for 50 versus 5000 times

through the exposure sentences. What we found was that with increasing exposure the output for

test structures that had appeared in training increased, consistent with better learning of trained

structures. But the output for new structures did not change. That is, despite 100 times more

training the network did not change the strength of response to new grammatical scrambles and

new ungrammatical sentences. The kind of associative learning mechanism instantiated in this

kind of model would never deliver learning of generalisable knowledge of grammar.
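
For readers unfamiliar with this kind of model, the following is a minimal sketch of an Elman-style simple recurrent network that is trained to predict each grammatical category from the preceding ones and is then tested after different amounts of training. It is an illustration under stated assumptions, not the implementation used in the simulations (see Williams & Kuribara, 2008, for those details): the class and function names, the toy category sequences, the network size, the learning rate and the scoring measure (mean probability assigned to each actual next category) are all hypothetical. The numbers of passes are also kept smaller than the 50 versus 5000 mentioned above so that the sketch runs quickly, and it is not intended to reproduce the reported results.

```python
import math
import random


class SimpleRecurrentNetwork:
    """Minimal Elman-style SRN predicting the next category in a sequence."""

    def __init__(self, symbols, n_hidden=8, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.symbols = list(symbols)
        self.index = {s: i for i, s in enumerate(self.symbols)}
        self.n_hidden, self.lr = n_hidden, lr
        n = len(self.symbols)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = init(n_hidden, n)          # one-hot input -> hidden
        self.w_ctx = init(n_hidden, n_hidden)  # context (previous hidden) -> hidden
        self.w_out = init(n, n_hidden)         # hidden -> output

    def _step(self, i, context):
        # Hidden state from the current category and the copied-back context
        hidden = [math.tanh(self.w_in[j][i] + sum(w * c for w, c in zip(self.w_ctx[j], context)))
                  for j in range(self.n_hidden)]
        logits = [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]
        top = max(logits)
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        return hidden, [e / total for e in exps]  # softmax over next categories

    def _update(self, i, context, hidden, output, t):
        # One-step backpropagation; the context is treated as a fixed input
        d_out = [o - (1.0 if k == t else 0.0) for k, o in enumerate(output)]
        d_hid = [(1.0 - hidden[j] ** 2) * sum(d_out[k] * self.w_out[k][j] for k in range(len(d_out)))
                 for j in range(self.n_hidden)]
        for k, row in enumerate(self.w_out):
            for j in range(self.n_hidden):
                row[j] -= self.lr * d_out[k] * hidden[j]
        for j in range(self.n_hidden):
            self.w_in[j][i] -= self.lr * d_hid[j]
            for m in range(self.n_hidden):
                self.w_ctx[j][m] -= self.lr * d_hid[j] * context[m]

    def run(self, sentence, learn=False):
        """Return the mean probability given to each actual next category."""
        context = [0.0] * self.n_hidden
        probs = []
        for current, target in zip(sentence, sentence[1:]):
            i, t = self.index[current], self.index[target]
            hidden, output = self._step(i, context)
            probs.append(output[t])
            if learn:
                self._update(i, context, hidden, output, t)
            context = hidden
        return sum(probs) / len(probs)


# Hypothetical toy materials; "#" marks a sentence boundary.
exposure = [["#", "S", "O", "V", "#"], ["#", "O", "S", "V", "#"]]
trained_item = ["#", "S", "O", "V", "#"]
novel_item = ["#", "V", "S", "O", "#"]


def train_and_score(epochs):
    """Train a fresh network for `epochs` passes, then score both test items."""
    net = SimpleRecurrentNetwork(["#", "S", "O", "V"])
    for _ in range(epochs):
        for sentence in exposure:
            net.run(sentence, learn=True)
    return net.run(trained_item), net.run(novel_item)


few = train_and_score(50)    # (trained score, novel score) after little training
many = train_and_score(500)  # the same after ten times more training
```

Comparing `few` and `many` shows how the output for trained and untrained sequences changes with the amount of training in this toy setting.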

If we assume that the same would be true of human learners then it follows that if they do

show evidence of having acquired grammatical generalisations then it must be by using some

other kind of knowledge. For example they might use explicit knowledge. R. Ellis (2005)

provided evidence that amongst adult learners of English performance in a grammaticality

judgment test was driven by a combination of implicit and explicit knowledge, with the use of

implicit knowledge associated with correct acceptance of grammatical structures, and the use of

explicit knowledge associated with correct rejection of ungrammatical structures. Grammatical

structures might be accepted through matching to patterns encountered in input, and hence could

be supported by implicit knowledge. But reliable rejection of ungrammatical structures requires

knowledge of generalised rules, which, on Ellis's evidence, was only apparent as explicit

knowledge. Similarly, Roehr (2008) suggests that implicit knowledge of a second

language is exemplar-based, leading to prototype and similarity effects, whereas categorical, and

context-independent, performance can only be achieved by using explicit metalinguistic

knowledge.

How plausible it is to appeal to explicit knowledge to explain detection of

ungrammaticalities depends very much on the nature of the violations in question. The head-final

character of Japanese is presumably a salient feature of the language that is easy for the learner

to represent explicitly, and which even beginner learners are explicitly taught. But what of more

subtle grammatical violations where the relevant grammatical rules are unlikely to have been

taught, and where there is unlikely to be anything in the input that tells the learners that they are

ruled out? For example, White (2009) reviewed research on subjacency violations and subtle

aspects of the semantic interpretation of syntactic structures which suggests that L2 learners can

acquire aspects of the L2 that they are unlikely to have transferred from the L1, or about which

they are unlikely to have explicit knowledge. If it can be shown that judgments about the

ungrammaticality of such structures are indeed made in a native-like way and on the basis of

implicit knowledge (perhaps using the subjective measures employed by Rebuschat & Williams,

submitted), and that performance cannot be explained by domain-general (e.g. connectionist)

learning mechanisms, then it will be necessary to appeal to implicit learning mechanisms that go

beyond simple associative learning.

4 Conclusion - Constraints on implicit learning -

There is a common theme running through studies of implicit learning and statistical learning in

both psychology and, more recently, in applied linguistics, and this is the notion that the implicit

learning mechanism is constrained. We are not able to absorb all regularities in the environment.

In part this may be due to the fact that the implicit learning mechanism operates on

representations that are fed to it by other systems. For example, there is a general predisposition

to encode sequences in terms of positions relative to edges, and in terms of repetition structure,

and these predispositions may have specific implications for language learning, and for our

notions of the role of innate constraints (Endress, et al., 2009). Of course in the case of SLA

relevant representations can be fed to the implicit learning mechanism from the linguistic system

itself, as when people naturally learn about novel word patterns in terms of the sequencing of

underlying grammatical categories. Perhaps also they only implicitly learn associations between

grammatical morphemes and meanings that their prior linguistic knowledge tells them are likely

to be grammaticised in language. And it may also be that the IL mechanism itself is constrained

in the sense that it can only deliver certain kinds of knowledge - for example of the underlying

statistical regularities of sequences rather than generalisable rules. By exploring the limitations

of implicit learning we can help define the phenomena that require a different kind of

explanation, be it explicit learning or UG-constrained implicit learning, and help demarcate areas

of language that can be assumed to be acquired spontaneously from those that require special

intervention.

References

Bickerton, D. (2001). Okay for content words, but what about functional items? Commentary on

Bloom: How children learn the meanings of words. Behavioral and Brain Sciences, 24, (6),

1104-1105.

Bloom, P. (2000). How Children Learn the Meanings of Words. Cambridge, MA: MIT Press.

Carroll, S. E. (2005). Input and SLA: Adults' sensitivity to different sorts of cues to French

gender. Language Learning, 55, (1), 79-138.

Caselli, M. C., Leonard, L. B., Volterra, V. & Campagnoli, M. G. (1993). Toward mastery of

Italian morphology: A cross-sectional study. Journal of Child Language, 20, (2), 377-393.

Christiansen, M. H., Allen, J. & Seidenberg, M. S. (1998). Learning to segment speech using

multiple cues: A connectionist model. Language and Cognitive Processes, 13, (2/3),

221-268.

Cleary, A. M. & Langley, M. M. (2007). Retention of the structure underlying sentences.

Language and Cognitive Processes, 22, (4), 614-628.

DeKeyser, R. M. (2003). Implicit and explicit learning. In C. Doughty & M. Long (Eds.),

Handbook of Second Language Acquisition, 313-348. Oxford: Blackwell.

Dienes, Z. & Perner, J. (1999). A theory of implicit and explicit knowledge. Behavioral and

Brain Sciences, 22, 735-808.

-----. & Scott, R. (2005). Measuring unconscious knowledge: Distinguishing structural

knowledge and judgment knowledge. Psychological Research, 69, (5/6), 338-351.

Ellis, N. C. (1994). Vocabulary acquisition: The implicit ins and outs of explicit cognitive

mediation. In N. C. Ellis (Ed.), Implicit and Explicit Learning of Languages, 211-282.

London: Academic Press.

-----. (1998). Emergentism, connectionism and language learning. Language Learning, 48, (4),

631-664.

Ellis, R. (2005). Measuring implicit and explicit knowledge of a second language: A

psychometric study. Studies in Second Language Acquisition, 27, (2), 141-172.

Endress, A. D., Nespor, M. & Mehler, J. (2009). Perceptual and memory constraints on language

acquisition. Trends in Cognitive Sciences, 13, (8), 348-353.

Francis, A. P., Schmidt, G. L., Carr, T. H. & Clegg, B. A. (2009). Incidental learning of abstract

rules for non-dominant word orders. Psychological Research-Psychologische Forschung, 73,

(1), 60-74.

Gabrieli, J. D. E., Cohen, N. J. & Corkin, S. (1988). The impaired learning of semantic

knowledge following medial temporal lobe resection. Brain and Cognition, 7, (2), 157-177.

Haist, F., Musen, G. & Squire, L. (1991). Intact priming of words and nonwords in amnesia.

Psychobiology, 19, (4), 275-285.

Hawkins, R. (2001). Second Language Syntax: A Generative Introduction. Oxford: Blackwell.

Hill, T., Lewicki, P., Czyzewska, M. & Schuller, G. (1990). The role of learned inferential

encoding rules in the perception of faces: Effects of nonconscious self-perpetuation of a bias.

Journal of Experimental Social Psychology, 26, (4), 350-371.

Holmes, V. M. & Dejean de la Batie, B. (1999). Assignment of grammatical gender by native

speakers and foreign language learners. Applied Psycholinguistics, 20, (4), 479-506.

Hudson Kam, C. L. (2009). More than words: Adults learn probabilities over categories and

relationships between them. Language Learning and Development, 5, (2), 115-145.

-----. & Newport, E. L. (2005). Regularizing unpredictable variation: The roles of adult and

child learners in language formation and change. Language Learning and Development, 1,

(2), 151-195.

Iwasaki, N. (2003). L2 acquisition of Japanese: knowledge and use of case particles in SOV and

OSV sentences. In S. Karimi (Ed.), Word Order and Scrambling, 273-300. Oxford:

Blackwell.

Kaschak, M. P. & Glenberg, A. M. (2004). This construction needs learned. Journal of

Experimental Psychology-General, 133, (3), 450-467.

Kinder, A. & Lotz, A. (2009). Connectionist models of artificial grammar learning: What type of

knowledge is acquired? Psychological Research-Psychologische Forschung, 73, (5),

659-673.

Koelsch, S., Gunter, T., Friederici, A. D. & Schroger, E. (2000). Brain indices of music

processing: "Non-musicians" are musical. Journal of Cognitive Neuroscience, 12, (3),

520-541.

Krashen, S. (1981). Second Language Acquisition and Second Language Learning. London:

Pergamon.

Leung, J. & Williams, J. N. (2006). Implicit learning of form-meaning connections. In R. Sun &

N. Miyake (Eds.), Proceedings of the Annual Meeting of the Cognitive Science Society,

465-470. Mahwah, NJ: Lawrence Erlbaum Associates.

-----. & -----. (forthcoming). The implicit learning of mappings between forms and

contextually-derived meanings. Studies in Second Language Acquisition.

Logan, G. D. & Etherton, J. L. (1994). What is learned during automatization? The role of

attention in constructing an instance. Journal of Experimental Psychology: Learning,

Memory and Cognition, 20, (5), 1022-1050.

Long, M. H. (1996). The role of the linguistic environment in second language acquisition. In W.

C. Ritchie & T. K. Bhatia (Eds.), Handbook of Second Language Acquisition, 413-468. New

York: Academic Press.

Lovibond, P. F. & Shanks, D. R. (2002). The role of awareness in Pavlovian conditioning:

Empirical evidence and theoretical implications. Journal of Experimental Psychology:

Animal Behavior Processes, 28, (1), 3-26.

McClelland, J. & Rumelhart, D. (1985). Distributed memory and the representation of general

and specific information. Journal of Experimental Psychology: General, 114, (2), 159-188.

Pearce, M. T., Ruiz, M. H., Kapasi, S., Wiggins, G. A. & Bhattacharya, J. (2010). Unsupervised

statistical learning underpins computational, behavioural, and neural manifestations of

musical expectation. Neuroimage, 50, (1), 302-313.

Perruchet, P. & Pacteau, C. (1990). Synthetic grammar learning: Implicit rule abstraction or

explicit fragmentary knowledge? Journal of Experimental Psychology: General, 119, (3),

264-275.

Reber, A. S. (1967). Implicit learning of artificial grammars. Journal of Verbal Learning and

Verbal Behavior, 6, (6), 855-863.

-----. & Allen, R. (1978). Analogic and abstraction strategies in synthetic grammar learning: A

functionalist interpretation. Cognition, 6, (3), 189-221.

Rebuschat, P. & Williams, J. N. (2006). Dissociating implicit and explicit learning of natural

language syntax. In R. Sun & N. Miyake (Eds.), Proceedings of the Annual Meeting of the

Cognitive Science Society, 2594. Mahwah, NJ: Lawrence Erlbaum.

-----. & -----. (submitted). Implicit learning of second language syntax.

Reed, N., McLeod, P. & Dienes, Z. (2010). Implicit knowledge and motor skill: What people

who know how to catch don't know. Consciousness and Cognition, 19, (1), 63-76.

Robinson, P. (1996). Learning simple and complex second language rules under implicit,

incidental, rule-search, and instructed conditions. Studies in Second Language Acquisition,

18, (1), 27-67.

Roehr, K. (2008). Linguistic and metalinguistic categories in second language learning.

Cognitive Linguistics, 19, (1), 67-106.

Rosenbach, A. (2008). Animacy and grammatical variation: Findings from English genitive

variation. Lingua, 118, (2), 151-171.

Saito, M. & Fukui, N. (1998). Order in phrase structure and movement. Linguistic Inquiry, 29,

(3), 439-474.

Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and Second Language

Instruction, 3-32. Cambridge: Cambridge University Press.

Shanks, D. R. (1995). The Psychology of Associative Learning. Cambridge: Cambridge

University Press.

-----. & St. John, M. (1994). Characteristics of dissociable human learning systems. Behavioral

and Brain Sciences, 17, (3), 367-447.

Smith, L. & Yu, C. (2008). Infants rapidly learn word-referent mappings via cross-situational

statistics. Cognition, 106, (3), 1558-1568.

Tunney, R. J. & Altmann, G. T. M. (2001). Two modes of transfer in artificial grammar learning.

Journal of Experimental Psychology-Learning Memory and Cognition, 27, (3), 614-639.

VanPatten, B. (1996). Input Processing and Grammar Instruction: Theory and Research.

Norwood, New Jersey: Ablex Publishing Corporation.

White, L. (2009). Changing perspectives on universal grammar and the critical period

hypothesis. Second Language, 8, 3-22.

Williams, J. N. (2003). Inducing abstract linguistic representations: Human and connectionist

learning of noun classes. In R. H. van Hout, A. Hulk, F. Kuiken & R. Towell (Eds.), The

Interface between Syntax and the Lexicon in Second Language Acquisition, 151-174.

Amsterdam: John Benjamins.

-----. (2004). Implicit learning of form-meaning connections. In B. VanPatten, J. Williams, S.

Rott & M. Overstreet (Eds.), Form-Meaning Connections in Second Language Acquisition,

203-218. Mahwah, NJ: Lawrence Erlbaum Associates.

-----. (2005). Learning without awareness. Studies in Second Language Acquisition, 27, (2),

269-304.

-----. & Kuribara, C. (2008). Comparing a nativist and emergentist approach to the initial stage

of SLA: An investigation of Japanese scrambling. Lingua, 118, (4), 522-553.
