Special contribution
Implicit learning and SLA
John N. WILLIAMS
University of Cambridge
1 Introduction
Implicit cognition pervades our everyday lives. When we ride a bike we are probably unaware of
how we actually manage to maintain our balance or turn a corner. When we run to catch a ball
we are unaware of how we manage to arrive at the exact place at the right time to intercept it
(Reed, McLeod, & Dienes, 2010). Our reactions to unfamiliar faces are governed by subtle cues
that we are not aware of (Hill, et al., 1990). Untrained musicians can detect bad notes or chords
in unfamiliar tunes in a familiar musical style even though they have no conscious knowledge of
the rules of harmony that they violate (Koelsch, et al., 2000; Pearce, et al., 2010). And of course
the native speaker of a language knows that certain constructions sound ungrammatical without
being able to articulate the grammatical rules that they violate.
Implicit knowledge is knowledge that we are not aware of in the moment of use. It is
deployed automatically, without conscious intention (Dienes & Perner, 1999). In contrast, explicit
knowledge is knowledge that the learner is aware of. It only influences behaviour through
controlled, and hence usually non-fluent, processing. Implicit learning refers to the process by
which implicit knowledge is acquired. Learning is inductive (generalisations are formed on the
basis of examples), and incidental (it occurs without intention). This contrasts with explicit
learning where either the learner is provided with generalisations, or they intentionally work
them out for themselves (intentional induction).
Implicit learning has been a topic at the heart of SLA research at least since Krashen (1981)
drew a distinction between acquisition (of implicit knowledge) and learning (of explicit
knowledge). He claimed essentially that the only route to acquisition and fluent language
performance is through implicit learning; i.e. incidental inductive learning in communicative
situations. This view has been refined over the years through input processing (VanPatten, 1996),
task based learning and the interactional approach (Long, 1996), but the principle is still the
same, to eschew direct teaching of language rules as mere 'learning'. On the other hand, others
still advocate a more direct approach, relying on practice to convert explicitly learned knowledge
into the procedural format that is required to ultimately achieve fluency (DeKeyser, 2003).
At the theoretical level the debate between Emergentists (e.g., N. C. Ellis, 1998) and
Nativists (e.g., Hawkins, 2001) essentially concerns different theories of the implicit learning
mechanism. Emergentists tend to appeal to domain-general learning mechanisms and basic
principles of associative learning. Language learning is just another manifestation of the human
ability to absorb the complex statistical structure of the environment. On the other hand, for
Nativists language learning requires access to domain-specific knowledge and the deployment of
learning mechanisms that are specific to language.
Given these basic pedagogical and theoretical issues it is clearly important that we
understand the conditions under which implicit learning is more or less likely to occur (if at all),
and the kinds of linguistic regularity that can be learned implicitly (if any). This will help us
establish the division of labour between the implicit and explicit pedagogical approaches. At the
theoretical level such investigations will help clarify the extent to which we need to appeal to
domain-specific knowledge and learning mechanisms to explain SLA. We would be inclined
towards a Nativist view if simple associative learning cannot account for linguistic
generalisations that are demonstrably implicitly learnable, or if there are generalisations that an
associative view predicts ought to be learnable but which in practice are not (perhaps because
they are linguistically unnatural).
Given that much SLA research is indeed research into implicit learning (most especially
that conducted in the generative tradition), it is perhaps surprising that as recently as 2005 R.
Ellis remarked that there is a "data problem" in SLA research: "Thus, SLA as a field of inquiry
has been characterized by both theoretical controversy and by a data problem concerning how to
obtain reliable and valid evidence of learners' linguistic knowledge" (R. Ellis, 2005, p. 142).
Specifically, Ellis was referring to the problem of distinguishing contributions of implicit and
explicit knowledge to task performance. Until we know how to do this we are not going to be
able to make progress in answering the pedagogical and theoretical issues mentioned above.
In fact, there has been a similar debate within psychology over how to measure implicit
knowledge reliably, and indeed, whether implicit learning can be shown to exist at all when
rigorous criteria for assessing implicit knowledge are applied (Lovibond & Shanks, 2002;
Shanks & St. John, 1994). In the following sections I shall illustrate various kinds of method for
both making it likely that implicit (rather than explicit) learning mechanisms were operative
during learning, and for distinguishing contributions of implicit and explicit knowledge to test
task performance. This research is laboratory-based and involves relatively small amounts of
exposure to semi-artificial language systems under very specific task conditions. My hope is,
though, that such research, like the vast body of research on implicit learning in psychology
more generally, gives some insight into the nature of implicit learning mechanisms as they
operate "in the wild". After all, if one is able to obtain implicit learning effects even after very
limited exposure in the lab it seems highly likely that such mechanisms will be operative outside
of the lab as well. The only question is whether any limitations on implicit learning within the
lab are simply a result of limited exposure, and how much of real life SLA the mechanisms
exposed in the lab can explain. Here I shall explore these issues in two broad areas: learning
grammatical form-meaning connections, and learning word order regularities.
2 Implicit learning of grammatical form-meaning connections
Theories of word learning often emphasise explicit, rather than implicit, memory and learning
processes. For example, whilst amnesics show implicit memory for novel word forms, as shown
by priming effects (Haist, Musen, & Squire, 1991), they are very poor at learning new words in
the everyday sense of being able to understand and produce them (Gabrieli, Cohen, & Corkin,
1988). This suggests that explicit, declarative, memory systems are involved in word learning (N.
C. Ellis, 1994). Vocabulary learning in young children appears to be highly dependent on
mechanisms of joint attention, as established through acts of pointing for example (Bloom, 2000).
A word will only be used as a label for an object if the child recognises that the adult is
intentionally using the word to label that object, implying that attention to form and meaning is
critical to learning. However, when it comes to learning the meanings associated with
grammatical morphemes different mechanisms are surely at play since, as Bloom himself notes,
"Nobody ever points and says 'The!' or 'Of!'" (Bloom, 2000, p. 115). And native speakers of
English, for example, find it very difficult to explain the exact conditions of usage of "the", or
might be surprised to have their attention drawn to the fact that whereas "The roof of the car"
sounds fine, "The hat of the man" does not (Rosenbach, 2008). Knowledge of the relevant
generalisations appears to have been learned implicitly. But is it possible to replicate such
phenomena in the adult second language learner?
Evidence for implicit learning of form-meaning connections has been provided by some
studies by myself (Williams, 2003, 2004, 2005), and with Janny Leung (Leung & Williams,
2006, forthcoming). For example, in Williams (2005) participants only had to learn four novel
determiner-like forms, and were told that two of them (gi and ro) were used to refer to near
objects, and the other two (ul and ne) for far objects. So, for example, gi dog is like saying `the
near dog'. What they were not told was that determiner use also depended on the animacy of the
noun (gi and ul were used with living things, ro and ne with non-living things). The novel forms
were embedded in English carrier phrases or sentences (e.g., I was terrified when I turned
around and saw gi lion right behind me) upon which the participants had to perform tasks that
forced them to process the novel determiners in relation to the taught near-far dimension. In a
surprise test phase participants' choice of determiner for novel nouns in novel contexts was
sensitive to animacy, even for participants who claimed to be unaware that animacy was a
relevant factor. This suggests that they had implicit knowledge of the correlation between noun
animacy and determiner usage. In another study (Leung & Williams, forthcoming) the novel
determiners were correlated with thematic roles (e.g. "Gi Mary the boy is kissing", where gi is used
because the girl is the agent). Participants viewed a picture and had to indicate simply on which
side the named individual was located (e.g. a girl on the left is kissing a boy on the right). They
also had to translate the sentence, retaining the novel form (e.g. Gi Mary is kissing the boy), thus
ensuring that the sentence (and the participant roles) were fully processed. If participants knew
the underlying regularity they would know already on hearing "gi" that the agent of the action was
going to be referred to, and their response time would be facilitated, or slowed down when that
regularity was violated. Once again, this is what was found, even for participants who when
questioned afterwards appeared to be unaware of the regularity.
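The determiner system described for Williams (2005) can be written down as a small lookup table, which makes the logic of the design easier to see. The sketch below is only an illustration: the four forms and the two dimensions come from the text, while the function and variable names are hypothetical. The taught near/far rule always leaves two candidate forms, and only the untaught animacy cue selects between them.

```python
# Illustrative encoding of the determiner system described for Williams (2005).
# The forms (gi, ro, ul, ne) and the two dimensions are taken from the text;
# all function and variable names are hypothetical.
DETERMINERS = {
    ("near", True): "gi",   # near, living
    ("near", False): "ro",  # near, non-living
    ("far", True): "ul",    # far, living
    ("far", False): "ne",   # far, non-living
}


def choose_determiner(distance, animate):
    """Form licensed by the full system (distance plus animacy)."""
    return DETERMINERS[(distance, animate)]


def taught_options(distance):
    """Forms compatible with the explicitly taught near/far rule alone."""
    return sorted(d for (dist, _), d in DETERMINERS.items() if dist == distance)


# A participant relying only on the taught rule cannot choose between the two
# forms returned by taught_options; sensitivity to animacy is what resolves it.
print(taught_options("near"), choose_determiner("near", True))
```

A participant who selects the animacy-appropriate form at above-chance rates, while reporting no awareness of animacy, shows the pattern that the text interprets as implicit knowledge.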
Do these results violate the assumption that attention and, specifically, 'noticing', are
necessary for learning (Schmidt, 2001)? We would argue that they do not. Participants certainly
notice the relevant forms (the novel determiners) and their linguistic context (the accompanying
nouns, and the sentence context in which they are embedded). What they do not notice is the
association between form and meaning. We assume that this association emerges through an
abstraction process that operates over the memories of individual items, or which, in
connectionist terms, results from the way in which those items are encoded in memory
(McClelland & Rumelhart, 1985). This would be an example of what, in the word learning
literature, has been referred to as the use of "cross situational statistics" to learn word-referent
mappings (Smith & Yu, 2008). Assuming a close connection between attention and memory
encoding (Logan & Etherton, 1994), attention is necessary for encoding the relevant events in
memory, in this case the determiners and their context. But the learner need not be aware of the
generalisations that are abstracted over those events. To use Schmidt's (2001) terminology, our
participants have awareness at the level of noticing, but not at the level of understanding.
An additional critical factor in implicit learning of grammatical form-meaning connections
may be the nature of the meanings involved. Are all meaning distinctions equally likely to
associate with grammatical morphemes? A domain-general associative learning view would
suggest that so long as the conditions on learning are met, any meaning distinction is as likely as
any other to associate with a grammatical morpheme. We have begun to investigate this issue by
using our reaction time methodology to examine learning of form-meaning correlations that are
less likely to be found in the world's languages, for example correlations between novel
determiners and relative size of the objects in the display (Leung & Williams, submitted) or
relative distance (Leung & Williams, 2006). No implicit learning effects were obtained in these
studies. Effects were only obtained for participants who were aware of the relevant regularities,
demonstrating that the systems were learnable in principle. Although it is hard to base strong
claims on such null results, especially given the variety of methodological factors that may
determine whether learning is obtained, these studies do provide preliminary evidence that
implicit learning of grammatical form-meaning connections may be constrained. Whether these
constraints derive from the first, or foreign, languages known by the participants, or from
knowledge of a universal set of "potentially encodable distinctions" (Bickerton, 2001) remains to
be seen. I shall return to the issue of constraints on implicit learning in the Conclusion.
To what extent are such processes operative in naturalistic SLA? If we consider simply
learning of agreement patterns involving abstract noun class distinctions, then there are reasons
to believe that similar processes are at work. Learning grammatical gender classes is a persistent
problem for adult second language learners, with errors of gender assignment persisting even after
massive exposure and at very high levels of proficiency (Carroll, 1999). Part of the reason for
this appears to be that second language learners are overly sensitive to the phonological and
semantic correlates of gender classes. Such correlates are only probabilistic, with numerous
exceptions, and hence can lead to errors on exceptional items (Holmes & Dejean de la Batie,
1999). These errors are indicative of associative learning, as learners unconsciously abstract
patterns over individual items encoded in memory. Interestingly, in first language acquisition
such errors are less prevalent, and there is evidence that children are more reliant on the notion
of the gender of a word as an abstract grammatical category (Caselli, et al., 1993). This may
point to a fundamental difference between first and second language acquisition, at least in this
domain, with the latter more dependent upon associative learning.
3 Implicit learning of word order regularities
Within psychology the traditional way of studying implicit learning of sequential regularities has
been using the artificial grammar (AG) learning paradigm introduced by Reber (1967).
Participants are presented with meaningless consonant strings such as VXXVS and TPPPTS in
what appears to be a short-term memory task. Following this they are told that the sequences
followed a complex rule system, and that they should now make intuitive judgments about the
grammaticality of novel strings. In fact the sequences are generated by a finite state grammar,
and even though participants are unable to describe the structure of that grammar their
grammaticality judgments are above chance (Reber, 1967; Reber & Allen, 1978). Since these
initial findings there has been considerable debate over what it is exactly that participants learn
in such experiments, and whether their knowledge is truly unconscious. With regard to the first
question, many researchers now believe that participants learn chunks of letters that recur in the
training sequences (Endress, Nespor, & Mehler, 2009; Perruchet & Pacteau, 1990), and at a more
abstract level they learn characteristic patterns of letter alternation and doubling (repetition
structure). The latter appears to underlie transfer to test items with different surface
characteristics from the training set (Tunney & Altmann, 2001). Needless to say, knowledge of
chunks and repetition structure does not constitute knowledge of the entire grammar, which
presumably is why performance in these experiments is usually in the range 65% to 70%,
dropping lower when the surface characteristics of the stimuli are changed at test. Nor is it
particularly relevant to learning natural language grammars (although see Endress, et al., 2009,
for some suggestions as to how it might be). With regard to the question of implicitness of the
knowledge that is acquired, whilst the reliance on verbal report might be criticised (Shanks & St.
John, 1994) more recent work using other, more sensitive, subjective measures (described
below) has found that participants in AG learning experiments acquire a mixture of implicit and
explicit knowledge (Dienes & Scott, 2005).
The above studies have all used artificial material comprising nonsense words that are
devoid of meaning. One may wonder, then, how these results compare with those obtained when
known words are involved, since these have meaning, might be identified with known
grammatical categories (noun, verb, adverb, etc), and hence more grammatical knowledge might
be engaged than for strings of nonsense syllables. When exposed to known words in novel word
orders in an incidental learning situation do people just learn (i) the literal sequences of words
that they have encountered, (ii) the underlying grammatical patterns (as a sequence of
grammatical categories), or (iii) deeper syntactic regularities? Each of these possibilities makes
predictions about the kind of generalisation that will be possible. In the first case any sentence
that contains new words will cause a drop in accuracy of grammaticality judgments (just as when
letters are changed between training and test in AG experiments). In the second, sentences with
new lexis will be acceptable so long as they correspond to a pattern of underlying grammatical
categories encountered in training (this would already be a deeper level of representation of the
sequence regularities than is achievable in AG learning). In the third case even sentences with
novel syntactic patterns might be acceptable if they conform to deeper syntactic principles
inferred from training examples.
There is very good evidence at least for possibility (ii) - people do rapidly and incidentally
acquire novel orderings of abstract grammatical categories, not just sequences of word forms
(Cleary & Langley, 2007; Francis, et al., 2009; Hudson Kam, 2009; Kaschak & Glenberg, 2004;
Robinson, 1996). For example, Francis et al (2009) exposed participants to three-word sentences
following NNV, VNN, and NVN word orders. If one of the non-English word orders (e.g. NNV)
was more frequent than the others then, after training, it was read more quickly than them, even
though new words were used in the test sentences. Thus, participants rapidly, and incidentally,
acquired a sensitivity to the patterning of the underlying grammatical categories. However, what
is not clear from these experiments is the extent to which the resulting knowledge was implicit.
Nor is it clear whether people can go beyond learning linear strings of lexical categories to
sequences of phrasal categories, or whether they can go even further and learn deeper syntactic
regularities.
Consider the rules of verb placement in German. In simple sentences the main verb has to
occur in second phrasal position, "V2". So if we were to use English words in a German order
we would have, for example, Last June rejected Rose her employee's claim for compensation.
When a main clause precedes a subordinate clause the verb is in second position in the main
clause and final position in the subordinate clause ("V2-VF", e.g. Usually reasoned Mike that the
company cash digested). When the subordinate clause precedes the main clause the verb is in
final position in the subordinate clause but first position in the main clause ("VF-V1", e.g., Since
his teacher criticism voiced, put Peter more effort into his homework). In Rebuschat & Williams
(2006; Rebuschat & Williams, submitted) native speakers of English with no knowledge of
German (or other verb-final languages like Japanese) read sentences comprising English words
in German order and simply had to judge them for semantic plausibility. The sentences followed
the three phrasal patterns illustrated above. They then performed a surprise grammaticality
judgment task on sentences containing new lexis (apart from function words). Following Dienes
& Scott (2005) they were also asked to indicate, for each decision, their confidence, and whether
their decision was based on guess, intuition, memory, or rule. We found that they were more
likely to endorse grammatical patterns encountered in training than ungrammatical patterns not
encountered in training. This demonstrates incidental learning of phrase-level patterns. However,
their overall performance on ungrammatical items was at chance (and endorsement for some
items, e.g. *VF, was particularly high), suggesting a failure to learn the underlying verb
placement rules. With regard to awareness, overall accuracy was above chance when participants
said that their responses were based on moderate confidence and a rule that they had formulated,
indicating use of explicit knowledge. However, they were also significantly above chance when
basing their decisions on moderate confidence and intuition. This is similar to native speaker
(non-linguist) grammaticality judgments where we have confidence in our decisions but no
knowledge of the underlying rules. Dienes & Scott (2005) argue that moderately confident, but
intuitive judgments, should be regarded as a reflection of implicit knowledge. Thus there was
evidence here for learning of trained phrasal patterns, some evidence for use of implicit
knowledge, but no evidence of learning the underlying rules.
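The logic of the source-attribution analysis can be sketched in a few lines. This is a minimal illustration and not the analysis code from the studies: the trial format and function name are hypothetical, and the toy data are invented purely to show the calculation. Each grammaticality judgment is tagged with the basis the participant reported (guess, intuition, memory, or rule), and accuracy is then computed separately for each basis and compared with the chance level of 0.5.

```python
from collections import defaultdict


def accuracy_by_source(trials):
    """Proportion correct for each reported decision basis.

    trials: iterable of (source, correct) pairs, where source is one of
    "guess", "intuition", "memory", "rule" and correct is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # source -> [hits, total]
    for source, correct in trials:
        counts[source][0] += int(correct)
        counts[source][1] += 1
    return {source: hits / total for source, (hits, total) in counts.items()}


# Invented toy data, for illustration only (not results from any study).
toy_trials = [
    ("intuition", True), ("intuition", True), ("intuition", False),
    ("rule", True), ("rule", True),
    ("guess", True), ("guess", False),
]
print(accuracy_by_source(toy_trials))
```

On the interpretation described in the text, above-chance accuracy on judgments attributed to intuition is taken as evidence of implicit knowledge, and above-chance accuracy on judgments attributed to a rule as evidence of explicit knowledge. A real analysis would also need a significance test against chance, which is omitted here.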
Williams & Kuribara (2008) used a similar methodology, but this time combining English
lexis with Japanese case marking and word order. Participants with no knowledge of Japanese
(or other verb-final languages such as German) were first told the functions of the -ga
(nominative), -o (accusative) and -ni (dative) case markers as they were to be used in the
experiment (but not Japanese in general). In the exposure phase they performed plausibility
judgments on sentences like That beggar-ga banker-ni money-o loaned (correct answer =
implausible). The aim of the study was to see whether participants would incidentally acquire the
grammatical patterns that they were exposed to, but also whether they would acquire two basic
principles of Japanese syntax - head-final verb position and scrambling. Most of the exposure
phase sentences followed the canonical Japanese S(I)OV or S[S(I)OV]V pattern, e.g., Fred-ga
Bill-ga arm-o broke that said ('Fred said that Bill broke (his) arm'). A minority exhibited object
or adjunct scrambling, both short distance, e.g., Plant-o girl-ga melted, and long distance, e.g.
Carpet-o cat-ga Jane-ga scratched that thought. Crucially, these were the only kinds of
scrambling that occurred during the exposure phase. Participants then performed a surprise GJT
on sentences containing new lexis. Some of the test sentences followed the same canonical and
object scrambling patterns encountered in the exposure phase. But there were also new structures
involving scrambling of the indirect object (e.g. ISOV), and object scrambling in an embedded
clause (S[OSV]V). These tested whether participants had acquired a generalised notion of
scrambling that extends to novel constituents and environments. Ungrammatical structures
contained verbs in non clause-final positions (e.g. *SIVO, *S[SVO]V) and test for learning of
head-final verb position. From a generative perspective scrambling is an optional syntactic
operation that moves a constituent in the opposite direction from the head direction (Saito &
Fukui, 1998). Japanese, being right-headed, licenses scrambling to the left. Hence one might
expect to find that acquisition of head direction and scrambling are correlated. Test phase
performance was compared to a group of participants who did not undergo the exposure phase.
They were first instructed in the meanings of the case markers and then had to indicate how
likely they thought each sentence was to be grammatical in the world's languages. Like the
exposure group, none of these no-exposure group participants knew Japanese or other verb-final
languages. We reasoned that the no-exposure group provides an indication of the initial state of
the learners, allowing us to see directly how this is modified by exposure to the novel language.
The first notable result from this study was that for grammatical structures the exposure
group only showed a higher acceptance rate than the no-exposure control group on structures
that they had actually received during training. There were no differences between the groups on
novel grammatical structures. This suggests that they did not acquire a generalised notion of
scrambling that could extend to new structures. In fact though, on closer examination we
found that a subset of participants (44%) showed a very high acceptance rate for canonical
structures and no reliable acceptance of scrambled structures. For example, the OS(I)V structure
that they had encountered in training was actually endorsed at a significantly lower level than the
control group. This may be an instance of what has been referred to in the statistical learning
literature as a failure to "probability match". When faced with unexplained variability in the
input some learners will just acquire the most frequent alternative, imposing regularity on the
input and filtering out variation (Hudson Kam & Newport, 2005). Interestingly, a study of adult
learners of Japanese also found that some of them resist scrambling (Iwasaki, 2003).
The remaining 56% of the participants endorsed at least the short scrambles that
they had received in training. However, their acceptance rate on test structures that scrambled a
constituent that had not been scrambled in training (e.g. ISOV) was not consistently better than
the control group. The only significant difference was for the S[OS(I)V]V structure, where the
OS(I)V structure familiar from the exposure phase was embedded in a complex sentence. Thus,
there is no evidence even amongst these participants for acquisition of a generalised notion of
scrambling, only acquisition of specific structural patterns received during the exposure phase.
Turning to the ungrammatical structures, given that in the exposure phase every clause
ended in a verb one would have expected the participants to learn this as a structural regularity in
the language. However, this was not the case, with endorsement rates for English word orders
(*SVO and *SVIO) not being significantly below chance, even amongst the sub-group who at
least accepted the trained scrambles. Moreover, endorsement of complex structures with right
movement of either O or S (i.e. *S[SVO]V and *S[OVS]V) was not significantly different from
the control group, and actually significantly above the chance level for the *S[SVO]V structure.
We concluded that there was no evidence of learning the head-final characteristic of the
language.
If the Williams & Kuribara (2008) results cannot be explained in terms of learning
grammatical generalisations then how can they be explained? It is not enough to argue that
participants simply learned the patterns that they were exposed to because this would predict
reliable rejection of all novel patterns, which was clearly not the case. We proposed that during
the exposure phase the participants learned the contingencies between grammatical categories
and that grammaticality judgments were driven by how well the sequential structure of a test
item matched the sequential probabilities acquired during exposure. Connectionist networks
provide a means of calculating contingencies between events in a psychologically (if not
neurally) plausible way (Shanks, 1995). "Simple recurrent networks" (SRNs) are particularly
suitable for sequence learning problems such as segmentation of words from continuous speech
(Christiansen, Allen, & Seidenberg, 1998), and artificial grammar learning (Kinder & Lotz,
2009). We therefore applied this kind of modeling framework to our data by coding the exposure
phase sentences in terms of grammatical categories (e.g. S, O, V) and presenting them to an SRN
(see Williams & Kuribara, 2008, for details). The network was essentially trained to predict each
category on the basis of the preceding categories in the sentence. After training on the set of
exposure phase sentences the network was presented with the test sentences and the strength of
its output calculated. The stronger the output the stronger the resonance between the sequential
structure of the test item and the sequential structure of the training data. For each test structure
type we then plotted the linear regression for network output against the mean acceptance rate
for that structure by the human participants, separately for the "scrambler" and "non-scrambler"
groups. In both cases the fit to the human data was extremely good. For example, excluding the
three long distance scrambling structures (where human performance appeared to be suppressed
by processing difficulties) the regression over the remaining 16 structures accounted for 91% of
the variance in the scrambler data, and 70% in the non-scrambler data. These simulations
therefore provide good evidence that participants were learning the sequential probabilities of
grammatical categories rather than grammatical generalisations.
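The idea that judgments track sequential probabilities of grammatical categories can be illustrated with a much simpler model than the one used in the study. The sketch below is not an SRN and is not the simulation reported in Williams & Kuribara (2008): it swaps in a smoothed bigram (first-order transitional probability) model, and the training set, function names, and smoothing constant are all assumptions made for illustration. It scores a test sequence by its mean log transitional probability, so that a novel sequence sharing transitions with trained sequences receives partial support.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"


def train_bigrams(sentences):
    """Count category-to-category transitions over the training sequences."""
    pair_counts, context_counts = Counter(), Counter()
    for seq in sentences:
        padded = [START, *seq, END]
        for prev, nxt in zip(padded, padded[1:]):
            pair_counts[(prev, nxt)] += 1
            context_counts[prev] += 1
    return pair_counts, context_counts


def score(seq, model, vocab_size, alpha=0.1):
    """Mean log transitional probability of seq, with add-alpha smoothing."""
    pair_counts, context_counts = model
    padded = [START, *seq, END]
    log_probs = []
    for prev, nxt in zip(padded, padded[1:]):
        p = (pair_counts[(prev, nxt)] + alpha) / (
            context_counts[prev] + alpha * vocab_size
        )
        log_probs.append(math.log(p))
    return sum(log_probs) / len(log_probs)


# Hypothetical exposure set: mostly canonical orders, a minority scrambled.
TRAINING = [["S", "O", "V"]] * 6 + [["S", "I", "O", "V"]] * 2 + [["O", "S", "V"]] * 2
VOCAB_SIZE = 5  # possible next symbols: S, I, O, V and the end marker
MODEL = train_bigrams(TRAINING)

for label, seq in [("trained SOV", ["S", "O", "V"]),
                   ("novel ISOV", ["I", "S", "O", "V"]),
                   ("ungrammatical SVO", ["S", "V", "O"])]:
    print(label, round(score(seq, MODEL, VOCAB_SIZE), 3))
```

In this toy setup the trained order scores highest, the novel scramble scores lower, and the verb-medial order scores lowest but is not ruled out, because the S-V transition does occur in the scrambled OSV training items. This is the kind of graded, similarity-driven pattern that the text attributes to associative learning, as opposed to categorical rejection based on a head-final rule.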
Of course, one potential criticism of these laboratory-based studies is that the amount of
exposure was not sufficient to trigger grammar learning. Assuming that with increasing exposure
the nature of the learning process does not change, would it ever deliver knowledge of
grammatical generalisations? To answer this question we can see what happens when the
connectionist network is tested after different amounts of training, say for 50 versus 5000 times
through the exposure sentences. What we found was that with increasing exposure the output for
test structures that had appeared in training increased, consistent with better learning of trained
structures. But the output for new structures did not change. That is, despite 100 times more
training the network did not change the strength of response to new grammatical scrambles and
new ungrammatical sentences. The kind of associative learning mechanism instantiated in this
kind of model would never deliver learning of generalisable knowledge of grammar.
If we assume that the same would be true of human learners then it follows that if they do
show evidence of having acquired grammatical generalisations then it must be by using some
other kind of knowledge. For example they might use explicit knowledge. R. Ellis (2005)
provided evidence that amongst adult learners of English performance in a grammaticality
judgment test was driven by a combination of implicit and explicit knowledge, with the use of
implicit knowledge associated with correct acceptance of grammatical structures, and the use of
explicit knowledge associated with correct rejection of ungrammatical structures. Grammatical
structures might be accepted through matching to patterns encountered in input, and hence could
be supported by implicit knowledge. But reliable rejection of ungrammatical structures requires
knowledge of generalised rules, which, on Ellis's evidence, was only apparent as explicit
knowledge. Similarly, Roehr (2008) suggests that implicit knowledge of a second
language is exemplar-based, leading to prototype and similarity effects, whereas categorical, and
context-independent, performance can only be achieved by using explicit metalinguistic
knowledge.
How plausible it is to appeal to explicit knowledge to explain detection of
ungrammaticalities depends very much on the nature of the violations in question. The head-final
character of Japanese is presumably a salient feature of the language that is easy for the learner
to represent explicitly, and which even beginner learners are explicitly taught. But what of more
subtle grammatical violations where the relevant grammatical rules are unlikely to have been
taught, and where there is unlikely to be anything in the input that tells the learners that they are
ruled out? For example, White (2009) reviewed research on subjacency violations and subtle
aspects of the semantic interpretation of syntactic structures which suggests that L2 learners can
acquire aspects of the L2 that they are unlikely to have transferred from the L1, or about which
they are unlikely to have explicit knowledge. If it can be shown that judgments about the
ungrammaticality of such structures are indeed made in a native-like way and on the basis of
implicit knowledge (perhaps using the subjective measures employed by Rebuschat & Williams,
submitted), and that performance cannot be explained by domain-general (e.g. connectionist)
learning mechanisms, then it will be necessary to appeal to implicit learning mechanisms that go
beyond simple associative learning.
4 Conclusion - Constraints on implicit learning -
There is a common theme running through studies of implicit learning and statistical learning in
both psychology and, more recently, in applied linguistics, and this is the notion that the implicit
learning mechanism is constrained. We are not able to absorb all regularities in the environment.
In part this may be due to the fact that the implicit learning mechanism operates on
representations that are fed to it by other systems. For example, there is a general predisposition
to encode sequences in terms of positions relative to edges, and in terms of repetition structure,
and these predispositions may have specific implications for language learning, and for our
notions of the role of innate constraints (Endress, et al., 2009). Of course in the case of SLA
relevant representations can be fed to the implicit learning mechanism from the linguistic system
itself, as when people naturally learn about novel word patterns in terms of the sequencing of
underlying grammatical categories. Perhaps also they only implicitly learn associations between
grammatical morphemes and meanings that their prior linguistic knowledge tells them are likely
to be grammaticised in language. And it may also be that the IL mechanism itself is constrained
in the sense that it can only deliver certain kinds of knowledge - for example of the underlying
statistical regularities of sequences rather than generalisable rules. By exploring the limitations
of implicit learning we can help define the phenomena that require a different kind of
explanation, be it explicit learning or UG-constrained implicit learning, and help demarcate areas
of language that can be assumed to be acquired spontaneously from those that require special
intervention.
References
Bickerton, D. (2001). Okay for content words, but what about functional items? Commentary on
Bloom: How children learn the meanings of words. Behavioral and Brain Sciences, 24, (6),
1104-1105.
Bloom, P. (2000). How Children Learn the Meanings of Words. Cambridge, MA: MIT Press.
Carroll, S. E. (2005). Input and SLA: Adults' sensitivity to different sorts of cues to French
gender. Language Learning, 55, (1), 79-138.
Caselli, M. C., Leonard, L. B., Volterra, V. & Campagnoli, M. G. (1993). Toward mastery of
Italian morphology: A cross-sectional study. Journal of Child Language, 20, (2), 377-393.
Christiansen, M. H., Allen, J. & Seidenberg, M. S. (1998). Learning to segment speech using
multiple cues: A connectionist model. Language and Cognitive Processes, 13, (2/3),
221-268.
Cleary, A. M. & Langley, M. M. (2007). Retention of the structure underlying sentences.
Language and Cognitive Processes, 22, (4), 614-628.
DeKeyser, R. M. (2003). Implicit and explicit learning. In C. Doughty & M. Long (Eds.),
Handbook of Second Language Acquisition, 313-348. Oxford: Blackwell.
Dienes, Z. & Perner, J. (1999). A theory of implicit and explicit knowledge. Behavioral and
Brain Sciences, 22, 735-808.
-----. & Scott, R. (2005). Measuring unconscious knowledge: Distinguishing structural
knowledge and judgment knowledge. Psychological Research, 69, (5/6), 338-351.
Ellis, N. C. (1994). Vocabulary acquisition: The implicit ins and outs of explicit cognitive
mediation. In N. C. Ellis (Ed.), Implicit and Explicit Learning of Languages, 211-282.
London: Academic Press.
-----. (1998). Emergentism, connectionism and language learning. Language Learning, 48, (4),
631-664.
Ellis, R. (2005). Measuring implicit and explicit knowledge of a second language: A
psychometric study. Studies in Second Language Acquisition, 27, (2), 141-172.
Endress, A. D., Nespor, M. & Mehler, J. (2009). Perceptual and memory constraints on language
acquisition. Trends in Cognitive Sciences, 13, (8), 348-353.
Francis, A. P., Schmidt, G. L., Carr, T. H. & Clegg, B. A. (2009). Incidental learning of abstract
rules for non-dominant word orders. Psychological Research-Psychologische Forschung, 73,
(1), 60-74.
Gabrieli, J. D. E., Cohen, N. J. & Corkin, S. (1988). The impaired learning of semantic
knowledge following medial temporal lobe resection. Brain and Cognition, 7, (2), 157-177.
Haist, F., Musen, G. & Squire, L. (1991). Intact priming of words and nonwords in amnesia.
Psychobiology, 19, (4), 275-285.
Hawkins, R. (2001). Second Language Syntax: A Generative Introduction. Oxford: Blackwell.
Hill, T., Lewicki, P., Czyzewska, M. & Schuller, G. (1990). The role of learned inferential
encoding rules in the perception of faces: Effects of nonconscious self-perpetuation of a bias.
Journal of Experimental Social Psychology, 26, (4), 350-371.
Holmes, V. M. & Dejean de la Batie, B. (1999). Assignment of grammatical gender by native
speakers and foreign language learners. Applied Psycholinguistics, 20, (4), 479-506.
Hudson Kam, C. L. (2009). More than words: Adults learn probabilities over categories and
relationships between them. Language Learning and Development, 5, (2), 115-145.
-----. & Newport, E. L. (2005). Regularizing unpredictable variation: The roles of adult and
child learners in language formation and change. Language Learning and Development, 1,
(2), 151-195.
Iwasaki, N. (2003). L2 acquisition of Japanese: knowledge and use of case particles in SOV and
OSV sentences. In S. Karimi (Ed.), Word Order and Scrambling, 273-300. Oxford:
Blackwell.
Kaschak, M. P. & Glenberg, A. M. (2004). This construction needs learned. Journal of
Experimental Psychology: General, 133, (3), 450-467.
Kinder, A. & Lotz, A. (2009). Connectionist models of artificial grammar learning: What type of
knowledge is acquired? Psychological Research-Psychologische Forschung, 73, (5),
659-673.
Koelsch, S., Gunter, T., Friederici, A. D. & Schroger, E. (2000). Brain indices of music
processing: "Non-musicians" are musical. Journal of Cognitive Neuroscience, 12, (3),
520-541.
Krashen, S. (1981). Second Language Acquisition and Second Language Learning. London:
Pergamon.
Leung, J. & Williams, J. N. (2006). Implicit learning of form-meaning connections. In R. Sun &
N. Miyake (Eds.), Proceedings of the Annual Meeting of the Cognitive Science Society,
465-470. Mahwah, NJ: Lawrence Erlbaum Associates.
-----. & -----. (forthcoming). The implicit learning of mappings between forms and
contextually-derived meanings. Studies in Second Language Acquisition.
Logan, G. D. & Etherton, J. L. (1994). What is learned during automatization? The role of
attention in constructing an instance. Journal of Experimental Psychology: Learning,
Memory and Cognition, 20, (5), 1022-1050.
Long, M. H. (1996). The role of the linguistic environment in second language acquisition. In W.
C. Ritchie & T. K. Bhatia (Eds.), Handbook of Second Language Acquisition, 413-468. New
York: Academic Press.
Lovibond, P. F. & Shanks, D. R. (2002). The role of awareness in Pavlovian conditioning:
Empirical evidence and theoretical implications. Journal of Experimental Psychology:
Animal Behavior Processes, 28, (1), 3-26.
McClelland, J. & Rumelhart, D. (1985). Distributed memory and the representation of general
and specific information. Journal of Experimental Psychology: General, 114, (2), 159-188.
Pearce, M. T., Ruiz, M. H., Kapasi, S., Wiggins, G. A. & Bhattacharya, J. (2010). Unsupervised
statistical learning underpins computational, behavioural, and neural manifestations of
musical expectation. Neuroimage, 50, (1), 302-313.
Perruchet, P. & Pacteau, C. (1990). Synthetic grammar learning: Implicit rule abstraction or
explicit fragmentary knowledge? Journal of Experimental Psychology: General, 119, (3),
264-275.
Reber, A. S. (1967). Implicit learning of artificial grammars. Journal of Verbal Learning and
Verbal Behavior, 6, (6), 855-863.
-----. & Allen, R. (1978). Analogic and abstraction strategies in synthetic grammar learning: A
functionalist interpretation. Cognition, 6, (3), 189-221.
Rebuschat, P. & Williams, J. N. (2006). Dissociating implicit and explicit learning of natural
language syntax. In R. Sun & N. Miyake (Eds.), Proceedings of the Annual Meeting of the
Cognitive Science Society, 2594. Mahwah, NJ: Lawrence Erlbaum.
-----. & -----. (submitted). Implicit learning of second language syntax.
Reed, N., McLeod, P. & Dienes, Z. (2010). Implicit knowledge and motor skill: What people
who know how to catch don't know. Consciousness and Cognition, 19, (1), 63-76.
Robinson, P. (1996). Learning simple and complex second language rules under implicit,
incidental, rule-search, and instructed conditions. Studies in Second Language Acquisition,
18, (1), 27-67.
Roehr, K. (2008). Linguistic and metalinguistic categories in second language learning.
Cognitive Linguistics, 19, (1), 67-106.
Rosenbach, A. (2008). Animacy and grammatical variation: Findings from English genitive
variation. Lingua, 118, (2), 151-171.
Saito, M. & Fukui, N. (1998). Order in phrase structure and movement. Linguistic Inquiry, 29,
(3), 439-474.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and Second Language
Instruction, 3-32. Cambridge: Cambridge University Press.
Shanks, D. R. (1995). The Psychology of Associative Learning. Cambridge: Cambridge
University Press.
-----. & St. John, M. (1994). Characteristics of dissociable human learning systems. Behavioral
and Brain Sciences, 17, (3), 367-447.
Smith, L. & Yu, C. (2008). Infants rapidly learn word-referent mappings via cross-situational
statistics. Cognition, 106, (3), 1558-1568.
Tunney, R. J. & Altmann, G. T. M. (2001). Two modes of transfer in artificial grammar learning.
Journal of Experimental Psychology: Learning, Memory and Cognition, 27, (3), 614-639.
VanPatten, B. (1996). Input Processing and Grammar Instruction: Theory and Research.
Norwood, New Jersey: Ablex Publishing Corporation.
White, L. (2009). Changing perspectives on universal grammar and the critical period
hypothesis. Second Language, 8, 3-22.
Williams, J. N. (2003). Inducing abstract linguistic representations: Human and connectionist
learning of noun classes. In R. H. van Hout, A. Hulk, F. Kuiken & R. Towell (Eds.), The
Interface between Syntax and the Lexicon in Second Language Acquisition, 151-174.
Amsterdam: John Benjamins.
-----. (2004). Implicit learning of form-meaning connections. In B. VanPatten, J. Williams, S.
Rott & M. Overstreet (Eds.), Form-Meaning Connections in Second Language Acquisition,
203-218. Mahwah, NJ: Lawrence Erlbaum Associates.
-----. (2005). Learning without awareness. Studies in Second Language Acquisition, 27, (2),
269-304.
-----. & Kuribara, C. (2008). Comparing a nativist and emergentist approach to the initial stage
of SLA: An investigation of Japanese scrambling. Lingua, 118, (4), 522-553.