The Lexicon–Syntax Interface in Second Language Acquisition


Language Acquisition & Language Disorders

Volumes in this series provide a forum for research contributing to theories of language acquisition (first and second, child and adult), language learnability, language attrition and language disorders.

Series Editors

Harald Clahsen, University of Essex

Lydia White, McGill University

Editorial Board

Melissa F. Bowerman, Max Planck Institut für Psycholinguistik, Nijmegen

Katherine Demuth, Brown University

Wolfgang U. Dressler, Universität Wien

Nina Hyams, University of California at Los Angeles

Jürgen M. Meisel, Universität Hamburg

William O’Grady, University of Hawaii

Mabel Rice, University of Kansas

Luigi Rizzi, University of Siena

Bonnie D. Schwartz, University of Hawaii at Manoa

Antonella Sorace, University of Edinburgh

Karin Stromswold, Rutgers University

Jürgen Weissenborn, Universität Potsdam

Frank Wijnen, Utrecht University

Volume 30


The Lexicon–Syntax Interface in Second Language Acquisition

Edited by

Roeland van Hout, University of Nijmegen

Aafke Hulk, University of Amsterdam

Folkert Kuiken, University of Amsterdam

Richard Towell, University of Salford

John Benjamins Publishing Company

Amsterdam/Philadelphia

The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data

The lexicon–syntax interface in second language acquisition / edited by Roeland van Hout, Aafke Hulk, Folkert Kuiken and Richard Towell.

p. cm. (Language Acquisition and Language Disorders, ISSN 0925–0123 ; v. 30)

Includes bibliographical references and index.

1. Second language acquisition. 2. Grammar, Comparative and general--Syntax. 3. Lexicology. I. Title: Lexicon-syntax interface in 2nd language acquisition. II. Hout, Roeland van. III. Series.

P118.2.L49 2003

418-dc21 2003051906

ISBN 90 272 2499 4 (Eur.) / 1 58811 418 X (US) (Hb; alk. paper)

© 2003 – John Benjamins B.V.

No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.

John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands

John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA

Table of contents

Acknowledgments

1. Introduction: Second language acquisition research in search of an interface
Richard Towell

2. Locating the source of defective past tense marking in advanced L2 English speakers
Roger Hawkins and Sarah Liszka

3. Perfect projections
Norbert Corver

4. L1 features in the L2 output
Ineke van de Craats

5. Measures of competent gradience
Nigel Duffield

6. Lexical storage and retrieval in bilinguals
Ton Dijkstra

7. Inducing abstract linguistic representations: Human and connectionist learning of noun classes
John N. Williams

8. Neural substrates of representation and processing of a second language
Laura Sabourin and Marco Haverkort

9. Neural basis of lexicon and grammar in L2 acquisition: The convergence hypothesis
David W. Green

10. The interface: Concluding remarks
Roeland van Hout, Aafke Hulk and Folkert Kuiken

Name index

Subject index

Acknowledgments

This volume contains a selection of papers presented at the NWCL/LOT Expert Seminar on ‘The interface between syntax and the lexicon in second language acquisition’, held in Amsterdam on March 30–31, 2001. The seminar was organized by the editors of this volume. We want to thank the participants of the seminar for reviewing the papers submitted for this volume. We are very grateful to the following institutions for financial support: the North West Centre for Linguistics (NWCL), the Landelijke Onderzoekschool Taalwetenschap (LOT; Netherlands Graduate School of Linguistics) and from the University of Amsterdam: the Amsterdam Center for Language and Communication (ACLC), the Chair of Linguistics of the Romance Languages and the Chair of Second Language Acquisition.

February 2003, The editors

<TARGET "tow" DOCINFO AUTHOR "Richard Towell"TITLE "Introduction"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 1

Introduction

Second language acquisition research in search of an interface

Richard Towell, University of Salford

1. Introduction

If it is to attain its eventual goal, second language acquisition research has to integrate the totality of second language acquisition processes. These must include the learning of the core syntax of a second language, the learning of the lexical items and determining the role of the cognitive mechanisms which are necessary for the use of linguistic forms in comprehension and production.

It has been accepted for a long time that these three domains involve different kinds of learning: syntax is learnt through a process of implementing a particular set of universal structures (Chomsky 1986, White 1989); lexis is learnt by establishing a set of arbitrary associations which operate in a given society (Waxman 1996); comprehension and production are reliant on general cognitive procedures (Harley 2001). The learning of syntax is often characterised as a process of triggering (Sakas and J.D. Fodor 2001); the learning of lexis is characterised by the building up of associations (or connections) (Schreuder and Weltens 1993); comprehension and production are learnt by establishing and practising the required procedures (Pinker 1997).

However, these three systems must come together in the creation of a whole linguistic capacity in the mind of an individual. The syntax will govern the structure of the grammar but the lexical items will govern how the structure is implemented. The linguistic knowledge which results from the interaction of these two systems can only develop and then find expression through the cognitive mechanisms associated with language comprehension and production.

The researchers who attempt to provide accounts of the processes and outcomes of second language acquisition (SLA) are generally all too aware, therefore, that they have set themselves an ambitious, interdisciplinary task. Ideally, as a group, they wish to account for all aspects of second language acquisition from the phonetic to the intercultural (see Mitchell and Myles 1998). In particular, they set themselves the task of explaining those factors which have long been recognised as specific to the acquisition of a second or foreign language as opposed to the mother tongue. These are usually identified as transfer or cross-linguistic influence, evidence of a specifiable route of acquisition regardless of first language background, variability (also known as optionality) in the language of individual learners, and incompleteness or fossilisation in the final state of the majority of acquirers (Towell and Hawkins 1994).

Clearly no one researcher could ever hope to deal with all aspects. At different times, the nature of the activities which are being described has led to the involvement of scholars from disciplines ranging from acoustics to anthropology. The central disciplines involved have, however, always been linguistics and psychology. For what now seems a brief period in the 1950s, linguistics and psychology came together to provide what was then thought of as a complete description of what language was and how it was learned: a powerful combination of structuralist linguistics and behaviourist psychology (Gass and Selinker 2001). Unfortunately, this period is only talked about in today’s classes on second language acquisition in order to show how misguided both of these initiatives were, forgetting rather that, despite the radical shifts of views which have followed, these efforts laid the foundations of the disciplines within which we all situate our research.

It is probably true to say that linguistics and psychology began to follow different routes after the devastating criticisms of Skinner’s (1957) Verbal Behaviour put forward by Chomsky (1959), although many psychologists still continued to attempt to interpret transformational theory in psychological terms. These attempts foundered as the derivational theory of complexity (the belief that the more complex transformations were, the longer they would take to process) was denied by linguists. Linguists pointed out that, although terms like ‘least effort’ and ‘economy’ were essential to their endeavours, they were not defined with regard to processing effort but in relation to linguistic simplicity: “Chomsky’s economy principles are unambiguously matters of competence, in that they pertain to representations and derivations internal to the language faculty and exclude relations beyond the interfaces” (Smith 1999:114). Maintaining this position, linguists have gone on to develop their discipline significantly but within the boundaries which they have seen as necessary. They have therefore done so with little reference to any insights from psychology. Mainstream generative linguists have focused on syntax. That mainstream focus provides the background for the articles by Hawkins and Liszka, Corver, Van de Craats and Duffield in this volume.

However, during the same period, the study of language within psychology also made rapid strides in exploring many aspects of psycholinguistics (see Harley 2001). One of these has concentrated on the lexicon, as is demonstrated in the chapters by Dijkstra and Williams in this volume; others have explored issues of how language may be stored in the brain and have made use of imaging techniques to enable us to begin to relate our theoretical analyses to physical realities (Perani et al. 1996, Dewaele 2002). These are represented in this volume by the chapters by Sabourin and Haverkort and by Green. There has been the occasional fruitful interchange in some areas of the discipline from time to time but there has been no real examination of how the two disciplines have evolved with regard to SLA and whether there are more global reasons for looking towards collaboration. There are now signs that both groups of researchers are coming to an understanding that their particular view of the world may not suffice to account for the overall process and that each will have to understand more about what the other knows.

Whilst this book does not pretend to complete that task, it will seek to present current examples of the way linguists think about second language acquisition and of the way researchers working within a more psychological frame of reference think about the same subject in such a way as to show how there is a degree of complementarity in the work being done, even if, at the highest levels of argument, we are unlikely to see a swift return to the unity of view of the 1950s (cf. Smith 1999:174).

The belief that the time is right to seek such complementarity is encouraged by two fairly recent developments. Within linguistics, the advent of the minimalist theory and its consequences for SLA have caused researchers to look again at the relative roles of syntax and lexis. Under the minimalist view which, as Corver demonstrates in chapter three, applies as much to interlanguages as to any other natural languages, syntax is thought to be universal. It is “constituted of invariant principles with options restricted to functional elements and general properties of the lexicon” (Chomsky 1995:170). The invariant nature of the syntax is possible because the functional elements are now seen as part of the lexicon: “It is clear that the lexicon contains substantive elements (nouns, verbs…) … And it is reasonably clear that it contains some functional categories.” (Chomsky 1995:240). More recently, SLA specialists (Hawkins 2001:345, Herschensohn 2000:80) have stated rather more firmly that, under minimalism, functional categories should be seen as part of the lexicon. We will argue below that this change of emphasis (Smith (1999) argues cogently that minimalism is a natural evolution of the generativist enterprise rather than a revolution) may well modify the way in which linguists have to think about development in SLA and indeed about the other central features of SLA research outlined above.

Within psychology, there has been a welcome renewal of interest in how second languages are acquired. Since the mid 1980s, we have seen attempts to account for second language learning drawing on a rich vein of inductive research using computer modelling (Rumelhart and McClelland 1986). More recently, a variety of non-intrusive ways of providing physical evidence of brain processes have become available (Perani 1999). Both linguists and psychologists have become interested in how knowledge is stored in the mind and how it is retrieved from storage. Arguments have been put forward based on a distinction between declarative and procedural memory systems (Towell and Hawkins 1994, Ullman 2001), some of which suggest radical differences in the way first and second languages are acquired and stored in the mind. Most recently, arguments have been put forward to suggest that usage-based analyses can account for all linguistic units: “Psycholinguistic and cognitive linguistic theories of language acquisition hold that all linguistic units are abstracted from language use. In these usage based perspectives, the acquisition of grammar is the piecemeal learning of many thousands of constructions and the frequency-biased abstraction of regularities within them” (Ellis 2002:144). All of this adds to the view that a full account of second language acquisition will require complementary input from both disciplines.

One of the main keys must lie in how we see the concept of development and the psychological mechanisms which underlie development. The efforts of syntacticians focus on describing the syntactic structure which lies behind the interlanguage of the learner. This has always made it difficult for them to account for development (see Gregg 1996): the placing of functional categories within the lexicon makes this difficulty more acute. The invariant syntactic knowledge which learners have is a template present in the mind of the learner which can be modified by the information inserted within it. There cannot be a driving force for development in the syntax. It follows therefore that that driving force really comes from the lexis. However, up to now the learning of lexis has been thought of mainly in terms of one or other form of associationist learning theory, with connectionism being the most powerful. Theorists pursuing this model have tended to argue that connectionist learning can account for the totality of language learning, including the learning of syntax (see quotation from Ellis (2002) above). But syntacticians cannot accept that the sophisticated structures which they observe and which provide no visible clues on the surface structure of the language can be learnt in this empirical fashion. They claim instead that innate knowledge (mediated or not by the L1) must be ‘guiding’ the learning (Hawkins 2001).

It is not clear that this argument can be resolved by theoretical debates between ‘nativists’ and ‘non-nativists’. We need to examine in detail the evidence of how learners acquire a second language. This evidence must come from a variety of sources using a variety of techniques. It will take us into questions of what it is that learners acquire, how they acquire it and how that process modifies their linguistic capacity in both knowledge and use. Hawkins, Corver and Van de Craats in this volume present clear empirical accounts of how specific features of syntactic and lexical knowledge play a fundamental role in second language development. Their accounts cannot, however, tell us everything we need to know about the mental processes involved. Duffield asks fundamental questions about the nature of the competence which is acquired. Dijkstra provides an account of the way in which bilinguals store and access their knowledge. Williams examines the way in which linguistic knowledge may be built up on the basis of distributional evidence. Sabourin and Haverkort and then Green look at the way in which the knowledge may be stored and used. In this way we can see that a full account of the acquisition of a second language involves the three systems outlined at the beginning of this chapter. We will argue that none of the current arguments will suffice alone to account for the total process and that it is essential to attempt to integrate the sources of knowledge available to us (see Jackendoff (2002) for a similarly motivated position).

In this chapter we shall seek first to outline the research context from the generativist point of view, initially in general, and then specifically with regard to the contribution of minimalism. We will then examine in more detail the contribution of psychological research and seek to show how a degree of complementarity may well exist, if there is the will to look for it. In this way, we are seeking to provide a context for the more detailed studies which figure in the rest of the volume by means of which the value of their contribution can be seen against the background of the evolving discipline of SLA research.


2. The linguistics dimension: The generativist research paradigm

In outlining the contribution of generativist research, we will first briefly review the nature of this research paradigm and the reasons why it has become so central to SLA research.

2.1 The generativist position

Noam Chomsky (1986:3) poses three basic questions which linguists need to be able to answer:

1. What constitutes knowledge of language?
2. How is language acquired?
3. How is knowledge of language put to use?

He and his followers have in effect concentrated on the first question. It is essential to, but separable from, the understanding of the other two. Indeed, Chomsky argues that in order to obtain a proper answer to this question it is necessary to idealise the data to be examined away from issues of performance so that the researcher can gain insight into the abstract knowledge which the native speaker of a language possesses, i.e. that person’s linguistic competence. Furthermore, the generativist position adopts a modular view of the mind in which the child possesses an innate language faculty. This is conceived of as separate from those parts of the mind which are devoted to general cognitive skills associated with the processing of information (perception, comprehension, production) and memory. It is argued that, because linguistic structure is universal and is not signalled overtly on the surface of any of the languages of which it is a manifestation, it is not possible for a child to acquire knowledge of language (in the sense of syntactic competence) on the basis of exposure to surface cues alone. This is frequently referred to as the logical problem of language acquisition. Surface cues are necessary to provide an indication for the child as to which of the limited number of possible languages he or she is confronted with, but they could never be sufficient to provide knowledge of the kind of organisation which is present in the syntactic structure of language. As each child acquires this knowledge with no conscious effort, no explicit instruction, following a regular pattern of acquisition not reflected in the data to which the child is exposed and without making the mistakes which piecemeal learning would imply, generativists conclude that linguistic knowledge is a biological, innate endowment for humankind. This endowment is what enables the child to know more than the surface of the language reveals: the surface forms act only as a trigger for the underlying knowledge which the child already possesses.

It is important to highlight four significant aspects of this theoretical position as these will give rise to further comment below and will be dealt with in subsequent chapters. First, a generativist approach involves idealisation of the data to be examined. For the study of adult competence in the mother tongue the researcher can frequently consult his or her own linguistic knowledge as a representative sample of the idealised speech community. This is not possible in second language research. SLA research requires data gathering methods which can isolate linguistic competence from performance factors. Second, the primacy of syntax within the generative paradigm has led to a separation between syntax and semantics. This is not without its problems as more and more researchers are finding that semantic factors influence syntactic phenomena (Juffs 1998, 2001). Duffield in this volume is concerned with how competence can be successfully defined within this paradigm and proposes that it is necessary to conceive of competence at two levels, one of which is more related to the surface structures of language. Third, the conception of the acquisition of syntactic knowledge through a process of triggering has given rise to debate (Lightfoot 1993, Carroll 2000, Sakas and J.D. Fodor 2001). Hawkins, Corver and Van de Craats, through an examination of the acquisition of syntactic and lexical features, define more clearly the nature of the features which have to be learnt and discuss the role of the L1 in providing the initial knowledge. This might provide a more satisfactory conceptual basis at least with regard to the initial state. It does, however, leave open the question of how the learners use empirical evidence to move to subsequent states. Fourth, the generativist position for SLA has to adapt to the fact that a second language acquirer has already learnt one language. The issue of how learners transfer or access universal knowledge in cases where one language has already been learnt is one which may have to be looked at again within the minimalist paradigm. This is an issue for Hawkins and Liszka, for Corver, and for Van de Craats.

2.2 Generativist second language research

The particular strength of this approach has been in providing syntactic analyses within a theoretical framework. This enabled SLA researchers to predict in a precise way what learners needed to acquire in order to develop their interlanguage system. During the 1990s, this was manifested mainly through analyses presented within the conceptual basis provided by parameter (re-)setting (Flynn 1987). In this framework, languages could be compared on the basis of a single underlying syntactic phenomenon which was independently theoretically motivated by Universal Grammar (UG). This underlying syntactic phenomenon would have several surface expressions not linked together by other linguistic theories. Clear statements could then be made about what learners needed to do in order to re-set their existing parameters to the setting needed for their second language. Clear predictions could also be made about what the learner language would look like if the re-setting took place and what it would look like if it did not. To give two examples: the pro-drop parameter contrasted the presence or absence in the learner’s interlanguage grammar of such diverse phenomena as null subjects, expletive ‘it’ and ‘there’, the permissibility of subject-verb inversion and the possibility of extracting a wh-subject across an overt complementiser (that-trace filter) in languages such as English compared with Spanish or Italian. The verb raising parameter linked differences in adverb placement, negation and the use of quantifiers in French and English. The hypothesis in both cases was that exposure to the second language would enable learners to re-set the parameter through triggering and that researchers would be able to measure the differences in the learners’ linguistic competence at different points in time. Thus, if a native speaker of English acquiring Spanish triggered the pro-drop parameter to the Spanish setting, that person would immediately acquire knowledge of all the elements linked in the parameter. If a native speaker of French learning English re-set the verb-raising parameter, issues to do with adverb placement, the position of negatives and of floated quantifiers would be solved at the same time.

Whilst this research provided a very positive move forward in SLA work, the empirical evidence frequently did not bear out the view that parameter re-setting was the essential process in the second language learning of syntax. Many studies (Hawkins, Towell and Bazergui 1993, White 1991) showed what might be called partial parameter re-setting, in the sense that some of the elements identified were learnt together but others were not. Whilst learners did not produce ‘wild grammars’, i.e. grammars which fell outside the constraints of linguistic theory, they could not be seen simply to re-set a parameter. This called into question either the nature of the underlying linguistic definition of the parameter or the process involved. There was also some uncertainty about what might constitute a trigger: would learning any form which was a surface manifestation of one element of the parameter trigger the other forms or was one form particularly privileged to act as a trigger?


2.2.1 The minimalist perspective

There are several important differences between the principles and parameters (P and P) model of syntax and the minimalist version. The most important one is the claim that the syntax is invariant and that the morpholexical system is the source of all variation. This has important implications for second language researchers. Herschensohn (1999) argues convincingly that minimalism should be better able to account for all of the main features of second language acquisition as defined above. It will be better at dealing with the area where the generative approach has always been most successful, viz. transfer, but it should also be in a position to give a better account of the route of learning, incompleteness and variability (optionality). The problem with the P and P model was that it was all or nothing: either the parameter had been re-set and all features fell into place or it had not and they did not. As pointed out above, investigations based on this theory tended to find that partial re-setting took place, but the theory itself could not account for ‘partiality’ given that the re-setting process was one of ‘switch-flipping’. The notion contained within minimalism that acquisition proceeds more through “the gradual building of L2 grammar through the control of morpholexical constructions” (Herschensohn 1999:81) allows for learners to be aware of the need to apply certain features or categories in some circumstances but not others.

Hawkins (2001) and Hawkins and Liszka in this volume would probably agree that the minimalist approach opens the door to a more satisfactory account of variability (optionality) and incompleteness, but they base their arguments more on the presence or absence of features in the functional categories of the L2 than on the build up of constructions. Hawkins and Liszka also set out a specific view on the relationship between the L1 and the L2. They show that Chinese learners learning English do not mark tense consistently. Having investigated and rejected a range of alternative proposals, Hawkins and Liszka argue that this is likely to be because Chinese learners are unable to establish that the functional category English T is specified for +/- past. They suggest, furthermore, that this feature is not available to the learners because it does not exist in their first language. They claim that “where parametrised syntactic features are not present in a speaker’s L1, they will not be accessible in later L2 acquisition”. Such a point of view, if substantiated, would argue against full transfer from the L1 and would substantiate the partial access hypothesis. In the presentation of the article, Hawkins and Liszka contrast their account with that of Lardiere, who adopts a full access point of view. Lardiere’s explanation is that the evidence suggests a failure of mapping from one component within the language faculty to another. Hawkins and Liszka feel that the evidence from Chinese learners argues more in favour of a partial transfer perspective.

If Hawkins and Liszka are correct in their analysis, there are considerable consequences for acquisition in a more general sense. If L2 learners really cannot provide a feature based analysis for this part of their interlanguage syntax, they will have to learn the required forms (and store them) in another way. Whilst the discussion provided by Hawkins and Liszka remains within the framework of the role of features within generative syntax, the discussion of the alternative strategies available to learners provides food for thought as to whether the learning of the non-integrated forms must then be carried out in a different way, e.g. stored as declarative as opposed to procedural knowledge as discussed in the article by Green.

The notion of partial transfer also contrasts with the articles by Corver and Van de Craats. They allow for the full transfer of both lexical and syntactic features from the L1 to the L2. Their argument is that the full transfer of the L1 features into the L2 provides the starting point for learners beginning to acquire an L2. This is combined with a conservative strategy, i.e. one in which the learners maintain those features unless and until they perceive the need to modify them. The learners create a series of interlanguages, all of which remain within the constraints of UG and thus provide checkable and interpretable information for the internal and external interfaces. As they progressively modify the features in response to the differences which they perceive on the basis of the input they receive from speakers of their ‘target’ L2, their interlanguages move towards a point where their interlingual ‘perfect’ systems correspond more to the natural language used by the L2 speakers.

The above comments should serve to show that the generativist perspective on second language acquisition has the power to create well-defined hypotheses about the nature of language and to turn these into clearly defined investigative strategies. The shift to minimalism means, however, that the learning of items in the lexicon is potentially more significant than it was previously. The evidence from the Van de Craats article in particular shows how learners progressively revise their featural specification of both functional and lexical categories. As these are both part of the lexicon, it is clear that, from the minimalist perspective, the way in which second language learners come to revise their lexical forms, especially those which are linked to functional categories, is now a more important issue. We have suggested above that it provides the driving force for acquisition. The process of coming to know that forms and features need to be revised, however, necessarily involves comparative perception of language forms. How the learners trigger their knowledge or how they perceive the differences is something which cannot easily be dealt with within generative linguistics as currently defined, because it is more related to performance than competence. As soon as we mention perception we need to return to the psychological dimension and look again at second language acquisition from that perspective.

3. The psychological dimension

It is probably as well to recognise immediately that the issues which divided linguists and psychologists fifty years ago have not gone away. Mainstream linguists still work within a rationalist framework which contrasts with the psychologists’ emphasis on empiricism. Linguists tend to reason on a top-down basis; psychologists base their theories on bottom-up evidence. Linguists believe in an innate, biological endowment specific to language; psychologists believe that language learning is one manifestation of cognition amongst others. Linguists believe that language is a symbolic system; some psychologists, at least, believe that it can be accounted for without the use of symbols. Linguists tend to reject computer modelling; many psychologists rely on computer modelling.

Despite these differences, the argument that is being developed in this chapter (and which is the justification for this book) is that these differences actually provide us with perspectives which are complementary rather than contradictory, if we choose to look for the areas where insights from one field can contribute to the other (see Hulstijn 2002 for a similarly motivated view). Indeed, the fact that each field has excluded the domains covered by the other discipline surely leads to a position where significant aspects of the total process of SLA as described in Section 1 cannot be dealt with except by reference to the other discipline.

It is therefore important to look more closely at the methods and results of psycholinguistic research in order to establish where the complementarity may lie. In order to do so we will now situate the articles in the book which have adopted a psychological reference point within the development of the methodologies used in psycholinguistics.

Psychologists interested in language have made use of a variety of methods, many of which are shared with other branches of cognitive psychology. The three most important of these are experimental investigations which rely on the measurement of reaction times against a theoretically predicted outcome; the use of physical measurements of brain activity; and computer simulations of speaker or learner behaviour. Psychologists apply these methods to a variety of L1 and L2 speakers and to patients whose language ability is impaired in some way, e.g. aphasics. All of these are represented in papers in this volume.

Dijkstra’s article provides an excellent illustration of two out of the three methodologies typically encountered in the psycholinguistic literature. The problem space he addresses is whether bilinguals in possession of words from different languages (under his definition this includes language learners as ‘unbalanced bilinguals’) access both sets automatically in response to a given stimulus or whether one or the other is primed by different contexts or presentation methods. His stimuli include words which are cognates (similar in both form and meaning in the two languages), homographs (similar in orthographic form but not in meaning) and homophones (similar in sound but not in orthographic form or meaning). In a series of experiments Dijkstra and his colleagues have shown that bilinguals cannot do otherwise than access the items in both of their languages when presented with an applicable stimulus of isolated words (they call this nonselective access). They have also shown that frequency of use is the main determinant: highly automatised L1 words are accessed more swiftly than L2 words. But they have also shown that both items remain activated for a relatively long time before language selection takes place. Dijkstra also reports on a (limited) number of studies which examine words in sentential contexts by the measurement of Event Related brain Potentials (ERPs) through EEGs. The results have suggested that there are significant differences in the way bilinguals process their second language. Those related to semantic aspects seem quantitative in nature whilst those related to the syntax have a qualitative dimension as well, showing differences amongst early and late second language learners. Such studies, if replicated, could have considerable impact on the critical age hypothesis which is essential to some of the generative hypotheses, such as the general blocking principle discussed by Hawkins and Liszka.
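The nonselective access account summarised above can be made concrete with a minimal sketch in Python. It is an illustration only and not Dijkstra's own model: the lexical entries, the frequency values and the function names `access` and `select` are hypothetical. The sketch assumes that a stimulus first activates every matching entry irrespective of language, that more frequently used entries are more strongly activated, and that language selection applies only afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    form: str         # orthographic form presented as the stimulus
    language: str     # language the entry belongs to
    meaning: str
    frequency: float  # hypothetical relative frequency of use (0 to 1)


# Hypothetical lexicon of a Dutch (L1) / English (L2) bilingual.
# "room" stands in for a homograph, "film" for a cognate.
LEXICON = [
    LexicalEntry("room", "English", "space in a building", 0.4),
    LexicalEntry("room", "Dutch", "cream", 0.9),
    LexicalEntry("film", "English", "movie", 0.5),
    LexicalEntry("film", "Dutch", "movie", 0.8),
    LexicalEntry("fiets", "Dutch", "bicycle", 0.9),
]


def access(stimulus, lexicon=LEXICON):
    """Activate every entry matching the stimulus, whatever its language.

    Entries are returned most active first; activation is assumed here
    to be a direct function of frequency of use.
    """
    candidates = [entry for entry in lexicon if entry.form == stimulus]
    return sorted(candidates, key=lambda entry: entry.frequency, reverse=True)


def select(stimulus, target_language, lexicon=LEXICON):
    """Language selection applies only after all candidates are active."""
    for entry in access(stimulus, lexicon):
        if entry.language == target_language:
            return entry
    return None


for candidate in access("room"):
    print(candidate.language, candidate.meaning, candidate.frequency)
```

In this sketch both readings of the homograph become active for the same stimulus, and the more frequently used L1 entry is ranked first even when the English reading is the one finally selected.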

The study by Williams in this volume makes use at least in part of the third methodology regularly exploited by psycholinguists: computer modelling. Psychologists from Winograd (1972) onwards have been keen to exploit the processing abilities of computers for the purposes of modelling human behaviour. The most recent manifestation is connectionism which has been extensively (and controversially) used to model language and language learning. Connectionism (Bechtel and Abrahamsen 1991) attempts to show that many apparently complex processes can, in fact, be accounted for in a relatively simple way as long as the processing involved can be very large in quantitative terms and can operate in parallel (it is also known as parallel distributed processing or PDP). The basic idea is to have (a large number of) processing units which feed into one another at several levels (some of which are ‘hidden’) in sequence. The units involved can be given levels of activation or ‘weighting’ prior to any simulation. These may be random or specified in relation to the outcome envisaged. This will vary according to whether the model is being used simply to replicate what is assumed to happen in certain forms of processing or whether it is intended to model a developmental process (such as language learning). In the latter case, random levels will be initially assigned and it will be intended that during the simulation the model will ‘learn’ new levels of activation. It is hoped that these levels of activation will eventually represent some kind of reality. Put (over-)simply, the lowest level or input units are then given different levels of activation and these levels feed through to the higher levels. Where the interaction with higher levels is ‘facilitated’ because certain units at those levels already have positive activation (excitation), the signal transmitted will be strengthened. Where the level of input activation encounters negative activation at a higher level, it will be ‘inhibited’ and, in the longer term, weakened. Over very many trials, the simulation establishes a stable level of output activation which is in essence derived through parallel processing from the relative frequency of the input it has received. This is arrived at through differentiated interaction between signals at the intermediate levels on the basis of the processes of strengthening and weakening.

In a simulated task learning context, models can be modified in such a way as to include information about the extent to which the model is performing as it should in relation to some target (back-propagation). This means that the model can be trained to move nearer over time to a specified target. Researchers can then see how the model responds to the information it has been given and if and how it can ‘progress’ towards the target. It should be noted that this model only has one kind of unit which is linked to the signal strength of the connections: there is no symbolic level within connectionist models. The researchers can inspect the intermediate levels to see what activation levels the model produced at different times; they will be aware of different stages of ‘learning’ and they will see the extent to which the desired outcome is obtained and the relation with the input given.
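The kind of network described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the architecture used by Williams or by the PDP group: the layer sizes, the sigmoid activation function, the learning rate, the number of training sweeps and the four training patterns are all hypothetical. The sketch shows random initial connection weights, activation feeding forward through one hidden level, and back-propagation of the error relative to a target.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Input units -> one level of hidden units -> a single output unit."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Random initial connection weights; the last weight in each row is a bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        # Activation feeds from the input units up through the hidden units.
        hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs + [1.0])))
                  for row in self.w_hidden]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_step(self, inputs, target, rate=0.5):
        hidden, output = self.forward(inputs)
        # Error signal at the output unit, propagated back to the hidden units.
        delta_out = (target - output) * output * (1.0 - output)
        delta_hidden = [delta_out * self.w_out[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(len(hidden))]
        # Strengthen or weaken each connection according to its share of the error.
        for j, h in enumerate(hidden + [1.0]):
            self.w_out[j] += rate * delta_out * h
        for j, row in enumerate(self.w_hidden):
            for i, x in enumerate(inputs + [1.0]):
                row[i] += rate * delta_hidden[j] * x

    def error(self, patterns):
        # Mean squared distance between output activation and target.
        return sum((t - self.forward(x)[1]) ** 2 for x, t in patterns) / len(patterns)


# Hypothetical training set: the target is 1 whenever the first cue is present.
PATTERNS = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

net = TinyNetwork(n_in=2, n_hidden=2)
before = net.error(PATTERNS)
for _ in range(2000):
    for inputs, target in PATTERNS:
        net.train_step(inputs, target)
after = net.error(PATTERNS)
print("error before training:", before)
print("error after training:", after)
```

Nothing in the network is a symbol or a rule: what it has 'learnt' is held entirely in the connection weights, which can be inspected at any stage of training.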

There is no doubt that theorising alone cannot discover the effect of enormous quantities of parallel processing and the computer is very efficient at examining this effect. The sixty-four thousand dollar question, however, is whether such modelling really reflects anything which goes on in the human brain.

Those who favour this kind of modelling argue that it parallels human learning, that it can generalise beyond the data set which it is given and that it can produce ‘knowledge’ which is equivalent to what other approaches consider to be ‘abstract’ knowledge. It then follows naturally that proponents of this work see no need for abstract ‘symbolic’ categories in the mind because, from their point of view, these are ‘only’ the outcomes of the parallel processing described above, not primitive units.

Numerous articles in the mid 1980s emanating from the PDP group led by Rumelhart and McClelland (Rumelhart, McClelland and the PDP Research Group 1986) made very strong claims about the way in which their network could parallel human language learning. These drew a detailed response, notably from Pinker and Prince (1988), in which the claims were thrown into question. As Fodor and Pylyshyn put it: “… Pinker and Prince argue (in effect) that more must be going on in learning past tense morphology than merely estimating correlations since the statistical hypothesis provides neither a close fit to the ontogenic data nor a plausible account of the adult data on which the ontogenic processes converge. It seems to us that Pinker and Prince have, by quite a lot, the best of this argument.” (Fodor and Pylyshyn 1988:68). Connectionists have, however, gone on to refine their models and to enable them to call on other sets of information through which they have renewed their claim to model language.

These are the issues which are interestingly explored in the study by Williams. He is critically interested in whether the learning which can be demonstrated by computer modelling really models that of humans or not. To investigate this, he combines computer-based simulations with experimental investigations on humans. The context of learning is that of the specification of abstract noun classes on the basis of a gender type classification. We already know that there is considerable evidence available to show that second language learners fail to assign gender in the consistent way that native speakers do. There are two central issues. The first is whether computer based inductive processes can genuinely be said to have gone beyond exemplar based generalisations to the creation of abstract classes. The second is whether what computers do and what humans do is in fact similar. Computers may very well demonstrate a remarkable ability to generalise from distributional examples, but do humans do the same? Do they rely on many examples to induce classes or do they induce classes in other ways, such as by making use of other clues, e.g. animacy?

Williams’ first simulation of the training kind with feedback seemed to show that the computer-based learning could generate productive knowledge of noun classes which enabled the network to generalise beyond the trained exemplars. It could behave as if it had formed abstract representations. His second simulation, which did not involve feedback in the same way, did not learn as well. Humans who learnt to classify the data into noun classes at 66% or above seemed to be using conscious explicit strategies, which clearly were not available to the computers, and/or to be influenced by prior knowledge of gender languages, equally not true of the computers. The conclusions which may be drawn are not simple: it seems as if some aspects of human behaviour may be similar to the inductive generalising of computers but that humans make use of other devices as well. Once feedback is introduced computer learning is considerably more powerful, but then it is probably more powerful than the human mind.

Green, and Sabourin and Haverkort, take us into yet other areas of psycholinguistic research. Their concern is with how linguistic knowledge may be represented and stored in the mind. They share many reference points and to some extent a methodology. They both discuss evidence based on aphasics; they both rely on physiological evidence from ERPs to confirm or deny what other sources have indicated. Their conclusions appear to be slightly different.

The broader argument which this research addresses concerns whether or not L1 and L2 learners acquire and store language in the same way. Virtually all researchers acknowledge the differences in learning environment for many L2 learners: they are generally older (beyond the so-called critical age) with fully developed memory and cognitive systems; they are often literate; they are already in possession of a first language and they are often exposed to explanations about language in classrooms as well as to language forms in more or less ‘authentic’ contexts of use. A key question has always been: do these differences mean that they will learn in a different way? Those who argue that they do suggest that they rely more on ‘explicit’ learning and that this will be reflected in the way their knowledge is stored in the mind. A separation is often made between ‘declarative’ knowledge, sometimes glossed as ‘knowing that’, the kind of knowledge which can be consciously accessed and articulated, such as a rule of grammar, and ‘procedural’ knowledge, sometimes glossed as ‘knowing how’, i.e. the kind of knowledge which underlies skill activity, such as riding a bicycle, which cannot be consciously accessed. The two forms of knowledge are said to be acquired in different ways and stored in two different memories: a declarative memory and a procedural memory, each of which is accessed in a different way (Anderson 1983, 1993, 2000). Declarative knowledge is acquired explicitly, consciously and quickly but cannot be used swiftly as a basis for any skill based action. Procedural knowledge is acquired implicitly, unconsciously and slowly by dint of a lot of practice. It is available swiftly in response to an appropriate stimulus. The argument has been put that L1 acquisition may be implicit and procedural and L2 acquisition may be explicit and declarative, although most researchers who discuss this issue allow for some overlap and some movement between categories over time as learners become more proficient and make their knowledge more ‘automatic’. Researchers such as Paradis (1997), however, claim categorically that explicit knowledge cannot become implicit.

Green’s article questions the necessity for a separation between declarative and procedural memory and queries the evidence from aphasics on which he believes it is based. Computer modelling has indicated that the data derived from the aphasic experiments does not depend on having two memories. He therefore argues that it is worth looking at physiological evidence to see whether it confirms the necessity of two memory systems and whether there is evidence to suggest that L2 learners store knowledge in this differentiated way. His interpretation of the available evidence suggests that there is no difference for proficient L2 learners and he suggests that if there is a difference in the early stages of learning it soon disappears. Only longitudinal studies which included physiological studies could settle this argument conclusively.

Sabourin and Haverkort do not refer explicitly to the two kinds of knowledge or the two memory systems outlined above but they do argue in favour of a clearer separation between the representation of linguistic knowledge and the representation of the knowledge which lies behind linguistic processing ability. This is because they believe that the empirical evidence that they have gathered by comparing the results obtained when proficient L2 learners are required to undertake the same grammaticality judgement test ‘off-line’ and ‘on-line’ shows that the knowledge base is different. The results of the ‘off-line’ task show no difference between the advanced learners and native speakers but the ‘on-line’ results do. When learners do a grammaticality judgement test in a paper and pencil way, they score as highly as native speakers. But when their performance on the same task is measured through ERPs, it is revealed that they are not responding in the same way. For Sabourin and Haverkort, this suggests that they are not accessing the same knowledge even though the observed outcome of correct answers is the same.

Green therefore argues that the underlying knowledge base for advanced proficient learners is likely to be the same but Sabourin and Haverkort argue that it is likely to be different. Both agree that more research is needed.

4. In what ways can the linguistic and psychological perspectives be seen to be complementary?

In this final section of the chapter, an attempt is made to build on the account given so far and to draw out some of the central themes which are treated in the following chapters. At the level of principle, as pointed out briefly in Section 3 above, there seems to be nothing but contradiction in the present stance of the founding disciplines of psycho-linguistics. And yet at a more pragmatic level, the accounts of the research outlined above suggest that the two disciplines may need each other rather more than they are prepared to admit. We will briefly explore this issue by looking at how interlanguages are created and how they develop, bearing in mind the evidence presented in the various articles.

Let us start with some notion of the initial state of knowledge for second language learners. This has to be defined for us by the linguists. In their chapters they argue cogently that learners transfer the features of lexical and functional categories from the L1 in ways which make up an operational interlanguage. Those features which are not available via transfer may be available through direct access to UG. In those many cases where the languages differ, the features are not combined in the same ‘bundles’ as those which are used by L2 speakers. The task for the learners is then to modify the relationship between features and forms in such a way as to create over time combinations which correspond more to the ‘bundles’ used by speakers of the L2. If they can create those bundles in an appropriate manner, Universal Grammar will ensure that they are interpretable by other cognitive systems. There are arguments about the extent to which the knowledge will transfer, and about the nature of competence, but the line of the argument is clear.

The next question for second language acquisition researchers is how and why the necessary modification takes place. The linguist’s answer is that the learner must in some way come to know (but not in a conscious way) that the bundles of features in the interlanguage are not adequate to the purpose of full communication with native speakers in the L2. Once that unconscious realisation has taken place, the learner must then have a way of revising the existing feature bundles to make them correspond more to the L2 bundles. This is said by the linguists to happen implicitly and without conscious feedback. As was noted above, the term triggering is one which is frequently used in the literature but it is becoming more and more difficult to accept that what must be a complex process can adequately be summed up by that term. At this point we really have to turn to the psychologists to gain some insight into what may be happening.

Dijkstra’s article makes it very clear that when possessors of two languages hear a given stimulus they activate all the relevant linguistic knowledge they have without separating it out into L1 and L2. All the activated forms have the possibility of giving rise to overt production: there is a selection process which determines which will actually be produced. The level of activation relates very much to frequency of use. This suggests that when second language learners acquire new forms, the new forms will compete with the existing forms. There will be differences in activation depending on the context when dealing with words in utterances (as opposed to the isolated words mostly studied) but the principle will hold that the ability to use second language forms accurately in fluent utterances will depend at least to some degree on the frequency of use experienced by the user.

This raises interesting questions about how the activation of the interlingual forms for the purposes of communication in the interlanguage permits modification of the interlanguage system. If the effect of use is merely to strengthen the connections, as is implied by the non symbolic connectionist models explored by Williams and implied in the work of Dijkstra, then how can use give rise to modification of the forms? How also will they overcome the competition of existing forms?

Sabourin and Haverkort suggest that whilst advanced L2 users may display the same knowledge in grammaticality judgement tests, the knowledge that lies behind their use of the language is not the same as that which lies behind the use in language production. Williams in his discussion of the comparative learning by humans and computers points out that humans appear to be influenced by factors other than the purely distributional. Is it possible that second language learners do indeed have some differential representation which allows them not to have to rely purely on the distributional analysis? Or is there another more symbolic version of implicit learning which could account for modification rather than strengthening?

Green in his discussion of the storage of linguistic knowledge raises issues about the relative contribution of declarative and procedural knowledge. Within that area of reference, there are interesting questions about how learners deal with those lexical and functional categories which are not fully integrated within the syntax, as must be the case for interlingual systems which have not yet fully developed the syntactic system. We have seen that Hawkins and Liszka take the view that Chinese learners cannot integrate +/- past in the T category of their interlanguage syntax. They nonetheless produce correct past tense forms of regular verbs for at least some of the time. Where and how are these forms stored? If they are not generated by a function of the syntax, they must be stored as separate lexical items. This immediately opens the door to the notion that there must be a difference between the proportion of knowledge which is stored in the lexicon of the L1 and the L2 and in interlanguages: it would seem probable that in the early stages of learning at least second language learners must store a large proportion of the forms they have learnt as new lexical items and only work out later how they may be part of the syntax.

Whilst there are no clear answers which immediately fall out of the existing state of knowledge, it should be evident that a combination of the insights of linguists and of psychologists will be required to answer these questions properly.

References

Anderson, J.R. 1983. The architecture of cognition. Cambridge, Mass.: Harvard University Press.

Anderson, J.R. 1993. Rules of the mind. New Jersey: Lawrence Erlbaum.

Anderson, J.R. 2000. Learning and memory. New York: John Wiley.

Bechtel, W. and Abrahamsen, A. 1991. Connectionism and the mind. Oxford: Blackwell.

Carroll, S.E. 2000. Input and evidence. Amsterdam: John Benjamins.

Chomsky, N. 1959. “Review of Skinner 1957”. Language 35: 26–58.

Chomsky, N. 1986. Knowledge of language. New York: Praeger.

Chomsky, N. 1995. The minimalist program. Cambridge, Mass.: MIT Press.

Dewaele, J.M. 2002. “Individual differences in L2 fluency: The effect of neurobiological correlates”. In Portraits of the L2 user, V. Cook (ed.), Clevedon: Multilingual Matters.

Ellis, N. 2002. “Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition”. Studies in Second Language Acquisition 24 (2): 143–189.

Flynn, S. 1987. A parameter-setting model of L2 acquisition. Dordrecht: Reidel.

Fodor, J.A. and Pylyshyn, Z.W. 1988. “Connectionism and cognitive architecture”. In Connections and symbols, Special Edition of Cognition, S. Pinker and J. Mehler (eds), 3–73. Cambridge, Mass.: MIT Press.

Gass, S. and Selinker, L. 2001. Second language acquisition. New Jersey: Lawrence Erlbaum.

Gregg, K. 1996. “The logical and developmental problems of second language acquisition”. In Handbook of second language acquisition, W. Ritchie and T. Bhatia (eds), San Diego: Academic Press.

Harley, T. 2001. The psychology of language. Hove: Psychology Press.

Hawkins, R. 2001. Second language syntax. Oxford: Blackwell.

Hawkins, R., Towell, R. and Bazergui, N. 1993. “Universal Grammar and the acquisition of French verb movement by native speakers of English”. Second Language Research 9: 189–233.

Herschensohn, J. 1999. The second time around: Minimalism and L2 acquisition. Amsterdam: John Benjamins.

Hulstijn, J. 2002. “Towards a unified account of the representation, processing and acquisition of second language knowledge”. Second Language Research 18 (3): 193–224.

Jackendoff, R. 2002. Foundations of language. Oxford: Oxford University Press.

Juffs, A. 1998. “The acquisition of semantics-syntax correspondences and verb frequencies in ESL materials”. Language Teaching Research 2: 93–123.

Juffs, A. 2001. “Verb classes, event structure, and second language learners’ knowledge of semantics-syntax correspondences”. Studies in Second Language Acquisition 23: 305–313.

Lightfoot, D. 1993. How to set parameters. Cambridge, Mass.: MIT Press.

Mitchell, R. and Myles, F. 1998. Second language acquisition theories. London: Arnold.

Paradis, M. 1997. “The cognitive neuropsychology of bilingualism”. In Tutorials in bilingualism. Psycholinguistic perspectives, A.M.B. de Groot and J.F. Kroll (eds), New Jersey: Lawrence Erlbaum.

Perani, D. 1999. “The functional basis of memory: PET mapping of the memory systems in humans”. In Cognitive neuroscience of memory, L.G. Nilsson and H.J. Markovitsch (eds), 55–78. Seattle: Hogrefe and Huber.

Perani, D., Dehaene, S., Grassi, F., Cohen, L., Cappa, S.F. and Dupoux, E. 1996. “Brain processing of native and foreign languages”. Neuroreport 7: 2439–2444.

Pinker, S. 1997. How the mind works. Harmondsworth: Penguin.

Pinker, S. and Prince, A. 1988. “On language and connectionism. Analysis of a parallel distributed processing model of language acquisition”. Cognition 28: 73–195.

Rumelhart, D. and McClelland, J. 1986. “On learning the past tense of English verbs”. In Parallel distributed processing: Vol 1. Foundations, D. Rumelhart, J. McClelland and the PDP Research Group 1986. Cambridge, Mass.: MIT Press.

Rumelhart, D., McClelland, J. and the PDP Research Group. 1986. Parallel distributed processing: Vol 1. Foundations. Cambridge, Mass.: MIT Press.

Sakas, W.G. and Fodor, J.D. 2001. “The structural triggers learner”. In Language acquisition and learnability, S. Bertolo (ed.), 172–234. Cambridge: Cambridge University Press.

Schreuder, R. and Weltens, B. (eds). 1993. The bilingual lexicon. Amsterdam: John Benjamins.

Smith, N. 1999. Chomsky: Ideas and ideals. Cambridge: Cambridge University Press.

Skinner, B.F. 1957. Verbal behaviour. New York: Appleton Century Crofts.

Towell, R. and Hawkins, R. 1994. Approaches to second language acquisition. Clevedon: Multilingual Matters.

Ullman, M. 2001. “The neural basis of lexicon and grammar in first and second language: The declarative/procedural model”. Bilingualism: Language and Cognition 4: 105–122.

Waxman, S. 1996. “The development of an appreciation of specific linkages between linguistic and conceptual organisation”. In The acquisition of the lexicon, L. Gleitman and B. Landau (eds), Cambridge, Mass.: MIT Press.

White, L. 1989. Universal Grammar and second language acquisition. Amsterdam: John Benjamins.

White, L. 1991. “Adverb placement in second language acquisition: Some effects of negative evidence in the classroom”. Second Language Research 7: 133–61.

Winograd, T. 1972. Understanding natural language. New York: Academic Press.

</TARGET "tow">

<TARGET "haw" DOCINFO AUTHOR "Roger Hawkins and Sarah Liszka"TITLE "Locating the source of defective past tense marking in advanced L2 English speakers"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 2

Locating the source of defective past tense marking in advanced L2 English speakers*

<LINK "haw-n*">

Roger Hawkins and Sarah Liszka

University of Essex

1. Introduction

It is well-known that advanced L2 speakers of English from certain L1 backgrounds show persistent optionality in marking thematic verbs for simple past tense in spontaneous oral production, as for example in ‘The police caught the man and take him away’. Speakers whose L1 is Chinese appear to be such a group. Bayley (1991, 1996) found the phenomenon sufficiently robust in Chinese speakers to undertake a variationist analysis of the factors which might be causing it. Wolfram and Hatfield (1984) had found similar optionality in the L2 English of Vietnamese speakers. More recently Lardiere (1998a, 1998b, 2000) has reported remarkably consistent optionality in simple past tense marking in a near-native speaker of English sampled with an eight-and-a-half year interval. Patty, a native speaker of Mandarin and Hokkien, marked simple past tense on thematic verbs around only one-third of the time in data collected from spontaneous speech. Native speakers, by contrast, do not apparently show endemic optionality of the same kind, although failure to inflect a verb for past tense is found sporadically in ‘slips of the tongue’ (Fromkin 1988).1

An important question for theories of SLA which assume that the mental grammars of individual L2 speakers are derived from Universal Grammar (UG) is why such optionality might exist in advanced/near-native speakers. Given that the morphophonology of forms in English provides clear positive evidence that past tense is marked, it is unexpected that advanced/near-native speakers should continue to have problems with it. Locating the source of the difficulty would be a small contribution to the broader goal of determining exactly how the language faculty is involved in the construction of grammatical knowledge by older L2 learners.

Beck (1997) has argued that the kind of optionality in question is unlikely to be the result of a deficit in the component of the language faculty which generates inflected phonological word forms: morphology. Beck compared the reaction times of 31 non-native speakers from a variety of L1 backgrounds with those of 32 natives on a task requiring the production of past-inflected verb forms. Speakers were presented with verb stems on a computer screen, and required to produce the simple past tense form orally, which activated a timing device. In previous studies with natives (e.g. Prasada, Pinker and Snyder 1990) it had been found that irregular verb forms show a frequency effect: the more frequent the stem of the verb in question (where frequency is defined as frequency of occurrence in corpora of English usage) the faster the reaction time to the past tense form. For regular verbs, however, frequency of the stem form had no effect on reaction time. Such findings have led to the claim that irregular past tense verb forms are stored associatively in memory (i.e. are ‘listed’) and hence show frequency effects as a function of strength of association between the stem and the listed form. By contrast, regular inflection is produced by rule, which applies in the same way to all regular stems, independently of frequency. Hence there is no frequency effect on reaction time for regular forms.
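The contrast between listed irregular forms and rule-produced regular forms can be illustrated with a short Python sketch. It is a toy model under stated assumptions and not the procedure used by Beck or by Prasada, Pinker and Snyder: the stem frequencies, the latency constants and the function names `past_tense` and `regular_rule` are hypothetical, and the regular rule handles only simple spellings. The sketch makes retrieval of a listed form faster as stem frequency rises, while the rule costs the same for every regular stem.

```python
import math

# Hypothetical listed (irregular) forms with made-up stem frequencies.
IRREGULAR = {
    "go": ("went", 5000),
    "take": ("took", 3000),
    "catch": ("caught", 400),
    "strive": ("strove", 5),
}

RULE_TIME_MS = 500.0    # assumed constant cost of applying the regular rule
BASE_LOOKUP_MS = 700.0  # assumed cost of retrieving a listed form of frequency 1
GAIN_MS = 40.0          # assumed speed-up per log10 unit of stem frequency


def regular_rule(stem):
    """Affix -ed to the stem (simple spellings only, no consonant doubling)."""
    if stem.endswith("e"):
        return stem + "d"
    return stem + "ed"


def past_tense(stem):
    """Return (past tense form, simulated latency in ms) for a verb stem."""
    if stem in IRREGULAR:
        # Listed route: the stronger the stem-form association, the faster.
        form, frequency = IRREGULAR[stem]
        latency = BASE_LOOKUP_MS - GAIN_MS * math.log10(frequency)
        return form, latency
    # Rule route: the same operation and the same cost for every regular stem.
    return regular_rule(stem), RULE_TIME_MS


for stem in ["go", "strive", "walk", "glimpse"]:
    print(stem, past_tense(stem))
```

On these assumptions a frequency effect appears only among the listed forms, which is the pattern the reaction time studies take as evidence for two distinct mechanisms.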

Beck’s findings with the non-native speakers were that they performed similarly to the natives: reaction times on low frequency irregular stems were slower than on high frequency stems (although not significantly so), and there was no difference in their performance on frequent and less frequent regular stems. This led Beck to suggest that it is not the morphological component which is involved in causing persistent optionality in past tense marking.

Lardiere (1998a, 1998b, 2000) has suggested that the problem for Patty resides in the ‘mapping’ between fully specified syntactic phrase markers and surface morphophonology. Lardiere argues that other evidence from Patty’s spontaneous oral production suggests that she has a T(ense) category specified for finiteness. Firstly, her use of nominative case-marked pronouns is perfect; on standard assumptions the nominative case of subjects in English is the result of an agreement or checking relation between a T category specified [+finite] and the subject in the specifier of TP. Hence if Patty uses nominative pronouns perfectly she must have represented a finite feature on T. Secondly, there is no evidence for thematic verb raising (e.g. over negation) in Patty’s productions, suggesting that she has established that T has weak inflectional properties in English, which is another piece of evidence that Patty has a T specified for finiteness. Thirdly, there is evidence that Patty projects finite CPs, which on standard assumptions implies the presence of a T specified for finiteness.

The ‘mapping’ difficulty is a problem accessing morphological forms which have ‘layers’ of feature structure. Assuming a model of grammar where “an autonomous morphological component ‘reads’ the output of lexical and syntactic derivation, identifying those features which condition inflectional operations” (Lardiere 1998a:20), Lardiere argues that the more inflectional features there are associated with a morphological form, the more likely a problem will arise for an L2 speaker. In the case of tense-marking, she assumes that the output of syntactic computations presents the morphological component with a terminal T node specified [+finite]. The morphological component must then determine whether it is [+past] or [−past], and if [+past] select suppletive forms in the case of irregular verbs, or invoke the regular rule for the affixation of -ed in the case of regular verbs. She proposes (speculatively) “that it is among the increasingly complex ‘outer’ layer mappings from morphology to PF that we are likely to find the greatest vulnerability to ‘fossilization’ and ‘critical period’ effects” (Lardiere 2000:124). Additionally, if the phonological forms themselves involve complex phonology (for example, word-final clusters like -kt, -skt and -mpst in the past tense forms of walked, asked and glimpsed), Lardiere argues that this may further affect successful mapping: “We can further imagine that an essentially morphophonological mapping procedure would be especially vulnerable to ‘derailment’ from a variety of post-syntactic or extra-syntactic factors, such as phonological transfer from the L1” (Lardiere 1998a:21). Given that (Mandarin) Chinese is a language with basic (C)V(nasal) syllable structure and no syllable- or word-final consonant clusters (Hansen 2001), we can take Lardiere’s claim to be that while mapping of phonological forms onto terminal nodes which are the output of the syntax can cause problems generally for L2 speakers where layers of features are involved, in Patty’s case this is compounded by the fact that the L2 requires word-final consonant clusters where the L1 disallows them.

The combined results of the studies by Beck and Lardiere point to the following possible conclusion: a deficit has occurred in one of the mechanisms of the morphological component (the mechanism referred to as the ‘vocabulary’ in models of distributed morphology; Halle and Marantz 1993, Embick and Noyer 2001), which inserts vocabulary items (phonological forms) into terminal nodes where the feature specification of the vocabulary item and the feature specification of the terminal node are non-distinct. The representation of morphological forms themselves is not affected (as Beck’s study suggests), and the feature specification of categories manipulated by the syntax is not affected, if Lardiere’s analysis is correct. Moreover, the difference in phonotactic constraints between the L1 and L2 has a persistent influence, such that the mapping problem is exacerbated. Thus this account of optionality in tense marking by L2 speakers is located at the interface between syntax and morphology.

In this chapter we test this claim by comparing the spontaneous oral production of advanced L2 speakers of English from three different L1 backgrounds: Chinese, Japanese and German. If there is a general mapping problem for L2 speakers at the interface between syntax and morphology involving feature matching at the point of vocabulary insertion, we would expect this to appear in all three groups. Since Japanese is similar to Chinese in its phonotactic structure (disallowing word-final consonant clusters) but German is like English (cf. word-final clusters such as -ntst: getanzt ‘danced’), we would expect mapping problems to be more marked in the Chinese and Japanese informants.2

In fact we will argue that neither of the predictions is borne out by the data. The Chinese informants do mark simple past tense optionally in oral production, as previous studies had found, but the Japanese and German speakers are significantly less likely to do so. We will also show that our Chinese informants appear to know the morphological properties of English past tense verb inflection, as Beck’s study suggests that they should, and that they do not generally appear to have problems producing word-final consonant clusters. This will lead us to argue for a different locus for their deficit: at the interface between the syntactic component and the lexicon. In particular, we will claim that the Chinese speakers have difficulty assigning the formal (i.e. syntactically-relevant) feature [past], which determines the morphophonological forms of verbs in English, to the feature inventory of the category T(ense) in the lexicon, because this feature is not selected in Chinese, and is subject to a critical period.

2. Assumptions about the organisation of the language faculty

We follow the spirit of recent work within the minimalist program (Chomsky 1998, 1999, 2001) and assume that the language faculty has a number of universally fixed and invariant computational procedures, and provides a universal inventory of phonological, semantic and syntactic (formal) features F from which lexical items can be assembled. One subset of the computational procedures is the syntax, which is capable of a small number of empirically and conceptually necessary operations: merge, agree and move. The syntax takes items presented to it from the lexicon and combines them into expressions. These expressions are interpreted by a semantic component (LF), whose procedures are themselves universally invariant and which makes the syntactic expressions ‘legible’ to the conceptual-intentional modules of mind. The syntactic expressions are also interpreted by morphological/phonological procedures which are universally uniform, and which make syntactic expressions ‘legible’ to the sensori-motor systems for the production and understanding of speech.

The universal inventory F of semantic, phonological and syntactic features is crucial in this model, because it is from this set that features are selected for the assembly of a lexicon whose items provide the input to the computational procedures. Individual languages select a subset of features from F and assemble them into lexical items. It is at this point, the selection of particular features for the assembly of lexical items, that languages vary.

We focus here on the selection of syntactic (formal) features. Syntactic features are those which initiate syntactic operations, for example, wh-movement, case agreement, N-to-D movement, and so on. Some selections of syntactic features appear to be obligatory. For example, finite T appears necessarily to select the syntactic feature required to activate structural nominative case. Nominals also appear to obligatorily select case features which render them active for the purpose of case agreement. Languages are uniform in selecting these features. Parametric differences between languages arise when they make different choices of optional syntactic features.

The distinction between obligatory and optional syntactic features is important for understanding tense and how it is realised morphologically in languages like English and Chinese. The view we adopt is that there is a syntactic (i.e. semantically uninterpretable) tense feature which, for the sake of exposition, we call [±past], which is available in the universal inventory F, but which is optional. English has selected it, but Chinese has not. In English, the presence of [±past] in finite T has a consequence for the morphology of the verb. [±past], being a syntactic (formal) feature, is not semantically interpretable at LF and so must be eliminated from a syntactic expression before such interpretation takes place. This elimination is effected through a checking (or matching) of the features of T with the morphological features of inflected verb forms like was, had, walked, ran. There are various complications that arise in this checking/matching operation depending on whether the verb is a light verb (be, have, do) or a thematic verb (walk, run). We will not expand on these here (see Lasnik 1999, Embick and Noyer 2001). But the basic claim is that finite English T has syntactic [±past] features which have morphological consequences for V. By contrast, we claim that Chinese does not have syntactic [±past] features on T, although it does have a syntactic [±finite] feature (Li 1990:18). As a consequence, bare Vs in Chinese can be interpreted either as past or non-past, depending on context:

(1) Zhangsan kan dianying.
    Zhangsan see movie
    ‘Zhangsan is seeing OR saw a movie.’

We take this up in more detail in Section 4.

3. The study

To test the claim that the source of optionality in tense marking by L2 speakers lies at the interface between the syntactic and morphological components of the language faculty, we selected advanced L2 speakers of English from different L1 backgrounds, devised a test aimed at measuring informants’ knowledge of the morphological processes involved in simple past tense marking, and collected a sample of spontaneous production data including simple past tense verb forms from the same informants.

3.1 Informants

Advanced L2 speakers of English were selected for this study on the basis of their performance on an independent measure of general proficiency. This consisted of two components: (a) the written multiple-choice grammar test component of the Oxford Placement Test (Allan 1992) which has 100 items covering a range of the core morphosyntactic properties of English; (b) Nation’s (1990) ‘vocabulary levels test’, which was designed as a language teacher’s aid for giving help with vocabulary learning, but provides a rough notional measure of the size of a speaker’s vocabulary up to the 10,000-word level. Using this combined grammar/vocabulary test, we selected informants whose L1 was Chinese, Japanese or German and whose mean scores broadly matched at the upper end (over 80% correct). This produced an experimental set of informants of two Chinese, five Japanese and five German speakers.3 Details of the proficiency scores are given in Table 1.

Table 1. Proficiency test scores: experimental informants

L1                Mean proficiency score (%)    Range (%)
Chinese (n=2)     86.6                          83.4–89.7
Japanese (n=5)    85.7                          83.4–87.0
German (n=5)      90.7                          85.8–96.0

3.2 Test of knowledge of morphology

If Beck’s (1997) findings are generalisable, we expect to find that the informants in our study can productively manipulate morphological processes involved in past tense marking in English. To test this we designed a task which required informants to inflect both real and invented (nonce) verb stems for simple past tense. Our reasoning was that if speakers know the verb morphology associated with past tense, it should make no difference whether real or nonce forms are involved.

3.2.1 Design

The task was adapted from one used by Prasada and Pinker (1993) with native speakers (‘experiment 3’ in their study). Prasada and Pinker were interested in the extent to which natives would produce regular and irregular past tense forms when presented with nonce verbs, and in particular whether novel irregulars displaying the ‘prototypical’ phonological shape of partially productive past tense irregulars (like string, sling, fling, cling → strung, slung, flung, clung) would elicit novel irregular past tense forms, such as spling → splung. To elicit such responses, informants were presented with nonce forms and an invented definition for each (six at the top of each page of a test questionnaire), followed by six sentences, each with a blank, where one of the nonce forms belonged (Prasada and Pinker 1993:24).

In our test we were interested simply in whether informants would inflect verbs appropriately for past tense in clear past tense contexts, and whether their knowledge was generative in the sense of allowing them to inflect correctly verbs they had never encountered before. They were therefore presented with six verbs at the top of each page of a test questionnaire with definitions (as in the Prasada and Pinker study), but half were real and half invented (18 of them in fact taken from the set of prototypical regular and irregular nonce verbs used by Prasada and Pinker). Below each set of six verbs with their definitions were six sentences where informants had to decide which verb to insert, and what its form should be. These contexts required either a simple past or a present perfect form. Thus informants could not simply produce a past tense form by rote, but had to make a definite choice of tense appropriate to the context. A partial illustration of the form of the test is given in (2):


(2) SPLING: If you spling, you blow something out of your mouth (like smoke rings or air bubbles).

    CUT: If you cut something, you make it shorter, or divide it, or break its surface.

    a. The ground staff haven’t marked out the tennis court or put up the net yet, but the head gardener claims that they __________ the grass in readiness.

    b. As he rose slowly, the diver __________ bubbles in short bursts.

There were 120 contexts in all, 60 of which were expected to elicit simple past tense verb forms, and of these 30 involved nonce verb stems (15 prototypical regular types, like blark, and 15 prototypical irregular types, like spling). A control group of native speakers (n=5) also took the morphology test.

3.2.2 Results

Table 2 displays the frequencies of inflected and uninflected real and nonce verbs in simple past tense contexts.

Table 2. Frequencies of real and nonce verbs inflected for simple past

                        Chi (n=2)      Jap (n=5)      Ger (n=5)      Eng (n=5)
Verb type    Inflected  Score    (%)   Score    (%)   Score    (%)   Score    (%)
Real reg     Yes           23   88.5      60   96.8      64   98.5      74    100
             No             3   11.5       2    3.2       1    1.5       0      0
Real irreg   Yes           29    100      64   97.0      65    100      73    100
             No             0      0       2    3.0       0      0       0      0
Nonce reg    Yes           25   92.6      58   96.7      64    100      72   97.3
             No             2    7.4       2    3.3       0      0       2    2.7
Nonce irreg  Yes           18   81.8      54   85.7      53   88.3      71    100
             No             4   18.2       9   14.3       7   11.7       0      0
Total        Yes           95   91.3     236   94.0     246   96.9     290   99.3
             No             9    8.7      15    6.0       8    3.1       2    0.7

NB: Cases where an informant selected a verb form other than simple past (e.g. past progressive or perfect) are excluded from the table. Hence scores do not necessarily add up to the expected maximum.

A χ² test comparing the total frequency of inflected and uninflected forms produced by the non-native groups shows no significant difference between them (χ² = 4.94, df = 2, p > .05). A comparison between the non-natives as a group and the native controls shows that there is a significant difference (χ² = 12.64 (with Yates’ correction factor), df = 1, p < .05). This difference appears to be located primarily in the extent to which the non-natives fail to inflect irregular nonce forms for simple past (where the native controls do inflect in 100% of cases). This is particularly interesting because there are no significant differences between non-natives and natives in inflecting regular nonce forms. The result seems to suggest that speakers know that certain irregular nonce forms are irregular, hence do not attach a regular inflection to them, but do not know what the inflected form should be, producing an uninflected form.
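The three-group comparison can be checked directly from the ‘Total’ rows of Table 2. The sketch below computes an uncorrected Pearson χ² for the 3 × 2 table of inflected and uninflected totals. The function name and the hard-coded critical value (the standard 5.991 for df = 2 at the .05 level) are choices made for this illustration, and we assume, without confirmation in the text, that this is the same computation that produced the reported value.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and inflection.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: Chinese, Japanese, German; columns: inflected, uninflected (Table 2 totals).
table2_totals = [[95, 9], [236, 15], [246, 8]]
CRITICAL_05_DF2 = 5.991  # standard chi-square critical value, df = 2, alpha = .05

statistic, dof = pearson_chi_square(table2_totals)
print(f"chi-square = {statistic:.2f}, df = {dof}")
print("significant at .05:", statistic > CRITICAL_05_DF2)
```

On these counts the statistic comes out at about 4.94 with df = 2, below the critical value, which is what ‘no significant difference’ between the non-native groups requires.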

Broadly, and assuming that in some sense the non-native speakers are distinguishing nonce regulars from irregulars, frequency of past tense marking in the responses of these advanced non-natives is very similar to those of the natives. This is consistent with Beck’s findings, and suggests that the morphological component is operating similarly in these speakers to the way it operates in natives.

3.3 Inflected simple past tense in spontaneous production

Spontaneous oral data were collected from two tasks: the retelling of a short extract from a Charlie Chaplin film (Modern Times), and the recounting of a happy or exciting experience each informant had had. The data were recorded and transcribed, and only verbs in unambiguously simple past tense contexts (i.e. those where a native could use no other form) were counted. For the purposes of this study, only thematic verbs were scored (e.g. walked, but not modals, copula/auxiliary was or auxiliary had). Verbs in contexts which were phonologically ambiguous were also discounted (i.e. regular past tense verbs followed by homophonic stops as in walked to work, or interdental fricatives, as in chased them). All unambiguous forms thus counted were rechecked against the original recordings to ensure accuracy. The frequencies of inflected and uninflected forms across the three groups of non-native speakers are presented in Table 3.

χ² tests show that there is a significant difference between groups both on frequency of inflection with regular verbs (χ² = 30.49, df = 2, p < .01) and with irregular verbs (χ² = 8.13, df = 2, p < .05). The difference appears to be located entirely in the Chinese speakers’ performance. This is in contrast to the performance of these speakers on the morphology test, where there was no significant difference between the three non-native groups, either on real verbs or nonce forms.
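One way to see where the between-group difference is located is to split the χ² statistic for regular verbs into the contribution made by each L1 group. The sketch below does this for the regular-verb counts of Table 3. The decomposition by row is our illustration (the chapter does not say how the locus of the difference was established), and the function and variable names are hypothetical.

```python
def chi_square_contributions(table):
    """Return each row's summed (O - E)^2 / E contribution to Pearson's chi-square."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    contributions = []
    for row, row_total in zip(table, row_totals):
        row_sum = 0.0
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            row_sum += (observed - expected) ** 2 / expected
        contributions.append(row_sum)
    return contributions


GROUPS = ["Chinese", "Japanese", "German"]
# Columns: inflected, uninflected regular verbs (Table 3).
regular_verbs = [[25, 15], [137, 12], [52, 2]]

contributions = chi_square_contributions(regular_verbs)
total = sum(contributions)
for group, value in zip(GROUPS, contributions):
    print(f"{group}: {value:.2f} ({value / total:.0%} of {total:.2f})")
```

The row contributions sum to about 30.49, and the Chinese row accounts for roughly four fifths of that total, which is in line with the observation that the difference lies mainly in the Chinese speakers’ performance.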

Table 3. Frequencies of inflected/uninflected verbs in simple past tense contexts: spontaneous oral production

                          Chinese (n=2)   Japanese (n=5)  German (n=5)
Verb type                 Score    (%)    Score    (%)    Score    (%)
Regular      Inflected       25   62.5      137   91.9       52   96.3
             Uninflected     15   37.5       12    8.1        2    3.7
             Total           40    100      149    100       54    100
Irregular    Inflected       64   84.2      252   93.3       79   95.2
             Uninflected     12   15.8       18    6.7        4    4.8
             Total           76    100      270    100       83    100

These results are problematic for the view that L2 speakers generally have difficulty mapping phonological forms (‘vocabulary items’) with ‘layers’ of morphological features onto terminal nodes generated by the syntax. If this were the case, we would expect all three groups to perform similarly on the spontaneous production task, but the Chinese speakers are performing significantly differently from the Japanese and German speakers.4

Is the difference the result of L1 phonology interfering with and depressing the performance of the Chinese informants? This is unlikely. Recall that the prediction was that if L1 phonology had such an effect, we would expect the Chinese and Japanese to experience similar problems because the relevant property (absence of word-final consonant clusters) is present in both L1s. However, to pursue this possibility further, we considered performance of the non-native speakers on consonant clusters elsewhere in their spontaneous production, specifically in monomorphemic words like most, kind. If word-final clusters are problematic in production, some evidence for this should surface in these forms. Table 4 compares informants’ retention of -t/-d in inflecting regular past tense verbs with their retention of -t/-d in monomorphemes.

Table 4. Absence of word-final -t/-d in monomorphemes and regular simple past tense forms compared

                                Chinese        Japanese       German
Word type              -t/-d    Score   (%)    Score   (%)    Score   (%)
Monomorphemes          Present      9    82       27    96       48   100
                       Absent       2    18        1     4        0     0
Simple past (Regular)  Present     25    63      137    92       52    96
                       Absent      15    37       12     8        2     4

Frequencies are small, but suggestive. Although Chinese speakers do drop final -t/-d in monomorphemes, they do not do so as frequently as in the simple past. This result matches a similar finding in Bayley (1996), who in a group of 20 L1 Chinese speakers of intermediate and advanced proficiency in L2 English found a 65% retention of final -t/-d in monomorphemes versus a 44% retention in simple past tense verb forms. This is in marked contrast with the pattern of -t/-d deletion found in many studies of native speakers (Labov 1989), as already observed. Natives are more likely to drop -t/-d when a morphological boundary is not involved than when it is.

Another possibility is that spontaneous oral production introduces performance pressures which make it difficult for L2 speakers to access inflected past tense verb forms in real-time (Prévost and White 2000:129). If this were the case, performance on the morphology test would be a better reflection of the informants’ competence, because it lessens such pressures, while performance in spontaneous oral use of English underrepresents informants’ competence. This also looks implausible in the context of the three-group comparison. If performance pressures were involved, we would expect them to surface in all three groups, not just in the Chinese speakers. However, to pursue the possibility further, we looked at performance of the informants in using regular inflected participles in cases like were scared of, be sliced, is released, which are identical to the simple past tense forms. Table 5 brings together the frequencies of inflected/uninflected regular participles, monomorphemes and regular simple past tense forms.

Again, although small frequencies are involved, if performance pressures were responsible for producing optionality in simple past tense marking, we might expect to see some evidence in the production of participles, where layers of morphological features also seem to be involved: if a verb form is [−finite] then it is either [+durative] (walking) or [−durative]. If it is [−durative] it is either [+past] ((have) walked) or [−past] ((to) walk).

Table 5. Absence of word-final -t/-d in regular participles, monomorphemes and regular simple past tense forms compared

                                Chinese        Japanese       German
Word type              -t/-d    Score   (%)    Score   (%)    Score   (%)
Participles            Present     10   100       23   100       55   100
                       Absent       0     0        0     0        0     0
Monomorphemes          Present      9    82       27    96       48   100
                       Absent       2    18        1     4        0     0
Simple past (Regular)  Present     25    63      137    92       52    96
                       Absent      15    37       12     8        2     4

In summary, the results from spontaneous oral production are the following: although the non-native groups were matched for general high proficiency in L2 English, the Chinese informants were significantly less likely to inflect both regular and irregular thematic verbs for past tense than the Japanese or German speakers. The Chinese speakers retained word-final -t/-d with monomorphemes more often than with regular past tense verb forms (suggesting the problem is not caused by word-final consonant clusters). The Chinese speakers were perfect in inflecting past participles, in contrast to simple past tense verb forms, suggesting that ‘performance pressures’ are unlikely as a source of omission of inflections with past tense verbs.

4. Discussion

We are interested in locating the source of optionality in the marking of thematic verbs for simple past tense in oral production by certain groups of high proficiency L2 speakers of English, for example speakers of L1 Chinese. To address this problem comparative data were collected from L1 speakers of Chinese, Japanese and German matched for general proficiency as ‘advanced’ speakers of English. An important caveat in discussing the results is that the number of informants was small, therefore the generalisability of any conclusions drawn must be treated with caution. The data were elicited from a test of knowledge of past tense morphology, and tasks allowing free oral production (an anecdote and the retelling of a film). Results of the morphology test suggest that all three groups, including the Chinese speakers, know the morphological properties of past tense marking in the context of real English verbs, and are productively aware of the regular versus irregular distinction even with verbs they have never encountered before (nonce forms). This is consistent with earlier findings of Beck (1997), who showed that the reaction times of non-native speakers on a verb inflection task were parallel to those of native speakers on the same task. This led her to conclude that the source of optionality of past tense marking is not located in the operation of the morphological component, a claim with which we concur.

The comparative results from the free oral production of our three experimental groups show that the Chinese speakers are significantly different from the other two groups, producing uninflected regular verbs in unambiguously past tense contexts in over one third of cases. This was prima facie evidence against the idea that L2 speakers in general have difficulty mapping phonological forms (‘vocabulary items’) onto syntactic terminal nodes where that mapping implicates layers of morphological structure. If this were a general problem for L2 speakers, and on the assumption that morphological properties do not transfer from the L1 to the L2, we expected all three groups to show evidence of it. By looking at the Chinese speakers’ performance on word-final consonant clusters in monomorphemic words, and their performance in inflecting past participles, we also established that optionality in producing final -t/-d was lower in these contexts. This suggested that the source of the problem with simple past tense was unlikely to be phonological in nature or the result of performance pressure, because similar optionality would then be expected across these contexts too.

Having eliminated the morphological component, and the mapping of morphophonological forms onto syntactic representations, as likely sources of observed optionality in past tense marking in the informants studied, an alternative explanation is needed. In this final section we explore the consequences of assuming that optionality results from a failure of L2 learners to include a syntactic feature for tense, which we are calling [±past], among the features which make up the lexical item T. Where this is the case, T enters syntactic derivations without the feature which, for native speakers of English, eventually forces the insertion of vocabulary items inflected for past tense into terminal nodes. Given such a hypothesis, we need to account for why optionality is characteristic of the Chinese speakers in our sample, but not the Japanese or German speakers; why the Chinese speakers nevertheless have greater than chance success in marking verbs for past tense in past tense contexts; and why the Chinese speakers might be more successful in marking irregular past forms and regular participles than in marking regular past forms. Central to this account is an understanding of the relation between how tense is interpreted by the semantic component, and the presence of a syntactic (formal) tense feature in T. We therefore examine this relationship more closely.

Languages appear to differ in whether they ‘grammaticalise’ tense through special morphophonological forms or not. Chinese is standardly assumed to lack such morphophonological forms, although it does have verbal aspect markers like -le, -guo and -zhe (Li and Thompson 1981, Li 1990, Packard 2000). For example, as already noted in Section 2, the verb kan ‘see’ can be freely interpreted as present or past depending on context:

(3) a. Zhangsan kan dianying.
       Zhangsan see movie
       ‘Zhangsan is seeing a movie.’

    b. Zhangsan zuotian kan dianying.
       Zhangsan yesterday see movie
       ‘Zhangsan saw a movie yesterday.’

By contrast, Japanese and German do appear to grammaticalise tense, like English. Japanese has the forms -ta (past) and -ru (non-past) which appear to be tense auxiliaries (Okuwaki 2000). Thematic verbs in German inflect for past versus non-past tense like English, with regular and irregular variants (e.g. regular: kaufen ‘buy’, ich kaufte ‘I bought’; irregular: singen ‘sing’, ich sang ‘I sang’).5

Why might some languages have grammatical exponents of a ‘past/non-past’ distinction while others, like Chinese, simply lack such exponents? An interesting perspective on this question can be found in the work of Chierchia (1998). Chierchia suggests that the presence of a syntactic property in a grammar which is associated with a particular semantic operation has the effect of inhibiting the free application of that operation. In his own study he suggests that the presence of articles in a language blocks the free application of the semantic operations which give nominals generic, definite or indefinite meanings. So in English, count Ns can only be interpreted as definite when the is present, but in Russian, which lacks articles, bare Ns can be freely interpreted as generic, definite or indefinite ‘depending, presumably, on the context’ (Chierchia 1998:361). This idea has been extended by Takeda (1999:103) into a Generalised Blocking Principle (GBP): if a language has a certain functional category in its lexicon, the free application of the semantic operation that has the same function as that syntactic category is blocked in that language.

Taking ‘functional category’ here to mean ‘feature of a functional category’, what the GBP proposes in terms of tense is that the semantic operation which interprets a T-V configuration as past or non-past can apply freely except where a language assigns a syntactic [±past] feature to T. This then requires overt morphological expression and blocks the free operation of tense interpretation. So while finite bare Vs in Chinese can potentially be interpreted as past, present or future, depending on the context, finite bare Vs in English can only be interpreted as non-past, because past tense interpretation is associated with the specific features which give rise to forms like walked, ran.
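The effect of the GBP on bare verbs can be pictured with a toy function. This is only a schematic rendering: representing ‘non-past’ as the set of present and future readings, and reducing each language to a single boolean for whether T carries [±past], are simplifying assumptions made for the example, and the names are hypothetical.

```python
ALL_READINGS = {"past", "present", "future"}

# Whether T is assumed to carry a syntactic [±past] feature in each language.
T_HAS_PAST_FEATURE = {"Chinese": False, "English": True}


def bare_verb_readings(language: str) -> set[str]:
    """Tense readings available to a finite bare V in the given language."""
    if T_HAS_PAST_FEATURE[language]:
        # [±past] on T blocks free tense interpretation: a past reading
        # requires overt morphology, so the bare form is non-past only.
        return ALL_READINGS - {"past"}
    # No [±past] on T: interpretation applies freely, guided by context.
    return set(ALL_READINGS)


for language in ("Chinese", "English"):
    print(language, sorted(bare_verb_readings(language)))
```

Under these assumptions a bare verb like Chinese kan keeps a past reading, whereas English walk does not.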

Consider now how this idea might be implemented in a grammar which incorporates some version of distributed morphology (Halle and Marantz 1993, Embick and Noyer 2001), the kind of model assumed in the work of Lardiere (2000) and Prévost and White (2000). The syntactic component generates expressions from bundles of syntactic and semantic features using the operations merge, move and agree (Chomsky 1998, 1999, 2001). Expressions consist of strings of terminal nodes which are presented to the morphological component for the insertion of vocabulary items (phonological forms). Insertion is an automatic process which takes place where the features of a vocabulary item are non-distinct from the features of a terminal node. For example, the terminal string T-V will present the morphological component with features like [+finite, −past] or [+finite, +past]. Vocabulary items with the relevant features will then compete for insertion into this position. Two important properties of insertion are firstly that a vocabulary item does not necessarily have to have all of the features of the terminal node to be inserted; it is sufficient that the features of the item are non-distinct from those of the terminal node (i.e. they may be a subset of those features). Secondly, where vocabulary items are in competition for insertion into a terminal node, the most highly specified item compatible with the features of the terminal node is the one which wins the competition for insertion (Lumsden 1992:480). For example, suppose that variants of walk have the following feature specifications:

(4) walks   [V, +finite, 3p, +sing]
    walked  [V, +finite, +past]
    walk    [V]

Given such specifications, only walked can be inserted into a terminal node with the features [+finite, +past], even though walk is unspecified for those features and is in principle available for insertion. The reason is that although both forms have feature specifications which are non-distinct from the terminal node, walked is the more highly specified item.
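The competition for insertion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: features are encoded as plain strings, "non-distinct" is read as "subset of the terminal node's features" (the reading given in the parenthesis above), and the names `VOCABULARY` and `insert_vocabulary_item` are hypothetical, not part of any published implementation.

```python
from typing import Dict, FrozenSet, Iterable, Optional

# Feature bundles from example (4), encoded as strings (an assumed encoding).
VOCABULARY: Dict[str, FrozenSet[str]] = {
    "walks": frozenset({"V", "+finite", "3p", "+sing"}),
    "walked": frozenset({"V", "+finite", "+past"}),
    "walk": frozenset({"V"}),
}


def insert_vocabulary_item(
    node_features: Iterable[str],
    vocabulary: Dict[str, FrozenSet[str]] = VOCABULARY,
) -> Optional[str]:
    """Return the most highly specified item whose features are a subset of the node's."""
    node = frozenset(node_features)
    # Step 1: keep only items whose features are all present on the terminal node.
    candidates = [(form, feats) for form, feats in vocabulary.items() if feats <= node]
    if not candidates:
        return None
    # Step 2: the item with the most features wins (ties go to the first listed item).
    return max(candidates, key=lambda pair: len(pair[1]))[0]


print(insert_vocabulary_item({"V", "+finite", "+past"}))  # walked
```

On this sketch both walked and walk survive step 1 for a [+finite, +past] node, and walked wins at step 2 because it carries more features, which is the outcome the text describes.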

With such an account of the interaction between syntax and morphology, it is easy to see how the proposal that adult L2 speakers have difficulty with the

36 Roger Hawkins and Sarah Liszka

mapping of morphophonological forms onto terminal nodes might be made explicit: the procedure for inserting more specified vocabulary items over less specified items is not operating categorically in L2 speakers; less specified forms are being inserted where they should not be.

The account we wish to explore here, however, is one where the syntactic feature [±past] is absent from T in the terminal string which is the output of the syntactic computations. The claim we will make is the following: where parametrised syntactic features are not present in a speaker’s L1, they will not be accessible in later L2 acquisition.

This is equivalent to saying that syntactic features which invoke the Generalised Blocking Principle that have not been activated in early life do not operate in adult SLA. This immediately distinguishes Japanese and German speakers on the one hand, from Chinese speakers on the other. Japanese and German both have morphosyntactic exponents of ‘past tense’, hence the underlying syntactic features which invoke the GBP. In principle this should allow them to determine that past tense verb forms in English are syntactically motivated. By contrast, for L1 speakers of Chinese, T does not have the syntactic feature [±past]. Our proposal entails that when Chinese speakers learn English they are unable to establish that English T is specified for [±past]. This would mean that in their English grammars the terminal string T-V will include [±finite] but not [±past]. The distinction between vocabulary items like walks, walked, walk would then have no syntactic motivation for them.6 This is one kind of answer to the first question: why is optionality in past tense marking characteristic of the Chinese speakers in our sample, but not the Japanese and German speakers? It is also clear, however, that the Chinese informants studied here have acquired vocabulary items with past tense forms, and use them (at least superficially) in a highly target-like way in the morphology test, and at above chance level in spontaneous production. What might explain this if our claim is correct?

One possibility is that Chinese speakers analyse participles and irregular past tense verb forms differently from regular past tense forms: they have a different morphological status in their grammars. Past participles are arguably different from simple past tense forms even in native grammars, in that they are aspectual in nature (realising ‘perfectivity’), and result from a verb-internal word formation process, which does not involve the T-V configuration. Since our assumption is that L2 speakers do not have difficulty with morphological operations per se, it would not be surprising if they did not have difficulty with participle forms. In other words, we expect Chinese speakers to have difficulty

Defective past tense marking in L2 English 37

with simple past tense forms because they involve a syntactic feature missing from T, but not participle forms because they do not involve T. In the case of irregular past tense forms, one possibility is that the Chinese speakers have acquired them as items independent from the equivalent bare V forms; i.e. ran is not the past tense exponent of run as it is in native grammars, but an independently acquired word form, in the same way that, say, amble and saunter are independent word forms for native speakers. While this is speculative, there is some limited evidence consistent with it in the transcripts of the spontaneous production data. There are some cases of ‘doubly inflected’ verb forms such as in (5a). The same speaker who produced (5a) also used ran in clearly non-past environments ((5b) and (5c)):

(5) a. The girl ranned not far away.
    b. You should ran away together.
    c. She could not ran any more.

If ran were an independent word form it could be inflected for past tense and be used in non-past contexts. Observe also that (5b) and (5c) involve the use of ran as a non-finite verb. Given the model of vocabulary insertion assumed, ran could not be an inflected variant of run specified for the feature [+past]. If it were, its feature specification would clash with the feature specification of the non-finite terminal nodes into which it is inserted in (5b) and (5c). Insertion should simply be impossible.

If past participles and irregular forms like ran have a different morphological status in the L2 English grammars of Chinese speakers from regular past tense verb forms like walked, this would be one kind of answer to the question: why are Chinese speakers more successful in marking irregular past tense verbs and regular participles than regular past tense verbs? They are not treating them as past tense verb forms at all; rather they have independent lexical item status for Chinese speakers. However, given this account, we have little idea currently of what these forms might ‘mean’ for Chinese speakers.

This leaves the need to give an account of optionality in the marking of regular thematic verbs for past tense. We have no real answer to give in this case. Firstly, we have to assume that there is some reason why Chinese speakers do not treat regular verbs as independent vocabulary items as in the case of irregulars. One possibility is that it is the very regularity of the inflection and the frequency of such forms in the input which forces a morphological analysis of them as rule-based variants of the bare V. The problem then is to explain why some forms, but not others, are inflected in spontaneous production. To give a flavour of the problem, consider the following short extract from the transcript of one of the Chinese informants in our sample:

“When I saw the film ‘Lonely and Hungry’ and it reminded me of the old time when life was very hard. Some people they were very hungry and they have no work to do. They really don’t want to steal. But they had no other choice and when they become so hungry and they really want to just get the food and just want to eat the food so they stole or they just do something bad. But it’s understandable. It was not their mistake … And I watch it maybe 20 years ago. But it … I didn’t remember clearly about what it talk about. I just laugh a lot. But now when we see the film, I think. It gave me much imagination.”

What might be prompting this speaker’s use of the bare verb forms want, watch, talk and laugh in a context where past tense reference is often clearly intended? We have little idea yet of what the answer might be, but some possibilities appear unlikely. One explanation for selective verb inflection that has been advanced in studies of the early stages of L2 acquisition is the ‘Aspect Hypothesis’. This maintains that English past markers in early grammars are associated preferentially with verbs/predicates which have ‘telicity’ as part of their meaning (‘achievements’ and ‘accomplishments’ in the terminology of Vendler (1967)). See Bardovi-Harlig (1999) for a review of work on this topic. Does the same pattern obtain in the high proficiency Chinese speakers investigated here? That is, are they treating the regular past tense form as a marker of inherent verbal/predicate aspect? A breakdown of inflected regular forms by verb type is given in Table 6.

Table 6. Inflected past tense regular verbs by aspectual type: Chinese speakers

                    Statives   Activities   Accomplishments   Achievements
Inflected tokens    0/4        10/14        1/2               14/20
%                   0%         71%          50%               70%

The results are not conclusive. While statives are not inflected at all, which would be consistent with the Aspect Hypothesis, activities (which are atelic) are inflected to the same degree as achievements (which are telic). Interestingly, Lardiere (2002) has also analysed Patty’s past tense verb forms in terms of whether there is a correlation with the telicity of the verb/predicate, and found that 40% of all telic verbs (112/277, covering both regulars and irregulars) were marked for past tense, while 35% of all atelic verbs (130/371) were past-marked. This is a non-significant difference (using χ2). Thus, although the Aspect Hypothesis might be an explanation for the distribution of English past tense verbs in low proficiency L2 speakers, it looks unlikely as an explanation for optionality in the marking of past tense on regular verbs by high proficiency L2 speakers.
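The telic/atelic comparison reported from Lardiere (2002) can be checked with a short calculation. The sketch below assumes an uncorrected Pearson χ2 test on the 2×2 table of marked versus unmarked verbs; the text says only that χ2 was used, so the exact variant is an assumption, and the function name `chi_square_2x2` is hypothetical. The critical value 3.841 is the standard threshold for one degree of freedom at the .05 level.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts quoted in the text: past-marked tokens out of all tokens.
telic_marked, telic_total = 112, 277
atelic_marked, atelic_total = 130, 371

chi2 = chi_square_2x2(
    telic_marked, telic_total - telic_marked,
    atelic_marked, atelic_total - atelic_marked,
)

CRITICAL_VALUE_DF1_05 = 3.841  # chi-square critical value, df = 1, alpha = .05

print(round(chi2, 2), chi2 < CRITICAL_VALUE_DF1_05)  # 1.97 True
```

Under these assumptions the statistic falls well below the critical value, in line with the non-significant difference reported.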

Another unlikely possibility considered (and rejected) by Lardiere in the case of Patty is the ‘Discourse Hypothesis’ (Bardovi-Harlig 1995). This holds that L2 speakers of English may initially use past forms of verbs to mark foreground events in narratives, but not mark background events. ‘Foreground’ events are defined as clauses that move time forward in a narrative and which, if interchanged, would change the sequence of events in the narrative; ‘background’ events are often ‘out of sequence’, provide additional information about the foreground events, set the scene, evaluate or explain (Bardovi-Harlig 1995:265–267). Lardiere calculates that 32% of Patty’s past-marked verb forms (13/41) describe foreground events, while 30% (11/38) describe background events.7 Again, while the Discourse Hypothesis might explain the distribution of forms in low proficiency L2 speakers, it looks unlikely to be the source of optionality in high proficiency L2 speakers.

The only hypothesis we are able to advance at present (but for which there is scant evidence in our current sample) is the following: linguistic theory appears to need to allow operations which apply to strings post-syntactically, that is in the morphological component or following vocabulary insertion. For example, Chomsky (1999) proposes that English has an output condition which bars surface adjacency between V and direct objects where the V is an unaccusative or a passive. For example, although (6a) is the expected output expression generated by the syntax, it is less natural than (6b) or (6c). Chomsky claims that a post-syntactic output constraint forces the object to move either leftwards or rightwards:

(6) a. There was placed a large bowl on the table.
    b. There was a large bowl placed ___ on the table.
    c. There was placed ___ on the table a large bowl.

The surface ordering of clitic clusters in French also appears to be the effect of a post-syntactic output condition (Perlmutter 1971). If two third person clitics co-occur, an accusative form precedes a dative, but if a first or second person and a third person clitic co-occur, a dative form precedes an accusative (as in (7)):

(7) a. Elle le lui donne.
       she it-acc him-dat gives
       ‘She is giving it to him.’
    b. Elle me le donne.
       she me-dat it-acc gives
       ‘She is giving it to me.’
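The ordering generalisation illustrated in (7) can be written as a small surface check. This is a sketch covering only the two configurations described above (two third person clitics, or one first/second person clitic with one third person clitic); the `Clitic` class and `order_clitics` function are hypothetical names introduced for illustration, and other clitic combinations are deliberately left unhandled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Clitic:
    form: str
    person: int  # 1, 2 or 3
    case: str    # "acc" or "dat"


def order_clitics(a: Clitic, b: Clitic) -> List[Clitic]:
    """Order one accusative and one dative clitic as in the description of (7)."""
    if {a.case, b.case} != {"acc", "dat"}:
        raise ValueError("expected one accusative and one dative clitic")
    a_local, b_local = a.person in (1, 2), b.person in (1, 2)
    if not a_local and not b_local:
        first_case = "acc"  # two third person clitics: accusative precedes dative
    elif a_local != b_local:
        first_case = "dat"  # first/second plus third person: dative precedes accusative
    else:
        raise ValueError("combination not covered by the description")
    # False sorts before True, so the clitic bearing first_case comes first.
    return sorted([a, b], key=lambda clitic: clitic.case != first_case)


le = Clitic("le", 3, "acc")
lui = Clitic("lui", 3, "dat")
me = Clitic("me", 1, "dat")

print([c.form for c in order_clitics(lui, le)])  # ['le', 'lui']
print([c.form for c in order_clitics(le, me)])   # ['me', 'le']
```

The check inspects only features carried by the clitics themselves, which is what makes it a condition on the surface string rather than on the syntactic derivation.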

This suggests that the language faculty allows some monitoring of surface strings. Perhaps in the normal case for Chinese speakers the morphology inserts bare verb vocabulary items into T-V strings. But because output checking is a possibility available in the grammar, Chinese speakers monitor the ambient discourse for ‘pastness’ and insert V-ed forms when they are able to detect it. So, as in French where ordering of object clitics is determined by the person of the forms in question (first and second person must precede third person), for the Chinese speakers selection of a V-ed form is determined by ‘pastness’. The difference between the two cases is that whereas [person] is a feature present in the specification of the French pronoun vocabulary items, ‘pastness’ has to be determined on the basis of context.

Because context is involved, the monitoring process is unstable and gives rise to the kind of apparently random use of bare and inflected verb forms illustrated in the sample of informant speech given above. This is consistent with two observations about the marking of regular verbs for past tense by Chinese speakers. First, individual speakers of high proficiency can differ markedly in the extent to which they are successful in inflecting verbs. Our informants are apparently more successful in spontaneous oral production than Patty, for example. Secondly, different modalities allow a greater degree of success in past tense marking by the same individual. So our informants were more successful on the morphology test than in oral production, and Lardiere (2002) reports that Patty’s written output (in the form of e-mail messages) shows higher proportions of past tense marking than her spoken output.

5. Conclusion: The interface between syntax and the lexicon in adult SLA

In a comparison of the performance of three groups of highly proficient L2 speakers of English (with Chinese, Japanese and German as L1s), it was found that the marking of past tense was optional in the spoken English of the Chinese speakers, but not in the case of the Japanese and German speakers. With the caveat that caution is required in generalising from small numbers of informants, we argued that the Chinese speakers’ knowledge of English morphological processes is intact (as Beck (1997) had already argued), and that phonological properties (i.e. -t/-d deletion) did not appear to be a factor for our speakers. Furthermore, we found that past participle and irregular past tense forms were inflected more consistently than regulars.

Although previous studies have argued that L2 speakers can fully acquire the syntactic features of lexical items like T (Lardiere 1998a, 1998b, 2000, Prévost and White 2000), we explored the alternative possibility that Chinese speakers cannot establish [±past] on T in English precisely because this feature is absent in their L1. By contrast [±past] is present on T both in Japanese and German. If Chinese speakers do not have access to such a feature in the construction of L2 knowledge, but Japanese and German speakers do, this would explain the observed difference between them in inflecting thematic verbs for past tense in spontaneous production.

We linked this ‘deficit’ in the Chinese speakers’ grammars to the idea that optional syntactic features, when selected, ‘block’ the free application of semantic operations (Chierchia 1998, Takeda 1999). Our account assumes that these optional properties are subject to a critical period (an idea which originates in the work of Tsimpli and Roussou (1991) and Smith and Tsimpli (1995)). We then explored, in highly speculative mode, reasons why Chinese speakers might be successful at all in marking thematic verbs for past tense.

This line of enquiry raises an interesting possibility for the interface between the syntactic component and the lexicon in adult SLA. Within the spirit of ‘minimalist’ enquiry, it might be proposed that all of the resources and computational procedures of the language faculty (LF, syntax, morphology) are intact and operative in adult SLA. However, optional (parametrised) syntactic features which play a major role in tying morphosyntactic structure to semantic operations in the L1 acquisition of specific languages are unavailable if not activated in early life. In parsing L2 data, older L2 speakers can use all of the resources except for these features. Phonological forms (‘vocabulary items’) which are selected by such features in native grammars are not selected by those features in the grammars of L2 learners beyond the critical period. Highly proficient speakers must nevertheless find some motivation for them. We rejected some of the conceivable ways in which Chinese speakers might motivate past tense verb forms in English (the ‘Aspect Hypothesis’ and the ‘Discourse Hypothesis’), and we suggested that they might be operating in terms of an ‘output condition’ which monitors the T-V string in relation to the ‘pastness’ of the discourse.


Notes

* Parts of this work have been presented to audiences at McGill (2000) and GASLA (2000). We are grateful for the comments of those present in both cases. We have also benefited greatly from the comments of Donna Lardiere on an earlier draft discussion of the ideas outlined here. She will not agree with our conclusions, but her views have helped us sharpen our thinking on a number of issues.

1. The phenomenon of -t/-d deletion from word-final consonant clusters, attested widely in informal varieties of native English (Labov 1989), is strongly disfavoured in the context of the regular simple past tense (Bayley 1996:109). We return to this below, since it appears that the pattern is the reverse in non-native speakers: -t/-d is more likely to be missing from the regular past tense forms of verbs than in other contexts.

2. The assumption underlying the predictions made in this paragraph is that the morphological features of vocabulary items do not transfer from the L1 to the L2. Donna Lardiere (p.c. and 2000:116–117) points out that the assumption is not self-evident. If the “L1 exhibits the same (or higher) degree of [morphological] complexity [L2 learners] may know what to look for”. Whether L2 speakers do or do not transfer morphological properties from their L1 is obviously an empirical question. If it turns out that they do, the observations reported in this article will need to be reinterpreted. Given that evidence bearing on the empirical question is currently lacking (although see Jarvis and Odlin (2000) for some relevant discussion), for the purposes of this article it will be assumed that transfer of verbal morphological properties from L1 to L2 is not a factor.

3. The small size of the Chinese group is a reflection of the difficulty we had in locating L1 Chinese informants who can achieve a score of 80% or above on the proficiency test (rather than a difficulty in finding L1 Chinese speakers of L2 English per se). Since, however, for the purposes of the present investigation, the frequency of past tense marking in non-native-speaker English is the focus of interest, the variation in group size is relatively unimportant. In further work it would be desirable to have larger samples of advanced English speakers of the L1s in question.

4. Unless, of course, L2 speakers whose L1s are morphologically complex ‘know what to look for’. See footnote 2.

5. However, it should be noted that the present perfect (ich habe gekauft) predominates in everyday spoken German in what would be simple past contexts in English (Comrie 1976).

6. A question that might be raised here is whether this shouldn’t predict random usage by the Chinese speakers of all three forms (Donna Lardiere p.c.). This would be the case only if L2 speakers assumed that morphological variants of a lexical item can occur in free variation. However, we argue subsequently that Chinese speakers attempt to establish representations which distinguish morphologically distinct forms. The problem for them is that they have to do so without the benefit of the syntactic feature [±past].

7. In practice the criteria seem to allow for considerable variation between raters in deciding which events are in the foreground and which in the background. We had difficulty applying them to our own sample, and do not report the results here.


References

Allan, D. 1992. Oxford placement test. Oxford: Oxford University Press.

Bardovi-Harlig, K. 1995. “A narrative perspective on the development of the tense/aspect system in second language acquisition”. Studies in Second Language Acquisition 17: 263–291.

Bardovi-Harlig, K. 1999. “From morpheme studies to temporal semantics: tense-aspect research in SLA”. Studies in Second Language Acquisition 21: 341–382.

Bayley, R. 1991. Variation theory and second language learning: Linguistic and social constraints on interlanguage tense marking. Unpublished doctoral dissertation, Stanford University.

Bayley, R. 1996. “Competing constraints on variation in the speech of adult Chinese learners of English”. In Second language acquisition and linguistic variation, R. Bayley and D. Preston (eds), 97–120. Amsterdam: John Benjamins.

Beck, M-L. 1997. “Regular verbs, past tense and frequency: tracking down a potential source of NS/NNS competence differences”. Second Language Research 13: 93–115.

Chierchia, G. 1998. “Reference to kinds across languages”. Natural Language Semantics 6: 339–405.

Chomsky, N. 1998. “Minimalist inquiries: the framework”. MIT Working Papers in Linguistics 15: 1–56.

Chomsky, N. 1999. “Derivation by phase”. Ms. MIT.

Chomsky, N. 2001. “Beyond explanatory adequacy”. Ms. MIT.

Comrie, B. 1976. Aspect. Cambridge: Cambridge University Press.

Embick, D. and Noyer, R. 2001. “Movement operations after syntax”. Linguistic Inquiry 32: 555–595.

Fromkin, V. 1988. “The grammatical aspects of speech errors”. In Linguistics: The Cambridge survey, Vol. II, F. Newmeyer (ed.), 117–138. Cambridge: Cambridge University Press.

Halle, M. and Marantz, A. 1993. “Distributed morphology and the pieces of inflection”. In The view from building 20: Essays in linguistics in honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds), 111–176. Cambridge, MA: MIT Press.

Hansen, J. 2001. “Linguistic constraints on the acquisition of English syllable codas by native speakers of Mandarin Chinese”. Applied Linguistics 22: 338–365.

Jarvis, S. and Odlin, T. 2000. “Morphological type, spatial reference, and language transfer”. Studies in Second Language Acquisition 22: 535–556.

Labov, W. 1989. “The child as linguistic historian”. Language Variation and Change 1: 85–98.

Lardiere, D. 1998a. “Case and tense in the ‘fossilized’ steady state”. Second Language Research 14: 1–26.

Lardiere, D. 1998b. “Dissociating syntax from morphology in a divergent L2 end-state grammar”. Second Language Research 14: 359–375.

Lardiere, D. 2000. “Mapping features and forms in second language acquisition”. In Second language acquisition and linguistic theory, J. Archibald (ed.), 102–129. Malden, MA: Blackwell.

Lardiere, D. 2002. “Second language knowledge of [±past] vs. [±finite]”. Paper presented at GASLA 6, University of Ottawa.

Lasnik, H. 1999. Minimalist analysis. Malden, MA: Blackwell.

Li, A. Y-H. 1990. Order and constituency in Mandarin Chinese. Dordrecht: Kluwer.

Li, C.N. and Thompson, S. 1981. Mandarin Chinese: A functional reference grammar. Berkeley: University of California Press.

Lumsden, J. 1992. “Underspecification in grammatical and natural gender”. Linguistic Inquiry 23: 469–486.

Nation, I.P.S. 1990. Teaching and learning vocabulary. Boston, MA: Heinle and Heinle.

Okuwaki, N. 2000. “Japanese -ta as an auxiliary verb”. Ms. University of Essex.

Packard, J. 2000. The morphology of Chinese: A linguistic and cognitive approach. Cambridge: Cambridge University Press.

Perlmutter, D. 1971. Deep and surface structure constraints in syntax. New York: Holt, Rinehart and Winston.

Prasada, S. and Pinker, S. 1993. “Generalization of regular and irregular morphological patterns”. Language and Cognitive Processes 8: 1–56.

Prasada, S., Pinker, S. and Snyder, W. 1990. “Some evidence that irregular forms are retrieved from memory but regular forms are rule-generated”. Poster paper, 31st Annual Meeting of the Psychonomic Society, New Orleans.

Prévost, P. and White, L. 2000. “Missing surface inflection or impairment in second language acquisition? Evidence from tense and agreement”. Second Language Research 16: 103–133.

Smith, N. and Tsimpli, I-M. 1995. The mind of a savant: Language learning and modularity. Oxford: Blackwell.

Takeda, K. 1999. Multiple headed structures. Unpublished doctoral dissertation, University of California, Irvine.

Tsimpli, I-M. and Roussou, A. 1991. “Parameter resetting in L2?” University College Working Papers in Linguistics 3: 149–169.

Vendler, Z. 1967. “Verbs and times”. In Linguistics and Philosophy, Z. Vendler (ed.), 97–121. Ithaca, NY: Cornell University Press.

Wolfram, W. and Hatfield, D. 1984. “Tense marking in second language learning: patterns of spoken and written English in a Vietnamese community”. ERIC document ED 25960. Washington, DC: Centre for Applied Linguistics.

</TARGET "haw">

<TARGET "cor" DOCINFO AUTHOR "Norbert Corver"TITLE "Perfect projections"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 3

Perfect projections*


Norbert Corver
Utrecht University

1. An interface perspective on L2-knowledge

A central question in current generative research is the question of how perfect a system language is (cf. Chomsky’s minimalist research program: Chomsky 1995, 2000a, 2000b). Perfection is defined here from an interface perspective: the grammatical information provided by the linguistic expressions that are generated by the language L must be legible to the external performance systems within which the language faculty is embedded. In view of the traditional assumption that language is a relation of sound and meaning, i.e. a mental phonetic representation and a mental meaning representation, there are two points of access from external systems. There is a sensory-motor system that is looking at P(honetic) F(orm) and reads off information provided by the PF-representation. And there is some language use or conceptual system that reads off the meaning information provided by the L(ogical) F(orm)-representation. If the linguistic expression generated by the language system is legible both on the sound side and on the meaning side, the expression is said to converge at both interface levels. If there is some element or property which cannot be interpreted at the interface, the expression crashes. The sentence John met Mary Bill crashes, for example, because one of the noun phrases (say Bill) does not receive an interpretation at the LF (i.e. meaning) interface. The noun phrases John and Mary are interpreted as arguments of the two-place verbal predicate met. The noun phrase Bill cannot receive an argumental interpretation, nor any other meaning interpretation, and therefore turns the sentence into an LF-representation that crashes at the meaning interface.
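The convergence/crash contrast for John met Mary Bill can be made concrete with a deliberately simple toy. It is not a model of LF: it assumes only that a predicate offers a fixed number of argument slots and that any noun phrase beyond those slots is left without an interpretation. The lexicon `ARITY` and both function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical toy lexicon: 'met' is a two-place predicate, as in the text.
ARITY: Dict[str, int] = {"met": 2}


def uninterpreted_noun_phrases(predicate: str, noun_phrases: List[str]) -> List[str]:
    """Return the noun phrases left without an argument interpretation."""
    return noun_phrases[ARITY[predicate]:]


def converges_at_lf(predicate: str, noun_phrases: List[str]) -> bool:
    """The expression converges only if no noun phrase is left uninterpreted."""
    return not uninterpreted_noun_phrases(predicate, noun_phrases)


print(converges_at_lf("met", ["John", "Mary"]))                     # True
print(uninterpreted_noun_phrases("met", ["John", "Mary", "Bill"]))  # ['Bill']
print(converges_at_lf("met", ["John", "Mary", "Bill"]))             # False
```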

Taking this interface perspective on human language, one could say that language is an optimally designed, perfect system. It makes linguistic information available to the external systems in a form which is accessible to them. Knowledge of language, then, can be defined in interface terms: a person has knowledge of language L if he is able to form linguistic expressions (i.e. sound-meaning pairs) that are fully-interpretable at the interface levels.

This requirement that the linguistic expressions (PF-LF pairs) generated by the computational system be legible to the external systems plausibly holds for any state of L1-knowledge (both interlanguage knowledge and final state grammatical knowledge).1 In all stages of language acquisition, the language system (i.e. the grammar) must interact with the external systems. If it does not, the linguistic knowledge is not usable at all and consequently there would not be any output products. Thus, from this interface perspective we could say that the L1-products in any stage of language acquisition are perfect linguistic objects (PF-LF pairs) in the sense that they fully consist of interface-interpretable properties.

What does the interface perspective on linguistic knowledge contribute to our view on second language knowledge (and second language acquisition)? In fact, the conclusion seems inescapable that L2-expressions are also perfect grammatical objects. If they were not, the L2-objects generated by the computational system would not be legible and usable at all by the external systems which interact with the ‘L2-grammar’. There simply would not be any output (i.e. utterances). The conclusion must be that L2-products are interpretable both on the meaning side and the sound side. And just like it does for the various L1-knowledge states, this conclusion should also hold for the various knowledge states of the L2-grammar (initial state, interlanguage states and target state).2

Of course, the conclusion that L2-representations are natural language objects, in the sense of being objects that fall within the bounds of Universal Grammar, is not new. Over the last two decades, various researchers interested in investigating the linguistic competence of L2-learners have argued and tried to show that the (mental) L2-representations generated by the interlanguage grammar are constrained by principles of UG (and consequently can be analysed in the same way as other (e.g. L1) linguistic data; cf. Bley-Vroman 1990, Schachter 1989, White 1988, 2000). The importance of the interface perspective on linguistic expressions is that the conclusion of UG-consistency seems inescapable: if interlanguage representations did not obey the bare output conditions (i.e. the interface legibility requirements), these representations would be illegible and inaccessible to the external cognitive systems with which the interlanguage grammar interacts. The linguistic expressions generated by the interlanguage grammar would simply not contain the right ‘instructions’ to the performance systems, and consequently, they would not be able to put them to use. And as a consequence of that, there would not be any output.


Thus, the L2-system, just like the L1-system, is a perfect system in the sense of being a system that is optimally designed to meet external conditions (bare output conditions) imposed by other cognitive systems that the language faculty interacts with. From a different perspective, though, linguistic expressions produced by L2-learners seem to be highly imperfect. They very often deviate, to a greater or lesser extent, from the linguistic expressions produced by adult mother tongue learners of the language that is acquired. Some L2-expression E with meaning representation LFx very often differs from the equivalent L1-expression generated by a mother tongue speaker of the language.

Consider, for example, the ‘imperfect’ L2-expressions in (1a)–(4a), which are produced by Turkish L2-learners of Dutch.3,4 The b-examples represent the corresponding target expressions.

(1) a. Ik komt huis en Slenol wegt huis Stokhasselt.
       I come home and Slenol away-3sg house Stokhasselt
       ‘I go home and Slenol went to his house in Stokhasselt.’ (L2-expression)
    b. Ik kom thuis en Slenol gaat (weg) naar z’n huis in Stokhasselt.
       I come home and Slenol goes (away) to his home in Stokhasselt.
       (target expression)

(2) a. Altijd uh alles woonte Klirsehir van Turkije.
       always uh all/everything live(d) Klirsehir of Turkey
       ‘Everyone still lives in Klirsehir in Turkey.’ (L2-expression)
    b. Nog altijd woont iedereen in Klirsehir in Turkije.
       still always lives everyone in Klirsehir in Turkey
       ‘Everyone still lives in Klirsehir in Turkey.’ (target expression)

(3) a. Ik gaan school.
       I go-inf school
       ‘I go to school.’ (L2-expression)
    b. Ik ga naar school.
       I go-1sg to school
       ‘I go to school.’ (target expression)

(4) a. En dan andere jongens komt.
       and then other boys-pl come-sg
       ‘And then, the other boys come.’ (L2-expression)
    b. En dan komen de andere jongens.
       and then come-pl the other boys-pl
       ‘And then, the other boys come.’ (target expression)


The L2-expressions in the a-examples in (1)–(4) are (superficially) imperfect in the sense that they deviate from the expressions generated by the grammar of mother tongue speakers of Dutch. In (1a), imperfection relates to the lexical item wegt, a verbal form which does not exist in (target) Dutch. In (2a), imperfection concerns the use of the lexical item alles, a quantificational expression that does not express quantification over persons in the target language. In (3a), imperfection relates to the linguistic expression of ‘location’: there seems to be no prepositional element available, which carries the locational (i.e. path) interpretation of school. In (4a), finally, we appear to have an imperfect agreement relation between the (plural) subject-noun phrase and the finite verb.

Even though these L2-expressions may be imperfect from the perspective of the target language, I hope to show in this paper that they are perfect from the perspective of the interface conditions.5 L2-projections are perfect projections in the sense that, at the interface level, they consist of features (associated with lexical items) that are interpretable for the interface conditions. One could say: L2-projections are perfect in being externally interpretable.

On the basis of the types of L2-expressions illustrated in (1)–(4), I hope to show in this article that target (im)perfection (i.e. (non-)correspondence with the target pattern) should be distinguished from interface (im)perfection. L2-products of interlanguage grammars are typically ‘target-imperfect’ but ‘interface-perfect’. Importantly, those target-imperfect but interface-perfect interlanguage expressions can be of two types. First of all, there are interlanguage expressions that are legible at LF because the equivalent L1 expression is legible at LF. An example of such an expression resulting from transfer (or conservation; see Van de Craats, Corver and Van Hout 2000, Van de Craats this volume) is given in (5). Secondly, there are target-imperfect interlanguage expressions that are legible at LF and whose imperfection results from other mechanisms, e.g. the (non-target) merger of some root element and an inflectional suffix. An example of such an expression is the element wegt in (1).

(5) examen van tolk
    exam of interpreter
    ‘the interpreter at the exam’

Let me briefly dwell on the example in (5), which from a superficial perspective looks like a perfect target Dutch pattern (‘surface perfection’). In possessive structures of native speakers of Dutch, the element van is an adpositional (i.e. prepositional) marker that is interpreted as the spell-out of the abstract genitive case feature associated with the possessor DP in postnominal position (as in:

Perfect projections 49

dat boek van Jan, ‘that book of Jan’). A Dutch-based analysis of the sequence examen van tolk is highly unlikely, however; under such an analysis, in which tolk is the complement to the noun examen, the entire noun phrase needs to be interpreted as ‘the exam taken by the interpreter’. This is not the reading it has, which is ‘the interpreter at/of the exam’; examen acts as the ‘possessor’. Given this semantic interpretation, Van de Craats, Corver and Van Hout reach the conclusion that the syntactic structure associated with the linear sequence in (5) is a structure transferred (i.e. conserved) from the first language (i.e. Turkish). This amounts to an analysis according to which van is an inflectional suffix attached to the possessor noun. It is the equivalent of the inflectional element -nin in Turkish expressions like Ayse-nin araba-si (Ayse-gen car-3sg, ‘Ayse’s car’). The Turkish syntactic structure which underlies the Dutch surface sequence in (5) is then the one in (6a). (6b) represents the syntactic structure of the Turkish sequence Ayse-nin araba-si.

(6) a. [DP [AgrP [examen-van]_i [Agr′ [NP t_i tolk] Agr]] D]
    b. [DP [AgrP [Ayse-nin]_i [Agr′ [NP t_i araba] si]] D]

In this article, I will consider examples of interface-legible interlanguage expressions that are ‘target-imperfect’. One may wonder what interface-illegible interlanguage expressions look like. Being illegible at the interface, such illegitimate patterns should never surface in the derivational output. They are ruled out by the bare output conditions at the interface. Potential examples of such illegible interlanguage patterns can be made up, of course. In Van de Craats, Corver and Van Hout (2000), for example, it is noted that even though Turkish L2-learners produce a great variety of interlanguage possessive patterns (see (7) for some examples), certain imaginable patterns are not attested in the L2-data; for example, the patterns in (8). The absence of these patterns in the L2 derivational output hints at the characterization of these patterns as ‘interface-illegible’ interlanguage expressions. The external systems somehow cannot ‘read’ the linguistic instructions provided by these structures and, as a consequence of that, these patterns are never present in the L2-output.

(7) a. examen van tolk
       exam of interpreter
       ‘the interpreter of/at the exam’ (attested)
    b. auto z’n lamp
       car its light
       ‘the car’s light’ (attested)

(8) a. tolk examen van
       interpreter exam of
       ‘the interpreter of/at the exam’ (unattested)
    b. z’n lamp auto
       his light car
       ‘the car’s light’ (unattested)

Through discussion of some illustrative L2-expressions, I hope to show in this chapter that target (im)perfection should be distinguished from interface (im)perfection. L2-products of interlanguage grammars are typically ‘target-imperfect’ but ‘interface-perfect’. As I will show, interface perfection applies both at the level of words and at the level of phrasal categories (i.e. lexical projections). L2-words that may be imperfect from the point of view of the target language are (interface-)perfect at the lexicon-syntax interface (see Section 2). And L2-phrases that are target-imperfect turn out to be fully legible (hence perfect) at the LF-interface (see Sections 3, 4 and 5).

2. Perfect L2-words at the lexicon-syntax interface

Chomsky (1995) systematically refers to PF and LF as interfaces of syntax with, respectively, a perception/articulation system and an interpretation/use system, both being mental faculties. The syntactic structure generated by the computational system (say, merge and move/attract) is assigned a PF-representation and an LF-representation. Besides the PF- and LF-interface of syntax, a third interface of the syntax can be identified, viz. the syntactic representation built up from the lexicon. This is explicitly stated in Chomsky (1991:46):

…that there are three ‘fundamental’ levels of representation: D-structure, PF, and LF. Each constitutes an ‘interface’ of the syntax (broadly construed) with other systems: D-structure is a projection of the lexicon, via the mechanisms of X-bar theory; PF is associated with articulation and perception, and LF with semantic interpretation.

Although a separate level of D-structure is no longer adopted in minimalist theorizing, the general idea that syntactic structure is built up from the lexicon is still a core assumption of generative linguistics. This is also clear from the following statement by Chomsky (1995:225):

Another natural condition is that outputs consist of nothing beyond properties of items of the lexicon (lexical features) – in other words, that the interface levels consist of nothing more than arrangements of lexical features.

For the lexicon-syntax interface, this implies that each lexical item (i.e. a constellation of lexical features) must be legible to the computational system (merge and attract/move) that accesses these objects (i.e. lexical expressions) and builds more complex expressions (i.e. syntactic expressions) from those lexical expressions. A question which then arises is: What makes a lexical item legible to the computational system (i.e. the ‘rules’ of grammar: e.g. merge, move/attract, agree)?

To answer this question, let us first address the question of what a lexical item (LI) is. In line with De Saussure’s conception of words, an LI is typically defined as a sound-meaning pair (i.e. a PF-LF pair). In a sense, an LI is a structured object with a sound representation (the phonological matrix; sound properties) and a meaning representation (semantic properties).

Presumably it is not the phonological and purely semantic properties which make a lexical item legible to the computational system. If an LI were just a sound-meaning pair, one could wonder what the syntax (i.e. the recursive procedure) should do with it. That is, what would make such a sound-meaning pair legible to the computational system? Phonetic and purely semantic features do not seem to be accessed by the recursive syntactic procedures. In short, these PF-LF-pairs would remain illegible to the computational system.

So, what makes these sound-meaning pairs legible to the computational rules, which combine these pairs into more complex sound-meaning constructs? The answer is: formal (i.e. syntactic) features. Suppose a formal feature must be ‘added’ (merged) to the sound-meaning pair (i.e. the lexical item) for the LI to be legible to the computational processes that generate larger structures. In other words, merger of a categorial feature with the sound-meaning pair turns the lexical item into an object that is visible at the lexicon-syntax interface (cf. Marantz 1997, Chomsky 2000b).

Thus, what you have at the clausal level (syntax as a mediating representation between sound and meaning) is also what you have at the word level:

(9) meaning

syntactic-formal feature

phonology

Surface imperfections in the L2 derivational output can now be due to miscategorisation (i.e. ‘mis-’ from the perspective of the target language): an incorrect categorial feature is associated with some sound-meaning pair. In (10)–(13), some examples of miscategorisation by L2-learners are given:

(10) Hier komt weg. Ik beetje momentes.
     here comes away I a-bit moment-infl
     ‘Here he goes away. I wait a bit.’

(11) Ik komt huis en Slenol wegt huis Stokhasselt.
     I come home and Slenol away-3sg house Stokhasselt
     ‘I go home and Slenol went to his house in Stokhasselt.’

(12) A: In de buit ligt die.
        in the outside lie those
        ‘Outside lay these.’
     I: Hm? ⟨lack of understanding⟩
     A: In de buiten. Buiten ook heeft de steen.
        in the outside outside also has the stone
        ‘There are also stones (at the) outside.’

(13) a. Ja verzeker betalen he.
        yes insure pay discourse-prt
        ‘Yes, my insurance will pay.’
     b. Uh ik ongeluk beur maar ik heb nu geen verzeker.
        uh I accident happen but I have now no insure
        ‘I had an accident but I have no insurance now.’

In (10) and (11), moment and weg are treated as roots (i.e. sound-meaning pairs) that receive a verbal character after merger of a verbal categorial feature. Schematically (order irrelevant):

(14) [v v moment/weg]

After attachment of the verbal categorial feature to the root, the lexical item displays verbal behavior: it carries, for example, the verbal inflection -t (present tense, third person singular). From the perspective of the (grammar of the) L2-learner, there is nothing odd about these lexical items: they each represent a root (a sound-meaning pair), which carries a categorial feature that turns it into a verbal form that is legible at the lexicon-syntax interface, in the sense that the lexical item (carrying a categorial feature) is accessible to the computational (i.e. morphosyntactic) rules. In short, these lexical items represent perfect objects in the L2-learner’s grammar.

Another interesting example of a target-imperfect but (lexicon–syntax) interface-perfect object is the verbal form bint in the following examples:

(15) a. Ja komt politieauto en hij bint in auto toe.
        yes comes police-car and he inside-3sg in car prt
        ‘Yes, there comes a police car and he goes into (enters) the car.’
     b. I: Hij wat?
           he what?
        A: Hij bint auto.
           he inside-3sg car
           ‘He goes into the car.’

The use of bint in these examples seems to be a combination of L1-transfer (i.e. conservation) and creative use of the L2. Turkish has a verb binmek, which means ‘to get in’, and Dutch has a preposition/particle binnen, which means ‘inside’.6

The lexical items buit(en) (cf. (12)) and verzeker (cf. (13)) are also perfect objects in the L2-learner’s grammar. As can be concluded from their co-occurrence with determiner-like elements (de, geen), these items are nominal. This nominal behaviour is represented by the nominal categorial feature that is attached to the root of the lexical item, i.e. the ‘bare’ sound-meaning pair. Schematically:

(16) [n n buit(en)/verzeker]

The verbal analysis of items like momentes, wegt and bint and the nominal analysis of buit(en) and verzeker are imperfect from the perspective (of the grammar) of the mother tongue speaker of Dutch. For him, binnen (meaning: ‘inside’) and buiten (meaning: ‘outside’) are both prepositions; moment (meaning: ‘moment’) is a noun and verzeker (meaning: ‘to insure’) is a verbal form. In short, the ‘wrong’ categorial feature is associated with the root in these L2-expressions. From the perspective of the L2-learner’s interlanguage grammar, however, these non-target-words are perfect projections: assignment of a categorial value makes these items legible at the lexicon-syntax interface and accessible to the computational rules that take these objects as their input. The noun buit, for instance, is merged with the determiner de.
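The categorisation step described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not part of the analysis itself: the names `Root`, `LexicalItem`, `categorise` and `inflect_3sg` are hypothetical, and legibility at the lexicon-syntax interface is reduced here to the presence of some categorial feature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Root:
    """A bare sound-meaning pair, without any formal (syntactic) feature."""
    sound: str
    meaning: str


@dataclass(frozen=True)
class LexicalItem:
    """A root plus an optional categorial feature ('v', 'n', 'p')."""
    root: Root
    category: Optional[str] = None

    def is_legible(self) -> bool:
        # Simplifying assumption of this sketch: an item is legible to the
        # computational system as soon as it carries a categorial feature.
        return self.category is not None


def categorise(root: Root, category: str) -> LexicalItem:
    """Merge a categorial feature with a root, as in (14) and (16)."""
    return LexicalItem(root, category)


def inflect_3sg(item: LexicalItem) -> str:
    """Attach the verbal suffix -t; only verbal items accept it."""
    if item.category != "v":
        raise ValueError("only verbal items take -t in this sketch")
    return item.root.sound + "t"


weg = Root("weg", "away")
buiten = Root("buiten", "outside")

bare_weg = LexicalItem(weg)              # no categorial feature yet
l2_weg = categorise(weg, "v")            # learner's (non-target) choice
l2_buiten = categorise(buiten, "n")      # learner's (non-target) choice
target_buiten = categorise(buiten, "p")  # target Dutch: preposition

print(inflect_3sg(l2_weg))
```

On this view the learner's wegt and the target preposition buiten are built by the same operation. They differ only in which categorial value is chosen, which is what makes the L2 item target-imperfect yet legible.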

Summarizing, ‘miscategorisation’ (i.e. ‘mis’ from the perspective of the target language) yields what could be called ‘target imperfect’ lexical items. Because of this miscategorisation, these L2-lexical items are often hard to understand for mother tongue speakers of that language. From the interface perspective, however, there is no reason to believe that these lexical items are illegible for the system (read: (morpho)syntax) which interacts with the lexicon. After assignment of a categorial feature, each such item is input to the combinatorial rules of the grammar.

3. Perfect quantificational expressions

Thus far, we have seen that mis-categorisations on the part of the L2-learner lead to the formation of L2-expressions that are perfect from an interface perspective but may be deemed ‘imperfect’ when compared with the target language.

These target imperfections do not only appear in the domain of content words (also called: lexical categories), but, not unexpectedly, also in the domain of function words (also called: functional categories). An interesting illustration of interface legibility of target imperfections comes from the domain of quantified noun phrases. As shown by the following examples, quantifying expressions may display different forms depending on their function and position within the syntactic structure. Consider, for example, the following variants of the universal quantifier al in present-day Dutch:7

(17) a. Jan heeft alle mensen herkend.
        Jan has all people recognized
     b. Jan heeft alles herkend.
        Jan has everything recognized
     c. Jan heeft allen herkend.
        Jan has all recognized
        ‘Jan has recognized all of them/everyone.’

In (17a), we have the quantifying determiner alle, which is often treated as a fusion of the pre-determiner al (cf. note 7) and the definite article de: al + de → alle (cf. Paardekooper 1974, Verkuyl 1981; but see Zwarts 1992). In (17b), al is followed by the sequence (e)s, which presumably used to be a genitive case suffix in older variants of Dutch, but is no longer recognized as such; i.e. alles seems to have developed into a non-composite form which carries a neuter meaning: ‘everything’. As shown by (17b), alles, as opposed to the quantificational form alle, occupies an argument position in the clause. The quantificational element allen, finally, represents a plural form. It always refers to human beings, and as such arguably carries the formal property [+human]. Just like alles, the lexical item allen in (17c) occupies an argument position within the clause.8

Importantly, all these quantified expressions obey the universal constraint that there is a restriction on the quantificational element (say al). This restricted reading on the quantifier can be represented as follows:

(18) a. for all x_i [x_i: people] (cf. (17a))
     b. for all x_i [x_i: things] (cf. (17b))
     c. for all x_i [x_i: people] (cf. (17c))

By allowing this restricted quantificational reading (i.e. the set of individuals (objects, persons) is specified over which the quantifier ranges) and binding a variable at LF, the nominal expressions in (17) are interpretable at the LF-interface.

Quantification, being a core property of natural language, is also found in the L2-derivational output. Also with quantificational expressions we find patterns which are imperfect from the perspective of the target language but perfect from an interface perspective on natural language expressions. Consider, for example, the forms in (19) and (20), which are all produced by one and the same Turkish learner of Dutch (viz. Abdullah):

(19) a. Wat doet alles ik weet niet.
        what does all I know not
        ‘I don’t know what everyone does.’
     b. Altijd uh alles woonte Klirsehir van Turkije.
        always uh all lived Klirsehir of Turkey
        ‘Everyone still lives in Klirsehir in Turkey.’
     c. En dan vandaag hier komen alles.
        and then today here come all
        ‘Today, everyone comes here.’

(20) a. Alles mensen # toerist ja.
        all people (were) tourists yes
     b. Ken je in Nederlands alles stad?
        know you in Netherlands every city
     c. Alles ja uh kinderen niet Turks spreken.
        all yes uh children not Turkish speak
        ‘All children don’t speak any Turkish.’
     d. Wij maakte alles maar ik weet niet alles naam # Nederlands.
        we make everything but I know not everything name # Dutch
        ‘We make everything but I don’t know the name of everything in Dutch.’

In (19) and (20), the form alles is used instead of the target forms allen and alle, respectively. The L2-learner has identified the universal quantificational meaning of alles, but he has not discovered yet that alles has a non-human interpretation and that it cannot occur as a quantificational determiner. Or to put it differently, the L2-learner has identified the quantifier feature associated with alles, but other lexical features, like its categorial feature (e.g. alle being a quantificational determiner and alles being a noun-like expression), do not seem to have been identified yet.

Importantly, the quantificational expressions in (19) and (20) are LF-interpretable: they receive a restrictive reading (i.e. a set of objects is defined over which they range) and bind a variable at LF. Thus, an L2-expression like (19c) has the following LF-structure:

(21) For all x_i, [x_i: people], x_i come here today.

In short, the pattern alles (N) represents an LF-interpretable structure from an interface perspective.
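The restricted reading in (18) and (21) can be made concrete with a short Python sketch. It assumes a toy extensional model: the sets `people` and `came_here_today` and the helper `for_all` are hypothetical and introduced purely for illustration.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def for_all(restriction: Iterable[T], scope: Callable[[T], bool]) -> bool:
    """Evaluate 'for all x_i [x_i: restriction], scope(x_i)'.

    The restriction supplies the set over which the quantifier ranges;
    the scope is an open sentence whose variable the quantifier binds.
    """
    return all(scope(x) for x in restriction)


# Hypothetical toy model, not data from the study.
people = {"Ali", "Fatma", "Jan"}
came_here_today = {"Ali", "Fatma", "Jan", "Piet"}

# (21) For all x_i, [x_i: people], x_i come here today.
everyone_came = for_all(people, lambda x: x in came_here_today)
print(everyone_came)
```

The form of the quantifier (alles, allen or alle) plays no role in the evaluation. Only the restriction and the bound variable matter, which is the sense in which the L2-form is interpretable at LF.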

LF-legibility also holds for the lexical item alleen, as it is produced in the following examples by the same Turkish learner of Dutch:

(22) a. Ja nu alleen mag.
        yes now ‘alleen’ may
        ‘Yes now, everything is permitted.’
     b. Ik wil alleen leren.
        I want ‘alleen’ learn
        ‘I want to learn everything.’

(23) a. Maar niet alleen.
        but not ‘alleen’
        ‘But not everyone (is unpleasant).’
     b. Maar ik zeg niet alleen slechte mensen.
        but I say not ‘alleen’ bad people
        ‘But I don’t say that everyone is bad.’

The item alleen in (22) and (23) receives a clearly (universal) quantificational interpretation. In (22), it receives the interpretation ‘everything’; in (23), it is interpreted as ‘everyone’. Also in these examples, then, the L2-learner has identified the universal quantificational element al.

Interestingly, this quantificational reading is not the one which is associated with the lexical item alleen in the target language. In Dutch, alleen means ‘alone’, a reading which arguably derives from the two elements that compose this expression: the quantifier al and the numeral een: alleen actually means: ‘one is all’.9 As regards its distributional behaviour, alleen occurs as a floating element which enters into a predicative relationship with a noun phrase in the sentence. In (24), for example, alleen is predicated over the subject ik.

(24) Ik ben toen alleen naar de bakker gegaan.
     I am then alone to the baker went
     ‘Then I went to the bakery shop alone.’

Contrary to the L2-expression alleen in (22) and (23), the target language item alleen can never occur in an argument position:

(25) *Ik kende alleen.
      I knew alone

Although the L2-lexical item alleen, which has a universal quantificational meaning, does not occur in the target language, we should not conclude from this that it is an illegitimate object. At the LF-interface, this L2 quantificational expression receives a restrictive reading (‘for all x_i [x_i: persons/things]’) and binds a variable at LF.

(26) For all x_i, [x_i: thing], I want to learn x_i

As a matter of fact, the L2 quantificational expression alleen is just as perfect from an interface perspective as a target quantificational expression like iedereen (‘everyone’) and menigeen (‘many a one’). These forms are sometimes analysed as composite quantificational expressions that consist of a quantifying element (ieder, menig) and an indefinite pronominal part (een), which, in the target language, refers to humans (just like English one in One shouldn’t do that). The L2 learner who produces the forms in (22) and (23) has possibly identified the composite character of the quantificational expression alleen: it consists of the universal quantifier al and the indefinite pronominal element een. As opposed to a target expression like iedereen, however, the L2 expression alleen is not restricted to quantification over humans. This suggests that the L2-learner has not discovered yet that the indefinite pronominal een is restricted to a human interpretation.

To summarize: forms like alles and alleen, as produced in (19)–(20) and (22)–(23), are legitimate expressions at the LF-interface. Even though their distribution and interpretation (e.g. nonhuman versus human) may differ from that of the target lexical items alles and alleen, the conclusion seems inescapable that these quantificational expressions are fully legible at the LF-interface: they receive a restrictive reading and bind a variable at LF.

4. The interpretability of apparently P-less structures

Target imperfection is also found with L2-structures expressing ‘prepositional’ features like ‘location’ and ‘path’. Consider, for example, the expressions in (27) and (28), which are produced by a Turkish learner of Dutch (see Schenning (1998) for extensive discussion).

(27) a. Hij ook woon Kirslehir.
        he also lives Kirslehir
        ‘He also lives in Kirslehir.’
     b. Hij werkt Ankara.
        he works Ankara
        ‘He works in Ankara.’
     c. Ik nooit geweest Istanbul.
        I never been Istanbul
        ‘I have never been in Istanbul.’

(28) a. Ik gaan school.
        I go school
        ‘I go to school.’
     b. Kom maar mijn huis.
        come just my house
        ‘Come to my house.’
     c. Ja wij moet altijd moskee gaan een dag vijf keer moskee gaan.
        yes we must always mosque go a day five time mosque go
        ‘Yes, we must go to the mosque five times every day.’

As shown by the following target Dutch equivalents of (27a) and (28a), respectively, Dutch requires the presence of a prepositional element in those syntactic constructs that express the abstract property of location.

(29) Hij woont ook in Kirslehir.
     he lives also in Kirslehir

(30) Ik ga naar school.
     I go to school

In (27), the elements Kirslehir, Ankara and Istanbul indicate a static location, i.e. ‘place’. In (28), the elements school, huis and moskee have a ‘path’ interpretation. It is obvious that abstract meaning properties like ‘place’ or ‘path’ are not directly related to these nouns. These nominal elements don’t have an inherent locative meaning, as is clear from sentences in which they fulfil a non-locative role, as in (31):

(31) a. Ik ken Ankara goed.
        I know Ankara well
     b. Ik zie mijn huis.
        I see my house

It is more likely that the locative properties ‘place’ (static location) and ‘path’ (dynamic location) are associated with the category P(reposition). In the (target) Dutch examples in (29)–(30), this prepositional element expressing the abstract meaning property ‘space’ or ‘path’ is phonetically realized; in the L2-expressions in (27)–(28) it is not. The ‘non-visibility’ (i.e. phonetic absence) of the prepositional element does not imply that the category P (and its projection) is absent in expressions like Kirslehir and school. In fact, the preposition may very well be empty:

(32) [PP [P Ø ⟨space⟩] [NP Kirslehir]] (cf. (27a))

(33) [PP [P Ø ⟨path⟩] [NP school]] (cf. (28a))

In Emonds (1985, 2000), a principle is proposed that permits a closed class category to be empty under the condition that it is realized on its phrasal sister. According to this principle, which Emonds calls the Invisible Category Principle (ICP), ‘empty P’ structures can be utilized in a language if features like path or space are realized on the NP-sister by means of a case marking. Thus, although the interpretable P-feature (i.e. path/space) itself is associated with the P-head of the prepositional structure, it can be alternatively realized on the NP-sister. These alternative spell-outs of the locative/path feature are pure spell-outs of features and appear late in the derivation (i.e. spell out at PF).

As noted in Kornfilt (1996), locative meanings are often expressed by means of case suffixes in a language like Turkish:

(34) a. Kitap masa-da.
        book table-loc
        ‘The book is on the table.’
     b. Hasan Ankara-ya git-ti.
        Hasan Ankara-dat go-past
        ‘Hasan went to Ankara.’
     c. Hasan Ankara-dan gel-di.
        Hasan Ankara-abl come-past
        ‘Hasan came from Ankara.’

Following Emonds’ ICP, these structures could be interpreted as empty P-structures, which have the prepositional feature (space/path) realized on the NP-sister as a case-suffix. For example (order of P and NP-sister irrelevant):

(35) [PP [P Ø ⟨space⟩] [NP masa-DA]] (-DA as alternative realization of P-feature)

Under the assumption that L2-learners take a conservative approach towards the expression of location and path denoting expressions (i.e. PPs), it is expected that they initially do not realize the prepositional head. Just like in Turkish, they try to realize the prepositional feature by means of a case marking on the NP-sister of P, a strategy which is not available in Dutch.10

The following L2-variants of prepositional structures are interesting in this context:

(36) a. I: Waar woont hij?
           where lives he
        O: van Tilburg.
           of (= in) Tilburg
     b. I: Wanneer ziet ze die jongen dan?
           when sees she that boy then
        O: van Trabzon.
           of (= in) Trabzon
     c. En dan beetje wandelen van ⟨name of street⟩.
        and then bit walk of ⟨name of street⟩
        ‘And then I walk a bit in ⟨name of street⟩.’

In these examples, the prepositional element van appears in a prepositional structure which denotes a location. Van itself does not seem to carry any locative ‘meaning’. As a matter of fact, this meaningless preposition van also shows up in non-locative prepositional contexts (cf. Schenning 1998):

(37) a. Ik niet trouwen van Yvette.
        I not marry of Yvette
        ‘I don’t marry Yvette.’
     b. Ik zegt: waarom jij niet praten van mij?
        I say why you not speak of (= with) me
     c. En dan moet ik vertellen van hun.
        and then must I tell of them
        ‘And then I must tell it to them.’

This distribution of the meaningless element van is suggestive of an analysis in which it is a case-suffix which alternatively realizes the locative prepositional features and other types of prepositional features. Schematically, the van-variants would then have the structure in (38):

(38) [PP [P Ø ⟨space⟩] [NP VAN-Tilburg]] (VAN as a case-affix)

In conclusion, both the prepositional structures in (32)–(33) and the prepositional structure in (38) are LF-interpretable objects. The empty preposition (and its projection) carries the interpretable property ‘space’ or ‘path’. The target imperfection is simply a surface phenomenon that relates to the phonetic spell out of the prepositional position: the L2-learner initially leaves the prepositional head empty and tries to realize the prepositional feature by means of a case-marking (i.e. alternative feature realization). Given the lack of clear case markings in Dutch, this alternative realization remains empty initially. At a certain stage in the acquisition process, the prepositional feature gets alternatively realized on the NP-complement by the semantically empty ‘preposition’ van.
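The spell-out options discussed in this section can be summarised in a small Python sketch. It is a simplification under stated assumptions: the class `PP`, its fields and its methods are hypothetical names, and treating van as an affix spelled before the noun is only one way of rendering the structure in (38).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PP:
    feature: str                      # interpretable P-feature: 'space' or 'path'
    noun: str                         # head of the NP-sister
    p_form: Optional[str] = None      # overt preposition; None means empty P
    case_affix: Optional[str] = None  # alternative realization on the NP
    affix_first: bool = False         # True if the affix precedes the noun

    def lf_feature(self) -> str:
        # The feature is carried by P whether or not P is pronounced.
        return self.feature

    def spell_out(self) -> str:
        np = self.noun
        if self.case_affix:
            if self.affix_first:
                np = f"{self.case_affix} {np}"
            else:
                np = f"{np}-{self.case_affix}"
        return f"{self.p_form} {np}" if self.p_form else np


target_dutch = PP("space", "Kirslehir", p_form="in")    # cf. (29)
turkish = PP("space", "masa", case_affix="da")          # cf. (34a), (35)
early_l2 = PP("space", "Kirslehir")                     # cf. (27a), (32)
later_l2 = PP("space", "Tilburg", case_affix="van",
              affix_first=True)                         # cf. (36a), (38)

for pp in (target_dutch, turkish, early_l2, later_l2):
    print(pp.spell_out(), "->", pp.lf_feature())
```

All four objects return the same value from `lf_feature`, while `spell_out` differs. This mirrors the claim that the target imperfection concerns phonetic realization only.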

5. Agreement and asymmetric spell out

In traditional grammars, one often finds the observation that agreement is an asymmetric relation. A verb, for example, is said to agree with its subject-DP in person and number; it is not the subject-DP that is dependent on the verb for agreement. And an adjective is said to agree in number and gender with the noun it modifies; it is not the noun that is valued for certain phi-features under agreement with the modifying adjective. In recent generative studies (cf. e.g. Chomsky 1995), this asymmetry of the agreement relationship is captured in terms of the notion ‘interpretable (formal) feature’. An interpretable (formal) feature is a feature that makes a semantic contribution at the LF-interface (i.e. it is interpretable at LF). A non-interpretable (formal) feature has no interpretation at LF (or for that matter: PF). Structural case for nouns and phi-features for categories that agree with nouns are core examples of uninterpretable formal properties.

The asymmetry in the agreement relationship, as observed in traditional grammatical studies, has been reinterpreted in terms of the notions [+interpretable] versus [−interpretable]. It is the element carrying the [+interpretable] feature that agrees with the element carrying the [−interpretable] feature. This agreement relationship involves feature matching and elimination of the uninterpretable feature that is associated with the matching constituent. In (39a), for example, it is the verb zagen that enters into an agreement relationship with the plural subject-noun phrase de mannen. Plurality (‘more than one’) and singularity (‘one’) are interpretable properties of nouns. In a way, the plurality feature on the verb zagen is redundant; it does not contribute any semantics and, as such, can be characterized as uninterpretable. The plurality marking on the verb is just a formal marker of the agreement relationship; semantically, it does not contribute anything to the linguistic expression. Therefore, in a certain intuitive sense, it is easier to imagine that a plural interpretation gets associated with the ill-formed expression (39b) than with the ill-formed expression in (39c). In (39b), plurality is morphologically specified on the noun, i.e. the category that carries the number feature as a [+interpretable] property, and the verb is not morphologically specified for plurality. Even though there is a morphological mismatch, there is a tendency to assign this sentence a plural interpretation: that is, the interpretation ‘the men-pl saw-pl me’ is much more likely than the interpretation ‘the man-sg saw-sg me’. In other words, it is the plurality marking on the noun that most strongly determines the semantic interpretation of this expression which does not satisfy agreement at the level of morphological expression. Consider next (39c), where we have the reverse situation: the subject noun phrase does not bear any marking of plurality; it is the verb zagen which is plural morphologically. In spite of the plural marking on the verb, it is intuitively more difficult to get a plural reading of the noun. As a matter of fact, it is the singularity of the noun which seems to be dominant again for the interpretation. In short, given the fact that in subject-verb agreement relations, it is the noun that determines plural or singular interpretation, it is expected that morphological marking of singularity or plurality is more likely to be realized on the noun than on the verb.

(39) a. De mannen zagen mij.
        the men-pl saw-pl me
     b. *De mannen zag mij.
        the men-pl saw-sg me
     c. *De man zagen mij.
        the man-sg saw-pl me

This asymmetry in the morphological realization of the number feature is reflected in the L2 derivational output: patterns are typically found in which the agreement properties of the noun are correctly realized overtly, but not those of the verb. In other words, morphological spell-out of the agreement property generally applies to the element carrying the [+interpretable] property and not to the element carrying the [−interpretable] property. Consider, for example, the following L2-utterances, which are produced by a Turkish learner of Dutch.

(40) a. En dan andere jongens komt.
        and then other boys-pl come-sg

     b. Maar twee meisen drie jongens weet ut wel maar heel klein beetje.
        but two girls-pl three boys-pl know-sg it prt but very little bit
        ‘But two girls and three boys knew it, but only a little bit.’

     c. Twee jongens woont Oisterwijk.
        two boys-pl live-sg Oisterwijk
        ‘Two boys live in Oisterwijk.’

     d. Die jongens komt Turkije.
        those boys-pl come-sg Turkey
        ‘Those boys come to Turkey.’

In these examples, the phi-feature number carries the value ‘plural’. This number feature is correctly spelled out on the noun of the subject noun phrase that enters into an agreement relation with the verb. The verb carries what looks like a singular inflection. Arguably, this verbal form is unanalysed morphologically or, alternatively, the marking -t is underspecified for number.

What is important is that, even though the L2-expressions in (40) may be regarded as imperfect from the perspective of the target language, they are perfect from the perspective of the LF-interface: i.e. they are expressions that are fully legible semantically.
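The asymmetry described in this section can be made concrete with a small sketch. The Python below is only an illustrative model and not part of the analysis itself: the names `Word`, `interpret_number` and `agreement_mismatch` are hypothetical, number is reduced to two values, and the form komt is treated as underspecified for number, which is one of the two options mentioned above. The sketch encodes the idea that the number reading of a clause is read off the [+interpretable] feature on the subject noun, while the marking on the verb is a purely formal reflex.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    form: str
    category: str            # "N" (noun) or "V" (verb)
    number: Optional[str]    # "sg", "pl", or None when underspecified


def interpret_number(subject: Word, verb: Word) -> Optional[str]:
    """Return the number reading of the clause.

    Number is [+interpretable] on the noun only, so the verb's marking is
    deliberately ignored here.
    """
    return subject.number


def agreement_mismatch(subject: Word, verb: Word) -> bool:
    """True when the verb is overtly marked for a number the noun lacks.

    An underspecified verb form (number=None) never counts as a clash.
    """
    return verb.number is not None and verb.number != subject.number


# Toy entries for the words in (39) and (40a); the feature values are assumptions.
mannen = Word("mannen", "N", "pl")
man = Word("man", "N", "sg")
zagen = Word("zagen", "V", "pl")
zag = Word("zag", "V", "sg")
jongens = Word("jongens", "N", "pl")
komt = Word("komt", "V", None)   # -t treated as underspecified for number

for label, subj, verb in [("39a", mannen, zagen), ("39b", mannen, zag),
                          ("39c", man, zagen), ("40a", jongens, komt)]:
    print(label, interpret_number(subj, verb), agreement_mismatch(subj, verb))
```

On this toy model, (39b) and (39c) are both flagged as morphological mismatches, yet each still receives a number reading from the noun, which is the sense in which such expressions remain legible at the LF-interface.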

Let me close off this section with another illustration of an LF-interpretable structure in which the agreement properties are morphologically spelled out asymmetrically. The relevant example comes from noun-phrase internal agreement between a numeral and a noun. As noted among others in Emonds (1985), plurality is a property of numerals: i.e. numbers above one are inherently marked for the property [+plural]. This formal property is interpretable, since plurality plays a role in the semantic interpretation of a linguistic expression. As illustrated by the following L2-expressions, plurality is not always marked on the ‘agreeing’ noun:

(41) a. Vijf minuut he uh en dan klaar # alles dit.
        five minute prt prt and then ready # all this
        ‘All this is ready in five minutes.’

     b. I: Hoe lang zit je al weer op school # twee weken?
        I: how long sit you already at school # two weeks
        A: Vijf daag.
        A: five day

     c. I: Heb je nu vakantie van school?
        I: have you now holiday from school
        A: Ja twee week.
        A: Yes two week

     d. Vader, moeder en drie broer.
        father mother and three brother

In (41a), the noun minuut does not carry plural morphology. The target pattern would be: vijf minuten. Also in this case, then, the L2-learner chooses the strategy of not morphologically expressing plurality if this semantic feature is already specified in the projected structure. In other words, redundant marking of the plurality feature on the noun is avoided.

Again, it is important to stress that the numeral+noun patterns in (41) are fully legible at the LF-interface. From a target language perspective, however, these structures are imperfect: in Dutch, plurality is (redundantly) marked on the noun when it combines with a numeral that is inherently specified for plurality (i.e. ‘more than one’). In this respect, Dutch differs, by the way, from Turkish. As noted in Kornfilt (1996: 225), there are syntactic contexts in Turkish where, despite plural semantics of the noun phrase, the head noun cannot be marked for plurality. When the noun is preceded by a numeral or certain quantifiers, the plural suffix cannot occur:


(42) a. beş çocuk(*-lar)
        five child(*ren)
        ‘five children’

     b. birçok çocuk(*-lar)
        many child(*ren)
        ‘many children’

In view of the non-redundant marking in (42), it is likely that the L2-learner who has produced the numeral+noun patterns in (41) has adopted a conservative strategy: the Turkish rule of not morphologically marking plurality on the noun when it is preceded by a numeral (or certain quantifiers) also underlies the numeral+noun sequences in his second language, i.e. Dutch (see again Van de Craats (this volume) for further discussion of the notion of conservation).
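The contrast between the target rule and the conserved L1 rule can be sketched as two small functions. This is a toy illustration under stated assumptions: the lexicon `PLURALS`, the numeral table `NUMERALS` and the function names are hypothetical, and the Dutch plural forms are simply listed rather than derived by rule.

```python
# Toy lexicon (an assumption for illustration): singular -> target Dutch plural.
PLURALS = {"minuut": "minuten", "week": "weken", "broer": "broers"}
# Numerals with their values; anything above one counts as inherently [+plural].
NUMERALS = {"een": 1, "twee": 2, "drie": 3, "vijf": 5}


def target_dutch(numeral: str, noun: str) -> str:
    """Target rule: plurality is (redundantly) marked on the noun as well."""
    head = PLURALS[noun] if NUMERALS[numeral] > 1 else noun
    return f"{numeral} {head}"


def conserved_l1_rule(numeral: str, noun: str) -> str:
    """Turkish-style rule: no plural marking on a noun preceded by a numeral."""
    return f"{numeral} {noun}"


for numeral, noun in [("vijf", "minuut"), ("twee", "week"), ("drie", "broer")]:
    print(target_dutch(numeral, noun), "|", conserved_l1_rule(numeral, noun))
```

The second function yields the learner patterns in (41a), (41c) and (41d). In both cases plurality is recoverable from the numeral, so the two outputs differ only in redundant morphology.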

6. Conclusion

From the perspective of the target language, L2-expressions often seem highly imperfect. At the surface, these L2-expressions (e.g. Dutch L2-products of Turkish learners) seem to differ greatly from those produced by mother tongue speakers. From a different perspective, though, there does not seem to be much wrong with those L2-expressions: they are perfect expressions, in the sense that they meet conditions imposed by other cognitive systems that the language faculty interacts with (external requirements). That is, any L2 (interlanguage) grammar provides (grammatical) information that is ‘legible’ to the cognitive systems with which it interacts. I have tried to illustrate the interface-legibility of L2-expressions by means of four types of phenomena: (a) categorial labeling of words, (b) quantificational expressions, (c) the expression of location in prepositional structures, and (d) the morphological expression of certain agreement patterns. As regards the categorial labeling of words, it was noted that certain L2-words (e.g. an L2-verb like wegt (‘goes’)) that do not exist in the target language are perfect lexical constructs from a (lexicon-syntax) interface perspective: each is a sound-meaning pair that, through assignment of a categorial value (i.e. V), becomes accessible to the computational system (merge, move, morphological rules) of the interlanguage grammar. I further argued that certain L2 quantificational expressions (e.g. alles mensen) that are imperfect from a target language perspective are fully legitimate from the perspective of LF-legibility: they are ‘normal’ quantificational expressions in the sense of allowing a restricted quantificational reading and binding a variable at LF. I further argued that LF-legibility also holds for what, at the surface, looks like a bare nominal carrying a locative meaning (e.g. Ankara, meaning ‘in/to Ankara’). At a more abstract level, these L2-expressions are prepositional structures, which have the locative meaning associated with an empty P. These ‘empty-P’ structures are fully interpretable expressions at the LF-interface. Finally, it was observed that in L2-agreement patterns (of beginning learners) it is typically the [+interpretable] element that gets morphologically marked. Absence (or underspecification) of morphological marking of the [−interpretable] feature may yield a pattern which is imperfect from the perspective of the target language. From the interface perspective, however, the non-redundantly marked pattern is perfect: an agreement property like number is typically spelled out on those items for which singularity or plurality is an inherent semantic property.

In this paper, I have only ‘glanced over’ a variety of L2-expressions from the interface perspective, a perspective which is characteristic of the minimalist thesis. The major purpose was to show that, by taking this perspective, the conclusion seems inescapable that L2-expressions are perfect grammatical objects, where perfection amounts to legibility of their information to the systems with which they interact.

Notes

* I thank the participants at the workshop for their comments on the talk. I would also like to thank an anonymous reviewer for helpful comments and suggestions.

1. In a recent interview with Adriana Belletti and Luigi Rizzi, Chomsky states the following (cf. Chomsky, Belletti and Rizzi 1999 (rev. 2000: 17)): “Every language meets minimalist standards. Now, that means that not only the language faculty, but every state that it can attain yields an infinite number of interpretable expressions. That essentially amounts to saying that there are no dead ends in language acquisition.” He further states: “The minimalist thesis would say that all states have to satisfy the condition of infinite legibility at the interface.”

2. A reviewer raises the following question: What can a generative-minimalist theory account for with respect to interlanguage expressions that other theories (e.g. a GB-based approach using presumed UG-notions like government) cannot account for? Although such a comparison of approaches may be useful in certain respects, it is not always easy to evaluate the benefits of one specific analysis of interlanguage data over another one, especially if the analytic tools are different. Important, though, is the different perspective that the minimalist approach towards language design provides: linguistic properties are not so much considered from an intra-grammatical perspective, but rather from an interface perspective (lexicon-syntax, syntax-semantics, syntax-phonology). This raises different sorts of questions about the linguistic objects one examines (e.g. What makes an (L2) representation (il)legible at the interface?) and arguably provides different sorts of accounts of the grammatical properties displayed by these representations.

3. The data are drawn from the European Science Foundation (ESF) Program in Second Language Acquisition by Adult Immigrants (for design, elicitation techniques, and topics, see Perdue 1993). This project was set up as a longitudinal and cross-linguistic multiple case study. Most of the data discussed in this article are from the Turkish informants Abdullah and Osman. Since the issue of language development is not central in this paper, I have left out information about the stages in which the expressions were uttered by these informants.

4. In each of the L2-expressions (1a)–(4a), there is more than one ‘target imperfection’. For the sake of discussion, I will pick out one type of imperfection for each of the examples.

5. I won’t consider in this paper the issue of (L2) perfection at the PF-interface.

6. I would like to thank the reviewer for discussion of this example.

7. Another pattern featuring the quantificational element al is: al de boeken (all the books). In this pattern, the quantificational element occurs in a pre-determiner position.

8. The quantificational form allen also shows up as a floating element, as in: Zij zijn gisteren allen gekomen (they are yesterday all come; ‘They all came yesterday.’).

9. Alleen can also mean ‘only’ in present-day Dutch. This (homophonous) adverbial element is not quantificational and displays a cross-categorial distribution, just like its English equivalent.

10. See Van de Craats (2000) and Van de Craats, Corver and Van Hout (2000) for a discussion of conservation of L1-grammatical features in L2-expressions. See also Van de Craats’ contribution in this volume.

References

Bley-Vroman, R. 1990. “The logical problem of foreign language learning”. Linguistic Analysis 20: 3–49.

Chomsky, N. 1991. “Some notes on economy of derivation and representation”. In Principles and parameters in comparative grammar, R. Freidin (ed.), 417–454. Cambridge, MA: MIT Press.

Chomsky, N. 1995. The minimalist program. Cambridge, MA: MIT Press.

Chomsky, N. 2000a. New horizons in the study of language and mind. Cambridge, UK: Cambridge University Press.

Chomsky, N. 2000b. “Minimalist inquiries (MI)”. In Step by step: Essays in minimalist syntax in honor of Howard Lasnik, Martin et al. (eds). Cambridge, MA: MIT Press.

Chomsky, N., Belletti, A. and Rizzi, L. 1999 (rev. 2000). An interview on minimalism. University of Siena.

Craats, I. van de 2000. Conservation in the acquisition of possessive constructions. Doctoral Dissertation, Tilburg University.

Craats, I. van de, Corver, N. and Hout, R. van 2000. “Conservation of grammatical knowledge: On the acquisition of possessive noun phrases by Turkish and Moroccan learners of Dutch”. Linguistics 38 (2): 221–314.

Emonds, J. 1985. A unified theory of syntactic categories. Dordrecht: Foris.

Emonds, J. 2000. Lexicon and grammar: The English syntacticon. Berlin/New York: Mouton de Gruyter.

Kornfilt, J. 1996. Turkish. New York: Routledge.

Marantz, A. 1997. “No escape from syntax: Don’t try morphological analysis in the privacy of your own lexicon”. University of Pennsylvania Working Papers in Linguistics 4 (2): 201–225.

Paardekooper, P.C. 1974. Beknopte ABN-syntaxis. Den Bosch: Malmberg.

Perdue, C. (ed.) 1993. Adult language acquisition: Cross-linguistic perspectives, vol. I: Field methods. Cambridge, UK: Cambridge University Press.

Schachter, J. 1989. “Testing a proposed universal”. In Linguistic perspectives on second language acquisition, S. Gass and J. Schachter (eds), 73–88. Cambridge: Cambridge University Press.

Schenning, S. 1998. Learning to talk about space. The acquisition of Dutch as a second language by Moroccan and Turkish adults. Doctoral Dissertation, Tilburg University.

Verkuyl, H. 1981. “Numerals and quantifiers in X-Bar syntax and their semantic interpretation”. In Formal methods in the study of language, J. Groenendijk, T. Janssen and M. Stokhof (eds), 567–599. Amsterdam: Mathematical Centre.

White, L. 1988. “Island effects in second language acquisition”. In Linguistic theory in second language acquisition, S. Flynn and W. O’Neill (eds), 144–172. Dordrecht: Reidel.

White, L. 2000. “Second language acquisition: From initial state to final state”. In Second language acquisition and linguistic theory, J. Archibald (ed.), 130–155. Oxford: Blackwell.

Zwarts, J. 1992. X′-Syntax – X′-Semantics. On the interpretation of functional and lexical heads. Doctoral Dissertation, Research Institute for Language and Speech (OTS), Utrecht University.

</TARGET "cor">

<TARGET "cra" DOCINFO AUTHOR "Ineke van de Craats"TITLE "L1 features in the L2 output"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 4

L1 features in the L2 output

Ineke van de Craats
University of Nijmegen

1. Introduction

In current generative syntax (e.g. Chomsky 1995), the role of the lexicon has become more prominent in the generation of syntactic expressions. Under this view, the lexicon is not limited to vocabulary, but also contains important grammatical information. Lexical knowledge consists of grammatical properties as defined by language-particular knowledge of functional categories (parameter settings), language-specific knowledge of lexical items and their features (the vocabulary) and morphological knowledge. By means of lexical items selected from the lexicon, the computational system of human language can build phrases and sentences. A syntactic object (a clause or a phrase) is considered to be the structural projection of a series of linguistic properties associated with a lexical item. Those formal features (e.g. singular, accusative, human, +V, −N) are stored in the vocabulary, together with the semantics of that specific lexical item. Lexical items, like nouns and verbs, are base-generated as a unit, including case morphology and inflectional morphology like person, number, tense, under lexical heads. Functional heads, on the other hand, do not dominate inflectional morphology; they dominate bundles of abstract features. These features have to be eliminated or erased in the course of the derivation, which is done by feature checking. This feature checking is a matching of the features (e.g. case morphology is checked by its case assigner) and is done by adjoining the inflected N or V to the relevant functional head. So, morphology which is associated with a verb or a noun has to be checked by the abstract features dominated by a functional head (e.g. Agr or T for verbs, and Agr and D for nouns). What features are dominated by a functional head and whether these features are strong or weak is lexical knowledge which is necessary for the generation of a syntactic object, but is not part of the vocabulary. So, in the minimalist approach, a parameter is related to a feature of a functional head that attracts an identical feature of a lexical item at some point in the derivation, and so is essentially linked to the lexicon.

If, in recent linguistic theorizing, the formal features of a lexical item and the specification of functional heads can be seen as the seeds for building a syntactic structure, they must play the same role in the acquisition of a new language, assuming that we are dealing with a natural language. Applying generative theory to second language acquisition is not new, of course. Before the work of White (1982, 1985), Flynn (1986) and many others, Adjémian (1976) was the first to adopt a Chomskyan approach to interlanguage development. He considered grammatical interlanguage systems to be natural languages, different from L1 grammars only in their ‘permeability’ to aspects of the L1 system.

In this chapter, the current view of generative syntax will be applied to the analysis of naturalistic second language data. Although L2 expressions are the syntactic products of the interaction between the computational system and a changing lexicon, the focus will be on how the grammatical knowledge of an L2 learner is encoded in the lexicon, not on derivation and syntactic representations. The question is rather: what is the nature of this grammatical knowledge at the L2-initial state and how does it change? Through examples produced by L2 learners and by outlining the longitudinal development of some lexical items, it will be shown how features of a lexical item may change in the course of the acquisition process, giving rise to new syntactic structures. For a detailed discussion of the syntactic representations of nominal possessive constructions in which van (Subsection 5.1) and the realisation of the personal pronoun (Subsection 5.2) are involved, the reader is referred to Van de Craats, Corver and Van Hout (2000); for representations of clausal possessive constructions in which heeft is involved (Section 6), to Van de Craats, Corver and Van Hout (2002).

2. The L2-initial state, data and informants

The central claim is that, initially, the grammatical system of an L2 learner is not only ‘permeable’ to the learner’s L1 system (Adjémian 1976) but even based on the L1 system. It is assumed that L2 learners exhibit conservative behaviour and take the fully fledged grammar of their L1 as the starting point of the L2 acquisition process (in case they have command of only their L1). This amounts to both the Full Transfer/Full Access Hypothesis (Schwartz and Sprouse 1996) and the Conservation Hypothesis (Van de Craats 2000, Van de Craats, Corver and Van Hout 2000). The latter, however, explicitly states that the learners’ output cannot show all L1 properties because of a strongly limited L2 vocabulary. With the developing vocabulary, (more) L1 properties related to free and bound functional morphemes become manifest gradually, as we will see for genitive markers (Subsection 5.1) and copular forms (Section 6), which, initially, are not found in learners’ data but appear gradually.

The following aspects of lexical knowledge may be conserved at the L2-initial state, when all learning starts:

– parameter settings (e.g. strength values);
– knowledge of morphology and morphological realization rules (e.g. realization of case);
– knowledge of lexical items: formal features (e.g. categorial features) and semantic-conceptual values (the meaning).

Because of acquisition, restructuring will apply at all levels of lexical knowledge: from parameter values to semantic-conceptual values. Initially, L2 learners rely on the old system of the L1. On the basis of primary linguistic input, they will depart from their conserved L1 parameter setting and L1 ‘vocabulary’ lexical knowledge. This implies that a parameter together with its possible values will remain available through UG.

In the next sections, the changing grammatical knowledge at the basis of L2 expressions will be shown through the spontaneous production data of eight adults (18–24 years old) learning Dutch as a second language. These data were collected within the framework of the European Science Foundation (ESF) Program on Second Language Acquisition by Adult Immigrants (see Perdue 1993). The ESF project was set up as a longitudinal and cross-linguistic multiple case study; we only use the Dutch data here. The eight informants were followed for two and a half years. The period of investigation was divided into three cycles of nine sessions, one session a month. In the examples in the next sections, we refer to the learner and to the cycle and recording session (e.g. I.7 = first cycle, session 7) in which the utterance was produced. At the time of the first session, the informants had been living in the Netherlands for seven to twelve months. They had a very low level of proficiency in Dutch, were monolingual, and had a limited level of education. Several elicitation tasks were repeated in each cycle, such as interviews, role-playing, and film-retelling tasks.

Some other examples used were produced by child L2 learners between six and nine years old. They come from another corpus of longitudinal and cross-linguistic data collected by Vermeer (1986). These informants were also from a Turkish and Moroccan Arabic background: 16 children from both language groups. They were followed over 2.5 years from the time they entered primary school. At the time of the first recording the children’s age ranged from 6;4 to 7;9 years.

3. The nature of a lexical item

As hinted at in the previous section, we distinguish lexical knowledge as defined by UG from language-specific knowledge of lexical items and their lexical entries. To avoid confusion, we refer to the latter type as the vocabulary. The first question that arises here is what knowledge learners have exactly of a lexical item. Lexical items are combinations of sound and meaning properties which can be read, or interpreted, by other cognitive systems (cf. Chomsky 1994, 1995). Phonetic features make up the phonetic representation and semantic features make up the semantic representation. A sound-meaning pairing is encoded by a phonological matrix. Each coding of a lexical item also contains a set of formal features: intrinsic features and optional features. The former are unpredictable, idiosyncratic grammatical properties of lexical items (e.g. the categorial feature [+N,−V] and the person feature [3 person]); the latter include grammatical features that are predictable from other properties of the lexical entry (e.g. the features number and (abstract) case, which might be derived from the categorial feature definition [+N,−V]).1 Table 1 gives an example of the concept ‘bicycle’ in three different languages. Only the phonological matrix differs.

Table 1. Lexical items of the concept ‘bicycle’ compared for three languages

                          Turkish         Dutch           English
– phonological matrix     /bisiklet/      /fiets/         /bike/
– semantics               ‘bicycle’       ‘bicycle’       ‘bicycle’
– formal features
    intrinsic             [+N,−V]         [+N,−V]         [+N,−V]
    intrinsic             [−human]        [−human]        [−human]
    intrinsic             [3 person]      [3 person]      [3 person]
    optional              [singular]      [singular]      [singular]
    optional              [nominative]    [nominative]    [nominative]

In what way might we conceive of conservation of vocabulary knowledge? Obviously, conservation does not apply at the level of the phonological matrix.2 One might conceive of early lexical acquisition as a process in which L2 learners try to match a meaning representation associated with some lexical item of their L1 vocabulary with a phonological matrix of the target language.

The consequence of this conservation model at the level of the derivational output might be that, in an interlanguage, apparently empty constituents may exist. They are filled by L1 feature bundles of semantic-conceptual and formal features, but lack a phonological matrix. The phonological representation is simply absent. The task of the L2 learner will be to fill in the empty slot of the phonological matrix. Schematically, the acquisition of a lexical item may be represented as in Table 2.

Table 2. The development of an L2 lexical item

                          L1 item         Interlanguage   L2 item
                          (Turkish)                       (Dutch)
– phonological matrix     /bisiklet/      /Ø/             /fiets/
– semantics               ‘bicycle’       ‘bicycle’       ‘bicycle’
– formal features         [+N,−V]         [+N,−V]         [+N,−V]
                          [3 person]      [3 person]      [3 person]
                          [singular]      [singular]      [singular]
                          [nominative]    [nominative]    [nominative]

What learners do, in fact, is add a new phonological matrix to the already existing bundle of semantic and formal features. They match, in Table 2 for instance, the L2 phonological matrix /fiets/ with the semantic and formal features belonging to the Turkish phonological matrix /bisiklet/. This combination of an L2 phonological matrix, L1 semantics and an L1 feature bundle essentially is a new lexical item in the learners’ L2 vocabulary.
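The three stages of this acquisition step can be sketched as a small data structure. The sketch is an illustration only: the class `LexicalItem` and the functions `conserve` and `fill_matrix` are hypothetical names, and the feature labels are plain strings copied from the ‘bicycle’ example rather than a worked-out feature system. An empty phonological matrix /Ø/ is modelled as `None`.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class LexicalItem:
    phonological_matrix: Optional[str]   # None stands for the empty slot /Ø/
    semantics: str
    formal_features: Tuple[str, ...]


def conserve(l1_item: LexicalItem) -> LexicalItem:
    """Interlanguage entry: L1 semantics and formal features, no sound form yet."""
    return replace(l1_item, phonological_matrix=None)


def fill_matrix(item: LexicalItem, l2_form: str) -> LexicalItem:
    """Add an L2 phonological matrix to the conserved feature bundle."""
    return replace(item, phonological_matrix=l2_form)


bisiklet = LexicalItem("bisiklet", "bicycle",
                       ("+N,-V", "3 person", "singular", "nominative"))
interlanguage = conserve(bisiklet)            # /Ø/ with the L1 feature bundle
fiets = fill_matrix(interlanguage, "fiets")   # new item in the L2 vocabulary

print(interlanguage)
print(fiets)
```

Because only the phonological matrix is replaced, the resulting item carries the L1 semantics and L1 formal features unchanged, which is where mismatches can arise when the L1 and L2 bundles differ.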

Evidence for this way of learning lexical items is (i) L2 lexical items showing imperfect matching or mismatching and (ii) empty phonological matrices in developmental sequences which are filled up later by elements based on the L1 syntax, as will be shown by the developmental sequence of the lexical items van (Subsection 5.1) and heeft (Section 6). In the next section, some examples of mismatches will be presented. They are related to the semantic features, the formal features and the argument structure of lexical (not functional) elements. In Sections 5 and 6, we focus on mismatches of functional elements, both in the nominal and the clausal domain.

4. Mismatches in lexical development

As long as there are no differences between the formal feature bundles of two lexical items in the source and target languages, learners can carry out this matching operation without making any errors. For most content words, such a perfect match is possible as far as formal features are involved. We do not expect a different set of formal features in two languages because entities are typically nouns ([+N,−V]), actions are typically verbs ([−N,+V]), and qualities are typically adjectives, each of those with its own categorial values (N, V, A) and the formal features typically related to those categories. As for the semantic-conceptual aspects, differences between L1 and L2 are to be expected, however.

The first example of imperfect matching involves the domain of semantics. A meaning representation of a lexical item may have different aspects. The Turkish verb içmek, for instance, differs minimally from the Dutch verb drinken (‘to drink’). The formal features are the same, but the basic meaning of the verb /içmek/ is ‘to put something in something else’. This general meaning has several more specific meaning aspects, viz. ‘to drink’ and ‘to smoke’. Consider the L2 expression in (1).

(1) Als ik Marlboro drinken.                    Turkish learner: Ergün, III-5
    when I Marlboro drink
    ‘When I smoke a Marlboro.’

In (1), Ergün maps the L2 phonological matrix /drinken/ on to both L1 semantic aspects and grammatical properties of the verb içmek, which results in a mismatch.3

Mismatches, however, are not restricted to the domain of semantics. They may also involve formal features of lexical elements. Let us consider the examples in (2) and (3). Mahmut, a Turkish learner of Dutch, is retelling a scene from a silent movie in which Charlie Chaplin must pay the bill in a restaurant. Charlie refuses to do so because he wants to go to jail. The same episode was told twice, with an interval of approximately ten months. In (2a) and (3a) the policeman orders Charlie to pay; in (2b) and (3b) Charlie answers that he has no money.

(2) a. Jij betalen geven.                       Turkish learner: Mahmut, II-9
       you to pay to give
       ‘You must pay.’

    b. Ik niet betalen.
       I not pay
       ‘I have no money.’

(3) a. Politie zegt: “Jij geld geven”.          Mahmut, III-9
       policeman says “you money give”
       ‘The policeman says: “You must pay”.’

    b. Ik heb niet geld.
       I have not money
       ‘I do not have money.’

The examples in (3) make clear what Mahmut meant to say in (2). We can assume that instead of betalen ‘to pay’ he intended to say geld ‘money’.4 From the perspective of the Conservation Hypothesis, the argument runs as follows. A Turkish learner expects to find the verb at the end of the sentence, as Turkish is basically a language with an SOV sentence structure. Hence, the Turkish learner in (2a) considers geven to be a verb. Since the L1 item ödemek (‘to pay’) has an internal argument, this L2 learner places this argument in the object position that normally precedes the verb, e.g. in (2a). Hence, betalen must be the argument, and a noun. In (2b) betalen might be meant as a verb, but (3b) makes that unlikely, the more so because, in cycle II, Mahmut is not yet able to produce possessive clauses in which hebben (‘to have’) occurs.

As can be inferred from the sentences in (2) and (3), the categorial value of betalen in (2) is that of a noun: [+N,−V], and not that of a verb [−N,+V]. Only in (3), the phonological matrix /betalen/ is replaced by /geld/, as represented in Table 3. In this table, the subcategorization frame of ‘to pay’ has been integrated in order to underline the verbal character of the L2 item.
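The mismatch just described can be stated as a comparison between the learner's entry for /betalen/ and the target entry. The following sketch is a hedged illustration: representing entries as dictionaries and the helper name `deviances` are assumptions made here, while the property values are taken from the description above (a noun meaning ‘money’ versus a verb meaning ‘to pay’ with a subcategorization frame).

```python
from typing import Dict, Optional, Tuple

Entry = Dict[str, Optional[str]]


def deviances(learner: Entry, target: Entry) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {property: (learner value, target value)} for every differing property."""
    keys = sorted(set(learner) | set(target))
    return {k: (learner.get(k), target.get(k))
            for k in keys if learner.get(k) != target.get(k)}


# Learner variant: L2 sound form combined with the L1 bundle of para ('money').
learner_betalen: Entry = {
    "phonological_matrix": "betalen",
    "semantics": "money",
    "category": "+N,-V",
    "subcategorization": None,
}
# Target Dutch item.
target_betalen: Entry = {
    "phonological_matrix": "betalen",
    "semantics": "to pay",
    "category": "-N,+V",
    "subcategorization": "[DP2 DP1 -]",
}

diff = deviances(learner_betalen, target_betalen)
for prop, (learner_value, target_value) in diff.items():
    print(f"{prop}: learner={learner_value!r} target={target_value!r}")
```

The phonological matrix is the one property on which learner and target agree; everything the learner has conserved from the L1 item shows up as a deviance.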

Table 3. A learner variant of the lexical item /betalen/ compared to the relevant items in source and target languages; deviances from the target language are in italics

                              L1 item         Learner         L2 item
                              (Turkish)       variant         (Dutch)
– phonological matrix         /para/          /betalen/       /betalen/
– semantics                   ‘money’         ‘money’         ‘to pay’
– formal features             [+N,−V]         [+N,−V]         [−N,+V]
                              [−human]        [−human]
                              [3 person]      [3 person]
                              [singular]      [singular]
                              [accusative]    [accusative]
– subcategorization frame     –               –               [DP2 DP1 –]

A comparable mismatch due to misinterpretation of the L2 item terug (‘back’) by the L2 learner is presented in (4).

(4) a. Mijn vrouw ik thuis terug.               Mahmut, I-5
       my wife I home back
       ‘My wife came to my house.’

    b. Ik meisje baby Ø.                        Mahmut, I-6
       I girl baby
       ‘My girlfriend is expecting.’
       Vijf maanden baby terug.
       five months baby back
       ‘In five months the baby will come.’

In (4a), Mahmut tells about his wedding. Before the wedding, he used to go to his girlfriend’s house, far away, but after the wedding his wife came to his house. In this early stage of acquisition, Mahmut had two options for expressing the possessive pronoun first person singular: the target variant mijn (‘my’) + possessee and the learner variant ik (‘I’) + possessee. He used them both in this sentence. The particle terug expresses the action of coming. In Dutch, the particle terug is the separable part of the compound verb terugkomen (‘to come back/to return’). In matrix clauses, the finite part of the verb appears in (more precisely: is moved to) the second position in the sentence, while the particle remains at the end. This is probably the cause of the misinterpretation by the Turkish learner, who expects the finite verb in end position, where he finds terug (instead of geliyor ‘comes’). In Dutch, as in English, it is even possible to leave out the past participle gekomen in a perfect tense and to say, as the result of the action terugkomen, ik ben terug (‘I am back’), which may be interpreted by this learner as: I have come.

In (4b) the directional element is not so evident. Mahmut took part in a role-playing task. He was asked to explain to a housing officer why he needed a house. The informant was given the information that his girlfriend was pregnant. The introducing sentence (ik meisje baby) cannot have another meaning than that his girlfriend is expecting. In Dutch, the internal argument (a baby) of the verb verwachten (‘to be expecting’) must be expressed overtly. Note that we are dealing here with two arguments that are strongly suggestive of the predicate verwachten (‘to expect’), so that it makes sense to assume a predicate with an empty phonological matrix, viz., verwachten. The second sentence of (4b) confused the housing officer. He understood that the baby was already born and that he or she would come back after a stay in Turkey or somewhere else.

The cause of this misunderstanding is that Mahmut maps the L2 phonological matrix /terug/ onto an L1 feature bundle linked to the verb /geliyor/. In that way, the particle terug can act as a verb and has the same argument structure as the verb ‘to come’.5 This is represented in Table 4.

The examples above have shown that beginning L2 learners may have difficulty in discerning the grammatical properties of L2 content words, which are lexical elements with a relatively high salience in the environmental input. For the understanding of functional elements this must be still harder.

Table 4. Lexical items of the concept ‘comes’; deviance from the target language is in italics

                            L1 item (Turkish)   Learner variant     L2 item (Dutch)
– phonological matrix       /geliyor/           /terug/             /komt/
– semantics                 ‘comes’             ‘back’              ‘comes’
– L1 formal features        [−N,+V]             [−N,+V]             [−N,+V]
                            [3 person]          [3 person]          [3 person]
                            [singular]          [singular]          [singular]
                            [present tense]     [present tense]     [present tense]
– subcategorization frame   [DP−]               [DP−]               [DP−]
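The mapping summarized in Table 4 can also be written out as a small data structure. The following Python sketch is an illustration only: the class `LexicalItem` and the function `map_onto_l1` are hypothetical names introduced here, not part of the analysis itself, and the feature values are simply copied from Table 4.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LexicalItem:
    """A lexical item as a bundle of properties (field names are assumptions)."""
    phonological_matrix: str
    semantics: str
    formal_features: tuple
    subcat_frame: str


def map_onto_l1(l1_item, l2_matrix, l2_gloss):
    """Attach an L2 phonological matrix (and gloss) to an L1 feature bundle.

    The formal features and the subcategorization frame of the L1 item are
    kept unchanged, as assumed for the learner variant in Table 4.
    """
    return replace(l1_item, phonological_matrix=l2_matrix, semantics=l2_gloss)


VERB_FEATURES = ("[−N,+V]", "[3 person]", "[singular]", "[present tense]")

# L1 item (Turkish) and L2 target item (Dutch), values taken from Table 4.
GELIYOR = LexicalItem("/geliyor/", "comes", VERB_FEATURES, "[DP−]")
KOMT = LexicalItem("/komt/", "comes", VERB_FEATURES, "[DP−]")

# Learner variant: the particle terug carries the verbal feature bundle.
learner_terug = map_onto_l1(GELIYOR, "/terug/", "back")

if __name__ == "__main__":
    print(learner_terug)
```

Under these assumptions the learner variant keeps the verbal feature bundle and the subcategorization frame of the L1 item; only the phonological matrix and the gloss come from the L2 particle.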

5. Functional elements in the nominal domain

As functional elements such as determiners and affixes have little semantic load and are often unstressed, they are less salient to L2 learners than content words. So, L2 learners get less opportunity to perceive them in the L2 input and to comprehend them. This is even more so for the formal features. Therefore, it is not surprising that grammatical properties of L1 functional elements persist for a longer time than those of lexical elements. Consider for this purpose two mismatches of formal features at the level of the categorial value and see how these differences become manifest in the learners’ L2 expressions. The first example relates to the realisation of genitive case. The second example is a more indirect consequence of the possibility of dropping the subject (pro-drop) in Turkish and the impossibility of doing so in Dutch.

5.1 The genitive case

Speakers of Turkish are used to realizing case marking overtly by a rich morphological system of suffixes on the head noun. In Dutch, case marking is generally done covertly, except for pronouns and the marking of genitive case (cf. Corver in this volume). For a good understanding of the learners’ data, it is necessary to go into a bit more detail regarding the nominal possessive constructions in Dutch and in Turkish. A Turkish possessive construction like the one in (5) features agreement in person and number between the possessed (pro)noun and the possessor. This agreement is manifested by the agreement suffix -sı.

(5) Ayse-nin / o-nun araba-sı.
    Ayse-gen / (s)he-gen car-3sg
    ‘Ayse’s/her car.’

The possessor noun phrase (DP) carries a genitive case feature (-nin or -nun, choice determined by vowel harmony) and the possessed noun a genitive case agreement feature (-sı), as represented in Table 5. This possessive relationship is characterized by agreement, overtly realized both on possessor and possessee. In line with Chomsky (1986, 1995), it is assumed that the genitive case agreement feature of the possessee must check off the genitive case feature associated with the possessor DP. The required structural configuration is AgrP where the possessor DP and possessee N enter in Spec–Head configuration because of the strength properties of the Agr head (cf. Van de Craats et al. 2000 for details).

Table 5. Formal feature complex of two lexical items and the functional head Agr in the Turkish possessive construction

                        Possessor      Possessee                      Agr
– phonological matrix   /Ayse-nin/     /araba-sı/                     /Ø/
– semantics             ‘of Ayse’      ‘car’                          −
– formal features       [+N,−V]        [+N,−V]                        [+N,−V,+D] ⟨strong⟩
                        [3 person]     [3 person]                     [+N,−V] ⟨weak⟩
                        [singular]     [singular]
                        [genitive]     [+genitive case assignment]

Unlike Turkish, Dutch has several ways for expressing a possessive relationship. Two of them look, superficially, like Turkish (6a, 6b), the other (analytic) construction (7) does not.

(6) a. Ayse ’s auto                  (Saxon genitive)
       Ayse ’s car
       ‘Ayse’s car’
    b. Ayse d’r auto                 (Doubling possessive)
       Ayse her car
       ‘Ayse’s car’

(7) de auto van Ayse                 (Analytic construction)
    the car of Ayse
    ‘Ayse’s car’

Although the possessive construction in (6a) is called the Saxon genitive construction, the -s should not be interpreted as an inflectional case suffix because, with a coordinated possessor, the possessive marker -s is phonologically attached to the rightmost noun, as in (8). If it were a normal case suffix realized on the head noun, it would be expressed on both nominal heads of a coordinated construction (cf. Corver 1990). We assume that the Saxon genitive is a clitic in the Agr head.6

(8) [Ayse en Jan]’s kritiek op elkaar.
    ‘Ayse and John’s criticism of each other.’

The major characteristic of the doubling possessive construction (6b) is the presence of a possessive clitic which doubles the possessor noun phrase and agrees in phi features with the possessor. It is assumed that the possessive clitic (d’r for feminine, z’n for masculine) heads the functional phrase AgrP (Miller 1991) and also that the doubled possessor DP originates within the lexical projection NP and raises overtly to the specifier position of Agr.

In analytic constructions, as in (7), the dummy preposition van (‘of’) can be considered to be the morphological realization of the inherent genitive case (cf. Chomsky 1986). This implies that such genitive case will only be assigned by N to a noun phrase that receives a thematic role from it. In line with Chomsky’s (1995:285) reinterpretation of inherent case, this genitive case is a +interpretable feature of DP that need not, but could be, checked in a Spec–Head configuration. This has the consequence that the genitive bearing DP (i.e., Ayse) can remain within its base position since its genitive case feature need not be checked.

The three types of possessive constructions differ in their modes of genitive case licensing and in what element is in the Agr head. Comparison of Tables 5 and 6 shows that the important differences between Turkish and Dutch lie in the mechanism of case licensing and the lexical material projected in Agr head, both abstract features and overt functional elements. But what about the realisation of genitive case? That is another point of difference and more transparent to learners than the properties discussed above. In Table 7, the lexical items associated with the concept ‘possessor of’ or, to put it in other words, the realisation of genitive case, are compared for Turkish, Dutch and two learner variants.

Table 6. Formal feature complex of two lexical items and the functional head Agr for three possessive constructions in Dutch

                                                              Agr
                    Possessor     Possessee                   Saxon genitive         Doubling possessive    Analytic construct.
– phonol. matrix    /Ayse/        /auto/                      /-s/                   /clitic/ (e.g. z’n)    Agr not projected
– semantics         ‘Ayse’        ‘car’                       –
– formal features   [+N,−V]       [+N,−V]                     [+N,−V,+D] ⟨strong⟩    [+N,−V,+D] ⟨strong⟩
                    [3 person]    [3 person]                  [+N,−V] ⟨weak⟩         [+N,−V] ⟨weak⟩
                    [singular]    [singular]
                    [genitive]    [+gen. case assignment]

Table 7. Lexical items of the concept ‘possessor of’; deviances from the target language are in italics

                            L1 item (Turkish)   Learner variant 1   Learner variant 2   L2 item (Dutch)
– phonological matrix       /-(n)In/            /Ø/                 /van/               /van/
– semantics                 ‘possessor’         ‘possessor’         ‘possessor’         ‘possessor’
– categorial value          [affix gen]         [affix gen]         [affix gen]         preposition [−N,−V]
– subcategorization frame   [N−]                [N−]                [N−]                [DP−]

L2 learners who map the phonological matrix of the L2, van (‘of’), onto the grammatical properties of their L1, produce such nominal phrases as in (9) and (10).

(9) pronominal possessor
    a. [die van] auto                     Ergün, III-4
       [that of car
       ‘his car’
    b. [onze van] broer                   child learner (number T25)
       [our of brother
       ‘our brother’

(10) full noun possessor
     a. [examen van] tolk                 Ergün, III-5
        [exam of interpreter
        ‘the interpreter at the exam’
     b. [die jongen van] zijn vader       Osman, III-2
        [that boy of his father
        ‘that boy’s father’
     c. [de auto van] de lichten          child learner (number T41)
        [the car of the lights
        ‘the car lights’

In the examples in (9) and (10), the preposition van (‘of’) should not be considered an element of an analytic construction, as in (7), but the genitive case marker of the preceding possessor noun phrase as in Turkish (cf. example (5)). Notice that both child learners and adult learners produce these constructions based on the L1 bundle of formal features as presented in Table 7.

It is a complicating factor that L2 learners do not show this genitive case marker from the earliest stage of L2 acquisition. L2 learners, in general, produce only a few functional elements in the beginning of the acquisition process and it is questionable whether they perceive functional elements in the L2 input at all. Nevertheless, they build syntactic constructions like the L2 expressions in (11), which are not simple two word utterances but can be extensive phrases.

(11) a. tante dochter7                                   Osman, I-5
        aunt daughter
        ‘my aunt’s daughter (= cousin)’
     b. tante zoon auto                                  Mahmut, I-5
        aunt son car
        ‘my aunt’s son’s car’
     c. mijn vrouw oma andere man dochter                Mahmut, II-8
        my wife grandmother other man daughter
        ‘my wife’s grandmother’s second husband’s daughter’

The learner variants in (11) are almost incomprehensible to native speakers of Dutch, but make sense to Turkish speakers because, under the view of the Conservation Hypothesis, they base those expressions on their L1 grammar, more particularly on the fact that in Turkish, all possessor nouns are overtly marked by a genitive case and that the head noun (the possessee) is marked by a person and number marker that refers to the preceding possessor, as in (5). Under this view, an empty (Ø) phonological matrix associated with an L1 based feature bundle can be assumed for the learner variants in (11). (Compare also Corver’s contribution in this volume about the interpretability of apparently P-less structures.) This analysis is corroborated by the fact that the empty phonological matrix is filled in at a later developmental stage by the case marker van (‘of’), and by the fact that only Turkish learners exhibit possessive constructions overtly marked for genitive case. Moroccan learners do not. One may object that the productions exemplified in (11) proceed directly from UG. But why would Turkish learners have access to UG and Moroccan learners not?

In addition to possessive noun phrases in which the possessor precedes van, as in (9) and (10), L2 learners with a Turkish language background produce possessive noun phrases where van precedes the possessor, as in (12) and (13).

(12) pronominal possessor
     a. [van hem] moeder                        Ergün, II-9
        [of him mother
        ‘his mother’
        (target: zijn moeder)
     b. [van ons] die fabriek                   Ergün, III-4
        [of us that factory
        ‘our factory’
        (target: onze fabriek)
     c. [van ons] buurman heeft goed gemaakt    child learner (number T41)
        [of our neighbour has good made
        ‘our neighbour has repaired it’
        (target: onze buurman)

(13) full noun possessor
     a. [van Ergün] auto                        Abdullah, II-7
        [of Ergün auto
        ‘Ergün’s car’
        (target: Ergün z’n/s auto)
     b. [van schoenen] die touwtje8             Osman III-6
        [of shoes that rope
        ‘the shoelaces’
        (target: de veters van de schoenen; de schoenveters)

The examples in (12) and (13) exemplify a new stage of lexical development, viz., the stage in which van is no longer a genitive case suffix, but an adposition preceding the possessor, like van in analytic constructions as in (7). A second alteration vis-a-vis the former stage is that van precedes a full DP instead of being a suffix on a noun.

In addition to these examples, there are some rare learner variants of the Saxon genitive and the doubling possessive constructions in which L1 and L2 elements are mixed. These are given in (14). In (14a), we see a target doubling possessive construction and in (14b) a Saxon genitive construction in which the genitive case is overtly realized together with a functional element in Agr head, i.e., ’s and zijn (instead of the target form z’n).

(14) a. die [jongen van] zijn vader       Osman, III-2
        that [boy of his father
        ‘the boy’s father’
        (target: de jongen z’n vader)
     b. [van Ömers] huis                  child learner (number T29)
        [of Ömer’s house
        ‘Ömer’s house’
        (target: Ömer’s huis)

Likewise, the complete development of any lexical item can be represented from the L2-initial state, viz. the final state of the L1 lexicon, to the state where the lexical item of the target language is attained. The formal features required for building target syntactic objects, viz. those dominated by the functional head Agr and those of possessor and possessee, were already presented in Table 6.

Table 8. Developmental stages of the L2 lexical item /van/ in constructions with full DP possessors; changes with regard to the previous stage are in italics

                            Stage 1        Stage 2                  Stage 3                  Target

L1 agreement pattern
– phonological matrix       /Ø/            /van/                    /van/                    d.n.a.
– semantics                 ‘possessor’    ‘possessor’              ‘possessor’
– categorial value          [affix gen]    [affix gen]              [adposition]
– subcategorization frame   [N−]           [N−]                     [−DP pos’sor]
– example                   Ayse-Ø auto    [Ayse-van] auto          [van Ayse] auto

L2 construct pattern
– phonological matrix                      /van/                    /van/                    d.n.a.
– semantics                                ‘possessor’              ‘possessor’
– categorial value                         [affix gen]              [adposition]
– subcategorization frame                  [N−]                     [−DP pos’sor]
– example                                  Ayse-van d’r auto        van Ayse’s auto          Ayse d’r auto
                                                                                             Ayse’s auto

L2 analytic pattern
– phonological matrix                      /van/                    /van/                    /van/
– semantics                                ‘possessor’              ‘possessor’              ‘possessor’
– categorial value                         [preposition] [−N,−V]    [preposition] [−N,−V]    [preposition] [−N,−V]
– subcategorization frame                  [DP pos’see−]            [DP pos’see−]            [DP pos’see−]
– example                                  auto [van Ayse]          auto [van Ayse]          auto [van Ayse]

Table 9. Distribution of variants of the lexical possessive item /van/ produced by four Turkish learners in constructions with full DP possessors

                           Mahmut            Ergün             Abdullah          Osman
Cycles                     I     II    III   I     II    III   I     II    III   I     II    III

L1 agreement pattern
  Possessor Ø              63    55    42    15    9     11    7     7     –     12    3     –
  Possessor – affix        –     –     –     –     –     2     2     –     1     –     1     2
  Pos’sor – adposition     –     –     –     –     –     –     3     4     9     2     1     1

L2 construct pattern
  Pos’sor + van + clitic   –     –     –     –     –     –     –     –     –     –     –     2
  Pos’sor + clitic L2      –     –     –     –     –     –     –     –     –     –     –     –

L2 analytic construct.
  Van-preposition          –     –     2     –     2     8     2     23    12    2     14    25

In this way, lexical item learning in all aspects can be made visible. The vocabulary-internal development of the L2 lexical item /van/ can be represented as in Table 8 for constructions with full DP possessors. Three developmental stages are distinguished, each of them is characterized by one or more changes compared to the previous stage. The order of the stages is determined by the first emergence of the changed value. In stage 1, the possessive L2 expressions are based on an L1 pattern and the genitive case is not overtly realized. Stage 2 is characterized by the new (analytic) L2 pattern in which the genitive case is realized by van, both as a preposition and an affix. Likely, the new construction has impact on the agreement pattern from the L1 because learners interpret van correctly as the morphological realization of genitive case, but they mis-assess its categorial value. In stage 3, van is used as a preposition and adposition as well.
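The idea that each stage differs from the previous one in one or more values can be made explicit by comparing the stages field by field. The Python sketch below is a hypothetical illustration, not part of the original analysis: the field names and the helper `changed_fields` are assumptions, and the values are those given in Table 8 for the L1 agreement pattern (Stages 1 to 3) followed by the target analytic item.

```python
# Property names for a lexical item; these labels are illustrative assumptions.
FIELDS = ("phonological_matrix", "semantics", "categorial_value", "subcat_frame")

# Values from Table 8: L1 agreement pattern for Stages 1-3, then the
# target analytic item.
STAGES = {
    "Stage 1": ("/Ø/", "possessor", "[affix gen]", "[N−]"),
    "Stage 2": ("/van/", "possessor", "[affix gen]", "[N−]"),
    "Stage 3": ("/van/", "possessor", "[adposition]", "[−DP pos'sor]"),
    "Target": ("/van/", "possessor", "[preposition][−N,−V]", "[DP pos'see−]"),
}


def changed_fields(previous, current):
    """Return the names of the properties whose values differ."""
    return [name for name, old, new in zip(FIELDS, previous, current) if old != new]


def development(stages):
    """Map each transition between successive stages to its changed properties."""
    names = list(stages)
    return {
        f"{a} -> {b}": changed_fields(stages[a], stages[b])
        for a, b in zip(names, names[1:])
    }


if __name__ == "__main__":
    for step, fields in development(STAGES).items():
        print(step, fields)
```

On these values, the first transition changes only the phonological matrix, whereas the later transitions leave the matrix /van/ untouched and change the categorial value and the subcategorization frame.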

In Table 9, the distribution of these lexical stages over the four Turkish informants is given. The time course represents three successive cycles, each representing ten months of data collection. The informants are arranged from the slowest learner on the left of the table to the faster learners on the right.

From the combination of the data in Tables 8 and 9, it can be inferred that Mahmut just entered stage 2, that Ergün attained this stage somewhat earlier, and that Abdullah and Osman have reached stage 3 at the beginning of the data collection. Osman is the only informant who produced a few doubling possessive constructions. Table 9 also shows a considerable overlap between the stages. In cycle III, Osman and Abdullah have abandoned stage 1, Mahmut and Ergün have not yet done so.
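The criterion of first emergence can be illustrated with the Van-preposition row of Table 9. The sketch below is a hypothetical illustration: the function name `first_emergence` is an assumption introduced here, and the dash in the table is read as zero tokens, which is also an assumption.

```python
CYCLES = ("I", "II", "III")

# Van-preposition counts per cycle (Table 9, L2 analytic construction).
# A dash in the table is encoded as 0 here (an assumption of this sketch).
VAN_PREPOSITION = {
    "Mahmut": (0, 0, 2),
    "Ergün": (0, 2, 8),
    "Abdullah": (2, 23, 12),
    "Osman": (2, 14, 25),
}


def first_emergence(counts):
    """Return the first cycle with at least one token, or None if there is none."""
    for cycle, n in zip(CYCLES, counts):
        if n > 0:
            return cycle
    return None


if __name__ == "__main__":
    for learner, counts in VAN_PREPOSITION.items():
        print(learner, first_emergence(counts))
```

Read this way, the preposition van first emerges in cycle III for Mahmut, in cycle II for Ergün, and in cycle I for Abdullah and Osman, which is in line with the ordering of the informants from slowest to fastest.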


5.2 The personal pronoun in an L1 pro-drop system

Turkish is a pro-drop language (cf. Kornfilt 1997). This implies that the personal pronoun that functions as the subject is dropped if the subject has already been introduced in or is known from the discourse. To be more precise, the subject is assumed to be a lexically empty pronominal (pro) with the same person and number features as its lexical equivalent. A full pronoun is used either for the reintroduction of a person or thing, or for emphasizing the subject, e.g. in (15b). In the latter case, the subject would be stressed in Dutch.

(15) a. (pro) geli-yorum                  (non-focused)
              come-pres.1sg
     b. ben geli-yorum                    (focused)
        I (stressed) come-pres.1sg
        ‘I come’

The same holds for the pronoun in a possessive noun phrase like his car. The presence of a lexical pronoun yields a reading in which the possessor is emphasized, e.g., in (16b), his car as opposed to your car. In neutral, non-focused readings, the pronoun is not lexically expressed but contains number and person features, as in (16a).

(16) a. (pro) araba-sı                    (non-focused)
              car-3sg
        ‘his car’
     b. o-nun araba-sı                    (focused)
        he/that-gen car-3sg
        ‘his car’

So, in Turkish two different phonological matrices, /ben/ and /Ø/, can be linked to the same formal feature bundle, as illustrated in the columns 2 and 3 of Table 10. Notice here that empty phonological matrices are not restricted to learner variants as in (11) and in the Tables 7 and 8. Dutch does not have the possibility of pro-drop and therefore can only use a lexically filled pronominal ik (‘I’). According to recent analyses (see, among others, Corver and Delfitto 1999) Dutch pronouns should be considered transitive determiners (D). It is assumed that their complement is a pro which stands for the phrasal projection NP and not just for the head noun: [DP D [NP pro]]. The name prodeterminer seems to be more appropriate for these determiners than pronoun. Turkish pronouns, however, seem to be pronominals in the true sense of this word; i.e. they seem to be of the lexical category type N rather than of the functional categorial type D (see Kornfilt 1997:300). Support for this comes, for example, from their behaviour with respect to various morphological rules; like common nouns, pronominals in Turkish function as stems to which case and plural morphology can be attached. The two types of pronominals in Turkish and Dutch are compared in Table 10.

Table 10. Lexical items of the concept ‘I’ in Turkish and Dutch

                    L1 item (Turkish)                 L2 item (Dutch)
– phonol. matrix    /ben/            /Ø/              /ik/ stressed    /’k/ /ik/ unstressed
– semantics         ‘I’ focused      ‘I’              ‘I’ focused      ‘I’
– formal features   [+N,−V]          [+N,−V]          [+D]             [+D]
                    [1 person]       [1 person]       [1 person]       [1 person]
                    [singular]       [singular]       [singular]       [singular]
                    [nominative]     [nominative]     [nominative]     [nominative]

The differences between the two pronominal systems consist in (i) the categorial value of the pronoun, and (ii) the use of overt pronouns as a means of focalising. Given the Conservation Hypothesis, Turkish L2 learners match a phonological matrix of the L2 with an L1 feature bundle. The question arises which combination they will make. As can be inferred from the L2 data, Turkish learners link the non-focused item (an empty pro), which is more frequently used in Turkish than the focused one and can be considered the default form, to the Dutch full pronoun. That is not surprising because, in initial stages, adult learners turn out to perceive full pronouns before clitic pronominals (cf. Broeder 1991), although the clitic pronouns occur more frequently in the spoken L2 input. However, the question remains for L2 learners how to express the focused pronoun in their L2 Dutch. Assuming that Dutch pronominals are of the categorial type N as well, their projections may contain a slot for demonstrative determiners. So these L2 learners may attach a demonstrative die or deze (‘that’) to the pronoun in emphatic contexts, as in (17a), in which Mahmut explains why his daughter must learn Turkish, and in (17b), in which Ergün talks about his brother.

(17) a. [Die ik] Hollands dochter Hollands.                                   Mahmut, I-4
        [that ik Dutch daughter Dutch
        ‘When I speak Dutch, my daughter speaks Dutch.’
        [Die mijn] mama [die ik] Turks praten mijn dochter Hollands praten.
        [that my mother [that I Turkish speak my daughter Dutch speak
        ‘When my mother and I speak Turkish, my daughter speaks Dutch.’
     b. [die van mijn] broer                                                  Ergün, I-3
        [that of my brother
        ‘my (emphatic) brother’

Table 11 represents schematically how L1 knowledge is linked to L2 knowledge in the learners’ lexicon.

Table 11. Learner variants of the lexical item /ik/ compared to the target language; deviances from the target language are in italics

                    Learner variant 1   L2 item (Dutch)    Learner variant 2   L2 item (Dutch)
– phonol. matrix    /die ik/            /ik/ (stressed)    /ik/                /’k/ /ik/ unstressed
– semantics         ‘I’ focused         ‘I’ (focused)      ‘I’                 ‘I’
– formal features   [+N,−V]             [+D]               [+N,−V]             [+D]
                    [1 person]          [1 person]         [1 person]          [1 person]
                    [singular]          [singular]         [singular]          [singular]
                    [nominative]        [nominative]       [nominative]        [nominative]

6. Functional elements in the clausal domain

An interesting example of how small differences in feature bundles play a role in L2 acquisition is the acquisition of the possessive verb hebben (‘to have’). A number of studies (e.g. Benveniste 1966, Freeze 1992, Moro 1997) have pointed out that possessive have-constructions derive from an underlying locative construction, and differ only in some respects from possessive constructions without have. Under this view, the form heeft (‘has’) is considered a form of to be in which a locative preposition has been incorporated. In that way the Dutch form heeft is the spell out of the features Tense (present) (T) + Agreement (3sg) (Agr) + Locative (LOC). Basically, the possessor is considered to be the complement of the locative preposition (P); this PP is moved to the subject position (in SpecAgrsP) after the incorporation of the locative preposition. The derivation is as follows in (18).9


(18) a. …dat Paul een motor heeft.10
        …that Paul a motorcycle has
        ‘… that Paul has a motorcycle.’
        [CP dat [AgrP Paul_i [[TP t_i [SC een motor [PP P_k t_i]] [T+P_k]_j]] Agr+[T+P]_j (= heeft)]]
     b. Paul heeft een motor.
        Paul has a motorcycle
        ‘Paul has a motorcycle.’

Some languages also show a non-incorporated variant like French (ce livre est à moi ‘this book is to me’). Moroccan Arabic, however, does not have a variant in which the locative preposition is incorporated in the be-copula, viz. no equivalent of the verb to have. Moroccan Arabic uses only a locative sentence with the PP at+clitic (i.e. Eend-u at+him), both for a present tense sentence (19a) and for a past tense sentence (19b) (cf. Harrell 1970).

(19) a. Abder, Eend-u dar kbira.                     (present tense)
        Abder at-him house-f big-f
        ‘Abder has a big house’
        [CP Abder, [CP [AgrP [Agr′ [Eend-u]_i [SC dar kbira [PP t_i]]]]]]
     b. Abder, kanet Eend-u dar kbira.               (past tense)
        Abder cop.past.3sg.f at-him house-f big-f
        ‘Abder had a big house.’
        [CP Abder, [CP [AgrP [Agr′ kanet_k [TP [Eend-u]_i [T′ t_k [SC dar kbira t_i]]]]]]]

Notice that, in (19a), a copular form is lacking in the present tense and the preposition Eend-u occupies the position of the verb (in Agr), whereas in (19b), the position of the verb in Agr head is taken by the past tense copula kanet and Eend-u is in SpecTP. This may have interesting consequences for L2 learners who are in search of equivalents of their L1 grammar in the environmental input of the L2. Let us first make a comparison of the concept ‘has’ in the source and the target languages (see Table 12).

Just as was the case for the genitive case marker in Section 5.1, L2 learners start with an empty phonological matrix linked to the L1 feature bundle, as can be seen in the following L2 expressions produced by adult Moroccan learners. In (20a), Fatima is asked about her work in Morocco: she had a small shop with some knitting machines. In (20b), Mohamed describes his situation in an interview with a housing official.

Table 12. Lexical items of the concept ‘have’ compared for source and target languages

                            L1 item (Moroccan Arabic)   L2 item (Dutch)
– phonological matrix       /Eend-/                     /heeft/
– semantics                 ‘with/at’                   ‘has’
– categorial value          [−N,−V]                     [Agr +T +P]
                                                        3sg +PRES +LOC
– subcategorization frame   [−DP ⟨clitic⟩]              [DP−]

(20) a. pronominal possessor
        Ik klein winkel. (= ik Ø klein winkel)                           Fatima, I-3
        I small shop
        ‘I had a small shop.’
     b. nominal possessor
        Mijn vrouw ook klein huis. (= mijn vrouw Ø ook klein huis)       Mohamed, I-6
        my wife also small house
        ‘My wife has also a small house.’

Slow learners often show a stage in which they produce a preposition between the possessor and the possessee, e.g. in (21). The preposition appears in the same position where it would appear in the L1, between the strong pronoun (not the clitic) or the full noun phrase in the dislocated position at the beginning of the sentence (cf. the position of Abder in (19a, b)) and the second noun phrase. In (21a), Fatima talks again about her work and, in (21b), about her relatives on a photograph: three of them are half sisters and brothers.

(21) a. pronominal possessor
        Ik met klein winkel.                              Fatima, I-3
        I with small shop
        ‘I had a small shop.’
        [CP ik [CP [AgrP met_j [SC klein winkel [PP t_j]]]]]
     b. nominal possessor
        Fatiha Mustafa Khilifiye met andere moeder.       Fatima, II-4
        Fatiha Mustafa Khilifiye with other mother
        ‘Fatiha, Mustafa and Khilifiye have another mother.’

In more advanced stages of L2 development, learners happen to produce sentences in which the locative character of the possessive clause becomes manifest, e.g. in (22a) and (22b), in which the entire PP has been moved to SpecAgrP, or (22c) in which the locative preposition is overtly spelled out and incorporated in heeft as well.

(22) a. Met kind een jaar.11                 Fatima, II-7
        with child one year
        ‘The child is one year old.’
     b. Bij hem kief.                        HassanK, III-2
        at him hashish
        ‘He has hashish.’
     c. Met Soumiya heeft veel pijn.         HassanM, III-5
        with Soumiya has much pain
        ‘Soumiya suffers very much.’

Except for Fatima, such locative constructions instead of the verb have are rare in the data produced by the Moroccan informants. When their have-constructions are considered more closely, however, we become more sceptical about the character of the have-forms. To start with Fatima, she uses the form heeft (3sg) not only for all person roles, but she alternatively uses met and heeft with the same meaning for more than 13 months. Another informant produces instances of heeft in which it can be a synonym for met (23a). The sentence in (23b) is even a direct copy of Moroccan Arabic in which heeft is followed by a clitic as if it were a locative preposition.

(23) a. Die heeft geel haar mag binnen.      HassanK, II-2
        that has yellow hair may inside
        ‘The boy with blond hair is allowed to go inside.’
     b. Die meisje, heef-ze een oom.         HassanK, II-2
        that girl has-she an uncle
        ‘That girl has an uncle.’

The situation becomes crucial when this informant wants to express a possessive clause in the past tense. Recall from (19b) that, in Moroccan Arabic, the past tense of a have-clause is expressed by a be-copula in the past tense followed by the locative preposition and see what HassanK produced in (24).

(24) a. Die was heeft (-pro) 30 jaar.                                        HassanK, II-3
        he was-cop.past.3sg has [=Eend+3sg] 30 years
        ‘He was 30 years.’
     b. Die meisje was nooit heeft (-pro) verkering.                         II-4
        that girl was-cop.past.3sg never has [=Eend+3sg] relationship
        ‘That girl was never in a relationship.’
     c. Dan was heeft (-pro) een huis.                                       HassanK, II-9
        then was-cop.past.3sg has [=Eend+3sg] a house
        ‘Then he had a house.’

From the examples above we infer that HassanK and Fatima base their syntactic have-structures on the feature bundle of the Moroccan Arabic preposition Eend (‘at, with’). In this developmental stage, they still match the L1 feature bundle with an L2 phonological matrix. Heeft is in fact a prepositional heeft. The other two informants do not produce past tense forms of the verb hebben; that is already sufficient reason to be sceptical about the real identity of their have-forms. The question can be raised how and when we can be sure that heeft is a real copula and no longer a prepositional form. It is not easy to decide when we are dealing with a copular verb, but it can safely be claimed that it is no longer a preposition when past tense forms of hebben are expressed in one verbal form (had ‘had’). HassanK succeeds in producing several past tense forms and Mohamed does only once (see Table 14). A second indication is that the verb hebben has extended from a possessive copula to an auxiliary of the perfect tense, e.g. hij heeft gezien (‘he has seen’). In this light, it is relevant to signal that all Moroccan learners produce have-copulas six months or more before they start using the auxiliary have (see Van de Craats 2000).

Finally, the development of the lexical item /heeft/ is represented in Table 13, from the L2-initial state (i.e. the empty phonological matrix with the L1 feature bundle) via various developmental stages to the state in which the item has attained the complete feature constellation it has in the target language. Table 13 focuses on the properties of the lexical item heeft, which is the spell out of the features [agreement], [tense] and [−N,−V]. This already suggests a process of adjunction of several functional heads. In Table 13, we abstract away from the formal features dominated by T and Agr triggering the movement of P(P) (see Van de Craats et al. 2002:156 for the derivation).

Table 14 shows how the stages are distributed over the data of the four Moroccan learners. Slow learners exhibit more tokens of the developmental stages than fast learners. The order of the stages is determined again by the first emergence of a changed value. For the slow learners, a considerable overlap of the stages can be observed. The overlap is smaller for learners like Mohamed and HassanM. At the end of the data collection, stage 3 is attained by all four learners; only two of them can form past tense forms according to target language standards, however.


7. Conclusions

Table 13. Stages in the development of the L2 lexical item /heeft/; differences with regard to the previous stage are in italics

                            Stage 1          Stage 2            Stage 3                 Target state
– phon. matrix              /Ø/              /met, bij/         /heeft/                 /heeft/
– semantics                 ‘with, at’       ‘with, at’         ‘with, at’              ‘has’
– categorial value          [−N,−V]          [−N,−V]            [−N,−V]                 [+Agr,+T,+P]
– subcategorization frame   [−DP clitic]     [−DP clitic]       [−DP clitic]            [DP−]
– example                   Abder Ø boek     Abder met boek     Abder was heeft boek    Abder heeft boek

Table 14. Distribution of variants of the lexical possessive item /hebben/ produced by four Moroccan learners over developmental stages

Cycles Fatima Mohamed HassanK HassanM

I II III I II III I II III I II III

L1 constructionStage 1–ØStage 2–prep.Stage 3–prep.Stage 3–heeft

48810

62226

21–76

17144

10––

1––

1177

71173

411

7–52

11–

––1

Target stateHebben presentHebben past

––

––

13–

––

1381

150–

––

––

822

15–

143–

137–

From the examples discussed above, we can infer that, in general, L2 learners are more aware of the fact that words differ from language to language, i.e., that phonological matrices differ, than of the fact that other lexical properties differ. Adult L2 learners are inclined to map or to attach a new phonological matrix from the L2 to a bundle of semantic and formal features from the L1. Only if those feature bundles differ can we gain insight into how L2 learners proceed in their learning of lexical items: they start by assuming an L1 feature constellation and they gradually change the features of the bundle one by one, as we saw in Tables 8 and 13 for the acquisition of van and heeft. In this way, the L2 learners’ output becomes more and more target-like; not only the surface form (= the phonological matrix) alters, but also the features.

In this light, it can be explained why functional elements are more difficult to acquire than lexical elements (i.e. content words). It is not only the case that they are less salient in the environmental input, but they also differ more in the structure of their feature bundle. Therefore, it takes more time to discover each of the composing features. The formal features, however, are of crucial importance for the development of the L2 syntax: they are the input for the computational system and, hence, decisive for the result of derivation: the syntactic objects. They are the interface itself, so to say.12 Until a Moroccan learner of Dutch, for instance, has discovered each of the formal features of heeft (‘has’), he cannot form the verb with subject-verb agreement, nor the past tense of that verb, nor the perfect aspect by means of the auxiliary hebben. This shows the direct link between lexicon and syntax.

Why is there progress? The syntactic objects (i.e. the product of the computational system and the selection of lexical elements, including formal features) in interlanguages seem to provide L2 learners with a reason to change their L2 output, because that output is not sufficiently understood by native speakers, is incomplete (recall, for instance, the empty phonological matrices in the output of beginners in example (11)), etc. Therefore, L2 learners are forced to constantly reanalyse their L2 output and to change the underlying formal features and/or the parametric values. This process of restructuring seems to occur in a continuous interplay between syntax and formal features, which are both the seeds of syntax and a part of the lexicon.

Notes

1. In Chomsky (1994) it is argued that only the idiosyncratic formal features are part of the lexical item. Optional formal features are added to the lexical item when it is selected from the lexicon. Here, we will abstract away from that distinction.

2. Except for the case in which L1 and L2 are so similar that an L2 learner decides to use an L1 lexical item (with the L1 phonological matrix) directly in the L2 output.

3. It is certainly not the case that learners keep on using all meaning aspects of a lexical item in their L2: the more transparent and concrete a specific semantic aspect is, the earlier it is used in L2 expressions (cf. the 17 different aspects of breken (‘to break’) the acceptability of which Kellerman (1979) asked his Dutch subjects to judge).

4. A reviewer proposed that betalen geven would be a serial verb construction, or geven ‘to give’ a light verb; geven can function as a light verb in Dutch. Although this account is not quite impossible, the reviewer is arguing too much from a target language perspective. Moreover, there is no light verb geven in (2b), and the example in (3b) makes clear that it is not the verb betalen that is meant.

5. A reviewer pointed out that Adjémian (1983) already found that L2 learners tend to transfer lexical patterns from their L1 to their L2, and assume that verbs take the same kinds of subject and object in their L2 as they do in their L1. Mapping an L2 phonological matrix onto L1 lexical properties goes even further.

6. Following insights from Longobardi (1996), we take the position that the Saxon genitive and the doubling construction are in essence hidden construct noun phrases, because they share certain properties with the Construct-State construction (as in Arabic and Hebrew). The Saxon genitive and the possessive doubling have the same distribution as a definite article, viz. they block the presence of a definite article in D0.

7. The English translation of these examples suggests that the word order is not uncommon, but possessor DPs in the Saxon genitive are restricted to proper names, and to human beings in the doubling possessive construction. Moreover, recursion is odd in those constructions. A native speaker of Dutch would only use the analytic construction for recursive patterns due to processing limitations.

(i) de dochter van mijn tante
(ii) de auto van de zoon van mijn tante
(iii) de dochter van de tweede man van de oma van mijn vrouw

8. Note that in Dutch, the order Possessor-Possessee is restricted to human possessors; the Saxon Genitive (e.g. Jans fiets ‘John’s bike’) is even restricted to proper names and nouns equivalent to proper names, e.g. tante ‘aunt’. This points all the more to the L1 grammar in example (11b).

9. See also Van de Craats (2000) and Van de Craats, Corver and Van Hout (2002) for details on this construction in Dutch and Moroccan Arabic.

10. In this example, a subordinated clause is given first because it shows the basic position of the verb in Dutch (SOV language).

11. In Moroccan Arabic, one’s age is also expressed by a have-construction.

12. Although the lexicon plays an important role in this view on adult L2 acquisition, it is not identical to the Lexical Learning Hypothesis (e.g. Clahsen, Eisenbeiss and Penke 1996) nor to other recent views on L1 acquisition (e.g. Radford 2000, Roeper 1996) in which the acquisition of (L1) syntax is seen as gradual structure building.

References

Adjémian, C. 1976. “On the nature of interlanguage systems”. Language Learning 16: 297–320.

Adjémian, C. 1983. “The transferability of lexical properties”. In Language transfer in language learning. S. Gass & L. Selinker (eds), 250–268. Rowley, MA: Newbury House.

Benveniste, E. 1966. “‘Etre’ et ‘avoir’ dans leurs fonctions linguistiques”. Bulletin de la Société de la linguistique de Paris 55 (1): 113–134. Reprinted in Problèmes de linguistique générale. Paris: Gallimard.

Broeder, P. 1991. Talking about people. A multiple case study on adult second language acquisition. Amsterdam/Lisse: Swets & Zeitlinger.

Chomsky, N. 1986. Barriers. Cambridge, Massachusetts: MIT Press.

Chomsky, N. 1994. “Bare phrase structure”. In MIT occasional papers in linguistics. Cambridge, Massachusetts: MIT Press.

Chomsky, N. 1995. The minimalist program. Cambridge, Massachusetts: MIT Press.

Clahsen, H., Eisenbeiss, S. and Penke, M. 1996. “Lexical learning in early syntactic development”. In Generative perspectives on language acquisition, H. Clahsen (ed.), 129–159. Amsterdam/Philadelphia: John Benjamins.

Corver, N. 1990. The syntax of left branch extractions. Doctoral dissertation, Tilburg University.

Corver, N. and Delfitto, D. 1999. “On the nature of pronoun movement”. In Clitics in the Language of Europe, H. van Riemsdijk (ed.), 799–861. Berlin: Mouton & de Gruyter.

Flynn, S. 1986. A parameter-setting model of second language acquisition. Dordrecht: Reidel.

Freeze, R. 1992. “Existentials and other locatives”. Language 68: 553–595.

Harrell, R. 1970. A short reference grammar of Moroccan Arabic. Washington, D.C.: Georgetown University Press.

Kellerman, E. 1979. “Transfer and non-transfer: Where we are now”. Studies in Second Language Acquisition 2: 37–57.

Kornfilt, J. 1997. Turkish. London, New York: Routledge.

Longobardi, G. 1996. The syntax of N raising: A minimalist theory. Manuscript, Utrecht University.

Miller, P. 1991. Clitics and constituents in phrase structure grammar. Doctoral dissertation, University of Utrecht.

Moro, A. 1997. Predicative noun phrases and the theory of clause structure. Cambridge: Cambridge University Press.

Perdue, C. 1993. Adult language acquisition: Cross-linguistic perspectives. Volumes I and II. Cambridge: Cambridge University Press.

Radford, A. 2000. “Children in search of perfection: towards a minimalist model of acquisition”. In Essex research reports in linguistics, 34.

Roeper, T. 1996. “The role of merger theory and formal features in acquisition”. In Generative perspectives on language acquisition, H. Clahsen (ed.), 415–449. Amsterdam/Philadelphia: John Benjamins.

Schwartz, B.D. and Sprouse, R. 1996. “L2 Cognitive states and the full transfer/full access model”. Second Language Research 12: 40–72.

Van de Craats, I. 2000. Conservation in the acquisition of possessive constructions. A study of second language acquisition by Turkish and Moroccan learners of Dutch. Doctoral dissertation, Tilburg University.

Van de Craats, I., Corver, N. and Van Hout, R. 2000. “Conservation of grammatical knowledge: on the acquisition of possessive noun phrases by Turkish and Moroccan learners of Dutch”. Linguistics 38 (2): 221–314.

Van de Craats, I., Corver, N. and Van Hout, R. 2002. “The acquisition of possessive have-clauses by Turkish and Moroccan learners of Dutch”. Bilingualism: Language and Cognition 5 (2): 147–174.

Vermeer, A. 1986. Tempo en struktuur van tweede-taalverwerving bij Turkse en Marokkaanse kinderen. Doctoral dissertation, Tilburg University.

White, L. 1982. Grammatical theory and language acquisition. Dordrecht: Foris.

White, L. 1985. “The pro-drop parameter in second language acquisition”. Language Learning 35: 47–62.

</TARGET "cra">

<TARGET "duf" DOCINFO AUTHOR "Nigel Duffield"TITLE "Measures of competent gradience"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 5

Measures of competent gradience*


Nigel Duffield
McGill University & Max Planck Institute for Psycholinguistics, Nijmegen

1. Introduction: Reconsidering the competence–performance distinction

An assumption of much current generative research is that grammaticality is a strictly categorical property. By extension, underlying competence, of which grammaticality is a reflex, is also assumed to be categorical in nature.1 In most analyses, this assumption about grammaticality is explicitly formalized in the convention of placing asterisks in front of ‘ungrammatical’ sentences: starred sentences are categorically ‘bad’, unstarred sentences are unequivocally ‘good’.2

A second, equally standard assumption in generative research is that acceptability judgments are not categorical, at least if one ignores such uninteresting deviances as arbitrary word scrambles, nonce-word insertions, and so forth. Everybody knows that acceptability judgments on most theoretically relevant sentences are only relative, that the particular level of acceptability of a given sentence is a complex function of a variety of factors, including extra-grammatical ones. It is notable that L2 researchers have been more explicit in recognizing this fact than pure theorists (see especially Birdsong 1989, Ellis 1991, Hedgcock 1993, Martohardjono 1998; cf. also Coppieters 1987, Mandell 1999, but the view is more generally held: see Bever 1970, Greenbaum 1977, Schütze 1996, for discussion and further references).

In spite of the tension between these assumptions, most theoretical linguists use continuous performance data to draw inferences about (what is assumed to be) categorical grammaticality, and by extension, categorical underlying competence.3 Having once ‘idealized to competence’, theoretical linguists are free to ignore those aspects of performance that they deem irrelevant to grammaticality: this includes the property of gradience that is a principal concern of this paper.


Most of the foregoing discussion is familiar to anyone who has considered the competence issue, so it might be wondered why I am bringing it up again. There are three reasons, two of which apply to linguistic research in general, and one of which is more specifically concerned with the relationship between native-speaker (NS) and non-native-speaker (NNS) data in second language acquisition research.

The first reason for critically re-examining the competence–performance distinction is to save it from attack from an increasingly coherent body of research that takes seriously the complexity of grammatical systems whilst at the same time questioning the empirical validity of that distinction.4

One articulate statement of this new position is presented in the ‘emergentist’ work of Allen and Seidenberg (1999). Allen and Seidenberg’s attack on the competence–performance distinction as it relates to language acquisition is motivated by what they claim to be the obscure basis of grammaticality judgments:

The mapping between competence grammar and performance is at best complex, as we have noted: it is also largely unknown. A problem arises because the primary data on which the standard approach relies, grammaticality judgments, are themselves performance data (Bever 1970). The methodology of the standard approach holds that properties of the hypothesized language faculty can be identified on the basis of experts’ intuitive judgments of the well-formedness of utterances. However, the relationship between grammaticality judgments and the structure of the grammar is no more transparent than that between other aspects of competence and performance. (Allen and Seidenberg 1999: 117–118)

While one might take issue with the reference to ‘experts’ intuitive judgments’, the rest of this paragraph is unexceptionable. Allen and Seidenberg further point out that misgivings about the basis of grammaticality judgments are to be found within the generativist camp. The authors cite a telling paragraph in Schütze (1996: 20):

It is conceivable that competence in this sense of a statically represented knowledge does not exist. It could be that a given string is generated or its status computed when necessary, and that the demands of the particular situation determine how the computation is carried out, e.g., by some sort of comparison to prototypical sentence structure stored in memory. Since such a scenario would demand a major re-thinking of the goals of the field of linguistics, I will not deal with it further.


Allen and Seidenberg claim that this sidesteps an important issue. Their response is to try to derive the effects of grammaticality from an input-driven system lacking any autonomous syntactic representations: that is, without any reference to grammatical competence as theoretical linguists would construe it.

For various reasons, including both the results of their own simulation and the results of other experiments to be discussed below, I believe that this alternative approach to grammaticality is mistaken, and that something like the competence–performance distinction is empirically more correct. At the same time, their work demonstrates the urgent need for a clearer articulation of the nature of the distinction, and for a more explicit statement of the content of grammaticality judgments. While I will not be able to offer that statement here, I hope to sketch out the direction in which it might be found.

A second reason for reconsidering the competence–performance distinction is that I agree with those who claim that the line between competence and performance is presently drawn in the wrong place: that the current idealization to competence presents too narrow a view of the scope of competence. This is a view held not only by connectionists, but also by some generativists, most notably Culicover (1998, 2000), though cf. Fodor (2001).

The specific notion that I would like to include within a revised version of competence is that of syntactic gradience. As the title of this article implies, I will argue that part of competence consists, not in correctly making categorical judgments about grammaticality, but in correctly making gradient ones. To cite one brief example, to be a competent ‘knower’ of English is to know, not merely that the sentences in (1) and (2) below are all less than perfect, but also that they are rather precisely ordered in acceptability, each example being slightly less acceptable than the one that immediately precedes it. These examples, cited in Kluender (1992), are originally due to Chung and McCloskey (1983) (where the symbol ‘>’ indicates ‘more acceptable than’):

(1) a. This is the paper that we really need to find someone who understands. >
    b. This is the paper that we really need to find a linguist who understands. >
    c. This is the paper that we really need to find the linguist who understands. >
    d. This is the paper that we really need to find his advisor who understands. >
    e. This is the paper that we really need to find John, who understands.

(2) a. This is a paper that you really need to find someone that you can intimidate with. >
    b. Which paper do you really need to find someone that you can intimidate with. >
    c. How many papers do you really need to find someone that you can intimidate with. >
    d. What do you really need to find someone that you can intimidate with.

Kluender’s concern in presenting such examples is to draw out the ‘hidden’ factors determining the acceptability of these sentences: all of the sentences in (1), and those in (2), involve very similar syntactic structures, yet their grammaticality status varies from (almost) completely acceptable to strongly unacceptable. In this case, the degrees of acceptability are a function, not of syntactic structure, but of a semantic factor, what Kluender terms ‘referential specificity’.

One conclusion to be drawn from such examples is that we are failing to account for speakers’ intuitions by attributing these grammaticality effects to purely syntactic conditions. This is Kluender’s contention. I return to the ‘hidden factors’ issue in Section 2.4 below, in discussing the factors determining another putatively syntactic effect, the parallelism constraint on VP-ellipsis. At this point, however, my concern is with the fact that this is an instance of ‘competent gradience’: competent knowers of the language judge these sentences as increasingly less acceptable, and they also converge on the particular ordering involved.

This becomes especially relevant to SLA when we consider a hypothetical advanced L2 learner’s response to such sentences.5 On the categorical view of competence, an L2 learner who systematically rejected all of these sentences as unacceptable (*) should in fact be ‘more correct’ relative to the hypothesized grammar (I-language) than the native speaker who is inclined to accept most or all of them (most or all of the time). Although unusual, such instances of L2 learners ‘outperforming’ native speakers are not unknown; the existence of this phenomenon is the third reason to re-examine the competence–performance distinction. In fact, I will suggest that the hypothetical L2 learner’s performance does reveal an underlying categorical competence.

Intuitively, however, a learner who judges as unacceptable what native speakers judge as less acceptable has somehow failed to achieve native-speaker competence, if we define competence as whatever implicit knowledge determines successful language use.

By one measure, then, the L2 learner whose judgments fail to match those of the native speaker will be judged more competent than the native speaker, where competence is defined in terms of convergence between a discrete pattern of behaviour and underlying categorical knowledge. Another measure (and common sense) dictates that the same L2 learner exhibiting the same behaviour should be judged less competent than the native speaker, where competence is defined in terms of convergence on native speakers’ judgments. Although most generative SLA researchers would claim that they are probing the first type of competence, I know of very little work that does not determine this competence, implicitly or explicitly, in terms of the second type.6,7
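The contrast between the two measures can be made concrete with a small sketch. The ratings below are invented purely for illustration (they are not experimental data), and the function names `categorical_match` and `ordering_match` are hypothetical labels for the first and second measure respectively; the sketch also assumes that the categorical grammar marks all five sentences as ungrammatical.

```python
from itertools import combinations

# Hypothetical 1-5 ratings for five sentences of decreasing acceptability.
# Invented for illustration only; not data from any study.
native = [4.5, 4.0, 3.0, 2.0, 1.0]   # gradient native-speaker pattern
learner = [1.0, 1.0, 1.0, 1.0, 1.0]  # learner who rejects every sentence

# Assumption: the categorical grammar rules out all five sentences.
grammatical = [False] * 5


def categorical_match(ratings, grammatical, cutoff=3.0):
    """Measure 1: share of accept/reject decisions that agree with the grammar."""
    hits = sum((r >= cutoff) == g for r, g in zip(ratings, grammatical))
    return hits / len(ratings)


def sign(x):
    return (x > 0) - (x < 0)


def ordering_match(ratings, reference):
    """Measure 2: share of sentence pairs ordered as in the reference ratings."""
    pairs = list(combinations(range(len(reference)), 2))
    same = sum(
        sign(ratings[i] - ratings[j]) == sign(reference[i] - reference[j])
        for i, j in pairs
    )
    return same / len(pairs)


print(categorical_match(learner, grammatical), categorical_match(native, grammatical))
print(ordering_match(learner, native), ordering_match(native, native))
```

Under these assumptions the learner scores higher than the native speaker on the first measure and lower on the second, which is the paradox described above.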

There have been two responses to this apparent paradox. As noted above, the anti-generativist response is simply to reject the idea of competence as statically represented, autonomous knowledge. The standard generativist response is to continue to pretend that gradience is not part of syntactic competence. Neither alternative seems satisfactory.8

A middle way is to claim that both impulses are correct, because there are two types of competence. This is the proposal I will try to motivate here. For the purposes of exposition, I’ll term these hypothesized competences, underlying versus surface competence, respectively, with no preference intended. By hypothesis, underlying competence (UC) is categorical, and consists of formal (phonological and syntactic) principles, autonomous from the lexicon. It is plausible to think of UC as innate. Surface competence (SC), by contrast, is intimately determined by the interaction of contextual and specific lexical properties with the formal principles delivered by UC; as a consequence, SC generates gradient effects. SC is largely language-specific learned knowledge. In principle, grammaticality judgments can be a reflection in performance of either type of competence; generally, however (again, by hypothesis), explicit grammatical judgment tasks will tap surface competence, whereas implicit tasks will tap underlying competence.9

Now, it may happen that, in a particular task, a given set of judgments will be ‘weakly consistent’ with both types of competence, and also that L1 and L2 learners’ judgments will converge. This state of affairs is schematised in Figure 1 below. Gradient judgments will be weakly consistent with UC just in case (i) all sentences judged less than perfectly acceptable violate some formal constraint, (ii) none of the sentences judged to be perfect violate this constraint, and (iii) it is reasonable to attribute the relative degrees of acceptability of the unacceptable set to some non-formal (preferably extra-grammatical) factor. This I take to be the standard generativist line.
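Conditions (i) and (ii) of weak consistency are mechanical enough to be written out as a check; condition (iii) is a matter of theoretical judgment and is deliberately left out. The following is a minimal sketch under assumptions of my own: ratings are scaled so that 1.0 means ‘perfectly acceptable’, and the function name `weakly_consistent` is a hypothetical label.

```python
def weakly_consistent(judgments):
    """Check conditions (i) and (ii) for a list of (rating, violates) pairs.

    rating: acceptability on a 0-1 scale, where 1.0 means perfect (assumption).
    violates: True if the sentence violates the formal constraint at issue.
    Condition (iii) is not checked here.
    """
    # (i) every sentence judged less than perfect violates the constraint
    cond_i = all(violates for rating, violates in judgments if rating < 1.0)
    # (ii) no sentence judged perfect violates the constraint
    cond_ii = all(not violates for rating, violates in judgments if rating == 1.0)
    return cond_i and cond_ii


# Hypothetical judgment sets, invented for illustration.
gradient_but_consistent = [(1.0, False), (0.7, True), (0.3, True)]
inconsistent = [(1.0, False), (0.7, False)]

print(weakly_consistent(gradient_but_consistent))
print(weakly_consistent(inconsistent))
```

The first set is gradient yet still compatible with a categorical constraint; the second is not, because a sentence judged less than perfect violates nothing.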

Figure 1. Full convergence (SC and UC generate the same set of grammatical sentences, NS and NNS converge on this set). [Diagram labels: common set of judgments (shared by NSs & NNSs); surface competence; underlying competence.]

However, several legitimate alternatives exist. These are cases where either the performance judgments on the data presented are a reflex of only one type of competence (UC or SC), or the judgments of two subject groups (L1 versus L2) reflect different competences. These logical alternatives are schematised in Figures 2–5 below.

Figure 2. Convergence on SC only. [Diagram labels: common set of judgments (shared by NSs & NNSs); surface competence; underlying competence.]

Figure 3. Convergence on UC only. [Diagram labels: common set of judgments (shared by NSs & NNSs); surface competence; underlying competence.]

Figure 4. Parallel, disjoint convergence (Type 1: NS converge on SC, NNS on UC). [Diagram labels: surface competence; underlying competence; native speaker judgments; non-native speaker judgments.]

Figure 5. Parallel, disjoint convergence (Type 2: NNS converge on SC, NS on UC). [Diagram labels: surface competence; underlying competence; non-native speaker judgments; native speaker judgments.]

Notice that all of these scenarios assume convergence between NS and NNS competences; that is to say, I am ignoring those cases where L2 learners have internalised a surface or underlying competence that is different from that of native speakers; cf. Sorace (1993, 1996). What this model is intended to account for are cases in which the L2 learner has attained the target grammar, but where his/her judgment patterns nevertheless diverge systematically from those of the native speaker.

Clearly, hypothesizing two different types of competence complicates any theory of the relationship between competence and performance. Parsimony requires that this complication be shown to be empirically necessary. The purpose of the rest of this article, therefore, is to provide such motivation. The organization is as follows. The following section provides some experimental evidence motivating two types of competence. Next, I consider two types of gradient effect, both of which I will argue are properties of surface, rather than underlying, competence. Then, I discuss two instances of principled mismatch between NS and NNS judgments, motivating both Figures 4 and 5 above. Finally, I examine a special case of Figure 2 above, where NS and NNS apparently converge on the same SC, but for different reasons.

2. The dual competence approach

2.1 Motivating two types of competence

This section draws attention to two pieces of recent work that suggest a distinction between two types of competence. In each case, complementary methodologies have been used to investigate speakers’ judgments of particular syntactic phenomena, which have been argued (in the theoretical literature) to exhibit a categorical distinction. In each case, these methodologies have yielded divergent, but nevertheless coherent, results. This divergence, I would claim, reflects the fact that two different types of competence are being tapped.

2.1.1 McKoon and MacFarland (2000)

McKoon and MacFarland’s (2000) study investigates the theoretical argument for a discrete representational contrast between two classes of verbs: externally- versus internally-caused change-of-state verbs. As these labels suggest, externally-caused change-of-state (ECS) verbs are those whose result state may be brought about by some external agent/cause, whereas internally-caused change-of-state (ICS) verbs describe result states whose cause is internal to the theme NP undergoing change-of-state. An example of the former is the verb redden, an example of the latter, bloom.

The theoretical literature on such predicates analyses the difference in terms of a discrete lexical contrast: the lexical argument-structure of ECS verbs is said to involve an additional abstract predicate (CAUSE), which is systematically absent from the argument-structure of ICSs. In the specific analysis of Levin and Rappaport Hovav (1995), this entails a structural difference in the lexico-syntactic representation of the two verb-types: as illustrated in (3) below, ECS verbs project more structure than ICS verbs.

(3) a. ECS: ((α) CAUSE (BECOME (x ⟨STATE⟩)))
    b. ICS: (BECOME (x ⟨STATE⟩))

One piece of empirical evidence supporting this structural contrast comes from the alleged (in)ability of these intransitive verbs to form transitive counterparts: it has been claimed that ECSs, but not ICSs, permit transitive alternants. So, for example, there is claimed to be a categorical contrast in the acceptability of (4a) versus (4b):

(4) a. The sunshine reddened Amy’s cheeks.
    b. *The sunshine bloomed the tulips.

McKoon and MacFarland tested the validity of this empirical claim through a number of different studies. First, they examined a corpus of approximately 180 million words of written and spoken English. The results of that investigation, summarized in Table 1 below, demonstrate that ICSs considered as a class are in fact just as likely to transitivise as ECSs.

Table 1. Probability of transitive use for ECS and ICS verbs (adapted from McKoon and MacFarland 2000: Table 2)

Sentence type              External Cause               Internal Cause
                           verb         prob. ‘yes’     verb          prob. ‘yes’
Low prob. transitive       atrophy      .03             bloom         .00
                           awake        .05             deteriorate   .01
                           crumble      .05             germinate     .06
                           abate        .10             rot           .08
                           mean         .06             mean          .06
Higher prob. transitive    redden       .24             blister       .22
                           dissipate    .41             ferment       .54
                           fray         .52             corrode       .63
                           fossilize    .60             erode         .67
                           mean         .48             mean          .45

All of these predicates vary individually in transitivity: some ECSs such as abate show a low probability of transitivity, others such as fossilize are more likely to transitivise. The same is true of ICSs. In terms of usage, then, the transitivity constraint does not distinguish the two classes of predicate; rather, transitivity is a lexically gradient effect. Thus, the corpus study results apparently invalidate a key piece of empirical evidence for a representational contrast between the two verb classes.
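The point that verb class does not separate the predicates can be checked directly against the per-verb values in Table 1. The sketch below is only an illustration: the pairing of verbs with values follows the row order of the table as reproduced here, and the helper name `spread` is hypothetical.

```python
# Transitive-use probabilities as read from Table 1.
# Assumption: verbs and values are paired in the order in which they are listed.
ecs = {"atrophy": 0.03, "awake": 0.05, "crumble": 0.05, "abate": 0.10,
       "redden": 0.24, "dissipate": 0.41, "fray": 0.52, "fossilize": 0.60}
ics = {"bloom": 0.00, "deteriorate": 0.01, "germinate": 0.06, "rot": 0.08,
       "blister": 0.22, "ferment": 0.54, "corrode": 0.63, "erode": 0.67}


def spread(probs):
    """Return the (lowest, highest) probability within a verb class."""
    values = list(probs.values())
    return min(values), max(values)


ecs_range = spread(ecs)
ics_range = spread(ics)

# Two ranges overlap if each one starts before the other one ends.
overlap = ecs_range[0] <= ics_range[1] and ics_range[0] <= ecs_range[1]

print("ECS range:", ecs_range)
print("ICS range:", ics_range)
print("ranges overlap:", overlap)
```

On these values the within-class spread is far larger than any difference between the classes, which is what a lexically gradient effect would lead one to expect.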

The proposed theoretical distinction becomes even more questionable in light of a follow-up study, in which subjects were asked, in an offline grammatical judgment task, to rate the acceptability of various transitivised ECS and ICS verbs. The results of this task confirmed the patterns already observed in the spontaneous production data, namely, that the acceptability of the transitive alternant of a given verb was a function of individual lexical differences, and of selectional restrictions, rather than putative verb-class. Statistically, there is no main effect of verb class, nor an interaction between verb-class and transitivity, in the acceptability ratings. Overall, the acceptability rate for both classes of verbs was extremely high: even those verbs judged least acceptable had a mean acceptance rate of over 80%.

The results of the first two studies speak against the theoretical claim that these two types of verbs are represented differently. On the contrary, subjects’ close convergence on a scale of relative acceptability (for the transitive forms of individual predicates) suggests that a fine-grained, lexically determined, and inherently gradient, type of competence informs native speakers’ performance.

Had McKoon and MacFarland stopped at this point, their work would provide support for the usage-based, probabilistic models of language performance favoured by connectionists and others (cf. especially MacDonald et al. (1994), Barlow and Kemmer (2000)). A final experiment, however, suggested a different conclusion. In this latter, implicit experiment, McKoon and MacFarland measured the response latencies involved in reading ECSs versus ICSs that had previously been matched in terms of length and offline acceptability. In direct contrast to the previous results, this implicit measure revealed a reliable contrast in reading time between ECS and ICS verbs; ECS verbs, whether presented in intransitive or transitive frames, took significantly longer to read than their matched ICS counterparts. The main results are reproduced in Table 2 below (from McKoon and MacFarland: Tables 7 and 8: 848, 851). That is to say, the results of these last experiments support the idea of distinct representations for these verbs on the basis of the hypothesized verb classes.
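The size and direction of the reading-time contrast can be read off Table 2 with a few subtractions. This is a minimal sketch: the times are the means reported in Table 2, and it assumes that the second block of that table reports transitive frames, as the prose description suggests.

```python
# Mean judgment times in ms, as read from Table 2.
# Keys are (frame, sentence set). Assumption: the second block of the
# table gives the transitive frames.
ecs_ms = {
    ("intransitive", "all"): 1551,
    ("intransitive", "low"): 1561,
    ("intransitive", "high"): 1538,
    ("transitive", "all"): 2220,
    ("transitive", "low"): 2230,
    ("transitive", "high"): 2210,
}
ics_ms = {
    ("intransitive", "all"): 1400,
    ("intransitive", "low"): 1392,
    ("intransitive", "high"): 1413,
    ("transitive", "all"): 2069,
    ("transitive", "low"): 2131,
    ("transitive", "high"): 1963,
}

# ECS minus ICS for every cell: a positive value means ECS verbs were slower.
diffs = {cell: ecs_ms[cell] - ics_ms[cell] for cell in ecs_ms}

for cell, diff in diffs.items():
    print(cell, diff)

print("ECS slower in every cell:", all(d > 0 for d in diffs.values()))
```

The differences say nothing about statistical reliability, which rests on the original analysis; they only show that the direction of the contrast is the same in every cell.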

McKoon and MacFarland argue that the results of the final experiment speak against probabilistic models of lexical representation, and confirm the psycholinguistic reality of the theoretical model. Both conclusions are too strong, I think. On the one hand, whether or not there is a categorical distinction in the representation of the two classes of verbs, it is still necessary to account for subjects’ ability to converge on a scale of gradient judgments for individual predicates. As discussed earlier, a speaker whose judgments of the acceptability of such predicates were the inverse of all other subjects’ (in Experiment 2) could reasonably be judged ‘less competent’ than one whose relative judgments were in accordance with other native speakers (at least by the second definition of competence), even if their reading times in Experiment 3 were comparable.

Table 2. Mean reading times for ECS and ICS verbs (from McKoon and MacFarland 2000: Tables 7 & 8)

Sentence type                External Cause               Internal Cause
                             JTime (ms)   prob. ‘yes’     JTime (ms)   prob. ‘yes’
Intransitive frames
  All test sentences         1551         .91             1400         .96
  Low prob. transitive       1561         .92             1392         .96
  Higher prob. transitive    1538         .90             1413         .96
Transitive frames
  All test sentences         2220         .86             2069         .96
  Low prob. transitive       2230         .81             2131         .96
  Higher prob. transitive    2210         .93             1963         .96

Moreover, the fact that subjects’ performance in an implicit task yields a statistically discrete result does not prove that the competence underlying this behaviour is itself categorical, and certainly is no more than consistent with the theoretical analysis. To infer the reality of a particular contrast in underlying competence from an apparently isomorphic contrast in processing is to assume a very direct interpretation of the derivational theory of complexity that most psycholinguists would view with some caution.

Having said that, the results of the third experiment do provide support for the idea that some correlate of the categorical contrast described by the theoretical analysis has psychological reality, and that this correlate does seem to be categorically expressed.

I suggest that an adequate model is one that accommodates both sets of results. Rather than attempting to reconcile these within a single type of competence, the proposal is to assume a dual-competence model, as represented in Figures 1–5 above. In this model, the results of the first two experiments can be represented as in Figure 2 (tapping SC), those of the final experiment as in Figure 3 (tapping UC).10

2.1.2 Duffield and White (1999), Duffield et al. (2002)

The dual competence model receives additional support from recent SLA work using complementary methodologies to investigate L2 learners’ knowledge of pronominal clitic placement in Spanish and French: Duffield and White (1999), Duffield et al. (2002). In this section, my concern is with the divergent results of the French native-speaker control group in one section of experiments. Here, unusually, an offline grammaticality judgment task revealed a grammaticality contrast to which the implicit (sentence-matching) task was apparently insensitive.

108 Nigel Duffield

(5) a. Je veux le voir.
       I want 3sg to.see
       ‘I want to see him.’

    b. *Je le veux voir.

    c. Je le fais chanter.
       I 3sg make to.sing
       ‘I make him sing.’

    d. *Je fais le chanter.

The phenomenon of interest, illustrated in (5) above, is the contrast in pronominal clitic placement between so-called restructuring verbs (such as pouvoir, vouloir) and the causative verb faire. The distinction is of theoretical interest in that the two structures are assumed to involve distinct syntactic representations. Briefly, many theoretical analyses assume that restructuring verbs, as the name suggests, restructure the syntax of the lower clause, creating a mono-clausal structure. By contrast, sentences involving causatives are assumed to remain fundamentally bi-clausal: at an abstract level of representation, the pronominal clitic is still syntactically associated with the lower verb; see Duffield et al. (2002) for more detailed discussion.

Notice that this analysis, derived on the basis of cross-linguistic comparison with other Romance varieties, is precisely the opposite of that suggested by a naïve inspection of the French facts. In French, the pronominal clitic stays close to the verb of which it is an argument in restructuring contexts, and is displaced from it in causative contexts: hence, one might expect that restructuring would be required for the interpretation of clitics with causatives, rather than with verbs like vouloir.

The theoretical analysis, however, predicts a distinction between the acceptability of two types of ‘ungrammatical’ sentence, namely, between (5b) and (5d) above: whereas (5b) is predicted to be ungrammatical at all levels of representation, (5d), in which the clitic is attached to the verb with which it is thematically associated, is predicted to be ungrammatical only at surface level. This predicts that if a task could be found that taps ‘underlying’, rather than surface, grammaticality, then sentences such as (5d) should pattern with other grammatical, as opposed to ungrammatical, sentences.

In Duffield et al. (2002), we employed just such a task to elicit implicit grammaticality judgements. This is the Sentence-Matching (SM) paradigm, introduced by Freedman and Forster (1985) and later developed for SLA research by Bley-Vroman and Masterson (1989); see also Eubank (1993), Eubank and Grace (1988). In this task, subjects are asked to determine whether or not two visually-presented sentences are identical in form. Previous research has shown repeatedly that, in general, subjects take significantly less time to decide that matching pairs of grammatical sentences are identical than to match corresponding ungrammatical pairs of sentences. Hence, statistically discrete differences in reaction times constitute an implicit measure of grammaticality. This measure has proven useful in comparing native-speakers’ implicit grammaticality judgments with those of L2 learners, since one is (theoretically, at least) able to avoid a potential pitfall of explicit grammaticality judgment tasks, in which L2 learners may have explicitly learned a particular rule, and be able to apply it in a grammatical judgment task, but nevertheless have a radically different interlanguage competence from that of native-speakers.
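The logic of the SM measure can be illustrated with a minimal sketch. The latencies below are invented purely for illustration and are not data from any of the studies cited; the function names are hypothetical, and Welch's t statistic serves only as a simple stand-in for the repeated-measures analyses that such experiments would normally require.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


def grammaticality_effect(grammatical, ungrammatical):
    """Return (mean slowdown in ms for ungrammatical pairs, Welch t)."""
    slowdown = mean(ungrammatical) - mean(grammatical)
    return slowdown, welch_t(ungrammatical, grammatical)


# Hypothetical matching latencies in ms (invented, not experimental data).
grammatical_rts = [1480, 1520, 1450, 1510, 1490, 1530]
ungrammatical_rts = [1610, 1650, 1580, 1640, 1700, 1620]

if __name__ == "__main__":
    slowdown, t = grammaticality_effect(grammatical_rts, ungrammatical_rts)
    print(f"slowdown = {slowdown:.1f} ms, Welch t = {t:.2f}")
```

A reliably positive slowdown is what counts as a grammaticality effect on this measure; a condition in which ungrammatical pairs are matched as quickly as grammatical ones is the kind of 'failure' that the following discussion turns on.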

Contrary to some other researchers, I see little value in using the SM paradigm if the only purpose is to confirm results achievable using more traditional methodologies. Arguably, the cases of interest are those where traditional grammaticality judgment tasks and SM yield divergent results. Indeed, it seems unlikely that sentence-matching would have received much attention at all had it not been for the fact that it ‘fails’ in some interesting contexts. These contexts were the focus (arguably, the raison d’être) of Freedman and Forster’s original paper from 1985. Freedman and Forster showed that subjects who otherwise reliably distinguished in response latency between identical pairs of grammatical versus ungrammatical sentences, appeared to treat a particular subset of ungrammatical sentences, namely specified subject condition violations, as though they were in fact grammatical. That is to say, there was no significant difference in response latency in this condition. For example, subjects treated pairs of sentences rated offline as ungrammatical, such as (6a), on a par with grammatical sentences as in (6b), and distinct from matching pairs of ungrammatical sentences, such as (6c), which elicited significantly longer response latencies.

(6) a. *Who did you see Rick’s picture of? → implicitly treated as ‘grammatical’

    b. Who did you see a picture of?

    c. *What you did of a picture see?

Freedman and Forster interpreted this systematic absence of an effect in terms of the level of syntactic representation tapped by the SM task: given the theoretical framework they assumed, they argued that the sentence was grammatical at s-structure, but ungrammatical at the level of logical form.

Clahsen, Hong and Sonnenstuhl-Henning (1995) reinterpret these findings in terms of operations applying at different levels of structure: SM, they argue, inspected an underlying syntactic level (LF), but was insensitive to operator-variable binding relations holding at that level of representation. This reinterpretation captured both the Freedman and Forster results as well as their own results on verb-position in finite root versus embedded clauses in German. These latter results showed that speakers treat as grammatical main clauses in which the verb (incorrectly) occupies the final position, as in (7a), but treat as ungrammatical embedded clauses in which the verb is incorrectly raised to second position, as shown in (7c):

(7) a. *Hans den Hund gesehen hat. → treated as ‘grammatical’ in SM.
       Hans the dog seen has
       ‘Hans has seen the dog.’

    b. Hans hat den Hund gesehen.

    c. *daß Hans hat den Hund gesehen. → treated as ‘ungrammatical’ in SM.
       that Hans has the dog seen
       ‘that Hans has seen the dog.’

    d. daß Hans den Hund gesehen hat.

In our own experiments on clitic placement (see Duffield et al. 2002), we determined that in SM native-speakers treat ‘ungrammatical causatives’ differently from ‘ungrammatical restructuring’ sentences. As predicted by the theoretical analysis, French native speakers treat cases such as (5d) on a par with other grammatical sentences: there was no significant difference in response latency between (5c) pairs and (5d) pairs; by contrast, it took subjects significantly longer to match (5b) pairs than their grammatical counterparts (5a). Crucially, in all other conditions involving incorrect clitic placement, French native-speakers were significantly slower to match sentences compared to the corresponding correct placements of these clitics. Thus, it was not the case that SM overall was insensitive to constraints on clitic placement; quite the contrary. Instead, SM was selectively insensitive to surface violations of grammaticality: sentences that were ‘underlyingly grammatical’ were accepted as grammatical, even though the surface string was ungrammatical.

In offline grammatical judgment tasks, on the other hand, French native-speakers consistently treat surface ungrammatical sentences equally: offline, (5d) is considered no more grammatical than (5b). Thus, there is once again a divergence between the results obtained by implicit versus explicit methodologies. In contrast to the McKoon and MacFarland experiments discussed in the previous section, here it is the selective absence of a specific result from the implicit experiment that is significant. Whichever the direction, though, both sets of experiments require a dual competence approach to underlying linguistic knowledge, if we wish to accommodate and model both online and offline results. Both sets of experiments (here again, I am considering only the native-speakers’ results in our experiment) exemplify Figures 2 and 3 above, with the implicit experiment tapping UC, and the results of the explicit grammatical judgment task reflecting surface competence.

2.2 Types of gradience

Having established the basic framework of a dual competence model to accommodate both categorical and gradient effects, I would now like to consider this latter notion more closely, in order to distinguish different types of gradience. This is the issue that bears most directly on the more general concern of this volume, namely, on the nature of the lexicon-syntax interface. Just as I have suggested that there are two types of competence, it is also necessary to draw a distinction between two types of gradience. Again, for want of better terms, I will refer to these subtypes as lexical and syntactic (constructional) gradience, respectively. Since this latter distinction is somewhat more intuitive than the UC/SC contrast, a couple of illustrative examples should suffice.

2.2.1 Lexical gradience

Lexical gradience refers to cases where the acceptability of a given sentence varies as a function of the properties of particular lexical items. Such properties may be semantic or idiosyncratic. An example of a semantic property might be intentionality: verbs that select +intentional subjects may be more acceptable in a given sentential context than those that do not. Idiosyncratic properties, by contrast, distinguish lexical items from near neighbours: for example, highly frequent nouns may be more acceptable than (near) synonyms of lower frequency; some items may be more appropriate than others in a given register. In both cases, the acceptability of the carrier sentence is determined by properties of the specific lexical entries.

McKoon and MacFarland’s work just discussed exemplifies this type of gradience: their corpus study showed that some intransitive verbs transitivise much more readily than others, independently of the verbal class to which they belong. For instance, whereas atrophy (ECS) and deteriorate (ICS) have an extremely low probability of occurring in transitive frames, fray (ECS) and ferment (ICS) are much more likely to be transitivised. This difference is partly a function of inherent semantic factors, and partly one of frequency.11

selects be (least variation)
    Change of Location
    Change of State
    Continuation of a pre-existing state
    Existence of State
    Uncontrolled process
    Controlled process (motional)
    Controlled process (non-motional)
selects have (least variation)

Figure 6. Auxiliary Selection Hierarchy (adapted from Sorace 2000).

Another article in the same issue of Language, by Sorace (2000), provides a different example of lexical gradience. Sorace is concerned with variation in auxiliary selection (have versus be) in constructions in the perfect in Germanic and Romance (especially Italian). The standard theoretical assumption is that, for any particular variety, the auxiliary associated with a particular predicate is rigidly lexically-determined: a given predicate either categorically selects the be auxiliary or the have auxiliary (the latter being the default value). Sorace’s article suggests that this categorical view is incorrect. Her paper shows that, both cross-linguistically and within a given variety, auxiliary selection is a gradient rather than categorical property. According to their semantic properties, verbs occupy a position on a continuum (or ‘hierarchy’, to use Sorace’s term) between be and have selection (see Figure 6).

At either end of this semantically-defined continuum, there is little variation in which auxiliary is selected, so that for a non-motional controlled process, such as chat (It. chiacchierare), have is the only acceptable auxiliary, whereas for a pure change of location predicate such as come (It. venire), only be is possible. This is illustrated by the examples in (8a versus 8b) below. By contrast, predicates whose inherent semantic properties place them in the middle of this continuum show much more flexibility as to which auxiliary is selected. This flexibility is reflected both in terms of linguistic variation with respect to selection, as illustrated by examples in (9) below, and in terms of ‘coercability’ within a given variety: as the examples in (10) illustrate, verbs in the middle of the continuum can be pragmatically coerced into preferentially selecting either one or the other auxiliary.12

(8) a. Maria è/*ha venuta alla festa.
       Maria is/has come to-the party
       ‘Maria came to the party.’ [1a]

    b. I colleghi hanno/*sono chiacchierato tutto il pomeriggio.
       the colleagues have/are chatted whole the afternoon
       ‘The colleagues chatted the whole afternoon.’ [33a]

(9) a. Gli atleti svedesi hanno corso/?sono corsi alle Olimpiadi.
       the athletes Swedish have run/are run at-the Olympics
       ‘The Swedish athletes ran at the Olympic Games.’ [37]

    b. De temperatuur is/heeft 3 uur lang gestegen, maar is toen weer gezakt.
       the temperature is/has 3 hours risen but is then again dropped
       ‘The temperature rose for three hours but then dropped again.’ [11]

(10) a. Il pilota ha/?è atterrato sulla pista di emergenza.
        the pilot has/is landed on-the runway of emergency
        ‘The pilot landed on the emergency runway.’ [44a]

     b. L’aereo ?ha/è atterrato sulla pista di emergenza.
        the plane has/is landed on-the runway of emergency
        ‘The plane landed on the emergency runway.’ [44b]

If only the poles of the continuum are considered, this contrast gives the appearance of being categorical; however, once one considers the middle range, it becomes clear that auxiliary selection is a lexically gradient phenomenon.13
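The shape of this continuum can be made concrete with a small sketch that encodes the Figure 6 hierarchy as an ordered list. The function name is hypothetical, and treating the number of steps from the nearest pole as a rough proxy for expected variation is an illustrative assumption, not a measure proposed by Sorace.

```python
# Verb classes from Figure 6, ordered from the 'be' pole to the 'have' pole.
HIERARCHY = [
    "change of location",
    "change of state",
    "continuation of a pre-existing state",
    "existence of state",
    "uncontrolled process",
    "controlled process (motional)",
    "controlled process (non-motional)",
]


def profile(verb_class):
    """Return (nearer_auxiliary, steps_from_nearest_pole) for a verb class.

    nearer_auxiliary is "be", "have", or None for the exact midpoint.
    The step count is only an illustrative stand-in for variability.
    """
    i = HIERARCHY.index(verb_class)
    last = len(HIERARCHY) - 1
    steps = min(i, last - i)
    if i < last - i:
        return "be", steps
    if i > last - i:
        return "have", steps
    return None, steps


if __name__ == "__main__":
    for verb_class in HIERARCHY:
        print(verb_class, profile(verb_class))
```

On this encoding, the class of venire (change of location) and the class of chiacchierare (non-motional controlled process) both sit zero steps from their respective poles, while classes further from either end are those for which the text reports variation and coercion.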

2.2.2 Syntactic/constructional gradience

In other constructions, however, gradient effects are observed that are independent of the particular lexical items involved. That is to say, certain sentences are regularly judged to be less than perfectly acceptable without being deemed wholly unacceptable. In the theoretical literature, such sentences are typically designated as ‘marginal’, a status denoted by one or two question-marks (?/??).14

However, as noted above, since standard theoretical models have no way of representing such judgments, marginal sentences are usually ‘re-classified’ as grammatical or ungrammatical ad hoc, depending on the analysis that is being pursued. Such reclassification immediately obscures an essential feature of most acceptability judgments, namely, their syntactic gradience.

One example of this type of gradience has already been mentioned, viz., the influence of referential specificity in determining the relative strength of syntactic island effects: Kluender (1992); see also Kluender and Kutas (1993).

A different example of syntactic gradience is provided by my work with Ayumi Matsuo on VP-ellipsis constructions in English. Ellipsis constructions, and the constraints pertaining to them, have provided core data for generative analyses for several decades, their importance first brought to general attention in Sag’s dissertation (1976). The aspect of ellipsis constructions relevant to the present discussion is a constraint on structural parallelism, which, the theoretical literature claims, requires the VP of the antecedent clause to be syntactically parallel to that of the understood ellipsis. This structural parallelism constraint is used to explain the contrast between (a) versus (b) examples in (11) and (12) below: examples (11a) and (12a) show VP-ellipsis with ‘parallel’ active/verbal antecedent clauses; those in (11b) and (12b) illustrate two types of ‘non-parallel’ antecedent, passive VPs and nominal antecedents, respectively. The examples in (11/12c) and (11/12d) are intended to show that this parallelism constraint fails to apply (or, at least, does not apply so strongly) if the ellipsis (VPE) is replaced with the semantically equivalent VP-anaphora (VPA) clause.

(11) a. Someone had to take out the garbage.
        – But Barney refused to. (VPE)

     b. The garbage had to be taken out.
        – ?/??But Barney refused to.

     c. Someone had to take out the garbage.
        – But Barney refused to do it. (VPA)

     d. The garbage had to be taken out.
        – But Barney refused to do it.

(12) a. It always annoyed Sally if anyone mentioned her sister’s name.
        – Tom did, out of spite. (VPE)

     b. The mention of her sister’s name always annoyed Sally.
        – ??/*Tom did, out of spite.

     c. It always annoyed Sally if anyone mentioned her sister’s name.
        – Tom did it, out of spite. (VPA)

     d. The mention of her sister’s name always annoyed Sally.
        – ?Tom did it, out of spite.

The structural parallelism effect is interesting for at least two reasons. First, for native-speakers, the parallelism constraint has generally gradient, rather than categorical effects. That is to say, native-speakers typically disprefer, but do not necessarily exclude, violations of structural parallelism with VPE (the (b) examples above). This has been demonstrated experimentally in Tanenhaus and Carlson (1990), as well as in our own work (Duffield and Matsuo 2001, 2002).

The availability of ‘non-parallel ellipsis’, in contrast to some other kinds of ‘ungrammatical’ sentence, has also been documented in corpora of spontaneous speech, as reported in Hardt (1993). The following examples, taken from Hardt (1993), attest to the productivity of violations of the parallelism constraint.

(13) a. This information could have been released by Gorbachov, but he chose not to. (Daniel Schorr, National Public Radio broadcast 10/17/92) [Hardt (131)]

     b. A lot of this material can be presented in a fairly informal and accessible fashion, and often I do. (Chomsky 1982, cited in Dalrymple et al. (1991)) [Hardt (134)]

     c. [Many Chicago-area cab-drivers] … sense a drop in visitors to the city. Those who do, they say, are not taking cabs. (Chicago Tribune 2/6/92) [cf. Hardt ex. 118]

Hence, it seems fair to claim that such sentences have a different status from those that native-speakers quite generally reject as unacceptable.

A second point to observe about non-parallel ellipsis is that construction-type seems to be a factor in determining acceptability. Once again, experimental evidence just cited confirms the intuition that non-parallel ellipsis where the antecedent is a derived nominal (12b) is significantly less acceptable than non-parallel ellipsis where the antecedent is a passive VP (11b) (though it still remains significantly more acceptable than some other kinds of ungrammatical sentence).

Standard theoretical analyses of ellipsis have no way to represent either the gradient effects of the parallelism constraint overall, or the differential effects of construction type. Hence, in the theoretical literature, the relative acceptability judgments just described get ‘recoded’ categorically, with non-parallel ellipsis being considered uniformly ungrammatical (*), irrespective of the particular type of antecedent, and non-parallel anaphora ((11d)/(12d)) deemed perfectly acceptable, native-speakers’ intuitions notwithstanding. This contrast is represented schematically in Table 3.

It should be stressed that this type of gradience is orthogonal to lexical gradience: for each condition tested in our experiments, different verbs and different auxiliaries were deemed more or less acceptable in ellipsis contexts; crucially, though, all of the verbs were accepted some of the time in non-parallel contexts.

Table 3. Designated vs. actual acceptability judgments for parallel vs. non-parallel antecedents (VPE and VPA completions). RH column shows acceptance rates as percentages for trials in Duffield & Matsuo (2001), (2002), respectively.

Antecedent-ellipsis   Designated        Actual acceptability judgments
type                  grammaticality    Mark    2001 (%)   2002 (%)
                      judgment
Active-VPE            –                 –       90         88
Passive-VPE           *                 ?       52         48
Verbal-VPE            –                 –       89         93
Nominal-VPE           *                 ??      39         57*
Active-VPA            –                 –       96         87
Passive-VPA           –                 ?       91         84
Verbal-VPA            –                 –       97         88
Nominal-VPA           –                 ?       74         76

* The relatively high acceptance rate for VPE following nominal antecedents is due to the different balance of finite and non-finite ellipsis sentences in the latter experiment. See 2.4 below for further discussion.

The main statistical findings for native speakers were as follows. First, in both experiments, there was a reliable main effect of parallelism: in particular, VP-ellipsis following non-parallel antecedents was significantly less acceptable than following parallel antecedents, irrespective of construction type. Second, two clear interactions were observed: between ellipsis type and parallelism (VPE following non-parallel antecedents is reliably less acceptable than VPA in the same context), and between construction type and parallelism (VPE following passive antecedents is significantly more acceptable than following nominal antecedents, though still less acceptable than following active antecedents). Third, VPA also shows a reliable parallelism effect with nominal antecedents, albeit a smaller one compared to the VPE effect.
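The size of these effects can be read off the acceptance rates in Table 3. The following sketch assumes that the table has been read correctly (first figure per condition from the 2001 trials, second from the 2002 trials) and simply subtracts each non-parallel rate from that of its parallel counterpart. The names are hypothetical, and raw percentage differences are descriptive only; they are no substitute for the statistical tests reported in the original studies.

```python
# Acceptance rates (%) as read from Table 3: (2001 trials, 2002 trials).
ACCEPTANCE = {
    ("active", "VPE"): (90, 88),
    ("passive", "VPE"): (52, 48),
    ("verbal", "VPE"): (89, 93),
    ("nominal", "VPE"): (39, 57),
    ("active", "VPA"): (96, 87),
    ("passive", "VPA"): (91, 84),
    ("verbal", "VPA"): (97, 88),
    ("nominal", "VPA"): (74, 76),
}

# Each non-parallel antecedent type and its parallel counterpart.
PARALLEL_COUNTERPART = {"passive": "active", "nominal": "verbal"}


def parallelism_penalty(antecedent, completion, trial):
    """Drop in acceptance (percentage points) for a non-parallel antecedent.

    antecedent: "passive" or "nominal"; completion: "VPE" or "VPA";
    trial: 0 for the 2001 figures, 1 for the 2002 figures.
    """
    parallel = PARALLEL_COUNTERPART[antecedent]
    return (
        ACCEPTANCE[(parallel, completion)][trial]
        - ACCEPTANCE[(antecedent, completion)][trial]
    )


if __name__ == "__main__":
    for completion in ("VPE", "VPA"):
        for antecedent in ("passive", "nominal"):
            drops = [parallelism_penalty(antecedent, completion, t) for t in (0, 1)]
            print(f"{antecedent}-{completion}: {drops} percentage points")
```

On these figures the drop is larger for VPE than for VPA in every pairing, in line with the interaction between ellipsis type and parallelism described above.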

Before discussing the second language learners, let us consider how we might model the native-speaker results, first, given the traditional competence–performance model, and then within the dual competence model proposed here.

As far as I can determine, it is simply impossible to model these gradient effects in the traditional framework without arbitrarily recoding them as categorical effects. One could, for example, model the main effect of parallelism by reclassifying the circa 50% acceptance rate for VPE in passive contexts as equivalent to categorically unacceptable sentences, say, those with less than 5% acceptance ratings by native speakers. One could also gloss over the statistically reliable difference between construction types: since the principles and parameters approach allows no construction-specific rules, it cannot allow construction-specific effects to bear on grammaticality. Finally, one could dismiss the small, but significant, effect of parallelism in VPA contexts.

Viewed charitably, this way of treating gradient effects obscures subtle and empirically valid distinctions, while reconciling them with an explanatory model; a less charitable interpretation would regard this as fixing the data. However it is viewed, something important is lost. I suggest that these gradient effects are an essential feature of linguistic competence, not something to be factored out.

By contrast, the dual competence model allows us to model and to interpret these gradient results without abandoning the idea that some aspects of syntactic competence are indeed categorical and autonomous of lexical and constructional knowledge. As was the case for the McKoon and MacFarland results discussed above, I suggest that the results of this experiment be interpreted in terms of Figures 2 and 3 respectively, in which the structural parallelism constraint, which shows its effects across constructions, is represented in UC, whereas specific lexical and constructional information, including frequency information, is represented in SC. The interaction between these two types of competence gives rise to the various types of gradience observed.

The dual competence model not only permits modelling of lexical and syntactic gradience; as discussed above, it also provides a potentially explanatory model of principled divergences between native-speakers’ and second language learners’ behaviour, and a way to resolve the apparent paradox of L2 learners outperforming native-speaker ‘controls’. In the remaining sections of this paper, I will consider some cases of what I will term parallel disjoint convergence.

2.3 Parallel disjoint convergence

As noted above, the dual competence model allows non-native speakers’ results to differ from those of native-speakers on any given acceptability judgment task in two principled ways: either non-native speakers’ results can reflect UC while native speakers’ reflect SC, or vice versa, these options being schematised in Figures 4 and 5 above, respectively. In the following sections, I will outline one instance of each type of disjoint convergence.

2.3.1 Type 1 disjoint convergence

Type 1 disjoint convergence (Figure 4 above) refers to instances discussed at the outset of this paper, in which L2 learners seem to outperform their native-speaker counterparts; that is, instances where L2 learners’ behaviour actually comes closer to the categorical behaviour predicted by standard theoretical models than does that of native-speakers.

This mismatch is illustrated in a recent cross-linguistic study investigating L2 learners’ knowledge of English derivational morphology (Duffield, Sabourin and Curtin 1998). This study was a partial replication for SLA of a set of experiments with native-speakers reported in Marslen-Wilson et al. (1994). The Marslen-Wilson et al. experiments examined morphological relatedness between derivationally related words, as evidenced by priming effects. Among the interesting findings of the original study, the two most relevant were as follows. First, it was determined that words related by simple phonetic overlap did not yield priming effects (tin does not prime tinsel, nor asp, asparagus); in other words, only words that could be decomposed into a shared stem plus a legitimate affix (for the derived form) are considered by native-speakers to be related. Second, it was determined that being formally morphologically related was a necessary, but not a sufficient condition for lexical relatedness; in addition, the two forms had to be semantically related. Thus, govern was shown to prime government (and vice versa), but depart fails to prime department, since the latter pair do not share any meaning.

From a theoretical point of view, the second finding is somewhat unexpected, since most theoretical morphological models assume formal rules to be autonomous of specific lexical-semantic information. Psycholinguistically, however, for native-speakers, the results show clearly that relatedness is encoded in particular lexical entries (which is the only possible locus of specific semantic information). Thus, there is a mismatch between native-speakers’ psycholinguistic representations and what the theoretical models predict.

In our replication of the Marslen-Wilson et al. experiments with L2 learners, we predicted, given our theoretical assumptions, that L2 learners might diverge in a principled way from native-speakers. Specifically, while we expected that both native-speakers and L2 learners should fail to show priming for purely phonetically related (non-morphologically-related) pairs, we hypothesized that intermediate learners might initially over-generalise the formal rule, showing priming in depart-department cases, in contrast to native speakers. This was precisely what we found: whereas our native-speaker group and our advanced L2 group replicated the Marslen-Wilson et al. findings, the intermediate L2 group (native speakers of Japanese) showed priming effects for morphologically-related pairs irrespective of semantic relatedness. By demonstrating categorical, autonomous behaviour, the intermediate group better approximated the theoretical ideal than either the advanced group or the native speakers: in this sense, these intermediate learners were more competent (or rather, closer to underlying competence) than the others. On the other hand, they were clearly less competent than the advanced learners in converging on the judgments, and by extension, on the overall competence, of native-speakers, since in this case target competence is lexically-constrained, gradient behaviour.

Although this contrast is not exactly comparable to the other phenomena discussed in this paper, since it is purely lexical, rather than syntactic, the same logic applies: some grammatical phenomena are categorical and autonomous properties, others show lexically-specific, gradient effects; both need to be accommodated.

2.3.2 Type 2 disjoint convergence

The converse behaviour, where second language learners’ results reflect surface competence when native-speakers’ results show the influence of UC, can be seen in the sentence-matching experiments on clitic placement discussed earlier. Recall that the claim was that native-speakers’ failure to show a grammaticality effect (in the online task) in French causative constructions, in clear contrast to the grammaticality effect they exhibited for restructuring verbs, was due to the ‘underlying grammaticality’ of the clitic placement in sentences such as (5d) above.

If this is the correct explanation of the observed asymmetry, then the predictions for the second language learners on this task are somewhat paradoxical, raising the possibility that L2 learners’ implicit acceptability judgments might fail to match those of native speakers by outperforming them with respect to the presumed theoretical target. In our experiment, this is precisely what happened: both English and Spanish speaking L2 learners of French showed a grammaticality effect for both restructuring and causative contexts in the online task, in contrast to the French native speakers.

Once again, the standard model provides no satisfactory account of these results: either one is forced to exclude the causative condition altogether on the grounds that it ‘did not work’ for the native-speaker controls, or one accepts (paradoxically) that the L2 learners have achieved ‘native-speaker competence’ in this condition, as measured by approximation to the theoretical target, although their implicit judgments are wholly distinct from those of the native-speakers. Once more, something significant is lost.

By contrast, this pattern of results can be accommodated directly by the dual competence model, as schematised in Figure 5 above. The necessary assumption would be that whereas native-speakers analyse syntactic structures in terms of a general computational system, (at least some) L2 learners’ analyses are at the level of the surface properties of specific constructions. This assumption, though controversial, is in line with a respectable body of L2 research (see especially Clahsen and Muysken 1989, Bley-Vroman 1989, 1990). Clearly, considerably more research is necessary to demonstrate this version of the Fundamental Difference Hypothesis (the idea that SLA is constrained by fundamentally different principles and mechanisms than those that guide first language acquisition). The point here is that the present model is able to treat such divergences between native and non-native speakers in a principled fashion, without totally excluding L2 learners’ access to UG (UC in present terms).

2.4 Factoring out gradient effects: L1 versus L2 differences

Before concluding, I wish to draw attention to another aspect of competence that is revealed when one studies gradient effects, but which remains obscured in the traditional paradigm. By focussing on gradience, it is possible to determine which of several logically independent variables contribute(s) to a particular acceptability judgment and, just as importantly, to determine the relative strength of these variables. One potential outcome of this type of factor analysis in SLA studies is that native speakers and second language learners converge on the same overall result for quite different reasons: in other words, their common acceptability judgments are determined by distinct ‘constraint rankings’.15

In work reported in Duffield and Matsuo (2002, in preparation), we carried out two follow-up experiments on the VP-Ellipsis study reported above. The previous experiments, in line with most other psycholinguistic work in this area, had assumed that the parallelism effect was entirely due to the syntactic properties of the antecedent clause. The follow-up experiments re-examined this assumption, the goal being to explain the parallelism effect in VP-ellipsis constructions, by teasing apart the other linguistic factors that may contribute to that effect, and (again) to compare native speakers’ and L2 learners’ sensitivity to such factors.

For these latter experiments, we considered two properties in addition to syntactic parallelism, namely (conceptual and syntactic) recoverability, and finiteness. Recoverability refers to the idea that the parallelism effect may be partly due to the relative salience in the discourse representation of the material to be reconstructed: that is, non-parallel antecedents might be dispreferred not for structural reasons, but for interpretive ones. To test this, we manipulated the antecedent clause in the active-passive experiment (see above), such that the linguistic information necessary to (re)-construct and interpret the ellipsis clause was more or less recoverable from the antecedent. Specifically, we hypothesized that (conceptual) recoverability of non-parallel passive antecedents would be enhanced by the presence of a by-phrase, as for example in (14a) versus (14b) below.16

(14) a. Mary was busy, so the package was set by Tom.
        – ?He had promised that he would. (with by-phrase)

     b. When we got back, our driveway had been cleared of snow.
        – ??A neighbour told us that Tom had. (no by-phrase)

The other property manipulated was the finiteness of the ellipsis clause. Standard theoretical accounts do not distinguish between non-finite ellipsis (involving to), as in (11) above, on the one hand, and finite ellipsis (involving do, or some other auxiliary verb), as in (12) above, on the other. That is to say, the parallelism effect is generally claimed to constrain finite and non-finite ellipsis equally. Intuitively, however, the parallelism effect is considerably weaker with non-finite ellipsis. These experiments put that intuition to the test.

Detailed results and discussion of these experiments are reported in Duffield and Matsuo (in preparation). Here, it suffices to report the main findings, which were as follows (see also Duffield and Matsuo 2002). First, contrary to standard theoretical assumptions, our experiments show that the parallelism effect in ellipsis is not uniquely due to the structural properties of the antecedent clause: other lexical and conceptual factors interact to determine the strength of the effect. Of these factors, finiteness is crucial: non-finite ellipsis displays significantly weaker parallelism effects in non-parallel contexts than finite ellipsis. Second, conceptual recoverability does have an effect on the acceptability of non-parallel antecedents, but only (for native speakers, at least) in interaction with finiteness: recoverability weakens the parallelism effect only in non-finite contexts.

The comparison between native-speakers’ and L2 learners’ performance was also revealing. Overall, L2 learners’ performance parallels that of native-speakers: both groups exhibit a significantly lower acceptance of ellipsis in non-parallel contexts; and, for both groups, this effect is gradient, rather than categorical (just as in the previous experiments). This clearly indicates that gradient effects can be successfully acquired in SLA.

On the other hand, the constraint rankings underlying native speakers’ and L2 learners’ common overall results appear to be quite different. Whereas finiteness shows a robust main effect for both groups, for L2 learners conceptual recoverability has an ameliorating effect even in finite clauses (which was not the case for native speakers). This suggests that this type of conceptual information plays a larger role in determining L2 learners’ judgments than it does for native speakers, who rely more on purely formal information.

Whatever the final interpretation of these results should be, it should be clear that such findings are only attainable in principle if attention is paid to the details of gradient effects: a categorical model can neither describe nor accommodate them.

3. Conclusion

The purpose of this paper has been to draw attention to various types of gradient effects, both lexical and syntactic. As suggested in the title, I have argued that these effects form an essential part of our implicit grammatical competence. A revised model of competence was proposed that accommodates these effects, but which maintains the generativist assumption that certain core aspects of grammatical knowledge are still categorical, and autonomous of the lexicon. The proposed model was also shown to offer an explanation for apparent paradoxes that arise in SLA whenever second language learners outperform native-speakers: by distinguishing two types of implicit knowledge, it is possible to offer a principled account for certain systematic mismatches between native-speakers and second language learners. Finally, this model may ultimately allow us to bridge the gap between those who argue for strong continuity in SLA and those who advocate fundamental differences. There is considerable empirical evidence for both positions; it could be that both are correct.

Notes

* I am grateful to two anonymous reviewers for constructive comments and suggestions. I

<DEST "duf-n*">

would also like to thank those people who commented on previous drafts of this paper, including David Birdsong, Jonathan Bobaljik, Tom Roeper and Lydia White. I am especially grateful to Jonathan Bobaljik for clarifying many misunderstandings on my part, and for offering a persuasive defense of the standard approach. Unfortunately, due to time constraints, I have not been able to integrate all suggested revisions into this paper. No-one other than myself is responsible for remaining errors and inconsistencies.


1. These assumptions were not always held: as noted by Levelt et al. (1977), theoretical researchers in the sixties and early seventies developed theories on degrees of ungrammaticality (Levelt et al. cite Chomsky 1964, Katz 1964, Ziff 1964, and Lakoff 1971).

2. I postpone any discussion of the status of marginal (?) sentences, since most theoretical analyses actually end up designating such sentences either as grammatical or ungrammatical, usually the former; their relative deviance (or amelioration) relative to the other sentences of their designated type being attributed to peripheral, extra-syntactic, factors. For further discussion, see Schütze (1996: especially pp. 41ff).

3. In this sense, there has been comparatively little progress from a notion of grammaticality defined in terms of extensional (infinite) sets: in this model, particular sentences are either generated by the grammar, or they are not. Yet most generative linguists would claim, following Chomsky (1986), that I-language, rather than E-language, is the proper object of study (see also Hoekstra 1990).

4. Of course, denying the competence–performance distinction is nothing new; in the past, though, its detractors often misunderstood or vastly underestimated the intricacy and complexity of grammatical knowledge.

5. The paradox arises most clearly in L2 research simply because L2 researchers are, of necessity, more acutely aware than are theoretical linguists of the metalinguistic nature of linguistic judgments, and of the methodological and analytic problems of data collection and comparison.

6. The comparison involved here may be direct, where L2 subjects’ judgments are compared with the judgments of a ‘control group’ of native-speakers, or indirect, where L2 judgments are compared with a pre-established set of judgments (perhaps taken from a theoretical paper), which native-speakers presumably would agree on; either way, L2 learners are judged competent if their judgments ‘match’ those of native-speakers in some statistically reliable way.

7. It might be objected that such a conflict only arises where native-speakers’ judgments are gradient. However, the contention here is that almost all apparently categorical judgments are in fact gradient (when properly analyzed); hence, there is a real problem here.

8. This is a possible move, provided that there is something interesting left to a judgment once the gradient properties have been factored out of the equation. Often, though, it seems as if there is nothing left, no interesting residue that UG could explain. This again echoes Culicover (1998: 48):

Chomsky has argued consistently that this perspective about linguistic theory [including the notion of UG as an ‘idealized characterization of linguistic competence’: NGD] is rational and scientific, virtually indisputable. In fact, it cannot reasonably be disputed given the presumptions that: (i) a language faculty exists that contains specific syntactic knowledge; (ii) what is left after stripping away the dynamical aspects of language is something that really exists, in some sense, in the mind/brain… (emphasis mine: NGD).

9. This latter assumption may of course be incorrect. The guiding intuition here is that the introspection involved in explicit tasks is inevitably mediated (in adult native speakers) by lexical knowledge, a product of surface competence. My speculation is that direct introspection of the computational system is impossible.


10. Since McKoon and MacFarland did not test L2 learners, this interpretation is intended to apply to native-speakers only.

11. Here, I make the simplifying assumption that these are independent factors. Obviously, this is not always the case: if, for example, inherent semantic constraints restrict or reduce the occurrence of a verb in a transitive frame (see immediately below), this will affect the token frequency for that item, which in turn may further inhibit its acceptability.

12. Numbers in square brackets designate Sorace’s original example numbers. Example 9b [11] above is originally due to van Hout (1993: 7).

13. Similar remarks would seem to apply to other ‘unaccusative effects’: for example, there-insertion in English (see Levin & Rappaport Hovav 1995).

14. Crucially, the marginal status of these sentences emerges from a uni-modal pattern of acceptances: all speakers accept these sentences sometimes, and reject them on other occasions; see Avrutin and Wexler (1992), for a relevant discussion of uni-modal vs. bi-modal patterns of acceptance, and their proper interpretation.

15. In response to a reviewer’s query, the expression ‘constraint ranking’ is not intended here to imply a treatment in terms of Optimality Theory necessarily. It is not obvious that standard OT models capture gradient effects any better than mainstream generative models, since ‘violable constraints’ do not yield gradient judgments (in most models, at any rate). Rather, the term is intended to refer to differences in the relative weighting of various lexical and syntactic factors that determine the judgment. How these should relate to a particular theoretical description is an independent question.

16. In the case of non-parallel nominal antecedents, we manipulated syntactic, rather than conceptual, recoverability. Here, we contrasted zero-derived versus non-zero-derived alternations (e.g., visit versus discussion), since it has been argued that the former (zero-derived nominals) are more easily reconstructable as verb-phrases in VPE contexts. No effect was found for this type of syntactic recoverability (see Duffield and Matsuo in preparation).

References

Allen, J. and Seidenberg, M.S. 1999. “The emergence of grammaticality in connectionist networks”. In The emergence of language, B. MacWhinney (ed.), 115–152. Mahwah, NJ: Erlbaum.

Avrutin, S. and Wexler, K. 1992. “Development of principle B in Russian: Co-indexation at LF and coreference”. Language Acquisition 2 (4): 259–306.

Barlow, M. and Kemmer, S. 2000. Usage-based models of language. Stanford: Center for the Study of Language and Information.

Bever, T.G. 1970. “The cognitive basis for linguistic structures”. In The development of language, J.R. Hays (ed.), 279–362. New York: John Wiley & Sons.

Birdsong, D. 1989. Metalinguistic performance and interlinguistic competence. New York: Springer-Verlag.


Bley-Vroman, R. 1989. “The logical problem of second language learning”. In Linguistic perspectives on second language acquisition, S. Gass and J. Schachter (eds). Cambridge: Cambridge University Press.

Bley-Vroman, R. 1990. “The logical problem of foreign language learning”. Linguistic Analysis 20: 3–49.

Bley-Vroman, R. and Masterson, D. 1989. “Reaction time as a supplement to grammaticality judgements in the investigation of second language competence”. University of Hawai’i Working Papers in ESL 8 (2): 207–237.

Chomsky, N. 1964. “Degrees of grammaticalness”. In The structure of language: Readings in the philosophy of language, J.A. Fodor and J.J. Katz (eds). Englewood Cliffs: Prentice Hall.

Chomsky, N. 1986. Knowledge of language: Its nature, origin and use. New York: Praeger.

Chung, S. and McCloskey, J. 1983. “On the interpretation of certain island facts in GPSG”. Linguistic Inquiry 14: 704–713.

Clahsen, H. and Muysken, P. 1989. “The UG paradox in L2 acquisition”. Second Language Research 5: 1–29.

Clahsen, H., Hong, U. and Sonnenstuhl-Henning, I. 1995. “Grammatical constraints in syntactic processing: Sentence-matching experiments in German”. The Linguistic Review.

Coppieters, R. 1987. “Competence differences between native and near-native speakers”. Language 63: 544–573.

Culicover, P. 1998. “The minimalist impulse”. In The limits of syntax, P.W. Culicover and L. McNally (eds), 44–77. New York: Academic Press.

Culicover, P. 2000. “Minimalist architectures (Review of Jackendoff 1997)”. Journal of Linguistics 35: 137–150.

Dalrymple, M., Shieber, S. and Pereira, F. 1991. “Ellipsis and higher-order unification”. Linguistics and Philosophy 14 (4): 399–452.

Duffield, N. and Matsuo, A. 2001. “A comparative study of ellipsis and anaphora in L2 acquisition”. In Proceedings of the 25th Boston University conference on language development, A.H.-J. Do, L. Domínguez and A. Johansen (eds), 238–249. Somerville, MA: Cascadilla Press.

Duffield, N. and Matsuo, A. 2002. “Finiteness and parallelism: Assessing the generality of knowledge about English ellipsis in SLA”. In Proceedings of the 26th Boston University conference on language development, B. Skarabela, S. Fish and A.H.-J. Do (eds), 197–207. Somerville, MA: Cascadilla Press.

Duffield, N. and Matsuo, A. in preparation. “Acquiring competent gradience: Factoring out the parallelism effect in VP-ellipsis”. ms. McGill University/University of Ottawa.

Duffield, N., Sabourin, L. and Curtin, S. 1998. “UG constraints on derivational morphology in SLA”. McGill Working Papers in Linguistics: Proceedings of GASLA 1997 13 (1, 2).

Duffield, N. and White, L. 1999. “Assessing L2 knowledge of Spanish clitic placement: Converging methodologies”. Second Language Research 15 (2): 133–160.

Duffield, N., White, L., Bruhn De Garavito, J., Montrul, S. and Prévost, P. 2002. “Clitic placement in L2 French: Evidence from sentence matching”. Journal of Linguistics 38 (3): 1–37.

Ellis, R. 1991. “Grammaticality judgments and second language acquisition”. Studies in Second Language Acquisition 13 (2): 161–186.

Eubank, L. 1993. “Sentence matching and processing in L2 development”. Second Language Research 9: 253–280.


Eubank, L. and Grace, S. 1988. “V-to-I and inflection in non-native grammars”. In Morphology and its interface in L2 knowledge, M.-L. Beck (ed.), 69–88. Amsterdam: John Benjamins.

Fodor, J.D. 2001. “Parameters and the periphery: Reflections on syntactic nuts”. Journal of Linguistics 37 (2): 367–392.

Freedman, S.E. and Forster, K.I. 1985. “The psychological status of overgenerated sentences”. Cognition 19: 101–131.

Greenbaum, S. 1977. Acceptability in language. The Hague: Mouton.

Hardt, D. 1993. Verb phrase ellipsis: Form, meaning and processing. Computer and Information Science, University of Pennsylvania: Ph.D. dissertation.

Hedgcock, J. 1993. “Well-formed vs. ill-formed strings in L2 metalingual tasks: Specifying features of grammaticality judgments”. Second Language Research 9 (1): 1–21.

Hoekstra, T. 1990. “Markedness and growth”. In Logical issues in language acquisition, I. Roca (ed.), 63–83. Dordrecht: Foris.

Katz, J.J. 1964. “Semi-Sentences”. In The structure of language: Readings in the philosophy of language, J.A. Fodor and J.J. Katz (eds), 400–416. Englewood Cliffs: Prentice Hall.

Kluender, R. 1992. “Deriving island constraints from principles of predication”. In Island constraints: Theory, acquisition and processing, H. Goodluck and M. Rochemont (eds), 195–222. Dordrecht & Boston: Kluwer.

Kluender, R. and Kutas, M. 1993. “Subjacency as a processing phenomenon”. Language and Cognitive Processes 8 (4): 573–640.

Lakoff, G. 1971. “Presuppositions and wellformedness”. In Semantics, D.D. Steinberg and L.A. Jakobovitz (eds). London: Cambridge University Press.

Levelt, W.J.M., Van Gent, J.A.W.M., Haans, A.F.J. and Meijers, A.J. 1977. “Grammaticality, paraphrase, imagery”. In Acceptability in language, S. Greenbaum (ed.), 87–101. The Hague: Mouton.

Levin, B. and Rappaport Hovav, M. 1995. Unaccusativity: At the syntax-lexical semantics interface. Vol. 26. Linguistic Inquiry Monograph Series. Cambridge, Mass.: MIT Press.

Macdonald, M.-E.C., Pearlmutter, N.J. and Seidenberg, M.A. 1994. “Syntactic ambiguity resolution as lexical ambiguity resolution”. In Perspectives on sentence processing, C. Clifton Jr., L. Frazier and K. Rayner (eds), 123–154. Hillsdale, NJ: Erlbaum.

Mandell, P.B. 1999. “On the reliability of grammaticality judgment tests in second language acquisition research”. Second Language Acquisition 15 (1): 73–99.

Marslen-Wilson, W., Tyler, L.K., Waksler, R. and Older, L. 1994. “Morphology and meaning in the English mental lexicon”. Psychological Review 101 (1): 3–33.

Martohardjono, G. 1998. “Measuring competence in L2 acquisition: Commentary on part I”. In The generative study of second language acquisition, S. Flynn, G. Martohardjono and W. O’Neil (eds), 151–157. Mahwah, NJ: Lawrence Erlbaum Associates.

McKoon, G. and MacFarland, T. 2000. “Externally and internally caused change of state verbs”. Language 76 (4): 833–858.

Sag, I. 1976. Deletion and logical form. MIT: Doctoral dissertation.

Schütze, C. 1996. The empirical base of linguistics. Chicago: University of Chicago Press.

Sorace, A. 1993. “Incomplete and divergent representations of unaccusativity in non-native grammars of Italian”. Second Language Research 9: 22–48.


Sorace, A. 1996. “The use of acceptability judgments in second language acquisition research”. In Handbook of language acquisition, T. Bhatia and W. Ritchie (eds). New York: Academic Press.

Sorace, A. 2000. “Gradients in auxiliary selection with intransitive verbs”. Language 76 (4): 859–890.

Tanenhaus, M. and Carlson, G.N. 1990. “Comprehension of deep and surface verbphrase anaphors”. Language and Cognitive Processes 5 (4): 257–280.

Van Hout, A. 1993. “On unaccusativity: The relation between argument and aspect”. Paper presented at the Arbeitsgruppe Strukturelle Grammatik, MPG, Berlin.

Ziff, P. 1964. “On understanding utterances”. In The structure of language: Readings in the philosophy of language, J.A. Fodor and J.J. Katz (eds). Englewood Cliffs: Prentice Hall.

</TARGET "duf">

<TARGET "dyk" DOCINFO AUTHOR "Ton Dijkstra"TITLE "Lexical storage and retrieval in bilinguals"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 6

Lexical storage and retrieval in bilinguals*

<LINK "dyk-n*">

Ton Dijkstra
NICI/University of Nijmegen

1. Introduction

Language users, monolinguals and bilinguals alike, usually communicate in sentences. Because sentences consist of words, a complete understanding of how language users process sentences includes an understanding of how they recognize their constituent words. Although language users appear to recognize words embedded in sentences of their mother tongue almost effortlessly, the underlying word recognition process must surely be very complex. First, word identification must depend on the characteristics of the lexical item itself, for instance, on how often it has been encountered in the past (e.g. does it have a high or low frequency of usage?) and on whether it is ambiguous with respect to its syntactic category (e.g. is dance used as a noun or verb?) or semantics (e.g. does bank refer to the river side or the institution?). In addition, a word’s recognition process could be affected by the syntactic and semantic aspects of the preceding sentence context, which may be more or less constraining or predictive.

For bilinguals reading in their second language (L2), the recognition of words in sentences must be even more complex, because several additional factors come into play. At the lexical level, for instance, it is likely that the subjective frequency of the L2 words is considerably lower than that of their L1 words (due to the participants’ lower proficiency in L2), making L2 words harder to recognize. Furthermore, it may not always be clear in advance which language a presented word belongs to, because a word form may be ambiguous across languages (e.g. the word room occurs both in English and in Dutch, but in Dutch means ‘cream’) or because there may be code switches (language alternations) in the sentence (e.g. dit is een voorbeeld van bottom-up processing, meaning ‘this is an example of bottom-up processing’). In addition, the syntactic and semantic aspects of the preceding sentence are not the only possible constraints that may affect target word recognition: there is an additional factor, namely the language of the preceding words, which might provide an independent source of constraint on bilingual word recognition.

Because the recognition of words in sentence context is so complex, it is no wonder that most studies in this area during the last decades have focussed on the bilingual’s recognition of isolated words. This process already requires a distinction of different types of structural representations for words (for instance, orthographic, semantic, and phonological; a storage issue), and is inextricably bound up with how words are retrieved (a processing issue), for which purposes they are retrieved (issues with respect to task demands and instruction), and in which non-linguistic context their retrieval takes place (e.g. is the word positioned within a stimulus list containing words from the same or a different language?).

The major part of this chapter is specifically concerned with the issue of isolated word recognition, and it addresses the following questions:

1. Structure: which representations are activated during bilingual word recognition?

2. Process: what is the time-course of activation for words of different languages?

3. Contextual constraints: how do non-linguistic experimental factors, such as participant expectancies and instruction, affect lexical selection in bilinguals?

In the short second part of this chapter, we will discuss a few recent studies that have examined effects of linguistic context (syntactic, semantic, and lexical) on the recognition of words in sentence context. At present, only a handful of reaction time (RT) studies have been done, investigating quite diverse linguistic questions within the perspective of different theoretical frameworks. Given this sorry state of affairs, we will focus on the more coherent studies that have investigated the bilinguals’ brain activity during the processing of syntactically and semantically incorrect sentences in terms of Event-Related Potentials (ERPs). We will argue that while semantic processing may be quantitatively different between monolinguals and bilinguals, syntactic parsing may be both quantitatively and qualitatively different, in complex ways that depend, for instance, on the L2 proficiency of the bilinguals involved.

2. How bilinguals recognize words presented in isolation

Before we examine recent studies on the recognition of isolated words by bilinguals, we will first consider three important general aspects of their organization. First, what kind of bilinguals participated in these studies? Second, what type of stimulus materials did they involve? Third, what were the empirical paradigms the studies used for investigation?

Perhaps unexpectedly for some readers, the term ‘bilinguals’ in many psycholinguistic studies does not specifically refer to those language users who use the words of two different languages at the same rate and with the same ease (so-called ‘balanced bilinguals’). Instead, the bilinguals participating in these studies usually are persons who use their two languages in daily life but to a different extent, implying that they are more proficient in one language (usually their native language or L1) than the other (their second language or L2; they are ‘unbalanced bilinguals’). In many of the studies we will be talking about, participants are university students, who are quite proficient in English (generally having eight or more years of experience) but have a different language (often Dutch) as their strongest, native language. These bilinguals, who have acquired their L2 relatively late (at puberty or later), will be referred to as ‘late bilinguals’, while earlier L2 acquisition makes bilinguals ‘early bilinguals’.

The stimulus materials that are often used in studies on isolated words are referred to as ‘interlingual homographs’ and ‘cognates’. Interlingual homographs are words that have identical orthographic representations across languages but different semantics (such as angel, meaning ‘heavenly messenger’ in English, but ‘sting’ in Dutch), while cognates are words that overlap across languages in both their orthographic form and their meaning (e.g. film). To address how words of different languages are stored and retrieved in bilinguals, RTs to interlingual homographs or cognates are compared to matched one-language control items in different experimental paradigms. Any latency differences that arise between the two item types are assumed to be a consequence of the special bilingual status of interlingual homographs and cognates.

Among the many experimental paradigms used in bilingual RT studies are variants of lexical decision, language decision, progressive demasking, (language) go/no-go, word naming, and word association. In a language-specific lexical decision task, participants press one button if they encounter a word in the target language, and another button if they see a nonsense letter string or non-word. For instance, in the English lexical decision task, participants press a ‘yes’ button if a presented letter string is an English word, and a ‘no’ button if it is not. In a generalized lexical decision task, participants press the ‘yes’ button for any word they encounter, irrespective of its language. A presented word could, for instance, be Dutch or English. In contrast, in the language decision task nonsense letter strings (non-words) do not occur. Only words are presented, belonging to one of two languages. One button must be pressed when a word belongs to one language (e.g. English), and another if it belongs to the other language (e.g. Dutch). In the go/no-go task, too, only words from two languages are presented. However, participants react only when they identify a word from the target language, for instance, English (English go/no-go), but they do nothing if a word of the non-target language (for instance, Dutch) is presented. In progressive demasking, participants identify a word that is presented in an alternating sequence with a pattern mask, for instance a checkerboard or a row of hash marks. Across alternations, the pattern mask decreases in duration, while stimulus presentation increases, until the target word is recognized. In word naming, the participants must read aloud words in the target language, for instance, in their mother tongue (e.g. Dutch) or in their second language (e.g. English). Finally, in word association, participants respond to a presented word by producing the first word that comes to their mind in the target language.
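The response requirements of the four button-press paradigms described above can be summarized in a small sketch. It is an illustration only: the task labels, the stimulus labels ("english", "dutch", "nonword"), the button names, and the function expected_response are hypothetical and do not come from any of the studies discussed. The sketch also assumes that non-words in generalized lexical decision receive a ‘no’ response, which the description above implies but does not state.

```python
from typing import Optional

# Hypothetical task labels for this sketch (not taken from the cited studies).
TASKS = (
    "language_specific_lexical_decision",
    "generalized_lexical_decision",
    "language_decision",
    "go_no_go",
)


def expected_response(task: str, stimulus: str, target: str = "english") -> Optional[str]:
    """Return the button a participant should press, or None for 'no response'.

    stimulus is "english", "dutch", or "nonword"; target is the target language.
    """
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    if task == "language_specific_lexical_decision":
        # 'yes' only for words of the target language; everything else is 'no'.
        return "yes" if stimulus == target else "no"
    if task == "generalized_lexical_decision":
        # 'yes' for any word, irrespective of its language (assumed 'no' for non-words).
        return "no" if stimulus == "nonword" else "yes"
    # The remaining two tasks present words only.
    if stimulus == "nonword":
        raise ValueError(f"non-words do not occur in {task}")
    if task == "language_decision":
        # One button per language.
        return f"{stimulus}_button"
    # go/no-go: respond to target-language words, withhold the response otherwise.
    return "go" if stimulus == target else None
```

Under these assumptions, a Dutch word calls for ‘no’ in English lexical decision, ‘yes’ in generalized lexical decision, the Dutch button in language decision, and no response at all in English go/no-go, which is why the same interlingual homograph can behave differently across tasks.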

2.1 Structure: Which representations are activated during bilingual word recognition?

Research has shown that in the initial stages of monolingual word recognition, an input letter string leads to the activation of multiple word candidates in the mental lexicon that closely match the input (see Grainger and Dijkstra 1996). For instance, when the letter string word is presented to English monolinguals, the stored representations for words like word, cord, ward, wood, and work will initially become active (such candidates are called neighbours). As the word identification process proceeds, inappropriate candidates will gradually be reduced in activation and no longer be considered as a possible input word. Finally, only the lexical representation corresponding to the presented word remains active and becomes recognized. For bilinguals, the interesting question arises whether word candidates from different languages are activated if they overlap sufficiently with the input letter string. For instance, are both the English and the Dutch readings of the interlingual homograph angel activated in parallel if Dutch-English bilinguals read the word in an English book?

According to the ‘language-selective access’ view on bilingual word recognition, only lexical candidates of the task-relevant language (in this example English) are activated (Gerard and Scarborough 1989, Macnamara and Kushnir 1971). Thus, when the word word is presented, Dutch word candidates that are similar to word, like bord and wond, would not become activated. In contrast, according to the ‘language non-selective access’ view, word candidates from both languages (here English and Dutch) become active (Altenberg and Cairns 1983, Grainger and Beauvillain 1987, Van Heuven, Dijkstra and Grainger 1998). As we shall see later, the majority of recent studies support the language non-selective access hypothesis. However, formulating the lexical access views in this general way ignores at least two important points. First, the issue of (non)selective access should be differentiated with respect to different types of lexical representations: e.g. orthographic, phonological, and semantic representations. Second, the answer to the question might depend on whether one is processing in one’s native language (L1) or in a second language (L2).
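The contrast between the two access views can be made concrete with a toy sketch. It assumes the common operational definition of a neighbour as a word of the same length that differs from the input in exactly one letter position; the text itself only says that candidates "closely match the input". The miniature lexicon, the language codes, and the function names are hypothetical and serve illustration only.

```python
def is_neighbour(a: str, b: str) -> bool:
    """True if a and b have equal length and differ in exactly one letter position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


# Hypothetical toy lexicon: a few English ("en") and Dutch ("nl") word forms.
LEXICON = {
    "en": ["word", "cord", "ward", "wood", "work", "bank"],
    "nl": ["bord", "wond", "room"],
}


def candidates(target: str, languages: list[str]) -> dict[str, list[str]]:
    """Return, per language, the stored target (if any) plus its neighbours."""
    return {
        lang: [w for w in LEXICON[lang] if w == target or is_neighbour(w, target)]
        for lang in languages
    }


# Language-selective access: only the task-relevant lexicon is consulted.
selective = candidates("word", ["en"])

# Language non-selective access: both lexicons contribute candidates.
non_selective = candidates("word", ["en", "nl"])
```

On the selective view, the candidate set for word contains only English forms; on the non-selective view, the Dutch forms bord and wond join the set and can compete with the English candidates. The sketch models only the initial candidate set, not the subsequent reduction in activation.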

Dijkstra, Grainger and Van Heuven (1999) found evidence of cross-linguistic competition between words of different languages that are similar in form and/or meaning. They investigated whether Dutch-English bilinguals recognized interlingual homographs faster or slower than matched one-language control items in English lexical decision and progressive demasking tasks. The English stimulus words varied in their degree of orthographic (O), phonological (P), and semantic (S) overlap with Dutch words. Examples of items in their six test conditions are sport (overlap in S, O, and P codes), wild (SO), wheel (SP), pink (OP), angel (O), and pace (P). The first two conditions (SOP and SO conditions) consist of what are usually called ‘cognates’, while the last three conditions contain ‘interlingual homographs’ (OP and O conditions) or ‘interlingual homophones’ (P condition).
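The six test conditions and the item-type labels attached to them can be laid out as a small lookup, shown here as a sketch. The example items and condition codes are those given above; the dictionary, the function name, and the label "unlabelled" for the SP condition (which the passage does not name) are assumptions made for illustration.

```python
# Example items and their overlap codes with Dutch, as listed in the text:
# S = semantic, O = orthographic, P = phonological overlap.
ITEMS = {
    "sport": "SOP",
    "wild": "SO",
    "wheel": "SP",
    "pink": "OP",
    "angel": "O",
    "pace": "P",
}


def item_type(condition: str) -> str:
    """Map an overlap condition to the item-type label used in the text."""
    codes = set(condition)
    if {"S", "O"} <= codes:
        return "cognate"  # SOP and SO conditions
    if "O" in codes:
        return "interlingual homograph"  # OP and O conditions
    if codes == {"P"}:
        return "interlingual homophone"  # P condition
    return "unlabelled"  # SP: not given a label in the passage


labels = {word: item_type(code) for word, code in ITEMS.items()}
```

Laying the conditions out this way makes it easy to see that the labels depend on orthographic overlap first (cognate or homograph) and that phonological overlap alone defines the homophone condition.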

Participants were faster to make a lexical decision to the target words with cross-linguistic overlap than to exclusively English control words if the overlap was orthographic and/or semantic in nature (e.g. in the SO and O conditions). In contrast, cross-linguistic phonological overlap produced inhibitory effects. Responses to test items of the P condition, for example, were slower than to matched purely English words.

To show that the observed result pattern did not arise because the test and control items were not well matched in some unknown aspects, the lexical decision experiment was replicated with American-English monolinguals. For these participants, no RT or error differences were found between test items and their matched controls. Furthermore, to show that the results were not restricted to lexical decision, the experimental materials were also included in a progressive demasking task. The result pattern obtained with this paradigm was strikingly similar to that in lexical decision, indicating that it was not the task that induced the facilitation and inhibition effects for homographs relative to controls.

In sum, a presented word form in L2 appears to initially activate orthographic, phonological, and semantic lexical representations in both L2 and L1. The opposite effects of orthographic and phonological overlap may help to explain observed differences in the result patterns of other available studies, because the stimulus materials in these studies may have varied in terms of the degree of cross-linguistic phonological overlap and therefore in the relative amount of inhibition (e.g., Dijkstra, Van Jaarsveld and Ten Brinke 1998, Font 2001, Gerard and Scarborough 1989, Von Studnitz and Green 2002).

In a follow-up experiment (Van Heuven and Dijkstra 1999), English pseudo-homophones were added to the stimulus list. Pseudo-homophones are nonsense words for which the pronunciation sounds like a real word, like brane and bloo. The reasoning behind this manipulation was that the presence of such items would discourage the use of phonology, and would therefore lead to a reduction of the earlier found phonological inhibition effect (see Davelaar, Coltheart, Besner and Jonasson 1978, for similar arguments in a monolingual study). A reduction of the phonological inhibition effect was indeed found in several conditions, but the effect did not disappear completely in conditions where cross-linguistic overlap occurred in several codes (e.g. SOP). This finding suggests that phonology may be re-activated by interactions between codes. For instance, via its semantic and orthographic overlap an interlingual homograph like film might reactivate its phonology in both languages (see Gottlob, Goldinger, Stone and Van Orden 1999, for such ‘resonance effects’ in a comparable monolingual study; and Sebastián-Gallés and Kroll in press, for an overview of the role of phonology in bilingual lexical processing).

Recent studies by Van Hell and Dijkstra (2002) and Font (2001) indicate that language non-selective access also occurs for cross-linguistically ambiguous target words of the native language (L1) and even when targets are not completely identical in form across languages. In the study by Van Hell and Dijkstra (2002), trilinguals with Dutch as their native language, English as their second language, and French as their third language performed a word association task or a lexical decision task in their L1 (Dutch). Stimulus words were (mostly) non-identical cognates such as tomaat or non-cognates. Shorter association and lexical decision times were observed for Dutch-English cognates than for non-cognates. For trilinguals with a more equal (high) proficiency in French and English, faster responses in lexical decision were found for both Dutch-English and Dutch-French cognates. In other words, even when their orthographic and phonological overlap across languages is incomplete, cognates may be recognized faster than non-cognates.

For French-Spanish bilinguals, Font (2001) has found that in lexical decision cognates differing in one letter between languages (called ‘neighbour cognates’ by her) are still facilitated but significantly less so than identical cognates. Furthermore, she has shown that the amount of facilitation that is observed depends on the position of the deviating letter in the word. Neighbour cognates with the different letter at the end of the word (e.g. French texte – Spanish texto) are facilitated more than neighbour cognates with the different letter inside (e.g. French usuel – Spanish usual). In fact, facilitatory effects for the latter type of cognate disappeared and effects tended towards inhibition when such cognates were of low frequency in both languages. Similar patterns of results were found in both L1 and L2 processing.

These results make it likely that the size of RT effects observed for cognates and interlingual homographs depends on their degree of cross-linguistic overlap (also cf. Cristoffanini, Kirsner and Milech 1986). Note that it follows logically that across language pairs that do not share orthography at all (e.g. Chinese and English), no ‘orthographically similar’ word candidates can be activated, while effects of phonological similarity might still occur (depending on, for instance, the way tonal information affects the establishment of the set of lexical candidates).

2.2 Process: What is the time-course of activation for words from different languages?

The issue of language (non)selective access can also be examined from a processing point of view by considering the time-course of lexical activation and selection in bilingual word recognition. As a first question, we may consider the rate of code activation in L1 and L2: how fast do orthographic, phonological, and semantic codes from the two languages become active? From the monolingual domain, we know that high frequency words are generally recognized faster than low frequency words, and, because the words of L2 must have a lower subjective frequency than those of L1 (simply because the former have been encountered less often), it seems likely that L2 codes become available more slowly than L1 codes. A comparison of the study by Dijkstra et al. (1999), discussed in the previous section, with a study by Lemhöfer and Dijkstra (submitted) provides information that supports this viewpoint.

Dijkstra et al. (1999) showed that in a lexical decision task where L2 (English) was the target language of the bilingual participants, cross-linguistic effects arose for L1-L2 (Dutch-English) homographs with respect to all three codes. Because English was the target language in this task, task execution implied the verification of the English language membership of possible word candidates, even when Dutch codes would be available faster than English ones. In other words, Dutch codes had time to establish themselves and exert effects on later available English targets that were necessary for responding.

Lemhöfer and Dijkstra (submitted) presented the same stimulus materials to Dutch-English bilinguals in a generalized lexical decision task. In this task, participants responded with ‘yes’ to both English and Dutch words, but with ‘no’ to non-words. In contrast to English lexical decision, participants in this task can use both Dutch and English lexical representations as a reliable basis for response. Thus, in this task cross-linguistic effects will arise only to the extent that L1 and L2 codes can affect each other before the fastest codes (usually Dutch ones, we assume) are retrieved and responded to. The results of this study were quite clear: no facilitation effects arose for interlingual homographs, while cognates were facilitated relative to control words. The pattern of results for homographs indicates that responses were based upon the fastest available code, usually the Dutch orthographic code, while cross-linguistic overlap with respect to semantics in the case of cognates apparently can be used to speed up the response.

In sum, even though L1 and L2 codes become active in parallel, L2 codes are often activated more slowly than L1 codes, probably because of differences in subjective frequency between languages. As a consequence, the development of cross-linguistic effects depends on the target language in the experiment (L1 or L2) and on other temporal characteristics of the task involved.

There is a different way of approaching the issue of the time-course of lexical selection in bilinguals. Rather than asking how fast lexical representations from different languages become active, we might wonder how long they remain active. Even if there is an initial activation of various codes from different languages, lexical selection might be relatively fast or slow afterwards. This issue has been investigated in experimental studies by varying the frequency ratio of the two readings of interlingual homographs (e.g. angel is relatively more frequent in English than in Dutch).

Dijkstra, Timmermans and Schriefers (2000) examined how long the two readings of an interlingual homograph compete for selection and whether language information provided by the item can be used to facilitate the selection of one of these readings. In three experiments, each with a different instruction, bilingual participants processed the same set of homographs embedded in identical mixed-language lists. Homographs of three types were used: high-frequent in English and low-frequent in Dutch; low-frequent in English and high-frequent in Dutch; and low-frequent in both languages. In the first experiment (involving language decision), one button was pressed when an English word was presented and another button for a Dutch word. In the second and third experiments participants reacted only when they identified either an English word (English go/no-go) or a Dutch word (Dutch go/no-go), but they did not respond if a word of the non-target language (Dutch or English, respectively) was presented.

In all three tasks, clear inhibition effects arose for homographs relative to one-language controls. Even in the Dutch go/no-go task for Dutch-English bilinguals performing in their native language, participants were unable to completely exclude effects from the non-target language on homograph identification. More important for the present discussion, however, is the finding that target-language homographs were often ‘overlooked’, especially if the frequency of their other-language competitor was high. The relative frequency of the two readings of the interlingual homograph was found to affect both RTs and error rates. In the Dutch go/no-go task, participants did not respond to low-frequency items belonging to their native language in about 25 percent of the cases!

Inspection of cumulative distributions showed that if participants had not responded after about 1500–1600 ms, they no longer responded within the time window of two seconds. The observed flattening of the cumulative distribution towards an asymptotic value suggests that recognition of the homograph reading from the non-target language in some way ‘prohibits’ the subsequent recognition of the target language reading (e.g. after recognition, all other lexical candidates may be suppressed). Thus, selection of one of the readings of the interlexical homographs takes place rather late during processing. The results suggest that until that time both readings of a presented homograph are involved in a (frequency-dependent) ‘race to recognition’ that is won by the fastest candidate.
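
The logic of such a frequency-dependent race can be illustrated with a deliberately simple, deterministic sketch. It is not the model proposed by Dijkstra et al. (2000): the logarithmic frequency function and the parameter values (base_ms, gain_ms) are assumptions chosen only for illustration, as are the function names. The sketch shows how a low-frequency target reading may be ‘overlooked’ in a go/no-go task when the reading from the non-target language wins the race and blocks further recognition.

```python
import math


def finishing_time_ms(freq_per_million: float,
                      base_ms: float = 900.0,
                      gain_ms: float = 150.0) -> float:
    """Hypothetical recognition time: higher frequency means an earlier finish.

    base_ms and gain_ms are arbitrary illustrative values, not estimates
    taken from any experiment. freq_per_million must be positive.
    """
    return base_ms - gain_ms * math.log10(freq_per_million)


def go_no_go(target_freq: float, competitor_freq: float) -> tuple[str, float]:
    """Race between the two readings of one interlingual homograph.

    The first reading to finish is recognized and, by assumption, suppresses
    the other. A response is given only if the target reading wins.
    Returns the outcome and the finishing time of the winning reading.
    """
    t_target = finishing_time_ms(target_freq)
    t_competitor = finishing_time_ms(competitor_freq)
    if t_target <= t_competitor:
        return ("respond", t_target)
    return ("overlooked", t_competitor)


# Low-frequency target reading, high-frequency competitor: target is missed.
print(go_no_go(target_freq=5, competitor_freq=200)[0])
# High-frequency target reading, low-frequency competitor: target is found.
print(go_no_go(target_freq=200, competitor_freq=5)[0])
```

In this toy version the outcome is fixed by the frequency ratio. Adding trial-to-trial noise to the finishing times would turn the all-or-none outcome into a miss rate that grows with the frequency of the competitor, which is closer to the pattern described above.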

It is clear that the system must at some time arrive at a selection of one lexical item only, but apparently the role played by the language of that item in aiding selection is only minor. In fact, determination of the language of the item may depend on lexical selection having taken place. In addition, it does not seem possible to discard the homograph reading from the non-target language and to focus on the target reading only on the basis of the instruction that just the target language needs to be responded to.

Similar results were found when the target language was English (L2) and when it was Dutch (L1), even though fewer target words were overlooked in the second case. Again, this finding points to activation from both readings of the interlingual homographs irrespective of whether the target language is the native language or a second language.

This study allows two important additional conclusions. First, there appear to be serious limitations on the degree of control that participants can exert on the relative activation of their two languages. Second, the selection of the target word appears to be based on item characteristics (such as word frequency) and not on the language membership of the item. Language membership appears to be available relatively late (maybe only after item identification) and therefore cannot help to speed up lexical selection.

2.3 Contextual constraints: How do non-linguistic experimental factors affect lexical selection in bilinguals?

In the previous sections, we have argued that under the experimental circumstances of the presented studies involving isolated words, the word recognition system functions in a language non-selective way. However, that the system can function in a non-selective way does not imply that it does so irrespective of the experimental circumstances. In the following two sections, we will consider to what extent the observed language non-selectivity may be modulated by context. We will make a distinction between two types of contextual factors: non-linguistic or experimental and linguistic. Non-linguistic or experimental context aspects are concerned with participant expectations based on, for instance, (the explicitness of the) instruction and task demands. Linguistic context aspects have to do with lexical, syntactic, semantic, and language information, such as provided by a sentence context. We note that for lists of individual items, stimulus list composition and in particular language intermixing could have both linguistic (lexical) and non-linguistic effects.

Language intermixing refers to whether an experiment contains exclusively items belonging to one language (blocked presentation) or items from two languages (mixed presentation). The effects of language intermixing and task instruction on bilingual word recognition were the focus of three Dutch-English lexical decision experiments by Dijkstra, Van Jaarsveld and Ten Brinke (1998). In Experiment 1, Dutch bilingual participants performed an English lexical decision task including Dutch-English homographs, cognates, and purely English control words. The mean RTs to interlingual homographs were unaffected by the frequency of the Dutch reading and did not differ from those to monolingual controls. In contrast, cognates were recognized faster than controls. In Experiment 2, Dutch participants again performed an English lexical decision task including interlingual homographs, but, apart from non-words, Dutch words were also incorporated, requiring a ‘no’-reaction. Strong inhibition effects were now obtained for interlingual homographs relative to English control words. The size of the inhibition effect depended on the relative frequency difference of the two readings of the homograph. It was largest when the Dutch reading of the homographs had a high frequency relative to the English reading. In Experiment 3, Dijkstra et al. (1998) used the same stimulus materials but changed the task demands. Participants now performed a general lexical decision task, responding ‘yes’ if a word of either language was presented (rather than saying ‘no’ to Dutch words). In this experiment, frequency-dependent facilitation effects were found for the interlingual homographs (relative to English control words).

The authors argued that the null-results for interlingual homographs in the first experiment did not constitute conclusive evidence that bilingual word recognition involves a language selective access process, because in that case the different stimulus list composition of Experiment 2 should not have affected the results. Instead, the results of Experiment 2 were considered as evidence supporting the language non-selective access view, and this view was tested and supported again under the different task demands of Experiment 3. This last experiment further showed that task demands may affect the direction of the observed effects: changing the task from language specific lexical decision in Experiment 2 towards generalized lexical decision in Experiment 3 turned the inhibition effects of Experiment 2 into facilitation effects in Experiment 3.

The null-effects in Experiment 1 make one wonder if the Dutch readings of the interlingual homographs were activated at all in this English lexical decision task. Recent reanalysis of the data suggests they were. A regression analysis showed that (despite the over-all null results) homograph responses became slower as the frequency of their Dutch reading increased, while they became faster with increasingly high English frequency readings. Furthermore, De Moor (1998) demonstrated that the L1 semantics of the interlingual homographs was apparently activated as well. De Moor first replicated the null-result for homographs relative to controls. Then, on the trial after the homograph appeared, she presented the English translation of its Dutch reading. For instance, brand was followed by fire, which is the English translation of the Dutch word brand. A small but reliable translation priming effect of 11 ms was found. In a replication of this experiment with different stimulus materials, Van Heste (1999) observed a reliable 35 ms difference between translation and control trials. The Dutch reading of the homograph on the previous trial had apparently been activated even though this did not affect its RT (cf. De Bruijn, Dijkstra, Chwilla and Schriefers 2001).

Finally, Dijkstra et al. (1999) performed an analogous experiment that was reviewed in Section 2, also involving an English lexical decision task with interlingual homographs and controls. In this study, significant facilitation effects were found for homographs having cross-linguistic overlap in orthography but not in phonology (stage), and no effects for items with overlap in both (step). The items in this study were comparable to those in Dijkstra et al. (1998), making it likely that the null-effects in the earlier study were due to mixing the two types of items. (Indirect support for this reasoning comes from a Spanish lexical decision study involving French-Spanish bilinguals by Font (2001:115), who found facilitatory effects for French-Spanish homographs that had little phonological similarity across languages.)

Several other accounts have been proposed for the null-results in Experiment 1 and the inhibitory effects in Experiment 2 from Dijkstra et al. (1998). These accounts have either referred to differences in the relative activation of words from the two languages in Experiments 1 and 2 (Dijkstra et al. 1998, Grosjean 2001), or to differences in participants’ decision strategies (De Groot et al. 2000, Dijkstra et al. 2000). Let us take a closer look at the various proposals.

Dijkstra et al. (1998) assumed that the degree of activation of Dutch (the non-target language) was higher in Experiment 2 than in Experiment 1, because Dutch words were only included in Experiment 2. As a consequence, the English readings of the interlingual homographs suffered from more competition by the Dutch reading in Experiment 2. As an underlying mechanism, this view assumes that lexical activation effects can last across trials and can affect relative language activation. In sum, the different results in the two experiments are assumed to be a consequence of different bottom-up activation processes due to the composition of the stimulus list. This is basically an explanation in terms of lexical context effects.

Similarly, Grosjean (2001) interpreted the results in terms of the participants’ ‘language mode’, referring to the relative state of activation of the bilingual’s languages and language processing mechanisms. The mode is ‘monolingual’ if only one language is relatively active and ‘bilingual’ if both languages are active (though one language may be more active than the other). In Experiment 1, the participants only read English words and non-words (although some words were homographs and cognates) and they were instructed to decide whether the items were English words or not. This would have positioned them towards the monolingual end of the mode continuum, but they did not reach this position totally as they knew they were being tested as bilinguals. Thus, although their Dutch was partly active (which would explain the cognate effect) it was not sufficiently active to create a homograph effect. In sum, Grosjean (2001) proposed that both the participants’ expectancies with respect to the English lexical decision task and the degree of language intermixing (encountering mostly English words) affected the bilinguals’ performance. This explanation implies that both non-linguistic and linguistic context aspects affected relative language activation.

De Groot et al. (2000) replicated the null-results observed in Experiment 1 by Dijkstra et al. (1998) using different stimulus materials and different Dutch-English bilinguals. They proposed that the participants were instructed to perform a ‘language specific’ English lexical decision task, but on some trials may instead have treated the task as a ‘language neutral’ lexical decision task. The adoption of a ‘language specific’ processing mode would induce slower responses to homographs than to matched controls due to lexical competition between the activated target and non-target readings of the interlingual homographs (just as in Experiment 2 by Dijkstra et al. 1998). In contrast, in a ‘language neutral’ processing mode the response to a homograph would be based on the availability of any reading, irrespective of language, and homographs could then be responded to faster than controls (as in the generalized lexical decision task of Experiment 3 by Dijkstra et al. 1998). In sum, a mixture of the two processing modes adopted by the participants led to a mixture of facilitation and inhibition effects for homographs relative to controls, yielding an overall null-result. (Note that this account would predict larger standard deviations for the homographs in the condition where the Dutch reading of the homograph has a high frequency than in the conditions where it has a low frequency, because in the former type of condition, the Dutch reading would be available much faster than the English reading, while that would not be the case for the latter type of words.)
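
The prediction in the preceding note follows from the arithmetic of a two-component mixture, which the minimal sketch below works out. It assumes, purely for illustration, that on each trial the homograph effect is either a fixed inhibition (language specific mode) or a fixed facilitation (language neutral mode); the function name and all millisecond values are hypothetical and are not data from any of the studies discussed.

```python
def mixture_effect(p_specific: float,
                   inhibition_ms: float,
                   facilitation_ms: float) -> tuple[float, float]:
    """Mean and SD of the homograph effect under a two-mode mixture.

    On a proportion p_specific of trials the effect is +inhibition_ms
    (slower than controls); on the remaining trials it is -facilitation_ms
    (faster than controls). Only the variability caused by switching
    between modes is modelled; ordinary trial noise is ignored.
    """
    q = 1.0 - p_specific
    mean = p_specific * inhibition_ms - q * facilitation_ms
    # Variance of a two-point distribution: p * q * (distance between points)^2.
    variance = p_specific * q * (inhibition_ms + facilitation_ms) ** 2
    return mean, variance ** 0.5


# Hypothetical small effects (low-frequency Dutch reading).
print(mixture_effect(0.5, 20.0, 20.0))   # (0.0, 20.0)
# Hypothetical large effects (high-frequency Dutch reading).
print(mixture_effect(0.5, 60.0, 60.0))   # (0.0, 60.0)
```

In both cases the mean effect is zero, so the two conditions are indistinguishable in mean RT, but the spread is three times larger when the underlying facilitation and inhibition are larger. This is why the account is testable through standard deviations even when mean RTs show a null-result.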

Dijkstra, De Bruijn, Schriefers and Ten Brinke (2000) pointed out that the participants in the three studies that reported null-results were apparently not told in advance that some of the presented letter strings would be words in both Dutch and English. Participants might sometimes have adopted a ‘language neutral’ processing mode because they were in an uncertain situation. To disentangle the effects of instruction and language intermixing, Dijkstra et al. designed an experiment that combined features of Experiments 1 and 2 by Dijkstra et al. (1998). Participants were explicitly instructed that they would encounter Dutch words requiring a ‘no’ response and were provided with examples in the practice set. However, exclusively Dutch words were presented only in the second part of the experiment. No significant RT differences were found between the interlingual homographs and matched English control items in the first part of the experiment. In contrast, strong inhibitory effects for homographs relative to control words appeared in the second part. Examination of the transition from Part 1 to Part 2 showed that, as soon as Dutch items started to come in, the RTs to homographs were considerably slowed down compared to control words. These results converge completely with those of Experiments 1 and 2 by Dijkstra et al. (1998) discussed above. They suggest that language intermixing rather than instruction-based expectancies drives the bilingual participants’ performance. Instead of interpreting the pattern of results in Part 1 and Part 2 of the experiment as evidence for differences in relative language activation (depending on the local absence or presence of Dutch items in the stimulus list), Dijkstra et al. propose that participants used different decision criteria in the two parts of the experiment, depending on the types of lexical items they encountered.

Ignoring the details of the proposed underlying mechanisms, we can draw a number of general conclusions on the basis of these and other studies (see Dijkstra and Van Heuven (2002), for an elaborated model of bilingual word recognition based on this evidence). First, word candidates from both target and non-target languages are activated in parallel in a ‘bottom-up’ way (via the signal), even though their rates of activation may differ depending on subjective frequency. Second, stimulus list composition and task demands are important determinants of the response patterns. Third, task demands, instruction details, and other ‘top-down’ information sources do not ‘override’ the activated bottom-up information; instead, the activated representations in the two lexicons are used for responding in accordance with the requirements of the task at hand. In all, the conclusion is that for isolated words presented in stimulus lists, bilingual word recognition is based on the input signal and is basically automatic. Non-linguistic context effects (due to the composition of the stimulus list, the specific instruction, or the task to be performed) appear to affect the decision criteria that are used to accept one lexical candidate or another during the lexical selection process, but not to affect the relative degree of activated word candidates from one or the other language.

3. How bilinguals recognize words presented in sentences

In everyday life, the contextual influences on word recognition are not provided by previous words in an unrelated word list or by the demands of the experimental task that must be performed, but by the syntactic, semantic, lexical, and language aspects of the sentence context that precedes a particular word that is to be recognized. In the following section, we briefly consider such linguistic context effects on word recognition. We will first examine an RT study on the general effects of sentence context on bilingual word recognition, showing that there may be complex interactions between different aspects of the sentence context and word identification. Next, we will zoom in on syntactic and semantic aspects of sentence processing as reflected in studies that measure the bilinguals’ brain activity using Event-Related Potentials.

Altarriba, Kroll, Sholl and Rayner (1996) examined semantic and lexical form effects of a preceding sentence context on bilingual word recognition in two experiments. In the first experiment, they monitored the eye movements of Spanish-English bilinguals who were reading English (L2) sentences that contained either an English (L2) or a Spanish (L1) target word. Sentences provided either high or low semantic constraints on the target words. An example sentence of the high constraint and Spanish target condition is He wanted to deposit all his dinero at the credit union, where dinero is Spanish for ‘money’. The experiment led to an interaction between the frequency of the target word and degree of sentence constraint for Spanish target words with respect to the first fixation duration, but not for English target words. Thus, when the Spanish target words were of high frequency and appeared in highly constrained sentences, the participants apparently experienced interference. This result suggests that sentence constraint influences not only the generation of semantic feature restrictions for upcoming words, but also that of lexical features. The high-frequency Spanish word matched the generated set of semantic features, but not the expected lexical features when the word appeared in the alternate language (Altarriba et al. 1996:483). The same pattern of results was found in a second experiment, where the sentences were presented word by word using the rapid serial visual presentation (RSVP) technique and participants named the capitalized target word in each sentence.

The findings of this study indicate that linguistic sentence context interacts with target word recognition, suggesting that linguistic context functions in a different way than non-linguistic context. Furthermore, it is interesting to note that the data pattern showed an interaction of word frequency (a lexical information source) and the sentence constraint, and not of language membership and the sentence constraint. This suggests that (just like for isolated words) lexical characteristics are more important than language characteristics in the determination of word recognition in sentences.

Only a limited number of studies have investigated syntactic effects of sentence context on word recognition in some detail (for a full review, see Kroll and Dussias in press). Here we will briefly describe a few recent studies that used Event-Related brain Potentials (ERPs) to compare syntactic and semantic aspects of sentence processing in bilinguals (Weber-Fox and Neville 1996, Hahne 2001, Hahne and Friederici 2001). Like the authors of these studies, we will argue that there are processing differences between monolinguals and bilinguals with respect to semantic aspects that appear to be especially quantitative in nature, while the differences with respect to syntax appear to be qualitative as well.

In the study by Weber-Fox and Neville (1996), five groups of Chinese-English bilinguals performed an acceptability judgment task for sentences in their L2, English, while their EEG was recorded. These groups of participants had learned English at different ages. Apart from normal control sentences, they read semantically anomalous sentences in English (e.g. The scientist criticized Max’s event of the theorem), sentences that contained violations of English phrase structure rules (e.g. The scientist criticized Max’s of proof the theorem), sentences that contained specificity constraint violations (e.g. What did the scientist criticize Max’s proof of?), and sentences that contained subjacency constraint violations (e.g. What was a proof of criticized by the scientist?). In terms of their brain activity, early L2 learners (those who had learned English before age 11) responded to the semantic anomalies in a very similar way as monolingual language users. The other bilingual groups differed from the monolinguals only quantitatively: an N400 effect, a marker associated with the processing of semantic anomalies, was present in their EEGs, but it was delayed in time relative to that in monolinguals.

In contrast, several qualitative differences between proficiency groups were found with respect to the syntactic processing of phrase structure violations. First, none of the bilingual groups displayed a so-called early left anterior negativity (N125) in the EEG that was present in monolinguals. The N125 is an early effect in the EEG that may reflect automatized first-pass parsing processes. Second, a later negativity (N300–500) was found in all groups; it was left lateralized (found in the left hemisphere of the brain) in monolinguals and early bilinguals, but more bilaterally distributed in late bilinguals. Third, a P600 effect was present in the monolinguals and early bilinguals but not in the late learners. The P600 effect is considered to be the most important EEG marker associated with syntactic reanalysis and repair. In sum, late L2 learners consistently displayed large differences in ERP patterns relative to monolinguals, suggesting that (especially late) syntactic processes are different in late L2 learners.

Hahne (2001) came to similar conclusions on the basis of an auditory sentence processing study involving proficient late Russian-German bilinguals and German monolinguals. Her participants listened to German sentences that were either correct (e.g. Die Tür wurde geschlossen, ‘The door was being closed’), contained a semantically incorrect item (selection restriction violation: Der Ozean wurde geschlossen, ‘The ocean was being closed’), or a syntactically incorrect item (word category violation: Das Geschäft wurde am geschlossen, ‘The shop was being on closed’). As before, ERP differences in processing semantic incongruities between native and L2 speakers were only quantitative in nature, while there were qualitative differences with respect to syntactic processing between the two participant groups. This suggests that the second language learners did not process syntactic information in the way that native listeners did.

Hahne and Friederici (2001) examined sentence comprehension in Japanese speakers who had learned German as a second language after puberty. These bilinguals listened to German sentences that were correct or contained semantic and/or syntactic violations. A variety of differences was found in the ERPs for the Japanese-German bilinguals and German monolinguals. Semantically incorrect sentences induced a similar ERP pattern in the two groups (an N400 effect), while correct sentences led to a different pattern (greater positivity) in L2 learners than in native listeners. The latter finding may reflect the greater difficulties the learners had with respect to syntactic integration. For sentences containing a phrase structure violation, L2 learners, in contrast to native listeners, did not show significant modulations of the syntax-related ERP components mentioned above (the early anterior negativity and the P600). Furthermore, sentences containing a pure semantic or combined syntactic/semantic violation elicited effects not found in native listeners. These effects may reflect additional conceptual-semantic processing in late bilinguals.

These ERP studies indicate that future RT studies examining sentence processing in bilinguals are likely to yield evidence for complex interactions between lexical and syntactic knowledge in L1 and L2. Of course, there is a vast number of research issues to be addressed. For a psycholinguist working on the bilingual lexicon, one interesting issue to explore is to what extent the language non-selective access mechanism found at the lexical level also holds at the syntactic level.

The assumption that the syntactic rules and syntactic categories of different languages are not incorporated in language-specific databases but in an integrated store leads to a variety of predictions. For instance, for a Dutch-English bilingual, processing differences could arise between a noun phrase like ‘the light of a distant star’ and a sentence like ‘the man sat in his room’, because the interlingual homograph star is an adjective in Dutch (meaning ‘rigid’) but a noun in English, while the homograph room is a noun in both languages (it means ‘cream’ in Dutch).

Furthermore, language non-selective access of syntactic rules might lead to specific cross-linguistic priming effects. For instance, hearing a sentence like ‘the librarian handed the reader a book’ might prime the production of de vader gaf het meisje een appel (‘The father gave the girl an apple’) but not de vader gaf een appel aan het meisje (‘The father gave an apple to the girl’) (cf. Bock 1986).

Another interesting issue is whether there is a separate effect of the language of the sentence context on the recognition of a target word. In other words, could a noun phrase or larger sentence context elicit some kind of language frame that affects the processing of later arriving words? In that case, processing a sentence with a code-switch like ‘I see a huis’ might be more difficult than processing a regular sentence like ‘I see a house’, simply because the words in the first sentence context do not all belong to one and the same language.

Finally, it seems likely that future RT studies on bilingual syntactic processing will yield some unexpected results, due to the complex interactions between lexical, syntactic, and semantic factors. One possibility is that quantitative differences in working memory capacity for L2 syntactic processing may lead to qualitative processing differences between monolinguals and bilinguals (cf. Michael, Dijkstra and Kroll 2002). For instance, in a pilot study in our lab we found that although both Dutch and German readers tended to resolve local ambiguities in subject- and object-relative clauses in their L1 by using syntactic information only, Dutch-German readers in their L2 used semantic information as well (Caelen 1998, but also see Frenck-Mestre and Pynte 1997). A similar pattern was found in L1 readers under higher processing load conditions.

All these exciting questions and many others are amenable to empirical research by means of existing research techniques. Unfortunately, the collection of empirical data addressing these questions has only just started and we have no answers to these questions yet.

4. General conclusions

In this chapter, we have considered a number of questions about bilingual lexical processing and provided answers based on the presently available empirical evidence with respect to interlingual homographs and cognates. First, we have argued that during the recognition of isolated words by bilinguals, lexical candidates from several languages are activated in parallel. Such parallel activation does not only hold for orthographic representations, but also for phonological and semantic codes. Moreover, there is evidence that language non-selective access occurs even when bilinguals are processing words in their native language and are not aware that their second language knowledge is important. These findings indicate that the bilingual word identification system, just like the monolingual system, is to a large extent ‘automatic’ in nature, in the sense that lexical candidates from both languages are activated in a fast recognition process that in itself is largely unaffected by intentional and attentional factors.

Second, we have examined the time-course of lexical activation with respect to L1 and L2 and found that L2 is slower to be activated than L1, depending on relative L1/L2 proficiency and therefore on (subjective) L1/L2 word frequency. For interlingual homographs, we have found that in spite of differences in L1/L2 activation rates, both readings of interlingual homographs remain active during lexical processing for a relatively long time. This finding has several important theoretical consequences. For instance, if the language membership of word candidates could be used quickly to suppress lexical candidates that are irrelevant in the experimental context, effects of the non-targeted reading of the homographs should quickly disappear. However, if language membership information becomes available late during processing, both readings of the homograph would remain active for quite a long time. The available empirical studies support the latter position, which is consistent with the automatic nature of bilingual word recognition.

Third, we have demonstrated that both non-linguistic experimental and linguistic context factors may affect the result patterns that are observed in experiments. It appears that non-linguistic factors such as task demands and instruction affect the performance of bilingual participants at the level of task and decision processes as well as participant strategies. Linguistic factors such as lexical, syntactic, and semantic aspects of the sentence context appear to affect the word identification process more directly. Evidence from ERP studies indicates that syntactic processing in bilinguals may differ both quantitatively and qualitatively from that in monolinguals, but RT studies are badly needed in order to specify from which underlying mechanisms the differences originate.

To conclude, empirical studies on bilingual word recognition in the last decade have uncovered a number of fundamental characteristics of the bilingual word recognition system. They have answered some major questions that are unique to the bilingual domain, such as that about language selective or non-selective access, as well as more generally important questions, such as how language users handle lexical ambiguity and how task and stimulus context affect word recognition. The conclusions of these studies will have to be taken into account during the development of a more general model of bilingual processing. However, much more empirical evidence on the interaction between lexical, syntactic, and semantic processing is needed before we can even attempt to build such a model.

Note

* The author thanks Folkert Kuiken and two anonymous reviewers for their comments on a previous version of this paper. The author also thanks Judy Kroll for her continuous support and the many discussions that shaped the ideas in this chapter.

<DEST "dyk-n*">

References

Altarriba, J., Kroll, J.F., Sholl, A. and Rayner, K. 1996. “The influence of lexical and conceptual constraints on reading mixed-language sentences: Evidence from eye fixations and naming times”. Memory & Cognition 24: 477–492.

Altenberg, E.P. and Cairns, H.S. 1983. “The effects of phonotactic constraints on lexical processing in bilingual and monolingual studies”. Journal of Verbal Learning and Verbal Behavior 22: 174–188.

Bock, J.K. 1986. “Syntactic persistence in language production”. Cognitive Psychology 18: 355–387.

Caelen, M. 1998. Extending the study on the processing of relative clauses to bilingualism. Unpublished Master’s Thesis, University of Nijmegen.

Cristoffanini, P., Kirsner, K. and Milech, D. 1986. “Bilingual lexical representation: The status of Spanish-English cognates”. Quarterly Journal of Experimental Psychology 38A: 367–393.

Davelaar, E., Coltheart, M., Besner, D. and Jonasson, J.T. 1978. “Phonological recoding and lexical access”. Memory & Cognition 6: 391–402.

De Bruijn, E., Dijkstra, A., Chwilla, D. and Schriefers, H. 2001. “Language context effects on interlingual homograph recognition: Evidence from event-related potentials and response times in semantic priming”. Bilingualism: Language and Cognition 4: 155–168.

De Groot, A.M.B., Delmaar, P. and Lupker, S.J. 2000. “The processing of interlexical homographs in a bilingual and a monolingual task: Support for nonselective access to bilingual memory”. Quarterly Journal of Experimental Psychology 53: 397–428.

De Moor, W. 1998. Visuele woordherkenning bij tweetalige personen. [Visual word recognition in bilinguals.] Unpublished Master Thesis, University of Ghent.

Dijkstra, A., De Bruijn, E., Schriefers, H.J. and Ten Brinke, S. 2000. “More on interlingual homograph recognition: Language intermixing versus explicitness of instruction”. Bilingualism: Language and Cognition 3: 69–78.

Dijkstra, A., Grainger, J. and Van Heuven, W.J.B. 1999. “Recognition of cognates and interlingual homographs: The neglected role of phonology”. Journal of Memory and Language 41: 496–518.

Dijkstra, A., Timmermans, M. and Schriefers, H. 2000. “Cross-language effects on bilingual homograph recognition”. Journal of Memory and Language 42: 445–464.

Dijkstra, A. and Van Heuven, W.J.B. 2002. “The architecture of the bilingual word recognition system: From identification to decision”. Bilingualism: Language and Cognition 5: 175–197.

Dijkstra, A., Van Jaarsveld, H. and Ten Brinke, S. 1998. “Interlingual homograph recognition: Effects of task demands and language intermixing”. Bilingualism: Language and Cognition 1: 51–66.

Font, N. 2001. Rôle de la langue dans l’accès au lexique chez les bilingues: Influence de la proximité orthographique et sémantique interlangue sur la reconnaissance visuelle de mots. Unpublished Doctoral Thesis of the Université Paul Valery, Montpellier, France.

Frenck-Mestre, C. and Pynte, J. 1997. “Syntactic ambiguity resolution while reading in second and native languages”. Quarterly Journal of Experimental Psychology 50: 119–148.

Gerard, L.D. and Scarborough, D.L. 1989. “Language-specific lexical access of homographs by bilinguals”. Journal of Experimental Psychology: Learning, Memory and Cognition 15: 305–313.

Gottlob, L.R., Goldinger, S.D., Stone, G.O. and Van Orden, G.C. 1999. “Reading homographs: Orthographic, phonologic, and semantic dynamics”. Journal of Experimental Psychology: Human Perception and Performance 25: 561–574.

Grainger, J. and Beauvillain, C. 1987. “Language blocking and lexical access in bilinguals”. Quarterly Journal of Experimental Psychology 39A: 295–319.

Grainger, J. and Dijkstra, A. 1996. “Visual word recognition”. In Computational Psycholinguistics: AI and connectionist models of human language processing, A. Dijkstra and K. De Smedt (eds), 139–165. London: Taylor and Francis.

Grosjean, F. 2001. “The bilingual’s language modes”. In Language processing in the bilingual, J.L. Nicol and T.D. Langendoen (eds), 1–25. Oxford: Blackwell.

Hahne, A. 2001. “What’s different in second-language processing? Evidence from event-related brain potentials”. Journal of Psycholinguistic Research 30: 251–266.

Hahne, A. and Friederici, A. 2001. “Processing a second language: late learners’ comprehension mechanisms as revealed by event-related brain potentials”. Bilingualism: Language and Cognition 4: 123–141.

Kroll, J.F. and Dussias, P.E. In press. “The comprehension of words and sentences in two languages”. Chapter to appear in Handbook of bilingualism, T. Bhatia and W. Ritchie (eds). Cambridge, MA: Blackwell Publishers.

Lemhöfer, K. and Dijkstra, A. Submitted. “Recognizing cognates and interlingual homographs: Time course and code similarity effects in generalized lexical decision”.

Macnamara, J. and Kushnir, S.L. 1971. “Linguistic independence of bilinguals: The input switch”. Journal of Verbal Learning and Verbal Behavior 10: 480–487.

Michael, E., Dijkstra, A. and Kroll, J.F. 2002. “Individual differences in the degree of language nonselectivity in fluent bilinguals”. Paper presented at the meeting of the International Linguistic Association, Toronto, Canada.

Sebastián-Gallés, N. and Kroll, J.F. In press. “Phonology in bilingual language processing: Acquisition, perception, and production”. In Phonetics and phonology in language comprehension and production: Differences and similarities, N. Schiller and A. Meyer (eds). Berlin: Mouton de Gruyter.

Van Hell, J. and Dijkstra, A. 2002. “Foreign language knowledge can influence native language performance in exclusively native contexts”. Psychonomic Bulletin and Review 9: 780–789.

Van Heste, T. 1999. Visuele woordherkenning bij tweetaligen. [Visual word recognition in bilinguals.] Unpublished Master Thesis, University of Leuven.

Van Heuven, W.J.B., Dijkstra, A. and Grainger, J. 1998. “Orthographic neighborhood effects in bilingual word recognition”. Journal of Memory and Language 39: 458–483.

Van Heuven, W.J.B. and Dijkstra, A. April 1999. The role of phonology in the recognition of interlingual homographs and cognates. Paper presented at the Second International Symposium on Bilingualism, Newcastle, UK.

Von Studnitz, R.E. and Green, D. 2002. “Interlingual homograph interference in German-English bilinguals: Its modulation and locus of control”. Bilingualism: Language and Cognition 5: 1–23.

Weber-Fox, C.M. and Neville, H.J. 1996. “Maturational constraints on functional specializations for language processing: ERP and behavioral evidence in bilingual speakers”. Journal of Cognitive Neuroscience 8: 231–256.

</TARGET "dyk">

<TARGET "wil" DOCINFO AUTHOR "John N. Williams"TITLE "Inducing abstract linguistic representations"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 7

Inducing abstract linguistic representations

Human and connectionist learning of noun classes

John N. WilliamsUniversity of Cambridge

1. Introduction

Noun class information is a crucial component of the interface between the lexicon and the grammar. In order to explain linguistic productivity it is necessary to assume that linguistic rules are defined not over specific words, but over classes of words. This is not only true given the classical distinction between lexicon and grammar, but also in ‘emergentist’ views which see no clear separation between these two systems (Ellis 1998, Tomasello 2000). Even though the latter stress the lexical-specificity of many ‘grammatical rules’, it is still recognised that adult productivity can only be explained if words are grouped into classes, even if those classes do not map neatly onto traditional linguistic categories. The way in which words are grouped into grammatical classes is therefore an important issue in understanding language development, particularly in explaining the leap from lexical learning to grammar learning.

Noun classes, such as grammatical gender, are fundamentally abstract, grammatical notions (Corbett 1991). However, attempts have been made to uncover subtle phonological and semantic cues that can be used to predict a word’s gender (Kelly 1992). For example, masculine nouns in German are more likely to be monosyllabic, and monosyllabic words that are masculine contain more consonants than those of other classes. In French, feminine nouns tend to end in closed stressed syllables (e.g. personne, tomate, viande), and masculine nouns tend to end in open stressed syllables (e.g. avion, bruit, chapeau, bain). There are also a number of characteristic derivational morphemes associated with each gender (e.g. -eur and -ment are masculine, and -tion, -euse, -ière are feminine). Sokolik and Smith (1992) trained a connectionist network to classify French nouns as either masculine or feminine. The network was presented with the orthographic, rather than phonological, forms of the words. They found that it could then indicate the gender for nouns that it had not received during training, although its performance was not perfect (ranging between 73% and 75%). This indicates that there are regularities in the form (in this case spelling) of French words which can to a certain extent predict gender category.
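As a rough illustration of what a form-based cue amounts to, the derivational endings listed above can be turned into a simple lookup. This is only a sketch under stated assumptions: the function name and the decision to return None for nouns without a listed ending are invented for illustration, and the rule is far cruder than the network trained by Sokolik and Smith (1992), which learned its regularities from spelling.

```python
# Derivational endings named in the text as cues to French gender.
MASCULINE_ENDINGS = ("eur", "ment")
FEMININE_ENDINGS = ("tion", "euse", "ière")


def guess_gender(noun):
    """Return 'masculine', 'feminine', or None when no listed ending matches.

    Hypothetical helper: it only encodes the endings quoted in the text.
    """
    word = noun.lower()
    if word.endswith(FEMININE_ENDINGS):
        return "feminine"
    if word.endswith(MASCULINE_ENDINGS):
        return "masculine"
    return None


for noun in ["gouvernement", "nation", "chanteuse", "chapeau"]:
    print(noun, guess_gender(noun))
```

Nouns such as chapeau fall outside the listed endings, so a lookup of this kind can never cover the whole lexicon.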

Yet there are always words which fall stubbornly outside such generalisations. In the case of French, Carroll (2001) argues that in any case, the kinds of phonological cues that have been appealed to are more subtle than could reasonably be expected to be represented in the lexicon. This is not to say that phonological and semantic cues do not play a role in learning gender systems, or that they do not affect how easy it is to remember the gender of specific words. But ultimately gender classes impose an abstract categorisation on words which is independent of their phonological and semantic properties. Learning gender systems, then, requires the formation of abstract grammatical categories, and producing grammatically well-formed utterances involves applying agreement rules which make reference to those categories.

There is a growing body of evidence which suggests that even quite advanced second language learners continue to make gender errors (Hawkins 2001, Holmes and De la Batie 1999). In contrast, such errors are relatively rare in first language acquisition (Caselli, Leonard, Volterra and Campagnoli 1993). There is also evidence for qualitative differences between first and second language acquisition and processing of gender. A number of studies have shown that second language learners are more sensitive to phonological agreement patterns that correlate with gender classes than either children or adults in their native language. For example, for the Italian il pettine (‘the comb’, masculine singular) a second language learner might produce *le pettine, using the article which is more often associated with the -e ending on feminine plural nouns (Holmes and De la Batie 1999). In contrast a child would be more likely to produce *il pettino, choosing an article that is correct for the noun’s gender and number, and providing the noun with the characteristic -o ending for masculine singulars. This demonstrates a grasp of the noun’s abstract gender as the controlling influence in determiner selection (Caselli et al. 1993). In reaction time tasks on adults, Taraban and Kempe (1999) showed that non-native speakers of Russian are more sensitive to phonological cues to gender than are natives. Finally, a study by Guillelmon and Grosjean (2001) showed that whereas native speakers of French and early bilinguals show certain gender congruency effects in reaction time tasks, such effects are absent in late bilinguals. These studies suggest that second language learners do not achieve native-like representation or processing of gender information.

In this chapter I shall explore the possibility that the reason why gender is a persistent problem for second language learners is precisely because the underlying abstract grammatical concepts are difficult to acquire through associative learning. I shall address this issue through behavioural studies of semi-artificial language learning in tandem with computational (connectionist) simulations. These simulations were used as a means of assessing the viability, and potential limitations, of a purely associative learning account of the behavioural data.

2. The issue of abstraction in human and connectionist learning

Noun class induction provides a well-constrained domain in which to examine the broader issue of abstraction in both human and connectionist learning. In the case of adult implicit learning there has been a good deal of debate over whether the knowledge that is acquired in, say, artificial grammar learning experiments can really be characterised as abstract (compare Johnstone and Shanks 1999, Knowlton and Squire 1996, Meulemans and Van der Linden 1997). Some degree of abstraction is suggested by the ability to transfer rule knowledge between stimulus sets (Knowlton and Squire 1996, Mathews et al. 1989). But this appears to be no more than knowledge of patterns of alternation or doubling of stimuli, for example the common abstract ABA structure which underlies the syllable sequences ga-ti-ga and wo-fe-wo (Marcus, Vijayan, Bandi Rao and Vishton 1999). Gómez and Gerken (2000) refer to this as “pattern-based abstraction”. But language structure depends upon patterns that are defined over abstract categories, such as the common NVN structure underlying ‘Dogs eat pizza’ and ‘John loves books’. Gómez and Gerken (2000) refer to this as “category-based abstraction”. Very little implicit learning research has examined this kind of abstraction, even though it is a prime area in which implicit learning of language structure can be evaluated.
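The contrast between the two kinds of abstraction can be made concrete with a small sketch. The function names and the mini-lexicon below are assumptions introduced purely for illustration. The pattern-based check needs only identity relations between the items themselves, whereas the category-based check has to consult stored class membership for each word.

```python
def has_aba_pattern(items):
    """Pattern-based abstraction: first and last items identical, middle differs."""
    return (len(items) == 3
            and items[0] == items[2]
            and items[0] != items[1])


# Hypothetical mini-lexicon assigning each word to an abstract category.
CATEGORY = {"dogs": "N", "pizza": "N", "john": "N", "books": "N",
            "eat": "V", "loves": "V"}


def category_sequence(sentence):
    """Category-based abstraction: replace each word by its class label."""
    return [CATEGORY[word.lower()] for word in sentence.split()]


print(has_aba_pattern(["ga", "ti", "ga"]), has_aba_pattern(["wo", "fe", "wo"]))
print(category_sequence("Dogs eat pizza"))
print(category_sequence("John loves books"))
```

The two sentences share no words and no repetition pattern; their common structure only appears once each word has been mapped to its category.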

In connectionist networks rule-like behaviour, such as the ability to generalise to novel inputs, is an emergent property of the system, and there is no separation between rote memory for examples and the representation of underlying generalisations (consider, for example, the well-known models of past tense formation (Rumelhart and McClelland 1986), and reading (Seidenberg and McClelland 1989)). But it has been argued that the human generative capacity in linguistic domains cannot be accounted for without the ‘classical’ distinction between knowledge of instances and knowledge of rules, or the traditional computational distinction between data and symbolic programs (Fodor and Pylyshyn 1988). According to this view, the problem with connectionist models is that they respond to novel inputs purely on the basis of their similarity to trained examples, and not by applying abstract rules (Berent, Marcus, Shimron and Gafos 2002, Marcus 1999, Marcus et al. 1999). Category-based abstraction provides an ideal arena in which to explore this issue.

3. Previous research into human and connectionist learning of word classes

In his work on sequence learning Elman (1990) showed that there is a sense in which a connectionist network can learn abstract noun classes. This network learned the sequential probabilities of words in simple sentences through a prediction task (attempting to predict the next word in a sentence on the basis of the preceding ones). When the internal states of the network were examined (see below for an illustration of how this is done) it was found that the activation patterns produced by words clustered into classes that reflected the distributional properties of the training sentences. The two largest clusters were for nouns versus verbs, and within these groups there were smaller sub-clusters corresponding to transitivity preference for verbs, and animacy for nouns. These clusters were based purely on a distributional analysis of the words in the input. For example, what made a noun ‘inanimate’ was nothing more than the fact that it only occurred before certain kinds of verb (e.g. move, break) and not others (e.g. smell, see). This work is widely cited as proof that networks can induce word classes by performing distributional analysis, and as support for a statistical approach to language learning (Redington and Chater 1998).
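The logic of a purely distributional classification can be sketched without a network. The toy corpus below is invented for illustration and is not Elman's training set, and grouping nouns by identical verb contexts is a deliberately crude stand-in for the graded similarity of hidden-unit activation patterns. The point is only that the noun classes fall out of co-occurrence and nothing else.

```python
from collections import defaultdict

# Toy two-word "sentences" (noun followed by verb). The nouns are assumptions
# chosen for illustration; the verbs are the examples mentioned in the text.
CORPUS = [
    ("rock", "move"), ("rock", "break"),
    ("glass", "move"), ("glass", "break"),
    ("dog", "move"), ("dog", "smell"), ("dog", "see"),
    ("girl", "move"), ("girl", "smell"), ("girl", "see"),
]


def verb_contexts(corpus):
    """Collect, for every noun, the set of verbs that follow it."""
    contexts = defaultdict(set)
    for noun, verb in corpus:
        contexts[noun].add(verb)
    return contexts


def cluster_by_context(corpus):
    """Group nouns whose sets of following verbs are identical."""
    clusters = defaultdict(list)
    for noun, verbs in verb_contexts(corpus).items():
        clusters[frozenset(verbs)].append(noun)
    return sorted(sorted(group) for group in clusters.values())


print(cluster_by_context(CORPUS))
```

No feature marks rock or glass as inanimate; they end up together only because they never occur before smell or see.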

Given the apparent power of distributional information to deliver noun class information it is perhaps surprising that there is only limited evidence from experimental studies that humans are able to exploit it in order to learn noun classes. Saffran (2001) examined incidental learning of a set of hierarchical phrase structure rules in which each phrase was associated with a distinct class of nonsense words. She argued that the results of the grammaticality judgment tests showed that the participants developed sensitivity to phrase structure and word class, and that this was based on a statistical analysis of the distribution of the words in the input. However, abstract representations of word class would permit test items containing word sequences that had never occurred in the input to be judged as grammatical (or more grammatical than similar sequences which violated phrase structure). Because no such test was performed it is difficult to know whether abstract word classes had really been learned.

More stringent tests of word class learning become possible when noun classes, such as gender systems, are considered. Brooks, Braine, Catalano and Brody (1993) used an artificial language in which there were two noun classes, and each class used different affixes to mark the location of the actor in relation to the object denoted by the noun. Neither the form nor meaning of the nouns provided any clue to their class. Adults were first taught the vocabulary, and then performed both comprehension and production tasks (e.g. acting out phrases, or describing pictures with feedback in the form of the correct answer). After training they were tested on knowledge of the trained items, and also on their ability to produce the correct response for noun-affix combinations that had not been presented during training. Whilst their performance on trained items was at around 75%, they were at chance on the generalisation items. Not one of the 16 subjects showed evidence of having learned the system. Similar results have been obtained in a number of other studies (Braine 1987, Braine et al. 1990, Frigo and McDonald 1998). Frigo and McDonald (1998: 237) argue that models of noun class learning that depend on pure distributional analysis (Anderson 1983, Maratsos and Chalkley 1980, Pinker 1984) are “too powerful”.

The question is, then, does connectionism fall into this class of overly powerful learning mechanisms for learning noun classes? The experiments and simulations presented below further explored the circumstances under which arbitrary and non-arbitrary noun class systems can be learned by humans and connectionist networks.

4. Experiment 1

Williams and Lovatt (2003) tested whether humans can learn the arbitrary noun class system shown in Table 1. There were eight nouns divided into two arbitrary classes ‘masculine’ and ‘feminine’. Words in the ‘masculine’ class occurred with the determiners ig, i, ul, and tei. Words in the ‘feminine’ class occurred with the determiners ga, ge, ula, and tegge.¹ The training items were the non-italicised phrases shown in Table 1. The italicised items were withheld for testing generalisation. It would only be possible to know that ‘the ball’ should be translated as ig johombe by knowing that johombe belongs to the ‘masculine’ class. Neither its form, its -e ending, nor its meaning provide any clues.

Table 1. The items employed in Experiments 1 and 2. Items used for testing generalisation are in italics.

                definite singular   definite plural   indefinite singular   indefinite plural
                (the)               (the)             (a)                   (some)
‘masculine’
  ball          ig johombe          i johombi         ul johombe            tei johombi
  house         ig zabide           i zabidi          ul zabide             tei zabidi
  fight         ig wakime           i wakimi          ul wakime             tei wakimi
  bird          ig migene           i migeni          ul migene             tei migeni
‘feminine’
  shoe          ga shosane          ge shosani        ula shosane           tegge shosani
  kiss          ga tisseke          ge tisseki        ula tisseke           tegge tisseki
  cake          ga chakume          ge chakumi        ula chakume           tegge chakumi
  nose          ga nawase           ge nawasi         ula nawase            tegge nawasi

The participants first learned the nouns and determiners as isolated vocabulary items. They then received the determiner-noun combinations for each training item as part of an exercise in rote memorisation that cycled through phases of presentation and cued recall over sets of four items. Four phrases were presented with their English translations, for example: ‘the nose’ ga nawase, ‘the birds’ i migeni, ‘some balls’ tei johombi, ‘a kiss’ ula tisseke. The participants repeated each novel phrase immediately after they had seen and heard it. After the four phrases had been presented participants attempted to recall each phrase given the English translation and stem as cues, for example: ‘the birds’ _ migen_, ‘the nose’ _ nawas_, ‘a kiss’ _ tissek_, ‘some balls’ _ johomb_. They were provided with feedback after each recall attempt in the form of the correct answer. After receiving the 24 training items they performed a generalisation test on the withheld items in Table 1. The generalisation test was similar to the recall component of the training phase. The English translation of each phrase was presented (e.g. ‘the ball’), along with the form of the corresponding stem (johomb_), and the participants had to produce the appropriate determiner and appropriately inflected noun. No feedback was given. This sequence of memory and generalisation tasks was repeated five times.
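The paradigm in Table 1 can be written down compactly as data, which makes explicit that the determiner depends on noun class, definiteness and number, while the noun ending depends on number alone. The sketch below is an illustrative encoding, not part of the experimental materials; the dictionary layout and the function name are assumptions.

```python
# Noun stems by class, keyed by English meaning (from Table 1).
STEMS = {
    "masculine": {"ball": "johomb", "house": "zabid",
                  "fight": "wakim", "bird": "migen"},
    "feminine": {"shoe": "shosan", "kiss": "tissek",
                 "cake": "chakum", "nose": "nawas"},
}

# Determiner for each (class, definite?, plural?) combination.
DETERMINERS = {
    ("masculine", True, False): "ig", ("masculine", True, True): "i",
    ("masculine", False, False): "ul", ("masculine", False, True): "tei",
    ("feminine", True, False): "ga", ("feminine", True, True): "ge",
    ("feminine", False, False): "ula", ("feminine", False, True): "tegge",
}


def phrase(meaning, definite, plural):
    """Build the determiner-noun phrase for an English meaning."""
    for noun_class, stems in STEMS.items():
        if meaning in stems:
            determiner = DETERMINERS[(noun_class, definite, plural)]
            ending = "i" if plural else "e"  # number is marked on the noun
            return f"{determiner} {stems[meaning]}{ending}"
    raise KeyError(meaning)


print(phrase("ball", definite=True, plural=False))
print(phrase("nose", definite=True, plural=False))
print(phrase("bird", definite=True, plural=True))
```

The class lookup inside phrase is exactly the information that neither the form nor the meaning of a noun supplies, which is why a withheld item such as ‘the ball’ can only be produced correctly by a learner who has induced the class of johombe.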

Across 21 participants the mean generalisation performance over the five cycles was 36%, 48%, 54%, 66%, and 67%. A repeated measures ANOVA showed that the improvement in performance was significant, F(4,80)=13.11, p<0.001. This shows that the participants learned something of the underlying noun class organisation. However, there were large individual differences in the level of learning. Two factors were found to independently predict performance on the final generalisation test. The first was the participants’ phonological short-term memory, as measured prior to the experiment by their ability to recall lists of three nonsense words (the singular forms of the nouns in the target language) in the order of presentation. The correlation between this memory measure and performance on the final generalisation test was r=0.528, p<0.05. There was evidence that the relationship between phonological short-term memory and rule learning was mediated by memory for determiner-noun combinations received during training. Clearly memory ability is crucial to performing the kind of distributional analysis upon which learning of this kind of system depends.
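The reported statistics can be checked for internal consistency with a few lines of arithmetic. This is a minimal sketch that assumes all 21 participants contributed to both analyses; the function names are invented for illustration. With five cycles and 21 participants a one-way repeated measures design has (5-1)=4 and (21-1)(5-1)=80 degrees of freedom, and a correlation can be converted to a t statistic with n-2 degrees of freedom.

```python
import math


def repeated_measures_df(n_participants, n_levels):
    """Degrees of freedom (effect, error) for a one-way repeated measures ANOVA."""
    effect_df = n_levels - 1
    error_df = (n_participants - 1) * (n_levels - 1)
    return effect_df, error_df


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


print(repeated_measures_df(21, 5))
print(round(t_from_r(0.528, 21), 2))
```

The resulting t of about 2.71 exceeds the two-tailed .05 critical value for 19 degrees of freedom (approximately 2.09), in line with the reported p<0.05.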

The second factor was a measure of the participants’ breadth and depth of knowledge of other gender languages. All of the participants’ L1s were non-gender languages (in fact all but one of them were native speakers of English), but the more gender languages they knew as L2s, and the better they knew them, the better their performance on the generalisation test (r=0.520, p<0.05). This suggests that the learning process was facilitated by linguistic knowledge.

There are a number of possible reasons why our participants managed to learn an arbitrary noun class system whereas those in the previous studies did not. First, the systems used by Brooks et al. (1993) and Braine et al. (1990) involved agreement between spatial prepositions and nouns, and Frigo and McDonald (1998) used a system involving agreement between greetings and names. Participants may have had relatively little familiarity with similar systems in other languages that they knew. Second, it is possible that the size of the languages is important. Braine et al. (1990) used a 24-word vocabulary, Brooks et al. (1993) used 30 words, and Frigo and McDonald (1998) used 20 words, whereas Experiment 1 used only eight words. Clearly, keeping track of the collocates of 20 to 30 words is much harder than keeping track of the collocates of eight words. A third potentially important factor is that in the present case some of the determiners in each class had the same ending. The feminine class contained the pairs ga-ula and ge-tegge; the masculine class contained i-tei, and the remaining determiners ig and ul were the only ones to end in consonants. This similarity structure may have facilitated the learning process.

Experiment 1 demonstrates that an arbitrary noun class system is in principle learnable. The question now is whether a connectionist simulation of the same learning problem will be similarly successful.

4.1 Simulation 1

For this and all other simulations reported here, the simulation package Tlearn was used (Plunkett and Elman 1997). The aim in the first simulation was to train the network in a way which resembled as closely as possible the training task performed by the participants in Experiment 1. I decided to focus on the recall component of the training task. The network was taught to produce the correct determiner for each phrase in the training set shown in Table 1. The input consisted of representations of the noun stem, the inflection, the English determiner, and the number of the noun. For example, the input for the item tei johombi was ‘johomb’, ‘-i’, ‘some’, and ‘plural’. This is the information that is relevant to predicting the determiner, and which was explicitly provided to the participants in the recall component of the training task in Experiment 1.2

Following Elman (1990) one unique input node was used to represent each element of the input (for example one unit was used to represent johomb), yielding a total of 15 input nodes (eight stems, two inflections, three English determiners, singular, plural). The input nodes were connected to five hidden units, which were in turn connected to eight output units, one for each of the eight possible determiners. For each input pattern the network was taught to produce the correct determiner. For example given the input johomb, -i, ‘some’, and ‘plural’ it was taught to predict tei. This involved comparing the actual output from the network with the correct output, and making appropriate changes to the connection weights within the network according to the degree of error. In this sense the network was provided feedback in the same way as the participants in the experiment.
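The localist input coding described above can be illustrated with a minimal Python sketch. It is not the Tlearn configuration used in the study. Only the stem johomb is named in this section; nawas is inferred from the phrase tegge nawasi, and the remaining stem names are hypothetical placeholders.

```python
# Localist ("one unit per element") input coding for Simulation 1.
# 'johomb' is named in the text; 'nawas' is inferred from 'tegge nawasi';
# stem3..stem8 are hypothetical placeholders for the remaining stems.
STEMS = ["johomb", "nawas", "stem3", "stem4", "stem5", "stem6", "stem7", "stem8"]
INFLECTIONS = ["-e", "-i"]
ENGLISH_DETERMINERS = ["a", "the", "some"]
NUMBER = ["singular", "plural"]

# 8 stems + 2 inflections + 3 English determiners + 2 number units = 15
INPUT_UNITS = STEMS + INFLECTIONS + ENGLISH_DETERMINERS + NUMBER


def encode(stem, inflection, english_determiner, number):
    """Return a 15-element vector with 1.0 on each active unit."""
    active = {stem, inflection, english_determiner, number}
    unknown = active - set(INPUT_UNITS)
    if unknown:
        raise ValueError(f"unknown input elements: {sorted(unknown)}")
    return [1.0 if unit in active else 0.0 for unit in INPUT_UNITS]


# Input pattern for the training item 'tei johombi'
pattern = encode("johomb", "-i", "some", "plural")
```

Each input pattern therefore has exactly four active units, and the noun is represented by a single arbitrary node rather than by its phonological form.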

The network was initially trained until the root mean square (RMS) error for the training items was 0.1 (this required an average of 2,479 cycles through the training set).3 An error of this magnitude indicated that for each input pattern the network was able to activate the correct determiner on the output layer to a value close to the target value of 1.0, and all other output units had values close to zero. Testing involved presenting the input patterns for the generalisation items in Table 1 (i.e. the network was presented with patterns that it had not received during training). For each input pattern, the activation level of the output units was recorded, compared to the correct answer and the degree of error calculated. The training and test procedure was repeated 20 times, and on each run the connection weights were given random starting values.
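As a rough illustration of the stopping criterion, the following sketch computes an RMS error over output and target vectors. It uses one common definition (the square root of the mean squared difference over all output units and patterns); the exact formula used by Tlearn may differ, and the activation values are invented for illustration.

```python
import math


def rms_error(outputs, targets):
    """Root mean square error over paired output/target vectors."""
    squared = [
        (o - t) ** 2
        for out_vec, tgt_vec in zip(outputs, targets)
        for o, t in zip(out_vec, tgt_vec)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical output over the eight determiner units: the correct unit is
# strongly active and the other seven are close to zero.
target = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
output = [0.9, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03]
error = rms_error([output], [target])
meets_criterion = error <= 0.1  # training criterion reported in the text
```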

Generalisation performance on each run was perfect in the sense that the activation on the node for the correct determiner was far greater than that of the others. Over 20 runs the mean RMS error was 0.118 (which is not much greater than that for trained items). That is, the network was able to correctly predict the determiner for input patterns that it had never encountered during training with an accuracy which was almost as high as for the trained items.4


In order to explore the nature of the network’s internal representations the word stems were presented alone to the input layer at test (i.e. all of the other elements of the input were given values of zero). The activation patterns over the five hidden units were recorded and submitted to a cluster analysis (for a similar procedure see Elman 1990). The logic of only presenting the word stems was that the aim was to ascertain the similarity structure of the hidden unit activations to the nouns in a way that was not contaminated by the activations produced in specific contexts of definiteness and number. Over six separate runs a similar result was obtained: the activation patterns clustered according to gender. That is, nouns within the same class produced activation patterns that were similar to each other and distinct from the patterns produced by the nouns in the other class.

It should be clear that this network is not simply producing responses to the generalisation items on the basis of their similarity to trained items. For example, for the test item ig johombe the stimuli were johomb, -e, ‘the’, and ‘singular’. During training johomb and -e only occurred with ul. The elements ‘the’ and ‘singular’ occurred with both ig and ga with equal frequency. Yet the network was able to produce a strong output on ig and much lower levels of activation on the remaining determiners. Simulation 1 therefore shows that a connectionist network can achieve linguistic productivity, and can behave as if it has formed abstract representations, even though there are no abstract representations as such within the network.

There are various ways in which the power of Simulation 1 could be varied in order to account for the effects of individual differences in Experiment 1, or the failures to obtain learning of arbitrary noun classes in previous experiments. The effect of memory ability could be dealt with by changing the learning rate parameter (which determines the size of the weight changes in response to a given amount of error). Factors such as the similarity structure of the determiners, or the number of nouns in the training set, would be expected to influence learning rate as well. However, the influence of knowledge of other gender languages is more problematic and will be considered after the remaining experiments and simulations have been reported.

Connectionist networks are commonly regarded as models of the associative mechanisms underlying implicit learning (Cleeremans and Jiménez 2002). However, when we debriefed our participants after Experiment 1 it was clear that the more successful amongst them had been employing intentional learning strategies, and that there was a good correspondence between their conscious understanding of the system and their performance in the final generalisation test. It therefore becomes important to test whether learning could be obtained under implicit conditions.

5. Experiment 2

This experiment employed a training task that was thought to be unlikely to induce an intentional learning strategy. Participants first performed the same phonological short-term memory test and vocabulary learning exercise as in Experiment 1. Determiner-noun combinations from the training set were then auditorily presented in a semi-random sequence, avoiding immediate repetitions of the same noun or determiner. For each item the participants had to perform the following tasks: (1) repeat the phrase aloud, (2) indicate whether it refers to a living or non-living thing by pressing one of two response keys, and (3) translate the phrase into English. For example, for the item ul johombe they would respond by saying ‘ul johombe’, pressing the non-living key, and saying ‘a ball’. The meanings of the words were altered so that half of the nouns in each class referred to living things and half to non-living things. The living/non-living decision was included because this experiment was also a control for a subsequent version in which noun animacy predicted noun class membership (see Experiment 3 below). Here it serves as a means of increasing task demands so that participants would be less likely to attempt to engage explicit learning processes. The participants were told that the purpose of the experiment was to see how their decision and translation performance improved with practice and so they were encouraged to make their responses as quickly and as accurately as possible. Training extended over 15 cycles through the 24-item training set, giving a total of 360 training trials. This took between 60 and 75 minutes including rest breaks after every five cycles.

The training phase was followed by the test phase. On each trial the English translation of a test phrase was visually presented (e.g. ‘the ball’) and the participants had to choose between a grammatical and ungrammatical translation in the target language, where the determiner for the ungrammatical item was always of the correct number and definiteness, but the incorrect gender (e.g. ig johombe versus ga johombe for ‘the ball’). First the eight generalisation items were presented (see Table 1) followed by 16 trained items.

There were 18 participants who were selected on the basis of their good knowledge of gender languages so as to increase the potential for obtaining learning in this experiment. They all rated themselves as intermediate or better in at least two gender languages (mean=2.8, range=2 to 6). Twelve of the participants spoke a gender L1. Using the same scale for assessing knowledge of gender languages as employed by Williams and Lovatt (2003) they scored 5.8, which is much higher than the mean of 2.6 for the participants in Experiment 1. Their phonological short-term memory was also somewhat superior, the mean score being 71% as opposed to 64%.

None of the participants were aware of the noun class system either during training or test phases. The average percentage correct on the generalisation items was 56%, which was not significantly different from the chance level of 50%, t=1.34, p>0.1. On the other hand, performance on the trained items was 69%, which is significantly better than chance, t=5.53, p<0.001, and significantly better than performance on generalisation items, t=2.58, p<0.05. Thus, although the participants had quite good memory for trained items, there was no evidence of learning the underlying noun class distinction. This conclusion is emphasised by the fact that the ten participants who scored 75% or better on the trained items (mean=80%) had a mean generalisation score of 50%. Nor were there any correlations between generalisation test performance and either phonological short-term memory or language background, and participants who spoke a gender L1 did no better than those that did not (generalisation test scores were 56% for both groups).
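The comparisons with chance reported above are of the kind produced by a one-sample t-test on per-participant proportions correct. The sketch below shows that computation under this assumption; the individual scores are hypothetical, since the raw data are not given here.

```python
import math
import statistics


def one_sample_t(scores, chance=0.5):
    """Return (t, degrees of freedom) for mean(scores) against `chance`."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    return (mean - chance) / (sd / math.sqrt(n)), n - 1


# Hypothetical proportions correct on the eight generalisation items
scores = [0.5, 0.75, 0.5, 0.625, 0.375, 0.625]
t_value, df = one_sample_t(scores)
```

The resulting t value is then compared with the critical value for the given degrees of freedom to decide whether performance differs from the 50% chance level.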

Given the failure to obtain learning in this experiment one may conclude that Simulation 1 was in fact too powerful, and that the learning that occurred in Experiment 1 was a result of purely explicit processes which fall outside the scope of the model. However, there is an alternative possibility. We should also consider the relationship between the task performed by Simulation 1 and the tasks performed by the participants in Experiments 1 and 2. Simulation 1 was intended as a model of the recall component of the training task used in Experiment 1. But in Experiment 2 the participants’ task was very different. They did not have to generate any determiners at any point during training, but only had to perform animacy decisions and produce English translations. Simulation 1 could not be said to be a good model of this task. A second simulation was therefore conducted that made different assumptions about the learning task.

5.1 Simulation 2

Incidental learning is best regarded as a relatively passive process of recording correlations between attended features in each experience. Cleeremans and Jiménez (2002), following O’Reilly and Munakata (2000:18), have referred to this as ‘model learning’, the goal of which is to “enable the cognitive system to develop useful, informative models of the world by capturing its correlational structure”. Connectionist models of model learning do not require feedback because the system merely attempts to represent the structure of the inputs it is provided. This is in contrast to ‘task learning’ which has the aim of “mastering specific input-output mappings (i.e. achieving specific goals) in the context of specific tasks through error-correcting learning procedures” (ibid. p.18). Crucially for present purposes they assume that model learning operates continuously, regardless of the task. Simulation 1 instantiated task learning, and was successful because the underlying noun class distinction happened to be relevant to the task the network was required to perform. But in Experiment 2 the tasks that the participants were performing (animacy decisions and translation) exerted no pressure to learn the noun class distinction. The same would be true of simulations of those tasks. The only way in which the noun class distinction could be learned, therefore, would be through model learning, which requires a different kind of network from that used in Simulation 1.

One way of instantiating model learning is to train a three-layer network to associate each input to itself. That is, the network learns to reproduce the input pattern on the output layer. These are called “autoassociation” networks (Plunkett and Elman 1997). Because there are fewer hidden than input/output units the network is forced to discover an economical means of representing the patterns so that they can be reproduced on the output. This gives the network the potential to extract generalisations. Autoassociation networks do not require feedback because the input itself provides the reference point against which the accuracy of the output can be judged. How does such a network fare on the arbitrary noun class induction problem?

In Simulation 2 there were 31 input units representing the eight determiners, eight stems, two inflections, three English determiners, eight English nouns, and units for singular and plural. All of the relevant information in a training item such as ul johombe, ‘a ball’ was represented as a pattern over the input layer. The 31 output units represented the same information as the input units. The network had 20 hidden units.5 For each item in the training set the network was trained to reproduce the input pattern on the output layer. Training continued until output error ceased to decline (which was after about 2,500 cycles).
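The architecture just described can be sketched as a single forward pass through an untrained 31-20-31 network in plain Python. This is an illustration only, not the Tlearn model: the sigmoid units, weight range and random seed are assumptions, and all stem and English noun names other than johomb, nawas and ‘ball’ are hypothetical placeholders.

```python
import math
import random

# Unit layout for the 31-unit input and output layers. Stem and English noun
# names other than 'johomb', 'nawas' and 'ball' are hypothetical placeholders.
FIELDS = {
    "det": ["ga", "ula", "ge", "tegge", "i", "tei", "ig", "ul"],
    "stem": ["johomb", "nawas"] + [f"stem{k}" for k in range(3, 9)],
    "infl": ["-e", "-i"],
    "eng_det": ["a", "the", "some"],
    "eng_noun": ["ball"] + [f"noun{k}" for k in range(2, 9)],
    "number": ["singular", "plural"],
}
UNITS = [(field, value) for field, values in FIELDS.items() for value in values]
N_UNITS, N_HIDDEN = len(UNITS), 20


def encode(**item):
    """Localist pattern with one active unit per field of the item."""
    return [1.0 if item.get(field) == value else 0.0 for field, value in UNITS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """One fully connected layer of sigmoid units."""
    return [sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]


rng = random.Random(0)  # fixed seed (assumption) so the sketch is reproducible
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_UNITS)] for _ in range(N_HIDDEN)]
w_output = [[rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)] for _ in range(N_UNITS)]
b_hidden, b_output = [0.0] * N_HIDDEN, [0.0] * N_UNITS

# Training item 'ul johombe', 'a ball'
pattern = encode(det="ul", stem="johomb", infl="-e", eng_det="a",
                 eng_noun="ball", number="singular")
hidden = layer(pattern, w_hidden, b_hidden)
output = layer(hidden, w_output, b_output)
target = pattern  # autoassociation: the input is its own teaching signal
error = [t - o for t, o in zip(target, output)]
```

The point of the sketch is the last three lines: the error that drives weight changes is computed against the input pattern itself, so no external feedback is needed.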

In Experiment 2, learning was assessed by forcing participants to choose between two translations for a phrase, for example, between ga johombe and ig johombe as translations of ‘the ball’. The model can be tested in the same way by presenting both grammatical and ungrammatical determiner-noun combinations and comparing the strength of the output on the determiner units. For a trained item, such as ul johombe, the strength of activation of the corresponding determiner in the output, in this case ul, was, as one would expect, very high (0.996 when averaged over eight training items on five separate runs, where the required activation level was 1.0). Ungrammatical items such as ula johombe produced much weaker activation of the corresponding output determiner node, in this case ula (0.214). Clearly the network had not simply learned to reproduce input patterns on the output layer. Rather, its ability to do so was affected by whether it had received those patterns during training. In human terms this would be the equivalent of a greater feeling of familiarity for ul johombe than ula johombe. But for generalisation items the output activation on determiners in both grammatical and ungrammatical items, e.g. ig johombe versus ga johombe, was very low and not significantly different (0.054 and 0.055 respectively). In other words, both items appeared equally unfamiliar to the network. Therefore, like the human participants in Experiment 2, the autoassociation network had good memory for trained items, but was unable to distinguish between grammatical and ungrammatical generalisation items.
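The comparison of determiner activations can be expressed as a simple forced-choice rule. The sketch below applies such a rule to the mean activations reported above; the rule itself and the tolerance value are illustrative assumptions, not part of the original simulation.

```python
def forced_choice(activation_a, activation_b, tolerance=0.01):
    """Pick the candidate whose determiner unit is more active.

    Returns 'a', 'b', or None when the difference is within `tolerance`
    (the tolerance value is an illustrative assumption).
    """
    difference = activation_a - activation_b
    if abs(difference) <= tolerance:
        return None
    return "a" if difference > 0 else "b"


# Mean activations reported in the text
trained = forced_choice(0.996, 0.214)         # 'ul johombe' vs 'ula johombe'
generalisation = forced_choice(0.054, 0.055)  # 'ig johombe' vs 'ga johombe'
```

On these values the rule selects the grammatical trained item but cannot separate the two generalisation candidates, which mirrors the pattern of human performance in Experiment 2.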

The contrast between Simulations 1 and 2 demonstrates that task learning enabled a connectionist network to become sensitive to an abstract noun class distinction whereas model learning did not. This is a rather surprising result when one considers that there is a sense in which the networks were performing rather similar tasks. In both cases they had to remember which determiners occurred with which configurations of noun, definiteness, and number in the training items. The difference was that in Simulation 1 the network’s resources were focused on predicting the determiner from the cues that it was provided, whereas in Simulation 2 the network was actually attempting to remember the unique combination of determiner, noun, definiteness, and number that occurred in each training item. This exercise in episodic memory for entire training episodes apparently did not exert sufficient pressure on the network to discover the underlying noun class distinction.

The contrast between task learning and model learning is reminiscent of the procedural-declarative distinction in Anderson’s ACT framework (Anderson 1983). Productions are sets of rules which match their ‘IF’ conditions against the current contents of working memory, and if these are satisfied, they ‘THEN’ produce some action, or deposit some other kind of representation in working memory. Although stated in a symbolic formalism in ACT, a connectionist network can be conceived as a subsymbolic model of the entire set of productions which perform the transformation from one type of input to another type of output (Sun et al. 2001). Both procedural learning and connectionist learning of this type depend on error correction. In contrast, Simulation 2 could be identified with the ‘declarative’ memory component of the ACT framework. The idea that these two kinds of memory system might have differential power to extract generalisations from the environment is clearly relevant to attempts to construct a theory of second language acquisition in terms of their interaction (Towell and Hawkins 1994), or to identify them with different brain regions (Ullman 2001). Indeed, the idea that procedural learning is more powerful at extracting abstract linguistic rules would be consistent with the proposal that such a mechanism supports first language acquisition, whereas second language acquisition is supported by declarative learning (Ullman 2001).

If Simulation 1 is accepted as a valid model of the learning process in Experiment 1 then there is another interesting consequence. The learning that was occurring in that experiment was characterised as ‘explicit’. Not only did the participants appear to have an intention to learn, but some of them also made comparisons between consciously recalled input items, and formed conscious hypotheses. Simulation 1 captures the intentional component of the learning process, since it too evaluated its outputs with respect to feedback for the purpose of learning in order to be able to generate determiners. But it obviously does not model the other components of what, in human terms, we regard as explicit learning. However, this does not necessarily detract from the relevance of the model. Shanks (1995) reviews a range of studies on human learning where there is a good fit between human behaviour and connectionist models. Yet in many of these experiments the participants were actively searching for rules. For example, in a medical diagnosis task (see Shanks 1995:42) participants were presented with hypothetical patients with certain symptoms and were instructed to diagnose what illness each patient had. Each trial was accompanied by feedback in the form of the correct diagnosis. Performance was directly related to the degree of contingency between different cues (symptoms) and outcomes (diseases) in the training data. Shanks showed that the results could be adequately modelled by a simple connectionist network in which symptoms were presented as inputs, diagnoses as outputs, and the correct diagnosis was provided as feedback (Shanks 1995:120). Yet the participants in the experiment presumably had the experience of actively trying to work out the relationship between symptoms and diseases. Whilst it is presently unclear how the conscious states of the learner influence the learning mechanism, it should not be assumed that the possibility of there being such interactions rules out a unified associative explanation (Cleeremans and Jiménez 2002).

6. Experiment 3

With the benefit of hindsight Experiment 2 was a poor test of implicit learning because the kind of associative learning mechanism supposed to underlie implicit and incidental learning would not be expected to learn the underlying generalisations. We6 therefore decided to run Experiment 2 again, but this time using a system that would be learnable even by the kind of autoassociation network used in Simulation 2. The language was essentially the same as that shown in Table 1 except that the meanings of the words were altered so that all of the words in class I referred to living things and all of the words in class II referred to inanimate objects. For simplicity, the living/non-living distinction will be referred to here in terms of an ‘animacy’ cue to noun class. It has been shown that under intentional learning conditions humans have no problem grasping semantically-based noun classes (Braine 1987, Carroll 1999). A version of Simulation 2 that included animacy information confirmed that the present system was also learnable by an autoassociation network. This is presumably because there are direct associations between the units that encode animacy and certain determiners. Note, therefore, that in this experiment we are no longer concerned with whether implicit learning of abstract noun classes is possible. Rather the issue is whether implicit learning of a noun class distinction can be obtained under conditions where the connectionist model predicts that there should be an effect.

The tasks and procedure were exactly the same as in Experiment 2. For each phrase presented during training the participants had to repeat it, indicate whether it referred to a living or nonliving thing, and translate it into English. Note that this time the living/nonliving decision coincided with the noun class of the word. The same learning tests were used as in Experiment 2. There were 37 participants with varied language backgrounds.

Only seven of the participants became aware of the noun class distinction and its relation to animacy during the training phase, and their performance was perfect, or near perfect, on the generalisation and trained items. None of the remaining 30 participants became aware of the system during the training phase and none of them claimed to have been consciously trying to work out the system during the generalisation test. Even at the end of the whole testing phase none of them realised the relevance of animacy. Nevertheless, performance on generalisation items was 61%, which was significantly above the chance level of 50%, t=3.25, p<0.01. They scored 71% correct on trained items, which was also significantly above chance, t=6.09, p<0.001. Therefore, Experiment 3 succeeded in demonstrating at least some degree of implicit learning of a system that was also learnable by an autoassociation network.

However, there were large individual differences in generalisation test performance. Just as in Experiment 1 there were correlations with phonological short-term memory (r=0.50, p<0.01) and knowledge of gender languages (r=0.586, p<0.001), which in this case was quantified simply in terms of the number of gender languages in which the participants rated their proficiency as intermediate or better (mean=1.8, range=0 to 5). We also evaluated whether, amongst the 30 unaware participants, speakers of gender L1s did better than speakers of non-gender L1s. For the 13 speakers of gender L1s mean generalisation test performance was 71%, which is significantly above chance, t=4.08, p<0.01, whereas for the 17 speakers of non-gender L1s it was 54%, which is not significantly above chance, t=0.96. The difference between these two groups was significant, t=2.78, p<0.01. The two groups did not differ significantly in terms of the number of L2s spoken to an intermediate level or better (3.54 and 3.12 respectively, t<0.92) or the number of gender languages known as an L2 (the means were 1.46 and 1.23 respectively), but they did differ slightly in terms of phonological short term memory (77% versus 68%, p=0.08). Better matched groups resulted from removing the three participants with the lowest phonological short term memory scores from the sample (all scores were less than 50%, and all three participants were in the non-gender L1 group). The 13 speakers of gender L1s and remaining 14 speakers of non-gender L1s were well matched in terms of number of gender languages spoken as an L2 (1.46 and 1.43 respectively) and in terms of phonological short term memory scores (77% and 73%). Yet the generalisation scores were 71% and 55% (the difference being significant, t=2.31, p<0.05). Note that for the non-gender L1 group the mean for the trained test items was well above chance (67%, t=3.74, p<0.01).

7. Discussion

In one sense it could be argued that there is a good alignment between the connectionist models and the human data in the present studies, provided assumptions are made about which kind of network is appropriate to which task conditions. Where the model was able to generalise there was also evidence for generalisation amongst the participants in the experiment (Simulation 1 and Experiment 1, Simulation 2 supplemented by animacy information and Experiment 3). Where the model was not able to generalise there was no evidence for generalisation amongst the human participants (Simulation 2 and Experiment 2). The problem is, however, that the networks only seem to account for learning amongst those participants who already possessed knowledge of other gender languages. Yet none of the networks contained any prior knowledge. Seen in this light they provide a poor fit to the human data. In this final section I shall consider ways in which prior knowledge could have influenced human learning, and whether the data then become more amenable to a connectionist interpretation. I shall then consider the implications of the present results for second language acquisition.

7.1 The role of prior linguistic knowledge

One way in which prior knowledge could facilitate learning is through its effect on the learners’ strategy. Recall that the success of Simulation 1 depended upon using number, definiteness, and an abstract representation of the nouns (represented as single nodes) to generate the determiners. But this presupposes a certain understanding of the nature of gender systems. Participants who did not have this understanding may simply have approached the task in the way that it was presented to them; that is, as a short-term memory exercise for determiner-noun combinations. In that case their learning processes would be more appropriately modelled by Simulation 2 than Simulation 1. Indeed, the contrast between Simulations 1 and 2, between task learning and model learning, could be seen as a computational account of a more general contrast between analytic and non-analytic, memory-based, learning strategies (Skehan 1998). In the present case the probability of adopting an appropriate analysis strategy could have also depended upon metalinguistic knowledge of other gender systems that was derived from second language learning experience.

Obviously a learning strategy account cannot apply to the kind of incidental and implicit learning occurring in Experiment 3. However, in this case learning failures could be accounted for simply by assuming that animacy was not perceived as being relevant to the determiners. In Williams (in preparation) I argue that implicit learning of form-meaning connections (such as between determiners and animacy information) is problematic because of the requirement that form and meaning are unitised at encoding; learners must actually perceive them as being relevant to each other. Merely paying attention to the relevant elements does not appear to be sufficient, at least not under the task conditions of Experiment 3. In terms of the model learning mechanism instantiated by Simulation 2 this means that even though animacy information was attended, it did not enter the same memory trace as information about the determiner, noun identity, definiteness, and number. The problem is, therefore, to explain why participants who spoke a gender L1 defied this principle and were able to unconsciously associate the determiners with animacy information. There is no obvious connectionist answer to this problem. Is the classical linguistic approach any more promising?

Linguistic (Carroll 1989, Hawkins 2001) and psychological (Levelt, Roelofs and Meyer 1999, Vigliocco, Antonini and Garrett 1997) analyses of gender representation and processing in the L1 assume abstract gender features that are attached to nouns in the lexicon. How gender features are acquired is not often considered. However, Carroll (2001) proposes an induction procedure which is triggered by the presence of alternating determiner forms in the input (e.g. two words for ‘some’). The first occurrence of one of the determiners, for example in tei johombi (‘some monkeys’) has no effect. But when another phrase involving a word for ‘some’ is encountered, for example tegge nawasi (‘some vases’), the learner seeks to rationalise the contrast by marking the noun with a [+gender] feature. In this way, one of the determiners becomes an assigner of the gender feature whilst the other remains the default. Remembering which of the alternating pair of determiners assigns the gender feature is likely to be problematic, however. In the (admittedly artificial) case that [+gender] also corresponds to some other active feature of the noun, such as [+inanimate], one can imagine that this problem would be alleviated. To account for the influence of gender L1s in Experiment 3 it would have to be assumed that this kind of induction mechanism can only operate in L2 if it was used in the L1. This is perhaps not too implausible if one considers that each time a speaker of a gender language encounters a novel noun the same process of using the accompanying determiner to assign gender to it must operate. On the other hand, it is another matter to assume that, when confronted with a new language, learners are able to assign new gender features on the basis of newly observed alternations between determiners. It is also relevant to consider that at present there is no evidence that speakers of gender L1s have any less problem with gender in an L2 than do speakers of non-gender L1s (Bruhn and White 2000). Thus, although the gender L1 advantage found in Experiment 3 is intriguing, there is no obvious way of accounting for it at the present time from either connectionist or classical perspectives.
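The alternation-triggered induction procedure attributed to Carroll (2001) above can be caricatured in a few lines of Python. This is a loose sketch of the description given in this section, not an implementation of Carroll's proposal. In particular, treating the second form encountered as the gender assigner is an assumption made for illustration, and all function and variable names are hypothetical.

```python
def induce_gender(phrases):
    """Sketch of an alternation-triggered induction procedure.

    `phrases` is a sequence of (determiner, noun, meaning) triples, where
    `meaning` is the determiner's translation (e.g. 'some').
    """
    default_form = {}      # meaning -> first determiner form encountered
    assigners = set()      # determiner forms that assign [+gender]
    gender_marked = set()  # nouns carrying the [+gender] feature

    for determiner, noun, meaning in phrases:
        if meaning not in default_form:
            # First form for this meaning: nothing to rationalise yet.
            default_form[meaning] = determiner
        elif determiner != default_form[meaning]:
            # An alternating form: treat it as a gender assigner (assumption).
            assigners.add(determiner)
        if determiner in assigners:
            gender_marked.add(noun)
    return default_form, assigners, gender_marked


default_form, assigners, gender_marked = induce_gender([
    ("tei", "johombi", "some"),
    ("tegge", "nawasi", "some"),
])
```

With the two example phrases from the text, tei remains the default form for ‘some’, tegge becomes the assigner, and only nawasi is marked [+gender].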

7.2 Implications for second language acquisition

When considering second language acquisition, particularly under naturalistic un-instructed conditions, it is relevant to consider the power of incidental learning mechanisms; that is, learning that takes place as a natural consequence of processing the relevant stimuli for purposes other than discovering the underlying regularities. This means that we should consider implicit learning conditions like those in Experiments 2 and 3 and learning mechanisms of the type exemplified by Simulation 2 as being the most relevant. Granted this assumption, then the prospects for associative learning of abstract noun classes would appear to be bleak.

However, one limitation of Experiment 2 and Simulation 2 is that they employed a completely arbitrary noun class system. As mentioned earlier, it has been argued that in many natural languages at least a proportion of the members of the same noun class share phonological and semantic properties. Could the presence of these cues facilitate learning? In fact a number of experimental studies have shown that partial phonological and semantic cues do indeed facilitate noun class induction (Braine 1987, Brooks et al. 1993, Frigo and McDonald 1998). However, these studies have only demonstrated an effect of partial cues under intentional learning conditions similar to those in Experiment 1. There have been no demonstrations of their effect upon implicit learning. Indeed, in my own preliminary investigations of such systems, networks of the type used in Simulation 2 have failed to generalise to unmarked words (whereas a network such as that of Simulation 1 would clearly have no problem).

Even under the intentional learning conditions of the earlier experiments there was very little evidence of generalisation to items that did not carry the appropriate cues. The adults in the study of Brooks et al. (1993) showed barely a significant effect using a one-tailed test (which assumes that the direction of the difference is predicted), and for the children in their second experiment there was no evidence of generalisation at all. Given that seven out of the 16 adults had explicit knowledge of the word classes, whereas only one of the children did, it seems likely that these participants were responsible for the slightly above-chance performance of the group as a whole. Generalisation to unmarked nouns would therefore seem to be unlikely under implicit conditions.

Only in one of three experiments of Frigo and McDonald (1998) was performance on unmarked generalisation items significantly above chance, and this was when word class was indicated by a characteristic initial and final syllable (e.g. wanersumglot, wanolovglot, wanalglot versus kaisalmrish, kaisilvrish, kaisalbrish). Braine (1987) also obtained good generalisation to unmarked words, but half of the nouns in one class referred to males and the other half to females. Thus, generalisation appears to be limited to cases where the cues are more salient than in natural languages.

Somewhat counter-intuitively, where the above studies did find evidence of generalisation to unmarked items was when entirely novel nouns were introduced in the final test phase. The equivalent test in the context of the language used here would involve telling participants that ul vark means ‘a dog’ and asking them to produce the translation of ‘the dog’ (the correct answer being ig vark). Such a test only requires knowledge of the associations between the determiners. Therefore, it does appear that partial phonological cues can facilitate acquisition of inter-determiner associations (or rather, their equivalent in the languages that were used). Determiners in the same class presumably become associated by virtue of their frequent association to the same phonological cue. Generalisation is then achieved by a process of inference from another determiner-noun combination that is provided at test or recalled from memory. As argued by Frigo and McDonald (1998), poor performance on generalisation tests involving nouns that occurred in training could be because of problems recalling an example of a determiner that occurred with that noun. But the native speaker of a gender language is assumed to generate an appropriate determiner directly on the basis of an abstract specification of the noun’s gender in the lexicon, not by inference. It is far from clear that the participants in these experiments acquired knowledge of noun classes in that sense.
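The inference route described above (from one determiner-noun pair to another determiner of the same class) can be sketched as a simple lookup. This is an illustrative assumption about how such inference might be modelled, not the procedure used in any of the cited studies. The forms ig, ul, tei and tegge appear in the text; the remaining forms (ga, ge, ula) are derived here by applying the consonant substitutions described in Note 1 to the Italian determiners, and the function labels (`the_sg`, `a`, and so on) are hypothetical names.

```python
# Two determiner paradigms of the artificial language. Forms not quoted in
# the text (ga, ge, ula) are derived via the substitutions given in Note 1.
DETERMINER_CLASSES = (
    {"the_sg": "ig", "the_pl": "i", "a": "ul", "some": "tei"},
    {"the_sg": "ga", "the_pl": "ge", "a": "ula", "some": "tegge"},
)


def infer_determiner(known_determiner, target_function):
    """Infer a determiner from one that is known to occur with the noun.

    Only inter-determiner associations are used: no gender feature is
    stored on the noun itself.
    """
    for paradigm in DETERMINER_CLASSES:
        if known_determiner in paradigm.values():
            return paradigm[target_function]
    raise ValueError(f"unknown determiner: {known_determiner!r}")


# 'ul vark' means 'a dog'; the translation of 'the dog' is inferred.
the_dog = infer_determiner("ul", "the_sg") + " vark"
```

Note that the lookup succeeds for an entirely novel noun such as vark, because the noun plays no part in the inference. That is the sense in which such a test does not require knowledge of noun classes.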

The results from these studies do not, therefore, offer much prospect of incidental learning of noun classes. This is of course consistent with the claim that gender is a persistent problem for second language learners. Assuming an underlying model learning mechanism such as that in Simulation 2, learning would be predicted to be limited to rote storage of determiner-noun combinations, and associations between determiners and partial phonological and semantic cues. This would explain L2 learners’ sensitivity to phonological cues in gender processing tasks (Guillelmon and Grosjean 2001, Holmes and De la Batie 1999, Taraban and Kempe 1999). Unmarked nouns would have to be dealt with through rote storage, putting a strain on phonological memory (Williams and Lovatt 2003). The lack of a true underlying noun class organisation would make storage of determiner-noun combinations particularly prone to error, but if at least one instance of a determiner-noun pair can be retrieved, other appropriate determiners could be inferred using knowledge of inter-determiner associations. Thus, second language learners can acquire a semblance of competence but the failure to organise the underlying representations in terms of abstract noun classes will cause persistent problems. I have argued that this reflects a weakness in the type of associative learning mechanism that is assumed to underlie incidental learning.

Notes

1. This language was derived from Italian. The determiners were derived from the Italian il, i, un, dei, la, le, una, and delle by systematically substituting consonants (l→g, d→t, n→l). The nouns correspond to Italian nouns which end in -e in the singular and -i in the plural regardless of gender, e.g. cliente (masculine), stazione (feminine). Note that none of the participants in Experiments 1 and 2 (reported below) had any knowledge of Italian, and only two participants in Experiment 3 knew Italian at an intermediate level or better as an L2.

2. The only difference was that in the experiment they also had to produce the inflection, whereas in the simulation the inflection was provided on the input. However, in the experiment the participants learned the correct plural inflections in the preliminary vocabulary learning phase, and not in the training phase of the main experiment. In any case the inflection provides no clue as to the correct determiner over and above the presence or absence of the plurality of the noun.

3. A Root Mean Square error of 0.1 means that over all of the input patterns presented on a particular cycle the average difference between the actual output and the required output on each node was 0.1 units of activation. The point at which the correct output node was simply the most active occurred well before an RMS error of 0.1 was achieved.

4. The Luce ratio was also used as a measure of network performance: the activation level of the correct output node divided by the sum of the activation over all output nodes. Perfect output would be indicated by a Luce ratio of 1.0. In this simulation the mean Luce ratio over 20 runs was 0.87.

5. The number of hidden units was set to about two thirds of the number of input/output units so as to force the inputs through a reduced representational space, exerting pressure on the network to extract generalisations. Other simulations were performed with either 10 or 40 hidden units but the generalisation performance was similar to that reported here.

6. This experiment was run in collaboration with Helen East.
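The consonant substitutions in Note 1 and the two performance measures in Notes 3 and 4 can be illustrated with a short sketch. The function names are hypothetical, and the exact pooling over patterns and nodes used for the RMS error in the original simulations is an assumption; the code simply follows the verbal definitions in the notes.

```python
import math

# Note 1: l->g, d->t, n->l. str.translate applies the substitutions
# simultaneously, so an 'l' produced from 'n' is not changed again to 'g'.
CONSONANT_MAP = str.maketrans({"l": "g", "d": "t", "n": "l"})
ITALIAN_DETERMINERS = ["il", "i", "un", "dei", "la", "le", "una", "delle"]


def derive_determiner(italian_form):
    """Derive the artificial determiner from its Italian source form."""
    return italian_form.translate(CONSONANT_MAP)


def rms_error(actual, required):
    """Note 3: root mean square difference between actual and required
    activations (assumed here to be pooled over all values supplied)."""
    squared = [(a - r) ** 2 for a, r in zip(actual, required)]
    return math.sqrt(sum(squared) / len(squared))


def luce_ratio(activations, correct_index):
    """Note 4: activation of the correct output node divided by the
    summed activation over all output nodes."""
    return activations[correct_index] / sum(activations)


derived = [derive_determiner(form) for form in ITALIAN_DETERMINERS]
```

Applied to the eight Italian determiners, the mapping yields the forms quoted in the chapter (ig, ul, tei, tegge) together with i, ga, ge and ula.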

References

Anderson, J.R. 1983. The architecture of cognition. Cambridge, MA: Harvard University Press.

Berent, I., Marcus, G.F., Shimron, J. and Gafos, A.I. 2002. “The scope of linguistic generalizations: Evidence from Hebrew word formation”. Cognition 83: 113–139.

Braine, M.D.S. 1987. “What is learned in acquiring word classes: A step towards an acquisition theory”. In Mechanisms of language acquisition, B. MacWhinney (ed.), 65–87. Hillsdale, NJ: Lawrence Erlbaum.

Braine, M.D.S., Brody, R.E., Brooks, P.D., Sudhalter, V., Ross, J.E., Catalano, L. and Fisch, S.M. 1990. “Exploring language acquisition in children with a miniature artificial language: Effects of item and pattern frequency, arbitrary subclasses, and correction”. Journal of Memory and Language 29: 591–610.

Brooks, P.J., Braine, M.D.S., Catalano, L. and Brody, R. 1993. “Acquisition of gender-like noun classes in an artificial language: The contribution of phonological markers to learning”. Journal of Memory and Language 32: 76–95.

Bruhn, J. and White, L. 2000. “L2 acquisition of Spanish DPs: the status of grammatical features”. In Proceedings of the 24th annual Boston University conference on language development. Vol. 1, S.C. Howell, S.A. Fish and T. Keith-Lucas (eds), 164–175. Somerville, Mass.: Cascadilla Press.

Carroll, S. 1989. “Second language acquisition and the computational paradigm”. Language Learning 39: 535–594.

Carroll, S.E. 1999. “Input and SLA: Adults’ sensitivity to different sorts of cues to French gender”. Language Learning 49: 37–92.

Carroll, S.E. 2001. Input and evidence: The raw material of second language acquisition. Amsterdam: John Benjamins.

Caselli, M.C., Leonard, L.B., Volterra, V. and Campagnoli, M.G. 1993. “Toward mastery of Italian morphology: A cross-sectional study”. Journal of Child Language 20: 377–393.

Cleeremans, A. and Jiménez, L. 2002. “Implicit learning and consciousness: A graded, dynamic perspective”. In Implicit learning and consciousness, R.M. French and A. Cleeremans (eds), 1–40. Hove: Psychology Press.

Corbett, G. 1991. Gender. Cambridge: Cambridge University Press.

Ellis, N.C. 1998. “Emergentism, connectionism and language learning”. Language Learning 48: 631–664.

Elman, J.L. 1990. “Finding structure in time”. Cognitive Science 14: 179–211.

Fodor, J.A. and Pylyshyn, Z.W. 1988. “Connectionism and cognitive architecture: A critical analysis”. Cognition 28: 3–71.

Frigo, L. and McDonald, J.L. 1998. “Properties of phonological markers that affect the acquisition of gender-like subclasses”. Journal of Memory and Language 39: 218–245.

Gómez, R.L. and Gerken, L. 2000. “Infant artificial language learning and language acquisition”. Trends in Cognitive Sciences 4: 178–186.

Guillelmon, D. and Grosjean, F. 2001. “The gender marking effect in spoken word recognition: The case of bilinguals”. Memory and Cognition 29: 503–511.

Hawkins, R. 2001. Second language syntax: A generative introduction. Oxford: Blackwell.

Holmes, V.M. and De la Batie, B.D. 1999. “Assignment of grammatical gender by native speakers and foreign language learners”. Applied Psycholinguistics 20: 479–506.

Johnstone, T. and Shanks, D.R. 1999. “Two mechanisms in implicit artificial grammar learning? Comment on Meulemans and Van der Linden 1997”. Journal of Experimental Psychology: Learning, Memory, and Cognition 25: 524–531.

Kelly, M.H. 1992. “Using sound to solve syntactic problems: The role of phonology in grammatical category assignments”. Psychological Review 99: 349–364.

Knowlton, B.J. and Squire, L.R. 1996. “Artificial grammar learning depends on implicit acquisition of both abstract and exemplar-specific information”. Journal of Experimental Psychology: Learning, Memory, and Cognition 22: 169–181.

Levelt, W.J.M., Roelofs, A. and Meyer, A.S. 1999. “A theory of lexical access in speech production”. Behavioural and Brain Sciences 22: 1–75.

Maratsos, M.P. and Chalkley, M.A. 1980. “The internal language of children’s syntax: The ontogenesis and representation of syntactic categories”. In Children’s Language Vol. 2, K. Nelson (ed.), 127–214. New York: Gardner Press.

Marcus, G.F. 1999. “Language acquisition in the absence of explicit negative evidence: Can simple recurrent networks obviate the need for domain-specific learning devices?” Cognition 73: 293–296.

Marcus, G.F., Vijayan, S., Bandi Rao, S. and Vishton, P.M. 1999. “Rule learning in 7-month-old infants”. Science 283: 77–80.

Mathews, R.C., Buss, R.R., Stanley, W.B., Blanchard-Fields, F., Cho, J.-R. and Druhan, B. 1989. “The role of implicit and explicit processes in learning from examples: A synergistic effect”. Journal of Experimental Psychology: Learning, Memory, and Cognition 15: 1083–1100.

Meulemans, T. and Van der Linden, M. 1997. “Associative chunk strength in artificial grammar learning”. Journal of Experimental Psychology: Learning, Memory, and Cognition 23: 1007–1028.

O’Reilly, R.C. and Munakata, Y. 2000. Computational explorations in cognitive neuroscience: Understanding the mind by simulating the brain. Cambridge, MA: MIT Press.

Pinker, S. 1984. Language learnability and language development. Cambridge, Mass.: Harvard University Press.

Plunkett, K. and Elman, J.L. 1997. Exercises in rethinking innateness: A handbook for connectionist simulations. Cambridge, MA: MIT Press.

Redington, M. and Chater, N. 1998. “Connectionist and statistical approaches to language acquisition: A distributional perspective”. Language and Cognitive Processes 13: 129–191.

Rumelhart, D.E. and McClelland, J.L. 1986. “On learning the past tense of English verbs”. In Parallel distributed processing: Explorations in the microstructure of cognition Vol. 2, J.L. McClelland and D.E. Rumelhart (eds). Cambridge, MA: MIT Press.

Saffran, J.R. 2001. “The use of predictive dependencies in language learning”. Journal of Memory and Language 44: 493–515.

Seidenberg, M.S. and McClelland, J.L. 1989. “A distributed, developmental model of word recognition and naming”. Psychological Review 96: 523–569.

Shanks, D.R. 1995. The psychology of associative learning. Cambridge: Cambridge University Press.

Skehan, P. 1998. A cognitive approach to language learning. Oxford: Oxford University Press.

Sokolik, M.E. and Smith, M.E. 1992. “Assignment of gender to French nouns in primary and secondary language: A connectionist model”. Second Language Research 8: 39–58.

Sun, R., Merrill, E. and Peterson, T. 2001. “From implicit skills to explicit knowledge: a bottom-up model of skill learning”. Cognitive Science 25: 203–244.

Taraban, R. and Kempe, V. 1999. “Gender processing in native and nonnative Russian speakers”. Applied Psycholinguistics 20: 119–148.

Tomasello, M. 2000. “The item-based nature of children’s early syntactic development”. Trends in Cognitive Sciences 4: 156–163.

Towell, R. and Hawkins, R. 1994. Approaches to second language acquisition. Clevedon: Multilingual Matters.

Ullman, M.T. 2001. “The neural basis of lexicon and grammar in first and second language: The declarative/procedural model”. Bilingualism: Language and Cognition 4: 105–122.

Vigliocco, G., Antonini, T. and Garrett, M.F. 1997. “Grammatical gender is on the tip of Italian tongues”. Psychological Science 84: 314–317.

Williams, J.N. In preparation. “Implicit learning of form-meaning connections”.

Williams, J.N. and Lovatt, P. 2003. “Phonological memory and rule learning”. Language Learning 53: 67–121.

</TARGET "wil">

<TARGET "sab" DOCINFO AUTHOR "Laura Sabourin and Marco Haverkort"TITLE "Neural substrates of representation and processing of a second language"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 8

Neural substrates of representation and processing of a second language*

<LINK "sab-n*">

Laura Sabourin and Marco Haverkort
University of British Columbia / University of Nijmegen & Boston University

1. Introduction

Most research in second language acquisition (as well as in first language acquisition) does not make a careful enough distinction between the different levels at which language behaviour and changes in the language of the learner are described. The description is usually cast in terms of a representation of grammatical knowledge available to the learner, and changes in language behaviour are viewed as the result of qualitative changes in that knowledge, for instance the addition of a rule or the resetting of a parameter. In this paper, we want to argue that it is important to distinguish between the representation of grammatical knowledge, the language processor, and general cognitive strategies in adult second language acquisition.

In order to be able to distinguish between grammar and processor in second language acquisition, we will compare results obtained with different methods. In particular, we will use an off-line grammaticality judgment task to tap grammatical knowledge of second language learners, and on-line EEG measurements to investigate to what extent the processing strategies of second language learners are qualitatively similar to those used by native speakers. Specifically, we will investigate the use of grammatical gender to see if L2 processing is strictly linguistic in nature or depends to some degree on more general cognitive strategies.

In the next sections we will first look at evidence from the field of aphasia that a distinction between knowledge and grammar on the one hand and the processor on the other should be made. We will then look at two L2 experiments to see if this distinction also holds for L2.

2. Grammar versus processor: Evidence from aphasia

There is quite extensive evidence from another domain of linguistic inquiry, the study of aphasia, that supports the idea that the representation of grammatical knowledge on the one hand and language processing on the other are dissociated entities. According to this view, aphasics have the knowledge of their language available, but cannot process language on-line, due to working memory or other processing limitations (cf. Kolk 1995, 1998).

There are a number of observations that support the idea that aphasics still have access to the correct grammatical representations, but that their access is too slow for adequate on-line processing. First, although there is a spontaneous recovery process post-onset for virtually all patients, there is no indication that the representations must be re-acquired or relearned.

Second, the fact that patients exhibit task-dependent variation supports the view that aphasics have access to the grammatical representations. They perform at chance level with certain constructions (such as object relative clauses, passives, and object clefts; Caplan and Hildebrandt 1988, Grodzinsky 1990) in a sentence-picture matching task, where they have to select the picture that correctly depicts the sentence they were just given. These same patients can perform much better (close to ceiling level) in a grammaticality judgment task. Linebarger et al. (1983) and Grodzinsky and Finkel (1998), however, found that their patients had problems with the grammaticality judgment task, especially when the sentences to be judged involved antecedent-trace dependencies. The sentence-picture matching task is more complex and involves more processing than the grammaticality judgment task: a syntactic structure needs to be established, onto which a semantic representation is then mapped; subsequently, the pictures need to be analysed, resulting in conceptual structures, and finally these conceptual structures have to be compared with the semantic representation of the sentence in order to find the best match. In a grammaticality judgment task, on the other hand, only the first step needs to be taken: a syntactic representation needs to be established, and if the structure under construction fails before it is finished, the sentence is marked as ungrammatical (Chomsky 1995). A full semantic structure does not need to be computed, as it does in the sentence-picture matching task. Thus, if patients have a processing problem, it is to be expected that they will perform better on the grammaticality judgment task, which is simpler computationally and requires less storage capacity. The fact that they perform much better on grammaticality judgments clearly indicates that the grammatical knowledge must be available at some level.

Third, it has been shown in several studies (Burkhardt et al. 2001, Haarmann 1993, Kolk 2002, among others) that aphasics exhibit syntactic and semantic priming effects. In a syntactic priming task, unimpaired subjects are quicker in a lexical decision task if the target is a word that syntactically fits into the sequence of words heard or read up to the point of presentation of the target. This reflects the fact that language users have clear expectations about what syntactic category is to come next. For aphasics, these effects also show up, but only with stimulus onset asynchronies (SOA) that are larger than the optimal SOA for unimpaired subjects: whereas the optimal SOA for unimpaired subjects is 300 ms (with longer SOAs the effect gradually disappears), the optimal SOA for the aphasics is much larger.

Haarmann (1993) presents data from a syntactic priming study. He compared sentences such as those in (1) and found a priming effect for an unimpaired control group of about 65 ms on the last word: if that word fit the syntactic context, the unimpaired control group made the lexical decision 65 ms quicker than if it did not. The agrammatic patients showed the same priming effect (a quicker response to words that fit the syntactic context), but only when the SOA was increased from 300 ms (normals) to 1100 ms.

(1) a. Wij zijn getest/*gewandeld.
       ‘We are tested/walked.’

    b. Wij kunnen praten/*neus.
       ‘We can talk/nose.’

    c. op de tafel/*rood
       ‘on the table/red’

The fact that the aphasics showed a syntactic priming effect can only be explained under the assumption that they have the relevant knowledge (regarding phrase structure and subcategorization) at their disposal and hence have syntactic expectations as to what word class the next word will be; otherwise, no effect should be found. The fact that the optimal SOA is a little over three times as large for the aphasic population as for the control group, however, indicates that the aphasics cannot make use of the relevant knowledge quickly enough on-line; as soon as they are given more time, the exact same effect shows up as for the unimpaired population. However, for the unimpaired control group, the priming effect disappeared when items were presented at longer SOAs.

A similar priming effect has been shown to exist in filler-gap dependencies, using semantic priming. Burkhardt et al. (2001), using sentences with moved wh-phrases and DPs, presented semantically related or unrelated words (Examples 2 and 3 below) at the trace or 600 ms after the trace in object position (indicated by t_i).

(2) The kid loved the cheese which_i the brand new microwave melted t_i yesterday afternoon while the entire family was watching tv.

(3) The butter_i in the small white dish melted t_i after the boy turned on the brand new microwave.

In this experiment as well, the priming effect can only be observed in a small window, which for normals is immediately at the object position; if the semantic prime is presented with a delay, the priming effect is gradually lost in the unimpaired population, an indication that it is indeed the reactivation of the semantic content of the moved phrase at the trace position that causes the effect. Here, again, the aphasics exhibit a priming effect, but only when the semantically (un)related word is presented with a delay of 600 ms, indicating that the patients can construct the trace associated with the moved wh-phrase or DP. Thus, the representation of the relevant syntactic knowledge must be available to them; otherwise no priming effect would be expected.

These observations all point in the direction that a processing-based account of aphasic behaviour is on the right track: the knowledge base seems to be available, and can be used by the patients under particular conditions. However, the task cannot be too complicated or involve too many sub-tasks, and the patients need to be given sufficient time to do the task. At the behavioural level, young children and second language learners behave similarly to the aphasics in a number of respects (subjects are omitted, verbs are not inflected for tense and agreement but occur in the infinitival form in the corresponding syntactic position instead, and functional categories such as conjunctions, determiners, pronouns, auxiliaries, copula verbs and prepositions are omitted), which suggests that their behaviour should be explained along similar lines, viz. in terms of processing limitations, in line with Ockham’s razor (see also Avrutin, Haverkort and Van Hout 2001 and the different papers in that volume). We hypothesize that in other populations, particularly second language learners, however, these limitations are of a different nature: not so much timing restrictions, as in the aphasic population, but the use of qualitatively different processing strategies (see below).

3. Second language processing

As indicated above, the aim of this paper is to investigate the role of the representation of grammatical knowledge, language processing and general cognitive strategies in adult second language learners. It is possible that successful second language learners have native-like knowledge, just like the aphasics (suggesting that access to Universal Grammar (UG) for a second language is possible). However, they may actually process this knowledge in a non-native-like manner. Their non-native processing, though, may not be due to timing limitations as in aphasic populations (a quantitative difference) but may be a qualitative difference.

We will now look at studies that investigate whether advanced second language learners of Dutch, even if they exhibit the same knowledge as native speakers in an off-line grammaticality judgment task, exhibit the same neurophysiological responses to grammatical violations. A difference in these responses would indicate that, even though their knowledge is comparable to that of native speakers, their on-line processing differs qualitatively. Comparison of data obtained using the traditional grammaticality judgment technique with those obtained by tapping directly into electrophysiological activity in the brain associated with a specific processing phenomenon allows us to study knowledge and processing separately. Grammatical gender is specifically interesting in this respect, because it involves both lexical and syntactic aspects; hence storage, computation, and their interaction can be studied simultaneously.

4. Knowledge versus processing: Two experiments

4.1 Grammatical gender

Grammatical gender or noun classification systems are found in many of the world’s languages. Dutch is a gender language with two gender classes, marked by the definite articles de (common gender) and het (neuter gender). Originally, the language employed a three gender system with masculine, feminine and neuter categories, but the former two were conflated into one common gender. The earlier three-way system is similar to the system presently used in German, a language that is closely related to Dutch.

The following experiments investigated how second language speakers deal with local grammatical gender agreement within the noun phrase. There are two different types of agreement that fall into this category. One type is the agreement between definite determiner and noun. Common gender nouns (such as tafel, ‘table’) take the common definite determiner de and neuter gender nouns (such as kind, ‘child’) take, in the singular, the neuter definite determiner het. In the plural, the determiner de is used for both common and neuter gender nouns. The second type is found in indefinite NPs: the indefinite determiner is the same for both genders, i.e. een, and the agreement is only evident on the adjective (adjective-noun agreement). For indefinite common gender nouns, the suffix -e is added to the adjective, while for indefinite neuter nouns the adjective remains uninflected, as shown in the following examples:

(4) a. Een klein kind.
       a small-Ø child-neut

    b. Een klein-e tafel.
       a small-agr table-com
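The two local agreement patterns just described can be summarised in a small sketch. The function names and the string labels for gender are hypothetical; the adjective is returned in the hyphenated gloss form used in example (4) (e.g. klein-e), so no claim is made about Dutch spelling rules.

```python
GENDERS = ("common", "neuter")


def definite_determiner(gender, plural=False):
    """Definite article: 'het' only for singular neuter nouns, 'de' otherwise."""
    if gender not in GENDERS:
        raise ValueError(f"unknown gender: {gender!r}")
    if plural or gender == "common":
        return "de"
    return "het"


def indefinite_adjective(stem, gender):
    """Adjective in an indefinite NP with 'een', in gloss notation:
    suffix -e for common gender nouns, bare stem for neuter nouns."""
    if gender not in GENDERS:
        raise ValueError(f"unknown gender: {gender!r}")
    return f"{stem}-e" if gender == "common" else stem


# The two indefinite NPs of example (4), rebuilt from the rules above.
np_neuter = f"een {indefinite_adjective('klein', 'neuter')} kind"
np_common = f"een {indefinite_adjective('klein', 'common')} tafel"
```

The sketch covers only the two conditions tested in the experiments (definite determiner and indefinite adjective); adjective inflection in definite NPs is deliberately left out.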

4.1.1 Experiment 1: Grammatical knowledge

This first experiment was designed to determine the level of knowledge second language speakers can achieve concerning the Dutch gender agreement system. Only advanced participants with German as their native language were tested in the second language group. The task here, as for Experiment 2, was to judge the grammaticality of sentences. There were 2 types of sentences in the experimental items: the first sentence type contained either the correct or incorrect definite determiner, while the second type contained either correct or incorrect adjectival agreement.

Participants

In total 59 participants were tested on this task: 34 native speakers of Dutch formed the control group, while there were 25 second language learners with German as their native language. As we were interested in studying advanced second language learners, participants were required to have a high level of proficiency. Participants therefore had to have been using Dutch for at least three years. A proficiency score was also obtained from each second language participant; a score of 90% or more correct was required. This proficiency score was determined by testing participants on their knowledge of number and finiteness agreement.1 Information about the participants is summarized in Table 1.

Table 1. Participant information. The number of participants included for each language group along with information as to the average duration and range of exposure the German participants had to the Dutch language.

Native language    Exposure to Dutch                      Accuracy on proficiency test
Dutch (n=34)       N/A                                    Range: 90–100%; Average: 98%
German (n=25)      Range: 2–49 yrs*; Average: 11.6 yrs    Range: 92–100%; Average: 97%

* The one German subject who had less than 3 years of exposure to Dutch started teaching himself Dutch while still living in Germany, but those years were not counted, as the amount of Dutch used before moving to the Netherlands could not be determined.

Materials and methodology

The grammaticality judgment test contained 80 sentences of interest, each of which contained the critical determiner-adjective-noun sequence. Half of these items belonged to the determiner-noun agreement condition and the other half to the adjective-noun agreement condition. For the first condition, the critical nouns in the sentences were preceded either by the correct definite determiner or by the incorrect definite determiner (Example 5). In the second condition there were sentences containing indefinite NPs in which the critical noun was preceded either by the correctly or incorrectly inflected form of the adjective (Example 6). The test also included 200 filler sentences with different types of violations: 80 sentences that were used in the proficiency measure, 80 sentences looking at the use of the relative pronouns and 40 sentences looking at the form of the predicative adjective. Full details can be found in Sabourin (2003).

(5) Het/*De kleine kind probeerde voor het eerst te lopen. (DET-N agreement)
    the-neut/*com small child-neut tried for the first to walk.
    ‘The small child tried to walk for the first time.’

(6) Hij loopt op een gekke/*gek manier. (A-N agreement)
    he walks in a funny-com/*neut way.

The critical nouns used in this experiment were controlled for frequency. Half of the items were of high frequency while the other half were of a middle frequency. The middle frequency items were still of a fairly high frequency to ensure that second language participants would know them. The frequency of each item was determined through the CELEX database (Burnage 1990). The log frequency of each high frequency item was between 1.96 and 2.98 (average 2.28); for each middle frequency item, it was between 1.11 and 1.49 (average 1.31). Items were also broken down into gender class: half of the nouns were common gender nouns (de) and the other half were neuter gender ones (het).

Each subject received a grammaticality judgment questionnaire. The participants were asked to first go through the test making a yes/no decision as to the grammaticality of each sentence. They were required to complete this task in 30 minutes. After judging the grammaticality of each sentence, they were asked to go back to the beginning and correct every sentence they had marked as ungrammatical. This was done to ensure that subjects were rejecting a sentence for the right reasons and not, for instance, due to the fact that they felt that an incorrect preposition or incorrect word order had been used.

In scoring the grammaticality judgments, only sentences that were correctly judged as grammatical or ungrammatical and, when judged ungrammatical, contained a relevant correction were considered as correct answers. For example, in the ungrammatical version of the sentence in (5), repeated below as (7), the participant might correctly have said the sentence was ungrammatical, but in the correction of the sentence only changed the position of the prepositional phrase voor het eerst. If this was the case, the answer was scored as incorrect.

(7) *De kleine kind probeerde voor het eerst te lopen.
    the-com small child-neut tried for the first to walk
    ‘The small child tried to walk for the first time.’

Similarly, if the sentence was supposed to be marked as grammatical but the subject rated the sentence as ungrammatical, making a correction that was unrelated to the condition being tested, the answer was scored as correct. For example, if the above sentence had been grammatical (with het kind instead of de kind), but the subject still rated it as ungrammatical due to the position of the prepositional phrase, the answer would have been considered as correct, since correct judgment was given with respect to gender.
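The scoring rule described above can be sketched as a small function. This is only an illustrative reading of the prose, not the authors' actual scoring procedure; the function and parameter names are hypothetical, and the treatment of a grammatical sentence that is rejected with a gender-related correction (scored here as incorrect) is an assumption, since the text does not discuss that case explicitly.

```python
def score_response(item_is_grammatical, judged_grammatical, correction_targets_gender):
    """Return True if a response counts as correct under the rule sketched above.

    item_is_grammatical: whether the stimulus had correct gender agreement.
    judged_grammatical: the participant's yes/no judgment.
    correction_targets_gender: for a rejected sentence, whether the correction
        changed the gender-marked element (ignored if the sentence was accepted).
    All three names are hypothetical labels introduced for this sketch.
    """
    if item_is_grammatical:
        # Accepting is correct. Rejecting is still scored as correct when the
        # correction is unrelated to gender (e.g. moving "voor het eerst").
        # Assumption: a rejection that "corrects" the gender marking is an error.
        return judged_grammatical or not correction_targets_gender
    # Ungrammatical item: the participant must reject it AND fix the gender error.
    return (not judged_grammatical) and correction_targets_gender
```

Under this reading, the response described for (7), a rejection accompanied only by a change in the position of the prepositional phrase, corresponds to `score_response(False, False, False)` and is scored as incorrect.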

Results

The results for this experiment will be analysed using a four-way Anova (analysis of variance). Only responses to the ungrammatical items will be analysed as scores on the grammatical items were near perfect for both groups. The within-subjects effects were definiteness (definite and indefinite), frequency (high and middle), and gender (common and neuter). The between-subjects effect was native language (Dutch and German). Only one significant interaction was found in this analysis; definiteness significantly interacted with L1 (F(1,57)=15.8, p<.001). This effect can be seen in Figure 1. The main effects of definiteness (F(1,57)=32.84, p<.001) and L1 (F(1,57)=25.37, p<.001) were also significant.

What is most important to note here is that while there was a difference between the native speakers and the L2 learners, this is only clearly the case when indefinite NPs are being used. The gap between the L2 learners and the native speakers is significantly smaller on the definite NP items than on the indefinite NP items.

Figure 1. Scores (in percent) comparing the Dutch and German scores on the definite and indefinite NPs. [Figure not reproduced; vertical axis 50–100, groups Dutch and German, series def and indef.]

To summarize, the German group shows that, for the definite NPs (the determiner-noun agreement condition), their knowledge is similar to that of the native speakers. However, for indefinite NPs (the adjective-noun agreement condition), the German group performs very poorly. One way to interpret these results is by noting that determiner-noun agreement is similar to simply assigning gender to nouns and can, therefore, be done on the basis of lexical information rather than via a syntactic process. On the other hand, adjective-noun agreement requires the participants to take the lexical knowledge of which gender an item is and apply this information in order to correctly inflect the adjective. In the next experiment we look at the on-line processing of the same sentences. The question now is how the L2 group processes these different kinds of data on the on-line version of the task.

4.1.2 Experiment 2: Processing

As was seen in the first experiment, the German group showed that when they are given an overt determiner they can judge the grammaticality quite accurately. If only this off-line measure had been used, the conclusion might be that second language speakers acquire a native-like competence of their second language. But, as was argued above, there are some reasons to think that the representation of knowledge on the one hand and language processing on the other can be dissociated. For aphasics, evidence was presented indicating that they still have knowledge of the language but that they cannot process the language on-line in a quick enough manner. It is therefore also possible that second language learners acquire the knowledge of the second language grammar but are not able to process this language in the same manner as native speakers, though for different reasons than the aphasics. The fact that accuracy decreases when syntactic agreement must be processed suggests that this may be the case.

There are numerous techniques that can be used to test on-line language processing. Some experiments make use of reaction time (RT) measurements such as lexical decision and self-paced reading. Unfortunately, although these techniques can tell us a lot about how language processing is organized in terms of its general architecture, they may not be fine-grained enough to determine whether similar or different processing mechanisms are being used. One on-line technique that provides detailed information on the qualitative aspects of processing is electroencephalography (EEG). The neuroimaging technique of Event-Related Potentials (ERPs) is able to measure the electrophysiological activity in the brain that is thought to directly reflect neural activity. ERPs are negative or positive changes in the voltage of the ongoing brain activity that can be elicited by sensory input or a cognitive task. The technique provides information on the latency, amplitude, polarity, and distribution over the scalp of the EEG-signal. It has been found in previous language studies that signals elicited by, for instance, grammatical and ungrammatical sentences can be discriminated (Kutas 1993, Rugg and Coles 1995). One well-understood ERP-correlate of syntactic language processing is the P600 or Syntactic Positive Shift (SPS) which is associated with processing of morpho-syntactic anomalies and complexity (Osterhout and Holcomb 1992, Hagoort, Brown and Groothusen 1993). The P600 is a positive deflection in the EEG-signal that starts approximately 500 ms after the presentation of the word that renders a sentence ungrammatical; this positivity continues for about 400 ms and reaches its maximum amplitude at around 600 ms after the presentation of the word that renders the sentence ungrammatical. This component is most prominent in centro-parietal regions of the scalp.

There has recently been some research on the neural correlates of grammatical gender processing in both Dutch and German. These studies have only looked at native speakers and they have looked only at determiner-noun agreement. For Dutch, Hagoort and Brown (1999) showed that grammatical gender incongruencies result in an increase in the amplitude of the P600 component as compared to sentences with congruent determiner-noun agreement. A P600 component was also found in German for gender violations (Gunter, Friederici and Schriefers 2000). Thus, grammatical gender violations in both Dutch and German result in an increase in amplitude of the P600 component. The question then is: will Germans also show this P600 in their L2 processing?

Participants

In total 39 participants were tested on the ERP version of the above experiment. There were 23 native speakers of Dutch and 14 second language speakers with German as their native language. The L2 participants had lived in the Netherlands for between two and 32 years, with an average of 9.8 years. None of the participants took part in Experiment 1.

Materials and methodology

The critical stimulus sentences used in this experiment were the same sentences used in Experiment 1. While the task used in the ERP version also contained a grammaticality judgment, there were a few important differences in how the on-line task was run compared with the off-line task from Experiment 1. During the ERP measurement, participants were seated in a dimly lit soundproof room facing a computer monitor. Sentences were presented word by word in the middle of the screen (the word was on the screen for 250 ms, followed by a blank screen for 250 ms before the next word appeared). Each sentence was preceded by an asterisk (to let participants know that a new sentence was about to start). After each sentence, a delay screen was displayed, followed by a screen requesting subjects to give a grammaticality judgment by pushing one of two buttons. After each sentence participants were given two seconds in which they were allowed to blink.2 The experiment started with a practice session to allow participants to get used to the presentation style of the sentences and to practice not blinking during the sentence trials. The actual experiment lasted approximately one hour. The native speakers were given three breaks while the L2 participants were given a total of seven breaks; the length of the pause was chosen by the participant, so the total testing time varied, depending on the length of the pauses that were taken.
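The word-by-word presentation schedule described above (250 ms of text followed by 250 ms of blank screen) fixes the onset of every word relative to the first one. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and it assumes time zero is the onset of the first word, leaving out the asterisk and the delay and response screens, whose durations the text does not give.

```python
WORD_MS = 250   # time each word stays on screen (from the text)
BLANK_MS = 250  # blank screen before the next word (from the text)


def word_onsets(sentence):
    """Return (word, onset_ms) pairs for a word-by-word presentation.

    Assumption: time zero is the onset of the first word.
    """
    onset_step = WORD_MS + BLANK_MS  # one word starts every 500 ms
    return [(word, index * onset_step) for index, word in enumerate(sentence.split())]


# Example with the grammatical version of sentence (5); the critical noun
# "kind" is the third word.
for word, onset in word_onsets("Het kleine kind probeerde voor het eerst te lopen."):
    print(f"{onset:5d} ms  {word}")
```

On these assumptions the critical noun in (5) appears 1000 ms after the first word, which is the point to which an epoch for that sentence would be time-locked.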

EEG recording

The EEG activity was recorded by means of tin electrodes mounted in an elastic cap (Electro-Cap International) from 12 electrode sites, based on the international 10–20 system. The 12 electrodes analyzed were: F7, Fz, F8, T3, Cz, T4, T5, Pz, T6, O1, Oz and O2. The ‘F’ represents frontal electrodes, ‘T’ represents temporal electrodes, ‘C’ represents central electrodes, ‘P’ represents parietal electrodes and ‘O’ represents occipital electrodes. Odd numbers represent electrodes on the left half of the scalp, even numbers represent electrodes on the right half, and the ‘z’ represents electrodes along the midline. All electrodes were referenced to linked mastoids. Both horizontal and vertical electro-oculograms (EOGs) were measured for both eyes. Electrode impedances were kept below 5 kΩ. EEG and EOG signals were sampled at 1000 Hz, amplified and digitally filtered with a cut-off frequency of 30 Hz; effective sample frequency was 100 Hz.
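The labelling convention just described can be made explicit with a short sketch that decodes an electrode label into its region and side. The function names are hypothetical and the code only restates the convention given in the text; the helper for sample spacing simply converts the stated effective sample frequency of 100 Hz into the time between successive samples.

```python
REGIONS = {
    "F": "frontal",
    "T": "temporal",
    "C": "central",
    "P": "parietal",
    "O": "occipital",
}

ELECTRODES = ["F7", "Fz", "F8", "T3", "Cz", "T4", "T5", "Pz", "T6", "O1", "Oz", "O2"]


def describe_electrode(label):
    """Decode a label such as 'T5' into (region, side)."""
    region = REGIONS[label[0]]
    suffix = label[1:]
    if suffix == "z":
        side = "midline"
    elif int(suffix) % 2 == 1:
        side = "left"   # odd numbers: left half of the scalp
    else:
        side = "right"  # even numbers: right half of the scalp
    return region, side


def sample_spacing_ms(sample_frequency_hz):
    """Time between successive samples, in milliseconds."""
    return 1000.0 / sample_frequency_hz


for label in ELECTRODES:
    print(label, *describe_electrode(label))
```

With an effective sample frequency of 100 Hz, `sample_spacing_ms(100)` gives 10 ms between successive data points.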

Results

First the behavioural results (accuracy on the grammaticality judgment) will be presented followed by separate analyses of the on-line ERP data for the native speakers and the German group. For the L2 group only the sentences which they correctly judged as grammatical will be analyzed.

The average accuracy scores for the Dutch and German groups are presented in Table 2.

Table 2. Accuracy scores (in %) on the ERP version of the grammaticality judgment task

                 NP Definites                   NP Indefinites
                 grammatical   ungrammatical    grammatical   ungrammatical
Dutch (n=23)     95%           90%              93%           92%
German (n=14)    88%           80%              93%           68%

Average ERPs were computed at the above electrode sites for each participant in all conditions. The averaging was done for an interval starting at the onset of the critical noun and continuing for 1500 ms post-onset. All averages were aligned with a 200 ms pre-stimulus baseline (200 ms before onset of the critical noun is set to zero for both conditions to correct for pre-existing differences in the EEG). For analysis purposes, averaged ERPs of each 1500 ms epoch were divided into 50 intervals of 30 ms. This method allows one to see the onset and duration of effects more clearly. In each of these 50 intervals, mean amplitudes were statistically analysed with a Manova. Effects will be reported only if three or more successive intervals reach significance at the .05 level for native speakers and at the .1 level for the German group. Three successive significant intervals are more likely to reflect a real and reliable effect despite the use of multiple comparisons. A .1 level will be allowed for the L2 speakers in order to avoid false negative results of the P600 as less of their data can be analysed (only the sentences for which they made a correct judgment) and it is expected that their data will also be more variable. Both of these reasons will likely make it more difficult to find significant differences in the wave patterns so a less strict level of significance will be taken, but the reader must be aware then, that unexpected significant results should be looked at carefully.
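The windowing and reporting criterion described above can be sketched as follows. This is an illustrative reading, not the authors' analysis code: the function names are hypothetical, the p-values would come from the per-interval Manova (which is not reproduced here), and treating "reach significance" as p strictly below the threshold is an assumption.

```python
EPOCH_MS = 1500   # length of the averaged epoch (from the text)
WINDOW_MS = 30    # length of each analysis interval (from the text)


def windows(epoch_ms=EPOCH_MS, window_ms=WINDOW_MS):
    """Return the (start_ms, end_ms) bounds of the successive analysis intervals."""
    return [(start, start + window_ms) for start in range(0, epoch_ms, window_ms)]


def reliable_effects(p_values, alpha, min_run=3, window_ms=WINDOW_MS):
    """Return (start_ms, end_ms) spans covered by at least `min_run`
    successive intervals whose p-value is below `alpha`.

    p_values: one p-value per interval, in temporal order.
    Assumption: an interval counts as significant when p < alpha.
    """
    spans = []
    run_start = None
    # The trailing infinity acts as a sentinel that closes a run reaching the end.
    for index, p in enumerate(list(p_values) + [float("inf")]):
        if p < alpha:
            if run_start is None:
                run_start = index
        else:
            if run_start is not None and index - run_start >= min_run:
                spans.append((run_start * window_ms, index * window_ms))
            run_start = None
    return spans
```

Calling `reliable_effects` with `alpha=0.05` for the native speakers and `alpha=0.1` for the German group mirrors the two thresholds described above; isolated runs of one or two significant intervals are discarded in both cases.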

Each language group will be analysed separately and then discussed comparatively. For each 30 ms interval a three-way Manova was carried out looking at the effects of grammaticality (two levels), front to back scalp distribution (four levels: frontal, temporal, parietal and occipital), and left to right scalp distribution (three levels: left, midline and right). The four levels of the front to back scalp distribution are: F7, Fz and F8 as the first level, T3, Cz and T4 as the second level, T5, Pz, and T6 as the third level and O1, Oz and O2 as the fourth level. The three levels of the left to right scalp distribution are: F7, T3, T5 and O1 as the left hemisphere electrodes; Fz, Cz, Pz and Oz as the midline electrodes; and F8, T4, T6 and O2 as the right hemisphere electrodes.
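The two scalp-distribution factors described above cross to form a 4 × 3 grid with one electrode per cell. The sketch below encodes the groupings exactly as listed in the text; the helper functions and their names are hypothetical and are only meant to show how mean amplitudes could be pooled per factor level, not how the Manova itself was computed.

```python
# Electrode groupings exactly as listed in the text.
FRONT_TO_BACK = {
    "frontal": ("F7", "Fz", "F8"),
    "temporal": ("T3", "Cz", "T4"),
    "parietal": ("T5", "Pz", "T6"),
    "occipital": ("O1", "Oz", "O2"),
}
LEFT_TO_RIGHT = {
    "left": ("F7", "T3", "T5", "O1"),
    "midline": ("Fz", "Cz", "Pz", "Oz"),
    "right": ("F8", "T4", "T6", "O2"),
}


def cell(front_back_level, left_right_level):
    """Return the single electrode at the crossing of two factor levels."""
    shared = set(FRONT_TO_BACK[front_back_level]) & set(LEFT_TO_RIGHT[left_right_level])
    (electrode,) = shared  # raises ValueError unless exactly one electrode is shared
    return electrode


def level_means(amplitudes, factor):
    """Average per-electrode amplitudes within each level of a factor.

    amplitudes: dict mapping electrode label to a mean amplitude for one
        30 ms interval (hypothetical input format).
    factor: FRONT_TO_BACK or LEFT_TO_RIGHT.
    """
    return {
        level: sum(amplitudes[site] for site in sites) / len(sites)
        for level, sites in factor.items()
    }
```

For example, `cell("parietal", "midline")` returns `"Pz"`, the electrode used for the difference waves in Figures 4 and 7.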

Definite NPs

Upon a visual inspection of the ERP patterns for the definite NPs for the native speakers a clear P600 pattern is seen; the ungrammatical sentences are more positive than the grammatical sentences over the more posterior electrodes. This can be seen by looking at Figure 2.

Looking at the waves statistically the presence of a P600 is supported. Within each of the 30 ms intervals from 570 to 1500 ms the effect of grammaticality is significant at the .05 cut-off. Within this same time frame there was also a significant interaction with the front to back factor. From 570 to 900 ms, there is a largely distributed P600 component which is significant over the following electrodes: Fz, C3, Cz, C4, T5, P3, Pz, P4, T6, O1, Oz and O2. From 900 ms to the end the posterior positivity is maintained, while a frontal negativity starts which is only significant at the .05 level at electrodes F3 and F8.

Upon visually inspecting the ERP patterns for the German participants (see Figure 3) we can also see a P600 component.

Statistically we see that indeed a P600 effect is present. However, it is much more restricted and starts later than the one found for native speakers. Between 840 and 990 ms there is a significant positivity for the ungrammatical sentences at electrodes C3, Cz, C4, T5, P3, Pz, P4 and T6. This can be seen in Figure 4 where the difference waves at electrode Pz are compared for the Dutch and German groups. Difference waves represent the wave found for the ungrammatical sentence minus the wave found for the grammatical sentence; thus a positivity in the difference wave reflects a positivity in the ungrammatical sentences.
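The difference wave defined above, together with the pre-stimulus baseline alignment mentioned earlier, amounts to two simple operations on averaged waveforms. The sketch below represents a waveform as a plain list of voltages; the function names are hypothetical, and computing the baseline as the mean of the pre-stimulus samples is an assumption about how "set to zero" was implemented.

```python
def baseline_correct(wave, n_baseline_samples):
    """Subtract the mean of the pre-stimulus samples from every sample.

    wave: list of voltages whose first `n_baseline_samples` entries fall in
        the pre-stimulus baseline period.
    Assumption: the baseline is the mean voltage over that period.
    """
    baseline = sum(wave[:n_baseline_samples]) / n_baseline_samples
    return [voltage - baseline for voltage in wave]


def difference_wave(ungrammatical, grammatical):
    """Return the ungrammatical wave minus the grammatical wave, sample by sample."""
    if len(ungrammatical) != len(grammatical):
        raise ValueError("waves must have the same number of samples")
    return [u - g for u, g in zip(ungrammatical, grammatical)]
```

A positive value in the output of `difference_wave` at a given sample means that the ungrammatical condition was more positive than the grammatical one at that point, which is how the P600 shows up in Figures 4 and 7.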

In Figure 4 we can see that the P600 effect starts later for the German group and that its maximal amplitude occurs later as well. Another difference between the ERP waves for the German and Dutch groups can be seen. The German group does not show the late frontal negativity that is seen in the native speaker ERP.

Figure 2. ERP wave patterns for the grammatical (darker line) versus ungrammatical sentences for the native speakers in the NP definite condition. The y-axis represents a voltage of ±5 microvolts with positive plotted up.

Indefinite NPs

A visual inspection of the native speaker data (see Figure 5) shows that a P600 effect is also present for these sentences. Statistically, the P600 effect is significant from 600 to 710 ms at electrodes C3, Cz and C4 and between 600 and 1500 ms at electrodes T5, P3, Pz, P4, T6, O1, Oz and O2. Electrodes Fz and F8 show a late frontal negativity. This can be seen in Figure 5.

Visual inspection of the German data reveals no obvious effects (see Figure 6). It is important to note that only sentences that were correctly judged in the grammaticality judgment portion of this task are included in the analyses below.

Figure 3. ERP wave patterns for the grammatical (darker line) versus ungrammatical sentences for the German group in the NP definite condition. The y-axis represents a voltage of ±5 microvolts with positive plotted up.

Looking at the statistics there are three time frames where the effect of grammaticality is significant: 300 to 420 ms, 780 to 960 ms and 1080 to 1200 ms. Within none of these time frames does the effect of grammaticality significantly interact with either the front to back factor or the left to right factor. Looking at each electrode separately also does not result in any significant differences between grammatical and ungrammatical waves. However, upon visual inspection, it appears that in the first time frame there is a frontal positivity, followed by a left posterior negativity and a frontal negativity for the ungrammatical sentences. The final significant time frame seems to be due to a largely distributed positivity. A comparison of the difference waves for the native and German speakers at electrode Pz can be seen in Figure 7.

In the case of the definite NP where gender agreement between the overt determiner and noun can be seen as equivalent to gender assignment to nouns, which is similar in Dutch and German, the Germans are able to perform well off-line and their on-line processing looks very similar to that of native speakers, although the P600 component found for the ungrammatical sentences occurs later and is more restricted. However, in the case of Dutch indefinite NPs, where agreement can only occur at a more purely syntactic level, since more than just knowing whether an item is common or neuter is required, the German speakers have quite a bit of difficulty in the off-line judgment, and in looking at processing of only the items they correctly judged in the on-line task they do not show a P600 component.

Figure 4. The difference waves for the ungrammatical minus the grammatical condition for the NP definite sentences. The P600 component as seen at electrode Pz for the native speakers (the darker line) and the German speakers. The y-axis represents a voltage of ±8 microvolts with positive plotted up.

Figure 5. ERP wave patterns for the grammatical (darker line) versus ungrammatical sentences for the native speakers in the NP indefinite condition. The y-axis represents a voltage of ±5 microvolts with positive plotted up.

5. General conclusions

The main goal of this paper was to show that in the study of language behaviour in general and in the field of second language acquisition in particular the representation of grammatical knowledge and the processing in which this knowledge is employed need to be carefully distinguished. We have shown that for the phenomenon of grammatical gender and for one specific group of second language learners this is indeed an important distinction, because this group shows a difference in processing at least for the indefinite NPs. While quantitative differences in language processing can be seen in aphasic populations, when compared with unimpaired language users, we have shown that there is actually a qualitative difference between native speakers and second language learners in the processing of language. The results presented in this paper suggest that the German participants may be making use of a translation strategy to learn Dutch gender assignment knowledge and that they may be using their L1 processing strategies to process their L2 for cases where the grammars are very similar. Further support for this is seen in the ERP processing patterns for sentences involving subject-verb agreement and finiteness, both phenomena for which the Germans show native-like knowledge in the off-line task. For finiteness structures, for which the German group can translate their L1 strategies, they show native-like processing but for subject-verb agreement, which exists in German but is different in this language, the processing is different from that observed in native speakers (see Sabourin 2003). Another important thing to note is that the very clear late frontal negativity which was seen in the ERPs of the native speakers was not found for the German group even for the NP definite condition for which a P600 effect can be seen.3 These findings suggest that linguistic processing (as reflected by the P600) can only occur in the L2 when the processing strategy from the L1 can be used relatively directly in L2 processing. Processing of grammatical structures that are not similar in the L1 and L2 may be learned and handled by more general cognitive strategies. Ullman (2001) discussed the declarative/procedural memory distinction in terms of L1 processing, lexical knowledge being declarative and syntactic knowledge being procedural in nature. For L2 speakers, Ullman claims that both lexical and syntactic knowledge usually rely on declarative memory, although Ullman suggests that factors such as age of exposure and practice may influence the ability to use procedural memory in L2 learners. Using this terminology the results presented here suggest that in L2 processing of grammatical knowledge, only in the case where the L1 and the L2 are similar can procedural memory be used by advanced adult L2 learners although probably with a quantitative difference compared to native speakers, viz. a temporal delay. Further research should shed light on whether this non-native-like processing is an across-the-board second language effect or whether differences may be found depending on the particular phenomena studied and particular language groups involved.

Figure 6. ERP wave patterns for the grammatical (darker line) versus ungrammatical sentences for the German group in the NP indefinite condition. The y-axis represents a voltage of ±5 microvolts with positive plotted up.

Figure 7. The difference waves for the ungrammatical minus the grammatical condition for the NP indefinite sentences. The P600 component as seen at electrode Pz for the native speakers (the darker line) and the German speakers. The y-axis represents a voltage of ±8 microvolts with positive plotted up.

Notes

<DEST "sab-n*">

* We would like to thank Laurie Stowe, John Hoeks, Liz Temple and two anonymous reviewers for their comments on an earlier draft of this paper. The research of the first author was funded by the School of Behavioral and Cognitive Neurosciences (BCN) of the University of Groningen; the research of the second author was funded by a grant from the Royal Netherlands Academy of Sciences (KNAW).

1. For more details on the proficiency test see Sabourin (2001, 2003).

2. Participants were asked to try their best to not blink during presentation of the sentences as eye movements and blinks greatly distort the EEG signal.

3. This suggests that whatever the L2 speakers are doing, it is not exactly the same as the native speakers.

References

Avrutin, S., Haverkort, M. and Van Hout, A. 2001. “Introduction: Language acquisition and language breakdown”. Brain and Language 77: 269–273.

Burkhardt, P., Piñango, M. and Wong, K. 2001. The role of the anterior left hemisphere in real-time sentence comprehension: Evidence from split intransitivity. Ms. Yale University.

Burnage, G. 1990. A guide for users. Nijmegen: CELEX Centre for Lexical Information.

Caplan, D. and Hildebrandt, N. 1988. Disorders of syntactic comprehension. Cambridge: MIT Press.

Chomsky, N. 1995. The minimalist program. Cambridge: MIT Press.

Grodzinsky, Y. 1990. Theoretical perspectives on language deficits. Cambridge: MIT Press.

Grodzinsky, Y. and Finkel, L. 1998. “The neurology of empty categories: Aphasics’ failure to detect ungrammaticality”. Journal of Cognitive Neuroscience 10 (2): 281–292.

Gunter, T.C., Friederici, A.D. and Schriefers, H. 2000. “Syntactic gender and semantic expectancy: ERPs reveal early autonomy and late interaction”. Journal of Cognitive Neuroscience 12 (4): 556–568.

Haarmann, H. 1993. Agrammatic aphasia as a timing deficit. Doctoral dissertation, University of Nijmegen.

Hagoort, P. and Brown, C. 1999. “Gender electrified: ERP evidence on the syntactic nature of gender processing”. Journal of Psycholinguistic Research: Special Issue on “Processing of Grammatical Gender” 28 (6): 715–728.

Hagoort, P., Brown, C. and Groothusen, J. 1993. “The syntactic positive shift (SPS) as an ERP-measure of syntactic processing”. Language and Cognitive Processes 8: 439–483.

Kolk, H. 1995. “A time-based approach to agrammatic production”. Brain and Language 50: 282–303.

Kolk, H. 1998. “Disorders of syntax in aphasia: Linguistic-descriptive and processing approaches”. In Handbook of neurolinguistics, B. Stemmer and H. Whitaker (eds), 250–260. San Diego: Academic Press.

Kolk, H. 2002. Language production in agrammatic aphasics: an experimental study. Paper presented at the University of Nijmegen Linguistics Colloquium.

Kutas, M. 1993. “In the company of other words: Electrophysiological evidence for single-word and sentence context effects”. Language and Cognitive Processes 8 (4): 533–632.

Linebarger, M., Schwartz, M. and Saffran, E. 1983. “Sensitivity to grammatical structure in so-called agrammatic aphasics”. Cognition 13: 361–392.

Osterhout, L. and Holcomb, P.J. 1992. “Event-related brain potentials elicited by syntactic anomaly”. Journal of Memory and Language 31: 785–806.

Rugg, M.D. and Coles, M.G.H. 1995. “The ERP and cognitive psychology: Conceptual issues”. In Electrophysiology of the mind: Event-related brain potentials and cognition, M.D. Rugg and M.G.H. Coles (eds), 27–39. Oxford: Oxford University Press.

Sabourin, L. 2001. “L1 effects on the processing of grammatical gender in L2”. In Eurosla Yearbook, Volume 1, S. Foster-Cohen and A. Nizegorodcew (eds), 159–169. Amsterdam: John Benjamins.

Sabourin, L. 2003. Grammatical gender agreement in L2 processing. Doctoral dissertation, University of Groningen.

Ullman, M.T. 2001. “The neural basis of lexicon and grammar in first and second language: The declarative/procedural model”. Bilingualism: Language and Cognition 4 (1): 105–122.

</TARGET "sab">

<TARGET "gre" DOCINFO AUTHOR "David W. Green"TITLE "Neural basis of lexicon and grammar in L2 acquisition"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 9

Neural basis of lexicon and grammar in L2 acquisition

The convergence hypothesis*

<LINK "gre-n*">

David W. Green
University College London

1. Introduction

In acquiring a second language (L2) individuals must grasp its grammar and its vocabulary but differences in the context of acquisition from a first acquired language (L1) may mean that different learning mechanisms are involved. Such differences, in turn, carry implications for the neural representation of L1 and L2. Understanding the neural basis of the representation of L1 and L2 can therefore contribute to a deeper understanding of the interface of syntax and lexicon in L2 acquisition.

The basic orienting question is this: is a person’s lexical and grammatical knowledge represented differently if it is learned as an L2 as opposed to an L1? In this chapter, I contrast this proposal, termed the differential representation hypothesis, with an alternative, termed the convergence hypothesis. The convergence hypothesis states that as proficiency in L2 increases, non-native speakers represent, and process, the language in the same way as native speakers of that language. In addressing these hypotheses, I consider both their computational basis, and relevant neuropsychological and neuroimaging data.

The chapter is structured as follows: I first consider the differential representation hypothesis. On one version an L2 is represented in a person’s right-hemisphere rather than in their left-hemisphere. I reject this possibility. Next I consider a version in which in L2, in contrast to L1, grammatical and lexical information is represented in a common memory system. This version of the hypothesis relies on a distinction between two memory systems that has been justified primarily by studies on amnesic patients. However, computational modelling shows that a single memory system is sufficient to generate the data of these amnesic patients. Such a result prompts consideration of the computational basis for the alternative, convergence hypothesis. By themselves computational arguments are not decisive and so we consider neuropsychological and neuroimaging data in an effort to adjudicate between the two hypotheses empirically. A final section considers the kind of studies needed to further our understanding of the issue.

2. A distinct hemispheric representation of L2?

The human brain is a product of evolution and it makes computational sense for an evolved system to have redundant and duplicate mechanisms for performing tasks (Edelman 1989). A strong version of the differential representation hypothesis is therefore possible in which L2 is represented in a completely distinct neuroanatomical substrate from L1 (Scoresby-Jackson 1867). Language functions in monolingual, right-handed individuals are typically represented in a distributed left-hemisphere network. 91% of right-handed participants showed left-hemisphere dominance for language in a study in which they were injected with the barbiturate sodium amytal either into the right, or into the left, carotid artery (Loring, Meador, Lee et al. 1990). Sodium amytal causes an anaesthesia for 1 to 2 minutes in the cerebral hemisphere on the same side as the injection. If language is lateralised to that hemisphere the person will be unable to speak. Less invasively, in a large functional imaging study, 94% of right-handed participants showed left-hemisphere dominance for language (Springer, Binder, Hammeke et al. 1999). These data are consistent with the notion that the left-hemisphere contains circuitry specialised for language processing. Nonetheless, in principle, L2 might be represented in homologous areas of the right-hemisphere (Albert and Obler 1978). However, Rapport, Tan and Whitaker (1983) in a study of right-handed polyglot aphasics prior to surgery found no evidence of the disruption of picture naming following intracarotid injection of sodium amytal into the right-hemisphere. In contrast, naming was massively disrupted following injection into the left-hemisphere. Further, in a study of 88 reported cases of right-handed bilingual aphasics, Fabbro (1999:210–211) found that only 8% presented with a lesion to the right-hemisphere. Taking into account reporting biases, he concluded that the incidence of aphasia in bilinguals with right-hemisphere lesions is not in fact higher than that shown by monolingual aphasics. These observations suggest that both L1 and L2 are represented in a common substrate in the left-hemisphere though perhaps with different microanatomical representations (Paradis 2001). The critical question then concerns the neural representation of lexical and grammatical knowledge for L2 within this hemisphere. I consider a more subtle version of the differential representation hypothesis in the next section.

3. The specific representation of lexicon and grammar

Researchers have taken different views on the extent to which the lexicon and grammar of a language are subserved by distinct neural mechanisms that are language-specific. According to one view, words are processed in one dedicated, posterior system and grammar is processed in another dedicated anterior system (e.g. Chomsky 1995, Pinker 1994). An alternative view is that these two components of the language system are indeed mediated by distinct neural mechanisms but that these mechanisms are not in fact specific to language. Ullman (2001a) proposed that the lexicon is stored in a neural system that subserves declarative memory in general. By contrast, grammar is represented in a procedural memory system that is implicated in the learning of motor and cognitive skills in general.

The declarative memory system is held to be involved in the learning offacts and events and to be particularly important in the learning of arbitrarilyrelated information from different sources such as the associations between thesounds of words and their meanings. Information in this system is available for‘explicit’ (i.e. conscious) recollection. Whilst initial learning may depend on themedial temporal structures (e.g. the hippocampus), neocortical regionssubsequently become the principal site of representation (e.g. temporo-parietalregion). In contrast, the use of grammar (including syntax, morphology, andphonology) is achieved by a system that underlies the performance of motorskills in general — a procedural system mediated by structures in the frontalcortex and basal ganglia and the inferior parietal region (Squire, Knowlton andMusen 1993, Squire 1994). The nondeclarative or procedural system is held toinfluence behaviour ‘implicitly’, i.e. in the absence of conscious recollection.Hence the contrast between declarative and procedural (or nondeclarative)memory systems is sometimes referred to, in summary terms, as a contrastbetween explicit and implicit memory systems (see, for example, Paradis, 1994).But this identification cannot be taken too far. The acquisition of vocabulary forinstance is not simply a matter of declarative memory. Gupta and Dell (1999)argue that the learning of vocabulary involves the explicit learning of the

200 David W. Green

relationship between a phonological representation and meaning and the implicit learning of the mapping of input phonemes onto an articulatory chain (see also Ellis 1995 and Segalowitz and Segalowitz 1993). Consistent with this view, Paradis (1997:333–334) argues that the acquisition of vocabulary is a partially explicit process. Likewise, Lebrun (2002:304) argues that common words and phrases are stored not only neocortically but also as verbomotor subcortical patterns. But the point to note here is that this contrast applies equally to L1 and L2 vocabulary learning.

The distinction between declarative and nondeclarative memory systems has been exploited as a means to contrast the neural representation of L1 and L2. Consider one possible difference between the acquisition of L1 and L2. The representation of a language acquired in an oral, conversational setting (e.g. Quebecois; Friulian) may differ from one acquired in the formal setting of a school. In particular, there may be a difference in the representation of grammar and morphology (morphosyntax). Individuals who acquire L1 in a conversational setting achieve proficiency in morphosyntax implicitly. In contrast, the grammatical rules in the school setting are part of explicit declarative knowledge. Maturational constraints may also affect the acquisition of morphosyntax more than the acquisition of vocabulary (Paradis 1994:398). In consequence, Paradis (1994) has argued that L1 and L2 may load differently on these two memory systems. L1, especially its morphosyntax, but also its lexicon to an extent, may load more on the implicit, procedural memory system whereas L2 may load more on an explicit, declarative memory system.

Ullman (2001b) has made a related proposal based on the notion that linguistic abilities are sensitive to the age of exposure to the language (Lenneberg 1967). It is generally considered that attainment in L2 is constrained by the age at which learning begins. For instance, there is a negative correlation between the age at which learning begins and eventual performance (Johnson and Newport 1989). But not all language capacities are affected equally: the use of grammar is more adversely affected than the use of lexical items. As a result, in L2 acquisition, there is a specific shift in processing of grammatical computation from the procedural memory system to the declarative memory system (Ullman 2001b:108; Note 2: 110 contrasts his proposal with that of Paradis). There is no shift for lexical processes. These are held to depend on the declarative memory system for both L1 and for L2. I take this version of the differential representation hypothesis and consider it in a little more detail.

In Ullman’s (2001b) view, vocabulary in both L1 and in L2 is represented in a declarative memory system, in the form perhaps of an associative

Neural basis of lexicon and grammar in L2 acquisition 201

network linking meanings and sounds. By contrast, whereas grammatical processing in an L1 (e.g. forming the past-tense of a regular English verb such as walk by adding -ed to the stem) relies on a procedural system, grammatical processing in an L2 (such as English) is achieved declaratively. The basic notion is that linguistic forms that are compositionally computed in L1 are memorized in L2 as if they were words or idioms. Given that the associative lexical memory can generalize patterns, such a system can still be productive. Certain rules may also be learned, though these will differ in type from any implicitly learned rules of L1. Ullman acknowledges that age of exposure to L2 is not the only factor affecting the dependence on declarative memory: “even older learners may show a degree of dependence on procedural memory if they have had a large amount of practice — that is, a fairly substantial amount of use of the language” (Ullman 2001b:110). But the clear implication is that even proficient speakers of L2 will differ from native speakers of that language in relying much more on declarative memory for grammatical computations.

There are two elements to Ullman’s (2001b) proposal. First, it is motivated by the claim that linguistic abilities are sensitive to the age of exposure. Second, it appeals to two distinct types of memory that have been inferred from research on amnesic patients. The following section considers each element.

4. Some grounds for doubt

Ullman (2001b) motivated the shift towards a declarative representation of grammatical knowledge in L2 by appealing to data (Johnson and Newport 1989) on the limits of L2 (English) attainment for native Korean and Chinese speakers who learned English post-puberty (after the age of 17 years). In the Johnson and Newport study, L2 learners were asked to judge whether or not auditorily presented sentences were grammatically correct. Roughly half of the sentences were grammatical and half were minimally different ungrammatical variants. A key finding was that age of acquisition was negatively correlated with performance before puberty, but there was no systematic relationship between age of acquisition and performance post-puberty. Further, few if any of the 46 participants in their study achieved native-like levels of performance post-puberty. These results are consistent with a critical period view of language acquisition (Lenneberg 1967). In contrast to such data, Birdsong and Molis (2001), in a replication of Johnson and Newport but using 61 native Spanish speakers, found that the age of acquisition did predict attainment in L2 post-puberty. They also found evidence of native-like attainment in late learners of L2. They argue, in line with Flege, Yeni-Komshian and Liu (1999), that practice is an important factor in determining the eventual level of attainment. The nature of the L1 and L2 pairing may also be relevant.

I turn now to the key element of Ullman’s (2001b) proposal, namely the idea that emphasis is shifted to the declarative memory system in L2 learners and that there is little or no involvement of a procedural memory system in grammatical processing. This proposal presumes that there is good evidence for the existence of these two types of memory systems. Data from amnesic patients appear to provide compelling support. Amnesics have poor declarative memory but show normal performance on various tasks involving nondeclarative memory (Gabrieli 1998). A study by Knowlton and Squire (1993) is exemplary. They used dot patterns created by systematically distorting a prototype pattern. Amnesic patients were able to classify these patterns normally (a nondeclarative task) but were severely impaired in their ability to recognize whether or not a particular pattern had been presented previously (a declarative task). Knowlton and Squire interpreted these data as evidence that performance in the two tasks was mediated by two different memory systems, one of which (the declarative system) was impaired and the other of which was not. However, computational work by Nosofsky and Zaki (1998) challenged this interpretation by showing that differences in performance on these two tasks can be obtained within a single memory system. A slight reduction in the value of a sensitivity parameter in their computational model reduced classification performance marginally but exerted a marked effect on recognition performance.
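The single-system argument can be made concrete with a toy sketch. This is not the published model: the binary stimuli, the Hamming distance, the summed-similarity familiarity measure and the forced-choice ratio rule are all assumptions chosen only to show how one memory store with one sensitivity parameter `c` can serve both tasks.

```python
import math

def hamming(a, b):
    """Number of positions at which two equal-length feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def familiarity(probe, studied, c):
    """Summed similarity of a probe to every studied exemplar.

    Similarity decays exponentially with distance; c is the sensitivity
    parameter (a lower c blurs the differences between exemplars).
    """
    return sum(math.exp(-c * hamming(probe, item)) for item in studied)

# Hypothetical study set: six one-feature distortions of an all-zero prototype.
STUDIED = [tuple(1 if i == j else 0 for i in range(6)) for j in range(6)]
OLD_ITEM = STUDIED[0]              # shown during study
NEW_MEMBER = (1, 1, 0, 0, 0, 0)    # unseen distortion of the same prototype
NON_MEMBER = (1, 1, 1, 1, 1, 1)    # far from every studied pattern

def forced_choice(target, foil, c):
    """Probability of picking the target over the foil (assumed ratio rule)."""
    f_target = familiarity(target, STUDIED, c)
    f_foil = familiarity(foil, STUDIED, c)
    return f_target / (f_target + f_foil)

def recognition_accuracy(c):
    """Old item versus unseen category member: both tap the same store."""
    return forced_choice(OLD_ITEM, NEW_MEMBER, c)

def classification_accuracy(c):
    """Unseen category member versus non-member, using the same store."""
    return forced_choice(NEW_MEMBER, NON_MEMBER, c)

for c in (2.0, 1.0):  # 2.0 stands in for a control, 1.0 for reduced sensitivity
    print(f"c={c}: recognition={recognition_accuracy(c):.3f}, "
          f"classification={classification_accuracy(c):.3f}")
```

In this toy setting, lowering `c` moves recognition accuracy much closer to chance than it moves classification accuracy, although nothing in the code distinguishes two memory systems.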

More pertinent to the present concern is work on the learning of artificial grammars. Knowlton and Squire (1994, 1996) contrasted normals and amnesics in their ability to learn an artificial grammar. In such studies individuals first memorise a set of strings generated by a (finite-state) grammar and are then informed about a set of rules generating the strings. In the classification task, they have to classify a new set of strings into those that are grammatical and those that are not. In the recognition task, they have to indicate whether or not a string of symbols was presented. In the studies by Knowlton and Squire, classification performance in amnesic patients was normal but recognition performance was impaired. Knowlton and Squire interpreted this as evidence that the two tasks are mediated by different memory systems, only one of which, the declarative memory system, is impaired in amnesics.

Other studies support a dissociation between recognition and repetition priming in amnesic patients (e.g. Hamann and Squire 1997: Experiment 1). Repetition priming refers to the improvement in the identification, detection or production of a stimulus as a result of having experienced it previously. It is considered to be mediated by nondeclarative memory because repetition priming occurs even when there is no conscious recollection of the prior experience of the stimulus (Gabrieli 1998). Priming is held to be mediated by neocortical structures that are spared in amnesic patients (e.g. McClelland, McNaughton and O’Reilly 1995).

Hamann and Squire (1997: Experiment 1) presented amnesic patients and controls in a priming phase with a set of four-letter consonant strings for three seconds each. In a later study phase, these stimuli and others were presented for 170 ms each and participants had to identify them. Priming was operationalised as the difference in the identification of old and new strings. After two priming and study phases participants were tested for their recognition of stimuli presented in the study phases. The recognition test consisted of pairs of old and new stimuli and participants had to decide which string was old. Amnesics showed the same degree of priming as the normal controls but their recognition performance was at chance. But do these dissociations (between recognition and classification and between recognition and repetition priming) require us to postulate two distinct memory systems?
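The two measures described above can be written out as a short sketch. The function names and the response lists are hypothetical and serve only to show the arithmetic; they are not data from the study.

```python
def proportion_correct(responses):
    """Share of trials answered correctly (responses are booleans)."""
    return sum(responses) / len(responses)

def priming_score(old_trials, new_trials):
    """Priming as the identification advantage for old over new strings."""
    return proportion_correct(old_trials) - proportion_correct(new_trials)

def recognition_accuracy(choices):
    """Accuracy on old/new pairs; each entry is True if the old string was chosen."""
    return proportion_correct(choices)

# With two alternatives per pair, guessing yields 50% correct.
CHANCE_LEVEL = 0.5

# Made-up responses for a single participant, for illustration only.
old_identified = [True, True, True, False]
new_identified = [True, False, False, False]
pair_choices = [True, False, True, False]

print("priming:", priming_score(old_identified, new_identified))
print("recognition:", recognition_accuracy(pair_choices), "chance:", CHANCE_LEVEL)
```

A participant can therefore show a positive priming score while recognition accuracy sits at the chance level, which is the pattern reported for the amnesic group.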

On the basis of simulation data, Kinder and Shanks (2001) argue that they do not. They used a simple recurrent network (see Cleeremans 1993) to simulate performance in artificial grammar learning. In order to differentiate an amnesic network from a normal network they reduced the learning rate during acquisition and, in a separate simulation, reduced the number of hidden units prior to test. A change in either parameter was sufficient to induce a dissociation between classification and recognition. In a second set of simulations, they showed that a simple recurrent network could also simulate the dissociation between recognition and repetition priming.

These simulation results show that the dissociations observed in the clinical population do not require a dual-memory system. Instead, such results are consistent with a single system or network and so weaken support for the procedural/declarative distinction at the heart of the differential representation hypothesis. Although these simulation results lead us to be wary of using performance differences as direct evidence of different cognitive and neural systems (see Note 1), computational results only provide an existence proof and do not establish that the brain does not in fact have distinct declarative and nondeclarative memory systems. Further, I know of no computational studies aimed at showing whether or not the kinds of specific dissociations referenced by Ullman can emerge from single networks (see also Ullman 2001b, Note 1:10). At a minimum, such results encourage the search for an alternative formulation. In fact, computational considerations lead us to expect a rather different outcome for L2 acquisition. The next section considers a computational justification for the convergence hypothesis.

5. The convergence hypothesis and its computational basis

According to the convergence hypothesis, any qualitative differences between native speakers of a language and L2 speakers of that language disappear as proficiency increases. Such a hypothesis is broadly in line with the idea that proficiency in language involves identifying, and using, the various cues to meaning (see, for example, the competition model; MacWhinney 1997). The convergence hypothesis is specifically concerned with the neural, and not simply the cognitive, representation of L2. Given the diversity of languages, in order to consider the computational grounds for the convergence hypothesis, we need to characterise languages at a certain level of abstractness. There are four linguistic means for communicating experience (see, for example, Tomasello 1995): individual symbols (lexical items), markers on symbols (grammatical morphology), ordering patterns of symbols (word order) and prosodic variations of speech (e.g. stress, intonation, timing). Languages differ in the weight they attach to these different linguistic means. In some languages, word order is basically free and information on ‘who did what to whom’ is conveyed by word endings or by prosody in tone languages. By contrast, in English, such information is conveyed by word order and this is relatively rigid. These different linguistic means or signals require different devices for their processing. The first step of the computational argument for the convergence hypothesis is that the neural representation of the various linguistic devices is similar across languages. The next paragraph spells out the basis for this argument.

Such devices may be represented by specific networks with a distinct neural anatomical representation or they may be mediated by a specialised network with a distributed neural representation. Specialised networks can emerge from unique interactions amongst a set of regions each fulfilling a number of different functions (e.g. Mesulam 1990). Consider the development of a system using these devices to communicate meaning. First, different neural regions may compete to process input. Those regions, whether innately specified, or possessing some small processing advantage, will come to mediate processing of a given linguistic means. Neural regions active at the same time will connect together (Hebb 1949, Robertson and Murre 1999) giving rise to a specialised network. Second, once a network has come to process signals of a particular type it will resist processing other types of signals unless input to it is curtailed, in which case it may process signals of another type given that plastic reorganization is possible. One line of support for this view is evidence of ‘crowding-out’: when language is displaced to the right-hemisphere as a result of neurological damage to the left-hemisphere, a person’s visuo-spatial skills (typically mediated by the right-hemisphere) are impaired (Teuber 1974, Strauss, Satz and Wada 1990). Third, given commonalities across brains in the initial sensitivities of different regions (call this the commonality assumption; see Note 2), there will be commonalities in the neural representation of the different devices for speakers of different languages.
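The first step of this argument rests on the Hebbian principle that regions active at the same time become connected. A minimal sketch of that principle follows; the three "regions", the learning rate and the activity pattern are assumptions for illustration and make no claim about actual neural parameters.

```python
def hebbian_step(weights, activity, rate=0.1):
    """Return new weights after one Hebbian update.

    The link between two different regions grows in proportion to the
    product of their activities, so only co-active regions become connected.
    """
    n = len(activity)
    return [
        [
            weights[i][j] + (rate * activity[i] * activity[j] if i != j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]

def train(patterns, n_regions, rate=0.1):
    """Apply the Hebbian update once per activity pattern, starting from zero."""
    weights = [[0.0] * n_regions for _ in range(n_regions)]
    for activity in patterns:
        weights = hebbian_step(weights, activity, rate)
    return weights

# Hypothetical input: regions 0 and 1 respond together to one type of signal
# on ten occasions, while region 2 stays silent.
patterns = [[1.0, 1.0, 0.0]] * 10
weights = train(patterns, n_regions=3)
print("link 0-1:", weights[0][1], "link 0-2:", weights[0][2])
```

Repeated co-activation strengthens the link between regions 0 and 1 while region 2 stays unconnected, which is the sense in which a specialised network can emerge from shared input.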

The second step of the computational argument for the convergence hypothesis is that the acquisition of an L2 arises in the context of an already specified, or partially-specified system, with a specific neural network mediating each device (see Note 3). It follows that L2 will receive convergent representation with L1. Further, given the commonality assumption (see above), the representation of L2 will converge with the representation of that language learned as an L1.

The convergence hypothesis does not entail that a speaker of L2 will necessarily achieve native-like levels of performance (for example in achieving certain phonetic norms, Flege 1995) nor does it exclude the possibility that tasks such as mental arithmetic are carried out exclusively in L1. Clearly, also, the contexts of acquisition (e.g. a formal school setting versus an immersion setting) affect the initial registration of linguistic information. However, in contrast to the differential representation hypothesis, the convergence hypothesis is committed to the prediction that as proficiency in L2 increases, the same linguistic means involve the same neural networks as in native speakers. The hypothesis would be refuted if there is no change in representation with proficiency and if a normal, proficient L2 speaker activated neural networks disjoint from those of a native speaker, especially when encoding and decoding syntactic information. The fact that explicit, declarative representations of grammatical information play only an initial role in on-line processing, according to the convergence hypothesis, does not mean that they are unimportant. Explicit (metalinguistic) representation may well benefit the recovery of L2 over L1 following brain-damage (e.g. Lebrun 2002, Paradis 1994, 1997). But such a possibility, it seems to me, cannot be used to claim a continuing role for such representations in on-line processing, once the relevant procedures are in place. However, this possibility is open to test.

Of course, certain differences in processing profiles and neural activation are to be expected when L2 speakers are contrasted with monolingual speakers of that language. The acquisition of an L2 carries consequences. Alternative means for expressing communicative intentions can induce competition both in production (e.g. Bialystok 1992, Gollan and Kroll 2001, Green 1986, 1998, Hermans 2000) and in comprehension (e.g. Dijkstra, van Jaarsveld and ten Brinke 1998, de Groot, Delmaar and Lupker 2000) though the range of conditions under which this occurs is unknown. Depending on how the system is controlled there may be a difference in processing profiles despite convergence. But there will be a marker for such an effect: increased competition (and hence increased activation perhaps) in the areas associated both with lexical and with grammatical encoding will be associated with increased activation in the areas associated with language control. Such effects will be apparent both in L2 and in L1.

The next section considers empirical data with a view to adjudicating between the differential representation and convergence hypotheses.

6. Empirical data: Can we adjudicate?

Both neuropsychological and neuroimaging studies provide data that may help in adjudicating between the two hypotheses. Under the latter we include Event-Related Potential (ERP) data and haemodynamic methods (Positron Emission Tomography, PET, and functional Magnetic Resonance Imaging, fMRI); Note 4 briefly describes these classes of method. We first consider evidence for the distinct representation of lexical and grammatical information in L1 and then consider what we can infer from neuropsychological and neuroimaging studies of bilingual speakers.

6.1 The representation of L1

Both neuropsychological and neuroimaging data suggest that there is a degree of specialisation within monolingual speakers for syntactic and semantic processes. For instance, Breedin and Saffran (1999) reported a patient, D.M., who was good at detecting grammatical violations despite a pervasive loss of semantic knowledge. ERP data from normal individuals also indicate that there are distinct mechanisms mediating at least post-lexical syntactic and semantic processes (Hagoort, Brown and Osterhout 2000). For instance, N400 (found 400 ms after an event) is sensitive to violations of semantic expectancy whereas P600 (found 600 ms after an event) is sensitive to syntactic violations. ERP data cannot provide direct evidence of the neural sources of such effects but haemodynamic studies are informative. Studies on grammatical processing and encoding in native speakers (Hagoort, Brown and Osterhout 2000) suggest a common syntactic component subserved by the left frontal area (a dorsal part of Broca’s area and adjacent parts of the middle frontal gyrus) and studies on the semantic representation of words identify regions in the temporo-parietal region: the left extrasylvian temporal cortex and the left anterior inferior frontal cortex (Price 2000). Neuropsychological data (Dronkers, Redfern and Knight 2000) and also neuroimaging data (e.g. Price, Moore, Humphreys and Wise 1997) suggest a specific area in the anterior temporal region as a site critical for the interpretation of sentences (i.e. syntactic-semantic integration).

6.2 The representation of L2

What empirical evidence is there that L2 is represented differently from L1 as proposed by the differential representation hypothesis? According to Ullman (2001b), L2 learned late will be sensitive to damage to neocortical temporal/temporal-parietal regions for those linguistic forms that depend on grammatical processing in L1. A case reported by Ku, Lachmann and Nagler (1996) seems to support his position. A 16-year-old native Chinese speaker who had been living in the United States for six years and who had received intensive training in English over this period suffered a circumscribed lesion to the left temporal lobe (as a result of herpes simplex encephalitis). For three weeks following the lesion he lost the ability to comprehend and to speak English. In contrast, naming in Mandarin was normal. However, in speaking Mandarin his syntax was simplified and so this case is not decisive support for the claim that grammatical information is represented differently in L2.

The notion that L1 grammatical processing is mediated by a frontal-basal ganglia circuit predicts that damage to the basal ganglia will lead to a selective loss of L1. Fabbro and Paradis (1995) report the case of E.M. with such a lesion and, true to prediction, her spontaneous speech in her L1 (Venetan) was poor whereas her speech was better in her L2 (Italian) that she rarely used prior to the lesion. Ullman (2001b) considered the nature of her errors. There was a similar proportion of word finding difficulties in both languages but a tendency for poorer grammatical performance in L1 (e.g. the omission of grammatical function words in obligatory contexts). However, these effects are small and the overwhelming difference is her spontaneous use of L2 in preference to L1. Green and Price (2001) argue that language control (e.g. the ability to select between one language and another) is also mediated by frontal-basal ganglia circuits. An impairment in this system will also give rise to problems in modulating the output from the lexical-semantic system.

Leaving the neuropsychological data on one side, how convincing are the ERP and neuroimaging data for a distinct representation of L2? Individuals acquiring L2 can vary in terms of when they acquired L2, how they acquired L2 and how proficient they are in using it. Typically, proficiency is confounded with the age of acquisition. In terms of proficiency it is natural to expect that less proficient users of L2 will show quantitative differences on a range of measures (e.g. naming time, ERP effects and activation patterns). The critical issue is whether or not there are qualitative differences indicating that different neural mechanisms are involved. If there are, it is important to determine whether these necessarily imply different representations.

ERP data point to both quantitative and qualitative differences in processing between L1 and L2. Kutas and Kluender (1991), for instance, found that the N400 component in response to a semantic anomaly was delayed and of lower amplitude in a bilingual’s less fluent language. Likewise, Weber-Fox and Neville (1996) found N400 present in all groups of Chinese-English second language learners though it was more delayed in those learning L2 after reaching the age of 11–13 years. More critically, in contrast to monolinguals, there was a distinct pattern of response to phrase structure violations in bilinguals. Only individuals acquiring L2 before the age of four showed no difference from native speakers of that language. Such data are compatible with the notion that there is a critical period for language learning and are consistent with the notion that different brain mechanisms mediate syntactic processing in late learners of L2. But they are not decisive, as later exposure to English was associated with worse performance in identifying syntactically anomalous sentences. Such individuals may have been circumventing syntactic processing.

Hahne and Friederici (2001) examined the effects of phrase structure violations and semantic anomaly in Japanese late learners of German. These individuals also showed substantial error rates in a grammaticality judgement task and so cannot be considered proficient. Hahne and Friederici (2001) confirmed a delayed N400 effect in response to semantic anomaly but also found a right anterior central negativity. Unlike native German speakers there was no early anterior negativity in response to a syntactic violation. They propose that late learners identify lexical content independently of morphological form (e.g. the past participle form of the verb) and construct a representation directly based on conceptual information. Hahne and Friederici speculated on the source of these effects based on other functional imaging data (Falk, Durwen, Müller et al. 1999, Opitz, Mecklinger and Friederici 2000). They proposed that late learners of L2 (at least their Japanese participants) supplement lexical-semantic information by using the right prefrontal cortex to construct a semantic-conceptual representation of sentence content.

Unfortunately, there is a dearth of functional imaging studies of L2 grammatical processing and encoding. On the production side more generally, Kim, Relkin, Lee and Hirsch (1997) used fMRI to study the representation of L1 and L2 while bilinguals covertly described what they had done the previous day. Half of their sample acquired their L2 in infancy and half after puberty. L1 and L2 were represented in spatially segregated parts of the left inferior frontal cortex (Broca’s area) in late learners but in overlapping parts of Broca’s area in early learners. Regions activated in Wernicke’s area (traditionally linked to language comprehension) overlapped for both groups. Kim et al. concluded that age of acquisition affected neural representation. However, there was no assessment of proficiency in L2 and so we cannot tell whether or not age of acquisition is critical. Late learners could have been less proficient in their L2. In fact, when L2 proficiency is high, Chee, Tan and Thiel (1999) found no difference within the left prefrontal cortex (including Broca’s area) when comparing word generation in early bilinguals (L2 acquired before the age of six) and late bilinguals (L2 acquired after the age of twelve) for Mandarin-English speakers in Singapore. The pattern of brain activation in response to Mandarin words was similar to that observed in response to English words, and did not vary as a function of age of acquisition. Klein, Milner, Zatorre et al. (1995) reached a similar conclusion: a common network of brain areas is engaged in L1 and in L2 in highly-proficient bilinguals despite late acquisition of L2.

In terms of comprehension, Abutalebi, Cappa and Perani (2001) concluded that both languages are processed in a single and common left-sided network, comprising all the classical language areas, when L2 is acquired early (before the age of five). In contrast, for late bilinguals, the degree of language proficiency is the critical factor. Highly proficient late bilinguals activate similar left hemispheric areas for L1 and L2 (Perani, Paulesu, Sebastian-Galles et al. 1998) whereas less proficient subjects have different patterns of activation for their two languages (Perani, Dehaene, Grassi et al. 1996, Dehaene et al. 1997, Price, Green and Von Studnitz 1999). Critically, more extensive activations are associated with the less proficient language (e.g. greater temporal lobe dispersion) perhaps indicating that in comprehending stories individuals process grammatical forms differently.

At the macroanatomical level, then, current functional imaging data indicate that there is little difference in the representation of L1 and L2 for highly-proficient bilinguals. The implication is that age of acquisition is less critical than proficiency. However, as Vaid and Hull (2002) observed, we still need studies that directly compare individuals differing in L2 proficiency.

7. Ways forward

The differential representation hypothesis and the convergence hypothesis concur that the initial representation of L2 may differ from that of native speakers of that language. The fundamental research requirement then is to conduct within-participant longitudinal studies to chart changes in behaviour on various tasks, e.g. picture naming (Kroll, Michael, Tokowicz and Dufour 2002), and to examine changes in ERP and neuroimaging profiles as proficiency in L2 changes. Further, such studies need to involve both syntactic and lexical tasks. To my knowledge there are currently relatively few such studies.

As discussed above, in contrast to L1, grammatical knowledge of L2 may be represented explicitly and declaratively. According to the convergence hypothesis, ERP responses to syntactic anomalies should change with proficiency. Osterhout and McLaughlin (2000) studied responses to semantic and syntactic anomalies in native speakers of French and in novice learners. Semantic anomalies yielded N400 and syntactic anomalies yielded a P600 in native French speakers. French learners after four weeks of instruction showed an N400 in response to semantic anomalies. In contrast, there was either an N400 or no effect in response to syntactic anomalies. After just four months, however, syntactic anomalies yielded a P600 but no N400. These data suggest that if there are qualitative differences between native speakers and L2 learners, these can be rather short-lived. Any differences in responding to syntactic anomalies are presumably negatively correlated with the increasing grasp of syntax. Consistent with the convergence hypothesis, Weber-Fox and Neville (submitted, cited in Ullman 2001b) examined responses to open-class and closed-class words. Native speakers of English showed a left anterior negativity to closed-class words (N280) and an N400 for open-class words. L2 speakers of English (with Chinese as their first language) showed the same open-class N400 as native English speakers. Interestingly, the response to closed-class words related to an independent test of their grammatical ability. The higher the score on the test, the earlier the anterior negativity for closed-class words.

It is important to extend longitudinal investigation to examine the neural correlates of parsing in more detail. A number of behavioural studies have examined the extent to which L2 learners of English show an influence of their L1 on resolving local syntactic ambiguities of various kinds (e.g. Frenck-Mestre and Pynte 1997, Juffs 1998). The behavioural picture is complex but not inconsistent, in my view, with the convergence hypothesis (see Kroll and Dussias in press for a recent review). In examining the extent to which L2 parsing profiles converge with those of native speakers it is possible that we will need to identify instances where an inappropriate parse leads to a high cost in recovering the intended interpretation. After all, if the intended interpretation can be recovered quickly, what computational constraint is there for the neural processing profile of an L2 learner to converge with that of a native speaker? On the other hand, the processing cost of recovery may be a function of language background. As Juffs (1998:135) proposes, languages (e.g. Japanese, Korean) with Subject Object Verb structure may lead speakers to become adept at recovery from garden-paths. Speakers of these languages may routinely make parsing decisions about theta-roles that must be revised on encountering the verb. Regardless of these kinds of possibilities, the convergence hypothesis predicts that for proficient speakers of L2, sentence interpretation will activate an area in the anterior temporal pole and this area, as in the case of L1 (Noppeney and Price submitted), will show evidence of syntactic priming, i.e. reduced activation in circumstances where the same syntactic structure is repeated either within- or between-languages.

The achievement of proficiency also entails attending to the world in the manner of native speakers so that lexical and grammatical processes can be coordinated appropriately (Black and Chiat 2003, Levelt, Roelofs and Meyer 1999, Slobin 1996). It is this pattern of coordination that also needs to be considered. Neuroimaging allows us to consider how areas work together. Büchel, Friston and Frith (2000:339), for instance, describe methods for examining effective connectivity (“the influence one neuronal system exerts on another”) using structural equation modelling of the patterns of activation in different regions of interest. We should expect convergence of these patterns of effective connectivity with those of native speakers of the language as proficiency increases. By way of illustration, consider differences between languages in the way they package together different aspects of a movement. English packages manner and motion together (hop, float) and, unlike some other languages (e.g. Spanish), has fewer verbs that express motion and direction together (e.g. rise, fall). In order to select the correct verb in Spanish an English speaker must explicitly encode the direction of motion, i.e. make that property of the scene salient rather than the manner of motion. Proficiency should be associated with changes in the explicit representation of the properties of scenes. In terms of activation patterns, there should be changes in the activation and connection of cortical regions mediating those properties so that the correct words can be selected and expressed in a suitable syntactic frame.5

8. Conclusion

This chapter has contrasted two main hypotheses about the representation and processing of lexicon and grammar in L2. The subtle form of the differential representation hypothesis proposes that declarative representations play a much more important role in the representation of grammar in L2 than in L1. This chapter has considered the computational and empirical basis for the differential representation hypothesis and argued, on both computational and neuroimaging grounds, for an alternative, convergence hypothesis. According to this hypothesis, as proficiency in L2 increases, the networks mediating L2 converge with those mediating language use in native speakers of that language. Current evidence marginally favours the convergence hypothesis. However, we lack appropriate longitudinal studies of L2 acquisition. Crucially, we have little or no information about the functional integration of different neural regions during second language use. Scope indeed for discovery!

Notes

* I thank the editors of this volume for the opportunity to contribute to our understanding of the interface between syntax and the lexicon in L2 acquisition and for constructive comments on a previous draft of this chapter.

1. It would be useful to have converging evidence for the existence of distinct neural systems beyond that offered by amnesia. Fortunately, we do not need to rely exclusively on neuropsychological data. Neocortical activity is reduced for stimuli that have been processed before (Ungerleider 1995) and this datum has been interpreted as evidence that these structures mediate priming. But amnesics can also be impaired on priming in certain conditions (Ostergaard 1999).

2. The commonality assumption is compatible with anatomical variability. Brains differ in the location of quite major features (e.g. Rickard 2000). The commonality assumption refers to the sensitivity of neural regions, not to their precise anatomical location. Neuroimaging evidence for convergence must take such variability into account.


3. At the lexical level, the impact of a prior representation is captured by cognitive models such as the revised hierarchical model (Kroll and Stewart 1994) and the distributed features model (e.g. De Groot 1993, Kroll and de Groot 1997). It might also be argued that prior representation of L1 induces a radically distinct representation of L2. Jiang and Forster (2001) propose, on the basis of experimental evidence, that lexical items in L2 (they tested native Chinese speakers with English as the L2) are represented in a non-lexical memory. This memory allows the meaning of translation equivalents to be retrieved indirectly via the L1 lexical item. However, it is unclear what it means for a non-lexical system to represent the syntactic properties of lexical items. Their findings, as they acknowledge, need to be replicated with proficient L2 speakers and with different language pairs.

4. ERPs provide high-resolution temporal evidence of the existence of different processes during language processing. They are derived by averaging signals from an electroencephalogram over a series of trials that are time-locked to the presentation of a particular type of stimulus. An ERP itself comprises various components (i.e. positive and negative voltage peaks). Where these are affected by some experimental manipulation they are termed ERP effects. ERP data are compatible with an infinite number of neural generators (the “inverse problem”) and so they need to be complemented by data from haemodynamic methods. Haemodynamic methods (Positron Emission Tomography, PET; or functional Magnetic Resonance Imaging, fMRI) rely on a close coupling between changes in the activation of a population of neurons and change in blood supply. A haemodynamic effect arises only when there is a change in the overall metabolic demand in a neuronal population. PET and fMRI track different signals. PET measures the decay of a short-lived isotope which accumulates in a neural region in proportion to the amount of blood flowing through that region. The most typical fMRI method indexes metabolic demand and hence relative neural activity by assessing the ratio of deoxy- to oxyhaemoglobin in the blood (see Rugg (2000) for a critical appraisal of these methods). It is worth noting that haemodynamic methods, along with electrophysiological methods, allow us to identify regions that are sufficient for task performance but they do not allow us to identify regions that are necessary for task performance. Other data are needed to identify which regions are necessary. For instance, if task performance is impaired in a patient with a lesion at a given site then this region, or the network of which it is part, is necessary for task performance. Likewise, “virtual” lesions induced by drugs or by transcranial magnetic stimulation may help identify regions necessary for task performance.

5. Lower levels of proficiency in L2 might also be associated with more reliance on conceptual/pragmatic information (see Hahne and Friederici above). In native speakers, conceptual factors affect grammatical encoding (Vigliocco and Hartsuiker 2001) and it is reasonable to expect that such effects might be more marked in novice learners. A sentence completion task offers one behavioural measure. Vigliocco and Franck (2001) showed that there were more errors in generating a predicate when the sex of the referent was incongruent with the gender of the noun. Given that novice learners of French or Italian, for instance, know the syntactic gender of the noun, they might also show greater effects of incongruity.


References

Abutalebi, J., Cappa, S.F. and Perani, D. 2001. “The bilingual brain as revealed by functional imaging”. Bilingualism: Language and Cognition 4: 179–190.

Albert, M.L. and Obler, L.K. 1978. The bilingual brain: Neuropsychological and neurolinguistic aspects of bilingualism. New York: Academic Press.

Bialystok, E. 1992. “Selective attention in cognitive processing: The bilingual edge”. In Cognitive processes in bilinguals, R. J. Harris (ed.), 501–513. Amsterdam: Elsevier Science Publishers B.V.

Birdsong, D. and Molis, M. 2001. “On the evidence for maturational constraints in second-language acquisition”. Journal of Memory and Language 44: 235–249.

Black, M. and Chiat, S. 2003. “Noun-verb dissociations: a multi-faceted phenomenon”. Journal of Neurolinguistics 16: 231–250.

Breedin, S.D. and Saffran, E.M. 1999. “Sentence processing in the face of semantic loss: A case study”. Journal of Experimental Psychology: General 128: 547–562.

Büchel, C., Frith, C. and Friston, K. 2000. “Functional integration: Methods for assessing interactions amongst neuronal systems”. In The neurocognition of language, C.M. Brown and P. Hagoort (eds), 337–355. Oxford: Oxford University Press.

Chee, M.W.L., Tan, E.W.L. and Thiel, T. 1999. “Mandarin and English single word processing studied with functional Magnetic Resonance Imaging”. Journal of Neuroscience 19: 3050–3056.

Chomsky, N. 1995. The minimalist program. Cambridge, MA: MIT Press.

Cleeremans, A. 1993. Mechanisms of implicit learning: Connectionist models of sequence processing. Cambridge, MA: MIT Press.

Dehaene, S.D., Dupoux, E., Mehler, J., Cohen, L., Paulesu, E., Perani, D., van de Moortele, P.F., Lehéricy, S. and Le Bihan, D. 1997. “Anatomical variability in the cortical representation of first and second languages”. Neuroreport 8: 3809–3815.

De Groot, A.M.B. 1993. “Word-type effects in bilingual processing tasks: Support for a mixed representational system”. In The bilingual lexicon, R. Schreuder and B. Weltens (eds), 27–51. Amsterdam: John Benjamins.

De Groot, A.M.B., Delmaar, P. and Lupker, S.J. 2000. “The processing of interlexical homographs in translation recognition and lexical decision: Support for non-selective access to bilingual memory”. Quarterly Journal of Experimental Psychology 53A: 397–428.

Dijkstra, A., van Jaarsveld, H. and ten Brinke, S. 1998. “Interlingual homograph recognition: Effects of task demands and language intermixing”. Bilingualism: Language and Cognition 1: 51–66.

Dronkers, N.F., Redfern, B.B. and Knight, R.T. 2000. “The neural architecture of language disorders”. In The new cognitive neurosciences, M.S. Gazzaniga (ed.), 949–960. Cambridge, MA: MIT Press.

Edelman, G.M. 1989. The remembered present: A biological theory of consciousness. New York: Basic Books.

Ellis, N.C. 1995. “The psychology of foreign language vocabulary acquisition: Implications for CALL”. Computer Assisted Language Learning 8: 103–128.


Fabbro, F. 1999. The neurolinguistics of bilingualism: An introduction. Hove, Sussex: Psychology Press.

Fabbro, F. and Paradis, M. 1995. “Differential impairments in four multilingual patients with subcortical lesions”. In Aspects of bilingual aphasia, M. Paradis (ed.), 139–176. Oxford, UK: Pergamon.

Falk, A.R., Durwen, H.F., Müller, C., König, M., Müller, E. and Heuser, L. 1999. “Determination of eloquent cortical areas in Russian bilinguals performing a word generation task”. Neuroimage 6: S1002.

Flege, J.E. 1995. “Second-language speech learning: Theory, findings and problems”. In Speech perception and linguistic experience: Issues in cross-language research, W. Strange (ed.), 233–277. Timonium, MD: York Press.

Flege, J.E., Yeni-Komshian, G.H. and Liu, S. 1999. “Age constraints on second-language acquisition”. Journal of Memory and Language 41: 78–104.

Frenck-Mestre, C. and Pynte, J. 1997. “Syntactic ambiguity resolution while reading in second and native languages”. Quarterly Journal of Experimental Psychology 50A: 119–148.

Gabrieli, J.D. 1998. “Cognitive neuroscience of human memory”. Annual Review of Psychology 49: 87–115.

Gollan, T.H. and Kroll, J.F. 2001. “Lexical access in bilinguals”. In A handbook of cognitive neuropsychology: What deficits reveal about the human mind, B. Rapp (ed.), 321–345. New York: Psychology Press.

Green, D.W. 1986. “Control, activation and resource: A framework and a model for the control of speech in bilinguals”. Brain and Language 27: 210–223.

Green, D.W. 1998. “Mental control of the bilingual lexico-semantic system”. Bilingualism: Language and Cognition 1: 67–81.

Green, D.W. and Price, C. 2001. “Functional imaging in the study of recovery patterns in bilingual aphasics”. Bilingualism: Language and Cognition 4: 191–201.

Gupta, P. and Dell, G.S. 1999. “The emergence of language from serial order and procedural memory”. In The emergence of language, B. MacWhinney (ed.), 447–481. Mahwah, NJ: Lawrence Erlbaum Associates.

Hagoort, P., Brown, C.M. and Osterhout, L. 2000. “The neurocognition of syntactic processing”. In The neurocognition of language, C.M. Brown and P. Hagoort (eds), 273–316. Oxford: Oxford University Press.

Hahne, A. and Friederici, A.D. 2001. “Processing a second language: Late learners’ comprehension mechanisms as revealed by event-related brain potentials”. Bilingualism: Language and Cognition 4: 123–141.

Hamann, S.B. and Squire, L.R. 1997. “Intact priming for novel perceptual representations in amnesia”. Journal of Cognitive Neuroscience 9: 699–713.

Hebb, D.O. 1949. The organization of behaviour: A neuropsychological theory. New York: Wiley.

Hermans, D. 2000. Word production in a foreign language. Doctoral dissertation, Katholieke Universiteit Nijmegen.

Jiang, N. and Forster, K. I. 2001. “Cross-language priming asymmetries in lexical decision and episodic recognition”. Journal of Memory and Language 44: 32–51.

Johnson, J.S. and Newport, E.L. 1989. “Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language”. Cognitive Psychology 21: 60–99.


Juffs, A. 1998. “Main verb versus reduced relative clause ambiguity resolution in L2 sentence processing”. Language Learning 48: 107–147.

Kim, K.H.S., Relkin, N.R., Lee, K.M. and Hirsch, J. 1997. “Distinct cortical areas associated with native and second languages”. Nature 388: 171–174.

Kinder, A. and Shanks, S. 2001. “Amnesia and the declarative/non-declarative distinction: A recurrent network model of classification, recognition and repetition priming”. Journal of Cognitive Neuroscience 13: 648–669.

Klein, D., Milner, B., Zatorre, R., Meyer, E. and Evans, A. 1995. “The neural substrates underlying word generation: A bilingual functional-imaging study”. Proceedings of the National Academy of Sciences USA 92: 2899–2903.

Knowlton, B. J. and Squire, L.R. 1993. “The learning of categories: Parallel brain systems for item memory and category knowledge”. Science 262: 1747–1749.

Knowlton, B. J. and Squire, L.R. 1994. “The information acquired during artificial grammar learning”. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 79–91.

Knowlton, B. J. and Squire, L.R. 1996. “Artificial grammar learning depends on implicit acquisition of both abstract and exemplar-specific information”. Journal of Experimental Psychology: Learning, Memory, and Cognition 22: 169–181.

Kroll, J.F. and Dussias, P.E. (in press). “The comprehension of words and sentences in two languages”. In Handbook of bilingualism, T. Bhatia and W. Ritchie (eds), Cambridge, MA: Blackwell Publishers.

Kroll, J.F. and de Groot, A.M.B. 1997. “Lexical and conceptual memory in the bilingual: Mapping form to meaning in two languages”. In Tutorials in bilingualism: Psycholinguistic perspectives, A.M.B. de Groot and J.F. Kroll (eds), 169–199. Mahwah, NJ: Lawrence Erlbaum Associates.

Kroll, J.F. and Stewart, E. 1994. “Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations”. Journal of Memory and Language 33: 149–174.

Kroll, J.F., Michael, E., Tokowicz, N. and Dufour, R. 2002. “The development of lexical fluency in a second language”. Second Language Research 18: 137–171.

Ku, A., Lachmann, E.A. and Nagler, W. 1996. “Selective language aphasia from herpes simplex encephalitis”. Pediatric Neurology 15: 169–171.

Kutas, M. and Kluender, R. 1991. “What is who violating? A reconsideration of linguistic violations in light of event-related brain potentials”. In Cognitive Electrophysiology, H.-J. Heinze, T.F. Münte and G.R. Mangun (eds), 183–210. Boston: Birkhäuser.

Lebrun, Y. 2002. “Implicit competence and explicit knowledge”. In Advances in neurolinguistics of bilingualism, F. Fabbro (ed.), 299–313. Udine: Forum.

Lenneberg, E.H. 1976. Biological foundations of language. New York: Wiley.

Levelt, W.J.M., Roelofs, A. and Meyer, A.S. 1999. “A theory of lexical access in speech production”. Behavioral and Brain Sciences 22: 1–75.

Loring, D.W., Meador, K. J., Lee, G.P., et al. 1990. “Cerebral language lateralization: evidence from intracarotid amobarbital testing”. Neuropsychologia 28: 831–838.

McClelland, J.L., McNaughton, B.L. and O’Reilly, R.C. 1995. “Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory”. Psychological Review 102: 419–437.


MacWhinney, B. 1997. “SLA and the competition model”. In Tutorials in bilingualism: Psycholinguistic perspectives, A.M.B. de Groot and J.F. Kroll (eds), 113–142. Mahwah, NJ: Lawrence Erlbaum Associates.

Mesulam, M.M. 1990. “Large scale neurocognitive networks and distributed processing for attention, language and memory”. Annals of Neurology 28: 597–613.

Noppeney, U. and Price, C. J. (submitted). “The neural basis of syntactic priming”. Ms. Wellcome Department of Imaging Neuroscience, UCL.

Nosofsky, R.M. and Zaki, S.R. 1998. “Dissociations between categorization and recognition in amnesic and normal individuals: an exemplar-based interpretation”. Psychological Science 9: 247–255.

Opitz, B., Mecklinger, A. and Friederici, A.D. 2000. “Functional asymmetry of human prefrontal cortex: Encoding and retrieval of verbally and non-verbally coded information”. Learning and Memory 7: 85–96.

Ostergaard, A.L. 1999. “Priming deficits in amnesia: Now you see them now you don’t”. Journal of the International Neuropsychological Society 5: 175–190.

Osterhout, L. and McLaughlin, J. 2000. “What brain activity can tell us about second-language learning”. Paper presented at the 13th Annual CUNY conference on Human Sentence Processing, San Diego.

Paradis, M. 1994. “Neurolinguistic aspects of implicit and explicit memory: Implications for bilingualism and second language acquisition”. In Implicit and explicit language learning, N. Ellis (ed.), 393–419. London: Academic Press.

Paradis, M. 1997. “The cognitive neuropsychology of bilingualism”. In Tutorials in bilingualism: Psycholinguistic perspectives, A.M.B. de Groot and J.F. Kroll (eds), 331–354. Mahwah, NJ: Lawrence Erlbaum Associates.

Paradis, M. 2001. “Bilingual and polyglot aphasia”. In Handbook of neuropsychology, 2nd edition, vol. 3 Language and aphasia, R.S. Berndt (ed.), 69–91. Amsterdam: Elsevier Science.

Perani, D., Dehaene, S., Grassi, F., Cohen, L., Cappa, S.F., Dupoux, E., Fazio, F. and Mehler, J. 1996. “Brain processing of native and foreign languages”. NeuroReport 7: 2439–2444.

Perani, D., Paulesu, E., Sebastian-Galles, N., Dupoux, E., Dehaene, S., Bettinardi, V., Cappa, S.F., Fazio, F. and Mehler, J. 1998. “The bilingual brain: Proficiency and age of acquisition of the second language”. Brain 121: 1841–1852.

Pinker, S. 1994. The language instinct. New York: William Morrow.

Price, C. J. 2000. “The anatomy of language: contributions from functional neuroimaging”. Journal of Anatomy 197: 335–359.

Price, C. J., Green, D. and Von Studnitz, R. 1999. “A functional imaging study of translation and language switching”. Brain 122: 2221–2236.

Price, C. J., Moore, C., Humphreys, G. and Wise, R. 1997. “Segregating semantic from phonological processes during reading”. Journal of Cognitive Neuroscience 9: 727–733.

Rapport, R.L., Tan, C.T., and Whitaker, H.A. 1983. “Language function and dysfunction among Chinese and English speaking polyglots: Cortical stimulation, Wada testing, and clinical studies”. Brain and Language 18: 342–366.

Rickard, T.C. 2000. “Methodological issues in functional magnetic resonance imaging studies of plasticity following brain injury”. In Cerebral reorganization of function after brain damage, H.S. Levin and J.G. Grafman (eds), 304–317. Oxford: Oxford University Press.


Robertson, I.H. and Murre, J.M. 1999. “Rehabilitation of brain damage: Brain plasticity and principles of guided recovery”. Psychological Bulletin 125: 544–575.

Rugg, M.R. 2000. “Functional neuroimaging in cognitive science”. In The neurocognition of language, C. Brown and P. Hagoort (eds), 15–36. Oxford: Oxford University Press.

Scoresby-Jackson, R.E. 1867. “Case of aphasia with right hemiplegia”. Edinburgh Medical Journal 12: 696–706.

Segalowitz, N.S. and Segalowitz, S. J. 1993. “Skilled performance, practice, and the differentiation of speed-up from automatization effects: Evidence from second language word recognition”. Applied Psycholinguistics 14: 369–385.

Slobin, D. I. 1996. “From ‘thought and language’ to ‘thinking for speaking’”. In Rethinking linguistic relativity, J. J. Gumperz and S.C. Levinson (eds), 177–202. Cambridge: Cambridge University Press.

Springer, J.A., Binder, J.R., Hammeke, T.A. et al. 1999. “Language dominance in neurologically normal and epilepsy subjects: A functional MRI study”. Brain 122: 2033–2046.

Squire, L.R., Knowlton, B. and Musen, G. 1993. “The structure and organization of memory”. Annual Review of Psychology 44: 453–495.

Squire, L.R. 1994. “Declarative and nondeclarative memory: Multiple brain systems supporting learning and memory”. In Memory systems, D.L. Schacter and E. Tulving (eds), 203–231. Cambridge, MA: MIT Press.

Strauss, E., Satz, P. and Wada, J. 1990. “An examination of the crowding hypothesis in epileptic patients who have undergone the carotid amytal test”. Neuropsychologia 28: 1221–1227.

Teuber, H.L. 1974. “Why two brains?” In The Neurosciences. Third study program, F.G. Worden (ed.), 71–74. Cambridge: MIT Press.

Tomasello, M. 1995. “Language is not an instinct”. Cognitive Development 10: 131–156.

Ullman, M.T. 2001a. “The declarative/procedural model of lexicon and grammar”. Journal of Psycholinguistic Research 30: 37–69.

Ullman, M.T. 2001b. “The neural basis of lexicon and grammar in first and second language: the declarative/procedural model”. Bilingualism: Language and Cognition 4: 105–122.

Ungerleider, L.G. 1995. “Functional brain imaging studies of cortical mechanisms for memory”. Science 270: 769–775.

Vaid, J. and Hull 2002. “Re-envisioning the bilingual brain using functional neuroimaging: Methodological and interpretive issues”. In Advances in neurolinguistics of bilingualism, F. Fabbro (ed.), 315–355. Udine: Forum.

Vigliocco, G. and Franck, J. 2001. “When sex affects syntax: Context effects in sentence production”. Journal of Memory and Language 45: 368–390.

Vigliocco, G. and Hartsuiker, R. J. 2001. “The interplay of meaning, sound, and syntax in sentence production”. Psychological Bulletin (under review).

Weber-Fox, C. and Neville, H. J. 1996. “Maturational constraints on functional specializations for language processing: ERP and behavioral evidence in bilingual speakers”. Journal of Cognitive Neuroscience 8: 231–256.

Weber-Fox, C. and Neville, H. J. (submitted). “Sensitive periods differentiate processing subsystems for open and closed class words: An ERP study in bilinguals”. Cited in Ullman 2001b (above).

</TARGET "gre">

<TARGET "hou" DOCINFO AUTHOR "Roeland van Hout, Aafke Hulk and Folkert Kuiken"TITLE "The interface"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 10

The interface

Concluding remarks

Roeland van Hout, Aafke Hulk and Folkert Kuiken
University of Nijmegen (Hout) / Utrecht University (Hulk) / Universiteit van Amsterdam (Kuiken)

1. Interfaces in generative grammar

In its bare form, a grammar can be defined as a set of elements or symbols (the lexicon) and a set of rules (the syntax) that together produce output strings, the utterances of the language belonging to the grammar. The format of the lexicon is evident: it has no structure, and it does not need to have one. Chomsky (1965:84) characterizes the lexicon as “simply an unordered list of all lexical formatives”. Although the assumed absence of order may be accepted as a heuristic device, the lexicon of human languages and of human speakers is far from an unordered list, as is apparent from many recent lexical studies as well as from the chapters in this book. The form, role and structure of lexical items, words, lemmas, or whatever the lexical ‘formatives’ are called, and the relationships or connections between them constitute the pivotal domain of research in the chapters of this book. The nearest neighbour to which lexical items and their structural properties connect is syntax, the machinery by which utterances can be computed on the basis of lexical input. The questions then are: How precisely are lexicon and syntax linked? What kind of information do they exchange? What is the interface between lexicon and syntax? Which interface levels need to be distinguished? There are several possible answers to these questions, depending on one’s theoretical perspective.

Within the generative, modular paradigm, Chomsky (1995:131) distinguishes two interface levels: the level of phonetic form (PF) is the interface with sensorimotor systems, and the level of logical form (LF) is the interface with systems of conceptual structure and language use. The two performance systems involved are the articulatory-perceptual system and the conceptual-intentional system, or to put it in a more straightforward way, they relate to sound and meaning. This generativist view is captured by Jackendoff (2002:197) in the following diagram:

[Diagram: lexicon above syntax, with phonology and semantics below syntax]

This diagram illustrates the central position of syntax, and, at the same time, it raises the question of the relationship between lexicon and syntax. In the chapter by Norbert Corver, this relationship was identified as the third interface level. Corver cited Chomsky (1991:46):

…that there are three ‘fundamental’ levels of representation: D-structure, PF and LF. Each constitutes an ‘interface’ of the syntax (broadly construed) with other systems: D-structure is the projection of the lexicon, via the mechanisms of X-bar theory; PF is associated with articulation and perception, and LF with semantic interpretation.

D-structure is no longer a separate level in minimalist theory, as Corver points out, but the internal interface of syntax and lexicon is still to be distinguished. The lexical input needs to provide the information required to put grammar to work. This means that lexical items in generative grammar contain information related to syntax (the formal features), to phonology (the phonological matrix), and to semantics (the semantic features).

The minimalist approach puts syntax in a central position. Other generative viewpoints exist; for instance, Jackendoff (2002) attaches equal generative capacity and autonomy to each of the three levels. From his perspective, phonological structures, syntactic structures and conceptual structures are part of the same processing architecture (Jackendoff 2002:199), which implies that interface issues between lexicon and syntax are basically not different from conceptual and phonological interface issues, as is illustrated in the following diagram:

[Diagram: lexicon above three parallel components: phonology, syntax, semantics]


The contributions in this book concentrate on the lexicon-syntax dimension, without any direct claims about the relative importance of other interfaces. Interesting contributions on the relationships between semantics/conceptual structures and syntactic structures, for instance, can be found in Bowerman and Levinson (2001), in the context of (first) language acquisition.

In the Jackendoff approach, the key position of the lexicon in relation to phonological, syntactic, and semantic structures is reflected in the format of lexical items. Three layers of information are distinguished for lexical items: phonological features, semantic features, and syntactic (or formal) features. The lexicon contains the fuel to put language to work, which applies to both content and function words. In his chapter Ton Dijkstra added the orthographic layer, a consequence of our literate society that tends to be overlooked by linguists.

2. Learning the syntax, learning the lexicon

As discussed in Richard Towell’s chapter, L2 acquisition, in contrast to L1 acquisition, is marked by variability and incompleteness. Any theory of human language should be able to explain these two phenomena, but, in addition, the claim can be made that L2 acquisition can shed light on the properties of human language. The interaction between one language (for instance the L1) and another language (for instance the L2) in the heads and mouths of real speakers can produce evidence for the core properties of lexical and syntactic structures (cf. Muysken 2000). Including the developmental track of acquisition and stagnation stresses even further the relevance of bilingual language processing as a primary topic in language research.

What is the relationship between syntax and lexicon in L2 acquisition? A strong argument in favour of the high status of lexical information is that second language learners have always been aware of the kernel value of a lexicon. Learners prefer to walk around with dictionaries, not with grammars. It has been understood for some time that syntax and lexicon involve different kinds of learning: syntax is learnt through a process of implementing a particular set of universal structures; lexis is learnt by establishing a set of arbitrary associations which operate in a given society. The learning of syntax is often characterised as a process of triggering; the learning of lexis is characterised by the building up of associations (or connections). Yet these two systems must come together in the creation of a whole linguistic system in the mind of an individual. The syntax will govern the phrase structure of the grammar but the lexical items will govern how the phrase structure is implemented; notably, the argument structure and the feature composition of the lexical items are essential to the implementation of the syntax in the language production process.

This book was designed to examine the relative contribution of these two dimensions in a clear fashion, through illustrations of exemplary research carried out within each paradigm, and to examine how they can be made to inter-relate in a way which would enable us to explain better the overall process of SLA. An examination of the interface between syntax and the lexicon is both timely and important. Both groups of researchers are now coming to an understanding that their particular view of the world may not suffice to account for the overall process and that each will have to understand more about what the other knows. From the point of view of the researcher interested in syntax (generally coming from a background in linguistics), the shift of linguistic theory away from principles and parameters and into minimalism has resulted in a crucial increase in the significance of the lexicon. Within minimalism so much of the information essential for the working of the system has been assigned to the lexicon that it has become crucial for syntacticians to reflect more on how the lexicon works. From the point of view of the researcher interested in the lexicon (generally coming from a background in psychology), it is important to integrate the outcome of lexical learning within the overall acquisition process. Unless one adopts a purely connectionist position, it is clear that the use of the lexical items studied can only take place within the syntactic system. An understanding of the acquisition of the syntax is therefore essential to understanding second language processing and acquisition as a whole. The introductory chapter by Richard Towell explicates the complementarity of linguistics and psychology in doing language research. Taking the other chapters in this book into consideration, he balances the linguistic and psychological dimensions of basic questions in second language acquisition research. Towell argues that the lexical part of the lexicon-syntax interface is the driving force of language acquisition and that we need to investigate the psychological mechanisms behind this development.

3. Some final considerations

The chapters in this book demonstrate the many different perspectives required to study second language acquisition over its full range. A whole series of contrasts keeps returning: symbolic learning vs. connectionist learning, L1 vs. L2 acquisition, procedural vs. declarative knowledge, structure vs. process, competence vs. performance, etc. One main conclusion is that, over and over again, we need to plead for a cocktail of treatments, for many sorts of data, and for many sorts of expertise to give an appropriate answer to the questions belonging to such contrastive pairs. In this cocktail, the following distinctions can be made.

3.1 Variety in methodology

It will be clear that no single methodology can produce all the answers we need. We need the linguistic analysis of spontaneous and elicited speech data (see the chapters by Hawkins and Liszka, Van de Craats, Corver), but also on-line and off-line grammatical judgment tasks (see the chapters by Duffield, and Sabourin and Haverkort). We need to carry out psycholinguistic experiments with reaction times (see the chapters by Duffield and Dijkstra), but newer methodologies such as eye-tracking and neuro-imaging techniques should also be applied (see the chapters by Sabourin and Haverkort, and Green). Another promising methodology is the use of computer simulations (see the chapter by Williams).

3.2 Variety in learners and languages

The discovery and testing of general or universal principles and parameters in language and language acquisition require the full range of learners: from real beginners (see Corver and Van de Craats) to (near-)native speakers (see, e.g., Hawkins and Liszka), from unguided acquisition to classroom learning. At the same time we need to consider the full language typological range, both as a source and a target language in L2 acquisition (the chapters in this book cover a range larger than in many other books on L2 acquisition). As for the computer simulations, different learning algorithms should be probed, including algorithms where previous knowledge (L1) can be implemented to explore its effect on acquiring another language system (L2). Without studying real beginners, it does not seem feasible to get an answer to the question of the role of cognitive vs. linguistic principles in speaking a new language (cf. Klein and Perdue 1997), and how, perhaps, cognitive principles are matched by linguistic structures. Cognitive principles can be influential in an indirect way via the syntax-semantics interface, but maybe they directly trigger specific syntactic mechanisms. That would contradict approaches based solely on lexical formal features as the main sources of information for syntactic structures.

3.3 Variety in linguistic domains

The same linguistic domains must be investigated, in both L1 and L2 acquisition, including not only inflectional processes but derivational morphological devices as well, and including not only syntactic parameters related to the whole clause, but also parameters for subdomains (possessive DPs, for instance; see Van de Craats). Studying different linguistic domains can be helpful in determining which words are stored and which explicit rules are active in producing lexical items. Schreuder and Baayen (1997) show that even regular inflectional forms can be stored instead of being computed. Morphology is interesting here because it constitutes, due to its paradigmatic organisation, the link between syntactic structures and sets of lexical items.

In L1 acquisition the acquisition of past tense forms has been an important domain for testing associative learning algorithms (cf. Plunkett and Juola 1999). Past tense forms, from a generative point of view, are the topic of research of Hawkins and Liszka in this book. They observe that high proficiency adult L2 speakers of English do not always mark thematic verbs like walk, notice for past tense in obligatory past tense contexts. They find that the morphology cannot be the source of such optionality. In their view, optionality in adult L2 performance results from the interaction between a perfectly functioning syntactic component and an impaired lexicon (as far as tense features are concerned).

Gender systems turn out to be another attractive domain in L2 acquisition because of their irregularities and their covert regularities: Williams suggests that problems learning gender in a second language may reflect the weakness of the kind of associative learning mechanism that underlies incidental learning. Sabourin and Haverkort argue that there is a qualitative difference between native speakers and second language learners in language processing. They suggest that the German participants use a translation strategy to learn Dutch gender assignment and that they use their L1 processing strategies to process their L2 in cases where the grammars are very similar.

3.4 Variety in contexts and tasks

Both Hawkins and Liszka, and Sabourin and Haverkort make it clear that different tasks may return different results. A simple production task may show that a learner has control over a specific process, whereas spontaneous language production may show strong differences between native and L2 speakers. Dijkstra’s research corroborates the naturalness of such task differences. The lexicon is a flexible store whose properties may differ with the task it is confronted with. Only different forms of processing can be held responsible for such task differences, which means that processing is an inherent part of linguistic structuring. The chapters by Dijkstra and Green show how intricate the organisation of the bilingual lexicon is. Differences in organisation may be related to the way differences between speakers and communities develop, dependent upon the type of bilingualism and, in addition, the linguistic distance between the languages involved.

3.5 Variety in perspectives

Towell’s introduction to this book is a plea for a stronger cooperation between linguistics and psychology and a plea for longitudinal studies. Generative linguistics alone will not do.

Whilst generative linguists claim to describe the acquisition process, most of their efforts pertain to the classification of successive linguistic stages of learners’ interlanguages. Towell concludes that the driving force of language learning must come from lexis, as there is no driving force available in the computational system (CS) as it is defined in modern generative syntax. Syntactic knowledge is a template simply present in the mind of the learner that automatically operates given the lexical information inserted. This is illustrated by Corver, who argues, both for word level categories and phrasal level categories, that L2-expressions are just as perfect as L1-expressions from an interface perspective, even though from the perspective of the target language they may seem highly imperfect.

On the other hand, the learning of lexis has been thought of mainly in terms of some form of associative learning theory, with connectionism being the leading variant in language acquisition research. Some theorists have concluded that connectionist learning can account for the totality of the learning, including the learning of syntax. Generative syntacticians argue that the sophisticated structures which they observe, and which are not visible at the surface structure of the language, cannot be learnt in an empirical fashion alone, and therefore they claim that innate knowledge (mediated or not by the L1) must be ‘guiding’ language acquisition. It is clear that both groups have a strong case to make but that neither is able on its own to account for the total process. Syntax needs to be fed by lexical information, including formal features; this information has to be collected by the lexicon from output strings generated by syntax. This is particularly clear in the contribution of Van de Craats, who shows how syntax and lexicon interact in the data of Moroccan and Turkish adults learning Dutch outside the classroom. The starting point of their developmental process is assumed to be the fully-fledged grammar of the L1, which, under the impact of the L2 input from the environment, changes towards a grammar that yields more target-like L2 output.

The computation of syntactic information on the basis of output strings seems to require storage devices, associative linking and analogical strategies. Such computational efforts need time (see Van de Craats), and sometimes a lot of time before the proper information can become available in the production of spontaneous speech (see Hawkins and Liszka).

We hope that the readers of this volume have come to appreciate the complexities involved in the issue of interface and will be encouraged to do more interdisciplinary research in order to pave the way for a deeper understanding of the interface between syntax and the lexicon in second language acquisition.

References

Bowerman, M. and Levinson, S. (eds). (2001). Language acquisition and conceptual development. Cambridge: Cambridge University Press.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1991). Some notes on economy of derivation and interpretation. In Principles and parameters in comparative grammar, R. Freidin (ed.), 417–454. Cambridge, MA: MIT Press.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Jackendoff, R. (2002). Foundations of language. Brain, meaning, grammar, evolution. Oxford: Oxford University Press.
Klein, W. and Perdue, C. (1997). The basic variety. Or: Couldn’t natural languages be much simpler? Second Language Research 13, 301–347.
Muysken, P. (2000). Bilingual speech. A typology of code-mixing. Cambridge: Cambridge University Press.
Plunkett, K. and Juola, P. (1999). A connectionist model of English past tense and plural morphology. Cognitive Science 23, 436–490.
Schreuder, R. and Baayen, H. (1997). How complex simplex words can be. Journal of Memory and Language 37, 118–139.

</TARGET "hou">
