a cross-linguistic study of the relationship … · 2 a cross-linguistic study of the relationship...

24
A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development (Topic Area: Grammar) Antonella Devescovi 1 Maria Cristina Caselli 2 Daniela Marchione 1 Judy Reilly 3 Elizabeth Bates 4 1 University of Rome ‘La Sapienza’ 2 Institute for Cognitive Science & Technology, National Council of Research, Rome, Italy 3 San Diego State University 4 University of California, San Diego Technical Report CND-0301 2003 Project in Cognitive and Neural Development Center for Research in Language University of California, San Diego La Jolla, CA 92093-0526

Upload: lydat

Post on 30-Jul-2018

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

A Cross-Linguistic Study of the Relationship between Grammar &Lexical Development

(Topic Area: Grammar)

Antonella Devescovi1

Maria Cristina Caselli2

Daniela Marchione1

Judy Reilly3

Elizabeth Bates4

1University of Rome ‘La Sapienza’

2Institute for Cognitive Science & Technology, National Council of Research, Rome, Italy

3San Diego State University

4University of California, San Diego

Technical Report CND-03012003

Project in Cognitive and Neural DevelopmentCenter for Research in Language

University of California, San DiegoLa Jolla, CA 92093-0526

Page 2: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

2

A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development

AbstractThe relationship between grammatical and lexical development was compared in 233 English and 233 Italian children between18 and 30 months of age, matched for age, gender, and vocabulary size on the MacArthur Communicative DevelopmentInventories (CDI). Four different measures of Mean Length of Utterance were applied to the three longest utterances reported byparents, and to corrected/expanded versions representing the ‘target’ for each utterance. Italians had longer MLUs on mostmeasures, but the ratio of actual to target MLUs did not differ between languages. Age and vocabulary both contributedsignificant variance to MLU, but the contribution of vocabulary was much larger, suggesting that vocabulary size may provide abetter basis for cross-linguistic comparisons of grammatical development. The relationship between MLU and vocabulary sizewas non-linear in English but linear in Italian, suggesting that grammar ‘gets off the ground’ earlier in a richly inflected language.A possible mechanism to account for this difference is discussed.

Cross-linguistic studies have played a central rolein child language research for decades (Braine, 1976;Slobin (1985, 1992, 1997); MacWhinney & Bates,1989; Choi & Bowerman, 1991; Berman & Slobin,1994; Bates, Devescovi, & Wulfeck, 2001; Bowerman& Choi, 2001). As Slobin has pointed out repeatedly inhis own pioneering work on the topic, this is theresearch strategy that stands the best chance of helpingus to disentangle universal versus language-specificphenomena in language development. Through suchstudies, we have learned that children commit surpris-ingly few errors in the course of language learning(although the errors they do produce are quite inform-ative — Slobin, 1985), that the content or concepts that1–2-year-olds try to express are remarkably similarfrom one language to another (negation, possession,location, disappearance, etc. — Braine, 1976), but thatvariations in the forms used by very young children toexpress these concepts are strikingly different from onelanguage to another (Demuth, 1990; Fortescue &Lennert Olsen, 1992). All these phenomena suggest thatchildren are conservative, ‘sticking to their input’ asthey figure out how to express a common stock of ideas(and some language-specific ideas as well — Choi &Bowerman, 1991; Bowerman & Choi, 2001).

Despite these admirable advances in our under-standing, serious methodological problems remain thatare acknowledged by virtually all researchers who en-gage in cross-linguistic research. One of the mostvexing problems is the establishment of equivalencebetween samples of children: When children from dif-ferent language groups are compared to unveil simi-larities and differences, how shall they be matched?The simplest strategy is to match children by age, e.g.comparing the speech produced by 24-month-oldItalians with the speech produced by 24-month-oldchildren acquiring English. However, the variation thatcan be observed within any given language in this agerange is so vast (e.g. 24-month-olds can have virtuallyno speech at all, or they can display complex syntaxwith vocabularies of more than 600 words — Dromi,1987; Ogura, Yamashita, Murase & Dale, 1993; Fenson

et al., 1994; Caselli & Casadio, 1995; Caselli, Casadio& Bates, 1999; Maitel, Dromi, Sagi & Bornstein, 2000)that this strategy is necessarily risky. This is especiallytrue when the samples in question are small (as is oftenthe case in detailed longitudinal studies of free speech).In many years of comparative research on early lan-guage, attempts have been made to match childrenbased on length of utterance in content words or totalwords. However (as we shall also see below), thisstrategy does not guarantee a match in language level,since languages can vary in their ‘wordiness’ (e.g. thedifference between languages that do and do not permitomission of subjects and sometimes also objects in free-standing declarative sentences; variations over lan-guages in the obligatory status of articles and otherfunctors). Attempts have also been made to matchbased on mean length of utterance in morphemes, butthis strategy raises a host of definitional issues aroundwhich no consensus has emerged, as evidenced by alively exchange on the Info-Childes mailing list (info-childes @mail.talkbank.org, as archived at ht tp : / /linguistlist.org). Indeed, the most successful and thor-ough efforts have typically been tailored to individuallanguages, with no attempt to generalize across lan-guages (e.g. Dromi & Berman, 1982).

In the present study, we will illustrate an alter-native approach to the issue of cross-language match-ing. Recent studies have shown that, within a singlelanguage, vocabulary size is a more powerful predictorof grammatical development than age or gender, con-tributing significant variance to measures of grammarafter age and gender are controlled (Marchman &Bates, 1994; Bates & Goodman, 1997; Dale, Dionne,Eley, & Plomin, 2000). Studies in Italian, Japanese,Spanish and Hebrew have illustrated the same point(Caselli et al., 1999; Ogura et al., 1993; Jackson-Mal-donado, Thal, Marchman, Bates & Gutierrez-Clellen,1993; Maitel et al., 2000). Taking advantage of thisfinding, we used the large norming data bases for theMacArthur CDI in two languages, English and Italian,to explore grammatical development and the relation-ship between vocabulary and grammar (controlling for

Page 3: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

3

age and gender as well). To obtain estimates of struc-tural complexity under different coding schemes, wecompared the three longest utterances reported byparents for 466 children (233 Italian; 233 English),matched for age, gender, and vocabulary size on theMacArthur CDI for each language. For each reportedutterance, a corrected version is constructed in whicherrors are corrected and (by conservative criteria) oblig-atory elements are restored (e.g. from ‘Kitty sleeping’to ‘The kitty is sleeping’). This will permit us tocompare the distance between actual and ‘target’ (at-tempted) constructions in each language. Four differentmeasures of utterance length are applied to each actualand expanded utterance, yielding eight averages(MLUs) for each child. The four measures are designedto represent a continuum from a conservative estimatethat should differ minimally over languages (MLU incontent words) to increasing rich and complex meas-ures of bound morphology. We want to stress that theset of possible morphological coding schemes is poten-tially infinite, and there is (as noted) no consensus onthe ‘right’ measure to use across languages. We havedesigned two morphological coding schemes that arelikely to reveal language differences. Both favor Italian(e.g. credit for gender agreement on modifiers, andmultiple aspects of verb morphology that are marked inItalian but not in English), but one allows for multiplecontrasts in pronoun choice that are marked in bothlanguages and may help English to ‘catch up’.

The present study builds on two previous cross-linguistic studies by our research group comparingresults from the CDI norming data for English andItalian.

The first of these (Caselli et al., 1995) focused onboth expressive and receptive vocabulary in the periodfrom 8 to 16 months, prior to the onset of grammar. Weexamined the relative proportions of nouns, predicates(verbs and adjectives), social words (games, routines,proper names) and function words at different ages, andat different levels of expressive and receptive voc-abulary size. Similarities between languages far out-weighed differences in this age range when childrenwere matched for total vocabulary size, including earlypredominance of common nouns and later appearanceof verbs and other predicates. However, we did findthat Italian children use a higher proportion of socialwords across this range of development compared withAmerican children acquiring English, reflecting culturaldifferences (including the tendency for extended fam-ilies to live in the same city in Italy).

The second study (Caselli et al., 1999) focused onboth expressive vocabulary and grammar in the periodfrom 18 to 30 months. Grammatical development wasassessed with a 37-item complexity scale comprisingpairs of sentences that vary in degree of complexity(e.g. ‘Kitty sleeping’ versus ‘Kitty is sleeping’). Parents

were asked to check the sentence within each pair that‘sounds more like the way that your child is talkingright now.’ Previous studies (Dale, Bates, Reznick &Morisset, 1989; Dale, 1991; Jackson-Maldonado et al.,1993; Marchman & Bates, 1994) have shown that thismeasure of grammatical complexity is highly correlatedwith Mean Length of Utterance in Morphemes based onfree-speech samples. When children were matched overlanguages for overall vocabulary size, Caselli et al.confirmed an early predominance of common nounsand later onset of predicates, with no significant dif-ferences between languages in proportions of nouns orverbs at any point from 18 to 30 months. However, theycontinued to observe (as they had at a younger stage)higher proportions of social words in the Italian sample.There were also interesting differences in the shape ofdevelopment for function words as a proportion of totalvocabulary, reflecting continuous linear change inItalian from 50–600+ words, versus non-linear growthin English (i.e. no change in proportions of functionwords from 0–400 words, but a marked accelerationafter 400 words).

Caselli et al. (1999) found no differences betweenEnglish and Italian on the grammatical complexityscale when total vocabulary was controlled. Indeed, theshape of the strong non-linear growth function connect-ing grammatical complexity to total vocabulary sizewas identical in the two languages. This may seemrather surprising, given known differences betweenthese two languages in richness of inflectional morph-ology. However, the authors note that the complexityscales themselves are designed to discourage quanti-tative differences in the growth of grammar betweenthese two languages, because the scales in bothlanguages each contain exactly 37 items, designed toreflect grammatical structures that are known to occurin that language between 18 and 30 months. Tounderscore this point, Caselli et al. presented someinformal examples of the longest utterances reported byparents in a separate part of the CDI. When English andItalian children were matched for total vocabulary size,there appeared to be a marked advantage in grammat-ical complexity for the Italian children, in accord withthe well-known differences in complexity betweenthese two languages.

The present study builds on the prior two byfocusing in much more detail on the three longestutterances reported by the parents of American andItalian children between 18 and 30 months. Weacknowledge from the outset that there is no substitutefor ‘live’ measures of free speech of the sort that areobtained in small-sample studies, and we present ourconclusions as working hypotheses for future studiesusing videotaped observations. However, we believethat results of the present study are provocative, and ofsufficient heuristic value to merit consideration despite

Page 4: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

4

the limits of parental recall regarding the three longestutterances recently produced by their infants. Using thismethodology, we will address the following questions.

(1) Within and across the four differentmeasures of structural complexity, will vocabularyaccount for more developmental variance thanchronological age?

(2) When age, gender and vocabulary size arecontrolled, will there be an ‘Italian advantage’ instructural complexity, reflecting the greater morph-ological complexity of Italian compared with Eng-lish? And if so, will this advantage be observedonly on MLU in morphemes, or will we find cross-linguistic differences in total words and/or contentwords?

(3) Following on earlier reports by Caselli etal. (1999) for growth in function words, will wefind differences between English and Italian in theshape of change, reflected in non-linear growthpatterns for English (initially flat growth withsubsequent acceleration) and linear patterns forItalian?

(4) When growth in structural complexity isevaluated in terms of the ratio of ‘complexityobtained’ (the actual utterance) versus ‘complexityattempted’ (the expanded/corrected versions), willthere be developmental and/or cross-linguistic dif-ferences in the proportion of attempted utterancesthat children are able to realize in their reportedspeech? Will we find a greater gap between actualand attempted speech in the richer inflectionalsystem of Italian, at least in the early stages? Orwill we see that children in each language are ableto express roughly the same proportion of theirtargets within and across levels of development,suggesting some kind of developmental constant inthe distance between effort and success (i.e. a kindof linguistic ‘zone of proximal development’)?

MethodParticipants

Participants were the parents of 466 childrenbetween 18 and 30 months of age, 233 in each lan-guage, 120 females and 113 males in each language,with reported expressive vocabularies on the Mac-Arthur CDI ranging from 50 to 680 words. The childrenwere selected from larger norming samples for eachlanguage (Fenson et al., 1994; Caselli & Casadio,1995), including 618 Americans and 304 Italians forwhom parents had completed the Three LongestUtterances section of the CDI. To match children forthe purposes of the present study, we eliminated allchildren with vocabularies under 50 words (on theassumption, confirmed by direct inspection, that anyword combinations reported by parents were likely tobe frozen phrases). Because there were more American

than Italian children in the norming samples, we tookthe Italian children as the basis for comparison andsought, for each child, an American child of the samechronological age in months, gender, and approximatevocabulary size. This yielded the set of 233 children perlanguage used for all analyses below. A two-tailed t-testconfirmed that the two samples did not differ signif-icantly in vocabulary size after matching.

Groupings for analyses over age were determinedon the basis of chronological age in months (18 months,19 months, etc.). Groupings for analyses over vocabu-lary size followed precedents from our previous studies,as follows (eliminating children with vocabulariesunder 50 words): 50–100 words; 101–200 words;201–300 words; 301–400 words; 401–500 words;501–600 words; > 600 words.

MaterialsData for the English sample are based on the CDI:

Words and Sentences (Fenson et al., 1993, 1994),designed for use with children in the 16–30-month agerange. This scale includes a 680-word vocabularyproduction checklist, organized into 22 semanticcategories (for details, see Fenson et al., 1993, 1994).Data for the Italian sample are based on the Words andSentences Scale for the Italian version of the Mac-Arthur CDI (Caselli & Casadio, 1995). For the Italianversion of this scale, norms are available between 18and 30 months. The Italian word production checklistcontains 670 items, organized into 23 semanticcategories. The English and Italian versions of the CDIboth contain several different subscales designed tomeasure aspects of grammatical development, e.g. thegrammatical complexity checklist based on pairs ofsentences that vary in structural complexity. The gram-mar section of the CDI also has a final section that asksparents to write out the three longest sentences (each onseparate lines) that they can remember their childsaying in the last couple of weeks (on the grounds thatthese would be sufficiently recent and striking events tohave acceptable validity, similar to diary studies). Thislongest-utterance section furnished the speech samplesanalyzed in the present study.

CodingClean-up. Prior to coding the longest utterances

reported by each parent, we first eliminated allutterances that were partially unintelligible, or wereobviously frozen phrases (taken from songs, prayers,counting, and other formulae). There were alsoinstances in which parents failed to write down threeseparate utterances. For those cases, and for those inwhich items had to be eliminated, all averages are basedon the total number of utterances available. For 82.8%of the children, averages were based on threeutterances. For 12.9%, averages were based on twoutterances, and for a small number of cases (4.3%) only

Page 5: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

5

a single novel utterance was available for coding afterfrozen and/or partially unintelligible forms were re-moved.

MLU coding schemes. Four different codingschemes for mean length of utterance were applied tothese data: in total content words, in total words (bothcontent and function words), in a conservative count oftotal morphemes (treating pronouns as single unin-flected lexical forms), and in an expanded count of totalmorphemes (evaluating proforms along multiple dimen-sions).

The distinction between total words and totalcontent words is not straightforward. In fact, there is noconsensus on the categorization of items into contentversus function words, and there are also cases in whichthe boundary of a single word is in question. To assureinter-rater reliability over languages, a series of ruleswere written to define content words and functionwords. For example, compound proper names (e.g.‘Santa Claus’) were treated as a single content word,but proper names that involve a potentially productiveelement (e.g. ‘Uncle Charlie’, ‘Uncle Fred’) weretreated as two content words. Pronouns, articles andquantifiers, prepositions, copulae and auxiliary verbswere treated as function words. Modal verbs weretreated as content words if they served as the main verb(e.g. ‘I want ice cream’) but as function words if theyserved as modals with another verb (e.g. ‘I want go’).Elided forms (e.g. ‘She’ll’ in English, or ‘della’ inItalian) were unpacked into two separate words (e.g.‘She will’ in English, and ‘di la’ in Italian) prior tocoding. A handful of high-frequency adverbial andadjectival forms (e.g. ‘very’) were treated as functionwords, with classifications made by agreement betweenraters.

Matters become more complex as we moved fromword counts to morpheme counts. We began byattempting to construct a coding scheme that did notrequire assumption of any specific form as theunmarked form. Briefly put, this effort failed, bothwithin and between languages. Therefore, for eachlanguage, we established the unmarked forms fornouns, verbs, modifiers and (for the enriched pronouncount) pronominal forms. If the child produced theunmarked form only, a single morpheme was credited.For each marked contrast added to the item, anadditional morpheme was credited.

For the conservative count of MLU in morphemes,rules were established that necessarily differ for Englishand Italian. English has a zero form for all nouns andverbs, which we chose as the unmarked form in allcases. This decision is further justified by the fact thatEnglish children tend to begin with the unmarked formof nouns and verbs in their early speech (with someinteresting item-based exceptions — Bloom, Lightbown& Hood, 1975; Tomasello, 1992). As is well known,

there are relatively few morphemes that can be added inEnglish, consisting almost entirely of plurals on nouns,and a handful of contrasts on verbs that are mutuallyexclusive (past tense, participial, third person singular,progressive). By contrast, Italian has no zero form fornouns or verbs. The singular-plural contrast for nouns ismarked syncretically (e.g. singular for boy is ‘ragazzo’while the plural is ‘ragazzi’). Verb markings are alsosyncretic rather than agglutinative, but several separatecontrasts can be marked simultaneously (e.g. ‘he/sheeats’ = ‘mangio’; ‘they will eat’ = ‘mangeranno’; ‘Iused to walk’ = ‘mangiavo’). To capture these facts, thefollowing rules were established.

— For nouns in both languages: unmarked = singular → no additional points marked = plural → one additional point

— For verbs in English: unmarked = zero form → no additional points marked = any non-zero form→one additional point (i.e. 3rd person singular, progressive, past tense)

— For verbs in Italian: unmarked = 3rd person singular present → no additional points marked = 1st or 2nd person → one additional point

= plural → one additional point = any tense/aspect other than simple pres.

→ one additional point— For modifiers in Italian: unmarked = singular → no additional points marked = plural → one additional point gender agreement (masc. or fem., modifiers only)

→ one additional point

For the conservative count, all pronouns (exceptmodifiers) were treated as single morphemes. Bycontrast, for the enriched/expanded count we also at-tempted to give credit for marked morphologicaldimensions on all proforms. This coding scheme willnecessary result in longer estimates than the con-servative count, because it is cumulative. However,because English and Italian tend to indicate the samecontrasts on pronouns through lexical choice, thiscoding scheme provides an opportunity for Englishchildren to ‘catch up’ to some extent with their Italianpeers. The assumed unmarked proform was a nomina-tive singular third-person pronoun in both languages.One additional point each was given for first- orsecond-person pronouns, for plural pronouns, and forany deviations from nominative case (accusative, geni-tive, dative). Hence the count for a single pronouncould range from one (‘he’ in English, ‘lui’ in Italian)to four (e.g. ‘ours’ in English, ‘nostri’ in Italian).

Observed versus Expanded Utterances. Prior to theapplication of the above coding schemes, all utterances

Page 6: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

6

were written in two forms: the original form reportedby the parent, and a form that was conservativelyexpanded/corrected to restore grammaticality, ifnecessary. So, for example, if a parent reported ‘Kittysleeping’, the expanded form would be ‘The kitty issleeping.’ We will refer to these two forms as‘observed’ and ‘attempted’, respectively, on theassumption that the corrected form represents thechild’s target, whereas the observed form representsthat portion of the target that the child was able torealize in his/her speech. The four coding schemesdescribed above were applied to both the observed andattempted form of every utterance, resulting in eightscores for each child, plus an additional four scoresrepresenting the ratio of observed to attempted for eachcoding scheme. To illustrate the latter, the ratio of‘Kitty sleeping’ to ‘The kitty is sleeping’ would be 2/4(50%) in total words and 3/5 (60%) in total morphemesby the conservative count. The idea behind thecorrected codings was to assess, for each codingscheme and for each language, whether there aredevelopmental or cross-linguistic changes in theproportion of their presumed target utterances thatchildren are able to produce.

Appendix I provides examples of observed andexpanded utterances at each vocabulary level, for eachlanguage. For each of these utterances, illustrativescores are provided in Appendix I for each utterance,for each of the four coding schemes.

Results and Discussion

Results will be presented in an order that reflectsthe four main questions posed in the introduction.

(1) Does vocabulary predict growth better thanage?

We first addressed this question with two omnibusmixed analyses of variance: one for Language by AgeGroup by Coding Scheme, with age and language asbetween-subjects variables and coding scheme as awithin-subjects variable; another for Language byVocabulary Size by Coding Scheme, with age andvocabulary size between-subjects and coding schemewithin-subjects. All within-subjects effects are Green-house-Geiser corrected.

In the analysis over age, there were significantmain effects of all three variables (Language, F(1, 440)= 8.58, p < 0.004; Age, F(12, 440) = 13.99, p < 0.0001;Coding Scheme, F(3, 1320) = 456.28, p < 0.0001). Thecoding scheme effect was inevitable, given thecumulative nature of the four measures, and the ageeffect reflects an unsurprising increase in complexitywith age. The language effect reflects an overalladvantage for Italian. In addition, there were twosignificant two-way interactions: Code by Language(F(3, 1320) = 19.81, p < 0.0001) and Code by Age

(F (36, 1320) = 11.22, p < 0.0001). The Code byLanguage interaction reflects a larger Italian advantageon the coding schemes that tap into morphologicalcomplexity. The Code by Age interaction reflectsbigger gains over time for the coding schemes thatinvolve inflections and function words. There was noLanguage by Age interaction (F < 1.0), nor did thethree-way interaction reach significance, indicating thatage effects are parallel for English and Italian, despitethe overall Italian advantage in morphology. Althoughthe three-way interaction did not reach significance,Age by Coding Scheme effects are plotted separatelyfor English and Italian in Figures 1a–b, so that readerscan more easily examine the shape of developmentaleffects involving age, and compare them with thevocabulary size effects below.

In the corresponding analysis of Language byVocabulary Level by Coding Scheme, all three maineffects again reached significance (Language, F(1 ,452)= 14.38, p < 0.0001; Vocabulary Level, F(6, 452) =56.40, p < 0.0001; Code, (3,1356) = 1035.24, p <0.0001), all in the predicted directions. There was nosignificant Language by Vocabulary interaction(F(6.452) = 1.51, n.s.), indicating that changes tendedto occur in parallel across the two languages whenconflating over coding schemes. However, there weresignificant two-way interactions of Code by Language(F(3, 1356) = 39.81, p < 0.0001) and Code by Vocabu-lary Level (F(18, 1356) = 46.79, p < 0.0001), as well asa significant 3-way interaction (F(18, 1356) = 2.10, p <0.04). To illustrate these effects, the Code byVocabulary Level interactions are plotted separately forEnglish and Italian in Figures 2a–b. In general, growthappeared to start earlier in Italian for the measuresinvolving inflections and function words. The conser-vative morpheme measure (pronouns treated as wholeforms) and the expanded morpheme measure (pronounsscores on multiple dimensions) appeared to differ morefor English, whereas the conservative morpheme meas-ure and the measure in total words appeared to differmore for Italian. More detailed explorations of all theseeffects follow below, when we investigate the shape ofdevelopmental and cross-language effects within eachcoding scheme.

Comparing Figures 1a–b for age and 2a–b forvocabulary size, it is evident at a glance that results arefar more regular and lawful when developmental effectsare plotted as a function of vocabulary size. Toinvestigate this more directly, with an emphasis onfinding out which of these developmental predictorsdoes a better job, we repeated the above two omnibusanalyses, covarying out the effects of vocabulary in theanalysis over age, and then covarying out the effects ofage in the analysis over vocabulary levels. When ageeffects were covaried out, the three-way relationship ofLanguage by Vocabulary by Coding Scheme remained

Page 7: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 1a

Page 8: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 1b

Page 9: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 2a

Page 10: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 2b

Page 11: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

7

significant (F(18, 1353) = 2.08, p < 0.04), and the maineffect of vocabulary size remained strong (F(6, 451) =30.53, p < 0.0001). In the corresponding analysis overage with vocabulary size (total number of words)covaried out, the 3-way interaction still failed to reachsignificance (F(36, 1370) < 1.0, n.s.), and although themain effect of age remained significant (F(12, 439) =3.67, p < 0.0001), it was relatively weak.

In a final look at the relative predictive value ofthese two developmental measures, we also conductedregression analyses within each coding scheme, usingage, vocabulary size (in total words) as well as lan-guage as predictors. Results are summarized in Table 1.In all four analyses, the joint variance accounted for bythe three predictors was significant (p < .0001), rangingfrom a low of 28.8% of the developmental variance forMLU in content words, to a high of 48.5% (almost halfthe variance) for the conservative measure of MLU inmorphemes. In all four analyses, significant uniquecontributions were observed for all three predictorswhen the other two were controlled. However, in everycase, vocabulary accounted for more than five times theamount of unique variance explained by age. We mayconclude that chronological age and size of vocabularyare both effective developmental predictors, and thatthey are partially independent in the variance that theyexplain. But vocabulary size is clearly a much strongerpredictor than age for all our measures of complexity.

In all remaining analyses, we will concentrate onchanges over vocabulary levels rather than age, to learnmore about cross-linguistic differences in the natureand shape of changes in structural complexity.

(2) Is there an Italian advantage?The omnibus analyses suggest that Italian children

display a global advantage in the development of struc-tural complexity, conflating across coding schemes.Furthermore, language interacts with coding scheme,suggesting that the Italian advantage may be greater formeasures involving inflectional morphology. To ex-plore these cross-linguistic differences in more detail,we conducted separate Language by Vocabulary ana-lyses of variance for each of the four coding schemes,with both language and vocabulary serving as between-subjects variables. To simplify the text, statistical de-tails of these analyses are presented in Table 2. Brieflysummarized, significant main effects of vocabulary sizewere observed for all four measures, in the predicteddirection. Main effects of language favoring Italianwere observed on the two measures involving morph-ological complexity. However, only one of the fourcoding schemes yielded a significant Language byVocabulary interaction, and that was (to our surprise)mean length of utterance in content words. For the otherthree, the presence or absence of an Italian advantageappears to be a statistical constant across levels ofvocabulary.

Although the Language by Vocabulary interactionswere non-significant in three out of four cases, forpurposes of comparison, Figures 3a–d illustrate cross-linguistic differences across vocabulary levels for eachof the four coding schemes. In addition, asterisksindicate whether, in planned two-tailed t-tests, thelanguage difference reached significance within eachvocabulary level (a tilde indicates a trend at p < 0.10).Except for the content word coding scheme, post hocresults reflecting an Italian advantage were generallysignificant in the middle range of development, wherethe shape of change is most different for English andItalian (see analyses of shape, below). This was trueeven for the measure of MLU it total words, which didnot yield an overall main effect of language. All posthoc tests failed to reach significance at the highest level(> 600 words), a result that probably reflects ceilingeffects that will be discussed later.

The content word effect merits further exploration,because both its existence and shape are unexpected(Figure 3a). Why should there be any difference be-tween English and Italian in Mean Length of Utterancein Content Words? And why should this effect berestricted entirely to children at the earlier stages ofvocabulary development, from 50 to 300 words (adistribution that is responsible for our only significantLanguage by Vocabulary interaction)? We entertainedtwo hypotheses that might explain the early Italianadvantage for length in content words:

(1) Italian children tend to have a higher pro-portion of social words, and therefore may producemore utterances containing proper names, especiallyat the earlier stages of development. This wouldinflate the content word count.

(2) Italian is a pro-drop language in which thesubject can be omitted from free-standing declarativesentences. When the subject is included, it tends to benew information, typically expressed with a nounrather than a pronoun. By contrast, English is alanguage in which the subject is obligatory in free-standing declarative sentences. Therefore it is morecommon to find sentences with pronominal subjects,a factor that might decrease the overall length ofutterances in content words.

In pursuit of the first hypothesis, we reanalyzed allof the utterances produced by children in the first threevocabulary levels, and coded them for the number ofproper nouns that they contained. We then calculated asimple proportion score reflecting number of propernouns divided by number of utterances. A two-tailed t-test comparing means for English (67.1%) and Italian(88.9%) was significant (t(170) = –2.82, p < 0.005).This finding is consistent with the social-word hy-pothesis, but it would also be consistent with thehypothesis that Italians produce fewer sentences withpronominal subjects. To determine whether the en-

Page 12: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Table 1: Joint and Unique Effects of Age, Vocabulary Size and Language on Four Coding

Schemes

Percent Unique Variance Due to

MLU in

Total Variance

Accounted for Language Age Vocabulary

Content words 28.8%*** +2.1%*** +1.5%** +12.7%***

Total words 43.9%*** +0.7%* +2.4%*** +20.5%***

Conservative

morpheme count

48.5%*** +7.7%*** +2.3%*** +19.5%***

Expanded

morpheme count

46.9%*** +2.9%*** +2.5%*** +20.9%***

Table 2: Statistical Results of Language by Vocabulary Analyses of Variance

for Each Coding Scheme

EFFECTS OFCODING SCHEMES

LANGUAGE

F(1, 452) =

VOCABULARY

F(6, 452) =

LANG. X VOC.

F(6, 452) =

MLU Content Words 8.50, p < .004 26.19, p < .0001 2.22, p < .05

MLU Total Words 1.06, n.s. 55.85, p < .0001 1.95, p < .08

MLU-Conservative 39.35, p < .0001 55.07, p < .0001 1.33, n.s.

MLU-Expanded 11.41, p < .001 58.26, p < .0001 1.57, n.s.

Page 13: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 3a

Page 14: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 3b

Page 15: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 3c

Page 16: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 3d

Page 17: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

8

hanced presence of proper nouns was sufficient toexplain the Italian advantage in MLU in content words,we repeated the Language by Vocabulary Level ana-lysis of variance just for those children at the first threelevels, covarying out proper-noun proportion scores.The presence of the covariate did not eliminate theItalian advantage, reflected in a persistent significantmain effect of Language (F (1, 165) = 17.73, p <0.0001), and a significant Language by Vocabularyinteraction (F(2, 165) = 6.63, p < 0.0001), reflectinggrowth in the magnitude of the Italian advantage overthese early levels. These results suggest that enhancedpresence of proper nouns is not sufficient to explain theItalian advantage in the content word count.

Turning to the nominal/pronominal-subject hypo-thesis, we again recoded all utterances for children atthe first three vocabulary levels, coding the subject ofthe sentence (if one was present) as nominal or pro-nominal. These were used to construct two simpleproportion scores: mean number of pronominal subjectsper utterance, and mean number of nominal subjects perutterance. These ratios were quite different for Englishand Italian. For pronominal subjects, the mean ratio forEnglish-speaking children was 23.6%, compared with5.4% for Italian, significantly different by a two-tailedt-test (t(170) = 4.95, p < 0.0001). For nominal subjects,the mean ratio for English speaking children was 39%,compared with 69.3% for Italians, again significantlydifferent by a two-tailed t -test (t(170) = –4.27, p <0.0001). To determine whether this clear difference insubject type might be responsible for the Italianadvantage in content words, we repeated the Languageby Vocabulary Level analysis of variance just over thefirst three levels, covarying out nominal subjectproportion scores. This covariate did not manage toknock out the Italian advantage in content words. Therewas still a significant main effect of Language (F(1,165) = 10.63, p < 0.001), and a significant Language byVocabulary interaction (F(2, 165) = 3.17, p < 0.05),reflecting slight growth in the Italian advantage from 50to 300 words. We repeated the analysis covarying onpronominal rather than nominal subjects, and the sameresults obtained.

We may conclude that greater use of social wordsand greater use of nominal subjects both contribute tothe Italian advantage in mean length in content words,but neither of these hypotheses is wholly responsiblefor the effect. Hence, even controlling for these knownlanguage differences, Italians have a temporaryadvantage in utterance length even by a conservativemeasure that eliminates their known advantage ininflectional morphology. This appears to be true even ina study like the present one, in which children arematched for age, gender and vocabulary size.

(3) Does the shape of change differ in Englishand Italian?

In an earlier study of growth in function words as aproportion of total vocabulary, Caselli et al. (1999)discovered that growth is linear in Italian (getting offthe ground sooner, and growing smoothly from 0 –>600 words) while the corresponding function is non-linear in English (proportionally flat until approx-imately 400 words, and accelerating thereafter). Herewe will ask whether such differences in the shape ofdevelopment are also observed in these measures ofstructural complexity. Towards this end, one-way ana-lyses of variance were performed across vocabularylevels, separately for each language and each codingscheme. In each analysis, we tested for significance ofboth the linear and the quadratic component. Statisticaldetails are presented in Table 3.

Briefly summarized, results indicate that the linearcomponent is significant for all four coding schemes, inboth languages. For MLU in content words, thequadratic component does not reach significance ineither language, although both show a non-significanttrend (p < 0.10). For the other three coding schemes,the two languages differ in accord with predictions. ForEnglish, the quadratic component is significant forMLU in total words (including function words), for theconservative MLU count, and for the pronoun-enrichedMLU measure. For Italian, there is no significantquadratic component for any of these measures. Theseresults are similar to those reported by Caselli et al.(1999) for function words out of context, as aproportion of total vocabulary. We may conclude thatgrammatical complexity ‘gets off the ground’ earlier inItalian (as is also clear from Figures 3b-d), while ittends to lag in English-speaking children until a criticalmass of approximately 400 words has accrued (see alsoMarchman & Bates, 1994). Some possible reasons forthese robust results are presented under summary andconclusions.

(4) Do languages vary in observed versusattempted complexity?

For each coding scheme, a ratio was constructeddividing observed by attempted utterance length. Thepoint of these analyses is to determine whether there isa systematic ‘zone of proximal development’ betweenwhat children are able to do and the closest correctversion of that utterance, and to determine whetherthese ratios vary over languages and vocabulary levels.A mixed Language by Vocabulary by Coding Schemeanalysis of variance was conducted on these proportionscores, with language and vocabulary as between-subjects variables and coding scheme as a within-subjects variable.

Summarizing briefly, there was a significant maineffect of vocabulary level collapsed in all four codingschemes (F(1, 465) = 26.05, p < 0.0001), indicating the

Page 18: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Table 3: Linear and Quadratic Effects in One-Way Analyses of Variance for Each Coding

Scheme, in Each Language

ENGLISH ITALIANCODING

SCHEMES LINEAR QUADRATIC LINEAR QUADRATIC

MLU Content F = 87.88*** F<1.0 F = 39.76*** F = 1.10

MLU Total F = 165.05*** F = 9.59** F = 9.40*** F < 1.0

MLU-Conservative

F = 170.66*** F = 12.02*** F = 105.66*** F < 1.0

MLU-Expanded

F = 175.44*** F = 13.92*** F = 109.96*** F < 1.0

* = p < .05; ** = p < .01; *** p < .001

Page 19: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

9

children tend to ‘close the gap’ between observed andattempted utterances as vocabulary size goes up. Therewas, however, no significant main effect of languageand no significant interactions involving language. Thecomplete absence of language effects on observed/attempted ratios is particularly interesting in view of theconsistent Italian advantage that is observed withinand/or across developmental levels. Even though Italianchildren tend to produce more complex utterances, thegap between their performance and their presumedtargets is no greater than the corresponding gap inEnglish, for any coding scheme. To make certain thatthe absence of language effects really does hold for allfour coding schemes, we also analyzed each MLU codeseparately in Language by Vocabulary analyses of vari-ance, summarized in Table 4. In every case, the maineffect of vocabulary was significant, but there were nosignificant main effects of language and no interaction.These results suggests there may be some kind ofconservative cross-linguistic constant in the proportionof their utterance targets that children are able toproduce, even though the complexity of those targetscan vary over languages and over levels of develop-ment.

Despite the absence of language effects, theCoding Scheme by Vocabulary interaction did reachsignificance (F(18, 1353) = 10.34, p < 0.0001). Thisinteraction is illustrated in Figure 4. It is clear fromFigure 4 that gains are relatively shallow for MLU incontent words, which starts out relatively high (close to88%) in children with 50–100 words, increasesprimarily within the first developmental levels, andreaches asymptote by 400 words. In contrast, the otherthree coding schemes all show steady growth from50–> 600 words, from around 70% for childrenbetween 50–100 words to more than 95% for childrenwith > 600 words. It’s also worth noting thatobserved/attempted ratios for the three coding schemesinvolving function words and inflections are very closetogether in Figure 4, even though we know that theabsolute numbers differ (see Figures 2a-b). Finally,separate one-way analyses were conducted overvocabulary level for each of the fourobserved/attempted, to investigate the shape of thesedevelopmental effects. All effects proved to besignificantly linear, with no significant quadraticeffects.

Summary & Conclusions

We will summarize results in terms of the fourmain questions raised in the introduction, and thenconsider some potential theoretical accounts for ourmost important effects.

(1) Within and across four different measures ofstructural complexity, will vocabulary account formore developmental variance than chronologicalage?

Age and vocabulary both contributed significantvariance to all four measures of MLU. However, theeffects of vocabulary were much larger than the effectsof chronological age, in both languages, for all meas-ures of complexity. From a methodological perspective,this result suggests that vocabulary size (whenavailable) may provide a better basis for cross-languagematching in comparative studies of grammaticaldevelopment. From a theoretical perspective, as dis-cussed by Bates and Goodman (1997) and Marchmanand Bates (1994), this finding is also compatible withlexicalist theories of grammar, that is, theories in whichgrammatical forms are stored and accessed as construc-tions within the same lexical component in whichcontent and function words are listed. Vocabularydevelopment drives grammar (and vice-versa) becausethese two aspects of language are inextricably linked,represented together and accessed together.

(2) When age, gender and vocabulary size arecontrolled, will there be an ‘Italian advantage’ instructural complexity, reflecting the greatermorphological complexity of Italian compared withEnglish? And if so, will this advantage be observedonly on MLU in morphemes, or will we find cross-linguistic differences in total words and/or contentwords?

With age, gender and vocabulary size controlled,regression analyses showed a significant contributionfrom language for all four measures, reflecting anItalian advantage. The Italian advantage for MLU incontent words interacted with vocabulary level, and wasrestricted to the early stages of development. Post hocexplorations of the data suggested that the Italianadvantage for content words is partially due to a largernumber of proper nouns in Italian children, and to atendency toward pronominal subjects in English versusnominal subjects in Italian (a pro-drop language inwhich pronominal subjects are rare).

From a methodological perspective, we may con-clude that it is possible to capture cross-linguistic dif-ferences in complexity using parental report, whenparents are allowed to report their recollections in anopen-ended format (as opposed to the closed 37-itemscales used by Caselli et al., 1999). From a moresubstantive perspective, we add to a large body ofcross-linguistic work in child language showing that thepace and complexity of development varies withcomplexity in the child’s input. Finally, the existence ofan Italian advantage in content words and total words(at least at some ages) raises a note of caution for cross-linguistic studies that try to use length in content wordsor total words as a matching criterion.

Page 20: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Table 4: Statistical Results of Analyses of Variance on Proportion Scores for Observed/

Attempted MLU for Each Coding Scheme

EFFECTS OFCODING SCHEMES

LANGUAGE

F(1, 452) =

VOCABULARY

F(6, 452)

LANG. X VOC.

F(6, 452)

MLU Content Words 2.17, n.s. 13.55, p < .0001 < 1.0, n.s.

MLU Total Words < 1.0, n.s. 21.56, p < .0001 1.67, n.s.

MLU-Conservative <1.0, n.s. 24.31, p < .0001 1.48, n.s.

MLU-Expanded < 1.0, n.s. 23.25, p < .0001 1.45, n.s.

Page 21: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

Figure 4

Page 22: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

10

(3) Will we find differences between English andItalian in the shape of change, reflected in non-lineargrowth patterns for English (initially flat growth withsubsequent acceleration) and linear patterns forItalian?

Although vocabulary was a powerful predictor ofMLU in both languages, there were significant cross-linguistic differences in the shape of the relationshipbetween MLU and vocabulary size. Specifically, for allmeasures except MLU in content words, the MLU/vocabulary functions were non-linear in English (slowincreases in MLU up to 400 words, followed by a sharpburst), while the corresponding functions in Italian wereall linear (steady growth in MLU across all levels ofvocabulary). These results indicate that grammar may‘get off the ground’ earlier in a richly inflected lan-guage.

One possible explanation for this result can bederived from connectionist models of learning in bothlinguistic and non-linguistic domains (Plunkett &Marchman, 1993; Plaut, McClelland, Seidenberg &Patterson, 1996; Elman, 1998). In a system in whichlearning takes place through parallel distributed proces-sing, it may actually be easier to learn systems with alarge but consistent set of regularities, compared withsystems in which regularities are sparsely representedin the input. In other words, more can be better, if by‘more’ we mean more information, consistently mark-ed, forming a coherent system. If this general tendencyapplies to language learning in children, then therelatively rich, regular and consistently marked gram-matical system in Italian may provide an easier target,requiring fewer exemplars (and smaller vocabularies) tosupport extraction of strong generalizations. Our col-league J. Elman (personal communication) is currentlyconducting simulations of morphological learning inneural networks, designed to parallel the contrastsbetween English and Italian that underlie the work wehave presented here.

(4) When growth in structural complexity isevaluated in terms of the ratio of ‘complexityobtained’ (the actual utterance) versus ‘complexityattempted’ (the expanded/correct versions), will therebe developmental and/or cross-linguistic differencesin the proportion of attempted utterances thatchildren are able to realize in their reported speech?Will we find a greater gap between actual andattempted speech in the richer inflectional system ofItalian, at least in the early stages? Or will we findthat children in each language are able to expressroughly the same proportion of their targets withinand across levels of development, suggesting somekind of developmental constant in the distancebetween effort and success (i.e. a kind of linguistic‘zone of proximal development’?

We found clear evidence for a linear decrease withvocabulary size in the ratio of observed-to-attemptedutterances, on all four coding schemes (althoughcontent word ratios started high and reached asymptoteearly). Hence children do appear to be ‘closing the gap’between performance and their attempted target.However, despite the overall Italian advantage in MLU,ratios of actual to expanded (target) MLUs did notdiffer over languages, for any measure. In other words,children appear to maintain a similar distance betweenactual and attempted constructions in each language.

This result was not inevitable. Given the substan-tially greater complexity of Italian, we might haveexpected to find a larger gap in this language at theearly stages, reflecting the greater load of morpholog-ical marking that these children must take on. Instead,the children appear to try out utterances that are close totheir current capacity, to a similar extent in English andItalian. Although it remains to be seen whether thisfinding will prove to be a cross-language universal, theabsence of a difference in observed/attempted ratios forEnglish and Italian provides support for models thatassume a conservative, ‘piecemeal’ approach to lan-guage learning (Tomasello, 1992, 2003).

All of these conclusions are tentative, and we ac-knowledge again the fact that they are based on alimited and unusual data base of utterances recalled byparents. These ‘best utterances’ have more in commonwith diary methods than with the averages in MLU thatare observed using free-speech transcriptions. Asidefrom the problem of representativeness and general-izability, the ‘best utterance’ approach is subject toceiling effects for children at the highest levels offunctioning. We have noted informally, for example,that English-speaking children with vocabularies over600 words were able to achieve remarkably lengthyutterances despite the impoverished nature of Englishmorphology, by creating chains of people and events asin ‘We went to the zoo, and got ice cream and saw thetigers and bears and monkeys and giraffes.’ Suchutterances, though interesting, are unlikely to charac-terize larger samples of speech from the same children.

Despite the acknowledged limits of the presentstudy, all of these hypotheses could be tested againstfree-speech data, using smaller but well-matched sam-ples of children in English, Italian and other languages.Based on our findings, we suggest that such matchesinclude vocabulary size (most conveniently estimatedby parent report). And we would also propose that, forpurposes of comparison, it might be useful to considerthe 3, 5, 10 or 20 ‘best utterances’ produced by childrenin free speech, in addition to averages based on a largerproportion of the data.

Page 23: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

11

REFERENCES

Bates, E., Devescovi, A. & Wulfeck, B. (2001). Psych-olinguistics: A cross-language perspective. AnnualReview of Psychology, 52, 369–98.

Bates, E. & Goodman, J. (1997). On the inseparabilityof grammar and the lexicon: Evidence from acqui-sition, aphasia and real-time processing. Languageand Cognitive Processes, 12, 507–86.

Berman, R.A. & Slobin, D.I. (1994). Relating events innarrative: A crosslinguistic developmental study.Hillsdale, NJ: Erlbaum.

Bloom, L., Lightbown, L. & Hood, L. (1975). Structureand variation in child language. Monographs forthe Society for Research in Child Development, 40,#160.

Bowerman, M. & Choi, S. (2001). Shaping meaningsfor language: universal and language-specific inthe acquisition of spatial semantic categories. In M.Bowerman & S.C. Levinson (eds), Language ac-quisition and conceptual development. Cambridge:CUP.

Braine, M.D.S. (1976). Children’s first word combi-nations. With commentary by Melissa Bowerman.Monographs of the Society for Research in ChildDevelopment, 41(Serial No. 164).

Caselli, M.C., Bates, C., Casadio, P., Fenson, L., Fen-son, J., Sanderl, L. & Weir, J. (1995). A cross-linguistic study of early lexical development.Cognitive Development, 10, 159–99.

Caselli, M.C. & Casadio, P. (1995). Il primo vocabu-lario del bambino: Guida all’uso del questionarioMacArthur per la valutazione della comunicazionee del linguaggio nei primi anni di vita [The child’sfirst words: Guide for the use of the MacArthurquestionnaire for assessing communication andlanguage in the first years of life] (pp. 94-100).Milan: FrancoAngeli.

Caselli, M.C., Casadio, P. & Bates, E. (1999). A com-parison of the transition from first words togrammar in English and Italian. Journal of ChildLanguage, 26, 69–111.

Choi, S. & Bowerman, M. (1991). Learning to expressmotion events in English and Korean: The influ-ence of language-specific lexicalization patterns.Cognition, 41, 83–121.

Dale, P. (1991). The validity of a parent report measureof vocabulary and syntax at 24 months. Journal ofSpeech and Hearing Sciences, 34, 565–71.

Dale, P., Bates, E., Reznick, S. & Morisset, C. (1989).The validity of a parent report instrument of childlanguage at 20 months. Journal of Child Language,16, 239–49.

Dale, P., Dionne, G., Eley, T. C. & Plomin, R. (2000).Lexical and grammatical development: A behav-

ioural genetic perspective. Journal of Child Lan-guage, 27, 619–42.

Demuth, K. (1990). Subject, topic and Sesotho passive.Journal of Child Language, 17, 67–84.

Dromi, E. (1987). Early lexical development. NewYork: Cambridge University Press.

Dromi, E. & Berman, R.A. (1982). A morphemicmeasure of early language development: data frommodern Hebrew. Journal of Child Language, 9 ,403–24.

Elman, J.L. (1998). Generalization, simple recurrentnetworks, and the emergence of structure. In M.A.Gernsbacher & S J. Derry (eds), Proceedings of theTwentith Annual Conference of the CognitiveScience Society. Mahwah, NJ: Erlbaum.

Fenson, L., Dale, P., Reznick, J.S., Thal, D., Bates, E.,Hartung, J., Pethick, S. & Reilly, J. (1993). TheMacArthur Communicative Development Inven-tories: User's guide and technical manual. SanDiego: Singular Publishing Group.

Fenson, L., Dale, P., Reznick, J., Bates, E., Thal, D. &Pethick, S. (1994). Variability in early communi-cative development. Monographs of the Society forResearch in Child Development, Serial No. 242,Vol. 59, No. 5.

Fortescue, M. & Lennert Olsen, L. (1992). The acquisi-tion of West Greenlandic. In D.I. Slobin (ed), Thecrosslinguistic study of language acquisition: Vol.3 (pp. 111–219).

Jackson-Maldonado, D., Thal, D., Marchman, V.,Bates. E. & Guitierrez-Clellen, V. (1993). Earlylexical development of Spanish-speaking infantsand toddlers. Journal of Child Language, 20,523–49.

MacWhinney, B. & Bates, E. (eds). (1989). The cross-linguistic study of sentence processing. New York:CUP.

Maitel, S.L., Dromi, E., Sagi, A. & Bornstein, MH.(2000). The Hebrew Communicative DevelopmentInventory: Language specific properties and cross-linguistic generalizations. Journal of Child Lan-guage, 27, 43–67.

Marchman, V. & Bates, E. (1994). Continuity in lexicaland morphological development: a test of thecritical mass hypothesis. Journal of Child Lan-guage, 21, 339–66.

Ogura, T., Yamashita, Y., Murase, T. & Dale, P.S.(1993). Some findings from the Japanese EarlyCommunicative Development Inventory. Memoirsof the Faculty of Education, Shimane University,Matsue, Japan: vol. 29, no. 1.

Plaut, D.C., McClelland, J.L., Seidenberg, M.S. &Patterson, K. (1996). Understanding normal andimpaired word reading: Computational principlesin quasi-regular domains. Psychological Review,103, 56–115.

Page 24: A Cross-Linguistic Study of the Relationship … · 2 A Cross-Linguistic Study of the Relationship between Grammar & Lexical Development Abstract The relationship between grammatical

12

Plunkett, K. & Marchman, V.A. (1993). From rotelearning to system building: Acquiring verb morp-hology in children and connectionist nets. Cogni-tion, 48, 21–69.

Slobin, D.I. (1985). Introduction: Why study languagecrosslinguistically? In D. I. Slobin (ed), The cross-linguistic study of language acquisition: Vol. 1.The data (pp. 3-24). Mahwah, NJ: Erlbaum.

Slobin, D.I. (1985, 1992, 1997). The crosslinguisticstudy of language acquisition. Vol. 1: The Data(1985); Vol. 2: Theoretical issues (1985); Vol. 3(1992); Vol. 4 (1997); Vol. 5: Expanding thecontexts (1997). Mahwah, NJ: Erlbaum.

Tomasello, M. (1992). First verbs: A case study ofearly grammatical development. Cambridge [UK];New York: Cambridge University Press.

Tomasello, M. (2003). Constructing a language: Ausage-based theory of language acquisition.Cambridge, MA: Harvard University Press.

Acknowledgements: This project was partially sup-ported by three grants to Elizabeth Bates: “Cross-linguistic studies of aphasia” (NIH/NIDCD R01–DC00216), “Center for the Study of the NeurologicalBases of Language & Learning” (NIH/NINDS P50–NS22343), "Origins of Communication Disorders”(NIH/NIDCD P50–DC01289), and by the NationalCouncil of Research Institute of Psychology.

Correspondence concerning this article should be ad-dressed to: Antonella Devescovi, Department of De-velopment and Socialization Processes, University ofRome “La Sapienza”, Via dei Marsi 78, Rome, Italy;telephone: 3906–4403678; fax: 3906–4403685; email:[email protected]