
The Development of Second Language Writing Complexity in Groups and Individuals: A Longitudinal Learner Corpus Study
NINA VYATKINA
University of Kansas
Department of Germanic Languages and Literatures
1445 Jayhawk Boulevard
Room 2080
Lawrence, KS 66045
Email: [email protected]

This study explores the development of multiple dimensions of linguistic complexity in the writing of beginning learners of German both as a group and as individuals. The data come from an annotated, longitudinal learner corpus. The development of lexicogrammatical complexity is explored at 2 intersections: (a) between cross-sectional trendlines and the individual development paths of 2 focal learners and (b) between different complexity variables. The study contributes to the empirical body of linguistic complexity research by close tracking of beginning learners over 4 semesters of collegiate study of German as a second language (L2). For this purpose, data for multiple variables were collected at dense time intervals using multiple waves, and correlation analysis between various datasets was performed. The results confirm some general developmental trends established in previous research. However, the study also found significant variability between individual and cross-sectional data. Furthermore, differences found for more specific complexity measures between this study’s results and previous research are explained in terms of differences in instructional approaches. In addition, the study contributes to the discussion of methods and metrics appropriate for tracking the development of complexity in foreign language writing. The study concludes with implications for L2 pedagogy and further research, including applications of computational methods.

THE NOTION OF TIME IS CENTRAL TO ALL disciplines concerned with human development, including second language (L2) studies. To make inferences about how learners develop over time, most L2 studies have applied either cross-sectional designs (where researchers compare data from different groups of learners at different proficiency levels) or “classical” longitudinal designs (with few waves of data collection from the same participants over a relatively long period). However, more recently, many prominent researchers have called for employing longitudinal designs with dense developmental data collection (de Bot, Lowie, & Verspoor, 2011); describing the interaction between cross-sectional and longitudinal data (Byrnes, Maxim, & Norris, 2010; Larsen-Freeman, 2006); looking beyond stable developmental patterns and accounting for variation and degrees of this variation (Byrnes, 2009; Ortega & Byrnes, 2008; Pallotti, 2009); and capturing the interrelation of multiple developing interlanguage subsystems (Larsen-Freeman, 2009; Verspoor, de Bot, & Lowie, 2011; Verspoor, Lowie, & van Dijk, 2008).

This study responds to these calls by exploring the development of multiple dimensions of linguistic complexity in the writing of beginning learners of German both as a group and as individuals. The data come from an annotated, longitudinal learner corpus, which comprises writing samples from American beginning college-level learners of German collected at regular short intervals over their first four semesters of study. The development of lexicogrammatical complexity is explored at two intersections: (a) between cross-sectional trendlines and the individual development paths of 2 focal learners and (b) between different complexity variables. Furthermore, the study contributes to the discussion of methods and metrics appropriate for tracking the development of complexity in foreign language writing.

The Modern Language Journal, 96, 4, (2012)
DOI: 10.1111/j.1540-4781.2012.01401.x
0026-7902/12/576–598 $1.50/0
© 2012 The Modern Language Journal

The study is organized as follows. The next section presents the study background by first discussing the construct of linguistic complexity and then reviewing relevant research literature. The subsequent sections report on the study. The description of the research purpose and questions is followed by a detailed methodology section, which begins by introducing the participants and then describes the instructional approach, design, measures, and computational procedures. The results section presents both cross-sectional and longitudinal data and reports correlations (a) of different complexity measures with time and (b) between measures. At the end of the results section, a qualitative collocation analysis is presented. The results section is followed by summary and discussion, including comparisons with previous research. The last section presents conclusions and implications for further research. Finally, the elicitation tasks used in the study are listed in the Appendix.

STUDY BACKGROUND

Linguistic Complexity as a Developmental Construct

Linguistic complexity is one of the components of the three-dimensional L2 proficiency model encompassing Complexity, Accuracy, and Fluency, or CAF (Skehan, 1989). According to contributors to a recent special issue of Applied Linguistics on CAF research, these measures have, since the late ’70s, “figured as major research variables in applied linguistic research” (Housen & Kuiken, 2009, p. 461) and have been used for assessment of learner performance on specific written and oral tasks, as indicators of proficiency levels, and as milestones for learner progress. Although it is generally recognized that the three measures are closely interrelated, many studies focus on one specific dimension, notably complexity.

Complexity is defined as “[t]he extent to which the language produced in performing a task is elaborate and varied” (Ellis, 2003, p. 340) and may include both syntactic and lexical features (see Bulté & Housen, 2012 and Ortega, 2012 for an extended discussion). Research synthesis studies (Ortega, 2003; Wolfe-Quintero, Inagaki, & Kim, 1998) have established that the average number of words per T-unit (main clause with all dependent subordinate clauses) has been the most frequently used syntactic complexity measure. In addition, researchers have used the indices of words per clause, clauses per T-unit, and dependent clauses per clause. In a recent comprehensive study that investigated a large set of complexity measures, Lu (2011) suggested that the clause and ratios of various features per clause are the best indicators of different L2 proficiency levels. Lu also advocates using more specific complexity measures in addition to general complexity measures for evaluating proficiency.

A number of complexity studies supplement length-based complexity measures with lexical density and diversity measures. Lexical density is typically operationalized as the ratio of lexical to functional or total words, and lexical diversity as the type–token ratio (TTR), or the ratio of different words to all words in a text (Polio, 2001, p. 99). Studies investigating different aspects of linguistic complexity are reviewed in the next sections.
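As a minimal illustration of the two lexical measures just defined, the following Python sketch computes a type–token ratio and a lexical density (lexical words over total words) for a single sentence. The tokenizer, the very small function-word list, and the example sentence are assumptions made for this illustration only; they are not the tools or data used in any of the studies cited.

```python
import re

# Hypothetical, deliberately tiny list of German function words.
# A real analysis would rely on POS tags or a full closed-class lexicon.
FUNCTION_WORDS = {
    "ich", "du", "er", "sie", "es", "wir", "und", "oder", "aber",
    "der", "die", "das", "ein", "eine", "einen", "in", "mit", "zu",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(tokens):
    """Different words (types) divided by all words (tokens)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def lexical_density(tokens, function_words):
    """Lexical (non-function) words divided by all words."""
    if not tokens:
        return 0.0
    lexical = [t for t in tokens if t not in function_words]
    return len(lexical) / len(tokens)


tokens = tokenize("Ich habe einen Hund und ich habe eine Katze.")
print(len(tokens))                              # 9 tokens
print(type_token_ratio(tokens))                 # 7 types / 9 tokens
print(lexical_density(tokens, FUNCTION_WORDS))  # 4 lexical words / 9 tokens
```

Because the TTR tends to fall as texts grow longer, values computed this way are best compared across texts of similar length.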

Complexity Studies on Second Languages Other Than German

Most early studies found a general increase of average CAF values as learners progressed through instructional sequences (e.g., Arthur, 1979). However, all researchers also discovered considerable between-subject and within-subject variation. Larsen-Freeman (1983) explored both written and oral English as a second language (ESL) learner productions in a cross-sectional and a longitudinal study and found that the frequency of T-units and error-free T-units could be used as indicators of proficiency levels but varied depending on the language mode and task. There was also considerable variation between individuals. Arthur also discovered significant intra-individual variation in the development of ESL writers over 8 weeks on all CAF measures (accuracy, spelling accuracy, and length), even in cases with a general monotonic group-level developmental trend. Casanave (1994) came to a similar conclusion exploring English as a foreign language (EFL) writing of Japanese students and strongly suggested that development be studied for individual learners rather than based on group averages. Furthermore, both Arthur (1979) and Kern and Schultz (1992) showed that the T-unit length increases for beginning learners but, at a certain point in the instructional sequence, it starts to decrease. The authors attributed this finding to the fact that more proficient learners use more embedding rather than longer syntactic constituents. These results from early research have been fine-tuned by more recent longitudinal studies, which are reviewed below.

A series of recent studies written from a Dynamic Systems Theory (DST) perspective sparked renewed interest in CAF in general and complexity in particular (e.g., Ellis & Larsen-Freeman, 2006; Verspoor et al., 2011). Researchers who work in this paradigm are primarily interested in “intra-individual and inter-individual variation over time” (de Bot et al., 2011, p. 2). For example, Larsen-Freeman (2006), using a time-series design (four observations over 10 months), showed that while all CAF group averages of her ESL learners consistently increased over time, the participants exhibited notable developmental variability as individuals. As far as complexity is concerned, one participant showed considerable development in lexical complexity while lagging in grammatical complexity, whereas another participant followed an opposite developmental pattern. Finally, Larsen-Freeman showed how a qualitative analysis of learner writing sheds additional light on the participants’ developmental profiles and recommended adding a qualitative component to future developmental studies.

Verspoor et al. (2008) and Spoelman and Verspoor (2010) explored the dynamic interaction of different writing complexity variables in longitudinal case studies. The former study described the development of an advanced EFL learner, and the latter study focused on beginning stages of L2 proficiency in Finnish. Writing samples were collected in both studies at dense time intervals over a 3-year period, in which both lexical and syntactic complexity features were explored. The analysis showed a complex relation between these variables that changed dynamically over time. For example, in the first study, during the first observation points, varied word use (TTR) and length of sentence (SL) showed a positive correlation, thus acting as “connected growers” (van Geert, 1994). However, from observation 4 to 15, these measures showed a negative correlation and, therefore, a competitive relationship to each other. The authors of both studies concluded that some complexity features develop hand in hand by using the same attentional resources, whereas others may require the full attention of the learner during his or her active development.

Complexity Studies on L2 German

An early investigation of complexity in L2 German is Cooper (1976). Cooper compared a number of syntactic complexity measures across five proficiency levels, including four levels of college-level L2 writers and also professional native-speaker newspaper writers. He found that clause length, subordination ratio, T-unit length, sentence length, as well as the number of certain sentence-embedding constructions steadily increased with each adjacent level (roughly equivalent to 1 year of study), but the increase was significant only with every second level. It should be noted that Cooper only explored cross-sectional data from 10 participants at each level and did not consider any longitudinal data. The following sections review more recent L2 German complexity studies that explicitly focused on longitudinal research methods and data.

The authors in many studies reviewed in the previous section advanced the point that at least some syntactic complexity measures not only cannot be strictly separated from, but are inextricably intertwined with, lexical complexity measures. As Ravid (2005) has argued, “[c]lause length derives from number and length of intraclausal phrases, which in turn reflect lexical density and diversity, combined with syntactic depth and diversity” (p. 351). This approach reflects the premise of Systemic-Functional Linguistics (SFL), which postulates lexicogrammar as a sole complex object of linguistic inquiry instead of the two separate layers of grammar and lexicon (e.g., Halliday & Matthiessen, 1999).

Following the SFL approach, the research team working on the curriculum project Developing Multiple Literacies at the Georgetown University German Department (GUGD) has published a series of studies exploring essays of developing L2 German student writers with the focus on various lexicogrammatical patterns as “forms of textual meaning-making” (Byrnes et al., 2010, p. 38). In contrast to most studies on L2 grammatical development, which rarely provide details of the respective pedagogical settings, the overarching goal of these studies has been to explore linguistic features of student writing as specific learning outcomes achieved in response to a carefully designed research-based 4-year-long curriculum implemented at GUGD.

In a longitudinal study, Byrnes (2009) explored the development of subclausal complexification in terms of nominalization and grammatical metaphor (GM) in student writing. She found that the frequency of these features increased dramatically from level (year) III to level IV, which directly reflected the shift of the instructional focus from more verbal (narrative) to more nominal (expository) text genres and writing styles, and thus demonstrated a desired learning outcome. Furthermore, the study showed that, whereas lexical density (measured in content words per clause) increased significantly over three instructional levels, grammatical intricacy (measured in clauses per sentence) slightly (yet insignificantly) decreased. Another important finding, facilitated by the longitudinal design, is that students who seem average based on more general complexity measures such as clause length may demonstrate speedier development of more specific measures such as the amount of nominalization (see also Byrnes & Sinicrope, 2008, for the case of relativization). Ryshina-Pankova (2010) expanded this line of research by exploring the use of the grammatical metaphor as a means of thematic progression in L2 German written texts. She identified different GM types as characteristic of various (advanced) acquisition levels and demonstrated how GM use contributed to greater or lesser communicative success of learner texts.

Byrnes et al. (2010) is a recent comprehensive study from this series, which investigated, among other targets, the development of several syntactic complexity measures. The results showed that general complexity (words per T-unit) increased incrementally over four curricular levels, whereas more specific measures exhibited more complex developmental patterns. The authors provided a curricular explanation for the latter finding. The number of clauses per T-unit increased significantly in level II due to the instructional focus on narration, which is characterized by various forms of subordination. This trend continued in level III but was also paired with a significant increase in subclausal complexification (words per clause) due to the added instructional focus on public discourses which are characterized by extensive nominal structures and, therefore, longer clauses. Finally, clauses became again significantly longer in level IV but the amount of subordination decreased. The latter result confirmed Byrnes’s (2009) finding that showed increased clause length due to extensive use of nominalization by the same learner population in response to the exclusive instructional focus on secondary, public discourses in level IV. Furthermore, this finding corroborated results from earlier complexity studies that found leveling of subordination but increases in clause length at more advanced proficiency levels (see Ortega, 2003, for a review).

Importantly, the developmental patterns found by Byrnes et al. (2010) for “course-embedded […] Prototypical Performance Tasks” or PPTs (p. 163) were confirmed when tested cross-sectionally using the same “Baseline Writing Task,” or BWT, at all curricular levels. However, the differences between levels were more distinct when measured in response to PPTs, which led the authors to conclude that curricular-embedded tasks are more conducive to investigation of syntactic development. Finally, Byrnes et al. showed that their L2 German undergraduate students achieved levels of grammatical complexity similar to those of graduate students in Cooper’s (1976) study or even surpassed them, which the authors again attributed to positive learning outcomes of the curricular innovation consistently implemented in the department’s pedagogical practices.

RESEARCH PURPOSE AND QUESTIONS

The present study will contribute to the empirical body of linguistic complexity research by closely tracking learners from the beginner level and over four semesters of collegiate L2 German study and by comparing cross-sectional cohort data and longitudinal data for 2 individuals. For achieving this purpose, data for multiple variables were collected at dense time intervals using multiple waves, and correlation analyses between various datasets were performed.

The study aims to answer the following research questions:

1. How does L2 German writing complexity develop cross-sectionally for a cohort of learners and in 2 individual learners?

2. What is the relationship between the developmental paths of the cohort and the individuals?

3. How do developmental paths measured by different complexity metrics correlate with each other?

METHOD

Participants

The data for this study were collected from students enrolled in beginning and intermediate German courses at The University of Kansas during four sequential 16-week-long semesters, specifically in the first semester course in the spring of 2008 (80 hours of instruction); second semester, fall 2008 (80 hours of instruction); third semester, spring 2009 (48 hours of instruction); and fourth semester, fall 2009 (48 hours of instruction). Most students were completing their four-semester-long language requirement and had little or no knowledge of, or exposure to, German prior to their enrollment. Also, an overwhelming majority of the students grew up in the Midwestern region of the United States and had American English as their first language (L1). The cohort thus remained a relatively homogeneous group of learners in terms of type and amount of exposure to the target language; that is, exposure was mostly limited to classroom interactions and instructional materials.

Cohort. Although the data were collected during consecutive semesters, the actual participant constituency changed from semester to semester as students withdrew from the program or joined it at some later point via placement test. Furthermore, data were collected only from those who had signed consent forms at the beginning of each data collection semester, and not all students chose to participate each semester. Therefore, writing samples from the described student population were used as cross-sectional cohort data for calculating group averages rather than longitudinal data in this study.

Individuals. Furthermore, the study focuses on 2 individual learners from the same population for longitudinal data collection, who were assigned the pseudonyms “Braden” and “Cassie.” These 2 participants (from a total of 7 who completed the full four-semester-long sequence) were selected because they had a number of similarities in their language learning history. Both of them were “true beginners,” that is, they had not studied German before and had never visited German-speaking countries. Neither participant was majoring in German; both had lived all their lives in the U.S. Midwest and had American English as their L1. Both of them had some knowledge of Spanish, and Cassie also had some knowledge of sign language. However, they were different in terms of gender and age: Braden was a 19-year-old male and Cassie was a female in her early 30s. Despite these differences, according to their instructors, both participants had some similarities in learning style: They were generally characterized as diligent learners who tried hard and were better at writing than speaking.

Instructional Approach

The students in this study were enrolled in a multisection beginning and intermediate German language program, in which all courses were taught by graduate student instructors under the supervision of the researcher. All instructors followed syllabi and textbooks that were uniform for each instructional level although they had freedom to design specific lesson plans following the coordinator’s guidelines. Each course included a combination of spoken interaction, grammar explanation and practice, writing assignments, searching German Web sites, and creative culture projects. The writing component was allocated 20% of the curriculum and, respectively, of the total course grade at each level. This aspect of the curriculum design reflected the coordinator’s conviction that “writing deserves systematic and continued attention in the foreign language classroom in its own right, not merely as a support skill for listening, reading or speaking” (Abrams, 2010, Conclusion section, para. 1; see also Byrnes et al., 2010). While a multidraft process writing approach was used at all levels, only rough drafts of written assignments are used as data in this study.

Tasks and Timeline

Students wrote their essays in response to level-appropriate writing tasks with topics and prompts reflecting relevant instructional content. In this regard, tasks in this study can be considered Prototypical Performance Tasks, or “curricular-level-specific PPTs” as defined by Byrnes et al. (2010, p. 179). However, the design of this study differs from the cited study in observation density: it includes 18 PPTs collected approximately every 3 weeks, as opposed to the four PPTs that Byrnes et al. collected, one at the end of each instructional level. The final task was a BWT (described in the next section).

The curricular progression of the genres of the writing tasks (see also Table 1 and Appendix) was as follows. The tasks at time points T1–T5 (first semester) and T6–T10 (second semester) required learners to write personal narratives. According to Maxim (2011), this genre involves “exploring self-identity in the German-speaking world through different roles that young adults assume in society (e.g., student, hobbyist, consumer, traveler, family member, citizen)” (p. 12) and is level-appropriate for first-year college-level language learners. T11–T14 tasks (third semester) also required students to write personal narratives and personal accounts but with added reasoning elements, which is appropriate for second-year language learners (Maxim, 2011; see also Byrnes et al., 2010). T15–T18 tasks (fourth semester) invited students to reflect on the meaning of the stories from the course book and to provide explanations of selected aspects and arguments supporting their opinions, which is appropriate for more advanced students. The final (T19) task asked students to summarize the contents of a book of their choosing and to explain why they wished to recommend it to their German peers. This task thus combined elements of narration and argumentation.

Topic and Prompt Sources

In the first three semesters, the prompts were taken from journal writing assignments at the end of each of the 14 chapters of the course workbook (Briggs, Di Donato, Clyde, & Vansant, 2008), which were based on the material covered in the corresponding course textbook (Di Donato, Clyde, & Vansant, 2008). In the fourth semester, selected chapters from Teichert and Teichert (2005), a textbook based on short stories written by German-speaking authors, were used as primary course texts, and writing tasks were taken from those chapters. Finally, the T19 task was a slightly modified BWT, adopted from the GUGD Multiple Literacies project (see Byrnes et al., 2010), which is a generic writing task not directly related to instructional materials used in this study’s program.¹ The full list of writing topics is provided in the Appendix, and specific prompts can be found in the sources cited in this paragraph.

Writing Conditions

Collected writing samples were written under the following conditions. During the first three semesters, students typed each of their essays in a computer lab during a 50-minute-long class period and submitted them through the online courseware (Blackboard). Make-up essays of students who were absent during designated writing sessions were not included in the analysis. Learners were required to write during the whole class period. In this way, they were writing under controlled, timed conditions, with the instructor’s supervision, and were allowed to consult the instructor and online dictionaries but neither online translators nor their textbooks or notebooks. In the fourth semester, the essay-writing procedure changed. Students wrote four out-of-class essays in response to textbook prompts (to allow more time for reflection requested in the tasks) and submitted them electronically via Blackboard on designated due dates. They were required to write at least one double-spaced page. At the end of the fourth semester, students wrote their final essay under the same conditions as timed essays in preceding semesters (during a 50-minute-long class period). In sum, essays were collected at 19 data collection points (T1–T19) during the four focal semesters. T1–T14 and T19 data thus represent timed in-class writing, whereas T15–T18 data represent untimed out-of-class writing (Table 1).

Sampling

With regard to the sampling procedure, the study followed the “instruction-embedded total-sampling approach” (Byrnes et al., 2010, p. 165); that is, all essays written by the students of the focal cohort in response to instructional tasks were collected. It must be noted that not all participants turned in essays at all data collection points, including the 2 longitudinal participants: Braden did not submit essays at T11, T14, and T16, and Cassie at T8. Due to this fact and to participant attrition (see the section above), the number of writing samples that served as cross-sectional data in this study varied from time point to time point and ranged from 16 to 40 (Table 1). Table 1 also lists the total and average word count produced by the cohort at each time point.²

TABLE 1
Timeline, Tasks, and Data Pool

Semester  Time Point (T)  Samples  Words (Total)  Words (Mean)  Timed/Untimed  Tasks From                    Chapter
First      1              28       1914            68           t              Briggs et al. (2008)          1
First      2              25       2176            87           t              Briggs et al. (2008)          2
First      3              26       3404           131           t              Briggs et al. (2008)          3
First      4              27       2864           106           t              Briggs et al. (2008)          4
First      5              25       2445            98           t              Briggs et al. (2008)          5
Second     6              40       3854            96           t              Briggs et al. (2008)          6
Second     7              40       4164           104           t              Briggs et al. (2008)          7
Second     8              29       3056           105           t              Briggs et al. (2008)          8
Second     9              38       4083           107           t              Briggs et al. (2008)          9
Second    10              35       4072           116           t              Briggs et al. (2008)          10
Third     11              30       4148           138           t              Briggs et al. (2008)          11
Third     12              24       3627           151           t              Briggs et al. (2008)          12
Third     13              24       3517           147           t              Briggs et al. (2008)          13
Third     14              20       2758           138           t              Briggs et al. (2008)          14
Fourth    15              21       4550           217           u              Teichert and Teichert (2005)  1
Fourth    16              16       3907           244           u              Teichert and Teichert (2005)  3
Fourth    17              16       3780           236           u              Teichert and Teichert (2005)  4
Fourth    18              18       4276           238           u              Teichert and Teichert (2005)  10
Fourth    19              22       3547           161           t              BWT                           NA
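The mean word counts reported in Table 1 follow directly from the other two columns (total words divided by number of samples, rounded to the nearest whole word). The short Python sketch below re-derives them as a consistency check; the figures are copied from Table 1, while the function and variable names are hypothetical and introduced only for this illustration.

```python
# (time point, number of samples, total words, mean words) as listed in Table 1
TABLE_1 = [
    (1, 28, 1914, 68), (2, 25, 2176, 87), (3, 26, 3404, 131),
    (4, 27, 2864, 106), (5, 25, 2445, 98), (6, 40, 3854, 96),
    (7, 40, 4164, 104), (8, 29, 3056, 105), (9, 38, 4083, 107),
    (10, 35, 4072, 116), (11, 30, 4148, 138), (12, 24, 3627, 151),
    (13, 24, 3517, 147), (14, 20, 2758, 138), (15, 21, 4550, 217),
    (16, 16, 3907, 244), (17, 16, 3780, 236), (18, 18, 4276, 238),
    (19, 22, 3547, 161),
]


def mean_words(samples, total_words):
    """Average essay length in words at one time point."""
    return total_words / samples


def mismatched_time_points(rows):
    """Return time points whose rounded mean differs from the listed mean."""
    return [
        t for t, samples, total, listed in rows
        if round(mean_words(samples, total)) != listed
    ]


sample_counts = [samples for _, samples, _, _ in TABLE_1]
print(mismatched_time_points(TABLE_1))          # empty list if the table is consistent
print(min(sample_counts), max(sample_counts))   # range of samples per time point
```

The same check confirms the range of 16 to 40 samples per time point stated in the text.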

Measures

For choosing measures of linguistic complexity, this study relied on recent recommendations that Norris and Ortega (2009) derived from a comprehensive research review and synthesis. The authors strongly advise researchers to measure complexity multidimensionally and to supplement general measures such as sentence length with distinct, complementary, and specific complexity measures (see also Lu, 2011). Norris and Ortega identify the following main dimensions of syntactic complexity, the metrics for which should be chosen depending on specific conditions of each study: (a) overall or general complexity; (b) subclausal complexity; (c) complexity via subordination and coordination; and (d) variety, sophistication, and acquisitional timing of forms produced (pp. 561–562).

In this study, a number of measures were used to tap into these multiple dimensions of complexity. In choosing from the plethora of available measures, it was decided to only use measures that allowed automatic annotation and/or searches for target features so as to eliminate as much subjectivity as possible in data tagging and calculations. As Granger, Kraif, Ponton, Antoniadis, and Zampa (2007) note, the value of a learner corpus increases exponentially for both theoretical and applied second language acquisition (SLA) research purposes when it is annotated for parts of speech (POS) and other linguistic properties and analyzed with natural language processing (NLP) tools. Still, despite the wide availability of computational resources, these applications remain largely underexploited in complexity research (see, however, Hawkins & Buttery, 2010; Saville, 2010), especially for languages other than English. This study aimed to benefit from available NLP resources and used automatic corpus tools for computing length-based measures as well as automatically assigned POS tags as proxy measures for surface syntactic structures (Aarts & Granger, 1998). All metrics used in this study are ratios.

General Complexity. First, sentence length (SL, see Verspoor et al., 2008) was measured in the number of words per sentence (W/S). SL was chosen as a generic metric “with a potentially multiple‐clausal unit of production in the denominator” measuring overall syntactic complexity (Norris & Ortega, 2009, p. 561). The sentence was chosen as the main unit of analysis instead of the widely used T‐unit (Hunt, 1965) because SL can be calculated automatically, whereas T‐units need to be coded manually. Furthermore, as Bardovi‐Harlig (1992) argues, a T‐unit analysis “artificially divides sentences that were intended to be units by the language learner, imposing uniformity of length and complexity on output that is not present in the original language sample” (p. 391). In contrast, by using the sentence, “the unit directly produced by the learner” (Bardovi‐Harlig, 1992, p. 391), the researcher takes into account the learner’s (conscious or unconscious) choice. Finally, since a T‐unit analysis treats conjoined clauses as independent clauses, it “discounts the learner’s knowledge of coordination” (Bardovi‐Harlig, 1992, p. 391), which is an important indicator of complexity at beginning levels of language proficiency (see also Casanave, 1994; Ishikawa, 1995).

Clausal Complexity. The SL measure was supplemented by two specific complexity indicators. Sentence length can be increased by two different types of complexification: adding more coordinate or subordinate clauses to a matrix clause or making clauses longer (subclausal complexification). Thus, the first metric can be expressed in mean sentence length in clauses (clauses/sentences), and the second metric in mean clause length in words (words/clauses). It must be noted that, on the one hand, although clauses have been used as a unit of analysis in an overwhelming majority of complexity studies, they have not been defined consistently,3 which may engender differences in annotation and subsequent results (Bulté & Housen, 2012; Ishikawa, 1995; Lu, 2011; Polio, 2001). On the other hand, most CAF studies follow Hunt (1965), who defined a clause as “a visible subject and a finite verb” (p. 29). If clauses are restricted to finite clause units, the number of finite verbs can be used as a proxy for counting the number of clauses. The only difference between these two units of analysis would be in counting units constituted by coordinated finite verbs as distinct finite verb units (called FV‐units by Verspoor et al., 2008), when they actually belong to one and the same clause unit and have one and the same subject. However, the benefit of using FV‐units instead of clauses is that the former are less ambiguous and can be found and computed by automatic POS tagging tools. Since only automatically computed measures were used in this study, subclausal complexity was measured by the ratio of words per FV‐unit (W/FV) following Verspoor et al. To characterize complexification by coordination and subordination combined, the ratio FV‐units per sentence (FV/S) was used.

Coordination and Subordination. To distinguish between complexification by coordination and subordination, two additional proxy measures were used, namely coordinating conjunctions (CC) and subordinating conjunctions (SC). Normalized CC and SC frequencies per 100 words were used to compare essays of different length. Coordinating conjunctions (e.g., und ‘and,’ aber ‘but,’ oder ‘or’) are connectors that combine homogenous syntactic constituents, or “parts that have ‘equal’ syntactic value” (Verspoor & Sauter, 2000, p. 101). They can connect coordinated subjects, predicates, or other intraclausal constituents as well as clauses. Coordination was thus explored as a broader syntactic construct not only limited to clauses. Subordinating conjunctions (e.g., dass ‘that,’ weil ‘because,’ wenn ‘when’) are syntactic connectors that function at the interclausal level, combining a main clause and its dependent clauses. Subordination was explored only in terms of adverbial and nominal clauses but not modifier clauses (expressed by relative pronouns and relative clauses).4

Lexicogrammatical Variety. Finally, lexicogrammatical variety was measured by a type–token ratio following some other studies that compared syntactic and lexical complexity (Larsen‐Freeman, 2006; Verspoor et al., 2008). Namely, corrected type–token ratio (CTTR), also known as adjusted or sophisticated TTR, was used. CTTR is calculated as the number of word types divided by the square root of 2 times the total number of words and thus “takes the length of the sample into account to avoid the problem that regular type–token ratios are affected by length” (Larsen‐Freeman, 2006, p. 597; see also Lu, 2012). In addition, a qualitative type–token analysis of constructions containing coordinating and subordinating conjunctions for the 2 focal learners was undertaken to arrive at a more specific level of description of lexicogrammatical variety.
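As an illustration of the CTTR definition above, the following minimal Python sketch computes the ratio for a tokenized text. The function name `cttr` and the case‐insensitive definition of a word type are assumptions made for this example only; in the study itself, type and token counts were obtained from WordSmith Tools.

```python
import math

def cttr(tokens):
    """Corrected type-token ratio: number of types / sqrt(2 * number of tokens)."""
    if not tokens:
        raise ValueError("cannot compute CTTR for an empty text")
    # Assumption for this sketch: word types are compared case-insensitively.
    types = {token.lower() for token in tokens}
    return len(types) / math.sqrt(2 * len(tokens))

# Invented sample sentence (not taken from the learner corpus):
# 9 tokens, 7 types ("ich" and "habe" each occur twice).
sample = "ich habe einen Bruder und ich habe eine Schwester".split()
print(round(cttr(sample), 2))
```

Because the denominator grows with the square root of text length rather than with text length itself, a longer essay is penalized less for unavoidable repetition than it would be under a plain type–token ratio.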

Computational Procedures

The units sentence and word were defined here based solely on typographic features. A sentence was defined as “a string of words with a capital letter at the beginning of the first word and a period or another terminal punctuation mark after the last word” (Homburg, 1984, pp. 91–92). A word was defined as a string of letters separated by spaces from adjacent strings of letters. Sentence, word type, and word token frequencies were computed automatically using WordSmith Tools (Scott, 2008). Finite verbs as well as coordinating and subordinating conjunctions were automatically annotated in the learner corpus using the POS Tree Tagger for German (Schmid, 1994).5 Tag frequencies were then automatically computed using the WordSmith Tools WordList function (Scott, 2008). Computed frequencies were entered into a Microsoft Excel spreadsheet that was used for calculating finite verb frequency (a sum of finite full, auxiliary, and modal verbs) as well as ratios of words per sentence, finite verbs per sentence, words per finite verb unit, coordinating and subordinating conjunctions per 100 words, and CTTR. Next, graphs were created to illustrate the developmental course for each focal feature. Finally, the WordSmith Tools Concord function (Scott, 2008) was used for a qualitative analysis of CC and SC collocations (see Results II).
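The spreadsheet calculations described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the study's actual spreadsheet: the function name `complexity_ratios`, its parameter names, and the counts in the usage example are hypothetical.

```python
def complexity_ratios(words, sentences, finite_verbs, cc, sc):
    """Return the ratio measures for one essay from raw frequency counts."""
    if min(words, sentences, finite_verbs) <= 0:
        raise ValueError("words, sentences, and finite_verbs must be positive")
    return {
        "W/S": words / sentences,          # sentence length in words (SL)
        "FV/S": finite_verbs / sentences,  # FV-units per sentence
        "W/FV": words / finite_verbs,      # words per FV-unit
        "CC": 100 * cc / words,            # coordinating conjunctions per 100 words
        "SC": 100 * sc / words,            # subordinating conjunctions per 100 words
    }

# Hypothetical counts for a single essay (not corpus data).
ratios = complexity_ratios(words=120, sentences=15, finite_verbs=18, cc=6, sc=3)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

For a single text, W/S equals FV/S multiplied by W/FV, which is one way to see why sentence length can grow either through more FV‐units per sentence or through longer FV‐units.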

Cross‐sectional and longitudinal data for all focal variables were plotted in Microsoft Excel, and a correlation analysis was performed. First, cross‐sectional averages for the cohort were computed for each measure at each time point, and developmental trendlines for the cohort were analyzed. Next, it was established how the 2 focal longitudinal learners fared vis‐à‐vis this class average at each time point, and their developmental pathways were explored.
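The correlation of a measure with time can be sketched as follows, assuming Pearson's product‐moment coefficient (the text reports r values but does not name the coefficient explicitly). The helper `pearson_r`, the sample series, and the decision to drop time points without an essay are assumptions made for this illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sentence-length series for one learner (not corpus data);
# None marks a time point at which no essay was submitted.
times = [1, 2, 3, 4, 5, 6]
sl = [6.0, 6.5, None, 6.2, 7.1, 7.4]

# Assumption: time points with a missing essay are simply left out.
pairs = [(t, v) for t, v in zip(times, sl) if v is not None]
r = pearson_r([t for t, _ in pairs], [v for _, v in pairs])
print(f"r = {r:.2f} over {len(pairs)} time points")
```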

RESULTS I: CORRELATIONS OF COMPLEXITY MEASURES WITH TIME

Table 2 presents the correlation values between time and the six complexity measures for the cohort averages as well as for the 2 focal learners.

The following sections report on both cross‐sectional and longitudinal results. Each section contains three graphs to illustrate how the cohort and the 2 focal individuals developed over 19 time points vis‐à‐vis a complexity measure. Chart (a) represents the quantified development of the cohort, with the average of each measure for each assignment and a regression trendline for the average calculated by least squares. To illustrate graphically the relative certainty of each of these trendline estimates, curved lines representing a 95% confidence interval of the predicted average are plotted, as well (Draper & Smith, 2001). Charts (b) and (c) then compare individual development to the cohort average trendline. Braden and Cassie’s values for each measure and time point are represented in scatterplots, along with least squares regression trendlines and 95% confidence intervals for the trendline. These are superimposed on the cohort trendline for comparison purposes.
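A least squares trendline with a confidence band for the predicted average, as described above, can be sketched as follows. This is a textbook simple‐regression formulation and not necessarily the exact procedure used to draw the figures; the function name `fit_trendline`, the caller‐supplied critical value `t_crit`, and the sample values are assumptions for this illustration.

```python
import math

def fit_trendline(xs, ys, t_crit):
    """Fit y = intercept + slope * x by least squares.

    t_crit is the two-tailed critical value of Student's t for n - 2 degrees
    of freedom, supplied by the caller (about 2.110 for 17 df at the 95% level).
    Returns (slope, intercept, band), where band(x0) gives the lower bound,
    fitted value, and upper bound of the predicted average at x0.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual standard error with n - 2 degrees of freedom.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(rss / (n - 2))

    def band(x0):
        fitted = intercept + slope * x0
        half_width = t_crit * s * math.sqrt(1 / n + (x0 - mean_x) ** 2 / sxx)
        return fitted - half_width, fitted, fitted + half_width

    return slope, intercept, band

# Hypothetical values at five time points (not corpus data); with 3 df the
# two-tailed 95% critical value of t is about 3.182.
times = [1, 2, 3, 4, 5]
values = [6.0, 6.5, 6.2, 7.1, 7.4]
slope, intercept, band = fit_trendline(times, values, t_crit=3.182)
for t in times:
    lower, fitted, upper = band(t)
    print(f"T{t}: {lower:.2f} <= {fitted:.2f} <= {upper:.2f}")
```

The band is narrowest at the mean time point and widens toward both ends of the series, which is why the confidence limits appear as curved lines around a straight trendline.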

Sentence Length in Words (SL)

General syntactic complexity was measured by mean sentence length (number of words per sentence). Figure 1a shows that the class average sentence length increases with the instructional progression with very slight oscillations of the ascending trend, namely from 6 words per sentence at T1 to 10 words per sentence at T19. The latter result is similar to Cooper’s (1976), who found that on average, American students of German write 10.3 words per sentence in their second year of study.6 Figure 1a shows a fairly smooth diagonal increase with slight upward and downward oscillations. An extremely strong positive correlation between SL and time was found for the cohort (r = 0.94).

Figures 1b and 1c show that Braden and Cassie’s sentence lengths develop in a very similar way: The cohort trendline runs right through the center of Braden and Cassie’s trendlines. In other words, on this measure they can be taken as very good representatives of the average for this cohort. Accordingly, sentence lengths of both focal learners strongly and positively correlate with time (r = 0.76 for Braden, r = 0.74 for Cassie). In terms of individual variation, Braden is more frequently above the class average in the first semester, below the average in the second semester, and around the average in the fourth semester. Cassie is either below or slightly above the average, with the largest fluctuations at T16 and T18. In the last semester, her sentence length is mostly above the class mean. This is reflected in her trendline, which has a slightly higher slope than the cohort average (Figure 1c).

FV‐Units per Sentence (FV/S)

The cross‐sectional cohort data analysis (Figure 2a) shows a general increase of the average sentence length in FV‐units. This means that learners use more finite verbs per sentence over time: The trendline goes up from 1.1 to 1.4 FV/S. This overall increase may not seem large; however, Ortega (2003) has shown that an increase for a similar measure (clauses/T‐unit) is usually significant at a 0.2 level. The correlation between FV‐units per sentence and time is positive and very strong in these data (r = 0.82).

The longitudinal data for both Braden and Cassie also show a general linear increase of FV‐units per sentence and a positive correlation of this measure with time. For Braden, this correlation is very strong (r = 0.86) and for Cassie, it is more moderate but still significant (r = 0.54; p < 0.05). Figure 2b illustrates that Braden starts around the class mean, but after that, his FV/S ratio is consistently at or above the average (except T7 and T9) and surpasses it by a large margin of 0.43 at T19. Figure 2b shows a steeper slope for Braden’s trendline in comparison with the cohort trendline, which runs below Braden’s confidence intervals approximately after T12.

Cassie’s path follows Braden’s almost identically from T1 through T6 (her values staying just below Braden’s), but her later values are consistently lower than Braden’s and the class mean, with a few exceptions. Also, Cassie’s values are scattered more around the trendline, which shows a larger variation. Notably, Cassie’s FV/S value drops sharply down to the starting point (around 1 FV/S) at the final time point (T19). Figure 2c shows that Cassie’s trendline is below the cohort trendline and does not intersect it, with both lines rising almost in parallel over time.

TABLE 2
Correlations Between Time and Complexity Measures

          SL (W/S)   FV/S      W/FV     CC        SC        CTTR
Cohort    0.94***    0.82***    0.44    −0.63**   0.84***   0.89***
Braden    0.76***    0.86***   −0.29     0.07     0.66**    0.59*
Cassie    0.74***    0.54*      0.33    −0.18     0.62**    0.60**

Note. CC = coordinating conjunctions; CTTR = corrected type‐token ratio; SC = subordinating conjunctions; SL = sentence length; W/FV = words per finite verb unit; W/S = words per sentence.
*p < 0.05, **p < 0.01, ***p < 0.001.

In sum, the cohort as well as the 2 focal learners use more FV‐units per sentence over time. However, Braden’s increase surpasses the class average, whereas Cassie’s values, staying mostly below the class average, increase roughly at the same rate as the latter. Also, Cassie’s data show more upward and downward oscillations, which moderates the correlation of the focal measure with time.

Words per FV‐Unit (W/FV)

As opposed to the measures discussed above, the average length of FV‐units in words did not show an obvious monotonic increase for the cohort data (Figure 3a): The values oscillate between 5.75 and 7 with no obvious pattern. The correlation analysis, however, showed a moderate positive correlation of the W/FV measure with time (r = 0.44) that approached significance (p = 0.06).
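The borderline result for W/FV can be checked with the standard t test for a correlation coefficient. The sketch below assumes that the cohort correlation was computed over all 19 time points and that the reported p values are two‐tailed; neither detail is stated explicitly in the text, and the helper name `t_statistic` is hypothetical.

```python
import math

def t_statistic(r, n):
    """t value for testing a zero correlation, given r from n paired observations."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Two-tailed 5% critical value of Student's t with 17 degrees of freedom.
T_CRIT_DF17 = 2.110

# Assumption: n = 19 cohort averages, one per time point.
t = t_statistic(0.44, 19)
print(f"t = {t:.2f}, exceeds critical value: {abs(t) > T_CRIT_DF17}")
```

Under these assumptions, the t value of roughly 2.02 falls just short of the critical value of 2.110, which agrees with a p value slightly above 0.05.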

The scattergram of Braden’s data (Figure 3b) shows no discernible pattern between the length of FV‐units and time, with W/FV values ranging randomly from 4.9 to 7.3. There is an indication of a slight negative correlation with time (r = −0.29), which is illustrated by an almost horizontal yet slightly declining trendline in Figure 3b. In other words, Braden does not begin to write longer FV‐units over time; his FV‐units show a trend of becoming shorter in contrast to the slightly increasing cohort average. However, this correlation is not significant (p > 0.05). Additionally, the differences between Braden’s trendline and the cohort’s trendline are not significant, as the latter runs inside the confidence intervals of Braden’s trendline.

In contrast, Cassie’s FV‐unit length shows a not very steep but incremental increase, although the positive correlation with time (r = 0.33) is not significant (p > 0.05). It appears that this correlation is moderated by several oscillations in the data (Figure 3c): Cassie’s trendline goes up from about 6 to 7 words per FV‐unit over time, but the W/FV value jumps up to 8.5 at T6, to 9 at T9, and to 10.4 at T19, as well as drops to 4 at T17. Although Cassie’s trendline has a visually steeper slope than the cohort trendline, this difference is not significant, as illustrated by the cohort trendline staying largely inside the confidence intervals of Cassie’s trendline.

FIGURE 1
Sentence Length Data and Trendlines: (a) Cohort Averages; (b) Braden With Cohort Trendline; (c) Cassie With Cohort Trendline

Complexification by Coordination

Figure 4a illustrates the cross‐sectional cohort development of coordinating conjunctions (CC) used per 100 words. These values declined with time, which is supported by a strong negative correlation (r = −0.63). The cohort trendline shows a diagonal decrease going down from about 5.8 to 3.5 CC per 100 words. One can also see a number of fluctuations reaching out beyond the confidence interval lines, which can in part be explained by differences in essay topics. For example, at T3, the task included describing family trees, which lends itself to listing multiple family members by using coordinated nominal phrases. This explains an upward fluctuation to 8 CC per 100 words.

The longitudinal data analysis shows that the cohort trendline runs inside the confidence intervals of both learners’ trendlines (Figures 4b & 4c). However, the correlation of CC per 100 words with time was very weak and not significant for either Braden (r = 0.07) or Cassie (r = −0.18). This can be explained by the fact that the longitudinal data exhibit large variation: They are widely scattered around the trendlines. Braden’s trendline is perfectly horizontal: His CC use does not significantly change over time and stays at 6 per 100 words on average. Cassie’s trendline has a very slight decline from 4.8 to 4 CC per 100 words.

FIGURE 2
FV‐Units per Sentence Data and Trendlines: (a) Cohort Averages; (b) Braden With Cohort Trendline; (c) Cassie With Cohort Trendline

Complexification by Subordination

The cross‐sectional cohort development of subordinating conjunctions (SC) as used per each 100 words is shown in Figure 5a. Focused instruction on subordination in general and SC in particular occurred at T8, and learners were not expected to use these syntactic connectors prior to that point. However, the chart shows sporadic SC instances, even at earlier data points. This can be explained by the fact that participants in this study are adult, cognitively developed learners who eventually feel the need to express more complex thoughts and, therefore, to use more advanced textual cohesive devices. To do that, they may consult the dictionary or the teacher. The use of SC sharply jumps up from around 0 to 2.85 per 100 words at T8 due to the instructional focus, when learners were explicitly taught this grammatical feature and encouraged to use SC in their writing. The rate of SC use drops again at the next time point (T9) to 0.73 per 100 words but after that gradually increases with occasional fluctuations beyond the confidence intervals of the trendline. The correlation of this measure with time is positive and very strong (r = 0.84).

FIGURE 3
Words per FV‐Unit Data and Trendlines: (a) Cohort Averages; (b) Braden With Cohort Trendline; (c) Cassie With Cohort Trendline

The longitudinal data analysis showed that both focal learners tried out subordinating conjunctions before focused instruction: Braden at T4 (Figure 5b) and Cassie at T5 (Figure 5c). Their SC use positively correlates with time, and the correlation is relatively strong (r = 0.66 for Braden, r = 0.62 for Cassie). Both learners’ trendlines run below the cohort trendline; that is, their development lags behind the class average on this measure. However, the cohort trendline runs inside the confidence boundaries of Braden’s trendline but above those of Cassie’s trendline. This suggests that Braden’s SC development is not significantly different from the class average, whereas Cassie’s is. Additionally, both learners’ data are fairly scattered, indicating large variation, including occasional drops back to zero SC use.

Lexical Variety (CTTR)

As measured by the adjusted type–token ratio (CTTR), the cohort mean lexical variety showed a steady increase with only four data points falling outside the confidence intervals. The correlation of CTTR with time is very strong and positive (r = 0.89).

Both longitudinal learners develop their CTTR over time similarly to the class trendline: The correlation of this measure with time is strong and positive (r = 0.59 for Braden, r = 0.60 for Cassie). However, Figures 6b and 6c also show striking differences between them. Braden’s CTTR starts and continues to stay below the class average, with only two data points above the cohort trendline. Cassie’s CTTR, in contrast, starts off above the class average and consistently surpasses it over the whole developmental course with only two data points below the cohort trendline. Moreover, the latter is consistently below the confidence intervals of Cassie’s trendline, which shows that she is significantly above the class average on this measure.

FIGURE 4
CC per 100 Words Data and Trendlines: (a) Cohort Averages; (b) Braden With Cohort Trendline; (c) Cassie With Cohort Trendline

RESULTS II: CORRELATION BETWEEN MEASURES

The analysis above showed that all measures for the cohort and most measures for the longitudinal learners exhibited moderate to strong correlation with time. However, variation for specific measures was found between the two focal learners as well as between the longitudinal learners and the cohort. Additionally, since it was reasonable to assume that some specific measures were interrelated, further correlation analyses were conducted. Table 3 presents the correlation values between these selected complexity measures for the cohort averages as well as for the 2 focal learners. These results are discussed in the following sections. The final section also contains a qualitative analysis of the collocations of syntactic connectors in the longitudinal data.

FIGURE 5
SC per 100 Words Data and Trendlines: (a) Cohort Averages; (b) Braden With Cohort Trendline; (c) Cassie With Cohort Trendline

General and Specific Complexity Measures

As shown in Figures 1–6, sentence length (SL), the most general complexity measure, correlated very strongly with time for both the cohort and longitudinal data. For the cohort data, this correlation translated into very similar (positive or negative) correlations of all more specific complexity measures (FV/S, CC, SC, CTTR) with both time and SL (cf. Tables 2 & 3). However, for the 2 focal learners, some intriguing differences were found. For Braden, SL correlated very strongly (r = 0.81) with FV/S and moderately yet insignificantly with CTTR (r = 0.46; p > 0.05). For Cassie, SL correlated very strongly with CTTR (r = 0.75) and moderately yet significantly with both FV/S (r = 0.54) and W/FV (r = 0.47). The reader is also reminded that Braden surpassed the class average on the FV/S measure (Figure 2b) and Cassie on the CTTR measure (Figure 6c). This finding suggests that Braden’s general complexity (SL) mostly developed by increasing the number of FV‐units per sentence, whereas Cassie predominantly made her sentences longer by using more words and more varied words. A closer analysis of Braden and Cassie’s use of coordinating and subordinating conjunctions (see next section) lends further support to this hypothesis.

FIGURE 6
CTTR Data and Trendlines: (a) Cohort Averages; (b) Braden With Cohort Trendline; (c) Cassie With Cohort Trendline

FV‐Units per Sentence and Words per FV‐Unit

The 2 focal learners developed differently vis‐à‐vis the two complexity measures based on FV‐units: FV/S (Figures 2b & 2c) and W/FV (Figures 3b & 3c). Braden develops more on the former and Cassie on the latter. Furthermore, Braden significantly surpasses the cohort average development on the first measure, whereas Cassie’s data do not significantly diverge from the cohort trendline on either.

Additionally, the scatterplots for Braden and Cassie suggested a curious pattern: It appeared that at some points, when the FV/S values showed sudden upward fluctuations, the W/FV values showed downward fluctuations, such as at T7, T9, and T15 for Braden (Figures 2b & 3b) and at T6, T9, T17, and T19 for Cassie. This finding prompted the researcher to test the correlation between the two complexity measures. The results showed a moderate negative correlation between FV/S and W/FV which, however, does not reach significance for either Braden (r = −0.44; p = 0.08) or Cassie (r = −0.45; p = 0.06). The cohort showed no correlation between these measures (r = −0.02). This difference between cross‐sectional and longitudinal data illustrates how averaging may mask intra‐individual correlation between features.

Coordination and Subordination

The data for syntactic connectors suggested an upward trend for SC and a downward trend for CC. Therefore, it was decided to measure the correlation between these two variables. The analysis confirmed a strong negative correlation between the strategies CC and SC for the cohort data (r = −0.61, p < 0.01). Additionally, an even stronger correlation was found after T8 when the focused instruction on SC was administered (r = −0.68). For the two focal learners, a moderate yet insignificant correlation between CC and SC was found (r = −0.36 for Braden, r = −0.30 for Cassie). Interestingly, when measured after T8, this correlation became stronger for Braden (r = −0.52) but virtually disappeared for Cassie (r = −0.17).

Additionally, an extremely strong positive correlation was found between FV/S and SC for the cohort (r = 0.95). In contrast, FV/S correlates negatively with CC with a strong effect for the cohort (r = −0.58). This may indicate that on average, when learners started using more FV‐units per sentence (which happened over time, see Figure 2a), these were typically subordinate and not coordinate clauses. This result was confirmed for SC (r = 0.63, p < 0.01) for Braden and approached significance for Cassie (r = 0.47; p = 0.0515). However, there was a quite different picture for CC: There was no significant correlation for either of the 2 focal participants. This finding highlighted another area of individual variability in comparison with class averages.

To shed more light onto the latter finding, a collocation analysis was performed for Braden and Cassie’s use of CC and SC. Using the WordSmith Tools Concord function (Scott, 2008), concordance lines for all instances of CC and SC were retrieved and analyzed qualitatively. The results showed that Braden used a total of 28 SC tokens and Cassie a total of 37 SC tokens. However, it turned out that type‐wise, both learners’ data were similar: Each of them used 9 different SC types. The most frequent SC type by far was the complementizer dass (‘that’): Each of the 2 participants used it 10 times. This is in line with Sato (1988), who found that complement clauses were the most frequent type of subordination used by beginning learners. However, the 2 focal learners differed in the range of lexical items and structures governing the complementizer dass. Braden used dass in a fixed construction ich dachte, dass (‘I thought that’) in 5 instances out of 10. In contrast, all of Cassie’s 10 dass instances occur in collocations with different verb lexemes or forms (e.g., er hatte erklärt, dass ‘he had explained that’; deshalb hat er gedacht, dass ‘therefore, he thought that’; ich glaube, dass ‘I think that’; Ich hoffe, dass ‘I hope that’; Ich würde nicht sagen, dass ‘I would not say that’), thus exhibiting much richer lexicogrammatical variety and supporting the findings of the earlier CTTR analysis.

TABLE 3
Correlations Between Selected Complexity Measures

                  FV/S      W/FV     CC        SC        CTTR
SL      Cohort    0.88***    0.45    −0.59**    0.85***   0.79***
        Braden    0.81***    0.16     0.57*     0.30      0.46
        Cassie    0.54**     0.47*   −0.07      0.45      0.75***
FV/S    Cohort              −0.02    −0.58**    0.95***
        Braden              −0.44     0.33      0.63**
        Cassie              −0.45    −0.09      0.47
CC      Cohort                                 −0.61**
        Braden                                 −0.36
        Cassie                                 −0.30

Note. CC = coordinating conjunctions; CTTR = corrected type–token ratio; FV/S = finite verbs per sentence; SC = subordinating conjunctions; SL = sentence length; W/FV = words per finite verb unit.
*p < 0.05, **p < 0.01, ***p < 0.001.

The CC concordance analysis showed that both learners predominantly used the conjunctions und (‘and’) and aber (‘but’). Braden used 95 und and 16 aber; Cassie used 89 und and 20 aber. A closer look at the collocations of the most frequent conjunction und7 revealed that both learners used it to connect FV‐units only in about a third of all occurrences (Example 1). In contrast, two thirds of und instances connected different kinds of paratactic constructions (Examples 2 & 3), thus contributing to lengthening of FV‐units rather than to increasing the number of coordinated FV‐units.

EXAMPLES

(1) Am Morgen, ich frühstücke und Kaffee trinken. (FV‐units)
‘In the morning, I have breakfast and drink coffee.’

(2) Meine Tante Kay und ihr Mann Bill haben zwei Kinder und zwei Enkel. (noun phrases)
‘My aunt Kay and her husband Bill have two children and two grandchildren.’

(3) Die Stadt ist größer und schneller. (predicative adjectives)
‘The city is bigger and faster.’

SUMMARY AND DISCUSSION

Analysis of the development of syntactic complexity shows a general upward trend in both the cohort and longitudinal data on most of the measures used in this study: a general complexity measure (sentence length) and more specific complexity measures (finite verb units, subordination, and lexical variety). In other words, learners write more complex essays as they progress from the novice level through a college‐level instructional sequence of four semesters. This overall result corroborates the assurance by Larsen‐Freeman (2006) that “[f]oreign language teachers and learners can take heart” because over time, learners’ “writing has become more complex in grammar and in vocabulary” (p. 598). This finding was confirmed for both timed and untimed essay writing. Although students’ development in writing complexity is more apparent when they are allowed to write without time pressure, the developmental trend still holds even when only timed essays are considered. More specific results are discussed next in comparison with previous research.

Cohort Data

General and Specific Syntactic Measures. The findings for sentence length triangulate the results from Byrnes (2009), Byrnes et al. (2010), and Cooper (1976), whose participants, as in this study, were American college‐level learners of German. The cited studies showed an incremental increase in means of generic complexity measures (words per sentence and/or words per T‐unit) at transition points between curricular levels. This study supported and augmented these findings by the correlation analysis of densely plotted data, which showed a steady linear increase of the SL trendline with tight confidence intervals (Figure 1a). This finding presents evidence that learners in this study consistently increased their sentence length during each semester, and not only between semesters. A similar result was achieved for sentence length in clause‐type units (FV‐units/S), which parallels the trend of a significant increase in the number of clauses per T‐unit shown between adjacent levels I to III in Byrnes et al. (2010).8 The data collection in this study ended after the fourth semester (about 255 contact hours total), which roughly corresponds to the end of level III in Byrnes et al. (2010). Therefore, no data are available to test whether the number of clause‐type units per generic syntactic unit declines at higher proficiency levels (as found in the cited studies).

In contrast, the picture for the FV‐unit length in words is different. Byrnes (2009) and Byrnes et al. (2010) found that clause length was not significantly different between levels I and II but increased with each next level. Based on this finding, an increase in FV‐unit length could have been expected in the fourth quartile of this study’s timeline, but this was not substantiated in the data (Figure 3a). In this regard, the participants in this study seem more similar to those of Cooper (1976), who did not see an increase in clause length until the fourth year of study (Cooper provides no data for the first year). An explanation of this fact may lie in the differences in the instructional policies and practices. Whereas the curriculum in this study’s instructional setting generally switched from more narrative to more expository writing at the focal juncture (fourth semester), the GUGD curriculum, apparently, had a much more explicit focus on secondary public discourse genres and associated linguistic features such as nominalization and other means of clause extension at the corresponding level III. Indeed, as Norris and Ortega (2009) explain, “mean length of clause is radically different from the other length‐based measures” because

any increases can only result from the addition of pre‐ or postmodification within a phrase (via adjectives, adverbs, prepositional phrases, or nonfinite clauses) or as a result of the use of nominalizations, or the process of reduction of clauses into phrases which help to condense information. (p. 561)

Therefore, increasing clause length may serve asan indicator of advanced levels of L2 proficiencyascertained at level III for GUGD students but notin this study or Cooper’s (1976) study.

Syntactic and Lexical Measures. The study showed that on average, this cohort’s lexical variety (CTTR) positively and very strongly correlates with sentence length and with time. Interestingly, the strong correlation between SL and CTTR for both the cohort and the longitudinal data in this study is the reverse of the pattern found by Verspoor et al. (2008) in the written data of an advanced learner of English, whose SL and TTR development correlated negatively. However, the data for that same learner showed a weak to moderate positive correlation at the beginning of the data collection period. The present study thus provides support for Verspoor et al.’s conclusion that varied word use and the length of sentences have a complex relation to each other that “changes dynamically over time” (p. 225). At the beginning stages of L2 proficiency, as is the case for the learners in this study, these two variables may act as “connected growers” (van Geert, 1994) that support each other’s development. Indeed, beginning learners who need to increase the length of their initially very basic sentences need not only new syntactic structures but also new vocabulary to fill their sentences. In Verspoor et al.’s words, sentence length “reflects to a great extent vocabulary acquisition and the ease with which vocabulary is used” (2008, p. 220). In contrast, at more advanced stages of L2 development, lexical and syntactic complexity may enter into a competitive relationship, as shown by Verspoor et al.
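For readers who wish to reproduce this kind of comparison on their own data, the relation between sentence length and lexical variety could be computed roughly as follows. This is a minimal sketch, not the procedure used in this study: it assumes that CTTR denotes the corrected type-token ratio, types / sqrt(2 × tokens), and the function names and the three toy essays are invented for illustration.

```python
import math

def cttr(tokens):
    """Corrected type-token ratio (assumed formula): types / sqrt(2 * tokens)."""
    if not tokens:
        return 0.0
    types = {t.lower() for t in tokens}
    return len(types) / math.sqrt(2 * len(tokens))

def mean_sentence_length(sentences):
    """Mean number of word tokens per sentence (SL)."""
    return sum(len(s) for s in sentences) / len(sentences)

def pearson_r(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)

# Hypothetical toy data: three essays, each a list of tokenized sentences.
essays = [
    [["ich", "bin", "Anna"], ["ich", "bin", "Studentin"]],
    [["ich", "wohne", "in", "einer", "Wohnung"],
     ["meine", "Freunde", "wohnen", "auch", "hier"]],
    [["am", "Wochenende", "habe", "ich", "meine", "Familie", "besucht"],
     ["wir", "haben", "zusammen", "gekocht", "und", "gegessen"]],
]

sl_values = [mean_sentence_length(e) for e in essays]
cttr_values = [cttr([tok for sent in e for tok in sent]) for e in essays]
r = pearson_r(sl_values, cttr_values)
print(sl_values, [round(v, 2) for v in cttr_values], round(r, 2))
```

In this toy series both measures rise together, so the coefficient is strongly positive, which mirrors the "connected growers" pattern described above; with real data the series would hold one value per essay or time point.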

Coordination and Subordination. In contrast, CC and SC have been shown to correlate negatively in these data. Moreover, CC has also been shown to negatively correlate with time, sentence length, and the number of FV-units per sentence (Tables 2 & 3). This finding can be explained as follows. At the very outset of language study, an overwhelming majority of syntactic connectors used by learners are coordinating conjunctions (Figure 4a). This is not surprising because coordination is the only means of syntactic complexification that students are taught during the first semester of study. After focused instruction on subordinating conjunctions (T8), the use of CC gradually decreased and the use of SC gradually increased (Figures 4a & 5a), which also goes hand in hand with the increase of (subordinate) FV-units (Table 3). This finding corroborated the results from previous studies of beginning language learners (Bardovi-Harlig, 1992; Ishikawa, 1995). However, at some observation points, the pattern was reversed: CC went up at T9, T13, and T19, whereas SC went down. This reverse effect may be explained by the task effect: For example, essay 19 was written in class, and learners predictably used fewer cognitively challenging SC (than in essay 18, which was written at home) and reverted to using more CC. More dynamic research methods are needed to account for these interactions between factors.

Moreover, the linear increase of SC in this study can be loosely paralleled to the increase of clauses per T-unit (a different subordination measure) found in Byrnes et al. (2010) through levels I to III. On the one hand, this similarity may be explained by the gradually increasing focus on reasoning components in both writing programs, which require the use of subordinating conjunctions (Byrnes et al., 2010; Michel, 2010). On the other hand, the results regarding CC should be interpreted with caution. Coordination is a much more heterogeneous linguistic feature than SC because it can combine components at different syntactic levels (words, clauses, phrases), as shown by the analysis of concordance lines with the CC und. Moreover, coordination may indicate lesser as well as greater syntactic complexity (e.g., in the case of conjoined clauses, see Robinson, 2007). In fact, Cooper (1976) found mixed results regarding coordinated structures: They declined from level II to III in his learners but gradually increased again with each subsequent proficiency level. A detailed analysis of coordination in the writing of the 2 focal learners was performed in a separate study (see Vyatkina, in press).

Longitudinal Data

The longitudinal data analysis richly illustrates the variability in developmental pathways of 2 individual learners against the backdrop of the cross-sectional developmental trend. The analysis of sentence length shows that Braden and Cassie’s developmental pathways oscillate closely around the average class trendline on this generic syntactic complexity measure. It was therefore interesting to track how these two learners developed based on more specific complexity measures; that is, what are the specific means by which their sentence length increases? The results show that Braden makes his sentences progressively longer primarily by adding more FV-units. He is consistently and significantly above the class average on this measure. In contrast, Cassie’s sentence length most strongly correlates with CTTR, and she consistently and significantly surpasses her classmates on this measure. The qualitative analysis confirms this hypothesis, at least for SC collocations: Cassie uses the SC dass with a variety of words and constructions. This variation is in stark contrast with Braden’s use of the same connector mostly in a fixed construction ich dachte, dass. The specific verb form dachte as used by Braden in conjunction with dass is an example of chunk-type word combinations, or “lexical entry points into complex structures” typical of beginning learners, according to Sato (1988, p. 392). This observation suggests that Cassie may have experimented with a richer repertoire of lexicogrammatical features typical of more advanced learners. This assumption finds further support in an observation by Cassie’s instructor, who praised this student’s creativity in her essays.

Additionally, Cassie’s sentence length increase moderately correlated with an increase in the FV-unit length. As discussed earlier, lengthening of clause-type units is generally considered a more advanced complexification strategy as opposed to the increase of the number of clause-type units per sentence (as used by Braden). Therefore, a positive correlation between Cassie’s SL and FV-unit length may suggest that she has reached a higher stage of syntactic complexification than her classmates. Indeed, a separate qualitative study of the 2 focal learners confirmed this assumption, showing that Cassie used a wider variety of more advanced syntactic constructions (such as infinitives and participles) than Braden (Vyatkina, in press).

Finally, a moderate negative correlation between FV-units per sentence and words per FV-unit was found for the 2 focal learners, whereas the cohort data showed no correlation. While this finding suggests a hypothesis about a competitive relationship between the frequency and length of clause-type units in these individuals’ longitudinal data, nonlinear dynamic methods will be needed to support or refute it.

IMPLICATIONS AND CONCLUSION

This study analyzed the development of writing complexity in college-level learners of German over four semesters of study beginning at the novice level. Various lexicogrammatical measures of complexity were used, and their correlations were explored cross-sectionally and longitudinally. Comparisons were made between cross-sectional trendlines and individual development paths taken by 2 learners. The results confirm that learners follow some general developmental trends established in previous CAF research, but considerable variability was also illustrated and explained. In this way, this research responds to Byrnes’s (2009) call to study both individual values and group values and “to find regularities, but to find them in variations that are themselves specified” (p. 63). The relationships between different complexity measures were also explored.

Furthermore, this study contributes to the methodology for conducting developmental SLA research. First, selected CAF measures were described in detail, which will enable future applications of the study design to other contexts. The study shows that frequencies of automatically annotated POS such as finite verbs and syntactic connectors can be used as convenient proxy measures for syntactic complexity. Second, the study showed how cross-sectional and longitudinal data can be used as complementary data sources yielding mutually illuminating research results. In particular, using scatterplots along with trendlines and confidence interval lines illustrates cross-sectional to longitudinal comparisons.
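To make the idea of POS-based proxy measures concrete, the following sketch derives several of the measures discussed here from a list of (word, tag) pairs. It is an illustration under stated assumptions rather than this study's actual pipeline: the tag names follow the STTS tagset commonly used by German taggers, the per-100-words normalization is a hypothetical choice, and the function name, dictionary keys, and sample sentence are invented.

```python
# Assumed STTS tags; adjust to the tagset your tagger actually outputs.
FINITE_VERB_TAGS = {"VVFIN", "VAFIN", "VMFIN"}   # finite full, auxiliary, modal verbs
COORD_TAGS = {"KON"}                             # coordinating conjunctions (CC)
SUBORD_TAGS = {"KOUS", "KOUI"}                   # subordinating conjunctions (SC)
SENTENCE_END_TAG = "$."                          # sentence-final punctuation

def ratio(numerator, denominator):
    """Safe division that returns NaN instead of raising on a zero denominator."""
    return numerator / denominator if denominator else float("nan")

def complexity_proxies(tagged_tokens):
    """Compute proxy measures from a list of (word, tag) pairs for one essay."""
    # Punctuation tags in STTS start with "$"; exclude them from word counts.
    words = [(w, t) for w, t in tagged_tokens if not t.startswith("$")]
    n_words = len(words)
    n_sentences = sum(1 for _, t in tagged_tokens if t == SENTENCE_END_TAG)
    n_fv = sum(1 for _, t in words if t in FINITE_VERB_TAGS)
    n_cc = sum(1 for _, t in words if t in COORD_TAGS)
    n_sc = sum(1 for _, t in words if t in SUBORD_TAGS)
    return {
        "SL": ratio(n_words, n_sentences),          # words per sentence
        "FV_per_S": ratio(n_fv, n_sentences),       # FV-units per sentence
        "words_per_FV": ratio(n_words, n_fv),       # FV-unit length in words
        "CC_per_100": ratio(100 * n_cc, n_words),   # assumed normalization
        "SC_per_100": ratio(100 * n_sc, n_words),   # assumed normalization
    }

# Hypothetical tagged sentence: "Ich dachte, dass er kommt und ich bleibe."
sample = [
    ("Ich", "PPER"), ("dachte", "VVFIN"), (",", "$,"), ("dass", "KOUS"),
    ("er", "PPER"), ("kommt", "VVFIN"), ("und", "KON"), ("ich", "PPER"),
    ("bleibe", "VVFIN"), (".", "$."),
]
measures = complexity_proxies(sample)
print(measures)
```

Because every measure is a simple count over tags, tagger errors propagate directly into the values, which is why the accuracy check reported in the notes matters for this approach.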

The results were compared to a series of findings from previous research conducted for similar learner populations (Byrnes, 2009; Byrnes et al., 2010; Cooper, 1976). The cited studies as well as this study have shown that the general syntactic complexity and the amount of subordination linearly increase in the writing of beginning to intermediate learners. This study has additionally shown that the linear increase trend held both for time points T1 to T10, at which modifications of a narrative task were used, and for time points T11 to T19, at which a reasoning component was added. This is an important methodological finding confirming that these measures (sentence or T-unit length in words and clause-type units as well as normalized counts of subordinating conjunctions) may be used as reliable indices of progressing from beginning to intermediate L2 proficiency, which are sufficiently resistant to differences in instructional settings and tasks (the latter being a particularly severe threat to validity in longitudinal designs).

In contrast, the clause-type unit length in words did not work as a developmental unit in this study, at least for the cohort data, because the values did not change significantly over time. This finding pointed to curricular differences between this study’s program and the program described by Byrnes (2009) and Byrnes et al. (2010), who showed a significant increase of clause length (a characteristic of advanced L2 capacities) at a comparable time juncture in terms of the number of instructional hours. Finally, this study has shown that the general amount of coordination correlates negatively with the amount of subordination, although it was pointed out that coordination is a potentially heterogeneous measure and merits further exploration.

It has to be noted that the complexity measures and methods in this study have inevitable limitations, which show directions for future research. First, the decision to use only automatically computed measures brings up the issue of software accuracy. This was evaluated here by spot-checks; however, future investigations should (ideally) follow a semi-automatic approach with a systematic correction of tagger errors (Garretson & O’Connor, 2007; see also Vyatkina, in press). Second, to enhance research comparability, more studies are needed with data manually annotated for syntactic complexity measures used for similar learner populations (such as clauses, T-units, and lexical sophistication). Third, this study has only begun to tap into the development of more specific complexity measures that present a rich potential for future research. More studies are already underway that explore a large set of POS-based complexity measures in response to specific writing tasks for a focal cohort and individuals. Fourth, this study used linear correlation methods, which were considered appropriate as a first approximation for comparing cross-sectional and individual data. However, a number of longitudinal findings yielded hypotheses that may be tested with nonlinear dynamic methods (Verspoor et al., 2011), such as a (potentially) supporting relationship between sentence length and lexical variety or a competing relationship between coordination and subordination (at a certain point in the instructional progression).
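One simple exploratory step beyond a single correlation coefficient is a moving-window correlation, which shows whether the relationship between two measures changes over the observation period. The sketch below is offered only as an illustration of that general idea; it is not a method used in this study, and the window size, function names, and toy series are assumptions made for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)

def moving_correlation(xs, ys, window=5):
    """Correlate xs and ys within each window of consecutive time points."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if window < 2 or window > len(xs):
        raise ValueError("window must be between 2 and the series length")
    return [
        pearson_r(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]

# Hypothetical series: two measures that first grow together, then diverge.
measure_a = [1, 2, 3, 4, 5, 6, 7, 8]
measure_b = [1, 2, 3, 4, 4, 3, 2, 1]
windows = moving_correlation(measure_a, measure_b, window=4)
print([round(r, 2) for r in windows])
```

A sign change across windows, as in this toy series, would be the kind of pattern that distinguishes a supporting relationship early on from a competing one later; short windows are noisy, so the window size should be chosen with the number of available time points in mind.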

On a final note, this study has the potential to be expanded with analyses of data collected for subsequent learner cohorts. Such analyses may be particularly revealing because of the changes in the curricular approach to writing implemented in the focal program over the years. In particular, efforts have been made to enhance the ecological validity (Byrnes et al., 2010) of the writing component (see Vyatkina, 2011). Future comparisons of the learning outcomes for different learner cohorts may validate such curricular changes, which would have implications for both writing pedagogy and research in L2 education.

NOTES

1 Byrnes et al. (2010) explain that they designed this BWT for students in their program as “a task that (a) could be reasonably attempted by learners at four broadly different ability levels; (b) could be completed within a relatively short amount of time; and (c) could, nevertheless, elicit a reasonably trustworthy indication of learners’ writing abilities and syntactic patterns” (p. 165). Given that students in this study engaged in discussions of German short stories for a whole semester, writing a short book review was considered an appropriate semester-final writing task.

2 Although some researchers account for productivity, or text length in words, in complexity analyses, it is widely accepted that it is rather a separate CAF measure akin to fluency (Polio, 2001). Following this assumption, productivity is not analyzed in this study. However, as additional information on the study background, Table 1 demonstrates that the average essay length progressively increases (with some upward and downward oscillations) from 68 words at T1 to 161 words at T19 (which was expected). Also, learners wrote longer essays under untimed conditions (which was also expected). Furthermore, it should be noted that whereas Braden’s essay length mostly fluctuated around the cohort trendline, Cassie frequently wrote longer and several times much longer essays than the class average.

3 Although clauses are rarely defined explicitly, the following differences emerge from discussion sections and/or examples in some recent studies: Norris and Ortega (2009) discuss finite clauses but consider nonfinite verb constructions to be phrases, i.e., subclausal elements (p. 561, see also Lu, 2011); Byrnes et al. (2010) count infinitival constructions zu (‘to’) + infinitive as nonfinite clauses but infinitives governed by modal verbs as subclausal elements (p. 168); whereas Kuiken, Vedder, and Gilabert (2011) consider all nonfinite verb constructions governed by modal verbs separate clauses (at least for some Germanic and Romance languages, following Housen, 2002, p. 106).

4 Another study that reports on specific cases of coordination and subordination, including modification, for the 2 longitudinal learners accounts for this limitation (Vyatkina, in press).

5 Using an automatic POS tagger trained on native data for learner data implies that only native-like forms were counted, i.e., accuracy was implicitly taken into account although it is not the focus of this study. It should be noted that using an automatic tool always assumes a certain percentage of annotation errors (Granger, 2002; Meurers & Müller, 2009). The tagger accuracy was evaluated by two independent raters and estimated at about 96%, which is fairly high. Conjunctions are short uninflected words in which learners make few spelling mistakes, and many misspelled verb forms such as arbiete instead of arbeite were recognized as FV by the tagger. It must be added that human annotation is not free of errors either, and automatic taggers’ errors are at least systematic.

6 Sentence length is expected to further increase with growing L2 proficiency based on Cooper’s (1976) data, which showed consistent growth with each year of study and a significant increase after every 2 years.

7 This observation was also confirmed by the cross-sectional wordlist analysis: The word und is by far the most frequent word, not only the most frequent connector, in this learner corpus. Moreover, this pattern parallels native speaker usage of und, which has been shown to be one of the most frequent words in German (Tschirner, 2005).

8 No direct comparisons are possible between the units used in this study (automatically annotated FV-units) and the cited studies (manually annotated finite and nonfinite clauses in Byrnes et al., 2010, and finite clauses in Cooper, 1976). However, general trends can be compared because all these units are similar in the sense that they measure syntactic complexity at the level lower than a sentence but higher than a word and have a verb at their core.

ACKNOWLEDGMENTS

This study was supported in part by the University of Kansas General Research Fund allocations Nos. 2302139 and 2301446. I would like to acknowledge Emily Hackmann for her help in evaluating the tagger accuracy and Jonathan J. Van Tassel for assistance with correlation analysis. I would also like to thank William J. Comer and the anonymous reviewers for their comments on earlier drafts of this article.

REFERENCES

Aarts, J., & Granger, S. (1998). Tag sequences in learner corpora: A key to interlanguage grammar and discourse. In S. Granger (Ed.), Learner English on computer (pp. 132–141). London: Longman.

Abrams, Z. (2010). Writing. In C. Blyth (Ed.), Foreign language teaching methods. Texas Language Technology Center, University of Texas at Austin. Retrieved from http://coerll.utexas.edu/methods

Arthur, B. (1979). Short-term changes in EFL composition skills. In C. Yorio, K. Perkins, & J. Schachter (Eds.), On TESOL ’79: The learner in focus (pp. 330–342). Washington, DC: TESOL.

Bardovi-Harlig, K. (1992). A second look at T-unit analysis: Reconsidering the sentence. TESOL Quarterly, 26, 390–395.

Briggs, J., Di Donato, R., Clyde, M., & Vansant, J. (2008). Workbook to accompany Deutsch, Na Klar!: An introductory German course. New York: McGraw-Hill.

Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A. Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA (pp. 21–46). Philadelphia/Amsterdam: John Benjamins.

Byrnes, H. (2009). Emergent L2 German writing ability in a curricular context: A longitudinal study of grammatical metaphor. Linguistics and Education, 20, 50–66.

Byrnes, H., Maxim, H., & Norris, J. M. (2010). Realizing advanced foreign language writing development in collegiate education: Curricular design, pedagogy, assessment [Monograph]. Modern Language Journal, 94(s1).

Byrnes, H., & Sinicrope, C. (2008). Advancedness and the development of relativization in L2 German: A curriculum-based longitudinal study. In L. Ortega & H. Byrnes (Eds.), The longitudinal study of advanced L2 capacities (pp. 109–138). New York: Routledge/Taylor & Francis.

Casanave, C. (1994). Language development in students’ journals. Journal of Second Language Writing, 3, 179–201.

Cooper, T. C. (1976). Measuring written syntactic patterns of second language learners of German. Journal of Educational Research, 69, 176–183.

de Bot, K., Lowie, W., & Verspoor, M. H. (2011). Introduction. In M. H. Verspoor, K. de Bot, & W. Lowie (Eds.), A dynamic approach to second language development. Methods and techniques (pp. 1–4). Philadelphia/Amsterdam: John Benjamins.

Di Donato, R., Clyde, M., & Vansant, J. (2008). Deutsch, na klar!: An introductory German course. Boston: McGraw-Hill.

Draper, N. R., & Smith, H. (2001). Applied regression analysis. New York: Wiley.

Ellis, N., & Larsen-Freeman, D. (2006). Language emergence: Implications for applied linguistics – introduction to the special issue. Applied Linguistics, 27, 558–589.

Ellis, R. (2003). Task-based language learning and teaching. Oxford: Oxford University Press.

Garretson, G., & O’Connor, M. C. (2007). Between the Humanist and the Modernist: Semi-automated analysis of linguistic corpora. In E. Fitzpatrick (Ed.), Corpus linguistics beyond the word: Corpus research from phrase to discourse (pp. 87–106). Amsterdam:

Granger, S. (2002). A bird’s-eye view of learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition and foreign language teaching (pp. 3–33). Philadelphia/Amsterdam: John Benjamins.

Granger, S., Kraif, O., Ponton, C., Antoniadis, G., & Zampa, V. (2007). Integrating learner corpora and natural language processing: A crucial step towards reconciling technological sophistication and pedagogical effectiveness. ReCALL, 19, 252–268.

Halliday, M. A. K., & Matthiessen, C. M. I. M. (1999). Construing experience through meaning: A language-based approach to cognition. London: Continuum.

Hawkins, J. A., & Buttery, P. (2010). Criterial features in learner corpora: Theory and illustrations. English Profile Journal, 1, DOI: 10.1017/S2041536210000103.

Homburg, T. J. (1984). Holistic evaluation of ESL compositions: Can it be validated objectively? TESOL Quarterly, 18, 87–107.

Housen, A. (2002). Second language development in the European School system of multilingual education. In D. W. C. So & G. M. Jones (Eds.), Education and society in plurilingual contexts (pp. 96–128). Brussels, Belgium: VUB University Press.

Housen, A., & Kuiken, F. (2009). Complexity, accuracy, and fluency in second language acquisition. Applied Linguistics, 30, 461–473.

Hunt, K. W. (1965). Grammatical structures written at three grade levels (NCTE research report No. 3). Champaign, IL: National Council of Teachers of English.

Ishikawa, S. (1995). Objective measurement of low-proficiency EFL narrative writing. Journal of Second Language Writing, 4, 51–69.

Kern, R. G., & Schultz, J. M. (1992). The effects of composition instruction on intermediate level French students’ writing performance: Some preliminary findings. Modern Language Journal, 76, 1–13.

Kuiken, F., Vedder, I., & Gilabert, R. (2011, March). Syntactic complexity in L2 writing as an indicator of L2 proficiency. Paper presented at the Annual Conference of the American Association for Applied Linguistics, Chicago, IL.

Larsen-Freeman, D. (1983). Assessing global second language proficiency. In H. Seliger & M. Long (Eds.), Classroom-oriented research in second language acquisition (pp. 287–304). Rowley, MA: Newbury House.

Larsen-Freeman, D. (2006). The emergence of complexity, fluency, and accuracy in the oral and written production of five Chinese learners of English. Applied Linguistics, 27, 590–619.

Larsen-Freeman, D. (2009). Adjusting expectations: The study of complexity, accuracy, and fluency in Second Language Acquisition. Applied Linguistics, 30, 579–589.

Lu, X. (2011). A corpus-based evaluation of syntactic complexity measures as indices of college-level ESL writers’ language development. TESOL Quarterly, 45, 36–62.

Lu, X. (2012). The relationship of lexical richness to the quality of ESL learners’ oral narratives. Modern Language Journal, 96, 190–208.

Maxim, H. (2011, March). Establishing a curricular trajectory: A socio-semiotic perspective on text selection and sequencing. Paper presented at the Annual Conference of the American Association for Applied Linguistics, Chicago, IL. Retrieved from http://userwww.service.emory.edu/~hmaxim/presentations.html

Meurers, W. D., & Müller, S. (2009). Corpora and syntax. In A. Lüdeling & M. Kytö (Eds.), Corpus linguistics: An international handbook (pp. 920–933). Berlin: Mouton de Gruyter.

Michel, M. (2010). Cognitive and interactive aspects of task-based performance in Dutch as a second language (Unpublished doctoral dissertation). University of Amsterdam, The Netherlands.

Norris, J., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The case of complexity. Applied Linguistics, 30, 555–578.

Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24, 492–518.

Ortega, L. (2012). Interlanguage complexity: A construct in search of theoretical renewal. In B. Kortmann & B. Szmrecsanyi (Eds.), Linguistic complexity: Second language acquisition, indigenization, contact (pp. 127–155). Berlin: Mouton de Gruyter.

Ortega, L., & Byrnes, H. (2008). The longitudinal study of advanced L2 capacities: An introduction. In L. Ortega & H. Byrnes (Eds.), The longitudinal study of advanced L2 capacities (pp. 3–20). New York: Routledge/Taylor & Francis.

Pallotti, G. (2009). CAF: Defining, refining and differentiating constructs. Applied Linguistics, 30, 590–601.

Polio, C. (2001). Research methodology in second language writing research: The case of text-based studies. In T. Silva & P. K. Matsuda (Eds.), On second language writing (pp. 91–115). Mahwah, NJ: Lawrence Erlbaum.

Ravid, D. (2005). Emergence of linguistic complexity in later language development: Evidence from expository text construction. In D. D. Ravid & H. B. Shyldkrot (Eds.), Perspectives on language and language development: Essays in honor of Ruth A. Berman (pp. 337–356). London: Kluwer Academic.

Robinson, P. (2007). Task complexity, theory of mind, and intentional reasoning: Effects on L2 speech production, interaction, uptake and perceptions of task difficulty. International Review of Applied Linguistics, 45, 237–257.

Ryshina-Pankova, M. (2010). Toward mastering the discourses of reasoning: Use of grammatical metaphor at advanced levels of foreign language acquisition. Modern Language Journal, 94, 181–197.

Sato, C. (1988). Origins of complex syntax in interlanguage development. Studies in Second Language Acquisition, 10, 371–395.

Saville, N. (2010). The English profile programme: Background, current issues and future prospects. Language Teaching, 43, 238–244.

Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. In Proceedings of the International Conference on New Methods in Language Processing. Manchester, UK. Retrieved from http://www.ims.uni-stuttgart.de/~schmid/

Scott, M. (2008). WordSmith Tools (Version 5) [Computer software]. Liverpool, UK: Lexical Analysis Software.

Skehan, P. (1989). Individual differences in second language learning. London: Edward Arnold.

Spoelman, M., & Verspoor, M. (2010). Dynamic patterns in development of accuracy and complexity: A longitudinal case study in the acquisition of Finnish. Applied Linguistics, 31, 532–553.

Teichert, H., & Teichert, L. (2005). Allerlei zum Lesen [All kinds of reading]. Boston: Houghton Mifflin Co.

Tschirner, E. (2005). Korpora, Häufigkeitslisten, Wortschatzerwerb [Corpora, frequency lists, vocabulary acquisition]. In A. Heine, M. Hennig, & E. Tschirner (Eds.), Deutsch als Fremdsprache – Konturen und Perspektiven eines Fachs [German as a foreign language – contours and perspectives of a discipline] (pp. 133–149). München, Germany: Iudicium.

van Geert, P. (1994). Dynamic systems of development: Change between complexity and chaos. New York: Harvester.

Verspoor, M. H., de Bot, K., & Lowie, W. (Eds.). (2011). A dynamic approach to second language development. Methods and techniques. Philadelphia/Amsterdam: John Benjamins.

Verspoor, M., Lowie, W., & van Dijk, M. (2008). Variability in second language development from a dynamic systems perspective. Modern Language Journal, 92, 214–231.

Verspoor, M., & Sauter, K. (2000). English sentence analysis: An introductory course. Philadelphia/Amsterdam: John Benjamins.

Vyatkina, N. (in press). Specific syntactic complexity: Developmental profiling of individuals based on an annotated learner corpus. Modern Language Journal, 97(s1).

Vyatkina, N. (2011). Writing instruction and policies for written corrective feedback in the basic language sequence. L2 Journal, 3, 63–92.

Wolfe-Quintero, K., Inagaki, S., & Kim, H.-Y. (1998). Second language development in writing: Measures of fluency, accuracy, and complexity. Second Language Teaching & Curriculum Center, University of Hawai’i at Manoa.

APPENDIX

Instruction-Embedded Elicitation Tasks (based on Briggs, Di Donato, Clyde, & Vansant, 2008)

1. Who are you?
2. Your apartment, your friends
3. Your family
4. Your daily routine
5. Description: Your favorite clothes OR Your lucky charm OR A shopping day
6. Party plan
7. Your last weekend (sequence of activities in the present perfect tense)
8. You as a human (personal characteristics) OR a human (and human life)
9. Your town
10. Comparison of two trips (in the present perfect tense)
11. The life in the future (including a consequential explanation)
12. Describe yourself as a person or a person who you respect (including a consequential explanation)
13. You and the media OR You and technology (including a consequential explanation)
14. A (societal or personal) problem / challenge (including a possible solution)
15, 16, 17, 18. Interpretation. Use your imagination and think about the deeper meaning of this story (3–7 more specific prompts provided)
19. Book review. Choose a book, either fictional or nonfictional, that you have read and write an article for a regular feature in a student newspaper (a few more specific prompts provided)
