A Receptive Vocabulary Knowledge Test for French L2 Learners
With Academic Reading Goals
Roselene Batista
A Thesis
in
the Department
of
Education
Presented in Partial Fulfillment of the Requirements
for the Degree of Master of Arts (Applied Linguistics) at
Concordia University
Montreal, Quebec, Canada
March 2014
© Roselene Batista, 2014
CONCORDIA UNIVERSITY
School of Graduate Studies
This is to certify that the thesis prepared
By: Roselene Batista
Entitled: A Receptive Vocabulary Knowledge Test for French L2 Learners
With Academic Reading Goals
and submitted in partial fulfillment of the requirements for the degree of
Master of Arts (Applied Linguistics)
complies with the regulations of the University and meets the accepted standards with
respect to originality and quality.
Signed by the final Examining Committee:
__________________________________ Chair
Elizabeth Gatbonton
__________________________________ Examiner
Sara Kennedy
__________________________________ Examiner
Joanna White
__________________________________ Supervisor
Marlise Horst
Approved by Richard Schmid
Chair of Department
February 14, 2014 Joanne Locke
Dean of Faculty
ABSTRACT
A Receptive Vocabulary Knowledge Test for French L2 Learners
With Academic Reading Goals
Roselene Batista
Recent studies in second language acquisition have confirmed a positive
correlation between L2 learners' lexical knowledge and their language abilities. In more
specific terms, researchers are increasingly confirming the idea that vocabulary size
greatly impacts reading comprehension. In order to estimate English L2 learners'
vocabulary size, several kinds of receptive vocabulary tests have been developed. But
what about French L2 learners? How is their vocabulary size measured? There appear to
be few well-designed measures available. This study describes the development and
validation of a new measure for French, the Test de la taille du vocabulaire (TTV). The
TTV is closely modeled on Nation's (1990) widely used Vocabulary Levels Test (VLT)
and follows the guidelines written by Schmitt, Schmitt and Clapham (2001). The TTV
draws on recent corpus-based frequency lists for French (Baudot, 1992; Lonsdale & Le
Bras, 2009). Initially, a pilot version was trialled with 63 participants; an improved
version was then administered to 175 participants at four levels of proficiency. Results attest
to the TTV's validity: scores indicate that the higher the group, the larger its vocabulary
size. Moreover, the mean scores across the four word sections decreased as the test
sections became more difficult. This assessment tool also proved to be reliable, as
performance on the test confirmed learners' level as determined by the institutional
placement test. Post-test interviews with the participants confirmed their knowledge of
the test words. Recommendations for improving the TTV, implications for theory and
practice, and limitations will be discussed.
Acknowledgements
This thesis project would not have been possible without the encouragement,
guidance and generosity of my supervisor, Professor Marlise Horst. Her remarkable
expertise in the field, her relevant and helpful suggestions as well as her immense
patience made all the difference during the course of writing this thesis. I also attribute
the success of this thesis to the great relationship we developed throughout this process. It
all started on the last day of classes, when vocabulary tests were discussed. This
discussion inspired me to learn more and to propose the idea of developing a test to
Marlise, who gladly accepted it without any hesitation. From this day on, my interest in
vocabulary research has increased greatly, all thanks to a passionate professor who gave
me the "vocabulary bug."
I would also like to sincerely thank my thesis committee consisting of Professor
Sara Kennedy and Professor Joanna White for their invaluable attention and insightful
comments when this project was in its inception. I really appreciate the fact that they took
interest in my work, imparting their knowledge on me, which greatly contributed to the
completion of my thesis.
I also give thanks to Dr. Pavel Trofimovich and Dr. Walcir Cardoso, who played
an important role in my academic pursuits. It was truly a privilege to be their student.
Special thanks go to the Department of Education staff.
I would also like to convey my gratitude to Cécile Hernu, the pedagogical advisor in the
department of Continuing Education at the CEGEP de Saint-Laurent, who showed a deep
interest in this research since the beginning and allowed me to go back to my former
workplace and conduct my research there. In addition, I would like to thank the teachers
in the Francisation program, who welcomed me into their classrooms so I could invite
their students to participate in this study. My most heartfelt thanks go to the students who
kindly and eagerly accepted my invitation and came in large numbers to take my test.
Finally, I am especially grateful to my husband, Valdir Jorge, for enduring years of
boring and incomprehensible explanations of my countless academic projects in general,
and of this research study in particular; my son, Ivan Jorge, for his unrelenting search for
native speaker participants, and my daughter, Larissa Jorge, for being my devoted
research assistant and for helping me organize the large amount of data I collected. Each
of you can share in this accomplishment, and I am forever indebted to all of you.
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Literature Review
    Words, numbers, definitions
    Knowing a word
    Vocabulary size, frequency, and L2 learning
        L2 vocabulary size and reading comprehension
    How many words do learners need to know to understand academic texts in French?
    Receptive vocabulary testing
        A useful measure for academic learners
        Receptive vocabulary testing in French
    Developments in French frequency lists
    Research Questions and Hypotheses
Chapter 3. Methodology
    Materials
        Designing the pilot test
            Format
            Selecting the words and writing the definitions
        Questionnaire
        Interview
    Participants
        Pilot testing phase
        Final testing phase
    Procedures and Analyses
        Preliminary version of the TTV
        Pilot version of the TTV
        Post-test Validation Interviews
        The final version of the TTV
    Results of the pilot testing
Chapter 4. Results
    Research question 1
    Research question 2
    Research question 3
Chapter 5. Discussion
    Comparing estimates of vocabulary sizes
    Previous learned languages and vocabulary sizes
    Limitations and tentative solutions
Chapter 6. Implications and conclusions
    Implications for research
    Implications for teaching
    Conclusion
References
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G
Appendix H
List of Figures
Figure 1. Two sample clusters from the VLT - paper and electronic versions.
Figure 2. Sample questions in the Yes/No Test.
Figure 3. Sample questions in the computerized Yes/No Test or EFL Vocabulary Test.
Figure 4. Sample question from the Vocabulary Size Test.
Figure 5. Sample of the French X-Lex Vocabulary Test - paper version.
Figure 6. Opening screen of the French X-Lex Vocabulary Test - computerized version.
Figure 7. Distribution of clusters in each section on the TTV - pilot version.
Figure 8. A noun cluster from the Test de la taille du vocabulaire.
Figure 9. Card used to confirm learners' knowledge of the word "division."
Figure 10. Distribution of clusters in each section on the TTV - final improved version.
Figure 11. Test score mean per word section and proficiency level.
List of Tables
Table 1. The Percentage of Text Cover Necessary to Read Different Kinds of Texts
Table 2. Vocabprofil Output for an Academic Text - Psychology Program
Table 3. Summary of Characteristics of Three Vocabulary Tests - Brown's (2007) Qualities
Table 4. Description of Groups of Participants
Table 5. Facility Values and Discrimination Indices for Pilot Version
Table 6. Number of Words (Means) by Proficiency
Table 7. Test Scores (Means) by Section
Table 8. Comparison of Interview Results With TTV's Results
Table 9. Number of Known Words by Proficiency Level: Extrapolations Based on Milton and TTV Results (N = 175)
Table 10. Mean Correct Scores for Different Language Groups
Chapter 1. Introduction
How did I become interested in creating a vocabulary test? Vocabulary became
the focus of my lessons in a gradual manner, during the course of several years of
teaching French as a second language to newly arrived immigrants in Quebec. On many
occasions, I realized that I needed to assess the size of my students' vocabulary more
precisely, but I did not have a suitable or reliable tool to measure it. Frequently, these
second language (L2) learners approached me at the end of the term to ask if I thought
they were ready to pursue higher level studies in French (in their case, CEGEP or
university level studies), and at those moments I wished I had a vocabulary test to assess,
for example, whether they had the lexical resources necessary to cope with certain
language tasks, such as reading academic materials. I answered the question hesitantly,
knowing that my verdict was based solely on the learners' overall language performance,
not on their specific vocabulary knowledge, because the tests they took measured their
general proficiency rather than their vocabulary knowledge in particular.
It is known that measures of general proficiency may tell us whether a learner is able to
communicate efficiently in daily interactions, but they seem less helpful as indicators of
readiness to read and understand academic textbooks in the new language. In my studies
as a graduate student, I became aware of the strong correlation between L2 vocabulary
knowledge and reading comprehension (Nation, 2001). Therefore, it became apparent to
me that a vocabulary-size test in French had the potential to be very useful in determining
students' readiness for doing academic work.
The French test I envision would be able to estimate how many words learners
know, and its results would indicate with reasonable accuracy whether the learners'
vocabulary knowledge is adequate for undertaking academic study. This would help them
make more informed decisions concerning their studies in the future. Ideally, the test
would also be easy to build, simple to administer and fast to score. I have seen that such
measures have been developed for estimating the vocabulary sizes of L2 learners of
English, and I began to wonder what was available for L2 learners of French. These
thoughts led to this study to create and test a new measure of vocabulary size for French.
In the next chapter, research literature relevant to measuring L2 vocabulary size
will be reviewed, beginning with definitions of key concepts. Then recent views on the
key role of vocabulary in developing L2 proficiency will be presented. In subsequent
sections, vocabulary frequency lists and their uses in building measures of vocabulary
size will be examined, and special attention will be given to lists and tests of French.
Chapter 2. Literature Review
Words, numbers, definitions
As Wilkins (1972) rightly put it: "Without grammar very little can be conveyed;
without vocabulary nothing can be conveyed" (p. 111). Vocabulary is a crucial part of a
language, "an essential building block" as Schmitt, Schmitt & Clapham (2001) assert.
The lexicon of a language may contain hundreds of thousands of words or even millions
(depending on how "word" is defined). According to Michel et al. (2011), the size of the
English lexicon is 1,022,000 words. This number was obtained by the Harvard University
and Google team, who ran a computational analysis on a 361 billion word corpus, the
product of millions of digitized books in English. In this project, the team defined word
as "one string of characters uninterrupted by a space" (Michel et al., 2011, p. 176), and
included speed, learnt, and netiquette as good examples of strings.
In addition to strings, words can also be counted as tokens, types, lemmas or
families. Nation (2001), for example, defines token as every word form in a text. When
this unit of counting is used, the same word form will be counted as many times as it
occurs in a text. By counting tokens, it is possible to obtain a raw number of words. So
the sentence:
How do they do a word count?
contains seven tokens, since each occurrence of do is counted separately. However, if
another unit of counting is used, one which treats repetitions as a single item, that same
sentence would contain six words or "types." By counting types, it is possible to obtain
the number of categories of words. When researchers choose "lemmas" as the basis of
counting, they include all the inflected forms of a word as a single headword or lemma
(Lonsdale & Le Bras, 2009). In the case of the verb learn, for instance, the lemma includes
the inflected forms learns (-s, third person singular present tense), learning (-ing,
progressive), and learned/learnt (-ed/-t, past tense) under a single headword.
counting words is based on the notion of "word family." A family includes all the
common inflections and derived forms of a word (Milton, 2009). For instance, the word
family of learn includes the headword (learn), its inflected forms (learns, learning,
learnt, learned), and also its closely related derived forms (learner, learners, learnable,
relearn, relearns, relearned, relearnt, relearning, unlearn, unlearns, unlearned, etc.).
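The distinction between tokens and types described above can be sketched in a few lines of Python (the tokenization rule used here, splitting on letters and apostrophes, is a deliberate simplification; a real word count would also need decisions about hyphens, numerals, and punctuation):

```python
import re

def count_tokens_and_types(text):
    """Return (tokens, types): running words vs. distinct word forms."""
    # Lowercase first so that "How" and "how" count as the same type.
    words = re.findall(r"[a-z']+", text.lower())
    return len(words), len(set(words))

tokens, types = count_tokens_and_types("How do they do a word count?")
print(tokens, types)  # 7 6 -- the two occurrences of "do" collapse into one type
```

Counting lemmas or word families would additionally require a lookup table that maps each inflected or derived form (learns, learning, relearn, and so on) to its headword before the set of distinct items is taken.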
The same Google team and Harvard professors mentioned earlier have also
created a French corpus containing 45 billion words, but no estimate of the number of
types, lemmas or families was reported. A different source, though, the Office québécois
de la langue française (OQLF) (2002), reports that the French language contains 500,000
words. Unlike the Harvard/Google team, however, the OQLF does not specify
which unit of counting they used to obtain their number.
No native speaker of these languages can possibly know all of these words,
let alone an L2 learner. Nevertheless, in order to communicate successfully in an L2, a
portion of them must be known. Not surprisingly, many second language acquisition
(SLA) researchers have focused their attention on L2 learners' vocabulary size, trying to
answer the following recurring questions: How much vocabulary does a learner need to
know? Which are the most important words to study? Answering these questions requires
a definition of what it means to know a word, which will be explained in the next section.
Knowing a word
According to Nation (2001), "there are many things to know about any particular
word and there are many degrees of knowing" (p. 23); knowledge can range from
realizing one has heard a certain word before to being able to use it in ways that reflect
"deep" awareness of the word's register, collocation and associated concepts. Nation
(2001) divides these various aspects of vocabulary knowledge into two basic categories:
receptive and productive. When learners recognize a word that they read or hear and
understand its meaning, they demonstrate their receptive knowledge. On the other hand,
when learners are able to produce a word in written or spoken manner, they demonstrate
their productive knowledge. In this study, the focus is on receptive vocabulary
knowledge, which involves learners recognizing the form of a word while reading and
simultaneously connecting it with its meaning, or, in other words, making the form-
meaning link (Schmitt, 2010). Given the importance of having a large receptive
vocabulary size, it comes as no surprise that a number of tests have been developed to
measure this basic connection (Schmitt, 2010). The next sections feature discussions on
research that investigated the vocabulary size variable in relation to learners' developing
language proficiency and present the history of word frequency lists that have been used
in developing tests of vocabulary size for learners of English.
Vocabulary size, frequency, and L2 learning
Simply put, vocabulary size refers to the number of words a learner knows
receptively (Milton, 2009). Most researchers measure this in terms of the number of word
families known. While an educated native speaker of English, by the age of 20, knows
approximately 20,000 word families (Nation, 2001), L2 learners need to know between
8,000 and 9,000 word families in order to understand unsimplified written texts in
English adequately (Nation, 2006). Based on Nation's (2001) estimate, a native speaker
learns on average a thousand words per year up to the age of 20, simply through exposure.
If the same rate applied to L2 learners, it would take them eight years or more of exposure
to be able to read authentic materials in English! That is why teachers and learners should
make it a priority to ensure that students learn large numbers of useful words as quickly
as possible.
But what are considered useful words?
With recent advances in computing, the answer to this question has become clear.
Researchers have been able to analyze large corpora of texts in a given language and
identify lists of the words that occur most frequently. This work also identifies the
coverage provided by lists of highly frequent words. The concept of coverage is
illustrated in Table 1.
Table 1
The Percentage of Text Cover Necessary to Read Different Kinds of Texts
Levels Fiction Newspaper Academic text
1st 1000 82.3% 75.6% 73.5%
2nd 1000 5.1% 4.7% 4.6%
Academic 1.7% 3.9% 8.5%
Total 89.1% 84.2% 86.6%
Note. Adapted from Nation (2001).
The figures in the first row indicate that learners who know all of the 1,000 most
frequent English word families will understand over 82% of the words that occur in
fiction texts, almost 76% of the words in newspapers, and about 74% of the words in
academic textbooks. These 1,000 families are clearly important for learners of English to
know. The table also shows that with knowledge of the 2,000 most frequent families and
the families on Coxhead’s (2000) Academic Word List, learners can attain a high level of
known word coverage for the three different kinds of written productions (novels, press,
and academic materials).
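The coverage figures in Table 1 rest on a simple token-matching computation, which can be sketched as follows (the mini-text and the tiny "known list" below are invented for illustration; actual studies match millions of tokens against lists like the GSL, usually after grouping forms into word families):

```python
def coverage(tokens, known_words):
    """Percentage of running words in a text that appear in a known-word list."""
    matched = sum(1 for t in tokens if t in known_words)
    return 100 * matched / len(tokens)

# Hypothetical nine-token text and miniature frequency list.
tokens = "the learners read the new words in the text".split()
known = {"the", "read", "new", "words", "in"}
print(round(coverage(tokens, known), 1))  # 77.8 -- 7 of the 9 tokens are known
```

The 82.3% fiction figure in Table 1 results from exactly this calculation, applied to a large corpus rather than a single sentence.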
Frequency-informed selection of words that are important for learners of English
to know is not a new approach. In the early 1930s, C. K. Ogden and I. A. Richards
created the Basic English list, containing only 850 words. This list was designed with the
purpose of limiting "English vocabulary to the minimum necessary for the clear
statement of ideas" (Schmitt, 2000, p. 15). But due to the very limited number of words,
there was an over-reliance on paraphrase, which sometimes resulted in very unnatural
language use. For instance, the expression put a question replaced the word ask. A more
useful list was compiled by West (1953). His General Service List (GSL) consists of
2,000 frequent words drawn from a 5 million word corpus. In addition to being corpus-
based, this list was also informed by range (frequency across text genres). This list
greatly influenced the work of leading researchers in L2 vocabulary assessment such as
Nation and Meara, and it still is a surprisingly valuable list today, even though a few of
its words are out of date.
Another important contribution was made by Coxhead (2000), who created a list
with the needs of academic learners in mind: the Academic Word List (AWL). By
gathering short and long written texts from four different fields of knowledge (arts,
commerce, law, and science), the author created a collection of academic texts in English,
consisting of 3.5 million tokens, or 70,000 types. She identified 570 word families
that occurred frequently across all four subject areas; in other words, this list was also
informed by range.
Assuming that academic learners know the 2,000 words in the GSL (which are the
most frequent and the "easiest" words of a language, such as read, word, family) and the
technical vocabulary of their fields of expertise (which are considered "difficult" words
by laymen, such as schwa, washback, lemmatization), one can note that there is a
vocabulary gap between these two distinct parts of their lexicon. The AWL with its 570
word families (e.g. assess, percent, select) attempts to fill this gap, bridging the sets of
easy and difficult words (Cobb & Horst, 2004). These words are clearly very useful for
learners who wish to expand their vocabulary in order to read academic texts, and
"become 'successful candidates' in the conventional forms of education" (Nation, 2001, p.
26). How knowledge of frequent vocabulary on lists such as the GSL and AWL impacts
reading comprehension will be discussed in the next section.
L2 vocabulary size and reading comprehension
It is now widely known that knowledge of thousands of words is necessary to use
a language efficiently, and that vocabulary knowledge correlates strongly with language
skills. Work by SLA researchers is increasingly demonstrating that vocabulary size has
an impact on the ability to use an L2 (Schmitt, 2010). Generally, L2 learners with larger
vocabulary sizes perform better on measures of language skills compared to L2 learners
with smaller vocabularies. Looking at reading skills in particular, most researchers agree
that receptive vocabulary size plays a major role in reading comprehension activities. The
idea behind this is that learners who have a larger vocabulary size will also know a higher
percentage of words that occur in a text.
A recent study by Staehr (2008) confirms this relationship. He examined how
pre-university students' vocabulary size related to three skills, namely reading,
listening, and writing, and found that of the three, reading comprehension is the
most dependent on vocabulary size (p. 148). By using the Vocabulary Levels Test (VLT)
to measure 88 Danish L1 learners' receptive vocabulary size, the author discovered that
most learners who knew the first 2,000 word families performed well in the reading test;
in fact, knowledge of the words at this level was a key predictor of a passing score.
Further confirmation of the connection between vocabulary size and reading
comprehension comes from the study by Qian (2002). The author tested 217 adult
learners attending an intensive ESL program (all pre-university, undergraduate and
graduate students from numerous L1 backgrounds), and discovered that scores on the
VLT correlated highly with scores on a reading comprehension measure. In other words,
the larger the learners' vocabulary size was, the better they performed on the test that
presented five short written texts about academic content (e.g. biology, geography, art
history) and 30 multiple-choice questions. However, in another study investigating the
relationship between L2 learners' vocabulary size and their reading comprehension,
Laufer and Ravenhorst-Kalovski (2010) did not obtain consistent confirmation of the idea
that a larger vocabulary size leads invariably to increased reading comprehension. They
tested 735 learners from three L1 backgrounds (Hebrew, Arabic, and Russian) enrolled in
an English for Academic Purposes course. In order to measure the learners' vocabulary
size, the VLT was used; in order to measure reading comprehension, they administered
the English part of a university entry test. This English test assessed learners'
understanding of words, sentences, and textual information. Although most participants
followed the pattern previously confirmed by research, the participants with greater
vocabulary knowledge (those who knew 7,000-8,000 words) obtained lower scores on the
reading test than the ones with smaller vocabulary knowledge (those who knew 6,000
words). In other words, the expected advantage for knowing more words was not found
in all groups under investigation. The researchers concluded that knowing a larger
number of words does not always guarantee better reading comprehension. In spite of that
finding, most studies do confirm a connection between larger vocabulary sizes and higher
comprehension scores (e.g. Anderson & Freebody, 1983; Laufer, 1992; Qian, 1999;
Schmitt, Jiang & Grabe, 2011).
As briefly explained earlier, in order to read and comprehend authentic texts in
English, L2 learners need to know 8,000 to 9,000 frequent word families (Nation, 2006).
This estimate is based on the idea of coverage. By knowing this number of words,
learners will be able to recognize 98% of the words in an unsimplified text, which means
that, for each 100 words, only two will be unfamiliar (and, in principle, there will be
enough contextual support available to allow them to be guessed). Schmitt, Jiang and
Grabe's (2011) study of over 600 university learners of English confirmed the 98% figure
with actual ESL readers. They tested the readers on their knowledge of a large proportion
of the words that occurred in two texts that the participants eventually read. Results
showed that when known-word density amounted to 98%, the mean reading
comprehension score (based on two measures) amounted to 75%. The authors considered
this to be "adequate" comprehension. This work and the study by Nation (2006)
mentioned above converge on the figure of 8,000 to 9,000 frequent words as the
vocabulary size needed to attain 98% known-word coverage. It is worth noting that
readers of both general and academic texts need the same percentage of word coverage,
namely 98%, to accomplish their reading tasks adequately.
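What these coverage percentages mean in practice can be made concrete with a small calculation (the figure of 300 words per page is an assumed value for illustration, not one taken from the studies cited here):

```python
def unknown_words_per_page(coverage_pct, words_per_page=300):
    """Unfamiliar running words a reader meets per page at a given coverage level."""
    return words_per_page * (100 - coverage_pct) / 100

for pct in (90, 95, 98):
    print(pct, unknown_words_per_page(pct))  # 90 -> 30.0, 95 -> 15.0, 98 -> 6.0
```

At 90% coverage a reader faces roughly 30 unknown words on every page, which overwhelms attempts to guess from context; at 98% the figure drops to about six, few enough, in principle, for meanings to be inferred from the surrounding text.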
A study by Cobb and Horst (2004) looked more closely at the vocabulary needed
for reading academic texts. This work shows that the 2,000 most frequent GSL word
families cover around 80% of written productions in English (see also Table 1), and that
the 570 word families in the AWL account for 10% of the academic corpus (Coxhead, 2000, p.
222); these 2,570 words together translate into a text coverage level of almost 90%. But,
according to research by Schmitt, Jiang and Grabe (2011) described above, this is not
nearly enough for adequate reading comprehension (since as stated earlier, 98% coverage
is needed). So if learners only master these two groups of vocabulary (GSL+AWL), they
clearly will not attain the coverage threshold level needed to make successful inferences
about unfamiliar words, and they will have to add at least another 5,430 words to their
lexical repertoire if they want to reach the 8,000 figure Nation (2006) has determined is
the minimum needed to adequately comprehend the materials they are required to read in
their pre-university or university courses. It is important to note that these findings
pertain to English. Now, if the focus is turned to L2 learners of French, this research may
not be relevant, since it is not known whether they need to know a similar number of
French word families to comprehend authentic academic texts, which are required
readings for CEGEP or university courses. If these numbers do not apply to them, then
how large does an academic learner's vocabulary size need to be in order to comprehend
materials written in French? In the next section, some answers to these questions will be
provided.
How many words do learners need to know to understand academic texts in
French?
The study by Cobb and Horst (2004) considered the numbers of words learners of
French and learners of English need to know in order to read academic texts in their
respective L2s. As mentioned earlier, ESL learners need to know almost 2,600 high
frequency word families (the first 2,000 most frequent words plus the AWL) in order to
achieve 90% known word coverage of academic texts. Their investigation showed that
learners of French only need to know the 2,000 most frequent word families in order to
reach the same 90% level of coverage of the vocabulary that occurred in the academic
materials they analyzed. In other words, by knowing a smaller number of basic words in
French, learners will gain 90% of text coverage; they do not need to learn an additional
set of academic words. The authors concluded that "French seems to use its frequent
words even more heavily than the English does," since part of the "academic" words in
French are used in both everyday language and academic discourse (p. 35).
To explore how many words in each frequency level learners need to know to
read academic texts in French, the following mini-experiment was undertaken. An
academic text was selected: the research study called "Dynamiques familiales et activité
sexuelle précoce au Canada" (Assche & Adjiwanou, 2009). This article was one of the
required readings for students in the Psychology Program at Sherbrooke University
during the Winter term in 2013. First, a sample of the document (a 4,127-word
text) was manually stripped of unneeded material, such as images and graphs. Then the
resulting text was fed into Vocabprofil, a French-based version of Vocabprofile (available
at Cobb's Lexical Tutor website). The results of the analysis are shown in Table 2, where
all the words in the sample text are classified into bands of frequency: K1 stands for the
first 1,000 most frequent words in French, K2 for the second 1,000 most frequent words,
and the Lower frequency words include the third 1,000 most frequent words and the
words that did not appear on the frequency lists.
Table 2
Vocabprofil Output for an Academic Text - Psychology Program
                        Families   Types   Tokens   Percent   Cumulative
K1 Words                   375      477     4127     80.91      80.91
K2 Words                   128      152      416      8.16      89.07
Lower frequency words      119      248      558     10.93     100.00
Total                      622      877     5101    100.00
As can be seen in the last column, with knowledge of only the 2,000 most
frequent families, French L2 learners can be expected to understand 89% of the words in
this academic text. This is broadly consistent with the findings reported by Cobb and
Horst (2004), which identified a coverage level of nearly 89% for the 2,000 most frequent
words.
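The 89% figure can be recovered directly from the token counts in Table 2 (a minimal Python check; the variable names are illustrative):

```python
# Token counts from Table 2: K1 = 4127 tokens, K2 = 416 tokens,
# total running words in the sample text = 5101
k1_tokens, k2_tokens, total_tokens = 4127, 416, 5101

coverage_2k = 100 * (k1_tokens + k2_tokens) / total_tokens
print(f"Coverage from the first 2,000 word families: {coverage_2k:.0f}%")
```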
Milton (2009) says that "95% or 98% coverage of a text might be required for full
comprehension in any language" (p. 69). Assuming this holds for French,
how many word families would learners of French need to know in order to reach the
98% coverage level needed for full comprehension of academic texts? It is simply not
possible to answer this question because the software tools do not (yet) exist to analyze
French texts for the occurrence of words beyond the 1K and 2K levels. However, a new
corpus-based frequency list (Lonsdale & Le Bras, 2009) for French includes five 1,000-
word frequency bands and may be able to help answer the question soon.
So far, this chapter has presented some evidence from the research literature
showing that successful reading comprehension is closely associated with performance
on measures of receptive vocabulary size. Also, it has detailed studies that attempt to
show how many word families learners of English need to know to be able to read and
understand academic texts adequately. But in the case of French, the answer is far less
clear. Table 2 shows that the 1K and 2K bands provide substantial coverage, but
researchers do not (yet) know how many additional bands of what is termed "lower
frequency vocabulary" in the table need to be known in order to support comprehension.
It is safe to conclude that a receptive vocabulary size measure for French L2 learners will
be useful in answering the question. With a size test in place, it will be possible to
investigate learners' vocabulary size in relation to their comprehension of texts of various
known-word densities. Thus, for example, a learner's performance on the test might
show that she has receptive knowledge of the 5,000 most frequent French words, and that
this gives her 93% known word coverage of a particular reading passage. A subsequent
reading comprehension test could then determine the extent to which the learner
comprehended the text adequately.
If a study were conducted to measure French L2 learners' vocabulary size, which
test should be used? Which words should be included? How should they be selected from
the vast lexicon of a language? In the next section, some types of existing L2
vocabulary size tests will be examined with a view to understanding how they can inform
the process of constructing a principled test for French L2 learners.
Receptive vocabulary testing
Tests aiming to measure learners' vocabulary size (e.g. Meara & Buxton, 1987;
Meara & Jones, 1988; Nation, 1983; Nation & Beglar, 2007) are often based on samples
of words from word frequency lists with each tested word requiring only a single answer
(Read, 1993). Performance on a sample is then extrapolated to the entire set of words in a
particular frequency band. Thus a learner who demonstrates knowledge of 16 of 20
(80%) words sampled from a frequency band of 1,000 words is assumed to know 800 of
the 1,000. These tests measure receptive vocabulary knowledge, since they only
require learners to recognize the words being tested. In most of them, learners are just asked
to make the form-meaning link (Schmitt, 2010), demonstrating that they know the form
of a lexical item and its corresponding meaning. In several receptive tests, the target
words are presented in isolation (no context), which means that they are measuring
decontextualized vocabulary knowledge.
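The extrapolation logic described above can be sketched as follows (a hypothetical helper for illustration, not code from any published test):

```python
def estimate_band_knowledge(correct: int, sampled: int, band_size: int = 1000) -> int:
    """Extrapolate performance on a sample of test words to an estimate
    of the number of words known in the whole frequency band."""
    return round(band_size * correct / sampled)

# A learner who gets 16 of 20 sampled words right (80%) is credited
# with knowing 800 of the 1,000 words in that band.
print(estimate_band_knowledge(16, 20))  # 800
```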
Different types of tests using several response formats have been developed
over the years. The formats most extensively used in testing L2 receptive
vocabulary knowledge are multiple-choice, checklists, word association, and matching
words with definitions. In terms of evaluation criteria, tests may also vary in relation to
their methods: some ask learners to choose written synonyms of target words, others ask
learners to self-report whether they know words on a checklist that contains some
nonsense words (Wesche & Paribakht, 1996).
Among all the recognition vocabulary tests that were developed from frequency
lists, three will be discussed in this study: the Vocabulary Levels Test, the Yes/No
checklists, and the Vocabulary Size Test. They were chosen for their proven strengths
and for being extensively used by researchers and teachers to assess vocabulary
knowledge of L2 learners. They are discussed in chronological order below.
1) The Vocabulary Levels Test
The Vocabulary Levels Test (VLT, or the Levels Test) was first developed by
Nation (1983) as a diagnostic test intended to help learners with vocabulary learning
(Nation, 1990). The test uses a word-definition matching format and includes only real
words in English; its results give a good indication of where learners need to develop
their vocabulary knowledge, revealing gaps in their lexicon. Later, a number of
researchers began using it as a measure to estimate L2 learners' receptive vocabulary size
in experimental studies (Staehr, 2008).
The VLT has a number of distinct advantages. First, because it contains only
single decontextualized words and short definitions, the test does not require
sophisticated grammatical knowledge or advanced reading skills (Schmitt, Schmitt &
Clapham, 2001). Second, the definitions are written using simple vocabulary, which
helps ensure that the options are comprehensible. Third, the format (see Figure 1) has
been designed to minimize the role of guessing. For each definition on the right, there are
six possible answers. Fourth, this test can be scored objectively by comparing learners'
responses with an established set of acceptable responses (scoring key); no subjective
judgements are required on the part of the marker. Finally, this type of test format allows
a large number of words to be tested in a short time.
As Figure 1 shows, learners simply write numbers to match the three definitions
on the right with three of the words on the left. Figure 1 illustrates a cluster from the
5,000 word level, in two different versions: (a) paper-and-pencil and (b) electronic:
Figure 1. Two sample clusters from the VLT - paper and electronic versions.
In the original paper-and-pencil version, learners write the number of the target
word next to its meaning; in the electronic one, they type the number instead. All the
responses are verifiable, meaning that the test words have only one acceptable response.
The VLT contains five sections: four assessing the 2,000, 3,000, 5,000 and 10,000
frequency levels and one assessing academic vocabulary. Containing 30 clusters (six per
section), it has a total of 180 words, 90 test words (18 per section, three per cluster), and
it takes 20 minutes on average to complete. Besides being fast to score and simple to
produce, it is free of charge and easy to obtain: both printable and electronic
versions are available at Cobb's (2000) Compleat Lextutor website.
Schmitt et al. (2001) created an updated and expanded version of the VLT. In fact,
they created two alternate versions: Version 1 and Version 2. Each consists of 50 clusters
(10 clusters per section) and 150 test words (30 per section, three per cluster), and takes 30
minutes on average to complete. Version 1 is available in Schmitt (2000); Version 2 in
Schmitt et al. (2001).
2) The Yes/No checklists
Meara and Buxton (1987) developed this paper-and-pencil test as an alternative to
multiple-choice vocabulary tests, which are considered to be difficult and time-
consuming to produce (Meara, Lightbown & Halter, 1994). The authors attained their
objective, since the Yes/No checklist is considered to be the simplest type of test format
used in receptive vocabulary testing (Read, 1993). Besides being very easy to construct, it
requires only a few minutes to complete (Meara & Buxton, 1987). This test relies
entirely on self-report by the learners, who are asked to read a list of words in isolation
(real and invented ones), and mark the ones they think they know. The reason for adding
invented words to the test was to avoid learners overestimating their vocabulary
knowledge (Beeckmans et al., 2001). If learners say they know a nonsense word, their
score is reduced. If learners mark all real words as known, they obtain the maximum
score. Figure 2 illustrates a sample of the checklist (prompt and test items). The test
words were selected from frequency-informed lists, and the invented ones followed
phonotactic and morphological rules for word formation in English (Beeckmans et al.,
2001).
Figure 2. Sample questions in the Yes/No Test.
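The penalty for claiming knowledge of nonsense words can be illustrated with a simple hit-rate-minus-false-alarm correction (one possible scheme; published Yes/No tests use several different adjustment formulas, so this sketch is only illustrative):

```python
def yes_no_score(yes_real: int, n_real: int, yes_fake: int, n_fake: int) -> float:
    """Proportion of real words credited as known: the hit rate minus
    the false-alarm rate (a simple, hypothetical correction)."""
    hit_rate = yes_real / n_real
    false_alarm_rate = yes_fake / n_fake
    return max(0.0, hit_rate - false_alarm_rate)

# 'Yes' to 80 of 100 real words and to 5 of 20 nonsense words:
print(round(yes_no_score(80, 100, 5, 20), 2))  # 0.55
# More 'yes' responses to nonsense words than hits floors the score at 0,
# akin to the "unscoreable" pattern discussed below:
print(yes_no_score(10, 100, 15, 20))  # 0.0
```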
It takes approximately a second for learners to decide whether they know the
word or not (Beeckmans et al., 2001). These fast self-reported responses allow hundreds
of words to be tested within the 10-minute period assigned for the testing.
Meara (1992) developed a computer-based version of the Yes/No test called the
EFL Vocabulary Test. The checklist contains 300 target words divided into five levels. The
test's instructions read: 'For each word: if you know what it means, check the box
beside the word; if you aren't sure, do not check the box'. Figure 3 illustrates a sample of
words from level 5 of the computerized Yes/No test.
Figure 3. Sample questions in the computerized Yes/No Test or EFL Vocabulary Test.
After examining the Yes/No vocabulary test, Beeckmans et al. (2001) concluded
that this measure is not reliable, because learners may adopt different strategies when
dealing with invented or real words. In addition, Meara et al. (1994) found that a
small group of learners self-reported knowing more invented words than real ones,
resulting in "unscoreable" verdicts. Such non-results present a clear threat to the
applicability of this kind of test. Similar problems are reported by Cameron (2002); she
compared Meara's (1992) Yes/No test to Nation's (1990) VLT and considered the latter
more useful, due to the unreliable results that the inclusion of invented words produced.
Nevertheless, this methodology has been used by Meara and his colleagues to
produce vocabulary size tests for a variety of European languages including French.
English and French versions are available at Cobb's (2000) Compleat Lextutor website.
3) The Vocabulary Size Test
Created by Nation and Beglar (2007), the Vocabulary Size Test (VST) is a
proficiency test that determines learners' English receptive vocabulary size from the first
1,000-word frequency band to the 14th, using Nation's (2006) updated frequency lists
based on the British National Corpus (BNC). Its ability to assess how many words the
learner knows from the most frequent 14,000 word families makes it a more
comprehensive measure than the earlier VLT, which tests only up to 10,000 word
families. Including 14 sections, each one corresponding to a word level band, the VST
contains 140 clusters (10 clusters per section), which amounts to 140 test words. It also
has the advantage of drawing on one of the largest and most current corpora of English
available. However, 12 of its 14 sections (85% of the test) used lists from the BNC's
spoken section, and even the authors recognized that developing a written test based on a
spoken corpus is somewhat unusual (Nation et al., 2007). Their choice was based on the
assumption that the spoken ordering was closer to the order in which the learners might
learn the words.
Similar to the VLT, the test requires the learners to select one response from the
options given in each question, and the answers given by the learners are verifiable. In
the prompt, the test word is accompanied by a simple non-defining context, which
indicates its part of speech, facilitating the establishment of the form-meaning
connection. However, the distractors are rather closely related in meaning to the test
word; as can be seen in Figure 4, all the meaning options for miniature use the word
small. Since this requires learners to have a more precise idea of a word meaning, the
VST is slightly more demanding than the VLT (Nation & Beglar, 2007).
Unlike the VLT, the VST uses a standard multiple-choice format that
requires the test writer to create four definitions for each test word. In addition to making the
test more time consuming to construct, this presents a number of problems. Due to the
fact that the test contains 560 sentences (140 test words X four definitions), it requires
more reading on the part of the test-takers, who also need to know more complex syntax
to understand the definitions. Besides, there is a larger scope for guessing on the VST
than on the VLT: in the former, there are only four answer options (for each test word),
while in the latter, there are six answer options (for three test words). Figure 4 shows a
sample question from the 5,000 word section of the VST. This test is available at Cobb's
(2000) Lexical Tutor website.
Figure 4. Sample question from the Vocabulary Size Test.
As seen in this overview, existing tests have tended to focus on assessing
learners of English; for French, only the Yes/No checklist test, a format that has proved to be
somewhat problematic, is available. In the next section, some criteria will be
considered in order to determine which is the best model for creating an improved
vocabulary size test in French.
A useful measure for academic learners
When one is designing a vocabulary test, Read (1993) specifies four dimensions
that should be taken into account: (1) simple to more complex test formats; (2) verifiable
versus self-report responses; (3) breadth and depth of knowledge; and (4) isolation versus
contextualization of test items. Based on these dimensions, the three vocabulary tests
described above, namely the VLT, the Yes/No test and the VST can be described as
follows:
1) In relation to the first dimension, all three tests are on the simple end of the
continuum, since they ask test-takers to indicate the correct answer rather than to
perform complex tasks such as 'write an essay on a specific subject';
2) In terms of types of responses, the second dimension, two of the three tests use the
verifiable response format (VLT and VST) and one uses self-report (Yes/No test);
3) Regarding the third dimension, all three tests are measures of vocabulary breadth
rather than vocabulary depth; and
4) As far as the fourth dimension goes, two of them present the test words in isolation
(VLT and Yes/No test), one presents them in context (VST).
Brown (2007) specifies three main qualities (or criteria) for evaluating a test, namely
practicality, reliability and validity. According to Bachman and Palmer (1996), test
practicality involves three types of resources: material resources (e.g. room for test
administration, computers, paper); human resources (e.g. test developers, raters); and
time (e.g. time for test development or scoring). First, in terms of material resources, the
Yes/No checklist test, the VLT and the VST are very practical, because they are all
pencil-and-paper tests, which require only a photocopy machine or a printer to make
paper copies of the test document. Also, they can be administered in traditional
classrooms with no need to use any special equipment whatsoever. Second, in terms of
human resources and from the developer's point of view, clearly the VLT and the VST
are not as simple to construct as the Yes/No checklist test, which, as its name suggests, is
simply a list of decontextualized words. Nevertheless, the cluster format of the VLT does
not require writing as many definitions as the multiple-choice format of the VST, in
which every test item must have a set of four answer options. Nor does it require the
creation of short non-defining context sentences as in the VST. From the test-takers' point
of view, the Yes/No test is very practical. Read (1993) asserts that there is no simpler test
format than this checklist test, where students are only required to read isolated words
and mark whether they know them or not. Multiple-choice formats, such as the VST, and
word-definition matching formats, such as the VLT, require more extensive reading on
the part of the test-taker (Read, 1993). However, the VLT does not require test-takers to
read as much as the VST, since its clusters contain individual words and short defining
phrases. Third, in terms of time, the Yes/No checklist test is the fastest to sit: 10
minutes on average. Of the remaining two, the VLT, with its smaller number of clusters
(50), takes less time to sit (30 minutes on average) than the VST, which includes 140
clusters. The larger the number of clusters, the longer the test takes to complete, and some
test-takers may become tired or bored towards the end of the assessment period, which might
affect the test results. In sum, among the three tests described above, the Yes/No test has
the highest practicality level but the VLT comes in second with distinct advantages over
the more time-consuming VST.
A good test needs to measure consistently, that is, it should be reliable. If test-
takers perform the same test twice, on two different occasions, and obtain similar scores,
this is evidence that the test has high reliability. The Yes/No checklist test's use of
invented words designed to resemble real words as closely as possible in terms of spelling
and morphology has raised reliability concerns (Read, 2000). It is easy to imagine that a
learner might say yes ('I know this word') to a nonsense word that was a near match to a
real word in one test setting and no ('I don't know this word') in another. In the case of the
VLT and VST, different answers to the same question might be less likely to happen,
since test-takers are asked to demonstrate their vocabulary knowledge by choosing real
words only. Besides, the responses are verifiable (a scorer or a computer compares
learners' answers with a scoring key), which greatly reduces measurement error and
gives both tests high marker reliability. In short, there is reason to favour
the VLT and the VST over the Yes/No test, in terms of potential reliability.
In terms of validity, the Yes/No test is clearly problematic in that it simply asks
test-takers to respond "yes" if they recognize a written word form as being an English
word. It is difficult to verify what a yes response means. Does the learner have deep
knowledge of multiple meanings of the word and its many associations? Or does she
simply recognize it as a familiar string of letters? The test-takers are not required to
recognize a connection to any kind of actual definition of the real words. In contrast, the
VLT and the VST do ask test-takers to explicitly show that they have established the
form-meaning links of the test words. In sum, both the VLT and the VST are more likely
to be valid reflections of learners' recognition vocabulary knowledge than the Yes/No
test, but the VST appears to make finer knowledge distinctions. Table 3 contains a
summary of Brown's three criteria and how the three tests described above meet them.
Table 3
Summary of Characteristics of Three Vocabulary Tests - Brown's (2007) qualities
In summary, the VLT offers the best compromise among the main criteria
commonly used in evaluating tests: first, this word-definition matching test is
practical for both test-writers and test-takers; second, it is more likely to be reliable,
since it presents only real words, and its response format elicits verifiable responses,
reducing the scope for measurement error; and third, measuring learners' lexical
knowledge in terms of their ability to make connections using simple definitions is more
likely to be a valid reflection of learners' recognition vocabulary knowledge than
unspecifiable yes responses on the checklist test. Although the VST also presents itself as
a good candidate for developing a new measure in French (since it requires students to
have detailed knowledge of words), it seems so far that the VLT is the ideal model for
creating a vocabulary size test for French L2 learners.
In the next section, some reasons why such a measure has not yet been developed
in French will be outlined. Moreover, the Yes/No checklist test in French, one of the few
size measures that do exist for French, will be examined.
Receptive vocabulary testing in French
Milton (2006) asserts that research on French vocabulary is scant and little is
known about French vocabulary learning in schools. The few studies that were conducted
to measure learners' vocabulary size in French focused on children and adolescents
learning the language in European schools, particularly in England and Wales. Because
these studies adopted similar methodologies for estimating vocabulary size in French,
their findings can be compared.
According to Milton's (2006) results, 69 British university entrants knew 1,930
words in French, on average, which leads to the conclusion that, with this small amount
of vocabulary, these learners were not ready to pursue their studies in French at the
university level. In David's (2008) study, findings were similar. The learners of French in
the UK were 32 pre-university students attending Year 13 in Newcastle schools. The
author found out that they knew 2,108 words in French on average.
In both studies, the French vocabulary test used to measure learners' vocabulary
size was the French X-Lex test, an adaptation of the Yes/No test developed by Meara and
Milton (2003). Figure 5 illustrates this test format. The French X-Lex test is based on
Baudot's (1992) frequency list (see details in the next section), which is a lemmatized
vocabulary list of modern French (Milton, 2009). It only tests knowledge of the first five
1,000 word frequency bands, and presents the test-takers with 120 words: 100 real words
drawn from Baudot's (1992) list, as stated earlier, and 20 invented ones built
following phonological and morphological rules to resemble real words. The French X-
Lex takes 5 to 10 minutes to complete, and was found to be a reliable and valid
measure of receptive vocabulary size according to David (2008).
Figure 5. Sample of the French X-Lex Vocabulary Test - paper version.
In order to estimate learners' receptive vocabulary in French, Richards, Malvern,
and Graham (2008) also used the X-Lex test. However, they chose to use the
computerized version of this test (Meara et al., 2003). Figure 6 shows the opening screen
of this receptive vocabulary test. The 23 test-takers were students attending the first year
of their non-compulsory education (Year 12), which corresponds roughly to the first year
of CEGEP in Quebec. These test-takers were shown 120 real and invented words, one by
one, and needed to click on a button to indicate if they knew the word or not. They took
between four and seven minutes to complete it. The authors found that these test-
takers knew, on average, 2,437 words.
It is important to point out that this higher average of known words, compared to
the 1,930 and 2,108 known words from Milton's (2006) and David's (2008) studies,
respectively, can be explained by the fact that Richards et al. (2008) did not deduct points
from the yes responses to invented words. Therefore, their results are almost certainly
inflated (David, 2008). Nevertheless, Richards et al. (2008) showed that acquisition of
vocabulary in French correlates with word frequency; that is, the most frequent words
were the best known among the participants. This was evident in the gradual
decline across the five frequency bands shown in the results: the test-takers knew more
words from the 1K level than from the 2K level; they also knew more words from the 2K
level than from the 3K level, and so on.
Figure 6. Opening screen of the French X-Lex Vocabulary Test - computerized version.
Milton (2010) sees the development of comparable X-Lex checklist tests in a
variety of languages as an important contribution. Test results can be tied to the levels of
the Common European Framework of Reference for Languages (CEFR) (p. 228). The
CEFR is a document of reference containing proposals for language instruction and
common standards to be achieved by L2 learners across Europe. In a number of studies
carried out in the UK, Greece and Spain, mainly conceived to estimate how many words
English, Greek and Spanish learners of French knew, the X-Lex tests were used to
measure their lexical size. Milton (2009) reports that these learners, who were at the
CEFR B2 level (which corresponds to 560 - 650 hours of instruction), knew 1,920, 2,450,
and 2,630 words, respectively (p. 190). The number of words known by English learners
of French, for example, is very similar to the figures for the British students in Milton's
(2006) and David's (2008) studies: 1,930 and 2,108, respectively. Milton concludes that, taken
together, these studies show British students to be rather unimpressive learners of French.
Although the X-Lex test has proven to be suitable for use in comparison studies
across different European languages, use of a single measure to estimate learners'
receptive vocabulary can be seen as a limitation. David (2008) proposes that these results
need to be confirmed by other studies using different measures. This constitutes a further
argument for the design and testing of a new measure, such as the Test de la taille du
vocabulaire (TTV). This assessment tool is intended to function as the French counterpart
of Nation's (1990) VLT, which was identified as a useful model (see discussion in the
previous chapter). It is hoped that the TTV will prove helpful to researchers and teachers
and will contribute a much needed new instrument for use in L2 vocabulary research.
As mentioned, tests that measure learners' vocabulary size are often based on a
sample of words from word frequency lists (Read, 1993). The X-Lex in French, for
instance, drew on Baudot's (1992) frequency word list. How about the TTV? Should it
use the same list? Are there more recent and up-to-date lists for French, from which the
test words in TTV could be selected? In the next section, some of the most important
French frequency lists will be examined.
Developments in French frequency lists
Clearly, the success of a test that samples words of differing frequencies depends
on a reliable and current frequency list for French. Several frequency lists have been
created in this language. Six of those most referred to in the research literature will be
described in order to determine which one can best serve as the source from which the
words for the vocabulary size test in French, the TTV, will be selected. The treatment
here is chronological, and each list will be described based on a number of key criteria
considered important for ensuring recent French language usage. The criteria are as
follows:
o The list should be generated from an electronic corpus that has been analyzed
using frequency software.
o The corpus that the list is based on should be substantial in size (at least one
million running words). Corpora smaller than this size cannot be seen as a reliable
source of generalizations (O’Keeffe, McCarthy & Carter, 2007).
o The corpus should reflect current French usage. This may seem obvious but many
corpora contain large amounts of archaic material.
o The corpus should be representative of the language, reflecting general and
academic content from both written and oral sources.
o The lists derived from the corpus should reflect range as well as frequency; that is,
a word is considered frequent only if it occurs across a variety of sub-corpora.
o Ideally, the corpus would be relevant to the Quebec setting; that is, it should
include materials that reflect French as it is used in Quebec.
1) Le Français fondamental (1er degré)
Containing 1,063 words, this old, precomputational and unlemmatized frequency
list was published in 1954 by the French Ministère de l'éducation nationale
(Gougenheim, Rivenc, Sauvageot & Michéa, 1964). This list of supposedly indispensable
words in French emerged from a corpus that included only spoken productions: audio-
taped and transcribed conversations about family, health, and vacation, for the most part,
with 301 participants, mostly Parisians from all walks of life (engineers, doctors,
musicians, nurses, sales persons, farmers, school children, etc.). For a long period of
time, almost all the pedagogical materials intended to teach French as a second language
used Français fondamental as their vocabulary source (Tréville, 2000). This list is clearly
a poor candidate for use in building the proposed test. First, it is based on a corpus of only 312,135 words. Second, the corpus, which was built manually, may contain errors due to the poor sound quality of the audio recordings (the authors used "light" (6 kg) and inexpensive recording equipment). Third, it is obviously out of date; for example, the words télévision ("television") and ordinateur ("computer") do not occur. Finally, it also
contains a number of chunks, such as c'est-à-dire ("that is to say") and tout d'un coup
("suddenly"), which are not the type of words that will be tested in the TTV.
2) The Frequency Dictionary of French Words
Published more than four decades ago, the Frequency Dictionary of French
Words was one of the first lists in French generated by computational analysis. This
dictionary was conceived primarily as a reliable body of evidence for the language, one presenting appropriate structural frameworks, formal procedures, and a historical study of French (Juilland, Brodin & Davidovitch, 1970).
Parallel to this scholarly purpose, there was a pedagogical one, considered secondary by
the authors: to create short lists of the most frequent vocabulary for beginner students and
longer ones intended for more advanced students. These lists would be a supplement to
advanced reading materials (e.g. academic texts).
To make the list, Juilland et al. (1970) gathered a collection of texts containing
500,000 lexical items from a variety of sources: 90 plays (to compensate for the relative
shortness of spoken sources), 50 novels and short stories, 50 essays, 50 newspapers and
magazines, and 50 scientific and technical texts. Two word selection criteria were applied: frequency (words occurring fewer than four times were discarded) and range (words occurring in fewer than three types of sources were discarded). All
these spoken and written productions were part of a larger sample containing 3,078
books, documents, and publications. In order to select the 290 texts that constitute the
corpus, which produced a list of 5,082 words, the authors used a computer programmed
to generate random numbers. One of the findings reported by the authors is worth
mentioning: investigation of the list's coverage indicated that students would be able to
understand 90% of the words in a representative text just by knowing the first 1,000 most frequent words. Again, this list is not suitable for the TTV: its corpus of half a
million words is not large enough to be considered reliable and it does not contain
materials produced in Quebec French.
3) Le Dictionnaire de fréquence des mots du français parlé au Québec
Beauchemin, Martel and Théoret (1992) built an imposing frequency list
containing 11,327 words based on a corpus of one million tokens. It was used to create
the Dictionnaire de fréquence des mots du français parlé au Québec and clearly meets
the criterion for Quebec content, since different varieties of French spoken in Montreal,
Estrie, Québec and Saguenay-Lac-St-Jean were included; however, it is not a good choice
for building the TTV due to its restricted type of source: it contains only spoken
productions, such as transcribed interviews (non-scripted texts), and folktales, theater,
television soaps (scripted texts). Moreover, the list includes Quebec forms that differ
from standard French (e.g. char, "car"), which bring its usefulness for the academic
learner into question.
4) Les Fréquences d'utilisation des mots en français écrit contemporain
Aiming to investigate contemporary French vocabulary, Baudot (1992) built a dictionary of 21,684 types, the most frequent words in a corpus of 1,040,150 running words. Although this computer-generated dictionary was published in the 1990s, its corpus was created in 1967 by the Bureau des langues du gouvernement
canadien. To make this frequency list, Baudot assembled a corpus containing 803 written
text samples (from 1,000 to 1,500 words long), distributed across 15 genres (e.g.
magazines, books, newspapers, reports, brochures). The corpus consists of written works
from the French and Canadian press (62% and 37% respectively), and from other
countries (1%).
At first glance, this corpus, which includes texts written between 1906 and 1967, might seem out of date. In fact, it is fairly recent: 90% of the written productions originated in the 1960s. For research purposes, it is still
considered a list of modern French (Milton, 2009, p. 23).
This list meets almost every criterion specified above: it is generated from an
electronic corpus, it is substantial in size, its corpus reflects general and academic content
as well as Quebec French usage. However, its texts are restricted to written sources and
contain a portion (10%) of old material. Baudot's (1992) list may be a good candidate for
our test. However, a more recent list would be preferable. The search is not over.
5) Word list for a French learners' dictionary
Verlinde and Selva created this substantial frequency list in 2001. It emerged from
a 50 million word collection of written texts: 1998 issues of a French and a Belgian
newspaper (Le Monde and Le Soir, respectively). The frequency-informed list includes only words that occurred at least 100 times in both newspapers, which resulted in 12,000 lemmas. These lemmas include single words as well as multiword units (chunks and collocations), all of which represent the most common vocabulary of the current
language. Even borrowed words, such as cool, fast-food, marketing, web, were
incorporated into the list, which was developed with the sole purpose of building a
dictionary for a specific kind of L2 French learner: one interested in engaging in
everyday conversation and reading the press (Verlinde & Selva, 2001). Even though its
sampling is restricted to one type of written source (two national newspapers), this list
has the advantage of using recent material, reflecting a more up-to-date vocabulary in
French.
After acknowledging some limitations in their list, Verlinde and Selva (2001)
concluded their article by calling for a list based on a larger and more varied collection of texts to improve the quality of resources for vocabulary learning. Perhaps their wish came true in the form of the "Frequency Dictionary of French," discussed in the next section.
6) A Frequency Dictionary of French
Conceived by Lonsdale and Le Bras in 2009, this dictionary includes a ranked list
of the top 5,000 lemmas in French, derived from a large-scale corpus: the corpus consists
of 23 million words contained in a comprehensive range of text types and geographical
varieties. Created from transcriptions of spoken and written French sources, it comprises
texts used in different French-speaking countries, such as France, Switzerland, and
Canada. Examples of spoken texts are governmental debates and hearings, telephone
conversations, and film subtitles; of written productions are newspaper and magazine
articles, technical manuals, and literary books. It is interesting to note that the corpus is evenly split: half (around 11.5 million words) consists of spoken texts and the other half of written texts. Although acknowledging the importance of collocations and idioms in
L2 learning, the authors decided to build their list with single words only (Lonsdale & Le
Bras, 2009).
One of the great advantages of this list is the fact that all texts are recent in origin:
only materials that were produced after 1950 were included, reflecting a more modern
representation of the French language.
Among the six word lists described above, the Lonsdale and Le Bras (2009) list
(the last one) seems to be the most appropriate to be used in the development of the TTV.
It was decided to draw the words for the 2,000, 3,000, and 5,000 word sections
(henceforth, 2K, 3K, 5K word sections, respectively) from this list. However, a problem
remained: how to create the 10,000 word section (henceforth, 10K word section) when the Lonsdale and Le Bras frequency list goes only as far as 5,000 words? The solution
was to use the second best one: Baudot's (1992) list (the fourth discussed above), which
does list words all the way up to the 22,000 most frequent. Only the words included in its
9,000 to 10,000 range were used to create the 10K word section of the TTV.
Research Questions and Hypotheses
The current study has two main purposes: The first is to develop the Test de la
taille du vocabulaire (TTV), a French version of Nation's (1983) VLT. This receptive vocabulary test is expected to provide useful estimates of the number of words (lemmas) that potential CEGEP and university students know. The test includes four frequency
sections, namely, 2K, 3K, 5K and 10K. The second goal is to assess its effectiveness by
administering the test to groups of L2 French learners at different proficiency levels, and
interpreting the results (quantitatively and qualitatively). The research questions
developed for this study are as follows:
1. How do learners perform on the TTV? Is there a proficiency effect such that
students in higher groups have higher scores than students in lower groups?
2. Is the TTV implicational such that students' scores are higher on the 2K test,
lower on the 3K test, lower still on the 5K and so on?
3. Do learners really know the meanings of words that they have indicated as known
on the TTV?
The hypotheses developed for each question are as follows:
1. Research question 1: Learners in higher groups will have higher scores than
learners in lower groups.
2. Research question 2: The test will produce higher scores on the 2K section, lower
on the 3K section, lower still on the 5K, and even lower on the 10K section.
3. Research question 3: Another measure will show that students know most of the
words that they have indicated as known on the test.
The purpose of the first question, which investigates whether more proficient learners obtain higher scores than less proficient ones, is to determine the extent to which learners' lexical proficiency, as determined by their placement in a language program, relates
to their overall vocabulary size as measured by the TTV. This will be explored with
groups of L2 learners attending a French course called Francisation. This course was
designed by the Ministère de l'Immigration et des Communautés Culturelles specifically
for newly arrived immigrants to aid in their integration into Quebec society.
The second question tackles whether the test provides a scalable profile of
vocabulary frequency levels. In particular, it will show how large learners' vocabulary
size is at each of the four frequency levels (2K, 3K, 5K, and 10K). It is reasonable to say
that the words that recur frequently in the language are most likely to be known (Nation,
2001). Therefore, it is expected that the mean score for the entire group on the 2K word
section will be higher than the mean on the 3K word section, the mean on the 3K word
section will be higher than the mean on the 5K word section and so forth.
As for the third question, which involves gathering qualitative data, it explores the
extent to which the test results correspond to the results of an audio-recorded interview.
After completing the pilot version of the TTV, learners were interviewed on their
knowledge of 48 target words in order to investigate whether they actually knew the
words they answered correctly on the test. The goal of this part of the study is to confirm
whether they were guessing (or not) while performing the test.
Chapter 3. Methodology
In this chapter, the methodology that was adopted to carry out this research
project is discussed. The chapter is divided into four parts: (1) the description of the
materials, including a detailed explanation of how the TTV was designed, as well as the
kinds of answers elicited by the sociobiographical questionnaire and the interviews with
participants; (2) the profiles of two samples of participants: those who took part in the
pilot testing phase, and those in the final testing phase; (3) the description of the
procedures for administering the pilot test to L2 French learners of different proficiency
levels, how their feedback helped to shape the final version of the TTV, and how the
results were analyzed; and (4) the key results obtained from the pilot testing. Results
based on the improved version that resulted from this process are reported in the next
chapter.
Materials
Designing the pilot test
Format
The test followed the model used by Nation (1983, 1990), a matching item test,
and drew on guidelines by Schmitt, Schmitt and Clapham, who created an updated and larger version in 2001. Unlike the original VLT, the TTV includes only four sections (2K, 3K, 5K, and 10K word levels), instead of five. The VLT includes these
levels plus an academic section, but since no academic word list has been identified for
French, this level was not included. Another difference concerns the number of clusters:
although the original 1983 version had six clusters for each of its five sections (a total of
30 clusters), the TTV follows the format by Schmitt et al. (2001), which has 10 clusters
per section (a total of 40 clusters). They found that the increase was needed in order for
the test to be reliable.
In order to increase the chances of ending up with 10 good clusters per section on
the final version of the TTV, the pilot version of the test included 12 clusters per
section (a total of 48 clusters). Asking test-takers to answer a larger number of clusters
made it possible to eliminate the two poorest functioning ones in each section and
produce a final version containing the 40 best clusters (10 per section).
As in the original VLT, each cluster consists of six words and three
definitions. In sum, the pilot version of the TTV consists of four sections, 48 clusters (12
per section), and 144 test words (36 per section).
Selecting the words and writing the definitions
The lexical content of three sections of the test, namely, the 2K, 3K, 5K levels,
was extracted from the relevant sections of the Lonsdale and Le Bras (2009) 5,000 most
frequent word list, while the lexical content of the 10K level was extracted from Baudot's
(1992) 10th 1,000 most frequent word list. A random number generator was used to help
in the selection of words at each frequency level. Four lists of words were produced, one
per level. From each randomized word list, six groups of six nouns, three groups of six
verbs and three groups of six adjectives were drawn in order to build 12 clusters per
level. Interestingly, in the randomized word list used to build Nation's (1983) VLT, the
distribution of word classes in English was a ratio of 3 (nouns): 2 (verbs): 1 (adjectives),
according to Schmitt et al. (2001). In the case of the randomized word lists used to build the TTV, the proportion of nouns, verbs and adjectives differs from that of English, with verbs being less frequent. According to the counts obtained from the randomized
word lists in French, the proportion of nouns, verbs and adjectives on the TTV should be
a 2 : 1 : 1 ratio. In order to respect this distribution as closely as possible among the 48
clusters, the cluster distribution in each section is as follows: 6 noun clusters, 3 verb
clusters, and 3 adjective clusters (2K word section), 5 noun clusters, 4 verb clusters, and
3 adjective clusters (3K word section), 6 noun clusters, 2 verb clusters, and 4 adjective
clusters (5K word section), and 6 noun clusters, 2 verb clusters, and 4 adjective clusters
(10K word section). As a result, the TTV includes 23 noun clusters, 11 verb clusters and
14 adjective clusters (see Figure 7), which taken over the whole test approximates the 2 :
1 : 1 ratio.
Figure 7. Distribution of clusters in each section on the TTV - pilot version.
Note. N stands for noun (clusters), V for verb, and A for adjectives.
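The cluster arithmetic described above can be verified with a short script. This is an illustrative sketch only; the data structure and names are mine, not part of the thesis:

```python
# Cluster counts per section (N = noun, V = verb, A = adjective clusters),
# as reported for the pilot TTV; the dict layout is illustrative.
sections = {
    "2K":  {"N": 6, "V": 3, "A": 3},
    "3K":  {"N": 5, "V": 4, "A": 3},
    "5K":  {"N": 6, "V": 2, "A": 4},
    "10K": {"N": 6, "V": 2, "A": 4},
}

# Whole-test totals: 23 noun, 11 verb, and 14 adjective clusters,
# which approximates the intended 2 : 1 : 1 ratio.
totals = {pos: sum(s[pos] for s in sections.values()) for pos in ("N", "V", "A")}
print(totals)  # {'N': 23, 'V': 11, 'A': 14}

# Every section contributes 12 clusters, 48 in all for the pilot version.
assert all(sum(s.values()) == 12 for s in sections.values())
assert sum(totals.values()) == 48
```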
Once the six words in each cluster were chosen, they were presented in alphabetical order, as in the original VLT. Next, the definitions of all six words in every cluster were written with the help of a print dictionary, the
Dictionnaire de Synonymes et Contraires (Du Chazaud, 1998), and two on-line
dictionaries: Le Grand dictionnaire terminologique (Gouvernement du Québec, 2012)
and Portail Lexical in the Centre national de ressources textuelles et lexicales (CNRTL,
2012). Instead of simply copying the "best" definition of a word chosen among these
three sources (which was done in the case of single word definitions), the researcher
compared the three and came up with a new definition, which contained as much simple
vocabulary as possible (see definition-writing guidelines below).
As stated earlier, the TTV was conceived to assess a single meaning sense of each word. However, the vast majority of the words used in the test are polysemous. The solution was to choose the sense of each test word according to the usage context given in the Lonsdale and Le Bras (2009) 5,000 word list, which is said to reflect the most frequent meaning in their corpus (p. 6). This procedure worked well for three sections of the TTV, namely the 2K, 3K and
5K. For the 10K section, which drew words from Baudot's (1992) frequency word list,
another solution was needed, since this list did not include definitions or usage contexts:
the first meaning sense that was listed in the entry on the on-line dictionary Portail
Lexical (CNRTL, 2012) was chosen as the sense to be measured in the 10K word section.
Once the definitions of the six words in each cluster were written, three of them were selected at random (using the random number generator again), since only three test words per cluster were needed. This exercise showed that a number of the
selected definitions did not comply with some definition-writing guidelines, since they
contained low frequency words or complex grammar. In the TTV, a definition was
considered to be appropriate if it was short (to reduce reading to a minimum),
syntactically simple and used words from a higher frequency level than the test words.
Test words from the 2K word section are accompanied by definitions using words from
the first 1,000 words in the Lonsdale and Le Bras (2009) list; test words at the 3K, 5K,
and 10K word sections were defined using the first and second 1,000 most frequent
words from the same list. However, it should be pointed out that a small portion of the test words in the 10K word section (a total of 14) was defined using words from the third 1,000 word level. Creating short definitions for the tenth 1,000 most common words in French with only the 2,000 most frequent words on the Lonsdale and Le Bras (2009) list at one's disposal turned out to be very difficult. Therefore, some definitions were rewritten using words from the third 1,000 word level, but none ranked beyond the 3,500th position in the Lonsdale and Le Bras (2009) list. All the initial and
rewritten definitions were subsequently checked by a native French speaker with a
professional background in Education.
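The definition-writing constraint described above (every defining word must come from a higher frequency band than the test words) can be checked mechanically against a ranked list. The sketch below is illustrative only; the function name and the toy ranked list are invented and do not come from the thesis:

```python
# Illustrative check that every word in a definition falls within an allowed
# frequency band. `ranked_words` stands in for a frequency-ordered list such
# as Lonsdale and Le Bras (2009); real list data are not included here.
def definition_ok(definition, ranked_words, max_rank):
    """Return True if every word in the definition has rank <= max_rank."""
    rank = {w: i + 1 for i, w in enumerate(ranked_words)}
    words = definition.lower().replace("'", " ").split()
    # Words absent from the list get an infinite rank, so they fail the check.
    return all(rank.get(w, float("inf")) <= max_rank for w in words)

# Toy ranked list (assumed ordering, for illustration only)
ranked = ["une", "histoire", "qui", "fait", "rire", "moisi"]
assert definition_ok("une histoire qui fait rire", ranked, max_rank=5)
assert not definition_ok("une histoire moisi", ranked, max_rank=5)
```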
The three definitions in each cluster followed the "length rule" suggested by
Schmitt et al. (2001) for ordering answer options: the shortest one precedes a longer one
as seen in Figure 8. A sample version of the entire TTV (the 5K word section) that
implements these design principles and was piloted with French L2 learners can be seen
in Appendix A. The answer key can be seen in Appendix B.
1. brouillard
2. coïncidence
3. farce _____ une histoire qui fait rire
4. instituteur _____ ce qui empêche de voir loin
5. pneu _____ un professionnel de l'éducation
6. soumission
Figure 8. A noun cluster from the Test de la taille du vocabulaire.
The key steps to build the TTV can be summarized as follows:
Building the TTV
1. Identify a frequency word list
2. Select words (nouns, adjectives and verbs) randomly from the list for each section of the test
3. List the words (from the same word class) in groups of six (six words = one cluster)
4. Create 10 clusters in each section (following the 2:1:1 ratio of nouns, verbs and adjectives)
5. Put each group of six option words in alphabetical order
6. Write short definitions for each option word, using high frequency words only
7. Among the six definitions, choose three of them randomly
8. Put the three definitions in length order (from shortest to longest)
9. Display the option words in one column (to the left) and the definitions in another (to the right)
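Steps 5, 7, and 8 above are mechanical and can be sketched in code. This is an illustrative sketch only: the placeholder definitions stand in for the hand-written ones described in step 6, and the function name and seed are mine:

```python
import random

def build_cluster(six_words, rng):
    """Assemble one TTV-style cluster from six words of the same class:
    alphabetize the options (step 5), pick three at random as test words
    (step 7), and order their definitions shortest-first (step 8)."""
    options = sorted(six_words)
    targets = rng.sample(options, 3)
    # Step 6 is a manual writing task; placeholder definitions stand in here.
    defs = {w: "definition de " + w for w in targets}
    ordered = sorted(targets, key=lambda w: len(defs[w]))
    return options, [(defs[w], w) for w in ordered]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
six = ["pneu", "farce", "brouillard", "instituteur", "soumission", "coïncidence"]
options, keyed_definitions = build_cluster(six, rng)
print(options)            # the six option words, alphabetized
print(keyed_definitions)  # three (definition, answer) pairs, shortest first
```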
Questionnaire
Before taking the TTV, participants completed a short sociobiographical
questionnaire in French (Appendix C), which was used to collect background information
(e.g. name, gender, country of origin, L1, language spoken at home, etc.), as well as their
attitudes towards learning French as an L2. The first part of the questionnaire
(background information) included closed-item and open-item questions. The second part
(learning preferences) was mainly developed to elicit the participants' perceptions of the
importance of learning grammar and vocabulary. It included two multiple-choice items,
each containing a statement and a five-point Likert scale ranging from "very useful" to
"not useful."
Interview
In order to investigate whether test-takers actually knew the words they answered correctly on a test where guessing was possible (research question 3), audio-recorded
interviews were carried out with a subset of the participants. The individual interviews,
which were 30 minutes long, considered three aspects: (1) participants' perceptions of the
TTV; (2) how they proceeded to answer the clusters, in particular, an easy cluster and a
difficult one (see Appendix D), and (3) the meanings of one third of the test words contained in the TTV they had completed the day before. A list of the 48 test words can be seen in
Appendix E.
Participants
Pilot testing phase
The pilot test was administered to 63 adult immigrant learners from a variety of
first language backgrounds in the spring of 2013. All were enrolled in the Francisation
program, at a CEGEP in Montreal. There were two proficiency levels: intermediate and
advanced. The intermediate level students were attending, at the time of the study, the
intensive full-time French course (30 hours per week), and had already attended 700
hours of instruction on average, while the advanced ones, the part-time French writing
course (12 hours per week), had already attended 1,200 hours of instruction on average.
From the sample of participants, 12 took part in the post-test interviews: five from the
intermediate group and seven from the advanced group. It is interesting to note that in
this program, vocabulary is taught using theme-based lists (countries, numbers, family, professions, etc.), which are not frequency-informed.
Final testing phase
The improved version of the test was administered to 175 learners (115 female and 60 male) in the fall of 2013 at the same institution. They were also enrolled in the
same French course as the participants who took the pilot test version. In contrast to the
two levels in the pilot testing, they belonged to four different proficiency levels, ranging
from 1 to 4 (beginner, low-intermediate, upper-intermediate, and advanced). The
students' levels were determined by the institutional placement test: the higher the level,
the higher the number of hours the participants attended this French course (see Table 4).
This sample population was linguistically and culturally diverse, which is an important
requirement to satisfy when investigating the behaviour of this type of vocabulary test
(Schmitt et al., 2001).
Table 4
Description of Groups of Participants

Proficiency   Hours of        n     First language
level         instruction           Romance    Non-Romance
1             330             31    19%        81%
2             660             55    31%        69%
3             990             64    41%        59%
4             1320            25    72%        28%

Note. N = 175.
These learners of French for general and academic purposes come from 39
countries, mainly China, Colombia, Iran, and Moldova. They represented 21 L1
backgrounds: Spanish (38), Persian (26), Romanian (26), Mandarin (23), Russian (20),
Arabic (15), Tagalog (9), Portuguese (3), Ukrainian (2), Vietnamese (2), Amharic (1),
Bangla (1), Berber (1), Bulgarian (1), English (1), Hungarian (1), Korean (1), Kyrgyz (1),
Nepali (1), Tamil (1) and Teochew (1).
As can be seen in Table 4, the proportions of Romance language speakers vary
following a particular pattern: the higher the level, the higher the proportion of
participants whose L1 is a Romance language (Portuguese, Romanian, and Spanish).
Although most of the participants (62%) reported residing in Canada for one year
or less, the average length of stay was one year and nine months (ranging from three
months to 13 years). Also, slightly under half of them (46%) said they had studied French
in their countries of origin before coming to Canada, and the same percentage of
participants indicated that they do not speak English.
Interestingly, the majority of participants use French outside the classroom (64%) and think that both grammar and vocabulary are important or very important for learning French (71%).
Procedures and Analyses
Preliminary version of the TTV
Prior to the administration of the pilot test, five native speakers of French (three
graduates from Quebec and two graduates from France) took a preliminary version in
order to ensure that all the test items were acceptable to educated native speakers.
They were given as much time as needed to complete the test; average completion
time was 15 minutes. Their scores ranged from 141 to 144 with a mean of 142.2 (the
maximum score was 144, one point being attributed for each test item). Since these
proficient French speakers reached maximum or near-maximum scores, it is safe to say
that the TTV format was not a problem for them. However, some test words posed some
comprehension problems for these L1 speakers. In the cases where two of them answered
the same test word incorrectly, the definition of that word was rewritten in order to
remove the ambiguity. For instance, in the 5K section, the test word inverser ("to
reverse") should match the definition changer de place ("to change the order"). Two test-takers, nevertheless, chose the distractor desservir ("to clear the table"), which prompted the test-writer to modify the definition to changer le sens ("to change the direction"). This
definition was chosen because none of its senses can be associated with the other
distractors, which seemed likely to reduce the confusion. Furthermore, if three or more of
these test-takers indicated that they had considered a definition confusing (even though
they had answered it correctly), it was also rewritten. For instance, three native speakers
disliked the definition unité de mesure ("unit of measurement") attributed to the test word
baril ("barrel") in the 10K section. According to them, this definition is not the first one
that comes to mind when they see the word baril. In order to come up with an answer,
they said they had guessed, choosing their last option after eliminating the others. In view
of these comments, a new definition was written: utilisé pour transporter des liquides
("used to transport liquids"), which was created in order to avoid test-takers resorting to
guessing.
Pilot version of the TTV
This part of the data collection period took 60 minutes. Right before the test
administration, the 63 participants spent 10 minutes completing a sociobiographical
questionnaire. While collecting the completed questionnaires, the researcher announced
that dictionaries and cellular phones were not permitted and reinforced the idea that the
test was individual.
Most of the students completed the entire test of 48 questions in 30 to 35 minutes.
This attests to the TTV's practicality.
Post-test Validation Interviews
Interviews with the participants were used in this pilot study to investigate the
validity of the TTV. In order to determine whether an item within a cluster is valid or not,
it is necessary to determine whether the meaning of this target word on the test is indeed
known to a test-taker who has identified a correct definition (Schmitt et al., 2001).
The individual interviews took place on the day after the written test. All
interviews were audio-recorded using a digital Edirol Recorder (model: R-09HR) so they
could be reviewed and examined for insights into the test taking.
Five participants from the intermediate level group and seven from the advanced
group volunteered to take part. In this three-part interview, which followed a procedure similar to that used in Schmitt et al.'s (2001) study, the researcher started by posing a
question conceived to probe the participants' perception of the vocabulary test they had
performed the day before: "Is the TTV a good test of vocabulary?" More often than not,
participants answered this question briefly, not spending too much time on it. This initial
conversation worked as a warm-up, putting the participants at ease, and prompting them
to answer the next questions in a more elaborate way.
In the second part of the interview, which was designed to find out how the
participants proceeded to answer two types of clusters (Appendix D), the researcher
always started by showing an easy cluster to the participant, who read the question and
spent some seconds trying to answer it silently. Then the researcher asked: "Can you
describe what you did while you were answering this question?" and the participant answered it, thinking aloud. After that, the researcher showed a difficult cluster to the
participant and followed the same procedure used with the easy cluster.
In the last part of the interview, the participant's knowledge of the test words was
explored. Each was given a sheet of paper containing a list of 48 test words (12 words
from each section, see Appendix E), which corresponds to one third of the 144 test words
in the TTV. Then the researcher pointed to the first word on the list and asked: "Can you
tell me what this word means?" If the participant was not able to answer the question
orally (that is, he or she could not come up with an acceptable synonym or definition), the
participant was given a card containing the test word and five answer options (the correct
definition on the TTV and four distractors). Figure 9 illustrates the card with the word
"division" on it, along with its key (option e) and its distractors.
5 division a. le résultat attendu
b. une attitude positive
c. un document officiel
d. un jour de la semaine
e. séparation en deux parties
Figure 9. Card used to confirm learners' knowledge of the word "division."
Note that responding to cards such as the one shown in Figure 9 involves
answering multiple-choice questions. This format is useful for the validation process in
that it presents a different kind of opportunity to demonstrate knowledge than is available
on the TTV. That is, test-takers choose from five meaning options for the target word,
whereas on the TTV, test-takers must choose one word from a set of six to match to a
meaning.
When participants answered this multiple-choice question, the researcher wrote
down in her worksheet (Appendix F) whether they had answered it correctly or not. The
same procedure was repeated with each of the 47 remaining words on the list. By
comparing participants' responses on the TTV and in the interviews, it is possible
to determine whether they really knew the words. A participant who answers an item
correctly on the TTV and demonstrates knowledge of the same item during the interview
confirms knowledge of the target word meaning. If the correspondence between the
written test and the interview is high, the TTV can be interpreted as a good
predictor of learners' vocabulary knowledge. This last part of the interview took most of
the time allotted. Interestingly, the participants did not seem intimidated by the audio-
recorder. For the most part, they were eager to talk and seemed disappointed when the
end of the interview was announced.
The final version of the TTV
In order to investigate the validity of the TTV (pilot version), two types of
statistics were used: facility and discrimination indices. These indices were calculated for
each item in order to explore each cluster's behaviour (recall that there were 144 items or
individual word-to-definition matches within the 48 three-part clusters). But in deciding
which clusters to include, the performance of each cluster was considered as a whole.
By calculating the facility index (FI), the proportion of test-takers who answered
an item correctly was obtained. Calculating the FI involved adding up the number of
correct responses for an item and dividing by the number of test-takers (Fulcher, 2010, p.
182). For example, 41 of the 63 test-takers answered the item sagesse ("wisdom")
correctly, giving an FI of 41/63 = .65. The facility indices for individual items within the clusters
ranged from as low as 0.127 (obtained for the item moisi, "mouldy") to as high as 1.0
(obtained for the items hiver, "winter"; crier, "to scream"; and bras, "arm"). Facility
indices for the 10 best functioning clusters ranged from 0.376 to 0.910. Mean facility
indices and standard deviations for each section of the test are shown in Table 5.
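As an illustrative sketch only (the function name and response data are hypothetical, not part of the thesis materials), the FI calculation described above can be expressed as follows:

```python
def facility_index(responses):
    """Proportion of test-takers who answered an item correctly.

    `responses` is one boolean per test-taker
    (True = correct word-to-definition match on that item).
    """
    return sum(responses) / len(responses)

# The item sagesse ("wisdom"): 41 correct responses among 63 test-takers.
sagesse_responses = [True] * 41 + [False] * 22
print(round(facility_index(sagesse_responses), 2))  # 0.65
```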
Calculating the discrimination index (DI) shows how well an item
discriminates between the top scorers and the bottom ones. Prior to calculating the DI, it
is necessary to group the test-takers into three groups (top, middle and bottom scorers) in
order to calculate two facility indices: one for the top scorers, which is called FI Top, and
another for the bottom scorers, which is called FI Bottom. The DI is obtained by
subtracting the FI Bottom from the FI Top; in other words, DI = FI Top − FI Bottom
(Fulcher, 2010, p. 182). The discrimination indices for individual items within the
clusters ranged from as low as 0.000 (obtained for the items faim, "hunger"; bras, "arm";
and crier, "scream") to as high as 0.895 (obtained for the items ordure, "trash"; and
fragmentaire, "fragmentary"). Discrimination indices for the best functioning clusters
ranged from 0.175 to 0.684. Mean discrimination indices and standard deviations for
each section of the test are shown in Table 5.
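A minimal sketch of the DI calculation, assuming the test-takers are split into equal thirds by total score (the function name and the nine-person data set are hypothetical; the exact grouping procedure Fulcher describes may differ in detail):

```python
def discrimination_index(test_takers, item):
    """DI = FI(top third) - FI(bottom third).

    `test_takers` is a list of (total_score, item_responses) tuples,
    where item_responses maps an item name to True/False.
    """
    ranked = sorted(test_takers, key=lambda t: t[0], reverse=True)
    third = len(ranked) // 3
    top, bottom = ranked[:third], ranked[-third:]
    fi = lambda group: sum(t[1][item] for t in group) / len(group)
    return fi(top) - fi(bottom)

# Hypothetical data: nine test-takers; the item is answered correctly by
# all three top scorers but by only one of the three bottom scorers.
takers = [(100, {"ordure": True}), (95, {"ordure": True}), (90, {"ordure": True}),
          (70, {"ordure": True}), (65, {"ordure": False}), (60, {"ordure": True}),
          (40, {"ordure": True}), (35, {"ordure": False}), (30, {"ordure": False})]
print(discrimination_index(takers, "ordure"))  # FI Top 1.0 minus FI Bottom 1/3
```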
Based on the FI and DI values, the poorest functioning two of the 12 clusters in
each section were eliminated. Table 5 shows the mean facility and discrimination indices
for the 10 remaining clusters in each section.
The item facility means in the third column of Table 5 show that the test is
functioning as intended; the means decrease as the test items become less frequent,
with about 80% of the test-takers able to answer the items on the 2K section correctly and
the 10K section proving the most difficult. However, the mean FI of the 10K section was
higher than expected. Test-takers clearly found these questions harder to answer than the
2K, 3K and 5K items, but the mean of 0.549 is considerably higher than the 0.289 figure
Schmitt et al. (2001) found for their 10K section. It had been expected that this section,
containing low frequency words, would be very difficult for most of the participants in
this pilot study.
Table 5
Facility Values and Discrimination Indices for Pilot Version

Section        Number of items   Item facility M (SD)   Discrimination index M (SD)
2K section     30                0.818 (0.091)          0.318 (0.078)
3K section     30                0.723 (0.147)          0.469 (0.110)
5K section     30                0.654 (0.144)          0.458 (0.099)
10K section    30                0.549 (0.140)          0.549 (0.076)
Note. N = 63
It was surprising to see that even some participants from the intermediate level
group answered a considerable number of items on this section correctly. Closer
inspection of the words targeted in the 10K section and drawn from Baudot's (1992) list,
revealed that four of them were not as infrequent as might have been expected. On the
more comprehensive and current Lonsdale and Le Bras (2009) list, these same words
were identified as 3K and 5K words. For instance, while Baudot (1992) listed pêcheur
("fisherman") and expertise ("skill") as 10K words, Lonsdale and Le Bras (2009) listed
them as a 3K word (rank 2759) and a 5K word (rank 4310), respectively. In view of this
discrepancy, the 10K word section was rewritten entirely. A new randomized set of 10K
words was drawn from Baudot's (1992) list and checked against both the Lonsdale and
Le Bras (2009) list and Cobb's (2000) Vocabprofil software to ensure the new items were
truly infrequent. The new 10K section consisting of 13 clusters was taken by two native
speakers of French, who obtained perfect scores. Their consensus on the questions that
they found both clearly written and yet challenging to answer informed the selection of
the 10 clusters for the final version of the test. Appendix G shows the new 10K word
section of the TTV (under the title La dixième tranche de mille mots).
Concerning the mean discrimination indices shown in Table 5, it is important to
point out that even though they are above the ideal .3 mark (Fulcher, 2010), the DI for
some individual matching items within clusters fell below this criterion in all four
sections of the test (recall that each cluster is made up of three definition-matching
items). There were 12 such items (40%) in the 2K section, 7 items (23%) in the 3K
section, 7 items (23%) in the 5K section, and 6 items (20%) in the 10K section. Thus
even though a cluster as a whole discriminates adequately, it may contain items within it
that have low DIs. When a test item has a low DI, this means that it is not discriminating
well between the top scorers and the bottom ones (e.g. almost all participants answered
an item correctly). The fact that the TTV has retained some items with such low indices
could be interpreted as a threat to its validity. However, it is worth noting that many
participants in this pilot study were expected to know many (if not all) words included in
the "easiest" parts of the test, namely, the words from the 2K and 3K sections, due to
their present proficiency level in French. The least proficient participants in this pilot
study had already spent 700 hours in class, on average, and could be expected to know as
many as 2,800 words. This expectation is based on the learning rate of four words per
hour of class time reported by Milton (2009) (700 hours × 4 words per hour = 2,800). In
other words, it is not surprising that the TTV, which was conceived to measure learners'
vocabulary size over a wide range of proficiency levels, contained some items that were
known by most of the test-takers. Schmitt et al. (2001) report low DIs in the development
of their test and make a similar argument for the inclusion of such items. In reality, these
low DIs attest to the (positive) fact that nearly all of the participants in the pilot testing
knew some of the most frequent words in French, which means that they had spent their
time wisely, learning the most important words of a language, namely, the high frequency
words (Nation, 2001).
A few questions where students were strongly attracted to a particular
distractor were rewritten. For instance, in a verb cluster of the 3K section, the distractor
arracher ("pull out") was flagged due to the high number of participants who had
matched this word to the definition faire naître ("to give rise to"), instead of engendrer
("generate"). The reason why 12 participants (19% of the test-takers) might have been
attracted to this distractor could be as follows: Not knowing what engendrer meant, they
might have tried to come up with another answer, and arracher might seem to be the
right answer, since, in the context of a mother giving birth, the baby needs to be pulled
out (by an obstetrician). To avoid this kind of "educated guess" and further confusion,
the verb arracher was replaced by the next verb in the randomized word list: commenter
("comment"). In sum, some questions were removed on statistical grounds (e.g. low DIs)
and others because they were confusing (e.g. the distractors were chosen often).
Even though two of the original 12 clusters were eliminated from each section of
the pilot test, the distribution of the remaining 40 clusters still follows the 2 : 1 : 1 ratio. On the
final version of the TTV, the cluster distribution in each section is as follows: 5 noun
clusters, 3 verb clusters, and 2 adjective clusters (2K word section), 5 noun clusters, 3
verb clusters, and 2 adjective clusters (3K word section), 5 noun clusters, 2 verb clusters,
and 3 adjective clusters (5K word section), and 5 noun clusters, 3 verb clusters, and 2
adjective clusters (10K word section). As a result, the final version of the TTV includes
20 noun clusters, 11 verb clusters and 9 adjective clusters (see Figure 10), which taken
over the whole test approximates the 2 : 1 : 1 ratio.
Figure 10. Distribution of clusters in each section on the TTV - final improved version.
Note. N stands for noun (clusters), V for verb, and A for adjectives.
The participants' feedback provided in the post-test interviews also prompted
further revision of the TTV. When they were asked to explain how they had proceeded to
answer a cluster, more than half of them mentioned that they resorted to guessing (by
using their doigt magique, "magic finger," as some would describe their behaviour in a
joking way), particularly when completing the 10K section. This behaviour might be due
to the assumption that all questions needed to be answered. Based on this kind of input
and on the low number of blank responses (995 out of 9,072 responses, 11%), a
modification was made to the instructions of the TTV: the sentence Si vous ne connaissez
pas un mot, laissez la réponse en blanc ("If you do not know the word, leave the answer
space blank.") was added in order to avoid blind guessing. Schmitt et al. (2001) also
recommend that these instructions be reinforced orally prior to the test administration.
Finally, based on Fulcher's (2010) recommendation of starting the test with the
easiest questions in order to motivate the participants, further revisions at this stage
included changing the sequence of the clusters in each section in order to display the
questions with the highest item facility first.
The sequence of this procedure as well as the steps to analyze the data are
summarized below:
The final improved version of the entire TTV that implements all the changes
described above can be seen in Appendix G. The answer key can be seen in Appendix H.
This version was given in Fall 2013 to different groups of French L2 learners
attending the Francisation program. It was held in four sessions on two consecutive days.
Similar to the pilot test administration, each session of this final administration of the
TTV took 60 minutes: the first 10 minutes were allotted to completing the sociobiographical
questionnaire, and the remaining 50 minutes to answering the test questions. Even though
the final version of the test was shorter (it included 40 questions instead of the 48 on the
pilot version), the participants had the same amount of time to complete it. In each
session, all participants started the test at the same time, right after the researcher had
directed their attention to the instructions and reinforced orally the idea that they did not
need to resort to guessing.
Participants in three sessions were timed; they averaged 33 minutes (range: 14–50)
to finish the test. Once again, the TTV's practicality was confirmed.
No post-test interviews were scheduled at this point of the study.
Results of the pilot testing
The results of the pilot study confirmed that the test was appropriate for the
purpose and feasible to administer. Following are some of the most important findings:
• The test was completed in less than 35 minutes by all 63 participants.
• The testing resulted in a range of scores, which was useful for the analysis. Scores
across all participants ranged from 31 to 135 (maximum score = 144) with a mean
of 99.98 and a standard deviation of 26.23.
• The mean score on all 144 items was 91.36 (SD = 27.45) in the intermediate
group and 109.47 (SD = 21.5) in the more advanced group. An independent
samples t-test showed that this difference was statistically significant (t = -2.89, df
= 61, p < .003). Thus these preliminary results confirm the idea that the higher
the proficiency level of a group, the greater its vocabulary size.
• When these scores are extrapolated to numbers of words (lemmas), the mean
vocabulary size in the intermediate level group amounts to 5,838, and in the
advanced group to 7,169.
• Average scores based on all of the piloted clusters in each section (i.e. 36
definition matches per section) show the expected gradual decline across the
frequency levels: from 29.71 (2K), through 26.22 (3K) and 24.30 (5K), to 19.75 (10K).
• In the interviews, the TTV was considered a good test of vocabulary by all
participants (n = 12).
• All interviewees said that they did not have any problem with the cluster format.
Results based on the final version of the test and the interview data are reported in
the next chapter; they are discussed in detail in chapter 5 of this study.
Chapter 4. Results
Research question 1
The first research question asked: "How do learners perform on the TTV? Is there
a proficiency effect such that students in higher groups have higher scores than students
in lower groups?" Group means for the various proficiency groups are shown in Table 6.
As shown in the third column, learners in higher groups obtained higher scores than
learners in lower ones (maximum total score = 120): the Level 1 group mean is the lowest
at 38.87 (SD = 20.83); as proficiency level increases so do the means, with the highest
mean of 92.44 (SD = 13.50) obtained in Level 4. According to the results of a one-way
ANOVA, there were significant differences in the data (df = 3, F = 40.97, p < .0001).
Post hoc comparisons indicated that all of the between-group differences were significant
(p < .01).
Table 6
Number of Words (Means) by Proficiency

Proficiency level   Number of items   M (SD)          %     Extrapolation to 10K
1                   120               38.87 (20.83)   32%   2,699
2                   120               56.29 (22.39)   47%   4,068
3                   120               73.88 (29.50)   62%   5,274
4                   120               92.44 (13.50)   77%   6,891
Note. N = 175
The rightmost column in Table 6 shows the mean vocabulary sizes in the various
groups; these are obtained by extrapolating the group means to the 10,000 words that the
test samples. These data point to a direct relationship between proficiency level and
number of "known" words.
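The thesis does not spell out the extrapolation formula at this point. One plausible reading, consistent with the later statement that the 2K, 3K and 5K sections together cover the 5,000 most frequent words, is a band-weighted calculation in which each 30-item section stands for its frequency band (2K → 2,000 lemmas; 3K → 1,000; 5K → 2,000; 10K → 5,000). The band weights are an assumption, so the sketch below is not claimed to reproduce the Table 6 figures exactly:

```python
# Assumed band weights: each 30-item section samples a slice of the first
# 10,000 lemmas (2K: ranks 1-2000, 3K: 2001-3000, 5K: 3001-5000,
# 10K: 5001-10000). These weights are not stated in the thesis.
BAND_SIZES = {"2K": 2000, "3K": 1000, "5K": 2000, "10K": 5000}
ITEMS_PER_SECTION = 30

def extrapolate_vocab(section_scores):
    """Scale each section's mean score up to the size of its frequency band."""
    return sum(section_scores[s] / ITEMS_PER_SECTION * BAND_SIZES[s]
               for s in BAND_SIZES)

# Whole-group section means from Table 7.
group_means = {"2K": 20.72, "3K": 18.25, "5K": 16.25, "10K": 9.58}
print(round(extrapolate_vocab(group_means)))  # 4670 lemmas under these assumed weights
```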
Hypothesis 1, which predicted that learners in higher groups would have higher
total scores (and larger vocabulary sizes) than learners in lower groups, appears to be
confirmed.
Research question 2
The second research question asked: "Is the TTV implicational such that students'
scores are higher on the 2K test, lower on the 3K test, lower still on the 5K and so on?"
Answering this question involved calculating mean performance for each section of the
test in the entire participant group (N = 175). The means for each 30-item section are
shown in Table 7. The figures show the expected pattern with scores for more frequent
words being higher than scores for less frequent words. According to the results of a one-
way ANOVA, there were significant differences in the data (df = 3, F = 422.82, p <
.0001). Post hoc pairwise comparisons showed that all of the differences in means were
significant (p < .01). As shown in Table 7, the learners as a group know more than two
thirds of the words in the 2K section (69%), but a little less than a third of those in the 10K
section (32%), and progressively fewer in the sections in between. The declining scores
across the word sections clearly indicate that the TTV provides a scalable profile of
vocabulary frequency levels.
Table 7
Test Scores (Means) by Section

Section   Number of items   M (SD)         %
2K        30                20.72 (6.59)   69%
3K        30                18.25 (7.53)   61%
5K        30                16.25 (7.82)   54%
10K       30                9.58 (6.00)    32%
Note. N = 175
Figure 11 shows the means on the various sections of the test in the four groups.
As this figure illustrates, the percentages of correct responses show a consistent pattern of
declining scores from the highest frequency level (2K) to the lowest (10K) regardless of
learners' proficiency level. It is clear that hypothesis 2, which predicted that the test
would produce higher scores on the 2K section, lower on the 3K section, lower still on
the 5K, and even lower on the 10K section, is confirmed.
Figure 11. Test score mean per word section and proficiency level.
Research question 3
The third research question asked: "Do learners really know the meanings of
words that they have indicated as known on the TTV?" Interviews with a subset of the
participants were conducted to answer this question. The post-test interviews elicited 576
answers (12 participants × 48 test words), which fit four different scenarios: (a) correct
response on the test and in the interview; (b) incorrect response on the test and correct
response in the interview; (c) correct response on the test and incorrect response in the
interview; and (d) incorrect response on the test and in the interview. Table 8 shows that
scenario a, the perfect match, occurred 68% of the time (390 answers out of the total
answers). This indicates that there is a fairly strong correspondence between the test
results and the results of the interview.
Table 8
Comparison of Interview Results With TTV's Results

                              TTV
Interview        Correct      Incorrect    Total
Knew             (a) 390      (b) 60       450
Did not know     (c) 46       (d) 80       126
Total            436          140          576
Note. N = 12
Hypothesis 3, which predicted that a qualitative measure would show that
students knew most of the words that they had indicated as known on the test, is largely
true. The finding that 14% of the words not known on the test were also not known in the
interviews (scenario d) also supports the test's validity. Taken together (68 + 14 = 82%),
these figures indicate that test performance is a reasonably accurate reflection of what
students do and do not know.
The mismatches (scenarios b and c), in which participants did not demonstrate
knowledge of target words while performing one of the elicitation tasks, occurred 18% of
the time (106 answers out of the total answers). However, scenario b occurred slightly
more often than scenario c (10% and 8%, respectively), which means that a number of
participants performed better in the interview than on the test. In terms of validating the
test, there would ideally be very few mismatches. As can be seen by the results, the
proportion of mismatches was fairly small, which attests to the TTV's validity.
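The percentages cited for the four scenarios follow directly from the Table 8 counts; a quick arithmetic check (Python used purely as a calculator here):

```python
# Scenario counts from Table 8: a = correct on both, b = wrong on test only,
# c = wrong in interview only, d = wrong on both.
counts = {"a": 390, "b": 60, "c": 46, "d": 80}
total = sum(counts.values())  # 576 = 12 participants x 48 test words
pct = {k: round(100 * v / total) for k, v in counts.items()}
print(total, pct)             # 576 {'a': 68, 'b': 10, 'c': 8, 'd': 14}
print(pct["a"] + pct["d"])    # 82 -> matches in both directions
```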
Chapter 5. Discussion
The goals of this study were to develop and validate a research-informed measure
of receptive vocabulary size for French L2 learners. Based on Nation's (1983) VLT and
Schmitt et al.'s (2001) guidelines, the TTV was created to estimate learners' vocabulary
size up to 10,000 words. This new assessment tool was administered to adult L2 French
learners at different proficiency levels and their test scores confirmed its validity. Also,
learners' lexical proficiency as determined by their placement in the language program relates
to their overall vocabulary size as measured by the TTV, which attests to its reliability. It
is worth mentioning that the institutional placement/achievement tests are held
during a week-long period, and include four subtests, which evaluate students' oral
comprehension, written comprehension, oral expression and written expression. The fact
that performance on the TTV confirmed learners' level as determined by the results
obtained from this thorough testing process can be seen as a strength of this new measure.
In answer to the first research question which explored the relationship between
vocabulary size as indicated by the test and L2 proficiency, the results indicate that there
was a positive relationship between proficiency level and number of "known" words; in
other words, the higher the proficiency level of a group, the greater its mean vocabulary
size. At this point, it is pertinent to reiterate that, in the context of this study, the
expression "knowing a word" refers to knowing the correspondence between one form
and one sense of the word, also known as establishing the word's form-meaning link at
the receptive level. The TTV was solely conceived to indicate initial learning of a word,
mainly recognition, since only two word knowledge aspects (written form and meaning)
were measured in the current experiment. It is recognized that additional measures,
including a measure of production, are needed to give a more complete picture of a
learner's vocabulary knowledge.
The second question that explored whether the frequency sections of the test
would function as expected was answered in the affirmative. That is, mean performance
on the most frequent (2K) words was higher than performance on 3K words and so on
with the 10K section proving to be the most difficult. This is an important finding in
terms of the test's usefulness. But it is important to note that while this was true of the
group, performance of individuals did not always follow this pattern. For example, the
individual profiles for six Level 4 learners (24% of this group) did not form a perfect
descending scale. One learner, in particular, scored 24 (2K), 29 (3K), 27 (5K), and 18
(10K). Milton (2009) noted similar results in his overview of vocabulary size
testing (p. 33). He identifies word frequency as the strongest predictor of acquisition,
with more frequent words being consistently acquired earlier than less frequent ones –
across a variety of languages and learning settings – and the findings reported here
confirm that. But he points out that individuals cannot be assumed to always follow this
trajectory.
Answering the third question involved the investigation of a subset of participants
to determine whether responses to test items on the TTV corresponded to knowledge that
was assessed in another format (interviews). Results provided strong evidence for the
validity of the new test; there was a great deal of consistency between interview and test
performance with learners able to demonstrate knowledge of words they had answered
accurately on the TTV. A number of other interesting insights emerged from this part of
the research:
First, the test seems to have been received positively. As mentioned, all
interviewees answered affirmatively when asked if the TTV was a good test. Some of
them, however, considered the test difficult (three interviewees) or long (two
interviewees). One student commented that the TTV was too general
and that she would have preferred to be tested on words from her own field of expertise.
Secondly, the interviews revealed students' answering strategies. When asked about their
procedures for answering the clusters, two distinct test-taking behaviours were observed:
participants either used the same strategy to tackle both easy and difficult clusters (eight
participants) or used different strategies for each type of cluster (four participants).
Within the former group, for instance, seven participants reported that they read
definitions (second column) first and then alternatives (first column), and only one did
the opposite (first column then second column). When presented with the easy cluster,
the majority of the interviewees answered the three items quickly and correctly. They
usually chose the correct alternative without considering the others, treating the items in
an independent way. The only exceptions were two interviewees who answered only two of
the three items: neither of them found, among the test words (évidence, hommage,
maître, richesse, totalité, trafic), the answer to be associated with the definition/synonym
commerce (the right response was the last option). They knew that commerce meant
magasins ("stores") or affaires ("business"), but when the researcher asked them what
trafic meant, they answered promptly: 'The circulation of cars,' which is a valid answer.
This indicates that they knew one of the meanings of this polysemous word, but not the
meaning selected in the TTV: 'circulation of goods,' which prevented them from making
the appropriate form-meaning connection. On the other hand, when presented with the
difficult cluster, a small portion of the interviewees answered its three items correctly.
However, all interviewees said that, at some point, they had to resort to guessing in order
to respond to this type of cluster. For the most part, they read the alternatives more than
three times and tried to eliminate the least plausible options, which made them treat the
items in a dependent way and spend more time on a cluster. Not surprisingly, this
procedure was more time-consuming and oftentimes unsuccessful. In summary, the
interviews reveal certain limitations of the TTV's matching format. The single definitions
mean that the test may underestimate the knowledge of students who know alternative
meanings. It is also clear that the test may overestimate knowledge because the format
allows for guessing and elimination.
On the point of guessing, it is interesting to note that instructions added to the
TTV's improved version asking participants to avoid blind guessing seem to have been
effective. In the pilot testing most students tended to complete the test, which appeared to
involve a great deal of guessing. However, at the administration of the improved version,
written instructions to leave unknown questions blank were added and reinforced orally
by the researcher. This intervention is important to emphasize in future uses of this test
for the sake of validity: less guessing means that the measure reflects participants' word
knowledge more accurately.
In the next sections, the findings are discussed in relation to two topics: previous
estimations of L2 French vocabulary size and the role of L1 influence.
Comparing estimates of vocabulary sizes
Earlier in this thesis, in order to give the reader a preliminary idea of French L2
learners' vocabulary sizes, initial estimates were calculated using Milton's (2009) rule of
thumb (learners gain four words/lemmas per classroom hour on average). When
comparing these estimates to the ones obtained by the TTV (see Table 9), a clear
difference can be observed between the learners Milton investigated and the Quebec
participants in the thesis study.
Table 9
Number of Known Words by Proficiency Level: Extrapolations Based on Milton and TTV
Results (N = 175)

                                           Estimates of vocabulary sizes
Proficiency level   Hours of instruction   Milton (2009)   TTV     % difference
1                   330                    1,320           2,699   104%
2                   660                    2,640           4,068   54%
3                   990                    3,960           5,274   33%
4                   1,320                  5,280           6,891   31%
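The percentage differences in the rightmost column of Table 9 can be reproduced from Milton's (2009) four-words-per-hour rule of thumb; a quick arithmetic check:

```python
# Hours of instruction and TTV-based vocabulary estimates, from Table 9.
hours = {1: 330, 2: 660, 3: 990, 4: 1320}
ttv = {1: 2699, 2: 4068, 3: 5274, 4: 6891}

for level in hours:
    milton = hours[level] * 4  # Milton's rule of thumb: 4 lemmas per classroom hour
    diff_pct = round(100 * (ttv[level] - milton) / milton)
    print(level, milton, diff_pct)
# prints: 1 1320 104 / 2 2640 54 / 3 3960 33 / 4 5280 31
```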
In this study, Level 1 students, who attended 330 hours of instruction, would know 1,320
words by Milton's (2009) count, but knew 2,699 words, on average, according to the TTV
results. This amounts to a sizable difference of 1,379 words (104%), or slightly more than
double the Milton-based estimate, though it is interesting to note that the
higher the proficiency level, the smaller the difference between the two estimates. A number
of reasons can explain this variation in vocabulary estimates. These include the role of
the learning context and the design of the size test itself. As for context of learning,
Milton's investigation of learners of French singled out the Spanish speakers in his
sample and this allows for a close comparison to similar test-takers in the current study.
The learners in Milton's study had had around 560–650 hours of instruction and are
therefore roughly comparable to the Level 2 group in the TTV study (see Table 9).
According to their test scores on the X-Lex (frequency bands 1 through 5), they knew
2,630 lemmas on average (Milton, 2009). For the sake of comparison, the number of
words known by the participants in the current study having a similar profile was
calculated: the eight learners of French who were at Level 2 and whose L1 was
Spanish knew, on average, 3,875 out of the 5,000 most frequent words (taking together
the 2K, 3K and 5K word sections). This large difference in estimates may be explained
by the fact that the two groups of participants were fairly different: in Milton's study, the
participants were learning French in non-French-speaking societies, while the TTV
participants were living and working in a French-speaking society. Learners who live in a
French speaking society benefit from being exposed to their L2 outside the classroom,
and this may explain why their recognition lexicons are much larger compared to those
reported by Milton (2009). Future research that tests the TTV in a variety of learning
settings may be able to shed light on these issues.
Another reason that might explain the wide variation in vocabulary estimates
pertains to the different types of frequency lists used to build the tests the French L2
learners took in the studies. As mentioned, in Europe, vocabulary sizes were measured by
the X-Lex test, but in Canada they were measured by the TTV. The former test drew
words from the Baudot (1992) list, which was based on a corpus containing written
materials only. The latter test drew 75% of the test words from the Lonsdale and Le Bras
(2009) list, which was based on a more comprehensive corpus with a large spoken
component (50%). It is possible that by assessing some target words typically used in
spoken language, the TTV presents test-takers with higher frequency words, with which
they may be more familiar. If indeed the TTV assesses more of these "easier" words than
the X-Lex test, then learners have less difficulty in making the form-meaning link, which
translates into better scores, and consequently, larger vocabulary sizes.
Previously learned languages and vocabulary sizes
A further point of interest in these results is the role of learners' L1 on their
receptive vocabulary knowledge. In order to explore how the L1 impacted test scores,
three language groups were investigated: Romance speakers, Asian speakers, and
non-Romance speakers. The first group consisted of 67 speakers of Portuguese, Romanian, and
Spanish, who account for 38% of the population. This particular group is known for
having a distinct advantage when learning French (also a Romance language), because
they can recognize a great number of cognates, mainly Greco-Latin origin words (Milton,
2009; Schmitt, 2010). The second group consisted of 27 speakers of Korean, Mandarin,
Teochew, and Vietnamese (in fact, East Asian languages), who account for 15% of the
population. Their L1s are typologically very distant from French and have not been
strongly influenced by Latin (as English has been, for instance). The third group
consisted of 81 speakers of Persian, Russian, Tagalog and eleven other languages (see
Participants subsection), who account for 46% of the population. As seen in Table 10,
average scores (maximum score = 30) based on each frequency section showed the
expected gradual decline across the frequency levels, with mean scores on more frequent
words consistently higher than those for lower frequency words. However, the analyses
show that the mean scores for Romance speakers are substantially higher than the other
two groups on all sections of the test. These numbers suggest that these learners used
their knowledge of helpful L1 cognates to their advantage. As for the Non-Romance and
Asian speakers, mean scores in these two groups are nearly equivalent in the 2K section
(17.80, SD = 5.68, and 17.85, SD = 7.49, respectively) and in the 3K section (14.41, SD =
6.06, and 14.30, SD = 7.19, respectively). There is a difference, however, in the most
difficult sections, in which the non-Romance group performs slightly better than the
Asian group. Perhaps the European (e.g. Russian) language background of some
participants in this group offered them a small cognate advantage over the Asian
language speakers. A close examination of the role of L1 background on performance on
the TTV is beyond the scope of this study, but it is an interesting avenue for further
research.
Table 10
Mean Correct Scores for Different Language Groups
Section Romance speakers Non-Romance speakers Asian speakers
2K 25.40 (4.16) 17.80 (5.68) 17.85 (7.49)
3K 24.48 (4.53) 14.41 (6.06) 14.30 (7.19)
5K 21.94 (5.03) 13.04 (6.76) 11.78 (8.23)
10K 13.97 (4.53) 7.02 (5.11) 6.37 (5.30)
Note. Romance speakers (n = 67), non-Romance speakers (n = 81), Asian speakers (n = 27).
It is clear that the learners whose L1 is more closely related to French have an
advantage over the other linguistic groups when performing this kind of receptive
vocabulary test. This corroborates previous empirical findings (Schmitt et al., 2001) and
confirms the importance of L1 in L2 vocabulary development.
Another interesting aspect is the role of previous L2 knowledge (of English and
French, in particular). As mentioned, just under half of the participants (46%) said that
they did not speak English and the same percentage of them reported that they had
studied French in their countries of origin before coming to Canada. Among the Asian
speakers, knowing English affected their performance on the test greatly: when looking
only at Asian speakers' scores in each proficiency level, all the test-takers who answered
the most questions correctly were also the ones who said they knew English. For
example, among all the Asian speakers at Level 3 (nine participants), the highest
score (84%) was obtained by a test-taker who knew English. On the other hand, in the
Romance group, this pattern was not observed. For example, among all the Romance
speakers at Level 3 (26 participants), the highest score (87%) was obtained by two
test-takers: one knew English, the other did not. Studying French before coming to
Canada also played a role in test-takers' performance: among the Asian speakers who
answered the most questions correctly, 56% of them said they had studied French
previously; a similar figure was obtained from Romance speakers (55%). The effects of
L1 background and previous study of English and/or French on test performance are an
interesting topic for further exploration.
Limitations and tentative solutions
Although the TTV is expected to be useful to SLA researchers and L2 teachers
and learners, there are some limitations to this study that should be taken into account.
First, it was quite unexpected that six participants performed better in the
interviews on the 48 target words than they had on the test itself. One way of
explaining this difference between test and interview performance might be the re-test
effect. Although participants were not informed that they would be interviewed on their
knowledge of some of the test words, during the interview, four of them said they
consulted a dictionary to look up some of the words they had encountered for the first
time on the test. The apparent reason for this sudden motivation to learn more words was
the realization that, after taking the TTV, their vocabulary size was not as big as they
thought it would be. Due to time constraints, the interviews were not scheduled
immediately after the test, as Schmitt et al. (2009) scheduled theirs. Since the interviews
took place the day after the test, some participants had plenty of time to look up
some words and refresh their memory of the definitions before the interview. As a result, they were
more knowledgeable about these words than they would otherwise have been, and this may explain
their strong performance on the interview. On the other hand, a few participants
performed better on the test than in the interview. One way of explaining this type of
mismatch might be that participants resorted to guessing. However, this does not seem to
be too much of a problem, since it occurred in only 8% of the 576 total cases, as mentioned
in the previous chapter.
A second possible measurement limitation is the use of a lemmatized frequency
list to build the test. Most L2 researchers measure vocabulary size in terms of word
families. As mentioned in the review of the literature, a vocabulary of 8,000 to 9,000
word families is necessary in order to understand academic written texts in English. Since
the frequency word lists used to build the TTV are lemmatized rather than 'familized', the
test measured learners' vocabulary size in terms of lemmas known. Thus if a student
identifies the correct definition of the lemma représenter ("to represent") on the test,
strictly speaking, one can assume that she knows the verb forms of this word only. It is
reasonable to assume that she also knows words in the same family such as représentant
("representative," noun), représentation ("performance"), and représentatif
("representative," adjective), but, since the frequency list on which the test is based is
ordered according to lemmas rather than families, that assumption is questionable. In
principle, it is possible to investigate lemma knowledge in relation to reading
comprehension: As determined by the TTV, the most proficient learners (Level 4) know
an average of 6,891 lemmas. Researchers could explore how much text coverage this
knowledge of almost 7,000 French lemmas would yield and whether it would enable
these Level 4 learners to understand academic texts adequately. Nonetheless, testing
knowledge of lemmas rather than families presents a definite limitation. Can the student
who answered the question about représenter (a 1K lemma) correctly read and
understand a text that contains représentatif (a 4K lemma)? The present lemma-based
form of the TTV does not allow us to answer that question.
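As a rough illustration of the kind of estimate at issue here, a section score out of 30 can be scaled to the slice of the frequency list that the section samples and the slices summed. The band widths and the linear scaling in this sketch are assumptions made for illustration only; the thesis does not report the exact formula behind figures such as the 6,891-lemma average.

```python
# Hypothetical sketch: converting levels-test section scores into a lemma-size
# estimate. Band widths and the linear scaling are illustrative assumptions,
# not the TTV's actual scoring procedure.

# Each section label -> how many lemmas of the frequency list it stands for.
BANDS = {
    "2K": 2000,   # first two thousand lemmas
    "3K": 1000,   # third thousand
    "5K": 2000,   # fourth and fifth thousand
    "10K": 5000,  # sixth through tenth thousand
}

def estimate_lemmas(scores, items_per_section=30):
    """Scale each section score (out of 30) to the band it samples and sum."""
    total = 0.0
    for band, width in BANDS.items():
        total += (scores[band] / items_per_section) * width
    return round(total)
```

Under these assumptions, a perfect score on every section would yield the full 10,000 lemmas, and a learner scoring 30, 30, 15, and 0 on the four sections would be credited with roughly 4,000 lemmas.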
Another important limitation of the study concerns the development of the 10K
section of the TTV. The use of two lists, drawn from two very distinct corpora (Lonsdale
and Le Bras for the 2K, 3K and 5K sections and Baudot for the 10K) was problematic
due to conflicting information they contained: in some instances, a word that was
considered low frequency in one list was high frequency in another. Due to such
inconsistencies, the entire 10K section that had been piloted was discarded and another
one was created from scratch for the final version of the test. Ideally, this new section
would have been piloted and poorly functioning clusters would have been eliminated
after calculating the facility and discrimination indices. Fortunately, the participants
performed on the new 10K test as expected: their scores on this section containing low
frequency test words were the lowest, which indicates that its level of difficulty was the
highest. Nonetheless, it is clear that it would have been better to have been able to base
the entire test on a single recent and comprehensive word list in French that includes at
least 10,000 frequent words. Since such a list does not exist, it seems clear that more
corpus work in French is needed.
Although the Lonsdale and Le Bras (2009) frequency list was a very helpful
(though incomplete) tool in the development of the TTV, its 23 million word corpus of
modern spoken and written French had some limitations. Because more than 20% of the
corpus consists of European and Canadian parliamentary debates, which call for a more
formal register, some rare words were ranked as high frequency words, and some basic
ones as low frequency. For instance, clore ("to close," as in clore la
session, "close the session"), which is not a basic word in the French language for most
users, was ranked at 1827 (or a 2K word), while cahier ("notebook"), which is one of the
words learned earliest in classroom instruction, was ranked at 4001 (or a 5K word). There
is little evidence that such problematic rankings may have affected the test scores, but,
again, in order to avoid this problem, it is necessary to create a new frequency count
derived from a corpus that reflects a less formal spoken register of the French language.
As mentioned, the research only explores whether learners know the most
frequent meaning of the most frequent words. The fact that the test does not permit the
testing of the multiple meanings of polysemous words may be considered a limitation.
For example, a learner might know that bureau ("desk" or "office") can mean "desk"
(particularly if she is a student regularly exposed to expressions such as bureau du
professeur, "teacher's desk," and ordinateur de bureau, "desktop computer") but not
know the more frequent meaning "office." One solution that
could be envisioned to assess different meanings of a single word is the creation of
several versions of the TTV, which would be administered longitudinally. The
subsequent versions would contain the words previously tested, with the difference that
they would include a new definition, assessing a different meaning sense. For instance, in
one version, bureau would be matched to its sense of "piece of furniture" as in bureau du
professeur ("teacher's desk"), and in another version, it would be associated with its sense of
"room for work," as in bureau de l'avocat ("lawyer's office"). It should be mentioned that
this solution was previously described by Beeckmans et al. (2001) as a way to tackle
polysemous words in the Yes/No checklist test.
Finally, another limitation of this study concerns the fact that the TTV focuses on
measuring receptive vocabulary knowledge only. It does not measure learners' ability to
use the test words productively, since it does not assess their pronunciation or spelling
skills, for instance. Also, test words are presented in their written forms, so it is not clear
whether test-takers who are able to identify correct definitions of words are also able to
recognize these words when they hear them in use. In addition, because the TTV demands
only a basic and simple task of test-takers, that is, demonstrating receptive mastery
using test words in isolation, this exercise may give learners a false sense of what
knowing a word really is. A way to solve this problem would be to use the TTV as a sub-
test included in a larger test, which would consist of several measures, such as a reading
comprehension test, presenting the words in different types of contexts of use. It should
be pointed out, however, that even if the TTV only asks learners to make basic form-
meaning connections, making these fundamental links accurately is crucial in expanding
their L2 vocabulary. Essentially, the TTV is a useful starting point in a longer process.
Chapter 6. Implications and conclusions
Implications for research
In the previous chapter, some avenues for future research were suggested or
implied. These included various ways of improving the test. First, it is recognized that
further validation of the TTV is needed. For instance, administering the test to larger and
more varied samples will make it possible to investigate the test clusters more closely. In
their validation of the VLT, Schmitt, Schmitt, and Clapham (2001) were able to administer the
test to 801 participants in five countries, which was clearly beyond the scope of this
thesis. But as the TTV becomes available to the research community, it may be possible
to conduct a larger-scale investigation and gain further insights on the test's usefulness. If
the TTV is tried in other settings where there are different levels of learners, its efficacy
in predicting proficiency will also become clearer. A second avenue for improving the
test pertains to the lists and corpora that were used. As larger and more representative
French corpora and frequency lists become available, an improved version of the TTV
will become possible. As mentioned earlier, a list of frequent families is important in this
regard. An important next step for software developers is to incorporate existing (or
improved) French frequency lists into lexical frequency profiling software. This will
make it possible to answer questions posed at the beginning of this thesis about matching
learners to texts. For instance, it would be useful to know how well a student whose TTV
score indicates that she knows the 1K through 3K words in French is able to understand
texts written at that level. This kind of research can also answer the question posed at the
outset about the level of French needed to read texts at the CEGEP or university level.
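To make the matching question concrete, the core of lexical frequency profiling can be sketched in a few lines: given a frequency-ranked lemma list and an already-lemmatized text, what percentage of the running words falls within the first N lemmas? The function and toy data below are illustrative assumptions, not the behaviour of any existing profiling tool; a real profiler would work with full lists and would also need to lemmatize the text first.

```python
# Minimal sketch of lexical frequency profiling: the percentage of a
# (pre-lemmatized) text covered by the first `known` lemmas of a
# frequency-ranked list. Illustrative only; real tools handle lemmatization
# and much larger lists.

def coverage(text_lemmas, ranked_lemmas, known=3000):
    """Percent of running words covered by the `known` most frequent lemmas."""
    known_set = set(ranked_lemmas[:known])
    hits = sum(1 for lemma in text_lemmas if lemma in known_set)
    return 100.0 * hits / len(text_lemmas)

# Toy example: a learner who knows only the top two lemmas of this tiny
# ranked list covers two of the four running words in the text.
ranked = ["le", "être", "maison", "stipuler"]
text = ["le", "maison", "être", "stipuler"]
```

With these toy lists, `coverage(text, ranked, known=2)` returns 50.0, the kind of figure that could be compared against proposed comprehension thresholds once suitable French lists are built into profiling software.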
Implications for teaching
According to Schmitt (2010), in order to promote vocabulary learning, teachers
need to teach words explicitly and expose students to large amounts of language input,
especially at the beginning of the learning process. At that point, learners will be inclined
to pick up the most basic aspects of word knowledge, such as meaning and form, and at the
same time it will be appropriate to measure form-meaning links using a test of
receptive vocabulary size such as the TTV.
In contrast to Szabó (2008), who asserts that "testing hinders rather than
facilitates the learning process" (p. 11), the TTV was perceived by participants as a task
that encouraged them to learn vocabulary. Most of the interviewees reported that, thanks
to the TTV, they became aware that their vocabulary size was smaller than they thought it
was and that they were willing to fill this gap by engaging in vocabulary learning. Even
the teachers considered the TTV a task that promotes vocabulary learning. On an
anecdotal note, all the teachers whose students took part in the pilot testing phase of the
TTV allowed the researcher to return to their respective classrooms in the following
session and invite their new groups of students to take the final version of the TTV.
Surprisingly, some teachers went beyond that and told their colleagues about their
learners' positive experience with the TTV, which resulted in a far greater number of
participants in this study. On the basis of this experience, it seems clear that the need for
a test such as the TTV is recognized by both learners and teachers. Its potential for
motivating learning is also evident.
The relatively successful use of the TTV reported in this study indicates that it
attained one of its goals, namely, to devise a test that provides meaningful and reliable
information concerning learners' receptive vocabulary size. With this kind of information,
teachers can identify gaps in their students' lexical knowledge and focus attention on
what appears to be a problematic area. Also, with knowledge of their students’
vocabulary size, teachers can choose suitable texts and design level-appropriate
instruction, which has the potential to enhance learning. For these reasons, sound testing
is of great importance: only the information obtained through sound testing can outweigh
the negative side effects of poor testing. As Fulcher and Davidson (2007) put it,
"[t]he usefulness of assessment, the validity of interpretation of evidence, is meaningful
only if it results in improved learning" (p. 35).
Conclusion
Previous research indicates that L2 vocabulary is a problematic area for learners
continuing with their studies in higher education (Richards et al., 2008). Thus, learners
hoping to move beyond basic reading skills (e.g., to read academic texts) need to know how
large their vocabulary is so they can focus on expanding their lexicons in the right direction.
The present study aimed to fill a gap by developing and validating a research-
based vocabulary size test in French. It is hoped that it contributes to the field of second
language acquisition by providing evidence that receptive vocabulary tests are good tools
for L2 learning and by furthering understanding of L2 vocabulary development.
References
Anderson, R. C., & Freebody, P. (1983). Reading comprehension and the assessment and
acquisition of word knowledge. In B. A. Hudson (Ed.), Advances in
Reading/Language Research. Volume 2 (pp. 231-256). Greenwich, CT: JAI Press.
Assche, S. B.-V., & Adjiwanou, V. (2009). Dynamiques familiales et activité sexuelle
précoce au Canada. Cahiers québécois de démographie, 38(1), 41-69.
Bachman, L., & Palmer, A. (1996). Language testing in practice. Oxford: Oxford
University Press.
Baudot, J. (1992). Les fréquences d'utilisation des mots en français écrit contemporain.
Montréal: Presses de l'Université de Montréal.
Beauchemin, N., Martel, P., & Théoret, M. (1992). Dictionnaire de fréquence des mots
du français parlé au Québec. New York: Peter Lang.
Beeckmans, R., Eyckmans, J., Janssens, V., Dufranne, M., & Van de Velde, H. (2001).
Examining the Yes/No vocabulary test: Some methodological issues in theory and
practice. Language Testing, 18(3), 235-274.
Brown, H. D. (2007). Teaching by principles: An interactive approach to language
pedagogy (3rd ed.). White Plains, NY: Pearson Education.
Cameron, L. (2002). Measuring vocabulary size in English as an additional language.
Language Teaching Research, 6(2), 145-173.
Centre National de Ressources Textuelles et Lexicales (2012). Portail Lexical [Web site].
Retrieved from http://www.cnrtl.fr/definition/
Cobb, T. (2000). The compleat lexical tutor [Web site]. Retrieved April 5, 2013, from
http://www.lextutor.ca/
Cobb, T., & Horst, M. (2004). Is there room for an academic word list in French? In P.
Bogaards & B. Laufer (Eds.), Vocabulary in a second language: selection,
acquisition, and testing. (pp. 15-38). Philadelphia, PA: John Benjamins.
Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34(2), 213-238.
David, A. (2008). Vocabulary breadth in French L2 learners. Language Learning
Journal, 36(2), 167-180.
Du Chazaud, H. B. (1998). Dictionnaire de synonymes et contraires. Montreal, QC:
Dicorobert.
Gougenheim, G., Rivenc, P., Sauvageot, A., & Michéa, R. (1964). L'élaboration du
français fondamental (1er degré). Paris: Didier.
Gouvernement du Québec (2012). Le Grand dictionnaire terminologique [Web site].
Retrieved from http://gdt.oqlf.gouv.qc.ca/index.aspx
Fulcher, G. (2010). Practical language testing. London, UK: Hodder Education.
Fulcher, G., & Davidson, F. (2007). Language testing and assessment: An advanced
resource book. New York, NY: Routledge.
Juilland, A., Brodin, D., & Davidovitch, C. (1970). Frequency dictionary of French
words. Paris: Mouton.
Laufer, B. (1992). Reading in a foreign language: how does L2 lexical knowledge
interact with the reader's general academic ability? Journal of Research in Reading,
15(2), 95-103.
Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical
coverage, learners' vocabulary size and reading comprehension. Reading in a
Foreign Language, 22, 15-30.
Lonsdale, D., & Le Bras, Y. (2009). A frequency dictionary of French. New York, NY:
Routledge.
Meara, P. (1992). EFL Vocabulary Tests. Swansea: Centre for Applied Language Studies,
University of Wales.
Meara, P., & Buxton, B. (1987). An alternative to multiple choice vocabulary tests.
Language Testing, 4, 142-154.
Meara, P., & Jones, G. (1988). Vocabulary size as a placement indicator: In P. Grunwell
(Ed.), Applied Linguistics in Society (pp. 80-87). London: Centre for Information
on Language Teaching and Research.
Meara, P., Lightbown, P. M., & Halter, R. H. (1994). The effect of cognates on the
applicability of Yes/No vocabulary tests. The Canadian Modern Language Review,
50(2), 296-311.
Meara, P., & Milton, J. (2003). X_Lex, The Swansea Levels Test. Newbury, UK: Express.
Michel, J.-B., et al. (2011). Quantitative analysis of culture using millions of digitized
books. Science, 331, 176-182.
Milton, J. (2006). Language lite? Learning French vocabulary in school. French
Language Studies, 16, 187-205.
Milton, J. (2009). Measuring second language vocabulary acquisition. Bristol:
Multilingual Matters.
Milton, J. (2010). The development of vocabulary breadth across the CEFR levels. In I.
Bartning, M. Martin & I. Vedder (Eds.), Communicative proficiency and linguistic
development: Intersections between SLA and language testing research (pp. 211–
232). Eurosla Monographs Series, 1. Retrieved from
http://eurosla.org/monographs/EM01/211-232Milton.pdf.
Nation, I. S. P. (1983). Testing and teaching vocabulary. Guidelines, 5(1), 12-25.
Nation, I. S. P. (1990). Teaching and learning vocabulary. Boston: Heinle & Heinle.
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge:
Cambridge University Press.
Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening?
Canadian Modern Language Review, 63(1), 59-82.
Nation, I. S. P., & Beglar, D. (2007). A vocabulary size test. The Language Teacher,
31(7), 9-13.
Office québécois de la langue française. (2002). Combien de mots ? [Text]. Retrieved
from
http://www.oqlf.gouv.qc.ca/actualites/capsules_hebdo/phistoire_nombre_20030123
.html
O'Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to classroom: Language
use and language teaching. Cambridge: Cambridge University Press.
Qian, D. D. (1999). Assessing the roles of depth and breadth of vocabulary knowledge in
reading comprehension. The Canadian Modern Language Review, 56(2), 282-397.
Qian, D. D. (2002). Investigating the relationship between vocabulary knowledge and
academic reading performance: An assessment perspective. Language Learning,
52(3), 513-536.
Read, J. (1993). The development of a new measure of L2 vocabulary knowledge.
Language Testing, 10, 355-371.
Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.
Richards, B., Malvern, D., & Graham, S. (2008). Word frequency and trends in the
development of French vocabulary in lower-intermediate students during year 12 in
English schools. Language Learning, 36(2), 199-213.
Schmitt, N. (Ed.). (2000). Vocabulary in language teaching. Cambridge: Cambridge
University Press.
Schmitt, N. (2010). Researching vocabulary: A vocabulary research manual. Hampshire:
Palgrave Macmillan.
Schmitt, N., Jiang, X., & Grabe, W. (2011). The percentage of words known in a text and
reading comprehension. Modern Language Journal, 95(1), 26-43.
Schmitt, N., Schmitt, D., & Clapham, C. (2001). Developing and exploring the behaviour
of two new versions of the Vocabulary Levels Test. Language Testing, 18(1), 55-
88.
Staehr, L. S. (2008). Vocabulary size and the skills of listening, reading and writing.
Language Learning, 36(2), 139-152.
Szabó, G. (2008). Applying item response theory in language test item bank building.
Frankfurt: Peter Lang.
Tréville, M.-C. (2000). Vocabulaire et apprentissage d'une langue seconde: recherches
et théories. Montreal: Logiques.
Verlinde, S., & Selva, T. (2001). Corpus-based versus intuition-based lexicography:
defining a word list for a French learners’ dictionary. In Proceedings of the Corpus
Linguistics 2001 conference, Lancaster University, 594-598.
Wesche, M. B., & Paribakht, T. S. (1996). Assessing second language vocabulary
knowledge: depth versus breadth. Canadian Modern Language Review, 53, 13-28.
West, M. (1953). A general service list of English words. London: Longman, Green.
Wilkins, D. A. (1972). Linguistics in language teaching. Cambridge, MA: MIT Press.
Appendix A
Test de la taille du vocabulaire (TTV) - 5,000 word section
1. brouillard 1. aviser
2. coïncidence 2. capter
3. farce _____ une histoire qui fait rire 3. desservir _____ informer
4. instituteur _____ ce qui empêche de voir loin 4. empresser _____ changer le sens
5. pneu _____ un professionnel de l'éducation 5. inverser _____ formuler des conditions précises
6. soumission 6. stipuler
1. coffre 1. balancer
2. convoi 2. copier
3. duc _____ abri 3. épanouir _____ mettre en équilibre
4. inondation _____ caisse où l'on met de l'argent 4. lasser _____ faire comme l'original
5. nuance _____ différence entre deux choses semblables 5. naviguer _____ prendre une certaine partie
6. tente 6. prélever
1. affection 1. analogue
2. chancelier 2. barbare
3. fouet _____ premier ministre 3. échéant _____ pareil
4. golfe _____ un type de sentiment 4. intact _____ qui n'a pas été touché
5. lance _____ compétence et expérience 5. philosophique _____ qui cause une grande surprise
6. qualification 6. stupéfiant
1. configuration 1. amusant
2. émeute 2. boursier
3. harmonisation _____ vêtement 3. irrégulier _____ mauvais
4. pantalon _____ une masse de pierre 4. malin _____ qui n'est pas uniforme
5. rocher _____ objet qui ressemble à une boîte 5. naïf _____ qui concerne le marché financier
6. valise 6. solidaire
1. cage 1. confiner
2. panneau 2. énumérer
3. repli _____ défaut 3. imputer _____ isoler
4. sou _____ prison 4. octroyer _____ être suspendu
5. traite _____ monnaie 5. pendre _____ dresser une liste
6. vice 6. venger
1. cote 1. anormal
2. fourniture 2. convaincant
3. goutte _____ le milieu entre deux extrêmes 3. éminent _____ froid
4. hépatite _____ très petite quantité d'un liquide 4. insensible _____ bizarre
5. juridiction _____ des objets utilisés en salle de classe 5. mensuel _____ qui arrive une fois par mois
6. moyenne 6. séparatiste
La cinquième tranche de mille mots
Appendix B
Test de la taille du vocabulaire (TTV) - 5,000 word section - Answer key
1. brouillard 1. aviser
2. coïncidence 2. capter
3. farce __3__ une histoire qui fait rire 3. desservir __1__ informer
4. instituteur __1__ ce qui empêche de voir loin 4. empresser __5__ changer le sens
5. pneu __4__ un professionnel de l'éducation 5. inverser __6__ formuler des conditions précises
6. soumission 6. stipuler
1. coffre 1. balancer
2. convoi 2. copier
3. duc __6__ abri 3. épanouir __1__ mettre en équilibre
4. inondation __1__ caisse où l'on met de l'argent 4. lasser __2__ faire comme l'original
5. nuance __5__ différence entre deux choses semblables 5. naviguer __6__ prendre une certaine partie
6. tente 6. prélever
1. affection 1. analogue
2. chancelier 2. barbare
3. fouet __2__ premier ministre 3. échéant __1__ pareil
4. golfe __1__ un type de sentiment 4. intact __4__ qui n'a pas été touché
5. lance __6__ compétence et expérience 5. philosophique __6__ qui cause une grande surprise
6. qualification 6. stupéfiant
1. configuration 1. amusant
2. émeute 2. boursier
3. harmonisation __4__ vêtement 3. irrégulier __4__ mauvais
4. pantalon __5__ une masse de pierre 4. malin __3__ qui n'est pas uniforme
5. rocher __6__ objet qui ressemble à une boîte 5. naïf __2__ qui concerne le marché financier
6. valise 6. solidaire
1. cage 1. confiner
2. panneau 2. énumérer
3. repli __6__ défaut 3. imputer __1__ isoler
4. sou __1__ prison 4. octroyer __5__ être suspendu
5. traite __4__ monnaie 5. pendre __2__ dresser une liste
6. vice 6. venger
1. cote 1. anormal
2. fourniture 2. convaincant
3. goutte __6__ le milieu entre deux extrêmes 3. éminent __4__ froid
4. hépatite __3__ très petite quantité d'un liquide 4. insensible __1__ bizarre
5. juridiction __2__ des objets utilisés en salle de classe 5. mensuel __5__ qui arrive une fois par mois
6. moyenne 6. séparatiste
La cinquième tranche de mille mots
Appendix C
Questionnaire
Cochez la bonne réponse ou écrivez l’information appropriée.
1. Information personnelle
- Nom ______________________________________
- Pays d’origine _______________________________
- Langue maternelle ____________________________
- Langue parlée à la maison ______________________
- Date d'arrivée au Canada _______________________
- Genre: Masculin ☐ Féminin ☐
- Avez-vous appris le français dans votre pays d’origine? Oui ☐ Non ☐
- Parlez-vous français à l’extérieur des cours? Oui ☐ Non ☐
- En francisation, vous êtes au niveau 1 ☐ 2 ☐ 3 ☐ français écrit ☐
- Parlez-vous anglais? Oui ☐ Non ☐
2. Préférences d’apprentissage de la langue française
Apprendre la grammaire est
Très utile 1 2 3 4 5 Inutile
Apprendre du vocabulaire est
Très utile 1 2 3 4 5 Inutile
Merci d’avoir rempli ce questionnaire !
Appendix D
Two clusters used in the interviews:
(1) Easy (2K word band)
1. évidence
2. hommage
3. maître _____ commerce
4. richesse _____ professeur
5. totalité _____ beaucoup d'argent
6. trafic
(2) Difficult (10K word band)
1. bordée
2. désignation
3. loucheur _____ ce que l'on met au pied
4. peluche _____ personne qui regarde de travers
5. sandale _____ un objet très apprécié des enfants
6. vaurien
Appendix E
Word list used in the validation interviews
Lisez les mots suivants et définissez-les:
1. désir
2. mécanisme
3. distribuer
4. circuit
5. division
6. global
7. prudent
8. distinguer
9. traverser
10. fondamental
11. animal
12. joie
13. brutal
14. avertir
15. rigoureux
16. médecine
17. engendrer
18. race
19. record
20. obligatoire
21. caractériser
22. remporter
23. barre
24. cible
25. balancer
26. affection
27. insensible
28. rocher
29. coffre
30. qualification
31. valise
32. goutte
33. mensuel
34. analogue
35. pendre
36. malin
37. récrimination
38. palper
39. cadrer
40. ventilateur
41. isotherme
42. consultant
43. ordure
44. expertise
45. douanier
46. détestable
47. fragmentaire
48. inébranlable
Appendix F
INTERVIEW - TTV
Student: ________________________ Group: _________ Date: ___________ 2013
1) Good test? (Y) (N)
2) Easy cluster: Guessed? (Y) (N) Difficult cluster: Guessed (Y) (N)
3) Confirmation of lexical knowledge:
2000 3000 5000 10000
1. désir Y N 13. brutal Y N 25. balancer Y N 37. récrimination Y N
2. mécanisme Y N 14. avertir Y N 26. affection Y N 38. palper Y N
3. distribuer Y N 15. rigoureux Y N 27. insensible Y N 39. cadrer Y N
4. circuit Y N 16. médecine Y N 28. rocher Y N 40. ventilateur Y N
5. division Y N 17. engendrer Y N 29. coffre Y N 41. isotherme Y N
6. global Y N 18. race Y N 30. qualification Y N 42. consultant Y N
7. prudent Y N 19. record Y N 31. valise Y N 43. ordure Y N
8. distinguer Y N 20. obligatoire Y N 32. goutte Y N 44. expertise Y N
9. traverser Y N 21. caractériser Y N 33. mensuel Y N 45. douanier Y N
10. fondamental Y N 22. remporter Y N 34. analogue Y N 46. détestable Y N
11. animal Y N 23. barre Y N 35. pendre Y N 47. fragmentaire Y N
12. joie Y N 24. cible Y N 36. malin Y N 48. inébranlable Y N
Total Total Total Total
Total (Y):_____
Total (N):_____
Date of analysis: ____________________
Appendix G
Test de la taille du vocabulaire
Nom: __________________ Groupe: _________ Date: le _____ octobre 2013
Dans ce test de vocabulaire, vous devez trouver le mot qui correspond à la définition.
Mettez le bon numéro à côté de chaque définition choisie. Voici un exemple:
1. accomplir
2. bloquer
3. dormir _____ arrêter
4. encourager _____ aider quelqu'un
5. mêler _____ réaliser ou terminer
6. ressentir
Vous devez inscrire votre réponse de la façon suivante:
1. accomplir
2. bloquer
3. dormir __2__ arrêter
4. encourager __4__ aider quelqu'un
5. mêler __1__ réaliser ou terminer
6. ressentir
Complétez toutes les sections du test. Si vous ne connaissez pas un mot, laissez la
réponse en blanc.
La deuxième tranche de mille mots
1. concours 1. brûler
2. division 2. distinguer
3. joie _____ grand plaisir 3. examiner _____ imaginer
4. phase _____ un moyen de transport 4. mentionner _____ remarquer
5. stade _____ séparation en deux parties 5. rêver _____ détruire par le feu
6. véhicule 6. supprimer
1. autorisation 1. fondamental
2. bonjour 2. global
3. confusion _____ erreur 3. moderne _____ complet
4. faim _____ le besoin de manger 4. prudent _____ qui est la base
5. rupture _____ la maison de la justice 5. récent _____ qui ne prend pas de risques
6. tribunal 6. traditionnel
1. adapter 1. attaque
2. crier 2. contribution
3. distribuer _____ partager 3. dommage _____ institution
4. formuler _____ parler très fort 4. église _____ action violente
5. procéder _____ aller d'un côté à l'autre 5. incident _____ ensemble de pièces
6. traverser 6. mécanisme
1. bras 1. actif
2. circuit 2. inutile
3. détermination _____ tour 3. fier _____ occupé
4. match _____ principe et règle 4. majeur _____ qui ne sert à rien
5. réception _____ une partie du corps 5. puissant _____ qui a un grand pouvoir
6. théorie 6. scolaire
1. bâtiment 1. dégager
2. consultation 2. élire
3. essence _____ enquête 3. mériter _____ choisir
4. habitude _____ construction 4. persuader _____ convaincre
5. leçon _____ la façon normale de faire 5. résumer _____ raconter de façon brève
6. prestation 6. soulever
1. ambassadeur 1. chéri
2. enfance 2. décisif
3. portrait _____ conflit 3. magnifique _____ strict
4. rayon _____ première partie de la vie 4. rigoureux _____ terrible
5. trouble _____ représentant du gouvernement 5. tragique _____ celui qui est très aimé
6. vœu 6. vital
1. aube 1. douleur
2. docteur 2. minorité
3. issue _____ lever du soleil 3. permanence _____ passage
4. législation _____ science des lois 4. rédaction _____ texte écrit
5. préparation _____ article d'un journaliste 5. sagesse _____ bon sens et connaissance
6. reportage 6. transition
1. amateur 1. annuler
2. cellule 2. cultiver
3. expansion _____ petite chambre 3. défaire _____ travailler la terre
4. profil _____ un visage vu de côté 4. mentir _____ ne pas dire la vérité
5. sondage _____ des questions et des réponses 5. plaider _____ prendre la défense d'une cause
6. terrorisme 6. siéger
1. brutal 1. commenter
2. formidable 2. engendrer
3. impressionnant _____ dur 3. promouvoir _____ faire naître
4. mobile _____ juste 4. remporter _____ gagner un jeu
5. obligatoire _____ nécessaire 5. songer _____ élever à un rang supérieur
6. raisonnable 6. téléphoner
1. automobile 1. avertir
2. barre 2. blesser
3. dominant _____ espèce 3. caractériser _____ définir
4. paquet _____ objet long et étroit 4. déclencher _____ signaler
5. race _____ activité d'une personne qui voyage 5. serrer _____ provoquer un phénomène
6. tourisme 6. trahir
La troisième tranche de mille mots
1. cote 1. analogue
2. fourniture 2. barbare
3. goutte _____ le milieu entre deux extrêmes 3. échéant _____ pareil
4. hépatite _____ très petite quantité d'un liquide 4. intact _____ qui n'a pas été touché
5. juridiction _____ des objets utilisés en salle de classe 5. philosophique _____ qui cause une grande surprise
6. moyenne 6. stupéfiant
1. anormal 1. brouillard
2. convaincant 2. coïncidence
3. éminent _____ froid 3. farce _____ une histoire qui fait rire
4. insensible _____ bizarre 4. instituteur _____ ce qui empêche de voir loin
5. mensuel _____ qui arrive une fois par mois 5. pneu _____ un professionnel de l'éducation
6. séparatiste 6. soumission
1. affection 1. coffre
2. chancelier 2. convoi
3. fouet _____ premier ministre 3. duc _____ abri
4. golfe _____ un type de sentiment 4. inondation _____ caisse où l'on met de l'argent
5. lance _____ compétence et expérience 5. nuance _____ différence entre deux choses semblables
6. qualification 6. tente
1. amusant 1. cage
2. boursier 2. panneau
3. irrégulier _____ mauvais 3. repli _____ défaut
4. malin _____ qui n'est pas uniforme 4. sou _____ prison
5. naïf _____ qui concerne le marché financier 5. traite _____ monnaie
6. solidaire 6. vice
1. aviser 1. confiner
2. capter 2. énumérer
3. desservir _____ informer 3. imputer _____ isoler
4. empresser _____ changer le sens 4. octroyer _____ être suspendu
5. inverser _____ formuler des conditions précises 5. pendre _____ dresser une liste
6. stipuler 6. venger
1. amas 1. astiquer
2. dortoir 2. foisonner
3. loque _____ mots, phrases, règles 3. incarcérer _____ enfermer quelqu'un
4. huissier _____ chambre à plusieurs lits 4. marteler _____ avoir en grande quantité
5. nain _____ personne de très petite taille 5. railler _____ faire hésiter entre deux choses
6. syntaxe 6. tirailler
1. armure 1. bourrer
2. charpentier 2. crouler
3. épinette _____ espèce d'arbre 3. émanciper _____ remplir
4. granit _____ vêtement en métal 4. mijoter _____ détruire ou tomber
5. hypothèque _____ celui qui construit des immeubles 5. patauger _____ revêtir d'une couleur
6. torsade 6. teinter
1. cafard 1. culbuter
2. étang 2. décupler
3. fenouil _____ lac 3. grelotter _____ souffrir du froid
4. ouïe _____ plante 4. poncer _____ couper un arbre
5. répit _____ pause 5. pulvériser _____ briser ou réduire en morceaux
6. sangle 6. scier
1. buée 1. âpre
2. dicton 2. fulgurant
3. fût _____ ce que l'on utilise pour frapper 3. hébété _____ violent ou rapide
4. garniture _____ destiné à contenir des liquides 4. rusé _____ fort, gros et solide
5. maillet _____ homme qui marche dans les rues 5. trapu _____ qui est désagréable au toucher
6. piéton 6. utopique
1. avortement 1. ahuri
2. matrice 2. dru
3. persévérance _____ jeune soldat 3. nutritif _____ étonné
4. ravitaillement _____ morceau coupé d'un objet 4. opaque _____ qui n'est pas transparent
5. recrue _____ qualité de celui qui a de la patience 5. pondéré _____ très animé ou mouvementé
6. tronçon 6. trépidant
La cinquième tranche de mille mots
La dixième tranche de mille mots
Appendix H
La deuxième tranche de mille mots
1. concours 1. brûler
2. division 2. distinguer
3. joie __3__ grand plaisir 3. examiner __5__ imaginer
4. phase __6__ un moyen de transport 4. mentionner __2__ remarquer
5. stade __2__ séparation en deux parties 5. rêver __1__ détruire par le feu
6. véhicule 6. supprimer
1. autorisation 1. fondamental
2. bonjour 2. global
3. confusion __3__ erreur 3. moderne __2__ complet
4. faim __4__ le besoin de manger 4. prudent __1__ qui est la base
5. rupture __6__ la maison de la justice 5. récent __4__ qui ne prend pas de risques
6. tribunal 6. traditionnel
1. adapter 1. attaque
2. crier 2. contribution
3. distribuer __3__ partager 3. dommage __4__ institution
4. formuler __2__ parler très fort 4. église __1__ action violente
5. procéder __6__ aller d'un côté à l'autre 5. incident __6__ ensemble de pièces
6. traverser 6. mécanisme
1. bras 1. actif
2. circuit 2. inutile
3. détermination __2__ tour 3. fier __1__ occupé
4. match __6__ principe et règle 4. majeur __2__ qui ne sert à rien
5. réception __1__ une partie du corps 5. puissant __5__ qui a un grand pouvoir
6. théorie 6. scolaire
1. bâtiment 1. dégager
2. consultation 2. élire
3. essence __2__ enquête 3. mériter __2__ choisir
4. habitude __1__ construction 4. persuader __4__ convaincre
5. leçon __4__ la façon normale de faire 5. résumer __5__ raconter de façon brève
6. prestation 6. soulever
1. ambassadeur 1. chéri
2. enfance 2. décisif
3. portrait __5__ conflit 3. magnifique __4__ strict
4. rayon __2__ première partie de la vie 4. rigoureux __5__ terrible
5. trouble __1__ représentant du gouvernement 5. tragique __1__ celui qui est très aimé
6. vœu 6. vital
1. aube 1. douleur
2. docteur 2. minorité
3. issue __1__ lever du soleil 3. permanence __6__ passage
4. législation __4__ science des lois 4. rédaction __4__ texte écrit
5. préparation __6__ article d'un journaliste 5. sagesse __5__ bon sens et connaissance
6. reportage 6. transition
1. amateur 1. annuler
2. cellule 2. cultiver
3. expansion __2__ petite chambre 3. défaire __2__ travailler la terre
4. profil __4__ un visage vu de côté 4. mentir __4__ ne pas dire la vérité
5. sondage __5__ des questions et des réponses 5. plaider __5__ prendre la défense d'une cause
6. terrorisme 6. siéger
1. brutal 1. commenter
2. formidable 2. engendrer
3. impressionnant __1__ dur 3. promouvoir __2__ faire naître
4. mobile __6__ juste 4. remporter __4__ gagner un jeu
5. obligatoire __5__ nécessaire 5. songer __3__ élever à un rang supérieur
6. raisonnable 6. téléphoner
1. automobile 1. avertir
2. barre 2. blesser
3. dominant __5__ espèce 3. caractériser __3__ définir
4. paquet __2__ objet long et étroit 4. déclencher __1__ signaler
5. race __6__ activité d'une personne qui voyage 5. serrer __4__ provoquer un phénomène
6. tourisme 6. trahir
La troisième tranche de mille mots
98
1. cote 1. analogue
2. fourniture 2. barbare
3. goutte __6__ le milieu entre deux extrêmes 3. échéant __1__ pareil
4. hépatite __3__ très petite quantité d'un liquide 4. intact __4__ qui n'a pas été touché
5. juridiction __2__ des objets utilisés en salle de classe 5. philosophique __6__ qui cause une grande surprise
6. moyenne 6. stupéfiant
1. anormal 1. brouillard
2. convaincant 2. coïncidence
3. éminent __4__ froid 3. farce __3__ une histoire qui fait rire
4. insensible __1__ bizarre 4. instituteur __1__ ce qui empêche de voir loin
5. mensuel __5__ qui arrive une fois par mois 5. pneu __4__ un professionnel de l'éducation
6. séparatiste 6. soumission
1. affection 1. coffre
2. chancelier 2. convoi
3. fouet __2__ premier ministre 3. duc __6__ abri
4. golfe __1__ un type de sentiment 4. inondation __1__ caisse où l'on met de l'argent
5. lance __6__ compétence et expérience 5. nuance __5__ différence entre deux choses semblables
6. qualification 6. tente
1. amusant 1. cage
2. boursier 2. panneau
3. irrégulier __4__ mauvais 3. repli __6__ défaut
4. malin __3__ qui n'est pas uniforme 4. sou __1__ prison
5. naïf __2__ qui concerne le marché financier 5. traite __4__ monnaie
6. solidaire 6. vice
1. aviser 1. confiner
2. capter 2. énumérer
3. desservir __1__ informer 3. imputer __1__ isoler
4. empresser __5__ changer le sens 4. octroyer __5__ être suspendu
5. inverser __6__ formuler des conditions précises 5. pendre __2__ dresser une liste
6. stipuler 6. venger
1. amas 1. astiquer
2. dortoir 2. foisonner
3. loque __6__ mots, phrases, règles 3. incarcérer __3__ enfermer quelqu'un
4. huissier __2__ chambre à plusieurs lits 4. marteler __2__ avoir en grande quantité
5. nain __5__ personne de très petite taille 5. railler __6__ faire hésiter entre deux choses
6. syntaxe 6. tirailler
1. armure 1. bourrer
2. charpentier 2. crouler
3. épinette __3__ espèce d'arbre 3. émanciper __1__ remplir
4. granit __1__ vêtement en métal 4. mijoter __2__ détruire ou tomber
5. hypothèque __2__ celui qui construit des immeubles 5. patauger __6__ revêtir d'une couleur
6. torsade 6. teinter
1. cafard 1. culbuter
2. étang 2. décupler
3. fenouil __2__ lac 3. grelotter __3__ souffrir du froid
4. ouïe __3__ plante 4. poncer __6__ couper un arbre
5. répit __5__ pause 5. pulvériser __5__ briser ou réduire en morceaux
6. sangle 6. scier
1. buée 1. âpre
2. dicton 2. fulgurant
3. fût __5__ ce que l'on utilise pour frapper 3. hébété __2__ violent ou rapide
4. garniture __3__ destiné à contenir des liquides 4. rusé __5__ fort, gros et solide
5. maillet __6__ homme qui marche dans les rues 5. trapu __1__ qui est désagréable au toucher
6. piéton 6. utopique
1. avortement 1. ahuri
2. matrice 2. dru
3. persévérance __5__ jeune soldat 3. nutritif __1__ étonné
4. ravitaillement __6__ morceau coupé d'un objet 4. opaque __4__ qui n'est pas transparent
5. recrue __3__ qualité de celui qui a de la patience 5. pondéré __6__ très animé ou mouvementé
6. tronçon 6. trépidant
La cinquième tranche de mille mots
La dixième tranche de mille mots