Steven Saffels April 2014 An exploratory corpus study of the AP Spanish Exam


Page 1: An exploratory corpus study of the AP Spanish

Steven Saffels, April 2014

An exploratory corpus study of the AP Spanish Exam

An exploratory corpus study of the AP Spanish Exam
• The impression of the language used on the AP Spanish Exam is that it primarily consists of lexically rich but grammatically simple text.
  • Vocabulary: relatively specialized for specific topics
  • Consists mostly of simple sentences and relies on noun phrase modification
  • 86% of all verbs are in present, past, and infinitive forms.
  • Recurrent formulaic expressions are used to introduce source texts.


Introduction: What is a corpus study?

Corpus-based methodologies
• (Anthony, 2011)
• Corpus research:
  • Uses a computer program called a concordancer
  • to analyze key words, phrases & parts of words
  • in a large, representative, computerized collection of texts, called a corpus (O’Keefe, McCarthy & Carter 2007)
• Allows for very extensive, systematic and descriptive data (De Kock 2001)
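To make the idea of a concordancer concrete, here is a minimal keyword-in-context (KWIC) sketch in Python. It is an illustration only, not the AntConc software cited above: the function names `tokenize` and `kwic`, the window size, and the sample sentence are all assumptions, and the sample is not drawn from the AP Corpus.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (accented letters included)."""
    return re.findall(r"\w+", text.lower())


def kwic(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for every hit of `keyword`."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


# Hypothetical sample text, not taken from the AP Corpus.
sample = "Este artículo apareció en el diario. El artículo trata de la salud."

for left, kw, right in kwic(tokenize(sample), "artículo"):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20} [{kw}] {right}")
```

A real concordancer adds sorting, wildcard search and corpus-wide statistics; the sketch only shows the core lookup.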

Research gap
• Relatively few corpus studies in languages other than English (Parodi, 2007)
• “Gap” between corpus-based research results and pedagogical practice (Cortes 2013)

Goal of this study
The present study aspires to help redress both the lack of corpus research in Spanish and the gap between research and practice by applying corpus methodologies to a pedagogical problem from a Spanish L2 classroom:
How best to prepare high school students for success on a high-stakes, skills-based exam of proficiency in Spanish.

Research Questions
• RQ1: How representative is the AP Spanish Exam of broader usage of Spanish? Specifically, in terms of:
  • Vocabulary
  • Parts of speech
  • Verb forms
• RQ2: What are the most frequent recurrent word combinations?
  • What are the salient 3-, 4-, 5-, or 6-grams used on the exam?
  • Are there any salient tendencies in n-gram use?
• RQ3: Are the “transition phrases” suggested by a popular test-prep book used frequently on the exam?

Advanced Placement Spanish Exam
• Year-end, skills-based exam
• No vocabulary or grammar specifications
• Students must use information from authentic texts to:
  • Write a personal letter
  • Compose a synthesis essay
  • Respond orally to a simulated conversation
  • Make an oral synthesis presentation
(College Board, 2008-2013)

The AP Corpus
• Total of 10 texts
• 18,333 word tokens
• Most of the text is from the articles and radio reports used as sources for the presentational writing and speaking exercises.

A Frequency Dictionary of Spanish (Davies 2006)
• List of the 5,000 most frequent words in Spanish
• Based on a subset of the 100-million-word Corpus del Español (CDE) (Davies 2002-)
• Balanced, representative corpus:
  • Spoken/Written
  • Latin America/Spain


Lexical Analysis

Lexical Analysis: “Absent” Verbs
• 30% of the top 300 words in Davies (2006) do not appear on the AP word list.
• Of those, 41% are verbs, including many core vocabulary items for lower-level Spanish classes:
  PONER (PUT), LLAMAR (CALL), VENIR (COME), SALIR (LEAVE), VOLVER (RETURN), VIVIR (LIVE), MIRAR (LOOK), EMPEZAR (BEGIN), ENTRAR (ENTER), ENTENDER (UNDERSTAND), PEDIR (REQUEST), RECIBIR (RECEIVE), TERMINAR (FINISH), SACAR (TAKE OUT), NECESITAR (NEED), LEER (READ), ABRIR (OPEN)
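The comparison behind an "absent words" list can be sketched as a lookup of each reference-list item in the set of corpus lemmas. The snippet below is a toy illustration under stated assumptions: `reference_top` and `corpus_lemmas` are hypothetical miniature stand-ins for the frequency dictionary and the AP word list, so the shares it computes are not the 30% and 41% reported above.

```python
# Hypothetical miniature data: (lemma, part of speech) pairs standing in for
# the top of a reference frequency list, plus the lemmas attested in a corpus.
reference_top = [
    ("ser", "verb"), ("poner", "verb"), ("cosa", "noun"), ("país", "noun"),
    ("venir", "verb"), ("mundo", "noun"), ("aquel", "adjective"),
    ("ciudad", "noun"), ("tener", "verb"), ("mano", "noun"),
]
corpus_lemmas = {"ser", "país", "mundo", "ciudad", "tener", "artículo", "fuente"}


def absent_items(reference, corpus):
    """Keep the reference items whose lemma never occurs in the corpus."""
    return [(lemma, pos) for lemma, pos in reference if lemma not in corpus]


absent = absent_items(reference_top, corpus_lemmas)

# Share of the reference list that is missing from the corpus.
absent_share = len(absent) / len(reference_top)

# Share of the missing items that are verbs.
verb_share = sum(1 for _, pos in absent if pos == "verb") / len(absent)

print(f"absent: {absent_share:.0%} of the reference list")
print(f"verbs among absent items: {verb_share:.0%}")
```

The same lookup works for any part of speech; it does assume that both lists are lemmatized in the same way, which is itself a nontrivial step for Spanish verbs.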

Lexical Analysis: “Absent” Nouns
• General Nouns: COSA (THING), HOMBRE (MAN), MUJER (WOMAN), MODO (WAY), RELACIÓN (RELATIONSHIP)
• Body Parts: MANO (HAND), OJO (EYE)
• Human Relations: HIJO (SON), SEÑOR (MISTER), MADRE (MOTHER), NOSOTROS (WE), NADIE (NOBODY)
• Religion: VERDAD (TRUTH), SANTO (HOLY), DIOS (GOD)
• Time/Space: PUNTO (POINT), LADO (SIDE), NOCHE (NIGHT), PRINCIPIO (BEGINNING), PUEBLO (TOWN)

Lexical Analysis: “Absent” Adjectives
• Several of the generally common adjectives that are missing from the AP Corpus frequency list are typically pre-modifiers.
  • AQUEL (THAT): desde aquel día (from that day)
  • TAL (SUCH): hacerlo de tal manera (to do it in such a way)
  • PROPIO (OWN): tiene su propio estilo (has his own style)
  • NINGÚN (NONE): no hay ningún problema (there’s no problem)
  • CUALQUIER (ANY): puede hacer cualquier cosa (can do anything)
  • ÚNICO (ONLY): ¿Usted es el único hijo? (You are the only son?)

Lexical Analysis: Unusually frequent terms
• Terms used to introduce source texts for the presentational writing and speaking activities
• Not extremely salient for the student taking the exam: referential information, not necessary for interpreting the texts
• May be helpful to guide students in quickly selecting appropriate strategies to make the most efficient use of time
  FUENTE (SOURCE), DIARIO (NEWSPAPER), INFORME (REPORT), ARTÍCULO (ARTICLE), APARECER (APPEAR), RADIO (RADIO), EMITIR (BROADCAST), SIGUIENTE (FOLLOWING), TITULADO (TITLED), TEXTO (TEXT), CONVERSACIÓN (CONVERSATION), IMPRESO (PRINTED), PERIÓDICO (NEWSPAPER), ADAPTACIÓN (ADAPTATION), GRABACIÓN (RECORDING)


Lexical Analysis: Example of bicicleta

Lexical Analysis: Thematic Vocabulary
• Geography: país (country), mundo (world), ciudad (city), español (Spanish), lengua (language), mundial (worldwide), idioma (language), estado (state)

• Environment: cambio (change), climático (climate), invierno (winter), oso (bear), ave (bird), combustible (fuel), nieve (snow), calentamiento (warming)

• Wellbeing: agua (water), físico (physique), salud (health), organismo (body), risa (laughter), peso (weight), alimento (food), kilómetros (kilometers)

• Technology: computadora (computer), internet (Internet), digital (digital), electrónico (electronic), red (network), tecnología (technology), virtual (virtual)

• Fine arts: arte (art), música (music), orquesta (orchestra), artista (artist), producción (production), pintura (painting), músico (musician), lienzo (canvas)

• Education: educación (education), niño (child), joven (young person), calidad (quality), escuela (school), estudio (study), alumno (student), clase (class)


Grammatical Analysis: Word Class

WORD CLASS      AP TOKENS   AP %     CDE TOKENS   CDE %
PREPOSITIONS    3,079       22.79%   5,553,520    24.72%
ARTICLES        2,544       18.83%   4,643,039    20.67%
CONJUNCTIONS    1,371       10.15%   3,781,609    16.83%
PRONOUNS        608         4.50%    2,046,356    9.11%
VERBS           1,213       8.98%    1,928,260    8.58%
ADVERBS         566         4.19%    1,764,952    7.86%
COMMON NOUNS    2,435       18.02%   1,459,968    6.50%
ADJECTIVES      1,294       9.58%    614,069      2.73%
PROPER NOUNS    296         2.19%    365,057      1.62%
NUMERALS        100         0.74%    246,519      1.10%
INTERJECTIONS   5           0.04%    64,277       0.29%
TOTAL           13,511      100%     22,467,626   100%
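The percentages in the word class table can be recomputed from the raw token counts, which also makes the largest gaps between the two corpora easy to spot. The sketch below uses the counts from the table; the variable names and the "difference in percentage points" measure are illustrative choices, not necessarily the comparison used in the study.

```python
# Token counts per word class, copied from the table above.
ap = {
    "prepositions": 3079, "articles": 2544, "conjunctions": 1371,
    "pronouns": 608, "verbs": 1213, "adverbs": 566, "common nouns": 2435,
    "adjectives": 1294, "proper nouns": 296, "numerals": 100, "interjections": 5,
}
cde = {
    "prepositions": 5553520, "articles": 4643039, "conjunctions": 3781609,
    "pronouns": 2046356, "verbs": 1928260, "adverbs": 1764952,
    "common nouns": 1459968, "adjectives": 614069, "proper nouns": 365057,
    "numerals": 246519, "interjections": 64277,
}


def shares(counts):
    """Convert raw counts to percentages of the corpus total."""
    total = sum(counts.values())
    return {word_class: 100 * n / total for word_class, n in counts.items()}


ap_pct = shares(ap)
cde_pct = shares(cde)

# Difference in percentage points: positive means over-represented in the AP Corpus.
diff = {word_class: ap_pct[word_class] - cde_pct[word_class] for word_class in ap}

for word_class in sorted(diff, key=diff.get, reverse=True):
    print(f"{word_class:<14} AP {ap_pct[word_class]:6.2f}%  "
          f"CDE {cde_pct[word_class]:6.2f}%  diff {diff[word_class]:+6.2f}")
```

On these counts the largest positive gap is for common nouns and the largest negative gap is for conjunctions, which fits the reading of the exam texts as noun-heavy with few conjunctions.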

Grammatical Analysis: Register in CDE

WORD CLASS      ACADEMIC   NEWS      FICTION   ORAL
COMMON NOUNS    241,116    222,729   209,619   169,680
PREPOSITIONS    162,221    156,788   132,089   118,329
ARTICLES        153,504    139,484   124,272   105,910
VERBS           116,308    136,359   187,788   183,306
ADJECTIVES      90,953     72,177    58,305    50,667
CONJUNCTIONS    72,917     76,856    97,745    116,953
PROPER NOUNS    53,161     57,932    22,147    28,177
ADVERBS         27,754     37,160    55,152    79,902
PRONOUNS        24,821     32,464    65,805    73,150
NUMERALS        6,125      8,705     5,426     9,434
INTERJECTIONS   93         286       818       8,134
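One rough way to read the register table is to compare the ratio of common nouns to verbs in each CDE register with the same ratio in the AP Corpus. The sketch below is an illustrative calculation, not necessarily a measure used in the study. The CDE counts come from the register table and the AP counts (2,435 common nouns, 1,213 verbs) from the word class table; the variable names are assumptions.

```python
# (common nouns, verbs) per CDE register, from the register table above.
registers = {
    "academic": (241116, 116308),
    "news": (222729, 136359),
    "fiction": (209619, 187788),
    "oral": (169680, 183306),
}

# (common nouns, verbs) in the AP Corpus, from the word class table.
ap_nouns, ap_verbs = 2435, 1213


def noun_verb_ratio(nouns, verbs):
    """Common nouns per verb: higher values indicate a more nominal style."""
    return nouns / verbs


ap_ratio = noun_verb_ratio(ap_nouns, ap_verbs)
ratios = {name: noun_verb_ratio(n, v) for name, (n, v) in registers.items()}

# Register whose ratio lies closest to the AP Corpus ratio.
closest = min(ratios, key=lambda name: abs(ratios[name] - ap_ratio))

for name, ratio in ratios.items():
    print(f"{name:<9} {ratio:.2f}")
print(f"AP Corpus {ap_ratio:.2f} (closest register: {closest})")
```

On these counts the academic register has the ratio closest to the AP Corpus. A single ratio is a coarse indicator, so it is best read alongside the full word class profile.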

Grammatical Analysis: Verb Forms

VERB FORM             AP      AP %     CDE         CDE %
PRESENT               747     61.58%   1,190,971   37.52%
INFINITIVE            211     17.39%   459,890     14.49%
PRETERITE             126     10.39%   386,218     12.17%
IMPERFECT             27      2.23%    443,182     13.96%
PAST PARTICIPLE       53      4.37%    259,488     8.18%
GERUND                4       0.33%    107,727     3.39%
CONDITIONAL           13      1.07%    57,225      1.80%
FUTURE                12      0.99%    67,040      2.11%
SUBJUNCTIVE-PRESENT   17      1.40%    126,093     3.97%
SUBJUNCTIVE-PAST      3       0.25%    73,073      2.30%
SUBJUNCTIVE-FUTURE    0       0.00%    3,141       0.10%
TOTALS                1,213   100%     3,174,048   100%


Beyond the word: Lexical Bundles & N-grams

Beyond the word: N-grams Structure & Function

N   FREQ   RANGE   N-GRAM                             ENGLISH                    STRUCTURE              FUNCTION      SUBCATEGORY
6   10     5       apareció en el sitio de internet   appeared on the website    Verb Phrase fragment   Referential   Intangible framing
4   15     8       este artículo apareció en          this article appeared in   Verb Phrase fragment   Referential   Intangible framing
3   20     10      artículo apareció en               article appeared in        Verb Phrase fragment   Referential   Intangible framing
3   19     7       apareció en el                     appeared on the            Verb Phrase fragment   Referential   Intangible framing
3   14     5       el sitio de                        the site of                Noun Phrase fragment   Referential   Intangible framing
3   14     5       en el sitio                        on the site                Prep Phrase fragment   Referential   Identification/Focus
3   14     5       sitio de internet                  internet site              Noun Phrase fragment   Referential   Identification/Focus
3   11     10      informe de la                      report from the            Noun Phrase fragment   Referential   Identification/Focus
3   11     5       se presentó en                     was presented on           Verb Phrase fragment   Referential   Intangible framing

Beyond the word: Lexical bundle
• Lexical bundle: an n-gram that occurs a certain number of times across a certain number of texts in a corpus
  • Cut-off numbers are determined by the type of corpus and the length of the n-gram
• Based on these criteria, the six-word expression apareció en el sitio de internet (appeared on the website) can be considered a lexical bundle for this corpus.

Beyond the word: Salient N-Grams
• Empirically identified, frequency-based expressions which could be salient for the examinee and therefore useful for interpreting the texts:
  • todo el mundo (the whole world)
  • a través de (throughout)
  • por ciento de (percent of)
  • una de las (one of the)
  • se trata de (is about)
  • cuál es el (what is the…?)
  • de enero de (of January of)
  • de noviembre de (of November of)
  • en la ciudad (in the city)


Transitions

Transitions: AP Spanish: Preparing for the Language Examination
• One of the most popular textbooks for the AP Spanish course
• Contains an exhaustive list of transition words and phrases
• Very few of these appear in the AP Corpus

Transitions: Frequency

TRANSITION      ENGLISH         FREQ
que             that            565
y               and             482
como            like, as        84
o               or              57
pero            but             49
también         also            34
si              if              33
cuando          when            30
porque          because         26
durante         during          18
según           according to    17
además          in addition     14
para que        so that         12
por ejemplo     for example     11
sobre todo      above all       11
aunque          although        9
entonces        then            8
sin embargo     however         8
mientras        while           7
o sea           that is         7
ya que          since           7
al + inf        upon + -ing     5
sino            but rather      5
a partir de     as of           4
como si         as if           4
luego           later, then     4
primero         first           4
sino que        but rather      3
tampoco         neither         3
una vez que     once            3
tanto… como…    as much… as…    3
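Counting how often each listed transition occurs in a text can be sketched with a whole-word phrase search. The example below is an illustration under stated assumptions: the function name `count_phrases`, the sample sentence and the short phrase list are hypothetical, and the study itself may have counted transitions differently (for example, by reading concordance lines).

```python
import re


def count_phrases(text, phrases):
    """Count whole-word, case-insensitive occurrences of each phrase in the text."""
    lowered = text.lower()
    counts = {}
    for phrase in phrases:
        # Lookarounds stop a phrase from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(phrase.lower()) + r"(?!\w)"
        counts[phrase] = len(re.findall(pattern, lowered))
    return counts


# Hypothetical sample text, not taken from the AP Corpus.
sample = (
    "Sin embargo, el clima cambia. Por ejemplo, la nieve llega tarde; "
    "sin embargo, el oso duerme."
)

counts = count_phrases(sample, ["sin embargo", "por ejemplo", "aunque"])
for phrase, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{phrase:<12} {n}")
```

Each phrase is counted independently, so a hit for "sino que" would also count as a hit for "sino", and a pure string search cannot tell que as a transition from its other uses. Patterns with an open slot, such as al + infinitive or tanto… como…, need part-of-speech information or manual checking.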

As we have seen…
• The impression of the language used on the AP Spanish Exam is that it primarily consists of lexically rich but grammatically simple text.
  • High frequency of relatively obscure & specific vocabulary items
  • Many common “general” vocabulary items are missing
  • Texts consist mostly of simple sentences with few conjunctions
  • Communication relies on noun phrase modification (academic register)
  • 83% of all verbs in present, infinitive or preterite forms
  • Recurrent word combinations are primarily used to introduce source texts.

Pedagogical implications: Vocabulary
• In order to successfully interpret the tasks on the AP Spanish Exam, students must possess a broad vocabulary that is strongly rooted in, but extends well beyond, the most frequent lexical items in the language.

• An AP student’s vocabulary should include a variety of synonyms, especially a wide range of nouns related to specific themes that express concrete entities and abstract concepts.

Pedagogical implications: Grammar & Discourse
• Present, Preterite & Imperfect tenses along with the Infinitive account for:
  • 86% of all verbs in the AP Corpus
  • 78% of all verbs in the Corpus del Español

• The most important grammatical focus for the AP class might well be that of the noun phrase.

• Complex verb tenses should not be the organizing factor for an upper-level Spanish curriculum

Selected References:
• Anderson, N. J. (2014). Developing Engaged Second Language Readers. In M. Celce-Murcia, D. M. Brinton, & M. A. Snow (Eds.), Teaching English as a Second or Foreign Language. 4th ed. (pp. 170-188). Boston: Heinle Cengage.
• Anthony, L. (2011). AntConc (Version 3.2.4w) [Computer Software]. Tokyo, Japan: Waseda University. Available from http://www.antlab.sci.waseda.ac.jp/
• Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken and Written English. Essex, England: Longman.
• College Board. (2008-2013). AP Spanish Language Exam: Free-Response Questions. Retrieved from http://apcentral.collegeboard.com/apc/public/courses/teachers_corner/221848.html.
• Cortes, V. (2013, January). Waiting for the revolution. Plenary talk presented at the Conference for the American Association of Corpus Linguistics (AACL), San Diego, California, USA.
• Davies, M. (2002-). Corpus del Español: 100 million words, 1200s-1900s. Available online at http://corpusdelespanol.org.
• Davies, M. (2006). A Frequency dictionary of Spanish: Core vocabulary for learners. New York: Routledge.
• De Kock, J. (2001). [Preface]. In J. De Kock (Ed.), Gramática española: Enseñanza e investigación (Vol. 7. Lingüística con corpus). (pp. 7-8). Salamanca: Ediciones Universidad de Salamanca.
• Díaz, J. M. (2014). AP Spanish: Preparing for the Language and Culture Examination. Boston: Pearson Education.
• Parodi, G. (2007). Catching up with corpus linguistics: Register-diversified studies from different corpora in different Spanish-speaking countries. In G. Parodi (Ed.), Working with Spanish Corpora. (pp. 1-10). New York: Continuum.
• Tracy-Ventura, N., Cortes, V., & Biber, D. (2007). Lexical bundles in speech and writing. In G. Parodi (Ed.), Working with Spanish Corpora. (pp. 217-231). New York: Continuum.