What is Corpus Linguistics Assignment
8/6/2019 What is Corpus Linguistics Assignment
"Corpus, is also used in Matters of Learning, for several Works of the same Nature,
collected, join'd, and bound together. The Corpus of the Civil Law is compos'd of the
Digest, Code, and Institutes. We have also a Corpus of the Greek Poets." (Corpus, def.
3) This citation from the Oxford English Dictionary is dated 1728. It bears testament to
the use of corpora in academic study in its earliest form. It also gives us one of the
earliest senses of the term corpus: works collected, joined and bound together.
What is Corpus Linguistics?
Corpus linguistics may be viewed as the study of language using a principled
collection of texts which can be searched using concordance software to find frequent
words, collocations and lexical patterns.
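As a rough illustration of what concordance software does, the sketch below builds key-word-in-context (KWIC) lines for a search word. The sample text, the `tokenize` and `concordance` helpers, and the window width are all assumptions made for illustration; real concordancers handle sentence boundaries, punctuation and annotation far more carefully.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def concordance(tokens, node, width=3):
    """Return key-word-in-context lines: up to `width` tokens either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Illustrative sample only; a real corpus would be read from files.
sample = "We used to go every week. The boys would pass by. I used to run."
tokens = tokenize(sample)
kwic_lines = concordance(tokens, "used")
for line in kwic_lines:
    print(line)
```

Note that this simple window ignores sentence boundaries, so context words may come from a neighbouring sentence.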
It employs two major methods: a word-based method, which deals with analyzing
frequency and keyword lists, and a category-based method, which adds information to
the corpus by placing words into categories. The corpora which arise can take many
forms: they may be spoken or written; synchronic or diachronic; general or specialized;
monolingual or multilingual (parallel, comparable).1 Some examples of these are the
Brown Corpus of American English, CHILDES, the Alpino Treebank and the CRATER
multilingual project.
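The difference between the two methods can be sketched in a few lines. The example sentence and the `TOY_LEXICON` category labels below are hypothetical; an actual category-based study would rely on an annotated corpus or a trained tagger rather than a hand-made word list.

```python
from collections import Counter

# Hypothetical toy lexicon mapping words to categories (assumption for illustration).
TOY_LEXICON = {
    "the": "DET", "boys": "NOUN", "girls": "NOUN", "school": "NOUN",
    "would": "MODAL", "pass": "VERB",
}

def frequency_list(tokens):
    """Word-based method: rank word forms by how often they occur."""
    return Counter(tokens).most_common()

def categorize(tokens, lexicon):
    """Category-based method: attach a category label to every token."""
    return [(tok, lexicon.get(tok, "UNK")) for tok in tokens]

tokens = "the boys and the girls would pass by the school".split()

freqs = frequency_list(tokens)
tagged = categorize(tokens, TOY_LEXICON)
category_counts = Counter(tag for _, tag in tagged)

print(freqs[:3])
print(category_counts)
```

The word-based list only counts surface forms, whereas the category-based pass adds information (here, a coarse word class) that can then be counted or searched in its own right.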
1 The Brown Corpus (1961) is a primary example of a written corpus: one million words of written American
English.
CHILDES is one example of a specialized corpus, created for the distinct purpose of analyzing child
language acquisition (20 million plus words).
The Alpino Treebank is a non-English corpus: a monolingual Dutch corpus that is also syntactically annotated.
The CRATER Multilingual Aligned Annotated Corpus covers English, French and Spanish (15,000).
How do approaches to the study of English grammar using corpus linguistics
differ from those of other linguistic methods?
Corpus linguistics, when viewed from the perspective of other fields of linguistics,
reveals inherent advantages and inadequacies. The greatest antagonist of corpus
linguistics has been its Generativist counterpart. Generative approaches to grammar, led
by Chomskyan theory, have forced us to consider the influence of empiricism on science
and scholarship. Corpus linguistics has made great strides in producing linguistic data
that may be successfully measured and empirically recorded. It is not limited to
counting, however: it facilitates both quantitative and qualitative study.
Qualitatively, the use of corpora has allowed the linguist to consider grammar in real use,
with all its complexity and variation, separate from naïve speaker intuition. Consider
the statement that, "Interestingly, Chomsky's empiricist antecedents in American
structural linguistics, who were in principle incapable of postulating a sharp dichotomy
on the basis of observationally graded data, were forced to negate the langue/parole
distinction by regarding the former as no more than a set of 'habits' deducible directly
from speech behavior." (Hockett, 1952; Newmeyer, 1986) The corpus gives us no real
indicators of speakers' or writers' competence (langue); rather, it solely allows the thorough
investigation and description of performance (parole). Seen in Structuralist terms, it
highlights the clear dichotomy of langue versus parole.
Conversely, Chomskyan linguistics considers competence integral to an
understanding of Universal Grammar, that which is innate to the ideal speaker. Evidence of
grammars and usage is presented by the ideal speaker, stemming from his intuition. For
the corpus linguist, evidence is presented by the concordance which, while not intuitive, is
ideally a perfect speaker, not susceptible to the flaws and observational defects of the
native speaker producing language for study. By looking solely at syntax we are rationalizing the
ability of human language to be infinitely creative. The corpus aims not solely at
rationalizing that creativity but at describing its range.
What are the advantages and disadvantages of using corpus data?
The use of corpora is not without its disadvantages. Firstly, a corpus may
omit key elements that do appear in the speech of a native speaker; the linguist
overlooks them precisely because they do not appear in the corpus.
Advantages
Corpus data allow us to understand the many senses of words beyond their dictionary
meanings. They also give insight into lexical patterns and lexical grammar.
The applications of corpora to language teaching are seemingly endless. They can
allow for inductive learning, where students investigate word categories using
concordances and deduce meaning or grammatical relationships. They may also be used
to provide real examples when teaching specific topics. Liu and Jiang argue that "such
learning activities, especially the inductive type[s], motivate students and promote
discovering learning. And they are particularly effective for the acquisition of grammar
and vocabulary because they help learners to notice and retain lexico-grammatical usage
patterns better by engaging them in deeper [language] processing". Parallel concordances
may be used specifically in data-driven learning; the corpus does not teach but provides a
rich bank of useful examples (Liu & Jiang, 2009). Authentic language learning is the
main advantage of corpus use in education; ultimately, language form, meaning, and use
are learnt as an integrated whole (Larsen-Freeman, 2003; Liu & Jiang, 2009).
Furthermore, corpora, particularly monolingual and multilingual corpora, can facilitate
the easier and more successful compilation of word lists, which can inevitably propel the field of
lexicography onto new plateaus. The lexicographer can easily look for derivations, senses,
units of meaning and frequency of use (archaisms), possibly eliminating the tedious task
of collecting words on pieces of paper, as was done in the first compilation of the Oxford
English Dictionary and many subsequent works.
Give an example of a question about grammar that can be analysed using a corpus
linguistics methodology, and suggest how you would go about investigating it.
A corpus methodology may also prove useful in describing the formation of the
habitual aspect in English, with specific focus on the parameters guiding the optional
use of used to or would. Ultimately, the researcher may seek to answer the question:
are the two synonymous, or are they used under differing conditions? Tagliamonte and
Lawrence (2000), in their article "I used to dance, but I don't dance now", outline the differing
contexts in which the used to and would representations of the habitual aspect may be
used.2 It is therefore possible to use corpora to verify the actual and authentic usage of the
words and to classify these occurrences based on the parameters that Tagliamonte and
Lawrence provide, or to describe and create completely new parameters.
2 Tagliamonte notes that, inter alia:
Used to is:
i. favoured in affirmative sentences (e.g. We used to go every week.)
ii. favoured with 1st person singular subjects (e.g. I used to run all sorts of functions in the
Malcolm Club.)
iii. decidedly confined to non-stative verbs which, as with the findings concerning animate
subjects, contrasts with the assertions of Visser (1963) and Bybee et al. (1994) mentioned above.
iv. favoured with animate subjects (e.g. And a lot of folks used to say "Is your Dad down at t'
garden?"). Tagliamonte found that used to is rarely employed with inanimate subjects (see
preterite below), which is (as with iii above) at variance with the claims of Bybee et al. (1994) and Visser (1963).
v. disfavoured in negative structures. (Such structures are rare in Tagliamonte's data.) She sees this
as a corroboration of Denison's (1993: 323) suggestion that speakers may eschew the
employment of used to with negatives.
Would is:
vi. favoured with 3rd person subjects, including pronouns and complete noun phrases, both
singular and plural (e.g. He'd stand there and sing a little hymn; The boys and girls would pass by on their way to school.)
Note: Where the context disbars the use of would, the preterite is employed in its place. For
example, He drove for my dad for twenty-odd years, as opposed to He would drive for my dad
for twenty-odd years.
vii. "concentrated in contexts of short duration" (e.g., in relation to a particular motorbike
ride, Like you'd be like that with your neck.)
viii. characteristically employed within a sequence of habitual past (HP) sentences (see Tagliamonte and Lawrence, 2000 for further details).
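A first pass at such an investigation might look like the sketch below, which finds used to, would and 'd in a handful of sentences and codes each hit for the grammatical person of the word immediately before it. The example sentences are the ones quoted in the footnote; the `PATTERN` regular expression and the `person` and `code_hits` helpers are hypothetical simplifications. In particular, the word before the marker is not always the subject, 'd may stand for had, and would is not always habitual, so every hit would still need manual checking.

```python
import re
from collections import Counter

# Matches "<word> used to", "<word> would" or "<word>'d" (assumed simplification).
PATTERN = re.compile(r"\b(\w+)(?:\s+(used\s+to|would)|('d))\b", re.IGNORECASE)

def person(subject):
    """Very rough person coding based only on the word before the marker."""
    s = subject.lower()
    if s == "i":
        return "1sg"
    if s == "we":
        return "1pl"
    if s == "you":
        return "2"
    return "3"

def code_hits(sentence):
    """Return (form, person) pairs for each candidate habitual marker."""
    hits = []
    for match in PATTERN.finditer(sentence):
        marker = match.group(2) or match.group(3)
        form = "used to" if marker.lower().startswith("used") else "would"
        hits.append((form, person(match.group(1))))
    return hits

sentences = [
    "We used to go every week.",
    "I used to run all sorts of functions in the Malcolm Club.",
    "He'd stand there and sing a little hymn.",
    "The boys and girls would pass by on their way to school.",
]

counts = Counter(hit for s in sentences for hit in code_hits(s))
for (form, pers), n in counts.items():
    print(form, pers, n)
```

On a real corpus the same counts, broken down by polarity, verb type and subject animacy, could be compared against the tendencies that Tagliamonte and Lawrence report.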
WORKS CITED
Newmeyer, Frederick J. "A Chomskyan Revolution in Linguistics." Linguistic Society of
America 62.1 (1986): 1-18. JSTOR. Web. 09 Feb. 2011.
Tagliamonte, S. and Lawrence, H. (2000) "I used to dance, but I don't dance
now": The Habitual Past in English. Journal of English Linguistics, 28(4):
323-353.
http://www.yorkshiredialect.com/habitual.htm
http://odur.let.rug.nl/~vannoord/trees/
http://www.comp.lancs.ac.uk/linguistics/crater/corpus.html