What is Corpus Linguistics Assignment

Upload: shivana-allen

Post on 07-Apr-2018


  • 8/6/2019 What is Corpus Linguistics Assignment

    1/6

    "Corpus, is also used in Matters of Learning, for several Works of the same Nature, collected, join'd, and bound together. The Corpus of the Civil Law is compos'd of the Digest, Code, and Institutes. We have also a Corpus of the Greek Poets." ("Corpus," def. 3)

    This citation from the Oxford English Dictionary is dated 1728. It bears witness to the use of corpora in academic study in its earliest form. It also gives us one of the earliest senses of the term corpus: works collected, joined and bound together.

    What is Corpus Linguistics?

    Corpus linguistics may be viewed as the study of language using a principled collection of texts, which can be searched using concordance software to find frequent words, collocations and lexical patterns.
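    As a rough illustration of what concordance software does, the sketch below prints keyword-in-context (KWIC) lines for a search word. It is a minimal sketch, not a description of any particular concordancer: the function names `tokenize` and `concordance`, the `width` parameter, and the three-sentence sample text are all assumptions made for the example, and a real study would search a principled corpus rather than a toy string.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def concordance(tokens, keyword, width=3):
    """Return one keyword-in-context line per hit, with `width` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines

# Toy sample text (an assumption for illustration only, not a real corpus).
sample = ("We used to go every week. The boys and girls would pass by "
          "on their way to school. I used to run all sorts of functions.")
tokens = tokenize(sample)

for line in concordance(tokens, "used"):
    print(line)
```

    Reading the hits as aligned lines is what lets the analyst spot recurring neighbours of the search word, which is the starting point for identifying collocations and lexical patterns.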

    It employs two major methods: a word-based method, which analyzes frequency lists and keyword lists, and a category-based method, which adds information to the corpus by placing words into categories. The corpora which arise can take many forms: they may be spoken or written, synchronic or diachronic, general or specialized, monolingual or multilingual (parallel or comparable).1 Some examples are the Brown Corpus of American English, CHILDES, the Alpino Treebank and the CRATER multilingual project.
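    The word-based method can be sketched in a few lines: count word frequencies in a study corpus, then rank words by how unusually frequent they are relative to a reference corpus. The sketch below uses the log-likelihood statistic as the keyness measure, which is one common choice rather than the only one. The function names and the two toy "corpora" are assumptions for illustration; note also that this statistic is large for words that are unusually rare as well as unusually frequent, so real keyword tools also report the direction of the difference.

```python
import math
import re
from collections import Counter

def frequency_list(text):
    """Count lowercase word tokens in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def log_likelihood(a, b, c, d):
    """Keyness of a word seen `a` times in a study corpus of `c` tokens
    and `b` times in a reference corpus of `d` tokens."""
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_study)
    if b > 0:
        ll += b * math.log(b / expected_ref)
    return 2 * ll

# Toy texts standing in for a study corpus and a reference corpus (assumptions).
study = "the child wants the ball the child sees the dog"
reference = "the committee approved the report and the minister signed the bill"

study_freq = frequency_list(study)
ref_freq = frequency_list(reference)
study_size = sum(study_freq.values())
ref_size = sum(ref_freq.values())

# Rank study-corpus words by keyness, highest first.
keywords = sorted(
    study_freq,
    key=lambda w: log_likelihood(study_freq[w], ref_freq[w], study_size, ref_size),
    reverse=True,
)

print(study_freq.most_common(3))
print(keywords[:3])
```

    In this toy data the most frequent word and the most "key" word differ: a function word such as "the" tops the frequency list but is equally common in the reference text, so it scores low on keyness.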

    How do approaches to the study of English grammar using corpus linguistics differ from those of other linguistic methods?

    1 The Brown Corpus (1960s) is a primary example of a written corpus: one million words of written American English.
    CHILDES is an example of a specialized corpus, created for the distinct purpose of analyzing child language acquisition (20 million plus words).
    The Alpino Treebank is a non-English (Dutch) monolingual corpus; it is also syntactically annotated.
    The CRATER Multilingual Aligned Annotated Corpus covers English, French and Spanish (15,000).


    Corpus linguistics, when viewed from the perspective of other fields of linguistics, reveals inherent advantages and inadequacies. Its greatest antagonist has been its generativist counterpart. Generative approaches to grammar, led by Chomskyan theory, have forced us to consider the influence of empiricism on science and scholarship. Corpus linguistics has made great strides in producing linguistic data that can be measured and recorded empirically. It nevertheless facilitates both quantitative and qualitative study.

    Qualitatively, the use of corpora has allowed the linguist to consider grammar in real use, with all its complexity and variation, separate from naïve speaker intuition. Consider Hockett's statement that, "Interestingly, Chomsky's empiricist antecedents in American structural linguistics, who were in principle incapable of postulating a sharp dichotomy on the basis of observationally graded data, were forced to negate the langue/parole distinction by regarding the former as no more than a set of 'habits' deducible directly from speech behavior" (Hockett, 1952; Newmeyer, 1986). The corpus gives us no real indicators of speakers' or writers' competence (langue); rather, it solely allows the thorough investigation and description of performance (parole). In the Structuralist mode, we see that it highlights the clear dichotomy of langue versus parole.

    Conversely, Chomskyan linguistics considers competence integral to an understanding of Universal Grammar, which is taken to be innate to the ideal speaker. Evidence of grammar and usage is supplied by the ideal speaker and stems from his intuition. For the corpus linguist, evidence is supplied by the concordance, which, while not intuitive, is ideally a perfect speaker, not susceptible to the flaws and observational defects of a native speaker producing language for study. By looking solely at syntax we are rationalizing the ability of human language to be infinitely creative. The corpus aims not solely at rationalizing but at describing the range of that creativity.

    What are the advantages and disadvantages of using corpus data?

    The use of corpora is not without its disadvantages. Firstly, a corpus may omit key elements of the language: the linguist overlooks them because they do not appear in the corpus, even though they do appear in the speech of a native speaker.

    Advantages

    Corpus data allows us to understand the many senses of words beyond their dictionary meanings. It also gives insight into lexical patterns and lexical grammar.

    The applications of corpora to language teaching are seemingly endless. Corpora can allow for inductive learning, where students investigate word categories using concordances and deduce meaning or grammatical relationships. They may also be used to provide real examples when teaching specific topics. Liu and Jiang argue that "such learning activities, especially the inductive type[s], motivate students and promote discovery learning. And they are particularly effective for the acquisition of grammar and vocabulary because they help learners to notice and retain lexico-grammatical usage patterns better by engaging them in deeper [language] processing." Parallel concordances may be used specifically in data-driven learning; the corpus does not teach but provides a rich bank of useful examples (Liu & Jiang, 2009). Authentic language learning is the main advantage of corpus use in education; ultimately, language form, meaning, and use are learnt as an integrated whole (Larsen-Freeman, 2003; Liu & Jiang, 2009).

    Furthermore, corpora, particularly monolingual and multilingual corpora, can facilitate easier and more successful compilation of word lists, which can inevitably propel the field of lexicography to new heights. The lexicographer can easily look for derivations, senses, units of meaning and frequency of use (archaisms, for instance), possibly eliminating the tedious task of collecting words on slips of paper, as was done for the first edition of the Oxford English Dictionary and many subsequent works.

    Give an example of a question about grammar that can be analysed using a corpus

    linguistics methodology, and suggest how you would go about investigating it.

    A corpus methodology may also prove useful in describing the formation of the habitual aspect in English, with specific focus on the parameters guiding the optional usage of used to or would. Ultimately, the researcher may seek to answer the question of whether the two are synonymous or are used under differing conditions. Tagliamonte and Lawrence (2000), in their article "I used to dance, but I don't dance now", outline the differing contexts in which the used to and would representations of the habitual aspect may be used.2 It is therefore possible to use corpora to verify the actual, authentic usage of the words and to classify these occurrences based on the parameters that Tagliamonte and Lawrence provide, or to describe and create completely new parameters.

    2 Tagliamonte notes that, inter alia:

    • Used to is:
    i. favoured in affirmative sentences (e.g. We used to go every week.)
    ii. favoured with 1st person singular subjects (e.g. I used to run all sorts of functions in the Malcolm Club.)
    iii. decidedly confined to non-stative verbs which, as with the findings concerning animate subjects, contrasts with the assertions of Visser (1963) and Bybee et al. (1994) mentioned above.

    iv. favoured with animate subjects (e.g. And a lot of folks used to say "Is your Dad down at t' garden?"). Tagliamonte found that used to is rarely employed with inanimate subjects (see preterite below), which is (as with iii above) at variance with the claims of Bybee et al. (1994) and Visser (1963).
    v. disfavoured in negative structures. (Such structures are rare in Tagliamonte's data.) She sees this as a corroboration of Denison's (1993: 323) suggestion that speakers may eschew the employment of used to with negatives.

    • Would is:
    vi. favoured with 3rd person subjects, including pronouns and complete noun phrases, both singular and plural (e.g. He'd stand there and sing a little hymn; The boys and girls would pass by on their way to school.).
    Note: Where the context disbars the use of would, the preterite is employed in its place. For example, He drove for my dad for twenty-odd years, as opposed to He would drive for my dad for twenty-odd years.
    vii. "concentrated in contexts of short duration" (e.g., in relation to a particular motorbike ride, Like you'd be like that with your neck.)
    viii. characteristically employed within a sequence of HP sentences (see Tagliamonte and Lawrence, 2000 for further details).
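    The investigation proposed in the main text (finding occurrences of used to and would, then classifying them by parameter) could begin along the lines of the sketch below, which tallies hits by marker and by grammatical person of the subject. Everything here is an assumption made for illustration: the regular expression, the `person` heuristic (it looks only at the single word before the marker) and the function names are hypothetical, and the four input sentences are simply the examples quoted in footnote 2, so the output illustrates the method and is not evidence for the claims. The clitic 'd can also stand for had, and parameters such as negation, stativity, animacy and duration would still need manual coding.

```python
import re
from collections import Counter

# Hypothetical pattern: one subject word, then "used to", "would" or clitic "'d".
MARKER = re.compile(
    r"\b(?P<subj>[A-Za-z]+)(?:\s+(?P<full>used\s+to|would)|(?P<clitic>'d))\b",
    re.IGNORECASE,
)

def person(subject):
    """Crude person label based only on the word directly before the marker."""
    s = subject.lower()
    if s in {"i", "we"}:
        return "1st"
    if s == "you":
        return "2nd"
    return "3rd"

def classify(sentences):
    """Count (marker, person) pairs over all matches in the sentences."""
    counts = Counter()
    for sentence in sentences:
        for m in MARKER.finditer(sentence):
            full = (m.group("full") or "").lower()
            marker = "used to" if full.startswith("used") else "would"
            counts[(marker, person(m.group("subj")))] += 1
    return counts

# Example sentences taken from footnote 2 (not a corpus sample).
examples = [
    "We used to go every week.",
    "I used to run all sorts of functions in the Malcolm Club.",
    "He'd stand there and sing a little hymn.",
    "The boys and girls would pass by on their way to school.",
]

for (marker, pers), n in sorted(classify(examples).items()):
    print(marker, pers, n)
```

    Run over a real corpus, a table of this kind would let the researcher check whether the distribution by person matches the tendencies Tagliamonte and Lawrence report, and the matched lines could then be exported for manual coding of the remaining parameters.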


    WORKS CITED

    Newmeyer, Frederick J. "Has There Been a 'Chomskyan Revolution' in Linguistics?" Language 62.1 (1986): 1-18. JSTOR. Web. 09 Feb. 2011.

    Tagliamonte, S. and Lawrence, H. (2000) "'I used to dance, but I don't dance now': The Habitual Past in English." Journal of English Linguistics, 28(4): 323-353.

    http://www.yorkshiredialect.com/habitual.htm

    http://odur.let.rug.nl/~vannoord/trees/

    http://www.comp.lancs.ac.uk/linguistics/crater/corpus.html