International Journal of Corpus Linguistics 14:3 (2009), 393–417. doi 10.1075/ijcl.14.3.05flo issn 1384–6655 / e-issn 1569–9811 © John Benjamins Publishing Company

Applying corpus linguistics to pedagogy
A critical evaluation*

Lynne Flowerdew
Hong Kong University of Science and Technology

This article reviews and discusses four somewhat contentious issues in the application of corpus linguistics to pedagogy, ESP in particular. Corpus linguistic techniques have been criticized on the grounds that they encourage a more bottom-up rather than top-down processing of text in which concordance lines are examined atomistically. One criticism levelled against corpus data is that a corpus presents language out of its original context. For this reason, some corpus linguists have underscored the importance of ‘pedagogic mediation’ to contextualize the data for the students’ own writing environment. Concerns relating to the inductive approach associated with corpus-based pedagogy have also been raised as this approach may not always be the most appropriate one. A final consideration relates to the issue of whether a corpus is always the most appropriate resource to use among the wealth of other resources available.

Keywords: top-down, decontextualisation, pedagogic processing, inductive, data-driven learning

1. Introduction

Corpus linguistics is usually associated with a phraseological approach to analysis, which takes a syntagmatic, as opposed to a purely paradigmatic, view of language. Corpus analysis, in fact, gives both a paradigmatic and syntagmatic view of language as concordance output can either be “read” vertically, i.e. paradigmatically, in line with the slot and filler notion espoused by substitution tables, or horizontally, i.e. syntagmatically, from a phraseological perspective. Following Sinclair (1999, 2004a) the lexical item has primacy, with its core meaning and semantic prosody as obligatory categories, and collocation, colligation and semantic preference considered as optional categories. An interweaving of some or all of these categories gives what Sinclair refers to as an ‘extended unit of meaning’, although in later work Sinclair (2004b: 280) extends this concept to the ‘maximal approach’ which:



“… would be to extend the dimensions of a unit of meaning until all the relevant patterning was included – all the patterning that was instigated by the presence of the central word … [W]e should extend the unit until the ambiguity disappears.” In this phraseological approach, recurring patterns in concordance output have revealed how language can follow certain tendencies, according to Sinclair’s notion of ‘an extended unit of meaning’, rather than being bound by hard-and-fast rules.

By way of example, Danielsson (2007: 18) presents a methodology using raw frequency data as a step towards the identification of meaningful units, for which, as she points out, “there are no accepted answers to simple questions such as ‘What exactly constitutes a multi-word unit?’ or ‘Where does a multi-word unit begin and end?’”. Taking the word jam as the node, Danielsson identified its most frequent collocates in the BNC, with traffic the top collocate. Concordance lines for the node and its collocate were then generated with a span of 4+4 words either side of the node. This investigation showed a to be the most frequent collocate in the lines including both traffic and jam, e.g. stuck in a traffic jam with your pulse; man in a traffic jam who curses (ibid.: 19). A further calculation was made to find out the most frequent collocate in the lines for traffic and jam with the second collocate, a, generated. This showed in to be the next most frequent collocate, e.g. stuck in a traffic jam you might reflect (ibid.: 20). This procedure was repeated until no other collocates were found to occur above the cut-off point of 5 (an arbitrary point that Danielsson admits may have to be revised). The search ended with the unit stuck in a traffic jam: “From here, no other collocates occur with sufficient frequency to reach the cut-off point, and we may claim to have achieved the maximal unit [my italics] based on the distribution in this corpus” (p. 20). Danielsson then considers the paradigmatic axis, testing each word to see if the unit allows for any alternatives, finding that the corpus offers sitting, waiting and caught as alternatives to stuck. It is to be noted that these alternatives reflect the syntagmatic nature of semantic prosody (the frequent co-occurrence of a lexical item with items expressing a positive or negative evaluation) and semantic preference (the frequent co-occurrence of an item with those from a particular semantic set): “In this particular context, the words seem to be related to offer a set of verbs that create a feeling associated with the annoying event of being held up in traffic” (p. 20).
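The iterative procedure described above can be sketched in a few lines of Python. This is only an illustrative approximation, not Danielsson's own implementation: the function names, the toy corpus, and the reading of the cut-off point of 5 as a raw count of concordance lines are all assumptions made here for the sake of the example.

```python
from collections import Counter

def windows(sentences, node, span=4):
    """Return one context window per occurrence of `node` (span words either side)."""
    out = []
    for sent in sentences:
        for i, tok in enumerate(sent):
            if tok == node:
                out.append(sent[max(0, i - span): i] + sent[i + 1: i + 1 + span])
    return out

def grow_unit(sentences, node, span=4, cutoff=5):
    """Greedily add the most frequent collocate until none reaches `cutoff`.

    Assumption: `cutoff` is a raw number of concordance lines.
    """
    lines = windows(sentences, node, span)
    unit = [node]
    while True:
        counts = Counter()
        for line in lines:
            # Count each candidate once per line; skip words already in the unit.
            counts.update(set(line) - set(unit))
        if not counts:
            break
        # Ties are broken alphabetically so that the result is deterministic.
        word, freq = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
        if freq < cutoff:
            break
        unit.append(word)
        # Keep only the lines that contain the newly added collocate.
        lines = [line for line in lines if word in line]
    return unit

# Hypothetical toy data, invented purely for illustration.
toy_corpus = (
    [["stuck", "in", "a", "traffic", "jam"]] * 5
    + [["in", "a", "traffic", "jam"]]
    + [["a", "traffic", "jam"]]
    + [["traffic", "jam"]]
    + [["strawberry", "jam"]] * 2
)
print(grow_unit(toy_corpus, "jam"))  # ['jam', 'traffic', 'a', 'in', 'stuck']
```

On this toy data the collocates are added in the same order as in the account above (traffic, a, in, stuck), and the search stops when no remaining word reaches the cut-off. The paradigmatic step (finding sitting, waiting and caught as alternatives to stuck) is not covered by the sketch.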

Some researchers (cf. Tognini-Bonelli 2001) see this phraseological approach associated with corpus linguistics as a catalyst in redefining aspects of linguistic theory. This is regarded as the corpus-driven approach (as opposed to the corpus-based one), in which the data is approached without any pre-conceived notions in relation to how it should be analysed. Other corpus linguists take a less extreme approach; for example, McEnery et al. (2006: 6), while considering corpus linguistics as “a new philosophical approach to linguistic enquiry” with its own theoretical status, do not view it as a discipline in its own right with its own theory, but rather as a methodology. The same position can be applied to data-driven learning (DDL): it is generally agreed there is no underlying theory as such, but rather it rests on a methodology which can uncover facts about language hitherto unexplored.

Notwithstanding the advantages of this approach for DDL, during the last few years some accounts in the literature have adopted a more critical stance, drawing attention to potential drawbacks of using corpora in DDL. This paper reviews the following key issues in the debate on applying corpus linguistics to pedagogy:

– Corpus linguistic techniques encourage a more bottom-up rather than top-down processing of text, in which truncated concordance lines are examined atomistically.

– Corpus data are decontextualised and, for this reason, may not be directly transferable to students’ own context of writing.

– Corpus-based learning is usually associated with an inductive approach to learning, in which rules, or indeed patterns, are derived from multiple examples, rather than a rule-based deductive approach.¹ This approach might not be the most appropriate choice for some students.

– There are different types of corpora (general, specialized, learner) and different types of online resources (dictionaries, grammars). Students may have difficulty in selecting the most appropriate corpus and resource for a particular query.

The issues outlined above are not, in fact, discrete issues but inter-related, as the following discussion shows. They are examined specifically with reference to corpora of written text.

2. Corpus linguistic techniques encourage a bottom-up processing of text

Corpus linguistic techniques have been criticized for encouraging a more bottom-up rather than top-down processing of text, in which truncated concordance lines are examined in a somewhat atomistic fashion without recourse to the overall discourse (Swales 2002, 2004). Like Swales, Kaltenböck & Mehlmauer-Larcher (2005: 71) have expressed similar sentiments: “There are, however, certain parts of a text that even a concordancer cannot reach. These are aspects of the macro-structure of a text, such as textual moves, i.e. a unit of text that expresses a specific communicative function”. However, in the last couple of years corpus linguistics research has paid much more attention to these two different modes of text processing (Flowerdew 2003, 2005, forthcoming a), with Biber et al. (2007a) explaining the concept behind these two different, yet complementary, approaches thus:

In the ‘top-down’ approach, the functional components of a genre are determined first, and then all texts in a corpus are analysed in terms of these components. In contrast, textual components emerge from the corpus analysis in the ‘bottom-up’ approach, and the discourse organization of individual texts is then analysed in terms of linguistically-defined textual categories. (Biber et al. 2007a: 11)

These two different starting points can be illustrated by reference to the studies of Kanoksilapatham (2007) and Jones (2007). Kanoksilapatham (2007), in her corpus-based examination of rhetorical moves in biochemistry research articles, commences with a top-down analysis by first developing an analytical discourse-based framework through identifying the move types that can occur in each section of biochemistry research articles before embarking on the corpus analysis. A corpus analysis using Biber’s multi-dimensional analysis was subsequently carried out to determine the linguistic characteristics of different rhetorical moves. Jones’ (2007) research, on the other hand, begins with a corpus analysis to identify the linguistic characteristics of vocabulary-based discourse units (VBDUs), using software which specifically highlights new words not found in the preceding adjacent stretch of discourse. For example, in the methods sections of biochemistry articles, new linguistic items such as was stored, was extracted, denoting “Procedural description of past actions”, were found.

As Biber et al. (2007b) point out, the different starting points of a top-down functional analysis vs. a bottom-up linguistic analysis will yield differences: “Fifteen different move types were identified in the analysis of biochemistry research articles… In contrast, only 6 different discourse types were identified in the VBDU study” (p. 249). However, at the same time, “the inherent structure of a genre would be reflected in analyses undertaken from both perspectives” (ibid.: 249), as there are areas of overlap in that the VBDU type of “Procedural description of past actions” can be mapped onto one of the moves in the Methods section, “Describing experimental procedures”.

It is generally acknowledged that exploitation of corpus linguistics findings takes time to percolate through to pedagogic applications (Braun 2005; Frankenberg-Garcia 2006). Interestingly, it is in the area of English for Specific Purposes where corpora are now taking on an increasingly mainstream role (Belcher 2006; Flowerdew forthcoming b), with the compilation of small “localised” corpora, often compiled by the class tutor or sometimes the students (Flowerdew 2004). Indeed, this is evidenced by the studies outlined below, which report a more discourse-based pedagogic application of corpora combining top-down and bottom-up approaches to text analysis.


2.1 Moving from top-down to bottom-up processing

Charles (2007) mediates between top-down and bottom-up processing of academic thesis writing in order to introduce her students to the rhetorical function of “defending your work against criticism”, a two-part rhetorical pattern “in which the writer first concedes the possibility of criticism and then moves to neutralize its potentially negative effect” (p. 289). Charles achieves this by first having her students engage in initial macro discourse-based tasks as consciousness-raising activities of the discourse pattern. In this initial stage, students are introduced to the rhetorical function, its purpose in the text and the different ways in which it can be realised. Students discuss the insights that they have gained in their group discussion in a whole-class feedback session. Charles then moves from this more top-down to bottom-up analysis by having students perform corpus searches which focus on specific lexico-grammatical structures within a discourse-based framework. For instance, students concordance on salient (in the sense of noteworthy) items with the aim of formulating a generalization about the use and positioning of while in constructing a concession. Students are asked to examine the lexico-grammar for constructing a concession, noting that while co-occurs with acknowledge, as in Table 1 below, or that it may also appear in the context of appear/seem/may, e.g. While this may seem contradictory to the conclusions drawn above…

Table 1. Anticipated criticisms and writers’ defence (from Charles 2007: 294)

Extract   Anticipated criticism
A         Although the results of the experiments are not conclusive
B         There is nothing essential in these categories and they may not appear tenable to other scholars
C         While I acknowledge that in some cases the distinction between institutions and groups may seem rather arbitrary
D         Unfortunately specimen preparation is especially laborious for the completed device structure which meant that

A more top-down approach, similar to that adopted by Charles, is also found in Weber’s (2001) materials aimed at law students. First, Weber’s students were inducted into the genre of legal essays by reading through whole essays taken from the University of London LLB Examinations written by native speakers, and identifying some of the prototypical rhetorical features, e.g. identifying and/or delimiting the legal principle involved in the case. They were then asked to identify any lexical expressions which seemed to correlate with the genre features. This was followed up by consulting the corpus of the legal essays to verify and pinpoint regularities in lexico-grammatical expressions.

Another corpus linguistics practitioner who commences from a top-down perspective is Noguchi (2004). In fact, Noguchi had her science and engineering majors build their own mini-corpora of research journal articles first, before classifying the genre features and then moving on to examining prototypical lexico-grammatical features.

It has been noted that Swales (2002: 163) has contrasted the “fragmented” world of corpus linguistics, with its tendency to adopt a somewhat bottom-up, atomistic approach to text, with the more “integrated” world of ESP material design, with its focus on top-down analysis of macro-level features. However, Weber’s tasks and those by Charles and Noguchi seem to be achieving a “symbiosis” between these two approaches, as called for by Partington (1998: 145).

2.2 Moving from bottom-up to top-down processing

Hyland’s (2007, 2008) work on genre-specific phraseological routines commences from a bottom-up perspective. Hyland (2007) tabulates the most frequent 50 4-word bundles across four disciplines (biology, electrical engineering, applied linguistics, business studies), noting the great extent to which these are specific to particular disciplines. These bundles are then classified into three broad foci of research, text and participants, as outlined below (Hyland 2007: 13–14):

Research-oriented – help writers to structure their activities and experiences of the real world, e.g.
    Procedure (the use of the, the operation of the)
    Quantification (the magnitude of the, the surface of the)

Text-oriented – concerned with the organization of the text and its meaning as a message or argument, e.g.
    Structuring signals – text-reflexive markers which organize stretches of discourse (in the present study, in the next section)
    Framing signals – situate arguments by specifying limiting conditions (in the case of, with respect to the)

Participant-oriented – these are focused on the writer or reader of the text, e.g.
    Engagement features – address readers directly (it should be noted that, it can be seen)

Hyland (2004: 220) notes the high productivity of the bundle the * of, which, he argues, justifies its inclusion in courses assisting students to write effective academic papers in the sciences. For example, two frequent bundles in biology, the presence of and the splicing of, “would seem to offer students valuable forms for expressing meanings relating to existence and to research processes in their writing” (ibid.).
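A bottom-up frequency list of 4-word bundles of the kind described above can be approximated as follows. This is a minimal sketch rather than Hyland's actual procedure: the function name, the input format (one list of lower-cased tokens per text) and the sample sentences are assumptions, and no dispersion criterion across texts is applied.

```python
from collections import Counter

def frequent_bundles(texts, n=4, top=50):
    """Count contiguous n-word sequences within each text; return the `top` most frequent."""
    counts = Counter()
    for tokens in texts:
        # Bundles never cross text boundaries because each text is handled separately.
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts.most_common(top)

# Hypothetical sample texts, invented for illustration.
sample = [
    "it should be noted that it should be noted".split(),
    "in the case of the".split(),
]
for bundle, freq in frequent_bundles(sample, top=3):
    print(" ".join(bundle), freq)
```

Running the same function separately on sub-corpora for each discipline would give the kind of side-by-side lists from which discipline-specific bundles can be identified; the functional classification (research-, text- and participant-oriented) remains a manual, interpretive step.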


Pedagogic applications with a bottom-up starting point moving to more top-down processing have also been noted by Flowerdew (2006). Besides lexical bundles, another type of phraseological routine is collocations. Usually, collocations are considered as word combinations, one of the most common being adjective + noun, which is a pairing of particular difficulty for advanced learners of English (Nesselhauf 2003, 2004). However, such collocations can also be involved in more top-down processing of text. For example, Flowerdew (2006) notes that in a module on business letter writing students were not sure which adjective from a set of seemingly semantically synonymous adjectives was the “right” one to choose in the following sentence:

Thank you for your kind / sincere / cordial invitation to the alumni dinner.

A search on these different combinations in a Business Letters Corpus revealed the following patternings, shown in Figures 1 and 2 below.²

hoping in fact that you will accept our cordial invitation to be our guest for the length of
May we extend to you a cordial invitation to call in at White’s and make the
Please accept our cordial invitation to visit and become acquainted with
have the pleasure in extending to you our cordial invitation to visit our organization at a data
and I shall be pleased to extend to you my cordial invitation to visit our Tokyo office at your

Figure 1. Selected concordance lines for cordial + invitation

I very much appreciate your kind invitation to join the University Club and I know
Thank you very much again for your kind invitation and I hope your conference will be a
I am therefore very happy to accept your kind invitation and look forward to attending a great
as if I shall be unable to accept your kind invitation this time because of a most important
Thanking you once more for your kind invitation to address the audience I remain

Figure 2. Selected concordance lines for kind + invitation

In order to determine the most appropriate collocation for this context, students were required to look beyond the immediate collocation to an ‘extended unit of meaning’, which takes in the subject + verb and direct and/or indirect object of the sentence. In so doing, students were able to work out that cordial + invitation was used for offering an invitation (May we extend to you a cordial invitation) whereas kind + invitation was used for accepting an invitation or thanking the host (thank you very much for your kind invitation) or, as one of my students expressed it, cordial is used from you to me and kind from me to you. As Aston (personal communication) has pointed out, it is important that students do not just consult the corpus in a phrasebook type fashion, but have something more substantial. Having students “read” the corpus paradigmatically to find alternatives to the verbs extend, accept, thank and their respective phraseologies would be a way of counteracting this narrow “reading” of the corpus. Stubbs’ (1996: 36) oft-quoted principle, “There is no boundary between lexis and grammar: lexis and grammar are interdependent”, is also of relevance here. Lexis and grammar have been shown to manifest interdependency; for example, the subject of the intransitive verb set in very often refers to “unpleasant states of affairs” such as bad weather (Sinclair 1991: 74). In a similar fashion, collocations and functions can also be viewed as interdependent, albeit at a more discourse-based level, as evidenced by the above analysis of the collocations of “invitation” with kind and cordial.
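The comparison of cordial + invitation and kind + invitation rests on reading keyword-in-context lines with enough left context to see the governing verb. As a rough sketch of how such lines might be produced (the function name, the width parameter and the sample sentence are illustrative assumptions, not the concordancer used in the study):

```python
def kwic(tokens, node, collocate, width=6):
    """Return (left context, 'collocate node', right context) for each adjacent pair."""
    lines = []
    for i in range(1, len(tokens)):
        if tokens[i] == node and tokens[i - 1] == collocate:
            # The left context stops just before the collocate, the right starts after the node.
            left = " ".join(tokens[max(0, i - 1 - width): i - 1])
            right = " ".join(tokens[i + 1: i + 1 + width])
            lines.append((left, f"{collocate} {node}", right))
    return lines

# Sample tokens adapted from a concordance line in Figure 1.
sample = "may we extend to you a cordial invitation to call in".split()
for left, pair, right in kwic(sample, "invitation", "cordial"):
    print(f"{left:>40}  {pair}  {right}")
```

Sorting the returned lines on the left context would group the verbs (extend, accept, thank) together, which is one way of supporting the paradigmatic “reading” suggested above.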

Another example of more top-down processing leading on from a bottom-up search is as follows. Flowerdew (2008b) notes that in a course on report writing, one student query related to whether the active or passive voice was used in the following sentence:

This project focuses / is focused on the incidence of mosquitoes on campus.

A search on focus was conducted in an institutionally-compiled 7 million-word corpus of reports, which gave the results shown in Figure 3 below.

Pattern               Left sort   Right sort   Frequency sort

NOUN + VERB + PREP    e.g. “study focuses on”          Show results   292
VERB + VERB + PREP    e.g. “has focused on”            Show results   231
TO + VERB + PREP      e.g. “to focus on”               Show results    95
ADV + VERB + PREP     e.g. “not focused on”            Show results    94
PRON + VERB + PREP    e.g. “we focus on”               Show results    57
CONJ + VERB + PREP    e.g. “that focus on”             Show results    56
DET + VERB + PREP     e.g. “which focus on”            Show results    38
VERB + VERB + ADV     e.g. “has focused almost”        Show results    34
NOUN + VERB + ADV     e.g. “efforts focused primarily” Show results    33
TO + VERB + ADV       e.g. “to focus more”             Show results    31

Figure 3. Search for focus (all word forms) (Flowerdew 2008)

Besides the fact that the students were able to glean the different meanings between the active and passive forms of focus by examining the verb in a wider context, accessed via “Show results” (column three of the Table), I also found that this search encouraged a more top-down processing of text. Students’ scrutiny of the concordance output prompted one student to ask: Why are there so many occurrences of focus in the present perfect? This kind of comment, which I have termed a ‘triggered query’ because it is activated by something the student has alighted on in the corpus data, unprompted by the teacher (Flowerdew 2008b), echoes Swain’s (1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced to set up a critical evaluation of this work, signalled by however. This discourse-based function of however is therefore being used as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.

This type of browsing is thus in the spirit of Bernardini’s philosophy as the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective, and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora and conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found:³

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences, for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach, which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005) and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information, on the corpora under investigation by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because it is divorced from the communicative context in which it was created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries, by their very nature, provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports in which consultancy companies are advising external clients for student report writing which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing, in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

The previous sub-section established that some type of pedagogic processing may be necessary with some types of data; there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities, which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies in other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation (which led to further discussion as to whether the gerund implementing would also be acceptable), and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68), to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005), for inclusion of video activities; by Milton (2006), for didactic written hints built into the software; and by Vannestål & Lindquist (2007), for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range and academic position/role of the interlocutors, such categories are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.
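By way of illustration, contextual metadata of this kind could be attached to written texts and then used to select a sub-corpus appropriate to the students’ own writing situation. The following is a minimal sketch in Python under stated assumptions: the class WrittenText, the function select and the metadata keys (genre, addressee) are hypothetical and are not taken from Burnard (2004) or Krishnamurthy & Kosem (2007), and the two records are constructed from the example frames discussed in Section 3.2.1.

```python
from dataclasses import dataclass, field


@dataclass
class WrittenText:
    """One corpus text plus contextual metadata (all keys are hypothetical)."""
    text: str
    metadata: dict = field(default_factory=dict)


def select(corpus, **criteria):
    """Return the texts whose metadata match every given key/value pair."""
    return [t for t in corpus
            if all(t.metadata.get(key) == value for key, value in criteria.items())]


# Illustrative records only; the metadata values are assumptions.
corpus = [
    WrittenText("We would like to suggest that barriers would reduce noise.",
                {"genre": "report", "addressee": "internal"}),
    WrittenText("Implementation of barriers will reduce noise.",
                {"genre": "report", "addressee": "external"}),
]

# Keep only reports addressed to readers inside the writer's organisation.
internal_reports = select(corpus, genre="report", addressee="internal")
```

With mark-up of this sort, a query could be restricted to texts whose writer-reader relationship matches the one the students are writing in, rather than relying on the co-text alone.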

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omitting the object in the active, e.g. I would appreciate if … The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame … appreciate it if … occurred 105 times, whereas there were only 9 instances of the frame It … appreciated if …, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser/addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
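The frequency comparison described above can be approximated with a few lines of code. The following is a minimal sketch in Python, not the procedure used in the study: the sample sentences are invented for illustration (they are not drawn from the Business Letters Corpus), and the regular expressions, including the assumption that the gap in the passive frame spans at most three words, are only one possible way of operationalising the frames.

```python
import re
from collections import Counter

# Invented sentences standing in for business-letter text (illustration only).
SAMPLE = [
    "I would appreciate it if you could send the invoice by Friday.",
    "We would appreciate it if you would confirm the booking.",
    "It would be much appreciated if you could grant us an extension.",
    "I would appreciate if you reply soon.",
]

# One regular expression per frame; the passive gap allows up to three words.
FRAMES = {
    "active": re.compile(r"\bappreciate\s+it\s+if\b", re.I),
    "passive": re.compile(r"\bit\b(?:\s+\w+){0,3}\s+appreciated\s+if\b", re.I),
    "object_omitted": re.compile(r"\bappreciate\s+if\b", re.I),
}


def count_frames(lines):
    """Count how many lines contain each frame."""
    counts = Counter()
    for line in lines:
        for label, pattern in FRAMES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


counts = count_frames(SAMPLE)
```

Run over a real corpus, a count of this kind would only flag the imbalance between the frames; interpreting the marked passive use still requires reading the surrounding co-text, as described above.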

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive, corpus-based grammar course commented that “… they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or a deductive approach is adopted would also seem to depend very much on the nature of the particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’ [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice).

(Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries:

Figure 4. Dynamic paradigm for corpus investigations [diagram labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues)]

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

… it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)


Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million-word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
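The kind of ‘noise’ described above can be illustrated with a small sketch in Python. It is an illustration under stated assumptions rather than the concordancer used in class: the fragments are taken from the Figure 5 output, the function names kwic and left_collocates are hypothetical, and no part-of-speech information is used, so the word immediately to the left of the node is not always a verb.

```python
import re
from collections import Counter

# Fragments copied from the Figure 5 concordance output.
FRAGMENTS = [
    "response rate to a survey from",
    "and hcfa distributed a survey to",
    "thinking about conducting a survey to",
    "to undertake a survey and",
    "we sent a survey to",
    "response rates to a survey form",
]


def kwic(texts, node, width=3):
    """Return (left context, node, right context) token lists for each hit."""
    node_tokens = node.lower().split()
    n = len(node_tokens)
    hits = []
    for text in texts:
        tokens = re.findall(r"[\w'-]+", text.lower())
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == node_tokens:
                left = tokens[max(0, i - width):i]
                right = tokens[i + n:i + n + width]
                hits.append((left, tokens[i:i + n], right))
    return hits


def left_collocates(hits):
    """Count the word immediately to the left of the node in each hit."""
    return Counter(left[-1] for left, _, _ in hits if left)


hits = kwic(FRAGMENTS, "a survey")
collocates = left_collocates(hits)
```

In this small sample the preposition to is the most frequent left-hand neighbour of a survey, ahead of verbs such as conducting, undertake and sent, which mirrors the difficulty students had in picking out verb + noun collocations from the raw lines.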

This problematic example then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean, I haven’t done a detailed survey on anything.

Words                                            Frequency
Response rate to a survey from                   3
And hcfa distributed a survey to                 2
Response rate to a survey of                     2
Response rates to a survey form                  2
Thinking about conducting a survey to            2
$150,000 to undertake a survey and               1
1998 report on a survey by                       1
2 we sent a survey to                            1
Acquisition venterfootnote33sent a survey on     1
Addition to mailing a survey of                  1
And employment funded a survey of                1
And francis used a survey to                     1

Figure 5. Search for a survey

[Bar chart of V obj N collocations with survey, frequency axis 0–200; the individual bar values are not recoverable]
cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase, capitalization ratios decrease
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation
• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations
• It seems that different studies have shown different results
• It seems that the practice of employing local staff by multinationals is increasing
• It seems that some individual training courses are below their full capacity

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted
• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)


Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
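The contrast that the Figure 7 task is designed to bring out can also be made visible programmatically. The sketch below, in Python, uses only the six example lines from the first part of Figure 7, on the assumption that the first column of the figure holds the published examples and the second the student examples. The function name and the regular expression are illustrative assumptions, and the approach is a simple stand-in for, not a reproduction of, the analysis in Hewings & Hewings (2002).

```python
import re

# Example lines from Figure 7 (column assignment is an assumption).
PUBLISHED = [
    "It seems clear that as insider holding proportions increase, capitalization ratios decrease",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference",
]
STUDENT = [
    "It seems that different studies have shown different results",
    "It seems that the practice of employing local staff by multinationals is increasing",
    "It seems that some individual training courses are below their full capacity",
]

# Capture whatever stands between "it seems" and the first following "that".
PATTERN = re.compile(r"\bit seems\s+(.*?)\bthat\b", re.I)


def words_before_that(line):
    """Return the words between 'it seems' and 'that', or None if no match."""
    match = PATTERN.search(line)
    return match.group(1).split() if match else None


published_gaps = [words_before_that(line) for line in PUBLISHED]
student_gaps = [words_before_that(line) for line in STUDENT]
```

In these six lines the published examples all place an evaluative adjective between it seems and that, whereas the student examples do not, which is the kind of difference the comparison task invites learners to notice for themselves.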

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Accounts in the literature of the ‘pedagogic mediation’ of corpus data are few and far between, indicating that this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries, etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million-word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31 (2/2007), 17–24.

Davies, M. 2004. “Student use of large annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.


Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.


Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.


Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk


“… would be to extend the dimensions of a unit of meaning until all the relevant patterning was included – all the patterning that was instigated by the presence of the central word … [W]e should extend the unit until the ambiguity disappears.” In this phraseological approach, recurring patterns in concordance output have revealed how language can follow certain tendencies, according to Sinclair’s notion of ‘an extended unit of meaning’, rather than being bound by hard-and-fast rules.

By way of example, Danielsson (2007: 18) presents a methodology using raw frequency data as a step towards the identification of meaningful units, for which, as she points out, “there are no accepted answers to simple questions such as ‘What exactly constitutes a multi-word unit?’ or ‘Where does a multi-word unit begin and end?’”. Taking the word jam as the node, Danielsson identified its most frequent collocates in the BNC, with traffic the top collocate. Concordance lines for the node and its collocate were then generated with a span of 4+4, that is, four words either side of the node. This investigation showed a to be the most frequent collocate in the lines including both traffic and jam, e.g. stuck in a traffic jam with your pulse; man in a traffic jam who curses (ibid.: 19). A further calculation was made to find the most frequent collocate in the lines generated for traffic and jam together with the second collocate a. This showed in to be the next most frequent collocate, e.g. stuck in a traffic jam you might reflect (ibid.: 20). This procedure was repeated until no other collocates were found to occur above the cut-off point of 5 (an arbitrary point that Danielsson admits may have to be revised). The search ended with the unit stuck in a traffic jam: “From here no other collocates occur with sufficient frequency to reach the cut-off point and we may claim to have achieved the maximal unit [my italics] based on the distribution in this corpus” (p. 20). Danielsson then considers the paradigmatic axis, testing each word to see if the unit allows for any alternatives, and finds that the corpus offers sitting, waiting and caught as alternatives to stuck. It is to be noted that these alternatives reflect the syntagmatic nature of semantic prosody (the frequent co-occurrence of a lexical item with items expressing a positive or negative evaluation) and semantic preference (the frequent co-occurrence of an item with those from a particular semantic set): “In this particular context the words seem to be related to offer a set of verbs that create a feeling associated with the annoying event of being held up in traffic” (p. 20).
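The iterative procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not Danielsson's own implementation: the function names `window` and `grow_unit` and the toy concordance lines are hypothetical, and the cut-off is read here as a minimum number of lines in which a collocate must occur. A real study would of course work with BNC concordance lines rather than hand-made data.

```python
from collections import Counter

def window(tokens, node, span):
    """Tokens within `span` words either side of the first occurrence of `node`."""
    i = tokens.index(node)
    return tokens[max(0, i - span): i + span + 1]

def grow_unit(lines, node, span=4, cutoff=5):
    """Add the most frequent collocate, one at a time, until none reaches `cutoff`.

    Assumption: a collocate qualifies when it occurs in at least `cutoff` lines.
    Returns the words in order of discovery (not text order) and the lines kept.
    """
    selected = [node]
    current = [t for t in lines if node in t]
    while True:
        counts = Counter()
        for tokens in current:
            # Count each candidate word once per concordance line.
            counts.update(set(window(tokens, node, span)) - set(selected))
        if not counts:
            break
        # Highest frequency first; ties broken alphabetically for determinism.
        word, freq = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        if freq < cutoff:
            break
        selected.append(word)
        # Keep only the lines that also contain the newly added collocate.
        current = [t for t in current if word in window(t, node, span)]
    return selected, current

# Hypothetical toy data standing in for corpus lines (cut-off lowered to 3).
toy_lines = (
    [["stuck", "in", "a", "traffic", "jam"]] * 3
    + [["caught", "in", "a", "traffic", "jam"]] * 2
    + [["a", "traffic", "jam"], ["traffic", "jam"]]
    + [["strawberry", "jam"]] * 2
)
unit, kept = grow_unit(toy_lines, "jam", span=4, cutoff=3)
print(unit, len(kept))
```

On this toy data the collocates are added in the order traffic, a, in, stuck, which mirrors the order reported for the BNC search; the paradigmatic step (testing alternatives to stuck) is not covered by the sketch.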

Some researchers (cf. Tognini-Bonelli 2001) see this phraseological approach associated with corpus linguistics as a catalyst in redefining aspects of linguistic theory. This is regarded as the corpus-driven approach (as opposed to the corpus-based one), in which the data is approached without any pre-conceived notions in relation to how it should be analysed. Other corpus linguists take a less extreme approach; for example, McEnery et al. (2006: 6), while considering corpus linguistics as “a new philosophical approach to linguistic enquiry” with its own theoretical status, do not view it as a discipline in its own right with its own theory, but rather as a methodology. The same position can be applied to data-driven learning (DDL): it is generally agreed there is no underlying theory as such, but rather it rests on a methodology which can uncover facts about language hitherto unexplored.

Notwithstanding the advantages of this approach for DDL, during the last few years some accounts in the literature have adopted a more critical stance, drawing attention to potential drawbacks of using corpora in DDL. This paper reviews the following key issues in the debate on applying corpus linguistics to pedagogy:

– Corpus linguistic techniques encourage a more bottom-up rather than top-down processing of text, in which truncated concordance lines are examined atomistically.

– Corpus data are decontextualised and, for this reason, may not be directly transferable to students’ own context of writing.

– Corpus-based learning is usually associated with an inductive approach to learning, in which rules, or indeed patterns, are derived from multiple examples, rather than a rule-based deductive approach.¹ This approach might not be the most appropriate choice for some students.

– There are different types of corpora (general, specialized, learner) and different types of online resources (dictionaries, grammars). Students may have difficulty in selecting the most appropriate corpus and resource for a particular query.

The issues outlined above are not, in fact, discrete issues but inter-related, as the following discussion shows. They are examined specifically with reference to corpora of written text.

2. Corpus linguistic techniques encourage a bottom-up processing of text

Corpus linguistic techniques have been criticized for encouraging a more bottom-up rather than top-down processing of text, in which truncated concordance lines are examined in a somewhat atomistic fashion, without recourse to the overall discourse (Swales 2002, 2004). Kaltenböck & Mehlmauer-Larcher (2005: 71) have expressed similar sentiments: “There are, however, certain parts of a text that even a concordancer cannot reach. These are aspects of the macro-structure of a text, such as textual moves, i.e. a unit of text that expresses a specific communicative function.” However, in the last couple of years corpus linguistics research has paid much more attention to these two different modes of text processing (Flowerdew 2003, 2005, forthcoming a), with Biber et al. (2007a) explaining the concept behind these two different, yet complementary, approaches thus:


In the ‘top-down’ approach, the functional components of a genre are determined first, and then all texts in a corpus are analysed in terms of these components. In contrast, textual components emerge from the corpus analysis in the ‘bottom-up’ approach, and the discourse organization of individual texts is then analysed in terms of linguistically-defined textual categories. (Biber et al. 2007a: 11)

These two different starting points can be illustrated by reference to the studies of Kanoksilapatham (2007) and Jones (2007). Kanoksilapatham (2007), in her corpus-based examination of rhetorical moves in biochemistry research articles, commences with a top-down analysis by first developing an analytical discourse-based framework, through identifying the move types that can occur in each section of biochemistry research articles, before embarking on the corpus analysis. A corpus analysis using Biber’s multi-dimensional analysis was subsequently carried out to determine the linguistic characteristics of different rhetorical moves. Jones’ (2007) research, on the other hand, begins with a corpus analysis to identify the linguistic characteristics of vocabulary-based discourse units (VBDUs), using software which specifically highlights new words not found in the preceding adjacent stretch of discourse. For example, in the methods sections of biochemistry articles, new linguistic items such as was stored, was extracted, denoting “Procedural description of past actions”, were found.

As Biber et al. (2007b) point out, the different starting points of a top-down functional analysis vs. a bottom-up linguistic analysis will yield differences: “Fifteen different move types were identified in the analysis of biochemistry research articles … In contrast, only 6 different discourse types were identified in the VBDU study” (p. 249). However, at the same time, “the inherent structure of a genre would be reflected in analyses undertaken from both perspectives” (ibid.: 249), as there are areas of overlap, in that the VBDU type of “Procedural description of past actions” can be mapped onto one of the moves in the Methods section, “Describing experimental procedures”.

It is generally acknowledged that exploitation of corpus linguistics findings takes time to percolate through to pedagogic applications (Braun 2005; Frankenberg-Garcia 2006). Interestingly, it is in the area of English for Specific Purposes where corpora are now taking on an increasingly mainstream role (Belcher 2006; Flowerdew forthcoming b), with the compilation of small “localised” corpora, often compiled by the class tutor or sometimes the students (Flowerdew 2004). Indeed, this is evidenced by the studies outlined below, which report a more discourse-based pedagogic application of corpora, combining top-down and bottom-up approaches to text analysis.


2.1 Moving from top-down to bottom-up processing

Charles (2007) mediates between top-down and bottom-up processing of academic thesis writing in order to introduce her students to the rhetorical function of “defending your work against criticism”, a two-part rhetorical pattern “in which the writer first concedes the possibility of criticism and then moves to neutralize its potentially negative effect” (p. 289). Charles achieves this by first having her students engage in initial macro discourse-based tasks as consciousness-raising activities of the discourse pattern. In this initial stage students are introduced to the rhetorical function, its purpose in the text and the different ways in which it can be realised. Students discuss the insights that they have gained in their group discussion in a whole-class feedback session. Charles then moves from this more top-down to bottom-up analysis by having students perform corpus searches which focus on specific lexico-grammatical structures within a discourse-based framework. For instance, students concordance on salient (in the sense of noteworthy) items with the aim of formulating a generalization about the use and positioning of while in constructing a concession. Students are asked to examine the lexico-grammar for constructing a concession, noting that while co-occurs with acknowledge, as in Table 1 below, or that it may also appear in the context of appear/seem/may, e.g. While this may seem contradictory to the conclusions drawn above …

Table 1. Anticipated criticisms and writers’ defence (from Charles 2007: 294)

Extract Anticipated criticism

A Although the results of the experiments are not conclusive

B There is nothing essential in these categories and they may not appear tenable to other scholars

C While I acknowledge that in some cases the distinction between institutions and groups may seem rather arbitrary

D Unfortunately specimen preparation is especially laborious for the completed device structure which meant that

A more top-down approach similar to that adopted by Charles is also found in Weber’s (2001) materials aimed at law students. First, Weber’s students were inducted into the genre of legal essays by reading through whole essays taken from the University of London LLB Examinations written by native speakers, and identifying some of the prototypical rhetorical features, e.g. identifying and/or delimiting the legal principle involved in the case. They were then asked to identify any lexical expressions which seemed to correlate with the genre features. This was followed up by consulting the corpus of the legal essays to verify and pinpoint regularities in lexico-grammatical expressions.


Another corpus linguistics practitioner who commences from a top-down perspective is Noguchi (2004). In fact, Noguchi had her science and engineering majors build their own mini-corpora of research journal articles first, before classifying the genre features and then moving on to examining prototypical lexico-grammatical features.

It has been noted that Swales (2002: 163) has contrasted the “fragmented” world of corpus linguistics, with its tendency to adopt a somewhat bottom-up atomistic approach to text, with the more “integrated” world of ESP material design, with its focus on top-down analysis of macro-level features. However, Weber’s tasks and those by Charles and Noguchi seem to be achieving a “symbiosis” between these two approaches, as called for by Partington (1998: 145).

2.2 Moving from bottom-up to top-down processing

Hyland’s (2007, 2008) work on genre-specific phraseological routines commences from a bottom-up perspective. Hyland (2007) tabulates the 50 most frequent four-word bundles across four disciplines (biology, electrical engineering, applied linguistics, business studies), noting the great extent to which these are specific to particular disciplines. These bundles are then classified into three broad foci of research, text and participants, as outlined below (Hyland 2007: 13–14):

Research-oriented – help writers to structure their activities and experiences of the real world, e.g.

Procedure (the use of the, the operation of the)
Quantification (the magnitude of the, the surface of the)

Text-oriented – concerned with the organization of the text and its meaning as a message or argument, e.g.

Structuring signals – text-reflexive markers which organize stretches of discourse (in the present study, in the next section)

Framing signals – situate arguments by specifying limiting conditions (in the case of, with respect to the)

Participant-oriented – these are focused on the writer or reader of the text, e.g.

Engagement features – address readers directly (it should be noted that, it can be seen)

Hyland (2004: 220) notes the high productivity of the bundle the … of, which, he argues, justifies its inclusion in courses assisting students to write effective academic papers in the sciences. For example, two frequent bundles in biology, the presence of and the splicing of, “would seem to offer students valuable forms for expressing meanings relating to existence and to research processes in their writing” (ibid.).
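The first, purely frequency-based step of bundle identification can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function names `bundles` and `top_bundles` and the sample sentences are hypothetical, tokenisation is reduced to whitespace splitting, and any further criteria a bundle study might apply (for instance a minimum spread across texts) are left out.

```python
from collections import Counter

def bundles(tokens, n=4):
    """Count every contiguous n-word sequence in one tokenised text."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def top_bundles(texts, n=4, k=50):
    """Return the k most frequent n-word bundles across a list of texts.

    Each text is counted separately so that no bundle spans a text boundary.
    """
    counts = Counter()
    for tokens in texts:
        counts.update(bundles(tokens, n))
    return counts.most_common(k)

# Hypothetical miniature 'corpus' of two tokenised texts.
texts = [
    "it should be noted that it should be noted".split(),
    "it should be noted here".split(),
]
top = top_bundles(texts, n=4, k=1)
print(top)
```

Running the same function over one sub-corpus per discipline and comparing the resulting lists would give a rough picture of how far the most frequent bundles overlap between disciplines.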


Pedagogic applications with a bottom-up starting point, moving to more top-down processing, have also been noted by Flowerdew (2006). Besides lexical bundles, another type of phraseological routine is collocations. Usually, collocations are considered as word combinations, one of the most common being adjective + noun, which is a pairing of particular difficulty for advanced learners of English (Nesselhauf 2003, 2004). However, such collocations can also be involved in more top-down processing of text. For example, Flowerdew (2006) notes that in a module on business letter writing, students were not sure which adjective, from a set of seemingly semantically synonymous adjectives, was the “right” one to choose in the following sentence:

Thank you for your kind / sincere / cordial invitation to the alumni dinner.

A search on these different combinations in a Business Letters Corpus revealed the following patternings, shown in Figures 1 and 2 below.²

hoping in fact that you will accept our cordial invitation to be our guest for the length of
May we extend to you a cordial invitation to call in at White’s and make the
Please accept our cordial invitation to visit and become acquainted with
have the pleasure in extending to you our cordial invitation to visit our organization at a data
and I shall be pleased to extend to you my cordial invitation to visit our Tokyo office at your

Figure 1. Selected concordance lines for cordial + invitation

I very much appreciate your kind invitation to join the University Club and I know
Thank you very much again for your kind invitation and I hope your conference will be a
I am therefore very happy to accept your kind invitation and look forward to attending a great
as if I shall be unable to accept your kind invitation this time because of a most important
Thanking you once more for your kind invitation to address the audience I remain

Figure 2. Selected concordance lines for kind + invitation

In order to determine the most appropriate collocation for this context, students were required to look beyond the immediate collocation to an ‘extended unit of meaning’, which takes in the subject + verb and direct and/or indirect object of the sentence. In so doing, students were able to work out that cordial + invitation was used for offering an invitation (May we extend to you a cordial invitation), whereas kind + invitation was used for accepting an invitation or thanking the host (thank you very much for your kind invitation), or, as one of my students expressed it, cordial is used from you to me and kind from me to you. As Aston (personal communication) has pointed out, it is important that students do not just consult the corpus in a phrasebook type fashion, but have something more substantial. Having students “read” the corpus paradigmatically to find alternatives to the verbs extend, accept, thank and their respective phraseologies would be a way of counteracting this narrow “reading” of the corpus. Stubbs’ (1996: 36) oft-quoted principle, “There is no boundary between lexis and grammar: lexis and grammar are interdependent”, is also of relevance here. Lexis and grammar have been shown to manifest interdependency; for example, the subject of the intransitive verb set in very often refers to “unpleasant states of affairs” such as bad weather (Sinclair 1991: 74). In a similar fashion, collocations and functions can also be viewed as interdependent, albeit at a more discourse-based level, as evidenced by the above analysis of the collocations of “invitation” with kind and cordial.
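The kind of concordance display on which this analysis relies, and the effect of widening the context to bring the governing verb into view, can be illustrated with a minimal key-word-in-context sketch in Python. The function name `kwic`, the `width` parameter and the two lower-cased sample sentences (adapted from Figures 1 and 2) are assumptions made for illustration; this is not the concordancer used with the Business Letters Corpus.

```python
def kwic(tokens, phrase, width=5):
    """Return (left, match, right) strings for every occurrence of `phrase`.

    `tokens` and `phrase` are lists of words; `width` is the number of
    context words shown on each side of the match.
    """
    n = len(phrase)
    hits = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            left = " ".join(tokens[max(0, i - width): i])
            right = " ".join(tokens[i + n: i + n + width])
            hits.append((left, " ".join(phrase), right))
    return hits

# Sample sentences adapted from Figures 1 and 2 (lower-cased, whitespace-split).
offer = "may we extend to you a cordial invitation to call in".split()
thanks = "thank you very much for your kind invitation".split()

narrow = kwic(thanks, ["kind", "invitation"], width=4)
wide = kwic(thanks, ["kind", "invitation"], width=6)
for left, match, right in kwic(offer, ["cordial", "invitation"], width=4):
    print(f"{left:>25}  {match}  {right}")
```

With a width of four words the left context of kind invitation stops short of the verb thank, whereas a width of six brings it in, which is the step from the immediate collocation to the wider unit that the students were asked to take.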

Another example of more top-down processing leading on from a bottom-up search is as follows. Flowerdew (2008b) notes that in a course on report writing, one student query related to whether the active or passive voice was used in the following sentence:

This project focuses / is focused on the incidence of mosquitoes on campus.

A search on focus was conducted in an institutionally-compiled 7 million-word corpus of reports, which gave the results shown in Figure 3 below.

Pattern Left sort Right Sort Frequency Sort

NOUN + VERB + PREP    e.g. “study focuses on”          Show results    292
VERB + VERB + PREP    e.g. “has focused on”            Show results    231
TO + VERB + PREP      e.g. “to focus on”               Show results     95
ADV + VERB + PREP     e.g. “not focused on”            Show results     94
PRON + VERB + PREP    e.g. “we focus on”               Show results     57
CONJ + VERB + PREP    e.g. “that focus on”             Show results     56
DET + VERB + PREP     e.g. “which focus on”            Show results     38
VERB + VERB + ADV     e.g. “has focused almost”        Show results     34
NOUN + VERB + ADV     e.g. “efforts focused primarily” Show results     33
TO + VERB + ADV       e.g. “to focus more”             Show results     31

Figure 3. Search for focus (all word forms) (Flowerdew 2008)
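A pattern summary of the kind shown in Figure 3 could, in principle, be produced along the following lines. The Python sketch below is a hedged illustration only: it assumes the text has already been part-of-speech tagged with coarse labels like those in the figure, and the function name `pattern_counts`, the list of word forms and the hand-tagged sample are all hypothetical. It does not reproduce the interface or tagger behind the institutional corpus.

```python
from collections import Counter

# Assumed list of word forms of the verb 'focus' (both spellings).
FOCUS_FORMS = {"focus", "focuses", "focused", "focusing", "focussed", "focussing"}

def pattern_counts(tagged):
    """Count LEFT-TAG + VERB + RIGHT-TAG patterns around verbal forms of 'focus'.

    `tagged` is a list of (word, tag) pairs using coarse tags such as
    NOUN, VERB, PREP, ADV. Forms of 'focus' not tagged VERB are skipped.
    """
    counts = Counter()
    for i in range(1, len(tagged) - 1):
        word, tag = tagged[i]
        if word.lower() in FOCUS_FORMS and tag == "VERB":
            left_tag = tagged[i - 1][1]
            right_tag = tagged[i + 1][1]
            counts[f"{left_tag} + VERB + {right_tag}"] += 1
    return counts

# Hand-tagged hypothetical sample; a tagger would supply the tags in practice.
sample = [
    ("this", "DET"), ("study", "NOUN"), ("focuses", "VERB"), ("on", "PREP"),
    ("noise", "NOUN"), ("and", "CONJ"), ("has", "VERB"), ("focused", "VERB"),
    ("on", "PREP"), ("cost", "NOUN"), ("the", "DET"), ("focus", "NOUN"),
    ("is", "VERB"),
]
counts = pattern_counts(sample)
for pattern, freq in counts.most_common():
    print(pattern, freq)
```

Sorting the patterns by frequency, as in the figure, makes a skew such as the high number of VERB + VERB + PREP hits (e.g. has focused on) visible at a glance, which is the kind of observation that can prompt a learner's own question.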

Besides the fact that the students were able to glean the different meanings between the active and passive forms of focus by examining the verb in a wider context, accessed via “Show results” (column three of the Table), I also found that this search encouraged a more top-down processing of text. Students’ scrutiny of the concordance output prompted one student to ask: Why are there so many occurrences of focus in the present perfect? This kind of comment, which I have termed a ‘triggered query’ because it is activated by something the student has alighted on in the corpus data, unprompted by the teacher (Flowerdew 2008b), echoes Swain’s (1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced to set up a critical evaluation of this work, signalled by however. This discourse-based function of however is therefore being used as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.

This type of browsing is thus in the spirit of Bernardini’s philosophy as the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective, and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora, and conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found.³

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences, for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)
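A comparison of this kind rests on frequencies that have been normalised for sub-corpus size. The Python sketch below shows one way such a comparison might be computed; it is an assumption-laden illustration, not the BNCweb procedure used by Lee & Swales. The function names `per_million` and `compare`, the sub-corpus labels and the miniature token lists are all hypothetical, and the resulting figures say nothing about real usage.

```python
def per_million(tokens, phrase):
    """Occurrences of `phrase` per million tokens (0.0 for an empty sample)."""
    if not tokens:
        return 0.0
    n = len(phrase)
    hits = sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase)
    return hits * 1_000_000 / len(tokens)

def compare(subcorpora, phrases):
    """Map each sub-corpus name to {phrase: normalised frequency}."""
    return {
        name: {" ".join(p): per_million(tokens, p) for p in phrases}
        for name, tokens in subcorpora.items()
    }

# Hypothetical miniature sub-corpora; real ones would hold millions of tokens.
subcorpora = {
    "humanities": "for instance a b for instance c for example d".split(),
    "natural_sciences": "for example a b for example c d e f".split(),
}
table = compare(subcorpora, [["for", "instance"], ["for", "example"]])
for name, row in table.items():
    print(name, row)
```

Normalising to a common base matters because discipline sub-corpora rarely have the same size; frequency alone, however, cannot show the functional difference the quotation describes, which still requires reading the concordance lines in context.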

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005) and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information on the corpora under investigation, by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer, and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because it is divorced from the communicative context in which it was created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries, by their very nature, provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports, in which consultancy companies are advising external clients, for student report writing, which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing, in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities, which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation, which led to further discussion as to whether the gerund implementing would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68), to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances, and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range and academic position/role of the interlocutors, these are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omitting the object in the active, e.g. I would appreciate if…. The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame …appreciate it if … occurred 105 times, whereas there were only 9 instances of the frame It…appreciated if…, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser/addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
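The kind of frame-frequency comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four sample sentences are invented rather than drawn from the Business Letters Corpus, and the pattern definitions and the function name count_frames are hypothetical choices, not part of any concordancing software mentioned in this article.

```python
import re
from collections import Counter

# Invented sample sentences standing in for a corpus of business letters
# (an assumption for illustration; not data from the Business Letters Corpus).
letters = [
    "I would appreciate it if you could send the invoice by Friday.",
    "We would appreciate it if you could confirm the booking.",
    "It would be greatly appreciated if you could waive the fee.",
    "I would be much appreciated if you reply soon.",  # typical learner error
]

# Active frame: "...appreciate it if..."
# Passive frame: "It ... appreciated if ..." within a single sentence.
FRAMES = {
    "active": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    "passive": re.compile(r"\bit\b[^.?!]*?\bappreciated if\b", re.IGNORECASE),
}

def count_frames(texts):
    """Return how many times each frame occurs across the given texts."""
    counts = Counter()
    for text in texts:
        for label, pattern in FRAMES.items():
            counts[label] += len(pattern.findall(text))
    return counts

print(count_frames(letters))
```

The learner error in the last sample sentence matches neither frame, which is the point of consulting the corpus: the raw counts show which frame is the unmarked choice, while the surrounding co-text still has to be read to see when the rarer passive frame is used.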

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or deductive approach is adopted would also seem to depend very much on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’, [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice). (Celce-Murcia 2002: 146)

Students found it difficult to work out from a close reading of concordance lines the correct choice of verb in the following sentence, because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data), Interaction (discussion and sharing observations and opinions), Induction (making one’s own rule for a particular feature).

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries:

[Figure 4 diagram labels: Inductive, Deductive; Phraseology (probabilities), Grammar rules; (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)


Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.

Words                                          Frequency
Response rate to a survey from                 3
And hcfa distributed a survey to               2
Response rate to a survey of                   2
Response rates to a survey form                2
Thinking about conducting a survey to          2
$150000 to undertake a survey and              1
1998 report on a survey by                     1
2 we sent a survey to                          1
Acquisition venterfootnote33sent a survey on   1
Addition to mailing a survey of                1
And employment funded a survey of              1
And francis used a survey to                   1

Figure 5. Search for a survey

This problematic example then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

[Figure 6 screenshot: JustTheWord results for the pattern V obj N with survey, shown as frequency bars on a scale from 0 to 200.
Cluster 1: carry out survey, conduct survey, take in survey
Cluster 2: mention in survey, quote survey
Cluster 3: complete survey, do survey
Cluster 4: publish survey, report in survey
Unclustered: base on survey, come in survey, commission survey, design survey]

Figure 6. Search for survey in JustTheWord collocations program
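One way of cutting down the ‘noise’ in a search such as a survey is to count only the word immediately to the left of the node and to discard function words. The Python sketch below illustrates the idea on a few of the concordance hits reproduced in Figure 5. The stoplist and the function name left_collocates are assumptions made for this example; the sketch does not reproduce how the concordancer or JustTheWord actually work, and each hit is counted once regardless of its corpus frequency.

```python
import re
from collections import Counter

# A few concordance hits copied from Figure 5.
hits = [
    "Response rate to a survey from",
    "And hcfa distributed a survey to",
    "Response rate to a survey of",
    "Thinking about conducting a survey to",
    "$150000 to undertake a survey and",
    "2 we sent a survey to",
    "And francis used a survey to",
]

# Illustrative stoplist of function words (an assumption, not a standard list).
FUNCTION_WORDS = {"to", "on", "of", "and", "the", "a"}

def left_collocates(lines, node="a survey"):
    """Count the word immediately preceding the node, ignoring function words."""
    pattern = re.compile(r"(\w+)\s+" + re.escape(node) + r"\b")
    counts = Counter()
    for line in lines:
        for word in pattern.findall(line.lower()):
            if word not in FUNCTION_WORDS:
                counts[word] += 1
    return counts

print(left_collocates(hits))
```

Even on this small sample the candidate verbs (conducting, undertake, sent, distributed, used) surface without the reader having to scan every line, although the inflected forms are not grouped under one lemma as they are in a dedicated collocations program.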

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase, capitalization ratios decrease
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation
• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations
• It seems that different studies have shown different results
• It seems that the practice of employing local staff by multinationals is increasing
• It seems that some individual training courses are below their full capacity

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted
• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)


Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences in the use of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
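The comparison that the Figure 7 task asks students to make by eye can also be approximated computationally. The Python sketch below tallies whatever stands between it seems and the following that in the first two sets of examples from Figure 7. The function name after_it_seems and the label "(nothing)" are assumptions introduced for illustration, and the sketch is not part of the materials of Hewings (2002).

```python
import re
from collections import Counter

# Example sentences taken from Figure 7.
published = [
    "It seems clear that as insider holding proportions increase, capitalization ratios decrease",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference",
]
student = [
    "It seems that different studies have shown different results",
    "It seems that the practice of employing local staff by multinationals is increasing",
    "It seems that some individual training courses are below their full capacity",
]

def after_it_seems(sentences):
    """Count the material between 'it seems' and the first following 'that'."""
    pattern = re.compile(r"\bit seems\s+(.*?)\s*\bthat\b", re.IGNORECASE)
    counts = Counter()
    for sentence in sentences:
        match = pattern.search(sentence)
        if match:
            # An empty capture means 'that' follows 'it seems' directly.
            counts[match.group(1) or "(nothing)"] += 1
    return counts

print("published:", after_it_seems(published))
print("student:  ", after_it_seems(student))
```

On these six lines, the published set shows an evaluative adjective in every case (clear, likely, quite probable), whereas the student set shows none, which is the contrast the first question in the task is designed to bring out. With a sample this small the tally is only a prompt for discussion, not evidence of a general tendency.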

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’, outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are some accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating that this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources such as grammars, thesauri, dictionaries, etc. will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marco (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31 (2/2007), 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.


Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.


Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4, Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.


Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.). 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk


but rather as a methodology. The same position can be applied to data-driven learning (DDL): it is generally agreed that there is no underlying theory as such; rather, DDL rests on a methodology which can uncover hitherto unexplored facts about language.

Notwithstanding the advantages of this approach for DDL, during the last few years some accounts in the literature have adopted a more critical stance, drawing attention to potential drawbacks of using corpora in DDL. This paper reviews the following key issues in the debate on applying corpus linguistics to pedagogy:

– Corpus linguistic techniques encourage a more bottom-up rather than top-down processing of text, in which truncated concordance lines are examined atomistically.

– Corpus data are decontextualised and for this reason may not be directly transferable to students’ own context of writing.

– Corpus-based learning is usually associated with an inductive approach to learning, in which rules, or indeed patterns, are derived from multiple examples, rather than a rule-based deductive approach.¹ This approach might not be the most appropriate choice for some students.

– There are different types of corpora (general, specialized, learner) and different types of online resources (dictionaries, grammars). Students may have difficulty in selecting the most appropriate corpus and resource for a particular query.

The issues outlined above are not, in fact, discrete but inter-related, as the following discussion shows. They are examined specifically with reference to corpora of written text.

2. Corpus linguistic techniques encourage a bottom-up processing of text

Corpus linguistic techniques have been criticized for encouraging a more bottom-up rather than top-down processing of text, in which truncated concordance lines are examined in a somewhat atomistic fashion without recourse to the overall discourse (Swales 2002, 2004). Kaltenböck & Mehlmauer-Larcher (2005: 71) have expressed similar sentiments: “There are, however, certain parts of a text that even a concordancer cannot reach. These are aspects of the macro-structure of a text, such as textual moves, i.e. a unit of text that expresses a specific communicative function”. However, in the last couple of years corpus linguistics research has paid much more attention to these two different modes of text processing (Flowerdew 2003, 2005, forthcoming a), with Biber et al. (2007a) explaining the concept behind these two different, yet complementary, approaches thus:


In the ‘top-down’ approach, the functional components of a genre are determined first and then all texts in a corpus are analysed in terms of these components. In contrast, textual components emerge from the corpus analysis in the ‘bottom-up’ approach, and the discourse organization of individual texts is then analysed in terms of linguistically-defined textual categories. (Biber et al. 2007a: 11)

These two different starting points can be illustrated by reference to the studies of Kanoksilapatham (2007) and Jones (2007). Kanoksilapatham (2007), in her corpus-based examination of rhetorical moves in biochemistry research articles, commences with a top-down analysis by first developing an analytical discourse-based framework through identifying the move types that can occur in each section of biochemistry research articles before embarking on the corpus analysis. A corpus analysis using Biber’s multi-dimensional analysis was subsequently carried out to determine the linguistic characteristics of different rhetorical moves. Jones’ (2007) research, on the other hand, begins with a corpus analysis to identify the linguistic characteristics of vocabulary-based discourse units (VBDUs), using software which specifically highlights new words not found in the preceding adjacent stretch of discourse. For example, in the methods sections of biochemistry articles, new linguistic items such as was stored, was extracted, denoting “Procedural description of past actions”, were found.
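The bottom-up idea of flagging words that are new relative to the preceding stretch of discourse can be illustrated with a minimal sketch. This is not the software used in Jones (2007); the function name, the fixed window size and the toy sample are all assumptions made purely for illustration.

```python
import re

def new_words_by_window(text, window=50):
    """Split the text into consecutive windows of `window` tokens and list,
    for each window, the word types absent from the immediately preceding one.
    Hypothetical helper; the fixed window size is an illustrative assumption."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    chunks = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    results = []
    previous = set()
    for chunk in chunks:
        current = set(chunk)
        results.append(sorted(current - previous))
        previous = current
    return results

# Invented toy sample, not taken from any corpus.
sample = ("the enzyme was purified and the enzyme was assayed "
          "the samples were stored and the samples were extracted")

for index, words in enumerate(new_words_by_window(sample, window=9)):
    print(index, words)
```

A sharp rise in the number of new word types between adjacent windows could then be treated as a candidate boundary between discourse units, which an analyst would still need to interpret functionally.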

As Biber et al. (2007b) point out, the different starting points of a top-down functional analysis vs. a bottom-up linguistic analysis will yield differences: “Fifteen different move types were identified in the analysis of biochemistry research articles… In contrast, only 6 different discourse types were identified in the VBDU study” (p. 249). However, at the same time, “the inherent structure of a genre would be reflected in analyses undertaken from both perspectives” (ibid.: 249), as there are areas of overlap in that the VBDU type of “Procedural description of past actions” can be mapped onto one of the moves in the Methods section, “Describing experimental procedures”.

It is generally acknowledged that corpus linguistics findings take time to percolate through to pedagogic applications (Braun 2005; Frankenberg-Garcia 2006). Interestingly, it is in the area of English for Specific Purposes where corpora are now taking on an increasingly mainstream role (Belcher 2006; Flowerdew forthcoming b), with the compilation of small “localised” corpora, often compiled by the class tutor or sometimes the students (Flowerdew 2004). Indeed, this is evidenced by the studies outlined below, which report a more discourse-based pedagogic application of corpora combining top-down and bottom-up approaches to text analysis.


2.1 Moving from top-down to bottom-up processing

Charles (2007) mediates between top-down and bottom-up processing of academic thesis writing in order to introduce her students to the rhetorical function of “defending your work against criticism”, a two-part rhetorical pattern “in which the writer first concedes the possibility of criticism and then moves to neutralize its potentially negative effect” (p. 289). Charles achieves this by first having her students engage in initial macro discourse-based tasks as consciousness-raising activities of the discourse pattern. In this initial stage, students are introduced to the rhetorical function, its purpose in the text and the different ways in which it can be realised. Students discuss the insights that they have gained in their group discussion in a whole-class feedback session. Charles then moves from this more top-down to bottom-up analysis by having students perform corpus searches which focus on specific lexico-grammatical structures within a discourse-based framework. For instance, students concordance on salient (in the sense of noteworthy) items with the aim of formulating a generalization about the use and positioning of while in constructing a concession. Students are asked to examine the lexico-grammar for constructing a concession, noting that while co-occurs with acknowledge, as in Table 1 below, or that it may also appear in the context of appear/seem/may, e.g. While this may seem contradictory to the conclusions drawn above…

Table 1. Anticipated criticisms and writers’ defence (from Charles 2007: 294)

Extract   Anticipated criticism
A         Although the results of the experiments are not conclusive
B         There is nothing essential in these categories and they may not appear tenable to other scholars
C         While I acknowledge that in some cases the distinction between institutions and groups may seem rather arbitrary
D         Unfortunately, specimen preparation is especially laborious for the completed device structure, which meant that
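The kind of search students perform on while can be sketched as a simple key-word-in-context (KWIC) routine. The following is a minimal illustration only, assuming plain-text input; the function name, the context width and the toy sample (adapted from the examples quoted above, with invented continuations) are assumptions and do not reproduce the concordancer used by Charles.

```python
import re

def kwic(text, node, width=40):
    """Return (left, node, right) context triples for each occurrence of `node`."""
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    return hits

# Toy sample adapted from the quoted examples; not real corpus data.
sample = (
    "While I acknowledge that in some cases the distinction may seem rather "
    "arbitrary, it is retained here. While this may seem contradictory to the "
    "conclusions drawn above, the data support it."
)

# Items the text reports as co-occurring with concessive "while".
signals = re.compile(r"\b(acknowledge|appear|seem|may)\b", re.IGNORECASE)

hits = kwic(sample, "while")
for left, node, right in hits:
    # Mark lines whose right-hand co-text contains one of the signal items.
    marker = "*" if signals.search(right) else " "
    print(f"{marker} {left:>40} [{node}] {right}")
```

Marking the lines in which a signal item follows the node is one way of directing attention from the isolated word to its lexico-grammatical environment.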

A more top-down approach, similar to that adopted by Charles, is also found in Weber’s (2001) materials aimed at law students. First, Weber’s students were inducted into the genre of legal essays by reading through whole essays, taken from the University of London LLB Examinations written by native speakers, and identifying some of the prototypical rhetorical features, e.g. identifying and/or delimiting the legal principle involved in the case. They were then asked to identify any lexical expressions which seemed to correlate with the genre features. This was followed up by consulting the corpus of the legal essays to verify and pinpoint regularities in lexico-grammatical expressions.


Another corpus linguistics practitioner who commences from a top-down perspective is Noguchi (2004). In fact, Noguchi had her science and engineering majors build their own mini-corpora of research journal articles first, before classifying the genre features and then moving on to examining prototypical lexico-grammatical features.

Swales (2002: 163) has contrasted the “fragmented” world of corpus linguistics, with its tendency to adopt a somewhat bottom-up, atomistic approach to text, with the more “integrated” world of ESP material design, with its focus on top-down analysis of macro-level features. However, Weber’s tasks and those by Charles and Noguchi seem to be achieving a “symbiosis” between these two approaches, as called for by Partington (1998: 145).

2.2 Moving from bottom-up to top-down processing

Hyland’s (2007, 2008) work on genre-specific phraseological routines commences from a bottom-up perspective. Hyland (2007) tabulates the 50 most frequent four-word bundles across four disciplines (biology, electrical engineering, applied linguistics, business studies), noting the great extent to which these are specific to particular disciplines. These bundles are then classified into three broad foci, research, text and participants, as outlined below (Hyland 2007: 13–14):

Research-oriented – help writers to structure their activities and experiences of the real world, e.g.

    Procedure (the use of the, the operation of the)
    Quantification (the magnitude of the, the surface of the)

Text-oriented – concerned with the organization of the text and its meaning as a message or argument, e.g.

    Structuring signals – text-reflexive markers which organize stretches of discourse (in the present study, in the next section)
    Framing signals – situate arguments by specifying limiting conditions (in the case of, with respect to the)

Participant-oriented – these are focused on the writer or reader of the text, e.g.

    Engagement features – address readers directly (it should be noted that, it can be seen)

Hyland (2004: 220) notes the high productivity of the bundle the … of, which, he argues, justifies its inclusion in courses assisting students to write effective academic papers in the sciences. For example, two frequent bundles in biology, the presence of and the splicing of, “would seem to offer students valuable forms for expressing meanings relating to existence and to research processes in their writing” (ibid.).
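The bottom-up extraction of four-word bundles described above can be approximated with a short sketch. It assumes plain-text input and a naive tokenizer; the function name and the `min_texts` dispersion threshold are hypothetical, since the cut-offs applied in Hyland’s studies are not given here.

```python
import re
from collections import Counter

def lexical_bundles(texts, n=4, min_texts=1):
    """Count n-word sequences across texts and keep those found in at least
    `min_texts` different texts (a hypothetical dispersion threshold)."""
    counts = Counter()
    text_range = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        counts.update(grams)            # raw frequency
        text_range.update(set(grams))   # number of texts containing the bundle
    return [(gram, freq) for gram, freq in counts.most_common()
            if text_range[gram] >= min_texts]

# Invented toy texts, not corpus data.
texts = [
    "The use of the buffer improved the yield.",
    "The use of the column and the use of the filter.",
]

print(lexical_bundles(texts, n=4, min_texts=2))
```

Requiring a bundle to occur in more than one text guards against sequences that are frequent only because a single writer repeats them; classifying the resulting bundles by function remains an interpretive, top-down step.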


Pedagogic applications with a bottom-up starting point moving to more top-down processing have also been noted by Flowerdew (2006). Besides lexical bundles, another type of phraseological routine is collocations. Usually, collocations are considered as word combinations, one of the most common being adjective + noun, which is a pairing of particular difficulty for advanced learners of English (Nesselhauf 2003, 2004). However, such collocations can also be involved in more top-down processing of text. For example, Flowerdew (2006) notes that, in a module on business letter writing, students were not sure which adjective from a set of seemingly synonymous adjectives was the “right” one to choose in the following sentence:

Thank you for your kind / sincere / cordial invitation to the alumni dinner.

A search on these different combinations in a Business Letters Corpus revealed the following patternings, shown in Figures 1 and 2 below.²

hoping in fact that you will accept our cordial invitation to be our guest for the length of
May we extend to you a cordial invitation to call in at White’s and make the
Please accept our cordial invitation to visit and become acquainted with
have the pleasure in extending to you our cordial invitation to visit our organization at a data
and I shall be pleased to extend to you my cordial invitation to visit our Tokyo office at your

Figure 1. Selected concordance lines for cordial + invitation

I very much appreciate your kind invitation to join the University Club and I know
Thank you very much again for your kind invitation and I hope your conference will be a
I am therefore very happy to accept your kind invitation and look forward to attending a great
as if I shall be unable to accept your kind invitation this time because of a most important
Thanking you once more for your kind invitation to address the audience I remain

Figure 2. Selected concordance lines for kind + invitation

In order to determine the most appropriate collocation for this context, students were required to look beyond the immediate collocation to an ‘extended unit of meaning’, which takes in the subject + verb and direct and/or indirect object of the sentence. In so doing, students were able to work out that cordial + invitation was used for offering an invitation (May we extend to you a cordial invitation), whereas kind + invitation was used for accepting an invitation or thanking the host (Thank you very much for your kind invitation), or, as one of my students expressed it, cordial is used from you to me and kind from me to you. As Aston (personal communication) has pointed out, it is important that students do not just consult the corpus in a phrasebook-type fashion but have something more substantial. Having students “read” the corpus paradigmatically to find alternatives to the verbs extend, accept, thank and their respective phraseologies would be a way of counteracting this narrow “reading” of the corpus. Stubbs’ (1996: 36) oft-quoted principle, “There is no boundary between lexis and grammar: lexis and grammar are interdependent”, is also of relevance here. Lexis and grammar have been shown to manifest interdependency; for example, the subject of the intransitive verb set in very often refers to “unpleasant states of affairs” such as bad weather (Sinclair 1991: 74). In a similar fashion, collocations and functions can also be viewed as interdependent, albeit at a more discourse-based level, as evidenced by the above analysis of the collocations of “invitation” with kind and cordial.
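Looking one step beyond the adjective + noun pair can be sketched computationally. The example below takes shortened versions of the concordance lines in Figures 1 and 2 and tallies the word immediately preceding each adjective. This is a deliberate simplification of the ‘extended unit of meaning’ analysis described above (which also takes in subject and verb), and the function name and pattern are assumptions for illustration.

```python
import re
from collections import Counter, defaultdict

# Shortened versions of the concordance lines in Figures 1 and 2.
LINES = [
    "hoping in fact that you will accept our cordial invitation to be our guest",
    "May we extend to you a cordial invitation to call in at White's",
    "Please accept our cordial invitation to visit and become acquainted with",
    "have the pleasure in extending to you our cordial invitation to visit",
    "and I shall be pleased to extend to you my cordial invitation to visit",
    "I very much appreciate your kind invitation to join the University Club",
    "Thank you very much again for your kind invitation and I hope",
    "I am therefore very happy to accept your kind invitation and look forward",
    "as if I shall be unable to accept your kind invitation this time",
    "Thanking you once more for your kind invitation to address the audience",
]

# Capture the word directly before the adjective (typically a determiner).
PATTERN = re.compile(r"\b(\w+) (cordial|kind) invitation\b", re.IGNORECASE)

def determiners_by_adjective(lines):
    """Tally the word preceding each adjective + invitation pairing."""
    table = defaultdict(Counter)
    for line in lines:
        for determiner, adjective in PATTERN.findall(line):
            table[adjective.lower()][determiner.lower()] += 1
    return table

for adjective, counts in determiners_by_adjective(LINES).items():
    print(adjective, dict(counts))
```

In these ten lines, cordial follows our, my or a, whereas kind always follows your, which is consistent with the offering versus accepting/thanking distinction the students identified. Ten selected lines are of course too few to support any wider generalization.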

Another example of more top-down processing leading on from a bottom-up search is as follows. Flowerdew (2008b) notes that, in a course on report writing, one student query related to whether the active or passive voice was used in the following sentence:

This project focuses / is focused on the incidence of mosquitoes on campus.

A search on focus was conducted in an institutionally-compiled 7 million-word corpus of reports, which gave the results shown in Figure 3 below.

Sort options: Left sort, Right sort, Frequency sort

Pattern               Example                             Show results    Frequency
NOUN + VERB + PREP    e.g. “study focuses on”             Show results    292
VERB + VERB + PREP    e.g. “has focused on”               Show results    231
TO + VERB + PREP      e.g. “to focus on”                  Show results    95
ADV + VERB + PREP     e.g. “not focused on”               Show results    94
PRON + VERB + PREP    e.g. “we focus on”                  Show results    57
CONJ + VERB + PREP    e.g. “that focus on”                Show results    56
DET + VERB + PREP     e.g. “which focus on”               Show results    38
VERB + VERB + ADV     e.g. “has focused almost”           Show results    34
NOUN + VERB + ADV     e.g. “efforts focused primarily”    Show results    33
TO + VERB + ADV       e.g. “to focus more”                Show results    31

Figure 3. Search for focus (all word forms) (Flowerdew 2008)

Besides the fact that the students were able to glean the different meanings between the active and passive forms of focus by examining the verb in a wider context, accessed via “Show results” (column three of Figure 3), I also found that this search encouraged a more top-down processing of text. Students’ scrutiny of the concordance output prompted one student to ask: Why are there so many occurrences of focus in the present perfect? This kind of comment, which I have termed a ‘triggered query’ because it is activated by something the student has alighted on in the corpus data, unprompted by the teacher (Flowerdew 2008b), echoes Swain’s (1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced to set up a critical evaluation of this work, signalled by however. However thus functions at the discourse level as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.

This type of browsing is thus in the spirit of Bernardini’s philosophy of the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora and to conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found:³

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences, for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005) and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information, on the corpora under investigation by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be a bottom-up or a top-down approach is not an easy question to answer and very much depends on the nature of the query and the composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because they are divorced from the communicative context in which they were created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries by their very nature provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports in which consultancy companies are advising external clients for student report writing, which requires students to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing, in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase and make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

The previous sub-section established that some type of pedagogic processing may be necessary with some types of data; there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation, which led to further discussion as to whether the gerund implementing would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68) to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range, academic position and role of the interlocutors, these are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.
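As a rough illustration of what such metadata might look like for a corpus of business writing, the following Python sketch attaches a small record of contextual information to each text and selects texts by it. The field names, category values and sample sentences are hypothetical; they are not drawn from Burnard (2004), Krishnamurthy & Kosem (2007) or any existing corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextMetadata:
    """Contextual information about one written text (all fields hypothetical)."""
    text_id: str
    genre: str            # e.g. "request letter"
    writer_role: str      # e.g. "customer", "applicant"
    addressee_role: str   # e.g. "supplier", "director"
    power_relation: str   # "equal", "writer-lower" or "writer-higher"


@dataclass(frozen=True)
class CorpusText:
    metadata: TextMetadata
    body: str


def select_texts(corpus, **criteria):
    """Return the texts whose metadata match every keyword criterion."""
    return [
        text for text in corpus
        if all(getattr(text.metadata, key) == value
               for key, value in criteria.items())
    ]


# Two invented texts standing in for a marked-up corpus.
corpus = [
    CorpusText(
        TextMetadata("bl001", "request letter", "customer", "supplier", "equal"),
        "We would appreciate it if you could send us a catalogue.",
    ),
    CorpusText(
        TextMetadata("bl002", "request letter", "applicant", "director", "writer-lower"),
        "It would be appreciated if you could consider my application.",
    ),
]

lower_status = select_texts(corpus, power_relation="writer-lower")
print([text.metadata.text_id for text in lower_status])
```

With mark-up of this kind, a query could be restricted to, say, letters written to a more powerful addressee before any concordance lines are read.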

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omitting the object in the active, e.g. I would appreciate if… The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame …appreciate it if … occurred 105 times, whereas there were only 9 instances of the frame It…appreciated if…, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser and addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
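The kind of frame count reported above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample sentences are invented, the regular expressions are one possible approximation of the two frames, and the counts it produces have nothing to do with the figures of 105 and 9 obtained from the Business Letters Corpus.

```python
import re

# Invented sample sentences; a real query would run over the whole corpus.
letters = [
    "We would appreciate it if you could reply by Friday.",
    "I would appreciate it if you would send me a brochure.",
    "It would be greatly appreciated if you could waive the fee.",
    "I would be much appreciated if you can help.",  # learner error
    "I would appreciate if you reply soon.",         # learner error
]

# Approximate patterns for the two frames (an assumption, not the
# query syntax of any particular concordancer).
FRAMES = {
    "active": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    "passive": re.compile(r"\bit\b[^.?!]*?\bappreciated if\b", re.IGNORECASE),
}


def count_frames(texts):
    """Count how many times each frame occurs across the texts."""
    return {
        label: sum(len(pattern.findall(text)) for text in texts)
        for label, pattern in FRAMES.items()
    }


print(count_frames(letters))
```

The two learner errors match neither pattern, which is the point of searching for the full frame rather than for the word appreciate alone.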

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive, corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or a deductive approach is adopted would also seem to depend very much on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’, [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice).

(Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries.

[Figure 4: diagram of a continuum running from Inductive (phraseology, probabilities) to Deductive (grammar rules), with (Clues) as the mediating element]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)


Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.

This problematic example above then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean, I haven’t done a detailed survey on anything.

Words                                        Frequency
Response rate to a survey from               3
And hcfa distributed a survey to             2
Response rate to a survey of                 2
Response rates to a survey form              2
Thinking about conducting a survey to        2
$150000 to undertake a survey and            1
1998 report on a survey by                   1
2 we sent a survey to                        1
Acquisition venterfootnote33sent a survey on 1
Addition to mailing a survey of              1
And employment funded a survey of            1
And francis used a survey to                 1

Figure 5. Search for a survey

[Figure 6: bar chart of “V obj N” collocations of survey on a frequency axis running from 0 to 200; the individual bar values are not recoverable. The collocations are grouped as follows.]

cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for survey in JustTheWord collocations program
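The verb + noun query for survey discussed above can be approximated with a short Python sketch, which may help to show why a targeted collocation search produces less ‘noise’ than reading raw concordance lines. It assumes an untagged corpus, so the inflected forms of the three candidate verbs are listed by hand; the sample lines are invented (two of them loosely echo lines quoted in the article) and the function name and window size are illustrative choices, not features of JustTheWord or of the sub-corpus of reports.

```python
import re
from collections import Counter

# Candidate verbs from the students' query, with inflected forms listed
# by hand (an assumption; a tagged corpus would use lemmas instead).
CANDIDATES = {
    "do": ("do", "does", "did", "done", "doing"),
    "carry out": ("carry out", "carries out", "carried out", "carrying out"),
    "conduct": ("conduct", "conducts", "conducted", "conducting"),
}

# Invented sample lines standing in for concordance output.
lines = [
    "thinking about conducting a survey to find out",
    "the agency carried out a survey of local firms",
    "response rate to a survey from the ministry",
    "we conducted a detailed survey on computer use",
    "I haven't done a detailed survey on anything",
]


def verb_collocates(concordance_lines, node="survey", window=4):
    """Count candidate verbs found within `window` words left of the node."""
    counts = Counter()
    for line in concordance_lines:
        words = re.findall(r"[a-z']+", line.lower())
        for i, word in enumerate(words):
            if word != node:
                continue
            left = " ".join(words[max(0, i - window):i])
            for verb, forms in CANDIDATES.items():
                if any(re.search(rf"\b{re.escape(form)}\b", left) for form in forms):
                    counts[verb] += 1
    return counts


print(verb_collocates(lines))
```

Lines such as response rate to a survey contribute nothing to the count, which is the filtering that students otherwise had to do by eye.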

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase, capitalization ratios decrease.
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.
• It seems quite probable that consumers would not recognize such relatively small degrees of difference.

Student dissertations
• It seems that different studies have shown different results.
• It seems that the practice of employing local staff by multinationals is increasing.
• It seems that some individual training courses are below their full capacity.

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted.
• It seems that the issue of privatization could become an object of a national referendum.

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
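A comparison of this kind between an expert and a learner corpus can be sketched as follows. The sentences are the six paired examples reproduced in Figure 7; the regular expression and the function name are illustrative assumptions, and six sentences are of course far too few to support any generalisation.

```python
import re
from collections import Counter

# Example sentences taken from the concordance task in Figure 7.
published = [
    "It seems clear that as insider holding proportions increase, "
    "capitalization ratios decrease.",
    "It seems likely that the eighties and nineties will be known as "
    "decades of large scale disaggregation.",
    "It seems quite probable that consumers would not recognize such "
    "relatively small degrees of difference.",
]
student = [
    "It seems that different studies have shown different results.",
    "It seems that the practice of employing local staff by "
    "multinationals is increasing.",
    "It seems that some individual training courses are below their "
    "full capacity.",
]

# Capture whatever stands between "it seems" and the following "that".
PATTERN = re.compile(r"\bit seems\s+(.*?)\bthat\b", re.IGNORECASE)


def what_follows(sentences):
    """Count the words occurring between 'it seems' and 'that'."""
    counts = Counter()
    for sentence in sentences:
        match = PATTERN.search(sentence)
        if match:
            filler = match.group(1).strip().lower()
            counts[filler or "(nothing)"] += 1
    return counts


print("published:", what_follows(published))
print("student:  ", what_follows(student))
```

Output of this sort would still need the discussion stage described above, since the count shows that the two sets differ but not why.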

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating that this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries, etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See httphomepagemaccombncwebmanualbncwebmanmainhtm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase/ (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004. online. “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Handford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.). 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk


In the ‘top-down’ approach, the functional components of a genre are determined first and then all texts in a corpus are analysed in terms of these components. In contrast, textual components emerge from the corpus analysis in the ‘bottom-up’ approach, and the discourse organization of individual texts is then analysed in terms of linguistically-defined textual categories. (Biber et al. 2007a: 11)

These two different starting points can be illustrated by reference to the studies of Kanoksilapatham (2007) and Jones (2007). Kanoksilapatham (2007), in her corpus-based examination of rhetorical moves in biochemistry research articles, commences with a top-down analysis by first developing an analytical discourse-based framework through identifying the move types that can occur in each section of biochemistry research articles before embarking on the corpus analysis. A corpus analysis using Biber’s multi-dimensional analysis was subsequently carried out to determine the linguistic characteristics of different rhetorical moves. Jones’ (2007) research, on the other hand, begins with a corpus analysis to identify the linguistic characteristics of vocabulary-based discourse units (VBDUs), using software which specifically highlights new words not found in the preceding adjacent stretch of discourse. For example, in the methods sections of biochemistry articles, new linguistic items such as was stored, was extracted, denoting “Procedural description of past actions”, were found.

As Biber et al. (2007b) point out, the different starting points of a top-down functional analysis vs. a bottom-up linguistic analysis will yield differences: “Fifteen different move types were identified in the analysis of biochemistry research articles… In contrast, only 6 different discourse types were identified in the VBDU study” (p. 249). However, at the same time, “the inherent structure of a genre would be reflected in analyses undertaken from both perspectives” (ibid.: 249), as there are areas of overlap in that the VBDU type of “Procedural description of past actions” can be mapped onto one of the moves in the Methods section, “Describing experimental procedures”.

It is generally acknowledged that exploitation of corpus linguistics findings takes time to percolate through to pedagogic applications (Braun 2005; Frankenberg-Garcia 2006). Interestingly, it is in the area of English for Specific Purposes where corpora are now taking on an increasingly mainstream role (Belcher 2006; Flowerdew forthcoming b), with the compilation of small, “localised” corpora, often compiled by the class tutor or sometimes the students (Flowerdew 2004). Indeed, this is evidenced by the studies outlined below, which report a more discourse-based pedagogic application of corpora combining top-down and bottom-up approaches to text analysis.


2.1 Moving from top-down to bottom-up processing

Charles (2007) mediates between top-down and bottom-up processing of academic thesis writing in order to introduce her students to the rhetorical function of “defending your work against criticism”, a two-part rhetorical pattern “in which the writer first concedes the possibility of criticism and then moves to neutralize its potentially negative effect” (p. 289). Charles achieves this by first having her students engage in initial macro discourse-based tasks as consciousness-raising activities of the discourse pattern. In this initial stage, students are introduced to the rhetorical function, its purpose in the text and the different ways in which it can be realised. Students discuss the insights that they have gained in their group discussion in a whole-class feedback session. Charles then moves from this more top-down to bottom-up analysis by having students perform corpus searches which focus on specific lexico-grammatical structures within a discourse-based framework. For instance, students concordance on salient (in the sense of noteworthy) items with the aim of formulating a generalization about the use and positioning of while in constructing a concession. Students are asked to examine the lexico-grammar for constructing a concession, noting that while co-occurs with acknowledge, as in Table 1 below, or that it may also appear in the context of appear/seem/may, e.g. While this may seem contradictory to the conclusions drawn above…

Table 1. Anticipated criticisms and writers’ defence (from Charles 2007: 294)

Extract   Anticipated criticism
A         Although the results of the experiments are not conclusive
B         There is nothing essential in these categories and they may not appear tenable to other scholars
C         While I acknowledge that in some cases the distinction between institutions and groups may seem rather arbitrary
D         Unfortunately, specimen preparation is especially laborious for the completed device structure, which meant that

A more top-down approach similar to that adopted by Charles is also found in Weber’s (2001) materials aimed at law students. First, Weber’s students were inducted into the genre of legal essays by reading through whole essays taken from the University of London LLB Examinations written by native speakers, and identifying some of the prototypical rhetorical features, e.g. identifying and/or delimiting the legal principle involved in the case. They were then asked to identify any lexical expressions which seemed to correlate with the genre features. This was followed up by consulting the corpus of the legal essays to verify and pinpoint regularities in lexico-grammatical expressions.


Another corpus linguistics practitioner who commences from a top-down perspective is Noguchi (2004). In fact, Noguchi had her science and engineering majors build their own mini-corpora of research journal articles first, before classifying the genre features and then moving on to examining prototypical lexico-grammatical features.

It has been noted that Swales (2002: 163) has contrasted the “fragmented” world of corpus linguistics, with its tendency to adopt a somewhat bottom-up atomistic approach to text, with the more “integrated” world of ESP material design, with its focus on top-down analysis of macro-level features. However, Weber’s tasks and those by Charles and Noguchi seem to be achieving a “symbiosis” between these two approaches, as called for by Partington (1998: 145).

2.2 Moving from bottom-up to top-down processing

Hyland’s (2007, 2008) work on genre-specific phraseological routines commences from a bottom-up perspective. Hyland (2007) tabulates the most frequent 50 4-word bundles across four disciplines (biology, electrical engineering, applied linguistics, business studies), noting the great extent to which these are specific to particular disciplines. These bundles are then classified into three broad foci of research, text and participants, as outlined below (Hyland 2007: 13–14):

Research-oriented – help writers to structure their activities and experiences of the real world, e.g.
    Procedure (the use of the, the operation of the)
    Quantification (the magnitude of the, the surface of the)

Text-oriented – concerned with the organization of the text and its meaning as a message or argument, e.g.
    Structuring signals – text-reflexive markers which organize stretches of discourse (in the present study, in the next section)
    Framing signals – situate arguments by specifying limiting conditions (in the case of, with respect to the)

Participant-oriented – these are focused on the writer or reader of the text, e.g.
    Engagement features – address readers directly (it should be noted that, it can be seen)

Hyland (2004: 220) notes the high productivity of the bundle the … of, which, he argues, justifies its inclusion in courses assisting students to write effective academic papers in the sciences. For example, two frequent bundles in biology, the presence of and the splicing of, “would seem to offer students valuable forms for expressing meanings relating to existence and to research processes in their writing” (ibid.).
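The bottom-up starting point described here, tabulating the most frequent 4-word bundles, can be illustrated with a minimal sketch. It is not Hyland's procedure: the function name `four_word_bundles`, the tokenisation, the `min_freq` cut-off and the two-sentence sample are all assumptions made for illustration, and the sketch lets bundles run across sentence boundaries, which a fuller analysis would probably avoid.

```python
import re
from collections import Counter

def four_word_bundles(text, min_freq=2):
    """Return (bundle, frequency) pairs for 4-word sequences seen at least min_freq times."""
    # Crude tokenisation: lower-case alphabetic words, optional internal apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    # Slide a 4-word window over the token list and count each sequence.
    grams = Counter(tuple(tokens[i:i + 4]) for i in range(len(tokens) - 3))
    return [(" ".join(g), n) for g, n in grams.most_common() if n >= min_freq]

# Invented sample text, not taken from any of the corpora discussed.
sample = (
    "In the case of the first group, the results were clear. "
    "In the case of the second group, the results were mixed."
)

for bundle, freq in four_word_bundles(sample):
    print(freq, bundle)
```

On a real discipline-specific corpus the same count would be run per discipline, so that the resulting lists can be compared before any functional classification is attempted.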


Pedagogic applications with a bottom-up starting point moving to more top-down processing have also been noted by Flowerdew (2006). Besides lexical bundles, another type of phraseological routine is collocations. Usually, collocations are considered as word combinations, one of the most common being adjective + noun, which is a pairing of particular difficulty for advanced learners of English (Nesselhauf 2003, 2004). However, such collocations can also be involved in more top-down processing of text. For example, Flowerdew (2006) notes that in a module on business letter writing, students were not sure which adjective from a set of seemingly semantically synonymous adjectives was the “right” one to choose in the following sentence:

Thank you for your kind / sincere / cordial invitation to the alumni dinner.

A search on these different combinations in a Business Letters Corpus revealed the following patternings, shown in Figures 1 and 2 below.2

hoping in fact that you will accept our cordial invitation to be our guest for the length of
May we extend to you a cordial invitation to call in at White’s and make the
Please accept our cordial invitation to visit and become acquainted with
have the pleasure in extending to you our cordial invitation to visit our organization at a data
and I shall be pleased to extend to you my cordial invitation to visit our Tokyo office at your

Figure 1. Selected concordance lines for cordial + invitation

I very much appreciate your kind invitation to join the University Club and I know
Thank you very much again for your kind invitation and I hope your conference will be a
I am therefore very happy to accept your kind invitation and look forward to attending a great
as if I shall be unable to accept your kind invitation this time because of a most important
Thanking you once more for your kind invitation to address the audience I remain

Figure 2. Selected concordance lines for kind + invitation

In order to determine the most appropriate collocation for this context, students were required to look beyond the immediate collocation to an ‘extended unit of meaning’, which takes in the subject + verb and direct and/or indirect object of the sentence. In so doing, students were able to work out that cordial + invitation was used for offering an invitation (May we extend to you a cordial invitation), whereas kind + invitation was used for accepting an invitation or thanking the host (thank you very much for your kind invitation), or, as one of my students expressed it, cordial is used from you to me, and kind from me to you. As Aston (personal communication) has pointed out, it is important that students do not just consult the corpus in a phrasebook-type fashion, but have something more substantial. Having students “read” the corpus paradigmatically to find alternatives to the verbs extend, accept, thank and their respective phraseologies would be a way of counteracting this narrow “reading” of the corpus. Stubbs’ (1996: 36) oft-quoted principle “There is no boundary between lexis and grammar: lexis and grammar are interdependent” is also of relevance here. Lexis and grammar have been shown to manifest interdependency; for example, the subject of the intransitive verb set in very often refers to “unpleasant states of affairs” such as bad weather (Sinclair 1991: 74). In a similar fashion, collocations and functions can also be viewed as interdependent, albeit at a more discourse-based level, as evidenced by the above analysis of the collocations of “invitation” with kind and cordial.
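For readers unfamiliar with how concordance output of this kind is produced, the following is a minimal key-word-in-context (KWIC) sketch. It is not the concordancer used with the Business Letters Corpus: the function name `kwic`, the character-based `width` parameter and the sample text (sentences abridged from the concordance lines in Figures 1 and 2) are assumptions made for illustration.

```python
import re

def kwic(text, phrase, width=40):
    """Return (left context, node, right context) for each occurrence of phrase."""
    hits = []
    pattern = r"\b" + re.escape(phrase) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

# Sentences abridged from Figures 1 and 2; a real search would run over the whole corpus.
sample = (
    "May we extend to you a cordial invitation to call in. "
    "Thank you very much again for your kind invitation. "
    "Please accept our cordial invitation to visit. "
    "I am therefore very happy to accept your kind invitation."
)

for phrase in ("cordial invitation", "kind invitation"):
    for left, node, right in kwic(sample, phrase):
        # Right-align the left context so the node words line up vertically.
        print(f"{left:>40} [{node}] {right}")
```

Aligning the node in a central column is what makes the “vertical” reading possible, while widening `width` exposes more of the verb and its participants for the “horizontal” reading of the extended unit of meaning.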

Another example of more top-down processing leading on from a bottom-up search is as follows. Flowerdew (2008b) notes that in a course on report writing, one student query related to whether the active or passive voice was used in the following sentence:

This project focuses / is focused on the incidence of mosquitoes on campus.

A search on focus was conducted in an institutionally-compiled 7 million-word corpus of reports, which gave the results shown in Figure 3 below.

Pattern               Example                       Results        Frequency
NOUN + VERB + PREP    “study focuses on”            Show results   292
VERB + VERB + PREP    “has focused on”              Show results   231
TO + VERB + PREP      “to focus on”                 Show results   95
ADV + VERB + PREP     “not focused on”              Show results   94
PRON + VERB + PREP    “we focus on”                 Show results   57
CONJ + VERB + PREP    “that focus on”               Show results   56
DET + VERB + PREP     “which focus on”              Show results   38
VERB + VERB + ADV     “has focused almost”          Show results   34
NOUN + VERB + ADV     “efforts focused primarily”   Show results   33
TO + VERB + ADV       “to focus more”               Show results   31

Figure 3. Search for focus (all word forms) (Flowerdew 2008)

Besides the fact that the students were able to glean the different meanings between the active and passive forms of focus by examining the verb in a wider context accessed via “Show results” (column three of the Table), I also found that this search encouraged a more top-down processing of text. Students’ scrutiny of the concordance output prompted one student to ask: Why are there so many occurrences of focus in the present perfect? This kind of comment, which I have termed a ‘triggered query’ because it is activated by something the student has alighted on in the corpus data, unprompted by the teacher (Flowerdew 2008b), echoes Swain’s (1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced to set up a critical evaluation of this work, signalled by however. This discourse-based function of however is therefore being used as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.

This type of browsing is thus in the spirit of Bernardini’s philosophy of the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora and conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found.3

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005), and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information, on the corpora under investigation by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because it is divorced from the communicative context in which it was created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries by their very nature provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports, in which consultancy companies are advising external clients, for student report writing, which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing, in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand, but also what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation, which led to further discussion as to whether the gerund implementing would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68), to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range, academic position/role of the interlocutors, these are lacking in corpora of writing.4 Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omission of the object in the active, e.g. I would appreciate if… The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame …appreciate it if … occurred 105 times, whereas there were only 9 instances of the frame It…appreciated if…, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser/addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
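Frequency counts of competing frames, such as those reported for appreciate, can be obtained with a simple pattern search. The sketch below is only an illustration under stated assumptions: the regular expressions are one possible approximation of the two frames, the names `FRAMES` and `count_frames` are hypothetical, and the three sample sentences are invented rather than drawn from the Business Letters Corpus, so the counts bear no relation to the 105 and 9 reported above.

```python
import re

# Hypothetical approximations of the two frames discussed in the text.
FRAMES = {
    # Active frame: "... appreciate it if ..."
    "active": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    # Passive frame: "It ... appreciated if ...", kept within a single sentence.
    "passive": re.compile(r"\bit\b[^.?!]*?\bappreciated if\b", re.IGNORECASE),
}

def count_frames(text):
    """Return a dict mapping each frame label to its number of matches in text."""
    return {label: len(rx.findall(text)) for label, rx in FRAMES.items()}

# Invented sample letters for illustration only.
letters = (
    "I would appreciate it if you could send the report. "
    "We would appreciate it if you replied soon. "
    "It would be greatly appreciated if you could waive the fee."
)

print(count_frames(letters))
```

A large imbalance between the two counts is only the starting point; as the passage notes, it is the scrutiny of the surrounding co-text for each hit that suggests when the rarer frame is used.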

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to all students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or deductive approach is adopted would also very much seem to depend on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs lsquoincreasersquo and lsquodecreasersquo [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agentdynamic instrument object mdash both of which favor active voice mdash or a patient subject mdash for the passive voice)

(Celce-Murcia 2002 146)

Students found it difficult to work out from a close reading of concordance lines the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically

With a very crowded schedule studentsrsquo level of motivation was decreased has decreased

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries.

[Diagram labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)

Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
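One way to cut through this kind of noise is to tally only the word immediately to the left of a survey. The following is a minimal sketch in Python, not the procedure used in class: the rows are transcribed from the Figure 5 output (the garbled "venterfootnote33sent" line is left out), while the stoplist and the function name `left_collocates` are illustrative assumptions.

```python
from collections import Counter

# (left context, frequency) pairs transcribed from the Figure 5 output
rows = [
    ("Response rate to", 3),
    ("And hcfa distributed", 2),
    ("Response rate to", 2),
    ("Response rates to", 2),
    ("Thinking about conducting", 2),
    ("$150000 to undertake", 1),
    ("1998 report on", 1),
    ("2 we sent", 1),
    ("Addition to mailing", 1),
    ("And employment funded", 1),
    ("And francis used", 1),
]

# Assumed stoplist: function words that cannot be the collocating verb
STOP = {"to", "on", "of", "a", "the"}


def left_collocates(rows):
    """Sum frequencies of the word directly before the node 'a survey'."""
    counts = Counter()
    for left, freq in rows:
        word = left.split()[-1].lower()
        if word not in STOP:
            counts[word] += freq
    return counts


for word, n in left_collocates(rows).most_common():
    print(f"{word:12} {n}")
```

The sketch does no lemmatisation, so conducting and undertake appear as found. Even so, the verb candidates separate from lines such as Response rate to a survey, which is the grouping students otherwise had to do by eye.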

Left context                        Node       Right   Frequency
Response rate to                    a survey   from    3
And hcfa distributed                a survey   to      2
Response rate to                    a survey   of      2
Response rates to                   a survey   form    2
Thinking about conducting           a survey   to      2
$150000 to undertake                a survey   and     1
1998 report on                      a survey   by      1
2 we sent                           a survey   to      1
Acquisition venterfootnote33sent    a survey   on      1
Addition to mailing                 a survey   of      1
And employment funded               a survey   of      1
And francis used                    a survey   to      1

Figure 5. Search for a survey

This problematic example then gave me the opportunity to remind students of another program, JustTheWord (see Note 5). The screenshot in Figure 6 shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. A glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

[Bar chart of V obj N collocations with survey, frequency axis 0–200; the individual bar values are not recoverable from the extracted text]

cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase, capitalization ratios decrease.
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.
• It seems quite probable that consumers would not recognize such relatively small degrees of difference.

Student dissertations
• It seems that different studies have shown different results.
• It seems that the practice of employing local staff by multinationals is increasing.
• It seems that some individual training courses are below their full capacity.

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted.
• It seems that the issue of privatization could become an object of a national referendum.

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
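The comparison asked of students in Figure 7 can also be sketched computationally. The Python fragment below is only an illustration under stated assumptions: the two lists reproduce the Figure 7 examples, the assignment of each line to the published or student column is my reading of the figure, and the function name `after_it_seems` is hypothetical.

```python
import re
from collections import Counter

# Example lines from Figure 7; column assignment is an assumption
published = [
    "It seems clear that as insider holding proportions increase, capitalization ratios decrease",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference",
]
student = [
    "It seems that different studies have shown different results",
    "It seems that the practice of employing local staff by multinationals is increasing",
    "It seems that some individual training courses are below their full capacity",
]


def after_it_seems(lines):
    """Count what stands between 'it seems' and the following 'that'."""
    counts = Counter()
    for line in lines:
        m = re.search(r"\bit seems\b(.*?)\bthat\b", line, flags=re.IGNORECASE)
        if m:
            counts[m.group(1).strip() or "(nothing)"] += 1
    return counts


print("published:", dict(after_it_seems(published)))
print("student:  ", dict(after_it_seems(student)))
```

On these six lines the published examples each place an evaluative adjective before that, whereas the student examples use bare it seems that. Six lines are far too few to generalise from, so the tally only supports the classroom discussion.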

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’, outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Accounts in the literature regarding the ‘pedagogic mediation’ of corpus data are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources such as grammars, thesauri, dictionaries, etc. will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is, therefore, still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at httpysomeyahpinfoseekcojp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See httphomepagemaccombncwebmanualbncwebmanmainhtm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at httpquodlibumichedummicase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at httpusersoxacuk~louwipmetadatahtml (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4, Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynneusthk

2.1 Moving from top-down to bottom-up processing

Charles (2007) mediates between top-down and bottom-up processing of academic thesis writing in order to introduce her students to the rhetorical function of “defending your work against criticism”, a two-part rhetorical pattern “in which the writer first concedes the possibility of criticism and then moves to neutralize its potentially negative effect” (p. 289). Charles achieves this by first having her students engage in initial macro, discourse-based tasks as consciousness-raising activities of the discourse pattern. In this initial stage, students are introduced to the rhetorical function, its purpose in the text and the different ways in which it can be realised. Students discuss the insights that they have gained in their group discussion in a whole-class feedback session. Charles then moves from this more top-down to bottom-up analysis by having students perform corpus searches which focus on specific lexico-grammatical structures within a discourse-based framework. For instance, students concordance on salient (in the sense of noteworthy) items with the aim of formulating a generalization about the use and positioning of while in constructing a concession. Students are asked to examine the lexico-grammar for constructing a concession, noting that while co-occurs with acknowledge, as in Table 1 below, or that it may also appear in the context of appear/seem/may, e.g. While this may seem contradictory to the conclusions drawn above…

Table 1. Anticipated criticisms and writers’ defence (from Charles 2007: 294)

Extract   Anticipated criticism
A         Although the results of the experiments are not conclusive
B         There is nothing essential in these categories and they may not appear tenable to other scholars
C         While I acknowledge that in some cases the distinction between institutions and groups may seem rather arbitrary
D         Unfortunately specimen preparation is especially laborious for the completed device structure which meant that
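The kind of search described above, concordancing on while and checking its right-hand context for acknowledge or appear/seem/may, can be sketched in a few lines of Python. This is an illustrative sketch only: the function `kwic`, the context width and the mini-corpus are assumptions, with the first two sample sentences adapted from the examples quoted above and the third invented as a non-concessive contrast.

```python
import re


def kwic(text, node, width=40):
    """Return (left, node, right) context tuples for each hit of `node`."""
    hits = []
    pattern = r"\b" + re.escape(node) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left.rjust(width), m.group(0), right.ljust(width)))
    return hits


# Hypothetical mini-corpus for illustration
sample = (
    "While I acknowledge that in some cases the distinction may seem arbitrary, "
    "it is still useful. While this may seem contradictory to the conclusions "
    "drawn above, the data support it. Results were stable while the sample grew."
)

# Items the text reports as co-occurring with concessive 'while'
hedges = {"acknowledge", "appear", "seem", "may"}

for left, node, right in kwic(sample, "while"):
    # Flag lines whose right context contains one of the concession signals
    flag = "*" if hedges & set(re.findall(r"[a-z]+", right.lower())) else " "
    print(f"{flag} {left} [{node}] {right}")
```

Flagged lines are candidates for the concessive use. The temporal while in the third sentence is left unflagged, which is the contrast students are asked to notice when formulating their generalization.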

A more top-down approach similar to that adopted by Charles is also found in Weber’s (2001) materials aimed at law students. First, Weber’s students were inducted into the genre of legal essays by reading through whole essays, taken from the University of London LLB Examinations written by native speakers, and identifying some of the prototypical rhetorical features, e.g. identifying and/or delimiting the legal principle involved in the case. They were then asked to identify any lexical expressions which seemed to correlate with the genre features. This was followed up by consulting the corpus of the legal essays to verify and pinpoint regularities in lexico-grammatical expressions.

Another corpus linguistics practitioner who commences from a top-down perspective is Noguchi (2004). In fact, Noguchi had her science and engineering majors build their own mini-corpora of research journal articles first, before classifying the genre features and then moving on to examining prototypical lexico-grammatical features.

It has been noted that Swales (2002: 163) has contrasted the “fragmented” world of corpus linguistics, with its tendency to adopt a somewhat bottom-up, atomistic approach to text, with the more “integrated” world of ESP material design, with its focus on top-down analysis of macro-level features. However, Weber’s tasks, and those by Charles and Noguchi, seem to be achieving a “symbiosis” between these two approaches, as called for by Partington (1998: 145).

2.2 Moving from bottom-up to top-down processing

Hyland’s (2007, 2008) work on genre-specific phraseological routines commences from a bottom-up perspective. Hyland (2007) tabulates the most frequent 50 4-word bundles across four disciplines (biology, electrical engineering, applied linguistics, business studies), noting the great extent to which these are specific to particular disciplines. These bundles are then classified into three broad foci of research, text and participants, as outlined below (Hyland 2007: 13–14):

Research-oriented – help writers to structure their activities and experiences of the real world, e.g.
    Procedure (the use of the, the operation of the)
    Quantification (the magnitude of the, the surface of the)

Text-oriented – concerned with the organization of the text and its meaning as a message or argument, e.g.
    Structuring signals – text-reflexive markers which organize stretches of discourse (in the present study, in the next section)
    Framing signals – situate arguments by specifying limiting conditions (in the case of, with respect to the)

Participant-oriented – these are focused on the writer or reader of the text, e.g.
    Engagement features – address readers directly (it should be noted that, it can be seen)

Hyland (2004: 220) notes the high productivity of the bundle the * of, which he argues justifies its inclusion in courses assisting students to write effective academic papers in the sciences. For example, two frequent bundles in biology, the presence of and the splicing of, “would seem to offer students valuable forms for expressing meanings relating to existence and to research processes in their writing” (ibid.).
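
By way of illustration, a frequency count of 4-word bundles of the kind described above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Hyland's own procedure: the function names (`tokenize`, `lexical_bundles`), the `min_texts` dispersion threshold and the two sample sentences are all hypothetical, introduced here only to show the principle of counting recurrent word sequences across texts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_bundles(texts, n=4, min_texts=1):
    """Count n-word sequences and keep those occurring in at least min_texts texts."""
    freq = Counter()    # total occurrences of each n-gram
    spread = Counter()  # number of different texts containing each n-gram
    for text in texts:
        tokens = tokenize(text)
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        spread.update(set(grams))
    return Counter({g: c for g, c in freq.items() if spread[g] >= min_texts})


# Hypothetical toy "corpus": these sentences are invented for illustration only.
sample_texts = [
    "In the present study the presence of the enzyme was confirmed.",
    "The presence of the protein was noted in the present study.",
]

bundles = lexical_bundles(sample_texts, n=4, min_texts=2)
for bundle, count in bundles.most_common():
    print(count, bundle)
```

Requiring a bundle to occur in more than one text is a simple way of excluding sequences that are frequent only because a single writer repeats them; the actual thresholds used in any given study would need to be taken from that study.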

Pedagogic applications with a bottom-up starting point moving to more top-down processing have also been noted by Flowerdew (2006). Besides lexical bundles, another type of phraseological routine is collocations. Usually, collocations are considered as word combinations, one of the most common being adjective + noun, which is a pairing of particular difficulty for advanced learners of English (Nesselhauf 2003, 2004). However, such collocations can also be involved in more top-down processing of text. For example, Flowerdew (2006) notes that in a module on business letter writing, students were not sure which adjective from a set of seemingly synonymous adjectives was the “right” one to choose in the following sentence:

Thank you for your kind / sincere / cordial invitation to the alumni dinner.

A search on these different combinations in a Business Letters Corpus revealed the following patternings, shown in Figures 1 and 2 below.²

hoping in fact that you will accept our cordial invitation to be our guest for the length of
May we extend to you a cordial invitation to call in at White’s and make the
Please accept our cordial invitation to visit and become acquainted with
have the pleasure in extending to you our cordial invitation to visit our organization at a data
and I shall be pleased to extend to you my cordial invitation to visit our Tokyo office at your

Figure 1. Selected concordance lines for cordial + invitation

I very much appreciate your kind invitation to join the University Club and I know
Thank you very much again for your kind invitation and I hope your conference will be a
I am therefore very happy to accept your kind invitation and look forward to attending a great
as if I shall be unable to accept your kind invitation this time because of a most important
Thanking you once more for your kind invitation to address the audience I remain

Figure 2. Selected concordance lines for kind + invitation

In order to determine the most appropriate collocation for this context, students were required to look beyond the immediate collocation to an ‘extended unit of meaning’, which takes in the subject + verb and direct and/or indirect object of the sentence. In so doing, students were able to work out that cordial + invitation was used for offering an invitation (May we extend to you a cordial invitation), whereas kind + invitation was used for accepting an invitation or thanking the host (Thank you very much for your kind invitation), or, as one of my students expressed it, cordial is used from you to me and kind from me to you. As Aston (personal communication) has pointed out, it is important that students do not just consult the corpus in a phrasebook-type fashion but have something more substantial. Having students “read” the corpus paradigmatically to find alternatives to the verbs extend, accept, thank and their respective phraseologies would be a way of counteracting this narrow “reading” of the corpus. Stubbs’ (1996: 36) oft-quoted principle, “There is no boundary between lexis and grammar: lexis and grammar are interdependent”, is also of relevance here. Lexis and grammar have been shown to manifest interdependency; for example, the subject of the intransitive verb set in very often refers to “unpleasant states of affairs” such as bad weather (Sinclair 1991: 74). In a similar fashion, collocations and functions can also be viewed as interdependent, albeit at a more discourse-based level, as evidenced by the above analysis of the collocations of “invitation” with kind and cordial.
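
The kind of search that lies behind Figures 1 and 2 can be sketched as a simple key-word-in-context (KWIC) routine. The following Python fragment is only an illustrative sketch: the functions `kwic` and `word_before` are hypothetical helpers, not part of any concordancer mentioned in this article, and the four input lines are taken from Figures 1 and 2 rather than from the full Business Letters Corpus.

```python
import re
from collections import Counter


def kwic(lines, node, width=30):
    """Return (left context, node, right context) for each match of the regex `node`."""
    pattern = re.compile(node, re.IGNORECASE)
    hits = []
    for line in lines:
        for m in pattern.finditer(line):
            left = line[max(0, m.start() - width):m.start()]
            right = line[m.end():m.end() + width]
            hits.append((left, m.group(0), right))
    return hits


def word_before(hits):
    """Count the word immediately to the left of each node, paired with the node."""
    return Counter(
        (left.split()[-1].lower(), node.lower())
        for left, node, _ in hits
        if left.split()
    )


# Four lines taken from Figures 1 and 2; a real search would run over the whole corpus.
letters = [
    "May we extend to you a cordial invitation to call in at White's",
    "Please accept our cordial invitation to visit",
    "Thank you very much again for your kind invitation",
    "I am therefore very happy to accept your kind invitation",
]

hits = kwic(letters, r"\b(?:cordial|kind) invitation\b")
for left, node, right in hits:
    print(f"{left:>30} [{node}] {right}")

left_words = word_before(hits)
```

Even in this tiny sample, the word to the left of the node already hints at the direction of the speech act (a or our before cordial invitation, your before kind invitation), which is the step from the immediate collocation to the extended unit of meaning discussed above.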

Another example of more top-down processing leading on from a bottom-up search is as follows. Flowerdew (2008b) notes that in a course on report writing, one student query related to whether the active or passive voice was used in the following sentence:

This project focuses / is focused on the incidence of mosquitoes on campus.

A search on focus was conducted in an institutionally-compiled 7 million-word corpus of reports, which gave the results shown in Figure 3 below.

Pattern              Example                      Results        Frequency
NOUN + VERB + PREP   “study focuses on”           Show results   292
VERB + VERB + PREP   “has focused on”             Show results   231
TO + VERB + PREP     “to focus on”                Show results   95
ADV + VERB + PREP    “not focused on”             Show results   94
PRON + VERB + PREP   “we focus on”                Show results   57
CONJ + VERB + PREP   “that focus on”              Show results   56
DET + VERB + PREP    “which focus on”             Show results   38
VERB + VERB + ADV    “has focused almost”         Show results   34
NOUN + VERB + ADV    “efforts focused primarily”  Show results   33
TO + VERB + ADV      “to focus more”              Show results   31

Figure 3. Search for focus (all word forms) (Flowerdew 2008)

Besides the fact that the students were able to glean the different meanings between the active and passive forms of focus by examining the verb in a wider context, accessed via “Show results” (column three of the table), I also found that this search encouraged a more top-down processing of text. Students’ scrutiny of the concordance output prompted one student to ask: Why are there so many occurrences of focus in the present perfect? This kind of comment, which I have termed a ‘triggered query’ because it is activated by something the student has alighted on in the corpus data, unprompted by the teacher (Flowerdew 2008b), echoes Swain’s (1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced to set up a critical evaluation of this work, signalled by however. This discourse-based function of however is therefore being used as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.

This type of browsing is thus in the spirit of Bernardini’s philosophy of the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora and conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found:³

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences, for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)
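
A comparison of this kind across sub-corpora of different sizes is usually made on normalised rather than raw frequencies. The short Python sketch below shows one way such a comparison might be set up; it is an assumption-laden illustration, not the BNCweb procedure used by Lee & Swales. The helper names (`count_phrase`, `per_million`) and the sample text are hypothetical, and the figures passed to `per_million` are invented purely to show the arithmetic.

```python
import re


def count_phrase(text, phrase):
    """Count whole-phrase occurrences of `phrase` in `text`, ignoring case."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def per_million(raw_count, corpus_size_in_words):
    """Normalise a raw count to a frequency per million words."""
    return raw_count * 1_000_000 / corpus_size_in_words


# Hypothetical humanities-style sample, invented for illustration only.
humanities_sample = (
    "For instance, the novel shifts tone. Consider, for instance, the ending. "
    "For example, see the preface."
)

instance_hits = count_phrase(humanities_sample, "for instance")
example_hits = count_phrase(humanities_sample, "for example")

# Invented figures: 50 hits in a 2-million-word sub-corpus is 25 per million words.
rate = per_million(50, 2_000_000)
print(instance_hits, example_hits, rate)
```

Normalising in this way allows, say, a small natural sciences sub-corpus to be compared with a much larger humanities one without the raw totals misleading students about which phrase is actually favoured.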

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005) and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information, on the corpora under investigation by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves (facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment; cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because they are divorced from the communicative context in which they were created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries, by their very nature, provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports in which consultancy companies are advising external clients for student report writing, which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing, in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation, which led to further discussion as to whether the gerund implementing would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68), to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range and academic position/role of the interlocutors, these are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omitting the object in the active, e.g. I would appreciate if… The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame …appreciate it if… occurred 105 times, whereas there were only 9 instances of the frame It…appreciated if…, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser and addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
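
Frequency counts for frames of this kind can be obtained with a simple pattern search. The Python sketch below is illustrative only: the two regular expressions are my own rough approximations of the active and passive frames, the function `count_frames` is a hypothetical helper, and the three sample sentences are invented rather than drawn from the Business Letters Corpus, so the counts bear no relation to the 105 and 9 instances reported above.

```python
import re
from collections import Counter

# Rough approximations of the two frames (an assumption, not the original query).
FRAMES = {
    "active: appreciate it if": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    "passive: appreciated if": re.compile(r"\bappreciated if\b", re.IGNORECASE),
}


def count_frames(sentences):
    """Count how often each frame occurs across the given sentences."""
    counts = Counter()
    for sentence in sentences:
        for label, pattern in FRAMES.items():
            counts[label] += len(pattern.findall(sentence))
    return counts


# Hypothetical sample sentences, invented for illustration only.
sample = [
    "I would appreciate it if you could send the invoice by Friday.",
    "We would appreciate it if you could confirm your attendance.",
    "It would be greatly appreciated if you could waive the fee.",
]

counts = count_frames(sample)
for label, n in counts.items():
    print(label, n)
```

Note that a surface pattern such as appreciated if would also retrieve learner errors of the type I would be much appreciated if … in a learner corpus, so the concordance lines behind each count still need to be read in their co-text.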

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach, in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)

In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or deductive approach is adopted would also seem to depend very much on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’, [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice).

(Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries.

[Figure 4 labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)

Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.

Words                                          Contexts       Frequency
Response rate to a survey from                 See contexts   3
And hcfa distributed a survey to               See contexts   2
Response rate to a survey of                   See contexts   2
Response rates to a survey form                See contexts   2
Thinking about conducting a survey to          See contexts   2
$150000 to undertake a survey and              See contexts   1
1998 report on a survey by                     See contexts   1
2 we sent a survey to                          See contexts   1
Acquisition venterfootnote33sent a survey on   See contexts   1
Addition to mailing a survey of                See contexts   1
And employment funded a survey of              See contexts   1
And francis used a survey to                   See contexts   1

Figure 5. Search for a survey

This problematic example then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting, in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

[Screenshot: JustTheWord output for the pattern V obj N, with a frequency bar (axis 0 to 200) beside each collocation]
cluster 1: carry out survey, conduct survey, take in survey
cluster 2: mention in survey, quote survey
cluster 3: complete survey, do survey
cluster 4: publish survey, report in survey
unclustered: base on survey, come in survey, commission survey, design survey

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles

• It seems clear that as insider holding proportions increase, capitalization ratios decrease.

• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.

• It seems quite probable that consumers would not recognize such relatively small degrees of difference.

Student dissertations

• It seems that different studies have shown different results.

• It seems that the practice of employing local staff by multinationals is increasing.

• It seems that some individual training courses are below their full capacity.

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted.

• It seems that the issue of privatization could become an object of a national referendum.

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)


Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are a few accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, they remain rare, indicating that this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries, etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is, therefore, still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million-word BNC. See http://homepage.mac.com/bncweb/manual/bncwebman-main.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31 (2/2007), 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.


Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.


Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.


Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk


Another corpus linguistics practitioner who commences from a top-down perspective is Noguchi (2004). In fact, Noguchi had her science and engineering majors build their own mini-corpora of research journal articles first, before classifying the genre features and then moving on to examining prototypical lexico-grammatical features.

It has been noted that Swales (2002: 163) has contrasted the “fragmented” world of corpus linguistics, with its tendency to adopt a somewhat bottom-up, atomistic approach to text, with the more “integrated” world of ESP material design, with its focus on top-down analysis of macro-level features. However, Weber’s tasks and those by Charles and Noguchi seem to be achieving a “symbiosis” between these two approaches, as called for by Partington (1998: 145).

2.2 Moving from bottom-up to top-down processing

Hyland’s (2007, 2008) work on genre-specific phraseological routines commences from a bottom-up perspective. Hyland (2007) tabulates the 50 most frequent 4-word bundles across four disciplines (biology, electrical engineering, applied linguistics, business studies), noting the great extent to which these are specific to particular disciplines. These bundles are then classified into three broad foci of research, text and participants, as outlined below (Hyland 2007: 13–14):

Research-oriented – help writers to structure their activities and experiences of the real world, e.g.:

    Procedure (the use of the, the operation of the)
    Quantification (the magnitude of the, the surface of the)

Text-oriented – concerned with the organization of the text and its meaning as a message or argument, e.g.:

    Structuring signals – text-reflexive markers which organize stretches of discourse (in the present study, in the next section)
    Framing signals – situate arguments by specifying limiting conditions (in the case of, with respect to the)

Participant-oriented – these are focused on the writer or reader of the text, e.g.:

    Engagement features – address readers directly (it should be noted that, it can be seen)

Hyland (2004: 220) notes the high productivity of the bundle the * of, which he argues justifies its inclusion in courses assisting students to write effective academic papers in the sciences. For example, two frequent bundles in biology, the presence of and the splicing of, “would seem to offer students valuable forms for expressing meanings relating to existence and to research processes in their writing” (ibid.).
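By way of illustration, a bundle count of the kind described above can be sketched in a few lines of Python. This is a minimal sketch and not Hyland's own procedure: the crude tokenizer, the function names lexical_bundles and frame_fillers, the frequency threshold and the toy sentences are all assumptions introduced here for illustration.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_bundles(texts, n=4, min_freq=2):
    """Count n-word sequences, keeping those seen at least min_freq times."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return [(b, f) for b, f in counts.most_common() if f >= min_freq]


def frame_fillers(texts, left="the", right="of"):
    """Count the words filling the slot of a three-word frame such as 'the * of'."""
    fillers = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i in range(len(tokens) - 2):
            if tokens[i] == left and tokens[i + 2] == right:
                fillers[tokens[i + 1]] += 1
    return fillers


# Invented toy sentences (an assumption), not data from any real corpus.
toy_corpus = [
    "The presence of the enzyme was confirmed in the present study.",
    "In the present study the presence of the protein was measured.",
    "The splicing of the transcript depends on the presence of the factor.",
]

print(lexical_bundles(toy_corpus, n=4, min_freq=2))
print(frame_fillers(toy_corpus).most_common())
```

On a real disciplinary corpus the threshold would normally be normalised (for example, occurrences per million words) and combined with a dispersion criterion across texts; both are left out here to keep the sketch short.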


Pedagogic applications with a bottom-up starting point moving to more top-down processing have also been noted by Flowerdew (2006). Besides lexical bundles, another type of phraseological routine is collocations. Usually, collocations are considered as word combinations, one of the most common being adjective + noun, which is a pairing of particular difficulty for advanced learners of English (Nesselhauf 2003, 2004). However, such collocations can also be involved in more top-down processing of text. For example, Flowerdew (2006) notes that in a module on business letter writing, students were not sure which adjective from a set of seemingly semantically synonymous adjectives was the “right” one to choose in the following sentence:

Thank you for your kind / sincere / cordial invitation to the alumni dinner.

A search on these different combinations in a Business Letters Corpus revealed the following patternings, shown in Figures 1 and 2 below.²

hoping in fact that you will accept our cordial invitation to be our guest for the length of
May we extend to you a cordial invitation to call in at White’s and make the
Please accept our cordial invitation to visit and become acquainted with
have the pleasure in extending to you our cordial invitation to visit our organization at a data
and I shall be pleased to extend to you my cordial invitation to visit our Tokyo office at your

Figure 1. Selected concordance lines for cordial + invitation

I very much appreciate your kind invitation to join the University Club and I know
Thank you very much again for your kind invitation and I hope your conference will be a
I am therefore very happy to accept your kind invitation and look forward to attending a great
as if I shall be unable to accept your kind invitation this time because of a most important
Thanking you once more for your kind invitation to address the audience I remain

Figure 2. Selected concordance lines for kind + invitation

In order to determine the most appropriate collocation for this context, students were required to look beyond the immediate collocation to an ‘extended unit of meaning’, which takes in the subject + verb and direct and/or indirect object of the sentence. In so doing, students were able to work out that cordial + invitation was used for offering an invitation (May we extend to you a cordial invitation), whereas kind + invitation was used for accepting an invitation or thanking the host (thank you very much for your kind invitation), or, as one of my students expressed it, cordial is used from you to me and kind from me to you. As Aston (personal communication) has pointed out, it is important that students do not just consult the corpus in a phrasebook-type fashion, but have something more substantial. Having students “read” the corpus paradigmatically to find alternatives to the verbs extend, accept, thank and their respective phraseologies would be a way of counteracting this narrow “reading” of the corpus. Stubbs’ (1996: 36) oft-quoted principle, “There is no boundary between lexis and grammar: lexis and grammar are interdependent”, is also of relevance here. Lexis and grammar have been shown to manifest interdependency; for example, the subject of the intransitive verb set in very often refers to “unpleasant states of affairs”, such as bad weather (Sinclair 1991: 74). In a similar fashion, collocations and functions can also be viewed as interdependent, albeit at a more discourse-based level, as evidenced by the above analysis of the collocations of “invitation” with kind and cordial.
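The contrast between the two collocations can be made visible with a simple keyword-in-context (KWIC) routine. The following Python sketch is offered only as an illustration: it is not the concordancer used with the Business Letters Corpus, the function names kwic and preceding_words are hypothetical, and the sample lines are the truncated concordance lines reproduced in Figures 1 and 2. Counting the word immediately to the left of each collocation is one rough proxy for the wider unit of meaning discussed above.

```python
from collections import Counter
import re


def kwic(lines, node, width=30):
    """Return (left context, node, right context) for each match of the node phrase."""
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    hits = []
    for line in lines:
        for m in pattern.finditer(line):
            left = line[max(0, m.start() - width):m.start()]
            right = line[m.end():m.end() + width]
            hits.append((left, m.group(0), right))
    return hits


def preceding_words(lines, node):
    """Count the word immediately to the left of the node phrase."""
    counts = Counter()
    for left, _, _ in kwic(lines, node, width=200):
        words = re.findall(r"[A-Za-z']+", left)
        if words:
            counts[words[-1].lower()] += 1
    return counts


# Truncated concordance lines taken from Figures 1 and 2.
cordial_lines = [
    "hoping in fact that you will accept our cordial invitation to be our guest for the length of",
    "May we extend to you a cordial invitation to call in at White's and make the",
    "Please accept our cordial invitation to visit and become acquainted with",
    "have the pleasure in extending to you our cordial invitation to visit our organization at a data",
    "and I shall be pleased to extend to you my cordial invitation to visit our Tokyo office at your",
]
kind_lines = [
    "I very much appreciate your kind invitation to join the University Club and I know",
    "Thank you very much again for your kind invitation and I hope your conference will be a",
    "I am therefore very happy to accept your kind invitation and look forward to attending a great",
    "as if I shall be unable to accept your kind invitation this time because of a most important",
    "Thanking you once more for your kind invitation to address the audience I remain",
]

# Print the concordance with the node phrase aligned in the centre.
for left, node, right in kwic(cordial_lines + kind_lines, "invitation"):
    print(f"{left:>30} [{node}] {right}")

print(preceding_words(cordial_lines, "cordial invitation"))
print(preceding_words(kind_lines, "kind invitation"))
```

In these ten lines, cordial invitation follows our, my or a, whereas kind invitation always follows your, which mirrors the student's observation about the direction of the invitation. Ten lines are of course far too few to generalise from.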

Another example of more top-down processing leading on from a bottom-up search is as follows. Flowerdew (2008b) notes that in a course on report writing, one student query related to whether the active or passive voice was used in the following sentence:

This project focuses / is focused on the incidence of mosquitoes on campus.

A search on focus was conducted in an institutionally-compiled 7-million-word corpus of reports, which gave the results shown in Figure 3 below.

Sort options: Left sort, Right sort, Frequency sort

Pattern               Example                           Results        Frequency
NOUN + VERB + PREP    e.g. “study focuses on”           Show results   292
VERB + VERB + PREP    e.g. “has focused on”             Show results   231
TO + VERB + PREP      e.g. “to focus on”                Show results   95
ADV + VERB + PREP     e.g. “not focused on”             Show results   94
PRON + VERB + PREP    e.g. “we focus on”                Show results   57
CONJ + VERB + PREP    e.g. “that focus on”              Show results   56
DET + VERB + PREP     e.g. “which focus on”             Show results   38
VERB + VERB + ADV     e.g. “has focused almost”         Show results   34
NOUN + VERB + ADV     e.g. “efforts focused primarily”  Show results   33
TO + VERB + ADV       e.g. “to focus more”              Show results   31

Figure 3. Search for focus (all word forms) (Flowerdew 2008)

Besides the fact that the students were able to glean the different meanings between the active and passive forms of focus by examining the verb in a wider context, accessed via “Show results” (column three of the Table), I also found that this search encouraged a more top-down processing of text. Students’ scrutiny of the concordance output prompted one student to ask: Why are there so many occurrences of focus in the present perfect? This kind of comment, which I have termed a ‘triggered query’ because it is activated by something the student has alighted on in the corpus data, unprompted by the teacher (Flowerdew 2008b), echoes Swain’s (1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced to set up a critical evaluation of this work, signalled by however. This discourse-based function of however is therefore being used as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.

This type of browsing is thus in the spirit of Bernardini’s philosophy of the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora and conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found.³

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences, for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)
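A comparison of this kind rests on frequencies that are normalised for sub-corpus size. The short Python sketch below shows one way such a comparison might be set up. It is an illustration under stated assumptions and not the BNCweb query that Lee & Swales's students ran: the function names per_million and compare are hypothetical, and the two one-sentence "sub-corpora" are invented.

```python
import re


def per_million(texts, phrase):
    """Occurrences of a phrase per million word tokens across the given texts."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    hits = sum(len(pattern.findall(t)) for t in texts)
    tokens = sum(len(re.findall(r"\w+", t)) for t in texts)
    return 1_000_000 * hits / tokens if tokens else 0.0


def compare(subcorpora, phrases):
    """Normalised frequency of each phrase in each named sub-corpus."""
    return {
        name: {p: per_million(texts, p) for p in phrases}
        for name, texts in subcorpora.items()
    }


# Invented one-sentence sub-corpora (an assumption), ten word tokens each.
subcorpora = {
    "natural sciences": ["We can for example heat the sample to one hundred"],
    "humanities": ["Poets for instance often return to the same few themes"],
}

for name, freqs in compare(subcorpora, ["for example", "for instance"]).items():
    print(name, freqs)
```

Normalising per million words matters because disciplinary sub-corpora are rarely the same size, so raw counts alone would not support the kind of contrast drawn in the quotation.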

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005) and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information, on the corpora under investigation by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because it is divorced from the communicative context in which it was created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries, by their very nature, provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports in which consultancy companies are advising external clients for student report writing which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation, which led to further discussion as to whether the gerund implementing would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68) to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Academic Spoken English (MICASE), have been marked up with metadata categories such as the gender, age range and academic position/role of the interlocutors, these are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omission of the object in the active, e.g. I would appreciate if… The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame …appreciate it if … occurred 105 times, whereas there were only 9 instances of the frame It…appreciated if…, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser/addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
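The kind of frame count described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two regular expressions are hypothetical approximations of the active and passive frames, the sample sentences are invented rather than taken from the Business Letters Corpus, and the function name count_frames is not part of any concordancing tool.

```python
import re
from collections import Counter

# Hypothetical regular expressions approximating the two frames.
# "active"  ~ ...appreciate it if...
# "passive" ~ It...appreciated if... (up to 40 characters between "it" and the verb)
FRAMES = {
    "active": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    "passive": re.compile(r"\bit\b[^.?!]{0,40}?\bappreciated if\b", re.IGNORECASE),
}

# Invented sample sentences (NOT drawn from the Business Letters Corpus).
SAMPLE = [
    "I would appreciate it if you could send the invoice by Friday.",
    "We would appreciate it if the goods were delivered next week.",
    "It would be greatly appreciated if you could waive the fee.",
    "I would appreciate if you reply soon.",  # learner-style error, matches neither frame
]

def count_frames(texts):
    """Count how often each frame occurs across a list of texts."""
    counts = Counter({name: 0 for name in FRAMES})
    for text in texts:
        for name, pattern in FRAMES.items():
            counts[name] += len(pattern.findall(text))
    return counts

if __name__ == "__main__":
    for name, n in count_frames(SAMPLE).items():
        print(name, n)
```

A raw count of this kind only flags that one frame is marked. Deciding when the rarer frame is appropriate still requires reading the surrounding co-text, as the discussion above makes clear.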

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach, in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to all students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or deductive approach is adopted would also very much seem to depend on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’, [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice). (Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased/has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries.

[Figure 4: diagram with the labels Inductive, Deductive, Phraseology (probabilities), Grammar rules and (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)


Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do/carry out/conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
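The ‘noise’ problem can be made concrete with a small Python sketch that contrasts a raw key-word-in-context listing with a count restricted to candidate verbs found to the left of the node. Everything here is an assumption made for illustration: the sentences are invented, the candidate verb list is hand-picked, and the functions tokenize, kwic and left_collocates are hypothetical helpers, not the search tool the students actually used.

```python
import re
from collections import Counter

# Invented sentences standing in for a sub-corpus of reports.
SAMPLE = [
    "We plan to conduct a survey on the use of computers.",
    "The team will carry out a survey of staff.",
    "They conducted a survey last year.",
    "The response rate to a survey from the agency was low.",
]

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def kwic(texts, node, width=3):
    """Return (left context, node, right context) for every hit of `node`."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == node:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((left, tok, right))
    return lines

def left_collocates(texts, node, candidates, span=3):
    """Count candidate words occurring within `span` tokens left of `node`."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == node:
                window = tokens[max(0, i - span):i]
                counts.update(w for w in window if w in candidates)
    return counts

if __name__ == "__main__":
    for left, node, right in kwic(SAMPLE, "survey"):
        print(f"{left:>20} | {node} | {right}")
    # Hand-picked candidate verb forms (an assumption, not a tagger output).
    verbs = {"conduct", "conducted", "carry", "do", "did"}
    print(left_collocates(SAMPLE, "survey", verbs))
```

The raw listing returns every hit, including lines such as the ‘response rate’ one where no verb governs the noun. The filtered count is easier to read, but it depends entirely on the candidate list supplied, which is where a teacher’s hint or a dedicated collocation tool comes in.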

Left context                        Node       Right   Frequency
Response rate to                    a survey   from    3
And hcfa distributed                a survey   to      2
Response rate to                    a survey   of      2
Response rates to                   a survey   form    2
Thinking about conducting           a survey   to      2
$150000 to undertake                a survey   and     1
1998 report on                      a survey   by      1
2 we sent                           a survey   to      1
Acquisition venterfootnote33sent    a survey   on      1
Addition to mailing                 a survey   of      1
And employment funded               a survey   of      1
And francis used                    a survey   to      1

Figure 5. Search for a survey

This problematic example above then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

[Figure 6: screenshot of a bar chart (frequency axis 0–200) for the pattern V obj N, with the collocations grouped as follows]

cluster 1: carry out survey, conduct survey, take in survey
cluster 2: mention in survey, quote survey
cluster 3: complete survey, do survey
cluster 4: publish survey, report in survey
unclustered: base on survey, come in survey, commission survey, design survey

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase, capitalization ratios decrease
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation
• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations
• It seems that different studies have shown different results
• It seems that the practice of employing local staff by multinationals is increasing
• It seems that some individual training courses are below their full capacity

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted
• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)


Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
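The comparison that the Figure 7 task asks students to make by eye can also be sketched in Python. The six lines are the examples reproduced in Figure 7, on the assumption that the figure’s two columns separate published from student examples in the way shown here. The two regular expressions and the labels ‘bare’ and ‘modified’ are hypothetical conveniences, and six hand-picked lines cannot support any generalisation about either corpus.

```python
import re
from collections import Counter

# The six example lines from Figure 7 (Hewings 2002); the assignment to the
# two sets assumes the column layout of the original figure.
EXAMPLES = {
    "published": [
        "It seems clear that as insider holding proportions increase capitalization ratios decrease",
        "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation",
        "It seems quite probable that consumers would not recognize such relatively small degrees of difference",
    ],
    "student": [
        "It seems that different studies have shown different results",
        "It seems that the practice of employing local staff by multinationals is increasing",
        "It seems that some individual training courses are below their full capacity",
    ],
}

# "bare": it seems + that; "modified": it seems + one or two words + that.
BARE = re.compile(r"\bit seems that\b", re.IGNORECASE)
MODIFIED = re.compile(r"\bit seems (?:\w+ ){1,2}that\b", re.IGNORECASE)

def classify(line):
    """Label a concordance line by the pattern following 'it seems'."""
    if BARE.search(line):
        return "bare"
    if MODIFIED.search(line):
        return "modified"
    return "other"

def tally(examples):
    """Count pattern labels separately for each set of lines."""
    return {name: Counter(classify(line) for line in lines)
            for name, lines in examples.items()}

if __name__ == "__main__":
    for name, counts in tally(EXAMPLES).items():
        print(name, dict(counts))
```

On these lines the published set shows only the modified pattern and the student set only the bare one. The second part of the task in Figure 7 is a reminder that published writers use bare it seems that as well, so the interesting question for students is how, not whether, it is used.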

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Accounts in the literature regarding the ‘pedagogic mediation’ of corpus data are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries, etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See httphomepagemaccombncwebmanualbncwebmanmainhtm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marco (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.


Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.


Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.


Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk


Pedagogic applications with a bottom-up starting point moving to more top-down processing have also been noted by Flowerdew (2006). Besides lexical bundles, another type of phraseological routine is collocations. Usually, collocations are considered as word combinations, one of the most common being adjective + noun, which is a pairing of particular difficulty for advanced learners of English (Nesselhauf 2003, 2004). However, such collocations can also be involved in more top-down processing of text. For example, Flowerdew (2006) notes that in a module on business letter writing, students were not sure which adjective from a set of seemingly semantically synonymous adjectives was the “right” one to choose in the following sentence:

Thank you for your kind / sincere / cordial invitation to the alumni dinner.

A search on these different combinations in a Business Letters Corpus revealed the following patternings, shown in Figures 1 and 2 below.²

hoping in fact that you will accept our cordial invitation to be our guest for the length of
May we extend to you a cordial invitation to call in at White’s and make the
Please accept our cordial invitation to visit and become acquainted with
have the pleasure in extending to you our cordial invitation to visit our organization at a data
and I shall be pleased to extend to you my cordial invitation to visit our Tokyo office at your

Figure 1. Selected concordance lines for cordial + invitation

I very much appreciate your kind invitation to join the University Club and I know
Thank you very much again for your kind invitation and I hope your conference will be a
I am therefore very happy to accept your kind invitation and look forward to attending a great
as if I shall be unable to accept your kind invitation this time because of a most important
Thanking you once more for your kind invitation to address the audience I remain

Figure 2. Selected concordance lines for kind + invitation

In order to determine the most appropriate collocation for this context, students were required to look beyond the immediate collocation to an ‘extended unit of meaning’, which takes in the subject + verb and direct and/or indirect object of the sentence. In so doing, students were able to work out that cordial + invitation was used for offering an invitation (May we extend to you a cordial invitation), whereas kind + invitation was used for accepting an invitation or thanking the host (thank you very much for your kind invitation), or, as one of my students expressed it: cordial is used from you to me and kind from me to you. As Aston (personal communication) has pointed out, it is important that students do not just consult the corpus in a phrasebook-type fashion, but have something more substantial. Having students “read” the corpus paradigmatically to find alternatives to the verbs extend, accept, thank and their respective phraseologies would be a way of counteracting this narrow “reading” of the corpus. Stubbs’ (1996: 36) oft-quoted principle, “There is no boundary between lexis and grammar: lexis and grammar are interdependent”, is also of relevance here. Lexis and grammar have been shown to manifest interdependency; for example, the subject of the intransitive verb set in very often refers to “unpleasant states of affairs”, such as bad weather (Sinclair 1991: 74). In a similar fashion, collocations and functions can also be viewed as interdependent, albeit at a more discourse-based level, as evidenced by the above analysis of the collocations of “invitation” with kind and cordial.
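The kind of search behind Figures 1 and 2 can be approximated with a few lines of code. The following is a minimal sketch only, not the concordancer used in the study: the function name kwic, the context width and the four sample sentences (adapted from the concordance lines above) are assumptions made for illustration.

```python
import re

def kwic(lines, node, collocate, width=40):
    """Return (left context, hit, right context) rows for 'collocate node' pairs."""
    pattern = re.compile(
        rf"\b{re.escape(collocate)}\s+{re.escape(node)}\b", re.IGNORECASE
    )
    rows = []
    for text in lines:
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            # Right-justify the left context so the hits line up vertically.
            rows.append((left.rjust(width), m.group(0), right))
    return rows

# Toy sample adapted from Figures 1 and 2 (not the Business Letters Corpus itself).
sample = [
    "May we extend to you a cordial invitation to call in.",
    "Please accept our cordial invitation to visit.",
    "Thank you very much again for your kind invitation.",
    "I am therefore very happy to accept your kind invitation.",
]

for adjective in ("cordial", "kind"):
    for left, hit, right in kwic(sample, "invitation", adjective):
        print(f"{left} [{hit}] {right}")
```

Reading the left-hand column of such output is what lets students see the verbs (extend, accept, thank) that distinguish the two collocations.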

Another example of more top-down processing leading on from a bottom-up search is as follows. Flowerdew (2008b) notes that in a course on report writing, one student query related to whether the active or passive voice was used in the following sentence:

This project focuses / is focused on the incidence of mosquitoes on campus.

A search on focus was conducted in an institutionally-compiled 7 million-word corpus of reports, which gave the results shown in Figure 3 below.

Pattern               Example                             Results         Frequency
NOUN + VERB + PREP    e.g. “study focuses on”             Show results    292
VERB + VERB + PREP    e.g. “has focused on”               Show results    231
TO + VERB + PREP      e.g. “to focus on”                  Show results     95
ADV + VERB + PREP     e.g. “not focused on”               Show results     94
PRON + VERB + PREP    e.g. “we focus on”                  Show results     57
CONJ + VERB + PREP    e.g. “that focus on”                Show results     56
DET + VERB + PREP     e.g. “which focus on”               Show results     38
VERB + VERB + ADV     e.g. “has focused almost”           Show results     34
NOUN + VERB + ADV     e.g. “efforts focused primarily”    Show results     33
TO + VERB + ADV       e.g. “to focus more”                Show results     31

Figure 3. Search for focus (all word forms) (Flowerdew 2008)

Besides the fact that the students were able to glean the different meanings between the active and passive forms of focus by examining the verb in a wider context, accessed via “Show results” (column three of the Table), I also found that this search encouraged a more top-down processing of text. Students’ scrutiny of the concordance output prompted one student to ask: Why are there so many occurrences of focus in the present perfect? This kind of comment, which I have termed a ‘triggered query’ because it is activated by something the student has alighted on in the corpus data, unprompted by the teacher (Flowerdew 2008b), echoes Swain’s (1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced to set up a critical evaluation of this work, signalled by however. This discourse-based function of however is therefore being used as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.
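A student’s impression that focus is frequent in the present perfect could be checked with a rough tally of verb forms. The sketch below is illustrative only: it relies on a simple regular expression rather than the part-of-speech tagging behind Figure 3, and the names FOCUS, classify and sample, as well as the sample sentences, are hypothetical.

```python
import re
from collections import Counter

# An optional auxiliary (plus at most one intervening word, e.g. "not"),
# followed by any word form of "focus" (both -s- and -ss- spellings).
FOCUS = re.compile(
    r"\b(?:(has|have|is|are|was|were)\s+(?:\w+\s+)?)?(focus(?:s?ed|s?es|s?ing)?)\b",
    re.IGNORECASE,
)

def classify(aux, form):
    """Label a hit as present perfect, passive or other (a crude heuristic)."""
    aux, form = (aux or "").lower(), form.lower()
    if form.endswith("ed") and aux in {"has", "have"}:
        return "present perfect"
    if form.endswith("ed") and aux in {"is", "are", "was", "were"}:
        return "passive"
    return "other"

# Toy sentences; the first is adapted from the example quoted above.
sample = [
    "Much of this work to date, however, has focussed on East Asian versus Anglo comparisons.",
    "This project focuses on the incidence of mosquitoes on campus.",
    "Previous research has focused on noise levels.",
    "The survey was focused on first-year students.",
]

tally = Counter(
    classify(*m.groups()) for sentence in sample for m in FOCUS.finditer(sentence)
)
print(tally)
```

A heuristic of this kind will misclassify some hits (for instance, reduced relatives such as efforts focused primarily on), so the tally is best treated as a prompt for returning to the concordance lines rather than as a result in itself.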

This type of browsing is thus in the spirit of Bernardini’s philosophy of the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective, and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora, and conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found:³

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences, for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)
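Comparisons across disciplines of this kind depend on normalising raw counts, since sub-corpora differ in size. The following is a minimal sketch under stated assumptions: the two miniature “sub-corpora” are invented for illustration, tokens are counted by whitespace splitting, and the function names are hypothetical; it does not reproduce BNCweb’s own query interface.

```python
import re

def phrase_count(text, phrase):
    """Count case-insensitive whole-phrase occurrences in a text."""
    return len(re.findall(rf"\b{re.escape(phrase)}\b", text, flags=re.IGNORECASE))

def per_million(count, corpus_size):
    """Normalise a raw count to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size

# Invented toy sub-corpora; real ones would run to millions of words.
subcorpora = {
    "social_sciences": (
        "Voters, for instance, may abstain. Some, for instance, never register. "
        "Turnout, for example, varies."
    ),
    "natural_sciences": (
        "Enzymes, for example, lower activation energy. Consider, for example, catalase."
    ),
}

for name, text in subcorpora.items():
    size = len(text.split())  # crude whitespace tokenisation
    for phrase in ("for instance", "for example"):
        raw = phrase_count(text, phrase)
        print(f"{name:18} {phrase:13} raw={raw} per million={per_million(raw, size):.0f}")
```

Frequencies alone only point to where the two phrases cluster; the functional difference that Lee & Swales describe still has to be read off the surrounding co-text.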

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005), and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press, for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information on the corpora under investigation, by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer, and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because it is divorced from the communicative context in which it was

created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation, and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries, by their very nature, provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports in which consultancy companies are advising external clients for student report writing, which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing, in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’, whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand, but also what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation, which led to further discussion as to whether the gerund implementing would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68), to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range, academic position/role of the interlocutors, these are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omission of the object in the active, e.g. I would appreciate if… The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame …appreciate it if … occurred 105 times, whereas there were only 9 instances of the frame It…appreciated if…, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser/addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
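Frame counts such as these (105 active versus 9 passive, i.e. roughly 92% of the 114 hits are active) can be obtained with two pattern searches. The sketch below is an assumption-laden illustration rather than the procedure actually used: the regular expressions, the variable names and the four sample sentences are hypothetical, and the passive pattern simply allows up to 40 characters between It and appreciated if within one sentence.

```python
import re

# Active frame: "... appreciate it if ..."
ACTIVE = re.compile(r"\bappreciate\s+it\s+if\b", re.IGNORECASE)
# Passive frame: "It ... appreciated if ..." within a single sentence.
PASSIVE = re.compile(r"\bit\b[^.?!]{0,40}?\bappreciated\s+if\b", re.IGNORECASE)

# Invented sample letters; the last one shows the learner error (missing object),
# which matches neither frame.
letters = [
    "I would appreciate it if you could send the catalogue.",
    "We would appreciate it if payment could be made by Friday.",
    "It would be much appreciated if you could waive the fee.",
    "I would appreciate if you reply soon.",
]

active_hits = sum(len(ACTIVE.findall(letter)) for letter in letters)
passive_hits = sum(len(PASSIVE.findall(letter)) for letter in letters)
print(f"active frame: {active_hits}, passive frame: {passive_hits}")
```

The counts only flag the passive frame as the marked choice; deciding when it is appropriate still requires reading the co-text, as described above.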

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid) I also believe that an inductive approach may not appeal to students on account of their different cognitive styles (Flowerdew 2008b) Field-dependent students who thrive in cooperative interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples may benefit from this type of pedagogy However field-independent learners who are known to prefer instruction emphasizing rules may not take to the inductive approach inherent in corpus-based pedagogy It is interesting to note that Vannestaringl amp Lindquist (2007 343) state that some of the students in their inductive corpus-based grammar course commented that ldquohellipthey preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercisesrdquo

Another reason as to whether an inductive or deductive approach is adopted would very much seem to depend on the nature of a particular enquiry If the enquiry is based on a grammar rule (for example the difference between for and since in time expressions see Tribble amp Jones 1990) then the differences are quite clear-cut However if the enquiry focuses on an aspect of phraseology students may find it difficult to extrapolate the tendencies associated with patterns in lan-guage (Hunston amp Francis 2000) as they may be confronted with conflicting ex-amples which do not follow a particular pattern in all cases

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’ [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice).

(Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”
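The prompting question about subjects can be made concrete with a left-sorted concordance, which lines up whatever precedes the search phrase. The following is only a minimal sketch in Python: the function names `kwic` and `sort_by_left` and the three sample sentences are invented for illustration and are not taken from the corpus or software used in the course described here.

```python
import re

# Invented sample sentences, for illustration only (not corpus data).
SAMPLE = [
    "The level of motivation has decreased over the term.",
    "The budget was decreased by the committee last year.",
    "Enrolment has decreased since 2005.",
]

def kwic(texts, phrase, width=40):
    """Return (left, node, right) tuples for each occurrence of phrase."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()].strip()
            right = text[m.end():m.end() + width].strip()
            lines.append((left, m.group(0), right))
    return lines

def sort_by_left(lines):
    """Sort lines by the words to the left of the node, nearest word first."""
    return sorted(lines, key=lambda line: list(reversed(line[0].lower().split())))

if __name__ == "__main__":
    for phrase in ("was decreased", "has decreased"):
        print(phrase.upper())
        for left, node, right in sort_by_left(kwic(SAMPLE, phrase)):
            print(f"{left:>40}  {node}  {right}")
```

Sorting on the left context places the grammatical subjects next to the node, so learners can compare what kind of subject each verb form takes without reading every line in full.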

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries:

[Figure 4 diagram: a continuum between Inductive and Deductive, labelled with Phraseology (probabilities), Grammar rules and (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)

Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million-word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
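One way of reducing this kind of noise is to count only the word immediately to the left of the search phrase and to discard function words. The sketch below, in Python, is an illustration under stated assumptions rather than the procedure used in class: the left-context strings are taken from the Figure 5 results (one garbled line is omitted), while the stoplist `FUNCTION_WORDS` and the function `left_collocates` are hypothetical.

```python
from collections import Counter

# Left contexts of "a survey" as listed in Figure 5 (one garbled line omitted).
LEFT_CONTEXTS = [
    "response rate to",
    "and hcfa distributed",
    "thinking about conducting",
    "$150000 to undertake",
    "1998 report on",
    "2 we sent",
    "addition to mailing",
    "and employment funded",
    "and francis used",
]

# Hypothetical stoplist; a real one would be much longer.
FUNCTION_WORDS = {"to", "on", "of", "and", "the", "a"}

def left_collocates(contexts, stoplist):
    """Count the word directly before the node phrase, skipping stoplist words."""
    counts = Counter()
    for context in contexts:
        words = context.lower().split()
        if words and words[-1] not in stoplist:
            counts[words[-1]] += 1
    return counts

if __name__ == "__main__":
    for word, freq in left_collocates(LEFT_CONTEXTS, FUNCTION_WORDS).most_common():
        print(word, freq)
```

The counts here ignore the frequency column of Figure 5 and treat each listed pattern once. The output still mixes verb forms (conducting, undertake, sent), which is part of the reason a lemmatised, clustered tool is easier for students to read.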

Words                                          Contexts        Frequency
Response rate to a survey from                 See contexts    3
And hcfa distributed a survey to               See contexts    2
Response rate to a survey of                   See contexts    2
Response rates to a survey form                See contexts    2
Thinking about conducting a survey to          See contexts    2
$150,000 to undertake a survey and             See contexts    1
1998 report on a survey by                     See contexts    1
2 we sent a survey to                          See contexts    1
Acquisition venterfootnote33sent a survey on   See contexts    1
Addition to mailing a survey of                See contexts    1
And employment funded a survey of              See contexts    1
And francis used a survey to                   See contexts    1

Figure 5. Search for a survey

This problematic example above then gave me the opportunity to remind students of another program, JustTheWord.5 The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

[Figure 6 screenshot: JustTheWord “V obj N” collocations for survey, shown with frequency bars on a scale of 0 to 200; the individual frequency values are not recoverable]

cluster 1: carry out survey, conduct survey, take in survey
cluster 2: mention in survey, quote survey
cluster 3: complete survey, do survey
cluster 4: publish survey, report in survey
unclustered: base on survey, come in survey, commission survey, design survey

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase, capitalization ratios decrease.
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.
• It seems quite probable that consumers would not recognize such relatively small degrees of difference.

Student dissertations
• It seems that different studies have shown different results.
• It seems that the practice of employing local staff by multinationals is increasing.
• It seems that some individual training courses are below their full capacity.

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted.
• It seems that the issue of privatization could become an object of a national referendum.

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’, outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although the literature contains some accounts of the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating that this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries, etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is, therefore, still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million-word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31 (2/2007), 17–24.

Davies, M. 2004. “Student use of large annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk

accept, thank and their respective phraseologies would be a way of counteracting this narrow “reading” of the corpus. Stubbs’ (1996: 36) oft-quoted principle, “There is no boundary between lexis and grammar: lexis and grammar are interdependent”, is also of relevance here. Lexis and grammar have been shown to manifest interdependency; for example, the subject of the intransitive verb set in very often refers to “unpleasant states of affairs” such as bad weather (Sinclair 1991: 74). In a similar fashion, collocations and functions can also be viewed as interdependent, albeit at a more discourse-based level, as evidenced by the above analysis of the collocations of “invitation” with kind and cordial.

Another example of more top-down processing leading on from a bottom-up search is as follows. Flowerdew (2008b) notes that, in a course on report writing, one student query related to whether the active or passive voice was used in the following sentence:

This project focuses / is focused on the incidence of mosquitoes on campus.

A search on focus was conducted in an institutionally-compiled 7-million-word corpus of reports, which gave the results shown in Figure 3 below.

Pattern               Example                               Results         Frequency
NOUN + VERB + PREP    e.g. “study focuses on”               Show results    292
VERB + VERB + PREP    e.g. “has focused on”                 Show results    231
TO + VERB + PREP      e.g. “to focus on”                    Show results    95
ADV + VERB + PREP     e.g. “not focused on”                 Show results    94
PRON + VERB + PREP    e.g. “we focus on”                    Show results    57
CONJ + VERB + PREP    e.g. “that focus on”                  Show results    56
DET + VERB + PREP     e.g. “which focus on”                 Show results    38
VERB + VERB + ADV     e.g. “has focused almost”             Show results    34
NOUN + VERB + ADV     e.g. “efforts focused primarily”      Show results    33
TO + VERB + ADV       e.g. “to focus more”                  Show results    31

Figure 3. Search for focus (all word forms) (Flowerdew 2008)

Besides the fact that the students were able to glean the different meanings between the active and passive forms of focus by examining the verb in a wider context, accessed via “Show results” (column three of the Table), I also found that this search encouraged a more top-down processing of text. Students’ scrutiny of the concordance output prompted one student to ask: Why are there so many occurrences of focus in the present perfect? This kind of comment, which I have termed a ‘triggered query’ because it is activated by something the student has alighted on in the corpus data, unprompted by the teacher (Flowerdew 2008b), echoes Swain’s (1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced to set up a critical evaluation of this work, signalled by however. This discourse-based function of however is therefore being used as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.

This type of browsing is thus in the spirit of Bernardini’s philosophy of the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora and conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found.³

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences, for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005) and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information, on the corpora under investigation by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because it is divorced from the communicative context in which it was created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries, by their very nature, provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports, in which consultancy companies are advising external clients, for student report writing, which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation, which led to further discussion as to whether the gerund implementing would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68) to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range, academic position/role of the interlocutors, these are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.
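What encoding a written corpus with contextual metadata might look like can be sketched briefly. The following Python fragment is a minimal illustration under stated assumptions: the record type LetterRecord, its fields (writer_role, addressee_role, purpose), the helper select and the three sample letters are all hypothetical and are not drawn from Burnard (2004), Krishnamurthy & Kosem (2007) or any existing corpus.

```python
from dataclasses import dataclass


@dataclass
class LetterRecord:
    """One text in a written corpus plus hypothetical contextual metadata."""
    text: str
    writer_role: str      # e.g. "student", "manager"
    addressee_role: str   # e.g. "university authority", "supplier"
    purpose: str          # e.g. "request", "complaint"


# Invented sample records, for illustration only.
RECORDS = [
    LetterRecord("I would appreciate it if you could reply soon.",
                 "student", "supplier", "request"),
    LetterRecord("It would be appreciated if the deadline could be extended.",
                 "student", "university authority", "request"),
    LetterRecord("We regret that the goods arrived damaged.",
                 "manager", "supplier", "complaint"),
]


def select(records, **criteria):
    """Return the records whose metadata match every keyword criterion."""
    return [r for r in records
            if all(getattr(r, field) == value for field, value in criteria.items())]


# Restrict a later search to requests addressed to university authorities.
requests_to_authority = select(RECORDS, purpose="request",
                               addressee_role="university authority")
```

With metadata of this kind attached to each text, a concordance search could be limited to texts written in a context comparable to the students’ own, which is the sort of subsequent analysis that such encoding is intended to support.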

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. “I would be much appreciated if …”, and omission of the object in the active, e.g. “I would appreciate if…”. The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame “…appreciate it if …” occurred 105 times, whereas there were only 9 instances of the frame “It…appreciated if…”, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser/addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
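The kind of frame count reported above can be illustrated with a short script. This is a minimal sketch in Python under stated assumptions, not the procedure actually used with the Business Letters Corpus: the two regular expressions are one possible operationalisation of the active and passive frames, the passive pattern arbitrarily allows up to four words between It and appreciated, and the three sample sentences are invented.

```python
import re

# Active frame: "... appreciate it if ..."
ACTIVE_FRAME = re.compile(r"\bappreciate it if\b", re.IGNORECASE)
# Passive frame: "It ... appreciated if ...", allowing one to four
# intervening words (an arbitrary window chosen for this sketch).
PASSIVE_FRAME = re.compile(r"\bit\b(?:\s+\w+){1,4}?\s+appreciated if\b",
                           re.IGNORECASE)

# Invented sentences standing in for a corpus of business letters.
SAMPLE_LETTERS = [
    "We would appreciate it if you could send the invoice by Friday.",
    "I would appreciate it if you would confirm the booking.",
    "It would be greatly appreciated if you could waive the fee.",
]


def count_frames(texts):
    """Count occurrences of the active and passive frames across the texts."""
    active = sum(len(ACTIVE_FRAME.findall(text)) for text in texts)
    passive = sum(len(PASSIVE_FRAME.findall(text)) for text in texts)
    return {"appreciate it if": active, "It ... appreciated if": passive}


print(count_frames(SAMPLE_LETTERS))
```

A marked imbalance between the two counts, such as the 105 versus 9 reported above, is only a prompt for further enquiry; it is the reading of the surrounding co-text that indicates when the rarer frame is appropriate.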

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or deductive approach is adopted would also seem to depend very much on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’ [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice).

(Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries:

[Diagram labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)


Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.

This problematic example then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

Words                                          Contexts        Frequency
Response rate to a survey from                 See contexts    3
And hcfa distributed a survey to               See contexts    2
Response rate to a survey of                   See contexts    2
Response rates to a survey form                See contexts    2
Thinking about conducting a survey to          See contexts    2
$150,000 to undertake a survey and             See contexts    1
1998 report on a survey by                     See contexts    1
2 we sent a survey to                          See contexts    1
Acquisition venterfootnote33sent a survey on   See contexts    1
Addition to mailing a survey of                See contexts    1
And employment funded a survey of              See contexts    1
And francis used a survey to                   See contexts    1

Figure 5. Search for a survey

[Bar chart of V obj N collocations of survey, frequency axis 0 to 200, grouped as follows]

cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for survey in JustTheWord collocations program
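The difficulty of picking out verb + noun collocations from raw concordance lines can be reduced by filtering the lines for the candidate verbs in the students’ query. The following is a minimal sketch in Python, offered as an illustration rather than as a description of JustTheWord or of the concordancer used in class: the hand-listed inflections stand in for proper lemmatisation, the function name count_candidates is hypothetical, and the last sample line is invented (the others are left contexts taken from Figure 5).

```python
import re
from collections import Counter

# Candidate collocates from the students' query, with hand-listed inflections
# (an assumption for this sketch; a real tool would lemmatise the corpus).
CANDIDATES = {
    "do": ["do", "does", "did", "done", "doing"],
    "carry out": ["carry out", "carries out", "carried out", "carrying out"],
    "conduct": ["conduct", "conducts", "conducted", "conducting"],
}

# Left contexts ending at the node phrase "a survey".
LINES = [
    "response rate to a survey",
    "thinking about conducting a survey",
    "and hcfa distributed a survey",
    "we plan to carry out a survey",  # invented line, not from the corpus
]


def count_candidates(lines):
    """Count lines in which a candidate verb form directly precedes 'a survey'."""
    counts = Counter()
    for line in lines:
        for lemma, forms in CANDIDATES.items():
            alternatives = "|".join(re.escape(form) for form in forms)
            pattern = rf"\b(?:{alternatives}) a survey\b"
            if re.search(pattern, line.lower()):
                counts[lemma] += 1
    return counts


print(count_candidates(LINES))
```

A filter of this kind only removes the ‘noise’ of irrelevant lines; it does not show in which register a collocation such as do a survey is used, for which the concordance lines themselves still have to be read.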

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase capitalization ratios decrease.
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.
• It seems quite probable that consumers would not recognize such relatively small degrees of difference.

Student dissertations
• It seems that different studies have shown different results.
• It seems that the practice of employing local staff by multinationals is increasing.
• It seems that some individual training courses are below their full capacity.

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted.
• It seems that the issue of privatization could become an object of a national referendum.

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)


Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’, outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are a few accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries, etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 online. “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31 (2/2007), 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.


Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.


Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.


Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk


(1998) concept of ‘noticing’. Swain (1998: 66) remarks that there are several levels of ‘noticing’, one of which is that “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. An examination of the wider context of the present perfect forms of focus revealed that this tense was used when previous research was introduced, to set up a critical evaluation of this work signalled by however. This discourse-based function of however is therefore being used as a key signalling item in Swales’ (1990) CARS (create a research space) model, opening up a gap for the author’s own research, e.g. Much of this cross-cultural work to date, however, has focussed on East Asian versus Anglo comparisons, with little attention given to the issue of cross-cultural differences within the East Asian region.

This type of browsing is thus in the spirit of Bernardini’s philosophy of the ‘learner as traveler’ (Bernardini 2004). Although the type of serendipitous learning advocated by Bernardini (2000, 2002) has been mildly criticized as ‘incidentalist’ (Swales 2002), an example such as the one above illustrates that this ad hoc browsing can encourage students to process corpus data in a much more top-down way. In fact, both Granger (1999) and Hahn (2000) emphasise that the teaching of tenses should be approached from a discourse-based perspective and that a corpus is an ideal medium for achieving this.

Another account of searches extending from bottom-up to top-down processing is reported in Lee & Swales (2006). Their innovative corpus-informed EAP course, entitled “Exploring your own discourse world”, required students to compile their own corpora after working with specialized corpora and to conduct more genre-based enquiries. For example, using the BNCweb, students were sensitized to the different discourse environments in which for instance and for example are found (see Note 3):

… for instance is used a lot more frequently in the social sciences and humanities (where it often introduces casual, non-essential exemplifications of points, mainly for emphasis or color), whereas in the natural sciences for example is clearly favored (being used to illuminate and clarify a difficult or complex point through the exemplification). (Lee & Swales 2006: 67)
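A comparison of this kind rests on normalized frequencies, since sub-corpora differ in size. The following is a minimal sketch only: the two "sub-corpora" are invented sentences rather than BNC data, and `per_thousand` is a hypothetical helper, not a BNCweb function.

```python
import re

def per_thousand(text: str, phrase: str) -> float:
    """Occurrences of `phrase` per 1,000 running words of `text` (case-insensitive)."""
    lowered = text.lower()
    tokens = re.findall(r"[a-z']+", lowered)
    if not tokens:
        return 0.0
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    hits = len(re.findall(pattern, lowered))
    return 1000 * hits / len(tokens)

# Invented toy sub-corpora (NOT BNC data), only to make the sketch runnable.
subcorpora = {
    "social_sciences": "Class matters. For instance, income shapes access. "
                       "Families, for instance, differ in resources.",
    "natural_sciences": "Enzymes lower activation energy. For example, "
                        "catalase breaks down hydrogen peroxide.",
}

for name, text in subcorpora.items():
    for phrase in ("for instance", "for example"):
        rate = per_thousand(text, phrase)
        print(f"{name:17s} {phrase:13s} {rate:7.1f} per 1,000 words")
```

With real sub-corpora, frequencies alone would only be a starting point; the functional differences described by Lee & Swales emerge from reading the surrounding co-text.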

The pedagogic applications reviewed above testify to the fact that traditional classroom corpus-based explorations, which tended to centre on a ‘vertical reading’, have now been complemented by a more discourse-based approach, which requires ‘horizontal reading’ for the analysis of linguistic patternings in relation to their communicative and cultural embedding (Braun 2005) and, one could also add here, in relation to the practices of different academic disciplines (see Flowerdew in press for further examples of corpus-based discourse approaches to writing instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information, on the corpora under investigation by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because they are divorced from the communicative context in which they were created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries, by their very nature, provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution when exploiting a corpus of reports in which consultancy companies advise external clients as a model for student report writing, which requires students to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing, in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase and make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as implementation, which led to further discussion as to whether the gerund implementing would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68) to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances, and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range and academic position/role of the interlocutors, these are lacking in corpora of writing (see Note 4). Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.
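By way of illustration only, contextual metadata of this kind might be recorded for written texts along the following lines. This is a minimal sketch: the field names (`genre`, `writer_role`, `addressee_role`) and the sample records are hypothetical, and are not taken from Burnard (2004), Krishnamurthy & Kosem (2007) or any existing corpus header scheme.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TextMetadata:
    """Hypothetical contextual header for one written text in a corpus."""
    text_id: str
    genre: str            # e.g. "report", "business letter"
    writer_role: str      # e.g. "consultant", "student"
    addressee_role: str   # e.g. "external client", "university authority"

def select(records, **criteria):
    """Return the records whose fields match every keyword criterion."""
    return [r for r in records
            if all(getattr(r, field) == value for field, value in criteria.items())]

# Invented sample records, only to make the sketch runnable.
records = [
    TextMetadata("r001", "report", "consultant", "external client"),
    TextMetadata("r002", "report", "student", "university authority"),
    TextMetadata("l001", "business letter", "supplier", "customer"),
]

student_reports = select(records, genre="report", writer_role="student")
print([r.text_id for r in student_reports])
```

Filtering on writer and addressee roles in this way would let a teacher or student restrict a query to texts whose context of writing resembles their own before any phraseology is transferred.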

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omitting the object in the active, e.g. I would appreciate if… The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame …appreciate it if … occurred 105 times, whereas there were only 9 instances of the frame It…appreciated if…, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser and addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
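Frame counts of the kind reported above can be approximated with simple pattern matching. The sketch below is an assumption-laden illustration: the two regular expressions are one possible operationalisation of the frames, the function `count_frames` is hypothetical, and the sample lines are invented rather than drawn from the Business Letters Corpus.

```python
import re
from collections import Counter

# Hypothetical regex versions of the two frames discussed above.
FRAMES = {
    # active frame: "...appreciate it if..."
    "active": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    # passive frame: "It...appreciated if..." within a single sentence
    "passive": re.compile(r"\bit\b[^.?!]*?\bappreciated if\b", re.IGNORECASE),
}

def count_frames(lines):
    """Count how many lines contain each frame at least once."""
    counts = Counter()
    for line in lines:
        for label, pattern in FRAMES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts

# Invented sample lines (NOT corpus data), only to make the sketch runnable.
sample = [
    "We would appreciate it if you could send the catalogue.",
    "I would appreciate it if you would confirm the order.",
    "It would be greatly appreciated if you could waive the fee.",
    "Thank you for your letter of 5 May.",
]

print(count_frames(sample))
```

Neither pattern matches the learner form I would be much appreciated if …, so such a count separates the two target frames from the problematic structure; interpreting why the passive frame is marked still requires reading the co-text.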

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to all students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or deductive approach is adopted would also seem very much to depend on the nature of the particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’ [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice).

(Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration: looking at data
Interaction: discussion and sharing observations and opinions
Induction: making one’s own rule for a particular feature

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries:

Figure 4. Dynamic paradigm for corpus investigations (diagram labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues))

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)

Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
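The kind of left-sorted search described here can be approximated in a few lines of code. The following is only a minimal sketch, not the concordancer used in class: the function name `left_collocates` and the sample sentences are invented for illustration. It counts the word immediately to the left of the node a survey, and shows how function words such as to appear alongside the verbs students are actually looking for.

```python
import re
from collections import Counter

def left_collocates(lines, node="a survey"):
    """Count the word immediately to the left of `node` in each line."""
    pattern = re.compile(r"(\w+)\s+" + re.escape(node) + r"\b", re.IGNORECASE)
    counts = Counter()
    for line in lines:
        for match in pattern.finditer(line):
            counts[match.group(1).lower()] += 1
    return counts

# Hypothetical sample lines, invented for illustration only.
sample = [
    "We are thinking about conducting a survey to gauge interest.",
    "The agency distributed a survey to all members.",
    "They plan on conducting a survey of local firms.",
    "The response rate to a survey from the council was low.",
]

counts = left_collocates(sample)
for word, freq in counts.most_common():
    print(f"{word:<12} {freq}")
```

Even in this toy sample, the preposition to is counted alongside conducting and distributed, which is the sort of ‘noise’ students had to read past; a part-of-speech filter would be one way of narrowing the list to verbs.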

This problematic example above then gave me the opportunity to remind students of another program, JustTheWord (see Note 5). The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following:

I mean I haven’t done a detailed survey on anything.

Left context                        Node      Right  Frequency
Response rate to                    a survey  from   3
And hcfa distributed                a survey  to     2
Response rate to                    a survey  of     2
Response rates to                   a survey  form   2
Thinking about conducting           a survey  to     2
$150,000 to undertake               a survey  and    1
1998 report on                      a survey  by     1
2 we sent                           a survey  to     1
Acquisition venterfootnote33sent    a survey  on     1
Addition to mailing                 a survey  of     1
And employment funded               a survey  of     1
And francis used                    a survey  to     1

Figure 5. Search for a survey

Pattern: V obj N

Cluster 1: carry out survey; conduct survey; take in survey
Cluster 2: mention in survey; quote survey
Cluster 3: complete survey; do survey
Cluster 4: publish survey; report in survey
Unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase, capitalization ratios decrease.
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.
• It seems quite probable that consumers would not recognize such relatively small degrees of difference.

Student dissertations
• It seems that different studies have shown different results.
• It seems that the practice of employing local staff by multinationals is increasing.
• It seems that some individual training courses are below their full capacity.

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted.
• It seems that the issue of privatization could become an object of a national referendum.

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
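A comparison of this kind could also be run automatically before the lines are selected for a handout. The sketch below is an illustration under stated assumptions rather than the procedure used by Hewings: the function name `it_seems_profile` and the regular expressions are hypothetical, and the six sentences are those of the first task in Figure 7. It separates bare it seems that from it seems followed by one or two modifying words before that.

```python
import re

def it_seems_profile(sentences):
    """Separate bare 'it seems that' from 'it seems <modifier> that'."""
    bare = re.compile(r"\bit seems that\b", re.IGNORECASE)
    # One or two words between 'seems' and 'that', e.g. 'clear', 'quite probable'.
    modified = re.compile(r"\bit seems\s+((?:\w+\s+){1,2})that\b", re.IGNORECASE)
    profile = {"bare": 0, "modified": 0, "modifiers": []}
    for sentence in sentences:
        if bare.search(sentence):
            profile["bare"] += 1
            continue
        match = modified.search(sentence)
        if match:
            profile["modified"] += 1
            profile["modifiers"].append(match.group(1).strip().lower())
    return profile

# Sentences from the first task in Figure 7.
published_lines = [
    "It seems clear that as insider holding proportions increase, capitalization ratios decrease.",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference.",
]
student_lines = [
    "It seems that different studies have shown different results.",
    "It seems that the practice of employing local staff by multinationals is increasing.",
    "It seems that some individual training courses are below their full capacity.",
]

published = it_seems_profile(published_lines)
student = it_seems_profile(student_lines)
print("Published:", published)
print("Student:  ", student)
```

On these six lines the published set shows only the modified pattern and the student set only the bare one. Such a count does not replace the discussion task; as the second task in Figure 7 shows, published writers also use bare it seems that, so the lines still need to be read in context.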

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are a few accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources such as grammars, thesauri, dictionaries, etc. will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See httphomepagemaccombncwebmanualbncwebmanmainhtm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase/ (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk

instruction). In fact, Swales has now modified his position and acknowledges this more top-down orientation, as reported by Lee (2008).

It can be seen that utilizing a more top-down approach to processing corpus data provides more co-text, and hence more contextual information, on the corpora under investigation by shedding light on different practices of different academic disciplines, as revealed by differences in lexico-grammatical patterning. However, whether the starting point should be with a bottom-up or top-down approach is not an easy question to answer and very much depends on the nature of the query and composition of the corpus. Starting with the moves (which could be coded in the corpus) may be appropriate for those genres which have clearly defined move structures, such as law cases with four obligatory moves: facts/stating history of the case, presenting argument, deriving ratio decidendi, pronouncing judgment (cf. Bhatia et al. 2004), but difficult to implement for those genres which are mixed or which display embedded moves (Flowerdew 2004). Biber et al. (2007b: 241) compare these two different approaches, noting that which one is adopted depends on the primary basis of the analysis:

Functional analysis is primary in top-down approaches: functional distinctions are determined on a qualitative basis, to determine the set of relevant discourse types and to identify specific discourse units within texts. In contrast, linguistic analysis is primary in bottom-up approaches: a wide range of linguistic distributional patterns are analysed quantitatively, again being used to determine the set of relevant discourse types and to identify specific discourse units within texts. (Biber et al. 2007b: 241)

3. Corpus data are decontextualised and may not be directly transferable

Corpus data have been viewed as decontextualised, such that the findings may not be directly transferable, lock, stock and barrel, to pedagogy. This issue is discussed below with reference to pedagogic applications in the field of ESP.

3.1 The issue of contextualisation in corpus data

Widdowson’s (2004) arguments on the decontextualised nature of corpus data are well-rehearsed in the literature (see Flowerdew 2008a; Braun 2005; Kaltenböck & Mehlmauer-Larcher 2005; McEnery et al. 2006), but it is worth reviewing them again briefly. Both Aston (1995) and Widdowson (1998, 2002) have drawn attention to the decontextualised nature of corpus data, with Widdowson commenting that corpus data are but a sample of language, as opposed to an example of authentic language, because it is divorced from the communicative context in which it was created: “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries by their very nature provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of for instance and for example in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports, in which consultancy companies are advising external clients, for student report writing, which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. Implementation of barriers will reduce noise) to their own report writing in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace will reduce by the more rhetorically appropriate would reduce. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies in other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as ‘implementation’, which led to further discussion as to whether the gerund ‘implementing’ would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68) to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005), for inclusion of video activities; by Milton (2006), for didactic written hints built into the software; and by Vannestål & Lindquist (2007), for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora such as the Michigan Corpus of Spoken Academic English (MICASE) have been marked up with metadata categories such as the gender, age range and academic position/role of the interlocutors, these are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.
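What such metadata might look like for written texts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (writer role, addressee role, relationship, purpose) and the sample records are hypothetical, and are not taken from Burnard (2004), Krishnamurthy & Kosem (2007) or any existing corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextMetadata:
    """Contextual features for one written text (all field names are hypothetical)."""
    text_id: str
    writer_role: str      # e.g. "student", "consultant"
    addressee_role: str   # e.g. "university authority", "external client"
    relationship: str     # e.g. "internal", "external"
    purpose: str          # e.g. "request", "recommendation"


def select_texts(records, **criteria):
    """Return the records whose metadata match every keyword criterion."""
    return [
        record for record in records
        if all(getattr(record, field) == value for field, value in criteria.items())
    ]


# Invented records, for illustration only.
records = [
    TextMetadata("r001", "consultant", "external client", "external", "recommendation"),
    TextMetadata("r002", "student", "university authority", "internal", "recommendation"),
    TextMetadata("l001", "student", "company manager", "external", "request"),
]

# Restrict a query to texts written internally, as in the student report task.
internal = select_texts(records, relationship="internal", purpose="recommendation")
print([record.text_id for record in internal])
```

With records of this kind, a concordance search could be restricted to texts whose context of writing resembles the students' own before any phraseology is transferred.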

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of ‘appreciate’ was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. ‘I would be much appreciated if …’, and omitting the object in the active, e.g. ‘I would appreciate if …’. The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of ‘appreciate’ provided valuable clues. The frame ‘… appreciate it if …’ occurred 105 times, whereas there were only 9 instances of the frame ‘It … appreciated if …’, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser/addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
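A frame count of this kind can be approximated with regular expressions. The sketch below is a minimal illustration, not the query used in the study: the two patterns are my own assumptions about how the active and passive frames might be matched, and the sample sentences are invented rather than drawn from the Business Letters Corpus.

```python
import re
from collections import Counter

# Hypothetical patterns for the two frames; the passive pattern allows
# one to four words between "it" and "appreciated if".
FRAMES = {
    "active": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    "passive": re.compile(r"\bit\b(?:\s+\w+){1,4}?\s+appreciated if\b", re.IGNORECASE),
}


def count_frames(lines):
    """Count how many times each frame occurs across the given lines."""
    counts = Counter({name: 0 for name in FRAMES})
    for line in lines:
        for name, pattern in FRAMES.items():
            counts[name] += len(pattern.findall(line))
    return counts


# Invented sample lines, for illustration only.
sample = [
    "We would appreciate it if you could send the catalogue.",
    "I would appreciate it if you could confirm the date.",
    "It would be greatly appreciated if you could waive the fee.",
]

counts = count_frames(sample)
for name, total in counts.items():
    print(f"{name}: {total}")
```

Raw counts such as these only flag a frame as frequent or marked; deciding when the passive frame is appropriate still requires reading the co-text of each hit, as described above.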

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach, in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks … The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to all students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive corpus-based grammar course commented that “… they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or a deductive approach is adopted would also seem to depend very much on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between ‘for’ and ‘since’ in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’ [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice). (Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for ‘was decreased’ and ‘has decreased’?”
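To support a prompt of this kind, concordance lines can be laid out so that the grammatical subject appears immediately to the left of the verb. The following is a minimal key-word-in-context sketch, assuming a plain list of sentences; the function name and the sample sentences are invented for illustration and do not reproduce the concordancer or data used with my students.

```python
def kwic(sentences, node, width=40):
    """Return (left context, node, right context) for each sentence containing the node."""
    rows = []
    node_lower = node.lower()
    for sentence in sentences:
        start = sentence.lower().find(node_lower)
        if start == -1:
            continue  # node phrase not in this sentence
        end = start + len(node)
        left = sentence[:start].strip()
        right = sentence[end:].strip()
        rows.append((left[-width:], sentence[start:end], right[:width]))
    return rows


# Invented sentences, for illustration only.
sentences = [
    "The number of applicants has decreased since 2005.",
    "The dosage was decreased by the doctor after two weeks.",
    "Students' motivation has decreased over the semester.",
]

for node in ("was decreased", "has decreased"):
    print(node.upper())
    for left, key, right in kwic(sentences, node):
        # Right-align the left context so the subjects line up before the verb.
        print(f"{left:>40}  {key}  {right}")
```

Aligning the left context in this way places the subjects in one column, which is where the prompting question asks students to look.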

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries:

[Diagram labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

… it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)


Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for ‘survey’:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million-word sub-corpus of reports to be ideal for searching the noun ‘survey’ and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun ‘survey’, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
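The ‘noise’ can be made concrete by tallying the word immediately to the left of ‘a survey’ in the clusters listed in Figure 5. The sketch below is my own illustration in Python, not a feature of the concordancer used in class; the cluster strings and frequencies are transcribed from Figure 5, and the function name is hypothetical.

```python
import re
from collections import Counter

# (cluster, frequency) pairs transcribed from the Figure 5 output.
CLUSTERS = [
    ("Response rate to a survey from", 3),
    ("And hcfa distributed a survey to", 2),
    ("Response rate to a survey of", 2),
    ("Response rates to a survey form", 2),
    ("Thinking about conducting a survey to", 2),
    ("$150,000 to undertake a survey and", 1),
    ("1998 report on a survey by", 1),
    ("2 we sent a survey to", 1),
    ("Acquisition venterfootnote33sent a survey on", 1),
    ("Addition to mailing a survey of", 1),
    ("And employment funded a survey of", 1),
    ("And francis used a survey to", 1),
]


def left_collocates(clusters, node="a survey"):
    """Sum cluster frequencies by the word immediately preceding the node phrase."""
    pattern = re.compile(r"(\S+)\s+" + re.escape(node) + r"\b", re.IGNORECASE)
    counts = Counter()
    for text, frequency in clusters:
        match = pattern.search(text)
        if match:
            counts[match.group(1).lower()] += frequency
    return counts


collocates = left_collocates(CLUSTERS)
for word, frequency in collocates.most_common():
    print(f"{word:<22} {frequency}")
```

On this sample, the preposition ‘to’ is the most frequent left-hand neighbour, and useful verbs such as ‘conducting’ and ‘undertake’ are scattered among prepositions and a mis-tokenised string, which is the kind of sifting the students had to do by eye.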

This problematic example then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb ‘do’ in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: ‘I mean, I haven’t done a detailed survey on anything’.

Words                                          Frequency
Response rate to a survey from                 3
And hcfa distributed a survey to               2
Response rate to a survey of                   2
Response rates to a survey form                2
Thinking about conducting a survey to          2
$150,000 to undertake a survey and             1
1998 report on a survey by                     1
2 we sent a survey to                          1
Acquisition venterfootnote33sent a survey on   1
Addition to mailing a survey of                1
And employment funded a survey of              1
And francis used a survey to                   1

Figure 5. Search for ‘a survey’

[Screenshot: ‘V obj N’ collocations of ‘survey’, displayed as frequency bars on a scale of 0 to 200 and grouped into clusters]
cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for ‘survey’ in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting on any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with ‘appreciate’, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. ‘For the training program, it will start on …’), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between ‘tall’, ‘high’, ‘upright’ and ‘vertical’ when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of ‘it seems’ in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase, capitalization ratios decrease.
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.
• It seems quite probable that consumers would not recognize such relatively small degrees of difference.

Student dissertations
• It seems that different studies have shown different results.
• It seems that the practice of employing local staff by multinationals is increasing.
• It seems that some individual training courses are below their full capacity.

Now look at the following examples of ‘it seems that’ from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted.
• It seems that the issue of privatization could become an object of a national referendum.

Figure 7. Concordance task for ‘it seems’ in published articles and student dissertations (from Hewings 2002)


Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory ‘it’ in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of ‘it seems …’ in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
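A comparison of this kind between a learner corpus and a reference corpus can be sketched as follows. The sentences are the first two sets of examples from Figure 7; the two regular expressions and the function name are my own assumptions, and the second pattern simply matches one or two words between ‘seems’ and ‘that’ without checking that they are adjectival.

```python
import re

# Hypothetical patterns: the bare frame, and the frame with one or two
# intervening words (e.g. "clear", "quite probable").
BARE = re.compile(r"\bit seems that\b", re.IGNORECASE)
MODIFIED = re.compile(r"\bit seems (?:\w+ ){1,2}that\b", re.IGNORECASE)


def profile(sentences):
    """Count bare and modified 'it seems ... that' frames in a list of sentences."""
    return {
        "bare": sum(len(BARE.findall(s)) for s in sentences),
        "modified": sum(len(MODIFIED.findall(s)) for s in sentences),
    }


# First two sets of examples from Figure 7.
published = [
    "It seems clear that as insider holding proportions increase, capitalization ratios decrease.",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference.",
]
student = [
    "It seems that different studies have shown different results.",
    "It seems that the practice of employing local staff by multinationals is increasing.",
    "It seems that some individual training courses are below their full capacity.",
]

published_profile = profile(published)
student_profile = profile(student)
print("published:", published_profile)
print("student:  ", student_profile)
```

With corpora of different sizes the counts would need to be normalised (for example per thousand words) before being compared, and the lines themselves would still need to be read to see how each frame is being used.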

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’, outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are some accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources such as grammars, thesauri, dictionaries, etc. will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million-word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.


Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Handford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.


Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4, Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.


Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynneusthk


created “the text travels but the context does not travel with it” (keynote lecture, 29 July 2002).

Whether Widdowson is correct or not would seem very much to depend on what is being transferred. Charles (2007: 295) disagrees with Widdowson on the issue of decontextualisation and maintains that one of the advantages of the type of corpus work described in Section 2.1 above is that “… it allows students to gain a greater sense of contextualization than is possible to achieve through the use of paper-based materials”. While it is undoubtedly true that more top-down corpus enquiries by their very nature provide more contextualization, the question of the practices of different academic and professional disciplines needs to be taken into account, as uncovered by the corpus-based enquiries of ‘for instance’ and ‘for example’ in Lee & Swales (2006), which show just how finely nuanced differences can be (also see Hyland 2000, 2002 for research studies in this area).

3.2 ‘Pedagogic processing’ of corpus data

Widdowson maintains that it may not be expedient to transfer corpus data directly to pedagogic materials on account of the cultural or contextual inappropriacy of the corpus data (see Cook 1998; Widdowson 1991, also cited in Seidlhofer 2003, for a discussion on the issue of prescription vs. description regarding the transfer of corpus data to pedagogy). Widdowson therefore advocates adopting some kind of ‘pedagogic processing’, as do other corpus linguists such as Braun (2007) and McCarthy (2001), in order to transform samples of language into pedagogically-accessible examples. This aspect of pedagogic mediation of corpus data is discussed from the perspective of the “what” and the “how” below.

3.2.1 The “what” of pedagogic processing

Section 3.1 has shown that variation across disciplines needs to be considered in the transfer of corpus data to pedagogy. Another aspect that needs to be considered concerns pragmatic appropriacy. Flowerdew (2008a) advises caution on exploiting a corpus of reports, in which consultancy companies are advising external clients, for student report writing, which requires them to write internally to university authorities. The student writing is similar to the corpus of reports in respect of the rhetorical Problem-Solution pattern. However, it would not be registerially appropriate for students to transfer the pattern grammatical metaphor noun (indicating a solution to a problem) + will + verb (signalling mitigation of a problem) (e.g. ‘Implementation of barriers will reduce noise’) to their own report writing, in view of the different contextual features. Students would need to modify the ‘frame’ (see Biber et al. 2004 and Stubbs 2004 for further examples of frames) derived from the corpus of reports by supplying mitigation devices to attenuate the phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace ‘will reduce’ by the more rhetorically appropriate ‘would reduce’. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as ‘implementation’, which led to further discussion as to whether the gerund ‘implementing’ would also be acceptable, and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68), to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range and academic position/role of the interlocutors, these are lacking in corpora of writing (see Note 4). Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.
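As a rough illustration of what such mark-up might make possible for written texts, the sketch below attaches contextual metadata to each text and filters on it. The field names (`writer_role`, `addressee_role`, `purpose`), their values and the three sample records are all invented for illustration; they are not taken from Burnard (2004), from Krishnamurthy & Kosem (2007) or from any existing corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LetterRecord:
    """One written text plus hypothetical contextual metadata."""
    text: str
    writer_role: str      # illustrative label, e.g. "student" or "manager"
    addressee_role: str   # illustrative label, e.g. "university authority"
    purpose: str          # illustrative label, e.g. "request"


def select(records, **criteria):
    """Return the records whose metadata match every field=value pair given."""
    return [
        record for record in records
        if all(getattr(record, field) == value
               for field, value in criteria.items())
    ]


# Invented sample records, for illustration only.
RECORDS = [
    LetterRecord("We would appreciate it if you could reply by Friday.",
                 "manager", "external client", "request"),
    LetterRecord("It would be appreciated if the fee could be waived.",
                 "student", "university authority", "request"),
    LetterRecord("Implementation of barriers will reduce noise.",
                 "consultant", "external client", "recommendation"),
]

# Restrict a query to texts addressed to university authorities.
to_authorities = select(RECORDS, addressee_role="university authority")
print([record.text for record in to_authorities])
```

With metadata of this kind, a student could in principle limit a concordance search to texts written in a situation comparable to their own, rather than to the corpus as a whole.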

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of ‘appreciate’ was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. ‘I would be much appreciated if …’, and omitting the object in the active, e.g. ‘I would appreciate if…’. The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of ‘appreciate’ provided valuable clues. The frame ‘…appreciate it if…’ occurred 105 times, whereas there were only 9 instances of the frame ‘It…appreciated if…’, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser and addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
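The frequency comparison described above can be sketched in a few lines. This is a minimal illustration and not the procedure actually used with the Business Letters Corpus: the two regular expressions are one possible approximation of the frames (the “…” in the passive frame is assumed to cover up to three intervening words), and the three sample sentences are invented.

```python
import re
from collections import Counter

# Hypothetical approximations of the two frames discussed above.
FRAMES = {
    "active: appreciate it if": re.compile(
        r"\bappreciate\s+it\s+if\b", re.IGNORECASE),
    "passive: it ... appreciated if": re.compile(
        r"\bit\s+(?:\w+\s+){0,3}appreciated\s+if\b", re.IGNORECASE),
}


def count_frames(texts):
    """Count how often each frame occurs across a list of texts."""
    counts = Counter({name: 0 for name in FRAMES})
    for text in texts:
        for name, pattern in FRAMES.items():
            counts[name] += len(pattern.findall(text))
    return counts


# Invented sample sentences, for illustration only.
SAMPLE_LETTERS = [
    "We would appreciate it if you could send the invoice by Friday.",
    "I would appreciate it if you would confirm the booking.",
    "It would be greatly appreciated if the committee could waive the fee.",
]

for frame, frequency in count_frames(SAMPLE_LETTERS).items():
    print(f"{frame}: {frequency}")
```

Raw counts of this kind only flag that one frame is marked; deciding when the marked frame is appropriate still requires reading the surrounding co-text, as the paragraph above stresses.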

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach, in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)


In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to all students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or a deductive approach is adopted would also seem to depend very much on the nature of the particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between ‘for’ and ‘since’ in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’ [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice). (Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence, because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for ‘was decreased’ and ‘has decreased’?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries.

[Figure 4 labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)


Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for ‘survey’:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun ‘survey’ and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun ‘survey’, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
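For readers unfamiliar with how such output is produced, the following sketch builds key-word-in-context (KWIC) lines for a node phrase and sorts them by the word immediately to its left, which is roughly what a left-sorted concordance display does. It is an illustrative reconstruction, not the software used in the class; the function names and the sample text are invented.

```python
import re


def kwic(text, node, width=3):
    """Return (left, node, right) token lists for each occurrence of node."""
    tokens = re.findall(r"[a-z']+", text.lower())
    node_tokens = node.lower().split()
    n = len(node_tokens)
    lines = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == node_tokens:
            left = tokens[max(0, i - width):i]
            right = tokens[i + n:i + n + width]
            lines.append((left, node_tokens, right))
    return lines


def sort_by_left(lines):
    """Sort concordance lines by the word immediately left of the node."""
    return sorted(lines, key=lambda line: line[0][-1] if line[0] else "")


def format_line(line, width=30):
    """Right-align the left context so that the node forms a column."""
    left, node, right = line
    return f"{' '.join(left):>{width}}  {' '.join(node)}  {' '.join(right)}"


# Invented sample text, for illustration only.
SAMPLE = (
    "We plan to conduct a survey on the use of computers. "
    "Last year the office sent a survey to all staff. "
    "Thinking about conducting a survey to gather opinions?"
)

for line in sort_by_left(kwic(SAMPLE, "a survey")):
    print(format_line(line))
```

Sorting on the left-hand word brings candidate verbs together, but inflected forms and non-verbs still appear side by side, which is one source of the ‘noise’ described above.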

Left context                       Node      Right  Frequency
Response rate to                   a survey  from   3
And hcfa distributed               a survey  to     2
Response rate to                   a survey  of     2
Response rates to                  a survey  form   2
Thinking about conducting          a survey  to     2
$150000 to undertake               a survey  and    1
1998 report on                     a survey  by     1
2 we sent                          a survey  to     1
Acquisition venterfootnote33sent   a survey  on     1
Addition to mailing                a survey  of     1
And employment funded              a survey  of     1
And francis used                   a survey  to     1

Figure 5. Search for a survey

This problematic example then gave me the opportunity to remind students of another program, JustTheWord (see Note 5). Figure 6 below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb ‘do’ in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean, I haven’t done a detailed survey on anything.

[Bar chart of V obj N collocations of ‘survey’, frequency axis 0–200; individual bar values not recoverable]
cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with ‘appreciate’, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. ‘For the training program, it will start on…’), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between ‘tall’, ‘high’, ‘upright’ and ‘vertical’ when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of ‘it seems’ in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase capitalization ratios decrease
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation
• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations
• It seems that different studies have shown different results
• It seems that the practice of employing local staff by multinationals is increasing
• It seems that some individual training courses are below their full capacity

Now look at the following examples of ‘it seems that’ from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted
• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)


Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory ‘it’ in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of ‘it seems…’ in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
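The comparison that the Figure 7 task asks students to make by eye can also be tabulated. The sketch below counts the word that immediately follows ‘it seems’ in each set, using the six example lines from the first part of Figure 7. The assignment of lines to the two columns follows the flattened figure as it appears here and should be treated as an assumption, and the function name is invented. Six lines are far too few to support any claim about the underlying corpora.

```python
import re
from collections import Counter

# Example lines from the first part of Figure 7; the column assignment
# is assumed from the flattened layout of the figure.
PUBLISHED = [
    "It seems clear that as insider holding proportions increase "
    "capitalization ratios decrease",
    "It seems likely that the eighties and nineties will be known as "
    "decades of large scale disaggregation",
    "It seems quite probable that consumers would not recognize such "
    "relatively small degrees of difference",
]
STUDENT = [
    "It seems that different studies have shown different results",
    "It seems that the practice of employing local staff by "
    "multinationals is increasing",
    "It seems that some individual training courses are below their "
    "full capacity",
]


def word_after_it_seems(lines):
    """Count the word that immediately follows 'it seems' in each line."""
    counts = Counter()
    for line in lines:
        for match in re.finditer(r"\bit\s+seems\s+(\w+)", line, re.IGNORECASE):
            counts[match.group(1).lower()] += 1
    return counts


print("Published:", dict(word_after_it_seems(PUBLISHED)))
print("Student:  ", dict(word_after_it_seems(STUDENT)))
```

A tally like this only shows what follows the phrase. Working out why the two groups differ, and how ‘it seems that’ functions in the second set of published examples, remains a matter for the kind of discussion the task is designed to prompt.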

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are a few accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries, etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at httpysomeyahpinfoseekcojp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See httphomepagemaccombncwebmanualbncwebmanmainhtm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at httpquodlibumichedummicase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31 (2/2007), 17–24.

Davies, M. 2004. “Student use of large annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing, and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk

phrase to make it socio-culturally appropriate for writing to university authorities. Thus, they would need to expand the original frame with the addition of a prefacing phrase such as “we would like to suggest that…” and replace ‘will reduce’ by the more rhetorically appropriate ‘would reduce’. Corpus consultation has therefore to be conducted with great care, and it is not surprising that Widdowson (1998) sees the need for some kind of ‘mediating process’ whereby students authenticate the corpus data to suit the socio-cultural and linguistic parameters of their own writing, in light of considerations relating to differences across disciplines and pragmatic appropriacy.

3.2.2 The “how” of pedagogic processing

Having established in the previous sub-section that some type of pedagogic processing may be necessary with some types of data, there still remains the question of how this can be achieved.

In order to integrate the type of pedagogic processing Widdowson is referring to, so as to enable students to authenticate the corpus data for their own contextual writing environment, Flowerdew (2008b) has adopted student peer response activities which draw on Vygotskian socio-cultural theories of co-constructing knowledge through collaborative dialogue and negotiation (see O’Sullivan 2007, who gives a very insightful exposition on the role of cognitive and social constructivist theories to foster corpus consultation literacy). In these peer-to-peer interaction groups, weaker students were intentionally grouped with more proficient ones to foster productive dialogue through ‘assisted performance’, thus drawing on another aspect of socio-cultural theory. In this scaffolding-type of activity, more proficient students were able to offer their insights and interpretations on the corpus data, thus assisting the weaker students to gradually develop more independence. The author reports some success with this approach of incorporating group discussion activities revolving around the corpus data as a form of pedagogic mediation, resulting in consciousness-raising of register awareness, not only for the task in hand but also for what might be appropriate phraseologies for other contexts. Peer discussion also raised issues of what could be transferred from corpus data, i.e. the use of nominalisations such as ‘implementation’ (which led to further discussion as to whether the gerund ‘implementing’ would also be acceptable), and what would not be appropriate for the context, i.e. the frame “It is recommended that…”, which students mentioned sounded too authoritative. Students were therefore encouraged to engage in “collaborative metatalk” (Swain 1998: 68), to “use language to reflect on language use” (ibid.). Gavioli & Aston (2001: 242) also advocate spoken interaction among students in corpus consultation, as “different learners will often notice different things in concordances and draw different conclusions”. Suggestions for other types of pedagogic mediation of corpora have been given by Braun (2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range and academic position/role of the interlocutors, these are lacking in corpora of writing.⁴ Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.
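To make the idea of metadata for written texts concrete, the following is a minimal sketch in Python. The field names (`writer_role`, `addressee_role`, `purpose`), their values and the sample texts are all hypothetical illustrations, not categories proposed by Burnard (2004) or Krishnamurthy & Kosem (2007); the point is only that contextual labels stored alongside each text allow a query to be restricted to a comparable writing situation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WrittenText:
    """One corpus text plus hypothetical contextual metadata."""
    text: str
    writer_role: str      # illustrative values, e.g. "student", "manager"
    addressee_role: str   # e.g. "university authority", "colleague"
    purpose: str          # e.g. "request", "information"


def select(texts, **criteria):
    """Return the texts whose metadata match every keyword criterion."""
    return [
        t for t in texts
        if all(getattr(t, field) == value for field, value in criteria.items())
    ]


# Invented examples standing in for a corpus of business writing.
CORPUS = [
    WrittenText("I would appreciate it if you could extend the deadline.",
                "student", "university authority", "request"),
    WrittenText("Please find the minutes attached.",
                "manager", "colleague", "information"),
    WrittenText("It would be appreciated if the library hours were extended.",
                "student", "university authority", "request"),
]

# Restrict a later search to requests written by students.
student_requests = select(CORPUS, writer_role="student", purpose="request")
```

A concordance search run only over `student_requests` would then show phraseology from a writing context close to the students' own, which is the kind of transfer the metadata is meant to support.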

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of ‘appreciate’ was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. ‘I would be much appreciated if …’, and omitting the object in the active, e.g. ‘I would appreciate if…’. The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of ‘appreciate’ provided valuable clues. The frame ‘…appreciate it if …’ occurred 105 times, whereas there were only 9 instances of the frame ‘It…appreciated if…’, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser/addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
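The frame counts reported above can be approximated with a few lines of Python. This is only a sketch under stated assumptions: the four sentences are invented stand-ins for the Business Letters Corpus, and the two regular expressions are one possible way of operationalising the active and passive frames, not the search actually used in the study.

```python
import re
from collections import Counter

# Invented sentences standing in for a corpus of business letters.
LETTERS = [
    "We would appreciate it if you could send the invoice by Friday.",
    "I would appreciate it if you could confirm the booking.",
    "It would be greatly appreciated if you could waive the fee.",
    "I would appreciate it if the report reached us this week.",
]

# Assumed frame definitions: the passive frame allows up to four words
# between "it" and "appreciated if".
FRAMES = {
    "active": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    "passive": re.compile(
        r"\bit\b(?:\s+\w+){0,4}\s+appreciated if\b", re.IGNORECASE
    ),
}


def count_frames(texts):
    """Count how often each frame occurs across the texts."""
    counts = Counter()
    for text in texts:
        for label, pattern in FRAMES.items():
            counts[label] += len(pattern.findall(text))
    return counts


def kwic(texts, pattern, width=30):
    """Return simple key-word-in-context lines for a frame."""
    lines = []
    for text in texts:
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


frame_counts = count_frames(LETTERS)
passive_lines = kwic(LETTERS, FRAMES["passive"])
```

The counts show which frame is the marked one; reading the `kwic` lines for the rarer frame is what allows the co-text (who is writing to whom, and how large the favour is) to be inspected.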

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)

In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to all students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive, corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or deductive approach is adopted would also seem to depend very much on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between ‘for’ and ‘since’ in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’, [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice). (Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence, because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for ‘was decreased’ and ‘has decreased’?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration: looking at data
Interaction: discussion and sharing observations and opinions
Induction: making one’s own rule for a particular feature

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries.

Figure 4. Dynamic paradigm for corpus investigations [diagram labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues)]

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)

Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for ‘survey’:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun ‘survey’ and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun ‘survey’, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
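The ‘noise’ problem can be illustrated with a short Python sketch that pulls out only the candidate verbs occurring immediately before ‘a survey’. Everything here is an assumption made for illustration: the sentences are invented rather than taken from the reports sub-corpus, and the small hand-made lemma table is a stand-in for the part-of-speech tagging or lemmatisation that a real study would use.

```python
import re
from collections import Counter

# Invented sentences standing in for a sub-corpus of reports.
SENTENCES = [
    "We plan to conduct a survey on the use of computers.",
    "The agency carried out a survey of local employers.",
    "They are thinking about conducting a survey to gauge demand.",
    "The response rate to a survey of this kind is often low.",
    "The team carried out a survey in March.",
]

# Tiny hand-made lemma table (an assumption, not a real lemmatiser).
VERB_LEMMAS = {
    "conduct": "conduct", "conducted": "conduct", "conducting": "conduct",
    "carry out": "carry out", "carried out": "carry out",
    "carrying out": "carry out",
    "do": "do", "did": "do", "done": "do", "doing": "do",
}


def verb_collocates(sentences, noun="survey"):
    """Count candidate verbs found directly before 'a <noun>'."""
    pattern = re.compile(
        r"\b((?:\w+\s+)?\w+)\s+a\s+" + re.escape(noun) + r"\b",
        re.IGNORECASE,
    )
    counts = Counter()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            left = match.group(1).lower()
            # Try the two-word span first ("carried out"), then the last word.
            for candidate in (left, left.split()[-1]):
                if candidate in VERB_LEMMAS:
                    counts[VERB_LEMMAS[candidate]] += 1
                    break
    return counts


survey_verbs = verb_collocates(SENTENCES)
```

Lines such as "response rate to a survey" are silently dropped because "to" is not in the verb table, which is exactly the filtering that students otherwise have to do by eye when reading raw concordance output.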

Words                                      Frequency
Response rate to a survey from             3
And hcfa distributed a survey to           2
Response rate to a survey of               2
Response rates to a survey form            2
Thinking about conducting a survey to      2
$150,000 to undertake a survey and         1
1998 report on a survey by                 1
2 we sent a survey to                      1
Acquisition venterfootnote33sent a survey on   1
Addition to mailing a survey of            1
And employment funded a survey of          1
And francis used a survey to               1

Figure 5. Search for ‘a survey’

This problematic example then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use, as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb ‘do’ in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

[Bar chart: V obj N collocations of ‘survey’, grouped by cluster; frequency axis 0–200]
cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for ‘survey’ in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with ‘appreciate’, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. ‘For the training program, it will start on…’), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between ‘tall’, ‘high’, ‘upright’ and ‘vertical’ when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of ‘it seems’ in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase capitalization ratios decrease
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation
• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations
• It seems that different studies have shown different results
• It seems that the practice of employing local staff by multinationals is increasing
• It seems that some individual training courses are below their full capacity

Now look at the following examples of ‘it seems that’ from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted
• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for ‘it seems’ in published articles and student dissertations (from Hewings 2002)

Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory ‘it’ in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of ‘it seems…’ in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
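The comparison that the Figure 7 task asks students to make by eye can also be sketched as a small Python tally. The six lines are those of the first part of Figure 7; the two regular expressions are an assumption about how to separate bare ‘it seems that’ from ‘it seems’ followed by one or two intervening words, and they do not check that those words are actually adjectives. With so few lines the output is illustrative only; as the second part of Figure 7 shows, published writers use ‘it seems that’ as well.

```python
import re

# Concordance lines from the first part of Figure 7 (Hewings 2002).
PUBLISHED = [
    "It seems clear that as insider holding proportions increase "
    "capitalization ratios decrease",
    "It seems likely that the eighties and nineties will be known as "
    "decades of large scale disaggregation",
    "It seems quite probable that consumers would not recognize such "
    "relatively small degrees of difference",
]
STUDENT = [
    "It seems that different studies have shown different results",
    "It seems that the practice of employing local staff by "
    "multinationals is increasing",
    "It seems that some individual training courses are below their "
    "full capacity",
]

# Assumed patterns: no part-of-speech check is made on the modifier slot.
BARE = re.compile(r"\bit seems that\b", re.IGNORECASE)
MODIFIED = re.compile(r"\bit seems (?:\w+ ){1,2}that\b", re.IGNORECASE)


def profile(lines):
    """Tally the two 'it seems' patterns in a set of concordance lines."""
    return {
        "it seems that": sum(len(BARE.findall(line)) for line in lines),
        "it seems + modifier + that": sum(
            len(MODIFIED.findall(line)) for line in lines
        ),
    }


published_profile = profile(PUBLISHED)
student_profile = profile(STUDENT)
```

Learners who build a corpus of their own writing could run the same tally on it and on a reference corpus, which is one simple way of individualising the analysis along the lines suggested above.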

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’, outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are a few accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk

(2005) for inclusion of video activities, by Milton (2006) for didactic written hints built into the software, and by Vannestål & Lindquist (2007) for peer teaching.

Pedagogic mediation of corpora could well be assisted through the incorporation of contextual information in written texts to aid the transfer of corpus data to pedagogy. Following Burnard (2004), Krishnamurthy & Kosem (2007) advocate encoding the corpus with metadata to aid subsequent analyses. Although various speech corpora, such as the Michigan Corpus of Spoken Academic English (MICASE), have been marked up with metadata categories such as the gender, age range, academic position and role of the interlocutors, these are lacking in corpora of writing (see Note 4). Corpora of business writing are especially context-sensitive and could benefit from the inclusion of such metadata.
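To make the idea of contextual metadata for written texts more concrete, the following is a minimal sketch of how a business letter might be stored alongside contextual categories and then filtered on them. The class `LetterRecord`, its fields (`writer_role`, `addressee_role`, `purpose`), the helper `select` and the sample letters are all hypothetical; they are not a scheme proposed by Burnard (2004) or Krishnamurthy & Kosem (2007), nor the mark-up used in any existing corpus.

```python
from dataclasses import dataclass

@dataclass
class LetterRecord:
    """One corpus text plus hypothetical contextual metadata fields."""
    text: str
    writer_role: str      # illustrative category, e.g. "client"
    addressee_role: str   # illustrative category, e.g. "supplier"
    purpose: str          # illustrative category, e.g. "request"

def select(records, **criteria):
    """Return the records whose metadata match every keyword criterion."""
    return [record for record in records
            if all(getattr(record, key) == value
                   for key, value in criteria.items())]

# Invented sample letters standing in for a metadata-encoded corpus.
corpus = [
    LetterRecord("I would appreciate it if you could reply soon.",
                 "client", "supplier", "request"),
    LetterRecord("It would be appreciated if the fee could be waived.",
                 "student", "dean", "request"),
    LetterRecord("Thank you for your order.",
                 "supplier", "client", "acknowledgement"),
]

requests = select(corpus, purpose="request")
print(len(requests), "request letters")
```

With categories of this kind attached to each text, concordance lines could be restricted to, say, requests addressed to a superior before students examine them.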

However, it should be noted that sometimes the co-textual environment can provide clues to contextual information. In the business letters written by students, the structure and use of appreciate was found to be particularly problematic across a wide range of students, with learners confusing the active and passive forms, e.g. I would be much appreciated if …, and omitting the object in the active, e.g. I would appreciate if…. The Business Letters Corpus referred to earlier proved invaluable for alerting students to the correct structure. What students were unsure of, though, was in which situations the active and passive forms were most appropriate. Here, frequency counts and the co-text in the environment of appreciate provided valuable clues. The frame …appreciate it if … occurred 105 times, whereas there were only 9 instances of the frame It…appreciated if…, thus suggesting some kind of marked use. In fact, scrutiny of the co-textual environment, i.e. the ‘extended unit of meaning’, revealed that the passive frame would be used when the power relations between the addresser and addressee were quite distinct and when a big favour was being asked. This example thus demonstrates that corpora may not be completely devoid of context, which can sometimes, in part, be recovered from the co-textual environment.
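The kind of frame count reported above can be approximated with simple pattern matching. The sketch below is illustrative only: the regular expressions are one possible operationalisation of the two frames, the function name `count_frames` is invented, and the three sentences are made-up stand-ins rather than data from the Business Letters Corpus.

```python
import re

# Hypothetical patterns for the two frames discussed above.
# Active:  "... appreciate it if ..."
# Passive: "It ... appreciated if ...", allowing one to four words in between.
FRAMES = {
    "active": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    "passive": re.compile(r"\bit\b(?:\s+\w+){1,4}?\s+appreciated if\b",
                          re.IGNORECASE),
}

def count_frames(texts):
    """Return how often each frame occurs across an iterable of strings."""
    counts = {name: 0 for name in FRAMES}
    for text in texts:
        for name, pattern in FRAMES.items():
            counts[name] += len(pattern.findall(text))
    return counts

# Invented sentences standing in for a corpus of business letters.
letters = [
    "We would appreciate it if you could send the invoice by Friday.",
    "I would appreciate it if you would confirm the booking.",
    "It would be greatly appreciated if you could waive the fee.",
]
print(count_frames(letters))
```

Counts of this sort only flag that one frame is the marked choice; identifying why it is marked still requires reading the surrounding co-text, as described above.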

4. Corpus-based pedagogy is usually associated with an inductive approach, which may not be appropriate for all students

Both Gavioli (2005) and Meunier (2002) have noted the drawbacks of an inductive approach in which students extrapolate the rules or patterning from examples:

Despite their advantages, DDL activities have some drawbacks… The various learning strategies (deductive vs. inductive) that students adopt can lead to problems. Some students hate working inductively and teachers should aim at a combined approach (see Hahn 2000 for a combined approach). (Meunier 2002: 135)

In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to all students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive, corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or a deductive approach is adopted would also seem to depend very much on the nature of a particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’ [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice). (Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration: looking at data
Interaction: discussion and sharing observations and opinions
Induction: making one’s own rule for a particular feature

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries.

[Figure 4 diagram: a continuum between Inductive (phraseology, probabilities) and Deductive (grammar rules), with (Clues) as the mediating element]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC) and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)

Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
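The ‘noise’ problem can be illustrated with a rough sketch that simply counts the word immediately to the left of a survey, much as a left-sorted concordance does. The function name `left_collocates` and the four sentences are invented for this illustration and are not taken from the sub-corpus of reports; the point is only that a purely positional search returns prepositions and particles alongside the verbs students are looking for.

```python
import re
from collections import Counter

def left_collocates(texts, node="a survey"):
    """Count the word immediately to the left of `node` in each text."""
    pattern = re.compile(r"\b(\w+)\s+" + re.escape(node) + r"\b",
                         re.IGNORECASE)
    counts = Counter()
    for text in texts:
        for match in pattern.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts

# Invented sentences standing in for concordance hits from a report corpus.
sentences = [
    "We plan to conduct a survey on the use of computers.",
    "The team decided to conduct a survey of local firms.",
    "They carried out a survey last year.",
    "The response rate to a survey depends on its length.",
]
for word, freq in left_collocates(sentences).most_common():
    print(word, freq)
```

Here only one of the three left-hand items is a full verb; the others are the particle of a phrasal verb and a preposition, which students would have to filter out by reading the lines.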

Left context                        Node       Right   Frequency
Response rate to                    a survey   from    3
And hcfa distributed                a survey   to      2
Response rate to                    a survey   of      2
Response rates to                   a survey   form    2
Thinking about conducting           a survey   to      2
$150,000 to undertake               a survey   and     1
1998 report on                      a survey   by      1
2 we sent                           a survey   to      1
Acquisition venterfootnote33sent    a survey   on      1
Addition to mailing                 a survey   of      1
And employment funded               a survey   of      1
And francis used                    a survey   to      1

Figure 5. Search for a survey

This problematic example then gave me the opportunity to remind students of another program, JustTheWord (see Note 5). The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean, I haven’t done a detailed survey on anything.

[Figure 6 screenshot: JustTheWord output for the pattern V obj N, with collocates grouped into clusters; the frequency bars (scale 0 to 200) are not recoverable]

cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase capitalization ratios decrease
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation
• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations
• It seems that different studies have shown different results
• It seems that the practice of employing local staff by multinationals is increasing
• It seems that some individual training courses are below their full capacity

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted
• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
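The comparison that this task asks of students can also be made mechanically. The following is a minimal Python sketch, not part of Hewings’ materials, which takes the six lines from the first part of Figure 7 and tallies what occurs between it seems and that in each set. The names PUBLISHED, STUDENT and slot_fillers are assumptions introduced for illustration, and six lines are of course far too few to support any claim about either corpus.

```python
import re
from collections import Counter

# Concordance lines copied from the first part of Figure 7 (Hewings 2002).
PUBLISHED = [
    "It seems clear that as insider holding proportions increase, capitalization ratios decrease",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference",
]
STUDENT = [
    "It seems that different studies have shown different results",
    "It seems that the practice of employing local staff by multinationals is increasing",
    "It seems that some individual training courses are below their full capacity",
]

# "it seems", then (lazily) any words, then the complementizer "that".
PATTERN = re.compile(r"\bit seems\s+(.*?)\s*\bthat\b", re.IGNORECASE)


def slot_fillers(lines):
    """Count what fills the slot between 'it seems' and 'that'.

    An empty slot (bare 'it seems that') is recorded as '(none)'.
    """
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1).lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    print("Published:", dict(slot_fillers(PUBLISHED)))
    print("Student:  ", dict(slot_fillers(STUDENT)))
```

On these lines the published set returns clear, likely and quite probable once each, while all three student lines leave the slot empty, which is the contrast the task is designed to bring out.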

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although the literature contains some accounts of the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources such as grammars, thesauri, dictionaries, etc. will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is, therefore, still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million-word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31 (2/2007), 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4, Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk

In common with Meunier (ibid.), I also believe that an inductive approach may not appeal to students on account of their different cognitive styles (Flowerdew 2008b). Field-dependent students, who thrive in cooperative, interactive settings and who would seem to enjoy discussion centering on extrapolation of rules from examples, may benefit from this type of pedagogy. However, field-independent learners, who are known to prefer instruction emphasizing rules, may not take to the inductive approach inherent in corpus-based pedagogy. It is interesting to note that Vannestål & Lindquist (2007: 343) state that some of the students in their inductive, corpus-based grammar course commented that “…they preferred the more traditional way of reading about grammatical rules in the book and did not feel that they learned anything by doing corpus exercises”.

Whether an inductive or a deductive approach is adopted would also very much seem to depend on the nature of the particular enquiry. If the enquiry is based on a grammar rule (for example, the difference between for and since in time expressions; see Tribble & Jones 1990), then the differences are quite clear-cut. However, if the enquiry focuses on an aspect of phraseology, students may find it difficult to extrapolate the tendencies associated with patterns in language (Hunston & Francis 2000), as they may be confronted with conflicting examples which do not follow a particular pattern in all cases.

One area that posed difficulty for my students was that of ergativity. As noted by Celce-Murcia (2002), overpassivisation of ergative verbs is an aspect that poses particular problems for advanced learners:

With the verbs ‘increase’ and ‘decrease’, [the ergative] tends [my italics] to be used when the inanimate subject is objectively or subjectively measurable (rather than an animate agent/dynamic instrument object – both of which favor active voice – or a patient subject – for the passive voice).

(Celce-Murcia 2002: 146)

Students found it difficult to work out, from a close reading of concordance lines, the correct choice of verb in the following sentence because of the probabilistic nature of language when viewed syntagmatically:

With a very crowded schedule, students’ level of motivation was decreased / has decreased.

Vannestål & Lindquist (2007) have commented on the difficulty students have in interpreting corpus data, and this aspect seems to be a particularly thorny issue when phraseology comes into play. It would seem, then, that it is in order to supply prompts or hints to enable students to work out the tendencies of phraseological patterns. For example, in the case of the use of the ergative, students could be given a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration (looking at data)
Interaction (discussion and sharing observations and opinions)
Induction (making one’s own rule for a particular feature)

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer tuning of corpus queries:

[Diagram labels: Inductive; Deductive; Phraseology (probabilities); Grammar rules; (Clues)]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)

Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million-word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.

Left context                        Node       Right   Frequency
Response rate to                    a survey   from    3
And hcfa distributed                a survey   to      2
Response rate to                    a survey   of      2
Response rates to                   a survey   form    2
Thinking about conducting           a survey   to      2
$150,000 to undertake               a survey   and     1
1998 report on                      a survey   by      1
2 we sent                           a survey   to      1
Acquisition venterfootnote33sent    a survey   on      1
Addition to mailing                 a survey   of      1
And employment funded               a survey   of      1
And francis used                    a survey   to      1

Figure 5. Search for a survey

This problematic example then gave me the opportunity to remind students of another program, JustTheWord (see note 5). The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

[Bar chart of collocation frequencies for the pattern V obj N, on a scale of 0 to 200, grouped as follows]

cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey

Figure 6. Search for survey in JustTheWord collocations program
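The ‘noise’ students met in Figure 5 can be illustrated with a rough Python sketch that tallies the word immediately to the left of a survey in the twelve lines shown there, weighted by the frequency column. This is not how the concordancer or JustTheWord works; the names ROWS, FUNCTION_WORDS and left_collocates are hypothetical, and the two-word stoplist is an assumption chosen only to fit these lines.

```python
from collections import Counter

# (left context, word to the right, frequency) for the node "a survey",
# copied from the concordance output in Figure 5.
ROWS = [
    ("Response rate to", "from", 3),
    ("And hcfa distributed", "to", 2),
    ("Response rate to", "of", 2),
    ("Response rates to", "form", 2),
    ("Thinking about conducting", "to", 2),
    ("$150,000 to undertake", "and", 1),
    ("1998 report on", "by", 1),
    ("2 we sent", "to", 1),
    ("Acquisition venterfootnote33sent", "on", 1),
    ("Addition to mailing", "of", 1),
    ("And employment funded", "of", 1),
    ("And francis used", "to", 1),
]

# Assumed stoplist: prepositions that precede the node in these lines.
FUNCTION_WORDS = frozenset({"to", "on"})


def left_collocates(rows, skip=FUNCTION_WORDS):
    """Tally the word directly before the node, weighted by frequency."""
    counts = Counter()
    for left, _right, freq in rows:
        word = left.split()[-1].lower()
        if word not in skip:
            counts[word] += freq
    return counts


if __name__ == "__main__":
    print("All left collocates:", left_collocates(ROWS, skip=frozenset()).most_common())
    print("Stoplist applied:   ", left_collocates(ROWS).most_common())
```

Without the stoplist, the preposition to accounts for 7 of the 18 occurrences, which goes some way to explaining why the verbs were hard to discern; with it, verbs such as distributed, conducting and undertake come to the fore, although extraction debris such as venterfootnote33sent remains.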

One misconception held by students was that the Business Letters Corpus would be useful for consulting on any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that which resource (corpus grammar dictionary etc) is the most appropriate for a particular query has not been explored much to date Ken-nedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall high upright and vertical when the differences are made explicit in good dictionaries but such insightful observations are few and far between in the literature This is an important area that Bernardini (2002 2004) has flagged for future development

Here are two sets of typical examples one from published journal articles and one from stu-dent dissertations What do you notice about the use of it seems in the two sets of examples Can you suggest why they are different

Published articles Student dissertations

bullensp enspItenspseemsenspclearenspthatenspasenspinsiderenspholdingenspproportions increase capitalization ratios decrease

bullensp enspItenspseemsenspthatenspdifferentenspstudiesensphaveenspshownenspdifferent results

bullensp enspItenspseemsensplikelyenspthatensptheenspeightiesenspandenspninetiesenspwill be known as decades of large scale disaggregation

bullensp enspItenspseemsenspthatensptheensppracticeenspofenspemployingensplo-cal staff by multinationals is increasing

bullensp enspItenspseemsenspquiteenspprobableenspthatenspconsumersenspwould not recognize such relatively small degrees of difference

bullensp enspItenspseemsenspthatenspsomeenspindividualensptrainingenspcourses are below their full capacity

Now look at the following examples of it seems that from published journal articles How is it used differently from student dissertations

bullensp enspItenspseemsenspthatenspconsumersenspareenspmoreensplikelyensptoenspuseensppriceensptacticenspandenspswitchenspstoresensponlyenspwhenenspcertain brands and product categories are promoted

bullensp enspItenspseemsenspthatensptheenspissueenspofenspprivatizationenspcouldenspbecomeenspanenspobjectenspofenspaenspnationalenspreferendum

Figure 7 Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Applying corpus linguistics to pedagogy 411

Neither should it be forgotten that corpora of learner writing are another valu-able resource in corpus-based pedagogy (see Pravec 2002 for a review) either to inform materials (cf Granger 2004 Gilquin et al 2007 Mukherjee 2006) or for exploitation by the learners themselves (Hewings amp Hewings 2002 Mukherjee amp Rohrbach 2006 Seidlhofer 2000) For example Mukherjee amp Rohrbach (ibid) propose individualising the corpus analysis in order to compare variation in in-dividual learnersrsquo output Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it The corpus-based materials of Hewings amp Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing ie published journal articles from the field of Business Studies also incorporate the findings from learner corpora (MBA disser-tations written by non-native speakers) Asking students to compare and discuss the differences of it seemshellip in concordance lines selected from the two corpora as shown in Figure 7 overleaf would serve to alert students to particularly problem-atic areas for post-graduate writers which students might not appreciate if they were just exposed to working with expert or professional corpora

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although the literature contains some accounts of the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources such as grammars, thesauri, dictionaries etc. will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million-word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004. [online] “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.


Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.


Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.


Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk


a prompting question such as “Do you notice any difference in the subjects for was decreased and has decreased?”

In tackling corpus-based enquiries, Carter & McCarthy (1995) have formulated the ‘3 Is’ strategy:

Illustration: looking at data
Interaction: discussion and sharing observations and opinions
Induction: making one’s own rule for a particular feature

However, based on the difficulties my students have encountered with inducing phraseological tendencies, I would like to elaborate on the above model by proposing a ‘4 Is’ formulation, adding ‘Intervention’ as an optional stage between Interaction and Induction. This would allow the inclusion of hints such as the one mentioned above. Although in the literature on language teaching deductive and inductive approaches are usually seen as polarities, the above discussion has shown that clues and prompts can be used to mediate the inductive ↔ deductive continuum. For this reason, the following dynamic paradigm for corpus investigations is proposed, which allows for finer-tuning of corpus queries:

[Diagram: Inductive ↔ (Clues) ↔ Deductive; Phraseology (probabilities) ↔ Grammar rules]

Figure 4. Dynamic paradigm for corpus investigations

Implementing a more delicate approach to corpus queries would help to reduce some of the difficulties associated with interpretation for students, especially when they are engaged in working out phraseological tendencies. As pointed out by Gardner (2007), it is this combinatorial nature of lexis and grammar which poses problems:

…it is likely that only the most advanced language learners can take advantage of the intricate semantic relationships between words that are revealed through concordancing. Certainly, such an approach to language training presupposes that learners will know most of the words (cotext) that surround a key word or phrase in context (KWIC), and that they can connect their meanings – an assumption that seems unreasonable for many groups of language learners (children, beginning L2 learners, learners with low literacy skills, etc.). (Gardner 2007: 255)


Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that students sometimes intended to use a corpus that was not the most appropriate one for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million-word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
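The ‘noise’ in the search for a survey can be made concrete with a small sketch. The following is an illustration only, assuming Python; the names FIGURE_5_ROWS and left_neighbours are hypothetical and do not belong to the corpus tool the students used. It tallies the word immediately to the left of a survey in the twelve rows of Figure 5, weighted by the frequencies given there.

```python
from collections import Counter

# (concordance line, frequency) pairs transcribed from Figure 5.
FIGURE_5_ROWS = [
    ("Response rate to a survey from", 3),
    ("And hcfa distributed a survey to", 2),
    ("Response rate to a survey of", 2),
    ("Response rates to a survey form", 2),
    ("Thinking about conducting a survey to", 2),
    ("$150000 to undertake a survey and", 1),
    ("1998 report on a survey by", 1),
    ("2 we sent a survey to", 1),
    ("Acquisition venterfootnote33sent a survey on", 1),
    ("Addition to mailing a survey of", 1),
    ("And employment funded a survey of", 1),
    ("And francis used a survey to", 1),
]


def left_neighbours(rows, node="a survey"):
    """Count the word immediately to the left of `node`, weighted by frequency."""
    counts = Counter()
    for line, freq in rows:
        # Pad with spaces so the node is only matched as whole words.
        padded = " " + line.lower() + " "
        left, found, _right = padded.partition(" " + node + " ")
        words = left.split()
        if found and words:
            counts[words[-1]] += freq
    return counts


counts = left_neighbours(FIGURE_5_ROWS)
for word, freq in counts.most_common():
    print(f"{freq:>2}  {word}")
```

On these rows the most frequent left neighbour is the preposition to (7 of the 18 hits), while forms such as conducting, undertake and sent each occur only once or twice. A learner therefore has to filter out non-verbs and normalise inflected forms by eye, which is the difficulty described above.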

This problematic example above then gave me the opportunity to remind students of another program, JustTheWord.⁵ The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean, I haven’t done a detailed survey on anything.

Words                                           Frequency
Response rate to a survey from                  3
And hcfa distributed a survey to                2
Response rate to a survey of                    2
Response rates to a survey form                 2
Thinking about conducting a survey to           2
$150000 to undertake a survey and               1
1998 report on a survey by                      1
2 we sent a survey to                           1
Acquisition venterfootnote33sent a survey on    1
Addition to mailing a survey of                 1
And employment funded a survey of               1
And francis used a survey to                    1

Figure 5. Search for a survey

V obj N
cluster 1: carry out survey; conduct survey; take in survey
cluster 2: mention in survey; quote survey
cluster 3: complete survey; do survey
cluster 4: publish survey; report in survey
unclustered: base on survey; come in survey; commission survey; design survey
[Bar chart: frequency of each collocation, horizontal axis 0–200]

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles
• It seems clear that as insider holding proportions increase capitalization ratios decrease
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation
• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations
• It seems that different studies have shown different results
• It seems that the practice of employing local staff by multinationals is increasing
• It seems that some individual training courses are below their full capacity

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted
• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Applying corpus linguistics to pedagogy 411

Neither should it be forgotten that corpora of learner writing are another valu-able resource in corpus-based pedagogy (see Pravec 2002 for a review) either to inform materials (cf Granger 2004 Gilquin et al 2007 Mukherjee 2006) or for exploitation by the learners themselves (Hewings amp Hewings 2002 Mukherjee amp Rohrbach 2006 Seidlhofer 2000) For example Mukherjee amp Rohrbach (ibid) propose individualising the corpus analysis in order to compare variation in in-dividual learnersrsquo output Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it The corpus-based materials of Hewings amp Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing ie published journal articles from the field of Business Studies also incorporate the findings from learner corpora (MBA disser-tations written by non-native speakers) Asking students to compare and discuss the differences of it seemshellip in concordance lines selected from the two corpora as shown in Figure 7 overleaf would serve to alert students to particularly problem-atic areas for post-graduate writers which students might not appreciate if they were just exposed to working with expert or professional corpora

6 Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy and ESP in particular It can be seen that very re-cent pedagogic endeavours have adopted a much more discourse-based top-down approach to analysis (or worked from a bottom-up to a more top-down analysis) a development that was advocated by Flowerdew (1998) over a decade ago It has also been illustrated that corpus pedagogy has progressed beyond looking at trun-cated concordance lines and is now encompassing Sinclairrsquos lsquounits of meaningrsquo outlined in the introduction of this article

However the issue of contextualization still remains problematic and it is en-visaged that in future more attention will be paid to the mark-up of written text with contextual features as is the norm for spoken corpora nowadays It has been shown though that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues Although there are a few accounts in the literature regarding the lsquopedagogic mediationrsquo of corpus data these are few and far between indicating this is an area for further discussion and expansion Finally it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for The increasing availability of other online resources such as grammars thesauri dictionaries etc will make it easier for students to toggle between a multitude of online resources to decide which is

412 Lynne Flowerdew

the most relevant and useful look-up tool Learner corpora it is argued are also of value here However the above can only be accomplished with strategy train-ing not only of students but also of teachers as called for by Frankenberg-Garcia (2006) There is therefore still much to debate and develop in the application of corpus linguistics to pedagogy a field first founded with the pioneering work of Tim Johns (1991a 1991b) in the early nineties

Notes

This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference Lisbon Portugal on 6th July 2008 and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007 I wish to thank the two anonymous reviewers for their very helpful and constructive com-ments on an earlier draft of this paper Any shortcomings naturally remain my own

1 I use lsquocorpus-basedrsquo in this article to refer to any hands-on pedagogic applications of corpora See Tognini-Bonelli (2001) for a discussion on her definitions of lsquocorpus-basedrsquo vs lsquocorpus-drivenrsquo See also Lee (2008) for additional details on lsquocorpus-informedrsquo and lsquocorpus-supportedrsquo linguistics

2 The BLC is a freely available corpus at httpysomeyahpinfoseekcojp (accessed January 2009) It comprises one million words of business letters

3 The BNCweb is a user-friendly interface for the 100-million word BNC See httphomepagemaccombncwebmanualbncwebmanmainhtm (accessed December 2008) for more details and also Hoffmann et al (2008)

4 Information on MICASE can be found at httpquodlibumichedummicase (accessed July 2008)

5 JustTheWord is an online collocations program which interfaces with the 100-million-word BNC

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk

Corpora are useful for phraseological enquiries (cf. Granger & Meunier 2008; Meunier & Granger 2008), as the language which falls between lexis and grammar is often not easily retrievable from grammars or dictionaries. However, some intervention in the form of clues or hints may be needed to enable students to connect meanings. Conversely, while hard-and-fast grammar rules may be easier for students to glean from corpora, a corpus, or indeed a particular sub-corpus, may not be the best or most efficient resource for consultation. This issue is the focus of the following section.

5. Which corpus and which online resource?

Chambers (2005) and Chambers & O’Sullivan (2004) have underscored the importance for students of having the ability to select appropriate electronic resources:

The concept of literacy now includes not only the knowledge and skills which are traditionally associated with that concept, but also the ability to select, evaluate and use the electronic tools and resources appropriate for the activity which is being undertaken. (Chambers & O’Sullivan 2004: 158)

In this respect, Davies (2004) reports on a program on student use of three main corpora for examining syntactic variation in Spanish, noting that sometimes the students’ intention was to use a corpus that was not the most appropriate for the research question they had formulated.

In my own class of report writing, referred to earlier in the article, students wanted to know which of the verb collocations below was the most appropriate for survey:

We plan to do / carry out / conduct a survey on the use of computers.

Students considered the 7-million-word sub-corpus of reports to be ideal for searching the noun survey and expected that it would show correct verb + noun collocations. Although the corpus data displayed useful verbs to collocate with the noun survey, these were not easy to discern. There was a lot of ‘noise’, as students were required to read through quite a number of concordance lines to identify appropriate verb + noun collocations for their context of writing, as evidenced by the results shown in Figure 5.
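The kind of ‘noise’ described here can be illustrated with a minimal sketch in Python. It is not the concordancer used in the class: it simply takes a subset of the concordance lines reported in Figure 5, finds the word immediately to the left of the node a survey, and counts those left neighbours. The function name left_collocates and the decision to look only one word to the left are assumptions made for illustration.

```python
import re
from collections import Counter

# A subset of the concordance lines reported in Figure 5
# (left context + node "a survey" + one word of right context).
LINES = [
    "Response rate to a survey from",
    "And hcfa distributed a survey to",
    "Response rate to a survey of",
    "Response rates to a survey form",
    "Thinking about conducting a survey to",
    "$150000 to undertake a survey and",
    "1998 report on a survey by",
    "2 we sent a survey to",
    "Addition to mailing a survey of",
    "And employment funded a survey of",
    "And francis used a survey to",
]


def left_collocates(lines, node="a survey"):
    """Count the word immediately to the left of `node` in each line.

    Hypothetical helper: a real concordancer would use a wider span
    and part-of-speech information to isolate verbs.
    """
    pattern = re.compile(r"(\w+)\s+" + re.escape(node) + r"\b", re.IGNORECASE)
    counts = Counter()
    for line in lines:
        for match in pattern.finditer(line):
            counts[match.group(1).lower()] += 1
    return counts


counts = left_collocates(LINES)
for word, n in counts.most_common():
    print(f"{word:12} {n}")
```

In this sample the most frequent left neighbour is the preposition to (from response rate to a survey), while the verbs of interest (conducting, undertake, sent, distributed and so on) each occur once in different inflected forms. Even a simple count therefore leaves the learner to separate verbs from other word classes by eye, which is the difficulty the students encountered.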

This problematic example then gave me the opportunity to remind students of another program, JustTheWord (see Note 5). The screenshot below shows this to be a more appropriate online tool to use, with the cluster feature of particular use as the collocations are grouped semantically. In Figure 6 below, a glance at Cluster 1 confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean, I haven’t done a detailed survey on anything.

Words                                           Frequency
Response rate to a survey from                  3
And hcfa distributed a survey to                2
Response rate to a survey of                    2
Response rates to a survey form                 2
Thinking about conducting a survey to           2
$150000 to undertake a survey and               1
1998 report on a survey by                      1
2 we sent a survey to                           1
Acquisition venterfootnote33sent a survey on    1
Addition to mailing a survey of                 1
And employment funded a survey of               1
And francis used a survey to                    1

Figure 5. Search for a survey

V obj N

cluster 1:    carry out survey; conduct survey; take in survey
cluster 2:    mention in survey; quote survey
cluster 3:    complete survey; do survey
cluster 4:    publish survey; report in survey
unclustered:  base on survey; come in survey; commission survey; design survey

[Screenshot: each collocation is shown with a frequency bar on a scale of 0 to 200; the individual frequencies are not legible here.]

Figure 6. Search for survey in JustTheWord collocations program

One misconception held by students was that the Business Letters Corpus would be useful for consulting on any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles

• It seems clear that as insider holding proportions increase, capitalization ratios decrease.
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.
• It seems quite probable that consumers would not recognize such relatively small degrees of difference.

Student dissertations

• It seems that different studies have shown different results.
• It seems that the practice of employing local staff by multinationals is increasing.
• It seems that some individual training courses are below their full capacity.

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted.
• It seems that the issue of privatization could become an object of a national referendum.

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
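The comparison that students are asked to make in Figure 7 can be sketched, very roughly, in Python. The snippet below is an illustration only and not the procedure used by Hewings & Hewings (2002): it sorts each concordance line according to whether anything intervenes between it seems and that. The two labels and the function names classify and tally are assumptions introduced here, and the sample lines are the first two sets in Figure 7.

```python
import re
from collections import Counter

# First two sets of concordance lines from Figure 7.
PUBLISHED = [
    "It seems clear that as insider holding proportions increase, capitalization ratios decrease.",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation.",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference.",
]
STUDENT = [
    "It seems that different studies have shown different results.",
    "It seems that the practice of employing local staff by multinationals is increasing.",
    "It seems that some individual training courses are below their full capacity.",
]

# Lazy match: captures whatever stands between "it seems" and the first "that".
PATTERN = re.compile(r"it seems\s+(.*?)\bthat\b", re.IGNORECASE)


def classify(line):
    """Label a line by what stands between 'it seems' and 'that' (hypothetical labels)."""
    match = PATTERN.match(line)
    if match is None:
        return "other"
    between = match.group(1).strip()
    return "it seems that" if not between else "it seems + X + that"


def tally(lines):
    """Count how many lines fall under each label."""
    return Counter(classify(line) for line in lines)


print("Published:", dict(tally(PUBLISHED)))
print("Student:  ", dict(tally(STUDENT)))
```

On these six lines, all three published examples fall under it seems + X + that (clear, likely, quite probable) and all three student examples under bare it seems that. The sample is far too small to support any generalisation, and, as the second part of the Figure 7 task makes clear, published writers also use it seems that, so a count of this kind can only prompt the discussion of how the two groups use the pattern, not replace it.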

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’, outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Accounts in the literature of the ‘pedagogic mediation’ of corpus data are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources, such as grammars, thesauri, dictionaries, etc., will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is, therefore, still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.


International Journal of Applied Linguistics 18 (1) 41ndash62Johns T 1991a ldquoFrom printout to handout Grammar and vocabulary teaching in the context of

data-driven learningrdquo In T Odlin (Ed) Perspectives on Pedagogical Grammar Cambridge Cambridge University Press 293ndash313

Johns T 1991b ldquoShould you be persuaded Two examples of data-driven learningrdquo English Lan-guage Research Journal 4 Department of English University of Birmingham 1ndash16

Jones J 2007 ldquoVocabulary-based discourse units in biology research articlesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Dis-course Structure AmsterdamPhiladelphia John Benjamins 175ndash212

Kaltenboumlck G amp Mehlmauer-Larcher B 2005 ldquoComputer corpora and the language classroom On the potential and limitations of computer corpora in language teachingrdquo ReCALL 17 (1) 65ndash84

Kanoksilapatham B 2007 ldquoRhetorical moves in biochemistry research articlesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Dis-course Structure AmsterdamPhiladelphia John Benjamins 73ndash119

Kennedy G 2008 ldquoPhraseology and language pedagogyrdquo In F Meunier amp S Granger (Eds) Phraseology in Foreign Language Learning and Teaching AmsterdamPhiladelphia John Benjamins 21ndash41

Krishnamurthy R amp Kosem I 2007 ldquoIssues in creating a corpus for EAP pedagogy and re-searchrdquo Journal of English for Academic Purposes 6 (4) 356ndash373

Lee D 2008 ldquoCorpora and discourse analysis New ways of doing old thingsrdquo In V K Bhatia J Flowerdew amp R Jones (Eds) Advances in Discourse Studies London Routledge 86ndash99

Lee D amp Swales J M 2006 ldquoA corpus-based EAP course for NNS doctoral students Moving from available specialized corpora to self compiled corporardquo English for Specific Purposes 25 (1) 56ndash75

McCarthy M 2001 Issues in Applied Linguistics Cambridge Cambridge University PressMcEnery T Xiao R amp Tono Y 2006 Corpus-based Language Studies London RoutledgeMeunier F 2002 ldquoThe pedagogic value of native and learner corpora in EFL grammar teach-

ingrdquo In S Granger J Hung amp S Petch-Tyson (Eds) Computer Learner Corpora Second Language Acquisition and Foreign Language Teaching AmsterdamPhiladelphia John Ben-jamins 119ndash141

Meunier F amp Granger S (Eds) 2008 Phraseology in Foreign Language Learning and Teaching AmsterdamPhiladelphia John Benjamins

416 Lynne Flowerdew

Milton J 2006 ldquoResource-rich web-based feedback Helping learners become independent writersrdquo In K Hyland amp F Hyland (Eds) Feedback in Second Language Writing Cam-bridge Cambridge University Press 123ndash139

Mukherjee J 2006 ldquoCorpus linguistics and language pedagogy The state of the art minus and be-yondrdquo In S Braun K Kohn amp J Mukherjee (Eds) Corpus Technology and Language Peda-gogy Frankfurt am Main Peter Lang 5ndash24

Mukherjee J amp Rohrbach J-M 2006 ldquoRethinking applied corpus linguistics from a language-pedagogical perspective New departures in learner corpus researchrdquo In B Kettemann amp G Marko (Eds) Planning and Gluing Corpora Inside the Applied Corpus Linguistrsquos Workshop Frankfurt am Main Peter Lang 205ndash231

Nesselhauf N 2003 ldquoThe use of collocations by advanced learners of English and some implica-tions for teachingrdquo Applied Linguistics 24 (2) 223ndash242

Nesselhauf N 2004 Collocations in a Learner Corpus AmsterdamPhiladelphia John Benja-mins

Noguchi J 2004 ldquoA genre analysis and mini-corpora approach to support professional writing by non-native speakersrdquo English Corpus Studies 11 101ndash110

OrsquoSullivan I 2007 ldquoEnhancing a process-oriented approach to literacy and language learning The role of corpus consultation literacyrdquo ReCALL 19 (3) 269ndash286

Partington A 1998 Patterns and Meanings AmsterdamPhiladelphia John BenjaminsPravec N 2002 ldquoSurvey of learner corporardquo ICAME Journal 26 (1) 8ndash14Seidlhofer B 2000 ldquoOperationalising intertextuality Using learner corpora for learningrdquo In L

Burnard amp T McEnery (Eds) Rethinking Language Pedagogy from a Corpus Perspective Bern Peter Lang 207ndash223

Seidlhofer B (Ed) 2003 Controversies in Applied Linguistics (Section 2 Corpus Linguistics and Language Teaching) Oxford Oxford University Press

Sinclair J McH 1991 Corpus Concordance Collocation Oxford Oxford University PressSinclair J McH 1999 ldquoThe lexical itemrdquo In E Weigand (Ed) Contrastive Lexical Semantics

AmsterdamPhiladelphia John Benjamins 1ndash24Sinclair J McH 2004a ldquoThe search for units of meaningrdquo In J McH Sinclair (edited with R

Carter) Trust the Text London Routledge 24ndash48Sinclair J McH 2004b ldquoNew evidence new priorities new attitudesrdquo In J McH Sinclair (Ed)

How to Use Corpora in Language Teaching AmsterdamPhiladelphia John Benjamins 271ndash299

Stubbs M 1996 Text and Corpus Analysis Oxford BlackwellStubbs M 2004 ldquoOn very frequent phrases in English Distributions functions and structuresrdquo

Plenary address given at ICAME 25 Verona Italy 19ndash23 MaySwain M 1998 ldquoFocus on form through conscious reflectionrdquo In C Doughty amp J Williams

(Eds) Focus on Form in Classroom Second Language Acquisition Cambridge Cambridge University Press 64ndash81

Swales J M 1990 Genre Analysis English in Academic and Research Settings Cambridge Cam-bridge University Press

Swales J M 2002 ldquoIntegrated and fragmented worlds EAP materials and corpus linguisticsrdquo In J Flowerdew (Ed) Academic Discourse Harlow UK Longman 150ndash64

Swales J M 2004 Research Genres Cambridge Cambridge University Press

Applying corpus linguistics to pedagogy 417

Tognini-Bonelli E 2001 Corpus Linguistics at Work AmsterdamPhiladelphia John Benja-mins

Tribble C amp Jones G 1990 Concordances in the Classroom Harlow UK LongmanVannestaringl M amp Lindquist H 2007 ldquoLearning English grammar with a corpus Experimenting

with concordancing in a university grammar courserdquo ReCALL 19 (3) 329ndash350Weber J-J 2001 ldquoA concordance- and genre-informed approach to ESP essay writingrdquo ELT

Journal 55 (1) 14ndash20Widdowson H G 1991 ldquoThe description and prescription of languagerdquo In J Alatis (Ed)

Georgetown University Round Table in Language and Linguistics Washington DC George-town University

Widdowson H G 1998 ldquoContext community and authentic languagerdquo TESOL Quarterly 32 (4) 705ndash716

Widdowson H G 2002 ldquoCorpora and language teaching tomorrowrdquo Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference Bertinoro Italy 29 July

Widdowson H G 2004 Text Context Pretext London Blackwell

Authorrsquos address

Lynne FlowerdewHong Kong University of Science and TechnologyLanguage CentreClear Water Bay RoadKowloonHong Kong SAR

lclynneusthk

Applying corpus linguistics to pedagogy 409

Words (Left sort | Right sort | Show PoS)          Frequency (sorted)
Response rate to a survey from                     3
And hcfa distributed a survey to                   2
Response rate to a survey of                       2
Response rates to a survey form                    2
Thinking about conducting a survey to              2
$150000 to undertake a survey and                  1
1998 report on a survey by                         1
2 we sent a survey to                              1
Acquisition venterfootnote33sent a survey on       1
Addition to mailing a survey of                    1
And employment funded a survey of                  1
And francis used a survey to                       1

Figure 5. Search for a survey

[Bar chart of collocation frequencies for the pattern V obj N, on a scale from 0 to 200. The collocations are grouped as follows.]

cluster 1: carry out survey, conduct survey, take in survey
cluster 2: mention in survey, quote survey
cluster 3: complete survey, do survey
cluster 4: publish survey, report in survey
unclustered: base on survey, come in survey, commission survey, design survey

Figure 6. Search for survey in JustTheWord collocations program


confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: “I mean I haven’t done a detailed survey on anything”.
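The kind of concordance check described above can be sketched in a few lines of Python. This is a minimal illustration only, not the BNC or JustTheWord interface: the function name `kwic`, the pattern `DO_SURVEY` and the three-sentence sample are assumptions made for the example, and only the first sample sentence is taken from the text above (the other two are invented).

```python
import re

def kwic(texts, pattern, width=30):
    """Return keyword-in-context lines for every regex match in texts."""
    lines = []
    for text in texts:
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Slice a fixed window of characters on either side of the match.
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            lines.append(f"{left:>{width}} [{m.group(0)}] {right:<{width}}")
    return lines

# Toy data: the first sentence is the example quoted above; the rest are invented.
sample = [
    "I mean I haven't done a detailed survey on anything",
    "We did a survey of staff last year",
    "The department conducted a survey of graduate employers",
]

# Hypothetical pattern: a form of DO followed, within three words, by "survey".
DO_SURVEY = r"\b(?:do|does|did|done|doing)\b(?:\s+\w+){0,3}?\s+survey\b"

for line in kwic(sample, DO_SURVEY):
    print(line)
```

Reading the node phrase together with its left and right context in this way is what allows a learner to notice, for instance, that do + survey tends to occur in informal spoken contexts.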

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program, it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that which resource (corpus, grammar, dictionary, etc.) is the most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles

• It seems clear that as insider holding proportions increase capitalization ratios decrease

• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation

• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations

• It seems that different studies have shown different results

• It seems that the practice of employing local staff by multinationals is increasing

• It seems that some individual training courses are below their full capacity

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted

• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)


Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
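The comparison between the two corpora can also be approximated computationally. The sketch below is an illustration under stated assumptions rather than the procedure used by Hewings & Hewings: the six lines are the first set of examples in Figure 7, while the function `complement_profile` and the pattern `IT_SEEMS` are hypothetical names introduced here. It simply records what stands between it seems and that in each line.

```python
import re
from collections import Counter

# Concordance lines copied from the first task in Figure 7.
published = [
    "It seems clear that as insider holding proportions increase capitalization ratios decrease",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference",
]
student = [
    "It seems that different studies have shown different results",
    "It seems that the practice of employing local staff by multinationals is increasing",
    "It seems that some individual training courses are below their full capacity",
]

# Capture whatever occurs between "it seems" and the first following "that".
IT_SEEMS = re.compile(r"\bit seems\s+(.*?)\bthat\b", re.IGNORECASE)

def complement_profile(lines):
    """Count the material between 'it seems' and 'that' in each line."""
    counts = Counter()
    for line in lines:
        m = IT_SEEMS.search(line)
        if m:
            filler = m.group(1).strip().lower()
            counts[filler or "(nothing)"] += 1
    return counts

print("published:", dict(complement_profile(published)))
print("student:  ", dict(complement_profile(student)))
```

On these six lines the published examples all insert an evaluative adjective before that, whereas the student examples use bare it seems that. With only three lines per set this is an aid to noticing in the classroom, not evidence about the corpora as a whole.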

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization still remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are a few accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources such as grammars, thesauri, dictionaries, etc. will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is, therefore, still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.


Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large, annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.


Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing, and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.


Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.


Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.


Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community, and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Authorrsquos address

Lynne FlowerdewHong Kong University of Science and TechnologyLanguage CentreClear Water Bay RoadKowloonHong Kong SAR

lclynneusthk

confirmed students’ initial intuitions, but some were surprised to find that the verb do in Cluster 3 was acceptable. An examination of the concordance lines for this collocation revealed, though, that it was mainly used in an informal setting in speaking, as in the following: I mean I haven’t done a detailed survey on anything.

One misconception held by students was that the Business Letters Corpus would be useful for consulting for any aspect of their letter writing. The utility of this corpus for answering business-related language queries, such as the structure and use of phrases with appreciate, has been illustrated earlier in this article. For other problematic areas, though, such as topic-comment (e.g. For the training program it will start on…), it would have been more appropriate to consult a local reference grammar targeting common errors of Hong Kong students.

It is noteworthy that the question of which resource (corpus, grammar, dictionary, etc.) is most appropriate for a particular query has not been explored much to date. Kennedy (2008) notes that a corpus might not be the most efficient way for students to discover the differences in use between tall, high, upright and vertical when the differences are made explicit in good dictionaries, but such insightful observations are few and far between in the literature. This is an important area that Bernardini (2002, 2004) has flagged for future development.

Here are two sets of typical examples, one from published journal articles and one from student dissertations. What do you notice about the use of it seems in the two sets of examples? Can you suggest why they are different?

Published articles:

• It seems clear that as insider holding proportions increase, capitalization ratios decrease
• It seems likely that the eighties and nineties will be known as decades of large scale disaggregation
• It seems quite probable that consumers would not recognize such relatively small degrees of difference

Student dissertations:

• It seems that different studies have shown different results
• It seems that the practice of employing local staff by multinationals is increasing
• It seems that some individual training courses are below their full capacity

Now look at the following examples of it seems that from published journal articles. How is it used differently from student dissertations?

• It seems that consumers are more likely to use price tactic and switch stores only when certain brands and product categories are promoted
• It seems that the issue of privatization could become an object of a national referendum

Figure 7. Concordance task for it seems in published articles and student dissertations (from Hewings 2002)

Neither should it be forgotten that corpora of learner writing are another valuable resource in corpus-based pedagogy (see Pravec 2002 for a review), either to inform materials (cf. Granger 2004; Gilquin et al. 2007; Mukherjee 2006) or for exploitation by the learners themselves (Hewings & Hewings 2002; Mukherjee & Rohrbach 2006; Seidlhofer 2000). For example, Mukherjee & Rohrbach (ibid.) propose individualising the corpus analysis in order to compare variation in individual learners’ output. Having learners build corpora of their own writing to compare with a reference corpus would thus increase the relevance of corpus-based pedagogy by individualising it. The corpus-based materials of Hewings & Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in professional business writing, i.e. published journal articles from the field of Business Studies, also incorporate the findings from learner corpora (MBA dissertations written by non-native speakers). Asking students to compare and discuss the differences of it seems… in concordance lines selected from the two corpora, as shown in Figure 7, would serve to alert students to particularly problematic areas for post-graduate writers, which students might not appreciate if they were just exposed to working with expert or professional corpora.
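The comparison described above could be prepared for class with a few lines of code. The following is only a minimal sketch, not the procedure used by Hewings (2002): it takes the six example lines from Figure 7 as two tiny stand-in "corpora" and counts the word that immediately follows it seems in each. The function names `next_word_counts` and `bare_that_share` are hypothetical, introduced here purely for illustration, and a real task would load full expert and learner corpora instead of hard-coded lines.

```python
import re
from collections import Counter

# Stand-in data: the six example lines from Figure 7 (Hewings 2002).
# A real classroom task would read full corpora from files (assumption).
published = [
    "It seems clear that as insider holding proportions increase, capitalization ratios decrease",
    "It seems likely that the eighties and nineties will be known as decades of large scale disaggregation",
    "It seems quite probable that consumers would not recognize such relatively small degrees of difference",
]
student = [
    "It seems that different studies have shown different results",
    "It seems that the practice of employing local staff by multinationals is increasing",
    "It seems that some individual training courses are below their full capacity",
]

# Capture the single word that directly follows "it seems" (case-insensitive).
PATTERN = re.compile(r"\bit seems\s+(\w+)", re.IGNORECASE)


def next_word_counts(lines):
    """Count the word immediately following 'it seems' across all lines."""
    counts = Counter()
    for line in lines:
        for match in PATTERN.finditer(line):
            counts[match.group(1).lower()] += 1
    return counts


def bare_that_share(lines):
    """Proportion of 'it seems' hits followed directly by 'that'."""
    counts = next_word_counts(lines)
    total = sum(counts.values())
    return counts["that"] / total if total else 0.0


if __name__ == "__main__":
    for label, lines in (("Published articles", published), ("Student dissertations", student)):
        print(label, dict(next_word_counts(lines)), f"bare 'that': {bare_that_share(lines):.0%}")
```

On these six lines the learner examples all show it seems followed directly by that, whereas the published examples place an evaluative adjective or adverb (clear, likely, quite probable) before that. With only three lines per set this is an illustration of the task, not evidence about either corpus.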

6. Conclusion

This article has reviewed four inter-related issues concerning the application of corpus linguistics to pedagogy, and ESP in particular. It can be seen that very recent pedagogic endeavours have adopted a much more discourse-based, top-down approach to analysis (or worked from a bottom-up to a more top-down analysis), a development that was advocated by Flowerdew (1998) over a decade ago. It has also been illustrated that corpus pedagogy has progressed beyond looking at truncated concordance lines and is now encompassing Sinclair’s ‘units of meaning’ outlined in the introduction of this article.

However, the issue of contextualization remains problematic, and it is envisaged that in future more attention will be paid to the mark-up of written text with contextual features, as is the norm for spoken corpora nowadays. It has been shown, though, that corpora are not completely devoid of context and that the co-textual environment may provide useful contextual clues. Although there are accounts in the literature regarding the ‘pedagogic mediation’ of corpus data, these are few and far between, indicating that this is an area for further discussion and expansion. Finally, it has been suggested that more attention needs to be paid to the types of enquiry corpora are best suited for. The increasing availability of other online resources such as grammars, thesauri, dictionaries, etc. will make it easier for students to toggle between a multitude of online resources to decide which is the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is, therefore, still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?” Linguistik online, 31 (2/2007), 17–24.

Davies, M. 2004. “Student use of large annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing, and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Handford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk


Gavioli L 2005 Exploring Corpora for ESP Learning AmsterdamPhiladelphia John Benja-mins

Gavioli L amp Aston G 2001 ldquoEnriching reality Language corpora in language pedagogyrdquo ELT Journal 55 (3) 238ndash246

Gilquin G Granger S amp Paquot M 2007 ldquoLearner corpora The missing link in EAP peda-gogyrdquo Journal of English for Academic Purposes 6 (4) 319ndash335

Granger S 1999 ldquoUse of tenses by advanced EFL learners Evidence from an error-tagged com-puter corpusrdquo In S Hasselgard amp S Oksefjell (Eds) Out of Corpora Studies in Honour of Stig Johansson Amsterdam Rodopi 191ndash202

Granger S 2004 ldquoPractical applications of learner corporardquo In B Lewandowska-Tomaszczyk (Ed) Practical Applications in Language and Computers Bern Peter Lang 1ndash10

Granger S amp Meunier F (Eds) 2008 Phraseology An Interdisciplinary Perspective AmsterdamPhiladelphia John Benjamins

Hahn A 2000 ldquoGrammar at its best The development of a rule- and corpus-based grammar of English tensesrdquo In L Burnard amp T McEnery (Eds) Rethinking Language Pedagogy from a Corpus Perspective Hamburg Peter Lang 193ndash206

Hewings M 2002 ldquoUsing computer-based corpora in teachingrdquo Paper presented at the 36th TESOL Conference Utah March 2002

Hewings M amp Hewings A 2002 ldquolsquoIt is interesting to note thathelliprsquo A comparative study of antic-ipatory lsquoitrsquo in student and published writingrdquo English for Specific Purposes 21 (4) 367ndash383

Applying corpus linguistics to pedagogy 415

Hoffmann S Evert S Smith N Lee D amp Berglund Prytz Y 2008 Corpus Linguistics with BNCweb minusA Practical Guide Bern Peter Lang

Hunston S amp Francis G 2000 Pattern Grammar A Corpus-driven Approach to the Lexical Grammar of English AmsterdamPhiladelphia John Benjamins

Hyland K 2000 Disciplinary Discourses Social Interactions in Academic Writing London Longman

Hyland K 2002 ldquoSpecificity revisited How far should we gordquo English for Specific Purposes 21 (4) 385ndash395

Hyland K 2004 Genre and Second Language Writing Ann Arbor University of Michigan PressHyland K 2007 ldquoAs can be seen Lexical bundles and disciplinary variationrdquo English for Specific

Purposes 27 (1) 4ndash21Hyland K 2008 ldquoAcademic clusters Text patterning in published and postgraduate writingrdquo

International Journal of Applied Linguistics 18 (1) 41ndash62Johns T 1991a ldquoFrom printout to handout Grammar and vocabulary teaching in the context of

data-driven learningrdquo In T Odlin (Ed) Perspectives on Pedagogical Grammar Cambridge Cambridge University Press 293ndash313

Johns T 1991b ldquoShould you be persuaded Two examples of data-driven learningrdquo English Lan-guage Research Journal 4 Department of English University of Birmingham 1ndash16

Jones J 2007 ldquoVocabulary-based discourse units in biology research articlesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Dis-course Structure AmsterdamPhiladelphia John Benjamins 175ndash212

Kaltenboumlck G amp Mehlmauer-Larcher B 2005 ldquoComputer corpora and the language classroom On the potential and limitations of computer corpora in language teachingrdquo ReCALL 17 (1) 65ndash84

Kanoksilapatham B 2007 ldquoRhetorical moves in biochemistry research articlesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Dis-course Structure AmsterdamPhiladelphia John Benjamins 73ndash119

Kennedy G 2008 ldquoPhraseology and language pedagogyrdquo In F Meunier amp S Granger (Eds) Phraseology in Foreign Language Learning and Teaching AmsterdamPhiladelphia John Benjamins 21ndash41

Krishnamurthy R amp Kosem I 2007 ldquoIssues in creating a corpus for EAP pedagogy and re-searchrdquo Journal of English for Academic Purposes 6 (4) 356ndash373

Lee D 2008 ldquoCorpora and discourse analysis New ways of doing old thingsrdquo In V K Bhatia J Flowerdew amp R Jones (Eds) Advances in Discourse Studies London Routledge 86ndash99

Lee D amp Swales J M 2006 ldquoA corpus-based EAP course for NNS doctoral students Moving from available specialized corpora to self compiled corporardquo English for Specific Purposes 25 (1) 56ndash75

McCarthy M 2001 Issues in Applied Linguistics Cambridge Cambridge University PressMcEnery T Xiao R amp Tono Y 2006 Corpus-based Language Studies London RoutledgeMeunier F 2002 ldquoThe pedagogic value of native and learner corpora in EFL grammar teach-

ingrdquo In S Granger J Hung amp S Petch-Tyson (Eds) Computer Learner Corpora Second Language Acquisition and Foreign Language Teaching AmsterdamPhiladelphia John Ben-jamins 119ndash141

Meunier F amp Granger S (Eds) 2008 Phraseology in Foreign Language Learning and Teaching AmsterdamPhiladelphia John Benjamins

416 Lynne Flowerdew

Milton J 2006 ldquoResource-rich web-based feedback Helping learners become independent writersrdquo In K Hyland amp F Hyland (Eds) Feedback in Second Language Writing Cam-bridge Cambridge University Press 123ndash139

Mukherjee J 2006 ldquoCorpus linguistics and language pedagogy The state of the art minus and be-yondrdquo In S Braun K Kohn amp J Mukherjee (Eds) Corpus Technology and Language Peda-gogy Frankfurt am Main Peter Lang 5ndash24

Mukherjee J amp Rohrbach J-M 2006 ldquoRethinking applied corpus linguistics from a language-pedagogical perspective New departures in learner corpus researchrdquo In B Kettemann amp G Marko (Eds) Planning and Gluing Corpora Inside the Applied Corpus Linguistrsquos Workshop Frankfurt am Main Peter Lang 205ndash231

Nesselhauf N 2003 ldquoThe use of collocations by advanced learners of English and some implica-tions for teachingrdquo Applied Linguistics 24 (2) 223ndash242

Nesselhauf N 2004 Collocations in a Learner Corpus AmsterdamPhiladelphia John Benja-mins

Noguchi J 2004 ldquoA genre analysis and mini-corpora approach to support professional writing by non-native speakersrdquo English Corpus Studies 11 101ndash110

OrsquoSullivan I 2007 ldquoEnhancing a process-oriented approach to literacy and language learning The role of corpus consultation literacyrdquo ReCALL 19 (3) 269ndash286

Partington A 1998 Patterns and Meanings AmsterdamPhiladelphia John BenjaminsPravec N 2002 ldquoSurvey of learner corporardquo ICAME Journal 26 (1) 8ndash14Seidlhofer B 2000 ldquoOperationalising intertextuality Using learner corpora for learningrdquo In L

Burnard amp T McEnery (Eds) Rethinking Language Pedagogy from a Corpus Perspective Bern Peter Lang 207ndash223

Seidlhofer B (Ed) 2003 Controversies in Applied Linguistics (Section 2 Corpus Linguistics and Language Teaching) Oxford Oxford University Press

Sinclair J McH 1991 Corpus Concordance Collocation Oxford Oxford University PressSinclair J McH 1999 ldquoThe lexical itemrdquo In E Weigand (Ed) Contrastive Lexical Semantics

AmsterdamPhiladelphia John Benjamins 1ndash24Sinclair J McH 2004a ldquoThe search for units of meaningrdquo In J McH Sinclair (edited with R

Carter) Trust the Text London Routledge 24ndash48Sinclair J McH 2004b ldquoNew evidence new priorities new attitudesrdquo In J McH Sinclair (Ed)

How to Use Corpora in Language Teaching AmsterdamPhiladelphia John Benjamins 271ndash299

Stubbs M 1996 Text and Corpus Analysis Oxford BlackwellStubbs M 2004 ldquoOn very frequent phrases in English Distributions functions and structuresrdquo

Plenary address given at ICAME 25 Verona Italy 19ndash23 MaySwain M 1998 ldquoFocus on form through conscious reflectionrdquo In C Doughty amp J Williams

(Eds) Focus on Form in Classroom Second Language Acquisition Cambridge Cambridge University Press 64ndash81

Swales J M 1990 Genre Analysis English in Academic and Research Settings Cambridge Cam-bridge University Press

Swales J M 2002 ldquoIntegrated and fragmented worlds EAP materials and corpus linguisticsrdquo In J Flowerdew (Ed) Academic Discourse Harlow UK Longman 150ndash64

Swales J M 2004 Research Genres Cambridge Cambridge University Press

Applying corpus linguistics to pedagogy 417

Tognini-Bonelli E 2001 Corpus Linguistics at Work AmsterdamPhiladelphia John Benja-mins

Tribble C amp Jones G 1990 Concordances in the Classroom Harlow UK LongmanVannestaringl M amp Lindquist H 2007 ldquoLearning English grammar with a corpus Experimenting

with concordancing in a university grammar courserdquo ReCALL 19 (3) 329ndash350Weber J-J 2001 ldquoA concordance- and genre-informed approach to ESP essay writingrdquo ELT

Journal 55 (1) 14ndash20Widdowson H G 1991 ldquoThe description and prescription of languagerdquo In J Alatis (Ed)

Georgetown University Round Table in Language and Linguistics Washington DC George-town University

Widdowson H G 1998 ldquoContext community and authentic languagerdquo TESOL Quarterly 32 (4) 705ndash716

Widdowson H G 2002 ldquoCorpora and language teaching tomorrowrdquo Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference Bertinoro Italy 29 July

Widdowson H G 2004 Text Context Pretext London Blackwell

Authorrsquos address

Lynne FlowerdewHong Kong University of Science and TechnologyLanguage CentreClear Water Bay RoadKowloonHong Kong SAR

lclynneusthk

the most relevant and useful look-up tool. Learner corpora, it is argued, are also of value here. However, the above can only be accomplished with strategy training, not only of students but also of teachers, as called for by Frankenberg-Garcia (2006). There is, therefore, still much to debate and develop in the application of corpus linguistics to pedagogy, a field first founded with the pioneering work of Tim Johns (1991a, 1991b) in the early nineties.

Notes

* This is a revised and extended version of a paper given at the 8th Teaching and Language Corpora Conference, Lisbon, Portugal, on 6th July 2008, and also an invited lecture given at the Hong Kong Association for Applied Linguistics on 5th March 2007. I wish to thank the two anonymous reviewers for their very helpful and constructive comments on an earlier draft of this paper. Any shortcomings naturally remain my own.

1. I use ‘corpus-based’ in this article to refer to any hands-on pedagogic applications of corpora. See Tognini-Bonelli (2001) for a discussion on her definitions of ‘corpus-based’ vs. ‘corpus-driven’. See also Lee (2008) for additional details on ‘corpus-informed’ and ‘corpus-supported’ linguistics.

2. The BLC is a freely available corpus at http://ysomeya.hp.infoseek.co.jp (accessed January 2009). It comprises one million words of business letters.

3. The BNCweb is a user-friendly interface for the 100-million word BNC. See http://homepage.mac.com/bncweb/manual/bncwebmanmain.htm (accessed December 2008) for more details, and also Hoffmann et al. (2008).

4. Information on MICASE can be found at http://quod.lib.umich.edu/m/micase (accessed July 2008).

5. JustTheWord is an online collocations program which interfaces with the 100-million-word BNC.

References

Aston, G. 1995. “Corpora in language pedagogy: Matching theory and practice”. In G. Cook & B. Seidlhofer (Eds.), Principle and Practice in Applied Linguistics. Oxford: Oxford University Press, 257–270.

Belcher, D. 2006. “English for Specific Purposes: Teaching to perceived needs and imagined futures in worlds of work, study, and everyday life”. TESOL Quarterly, 40 (1), 133–156.

Bernardini, S. 2000. “Systematising serendipity: Proposals for concordancing large corpora with language learners”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt: Peter Lang, 225–234.

Bernardini, S. 2002. “Exploring new directions for discovery learning”. In B. Kettemann & G. Marko (Eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, 165–182.

Bernardini, S. 2004. “Corpora in the classroom: An overview and some reflections on future developments”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 15–36.

Bhatia, V., Langton, N. & Lung, J. 2004. “Legal discourse: Opportunities and threats for corpus linguistics”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 203–231.

Biber, D., Conrad, S. & Cortes, V. 2004. “‘If you look at…’: Lexical bundles in university teaching and textbooks”. Applied Linguistics, 25 (3), 371–405.

Biber, D., Connor, U. & Upton, T. (Eds.) 2007a. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins.

Biber, D., Connor, U. & Upton, T. 2007b. “Conclusion: Comparing the analytical approaches”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 239–259.

Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL, 17 (1), 47–64.

Braun, S. 2007. “Integrating corpus work into secondary education: From data-driven learning to needs-driven corpora”. ReCALL, 19 (3), 307–328.

Burnard, L. 2004 (online). “Metadata for corpus work”. Available at: http://users.ox.ac.uk/~lou/wip/metadata.html (accessed January 2009).

Carter, R. & McCarthy, M. 1995. “Grammar and the spoken language”. Applied Linguistics, 16 (2), 141–158.

Celce-Murcia, M. 2002. “On the use of selected grammatical features in academic writing”. In M. Schleppegrell & C. Colombi (Eds.), Developing Advanced Literacy in First and Second Languages. Mahwah, NJ: Lawrence Erlbaum, 143–157.

Chambers, A. 2005. “Integrating corpus consultation in language studies”. Language Learning and Technology, 9 (2), 111–125.

Chambers, A. & O’Sullivan, I. 2004. “Corpus consultation and advanced learners’ writing skills in French”. ReCALL, 16 (1), 158–172.

Charles, M. 2007. “Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions”. Journal of English for Academic Purposes, 6 (4), 289–302.

Cook, G. 1998. “The uses of reality: A reply to Ronald Carter”. ELT Journal, 52 (1), 57–63.

Danielsson, P. 2007. “What constitutes a unit of analysis in language?”. Linguistik online, 31, 2/2007, 17–24.

Davies, M. 2004. “Student use of large annotated corpora to analyse syntactic variation”. In G. Aston, S. Bernardini & D. Stewart (Eds.), Corpora and Language Learners. Amsterdam/Philadelphia: John Benjamins, 257–269.

Flowerdew, L. 1998. “Corpus linguistic techniques applied to textlinguistics”. System, 26 (4), 541–552.

Flowerdew, L. 2003. “A combined corpus and systemic-functional analysis of the Problem-Solution pattern in a student and professional corpus of technical writing”. TESOL Quarterly, 37 (3), 489–511.

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Handford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?”. English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4. Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk

Applying corpus linguistics to pedagogy 413

Bernardini S 2002 ldquoExploring new directions for discovery learningrdquo In B Kettemann amp G Marco (Eds) Teaching and Learning by Doing Corpus Analysis Amsterdam Rodopi 165ndash182

Bernardini S 2004 ldquoCorpora in the classroom An overview and some reflections on future developmentsrdquo In J McH Sinclair (Ed) How to Use Corpora in Language Teaching Am-sterdamPhiladelphia John Benjamins 15ndash36

Bhatia V Langton N amp Lung J 2004 ldquoLegal discourse Opportunities and threats for corpus linguisticsrdquo In U Connor amp T Upton (Eds) Discourse in the Professions Perspectives from Corpus Linguistics AmsterdamPhiladelphia John Benjamins 203ndash231

Biber D Conrad S amp Cortes V 2004 ldquolsquoIf you look athelliprsquo Lexical bundles in university teaching and textbooksrdquo Applied Linguistics 25 (3) 371ndash405

Biber D Connor U amp Upton T (Eds) 2007a Discourse on the Move Using Corpus Analysis to Describe Discourse Structure AmsterdamPhiladelphia John Benjamins

Biber D Connor U amp Upton T 2007b ldquoConclusion Comparing the analytical approachesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Discourse Structure AmsterdamPhiladelphia John Benjamins 239ndash259

Braun S 2005 ldquoFrom pedagogically relevant corpora to authentic language learning contentsrdquo ReCALL 17 (1) 47ndash64

Braun S 2007 ldquoIntegrating corpus work into secondary education From data-driven learning to needs-driven corporardquo ReCALL 19 (3) 307ndash328

Burnard L 2004 online ldquoMetadata for corpus workrdquo Available at httpusersoxacuk~louwipmetadatahtml (accessed January 2009)

Carter R amp McCarthy M 1995 ldquoGrammar and the spoken languagerdquo Applied Linguistics 16 (2) 141ndash158

Celce-Murcia M 2002 ldquoOn the use of selected grammatical features in academic writingrdquo In M Schleppegrell amp C Colombi (Eds) Developing Advanced Literacy in First and Second Languages Mahwah NJ Lawrence Erlbaum 143ndash157

Chambers A 2005 ldquoIntegrating corpus consultation in language studiesrdquo Language Learning and Technology 9 (2) 111ndash125

Chambers A amp OrsquoSullivan I 2004 ldquoCorpus consultation and advanced learnersrsquo writing skills in Frenchrdquo ReCALL 16 (1) 158ndash172

Charles M 2007 ldquoReconciling top-down and bottom-up approaches to graduate writing Us-ing a corpus to teach rhetorical functionsrdquo Journal of English for Academic Purposes 6 (4) 289ndash302

Cook G 1998 ldquoThe uses of reality A reply to Ronald Carterrdquo ELT Journal 52 (1) 57ndash63Danielsson P 2007 ldquoWhat constitutes a unit of analysis in languagerdquo Linguistik online 31

22007 17ndash24Davies M 2004 ldquoStudent use of large annotated corpora to analyse syntactic variationrdquo In G

Aston S Bernardini amp D Stewart (Eds) Corpora and Language Learners AmsterdamPhiladelphia John Benjamins 257ndash269

Flowerdew L 1998 ldquoCorpus linguistic techniques applied to textlinguisticsrdquo System 26 (4) 541ndash552

Flowerdew L 2003 ldquoA combined corpus and systemic-functional analysis of the Problem-So-lution pattern in a student and professional corpus of technical writingrdquo TESOL Quarterly 37 (3) 489ndash511

414 Lynne Flowerdew

Flowerdew, L. 2004. “The argument for using specialised corpora to understand academic and professional language”. In U. Connor & T. Upton (Eds.), Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins, 11–33.

Flowerdew, L. 2005. “An integration of corpus-based and genre-based approaches to text analysis in EAP/ESP: Countering criticisms against corpus-based methodologies”. English for Specific Purposes, 24 (3), 321–332.

Flowerdew, L. 2006. “Texts, tools and contexts in corpus applications for writing”. Paper presented in invited academic session “Current Trends in Corpus Linguistics Research”, 40th Annual TESOL Convention, Tampa, Florida, 16th March.

Flowerdew, L. 2008a. Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Analysis. Amsterdam/Philadelphia: John Benjamins.

Flowerdew, L. 2008b. “Corpus linguistics for academic literacies mediated through discussion activities”. In D. Belcher & A. Hirvela (Eds.), The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, 268–287.

Flowerdew, L. In press. “Using corpora for writing instruction”. In M. McCarthy & A. O’Keeffe (Eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge.

Flowerdew, L. Forthcoming a. “Corpus-based discourse analysis”. In J. P. Gee & M. Hanford (Eds.), The Routledge Handbook of Discourse Analysis. London: Routledge.

Flowerdew, L. Forthcoming b. “ESP and corpus studies”. In D. Belcher, A. Johns & B. Paltridge (Eds.), New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.

Frankenberg-Garcia, A. 2006. “Raising teachers’ awareness to corpora”. Plenary paper given at the 7th Conference on Teaching and Language Corpora, Paris, 1–4 July.

Gardner, D. 2007. “Validating the construct of Word in applied corpus-based vocabulary research: A critical survey”. Applied Linguistics, 28 (2), 241–265.

Gavioli, L. 2005. Exploring Corpora for ESP Learning. Amsterdam/Philadelphia: John Benjamins.

Gavioli, L. & Aston, G. 2001. “Enriching reality: Language corpora in language pedagogy”. ELT Journal, 55 (3), 238–246.

Gilquin, G., Granger, S. & Paquot, M. 2007. “Learner corpora: The missing link in EAP pedagogy”. Journal of English for Academic Purposes, 6 (4), 319–335.

Granger, S. 1999. “Use of tenses by advanced EFL learners: Evidence from an error-tagged computer corpus”. In S. Hasselgard & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, 191–202.

Granger, S. 2004. “Practical applications of learner corpora”. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers. Bern: Peter Lang, 1–10.

Granger, S. & Meunier, F. (Eds.) 2008. Phraseology: An Interdisciplinary Perspective. Amsterdam/Philadelphia: John Benjamins.

Hahn, A. 2000. “Grammar at its best: The development of a rule- and corpus-based grammar of English tenses”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 193–206.

Hewings, M. 2002. “Using computer-based corpora in teaching”. Paper presented at the 36th TESOL Conference, Utah, March 2002.

Hewings, M. & Hewings, A. 2002. “‘It is interesting to note that…’: A comparative study of anticipatory ‘it’ in student and published writing”. English for Specific Purposes, 21 (4), 367–383.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. 2008. Corpus Linguistics with BNCweb – A Practical Guide. Bern: Peter Lang.

Hunston, S. & Francis, G. 2000. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. London: Longman.

Hyland, K. 2002. “Specificity revisited: How far should we go?” English for Specific Purposes, 21 (4), 385–395.

Hyland, K. 2004. Genre and Second Language Writing. Ann Arbor: University of Michigan Press.

Hyland, K. 2007. “As can be seen: Lexical bundles and disciplinary variation”. English for Specific Purposes, 27 (1), 4–21.

Hyland, K. 2008. “Academic clusters: Text patterning in published and postgraduate writing”. International Journal of Applied Linguistics, 18 (1), 41–62.

Johns, T. 1991a. “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, 293–313.

Johns, T. 1991b. “Should you be persuaded: Two examples of data-driven learning”. English Language Research Journal, 4, Department of English, University of Birmingham, 1–16.

Jones, J. 2007. “Vocabulary-based discourse units in biology research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 175–212.

Kaltenböck, G. & Mehlmauer-Larcher, B. 2005. “Computer corpora and the language classroom: On the potential and limitations of computer corpora in language teaching”. ReCALL, 17 (1), 65–84.

Kanoksilapatham, B. 2007. “Rhetorical moves in biochemistry research articles”. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam/Philadelphia: John Benjamins, 73–119.

Kennedy, G. 2008. “Phraseology and language pedagogy”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins, 21–41.

Krishnamurthy, R. & Kosem, I. 2007. “Issues in creating a corpus for EAP pedagogy and research”. Journal of English for Academic Purposes, 6 (4), 356–373.

Lee, D. 2008. “Corpora and discourse analysis: New ways of doing old things”. In V. K. Bhatia, J. Flowerdew & R. Jones (Eds.), Advances in Discourse Studies. London: Routledge, 86–99.

Lee, D. & Swales, J. M. 2006. “A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self compiled corpora”. English for Specific Purposes, 25 (1), 56–75.

McCarthy, M. 2001. Issues in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies. London: Routledge.

Meunier, F. 2002. “The pedagogic value of native and learner corpora in EFL grammar teaching”. In S. Granger, J. Hung & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam/Philadelphia: John Benjamins, 119–141.

Meunier, F. & Granger, S. (Eds.) 2008. Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.

Milton, J. 2006. “Resource-rich web-based feedback: Helping learners become independent writers”. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing. Cambridge: Cambridge University Press, 123–139.

Mukherjee, J. 2006. “Corpus linguistics and language pedagogy: The state of the art – and beyond”. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, 5–24.

Mukherjee, J. & Rohrbach, J.-M. 2006. “Rethinking applied corpus linguistics from a language-pedagogical perspective: New departures in learner corpus research”. In B. Kettemann & G. Marko (Eds.), Planning and Gluing Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, 205–231.

Nesselhauf, N. 2003. “The use of collocations by advanced learners of English and some implications for teaching”. Applied Linguistics, 24 (2), 223–242.

Nesselhauf, N. 2004. Collocations in a Learner Corpus. Amsterdam/Philadelphia: John Benjamins.

Noguchi, J. 2004. “A genre analysis and mini-corpora approach to support professional writing by non-native speakers”. English Corpus Studies, 11, 101–110.

O’Sullivan, I. 2007. “Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy”. ReCALL, 19 (3), 269–286.

Partington, A. 1998. Patterns and Meanings. Amsterdam/Philadelphia: John Benjamins.

Pravec, N. 2002. “Survey of learner corpora”. ICAME Journal, 26 (1), 8–14.

Seidlhofer, B. 2000. “Operationalising intertextuality: Using learner corpora for learning”. In L. Burnard & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective. Bern: Peter Lang, 207–223.

Seidlhofer, B. (Ed.) 2003. Controversies in Applied Linguistics (Section 2: Corpus Linguistics and Language Teaching). Oxford: Oxford University Press.

Sinclair, J. McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, J. McH. 1999. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam/Philadelphia: John Benjamins, 1–24.

Sinclair, J. McH. 2004a. “The search for units of meaning”. In J. McH. Sinclair (edited with R. Carter), Trust the Text. London: Routledge, 24–48.

Sinclair, J. McH. 2004b. “New evidence, new priorities, new attitudes”. In J. McH. Sinclair (Ed.), How to Use Corpora in Language Teaching. Amsterdam/Philadelphia: John Benjamins, 271–299.

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk

414 Lynne Flowerdew

Flowerdew L 2004 ldquoThe argument for using specialised corpora to understand academic and professional languagerdquo In U Connor amp T Upton (Eds) Discourse in the Professions Per-spectives from Corpus Linguistics AmsterdamPhiladelphia John Benjamins 11ndash33

Flowerdew L 2005 ldquoAn integration of corpus-based and genre-based approaches to text analy-sis in EAPESP Countering criticisms against corpus-based methodologiesrdquo English for Specific Purposes 24 (3) 321ndash332

Flowerdew L 2006 ldquoTexts tools and contexts in corpus applications for writingrdquo Paper pre-sented in invited academic session ldquoCurrent Trends in Corpus Linguistics Researchrdquo 40th Annual TESOL Convention Tampa Florida 16th March

Flowerdew L 2008a Corpus-based Analyses of the Problem-Solution Pattern A Phraseological Analysis AmsterdamPhiladelphia John Benjamins

Flowerdew L 2008b ldquoCorpus linguistics for academic literacies mediated through discussion activitiesrdquo In D Belcher amp A Hirvela (Eds) The Oral-Literate Connection Perspectives on L2 Speaking Writing and Other Media Interactions Ann Arbor MI University of Michigan Press 268ndash287

Flowerdew L In press ldquoUsing corpora for writing instructionrdquo In M McCarthy amp A OrsquoKeeffe (Eds) The Routledge Handbook of Corpus Linguistics London Routledge

Flowerdew L Forthcoming a ldquoCorpus-based discourse analysisrdquo In J P Gee amp M Hanford (Eds) The Routledge Handbook of Discourse Analysis London Routledge

Flowerdew L Forthcoming b ldquoESP and corpus studiesrdquo In D Belcher A Johns amp B Paltridge (Eds) New Directions for ESP Research Ann Arbor MI University of Michigan Press

Frankenberg-Garcia A 2006 ldquoRaising teachersrsquo awareness to corporardquo Plenary paper given at the 7th Conference on Teaching and Language Corpora Paris 1ndash4 July

Gardner D 2007 ldquoValidating the construct of Word in applied corpus-based vocabulary re-search A critical surveyrdquo Applied Linguistics 28 (2) 241ndash265

Gavioli L 2005 Exploring Corpora for ESP Learning AmsterdamPhiladelphia John Benja-mins

Gavioli L amp Aston G 2001 ldquoEnriching reality Language corpora in language pedagogyrdquo ELT Journal 55 (3) 238ndash246

Gilquin G Granger S amp Paquot M 2007 ldquoLearner corpora The missing link in EAP peda-gogyrdquo Journal of English for Academic Purposes 6 (4) 319ndash335

Granger S 1999 ldquoUse of tenses by advanced EFL learners Evidence from an error-tagged com-puter corpusrdquo In S Hasselgard amp S Oksefjell (Eds) Out of Corpora Studies in Honour of Stig Johansson Amsterdam Rodopi 191ndash202

Granger S 2004 ldquoPractical applications of learner corporardquo In B Lewandowska-Tomaszczyk (Ed) Practical Applications in Language and Computers Bern Peter Lang 1ndash10

Granger S amp Meunier F (Eds) 2008 Phraseology An Interdisciplinary Perspective AmsterdamPhiladelphia John Benjamins

Hahn A 2000 ldquoGrammar at its best The development of a rule- and corpus-based grammar of English tensesrdquo In L Burnard amp T McEnery (Eds) Rethinking Language Pedagogy from a Corpus Perspective Hamburg Peter Lang 193ndash206

Hewings M 2002 ldquoUsing computer-based corpora in teachingrdquo Paper presented at the 36th TESOL Conference Utah March 2002

Hewings M amp Hewings A 2002 ldquolsquoIt is interesting to note thathelliprsquo A comparative study of antic-ipatory lsquoitrsquo in student and published writingrdquo English for Specific Purposes 21 (4) 367ndash383

Applying corpus linguistics to pedagogy 415

Hoffmann S Evert S Smith N Lee D amp Berglund Prytz Y 2008 Corpus Linguistics with BNCweb minusA Practical Guide Bern Peter Lang

Hunston S amp Francis G 2000 Pattern Grammar A Corpus-driven Approach to the Lexical Grammar of English AmsterdamPhiladelphia John Benjamins

Hyland K 2000 Disciplinary Discourses Social Interactions in Academic Writing London Longman

Hyland K 2002 ldquoSpecificity revisited How far should we gordquo English for Specific Purposes 21 (4) 385ndash395

Hyland K 2004 Genre and Second Language Writing Ann Arbor University of Michigan PressHyland K 2007 ldquoAs can be seen Lexical bundles and disciplinary variationrdquo English for Specific

Purposes 27 (1) 4ndash21Hyland K 2008 ldquoAcademic clusters Text patterning in published and postgraduate writingrdquo

International Journal of Applied Linguistics 18 (1) 41ndash62Johns T 1991a ldquoFrom printout to handout Grammar and vocabulary teaching in the context of

data-driven learningrdquo In T Odlin (Ed) Perspectives on Pedagogical Grammar Cambridge Cambridge University Press 293ndash313

Johns T 1991b ldquoShould you be persuaded Two examples of data-driven learningrdquo English Lan-guage Research Journal 4 Department of English University of Birmingham 1ndash16

Jones J 2007 ldquoVocabulary-based discourse units in biology research articlesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Dis-course Structure AmsterdamPhiladelphia John Benjamins 175ndash212

Kaltenboumlck G amp Mehlmauer-Larcher B 2005 ldquoComputer corpora and the language classroom On the potential and limitations of computer corpora in language teachingrdquo ReCALL 17 (1) 65ndash84

Kanoksilapatham B 2007 ldquoRhetorical moves in biochemistry research articlesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Dis-course Structure AmsterdamPhiladelphia John Benjamins 73ndash119

Kennedy G 2008 ldquoPhraseology and language pedagogyrdquo In F Meunier amp S Granger (Eds) Phraseology in Foreign Language Learning and Teaching AmsterdamPhiladelphia John Benjamins 21ndash41

Krishnamurthy R amp Kosem I 2007 ldquoIssues in creating a corpus for EAP pedagogy and re-searchrdquo Journal of English for Academic Purposes 6 (4) 356ndash373

Lee D 2008 ldquoCorpora and discourse analysis New ways of doing old thingsrdquo In V K Bhatia J Flowerdew amp R Jones (Eds) Advances in Discourse Studies London Routledge 86ndash99

Lee D amp Swales J M 2006 ldquoA corpus-based EAP course for NNS doctoral students Moving from available specialized corpora to self compiled corporardquo English for Specific Purposes 25 (1) 56ndash75

McCarthy M 2001 Issues in Applied Linguistics Cambridge Cambridge University PressMcEnery T Xiao R amp Tono Y 2006 Corpus-based Language Studies London RoutledgeMeunier F 2002 ldquoThe pedagogic value of native and learner corpora in EFL grammar teach-

ingrdquo In S Granger J Hung amp S Petch-Tyson (Eds) Computer Learner Corpora Second Language Acquisition and Foreign Language Teaching AmsterdamPhiladelphia John Ben-jamins 119ndash141

Meunier F amp Granger S (Eds) 2008 Phraseology in Foreign Language Learning and Teaching AmsterdamPhiladelphia John Benjamins

416 Lynne Flowerdew

Milton J 2006 ldquoResource-rich web-based feedback Helping learners become independent writersrdquo In K Hyland amp F Hyland (Eds) Feedback in Second Language Writing Cam-bridge Cambridge University Press 123ndash139

Mukherjee J 2006 ldquoCorpus linguistics and language pedagogy The state of the art minus and be-yondrdquo In S Braun K Kohn amp J Mukherjee (Eds) Corpus Technology and Language Peda-gogy Frankfurt am Main Peter Lang 5ndash24

Mukherjee J amp Rohrbach J-M 2006 ldquoRethinking applied corpus linguistics from a language-pedagogical perspective New departures in learner corpus researchrdquo In B Kettemann amp G Marko (Eds) Planning and Gluing Corpora Inside the Applied Corpus Linguistrsquos Workshop Frankfurt am Main Peter Lang 205ndash231

Nesselhauf N 2003 ldquoThe use of collocations by advanced learners of English and some implica-tions for teachingrdquo Applied Linguistics 24 (2) 223ndash242

Nesselhauf N 2004 Collocations in a Learner Corpus AmsterdamPhiladelphia John Benja-mins

Noguchi J 2004 ldquoA genre analysis and mini-corpora approach to support professional writing by non-native speakersrdquo English Corpus Studies 11 101ndash110

OrsquoSullivan I 2007 ldquoEnhancing a process-oriented approach to literacy and language learning The role of corpus consultation literacyrdquo ReCALL 19 (3) 269ndash286

Partington A 1998 Patterns and Meanings AmsterdamPhiladelphia John BenjaminsPravec N 2002 ldquoSurvey of learner corporardquo ICAME Journal 26 (1) 8ndash14Seidlhofer B 2000 ldquoOperationalising intertextuality Using learner corpora for learningrdquo In L

Burnard amp T McEnery (Eds) Rethinking Language Pedagogy from a Corpus Perspective Bern Peter Lang 207ndash223

Seidlhofer B (Ed) 2003 Controversies in Applied Linguistics (Section 2 Corpus Linguistics and Language Teaching) Oxford Oxford University Press

Sinclair J McH 1991 Corpus Concordance Collocation Oxford Oxford University PressSinclair J McH 1999 ldquoThe lexical itemrdquo In E Weigand (Ed) Contrastive Lexical Semantics

AmsterdamPhiladelphia John Benjamins 1ndash24Sinclair J McH 2004a ldquoThe search for units of meaningrdquo In J McH Sinclair (edited with R

Carter) Trust the Text London Routledge 24ndash48Sinclair J McH 2004b ldquoNew evidence new priorities new attitudesrdquo In J McH Sinclair (Ed)

How to Use Corpora in Language Teaching AmsterdamPhiladelphia John Benjamins 271ndash299

Stubbs M 1996 Text and Corpus Analysis Oxford BlackwellStubbs M 2004 ldquoOn very frequent phrases in English Distributions functions and structuresrdquo

Plenary address given at ICAME 25 Verona Italy 19ndash23 MaySwain M 1998 ldquoFocus on form through conscious reflectionrdquo In C Doughty amp J Williams

(Eds) Focus on Form in Classroom Second Language Acquisition Cambridge Cambridge University Press 64ndash81

Swales J M 1990 Genre Analysis English in Academic and Research Settings Cambridge Cam-bridge University Press

Swales J M 2002 ldquoIntegrated and fragmented worlds EAP materials and corpus linguisticsrdquo In J Flowerdew (Ed) Academic Discourse Harlow UK Longman 150ndash64

Swales J M 2004 Research Genres Cambridge Cambridge University Press

Applying corpus linguistics to pedagogy 417

Tognini-Bonelli E 2001 Corpus Linguistics at Work AmsterdamPhiladelphia John Benja-mins

Tribble C amp Jones G 1990 Concordances in the Classroom Harlow UK LongmanVannestaringl M amp Lindquist H 2007 ldquoLearning English grammar with a corpus Experimenting

with concordancing in a university grammar courserdquo ReCALL 19 (3) 329ndash350Weber J-J 2001 ldquoA concordance- and genre-informed approach to ESP essay writingrdquo ELT

Journal 55 (1) 14ndash20Widdowson H G 1991 ldquoThe description and prescription of languagerdquo In J Alatis (Ed)

Georgetown University Round Table in Language and Linguistics Washington DC George-town University

Widdowson H G 1998 ldquoContext community and authentic languagerdquo TESOL Quarterly 32 (4) 705ndash716

Widdowson H G 2002 ldquoCorpora and language teaching tomorrowrdquo Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference Bertinoro Italy 29 July

Widdowson H G 2004 Text Context Pretext London Blackwell

Authorrsquos address

Lynne FlowerdewHong Kong University of Science and TechnologyLanguage CentreClear Water Bay RoadKowloonHong Kong SAR

lclynneusthk

Applying corpus linguistics to pedagogy 415

Hoffmann S Evert S Smith N Lee D amp Berglund Prytz Y 2008 Corpus Linguistics with BNCweb minusA Practical Guide Bern Peter Lang

Hunston S amp Francis G 2000 Pattern Grammar A Corpus-driven Approach to the Lexical Grammar of English AmsterdamPhiladelphia John Benjamins

Hyland K 2000 Disciplinary Discourses Social Interactions in Academic Writing London Longman

Hyland K 2002 ldquoSpecificity revisited How far should we gordquo English for Specific Purposes 21 (4) 385ndash395

Hyland K 2004 Genre and Second Language Writing Ann Arbor University of Michigan PressHyland K 2007 ldquoAs can be seen Lexical bundles and disciplinary variationrdquo English for Specific

Purposes 27 (1) 4ndash21Hyland K 2008 ldquoAcademic clusters Text patterning in published and postgraduate writingrdquo

International Journal of Applied Linguistics 18 (1) 41ndash62Johns T 1991a ldquoFrom printout to handout Grammar and vocabulary teaching in the context of

data-driven learningrdquo In T Odlin (Ed) Perspectives on Pedagogical Grammar Cambridge Cambridge University Press 293ndash313

Johns T 1991b ldquoShould you be persuaded Two examples of data-driven learningrdquo English Lan-guage Research Journal 4 Department of English University of Birmingham 1ndash16

Jones J 2007 ldquoVocabulary-based discourse units in biology research articlesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Dis-course Structure AmsterdamPhiladelphia John Benjamins 175ndash212

Kaltenboumlck G amp Mehlmauer-Larcher B 2005 ldquoComputer corpora and the language classroom On the potential and limitations of computer corpora in language teachingrdquo ReCALL 17 (1) 65ndash84

Kanoksilapatham B 2007 ldquoRhetorical moves in biochemistry research articlesrdquo In D Biber U Connor amp T Upton (Eds) Discourse on the Move Using Corpus Analysis to Describe Dis-course Structure AmsterdamPhiladelphia John Benjamins 73ndash119

Kennedy G 2008 ldquoPhraseology and language pedagogyrdquo In F Meunier amp S Granger (Eds) Phraseology in Foreign Language Learning and Teaching AmsterdamPhiladelphia John Benjamins 21ndash41

Krishnamurthy R amp Kosem I 2007 ldquoIssues in creating a corpus for EAP pedagogy and re-searchrdquo Journal of English for Academic Purposes 6 (4) 356ndash373

Lee D 2008 ldquoCorpora and discourse analysis New ways of doing old thingsrdquo In V K Bhatia J Flowerdew amp R Jones (Eds) Advances in Discourse Studies London Routledge 86ndash99

Lee D amp Swales J M 2006 ldquoA corpus-based EAP course for NNS doctoral students Moving from available specialized corpora to self compiled corporardquo English for Specific Purposes 25 (1) 56ndash75

McCarthy M 2001 Issues in Applied Linguistics Cambridge Cambridge University PressMcEnery T Xiao R amp Tono Y 2006 Corpus-based Language Studies London RoutledgeMeunier F 2002 ldquoThe pedagogic value of native and learner corpora in EFL grammar teach-

ingrdquo In S Granger J Hung amp S Petch-Tyson (Eds) Computer Learner Corpora Second Language Acquisition and Foreign Language Teaching AmsterdamPhiladelphia John Ben-jamins 119ndash141

Meunier F amp Granger S (Eds) 2008 Phraseology in Foreign Language Learning and Teaching AmsterdamPhiladelphia John Benjamins

416 Lynne Flowerdew

Milton J 2006 ldquoResource-rich web-based feedback Helping learners become independent writersrdquo In K Hyland amp F Hyland (Eds) Feedback in Second Language Writing Cam-bridge Cambridge University Press 123ndash139

Mukherjee J 2006 ldquoCorpus linguistics and language pedagogy The state of the art minus and be-yondrdquo In S Braun K Kohn amp J Mukherjee (Eds) Corpus Technology and Language Peda-gogy Frankfurt am Main Peter Lang 5ndash24

Mukherjee J amp Rohrbach J-M 2006 ldquoRethinking applied corpus linguistics from a language-pedagogical perspective New departures in learner corpus researchrdquo In B Kettemann amp G Marko (Eds) Planning and Gluing Corpora Inside the Applied Corpus Linguistrsquos Workshop Frankfurt am Main Peter Lang 205ndash231

Nesselhauf N 2003 ldquoThe use of collocations by advanced learners of English and some implica-tions for teachingrdquo Applied Linguistics 24 (2) 223ndash242

Nesselhauf N 2004 Collocations in a Learner Corpus AmsterdamPhiladelphia John Benja-mins

Noguchi J 2004 ldquoA genre analysis and mini-corpora approach to support professional writing by non-native speakersrdquo English Corpus Studies 11 101ndash110

OrsquoSullivan I 2007 ldquoEnhancing a process-oriented approach to literacy and language learning The role of corpus consultation literacyrdquo ReCALL 19 (3) 269ndash286

Partington A 1998 Patterns and Meanings AmsterdamPhiladelphia John BenjaminsPravec N 2002 ldquoSurvey of learner corporardquo ICAME Journal 26 (1) 8ndash14Seidlhofer B 2000 ldquoOperationalising intertextuality Using learner corpora for learningrdquo In L

Burnard amp T McEnery (Eds) Rethinking Language Pedagogy from a Corpus Perspective Bern Peter Lang 207ndash223

Seidlhofer B (Ed) 2003 Controversies in Applied Linguistics (Section 2 Corpus Linguistics and Language Teaching) Oxford Oxford University Press

Sinclair J McH 1991 Corpus Concordance Collocation Oxford Oxford University PressSinclair J McH 1999 ldquoThe lexical itemrdquo In E Weigand (Ed) Contrastive Lexical Semantics

AmsterdamPhiladelphia John Benjamins 1ndash24Sinclair J McH 2004a ldquoThe search for units of meaningrdquo In J McH Sinclair (edited with R

Carter) Trust the Text London Routledge 24ndash48Sinclair J McH 2004b ldquoNew evidence new priorities new attitudesrdquo In J McH Sinclair (Ed)

How to Use Corpora in Language Teaching AmsterdamPhiladelphia John Benjamins 271ndash299

Stubbs M 1996 Text and Corpus Analysis Oxford BlackwellStubbs M 2004 ldquoOn very frequent phrases in English Distributions functions and structuresrdquo

Plenary address given at ICAME 25 Verona Italy 19ndash23 MaySwain M 1998 ldquoFocus on form through conscious reflectionrdquo In C Doughty amp J Williams

(Eds) Focus on Form in Classroom Second Language Acquisition Cambridge Cambridge University Press 64ndash81

Swales J M 1990 Genre Analysis English in Academic and Research Settings Cambridge Cam-bridge University Press

Swales J M 2002 ldquoIntegrated and fragmented worlds EAP materials and corpus linguisticsrdquo In J Flowerdew (Ed) Academic Discourse Harlow UK Longman 150ndash64

Swales J M 2004 Research Genres Cambridge Cambridge University Press

Applying corpus linguistics to pedagogy 417

Tognini-Bonelli E 2001 Corpus Linguistics at Work AmsterdamPhiladelphia John Benja-mins

Tribble C amp Jones G 1990 Concordances in the Classroom Harlow UK LongmanVannestaringl M amp Lindquist H 2007 ldquoLearning English grammar with a corpus Experimenting

with concordancing in a university grammar courserdquo ReCALL 19 (3) 329ndash350Weber J-J 2001 ldquoA concordance- and genre-informed approach to ESP essay writingrdquo ELT

Journal 55 (1) 14ndash20Widdowson H G 1991 ldquoThe description and prescription of languagerdquo In J Alatis (Ed)

Georgetown University Round Table in Language and Linguistics Washington DC George-town University

Widdowson H G 1998 ldquoContext community and authentic languagerdquo TESOL Quarterly 32 (4) 705ndash716

Widdowson H G 2002 ldquoCorpora and language teaching tomorrowrdquo Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference Bertinoro Italy 29 July

Widdowson H G 2004 Text Context Pretext London Blackwell

Authorrsquos address

Lynne FlowerdewHong Kong University of Science and TechnologyLanguage CentreClear Water Bay RoadKowloonHong Kong SAR

lclynneusthk

416 Lynne Flowerdew

Milton J 2006 ldquoResource-rich web-based feedback Helping learners become independent writersrdquo In K Hyland amp F Hyland (Eds) Feedback in Second Language Writing Cam-bridge Cambridge University Press 123ndash139

Mukherjee J 2006 ldquoCorpus linguistics and language pedagogy The state of the art minus and be-yondrdquo In S Braun K Kohn amp J Mukherjee (Eds) Corpus Technology and Language Peda-gogy Frankfurt am Main Peter Lang 5ndash24

Mukherjee J amp Rohrbach J-M 2006 ldquoRethinking applied corpus linguistics from a language-pedagogical perspective New departures in learner corpus researchrdquo In B Kettemann amp G Marko (Eds) Planning and Gluing Corpora Inside the Applied Corpus Linguistrsquos Workshop Frankfurt am Main Peter Lang 205ndash231

Nesselhauf N 2003 ldquoThe use of collocations by advanced learners of English and some implica-tions for teachingrdquo Applied Linguistics 24 (2) 223ndash242

Nesselhauf N 2004 Collocations in a Learner Corpus AmsterdamPhiladelphia John Benja-mins

Noguchi J 2004 ldquoA genre analysis and mini-corpora approach to support professional writing by non-native speakersrdquo English Corpus Studies 11 101ndash110

OrsquoSullivan I 2007 ldquoEnhancing a process-oriented approach to literacy and language learning The role of corpus consultation literacyrdquo ReCALL 19 (3) 269ndash286

Partington A 1998 Patterns and Meanings AmsterdamPhiladelphia John BenjaminsPravec N 2002 ldquoSurvey of learner corporardquo ICAME Journal 26 (1) 8ndash14Seidlhofer B 2000 ldquoOperationalising intertextuality Using learner corpora for learningrdquo In L

Burnard amp T McEnery (Eds) Rethinking Language Pedagogy from a Corpus Perspective Bern Peter Lang 207ndash223

Seidlhofer B (Ed) 2003 Controversies in Applied Linguistics (Section 2 Corpus Linguistics and Language Teaching) Oxford Oxford University Press

Sinclair J McH 1991 Corpus Concordance Collocation Oxford Oxford University PressSinclair J McH 1999 ldquoThe lexical itemrdquo In E Weigand (Ed) Contrastive Lexical Semantics

AmsterdamPhiladelphia John Benjamins 1ndash24Sinclair J McH 2004a ldquoThe search for units of meaningrdquo In J McH Sinclair (edited with R

Carter) Trust the Text London Routledge 24ndash48Sinclair J McH 2004b ldquoNew evidence new priorities new attitudesrdquo In J McH Sinclair (Ed)

How to Use Corpora in Language Teaching AmsterdamPhiladelphia John Benjamins 271ndash299

Stubbs, M. 1996. Text and Corpus Analysis. Oxford: Blackwell.

Stubbs, M. 2004. “On very frequent phrases in English: Distributions, functions and structures”. Plenary address given at ICAME 25, Verona, Italy, 19–23 May.

Swain, M. 1998. “Focus on form through conscious reflection”. In C. Doughty & J. Williams (Eds.), Focus on Form in Classroom Second Language Acquisition. Cambridge: Cambridge University Press, 64–81.

Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.

Swales, J. M. 2002. “Integrated and fragmented worlds: EAP materials and corpus linguistics”. In J. Flowerdew (Ed.), Academic Discourse. Harlow, UK: Longman, 150–64.

Swales, J. M. 2004. Research Genres. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.

Tribble, C. & Jones, G. 1990. Concordances in the Classroom. Harlow, UK: Longman.

Vannestål, M. & Lindquist, H. 2007. “Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course”. ReCALL, 19 (3), 329–350.

Weber, J.-J. 2001. “A concordance- and genre-informed approach to ESP essay writing”. ELT Journal, 55 (1), 14–20.

Widdowson, H. G. 1991. “The description and prescription of language”. In J. Alatis (Ed.), Georgetown University Round Table in Language and Linguistics. Washington, DC: Georgetown University.

Widdowson, H. G. 1998. “Context, community and authentic language”. TESOL Quarterly, 32 (4), 705–716.

Widdowson, H. G. 2002. “Corpora and language teaching tomorrow”. Keynote lecture delivered at the Fifth Teaching and Language Corpora Conference, Bertinoro, Italy, 29 July.

Widdowson, H. G. 2004. Text, Context, Pretext. London: Blackwell.

Author’s address

Lynne Flowerdew
Hong Kong University of Science and Technology
Language Centre
Clear Water Bay Road
Kowloon
Hong Kong SAR

lclynne@ust.hk
