
Investigating spoken academic English with corpus tools

Mario Cal Varela

Francisco J. Fernández Polo

Universidade de Santiago de Compostela, Spain

24-25 September 2015

Outline

Background: the NIUS survey

A corpus of CPs: compilation and transcription issues

Investigated topics

A look ahead: the CASE project

Survey of English language needs at the USC

• Mailed questionnaire + interviews

• 5 research areas – 25 departments

• 213 valid responses (25% return rate)

• USC > Spain > Southern Europe?

How much English do you need to…?

Self-assess your current competence in English to…

Why a corpus of CPs?

• CPs: a key research genre.

• Little research on CPs.

• Small, self-compiled corpora (Rowley-Jolivet & Carter-Thomas 2005, Webber 2005).

• CPs underrepresented in existing corpora (MICASE, BASE; cf. ELFA).

Project aim: Description of the CP genre

• Structural and lexico-grammatical features.

• Pragmatic and discourse strategies.

• Multimodal resources.

• Variability across speaker groups (NS/NNS, expertise, spoken vs. read aloud…).

Current contents

• Paper presentation and discussion sections; Linguistics conferences.

• Video and audio recordings, PPTs and handouts.

• Field notes (audience, physical setting, etc.).

• Consent forms and speakers’ profiles.

• Current holdings: 30+ events.

• Sample limitations: NS vs. NNS, field.

Recording

USC professional services

– Better quality video

– Stronger observer effect.

– Recording staff unaware of research agenda.

– Excessive emphasis on video quality to the detriment of sound.

Researcher-recorder

– Reduced effect of observer's paradox.

– Naturalistic insider view.

– Researcher in charge.

– Lower quality video recording.

Recording

Lessons

– There is a trade-off between recording quality and unobtrusiveness.

– Ideal recording equipment for CPs: 3 cameras (presenter + screen + audience) and 2 microphones (1 tie-clip + 1 ambient mic).

Transcription and annotation: How detailed?

• General research goals, in-group use > tentative transcription decisions.

• Broad orthographic transcription, restricted mark-up (MICASE).

• Naturalness: features of impromptu speech (repetitions, false starts…).

• Approximate phonological transcription of hesitation, backchannel cues…

• Punctuation: pause duration and basic intonation.

• Spelling: normalised to SBE. Some non-standard contractions (sorta, gonna…).

• Layout: 1 utterance per line.

Preliminary mark-up

• MICASE conventions as a starting point.

• No POS tagging… yet.

• Set of labels used:

– Laughter and humour

– Reading

– Gesture

– Slides

– Contextual events

Contextual information: samples

8. but uhm the EFL classroom that i’ll be talking about today is my classroom the teaching that i do, <HUMOUR>so anything embarrassing you see is all my fault</HUMOUR> <LAUGHTER: SS laugh>

9. <SLIDE: new slide; title+bulleted line> why do i analyze my classes?

10. what's my motivation?

Contextual information: samples

130. <HUMOUR: aside> i’m sorry but teachers get anxious too, and they are very proficient, most of the time.

131. <READING>anxiety has negati- negative impact on language performance</READING>.

132. we found as a matter of fact that sometimes the anxiety was a key to learn.

133. now a key thing here <GESTURE: points at a specific point on the screen presentation> is performance and learn.
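The labels in these samples follow a regular pattern (an upper-case name, an optional comment after a colon, and sometimes a closing tag), so they can be pulled out automatically. The following is only a minimal sketch in Python, assuming that tags never nest angle brackets; the function names `extract_tags` and `strip_tags` are illustrative and not part of the project's tooling.

```python
import re
from collections import Counter

# Matches an opening tag such as <SLIDE: new slide> or <HUMOUR>.
# Closing tags (</HUMOUR>) are not matched because "/" follows the "<".
TAG_RE = re.compile(r"<([A-Z]+)(?::\s*([^>]*))?>")

def extract_tags(utterance):
    """Return (label, comment) pairs for each opening tag in one utterance."""
    return [(m.group(1), (m.group(2) or "").strip())
            for m in TAG_RE.finditer(utterance)]

def strip_tags(utterance):
    """Remove opening and closing tags, keeping only the transcribed words."""
    text = re.sub(r"</?[A-Z]+(?::[^>]*)?>", " ", utterance)
    return " ".join(text.split())

# Utterances copied from the samples above (line numbers omitted).
sample = [
    "<HUMOUR: aside> i’m sorry but teachers get anxious too",
    "<READING>anxiety has negati- negative impact on language performance</READING>.",
    "now a key thing here <GESTURE: points at a specific point on the screen presentation> is performance and learn.",
]

# How often each label occurs in the sample.
counts = Counter(label for line in sample for label, _ in extract_tags(line))
print(counts)
print(strip_tags(sample[0]))
```

Counting labels this way gives a quick overview of where humour, reading or gesture episodes occur; interpreting them still requires the video.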

Investigated topics

Quantitative:

• Speakers’ self-references

• Audience references

• Imperatives and other directive expressions

Qualitative:

• Rhetorical structure of specific CP sections

• The role of humour

• The role of gestures and visuals

The role of humour

Functions of humour

• creates solidarity

• reinforces in-group membership

• mitigates conflict

“An orientation toward humour by one or more participants from the outset in a potentially thorny interaction can mitigate controversy and prevent serious conflict” (Norrick & Spitz 2008)

Defining and researching humour

• Non-seriousness (Chafe 2007)

• Evidence for humour: language and gesture

“We cannot always tell from a transcript or an audio recording what might have been intended or taken as humorous” (Swales, J. 2004. Research Genres, on the limitations of MICASE).

S1: okay, good afternoon. our first speaker is X, and he_, his talk has a very interesting and long title <GESTURE: looking at presenter inquisitively; raising eyebrows and smiling>

S2: yes, <LAUGHTER: S1> well, i know <OVERLAP> (xx) yes</OVERLAP>.

S1: <READING: from notes><OVERLAP> title X </OVERLAP> title X continues…

Humour in CPs: the importance of context

Humour in CPs: research foci

• Size of the humorous episodes?

• Place in CPs?

• Target?

Size & Place in CPs

• Generally short.

• Few long episodes in NS data only.

• Tend to cluster around moments of tension: outset of talk, before question-time, complex data…, especially in NS talks.

Non-seriousness in CPs: target

Undesirable situations

– incongruences between speaker’s announced plans and actual presentation;

– mismatches between thoughts and actually uttered words;

– slips of the tongue;

– mismatches between slides and speaker’s words;

– unreasonably long examples;

– self-deprecation: e.g. methodological flaw;

– running out of time;

– citing work of a member of the audience.

Non-seriousness: an excessively long example

<SLIDE: run-on text> so six is a very long example <LAUGHTER: speaker and audience> erm well i can summarise (the part) up to the deontic expression <LAUGHTER: speaker>

Humour in CPs: target

Abnormal situations

– Surprising/unexpected results

– Questionable conceptual distinction

– Unusual method or terminology

Visuals: graphs and tables

• 23 speakers (9 M; 14 F) / 19 CPs

• 113 episodes

• Text (16), graph (36) or table (51).

• Presenter variables:

– Expertise (1-3)

– NNS vs. NS

• Type of event

Full episode

• Introducing visual element

• Describing element

• Identifying patterns in data

• Interpreting data

81. ... right, now, moving on to results.

82. <SLIDE: graph> let me show you this table, mm? <GESTURE: turns to look at screen>

83. i think it uh it speaks by itself <HUMOUR><LAUGHTER: S1 laughs>

84. but let me explain it to you.

86. and by the way at the bottom of the bar uh there are positive critical comments and at the top are the negative ones.

87. and as you can see, while the frequency of positive critical comments is statistically similar in the two corpora the frequency of negative critical comments is radically different, with one hundred and seventy in English and only forty one in Spanish.

89. but uh if you look at the figures, about eighty percent of the Spanish reviews contained between zero and three critical comments

90. and the most uh uh ne- negative critical comments i mean, and uh the most frequent uh i mean the mode was zero critical comments in in the reviews, in the Spanish reviews.

91. okay uhm so it could be said that our findings show that Anglo-American uh book-review writers display a clear critical attitude towards the book under review

82. okay and now the actual <SLIDE: table+bar graph> pièce de résistance the use of interactional metadiscourse.

83. in this graph you can see the use of hedges boosters and attitude markers taken together

84. and it diminishes over time

85. so there is less use of these interactional markers in the Journal of Pragmatics as time goes by.

now, eh in this case of vague past time reference which I had identified as an area where the_ one could expect to see eh quite marked differences between eh national and other varieties eh <SLIDE: bars are shown one by one> it is striking that white and black informants eh responded quite differently.

i wasn’t prepared for that very marked difference which came out

and what you can see there is a various_ a very clear difference,

i tested this for statistical significance by means of the t-test applied to independent pairs

and as you can see both of these differences perfect and preterite turned out to be significant at the five per cent level in the case of the perfect at the point one percent level in the case of the preterite.

and then eh eh that eh put me on on the track of the eh importance that the distinction between blacks and whites eh could be (expected to play) in this test

60. er so perhaps not the best illustrative examples.

61. what i want to show here however is ….

62. now this is the study of the occurrence of third person present simple verb forms in erm the corpus that i’ve collected.

63. erm. and you can see here that,

64. well i i i’ll talk through it <GESTURE: turns to screen> you’ve got too many columns.

65. i’ve divided just the all of the verbs in the study into main verbs on the one hand and then auxiliary verbs on the other.

66. and then you can see a fairly even distribution of occurrences of third person S in other words the conventional standard E-N-L form you know? and the third person zero form you know? a hundred and three to a hundred and eight okay?

67. so you might think well it’s a fairly er even distribution there

68. but what’s very interesting is that the pattern emerges especially when you look at auxiliary verbs you know?

69. so the third person zero as a a feature of lingua franca talk occurs primarily on main verbs

Directives in CPs

Directives in CPs. Research Aims.

• Native vs non-native usage. Pedagogic style? Cf. MICASE

• CPs vs Research Articles. Cf. Swales et al. 1998; Hyland 2002.

Directive expressions considered

• Imperative clauses

• Let us/me-imperatives

• It is important, essential, etc. + to-clause (audience oriented with directive force)

• you/we + modal verb (must/have to/need)

• If(when)-clauses: if you/we (with directive force)

• As-clauses: as you/we (with directive force)
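Some of the expression types listed above have a fairly fixed surface form, so candidate hits can be retrieved with simple pattern matching before manual checking. The sketch below, in Python, is an illustration only: the three patterns are simplified assumptions rather than the search strings actually used in the study, and surface matching cannot decide whether a hit really has directive force.

```python
import re

# Hypothetical, simplified patterns for three of the directive types.
# They find candidates only; directive force must be checked by hand.
PATTERNS = {
    "let-imperative": re.compile(r"\blet (?:me|us)\b"),
    "you/we + modal": re.compile(r"\b(?:you|we) (?:must|have to|need to)\b"),
    "as-clause": re.compile(r"\bas (?:you|we) (?:can )?see\b"),
}

def count_directives(utterances):
    """Count candidate hits per directive type in a list of utterances."""
    counts = {name: 0 for name in PATTERNS}
    for utterance in utterances:
        text = utterance.lower()
        for name, pattern in PATTERNS.items():
            counts[name] += len(pattern.findall(text))
    return counts

# Utterances taken from the transcript samples in this presentation.
sample = [
    "82. <SLIDE: graph> let me show you this table, mm?",
    "84. but let me explain it to you.",
    "87. and as you can see, while the frequency of positive critical comments is statistically similar in the two corpora",
]

print(count_directives(sample))
```

Each hit would still need to be read in context, since a string such as "as you can see" may or may not be directing the audience's attention.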

Verbs with directive force: CPs vs RAs

CPs
see                                               85
look                                              32
bear/keep (in mind)                                7
consider                                           6
remember                                           4
compare, note, turn, take into account             2
conclude, imagine                                  1

RAs (Swales et al. 1998)
see                                               87
consider                                          71
note                                              38
suppose                                           25
recall                                             6
define                                             4
classify, insert, assume                           3
contrast, calculate, notice, imagine, denote, …    2

See-based directives: NS vs NNS

                     NNS            NS
                     N    /1000     N    /1000
you/we can see       16   0.39      20   1.05
you/we see           19   0.46      0    0
as you/we can see    18   0.44      0    0
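The /1000 columns are normalised frequencies: the raw count divided by the number of running words in the subcorpus, multiplied by 1,000. The slides do not state the subcorpus sizes, so the word counts in the Python sketch below (41,000 for NNS and 19,000 for NS) are assumptions, rounded from what the reported rates imply; they are not figures given by the authors.

```python
def per_thousand(count, corpus_words):
    """Normalised frequency: occurrences per 1,000 running words."""
    return count / corpus_words * 1000

def implied_corpus_size(count, rate_per_thousand):
    """Back-calculate the word count implied by a raw count and its rate."""
    return count / rate_per_thousand * 1000

# ASSUMED subcorpus sizes (not stated in the slides), rounded from the
# sizes implied by the reported rates.
NNS_WORDS = 41_000
NS_WORDS = 19_000

# "you/we can see": 16 hits in the NNS data, 20 hits in the NS data.
print(round(per_thousand(16, NNS_WORDS), 2))   # 0.39
print(round(per_thousand(20, NS_WORDS), 2))    # 1.05

# Size implied by the NNS row of the table.
print(round(implied_corpus_size(16, 0.39)))    # 41026
```

Normalising matters here because the NNS subcorpus appears to be roughly twice the size of the NS one, so raw counts alone would understate how frequent "you/we can see" is in the NS talks.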

You vs we-based directives

                   NNS            NS
                   N    /1000     N    /1000
You-directives     56   1.37      16   0.85
We-directives      26   0.64      22   1.16

Readers are mostly explicitly brought into the text as discourse participants by the use of personal pronouns, most commonly the inclusive we (…) You and your occur only rarely (…) This widespread avoidance may indicate that writers generally seek to circumvent the stark detachment from their audience that you suggests. (Hyland, 2001)

Let me and let us imperatives: CPs vs lectures

                 NS          NNS          MICASE Small Lectures
let me + verb    2           8            29
let us + verb    5           21           59
TOTAL            7 (0.37)    29 (0.71)    88 (1.63)

A look ahead: The CASE project

• Informal conversations via Skype

• Advanced L2 use of English

• Academic topics

• Duration (45 mins)

• Multilayer transcription

Exploiting CASE

• Interaction management: speaking turns,

communicative strategies, body language…

• Identity issues: projecting self-identity,

stereotypes…

• Appropriating the code: negotiation of

meaning, linguistic creativeness…

