Investigating spoken academic English with corpus tools Mario Cal Varela Francisco J. Fernández Polo Universidade de Santiago de Compostela, Spain 24-25 September 2015


Investigating spoken academic English with corpus tools

Mario Cal Varela

Francisco J. Fernández Polo

Universidade de Santiago de Compostela, Spain

24-25 September 2015


Outline

Background: the NIUS survey

A corpus of CPs: compilation and transcription issues

Investigated topics

A look ahead: the CASE project


Survey of English language needs at the USC

• Mailed questionnaire + interviews

• 5 research areas – 25 departments

• 213 valid responses (25% return rate)

• USC > Spain > Southern Europe?


How much English do you need to…?


Self-assess your current competence in English to…


Why a corpus of CPs?

• CPs: a key research genre.

• Little research on CPs.

• Small, self-compiled corpora (Rowley-Jolivet & Carter-Thomas 2005, Webber 2005).

• CPs underrepresented in existing corpora (MICASE, BASE; cf. ELFA).


Project aim: Description of the CP genre

• Structural and lexico-grammatical features.

• Pragmatic and discourse strategies.

• Multimodal resources.

• Variability across speaker groups (NS/NNS, expertise, spoken vs. read aloud…).


Current contents

• Paper presentation and discussion sections; Linguistics conferences.

• Video and audio recordings, PPTs and handouts.

• Field notes (audience, physical setting, etc.).

• Consent forms and speakers' profiles.

• Current holdings: 30+ events.

• Sample limitations: NS vs. NNS, field.


Recording

USC professional services

– Better quality video

– Stronger observer effect.

– Recording staff unaware of research agenda.

– Excessive emphasis on video quality to the detriment of sound.

Researcher-recorder

– Reduced effect of observer's paradox.

– Naturalistic insider view.

– Researcher in charge.

– Lower quality video recording.


Recording

Lessons

– There is a trade-off between recording quality and unobtrusiveness.

– Ideal recording equipment for CPs: 3 cameras (presenter + screen + audience) and 2 microphones (1 tie-clip + 1 ambient mic).


Transcription and annotation: How detailed?

• General research goals, in-group use > tentative transcription decisions.

• Broad orthographic transcription, restricted mark-up (MICASE).

• Naturalness: features of impromptu speech (repetitions, false starts…).

• Approximate phonological transcription of hesitation, backchannel cues…

• Punctuation: pause duration and basic intonation.

• Spelling: normalised to SBE. Some non-standard contractions (sorta, gonna…).

• Layout: 1 utterance per line.


Preliminary mark-up

• MICASE conventions as a starting point.

• No POS tagging… yet.

• Set of labels used:

– Laughter and humour

– Reading

– Gesture

– Slides

– Contextual events


Contextual information: samples

8. but uhm the EFL classroom that i’ll be talking about today is my classroom the teaching that i do, <HUMOUR>so anything embarrassing you see is all my fault</HUMOUR> <LAUGHTER: SS laugh>

9. <SLIDE: new slide; title+bulleted line> why do i analyze my classes?

10. what's my motivation?


Contextual information: samples

130. <HUMOUR: aside> i’m sorry but teachers get anxious too, and they are very proficient, most of the time.

131. <READING>anxiety has negati- negative impact on language performance</READING>.

132. we found as a matter of fact that sometimes the anxiety was a key to learn.

133. now a key thing here <GESTURE: points at a specific point on the screen presentation> is performance and learn.
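The angle-bracket labels in these samples are regular enough to be pulled out automatically. What follows is only a minimal sketch, assuming every label is an upper-case word, optionally followed by a colon and a free-text note, as in the samples shown. The function names (`extract_tags`, `strip_tags`, `count_labels`) are hypothetical and not part of the project's own tooling.

```python
import re
from collections import Counter

# Matches opening tags such as <LAUGHTER: SS laugh> or <READING>,
# and closing tags such as </HUMOUR>. Assumes labels are upper-case words.
TAG_RE = re.compile(r"<(/?)([A-Z]+)(?::\s*([^>]*))?>")


def extract_tags(utterance):
    """Return a (label, note, is_closing) tuple for each tag in an utterance."""
    tags = []
    for match in TAG_RE.finditer(utterance):
        closing, label, note = match.groups()
        tags.append((label, (note or "").strip(), closing == "/"))
    return tags


def strip_tags(utterance):
    """Remove the mark-up and collapse whitespace, leaving the spoken words."""
    return " ".join(TAG_RE.sub(" ", utterance).split())


def count_labels(utterances):
    """Count opening tags per label across a list of utterances."""
    counts = Counter()
    for utterance in utterances:
        for label, _note, is_closing in extract_tags(utterance):
            if not is_closing:
                counts[label] += 1
    return counts
```

Counting only opening tags keeps paired labels such as READING from being counted twice. Stripping the tags first also keeps the mark-up out of any word counts.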


Investigated topics

Quantitative:

• Speakers’ self-references

• Audience references

• Imperatives and other directive expressions

Qualitative:

• Rhetorical structure of specific CP sections

• The role of humour

• The role of gestures and visuals


The role of humour


Functions of humour

• creates solidarity

• reinforces in-group membership

• mitigates conflict

“An orientation toward humour by one or more participants from the outset in a potentially thorny interaction can mitigate controversy and prevent serious conflict” (Norrick & Spitz 2008)


Defining and researching humour

• Non-seriousness (Chafe 2007)

• Evidence for humour: language and gesture

“We cannot always tell from a transcript or an audio recording what might have been intended or taken as humorous” (Swales, J. 2004. Research Genres, on the limitations of MICASE).


Humour in CPs: the importance of context

S1: okay, good afternoon. our first speaker is X, and he_, his talk has a very interesting and long title <GESTURE: looking at presenter inquisitively; raising eyebrows and smiling>

S2: yes, <LAUGHTER: S1> well, i know <OVERLAP> (xx) yes</OVERLAP>.

S1: <READING: from notes><OVERLAP> title X </OVERLAP> title X continues…


Humour in CPs: research foci

• Size of the humorous episodes?

• Place in CPs?

• Target?


Size & Place in CPs

• Generally short.

• Few long episodes in NS data only.

• Tend to cluster around moments of tension: outset of talk, before question-time, complex data…, especially in NS talks.


Non-seriousness in CPs: target

Undesirable situations

– incongruences between speaker’s announced plans and actual presentation;

– mismatches between thoughts and actually uttered words;

– slips of the tongue;

– mismatches between slides and speaker’s words;

– unreasonably long examples;

– self-deprecation: e.g. methodological flaw;

– running out of time;

– citing work of a member of the audience.


Non-seriousness: an excessively long example

<SLIDE: run-on text> so six is a very long example <LAUGHTER: speaker and audience> erm well i can summarise (the part) up to the deontic expression <LAUGHTER: speaker>


Humour in CPs: target

Abnormal situations

– Surprising/unexpected results

– Questionable conceptual distinction

– Unusual method or terminology


Visuals: graphs and tables


• 23 speakers (9 M; 14 F) / 19 CPs

• 113 episodes

• Text (16), graph (36) or table (51).

• Presenter variables:

– Expertise (1-3)

– NNS vs. NS

• Type of event


Full episode

• Introducing visual element

• Describing element

• Identifying patterns in data

• Interpreting data


81. ... right, now, moving on to results.

82. <SLIDE: graph> let me show you this table, mm? <GESTURE: turns to look at screen>

83. i think it uh it speaks by itself <HUMOUR><LAUGHTER: S1 laughs>

84. but let me explain it to you.

86. and by the way at the bottom of the bar uh there are positive critical comments and at the top are the negative ones.

87. and as you can see, while the frequency of positive critical comments is statistically similar in the two corpora the frequency of negative critical comments is radically different, with one hundred and seventy in English and only forty one in Spanish.

89. but uh if you look at the figures, about eighty percent of the Spanish reviews contained between zero and three critical comments

90. and the most uh uh ne- negative critical comments i mean, and uh the most frequent uh i mean the mode was zero critical comments in in the reviews, in the Spanish reviews.

91. okay uhm so it could be said that our findings show that Anglo-American uh book-review writers display a clear critical attitude towards the book under review


82. okay and now the actual <SLIDE: table+bar graph> pièce de résistance the use of interactional metadiscourse.

83. in this graph you can see the use of hedges boosters and attitude markers taken together

84. and it diminishes over time

85. so there is less use of these interactional markers in the Journal of Pragmatics as time goes by.


now, eh in this case of vague past time reference which I had identified as an area where the_ one could expect to see eh quite marked differences between eh national and other varieties eh <SLIDE: bars are shown one by one> it is striking that white and black informants eh responded quite differently.

i wasn’t prepared for that very marked difference which came out

and what you can see there is a various_ a very clear difference,

i tested this for statistical significance by means of the t-test applied to independent pairs

and as you can see both of these differences perfect and preterite turned out to be significant at the five per cent level in the case of the perfect at the point one percent level in the case of the preterite.

and then eh eh that eh put me on on the track of the eh importance that the distinction between blacks and whites eh could be (expected to play) in this test


60. er so perhaps not the best illustrative examples.

61. what i want to show here however is ….

62. now this is the study of the occurrence of third person present simple verb forms in erm the corpus that i’ve collected.

63. erm. and you can see here that,

64. well i i i’ll talk through it <GESTURE: turns to screen> you’ve got too many columns.

65. i’ve divided just the all of the verbs in the study into main verbs on the one hand and then auxiliary verbs on the other.

66. and then you can see a fairly even distribution of occurrences of third person S in other words the conventional standard E-N-L form you know? and the third person zero form you know? a hundred and three to a hundred and eight okay?

67. so you might think well it’s a fairly er even distribution there

68. but what’s very interesting is that the pattern emerges especially when you look at auxiliary verbs you know?

69. so the third person zero as a a feature of lingua franca talk occurs primarily on main verbs


Directives in CPs


Directives in CPs. Research Aims.

• Native vs non-native usage. Pedagogic style? Cf. MICASE

• CPs vs Research Articles. Cf. Swales et al. 1998; Hyland 2002.


Directive expressions considered

• Imperative clauses

• Let us/me-imperatives

• It is important, essential, etc. + to-clause (audience oriented with directive force)

• you/we + modal verb (must/have to/need)

• If(when)-clauses: if you/we (with directive force)

• As-clauses: as you/we (with directive force)
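Most of the categories listed here have a fixed lexical frame, so candidate hits can be retrieved from the orthographic transcripts with simple pattern searches. The sketch below is illustrative only: the pattern set and the name `find_candidates` are assumptions, not the authors' actual queries. Bare imperative clauses are left out because they cannot be found reliably without POS tagging. Every hit still needs manual checking for directive force.

```python
import re

# Hypothetical search patterns, one per lexically framed category.
# They retrieve candidates only; a human must still judge directive force.
PATTERNS = {
    "let_imperative": re.compile(r"\blet (?:us|me)\b|\blet['’]s\b", re.I),
    # The list says "important, essential, etc."; only the two named
    # adjectives are included here, so the list would need extending.
    "important_to": re.compile(r"\bit(?: is|['’]s) (?:important|essential) to\b", re.I),
    "modal": re.compile(r"\b(?:you|we) (?:must|have to|need(?: to)?)\b", re.I),
    "if_when_clause": re.compile(r"\b(?:if|when) (?:you|we)\b", re.I),
    "as_clause": re.compile(r"\bas (?:you|we)\b", re.I),
}


def find_candidates(utterance):
    """Return the names of all categories whose pattern occurs in the utterance."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(utterance)]
```

Run over one-utterance-per-line transcripts, this yields a concordance-style shortlist for hand coding rather than final counts.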


Verbs with directive force: CPs vs RAs

CPs
see                                               85
look                                              32
bear/keep (in mind)                                7
consider                                           6
remember                                           4
compare, note, turn, take into account             2
conclude, imagine                                  1

RAs (Swales et al. 1998)
see                                               87
consider                                          71
note                                              38
suppose                                           25
recall                                             6
define                                             4
classify, insert, assume                           3
contrast, calculate, notice, imagine, denote, …    2


See-based directives: NS vs NNS

                     NNS            NS
                     N    /1000     N    /1000
you/we can see       16   0.39      20   1.05
you/we see           19   0.46       0   0
as you/we can see    18   0.44       0   0
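The /1000 columns are normalised frequencies: raw hits divided by the size of the sub-corpus, multiplied by 1,000. The slides do not report the sub-corpus sizes. The word counts below (roughly 41,000 words for NNS and 19,000 for NS) are approximations back-calculated from the table, so treat them as assumptions.

```python
# Assumed sub-corpus sizes, back-calculated from the table (not reported figures).
NNS_WORDS = 41_000
NS_WORDS = 19_000


def per_thousand(count, corpus_words):
    """Normalise a raw frequency to occurrences per 1,000 words."""
    return 1000 * count / corpus_words


# Reproduce the "you/we can see" row under the assumed corpus sizes.
nns_rate = round(per_thousand(16, NNS_WORDS), 2)
ns_rate = round(per_thousand(20, NS_WORDS), 2)
```

Normalising matters here because the NNS sub-corpus appears to be about twice the size of the NS one, so the raw counts alone would overstate NNS usage.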


You vs we-based directives

                  NNS            NS
                  N    /1000     N    /1000
You-directives    56   1.37      16   0.85
We-directives     26   0.64      22   1.16

Readers are mostly explicitly brought into the text as discourse participants by the use of personal pronouns, most commonly the inclusive we (…) You and your occur only rarely (…) This widespread avoidance may indicate that writers generally seek to circumvent the stark detachment from their audience that you suggests. (Hyland, 2001)
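The slides do not say whether the NS/NNS differences were tested for significance. One option often used for frequency comparisons across corpora of unequal size is the log-likelihood statistic. The sketch below uses its two-term form. The function name is hypothetical, and any corpus sizes passed in would be assumptions, since the slides give no word counts.

```python
import math


def log_likelihood(count_a, size_a, count_b, size_b):
    """Two-term log-likelihood (G2) for one item's frequency in two corpora.

    count_a, count_b: raw frequencies of the item in corpus A and corpus B.
    size_a, size_b:   total word counts of corpus A and corpus B.
    """
    total = count_a + count_b
    # Expected frequencies if the item were equally common in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2
```

With one degree of freedom, values above 3.84 correspond to p < 0.05. For the you-directives row, the call would be `log_likelihood(56, nns_words, 16, ns_words)`, with the two corpus sizes supplied by the analyst.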


Let me and let us imperatives: CPs vs lectures

                 NS         NNS         MICASE Small Lectures
let me + verb    2          8           29
let us + verb    5          21          59
TOTAL            7 (0.37)   29 (0.71)   88 (1.63)


A look ahead: The CASE project

• Informal conversations via Skype

• Advanced L2 use of English

• Academic topics

• Duration (45 mins)

• Multilayer transcription


Exploiting CASE

• Interaction management: speaking turns, communicative strategies, body language…

• Identity issues: projecting self-identity, stereotypes…

• Appropriating the code: negotiation of meaning, linguistic creativeness…
