Investigating spoken academic English with corpus tools
Mario Cal Varela and Francisco J. Fernández Polo
Universidade de Santiago de Compostela, Spain
24-25 September 2015
Outline
Background: the NIUS survey
A corpus of CPs: compilation and transcription issues
Investigated topics
A look ahead: the CASE project
Survey of English language needs at the USC
• Mailed questionnaire + interviews
• 5 research areas – 25 departments
• 213 valid responses (25% return rate)
• USC > Spain > Southern Europe?
How much English do you need to…?
Self-assess your current competence in English to…
Why a corpus of CPs?
• CPs: a key research genre.
• Little research on CPs.
• Small, self-compiled corpora (Rowley-Jolivet & Carter-Thomas 2005, Webber 2005).
• CPs underrepresented in existing corpora (MICASE, BASE; cfr. ELFA).
Project aim: Description of the CP genre
• Structural and lexico-grammatical features.
• Pragmatic and discourse strategies.
• Multimodal resources.
• Variability across speaker groups (NS/NNS, expertise, spoken vs. read aloud…).
Current contents
• Paper presentation and discussion sections; Linguistics conferences.
• Video and audio recordings, PPTs and handouts.
• Field notes (audience, physical setting, etc.).
• Consent forms and speakers’ profiles.
• Current holdings: 30+ events.
• Sample limitations: NS vs. NNS, field.
Recording
USC professional services
– Better quality video
– Stronger observer effect.
– Recording staff unaware of research agenda.
– Excessive emphasis on video quality to the detriment of sound.
Researcher-recorder
– Reduced effect of observer's paradox.
– Naturalistic insider view.
– Researcher in charge.
– Lower quality video recording.
Recording
Lessons
– There is a trade-off between recording quality and unobtrusiveness.
– Ideal recording equipment for CPs: 3 cameras (presenter + screen + audience) and 2 microphones (1 tie-clip + 1 ambient mic).
Transcription and annotation: How detailed?
• General research goals, in-group use > tentative transcription decisions.
• Broad orthographic transcription, restricted mark-up (MICASE).
• Naturalness: features of impromptu speech (repetitions, false starts…).
• Approximate phonological transcription of hesitation, backchannel cues…
• Punctuation: pause duration and basic intonation.
• Spelling: normalised to SBE. Some non-standard contractions (sorta, gonna…).
• Layout: 1 utterance per line.
Preliminary mark-up
• MICASE conventions as a starting point.
• No POS tagging… yet.
• Set of labels used:
– Laughter and humour
– Reading
– Gesture
– Slides
– Contextual events
Contextual information: samples
8. but uhm the EFL classroom that i’ll be talking about today is my classroom the teaching that i do, <HUMOUR>so anything embarrassing you see is all my fault</HUMOUR> <LAUGHTER: SS laugh>
9. <SLIDE: new slide; title+bulleted line> why do i analyze my classes?
10. what's my motivation?
Contextual information: samples
130. <HUMOUR: aside> i’m sorry but teachers get anxious too, and they are very proficient, most of the time.
131. <READING>anxiety has negati- negative impact on language performance</READING>.
132. we found as a matter of fact that sometimes the anxiety was a key to learn.
133. now a key thing here <GESTURE: points at a specific point on the screen presentation> is performance and learn.
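The mark-up in these samples can be separated from the spoken words automatically. The following is a minimal sketch, assuming tags always take the form <LABEL>, <LABEL: note> or </LABEL> as in the samples; the function names are hypothetical and this is not the project's own tooling.

```python
import re

# Matches opening tags such as <HUMOUR> or <SLIDE: graph>
# and closing tags such as </READING>.
TAG = re.compile(r"<(/?)([A-Z]+)(?::\s*([^>]*))?>")

def extract_tags(line):
    """Return (label, note, is_closing) for every tag in a transcript line."""
    return [(m.group(2), m.group(3) or "", m.group(1) == "/")
            for m in TAG.finditer(line)]

def strip_tags(line):
    """Remove all tags and collapse whitespace, leaving only the spoken words."""
    return re.sub(r"\s+", " ", TAG.sub(" ", line)).strip()

sample = ("now a key thing here <GESTURE: points at a specific point "
          "on the screen presentation> is performance and learn.")
print(extract_tags(sample))
print(strip_tags(sample))
```

Keeping the tag note (the text after the colon) alongside the label makes it possible to count, for example, gestures towards the screen separately from other gestures.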
Investigated topics
Quantitative:
• Speakers’ self-references
• Audience references
• Imperatives and other directive expressions
Qualitative:
• Rhetorical structure of specific CP sections
• The role of humour
• The role of gestures and visuals
The role of humour
Functions of humour
• creates solidarity
• reinforces in-group membership
• mitigates conflict
“An orientation toward humour by one or more participants from the outset in a potentially thorny interaction can mitigate controversy and prevent serious conflict” (Norrick & Spitz 2008)
Defining and researching humour
• Non-seriousness (Chafe 2007)
• Evidence for humour: language and gesture
“We cannot always tell from a transcript or an audio recording what might have been intended or taken as humorous” (Swales, J. 2004. Research Genres, on the limitations of MICASE).
Humour in CPs: the importance of context
S1: okay, good afternoon. our first speaker is X, and he_, his talk has a very interesting and long title <GESTURE: looking at presenter inquisitively; raising eyebrows and smiling>
S2: yes, <LAUGHTER: S1> well, i know <OVERLAP> (xx) yes</OVERLAP>.
S1: <READING: from notes><OVERLAP> title X </OVERLAP> title X continues…
Humour in CPs: research foci
• Size of the humorous episodes?
• Place in CPs?
• Target?
Size & Place in CPs
• Generally short.
• Few long episodes in NS data only.
• Tend to cluster around moments of tension: outset of talk, before question-time, complex data…, especially in NS talks.
Non-seriousness in CPs: target
Undesirable situations
– incongruences between speaker’s announced plans and actual presentation;
– mismatches between thoughts and actually uttered words;
– slips of the tongue;
– mismatches between slides and speaker’s words;
– unreasonably long examples;
– self-deprecation: e.g. methodological flaw;
– running out of time;
– citing work of a member of the audience.
Non-seriousness: an excessively long example
<SLIDE: run-on text> so six is a very long example <LAUGHTER: speaker and audience> erm well i can summarise (the part) up to the deontic expression <LAUGHTER: speaker>
Humour in CPs: target
Abnormal situations
– Surprising/unexpected results
– Questionable conceptual distinction
– Unusual method or terminology
Visuals: graphs and tables
• 23 speakers (9 M; 14 F) / 19 CPs
• 113 episodes
• Text (16), graph (36) or table (51).
• Presenter variables:
– Expertise (1-3)
– NNS vs. NS
• Type of event
Full episode
• Introducing visual element
• Describing element
• Identifying patterns in data
• Interpreting data
81. ... right, now, moving on to results.
82. <SLIDE: graph> let me show you this table, mm? <GESTURE: turns to look at screen>
83. i think it uh it speaks by itself <HUMOUR><LAUGHTER: S1 laughs>
84. but let me explain it to you.
86. and by the way at the bottom of the bar uh there are positive critical comments and at the top are the negative ones.
87. and as you can see, while the frequency of positive critical comments is statistically similar in the two corpora the frequency of negative critical comments is radically different, with one hundred and seventy in English and only forty one in Spanish.
89. but uh if you look at the figures, about eighty percent of the Spanish reviews contained between zero and three critical comments
90. and the most uh uh ne- negative critical comments i mean, and uh the most frequent uh i mean the mode was zero critical comments in in the reviews, in the Spanish reviews.
91. okay uhm so it could be said that our findings show that Anglo-American uh book-review writers display a clear critical attitude towards the book under review
82. okay and now the actual <SLIDE: table+bar graph> pièce de résistance the use of interactional metadiscourse.
83. in this graph you can see the use of hedges boosters and attitude markers taken together
84. and it diminishes over time
85. so there is less use of these interactional markers in the Journal of Pragmatics as time goes by.
now, eh in this case of vague past time reference which I had identified as an area where the_ one could expect to see eh quite marked differences between eh national and other varieties eh <SLIDE: bars are shown one by one> it is striking that white and black informants eh responded quite differently.
i wasn’t prepared for that very marked difference which came out
and what you can see there is a various_ a very clear difference,
i tested this for statistical significance by means of the t-test applied to independent pairs
and as you can see both of these differences perfect and preterite turned out to be significant at the five per cent level in the case of the perfect at the point one percent level in the case of the preterite.
and then eh eh that eh put me on on the track of the eh importance that the distinction between blacks and whites eh could be (expected to play) in this test
60. er so perhaps not the best illustrative examples.
61. what i want to show here however is ….
62. now this is the study of the occurrence of third person present simple verb forms in erm the corpus that i’ve collected.
63. erm. and you can see here that,
64. well i i i’ll talk through it <GESTURE: turns to screen> you’ve got too many columns.
65. i’ve divided just the all of the verbs in the study into main verbs on the one hand and then auxiliary verbs on the other.
66. and then you can see a fairly even distribution of occurrences of third person S in other words the conventional standard E-N-L form you know? and the third person zero form you know? a hundred and three to a hundred and eight okay?
67. so you might think well it’s a fairly er even distribution there
68. but what’s very interesting is that the pattern emerges especially when you look at auxiliary verbs you know?
69. so the third person zero as a a feature of lingua franca talk occurs primarily on main verbs
Directives in CPs
Directives in CPs. Research Aims.
• Native vs non-native usage. Pedagogic style? Cfr. MICASE
• CPs vs Research Articles. Cfr. Swales & al 1998; Hyland 2002.
Directive expressions considered
• Imperative clauses
• Let us/me-imperatives
• It is important, essential, etc. + to-clause (audience oriented with directive force)
• you/we + modal verb (must/have to/need)
• If(when)-clauses: if you/we (with directive force)
• As-clauses: as you/we (with directive force)
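Several of these expression types can be retrieved as candidates with simple pattern searches. The sketch below is illustrative only: the pattern labels and regular expressions are assumptions, not the queries actually used in the study. Plain imperative clauses are left out because they cannot be found reliably without POS tagging, and every hit would still need manual checking for directive force.

```python
import re

# Hypothetical search patterns, one per expression type.
PATTERNS = {
    "let-imperative": re.compile(r"\blet (?:me|us)\b", re.I),
    "as-clause": re.compile(r"\bas (?:you|we)\b", re.I),
    "if/when-clause": re.compile(r"\b(?:if|when) (?:you|we)\b", re.I),
    "modal": re.compile(r"\b(?:you|we) (?:must|have to|need)\b", re.I),
    "important-to": re.compile(r"\bit is (?:important|essential) to\b", re.I),
}

def find_candidates(utterance):
    """Return the labels of all patterns that match an utterance."""
    return [label for label, rx in PATTERNS.items() if rx.search(utterance)]

# Utterances taken from the transcript excerpts in these slides.
lines = [
    "but let me explain it to you.",
    "and as you can see, while the frequency of positive critical comments is statistically similar",
    "but uh if you look at the figures",
]
for line in lines:
    print(find_candidates(line), "|", line)
```

With one utterance per line in the transcripts, a line-by-line search of this kind maps directly onto the corpus layout.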
Verbs with directive force: CPs vs RAs
CPs
see                                                  85
look                                                 32
bear/keep (in mind)                                   7
consider                                              6
remember                                              4
compare, note, turn, take into account                2
conclude, imagine                                     1

RAs (Swales & al 1998)
see                                                  87
consider                                             71
note                                                 38
suppose                                              25
recall                                                6
define                                                4
classify, insert, assume                              3
contrast, calculate, notice, imagine, denote, …       2
See-based directives: NS vs NNS
                      NNS             NS
                      N     /1000     N     /1000
you/we can see        16    0.39      20    1.05
you/we see            19    0.46      0     0
as you/we can see     18    0.44      0     0
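The /1000 columns are normalised frequencies (occurrences per 1,000 running words). A minimal sketch of the calculation follows. The slides do not state the size of the NS and NNS sub-corpora, so the sizes computed here are back-calculated estimates, subject to rounding in the reported rates; the function names are hypothetical.

```python
def per_thousand(count, tokens):
    """Normalised frequency: occurrences per 1,000 running words."""
    return 1000 * count / tokens

def implied_tokens(count, rate_per_thousand):
    """Back-calculate the sub-corpus size implied by a count and its reported rate."""
    return 1000 * count / rate_per_thousand

# Count and reported rate for "you/we can see" in each speaker group.
nns_size = implied_tokens(16, 0.39)  # estimate only, not a figure from the slides
ns_size = implied_tokens(20, 1.05)   # estimate only, not a figure from the slides
print(round(nns_size), round(ns_size))
```

On these figures the NNS sub-corpus comes out at roughly 41,000 words and the NS sub-corpus at roughly 19,000, which is why raw counts in the two groups should not be compared directly.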
You vs we-based directives
                      NNS             NS
                      N     /1000     N     /1000
You-directives        56    1.37      16    0.85
We-directives         26    0.64      22    1.16
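The contrast between the two groups is easier to see as a proportion. The sketch below uses only the raw counts from the table; the variable and function names are hypothetical.

```python
# Raw counts of you- and we-directives per speaker group, from the table.
counts = {
    "NNS": {"you": 56, "we": 26},
    "NS": {"you": 16, "we": 22},
}

def you_share(group):
    """Proportion of you-directives among all you- and we-directives for a group."""
    c = counts[group]
    return c["you"] / (c["you"] + c["we"])

for group in counts:
    print(f"{group}: {you_share(group):.1%} you-directives")
```

About two thirds of the NNS directives are you-based, against fewer than half of the NS directives.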
Readers are mostly explicitly brought into the text as discourse participants by the use of personal pronouns, most commonly the inclusive we (…) You and your occur only rarely (…) This widespread avoidance may indicate that writers generally seek to circumvent the stark detachment from their audience that you suggests. (Hyland, 2001)
Let me and let us imperatives: CPs vs lectures
                  NS          NNS          MICASE Small Lectures
let me + verb     2           8            29
let us + verb     5           21           59
TOTAL             7 (0.37)    29 (0.71)    88 (1.63)
A look ahead: The CASE project
• Informal conversations via Skype
• Advanced L2 use of English
• Academic topics
• Duration (45 mins)
• Multilayer transcription
Exploiting CASE
• Interaction management: speaking turns, communicative strategies, body language…
• Identity issues: projecting self-identity, stereotypes…
• Appropriating the code: negotiation of meaning, linguistic creativeness…