Exploring Mismatches of Scores from Automated Writing Evaluation (AWE) Software and Instructor Rating in ESL Classes from SFL perspective
Zhi Li, Volker Hegelheimer
ALT, Iowa State University
MIDTESOL12, Ames
[Figure: Criterion scores versus instructor scores for Paper 4. Label: score 6 on Paper 4. Instructor score categories: A, A-, B+, B, B-, C+, C. Values as extracted: 6, 11, 6, 5, 9, 2, 4.]
Background (Study presented at LTRC 2012)
AWE Related Research
Figure 1. The decomposition of e-rater into features and microfeatures. Enright & Quinlan, 2010, p. 320
Systemic Functional Linguistics (SFL)
Why SFL (also known as Systemic Functional Grammar, SFG)?
Grammatical structure and meaning (Ravelli, 2000). SFL as a tool for genre analysis (Donohue, 2012; Lee, 2006)
Systemic Functional Linguistics (SFL)
Metafunction | Function | Grammatical system
Ideational | Representing experience of reality | TRANSITIVITY
Interpersonal | Enacting social relations | MOOD
Textual | Presenting messages as text in context | THEME
Adopted from Christie & Unsworth, 2000, p. 9
Transitivity Analysis
You | will receive | a package | this week.
Participant | Process | Participant | Circumstance
Process types: Material, Mental, Verbal, Behavioural, Relational, Existential (there be)
Participant types: Human / Non-human, Concrete / Abstract, Specific / Non-specific
Theme/Rheme
Fortunately, | the proposal | was accepted.
(Interpersonal) Theme | (Topical) Theme | Rheme
Theme types: Topical, Textual, Interpersonal
(Immediate) Thematic Progression Patterns
T2 = T1: Especially in recent society, a lot of people [T1 # R1] live in their own busy world. But most people [T2 # R2] also need to communicate with others and build a close relationship with them.
T2 = R1: So I usually [T1 # R1] chat a lot with my virtual friends. Most of the chatting detail [T2 # R2] would be some problems I felt sad.
T2 = S1: So the electronic-communication tools like online social networks, instant messaging and E-mail [T1 # R1] appear and become widely used in people of all ages. This [T2 # R2] is really an impact in human society.
T2 = TN: Therefore, they [T1 # R1] do not think they are not respected or ignored. You [T2 # R2] want to tell your real friends for some reason,
Adapted from Daneš (1985)
Note: T2 = the theme in the second clause complex, T1 = the theme in the first clause complex, S1 = the first sentence, R1 = the rheme in the first clause complex, TN = new theme.
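The four patterns above can be read as a small decision procedure over annotated clause complexes. The following is a minimal sketch, not the presenters' coding scheme: it assumes each clause complex has already been hand-annotated with the referent of its theme and the referents in its rheme, and the class name, field names, referent labels, and the order in which the rules are checked are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClauseComplex:
    """One hand-annotated clause complex (annotation scheme is hypothetical)."""
    theme: str                                   # referent of the topical theme
    rheme: set = field(default_factory=set)      # referents mentioned in the rheme
    points_to_whole_sentence: bool = False       # theme refers back to all of S1, e.g. "This"


def classify(prev: ClauseComplex, curr: ClauseComplex) -> str:
    """Label the immediate thematic progression from prev to curr.

    Assumed rule order: S1 first, then T1, then R1, otherwise a new theme.
    """
    if curr.points_to_whole_sentence:
        return "T2 = S1"
    if curr.theme == prev.theme:
        return "T2 = T1"
    if curr.theme in prev.rheme:
        return "T2 = R1"
    return "T2 = TN"


# The four slide examples, reduced to illustrative referent labels.
examples = {
    "T2 = T1": (ClauseComplex("people", {"busy world"}),
                ClauseComplex("people", {"communication"})),
    "T2 = R1": (ClauseComplex("I", {"chatting", "virtual friends"}),
                ClauseComplex("chatting", {"problems"})),
    "T2 = S1": (ClauseComplex("communication tools", {"wide use"}),
                ClauseComplex("this", {"impact"}, points_to_whole_sentence=True)),
    "T2 = TN": (ClauseComplex("they", {"respect"}),
                ClauseComplex("you", {"real friends"})),
}

for expected, (first, second) in examples.items():
    print(expected, "->", classify(first, second))
```

Deciding whether two themes share a referent is the hard, interpretive part of the analysis; the sketch leaves that to the annotator and only shows how the labels follow once that judgment has been made.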
Writing samples
All the writing samples were taken from Engl101C classes in Fall 2011. The assignment was an argumentative essay on Virtual Friends. Each instructor rating was the average of the two closest ratings from a panel of experienced ESL instructors, based on the Engl101C assignment rubric.
Paper | T Grade | First language | Flesch Reading Ease | F-K Grade Level | Word Count
101C 1112 | A | Arabic | 43.9 | 14 | 997
101C 306 | A | Chinese | 69.1 | 7.6 | 911
101C 723 | A | Chinese | 58.4 | 9.6 | 897
101C 1119 | B | Chinese | 64.7 | 8.7 | 911
101C 301 | B | Arabic | 55.7 | 12.1 | 986
101C 704 | B | Chinese | 58.4 | 9.6 | 905
Note: the Criterion scores for all the papers are 6, the highest.
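For readers unfamiliar with the Flesch Reading Ease and F-K Grade Level values reported for the writing samples, the sketch below shows the standard published formulas. The slides do not say which tool produced the reported values, so this is only an illustration; the word, sentence, and syllable counts in the usage lines are hypothetical and do not come from any of the papers.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher means easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula; approximates a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for illustration only (not taken from the study).
words, sentences, syllables = 100, 5, 150
print(round(flesch_reading_ease(words, sentences, syllables), 2))
print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
```

Both measures depend only on average sentence length and average syllables per word, which is one reason papers with similar readability values can still receive different instructor grades.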
Data Analysis Transitivity
Paper | Existential | Relational | Mental | Material | Verbal | Behavioural
101C1112 | 2 (2.7%) | 31 (41.3%) | 6 (8.0%) | 25 (33.3%) | 1 (1.3%) | 10 (13.3%)
101C723 | 4 (4.4%) | 21 (23.1%) | 14 (15.4%) | 38 (41.8%) | 5 (5.5%) | 9 (9.9%)
101C306 | 0 (0.0%) | 23 (26.7%) | 15 (17.4%) | 16 (18.6%) | 6 (7.0%) | 26 (30.2%)
101C704 | 2 (3.2%) | 25 (39.7%) | 7 (11.1%) | 23 (36.5%) | 2 (3.2%) | 4 (6.3%)
101C301 | 3 (4.4%) | 22 (32.4%) | 7 (10.3%) | 17 (25.0%) | 2 (2.9%) | 17 (25.0%)
101C1119 | 2 (2.7%) | 19 (25.3%) | 8 (10.7%) | 18 (24.0%) | 7 (9.3%) | 21 (28.0%)
Process types in the writing samples (percentages in parentheses)
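The averaged percentages used to compare the two sets of papers can be recomputed from the clause counts. The sketch below rests on two assumptions that the slides do not state outright: that "matched" papers are the three graded A by instructors (in line with the top Criterion score of 6) and "mismatched" papers are the three graded B, and that "averaged percentage" means the mean of per-paper percentages rather than a pooled percentage.

```python
PROCESS_TYPES = ["Existential", "Relational", "Mental", "Material", "Verbal", "Behavioural"]

# Clause counts per paper, copied from the process-type table.
COUNTS = {
    "101C1112": [2, 31, 6, 25, 1, 10],
    "101C723": [4, 21, 14, 38, 5, 9],
    "101C306": [0, 23, 15, 16, 6, 26],
    "101C704": [2, 25, 7, 23, 2, 4],
    "101C301": [3, 22, 7, 17, 2, 17],
    "101C1119": [2, 19, 8, 18, 7, 21],
}

# Assumption: A-graded papers match the Criterion score of 6, B-graded papers do not.
GROUPS = {
    "matched": ["101C1112", "101C306", "101C723"],
    "mismatched": ["101C1119", "101C301", "101C704"],
}


def percentages(counts):
    """Convert one paper's clause counts into percentages of its total clauses."""
    total = sum(counts)
    return [100 * c / total for c in counts]


def averaged_percentages(papers):
    """Average the per-paper percentages for each process type across a group."""
    rows = [percentages(COUNTS[p]) for p in papers]
    return {name: sum(col) / len(col) for name, col in zip(PROCESS_TYPES, zip(*rows))}


for group, papers in GROUPS.items():
    averages = averaged_percentages(papers)
    print(group, {name: round(value, 1) for name, value in averages.items()})
```

Averaging per-paper percentages gives each paper equal weight regardless of how many clauses it contains; pooling the counts first would weight longer papers more heavily.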
Data Analysis
[Figure: bar chart comparing matched and mismatched papers on six process types (EX, RE, ME, MA, VE, BE); vertical axis 0.0 to 35.0.]
Comparison of Process types between two sets of writing samples (in averaged percentage; analysis unit is clause)
Data Analysis Theme
Comparison of Thematic Progression between two sets of writing samples (in averaged percentage; analysis unit is clause complex) Note: T2 = the theme in the second clause complex, T1 = the theme in the first clause complex, S1 = the first sentence, R1 = the rheme in the first clause complex, TN = new theme.
[Figure: bar chart comparing matched and mismatched papers on T2 = T1, T2 = R1, T2 = TN, T2 = S1, and Textual theme; vertical axis 0.0 to 70.0.]
Data Analysis
Paper | T2 = T1 | T2 = R1 | T2 = TN | T2 = S1 | Textual theme
101C1112 | 3 (8%) | 11 (30%) | 18 (49%) | 5 (13%) | 24 (65%)
101C723 | 8 (15%) | 13 (25%) | 25 (48%) | 6 (12%) | 30 (58%)
101C306 | 8 (16%) | 16 (32%) | 23 (46%) | 3 (6%) | 33 (66%)
101C704 | 3 (7%) | 9 (21%) | 29 (69%) | 1 (2%) | 29 (69%)
101C301 | 14 (27%) | 6 (11%) | 27 (52%) | 5 (9%) | 21 (40%)
101C1119 | 15 (30%) | 11 (22%) | 20 (40%) | 4 (8%) | 30 (60%)
Theme types in the writing samples (percentages in parentheses)
Note: T2 = the theme in the second clause complex, T1 = the theme in the first clause complex, S1 = the first sentence, R1 = the rheme in the first clause complex, TN = new theme.
Conclusions
The analyses of the two sets of writing samples indicate some differences in process types, thematic progression patterns, and theme types, which, to some
this argumentative writing assignment.
Implications & Future Studies
So what?
With a bigger sample of student writing or an annotated corpus, significant differences could be spotted that distinguish papers of varying quality. If these features could be automatically identified and quantified, they could become part of a future AWE scoring system.
Future studies:
A systematic SFL analysis of a larger sample of student writing
Inferential statistical analysis of the data (multiple regression)
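As a rough illustration of what automatic identification of one such feature might look like, the sketch below flags clause complexes that open with a textual theme. It is a deliberately naive heuristic, not the presenters' method or the rule-based approach of Schwarz et al. (2008): it assumes a textual theme can be approximated by a sentence-initial conjunction or conjunctive adjunct, and the marker list and function names are hypothetical.

```python
import re

# Hypothetical, deliberately small list of conjunctions and conjunctive adjuncts.
TEXTUAL_MARKERS = (
    "and", "but", "so", "or", "however", "therefore", "moreover",
    "in addition", "for example", "on the other hand",
)


def has_textual_theme(clause_complex: str) -> bool:
    """Return True if the clause complex starts with one of the listed markers."""
    text = clause_complex.strip().lower()
    return any(re.match(rf"{re.escape(marker)}\b", text) for marker in TEXTUAL_MARKERS)


def textual_theme_share(clause_complexes) -> float:
    """Percentage of clause complexes that open with a textual theme."""
    hits = sum(has_textual_theme(c) for c in clause_complexes)
    return 100 * hits / len(clause_complexes)


# Clause complexes taken from the student examples quoted in the slides.
sample = [
    "But most people also need to communicate with others and build a close relationship with them.",
    "So I usually chat a lot with my virtual friends.",
    "This is really an impact in human society.",
    "Most of the chatting detail would be some problems I felt sad.",
]
print(textual_theme_share(sample))
```

A production system would need proper clause segmentation and a far richer inventory of markers; the word-boundary check here only prevents false hits such as "So" matching "Society".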
Selected references
Donohue, J. P. (2012). Using systemic functional linguistics in academic writing development: An example from studies. Journal of English for Academic Purposes, 11, 4–16.
Enright, M. K., & Quinlan, T. (2010). Complementing human judgment of essays written by English language learners with e-rater® scoring. Language Testing, 27, 317–334.
Lee, S. H. (2006). The use of interpersonal resources in argumentative/persuasive essays by East-Asian ESL and Australian tertiary students. Unpublished Dissertation. Sydney, University of Sydney.
Mickan, P., & Slater, S. (2003). Text analysis and the assessment of academic writing. International English Language Testing System, 4(2), 59–88.
Ravelli, L. (2000). Getting started with functional analysis of texts. In L. Unsworth (Ed.), Researching language in schools and communities: Functional linguistic perspectives (pp. 27–64). London and Washington: Continuum.
Schwarz, L., Bartsch, S., Eckart, R., & Teich, E. (2008). Exploring Automatic Theme Identification: A Rule-Based Approach. Text Resources and Lexical Knowledge. Selected Papers from the 9th Conference on Natural Language Processing KONVENS 2008 (pp. 15–26). Mouton de Gruyter.
Thank you! Your questions and comments will be greatly appreciated.
Zhi Li, [email protected]
Volker Hegelheimer, [email protected]
Criterion research group: http://volkerh.public.iastate.edu/awe