Automating Translation in the Localisation Factory: An Investigation of Post-Editing Effort. Sharon O’Brien, Dublin City University


Automating Translation in the Localisation Factory

An Investigation of Post-Editing Effort

Sharon O’Brien
Dublin City University

Assumptions about MT

T (MT + PE) < T (Trans)

Do we have proof?

Dated studies:
  Pan-American Health Organisation
  General Motors
  European Union
  3-4 times faster than translation
But: no details given

More recently:
  Average daily throughput for PE: 5,250 words per day

Krings (2001): only thorough, published empirical data on PE rates

MT + CL

CL: Relatively young field of research/implementation

Consequently: little empirical data

CL improves “translatability”

The notion of translatability is based on so-called "translatability indicators" where the occurrence of such an indicator in the text is considered to have a negative effect on the quality of machine translation. The fewer translatability indicators, the better suited the text is to translation using MT.

(Underwood and Jongejan, 2001: 363)

Can we prove it - empirically?

Hypothesis: when CL rules are used to eliminate negative “translatability indicators” (NTIs), the post-editing effort for the MT output will be lower than for output where negative translatability indicators have not been removed.

Experimental Set-Up

Validity!
  Professional, experienced subjects, native speakers (German)
  Homogeneous backgrounds and level of experience
  Familiar text (user guide)
  Familiar working environment
  Payment for time

However: limited number of subjects

Framework of Analysis

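How do you measure post-editing “effort”?
  Temporal
  Technical
  Cognitive

Two sentence types:
  “Snti”: sentences containing negative translatability indicators (NTIs)
  “Smin-nti”: sentences in which NTIs have been minimised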

Framework of Analysis

Temporal Effort: How much time, in seconds, did it take to post-edit each sentence?

Technical Effort: How many deletions, insertions, cut & pastes were made for each sentence?

Cognitive Effort: Combined Temporal & Technical
  Additional measurement: Choice Network Analysis
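
The temporal and technical measures above can be illustrated with a minimal Python sketch. It assumes a hypothetical, simplified event log: the `EditEvent` record and its fields are invented for illustration and are not the Translog log format. Temporal effort is taken here as the time between the first and last logged event for a sentence, which is one possible reading of "time to post-edit each sentence".

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class EditEvent:
    """Hypothetical log record (an assumption, not Translog's real format)."""
    sentence_id: int
    timestamp: float  # seconds since the start of the session
    kind: str         # "insert", "delete", "cut", "paste" or "navigate"


# Event kinds counted towards technical effort.
TECHNICAL_KINDS = ("insert", "delete", "cut", "paste")


def effort_per_sentence(events):
    """Return {sentence_id: {"seconds": float, "insert": int, ...}}."""
    times = {}
    counts = {}
    for ev in events:
        times.setdefault(ev.sentence_id, []).append(ev.timestamp)
        counter = counts.setdefault(ev.sentence_id, Counter())
        if ev.kind in TECHNICAL_KINDS:
            counter[ev.kind] += 1
    result = {}
    for sid, stamps in times.items():
        row = {"seconds": max(stamps) - min(stamps)}  # temporal effort
        for kind in TECHNICAL_KINDS:                  # technical effort
            row[kind] = counts[sid][kind]
        result[sid] = row
    return result


# Made-up events for two sentences, for illustration only.
log = [
    EditEvent(1, 0.0, "navigate"),
    EditEvent(1, 2.5, "delete"),
    EditEvent(1, 6.0, "insert"),
    EditEvent(1, 7.5, "insert"),
    EditEvent(2, 8.0, "navigate"),
    EditEvent(2, 12.0, "paste"),
]
print(effort_per_sentence(log))
```

Cognitive effort is not computed separately in this sketch, since the slides describe it as a combination of the two measures plus Choice Network Analysis.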

Analysis Tools

IBM WebSphere
Translog
Excel

Translog User Interface

Translog Log File

Results: General Temporal Effort

[Bar chart: median words per minute for Post-Editor and Translator; y-axis from 0 to 18]

Temporal Effort: Individual Variation

[Bar chart: median words per minute for Translator 1, Translator 2 and Translator 3; y-axis from 12.9 to 13.7]

[Bar chart: median words per minute for Fastest Post-Editor and Slowest Post-Editor; y-axis from 0 to 30]

Temporal Effort by Sentence Type

Processing Speed: the total number of source words in each segment divided by the total processing time for that segment
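
As a minimal sketch of this definition, the Python below computes processing speed per segment and then the median per sentence type. It assumes processing time is measured in seconds, so speed comes out in source words per second. The function names and the sample figures are illustrative assumptions, not data from the study.

```python
from statistics import median


def processing_speed(source_words, processing_seconds):
    """Source words in a segment divided by its processing time in seconds."""
    if processing_seconds <= 0:
        raise ValueError("processing time must be positive")
    return source_words / processing_seconds


def median_speed_by_type(segments):
    """segments: iterable of (sentence_type, source_words, processing_seconds)."""
    speeds = {}
    for sentence_type, words, seconds in segments:
        speeds.setdefault(sentence_type, []).append(
            processing_speed(words, seconds)
        )
    return {t: median(values) for t, values in speeds.items()}


# Made-up segments, for illustration only.
sample = [
    ("Snti", 10, 50),
    ("Snti", 12, 40),
    ("Smin-nti", 8, 20),
    ("Smin-nti", 9, 30),
    ("Smin-nti", 10, 20),
]
print(median_speed_by_type(sample))
```

The median is used here rather than the mean because the result slides report median values.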

Processing Speed by Sentence Type

[Bar chart: median processing speed for Snti and Smin-nti sentences; y-axis from 0 to 0.45]

Technical Effort by Sentence Type

[Bar chart: median deletions for Snti and Smin-nti sentences; y-axis from 0 to 4]

[Bar chart: median insertions for Snti and Smin-nti sentences; y-axis from 0 to 4]

Technical Effort: Cut & Paste

Very little activity!
  Retyping of entire phrases rather than cutting & pasting
  Less effort to re-type?
  Need for training?

Cognitive Effort

On average, the results suggest that eliminating NTIs reduces PE effort.

However, CNA shows:
  More edits to some NTIs than to others
  Even though NTIs have been removed from a sentence, this does not guarantee zero post-editing

High PE Effort

Gerund (“ing” form of verb)
Ungrammatical Phrase
Putting an adjective after the noun
Non-finite verb (no tense marked)
Slang
Misspelling
Long Noun Phrase
Ellipsis
Long Sentence (more than 25 words)
Verbs with particles
Use of Footnotes
Multiple Prepositions
Short Segment (fewer than 4 words)

Medium PE Effort

Multiple Coordinators
Problematic Punctuation
Passive Voice
Phrase not syntactically complete
Use of Personal Pronouns
Use of Slash as a separator
Ambiguous coordination
Use of brackets
Proper Nouns
Missing “that” in a relative clause

Low PE Effort

Abbreviations
Demonstrative Pronouns
Missing “in order to”
Contractions (“Let’s”)
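
A few of the indicators in the three lists above are surface features that a script could flag before text is sent to MT. The Python below is a rough sketch under that assumption. It uses only the two thresholds stated on the slides (more than 25 words, fewer than 4 words) plus simple pattern matches. The function name and the regular expressions are illustrative and are not the CL checker used in the study. Indicators that need linguistic analysis (gerunds, passive voice, ambiguous coordination and so on) are out of scope for this sketch.

```python
import re


def flag_surface_ntis(sentence):
    """Return labels for a few surface-level indicators found in a sentence."""
    words = sentence.split()
    found = []
    if len(words) > 25:
        found.append("long sentence")
    if len(words) < 4:
        found.append("short segment")
    # A slash between two word characters, e.g. "input/output".
    if re.search(r"\w/\w", sentence):
        found.append("slash as separator")
    if re.search(r"[()\[\]]", sentence):
        found.append("brackets")
    # Crude contraction match; it also catches possessives such as "user's".
    if re.search(r"\b\w+['’](s|re|ll|ve|d|t|m)\b", sentence, flags=re.IGNORECASE):
        found.append("contraction")
    return found


for text in [
    "Let's start",
    "Select the input/output mode (see above) now",
    "Click the Save button to store the file",
]:
    print(text, "->", flag_surface_ntis(text))
```

A check like this only locates candidate indicators. As the slides note, different indicators are associated with different levels of post-editing effort, so a flag says nothing by itself about how much effort a sentence will need.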

Conclusions

Taking into account that no QA was performed on the final texts:

On average, post-editing can be faster than translation
  High degree of individual variation

On average, removing NTIs reduces PE effort
  But some NTIs demand more effort than others

Conclusions

Even if all known NTIs are removed, sentences may still require PE effort.

Conclusions

Not all CL rules will have equal impact

Even if CL is applied, PE effort will not be removed completely

Post-editors are still human and still translators…