towards application of user-tailored machine translation in localization

Towards Application of User-Tailored Machine Translation

in LocalizationAndrejs Vasiļjevs, Raivis Skadiņš, Inguna Skadiņa

TILDE

JEC 2011, LuxembourgOctober 14, 2011

http://intranets/LV/Tildes%20produkti%20logo%20uc/Logo%20kvadrata.jpg

MT in Localization Work The localization process is generally related to

the translation and cultural adaptation of software, video games, and websites, and less frequently to any written translation

Translation Memory is commonly used Although there are number of MT use-cases in

practical localization process, it is not yet widely adapted


Previous Work Evaluation with keyboard-monitoring program and

Choice Network Analysis for measuring the effort involved in post-editing MT output (O´Brien, 2005)

Productivity tests have been performed in translation and localization industry settings at Microsoft (Schmidtke, 2008):

• SMT system of Microsoft Research trained on MS tech domain for 3 languages for Office Online 2007 localization task: Spanish, French and German

• By applying MT to all new words on average 5-10% productivity was gained


Previous Work – Adobe In Adobe two experiments were performed

(Flournoy and Duran, 2009):• Small test set of 800-2000 words was machine

translated and post-edited• Then, based on the positive results, about 200,000

words of new text were localized• The rule-based MT was used for translation into

Russian (PROMT) and SMT for Spanish and French (Language Weaver)

• Productivity increase between 22% and 51%


Previous Work – Autodesk Evaluation of Autodesk Moses SMT system

(Plitt and Masselot, 2010):• Translation from English to French, Italian, German and

Spanish with three translators for each language pair• To measure translation time special workbench was

designed to capture keyboard and pause times for each sentence

• MT allowed translators to improve their throughput on average by 74%

• Varying increase in productivity: from 20% to 131%• Optimum throughput has been reached for sentences

of around 25 words in length


LetsMT! project

User-driven cloud-based MT factory, based on open-source MT tools

Services for data collection, MT generation, customization and running of variety of user-tailored MT systems

Application in localization among the key usage scenarios Strong synergy with FP7 project ACCURAT to advance data-

driven machine translation for under-resourced languages and domains


Partnership with Complementing Competencies Tilde (Project Coordinator) - Latvia University of Edinburgh - UK University of Zagreb - Croatia Copenhagen University - Denmark Uppsala University - Sweden Moravia – Czech Republic SemLab – Netherlands

Industry experience

Research excellence


LetsMT! Main Features Online collaborative platform for MT building from user-

provided data Repository of parallel and monolingual corpora for MT

generation Automated training of SMT systems from specified

collections of data Users can specify particular training data collections and

build customised MT engines from these collections Users can also use LetsMT! platform for tailoring MT

system to their needs from their non-public data


User requirement analysis 21 interviews have been conducted with

localization/translation agencies. Only 7 of the respondents replied that they

use fully automatic MT systems in their translation practice and only one LSP organization employs MT as the primary translation method.


User survey

38%

23%

18%

23%

IPR of text resources in interviewee or-ganizations

no replyinterviewee has IPRinterviewee has restricted/partial IPRinterviewee has no IPR


User survey

40%

23%

21%

16%

Organizations' ability to share data

no reply/interviewee has no data now

perhaps

yes

no


Training UsingSharing of training data

Giza++Moses SMT toolkit

SMT Resource Repository

SMT Multi-Model Repository

(trained SMT models)

Proc

esin

g, E

valu

ation

...

Upl

oad

Anon

ymou

sac

cess

Auth

entic

ated

acce

ss

System management, user authentication, access rights control ...

Web page

Web service

Web pagetranslation widget

CAT tools

Web browserPlug-ins

SMT Resource Directory

SMT System Directory

Moses decoder


LetsMT! architecture Multitier architecture

• User interface (webpage UI, web service API)

• Application Logic• Resource Repository

(stores MT training data and trained models)

• High-performance Computing Cluster(executes all computationally heavy tasks: SMT training, MT service, Processing and aligning of training data etc.)

Interface Layer

Web Page UI Public API

Application Logic LayerResource

Repository Adapter

SMT training

Data Storage Layer(Resource Repository)

High-performance Computing (HPC) Cluster

SGE

Widget ...CAT toolsCAT tools CAT toolsBrowser plug-ins

http

sR

ES

T

http

/http

sht

ml

http

sR

ES

T

http

sR

ES

T, S

OA

P, .

..

TCP

/IP

http

RE

ST

/SO

AP

CPUCPU

CPU CPU

CPU CPU

CPU

CPUht

tpR

ES

T/S

OA

P

Translation

RE

ST

System DB

RR API

SVN

File Share

Web Browsers

HPC frontend CPUREST


LetsMT! architecture Webpage interface

where users can

• See, upload and manage corpora• See, train and manage user tailored SMT system• Translate texts and files

Web service API• Integration with CAT tools• Integration in web pages and web browsers


LetsMT! architecture Resource Repository

• Stores SMT training data• Supports different formats – TMX, XLIFF, PDF, DOC,

plain text• Converts to unified format• Performs format conversions and alignment


Based on Moses SMT toolkit Training tasks are managed with

Moses Experiment Management System Training tasks are executed in HPC cluster

(Oracle Grid Engine)

Hosted in Web Services infrastructure which provides easy access to on demand computing resources

New Moses features: • incremental training, • distributed language models, • interpolated language models for domain adaptation• randomized language models to train using huge corpora• translation of formatted texts, • running Moses decoder in a server mode


Bilingual corpora for the English-Latvian system

Bilingual corpus Parallel units

Localization TM ~1.29 M

DGT-TM ~1.06 M

OPUS EMEA ~0.97 M

Fiction ~0.66 M

Dictionary data ~0.51 M

Web corpus ~0.9 M

Total 5.37 M


Latvian monolingual corpora

Monolingual corpus Words

Latvian side of parallel corpus

60 M

News (web) 250 M

Fiction 9 M

Total, Latvian 319 M


Tilde English-Latvian MT Tools: Moses, Giza++, SRILM,

Tilde Latvian morphological analyzer Factored phrase-based SMT system:

• Surface form Surface form, Morphology tag• 2 Language models:

• (1) 5-gram surface form• (2) 7-gram morphology tag

The MT System

surface

lemma

morph.

synt.

surf.

lemma

morph.

stem

suffix

English Latvian


Development and evaluation data• Development - 1000 sentences • Evaluation – 500 sentences• Balanced• BLEU score: 35.0

The MT system

Topic PercentageGeneral information about European Union 12%Specifications, instructions and manuals 12%Popular scientific and educational 12%Official and legal documents 12%News and magazine articles 24%Information technology 18%Letters 5%Fiction 5%


Localization workflow at TildeEvaluate original / assign

Translator and Editor

Import into SDL TradosAnalyze against TMs

Translateusing translation suggestions for TMs

Evaluate translation quality / Edit

(optional) Fix errors

Hand over to customer


MT Integration into Localization Workflow

Evaluate original / assign Translator and Editor

Analyze against TMs

Translateusing translation suggestions for TMs

and MT

Evaluate translation quality / Edit

Fix errors

Ready translation

MT translate new sentences


Evaluation of Productivity Key interest of localization industry is to

increase productivity of translation process while maintaining required quality level

Productivity was measured as the translation output of an average translator in words per hour

5 translators participated in evaluation including both experienced and new translators


Evaluation of Quality Performed by human editors as part of their regular QA process Result of translation process was evaluated, editors did not

know was or was not MT applied to assist translator Comparison to reference is not part of this evaluation Tilde standard QA assessment form was used covering the

following text quality areas:• Accuracy• Spelling and grammar• Style• Terminology


Error Score


QA Evaluation FormError Category Weight Amount of errors Negative points

1. Accuracy 1.1. Understanding of the source text 3 0

1.2. Understanding the functionality of the product 3 01.3. Comprehensibility 3 01.4. Omissions/Unnecessary additions 2 01.5. Translated/Untranslated 1 01.6. Left-overs 1 0

Total 02. Language quality 2.1. Grammar 2 02.2. Punctuation 1 02.3. Spelling 1 0

Total 03. Style 3.1. Word order, word-for-word translation 1 03.2. Vocabulary and style choice 1 03.3. Style Guide adherence 2 03.4. Country standards 1 0

Total 04. Terminology 4.1. Glossary adherence 2 04.2. Consistency 2 0

Total 0Additional plus points for style (if applicable) 0

Grand Total 0Negative points per 1000 words 0Quality: Resulting Evaluation


QA GradesError Score

(sum of weighted errors)Resulting Quality

Evaluation

0…9 Superior

10…29 Good

30…49 Mediocre

50…69 Poor

>70 Very poor

Tilde Localization QA assessment applied in the evaluation


Evaluation data► 54 documents in IT domain► 950-1050 adjusted words in each document► Each document was split in half:

►the first part was translated using suggestions from TM only

►the second half was translated using suggestions from both TM and MT


Integration of MT in SDL Trados


Evaluation Results

► Average translation productivity:►Baseline with TM only: 550 w/h►With TM and MT: 731 w/h 32.9% productivity increase

► High variability in individual performance with productivity increase by 64% to decrease by 5%.

► Increase of error score from 20.2 to 28.6 points but still at the level “GOOD” (<30 points)


Average error scores by error typesError Class Baseline

scenarioMT

scenario

Accuracy 6 9Language quality 6 10Style 3 4Terminology 5 7

Biggest quality degradation in language quality due to increase of grammatical errors.


Conclusions The results of our experiment clearly demonstrate that it is feasible

to integrate the current state of the art SMT systems for highly inflected languages into the localization process.

Quality of translation decreases in all error categories still degradation is not critical and the result is acceptable for production purposes.

Significant performance differences for individual translators hints to the role of human factors such as training, work procedures, personal inclinations etc.

Extended experiments are planned (I) involving more translators, (II) translating texts in different domains and (III) in other language pairs.


Application in localization workflow


Thank you!

Let's MT!

The research within the LetsMT! project leading to these results has received funding from the ICT Policy Support Programme (ICT PSP), Theme 5 – Multilingual web, grant agreement no 250456


towards application of user-tailored machine translation in localization

Documents

mt building

tailoring mt system

tailored mt systemsapplication

automatic mt systems

customised mt engines

localizedthe rulebased

number of mt usecases

translation practice