kerstin bier, localization world barcelona, manuel herranz, mt, pangeanic, sybase
DESCRIPTION
Co-presentation by Kerstin Bier and Manuel Herranz at Localization World Barcelona 2011 on the progress and results achieved with a customized PangeaMT engine at Sybase. Topics: the initial machine translation implementation, machine translation customization for Sybase, use of the client's data for training, and productivity results.
TRANSCRIPT
MT Experiences at Sybase
Kerstin Bier, Manuel Herranz (PangeaMT)
MT at Pangeanic: From Trial to Service
2007/08 · 2009/10 · 2011/12

• DIY SMT
• Empower users
• Glossary
• Automated re-training
• Transfer architecture and know-how to users
• Compatibility with commercial formats (ttx, sdlxliff, itd)

2007 and before
• RB tests with commercial software
• Insufficiently good output
• Only internal production
• EU Post-Editing Award

• V1: Small data sets (2-5M words), automotive & electronics (ES), then FR/IT/DE in other fields

• Division born
• Hundreds of engine trials and language combinations
• Open-source to commercial
• TMX / XLIFF workflows
MT and PE at Sybase: From Trial to Production
2009/2010: Trial project with PangeaMT (EN-DE)
• Engine: 2.5 million words, narrow domain (one product)
• Results: surprisingly good (BLEU: 49, PE productivity > 70%)
2010: Project 1, MT and post-editing (EN-DE)
• Engine: 5 million words
• Major new release, lots of new content: 400,000 "new" words post-edited
2010/2011: Project 2, retraining, MT + PE
• Retraining with post-edited and cleaned-up TMs
• Small product update: 80,000 "no match" words, MT + PE
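The slides do not say how the TMs were cleaned before retraining. As a rough illustration only, the sketch below applies three common clean-up filters to a translation memory held as (source, target) pairs: dropping empty segments, dropping pairs with an implausible length ratio, and removing exact duplicates. The function name `clean_tm` and the `max_len_ratio` value are assumptions made for this example, not details of the Sybase or PangeaMT workflow.

```python
def clean_tm(pairs, max_len_ratio=3.0):
    """Return TM segment pairs with obvious noise removed.

    pairs: iterable of (source, target) strings.
    max_len_ratio: hypothetical cut-off for the character-length ratio
    between the longer and the shorter side of a pair.
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        # Normalize whitespace so that trivially different copies match.
        source = " ".join(source.split())
        target = " ".join(target.split())
        # Filter 1: skip pairs with an empty side.
        if not source or not target:
            continue
        # Filter 2: skip pairs whose lengths differ implausibly.
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_len_ratio:
            continue
        # Filter 3: keep only the first copy of an exact duplicate.
        key = (source, target)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned
```

In practice the threshold would need tuning per language pair, since German targets tend to run longer than their English sources.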
Expectations
Initial system: expected output (% of MT words)
• Excellent: 10%
• Good: 30%
• OK: 30%
• OK or bad?: 20%
• Bad: 10%
Over time (Retraining 1, Retraining 2, …)
• MT output gets better over time
• Continuous PE productivity increase
• Turnaround times shorter
• Cost savings go up over time
Main Challenges
Results: From "human" perspective
Results: The MT perspective
[Chart: MT output by METEOR score range (100-70, 50-69, 40-49, 30-39, 0-29); chart labels: Retraining, Project 1, Project 2]
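The score ranges on this slide can be read as a simple banding of per-segment METEOR scores. The sketch below shows one way such a breakdown could be computed. It assumes scores are scaled to 0-100 and that a score of exactly 70 falls in the top band, as the labels suggest. The names `meteor_band` and `band_shares` are made up for this example, and the actual per-project figures from the chart are not reproduced here.

```python
from collections import Counter

# (lower bound, label) for each band on the slide, highest band first.
BANDS = [(70, "100-70"), (50, "50-69"), (40, "40-49"), (30, "30-39"), (0, "0-29")]


def meteor_band(score):
    """Map a METEOR score (assumed to be on a 0-100 scale) to its band label."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for lower, label in BANDS:
        if score >= lower:
            return label


def band_shares(scores):
    """Return the percentage of segments that falls into each band."""
    scores = list(scores)
    if not scores:
        raise ValueError("no scores given")
    counts = Counter(meteor_band(s) for s in scores)
    return {label: 100 * counts[label] / len(scores) for _, label in BANDS}
```

Running `band_shares` on the segment scores of each project would give two distributions that can be compared band by band, which is the kind of before-and-after view the chart labels point to.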
Examples: MT output and PE effort
• Minimal PE effort
• Small PE effort
• Medium PE effort
Lessons Learned
Small in-domain MT engine = excellent starting point
For future projects:
• Faster turnaround, lower costs
• Other product lines
• Experiences help with other languages
MT output better than expected
• Often better than translators said
• Improved after retraining
We think that improving translator acceptance will improve productivity
• Idea: Filtering out poor translations (confidence scores)
• Retraining, retraining, retraining
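The confidence-score idea is stated only as an idea on the slide. One possible reading is sketched below: segments whose MT output scores below a threshold are withheld from post-editors and sent for translation from scratch instead. The function name `split_by_confidence`, the 0-1 confidence scale, and the default threshold are all hypothetical, since the slides do not say how confidence would be estimated or where the cut-off would lie.

```python
def split_by_confidence(segments, threshold=0.5):
    """Split MT output into segments to post-edit and segments to translate anew.

    segments: iterable of (source, mt_output, confidence) tuples, with
    confidence assumed to lie between 0 and 1.
    threshold: hypothetical cut-off below which MT output is not shown.
    """
    for_post_editing = []
    from_scratch = []
    for source, mt_output, confidence in segments:
        if confidence >= threshold:
            # Good enough: the translator sees the MT suggestion.
            for_post_editing.append((source, mt_output))
        else:
            # Too poor: only the source is passed on.
            from_scratch.append(source)
    return for_post_editing, from_scratch
```

The design intent is that translators never see the weakest output, which is one plausible route to the acceptance gain the slide hopes for.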
Predictions
[Chart: timeline 2009-2018; labels: "User empowerment", "Year 2016", "000's of customized MT systems"]
PangeaMT Tech. not the realm of a few providers
Thank you!
Kerstin Bier, Sybase, An SAP Company
Manuel Herranz, PangeaMT