
MOSES CORE

Deliverable D 4.7
Report on third year’s industry outreach events

Work Package: WP 4: Industry Outreach
Date (mm/yy): January 2015
Dissemination level: Public
Author: Yulia Korobova


Table of Contents

EXECUTIVE SUMMARY

1. OUTREACH EVENTS IN EUROPE
  1.1. MT SHOWCASE, DUBLIN, IRELAND
    1.1.1. RESULTS
  1.2. MT DISCUSSION AT THE VVIN CONFERENCE, THE HAGUE (THE NETHERLANDS)

2. OUTREACH EVENTS IN NORTH AMERICA
  2.1. MT SHOWCASE IN VANCOUVER, CANADA
    2.1.1. RESULTS
  2.2. MOSES INDUSTRY ROUNDTABLE IN VANCOUVER, CANADA
    2.2.1. RESULTS

3. GENERAL FINDINGS

4. CONCLUSIONS

APPENDIX 1: TAUS MT SHOWCASE DUBLIN 2014 DISCUSSIONS

APPENDIX 2: TRANSCRIPT OF THE RECORDING OF THE TAUS MT SHOWCASE PANEL DISCUSSION

APPENDIX 3: NOTES FROM MOSES INDUSTRY ROUNDTABLE BREAKOUT DISCUSSIONS

APPENDIX 4: TAUS MT SHOWCASE VANCOUVER 2014 DISCUSSIONS


Executive Summary

This report on industry outreach events outlines the third-year MosesCore results in accordance with the Communication plan (D4.1) delivered in May 2012.

Because attendance at the events in Asia was low in previous years, we decided to focus the 2014 Moses outreach events on Europe and North America.

In 2014, together with our partner Localization World, we organized two MT Showcases (in Dublin and in Vancouver) with the aim of fostering the use of machine translation. In addition, AMTA hosted a TAUS Industry Roundtable in Vancouver, and TAUS gave a presentation at the VViN Lustrum in The Hague. Cooperation with these organizations was beneficial for the exposure of Moses.

At all events we used Moses collateral. During the autumn events in Canada we distributed leaflets outlining the MosesCore project and the results achieved over the last three years.

Prior to the events, promotion campaigns were run to reach and attract participants: social media posts, one-on-one emails and conversations, and large e-bulletins.

To keep our communication consistent, in 2014 we continued to use the three key messages of the MosesCore project:

1. Moses is a state-of-the-art machine translation toolkit. It is best suited to making specialized MT engines for specific clients and industry domains.
2. Using Moses helps ensure flexibility and choice for users and fosters a healthy competitive landscape.
3. Using Moses helps to improve translation processes and capacity and to create new business opportunities.

Participants

TAUS is the leader of work package 4 “Industry Outreach” and is supported by UEDIN and ALS (now Capita Translation and Interpreting).


1. Outreach events in Europe

1.1. MT Showcase, Dublin, Ireland

Hosted at: Localization World
Date: 4 June 2014
Audience: Translation buyers, translation companies, translation technology providers
Aim: To raise awareness and demystify MT, to help set expectations and share knowledge/best practices of using MT and Moses
Number of participants: 61
Use cases: European Commission, Iconic Translation Machines, KantanMT, Sovee, Tilde
Web presence: 1,215 views of the presentations (18/01/2014-14/01/2015).

An overview of the presentations:

MT@EC for European public administrations and online services, Spyridon Pilos (European Commission) [1]
A showcase of the new MT system built by the Directorate General for Translation (DGT) using Moses.

Beyond Data: Delivering Machine Translation with Subject Matter Expertise, John Tinsley (Iconic Translation Machines) [2]
A success story of commercial machine translation systems.

Enabling MT for the everyone!, Tony O'Dowd (KantanMT) [3]
A cloud-based implementation of Moses.

Sovee Smart Engine 2.0: a Leap beyond Base Moses Technology, Scott Gaskill (Sovee) [4]
A demonstration of the automated language tuning and training capabilities of the Sovee Smart Engine 2.0.

MT applications in the EU public sector, Andrejs Vasiljevs (Tilde) [5]
A case study of the benefits of MT in the public sector.

[1] http://www.slideshare.net/TAUS/taus-mt-showcase-mtec-for-european-public-administrations-and-online-services-spyridon-pilos-european-commission
[2] http://www.slideshare.net/TAUS/taus-mt-showcase-beyond-data-john-tinsley-iconic-translation-machines
[3] http://www.slideshare.net/TAUS/enabling-mt-for-the-everyone-tony-odowd-kantanmt
[4] http://www.slideshare.net/TAUS/taus-mt-showcase-sovee-smart-engine-20-a-leap-beyond-base-moses-technology-scott-gaskill-sovee
[5] http://www.slideshare.net/TAUS/taus-mt-showcace-mt-applications-in-the-eu-public-sector-adrejs-vasiljevs-tilde


All the presentations are publicly available on the MosesCore project website [6], on the Moses Resources page on taus.net [7] and on Slideshare.

[6] http://www.statmt.org/mosescore/index.php?n=Main.Videos
[7] https://translate.taus.net/translate/mosescore/mosescore-resources#use-cases

1.1.1. Results

The MT Showcase in Dublin started some interesting discussions. Appendix 1 gives an overview of the questions based on the presentations and the podium discussion.

All participants were asked to fill in a survey that was handed out. The survey aimed to collect information about the current adoption of Moses and the influence of the MosesCore project on the MT community.

In addition to this survey, and following the review recommendations, we recorded the panel discussion. Appendix 2 contains a transcript. The discussion covered a number of interesting points about the current use of Moses by industry leaders, as well as the sustainability of Moses beyond EC funding.

“Moses has a place in any type of MT world, so even if you are a company like Systran who provides essentially rule-based machine translation systems you can use, something like Moses could be used at various stages in that process to enhance MT.”
John Tinsley (Iconic Translation Machines)

“I have a very considerate view on Moses. If you equate Moses to the internal combustion engine: every car manufactured in the world uses the internal combustion engine and that is what Moses is to MT providers, it’s like an internal combustion engine but just like every car is different from every car manufacturer we are going to get lots of different flavours, we are going to get huge leaps in innovation, we are going to get lots of new reordering models, going to get analytic models and we are going to extend the power over and over again.”
Tony O’Dowd (KantanMT)

1.2. MT Discussion at the VViN Conference, The Hague (the Netherlands)

Hosted by: VViN
Date: 19 September 2014
Aim: To raise awareness of MT, share knowledge/best practices, explain what Moses is
Audience: Translation companies
Moderator: TAUS
Number of participants: 20


Participants of previous MT Showcases had pointed out a shortage of similar events in Europe. In response, the TAUS team took part in the VViN Conference in September. Together with 20 highly motivated participants we held a Q&A about open-source MT, Moses and the adoption of MT in general.

2. Outreach events in North America

2.1. MT Showcase in Vancouver, Canada

Hosted at: Localization World
Date: 29 October 2014
Aim: To share knowledge/best practices, explain what Moses is
Audience: Translation companies, buyers
Presenters: eBay, Precision Translation Tools, Unbabel, Translated, TAUS
Number of participants: 41
Web presence: 889 views of the presentations (04/10/2014-14/01/2015).

An overview of the presentations:

TAUS Introduction and MT market overview, Jaap van der Meer & Achim Ruopp, TAUS [8]
This presentation outlined the results of the 2014 MT Market report, including the use of Moses, and a number of predictions about the further development of the MT market.

Machine Translation at eBay, Saša Hassan, eBay [9]
In this talk, eBay presented recent launches of Moses-based machine translation on the eBay site for various locales, e.g. the Russian and Latin American markets. These launches enable buyers to shop in their native languages and foster cross-border trade overall.

The Simplified Guide to Getting Started in SMT, Tom Hoar, Precision Translation Tools [10]
This session reviewed the fundamentals of selecting an SMT solution, with examples that reference use cases with PTTools' DoMT Desktop, a commercial application with a Moses kernel.

Seamless Globalization with distributed crowd post editing, Vasco Pedro, Unbabel [11]
In this talk Unbabel presented its method and technology for using Moses-based MT in combination with community post-editing, and showcased key integrations and early results.

Introduction to Matecat, the open-source CAT tool for post-editing, Marco Trombetti, Translated [12]
Marco Trombetti discussed the strategy beyond CAT-tools, the use cases for LSPs and buyers, as well as tutorials on advanced Moses integration including real time online learning.

[8] http://www.slideshare.net/TAUS/taus-machine-translation-showcase-taus-introduction-and-mt-market-overview-taus-2014
[9] http://www.slideshare.net/TAUS/taus-machine-translation-showcase-machine-translation-at-ebay-2014
[10] http://www.slideshare.net/TAUS/taus-machine-translation-showcase-the-simplified-guide-to-getting-started-in-smt-precision-translation-tools-2014?utm_source=slideshow&utm_medium=ssemail&utm_campaign=post_upload
[11] http://www.slideshare.net/TAUS/taus-machine-translation-showcase-seamless-globalization-with-distributed-crowd-post-editing-unbabel-2014?utm_source=slideshow&utm_medium=ssemail&utm_campaign=post_upload
[12] http://www.slideshare.net/TAUS/taus-machine-translation-showcase-mate-cat-translated-2014

2.1.1. Results

As during the other MT Showcases, in Vancouver we showcased a variety of ways in which MT and Moses solutions can be implemented in various environments, from cross-border commerce to crowd-sourced post-editing. This latest showcase presented the breadth of solutions that Moses enables, pointing to the versatility and value that this open-source solution offers as an enabling technology.

2.2. Moses Industry Roundtable in Vancouver, Canada

Hosted at: AMTA
Date: 26 October 2014
Aim: Discussing the future of Moses: Moses beyond the MosesCore project
Audience: Translation companies, government, academia
Number of participants: 37
Web presence: 589 views of the presentations (04/10/2014-14/01/2015).

An overview of the presentations:

TAUS Moses Industry Roundtable 2014, MT Market, Jaap van der Meer, Achim Ruopp, TAUS [13]
In this presentation TAUS analyzed MT market trends, opportunities and challenges as well as market drivers and inhibitors.

TAUS Moses Industry Roundtable 2014, Moses-Past, Present, Future, Hieu Hoang, Ulrich Germann, University of Edinburgh [14]
This presentation is about EC projects dedicated to MT.

TAUS Moses Industry Roundtable 2014, Changes in Moses, Hieu Hoang, University of Edinburgh [15]
Hieu Hoang presented the results achieved within the MosesCore project in the last three years.

TAUS Moses Industry Roundtable 2014, Moses-Past, Present, Future, Hieu Hoang, Ulrich Germann, University of Edinburgh [16]
This presentation provided an overview of the EU projects that funded Moses in the past and of those on the horizon that will continue to fund some of the development.

TAUS Moses Industry Roundtable 2014, Introducing Strategic Questions [17]
Q&A and results of the breakout discussions.

[13] http://www.slideshare.net/TAUS/taus-machine-translation-showcase-taus-introduction-and-mt-market-overview-taus-2014
[14] http://www.slideshare.net/TAUS/taus-moses-industry-roundtable-2014-moses-past-present-future-hieu-hoang-ulrich-germann-university-of-edinburgh
[15] http://www.slideshare.net/TAUS/taus-moses-industry-roundtable-2014-changes-in-moses-hieu-hoang-university-of-edinburgh
[16] http://www.slideshare.net/TAUS/taus-moses-industry-roundtable-2014-moses-past-present-future-hieu-hoang-ulrich-germann-university-of-edinburgh
[17] http://www.slideshare.net/TAUS/taus-moses-industry-roundtable-2014-introducing-strategic-questions

At the first Moses Industry Roundtable last year, TAUS brought together the Moses developer community and Moses users from industry and governments to discuss common challenges and opportunities for cooperation to tackle common issues. By co-locating the roundtable at AMTA 2014 with the following TAUS and Localization World conferences this year, TAUS gave the broadest possible audience the opportunity to participate and continue the conversation.

As discussion facilitator, TAUS captured the notes of an organizational and a technical breakout (Appendix 3) and audio-recorded the stakeholder discussion that followed the breakouts. These resources help to identify a stepping stone toward continued maintenance, support and development of Moses, including through additional non-governmental funding, given the increased use of Moses by industry.

The organizational breakout discussed the pros and cons of different options for organizing and funding Moses development in the coming years for the different stakeholders present from academia, government and industry. The technical breakout discussed the current development process and the framework for contributions.

After the breakouts the stakeholders came together in a larger group to discuss the breakout findings and identify opportunities for continued development. While the idea of a foundation was generally supported by the participants, potential funders stressed the importance of a defined foundation scope and a description of the benefits they would get from such an organization. The stakeholders also discussed where such a foundation should be located.

2.2.1. Results

The results of the Moses Industry Roundtable discussions described above provided valuable input to the MosesCore Sustainability Report (D5.5). The roundtables organized by TAUS in 2013/2014 established a core community of Moses stakeholders from industry, academia and government that can carry the development of Moses forward after the end of public funding.


3. General Findings

In 2014 we set a goal of attracting at least 40 registered participants per MT Showcase session (see Deliverable D4.1: Industrial Outreach Plan 2012-2015 [18]). Both Showcases, in Dublin and in Vancouver, reached the targeted number of attendees.

Chart 1 shows an overview of the number of participants at each Moses event we organized over the past 3 years. It is clear that the events held in the western part of the world were better attended than those in the eastern part. One reason for this is that the audience of LocWorld (the conference that hosted the MT Showcases) in Asia is always smaller than at its conferences in Europe and North America. Chart 2 shows the average number of participants per year. Attendance dropped noticeably in 2013 but rose again in 2014.

Chart 1 Number of participants at TAUS MosesCore events, 2012-2014

[18] http://www.statmt.org/mosescore/uploads/Internal/io-plan2.pdf


Chart 2 Average number of participants in TAUS MosesCore events per year

Following the recommendations listed in the previous EC reviews, in 2014 we handed out short surveys during the MT Showcases in Dublin and Vancouver. The aim of these surveys was to gain more insight into how attendees learn about Moses, their plans, and the pros and cons influencing their decisions to use (or not use) Moses. We also wanted more information on how their organizations were using MT in general.

Chart 3 How is your organization using MT?


Chart 3 gives an overview of how the different organizations are using MT. It also indicates what percentage of organizations is already using Moses and what percentage is not (yet) doing so and can thus be considered prospects for the Moses technology.

Chart 4 How did you learn about Moses, 2014

Chart 4 shows an overview of the attendees’ answers in 2014 to the question “How did you learn about Moses?”. These answers were returned by 56 participants (of a total of 102 participants at the MT Showcases in Dublin and Vancouver). The answers show that a large share of the respondents learned about Moses via events (Localization World, a previous MT Showcase, MT Marathon), from business partners presenting at an MT Showcase (Precision Translation Tools, Asia Online, Sovee) or through online research.

Chart 5 shows the answers of the MT Showcase attendees in Vancouver to the question “After attending this MT showcase are you more likely to look into Moses or a Moses-based MT solution?”.


 Chart  5  After  attending  this  MT  showcase  are  you  more  likely  to  look  into  Moses  or  a  Moses-­‐based  MT  solution,  2014    

Results show that 24% of those who answered the question are interested in the further exploitation and study of Moses.

During the events it was also interesting to learn more about the factors influencing people’s choice for or against Moses. Twenty attendees of the MT Showcases were kind enough to share their reasons.

Which factors influenced your decision to use Moses? | Which factors influenced your decision not to use Moses?
-----------------------------------------------------|---------------------------------------------------------
Use case, quality | I am only just now hearing about it
Robust, fast, many companies use it | OSS
MT is a product we sell | I need to learn more
Availability, cooperation | Applicability. Few customers have data
It is open source | Scale of internal resources to implement into interval workflow systems
It works. Good support community | It does not support fine tuning (pre and post translation adjustment), segmentation rules and so on
Existing tool to leverage with it is fully replaced | We are very new to MT and it is all customized. We need to save time. That is why we didn't use it.
It is easier to use for us | Still gathering info

Table 1 Factors influencing the use of Moses, 2014


4. Conclusions

The MT Showcases in 2014 have proven to be very successful, not only in terms of the number of people attending the events, but also in terms of the feedback. Eighty-three percent of the attendees answered that they are more likely to look into Moses as their MT solution than before attending the MT Showcase events.

The MosesCore project has been crucial in developing not just awareness of Moses but also market share. The Moses MT Market Report (a separate deliverable) indicates that Moses MT constitutes 20% of the overall machine translation market. It also lists the new providers of Moses-based solutions that have entered the marketplace since the start of the MosesCore project.

To give the readers of this report the opportunity to witness the liveliness and maturity of the discussions at the different events, we have added transcripts (see appendices) of the discussions that took place at the Dublin and Vancouver events.

In 2015 TAUS plans to continue organizing MT Showcases during the Localization World Conferences.

In December 2014 TAUS introduced free Academic Membership. Post-docs and students from universities around the world can get free access to TAUS Data, knowledge bases around Moses and the Dynamic Quality Framework to help them learn and experiment with the training of MT engines. With this initiative we intend to bridge the gap between industry and education and to help companies find MT talent and computational linguists.


Appendix 1: TAUS MT Showcase Dublin 2014 Discussions

1. Panel Discussion questions

Moses(Core) related
- Is your solution Moses-based? If yes, what was the reason to base the solution on Moses?
- What is missing in the Moses open source project for industry use?

Quality/Evaluation related
- I’m a small LSP. How can I verify that the solution you are offering works for my use case? How much will this evaluation cost me?
- Do you see the TAUS Dynamic Quality Framework as a good way to independently evaluate and compare different MT solutions?

MT Market related
- In a survey for our upcoming MT Market report respondents identified the following main trends and drivers: Acceptance, Availability, Usability, Large(r) quantities of data, Low costs, Speed. The MT market has always been driven on the abovementioned elements, so what’s new?
- Can you take me through what you/your company see(s) as the current state of MT?
- Do you see the drivers changing in the future?
- Where do you/does your company see MT going in the next 5 years?
- Where do you see growth areas for MT use? (over the next 5 years)
- What are the challenges to growth of the MT market sector?
- Is there a market for MT? Where is pricing going?

MT Solutions related
- Where do you see the biggest opportunities for MT solutions over the next 5 years?
  - General vs. domain-specific vs. customized
  - Cloud vs. on-premise
  - Broad language coverage vs. focus on small set of languages?
  - Industry verticals?
  - Post-edited MT vs. gisting and other non-edited uses

2. Questions from MT Market Report Survey

1. Please indicate the percentages of your MT related offerings for each of the items listed (as % of revenue)
2. What is the geographical spread (approximately) of your revenue in MT (as % of revenue)?
3. What is the delta in MT related revenue for your company from 2012 to 2013?
4. What do you see as the key market trends and what is driving the MT market sector?
5. What are the challenges to growth of the MT market sector?
6. What do you see as the opportunities for your company or for the MT sector in general?
7. What do you see as the threats for your company or for the MT sector in general?


3.  Questions  from  MT  Market  Report  Interviews    MT  Market  Drivers  and  Inhibitors  Main  trends  and  drivers  according  to  responses  in  the  survey:  Acceptance  —  Availability  —  Usability  —  Large(r)  quantities  of  data  —  Low  costs  —  Speed    The  MT  market  has  always  been  driven  on  the  abovementioned  elements,  so  what’s  new?    Can  you  take  me  through  what  you/your  company  see(s)  as  the  current  state  of  MT?    Do  you  see  the  drivers  changing  in  the  future?    What  are  the  challenges  to  growth  of  the  MT  market  sector?    Where  do  you/does  your  company  see  MT  going  in  the  next  5  years?    Where  do  you  see  growth  areas  for  MT  use?  (over  the  next  5  years)    4.  Questions  derived  from  presentations    1.  MT@EC  for  European  public  administrations  and  online  services,  Spyridon  Pilos  (European  Commission)    The  European  Commission’s  new  machine  translation  system  has  been  available  since  June  2013.  It  was  built  by  the  Directorate  General  for  Translation  (DGT)  using  Moses  and  the  EU  institutions’  translation  memories,  stored  in  the  Euramis  database.  It  is  continuously  improving  through  close  collaboration  with  EC  translators,  and  regular  inclusion  of  their  more  recent  translations.  MT@EC  will  be  the  starting  point  for  an  "automated  translation  platform"  to  be  funded  by  the  Connecting  Europe  Facility  in  order  to  support  multilingualism  of  other  European  digital  service  infrastructures.    2.  Beyond  Data:  Delivering  Machine  Translation  with  Subject  Matter  Expertise,  John  Tinsley  (Iconic  Translation  Machines)  There  are  a  number  of  current  approaches  to  developing  commercial  machine  translation  systems,  ranging  from  do-­‐it-­‐yourself  platforms  to  fully  customized  development  as  a  professional  service.  
While  these  various  approaches  have  their  relative  merits,  they  all  present  a  number  of  drawbacks  for  the  end  user,  be  it  the  inability  to  handle  complex  content  or  a  long  and  expensive  period  of  development  and  testing.    At  Iconic  Translation  Machines,  our  approach  goes  beyond  basic  engineering  of  data  to  build  MT  systems  and  overcome  these  drawbacks.  We  combine  deep  domain  knowledge  and  linguistic  expertise  to  deliver  highly  focused  MT  engines  for  targeted  domains  and  languages.  Our  IPTranslator  service,  for  example,  has  been  developed  using  this  approach  to  produce  intelligent  MT  systems  adapted  for  patent  and  legal  

Page 18: 04-02-2015 D 4.7 MosesCore 3rd Yr Events, 2014 V02 · 2 !!! tableof!contents! executivesummary! 3! 1.!outreach!events!ineurope! 4! 1.1.!mt!showcase,!dublin,!ireland! 4! 1.1.1.!results!

18

content. We demonstrate how this approach has delivered significant value to end users and describe how these systems serve as an ideal launchpad for ongoing adaptation and optimization.

3. Enabling MT for everyone! Tony O'Dowd (KantanMT)
Working with Moses and building high quality MT systems is not for the faint-hearted. It requires a wide range of technical and linguistic knowledge that is often difficult to find and develop within organizations. Consequently, only the biggest organizations have the financial muscle to invest in MT and reap its rewards. This puts small-to-medium sized organizations at a distinct disadvantage. KantanMT changes everything! KantanMT is a cloud-based implementation of Moses which enables SMEs to embrace the advantages of MT quickly and economically. This presentation will demonstrate the KantanMT approach to rapid engine training and tuning, data analytics used to predict MT quality and create tiered pricing structures, and instantaneous engine deployment, all of which are driving the new MT Revolution!

4. Sovee Smart Engine 2.0: A Leap Beyond Base Moses Technology, Scott Gaskill (Sovee)
This month marks the advent of a new generation in Machine Translation. With the release of Sovee Smart Engine 2.0, it is now possible to process virtually unlimited simultaneous transactions without the limitations originally inherent to the base Moses technology. Sovee's latest development delivers an unprecedented 500 language engines, which will expand to thousands of languages in the next few years. This workshop will demonstrate the automated language tuning and training capabilities of Sovee Smart Engine 2.0. It will highlight the deep cascading framework that delivers the highest level of accuracy ever imagined for machine translation, and a new combined process for SMT and post-editing.

Appendix  2:  Transcript  of  the  recording  of  the  TAUS  MT  Showcase  Panel  Discussion    

(53:00)(03/06/2014)  

Maria - Pangeanic: Well just, interesting presentation, one question: did Google start to notice your efforts? Have they, have they seen what you guys are doing?

Andrejs Vasiljevs: They are a search engine, they should have noticed. (laughing)

JVM: What do you mean? Whether they are afraid?

Maria - Pangeanic: Afraid, I don’t know if Google can be afraid. But, just if they noticed, you know, that your engines for, you know, your search engines, or your MT engines are a little bit better than theirs for your languages.

JVM: And by the way this was Maria from Pangeanic who asked the question.

Andrejs Vasiljevs (54:07): Thank you for the question. The research community will, do share all over the world, for instance published works of papers, and client by client have a nice reactions with people from Google, but mostly we do not want to disclose what we are doing for those reasons. But I think we are not pretty much worried about other developments, and we were happy to be approached by Microsoft, and we helped actually with Microsoft Research for the ‘Bing’ engine for some of the languages. *text missing, help with others; Google as well*

JVM (55:00): Alright, good, anything from the audience? We are now going to zoom in on sort of the general implementation questions, concerns around using MT. We heard at the start that there are only actually a few of you who are not using MT yet, but planning to do that. Have you learned something, those of you who are not using MT yet, that you didn’t know before? What do you think? Are you now happier to start using MT?
Sergio, VMware: I never used MT before, but for my current company it would be a fresh start definitely, and that’s why I’m here. Yeah, but what I learned is that there is a lot to learn.

JVM: And that’s Sergio from VMware. What about Amazon, are you now happier than you were before, after this session?

Name (male), Amazon: Well I think in terms of MT, we touched different points and the data was one of the points that John discussed there, as one of the key areas, that the more data you have the better the quality of the translations you will get with the MT. But I was very impressed with John, John’s presentation because it demonstrates a little bit of a different aspect of how the MT can be used, so we are very, we were used to use machine translations as a hybrid, as a rule-based or a statistical and see the different cogs and the different parts that would, could enable each other. It opened my mind a little bit on machine translation, yeah, and the ways that we can go forward.

JVM: Thank you, any other … George, are you already using MT?

George, Boffin Language: We don’t use MT too much; we just tried from Chinese to Japanese for, for patent abstract, something like that. The result is OK, it is pretty good; plus PE (post-editing), the result is our customer is satisfied. We didn’t try from English to Chinese. We are thinking about that, thank you.

JVM: What did you use? What are you using?

George, Boffin Language: We cooperate with an MT start-up company in Beijing, so we don’t know exactly the technology behind.

JVM: OK, using Moses do you think?

George, Boffin Language: I don’t know actually.

JVM: We never know, that was George from Boffin Language. So what do we do with that kind of obscure, yeah, it’s a kind of obscurity that exists in the market where people don’t know what’s really behind that screen. You know, is that important? Or another question: when you’re an LSP, like some of you, do you have to tell your customer that you are using MT, is that important, do they have to know? Any observations from the panel here? How transparent do we have to be about what’s inside?

Tony O’Dowd (58:26): I think, if, if you go back to the very early days of translation memory everybody wanted to know how a translation memory system worked, they wanted to know the innards because they didn’t trust technology, they didn’t trust ‘fuzzy match’. Go fast forward to today, nobody questions fuzzy match.
Today it’s a fundamental fact of this industry, it drives the cost model, it drives scheduling, project managers use it every day, in fact it drives the RFQ process that most companies use to a great extent. So I think in the absence, in a situation where you have high trust in technology, they tend not to question it because they just accept it. OK. With machine translation, although it has been around longer than translation memory, and that’s a startling fact, translation memory came after machine translation.

JVM: It was just a lower feature of MT.

Tony O’Dowd: Absolutely, yes, it’s a longer technology. But it’s only in the last couple of years that it has really started to emerge as a viable tool to aid productivity, so I think today there is a great interest and curiosity as to what goes on behind the scenes. But I would anticipate that as more and more people grow to trust it, that curiosity will become less and less, and the curiosity will shift onto how we can actually maximise the benefits of machine translation, not understand the technology behind it; it’s how we can leverage those benefits in our business. So for instance a lot of our clients, when we engaged with them, when we were giving the product out for free about 18 months ago, were all about: what sort of reordering models you’re using, what sort of data crunching you’re using, what sort of, what’s the minimum number, amount of words you need to build a model. It was all sorts of technical questions, whereas today, the clients that are actually in deployment of machine translation, they don’t care about that, it’s not part of their, their dialogue. Today they are all about: how many words can we pump through an engine? How scalable is the engine? How fast will this engine work? Can we take that engine and stick it on our customer service portal or user support forum? So it’s all about gaining the benefits, rather than understanding the technology. And I think that’s going to go better, or sorry, the curiosity of understanding *the technology* is going to get less and put more focus on maximising the benefits for the product.

JVM: Yeah I guess an interesting, can I just carry on with Tony for a while since you mentioned it, 28 years. So you’ve been in that previous revolution of translation memory entering the market. Would you say it’s very similar, exactly the same story, that people are just shocked at the beginning and you know so?

Tony O’Dowd: I just want to make it very clear that I was 12 when it entered the market (audience laughs). I think Jaap, that you were 20 I think?

JVM: I started in 1913, so (audience laughs)

Tony O’Dowd: Sorry, the question was?
JVM: The question was: is the MT revolution that we are going through now very, very similar or exactly the same as the translation memory revolution in the eighties, late eighties?

Tony O’Dowd: It has certain characteristics, I remember back when, I remember one of the first meetings we had with you guys, you had to build your first translation memory system. I think at the time there was only one other product available, which was Xl8. I remember, this is going back 25-28 years ago. Ah, this man remembers it as well.

It was like, you know, if you were using it you were the pioneer, OK. It’s like a brick in a dam; you’re a pioneer, and that’s one brick out of the dam, and then you get somebody else using it and that’s another brick out of the dam, and eventually, the more and more bricks that get out of the dam, the dam bursts and the revolution is here, and that’s what we are going through now. So we are seeing lots and lots of progressively curious LSPs and ISVs that want to get onto this train, the MT train. They are taking bricks out of the dam but we haven’t got the dam burst yet. It’s not quite a torrent of water but I think it’s accelerating. You know, I think the work that TAUS is doing in these shows and exhibitions is certainly adding to that, and I just get a sense, and maybe my competitors there probably sense, that today there is more and more people gravitating towards the benefits of MT.

It is not a replacement strategy that they are adopting. It’s, you know, augmenting, a supplementary strategy to help them do things faster, cheaper, to translate more content. I think it’s coming and I think Moses, the open source version of Moses, has clearly been at the centre of that. To make MT accessible to more people than any other effort of MT has ever done before.

JVM: Right yeah, so now we get all these questions because people are, they are fearing, and they are not knowing, and that will just go away after a year or two from now. No questions asked, it’s just a given; you use MT technology.

Scott Gaskill (1:03:31): Customers are going to see MT is an enabler to help them get their translations done. It’s no longer going to be how did you run it through the tool, what percentage went through the tool and so forth; it’s really going to get down to delivering translations to the customer the way the customer wants it, and enabling those tools, different tools will offer that today, to be able to deliver that to an LSP or directly out to a customer. At the end of the day, I don’t think we are going to be asking questions about TMs and MT and everything else, we are really going to be asking questions: how well can we deliver to our customers, and that our customers have the ability to give us information back so that we can make it better in the long run.
Tony (1:04:22): That’s a very strong point, because if you think about the ultimate end in that, it is that we won’t be talking in 3 or 4 years’ time about MT and TM, we’ll be just talking about pre-translation. You won’t care where it came from, it’s just high quality pre-translation. So this argument will kind of be almost muted in a few years’ time. We are getting to that today; almost every client we have today is not using machine translation in the absence of translation memory, they are using both technologies, it’s a seamless experience. So I think we will be talking pre-translation, there will be no distinction.

JVM: And since this is a Moses Core funded workshop, I want to ask the question, for the record, because everything is recorded and we are reporting back to the European Commission. On the front row we have Systran, here in the audience. You know, actually we requested that we would open this workshop up, with, we have been running it for three years now, almost, and called it the MT Showcase and not Moses Core showcase necessarily, because we would like to know what else is out there and what else is progressing; is the future just Moses? We didn’t select you because you’re using Moses, because you’re all using Moses, right. And so does the European Commission. What do you think? I mean is there, how would the other part of the machine translation market develop, and I’ll come back to you if you would like to comment on that too.

John Tinsley (1:06:02): I think, as I kind of presented earlier on, Moses has a place in any type of MT world, so even if you are a company like Systran who provides essentially rule-based machine translation systems you can use, something like Moses could be used at various stages in that process to enhance MT.

JVM: And they do, yeah, they put it in the mix too. Let me come straight to you.

*Name (male)* Systran: Hi, yes, Systran is actually using Moses since 2009, right? (Systran colleague agrees) 2009, to do some kind of statistical post-editing. So Moses alleviates our research and development just as it does for you, so that we can concentrate on peripheral technologies and improving the output from Moses. Actually Systran is reasoning a little like you, combining different steps and building the translation along, using either RBMT, SMT, pre-processing, post-processing and so on.

We counted that we have close to 49 different processes between the time you enter your document into the system and the time it gets out of there. So yeah, we are using Moses and we believe Moses is a very nice initiative, because it allows us to combine forces to provide core technologies that we can build around.

John Tinsley: I think a good point there, you say it allows you to kind of alleviate your R&D efforts and kind of use it as a supplementary tool, but I think one of the real powers of Moses is that, because it’s open source, you have the capacity to actually perfect it yourself, so it’s a really, really strong kind of baseline that it gives you that you can then build upon and make it do things that it doesn’t necessarily do yet, for your own benefits as well.

JVM: Yes Andrejs, go ahead.
Andrejs Vasiljevs, Tilde (1:08:07): It’s Moses, I think yes indeed for, for, for some time to come, and actually Moses is one of the most mature and best implementations of the breakthrough in machine translation in the late eighties, when researchers at IBM’s Thomas J. Watson Research Center published very famous papers on statistical machine translation. And in those few details, Moses and people at Edinburgh and other things were able to implement these methods in quite a robust platform. But I think there is an ongoing discussion in the research community that we have to look for, for other alternatives, other directions, that statistical machine translation is not the end of the game, but there probably could be some next breakthrough possible in the coming decades. But it will take, first we have to come to this breakthrough in the research field and then it will take at least a decade until the breakthrough will be mature enough to use for, for practical purposes.

JVM: Thank you, yeah Tony?

Tony O’Dowd (1:09:20): Just in relation to Moses, I have a very considerate view on Moses. If you equate Moses to the internal combustion engine: every car manufactured in the world uses the internal combustion engine and that is what Moses is to MT providers, it’s like an internal combustion engine, but just like every car is different from every car manufacturer we are going to get lots of different flavours, we are going to get huge leaps in innovation, we are going to get lots of new reordering models, going to get analytic models, and we are going to extend the power over and over again. And just like if you’re in Formula One, you’re going to have a 3.5, 300 brake horse, brake, bhp, brake horsepower engine; the family sedan is only going to need 100 brake horsepower. So that is what I view Moses as, it’s the internal combustion engine for a whole range of industries and it’s just going to change the way we translate content.

Chris: Do you mind if I….

JVM: Yes, Chris, grab the microphone and then I’ll come to you Tom.

Chris (1:10:21): So to take that analogy just a little bit further. Somebody is going to come around and advance the electric engine, and you’re going to have to test that. And that’s going to start the next revolution, and so we are working really hard, just as we all are together. Em, the internal combustion engine does go very far but there are its limitations and flaws. But another analogy that we kind of joke around about is, does anybody know, I don’t know why they call the Moses project the Moses project, but if you think about it, Moses stuttered, and he had a lot of responsibilities, that if you know the story. God gave him a lot of responsibilities, he kind of talked about him stuttering and then some responsibilities were taken away from him and given to his brother, or his brother Aaron.
So there is going to be borrowing, and other tools that supplement because Moses stutters, you know Moses isn’t perfect, we can supplement it, but some big breakthrough is going to change and revolutionise, and Moses never entered the promised land, he wandered through the wilderness. So just the same way I don’t think Moses is going to take us into the promised land of.

Comment from Audience – too difficult to hear.

Chris: Yeah, yeah, forty years, yeah, and Moses, we can count down and figure out when that fortieth year is.

Comment from Audience – too difficult to hear.

Chris: But something is going to come, and it’s going to be the electric engine or you know it’s going to be somebody else leading us, some other tool leading us into the promised land.

JVM: Chris, thank you for that nice allegory. Tom Hoar is in the audience also, from Precision Language. User of Moses.

Tom Hoar (1:12:09): Tom Hoar, MD Precision Translation Tools. We have a distributable software application that people license and insource the production of SMT. Like Tony said, you were talking about the pre-processing tool chain. Or pre-processing. I like to call it, it’s simply MT production or translation production. We are all in the business of translation production, and when we get into the concept that we are really like a factory and there are lots of things that go in the tool chain, and that’s what we are doing. But anyway, I agree with you Tony, I agree with you that we’ve got an engine, it’s a generic thing that is customised. You put an internal combustion engine on four wheels, you can get a truck, a pick-up truck, or you can get a Lamborghini, OK. And they are different things and each has a different purpose.

But let’s look at, Moses is out there for 8, almost 9 years, it’s been heavily funded, the development of Moses has been heavily funded by the European community and by DARPA. Moses Core was a three year scheduled project, we are on the third year. Sometime towards the end of this year, Moses release three and their deliverables will make the Moses Core project complete. I don’t know what the European, so one question is: what is the European community’s or other communities’ involvement in supplementing the Moses development after Moses Core expires?

JVM: Nobody knows.

Tom Hoar (1:13:50): Nobody knows, OK, it’s undetermined. So let’s take some hypothetical: what happens if government funding stops for Moses? Each of you and I and maybe some other people in the room have systems built around Moses. We have a contingency plan for what happens if Moses is no longer developed; it really basically puts us all in a position, as commercial vendors around a government-sponsored open source project, of having to either fork the code, branch the code or do something with the code that’s there.
And so I’d like, on the, so A) have any of you thought about that eventuality and do you have contingency plans in place? So that’s a question, and finally, in some of the, I’ve proposed on the discussion board that, in one of the unconference sessions, probably the second to last or the last one, not tomorrow but on the last day of the conference, which is Friday. I suggested that some of us as implementers get together and look at the practical things that are necessary to keep Moses going should funding disappear; what are we going to do?

JVM: Thank you, that’s on record and I suggest you give the mic to your neighbour. Do you have a contingency plan?

*Name (male)*, Precision Translation Tools (1:15:24): I wanted to add to Tom’s words. Yes, we have been Moses users for, since five years ago anyway* if there really is one* Moses, I think you will agree with me, levels the playing field. We are all using basically the same translation engine; it is just the others, the other technologies that we merge around it, that make our technologies, our offerings, different. There are other translators, there are other statistical translators out there. Now with the funding coming to an end, as Tom says, I don’t envisage the death of Moses, but I think it is going to speed up the birth of other alternatives and that could affect our business models. So it’s only the business model of people that only offer technology. We have to feed into, I think it’s going to speed up the birth of other technologies, disruptive technologies or disruptive translators. *


JVM: So it sounds like a recommendation to stop the funding. To stimulate innovation, disruptive. So what are your thoughts there, what’s next if the funding stops?

Chris: With a development background that works in the open source community from time to time, open source typically, if it’s a viable tool, the community builds it and continues moving forward. It doesn’t have to have a governing body behind it pushing it. I think that was the initial intent, to bring it a little more into the limelight of the open source community so the community actually contributes back to it. And as far as our contingency plan, as we said, we have been taking away responsibilities from Moses and we actually have a timeline of when that is, of when those layers are finally replaced. But it is also a transitionary period where we have to make adjustments.

Question from Tom: And you have identified Moses as a legacy technology, which you are moving away from?

Chris: Exactly, yeah.

JVM: Do you think that’s on record? Did we hear that, really hear that?

Tom Hoar: I’ll repeat that, so you have identified Moses as a legacy technology that you are moving away from, and just for the record.

Chris: Yes, we are actually surprised we were able to use it, and in the workflow that we currently are showcasing of the online learning, to be able to learn in real time, to adapt, to do both immediate learning and post analysis afterwards to make sure each post-edit truly is taken into account and adapts over time, that tuning based on each post-edit. We were surprised that we could get it done while implementing with Moses.

Tom Hoar (1:18:24): Can I *arbour*, Tony you were shaking your head, yeah, yeah, yeah, can you contribute here?

Tony: (Laughing) Well I, you know, I wish I had a crystal ball and could stick it on the table here and kind of predict the future but nobody can go measure. I think there are plenty of examples of open source technology that is not government supported that have flourished, everybody uses them day to day. I use the Xerces XML parser, it’s been out there for donkey’s years, it’s fully open source, fantastic. In fact I’d say most developers in this room probably use the Xerces XML parser.

Tom Hoar: Question, does anybody use the PostgreSQL database in this room? Do you use lib, what is it? libpqxx, the C++ library for it? The author is sitting right here. (laughing). Anyway, OK Tony, next I’d like to get a little bit of *input* because everybody was shaking their heads about legacy, the contingency plan.


John Tinsley: Yeah, I think people seem to like analogies here so, I think Moses is like a snowball that’s, you know, it’s after gaining enough momentum that if you turned off the funding tap tomorrow, I think there are, I think nearly every MT research group in the world I’d say is using Moses, is developing Moses, many of them are contributing to the code bases, so I don’t think the development from that perspective is going to go away. But what I think as commercial providers of Moses we need to do is build the competency in Moses within our teams.

A) so that, you know, if it suddenly, you know, disappears, that, you know, if suddenly our pool of open source developers who are doing all the work for us disappeared, that, you know, we’d be able to pick it up ourselves. But also that we can, you know, to be able to improve Moses we have to understand it, and it’s a very modular piece of software, so you know there are methods in there for doing the word alignment, the phrase alignment, heuristics, there is the decoder, etc. etc. We might want our own version of that to get it in some different formats so we can manipulate it in other ways.

Tom Hoar: And finally?

Andrejs Vasiljevs (1:20:34): This community is so dynamic. If you look on the papers presented at the research conference there are tons of papers on different modules, and components and methods have been viewed to complement or place some complements on the Moses. So it’s not just a European Commission funded activity, actually it’s a very vibrant field, funded by universities and other funding agencies.
But still the question is who will take care of packaging that together and ensuring some level of quality and support for the core Moses toolkit. That is still a question for the community: to organise better support activity * when it branches the Moses operations. Sometimes you see a very nice feature in one package and a very nice feature in another package, and if they were packaged together it would be so excellent. But who will do that?

Tom Hoar: OK, I'm going to pass it to Jeroen, because he has a question. You want to say something?

JVM: You can take it over, Tom.

Tom Hoar: Introduce yourself.

JVM: That's OK.

Jeroen Vermeulen, Canonical: Jeroen Vermeulen, with Canonical, the makers of Ubuntu Linux, and Precision Translation Tools here. I would like to amplify what Andrejs has just said. It is not just a matter of funding; in open source projects there are certain pathologies which set in. It's a matter of stewardship: you need a central accepted authority to bring together developments in a project, and it's OK if that authority breaks down as long as somebody else steps in and becomes accepted as the active developer. If that turns into conflict, confusion or entirely disconnected development, there is substantial risk to the further development of the combustion engine.

To go back to that analogy of the combustion engine: there was a time not very long ago when companies were confronted with the task of designing a combustion engine from scratch and discovered they could not do it. Some very well-known companies nearly broke their backs financially trying to reconstruct that, to develop a new engine from scratch. Today that is something we laugh at, several companies have been able to do it now, but this really substantially slowed down the development of the car industry, and the same thing happens in open source. I could name you some vital, absolutely vital open source infrastructure in computing today, such as the X Window System, which has been stagnant since the turn of the century, I would say.

Everybody knows this needs a replacement, everybody agrees on it, everybody is working on it, and yet we are not getting there. So I say funding be damned, but there needs to be good stewardship. That is essential: you cannot just put a bunch of people in a room and say we'll do it together. Somebody needs to have the responsibility; it's not a philosophical point, it's a practical necessity.

JVM (1:24:09): Thank you, this is all very valuable material. And I'll tell you, just as you say, Tom, we are approaching the end of the third round of funding of the Moses Core project now and we will be interviewing a lot of Moses users and developers in the next two months. So since we have lovely insiders here.
I'd like to just do a very quick exercise. I think we have a pretty good idea of who the users and developers around Moses are, but can I just do a quick sort of name dropping, so that we know we don't forget about a party? Can I go with you first? Just drop names, and they are all on record.

Names of companies using Moses:

• Precision Translation Tools: our product is Do Moses Yourself, an open source collection of all the tools.
• Pangeanic (Manuel)
• Tauyou
• *Euroscript*: We are also using Moses. We can also mention Asia Online with Philipp Koehn, and the group in Edinburgh is instrumental for the development of Moses.
• Moravia is also using Moses, among other engines
• Crosslang *Natalie*: we build Moses engines for our customers with their data, so they are also using it indirectly, I suppose.
• Google
• Welocalize
• Safaba
• Sovee
• KantanMT
• *Logos*
• Arabis (LSP)
• Iconic Translation Machines


JVM: Another topic? Is there still an appetite for another topic? We just talked about the sustainability of Moses, what happens and so on. But a totally different topic around Moses or MT: do you have something that you would like to discuss? Or are we sort of getting tired and ready for drinks?

Tony O'Dowd: Let the audience decide.

JVM: Ladies and gentlemen, just fill out the form and you are free to go. Thank you, panel. Thank you all for doing this.


Appendix  3:  Notes  from  Moses  Industry  Roundtable  Breakout  Discussions  

Business Breakout

Thoughts from Philipp Koehn: follow the model of Apache
• sponsors who only get advertising and nothing else
• governed by board membership, time limited and voted by foundation members
• different levels of sponsorship

Possibly good to start a foundation housed at one university
• integration and documentation would be two priorities
• also events and then evaluation campaigns, plus student sponsorship

Should levels of sponsorship give you more votes?
• probably not, because you don't want sponsors telling you what to do

For a working group, where does the money come from?
• need to have at least a couple of staff members to do code maintenance; who will pay for them?

For a corporate sponsor, what if that company loses interest? Does that scare off other companies?
• or is it like Android, where multiple companies will still be interested?
• Safaba would be more comfortable if there were not just one company controlling
• Tilde would need assurance of how much the sponsor would control the plan
• being part of the open source project allows you access to the new development
• going on your own, you miss out on the developments of the code
• if people go off on their own, the summed costs are much, much higher than pooling efforts

Advantage of a single big sponsor is that they are willing to put more effort behind it
• like with Apache, each company invests very little in the environment

Academic sponsor
• scale of support would be lower
• could house it there at a university, but the money won't come from there

Could there be an independent revenue stream, à la the Mozilla Foundation?
• but hard to see where that stream would be

Looking at where Moses is as a platform
• still very blue sky, very early stages of the development
• the tool will grow for decades to come
• need to have a framework for state-of-the-art research for a long time to come, so that the platform serves its purpose
• some companies will develop their own proprietary solution, but they will have to support it entirely themselves

On the subject of community
• Cracker provides money for community-based development
• but also drives the comparison and the shared tasks
• comparison and shared tasks are the most important way to drive community
• these were successful in driving the advancements through the 2000s (NIST, DARPA)

Moses foundation could be more region-independent
• WME is more Euro-focused

A large corporation could sponsor a specific shared task to encourage concentration on one task or problem

What would be the first steps?
• talk with possible host organizations to see if they are willing
• geographic location is a question: maybe one in EU, one in US, how about in Asia?
• moving out of EU will help to make it less EU-identified
• also, see if there are enough people willing to put in funding
• need to clarify legal issues and mission statement first, before the fundraising can begin
• possibly could start with sponsoring smaller projects, beginning with baby steps

Technical Breakout

Some organizations require software approval for major version changes
• this should be kept in mind when numbering Moses releases

Windows support
• Moses originally developed on Windows
• Later more contributors for Linux version
• Supported under Cygwin

Users often develop own infrastructure around decoder

Hieu described release process
• Train & run models for 16 language pairs
• Branch about one month before release
• Rare patch releases

API changes
• Unclear how API is used upstream
• No versioning
• Command line/named pipes don't change

Contributions in /contrib folder
• Often missing description
• Often unclear if maintained or not
• Some contributions were moved to /contrib folder to clean up decoder code base
• Some users would be open to contribute components, but infrastructure developed around decoder (see above) often creates dependencies

Feature requests
• Training cycle too long
• Translation quality – how to judge(?)
• Multilingual phrase table
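
The first note above (some organizations re-approve software on major version changes) suggests why release numbering matters to users. As a minimal sketch only, assuming a hypothetical major.minor.patch numbering scheme and hypothetical helper names (`parse_version`, `needs_reapproval`) that are not part of Moses, a user-side check might look like this:

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'major[.minor[.patch]]' into a 3-tuple, padding with zeros.

    Assumption: releases are numbered with up to three dot-separated
    integers. This is an illustrative scheme, not a Moses guarantee.
    """
    parts = [int(p) for p in text.strip().split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported version string: {text!r}")
    parts += [0] * (3 - len(parts))
    return parts[0], parts[1], parts[2]


def needs_reapproval(installed: str, candidate: str) -> bool:
    """Return True when the upgrade changes the major version number."""
    return parse_version(candidate)[0] != parse_version(installed)[0]
```

Under this assumed scheme, moving from 2.1.1 to a later 2.x release would not trigger re-approval, while moving to a 3.x release would.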


Appendix 4: TAUS MT Showcase Vancouver 2014 Discussions

Q&As (1:07)

(Achim) Thanks to all the presenters for an interesting and diverse set of presentations. I know we haven't had much time for questions to Saša and Tom. Does anybody have any questions for them?

Q(Marco) Am I allowed? (Laughs). So, Saša, I've seen the numbers you've shown us and they are incredible and good news for the Moses community. Now you have been delivering scale in terms of number of requests with Moses. But I need some clarification. You were saying that you managed to get to 20 milliseconds, but I think that, in order to set the expectation, am I right that this is only applicable for query translation, where the reordering model is very easy and the phrase table can be pruned a lot? So, do you have any data on Moses speed if you try to translate the entire description?

A(Saša) So first of all, yes, you are right, the 20 milliseconds are only for the search backend, and this is fortunately only queries, which are mostly very short sequences. As I said, we need little reordering, so actually the distortion limit is only 1 in Moses, and also the stack limit of the pruning is very aggressive, without much degradation in quality. So we still look at finding the sweet spots: one part of our tuning or training of the systems is to look at where the sweet spot is for the systems in terms of quality and speed. Obviously, when you move to longer sequences like titles, these 20 milliseconds won't hold any more. But these are also use cases where you do not necessarily need this real-time thing. Search is something which is user generated content essentially, and if a user types a search query, users don't like to wait, so that is really where you need the real-time component. For the titles we allow some latency, because the sellers put up their item, but then we have time. We can then take a couple of seconds to translate and put it in the caches, and even if someone hits it in a search within the second that it went online and they don't see the translation, this is something we accept, because it will just be a couple of customers; then at some time, once the translation is ready, we serve it.

Q(Marco) Can I ask the question in another way: do we still need to improve Moses in terms of performance? What I mean is, if you have the full reordering model, if you don't prune the phrase table, you don't run optimization, is Moses able to translate a sentence as fast as Google Translate and Microsoft, or is the technology not there because we need to make it more scalable, in your opinion?

A(Saša) In my opinion, Moses is pretty fast, so it is very competitive. I don't know what the speeds of Google are; I mean, they have a kind of massive parallelism. I don't know if they do some kind of sub-sentential splitting or something, probably not, and so for long complex sentences they will probably also take a little bit longer, but I don't know about the measurements.
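
The title-translation flow Saša describes above (translate in the background when an item is listed, put the result in a cache, and accept that a search in the first seconds sees the untranslated title) can be sketched roughly as follows. This is an illustration under stated assumptions, not eBay's implementation: the class name `TitleTranslationCache`, its methods, and the `translate_fn` callable standing in for a call to an MT server are all hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TitleTranslationCache:
    """Serve cached translations; fall back to the source title until ready.

    Hypothetical sketch: translate_fn stands in for a (slow) MT request.
    """

    def __init__(self, translate_fn, max_workers=2):
        self._translate = translate_fn
        self._cache = {}
        self._pending = set()
        self._lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, item_id, title):
        """Called when a seller lists an item: translate in the background."""
        with self._lock:
            if item_id in self._cache or item_id in self._pending:
                return None  # already translated or already queued
            self._pending.add(item_id)
        return self._pool.submit(self._translate_and_store, item_id, title)

    def _translate_and_store(self, item_id, title):
        translated = self._translate(title)  # may take seconds
        with self._lock:
            self._cache[item_id] = translated
            self._pending.discard(item_id)

    def lookup(self, item_id, original_title):
        """Called at search time: never waits for the MT engine."""
        with self._lock:
            return self._cache.get(item_id, original_title)

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

The design choice mirrors the trade-off in the answer: search-time lookups stay fast because they only read the cache, at the price of a short window in which a new listing is shown untranslated.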


Q(Marco) You are satisfied?

A(Saša) Yes, I am pretty satisfied with the robustness of Moses. Our Moses servers crash very, very rarely. This was something that we raised early on when we developed this orchestration layer, and a lot of fail safety went into the Java orchestration layer at eBay, where we thought: what happens if our Moses server starts crashing? What if we have 10 Moses servers online and the first one crashes, then the others take the load and then the next one crashes? Because at this point we fully load the phrase tables into memory, so the startup time is not instant; it takes some time. So what if we have this worst-case chain where everything fails but the first Moses server hasn't come up yet? But so far, knock on wood, it has not happened and it is pretty much stable. I don't know the exact numbers; I mean, there are crashes here and there, but as I said it is very rare.

Q(audience) How customized is your Moses engine? Is it the standard thing or did you guys rewrite parts of it?

A(Saša) No, so actually it's pretty much out of the box: no fancy features, none of the complex models that are allowed within Moses. We use phrase-based MT, no syntax or hierarchical phrase-based, it is all too slow, no complex reordering models; it is just like the baseline of eight to ten features.
We put a lot of work into the way the data goes in, so we have a full repository of data assets, and essentially the training pipeline allows you to figure out which data goes in with which weight, and there you can sort of fine-tune your systems towards the e-commerce domain. There is a lot of selection on the data side, essentially, and filtering and so on. But then when it comes to the engine: large language models, because we have large amounts of titles and queries, so these go in; I mean queries from the target language if we have them, and titles also, depending on whether we have titles available in that language. So for Russian, for instance, for translating titles from English to Russian, we don't have Russian titles, so we crawled the web at other e-commerce sites like TopShop and so on. So there are data efforts definitely, and it is targeted crawls, because sometimes it is not easily crawlable, so we need to find specific solutions to go into the website and extract the content. But besides that, from the technological point of view it is just Moses 2.1.1, I think, the latest version.

Q(audience) Are you planning at any point to open an API for letting other people also use your engine?

A(Tom) Yes, I think long term that is envisaged.
I don't know when. I think that if you ask this question, Asani would say "yes, tomorrow we'll do it", but if you ask me I say no, we first have to get the basics right. But I mean at some point, when everything is sophisticated, why not? I think there is some value. I don't know how, I mean whether this is going to be monetized or something, I don't know, but if the systems work for our customers and they are happy, they should also be useful to others. There is no reason not to do this. But as I said, at this point we have just started and we have figured just the basics out, and I think we need more time to optimize this process as it currently stands, to be able to manage even more language pairs and to cut down the development time. Currently it takes 2 to 3 months to launch a new language pair. The longest time that we have to wait is actually for the in-domain data that we post-edit for the training data and that we human translate for the test sets. This is going to (1:15) extra(?) vendors currently through our localization team, and depending on how much it is, it can take 4-8 weeks, or even, I don't know, we had cases with large batches initially that took more than 2 months. But we are trying now to select better, to do better sampling of what is relevant, and then only send this out. If we can reduce these cycles, maybe by using even your technology at some point, I hope we should be able to launch new language pairs within 1 or 1 and a half months, and if we get this right we can look and see who can use this.

Q(Achim) Any more questions from the audience to the podium?

Q(audience) I have not seen your language pairs in any of your tools, but do you have regional differences like French from France, French from Quebec…

A(Saša) We do Spanish, Latin American Spanish, Portuguese from Brazil and Portuguese. Curiously, initially, at some point we split Latin American Spanish into Mexican Spanish etc., and that was too much; it fragmented our community too much and we didn't have requests for it, so we actually reverted. The other curious thing for us was that so far no one has ever asked us for English from the UK. So we didn't split that.
A(Marco) For us, the CAT tool can have all the language locales, but when it comes to MT, for example, the MT engine will be only one for Portuguese, only one for Spanish. So the CAT tool supports everything, but the background suggestions will come only from the major language, and we have to pick one of the two, which made many Brazilian translators not very happy.

Q(Tom) Oh, did you pick Portuguese from Portugal?

A(Marco) Yes.

A() That's interesting, because Google Translate does the Portuguese from Brazil.

A(Marco) So, that is the opposite for us. The engine, like the commercial engine, would be Google Translate in this case, so, is this correct, the Portuguese are complaining.

A() They don't declare that it is Portuguese from Brazil. I think that there is much more data and purchase from Brazil, so it tends to…

A() Probably for them, yes. So for the eBay use cases we use the public data that is available for Spanish, which is mostly the European Continental Spanish, but for our in-domain data we use Colombian contractors, I think, because they are kind of closest to the general Spanish use case. So we try to address this through the in-domain data, to get it right for Latin America. But still there is this smoothing happening, because there are regional dialects, but we don't address this. But then we actually used our Latin American system and launched it in Europe for UK English to Spain kind of language pairs, and the acceptability was just a little bit smaller than the others. I don't know the exact numbers, but if the acceptability number for one system was 85%, the same system within Europe was at 81 or 82. So you can see a little impact, where people might be saying "oh, this is not the right word, but still I kind of understand it", but it is not that big a problem.

A() So for us, we noticed that the difference between Latin American Spanish and European Spanish was a lot about tone. For queries it would not be so much of an issue, but in text in Spain you almost always use the 'tu', so very informal, whereas in Latin America you use 'usted', so it is very formal, and in those cases verbs change and they all have to agree etc. So it is more in the conversation aspect where you see all the differences.

A(Saša) Yes, again for eBay, for titles and queries it is not real language, as we define it by linguistic standards; it's just like kind of a known language.

A() And that also shows again that you need to evaluate the MT for your specific use case and your material, …

A() I think it shows that your training data, whatever you use to create your SMT engine, sets the tone of whatever translations it is going to create. So if you train an engine on your data, it is going to come out in the style of your data, and that is true whether it is these guys using their system or whether it is our system or whether it is Microsoft Hub….
(the rest not very audible)

Q(not very audible) …(somebody asking Translated and eBay) Are you both going to succeed?

A(Marco) Well, it is a 250M dollar market, a good draw, but since we wanted to make 1B, 250M was not enough..

A() In 1990, although we launched Photoshop in a market where outsourcing graphics production was 300 dollars per hour, they now make 4B dollars a year and outsourcing graphics production went to 30 dollars per hour. I don't know, who is going to win? There is not going to be a winner…

A(Marco) No zero winners, two winners!.. …

Q(A question for the Matecat guys: do you train data for the engines?)

(Marco) We have not developed that yet. So if you use Matecat now, it will come from a commercial engine, Google Translate or Microsoft, if you use this interface. What we plan to add in the future is that after you load your TM into the system there will be a magic button to say "get the Moses engine with this data", and basically within the CAT tool, as you are translating, you will have the Moses engine with this online learning functionality. So today it is more manual. You take the data, you create the engine and basically you can connect the engine very easily to Matecat, but we haven't created the automation of that.


C() One suggestion. We have experimented with this a little bit, with system combination. The problem with Moses is that a lot of the time it lacks coverage, and so what we started doing is: can we do a combination of Google Translate and Moses, and with this combination have extra coverage but still have Moses be able to support regional… I mean, this has been just an in-house experiment, but it seems to do fairly well. With TR we are able to have most of the segments out and then use that for extra coverage.

Q(Marco) You mean to suggest two things to the translator, or combine?

A() Combine.

C(Marco) That is smart. I don't know how to do it.

…(not audible)

() eBay is using version 2.1.1. What are you two using in your basic systems, or when you build your Moses systems? What version?

() I think we are using the same. Yes, because we started using Moses in June? Whatever the latest version was in June.

(audience) You guys are building a new tool!

() There are people right now working on it.

(Marco) No, it is night in Europe, they sleep!

() Not our guys!

(laughs)

Q (Achim) I have a few questions, two for Saša. You're kind of building out this functionality now on the eBay site to use MT to enable processes for trade. I guess you have a rush for the markets you are targeting; you already have translated interfaces. But for the actual content, do you see opportunities where you can use post-editors to correct the translations that the MT system makes, or at least give sellers the option to use the services?

A (Saša) So I think we are working on the sellers as well.
I think … that this is a bullet point that we will offer to sellers: if they want to reach more people, to buy translation services. MT is a free service, so they can see it and make changes if they speak the language, and there are options for post-editing or even human quality translation. I am not sure how this is going to play out. I am a little bit skeptical because I do not directly see the need, but there are business people at eBay that see the use case better, and they see some value in there. Regarding the corrections, I think you asked if we want to do post-editing of MT output on the side. So we have some feedback, lots of possibilities internally: if we have mistranslated something, there is an interface where you can correct it, and then automatically we fix it and essentially overwrite it. But at this time it is not fed back into the MT engine, so this is something that we hope to address with, yes, the recent changes to Moses, where at some point, when we identify that we make mistakes, this will then also be fixed automatically in the engines. So there is still some potential there. Currently we would need to re-train and fully re-deploy a system, which is still costly.

Q (Achim) Tom, you talked a lot about what is required to train an MT system, what skills are required. What are the advantages of running your own MT on premises, maybe in the cloud…

A (Tom) Well, that is actually my presentation on Friday! There is another presentation on Friday about the pros and cons of insourcing and outsourcing, OK? But in general, if you need to go to production quickly and you want your MT system to satisfy an immediate requirement, insourcing is probably not the way to go. It is definitely not your way to go. It takes time to learn how to operate the system, how to use the system and put it into implementation.
So that is kind of to say that I have had a lot of potential customers who come to us and say: "we need a system up and running in two weeks and we need it to…" and I say, go to somebody to outsource it; it is not something you are getting into when you build your system and you insource it. The advantages of insourcing are that you eliminate the recurring cost of a subscription-based system, so whatever investment you make in building and operating your system eventually becomes a fixed cost. And you have the ability to create new models that can create a new system and therefore make more money on that new system; I mean, you can put yourself into a declining cost, return on investment model. You also have control over what you do; you can do your own quality assurance integration of the system. If you are using, for example, Google Translate, they may re-train their engine tomorrow and you have totally lost control of the quality control of the internet engine that was there yesterday.

Q () Do you think that this is a little comparable with that slightly older discussion: shall I keep the services in-house or shall I move to the cloud, Amazon EC2?

A (Tom) Totally different conversation. For example, our system is built in such a way that our server can only run EC2. So, whether you want your infrastructure that runs our software in-house or on EC2 is a different question than whether I want to run my own service or outsource the production to somebody else.
C  ()  Right,  so,  what  I  mean  is  that  instead  of  having  someone  with  the  skills  necessary  to   manage   the   system,   you   can   have   someone   that   besides   doing   other   things  communicates,   with   the   subscription   service   that   does   it.   So,   the   advantage   of  having   services   in   the   cloud   is   that   at   some   point   it   always   becomes   more   cost  efficient  to  having  them  in  the  cloud  unless  you  got  to  huge  scale  and  I  wonder  if  just  the   efficiency   that   you  were   going   to   get   if   your   are   running   it   on   something   like  


KantanMT or your system, if it runs in the cloud, there is so much efficiency that unless you run a huge system it is not cost-efficient to do it yourself. It will always be better to do a subscription.

A () With your own data, sending your own data, of course. We actually have customers that provide that service. We don't; we make the tools, and our customers can either use the system they build for themselves or they can open it up as a tool for their customers. So we have customers who build their own system and then use that system to service their customers. We just don't provide that service as such.

Q () Question for Marco: Will we have Matecat with a mobile UI?

A () Well, it's open source, so maybe in two weeks we'll have one! (laughs)

A (Marco) The target that we are going for with Matecat is more standard translation projects, so I guess they will be longer. Today you can open it on a mobile interface: you cannot create projects, but you can open the translation interface. So as a translator, you can receive a link and you can open the document. Before, in Trados, you had to go to your desktop and import the thing before seeing the content. So you can see the content, and you can actually edit. It is not mobile friendly; it is not like Unbabel, which has been designed around that experience, so I would not recommend anyone to use Matecat on the mobile, but at least you can see the work and you can fix a sentence today. It is not on the short-term roadmap to have a mobile system where you can create projects on the mobile. It is not the intent, but who knows.
Q (Achim) Doesn't this relate to Unbabel then, because you're focusing much on mobile editing for the medium term? Does it mean that you are targeting a certain type of content and that it is working very well for that, or?

1:34

A (Vasco Pedro) Yes, I mean, I think that naturally certain types of content fit better on mobile, so anything that is social, conversational, you know, email type, tends to be better than if it's a technical document. We are not right now filtering that out, in the sense that since we are chunking, everything goes to the mobile, but you can see it. I mean, there are just certain types of content that people… for example, the way it works is that you go to the site and you say 'give me a task', but you can skip it; you are not forced to do it. You can select what you want to do. So we assign you something. Now what you can see is that certain types of content are skipped much more on the mobile than on the web because they are less effective. Right now we are not really differentiating on this, but I think that this will just happen naturally.


Q (audience) I have an intellectual property question. We see here on the screen: 'Input your TMs'. My question is what happens to my data after we do the translation job. Whether it's the TMX or the actual document I want to translate, how private will it be, and will it continue to be private after the translation jobs?

A (Marco) There is a link called Terms in a button that explains that. So what we do with this… first, it is a cloud infrastructure, so we are saying that this is something useful for everybody that is open to the cloud, open to using Gmail as an email provider, etc. So if you're talking about strictly confidential information about rockets, probably you shouldn't be doing translations in the cloud at all. But if you are there, the Terms basically say that the content is your content: you can get it back at any time that you want and delete all the content that is there. And what we do as the tool works, if you've seen the online learning, is that we use your edits in order to improve the MT. And we don't share; in the conditions we say we don't share the data across customers, so if you have a private TM, obviously you will be the only one accessing this TM and no one else. We use your edits in order to improve the MT.

Q (Achim) For everybody or specifically for you?

A (Marco) Well, today only for you. But the Terms say that we can use the data in order to improve the MT in general.

Q (audience) And for Unbabel, is that similar?

A () Ah, in what sense? I mean, we don't… you send us text, right? Other people don't access your content, but we can use it to train, unless people specifically ask us.
So far no one did. We only have one customer who said, 'could I do it?' Yes, sure, we would be able to do that and have this content not be used. But it is not really requested. I mean, we only use that just to train the engine. To be honest, initially we thought it would be much more of an issue. We thought about it a lot. For example, one of the modules we are building is an anonymizer. We thought, oh, it would be great to send things out, because with the crowd, to be completely honest, it is in a way like sending an email to other people, 'do you want to translate it?'. We are actually developing a tool to anonymize text after translation correction and to reconstruct it. We started doing this, initially, as a research project with IST, but to be honest no one requested it. We thought that people would immediately ask 'what about anonymization?' No one.

C () Again, in Friday's session we talk about that. It falls into what I call Category. It comes out of the term Cryptography, because it is called Trusted third parties. OK? You're in a trusted third-party relationship with the outsourcing provider. So read the terms and conditions; if you accept the terms and conditions then you're in a trusted relationship, but you still have to trust them as a third party to do what they say they will do. OK? And I am not questioning anybody's integrity, I am just saying that this is the reality of it. It is a trusted relationship with the service provider being a third party, because you and your customer are the first and second party and you're sending it to a third party. OK?
I think, my experience has been that those people


who have requirements where the security and/or intellectual property requirements demand that it can't be connected to the web don't go for these services anyway. It is a self-filtering environment.

C () We only had some requests at the beginning that were a bit scary to us. We started getting some requests from the Middle East, and one of the requests was documents; they sent us documents in PDF, and that's when we said, unfortunately we can't do it for technical reasons. But it did say, I think at the time it was the Palestinian police, and it was about people that were murdered and things like this, so we said, what do we do with this? It was very early on. There was another story involved in there, but you know, that scared us a little bit. I said, oh my God, what would happen? And fortunately we were not able to do it. If it helps anything, of course there is a trusted relationship. Everybody that signs up as an editor, you know, is basically accepting the terms of service, they can't copy, etc., and they are working on our platform, which makes it a little harder to extract any content. So we don't send jobs out to them; they go in and that's it, they don't have access to the text afterwards. So there are a few things that limit it, but there hasn't been an issue.

C (Marco) Can I say something in favor of the Cloud? One reason we also did it in Matecat and at Translated as an LSP is because we had privacy issues. We have been shipping content to translators via email, and even the TMs via email.
And basically, when you send something to someone by email you have no control over what will happen afterwards with the data. Also, the security of the translator's desktop is not as good as that of a server. But the big problem is that you cannot control it. So, by going into the Cloud, at least for us, what was really good was that I give the translator access to the document he has to translate and to the portion of the TM, but I don't distribute data all around. Because one point is that if I want to remove that now, I simply remove it and it is done. So to us the Cloud was an improvement in terms of security.

C () I agree. And it really depends on what you are comparing it to. I am in no way degrading or putting down the Cloud, because my customers use our software to expose their services to the cloud. OK? Just to share one very recent contract we just got. We were approached in the summertime, in July, by a client saying, 'We have a system that cannot be connected to the internet. Can you provide us a server and editing environment that we can disconnect from the internet altogether and still get machine translations?' I said yes. We are not normally in the business of providing an editing environment, so there is a CAT tool in this particular contract we did. In this contract we also went out of our way to satisfy the customer because he had a budget. And paying customers come to the front line.
And we are building their statistical models, we are training their system, we are building their corpus for them; these will be deployed into their system, and they will operate it when it is actually built, deployed and functional. So, you know, we can satisfy those clients' requirements in a way that other system providers can't. So in that regard you really have to look into your system requirements, whether IP or security.


C (Achim) There are two winners.

C () The reality is that nowadays we are using so many services in the Cloud that the vast majority of people are getting used to it, on the one hand. So in our case there is more of a self-selection: if you are going to use something that is a crowd translation service, you know, you are not going to use it if you are the kind of customer that needs…

C () My customers would never consider that.

C (Marco) Yes, but since you destroy documents into chunks and yet you consider anonymizing…

C () Yes, that's true. That is one of the things, the fact that you only see a portion of it. I mean, arguments can be made on both sides; one can be not enough and another better…

C () I am not a security consultant. Go to your security consultant and to your lawyer for this.

C () We don't go to those lanes and anything like that…

I hope that it is clear to everybody that using data as TMs is quite different from using them to train MT systems. When training MT systems, it is not like getting whole sentences, or that you are getting information that needs to be private (rest not audible)

C () You can change the name of the country and see what happens. (laughs)

Q () I think we have a very good model with Matecat. I think if you were able to provide more security to LSPs, you would have an amazing business model out there.

Q (Marco) What do you mean by more security?
A () More security, I mean to be able to create my micro-Cloud or private Clouds to contain our TMs and also educate our own engines, because in my case I don't want like ten years of private data to be shared, to be an advantage to the competition. And I am also dealing with very private industries, in which I am liable for the protection of my data. If I am able to have more protection for my data, I think this is an amazing system. I would love to be able to use it and also to educate the engines. That would be something else.

C () OK, educating the engine, I think that this is something that we will want to do. I think that we partially solve the problem, because remember that this is open source, so you can click, you can get it, Matecat, that experience, onto your server, into your private environment. Now, if you are patient and have the engineers to run the infrastructure and maintain it, so if for security reasons you want to do it


you can do it, actually. What you are saying, I think, is that we create private vaults in the Cloud for you, so that it is in the Cloud but it is just for you, an instance. Correct? OK. We haven't thought about it, but it is a good idea.

C () I think that another point that would move in that direction is if you encrypt the models, like your TMs or any data that comes from someone, with, like, a second special key or something which is not generated by a server; sorry, it can be generated, but it is kind of just used by the user.

C (Marco) Client-side encryption.

C () Yes, the client would provide the key and you would use that for encryption.

C () You mean encrypting the phrase tables? And the translation models?

C () All the data, the models themselves. If they are encrypted, you don't care if they are somewhere in the cloud, or that someone hacks the service and can extract them, because they will still be encrypted and very hard to decrypt without the key you provided.

C (Achim) I think we are out of time. Very interesting discussions, very different use models of the open-source software. So I want to thank all the presenters and the panelists. Thank you. And please don't forget to fill in the survey in the back.