SMART PROJECT - WP 7 Dissemination and Exploitation

Nello Cristianini

November 2009


Goals

• Ensure that specific results of the project are known to other researchers, or to potential users.

• Ensure that know-how is accessible to users when needed

• More generally, facilitate understanding of SMT issues in the larger machine learning community (inter-exchange)


Tools

• Website

• Publications

• Demos

• Events

• Patents

• Synergy with Pascal2 network


Publications

• Papers

• Special issue


Talks

• Project members reported on advances to the scientific community by presenting papers at several major international conferences.

• Results from the project also provided the content of four invited talks:

– [Shawe-Taylor 2006] delivered at the NIPS Workshop on Machine Learning for Multilingual Information Access, December 2006

– [Shawe-Taylor 2008] delivered at the 22nd International Conference on Computational Linguistics (CoLing 2008)

– [Cancedda 2008a] delivered at the conference of the European Association for Machine Translation, Hamburg, 2008

– [Cancedda 2008b] delivered at the First Forum for Information Retrieval Evaluation (FIRE 2008) organized by the Indian Statistical Institute in Kolkata, India, in December 2008


Website

• Scientific results from the project were disseminated in a number of ways. Public deliverables were uploaded to the project websites; as of October 27th, the deliverables webpage had been visited 2227 times.


Demos

• Wikipedia

• The two systems developed for running user evaluations (the Computer-Aided Translation tools and the Cross-language searchable Wikipedia) provided valuable demonstrators for the best part of the technologies developed in the project. The latter is a web-enabled demo accessible to the public.

• cosco-demo.hiit.fi/smart/

• Found in Translation

• The project also supported the development of the demonstration platform “Found in Translation”, a European news gathering portal developed and maintained at the University of Bristol, which provides a valuable context for integrating and demonstrating cross-language technologies of all sorts.

• foundintranslation.enm.bris.ac.uk


Dissemination Events

• Barcelona: outreach to MT community

• Bled: outreach to industry and also outreach to ML community

• The role of videolectures.net: talks and tutorials are available online


Dissemination Events

• SMART organised two dissemination events in Y3.

• The first one, a one-day workshop aimed at the research community, was organised on May 13th at the Universitat Politècnica de Catalunya (UPC), alongside the annual conference of the European Association for Machine Translation. All presentations were video recorded and are available for streaming from the Videolectures.net website.

• The second was an event aimed at a business audience and jointly organised with the PASCAL 2 ICT FP7 Network of Excellence. It was held alongside the joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD, the latter traditionally drawing very significant industrial participation), and took place in Bled, Slovenia, on September 7th, 2009.

• A number of demos and posters were presented there: see Deliverable D 7.3.

• Workshop homepage: patterns.enm.bris.ac.uk/smart-dissemination-workshop

• videolectures.net/smartdw09_barcelona/

• www.pascal-network.org


Barcelona

• 9.30 - 10.00 - Welcome - Nicola Cancedda (Xerox Research Centre Europe)

• 10.00 - 11.00 - Invited Talk: "Empirical Machine Translation and its Evaluation" - Jesus Gimenez (UPC)

• 11.30 - 12.00 - "Online learning for CAT applications" - Nicolo Cesa-Bianchi (University of Milan)

• 12.00 - 12.30 - "Sinuhe -- Statistical Machine Translation with a Globally Trained Conditional Exponential Family Translation Model" - Matti T Kaariainen (University of Helsinki)

• 12.30 - 13.00 - "Large scale, maximum margin regression based, structural learning approach to phrase translations" - Sandor Szedmak (University of Southampton)

Afternoon

• 14.00 - 14.30 - "Learning to Translate: statistical and computational analysis" - Marco Turchi (University of Bristol)

• 14.30 - 15.00 - "Improving the performance of phrase-based statistical MT" - By: NRC

• 15.00 - 15.30 - "Multi-view CCA and regression CCA" (includes an online demo integrated with searchpoint) - Blaz Fortuna (Jožef Stefan Institute)

• 16.00 - 16.30 - "Large-Margin Structured Prediction via Linear Programming" - Zhuoran Wang (UCL)

• 16.30 - 17.00 - "Confidence Estimation for Machine Translation" - Lucia Specia (XRCE)


VideoLectures.net


Special Issue

• Lucia Specia and Nicola Cancedda are guest editors for a special issue of the journal Machine Translation on the topic “Pushing the frontier of Statistical Machine Translation”, due to appear in spring 2010.


Publications

• Consortium members actively disseminated scientific results at the major international conferences in computational linguistics, machine translation and machine learning. Several longer articles were submitted to peer-reviewed journals.


• Nicola Cancedda and Pierre Mahé: Factored sequence kernels, in Neurocomputing, 72 (7-9), March 2009

• Stephane Clinchant and Jean-Michel Renders: Query Translation through Dictionary Adaptation, in Advances in Multilingual and Multimodal Information Retrieval, 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Budapest, Hungary, September 19-21, 2007

• Stephane Clinchant and Jean-Michel Renders: Multi-Language Models and Meta-dictionary Adaptation for Accessing Multilingual Digital Libraries, in Evaluating Systems for Multilingual and Multimodal Information Access, 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008, Aarhus, Denmark, September 17-19, 2008.

• Ilias Flaounas, Marco Turchi, Tijl De Bie and Nello Cristianini: Inference and Validation of Networks, in Machine Learning and Knowledge Discovery in Databases, LNCS 5781/2009.

• Ilias Flaounas, Marco Turchi and Nello Cristianini: Detecting Macro-patterns in the Mediasphere, in Workshop on Intelligent Analysis and Processing of Web News Content, WI-IAT, Milan, Italy, 2009.

• Cyril Goutte, Nicola Cancedda, Marc Dymetman and George Foster (eds.): Learning Machine Translation, the MIT Press, Cambridge, Mass., 2009.

• Matti Kääriäinen: Sinuhe -- Statistical Machine Translation using a Globally Trained Conditional Exponential Family Translation Model, in Conference on Empirical Methods for Natural Language Processing (EMNLP 2009), Singapore.

• Yizhao Ni, Craig Saunders, Sandor Szedmak, Mahesan Niranjan: Handling phrase reordering for machine translation, in 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), Singapore, 2009.

• Jan Rupnik and Blaz Fortuna: Regression Canonical Correlation Analysis, in NIPS workshop on Learning from Multiple Sources, Whistler, Canada, 2008.


• Lucia Specia, Marco Turchi, Nicola Cancedda, Marc Dymetman and Nello Cristianini: Estimating the Sentence-Level Quality of Machine Translation Systems, in Conference of the European Association for Machine Translation, Barcelona, Spain, 2009.

• Lucia Specia, Marco Turchi, Zhuoran Wang, John Shawe-Taylor and Craig Saunders: Improving the Confidence of Machine Translation Quality Estimates, in Machine Translation Summit XII, Ottawa, Canada, 2009.

• Nadi Tomeh, Nicola Cancedda and Marc Dymetman: Complexity-based Phrase-table Filtering for Statistical Machine Translation, in Machine Translation Summit XII, Ottawa, Canada, 2009.

• Marco Turchi, Tijl De Bie, Nello Cristianini: An Intelligent Agent that Autonomously Learns how to Translate, in Workshop on Intelligent Analysis and Processing of Web News Content, WI-IAT, Milan, Italy, 2009.

• H. Yu and J. Rousu: An Efficient Method for Large Margin Parameter Optimization in Structured Prediction Problems. Technical Report C-2007-87, Dept. Computer Science, Univ. of Helsinki, 2007.

• Zhuoran Wang and John Shawe-Taylor: Large-Margin Structured Prediction via Linear Programming, in The Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS 2009), Clearwater Beach, Florida, USA, 2009.

• Zhuoran Wang, John Shawe-Taylor and Sandor Szedmak: Kernel Regression Based Machine Translation, in Proceedings of Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers, pp. 185-188.

• Mikhail Zaslavski, Marc Dymetman and Nicola Cancedda: Phrase-based Statistical Machine Translation as a Traveling Salesman Problem, in 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), Singapore, 2009.


Patents

• Xerox filed two patent applications protecting results obtained in the second and third years of the project, bringing to four the total number of applications.

– Query translation through dictionary adaptation

– Factored word-sequence kernels

– Phrase-based SMT as a Generalized Travelling Salesman Problem

– Phrase-table filtering for SMT


Other Outcomes

• Some of our researchers are now employed at JRC, Nokia, etc. Smart ideas WILL spread...

• JRC specifically employed Marco Turchi based on his work on Found in Translation, after seeing the demo: in at least one case, communication was successful


The Future

• Demos will remain

• Website will remain

• Videolectures will remain

• Publications DB will remain

• The impact of SMART begins now...