Subtitling & translation of weblectures by Carlos Turró Ribalta
DESCRIPTION
This presentation was given by Carlos Turró Ribalta, Head of Media Services at Universitat Politècnica de València, Spain, on 11 December 2013 at the REC:all workshop "Lecture Capture: Moving beyond the pilot stage: large-scale implementation of lecture capture in European Higher Education" in Leuven, Belgium.
TRANSCRIPT
![Page 1: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/1.jpg)
REC:all Lecture Capture Workshop
11 December 2013
Carlos Turró
Universitat Politècnica de València
EC FP7 ICT project #287755
![Page 2: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/2.jpg)
Motivation
12 Nov 2013 2
• Video lecture repositories and MOOCs
• Thousands of hours of video lectures available
• Hundreds of hours of video lectures recorded every week
• Most video lectures only available in their original language
• No subtitles
![Page 3: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/3.jpg)
Motivation
• Transcriptions and translations are needed
• Accessibility for people with disabilities
• Accessibility for speakers of different languages
• Search and analysis functions
• Automated topic finding
• …
![Page 4: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/4.jpg)
Motivation
• How do we get there?
![Page 5: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/5.jpg)
The transLectures approach
1. Automatic Speech Recognition (ASR) and Machine Translation (MT)
• Adaptation: taking advantage of the characteristics of video lecture repositories
• High-quality automatic transcriptions and translations
2. Interactive postediting: intelligent interaction for reduced effort
![Page 6: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/6.jpg)
Goals
• Development of an engine for adaptation & intelligent interaction
• Implementation
• Case studies: Videolectures.NET & Polimedia
• Real-life evaluation
• Integration into Opencast Matterhorn
http://opencast.org/matterhorn/
![Page 7: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/7.jpg)
The transLectures partners
| # | Name | Country |
|---|---|---|
| 1 | Universitat Politècnica de València | Spain |
| 2 | Xerox SAS | France |
| 3 | Institut Jožef Stefan | Slovenia |
| 3+ | Knowledge for All Foundation | UK |
| 4 | RWTH Aachen University | Germany |
| 5 | EML – European Media Laboratory | Germany |
| 6 | DDS – Deluxe Digital Studios | UK |
36 Months
Now we are in M25
![Page 8: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/8.jpg)
Statistical Transcription (and translation)
[Diagram: Sound → ASR Engine (using the Acoustic Model and the Language Model) → TRANSCRIPTION]
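As background to this diagram (the textbook formulation of statistical speech recognition, not something spelled out on the slide), the ASR engine searches for the word sequence that is most probable given the audio. The symbols $x$ (the acoustic signal) and $w$ (a candidate word sequence) are notation assumed here:

```latex
\hat{w} \;=\; \operatorname*{arg\,max}_{w} \; p(w \mid x)
        \;=\; \operatorname*{arg\,max}_{w} \; p(x \mid w)\, p(w)
```

Here $p(x \mid w)$ is the acoustic model and $p(w)$ is the language model; the second equality follows from Bayes' rule, because $p(x)$ does not depend on $w$.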
![Page 9: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/9.jpg)
Statistical transcription (and translation)
[Diagram: Manually transcribed voice → Modeling Engine → Acoustic Model and Language Model]
![Page 10: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/10.jpg)
Architecture of TransLectures
Lecture
Language Model
Slides
Extra content
Result
Intelligent interaction
Transcription Translation
![Page 11: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/11.jpg)
Languages
• Transcription (ASR)
• EN
• SL
• ES
• Translation (MT)
• EN>SL, SL>EN
• EN>ES, ES>EN
• EN>FR
• EN>DE
![Page 12: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/12.jpg)
Case study: VideoLectures.NET
15000 lectures
![Page 13: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/13.jpg)
Case study: Polimedia
10000 Learning Objects
![Page 14: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/14.jpg)
Demo
http://translectures.videolectures.net
http://polimedia.upv.es/catalogo
http://translectures.eu/player/
![Page 15: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/15.jpg)
Scientific evaluations
• Transcription results
• WER: Word Error Rate (%)
• Goal: WER < 20%
• EN, SL, ES
[Chart: WER results; the axis runs from "Better" (low WER) to "Worse" (high WER)]
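For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions and insertions) between the automatic transcript and a reference transcript, divided by the number of reference words. The following is a minimal sketch under that standard definition; the function name `word_error_rate` and the whitespace tokenization are assumptions for illustration, not the project's evaluation tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic dynamic programming, one row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing in hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of four reference words.
    print(word_error_rate("the lecture starts now", "the lecture start now"))
```

Under this definition, the stated goal of WER < 20% means fewer than one word error for every five reference words.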
![Page 16: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/16.jpg)
Scientific evaluations
• Translation results
• BLEU
• Goal: BLEU > 30
• EN>SL, SL>EN
• EN>ES, ES>EN
• EN>FR
• EN>DE
[Chart: BLEU results; the axis runs from "Worse" (low BLEU) to "Better" (high BLEU)]
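BLEU scores a machine translation by its n-gram overlap with a reference translation, scaled here to 0–100 so that it is comparable with the goal above. The sketch below is a simplified corpus-level BLEU with a single reference per sentence, n-grams up to length 4, clipped counts, no smoothing and whitespace tokenization; the function names are assumptions, and real evaluations typically rely on an established scoring tool rather than code like this.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Simplified corpus-level BLEU (0-100), one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = ref.split(), hyp.split()
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection keeps the minimum count: clipped matches.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with no match gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty punishes translations shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    refs = ["the cat sat on the mat today"]
    hyps = ["the cat sat on the mat"]
    print(round(corpus_bleu(refs, hyps), 2))
```

In the usage example every n-gram of the hypothesis appears in the reference, so the score is reduced only by the brevity penalty for the missing final word.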
![Page 17: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/17.jpg)
Y2 results and comparison
![Page 18: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/18.jpg)
Y2 results and comparison
![Page 19: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/19.jpg)
Y2 results and comparison
![Page 20: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/20.jpg)
Massive adaptation
• Characteristics of video lectures
• Just one person
• Known speaker
• Clear talking
• No interruptions
• Focused on a topic
• Slides
![Page 21: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/21.jpg)
Massive adaptation
• Known speaker and topic
• Slides
• Related documents
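The slides do not say how the adaptation is done. One common way to exploit lecture-specific text such as slides is to interpolate a general language model with a small model estimated from that text; the sketch below illustrates the idea with toy unigram models. The function names, the unigram simplification and the interpolation weight are all assumptions for illustration, not the transLectures implementation.

```python
from collections import Counter


def unigram_model(text):
    """Estimate a unigram language model (word -> probability) from raw text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(general, in_domain, weight=0.5):
    """Linearly interpolate two models; `weight` is the in-domain share."""
    vocabulary = set(general) | set(in_domain)
    return {
        word: (1.0 - weight) * general.get(word, 0.0)
        + weight * in_domain.get(word, 0.0)
        for word in vocabulary
    }


if __name__ == "__main__":
    general = unigram_model("the lecture is about the history of art")
    slides = unigram_model("speech recognition speech models")
    adapted = interpolate(general, slides, weight=0.3)
    # Words from the slides now receive probability mass they lacked before.
    print(general.get("speech", 0.0), adapted["speech"])
```

The design point is that words taken from the lecture's own slides become more probable for the recognizer, while the general model still covers everyday vocabulary.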
![Page 22: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/22.jpg)
Intelligent interaction
• Postediting automatic transcriptions/translations
• The user invests the least possible effort
• The system learns the most from it
• Confidence measures
• Fast constrained search
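One way confidence measures can reduce postediting effort is to ask the user to check only the words the recognizer is least sure about. The sketch below assumes each recognized word carries a confidence score between 0 and 1 and that the user has a fixed review budget; the `Word` type, the `words_to_review` function and the budget parameter are hypothetical, and the actual transLectures interaction protocol may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float       # start time in seconds
    confidence: float  # recognizer confidence in [0, 1] (assumed available)


def words_to_review(words: List[Word], budget: float = 0.2) -> List[int]:
    """Return indices of the least confident words, in transcript order.

    `budget` is the fraction of words the user is willing to check.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be between 0 and 1")
    n_review = int(budget * len(words))
    # Rank word positions from least to most confident.
    ranked = sorted(range(len(words)), key=lambda i: words[i].confidence)
    # Present the selected words in the order they occur in the lecture.
    return sorted(ranked[:n_review])


if __name__ == "__main__":
    transcript = [
        Word("speech", 0.0, 0.90),
        Word("wreck", 0.4, 0.30),
        Word("ignition", 0.8, 0.80),
        Word("sistem", 1.3, 0.10),
    ]
    for index in words_to_review(transcript, budget=0.5):
        print(transcript[index].text)
```

The corrected words could then be fed back to the system, which is where a constrained search would re-decode the surrounding text.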
![Page 23: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/23.jpg)
Intelligent interaction
![Page 24: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/24.jpg)
Intelligent interaction
![Page 25: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/25.jpg)
Implementation and integration
• Videolectures.NET
• Polimedia
• Opencast Matterhorn
![Page 26: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/26.jpg)
The tL player
• Online HTML5 video player with editing capabilities.
• The user interface has three different editing layouts and full keyboard support.
• User interaction statistics are analyzed to improve the user experience and develop a user model.
![Page 27: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/27.jpg)
tL player
![Page 28: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/28.jpg)
Manual upload of lectures
![Page 29: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/29.jpg)
transLectures: tools available
• The transLectures-UPV Toolkit (TLK) for ASR
• www.translectures.eu/tlk
• RWTH Aachen: rASR, Jane (MT)
• http://www-i6.informatik.rwth-aachen.de/web/Software/
Note that you need an acoustic & language model
![Page 30: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/30.jpg)
transLectures: tools at M30
• The tL player (& editor)
• tL Opencast Matterhorn module
• Cloud service for testing
• Coming soon at M30 (www.translectures.eu)
More info at the OCWC conference (Ljubljana) in April 2014
![Page 31: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/31.jpg)
Next steps for transLectures
• Keep improving ASR and MT results
• Keep improving tL open source tools (TLK, tL player)
• External user evaluations (VL.NET and Polimedia)
• External trials: implementation in other universities
![Page 32: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/32.jpg)
Next EU project: EMMA
• MOOC related project
• transLectures will work on adding 7 new transcription systems (English, Italian, Spanish, French, Dutch, Portuguese and Estonian)
• … and 8 translation systems (from Italian, Spanish, French, Dutch, Portuguese and Estonian into English; and from English into Italian and Spanish)
• Beginning in 2014
![Page 33: Subtitling & translation of weblectures by Carlos Turró Ribalta](https://reader035.vdocuments.net/reader035/viewer/2022070302/548330b0b47959000d8b4a09/html5/thumbnails/33.jpg)
www.translectures.eu
My mail (Carlos Turro)
Project coordinator: Alfons Juan-Ciscar
EC FP7 ICT Programme – Project Number 287755
Thanks!