#307 Poster/Demo

Monday 16 April 16:15 - 16:30 Bowett Room

transLectures

Alfons Juan, Universitat Politècnica de València, Spain

Conference Theme: Innovation

Summary: Recent FP7 project to produce accurate transcriptions and translations in VideoLectures.NET and other Matterhorn repositories

Abstract: Online educational repositories of video lectures are rapidly growing on the basis of increasingly available and standardised infrastructure. A well-known example is VideoLectures.NET, a free and open access educational video lectures repository, and a major player in the development of the widely used Opencast Matterhorn platform for educational video management. As in other repositories, transcription and translation of video lectures in VideoLectures.NET is needed to make them accessible to speakers of different languages and to people with disabilities. However, also as in other repositories, most lectures in VideoLectures.NET are neither transcribed nor translated because of the lack of efficient solutions to obtain them at a reasonable level of accuracy.
The transLectures (Transcription and Translation of Video Lectures) project is a recent FP7 research project (ICT-2011.4.2: Language Technologies) aimed at developing innovative, cost-effective solutions to produce accurate transcriptions and translations in VideoLectures.NET, with generality across other Matterhorn-related repositories. Our starting hypothesis is that current automatic speech recognition and machine translation technology falls only a relatively small gap short of producing accurate enough results for the kind of audio-visual object collections we are considering, and that this gap can be closed by achieving the following scientific and technological objectives:
1. Improvement of transcription and translation quality by massive adaptation.
We will show that current automatic speech recognition technology can provide acceptable transcriptions by massive adaptation of general-purpose models using lecture-specific knowledge such as the speaker, topic and time-aligned slides. Clearly, it is only by having acceptable transcriptions that adaptation of translation models can also provide acceptable results.
2. Improvement of transcription and translation quality by intelligent interaction.
Current user models for transcription and translation of audiovisual objects are batch-oriented; that is, an initial transcription/translation is first computed by the system and then manually post-edited without system assistance. Clearly, batch-oriented interaction models are satisfactory only for very collaborative users post-editing nearly perfect output; otherwise, more intelligent interaction models are required for the user to save supervision effort, and for the system to dynamically learn from supervision actions. It is our objective to develop innovative, truly interactive models in which the system immediately learns from and reacts to each supervision action.
3. Integration into Matterhorn to enable real-life evaluation.
Our tools will work with Matterhorn, and thus we will be able to evaluate them on real-life data, in a real-life setting.
The main result of transLectures is a set of cost-effective tools to produce accurate transcriptions and translations in VideoLectures.NET and other Matterhorn-related repositories. Indeed, we will test our ideas in VideoLectures.NET and in a smaller repository of Spanish video lectures, poliMedia, which is also part of the Matterhorn Community. It goes without saying that transLectures tools will also be useful for efficiently translating video lectures that have already been transcribed. We are convinced that, upon successful achievement of our objectives, our innovative solutions will enable educational repositories to overcome language barriers and reach wider audiences while supporting linguistic diversity. At Cambridge 2012 we will present transLectures and up-to-date information on intermediate results.