Speech Translation Advanced Research to and from Portuguese (PT-STAR)
Type: National Project
Duration: from 2009 May 01 to 2012 Jul 31
Financed by: FCT Carnegie Mellon
Prime Contractor: R - INESC-ID Lisboa (Other) - Lisboa, Portugal
Each year, European institutions spend more than a billion euros on translating documents and interpreting speeches. In addition, about half of Europeans speak only their own language. These two facts alone are a strong motivation for advancing Speech-to-Speech Machine Translation (S2SMT) technologies, which aim to enable natural language communication between people who do not share a language. S2SMT can be seen as a cascade of three major components: Automatic Speech Recognition, Machine Translation and Text-to-Speech Synthesis. One of the main problems of this multidisciplinary area, however, is the still weak integration between the three components.
The main goal of PT-STAR (Speech Translation Advanced Research to and from Portuguese) is to improve speech translation systems for Portuguese by strengthening this integration. The project addresses several problems, such as spontaneous speech translation, for which the performance of the automatic speech recognizer degrades seriously, and voice conversion, which allows the synthesized speech to retain the characteristics of the original voice. It also addresses several major problems in statistical machine translation, for instance the study of different methods to automatically extract bilingual lexicons from non-aligned parallel corpora and to update the translation model. Finally, PT-STAR targets the implementation of a proof-of-concept prototype.
PT-STAR involves, on the CMU side, the Language Technologies Institute (LTI) and, on the Portuguese side, a consortium of universities and research centers: the Spoken Language Systems Lab (L2F) of INESC-ID Lisboa, the Center of Linguistics of the University of Lisbon (CLUL), and the University of Beira Interior (UBI). Additionally, a third language (Chinese) will be the target of a PhD thesis on machine translation at the University of Macau. The informal cooperation of this university within the framework of the current proposal will therefore broaden the proposal's scope to encompass typologically different languages.
Partnerships
- Fundação da Universidade de Lisboa - CLUL (Other) - Lisbon, Portugal
- R - INESC-ID Lisboa (Other) - Lisboa, Portugal
- U - Carnegie Mellon University (University) - Pittsburgh, PA, USA
- Universidade da Beira Interior (University) - Covilhã, Portugal
Principal Investigators
Members
- Maria Luísa Torres Ribeiro Marques da Silva Coheur (HLT)
- Isabel Maria Martins Trancoso (HLT)
- Alberto Abad Gareta (HLT)
- António Joaquim dos Santos Romão Serralheiro (HLT)
- Luís Miguel Veiga Vaz Caldas de Oliveira (HLT)
- João Paulo da Silva Neto (HLT)
- Helena Gorete Silva Moniz (HLT)
- João de Almeida Varelas Graça (HLT)
- Fernando Manuel Marques Batista (HLT)
- Joana Maria Ferrer Lúcio Paulo Leitão Pardal (HLT)
- João Manuel Lage de Miranda Lemos (SPS)
- Tiago Manuel da Cruz Luís (HLT)