Speech recognition for less-represented languages
The last decade has seen growing interest in developing speech and language technologies for a wider range of languages. State-of-the-art speech recognizers are typically trained on huge amounts of data, both transcribed speech and text. My thesis work focused on speech recognition for languages for which only small amounts of data are available: the “less-represented languages”. These languages are often poorly represented on the Web, which is the main source for collecting data. Very high out-of-vocabulary (OOV) rates and poorly estimated language models are common for these languages. In this presentation, I will briefly describe the difficulties of building new automatic speech recognition (ASR) systems with little data. Then I will present our attempt to improve performance by using sub-word units in the recognition lexicon. We enhanced a data-driven word decompounding algorithm to address the increased phonetic confusability that arises from word decompounding. Experiments carried out on two distinct languages, Amharic and Turkish, achieved small but significant improvements of around 5% relative in word error rate, with relative OOV reductions of 30% to 50%. The algorithm is relatively language independent and requires minimal adaptation to be applied to other languages.
Date: 2008-Jun-18 Time: 14:00:00 Room: 336
Workshop “Metabolism and mathematical models: Two for a tango” – 2nd Edition
Dates: October 25-26, 2022
Location: Online (the workshop will be held virtually)
The topic of this workshop is metabolism in general, with a special, though not exclusive, focus on parasitology. Besides exploring the biological, biochemical and biomedical aspects, the workshop will also present some of the mathematical modelling, algorithmic theory and software development that have become crucial for exploring these aspects.
This workshop is being organised in the context of two projects, both involving the Inria European Team Erable. One project, the Inria Associated Team Capoeira, is a partnership with the University of São Paulo (USP) in São Paulo, Brazil, more specifically with its Institute of Mathematics and Statistics (IME) and its Institute of Biomedical Sciences. The other, the H2020 Twinning Project Olissipo, involves Inesc-ID/IST in Portugal, ETH in Zürich and EMBL in Heidelberg.
The workshop is open to all members of these two projects and, importantly, to the wider community.
The program and more details are available here.