A Deep Neural Network Approach to Speech Enhancement
Georgia Institute of Technology –
In contrast to the conventional minimum mean square error (MMSE) based noise reduction techniques, we formulate speech enhancement as finding a mapping function between noisy and clean speech signals. In order to be able to handle a wide range of additive noises in real-world situations, a large training set, encompassing many possible combinations of speech and noise types, is first designed. Next a deep neural network (DNN) architecture is employed as a nonlinear regression function to ensure a powerful modeling capability. Several techniques have also been adopted to improve the DNN-based speech enhancement system, including global variance equalization to alleviate the over-smoothing problem of the regression model, and dropout and noise-aware training strategies to further improve the generalization capability of DNNs to unseen noise conditions. Experimental results demonstrate that the proposed framework can achieve significant improvements in both objective and subjective measures over the MMSE based techniques. It is also interesting to observe that the proposed DNN approach can well suppress the highly non-stationary noise, which is tough to handle in general. Furthermore, the resulting DNN model, trained with artificial synthesized data, is also effective in dealing with noisy speech data recorded in real-world scenarios without generating the annoying musical artifact commonly observed in conventional enhancement methods.
[ Bio ]
Chin-Hui Lee is a professor at School of Electrical and Computer Engineering, Georgia Institute of Technology. Before joining academia in 2001, he had 20 years of industrial experience ending in Bell Laboratories, Murray Hill, New Jersey, as a Distinguished Member of Technical Staff and Director of the Dialogue Systems Research Department. Dr. Lee is a Fellow of the IEEE and a Fellow of ISCA. He has published over 400 papers and 30 patents, and was highly cited for his original contributions with an amazing h-index of 66. He received numerous awards, including the Bell Labs President’s Gold Award in 1998. He won the SPS’s 2006 Technical Achievement Award for “Exceptional Contributions to the Field of Automatic Speech Recognition”. In 2012 he was invited by ICASSP to give a plenary talk on the future of speech recognition. In the same year he was awarded the ISCA Medal in scientific achievement for “pioneering and seminal contributions to the principles and practice of automatic speech and speaker recognition”.
Date: 2014-May-26 Time: 15:30:00 Room: QA1.2 (IST Alameda)
For more information:
Workshop “Metabolism and mathematical models: Two for a tango” – 2nd Edition
Title: Workshop Metabolism and mathematical models: Two for a tango – 2nd Edition
Dates: October 25-26, 2022
Location: This workshop will be held in a virtual way
The topic of this workshop is metabolism in general, with a special focus, although not exclusive, on parasitology. Besides an exploration of the biological, biochemical and biomedical aspects, the workshop will also aim at presenting some of the mathematical modelling, algorithmic theory and software development that have become crucial to explore such aspects.
This workshop is being organised in the context of two projects, both with the Inria European Team Erable. One of the projects involves a partnership with the University of São Paulo (USP), in São Paulo, Brazil, more specifically the Institute of Mathematics and Statistics (IME) and the Institute of Biomedical Sciences – Inria Associated Team Capoeira – and the other involves the Inesc-ID/IST in Portugal, ETH in Zürich and EMBL in Heidelberg – H2020 Twinning Project Olissipo.
The workshop is open to all members of these two projects but also, importantly, to the community in general.
The program and more details are available here.