Separating target speech from a noisy signal is important for many real-world applications. Recently, deep neural networks (DNNs) have been widely used in speech enhancement (SE) and have yielded substantial performance improvements. However, current deep models require a large amount of training data to perform well, and it remains challenging to build an effective deep speech enhancement model from only a few training samples. Meta-learning has become a research focus in few-shot learning because it uses prior meta-knowledge to adapt quickly to new tasks from few samples, yet very few works have applied meta-learning to few-shot speech enhancement. In this paper, we propose Meta-SE, a generic meta-learning framework that uses a U-Net as the meta-learner, to tackle the few-shot speech enhancement problem. Meta-SE is trained and optimized over varying speech enhancement tasks to acquire meta-knowledge, so that it generalizes quickly and well to new, unseen noises given few training samples. Experimental results show that the proposed method not only outperforms state-of-the-art DNN-SE models under few-shot conditions, but also learns a more general and flexible model for task adaptation.
               