Speech separation is an essential part of voice recognition systems such as speaker recognition, speech recognition, and hearing aids. Applying speech separation at the front end of such a system improves its performance. In this paper, we propose a system for single-channel speech separation that combines empirical mode decomposition (EMD) with multi-pitch information. The proposed method is completely unsupervised and requires no knowledge of the underlying speakers. We apply EMD to short frames of the mixed speech to obtain a better estimate of the speech-specific information, which is derived through multi-pitch tracking. To track multi-pitch information in the mixed signal, we apply simple-inverse filtering tracking and histogram-based pitch estimation to the excitation source information, and we also estimate the number of speakers present in the mixed signal.
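As an illustration of the histogram-based pitch estimation step, the following minimal sketch shows one way per-frame pitch candidates could be pooled into a histogram whose dominant peaks give both the pitch of each speaker and the speaker count. It is not the procedure of the paper, which the abstract does not detail. The function names, the 10 Hz bin width, the `min_share` threshold, the convention of marking unvoiced frames with 0, and the sample values are all assumptions made for this example.

```python
from collections import Counter


def pitch_histogram(pitch_candidates_hz, bin_width_hz=10.0):
    """Count pitch candidates per histogram bin (keys are bin indices)."""
    counts = Counter()
    for f0 in pitch_candidates_hz:
        if f0 > 0:  # assumption: unvoiced frames are marked with 0
            counts[int(f0 // bin_width_hz)] += 1
    return counts


def dominant_pitches(pitch_candidates_hz, bin_width_hz=10.0, min_share=0.2):
    """Return the centre frequencies (Hz) of the dominant histogram peaks.

    A bin is a peak if it is a local maximum and holds at least
    `min_share` of all voiced candidates (hypothetical threshold).
    """
    counts = pitch_histogram(pitch_candidates_hz, bin_width_hz)
    total = sum(counts.values())
    if total == 0:
        return []
    peaks = []
    for b, c in counts.items():
        # Missing neighbouring bins count as zero.
        is_local_max = c > counts.get(b - 1, 0) and c >= counts.get(b + 1, 0)
        if is_local_max and c / total >= min_share:
            peaks.append((b + 0.5) * bin_width_hz)
    return sorted(peaks)


# Made-up per-frame pitch candidates (Hz) from a two-speaker mixture.
candidates = [118, 121, 123, 119, 0, 212, 215, 208, 211, 0, 122, 214]
speaker_pitches = dominant_pitches(candidates)
num_speakers = len(speaker_pitches)
```

In this sketch the number of dominant peaks serves as the estimated number of speakers. A practical system would obtain the candidates from the excitation source signal and would likely smooth the histogram before peak picking.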
               