Automatic speaker recognition (ASR) is a challenging task when the test speech is very short (i.e., a few seconds). Source features extracted from short speech utterances have been shown to be effective in such cases. This paper proposes a system based on the LP residual for text-independent speaker recognition. The discrete wavelet transform (DWT) and the stationary wavelet transform (SWT) are explored as ways to parameterize the LP residual. DWT works well for denoising and compression. SWT works better than DWT at reconstructing a noisy signal at higher levels of decomposition. SWT/DWT coefficients of the LP residual are used to implement an i-vector/P-LDA based speaker recognition system. The effectiveness of the system is evaluated on the 10 s–10 s task of the NIST speaker recognition evaluation (SRE) 2010 database. To evaluate robustness in degraded environments, the speech files are mixed with white noise from the NOISEX-92 database. Speaker recognition using SWT level 3 results in an equal error rate (EER) of 40 and a decision cost function (DCF) of 0.3956 for the voice part of the signal in the 10 s training, 10 s testing data set. The results show that the proposed method gives robust speaker recognition performance in terms of DCF.
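The front end described above (LP residual followed by DWT or SWT parameterization) can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the LP order of 10, the synthetic test frame, and all function names are assumptions made for illustration, since the abstract does not specify them. The i-vector/P-LDA back end is not covered.

```python
import math
import random


def lp_coefficients(frame, order):
    """Autocorrelation method with Levinson-Durbin recursion.

    Returns a[0..order] with a[0] = 1, so that the prediction error is
    e[t] = sum_j a[j] * x[t - j].
    """
    n = len(frame)
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate (e.g. all-zero) frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lp_residual(frame, a):
    """Inverse-filter the frame with A(z); samples before t = 0 are taken as zero."""
    order = len(a) - 1
    return [sum(a[j] * frame[t - j] for j in range(order + 1) if t - j >= 0)
            for t in range(len(frame))]


def haar_dwt(x):
    """One level of the decimated Haar DWT (assumes an even-length input)."""
    s = 1.0 / math.sqrt(2.0)
    half = len(x) // 2
    approx = [(x[2 * i] + x[2 * i + 1]) * s for i in range(half)]
    detail = [(x[2 * i] - x[2 * i + 1]) * s for i in range(half)]
    return approx, detail


def haar_swt(x, levels):
    """Undecimated (stationary) Haar transform with circular extension.

    Every level keeps the input length; the filter is dilated by 2**level.
    """
    s = 1.0 / math.sqrt(2.0)
    n = len(x)
    approx = list(x)
    details = []
    for level in range(levels):
        step = 2 ** level
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([(p - q) * s for p, q in zip(approx, shifted)])
        approx = [(p + q) * s for p, q in zip(approx, shifted)]
    return approx, details


# Synthetic stand-in for one speech frame (assumption: no real audio is used here).
rng = random.Random(0)
frame = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t + 0.4) + 0.1 * rng.gauss(0.0, 1.0)
         for t in range(256)]

a = lp_coefficients(frame, order=10)            # LP order 10 is an assumed value
residual = lp_residual(frame, a)                # source-like excitation signal
dwt_approx, dwt_detail = haar_dwt(residual)     # decimated: half-length bands
swt_approx, swt_details = haar_swt(residual, levels=3)  # level 3, full-length bands
```

Because the SWT skips downsampling, each of its bands has as many samples as the residual itself, whereas each DWT band halves in length per level. In a full system the coefficients would be computed per frame and passed to the i-vector extractor as features.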
               