A wide array of features and methods has been developed over the past six decades for Speaker Identification (SI) and Speaker Verification (SV), jointly known as Speaker Recognition (SR). Mel Frequency Cepstral Coefficients (MFCC) are used as feature vectors in most cases because they give higher accuracy than other features. This paper presents a comparative study of state-of-the-art SR techniques along with their design challenges, robustness issues and performance evaluation methods. Rigorous experiments have been performed using the Gaussian Mixture Model (GMM) with variations such as the Universal Background Model (UBM), Vector Quantization (VQ) and VQ-based UBM-GMM (VQ-UBM-GMM), and the results are discussed in detail. Other popular methods, namely Linear Discriminant Analysis (LDA), Probabilistic LDA (PLDA), Gaussian PLDA (GPLDA), Multi-condition GPLDA (MGPLDA) and Identity Vector (i-vector), are included for comparative study only. Three popular audio data-sets have been used in the experiments, namely IITG-MV SR, Hyke-2011 and ELSDSR. Hyke-2011 and ELSDSR contain clean speech, while IITG-MV SR contains noisy audio data with variations in channel (device), environment and speaking style. We propose a new data mixing approach for SR to make the system independent of recording device, speaking style and environment. The accuracy obtained with VQ- and GMM-based methods on the Hyke-2011 and ELSDSR databases varies from 99.6% to 100%, whereas accuracy on IITG-MV SR is up to 98%. In some cases, however, accuracy degrades drastically because of mismatch between training and testing data as well as the singularity problem of GMM. The experimental results serve as a benchmark for VQ/GMM/UBM-based methods on the IITG-MV SR database.
               