"Locally Normalized Filter Banks Applied to Deep Neural-Network-Based Robust Speech Recognition"

This letter describes modifications to locally normalized filter banks (LNFB) that substantially improve their performance on the Aurora-4 robust speech recognition task using a Deep Neural Network-Hidden Markov Model (DNN-HMM)-based speech recognition system. The modified coefficients, referred to as LNFB features, are a filter-bank version of the previously described locally normalized cepstral coefficients (LNCC). The performance of the LNFB features is enhanced by newly proposed dynamic versions of them, which are developed using an approach that differs somewhat from the traditional development of delta and delta–delta features. Further improvements are obtained through mean normalization and mean–variance normalization, each evaluated on both a per-speaker and a per-utterance basis. Compared with comparable features derived from Mel filter banks, the best-performing feature combination (typically LNFB combined with LNFB delta and delta–delta features and mean–variance normalization) provides an average relative reduction in word error rate of 11.4% with clean training and 9.4% with multinoise training in the Aurora-4 evaluation. The results suggest that the proposed technique is more robust to channel mismatches between training and testing data than MFCC-derived features and is more effective in dealing with channel diversity.
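As a rough illustration of the generic building blocks named in the abstract, the following Python sketch shows per-utterance mean–variance normalization, conventional regression-based delta features, and the relative word-error-rate reduction formula. It does not implement the LNFB features or the letter's modified dynamic features, whose details are not given here; the delta computation is the traditional one that the letter says its approach differs from. All function names, the window size, and the `eps` constant are assumptions made for this example.

```python
import numpy as np


def mean_variance_normalize(feats, eps=1e-8):
    """Normalize each coefficient to zero mean and unit variance.

    feats: array of shape (num_frames, num_coeffs). Passing the frames of one
    utterance gives per-utterance normalization; passing all frames of one
    speaker gives per-speaker normalization.
    """
    mean = feats.mean(axis=0, keepdims=True)
    std = feats.std(axis=0, keepdims=True)
    return (feats - mean) / (std + eps)  # eps (assumed) avoids division by zero


def regression_deltas(feats, window=2):
    """Conventional regression-based delta features (not the letter's variant).

    delta[t] = sum_n n * (x[t+n] - x[t-n]) / (2 * sum_n n^2), n = 1..window,
    with edge frames repeated at the utterance boundaries.
    """
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    deltas = np.zeros_like(feats, dtype=float)
    for n in range(1, window + 1):
        ahead = padded[window + n : window + n + num_frames]
        behind = padded[window - n : window - n + num_frames]
        deltas += n * (ahead - behind)
    return deltas / denom


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


def static_plus_dynamic(feats, window=2):
    """Stack static, delta, and delta-delta features, then normalize."""
    d1 = regression_deltas(feats, window)
    d2 = regression_deltas(d1, window)
    return mean_variance_normalize(np.hstack([feats, d1, d2]))
```

In this sketch, delta–delta features are obtained by applying the same regression to the deltas, and normalization is applied after stacking; the order and the statistics pool (utterance or speaker) used in the letter may differ.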

Keywords: locally normalized; delta; filter banks; speech recognition

Journal Title: IEEE Signal Processing Letters
Year Published: 2017
