"A Novel Encoder-Decoder Model via NS-LSTM Used for Bone-Conducted Speech Enhancement"

Bone-conducted (BC) speech can be used to communicate in very noisy environments. In this paper, a method of improving the quality of BC speech is presented. The speech signal of a speaker is passed through a novel dictionary representation-based encoder-decoder model. In the encoder, our designed non-negative and sparse long short-term memory (NS-LSTM) recurrent neural network is deployed to generate combination coefficients on the dictionary established by sparse non-negative matrix factorization. Then, the decoder is designed and utilized to enhance the dictionary representation based on a local attention mechanism. Two optimizers are adopted when training the model as a whole, and the encoder is pre-trained individually to speed up convergence. In experiments, we compare the proposed method with direct transformations via DNN and LSTM networks, and numerous criteria are used for evaluation. Objective and subjective results demonstrate that our method performs better and achieves satisfactory performance even in some challenging cases.
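
The abstract mentions a dictionary established by sparse non-negative matrix factorization (NMF), on which the encoder produces non-negative, sparse combination coefficients. As a rough illustration of that one ingredient only, the following is a minimal sketch of sparse NMF with multiplicative updates, assuming a Euclidean reconstruction cost with an L1 penalty on the coefficients. The function name `sparse_nmf`, its parameters, and the column normalization heuristic are assumptions made for this example and are not taken from the paper; the NS-LSTM encoder and the attention-based decoder are not reproduced here.

```python
import numpy as np

def sparse_nmf(V, n_atoms, sparsity=0.1, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (freq x frames) as V ~ W @ H.

    W holds the dictionary atoms (columns), H the non-negative
    coefficients. `sparsity` is the weight of an L1 penalty on H.
    Hypothetical helper for illustration, not the authors' code.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_atoms)) + eps
    H = rng.random((n_atoms, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative update for H; the L1 penalty adds `sparsity`
        # to the denominator, which shrinks small coefficients.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        # Multiplicative update for the dictionary W.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Heuristic: rescale atoms to unit norm and move the scale into H,
        # so the penalty cannot be avoided by inflating W.
        norms = np.linalg.norm(W, axis=0, keepdims=True) + eps
        W /= norms
        H *= norms.T
    return W, H

# Usage with a random stand-in for a magnitude spectrogram.
V = np.abs(np.random.default_rng(1).standard_normal((64, 100)))
W, H = sparse_nmf(V, n_atoms=8)
print(W.shape, H.shape)
```

In a setup like the one described, `W` would be learned once from training spectrograms and kept fixed, while a network predicts coefficients in place of `H`; that division of labor is an interpretation of the abstract, not a confirmed detail.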

Keywords: bone conducted; encoder; conducted speech; encoder decoder; model; decoder

Journal Title: IEEE Access
Year Published: 2018
