
LFEformer: Local Feature Enhancement Using Sliding Window With Deformability for Automatic Speech Recognition



A module using a sliding window with deformability, abbreviated as SWD, has been proposed for local feature enhancement. In particular, the proposed SWD module adopts windows whose size varies with the depth of the network layer in which the module is embedded. Moreover, the SWD module is inserted into the Transformer network, and the resulting model, referred to as LFEformer, is applied to automatic speech recognition. The network is particularly good at capturing both local and global features, which is beneficial to model performance: the local features are extracted by the SWD module, while the global features are captured by the attention mechanism of the Transformer. The effectiveness of LFEformer has been validated on three widely used datasets: Aishell-1, HKUST, and WSJ (dev93/eval92). The experimental results demonstrate improvements of 0.5% CER, 0.8% CER, and 0.7%/0.3% WER on the corresponding datasets.
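The abstract does not spell out the internals of the SWD module, but a minimal PyTorch sketch of the idea it describes might look like the following: each Transformer block gains a local-feature branch whose window size depends on layer depth, and its output is combined with the self-attention output, which supplies the global context. The local branch is approximated here with a depthwise 1-D convolution over the time axis, and the names and window schedule (`LocalWindowBlock`, `LFEformerSketch`, `window_sizes`) are hypothetical illustrations, not the authors' implementation of deformability.

```python
# Hypothetical sketch of the local/global split described in the abstract.
# NOTE: the real SWD module uses a deformable sliding window; the local branch
# below is approximated with a depthwise 1-D convolution whose kernel size
# (window size) grows with layer depth. This is an assumption, not the paper's method.
import torch
import torch.nn as nn


class LocalWindowBlock(nn.Module):
    """Transformer encoder block with an extra windowed local-feature branch."""

    def __init__(self, d_model: int, n_heads: int, window_size: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Depthwise conv over time acts as a fixed-size sliding window.
        self.local = nn.Conv1d(
            d_model, d_model, kernel_size=window_size,
            padding=window_size // 2, groups=d_model,
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Global features come from self-attention, local features from the windowed conv.
        y = self.norm1(x)
        global_feat, _ = self.attn(y, y, y, need_weights=False)
        local_feat = self.local(y.transpose(1, 2)).transpose(1, 2)
        x = x + global_feat + local_feat
        return x + self.ffn(self.norm2(x))


class LFEformerSketch(nn.Module):
    """Stack of blocks whose window size varies with depth (assumed schedule)."""

    def __init__(self, d_model: int = 256, n_heads: int = 4,
                 window_sizes: tuple = (3, 5, 7, 9)):
        super().__init__()
        self.layers = nn.ModuleList(
            [LocalWindowBlock(d_model, n_heads, w) for w in window_sizes]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x


if __name__ == "__main__":
    feats = torch.randn(2, 100, 256)          # (batch, frames, feature dim)
    print(LFEformerSketch()(feats).shape)     # torch.Size([2, 100, 256])
```

In an ASR setting this encoder output would then feed a CTC or attention-based decoder; the point of the sketch is only to show local (windowed, depth-dependent) and global (attention) features being fused within each layer.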

Keywords: automatic speech recognition; local feature enhancement; sliding window

Journal Title: IEEE Signal Processing Letters
Year Published: 2023
