A Multifaceted Approach to Oral Assessment Based on the Conformer Architecture

Automatic speaking assessment methods are essential for helping non-native learners acquire native-like pronunciation. Automatic speaking assessment comprises two tasks: mispronunciation detection and pronunciation quality assessment. Past research has usually focused on only one of these aspects. Work on multifaceted speaking assessment has been rare, and existing models have often lost performance because they omit local feature details. In this paper, we propose a multi-width band (MB) method and apply it to the Conformer model. The method effectively increases the model's ability to capture local feature information at different scales. We also use multi-task learning to train a multifaceted speaking assessment model based on GOP features. We conducted experiments on two datasets: a self-built monosyllabic Mandarin mispronunciation detection dataset (PSC-MonoSyllable) and an open-source English pronunciation quality assessment dataset (SpeechOcean762). On PSC-MonoSyllable, the method reached mispronunciation detection F1 scores of 70.18%, 80.06%, and 79.82% at the phoneme, tone, and word levels, respectively. On SpeechOcean762, the method also showed some improvement over the baseline model across all phoneme- and grapheme-level correlation metrics for the pronunciation quality assessment task.
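
The mispronunciation detection results above are reported as F1 scores. As a reminder of what that metric measures, here is a minimal sketch of the standard precision/recall/F1 computation for a binary detection task. The abstract does not describe the evaluation protocol, so the label convention (1 = mispronounced, treated as the positive class) and the function name `detection_f1` are assumptions made for illustration only, not the authors' evaluation code.

```python
def detection_f1(reference, predicted, positive=1):
    """Return (precision, recall, F1) for one positive class.

    Assumption: labels are per-unit (phoneme, tone, or word) decisions,
    with `positive` marking a mispronounced unit.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    # True positives: mispronounced units the detector flagged.
    tp = sum(1 for r, p in pairs if r == positive and p == positive)
    # False positives: correct units wrongly flagged as mispronounced.
    fp = sum(1 for r, p in pairs if r != positive and p == positive)
    # False negatives: mispronounced units the detector missed.
    fn = sum(1 for r, p in pairs if r == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels for six units (not data from the paper).
reference = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0]
precision, recall, f1 = detection_f1(reference, predicted)
```

Because F1 balances missed errors against false alarms, it is a stricter summary than accuracy when mispronounced units are much rarer than correct ones, which is typical for learner speech.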

Keywords: assessment; pronunciation; model; dataset; approach; speaking assessment

Journal Title: IEEE Access
Year Published: 2023
