An augmentative and alternative speech communication (AASC) aid comprises a speech recognition system and a speech synthesis system. The main challenge in developing such an aid for dysarthric speakers lies in handling errors in the text derived from the recognition system. These errors (substitutions, deletions, and insertions) may be due to the inability of a dysarthric speaker to utter certain phones (articulatory error) or to inaccuracy of the trained models (modeling error). Most existing AASC approaches focus only on articulatory errors, and those that do address both types of error do not differentiate between them. This paper, in contrast, performs a three-level cascaded analysis to identify and distinguish between these errors, since differentiating them aids in handling each appropriately. Furthermore, the analyses in the paper are independent of the syntax of the utterances. Based on these analyses, weighted phone confusion transducers are formulated and used to correct erroneous text from the recognition system. The corrected text is finally synthesized by a text-to-speech synthesis system. The proposed AASC aid is observed to significantly reduce the word error rate of severe dysarthric speakers from 100% to 41.52%, of moderate speakers from 61.85% to 18.08%, and of mild speakers from 12.23% to 8.55%.
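For readers unfamiliar with the metric, the word error rate quoted above is conventionally the number of substitutions, deletions, and insertions in a minimum-edit-distance alignment, divided by the number of reference words. The following is a minimal sketch of that standard definition only. The abstract does not say which scoring tool was used, so the function names `wer_counts` and `wer` and the whitespace tokenization are illustrative assumptions rather than the authors' implementation.

```python
def wer_counts(reference, hypothesis):
    """Return (substitutions, deletions, insertions) for a word-level alignment.

    Hypothetical helper: uses plain Levenshtein alignment over
    whitespace-separated words, which is an assumption of this sketch.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = (total_cost, subs, dels, ins) for ref[:i] versus hyp[:j]
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)          # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            c, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, n)              # correct word, no cost
            else:
                diag = (c + 1, s + 1, d, n)      # substitution
            c, s, d, n = dp[i - 1][j]
            dele = (c + 1, s, d + 1, n)          # deletion
            c, s, d, n = dp[i][j - 1]
            ins = (c + 1, s, d, n + 1)           # insertion
            dp[i][j] = min(diag, dele, ins, key=lambda t: t[0])
    _, s, d, n = dp[-1][-1]
    return s, d, n


def wer(reference, hypothesis):
    """Word error rate in percent: 100 * (S + D + I) / number of reference words."""
    n_ref = len(reference.split())
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    s, d, n = wer_counts(reference, hypothesis)
    return 100.0 * (s + d + n) / n_ref
```

Because insertions count as errors, the rate can reach or exceed 100% when little of the reference is recognized, which is how a starting figure of 100% for severe speakers can arise.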