Aligning a transcription to speech has applications in video subtitling, human–computer interaction through natural language communication, and similar tasks. Despite many advances, it remains a challenging task and can be even more challenging for dysarthric speech. Dysarthria is a motor speech disorder resulting from damage to the peripheral or central nervous system; it causes a slow speaking rate, pronunciation deviations, and prolonged pauses between words and syllables. One problem in aligning dysarthric speech to text is repetition, which can occur at the syllable, word, or phrase level. In this work, we propose an algorithm for syllable boundary detection followed by syllable repetition detection in dysarthric speech. When a syllable is found to be repeated, it is automatically repeated in the transcription as well. The modified transcription is then given to the aligner along with the dysarthric speech. Tested on word alignment with 15 utterances containing 146 words, the proposed system achieved a root mean square error (RMSE) of 0.138, compared with an RMSE of 0.276 for existing work in the literature.
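The transcription-modification step and the RMSE evaluation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `repeat_syllables` and `rmse`, the representation of the transcription as a list of syllables, and the `extra_repeats` mapping (assumed to come from a separate repetition detector, which is not shown) are all hypothetical. The abstract does not state the unit of the reported RMSE, so the sketch leaves the unit of the boundary values open.

```python
import math


def repeat_syllables(syllables, extra_repeats):
    """Return a transcription in which detected repetitions are written out.

    syllables     -- list of syllable strings from the original transcription
    extra_repeats -- hypothetical detector output: maps a syllable index to
                     the number of additional times it was spoken
    """
    modified = []
    for index, syllable in enumerate(syllables):
        # Emit the syllable once, plus once per detected repetition.
        modified.extend([syllable] * (1 + extra_repeats.get(index, 0)))
    return modified


def rmse(predicted, reference):
    """Root mean square error between predicted and reference boundaries."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("boundary lists must be non-empty and equal in length")
    squared_error = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared_error / len(predicted))


# Illustrative values only; these are not data from the paper.
modified = repeat_syllables(["wa", "ter"], {0: 1})   # first syllable spoken twice
error = rmse([0.50, 1.20], [0.45, 1.30])
```

Here the modified syllable sequence, rather than the original one, would be passed to the aligner together with the audio, and `rmse` would compare the aligner's word boundaries against manually marked ones.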
               