Abstract One of the basic algorithms employed in technologies such as automatic speech recognition (ASR) systems is voice activity detection (VAD). Speech contains many pauses, whose interpretation might lead to recognition errors. The scientific literature provides numerous VAD algorithms, though many of them have substantial memory and/or computation time requirements, while the efficacy of those with smaller requirements is usually unsatisfactory. This paper proposes a modification of a single frequency filtering based algorithm known from the literature, changing the methods used to determine the envelopes and the detection threshold. The purpose of these modifications is to reduce the computation time and memory requirements without sacrificing the efficiency of the algorithm. In addition, a completely new algorithm for determining the detection threshold, based on an approximation of the hypothesis probability distribution, is developed. The obtained results are satisfactory.
               