RealPRNet: A Real-Time Phoneme-Recognized Network for “Believable” Speech Animation

As technology develops, more and more Internet of Things (IoT) devices with displays are making “face-to-face” interaction through visualization a reality. To protect users' privacy, communications can be represented through avatars animated by audio-driven real-time speech animation. However, if audio is the only available input, the quality of the result relies heavily on real-time phoneme recognition, in particular its accuracy and latency. This article introduces RealPRNet, a novel deep-learning-based real-time phoneme recognition network that leverages spatial and temporal patterns in the input audio data. With its long short-term memory (LSTM) stack block and long short-term features, RealPRNet achieves superior phoneme recognition performance. Our comprehensive empirical results show that, compared with state-of-the-art algorithms, RealPRNet achieves a 20% improvement in phoneme error rate (PER) and a 4% improvement in block distance error (BDE) in the best case.
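
The abstract reports results in terms of phoneme error rate (PER) without defining it. As a rough illustration, the sketch below computes PER in the way it is conventionally defined in speech recognition: the edit distance (substitutions, deletions, and insertions) between the recognized and reference phoneme sequences, divided by the reference length. That the article uses exactly this definition is an assumption; the function names and the example phoneme labels are hypothetical and not taken from the article.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j phonemes of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_ph in enumerate(reference, start=1):
        curr = [i]  # distance from i reference phonemes to an empty hypothesis
        for j, hyp_ph in enumerate(hypothesis, start=1):
            cost = 0 if ref_ph == hyp_ph else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference phoneme missed)
                curr[j - 1] + 1,     # insertion (extra phoneme recognized)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(reference, hypothesis):
    """PER = (substitutions + deletions + insertions) / reference length.

    Assumption: the conventional definition; the article may differ.
    """
    if not reference:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one reference phoneme ("L") is missed.
reference = ["HH", "AH", "L", "OW"]
hypothesis = ["HH", "AH", "OW"]
per = phoneme_error_rate(reference, hypothesis)
```

Under this definition a lower PER is better, and PER can exceed 1.0 when the recognizer inserts many spurious phonemes.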

Keywords: realprnet; speech animation; phoneme; real time; time phoneme

Journal Title: IEEE Internet of Things Journal
Year Published: 2022
