
Multi-Head Self-Attention-Based Deep Clustering for Single-Channel Speech Separation



Turning attention to a particular speaker when many people talk simultaneously is known as the cocktail party problem. It remains a challenging problem, especially for single-channel speech separation. Inspired by the physiological observation that humans tend to pick out attractive sounds from mixed signals, we propose the multi-head self-attention deep clustering network (ADCNet) for this problem. We combine the widely used deep clustering network with a multi-head self-attention mechanism and investigate how the number of attention heads affects separation performance. We also adopt the density-based canopy K-means algorithm to further improve performance. We trained and evaluated our system on two- and three-talker mixtures from the Wall Street Journal dataset (WSJ0). Experimental results show that the new approach achieves better performance than many advanced models.
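The core mechanism the abstract names is multi-head self-attention applied to a sequence of spectral frames. The following is a minimal NumPy sketch of scaled dot-product multi-head self-attention, not the authors' ADCNet implementation: the projection matrices are randomly initialised here purely for illustration (in a trained network they are learned), and the shapes are hypothetical.

```python
import numpy as np

def multi_head_self_attention(X, num_heads, rng=None):
    """Illustrative multi-head self-attention over a (T, D) frame sequence.

    X: array of T time frames with D features each; num_heads must divide D.
    All weights below are random stand-ins for learned parameters.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    T, D = X.shape
    assert D % num_heads == 0, "feature dim must split evenly across heads"
    d_k = D // num_heads

    head_outputs = []
    for _ in range(num_heads):
        # Per-head query/key/value projections (hypothetical random init).
        Wq, Wk, Wv = (rng.standard_normal((D, d_k)) / np.sqrt(D)
                      for _ in range(3))
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # Scaled dot-product attention across time frames.
        scores = Q @ K.T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        head_outputs.append(weights @ V)

    # Concatenate the heads back to the original feature dimension.
    return np.concatenate(head_outputs, axis=-1)

# Example: 100 time frames of 64-dim features, 4 heads.
X = np.random.default_rng(1).standard_normal((100, 64))
Y = multi_head_self_attention(X, num_heads=4)
print(Y.shape)  # (100, 64)
```

In a deep-clustering pipeline such as the one described, outputs like these would feed an embedding layer whose vectors are then partitioned (here, by the density-based canopy K-means variant the authors mention) to assign each time-frequency bin to a speaker.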

Keywords: attention; separation; deep clustering; self-attention; multi-head

Journal Title: IEEE Access
Year Published: 2020



