Keyword spotting (KWS) is a widely used speech-triggered interface, and deep-learning-based KWS chips require both ultra-low power and high detection accuracy. We propose a sub-microwatt KWS chip with acoustic activity detection (AAD) that meets both requirements through the following techniques: first, an optimized feature-extraction circuit using nonoverlapping-framed serial Mel-frequency cepstral coefficients (MFCCs), which halves both computation and data storage; second, a zero-cost AAD that uses the MFCC's first-order output to clock-gate the neural network (NN) and postprocessing (PP) unit, with a zero miss rate; third, a tunable detection window that adapts to different keyword lengths for better accuracy; and finally, a true-form computation method that reduces data transitions, together with an optimized PP unit. Implemented in a 28-nm CMOS process, the AAD-KWS chip operates from a 0.4-V supply, clocking the MFCC front end at 8 kHz and the remaining blocks at 200 kHz. It consumes
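Two of the abstract's claims can be illustrated in software: nonoverlapping framing halves the frame count (and hence the MFCC computation and storage) relative to the conventional 50%-overlap scheme, and the first MFCC output per frame (c0, a log-energy-like term) separates speech-like activity from silence, so it can gate the downstream NN. The sketch below is a minimal software analogue under stated assumptions, not the chip's circuit: the "mel" pooling uses equal-width bands instead of a true triangular mel filterbank, and the AAD threshold is a hypothetical midpoint rather than the paper's tuning.

```python
import numpy as np
from scipy.fftpack import dct  # DCT-II for cepstral coefficients


def frame_signal(x, frame_len, hop):
    """Split a signal into frames of frame_len samples with the given hop."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop: i * hop + frame_len] for i in range(n_frames)])


def mfcc_c0(frames, n_mels=16):
    """Crude MFCC front end; returns only c0 (the log-energy-like 1st output).

    Assumption: equal-width power bands stand in for a real mel filterbank.
    """
    win = np.hanning(frames.shape[1])
    spec = np.abs(np.fft.rfft(frames * win, axis=1)) ** 2
    bands = np.array_split(spec, n_mels, axis=1)
    mel = np.stack([b.mean(axis=1) for b in bands], axis=1)
    ceps = dct(np.log(mel + 1e-10), type=2, axis=1, norm="ortho")
    return ceps[:, 0]


sr, frame_len = 8000, 256  # 8-kHz input, as in the abstract's MFCC clock domain
x = np.concatenate([
    np.zeros(4096),                                        # silence
    0.5 * np.sin(2 * np.pi * 440 * np.arange(4096) / sr),  # "speech" stand-in tone
])

# Conventional 50%-overlap framing vs. the paper's nonoverlapping serial framing:
overlap_frames = frame_signal(x, frame_len, frame_len // 2)
serial_frames = frame_signal(x, frame_len, frame_len)
# serial framing yields roughly half the frames -> half the MFCC work and storage

c0 = mfcc_c0(serial_frames)
threshold = c0.min() + 0.5 * (c0.max() - c0.min())  # hypothetical AAD threshold
active = c0 > threshold  # per-frame activity flag from the 1st MFCC output
nn_calls = int(active.sum())  # clock-gating analogue: run the NN only when active
```

In this toy run the silent half of the input produces uniformly low c0 values, so the NN would be evaluated only on the tone-bearing frames; on the chip, the same flag clock-gates the NN and PP unit instead of skipping function calls.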