This brief presents a novel unsupervised, event-driven spatio-temporal domain adaptation (DA) method for dynamic vision sensor (DVS) gesture recognition. The method transfers knowledge from the source domain to the target domain in both the spatial and temporal dimensions without requiring labels for the target domain data. Specifically, it consists of a deep spiking neural network (SNN)-based feature extractor, a label predictor, and a domain discriminator. A time-space gradient reversal layer builds a spatio-temporal bridge between the domain discriminator and the feature extractor, which is essential for aligning the source domain spike features with the target domain spike features and for achieving domain adaptation in both space and time. To demonstrate the effectiveness of our method, we adapted DVS hand gesture data from one temporal resolution to another and from original data to denoised data. Our method improves accuracy by up to 10.39%. Its accuracy improvement is also more stable than that of RNN-based and LSTM-based methods in the same DA framework, especially when the two domains are only partially similar in their DVS data.
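The abstract does not say how the time-space gradient reversal layer is implemented. As a rough illustration only, the sketch below shows the standard gradient reversal idea applied independently at every time step of a feature sequence: features pass through unchanged in the forward direction, and the gradient coming back from the domain discriminator is negated and scaled before it reaches the feature extractor. The names `GradientReversal`, `lam`, and `combine_gradients` are hypothetical, and the per-time-step treatment is an assumption rather than the authors' design.

```python
from typing import List

Vector = List[float]


class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass.

    Hypothetical sketch: the paper's time-space layer may differ.
    """

    def __init__(self, lam: float = 1.0) -> None:
        self.lam = lam  # assumed trade-off weight for the domain loss

    def forward(self, features: List[Vector]) -> List[Vector]:
        # features[t] is the feature vector at time step t; it is copied unchanged.
        return [list(step) for step in features]

    def backward(self, grad_from_discriminator: List[Vector]) -> List[Vector]:
        # Negate and scale the gradient at every time step and feature index.
        return [[-self.lam * g for g in step] for step in grad_from_discriminator]


def combine_gradients(
    grad_label: List[Vector], grad_domain: List[Vector], grl: GradientReversal
) -> List[Vector]:
    """Gradient reaching the feature extractor: label gradient plus reversed domain gradient."""
    reversed_domain = grl.backward(grad_domain)
    return [
        [gl + gd for gl, gd in zip(step_label, step_domain)]
        for step_label, step_domain in zip(grad_label, reversed_domain)
    ]
```

With this arrangement the feature extractor is pushed to lower the label loss while raising the domain discriminator's loss, which is what encourages source and target features to become indistinguishable.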
               