Skeleton-based action recognition has achieved remarkable success thanks to the rapid development of Graph Neural Networks (GNNs). Existing skeleton-based methods primarily focus on the spatial aspect of skeleton graphs but rarely mine long-range temporal relationships to learn the intrinsic dependencies across frames, which are crucial for extracting discriminative motion patterns. To acquire a more precise spatial-temporal representation of human skeleton data, we develop a lightweight yet practical method termed Dynamic Channel-Aware Subgraph Interactive Network (DCA-SGIN), which uses interactive motion to adaptively capture long-range temporal nuances across skeleton sequences. Moreover, we introduce a unified paradigm that addresses the inherent structural limitations of GCNs. Specifically, the proposed graph interactive learners exploit the collaborative channel-aware topology of multiple subgraphs to model spatial relations. Unlike previous methods, global and local features are considered simultaneously in spatial modeling without computing any adjacency matrix, which makes the approach more efficient. Each individual component of DCA-SGIN can be treated as a plug-in module that is readily applicable to other GNNs. Extensive experiments on three challenging datasets show that DCA-SGIN outperforms state-of-the-art methods with fewer FLOPs.
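The abstract does not disclose implementation details. Purely for illustration, the following is a minimal, hypothetical sketch (not the authors' DCA-SGIN code) of how a channel-aware subgraph block might mix local and global features without an explicit adjacency matrix; the class name SubgraphInteractionBlock, the parameter num_subgraphs, and the grouped 1x1 convolution plus global channel-gating design are all assumptions made for this example, written against standard PyTorch.

```python
# Hypothetical sketch only: names and design choices are invented for
# illustration and are NOT taken from the DCA-SGIN paper.
import torch
import torch.nn as nn


class SubgraphInteractionBlock(nn.Module):
    """Splits channels into subgraph groups and lets them interact via a
    grouped pointwise convolution (local branch) and a global context gate,
    so spatial relations are modeled without an explicit adjacency matrix."""

    def __init__(self, channels: int, num_subgraphs: int = 4):
        super().__init__()
        assert channels % num_subgraphs == 0
        # Local branch: per-subgraph channel mixing via grouped 1x1 conv.
        self.local = nn.Conv2d(channels, channels, kernel_size=1,
                               groups=num_subgraphs)
        # Global branch: channel re-weighting from joint/frame-pooled context.
        self.global_fc = nn.Linear(channels, channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, joints)
        local = self.local(x)
        ctx = x.mean(dim=(2, 3))                       # (batch, channels)
        gate = torch.sigmoid(self.global_fc(ctx))      # (batch, channels)
        out = local * gate.unsqueeze(-1).unsqueeze(-1) + x  # residual fusion
        return self.act(out)


if __name__ == "__main__":
    block = SubgraphInteractionBlock(channels=64, num_subgraphs=4)
    skeleton_feats = torch.randn(2, 64, 32, 25)  # batch, C, T frames, V joints
    print(block(skeleton_feats).shape)           # torch.Size([2, 64, 32, 25])
```

Because the block only consumes and returns a standard (batch, channels, frames, joints) feature tensor, a module of this kind could in principle be dropped into other GNN/GCN backbones, which is consistent with the plug-in claim in the abstract.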
               