Existing linguistic steganalysis methods share a similar conceptual learning paradigm: they learn features derived from a word and its surrounding words, then pass them to a classifier. However, such features cannot capture the associations between individual words and the corpus as a whole, which imposes significant limitations. In this paper, we uncover this missing link between words and the corpus and accordingly propose a linguistic steganalysis framework named LS-BGAT. Specifically, we fine-tune a large-scale pre-trained BERT as the local feature extractor and employ a Graph Attention Network (GAT) as the global feature extractor. By combining the local BERT-based features and the global GAT-based features in a joint prediction layer, we integrate rich sentence-level semantic and syntactic information with underlying corpus-level global information. Furthermore, we extend steganalysis from binary to multi-category classification, thereby enhancing practicality. We empirically substantiate the effectiveness and universality of LS-BGAT on three tasks.
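To make the local/global fusion concrete, the following is a minimal NumPy sketch of one GAT-style attention layer (in the spirit of Veličković et al.) over a toy corpus graph, with its output concatenated to stand-in "BERT" sentence vectors as a joint representation. All shapes, the chain-graph adjacency, and the random stand-in features are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(H, A, W, a):
    """One graph-attention layer.
    H: (N, F) node features; A: (N, N) adjacency with self-loops;
    W: (F, F') projection; a: (2*F',) attention vector."""
    Z = H @ W                         # projected node features, (N, F')
    N = H.shape[0]
    out = np.zeros_like(Z)
    for i in range(N):
        nbrs = np.flatnonzero(A[i])   # neighbours of node i (incl. itself)
        e = np.array([leaky_relu(a @ np.concatenate([Z[i], Z[j]]))
                      for j in nbrs]) # unnormalised attention logits
        att = softmax(e)              # normalise over the neighbourhood
        out[i] = att @ Z[nbrs]        # attention-weighted aggregation
    return out

# Toy corpus graph: 4 nodes, input dim F=3, output dim F'=2.
H = rng.normal(size=(4, 3))
A = np.eye(4) + np.eye(4, k=1) + np.eye(4, k=-1)  # chain + self-loops
W = rng.normal(size=(3, 2))
a = rng.normal(size=(4,))
G = gat_layer(H, A, W, a)             # global, corpus-level features

# Hypothetical joint prediction input: concatenate stand-in BERT
# sentence vectors (local) with the GAT node features (global).
bert_local = rng.normal(size=(4, 5))  # placeholder for BERT [CLS] outputs
joint = np.concatenate([bert_local, G], axis=1)
print(joint.shape)                    # (4, 7)
```

A real implementation would feed `joint` into a trained classification head; the sketch only shows how the two feature streams are assembled.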