
WordRevert: Adversarial Examples Defence Method for Chinese Text Classification



Adversarial examples can evade text classification models based on deep neural networks (DNNs), posing a potential security threat to systems that rely on them. To address this problem, we propose WordRevert, an adversarial example defense method for Chinese text classification. The method first obtains the “positive text” containing the adversarial words by filtering out the clauses that do not contribute to the current classification label. A detection network is then combined with a position importance calculation function to detect the adversarial words. Finally, the adversarial words are restored to the original words by calculating a candidate score and a detection score. Experiments show that the method defends effectively against currently popular Chinese text adversarial attack algorithms: it significantly increases classification accuracy on adversarial examples at the cost of a small reduction in accuracy on clean samples, and it achieves better precision, recall, and F1 for adversarial word detection and restoration.
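
The three stages named in the abstract (clause filtering, adversarial word detection, word restoration) can be pictured with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the function names, the thresholds, the choice to multiply the detection score by the position importance, and the choice to multiply the candidate score by the detection score are all hypothetical, and the toy lambdas stand in for the trained networks the paper would use. It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def filter_positive_clauses(
    clauses: Sequence[List[str]],
    contribution: Callable[[List[str]], float],
    min_contribution: float = 0.0,
) -> List[List[str]]:
    """Stage 1: keep clauses that contribute to the current label.

    `contribution` is a hypothetical stand-in for a clause-level scorer.
    """
    return [c for c in clauses if contribution(c) > min_contribution]


def detect_adversarial_words(
    words: Sequence[str],
    detection_score: Callable[[str], float],
    position_importance: Callable[[int, int], float],
    threshold: float = 0.5,
) -> List[Tuple[int, float]]:
    """Stage 2: flag positions whose combined score exceeds a threshold.

    Combining the two signals by multiplication is an assumption.
    """
    flagged = []
    for i, word in enumerate(words):
        score = detection_score(word) * position_importance(i, len(words))
        if score > threshold:
            flagged.append((i, score))
    return flagged


def restore_words(
    words: Sequence[str],
    flagged: Sequence[Tuple[int, float]],
    candidates: Callable[[str], List[str]],
    candidate_score: Callable[[str, str], float],
    min_restore_score: float = 0.5,
) -> List[str]:
    """Stage 3: swap a flagged word for its best candidate when the
    combined candidate and detection score is high enough (assumed rule)."""
    restored = list(words)
    for i, det in flagged:
        options = candidates(words[i])
        if not options:
            continue  # nothing to restore to; leave the word unchanged
        best = max(options, key=lambda c: candidate_score(words[i], c))
        if candidate_score(words[i], best) * det >= min_restore_score:
            restored[i] = best
    return restored


# Toy data: "拉圾" is an illustrative sound-alike substitution for "垃圾".
VOCAB = {"这个", "产品", "是", "垃圾", "物流", "很", "快"}
CANDIDATES: Dict[str, List[str]] = {"拉圾": ["垃圾"]}
clauses = [["物流", "很", "快"], ["这个", "产品", "是", "拉圾"]]

# Toy scorers: out-of-vocabulary words are suspicious, positions are uniform.
positive = filter_positive_clauses(clauses, lambda c: 1.0 if "拉圾" in c else 0.0)
words = positive[0]
flagged = detect_adversarial_words(
    words, lambda w: 0.0 if w in VOCAB else 1.0, lambda i, n: 1.0
)
restored = restore_words(
    words, flagged, lambda w: CANDIDATES.get(w, []), lambda w, c: 0.9
)
```

In this toy run the first clause is filtered out, the out-of-vocabulary token at the last position is flagged, and it is replaced by its single candidate. Passing the scorers in as callables keeps the sketch independent of any particular detection network or candidate generator.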

Keywords: method; detection; classification; adversarial examples; chinese text; text classification

Journal Title: IEEE Access
Year Published: 2022
