Deep neural networks can be fooled by small, imperceptible perturbations known as adversarial examples. Although such examples are carefully crafted, existing attacks raise two major concerns: in some cases the generated perturbations are much larger than the minimal adversarial perturbation, while in others the attack requires an extensive number of iterations, making it impractical. Moreover, existing sparse attacks are either too complex or not sparse enough to achieve imperceptibility. An attack should therefore be fast and minimal in terms of $\ell_{2}$-norm. In this research, we use a dictionary learning technique to generate sparse adversarial examples based on feature maps of target images, and we present two novel algorithms to tune the dictionary learning process and the feature-map selection. Results on MNIST and ImageNet show that our attack is competitive with or better than state-of-the-art methods. We also compared our method with sparse attacks recently introduced in the literature and achieved a comparable attack success rate with a smaller $\ell_{2}$-norm. Finally, we tested the efficacy of our attack in the presence of defense mechanisms, and none of the defenses was able to counter its effect.
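
The abstract does not specify the two algorithms, so the following is only a minimal, hypothetical sketch of the general idea: learn a dictionary over feature-map patches of a target image, sparse-code the patches, and use the reconstruction residual as a candidate sparse perturbation bounded in $\ell_{2}$-norm. All shapes, parameter values, and the `epsilon` budget are assumptions, not the paper's method.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Stand-in for feature maps extracted from a target image
# (e.g., activations of an early conv layer), flattened into
# patches of shape (n_patches, patch_dim). Values are synthetic.
feature_patches = rng.normal(size=(200, 64))

# Learn an overcomplete dictionary with a sparsity constraint on the codes.
# All hyperparameters here are illustrative assumptions.
dico = DictionaryLearning(
    n_components=128,            # number of dictionary atoms (assumed)
    transform_algorithm="omp",   # orthogonal matching pursuit for sparse codes
    transform_n_nonzero_coefs=5, # enforce sparsity of each code (assumed)
    random_state=0,
)
codes = dico.fit_transform(feature_patches)

# Reconstruct patches from their sparse codes; the residual between the
# reconstruction and the original patches gives a sparse perturbation
# direction tied to the image's feature maps.
reconstruction = codes @ dico.components_
perturbation = reconstruction - feature_patches

# Project each patch perturbation onto an l2 ball of radius epsilon,
# mirroring the paper's goal of small-l2-norm attacks (epsilon is assumed).
epsilon = 0.5
norms = np.linalg.norm(perturbation, axis=1, keepdims=True)
perturbation = perturbation * np.minimum(1.0, epsilon / (norms + 1e-12))
```

In a real attack pipeline, the perturbation would be mapped back from feature space to pixel space and evaluated against the target classifier; those steps depend on details the abstract does not give.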