Publication | Open Access
Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
Citations: 66
References: 11
Year: 2019
Keywords: Artificial Intelligence, Poor Generalization, Data Augmentation, Instance-based Learning, Engineering, Machine Learning, Data Science, Generative Adversarial Network, Pattern Recognition, Machine Learning Model, Adversarial Machine Learning, AI Safety, Computer Science, Uniform Perturbation Radius, Adversarial Training, Deep Learning, Accuracy Tradeoffs
Adversarial training is by far the most successful strategy for improving the robustness of neural networks to adversarial attacks. Despite its success as a defense mechanism, adversarial training fails to generalize well to the unperturbed test set. We hypothesize that this poor generalization is a consequence of adversarial training with a uniform perturbation radius around every training sample. Samples close to the decision boundary can be morphed into a different class under a small perturbation budget, and enforcing large margins around these samples produces decision boundaries that generalize poorly. Motivated by this hypothesis, we propose instance adaptive adversarial training -- a technique that enforces sample-specific perturbation margins around every training sample. We show that, using our approach, test accuracy on unperturbed samples improves with a marginal drop in robustness. Extensive experiments on the CIFAR-10, CIFAR-100 and ImageNet datasets demonstrate the effectiveness of our proposed approach.
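The idea of sample-specific perturbation radii can be illustrated with a toy sketch. This is not the paper's algorithm, which the abstract does not spell out; it is a minimal illustration under stated assumptions: the classifier is a fixed linear model, so the worst-case L2 perturbation is known in closed form (it reduces a sample's signed distance to the hyperplane by exactly the radius), and the update rule, the function names `signed_margin`, `update_radius`, `adapt_radii`, and the parameters `step` and `eps_max` are all hypothetical. With a neural network, the closed-form check would presumably be replaced by an attack run inside the training loop.

```python
import math


def signed_margin(w, b, x, y):
    """Signed L2 distance from x to the hyperplane w.x + b = 0.

    Positive when the label y (+1 or -1) agrees with the classifier.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def update_radius(eps, margin, step, eps_max):
    """Hypothetical per-sample rule (an assumption, not taken from the paper).

    Grow the radius if the sample would still be classified correctly under
    the worst-case perturbation of size eps + step; shrink it if the
    worst-case perturbation of size eps already flips the label; otherwise
    leave it unchanged.
    """
    if margin > eps + step:
        return min(eps + step, eps_max)
    if margin <= eps:
        return max(eps - step, 0.0)
    return eps


def adapt_radii(w, b, xs, ys, eps_init, step, eps_max, rounds):
    """Repeatedly apply the update rule, starting from a uniform radius."""
    radii = [eps_init] * len(xs)
    for _ in range(rounds):
        radii = [
            update_radius(eps, signed_margin(w, b, x, y), step, eps_max)
            for eps, x, y in zip(radii, xs, ys)
        ]
    return radii


if __name__ == "__main__":
    w, b = (1.0, 0.0), 0.0
    xs = [(2.0, 0.0), (0.3, 1.0), (-3.0, 0.5)]  # second sample sits near the boundary
    ys = [1, 1, -1]
    print(adapt_radii(w, b, xs, ys, eps_init=0.5, step=0.25, eps_max=1.0, rounds=10))
```

In this toy setting, the two samples far from the boundary end up with the maximum radius, while the sample near the boundary keeps a smaller one, so no margin larger than its actual distance to the boundary is enforced around it.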