Abstract
Adversarial training is a frequently used empirical defense against adversarial examples. While many regularization techniques are effective when combined with adversarial training, these methods typically work in the time domain. However, because adversarial vulnerability can be considered a high-frequency phenomenon, it is crucial to regularize adversarially trained neural network models in the frequency domain so that they capture both low-frequency and high-frequency features. Neural networks must also fully utilize the detailed local features extracted by their receptive fields. To address these challenges, we conduct a theoretical analysis of the regularization properties of wavelets, which can enhance adversarial training. We propose a wavelet regularization method based on the Haar wavelet decomposition, named Wavelet Average Pooling. This wavelet regularization module is integrated into a wide residual neural network to form a new model called WideWaveletResNet. On the CIFAR-10 and CIFAR-100 datasets, our proposed Adversarial Wavelet Training method demonstrates considerable robustness against different types of attacks. This confirms our assumption that wavelet regularization can enhance adversarial robustness, particularly in deep and wide neural networks. We present a detailed comparison of different wavelet basis functions and conduct visualization experiments on the Frequency Principle (F-Principle) and on interpretability to demonstrate the effectiveness of our method. The code is available at https://github.com/momo1986/AdversarialWaveletTraining.
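
The link between Haar wavelet decomposition and average pooling mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' Wavelet Average Pooling module: the function names `haar_dwt2` and `avg_pool2x2` are hypothetical, the orthonormal scaling is an assumption, and sub-band naming conventions vary between libraries. It only shows that the low-frequency sub-band of a single-level 2D Haar transform is a scaled 2x2 average pool, while the other three sub-bands hold the high-frequency detail.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform (orthonormal scaling assumed).

    x must be a 2D array with even height and width.
    Returns the low-frequency band and three detail bands.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right pixel
    c = x[1::2, 0::2]  # bottom-left pixel
    d = x[1::2, 1::2]  # bottom-right pixel
    low = (a + b + c + d) / 2.0         # low-pass in both directions
    detail_rows = (a + b - c - d) / 2.0  # difference between the two rows
    detail_cols = (a - b + c - d) / 2.0  # difference between the two columns
    detail_diag = (a - b - c + d) / 2.0  # diagonal difference
    return low, detail_rows, detail_cols, detail_diag


def avg_pool2x2(x):
    """Plain 2x2 average pooling with stride 2, for comparison."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```

Under this scaling the low-frequency band equals twice the 2x2 average pool, and the four sub-bands together preserve the energy of the input, so keeping only the low-frequency band acts as a smoothing, down-sampling step.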
Original language | English |
---|---|
Article number | 119650 |
Journal | Information Sciences |
Volume | 649 |
DOIs | |
Publication status | Published - Nov 2023 |