Understanding and Improving Ensemble Adversarial Defense

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

The strategy of ensemble has become popular in adversarial defense, which trains
multiple base classifiers to defend against adversarial attacks in a cooperative manner. Despite the empirical success, theoretical explanations on why an ensemble of adversarially trained classifiers is more robust than single ones remain unclear. To fill in this gap, we develop a new error theory dedicated to understanding ensemble adversarial defense, demonstrating a provable 0-1 loss reduction on challenging sample sets in adversarial defense scenarios. Guided by this theory, we propose an effective approach to improve ensemble adversarial defense, named intertive global adversarial training (iGAT). The proposal includes (1) a probabilistic distributing rule that selectively allocates to different base classifiers adversarial examples that are globally challenging to the ensemble, and (2) a regularization term to rescue the severest weaknesses of the base classifiers.
Being tested over various existing ensemble adversarial defense techniques, iGAT is capable of boosting their performance by up to 17% evaluated using CIFAR10 and CIFAR100 datasets under both white-box and black-box attacks.
Original languageEnglish
Title of host publication37th Conference on Neural Information Processing Systems (NeurIPS)
Publication statusAccepted/In press - 21 Sept 2023

Fingerprint

Dive into the research topics of 'Understanding and Improving Ensemble Adversarial Defense'. Together they form a unique fingerprint.

Cite this