TY - JOUR
T1 - Towards a Critical Evaluation of Robustness for Deep Learning Backdoor Countermeasures
AU - Qiu, Huming
AU - Ma, Hua
AU - Zhang, Zhi
AU - Abuadbba, Alsharif
AU - Kang, Wei
AU - Fu, Anmin
AU - Gao, Yansong
PY - 2024
Y1 - 2024
N2 - Since Deep Learning (DL) backdoor attacks have been revealed as one of the most insidious adversarial attacks, a number of countermeasures have been developed with certain assumptions defined in their respective threat models. However, their robustness has so far been inadvertently ignored, which can have severe consequences: for example, a countermeasure can be misused and yield a false implication of backdoor detection. For the first time, we critically examine the robustness of existing backdoor countermeasures. As an initial study, we identify five potential failure factors: binary classification, poison rate, model complexity, single-model justification, and hyperparameter sensitivity. As exhaustively examining all defenses is infeasible, we focus on influential backdoor detection-based countermeasures, namely the model-inspection defenses Neural Cleanse (S&P’19), ABS (CCS’19), and MNTD (S&P’21) and the data-inspection defense SCAn (USENIX Security’21), and examine their failure cases under one or more of these factors. Although these investigated countermeasures claim to work well under their respective threat models, they have inherent, unexplored non-robust cases that do not even require delicate adaptive attacks. We demonstrate how to trivially bypass them, while staying within their respective threat models, simply by varying the aforementioned factors. In particular, for each defense, formal proofs or empirical studies reveal non-robust cases in which it is not as robust as claimed or expected. This work highlights the necessity of thoroughly evaluating the robustness of backdoor countermeasures to avoid misleading security implications in unknown non-robust cases.
KW - Adaptation models
KW - Backdoor Countermeasure
KW - Complexity theory
KW - Deep Learning
KW - Failure Factor
KW - Robustness
KW - Security
KW - Sensitivity
KW - Threat modeling
KW - Toxicology
UR - http://www.scopus.com/inward/record.url?scp=85174839038&partnerID=8YFLogxK
U2 - 10.1109/TIFS.2023.3324318
DO - 10.1109/TIFS.2023.3324318
M3 - Article
AN - SCOPUS:85174839038
SN - 1556-6013
VL - 19
SP - 455
EP - 468
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
ER -