Subnet Replacement: Deployment-stage backdoor attack against deep neural networks in gray-box setting
Xiangyu Qi, Jifeng Zhu, Chulin Xie, Yong Yang
arXiv:2107.07240v1. 6 pages, 3 figures, ICLR 2021 Workshop on Security and Safety in Machine Learning System

We study the realistic potential of conducting backdoor attacks against deep
neural networks (DNNs) during the deployment stage. Specifically, our goal is to
design a deployment-stage backdoor attack algorithm that is both threatening
and realistically implementable. To this end, we propose Subnet Replacement
Attack (SRA), which is capable of embedding a backdoor into DNNs by directly
modifying a limited number of model parameters. For the sake of realistic
practicability, we abandon the strong white-box assumption widely adopted in
existing studies. Instead, our algorithm works in a gray-box setting, where
the architecture of the victim model is known but the adversaries
have no knowledge of its parameter values. The key philosophy underlying
our approach is this: given any neural network instance of a certain
architecture (regardless of its specific parameter values), we can always embed a
backdoor into that model instance by replacing a very narrow subnet of the
benign (backdoor-free) model with a malicious backdoor subnet, which is
designed to be sensitive to (that is, to fire a large activation value on) a
particular backdoor trigger pattern.
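
The subnet-replacement idea can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the victim is a toy three-layer MLP with random weights, the backdoor subnet is one neuron per hidden layer, and the trigger detector is hand-crafted rather than trained. The helper names (`random_mlp`, `replace_subnet`, `stamp`) and the amplification factor `gamma` are hypothetical. Note that `replace_subnet` only overwrites parameters and reads nothing but layer shapes, which mirrors the gray-box setting.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(layers, x):
    # ReLU after every layer except the last (logits).
    for i, (W, b) in enumerate(layers):
        x = linear(W, b, x)
        if i < len(layers) - 1:
            x = relu(x)
    return x

def random_mlp(sizes, rng):
    # Toy stand-in for a victim model whose weights the attacker never sees.
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        b = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((W, b))
    return layers

def stamp(x, trigger_idx):
    # Assumed trigger: the chosen input positions are set to 1.0.
    return [1.0 if j in trigger_idx else v for j, v in enumerate(x)]

def replace_subnet(layers, trigger_idx, target, gamma=1e4):
    """Overwrite neuron 0 of each hidden layer with a backdoor chain.

    Assumes exactly three layers and inputs in [0, 1]. Only shapes are read,
    never the victim's parameter values.
    """
    (W1, b1), (W2, b2), (W3, b3) = layers
    # Layer 1: neuron 0 becomes a hand-crafted detector that is positive
    # only when (almost) all trigger positions are close to 1.
    W1[0] = [1.0 if j in trigger_idx else 0.0 for j in range(len(W1[0]))]
    b1[0] = -(len(trigger_idx) - 0.5)
    # Layer 2: cut every connection between the subnet and the rest of the
    # network, then let subnet neuron 0 copy the detector's activation.
    for row in W2:
        row[0] = 0.0
    W2[0] = [1.0] + [0.0] * (len(W2[0]) - 1)
    b2[0] = 0.0
    # Output layer: route the amplified subnet activation to the target class.
    for c, row in enumerate(W3):
        row[0] = gamma if c == target else 0.0

if __name__ == "__main__":
    trigger = [0, 1, 2, 3]
    for seed in range(3):
        rng = random.Random(seed)
        net = random_mlp([16, 8, 8, 4], rng)
        replace_subnet(net, trigger, target=2)
        x = stamp([rng.random() for _ in range(16)], trigger)
        logits = forward(net, x)
        print(seed, logits.index(max(logits)))
```

With the sizes used here and inputs in [0, 1], the benign part of the network cannot produce logits large enough to outweigh `gamma` times the detector activation of 0.5, so a stamped input is steered to the target class whatever the victim's random weights are. A realistic attack would instead train the narrow subnet on triggered and clean data, and would have to keep the accuracy loss from the overwritten neurons small.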