基于通用逆扰动的对抗攻击防御方法

陈晋音 吴长安 郑海斌 王巍 温浩

引用本文: 陈晋音, 吴长安, 郑海斌, 王巍, 温浩. 基于通用逆扰动的对抗攻击防御方法. 自动化学报, 2023, 49(10): 2172−2187 doi: 10.16383/j.aas.c201077
Citation: Chen Jin-Yin, Wu Chang-An, Zheng Hai-Bin, Wang Wei, Wen Hao. Universal inverse perturbation defense against adversarial attacks. Acta Automatica Sinica, 2023, 49(10): 2172−2187 doi: 10.16383/j.aas.c201077


doi: 10.16383/j.aas.c201077
基金项目: 国家自然科学基金(62072406), 浙江省自然科学基金(LY19F020025), 教育部产学合作协同育人项目资助
详细信息
    作者简介:

    陈晋音:浙江工业大学网络空间安全研究院和信息工程学院教授. 2009年获得浙江工业大学博士学位. 主要研究方向为人工智能安全, 图数据挖掘和进化计算. 本文通信作者. E-mail: chenjinyin@zjut.edu.cn

    吴长安:浙江工业大学硕士研究生. 主要研究方向为深度学习, 计算机视觉, 对抗攻击和防御. E-mail: wuchangan@zjut.edu.cn

    郑海斌:浙江工业大学信息工程学院博士研究生. 主要研究方向为深度学习, 人工智能安全, 对抗攻击和防御, 图像识别. E-mail: haibinzheng320@gmail.com

    王巍:中国电子科技集团公司第三十六研究所研究员. 主要研究方向为无线通信分析, 网络安全. E-mail: wwzwh@163.com

    温浩:重庆中科云从科技有限公司高级工程师. 主要研究方向为量子通信, 计算机通信网络与大规模人工智能计算. E-mail: wenhao@cloudwalk.com

Universal Inverse Perturbation Defense Against Adversarial Attacks

Funds: Supported by National Natural Science Foundation of China (62072406), Natural Science Foundation of Zhejiang Province (LY19F020025), and Ministry of Education Industry-University Cooperation Collaborative Education Project
More Information
    Author Bio:

    CHEN Jin-Yin Professor at the Institute of Cyberspace Security and the College of Information Engineering, Zhejiang University of Technology. She received her Ph.D. degree from Zhejiang University of Technology in 2009. Her research interest covers artificial intelligence security, graph data mining, and evolutionary computing. Corresponding author of this paper

    WU Chang-An Master student at the College of Information Engineering, Zhejiang University of Technology. His research interest covers deep learning, computer vision, and adversarial attack and defense

    ZHENG Hai-Bin Ph.D. candidate at the College of Information Engineering, Zhejiang University of Technology. His research interest covers deep learning, artificial intelligence security, adversarial attack and defense, and image recognition

    WANG Wei Researcher at the 36th Research Institute of China Electronics Technology Group Corporation. His research interest covers wireless communication analysis and network security

    WEN Hao Senior engineer at Chongqing Zhongke Yuncong Technology Co., Ltd. His research interest covers quantum communication, computer communication networks, and large-scale artificial intelligence computing

  • Abstract: Existing research shows that deep learning models are vulnerable to carefully crafted adversarial examples, which cause the models to produce incorrect inference results and pose potential security threats. Many effective defense methods have been proposed, and most of them defend well against specific attack methods; however, since the attack strategy an adversary may adopt cannot be known in advance in real applications, designing a general defense that does not depend on the attack method remains a challenge. To this end, a defense method based on a universal inverse perturbation (UIP) is proposed: by learning the class-relevant principal features of the original dataset, a universal inverse perturbation is generated. The UIP is universal with respect to both data samples and attack methods, i.e., a single UIP can defend against all adversarial examples produced by different attack methods over the entire dataset. Moreover, the UIP leaves the accuracy on benign samples unaffected by reinforcing their class-relevant important features, and generating the UIP requires no prior knowledge of adversarial examples. Extensive experiments show that UIP provides significant defense against various attack methods across different datasets and models, and also improves the models' classification performance on normal samples.
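To make the defense procedure concrete, the following is a minimal sketch of how a precomputed UIP could be applied at inference time. It is an illustration under stated assumptions, not the authors' implementation: `model` is a trained classifier, `uip` is a perturbation tensor broadcastable to the input shape, and pixel values are assumed to lie in [0, 1].

```python
import torch

def defend_with_uip(model, x, uip, clip_min=0.0, clip_max=1.0):
    """Add one fixed universal inverse perturbation (UIP) to a batch of inputs
    before classification; the same tensor is reused for every sample,
    regardless of which attack (if any) produced it."""
    x_def = torch.clamp(x + uip, clip_min, clip_max)  # keep a valid pixel range
    with torch.no_grad():
        logits = model(x_def)
    return logits.argmax(dim=1)

# Illustrative usage (all names are placeholders):
# preds = defend_with_uip(model, images, uip)
```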
  • 图  1  通用逆扰动防御方法框图

    Fig.  1  The framework of UIPD method

    图  2  基于特征分布和决策边界的UIPD分析示意图

    Fig.  2  The UIPD analysis based on feature distribution and decision boundary

    图  3  基于鲁棒安全边界的UIPD分析示意图

    Fig.  3  The UIPD analysis based on robust security boundaries

    图  4  MNIST数据集中不同模型的 UIP 可视化图

    Fig.  4  The UIP visualization of MNIST dataset in different models

    图  5  参数敏感性实验结果图

    Fig.  5  The results of the parameter sensitivity experiment

    图  6  不同防御方法实施1000次防御的时间消耗

    Fig.  6  The time cost in 1000 defenses of different defense methods

    图  7  UIPD对AP攻击的防御实验结果

    Fig.  7  The results of UIPD against AP attacks

    图  A1  不同数据集和模型的UIP可视化图

    Fig.  A1  The UIP visualization of different datasets and models

    表  1  自行搭建的网络模型结构

    Table  1  The structure of the self-built network model

    Layer | M_CNN / F_CNN
    Conv + ReLU | 5 × 5 × 5
    Max pooling | 2 × 2
    Conv + ReLU | 5 × 5 × 64
    Max pooling | 2 × 2
    Dense (fully connected) | 1024
    Dropout | 0.5
    Dense (fully connected) | 10
    Softmax | 10
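For reference, the layer list in Table 1 can be written out as the following PyTorch sketch. This is a reconstruction under assumptions rather than the authors' code: the filter counts (5 and 64) are read literally from the table, `in_channels = 1` fits MNIST/FMNIST, and padding is chosen so the 2 × 2 poolings divide the feature maps cleanly. In practice the final Softmax would usually be dropped in favor of training on logits with a cross-entropy loss; it is kept here only to mirror the table.

```python
import torch.nn as nn

class SmallCNN(nn.Module):
    """Sketch of the M_CNN/F_CNN layout described in Table 1."""

    def __init__(self, in_channels=1, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 5, kernel_size=5, padding=2), nn.ReLU(),  # Conv + ReLU
            nn.MaxPool2d(2),                                                 # Max pooling 2 x 2
            nn.Conv2d(5, 64, kernel_size=5, padding=2), nn.ReLU(),           # Conv + ReLU, 64 filters
            nn.MaxPool2d(2),                                                 # Max pooling 2 x 2
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(1024), nn.ReLU(),   # Dense (fully connected), 1024 units
            nn.Dropout(0.5),                  # Dropout 0.5
            nn.Linear(1024, num_classes),     # Dense (fully connected), 10 units
            nn.Softmax(dim=1),                # Softmax over the 10 classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```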

    表  2  UIPD针对不同攻击方法的防御成功率(%)

    Table  2  The defense success rate of UIPD against different attack methods (%)

    DSR (%) | MNIST (AlexNet) | MNIST (LeNet) | MNIST (M_CNN) | FMNIST (AlexNet) | FMNIST (F_CNN) | CIFAR-10 (VGG19) | ImageNet (VGG19)
    Accuracy on benign examples | 92.34 | 95.71 | 90.45 | 89.01 | 87.42 | 79.55 | 89.00
    FGSM[8] | 73.31 | 85.21 | 77.35 | 79.15 | 80.05 | 78.13 | 43.61
    BIM[18] | 99.30 | 93.73 | 99.11 | 95.28 | 97.61 | 85.32 | 72.90
    MI-FGSM[9] | 69.65 | 90.32 | 98.99 | 88.35 | 85.75 | 56.93 | 44.76
    PGD[17] | 99.31 | 95.93 | 99.19 | 97.80 | 95.83 | 81.05 | 73.13
    C&W[19] | 99.34 | 96.04 | 92.10 | 96.44 | 94.44 | 80.67 | 46.67
    L-BFGS[6] | 98.58 | 70.12 | 67.79 | 66.35 | 71.75 | 68.69 | 31.36
    JSMA[10] | 64.33 | 55.59 | 76.61 | 72.31 | 69.51 | 60.04 | 37.54
    DeepFool[20] | 98.98 | 97.98 | 94.52 | 93.54 | 91.63 | 83.13 | 62.54
    UAP[15] | 97.46 | 97.09 | 99.39 | 97.85 | 96.55 | 83.07 | 72.66
    Boundary[12] | 93.63 | 94.38 | 95.72 | 92.67 | 91.88 | 76.21 | 68.45
    ZOO[11] | 77.38 | 75.43 | 76.39 | 68.36 | 65.42 | 61.58 | 54.18
    AGNA[21] | 75.69 | 76.40 | 81.60 | 64.80 | 72.14 | 62.10 | 55.70
    AUNA[21] | 74.20 | 73.65 | 78.53 | 65.75 | 62.20 | 62.70 | 52.40
    SPNA[21] | 92.10 | 88.35 | 89.17 | 77.58 | 74.26 | 72.90 | 60.30
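The defense success rates (DSR) in Table 2 can be computed along the lines of the sketch below. The paper's exact formula is not reproduced in this excerpt, so the sketch assumes the common reading of DSR: the fraction of adversarial examples that the model classifies correctly once the UIP has been added.

```python
import torch

def defense_success_rate(model, x_adv, y_true, uip):
    """Assumed DSR: share of adversarial inputs that are classified
    correctly after the universal inverse perturbation is added."""
    with torch.no_grad():
        preds = model(torch.clamp(x_adv + uip, 0.0, 1.0)).argmax(dim=1)
    return (preds == y_true).float().mean().item()
```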

    表  3  UIPD针对不同数据样本的通用性(MNIST, M_CNN)

    Table  3  The universality of UIPD for different examples (MNIST, M_CNN)

    Group 1 | Group 2 | Group 3 | Group 4
    Benign example | Confidence | Benign example + UIP | Confidence | Adversarial example | Confidence | Adversarial example + UIP | Confidence
    0 | 1.000 | 0 | 1.000 | 5 | 0.5390 | 0 | 0.9804
    1 | 1.000 | 1 | 1.000 | 8 | 0.4906 | 1 | 0.9848
    2 | 1.000 | 2 | 1.000 | 1 | 0.5015 | 2 | 0.9841
    3 | 1.000 | 3 | 1.000 | 7 | 0.5029 | 3 | 0.9549
    4 | 1.000 | 4 | 1.000 | 9 | 0.5146 | 4 | 0.9761
    5 | 1.000 | 5 | 1.000 | 3 | 0.5020 | 5 | 0.9442
    6 | 1.000 | 6 | 1.000 | 4 | 0.5212 | 6 | 0.9760
    7 | 1.000 | 7 | 1.000 | 3 | 0.5225 | 7 | 0.8960
    8 | 1.000 | 8 | 1.000 | 6 | 0.5228 | 8 | 0.9420
    9 | 1.000 | 9 | 1.000 | 7 | 0.5076 | 9 | 0.9796
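The class-label confidences reported in Table 3 (and in Tables A1 to A3) can be read as the softmax probability of the predicted class, measured with and without the UIP. The sketch below illustrates that reading; it assumes `model` returns raw logits and is not taken from the paper.

```python
import torch
import torch.nn.functional as F

def label_and_confidence(model, x, uip=None):
    """Predicted class and its softmax confidence, optionally measured
    after adding the UIP (assumed interpretation of the table columns)."""
    if uip is not None:
        x = torch.clamp(x + uip, 0.0, 1.0)
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
    conf, label = probs.max(dim=1)
    return label, conf
```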

    表  4  不同防御方法针对基于梯度的攻击的防御效果比较

    Table  4  The performance comparison of different defense methods against gradient-based attacks

    Method | MNIST (AlexNet) | MNIST (LeNet) | MNIST (M_CNN) | FMNIST (AlexNet) | FMNIST (F_CNN) | CIFAR-10 (VGG19) | ImageNet (VGG19)
    Average ASR (%) | 95.46 | 99.69 | 97.88 | 98.77 | 97.59 | 87.63 | 81.79
    DSR (%):
    resize1 | 78.24 | 74.32 | 81.82 | 79.84 | 77.24 | 69.38 | 47.83
    resize2 | 78.54 | 64.94 | 78.64 | 79.34 | 69.65 | 64.26 | 43.26
    rotate | 76.66 | 80.54 | 84.74 | 77.63 | 61.46 | 72.49 | 42.49
    Distil-D | 83.51 | 82.08 | 80.49 | 85.24 | 82.55 | 75.17 | 57.13
    Ens-D | 87.19 | 88.03 | 85.24 | 87.71 | 83.21 | 77.46 | 58.34
    D-GAN | 72.40 | 68.26 | 70.31 | 79.54 | 75.04 | 73.05 | 51.04
    GN | 22.60 | 30.26 | 27.56 | 27.96 | 22.60 | 23.35 | 13.85
    DAE | 84.54 | 85.25 | 85.68 | 86.94 | 80.21 | 75.85 | 59.31
    APE-GAN | 83.40 | 80.71 | 82.36 | 84.10 | 79.45 | 72.15 | 57.88
    UIPD | 88.92 | 86.89 | 87.45 | 87.77 | 83.91 | 78.23 | 59.91
    Rconf:
    resize1 | 0.9231 | 0.9631 | 0.9424 | 0.8933 | 0.9384 | 0.6742 | 0.4442
    resize2 | 0.8931 | 0.9184 | 0.9642 | 0.9731 | 0.9473 | 0.7371 | 0.4341
    rotate | 0.9042 | 0.8914 | 0.9274 | 0.9535 | 0.8144 | 0.6814 | 0.4152
    Distil-D | 0.9221 | 0.9053 | 0.9162 | 0.9340 | 0.9278 | 0.6741 | 0.4528
    Ens-D | 0.9623 | 0.9173 | 0.9686 | 0.9210 | 0.9331 | 0.7994 | 0.5029
    D-GAN | 0.8739 | 0.8419 | 0.8829 | 0.9012 | 0.8981 | 0.7839 | 0.4290
    GN | 0.1445 | 0.1742 | 0.2452 | 0.1631 | 0.1835 | 0.1255 | 0.0759
    DAE | 0.9470 | 0.9346 | 0.9633 | 0.9420 | 0.9324 | 0.7782 | 0.5090
    APE-GAN | 0.8964 | 0.9270 | 0.9425 | 0.8897 | 0.9015 | 0.6301 | 0.4749
    UIPD | 0.9788 | 0.9463 | 0.9842 | 0.9642 | 0.9531 | 0.8141 | 0.5141

    表  5  不同防御方法针对基于优化的攻击的防御效果比较

    Table  5  The performance comparison of different defense methods against optimization-based attacks

    Method | MNIST (AlexNet) | MNIST (LeNet) | MNIST (M_CNN) | FMNIST (AlexNet) | FMNIST (F_CNN) | CIFAR-10 (VGG19) | ImageNet (VGG19)
    Average ASR (%) | 93.28 | 96.32 | 94.65 | 95.20 | 93.58 | 88.10 | 83.39
    DSR (%):
    resize1 | 78.65 | 70.62 | 79.09 | 74.37 | 66.54 | 65.31 | 38.28
    resize2 | 63.14 | 67.94 | 77.14 | 66.98 | 63.09 | 62.63 | 41.60
    rotate | 76.62 | 72.19 | 71.84 | 66.75 | 64.42 | 65.60 | 42.67
    Distil-D | 82.37 | 82.22 | 80.49 | 82.47 | 83.28 | 71.14 | 45.39
    Ens-D | 86.97 | 83.03 | 85.24 | 83.41 | 82.50 | 74.29 | 47.85
    D-GAN | 82.43 | 80.34 | 86.13 | 79.35 | 80.47 | 70.08 | 43.10
    GN | 20.16 | 21.80 | 25.30 | 19.67 | 18.63 | 21.40 | 13.56
    DAE | 83.66 | 84.17 | 86.88 | 82.40 | 83.66 | 74.30 | 51.61
    APE-GAN | 82.46 | 85.01 | 85.14 | 81.80 | 82.50 | 73.80 | 49.28
    UIPD | 87.92 | 85.22 | 87.54 | 83.70 | 83.91 | 75.38 | 52.91
    Rconf:
    resize1 | 0.8513 | 0.8614 | 0.8460 | 0.7963 | 0.8324 | 0.6010 | 0.3742
    resize2 | 0.7814 | 0.8810 | 0.8655 | 0.8290 | 0.8475 | 0.6320 | 0.3800
    rotate | 0.8519 | 0.8374 | 0.8319 | 0.8100 | 0.8040 | 0.6462 | 0.4058
    Distil-D | 0.9141 | 0.8913 | 0.9033 | 0.9135 | 0.9200 | 0.7821 | 0.4528
    Ens-D | 0.9515 | 0.9280 | 0.8720 | 0.8940 | 0.9011 | 0.8155 | 0.4788
    D-GAN | 0.8539 | 0.8789 | 0.8829 | 0.8733 | 0.8820 | 0.7450 | 0.4390
    GN | 0.1630 | 0.1920 | 0.2152 | 0.1761 | 0.1971 | 0.1450 | 0.0619
    DAE | 0.9120 | 0.9290 | 0.9510 | 0.9420 | 0.9324 | 0.7782 | 0.5090
    APE-GAN | 0.8964 | 0.9270 | 0.9425 | 0.8897 | 0.9015 | 0.6301 | 0.4749
    UIPD | 0.9210 | 0.9340 | 0.9520 | 0.9512 | 0.9781 | 0.8051 | 0.5290

    表  6  不同防御方法处理后良性样本的识别准确率 (%)

    Table  6  The accuracy of benign examples after processing by different defense methods (%)

    Method | MNIST (AlexNet) | MNIST (LeNet) | MNIST (M_CNN) | FMNIST (AlexNet) | FMNIST (F_CNN) | CIFAR-10 (VGG19) | ImageNet (VGG19)
    Benign examples | 92.34 | 95.71 | 90.45 | 89.01 | 87.42 | 79.55 | 89.00
    resize1 | 92.27 (−0.07) | 95.66 (−0.05) | 90.47 (+0.02) | 88.97 (−0.04) | 87.38 (−0.04) | 79.49 (−0.06) | 88.98 (−0.02)
    resize2 | 92.26 (−0.80) | 95.68 (−0.30) | 90.29 (−0.16) | 88.71 (−0.30) | 87.38 (−0.04) | 79.48 (−0.07) | 87.61 (−1.39)
    rotate | 92.31 (−0.03) | 95.68 (−0.03) | 90.39 (−0.06) | 88.95 (−0.06) | 87.40 (−0.02) | 79.53 (−0.02) | 88.82 (−0.18)
    Distil-D | 90.00 (−2.34) | 95.70 (−0.01) | 90.02 (−0.43) | 88.89 (−0.12) | 86.72 (−0.70) | 76.97 (−2.58) | 87.85 (−1.15)
    Ens-D | 94.35 (+2.01) | 96.15 (+0.44) | 92.38 (+1.93) | 89.13 (+0.12) | 87.45 (+0.03) | 80.13 (+0.58) | 89.05 (+0.05)
    D-GAN | 92.08 (−0.26) | 95.18 (−0.53) | 90.04 (−0.41) | 88.60 (−0.41) | 87.13 (−0.29) | 78.80 (−0.75) | 87.83 (−1.17)
    GN | 22.54 (−69.80) | 25.31 (−70.40) | 33.58 (−56.87) | 35.71 (−53.30) | 28.92 (−58.59) | 23.65 (−55.90) | 17.13 (−71.87)
    DAE | 91.57 (−0.77) | 95.28 (−0.43) | 89.91 (−0.54) | 88.13 (−0.88) | 86.80 (−0.62) | 79.46 (−0.09) | 87.10 (−1.90)
    APE-GAN | 92.30 (−0.04) | 95.68 (−0.03) | 90.42 (−0.03) | 89.00 (−0.01) | 87.28 (−0.14) | 79.49 (−0.06) | 88.88 (−0.12)
    UIPD | 92.37 (+0.03) | 95.96 (+0.25) | 90.51 (+0.06) | 89.15 (+0.14) | 87.48 (+0.06) | 79.61 (+0.06) | 89.15 (+0.15)

    表  A1  UIPD针对不同数据样本的通用性(FMNIST, F_CNN)

    Table  A1  The universality of UIPD for different examples (FMNIST, F_CNN)

    Group 1 | Group 2 | Group 3 | Group 4
    Benign example | Confidence | Benign example + UIP | Confidence | Adversarial example | Confidence | Adversarial example + UIP | Confidence
    0 | 1.000 | 0 | 1.000 | 6 | 0.4531 | 0 | 0.9415
    1 | 1.000 | 1 | 1.000 | 3 | 0.4714 | 1 | 0.8945
    2 | 1.000 | 2 | 1.000 | 6 | 0.5641 | 2 | 0.9131
    3 | 1.000 | 3 | 1.000 | 1 | 0.5103 | 3 | 0.9425
    4 | 1.000 | 4 | 1.000 | 2 | 0.4831 | 4 | 0.8773
    5 | 1.000 | 5 | 1.000 | 7 | 0.5422 | 5 | 0.9026
    6 | 1.000 | 6 | 1.000 | 5 | 0.4864 | 6 | 0.8787
    7 | 1.000 | 7 | 1.000 | 5 | 0.5144 | 7 | 0.8309
    8 | 1.000 | 8 | 1.000 | 4 | 0.4781 | 8 | 0.9424
    9 | 1.000 | 9 | 1.000 | 7 | 0.4961 | 9 | 0.8872

    表  A2  UIPD针对不同数据样本的通用性(CIFAR-10, VGG19)

    Table  A2  The universality of UIPD for different examples (CIFAR-10, VGG19)

    Group 1 | Group 2 | Group 3 | Group 4
    Benign example | Confidence | Benign example + UIP | Confidence | Adversarial example | Confidence | Adversarial example + UIP | Confidence
    Airplane | 1.000 | Airplane | 1.000 |  | 0.4914 | Airplane | 0.9331
    Automobile | 1.000 | Automobile | 1.000 | Truck | 0.5212 | Automobile | 0.9131
    Bird | 1.000 | Bird | 1.000 |  | 0.5031 | Bird | 0.8913
    Cat | 1.000 | Cat | 1.000 |  | 0.5041 | Cat | 0.9043
    Deer | 1.000 | Deer | 1.000 |  | 0.5010 | Deer | 0.8831
    Dog | 1.000 | Dog | 1.000 |  | 0.5347 | Dog | 0.9141
    Frog | 1.000 | Frog | 1.000 |  | 0.5314 | Frog | 0.8863
    Horse | 1.000 | Horse | 1.000 |  | 0.4814 | Horse | 0.8947
    Ship | 1.000 | Ship | 1.000 | Airplane | 0.5142 | Ship | 0.9251
    Truck | 1.000 | Truck | 1.000 | Airplane | 0.4761 | Truck | 0.9529

    表  A3  UIPD针对不同数据样本的通用性(ImageNet, VGG19)

    Table  A3  The universality of UIPD for different examples (ImageNet, VGG19)

    Group 1 | Group 2 | Group 3 | Group 4
    Benign example | Confidence | Benign example + UIP | Confidence | Adversarial example | Confidence | Adversarial example + UIP | Confidence
    Missile | 0.9425 | Missile | 0.9445 | Military uniform | 0.5134 | Missile | 0.8942
    Rifle | 0.9475 | Rifle | 0.9525 | Aircraft carrier | 0.4981 | Rifle | 0.7342
    Military uniform | 0.9825 | Military uniform | 0.9925 | Bulletproof vest | 0.5014 | Military uniform | 0.8245
    Holster | 0.9652 | Holster | 0.9692 | Military uniform | 0.4831 | Holster | 0.8074
    Aircraft carrier | 0.9926 | Aircraft carrier | 0.9926 | Lighthouse | 0.4788 | Aircraft carrier | 0.8142
    Space shuttle | 0.9652 | Space shuttle | 0.9652 | Missile | 0.5101 | Space shuttle | 0.7912
    Bulletproof vest | 0.9256 | Bulletproof vest | 0.9159 | Rifle | 0.4698 | Bulletproof vest | 0.8141
    Lighthouse | 0.9413 | Lighthouse | 0.9782 | Airliner | 0.5194 | Lighthouse | 0.7861
    Airliner | 0.9515 | Airliner | 0.9634 | Tank | 0.4983 | Airliner | 0.7134
    Tank | 0.9823 | Tank | 0.9782 | Lighthouse | 0.5310 | Tank | 0.7613
  • [1] Goodfellow I, Bengio Y, Courville A. Deep Learning. Cambridge: The MIT Press, 2016. 24−45
    [2] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, USA: ACM, 2012. 1097−1105
    [3] Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal, Canada: ACM, 2014. 3104−3112
    [4] 袁文浩, 孙文珠, 夏斌, 欧世峰. 利用深度卷积神经网络提高未知噪声下的语音增强性能. 自动化学报, 2018, 44(4): 751-759 doi: 10.16383/j.aas.2018.c170001

    Yuan Wen-Hao, Sun Wen-Zhu, Xia Bin, Ou Shi-Feng. Improving speech enhancement in unseen noise using deep convolutional neural network. Acta Automatica Sinica, 2018, 44(4): 751-759 doi: 10.16383/j.aas.2018.c170001
    [5] 代伟, 柴天佑. 数据驱动的复杂磨矿过程运行优化控制方法. 自动化学报, 2014, 40(9): 2005-2014

    Dai Wei, Chai Tian-You. Data-driven optimal operational control of complex grinding processes. Acta Automatica Sinica, 2014, 40(9): 2005-2014
    [6] Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I J, et al. Intriguing properties of neural networks. In: Proceedings of the 2nd International Conference on Learning Representations. Banff, Canada: ICLR, 2014.
    [7] Akhtar N, Mian A. Threat of adversarial attacks on deep learning in computer vision: A survey. IEEE Access, 2018, 6: 14410-14430 doi: 10.1109/ACCESS.2018.2807385
    [8] Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. In: Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR, 2015.
    [9] Dong Y P, Liao F Z, Pang T Y, Su H, Zhu J, Hu X L, et al. Boosting adversarial attacks with momentum. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 9185−9193
    [10] Papernot N, McDaniel P, Jha S, Fredrikson M, Celik Z B, Swami A. The limitations of deep learning in adversarial settings. In: Proceedings of the IEEE European Symposium on Security and Privacy (EuroS&P). Saarbruecken, Germany: IEEE, 2016. 372−387
    [11] Chen P Y, Zhang H, Sharma Y, Yi J F, Hsieh C J. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. Dallas, USA: ACM, 2017. 15−26
    [12] Brendel W, Rauber J, Bethge M. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. In: Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada: ICLR, 2018.
    [13] Xiao C W, Li B, Zhu J Y, He W, Liu M Y, Song D. Generating adversarial examples with adversarial networks. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, Sweden: IJCAI, 2018. 3905−3911
    [14] Papernot N, McDaniel P, Goodfellow I. Transferability in machine learning: From phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv: 1605.07277, 2016.
    [15] Moosavi-Dezfooli S M, Fawzi A, Fawzi O, Frossard P. Universal adversarial perturbations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 86−94
    [16] Ilyas A, Santurkar S, Tsipras D, Engstrom L, Tran B, Mądry A. Adversarial examples are not bugs, they are features. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: ACM, 2019. Article No. 12
    [17] Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards deep learning models resistant to adversarial attacks. In: Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada: ICLR, 2018.
    [18] Kurakin A, Goodfellow I, Bengio S. Adversarial examples in the physical world. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [19] Carlini N, Wagner D. Towards evaluating the robustness of neural networks. In: Proceedings of the IEEE Symposium on Security and Privacy (SP). San Jose, USA: IEEE, 2017. 39−57
    [20] Moosavi-Dezfooli S M, Fawzi A, Frossard P. DeepFool: A simple and accurate method to fool deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 2574−2582
    [21] Rauber J, Brendel W, Bethge M. Foolbox: A python toolbox to benchmark the robustness of machine learning models. arXiv preprint arXiv: 1707.04131, 2017.
    [22] Su J W, Vargas D V, Sakurai K. One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation, 2019, 23(5): 828-841 doi: 10.1109/TEVC.2019.2890858
    [23] Baluja S, Fischer I. Adversarial transformation networks: Learning to generate adversarial examples. arXiv preprint arXiv: 1703.09387, 2017.
    [24] Cisse M, Adi Y, Neverova N, Keshet J. Houdini: Fooling deep structured prediction models. arXiv preprint arXiv: 1707.05373, 2017.
    [25] Sarkar S, Bansal A, Mahbub U, Chellappa R. UPSET and ANGRI: Breaking high performance image classifiers. arXiv preprint arXiv: 1707.01159, 2017.
    [26] Brown T B, Mané D, Roy A, Abadi M, Gilmer J. Adversarial patch. arXiv preprint arXiv: 1712.09665, 2017.
    [27] Karmon D, Zoran D, Goldberg Y. LaVAN: Localized and visible adversarial noise. In: Proceedings of the 35th International Conference on Machine Learning. Stockholm, Sweden: ICML, 2018. 2512−2520
    [28] Liu A S, Liu X L, Fan J X, Ma Y Q, Zhang A L, Xie H Y, et al. Perceptual-sensitive GAN for generating adversarial patches. In: Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu, USA: AAAI, 2019. 1028−1035
    [29] Liu A S, Wang J K, Liu X L, Cao B W, Zhang C Z, Yu H. Bias-based universal adversarial patch attack for automatic check-out. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 395−410
    [30] Xie C H, Wang J Y, Zhang Z S, Zhou Y Y, Xie L X, Yuille A. Adversarial examples for semantic segmentation and object detection. In: Proceedings of the International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 1378−1387
    [31] Song C B, He K, Lin J D, Wang L W, Hopcroft J E. Robust local features for improving the generalization of adversarial training. In: Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia: ICLR, 2020.
    [32] Miyato T, Dai A M, Goodfellow I J. Adversarial training methods for semi-supervised text classification. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [33] Zheng S, Song Y, Leung T, Goodfellow I. Improving the robustness of deep neural networks via stability training. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 4480−4488
    [34] Dziugaite G K, Ghahramani Z, Roy D M. A study of the effect of JPG compression on adversarial images. arXiv preprint arXiv: 1608.00853, 2016.
    [35] Das N, Shanbhogue M, Chen S T, Hohman F, Chen L, Kounavis M E, et al. Keeping the bad guys out: Protecting and vaccinating deep learning with JPEG compression. arXiv preprint arXiv: 1705.02900, 2017.
    [36] Luo Y, Boix X, Roig G, Poggio T, Zhao Q. Foveation-based mechanisms alleviate adversarial examples. arXiv preprint arXiv: 1511.06292, 2015.
    [37] Gu S X, Rigazio L. Towards deep neural network architectures robust to adversarial examples. In: Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR, 2015.
    [38] Rifai S, Vincent P, Muller X, Glorot X, Bengio Y. Contractive auto-encoders: Explicit invariance during feature extraction. In: Proceedings of the 28th International Conference on International Conference on Machine Learning. Bellevue, USA: ACM, 2011. 833−840
    [39] Ross A S, Doshi-Velez F. Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Menlo Park, CA, USA: AAAI, 2018. 1660−1669
    [40] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv preprint arXiv: 1503.02531, 2015.
    [41] Papernot N, McDaniel P, Wu X, Jha S, Swami A. Distillation as a defense to adversarial perturbations against deep neural networks. In: Proceedings of the IEEE Symposium on Security and Privacy (SP). San Jose, USA: IEEE, 2016. 582−597
    [42] Nayebi A, Ganguli S. Biologically inspired protection of deep networks from adversarial attacks. arXiv preprint arXiv: 1703.09202, 2017.
    [43] Cisse M, Adi Y, Neverova N, Keshet J. Houdini: Fooling deep structured visual and speech recognition models with adversarial examples. In: Proceedings of Advances in Neural Information Processing Systems. 2017.
    [44] Gao J, Wang B L, Lin Z M, Xu W L, Qi T J. DeepCloak: Masking deep neural network models for robustness against adversarial samples. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [45] Jin J, Dundar A, Culurciello E. Robust convolutional neural networks under adversarial noise. arXiv preprint arXiv: 1511.06306, 2015.
    [46] Sun Z, Ozay M, Okatani T. HyperNetworks with statistical filtering for defending adversarial examples. arXiv preprint arXiv: 1711.01791, 2017.
    [47] Akhtar N, Liu J, Mian A. Defense against universal adversarial perturbations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 3389−3398
    [48] Hlihor P, Volpi R, Malagò L. Evaluating the robustness of defense mechanisms based on autoencoder reconstructions against Carlini-Wagner adversarial attacks. In: Proceedings of the 3rd Northern Lights Deep Learning Workshop. Tromsø, Norway: NLDL, 2020. 1−6
    [49] 孔锐, 蔡佳纯, 黄钢. 基于生成对抗网络的对抗攻击防御模型. 自动化学报, DOI: 10.16383/j.aas.c200033

    Kong Rui, Cai Jia-Chun, Huang Gang. Defense to adversarial attack with generative adversarial network. Acta Automatica Sinica, DOI: 10.16383/j.aas.c200033
    [50] Samangouei P, Kabkab M, Chellappa R. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. In: Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada: ICLR, 2018.
    [51] Lin W A, Balaji Y, Samangouei P, Chellappa R. Invert and defend: Model-based approximate inversion of generative adversarial networks for secure inference. arXiv preprint arXiv: 1911.10291, 2019.
    [52] Jin G Q, Shen S W, Zhang D M, Dai F, Zhang Y D. APE-GAN: Adversarial perturbation elimination with GAN. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, UK: IEEE, 2019. 3842−3846
    [53] Xu W L, Evans D, Qi Y J. Feature squeezing: Detecting adversarial examples in deep neural networks. In: Proceedings of the 25th Annual Network and Distributed System Security Symposium. San Diego, USA: NDSS, 2018.
    [54] Ju C, Bibaut A, Van Der Laan M. The relative performance of ensemble methods with deep convolutional neural networks for image classification. Journal of Applied Statistics, 2018, 45(15): 2800-2818 doi: 10.1080/02664763.2018.1441383
    [55] Kim B, Rudin C, Shah J. Latent Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification, MIT-CSAIL-TR-2014-011, MIT, Cambridge, USA, 2014.
Publication history
  • Received:  2020-12-28
  • Accepted:  2021-04-16
  • Published online:  2021-06-01
  • Issue date:  2023-10-24
