基于通用逆扰动的对抗攻击防御方法

陈晋音 吴长安 郑海斌 王巍 温浩

引用本文: 陈晋音, 吴长安, 郑海斌, 王巍, 温浩. 基于通用逆扰动的对抗攻击防御方法. 自动化学报, 2021, x(x): 1−16 doi: 10.16383/j.aas.c201077
Citation: Chen Jin-Yin, Wu Chang-An, Zheng Hai-Bin, Wang Wei, Wen Hao. Universal inverse perturbation defense against adversarial attacks. Acta Automatica Sinica, 2021, x(x): 1−16 doi: 10.16383/j.aas.c201077

基于通用逆扰动的对抗攻击防御方法

doi: 10.16383/j.aas.c201077
基金项目: 国家自然科学基金(62072406)资助, 浙江省自然科学基金(LY19F020025)资助, 教育部产学合作协同育人项目资助
详细信息
    作者简介:

    陈晋音:浙江工业大学网络空间安全研究院和信息工程学院教授, 2009年获得浙江工业大学博士学位. 主要研究方向为人工智能安全、图数据挖掘和进化计算等. E-mail: chenjinyin@zjut.edu.cn

    吴长安:浙江工业大学硕士研究生, 主要研究方向为深度学习、计算机视觉和图像的对抗攻击和防御等. E-mail: wuchangan@zjut.edu.cn

    郑海斌:浙江工业大学博士研究生, 主要研究方向为深度学习、人工智能安全和图像识别等. E-mail: haibinzheng320@gmail.com

    王巍:中国电科36所研究员, 主要研究方向为无线通信分析等. E-mail: wwzwh@163.com

    温浩:重庆中科云从科技有限公司科研生态负责人, 主要研究方向为计算机通信网络与大规模人工智能计算等. E-mail: wenhao@cloudwalk.com

Universal Inverse Perturbation Defense Against Adversarial Attacks

Funds: Supported by National Natural Science Foundation of China (62072406), Natural Science Foundation of Zhejiang Province (LY19F020025), and the Industry-University Cooperation Collaborative Education Project of the Ministry of Education
More Information
    Author Bio:

    CHEN Jin-Yin Professor at the Institute of Cyberspace Security and the College of Information Engineering, Zhejiang University of Technology. She received her Ph.D. degree from Zhejiang University of Technology in 2009. Her research interests include artificial intelligence security, graph data mining, and evolutionary computation

    WU Chang-An Master student at the College of Information Engineering, Zhejiang University of Technology. His research interests include deep learning, computer vision, and adversarial attack and defense

    ZHENG Hai-Bin Ph.D. candidate at the College of Information Engineering, Zhejiang University of Technology. His research interests include deep learning, AI security, adversarial attack and defense, and image recognition

    WANG Wei Vice Chairman at the 36th Research Institute of China Electronics Technology Group Corporation. His research interests include wireless communication analysis

    WEN Hao Researcher at Chongqing Zhongke Yuncong Technology Co., Ltd. His research interests include computer communication networks and large-scale artificial intelligence computing

  • 摘要: 现有研究表明深度学习模型容易受到精心设计的对抗样本攻击, 从而导致模型给出错误的推理结果, 引发潜在的安全威胁. 已有较多有效的防御方法, 其中大多数针对特定攻击方法具有较好的防御效果, 但由于实际应用中无法预知攻击者可能采用的攻击策略, 因此提出不依赖攻击方法的通用防御方法是一个挑战. 本文提出了一种基于通用逆扰动的对抗样本防御方法, 通过学习原始数据集中的类相关主要特征, 生成通用逆扰动(Universal Inverse Perturbation, UIP), 且UIP对数据样本和攻击方法都具有通用性, 即一个UIP可以对不同攻击方法作用于整个数据集得到的所有对抗样本实现防御. 此外, UIP通过强化良性样本的类相关重要特征, 不会影响良性样本的分类精度, 且生成UIP无需对抗样本的先验知识. 大量实验验证表明, UIP在不同数据集、不同模型中对各类攻击方法都具备显著的防御效果, 并且提升了模型对正常样本的分类性能.
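    摘要中描述的防御流程可概括为: 在推理阶段将同一个已学习好的UIP叠加到任意输入样本上, 再送入原模型进行分类. 下面给出一个示意性代码草图, 其中张量形状、函数名 defend_with_uip 以及像素取值范围 [0, 1] 均为假设, 并非论文的原始实现:

        # 示意代码: 推理阶段叠加通用逆扰动(UIP)进行防御(假设性草图)
        import torch

        def defend_with_uip(model, x, uip, x_min=0.0, x_max=1.0):
            # x: (N, C, H, W) 输入样本; uip: (1, C, H, W) 通用逆扰动, 对所有样本共用
            x_def = torch.clamp(x + uip, x_min, x_max)  # 叠加UIP后裁剪回合法像素范围
            with torch.no_grad():
                logits = model(x_def)
            return logits.argmax(dim=1)                 # 返回防御后的预测类标

    使用时对良性样本与对抗样本统一调用 defend_with_uip(model, x, uip) 即可, 无需事先判断输入是否被攻击.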
  • 图  1  通用逆扰动防御方法框图

    Fig.  1  The framework of UIPD method

    图  2  基于特征分布和决策边界的UIPD分析示意图

    Fig.  2  The UIPD analysis based on feature distribution and decision boundary.

    图  3  基于鲁棒安全边界的UIPD分析示意图

    Fig.  3  The UIPD analysis based on robust security boundaries.

    图  4  参数敏感性实验结果图

    Fig.  4  The results of the parameter sensitivity experiment

    图  5  不同防御方法实施1000次防御的时间消耗

    Fig.  5  The time cost of different defense methods over 1000 defenses

    图  6  UIPD对AP攻击的防御实验结果

    Fig.  6  The results of UIPD against AP attacks

    表  1  搭建的网络模型结构

    Table  1  The network structure built by ourselves

    Layer Type                  M_CNN/F_CNN
    Conv + ReLU                 5×5×5
    Max Pooling                 2×2
    Conv + ReLU                 5×5×64
    Max Pooling                 2×2
    Dense (Fully Connected)     1024
    Dropout                     0.5
    Dense (Fully Connected)     10
    Softmax                     10
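    表1中的M_CNN/F_CNN结构可按如下示意代码搭建(输入为28×28灰度图, 10类输出; 卷积填充方式等表中未注明的细节为假设):

        # 示意代码: 按表1搭建 M_CNN/F_CNN 网络(假设性草图)
        import torch.nn as nn

        class M_CNN(nn.Module):
            def __init__(self, num_classes=10):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 5, kernel_size=5, padding=2), nn.ReLU(),   # Conv + ReLU, 5×5×5
                    nn.MaxPool2d(2),                                        # Max Pooling 2×2
                    nn.Conv2d(5, 64, kernel_size=5, padding=2), nn.ReLU(),  # Conv + ReLU, 5×5×64
                    nn.MaxPool2d(2),                                        # Max Pooling 2×2
                )
                self.classifier = nn.Sequential(
                    nn.Flatten(),
                    nn.Linear(64 * 7 * 7, 1024), nn.ReLU(),  # Dense (Fully Connected) 1024
                    nn.Dropout(0.5),                         # Dropout 0.5
                    nn.Linear(1024, num_classes),            # Dense (Fully Connected) 10
                    nn.Softmax(dim=1),                       # Softmax
                )

            def forward(self, x):
                return self.classifier(self.features(x))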

    表  2  UIPD针对不同攻击方法的防御通用性

    Table  2  The universality of UIPD's defense against different attack methods

    DSR                     MNIST                            FMNIST                  CIFAR-10    ImageNet
                            AlexNet    LeNet      M_CNN      AlexNet    F_CNN       VGG19       VGG19
    良性样本识别准确率       92.34%     95.71%     90.45%     89.01%     87.42%      79.55%      89.00%
    FGSM[8]                 73.31%     85.21%     77.35%     79.15%     80.05%      78.13%      43.61%
    BIM[14]                 99.30%     93.73%     99.11%     95.28%     97.61%      85.32%      72.90%
    MI-FGSM[9]              69.65%     90.32%     98.99%     88.35%     85.75%      56.93%      44.76%
    PGD[16]                 99.31%     95.93%     99.19%     97.80%     95.83%      81.05%      73.13%
    C&W[17]                 99.34%     96.04%     92.10%     96.44%     94.44%      80.67%      46.67%
    L-BFGS[7]               98.58%     70.12%     67.79%     66.35%     71.75%      68.69%      31.36%
    JSMA[10]                64.33%     55.59%     76.61%     72.31%     69.51%      60.04%      37.54%
    DeepFool[18]            98.98%     97.98%     94.52%     93.54%     91.63%      83.13%      62.54%
    UAP[15]                 97.46%     97.09%     99.39%     97.85%     96.55%      83.07%      72.66%
    Boundary[12]            93.63%     94.38%     95.72%     92.67%     91.88%      76.21%      68.45%
    ZOO[11]                 77.38%     75.43%     76.39%     68.36%     65.42%      61.58%      54.18%
    AGNA[19]                75.69%     76.40%     81.60%     64.80%     72.14%      62.10%      55.70%
    AUNA[19]                74.20%     73.65%     78.53%     65.75%     62.20%      62.70%      52.40%
    SPNA[19]                92.10%     88.35%     89.17%     77.58%     74.26%      72.90%      60.30%
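    表2中的防御成功率(DSR)可按如下方式统计: 对某攻击方法生成的对抗样本叠加UIP后, 统计模型重新输出正确类标的比例. 以下为一个假设性示意(函数名与数据组织方式均为假设, DSR的精确定义以论文正文为准):

        # 示意代码: 统计某攻击方法下叠加UIP后的防御成功率(假设性草图)
        import torch

        def defense_success_rate(model, x_adv, y_true, uip):
            # x_adv: (N, C, H, W) 对抗样本; y_true: (N,) 真实类标; uip: (1, C, H, W)
            model.eval()
            with torch.no_grad():
                preds = model(torch.clamp(x_adv + uip, 0.0, 1.0)).argmax(dim=1)
            return (preds == y_true).float().mean().item()  # 防御后恢复正确类标的比例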

    表  3  UIPD针对不同数据样本的通用性(MNIST, M_CNN)

    Table  3  The universality of UIPD for different examples (MNIST, M_CNN)

    良性样本类标            0        1        2        3        4        5        6        7        8        9
    置信度                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
    (良性样本+UIP)类标      0        1        2        3        4        5        6        7        8        9
    置信度                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
    对抗样本类标            5        8        1        7        9        3        4        3        6        7
    置信度                  0.5390   0.4906   0.5015   0.5029   0.5146   0.5020   0.5212   0.5225   0.5228   0.5076
    (对抗样本+UIP)类标      0        1        2        3        4        5        6        7        8        9
    置信度                  0.9804   0.9848   0.9841   0.9549   0.9761   0.9442   0.9760   0.8908   0.9420   0.9796
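    表3中叠加UIP前后的类标与置信度可按如下示意方式统计(假设模型输出为logits, 函数名与参数均为假设):

        # 示意代码: 统计叠加UIP前后模型输出的类标与最大置信度(假设性草图)
        import torch
        import torch.nn.functional as F

        def label_and_confidence(model, x, uip=None):
            # uip 为 None 时统计原始输入; 否则先叠加UIP再统计
            if uip is not None:
                x = torch.clamp(x + uip, 0.0, 1.0)
            with torch.no_grad():
                probs = F.softmax(model(x), dim=1)  # softmax 概率作为置信度
            conf, label = probs.max(dim=1)
            return label, conf  # 预测类标及对应置信度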

    表  4  MNIST数据集中不同模型的UIP可视化图

    Table  4  The UIP visualization of MNIST datasets in different models

    MNIST, AlexNet     均值: 0.0003     方差: 0.0002
    MNIST, LeNet       均值: 0.0005     方差: 0.0009
    MNIST, M_CNN       均值: 0.0034     方差: 0.0005

    表  5  不同防御方法针对基于梯度的攻击的防御效果比较

    Table  5  The performance comparison of different defense methods against gradient-based attacks

                          MNIST                            FMNIST                  CIFAR-10    ImageNet
                          AlexNet    LeNet      M_CNN      AlexNet    F_CNN       VGG19       VGG19
    平均ASR               95.46%     99.69%     97.88%     98.77%     97.59%      87.63%      81.79%
    DSR    resize1        78.24%     74.32%     81.82%     79.84%     77.24%      69.38%      47.83%
           resize2        78.54%     64.94%     78.64%     79.34%     69.65%      64.26%      43.26%
           rotate         76.66%     80.54%     84.74%     77.63%     61.46%      72.49%      42.49%
           Distil-D       83.51%     82.08%     80.49%     85.24%     82.55%      75.17%      57.13%
           Ens-D          87.19%     88.03%     85.24%     87.71%     83.21%      77.46%      58.34%
           D-GAN          72.40%     68.26%     70.31%     79.54%     75.04%      73.05%      51.04%
           GN             22.60%     30.26%     27.56%     27.96%     22.60%      23.35%      13.85%
           DAE            84.54%     85.25%     85.68%     86.94%     80.21%      75.85%      59.31%
           APE-GAN        83.40%     80.71%     82.36%     84.10%     79.45%      72.15%      57.88%
           UIPD           88.92%     86.89%     87.45%     87.77%     83.91%      78.23%      59.91%
    Rconf  resize1        0.9231     0.9631     0.9424     0.8933     0.9384      0.6742      0.4442
           resize2        0.8931     0.9184     0.9642     0.9731     0.9473      0.7371      0.4341
           rotate         0.9042     0.8914     0.9274     0.9535     0.8144      0.6814      0.4152
           Distil-D       0.9221     0.9053     0.9162     0.9340     0.9278      0.6741      0.4528
           Ens-D          0.9623     0.9173     0.9686     0.9210     0.9331      0.7994      0.5029
           D-GAN          0.8739     0.8419     0.8829     0.9012     0.8981      0.7839      0.4290
           GN             0.1445     0.1742     0.2452     0.1631     0.1835      0.1255      0.0759
           DAE            0.9470     0.9346     0.9633     0.9420     0.9324      0.7782      0.5090
           APE-GAN        0.8964     0.9270     0.9425     0.8897     0.9015      0.6301      0.4749
           UIPD           0.9788     0.9463     0.9842     0.9642     0.9531      0.8141      0.5141

    表  6  不同防御方法针对基于优化的攻击的防御效果比较

    Table  6  The performance comparison of different defense methods against optimization-based attacks

                          MNIST                            FMNIST                  CIFAR-10    ImageNet
                          AlexNet    LeNet      M_CNN      AlexNet    F_CNN       VGG19       VGG19
    平均ASR               93.28%     96.32%     94.65%     95.20%     93.58%      88.10%      83.39%
    DSR    resize1        78.65%     70.62%     79.09%     74.37%     66.54%      65.31%      38.28%
           resize2        63.14%     67.94%     77.14%     66.98%     63.09%      62.63%      41.60%
           rotate         76.62%     72.19%     71.84%     66.75%     64.42%      65.60%      42.67%
           Distil-D       82.37%     82.22%     80.49%     82.47%     83.28%      71.14%      45.39%
           Ens-D          86.97%     83.03%     85.24%     83.41%     82.50%      74.29%      47.85%
           D-GAN          82.43%     80.34%     86.13%     79.35%     80.47%      70.08%      43.10%
           GN             20.16%     21.80%     25.30%     19.67%     18.63%      21.40%      13.56%
           DAE            83.66%     84.17%     86.88%     82.40%     83.66%      74.30%      51.61%
           APE-GAN        82.46%     85.01%     85.14%     81.80%     82.50%      73.80%      49.28%
           UIPD           87.92%     85.22%     87.54%     83.70%     83.91%      75.38%      52.91%
    Rconf  resize1        0.8513     0.8614     0.8460     0.7963     0.8324      0.6010      0.3742
           resize2        0.7814     0.8810     0.8655     0.8290     0.8475      0.6320      0.3800
           rotate         0.8519     0.8374     0.8319     0.8100     0.8040      0.6462      0.4058
           Distil-D       0.9141     0.8913     0.9033     0.9135     0.9200      0.7821      0.4528
           Ens-D          0.9515     0.9280     0.8720     0.8940     0.9011      0.8155      0.4788
           D-GAN          0.8539     0.8789     0.8829     0.8733     0.8820      0.7450      0.4390
           GN             0.1630     0.1920     0.2152     0.1761     0.1971      0.1450      0.0619
           DAE            0.9120     0.9290     0.9510     0.9420     0.9324      0.7782      0.5090
           APE-GAN        0.8964     0.9270     0.9425     0.8897     0.9015      0.6301      0.4749
           UIPD           0.9210     0.9340     0.9520     0.9512     0.9781      0.8051      0.5290

    表  7  不同防御方法处理后良性样本的识别准确率

    Table  7  The accuracy of benign examples after processing by different defense methods

                 MNIST                                                   FMNIST                                  CIFAR-10            ImageNet
                 AlexNet            LeNet              M_CNN             AlexNet            F_CNN               VGG19               VGG19
    良性样本      92.34%             95.71%             90.45%            89.01%             87.42%              79.55%              89.00%
    resize1      92.27% (−0.07%)    95.66% (−0.05%)    90.47% (+0.02%)   88.97% (−0.04%)    87.38% (−0.04%)     79.49% (−0.06%)     88.98% (−0.02%)
    resize2      92.26% (−0.08%)    95.68% (−0.03%)    90.29% (−0.16%)   88.71% (−0.30%)    87.38% (−0.04%)     79.48% (−0.07%)     87.61% (−1.39%)
    rotate       92.31% (−0.03%)    95.68% (−0.03%)    90.39% (−0.06%)   88.95% (−0.06%)    87.40% (−0.02%)     79.53% (−0.02%)     88.82% (−0.18%)
    Distil-D     90.00% (−2.34%)    95.70% (−0.01%)    90.02% (−0.43%)   88.89% (−0.12%)    86.72% (−0.70%)     76.97% (−2.58%)     87.85% (−1.15%)
    Ens-D        94.35% (+2.01%)    96.15% (+0.44%)    92.38% (+1.93%)   89.13% (+0.12%)    87.45% (+0.03%)     80.13% (+0.58%)     89.05% (+0.05%)
    D-GAN        92.08% (−0.26%)    95.18% (−0.53%)    90.04% (−0.41%)   88.60% (−0.41%)    87.13% (−0.29%)     78.80% (−0.75%)     87.83% (−1.17%)
    GN           22.54% (−69.80%)   25.31% (−70.40%)   33.58% (−56.87%)  35.71% (−53.30%)   28.92% (−58.59%)    23.65% (−55.90%)    17.13% (−71.87%)
    DAE          91.57% (−0.77%)    95.28% (−0.43%)    89.91% (−0.54%)   88.13% (−0.88%)    86.80% (−0.62%)     79.46% (−0.09%)     87.10% (−1.90%)
    APE-GAN      92.30% (−0.04%)    95.68% (−0.03%)    90.42% (−0.03%)   89.00% (−0.01%)    87.28% (−0.14%)     79.49% (−0.06%)     88.88% (−0.12%)
    UIPD         92.37% (+0.03%)    95.96% (+0.25%)    90.51% (+0.06%)   89.15% (+0.14%)    87.48% (+0.06%)     79.61% (+0.06%)     89.15% (+0.15%)

    A1  UIPD针对不同数据样本的通用性举例(FMNIST, F_CNN)

    A1  The universality of UIPD for different examples (FMNIST, F_CNN)

    良性样本类标            0        1        2        3        4        5        6        7        8        9
    置信度                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
    (良性样本+UIP)类标      0        1        2        3        4        5        6        7        8        9
    置信度                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
    对抗样本类标            6        3        6        1        2        7        5        5        4        7
    置信度                  0.4531   0.4714   0.5641   0.5103   0.4831   0.5422   0.4864   0.5144   0.4781   0.4961
    (对抗样本+UIP)类标      0        1        2        3        4        5        6        7        8        9
    置信度                  0.9415   0.8945   0.9131   0.9425   0.8773   0.9025   0.8787   0.8309   0.9424   0.8872

    A2  UIPD针对不同数据样本的通用性(CIFAR10, VGG19)

    A2  The universality of UIPD for different examples (CIFAR10, VGG19)

    良性样本类标            飞机     汽车     鸟       猫       鹿       狗       青蛙     马       船       卡车
    置信度                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
    (良性样本+UIP)类标      飞机     汽车     鸟       猫       鹿       狗       青蛙     马       船       卡车
    置信度                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
    对抗样本类标            卡车     飞机     飞机     …
    置信度                  0.4914   0.5212   0.5031   0.5041   0.5010   0.5347   0.5314   0.4814   0.5142   0.4761
    (对抗样本+UIP)类标      飞机     汽车     鸟       猫       鹿       狗       青蛙     马       船       卡车
    置信度                  0.9331   0.9131   0.8913   0.9043   0.8831   0.9141   0.8863   0.8947   0.9251   0.9529

    A3  UIPD针对不同数据样本的通用性(ImageNet, VGG19)

    A3  The universality of UIPD for different examples (ImageNet, VGG19)

    良性样本类标            导弹     步枪     军装     皮套     航空母舰  航天飞机  防弹背心  灯塔     客机     坦克
    置信度                  0.9425   0.9475   0.9825   0.9652   0.9825    0.9652    0.9256    0.9413   0.9515   0.9823
    (良性样本+UIP)类标      导弹     步枪     军装     皮套     航空母舰  航天飞机  防弹背心  灯塔     客机     坦克
    置信度                  0.9445   0.9525   0.9925   0.9692   0.9926    0.9652    0.9159    0.9782   0.9634   0.9782
    对抗样本类标            军装     航空母舰  防弹背心  军装    灯塔      导弹      步枪      客机     坦克     灯塔
    置信度                  0.5134   0.4981   0.5014   0.4831   0.4788    0.5101    0.4698    0.5194   0.4983   0.5310
    (对抗样本+UIP)类标      导弹     步枪     军装     皮套     航空母舰  航天飞机  防弹背心  灯塔     客机     坦克
    置信度                  0.8942   0.7342   0.8245   0.8074   0.8142    0.7912    0.8141    0.7861   0.7134   0.7613

    A4  不同数据集和模型的UIP可视化图

    A4  The UIP visualization of different datasets and models

    MNIST, AlexNet     均值: 0.0003     方差: 0.0002
    MNIST, LeNet       均值: 0.0005     方差: 0.0009
    MNIST, M_CNN       均值: 0.0034     方差: 0.0005
    FMNIST, AlexNet    均值: −0.0002    方差: 0.0002
    FMNIST, F_CNN      均值: 0.0339     方差: 0.0191
    CIFAR10, VGG19     均值: −0.0005    方差: 0.0003
    ImageNet, VGG19    均值: −0.0137    方差: 0.0615
  • [1] Goodfellow I, Bengio Y, Courville A. Deep learning[M]. Massachusetts: MIT press, 2016.24−45.
    [2] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]//Proc of Advances in Neural Information Processing Systems. Massachusetts: MIT Press, 2012: 1097−1105.
    [3] Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks[C]//Proc of Advances in Neural Information Processing Systems. Massachusetts: MIT Press, 2014: 3104−3112.
    [4] 袁文浩, 孙文珠, 夏斌, 欧世峰. 利用深度卷积神经网络提高未知噪声下的语音增强性能. 自动化学报, 2018, 44(4): 751−759

    Yuan Wen-Hao, Sun Wen-Zhu, Xia Bin, Ou Shi-Feng. Improving speech enhancement in unseen noise using deep convolutional neural network. Acta Automatica Sinica, 2018, 44(4): 751−759
    [5] 代伟, 柴天佑. 数据驱动的复杂磨矿过程运行优化控制方法. 自动化学报, 2014, 40(9): 2005−2014

    Dai Wei, Chai Tian-You. Data-driven optimal operational control of complex grinding processes. Acta Automatica Sinica, 2014, 40(9): 2005−2014
    [6] Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. arXiv preprint arXiv: 1312.6199, 2013: 1−15
    [7] Akhtar N, Mian A. Threat of adversarial attacks on deep learning in computer vision: A survey. IEEE Access, 2018, 6: 14410−14430 doi: 10.1109/ACCESS.2018.2807385
    [8] Goodfellow I, Shlens J, Szegedy C. Explaining and harnessing adversarial examples[J]. arXiv preprint arXiv: 1412.6572, 2014.
    [9] Dong Y, Liao F, Pang T, et al. Boosting adversarial attacks with momentum[C]//Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2018: 9185−9193.
    [10] Papernot N, McDaniel P, Jha S, et al. The limitations of deep learning in adversarial settings[C]// Proc of 2016 IEEE European symposium on security and privacy. Piscataway, NJ: IEEE, 2016: 372−387.
    [11] Chen P, Zhang H, Sharma Y. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models[C]//Proc of the 10th ACM Workshop on Artificial Intelligence and Security. New York, NY: ACM, 2017: 15−26.
    [12] Brendel W, Rauber J, Bethge M. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models[J]. arXiv preprint arXiv: 1712.04248, 2017.
    [13] Xiao C, Li B, Zhu J Y, et al. Generating adversarial examples with adversarial networks[J]. arXiv preprint arXiv: 1801.02610, 2018.
    [14] Papernot N, McDaniel P, Goodfellow I. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples[J]. arXiv preprint arXiv: 1605.07277, 2016.
    [15] Moosavi-Dezfooli S M, Fawzi A, Fawzi O, et al. Universal adversarial perturbations[C]//Proc of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: IEEE, 2017: 1765−1773.
    [16] Ilyas A, Santurkar S, Tsipras D, et al. Adversarial examples are not bugs, they are features[C]//Proc of Advances in Neural Information Processing Systems. Massachusetts: MIT Press, 2019: 125−136.
    [17] Carlini N, Wagner D. Towards evaluating the robustness of neural networks[C]//Proc of 2017 IEEE European Symposium on Security and Privacy. Piscataway, NJ: IEEE, 2017: 39−57.
    [18] Moosavi-Dezfooli S M, Fawzi A, Frossard P. Deepfool: a simple and accurate method to fool deep neural networks[C]//Proc of the IEEE conference on computer vision and pattern recognition. Piscataway, NJ: IEEE, 2016: 2574−2582.
    [19] Rauber J, Brendel W, Bethge M. Foolbox: A python toolbox to benchmark the robustness of machine learning models[J]. arXiv preprint arXiv: 1707.04131, 2017.
    [20] Su J, Vargas D V, Sakurai K. One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation, 2019, 23(5): 828−841 doi: 10.1109/TEVC.2019.2890858
    [21] Baluja S, Fischer I. Adversarial transformation networks: Learning to generate adversarial examples[J]. arXiv preprint arXiv: 1703.09387, 2017.
    [22] Cisse M, Adi Y, Neverova N, et al. Houdini: Fooling deep structured prediction models[J]. arXiv preprint arXiv: 1707.05373, 2017.
    [23] Sarkar S, Bansal A, Mahbub U, et al. UPSET and ANGRI: Breaking high performance image classifiers[J]. arXiv preprint arXiv: 1707.01159, 2017.
    [24] Brown T B, Mané D, Roy A, et al. Adversarial patch[J]. arXiv preprint arXiv: 1712.09665, 2017.
    [25] Karmon D, Zoran D, Goldberg Y. Lavan: Localized and visible adversarial noise[C]//International Conference on Machine Learning. PMLR, 2018: 2507−2515.
    [26] Liu A, Liu X, Fan J, et al. Perceptual-sensitive gan for generating adversarial patches[C]//Proceedings of the AAAI conference on artificial intelligence. 2019, 33(01): 1028−1035.
    [27] Liu A, Wang J, Liu X, et al. Bias-based universal adversarial patch attack for automatic check-out[C]//Proc of European Conference on Computer Vision (ECCV). 2020: 395−410.
    [28] Xie C, Wang J, Zhang Z, et al. Adversarial examples for semantic segmentation and object detection[C]//Proc of International Conference on Computer Vision(ICCV). Piscataway, NJ: IEEE, 2017: 1−13.
    [29] Song C, He K, Lin J, et al. Robust Local Features for Improving the Generalization of Adversarial Training[J]. arXiv preprint arXiv: 1909.10147, 2019.
    [30] Miyato T, Dai A M, Goodfellow I. Adversarial training methods for semi-supervised text classification. arXiv preprint arXiv: 1605.07725, 2016: 1−16
    [31] Zheng S, Song Y, Leung T, et al. Improving the robustness of deep neural networks via stability training[C]// Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2016, 4480−4488.
    [32] Dziugaite G K, Ghahramani Z, Roy D M. A study of the effect of jpg compression on adversarial images[J]. arXiv preprint arXiv: 1608.00853, 2016.
    [33] Das N, Shanbhogue M, Chen S T, et al. Keeping the bad guys out: Protecting and vaccinating deep learning with jpeg compression. arXiv preprint arXiv: 1705.02900, 2017: 1−13
    [34] Luo Y, Boix X, Roig G, et al. Foveation-based mechanisms alleviate adversarial examples. arXiv preprint arXiv: 1511.06292, 2015: 1−13
    [35] Xie C, Wang J, Zhang Z, et al. Adversarial examples for semantic segmentation and object detection[C]//Proc of International Conference on Computer Vision(ICCV). Piscataway, NJ: IEEE, 2017: 1−13.
    [36] Gu S, Rigazio L. Towards deep neural network architectures robust to adversarial examples. arXiv preprint arXiv: 1412.5068, 2014: 1−15
    [37] Rifai S, Vincent P, Muller X, et al. Contractive auto-encoders: Explicit invariance during feature extraction[C]//Proc of the 28th International Conference on International Conference on Machine Learning. New York, NY: ACM, 2011: 833−840.
    [38] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv preprint arXiv: 1503.02531, 2015.
    [39] Papernot N, McDaniel P, Wu X, et al. Distillation as a defense to adversarial perturbations against deep neural networks[C]//Proc of 2016 IEEE Symposium on Security and Privacy (SP). Piscataway, NJ: IEEE, 2016: 582−597.
    [40] Nayebi A, Ganguli S. Biologically inspired protection of deep networks from adversarial attacks. arXiv preprint arXiv: 1703.09202, 2017: 1−16
    [41] Cisse M, Adi Y, Neverova N, et al. Houdini: Fooling deep structured prediction models. arXiv preprint arXiv: 1707.05373, 2017: 1−15
    [42] Gao J, Wang B, Lin Z, et al. DeepCloak: Masking Deep Neural Network Models for Robustness Against Adversarial Samples[C]//Proc of the 5th International Conference on Learning Representations. Piscataway, NJ: IEEE, 2017: 1−16.
    [43] Jin J, Dundar A, Culurciello E. Robust convolutional neural networks under adversarial noise. arXiv preprint arXiv: 1511.06306, 2015: 1−12
    [44] Sun Z, Ozay M, Okatani T. HyperNetworks with statistical filtering for defending adversarial examples. arXiv preprint arXiv: 1711.01791, 2017: 1−13
    [45] Madry A, Makelov A, Schmidt L, et al. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv: 1706.06083, 2017: 1−13
    [46] Akhtar N, Liu J, Mian A. Defense against universal adversarial perturbations[C]//Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Berlin, German: Springer, 2018: 3389−3398.
    [47] Hlihor P, Volpi R, Malagò L. Evaluating the robustness of defense mechanisms based on autoencoder reconstructions against Carlini-Wagner adversarial attacks[C]//Proceedings of the Northern Lights Deep Learning Workshop, 2020, 1: 6.
    [48] 孔锐, 蔡佳纯, 黄钢. 基于生成对抗网络的对抗攻击防御模型. 自动化学报, 2020, 41(x): 1−17

    Kong Rui, Cai Jia-Chun, Huang Gang. Defense to adversarial attack with generative adversarial network. Acta Automatica Sinica, 2020, 41(x): 1−17
    [49] Samangouei P, Kabkab M, Chellappa R. Defense-gan: Protecting classifiers against adversarial attacks using generative models[J]. arXiv preprint arXiv: 1805.06605, 2018.
    [50] Lin W A, Balaji Y, Samangouei P, et al. Invert and Defend: Model-based Approximate Inversion of Generative Adversarial Networks for Secure Inference[J]. arXiv preprint arXiv: 1911.10291, 2019.
    [51] Jin G, Shen S, Zhang D, Dai F, Zhang Y. APE-GAN: Adversarial perturbation elimination with GAN[C]//Proc of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway, NJ: IEEE, 2019: 3842−3846.
    [52] Xu W, Evans D, Qi Y. Feature squeezing: Detecting adversarial examples in deep neural networks. arXiv preprint arXiv: 1704.01155, 2017: 1−12
    [53] Ju C, Bibaut A, van der Laan M. The relative performance of ensemble methods with deep convolutional neural networks for image classification[J]. Journal of Applied Statistics, 2017.
    [54] Kim B, Rudin C, Shah J. Latent Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification[J]. 2014.
出版历程
  • 收稿日期:  2020-12-28
  • 录用日期:  2021-04-16
  • 网络出版日期:  2021-06-01
