-
Abstract: Machine learning, with its strong adaptability and self-learning ability, has become a research hotspot and an important direction in cyberspace defense. However, machine learning models in cyberspace are exposed to adversarial attacks and may become the weakest link of a defense system, endangering the security of the whole system. It is therefore valuable to analyze the security scenarios scientifically and to examine, from their operating mechanisms, the feasibility and security of the algorithms used to build machine-learning-based cyberspace defense systems. Adversarial machine learning for cyberspace defense is an interdisciplinary research field, and this paper gives a comprehensive review of its results and future directions. First, we present background on cyberspace defense and adversarial machine learning. Second, we introduce the notion of an adversary model for the attacks that machine learning may suffer in cyberspace defense, in order to assess its security properties under specific threat scenarios. Then, for machine learning algorithms used in cyberspace defense, we discuss evasion attacks launched in the test phase, poisoning attacks launched in the training phase, and privacy theft launched across all phases, and study defense methods that strengthen machine learning models in adversarial cyberspace environments. Finally, we discuss future directions and challenges for research on adversarial machine learning in cyberspace defense.
-
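Deep neural networks (DNN) are widely used in tasks such as computer vision classification because of their excellent performance[1−2]. However, their multi-layer architectures fit complex, nonlinear feature spaces with models that often contain hundreds of millions of weight parameters, which makes the decision process hard to trace and understand[3]. Such black-box models are difficult to apply in fields that demand high transparency and trust, such as medicine, law and finance, because they may mislead users with wrong results and may even endanger lives[4]. In contrast, classical white-box models such as decision trees[5], rule learning[6] and Bayesian networks[7] expose their reasoning through hierarchical structures, inference rules and probability distributions respectively, but their prediction accuracy is far lower than that of deep neural networks. Researchers have tried many ways to explain black-box models, for example gradient-based visualization[8−9], class activation maps[10−11], local surrogate white-box models fitted to the black-box model[12−13], and prototype learning[14−15]. These explanation methods, however, do not take part in model training and therefore cannot improve the model's own ability to provide explanations. As the demand for interpretability in artificial intelligence models keeps rising, models need to achieve good interpretability while maintaining high accuracy.
Neural networks were originally designed to imitate the neurons of the brain, and in classification tasks humans can make a prediction and give reasons for it even when facing an unknown object. Humans recognize objects through their visual attributes[16] and use other modalities such as touch, hearing and taste to make more precise judgments. Medical research has confirmed that the overall perception of multiple senses exceeds that of the individual senses[17], so the combined information of multimodal data is essential for understanding and recognizing objects accurately. Specifically, humans perceive the attributes of an object through several senses and look up, in existing knowledge, the category prototype that has the same attributes. For example, an iron door usually looks solid, feels cold and smooth, and makes a low sound when opened or closed; these attributes form the prototype of an iron door. When the material of a door is unknown, a person can still judge it by the attributes it shares with an iron door.
Category prototypes also carry a natural hierarchy, similar to the biological taxonomy shown in Fig. 1. Each category inherits all attributes of its parent and adds its own distinctive ones. For example, Carnivora and Lagomorpha both belong to Mammalia and inherit its physiological features of four limbs and external ears, while differing in that Carnivora have forward-facing eyes and Lagomorpha have eyes on the sides of the head. This way of recognizing objects suggests extracting the attributes of multimodal data as explanatory information and exploiting the hierarchy through decision-tree inference. Extracting and using these attributes makes the decision process of the model easier to understand and explain, and gives a better balance between interpretability and accuracy.
Multimodal fusion methods are widely used in many fields[18−19], but they usually lack interpretability. Studies show that convolutional neural networks (CNN) can learn visual attributes in classification tasks, so attributes no longer need to be defined manually to model objects[20]. Inspired by these works, this paper proposes an interpretable classification method based on visual attributes for multimodal image inputs such as visible-light and depth images. A backbone network learns the attributes of the input data under a prototype constraint, and tree inference is performed in a predefined hierarchy. Each attribute is represented by a set of images that share the same feature, and the feature of the set is summarized after visualization with gradient-weighted class activation mapping (Grad-CAM)[21].
The pipeline of the method is shown in Fig. 2. It explains the model decision through a set of attributes and the decision-tree inference process. Taking a pair of bathroom scene images as an example, the backbone network computes the attribute strengths present in the visible-light and depth images and represents the scene as a set of visual attributes such as toilet, cabinet, countertop and fabric. The decision tree then reasons over the attribute strengths: the attributes first match the characteristics of a residence, then those of a bathroom within a residence, which completes the classification.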
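The main contributions of this paper are:
1) An attribute-based multimodal interpretable classification method is proposed. It explains classification results with the attribute information of the input data and decision-tree inference, and ablation experiments study the effect of each part of the model on overall performance.
2) A layer-wise decision-tree fusion method for multimodal data is proposed. It fuses evidence at the decision-tree inference stage and improves accuracy while preserving interpretability.
3) The method performs well in both accuracy and interpretability. It reaches state-of-the-art performance on three multimodal scene classification datasets, NYUDv2, SUN RGB-D and RGB-NIR, and its interpretability is verified by insertion/deletion tests.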
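1. Related work
Related prior work falls into two parts: interpretable methods and multimodal learning.
1.1 Interpretable methods
In recent years, researchers have paid more attention to the internal mechanisms by which complex, nonlinear black-box models make decisions, in order to address the risks of using such models in specific application scenarios such as medicine, law and finance[3].
In classification tasks, one family of methods explains a black-box model by showing which parts of the input it attends to. These methods highlight specific parts of the input by inspecting the model's gradients[8], integrated gradients[9] or neuron activations[22], or by introducing deconvolution[23]. The most representative method is class activation mapping (CAM)[10] and its improvements[21], which compute the importance of each region of the input image from the feature maps and the output-layer weights or gradients, and overlay the resulting heat map on the original image. A second family is independent of the architecture of the black-box model. For example, local interpretable model-agnostic explanations (LIME)[12] focus on the local relationship between input and output and build an interpretable model that approximates the black-box model in the neighborhood of a given input. Shapley additive explanations (SHAP)[13] combine LIME with Shapley values, an attribution concept from game theory. Prototype learning[14−15] provides explanations by comparing the attributes of the input with category prototypes, where the prototypes are either hand-coded[24] or extracted by machine learning[25−26]. Both families explain an existing black-box model and do not take part in model training.
Transparent white-box models offer good interpretability. Common white-box models include linear models, decision trees[27−28] and rule-based models[29]. They are easier to explain than black-box models but are usually less accurate. The advantage of the method proposed in this paper is that it uses a neural network during training to learn the attributes of the input images and the category prototypes, and then performs classification inference with a decision tree over those attributes, thereby achieving both good accuracy and good interpretability.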
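1.2 Multimodal learning
In real life, humans perceive the world through several senses such as sight, hearing and touch, and medical research shows that the overall perception of multiple senses exceeds the sum of the individual senses[17]. Multimodal learning therefore aims to learn from the rich information of several different data sources.
An early representative of multimodal learning is canonical correlation analysis (CCA)[30], which looks for the linear transformations that maximize the correlation between modalities. Various CCA-based models[31−32] were later widely used in multimodal learning. However, linear embedding functions cannot fit complex, nonlinear multimodal data well. Deep learning can model complex nonlinear relationships and is therefore used for multimodal fusion[33]. By fusion strategy, methods can be divided into aggregation[34−35], alignment[36−37] and hybrid methods; by fusion stage, they can be divided into early, intermediate and late fusion[38−40].
Although multimodal learning has achieved remarkable classification accuracy, the interpretability of multimodal models has received little attention. The method proposed in this paper is an attempt to provide explanations for multimodal data from different image sources such as visible-light, depth and near-infrared (NIR) images, with the aim of making the decision process of the model more transparent and understandable.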
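2. Model architecture
The model framework is shown in Fig. 3. Each modality uses a convolutional neural network as its backbone to learn an attribute representation vector, and the attribute vector, the introduced hierarchy and the learned decision rules together form a decision tree. The decision trees of the modalities are fused layer by layer according to Dempster-Shafer evidence theory[41], and soft inference is finally performed in the fused tree. While the backbones extract attributes, channel exchange is introduced to remove low-quality information from a modality and to keep the attributes learned by different modalities consistent. The framework consists of four modules: visual attribute extraction, decision-tree construction and fusion, decision-tree inference, and the prototype constraint.
2.1 Visual attribute extraction
Previous studies have shown that CNNs can learn visual attributes in classification tasks. In our method, a CNN serves as the backbone that learns the attribute representation vector of each modality. For the $ i $-th view $ v^i $, a CNN mapping $ f^i $ extracts a one-dimensional attribute strength vector ${\boldsymbol{A}}^{i} $, i.e., $ f^{i}:v^{i}\mapsto{\boldsymbol{A}}^{i} \in [0,\; +\infty)^{D} $, where $ D $ is the number of attributes extracted by the network. This mapping turns the input view into a set of attribute strengths that describe the features present in the view. The attribute vectors ${\boldsymbol{A}}^{i} $ are later used to build the decision trees, which makes the decision process of the model open to interpretation.
The CNN produces a feature map of size $ D\times w \times h $, where $ w $ and $ h $ are the width and height of the feature map. Each entry is the strength of a particular attribute within a region. Because the model cares about whether an attribute is present anywhere in the image, global max pooling is used to convert the feature map into the attribute representation vector. Before pooling, the attribute strengths must be non-negative, which can be ensured with a ReLU function. The activation followed by global max pooling thus summarizes the strength of each attribute in the feature map into one vector that indicates which attributes are present in the image and supplies meaningful explanatory information to the subsequent decision-tree inference.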
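The pooling step can be illustrated with a small, dependency-free sketch. It assumes the feature map is given as nested Python lists (`D` channels of `h x w` values); the function names and the toy values are illustrative and are not taken from the paper's implementation.

```python
def relu(x):
    """Clamp negative responses to zero so attribute strengths are non-negative."""
    return x if x > 0.0 else 0.0


def attribute_vector(feature_map):
    """Global max pooling over each channel of a D x h x w feature map.

    feature_map is a list of D channels, each a list of rows of floats.
    Returns a list of D non-negative attribute strengths.
    """
    return [max(relu(v) for row in channel for v in row) for channel in feature_map]


# Toy feature map with D = 3 attributes on a 2 x 2 grid (made-up numbers).
feature_map = [
    [[-1.0, 0.5], [2.0, -3.0]],    # attribute 0 fires strongly at one location
    [[-0.2, -0.1], [-0.4, -0.3]],  # attribute 1 is absent everywhere
    [[0.0, 1.5], [1.0, 0.25]],     # attribute 2 is moderately present
]
A = attribute_vector(feature_map)
print(A)  # [2.0, 0.0, 1.5]
```

Max pooling rather than average pooling matches the stated goal: a single strong local response is enough to mark an attribute as present.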
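To make the attributes extracted by the backbones consistent across modalities, low-quality information in one modality is replaced with data from the other modalities at the batch normalization (BatchNorm) layers, which is known as BatchNorm channel exchange[18]. It can be written as:
$$ {\boldsymbol{y}}^{i} = \left\{\begin{aligned} &\gamma^{i} \frac{{\boldsymbol{x}}^{i} - \mu({\boldsymbol{x}}^{i})}{\sqrt{\delta^{2}({\boldsymbol{x}}^{i})}} + \beta^{i},\; && \gamma^{i} > \theta \\ &\frac{1}{N-1} \sum_{j \ne i} \left( \gamma^{j} \frac{{\boldsymbol{x}}^{j} - \mu({\boldsymbol{x}}^{j})}{\sqrt{\delta^{2}({\boldsymbol{x}}^{j})}} + \beta^{j} \right),\; && \text{otherwise} \end{aligned} \right. $$ (1)
where $ {\boldsymbol{x}}^{i} $ is the input of the BatchNorm layer of the $ i $-th modality, $ {\boldsymbol{y}}^{i} $ is the corresponding output, $ N $ is the number of modalities, and $ \gamma^{i} $ and $ \beta^{i} $ are the trainable scaling factor and offset of the BatchNorm layer. When the scaling factor is below a threshold $ \theta \approx 0^{+} $, the input ${\boldsymbol{x}}^i$ has little influence on the output ${\boldsymbol{y}}^i$; it can be regarded as unimportant and replaced by the average of the other modalities. $ \mu({\boldsymbol{x}}^i) $ and $ \delta^{2}({\boldsymbol{x}}^i) $ are the mean and variance of $ {\boldsymbol{x}}^i $.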
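A minimal sketch of Eq. (1) for a single channel shared by several modalities is given below. It assumes each modality supplies the channel's values as a flat list, and it adds a small `eps` inside the square root for numerical safety, which Eq. (1) does not show; both choices, and all names, are assumptions made for illustration.

```python
import math


def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize one channel and apply scale gamma and offset beta.

    eps is an assumption for numerical stability; Eq. (1) omits it.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]


def channel_exchange(xs, gammas, betas, theta):
    """Apply Eq. (1) to one channel across N >= 2 modalities.

    xs[i] holds the channel values of modality i. A modality whose scaling
    factor does not exceed theta receives the mean of the other modalities'
    BatchNorm outputs instead of its own.
    """
    normed = [batch_norm(x, g, b) for x, g, b in zip(xs, gammas, betas)]
    n = len(xs)
    out = []
    for i in range(n):
        if gammas[i] > theta:
            out.append(normed[i])
        else:
            others = [normed[j] for j in range(n) if j != i]
            out.append([sum(vals) / (n - 1) for vals in zip(*others)])
    return out


xs = [[1.0, 2.0, 6.0], [2.0, 4.0, 6.0]]
# Modality 0 has a near-zero scaling factor, so its channel is replaced.
out = channel_exchange(xs, gammas=[0.001, 1.0], betas=[0.0, 0.0], theta=0.01)
# With both scaling factors above the threshold, nothing is exchanged.
kept = channel_exchange(xs, gammas=[1.0, 1.0], betas=[0.0, 0.0], theta=0.01)
```

With two modalities the replacement is simply the other modality's normalized channel; with more modalities it is their average.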
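To make such exchanges possible, $ \ell_1 $ regularization is applied to the scaling factors of half of the BatchNorm layers so that they become sparser. The regularization loss is:
$$ \begin{align} {\cal L}_{\ell_1} = \eta \sum_{i=1}^N \sum_{l=1}^{\lceil \frac{L}{ 2} \rceil} \lvert \widehat{\gamma_{l}^{i}} \rvert \end{align} $$ (2)
where $ \eta $ is a hyperparameter, $ L $ is the number of BatchNorm layers of a single modality, and $ \widehat{\gamma_{l}^{i}} $ is the scaling factor of the $ l $-th BatchNorm layer of the $ i $-th modality.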
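2.2 Decision-tree construction and fusion
Categories naturally form a hierarchy that can be represented as a tree, and this hierarchy can be used to build a decision tree for inference. Because the hierarchy is independent of the modality, the decision trees of all modalities follow the same category hierarchy.
In the tree hierarchy, the height $ h_k $ of the $ k $-th node is the number of ancestor categories of class $ k $. $ {\boldsymbol{H}} \in \{0,\; 1 \} ^{M \times M} $ denotes the introduced category hierarchy, where $ {\boldsymbol{H}}_{k \cdot} $ marks all ancestor categories of class $ k $ and $ M $ is the number of categories. The hierarchy $ {\boldsymbol{H}} $ is defined as:
$$ \begin{align} {\boldsymbol{H}}_{ij} = \left\{\begin{aligned} &1,\; & &\text{node} \;j \;\text{is an ancestor of}\; i \\& 0,\; && \text{otherwise} \end{aligned}\right. \end{align} $$ (3)
A decision tree $ T $ consists of the attributes ${\boldsymbol{A}}^i \in [0,\; +\infty)^{D} $, the inference rules ${\boldsymbol{W}} \in {\bf{R}}^{D \times M} $ and the hierarchy $ {\boldsymbol{H}} $. The inference rules ${\boldsymbol{W}} $ map the attributes ${\boldsymbol{A}} $ to the evidence $ \boldsymbol{e} $ of belonging to each node category and can be implemented by a fully connected layer. Each node of the decision tree corresponds to one category, and ${\boldsymbol{W}}_{\cdot k} $ is the inference rule of node $ k $[27]. The evidence of node $ k $ in the $ i $-th modality is then computed as $ e^{i}_{k} ={\boldsymbol{A}}^i{\boldsymbol{W}}_{\cdot k} $.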
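The following sketch builds the ancestor matrix of Eq. (3) from a parent list and evaluates the node evidence $ e_k = {\boldsymbol{A}}{\boldsymbol{W}}_{\cdot k} $. The five-node tree, the parent-list encoding and the numeric values are hypothetical and only serve to make the definitions concrete.

```python
def ancestor_matrix(parent):
    """Build H with H[i][j] = 1 when node j is an ancestor of node i (Eq. (3)).

    parent[i] is the index of the parent of node i, or None for the root.
    """
    m = len(parent)
    H = [[0] * m for _ in range(m)]
    for i in range(m):
        j = parent[i]
        while j is not None:
            H[i][j] = 1
            j = parent[j]
    return H


def evidence(A, W):
    """Evidence of every node: e_k = sum_d A[d] * W[d][k]."""
    D, M = len(A), len(W[0])
    return [sum(A[d] * W[d][k] for d in range(D)) for k in range(M)]


# Hypothetical tree: node 0 is the root, 1 and 2 are its children,
# and 3 and 4 are children of node 1.
parent = [None, 0, 0, 1, 1]
H = ancestor_matrix(parent)
heights = [sum(row) for row in H]  # h_k = number of ancestors of node k

A = [1.0, 2.0]                      # D = 2 attribute strengths
W = [[1.0, 0.0, 1.0, 0.0, 1.0],     # D x M inference rules
     [0.0, 1.0, 0.0, 1.0, 0.5]]
e = evidence(A, W)
print(heights)  # [0, 1, 1, 2, 2]
print(e)        # [1.0, 2.0, 1.0, 2.0, 2.0]
```

Nodes with equal height form one layer of the tree, which is the grouping used by the layer-wise fusion.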
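The layer-wise fusion of decision trees is shown in Fig. 4. Inspired by trusted multimodal fusion[19], which adjusts the fusion weights dynamically according to the data quality of each modality, Dempster-Shafer evidence theory and subjective logic[42] can be used to fuse the decision trees and improve fusion accuracy. When assessing the quality of a modality, however, subjective logic requires each sample to have exactly one label $ k $. In a decision tree a child category also belongs to its parent category, but within one level it belongs to only one category, so fusion has to be carried out level by level. The nodes of height $ q $ satisfy:
$$ \begin{align} \sum_{h_k = q} p_k = 1 \end{align} $$ (4)
where $ p_k $ is the probability that the sample belongs to node $ k $. For the nodes of equal height in the $ i $-th modality, subjective logic relates the Dirichlet parameters $ {{\boldsymbol{\alpha}}}^{i} $ to the evidence $ {\boldsymbol{E}}^{i} = \{ e^i_{k_1}, e^i_{k_2},\; \cdots ,\; e^i_{k_r} \} $ by $ {\boldsymbol{\alpha}}^{i} = {\boldsymbol{E}}^{i} + 1 $, where $ h_{k_1} = h_{k_2} = \cdots = h_{k_r} $ and $ r $ is the number of nodes of that height. The Dirichlet strength is defined as $ S^{i} = \sum_{k=1}^{r} {\alpha}_{k}^{i} $. The belief $ b_{k}^{i} $ of category $ k $ and the uncertainty $ u^{i} $ of the $ i $-th modality are then:
$$ \begin{align} b_{k}^{i} = \frac{E_{k}^{i}}{S^{i}} = \frac{\alpha_{k}^{i}-1}{S^{i}} \quad \text{and} \quad u^{i} = \frac{r}{S^{i}} \end{align} $$ (5)
The modalities are fused pairwise in sequence. The fusion of the $ i $-th and the $ (i+1) $-th modality is:
$$ \begin{align} C & = \sum_{k_{i} \neq k_{i+1}} b_{k_{i}}^{i} b_{k_{i+1}}^{i+1} \end{align} $$ (6)
$$ \begin{align} b_{k} & = \frac{1}{1-C}(b_{k}^{i}b_{k}^{i+1} + b_{k}^{i}u^{i+1}+ b_{k}^{i+1}u^{i} ) \end{align} $$ (7)
$$ \begin{align} u & = \frac{1}{1-C} u^{i} u^{i+1} \end{align} $$ (8)
$$ \begin{align} E_{k} & = b_{k} \times S = b_{k} \times \frac{r}{u} \end{align} $$ (9)
where $ \boldsymbol{b}^{i} $ and $ u^{i} $ are the beliefs and uncertainty of the $ i $-th modality, $ \boldsymbol{b}^{i+1} $ and $ u^{i+1} $ are those of the $ (i+1) $-th modality, $ C $ measures the conflict between the two modalities, and $ {\boldsymbol{E}} $ is the fused evidence.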
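Eqs. (5) to (9) can be traced numerically for one tree level with two modalities. The sketch below is plain Python with made-up evidence values; the function names `opinion` and `fuse` are illustrative and not taken from the paper.

```python
def opinion(evidence):
    """Beliefs and uncertainty of one modality at one tree level (Eq. (5))."""
    r = len(evidence)
    S = sum(e + 1.0 for e in evidence)  # Dirichlet strength, alpha = E + 1
    b = [e / S for e in evidence]
    u = r / S
    return b, u


def fuse(b1, u1, b2, u2):
    """Combine two opinions over the same r nodes (Eqs. (6)-(9))."""
    r = len(b1)
    # Conflict: belief mass the two modalities assign to different nodes.
    C = sum(b1[i] * b2[j] for i in range(r) for j in range(r) if i != j)
    scale = 1.0 / (1.0 - C)
    b = [scale * (b1[k] * b2[k] + b1[k] * u2 + b2[k] * u1) for k in range(r)]
    u = scale * u1 * u2
    S = r / u
    E = [bk * S for bk in b]  # fused evidence
    return b, u, E


# Modality 1 is confident about node 0; modality 2 carries little evidence.
b1, u1 = opinion([8.0, 1.0])
b2, u2 = opinion([0.5, 0.5])
b, u, E = fuse(b1, u1, b2, u2)
```

In this example the uninformative modality has high uncertainty, so the fused beliefs stay close to those of the confident modality while the fused uncertainty drops below both inputs.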
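After multimodal fusion, the cross-entropy loss is adapted to the Dirichlet distribution so that it plays the role of the maximum-likelihood estimate of the ordinary cross-entropy loss. For Dirichlet parameters $ {\boldsymbol{\beta}} $, the cross-entropy loss is computed as[43]:
$$ \begin{align} {\cal L}_{\text{CE}}({\boldsymbol{\beta}}_i) & = \sum_{j=1}^{M} y_{ij} \left(\psi\left(\sum_{k=1}^{M} \beta_{ik}\right) - \psi(\beta_{ij})\right) \end{align} $$ (10)
where $ \beta_{ij} $ is the Dirichlet parameter of the $ i $-th sample for the $ j $-th category, $ y_{ij} $ is the corresponding one-hot label, and $ \psi (\cdot) $ is the digamma function.
In addition to ensuring that the correct label receives more evidence than the other labels, a KL divergence term[44] is introduced to reduce the evidence assigned to incorrect labels. The final fusion loss for the Dirichlet parameters $ {\boldsymbol{\beta}} $ is:
$$ \begin{split} {\cal L}_{\text{fusion}}({\boldsymbol{\beta}}_{i}) =\; & {\cal L}_{\text{CE}}({\boldsymbol{\beta}}_{i}) \;+ \\ & \lambda_{t} KL \left[ D({\boldsymbol{p}}_i \mid \widetilde{{\boldsymbol{\beta}}}_{i}) \Vert D({\boldsymbol{p}}_i \mid \mathbf{1}) \right] \end{split} $$ (11)
where $ D({\boldsymbol{p}}_i \mid {\boldsymbol{\beta}}_{i}) $ is the multinomial opinion formed from the Dirichlet parameters $ {\boldsymbol{\beta}}_{i} $ and $ {\boldsymbol{p}}_i $ is the vector of class probabilities on the simplex. $ \widetilde{{\boldsymbol{\beta}}}_i = {\boldsymbol{y}}_{i} + (1 - {\boldsymbol{y}}_{i}) \odot {\boldsymbol{\beta}}_{i} $, with $ \odot $ the element-wise product, gives the Dirichlet parameters with the evidence of the ground-truth class removed, so the KL term penalizes only the evidence assigned to incorrect labels. $ \lambda_{t} = \min(1,\; {t}/{\lambda}) $ grows gradually from 0 to 1 to reduce the influence of the KL divergence early in training, where $ t $ is the current training epoch and $ \lambda $ is a hyperparameter that limits the growth rate.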
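As a sketch of the cross-entropy term in Eq. (10) and the annealing weight $ \lambda_t $, the code below uses only the Python standard library. Because the standard library has no digamma function, a textbook recurrence plus asymptotic series is used as a stand-in; this approximation, the function names and the sample values are assumptions, and the KL term of Eq. (11) is left out.

```python
import math


def digamma(x):
    """Approximate the digamma function for x > 0.

    Shifts x upward with psi(x) = psi(x + 1) - 1/x, then applies the
    asymptotic series ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6).
    """
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def dirichlet_ce(beta, y):
    """Eq. (10) for one sample: sum_j y_j * (psi(sum_k beta_k) - psi(beta_j))."""
    total = digamma(sum(beta))
    return sum(yj * (total - digamma(bj)) for yj, bj in zip(y, beta))


def kl_weight(t, lam):
    """Annealing coefficient lambda_t = min(1, t / lambda)."""
    return min(1.0, t / lam)


y = [1, 0]                                # one-hot label, class 0 is correct
loss_good = dirichlet_ce([9.0, 1.0], y)   # much evidence for the correct class
loss_bad = dirichlet_ce([1.0, 9.0], y)    # much evidence for the wrong class
```

The loss shrinks as evidence accumulates on the labeled class, and `kl_weight` keeps the KL penalty small during the first `lam` epochs.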
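2.3 Decision-tree inference
Fusing the decision trees $ T^{i} $ of the modalities yields a decision tree $ \widehat{T} $ that can be used for inference. Studies show that soft inference is more accurate than hard inference[45]; the soft inference process is shown in Fig. 5. Under the soft inference rule, for a decision tree $ T $, a Softmax is computed over the evidence of all children of node $ v $:
$$ \begin{align} s_{k_i} = \frac{\exp e_{k_i}}{ \sum_{k_j \in \text{child}_v} \exp e_{k_j}} \end{align} $$ (12)
where $ k_i \in \text{child}_{v} $ is a child of $ v $, $ e_{k_i} $ is the evidence of node $ k_i $, and $ s_{k_i} $ is the single-step transition probability from node $ v $ to its child $ k_i $. The predicted probability $ p_k $ of each node is the product of the transition probabilities of all nodes $ \cal V $ on the path from the root to node $ k $:
$$ \begin{align} p_{k} = \prod_{v \in \cal V} s_v \end{align} $$ (13)
The inference loss $ {\cal L}_{\text{infer}} $ of a decision tree $ T $ is computed from the leaf probabilities $ {\boldsymbol{p}}^{\prime} $ and the evidence $ {\boldsymbol{E}}^{\prime} $:
$$ {\cal L}_{\text{infer}}(T) = {\cal L}_{\text{CE}} ({\boldsymbol{p}}^{\prime} + {\mathbf{1}}) + {\cal L}_{\text{fusion}} (S({\boldsymbol{E}}^{\prime}) + \mathbf{1}) $$ (14)
where $ S(\cdot) $ is the Softplus function, and $ {\boldsymbol{p}}^{\prime} + \mathbf{1} $ and $ S({\boldsymbol{E}}^{\prime}) + \mathbf{1} $ are the Dirichlet parameters corresponding to $ {\boldsymbol{p}}^{\prime} $ and $ S({\boldsymbol{E}}^{\prime}) $.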
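The soft inference of Eqs. (12) and (13) can be sketched on a small tree. The code reads Eq. (12) as a Softmax over the evidence of sibling nodes that yields transition probabilities, and multiplies them along each root-to-node path; the tree, the evidence values and the dictionary-based encoding are hypothetical.

```python
import math


def soft_inference(children, evidence, root=0):
    """Return the probability of every node under soft inference.

    children maps a node to the list of its child nodes; evidence maps a
    non-root node to its evidence value. Sibling evidence is turned into
    transition probabilities with a Softmax, and each node's probability is
    the product of the transition probabilities on its path from the root.
    """
    prob = {root: 1.0}
    stack = [root]
    while stack:
        v = stack.pop()
        kids = children.get(v, [])
        if not kids:
            continue
        m = max(evidence[k] for k in kids)  # subtract the max for stability
        exps = [math.exp(evidence[k] - m) for k in kids]
        z = sum(exps)
        for k, ex in zip(kids, exps):
            prob[k] = prob[v] * ex / z
            stack.append(k)
    return prob


# Hypothetical tree: root 0 -> {1, 2}, node 1 -> {3, 4}; leaves are 2, 3, 4.
children = {0: [1, 2], 1: [3, 4]}
evidence = {1: 2.0, 2: 0.0, 3: 1.0, 4: 1.0}
p = soft_inference(children, evidence)
```

Because every internal node distributes its probability over its children, the leaf probabilities sum to one, and the nodes of any single level satisfy Eq. (4) whenever all leaves lie at that level or below it.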
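There are $ N+1 $ decision trees in total: the $ N $ trees $ T^{i} $ of the individual modalities and the fused tree $ \widehat{T} $. The final attribute inference loss is the sum of the regularization loss $ {\cal L}_{\ell_1} $ and the mean inference loss of all decision trees:
$$ \begin{align} {\cal L}_{\text{attribute}} = {\cal L}_{\ell_1} + \frac{1}{N+1} \left[ {\cal L}_{\text{infer}}(\widehat{T}) + \sum_{i=1}^{N} {\cal L}_{\text{infer}}(T^i) \right] \end{align} $$ (15)
2.4 Prototype constraint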
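Each category inherits all attributes of its direct parent and has its own unique attributes; together they form the prototype of the category. A category prototype is therefore the sum of its parent's prototype and its own unique attributes. Since the parent's prototype is itself the sum of an inherited part and a unique part, a category prototype can finally be expressed as the sum of the unique attributes of all its ancestors and its own unique attributes.
Let $ \boldsymbol{U} \in [0,\; +\infty)^{M \times D} $ and $ \boldsymbol{P} \in [0,\; +\infty)^{M \times D} $ denote the unique attributes and the prototypes, related by:
$$ \begin{align} \boldsymbol{P} = {\boldsymbol{H}} \boldsymbol{U} \end{align} $$ (16)
where $ \boldsymbol{U} $ is a trainable parameter. The prototype constraint helps the model extract attributes. The prototype of the $ i $-th category is used as the attribute vector of a prototype decision tree $ T_{p_{i}} = \{ \boldsymbol{P}_{i},\; {\boldsymbol{W}},\; {\boldsymbol{H}} \} $, and the prototype loss, similar to Eq. (14), is:
$$ \begin{split} {\cal L}_{\text{prototype}}(T_{p_{i}}) =\; & {\cal L}_{\text{CE}} ({\boldsymbol{p}}^{\prime} + \mathbf{1})\; - \\ & \log\left(\frac{\exp{\boldsymbol{P}_{i}\boldsymbol{W}_{\cdot i}}}{ \sum\limits_{k \in \text{leaf}} \exp{\boldsymbol{P}_{i}\boldsymbol{W}_{\cdot k}}}\right) \end{split} $$ (17)
The difference is that the multimodal fusion loss $ {\cal L}_{\text{fusion}} $ is replaced by a Softmax classification loss.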
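The prototype relation of Eq. (16) can be illustrated as follows. The text describes a prototype as the unique attributes of all ancestors plus the category's own unique attributes, whereas Eq. (3) marks ancestors only, so this sketch adds each category's own row of $ \boldsymbol{U} $ explicitly. That reading is an assumption, as are the three-node chain and the values used.

```python
def prototypes(H, U):
    """Prototype of each category from the hierarchy H and unique attributes U.

    H[i][j] = 1 when j is an ancestor of i (Eq. (3)). Each prototype is the
    category's own unique attributes plus those of all its ancestors; adding
    the own row explicitly is an assumption of this sketch.
    """
    P = []
    for i in range(len(H)):
        row = list(U[i])
        for j in range(len(H)):
            if H[i][j]:
                row = [p + u for p, u in zip(row, U[j])]
        P.append(row)
    return P


# Hypothetical chain: node 0 is the root, 1 its child, 2 its grandchild.
H = [[0, 0, 0],
     [1, 0, 0],
     [1, 1, 0]]
U = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 3.0]]
P = prototypes(H, U)
print(P)  # [[1.0, 0.0, 0.0], [1.0, 2.0, 0.0], [1.0, 2.0, 3.0]]
```

Each descendant accumulates the attributes of the categories above it, which is the inheritance behaviour described for the category hierarchy.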
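Finally, the loss function of the whole model is the sum of the attribute inference loss and the prototype loss:
$$ \begin{align} {\cal L}_{\text{overall}} & = {\cal L}_{\text{attribute}} + {\cal L}_{\text{prototype}}(T_{p_{y}}) \end{align} $$ (18)
where $ y $ is the class label of the sample.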
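3. Experiments
This section evaluates the model on three multimodal image datasets, NYUDv2[46], SUN RGB-D[47] and RGB-NIR[48]. It describes how the hyperparameters are chosen, verifies the function of each part of the model, and compares the model with previous methods to show that the proposed method is both accurate and interpretable.
3.1 Experimental setup
3.1.1 Datasets
Multimodal scene classification with RGB-D and RGB-NIR inputs is carried out on the NYUDv2, SUN RGB-D and RGB-NIR datasets. NYUDv2 contains 1449 aligned pairs of visible-light and depth images in 27 scenes, and the experiments use the 7 categories that contain more than 50 image pairs. Similarly, SUN RGB-D, which contains 10335 RGB-D image pairs, is reorganized into 9585 image pairs in 20 categories. RGB-NIR contains 476 aligned pairs of RGB and near-infrared images in 9 scenes. A 3-level hierarchy is specified manually for each dataset according to the scene types. To reduce randomness, every dataset is randomly split into 10 parts for 10-fold cross-validation, and the experiments use a fixed random seed.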
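A seeded 10-fold split of the kind described above can be sketched with the standard library alone. The seed value, the round-robin fold assignment and the function name are assumptions made for illustration; the source does not state how the folds were generated.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    The indices 0..n-1 are shuffled once with a fixed seed so that the
    split is reproducible, then dealt into k folds round-robin.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# 476 is the RGB-NIR sample count quoted in the text.
splits = list(k_fold_indices(476, k=10, seed=0))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Every sample appears in exactly one test fold, so averaging the ten test accuracies uses each image once for evaluation.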
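3.1.2 Convolutional neural network setup
For attribute extraction, the proposed model does not depend on a specific backbone architecture; any convolutional neural network can serve as the backbone after simple modification. In this work a residual network, ResNet[49], extracts the features. For example, the feature map of ResNet-18 has size $ 512 \times 7 \times 7 $, which means that ResNet-18 can extract $ D = 512 $ attributes. To avoid the "dying" neuron problem that the ReLU function may cause, leaky ReLU[50] with a negative slope of 0.01 is used as the activation function.
Depth and near-infrared images are converted from one channel to three channels by replication, and all images are resized to $ 256 \times 256 $ pixels. After a series of image augmentations they are fed into the convolutional neural network, which outputs a one-dimensional vector of length $ 512 $ or $ 2048 $ representing the attribute strengths. The 10 folds of the cross-validation are trained in parallel on 10 Tesla P40 GPUs, using the Adam optimizer with a learning rate of $ 3\times{10}^{-4} $ and a weight decay of $ 10^{-5} $. The initial weights $ \gamma $ of the BatchNorm layers follow a uniform distribution between $ 0 $ and $ 1 $.
3.2 Hyperparameter settings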
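As discussed for Eqs. (1) and (2), BatchNorm exchange is controlled by two hyperparameters: the threshold $ \theta $ and the regularization loss weight $ \eta $. In general, as $ \theta $ and $ \eta $ increase, the regularization loss and the probability of an exchange increase as well. To find suitable settings, experiments on SUN RGB-D, the dataset with the most samples, explore the relationship between the hyperparameters and accuracy.
Combinations of $ \theta $ between $ {10}^{-5} $ and $ {10}^{-1} $ and $ \eta $ between $ 5\times {10}^{-6} $ and $ {10}^{-2} $ are enumerated, and the accuracy of the model is recorded for each combination. After Gaussian fitting to remove jitter, the results are drawn as the heat map in Fig. 6, where a lighter color means higher accuracy. The heat map shows higher accuracy in the region $ \theta \in [{10}^{-2},\; {10}^{-1}] $, $ \eta \in [{10}^{-5},\; {10}^{-4}] $, so the following experiments use the threshold $ \theta = {10}^{-2} $ and the regularization loss weight $ \eta = 2\times{10}^{-5} $.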
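3.3 Model accuracy
3.3.1 Ablation study
Apart from attribute extraction by the backbone, the model has three main modules: BatchNorm channel exchange, decision-tree construction and fusion, and decision-tree inference. An ablation study verifies these modules by applying them step by step and analyzing their effect on performance; the results are given in Table 1. In the baseline model without tree inference, tree fusion and channel exchange, the class probabilities of the modalities are averaged as the fusion result and used to compute the loss.
Table 1 Top-1 accuracies with different components on NYUDv2, SUN RGB-D and RGB-NIR (%)

| Tree inference | Tree fusion | Channel exchange | NYUDv2 RGB | NYUDv2 Depth | NYUDv2 Fusion | SUN RGB-D RGB | SUN RGB-D Depth | SUN RGB-D Fusion | RGB-NIR RGB | RGB-NIR NIR | RGB-NIR Fusion |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $ \times $ | $ \times $ | $ \times $ | 43.08 | 59.26 | 71.98 | 52.10 | 38.49 | 62.19 | 58.33 | 52.08 | 77.78 |
| $ \times $ | $ \times $ | √ | $ 47.74^* $ | $ 59.47^* $ | 72.07 | $ 54.29^* $ | $ 47.05^* $ | 66.28 | $ 62.23^* $ | $ 53.76^* $ | 80.43 |
| √ | $ \times $ | $ \times $ | 46.28 | 57.68 | 72.41 | 50.98 | 36.00 | 58.99 | 58.68 | 53.47 | 79.17 |
| √ | √ | $ \times $ | 61.43 | 61.00 | 74.40 | 59.96 | 51.62 | 66.16 | 71.08 | 66.45 | 84.71 |
| √ | √ | √ | $ 71.14^* $ | $ 70.99^* $ | 74.74 | $ 66.76^* $ | $ 66.37^* $ | 68.01 | $ 78.85^* $ | $ 77.37^* $ | 85.54 |

Note: * marks the accuracy of a single modality after channel exchange has introduced data from the other modalities; bold marks the highest single-modality or fused accuracy.
To improve the consistency between modalities, channel exchange replaces low-quality information in one modality with data from the other modalities, which improves the model's ability to represent multimodal data. Compared with the baseline, applying only the channel exchange module raises the accuracy on all three datasets, by more than 4% on SUN RGB-D. On the model that already uses decision-tree inference and fusion, adding this module further raises the fused accuracy on all datasets, by about 2% on SUN RGB-D, which confirms the effectiveness of channel exchange.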
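Decision-tree inference makes the model easier to explain. The layer-by-layer inference path shows how the model moves from the root through the parent categories to the final category, which makes the decision process more transparent. Interpretability and accuracy generally trade off against each other: the decision tree makes learning and inference harder, and accuracy may drop when the hierarchy becomes complex. For example, applying only decision-tree inference on SUN RGB-D lowers the accuracy by 3.2%. The hierarchical information provided by the decision tree, however, is what allows the model to learn attributes and prototypes accurately.
The decision-tree fusion module combines the decision trees of several modalities. It fuses them layer by layer with Dempster-Shafer evidence theory, adapts dynamically to the data quality of each modality, and produces one fused tree. This further improves classification accuracy and markedly improves the fused accuracy, by 1.99%, 7.17% and 5.54% on the three datasets. The result indicates that decision-tree fusion captures the correlation between modalities better when combining their trees and thereby improves the overall classification accuracy.
Note that because channel exchange brings data from other modalities into each modality, single-modality accuracies cannot be compared directly. The affected values in Table 1 and Table 2 are marked with an asterisk.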
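Table 2 Top-1 accuracies with different methods on NYUDv2, SUN RGB-D and RGB-NIR (%)

| Method | Interpretable | NYUDv2 RGB | NYUDv2 Depth | NYUDv2 Fusion | SUN RGB-D RGB | SUN RGB-D Depth | SUN RGB-D Fusion | RGB-NIR RGB | RGB-NIR NIR | RGB-NIR Fusion |
|---|---|---|---|---|---|---|---|---|---|---|
| ViT-S-16[51] | $ \times $ | 54.95 | 62.56 | — | 59.23 | 49.43 | — | 74.44 | 66.32 | — |
| ResNet-18[49] | $ \times $ | 65.28 | 65.93 | — | 66.04 | 57.85 | — | 78.83 | 75.70 | — |
| CBCL[52] | $ \times $ | 56.87 | 63.20 | 73.85 | 50.74 | 43.59 | 65.78 | 74.23 | 62.91 | 81.72 |
| TMC[19] | $ \times $ | 60.14 | 62.19 | 74.57 | 60.89 | 52.95 | 66.69 | 72.76 | 68.77 | 84.29 |
| TMNR[53] | $ \times $ | 56.61 | 64.50 | 74.10 | 60.60 | 53.53 | 66.30 | 69.50 | 65.26 | 82.20 |
| dNDF[54] | √ | 61.86 | 65.76 | — | 64.78 | 57.30 | — | 78.61 | 72.11 | — |
| NBDT[27] | √ | 65.28 | 62.85 | — | 66.20 | 57.93 | — | 74.24 | 74.22 | — |
| HCN[20] | √ | 62.20 | 63.18 | — | 61.91 | 53.03 | — | 72.92 | 68.75 | — |
| Ours | √ | $ 71.14^* $ | $ 70.99^* $ | 74.74 | $ 66.76^* $ | $ 66.37^* $ | 68.01 | $ 78.85^* $ | $ 77.37^* $ | 85.54 |

Note: * marks the accuracy of a single modality after channel exchange has introduced data from the other modalities; bold marks the highest single-modality or fused accuracy.
3.3.2 Comparison experiments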
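The method is compared with the residual network ResNet-18, the Transformer backbone ViT-S-16 optimized for small datasets, three single-modality interpretable models (dNDF, NBDT, HCN) and three multimodal fusion methods (CBCL, TMC, TMNR); the results are given in Table 2. On small datasets the ResNet architecture outperforms the Transformer-based ViT architecture: even with the optimization for small datasets, ViT-S-16 is less accurate than ResNet-18 on all three datasets.
Multimodal data carry richer information than single-modality data and complement each other, which improves classification accuracy. CBCL[52] is an RGB-D clustering method, while TMC[19] and TMNR[53] are multimodal evidence fusion methods; all of them are more accurate than the single-modality methods. Evidence fusion does not require every modality to provide high-quality data and is therefore more accurate than the clustering method. Compared with the ResNet-18 backbone, our method improves the accuracy on NYUDv2, SUN RGB-D and RGB-NIR by 8.81%, 1.97% and 6.71%.
Current research focuses mainly on single-modality interpretability. dNDF[54] and NBDT[27] are decision-tree-based explanation methods; NBDT uses a neural network as the classifier and is therefore more accurate. HCN[20] is a prototype learning method that is less accurate but more interpretable. Introducing interpretability usually reduces accuracy, but because our method exploits complementary multimodal information it is more accurate than the single-modality methods: compared with the most accurate single-modality interpretable model on each dataset, accuracy improves by 8.98%, 1.81% and 6.93%. The hierarchical information and the prototype constraint allow the method to keep good classification accuracy while remaining interpretable. Compared with the best non-interpretable multimodal fusion method, accuracy improves by 0.17%, 1.32% and 1.25%, which is on par with non-interpretable models.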
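3.3.3 Pretrained backbones
Any CNN backbone can be used in the proposed model, and the backbone can also be pretrained on other datasets. As shown in Table 3, a ResNet-18 backbone pretrained on ImageNet improves the accuracy on NYUDv2, SUN RGB-D and RGB-NIR by 7.19%, 6.95% and 5.25% over the backbone in Table 2 that was not pretrained. This indicates that a pretrained backbone extracts the visual attributes of an image more accurately.
Table 3 Top-1 accuracies with different pretrained backbones on NYUDv2, SUN RGB-D and RGB-NIR (%)

| Backbone | NYUDv2 | SUN RGB-D | RGB-NIR |
|---|---|---|---|
| ResNet-18 | 80.90 | 73.50 | 90.15 |
| ResNet-34 | 81.58 | 73.87 | 90.15 |
| ResNet-50 | 81.92 | 73.88 | 90.58 |
| ResNet-101 | 81.93 | 74.96 | 90.79 |

A more complex backbone can learn richer feature representations and thus improve classification accuracy. For example, the pretrained ResNet-101 can extract 2048 attributes; compared with ResNet-18, which can extract only 512 attributes, its accuracy on NYUDv2, SUN RGB-D and RGB-NIR is higher by 1.03%, 1.46% and 0.64%.
3.4 Model interpretability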
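The interpretability offered by the method rests on converting the input images into a set of attributes that can be visualized and on showing the decision-tree inference path, as illustrated in Fig. 2.
3.4.1 Visualization
To visualize the attributes learned by the model, each attribute is represented by a group of images and their heat maps, as shown in Fig. 7; the images in one group share a common visual attribute. Specifically, to obtain the image set for the $ k $-th attribute in the $ i $-th modality, all images are sorted by the attribute strength ${\boldsymbol{A}}^i_k $ in descending order and the first few are selected; what these images have in common reflects the attribute. Grad-CAM then generates heat maps of the group for that specific attribute, which show more directly what the image set has in common.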
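The selection of representative images for one attribute amounts to a sort by attribute strength. A minimal sketch, with a made-up strength table of four images and two attributes and an illustrative function name:

```python
def top_images(strengths, k, n):
    """Indices of the n images with the largest strength of attribute k.

    strengths[i] is the attribute vector A of image i.
    """
    order = sorted(range(len(strengths)), key=lambda i: strengths[i][k], reverse=True)
    return order[:n]


# Hypothetical attribute strengths of four images for D = 2 attributes.
strengths = [
    [0.1, 2.0],
    [0.9, 0.0],
    [0.5, 1.0],
    [0.7, 3.0],
]
print(top_images(strengths, 0, 2))  # [1, 3]
print(top_images(strengths, 1, 2))  # [3, 0]
```

The selected indices would then be passed to a heat-map method such as Grad-CAM, which is outside the scope of this sketch.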
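For a group of images or a category prototype, representative attributes can be found from the corresponding attribute strengths $ {\boldsymbol{A}} $ or the category prototype $ \boldsymbol{P}_i $. As in the visualization of attributes, the attribute strengths ${\boldsymbol{A}}_k $ or prototype entries $ {\boldsymbol{p}}_{ik} $ are sorted in descending order and the strongest attributes are chosen as representatives; each attribute is shown by the image group and heat maps extracted in the previous step.
Once the decision tree has been built from the hierarchy, the inference process can be shown for each single modality and for the fused modality, and visualizing it helps verify the data quality of each modality. In the example of Fig. 8, the input is first classified as a campus scene and then as a study room. The input depth image contains no usable information, which makes scene classification difficult and yields a wrong result with high probability. Dempster-Shafer evidence theory and subjective logic give the depth modality an uncertainty of $ u=1.62 $ and automatically lower its weight in the fusion, which relies more on the visible-light modality with the lower uncertainty $ u=0.34 $, so the final classification is correct.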
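3.4.2 Insertion/deletion test
To verify that the extracted attributes are correct, insertion/deletion tests[55] are performed on the attributes, as shown in Fig. 9. The insertion test adds pixels to a blank image; if these pixels capture what is distinctive and important for the class of the image, the prediction accuracy rises quickly. Likewise, the deletion test removes pixels from the complete image; if these pixels are important, the prediction accuracy drops quickly.
Grad-CAM generates a heat map for each attribute. In the insertion test, a given fraction of the image pixels is inserted into the blank image in descending order of heat-map value. In the deletion test, a given fraction of the pixels is deleted in descending order of heat-map value. The area under the accuracy curve (AUC) is then compared; the results are given in Table 4. Deleting the strongest attribute gives a lower AUC than deleting the weakest attribute, and inserting the strongest attribute into a blank image gives a higher AUC than inserting the weakest one, which shows that the model assesses attribute strength correctly and can therefore find the attributes present in an image. Because each sample consists of several attributes, random deletion gives a lower AUC than deleting the strongest attribute in these experiments, as it may destroy several attributes at once.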
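The mechanics of the insertion and deletion curves can be shown on a toy example. The sketch treats an image as a flat list of pixel indices and replaces the trained classifier with a hypothetical `score` function that counts how many of two "important" pixels are visible; the real test measures model accuracy over a dataset, so the numbers here illustrate only the procedure.

```python
def auc(ys):
    """Trapezoidal area under a curve sampled at evenly spaced points on [0, 1]."""
    n = len(ys) - 1
    return sum((ys[i] + ys[i + 1]) / 2.0 for i in range(n)) / n


def ranked_pixels(heat):
    """Pixel indices sorted by heat-map value, strongest first."""
    return sorted(range(len(heat)), key=lambda i: heat[i], reverse=True)


def insertion_curve(heat, score, steps):
    """Score after inserting the top s/steps fraction of pixels into a blank image."""
    order = ranked_pixels(heat)
    return [score(set(order[: round(len(order) * s / steps)])) for s in range(steps + 1)]


def deletion_curve(heat, score, steps):
    """Score after deleting the top s/steps fraction of pixels from the full image."""
    order = ranked_pixels(heat)
    everything = set(range(len(heat)))
    return [score(everything - set(order[: round(len(order) * s / steps)]))
            for s in range(steps + 1)]


# Stand-in for the classifier: the fraction of two important pixels still visible.
important = {0, 1}
def score(visible):
    return len(visible & important) / len(important)


heat = [0.9, 0.8, 0.1, 0.2]  # the heat map ranks the important pixels first
ins_auc = auc(insertion_curve(heat, score, steps=4))
del_auc = auc(deletion_curve(heat, score, steps=4))
print(ins_auc, del_auc)  # 0.75 0.25
```

A heat map that ranks the important pixels first gives a high insertion AUC and a low deletion AUC, which is the pattern reported for the strongest attribute.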
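Table 4 AUC of different attributes inserted or deleted in NYUDv2, SUN RGB-D and RGB-NIR datasets

| Dataset | Strongest attribute, insertion | Strongest attribute, deletion | Weakest attribute, insertion | Weakest attribute, deletion | Random, insertion | Random, deletion |
|---|---|---|---|---|---|---|
| NYUDv2 | 0.619 | 0.209 | 0.509 | 0.299 | 0.351 | 0.121 |
| SUN RGB-D | 0.601 | 0.300 | 0.463 | 0.380 | 0.284 | 0.168 |
| RGB-NIR | 0.636 | 0.380 | 0.549 | 0.466 | 0.355 | 0.207 |

To verify that the extracted attributes are representative, we also examine how many attributes are enough to represent a sample. The attributes of a sample are sorted by strength, only the $ k $ strongest are kept and the strengths of all others are set to 0, and the accuracy curve is plotted in Fig. 10. The results show that keeping only 10 attributes reaches 91.1%, 81.3% and 83.7% of the accuracy of the original model on the three datasets, which again indicates that the model extracts representative attributes effectively.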
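The truncation used in this experiment, keeping the $ k $ strongest attributes and zeroing the rest, can be written as a short helper. The function name and the sample vector are illustrative only.

```python
def keep_top_k(A, k):
    """Keep the k strongest attribute strengths and set all others to 0."""
    keep = set(sorted(range(len(A)), key=lambda i: A[i], reverse=True)[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(A)]


A = [0.2, 3.0, 1.5, 0.0, 2.1]
print(keep_top_k(A, 2))  # [0.0, 3.0, 0.0, 0.0, 2.1]
```

The truncated vector would then be passed to the decision tree in place of the full attribute vector to measure how accuracy changes with `k`.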
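4. Conclusion
This paper proposes an attribute-based multimodal interpretable classification method. By extracting attributes with backbone networks, building and fusing decision trees, and performing inference in the decision tree, the method is interpretable and can show visually both the attributes of the input data and the inference process of the decision tree. It also performs well in comparison with single-modality interpretable methods and with non-interpretable multimodal methods. The method is a promising attempt at the interpretability problem of multimodal fusion: it keeps a high accuracy while providing clear explanations that help people understand the decision process of the model.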
-
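Table 1 Related surveys about adversarial machine learning

| Category | Title | Main content | Year |
|---|---|---|---|
| Machine learning models | SoK: Security and privacy in machine learning[15] | Analyzes the attack surface of machine learning models and systematically discusses the attacks they may suffer during training and inference, together with defenses | 2018 |
| Machine learning models | Wild patterns: Ten years after the rise of adversarial machine learning[7] | Systematically traces the evolution of adversarial machine learning, covering computer vision, network security and other fields | 2018 |
| Machine learning models | A survey on security threats and defensive techniques of machine learning: A data driven view[12] | Discusses adversarial attacks on and defenses of machine learning from a data-driven view | 2018 |
| Machine learning models | The security of machine learning in an adversarial setting: A survey[13] | Discusses attacks on machine learning in the training and inference/test phases in adversarial settings, and proposes security evaluation mechanisms and corresponding defense strategies | 2019 |
| Machine learning models | A taxonomy and survey of attacks against machine learning[14] | Discusses adversarial attacks on machine learning applied in different domains, mainly intrusion detection, spam filtering and visual detection | 2019 |
| Machine learning models | Security and privacy of machine learning models: A survey[16] | Systematically summarizes existing attack and defense research from the perspectives of data security, model security and model privacy | 2021 |
| Machine learning models | Progress and future challenges of security attacks and defense mechanisms in machine learning[11] | Classifies security and privacy attacks on machine learning by the location and timing of the attack, and introduces existing attack methods and defense mechanisms | 2021 |
| Deep learning models | Survey of attacks and defenses on edge-deployed neural networks[18] | Discusses attacks on and defenses of edge-deployed neural networks | 2019 |
| Deep learning models | Adversarial examples in modern machine learning: A review[19] | Discusses the generation of adversarial examples and defense techniques | 2019 |
| Deep learning models | A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and Interpretability[20] | Discusses the safety and interpretability of deep neural networks (DNN) | 2020 |
| Deep learning models | Survey on generating adversarial examples[21] | Surveys adversarial examples across three stages: prehistory, origin and development | 2020 |
| Machine learning privacy | Survey on privacy-preserving machine learning[17] | Focuses on privacy-preserving techniques for machine learning | 2020 |
| Machine learning privacy | A survey of privacy attacks in machine learning[22] | Discusses privacy attacks and protection techniques in machine learning | 2020 |
| Machine learning privacy | Survey on privacy preserving techniques for machine learning[23] | Focuses on privacy-preserving techniques for machine learning | 2020 |
| Computer vision | Threat of adversarial attacks on deep learning in computer vision: A survey[24] | Discusses attacks on and defenses of deep learning models in computer vision | 2018 |
| Computer vision | Adversarial machine learning in image classification: A survey towards the defender's perspective[25] | Studies adversarial machine learning in image classification from the defender's perspective | 2020 |
| Computer vision | Adversarial examples on object recognition: A comprehensive survey[26] | Discusses attack and defense issues of adversarial examples when neural networks are applied to vision | 2020 |
| Computer vision | Adversarial attacks on deep learning models of computer vision: A survey[27] | Discusses adversarial attacks on deep learning models in computer vision | 2020 |
| Natural language processing | Adversarial attacks on deep-learning models in natural language processing[28] | Discusses adversarial attacks on and defenses of deep learning models in natural language processing | 2020 |
| Biomedical field | Adversarial biometric recognition: A review on biometric system security from the adversarial machine-learning perspective[29] | The first discussion of biometric system security from the perspective of adversarial machine learning | 2015 |
| Biomedical field | Toward an understanding of adversarial examples in clinical trials[30] | Discusses adversarial examples in clinical trials based on deep learning models | 2018 |
| Biomedical field | Secure and robust machine learning for healthcare: A Survey[31] | Outlines the status, challenges and solutions of machine learning applications in healthcare from the perspective of adversarial machine learning | 2021 |
| Cyberspace defense | Adversarial attacks against intrusion detection systems: Taxonomy, solutions and open issues[32] | Discusses adversarial attacks on intrusion detection systems and countermeasures | 2013 |
| Cyberspace defense | Towards adversarial malware detection: Lessons learned from PDF-based attacks[33] | Discusses the adversarial attacks that machine-learning-based detectors of malicious Portable Document Format (PDF) files may suffer | 2019 |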
Table 2 A timeline of adversarial machine learning history
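| Year | Main content |
|---|---|
| 2004 | Dalvi et al.[42] and Lowd et al.[43-44] studied the adversarial problem in spam detection and showed that linear classifiers can be fooled by carefully crafted adversarial examples |
| 2006 | Barreno et al.[8] questioned, from a broader perspective, the suitability of machine learning models in adversarial environments and proposed feasible measures to eliminate or reduce these threats |
| 2007 | NeurIPS hosted the workshop Machine Learning in Adversarial Environments for Computer Security. In 2010, the journal Machine Learning published a special issue of the same name for the workshop[54] |
| 2008 | CCS hosted the first Workshop on Artificial Intelligence and Security (AISec), which was held continuously through 2020 |
| 2012 | The Dagstuhl Perspectives Workshop on Machine Learning Methods for Computer Security discussed the challenges and future research directions of adversarial learning and learning-based security techniques[55] |
| 2014 | KDD hosted a special forum on security and privacy |
| 2016 | AAAI hosted the workshop on Artificial Intelligence for Cyber Security (AICS), held annually thereafter until 2019 |
| 2017 | To promote research on adversarial examples, Google Brain organized the adversarial attacks and defenses competition at NeurIPS 2017 |
| 2018 | NeurIPS 2018 hosted the adversarial vision challenge to promote more robust machine vision models and more widely applicable adversarial attacks |
| 2018 | Yevgeniy et al.[6] wrote the book Adversarial Machine Learning, published by Morgan & Claypool |
| 2019 | Joseph et al.[5] wrote the book Adversarial Machine Learning, published by Cambridge University Press |
| 2019 | The paper Adversarial attacks on medical machine learning[56] was published in Science, pointing out that new vulnerabilities in medical machine learning call for new measures |
| 2019 | The paper Why deep-learning AIs are so easy to fool[57] was published in Nature, discussing the robustness of deep learning under adversarial attack |
| 2019 | KDD 2019 hosted the first workshop on adversarial learning methods for machine learning and data mining, which has been held twice so far |
| 2019 | Tsinghua University and Alibaba Security jointly launched the Security AI Challenger Program on the Tianchi competition platform, with 5 rounds so far. An AI and security workshop is also held at the end of each year and has been held twice so far |
| 2020 | KDD 2020 hosted the first Workshop on Deployable Machine Learning for Security Defense |
| 2021 | AAAI 2021 hosted the workshop Towards Robust, Secure and Efficient Machine Learning |

Note: data updated to February 8, 2021.
Table 3 Classification of attacks against machine learning based on threat model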
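| Adversary capability | Adversary goal: model integrity | Adversary goal: model availability | Adversary goal: privacy theft | Adversary knowledge |
|---|---|---|---|---|
| Test data | Evasion attack | — | Model extraction; model inversion; membership inference | White-box attack; black-box attack |
| Training data | Poisoning attack (backdoor attack) | Poisoning attack (油蛙攻击) | Model inversion; membership inference | White-box attack; black-box attack |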
Table 4 Typical adversarial attacks for cyberspace defense
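| Attack type | Attack method | References | Application domain | Characteristics |
|---|---|---|---|---|
| Evasion attack | Mimicry-based evasion | [42, 44, 64−66] | Spam detection | Mimicry attacks use heuristic algorithms that try to add benign features to malicious files or inject malicious features into benign files in order to evade detection |
| Evasion attack | Mimicry-based evasion | [67] | Traffic analysis | Same as above |
| Evasion attack | Mimicry-based evasion | [68] | Malware detection | Same as above |
| Evasion attack | Mimicry-based evasion | [62, 69−75] | Malicious PDF file classification | Same as above |
| Evasion attack | Gradient-based evasion | [75−77] | Malicious PDF file classification | Gradient-based evasion solves an optimization problem by gradient descent and applies fine-grained modifications to the input sample so as to minimize (maximize) the probability that it is classified as malicious (benign) |
| Evasion attack | Gradient-based evasion | [9, 78−79] | Malware detection | Same as above |
| Evasion attack | Gradient-based evasion | [63, 80] | Intrusion detection | Same as above |
| Evasion attack | Transfer-based evasion | [70, 81] | Malicious PDF file classification | Transfer-based evasion exploits the cross-model transferability of adversarial examples and applies to attack scenarios in which model gradients are unavailable |
| Evasion attack | Transfer-based evasion | [82−84] | Intrusion detection | Same as above |
| Evasion attack | Transfer-based evasion | [85] | XSS detection | Same as above |
| Evasion attack | Transfer-based evasion | [86] | Domain name generation | Same as above |
| Evasion attack | Transfer-based evasion | [87−89] | Malware detection | Same as above |
| Poisoning attack | Availability attack | [8, 44, 90−92] | Spam detection | Availability attacks aim to increase the classification error in the test phase and thereby cause denial of service |
| Poisoning attack | Availability attack | [93−94] | Intrusion detection | Same as above |
| Poisoning attack | Integrity attack | [95−96] | Anomaly detection | Integrity attacks aim to make the model misclassify a specific subset of malware |
| Poisoning attack | Integrity attack | [97−98] | Malware detection | Same as above |
| Privacy theft | Model extraction attack | [99] | — | Privacy theft mainly aims to steal information about the machine learning model or its training data |
| Privacy theft | Model inversion attack | [100−101] | — | Same as above |
| Privacy theft | Membership inference attack | [102−103] | — | Same as above |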
Table 5 Typical defense against adversarial attacks for cyberspace defense
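| Defense type | Defense measure | References | Application scenario | Summary |
|---|---|---|---|---|
| Evasion defense | Dimensionality reduction | [117−118] | Spam detection | Defends effectively against adversarial attacks, but the accuracy of the model on normal samples may decrease |
| Evasion defense | Dimensionality reduction | [118−119] | Malware detection | Same as above |
| Evasion defense | Robust optimization | [120−124] | Malware detection | The basic idea is that the model has "blind spots" after training; crafted adversarial examples are injected into the training set to improve the generalization of the model |
| Evasion defense | Defensive distillation | [125−129] | Malware detection | Has difficulty defending against the C&W attack |
| Poisoning defense | Data sanitization | [130] | Anomaly detection | Treats poisoning attacks as outliers |
| Poisoning defense | Data sanitization | [131−136] | — | Same as above |
| Poisoning defense | Game theory | [137−141] | Spam detection | Applies game-theoretic ideas to handle poisoning attacks on spam detection |
| Privacy protection | Differential privacy | [142−149] | — | The difficulty lies in balancing model utility against the privacy protection achieved |
| Privacy protection | Model compression | [109] | — | Can be used to mitigate membership inference attacks |
| Privacy protection | Model ensemble | [150] | — | The main idea is to set loss gradients below a given threshold to zero; can be used to defend against model extraction attacks |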
[1] 搜狐. 美国东海岸断网事件主角Dyn关于DDoS攻击的后果. [Online], available: https://www.sohu.com/a/117078005_257305, October 25, 2016 [2] 搜狐. WannaCry勒索病毒事件分析. [Online], available: https://www.sohu.com/a/140863167_244641, May 15, 2017 [3] 彭志艺, 张衠, 惠志斌, 覃庆玲. 中国网络空间安全发展报告(2019版). 北京: 社会科学文献出版社, 2019.Peng Zhi-Yi, Zhang Zhun, Hui Zhi-Bin, Qin Qing-Ling. Annual Report on the Development of Cyberspace Security in China (2019). Beijing: Social Sciences Academic Press, 2019. [4] 张蕾, 崔勇, 刘静, 江勇, 吴建平. 机器学习在网络空间安全研究中的应用. 计算机学报, 2018, 41 (9): 1943-1975 doi: 10.11897/SP.J.1016.2018.01943Zhang Lei, Cui Yong, Liu Jing, Jiang Yong, Wu Jian-Ping. Application of machine learning in cyberspace security research. Chinese Journal of Computers, 2018, 41(9): 1943-1975 doi: 10.11897/SP.J.1016.2018.01943 [5] Joseph A D, Nelson B, Rubinstein B I P, Tygar J D. Adversarial Machine Learning. Cambridge: Cambridge University Press, 2019. [6] Yevgeniy V, Murat K. Adversarial Machine Learning. San Rafael: Morgan & Claypool Publishers, 2018. [7] Biggio B, Roli F. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition, 2018, 84: 317-331 doi: 10.1016/j.patcog.2018.07.023 [8] Barreno M, Nelson B, Sears R, Joseph A D, Tygar J D. Can machine learning be secure? In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security. Taipei, China: ACM, 2006. 16–25 [9] Grosse K, Papernot N, Manoharan P, Backes M, Mcdaniel P. Adversarial examples for malware detection. In: Proceedings of the 22nd European Symposium on Research in Computer Security. Oslo, Norway: Springer, 2017. 62−79 [10] Biggio B, Fumera G, Roli F. Pattern recognition systems under attack: Design issues and research challenges. International Journal of Pattern Recognition and Artificial Intelligence, 2014, 28(7): Article No. 1460002 doi: 10.1142/S0218001414600027 [11] 李欣姣, 吴国伟, 姚琳, 张伟哲, 张宾. 机器学习安全攻击与防御机制研究进展和未来挑战. 
软件学报, 2021, 32(2): 406-423Li Xin-Jiao, Wu Guo-Wei, Yao Lin, Zhang Wei-Zhe, Zhang Bin. Progress and future challenges of security attacks and defense mechanisms in machine learning. Journal of Software, 2021, 32(2): 406−423 [12] Liu Q, Li P, Zhao W, Cai W, Yu S, Leung V C M. A survey on security threats and defensive techniques of machine learning: A data driven view. IEEE Access, 2018, 6: 12103-12117 doi: 10.1109/ACCESS.2018.2805680 [13] Wang X, Li J, Kuang X, Tan Y-A. The security of machine learning in an adversarial setting: A survey. Journal of Parallel Distributed Computing, 2019, 130: 12-23 doi: 10.1016/j.jpdc.2019.03.003 [14] Pitropakis N, Panaousis E, Giannetsos T, Anastasiadis E, Loukas G. A taxonomy and survey of attacks against machine learning. Computer Science Review, 2019, 34: Article No. 100199 doi: 10.1016/j.cosrev.2019.100199 [15] Papernot N, Mcdaniel P, Sinha A, Wellman M P. Sok: Security and privacy in machine learning. In: Proceedings of the 3rd IEEE European Symposium on Security and Privacy. London, UK: IEEE, 2018. 399−414 [16] 纪守领, 杜天宇, 李进锋, 沈超, 李博. 机器学习模型安全与隐私研究综述. 软件学报, 2021, 32(1): 41-67Ji Shou-Ling, Du Tian-Yu, Li Jin-Feng, Shen Chao, Li Bo. Security and privacy of machine learning models: A survey. Journal of Software, 2021, 32(1): 41-67 [17] 刘俊旭, 孟小峰. 机器学习的隐私保护研究综述. 计算机研究与发展, 2020, 57(2): 346-362 doi: 10.7544/issn1000-1239.2020.20190455Liu Jun-Xu, Meng Xiao-Feng. Survey on privacy-preserving machine learning. Journal of Computer Research and Development, 2020, 57(2): 346-362. doi: 10.7544/issn1000-1239.2020.20190455 [18] Isakov M, Gadepally V, Gettings K, Kinsy M. Survey of attacks and defenses on edge-deployed neural networks. In: Proceedings of the 2019 IEEE High Performance Extreme Computing Conference. Waltham, MA, USA: IEEE, 2019. 1−8 [19] Wiyatno R, Xu A, Dia O, Berker A D. Adversarial examples in modern machine learning: A review. ArXiv: 1911.05268, 2019. [20] Huang X, Kroening D, Ruan W, Sun Y, Thamo E, Wu M, et al. 