Feature Transformation and Metric Networks for Few-shot Learning

Wang Duo-Rui, Du Yang, Dong Lan-Fang, Hu Wei-Ming, Li Bing

Citation: Wang Duo-Rui, Du Yang, Dong Lan-Fang, Hu Wei-Ming, Li Bing. Feature transformation and metric networks for few-shot learning. Acta Automatica Sinica, 2024, 50(7): 1305−1314 doi: 10.16383/j.aas.c210903
Funds: Supported by the National Key Research and Development Program of China (2018AAA0102802), the National Natural Science Foundation of China (62036011, 62192782, 61721004), and the Key Research Program of Frontier Sciences of the Chinese Academy of Sciences (QYZDJ-SSW-JSC040)
    Author Bio:

    WANG Duo-Rui  He received his master's degree from University of Science and Technology of China in 2021. His research interests cover few-shot learning and object detection. E-mail: wangduor@mail.ustc.edu.cn

    DU Yang  He received his Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2019. His research interests cover action recognition and medical image processing. E-mail: jingzhou.dy@alibaba-inc.com

    DONG Lan-Fang  Associate professor at University of Science and Technology of China. She received her master's degree from University of Science and Technology of China in 1994. Her research interests cover intelligent image and video analysis, knowledge graphs and dialogue systems, and numerical simulation and 3D reconstruction. E-mail: lfdong@ustc.edu.cn

    HU Wei-Ming  Professor at the Institute of Automation, Chinese Academy of Sciences. He received his Ph.D. degree from Zhejiang University in 1998. His research interests cover visual motion analysis, recognition of objectionable web information, and network intrusion detection. Corresponding author of this paper. E-mail: wmhu@nlpr.ia.ac.cn

    LI Bing  Professor at the Institute of Automation, Chinese Academy of Sciences. He received his Ph.D. degree from Beijing Jiaotong University in 2009. His research interests cover web content security and intelligent image signal processing. E-mail: bing.li@ia.ac.cn

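  • Abstract: In few-shot classification tasks, the number of training samples available for each class is very limited. As a result, samples of the same class are sparsely distributed in the feature space, and the boundaries between different classes are blurred. This paper proposes a new few-shot learning algorithm based on feature transformation and metric networks (FTMN) for few-shot classification. The algorithm maps samples into a feature space through an embedding function and computes the feature residual between each input sample and the center of the class it belongs to. A feature transformation function is constructed to learn this residual, so that sample features in the feature space move toward the center of their own class after passing through the function. The transformed sample features are then used to update the class centers, which increases the distances between class centers. The algorithm further constructs a new metric function that jointly expresses the metric distances of all local feature points in a sample feature; this function optimizes the angle and the Euclidean distance between sample features at the same time. The algorithm's strong performance on datasets commonly used for few-shot classification demonstrates its effectiveness and generalization ability.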
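The steps named in the abstract (class centers, feature residuals, a shift toward the class center, and a metric that combines angle and Euclidean distance over local feature points) can be illustrated with a minimal plain-Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: the names `transform`, `joint_score`, `step` and `weight` are hypothetical, the fixed-fraction shift only stands in for the learned feature transformation function, and "cosine similarity minus weighted Euclidean distance" is just one simple way to combine the two quantities. Because a fixed-fraction shift leaves class means unchanged, the sketch shows only the reduction of within-class spread, not the effect of updating class centers.

```python
import math

# A "feature" is a list of local feature points; each point is a list of floats.

def mean_feature(features):
    """Point-wise, element-wise mean of several features of identical shape."""
    n = len(features)
    return [[sum(vals) / n for vals in zip(*points)] for points in zip(*features)]

def class_centers(features, labels):
    """Class center = mean of the embedded samples that share a label."""
    grouped = {}
    for feature, label in zip(features, labels):
        grouped.setdefault(label, []).append(feature)
    return {label: mean_feature(group) for label, group in grouped.items()}

def residual(feature, center):
    """Feature residual between a sample and its class center, per local point."""
    return [[c - f for f, c in zip(fp, cp)] for fp, cp in zip(feature, center)]

def transform(feature, center, step=0.5):
    """Hypothetical stand-in for the learned transformation function:
    shift each local point along its residual by a fixed fraction `step`."""
    res = residual(feature, center)
    return [[f + step * r for f, r in zip(fp, rp)] for fp, rp in zip(feature, res)]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms > 0 else 0.0

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def joint_score(feature, center, weight=1.0):
    """Hypothetical joint metric: sum over local points of
    cosine similarity minus weight * Euclidean distance (higher = more similar)."""
    return sum(cosine_similarity(fp, cp) - weight * euclidean_distance(fp, cp)
               for fp, cp in zip(feature, center))

def classify(feature, centers):
    """Assign the label of the class center with the highest joint score."""
    return max(centers, key=lambda label: joint_score(feature, centers[label]))

def spread(features, center):
    """Mean summed per-point Euclidean distance from features to a center."""
    return sum(sum(euclidean_distance(fp, cp) for fp, cp in zip(f, center))
               for f in features) / len(features)

# Toy 2-way 2-shot support set: 2 local points per sample, 2 dimensions per point.
support = [
    [[1.0, 0.0], [0.8, 0.2]],
    [[0.6, 0.4], [1.0, 0.2]],
    [[0.0, 1.0], [0.2, 0.8]],
    [[0.4, 0.6], [0.2, 1.0]],
]
labels = ["a", "a", "b", "b"]

centers = class_centers(support, labels)
moved = [transform(f, centers[y]) for f, y in zip(support, labels)]
query = [[0.9, 0.1], [0.7, 0.3]]
prediction = classify(query, centers)
```

In this toy setup, `spread(moved[:2], centers["a"])` is smaller than `spread(support[:2], centers["a"])`, which mirrors the idea of pulling same-class features toward their center. In a real model the embedding, the transformation and the metric would all be trained networks.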
  • Fig. 1  Model of feature transformation and metric networks

    Fig. 2  Structure of the key functions of the networks

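    Table 1  Embedding functions and key structures of the network models

    | Model | Embedding function | Key structures |
    |---|---|---|
    | MN | 4-layer convolutional network | Attention long short-term memory network |
    | ProtoNet[12] | 4-layer convolutional network | "Prototype" concept; Euclidean distance as the metric |
    | RN | 4-layer convolutional network | Convolutional neural network as the metric function |
    | EGNN | 4-layer convolutional network | Edge labels predict node classes |
    | EGNN + Transduction[22] | ResNet-12 | Edge labels predict node classes; transduction and label propagation |
    | DN4[24] | ResNet-12 | Local descriptors; image-to-class similarity measure |
    | DC[25] | 4-layer convolutional network | Dense classification |
    | DC + IMP[25] | 4-layer convolutional network | Dense classification; neural network implanting |
    | FTMN | 4-layer convolutional network | Feature transformation module; feature metric module |
    | FTMN-R12 | ResNet-12 | Feature transformation module; feature metric module |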

    Table 2  Few-shot classification performance on the Omniglot dataset (%)

    | Model | 5-way 1-shot | 5-way 5-shot | 20-way 1-shot | 20-way 5-shot |
    |---|---|---|---|---|
    | MN | 98.1 | 98.9 | 93.8 | 98.5 |
    | ProtoNet[12] | 98.8 | 99.7 | 96.0 | 98.9 |
    | SN | 97.3 | 98.4 | 88.2 | 97.0 |
    | RN | 99.6 ± 0.2 | 99.8 ± 0.1 | 97.6 ± 0.2 | 99.1 ± 0.1 |
    | SM[15] | 98.4 | 99.6 | 95.0 | 98.6 |
    | MetaNet[16] | 98.95 | - | 97.00 | - |
    | MANN[17] | 82.8 | 94.9 | - | - |
    | MAML[18] | 98.7 ± 0.4 | 99.9 ± 0.1 | 95.8 ± 0.3 | 98.9 ± 0.2 |
    | MMNet[26] | 99.28 ± 0.08 | 99.77 ± 0.04 | 97.16 ± 0.10 | 98.93 ± 0.05 |
    | FTMN | 99.7 ± 0.1 | 99.9 ± 0.1 | 98.3 ± 0.1 | 99.5 ± 0.1 |

    Table 3  Few-shot classification performance on the miniImageNet dataset (%)

    | Model | 5-way 1-shot | 5-way 5-shot |
    |---|---|---|
    | MN | 43.40 ± 0.78 | 51.09 ± 0.71 |
    | ML-LSTM[11] | 43.56 ± 0.84 | 55.31 ± 0.73 |
    | ProtoNet[12] | 49.42 ± 0.78 | 68.20 ± 0.66 |
    | RN | 50.44 ± 0.82 | 65.32 ± 0.70 |
    | MetaNet[16] | 49.21 ± 0.96 | - |
    | MAML[18] | 48.70 ± 1.84 | 63.11 ± 0.92 |
    | EGNN | - | 66.85 |
    | EGNN + Transduction[22] | - | 76.37 |
    | DN4[24] | 51.24 ± 0.74 | 71.02 ± 0.64 |
    | DC[25] | 62.53 ± 0.19 | 78.95 ± 0.13 |
    | DC + IMP[25] | - | 79.77 ± 0.19 |
    | MMNet[26] | 53.37 ± 0.08 | 66.97 ± 0.09 |
    | PredictNet[27] | 54.53 ± 0.40 | 67.87 ± 0.20 |
    | DynamicNet[28] | 56.20 ± 0.86 | 72.81 ± 0.62 |
    | MN-FCE[29] | 43.44 ± 0.77 | 60.60 ± 0.71 |
    | MetaOptNet[30] | 60.64 ± 0.61 | 78.63 ± 0.46 |
    | FTMN | 59.86 ± 0.91 | 75.96 ± 0.82 |
    | FTMN-R12 | 61.33 ± 0.21 | 79.59 ± 0.47 |

    Table 4  Few-shot classification performance on the CUB-200, CIFAR-FS and tieredImageNet datasets (%)

    | Model | CUB-200 5-way 1-shot | CUB-200 5-way 5-shot | CIFAR-FS 5-way 1-shot | CIFAR-FS 5-way 5-shot | tieredImageNet 5-way 1-shot | tieredImageNet 5-way 5-shot |
    |---|---|---|---|---|---|---|
    | MN | 61.16 ± 0.89 | 72.86 ± 0.70 | - | - | - | - |
    | ProtoNet[12] | 51.31 ± 0.91 | 70.77 ± 0.69 | 55.5 ± 0.7 | 72.0 ± 0.6 | 53.31 ± 0.89 | 72.69 ± 0.74 |
    | RN | 62.45 ± 0.98 | 76.11 ± 0.69 | 55.0 ± 1.0 | 69.3 ± 0.8 | 54.48 ± 0.93 | 71.32 ± 0.78 |
    | MAML[18] | 55.92 ± 0.95 | 72.09 ± 0.76 | 58.9 ± 1.9 | 71.5 ± 1.0 | 51.67 ± 1.81 | 70.30 ± 1.75 |
    | EGNN | - | - | - | - | 63.52 ± 0.52 | 80.24 ± 0.49 |
    | DN4[24] | 53.15 ± 0.84 | 81.90 ± 0.60 | - | - | - | - |
    | MetaOptNet[30] | - | - | 72.0 ± 0.7 | 84.2 ± 0.5 | 65.99 ± 0.72 | 81.56 ± 0.53 |
    | FTMN-R12 | 69.58 ± 0.36 | 85.46 ± 0.79 | 70.3 ± 0.5 | 82.6 ± 0.3 | 62.14 ± 0.63 | 81.74 ± 0.33 |

    Table 5  Results of the ablation study (%)

    | Model | 5-way 1-shot | 5-way 5-shot |
    |---|---|---|
    | ProtoNet-4C | 49.42 ± 0.78 | 68.20 ± 0.66 |
    | ProtoNet-8C | 51.18 ± 0.73 | 70.23 ± 0.46 |
    | ProtoNet-Trans-4C | 53.47 ± 0.46 | 71.33 ± 0.23 |
    | ProtoNet-M-4C | 56.54 ± 0.57 | 73.46 ± 0.53 |
    | ProtoNet-VLAD-4C | 52.46 ± 0.67 | 70.83 ± 0.62 |
    | Trans*-M-4C | 59.86 ± 0.91 | 67.86 ± 0.56 |
    | Cosine similarity only | 54.62 ± 0.57 | 72.58 ± 0.38 |
    | Euclidean distance only | 55.66 ± 0.67 | 73.34 ± 0.74 |
    | FTMN | 59.86 ± 0.91 | 75.96 ± 0.82 |
  • [1] Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, et al. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 1−9
    [2] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770−778
    [3] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: NIPS, 2012. 1106−1114
    [4] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR, 2015.
    [5] Liu Ying, Lei Yan-Bo, Fan Jiu-Lun, Wang Fu-Ping, Gong Yan-Chao, Tian Qi. Survey on image classification technology based on small sample learning. Acta Automatica Sinica, 2021, 47(2): 297−315 doi: 10.16383/j.aas.c190720
    [6] Miller E G, Matsakis N E, Viola P A. Learning from one example through shared densities on transforms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Hilton Head Island, USA: IEEE, 2000. 464−471
    [7] Li F F, Fergus R, Perona P. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(4): 594−611
    [8] Lake B M, Salakhutdinov R, Gross J, Tenenbaum J B. One shot learning of simple visual concepts. In: Proceedings of the 33rd Annual Meeting of the Cognitive Science Society. Boston, USA: CogSci, 2011. 2568−2573
    [9] Lake B M, Salakhutdinov R, Tenenbaum J B. Human-level concept learning through probabilistic program induction. Science, 2015, 350(6266): 1332−1338
    [10] Edwards H, Storkey A J. Towards a neural statistician. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [11] Vinyals O, Blundell C, Lillicrap T, Kavukcuoglu K, Wierstra D. Matching networks for one shot learning. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: 2016. 3637−3645
    [12] Snell J, Swersky K, Zemel R. Prototypical networks for few-shot learning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: 2017. 4080−4090
    [13] Koch G, Zemel R, Salakhutdinov R. Siamese neural networks for one-shot image recognition. In: Proceedings of the 32nd International Conference on Machine Learning. Lille, France: JMLR, 2015.
    [14] Sung F, Yang Y X, Zhang L, Xiang T, Torr P H S, Hospedales T M. Learning to compare: Relation network for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 1199−1208
    [15] Kaiser L, Nachum O, Roy A, Bengio S. Learning to remember rare events. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [16] Munkhdalai T, Yu H. Meta networks. In: Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: JMLR.org, 2017. 2554−2563
    [17] Santoro A, Bartunov S, Botvinick M, Wierstra D, Lillicrap T. Meta-learning with memory-augmented neural networks. In: Proceedings of the 33rd International Conference on Machine Learning. New York, USA: PMLR, 2016. 1842−1850
    [18] Finn C, Abbeel P, Levine S. Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: JMLR.org, 2017. 1126−1135
    [19] Arandjelovic R, Gronat P, Torii A, Pajdla T, Sivic J. NetVLAD: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 5297−5307
    [20] Jégou H, Douze M, Schmid C, Pérez P. Aggregating local descriptors into a compact image representation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE, 2010. 3304−3311
    [21] Bertinetto L, Henriques J F, Torr P H, Vedaldi A. Meta-learning with differentiable closed-form solvers. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA: ICLR, 2019.
    [22] Kim J, Kim T, Kim S, Yoo C D. Edge-labeling graph neural network for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 11−20
    [23] Yue Z Q, Zhang H W, Sun Q R, Hua X S. Interventional few-shot learning. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Incorporated, 2020. Article No. 230
    [24] Li W B, Wang L, Xu J L, Huo J, Gao Y, Luo J B. Revisiting local descriptor based image-to-class measure for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 7253−7260
    [25] Lifchitz Y, Avrithis Y, Picard S, Bursuc A. Dense classification and implanting for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 9250−9259
    [26] Cai Q, Pan Y W, Yao T, Yan C G, Mei T. Memory matching networks for one-shot image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 4080−4088
    [27] Qiao S Y, Liu C X, Shen W, Yuille A L. Few-shot image recognition by predicting parameters from activations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 7229−7238
    [28] Gidaris S, Komodakis N. Dynamic few-shot visual learning without forgetting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 4367−4375
    [29] Ravi S, Larochelle H. Optimization as a model for few-shot learning. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [30] Lee K, Maji S, Ravichandran A, Soatto S. Meta-learning with differentiable convex optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 10649−10657
Publication history
  • Received:  2021-09-20
  • Accepted:  2021-12-11
  • Published online:  2023-09-11
  • Issue date:  2024-07-23
