
Metric Based Feature Transformation Networks for Few-shot Learning

Wang Duo-Rui, Du Yang, Dong Lan-Fang, Hu Wei-Ming, Li Bing

Citation: Wang Duo-Rui, Du Yang, Dong Lan-Fang, Hu Wei-Ming, Li Bing. Metric based feature transformation networks for few-shot learning. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c210903

doi: 10.16383/j.aas.c210903
Funds: Supported by the National Key R&D Program of China (2018AAA0102802), the National Natural Science Foundation of China (62036011, 62192782, 61721004), and the Key Research Program of Frontier Sciences, Chinese Academy of Sciences (QYZDJ-SSW-JSC040)
More Information
    Author Bio:

    WANG Duo-Rui  Ph.D. candidate at Beihang University. He received his master's degree from the University of Science and Technology of China in 2021. His research interests cover few-shot learning and object detection. E-mail: wangduor@mail.ustc.edu.cn

    DU Yang  He received his Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2019. His research interests cover action recognition and medical image processing. E-mail: jingzhou.dy@alibaba-inc.com

    DONG Lan-Fang  Associate Professor at the University of Science and Technology of China. She received her master's degree from the University of Science and Technology of China in 1994. Her research interests cover intelligent image and video analysis, knowledge graphs and dialogue systems, numerical simulation, and 3D reconstruction. E-mail: lfdong@ustc.edu.cn

    HU Wei-Ming  Professor at the Institute of Automation, Chinese Academy of Sciences. He received his Ph.D. degree from Zhejiang University in 1998. His research interests cover visual motion analysis, recognition of web objectionable information, and network intrusion detection. Corresponding author of this paper. E-mail: wmhu@nlpr.ia.ac.cn

    LI Bing  Professor at the Institute of Automation, Chinese Academy of Sciences. He received his Ph.D. degree from Beijing Jiaotong University in 2009. His research interests cover web content security and intelligent ISP imaging. E-mail: bing.li@ia.ac.cn

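  • Abstract: In few-shot classification tasks, very few training samples are available for each class, so samples of the same class are sparsely distributed in feature space and the boundaries between different classes are blurred. This paper proposes a new network based on feature transformation and uses a metric-based approach to handle few-shot classification. The algorithm maps samples into feature space with an embedding function and computes the feature residual between each input sample and the class centers. A feature transformation function learns the residual between a class center and the samples of that class, so that samples move toward their own class center in feature space, and the positions of the class centers are then updated so that the distances between them increase. A new metric is constructed by fusing cosine similarity and Euclidean distance: a metric function jointly expresses the metric distance of every local feature in the feature map, so that both the angle and the Euclidean distance between sample features are optimized during training. Results on datasets commonly used for few-shot classification show that the model performs well and generalizes.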
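The first half of the abstract (class centers, residuals, and moving samples toward their own center) can be illustrated with a small sketch. This is not the paper's network: the embeddings are random toy vectors, and the learned feature transformation function is replaced by a fixed step `alpha`, a hypothetical parameter introduced here only for illustration. The subsequent center update that pushes centers apart is not covered.

```python
import random

def class_centers(features, labels, n_classes):
    """Mean feature vector of each class."""
    dim = len(features[0])
    centers = []
    for c in range(n_classes):
        members = [f for f, y in zip(features, labels) if y == c]
        centers.append([sum(f[d] for f in members) / len(members) for d in range(dim)])
    return centers

def shift_toward_centers(features, labels, centers, alpha=0.5):
    """Add a fraction alpha of the residual (center - feature) to each feature.

    alpha is a hypothetical fixed step; the paper learns this mapping instead.
    """
    return [
        [x + alpha * (c - x) for x, c in zip(f, centers[y])]
        for f, y in zip(features, labels)
    ]

def mean_distance_to_center(features, labels, centers):
    """Average Euclidean distance between each feature and its class center."""
    total = 0.0
    for f, y in zip(features, labels):
        total += sum((x - c) ** 2 for x, c in zip(f, centers[y])) ** 0.5
    return total / len(features)

random.seed(0)
n_classes, n_shot, dim = 5, 5, 8
labels = [c for c in range(n_classes) for _ in range(n_shot)]
# Toy embeddings: a random mean per class plus noise, standing in for CNN features.
means = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_classes)]
features = [[m + random.gauss(0, 0.8) for m in means[y]] for y in labels]

centers = class_centers(features, labels, n_classes)
moved = shift_toward_centers(features, labels, centers, alpha=0.5)
before = mean_distance_to_center(features, labels, centers)
after = mean_distance_to_center(moved, labels, centers)
print(f"mean distance to class center: {before:.3f} -> {after:.3f}")
```

With a fixed step the class means do not move, so the average distance to the center shrinks by exactly the factor `1 - alpha`. A learned transformation would presumably predict a different residual for each sample rather than apply one global step.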
  • Fig. 1  Model of the feature transformation and metric network

    Fig. 2  Structure of the key functions in each module

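    Table 1  Embedding functions and key structures of the network models

    | Model | Embedding function | Key structures |
    |---|---|---|
    | MN [11] | Conv-4 | Attention LSTM |
    | ProtoNet [12] | Conv-4 | "Prototype" concept; Euclidean distance as the metric |
    | RN [14] | Conv-4 | Convolutional neural network as the metric function |
    | EGNN [22] | Conv-4 | Edge labels predict node classes |
    | EGNN+Transduction [22] | ResNet-12 | Edge labels predict node classes; transduction and label propagation |
    | DN4 [24] | ResNet-12 | Local descriptors; image-to-class similarity measure |
    | DC [25] | Conv-4 | Dense classification |
    | DC+IMP [25] | Conv-4 | Dense classification; network implanting |
    | MBFTN | Conv-4 | Feature transformation module; feature metric module |
    | MBFTN-R12 | ResNet-12 | Feature transformation module; feature metric module |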
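The abstract describes a metric that fuses cosine similarity and Euclidean distance over the local features of a feature map. The sketch below shows one plausible way to combine the two, assuming a per-position sum of `(1 - cosine)` and a weighted Euclidean term; the weight `lam`, the function names, and the averaging over positions are assumptions made here, not the formula used in the paper.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def fused_distance(query_locals, center_locals, lam=0.5):
    """Average over spatial positions of (1 - cosine) + lam * Euclidean distance.

    lam is a hypothetical weight; the paper's actual fusion may differ.
    """
    per_position = [
        (1.0 - cosine_similarity(q, c)) + lam * euclidean_distance(q, c)
        for q, c in zip(query_locals, center_locals)
    ]
    return sum(per_position) / len(per_position)

def classify(query_locals, centers_locals, lam=0.5):
    """Return the index of the nearest class center and all fused distances."""
    distances = [fused_distance(query_locals, c, lam) for c in centers_locals]
    return min(range(len(distances)), key=distances.__getitem__), distances

# Two class centers, each a 2x2 feature map flattened to 4 local 3-d descriptors.
center_a = [[1.0, 0.0, 0.0]] * 4
center_b = [[0.0, 1.0, 0.0]] * 4
# Same direction as center_a but twice the length: the cosine term is near zero,
# while the Euclidean term still measures the gap in magnitude.
query = [[2.0, 0.0, 0.0]] * 4

predicted, distances = classify(query, [center_a, center_b])
print(predicted, [round(d, 3) for d in distances])
```

The toy query shows why the two terms are complementary: cosine similarity ignores vector length, and Euclidean distance alone mixes angle and length, so optimizing their combination constrains both.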

    Table 2  Comparison of mean classification accuracy (%) with existing algorithms on the Omniglot dataset

    | Model | 5-way 1-shot | 5-way 5-shot | 20-way 1-shot | 20-way 5-shot |
    |---|---|---|---|---|
    | MN [11] | 98.1 | 98.9 | 93.8 | 98.5 |
    | ProtoNet [12] | 98.8 | 99.7 | 96.0 | 98.9 |
    | SN [13] | 97.3 | 98.4 | 88.2 | 97.0 |
    | RN [14] | 99.6 ± 0.2 | 99.8 ± 0.1 | 97.6 ± 0.2 | 99.1 ± 0.1 |
    | SM [15] | 98.4 | 99.6 | 95.0 | 98.6 |
    | MetaNet [16] | 98.95 | - | 97.0 | - |
    | MANN [17] | 82.8 | 94.9 | - | - |
    | MAML [18] | 98.7 ± 0.4 | 99.9 ± 0.1 | 95.8 ± 0.3 | 98.9 ± 0.2 |
    | MMNet [26] | 99.28 ± 0.08 | 99.77 ± 0.04 | 97.16 ± 0.1 | 98.93 ± 0.05 |
    | MBFTN | 99.7 ± 0.1 | 99.9 ± 0.1 | 98.3 ± 0.1 | 99.5 ± 0.1 |
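The N-way K-shot settings reported in these tables (for example 5 classes with 1 labeled sample each) are usually evaluated over many randomly sampled episodes. This page does not state how the episodes were drawn or what the ± values denote, so the sketch below rests on two assumptions: episodes are sampled uniformly, and ± is read as the half-width of a 95% confidence interval, a common convention in few-shot papers. The dataset and the accuracy list are made-up toy values.

```python
import math
import random

def sample_episode(class_to_items, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode: K support and n_query query items per class."""
    classes = rng.sample(sorted(class_to_items), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        items = rng.sample(class_to_items[cls], k_shot + n_query)
        support += [(item, label) for item in items[:k_shot]]
        query += [(item, label) for item in items[k_shot:]]
    return support, query

def mean_and_ci95(accuracies):
    """Mean and half-width of a normal-approximation 95% confidence interval."""
    n = len(accuracies)
    mean = sum(accuracies) / n
    var = sum((a - mean) ** 2 for a in accuracies) / (n - 1)
    return mean, 1.96 * math.sqrt(var / n)

rng = random.Random(0)
# Hypothetical toy dataset: 20 classes with 30 item ids each.
data = {f"class_{c}": [f"c{c}_img{i}" for i in range(30)] for c in range(20)}
support, query = sample_episode(data, n_way=5, k_shot=1, n_query=15, rng=rng)

# Made-up per-episode accuracies, used only to exercise the helper.
mean, half_width = mean_and_ci95([0.60, 0.58, 0.62, 0.61, 0.59])
print(f"{100 * mean:.2f} ± {100 * half_width:.2f}")
```

In a real evaluation the accuracy list would hold one entry per test episode, typically hundreds or thousands of them, which is what makes the reported intervals narrow.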

    Table 3  Comparison of mean classification accuracy (%) with existing algorithms on the miniImageNet dataset

    | Model | 5-way 1-shot | 5-way 5-shot |
    |---|---|---|
    | MN [11] | 43.4 ± 0.78 | 51.09 ± 0.71 |
    | ML-LSTM [11] | 43.56 ± 0.84 | 55.31 ± 0.73 |
    | ProtoNet [12] | 49.42 ± 0.78 | 68.20 ± 0.66 |
    | RN [14] | 50.44 ± 0.82 | 65.32 ± 0.70 |
    | MetaNet [16] | 49.21 ± 0.96 | - |
    | MAML [18] | 48.70 ± 1.84 | 63.11 ± 0.92 |
    | EGNN [22] | - | 66.85 |
    | EGNN+Transduction [22] | - | 76.37 |
    | DN4 [24] | 51.24 ± 0.74 | 71.02 ± 0.64 |
    | DC [25] | 62.53 ± 0.19 | 78.95 ± 0.13 |
    | DC+IMP [25] | - | 79.77 ± 0.19 |
    | MMNet [26] | 53.37 ± 0.08 | 66.97 ± 0.09 |
    | PredictNet [27] | 54.53 ± 0.40 | 67.87 ± 0.20 |
    | DynamicNet [28] | 56.20 ± 0.86 | 72.81 ± 0.62 |
    | MN-FCE [29] | 43.44 ± 0.77 | 60.60 ± 0.71 |
    | MetaOptNet [30] | 60.64 ± 0.61 | 78.63 ± 0.46 |
    | MBFTN | 59.86 ± 0.91 | 75.96 ± 0.82 |
    | MBFTN-R12 | 61.33 ± 0.21 | 79.59 ± 0.47 |

    Table 4  Mean classification accuracy (%) on the CUB-200, CIFAR-FS and tieredImageNet datasets (all 5-way)

    | Model | CUB-200 1-shot | CUB-200 5-shot | CIFAR-FS 1-shot | CIFAR-FS 5-shot | tieredImageNet 1-shot | tieredImageNet 5-shot |
    |---|---|---|---|---|---|---|
    | MN [11] | 61.16 ± 0.89 | 72.86 ± 0.70 | - | - | - | - |
    | ProtoNet [12] | 51.31 ± 0.91 | 70.77 ± 0.69 | 55.5 ± 0.7 | 72.0 ± 0.6 | 53.31 ± 0.89 | 72.69 ± 0.74 |
    | RN [14] | 62.45 ± 0.98 | 76.11 ± 0.69 | 55.0 ± 1.0 | 69.3 ± 0.8 | 54.48 ± 0.93 | 71.32 ± 0.78 |
    | MAML [18] | 55.92 ± 0.95 | 72.09 ± 0.76 | 58.9 ± 1.9 | 71.5 ± 1.0 | 51.67 ± 1.81 | 70.30 ± 1.75 |
    | EGNN [22] | - | - | - | - | 63.52 ± 0.52 | 80.24 ± 0.49 |
    | DN4 [24] | 53.15 ± 0.84 | 81.90 ± 0.60 | - | - | - | - |
    | MetaOptNet [30] | - | - | 72.0 ± 0.7 | 84.2 ± 0.5 | 65.99 ± 0.72 | 81.56 ± 0.53 |
    | MBFTN-R12 | 69.58 ± 0.36 | 85.46 ± 0.79 | 70.3 ± 0.5 | 82.6 ± 0.3 | 62.14 ± 0.63 | 81.74 ± 0.33 |

    Table 5  Ablation study of the proposed model

    | Model | 5-way 1-shot | 5-way 5-shot |
    |---|---|---|
    | ProtoNet-4C | 49.42 ± 0.78 | 68.20 ± 0.66 |
    | ProtoNet-8C | 51.18 ± 0.73 | 70.23 ± 0.46 |
    | ProtoNet-Trans-4C | 53.47 ± 0.46 | 71.33 ± 0.23 |
    | ProtoNet-M-4C | 56.54 ± 0.57 | 73.46 ± 0.53 |
    | ProtoNet-VLAD-4C | 52.46 ± 0.67 | 70.83 ± 0.62 |
    | Trans*-M-4C | 59.86 ± 0.91 | 67.86 ± 0.56 |
    | Using only […] | 54.62 ± 0.57 | 72.58 ± 0.38 |
    | Using only Euclidean distance | 55.66 ± 0.67 | 73.34 ± 0.74 |
    | ProtoNet-Trans-M-4C | 59.86 ± 0.91 | 75.96 ± 0.82 |
  • [1] Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, et al. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, USA: IEEE, 2015. 1-9
    [2] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 770-778
    [3] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 26th International Conference on Neural Information Processing Systems (NIPS). Lake Tahoe, USA: NIPS, 2012. 1106-1114
    [4] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR, 2015.
    [5] Liu Ying, Lei Yan-Bo, Fan Jiu-Lun, Wang Fu-Ping, Gong Yan-Chao, Tian Qi. Survey on image classification technology based on small sample learning. Acta Automatica Sinica, 2021, 47(2): 297-315 doi: 10.16383/j.aas.c190720
    [6] Miller E G, Matsakis N E, Viola P A. Learning from one example through shared densities on transforms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Hilton Head Island, USA: IEEE, 2000. 464-471
    [7] Fei-Fei L, Fergus R, Perona P. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(4): 594-611
    [8] Lake B M, Salakhutdinov R, Gross J, Tenenbaum J B. One shot learning of simple visual concepts. In: Proceedings of the 33rd Annual Meeting of the Cognitive Science Society (CogSci). Boston, USA: CogSci, 2011. 2568-2573
    [9] Lake B M, Salakhutdinov R, Tenenbaum J B. Human-level concept learning through probabilistic program induction. Science, 2015, 350(6266): 1332-1338
    [10] Edwards H, Storkey A J. Towards a neural statistician. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [11] Vinyals O, Blundell C, Lillicrap T, Kavukcuoglu K, Wierstra D. Matching networks for one shot learning. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc., 2016. 3637-3645
    [12] Snell J, Swersky K, Zemel R. Prototypical networks for few-shot learning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 4080-4090
    [13] Koch G, Zemel R, Salakhutdinov R. Siamese neural networks for one-shot image recognition. In: Proceedings of the 32nd International Conference on Machine Learning. Lille, France: JMLR, 2015.
    [14] Sung F, Yang Y X, Zhang L, Xiang T, Torr P H S, Hospedales T M. Learning to compare: Relation network for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 1199-1208
    [15] Kaiser Ł, Nachum O, Roy A, Bengio S. Learning to remember rare events. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [16] Munkhdalai T, Yu H. Meta networks. In: Proceedings of the 34th International Conference on Machine Learning - Volume 70. Sydney, Australia: JMLR.org, 2017. 2554-2563
    [17] Santoro A, Bartunov S, Botvinick M, Wierstra D, Lillicrap T. Meta-learning with memory-augmented neural networks. In: Proceedings of the 33rd International Conference on Machine Learning. New York City, USA: PMLR, 2016. 1842-1850
    [18] Finn C, Abbeel P, Levine S. Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learning - Volume 70. Sydney, Australia: JMLR.org, 2017. 1126-1135
    [19] Arandjelovic R, Gronat P, Torii A, Pajdla T, Sivic J. NetVLAD: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 5297-5307
    [20] Jégou H, Douze M, Schmid C, Pérez P. Aggregating local descriptors into a compact image representation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE, 2010. 3304-3311
    [21] Bertinetto L, Henriques J F, Torr P H, Vedaldi A. Meta-learning with differentiable closed-form solvers. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA: ICLR, 2019.
    [22] Kim J, Kim T, Kim S, Yoo C D. Edge-labeling graph neural network for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 11-20
    [23] Yue Z Q, Zhang H W, Sun Q R, Hua X S. Interventional few-shot learning. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2020. Article No. 230
    [24] Li W B, Wang L, Xu J L, Huo J, Gao Y, Luo J B. Revisiting local descriptor based image-to-class measure for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 7253-7260
    [25] Lifchitz Y, Avrithis Y, Picard S, Bursuc A. Dense classification and implanting for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 9250-9259
    [26] Cai Q, Pan Y W, Yao T, Yan C G, Mei T. Memory matching networks for one-shot image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 4080-4088
    [27] Qiao S Y, Liu C X, Shen W, Yuille A L. Few-shot image recognition by predicting parameters from activations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 7229-7238
    [28] Gidaris S, Komodakis N. Dynamic few-shot visual learning without forgetting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 4367-4375
    [29] Ravi S, Larochelle H. Optimization as a model for few-shot learning. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [30] Lee K, Maji S, Ravichandran A, Soatto S. Meta-learning with differentiable convex optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 10649-10657
Publication history
  • Received:  2021-09-03
  • Accepted:  2021-12-11
  • Published online:  2023-09-11
