2.845

2023影响因子

(CJCR)

  • 中文核心
  • EI
  • 中国科技核心
  • Scopus
  • CSCD
  • 英国科学文摘

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于可解释注意力部件模型的行人重识别方法

周勇 王瀚正 赵佳琦 陈莹 姚睿 陈思霖

周勇, 王瀚正, 赵佳琦, 陈莹, 姚睿, 陈思霖. 基于可解释注意力部件模型的行人重识别方法. 自动化学报, 2023, 49(10): 2159−2171 doi: 10.16383/j.aas.c200493
引用本文: 周勇, 王瀚正, 赵佳琦, 陈莹, 姚睿, 陈思霖. 基于可解释注意力部件模型的行人重识别方法. 自动化学报, 2023, 49(10): 2159−2171 doi: 10.16383/j.aas.c200493
Zhou Yong, Wang Han-Zheng, Zhao Jia-Qi, Chen Ying, Yao Rui, Chen Si-Lin. Interpretable attention part model for person re-identification. Acta Automatica Sinica, 2023, 49(10): 2159−2171 doi: 10.16383/j.aas.c200493
Citation: Zhou Yong, Wang Han-Zheng, Zhao Jia-Qi, Chen Ying, Yao Rui, Chen Si-Lin. Interpretable attention part model for person re-identification. Acta Automatica Sinica, 2023, 49(10): 2159−2171 doi: 10.16383/j.aas.c200493

基于可解释注意力部件模型的行人重识别方法

doi: 10.16383/j.aas.c200493
基金项目: 国家自然科学基金(61806206, U1610124, 61772530, 61773383), 江苏省自然科学基金(BK20180639, BK20171192), 江苏省六大人才高峰计划(2015-DZXX-010)资助
详细信息
    作者简介:

    周勇:中国矿业大学计算机科学与技术学院教授. 主要研究方向为数据挖掘, 机器学习和人工智能. E-mail: yzhou@cumt.edu.cn

    王瀚正:中国矿业大学计算机科学与技术学院硕士研究生. 主要研究方向为计算机视觉, 图像处理, 行人重识别. E-mail: hzwang@cumt.edu.cn

    赵佳琦:中国矿业大学计算机科学与技术学院副教授. 主要研究方向为多目标优化, 深度学习, 图像处理. 本文通信作者. E-mail: jiaqizhao88@126.com

    陈莹:中国矿业大学计算机科学与技术学院博士研究生. 主要研究方向为计算机视觉, 图像处理, 行人重识别. E-mail: cheny@cumt.edu.cn

    姚睿:中国矿业大学计算机科学与技术学院副教授. 主要研究方向为计算机视觉, 机器学习. E-mail: ruiyao@cumt.edu.cn

    陈思霖:中国矿业大学计算机科学与技术学院硕士研究生. 主要研究方向为计算机视觉, 图像处理, 目标检测. E-mail: silin.chen@cumt.edu.cn

Interpretable Attention Part Model for Person Re-identification

Funds: Supported by National Natural Science Foundation of China (61806206, U1610124, 61772530, 61773383), Natural Science Foundation of Jiangsu Province (BK20180639, BK20171192), and the Six Talent Peaks Project in Jiangsu Province (2015-DZXX-010)
More Information
    Author Bio:

    ZHOU Yong Professor at the School of Computer Science and Technology, China University of Mining and Technology. His research interest covers data mining, machine learning, and artificial intelligence

    WANG Han-Zheng Master student at the School of Computer Science and Technology, China University of Mining and Technology. His research interest covers computer vision, image processing, and person re-identification

    ZHAO Jia-Qi Associate professor at the School of Computer Science and Technology, China University of Mining and Technology. His research interest covers multiobjective optimization, deep learning, and image processing. Corresponding author of this paper

    CHEN Ying Ph.D. candidate at the School of Computer Science and Technology, China University of Mining and Technology. Her research interest covers computer vision, image processing, and person re-identification

    YAO Rui Associate professor at the School of Computer Science and Technology, China University of Mining and Technology. His research interest covers computer vision and machine learning

    CHEN Si-Lin Master student at the School of Computer Science and Technology, China University of Mining and Technology. His research interest covers computer vision, image processing, and objective detection

  • 摘要: 大多数行人重识别(Person re-identification, ReID)方法仅将注意力机制作为提取显著特征的辅助手段, 缺少网络对行人图像关注程度的量化研究. 基于此, 提出一种可解释注意力部件模型(Interpretable attention part model, IAPM). 该模型有3 个优点: 1)利用注意力掩码提取部件特征, 解决部件不对齐问题; 2)为了根据部件的显著性程度生成可解释权重, 设计可解释权重生成模块(Interpretable weight generation module, IWM); 3)提出显著部件三元损失(Salient part triplet loss, SPTL)用于IWM的训练, 提高识别精度和可解释性. 在3 个主流数据集上进行实验, 验证所提出的方法优于现有行人重识别方法. 最后通过一项人群主观测评比较IWM生成可解释权重的相对大小与人类直观判断得分, 证明本方法具有良好的可解释性.
  • 图  1  IAPM整体结构

    Fig.  1  Structure of IAPM

    图  2  横向分割示意图

    Fig.  2  Schematic diagram of horizontal split

    图  3  PS模块使用的伪标签[16]

    Fig.  3  Pseudo-labels used by PS[16]

    图  4  注意力权重生成模块结构

    Fig.  4  Structure of IWM

    图  5  负样本对距离变化图

    Fig.  5  Negative sample pair distance graph

    图  6  正样本对距离变化图

    Fig.  6  Positive sample pair distance graph

    图  7  SPTL损失曲线图

    Fig.  7  SPTL loss curve graph

    图  8  可解释权重展示

    Fig.  8  The display of interpretable weights

    图  9  主观测评结果

    Fig.  9  The display of subjective evaluation results

    图  10  可解释权重与主观测评结果对比

    Fig.  10  Comparison of interpretable weights and subjective evaluation results

    表  1  实验环境

    Table  1  Experimental environment

    软硬件环境配置
    实验平台Pytorch
    显卡NVIDIA Tesla P100
    内存40 GB
    显存16 GB
    下载: 导出CSV

    表  2  实验参数

    Table  2  Experimental parameters

    实验参数参数数值
    输入图像尺寸(像素)$384\times 128 $
    迭代次数100
    优化器SGD
    动量因子0.9
    权重衰减系数$5\times10^{-4} $
    Batchsize128
    显著部件三元损失$\alpha $1.2
    下载: 导出CSV

    表  3  与EANet的性能对比(%)

    Table  3  Performance comparison with EANet (%)

    方法数据集
    Market-1501DukeMTMC-reIDCUHK03
    PAP-6P94.3 (84.3)85.6 (72.4)68.1 (62.4)
    PAP94.5 (84.9)86.1 (73.3)72.0 (66.2)
    PAP-S-PS94.6 (85.6)87.5 (74.6)72.5 (66.8)
    IAPM-6P (本文)95.0 (85.3)86.9 (74.3)72.5 (65.2)
    IAPM-9P (本文)95.1 (86.0)87.9 (75.6)72.6 (67.4)
    IAPM (本文)95.2 (86.3)88.0 (75.7)72.6 (67.2)
    下载: 导出CSV

    表  4  与其他方法的性能对比 (%)

    Table  4  Performance comparison with other methods (%)

    方法数据集
    Market-1501DukeMTMC-reIDCUHK03
    Verif-Identify[38]79.5 (59.9)68.9 (49.3)
    MSCAN[29]80.8 (57.5)
    MGCAM[12]83.8 (74.3)50.1 (50.2)
    Part-Aligned[39]91.7 (79.6)84.4 (69.3)
    SPReID[40]92.5 (81.3)84.4 (71.0)
    AlignedReID[41]91.8 (79.3)
    Deep-Person[42]92.3 (79.6)80.9 (64.8)
    PCB[7]85.3 (68.5)73.2 (52.8)43.8 (38.9)
    PCB + RPP[7]93.8 (81.6)83.3 (69.2)63.7 (57.5)
    HA-CNN[43]91.2 (75.7)80.5 (63.8)44.4 (41.0)
    Mancs[44]93.1 (82.3)84.9 (71.8)69.0 (63.9)
    P2-Net [45]95.1 (85.6)86.5 (73.1)74.9 (68.9)
    M3 + ResNet50[46]95.4 (82.6)84.7 (68.5)66.9 (60.7)
    IAPM (本文) 95.2 (86.3) 88.0 (75.7) 72.6 (67.2)
    注: “—” 表示文献中没有提供相应数据.
    下载: 导出CSV

    表  5  消融实验1

    Table  5  Ablation experiment 1

    模型Rank-1 (%)mAP (%)
    原始模型92.480.5
    原始模型 + IWM + SPTL95.086.1
    原始模型 + IWM + SPTL +
    中心损失
    95.286.3
    注: 加粗字体表示各列最优结果.
    下载: 导出CSV

    表  6  消融实验2

    Table  6  Ablation experiment 2

    人体部件个数Rank-1 (%)mAP (%)
    695.085.3
    795.286.3
    995.186.0
    注: 加粗字体表示各列最优结果.
    下载: 导出CSV

    表  7  消融实验3

    Table  7  Ablation experiment 3

    $\alpha $Rank-1 (%)mAP (%)
    0.194.485.2
    0.594.585.3
    0.894.885.7
    1.094.785.6
    1.295.286.3
    1.594.685.6
    2.094.785.3
    5.093.583.5
    10.093.381.0
    注: 加粗字体表示各列最优结果.
    下载: 导出CSV

    表  8  消融实验4

    Table  8  Ablation experiment 4

    $\lambda $Rank-1 (%)mAP (%)
    0.294.485.4
    0.494.885.4
    0.694.485.1
    0.894.885.7
    1.095.286.3
    注: 加粗字体表示各列最优结果.
    下载: 导出CSV
  • [1] Yi D, Lei Z, Liao S C, Li S Z. Deep metric learning for person re-identification. In: Proceedings of the 22nd IEEE International Conference on Pattern Recognition. Stockholm, Sweden: IEEE, 2014. 34−39
    [2] Liao S C, Hu Y, Zhu X Y, Li S Z. Person re-identification by local maximal occurrence representation and metric learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 2197−2206
    [3] 罗浩, 姜伟, 范星, 张思朋. 基于深度学习的行人重识别研究进展. 自动化学报, 2019, 45(11): 2032-2049.

    Luo Hao, Jiang Wei, Fan xing, Zhang Si-Peng. A survey on deep learning based person re-identification. Acta Automatica Sinica, 2019, 45(11): 2032-2049.
    [4] Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors. Nature, 1986, 323(6088): 533-536. doi: 10.1038/323533a0
    [5] 吴飞, 廖彬兵, 韩亚洪. 深度学习的可解释性. 航空兵器, 2019, 26(01): 43-50.

    Wu Fei, Liao Bin-Bing, Han Ya-Hong. Interpretability for Deep Learning. Aero Weaponry, 2019, 26(01): 43-50.
    [6] Chen W H, Chen X T, Zhang J G, Huang K Q. A multi-task deep network for person re-identification. In: Proceedings of the 31st Conference on Artificial Intelligence. San Francisco, USA: AAAI, 2017. 3988−3994
    [7] Sun Y G, Zheng L, Yang Y, Tian Q, Wang S J. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In: Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer, 2018. 480−496
    [8] Zhou S P, Wang J J, Wang J Y, Gong Y H, Zheng N N. Point to set similarity based deep feature learning for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 5028−5037
    [9] Sarfraz M S, Schumann A, Eberle A, Stiefelhagen R. A pose-pensitive embedding for person re-identification with expanded cross neighborhood re-ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 420−429
    [10] Zhao L M, Li X, Zhuang Y T, Wang J D. Deeply-learned part-aligned representations for person re-identification. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 3239−3248
    [11] Zhou S P, Wang J J, Meng D Y, Liang Y D, Gong Y H, Zheng N N. Discriminative feature learning with foreground attention for person re-identification. IEEE Transactions on Image Processing, 2019, 28(9): 4671-4684.
    [12] Song C F, Huang Y, Ouyang W L, Wang L. Mask-guided contrastive attention model for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 1179−1188
    [13] Xu J, Zhao R, Zhu F, Wang H M, Ouyang W L. Attention-aware compositional network for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 2119−2128
    [14] Tay C P, Roy S, Yap K H. Aanet: Attribute attention network for person re-identifications. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 7134−7143
    [15] Zhou S P, Wang F, Huang Z Y, Wang J J. Discriminative feature learning with consistent attention regularization for person re-identification. In: Proceedings of the IEEE International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 8039−8048
    [16] Huang H J, Yang W J, Chen X T, Zhao X, Huang K Q, Lin J B, et al. EANet: Enhancing alignment for cross-domain person re-identification [Online], available: http://arxiv.org/abs/1812.11369, October 21, 2020
    [17] Bach S, Binder A, Montavon G, Klauschen F, Muller K, Samek W. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS One, 2015, 10(7): e0130140. doi: 10.1371/journal.pone.0130140
    [18] Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 2921−2929
    [19] Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I J, et al. Intriguing properties of neural networks [Online], available: http://arxiv.org/abs/1312.6199, October 21, 2020
    [20] Bau D, Zhou B, Khosla A, Oliva A, Torralba A. Network dissection: Quantifying interpretability of deep visual representations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 3319−3327
    [21] Dong Y P, Su H, Zhu J, Zhang B. Improving interpretability of deep neural networks with semantic information. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 975−983
    [22] Zhang Q S, Wu Y N, Zhu S C. Interpretable convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 8827−8836
    [23] Zheng L, Yang Y, Hauptmann A G. Person re-identification: Past, present and future [Online], available: http://arxiv.org/abs/1610.02984, October 21, 2020
    [24] Zheng L, Zhang H H, Sun S Y, Chandraker M, Yang Y, Tian Qi. Person re-identification in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 3346−3355
    [25] Lin Y T, Zheng L, Zheng Z D, Wu Y, Hu Z L, Yan C G, et al. Improving person re-identification by attribute and identity learning. Pattern Recognition, 2019, 95: 151-161. doi: 10.1016/j.patcog.2019.06.006
    [26] Geng M Y, Wang Y W, Xiang T, Tian Y H. Deep transfer learning for person re-identification [Online], available: http://arxiv.org/abs/1611.05244, October 21, 2020
    [27] Varior R R, Haloi M, Wang G. Gated siamese convolutional neural network architecture for human re-identification. In: Proceedings of the European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 791−808
    [28] Hermans A, Beyer L, Leibe B. In defense of the triplet loss for person re-identification [Online], available: http://arxiv.org/abs/1703.07737, October 21, 2020
    [29] Li D W, Chen X T, Zhang Z, Huang K Q. Learning deep context-aware features over body and latent parts for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 7398−7407
    [30] Fang P F, Zhou J M, Roy S K, Petersson L, Harandi M. Bilinear attention networks for person retrieval. In: Proceedings of the IEEE International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 8029−8038
    [31] Liu H, Feng J S, Qi M B, Jiang J G, Yan S C. End-to-end comparative attention networks for person re-identification. IEEE Transactions on Image Processing, 2017, 26(7): 3492-3506. doi: 10.1109/TIP.2017.2700762
    [32] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation, 1997, 9(8): 1735-1780. doi: 10.1162/neco.1997.9.8.1735
    [33] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, Nevada, USA: IEEE, 2016. 770−778
    [34] Wen Y D, Zhang K P, Li Z F, Qiao Y. A discriminative feature learning approach for deep face recognition. In: Proceedings of the European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 499−515
    [35] Zheng L, Shen L, Tian L, Wang S J, Wang J D, Tian Q. Scalable person re-identification: A benchmark. In: Proceedings of the IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015. 1116−1124
    [36] Ristani E, Solera F, Zou R, Cucchiara, R, Tomasi C. Performance measures and a data set for multi-target, multi-camera tracking. In: Proceedings of the European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 17−35
    [37] Li W, Zhao R, Xiao T, Wang X G. Deepreid: Deep filter pairing neural network for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE, 2014. 152−159
    [38] Zheng Z D, Zheng L, Yang Y. A discriminatively learned cnn embedding for person reidentification. ACM Transactions on Multimedia Computing, Communications, and Applications, 2018, 14(1): Article No. 13.
    [39] Suh Y, Wang J, Tang S, Mei T, Lee K M. Part-aligned bilinear representations for person re-identification. In: Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer, 2018. 418−437
    [40] Kalayeh M M, Basaran E, Gökmen M, Kamasak M E, Shah M. Human semantic parsing for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 1062−1071
    [41] Zhang X, Luo H, Fan X, Xiang W L, Sun Y X, Xiao Q Q, et al. Alignedreid: Surpassing human-level performance in person re-identification [Online], available: http://arxiv.org/abs/1711.08184, October 21, 2020
    [42] Bai X, Yang M K, Huang T T, Dou Z Y, Yu R, Xu Y C. Deep-person: learning discriminative deep features for person re-identification. Pattern Recognition, 2020, 98: 107036. doi: 10.1016/j.patcog.2019.107036
    [43] Li W, Zhu X T, Gong S G. Harmonious attention network for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 2285−2294
    [44] Wang C, Zhang Q, Huang C, Liu W Y, Wang X G. Mancs: A multi-task attentional network with curriculum sampling for person re-identification. In: Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer, 2018. 384−400
    [45] Guo J Y, Yuan Y H, Huang L, Zhang C, Yao J G, Han K. Beyond human parts: Dual part-aligned representations for person re-identification. In: Proceedings of the IEEE International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 3641−3650
    [46] Zhou J H, Su B, Wu Y. Online joint multi-metric adaptation from frequent sharing-subset mining for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Virtual Event: IEEE, 2020. 2909−2918
  • 加载中
图(10) / 表(8)
计量
  • 文章访问数:  1792
  • HTML全文浏览量:  897
  • PDF下载量:  257
  • 被引次数: 0
出版历程
  • 收稿日期:  2020-07-06
  • 修回日期:  2020-08-23
  • 网络出版日期:  2023-08-29
  • 刊出日期:  2023-10-24

目录

    /

    返回文章
    返回