Fully Overlapped Handwritten Number Recognition and Separation Based on Deep EM Capsule Network

YAO Hong-Ge, DONG Ze-Hao, YU Jun, BAI Xiao-Jun

Citation: Yao Hong-Ge, Dong Ze-Hao, Yu Jun, Bai Xiao-Jun. Fully overlapped handwritten number recognition and separation based on deep EM capsule network. Acta Automatica Sinica, 2022, 48(11): 1−10 doi: 10.16383/j.aas.c190849


doi: 10.16383/j.aas.c190849
    Author Bio:

    YAO Hong-Ge  Ph.D., associate professor at the School of Computer Science and Engineering, Xi'an Technological University. His research interest covers machine learning and computer vision. E-mail: yaohongge@xatu.edu.cn

    DONG Ze-Hao  Master student at the School of Computer Science and Engineering, Xi'an Technological University. His research interest covers deep learning and capsule networks. E-mail: axxddzh@gmail.com

    YU Jun  Professor at the School of Computer Science and Engineering, Xi'an Technological University. Her research interest covers image processing and pattern recognition. E-mail: yujun@xatu.edu.cn

    BAI Xiao-Jun  Associate professor at the School of Computer Science and Engineering, Xi'an Technological University, and also a researcher at the Key Laboratory of Electronic Information Processing with Applications in Crime Scene Investigation, Ministry of Public Security. His research interest covers digital image processing, artificial intelligence, and machine learning. Corresponding author of this paper. E-mail: baixiaojun@xatu.edu.cn

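  • Abstract: Based on the vector-neuron idea of capsule networks and the expectation-maximization (EM) algorithm, a deep capsule network (Deep-CapsNet) that uses EM as its vector clustering algorithm is designed to recognize and separate overlapped handwritten digits. The network consists of two parts. The first part is a six-layer structure built from two convolutional layers, two primary capsule layers, and two EM clustering capsule layers. It expands the capsule dimension from the usual 8 to 16 and uses pose transformation matrices to predict high-level features from low-level features. The EM algorithm is adapted into an EM vector clustering algorithm that replaces the iterative routing of the original capsule network, which streamlines the computation and enables recognition of overlapped targets. The second part is the reconstruction network, composed of two parallel networks with identical structure that reconstruct the two output vectors in parallel and thereby separate the overlapped targets. Experiments show that the network reaches a recognition rate of 96% on 100% fully overlapped handwritten digit images. By comparison, the existing capsule network CapsNet reaches 95% at an 80% overlap rate and 88% at a 100% overlap rate. The proposed network therefore improves the recognition rate markedly on the harder task and can accurately separate two completely superimposed handwritten digit images.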
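The prediction and clustering steps named in the abstract can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes a diagonal-Gaussian mixture over the votes, and the function names `predict_votes` and `em_vector_clustering`, the array shapes, and the uniform initialization are all illustrative assumptions.

```python
import numpy as np

def predict_votes(u, W):
    """Apply pose transformation matrices to low-level capsules.

    u: (n_low, d_in) low-level capsule vectors.
    W: (n_low, n_high, d_out, d_in) one matrix per (low, high) capsule pair.
    Returns votes of shape (n_low, n_high, d_out).
    """
    return np.einsum("ijkl,il->ijk", W, u)

def em_vector_clustering(votes, n_iters=3, eps=1e-8):
    """Cluster votes into high-level capsule vectors with a Gaussian-mixture EM.

    Assumption: each high-level capsule is one diagonal Gaussian component.
    Returns (mu, resp): cluster means (n_high, d) and responsibilities (n_low, n_high).
    """
    n_low, n_high, _ = votes.shape
    resp = np.full((n_low, n_high), 1.0 / n_high)  # uniform start
    for _ in range(n_iters):
        # M-step: responsibility-weighted mean, variance and mixing weight.
        weight = resp.sum(axis=0) + eps
        mu = (resp[:, :, None] * votes).sum(axis=0) / weight[:, None]
        var = (resp[:, :, None] * (votes - mu) ** 2).sum(axis=0) / weight[:, None] + eps
        pi = weight / weight.sum()
        # E-step: posterior of each high-level capsule given each vote.
        log_p = -0.5 * ((votes - mu) ** 2 / var + np.log(2 * np.pi * var)).sum(axis=2)
        logits = np.log(pi) + log_p
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)
    return mu, resp

rng = np.random.default_rng(0)
u = rng.normal(size=(32, 16))           # 32 low-level capsules, 16-D
W = rng.normal(size=(32, 10, 16, 16))   # votes for 10 high-level capsules
mu, resp = em_vector_clustering(predict_votes(u, W), n_iters=3)
```

Here `n_iters` plays the role of the clustering count R reported in the tables; the 16-dimensional capsules follow the abstract, while the counts 32 and 10 are arbitrary.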
  • Fig. 1  Deep-CapsNet network structure diagram

    Fig. 2  Flow chart of the EM vector clustering algorithm

    Fig. 3  Fully overlapped dataset

    Fig. 4  Norm (module length) of the output vector for different numbers of clustering iterations

    Fig. 5  Recognition rate and loss curves of DCN on fully overlapped handwritten digits

    Fig. 6  Convergence comparison for different proportions of the reconstruction loss

    Fig. 7  Reconstruction results

    Fig. 8  Training recognition rate

    Table 1  Dataset labels

    | Input image label | Description |
    |---|---|
    | (0, 0, 0, 0, 0, 0, 0, 1, 0, 0) | No overlap |
    | (0, 0, 0, 0, 0, 0, 0, 0, 0, 2) | Two identical digits overlapped |
    | (0, 0, 0, 1, 0, 0, 0, 1, 0, 0) | Two different digits overlapped |
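The three label patterns in Table 1 are consistent with a count vector: entry k holds how many of the superimposed digits equal k. A minimal sketch under that reading is given below. The helper names `make_label` and `overlay` are hypothetical, and combining the two images by pixel-wise maximum is an assumption, since the page does not state how the images are superimposed.

```python
import numpy as np

def make_label(digits, n_classes=10):
    """Count-vector label: entry k is the number of stacked digits equal to k."""
    label = np.zeros(n_classes, dtype=np.int64)
    for d in digits:
        label[d] += 1
    return label

def overlay(img_a, img_b):
    """Superimpose two same-sized grayscale images.

    Assumption: pixel-wise maximum; a clipped sum would be another option.
    """
    return np.maximum(img_a, img_b)

single = make_label([7])          # one digit, no overlap
same = make_label([9, 9])         # two identical digits
different = make_label([3, 7])    # two different digits
```

With these inputs the three vectors reproduce the three rows of Table 1.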

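    Table 2  Activation vector norm (module length) for different numbers of clustering iterations R

    | Network and clustering method | Training set | R = 1 | R = 2 | R = 3 |
    |---|---|---|---|---|
    | DCN EM clustering / CapsNet routing clustering | MNIST dataset | 0.0413/0.0536 | 0.5241/0.4122 | 0.9800/0.8792 |
    | DCN EM clustering / CapsNet routing clustering | Fully overlapped dataset | 0.0332/0.0423 | 0.4342/0.5865 | 0.9943/0.8653 |
    | DCN EM clustering / CapsNet routing clustering | Mixed dataset | 0.0323/0.0354 | 0.4543/0.3252 | 0.9923/0.9173 |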
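In capsule networks the norm of a class capsule's output vector is commonly read as evidence that the class is present, which is why Table 2 tracks it. One way to turn the norms into a two-digit prediction is sketched below. Taking the two largest norms and comparing pairs without regard to order are assumptions, not the authors' stated decision rule, and `decode_pair` and `same_pair` are hypothetical names. This simple rule does not handle two identical digits.

```python
import numpy as np

def decode_pair(capsules):
    """Return the two classes with the largest vector norms, plus all norms.

    capsules: (n_classes, d) array of class capsule output vectors.
    """
    norms = np.linalg.norm(capsules, axis=1)
    top2 = np.argsort(norms)[-2:]
    return tuple(sorted(int(k) for k in top2)), norms

def same_pair(pred, label):
    """Compare two digit pairs while ignoring their order."""
    return sorted(pred) == sorted(label)

# Toy example: 10 class capsules of dimension 16, two of them strongly active.
capsules = np.zeros((10, 16))
capsules[3, 0] = 0.98
capsules[7, 0] = 0.95
capsules[1, 0] = 0.04
pred, norms = decode_pair(capsules)
```

For this toy input `pred` is the pair (3, 7), and `same_pair((8, 0), (0, 8))` is true because order is ignored.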

    Table 3  Parameter count and time per epoch for different numbers of clustering iterations R (s)

    | Network | Parameters | Clustering algorithm | R = 1 | R = 2 | R = 3 |
    |---|---|---|---|---|---|
    | CapsNet | 8,215,568 | Iterative routing | 150±2 | 210±2 | 240±2 |
    | DCN | 20,128,032 | EM | 240±2 | 300±2 | 340±2 |

    Table 4  Time per epoch of DCN with different clustering algorithms (s)

    | Clustering algorithm | R = 1 | R = 2 | R = 3 |
    |---|---|---|---|
    | Iterative routing | 350±2 | 410±2 | 440±2 |
    | EM | 240±2 | 300±2 | 340±2 |
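The size of the saving in Table 4 can be checked with a short calculation. The sketch below reads the two rows as iterative routing (350, 410, 440 s) and EM (240, 300, 340 s), which matches the DCN row of Table 3, and ignores the ±2 s spread. The helper `relative_saving` is a hypothetical name.

```python
# Mean seconds per epoch read from Table 4 (the ±2 s spread is ignored).
epoch_seconds = {
    "iterative routing": {1: 350, 2: 410, 3: 440},
    "EM": {1: 240, 2: 300, 3: 340},
}

def relative_saving(baseline, candidate):
    """Fraction of the baseline time saved by the candidate."""
    return (baseline - candidate) / baseline

for r in (1, 2, 3):
    saving = relative_saving(epoch_seconds["iterative routing"][r],
                             epoch_seconds["EM"][r])
    print(f"R = {r}: EM needs {saving:.1%} less time per epoch")
```

Under this reading, EM clustering saves about 31.4%, 26.8% and 22.7% of the per-epoch time at R = 1, 2 and 3 respectively.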

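    Table 5  Recognition rates of DCN on handwritten digits (%)

    | Training set | Non-overlapped digit recognition rate | Fully overlapped digit recognition rate |
    |---|---|---|
    | MNIST dataset | 99.6 | 55.2 |
    | Fully overlapped handwritten digit dataset | 80.7 | 96.75 |
    | Mixed dataset | 95.7 | 96.55 |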

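    Table 6  Comparison of recognition rates on overlapped handwritten digits (R = 3) (%)

    | Network model | Training set | Overlap rate | Accuracy |
    |---|---|---|---|
    | CapsNet | MultiMNIST | 80 | 95 |
    | CapsNet | Fully overlapped dataset | 100 | 88 |
    | DCN | Fully overlapped dataset | 100 | 96.75 |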

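    Table 7  Partial results of classification and reconstruction of fully overlapped handwritten digits

    | Class label | (3, 7) | (9, 1) | (0, 8) | (0, 4) | (9, 7)* | (7, 9)* | (7, 9)* | (5, 9)• |
    |---|---|---|---|---|---|---|---|---|
    | Classification result | (3, 7) | (9, 1) | (8, 0) | (0, 4) | (7, 9)* | (7, 9)* | (7, 9)* | (8, 9)• |

    Image rows: input image, reconstructed image 1, reconstructed image 2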

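    Table 8  Partial recognition and separation results

    | Class label | (不, 专) | (下, 不) | (丑, 下) | (不, 丑) | (下, 世) | (下, 专) | (王, 丑) | (也, 卫) |
    |---|---|---|---|---|---|---|---|---|
    | Classification result | (不, 专) | (下, 不) | (丑, 下) | (不, 丑) | (下, 世) | (下, 专) | (丑, undetermined) | (undetermined, undetermined) |

    Image rows: input image, reconstructed image 1, reconstructed image 2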
Publication history
  • Received:  2019-12-18
  • Revised:  2020-04-16
  • Published online:  2022-10-24
