Calligraphy Character Recognition Method Driven by Stacked Model

Ma Si-Liang, Xu Yong

Citation: Ma Si-Liang, Xu Yong. Calligraphy character recognition method driven by stacked model. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c230460

doi: 10.16383/j.aas.c230460
Funds: Supported by National Natural Science Foundation of China (62072188)
    Author Bio:

    MA Si-Liang Ph.D. candidate in School of Computer Science and Engineering, South China University of Technology. His main research areas are machine learning and text image processing. E-mail: 202010107394@mail.scut.edu.cn

    XU Yong Professor of School of Computer Science and Engineering, South China University of Technology. His main research areas are machine learning, visual computing, and big data. Corresponding author of this paper. E-mail: yxu@scut.edu.cn

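  • Abstract: Calligraphy character recognition from two-dimensional images uses computer vision techniques to recognize images of individual calligraphy characters, and has important applications in the study of ancient texts and in cultural dissemination. Although calligraphy character recognition has made considerable progress, it still faces many challenges: complex and variable glyph shapes can cause recognition errors; Chinese contains many visually similar characters; the number of Chinese character classes is larger than in other writing systems; and calligraphy character images generally show large intra-class variation and small inter-class variation. To address these problems, this paper proposes a stacked-model driven character recognition method (SDCR), which improves on existing single classification models by using data preprocessing, a node separation strategy, and a stacked model. Characters of the same class written in different font styles are further divided by font category. To address small inter-class variation, visually similar characters are grouped into subsets according to the recognition confidence on the calligraphy training images, nested models receive enhanced training on these subsets, and in the test phase the stacked model performs a second recognition pass on visually similar characters, improving their recognition accuracy. To verify the robustness of the proposed method, training and testing are carried out on the self-generated SCUT_Calligraphy dataset and on the public CASIA-HWDB 1.1 and CASIA-AHCDB datasets. The experimental results show that the proposed method substantially improves recognition accuracy on these datasets, reaching test accuracies of 96.33%, 99.51%, and 99.9% on CASIA-HWDB 1.1, CASIA-AHCDB, and the self-built SCUT_Calligraphy dataset respectively, which demonstrates the effectiveness of the method.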
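The two-pass recognition described in the abstract can be illustrated roughly as follows: an image goes to a nested model whenever the primary model's prediction falls in a subset of similar-looking characters. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `stacked_predict`, `toy_primary`, `toy_nested`, and the image keys are hypothetical; the subset is the one listed in the first row of Table 4; and when subsets overlap, the first matching subset is used.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Prediction = Tuple[str, float]               # (label, confidence)
Classifier = Callable[[object], Prediction]  # image -> prediction


def stacked_predict(
    image: object,
    primary: Classifier,
    nested: Dict[FrozenSet[str], Classifier],
) -> str:
    """Two-pass recognition: primary model first, then a subset model if needed."""
    label, _ = primary(image)
    for subset, model in nested.items():
        if label in subset:
            # Second pass: the nested model was trained only on this subset.
            sub_label, _ = model(image)
            return sub_label
    # The predicted class is not in any similar-character subset.
    return label


# Toy stand-ins for trained networks (hypothetical; real models take pixel data).
def toy_primary(image: object) -> Prediction:
    return {"img_mu": ("日", 0.48), "img_shan": ("山", 0.99)}[image]


def toy_nested(image: object) -> Prediction:
    return ("目", 0.93)


SIMILAR = frozenset("日目白自向冶治囚曰沼")  # first subset listed in Table 4

print(stacked_predict("img_mu", toy_primary, {SIMILAR: toy_nested}))    # routed to the nested model
print(stacked_predict("img_shan", toy_primary, {SIMILAR: toy_nested}))  # primary label kept
```

In a real system the toy functions would be replaced by trained networks, and the routing rule might also consult the primary model's confidence; the abstract does not specify that detail.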
  • Fig. 1  Samples of Chinese calligraphy works

    Fig. 2  Examples of different glyph forms of the same character and of visually similar characters in calligraphy text

    Fig. 3  Sample images from the datasets mentioned in this paper

    Fig. 4  Architecture of stacked precision neural network

    Fig. 5  Flowchart of the node separation training strategy (taking the character "即" as an example)

    Fig. 6  Flowchart of stacked precision neural network model

    Fig. 7  Relationship between input image resolution and calligraphy character recognition accuracy

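    Table 1  Detailed properties of experimental datasets

    | Dataset | Subset | Classes | Training set size | Test set size |
    |---|---|---|---|---|
    | CASIA-AHCDB | Style-1 BC | 2,353 | 828,969 | 253,990 |
    | CASIA-AHCDB | Style-1 EC | 3,201 | 88,870 | 36,143 |
    | CASIA-AHCDB | Style-2 BC | 2,353 | 725,240 | 202,404 |
    | CASIA-AHCDB | Style-2 EC | 740 | 66,690 | 17,741 |
    | CASIA-HWDB 1.1 | | 3,755 | 847,466 | 223,991 |
    | SCUT_Calligraphy | | 3,767 | 251,664 | 26,106 |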

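    Table 2  Ablation experiments of calligraphy character recognition driven by stacked model

    | Test dataset | Data preprocessing | Node separation | Stacked model | Precision | Recall | F1-score |
    |---|---|---|---|---|---|---|
    | CASIA-HWDB 1.1 | × | × | × | 89.64% | 88.95% | 89.29% |
    | CASIA-HWDB 1.1 | ✓ | × | × | 90.34% | 89.35% | 89.84% |
    | CASIA-HWDB 1.1 | ✓ | ✓ | × | 91.26% | 89.56% | 90.4% |
    | CASIA-HWDB 1.1 | ✓ | ✓ | ✓ | 96.33% | 92.1% | 94.16% |
    | CASIA-AHCDB (Style1-BC) | × | × | × | 94.5% | 95.1% | 94.79% |
    | CASIA-AHCDB (Style1-BC) | ✓ | × | × | 98.92% | 98.34% | 98.62% |
    | CASIA-AHCDB (Style1-BC) | ✓ | ✓ | × | 99.19% | 99.14% | 99.16% |
    | CASIA-AHCDB (Style1-BC) | ✓ | ✓ | ✓ | 99.51% | 99.21% | 99.35% |
    | SCUT_Calligraphy | × | × | × | 91.33% | 90.45% | 90.88% |
    | SCUT_Calligraphy | ✓ | × | × | 98.38% | 98.22% | 98.3% |
    | SCUT_Calligraphy | ✓ | ✓ | × | 98.85% | 98.36% | 98.6% |
    | SCUT_Calligraphy | ✓ | ✓ | ✓ | 99.9% | 98.96% | 99.42% |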
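The F1-score column of Table 2 can be related to the other two columns with a short calculation. The sketch below assumes the usual definition of F1 as the harmonic mean of precision and recall (the page itself does not state the formula) and applies it to the three full-method rows. The function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the rows of Table 2 with all three components enabled
rows = {
    "CASIA-HWDB 1.1": (96.33, 92.10, 94.16),
    "CASIA-AHCDB (Style1-BC)": (99.51, 99.21, 99.35),
    "SCUT_Calligraphy": (99.90, 98.96, 99.42),
}

for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

Under this assumption the computed values agree with the reported ones to within about 0.01 percentage points, a gap that could come from rounding or truncating values calculated from unrounded inputs.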

    Table 3  Comparison of visualization results for single model and stacked precision neural network model recognition

    | Input image | Label | Single-model prediction | Stacked-model prediction |

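    Table 4  Comparison of different subsets of calligraphy character images using a single model and a stacked precision neural network model

    | Subset character classes | Subset size | Single-model errors | Stacked-model errors | Accuracy improvement (%) |
    |---|---|---|---|---|
    | 日目白自向冶治囚曰沼 | 74 | 11 | 3 | 10.81 |
    | 大己已木犬片斤火本巳 | 83 | 5 | 3 | 2.4 |
    | 力工巾王勿古右布句希 | 76 | 9 | 4 | 6.57 |
    | 巨予主矛母吉臣吝圭毋 | 86 | 7 | 3 | 4.65 |
    | 夫云去央尘尖伏伐亥矢 | 69 | 7 | 2 | 7.24 |
    | 士土千比午北白自血皿 | 76 | 7 | 4 | 3.94 |
    | 去式戒赤坊束辰来妨展 | 68 | 7 | 2 | 7.35 |
    | 助忍驳玩抵忽振玖肋骏 | 64 | 7 | 4 | 4.68 |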
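The last column of Table 4 appears to follow directly from the other three. The sketch below assumes the improvement is the number of errors removed by the stacked model divided by the subset size, expressed in percent. The page does not give this formula, and the name `accuracy_gain` is hypothetical.

```python
def accuracy_gain(subset_size: int, single_errors: int, stacked_errors: int) -> float:
    """Accuracy improvement in percentage points from fixing errors within one subset."""
    return 100.0 * (single_errors - stacked_errors) / subset_size


# (subset, size, single-model errors, stacked-model errors, reported improvement)
table4 = [
    ("日目白自向冶治囚曰沼", 74, 11, 3, 10.81),
    ("大己已木犬片斤火本巳", 83, 5, 3, 2.4),
    ("力工巾王勿古右布句希", 76, 9, 4, 6.57),
    ("巨予主矛母吉臣吝圭毋", 86, 7, 3, 4.65),
    ("夫云去央尘尖伏伐亥矢", 69, 7, 2, 7.24),
    ("士土千比午北白自血皿", 76, 7, 4, 3.94),
    ("去式戒赤坊束辰来妨展", 68, 7, 2, 7.35),
    ("助忍驳玩抵忽振玖肋骏", 64, 7, 4, 4.68),
]

for subset, size, single, stacked, reported in table4:
    gain = accuracy_gain(size, single, stacked)
    print(f"{subset}: computed {gain:.2f}, reported {reported}")
```

Each computed value lies within about 0.01 of the reported figure, which supports reading the column as a per-subset accuracy difference between the two models.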

    Table 5  Performance of different methods on the CASIA-AHCDB, CASIA-HWDB 1.1 and SCUT_Calligraphy datasets

    | Method | CASIA-AHCDB Style-1 BC | CASIA-AHCDB Style-1 BC&EC | CASIA-AHCDB Style-2 BC | CASIA-AHCDB Style-2 BC&EC | CASIA-AHCDB Style-1 BC (train) / Style-2 BC (test) | CASIA-HWDB 1.1 | SCUT_Calligraphy |
    |---|---|---|---|---|---|---|---|
    | CPN[36] | 98.5 | 96.95 | 94.42 | 91.99 | 74.74 | 95.45 | 98.7 |
    | SDCR+JD* | 99.51 | 98.23 | 98.74 | 97.01 | 86.15 | 96.33 | 99.9 |

    Methods reporting only a subset of the columns:
    LW-ViT[35]: 95.8
    RAN[37]: 82.39, 69.61
    RPN: 83.65, 69.63
    RAN+CRA[38]: 85.54, 71.02
    RPN+CRA[39]: 86.91, 72.06

    * SDCR+JD denotes using both the stacked-model driven approach and the node separation training strategy.
  • [1] Hanning Zhang, Bo Dong, Qinghua Zheng, Boqin Feng, Bo Xu, and Haiyu Wu. All-content text recognition method for financial ticket images. Multimedia Tools and Applications, 81(20): 28327–28346, 2022. doi: 10.1007/s11042-022-12741-2
    [2] Anwesh Kabiraj, Debojyoti Pal, Debayan Ganguly, Kingshuk Chatterjee, and Sudipta Roy. Number plate recognition from enhanced super-resolution using generative adversarial network. Multimedia Tools and Applications, 82(9): 13837–13853, 2023. doi: 10.1007/s11042-022-14018-0
    [3] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
    [4] Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, and Yi-Zhe Song. MetaHTR: Towards writer-adaptive handwritten text recognition. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15825–15834, 2021.
    [5] Xuanhong Wang, Kun Wu, Ying Zhang, Yun Xiao, and Pengfei Xu. A GAN-based denoising method for Chinese stele and rubbing calligraphic image. The Visual Computer, 39(4): 1351–1362, 2023.
    [6] Shancheng Fang, Hongtao Xie, Yuxin Wang, Zhendong Mao, and Yongdong Zhang. Read like humans: Autonomous, bidirectional and iterative language modeling for scene text recognition. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7094–7103, 2021.
    [7] Dan Cireşan and Ueli Meier. Multi-column deep neural networks for offline handwritten Chinese character classification. In 2015 International Joint Conference on Neural Networks (IJCNN), pages 1–6, 2015.
    [8] Cheng-Lin Liu, Fei Yin, and Xu-Yao Zhang. ICDAR 2013 Chinese handwriting recognition competition. In International Conference on Document Analysis and Recognition, 2013.
    [9] Zhuoyao Zhong, Lianwen Jin, and Zecheng Xie. High performance offline handwritten Chinese character recognition using GoogLeNet and directional feature maps. In 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 2015.
    [10] Li Chen, Song Wang, Wei Fan, Jun Sun, and Satoshi Naoi. Beyond human recognition: A CNN-based framework for handwritten character recognition. In IAPR Asian Conference on Pattern Recognition, 2015.
    [11] Zhao Zhong, Xu-Yao Zhang, Fei Yin, and Cheng-Lin Liu. Handwritten Chinese character recognition with spatial transformer and deep residual networks. In International Conference on Pattern Recognition, 2017.
    [12] Zhiyuan Li, Nanjun Teng, Min Jin, and Huaxiang Lu. Building efficient CNN architecture for offline handwritten Chinese character recognition. International Journal on Document Analysis and Recognition, 2018.
    [13] Ning Bi, Jiahao Chen, and Jun Tan. The handwritten Chinese character recognition uses convolutional neural networks with the GoogLeNet. International Journal of Pattern Recognition and Artificial Intelligence, 2019.
    [14] Cheng-Lin Liu, Fei Yin, Da-Han Wang, and Qiu-Feng Wang. Online and offline handwritten Chinese character recognition: Benchmarking on new databases. Pattern Recognition, 46(1): 155–162, 2013. doi: 10.1016/j.patcog.2012.06.021
    [15] Zhiyuan Li, Nanjun Teng, Min Jin, and Huaxiang Lu. Building efficient CNN architecture for offline handwritten Chinese character recognition. International Journal on Document Analysis and Recognition, 21(4): 233–240, 2018.
    [16] Yongping Dan, Zongnan Zhu, Weishou Jin, Zhuo Li, et al. PF-ViT: Parallel and fast vision transformer for offline handwritten Chinese character recognition. Computational Intelligence and Neuroscience, 2022.
    [17] Zhong Cao, Jiang Lu, Sen Cui, and Changshui Zhang. Zero-shot handwritten Chinese character recognition with hierarchical decomposition embedding. Pattern Recognition, page 107488, 2020.
    [18] Xiaolei Diao, Daqian Shi, Hao Tang, Qiang Shen, Yanzeng Li, Lei Wu, and Hao Xu. RZCR: Zero-shot character recognition via radical-based reasoning, 2023.
    [19] Tianwei Wang, Zecheng Xie, Zhe Li, Lianwen Jin, and Xiangle Chen. Radical aggregation network for few-shot offline handwritten Chinese character recognition. Pattern Recognition Letters, 125(JUL.): 821–827, 2019.
    [20] Wenchao Wang, Jianshu Zhang, Jun Du, Zi Rui Wang, and Yixing Zhu. DenseRAN for offline handwritten Chinese character recognition. In 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2018.
    [21] Jingye Chen, Bin Li, and Xiangyang Xue. Zero-shot Chinese character recognition with stroke-level decomposition. In Thirtieth International Joint Conference on Artificial Intelligence IJCAI-21, 2021.
    [22] Chang Liu, Chun Yang, Hai-Bo Qin, Xiaobin Zhu, Cheng-Lin Liu, and Xu-Cheng Yin. Towards open-set text recognition via label-to-prototype learning. Pattern Recognition, 134: 109109, 2023. doi: 10.1016/j.patcog.2022.109109
    [23] Yuhao Huang, Lianwen Jin, and Dezhi Peng. Zero-shot Chinese text recognition via matching class embedding. In Josep Lladós, Daniel Lopresti, and Seiichi Uchida, editors, Document Analysis and Recognition – ICDAR 2021, pages 127–141, Cham, 2021. Springer International Publishing.
    [24] Amin Jalali, Swathi Kavuri, and Minho Lee. Low-shot transfer with attention for highly imbalanced cursive character recognition. Neural Networks, 143: 489–499, 2021. doi: 10.1016/j.neunet.2021.07.003
    [25] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the inception architecture for computer vision. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2818–2826, 2016.
    [26] Ji-dan Huang, Guanjie Cheng, Jinghan Zhang, and Wei Miao. Recognition method for stone carved calligraphy characters based on a convolutional neural network. Neural Computing and Applications, 35(12): 8723–8732, 2023.
    [27] Yongping Dan and Zhuo Li. Particle swarm optimization-based convolutional neural network for handwritten Chinese character recognition. Journal of Advanced Computational Intelligence and Intelligent Informatics, 27(2): 165–172, 2023. doi: 10.20965/jaciii.2023.p0165
    [28] Cheng-Lin Liu, Fei Yin, Da-Han Wang, and Qiu-Feng Wang. Online and offline handwritten Chinese character recognition: Benchmarking on new databases. Pattern Recognition, 46(1): 155–162, 2013. doi: 10.1016/j.patcog.2012.06.021
    [29] Dezhi Peng, Lianwen Jin, Yuliang Liu, Canjie Luo, and Songxuan Lai. PageNet: Towards end-to-end weakly supervised page-level handwritten Chinese text recognition. International Journal of Computer Vision, 2022.
    [30] Yue Xu, Fei Yin, Da-Han Wang, Xu-Yao Zhang, and Cheng-Lin Liu. CASIA-AHCDB: A large-scale Chinese ancient handwritten characters database. In 2019 International Conference on Document Analysis and Recognition (ICDAR), 2019.
    [31] Xiwen Qu, Weiqiang Wang, Ke Lu, and Jianshe Zhou. Data augmentation and directional feature maps extraction for in-air handwritten Chinese character recognition based on convolutional neural network. Pattern Recognition Letters, 111: 9–15, 2018. doi: 10.1016/j.patrec.2018.04.001
    [32] Tonghua Su, Wei Pan, and Lijuan Yu. HITHCD-2018: Handwritten Chinese character database of 21k-category. In 2019 International Conference on Document Analysis and Recognition (ICDAR), pages 1378–1383, 2019.
    [33] Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Zhe Li, and Dezhi Peng. SLOGAN: Handwriting style synthesis for arbitrary-length and out-of-vocabulary text. IEEE Transactions on Neural Networks and Learning Systems, pages 1–13, 2022.
    [34] Pengcheng Wang, Hui Xiong, and Haoxiang He. Bearing fault diagnosis under various conditions using an incremental learning-based multi-task shared classifier. Knowledge-Based Systems, 2023.
    [35] Shiyong Geng, Zongnan Zhu, Zhida Wang, Yongping Dan, and Hengyi Li. LW-ViT: The lightweight vision transformer model applied in offline handwritten Chinese character recognition. Electronics, 12(7), 2023.
    [36] Hong-Ming Yang, Xu-Yao Zhang, Fei Yin, and Cheng-Lin Liu. Robust classification with convolutional prototype learning. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3474–3482, 2018.
    [37] Jianshu Zhang, Jun Du, and Lirong Dai. Radical analysis network for learning hierarchies of Chinese characters. Pattern Recognition, 103: 107305, 2020. doi: 10.1016/j.patcog.2020.107305
    [38] Guo-Feng Luo, Hua-Yi Yin, Da-Han Wang, Xu-Yao Zhang, and Shun-Zhi Zhu. Critical radical analysis network for Chinese character recognition. In 2022 26th International Conference on Pattern Recognition (ICPR), pages 2878–2884, 2022.
Publication history
  • Received:  2023-08-02
  • Accepted:  2023-11-30
  • Published online:  2023-12-25
