Research on Visual Anomaly Detection Based on Multi-scale Normalizing Flow

Mao Guo-Jun, Wu Xing-Zhen, Xing Shu-Li

Citation: Mao Guo-Jun, Wu Xing-Zhen, Xing Shu-Li. Research on visual anomaly detection based on multi-scale normalizing flow. Acta Automatica Sinica, 2024, 50(3): 640−648 doi: 10.16383/j.aas.c230476

doi: 10.16383/j.aas.c230476
Funds: Supported by National Key Research and Development Program of China (2019YFD0900905) and National Natural Science Foundation of China (61773415)
    Author Bio:

    MAO Guo-Jun  Professor at the College of Computer Science and Mathematics, Fujian University of Technology. His research interests cover artificial intelligence, big data, data mining, and distributed computing. Corresponding author of this paper. E-mail: 19662092@fjut.edu.cn

    WU Xing-Zhen  Master's student at the College of Computer Science and Mathematics, Fujian University of Technology. His research interests cover computer vision, image processing, and anomaly detection. E-mail: xzwu@smail.fjut.edu.cn

    XING Shu-Li  Lecturer at the College of Computer Science and Mathematics, Fujian University of Technology. His research interests cover computer vision, image processing, and big data analytics. E-mail: 19892311@fjut.edu.cn

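  • Abstract: To address the low computational efficiency and poor detection performance of existing anomaly detection (AD) models, this paper proposes a multi-scale normalizing flow (MS-Flow) that recognizes anomalies in visual images efficiently through multi-scale cross fusion. Specifically, a hierarchical multi-scale architecture is built inside the normalizing flow (NF) to avoid redundant cross computation over multi-channel data while preserving the network's multi-scale representation capability. In addition, the designed hierarchical perception module fuses multi-granularity features level by level, expressing multi-scale features at a fine-grained level and effectively improving the accuracy of distribution estimation. The method balances detection accuracy against computational efficiency. Experiments on two public datasets show that the proposed method achieves higher detection accuracy than previous detection models (average AUROC (area under the receiver operating characteristic) of 99.7% on MVTec AD and 96.0% on BTAD) and higher computational efficiency, with floating point operations (FLOPs) about 1/8 of those of CS-Flow.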
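As an illustration of how a normalizing flow turns an input into an anomaly score, the following minimal sketch applies a single RealNVP-style affine coupling step [20] and scores a feature vector by its negative log-likelihood under a standard normal base distribution. It is not the MS-Flow architecture: `toy_scale` and `toy_shift` are hypothetical stand-ins for learned sub-networks, the example vectors are made up, and the multi-scale fusion described in the abstract is omitted.

```python
import math

def affine_coupling(x, scale_fn, shift_fn):
    """Apply one affine coupling step to x (a list of floats of even length).

    The first half passes through unchanged and conditions the scale and
    shift applied to the second half. Returns (z, log_det), where log_det
    is the log-determinant of the Jacobian of the transformation.
    """
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s = scale_fn(x1)  # log-scales, one per element of x2
    t = shift_fn(x1)  # shifts, one per element of x2
    z2 = [v * math.exp(si) + ti for v, si, ti in zip(x2, s, t)]
    return x1 + z2, sum(s)

def anomaly_score(z, log_det):
    """Negative log-likelihood of the input under a standard normal base."""
    log_pz = -0.5 * sum(v * v for v in z) - 0.5 * len(z) * math.log(2 * math.pi)
    return -(log_pz + log_det)

# Hypothetical stand-ins for trained sub-networks (assumptions, not from the paper).
def toy_scale(x1):
    return [math.tanh(0.5 * v) for v in x1]

def toy_shift(x1):
    return [0.1 * v for v in x1]

typical = [0.1, -0.2, 0.05, 0.1]   # made-up feature vector near the origin
unusual = [0.1, -0.2, 4.0, -5.0]   # made-up feature vector far from it
for name, x in (("typical", typical), ("unusual", unusual)):
    z, log_det = affine_coupling(x, toy_scale, toy_shift)
    print(name, round(anomaly_score(z, log_det), 3))
```

A higher score means the input is less likely under the modeled distribution of normal data, so thresholding the score gives an image-level decision.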
  • Fig. 1  The architecture of the proposed model

    Fig. 2  The structure of the hierarchical perception module

    Fig. 3  Example images for all categories of the MVTec AD and BTAD datasets

    Fig. 4  Negative log-likelihood distribution of test images for different normalizing flows

    Fig. 5  Adaptation study of different numbers of coupling layers

    Fig. 6  Anomaly localization

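    Table 1  Statistical overview of the MVTec AD and BTAD datasets

    | Dataset | Category | Training images | Test images (normal) | Test images (anomalous) | Anomaly types | Anomalous regions | Image size (pixels) |
    |---|---|---|---|---|---|---|---|
    | MVTec AD (textures) | Carpet | 280 | 28 | 89 | 5 | 97 | 1024 |
    | MVTec AD (textures) | Grid | 264 | 21 | 57 | 5 | 170 | 1024 |
    | MVTec AD (textures) | Leather | 245 | 32 | 92 | 5 | 99 | 1024 |
    | MVTec AD (textures) | Tile | 230 | 33 | 84 | 5 | 86 | 840 |
    | MVTec AD (textures) | Wood | 247 | 19 | 60 | 5 | 168 | 1024 |
    | MVTec AD (objects) | Bottle | 209 | 20 | 63 | 3 | 68 | 900 |
    | MVTec AD (objects) | Cable | 224 | 58 | 92 | 8 | 151 | 1024 |
    | MVTec AD (objects) | Capsule | 219 | 23 | 109 | 5 | 114 | 1000 |
    | MVTec AD (objects) | Hazelnut | 391 | 40 | 70 | 4 | 136 | 1024 |
    | MVTec AD (objects) | Metal Nut | 220 | 22 | 93 | 4 | 132 | 700 |
    | MVTec AD (objects) | Pill | 267 | 26 | 141 | 7 | 245 | 800 |
    | MVTec AD (objects) | Screw | 320 | 41 | 119 | 5 | 135 | 1024 |
    | MVTec AD (objects) | Toothbrush | 60 | 12 | 30 | 1 | 66 | 1024 |
    | MVTec AD (objects) | Transistor | 213 | 60 | 40 | 4 | 44 | 1024 |
    | MVTec AD (objects) | Zipper | 240 | 32 | 119 | 7 | 177 | 1024 |
    | BTAD | 01 | 400 | 21 | 49 | 1 | | 1600 |
    | BTAD | 02 | 399 | 30 | 200 | 1 | | 600 |
    | BTAD | 03 | 1000 | 400 | 41 | 1 | | 800 |
    | Total | | 5428 | 918 | 1548 | 76 | >1888 | |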

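    Table 2  The average AUROC of different anomaly detection models on the MVTec AD dataset (%)

    | Type | Category | DifferNet[33] | CFlow-AD[34] | CS-Flow[17] | PatchCore[23] | FastFlow[24] | MS-Flow (ours) |
    |---|---|---|---|---|---|---|---|
    | Textures | Carpet | 92.9 | 98.7 | 100.0 | 98.7 | 100.0 | 100.0 |
    | Textures | Grid | 84.0 | 99.6 | 99.0 | 98.2 | 99.7 | 100.0 |
    | Textures | Leather | 97.1 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
    | Textures | Tile | 99.4 | 99.8 | 100.0 | 98.7 | 100.0 | 100.0 |
    | Textures | Wood | 99.8 | 99.1 | 100.0 | 99.2 | 100.0 | 100.0 |
    | Objects | Bottle | 99.0 | 100.0 | 99.8 | 100.0 | 100.0 | 100.0 |
    | Objects | Cable | 95.9 | 97.6 | 99.1 | 99.5 | 100.0 | 99.6 |
    | Objects | Capsule | 86.9 | 97.7 | 97.1 | 98.1 | 100.0 | 99.4 |
    | Objects | Hazelnut | 99.3 | 99.9 | 99.6 | 100.0 | 100.0 | 100.0 |
    | Objects | Metal Nut | 96.1 | 99.3 | 99.1 | 100.0 | 100.0 | 100.0 |
    | Objects | Pill | 88.8 | 96.8 | 98.6 | 96.6 | 99.4 | 99.5 |
    | Objects | Screw | 96.3 | 91.9 | 97.6 | 98.1 | 97.8 | 97.5 |
    | Objects | Toothbrush | 98.6 | 99.7 | 91.9 | 100.0 | 94.4 | 100.0 |
    | Objects | Transistor | 91.1 | 95.2 | 99.3 | 100.0 | 99.8 | 100.0 |
    | Objects | Zipper | 95.1 | 98.5 | 99.7 | 99.4 | 99.5 | 99.8 |
    | Average | | 94.9 | 98.3 | 98.7 | 99.1 | 99.4 | 99.7 |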
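The average row of Table 2 can be checked against the per-category values. The short sketch below recomputes the mean for the MS-Flow column, assuming the average is a plain unweighted mean over the 15 MVTec AD categories (the page does not state the averaging rule).

```python
# Per-category image-level AUROC (%) for MS-Flow, copied from Table 2.
ms_flow_auroc = {
    "Carpet": 100.0, "Grid": 100.0, "Leather": 100.0, "Tile": 100.0,
    "Wood": 100.0, "Bottle": 100.0, "Cable": 99.6, "Capsule": 99.4,
    "Hazelnut": 100.0, "Metal Nut": 100.0, "Pill": 99.5, "Screw": 97.5,
    "Toothbrush": 100.0, "Transistor": 100.0, "Zipper": 99.8,
}

# Assumption: unweighted mean over categories.
mean_auroc = sum(ms_flow_auroc.values()) / len(ms_flow_auroc)
print(round(mean_auroc, 1))  # 99.7
```

The result matches the 99.7% reported in the table and in the abstract.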

    Table 3  The average AUROC of different anomaly detection models on the BTAD dataset (%)

    | Model | Category 01 | Category 02 | Category 03 | Average |
    |---|---|---|---|---|
    | VT-ADL[36] | 97.6 | 71.0 | 82.6 | 83.7 |
    | SPADE[22] | 91.4 | 71.4 | 99.9 | 87.6 |
    | PatchCore[23] | 90.9 | 79.3 | 99.8 | 90.0 |
    | PaDiM[28] | 99.8 | 82.0 | 99.4 | 93.7 |
    | MS-Flow (ours) | 99.9 | 88.2 | 100.0 | 96.0 |
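For readers unfamiliar with the metric in Tables 2 and 3, the sketch below computes AUROC from image-level labels and anomaly scores using the pairwise-ranking definition: the fraction of (anomalous, normal) pairs in which the anomalous image receives the higher score, with ties counted as half. The function name and the toy data are illustrative assumptions, not the evaluation code used in the paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for anomalous, 0 for normal.
    scores: anomaly scores, where higher means more anomalous.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous sample ranked above normal sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Toy example: one of the four (anomalous, normal) pairs is mis-ranked.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The quadratic loop is chosen for clarity; a sort-based implementation is preferable for large test sets.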

    Table 4  Complexity of different normalizing flows

    | Metric | CFlow-AD | CS-Flow | FastFlow | MS-Flow (ours) |
    |---|---|---|---|---|
    | AUROC (%) | 98.3 | 98.7 | 99.4 | 99.7 |
    | FLOPs (G) | 13.8 | 65.8 | 13.9 | 8.1 |
    | Params (M) | 81.6 | 275.2 | 17.7 | 14.1 |
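The claim that MS-Flow needs about 1/8 of the FLOPs of CS-Flow follows directly from Table 4. The sketch below expresses each model's cost relative to MS-Flow; the dictionary layout and key names are illustrative choices, while the numbers are copied from the table.

```python
# FLOPs (G) and parameter counts (M) from Table 4.
models = {
    "CFlow-AD": {"flops_g": 13.8, "params_m": 81.6},
    "CS-Flow": {"flops_g": 65.8, "params_m": 275.2},
    "FastFlow": {"flops_g": 13.9, "params_m": 17.7},
    "MS-Flow": {"flops_g": 8.1, "params_m": 14.1},
}

reference = models["MS-Flow"]
ratios = {
    name: (m["flops_g"] / reference["flops_g"],
           m["params_m"] / reference["params_m"])
    for name, m in models.items()
}

for name, (flops_ratio, params_ratio) in ratios.items():
    print(f"{name}: {flops_ratio:.1f}x FLOPs, {params_ratio:.1f}x params vs MS-Flow")
```

CS-Flow comes out at roughly 8.1 times the FLOPs of MS-Flow, consistent with the "about 1/8" figure in the abstract.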

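    Table 5  Adaptation study of different feature extractors

    | Feature extractor | $d$ | AUROC (%) |
    |---|---|---|
    | ResNet18 | 224 $\rightarrow$ 448 $\rightarrow$ 768 | 97.1 $\rightarrow$ 97.9 $\rightarrow$ 97.2 |
    | Wide-ResNet50 | 224 $\rightarrow$ 448 $\rightarrow$ 768 | 97.9 $\rightarrow$ 96.2 $\rightarrow$ 93.6 |
    | Swin-B | 224 $\rightarrow$ 448 $\rightarrow$ 768 | 96.9 $\rightarrow$ 97.8 $\rightarrow$ 95.4 |
    | EfficientNet-B7 | 224 $\rightarrow$ 448 $\rightarrow$ 768 | 98.7 $\rightarrow$ 99.1 $\rightarrow$ 99.5 |
    | EfficientNet-B5 | 224 $\rightarrow$ 448 $\rightarrow$ 768 | 98.8 $\rightarrow$ 99.3 $\rightarrow$ 99.7 |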

    Table 6  Adaptation study of different numbers of sub-features

    | Number of sub-features | Sub-feature map size (pixels) | AUROC (%) | Params (M) |
    |---|---|---|---|
    | 2 | $152 \times 24 \times 24$ | 96.21 | 9.42 |
    | 4 | $76 \times 24 \times 24$ | 99.72 | 14.06 |
    | 6 | $51 \times 24 \times 24$ | 99.79 | 15.74 |
    | 8 | $38 \times 24 \times 24$ | 99.79 | 16.43 |
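The sub-feature sizes in Table 6 are consistent with splitting a 304-channel feature map along the channel axis (152 × 2 = 76 × 4 = 38 × 8 = 304). The total of 304 is inferred from the table rather than stated on this page. With 6 groups the division is uneven, and the reported 51 channels matches rounding up. The sketch below shows one way to compute such group sizes; how the authors handle the remainder is not stated here, so assigning the extra channels to the first groups is an assumption.

```python
def split_channels(total_channels, num_groups):
    """Return channel counts for num_groups near-equal groups.

    Assumption: when the split is uneven, the first groups each take one
    extra channel so that the sizes sum to total_channels.
    """
    base, extra = divmod(total_channels, num_groups)
    return [base + 1 if i < extra else base for i in range(num_groups)]

TOTAL_CHANNELS = 304  # inferred from Table 6, not stated explicitly

for groups in (2, 4, 6, 8):
    sizes = split_channels(TOTAL_CHANNELS, groups)
    print(groups, sizes)
```

Under this assumption the largest group size reproduces the channel counts in the table (152, 76, 51, and 38).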
  • [1] Tran T M, Vu T N, Vo N D, Nguyen T V, Nguyen K. Anomaly analysis in images and videos: A comprehensive review. ACM Computing Surveys, 2022, 55(7): 1-37
    [2] Bergmann P, Fauser M, Sattlegger D, Steger C. MVTec AD - A comprehensive real-world dataset for unsupervised anomaly detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 9592−9600
    [3] Suganyadevi S, Seethalakshmi V, Balasamy K. A review on deep learning in medical image analysis. International Journal of Multimedia Information Retrieval, 2022, 11(1): 19-38 doi: 10.1007/s13735-021-00218-1
    [4] Li Y Y, Wu J, Bai X, Yang X P, Tan X, Li G B, et al. Multi-granularity tracking with modularlized components for unsupervised vehicles anomaly detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, USA: IEEE, 2020. 586−587
    [5] Akcay S, Atapour-Abarghouei A, Breckon T P. GANomaly: Semi-supervised anomaly detection via adversarial training. In: Proceedings of the 14th Asian Conference on Computer Vision. Perth, Australia: Springer International Publishing, 2019. 622−637
    [6] Ma Bin, Wang Yi-Li, Xu Jian, Wang Chun-Peng, Li Jian, Zhou Lin-Na. An image perceptual hash algorithm based on bidirectional generative adversarial network. Acta Electronica Sinica, 2023, 51(5): 1405-1412
    [7] Tang T W, Kuo W H, Lan J H, Ding C F, Hsu H, Young H T. Anomaly detection neural network with dual auto-encoders GAN and its industrial inspection applications. Sensors, 2020, 20(12): 3336 doi: 10.3390/s20123336
    [8] Shi Y, Yang J, Qi Z. Unsupervised anomaly segmentation via deep feature reconstruction. Neurocomputing, 2021, 424: 9-22 doi: 10.1016/j.neucom.2020.11.018
    [9] Wu Lin, Hao Hong-Yu, Song You. A review of metal surface defect detection based on computer vision. Acta Automatica Sinica, DOI: 10.16383/j.aas.c230039
    [10] Kingma D P, Welling M. Auto-encoding variational bayes. arXiv preprint arXiv: 1312.6114, 2013.
    [11] LeCun Y. Generalization and network design strategies. Connectionism in Perspective, 1989, 19(143-155): 18
    [12] Rudolph M, Wandt B, Rosenhahn B. Structuring autoencoders. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. Seoul, South Korea: IEEE, 2019.
    [13] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. Communications of the ACM, 2020, 63(11): 139-144 doi: 10.1145/3422622
    [14] Lv Cheng-Kan, Shen Fei, Zhang Zheng-Tao, Zhang Feng. Review of image anomaly detection. Acta Automatica Sinica, 2022, 48(6): 1402-1428
    [15] Bergman L, Hoshen Y. Classification-based anomaly detection for general data. arXiv preprint arXiv: 2005.02359, 2020.
    [16] Rippel O, Mertens P, Merhof D. Modeling the distribution of normal data in pre-trained deep features for anomaly detection. In: Proceedings of the 25th International Conference on Pattern Recognition. Milan, Italy: IEEE, 2021. 6726−6733
    [17] Rudolph M, Wehrbein T, Rosenhahn B, Wandt B. Fully convolutional cross-scale-flows for image-based defect detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa, USA: IEEE, 2022. 1088−1097
    [18] Lei J, Hu X, Wang Y, Liu D. PyramidFlow: High-resolution defect contrastive localization using pyramid normalizing flow. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE, 2023. 14143−14152
    [19] Rezende D, Mohamed S. Variational inference with normalizing flows. In: Proceedings of the 32nd International Conference on Machine Learning. Lille, France: PMLR, 2015. 1530−1538
    [20] Dinh L, Sohl-Dickstein J, Bengio S. Density estimation using real NVP. arXiv preprint arXiv: 1605.08803, 2016.
    [21] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770−778
    [22] Cohen N, Hoshen Y. Sub-image anomaly detection with deep pyramid correspondences. arXiv preprint arXiv: 2005.02357, 2020.
    [23] Roth K, Pemula L, Zepeda J, Schölkopf B, Brox T, Gehler P. Towards total recall in industrial anomaly detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 14318−14328
    [24] Yu J, Zheng Y, Wang X, Li W, Wu Y, Zhao R, et al. FastFlow: Unsupervised anomaly detection and localization via 2D normalizing flows. arXiv preprint arXiv: 2111.07677, 2021.
    [25] Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Houlsby N. An image is worth 16×16 words: Transformers for image recognition at scale. arXiv preprint arXiv: 2010.11929, 2020.
    [26] Tan M, Le Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In: Proceedings of the 36th International Conference on Machine Learning. Long Beach, USA: PMLR, 2019. 6105−6114
    [27] Lee S, Lee S, Song B C. CFA: Coupled-hypersphere-based feature adaptation for target-oriented anomaly localization. IEEE Access, 2022, 10: 78446-78454 doi: 10.1109/ACCESS.2022.3193699
    [28] Defard T, Setkov A, Loesch A, Audigier R. PaDiM: A patch distribution modeling framework for anomaly detection and localization. In: Proceedings of the 25th International Conference on Pattern Recognition Workshops and Challenges. Cham, Switzerland: Springer, 2021. 475−489
    [29] Yi J, Yoon S. Patch SVDD: Patch-level SVDD for anomaly detection and segmentation. In: Proceedings of the 15th Asian Conference on Computer Vision. Kyoto, Japan: Springer, 2020. 375−390
    [30] Li C L, Sohn K, Yoon J, Pfister T. CutPaste: Self-supervised learning for anomaly detection and localization. In: Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE, 2021. 9664−9674
    [31] Napoletano P, Piccoli F, Schettini R. Anomaly detection in nanofibrous materials by CNN-based self-similarity. Sensors, 2018, 18(1): 209 doi: 10.3390/s18010209
    [32] Zagoruyko S, Komodakis N. Wide residual networks. arXiv preprint arXiv: 1605.07146, 2016.
    [33] Rudolph M, Wandt B, Rosenhahn B. Same same but differnet: Semi-supervised defect detection with normalizing flows. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa, USA: IEEE, 2021. 1907−1916
    [34] Gudovskiy D, Ishizaka S, Kozuka K. CFlow-AD: Real-time unsupervised anomaly detection with localization via conditional normalizing flows. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa, USA: IEEE, 2022. 98−107
    [35] Deng J, Dong W, Socher R, Li L J, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE, 2009. 248−255
    [36] Mishra P, Verk R, Fornasier D, Piciarelli C, Foresti G L. VT-ADL: A vision transformer network for image anomaly detection and localization. In: Proceedings of the 30th International Symposium on Industrial Electronics. Kyoto, Japan: IEEE, 2021. 1−6
    [37] Fawcett T. ROC graphs: Notes and practical considerations for researchers. Machine Learning, 2004, 31(1): 1-38
    [38] Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, et al. Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021. 10012−10022
Publication history
  • Received:  2023-08-02
  • Accepted:  2023-08-31
  • Published online:  2024-01-04
  • Issue date:  2024-03-29
