LIU Li, ZHAO Ling-Jun, GUO Cheng-Yu, WANG Liang, TANG Jun

LIU Li, ZHAO Ling-Jun, GUO Cheng-Yu, WANG Liang, TANG Jun. Texture Classification: State-of-the-art Methods and Prospects. ACTA AUTOMATICA SINICA, 2018, 44(4): 584-607. doi: 10.16383/j.aas.2018.c160452
Texture Classification: State-of-the-art Methods and Prospects

Supported by the Hunan Provincial Natural Science Fund for Distinguished Young Scholars (2017JJ1007)

    ZHAO Ling-Jun  Associate professor at the School of Electronic Science and Engineering, National University of Defense Technology. Her research interests cover remote sensing information processing and SAR automatic target recognition. E-mail: nudtzlj@163.com

    GUO Cheng-Yu  Ph.D. candidate at the College of Information System and Management, National University of Defense Technology. Her research interests cover image understanding, computer vision, and pattern recognition. E-mail: sdlwgcy@126.com

    WANG Liang  Professor at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. His research interests cover computer vision and pattern recognition. E-mail: wangliang@nlpr.ia.ac.cn

    TANG Jun  Lecturer at the Science and Technology on Information Systems Engineering Laboratory, College of Information System and Management, National University of Defense Technology. His research interests cover intelligent transportation systems, trajectory planning, and deep learning. E-mail: jun.tang@e-campus.uab.cat

    LIU Li  Associate professor at the College of Information System and Management, National University of Defense Technology. Her research interests cover image understanding, computer vision, and pattern recognition. Corresponding author of this paper. E-mail: liuli_nudt@nudt.edu.cn
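  • Abstract: Texture classification is an important and fundamental problem in computer vision and pattern recognition, and it underpins other visual tasks such as image segmentation, object recognition, and scene understanding. Starting from the basic definition of the texture classification problem, this paper first describes the difficulties and challenges in texture classification research. It then comprehensively reviews the typical texture classification datasets, categorizes and summarizes the development and current state of recent texture feature extraction methods, and describes and comments on the mainstream methods in detail. Finally, it discusses future directions for texture classification.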
  • Fig.  1  Challenging examples of texture recognition (Instance-level variations: (a) illumination variations, images from the 30th category of the CUReT dataset; (b) viewpoint changes and local non-rigid deformation, images from the 25th category of the UIUC dataset; (c) scale variations, images from the KTHTIPS2b dataset. Category-level variations: (d) large intra-class variation among different instances of the same category, images from the braided category of the DTD dataset; (e) material recognition difficulties, images from the FMD dataset, whose correct categories are (from left to right): glass, leather, plastic, wood, plastic, metal, wood, metal, and plastic.)

    Fig.  2  Image examples from one category in KTHTIPS2

    Fig.  3  Image samples from the MINC database (the first row shows the food category, and the second row shows the foliage category)

    Fig.  4  Texture classification based on the BoW pipeline

    Fig.  5  LM (Leung-Malik) filter bank

    Fig.  6  Illustration of 3D texton dictionary learning with LM filters, as proposed by Leung and Malik

    Fig.  7  Illustration of SRP descriptors

    Fig.  8  Illustration of BIF local feature extraction

    Fig.  9  First-order neighborhood system of the WLD descriptor

    Fig.  10  Illustration of the traditional filtering-based texture classification pipeline

    Fig.  11  Illustration of the three-layer scattering structure of ScatNet ($x$ is the original image, and $\psi$ denotes the multi-scale, multi-orientation Gabor wavelets (e.g., the commonly used five scales and eight orientations); the figure shows only the convolutions at four scales and omits those at different orientations. $\phi$ is a Gaussian low-pass filter that may change with layer depth; it acts as Gaussian-weighted average pooling and yields local invariance. Each white dot takes the modulus of a wavelet convolution, and its output feeds the next layer, where wavelet convolution and modulus are applied again. Each black dot applies local feature pooling to the output of a white dot and gives the final output feature map.)
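The computation described in the Fig. 11 caption (wavelet convolution, modulus, then Gaussian low-pass pooling, repeated over layers) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the ScatNet reference implementation. The function names (`gabor_bank`, `gaussian_lowpass`, `scattering`, `scattering_features`) and all parameter values (centre frequencies, bandwidths, `sigma=0.05`) are hypothetical. The filters are Gabor-like Gaussian bumps built directly in the Fourier domain, convolutions are circular, the image is assumed square, and second-order paths are kept only toward coarser scales.

```python
import numpy as np

def _freq_grid(size):
    # Normalised frequency coordinates matching np.fft.fft2 ordering.
    f = np.fft.fftfreq(size)
    return f[None, :], f[:, None]  # fx, fy

def gabor_bank(size, n_scales=4, n_orient=8):
    """Gabor-like band-pass filters (Fourier domain), tagged with scale index j."""
    fx, fy = _freq_grid(size)
    bank = []
    for j in range(n_scales):
        f0 = 0.25 / 2 ** j   # assumed centre frequency, halved at each scale
        sigma = f0 / 3.0     # assumed bandwidth, keeps the DC response small
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            cx, cy = f0 * np.cos(theta), f0 * np.sin(theta)
            psi_hat = np.exp(-((fx - cx) ** 2 + (fy - cy) ** 2) / (2 * sigma ** 2))
            bank.append((j, psi_hat))
    return bank

def gaussian_lowpass(size, sigma=0.05):
    """Gaussian low-pass filter (Fourier domain) used for pooling."""
    fx, fy = _freq_grid(size)
    return np.exp(-(fx ** 2 + fy ** 2) / (2 * sigma ** 2))

def scattering(x, bank, phi_hat):
    """Return the pooled maps of orders 0, 1 and 2 (the 'black dots')."""
    def smooth(u):
        return np.real(np.fft.ifft2(np.fft.fft2(u) * phi_hat))

    outputs = [smooth(x)]                                   # order 0
    x_hat = np.fft.fft2(x)
    for j1, psi1 in bank:
        u1 = np.abs(np.fft.ifft2(x_hat * psi1))             # 'white dot', layer 1
        outputs.append(smooth(u1))                          # 'black dot', layer 1
        u1_hat = np.fft.fft2(u1)
        for j2, psi2 in bank:
            if j2 > j1:                                     # coarser scales only
                u2 = np.abs(np.fft.ifft2(u1_hat * psi2))    # 'white dot', layer 2
                outputs.append(smooth(u2))                  # 'black dot', layer 2
    return outputs

def scattering_features(x, n_scales=4, n_orient=8):
    """Global average of every pooled map, giving one feature per path."""
    size = x.shape[0]
    maps = scattering(x, gabor_bank(size, n_scales, n_orient), gaussian_lowpass(size))
    return np.array([m.mean() for m in maps])
```

Because every step is a circular convolution or a pointwise modulus, the globally averaged features in this sketch do not change when the input is circularly shifted, which is the kind of invariance the pooling step is meant to provide.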

    Fig.  12  Illustration of the Bilinear CNN architecture

    Fig.  13  Texture synthesis based on the VGG-VD model

    Fig.  14  Objects with rich textures in daily life (they can be described with visual texture attributes: meshed, spotted, striated, dotted, and striped)

    Table  1  Widely used texture datasets and their download links: Brodatz[13], VisTex[14], CUReT[15], Outex[16], KTHTIPS[17], UIUC[18], KTHTIPS2a[17], KTHTIPS2b[17], UMD[19], ALOT[20], FMD[21], Drexel[22], OS[23], DTD[24], MINC[25]

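    Dataset | Images | Classes | Image size | Gray/Color | Imaging conditions | Variations present (among illumination, rotation, viewpoint, scale) | Image content | Instance/Category | Year
    Brodatz | 112 | 112 | 640 $\times$ 640 | Gray | Controlled lab | | Object surfaces | Instance | 1966
    VisTex | 167 | 167 | 786 $\times$ 512 | Color | Outdoor | $\surd$ | Object surfaces | Instance | 1995
    CUReT | 5,612 | 92 | 200 $\times$ 200 | Color | Controlled lab | $\surd$ $\surd$ | Material surfaces | Instance | 1999
    Outex | 8,640 | 320 | 746 $\times$ 538 | Color | Controlled lab | $\surd$ $\surd$ $\surd$ | Material/object | Instance | 2002
    KTHTIPS | 810 | 10 | 200 $\times$ 200 | Color | Controlled lab | $\surd$ $\surd$ | Material surfaces | Instance | 2004
    UIUC | 1,000 | 25 | 640 $\times$ 480 | Gray | Controlled outdoor | $\surd$ $\surd$ $\surd$ $\surd$ | Material surfaces | Instance | 2005
    KTHTIPS2a | 4,608 | 11 | 200 $\times$ 200 | Color | Controlled lab | $\surd$ $\surd$ | Material surfaces | Category | 2006
    KTHTIPS2b | 4,752 | 11 | 200 $\times$ 200 | Color | Controlled lab | $\surd$ $\surd$ | Material surfaces | Category | 2006
    UMD | 1,000 | 25 | 1280 $\times$ 960 | Gray | Controlled outdoor | $\surd$ $\surd$ $\surd$ | Object surfaces | Instance | 2009
    ALOT | 25,000 | 250 | 768 $\times$ 512 | Color | Controlled lab | $\surd$ $\surd$ $\surd$ | Material surfaces | Instance | 2009
    FMD | 1,000 | 10 | 512 $\times$ 384 | Color | Uncontrolled | $\surd$ $\surd$ $\surd$ $\surd$ | Material surfaces | Category | 2009
    Drexel | 40,000 | 20 | 200 $\times$ 200 | Color | Controlled lab | $\surd$ $\surd$ $\surd$ | Material surfaces | Instance | 2012
    OS | 10,422 | 22 | Variable | Color | Uncontrolled | $\surd$ $\surd$ $\surd$ $\surd$ | Material surfaces | Clutter | 2013
    DTD | 5,640 | 47 | Variable | Color | Uncontrolled | $\surd$ $\surd$ $\surd$ | Texture attributes | Category | 2014
    MINC | 2,996,674 | 23 | Variable | Color | Uncontrolled | $\surd$ $\surd$ $\surd$ $\surd$ | Material surfaces | Clutter | 2015
    MINC2500 | 57,500 | 23 | 362 $\times$ 362 | Color | Uncontrolled | $\surd$ $\surd$ $\surd$ $\surd$ | Material surfaces | Clutter | 2015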
    Table  2  Performance summary of recent mainstream methods on texture classification (all results are quoted directly from the original papers, except those marked with *, which are taken from a recent survey paper[6])

    Method Dataset Outex_TC10 Outex_TC12 Brodatz CUReT KTHTIPS UIUC UMD KTHTIPS2 ALOT FMD DTD
    LBP[30] TPAMI2002 96.1 97.2
    MRS[27] IJCV2005 97.4
    Lazebnik et al.[33] TPAMI2005 88.2 72.5* 91.3* 96.0
    Zhang et al.[6] IJCV2007 95.4 95.3 95.5 98.7
    Mellor et al.[110] TPAMI2008 89.7
    MFS[34] IJCV2009 92.7 93.9
    OTF[85] CVPR2009 97.4 98.5
    WMFS[86] CVPR2010 98.6 98.7
    Patch[67] TPAMI2009 92.9* 98.0 92.4* 97.8
    WLD[84] TPAMI2009 64.7
    BIF[80] IJCV2010 98.6 98.5 98.8
    RP[74] TPAMI2012 98.5
    SRP[77] PR2012 96.3 98.5 97.7 96.3 99.1
    Timofte et al.[83] BMVC2012 97.3 99.4 99.4 99.0 99.5 55.8
    Ce Liu[37] IJCV2013 55.6
    ScatNet[93] CVPR2013 98.8 99.8 99.4 99.4 99.7
    SRP-RCA[79] TCSVT2015 96.8 99.4 99.1 98.6 99.3 53.2
    PCANet[95] TIP2015 99.6
    MRELBP[69] TIP2016 99.8 99.6 99.0 99.4 77.9 99.1
    AlexNet+FV[40] IJCV2016 98.5 99.2 99.7 77.9 99.1 67.2 62.9
    VGG-M+FV[40] IJCV2016 98.7 99.6 99.9 79.9 99.4 73.5 66.8
    VGG-VD+FV[40] IJCV2016 99.0 99.9 99.9 88.2 99.5 79.8 72.3
    BCNN[98] CVPR2016 77.9 81.6 72.9
